How can I use a variable in a T-SQL partition function? - sql-server

I'm confused about the definition for boundary_values in the Microsoft documentation as related to constants and expressions.
I have created partitioned tables in MSSQL, so I know how to set the values for the "partition functions". I also have a stored procedure that updates the boundary values for two partition functions every month, so I know how to programmatically create the functions.
But I recently re-read the documentation for CREATE PARTITION FUNCTION. The syntax is:
CREATE PARTITION FUNCTION partition_function_name ( input_parameter_type )
AS RANGE [ LEFT | RIGHT ]
FOR VALUES ( [ *boundary_value* [ ,...n ] ] )
The explanation has these sentences that baffle me:
boundary_value is a constant expression that can reference variables. This includes user-defined type variables, or functions and user-defined functions. It cannot reference Transact-SQL expressions.
How can a constant expression "reference" a variable? According to the doc on expressions, a constant expression is "a symbol that represents a single, specific data value. For more information, see Constants (Transact-SQL).". That link says that a character constant, for example, is just some stuff between quote marks. Numeric constants are just a single number. This says that variables can't be used in a constant expression -- that would be a Transact-SQL expression.
And the "functions and user-defined functions" phrase also baffles me.
There are lots of examples of how to programmatically build boundary values, especially when working with dates, but they just programmatically make one large SQL expression using Create Partition Function and appending the boundary values (usually in a loop). That's not setting the boundary_value to a constant expression that's "referencing" a "variable". And it's not using functions. I have seen no examples that would illuminate how to reference a variable, or use a function, in the boundary_value.
Does anyone understand what this is trying to say?

Related

Passing multiple ranges as AGGREGATE's array parameter

I would be grateful if anyone knows whether the following issue is documented and/or what the underlying reasons are.
Assuming we have, for example, the numbers from 1 to 10 in A1:A10, the following formula
=SUMPRODUCT(SUBTOTAL(4,OFFSET(A1,{0;5},0,5)))
is perfectly valid and is equivalent to taking the sum of the maximum values from each of the ranges A1:A5 and A6:A10, since the OFFSET function, here being passed an array of values ({0;5}) as its rows parameter and with the appropriate height parameter (5), resolves to the array of ranges:
{A1:A5,A6:A10}
which is then passed to SUBTOTAL to generate a further array comprising the maximum values from each of those ranges, i.e. 5 and 10, before being summed by SUMPRODUCT.
AGGREGATE was introduced in Excel 2010 as, it would seem, a more refined version of SUBTOTAL. My question is why, when attempting the following
=SUMPRODUCT(AGGREGATE(14,,OFFSET(A1,{0;5},0,5),1))
which should be equivalent to the SUBTOTAL example given above, does Excel display the message that it "Ran Out of Resources While Attempting to Calculate One or More Formulas" (and return a value of 0)?
(Note that users of a non-English-language version of Excel may require a different separator within the array constant {0;5}.)
This is a quite unexpected error. Evidently the syntax is not at fault, nor is the passing of the OFFSET construction 'disallowed'. With nothing else in the workbook, what is causing Excel to use so much resource when attempting to resolve such a construction?
A similar result occurs with INDIRECT instead of OFFSET, i.e.
=SUMPRODUCT(SUBTOTAL(4,INDIRECT({"A1:A5","A6:A10"})))
is perfectly valid, yet
=SUMPRODUCT(AGGREGATE(14,,INDIRECT({"A1:A5","A6:A10"}),1))
gives the same error described above.
Regards
[Not enough reputation to add a comment.]
Excel on Mac returns this:
Arrays containing ranges are not supported
The AGGREGATE error appears to be due to passing an array of range references to an argument that expects an array of values. The error message has symptoms of passing an unitialized pointer resulting in unexpected behavior. Indeed, the same error dialog is shown with some other functions like:
=MEDIAN(TRANSPOSE(INDIRECT({"a1:a5","a6:a10"})))
On the other hand, passing an array of references to the fourth or later argument of AGGREGATE is permitted, eg:
=SUMPRODUCT(AGGREGATE(4,,B1,INDIRECT({"a1:a5","a6:a10"})))
In a similar way, SUBTOTAL allows arrays of references in the second or later arguments, none of which natively take arrays. The SUBTOTAL formula is evaluated by applying the function to each range reference in the array, i.e.:
SUBTOTAL(4,INDIRECT({"a1:a5","a6:a10"}))
->{SUBTOTAL(4,A1:A5),SUBTOTAL(4,A6:A10)}
Formatting arrays and range references within function definitions may help with visualising the formula processing:
AGGREGATE(function_num, options, array or ref1, [k or ref2], [ref3], …)
SUBTOTAL(function_num, ref1, [ref2],...)
Note that reference only arguments also allow for arrays of references.
It will be interesting to see if there are any changes to this behavior with the updated calc engine and dynamic arrays currently in Office 365 preview and due for release soon...

Passing Ranking function in query text

I have a scenario in which external agent generates ranking function dynamically which I want to pass as a query argument instead of statically defining it in search definition file, something like
http://localhost:8080/search/?query=honda car&rankfeature.rankingExpression="query(title_match_weight)*matches(title)+query(tags_match_weight)*matches(tags)"&rankfeature.query(title_match_weight)=10&rankfeature.query(tags_match_weight)=20
which I am not able to do now. Do we have solution to achieve this in Vespa?
I have tried foreach in rank expression command to serve this purpose but it doesn't serve flexibility of having any function dynamically.
http://docs.vespa.ai/documentation/ranking.html#using-query-variables
explains about macros and I find that macros is taken as rank-feature and rank feature can be passed in the query. So that should mean macro can be passed in the query which can be used in the expression, but it is not possible.
It's not possible to send ranking expressions with the query (it wouldn't be efficient as they are (often) compiled with LLVM etc).
Couldn't you use a fixed ranking expression and use query features to weight/or turn on or off different parts of it? You can also configure many different ranking expressions and choose between them at query time using ranking.profile=profileName.

Why required "=" before ANY function with array as param, in postgres procedure?

I was answering a postgres question yesterday, and also came across a postgres thread (here) where they describe the following error:
ERROR: operator does not exist: text = text[]
HINT: No operator matches the given name and argument type(s). You
might need to add explicit type casts.
The error seems to appear whenever an ARRAY string type is fed to ANY without using = ANY. This seems completely strange since based on language, logic, and sql conventions, usually you have (e.g. IN):
variable FUNCTION(set)
instead of.
variable = FUNCTION(set) , unless ofcourse operator is a summation/count operation returning one result :)
It would make more senseto have variable ANY(Set/Array) instead of variable=ANY(Set/Array). Similar example is the IN function.
Can anyone explain what is going on here?
IN (...) is basically equivalent to = ANY (ARRAY[...])
Crucially, ANY is not a function. It's syntax defined by the SQL standard, and is no more a function than GROUP BY or the OVER clause in a window function.
The reason that = is required before ANY is that ANY can apply to other operators too. What it means is "Test the operator to the left against every element in the array on the right, and return true if the test is true for at least one element."
You can use > ANY (ARRAY[...]) or whatever. It's a general purpose operator that isn't restricted to =. Notably useful for LIKE ANY (albeit with somewhat bad performance).
There is ALL too, which does much the same thing but returns true only if all results are true.

Distinguish between function calls and indexed arrays using ANTLR4

The syntax of a language is ambiguous in the sense that function names and indexed identifiers are written in an equal way:
var = function(5) => function call where 5 is a parameter
var = array(5) => element 5 of the array
To be able to make the distinction I need to make a first pass and create a symbol table. After that I want to use predicates to do something like:
reference
: {isFunction(getCurrentToken().getText())}? ident (argumentList?)
| {!isFunction(getCurrentToken().getText())}? ident (subscriptionList)?
;
But several questions remain:
Do I have to "extend/inherit" the parser to add the code of "isFunction"? Or do I have to put it in the .g4 file itself?
Is predicates the best way here, or is there a better way to achieve all this?
How to run the parser twice? How to handle the "first" run? (in that case isFunction will always return false as the symbol table is not yet constructed)
Somehow I feel there must be an easy, clean way to handle the above issue...
Not directly the answer you might have seeked but I recommend to do it all in code after parsing and not parse the file twice or make the paring dependent on the symbol table.
This could be done by allowing both function calls / array accesses to appear where any one of them would be allowed.
When you transform the rules into an internal representation later on, you can distiguish these two based upon the knowledge of the symbol table.

Semantic of comma-separated values with bison

I'm trying to give a semantic value to a list of comma-separated values. In fact, I have defined the reduction rules for bison using
commasv : exp
| commasv "," exp
where exp its a number or a variable or a function pointer or, also, a commasv token with its respective syntax and semantic rules. The type of exp is double so the type of commasv must be double.
The thing is that I want to store the list in order to use it, for example, on a function call. For instance
h = create_object()
compute_list(h,1,cos(3.14159))
will give the expected result of a certain compute_list function.
As basis bison file I've used mfcalc example from the bison manual and I replaced the yylex function by other one generated using flex. By now I can do things like
pi = 3.14159
sin(pi)
ln(exp(5))
with the modified version of yylex function with flex but I want use the comma-separated values with function calls, lists creation and more.
Thanks for your answers.
Then create a list to store the results in. Instead of having the result of the commasv rule return an actual value, have it return the list head.
In general, as soon as you get a somewhat moderately advanced grammar (like it incorporating things like lists), you can no longer really use values to represent the parsing, but have to go over to some sort of abstract syntax tree.

Resources