Why required "=" before ANY function with array as param, in postgres procedure? - arrays

I was answering a postgres question yesterday, and also came across a postgres thread (here) where they describe the following error:
ERROR: operator does not exist: text = text[]
HINT: No operator matches the given name and argument type(s). You
might need to add explicit type casts.
The error seems to appear whenever an ARRAY string type is fed to ANY without using = ANY. This seems completely strange since based on language, logic, and sql conventions, usually you have (e.g. IN):
variable FUNCTION(set)
instead of.
variable = FUNCTION(set) , unless ofcourse operator is a summation/count operation returning one result :)
It would make more senseto have variable ANY(Set/Array) instead of variable=ANY(Set/Array). Similar example is the IN function.
Can anyone explain what is going on here?

IN (...) is basically equivalent to = ANY (ARRAY[...])
Crucially, ANY is not a function. It's syntax defined by the SQL standard, and is no more a function than GROUP BY or the OVER clause in a window function.
The reason that = is required before ANY is that ANY can apply to other operators too. What it means is "Test the operator to the left against every element in the array on the right, and return true if the test is true for at least one element."
You can use > ANY (ARRAY[...]) or whatever. It's a general purpose operator that isn't restricted to =. Notably useful for LIKE ANY (albeit with somewhat bad performance).
There is ALL too, which does much the same thing but returns true only if all results are true.

Related

How can I use a variable in a T-SQL partition function?

I'm confused about the definition for boundary_values in the Microsoft documentation as related to constants and expressions.
I have created partitioned tables in MSSQL, so I know how to set the values for the "partition functions". I also have a stored procedure that updates the boundary values for two partition functions every month, so I know how to programmatically create the functions.
But I recently re-read the documentation for CREATE PARTITION FUNCTION. The syntax is:
CREATE PARTITION FUNCTION partition_function_name ( input_parameter_type )
AS RANGE [ LEFT | RIGHT ]
FOR VALUES ( [ *boundary_value* [ ,...n ] ] )
The explanation has these sentences that baffle me:
boundary_value is a constant expression that can reference variables. This includes user-defined type variables, or functions and user-defined functions. It cannot reference Transact-SQL expressions.
How can a constant expression "reference" a variable? According to the doc on expressions, a constant expression is "a symbol that represents a single, specific data value. For more information, see Constants (Transact-SQL).". That link says that a character constant, for example, is just some stuff between quote marks. Numeric constants are just a single number. This says that variables can't be used in a constant expression -- that would be a Transact-SQL expression.
And the "functions and user-defined functions" phrase also baffles me.
There are lots of examples of how to programmatically build boundary values, especially when working with dates, but they just programmatically make one large SQL expression using Create Partition Function and appending the boundary values (usually in a loop). That's not setting the boundary_value to a constant expression that's "referencing" a "variable". And it's not using functions. I have seen no examples that would illuminate how to reference a variable, or use a function, in the boundary_value.
Does anyone understand what this is trying to say?

Perl structure flow to C

I've started working on a program which is in Perl and has to be transformed into C.
There are a lot of subroutines which have structure member accessing which is unfamiliar to me, because I have little to no knowledge about Perl syntax and structure flow.
Example:
$ref->{$struct2[$value]->{field1}}->{struct_insideStruct2}->{$ref2->{field}}
$ref is a third structure
$ref2 is a local copy of a parameter which is of type struct1
My question is: How do you create a line like this in C?
Do I need to create nested multiple structures?
I need to understand how multiple access operators in Perl works and if I can create something similiar in C, thanks in advance.
I recommend to not try to directly translate between languages, as this likely results in a clumsy and unnatural code. That would certainly be the case here, as commented further down. The best I can do for this quest is to explain what the expression does
$ref -> { $struct2[$value]->{field1} }
-> { struct_insideStruct2 }
-> { $ref2->{field} }
The $ref is a reference to a hash (associative array); it's OK to think of it as a pointer to a hash. One can tell because the -> ("arrow") operator dereferences, and the {...} on its right means that on its left there must be a hash reference; this returns a value that it points to.
In this case, the key with which it is dereferenced (the index into the associative array) involves an element of the array #struct2 at index $value; that element is another hash reference, being dereferenced (indexed into) with a key field1 (string literal†).
What this returns is another hash reference, which is then indexed into (dereferenced) with the key struct_insideStruct2 (string), and this again returns a hash reference.
That last one is indexed with a key which itself is produced by dereferencing another hash reference, $ref2, with a key field (string).
This is an example of a Perl complex data structure. How do you like it? I don't, not very much. Even in Perl, ideally I'd like to see this rewritten as a class, as it goes too deep and wide and so it packs too much complexity without any supporting structure which a class can provide.
If you still wish to indeed and really do that kinda thing in C, you can. May want to find a good hash implementation (or use structs and nest them carefully), and probably to dust off your function pointer syntax and such. But I would recommend to not get into all that.
Instead, once you understand the deep-nested data structure explained above, and the data it represents, find a way to implement what it means and does in your code in a native C way. We always want to use logic, techniques, and idioms native to the language at hand.
Along with linked documentation also see the short and sweet perlintro. The full reference for Perl's references is perlref.
† Normally such "barewords" need be under quotes, 'string' (or using "", or q() or qq() operators ...). But if that is a sole thing between {} then the quoting may be omitted.

Apply slicing to array valued function in Fortran [duplicate]

How does one access an element of an array that is returned from a function? For example, shape() returns an array of integers. How does one compare an element of that array to an integer? The following does not compile:
integer :: a
integer, dimension(5) :: b
a = 5
if (a .eq. shape(b)) then
print *, 'equal'
end if
The error is:
if (a .eq. shape(c)) then
1
Error: IF clause at (1) requires a scalar LOGICAL expression
I understand that this is because shape(c) returns an array. However, accessing an element of the array does not appear to be possible like so: shape(c)(1)
Now if I add these two lines:
integer, dimension(1) :: c
c = shape(b)
...and change the if clause to this:
if (a .eq. c(1)) then
... then it works. But do I really have to declare an extra array variable to hold the return value of shape(), or is there some other way to do it?
Further to the answers that deal with SHAPE and logical expressions etc, the general answer to your question "How does one access an element of an array that is returned from a function?" is
you assign the expression that has the function reference to an array variable, and then index that array variable.
you use the expression that has the function reference as an actual argument to a procedure that takes a dummy array argument, and does the indexing for you.
Consequently, the general answer to your last questions "But do I really have to declare an extra array variable to hold the return value of shape(), or is there some other way to do it?" is "Yes, you do need to declare another array variable" and hence "No, there is no other way".
(Note that reasonable optimising compilers will avoid the need for any additional memory operations/allocations etc once they have the result of the array function, it's really just a syntax issue.)
The rationale for this particular aspect of language design is sometimes ascribed to a need to avoid syntax ambiguity and confusion for array function results that are of character type (they could potentially be indexed and/or substringed - how do you tell what was intended?). Others think it was done this way just to annoy C programmers.
Instead of using shape(array), I would use size(array).
Note that this will return an integer indicating how many elements there are in ALL dimensions, unless you specify the DIM attribute, in which case it will return only the number of elements in the specified dimension.
Take a look at the gfortran documentation:
http://gcc.gnu.org/onlinedocs/gfortran/SIZE.html.
Also, look up lbound and ubound.
Note that the expression
a == shape(b)
returns a rank-1 array of logicals and the if statement requires that the condition reduce to a scalar logical expression. You could reduce the rank-1 array to a scalar like this:
if (all(a == shape(b)))
This is certainly not a general replacement for the syntactically-invalid application of array indexing to an array-returning function such as shape(b)(1).
It is possible even without the intermediate variable using ASSOCIATE:
real c(3,3)
associate (x=>shape(c))
print *,x(1),x(2)
end associate
end

Distinguish between function calls and indexed arrays using ANTLR4

The syntax of a language is ambiguous in the sense that function names and indexed identifiers are written in an equal way:
var = function(5) => function call where 5 is a parameter
var = array(5) => element 5 of the array
To be able to make the distinction I need to make a first pass and create a symbol table. After that I want to use predicates to do something like:
reference
: {isFunction(getCurrentToken().getText())}? ident (argumentList?)
| {!isFunction(getCurrentToken().getText())}? ident (subscriptionList)?
;
But several questions remain:
Do I have to "extend/inherit" the parser to add the code of "isFunction"? Or do I have to put it in the .g4 file itself?
Is predicates the best way here, or is there a better way to achieve all this?
How to run the parser twice? How to handle the "first" run? (in that case isFunction will always return false as the symbol table is not yet constructed)
Somehow I feel there must be an easy, clean way to handle the above issue...
Not directly the answer you might have seeked but I recommend to do it all in code after parsing and not parse the file twice or make the paring dependent on the symbol table.
This could be done by allowing both function calls / array accesses to appear where any one of them would be allowed.
When you transform the rules into an internal representation later on, you can distiguish these two based upon the knowledge of the symbol table.

Postgres C Function - Passing & Returning Numerics

I am just beginning to test with Postgres External C Functions. When I pass in a Numeric and Return it the function works fine. (Example)
Sample Function
PG_FUNCTION_INFO_V1(numericTesting);
Datum
numericTesting(PG_FUNCTION_ARGS)
{
Numeric p = PG_GETARG_NUMERIC(0);
PG_RETURN_NUMERIC(p);
}
However, when I try to do any math functions on the variable passed in, it will not compile. I get
error: invalid operands to binary *
Sample Function
PG_FUNCTION_INFO_V1(numericTesting);
Datum
numericTesting(PG_FUNCTION_ARGS)
{
Numeric p = PG_GETARG_NUMERIC(0);
PG_RETURN_NUMERIC(p * .5);
}
What is causing this? I'm guessing the the Numeric datatype needs some function to allow math. I tried using: PG_RETURN_NUMERIC(DatumGetNumeric(p * .5)) but that had the same result.
Numeric isn't a primitive type so you can't do arithmetic operations on it directly. C doesn't have operator overloading, so there's no way to add a multiply operator for Numeric. You'll have to use appropriate function calls to multiply numerics.
As with most things when writing Pg extension functions it can be helpful to read the source and see how it's done elsewhere.
In this case look at src/backend/utils/adt/numeric.c. Examine Datum numeric_mul(PG_FUNCTION_ARGS) where you'll see it use mul_var(...) to do the work.
Unfortunately mul_var is static so it can't be used outside numeric.c. Irritating and surprising. There must be a reasonable way to handle NUMERIC from C extension functions without using the spi/fmgr to do the work via SQL operator calls, as you've shown in your comment where you use DirectFunctionCall2 to invoke the numeric_mul operator.
It looks like the public stuff for Numeric that's callable directly from C is in src/include/utils/numeric.h so let's look there. Whoops, not much, just some macros for converting between Numeric and Datum and some helper GETARG and RETURN macros. Looks like usage via the SQL calls might be the only way.
If you do find yourself stuck using DirectFunctionCall2 via the SQL interfaces for Numeric, you can create a Numeric argument for the other side from a C integer using int4_numeric.
If you can't find a solution, post on the pgsql-general mailing list, you'll get more people experienced with C extensions and the source code there. Link back to this post if you do so.
A way to sidestep this problem altogether is to use data type coercion. Declare your SQL function with the type you want to coerce a value to, e.g.
CREATE FUNCTION foo(float8) RETURNS float8 AS 'SELECT $i' LANGUAGE SQL;
Any value provided to that function will be coerced to that value:
SELECT foo(12);
Even explicitly specifying the type will work:
SELECT foo(12::numeric);
Your C code will receive a double. (Credit goes to Tom Lane, see this mailing list post.)
Both operands of * must have arithmetic type (integer or floating-point types). It's a pretty good bet that Numeric is a typedef for something that isn't a simple integer or floating-point type.
Unfortunately, I don't know enough about the Postgres API to be much more help. Hopefully there's a macro or a function that can either convert a Numeric to an arithmetic type, or apply an arithmetic operation to a Numeric type.
#include "utils/numeric.h"
// ....
Numeric p = PG_GETARG_NUMERIC(0);
Numeric b = int64_div_fast_to_numeric(5, 1); // 0.5
bool failed = false;
Numeric r = numeric_mul_opt_error(p, b, &failed);
if (failed) {
// handle failure here
}
PG_RETURN_NUMERIC(r);

Resources