Postgres C Function - Passing & Returning Numerics

I am just beginning to test with Postgres external C functions. When I pass in a Numeric and return it, the function works fine. (Example below.)
Sample Function
PG_FUNCTION_INFO_V1(numericTesting);
Datum
numericTesting(PG_FUNCTION_ARGS)
{
    Numeric p = PG_GETARG_NUMERIC(0);

    PG_RETURN_NUMERIC(p);
}
However, when I try to do any math on the variable passed in, it will not compile. I get:
error: invalid operands to binary *
Sample Function
PG_FUNCTION_INFO_V1(numericTesting);
Datum
numericTesting(PG_FUNCTION_ARGS)
{
    Numeric p = PG_GETARG_NUMERIC(0);

    PG_RETURN_NUMERIC(p * .5);
}
What is causing this? I'm guessing the Numeric datatype needs some function to allow math. I tried using PG_RETURN_NUMERIC(DatumGetNumeric(p * .5)), but that had the same result.

Numeric isn't a primitive type so you can't do arithmetic operations on it directly. C doesn't have operator overloading, so there's no way to add a multiply operator for Numeric. You'll have to use appropriate function calls to multiply numerics.
As with most things when writing Pg extension functions it can be helpful to read the source and see how it's done elsewhere.
In this case look at src/backend/utils/adt/numeric.c. Examine Datum numeric_mul(PG_FUNCTION_ARGS) where you'll see it use mul_var(...) to do the work.
Unfortunately mul_var is static, so it can't be used outside numeric.c. Irritating and surprising; there ought to be a reasonable way to handle NUMERIC from C extension functions without going through spi/fmgr to do the work via SQL-level operator calls, as you've already done in your comment using DirectFunctionCall2 to invoke numeric_mul.
It looks like the public stuff for Numeric that's callable directly from C is in src/include/utils/numeric.h so let's look there. Whoops, not much, just some macros for converting between Numeric and Datum and some helper GETARG and RETURN macros. Looks like usage via the SQL calls might be the only way.
If you do find yourself stuck using DirectFunctionCall2 via the SQL interfaces for Numeric, you can create a Numeric argument for the other side from a C integer using int4_numeric.
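For illustration, here is roughly what that approach can look like as a complete function. This is an untested sketch only: it builds the constant 0.5 as 1/2 using int4_numeric and numeric_div, then multiplies with numeric_mul via the fmgr interface. Depending on your PostgreSQL version the prototypes for those functions come from utils/fmgrprotos.h or utils/builtins.h.
#include "postgres.h"
#include "fmgr.h"
#include "utils/numeric.h"
#include "utils/fmgrprotos.h"   /* utils/builtins.h on older releases */

PG_MODULE_MAGIC;

PG_FUNCTION_INFO_V1(numericTesting);

Datum
numericTesting(PG_FUNCTION_ARGS)
{
    Datum arg  = PG_GETARG_DATUM(0);

    /* build the numeric constant 0.5 as 1/2 via the SQL-level functions */
    Datum one  = DirectFunctionCall1(int4_numeric, Int32GetDatum(1));
    Datum two  = DirectFunctionCall1(int4_numeric, Int32GetDatum(2));
    Datum half = DirectFunctionCall2(numeric_div, one, two);

    /* numeric * numeric, again via the fmgr interface */
    PG_RETURN_DATUM(DirectFunctionCall2(numeric_mul, arg, half));
}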
If you can't find a solution, post on the pgsql-general mailing list; you'll reach more people there who are experienced with C extensions and the source code. Link back to this post if you do so.

A way to sidestep this problem altogether is to use data type coercion. Declare your SQL function with the type you want to coerce a value to, e.g.
CREATE FUNCTION foo(float8) RETURNS float8 AS 'SELECT $1' LANGUAGE SQL;
Any value provided to that function will be coerced to that type:
SELECT foo(12);
Even explicitly specifying the type will work:
SELECT foo(12::numeric);
Your C code will receive a double. (Credit goes to Tom Lane, see this mailing list post.)
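For what it's worth, the C side of that trick is then trivial. A minimal, untested sketch (the function and file names are just placeholders):
/* SQL declaration (note the float8 signature, which triggers the coercion):
 *   CREATE FUNCTION foo(float8) RETURNS float8
 *       AS 'MODULE_PATHNAME', 'foo' LANGUAGE C STRICT;
 */
#include "postgres.h"
#include "fmgr.h"

PG_MODULE_MAGIC;

PG_FUNCTION_INFO_V1(foo);

Datum
foo(PG_FUNCTION_ARGS)
{
    float8 p = PG_GETARG_FLOAT8(0);   /* any numeric input arrives already coerced to a double */

    PG_RETURN_FLOAT8(p * 0.5);
}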

Both operands of * must have arithmetic type (integer or floating-point types). It's a pretty good bet that Numeric is a typedef for something that isn't a simple integer or floating-point type.
Unfortunately, I don't know enough about the Postgres API to be much more help. Hopefully there's a macro or a function that can either convert a Numeric to an arithmetic type, or apply an arithmetic operation to a Numeric type.

#include "utils/numeric.h"
// ....
Numeric p = PG_GETARG_NUMERIC(0);
Numeric b = int64_div_fast_to_numeric(5, 1); // 0.5
bool failed = false;
Numeric r = numeric_mul_opt_error(p, b, &failed);
if (failed) {
// handle failure here
}
PG_RETURN_NUMERIC(r);

Related

Can Postgres SQL function transforms be applied to array types?

I'm attempting to do some numerical processing on array types in Postgres. I found I'm able to use the NumPy library within Postgres PL/Python, but the operation runs too slowly for my purposes, and much slower than it would in Python directly or in a C extension.
My suspicion is that there may be overhead in going from Postgres Array Type -> Python List -> NumPy Array and then doing the reverse on return.
To test (and potentially fix) this, I'm trying to build a C extension which would skip the Python list and convert directly from Postgres Array to NumPy Array and vice versa.
I've created a C extension which defines the following:
CREATE FUNCTION arr_to_np(val internal) RETURNS internal
    LANGUAGE C AS 'MODULE_PATHNAME', 'arr_to_np';

CREATE FUNCTION np_to_arr(val internal) RETURNS real[]
    LANGUAGE C AS 'MODULE_PATHNAME', 'np_to_arr';

CREATE TRANSFORM FOR real[] LANGUAGE plpythonu (
    FROM SQL WITH FUNCTION arr_to_np(internal),
    TO SQL WITH FUNCTION np_to_arr(internal)
);
The module loads without problems but when I try to use it in a function
CREATE FUNCTION fn (a integer[])
RETURNS integer
TRANSFORM FOR TYPE real[]
AS $$ return a $$ LANGUAGE plpythonu;
I get: ERROR: transform for type real language "plpythonu" does not exist
My guess is that the transform is being applied to the base "real" type and not to the array. Is there any way to specify transforms specifically for arrays?
It turns out this is not possible.
In lsyscache.c (get_transform_oid), the type a transform applies to is converted to its base type, which strips off the "array" part, so the transform is always applied to the underlying type.
I'm going to see if I can put together a patch to allow this behaviour, but currently (Postgres <= 11.0) you cannot create custom transforms for array types.

Why is "=" required before ANY with an array as a parameter in a Postgres procedure?

I was answering a postgres question yesterday, and also came across a postgres thread (here) where they describe the following error:
ERROR: operator does not exist: text = text[]
HINT: No operator matches the given name and argument type(s). You
might need to add explicit type casts.
The error seems to appear whenever an array of strings is fed to ANY without using = ANY. This seems completely strange, since based on language, logic, and SQL conventions you usually have (e.g. with IN):
variable FUNCTION(set)
instead of
variable = FUNCTION(set), unless of course the operator is a summation/count operation returning a single result :)
It would make more sense to have variable ANY(Set/Array) instead of variable = ANY(Set/Array). A similar example is the IN construct.
Can anyone explain what is going on here?
IN (...) is basically equivalent to = ANY (ARRAY[...])
Crucially, ANY is not a function. It's syntax defined by the SQL standard, and is no more a function than GROUP BY or the OVER clause in a window function.
The reason that = is required before ANY is that ANY can apply to other operators too. What it means is "Test the operator to the left against every element in the array on the right, and return true if the test is true for at least one element."
You can use > ANY (ARRAY[...]) or whatever. It's a general purpose operator that isn't restricted to =. Notably useful for LIKE ANY (albeit with somewhat bad performance).
There is ALL too, which does much the same thing but returns true only if all results are true.

How to declare variables during run time in C

I am about to start a project for something like a simple calculator, completely in C, and I was wondering how I would allow the user to create variables during the run time of the program. This variable could be a number or a complex number or even a matrix or a
One approach was to store the variable's type, name, size and value in a temporary text file and retrieve it whenever needed. Is there any better approach? I was hoping I could declare real variables during runtime in C.
OK, you are going to need three basic modules:
N.B. I haven't had the time to actually compile these examples, hopefully I didn't make too many mistakes.
Parser. This module takes user-entered input, tokenizes it, and ensures that the entered expression conforms to the grammar.
Interpreter. This module takes the output of the parser and performs the computation.
Environment. This module manages the state of the computation.
Let's start with the environment; here is what we need (or at least what I would implement). First, here are the design considerations that I would use for the environment:
We will only deal with variables
We will only allow 300 variables
All variables will have global scope
Rebinding a variable will overwrite the old binding
Now, let's define the following structure:
typedef struct st
{
    char *tokenName;
    int type;
    union
    {
        int iVal;
        float fVal;
    } val;
} tableEntry, *ptableEntry;

tableEntry symbolTable[300];
and now for some functions:
a. init(tableEntry*) -- this function initializes the environment, i.e. sets all entries in the symbol table to some predefined empty state.
b. addValue(tableEntry*, name, value) -- this function takes a pointer to the environment and adds a new entry to it.
c. int lookupValue(tableEntry*, name) -- this function takes a pointer to the environment and sees if the token name has been defined in it. Already we see a problem: we are allowing for both integers and floating-point numbers but would like a single lookup function, so we probably need some sort of variant type or have to figure out some way to return different types.
d. updateValue(tableEntry*, name, value) -- this function takes a pointer to the environment and updates an existing value. This raises an unaddressed part of the specification: what should updateValue do if the token is not found? Personally I would just add the value, but it's up to you as the designer of the calculator to decide.
This should do for a start for the environment; a rough sketch of these functions follows below.
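To make the above a bit more concrete, here is one rough, untested way those pieces could be fleshed out. Only the integer case is shown, and having lookupValue return a table index rather than a value is one way around the mixed-type return problem mentioned in point c.
#include <string.h>

enum { TYPE_EMPTY = 0, TYPE_INT, TYPE_FLOAT };

typedef struct st
{
    char *tokenName;
    int type;
    union
    {
        int iVal;
        float fVal;
    } val;
} tableEntry, *ptableEntry;

#define TABLE_SIZE 300

/* a. mark every slot in the table as unused */
void init(tableEntry *table)
{
    for (int i = 0; i < TABLE_SIZE; i++)
    {
        table[i].tokenName = NULL;
        table[i].type = TYPE_EMPTY;
    }
}

/* c. return the index of a bound name, or -1 if it is not defined */
int lookupValue(tableEntry *table, const char *name)
{
    for (int i = 0; i < TABLE_SIZE; i++)
        if (table[i].type != TYPE_EMPTY && strcmp(table[i].tokenName, name) == 0)
            return i;
    return -1;
}

/* b. and d. bind (or rebind) a name to an integer value; returns 0 on success */
int addValue(tableEntry *table, const char *name, int value)
{
    int i = lookupValue(table, name);

    if (i < 0)                              /* not bound yet: find a free slot */
    {
        for (i = 0; i < TABLE_SIZE; i++)
            if (table[i].type == TYPE_EMPTY)
                break;
        if (i == TABLE_SIZE)
            return -1;                      /* table is full */
        table[i].tokenName = strdup(name);  /* strdup is POSIX; copy manually if needed */
    }
    table[i].type = TYPE_INT;
    table[i].val.iVal = value;
    return 0;
}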
Now let's turn to the interpreter for a bit. For this, let's suppose that the parser emits the abstract syntax tree in prefix form. For example:
the statement x=3 would be emitted as = x 3
the statement z = 4 + 5 would be emitted as = z + 4 5
OK, the trick here is we don't really emit 3, but rather a token that contains more information about what is being passed around.
A possible implementation of a token might be:
typedef struct tok
{
    int tokType;
    char *tokVal;
} token, *ptoken;
Also, let's have the following enumeration:
enum {EMPTY=0, ID, VAL, EQ, PLUS, SUB, MULT, DIV, LPAREN, RPAREN};
So, with this, the simplified statement = x 3 would actually be the following structures:
{EQ, null} {ID, "x"} {VAL, "3"}
OK, so the interpreter in pseudo-code would look like this (assuming that the above is presented to the interpreter as a list):
while list not empty
    token <-- head(list)    /* returns the first token and removes it from the list */
    switch (token.tokType)
    {
        ....
        case EQ:            /* handling assignment */
            token <-- head(list)
            name = token.tokVal
            token <-- head(list)
            val = atoi(token.tokVal)
            addValue(env, name, val)
            break;
        case ID:
            name = token.tokVal
            val = lookupValue(env, name)
            ....
    }
Please be advised that the actual format of the above code will in all probability need to be modified to deal with other constructs; it is just a notional example!
Now it's your turn -- take a stab and show us what you've come up with.
Later
T.
You cannot declare new variables at runtime in C as such.
For what you want you could make a list of structs for each type you want to support. Each struct then contains the name of the variable and its value. Declaring a variable will add a new struct to the list.
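A rough, untested sketch of that idea, with purely illustrative names and value kinds:
#include <stdlib.h>
#include <string.h>

typedef enum { VAR_NUMBER, VAR_COMPLEX, VAR_MATRIX } varKind;

typedef struct variable
{
    char            *name;              /* the user-chosen variable name        */
    varKind          kind;              /* which member of the union is valid   */
    union
    {
        double       number;
        double       complexParts[2];   /* real and imaginary parts             */
        struct
        {
            int      rows, cols;
            double  *cells;             /* rows * cols values                   */
        } matrix;
    } value;
    struct variable *next;              /* next binding in the list             */
} variable;

/* "declaring" a number at runtime just prepends a new node to the list */
variable *declareNumber(variable *head, const char *name, double number)
{
    variable *v = malloc(sizeof *v);

    v->name = strdup(name);
    v->kind = VAR_NUMBER;
    v->value.number = number;
    v->next = head;
    return v;
}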
I'm assuming you were thinking about an interaction with your calculator that might look something like this:
> myCalc
mc> x=5
mc> 5
mc> 3*x
mc> 15
mc> quit
>
Where myCalc is the name of your program and the mc> prompt shows interaction with the calculator: the user enters a statement and the calculator displays its result.
Now, consider the first statement x=5. We need to parse it and determine whether it is valid according to the grammar you are using. Assuming it is, you then need to evaluate the statement, which for the sake of discussion has the abstract syntax tree (AST) ASMT(x, VAL(5)). ASMT and VAL are notional operators.
Now, I would take this as adding a new binding for x into the current environment. Exactly how this environment looks depends on what you are willing to allow, so for now let's assume you are just allowing variable assignment. A simple associative array would work here, where the key is the variable name and the data is the value.
Now consider the next statement 3*x. After parsing, we can assume the AST for the expression is TIMES(3, ID(x)). On evaluating this, our interpreter would first need to handle the ID(x) part, which means looking up the value of x in the environment, which is 5.
After that, the AST would look like TIMES(3, 5), which would be directly evaluated to 15.
N.B. I am being fairly loose with how the AST would be represented and how it would be evaluated by the interpreter; I'm trying to give a flavour of what to do, not full low-level implementation details.
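For a flavour of what that might look like in C, here is a very loose, untested sketch of such an AST and its evaluator; env_lookup and env_bind are stand-ins for whatever environment you end up building.
typedef enum { AST_VAL, AST_ID, AST_ASMT, AST_TIMES } astKind;

typedef struct ast
{
    astKind     kind;
    double      value;          /* AST_VAL: the literal                 */
    char       *name;           /* AST_ID / AST_ASMT: the variable name */
    struct ast *left, *right;   /* operands (AST_ASMT uses right only)  */
} ast;

/* stand-ins for whatever environment implementation you choose */
double env_lookup(const char *name);
void   env_bind(const char *name, double value);

double eval(ast *node)
{
    switch (node->kind)
    {
        case AST_VAL:
            return node->value;
        case AST_ID:
            return env_lookup(node->name);               /* e.g. ID(x) -> 5      */
        case AST_TIMES:
            return eval(node->left) * eval(node->right); /* TIMES(3, 5) -> 15    */
        case AST_ASMT:
        {
            double v = eval(node->right);                /* ASMT(x, VAL(5))      */
            env_bind(node->name, v);
            return v;
        }
    }
    return 0.0;   /* not reached for well-formed trees */
}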
hope this helps (a bit),
T

Semantics of comma-separated values with bison

I'm trying to give a semantic value to a list of comma-separated values. In fact, I have defined the reduction rules for bison using
commasv : exp
| commasv "," exp
where exp is a number, a variable, a function pointer, or also a commasv token with its respective syntax and semantic rules. The type of exp is double, so the type of commasv must be double.
The thing is that I want to store the list in order to use it, for example, on a function call. For instance
h = create_object()
compute_list(h,1,cos(3.14159))
will give the expected result of a certain compute_list function.
As a base bison file I've used the mfcalc example from the bison manual, and I replaced the yylex function with one generated using flex. For now I can do things like
pi = 3.14159
sin(pi)
ln(exp(5))
with the flex-generated version of the yylex function, but I want to use comma-separated values with function calls, list creation and more.
Thanks for your answers.
Then create a list to store the results in. Instead of having the commasv rule return an actual value, have it return the head of the list.
In general, as soon as you get a moderately advanced grammar (one incorporating things like lists), you can no longer really use plain values to represent the parse; you have to move to some sort of abstract syntax tree.
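As an untested sketch of the list idea (the names are illustrative, not from the mfcalc example): keep a linked list of the expression values and make the semantic value of commasv a pointer to its head.
/* The bison actions for the commasv rule would then look roughly like
 *
 *     commasv : exp                 { $$ = list_new($1);        }
 *             | commasv "," exp     { $$ = list_append($1, $3); }
 *
 * with a suitable %type declared for commasv in the prologue. */
#include <stdlib.h>

typedef struct exp_list
{
    double           value;
    struct exp_list *next;
} exp_list;

/* start a new one-element list */
exp_list *list_new(double value)
{
    exp_list *node = malloc(sizeof *node);

    node->value = value;
    node->next = NULL;
    return node;
}

/* append a value to the end of an existing list and return its head */
exp_list *list_append(exp_list *head, double value)
{
    exp_list *tail = head;

    while (tail->next != NULL)
        tail = tail->next;
    tail->next = list_new(value);
    return head;
}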

How to convert a PGresult to custom data type with libpq (PostgreSQL)

I'm using the libpq library in C to access my PostgreSQL database. So, when I do res = PQexec(conn, "SELECT point FROM test_point3d"); I don't know how to convert the PGresult I get to my custom data type.
I know I can use the PQgetvalue function, but again I don't know how to convert the returned string to my custom data type.
The best way to think about this is that data types interact with applications over a textual interface. libpq returns a string for just about anything. The programmer has the responsibility to parse the string and create a data type from it. I know the author has probably abandoned the question, but I am working on something similar and it is worth documenting a few important tricks here that are helpful in some cases.
Obviously if this is a C language type, with its own in and out representation, then you will have to parse the string the way you would normally.
However for arrays and tuples, the notation is basically
[open_type_identifier][csv_string][close_type_identifier]
For example a tuple may be represented as:
(35,65,1111111,f,f,2011-10-06,"2011-10-07 13:11:24.324195",186,chris,f,,,,f)
This makes it easy to parse. You can generally use existing CSV processors once you strip off the first and last character. Moreover, consider:
select row('test', 'testing, inc', array['test', 'testing, inc']);
row
-------------------------------------------------
(test,"testing, inc","{test,""testing, inc""}")
(1 row)
As this shows, you have standard CSV escaping inside nested attributes, so you can in fact determine that the third attribute is an array and then (having undoubled the quotes) parse it as an array. In this way nested data structures can be processed in a manner roughly similar to what you might expect with a format like JSON; the trick, though, is that it is nested CSV.
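As an untested illustration, here is how the text form might be pulled out with libpq and parsed into a C struct. It assumes the column's text representation looks like "(x,y,z)"; adjust the parsing to whatever your custom type actually emits.
#include <stdio.h>
#include <libpq-fe.h>

typedef struct { double x, y, z; } point3d;

static int fetch_points(PGconn *conn)
{
    PGresult *res = PQexec(conn, "SELECT point FROM test_point3d");

    if (PQresultStatus(res) != PGRES_TUPLES_OK)
    {
        fprintf(stderr, "query failed: %s", PQerrorMessage(conn));
        PQclear(res);
        return -1;
    }

    for (int row = 0; row < PQntuples(res); row++)
    {
        const char *text = PQgetvalue(res, row, 0);   /* e.g. "(1.5,2.5,3.5)" */
        point3d p;

        if (sscanf(text, "(%lf,%lf,%lf)", &p.x, &p.y, &p.z) == 3)
            printf("%f %f %f\n", p.x, p.y, p.z);
    }
    PQclear(res);
    return 0;
}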
