How do the IMMUTABLE, STABLE and VOLATILE keywords affect the behaviour of a function? - database

We wrote a function get_timestamp() defined as
CREATE OR REPLACE FUNCTION get_timestamp()
RETURNS integer AS
$$
SELECT (FLOOR(EXTRACT(EPOCH FROM clock_timestamp()) * 10) - 13885344000)::int;
$$
LANGUAGE SQL;
This was used on INSERT and UPDATE to enter or edit a value in a created and modified field in the database record. However, we found when adding or updating records consecutively it was returning the same value.
On inspecting the function in pgAdmin III we noted that on running the SQL to build the function the key word IMMUTABLE had been injected after the LANGUAGE SQL statement. The documentation states that the default is VOLATILE (If none of these appear, VOLATILE is the default assumption) so I am not sure why IMMUTABLE was injected, however, changing this to STABLE has solved the issue of repeated values.
NOTE: As stated in the accepted answer, IMMUTABLE is never added to a function by pgAdmin or Postgres and must have been added during development.
My guess is that the function was being evaluated once and the result cached as an optimization, because it was marked IMMUTABLE, which tells the Postgres engine that the return value should not change given the same (empty) parameter list. However, when used directly in the INSERT statement rather than within a trigger, the function would return a distinct value FIVE times before returning the same value from then on. Is this due to some optimisation algorithm that says something like "If an IMMUTABLE function is used more than 5 times in a session, cache the result for future calls"?
Any clarification on how these keywords should be used in Postgres functions would be appreciated. Is STABLE the correct option for us given that we use this function in triggers, or is there something more to consider, for example the docs say:
(It is inappropriate for AFTER triggers that wish to query rows
modified by the current command.)
But I am not altogether clear on why.

The key word IMMUTABLE is never added automatically by pgAdmin or Postgres. Whoever created or replaced the function did that.
The correct volatility for the given function is VOLATILE (also the default), not STABLE - or it wouldn't make sense to use clock_timestamp() which is VOLATILE in contrast to now() or CURRENT_TIMESTAMP which are STABLE: those return the same timestamp within the same transaction. The manual:
clock_timestamp() returns the actual current time, and therefore its
value changes even within a single SQL command.
The manual warns that function volatility STABLE ...
is inappropriate for AFTER triggers that wish to query rows modified
by the current command.
... because repeated evaluation of the trigger function can return different results for the same row. So, not STABLE.
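A minimal sketch of the difference, assuming a psql session on the database that already holds get_timestamp(). It shows now() staying fixed for the whole transaction while clock_timestamp() keeps advancing, then checks and corrects the stored volatility label without re-creating the function.

```sql
-- now() is frozen at transaction start; clock_timestamp() keeps moving.
BEGIN;
SELECT now() AS txn_time, clock_timestamp() AS wall_clock;
SELECT pg_sleep(1);
SELECT now() AS txn_time, clock_timestamp() AS wall_clock;  -- txn_time is unchanged
COMMIT;

-- Inspect the stored label: 'i' = IMMUTABLE, 's' = STABLE, 'v' = VOLATILE.
SELECT proname, provolatile
FROM   pg_proc
WHERE  proname = 'get_timestamp';

-- Set the label to match the clock_timestamp() call in the function body.
ALTER FUNCTION get_timestamp() VOLATILE;
```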
You ask:
Do you have an idea as to why the function returned correctly five
times before sticking on the fifth value when set as IMMUTABLE?
The Postgres Wiki:
With 9.2, the planner will use specific plans regarding to the
parameters sent (the query will be planned at execution), except if
the query is executed several times and the planner decides that the
generic plan is not too much more expensive than the specific plans.
Bold emphasis mine. Doesn't seem to make sense for an IMMUTABLE function without input parameters. But the false label is overridden by the VOLATILE function in the body (voids function inlining): a different query plan can still make sense.
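A hedged sketch of how one might try to reproduce the effect with a prepared statement. The names demo_mislabelled and demo_plan are hypothetical, and the function is deliberately mislabelled like the one in the question. Whether and when the planner switches to a cached generic plan depends on the server version and its cost estimates, so the point at which the value stops changing may differ.

```sql
-- Hypothetical function with a VOLATILE body but an IMMUTABLE label.
CREATE FUNCTION demo_mislabelled() RETURNS timestamptz
  LANGUAGE sql IMMUTABLE
AS $$ SELECT clock_timestamp() $$;

-- The statement takes a parameter so the planner has custom plans to consider.
PREPARE demo_plan(int) AS
  SELECT $1 AS n, demo_mislabelled() AS ts;

-- Run this several times and compare the ts column between executions.
EXECUTE demo_plan(1);
EXECUTE demo_plan(2);
EXECUTE demo_plan(3);
EXECUTE demo_plan(4);
EXECUTE demo_plan(5);
EXECUTE demo_plan(6);
EXECUTE demo_plan(7);

-- Clean up the demo objects.
DEALLOCATE demo_plan;
DROP FUNCTION demo_mislabelled();
```

If the later executions return an identical ts, the planner has frozen the pre-evaluated "constant" into a reused plan, which is the risk of a false IMMUTABLE label.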
Related:
PostgreSQL Stored Procedure Performance
Aside
trunc() is slightly faster than floor() and does the same here, since positive numbers are guaranteed:
SELECT (trunc(EXTRACT(EPOCH FROM clock_timestamp()) * 10) - 13885344000)::int
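As a small illustration of why the "positive numbers" caveat matters, the two functions only disagree for negative inputs:

```sql
SELECT floor(1.5)  AS floor_pos,   -- 1
       trunc(1.5)  AS trunc_pos,   -- 1
       floor(-1.5) AS floor_neg,   -- -2 (rounds toward negative infinity)
       trunc(-1.5) AS trunc_neg;   -- -1 (rounds toward zero)
```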

Related

How do I reference a field from a repeatable instrument in REDCap?

I'm trying to use datediff() to calculate age in a longitudinal REDCap database, but the function is returning [no value], despite the calculation being valid and the smart variable help page corroborating that the function seems correct.
The first date is in a non-repeating instrument in one event. The second date, and also where the calculation is being done, is in a field in a second, repeatable instrument, in a separate, non-repeatable event.
My calculation currently looks like this:
datediff([firstdate],[seconddate][current-instance], "y")
I've also (for lack of any idea how to fix it), tried
datediff([firstdate],[secondeventname][seconddate], "y")
Both calculations return [no value]. I've double checked that the dates are in the same ymd format, and that the function DOES work when I replace the second argument with 'today', so I know that the issue is the second argument, but the smart variable FAQ seems to be suggesting the first line of code above, which of course hasn't been working.
Does anyone have experience with what the issue might be?
In a longitudinal data collection project, you should prefix each variable with the event it comes from; otherwise REDCap will only look in the current event for that variable, and return no value if it can't find anything.
Furthermore, the datediff function takes a 4th parameter for the date format, either "ymd", "dmy" or "mdy", and both date1 and date2 must be in the same format.
You may not need the [current-instance] smart variable; at least in my testing I didn't. If you are performing this calculation from the event that contains [seconddate] (indeed, from the instance itself if it is repeating), then [seconddate] alone may be enough to reference it. To reference [firstdate], however, you need to prefix it with [event_1_arm_1] or whatever your event name is, or with the smart variable [first-event-name] (which would be much more portable for multi-arm studies).
So I would try the following:
datediff( [first-event-name][firstdate], [seconddate], "y", "ymd" )

SQL Server: function precedence and short-circuiting in WHERE clause

Consider this setup:
create table #test (val varchar(10))
insert into #test values ('20100101'), ('1')
Now if I run this query
select *
from #test
where ISDATE(val) = 1
and CAST(val as datetimeoffset) > '2005-03-01 00:00:00 +00:00'
it will fail with
Conversion failed when converting date and/or time from character string
which tells me that the where conditions are not short-circuited and both functions are evaluated. OK.
However if I run
select *
from #test
where LEN(val) > 2
and CAST(val as datetimeoffset) > '2005-03-01 00:00:00 +00:00'
it doesn't fail, which tells me that where clause is short-circuited in this case.
This
select *
from #test
where ISDATE(val) = 1
and CAST(val as datetimeoffset) > '2005-03-01 00:00:00 +00:00'
and LEN(val) > 2
fails again, but if I move the length check to before the cast, it works. So it looks like the functions are evaluated in the order they appear in the query.
Can anyone explain why first query fails?
It fails because SQL is declarative so the order of your conditions is not taken into account when the plan is generated (nor is it required to do so).
The usual way to get around this is to use CASE which has strict rules about sequence and when to stop.
In your case you will probably need nested CASEs, something like this:
WHERE
(
    CASE WHEN ISDATE(val) = 1 THEN
        CASE WHEN CAST(val AS datetimeoffset) > '2005-03-01 00:00:00 +00:00' AND
                  LEN(val) > 2
             THEN 1 ELSE 0 END
    ELSE 0
    END
) = 1
(note this is unlikely to be actually correct SQL as I just typed it in).
By the way, even if you get it "working" by rearranging the conditions, I'd advise against relying on that. Accept that SQL simply doesn't work that way. As the data and statistics change, SQL Server is upgraded, the workload varies and indexes are added, the query plan could change. Any attempts to "get it working" are going to be short-lived at best, so go with the CASE, which will continue to work once you've got it right (provided you nest CASE expressions where required and don't fall into the same precedence trap in the CASE conditions!)
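A minimal sketch of the guarded form against the question's own setup, assuming SQL Server. The second query uses TRY_CAST, which is not mentioned in the answers and is only available from SQL Server 2012; it is shown as a possible alternative, not as the answerer's method.

```sql
-- Setup copied from the question.
CREATE TABLE #test (val varchar(10));
INSERT INTO #test VALUES ('20100101'), ('1');

-- The CAST sits inside the CASE branch, so it should only be reached
-- for rows that already passed the ISDATE check.
SELECT *
FROM   #test
WHERE  CASE WHEN ISDATE(val) = 1
            THEN CAST(val AS datetimeoffset)
       END > '2005-03-01 00:00:00 +00:00';

-- Alternative (assumption: SQL Server 2012 or later): TRY_CAST yields NULL
-- instead of raising an error when the conversion fails.
SELECT *
FROM   #test
WHERE  TRY_CAST(val AS datetimeoffset) > '2005-03-01 00:00:00 +00:00';

DROP TABLE #test;
```

In both queries a row that cannot be converted produces NULL, the comparison evaluates to unknown, and the row is filtered out rather than aborting the statement.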
The mystery is answered if you examine the Execution Plan. Both the CAST() and the LEN() are applied as part of the Table Scan step, while the test for IsDate() is a separate Filter test after the Table Scan.
It appears that the SQL Engine's internal optimizations use certain filtering functions as part of the retrieval of the data, and others as separate filters, almost certainly as a form of query optimization to minimize the load from disk into main memory. However, more complex functions, such as IsDate(), which is dependent on system variables such as system date format in some cases (is '01/02/2017' Jan 2nd or Feb 1st?), need to have the data retrieved before the filter is applied.
Although I have no hard information on this, I strongly suspect that any filter more resource intensive than a certain level is delegated to the Filter steps in the query plan, and anything simple/fast enough to be checked as the data is being read in is applied during the Scan/Seek steps. Also, if a filter could be applied on the data in the index, I am certain that it will be tested before any non-index data is tested, solely to minimize disk reads, which are bad performance juju (this may not apply on the Clustered index of the table). In these cases, the short-circuiting might not be straightforward, with an IsDate() test specified on a non-index field being executed after a similar test on an indexed field, no matter where they are in the list of conditions.
That said, it appears to be true that conditions short-circuit when they are executed in the same step of the query plan. If you insert a string like '201612123' into the temp table, then add a check on Len(val) < 9 after the date comparison, it still generates an error, instead of checking both LEN() conditions at the same time in a tiny optimization.
which tells me that where conditions are not short-circuited and both functions are evaluated.
To expand on LoztInSpace's answer, your terminology suggests you are not interpreting SQL correctly, on its own terms.
The various parts of a SELECT statement are not "functions". The entire statement is atomic. You supply the query as a unit, and the DBMS responds. There is no "before" and no "after". There is just the query.
Those are the rules. Your job in formulating the query is to supply one that is valid. It's a logical progression: valid question, valid answer, etc. The moment you step out of that frame, you might as well be asking, "why is the sky seven?".
One small clarification to @LoztInSpace's answer. When he refers to the order of your statements, he's presumably talking about the phrasing of your query, which for purposes of evaluation is inconsequential. Sequential SQL statements are executed sequentially, as presented. That is guaranteed by the SQL standard.

SQLCLR Aggregate: no message about NULL values being eliminated

When I do
SELECT SUM(some_field) FROM some_table
the result is a single record/field with a number in it. Additionally, a message is sent to the client along the lines of "Warning: Null value is eliminated by an aggregate or other SET operation." if some_field has a NULL value somewhere in the table. Only when all values are NULL (or the table is empty) will it return NULL.
I'm currently in the process of writing my own SqlUserDefinedAggregate and although things work as expected, it does NOT show me this message when one of the values passed turns out to be NULL. The outcome of the function is still correct, but there is no warning. At first I assumed I might have to pipe this manually in the Terminate() method, but alas, SQLCLR then throws an InvalidOperationException saying "Data access is not allowed in this context."
Any hints?
If your aggregate is discarding NULLs then the IsInvariantToNulls property should definitely be set to true else you might get unexpected results sometimes, as stated on the MSDN page for SqlUserDefinedAggregateAttribute.IsInvariantToNulls:
Used by the query processor, this property is true if the aggregate is invariant to nulls. That is, the aggregate of S, {NULL} is the same as aggregate of S. For example, aggregate functions such as MIN and MAX satisfy this property, while COUNT(*) does not.
Incorrectly setting this property can result in incorrect query results. This property is not an optimizer hint; it affects the plan selected and the results returned by the query.
And a UDA is a function so there is no SqlContext.Pipe to use. And even if there was, the Terminate method isn't an appropriate place to handle this since it executes for every group. The warning you are seeing when using SUM, however, is an ANSI warning and is displayed once for the query, not per group.
So, if SQL Server isn't displaying the warning then there likely isn't anything you can do about it. I assume that SQL Server isn't using the IsInvariantToNulls property as a means of knowing if it should display the message or not because it is not guaranteed to be accurately set.
And personally, I find this to be a benefit since, in my opinion, the "Null value is eliminated by an aggregate" warning is entirely not helpful, yet if you want to get rid of it you need to use ISNULL() to inject a value that won't influence the result (e.g. 0 in the case of SUM), or turn off ALL ANSI warnings, in which case you disable some warnings that are sometimes helpful.
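A short sketch of the two ways to avoid the warning without touching ANSI_WARNINGS, assuming SQL Server with default settings. The temp table #some_table stands in for the question's some_table so the example is self-contained.

```sql
-- Stand-in for the question's table, with one NULL among the values.
CREATE TABLE #some_table (some_field int NULL);
INSERT INTO #some_table VALUES (1), (NULL), (3);

-- Built-in aggregate over a NULL: returns 4 and raises the ANSI warning.
SELECT SUM(some_field) AS total FROM #some_table;

-- Inject a neutral value so the aggregate never sees a NULL: returns 4.
SELECT SUM(ISNULL(some_field, 0)) AS total FROM #some_table;

-- Or filter the NULL rows out before aggregating: returns 4.
SELECT SUM(some_field) AS total
FROM   #some_table
WHERE  some_field IS NOT NULL;

DROP TABLE #some_table;
```

Note that the ISNULL variant changes the result when every value is NULL: it returns 0 where plain SUM would return NULL.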

ColdFusion 10 error with Stored Procedures

In a .CFC file, within a CFfunction and with CFargument tags.
<cfscript>
var sp=new storedproc();
sp.setDatasource(variables.datasource);
sp.setProcedure("storedProcedure_INSERT");
sp.addParam(cfsqltype="cf_sql_integer",type="in",value=arguments.one);
sp.addParam(cfsqltype="cf_sql_integer",type="in",value=arguments.two);
sp.addParam(cfsqltype="cf_sql_integer",type="in",value=arguments.three);
sp.addParam(cfsqltype="cf_sql_integer",type="in",value=arguments.four);
sp.addProcResult(name="results",resultset=1);
//writeDump(sp);break; //This dump is reached
var spObj=sp.execute(); //blows up here; this is never reached
writeDump(spObj);break; //This is never reached, either.
var spResults=spObj.getProcResultSets().results;
A shiny nickel to anyone who can tell me why the sp.execute() is blowing up with the message
"Cannot find results key in structure.
The specified key, results, does not exist in the structure."
I've used this pseudo-code many, many times in the past, and never had it do this. I'm connected to an MSSQL Server 2012 DB, everything's cricket in CF Admin, and other SPs are working properly. The stack trace doesn't even include any of MY code at all o_O
The error occurred in C:/ColdFusion10/cfusion/CustomTags/com/adobe/coldfusion/base.cfc: line 491
Called from C:/ColdFusion10/cfusion/CustomTags/com/adobe/coldfusion/storedproc.cfc: line 142
Called from //hq-devfs/development$/websites/myProject/cfc/mySOAPWSDLs.cfc: line 123
And SO is blowing up if I try and paste anymore of that. Google has...not been helpful ._.
Short answer: The error means you are trying to retrieve a resultset from the stored procedure, when it does not actually return one. A simple solution is to add a SELECT to the end of your procedure, so it returns a resultset containing the data you need. Then your original code will work:
SELECT @@ROWCOUNT AS NumOfRowsAffected;
Longer answer:
The method you are using, addProcResult(), is the equivalent of <cfprocresult>. It is intended to capture a resultset returned from a stored procedure. (Due to CF's poor choice of attribute names, a lot of people think "resultset" means the storedproc "result" structure, but they are two totally different things.) A "resultset" is a "query object", in CF parlance.
While all four (4) of the primary sql statements return some result, not all of them return a "query object"
Only SELECT statements generate a "query object"
INSERT/UPDATE/DELETE statements simply return the number of rows affected. They do not generate a "query object".
Since your stored procedure performs an INSERT, it does not generate a "query object". Hence the error when you try and grab the non-existent query here:
sp.addProcResult(name="results",resultset=1);
The simple solution is to add a SELECT statement to the end of your stored procedure, so that it does return a query object. Then your code will work as expected.
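A minimal sketch of what such a procedure might look like in T-SQL. Only the procedure name and the four integer inputs come from the question; the table dbo.hypothetical_table, its column names and the parameter names are assumptions for illustration.

```sql
CREATE PROCEDURE storedProcedure_INSERT
    @one   int,
    @two   int,
    @three int,
    @four  int
AS
BEGIN
    -- Optional: suppress the "n rows affected" messages so the only thing
    -- returned to the caller is the resultset produced by the final SELECT.
    SET NOCOUNT ON;

    -- Hypothetical target table and columns.
    INSERT INTO dbo.hypothetical_table (col_one, col_two, col_three, col_four)
    VALUES (@one, @two, @three, @four);

    -- The trailing SELECT is what gives addProcResult(resultset=1) a query to capture.
    SELECT @@ROWCOUNT AS NumOfRowsAffected;
END;
```

@@ROWCOUNT must be read immediately after the INSERT, because the next statement that runs resets it.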
As an aside, I suspect you were actually trying to grab the "result" structure, but used the wrong method. The equivalent of <cfstoredproc result=".."> is getPrefix(). Though that would not work here anyway. According to the docs, it does not contain the number of rows affected. Probably because stored procedures can execute multiple statements, each one potentially returning a row count, so there is not just a single value to return.

Why do I get a SQL syntax error with this?

Trying to run this query in LINQPad 4:
SELECT item_group_id as AccountID, IIF(ISNULL(t_item_group.description),'[blank]',t_item_group.description) AS Name
FROM t_item_group
WHERE active = TRUE
I get, "the isnull function requires 2 argument(s)."
I've tried moving the parens around, changing the "[blank]" to "[blank]" and "[blank]" , but none of it helps...
I have two similar queries (both using IIF(ISNULL(...))) that LINQPad won't run for this reason, yet they run fine in practice (in my Web API app). So perhaps LINQPad is more "picky" than it needs to be, but what is it expecting, SQL syntax-wise?
ISNULL already behaves like an 'if' statement.
You can just replace
IIF(ISNULL(t_item_group.description),'[blank]',t_item_group.description)
with
ISNULL(t_item_group.description, '[blank]')
The ISNULL uses the first parameter (the 'description'), unless that value is null in which case it will use the second parameter.
As an aside, one of the reasons I don't care for ISNULL is that it is poorly named. You'd assume that given its name it will return a bit - true if the parameter is null, false if not null - which you could use in an 'if' statement like you attempted. But that's not how it works.
The alternative is to use COALESCE. It provides much the same functionality, but the naming makes sense.
co·a·lesce ˌkōəˈles verb
1. come together and form one mass or whole.
To COALESCE two parameters is to force them into one non-nullable result. And the function is actually more powerful, as you can provide multiple parameters - COALESCE(i.description, i.name, '[blank]') is perfectly valid.
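Applied to the query from the question, a sketch might look like this. It assumes the connection is to SQL Server, where a bit column is compared with 1 rather than TRUE; if active is stored differently, that predicate needs adjusting.

```sql
SELECT item_group_id AS AccountID,
       -- Falls back to '[blank]' only when description is NULL.
       COALESCE(t_item_group.description, '[blank]') AS Name
FROM   t_item_group
WHERE  active = 1;  -- assumption: active is a bit column
```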
