JAVA UDF Timeout - snowflake-cloud-data-platform

I'm running the brand-new Java UDFs in my Snowflake environment (Azure) for text analysis using CoreNLP. I'm getting the error below for just 100 records; it succeeds for 10 records.
Handler execution timed out in 13.782877666s vs. max of 5s in function INSTRUCTION_ANALYZER with handler com.nlp.InstructionAnalyzer.getNamedEntity
Is there a workaround for this? Can I set the timeout somewhere, or is it a limitation of Java UDFs?

Snowflake UDF execution time limit

I've got an error when calling a UDF as the amount of data increases:
Database Error in model test_model (models/prep/test_model.sql)
100132 (P0000): JavaScript execution error: UDF execution time limit exceeded by function IMAGE_URL
compiled SQL at target/run/models/test/test_model.sql
As mentioned in the documentation, there is an execution time limit for JavaScript UDFs. How long is the time limit, and is it configurable?
So let's write a function and use it to test this out.
CREATE OR REPLACE FUNCTION long_time(D DOUBLE)
RETURNS VARIANT
LANGUAGE JAVASCRIPT
AS $$
    // Busy-wait for sleepDuration milliseconds (there is no sleep() in JS UDFs).
    function sleepFor(sleepDuration) {
        var now = new Date().getTime();
        while (new Date().getTime() < now + sleepDuration) {
            /* Do nothing */
        }
    }
    sleepFor(D * 1000);
    return D;
$$;
select long_time(1); -- takes 1.09s
select long_time(10); -- takes 10.32s
select long_time(60); -- Explodes with
JavaScript execution error: UDF execution time limit exceeded by function LONG_TIME
BUT it ran for 31.33s before failing, so it seems you have 30 seconds to complete, which feels to me like a rather large amount of time per call.
The default timeout limit for JS UDFs is 30 seconds for a row set! Snowflake will send rows in sets to the JS engine, and it will try to process all rows in the set within that time. The size of the row set may vary, but you may assume it will be around 1K (this is just an estimation, the number of rows in a set could be much higher or lower).
The timeout limit is different for Java and Python UDFs. It's 300 seconds for them.
As Felipe said, you may contact Snowflake Support and share your query ID to get help with this error. Support may guide you through mitigating the issue.
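Since the budget applies per row set rather than per row, one client-side mitigation is to keep each batch's work bounded and resubmit whatever did not fit. A minimal pure-Python sketch of that idea (analyze_with_budget, analyze, and the budget value are illustrative assumptions, not part of any Snowflake API):

```python
import time

def analyze_with_budget(rows, analyze, budget_s=25.0):
    """Process rows, stopping before an assumed per-set time budget expires.

    Returns (results, leftover) so the caller can resubmit the remainder;
    analyze is any per-row function, and budget_s is a safety margin below
    the 30-second limit described above.
    """
    deadline = time.monotonic() + budget_s
    results, leftover = [], []
    for i, row in enumerate(rows):
        if time.monotonic() >= deadline:
            leftover = rows[i:]
            break
        results.append(analyze(row))
    return results, leftover
```

The caller loops, feeding leftover back in, so no single invocation ever exceeds the budget.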

Increase the lock timeout with SQLite, and what are the default values?

Well-known issue when many clients query an SQLite database: database is locked.
I would like to increase the delay (in ms) to wait for lock release on Linux, to get rid of this error.
From the sqlite command-line shell, I can use for example (4 seconds):
sqlite> .timeout 4000
sqlite>
I've started many processes which do selects/inserts/deletes, and if I don't set this value with the sqlite shell, I sometimes get:
sqlite> select * from items where code like '%30';
Error: database is locked
sqlite>
So what is the default value for .timeout?
In Perl 5.10 programs I also sometimes get this error, even though the default value seems to be 30000 (so 30 seconds; this is not documented).
Did the programs actually wait 30 seconds before raising this error? If so, that seems crazy: there must be at least some moment when the database is free, even with many other processes running against it.
my $dbh = DBI->connect($database,"","") or die "cannot connect $DBI::errstr";
my $to = $dbh->sqlite_busy_timeout(); # $to get the value 30000
Thanks!
The default busy timeout for DBD::SQLite is defined in dbdimp.h as 30000 milliseconds. You can change it with $dbh->sqlite_busy_timeout($ms);.
The sqlite3 command-line shell has the normal SQLite default of 0; that is to say, no timeout. If the database is locked, it errors right away. You can change it with .timeout ms or pragma busy_timeout=ms;.
The timeout works as follows:
The handler will sleep multiple times until at least "ms" milliseconds of sleeping have accumulated. After at least "ms" milliseconds of sleeping, the handler returns 0 which causes sqlite3_step() to return SQLITE_BUSY.
If you get a busy database error even with a 30 second timeout, you just got unlucky as to when attempts to acquire a lock were made on a heavily used database file (or something is running a really slow query). You might look into WAL mode if not already using it.
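The difference between the shell's default of 0 and DBD::SQLite's 30000 ms is easy to reproduce with Python's standard sqlite3 module, whose connect(timeout=...) parameter maps onto the same busy-timeout mechanism (note Python's own default is 5 seconds, so we pass 0 explicitly to mimic the shell):

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.db")

# isolation_level=None gives autocommit so we can issue BEGIN ourselves;
# timeout=0 mimics the sqlite3 shell's default of no busy timeout.
writer = sqlite3.connect(path, timeout=0, isolation_level=None)
other = sqlite3.connect(path, timeout=0)

writer.execute("CREATE TABLE items (code TEXT)")

# Hold a write lock from the first connection...
writer.execute("BEGIN IMMEDIATE")
writer.execute("INSERT INTO items VALUES ('A30')")

# ...so the second connection errors immediately instead of sleeping.
error = None
try:
    other.execute("INSERT INTO items VALUES ('B30')")
except sqlite3.OperationalError as exc:
    error = str(exc)
print(error)  # database is locked

writer.execute("ROLLBACK")
```

With a non-zero timeout (or .timeout 4000 in the shell), the second connection would instead retry for that long before giving up.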

Asynchronous cursor execution in Snowflake

(Submitting on behalf of a Snowflake user)
At the time of query execution on Snowflake, I need its query ID, so I am using the following code snippet:
cursor.execute(query, _no_results=True)
query_id = cursor.sfqid
cursor.query_result(query_id)
This code snippet works fine for quick queries, but for a query that takes more than 40-45 seconds to execute, the query_result function fails with KeyError: u'rowtype'.
Stack trace:
File "snowflake/connector/cursor.py", line 631, in query_result
self._init_result_and_meta(data, _use_ijson)
File "snowflake/connector/cursor.py", line 591, in _init_result_and_meta
for column in data[u'rowtype']:
KeyError: u'rowtype'
Why would this error occur? How to solve this problem?
Any recommendations? Thanks!
The Snowflake Python Connector allows for async SQL execution by using cur.execute(sql, _no_results=True).
This "fire and forget" style of SQL execution allows the parent process to continue without waiting for the SQL command to complete (think long-running SQL that may time out).
If this is used, many developers will write code that captures the unique Snowflake query ID (as you have in your code) and then use that query ID to "check back on the query status later" in some sort of looping process. When you check back and the query has finished, you can get the results from that query ID using the result_scan() function.
https://docs.snowflake.net/manuals/sql-reference/functions/result_scan.html
I hope this helps...Rich
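That "check back later" loop can be sketched generically in plain Python; is_still_running is a placeholder for whatever status check your connector version provides on the saved query ID, so treat the names here as assumptions rather than connector API:

```python
import time

def wait_for_query(is_still_running, poll_interval_s=1.0, timeout_s=3600.0):
    """Poll until is_still_running() returns False.

    is_still_running is any zero-argument callable; in real code it would
    wrap a status check on the saved query ID. Raises TimeoutError if the
    query does not finish within timeout_s seconds.
    """
    deadline = time.monotonic() + timeout_s
    while is_still_running():
        if time.monotonic() >= deadline:
            raise TimeoutError("query did not finish in time")
        time.sleep(poll_interval_s)
```

Once the loop exits, fetch the rows with SELECT * FROM TABLE(RESULT_SCAN('<query_id>')) as described above.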

Timeout during indexing view in Azure Search despite settings

I'm using Azure Search Indexer to index a view from Azure SQL DB.
I've created a data source (the view) and set these options in the connection string:
(...)Trusted_Connection=False;Encrypt=True;Connection Timeout=1200;" },
The indexer still returns timeouts and I see from Azure SQL DB logs, that Indexer's query gets cancelled after 30 seconds:
ActionStatus: Cancellation
Statement: SET FMTONLY OFF; SET NO_BROWSETABLE ON;SELECT * FROM
[dbo].[v_XXX] ORDER BY [rowVersion] SET NO_BROWSETABLE OFF
ServerDuration: 00:00:30.3559524
The same statement takes ~2 minutes when run through SQL Server Mgmt Studio and gets completed.
I wonder if there are any other settings (server or database) that override my connection-timeout preference. If so, why is there no timeout when I query the database using SSMS, but a timeout when the indexer queries the view?
The timeout that cancels the operation is the command timeout, not the connection timeout. The default command timeout used to be 30 seconds, and currently there is no way to change it. We have increased the default command timeout to a much larger value (5 minutes) to mitigate this in the short term. Longer term, we will add the ability to specify a command timeout in the data source definition.
Now there is a setting on the indexer with which you can configure the queryTimeout. I think it is in minutes. My indexer now runs for longer than 20 minutes without error.
"startTime": "2016-01-01T00:00:00Z"
},
"parameters": {
    "batchSize": null,
    "maxFailedItems": 0,
    "maxFailedItemsPerBatch": 0,
    "base64EncodeKeys": false,
    "configuration": {
        "queryTimeout": "360"
    }
},
"fieldMappings": [
    {
Update: at the moment it cannot be set through the Azure portal. You can set it via the REST API:
PUT https://[service name].search.windows.net/indexers/[indexer name]?api-version=[api-version]
Content-Type: application/json
api-key: [admin key]
Use the REST API link https://[SERVICE].search.windows.net/indexers/[Indexer]?api-version=2016-09-01 to get the indexer definition, and then use a POST to the same address to update it.
Ref MSDN
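The update itself is just a JSON round-trip: fetch the definition, set parameters.configuration.queryTimeout, and send it back. A small helper sketch (pure Python; the nesting follows the indexer JSON above, and the units of the value are the answer's assumption):

```python
import json

def set_query_timeout(indexer_def, value):
    """Return a copy of an indexer definition with
    parameters.configuration.queryTimeout set to str(value)."""
    updated = json.loads(json.dumps(indexer_def))  # cheap deep copy
    config = updated.setdefault("parameters", {}).setdefault("configuration", {})
    config["queryTimeout"] = str(value)
    return updated
```

Then send the updated JSON back to the indexer URL with the admin-key header shown above.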

Constant Timer delays the thread more than is set

I have a test plan with 10 requests. Just the requests, without a Constant Timer, take about 18 seconds. When I add one Constant Timer with a 1000-millisecond delay after the third request, it takes about 28 seconds.
Is it a problem with JMeter, or am I doing something wrong?
I'm running Ubuntu (elementary OS) with JMeter v2.11 r1554548.
I'm testing another server, not my laptop.
In the JMeter test plan I'm using a Cache Manager, a Cookie Manager, and Request Defaults at the beginning, one request with a POST action, and a Summary Report, Graph Results, View Results in Table, and a Simple Data Writer at the end of the test plan.
Everything is in one thread.
The position of a timer element has no impact; a timer does not execute at the place where it appears in the tree.
In fact, it applies to every request in the scope of the timer's parent, so your single Constant Timer adds its 1000 ms delay to each of the 10 requests, not just the one it sits under.
Read this:
http://jmeter.apache.org/usermanual/test_plan.html
4.10 Scoping Rules
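The scoping rule also explains the exact numbers in the question; a quick check using the figures given there (treated as approximate):

```python
# Figures from the question: ~18 s without the timer, 10 samplers in the
# thread group, and a 1000 ms Constant Timer that applies to every
# sampler in its scope (not just the one it sits under).
base_s = 18.0
samplers = 10
timer_delay_s = 1.0

expected_total_s = base_s + samplers * timer_delay_s
print(expected_total_s)  # 28.0, matching the observed ~28 seconds
```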