Timeout during indexing view in Azure Search despite settings - sql-server

I'm using an Azure Search indexer to index a view in an Azure SQL DB.
I've created the data source (the view) and set the following in its connection string:
(...)Trusted_Connection=False;Encrypt=True;Connection Timeout=1200;" },
The indexer still times out, and I can see from the Azure SQL DB logs that the indexer's query gets cancelled after 30 seconds:
ActionStatus: Cancellation
Statement: SET FMTONLY OFF; SET NO_BROWSETABLE ON;SELECT * FROM
[dbo].[v_XXX] ORDER BY [rowVersion] SET NO_BROWSETABLE OFF
ServerDuration: 00:00:30.3559524
The same statement takes ~2 minutes when run through SQL Server Management Studio and completes successfully.
Are there any other settings (server or database) that might override my connection timeout preference? If so, why is there no timeout when I query the DB through SSMS, but there is one when the indexer queries the view?

The timeout that cancels the operation is the command timeout, not the connection timeout. The default command timeout used to be 30 seconds, and currently there is no way to change it. We have increased the default command timeout to a much larger value (5 minutes) to mitigate this in the short term. Longer term, we will add the ability to specify a command timeout in the data source definition.

There is now a queryTimeout setting on the indexer (I think the value is in minutes). With it configured, my indexer now runs for longer than 20 minutes without error:
"startTime": "2016-01-01T00:00:00Z"
},
"parameters": {
"batchSize": null,
"maxFailedItems": 0,
"maxFailedItemsPerBatch": 0,
"base64EncodeKeys": false,
"configuration": {
"queryTimeout": "360"
}
},
"fieldMappings": [
{
Update: at the moment it cannot be set through the Azure portal. You can set it via the REST API:
PUT https://[service name].search.windows.net/indexers/[indexer name]?api-version=[api-version]
Content-Type: application/json
api-key: [admin key]

Use a GET request to https://[SERVICE].search.windows.net/indexers/[Indexer]?api-version=2016-09-01 to retrieve the indexer definition, then use a PUT to the same address to update it (see the sketch below).
Ref: MSDN.
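For illustration, a minimal Python sketch of that retrieve-and-update round trip, assuming the requests library; the service name, indexer name, and admin key are placeholders:

import requests

SERVICE = "myservice"      # placeholder service name
INDEXER = "myindexer"      # placeholder indexer name
API_VERSION = "2016-09-01"
URL = "https://%s.search.windows.net/indexers/%s?api-version=%s" % (SERVICE, INDEXER, API_VERSION)
HEADERS = {"Content-Type": "application/json", "api-key": "<admin key>"}  # placeholder key

# GET the current indexer definition.
indexer = requests.get(URL, headers=HEADERS).json()
indexer.pop("@odata.context", None)  # drop read-only metadata before updating

# Set the query timeout under parameters.configuration, then PUT the definition back.
indexer.setdefault("parameters", {}).setdefault("configuration", {})["queryTimeout"] = "360"
requests.put(URL, headers=HEADERS, json=indexer).raise_for_status()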

Related

Using ServiceStack OrmLite, is there a way to get an execution plan?

Using ServiceStack OrmLite 6.4 and Azure SQL Server (with SQLServerDialect2012), we have an issue with an enum causing excessive stalling and timeouts.
If we just convert it to a string, it is as quick as it should be:
var results = db.Select<SomeTable>(q => q.SomeColumn == SomeEnum.Value);   // ~3.5 seconds
var results2 = db.Select<SomeTable>(q => q.SomeColumn.ToString() == SomeEnum.Value.ToString());   // ~0.08 seconds
We are using default settings, so the enum in the DB is defined as a varchar(255).
Both queries give the same result.
To track the issue down we wanted to see what SQL is actually being fired, but all we get is a query with placeholders (#1, #2, etc.) and no indication of what parameters were used or how they are defined.
All our attempts to get a 1:1 SQL string that we can use to manually test the query and see the results have failed. MiniProfiler was the closest, as it shows the parameter values, but it does not contain the details necessary to recreate the query and reproduce the issue we have (manually recreating the query gives ~80 ms, as above).
Trying to get the execution plan alongside the query also fails:
db.ExecuteSql("SET STATISTICS PROFILE ON;");
var results = db.Select<SomeTable>(q => q.SomeColumn == SomeEnum.Value);
db.ExecuteSql("SET STATISTICS PROFILE OFF;");
This only returns the data, not any of the extra plan info I was hoping for.
I have not been able to find any sites or threads that explain how others get any kind of debug info.
What is the correct next step here?
OrmLite's Logging & Introspection page shows how you can view the SQL generated by OrmLite.
E.g. Configuring a debug logger should log the generated SQL and params:
LogManager.LogFactory = new ConsoleLogFactory(debugEnabled:true);

Java UDF Timeout

I'm running the brand-new Java UDF feature in my Snowflake environment (Azure) for text analysis using CoreNLP. I'm getting the error below for just 100 records; it succeeds for 10 records.
Handler execution timed out in 13.782877666s vs. max of 5s in function INSTRUCTION_ANALYZER with handler com.nlp.InstructionAnalyzer.getNamedEntity
Is there a workaround for this? Can I set the timeout somewhere, or is it a limitation of Java UDFs?

Operational Error: An existing connection was forcibly closed by the remote host. (10054)

I am getting this OperationalError periodically, probably when the application is inactive or idle for long hours. On refreshing the page it vanishes. I am using an MSSQL pyodbc connection string ("mssql+pyodbc:///?odbc_connect=...") in the FormHandlers and DbAuth of Gramex.
How can I keep the connection alive in Gramex?
Add the pool_pre_ping and pool_recycle parameters.
pool_pre_ping will normally emit SQL equivalent to “SELECT 1” each time a connection is checked out from the pool; if an error is raised that is detected as a “disconnect” situation, the connection will be immediately recycled.
pool_recycle prevents the pool from using a particular connection that has passed a certain age.
E.g.: engine = create_engine(connection_string, encoding='utf-8', pool_pre_ping=True, pool_recycle=3600)
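A slightly fuller sketch of the same idea, assuming SQLAlchemy 1.4+; the connection string is a placeholder:

import sqlalchemy as sa

# Placeholder connection string; substitute your own odbc_connect parameters.
connection_string = "mssql+pyodbc:///?odbc_connect=..."

engine = sa.create_engine(
    connection_string,
    pool_pre_ping=True,  # ping each connection on checkout; replace it if it's dead
    pool_recycle=3600,   # never reuse a connection older than an hour
)

# A stale connection dropped by the server is now replaced transparently
# instead of surfacing as a 10054 OperationalError.
with engine.connect() as conn:
    print(conn.execute(sa.text("SELECT 1")).scalar())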
Alternatively, you can add these parameters to the FormHandler configuration in gramex.yaml. This is required only for the first FormHandler with the connection string:
kwargs:
  url: ...
  table: ...
  pool_pre_ping: True
  pool_recycle: 60

JPA timeout does not work with SQL Server

Hi all,
I have written a program that connects to an Azure SQL DB using JPA. In the code, I set the lock timeout as below:
Map<String, Object> map = new HashMap<String, Object>();
map.put("javax.persistence.lock.timeout", 0);
return this.getEntityManager().find(cls, id, LockModeType.PESSIMISTIC_WRITE, map);
I hoped it would return null immediately in case the query cannot get the lock on the row,
but it did not work; it always blocks there.
Is something wrong, or does the SQL Server driver not support this timeout?
Thanks a lot.
It probably is not supported, but if you include the SQL that was generated, this could be confirmed.
You could also try setting the query timeout:
"eclipselink.jdbc.timeout"="100"
Or, if you are using EclipseLink, you could use:
"eclipselink.pessimistic-lock"="LockNoWait"

Bulk Download via Google App Engine Backend

I have 1.6 million entities in a Google App Engine app that I would like to download. I tried using the built-in bulkloader mechanism but found that it is terribly slow. While I can only download ~30 entities/second via the bulkloader, I can do ~500 entities/second by querying the datastore via a backend. A backend is necessary to circumvent the 60-second request limit. In addition, datastore queries can only live for up to 30 seconds, so you need to break up your fetches across multiple queries using query cursors.
The code on the server side fetches 1000 entities and returns a query cursor:
cursor = request.get('cursor')
devices = Pushdev.all()
if cursor and cursor != '':
    devices.with_cursor(cursor)
next1000 = devices.fetch(1000)
for d in next1000:
    t = int(time.mktime(d.created.timetuple()))
    response.out.write('%s/%s/%d\n' % (d.name, d.alias, t))
response.out.write(devices.cursor())
On the client side, I have a loop that invokes the handler on the server with a null cursor to begin with and then starts to pass the cursor received by the previous invocation. It terminates when it gets an empty result.
PROBLEM: I am only able to fetch a fraction (~20%) of the entities using this method. I get a response with empty data even though the full set of entities has not been traversed. Why does this method not fetch everything comprehensively?
I couldn't find anything to confirm or deny this in the docs, but my guess is that all() has a non-deterministic ordering, such that eventually one of your fetch(1000) calls will hit the "last element" and devices.cursor() will return nothing.
Try this:
devices = Pushdev.all().order('__key__')
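For illustration, here is a minimal sketch of the whole cursor-paging pattern with that deterministic __key__ ordering applied, assuming the Pushdev model and the old db API from the question; fetch_page is a hypothetical helper standing in for the server handler:

import time
from google.appengine.ext import db  # Pushdev is assumed to be a db.Model from the question

def fetch_page(cursor_string):
    # Stable __key__ ordering guarantees the cursor walks every entity exactly once.
    query = Pushdev.all().order('__key__')
    if cursor_string:
        query.with_cursor(cursor_string)
    entities = query.fetch(1000)
    lines = ['%s/%s/%d' % (d.name, d.alias, int(time.mktime(d.created.timetuple())))
             for d in entities]
    return lines, query.cursor()

# Client-side loop: keep requesting pages until an empty page comes back.
cursor = None
while True:
    lines, cursor = fetch_page(cursor)
    if not lines:
        break
    for line in lines:
        print(line)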
