List of SQL Server errors that should be retried?

Is there a concise list of SQL Server stored procedure errors that make sense to automatically retry? Obviously, retrying a "login failed" error doesn't make sense, but retrying "timeout" does. I'm thinking it might be easier to specify which errors to retry than to specify which errors not to retry.
So, besides "timeout" errors, what other errors would be good candidates for automatic retrying?
Thanks!

You should retry (re-run) the entire transaction, not just a single query/SP.
As for the errors to retry, I've been using the following list:
DeadlockVictim = 1205,
SnapshotUpdateConflict = 3960,
// I haven't encountered the following 4 errors in practice
// so I've removed these from my own code:
LockRequestTimeout = 1222,
OutOfMemory = 701,
OutOfLocks = 1204,
TimeoutWaitingForMemoryResource = 8645,
The most important one is of course the "deadlock victim" error 1205.
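For illustration, a transaction-level retry wrapper might look like the following JDBC sketch (my own sketch, not from the original answer; the retryable set mirrors the list above, and the TransactionBody callback and attempt limit are illustrative choices):
import java.sql.Connection;
import java.sql.SQLException;
import java.util.Set;

public final class RetryingTransaction {
    // Error numbers from the list above; with the Microsoft JDBC driver,
    // SQLException.getErrorCode() returns the native SQL Server error number.
    private static final Set<Integer> RETRYABLE = Set.of(1205, 3960, 1222, 701, 1204, 8645);
    private static final int MAX_ATTEMPTS = 3;

    @FunctionalInterface
    public interface TransactionBody {
        void execute(Connection con) throws SQLException;
    }

    /** Runs the whole transaction and re-runs it when a retryable error occurs. */
    public static void run(Connection con, TransactionBody body) throws SQLException {
        for (int attempt = 1; ; attempt++) {
            try {
                con.setAutoCommit(false);
                body.execute(con);   // every statement of the transaction
                con.commit();
                return;
            } catch (SQLException e) {
                con.rollback();
                if (!RETRYABLE.contains(e.getErrorCode()) || attempt >= MAX_ATTEMPTS) {
                    throw e;         // not retryable, or retries exhausted
                }
            }
        }
    }
}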

I would extend that list. If you want an absolutely complete list, use the following query and filter the result:
select * from master.dbo.sysmessages where description like '%memory%'
int[] errorNums = new int[]
{
701, // Out of Memory
1204, // Lock Issue
1205, // Deadlock Victim
1222, // Lock request time out period exceeded.
7214, // Remote procedure time out of %d seconds exceeded. Remote procedure '%.*ls' is canceled.
7604, // Full-text operation failed due to a time out.
7618, // %d is not a valid value for a full-text connection time out.
8628, // A time out occurred while waiting to optimize the query. Rerun the query.
8645, // A time out occurred while waiting for memory resources to execute the query. Rerun the query.
8651, // Low memory condition
};

You can use a SQL query to look for errors explicitly requesting a retry (trying to exclude those that require another action too).
SELECT error, description
FROM master.dbo.sysmessages
WHERE msglangid = 1033
AND (description LIKE '%try%later.' OR description LIKE '%. rerun the%')
AND description NOT LIKE '%resolve%'
AND description NOT LIKE '%and try%'
AND description NOT LIKE '%and retry%'
Here's the list of error codes:
539,
617,
952,
956,
983,
1205,
1807,
3055,
5034,
5059,
5061,
5065,
8628,
8645,
8675,
10922,
14258,
20689,
25003,
27118,
30024,
30026,
30085,
33115,
33116,
40602,
40642,
40648
You can tweak the query to look for other conditions like timeouts or memory problems, but I'd recommend configuring your timeout length correctly up front, and then backing off slightly in these scenarios.
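If you do retry on timeouts or memory pressure, backing off between attempts keeps the retries from piling more load onto a server that is already struggling. A minimal sketch (my own illustration; the base delay, cap, and jitter are arbitrary choices):
import java.util.concurrent.ThreadLocalRandom;

final class Backoff {
    /** Sleeps for an exponentially growing, jittered delay before the next retry. */
    static void beforeRetry(int attempt) throws InterruptedException {
        long baseMs = 200L << Math.min(attempt, 5);                  // 400, 800, ... capped at 6400 ms
        long jitterMs = ThreadLocalRandom.current().nextLong(baseMs / 2);
        Thread.sleep(baseMs + jitterMs);
    }
}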

I'm not sure about a full listing of these errors, but I can warn you to be VERY careful about retrying queries. Often there's a larger problem afoot when you get errors from SQL Server, and simply re-running queries will only compound the issue. For instance, with the timeout error, you typically have either a network bottleneck, poorly indexed tables, or something along those lines, and re-running the same query will add to the latency of other queries that are already obviously struggling to execute.

The one SQL Server error that you should always catch on inserts and updates (and it is quite often missed) is the deadlock error, no. 1205.
The appropriate action is to retry the INSERT/UPDATE a small number of times, as in the sketch below.
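A minimal sketch of that statement-level retry (my own illustration; the table, columns, and values are made up, and only error 1205 is caught):
import java.math.BigDecimal;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

class DeadlockRetry {
    // Retries a single INSERT a few times when chosen as deadlock victim (error 1205).
    static void insertWithRetry(Connection con) throws SQLException {
        for (int attempt = 1; attempt <= 3; attempt++) {
            try (PreparedStatement ps = con.prepareStatement(
                    "INSERT INTO orders (id, amount) VALUES (?, ?)")) { // illustrative table
                ps.setInt(1, 42);
                ps.setBigDecimal(2, new BigDecimal("9.99"));
                ps.executeUpdate();
                return;
            } catch (SQLException e) {
                if (e.getErrorCode() != 1205 || attempt == 3) throw e;
            }
        }
    }
}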

Related

batch query is not allowed to request data from "".""

I'm getting started with Kapacitor and have been trying to run the first guide in the Kapacitor documentation, but with data I already have. I managed to define a task, but I can neither enable it nor run a backfill. I came across this question, which is similar to my problem, but the answer there didn't help. In contrast to the error message there, I get empty strings for database, retention policy, and/or measurement.
In Kapacitor config I set an InfluxDB connection to the local host instance with the name localhost (which has a database mydb and the measurements weather.current.clouds and weather.current.visibility with default retention policy autogen) and created the following weathertest.tick script:
dbrp "mydb"."autogen"
var clouds = batch
|query('select mean(value) / 100.0 as val from "mydb"."autogen"."weather.current.clouds"')
.period(1h)
.every(1h)
.groupBy(time(1m), *)
.fill(0)
var vis = batch
|query('select mean(value) / 10000.0 as val from "mydb"."autogen"."weather.current.visibility"')
.period(1h)
.every(1h)
.groupBy(time(1m), *)
.fill(0)
clouds
|join(vis)
.as('c', 'v')
|eval(lambda: 100 * (1 - "c.val") * "v.val")
.as('pcent')
|influxDBOut()
.cluster('localhost')
.database('mydb')
.retentionPolicy('autogen')
.measurement('testmetric')
.tag('host', 'myhost.local')
.tag('key', 'weather.current.lightidx')
This is what I came up with after hours of trial and (especially) error. As given in the title, when I try to enable my task with kapacitor enable weathertest, I get the error message enabling task weathertest: batch query is not allowed to request data from ""."". Same thing when I try to record as in the "Backfill" example. Also, in that example there are a start and a stop date for limiting the time frame. The time format given there is wrong and is not understood by Kapacitor: instead of e.g. 2015-10-01 I have to put in 2015-10-01T00:00Z to at least get past the time-format error.
In the Kapacitor logs there is not a single line regarding these errors. Only when I try to remove a recording do I get something like remove /var/lib/kapacitor/replay/1f5...750.brpl: no such file or directory, and this can be found in the logs. There are lots of info lines in the logs showing successful POSTs to/from InfluxDB for the _internal database with HTTP response code 204.
Does anyone have an idea what I may be doing wrong?
OK, after the weekend I tried again. Without any change, it now accepted my script in the previously failing steps; moreover, I was now able to find error messages in the log. The node mentioned there was the eval node, and the messages pointed towards a type mismatch. When I changed the line
|eval(lambda: 100 * (1 - "c.val") * "v.val")
to
|eval(lambda: 100.0 * (1.0 - "c.val") * "v.val")
the error messages were gone and the command kapacitor show weathertest now showed rather sane content.
Furthermore, I redefined, recorded, replayed, and deleted the tasks and recordings over and over again during my tests, and I may have forgotten to redefine tasks after making changes to the tick script (not really sure). After changing the above, redefining the task, and replaying it, I finally found the expected data in the InfluxDB instance.

Asynchronous cursor execution in Snowflake

(Submitting on behalf of a Snowflake user)
At the time of query execution on Snowflake, I need its query id. So I am using following code snippet:
cursor.execute(query, _no_results=True)
query_id = cursor.sfqid
cursor.query_result(query_id)
This code snippet works fine for short-running queries, but for a query which takes more than 40-45 seconds to execute, the query_result function fails with KeyError: u'rowtype'.
Stack trace:
File "snowflake/connector/cursor.py", line 631, in query_result
self._init_result_and_meta(data, _use_ijson)
File "snowflake/connector/cursor.py", line 591, in _init_result_and_meta
for column in data[u'rowtype']:
KeyError: u'rowtype'
Why would this error occur? How to solve this problem?
Any recommendations? Thanks!
The Snowflake Python Connector allows for async SQL execution by using cur.execute(sql, _no_results=True).
This "fire and forget" style of SQL execution allows the parent process to continue without waiting for the SQL command to complete (think long-running SQL that may time out).
If this is used, many developers will write code that captures the unique Snowflake Query ID (like you have in your code) and then use that Query ID to "check back on the query status later" in some sort of looping process. When you check back and the query has completed, you can get the results from that query_id using the result_scan() function.
https://docs.snowflake.net/manuals/sql-reference/functions/result_scan.html
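As a rough illustration of that check-back pattern (my own sketch using Snowflake's JDBC driver rather than the Python connector; the account URL, credentials, and query ID are placeholders): once your loop has determined that the query is finished, you can fetch its rows by query ID with result_scan().
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class ResultScanExample {
    public static void main(String[] args) throws Exception {
        String queryId = "01a2b3c4-0000-0000-0000-000000000000"; // the sfqid captured at submit time
        try (Connection con = DriverManager.getConnection(
                 "jdbc:snowflake://myaccount.snowflakecomputing.com/", "user", "password");
             Statement st = con.createStatement();
             // RESULT_SCAN returns the result set of a previously run query by its ID.
             ResultSet rs = st.executeQuery(
                 "SELECT * FROM TABLE(RESULT_SCAN('" + queryId + "'))")) {
            while (rs.next()) {
                System.out.println(rs.getString(1));
            }
        }
    }
}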
I hope this helps...Rich

NO_SQL_DATA error when deleting a record, firedac, delphi 10.3.1

Using SQL Server, Delphi 10.3.1, and FireDAC.
I am using cached updates, with autocommit on.
I keep managing to get my data into a state where the record has been deleted from the database, and I have also deleted that record in the dataset.
Then, when it attempts to commit the change to the database (where the data no longer exists), I get an error:
[my application] raised exception class emssqlNativeException with message [firedac][Phys][odbc][sqlncli11.dll] SQL_NO_DATA
and then I can't clear the cached updates flag on the dataset, because there is stuff 'sitting' there.
My question: how can I get it to NOT return that error? Because it's really not an error; it's trying to delete a record that no longer exists. I am not finding ANY documentation on the update options on a query, so is there a flag there I need to set?
You can handle update errors in OnUpdateError and perform any additional checks before deciding how to proceed. Blindly pretending all deletes worked would be something like:
procedure TForm1.FDQuery1UpdateError(ASender: TDataSet;
  AException: EFDException; ARow: TFDDatSRow;
  ARequest: TFDUpdateRequest; var AAction: TFDErrorAction);
begin
  if ARequest = arDelete then
    AAction := eaApplied;
end;
Read the online help for OnUpdateError for more information.

Java more than one DB connection in UserTransaction

static void clean() throws Exception {
    final UserTransaction tx = InitialContext.doLookup("UserTransaction");
    tx.begin();
    try {
        final DataSource ds = InitialContext.doLookup(Databases.ADMIN);
        Connection connection1 = ds.getConnection();
        Connection connection2 = ds.getConnection();
        PreparedStatement st1 = connection1.prepareStatement("XXX delete records XXX"); // delete data
        PreparedStatement st2 = connection2.prepareStatement("XXX insert records XXX"); // insert new data with the same primary keys as the deleted data above
        st1.executeUpdate();
        st1.close();
        connection1.close();
        st2.executeUpdate();
        st2.close();
        connection2.close();
        tx.commit();
    } finally {
        if (tx.getStatus() == Status.STATUS_ACTIVE) {
            tx.rollback();
        }
    }
}
I have a web app where the DAO takes a DataSource as the object from which to create individual connections to perform database operations.
So I have a UserTransaction, inside which two DAO objects perform separate actions: the first does a deletion and the second an insertion. The deletion removes some records so that the insertion can take place, because the insertion inserts rows with the same primary keys as the deleted data.
I took the DAO layer out and translated the logic into the code above. There is one thing I couldn't understand: based on the code above, the insertion should fail. The code (inside the UserTransaction) takes two different connections that don't know each other, and the deletion obviously hasn't been committed yet, so the second statement (the insertion) should fail due to the unique constraint; since the two database operations are not in the same connection, the second connection should not be able to see uncommitted changes. But amazingly, it doesn't fail, and both statements work perfectly.
Can anyone help explain this? Is there some configuration that achieves this result, or is my understanding wrong?
Since your application is running in WebLogic Server, the Java EE container is managing the transaction and the connection for you. If you call DataSource#getConnection multiple times in a Java EE transaction, you will get multiple Connection instances joining the same transaction. Usually those connections connect to the database with the identical session. Using Oracle you can check that with the following snippet in a @Stateless EJB:
@Resource(lookup = "jdbc/myDS")
private DataSource ds;

@TransactionAttribute(TransactionAttributeType.REQUIRES_NEW)
@Schedule(hour = "*", minute = "*", second = "42")
public void testDatasource() throws SQLException {
    try (Connection con1 = ds.getConnection();
         Connection con2 = ds.getConnection()) {
        String sessId1 = null, sessId2 = null;
        try (ResultSet rs1 = con1.createStatement().executeQuery("select userenv('SESSIONID') from dual")) {
            if (rs1.next()) sessId1 = rs1.getString(1);
        }
        try (ResultSet rs2 = con2.createStatement().executeQuery("select userenv('SESSIONID') from dual")) {
            if (rs2.next()) sessId2 = rs2.getString(1);
        }
        LOG.log(Level.INFO, " con1={0}, con2={1}, sessId1={2}, sessId2={3}",
                new Object[]{con1, con2, sessId1, sessId2});
    }
}
This results in the following log message:
con1=com.sun.gjc.spi.jdbc40.ConnectionWrapper40@19f32aa,
con2=com.sun.gjc.spi.jdbc40.ConnectionWrapper40@1cb42e0,
sessId1=9347407,
sessId2=9347407
Note that you get different Connection instances with the same session ID.
For more details see e.g. this question.
The only way to do this properly is to use a transaction manager and two-phase-commit XA drivers for all databases involved in the transaction.
My guess is that you have autocommit enabled on the connections. This is the default when creating a new connection, as is documented here:
https://docs.oracle.com/javase/tutorial/jdbc/basics/transactions.html
System.out.println(connection1.getAutoCommit());
will most likely print true.
You could try
connection1.setAutoCommit(false);
and see if that changes the behavior.
In addition to that, it's not really defined what happens if you call close() on a connection and haven't issued a commit or rollback statement beforehand. Therefore it is strongly recommended to either issue one of the two before closing the connection, see https://docs.oracle.com/javase/7/docs/api/java/sql/Connection.html#close()
EDIT 1:
If autocommit is false, then it's probably due to the undefined behavior of close(). What happens if you switch the statements?
st2.executeUpdate();
st2.close();
connection2.close();
st1.executeUpdate();
st1.close();
connection1.close();
EDIT 2:
You could also try the "correct" way of doing it:
st1.executeUpdate();
st1.close();
st2.executeUpdate();
st2.close();
tx.commit();
connection1.close();
connection2.close();
If that doesn't fail, then something is wrong with your setup for UserTransactions.
Depending on your database this is quite a normal case.
An object implementing UserTransaction interface represents a "logical transaction". It doesn't always map to a real, "physical" transaction that a database engine respects.
For example, there are situations that cause implicit commits (as well as implicit starts) of transactions. In case of Oracle (can't vouch for other DBs), closing a connection is one of them.
From Oracle's docs:
"If the auto-commit mode is disabled and you close the connection
without explicitly committing or rolling back your last changes, then
an implicit COMMIT operation is run".
But there can be other possible reasons for implicit commits: select for update, various locking statements, DDLs, and so on. They are database-specific.
So, back to our code.
The first transaction is committed by closing a connection.
Then another transaction is implicitly started by the DML on the second connection. It inserts non-conflicting changes, and the second connection.close() commits them without a PK violation. tx.commit() won't even get a chance to commit anything (and how could it? the connections are already closed).
The bottom line: "logical" transaction managers don't always give you the full picture.
Sometimes transactions are started and committed without an explicit reason. And sometimes they are even ignored by a DB.
PS: I assumed you are using Oracle, but the same holds true for other databases as well; see, for example, MySQL's list of implicit commit reasons.
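If you want to see this for yourself, here is a small sketch (my own illustration; the JDBC URL, credentials, and table t are placeholders, and the implicit COMMIT on close is the Oracle behavior quoted above):
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class ImplicitCommitDemo {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:oracle:thin:@//localhost:1521/XEPDB1"; // placeholder
        try (Connection c1 = DriverManager.getConnection(url, "user", "password")) {
            c1.setAutoCommit(false);
            c1.createStatement().executeUpdate("insert into t (id) values (1)");
            // no commit() here: closing the connection triggers Oracle's implicit COMMIT
        }
        try (Connection c2 = DriverManager.getConnection(url, "user", "password");
             Statement st = c2.createStatement();
             ResultSet rs = st.executeQuery("select count(*) from t where id = 1")) {
            rs.next();
            System.out.println("rows visible after close: " + rs.getLong(1)); // prints 1
        }
    }
}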
If auto-commit mode is disabled and you close the connection
without explicitly committing or rolling back your last changes,
then an implicit COMMIT operation is executed.
Please check below link for details:
http://in.relation.to/2005/10/20/pop-quiz-does-connectionclose-result-in-commit-or-rollback/

What does: "Cannot commit when autoCommit is enabled” error mean?

In my program, I've got several threads in a pool, each trying to write to the DB. The number of threads created is dynamic. When only one thread is created, all works fine. However, when there are multiple threads executing, I get the error:
org.apache.ddlutils.DatabaseOperationException: org.postgresql.util.PSQLException: Cannot commit when autoCommit is enabled.
I'm guessing that, since the threads execute in parallel, two threads are trying to write at the same time, which gives this error.
Do you think this is the case? If not, what could be causing this error?
And if what I said is the problem, what can I do to fix it?
In your JDBC code, you should turn off autocommit as soon as you fetch the connection, and commit explicitly when your unit of work is done. Something like this:
DataSource datasource = getDatasource(); // fetch your datasource somehow
Connection c = null;
try {
    c = datasource.getConnection();
    c.setAutoCommit(false);
    // ... execute your statements ...
    c.commit();
} finally {
    if (c != null) c.close();
}
