I'm testing an import script on a shared web host I just got, and I found that transactions are blocked after it has been running for 20 minutes or so. I assume this is to avoid overloading the database, but even when I import only one item per second I still run into the problem. Specifically, when I try to save an object I receive the error:
DatabaseError: current transaction is aborted, commands ignored until end of transaction block
I've tried waiting a few hours after this happens, but the block is still there. The only way to resume importing is to completely restart the import program. Because of this, I reasoned that all I need to do is reconnect to the DB. That might not be true, but it's worth a try.
So my question is this: how can I disconnect and reconnect the DB connection in Django? Is that possible?
Most likely some other database error occurred before this one, but your code ignored it and went forward with the transaction in a broken state. PostgreSQL aborts the whole transaction after the first error, so every subsequent command fails with this message until the transaction is rolled back.
We are developing a SQL database, and the client connects to the server over RPC.
Consider a case with two transactions, Transaction_A (TxnA) and Transaction_B (TxnB). TxnA might be something like UPDATE tbl SET a=a+1 WHERE id=1.
When the client gets a CONN_TIMEOUT exception, the cause might be a server crash or a network problem, so TxnA may or may not have completed its write. Meanwhile TxnB may write to the same table and update the table metadata (for example, advancing the sequence number or revision).
So there seems to be no way for the client to check whether TxnA completed. And the query may not be idempotent, as in the UPDATE a=a+1 example above, so it cannot be retried blindly.
My question: is there any solution to this issue? I'm not sure how other SQL or storage systems handle it. I've tried googling some keywords but couldn't find an answer.
Thank you for any inspiration.
I'm wondering whether there is a way to recognize that an OfflineCommand is being executed, or some internal flag or marker that shows a command has been passed or executed successfully. With an unstable internet connection I have trouble recognizing whether a command went through. I keep retrieving the records from the database and comparing them every time to see whether the command has been applied, but because of the flow of my application I find it very difficult to avoid duplicates. Is there any automatic way to confirm that commands were executed, or some other mechanism I can use?
Second question: on the forms I can use a UITimer to check isOffline() and see whether the internet is connected. Is there something equivalent on the server page, or wherever the queries are written, to detect that the connection has dropped? When control has moved to the queries and the internet disconnects, the dialog opened from the form page stays frozen indefinitely and never ends; I have to close and re-open the app to continue the synchronization process. At the same time I cannot set a timeout on the dialog, because I'm not sure how long the synchronization process will take. Please advise.
This extends the same topic, but I have created a new question just to give more clarity:
executeOfflineCommand skips a command while executing from storage on Android
There is no way to know whether a connection will stay stable, as that would require knowledge of the future. You can work the way transaction services do, where the server side processes an offline command as a transaction using a two-phase-commit style approach.
In this approach you have an algorithm similar to this:
Client sends the command to the server
Server returns a unique ID for the command
Client asks the server to perform the command with that unique ID
Server acknowledges that the command was performed
If the first two stages didn't complete, you just do them again. The worst that can happen is some orphaned commands left on the server.
If the third stage didn't complete, you just do it again. The server knows whether it has already processed the command and will simply acknowledge it if it has.
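To make that concrete, here is a rough sketch of the server side of such a handshake. The class and method names (CommandServer, RegisterCommand, ExecuteCommand) are made up for illustration and are not part of Codename One or any other library:

using System;
using System.Collections.Concurrent;

// Illustrative sketch of the two-step, idempotent command protocol described above.
public class CommandServer
{
    // Commands that have been registered but not yet executed.
    private readonly ConcurrentDictionary<Guid, string> _pending =
        new ConcurrentDictionary<Guid, string>();

    // Commands that have already been executed; a retry is just acknowledged.
    private readonly ConcurrentDictionary<Guid, bool> _completed =
        new ConcurrentDictionary<Guid, bool>();

    // Steps 1-2: the client sends the command, the server stores it and returns a unique ID.
    public Guid RegisterCommand(string command)
    {
        var id = Guid.NewGuid();
        _pending[id] = command;
        return id;
    }

    // Steps 3-4: the client asks the server to perform the ID; the server executes it
    // at most once and acknowledges. Retrying with the same ID is safe.
    public bool ExecuteCommand(Guid id)
    {
        if (_completed.ContainsKey(id))
            return true;                      // already done, just acknowledge

        if (!_pending.TryRemove(id, out var command))
            return false;                     // unknown ID, the client should re-register

        PerformWork(command);                 // the actual (possibly non-idempotent) work
        _completed[id] = true;
        return true;
    }

    private void PerformWork(string command)
    {
        Console.WriteLine("executing: " + command);
    }
}

The client keeps the ID it got back and keeps retrying ExecuteCommand with that same ID until it sees an acknowledgement; repeating either step is harmless.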
I have recently switched to a custom Go runtime on GAE, and noticed many errors like this in the logs:
internal.flushLog: Flush RPC: Call error 3: invalid security ticket: 6c8027dc99b3ed3e
internal.flushLog: Flush RPC: Canceled: (timeout)
The server is still running fine, but I have no idea what that error means or why it happens.
I'm using a custom Go runtime via a Dockerfile, and the App Engine release is 1.9.37.
Any help to clarify the error would be highly appreciated. Thanks.
This is a known issue with the Go runtime on App Engine Flexible. It tends to happen when a line is logged right before the end of a request/response.
What happens is that when the line is logged it is actually put in a list of log lines to be batched together and sent to the application server as an RPC at periodic intervals. The security ticket is canceled at the end of a request/response which sometimes can happen before the log lines have been flushed. It's harmless, except that you may lose a log line or two. :\
We're actively working on fixing it.
We are having an issue when using NHibernate with distributed transactions.
Consider the following snippet:
//
// There is already an ambient distributed transaction
//
using(var scope = new TransactionScope()) {
    using(var session = _sessionFactory.OpenSession())
    using(session.BeginTransaction()) {
        using(var cmd = new SqlCommand(_simpleUpdateQuery, (SqlConnection)session.Connection)) {
            cmd.ExecuteNonQuery();
        }

        session.Save(new SomeEntity());
        session.Transaction.Commit();
    }

    scope.Complete();
}
Sometimes, when the server is under extreme load, we'll see the following:
The query executed with cmd.ExecuteNonQuery is chosen as a deadlock victim (we can see it in SQL Profiler), but no exception is raised.
session.Save fails with the error message, "The operation is not valid for the state of the transaction."
Every time this code is executed after that, session.BeginTransaction fails. The first few times, the inner exception varies (sometimes it is the deadlock exception that should have been raised in step 1). Eventually it stabilizes to "The server failed to resume the transaction. Desc:3800000177." or "New request is not allowed to start because it should come with valid transaction descriptor."
If left alone, the application will eventually (after seconds or minutes) recover from this condition.
Why is the deadlock exception not being reported in step 1? And if we can't resolve that, then how can we prevent our application from temporarily becoming unusable?
The issue has been reproduced in the following environments
Windows 7 x64 and Windows Server 2003 x86
SQL Server 2005 and 2008
.NET 4.0 and 3.5
NHibernate 3.2, 3.1 and 2.1.2
I've created a test fixture which will sometimes reproduce the issue for us. It is available here: http://wikiupload.com/EWJIGAECG9SQDMZ
We've finally narrowed this down to a cause.
When opening a session, if there is an ambient distributed transaction, NHibernate attaches an event handler to the Transaction.TransactionCompleted event, which closes the session when the distributed transaction completes. This appears to be subject to a race condition in which the connection may be closed and returned to the pool before the deadlock error propagates, leaving the pooled connection in an unusable state.
The following code will reproduce the error for us occasionally, even without any load on the server. If there is extreme load on the server, it becomes more consistent.
using(var scope = new TransactionScope()) {
    //
    // Force promotion to distributed transaction
    //
    TransactionInterop.GetTransmitterPropagationToken(Transaction.Current);

    var connection = new SqlConnection(_connectionString);
    connection.Open();

    //
    // Close the connection once the distributed transaction is
    // completed.
    //
    Transaction.Current.TransactionCompleted +=
        (sender, e) => connection.Close();

    using(connection.BeginTransaction())
        //
        // Deadlocks but sometimes does not raise exception
        //
        ForceDeadlockOnConnection(connection);

    scope.Complete();
}
//
// Subsequent attempts to open a connection with the same
// connection string will fail
//
We have not settled on a solution, but the following things will eliminate the problem (while possibly having other consequences):
Turning off connection pooling
Using NHibernate's AdoNetTransactionFactory instead of AdoNetWithDistributedTransactionFactory
Adding error handling that calls SqlConnection.ClearPool() when the "server failed to resume the transaction" error occurs
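A rough sketch of what the third option can look like (illustrative only, not our production code; ClearAllPools is the blunt version of ClearPool for when you no longer have the broken connection instance in hand):

using System;
using System.Data.SqlClient;
using NHibernate;

// Illustrative sketch: when the poisoned connection surfaces as
// "The server failed to resume the transaction", flush the connection pool
// so the next OpenSession gets a fresh physical connection.
public static class PoisonedPoolWorkaround
{
    public static void Execute(ISessionFactory sessionFactory, Action<ISession> work)
    {
        try
        {
            using (var session = sessionFactory.OpenSession())
            using (var tx = session.BeginTransaction())
            {
                work(session);
                tx.Commit();
            }
        }
        catch (Exception ex)
        {
            if (ex.ToString().Contains("failed to resume the transaction"))
            {
                // Blunt instrument: drops every pooled connection, including the broken one.
                SqlConnection.ClearAllPools();
            }
            throw;
        }
    }
}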
According to Microsoft (https://connect.microsoft.com/VisualStudio/feedback/details/722659/), the SqlConnection class is not thread-safe, and that includes closing the connection on a separate thread. Based on this response we have filed a bug report for NHibernate (http://nhibernate.jira.com/browse/NH-3023).
Not a definitive answer, but I suspect you have some problems with session management and that you are using the same session across multiple calls to handlers. I don't think it's actually the connection that is in a bad state, but rather the NHibernate session. This doesn't seem to jibe with the problem going away when connection pooling is turned off, so I may be off base, but I still suspect it has to do with reusing sessions.
The first thing I would suggest is that you try to confirm this by logging the hash code of the session and the hash code of session.GetSessionImplementation() (my understanding of the Castle NHibernate facility is that you will see the same session instance even when it is actually a different session, while the session implementation will actually show a difference). See whether the same hash codes show up while handling different messages.
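Something like this in your handlers would show it (log4net here is just an example; use whatever logger you already have):

using NHibernate;
using log4net;

// Logs the identity of the session wrapper and of the real implementation.
// If the same hash codes show up while handling different messages, the
// session is being reused across handlers.
public static class SessionDiagnostics
{
    private static readonly ILog Log = LogManager.GetLogger(typeof(SessionDiagnostics));

    public static void LogSessionIdentity(ISession session)
    {
        Log.DebugFormat(
            "session hash: {0}, implementation hash: {1}",
            session.GetHashCode(),
            session.GetSessionImplementation().GetHashCode());
    }
}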
If it is a question of session management, try using an NServiceBus module to manage your sessions for your handlers. Here is a post from Andreas about doing that. I don't think his edit about having a built-in way to do this on the trunk made it into the 2.5 release, so you probably want to go ahead with this approach. (I could be wrong about that.)
http://andreasohlund.net/2010/02/03/nhibernate-session-management-in-nservicebus/
This doesn't exactly solve your problem, but you could make your IPreInsertEventListener just send an NSB message, and then have the receiver of the message invoke the stored procedure. I've done that with problematic pre- and post-event listeners while using NHibernate and NSB in the past.
Another thought is to have your pre-event listener create its own connection object wrapped in a using statement; then it won't touch NHibernate's connection. If it deadlocks, just throw and make sure you've disposed of any objects in scope.
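Roughly like this; the connection string and stored procedure name are placeholders, and this is only a sketch of the idea, not code from your project:

using System.Data;
using System.Data.SqlClient;
using NHibernate.Event;

// Sketch of a pre-insert listener that uses its own connection instead of NHibernate's.
public class OwnConnectionPreInsertListener : IPreInsertEventListener
{
    private readonly string _connectionString;

    public OwnConnectionPreInsertListener(string connectionString)
    {
        _connectionString = connectionString;
    }

    public bool OnPreInsert(PreInsertEvent @event)
    {
        using (var connection = new SqlConnection(_connectionString))
        using (var command = new SqlCommand("dbo.SomeProcedure", connection))
        {
            command.CommandType = CommandType.StoredProcedure;
            connection.Open();

            // If this deadlocks, the exception propagates and both the command
            // and the connection are disposed by the using blocks.
            command.ExecuteNonQuery();
        }

        return false;   // false = do not veto the insert
    }
}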
It is an NHibernate issue. NHibernate is not opening and closing the connection on the same thread, which is not supported by ADO.NET. You can work around it by opening and closing the connection yourself. NHibernate will not close the connection unless it has also opened it.
Workaround
var connection = ((SessionFactoryImpl)_sessionFactory).ConnectionProvider.GetConnection();

using(var session = _sessionFactory.OpenSession(connection))
{
    // do database stuff
}

connection.Close();
When importing customizations to CRM 4.0, the import fails with a "generic SQL error" message. Digging a bit deeper, the error is really that a timeout has occurred. The same error occurs when trying to create a new entity.
I increased the timeout as suggested in the link below, but the timeout occurred anyway; it just took longer to happen.
Increasing the timeout:
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\MSCRM\OLEDBTimeout
This value does not exist by default, and if it does not exist the default is 30 seconds. To change it, add the registry value (of type DWORD) and set it to the number of seconds you want. Then you have to recycle the CrmAppPool application pool for the change to take effect.
https://community.dynamics.com/product/crm/crmtechnical/b/crmdavidjennaway/archive/2008/09/04/sql-timeouts-in-crm-generic-sql-error.aspx
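Setting the value programmatically looks something like this (600 seconds is just an example; regedit works equally well, it has to run elevated on the CRM server, and CrmAppPool still needs to be recycled afterwards):

using Microsoft.Win32;

// Creates or updates the OLEDBTimeout DWORD value under the MSCRM key.
Registry.SetValue(
    @"HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\MSCRM",
    "OLEDBTimeout",
    600,
    RegistryValueKind.DWord);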
The SQL profiler displays a set of inserts and updates related to the metadata in CRM, and then a call to the stored procedure exec p_RecreateIndexes
This call is apparently the culprit and never completes in a timely fashion (30+ minutes now and not completed yet). This is an existing test instance of CRM and is quite extensively used and filled with lots of data. Creating new entities has never before taken this long. Just in case, I have run the asyncoperation cleaning scripts from MS. It did not have any visible effect.
Is there any way to find the reason for the delay in this procedure, or some other solution I can try?
Try splitting up your import into chunks. For example, import the first 20 entities, then the second 20 entities, and so on until you've imported all of them. Then publish. Then go back and try importing the entire customizations file at the same time and republish. Following this method exactly has been the only way we've found to import some customization files in particularly stubborn environments.
It sounds like the re-index operation is taking some time - which would be as expected if the data size is large, and the fragmentation is high. It also depends on exactly what that stored procedure does, and how many cores/CPUs you have - and how many SQL Server is allowed to use.
Does the app allow you to defer that operation? You'd be able to run it manually yourself through Management Studio - if that doesn't break the application.
You could be cheeky, and rename that procedure, and replace it with one of your own that does nothing.... and then rename back, and run. Again, it might break something.
Or just keep increasing the timeout until you get past this issue. Some of my re-index jobs on databases generally take hours....
Or contact the vendor if you have support?
If you ran that query through Management Studio, it would complete; doing that would give you the approximate time you'd need to put (temporarily) into the timeout setting.
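If you'd rather measure it from code than from Management Studio, a rough sketch (the connection string and database name are placeholders for your CRM organization database; CommandTimeout = 0 means wait indefinitely):

using System;
using System.Data.SqlClient;
using System.Diagnostics;

// Runs the re-index procedure with no command timeout and reports how long it took,
// which gives a ballpark figure for the OLEDBTimeout registry setting.
class TimeRecreateIndexes
{
    static void Main()
    {
        var stopwatch = Stopwatch.StartNew();

        using (var connection = new SqlConnection("Server=.;Database=Org_MSCRM;Integrated Security=true"))
        using (var command = new SqlCommand("exec p_RecreateIndexes", connection))
        {
            command.CommandTimeout = 0;   // 0 = no timeout
            connection.Open();
            command.ExecuteNonQuery();
        }

        Console.WriteLine("p_RecreateIndexes took {0}", stopwatch.Elapsed);
    }
}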
I just experienced the exact same problem and I was only importing 2 entities.
I found your question when I was googling for p_RecreateIndexes after seeing it in the trace files.
I ended up running exec p_RecreateIndexes in SQL Server. After it completed - about 2 minutes - I reran the Entity import and it worked.