My application executes many queries, and I am sure that all connections are closed properly. pgAdmin shows that many sessions have gone "idle in transaction", and eventually the DB becomes unresponsive. Is there a way to find the query that left a session 'idle in transaction'? Or is there any other tool that can track it? I am using Postgres 8.1.
Edit: A connection pool is used. Also, the 'idle in transaction' state cleared after a couple of minutes. If a connection was left open, how did this get cleared?
Here is what the Postgres documentation says about this state:
idle in transaction (waiting for client inside a BEGIN block), or a
command type name such as SELECT. Also, waiting is attached if the
server process is presently waiting on a lock held by another server
process
I would suggest the following:
enable logging of long-running queries and lock waits using the log_min_duration_statement and log_lock_waits options in the Error Reporting and Logging section of postgresql.conf
check the Lock Management parameters in postgresql.conf, deadlock_timeout in particular
check the Lock Monitoring article on the Postgres Wiki and the pg_locks view in Postgres
This is a clear signal that something about how your application closes transactions and sessions is wrong. The queries themselves work fine. Check your application for unexpected exceptions, failures, and so on. Some applications are pretty buggy, and this is usually a serious problem: orphaned transactions block VACUUM and prevent connections from being reused.
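One common way a session ends up idle in transaction is an exception that skips the commit or rollback. Here is a minimal Python sketch of a guard against that. It uses the standard-library sqlite3 module purely as a stand-in for a Postgres driver, and the `jobs` table and the `run_in_transaction` and `demo` helpers are hypothetical names made up for illustration.

```python
import sqlite3
from contextlib import contextmanager


@contextmanager
def run_in_transaction(conn):
    """Commit on success, roll back on any error, so the session never
    stays inside an open transaction after the block exits."""
    try:
        yield conn
        conn.commit()
    except Exception:
        conn.rollback()
        raise


def demo():
    # sqlite3 stands in for the real driver; the pattern is the point.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE jobs (id INTEGER PRIMARY KEY, name TEXT)")
    conn.commit()

    try:
        with run_in_transaction(conn):
            conn.execute("INSERT INTO jobs (name) VALUES ('a')")
            raise RuntimeError("simulated application failure")
    except RuntimeError:
        pass

    # The failed unit of work was rolled back and no transaction is open.
    still_open = conn.in_transaction
    rows = conn.execute("SELECT COUNT(*) FROM jobs").fetchone()[0]
    conn.close()
    return still_open, rows
```

If every unit of work in the application goes through a wrapper like this, an unexpected exception can no longer leave the connection sitting inside a BEGIN block when it goes back to the pool.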
I am using a SQL Server database with Node.js. I am using a connection pool to perform various queries. When I run sp_who2, I can see that there are almost 20 processes with status "sleeping" and command "AWAITING COMMAND".
Should I go ahead and delete these processes? I read in some other post that this happens when you create a transaction in SQL Server but do not close / commit / rollback that transaction. I do not see any point in my application where I did not commit or rollback transaction on error. So I am not sure where the error came from.
I have a feeling that leaving those processes there is going to cause query timeout issues in the future. Is there a way to see what query caused the sleeping but waiting state?
I normally see many sleeping connections. I consider it normal. If you have sleeping connections with open transactions and locks, then you need to investigate. I would try to identify the host and PID holding the lock. In some cases the resolution is a polite talk with the person responsible for not closing their transaction.
A connection pool is a pool of connections to SQL Server. They will be idle and sleeping unless they are in use. Generally, there is a timeout for the connections in the pool. (For example, if you look at the ODBC control panel, the connection pooling tab will generally show a 60 second timeout. It might also always keep a minimum number of idle connections.) Check if you have a minimum number of idle connections. Once you know your timeout, verify that the connections are timing out as expected...eventually. If not, I would look for a connection leak or a connection pool issue. Is the application releasing the connection when done? Does GC have to run before the connection goes away?
Years ago there was an issue where a connection could go back into the pool with an open transaction. It was not until the connection was being prepared for reuse that it was finally reset. This issue has been fixed.
Another past issue was a broken connection. For example, if the SQL Server was rebooted, all idle connections are broken. However, it was not until the connection was requested that this was checked. A connection failure timeout was required for each connection in the pool before it was replaced. This was a PITA.
We are running a website on a VPS with SQL Server 2008 R2 x64. We are being bombarded with 17886 errors, namely:
The server will drop the connection, because the client driver has
sent multiple requests while the session is in single-user mode. This
error occurs when a client sends a request to reset the connection
while there are batches still running in the session, or when the
client sends a request while the session is resetting a connection.
Please contact the client driver vendor.
This causes sql statements to return corrupt results. I have tried pretty much all of the suggestions I have found on the net, including:
with mars, and without.
with pooling and without
with async=true and without
we only have one database and it is absolutely multi-user.
Everything has been installed recently, so it is up to date. The errors may be correlated with high CPU (though not exclusively, according to the monitors I have seen). They are also correlated with high request rates from search engines. However, high CPU or request rates shouldn't cause SQL connections to reset. At worst we should see high response times or IIS refusing to send a response.
Any suggestions? I am only a developer, not a DBA. Do I need a DBA to solve this problem?
Not sure but some of your queries might cause deadlocks on the server.
The next time you detect this error:
Open Management Studio (on the server, install it if necessary)
Open a new query window
Run sp_who2
Check the BlkBy column, which is short for Blocked By. If there is any data in that column, one session is blocking another, and you may have a deadlock problem (normally it should be completely empty).
If you have a deadlock then we can continue with next steps. But right now please check that.
To fix the error above, "MultipleActiveResultSets=True" needs to be added to the connection string.
via Event ID 17886 MSSQLServer – The server will drop the connection
I would create an event log task to email you whenever error 17886 is thrown. Then go immediately to the DB and execute sp_who2, get the BlkBy SPID, and run DBCC INPUTBUFFER on it (62 below is just an example SPID). Hopefully the EventInfo column will give you something a bit more tangible to go on.
EXEC sp_who2
GO
DBCC INPUTBUFFER(62)
GO
Use an "Instance Per Request" strategy in your DI-instantiation code and your problem will be solved.
Most probably you are using dependency injection. During web development you have to take into account the possibility of concurrent requests. Therefore you have to make sure every request gets new instances during DI, otherwise you will run into concurrency issues. Don't be cheap by using ".SingleInstance" for services and contexts.
Enabling MARS will probably decrease the number of errors, but the errors that are encountered will be less clear. Enabling MARS is almost never the solution; do not use it unless you know what you're doing.
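To make the lifetime point concrete, here is a small Python sketch. It is not the DI container from the answer, only an illustration of the idea; `DbContext` and `handle_request` are hypothetical names. A single shared instance leaks state from one request into the next, while a fresh instance per request does not.

```python
class DbContext:
    """Hypothetical stand-in for a per-connection context or service."""

    def __init__(self):
        self.pending = []  # state that must not leak between requests


def handle_request(context, item):
    """Pretend request handler: records work on the context it was given."""
    context.pending.append(item)
    return list(context.pending)


# Single instance: every request sees what earlier requests left behind.
shared = DbContext()
single_instance_results = [handle_request(shared, i) for i in ("a", "b")]

# Instance per request: each request gets a fresh context.
per_request_results = [handle_request(DbContext(), i) for i in ("a", "b")]
```

Here the second request on the shared instance sees both "a" and "b", whereas with one instance per request it only sees its own "b". With truly concurrent requests the shared case gets worse, because two requests can touch the same connection at the same moment.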
Can JDBC connections that were closed because the database became unavailable be recovered?
To give some background, I get the following errors in sequence. It doesn't look like a manual restart. The reason for my question is that I am told the app behaved correctly without the restart. So if the connection was lost, can it be recovered after a DB restart?
java.sql.SQLException: ORA-12537: TNS:connection closed
java.sql.SQLRecoverableException: ORA-01034: ORACLE not available
ORA-27101: shared memory realm does not exist
IBM AIX RISC System/6000 Error: 2: No such file or directory
java.sql.SQLRecoverableException: ORA-01033: ORACLE initialization or shutdown in progress
No. The connection is "dead". Create a new connection.
A good approach is to use a connection pool, which will test if the connection is still OK before giving it to you, and automatically create a new connection if needed.
There are several open source connection pools to use. I've used Apache's DBCP, and it worked for me.
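The "test before giving it to you" behaviour can be sketched in a few lines of Python. This is a toy, not how DBCP or any real pool is implemented; `ValidatingPool`, `connect` and `is_alive` are hypothetical names, and the last two are callables you would supply yourself.

```python
from collections import deque


class ValidatingPool:
    """Toy pool: tests each idle connection before handing it out and
    replaces dead ones. `connect` and `is_alive` are caller-supplied."""

    def __init__(self, connect, is_alive):
        self._connect = connect
        self._is_alive = is_alive
        self._idle = deque()

    def acquire(self):
        while self._idle:
            conn = self._idle.popleft()
            if self._is_alive(conn):
                return conn
            # Dead connection (e.g. the database restarted): discard it.
        return self._connect()

    def release(self, conn):
        self._idle.append(conn)
```

The caller never sees a dead connection: after a database restart, every stale entry fails the check on its next borrow and is silently replaced by a new one.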
Edited:
Given that you want to wait until the database comes back up if it's down (interesting idea), you could implement a custom version of getConnection() that "waits a while and tries again" if the database doesn't respond.
p.s. I like this idea!
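A "wait a while and try again" wrapper might look roughly like this in Python. It is only a sketch of the idea: `get_connection_with_retry` is a hypothetical helper, `connect` is whatever function opens a connection in your code, and it is assumed to raise ConnectionError while the database is down.

```python
import time


def get_connection_with_retry(connect, attempts=5, delay_seconds=2.0,
                              sleep=time.sleep):
    """Call `connect` until it succeeds or `attempts` is exhausted,
    sleeping `delay_seconds` between tries. Re-raises the last error."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return connect()
        except ConnectionError as exc:
            last_error = exc
            if attempt < attempts:
                sleep(delay_seconds)
    raise last_error
```

Capping the number of attempts matters: an unbounded loop would tie up request threads for as long as the database stays down.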
The connection cannot be recovered. What can be done is to fail the connection over to another database instance. RAC and Data Guard installations support this configuration.
This is no problem for read-only transactions. However, for transactions that execute DML it can be a problem, especially if the last call to the DB was a commit. In that case the client cannot tell whether the commit completed: did the DB fail before executing the commit, or after executing it but before sending the acknowledgment back to the client? Only the application has the logic to do the right thing. If the application does not verify the state of the last transaction after failing over, duplicate transactions are possible. This is a known problem, and most of us have experienced it when buying tickets or making similar web transactions.
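One way an application can "verify the state of the last transaction" is to tag each submission with its own request id and look that id up before retrying. The Python sketch below simulates a commit whose acknowledgment is lost. `FakeDatabase`, `submit_once` and the request id scheme are all hypothetical and only illustrate the check; they are not an Oracle or JDBC feature.

```python
class FakeDatabase:
    """In-memory stand-in for the real database (hypothetical)."""

    def __init__(self):
        self.orders = []            # list of (request_id, payload)
        self.drop_next_ack = False  # simulate losing the commit reply

    def commit_order(self, request_id, payload):
        self.orders.append((request_id, payload))  # the commit happens...
        if self.drop_next_ack:
            self.drop_next_ack = False
            # ...but the client never hears about it.
            raise ConnectionError("connection lost before acknowledgment")

    def has_order(self, request_id):
        return any(rid == request_id for rid, _ in self.orders)


def submit_once(db, request_id, payload):
    """Submit an order; on a lost connection, check whether the commit
    actually landed before retrying, so it is never applied twice."""
    try:
        db.commit_order(request_id, payload)
    except ConnectionError:
        if not db.has_order(request_id):
            db.commit_order(request_id, payload)
```

A blind retry in the except branch would have stored the order twice, which is exactly the duplicate-ticket case described above.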
In my development environment, I seek to recreate a production issue we
face with MSSQL 2005. This issue has two parts:
The Problem
1) A deadlock occurs and MSSQL selects one connection ("Connection X") as the 'victim'.
2) All subsequent attempts to use "Connection X" fail (we use connection pooling). MSSQL says "The server failed to resume the transaction"
Of the two, #2 is more serious: since "connection X" is whacked, every "round robin" attempt to re-use "connection X" fails, and mysterious "random" errors appear to the user. We must restart the server.
Why I Write
At this point, however, I wish to recreate problem #1. I can create a
deadlock easily.
But here's my issue: whereas in production, MSSQL chooses one
connection (SPID) as the 'deadlock victim', in my test environment, the deadlock just hangs...and hangs and hangs. Forever? I'm not sure, but I left it hanging overnight and it still hung in the morning.
So here's the question: how can I make sql server "choose a deadlock victim" when a deadlock occurs?
Attempts so Far
I tried setting the "lock_timeout" parameter via the jdbc url ("lockTimeout=5000"), however I got a different message than in production (in test,"Lock request time out period exceeded." instead of in production "Transaction (Process ID 59) was deadlocked on lock resources with another process and has been chosen as the deadlock victim.")
Some details on problem #2
I've researched this "unable to resume the transaction" problem and found a
few things:
bad exception handling may cause this problem. E.g.: the java code does
not close the Statement/PreparedStatement and the driver's implementation
of "Connection" is stuck with a bad/stale/old "transaction ID"
A jdbc driver upgrade may make the problem go away.
For now, however, I just want to recreate a deadlock and make sql server
"choose a deadlock victim".
thanks in advance!
Appendix A. Technical Environment
Development:
sql server 2005 SP3 (9.00.4035.00)
driver: sqljdbc.jar version 1.0
Jboss 3.2.6
jdbc url: jdbc:sqlserver://<>;
Production:
sql server 2005 SP2 (9.00.3042.00)
driver: sqljdbc.jar version 1.0
Jboss 3.2.6
jdbc url: jdbc:sqlserver://<>;
Appendix B. Steps to force a deadlock
get connection A
get connection B
run sql1 with connection A
run sql2 with connection B
run sql1 with connection B
run sql2 with connection A
where
sql1:
update member set name = name + 'x' WHERE member_id = 71
sql2:
update member set name = name + 'x' WHERE member_id = 72
The explanation of why the JDBC connection enters the incorrect state is given here: The server failed to resume the transaction... Why?. You should upgrade to JDBC SQL driver v2.0 before anything else. The link also contains advice on how to fix the application processing to avoid this situation, most importantly about avoiding mixing the JDBC transaction API with native Transact-SQL transactions.
As for the deadlock repro: you did not recreate a deadlock in test. You just blocked waiting for a transaction to commit. A deadlock is a different thing and SQL Server will choose a victim, you do not have to set deadlock priority, lock timeouts or anything. Deadlock priorities are a completely different topic and are used to choose the victim in certain scenarios like high priority vs. low priority overnight batch processing.
Any deadlock investigation should start with understanding the deadlock, if you want to eliminate it. The Deadlock Graph Event Class in Profiler is the perfect starting point. With the deadlock graph info you can see which resources the deadlock is occurring on and which statements are involved. Most of the time the solution is either to fix the order of updates in the application (always follow the same order) or to fix the access path (i.e. add an index).
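The "always follow the same order" fix can be illustrated outside the database with plain Python threads and locks. This is an analogy under stated assumptions, not SQL Server's locking: `row_locks`, `names`, `update_rows` and `run_both` are hypothetical, with one lock standing in for each of rows 71 and 72.

```python
import threading

row_locks = {71: threading.Lock(), 72: threading.Lock()}
names = {71: "a", 72: "b"}


def update_rows(member_ids):
    """Lock rows in ascending id order regardless of the order the caller
    listed them, so two workers can never wait on each other in a cycle."""
    ordered = sorted(set(member_ids))
    for member_id in ordered:
        row_locks[member_id].acquire()
    try:
        for member_id in member_ids:
            names[member_id] += "x"
    finally:
        for member_id in reversed(ordered):
            row_locks[member_id].release()


def run_both():
    # The two workers ask for the rows in opposite (criss-cross) order.
    workers = [
        threading.Thread(target=update_rows, args=([71, 72],)),
        threading.Thread(target=update_rows, args=([72, 71],)),
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join(timeout=5)
    return all(not w.is_alive() for w in workers)
```

Because both workers take the lock for 71 before the lock for 72, whichever gets 71 first can always finish, and the other simply waits its turn.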
Update
The UPDATE .. WHERE key IN (SELECT ...) is usually deadlocking because the operation is not atomic. Multiple threads can return the same IN list because the SELECT part does not lock anything. This is just a guess; to properly validate it you must look at the deadlock info.
To validate your hand-made test for deadlocks, you should check that the blocking SPIDs form a loop. Look at SELECT session_id, blocking_session_id FROM sys.dm_exec_requests WHERE blocking_session_id <> 0. If the result contains a loop (e.g. A blocked by B and B blocked by A) and the server does not trigger a deadlock, that's a bug. However, what you will find is that the blocking list does not form a loop; it will be something like A blocked by B, B blocked by C, and C not in the list, which means you have done something wrong in the repro test.
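Checking the blocking list for a loop by eye is error-prone once there are more than a few sessions, so here is a small Python sketch of the check. It assumes you have already copied the session_id / blocking_session_id pairs from the query above into a dict, with at most one blocker per session; `find_blocking_cycle` is a hypothetical helper.

```python
def find_blocking_cycle(blocked_by):
    """Given {session_id: blocking_session_id}, return the list of sessions
    forming a loop, or None if every blocking chain ends at a session that
    is not itself blocked."""
    for start in blocked_by:
        path = []
        seen = set()
        current = start
        while current in blocked_by and current not in seen:
            seen.add(current)
            path.append(current)
            current = blocked_by[current]
        if current in seen:
            # Walked back onto the path: everything from `current` on is the loop.
            return path[path.index(current):]
    return None
```

For example, `find_blocking_cycle({51: 52, 52: 51})` returns `[51, 52]` (a real deadlock shape), while `find_blocking_cycle({51: 52, 52: 53})` returns `None`, which is the plain blocking chain described above with session 53 at the head.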
You can specify a deadlock priority for the session using
SET DEADLOCK_PRIORITY LOW | NORMAL | HIGH
See this MSDN link for details.
You can also use the following command to view the open transactions
DBCC OPENTRAN (db_name)
This command may help you identify what is causing the deadlock. See MSDN for more info.
What are the queries being run? What is actually causing the deadlock?
You say you have two connections A and B. A runs sql1 then sql2, while B runs sql2 then sql1. So, what is the work (queries) being done? More importantly, where are the transactions? What isolation level are you using? What opens/closes the transactions? (Yes, this leads to questioning the exception processing used by your drivers--if they don't detect and properly process a returned "it didn't work" message, then you absolutely need to take them out back and shoot them--bullets or penicillin, your call.)
Understanding the explicit details underlying the deadlock will allow you to recreate it. I'd first try to recreate it "below" your application -- that is, open up two windows in SSMS, and recreate the application's actions step by step, by hand if/as necessary. Once you can do this, step back and replicate that in your application--all on your development servers, of course!
(A thought--are your Dev databases copies of your Production DBs? If Dev DBs are orders of magnitude smaller than Prod ones, your queries may be the same but what SQL does "under the hood" will be vastly different.)
A last thought: SQL Server will detect and process deadlocks automatically (I really don't think you can disable this). If yours are hanging overnight, then I don't think you have a deadlock, but rather just a conventional locking/blocking issue.
[Posting this now -- going to look something up, will check back later.]
[Later]
Interesting--SQL Server 2005 compact edition does not detect deadlocks, it only does timeouts. You're not using that in Dev, are you?
I see no way to "turn off" or otherwise control the deadlock timeout period. I hit and messed with deadlocks just last week, and some arbitrary testing then indicated that deadlocks are detected and resolved in (for our dev server) under 5 seconds. It truly seems like you don't have deadlocks on your Dev machine, just blocking. But realize that this stuff is hard for "armchair DBAs" to analyze; you'd really need to sit down and do some serious analysis of what's going on within the system when this problem is occurring.
[ This is a response to the answers. The UI does not allow longer 'comments' on answers]
What are the queries being run? What is actually causing the deadlock?
In my test environment, I ran very simple queries:
sql1:
UPDATE principal SET name = name + '.' WHERE principal_id = 71
sql2:
UPDATE principal SET name = name + '.' WHERE principal_id = 72
Then executed them in chiastic/criss-cross order, i.e. w/o any commits.
connectionA: sql1
connectionB: sql2
connectionB: sql1
connectionA: sql2
This to me seems like a basic example of a deadlock. If this is a "mere lock", however, and not a deadlock, please disabuse me of this notion.
In production, our 'problematic query' ("prodbad") looked like this:
UPDATE post SET lock_flag = ?
WHERE thread_id IN (SELECT thread_id FROM POST WHERE post_id = ?)
Note a few things:
1) This "prod problem query" actually works. AFAIK it had a
deadlock this one time
2) I suspect that the problem lies in page locking, i.e. pessimistic locking due to reads elsewhere in the transaction
3) I do not know what sql this transaction executed prior to this query.
4) This query is an example of "I can do that in one SQL statement" processing, which, while it seems clever to the programmer, ultimately causes much more IO than running two queries:
queryM: SELECT thread_id FROM POST WHERE post_id = ?
queryN: UPDATE post SET lock_flag = ? WHERE thread_id = <>
> (A thought--are your Dev databases copies of your Production DBs? If Dev DBs are orders of magnitude smaller than Prod ones, your queries may be the same but what SQL does "under the hood" will be vastly different.)
In this case the prod and dev DBs differ. The prod server had tons of data; the dev DB had little. The queries behaved very differently. All I wanted to do was recreate a deadlock.
> The server failed to resume the transaction... Why?. You should upgrade to JDBC SQL driver v2.0 before anything else.
Thanks. We plan on this change. Switching drivers introduces a little bit of risk, so we'll need to run some tests.
To recap:
I had the "bright idea" to force a simple deadlock and see if my connection was "whacked/hosed/borked/etc." The deadlock, however, behaved differently than in production.
I'm trying to perform some offline maintenance (dev database restore from live backup) on my dev database, but the 'Take Offline' command via SQL Server Management Studio is performing extremely slowly: on the order of 30 minutes plus now. I am just about at my wits' end and I can't seem to find any references online as to what might be causing the speed problem, or how to fix it.
Some sites have suggested that open connections to the database cause this slowdown, but the only application that uses this database is my dev machine's IIS instance, and the service is stopped - there are no more open connections.
What could be causing this slowdown, and what can I do to speed it up?
After some additional searching (new search terms inspired by gbn's answer and u07ch's comment on KMike's answer) I found this, which completed successfully in 2 seconds:
ALTER DATABASE <dbname> SET OFFLINE WITH ROLLBACK IMMEDIATE
(Update)
If this still fails with the following error, you can fix it as described in this blog post:
ALTER DATABASE failed because a lock could not be placed on database 'dbname' Try again later.
You can run the following command to find out who is holding a lock on your database:
EXEC sp_who2
And use whatever SPID you find in the following command:
KILL <SPID>
Then run the ALTER DATABASE command again. It should now work.
There is most likely a connection to the DB from somewhere (a rare example: an asynchronous statistics update).
To find connections, use sys.sysprocesses
USE master
SELECT * FROM sys.sysprocesses WHERE dbid = DB_ID('MyDB')
To force disconnections, use ROLLBACK IMMEDIATE
USE master
ALTER DATABASE MyDB SET SINGLE_USER WITH ROLLBACK IMMEDIATE
Do you have any open SQL Server Management Studio windows that are connected to this DB?
Put it in single user mode, and then try again.
In my case, after waiting so much for it to finish I had no patience and simply closed management studio. Before exiting, it showed the success message, db is offline. The files were available to rename.
Execute the stored procedure
sp_who2
This will let you see whether there are any blocking locks. Killing the sessions that hold them should fix it.
In SSMS: right-click on the SQL Server icon, Activity Monitor. Open Processes. Find the connected processes. Right-click on the process, Kill.
In my case I had looked at some tables in the DB prior to executing this action. My user account was holding an active connection to this DB in SSMS. Once I disconnected from the server in SSMS (leaving the 'Take database offline' dialog box open) the operation succeeded.
Anytime you run into this type of thing you should think of your transaction log. The ALTER DATABASE statement with ROLLBACK IMMEDIATE indicates this to be the case. Check this out: http://msdn.microsoft.com/en-us/library/ms189085.aspx
Bone up on checkpoints, etc. You need to decide if the transactions in your log are worth saving or not and then pick the mode to run your db in accordingly. There's really no reason for you to have to wait but also no reason for you to lose data either - you can have both.
Closing the instance of SSMS (SQL Server Management Studio) from which the request was made solved the problem for me.
To get around this I stopped the website that was connected to the db in IIS and immediately the 'frozen' 'take db offline' panel became unfrozen.
Also, close any query windows you may have open that are connected to the database in question ;)
I tried all the suggestions below and nothing worked.
1. EXEC sp_who
KILL <SPID>
2. ALTER DATABASE <dbname> SET SINGLE_USER WITH ROLLBACK IMMEDIATE
3. ALTER DATABASE <dbname> SET OFFLINE WITH ROLLBACK IMMEDIATE
Result: Both of the above commands were also stuck.
4. Right-click the database -> Properties -> Options
Set Database Read-Only to True
Click 'Yes' at the dialog warning SQL Server will close all connections to the database.
Result: The window was stuck on executing.
As a last resort, I restarted the SQL Server service from Configuration Manager and then ran ALTER DATABASE <dbname> SET OFFLINE WITH ROLLBACK IMMEDIATE. It worked like a charm.
In SSMS, set the database to read-only then back. The connections will be closed, which frees up the locks.
In my case there was a website that had open connections to the database. This method was easy enough:
Right-click the database -> Properties -> Options
Set Database Read-Only to True
Click 'Yes' at the dialog warning SQL Server will close all connections to the database.
Re-open Options and turn read-only back off
Now try renaming the database or taking it offline.
For me, I just had to go into the Job Activity Monitor and stop two things that were processing. Then it went offline immediately. In my case though I knew what those 2 processes were and that it was ok to stop them.
In my case, the database was related to an old Sharepoint install. Stopping and disabling related services in the server manager "unhung" the take offline action, which had been running for 40 minutes, and it completed immediately.
You may wish to check if any services are currently utilizing the database.
Next time, from the Take Offline dialog, remember to check the 'Drop All Active Connections' checkbox. I was also on SQL_EXPRESS on local machine with no connections, but this slowdown happened for me unless I checked that checkbox.
SSMS, especially if running it from your own desktop remotely and not directly within the database server, can be a reason for the long delays in detaching a database. For some reason SSMS may not be able to disconnect any existing "connections" to the database.
We found the process was almost instant when we did it directly from the database server itself. And in fact it killed the attempt from my own desktop SSMS session, and it "took over" and detached the database.
Nothing else suggested here worked.
Thanks
In my case I stopped the Tomcat server, and then the DB went offline immediately.