I'm getting the error "cannot execute unlisten during recovery" when I use Pooling=True in my connection string.
This error occurs on a replicated read server which is running as a hot standby.
This is still happening with version 3.2.7. It should be possible to run SELECT queries on a hot standby database, but when doing this with Npgsql, we get
PostgresException 25006: cannot execute UNLISTEN during recovery
at Npgsql.NpgsqlConnector.<DoReadMessage>d__157.MoveNext()
A look at the source confirms that, when pooling is enabled, Npgsql cleans up each connection as it is closed and returned to the pool. One of the cleanup operations is UNLISTEN *, which fails on hot standby databases, since it modifies state.
Fortunately, there are connection string parameters we can set to avoid this. As the original question implies, you can disable connection pooling. However, in situations where performance is important, it is better to add No Reset On Close=true; to the connection string instead.
Using No Reset On Close does carry some risk of leaking state from one command to the next, but since you're in read-only mode, it can't affect your stored data. Be careful to dispose of cursors, sequences and temporary tables if you're using them. On the bright side, it may give your queries a slight speed boost.
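For example, a connection string along these lines (the host, database and credentials are placeholders) keeps pooling enabled while skipping the reset:
Host=standby.example.com;Database=mydb;Username=reader;Password=secret;Pooling=true;No Reset On Close=true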
This seems to be documented in the PostgreSQL docs: a hot standby can't do LISTEN/UNLISTEN/NOTIFY.
I'm using JPA to connect to an SQL server across a WAN. I've been unable to find information on what happens when I begin a JPA transaction that involves writes to the remote DB, but the WAN connection goes down before or during commit.
In each transaction, I'm transmitting a header and several hundred detail lines.
Does the far-end database know enough to discard all the changes?
Obviously, requesting a rollback on the local application isn't going to have any effect since the WAN link is down.
I presume:
By "the connection goes down", I mean, that the dbms-client-driver signals to your application code, that the connection has been lost.
you have just one DBMS, which means no two-phase commit.
Then:
It does not matter whether you are using SQL Server via WAN or LAN. Either the transaction is done completely, or not at all.
That is the nature of transactions.
So if the connection goes down before the commit, the server will roll back everything. There is no way to reconnect at the application level to complete the transaction.
If the connection goes down during the commit, then depending on the implementation and on the exact point in time, the transaction might be persisted completely or rolled back completely.
You can be absolutely sure that everything is persisted as intended as soon as the commit returns successfully to your code.
Beware that "the connection goes down" might only be signalled after a timeout, which can be quite long (several minutes). During that time the transaction keeps all its locks and might slow down the whole system. These timeouts may need to be set to longer intervals if you are communicating over a slower network.
In the admin area of our company's production site, we have a little query-dumping tool, and while trying to get data from a database other than the main one, I unknowingly used the USE database command.
And here's the kicker: it then made every ColdFusion page with a query instantly fail, since it somehow caches that USE database command.
Has anyone else heard of this weird bug?
How can we stop this behavior?
If I use a "USE database" command, I want it to apply only to the query I am currently running; after I am done, it should go back to the normal database.
This is weird and a potentially damaging problem.
Any thoughts?
I imagine that this has something to do with connection pooling. When you call close, it doesn't close the connection; it just puts it back into the pool. When you call open, it doesn't have to open a new connection; it just grabs an existing one from the pool. If you change the database that the connection is pointing to, ColdFusion may be unaware of this. This is why some platforms (MySQL on .NET, for instance) reset the connection each time you retrieve it from the pool, to ensure that you are querying the correct database and that you don't have temporary tables and other session info hanging around. The downside of this kind of behaviour is that it has to make a round trip to the database even when using pooled connections, which really may not be necessary.
Kibbee is on the right track, but to extend that a little further with three possible workarounds:
Create a different DSN for use by that one query, so the "USE DATABASE" statement would only persist for queries using that DSN.
Uncheck "Maintain connections across client requests" in the CF admin.
Always remember to reset the database to the one you intend to use at the end of the request (a sketch follows below). It kinda goes without saying that this is a very dangerous utility to have on your production server!
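As a rough illustration of that third workaround (the database and table names here are made up), the ad-hoc SQL sent by the tool would end by switching back before the pooled connection gets reused:
USE ReportingDb;   -- hypothetical secondary database the tool needs
SELECT * FROM SomeTable;
USE MainDb;        -- hypothetical: switch back to the normal database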
It's not a bug nor is it really unexpected behavior - if the query is cached, then everything inside the cfquery block is going along for the ride. Which database platform are you using?
I've got a scenario where a user sometimes selects the right parameters and makes a query which takes several minutes or more to execute. I cannot prevent him from selecting such a combination of parameters (it's quite legal), so I'd like to set a timeout on the query.
Note that I really want to stop the query execution itself and roll back any transactions, because otherwise it hogs most of the server's resources. Add an impatient user who restarts the application and tries the combination again, and you've got a recipe for disaster (read: SQL Server DoS).
Can this be done and how?
As far as I know, apart from setting the command or connection timeouts in the client, there is no way to change timeouts on a query-by-query basis on the server.
You can indeed change the default of 600 seconds using sp_configure, but that setting is server-scoped.
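Assuming the 600-second default referred to is the "remote query timeout" option (that is my assumption, not something stated above), changing it would look roughly like this, and it applies to the whole server:
EXEC sp_configure 'remote query timeout', 300;  -- hypothetical new value, in seconds
RECONFIGURE;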
Humm!
Did you try LOCK_TIMEOUT?
Note down what it was originally before running the query.
Set it for your query.
After running your query, set it back to the original value.
SET LOCK_TIMEOUT 1800;
SELECT @@LOCK_TIMEOUT AS [Lock Timeout];
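A minimal sketch of that note-set-restore pattern (the timeout is in milliseconds, and -1, the default, means wait indefinitely):
-- note the original value first
SELECT @@LOCK_TIMEOUT AS [Original Lock Timeout];
-- fail any lock wait longer than 1.8 seconds for the expensive query
SET LOCK_TIMEOUT 1800;
-- ... run the long query here ...
-- restore the original value (assumed here to be the default of -1)
SET LOCK_TIMEOUT -1;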
I might suggest 2 things.
1) If your query takes a lot of time because it's using several tables that might involve locks, a quite fast solution is to run your queries with the NOLOCK hint.
Simply write Select * from YourTable WITH (NOLOCK) in all your table references, and that will prevent your query from blocking on concurrent transactions.
2) If you want to be sure that all of your queries run in (let's say) less than 5 seconds, then you could add what @talha proposed, which worked sweetly for me.
Just add at the top of your execution
SET LOCK_TIMEOUT 5000; --5 seconds.
That will cause your query to fail if it spends more than 5 seconds waiting on a lock. You should then catch the exception and roll back if needed.
Hope it helps.
In Management Studio you can set the query timeout in seconds.
Menu Tools => Options, set the execution time-out field, and then OK.
It sounds like more of an architectural issue, and any timeout/disconnect you can do would be more or less a band-aid. This has to be solved on the SQL Server side, by way of a read-only replica, transaction log shipping (to give you a read-only server to connect to), replication and such. Basically you give heavy reads a DMZ SQL Server they can go to without killing anything. This is very common. A well-designed SQL system won't be taken down by a DoS like this - that'd be like a car that dies if you step on the gas.
That said, if you are at liberty to change the code, you could guesstimate whether the query is too heavy and either reject it or return only X rows from your stored procedure. If you are tied to some reporting tool and can't control the SELECT it generates, you could point it at a view and put the safety valve in the view.
Also, if up-to-the-minute freshness isn't critical and you can compromise on that (monthly sales data, say), then compiling the results of the complex joins into a physical table via a scheduled job might do the trick - that way everything would be sub-second per query.
It entirely depends on what you are doing, but there is always a solution. Sometimes it takes extra coding to optimize it, sometimes it takes extra money to get you the secondary read-only DB, sometimes it needs time and attention in index tuning.
So it entirely depends, but I'd start with "what can I compromise? what can I change?" and go from there.
You can set Execution time-out in seconds.
If you have just one query, I don't know how to set a timeout at the T-SQL level.
However, if you have a few queries (e.g. collecting data into temporary tables) inside a stored procedure, you can control the execution time yourself with GETDATE(), DATEDIFF() and a few INT variables storing the elapsed time of each part.
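A minimal sketch of that idea (table and column names are made up); it checks the elapsed time between steps and bails out once a time budget is exceeded:
DECLARE @start DATETIME, @budget_ms INT;
SET @start = GETDATE();
SET @budget_ms = 5000;   -- 5-second budget for the whole procedure
-- step 1: collect intermediate rows (hypothetical table)
SELECT OrderId, Amount
INTO #intermediate
FROM dbo.BigDetailTable
WHERE Amount > 100;
-- bail out before the next expensive step if the budget is already spent
IF DATEDIFF(MILLISECOND, @start, GETDATE()) > @budget_ms
BEGIN
    RAISERROR('Time budget exceeded, aborting the report.', 16, 1);
    RETURN;
END
-- step 2: further joins/aggregation against #intermediate would go here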
You can specify the connection timeout within the SQL connection string, when you connect to the database, like so:
"Data Source=localhost;Initial Catalog=database;Connect Timeout=15"
On the server level, use SQL Server Management Studio to view the server properties, and on the Connections page you can specify the default query timeout.
I'm not quite sure that queries keep on running after the client connection has closed. Queries should not take that long either; MSSQL can handle large databases, and I've worked with GBs of data on it before. Run a performance profile on the queries; perhaps some well-placed indexes could speed them up, or rewriting the query could too.
Update:
According to this list, SQL timeouts happen while waiting for an attention acknowledgement from the server:
Suppose you execute a command, then the command times out. When this happens the SqlClient driver sends a special 8 byte packet to the server called an attention packet. This tells the server to stop executing the current command. When we send the attention packet, we have to wait for the attention acknowledgement from the server and this can in theory take a long time and time out. You can also send this packet by calling SqlCommand.Cancel on an asynchronous SqlCommand object. This one is a special case where we use a 5 second timeout. In most cases you will never hit this one, the server is usually very responsive to attention packets because these are handled very low in the network layer.
So it seems that after the client connection times out, a signal is sent to the server to cancel the running query too.
I am looking into using log shipping in a SQL Server 2005 environment. The idea was to set up frequent log shipping to a secondary server. The intent: Use the secondary server to serve report queries, thereby offloading the primary db server.
I came across this on a sqlservercentral forum thread:
When you create the log shipping you have 2 choices. You can configure the restore log operation to be done with the norecovery or the standby option. If you use the norecovery option, you cannot issue select statements against it. If instead of norecovery you use the standby option, you can run select queries on the database.
Bear in mind that with the standby option, when log file restores occur, users will be kicked out without warning by the restore process. Actually, when you configure log shipping with the standby option, you can also select between 2 choices - kill all processes in the secondary database and perform the log restore, or don't perform the log restore if the database is being used. Of course, if you select the second option, the restore operation might never run if someone opens a connection to the database and doesn't close it, so it is better to use the first option.
So my questions are:
Is the above true? Can you really not use log shipping in the way I intend?
If it is true, could someone explain why you cannot execute SELECT statements to a database while the transaction log is being restored?
EDIT:
The first question is a duplicate of this ServerFault question. But I would still like the second question answered: why is it not possible to execute SELECT statements while the transaction log is being restored?
could someone explain why you cannot execute SELECT statements to a database while the transaction log is being restored?
Short answer is that RESTORE statement takes an exclusive lock on the database being restored.
For writes, I hope there is no need for me to explain why they are incompatible with a restore. Why does it not allow reads either? First of all, there is no way to know whether a session that has a lock on a database is going to do a read or a write. But even if that were possible, a restore (log or backup) is an operation that updates the data pages in the database directly. Since these updates go straight to the physical location (the page) and do not follow the logical hierarchy (metadata-partition-page-row), they would not honor any intent locks held by other data readers, and could therefore change structures while they are being read. A SELECT table scan following the page next-prev pointers would be thrown into disarray, resulting in a corrupted read.
Well yes and no.
You can do exactly what you wish to do, in that you may offload reporting workloads to a secondary server by configuring Log Shipping to a read only copy of a database. I have set this type of architecture up on a number of occasions previously and it works very well indeed.
The caveat is that in order to restore a transaction log backup file, there must be no other connections to the database in question. Hence the two choices: when the restore process runs, it will either fail, thereby prioritising user connections, or it will succeed by disconnecting all user connections in order to perform the restore.
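To make the two options concrete, here is a rough sketch of the restore statements involved (the database name and file paths are made up):
-- standby: the database is readable between restores, but users must be disconnected for each restore
RESTORE LOG ReportingDb
FROM DISK = 'D:\LogShip\ReportingDb_1010.trn'
WITH STANDBY = 'D:\LogShip\ReportingDb_undo.dat';
-- norecovery: the database stays in a restoring state and cannot be queried at all
RESTORE LOG ReportingDb
FROM DISK = 'D:\LogShip\ReportingDb_1010.trn'
WITH NORECOVERY;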
Depending on your restore frequency this is not necessarily a problem. You simply educate your users to the fact that, say every hour at 10 past the hour, there is a possibility that their report may fail. If this happens, they simply re-run the report.
EDIT: You may also want to evaluate alternative architecture solutions to your business need, for example Transactional Replication or Database Mirroring with a Database Snapshot.
If you have the Enterprise edition, you can use database mirroring plus a database snapshot to create a read-only copy of the database, available for reporting, etc. Mirroring uses "continuous" log shipping "under the hood". It is frequently used in the scenario you have described.
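As a hedged illustration (the snapshot name, logical file name and path are made up, and the logical name must match the mirrored database's actual data file), creating such a reporting snapshot on the mirror looks roughly like this:
CREATE DATABASE Sales_Reporting_Snapshot
ON (NAME = Sales_Data, FILENAME = 'D:\Snapshots\Sales_Reporting_Snapshot.ss')
AS SNAPSHOT OF Sales;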
Yes it's true.
I think the following happens:
While the transaction log is being restored, the database is locked, as large portions of it are being updated.
This is for performance reasons more than anything else.
I can see two options:
Use database mirroring.
Schedule the log shipping to only occur when the reporting system is not in use.
There is slight confusion here: the NORECOVERY flag on the restore means your database is not going to be brought out of its recovery state and into an online state - that is why the SELECT statements will not work - the database is offline. The NORECOVERY flag is there to allow you to restore multiple log files in a row (in a DR-type scenario) without bringing the database back online.
If you do not want to log ship / accept its disadvantages, you could switch to one-way transactional replication, but the overhead / set-up will be more complex overall.
Would peer-to-peer replication work? Then you could run queries on one instance and so save the load on the original instance.
Consider a classic ASP site running on IIS6 with a dedicated SQL Server 2008 backend...
Scenario 1:
Open Connection
Do 15 queries, updates etc all through the ASP-page
Close Connection
Scenario 2:
For each query, update etc, open and close the connection
With connection pooling, my money would be on scenario 2 being the most effective and scalable.
Would I be correct in that assumption?
Edit: More information
This is database operations spread over a lot of asp-code in separate functions, doing separate things etc. It is not 15 queries done in rapid succession. Think a big site with many functions, includes etc.
Fundamentally, ASP pages are synchronous. So why not open a connection once per page load, and close it once per page load? All other opens/closes seem to be unnecessary.
If I understand you correctly you are considering sharing a connection object across complex code held in various functions in various includes.
In such a scenario this would be a bad idea. It becomes difficult to guarantee the correct state and settings on the connection if other code may have seen the need to modify them. Also you may at times have code that fetches a firehose recordset and hasn't finished processing when another piece of code is invoked that also needs a connection. In such a case you could not share a connection.
Having each atomic chunk of code acquire its own connection would be better. The connection would be in a clean, known state. Multiple connections, when necessary, can operate in parallel. As others have pointed out, the cost of connection creation is almost entirely mitigated by the underlying connection pooling.
In your Scenario 2 there is a round trip between your application and SQL Server to execute each query, which consumes your server's resources, and the total execution time rises.
But in Scenario 1 there is only one round trip, and SQL Server will run all of the queries in one go, so it is faster and less resource-consuming.
EDIT: well, I thought you meant multiple queries sent at one time...
So, with connection pooling enabled, there is really no problem in closing the connection after each transaction. Go with Scenario 2.
Best practice is to open the connection once, read all your data and close the connection as soon as possible. AFTER you've closed the connection, you can do what you like with the data you retrieved. In this scenario, you don't open too many connections and you don't open the connection for too long.
Even though your code has database calls in several places, the overhead of creating the connection will make things worse than waiting - unless you're saying your page takes many seconds to create on the server side? Usually, even without controlled data access and with many functions, your page should be well under a second to generate on the server.
I believe the default connection pool is about 20 connections, but SQL Server can handle a lot more. Getting a connection from the server takes the longest time (assuming you are not doing anything daft with your commands), so I see nothing wrong with getting a connection per page and killing it when you're done with it.
For scalability you could run into problems where your connection pool gets too busy and times out while your script waits for a connection to become available, while your DB sits there with 100 spare connections but no one using them.
Create and kill on the same page gets my vote.
From a performance point of view there is no notable difference. ADODB connection pooling manages the actual connections with the db. ADODB.Connection .Open and .Close are just a façade to the connection pool. Instantiating either 1 or 15 ADODB.Connection objects doesn't really matter performance-wise. Before we were using transactions, we used the connection string in combination with ADODB.Command (.ActiveConnection) and never opened or closed connections explicitly.
Reasons to explicitly keep a reference to an ADODB.Connection are transactions or connection-based functions like MySQL's last_insert_id(). In these cases you must be absolutely certain that you are getting the same connection for every query.