General network error after a night of inactivity - sql-server

For some time now our flagship application has been having mysterious errors. The error message is the generic
[DBNETLIB][ConnectionWrite (send()).]General network error. Check your network documentation.
This is reliably reproduced by leaving the app open for the night and resuming work in the morning. Since it's a backend server app this is a normal scenario.
The funny thing is - we've migrated from SQL Server 7 to 2000 to 2008 and the issue is present on all of them. But what seems to matter is the OS on which we run the app. On WinXP it works fine, on Vista/7 it fails. So the problem is at the client end.
The results of Google on the error message cover a very wide spectrum of different causes (since this is a very generic error) and none of the scenarios found there are similar to ours.
So perhaps someone around here will know what the problem is in our case?

You should be able to reproduce this error condition on demand by:
1. Opening a database connection (in your client application)
2. Unplugging the network cable
3. Plugging network cable back in (wait until the network connection is restored)
4. Using the previously opened connection to query the database
As far as I can tell from experience, client side ADO code is not able to consistently determine if an underlying network connection is actually valid or not. Checking if the database connection is open (in the client code) returns true. However, performing any operations on that connection results in a General network error.
The connection pool appears to be able to determine when a connection goes 'bad' so it never returns a bad connection to the application. It simply opens a new connection instead.
So, if a database connection is kept alive for a long time (used or unused) by the application, the underlying TCP/IP connectivity can get broken.
The bottom line is that database connections should be closed and returned back to the connection pool when not in use.
Edit
Also, depending on the number of clients connecting to the db, not using the connection pool can cause another issue. You may hit the maximum number of sockets open on the server side. This is from memory. Once a connection is closed on the client side, the connection on the server goes into a TIME_WAIT state. By default, the server socket takes about 4 minutes to close, so it is not available to other clients during that time. The bottom line is that there is a limited number of available sockets on the server. Keeping too many connections open can create a problem.
One project I worked on easily hit this socket limit with around 120 users. A new 'feature' was added that absolutely hammered the server, and after a few hours of using the app, things would suddenly slow to a crawl for everyone. SQL server was not closing enough sockets in time for new connection requests. Although there are 65K sockets altogether, only the first 5000 are made available to the ADO (this is a default registry setting thing, so can be changed).
The number of sockets in TIME_WAIT state would slowly build up until the OS would not allocate any more. So clients had to wait until server side sockets closed and a new connection could then be created.

Have you tried disabling SNP/TCP Chimneying?

Had a similar error. For me it was indirectly caused by mismatched calls to WSACleanup and WSAStartup.
The program called WSACleanup more times than WSAStartup. This would cause a reference counter (somewhere in the sockets library) to reach zero too early.
I think effectively from that moment on all sockets owned by the process are broken.
And this would also kill the SQL client since it uses sockets to 'talk' to the SQL server as well.

Related

What is the cause of a sporadic ORA-12599: TNS:cryptographic checksum mismatch?

In our production environment there are sometimes peaks of ORA-12599 errors the connection works otherwise.
The description is not quite clear to me.
Cause: The data received is not the same as the data sent.
Action:
Attempt the transaction again. If error persists, check (and correct)
the integrity of your physical connection.
https://www.oraexcel.com/database-oracle-11gR2-ORA-12599
Does that mean there is a problem with my network cable? Or any other Hardware?
Because our DBA said the error can be caused by a configuration issue on the client side.
Edit:
It started without any changes on the client or server side. And there are spikes. 80 errors in an hour and then no errors for 4 hours while the load is more or less constant
It can mean a mismatch in how the client and the server are wishing to communicate (in encrypted fashion). For example, trying to connect to an 11g database with 10g JDBC drivers will get this because the encryption implementation changed between these versions.
Check your local (ie, client) sqlnet.ora file - you may need to opt for a different method or change the order of them.
The error can be typically be ignored (its informational) but using the latest drivers will normally resolve the issue.
We had sporadic ORA-12599 errors and also ORA-24300 ("bad value for mode") in a Python web app using Flask and uwsgi. It turned out to be caused by multiple worker processes writing and reading to/from the same TCP connection to the DB, giving chaotic communications on the socket.
In our setup this was caused by creating the TCP connections to the DB before uwsgi forked the workers, and the file descriptors being inherited. We fixed this by using uwgsi's lazy-apps mode.
See also this question.

SQL Server - How to prevent sql server from disconnect the idle connection after period of time?

i am using sql server 2012 with desktop application as client , the application get errors after period of time when no activity on it , i googled about this issue all solutions points me to AUTO_CLOSE option on database but it's already set to false .
i thing is something missing in connection string (ADO Extension)
To be honest, if you have long running connections, you can hit these errors regardless due to firewalls / routers closing connections, etc. The correct solution is to instantiate a connection when you need it, use the connection and release it. With connection pooling, this is not really a performance problem.
If your long-running application is "bursty", it is sometimes convenient to open the connection, do a number of commands -- then when you go idle, release the connection and wait the next burst of activity.

Access dropping SQL connection

Not sure if this is the correct forum for this, but here goes.
Im looking for any suggestions as to what I can try to reslove this...
I have an Access 2003 front end (on each client) with SQL 2008 database. Ive went round each user and set up the odbc connection on each pc.
for most users its fine and been working well for a year, but for a few every now and then when running a query (either an update or a select when opening a form) the SQL connection seems to have been dropped and they cant go any further.
I cant think of any glaring difference between those who have it working and those who dont.
Any idea's where I should start with this?
thanks
I've had such cases before: Access frontend, SQL Server backend. On one or some of the customer's PCs, the connection suddenly drops (throwing some ODBC or SQL Server connection error). Happens randomly and rarely (e.g. once per hour/day/week), and the Access application needs to be restarted to continue working.
In all of these cases, one of the following was the culprit:
Broken network cable
Broken network card
Buggy network card driver
Unstable network protocol (yes, this one was in the old days of NetBIOS)
The thing is: Access is extremely sensitive to network errors. A simple glitch in the network, a few seconds of lost connectivity -- something which you won't even notice with other applications -- will cause an Access frontend application to lose its database connection and crash horribly. It's very frustrating, because the customer will say "I don't experience any network trouble with Word/Windows Explorer/etc., so my network is fine, and it's your application that's broken." It's not true. If Access experiences sporadic and unpredictabe network errors, it's usually really a network problem.
So, the first thing I'd do is to replace (a) the network card, (b) the network cable and (c) use another switch port for one of the machines experiencing problems. If the problems are gone on that machine, you know that one of these components was the faulty one.

SYN receives RST,ACK very frequently

Hi Socket Programming experts,
I am writing a proxy server on Linux for SQL server 2005/2008 running on Windows.
The proxy is coded using bsd sockets and in C, and it is working fine with a problem described below.
When I use a database client (written in JAVA, and running on a Linux box) to fire queries (with a concurrency of 100 or more) directly to the Database server, not experiencing connection resets. But through my proxy I am experiencing many connection resets.
Digging deeper I came to know that connection from 'DB client' to 'Proxy' always succeeds
but when the 'Proxy' tries to connect to the DB server the connection fails, due to the SYN packet getting RST,ACK.
That was to give some background. The question is :
Why does sometimes SYN receives RST,ACK?
DB client(linux) to Server(windows) ----> Works fine
DB client(linux) to Proxy(Linux) to Server(windows) -----> problematic
I am aware that this can happen in "connection refused" case but this definitely is not that one. SYN flooding might be another scenario, but that does not explain fine behavior while firing to Server directly.
I am suspecting some socket option setting may be required, that the client does before connecting and my proxy does not. Please put some light on this. Any help (links or pointers) is most appreciated.
Additional info:
Wrote a C client that does concurrent connections, which takes concurrency as an argument. Here are my observations:
-> At 5000 concurrency and above, some connects failed with 'connection refused'.
-> Below 2000, it works fine.
But the actual problem is observed even at a concurrency of 100 or more.
Note: The problem is time dependent sometimes it never comes at all and sometimes it is very frequent and DB client (directly to server) works fine at all times .
SQL Server needs worker threads to accept incoming connections. If your server is worker starved (which can be easily diagnosed by a high number of entries in sys.dm_os_tasks in PENDING state) then attempting to open new connection will fail. So likely what I suspect it happens is that you're pushing to the server more workload that it can handle. You need to optimize the workload or get a beefier server.
Clients like Java client make effective use of connection pooling and, even under a high load, do not need to open new connection hence you do not see this problem, instead you see only delays in requests completion.
A listening socket keeps queue of established connections and connections in establishment process (e.g. SYN got, SYNACK replied, but not ACK from client yet). If a established queue overflows, IP stack reaction differs on OS. The most traditional approach was to ignore newcoming SYNs, waiting when userland accept()s and frees a slot in the queue. With SYN flooding attacks in mid-90s, the new method was invented named "SYN cookies" which drops need for establishment queue totally, in cost of need to support special TCP option.
OTOH I have heard that Windows stacks change their behavior - under some conditions, the reaction to queue overflow is RST response. In earlier stacks (e.g. Win95) this was the main response and client side was correspondingly changed to ignore RST response to SYN:(
That's why I guess that some proxy host feature triggers RST in Windows stack.
Another guess is that DB server closes listening socket at all under some condition (e.g. detected overload peak) which appears only with proxy.
When the SYN receives the RST response, it should not be the problem of SQL-SERVER.
Because the application is able to accept the socket only after the tcp handshake finished.
Is there any device between the Proxy and the SQL-Server machines?
Try to make sure that the RST response come from sql-server machine.
You connection count is far from SYN-FLOOD, I think.

how long must a sql server connection be idle before it is closed by the connection pool?

I have a client-server app that uses .NET SqlClient Data Provider to connect to sql server - pretty standard stuff. By default how long must connections be idle before the connection pooling manager will close the database connection and remove it from the pool? What setting if any controls this?
This MSDN document only says
The connection pooler removes a connection from the pool after it has been idle for a long time, or if the pooler detects that the connection with the server has been severed.
A few years ago the answer beneath was the situation, but now it's changed so you can refer to the source and write up a summary :)
Old answer
This excellent article tells us what we need to know, using reflection to reveal the inner workings of connection pooling.
From how I understand it, 'closed' connections are cleaned up periodically on a semi-random interval. The cleanup process runs somewhere between every 2min and 3min 50s, but it needs to run twice before a 'closed' connection will be properly closed. Therefore after 7min 40s of being 'closed' the underlying sql connection should be properly closed, but it could be as short as 2min. At the time of writing the first connection pool created in a process would always have a timer interval of 3min 10s, so you'd normally see sql connections being closed somewhere between 3min 10s and 6min 20s after you call Close() on the ADO object.
Obviously this uses undocumented code so could change in future - or could even have changed since that article was written.
Please go through this:
http://msdn.microsoft.com/en-us/library/system.data.sqlclient.sqlconnection.connectionstring%28VS.80%29.aspx
The part
"The following table lists the valid
names for connection pooling values
within the ConnectionString."
seems to be of your interest.

Resources