ADO.NET Connection Pooling & SQL Server

What is it?
How do I implement connection pooling with MS SQL?
What are the performance ramifications when:
- executing many queries one after the other (e.g. a loop with 30K+ iterations calling a stored procedure)?
- executing a few queries that take a long time (10+ minutes)?
Are there any best practices?

Connection pooling is a mechanism to re-use connections, as establishing a new connection is slow.
If you use an MSSQL connection string and System.Data.SqlClient then you're already using it - in .Net this stuff is under the hood most of the time.
A loop of 30k iterations might be better as a server-side cursor (look up T-SQL cursor statements), depending on what you're doing with each step outside of the sproc.
Long queries are fine - but be careful calling them from web pages as Asp.Net isn't really optimised for long waits and some connections will cut out.
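On the 30k-loop point: if the per-iteration work has to stay on the client, a minimal sketch (C#; the connection string, stored procedure name and parameter below are placeholders) of keeping the loop cheap is to reuse one pooled connection and one command object for the whole run:

```csharp
using System.Data;
using System.Data.SqlClient;

// Placeholder connection string and stored procedure.
var connStr = "Data Source=myServer;Initial Catalog=myDb;Integrated Security=SSPI;";

using (var conn = new SqlConnection(connStr))
using (var cmd = new SqlCommand("dbo.ProcessItem", conn))
{
    cmd.CommandType = CommandType.StoredProcedure;
    var idParam = cmd.Parameters.Add("@ItemId", SqlDbType.Int);

    conn.Open(); // one physical connection for the whole batch
    for (var i = 0; i < 30000; i++)
    {
        idParam.Value = i;     // reuse the same command, just change the value
        cmd.ExecuteNonQuery(); // no per-iteration connection overhead
    }
} // Dispose() returns the connection to the pool
```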

A little more info on the connection pooling thing... you're using it already with SqlClient, but only if your connection string is identical for each new connection you open. The framework pools connections automatically when it can, but pooling is keyed on the exact connection string: if the string varies even slightly from one connection to the next, the new connection can't reuse the existing pool - a separate pool (and a new physical connection) gets created, which is more expensive.
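To make "identical" concrete, a rough sketch (server and database names are placeholders; Min Pool Size, Max Pool Size and Pooling are real SqlClient connection-string keywords):

```csharp
using System.Data.SqlClient;

// These two strings differ only in keyword order, so SqlClient
// keys them to two separate pools.
var a = "Data Source=myServer;Initial Catalog=myDb;Integrated Security=SSPI;";
var b = "Initial Catalog=myDb;Data Source=myServer;Integrated Security=SSPI;";

// Pool behaviour is tuned in the string itself.
var tuned = "Data Source=myServer;Initial Catalog=myDb;Integrated Security=SSPI;"
          + "Min Pool Size=5;Max Pool Size=100;Pooling=true;";

using (var conn = new SqlConnection(tuned))
{
    conn.Open(); // drawn from (or added to) the pool keyed on this exact string
}                // Close/Dispose returns it to that pool
```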
You can use the Performance Monitor app on XP/Vista to watch SQL connections, and you'll see pretty quickly whether or not pooling is being used. Look under the ".NET CLR Data" category in Performance Monitor.
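If you'd rather check from code than from the Performance Monitor UI, something like this should work on .NET Framework (the category and counter names are the documented ".NET CLR Data" ones; "_Global_" aggregates across processes):

```csharp
using System;
using System.Diagnostics;

// Reads the aggregate SqlClient pooling counter.
var pooled = new PerformanceCounter(
    ".NET CLR Data",
    "SqlClient: Current # pooled connections",
    "_Global_");

Console.WriteLine($"Pooled connections: {pooled.NextValue()}");
```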

I second Keith; if you're calling a stored procedure 30,000 times, you have far bigger problems than connection pooling.

Your question was also partially answered by this thread - a search would have revealed it. The definition of connection pooling is likewise something a Google search would have answered, with the first hit being this..
Which would leave just the best practices, which I think would have been a good question :)
+1 to Keith's Answer. He has hit the nail right on the head.
Just a polite reminder from the FAQ:
You've searched the internet before asking your question, and you come to us armed with research and information about your question ... right?

Related

Database pooling: when to use it and when not

I've been researching database pooling all over the web, but there are still a few things I don't understand, which I hope more experienced developers can answer here.
My understanding of database pooling is that when there are multiple clients connecting to the database, it makes sense to keep the connection in a "cache" and use that to connect to the database faster.
What I fail to understand is how that would help if let's say I have a server that connects to the database and keeps the connection open. When a client requests data from an endpoint, the server will use the same connection to retrieve the data. How would pooling help in that case?
I'm sure I'm missing something in my understanding of pooling. It would make sense if multiple instances of the same database existed, and the pool decided which database to connect to using the cached credentials. Is that what happens?
Also could you give me a scenario where database pooling should be used and when not?
Thanks for clarifying any doubt of mine.
Connection pooling is handled differently in different application scenarios and platforms/languages.
The main consideration is that a database connection is a finite resource, and it costs time to establish it.
Connections are finite because most database servers impose a maximum limit on the number of concurrent connections (this may be part of your licensing terms). If your application uses more connections than the database allows, it may start rejecting connections (leading to error handling requirements in the client app), or making connections wait (leading to poor response times). By configuring a connection pool, the client application can manage these scenarios in a central place.
Secondly, managing connections is a bit of a pain - there are lots of different error handling scenarios, configuration settings etc.; it's a good idea to centralize this. You can, of course, do that without a connection pool.
Thirdly, connections are "expensive" resources - they take time to establish. Most web pages require several database queries; if each query spends just a tenth of a second creating a database connection, you're very quickly spending noticeable time waiting for database connections. By using a connection pool, you avoid this overhead.
In .Net, connection pooling is handled automatically, not directly by your application.
All you need to do is open, use and close your connections normally and it will take care of the rest.
https://learn.microsoft.com/en-us/dotnet/framework/data/adonet/sql-server-connection-pooling
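A minimal sketch of that open/use/close pattern (connection string and query are placeholders):

```csharp
using System;
using System.Data.SqlClient;

var connStr = "Data Source=myServer;Initial Catalog=myDb;Integrated Security=SSPI;";

using (var conn = new SqlConnection(connStr))
using (var cmd = new SqlCommand("SELECT COUNT(*) FROM dbo.Orders", conn))
{
    conn.Open();                          // borrows an open connection from the pool
    var count = (int)cmd.ExecuteScalar(); // use it briefly...
    Console.WriteLine(count);
}   // ...and Dispose() hands it straight back to the pool
```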
If you're talking about a different platform, the mechanics are different, although the purpose is the same.
In all cases, it's time consuming to open and close connections to the DB server, so something between your application and the DB server (typically the database driver or some sort of middle-ware) will maintain a pool of open connections and create, kill and recycle them as necessary.
Pooling keeps the connections open and cuts down on the overhead of opening one for each request.
Also could you give me a scenario where database pooling should be used and when not?
Connection pooling is useful in any application that uses the same database connection multiple times within the lifetime of the connection pool.
There would actually be a very slight decrease in performance if you had an application that used a single connection once, then didn't use it again until the connection pool had timed out and recycled. This is extremely uncommon in production applications.
What I fail to understand is how that would help if let's say I have a server that connects to the database and keeps the connection open.
If you have an application that opens a connection and leaves it open, then theoretically pooling would not help. Practically speaking, it's good to periodically kill and recreate various resources like connections, handles, sockets, etc. because software isn't perfect and some code contains resource leaks.
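For what it's worth, SqlClient lets you opt into that periodic recycling declaratively, via the Connection Lifetime connection-string keyword (server and database values below are placeholders):

```csharp
// A pooled connection older than 300 seconds is destroyed on Close()
// instead of being returned to the pool, forcing periodic re-creation.
var connStr = "Data Source=myServer;Initial Catalog=myDb;Integrated Security=SSPI;"
            + "Connection Lifetime=300;";
```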
I'm just guessing, but I suspect that your concern is premature optimization. Unless you have done actual testing and determined that connection pooling is a problem, I wouldn't be too concerned with it. Connection pools are generally self-maintaining and almost always improve performance.

Running a query using ADODB Connection randomly takes a long time to execute

I have come across an issue that seems to be somehow connected to the web server configuration and results in queries randomly taking a long time to execute. The application is written in plain old Classic ASP and uses an ADODB Connection.
The scenario goes as follows:
- a single connection is opened in a script at the beginning of processing each HTTP request
- this connection is used to execute queries against a SQL Server that resides on a separate box; conn.Execute is used, and the connection is NOT closed afterwards
- there are usually a few to a few dozen conn.Execute calls in a single ASP page
All had been working well until recently, when some of the conn.Execute calls started to take much longer, completely at random:
- the difference is e.g. 15ms normal execution time vs. 2000ms long execution time
- on the SQL Server side, Profiler does not show longer query execution times, so something must be blocking the conn.Execute request
When the proper practice of closing the connection after each conn.Execute is implemented, the issue goes away. However, as I stated before, everything had been working flawlessly until recently. This web app is fairly large, and rewriting it to close and reopen connections properly will take some time, so I need a short-term solution.
My guess is that it could have something to do with the connection pool size. However, this is not ADO.NET, so I am not sure whether a connection pool issue should be considered at all. On the SQL Server side, there is no limit on the number of concurrent connections to the server.
I need some hints - brainstorming possible ideas is welcome.
Could be related to delays resolving the hostname in the connection string via DNS - have you tried putting an IP address in the connection string instead of the hostname?
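For example (a hypothetical OLE DB connection string; the IP, database and credentials are placeholders), putting the IP straight into Data Source means no DNS lookup is needed whenever a new physical connection is opened:

```
Provider=SQLOLEDB;Data Source=192.168.1.50;Initial Catalog=myDb;User ID=appUser;Password=secret;
```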

dbcp having problems returning connections when database is unavailable

I found the following link by erickson, dated Jan 29, 2009:
Is DBCP (Apache Commons Database Connection Pooling) still relevant?
"DBCP has serious flaws. I don't think it's appropriate for a production application, especially when so many drivers support pooling in their DataSource natively.
The straw that broke the camel's back, in my case, was when I found that the entire pool was locked the whole time a new connection attempt is made to the database. So, if something happens to your database that results in slow connections or timeouts, other threads are blocked when they try to return a connection to the pool—even though they are done using a database."
I was wondering if much had changed or improved with dbcp since this post. I am seeing this EXACT problem in my production system.
Does anyone have any alternatives to dbcp? I use it in a database connection framework...basically, I inherited a framework where the engineers thought it would be fun to rewrite hibernate. don't ask...it's a long and sordid tale. Anyway, I'm having these problems returning connections to the pool when the database is slow/down. Any ideas, suggestions, alternatives?
Try BoneCP: http://jolbox.com
For your case, it has release helper threads that hand connections back to the pool on a separate thread, so a slow release won't block the thread doing the actual work.

Classic ASP - using one connection for many queries?

Consider a classic ASP site running on IIS6 with a dedicated SQL Server 2008 backend...
Scenario 1:
- open connection
- do 15 queries, updates etc. all through the ASP page
- close connection
Scenario 2:
- for each query, update etc., open and close the connection
With connection pooling, my money would be on scenario 2 being the most effective and scalable.
Would I be correct in that assumption?
Edit: More information
This is database operations spread over a lot of asp-code in separate functions, doing separate things etc. It is not 15 queries done in rapid succession. Think a big site with many functions, includes etc.
Fundamentally, ASP pages are synchronous. So why not open a connection once per page load, and close it once per page load? All other opens/closes seem to be unnecessary.
If I understand you correctly you are considering sharing a connection object across complex code held in various functions in various includes.
In such a scenario this would be a bad idea. It becomes difficult to guarantee the correct state and settings on the connection if other code may have seen the need to modify them. Also you may at times have code that fetches a firehose recordset and hasn't finished processing when another piece of code is invoked that also needs a connection. In such a case you could not share a connection.
Having each atomic chunk of code acquire its own connection would be better. The connection would be in a clean, known state, and multiple connections can operate in parallel when necessary. As others have pointed out, the cost of connection creation is almost entirely mitigated by the underlying connection pooling.
In your Scenario 2, there is a round-trip between your application and SQL Server to set up each query, which consumes your server's resources, and the total execution time will rise. In Scenario 1 there is only one such round-trip, and SQL Server can run all of the queries in one go, so it is faster and less resource-consuming.
EDIT: well, I thought you meant multiple queries in one batch..
So, with connection pooling enabled, there is really no problem in closing the connection after each operation - go with Scenario 2.
Best practice is to open the connection once, read all your data and close the connection as soon as possible. AFTER you've closed the connection, you can do what you like with the data you retrieved. In this scenario, you don't open too many connections and you don't open the connection for too long.
Even though your code has database calls in several places, the overhead of creating the connection will make things worse than waiting - unless you're saying your page takes many seconds to create on the server side? Usually, even without controlled data access and with many functions, your page should be well under a second to generate on the server.
I believe the default connection pool is about 20 connections, but SQL Server can handle a lot more. Getting a connection from the server takes the longest time (assuming you are not doing anything daft with your commands), so I see nothing wrong with getting a connection per page and killing it after use.
For scalability you could run into problems where your connection pool gets too busy and your script times out waiting for a connection to become available, while your DB sits there with 100 spare connections but no one using them.
Create and kill on the same page gets my vote.
From a performance point of view there is no notable difference. ADODB connection pooling manages the actual connections to the db; Adodb.connection .open and .close are just a façade over the connection pool. Instantiating either 1 or 15 adodb.connection objects doesn't really matter performance-wise. Before we were using transactions, we used the connection string in combination with adodb.command (.activeConnection) and never opened or closed connections explicitly.
Reasons to explicitly keep a reference to an adodb.connection are transactions or connection-based functions like mysql last_insert_id(). In these cases you must be absolutely certain that you are getting the same connection for every query.

SQL Server connection management in Tomcat 6

We are having trouble with a Java web application running within Tomcat 6 that uses JDBC to connect to a SQL Server database.
After a few requests, the application server dies, and in the log files we find exceptions related to database connection failures.
We are not using any connection pooling right now and we are using the standard JDBC/ODBC/ADO driver bridge to connect to SQL Server.
Should we consider using connection pooling to eliminate the problem?
Also, should we change our driver to something like jTDS?
That is the correct behavior if you are not closing your JDBC connections.
You have to call the close() method of each JDBC resource when you are finished using it and the other JDBC resources you obtained with it.
That goes for Connection, Statement/PreparedStatement/CallableStatement, ResultSet, etc.
If you fail to do that, you are hoarding potentially huge and likely very limited resources on the SQL server, for starters.
Eventually, connections will not be granted, and attempts to execute queries and return results will fail or hang.
You could also notice your INSERT/UPDATE/DELETE statements hanging if you fail to commit() or rollback() at the conclusion of each transaction, when you have not set the autoCommit property to true.
What I have seen is that if you apply the rigor mentioned above to your JDBC client code, then JDBC and your SQL server will work wonderfully smoothly. If you write crap, then everything will behave like crap.
Many people write JDBC calls expecting "something" else to release each thing by calling close() because that is boring and the application and server do not immediately fail when they leave that out.
That is true, but those programmers have written their programs to play "99 bottles of beer on the wall" with their server(s).
The resources will become exhausted and requests will tend to result in one or more of the following happening: connection requests fail immediately, SQL statements fail immediately or hang forever or until some godawful lengthy transaction timeout timer expires, etc.
Therefore, the quickest way to solve these types of SQL problems is not to blame the SQL server, the application server, the web container, JDBC drivers, or the disappointing lack of artificial intelligence embedded in the Java garbage collector.
The quickest way to solve them is to shoot the guy who wrote the JDBC calls in your application that talk to your SQL server with a Nerf dart. When he says, "What did you do that for...?!" Just point to this post and tell him to read it. (Remember not to shoot for the eyes, things in his hands, stuff that might be dangerous/fragile, etc.)
As for connection pooling solving your problems... no. Sorry, connection pools simply speed up the call to get a connection in your application by handing it a pre-allocated, perhaps recycled connection.
The tooth fairy puts money under your pillow, the Easter bunny puts eggs & candy under your bushes, and Santa Claus puts gifts under your tree. But, sorry to shatter your illusions - the SQL server and JDBC driver do not close everything just because you "forgot" to close all the stuff you allocated yourself.
I would definitely give jTDS a try. I've used it in the past with Tomcat 5.5 with no problems. It seems like a relatively quick, low impact change to make as a debugging step. I think you'll find it faster and more stable. It also has the advantage of being open source.
In the long term, I think you'll want to look into connection pooling for performance reasons. When you do, I recommend having a look at c3p0. I think it's more flexible than the built in pooling options for Tomcat and I generally prefer "out of container" solutions so that it's less painful to switch containers in the future.
It's hard to tell really, because you've provided so little information on the actual failure:
After a few requests, the application server dies and in the log files we find exceptions related to database connection failures.
Can you tell us:
- exactly what the error is that you're seeing
- give us a small example of the code where you connect and service one of your requests
- is it after a consistent number of transactions that it fails, or is it seemingly random?
I have written a lot of database-related Java code (pretty much all my code is database-related), and have used the MS driver, the jdt driver, and the one from jnetDirect.
I'm sure if you provide us more details we can help you out.
