Opening the database connection once, or on every database action?

I'm currently creating a web portal with ASP.NET which relies heavily on database usage. Basically, every (well, almost every :P) GET request from any user will result in a query to the database from the web server.
Now, I'm really new at this, and I'm very concerned about performance. Due to my lack of experience in this area, I don't really know what to expect.
My question is: using ADO.NET, would it be smarter to just leave a static connection open from the web server to the database, and then check the integrity of this connection server-side before each query? Or would I be better off opening the connection before each query and closing it afterwards?
In my head the first option would be better, as you save the handshaking time before each query, and you save memory on both the database and the server side since you only have one connection. But are there any downsides to this approach? Could two queries sent at the same time potentially destroy each other's integrity or mix up the returned datasets?
I've tried searching here and on the web for best practices about this, but with no luck. The closest I got was this: is it safe to keep database connections open for a long time, but that seems more fitting for distributed systems where you have more than one user of the database, whereas I've only got my web server.

You're way early to be worrying about performance.
Anyhow, connections are pooled by the framework. You should be opening them, using them, and disposing of them ASAP.
Something like...
public object Load()
{
    using (SqlConnection cn = new SqlConnection(connectionString))
    using (SqlCommand cm = new SqlCommand(commandString, cn))
    {
        cn.Open();
        return cm.ExecuteScalar();
    }
}

It's better to let ADO.NET handle the connection pooling. It'll keep the underlying connection around if it thinks it needs to, but don't use a static connection object; that just smells. It would be better to create the connection in a using block and pass the connection object to the methods that need it.

You should always close your connection after finishing your DB interaction. ADO.NET has connection pooling, which takes care of efficient connection reuse: when you open a 2nd, 3rd, or subsequent connection, it will be taken from the pool with almost no overhead.
Hope this helps.

I'd be thinking more about caching than about advanced connection pooling. Does every GET really require a database hit?
If it's a portal, you've got common content and user-specific content. Using the Cache, you can store the common items under plain keys, and the user-specific items under a mangled key (one that includes the user's id).
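The mangled-key idea is language-neutral; a minimal sketch (in Java, with made-up key formats and item names) might look like this:

```java
import java.util.HashMap;
import java.util.Map;

public class CacheKeys {
    // Common content shares one key across all users.
    static String commonKey(String item) {
        return "common:" + item;
    }

    // User-specific content gets the user's id mangled into the key.
    static String userKey(String item, int userId) {
        return "user:" + userId + ":" + item;
    }

    public static void main(String[] args) {
        Map<String, Object> cache = new HashMap<>();
        cache.put(commonKey("frontpage-news"), "...shared html...");
        cache.put(userKey("inbox-count", 42), 7);
        // User 42 gets their own cached value; user 43 would miss and hit the DB.
        System.out.println(cache.get(userKey("inbox-count", 42)));
    }
}
```

The same pattern maps directly onto the ASP.NET Cache: common items keyed plainly, per-user items keyed with the user id folded in.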

ADO.NET does connection pooling. When you call close on the connection object it will keep the connection in the pool making the next connection much faster.

Your initial hunch is correct. What you need is database connection pooling.

You definitely don't want to open a fresh connection for every database call; that will result in extremely poor performance very quickly. Establishing a database connection is very expensive.
Instead what you should be using is a connection pool. The pool will manage your connections, and try to re-use existing connections when possible.

I don't know your platform, but look into Connection Pooling - there must be a library or utility available (either in the base system or as an add-on, or supplied with the database drivers) that will provide the means to pool several active connections to the database, which are ready and raring to be used when you obtain one from the pool.
To be honest, I would expect the pooling to occur by default in any database abstraction library (with an available option to disable it). It appears that ADO.NET does this.
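The mechanics these answers describe can be shown with a toy pool. This is a conceptual sketch only (real pools such as ADO.NET's built-in one add validation, timeouts, and concurrency handling); the pooled objects here are plain placeholders, not real connections:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

// Naive illustration of what a connection pool does for you.
class NaivePool<T> {
    private final Deque<T> idle = new ArrayDeque<>();
    private final Supplier<T> factory;
    private int created = 0;

    NaivePool(Supplier<T> factory) { this.factory = factory; }

    synchronized T acquire() {
        if (!idle.isEmpty()) return idle.pop(); // reuse: no handshake cost
        created++;
        return factory.get();                   // create only when pool is empty
    }

    synchronized void release(T conn) {
        idle.push(conn); // "closing" just returns the connection to the pool
    }

    synchronized int totalCreated() { return created; }
}

public class PoolDemo {
    public static void main(String[] args) {
        NaivePool<Object> pool = new NaivePool<>(Object::new);
        for (int i = 0; i < 100; i++) {  // 100 sequential "requests"
            Object c = pool.acquire();
            pool.release(c);
        }
        // Only one physical "connection" was ever created.
        System.out.println(pool.totalCreated());
    }
}
```

This is why "open, use, close" is cheap in pooled environments: the open and close are logical, not physical.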

Really the first question to ask is why are you very concerned about performance? What is your expected workload? Have you tried it yet?
But in general, yes, it's smarter to have an open connection that you keep around for a while than to re-open a database connection each time; depending on the kind of connection, network issues, and phase of the moon, it can take a good part of a second or more to make an initial connection; if your workload is such that you expect more than a GET every five seconds or so, you'll be happier with a standing connection.

Related

Codeigniter multiple database connection slows my pages. Do I have to close the connections? Where?

I'm using CodeIgniter 3.0 for a web app.
I have 2 databases on the same server (I guess), but with 2 different hostnames.
I don't use the second database, except for one kind of user.
When this kind of user connects to the web app, the pages take a very long time to display, but my other users have no problem, so I guess it's a multiple-database-connection issue.
In my database.php file, I've written the 2 arrays containing the database information.
In my model files using the second database, I just write something like this:
$db1 = $this->load->database('db1', TRUE);
...
// I do my query as usual
$db1->...
...
return $db1->get();
I do not close the connection.
Questions:
1) In each page, I use several functions that use the second database. Is this issue due to these multiple connections to my second database?
2) Do I have to close the connection in my model's functions, just before the return? Or is it better to connect and disconnect in the controller?
3) I saw the CI reconnect function, but how do I use it properly? To reconnect, I have to connect first, but where do I connect first?
4) Or do you think the issue is due to something else, like some bad SQL queries?
Let's go through your questions one at a time and I'll comment.
1) In each page, I use several functions that use the second database.
Is this issue due to these multiple connections to my second database?
I say no, because I have used the same multiple-DB approach many times and have never seen a performance hit. Besides, if a performance hit were a common problem, there would be lots of online complaints and people looking for solutions. I've seen none. (And I spend way too much time helping people with CodeIgniter.)
2) Do I have to close the connection in my function's model, just
before the return? Or is it better to connect and disconnect in the
controller?
If closing the connection did help, then when to do it depends on the overall structure of the logic. For instance, if a controller uses several methods from the same model to create a page, then close the connection in the controller. On the other hand, if only one model method is used to create a given page, then close the connection in the model.
What you don't want to do is repeatedly open and close a DB connection while building a page.
3) I saw about the CI reconnect function, but how to use it well? To
reconnect, I have to connect first, but where to connect first?
reconnect() is only useful when the database server drops the connection because it has been idle too long. You'll know you need reconnect() when you start getting "no database connection" or "cannot connect to database" errors.
4) Or do you think the issue is due to something else, like some bad SQL queries?
Because the other approaches you ask about won't help, this is the strongest possibility. Again, my reasoning is that I've never had this problem using multiple database connections.
I suggest you do some performance profiling on the queries to the second database. Check out the following bits of documentation for help with that.
Profiling Your Application
Benchmarking Class
There are lots of reasons for slow page loads and the use of the second DB might just be a coincidence.
About Closing Connections
The question is, "If I do not close the DB connection by myself, CI will do it for me, but when?".
The answer is found in the PHP manual, "Open non-persistent MySQL connections and result sets are automatically destroyed when a PHP script finishes its execution." That quote is from the mysqli documentation, but, to the best of my knowledge, it is true for all of PHP's database extensions, i.e. Oracle, Mssql, PDO, etc.
In short, DB connection closing is baked into PHP and happens when the script is done. In CI, the script is done very shortly after the Controller returns. (Examine the end of /system/core/Codeigniter.php if you want to see what happens when the controller returns.) In effect, a Controller returning is, more or less, another way of saying "after the page is loaded".
Unless you happen to be using persistent connections (usually a bad idea) you seldom need to explicitly close DB connections. One reason to close them yourself is when a lot (really a lot) of time is required to process the query results. Manually closing connections helps assure the DB server won't reach its connection limit when the web server is under heavy usage.
To determine what "really a lot" means you have to consider multiple factors, i.e. how many connections the database server allows, how the time-to-process compares to the DB idle connection dropout duration, and the amount of traffic the site needs to handle.
There are likely other considerations too. I'm not a database performance tuning expert.

SQL Server Database Connection Pooling?

I am getting the following exception in a website which, to a large extent, deals with data entry operations. It also has indexes defined on the tables in the associated database. The database calls are made via SQLHelper, e.g. SQLHelper.ExecuteNonQuery(). I cannot see anywhere that the Close() or Dispose() method of the SqlConnection is invoked, so I am assuming that SQLHelper must be taking care of it, as I have also read on various sites. Also, checking all the code for Close() or Dispose() calls is very tedious, as SQLHelper is used in many places and there are many classes where business logic is defined. The exception I am getting is:
The record was not updated Timeout expired. The timeout period elapsed prior to obtaining a connection from the pool. This may have occurred because all pooled connections were in use and max pool size was reached.
For now I have tested the code with GC.Collect in the Application_EndRequest method of Global.asax, and everything is working fine as of now. But I know that this is strictly not recommended.
Any help will be GREATLY appreciated as I am stuck at present.
Not sure which version of SQLHelper you have, but if you can't see any connection.Close() being called, then you need to call it manually to make sure the connection is closed. The garbage collector will not close the connection for you.
EDIT
Also, about the connection pooling: it's enabled by default by .NET itself. Calling connection.Close() does not mean the connection between your application and SQL Server is really closed; it just returns that connection to the connection pool, where it waits for someone else to grab it. Only if nobody opens a new connection for a while will the connection be physically closed. So you should not worry about calling connection.Close() too many times; on the contrary, you need to call it as soon as possible to release the resource for other threads to use.
For more detail please check how Microsoft saying about connection pool: http://msdn.microsoft.com/en-us/library/8xx3tyca.aspx
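For reference, pooling in ADO.NET is controlled from the connection string itself. A hedged example (server and database names are placeholders; the keyword names are the documented ones, and the values shown are just illustrative defaults):

```
Server=myServer;Database=myDb;Integrated Security=true;
Pooling=true;Min Pool Size=0;Max Pool Size=100;Connect Timeout=15
```

The "max pool size was reached" timeout in the question above fires when all pooled connections (100 by default) are checked out and none is returned within the timeout, which is exactly the symptom of leaking un-closed connections.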
ANOTHER EDIT
I suggest you find an updated version of SQLHelper, or go ahead and change SQLHelper to add Close() calls. Even if GC.Collect() appears to close the connection for you, you should not use it that way: the GC is designed to release memory, not database connections, and GC.Collect() is not guaranteed to start garbage collection immediately.
Also, you are coding a web application, so you need to think about concurrency: what happens when you call GC.Collect() while another thread is running? Will that slow down the system for your other users?
It's common sense that limited resources (like DB connections, TCP/IP ports, file read/write handles, etc.) need to be released as soon as possible. If you are looking for an easy way to avoid calling connection.Close() at all, you are heading in the wrong direction. I understand you want to simplify your code and not add that line everywhere, but you at least need to make sure SQLHelper is doing the job of closing the connection.

Reasons not to use database connection pooling?

I recently found an automatically created connection string specifying "Pooling=False" and wondered why it was set up like that. From my understanding, pooling is virtually always advantageous as long as it is not totally misconfigured.
Are there any reasons for disabling pooling? Does it depend on the OS, the physical connection or the used DBMS?
Yes, there's a reason to disable pooling: you need to check how a particular pooling library copes with temporary network disconnects. For example, some database drivers and/or pool libraries do nothing when the connection is lost but the connection object is still active. Instead of respawning a new connection, the pool gives you stale connections and you get errors. Other pool implementations check whether a connection is alive by issuing some fast command to the server before serving the connection to the application; if they get an error, they kill that connection and spawn a new one. Always test your pool library against such a scenario before enabling pooling.
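That "check if the connection is alive before serving it" behavior (often called test-on-borrow) can be sketched abstractly. This is a toy version in Java; the "connections" are dummy boolean flags standing in for real ones, and the liveness predicate stands in for something like a SELECT 1 ping:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Predicate;
import java.util.function.Supplier;

// Toy "test on borrow" pool: ping each idle connection before handing
// it out; discard stale ones and create a fresh connection if needed.
class ValidatingPool<T> {
    private final Deque<T> idle = new ArrayDeque<>();
    private final Supplier<T> factory;
    private final Predicate<T> isAlive; // e.g. a fast "SELECT 1" ping

    ValidatingPool(Supplier<T> factory, Predicate<T> isAlive) {
        this.factory = factory;
        this.isAlive = isAlive;
    }

    synchronized T acquire() {
        while (!idle.isEmpty()) {
            T conn = idle.pop();
            if (isAlive.test(conn)) return conn; // healthy: reuse it
            // stale (network dropped it): discard and try the next one
        }
        return factory.get(); // nothing healthy left: spawn a new connection
    }

    synchronized void release(T conn) { idle.push(conn); }
}

public class ValidateDemo {
    public static void main(String[] args) {
        // Dummy "connections": element 0 marks them dead or alive.
        ValidatingPool<boolean[]> pool =
            new ValidatingPool<>(() -> new boolean[]{true}, c -> c[0]);
        boolean[] c1 = pool.acquire();
        c1[0] = false;        // simulate the network dropping this connection
        pool.release(c1);
        boolean[] c2 = pool.acquire(); // stale one is discarded, new one spawned
        System.out.println(c2[0]);
    }
}
```

A pool without this check is the kind that hands back stale connections after a network blip, which is the failure mode described above.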
If it's a single threaded app, pooling seems unnecessary. Was it on a resource constrained device? Is startup time important to the application? These are some factors that might lead to the decision to turn off pooling.
In general, I think you are right that pooling is beneficial. If it's a typical web app then I would inquire about it.
Another reason is when your code changes connection state that the pool does not track. For example, with LDAP connections, if you are doing something that affects TLS (Transport Layer Security) on the connection, you should not use a connection pool, because LDAP pooling does not track such state changes; if you pool those connections anyway, you are compromising your security.

Storing database connections in session, in a small scale webapp

I have a j2ee webapp that's being used internally by ~20-30 people.
There is no chance of significant growth in the number of users.
From what I understood, there's a trade-off between opening a new DB connection for each request made to the webapp (expensive, but doesn't block other users when the DB is in use) and using the singleton pattern (doesn't open new connections, but only allows one user at a time).
I thought that since I know that only 30 users will ever use my webapp at the same time, maybe the simplest and best solution would be to store the connection as a session attribute, thus reducing the number of connections opened to a minimum while still allocating one connection per user.
What do you think?
From what I understood there's a
trade-off between opening a new DB
connection for each request made to
the webapp
That is what connection pools are for. If you use a connection pool in your application, the pool, once initialized, is in charge of providing connections for use in the application as and when needed. In a properly tuned connection pool, enough connections are created and held in reserve that the application rarely needs a connection to be created and opened on demand.
I thought that since I know that only
30 users will ever use my webapp at
the same time, maybe the simplest and
best solution would be to store the
connection as a session attribute
Per-user connections are not a good idea, especially in a web application, where it is perfectly possible for users to initiate multiple requests to the server (think multi-tabbed browsing). In such a case, using a single connection per user will result in weird application behavior, unless you synchronize access to the connection.
One must also consider the side effect of putting transient attributes into the session: Connection objects are not serializable and hence must be marked transient. If the session is deserialized at some point, one has to account for the fact that the Connection object will not be available and must be re-initialized.
I think you're getting into premature optimization, especially given the scale of the application. Opening a new connection is not that expensive, and, as Makach says, most modern RDBMSs handle connection pooling and will hold connections open for subsequent requests. You'd be trying to write better code than the compiler, so to speak.
No. Don't do that. It's perfectly ok to reconnect to the database every time you need to. Any database management system will do their own connection pool caching I think.
If you try to keep connections open yourself, you'll make it incredibly hard to manage them in a secure, bug-free, safe way.

Why use Singleton to manage db connection?

I know this has been asked before, here, there, and everywhere, but I can't get a clear explanation, so I'm going to pitch it again. What is all the fuss about using a singleton to control the DB connection in your web app? Some like it, some hate it; I don't understand it. From what I've read, "it's to ensure that there is always only one active connection to your DB". Why is that a good thing? One active DB connection for a data-driven web app processing multiple requests per second spells trouble, doesn't it? For whatever reason, nobody can properly explain this. I've been all over the web. I know I'm thick.
Assuming Java here, but this is relevant to most other technologies as well.
I'm not sure whether you've confused the use of a plain singleton with a service locator. Both of them are design patterns. The service locator pattern is used by applications to ensure that there is a single class entrusted with the responsibility of obtaining and providing access to databases, files, JMS queues, etc.
Most service locators are implemented as singletons, since there is no need for multiple service locators to do the same job. Besides, it is useful to cache information obtained from the first lookup that can be later used by other clients of the service locator.
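A bare-bones sketch of that singleton service locator (all names are hypothetical, and the "expensive lookup" is simulated), showing how the result is cached after the first call:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical sketch: one locator instance caches whatever it looks up
// (a DataSource, a JMS factory, ...) so later callers skip the lookup cost.
public final class ServiceLocator {
    private static final ServiceLocator INSTANCE = new ServiceLocator();
    private final Map<String, Object> cache = new ConcurrentHashMap<>();
    private int lookups = 0; // counts real (non-cached) lookups

    private ServiceLocator() {}

    public static ServiceLocator get() { return INSTANCE; }

    @SuppressWarnings("unchecked")
    public synchronized <T> T lookup(String name, Supplier<T> realLookup) {
        return (T) cache.computeIfAbsent(name, n -> {
            lookups++;                // the expensive part runs only once
            return realLookup.get();
        });
    }

    public synchronized int lookupCount() { return lookups; }

    public static void main(String[] args) {
        ServiceLocator loc = ServiceLocator.get();
        for (int i = 0; i < 3; i++) {
            // Simulate an expensive JNDI/directory lookup.
            loc.lookup("dataSource", () -> new Object());
        }
        System.out.println(loc.lookupCount());
    }
}
```

Three callers, one real lookup: that caching is the practical payoff of making the locator a singleton.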
By the way, the argument about
"it's to ensure that there is always
only one active connection to your DB"
is false and misleading. It is quite possible for the connection to be closed/reclaimed if left inactive for a long period of time, so caching a connection to the database is frowned upon. There is one deviation from this argument: "re-using" a connection obtained from the connection pool is encouraged, as long as you do so within the same context, i.e. within the same HTTP request or user request (whichever is applicable). This is done, obviously, for performance, since establishing new connections can prove to be an expensive operation.
High-performance (or even medium-performance) web apps use database connection pooling, so one DB connection can be shared among many web requests. The singleton is usually the object which manages this pool. I think the motivation for using a singleton is to idiot-proof against maintenance programmers that might otherwise instantiate many of these objects needlessly.
"it's to ensure that there is always only one active connection to your DB." I think that would be better stated as to ensure each CLIENT has only one active connection to your DB. The reason why this is incredibly important is because you want to prevent deadlocks. If I have TWO open database connections (as a client) I might be updating on one connection, then I might try to update the same row in another connection. This will a deadlock which the database cannot detect. So, the idea of the singleton is basically to make sure that there is ONE object who is charge of handing out database connections to each client. Basically. You don't HAVE to have a singleton for this, but most people will tell you it just makes sense that the system only has one.
You're right--usually this isn't what you want.
However, there are plenty of cases where you need to throttle yourself down to a single connection. By serializing your access to the database through a singleton, you can address other issues or constraints like load, bandwidth, etc.
I've done something similar in the past for a bulk processing app. Instead, though, I used a semaphore to synchronize access to the database so I could allow n concurrent db operations.
One might want to use a singleton due to database server constraints, for example, a server might limit the number of connections.
My main conscious reason is that you know which connections can be managed/closed; it just makes things a bit more organised when you don't have unnecessary, redundant connections.
I don't think it's a simple answer. For instance on ASP.NET, the platform implements connection pooling by default, so it will automatically adjust a "pool" of connections and re-use them so you're not constantly creating and destroying expensive objects.
However, let's say you were writing a data-collection application that monitored 200 separate input sources. Every time one of those inputs changed, you'd fire off a thread that records the event to the database. I would say that could be a bad design if there's a chance that even a fraction of those could fire at the same time: suddenly having 20 or 40 active database connections is inefficient. It might be better to queue the updates and, as long as there are updates left in the queue, have a singleton connection pick them off the queue and execute them on the server. It's more efficient because you only have to negotiate the connection and authentication once. Once there's no activity for a while, you could choose to close the connection. This kind of behavior would be hard to implement without a central resource manager like a singleton.
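The queue-plus-single-connection design described above can be sketched like this (a toy version in Java; the "database" is just a counter, and the single worker thread stands in for the one negotiated connection):

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Many producers enqueue events; ONE worker drains the queue, so only
// one DB connection is ever negotiated and authenticated.
public class QueuedWriter {
    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        final int[] written = {0}; // stands in for rows written over the single connection

        Thread worker = new Thread(() -> {
            try {
                while (true) {
                    String event = queue.take();
                    if (event.equals("STOP")) return; // poison pill: shut down
                    written[0]++; // execute the update on the single connection
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.start();

        // 200 "input sources" firing events; none of them opens a connection.
        for (int i = 0; i < 200; i++) queue.put("input-changed-" + i);
        queue.put("STOP");
        worker.join();
        System.out.println(written[0]);
    }
}
```

All 200 events reach the "database" through one worker, which is the whole point: the producers never touch a connection themselves.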
"only one active connection" is a very narrow statement for illustration. It could just as well be a singleton managing a pool of connection. The point of a singleton for database connections is that you don't want every consumer making it's own connection or set of connections.
I think you might want to be more specific about, "using a singleton to control the db connection in your web app." Ideally, a java.sql.Connection object will not be thread safe, but your javax.sql.DataSource may want to pool connections, so you should go to a single instance of it to share the pooling.
You're really looking for one connection per request, not one connection for the entire application. You can still control access to it through a singleton, though (storing the connection in the HttpContext.Items collection).
It guarantees that each client using your site only gets one connection to the db.
You really do not want a new connection being made every time a user performs an action that triggers a DB query: not only for performance reasons, given the connection handshaking involved, but also to decrease the load on the DB server.
DB connections are a precious commodity, and this technique helps minimize the amount used at any given time.