A lot of sources on the web say it is best to use a connection pool, or a new connection for each request. I understand the reasoning why this is the case (more scalable) and I prefer it this way. But if you aren't getting even 100 requests per minute, is it worth paying the extra time on each request to establish the DB connection? I would like to hear your different standpoints on the subject.
If you understand the benefits of a connection pool, then it's up to you to decide what to do. It's generally a good idea if you want fast response times and expect any decent volume of requests.
In this link you can find a detailed explanation: https://dev.to/digi0ps/connection-pooling-what-and-why-bom
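If it helps, here's a rough sketch of the idea in Python (the queue-backed pool and sqlite3 are just stand-ins to show the shape of it, not a production implementation):

    import sqlite3
    from queue import Queue

    class ConnectionPool:
        """Toy pool: pre-opens N connections and hands them out on demand."""
        def __init__(self, size, db_path):
            self._pool = Queue(maxsize=size)
            for _ in range(size):
                # Pay the connection-setup cost once, up front.
                self._pool.put(sqlite3.connect(db_path, check_same_thread=False))

        def acquire(self):
            return self._pool.get()   # blocks if every connection is in use

        def release(self, conn):
            self._pool.put(conn)      # return the connection for reuse

    pool = ConnectionPool(size=5, db_path="app.db")
    conn = pool.acquire()
    try:
        conn.execute("SELECT 1")      # no per-request connect overhead
    finally:
        pool.release(conn)

The point is simply that the expensive setup happens once at startup, and each request only pays the cost of a queue get/put.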
I feel very skeptical about this part, as I have not found an answer on the web that satisfies me.
The questions are:
Should I keep the DB connection open forever and keep processing my requests over it? This way is easier, as I only have to open the connection once and can forget about closing it unless it goes down for some reason. I understand there are overheads to keeping a connection open forever.
Should I open a fresh connection each time I receive a request that needs some processing in the database, and then close it after the processing is done?
As per my research, the second approach is better, but I need to understand the consequences in both cases.
It would mean a lot if someone could explain the design overheads here with some examples.
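To make the two options concrete, here is roughly how I picture them (sketched in Python, with sqlite3 standing in for whatever driver is actually used):

    import sqlite3

    DB_PATH = "app.db"  # placeholder database

    # Option 1: one long-lived connection, opened once at startup.
    # Cheap per request, but I have to handle the connection dying.
    long_lived = sqlite3.connect(DB_PATH)

    def handle_request_long_lived(query):
        return long_lived.execute(query).fetchall()

    # Option 2: a fresh connection per request.
    # Simple and isolated, but pays setup/teardown cost every time.
    def handle_request_fresh(query):
        conn = sqlite3.connect(DB_PATH)
        try:
            return conn.execute(query).fetchall()
        finally:
            conn.close()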
I'm pretty new to the world of databases and I was wondering: what is the difference between "requests per second (RPS)" and "I/O operations per second (IOPS)"?
From what I know, one request can be made up of several I/Os (input/output operations). Is this a correct understanding?
Also, in the context of databases, are concurrent connections the same as concurrent users? I would assume that every connection would equate to a user. Is this right?
Please feel free to correct me or throw me examples, links, resources to read up on. I'm a total newbie at this, and would absolutely love to ramp up on my database basics!
One of our problems is that our outbound email server sucks sometimes. Users will trigger an email in our application, and the application can take on the order of 30 seconds to actually send it. Let's make it even worse and admit that we're not even doing this on a background thread, so the user is completely blocked during this time. SQL Server Database Mail has been proposed as a solution to this problem, since it basically implements a message queue and is physically closer and far more responsive than our third party email host. It's also admittedly really easy to implement for us, since it's just replacing one call to SmtpClient.Send with the execution of a stored procedure. Most of our application email contains PDFs, XLSs, and so forth, and I've seen the size of these attachments reach as high as 20MB.
Using Database Mail to handle all of our application email smells bad to me, but I'm having a hard time talking anyone out of it, given the extremely low cost of implementation. Our production database server is overpowered for its current load, so I can't argue that it couldn't handle this, either. Any ideas or safer alternatives?
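For context, the proposed change is roughly this: instead of calling SmtpClient.Send, the app would execute the stored procedure and let SQL Server queue the message. Sketched here in Python/pyodbc purely to show the shape of the call (our real app is .NET, and the profile name and connection string are placeholders):

    import pyodbc

    # Placeholder connection string; real values depend on the environment.
    conn = pyodbc.connect("DRIVER={ODBC Driver 17 for SQL Server};"
                          "SERVER=dbserver;DATABASE=msdb;Trusted_Connection=yes;")

    def send_via_database_mail(to_addr, subject, body, attachment_path=None):
        # sp_send_dbmail queues the message inside SQL Server and returns
        # immediately, which is what makes the user-facing call fast.
        conn.cursor().execute(
            "EXEC msdb.dbo.sp_send_dbmail "
            "@profile_name = ?, @recipients = ?, @subject = ?, "
            "@body = ?, @file_attachments = ?",
            ("AppMailProfile", to_addr, subject, body, attachment_path))
        conn.commit()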
All you have to do is run it through an SMTP server. If you're planning on sending large amounts of mail, then you'll have to not only load balance the servers (and DNS servers, if you're planning on sending out 100K+ mails at a time), but also make sure your outbound email servers have the proper A records registered in DNS to prevent bounce-backs.
It's a cheap solution (minus the load balancer costs).
Yes, dual-home the server for your internal LAN and the internet, and make sure it's an outbound-only server. Start out with one SMTP server, and if you hit bottlenecks right off the bat, look to see whether it's memory, disk, network, or load related. If it's load related, it may be time to look at load balancing. If it's memory related, throw more memory at it. If it's disk related, throw a RAID 0+1 array at it. If it's network related, use a bigger pipe.
With a distributed application, where you have lots of clients and one main server, should you:
Make the clients dumb and the server smart: clients are fast and non-invasive. Business rules are needed in only 1 place
Make the clients smart and the server dumb: take as much load as possible off of the server
Additional info:
Clients collect tons of data about the computer they are on. The server must analyze all of this info to determine the health of these computers
The owners of the client computers are temperamental and will shut down the clients if the client starts to consume too many resources (thus negating the purpose of the distributed app in helping diagnose problems)
You should do as much client-side processing as possible. This will enable your application to scale better than doing processing server-side. To solve your temperamental user problem, you could look into making your client processes run at a very low priority so there's no noticeable decrease in performance on the part of the user.
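A minimal sketch of the low-priority idea (Python, using os.nice, which is Unix-only; on Windows you'd need a different mechanism, and collect_and_report_machine_stats is a hypothetical stand-in for the client's workload):

    import os

    # Drop this process to the weakest scheduling priority so the data
    # collection never visibly competes with the user's own work.
    os.nice(19)  # Unix-only; 19 is the lowest priority

    collect_and_report_machine_stats()  # hypothetical client workload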
In a client-server setting, if you care about security, you should always program on the assumption that the client may have been compromised. Even if it hasn't, there is always the risk of somebody using an old version of the client, using a competing or modified version of the client, or just of the net connection being a bit screwy.
So while you do as much work on the client as possible, processing and marshalling information into the right form, the server then needs to do a thorough sanity check on anything the client gives it.
So the answer I guess is "both".
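To illustrate the server-side sanity check, a minimal sketch (the report fields are made up, since we don't know the real schema):

    def sanity_check_report(report: dict) -> dict:
        """Re-validate a client-submitted report; never trust the client."""
        # Hypothetical fields, just to show the shape of the check.
        hostname = report.get("hostname", "")
        cpu_load = report.get("cpu_load")

        if not hostname or len(hostname) > 255:
            raise ValueError("bad hostname")
        if not isinstance(cpu_load, (int, float)) or not 0.0 <= cpu_load <= 100.0:
            raise ValueError("cpu_load out of range")
        return {"hostname": hostname, "cpu_load": float(cpu_load)}

Even if the client already did this work, repeating it server-side is cheap insurance against a compromised or out-of-date client.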
"The server must analyze all of this info to determine the health of these computers"
That is probably the biggest clue so far explaining what your application is kinda about. Are you able to provide a more elaborate briefing on what this application is seeking to achieve in this distributed environment? We do not even know whether the client-side processing is disk I/O or processor intensive. How you design the solution depends on the nature of what needs to be done to help the users/business accomplish their jobs and objectives.
A discussion about Singletons in PHP has me thinking about this issue more and more. Most people instruct that you shouldn't make a bunch of DB connections in one request, and I'm just curious as to what your reasoning is. My first thought is the expense to your script of making that many requests to the DB, but then I counter myself with the question: wouldn't multiple connections make concurrent querying more efficient?
How about some answers (with evidence, folks) from some people in the know?
Database connections are a limited resource. Some DBs have a very low connection limit, and wasting connections is a major problem. By consuming many connections, you may be blocking others from using the database.
Additionally, throwing a ton of extra connections at the DB doesn't help anything unless there are resources on the DB server sitting idle. If you've got 8 cores and only one is being used to satisfy a query, then sure, making another connection might help. More likely, though, you are already using all the available cores. You're also likely hitting the same hard drive for every DB request, and adding additional lock contention.
If your DB has anything resembling high utilization, adding extra connections won't help. That'd be like spawning extra threads in an application with the blind hope that the extra concurrency will make processing faster. It might in certain circumstances, but in other cases it'll just slow you down as you thrash the hard drive, waste time task-switching, and introduce synchronization overhead.
It is the cost of setting up the connection, transferring the data and then tearing it down. It will eat up your performance.
Evidence is harder to come by but consider the following...
Let's say it takes x microseconds to make a connection.
Now you want to make several requests and get data back and forth. Let's say that the difference in transport time is negligible between one connection and many (just for the sake of argument).
Now let's say it takes y microseconds to close the connection.
Opening one connection will take x + y microseconds of overhead. Opening n connections will take n * (x + y). That will delay your execution.
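To put made-up numbers on it: with x = 500 microseconds, y = 100 microseconds, and n = 10 requests, one shared connection costs 600 microseconds of overhead versus 6,000 for per-request connections. If you want to measure x + y on your own setup, here's a quick sketch (Python, with sqlite3 only as a stand-in for your actual driver):

    import sqlite3
    import time

    N = 1000  # number of open/close cycles to average over

    start = time.perf_counter()
    for _ in range(N):
        conn = sqlite3.connect("app.db")  # x: setup cost
        conn.close()                      # y: teardown cost
    elapsed = time.perf_counter() - start

    print(f"avg connect+close overhead: {elapsed / N * 1e6:.1f} microseconds")

Bear in mind a networked database will show much larger numbers than an embedded one, since the setup includes a round trip or more.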
Setting up a DB connection is usually quite heavy. A lot of things are going on backstage (DNS resolution, TCP connection, handshake, authentication, and then the actual query).
I once had an issue with some weird DNS configuration that made every TCP connection take a few seconds to come up. My login procedure (because of a complex architecture) took three different DB connections to complete. With that issue, it was taking forever to log in. We then refactored the code to make it go through one connection only.
We access Informix from .NET and use multiple connections. Unless we're starting a transaction on each connection, it often is handled in the connection pool. I know that's very brand-specific, but most(?) database systems' client access will pool connections to the best of their ability.
As an aside, we did have a problem with connection count because of cross-database connections. Informix supports synonyms, so we synonymed the common offenders and the multiple connections were handled server-side, saving a lot in transfer time, connection creation overhead, and (the real crux of our situation) license fees.
I would assume that it is because your requests are not being sent asynchronously. Since your requests are done iteratively on the server, blocking each time, you have to pay the overhead of creating a connection each time, when you only have to do it once...
In Flex, all web service calls are automatically made asynchronously, so it is common to see multiple connections, or queued-up requests on the same connection.
Asynchronous requests mitigate the connection cost through faster request/response times. Because you cannot easily achieve this in PHP without some threading, the performance hit is greater than simply reusing the same connection.
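To show what I mean by asynchronous requests hiding the cost, a small sketch (Python asyncio purely for illustration; simulate_request is a stand-in for a real async call):

    import asyncio

    async def simulate_request(i):
        # Stand-in for an async DB/web-service call over a shared connection.
        await asyncio.sleep(0.1)  # pretend network/query latency
        return i

    async def main():
        # All ten requests wait concurrently, so the total is ~0.1s,
        # not ~1.0s as it would be issuing them one at a time.
        results = await asyncio.gather(*(simulate_request(i) for i in range(10)))
        print(results)

    asyncio.run(main())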
that's my 2 cents...