SQL queries to MSSQL contain pauses even with MARS enabled - sql-server

We are testing JDBC drivers from jTDS and Microsoft, and we are suffering from unwanted pauses in query execution. Our application opens many ResultSets and fetches only a few rows from each. Each query selects about 100k rows, but we fetch only about 50 (enough to fill a page). The problem is that every query after the first pauses for about 2s, during which the driver loads all rows from the previous ResultSet into temporary storage (memory or disk) so they can be traversed later. Because the worst scenarios involve about 6 queries, the pauses add up to about 10s, which makes the application unresponsive to the user. The MSSQL version is 2005.
To remove these pauses, we've tried to enable MARS (Multiple Active Result Sets) via connection string parameters of the Microsoft JDBC driver (due to the lack of documentation, we tried everything listed at https://sites.google.com/site/sqlconnect/sql2005strings). Example connection string:
jdbc:sqlserver://TESTDBMACHINE;instanceName=S2005;databaseName=SampleDB;MarsConn=yes
But none of them solves the problem. The Microsoft JDBC driver seems to accept anything in the connection string: if you replace MarsConn=yes with PleaseBeFast=yes, the driver ignores the parameter and doesn't even log the fact. I don't know whether MARS is a client-only feature that just caches rows from the previously active result set, or a server feature. I don't even know how to detect, from the server side, whether a given connection is using MARS. If you can comment on this, it will be welcome.
Another solution for the pauses was to use scrollable (bidirectional) result sets. This removes the pause but makes fetching 80% slower and more network-consuming. We are now considering implementing a JDBC connection wrapper that keeps a pool of actual connections and automatically issues each query on a distinct "ResultSet-free" connection, roughly as sketched below. But this is somewhat cumbersome because we need to keep a link between each connection and its active ResultSet. It would also consume more connections from the server and may cause trouble for DBAs. And this solution doesn't help when there is an active transaction, in which case all queries must be issued on the same connection.
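For illustration, the crudest form of that wrapper might look like this (hypothetical names; no thread safety, no connection cap, and no transaction handling, which is exactly where it gets cumbersome):

    import java.sql.*;
    import java.util.*;

    // Illustrative only: hand each query a physical connection with no open ResultSet.
    public class ResultSetFreeConnections {
        private final String url;
        private final List<Connection> idle = new ArrayList<Connection>();
        private final Map<ResultSet, Connection> busy = new IdentityHashMap<ResultSet, Connection>();

        public ResultSetFreeConnections(String url) { this.url = url; }

        public ResultSet query(String sql) throws SQLException {
            Connection con = idle.isEmpty()
                    ? DriverManager.getConnection(url)
                    : idle.remove(idle.size() - 1);
            ResultSet rs = con.createStatement().executeQuery(sql);
            busy.put(rs, con);             // remember which connection this ResultSet pins
            return rs;
        }

        public void release(ResultSet rs) throws SQLException {
            Connection con = busy.remove(rs);
            rs.getStatement().close();     // closes the ResultSet too
            idle.add(con);                 // the connection is ResultSet-free again
        }
    }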
Do you know some parameter, configuration, specific API, link or trick that can remove the pause from the second and subsequent query executions?

Fix your SQL queries! Why fetch only the first 50 or so of 100k rows? Use TOP 100 or something like that! There is no reason the application should be filtering 100k rows; that is the job of the database.
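For example, something like this on the client (an illustrative sketch; table and column names are made up):

    import java.sql.*;

    class TopNPage {
        // Ask the server for only the page you need instead of dragging
        // 100k rows to the client.
        static void firstPage(Connection con) throws SQLException {
            try (PreparedStatement ps = con.prepareStatement(
                    "SELECT TOP 50 id, name FROM dbo.Items ORDER BY name")) {
                ps.setMaxRows(50); // belt and braces: also cap what the driver will buffer
                try (ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) {
                        System.out.println(rs.getInt("id") + " " + rs.getString("name"));
                    }
                }
            }
        }
    }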

Far more important than your client woes is what happens on the server side. Since you issue queries and then stop reading the results, the server has to suspend the query in the middle of execution because the network buffers are full and it has no room to write the results into. A query suspended in the middle of execution is consuming a lot of resources: memory, locks and, most importantly, a worker thread (there are very few of these lying around).
Issue queries for only the data you need, consume all the data, and free the connection and, more importantly, the server resources. If your queries are too complex, go back to the drawing board and redesign your data model to properly answer, efficiently, the queries you're requesting from it. Right now you are totally barking up the wrong tree: you're simply asking how to make a bad situation worse.

We've created an ODBC data source using the SQL Server Native Client 10 driver (sqlncli10.dll). This ODBC data source was configured to enable MARS (under the registry key HKEY_CURRENT_USER\Software\ODBC\ODBC.INI\<datasource>, the MARS_Connection value must be Yes). Then we used Sun's JDBC-ODBC bridge, and voilà! The pauses were gone, and surprisingly, fetch time became faster than with the jTDS and Microsoft pure Java JDBC drivers!
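For reference, connecting through the bridge is the classic Class.forName pattern; the DSN name and credentials below are placeholders (and note the Sun bridge was removed in Java 8, so this only applies to older JREs):

    import java.sql.*;

    public class BridgeConnect {
        public static void main(String[] args) throws Exception {
            // "SampleDSN" stands in for the ODBC data source configured
            // with MARS_Connection=Yes as described above.
            Class.forName("sun.jdbc.odbc.JdbcOdbcDriver");
            try (Connection con = DriverManager.getConnection(
                    "jdbc:odbc:SampleDSN", "user", "password")) {
                System.out.println("Connected via " + con.getMetaData().getDriverName());
            }
        }
    }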
According to http://msdn.microsoft.com/en-us/library/ms345109(SQL.90).aspx, MARS is a server-aided feature. The pure Java drivers (from jTDS and Microsoft) don't seem to support server-based MARS (at least we couldn't enable it after many configuration changes). Because most of our user base that uses MSSQL Server runs on Windows (no surprise), we are about to make the switch from jTDS to the JDBC-ODBC bridge. Both the Native Client ODBC driver and the JDBC-ODBC bridge seem to be mature, full-featured and up-to-date solutions, so I guess there should be no problems. If you know of any, please comment!
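On the server-side detection question above: on SQL Server 2005 and later, the sys.dm_exec_connections DMV should list MARS logical sessions with net_transport = 'Session' and a parent_connection_id pointing at the owning physical connection (we have not verified this exhaustively). A minimal JDBC sketch, with placeholder connection details; it requires the VIEW SERVER STATE permission:

    import java.sql.*;

    public class MarsProbe {
        public static void main(String[] args) throws SQLException {
            String url = "jdbc:sqlserver://TESTDBMACHINE;instanceName=S2005;databaseName=SampleDB";
            try (Connection con = DriverManager.getConnection(url, "user", "password");
                 Statement st = con.createStatement();
                 // MARS logical sessions show net_transport = 'Session' and a
                 // parent_connection_id referencing the physical connection.
                 ResultSet rs = st.executeQuery(
                         "SELECT session_id, net_transport, parent_connection_id"
                         + " FROM sys.dm_exec_connections")) {
                while (rs.next()) {
                    System.out.println(rs.getInt(1) + " " + rs.getString(2)
                            + " parent: " + rs.getString(3));
                }
            }
        }
    }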
Linux-based users will still use jTDS. Since we now know that MARS is a server-aided feature, we'll file feature requests with jTDS and Microsoft to support MARS in their pure Java JDBC drivers.

Related

SQL Server Table > MS Access Local Copy?

I'm looking for a little advice.
I have some SQL Server tables I need to move to local Access databases for some local production tasks - once per "job" setup, with ~400 jobs this quarter, across a dozen users...
A little background:
I am currently using a DSN-less approach to avoid distribution issues
I can create temporary LINKS to the remote tables and run "make table" queries to populate the local tables, then drop the links. Works as expected.
Performance here in the US is decent - 10-15 seconds for ~40K records. Our India teams are seeing >5-10 minutes for the same datasets. Their internet connection is decent, not great, and a variable I cannot control.
I am wondering if MS Access is adding some overhead here that can be avoided by a more direct approach: i.e., letting the server do all/most of the heavy lifting vs Access?
I've tinkered with various combinations, with no clear improvement or success:
Parameterized stored procedures from Access
SQL Passthru queries from Access
ADO vs DAO
Any suggestions, or an overall approach to suggest? How about moving data as XML?
Note: I have Access 2007, 2010, and 2013 users.
Thanks!
It's not entirely clear, but if the MS Access database performing the dump is local and the SQL Server database is remote, across the internet, you are bound to bump into the physical limitations of the connection.
ODBC drivers are not meant to be used for data access beyond a LAN, there is too much latency.
When Access queries data, it doesn't open a stream; it fetches a block, waits for the data to be downloaded, then requests another batch. This is OK on a LAN but quickly degrades over long distances, especially when you consider that communication between the US and India probably has around 200ms of latency that you can't do much about. That adds up very quickly if the communication protocol is chatty, all on top of a connection whose bandwidth is very likely way below what you would get on a LAN.
The better solution would be to perform the dump locally and then transmit the resulting Access file after it has been compacted and maybe zipped (using 7z for instance for better compression). This would most likely result in very small files that would be easy to move around in a few seconds.
The process could easily be automated. The easiest is maybe to perform this dump automatically every day and make it available on an FTP server or an internal website, ready for download.
You can also make it available on demand, maybe through an app running on a server and made available through RemoteApp using RDP services on a Windows 2008 server, or simply through a website, or a shell.
You could also have a simple Windows service on your SQL Server machine that listens for requests from a remote client installed on the local machines everywhere; it would produce the dump and send it to the client, which would then unpack it and replace the previously downloaded database.
Plenty of solutions for this, even though they would probably require some amount of work to automate reliably.
One final note: if you automate the data dump from SQL Server to Access, avoid using Access in an automated way. It's hard to debug and quite easy to break. Use an export tool instead that doesn't rely on having Access installed.
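As for the compression step mentioned above, that part at least is trivial to script. An illustrative sketch (paths are hypothetical; Java here, though 7z from the command line would compress better):

    import java.io.*;
    import java.util.zip.*;

    public class ZipDump {
        public static void main(String[] args) throws IOException {
            // Paths are placeholders; compact the Access file first, then zip it.
            File src = new File("C:/dumps/jobdata.mdb");
            try (ZipOutputStream zip = new ZipOutputStream(
                     new FileOutputStream("C:/dumps/jobdata.zip"));
                 FileInputStream in = new FileInputStream(src)) {
                zip.putNextEntry(new ZipEntry(src.getName()));
                byte[] buf = new byte[64 * 1024];
                for (int n; (n = in.read(buf)) > 0; ) {
                    zip.write(buf, 0, n);
                }
            }
        }
    }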
Renaud and all, thanks for taking the time to provide your responses. As you note, performance across the internet is the bottleneck. The fetching of blocks (vs a contiguous download) of data is exactly what I was hoping to avoid via an alternate approach.
Our workflow is evolving to better leverage both sides of the clock, where User1 in the US completes their day's efforts in the local DB and then sends JUST their updates back to the server (based on timestamps). User2 in India, who also has a local copy of the same DB, grabs just the updated records off the server at the start of his day. So, pretty efficient for day-to-day stuff.
The primary issue is the initial download of the local DB tables from the server (huge multi-year DB) for the current "job" - it should happen just once at the start of the effort (~1 week long process). This is the piece that takes 5-10 minutes for India to accomplish.
We currently do move the DB back and forth via FTP - DAILY. It is used as a SINGLE shared DB and is a bit LARGE due to temp tables. I was hoping my new timestamp-based push-pull of just the changes daily would be an overall plus. It seems to be, but the initial download hurdle remains.

How to speed up mssql_connect()

I'm working on a project where a PHP dialog system is communicating with a Microsoft SQL Server 2008 and I need more speed on the PHP side.
After profiling my PHP scripts, I discovered that a call to mssql_connect() needs about 200 milliseconds on that particular system. For some simple dialogs this is about 60% of the whole script runtime. So I could gain a huge performance boost by speeding up this call.
I have already made sure that only a single connection handle is created for each request to my PHP scripts.
Is there a way to speed up the initial connection with SQL Server? Some restrictions apply, though:
I can't use PDO (there's a lot of legacy code here that won't work with it)
I don't have access to the SQL Server configuration, so I need a PHP-side solution
I can't upgrade to PHP 5.3.X, again because of crappy legacy code.
Hm. I don't know much about MS SQL, but optimizing that single call may be tough.
One thing that comes to mind is trying mssql_pconnect(), of course:
First, when connecting, the function would first try to find a (persistent) link that's already open with the same host, username and password. If one is found, an identifier for it will be returned instead of opening a new connection.
But you probably already have thought of that.
The second thing, you are not saying whether MS SQL is running on the same machine as the PHP part, but if it isn't, maybe there's a basic networking issue at hand? How fast is a classic ping between one host and the other? The same would go for a virtual machine that is not perfectly configured. 200 milliseconds really sounds very, very slow.
Then, in the User Contributed Notes to mssql_connect(), there is talk of a native PHP driver for MS SQL. I don't know anything about it, whether it retains the "old" syntax, or whether it is usable in your situation, but it might be worth a look.
The User Contributed Notes are always worth a look, there are tidbits like this one:
Just in case it helps people here... We were being run ragged by extremely slow connections from IIS6 --> SQL Server 2000. Switching from CGI to ISAPI fixed it somewhat, but the initial connection still took along the lines of 10 seconds, and eventually the connections wouldn't work any more.
The solution was to add the database server's IP address to the HOSTS file on the server, pointing it to the internal machine name. Looks like some kind of DNS lookup was the culprit.
Now connections and queries are flying, and the world is once again right.
We have been going through quite a bit of optimization between PHP 5.3, FreeTDS and mssql lately. Assuming that you have adequate server resources, we found that two changes made the database interaction much faster and much more reliable:
1. Using mssql_pconnect() instead of mssql_connect() eliminated an intermittent "cannot connect to server" issue. I read a lot of posts that indicated negative issues associated with persistent connections, but so far we haven't seen anything to suggest that it's a problem. The PHP server seems to keep between 20 and 60 persistent connections open to the db server, depending upon load.
2. Using the IP address of the database server in the freetds.conf file instead of the hostname also lent a speed increase.
The only thing I could think of is to use an IP address instead of a hostname for the SQL connection, to spare the DNS lookup. Maybe persistent connections are an option and a little bit faster.

How much is the network - determining network overhead in SQL Server

We have a dev server running C# and talking to SQL server on the same machine.
We have another server running the same code and talking to SQL server on another machine.
A job does 60,000 reads (that is, it calls a stored procedure 60,000 times; each read returns one row).
The job runs in 1/40th of the time on the first server compared to it running on the second server.
We're already looking at the 'internal' differences between the two SQL Servers (fragmentation, tempdb, memory etc.) but what's a good way to determine how much slower the second config is simply because it has to go over the network?
[rather confusingly I found a 'SQL Server Ping' tool but it doesn't actually attempt any timing measurement which, as far as I can see, is what we need]
Open SQL Server Management Studio on the remote machine. Start a new query. Click Query, Include Client Statistics. Run your stored procedure. In the Client Statistics tab of the results, you'll see some basic information about how many packets were sent back and forth over the network. My guess is that for one read, you're not going to see that much overhead.
To get a better idea, I'd try doing a plain select of 60,000 records (since you said it's returning 60,000 records one by one) over the network from your remote machine. Again, that doesn't give you an idea of the stored procedure overhead, but it'll give you a quick seat-of-the-pants idea of the network speed between machines.
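To put numbers on both access patterns, a rough timing harness helps. This sketch uses JDBC with placeholder URL, procedure and table names (the job itself is C#, but the idea ports directly):

    import java.sql.*;

    public class RoundTripCost {
        public static void main(String[] args) throws Exception {
            try (Connection con = DriverManager.getConnection(
                    "jdbc:sqlserver://REMOTEHOST;databaseName=SampleDB", "user", "password")) {
                long t0 = System.nanoTime();
                try (CallableStatement cs = con.prepareCall("{call dbo.GetOneRow(?)}")) {
                    for (int i = 0; i < 60000; i++) {        // 60,000 round trips
                        cs.setInt(1, i);
                        try (ResultSet rs = cs.executeQuery()) { rs.next(); }
                    }
                }
                long t1 = System.nanoTime();
                try (Statement st = con.createStatement();   // one set-based round trip
                     ResultSet rs = st.executeQuery("SELECT * FROM dbo.Rows")) {
                    while (rs.next()) { /* drain the rows */ }
                }
                long t2 = System.nanoTime();
                System.out.printf("60k calls: %d ms, one select: %d ms%n",
                        (t1 - t0) / 1000000, (t2 - t1) / 1000000);
            }
        }
    }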
SQL Server ships with the Profiler utility. This will tell you what the execution time of your query is on each of your SQL Server instances. Note any discrepancies. Whatever time (in the ExecutionTime column) cannot be accounted for here is transmission time... or client display time. Perhaps your client machine takes longer to render the results, or to compute them.
What results are you expecting? Running everything on one machine vs over a network will certainly give you different timings. Your biggest timing difference will be the network throughput. You need to communicate to the networked server both ways.
If you can SET NOCOUNT ON, this will help by generating less network traffic.

Performance problems with SQL Server Management Studio

I'm running SQL Server Management Studio 2008 on a decent machine. Even if it is the only thing open, with no other connections to the database, anything that has to do with the Database Diagram or simple schema changes in a designer takes up to 10 minutes to complete, and SQL Management Studio is unresponsive during that time. The same SQL code takes less than a second. This entirely defeats the purpose of the designers and diagrammers.
------------------
System Information
------------------
Operating System: Windows Vista™ Ultimate (6.0, Build 6001) Service Pack 1 (6001.vistasp1_gdr.080917-1612)
Processor: Intel(R) Core(TM)2 Quad CPU Q6700 @ 2.66GHz (4 CPUs), ~2.7GHz
Memory: 6142MB RAM
Please tell me this isn't a WOW64 problem; if it is, I love MS, but step up your 64-bit support in development tools.
Is there anything I can do to get the performance anywhere near acceptable?
Edit:
I've got version 10.0.1600.22 of SQL Server Management Studio installed. Is this not the latest release? I'm sure I installed it from an MSDN CD and I pretty much rely on Windows Update these days. Is there any place I can quickly see what the latest release version number is for tools like this?
Edit:
Every time I go to open a database diagram I get the message "This database does not have one or more of the support objects required to use database diagramming. Do you wish to create them?" I say yes every time. Is this part of the problem? Also, if I press the copy icon, I get the message "Current thread must be set to single thread apartment (STA) mode before OLE calls can be made." Database corruption?
I'm running in a similar environment and not having that problem.
As with any performance problem, you'll have to analyze it a bit - just saying "it takes 10 minutes" gives no information on the reason it takes so long, so no information you can use to solve the problem.
Here are some tools to play around with. I'd have mentioned them originally, but "play around" is all I've learned to do with them. I'd recommend you try learning a little about them, which I have not done. http://technet.microsoft.com is a good source on performance issues.
Start with Task Manager, believe it or not. It's been enhanced in Vista and Server 2008, and now has a better Performance tab, and a Services tab. Be sure to click "Show processes from all users", or you'll miss nasty things done by services.
The bottom of the Performance tab has a "Resource Monitor" button. Click it, watch it, learn what it can do for you.
The Resource Monitor is actually part of a larger "Reliability and Performance Monitor" tool in Administrative Tools. Try it. It even includes the new version of perfmon, which will be more useful when you have a better idea what counters to look at.
I will also suggest the Process Explorer and Process Monitor tools from Sysinternals. See http://technet.microsoft.com/en-us/sysinternals/default.aspx.
Do your simple schema changes possibly mean that you're reordering the columns of a table?
In that case, what SQL Management Studio does behind the scenes is create a new table, move all the data from the old table to the newly created table, and then drop the old table.
Thus, if you reorder columns on a table with lots of data, lots of indices or both, you CAN incur a massive amount of "reorganization" work without really realizing it.
Marc
Can you try connecting your SQL Management Studio to a different instance of SQL Server or, better, an instance on a remote machine (and try to make similar changes)?
Are there any entries in the System or Application Event Logs (or SQL logs for that matter)? Have you tried uninstalling and reinstalling SQL Server on your machine? What version of SQL Server (database) are you running?
Lastly, can you open the Activity Monitor successfully? Right click on the server (machine name) - top of the three in the object explorer window - and click on 'Activity Monitor'.
Do you have problems with other software on your machine or only with SQL Server & Management Studio?
When you open SSMS it attempts to validate itself with Microsoft. You can speed this process by performing the second of the recommendations at the following link.
http://www.sql-server-performance.com/faq/sql_server_management_studio_load_time_p1.aspx
Also, are you using the registered servers feature? If so SSMS will attempt to validate all of these.
It seems as though it was a network configuration problem. Never trust a developer (myself) to set up a haphazard domain at his office.
I had the DNS server on my computer pointed to my ISP's (the default, because the wireless router provided by the ISP doesn't allow me to override the DNS server with my own), instead of my own DNS server here, so I have to remember to configure it manually on each computer, which I forgot for this particular computer.
I only discovered it when I tried to connect for the first time to a remote SQL Server instance from this PC. It was trying to resolve to an actual sub-domain of mycompany.com instead of my DNS server's authority of COMPUTERNAME.corp.mycompany.com.
I can't say why this was an issue for the designers in SQL Server but not anything else, but my only hypothesis is that when I established a connection to my own computer locally using the computer name instead of "." or "localhost", SQL queries executed immediately, knowing it was local, but the designers still waited for a timeout from the external IP address before trying the local one.
Whatever the explanation is, changing my DNS server for my network card on the local machine to my DNS server's IP made it all work very quickly.
I had a similar issue with mine. Turned out to be some interference with the biometrics login service running on my laptop. Disabled the service and now it works.

SQL Server connection management in Tomcat 6

We are having trouble with a Java web application running within Tomcat 6 that uses JDBC to connect to a SQL Server database.
After a few requests, the application server dies, and in the log files we find exceptions related to database connection failures.
We are not using any connection pooling right now and we are using the standard JDBC/ODBC/ADO driver bridge to connect to SQL Server.
Should we consider using connection pooling to eliminate the problem?
Also, should we change our driver to something like jTDS?
That is the correct behavior if you are not closing your JDBC connections.
You have to call the close() method of each JDBC resource when you are finished using it and the other JDBC resources you obtained with it.
That goes for Connection, Statement/PreparedStatement/CallableStatement, ResultSet, etc.
If you fail to do that, you are hoarding potentially huge and likely very limited resources on the SQL server, for starters.
Eventually, connections will not be granted, and attempts to execute queries and return results will fail or hang.
You could also notice your INSERT/UPDATE/DELETE statements hanging if you fail to commit() or rollback() at the conclusion of each transaction, assuming you have not set the autoCommit property to true.
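The rigor in question looks roughly like this; a minimal sketch with a made-up table (try-with-resources needs Java 7+; on older JREs, nested try/finally blocks do the same job):

    import java.sql.*;

    class CleanJdbc {
        // Every resource closed, every transaction explicitly ended.
        static void transfer(Connection con, int from, int to, int amount) throws SQLException {
            con.setAutoCommit(false);
            try (PreparedStatement ps = con.prepareStatement(
                     "UPDATE dbo.Accounts SET balance = balance + ? WHERE id = ?")) {
                ps.setInt(1, -amount); ps.setInt(2, from); ps.executeUpdate();
                ps.setInt(1,  amount); ps.setInt(2, to);   ps.executeUpdate();
                con.commit();                  // end the transaction explicitly
            } catch (SQLException e) {
                con.rollback();                // never leave locks dangling
                throw e;
            }
        }
    }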
What I have seen is that if you apply the rigor mentioned above to your JDBC client code, then JDBC and your SQL server will work wonderfully smoothly. If you write crap, then everything will behave like crap.
Many people write JDBC calls expecting "something" else to release each thing by calling close() because that is boring and the application and server do not immediately fail when they leave that out.
That is true, but those programmers have written their programs to play "99 bottles of beer on the wall" with their server(s).
The resources will become exhausted and requests will tend to result in one or more of the following happening: connection requests fail immediately, SQL statements fail immediately or hang forever or until some godawful lengthy transaction timeout timer expires, etc.
Therefore, the quickest way to solve these types of SQL problems is not to blame the SQL server, the application server, the web container, JDBC drivers, or the disappointing lack of artificial intelligence embedded in the Java garbage collector.
The quickest way to solve them is to shoot the guy who wrote the JDBC calls in your application that talk to your SQL server with a Nerf dart. When he says, "What did you do that for...?!" Just point to this post and tell him to read it. (Remember not to shoot for the eyes, things in his hands, stuff that might be dangerous/fragile, etc.)
As for connection pooling solving your problems... no. Sorry, connection pools simply speed up the call to get a connection in your application by handing it a pre-allocated, perhaps recycled connection.
The tooth fairy puts money under your pillow, the Easter bunny puts eggs & candy under your bushes, and Santa Claus puts gifts under your tree. But, sorry to shatter your illusions - the SQL server and JDBC driver do not close everything because you "forgot" to close all the stuff you allocated yourself.
I would definitely give jTDS a try. I've used it in the past with Tomcat 5.5 with no problems. It seems like a relatively quick, low impact change to make as a debugging step. I think you'll find it faster and more stable. It also has the advantage of being open source.
In the long term, I think you'll want to look into connection pooling for performance reasons. When you do, I recommend having a look at c3p0. I think it's more flexible than the built in pooling options for Tomcat and I generally prefer "out of container" solutions so that it's less painful to switch containers in the future.
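For instance, a minimal c3p0 setup might look like this (the URL and pool sizes are illustrative, not recommendations):

    import com.mchange.v2.c3p0.ComboPooledDataSource;
    import java.sql.Connection;

    public class PoolSetup {
        public static void main(String[] args) throws Exception {
            ComboPooledDataSource ds = new ComboPooledDataSource();
            ds.setDriverClass("net.sourceforge.jtds.jdbc.Driver");   // jTDS driver class
            ds.setJdbcUrl("jdbc:jtds:sqlserver://DBHOST/SampleDB");  // placeholder URL
            ds.setUser("user");
            ds.setPassword("password");
            ds.setMinPoolSize(5);
            ds.setMaxPoolSize(20);
            try (Connection con = ds.getConnection()) {
                // ... use the connection; close() returns it to the pool
            }
        }
    }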
It's hard to tell, really, because you've provided so little information on the actual failure:
"After a few requests, the application server dies, and in the log files we find exceptions related to database connection failures."
Can you tell us:
- exactly what error you're seeing
- a small example of the code where you connect and service one of your requests
- whether it fails after a consistent number of transactions, or seemingly at random
I have written a lot of database-related Java code (pretty much all my code is database-related), and have used the MS driver, the jTDS driver, and the one from jNetDirect.
I'm sure if you provide us more details we can help you out.
