I use an sql server regularly and have recently been getting frustrated by the performance. It would be difficult for me to get direct access to find out the hardware so:
Is there a direct way in management studio to assess performance or find out the exact hardware.
Alternatively does someone have a set of test sql procedures I could try and ideally compare to other results to get an idea of it's performance.
So far I have setup a few quick queries on my local machines sql express server just as test these seem to run quicker than the sql server on the network which is meant to be high performance although no one knows when it was last upgraded I have a feeling it hasn't been for 6 or 7 years. Obviously these test don't account for the possibility of others querying at the same time or network transfers of results... Hopefully someone has a better solution.
You can't just ask your server guys? Seems like there's a fair bit of mistrust if you can't get hardware metrics. Count of CPUs, total memory, etc.
If there's that amount of mistrust, even if you found the answer from the database server, rectifying it would be impossible. If you can't get the current parameters, how could you get a change of hardware passed the server guys?
Start building rapport. The best line in the world to get someone on your side is, "I'm in trouble and I need your help..." You've elevated them and subjugated yourself, you've put them in a position to save you. You'd be amazed at how much you can get out of people that way.
As far as standard queries. You could look at TPC queries.
IF you are on 2005:
SELECT * FROM sys.dm_os_performance_counters
That will give you some sql only stats. You will not find much info about the machine without at least terminal access. In the sql startup log you can see some info on processors as well.
You also might try updating your references in your server. I had an issue a while back that 1 query returned in 100ms and an identical query in 5+ minutes and the only difference between the 2 was a Capital letter in the table name in my query (whih obviously shouldn't matter).
After some searching and SO-Questioning, I found that I needed to update my statistics. Could it be something like this is needed for your database / SQL Server too?
This sort of thing can be very political, especially in a firm with an endemic CYA culture (which describes most financial services companies). If there's no reasonable
expectation of a good working relationship with the production staff, A few approaches are:
Look at the query plans of the
queries. Check that they are
sensible (using indexes when they
should etc.)
Make it formal. Ask their manager
to get the specifications of the
machine, the disk layout and server
configuration and the last time
statistics were updated on all
tables and indexes. Make it clear
that the machine appears to be
under-performing.
If the statistics are out of date,
get them updated.
and one more
SELECT * FROM sys.dm_os_sys_info
Related
We have a site in development that when we deployed it to the client's production server, we started getting query timeouts after a couple of hours.
This was with a single user testing it and on our server (which is identical in terms of Sql Server version number - 2005 SP3) we have never had the same problem.
One of our senior developers had come across similar behaviour in a previous job and he ran a query to manually update the statistics and the problem magically went away - the query returned in a few miliseconds.
A couple of hours later, the same problem occurred.So we again manually updated the statistics and again, the problem went away. We've checked the database properties and sure enough, auto update statistics isTRUE.
As a temporary measure, we've set a task to update stats periodically, but clearly, this isn't a good solution.
The developer who experienced this problem before is certain it's an environment problem - when it occurred for him previously, it went away of its own accord after a few days.
We have examined the SQL server installation on their db server and it's not what I would regard as normal. Although they have SQL 2005 installed (and not 2008) there's an empty "100" folder in installation directory. There is also MSQL.1, MSQL.2, MSQL.3 and MSQL.4 (which is where the executables and data are actually stored).
If anybody has any ideas we'd be very grateful - I'm of the opinion that rather than the statistics failing to update, they are somehow becoming corrupt.
Many thanks
Tony
Disagreeing with Remus...
Parameter sniffing allows SQL Server to guess the optimal plan for a wide range of input values. Some times, it's wrong and the plan is bad because of an atypical value or a poorly chosen default.
I used to be able to demonstrate this on demand by changing a default between 0 and NULL: plan and performance changed dramatically.
A statistics update will invalidate the plan. The query will thus be compiled and cached when next used
The workarounds are one of these follows:
parameter masking
use OPTIMISE FOR UNKNOWN hint
duplicate "default"
See these SO questions
Why does the SqlServer optimizer get so confused with parameters?
At some point in your career with SQL Server does parameter sniffing just jump out and attack?
SQL poor stored procedure execution plan performance - parameter sniffing
Known issue?: SQL Server 2005 stored procedure fails to complete with a parameter
...and Google search on SO
Now, Remus works for the SQL Server development team. However, this phenomenon is well documented by Microsoft on their own website so blaming developers is unfair
How Data Access Code Affects Database Performance (MSDN mag)
Suboptimal index usage within stored procedure (MS Connect)
Batch Compilation, Recompilation, and Plan Caching Issues in SQL Server 2005 (an excellent white paper)
Is not that the statistics are outdated. What happens when you update statistics all plans get invalidated and some bad cached plan gets evicted. Things run smooth until a bad plan gets again cached and causes slow execution.
The real question is why do you get bad plans to start with? We can get into lengthy technical and philosophical arguments whether a query processor shoudl create a bad plan to start with, but the thing is that, when applications are written in a certain way, bad plans can happen. The typical example is having a where clause like (#somevaribale is null) or (somefield= #somevariable). Ultimately 99% of the bad plans can be traced to developers writing queries that have C style procedural expectation instead of sound, set based, relational processing.
What you need to do now is to identify the bad queries. Is really easy, just check sys.dm_exec_query_stats, the bad queries will stand out in terms of total_elapsed_time and total_logical_reads. Once you identified the bad plan, you can take corrective measures which depend from query to query.
I am looking for information about how SQL Server actually handles query execution in finer details (like what data is kept in buffer/memory and how does it decide to get fresh data even if there is an update change in only one column of a table involved in a query etc)
If anyone knows sources please let me know?
We have an web application using sql server 2000, it has heavy frequent reads almost 70% of the tables(dashboard) every 30 seconds. and at the same time lot of writes are happening.
Please let me know any tips for optimization of above scenario?
No one is going to be able to give you an answer on how to solve your optimization problem. This requires access to your database server. To solve your problems you need to understand roughly how the database works, and for this you need to do some reading.
On-line resources are OK, however the following three books are priceless. The first two books have very detailed information about how SQL Server works. The last ones is a guide of how to write queries but with a discussion on how the engine views queries as well.
Kalen Daleney: Sql Server 2008 Internals
Sql Server 2005: Practical Troubleshooting:
Ben Gan et al: Inside SQL Server 2008: T-SQL Querying
It would take a lot of posts to answer the internal's type questions, I suggest you start reading some of the white papers / blogs and some books.
For 2000 Ken Henderson's SQL Server Architecture will give you deep internal details, for the more recent editions of SQL, he did a 2005 practical trouble shooting which isn't bad, and Kalen Delaney's SQL 2008 Internals book is very good.
You should examine the execution plan.
Press Ctrl-L in the SSMS or issue SET SHOWPLAN_TEXT ON before executing the query.
This will give you detailed information of what indexes are used, which join algorithms applied etc.
You may also want to see the statistics:
SET STATISTICS TIME ON
SET STATISTICS IO ON
, which will give you the information on how many reads were done from actual tables, cache etc., and how much time (actual and CPU time) did every query take.
About SQL Server 2008 buffer management: http://msdn.microsoft.com/en-ca/library/aa337525.aspx
Then you can browse with the left sidebar for information on other subjects.
As far as external sources go, MSDN has a comprehensive resource on optimising your SQL Server 2000 installation, particularly the Patterns & Practices paper on performance and scalability.
If you know where to start looking, take a bottom-up approach with examining SQL profiles and query plans as has been mentioned. Otherwise, start from the top down and learn about the costs involved in recompiling queries, how to index effectively, and how the query optimiser uses statistics.
In my current project, the DB is SQL 2005 and the load is around 35 transactions/second. The client is expecting more business and are planning for 300 transactions/second. Currently even with good infrastructure, DB is having performance issues. A typical transaction will have at least one update/insert and a couple of selects.
Have you guys worked on any systems that handled more than 300 txn/s running on SQL 2005 or 2008, if so what kind of infrastructure did you use how complex were the transactions? Please share your experience. Someone has already suggested using Teradata and I want to know if this is really needed or not. Not my job exactly, but curious about how much SQL can handle.
Its impossible to tell without performance testing - it depends too much on your environment (the data in your tables, your hardware, the queries being run).
According to tcp.org its possible for SQL Server 2005 to get 1,379 transactions per second. Here is a link to a system that's done it. (There are SQL Server based systems on that site that have far more transactions... the one I linked was just the first I one I looked at).
Of course, as Kragen mentioned, whether you can achieve these results is impossible for anyone here to say.
Infrastructure needs for high performance SQL Servers may be very differnt than your current structure.
But if you are currently having issues, it is very possible the main part of your problem is in bad database design and bad query design. There are many way to write poorly performing queries. In a high transaction system, you can't afford any of them. No select *, no cursors, no correlated subqueries, no badly performing functions, no where clauses that aren't sargeable and on and on.
The very first thing I'd suggest is to get yourself several books on SQl Server peroformance tuning and read them. Then you will know where your system problems are likely to be and how to actually determine that.
An interesting article:
http://sqlblog.com/blogs/paul_nielsen/archive/2007/12/12/10-lessons-from-35k-tps.aspx
I'm interested to know how I could improve the performance of SQL Server when using sequential GUID when using Access 2007 as a front end to SQL Server 2008 (please note it's the only context I'm interested in).
I have made some tests (and gotten some fairly surprising results, in particular from SQL Server when using sequential GUID: the insert performance degrades very very quickly and it doesn't seem right to degrade so quickly to me.
Basically the test is as follow:
From the Access front-end, using VBA only, insert 100,000 records in batches of 1000,
sequentially.
I tried it both with a Identity and a sequential GUID as the PK.
I tried it in SQL Server 2008 Standard (no special tweaking just default install) as and an Access 2007 database as the back-end. All tables linked back to the front-end.
Some of the results (more, with raw data available on my blog entry about the test):
It's clear that, as the database grows, the insert performance is reduced but SQL Server isn't performing very well at all here.
http://blog.nkadesign.com/wp-content/uploads/2009/04/chart02.png
Expanded view of the results for SQL Server:
http://blog.nkadesign.com/wp-content/uploads/2009/04/chart03.png
Edit 13APR2009
I've found an issue with my server configuration and I updated the tests on my blog.
Thanks to all for your replies, they helped me a lot.
There's two things at play here. First, it's important to point out that SQL doesn't necessarily work very well, for a specific use case, out of the box. It is a professional product designed to be tuned by a person who knows what they're doing.
By comparison, Access is designed to work very well for most use cases without any configuration. The downside of this trade-off is covered in the second point:
SQL Server is designed for scalability. Notice how Access severely degrades with only 100,000 records. It would probably drop very steeply below SQL's line before a million. By comparison, SQL server holds almost perfectly steady, with the variation stabilizing after about 45,000 records and will continue to hold at many millions.
Edit I think there also may be something else at play here we're not seeing. I thought your SQL numbers looked bad, so I ran a test of my own. On my desktop running Windows Vista 3.6 ghz and 2gb of RAM, inserts with sequential GUID on SQL Server performed:
Average of 1382 inserts per second at 0 records
Average of 1426 inserts per second at 500k records
Averaging 1609.6 inserts per second from 0 to 500k with an average floor of 992 inserts/sec and an average ceiling of 1989 inserts/sec.
So accounting for the normal variance incurred by running this on an in-use desktop, I'd say SQL Server inserts basically scale linearly from 0 records to half a million. On a dedicated, tuned server I'd expect even more consistency (not to mention far better performance):
Excel chart, inserts per second http://img24.imageshack.us/img24/9485/insertspersecond.jpg
My question is whether your test setup represents the reality of your application or not. In short, are you testing the right thing?
Is your app going to be appending large numbers of records one at a time?
Or is it going to be appending batches of records based on a SQL SELECT?
If the latter, you might look at trying to do it all server-side, particularly if the source table(s) in the SELECT are on the server. It's important to realize that with ODBC, a batch append is going to be sent to the SQL Server as a single insert for every single row (every similar to the recordset-based approach in your test code). If you move the same process entirely server-side, it can be done as a batch operation.
Also, you should test again using ADO instead of DAO. It may optimize the operation completely differently.
Last of all, someone brought to my attention just this past week this fascinating article by Andy Baron:
Optimizing Microsoft Office Access Applications Linked to SQL Server
I'm still absorbing the contents of that very useful article, and it discusses several issues in regard to non-GUID-specific topics that may help you optimize your process for maximum efficiency.
You realize at least part of the decreasing performance is the log filling up, and that a GUID id what, 40 bytes longer than an int?
But I'm not quibbling; it's good to see someone taking actual metrics rather than just handwaving. Modded up.
Where are you getting the data from?
Does it change the numbers if you use the Access Export menu options rather than record-at-a-time-in-a-loop?
VBA is really sensitive to the connection paramters too, and there are lots of options that aren't necessarily intuitive.
If an identity column is acceptable, why are you even considering a sequential GUID (which is something of a tacked-on facility in MSSQL last I checked).
EDIT:
Looking at your code and briefly reviewing the Recordset docs on MSDN, I see you may be able to use more efficient parameters. E.g. your dbSeeChanges and dbOpenDynaset, which are appropriate if you are trying to allow for other users messing with the same rows (or needing to get back the inserted IDENTITY value or probably GUID), but I don't think you need those. In essence, after every INSERT or UPDATE, you're reading the record back from the database into VBA. I'd read through those connection config settings carefully, and I bet you'll come up with something a lot more satisfactory.
The last time I saw something like that (really slow insertion with GUID PK) was because of the log-file filling up. Insertion performance was dropping like a stone, pretty fast (no hard measurement, just looking at live traces, but it sure looked like it was kinda logarithmic). This was pre-loading of historical data.
Moved over to identity PK, took care of actually cleaning up the log file, and everything went much better afterwards (a couple of hours where the first version took several hours and was not finished).
Also, just a thought, are there any transactions involved? Maybe SQL Server transactions create a big performance hit that access does not have (given that access is not really geared towards concurrent access).
Jeff mentioned in one of the podcasts that one of the things he always does is put in instrumentation for database calls, so that he can tell what queries are causing slowness etc. This is something I've measured in the past using SQL Profiler, but I'm interested in what strategies other people have used to include this as part of the application.
Is it simply a case of including a timer across each database call and logging the result, or is there a 'neater' way of doing it? Maybe there's a framework that does this for you already, or is there a flag I could enable in e.g. Linq-to-SQL that would provide similar functionality.
I mainly use c# but would also be interested in seeing methods from different languages, and I'd be more interested in a 'code' way of doing this over a db platform method like SQL Profiler.
If a query is more then just a simple SELECT on a single table I always run it through EXPLAIN if I am on MySQL or PostgreSQL. If you are using SQL Server then Management Studio has a Display Estimated Execution Plan which is essentially the same. It is useful to see how the engine will access each table and what indexes it will use. Sometimes it will surprise you.
Recording the database calls, the gross timing and the number of records (bytes) returned in the application is useful, but it's not going to give you all the information you need.
It might show you usage patterns you were not expecting. It might show where your using "row-by-row" access instead of "set based" operations.
The best tool to use is SQL Profiler and analyse the number of "Reads" vs the CPU and duration. You want to avoid high CPU queries, high Read's and long durations (duh!).
The "group by reads" is a useful feature to bring to the top the nastiest queries.
If you're writing queries in SQL Management Studio you can enter: SET STATISTICS TIME ON and SQl Server will tell you how long the individual parts of a query took to parse, compile and execute.
You might be able to log this information by handling the InfoMessage event of the SqlConnection class (but I think using the SQL Profiler is much easier.)
I would have thought that the important thing to ask here is "what database platform are you using?"
For example, in Sybase, installing MDA tables might solve your problem, they provide a whole bunch of statistics from procedure call usage to average logical I/O, CPU time and index coverage. It can be as clever as you want it to be.
I definitely see the value in using SQL Profiler while you're app is running, and EXPLAIN or SET STATISTICS will give you information about individual queries, but does anyone routinely put measurement points into their code to gather information about database queries ongoing - that would pick up on for example, a query on a table that performs fine initially, but as the number of rows grows, becomes slower and slower.
If you're using MySQL or Postgre there's various tools for seeing query activity in real time, but I haven't found a tool as good as the SQL Profiler for measuring query performance over time.
I'm wondering if there is (or should be?) something similar to ELMAH in the way it just plugs in and gives you information without much additional effort?
If you're into Firebird you may want to watch sinatica.com.
We'll soon launch a real-time monitoring tool for Firebird DBAs.
< /shameless plug>
If you use Hibernate (I use the Java version, I'd imagine NHibernate has something similar), you can have Hibernate collect statistics about lots of different things. See, for example:
http://www.javalobby.org/java/forums/t19807.html