SQL Server Performance and Update Statistics

We have a site in development that when we deployed it to the client's production server, we started getting query timeouts after a couple of hours.
This was with a single user testing it, and on our server (which is identical in terms of SQL Server version number - 2005 SP3) we have never had the same problem.
One of our senior developers had come across similar behaviour in a previous job and he ran a query to manually update the statistics and the problem magically went away - the query returned in a few milliseconds.
A couple of hours later, the same problem occurred. So we again manually updated the statistics and again, the problem went away. We've checked the database properties and sure enough, auto update statistics is TRUE.
As a temporary measure, we've set a task to update stats periodically, but clearly, this isn't a good solution.
The developer who experienced this problem before is certain it's an environment problem - when it occurred for him previously, it went away of its own accord after a few days.
We have examined the SQL Server installation on their DB server and it's not what I would regard as normal. Although they have SQL 2005 installed (and not 2008) there's an empty "100" folder in the installation directory. There are also MSSQL.1, MSSQL.2, MSSQL.3 and MSSQL.4 folders (which is where the executables and data are actually stored).
If anybody has any ideas we'd be very grateful - I'm of the opinion that rather than the statistics failing to update, they are somehow becoming corrupt.
Many thanks
Tony

Disagreeing with Remus...
Parameter sniffing allows SQL Server to guess the optimal plan for a wide range of input values. Sometimes it's wrong, and the plan is bad because of an atypical value or a poorly chosen default.
I used to be able to demonstrate this on demand by changing a default between 0 and NULL: plan and performance changed dramatically.
A statistics update will invalidate the plan. The query will thus be compiled and cached the next time it is used.
The workarounds are one of the following (see the sketch below):
parameter masking
use the OPTIMIZE FOR UNKNOWN hint
duplicate "default"
See these SO questions
Why does the SqlServer optimizer get so confused with parameters?
At some point in your career with SQL Server does parameter sniffing just jump out and attack?
SQL poor stored procedure execution plan performance - parameter sniffing
Known issue?: SQL Server 2005 stored procedure fails to complete with a parameter
...and Google search on SO
Now, Remus works for the SQL Server development team. However, this phenomenon is well documented by Microsoft on their own website, so blaming developers is unfair:
How Data Access Code Affects Database Performance (MSDN mag)
Suboptimal index usage within stored procedure (MS Connect)
Batch Compilation, Recompilation, and Plan Caching Issues in SQL Server 2005 (an excellent white paper)

It's not that the statistics are outdated. What happens is that when you update statistics, all plans get invalidated and the bad cached plan gets evicted. Things run smoothly until a bad plan gets cached again and causes slow execution.
The real question is why you get bad plans to start with. We can get into lengthy technical and philosophical arguments about whether a query processor should create a bad plan to start with, but the thing is that, when applications are written in a certain way, bad plans can happen. The typical example is having a WHERE clause like (@somevariable IS NULL) OR (somefield = @somevariable). Ultimately, 99% of bad plans can be traced to developers writing queries that have C-style procedural expectations instead of sound, set-based, relational processing.
What you need to do now is identify the bad queries. It's really easy: just check sys.dm_exec_query_stats, and the bad queries will stand out in terms of total_elapsed_time and total_logical_reads. Once you've identified the bad plan, you can take corrective measures, which vary from query to query.
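For example, a query along these lines (the TOP value and the ordering column are just a starting point) will surface the heaviest statements in the plan cache:
SELECT TOP 20
    qs.total_elapsed_time,
    qs.total_logical_reads,
    qs.execution_count,
    SUBSTRING(st.text, (qs.statement_start_offset / 2) + 1,
        ((CASE qs.statement_end_offset
              WHEN -1 THEN DATALENGTH(st.text)
              ELSE qs.statement_end_offset
          END - qs.statement_start_offset) / 2) + 1) AS statement_text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY qs.total_elapsed_time DESC;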

Related

Azure SQL Query Editor vs Management Studio

I'm pretty new to Azure and cloud computing in general and would like to ask your help in figuring out an issue.
The issue was first seen when we had a webpage that timed out due to the SQL timeout (set to 30 seconds).
The first thing I did was connect to the production database using SQL Server Management Studio 2014 (connected to the Azure prod DB).
I ran the stored procedure being used by the low-performing page, but it returned in under a second. This confused me as to what could be causing the issue.
By accident I also tried to run the same query in the Azure SQL query editor and was shocked that it took 29 seconds to run.
My main question is why there is a difference between running the query in the Azure SQL query editor vs Management Studio. This is the exact same database.
DTU usage is at 98% and I'm thinking there is a performance issue with the stored proc, but I want to know first why the query editor is running the SP slower than Management Studio.
The current Azure DB has 50 DTUs.
Two guesses (posting query plans will help get you an answer for situations like this):
SQL Server has various session-level settings. For example, there is one to determine if you should use ansi_nulls behavior (vs. the prior setting from very old versions of SQL Server). There are others for how identifiers are quoted and similar. Due to legacy reasons, some of the drivers have different default settings. These different settings can impact which query plans get chosen, in the limit. While they won't always impact performance, there is a chance that you get a scan instead of a seek on some query of interest to you.
The other main possible path for explaining this kind of issue is that you have a parameter sniffing difference. SQL's optimizer will peek into the parameter values used to pick a better plan (hoping that the value will represent the average use case for future parameter values). Oracle calls this bind peeking - SQL calls it parameter sniffing. Here's the post I did on this some time ago that goes through some examples:
https://blogs.msdn.microsoft.com/queryoptteam/2006/03/31/i-smell-a-parameter/
I recommend you do your experiments and then look at the query store to see if there are different queries or different plans being picked. You can learn about the query store and the SSMS UI here:
https://learn.microsoft.com/en-us/sql/relational-databases/performance/monitoring-performance-by-using-the-query-store?view=sql-server-2017
For this specific case, please note that the query store exposes those different session-level settings using "context settings". Each unique combination of context settings will show up as a different context settings id, and this will inform how query texts are interpreted. In query store parlance, the same query text can be interpreted different ways under different context settings, so two different context settings for the same query text would imply two semantically different queries.
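If you want to see this directly, a query along these lines (assuming the query store is enabled on the database, which it is by default on Azure SQL Database) lists each query text together with every set of context settings it has been compiled under:
SELECT
    qt.query_sql_text,
    q.query_id,
    q.context_settings_id,
    cs.set_options
FROM sys.query_store_query AS q
JOIN sys.query_store_query_text AS qt
    ON q.query_text_id = qt.query_text_id
JOIN sys.query_context_settings AS cs
    ON q.context_settings_id = cs.context_settings_id
ORDER BY qt.query_sql_text, q.context_settings_id;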
Hope that helps - best of luck on your perf problem

Copy Stored Proc Execution Plan to Another Database

Setup:
Using SQL Server 2008 R2.
We've got a stored procedure that has been intermittently running very long. I'd like to test a theory that parameter sniffing is causing the query engine to choose a bad plan.
Question:
How can I copy the query's execution plans from one database to another (test) database?
Note:
I'm fully aware that this may not be parameter sniffing issues. However, I'd like to go through the motions of creating a test plan and using it, if at all possible. Therefore please do not ask me to post code and/or table schema, since this is irrelevant at this time.
Plans are not portable; they bind to object IDs. You can use plan guides, but they are strictly tied to the database. What you have to do is test on a restored backup of the same database. On a restored backup you can use a plan guide. But for relevance, the physical characteristics of the machines should be similar (CPUs, RAM, disks).
Normally, though, one does not need to resort to such shenanigans as copying the plans. Looking at actual execution plans, all the answers are right there.
Have you tried using the OPTIMIZE FOR clause? With it you can tune your procedure more easily, and without the risk that the plan you copy from another database will be inappropriate due to differences between those databases (if copying the plan is even possible).
http://www.mssqltips.com/sqlservertip/1354/optimize-parameter-driven-queries-with-sql-server-optimize-for-hint/
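As a rough sketch of the idea (the procedure, table and parameter names below are made up for illustration), OPTIMIZE FOR pins the compiled plan to a representative value regardless of what the first caller passes in:
CREATE PROCEDURE dbo.GetOrdersByStatus
    @Status INT
AS
BEGIN
    SELECT OrderID, OrderDate
    FROM dbo.Orders
    WHERE Status = @Status
    -- Compile the plan as if @Status were 1 (assumed here to be a typical value).
    OPTION (OPTIMIZE FOR (@Status = 1));
END;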

Why Would Remote Execution of a Query Cause it to be Suspended?

I apologize in advance for not having all of the specifics available, but the machine is building an index probably for a good while still and is almost completely unresponsive.
I've got a table on SQL Server 2005 with a good number of columns, maybe 20, but a mammoth number of rows (tens, more likely hundreds of millions). In order to simplify the amount of JPA work I'd need to do to access it, I created a view that contained the bits I was interested in. The view was created as:
SELECT bigtable.ID, bigtable.external_identification, mediumtable.hostname,
CONVERT(VARCHAR, bigtable.datefield, 121) AS datefield
FROM schema.bigtable JOIN schema.mediumtable ON bigtable.joinID = mediumtable.ID;
When I want to select from the view, I do:
SELECT * FROM vwTable WHERE external_identification = 'some string';
This works just fine in SQL Management Studio. The external_identification column has a non-unique, non-clustered index in bigtable. This also worked just fine on our remotely executing Java program in our test environment. Now that we're a day or two away from production, the code has been changed a bit (although the fundamental JPA NamedQuery is still straightforward), but we have a new SQLServer installation on new hardware; the test version was on a 32-bit single core machine, the new hardware is 64-bit multi-core.
Whenever I try to run the code that uses this view on the new hardware, it either hangs indefinitely on the first call of this query or times out if I have a timeout specified. After doing some digging, something like:
SELECT status, command, wait_type, last_wait_type FROM sys.dm_exec_requests;
confirmed that the query was running, but showed it in the state:
suspended, SELECT, CXPACKET, CXPACKET
for as long as I cared to wait for it. Whenever I ran the exact same query from within Management Studio, it completed immediately. So I did some research, and found out this is due to waiting on some kind of concurrent operation to start/finish. In an attempt to circumvent that, I set the server-wide MAXDOP to 1 (disabling parallelism). After that, the query still hangs, but sys.dm_exec_requests would show:
suspended, SELECT, PAGEIOLATCH_SH, PAGEIOLATCH_SH
This indicates that it's some sort of HD/scanning issue. While certainly the machine is less responsive than I'd expect for newer hardware, I wouldn't expect this query (even over the view) to require much scanning, since the column I'm searching by is indexed in the underlying table and it works if I run it locally. But just because I'm out of ideas and under the gun, I'm adding indexes to the view; first I have to add the unique clustered index (over ID) before I can attempt to add the non-unique non-clustered index over external_identification.
I'm the only one using this database; when I select from sys.dm_exec_requests the only two results are the query I'm actively inspecting and the select from sys.dm_exec_requests query. So it's not like it's under legitimately heavy, or even at all concurrent, load.
But I suspect I'm grasping at straws. I'm no DBA, and every time I have to interact with SQL Server outside of querying it, it baffles my intuition. Does anyone have any ideas why a query executed remotely would immediately go into a suspended state while the same query locally would execute immediately?
Wow, this one caught me straight out of left field. It turns out that by default, the MSSQL JDBC driver sends its String datatypes as Unicode, which the table/view might not be prepared to handle specifically. In our case, the columns and indexes were not, so MSSQL would perform a full table scan for each lookup.
In our test environment, the table was small enough that this didn't matter, so I was tricked into thinking it worked fine. In retrospect, I'm glad it didn't -- I can't stand it when computers give the illusion of inconsistency.
When I added this little parameter to the end of my JDBC connection string:
jdbc:sqlserver://[IP]:1433;databaseName=[db];sendStringParametersAsUnicode=false
things immediately and magically started working. Sorry for the slightly misleading question (I barely even mentioned JPA), but I had no idea what the cause was and really did believe it was something SQL Server side. Task Manager didn't report heavy CPU/Memory usage while the query was suspended, so I just thought it was idling even though it was really under heavy disk usage.
More info about MSSQL JDBC and Unicode can be found where I stumbled across the solution, at http://server.pramati.com/blog/2010/06/02/perfissues-jdbcdrivers-mssqlserver/ . Thanks, Ed, for that detailed shot in the dark -- it may not have been the problem, but I certainly learned a lot (and fast!) about MSSQL's gritty parts!
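To make the mechanism concrete, here is a minimal sketch (the view and column come from the question; the N'' literal stands in for the Unicode parameter the driver sends). Comparing a VARCHAR column against an NVARCHAR value forces SQL Server to implicitly convert the column on every row, which (at least with the SQL collations typical on 2005) prevents it from seeking the index on that column:
-- Parameter sent as Unicode (the JDBC default): the VARCHAR column is
-- implicitly converted to NVARCHAR row by row, so the index cannot be
-- used for a seek and the big table is scanned.
SELECT * FROM vwTable WHERE external_identification = N'some string';

-- Parameter sent as VARCHAR (sendStringParametersAsUnicode=false):
-- the data types match and the index seek works as expected.
SELECT * FROM vwTable WHERE external_identification = 'some string';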
It is likely that the query run in SSMS and by your application are using different query plans - from the wait types you're seeing in dm_exec_requests, it sounds like the plan created for the application is doing a table scan, whereas the plan for SSMS is using an index seek.
This is possible because the SSMS and application database connections likely use different connection options, some of which are used as a key to the database plan cache.
You can find out which options your application is using by running a default SQL server profiler trace against the server; the first command after the connection is created will be a number of SET... options:
SET DATEFORMAT dmy
SET ANSI_NULLS ON
...
I suspect this list will be different between your application and your SSMS connection - a common candidate is SET ARITHABORT {ON|OFF}, since that forms part of the key of the cached plan.
If you run the SET... commands in an SSMS window before executing the query, the same (bad) plan as is being used by the application should then be picked up.
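If you want to confirm that two separately keyed plans exist for the same statement, one way (a sketch; adjust the LIKE filter to match your query text) is to look at the set_options attribute of the cached plans:
SELECT
    st.text,
    cp.usecounts,
    pa.value AS set_options
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
CROSS APPLY sys.dm_exec_plan_attributes(cp.plan_handle) AS pa
WHERE pa.attribute = 'set_options'
  AND st.text LIKE '%vwTable%';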
Assuming this demonstrates the problem, the next step is to work out how to prevent the bad plan getting into cache. It's difficult to give generic instructions about how to do this, since there are a few possible causes.
It's a bit of a scattergun approach (there are other, more targeted ways to attempt to resolve this problem, but they require a more detailed understanding of the issue than I have now), but one thing to try is to add OPTION (RECOMPILE) to the end of your query - this forces a new plan to be generated for every execution, and should prevent the bad plan being reused:
SELECT * FROM vwTable WHERE external_identification = 'some string' OPTION (RECOMPILE);
Assuming you can replicate the bad performance in SSMS using the steps above, you should be able to test this there.
Beware that this can have negative performance consequences if the query is executed very frequently (since each recompilation requires CPU) - this depends on the workload of your application and will need testing.
A couple of other thoughts:
Check the schemas between the test and production systems; this might be as simple as a missing index from one of the tables in the production database, although given that SSMS queries perform OK this is unlikely.
You should re-enable parallelism by removing the server-wide MAXDOP=1 setting, since this will limit the performance of your system overall. The problem is almost certainly the query plan, not parallelism.
You also need to beware of the consequences of adding indexes to the view - doing so effectively materialises the view, which will (given the size of the table) require a lot of storage overhead - the indexes will also need to be maintained when INSERT/UPDATE/DELETE statements take place on the base table. Indexing the view is probably unnecessary given that (from SSMS) you know it's possible for the query to perform.

SQL Server 2000 caching

I have one question in order to speed up SQL Server 2000.
I want to use caching mechanism, but I don't know how to use.
I found some articles about it, but can you give an example for how to use.
For example:
there is a stored procedure - sp_stackOverFlow - that executes whenever a user enters the program/web site, and it clearly makes things run slower.
Is there a way of caching the results of sp_stackOverFlow, say every 2 minutes or so?
Your question isn't clear, not least because it isn't obvious what the stored procedure does. If the results are different for every execution and/or user then they cannot easily be cached anyway.
But more fundamentally, "I have a slow stored procedure" does not automatically mean "I need caching"; the database engine itself already caches data when it can. You need to understand why the stored procedure is running slowly: underpowered hardware, poor TSQL code, poor data model design and poor indexing are all very common issues that have major effects on performance.
You can find a lot of information on this site and by Googling about how to troubleshoot slow execution times for procedures, but you can start by reviewing the execution plan for the procedure in Query Analyzer and tracing the execution using Profiler. That will immediately tell you which statements are taking the most time, if there are table scans happening etc.
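For example, before running the procedure in Query Analyzer you can switch on the following session options (both available in SQL Server 2000) to see the I/O and timing of each statement, and combine that with the graphical execution plan to spot table scans:
-- Report logical/physical reads and CPU/elapsed time per statement.
SET STATISTICS IO ON
SET STATISTICS TIME ON
GO
-- Run the procedure (add whatever parameters it actually takes).
EXEC sp_stackOverFlow
GO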
Because performance troubleshooting is potentially complex, if you need more assistance please post short, specific questions about individual issues. If the code for your stored procedure is very short (< 30 lines formatted) people may be willing to comment on it directly, otherwise it would be better to post only the individual SQL statements that are causing a problem.
Finally, mainstream support for MSSQL 2000 stopped 3 years ago, so you should definitely look into upgrading to a newer version. The performance tools in newer versions will make resolving your issue much easier.

Find out sql server hardware or speed test

I use a SQL Server regularly and have recently been getting frustrated by its performance. It would be difficult for me to get direct access to find out the hardware, so:
Is there a direct way in Management Studio to assess performance or find out the exact hardware?
Alternatively, does someone have a set of test SQL procedures I could try and, ideally, compare to other results to get an idea of its performance?
So far I have set up a few quick queries on my local machine's SQL Express server just as a test; these seem to run quicker than the SQL Server on the network, which is meant to be high performance, although no one knows when it was last upgraded - I have a feeling it hasn't been for 6 or 7 years. Obviously these tests don't account for the possibility of others querying at the same time, or network transfers of results... Hopefully someone has a better solution.
You can't just ask your server guys? Seems like there's a fair bit of mistrust if you can't get hardware metrics. Count of CPUs, total memory, etc.
If there's that amount of mistrust, even if you found the answer from the database server, rectifying it would be impossible. If you can't get the current parameters, how could you get a change of hardware passed the server guys?
Start building rapport. The best line in the world to get someone on your side is, "I'm in trouble and I need your help..." You've elevated them and subjugated yourself, you've put them in a position to save you. You'd be amazed at how much you can get out of people that way.
As far as standard queries go, you could look at TPC queries.
If you are on 2005:
SELECT * FROM sys.dm_os_performance_counters
That will give you some SQL-only stats. You will not find much info about the machine without at least terminal access. In the SQL Server startup log you can see some info on the processors as well.
You might also try updating the statistics on your server. I had an issue a while back where one query returned in 100ms and an identical query took 5+ minutes, and the only difference between the two was a capital letter in the table name in my query (which obviously shouldn't matter).
After some searching and SO-questioning, I found that I needed to update my statistics. Could something like this be needed for your database / SQL Server too?
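If you want to try that, the commands are straightforward (the table name below is just a placeholder); sp_updatestats walks every table in the current database:
-- Update statistics for a single table, sampling every row.
UPDATE STATISTICS dbo.SomeTable WITH FULLSCAN;

-- Or update statistics for all tables in the current database.
EXEC sp_updatestats;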
This sort of thing can be very political, especially in a firm with an endemic CYA culture (which describes most financial services companies). If there's no reasonable expectation of a good working relationship with the production staff, a few approaches are:
Look at the query plans of the queries. Check that they are sensible (using indexes when they should, etc.).
Make it formal. Ask their manager to get the specifications of the machine, the disk layout and server configuration, and the last time statistics were updated on all tables and indexes. Make it clear that the machine appears to be under-performing.
If the statistics are out of date, get them updated.
and one more
SELECT * FROM sys.dm_os_sys_info
