Drastic difference in time cost for the same stored procedure - sql-server

I am using SQL Server 2008 R2.
The process is actually like this:
First, about 2 million records are pulled from a remote server, then a join is done locally, and the final result is thousands of records.
The time cost varies from less than 1 minute to 30 minutes.
And after I experienced the 30-minute delay, all subsequent runs took only around 3 minutes.
It is the same data and the same SP.
What could cause this drastic difference?
Update
I deleted the SP, restarted the SQL Server service, and re-created the SP. The execution took only 50 seconds!
What's wrong?

The behaviour you describe seems extreme - but (if you exclude the client), there are 3 logical places to look.
The first is the query execution on the database server. It's worth using the Query Analyzer tool to see if it's using any indices - by far the most common reason for variable performance of database queries is that the query is not using (the right) indices, which means the impact of the query cache plays a big part. SQL Server will cache a lot of data, and the first run of your proc populates that cache; the second run is faster because it hits the cache. After a while, the cache goes stale, and running the proc slows down again.
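For example, a quick way to check whether the proc's plan is actually sitting in the cache is to query the plan-cache DMVs; this is just a sketch, and 'MyProc' is a placeholder for your procedure's name:

SELECT cp.usecounts, cp.objtype, cp.size_in_bytes, st.text
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
WHERE st.text LIKE '%MyProc%';  -- substitute your procedure name

A usecounts of 1 after every run would suggest the plan is being recompiled (or evicted) each time rather than reused.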
The second possibility is that the database server is wobbly - it may just not be powerful enough to do all the work it's supposed to do. In that case, one moment you get lucky and have all the server's resources to yourself; the next, someone else is running a query and yours slows down. That would make all queries slow, not just this one - so it doesn't sound likely.
Third possibility is networking weirdness - as Phil says, "thousands of records" is nothing too scary, but if they're big, and your network is saturated with pictures of kittens, it might have an impact. Again, that would manifest in general network slowness, and is unlikely to explain a delay of 30 minutes...

Fourth, is anything else going on at the same time?
Fifth, does your SP use dynamically generated SQL statements? This would prevent the SP from being pre-compiled. If possible, separate such statements into child SPs.
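If the dynamic SQL can't be split out into child SPs, parameterizing it with sp_executesql at least gives the server a stable statement text whose plan can be cached and reused. A minimal sketch, with made-up table and parameter names:

DECLARE @sql nvarchar(max);
SET @sql = N'SELECT * FROM dbo.Orders WHERE CustomerId = @cust';
-- The parameterized text is what gets cached, so repeated calls reuse one plan
EXEC sp_executesql @sql, N'@cust int', @cust = 42;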

Related

SQL Server CXPACKET timeout

We've got SQL Server 2016 (v13.0.4206.0), and by default there are no restrictions on parallelism - SQL Server can use as many threads as it wants. And it didn't lead to any problems... until now.
For a new feature, we wrote a query that unexpectedly raised a timeout exception in our application. I was deeply surprised when it executed successfully after setting the maximum threads per query to 1. Yes, 6 seconds for a query is not great, even accounting for the fact that most of the time was spent fetching, but it's a far cry from the 3-minute timeout!
By the way, executing this query from SQL Server Management Studio works every time, regardless of the parallelism settings. It seems that something is wrong with the connection to the database, but all other queries work fine, even ones that are much heavier than this one.
Our application is built on ASP.NET Core 3.0 (I don't know if it matters), and the database connection is made using System.Data.SqlClient v4.8.0. All I could determine is that a very large number of tasks are created for this query.
I've tried watching the execution in sys.dm_os_waiting_tasks (thanks, Google). I'm not sure I read it right, but it seems that tasks with context_id 0-8 are blocked by those with context_id 9-16, and vice versa. An obvious example of a deadlock, isn't it? But how can SQL Server deadlock its own threads without any "help" from me? Or what am I doing wrong?
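For reference, this is roughly the check I ran (the session id is just an example, taken from sys.dm_exec_requests while the query was executing):

SELECT session_id, exec_context_id, wait_type,
       blocking_session_id, blocking_exec_context_id, resource_description
FROM sys.dm_os_waiting_tasks
WHERE session_id = 57;  -- spid of the slow query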
Just in case, to head off some inappropriate answers:
I won't turn parallelism off (set maximum threads per query to 1) as a solution, because of some heavy queries in our application;
I don't want to raise the Cost Threshold for Parallelism setting, because I'm afraid of hitting the same problem with another query (presumably a heavier one). I just want to determine the real cause;
Optimizing the query isn't considered (anymore): according to the actual execution plan, I can't make it faster - there are enough indexes for it. But I'm ready to rethink given some really weighty arguments.
So, my question is: why does parallelism that I didn't ask for spoil the query execution? And how can I avoid that?
It's true that sometimes the engine chooses to use (or not to use) parallel execution in a way that leads to worse performance.
You don't want to change the server-wide option or the cost threshold, as you are not sure how this will affect other queries, which is understandable.
If you are sure your query will execute better without being handled in parallel, you can specify the option just for it using a query hint - MAXDOP - like this:
SELECT ...
FROM ...
OPTION (MAXDOP 1);
It's easy, and you can roll it back if needed. Also, you are not affecting other queries.
You are saying that:
Optimizing query isn't considered (anymore), as according to actual execution plan...
The execution plan is sometimes misleading. As a start, you can save your execution plan and open it with SentryOne Plan Explorer - it's free and can give you a better view of what's going on.
Also, if a query executes in anywhere from 3 seconds to 6 minutes, there must be something wrong with it, or perhaps with the activity in your database. If it always executes quickly in SSMS, maybe the engine is using the correct cached plan there. I think it's better to share the query itself, attach the two plans (serial and parallel), and spend more time tuning it.
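If it helps, one way to capture the two plans for comparison is to run the query once as-is and once with the hint, with XML plan output enabled; the table and predicate here are placeholders:

SET STATISTICS XML ON;
SELECT COUNT(*) FROM dbo.BigTable WHERE SomeCol = 1;                    -- parallel plan
SELECT COUNT(*) FROM dbo.BigTable WHERE SomeCol = 1 OPTION (MAXDOP 1);  -- serial plan
SET STATISTICS XML OFF;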

MSSQL slow stored proc on first run, uncached indexes?

I have a 20GB database on SQL Server 2014 behind an IIS web application. The DB is queried 24/7, so it's never inactive, and auto-close is off, but there's a manually-triggered "daily work queue" stored procedure which runs inconsistently on its first execution.
When it's used in the morning for the first time, it runs slowly; if you wait and execute it again, it's back to an immediate response. There is minimal other load on the server at the same time, page life expectancy is healthy, and there should be the necessary indexes to support this query - or at least no additional indexes are being recommended.
I've been trying to approach this as a query optimisation problem and getting nowhere, so I began exploring other ideas.
I restored the DB from a backup onto a local dev server: the first execution is slow, the second is fast, and I can see the large (500MB+) indexes loaded via sys.dm_os_buffer_descriptors. If I run DBCC DROPCLEANBUFFERS to simulate everything being unloaded from the buffer, the next execution will be slow, and I can watch the indexes being cached, at which point it executes quickly.
This seems to fit the pattern we're experiencing, and it doesn't seem unreasonable to assume MSSQL might evict data that hasn't been used for 10+ hours.
Have I missed something more obvious? Assuming I'm on the right path, I can't be the first person to run across this issue, so there must be an elegant solution out there...
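For anyone trying to reproduce this, the buffer-pool check was along these lines (it reports how many megabytes of each index are currently resident):

SELECT o.name AS table_name, i.name AS index_name,
       COUNT(*) * 8 / 1024 AS buffer_mb   -- pages are 8KB each
FROM sys.dm_os_buffer_descriptors AS bd
JOIN sys.allocation_units AS au
    ON bd.allocation_unit_id = au.allocation_unit_id
   AND au.type IN (1, 3)                  -- in-row and row-overflow data
JOIN sys.partitions AS p ON au.container_id = p.hobt_id
JOIN sys.indexes AS i ON p.object_id = i.object_id AND p.index_id = i.index_id
JOIN sys.objects AS o ON i.object_id = o.object_id
WHERE bd.database_id = DB_ID()
GROUP BY o.name, i.name
ORDER BY buffer_mb DESC;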
If your "daily work queue" change the data. Then index need to be rebuild and reload in the cache. Is the same as start an old car you need to wait a little bit.
What I do is plan those "work" at night and also program some basic query so index are load in the cache.
Also check your hardware (disk, memory) the first time you run the query the db has to bring the data from disk (slow) and store it in memory (fast but small size).
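A minimal warm-up sketch, assuming a hypothetical table and index name: a scheduled job that scans the index early in the morning, so its pages are already in the buffer pool by the time the proc runs:

-- Touch every page of the index the daily proc depends on
SELECT COUNT_BIG(*)
FROM dbo.DailyWorkQueue WITH (INDEX(IX_DailyWorkQueue_Status));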

Same query, different execution plans

I am trying to find a solution for a problem that is driving me mad...
I have a query which runs very fast on a QA server but is very slow in production. I realized that they have different execution plans... so I have tried recompiling, cleaning the execution plan cache, updating statistics, and checking the collation type... but I still can't find what's going on...
The databases the query runs against are exactly the same, and the SQL Servers also have the same configuration.
Any new ideas would be much appreciated.
Thanks,
A.
I just realised that the QA server is running SP3 while production is on SP2. Could this have any impact on the issue?
Is it possible the production server has a larger database? The plan can be different because it is based on statistics about the data each database contains.
I think it could be due to the volume of data present. It happened to us once: a query literally flew on the QA server but was incredibly slow in production. After breaking our heads for a while, we found out that the QA server had 15K rows whereas production had 1.5 million.
HTH
If the execution plans were the same and one was slow, it would be database load, hardware, locking/blocking, etc.
However, if the execution plans are different, something is different between the two databases. Are statistics up to date in both? Do they have exactly the same schemas, the same indexes, a similar number of rows, the same distribution of PK and index values, etc.? Where did the QA data come from - is it random data, or a restore from production?
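To compare, here is a quick sketch for checking statistics freshness on each server; table and statistics names are placeholders:

-- When were the statistics last updated, and over how many rows?
DBCC SHOW_STATISTICS ('dbo.Orders', 'IX_Orders_CustomerId') WITH STAT_HEADER;
-- If production's are stale, refresh them:
UPDATE STATISTICS dbo.Orders WITH FULLSCAN;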
Disable parallel query execution on production :)
I ran into this recently and here's what I found.
I had two databases that were essentially copies of each other. In one, a TVF was taking 1 second to run, while in the other it took 15 minutes.
The execution plans of the underlying SQL code were very different. I was able to fix it by rebuilding some of the indexes that the TVF relied on. The execution plans still aren't identical, but they did change a lot, and the execution time is back down to around a second.
Now, both versions had indexes that were highly fragmented. My assumption is that historical statistics or execution plan information allowed the fast version to continue finding an optimal execution plan.
So to sum up: make sure you look at the fragmentation of your indexes, even if the two databases have the same structure and similar rates of fragmentation.
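A sketch of the fragmentation check (the table and index names in the rebuild are placeholders):

-- Fragmentation per index in the current database
SELECT o.name AS table_name, i.name AS index_name,
       ps.avg_fragmentation_in_percent
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ps
JOIN sys.indexes AS i ON ps.object_id = i.object_id AND ps.index_id = i.index_id
JOIN sys.objects AS o ON i.object_id = o.object_id
WHERE ps.avg_fragmentation_in_percent > 30
ORDER BY ps.avg_fragmentation_in_percent DESC;

-- Rebuild a heavily fragmented index
ALTER INDEX IX_Orders_CustomerId ON dbo.Orders REBUILD;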

SQL Server 2005 Caching

It's my understanding that SQL Server 2005 does some sort of result or index caching. I'm currently profiling complex select statements which take several seconds to several minutes to complete. My problem is that a second run of a query never takes more than a second to run even if I don't alter it. I'm currently using SQL Server Management Studio Express to execute the queries against a SQL Server 2005 server.
My question is: is there any way to avoid or clear the cache that is causing my queries to execute so quickly on a second run?
There are a couple of different things that could be at play here; the 3 that come to mind initially (in probable-ish order) are as follows. If you would like some help interpreting the results, follow the instructions below and paste the stat outputs into the question:
Your query/batch is taking a long time to compile an execution plan. Execution plans are determined and cached (see this post on serverfault for an overview of understanding for how long, when they are rebuilt, etc.)
To verify this, turn on statistics time output, which will provide you information on how long the engine is taking to generate a query plan. For the query/batch in question:
DBCC FREEPROCCACHE
SET STATISTICS TIME ON
Execute the batch, capture the stats output
Execute the batch again, capture the stats output
Compare the 2 stat outputs, paying particular attention to the parse/compile time differences between the 2 executions.
If this is the problem, you can take a couple of approaches to resolving the issue, including specifying a plan guide, specifying a static plan with USE PLAN, or possibly other options like creating a scheduled job that simply compiles the plan every few minutes (not as good an option on SQL 2005 as the others).
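As an illustration of the plan-guide option, a rough sketch that pins OPTION (KEEPFIXED PLAN) onto one statement so it isn't recompiled when statistics change (the statement text and names are made up, and for @type = N'SQL' the text must match the submitted batch exactly):

EXEC sp_create_plan_guide
    @name = N'PG_KeepFixedPlan',
    @stmt = N'SELECT * FROM dbo.BigReport WHERE RegionId = @region',
    @type = N'SQL',
    @module_or_batch = NULL,
    @params = N'@region int',
    @hints = N'OPTION (KEEPFIXED PLAN)';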
Your query/batch is touching a lot of data - on the first execution the data may not be in the buffer pool (basically the cached pages of data the server needs) and the query is performing physical IO operations as opposed to logical IO operations (i.e. reads from disk vs. reads from cache).
To verify this, turn on statistics io output, which will provide you information on the types of IOs and how many of those the engine is performing for the batch. For the query/batch in question:
DBCC DROPCLEANBUFFERS
SET STATISTICS IO ON
Execute the batch, capture the stats output
Execute the batch again, capture the stats output
Compare the 2 stat outputs, paying particular attention to the physical/read-ahead and logical IO outputs between the 2 executions.
To resolve this, you've basically got only 1 option - optimize the query in question so it performs fewer IO operations. You could consider creating a scheduled job that runs the query every so often to keep the data in the buffer pool, but this wouldn't be as good an option.
Your query/batch is getting a poor execution plan and/or a poor plan choice for different variable values. Is this a batch/query that uses a parameterized statement (i.e. are you using variables rather than static values in the where/join clauses)? If so, are you seeing the difference in execution times for the same values or for different values? If for the same values, the answer is likely #1 or #2; if for different values, this is potentially your problem. If you think this is the issue after researching #1 and #2, repost with the .sqlplan, the TSQL, and the different parameter values you are using.
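If #3 does turn out to be the issue, two common mitigations are sketched below (table and variable names are placeholders; both hints are available on SQL 2005):

DECLARE @cust int;
SET @cust = 42;  -- example value

-- Recompile on each execution so the plan always fits the current value
SELECT * FROM dbo.Orders WHERE CustomerId = @cust OPTION (RECOMPILE);

-- Or pin the optimizer to a representative value instead of the sniffed one
SELECT * FROM dbo.Orders WHERE CustomerId = @cust OPTION (OPTIMIZE FOR (@cust = 42));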
I've found the only reliable metrics for performance tuning come from SQL Server's Profiler application. When looking at CPU Time and Reads in particular, you become much more sheltered from 'other influences'.
For example, the OS being busy, or multiple users being active will reduce your share of CPU and so increase the time to execute. And you may or may not get parallelism over multiple CPUs. But either way, the total CPU Time (as opposed to execution time) will stay approximately the same.
To make each test run start from a cold cache, do this:
CHECKPOINT
DBCC DROPCLEANBUFFERS

SQL procedure running time widely divergent

I have an application that runs a huge stored procedure on SQL Server 2000. Usually it takes about 1 minute to complete, but occasionally it will take MUCH longer.
Just now I ran it three times in a row in my test system. It took 1:12, 1:23, and 55:25.
What would cause that behavior? There are other things going on in the database, so I wonder if it has something to do with locks. How can I catch this in the act?
Create a trace and examine it in Profiler. That should at least point towards where the problem lies - in your procedure or elsewhere.
It's probably parameter sniffing: based on the input, SQL Server chose a different query plan.
Another possibility is that a separate query was running at the same time and locked everything up.
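To catch the blocking in the act on SQL Server 2000, a quick check while the proc is running slowly could be something like this (sysprocesses is the 2000-era view; on newer versions you'd use the DMVs instead):

-- Who is blocked, and by whom?
SELECT spid, blocked, waittime, lastwaittype, cmd
FROM master..sysprocesses
WHERE blocked <> 0;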
