I'm currently experiencing some problems on my DotNetNuke site running against SQL Server 2005 Express on Windows Server 2008. It runs smoothly most of the time. However, occasionally (on the order of once or twice an hour) it runs very slowly indeed; from a user's perspective it's almost as if there's a deadlock of some description when this occurs.
To try to work out what the problem is I've run SQL Profiler against the SQL Express database.
Looking at the results, some specific questions I have are:
The SQL trace shows an Audit Logon and Audit Logoff for every RPC:Completed - does this mean Connection Pooling isn't working?
When I look in Performance Monitor at ".NET CLR Data", then none of the "SQL client" counters have any instances - is this just a SQL Express lack-of-functionality problem or does it suggest I have something misconfigured?
The queries running when the slowness occurs don't seem unusual so far; they run fast at other times. What other Perfmon counters, traces, or log files can you suggest as useful tools for my further investigation?
Jumping straight to Profiler is probably the wrong first step. First, try checking the Perfmon stats on the server. I've got a tutorial online here:
http://www.brentozar.com/perfmon
Start capturing those metrics, and then after it's experienced one of those slowdowns, stop the collection. Look at the performance metrics around that time, and the bottleneck will show up. If you want to send me the csv output from Perfmon at brento#brentozar.com I can give you some insight as to what's going on.
You might still need to run Profiler afterwards, but I'd rule out the OS and hardware first. Also, just a thought - have you checked the server's System and Application event logs to make sure nothing's happening during those times? I've seen instances where, say, the antivirus client downloads new patches too often, and does a light scan after each update.
My spidey sense tells me that you may have SQL Server blocking issues. Read this article to help you monitor blocking on your server and check whether it's the cause.
If you think the issues may be performance related and want to see where your hardware bottleneck is, gather some CPU, disk, and memory stats using Perfmon and then correlate them with your Profiler trace to see if the slow response is related. A quick blocking check you can run during a slowdown is sketched below.
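One simple way to spot blocking at the moment of a slowdown (this is just a generic check against the SQL 2005 DMVs, not necessarily what the linked article uses):

```sql
-- Any row with a non-zero blocking_session_id is currently blocked;
-- blocking_session_id identifies the session doing the blocking.
SELECT
    r.session_id,
    r.blocking_session_id,
    r.wait_type,
    r.wait_time,
    t.text AS running_statement
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
WHERE r.blocking_session_id <> 0;
```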
No.
Nothing wrong with that... it shows that you're not using the .NET functionality embedded in SQL Server.
You can check http://www.xsqlsoftware.com/Product/xSQL_Profiler.aspx for more detailed analysis of the profiler trace. It has reports that show the top queries by time or CPU (not one single execution, but the sum of all executions of a single query).
Some other things to check:
Make sure your data files and log files are not auto-extending.
Make sure your anti-virus is set to ignore your SQL data and log files.
When looking at the profiler output, be sure to check the queries that finished just prior to your targets; they could have been blocking.
Make sure you've turned off auto-close on the database (a quick check is sketched below); re-opening after closing takes some time.
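For that last item, something like this will show and fix the setting (the database name is a placeholder):

```sql
-- is_auto_close_on = 1 means the database shuts down when its last
-- connection closes and pays a reopen cost on the next connection.
SELECT name, is_auto_close_on
FROM sys.databases;

ALTER DATABASE MyDnnDb SET AUTO_CLOSE OFF;
```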
We upgraded from SQL Server 2008 to SQL Server 2014. The upgrade itself was successful.
However, we've had optimization problems since then. Some queries have started to cause blocking. Sometimes the blocking clears on its own, but often the database just won't move.
Our workaround for this is changing MAXDOP. After the change I don't know what gets freed up, but everything starts running as it did before the jam in the database. I have no idea what else to do about it.
Our SQL Server configuration:
We have already changed the cost threshold for parallelism and the MAXDOP setting. It doesn't help much. I've also optimized the queries that cause the blocking.
The problem persists all the time. Oddly enough, the MAXDOP change relieves the blockage: the system then recovers completely, and the pending SQL queries drain and execute.
A performance issue can arise for a lot of reasons; an improper MAXDOP setting is just one of them.
Run a health check with sp_Blitz
Run sp_Blitz (https://github.com/BrentOzarULTD/SQL-Server-First-Responder-Kit#sp_blitz-overall-health-check) to see what is actually causing your performance bottleneck.
Check the findings with priorities 1 to 50 first; those are the most crucial.
Start fixing them one by one, as in the sketch below.
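A minimal sketch of running it, assuming a current version of the First Responder Kit is installed in your database (parameter names are from the kit's documentation):

```sql
-- Overall health check; @CheckServerInfo adds server-level
-- configuration rows to the output.
EXEC dbo.sp_Blitz @CheckServerInfo = 1;

-- Narrow the output to the most critical findings (priority 1-50).
EXEC dbo.sp_Blitz @IgnorePrioritiesAbove = 50;
```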
I'm experiencing periodic Azure SQL Database connection slowdowns. As recommended in the article Wait statistics, or please tell me where it hurts, I ran a sys.dm_db_wait_stats (the Azure SQL Database analogue of sys.dm_os_wait_stats) aggregation script, which showed me that the longest waits are of type XE_FILE_TARGET_TVF. The average wait time is 54 seconds.
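The aggregation I ran was essentially this (a simplified version of the script from that article, adapted to the database-scoped view):

```sql
-- Top waits by total wait time; avg_wait_ms helps spot long individual waits.
SELECT TOP (10)
    wait_type,
    wait_time_ms,
    waiting_tasks_count,
    wait_time_ms / NULLIF(waiting_tasks_count, 0) AS avg_wait_ms
FROM sys.dm_db_wait_stats
WHERE waiting_tasks_count > 0
ORDER BY wait_time_ms DESC;
```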
XE_FILE_TARGET_TVF isn't mentioned in the documentation or in any other online resource that I know of. I suspect that "XE" means "Extended Events", "TVF" is "table-valued function", and "FILE_TARGET" probably indicates that something is being written to some file.
So, what kind of wait is it?
Use Azure SQL Database Query Performance Insight for troubleshooting Azure SQL DBs. To begin with, wait stats alone are not a complete performance troubleshooting approach. Read How to analyse SQL Server performance for a more thorough approach, and learn to analyze where the CPU is spent, not only where the elapsed time is spent waiting. Aggregate wait stats are often misleading; filtering out the 'benign' wait types is a game of chasing red lights.
Unfortunately, not everything is actionable in the SQL Azure DB environment. Start with Query Performance Insight to see if you can correlate the performance issues with queries issued by your app. Use the SQL Database Index Advisor to get index recommendations for your workload.
If you cannot find the application problems and suspect this is caused by the platform, you will have to open a support case. Tweeting #AzureSupport is very effective in getting help.
To answer your question: a lot of Azure SQL DB monitoring relies on Extended Events and they store data to files. This specific wait stat is unlikely to be related to the cause of your performance problems.
I believe that it's an async process, so it's possibly nothing to worry about and could be a red herring. However, I think this is a fabric-related task behind the DB. Does this event seem to be consistently causing you issues?
Sometimes queries that normally take almost no time to run at all suddenly start to take as much as 2 seconds to run. (The query is select count(*) from calendars, which returns the number 10). This only happens when running queries through our application, and not when running the query directly against the database server. When we restart our application server software (Tomcat), suddenly performance is back to normal. Normally I would blame the network, but it doesn't make any sense to me that restarting the application server would make it suddenly behave much faster.
My suspicion falls on the connection pool, but I've tried all sorts of different settings and multiple different connection pools and I still have the same result. I'm currently using HikariCP.
Does anyone know what could be causing something like this, or how I might go about diagnosing the problem?
Do you use stored procedures or ad-hoc queries? One reason to get different execution times when running a query in, say, Management Studio versus a stored procedure in your application can be an inefficient cached execution plan, which could have been generated that way due to parameter sniffing. You can read more about it here, and there are a number of solutions you could try (like substituting parameters with local variables). If you restart the whole computer (and SQL Server is also running on it), then this could explain why you get fast queries right after a restart: the execution plans are cleared on reboot.
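For illustration, the local-variable workaround looks roughly like this (the procedure, table, and column names are invented for the example):

```sql
CREATE PROCEDURE dbo.GetOrdersByStatus
    @Status int
AS
BEGIN
    -- Copying the parameter into a local variable hides its value from
    -- the optimizer at compile time, so the plan is built for the
    -- "average" case instead of the sniffed value.
    DECLARE @LocalStatus int;
    SET @LocalStatus = @Status;

    SELECT OrderId, CustomerId, OrderDate
    FROM dbo.Orders
    WHERE Status = @LocalStatus;
END;
```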
It turned out we had a rogue process that was grabbing 64 connections to the database at once and using all of them for intense and inefficient work. We were able to diagnose this using jstack. We ran jstack when we noticed the system had slowed down a ton, and it showed us what the application was working on. We saw 64 stack traces all inside the same rogue process, and we had our answer!
We have a site in development that when we deployed it to the client's production server, we started getting query timeouts after a couple of hours.
This happened with a single user testing it, and on our own server (which is identical in terms of SQL Server version - 2005 SP3) we have never had the same problem.
One of our senior developers had come across similar behaviour in a previous job, and he ran a query to manually update the statistics. The problem magically went away; the query returned in a few milliseconds.
A couple of hours later, the same problem occurred. So we again manually updated the statistics and, again, the problem went away. We've checked the database properties and, sure enough, auto update statistics is TRUE.
As a temporary measure, we've set a task to update stats periodically, but clearly, this isn't a good solution.
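The manual update we run is essentially one of these (the table name is just an example):

```sql
-- Refresh statistics on every table in the current database.
EXEC sp_updatestats;

-- Or target one suspect table with a full scan for maximum accuracy.
UPDATE STATISTICS dbo.SomeTable WITH FULLSCAN;
```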
The developer who experienced this problem before is certain it's an environment problem - when it occurred for him previously, it went away of its own accord after a few days.
We have examined the SQL Server installation on their DB server and it's not what I would regard as normal. Although they have SQL 2005 installed (and not 2008), there's an empty "100" folder in the installation directory. There are also MSSQL.1, MSSQL.2, MSSQL.3 and MSSQL.4 folders (the last of which is where the executables and data are actually stored).
If anybody has any ideas we'd be very grateful - I'm of the opinion that rather than the statistics failing to update, they are somehow becoming corrupt.
Many thanks
Tony
Disagreeing with Remus...
Parameter sniffing allows SQL Server to guess the optimal plan for a wide range of input values. Sometimes it's wrong, and the plan is bad because of an atypical value or a poorly chosen default.
I used to be able to demonstrate this on demand by changing a default between 0 and NULL: plan and performance changed dramatically.
A statistics update will invalidate the plan. The query will thus be recompiled and cached when next used.
The workarounds are one of the following (a sketch of the hint appears after this list):
parameter masking (copy the parameter into a local variable)
use the OPTIMIZE FOR UNKNOWN hint
duplicate "default"
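A minimal sketch of the hint (it requires SQL Server 2008 or later; the table and parameter are invented):

```sql
-- OPTIMIZE FOR UNKNOWN makes the optimizer use average density
-- statistics instead of the sniffed value of @Status for this plan.
SELECT OrderId, CustomerId
FROM dbo.Orders
WHERE Status = @Status
OPTION (OPTIMIZE FOR UNKNOWN);
```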
See these SO questions
Why does the SqlServer optimizer get so confused with parameters?
At some point in your career with SQL Server does parameter sniffing just jump out and attack?
SQL poor stored procedure execution plan performance - parameter sniffing
Known issue?: SQL Server 2005 stored procedure fails to complete with a parameter
...and a Google search on SO
Now, Remus works for the SQL Server development team. However, this phenomenon is well documented by Microsoft on their own website, so blaming developers is unfair:
How Data Access Code Affects Database Performance (MSDN mag)
Suboptimal index usage within stored procedure (MS Connect)
Batch Compilation, Recompilation, and Plan Caching Issues in SQL Server 2005 (an excellent white paper)
It's not that the statistics are outdated. What happens is that when you update statistics, all plans get invalidated and some bad cached plan gets evicted. Things run smoothly until a bad plan gets cached again and causes slow execution.
The real question is why you get bad plans to start with. We can get into lengthy technical and philosophical arguments about whether a query processor should create a bad plan in the first place, but the thing is that, when applications are written in a certain way, bad plans can happen. The typical example is having a WHERE clause like (@somevariable IS NULL) OR (somefield = @somevariable). Ultimately, 99% of bad plans can be traced to developers writing queries with C-style procedural expectations instead of sound, set-based, relational processing.
What you need to do now is identify the bad queries. It's really easy: just check sys.dm_exec_query_stats; the bad queries will stand out in terms of total_elapsed_time and total_logical_reads. Once you've identified a bad plan, you can take corrective measures, which vary from query to query.
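Something along these lines will surface them (a generic top-offenders query, not tied to your particular workload):

```sql
-- Worst statements by cumulative elapsed time; swap the ORDER BY to
-- total_logical_reads to rank by I/O instead.
SELECT TOP (20)
    qs.total_elapsed_time,
    qs.total_logical_reads,
    qs.execution_count,
    SUBSTRING(st.text, (qs.statement_start_offset / 2) + 1,
        ((CASE qs.statement_end_offset
              WHEN -1 THEN DATALENGTH(st.text)
              ELSE qs.statement_end_offset
          END - qs.statement_start_offset) / 2) + 1) AS statement_text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY qs.total_elapsed_time DESC;
```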
We have a SQL 2000 server that has widely varied jobs that run at different times of day, or even different days of the month. Normally, we only use the SQL profiler to run traces for very short periods of time for performance troubleshooting, but in this case, that really wouldn't give me a good overall picture of the kinds of queries that are run against the database over the course of a day or week or month.
How can I minimize the performance overhead of a long-running SQL trace? I already know to:
Execute the trace server-side (sp_trace_create), instead of using the SQL Profiler UI.
Trace to a file, and not to a database table (which would add extra overhead to the DB server). A rough sketch of that setup follows.
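For reference (the file path, events, and duration threshold are illustrative; the sp_trace_* signatures are the documented ones):

```sql
DECLARE @TraceID int;
DECLARE @maxfilesize bigint;
DECLARE @duration bigint;
DECLARE @on bit;
SET @maxfilesize = 100;   -- max size in MB per trace file
SET @duration = 5000;     -- duration threshold (milliseconds on SQL 2000)
SET @on = 1;

-- Create the trace, writing to a file (SQL Server appends .trc).
EXEC sp_trace_create @TraceID OUTPUT, 0, N'C:\Traces\LongQueries', @maxfilesize;

-- Capture Duration (column 13) and TextData (column 1) for
-- RPC:Completed (event 10) and SQL:BatchCompleted (event 12).
EXEC sp_trace_setevent @TraceID, 10, 13, @on;
EXEC sp_trace_setevent @TraceID, 10, 1, @on;
EXEC sp_trace_setevent @TraceID, 12, 13, @on;
EXEC sp_trace_setevent @TraceID, 12, 1, @on;

-- Filter on Duration (13): logical AND (0), comparison >= (4).
EXEC sp_trace_setfilter @TraceID, 13, 0, 4, @duration;

-- Start the trace (status 1 = start, 0 = stop, 2 = close and delete).
EXEC sp_trace_setstatus @TraceID, 1;
```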
My question really is about filters. If I add a filter to only log queries that exceed a certain duration or number of reads, the trace still has to examine all activity on the server to decide whether to log it, right? So even with that filter, is the trace going to create an unacceptable level of overhead for a server that is already on the edge of unacceptable performance?
Adding filters does minimize the overhead of event collection and also prevents the server from logging transaction entries you don't need.
As for whether the trace is going to create an unacceptable level of overhead, you'll just have to test it out and stop it if there are additional complaints. Taking the hints of the DB Tuning Advisor with that production trace file could improve performance for everyone tomorrow though.
You actually should not have the server process the trace, as that can cause problems: "When the server processes the trace, no events are dropped - even if it means sacrificing server performance to capture all the events. Whereas if Profiler is processing the trace, it will skip events if the server gets too busy." (From the SQL 70-431 exam book's best practices.)
I found an article that actually measures the performance impact of a SQL profiler session vs a server-side trace:
http://sqlblog.com/blogs/linchi_shea/archive/2007/08/01/trace-profiler-test.aspx
This really was my underlying question: how to make sure that I don't bog down my production server during a trace. It appears that if you do it correctly, there is minimal overhead.
It's actually possible to collect more detailed measurements than you can get from Profiler, and to do it 24x7 across an entire instance, without incurring any overhead. This avoids the necessity of figuring out ahead of time what you need to filter, which can be tricky.
Full disclosure: I work for one of the vendors who provide such tools… but whether you use ours or someone else’s… this may get you around the core issue here.
More info on our tool here http://bit.ly/aZKerz