Long elapsed_time with Azure SQL - sql-server

We have been using Azure SQL successfully with small data volumes for some time. Now that our data storage volumes have grown moderately, we are experiencing very long query times. What is strange is that sys.dm_exec_query_stats shows the worker/CPU time for the queries is very low (roughly 0 seconds), while the elapsed time is long (sometimes over 30 seconds).
We have upgraded our pricing tier to 100 DTUs, after which our resource consumption in every category is below 10% of the available resources. The execution plans look decent, and given that CPU times are so low, they shouldn't be the issue.
I have also checked wait statistics, which yielded nothing significant. Individual queries show only a few milliseconds of waits, and the only notable wait type is SOS_WORK_DISPATCHER, which doesn't ring a bell - and doesn't appear in Microsoft's documentation.
The web application and the SQL server serving the data are both in Azure West Europe, and given that we have not seen significant I/O volumes, that shouldn't be a problem either.
Does anyone have an idea what could be causing this, or what SOS_WORK_DISPATCHER is?
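For reference, a minimal sketch (not the poster's exact query) of the kind of check that surfaces statements whose elapsed time far exceeds their CPU time; all times in sys.dm_exec_query_stats are in microseconds:

-- Top cached statements ranked by how much elapsed time exceeds CPU (worker) time
SELECT TOP (20)
    qs.execution_count,
    qs.total_worker_time  / qs.execution_count / 1000.0 AS avg_cpu_ms,
    qs.total_elapsed_time / qs.execution_count / 1000.0 AS avg_elapsed_ms,
    SUBSTRING(st.text, (qs.statement_start_offset / 2) + 1,
        ((CASE qs.statement_end_offset
              WHEN -1 THEN DATALENGTH(st.text)
              ELSE qs.statement_end_offset
          END - qs.statement_start_offset) / 2) + 1) AS statement_text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY (qs.total_elapsed_time - qs.total_worker_time) DESC;

A large gap between avg_elapsed_ms and avg_cpu_ms means the time is being spent waiting or returning results to the client rather than computing.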

Related

How to calculate total optimal connection count and find DB instance type to use?

How can I calculate the optimal total connection count from my service to my DB endpoint? Is there a basic formula based on the expected number of queries per second and the CPU and I/O consumed by each query?
Similarly, is there a formula to calculate the optimal database instance type/size to use based on traffic patterns and query characteristics (CPU, IO consumed or latency of query)?
I will be using this to create the connection pool in my service. I'm assuming that if my service has N hosts, then the connection pool size per host needs to be the total optimal connection count divided by N.
Note: By instance type I mean similar to AWS EC2 instance type which provides info on vCPU and memory (RAM)
When it comes to sizing the database machine, the key measure is the number of concurrently active database sessions. Since each active session corresponds to at least one database process, you have to provide enough CPU power and I/O capacity to handle them concurrently.
An estimate for that number would be average query duration in seconds * number of queries per second. You have to have at least that many cores, and your I/O system has to be able to handle that many concurrent I/O requests.
When it comes to dimensioning a connection pool, you also have to factor in the time that the database spends idle in a transaction while waiting for the next statement from the client.
The maximal connection pool size would be the number of concurrently active sessions the database can handle divided by the transaction busy ratio. The transaction busy ratio is active time for a transaction / total time for a transaction - so if all your transactions consist of only a single statement (meaning no time is spent waiting for the next statement within a transaction), that ratio would be 1.
In practice, it is difficult to estimate or measure the ideal pool size, and you have to run load tests to see how big you can make the pool without overloading your database.
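As a purely hypothetical worked example: at 50 queries per second with an average duration of 0.2 seconds, roughly 50 * 0.2 = 10 sessions are active at any moment, so you would want at least 10 cores and matching I/O capacity; if each transaction is actively executing statements for only half of its lifetime (busy ratio 0.5), a pool of about 10 / 0.5 = 20 connections would carry that load without accumulating idle connections.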

Question about SQL Server replication scalability

I'm hoping someone has some insight to offer here. I'm in an environment where we have a central database server with a database of around 20GB and individual database servers across about 200 facilities. The intention is to run a copy of our application at each facility pointing at their local server, but to sync all databases in both directions as often as possible (no more than 10,000 rows affected per day, individual rows on average 1.5kb). Due to varying connectivity, a facility could be offline for a week or two at times and it needs to catch up once back online.
Question: Using pull replication with the merge strategy, are there practical limits that would affect our environment? At 50, 100, 200 facilities, what negative effects can we expect to see, if any? What kind of bandwidth expectations should we have for the central server (I'm finding very little about this number anywhere I look)?
I appreciate any thoughts or guidance you may have.
Based on your description, the math looks like this:
1.5 kB (per row) * 10,000 rows = roughly 15 MB per day (minimum) incoming at every one of your 50 to 200 sites.
15 MB * (50 to 200 sites) = roughly 0.75 to 3 GB per day (minimum) sent from your central server.
Your sites will see modest traffic (about 15 MB per day), while your hub carries the aggregate (up to about 3 GB per day).
So bandwidth at the hub may still be a concern, and you will definitely want to monitor bandwidth and throughput. The negative side effect to expect is periodic slowness at your hub during each sync.

Are all available DTU used to exec a query?

I have a non-trivial query.
When I had 10 DTUs for my database, it took about 17 seconds to execute the query.
I increased the level to 50 DTU - now the execution takes 3-4 seconds.
This ratio corresponds to the documentation - more DTUs = faster execution.
But!
1. On my PC I can execute the query in about 1 second.
2. In the portal statistics I see that I use only 12 DTUs (max DTU percentage = 25%).
In sys.dm_db_resource_stats I see that MAX(avg_cpu_percent) is about 25%, and the other metrics are lower.
So the question is: why does my query take 3-4 seconds to execute? It can run in 1 second, and the server does not use all of my DTUs.
How can I make the server use all available resources to execute queries faster?
DTU is a combined measurement of CPU, memory, data I/O and transaction log I/O.
This means that reaching a DTU bottleneck can mean any of those.
This question may help you to measure the different aspects: Azure SQL Database "DTU percentage" metric
And here's more info on DTU: https://learn.microsoft.com/en-us/azure/sql-database/sql-database-what-is-a-dtu
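As a rough sketch of how to see each component yourself (run in the user database; the DMV keeps roughly one hour of history in 15-second intervals), sys.dm_db_resource_stats reports each DTU component separately, so you can see which one comes closest to its limit:

-- Per-component resource usage relative to the service tier limits
SELECT end_time,
       avg_cpu_percent,
       avg_data_io_percent,
       avg_log_write_percent,
       avg_memory_usage_percent
FROM sys.dm_db_resource_stats
ORDER BY end_time DESC;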
On my PC I can execute the query in 1 sec
We should not compare on-premises computing power with DTUs.
A DTU is a blend of CPU, I/O and memory that you get based on your performance tier, so the comparison is not valid.
How can I make the server use all available resources to execute queries faster?
This is simply not possible. When SQL Server runs a query, memory is the only resource constraint that can prevent the query from even starting; the rest, such as CPU and I/O, can scale up or down depending on what the query does.
In summary, you have to ensure that queries are not constrained by a resource crunch - they can use all the resources they need and release them when they are done.
You will also have to look at wait types and fine-tune the query further.
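A minimal sketch of such a wait-type check, assuming you are on Azure SQL Database (sys.dm_db_wait_stats is the database-scoped view there; sys.dm_os_wait_stats is the instance-level equivalent on a regular SQL Server):

-- Top accumulated waits since the statistics were last cleared
SELECT TOP (10)
    wait_type,
    waiting_tasks_count,
    wait_time_ms,
    signal_wait_time_ms
FROM sys.dm_db_wait_stats
ORDER BY wait_time_ms DESC;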
As Bernard Vander Beken mentioned:
DTU is a combined measurement of CPU, memory, data I/O and transaction log I/O.
I'll also add that Microsoft does not share the formula used to calculate DTUs. You mentioned that you are not seeing DTUs peg at 100% during query execution. But since we do not know the formula, you may very well be pegging components of DTU, but not pegging DTU itself.
Azure SQL is a shared environment, and each tenant will be throttled to ensure that the minimum SLA is met for all tenants.
Exactly what a DTU is remains quite fuzzy.
We have done an experiment where we run a set of benchmarks on machines with the same amount of DTU on different data centers.
http://dbwatch.com/azure-database-performance-measured
It turns out that the actual performance varies by a factor of 5.
We have also seen instances where the performance of a repeated query on the same database varies drastically.
We provide our database performance benchmarks for free if you would like to compare the instance you run on your PC with the instance in the Azure cloud.

Multiple queries at a time - server performance?

If one (SELECT) query is run against the database and takes 10 minutes to finish, what happens to the performance of the server while this query is running? To be more precise, is it possible to run other queries at the same time, and how does this long-running one affect their speed?
Thanks,
Ilija
Database engines are designed for multiple concurrent users. Data and execution plans are cached and reused, the engine has its own scheduler, and so on.
There are some exceptions:
a badly structured query can run at 100% CPU on all cores
a long-running UPDATE, INSERT or open transaction can block other users (see the sketch after this list)
not enough memory means paging and thrashing of data through the cache
... and lots more edge cases
However, day to day it shouldn't matter, and you won't even notice that the 10-minute query is running.
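If you do suspect that a long-running statement or open transaction is blocking others, a minimal sketch along these lines, using sys.dm_exec_requests, shows who is waiting on whom:

-- Currently executing requests that are blocked, and what they are waiting on
SELECT r.session_id,
       r.blocking_session_id,
       r.wait_type,
       r.wait_time,          -- milliseconds spent waiting so far
       r.command,
       r.total_elapsed_time  -- milliseconds since the request started
FROM sys.dm_exec_requests AS r
WHERE r.blocking_session_id <> 0;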

Is there a SQL server performance counter for average execution time?

I want to tune a production SQL server. After making adjustments (such as changing the degree of parallelism) I want to know if it helped or hurt query execution times.
This seems like an obvious performance counter, but for the last half hour I've been searching Google and the counter list in perfmon, and I have not been able to find a performance counter for SQL Server that gives me the average execution time for all queries hitting a server - the SQL Server equivalent of the ASP.NET Request Execution Time.
Does one exist that I'm missing? Is there another effective way of monitoring the average query times for a server?
I don't believe there is a PerfMon counter, but there is a report within SQL Server Management Studio:
Right-click on the database, select Reports > Standard Reports > Object Execution Statistics. This will give you several very good statistics about what's running within the database, how long it's taking, how much memory and I/O it consumes, etc.
You can also run this on the server level across all databases.
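If you prefer a query over the report, roughly the same averages can be pulled from the plan cache via sys.dm_exec_query_stats; a minimal sketch (times are stored in microseconds, and only cached plans are covered):

-- Average elapsed time per cached statement, heaviest first
SELECT TOP (20)
    qs.execution_count,
    qs.total_elapsed_time / qs.execution_count / 1000.0 AS avg_elapsed_ms,
    st.text AS query_text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY qs.total_elapsed_time DESC;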
You can use Query Analyzer (one of the tools that ships with SQL Server) to see how queries are executed internally, so you can optimize indexing and so on. That won't tell you about the average execution time or the round trip back to the client; to get that, you'd have to log it on the client and analyze the data yourself.
I managed to do it by saving the trace to a SQL table. With the trace open:
File > Save As > Trace Table
Select the SQL Server instance to save to, and once the trace is imported run:
select avg(duration) from dbo.[YourTableImportName]
(Note that traces from SQL Server 2005 onwards record duration in microseconds.) You can just as easily compute other statistics - max, min, counts, etc. - which makes this a much better way of interrogating the trace results.
Another solution is to run the query multiple times and compute the average query time (note that this example is PL/pgSQL, so it applies to PostgreSQL rather than SQL Server):
DO $proc$
DECLARE
    StartTime timestamptz;
    EndTime timestamptz;
    Delta double precision;
BEGIN
    StartTime := clock_timestamp();
    -- Run the query 100 times
    FOR i IN 1..100 LOOP
        PERFORM * FROM table_name;
    END LOOP;
    EndTime := clock_timestamp();
    -- Average duration per execution, in milliseconds
    Delta := 1000 * (extract(epoch FROM EndTime) - extract(epoch FROM StartTime)) / 100;
    RAISE NOTICE 'Average duration in ms = %', Delta;
END;
$proc$;
Here it runs the query 100 times:
PERFORM * FROM table_name;
Just replace SELECT with PERFORM.
Average over what time and for which queries? You need to further define what you mean by "average" or it has no meaning, which is probably why it's not a simple performance counter.
You could capture this information by running a trace, capturing that to a table, and then you could slice and dice the execution times in one of many ways.
It doesn't give exactly what you need, but I'd highly recommend trying the SQL Server 2005 Performance Dashboard Reports, which can be downloaded here. It includes a report of the top 20 queries and their average execution time and a lot of other useful ones as well (top queries by IO, wait stats etc). If you do install it be sure to take note of where it installs and follow the instructions in the Additional Info section.
The profiler will give you statistics on query execution times and activities on the server. Overall query times may or may not mean very much without tying them to specific jobs and query plans.
Other indicators of performance bottlenecks are resource contention counters (general statistics, latches, locks). You can see these through performance counters. Also looking for large number of table-scan or other operations that do not make use of indexes can give you an indication that indexing may be necessary.
On a loaded server increasing parallelism is unlikely to materially affect performance as there are already many queries active at any given time. Where parallelism gets you a win is on large infrequently run batch jobs such as ETL processes. If you need to reduce the run-time of such a process then parallelism might be a good place to look. On a busy server doing a transactional workload with many users the system resources will be busy from the workload so parallelism is unlikely to be a big win.
You can use Activity Monitor. It's built into SSMS. It will give you real-time tracking of all current expensive queries on the server.
To open Activity Monitor:
In SQL Server Management Studio (SSMS), right-click on the server and select Activity Monitor.
Open Recent Expensive Queries to see CPU Usage, Average Query Time, etc.
Hope that helps.
There are counters in the 'SQL Server:Batch Resp Statistics' group that track SQL batch response times. The counters are divided into response-time buckets, for example from 0 ms to 1 ms, ..., from 10 ms to 20 ms, ..., from 1000 ms to 2000 ms and so on, so the appropriate counters can be picked for the time interval you care about.
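These counters can also be read with a query; a rough sketch against sys.dm_os_performance_counters (the object name carries an instance-specific prefix, hence the LIKE match):

-- Batch response-time histogram buckets (CPU-time and elapsed-time variants)
SELECT object_name, counter_name, instance_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE object_name LIKE '%Batch Resp Statistics%'
ORDER BY counter_name, instance_name;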
Hope it helps.
