What is causing CPU spike in Azure SQL? - sql-server

The chart below shows the CPU spikes over a 24-hour period for one of our Azure SQL databases. [image: compute utilization]
In Query Performance, the top 5 queries by CPU for the same 24-hour period are shown in the image below. [image: query performance]
However, the spikes in the overview above are more frequent than the executions of the top query by CPU. Where else, besides Query Performance, can we find what is causing the spikes, since it seems to be something else altogether?

I know this is a bit late, but it might be useful for others. The following query shows the top 10 CPU-consuming queries among the requests currently executing in Azure SQL:
SELECT TOP 10
GETDATE() runtime,
*
FROM
(
SELECT query_stats.query_hash,
SUM(query_stats.cpu_time) 'Total_Request_Cpu_Time_Ms',
SUM(logical_reads) 'Total_Request_Logical_Reads',
MIN(start_time) 'Earliest_Request_start_Time',
COUNT(*) 'Number_Of_Requests',
SUBSTRING(REPLACE(REPLACE(MIN(query_stats.statement_text), CHAR(10), ' '), CHAR(13), ' '), 1, 256) AS "Statement_Text"
FROM
(
SELECT req.*,
SUBSTRING( ST.text,
(req.statement_start_offset / 2) + 1,
((CASE statement_end_offset
WHEN -1 THEN
DATALENGTH(ST.text)
ELSE
req.statement_end_offset
END - req.statement_start_offset
) / 2
) + 1
) AS statement_text
FROM sys.dm_exec_requests AS req
CROSS APPLY sys.dm_exec_sql_text(req.sql_handle) AS ST
) AS query_stats
GROUP BY query_hash
) AS t
ORDER BY Total_Request_Cpu_Time_Ms DESC;
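The SUBSTRING arithmetic in the inner query can be hard to follow. statement_start_offset and statement_end_offset are byte offsets into the batch text, which is stored as two-byte Unicode (hence the division by 2), and an end offset of -1 means "to the end of the batch". Below is a minimal Python sketch of the same slicing. The function name and the sample batch are made up for illustration, and it assumes the end offset points at the last character of the statement, which is what the + 1 in the query implies.

```python
def extract_statement(batch_text: str, start_offset: int, end_offset: int) -> str:
    """Mimic the DMV offset arithmetic: offsets are in bytes, 2 bytes per character."""
    if end_offset == -1:
        # -1 means the statement runs to the end of the batch (DATALENGTH in the query).
        end_offset = len(batch_text) * 2
    start_char = start_offset // 2                 # 0-based character index
    length = (end_offset - start_offset) // 2 + 1  # same "+ 1" as in the T-SQL
    # Python slicing clips at the end of the string, just as SUBSTRING does.
    return batch_text[start_char:start_char + length]


# Hypothetical two-statement batch; the second statement is the one "running".
batch = "SELECT 1; SELECT name FROM sys.objects;"
start = batch.index("SELECT name") * 2   # byte offset of the first character
end = (len(batch) - 1) * 2               # byte offset of the last character

print(extract_statement(batch, start, end))
print(extract_statement(batch, start, -1))
```

Both calls print only the second statement, which is why the outer query can group and display individual statements rather than whole batches.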

Related

Actual CPU usage per database

I would like to measure the current CPU usage for a particular database, and I wrote the following query to obtain this information. However, I am not sure about the accuracy of this query.
DECLARE @Fm INT;
DECLARE @FTime DATETIME;
SELECT @FTime = GETDATE(), @Fm = SUM(dmqs.total_worker_time)
FROM sys.dm_exec_query_stats dmqs
CROSS APPLY
    (SELECT
        CONVERT(INT, value) AS [DatabaseID]
     FROM sys.dm_exec_plan_attributes(dmqs.plan_handle)
     WHERE attribute = N'dbid') dmpa
WHERE DatabaseID = 7
GROUP BY dmpa.DatabaseID;
WAITFOR DELAY '00:00:01';
SELECT CAST((SUM(dmqs.total_worker_time) - @Fm) * 1.0 / SUM(dmqs.total_worker_time) * 100 AS DECIMAL(5, 2))
FROM sys.dm_exec_query_stats dmqs
CROSS APPLY
    (SELECT
        CONVERT(INT, value) AS [DatabaseID]
     FROM sys.dm_exec_plan_attributes(dmqs.plan_handle)
     WHERE attribute = N'dbid') dmpa
WHERE DatabaseID = 7
GROUP BY dmpa.DatabaseID;
Have a look at Glenn Berry's diagnostic queries. He has some CPU-related ones, including a breakdown of CPU usage per database in an instance.
https://www.sqlskills.com/blogs/glenn/category/dmv-queries/
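On the accuracy question: the query in the question divides the one-second delta by the cumulative total_worker_time, which gives the share of all recorded worker time that accrued during that second, not a CPU utilization figure. A rough alternative is to divide the delta by the CPU time that was available during the interval. The Python sketch below shows only that arithmetic. The function name, the sample readings and the core count are assumptions for illustration; total_worker_time is reported in microseconds.

```python
def cpu_percent(worker_us_before: int, worker_us_after: int,
                interval_seconds: float, core_count: int = 1) -> float:
    """Percent of available CPU time consumed between two samples.

    worker_us_* are SUM(total_worker_time) readings in microseconds.
    core_count is an assumption the caller must supply.
    """
    delta_us = worker_us_after - worker_us_before
    available_us = interval_seconds * 1_000_000 * core_count
    return round(100.0 * delta_us / available_us, 2)


# Hypothetical readings taken one second apart.
print(cpu_percent(5_000_000, 5_250_000, 1.0))                 # one core
print(cpu_percent(5_000_000, 5_250_000, 1.0, core_count=2))   # two cores
```

Even with this arithmetic the result is only an approximation, because sys.dm_exec_query_stats covers cached plans only: a plan that is evicted or recompiled between the two samples takes its accumulated time with it.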

Azure SQL Database - queries significantly slower than SQL Database on Azure VM

We moved our SQL Server from an Azure VM to an Azure SQL Database. The Azure VM was DS2_V2, 2 core, 7GB RAM, 6400 max IOPS. The Azure SQL Database is Standard S3, 100 DTU. I chose this tier after running the Azure DTU Calculator tool on the Azure VM for 24 hours - it suggested this tier for me.
The problem is that queries (mostly SELECT and UPDATE) are painfully slow now, compared to how they were on the Azure VM. One thing I noticed is that while running a query, I went to the Resource Utilization graph under Monitoring in the Azure Portal, and it's pegged at 100% for as long as any query is running. Does this mean my tier is in fact too low? I would hope not, because the next tier up is a pretty big jump in cost.
Just for information, the Azure SQL Database is identical in schema and data to the Azure VM database, and I rebuilt all indexes (including Full-Text) after the migration.
In my research thus far I've read everything from making sure my Azure SQL DB is in the right region on Azure (it is) to network latency (non-existent on Azure VM) causing the issue.
How long has this system been running now as an Azure SQL Database? Presumably it's more than a few hours old (i.e. some "production" queries have hit it) and it has generated some useful statistics.
Analyzing this and determining the source of your problem will take a multi-pronged strategy.
Service Tier Check
Try the following queries, which determine whether you are at the correct service level:
-----------------------
---- SERVICE TIER CHECK
-----------------------
-- The following query outputs the fit percentage per resource dimension, based on a threshold of 20%.
-- IF the query below returns values greater than 99.9 for all three resource dimensions, your workload is very likely to fit into the lower performance level.
SELECT
    100.0 * (COUNT(end_time) - SUM(CASE WHEN avg_cpu_percent >= 20 THEN 1 ELSE 0 END)) / COUNT(end_time) AS 'CPU Fit Percent'
    ,100.0 * (COUNT(end_time) - SUM(CASE WHEN avg_log_write_percent >= 20 THEN 1 ELSE 0 END)) / COUNT(end_time) AS 'Log Write Fit Percent'
    ,100.0 * (COUNT(end_time) - SUM(CASE WHEN avg_data_io_percent >= 20 THEN 1 ELSE 0 END)) / COUNT(end_time) AS 'Physical Data Read Fit Percent'
FROM sys.dm_db_resource_stats;
-- Look at how many times your workload reaches 100% and compare it to your database workload SLO.
-- IF the query below returns a value less than 99.9 for any of the three resource dimensions, you should consider either moving to the next higher performance level or use application tuning techniques to reduce the load on the Azure SQL Database.
SELECT
    100.0 * (COUNT(end_time) - SUM(CASE WHEN avg_cpu_percent >= 100 THEN 1 ELSE 0 END)) / COUNT(end_time) AS 'CPU Fit Percent'
    ,100.0 * (COUNT(end_time) - SUM(CASE WHEN avg_log_write_percent >= 100 THEN 1 ELSE 0 END)) / COUNT(end_time) AS 'Log Write Fit Percent'
    ,100.0 * (COUNT(end_time) - SUM(CASE WHEN avg_data_io_percent >= 100 THEN 1 ELSE 0 END)) / COUNT(end_time) AS 'Physical Data Read Fit Percent'
FROM sys.dm_db_resource_stats;
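To make the "fit percent" idea concrete: it is simply the share of resource samples that stayed below a threshold, expressed on a 0-100 scale so that it can be compared with the 99.9 figure mentioned in the comments. The Python sketch below reproduces that arithmetic on a handful of invented avg_cpu_percent samples; the function name and the sample values are illustrative assumptions, not real measurements.

```python
def fit_percent(samples: list[float], threshold: float) -> float:
    """Percentage of samples that stayed below the threshold."""
    if not samples:
        raise ValueError("no samples to evaluate")
    over = sum(1 for s in samples if s >= threshold)
    return 100.0 * (len(samples) - over) / len(samples)


# Invented avg_cpu_percent samples, one per reporting interval.
avg_cpu_percent = [12.0, 35.5, 100.0, 8.2, 100.0, 41.0, 19.9, 64.3]

# Share of intervals under 20% CPU: high values suggest a lower tier would fit.
print(fit_percent(avg_cpu_percent, 20))
# Share of intervals not pinned at 100%: low values suggest the tier is too small.
print(fit_percent(avg_cpu_percent, 100))
```

With these samples only 3 of 8 intervals are under 20% and 2 of 8 are pinned at 100%, so the second check would point towards tuning or a higher tier rather than a lower one.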
Resource Consumption Levels
It would also be useful to check the resource consumption, which you can do using the following query. This will report things like DTU consumption and IO.
-----------------
-- Resource Usage
-----------------
select *
from sys.dm_db_resource_stats
order by end_time desc
Indexes
It's also worth a quick check whether you have missing indexes or whether some of your existing indexes are getting in the way.
The missing index query is a doozy, but its output should be taken with a grain of salt. I generally treat it as advice on how the database is being used, and I make my own judgement on which indexes to add, and how. For example, as a general rule of thumb, all foreign keys should have non-clustered indexes to facilitate the inevitable JOINs they're involved in.
--------------------
-- Find poor indexes
--------------------
DECLARE @dbid int;
SELECT @dbid = DB_ID();
-- Indexes that are written to more often than they are read
SELECT 'Table Name' = OBJECT_NAME(s.object_id), 'Index Name' = i.name, i.index_id,
    'Total Writes' = user_updates, 'Total Reads' = user_seeks + user_scans + user_lookups,
    'Difference' = user_updates - (user_seeks + user_scans + user_lookups)
FROM sys.dm_db_index_usage_stats AS s
INNER JOIN sys.indexes AS i
    ON s.object_id = i.object_id
    AND i.index_id = s.index_id
WHERE OBJECTPROPERTY(s.object_id, 'IsUserTable') = 1
    AND s.database_id = @dbid
    AND user_updates > (user_seeks + user_scans + user_lookups)
ORDER BY [Difference] DESC, [Total Writes] DESC, [Total Reads] ASC;
------------------
-- Missing Indexes
------------------
DECLARE @improvementMeasure int = 100;
SELECT
    CONVERT (decimal (28,1),
        migs.avg_total_user_cost *
        migs.avg_user_impact *
        (migs.user_seeks + migs.user_scans))
        AS improvement_measure,
    OBJECT_NAME(mid.object_id, mid.database_id) AS table_name,
    mid.equality_columns AS index_column,
    mid.inequality_columns,
    mid.included_columns AS include_columns,
    'CREATE INDEX IX_' +
        OBJECT_NAME(mid.object_id, mid.database_id) +
        '_' +
        REPLACE(REPLACE(mid.equality_columns, '[', ''), ']', '') +
        ' ON ' +
        mid.statement +
        ' (' + ISNULL (mid.equality_columns, '') +
        CASE WHEN mid.equality_columns IS NOT NULL
                  AND mid.inequality_columns IS NOT NULL
             THEN ','
             ELSE ''
        END + ISNULL (mid.inequality_columns, '') +
        ')' +
        ISNULL (' INCLUDE (' + mid.included_columns + ')',
            '') AS create_index_statement,
    migs.user_seeks,
    migs.unique_compiles,
    migs.avg_user_impact,
    migs.avg_total_user_cost
FROM sys.dm_db_missing_index_groups mig
INNER JOIN sys.dm_db_missing_index_group_stats migs
    ON migs.group_handle = mig.index_group_handle
INNER JOIN sys.dm_db_missing_index_details mid
    ON mig.index_handle = mid.index_handle
WHERE CONVERT (decimal (28,1),
        migs.avg_total_user_cost *
        migs.avg_user_impact *
        (migs.user_seeks + migs.user_scans)) > @improvementMeasure
ORDER BY migs.avg_total_user_cost *
    migs.avg_user_impact *
    (migs.user_seeks + migs.user_scans) DESC;
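The ranking in the missing index query boils down to one product: average query cost, times the estimated percentage improvement, times the number of seeks and scans that would have used the index. The Python sketch below applies the same formula and threshold to three invented suggestions, to show why a cheap query that runs very often can outrank an expensive one that runs rarely. The class, the table names and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class MissingIndex:
    table: str
    avg_total_user_cost: float   # average optimizer cost of the affected queries
    avg_user_impact: float       # estimated improvement, in percent
    user_seeks: int
    user_scans: int

    @property
    def improvement_measure(self) -> float:
        # Same product the query computes and filters on.
        return (self.avg_total_user_cost
                * self.avg_user_impact
                * (self.user_seeks + self.user_scans))


# Invented suggestions for illustration only.
candidates = [
    MissingIndex("Invoices", 10.0, 20.0, 3, 0),    # costly but rare
    MissingIndex("Customers", 0.1, 50.0, 10, 0),   # cheap and rare
    MissingIndex("Orders", 2.5, 90.0, 400, 10),    # moderate but very frequent
]

threshold = 100  # plays the role of the improvement-measure variable in the query
ranked = sorted(
    (c for c in candidates if c.improvement_measure > threshold),
    key=lambda c: c.improvement_measure,
    reverse=True,
)
for c in ranked:
    print(c.table, c.improvement_measure)
```

The measure is a prioritisation heuristic rather than a prediction, which is one more reason to review each suggested index by hand before creating it.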
Maintenance
A maintenance plan should also be set up, whereby you rebuild indexes and update statistics on a somewhat regular basis. Unfortunately there is no SQL Agent in an Azure SQL environment, but PowerShell and either an Azure Function or an Azure WebJob can help you schedule and execute this. For our on-prem and Azure servers, we do this weekly.
Note that WebJobs would only help if you have a pre-existing App Service to run them within.
For scripts to help you with index and statistics maintenance, check out Ola Hallengren's script offering.

Azure SQL frequent connection timeouts

We're running a web app (2 instances) on Azure, backed by a SQL Azure database. At any given time there are 50-150 users using the website. The database runs at S2 performance level. The DTU is around 20% on average.
However, a few times every day I suddenly get hundreds of errors in my logs with timeouts, like this:
An error occurred while executing the command definition. See the inner exception for details.
The wait operation timed out.
Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding. This failure occurred while attempting to connect to the routing destination. The duration spent while attempting to connect to the original server was - [Pre-Login] initialization=1; handshake=21; [Login] initialization=0; authentication=0; [Post-Login] complete=1;
We're using EF6 for queries with the default command timeout. I've configured this execution strategy:
SetExecutionStrategy("System.Data.SqlClient",
() => new SqlAzureExecutionStrategy(10, TimeSpan.FromSeconds(15)));
The database (about 15GB total) is heavily indexed. These errors occur all over the place, usually dozens to hundreds within 1-2 minutes.
What steps can I take to prevent this from happening?
The fact that the errors cluster within 1-2 minutes might mean a burst in activity or some process that is locking up tables.
If your DTU during those times is at 20%, it is not a CPU issue, but you can always find the bottlenecks by running this query on the DB:
SELECT TOP 10
    total_worker_time / execution_count AS Avg_CPU_Time
    ,execution_count
    ,total_elapsed_time / execution_count AS AVG_Run_Time
    -- Offsets are 0-based byte positions (2 bytes per character); SUBSTRING is 1-based, hence the + 1.
    ,(SELECT
        SUBSTRING(text, statement_start_offset / 2 + 1, (CASE
            WHEN statement_end_offset = -1 THEN LEN(CONVERT(nvarchar(max), text)) * 2
            ELSE statement_end_offset
        END - statement_start_offset) / 2 + 1)
      FROM sys.dm_exec_sql_text(sql_handle)
    ) AS query_text
FROM sys.dm_exec_query_stats
ORDER BY Avg_CPU_Time DESC;
Even if the DB is heavily indexed, indexes get fragmented. I'd advise running this to check the current fragmentation:
select a.*,b.AverageFragmentation from
( SELECT tbl.name AS [Table_Name], tbl.object_id, i.name AS [Name], i.index_id, CAST(CASE i.index_id WHEN 1 THEN 1 ELSE 0 END AS bit) AS [IsClustered],
CAST(case when i.type=3 then 1 else 0 end AS bit) AS [IsXmlIndex], CAST(case when i.type=4 then 1 else 0 end AS bit) AS [IsSpatialIndex]
FROM
sys.tables AS tbl
INNER JOIN sys.indexes AS i ON (i.index_id > 0 and i.is_hypothetical = 0) AND (i.object_id=tbl.object_id))a
inner join
( SELECT tbl.object_id, i.index_id, fi.avg_fragmentation_in_percent AS [AverageFragmentation]
FROM
sys.tables AS tbl
INNER JOIN sys.indexes AS i ON (i.index_id > 0 and i.is_hypothetical = 0) AND (i.object_id=tbl.object_id)
INNER JOIN sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS fi ON fi.object_id=CAST(i.object_id AS int) AND fi.index_id=CAST(i.index_id AS int)
)b
on a.object_id=b.object_id and a.index_id=b.index_id
order by AverageFragmentation desc
You can also use Azure Automation to schedule automatic rebuilding of fragmented indexes; see the answer at: Why my Azure SQL Database indexes are still fragmented?

Query runs in less than a millisecond in SQL, but times out in Entity Framework

The following linq-to-entities query throws
Entity Framework Timeout expired. The timeout period elapsed prior to
completion of the operation or the server is not responding.
after ToList()ing it.
var q = (from contact
in cDB.Contacts.Where(x => x.Templategroepen.Any(z => z.Autonummer == templategroep.Autonummer)
&& !x.Uitschrijvings.Any(t => t.Templategroep.Autonummer == templategroep.Autonummer))
select contact.Taal).Distinct();
((System.Data.Objects.ObjectQuery)q).ToTraceString() gives me:
SELECT
    [Distinct1].[Taal] AS [Taal]
FROM ( SELECT DISTINCT
        [Extent1].[Taal] AS [Taal]
    FROM [dbo].[ContactSet] AS [Extent1]
    WHERE ( EXISTS (SELECT
            1 AS [C1]
        FROM [dbo].[TemplategroepContact] AS [Extent2]
        WHERE ([Extent1].[Autonummer] = [Extent2].[Contacts_Autonummer]) AND ([Extent2].[Templategroepen_Autonummer] = @p__linq__0)
    )) AND ( NOT EXISTS (SELECT
            1 AS [C1]
        FROM [dbo].[UitschrijvingenSet] AS [Extent3]
        WHERE ([Extent1].[Autonummer] = [Extent3].[Contact_Autonummer]) AND ([Extent3].[Templategroep_Autonummer] = @p__linq__1)
    ))
) AS [Distinct1]
The query from the trace string runs in under 1 second in SQL Server Management Studio, but times out when actually calling ToList() on it. How is that possible?
Update: added the SQL Profiler output for the query. This runs as slowly as the EF ToList() (>30 seconds):
exec sp_executesql N'SELECT
    [Distinct1].[Taal] AS [Taal]
FROM ( SELECT DISTINCT
        [Extent1].[Taal] AS [Taal]
    FROM [dbo].[ContactSet] AS [Extent1]
    WHERE ( EXISTS (SELECT
            1 AS [C1]
        FROM [dbo].[TemplategroepContact] AS [Extent2]
        WHERE ([Extent1].[Autonummer] = [Extent2].[Contacts_Autonummer]) AND ([Extent2].[Templategroepen_Autonummer] = @p__linq__0)
    )) AND ( NOT EXISTS (SELECT
            1 AS [C1]
        FROM [dbo].[UitschrijvingenSet] AS [Extent3]
        WHERE ([Extent1].[Autonummer] = [Extent3].[Contact_Autonummer]) AND ([Extent3].[Templategroep_Autonummer] = @p__linq__1)
    ))
) AS [Distinct1]',N'@p__linq__0 int,@p__linq__1 int',@p__linq__0=1,@p__linq__1=1
I observed this issue with EF6.
await _context.Database.SqlQuery<MyType>(sql) was timing out even when my timeout value was cranked up to 60 seconds. However, executing the exact same SQL (used profiler to confirm the sql I passed in was unmodified) in SSMS yielded expected results in one second.
exec sp_updatestats
Fixed the issue for me.
(DBCC FREEPROCCACHE)
DBCC DROPCLEANBUFFERS
made the problem go away for now, but I think that might just be a temporary solution.
I know this is a little late, but I found the answer here.
Basically Entity Framework likes to track everything by default. If you don't need it (i.e. not inserting or updating or deleting entities), turn it off to speed up your queries.
If you're using Entity Framework Code First you can achieve this like so:
var q = (from contact
in cDB.Contacts.AsNoTracking()
.Where(x => x.Templategroepen.Any(z => z.Autonummer == templategroep.Autonummer)
&& !x.Uitschrijvings.Any(t => t.Templategroep.Autonummer == templategroep.Autonummer))
select contact.Taal).Distinct();
I had a similar issue with EF6. When using the SqlQuery function in EF, I got a timeout although the query executed in milliseconds in Management Studio. I found that it happened due to the value of one of the SQL parameters that I used in the EF query. To make it clear, below is a query similar to the one I had trouble with.
SELECT * FROM TBL WHERE field1 > @p1 AND field2 > @p2 AND field3 < @p3
When @p1 was zero, I received a timeout exception. When I set it to 1 or some other value, the query executed in milliseconds. By the way, the table that I queried has more than 20M rows.
I hope it helps,
Best
You need to add a column that serves as a unique ID or key for the table to work in EF.

Finding what is writing to the Transaction Log in SQL Server?

Is there a way to see what is writing to the transaction log?
I have a log file that has grown 15 Gigs in the last 20 minutes. Is there a way for me to track down what is causing this?
Activity monitor will show you what is executing.
DBCC OPENTRAN will show the oldest open transaction.
There is also the dynamic management view sys.dm_tran_active_transactions. For example, here's a query that shows you log file usage by process:
-- This query returns log file space used by all running transactions.
select
SessionTrans.session_id as [SPID],
enlist_count as [Active Requests],
ActiveTrans.transaction_id as [ID],
ActiveTrans.name as [Name],
ActiveTrans.transaction_begin_time as [Start Time],
case transaction_type
when 1 then 'Read/Write'
when 2 then 'Read-Only'
when 3 then 'System'
when 4 then 'Distributed'
else 'Unknown - ' + convert(varchar(20), transaction_type)
end as [Transaction Type],
case transaction_state
when 0 then 'Uninitialized'
when 1 then 'Not Yet Started'
when 2 then 'Active'
when 3 then 'Ended (Read-Only)'
when 4 then 'Committing'
when 5 then 'Prepared'
when 6 then 'Committed'
when 7 then 'Rolling Back'
when 8 then 'Rolled Back'
else 'Unknown - ' + convert(varchar(20), transaction_state)
end as 'State',
case dtc_state
when 0 then NULL
when 1 then 'Active'
when 2 then 'Prepared'
when 3 then 'Committed'
when 4 then 'Aborted'
when 5 then 'Recovered'
else 'Unknown - ' + convert(varchar(20), dtc_state)
end as 'Distributed State',
DB.Name as 'Database',
database_transaction_begin_time as [DB Begin Time],
case database_transaction_type
when 1 then 'Read/Write'
when 2 then 'Read-Only'
when 3 then 'System'
else 'Unknown - ' + convert(varchar(20), database_transaction_type)
end as 'DB Type',
case database_transaction_state
when 1 then 'Uninitialized'
when 3 then 'No Log Records'
when 4 then 'Log Records'
when 5 then 'Prepared'
when 10 then 'Committed'
when 11 then 'Rolled Back'
when 12 then 'Committing'
else 'Unknown - ' + convert(varchar(20), database_transaction_state)
end as 'DB State',
database_transaction_log_record_count as [Log Records],
database_transaction_log_bytes_used / 1024 as [Log KB Used],
database_transaction_log_bytes_reserved / 1024 as [Log KB Reserved],
database_transaction_log_bytes_used_system / 1024 as [Log KB Used (System)],
database_transaction_log_bytes_reserved_system / 1024 as [Log KB Reserved (System)],
database_transaction_replicate_record_count as [Replication Records],
command as [Command Type],
total_elapsed_time as [Elapsed Time],
cpu_time as [CPU Time],
wait_type as [Wait Type],
wait_time as [Wait Time],
wait_resource as [Wait Resource],
reads as [Reads],
logical_reads as [Logical Reads],
writes as [Writes],
SessionTrans.open_transaction_count as [Open Transactions(SessionTrans)],
ExecReqs.open_transaction_count as [Open Transactions(ExecReqs)],
open_resultset_count as [Open Result Sets],
row_count as [Rows Returned],
nest_level as [Nest Level],
granted_query_memory as [Query Memory],
SUBSTRING(SQLText.text, ExecReqs.statement_start_offset/2 + 1, (CASE WHEN ExecReqs.statement_end_offset = -1 then LEN(CONVERT(nvarchar(max), SQLText.text)) * 2 ELSE ExecReqs.statement_end_offset end - ExecReqs.statement_start_offset)/2 + 1) AS query_text
from
sys.dm_tran_active_transactions ActiveTrans (nolock)
inner join sys.dm_tran_database_transactions DBTrans (nolock)
on DBTrans.transaction_id = ActiveTrans.transaction_id
inner join sys.databases DB (nolock)
on DB.database_id = DBTrans.database_id
left join sys.dm_tran_session_transactions SessionTrans (nolock)
on SessionTrans.transaction_id = ActiveTrans.transaction_id
left join sys.dm_exec_requests ExecReqs (nolock)
on ExecReqs.session_id = SessionTrans.session_id
and ExecReqs.transaction_id = SessionTrans.transaction_id
outer apply sys.dm_exec_sql_text(ExecReqs.sql_handle) AS SQLText
where SessionTrans.session_id is not null -- comment this out to see SQL Server internal processes
If your transaction log has grown so much in such a short time, a lot of statements that change data or structure must have been executed. If your database works with large BLOB records, try looking there first.
Profiler won't help you much in finding out what happened previously, but it can help if the growth is still going on.
If you want to read the transaction log itself, you will need a third-party transaction log reader. The best solution on the market is ApexSQL Log, which has saved me a couple of times in similar situations.
However, if your database is running on SQL Server 2000, you can try SQL Log Rescue from Red Gate because it's free. A third option is to try to find Lumigent Log Explorer (the product is discontinued, but maybe you can find it somewhere online).
Try them all and see which one works best for you.
You can use SQL Server Profiler, which shows every transaction executed along with its start time, end time and other details; I think it will let you see what is causing your problem.
I hope this helps.

Resources