Is it possible to optimise ShowPlan XML in SQL Server - sql-server

I have a reasonably complex query that takes under a second to run once SQL Server has established the query plan (satisfactory). However, the first time the query runs, the ShowPlanXML event takes about 14 seconds according to the profiler (not satisfactory).
Is there any way to optimise the ShowPlanXML so that the first time it runs it completes more quickly?
Or am I going to have to create a plan guide?
For info here is the SQL query (as generated by NHibernate):
SELECT top 20 this_.UserId as UserId55_0_, this_.User_Version as User2_55_0_, this_.User_ApplicationId as User3_55_0_, this_.User_DeletedOn as User4_55_0_, this_.User_CreatedOn as User5_55_0_, this_.User_ModifiedOn as User6_55_0_, this_.User_CreatedById as User7_55_0_, this_.User_CreatedByName as User8_55_0_, this_.User_ModifiedById as User9_55_0_, this_.User_ModifiedByName as User10_55_0_, this_.User_Name as User11_55_0_, this_.User_ExternalId as User12_55_0_, this_.User_DynamicFields as User13_55_0_,
this_.User_FirstName as User14_55_0_, this_.User_LastName as User15_55_0_, this_.User_Prefix as User16_55_0_, this_.User_Gender as User17_55_0_, this_.User_Language as
User18_55_0_, this_.User_Code as User19_55_0_, this_.User_Nationality as User20_55_0_, this_.User_FirstLanguage as User21_55_0_, this_.User_DrivingLicence as User22_55_0_,
this_.User_Category as User23_55_0_, this_.User_UserStatus as User24_55_0_, this_.User_UserType as User25_55_0_, this_.User_WorkPhone as User26_55_0_, this_.User_MobilePhone as
User27_55_0_, this_.User_Fax as User28_55_0_, this_.User_Mail as User29_55_0_, this_.User_Login as User30_55_0_, this_.User_Password as User31_55_0_, this_.User_BornOn as
User32_55_0_, this_.User_StartedOn as User33_55_0_, this_.User_FinishedOn as User34_55_0_, this_.User_Address as User35_55_0_, this_.User_PostalCode as User36_55_0_,
this_.User_City as User37_55_0_, this_.User_Country as User38_55_0_, this_.User_PositionTitle as User39_55_0_, this_.User_Comments as User40_55_0_, this_.User_OptionalField1 as
User41_55_0_, this_.User_OptionalField2 as User42_55_0_, this_.User_OptionalField3 as User43_55_0_, this_.User_PasswordConsecutiveFailedAttempts as User44_55_0_,
this_.User_PasswordModificationDate as User45_55_0_, this_.User_WrongPasswordAttemptDate as User46_55_0_, this_.User_PictureUrl as User47_55_0_, this_.User_PasswordModificationStatus as User48_55_0_, this_.User_SecretQuestionConsecutiveFailedAttempts as User49_55_0_, this_.User_PlatformMailTransfer as User50_55_0_, this_.User_TimeZoneId as User51_55_0_, this_.User_ConnectionState as User52_55_0_, this_.User_LastConnectionId as User53_55_0_, this_.User_TotalPercentRealized as User54_55_0_
FROM Dir_User this_
WHERE this_.UserId in (
SELECT distinct this_0_.UserId as y0_
FROM Dir_User this_0_ inner join Dir_UserDynamicGroup dynamicgro3_ on this_0_.UserId=dynamicgro3_.UsDy_UserId
inner join Dir_Group dynamicgro1_ on dynamicgro3_.UsDy_DynamicGroupId=dynamicgro1_.GroupId
WHERE dynamicgro1_.GroupId = (51904517)
and this_0_.User_ApplicationId = 65536
and this_0_.User_DeletedOn is null
and this_0_.UserId in (
SELECT distinct this_0_0_.TargetUserId as y0_
FROM Dir_UserGroupMember this_0_0_
WHERE this_0_0_.OwnerUserId = 7341195
and ( (this_0_0_.Scope & 139280) != 0 or ( (this_0_0_.Scope & 139280) != 0
and this_0_0_.GroupId = this_0_0_.SubGroupId))))
ORDER BY this_.User_Name asc

The show-plan profiler events can have a significant impact on the performance of SQL Server (see sqlserver.query_post_execution_showplan Performance Impact). If you want to get an accurate representation of the amount of time taken to compile a stored procedure you should use an alternative method.
You should be able to identify how much time the plan took to compile by looking at the plan cache directly, see Identifying High Compile Time Statements from the Plan Cache.
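As a rough sketch of the plan-cache approach: the showplan XML kept in the cache records compile cost as attributes on the QueryPlan element (CompileTime and CompileCPU, in milliseconds), so it can be read without enabling any show-plan profiler event. The query below is only an illustration and is not taken from the linked article; the TOP (20) cutoff is arbitrary, and shredding every cached plan can itself be expensive on a busy server.

```sql
-- Read compile time from the cached showplan XML (illustrative sketch).
WITH XMLNAMESPACES (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
SELECT TOP (20)
    st.text AS query_text,
    cp.usecounts,
    -- CompileTime / CompileCPU are attributes of the <QueryPlan> element, in ms
    qp.query_plan.value('(//QueryPlan/@CompileTime)[1]', 'bigint') AS compile_time_ms,
    qp.query_plan.value('(//QueryPlan/@CompileCPU)[1]', 'bigint')  AS compile_cpu_ms
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) AS qp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle)   AS st
WHERE qp.query_plan IS NOT NULL
ORDER BY compile_time_ms DESC;
```

If the 14 seconds reported by the profiler is mostly tracing overhead, the compile_time_ms value for the NHibernate query should come out much lower than that.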
Unfortunately, I'm not aware of many ways to reduce compilation time for SQL Server queries other than reducing the complexity of the query. The standard approach to improving performance is to reduce how often compilation is required by making good use of plan caching.
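One common way to get more out of plan caching is to send the statement with parameters instead of inline literals, so that different values reuse the same cached plan. The sketch below uses only the innermost subquery from the question, simplified; the parameter names and the int types are assumptions, since the real column types are not shown.

```sql
-- Parameterized form of the innermost subquery (simplified, illustrative).
-- @ownerUserId and @scopeMask are hypothetical names; int types are assumed.
DECLARE @sql nvarchar(max) = N'
SELECT DISTINCT m.TargetUserId
FROM Dir_UserGroupMember AS m
WHERE m.OwnerUserId = @ownerUserId
  AND (m.Scope & @scopeMask) <> 0;';

EXEC sys.sp_executesql
    @sql,
    N'@ownerUserId int, @scopeMask int',
    @ownerUserId = 7341195,
    @scopeMask   = 139280;
```

Running this again with a different @ownerUserId should reuse the plan compiled on the first call rather than compiling a new one.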

Related

What accounts for different execution times between HeidiSQL and SSMS?

When I execute a particular query from Heidi against an MSSQL database, it takes approximately 10 times longer than executing the identical query in SSMS.
They are both being executed against the same server from the same workstation.
What can account for this difference?
Here is the exact query and relative execution times:
SELECT b.ID as BookingID, b.ReservationID, b.RoomID, b.EventName,
b.EventTypeID, b.StatusID, b.DateAdded, build.Value1 as BuildingID,
build.ValueDescription as Building
FROM EMS.dbo.tblBooking b
INNER JOIN EMS.dbo.tblRoom room
ON room.ID = b.RoomID
INNER JOIN ( SELECT deff.Value1, deff.ValueDescription
FROM tblDataExtractionFilter_Fields deff
INNER JOIN tblDataExtractionFilter def ON deff.FilterID = def.ID
WHERE def.Description = '[redacted]'
AND deff.FieldID = 28
AND deff.Show = 0) build
ON room.BuildingID = build.Value1
WHERE b.DateAdded > DATEADD(DAY,-7,GETDATE()) AND (StatusID = 1 OR StatusID = 16);
Heidi: "Duration for 1 query: 1.015 sec."
SSMS: "00:00:01"
I am obviously green, but my understanding was that the execution plan was determined server side and not application side. This leads me to suspect that there is some sort of overhead in Heidi with respect to this query (simpler queries execute MUCH faster so the overhead would not be universal).
This is just a point of curiosity for me. I am still learning. Can anyone offer a clue about what I can check/google/research to try to understand this?
Thanks!
EDIT: The times I have reported do not agree with my statement that the SSMS time is 1/10 that of Heidi. They are both approximately 1 second. My subjective wait time (wall clock time) between execution and display is MUCH faster (and much less than 1 second) in SSMS. Can this be due to SSMS caching the results?

QDS not showing anything while DTU is maxed out

I'm trying to identify which query is causing my workload to stall. According to the metrics (the Metrics (preview) tab in the Azure Portal), I see 100% DTU utilization, caused by the CPU.
But when I go to QDS I see a different picture:
And the queries reported by QDS in this period don't take anywhere near as long as the period during which the DTU cap is being hit.
I know that the 1 minute reported by the metrics view is the correct one, since the operation from the user side takes that long and I can see in the Web App telemetry the app not responding in this time period.
So how can I identify the query that hits the DTU limit?
P.S. The db is an S0.
UPDATE
@Alberto Morillo, I've executed the query. A lot of cheap queries were run (~2k), and the largest values for total_worker_time are around 54k (54 ms). On the other hand, I see the wait stats are dominated by SOS_WORK_DISPATCHER.
Does this mean that the queries are blocking because the workers can't be spawned by the scheduler that fast?
Please run the following query:
SELECT TOP 10 q.query_id, p.plan_id,
rs.count_executions,
qsqt.query_sql_text,
-- Query Store reports avg_cpu_time and avg_duration in microseconds
CONVERT(NUMERIC(10,2), (rs.avg_cpu_time/1000000.0)) as 'avg_cpu_time_seconds',
CONVERT(NUMERIC(10,2),(rs.avg_duration/1000000.0)) as 'avg_duration_seconds',
CONVERT(NUMERIC(10,2),rs.avg_logical_io_reads ) as 'avg_logical_io_reads',
CONVERT(NUMERIC(10,2),rs.avg_logical_io_writes ) as 'avg_logical_io_writes',
CONVERT(NUMERIC(10,2),rs.avg_physical_io_reads ) as 'avg_physical_io_reads',
CONVERT(NUMERIC(10,0),rs.avg_rowcount ) as 'avg_rowcount'
from sys.query_store_query q
JOIN sys.query_store_plan p ON q.query_id = p.query_id
JOIN sys.query_store_runtime_stats rs ON p.plan_id = rs.plan_id
INNER JOIN sys.query_store_query_text qsqt
ON q.query_text_id = qsqt.query_text_id
WHERE rs.last_execution_time > dateadd(hour, -1, getutcdate())
ORDER BY rs.avg_duration DESC
Also try changing the ORDER BY clause to rs.avg_cpu_time and to rs.avg_rowcount.
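When the workload consists of many cheap queries, as the update to the question suggests, sorting by average cost can hide the culprit, because a query that is cheap per execution can still dominate CPU if it runs thousands of times. A possible variation, offered as a sketch rather than a definitive diagnostic, ranks queries by total CPU over the last hour (executions multiplied by average CPU time, converted from microseconds to seconds):

```sql
-- Rank queries by total CPU consumed in the last hour (illustrative sketch).
SELECT TOP (10)
    q.query_id,
    qsqt.query_sql_text,
    SUM(rs.count_executions) AS total_executions,
    -- avg_cpu_time is in microseconds; divide by 1,000,000 for seconds
    CONVERT(NUMERIC(18,2),
            SUM(rs.count_executions * rs.avg_cpu_time) / 1000000.0) AS total_cpu_seconds
FROM sys.query_store_query AS q
JOIN sys.query_store_plan AS p           ON q.query_id = p.query_id
JOIN sys.query_store_runtime_stats AS rs ON p.plan_id = rs.plan_id
JOIN sys.query_store_query_text AS qsqt  ON q.query_text_id = qsqt.query_text_id
WHERE rs.last_execution_time > DATEADD(hour, -1, GETUTCDATE())
GROUP BY q.query_id, qsqt.query_sql_text
ORDER BY total_cpu_seconds DESC;
```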

Merge statement optimization

I have two tables in SQL Server, one of which is the source for a MERGE operation into the other.
The source table has 30 million records.
The target table has 180 million records. Both tables have 227 columns.
I do have SSIS, but I'm told in this case, a MERGE statement is the better option. Below is a shortened version of it:
;WITH MySource as (
SELECT * FROM [STAGE].[dbo].[STAGE_TABLE]
)
MERGE [EDW].[dbo].[TARGET_TABLE] AS MyTarget
USING MySource
ON MySource.[ID_FIELD] = MyTarget.[ID_FIELD]
AND MySource.[LoadDate] >= MyTarget.[LoadDate]
WHEN MATCHED THEN UPDATE SET
<<Target Column>> = MySource.<<Source Column>> --227 columns
WHEN NOT MATCHED THEN INSERT
(
[ID_FIELD],
[LoadDate],
<<225 Other Columns>>
)
VALUES (
MySource.[ID_FIELD],
MySource.[LoadDate],
MySource.<<225 other columns>>
);
The only change I made to the script above was truncating the list of columns to keep the code block here short.
My problem is that execution hangs. The profile screen shows the session suspended on a CXPACKET wait with the description: cwaitpipenewrow, node=2.
How do I troubleshoot this? Thank you.
It seems that CXPACKET in the suspended state means that some threads of the parallel plan have finished their share of the work and are waiting on other threads that have not completed yet.
Please check the links below. The query needs to update up to 1 billion values in the table, so it is going to be slow.
https://dba.stackexchange.com/questions/96346/cxpacket-suspended-and-null-wait-type
https://www.sqlshack.com/troubleshooting-the-cxpacket-wait-type-in-sql-server/
Hope these articles might help you debug.
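As a starting point for troubleshooting, it can help to look at what each thread of the MERGE session is waiting on, since the threads reporting CXPACKET are usually the ones waiting for a sibling thread that is held up by something else (for example I/O or a lock). This is a minimal sketch; the session id 53 is a hypothetical placeholder for the session running the MERGE.

```sql
-- List the current waits for every thread of one session (illustrative sketch).
DECLARE @session_id int = 53;  -- hypothetical: replace with the MERGE session's id

SELECT
    wt.session_id,
    wt.exec_context_id,          -- one row per worker thread of the parallel plan
    wt.wait_type,
    wt.wait_duration_ms,
    wt.blocking_session_id,
    wt.blocking_exec_context_id,
    wt.resource_description
FROM sys.dm_os_waiting_tasks AS wt
WHERE wt.session_id = @session_id
ORDER BY wt.wait_duration_ms DESC;
```

Rows whose wait_type is something other than CXPACKET are the ones worth investigating first.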

Space in SQL Server 2008 R2 slows down performance

I have run into a rather weird problem. I have created the following query in SQL Server
SELECT * FROM leads.BatchDetails T1
INNER JOIN leads.BatchHeader h ON T1.LeadBatchHeaderId = h.ID
WHERE
T1.LeadBatchHeaderId = 34
AND (T1.TypeRC = 'R' OR h.DefaultTypeRC = 'R')
AND EXISTS (SELECT ID FROM leads.BatchDetails T2 where
T1.FirstName = T2.FirstName AND
T1.LastName = T2.LastName AND
T1.Address1 = T2.Address1 AND
T1.City = T2.City AND
T1.[State] = T2.[State] AND
T1.Zip5 = T2.Zip5 AND
T1.LeadBatchHeaderId = T2.LeadBatchHeaderId
and t2.ID < t1.ID
AND (T2.TypeRC = 'R' OR h.DefaultTypeRC = 'R' )
)
It runs decently fast, in 2 seconds. When formatting the code, I accidentally added an additional SPACE between AND + EXISTS, so the query looks like this.
SELECT * FROM leads.BatchDetails T1
INNER JOIN leads.BatchHeader h ON T1.LeadBatchHeaderId = h.ID
WHERE
T1.LeadBatchHeaderId = 34
AND (T1.TypeRC = 'R' OR h.DefaultTypeRC = 'R')
AND  EXISTS (SELECT ID FROM leads.BatchDetails T2 where
T1.FirstName = T2.FirstName AND
T1.LastName = T2.LastName AND
T1.Address1 = T2.Address1 AND
T1.City = T2.City AND
T1.[State] = T2.[State] AND
T1.Zip5 = T2.Zip5 AND
T1.LeadBatchHeaderId = T2.LeadBatchHeaderId
and t2.ID < t1.ID
AND (T2.TypeRC = 'R' OR h.DefaultTypeRC = 'R' )
)
All of a sudden the query takes 13 seconds to execute.
I am running SQL Server in an isolated sandbox environment, and I have even tested it on a different sandbox. I also checked the executed query in the profiler: the reads are virtually the same, but CPU time is way up.
If this is not weird enough, it's getting weirder. When I change SELECT * FROM to SELECT Field1, ... FROM at the top of the query the execution takes over 3 minutes.
I have been working with SQL Server for 10 years and never seen anything like this.
Edit:
After following the suggestions below it appears that queries are "white-space-sensitive". However I still have no idea why the SELECT * FROM is so much faster than SELECT Field1, ... FROM
I would guess that you're dealing with two different cached query plans:
You ran the query once, with a certain set of parameters. SQL Server determined an appropriate query plan and stored that query plan "auto-parameterized", in other words replacing the values you provided with variables, for the purposes of the query plan.
You then ran the same query again, with different parameters. The query got auto-parameterized and matched the existing cached query plan (even though that query plan may not be optimal for the new parameters provided!).
You then ran this second query again, with your extra space. This time, the auto-parameterized query did NOT match anything in the cache, and therefore got its own plan based on THIS set of parameters (remember, the first plan was for a different set of parameters). This query plan happens to end up faster (or slower).
If this is truly the explanation, you should be able to make the effect go away, by running DBCC FREEPROCCACHE: http://msdn.microsoft.com/en-us/library/ms174283.aspx
There's lots of stuff on auto-parameterization out there, I personally liked Gail Shaw's series:
http://sqlinthewild.co.za/index.php/2007/11/27/parameter-sniffing/
http://sqlinthewild.co.za/index.php/2008/02/25/parameter-sniffing-pt-2/
http://sqlinthewild.co.za/index.php/2008/05/22/parameter-sniffing-pt-3/
(For the record, I have no idea whether SQL Server eliminates/normalizes whitespace before storing an auto-parameterized query plan; I would have assumed so, but this entire answer assumes that it doesn't!)
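One way to check whether the two versions of the query really ended up with separate cached plans is to look in the plan cache for entries that mention the table. This is a sketch under the assumption that filtering on the text leads.BatchDetails is selective enough on the sandbox server:

```sql
-- Show cached plans whose text mentions leads.BatchDetails (illustrative sketch).
SELECT
    cp.usecounts,       -- how many times this cached plan has been reused
    cp.objtype,         -- e.g. Adhoc or Prepared
    cp.size_in_bytes,
    st.text
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
WHERE st.text LIKE N'%leads.BatchDetails%'
  AND st.text NOT LIKE N'%dm_exec_cached_plans%'  -- exclude this diagnostic query itself
ORDER BY cp.usecounts DESC;
```

If the one-space and two-space versions appear as two separate rows, each got its own plan, which would be consistent with the explanation above.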
This might very well be related to caching issues. When you change your query, even by as little as a space, the cached execution plan of your previous query will no longer be used. If my answer is correct, you should see the same (2 seconds) performance when you run the bottom query for the second time...
Just my 2 cents
You could flush the cache with the following two statements:
DBCC FreeProcCache
DBCC DROPCLEANBUFFERS

Choosing SQL join types for Entity Framework 4

I have a query for example:
var personList = context.People;
People is a view that has 2 joins on it and about 2500 rows and takes ~10 seconds to execute.
Looking at the Estimated Execution plan tells me that it is using a nested loop.
Now if I do this:
var personList = context.People.Where(r => r.Surname.Length > -1);
Execution time is under a second and the execution plan is using a Hash Join.
Adding "OPTION (HASH JOIN)" to the generated SQL has the desired effect of increasing performance.
So my question is ...
How can I get the query to use a Hash Join? It can't be added to the view (I tried, it errors).
Is there an option in EF4 that will force this? Or will I have to put it in a stored procedure?
RE: View
SELECT dbo.DecisionResults.ID, dbo.DecisionResults.UserID, dbo.DecisionResults.HasAgreed, dbo.DecisionResults.Comment,
dbo.DecisionResults.DateResponded, Person_1.Forename, Person_1.Surname, Site_1.Name, ISNULL(dbo.DecisionResults.StaffID, - 999)
AS StaffID
FROM dbo.DecisionResults INNER JOIN
Server2.DB2.dbo.Person AS Person_1 ON Person_1.StaffID = dbo.DecisionResults.StaffID INNER JOIN
Server2.DB2.dbo.Site AS Site_1 ON Person_1.SiteID = Site_1.SiteID
ORDER BY Person_1.Surname
If I add OPTION(HASH JOIN) to the end, it errors with:
'Query hints' cannot be used in this query type.
But running that script as a query works fine.
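The difference is that OPTION is a statement-level clause: it is accepted on a top-level query but not inside a view definition. A hint can therefore be attached to the statement that selects from the view rather than to the view itself, for instance in the body of a stored procedure. The sketch below assumes the view is called dbo.People, which is a guess based on context.People; the column list comes from the view definition above.

```sql
-- Apply the join hint to the outer statement instead of the view (illustrative sketch).
SELECT ID, UserID, HasAgreed, Comment, DateResponded,
       Forename, Surname, Name, StaffID
FROM dbo.People          -- hypothetical view name
ORDER BY Surname
OPTION (HASH JOIN);      -- statement-level hint, valid here but not inside the view
```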
