Space in SQL Server 2008 R2 slows down performance - sql-server

I have run into a rather weird problem. I have created the following query in SQL Server
SELECT * FROM leads.BatchDetails T1
INNER JOIN leads.BatchHeader h ON T1.LeadBatchHeaderId = h.ID
WHERE
T1.LeadBatchHeaderId = 34
AND (T1.TypeRC = 'R' OR h.DefaultTypeRC = 'R')
AND EXISTS (SELECT ID FROM leads.BatchDetails T2 where
T1.FirstName = T2.FirstName AND
T1.LastName = T2.LastName AND
T1.Address1 = T2.Address1 AND
T1.City = T2.City AND
T1.[State] = T2.[State] AND
T1.Zip5 = T2.Zip5 AND
T1.LeadBatchHeaderId = T2.LeadBatchHeaderId
and t2.ID < t1.ID
AND (T2.TypeRC = 'R' OR h.DefaultTypeRC = 'R' )
)
It runs decently fast, in 2 seconds. While formatting the code I accidentally added an additional space between AND and EXISTS, so the query looks like this:
SELECT * FROM leads.BatchDetails T1
INNER JOIN leads.BatchHeader h ON T1.LeadBatchHeaderId = h.ID
WHERE
T1.LeadBatchHeaderId = 34
AND (T1.TypeRC = 'R' OR h.DefaultTypeRC = 'R')
AND  EXISTS (SELECT ID FROM leads.BatchDetails T2 where
T1.FirstName = T2.FirstName AND
T1.LastName = T2.LastName AND
T1.Address1 = T2.Address1 AND
T1.City = T2.City AND
T1.[State] = T2.[State] AND
T1.Zip5 = T2.Zip5 AND
T1.LeadBatchHeaderId = T2.LeadBatchHeaderId
and t2.ID < t1.ID
AND (T2.TypeRC = 'R' OR h.DefaultTypeRC = 'R' )
)
All of a sudden the query takes 13 seconds to execute.
I am running SQL Server in an isolated sandbox environment, and I have even tested it on a different sandbox. I also checked the executed query in Profiler: the reads are virtually the same, but CPU time is way up.
If this is not weird enough, it gets weirder. When I change SELECT * FROM to SELECT Field1, ... FROM at the top of the query, the execution takes over 3 minutes.
I have been working with SQL Server for 10 years and have never seen anything like this.
Edit:
After following the suggestions below, it appears that queries are "white-space-sensitive". However, I still have no idea why SELECT * FROM is so much faster than SELECT Field1, ... FROM.

I would guess that you're dealing with two different cached query plans:
You ran the query once, with a certain set of parameters. SQL Server determined an appropriate query plan and stored it "auto-parameterized", in other words with the values you provided replaced by variables for the purposes of the query plan.
You then ran the same query again, with different parameters. The query gets auto-parameterized and matches the existing cached query plan (even though that query plan may not be optimal for the new parameters provided!).
You then ran this second query again, with your extra space. This time, the auto-parameterized query does NOT match anything in the cache, and therefore gets its own plan based on THIS set of parameters (remember, the first plan was for a different set of parameters). This query plan happens to end up faster (or slower).
If this is truly the explanation, you should be able to make the effect go away, by running DBCC FREEPROCCACHE: http://msdn.microsoft.com/en-us/library/ms174283.aspx
There's lots of stuff on auto-parameterization out there, I personally liked Gail Shaw's series:
http://sqlinthewild.co.za/index.php/2007/11/27/parameter-sniffing/
http://sqlinthewild.co.za/index.php/2008/02/25/parameter-sniffing-pt-2/
http://sqlinthewild.co.za/index.php/2008/05/22/parameter-sniffing-pt-3/
(For the record, I have no idea whether SQL Server eliminates/normalizes whitespace before storing an auto-parameterized query plan; I would have assumed so, but this entire answer assumes that it doesn't!)
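One way to check the two-cached-plans theory is to look at the plan cache directly. The following is only a sketch: it assumes you have VIEW SERVER STATE permission, and the LIKE filter on the table name is an assumption that you may need to adjust to your own query text.

```sql
-- List cached plans whose text mentions the table from the question.
-- Each distinct query text gets its own row, with its own use count.
SELECT cp.usecounts,
       cp.objtype,
       cp.size_in_bytes,
       st.text
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
WHERE st.text LIKE N'%leads.BatchDetails%'
  AND st.text NOT LIKE N'%dm_exec_cached_plans%'  -- exclude this diagnostic query itself
ORDER BY cp.usecounts DESC;
```

If the variant with one space and the variant with two spaces show up as separate rows, they were cached separately and can have different plans.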

This might very well be related to caching issues. When you change your query, even by as little as a space, the cached execution plan of your previous query will no longer be used. If my answer is correct, you should see the same (2 seconds) performance when you run the bottom query for the second time...
Just my 2 cents
You could flush the cache with the following two statements:
DBCC FreeProcCache
DBCC DROPCLEANBUFFERS

Related

Stored procedure was fine but then became extremely slow [duplicate]

This question already has answers here:
SQL Server: Query fast, but slow from procedure
My stored procedure was taking around 10 seconds, but suddenly (for unknown reasons) it became very slow (taking 9 minutes).
I did not make any changes that could have caused the delay.
I wonder if someone can tell me why it is so slow.
Here is my query:
SELECT
P.PtsID, P.PtsCode, P.PtsName,
FORMAT(P.DOB, 'dd/MM/yyyy') AS DOB, P.Gender, V.VisitID, V.VisitType,
FORMAT(V.VisitDate, 'dd/MM/yyyy') AS VisitDate,
FORMAT(V.DischargeDate, 'dd/MM/yyyy') AS DischargeDate,
R.RepID, R.RepDate, R.RepType, R.RepDesc
FROM
Patients P
INNER JOIN
Visits V ON P.PtsID = V.PtsID
INNER JOIN
Reps R ON R.PtsID = P.PtsID AND R.VisitID = V.VisitID
WHERE
(P.Deleted = 0 AND V.Deleted = 0 AND R.Deleted = 0)
AND (P.PtsName LIKE '%'+TRIM(@PtsName)+'%' OR TRIM(@PtsName) = '')
AND (P.PtsCode LIKE '%'+TRIM(@PtsNo)+'%' OR TRIM(@PtsNo) = '')
AND (R.RepText LIKE '%'+TRIM(@RepText)+'%' OR TRIM(@RepText) = '')
AND (TRIM(@RepCode) = '' OR R.RepID IN (SELECT RepID
FROM tags
WHERE tag = 'XXX'
AND Deleted = 0
AND code IN (SELECT value
FROM string_split(@RepCode,','))))
and this is the execution plan
When I execute the script as an ad-hoc query, not as stored procedure, it is very fast.
Edit
Here is my actual execution plan:
https://www.brentozar.com/pastetheplan/?id=ByPLltcZD
Thanks
So, purely based on the execution plan:
Statistics
In your execution plan you can see that the actual numbers and the estimates are far apart.
This can have multiple causes.
Out-of-date statistics and indexes. Try rebuilding indexes and updating statistics.
Updating stats: EXEC sp_updatestats (this might take some time and resources, so if it's a production server, do it outside office hours or batch job windows).
Parameter sniffing can also cause wrong estimates. You can experiment with OPTION(RECOMPILE) or OPTION(OPTIMIZE FOR UNKNOWN) to test whether this is the case.
201 Bucket Problem
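As a rough illustration of the OPTION(RECOMPILE) experiment mentioned above, here is a cut-down version of the query from the question. It is a sketch only: the nvarchar(100) type and the sample value for @PtsName are assumptions, and only the Patients table is used to keep it short.

```sql
-- Hypothetical parameter; in the real procedure this is a procedure parameter.
DECLARE @PtsName nvarchar(100) = N'';

SELECT P.PtsID, P.PtsCode, P.PtsName
FROM Patients AS P
WHERE P.Deleted = 0
  AND (P.PtsName LIKE '%' + TRIM(@PtsName) + '%' OR TRIM(@PtsName) = '')
OPTION (RECOMPILE);  -- compile a fresh plan for the current parameter values on every execution
```

If the procedure becomes fast again with this hint, a plan compiled for other parameter values is a likely suspect. The trade-off is a compilation on every call, so it is best treated as a test before deciding on a permanent fix.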
Query optimization
When you write IN (subquery) and you have a key lookup, as you do here, you are going to have a bad time. For every row returned from the index (NonClusteredIndex-code), the engine needs to access the clustered index to retrieve the RepID, one row at a time. That is painfully slow when there are many rows (in your case: 430.474.500 rows).
You can change this by using EXISTS. In your example:
SELECT
P.PtsID, P.PtsCode, P.PtsName,
FORMAT(P.DOB, 'dd/MM/yyyy') AS DOB, P.Gender, V.VisitID, V.VisitType,
FORMAT(V.VisitDate, 'dd/MM/yyyy') AS VisitDate,
FORMAT(V.DischargeDate, 'dd/MM/yyyy') AS DischargeDate,
R.RepID, R.RepDate, R.RepType, R.RepDesc
FROM
Patients P
INNER JOIN
Visits V ON P.PtsID = V.PtsID
INNER JOIN
Reps R ON R.PtsID = P.PtsID AND R.VisitID = V.VisitID
WHERE
(P.Deleted = 0 AND V.Deleted = 0 AND R.Deleted = 0)
AND (P.PtsName LIKE '%'+TRIM(@PtsName)+'%' OR TRIM(@PtsName) = '')
AND (P.PtsCode LIKE '%'+TRIM(@PtsNo)+'%' OR TRIM(@PtsNo) = '')
AND (R.RepText LIKE '%'+TRIM(@RepText)+'%' OR TRIM(@RepText) = '')
AND (TRIM(@RepCode) = '' OR EXISTS (SELECT 1
FROM tags
WHERE tag = 'XXX'
AND Deleted = 0
AND tags.RepId = r.RepId
AND code IN (SELECT value
FROM string_split(@RepCode,','))))
Index optimizations
If you are still suffering from the key lookup, you may want to change the index so that it also has RepId, either as a key column or as an included column.
You still have 2 other key lookups. You could also solve this with an INCLUDE on the indexes but only if it makes sense. (Can they be used for other queries, is the current query executing frequently, ...)
Doing the trims and concatenation and storing them in a separate variable may give you some minor improvements.
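A minimal sketch of that last suggestion, assuming nvarchar(100) parameters and showing only two of the filters against the Patients table. The variable names ending in Trim and Pattern are made up for the illustration.

```sql
-- Hypothetical parameter values; in the procedure these would be the parameters.
DECLARE @PtsName nvarchar(100) = N' smith ';
DECLARE @PtsNo   nvarchar(100) = N'';

-- Trim once and build each LIKE pattern once, instead of per predicate.
DECLARE @PtsNameTrim    nvarchar(100) = TRIM(@PtsName);
DECLARE @PtsNoTrim      nvarchar(100) = TRIM(@PtsNo);
DECLARE @PtsNamePattern nvarchar(102) = N'%' + @PtsNameTrim + N'%';
DECLARE @PtsNoPattern   nvarchar(102) = N'%' + @PtsNoTrim + N'%';

SELECT P.PtsID, P.PtsCode, P.PtsName
FROM Patients AS P
WHERE P.Deleted = 0
  AND (@PtsNameTrim = N'' OR P.PtsName LIKE @PtsNamePattern)
  AND (@PtsNoTrim   = N'' OR P.PtsCode LIKE @PtsNoPattern);
```

Keep in mind that the optimizer does not sniff the values of local variables the way it sniffs procedure parameters, so this can change the row estimates as well as the readability.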
Wait stats
The following wait stats were also included in the execution plan.
<WaitStats>
<Wait WaitType="RESERVED_MEMORY_ALLOCATION_EXT" WaitTimeMs="1018" WaitCount="2694600"/>
<Wait WaitType="SOS_SCHEDULER_YIELD" WaitTimeMs="514" WaitCount="159327"/>
<Wait WaitType="ASYNC_NETWORK_IO" WaitTimeMs="63" WaitCount="5"/>
<Wait WaitType="MEMORY_ALLOCATION_EXT" WaitTimeMs="25" WaitCount="14639"/>
</WaitStats>
RESERVED_MEMORY_ALLOCATION_EXT and MEMORY_ALLOCATION_EXT show up, but the wait times are small, so there is no issue with wait stats.
As mentioned in another post: SQL Server: Query fast, but slow from procedure
You can use the following workaround:
Slow: SET ANSI_NULLS OFF
Fast: SET ANSI_NULLS ON
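To see which setting a procedure was created with, you can query sys.sql_modules. This is a sketch; dbo.SearchReports is a hypothetical name standing in for the actual procedure, which the question does not name.

```sql
-- uses_ansi_nulls = 0 means the procedure was created with SET ANSI_NULLS OFF.
SELECT OBJECT_SCHEMA_NAME(m.object_id) AS schema_name,
       OBJECT_NAME(m.object_id)        AS procedure_name,
       m.uses_ansi_nulls,
       m.uses_quoted_identifier
FROM sys.sql_modules AS m
WHERE m.object_id = OBJECT_ID(N'dbo.SearchReports');  -- hypothetical procedure name
```

The ANSI_NULLS setting is stored with the procedure when it is created or altered, so changing it means running the CREATE or ALTER PROCEDURE statement again from a session where SET ANSI_NULLS ON is in effect.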

Server CPU spike after RDS Certificate update

AWS has a required SSL certificate update for its RDS instances going out on the 5th. Even though I do not actually use the certificate, I went ahead and ran the update so it was done and I wouldn't have any unexpected downtime. Only the SSL cert should have been updated, as I understand it.
Instead, my CPU usage went from less than 10% while idle to over 80%. I've now isolated the cause of this to a query we run every few seconds to retrieve a list of recent transactions. With some tweaking, the CPU usage has returned to normal levels.
But this query has been in place for a few years without issues, and it's only after this SSL update that it has caused us any grief. My concern is that there is some deeper issue behind the scenes and that changing the query is merely treating a symptom. Before revising the query, I ran all pending updates and rebooted the database, with no change. There was also one other person on the AWS Forums with the same issue, but neither of us was able to get any useful responses. Thankfully the rest of the system seems to be behaving itself, but I want to know what's going on.
In case it can help identify why a query would suddenly use far more resources here is a (simplified) version of the query prior to my tweak.
SELECT Distinct Top (@NumOfTrx) [trx].* ,[c].*
FROM [dbo].[TRX_Transactions] trx
Inner Join #ProdSelection s on ((trx.Code = s.ID and s.ID != 0) or (s.ID = 0 and (trx.TypeID = 1 or StatusID = 4))
or (BatchId > 0 and s.ID in (select b.Code from [dbo].[TRX_Batch] b where b.BatchId = trx.BatchId)))
Join CLI_Details c on trx.UserName = c.UserName
where trx.TransactionDate > DATEADD(Day, -1, GETDATE()) and (trx.Amount >= @Size Or trx.TypeID = 1 or StatusID = 4)
And (@Company = 0 or c.Company = @Company) and (@Agent = '' or [Agent] = @Agent)
order by [trx].[TransactionDate] Desc
Removing the Prod selection join, a filter that is a list of ids that we can operate without for the time being, was what resolved the issue.

What accounts for different execution times between HeidiSQL and SSMS?

When I execute a particular query from Heidi against an MSSQL database, it takes approximately 10 times longer than executing the identical query in SSMS.
They are both being executed against the same server from the same workstation.
What can account for this difference?
Here is the exact query and relative execution times:
SELECT b.ID as BookingID, b.ReservationID, b.RoomID, b.EventName,
b.EventTypeID, b.StatusID, b.DateAdded, build.Value1 as BuildingID,
build.ValueDescription as Building
FROM EMS.dbo.tblBooking b
INNER JOIN EMS.dbo.tblRoom room
ON room.ID = b.RoomID
INNER JOIN ( SELECT deff.Value1, deff.ValueDescription
FROM tblDataExtractionFilter_Fields deff
INNER JOIN tblDataExtractionFilter def ON deff.FilterID = def.ID
WHERE def.Description = '[redacted]'
AND deff.FieldID = 28
AND deff.Show = 0) build
ON room.BuildingID = build.Value1
WHERE b.DateAdded > DATEADD(DAY,-7,GETDATE()) AND (StatusID = 1 OR StatusID = 16);
Heidi: "Duration for 1 query: 1.015 sec."
SSMS: "00:00:01"
I am obviously green, but my understanding was that the execution plan was determined server side and not application side. This leads me to suspect that there is some sort of overhead in Heidi with respect to this query (simpler queries execute MUCH faster so the overhead would not be universal).
This is just a point of curiosity for me. I am still learning. Can anyone offer a clue about what I can check/google/research to try to understand this?
Thanks!
EDIT: The times I have reported do not agree with my statement that the SSMS time is 1/10 that of Heidi. They are both approximately 1 second. My subjective wait time (wall clock time) between execution and display is MUCH faster (and much less than 1 second) in SSMS. Can this be due to SSMS caching the results?
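One thing that can be checked is how much of the time is spent on the server, independent of how long the client takes to display the rows. The sketch below uses SET STATISTICS TIME with a cut-down version of the query above. In SSMS the timings appear in the Messages tab; whether HeidiSQL surfaces these messages is something I have not verified.

```sql
SET STATISTICS TIME ON;   -- report parse/compile and execution times for each statement

SELECT b.ID AS BookingID, b.ReservationID, b.RoomID
FROM EMS.dbo.tblBooking AS b
WHERE b.DateAdded > DATEADD(DAY, -7, GETDATE())
  AND (b.StatusID = 1 OR b.StatusID = 16);

SET STATISTICS TIME OFF;
```

If the reported CPU time is similar from both tools, the remaining difference is more likely to come from fetching and rendering the result set in the client than from the execution plan.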

SQL Server - Multiple select queries hit performance

Recently I ran into an issue where multiple concurrent client requests cause performance problems in the database. I tried a test scenario, and it turned out that when I run the same SELECT query 6 to 7 times (it gets worse with more), performance degrades and execution takes a lot of time. This is the query I tried:
SELECT TOP (100) COUNT(DISTINCT([Doc_Number])) AS "Expression"
FROM (
SELECT *
FROM "dbo"."Dummy_Table" "table_alias"
WHERE ((CAST("table_alias"."ID" AS NVARCHAR)) NOT IN
(
SELECT "PrimaryKey" AS ExceptionKey
FROM dbo.exceptions inner_exceptionStatus
LEFT JOIN dbo.Workflow inner_workflowStates ON
(inner_exceptionStatus."Status"= inner_workflowStates."UUID" AND
inner_exceptionStatus."UUID"= 'CA1662D6-73A2-4692-A765-E7E3EDB66062')
WHERE ("inner_workflowStates"."RemoveFromRecordSet" = 1 AND
"inner_workflowStates"."IsDeleted" = 0) AND
("inner_exceptionStatus"."IsArchived" IS NULL OR
"inner_exceptionStatus"."IsArchived" = 0)))) wrapperQuery
When run alone, the query takes around 1 second. But if we run it in parallel, each query takes a weird amount of time or leads to a timeout.
The only thing that bothers me here is that a SELECT query should be non-blocking, and even with shared locks the queries should get along easily.
I am not sure whether there is anything wrong in the query that adds to the problem.
Any help is deeply appreciated!
Try it this way:
SELECT Count(DISTINCT( [Doc_Number] )) AS Expression
FROM dbo.Dummy_Table table_alias
WHERE NOT EXISTS (SELECT 1
FROM dbo.exceptions inner_exceptionStatus
INNER JOIN dbo.Workflow inner_workflowStates
ON ( inner_exceptionStatus.Status = inner_workflowStates.UUID
AND inner_exceptionStatus.UUID = 'CA1662D6-73A2-4692-A765-E7E3EDB66062' )
WHERE inner_workflowStates.RemoveFromRecordSet = 1
AND inner_workflowStates.IsDeleted = 0
AND ( inner_exceptionStatus.IsArchived IS NULL
OR inner_exceptionStatus.IsArchived = 0 )
AND table_alias.ID = PrimaryKey)
I made a couple of changes:
Changed NOT IN to NOT EXISTS.
Removed the conversion on "table_alias"."ID" because it prevents the use of any index on the "table_alias"."ID" column. If the conversion is really required, then add it back.
Removed Top (100): since there is no Group By, the query returns a single record as the result.
If the query is still running slow, then you need to post the execution plan and make sure the statistics are up to date.
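One caveat when swapping NOT IN for NOT EXISTS: the two only return the same rows when the subquery column cannot be NULL. The self-contained sketch below uses made-up table variables (@docs and @exceptions are not from the question) to show the difference.

```sql
DECLARE @docs       TABLE (ID int NOT NULL);
DECLARE @exceptions TABLE (PrimaryKey int NULL);

INSERT INTO @docs       VALUES (1), (2), (3);
INSERT INTO @exceptions VALUES (2), (NULL);

-- NOT IN: the NULL makes the predicate UNKNOWN for every non-matching row,
-- so no rows qualify and the count is 0.
SELECT COUNT(*) AS not_in_count
FROM @docs AS d
WHERE d.ID NOT IN (SELECT e.PrimaryKey FROM @exceptions AS e);

-- NOT EXISTS: the NULL row never matches, so IDs 1 and 3 qualify and the count is 2.
SELECT COUNT(*) AS not_exists_count
FROM @docs AS d
WHERE NOT EXISTS (SELECT 1 FROM @exceptions AS e WHERE e.PrimaryKey = d.ID);
```

If PrimaryKey in dbo.exceptions is nullable, it is worth checking which of the two behaviours the report actually needs before adopting the rewrite.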
You can simplify your query like this:
SELECT COUNT(DISTINCT(Doc_Number)) AS Expression
FROM dbo.Dummy_Table dmy
WHERE not exists
(
SELECT *
FROM dbo.exceptions ies
INNER JOIN dbo.Workflow iws ON ies.Status= iws.UUID AND ies.UUID= 'CA1662D6-73A2-4692-A765-E7E3EDB66062'
WHERE iws.RemoveFromRecordSet = 1 AND iws.IsDeleted = 0 AND (ies.IsArchived IS NULL OR ies.IsArchived = 0)
and dmy.ID=PrimaryKey
)
As prdp says:
Changed NOT IN to NOT EXISTS.
Removed the conversion on "table_alias"."ID" because it prevents the use of any index on the "table_alias"."ID" column. If the conversion is really required, then add it back.
Removed Top (100): since there is no Group By, the query returns a single record as the result.
I would add:
Remove the wrapperQuery derived table.
You can use INNER JOIN, because the WHERE clause tests RemoveFromRecordSet = 1, which already removes the NULL rows produced by the LEFT JOIN.
Remove the unnecessary quotes, brackets and parentheses in the WHERE clause.

Query runs in less than a millisecond in SQL, but times out in Entity Framework

The following linq-to-entities query throws
Entity Framework Timeout expired. The timeout period elapsed prior to
completion of the operation or the server is not responding.
after calling ToList() on it.
var q = (from contact
in cDB.Contacts.Where(x => x.Templategroepen.Any(z => z.Autonummer == templategroep.Autonummer)
&& !x.Uitschrijvings.Any(t => t.Templategroep.Autonummer == templategroep.Autonummer))
select contact.Taal).Distinct();
((System.Data.Objects.ObjectQuery)q).ToTraceString() gives me:
SELECT
[Distinct1].[Taal] AS [Taal]
FROM ( SELECT DISTINCT
[Extent1].[Taal] AS [Taal]
FROM [dbo].[ContactSet] AS [Extent1]
WHERE ( EXISTS (SELECT
1 AS [C1]
FROM [dbo].[TemplategroepContact] AS [Extent2]
WHERE ([Extent1].[Autonummer] = [Extent2].[Contacts_Autonummer]) AND ([Extent2].[Templategroepen_Autonummer] = @p__linq__0)
)) AND ( NOT EXISTS (SELECT
1 AS [C1]
FROM [dbo].[UitschrijvingenSet] AS [Extent3]
WHERE ([Extent1].[Autonummer] = [Extent3].[Contact_Autonummer]) AND ([Extent3].[Templategroep_Autonummer] = @p__linq__1)
))
) AS [Distinct1]
The query from the trace string runs in under 1 second in SQL Server Management Studio, but times out when actually calling ToList() on it. How is that possible?
Update: added the SQL Profiler output for the query. This runs as slowly as the EF ToList() (>30 seconds):
exec sp_executesql N'SELECT
[Distinct1].[Taal] AS [Taal]
FROM ( SELECT DISTINCT
[Extent1].[Taal] AS [Taal]
FROM [dbo].[ContactSet] AS [Extent1]
WHERE ( EXISTS (SELECT
1 AS [C1]
FROM [dbo].[TemplategroepContact] AS [Extent2]
WHERE ([Extent1].[Autonummer] = [Extent2].[Contacts_Autonummer]) AND ([Extent2].[Templategroepen_Autonummer] = @p__linq__0)
)) AND ( NOT EXISTS (SELECT
1 AS [C1]
FROM [dbo].[UitschrijvingenSet] AS [Extent3]
WHERE ([Extent1].[Autonummer] = [Extent3].[Contact_Autonummer]) AND ([Extent3].[Templategroep_Autonummer] = @p__linq__1)
))
) AS [Distinct1]',N'@p__linq__0 int,@p__linq__1 int',@p__linq__0=1,@p__linq__1=1
I observed this issue with EF6.
await _context.Database.SqlQuery<MyType>(sql) was timing out even when my timeout value was cranked up to 60 seconds. However, executing the exact same SQL (used profiler to confirm the sql I passed in was unmodified) in SSMS yielded expected results in one second.
exec sp_updatestats
Fixed the issue for me.
(DBCC FREEPROCCACHE)
DBCC DROPCLEANBUFFERS
made the problem go away for now, but I think that might just be a temp. solution
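If clearing the whole plan cache feels too drastic, a narrower variant is to evict only the plan for the offending statement. The sketch below assumes VIEW SERVER STATE and ALTER SERVER STATE permissions, and the LIKE filter on UitschrijvingenSet is an assumption that should be tightened so it matches only the intended statement.

```sql
DECLARE @plan_handle varbinary(64);

-- Pick the most-used cached plan whose text mentions the table from the query.
SELECT TOP (1) @plan_handle = cp.plan_handle
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
WHERE st.text LIKE N'%UitschrijvingenSet%'
  AND st.text NOT LIKE N'%dm_exec_cached_plans%'  -- exclude this batch itself
ORDER BY cp.usecounts DESC;

-- Remove just that plan; the next execution compiles a new one.
IF @plan_handle IS NOT NULL
    DBCC FREEPROCCACHE (@plan_handle);
```

Like the full cache flush, this only forces a recompile. If the new plan is again built for an unrepresentative parameter value, the slowdown can come back.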
I know this is a little late, but I found the answer here.
Basically Entity Framework likes to track everything by default. If you don't need it (i.e. not inserting or updating or deleting entities), turn it off to speed up your queries.
If you're using Entity Framework Code First you can achieve this like so:
var q = (from contact
in cDB.Contacts.AsNoTracking()
.Where(x => x.Templategroepen.Any(z => z.Autonummer == templategroep.Autonummer)
&& !x.Uitschrijvings.Any(t => t.Templategroep.Autonummer == templategroep.Autonummer))
select contact.Taal).Distinct();
I had a similar issue with EF6. When using the SqlQuery function in EF, I got a timeout although the query executed in milliseconds in Management Studio. I found that it happened because of the value of one of the SQL parameters that I used in the EF query. To make it clear, below is a query similar to the one I had trouble with.
SELECT * FROM TBL WHERE field1 > @p1 AND field2 > @p2 AND field3 < @p3
When @p1 is zero, I received a timeout exception. When I made it 1 or something different, it executed in milliseconds. By the way, the table that I queried has more than 20M rows.
I hope it helps,
Best
You need to add a column that serves as a unique ID or key for the table to work in EF.

Resources