I have a query that (~95% of the time) executes nearly instantly on the production Azure SQL Database. Running the query in SSMS (in production) shows that my non-clustered index is being utilized with an index seek (cost 100%).
However, the database randomly gets into a state where this same query fails to complete: it always times out from the calling application. If I log into SSMS while an episode is occurring, I can execute the query manually and it will eventually complete after several minutes (SSMS imposes no timeout, unlike the calling application).
After I allow the query to execute fully without timing out, I can run it again with instant results, and the calling application can also call it with instant results again. It appears that allowing it to execute to completion clears up whatever issue was occurring and returns execution to normal.
Monitoring the server metrics shows no real issues or spikes in CPU utilization that would suggest the server is in a stressed state during this time. All other queries within the application still execute quickly as normal, even queries that use this same table and non-clustered index.
Table
CREATE TABLE [dbo].[Item] (
[Id] UNIQUEIDENTIFIER NOT NULL,
[UserId] UNIQUEIDENTIFIER NULL,
[Type] TINYINT NOT NULL,
[Data] NVARCHAR (MAX) NULL,
[CreationDate] DATETIME2 (7) NOT NULL,
CONSTRAINT [PK_Item] PRIMARY KEY CLUSTERED ([Id] ASC),
CONSTRAINT [FK_Item_User] FOREIGN KEY ([UserId]) REFERENCES [dbo].[User] ([Id])
);
This table has millions of rows in it.
Index
CREATE NONCLUSTERED INDEX [IX_Item_UserId_Type_IncludeAll]
ON [dbo].[Item]([UserId] ASC, [Type] ASC)
INCLUDE ([Data], [CreationDate]);
Issue Query
SELECT
*
FROM
[dbo].[Item]
WHERE
[UserId] = @UserId
AND [Data] IS NOT NULL
While I was catching it in the act today in SSMS, I also modified the query to remove the AND [Data] IS NOT NULL from the WHERE clause. Ex:
SELECT
*
FROM
[dbo].[Item]
WHERE
[UserId] = @UserId
This query executed instantly and the execution plan shows that it is utilizing the index properly. Adding back AND [Data] IS NOT NULL causes the query to be slow again. The Data column can hold large amounts of JSON data, so I am not sure if that somehow has anything to do with it.
Running sp_WhoIsActive while the episode is occurring and my query is long-running shows that reads, physical_reads, cpu, and used_memory keep increasing as the query continues to execute. Interestingly, the query_plan column is NULL while it is running, so I am not able to see what plan it is actually using, though I can always see the index seek when running it manually afterwards.
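For reference, a typical invocation that asks for the plan of each running request (assuming Adam Machanic's sp_WhoIsActive and its @get_plans parameter) is:
EXEC sp_WhoIsActive @get_plans = 1; -- request the showplan XML for each active request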
Why would this query get into a state where it takes a really long time to execute, while the majority of the time it executes with near-instant results? We can see that it is properly utilizing its non-clustered index as a seek operation.
Why does allowing the query to fully execute in SSMS (vs timing out as the calling application does) seem to clear up the problem going forward?
How can I avoid these types of episodes?
A few things I would check:
1. Your query doesn't have an ideal index. Since you are doing SELECT * and filtering on [Data] IS NOT NULL, a better fit would be something like the index below. Note that [Data] is NVARCHAR(MAX) and therefore cannot be an index key column, so a filtered index is the closest workable form:
CREATE NONCLUSTERED INDEX IX_Item_UserId_DataNotNull
ON dbo.Item ([UserId])
INCLUDE ([Type], [Data], [CreationDate])
WHERE [Data] IS NOT NULL;
2. Try updating statistics and rebuilding the indexes for this table; this will help if there is index fragmentation or stale statistics (see the sketch after this list).
3. Try the OPTION (RECOMPILE) hint to see if parameter sniffing is the problem (also sketched below).
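A minimal sketch of points 2 and 3, assuming the dbo.Item table and the @UserId parameter from the question:
-- Point 2: refresh statistics and rebuild the existing index
UPDATE STATISTICS dbo.Item WITH FULLSCAN;
ALTER INDEX [IX_Item_UserId_Type_IncludeAll] ON dbo.Item REBUILD;
-- Point 3: compile a fresh plan on every execution to rule out parameter sniffing
SELECT *
FROM dbo.Item
WHERE [UserId] = @UserId
AND [Data] IS NOT NULL
OPTION (RECOMPILE);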
Related
I'm trying to understand how to properly use nonclustered indexes. Here is what I found with test data.
CREATE TABLE TestTable
(
RowID int Not Null IDENTITY (1,1),
Continent nvarchar(100),
Location nvarchar(100),
CONSTRAINT PK_TestTable_RowID
PRIMARY KEY CLUSTERED (RowID)
)
ALTER TABLE TestTable
DROP CONSTRAINT PK_TestTable_RowID
GO
INSERT INTO TestTable
SELECT Continent, Location
FROM StgCovid19
DBCC DROPCLEANBUFFERS
DBCC FREEPROCCACHE
SELECT *
FROM TestTable
WHERE Continent = 'Asia' --551ms
CREATE NONCLUSTERED INDEX NCIContinent
ON TestTable(Continent)
DBCC DROPCLEANBUFFERS
DBCC FREEPROCCACHE
SELECT *
FROM TestTable
WHERE Continent = 'Asia' --1083ms
DROP INDEX NCIContinent
ON TestTable
CREATE NONCLUSTERED INDEX NCIContinent
ON TestTable(Continent)
INCLUDE (Location)
DBCC DROPCLEANBUFFERS
DBCC FREEPROCCACHE
SELECT *
FROM TestTable
WHERE Continent = 'Asia' ---530ms
As you can see, when I only add the nonclustered index on the Continent column, it performs a seek yet takes roughly double the time to execute the SELECT. When I add INCLUDE (Location), it takes less time than with no nonclustered index at all.
Can anyone tell me what is going on?
The strategy for accessing the data depends on the table structure, but also, and mainly, on the data distribution. That is why statistics about data distribution are stored for indexes and tables (see the sketch after this list):
In indexes, to know the distribution (histogram) of the key values
In tables, to know the distribution (histogram) of the column values
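As an illustration, you can inspect the histogram SQL Server keeps for an index directly; for the NCIContinent index from the test above, something like:
DBCC SHOW_STATISTICS ('TestTable', 'NCIContinent') WITH HISTOGRAM;
Each histogram step shows a range of Continent values and the estimated number of rows in that range, which is what the optimizer uses to estimate how many rows 'Asia' will return.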
An execution plan is computed as a tree of chained steps, each an algorithm specialized in one action (join, sort, data access...), which together form the program that retrieves the data in response to your demand (the query).
The optimizer's role is to determine, among many possible execution plans, which one is the most interesting, i.e., which will use the least resources (memory, data volume, CPU...). The plan chosen is not necessarily the fastest one, but the one estimated to have the lowest cost in terms of resource usage. This estimate is made by the optimizer on the basis of the statistics.
The test you made does not tell you much, because we do not know the data distribution, and DBCC DROPCLEANBUFFERS has a heavy side effect that is not representative of real-world database workloads. In the real world, 98% of the data used by users is already in cache!
Also, measuring the execution time of a query has two problems:
this metric is not stable and depends on the PC's activity, which is significant even when you are doing nothing. Usually we rerun the test at least 10 times, discard the slowest and the fastest runs, and compute the average of the remaining 8 results
elapsed time is not the only interesting figure, and it can be dominated by the time needed to send the resulting data to the client application. To eliminate this time, SSMS has an option ("Discard results after execution" in the query options) that executes the query without displaying the resulting dataset (see the sketch after this list)
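For more stable figures than wall-clock time alone, one common approach (not specific to this answer, just a standard T-SQL facility) is to compare logical reads and CPU time reported in the Messages tab:
SET STATISTICS IO ON;
SET STATISTICS TIME ON;
SELECT *
FROM TestTable
WHERE Continent = 'Asia';
SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
Logical reads in particular do not depend on whether the pages happen to be in cache, which sidesteps the DBCC DROPCLEANBUFFERS issue mentioned above.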
I am writing a quick-and-dirty application to load sales plan data into SQL Server (2008 FWIW, though I don't think the specific version matters).
The data set is the corporate sales plan: a few thousand rows of Units, Dollars and Price for each combination of customer, part number and month. This data is updated every few weeks, and it's important to track who changed it and what the changes were.
-- Metadata columns are suffixed with ' ##', to enable an automated
-- tool I wrote to handle repetitive tasks such as de-duplication of
-- records whose values didn't change in successive versions of the
-- forecast.
CREATE TABLE [SlsPlan].[PlanDetail]
(
[CustID] [char](15) NOT NULL,
[InvtID] [char](30) NOT NULL,
[FiscalYear] [int] NOT NULL,
[FiscalMonth] [int] NOT NULL,
[Version Number ##] [int] IDENTITY(1,1) NOT NULL,
[Units] [decimal](18, 6) NULL,
[Unit Price] [decimal](18, 6) NULL,
[Dollars] [decimal](18, 6) NULL,
[Batch GUID ##] [uniqueidentifier] NOT NULL,
[Record GUID ##] [uniqueidentifier] NOT NULL DEFAULT (NEWSEQUENTIALID()),
[Time Created ##] [datetime] NOT NULL,
[User ID ##] [varchar](64) NULL DEFAULT (ORIGINAL_LOGIN()),
CONSTRAINT [PlanByProduct_PK] PRIMARY KEY CLUSTERED
([CustID], [InvtID], [FiscalYear], [FiscalMonth], [Version Number ##])
)
To track changes, I'm using an IDENTITY column as part of the primary key, so that multiple versions of a record can exist with the same business key. To track who made the change, and also to enable backing out an entire bad update if someone does something completely stupid, I am inserting the Active Directory logon of the creator of that version of the record, a timestamp, and two GUIDs.
The "Batch GUID" column should be the same for all records in a batch; the "Record GUID" column is obviously unique to that particular record and is used for de-duplication only, not for any sort of query.
I would strongly prefer to generate the batch GUID inside a query rather than by writing a stored procedure that does the obvious:
DECLARE @BatchGUID UNIQUEIDENTIFIER = NEWID()
INSERT INTO MyTable
SELECT I.*, @BatchGUID
FROM InputTable I
I figured the easy way to do this is to construct a single-row result with the timestamp, the user ID and a call to NEWID() to create the batch GUID. Then, do a CROSS JOIN to append that single row to each of the rows being inserted. I tried doing this a couple different ways, and it appears that the query execution engine is essentially executing the GETDATE() once, because a single time stamp appears in all rows (even for a 5-million row test case). However, I get a different GUID for each row in the result set.
The below examples just focus on the query, and omit the insert logic around them.
WITH MySingleRow AS
(
Select NewID() as [Batch GUID ##],
ORIGINAL_LOGIN() as [User ID ##],
getdate() as [Time Created ##]
)
SELECT N.*, R1.*
FROM util.zzIntegers N
CROSS JOIN MySingleRow R1
WHERE N.Sequence < 10000000
In the above query, "util.zzIntegers" is just a table of integers from 0 to 10 million. The query takes about 10 seconds to run on my server with a cold cache, so if SQL Server were executing the GETDATE() function with each row of the main table, it would certainly have a different value at least in the milliseconds column, but all 10 million rows have the same timestamp. But I get a different GUID for each row. As I said before, the goal is to have the same GUID in each row.
I also decided to try a version with an explicit table value constructor in hopes that I would be able to fool the optimizer into doing the right thing. I also ran it against a real table rather than a relatively "synthetic" test like a single-column list of integers. The following produced the same result.
WITH AnotherSingleRow AS
(
SELECT SingleRow.*
FROM (
VALUES (NewID(), Original_Login(), getdate())
)
AS SingleRow(GUID, UserID, TimeStamp)
)
SELECT R1.*, S.*
FROM SalesOrderLineItems S
CROSS JOIN AnotherSingleRow R1
The SalesOrderLineItems table has 6 million rows and 135 columns, to make doubly sure that the runtime was long enough that GETDATE() would increment if SQL Server were completely optimizing away the table value constructor and just calling the function for each row.
I've been lurking here for a while, and this is my first question, so I definitely wanted to do good research and avoid criticism for just throwing a question out there. The following questions on this site deal with GUIDs but aren't directly relevant. I also spent half an hour searching Google with various combinations of phrases, which didn't seem to turn up anything.
Azure actually does what I want, as evidenced in the following question I turned up in my research: Guid.NewGuid() always return same Guid for all rows. However, I'm not on Azure and not going to go there anytime soon.
Someone tried to do the same thing in SSIS (How to insert the same guid in SSIS import), but the answer there was to generate the GUID as an SSIS variable and insert it into each row. I could certainly do the equivalent in a stored procedure, but for the sake of elegance and maintainability (my colleagues have less experience with SQL Server queries than I do), I would prefer to keep the creation of the batch GUID in a query, and to simplify any stored procedures as much as possible.
BTW, my experience level is 1-2 years with SQL Server as a data analyst/SQL developer as part of 10+ years spent writing code, but for the last 20 years I've been mostly a numbers guy rather than an IT guy. Early in my career, I worked for a pioneering database vendor as one of the developers of the query optimizer, so I have a pretty good idea what a query optimizer does, but haven't had time to really dig into how SQL Server does it. So I could be completely missing something that's obvious to others.
Thank you in advance for your help.
I have a table with this structure:
CREATE TABLE Log_File
(
Hostname CHAR(15) NOT NULL,
Line_Number INT NOT NULL,
Log_Line VARCHAR(8000) NOT NULL,
CONSTRAINT pk_Log_File PRIMARY KEY (Hostname, Line_Number)
)
Each server is bulk inserting about 1,000 rows every 5 seconds. Whenever there is a bulk insert, a trigger runs over the inserted pseudo-table, iterating through the Log_Line records with a cursor and updating another table.
When only one server is writing to the Log_File table, I have no issues. When I have 20 servers writing to the table at the same time, I occasionally get a deadlock error and the transaction closes on some of the machines, killing the thread.
This is usually a problem when I start up the application on each server because it has to scan the Log_File table to find the MAX(Line_Number) for itself so it knows where to begin reading its own log file from.
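For reference, that startup scan is presumably something along these lines (hypothetical; a @Hostname parameter is assumed):
SELECT MAX(Line_Number)
FROM Log_File
WHERE Hostname = @Hostname;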
I haven't set an index on this table. Would creating a clustered or nonclustered index help the situation? Unfortunately a cursor is absolutely necessary since I need to iterate through each record in order to deal with islands and gaps.
Any help on reducing deadlocks or making this faster is appreciated!
I tried to examine RID (formerly bookmark) lookup by creating a heap table:
CREATE TABLE [dbo].[CustomerAddress]
(
[CustomerID] [int],
[AddressID] [int],
[ModifiedDate] [datetime]
);
GO
CREATE NONCLUSTERED INDEX x
ON dbo.CustomerAddress(CustomerID, AddressID);
Then, I tried the following query to investigate the execution plan:
SELECT CustomerID, AddressID, ModifiedDate
FROM dbo.CustomerAddress
WHERE CustomerID = 29485;
But, using SSMS, I cannot see a RID lookup in the execution plan; the plan shows a table scan instead.
I'm using SQL Server 2008R2 (version 10.50.4000.0) service pack 2.
PS: This question is based on Aaron Bertrand's article.
A table scan means SQL Server does not use your index. It reads from the "heap". A "heap" is the data storage for tables without a clustered index.
Since it does not touch the index at all, SQL Server does not need a RID lookup to go from the index to the heap.
The reason is probably that SQL Server estimates there might be more than roughly 100 rows for one customer. The optimizer will try to avoid a large number of lookups.
You could try again with an index on just (CustomerID), or by adding AddressID to your WHERE clause.
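A sketch of both suggestions (the index name and the AddressID value are made up for illustration):
-- Variant 1: an index whose key is only CustomerID
CREATE NONCLUSTERED INDEX x2 ON dbo.CustomerAddress (CustomerID);
-- Variant 2: make the predicate more selective so the estimated row count drops
SELECT CustomerID, AddressID, ModifiedDate
FROM dbo.CustomerAddress
WHERE CustomerID = 29485
AND AddressID = 1086;
With a low enough row estimate, the seek plus RID lookup plan becomes cheaper than scanning the heap, and the lookup should appear in the plan.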
When I run the "Top disk usage by table" report in SQL Server Management Studio, it shows one of my tables using about 1.8 GB of disk space.
The table definition:
CREATE TABLE [dbo].[RecipeItems](
[wo_id] [varchar](50) NOT NULL,
[invent_id] [varchar](50) NOT NULL,
[invent_dim_id] [varchar](50) NULL,
[ratio] [float] NOT NULL
) ON [PRIMARY]
I'd roughly estimate that each row takes less than 200 bytes, and with only 7K records, this shouldn't take up more than 1-2 MB. But obviously, this is not the case. What might be the reason this table uses so much storage?
Chances are that a lot of data has been updated or deleted. Since it is a heap, updates can lead to forwarded records. I would try this first:
ALTER TABLE dbo.RecipeItems REBUILD;
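To see whether forwarded records are indeed the problem, and to verify that the rebuild fixed it, one illustrative check (run before and after the rebuild) is:
SELECT page_count, forwarded_record_count, avg_page_space_used_in_percent
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.RecipeItems'), NULL, NULL, 'DETAILED');
-- DETAILED (or SAMPLED) mode is required for forwarded_record_count to be populated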
Next I would consider adding a clustered index.
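For example (the clustered key here is only a guess; choose columns that match how the table is actually queried):
CREATE CLUSTERED INDEX CIX_RecipeItems ON dbo.RecipeItems (wo_id, invent_id);
A clustered index removes the forwarded-record problem entirely, because rows in a clustered index are never forwarded.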
Do not run a shrink database command to fix this table, PLEASE.
When you perform your "delete all and bulk insert" I would do it this way, running a rebuild in the middle:
TRUNCATE TABLE dbo.RecipeItems;
ALTER TABLE dbo.RecipeItems REBUILD;
BULK INSERT dbo.RecipeItems FROM ...
If you add a clustered index you may want to do this a little differently. And if you can't use TRUNCATE, keep using DELETE, obviously. TRUNCATE will cause less log churn if the table is eligible, and since you are wiping out the table and re-populating it, it's not something you seem to need to recover from. In fact you might just consider dropping the table and re-creating it each time.