I am dealing with SQL Server on Azure, and I have found a rare case where queries over a single table are really slow: 10 - 12 seconds for a table with about a thousand rows, while similar tables respond in less than 1 second.
The table definition (CREATE script) is:
CREATE TABLE [dbo].[Content]
(
[ID] [int] IDENTITY(1,1) NOT NULL,
[Content_ID] [int] NOT NULL,
[CultureCode] [nvarchar](50) NOT NULL,
[Version] [int] NOT NULL,
[UserID] [int] NOT NULL,
[Timestamp] [datetime] NOT NULL,
[Title] [nvarchar](max) NOT NULL,
[Subtitle] [nvarchar](max) NULL,
... 14 more [nvarchar](max) FIELDS...
[NotesPlainText] [nvarchar](max) NULL,
CONSTRAINT [PK_dbo.Content]
PRIMARY KEY CLUSTERED ([ID] ASC)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,
IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON,
ALLOW_PAGE_LOCKS = ON)
)
(The columns Content_ID, CultureCode and Version seem to form a unique combination, even though that is not enforced in the table. [ID] is just used as a row identifier.)
Aside from the 17 nvarchar(max) columns, nothing else is weird.
As I said, no unique constraints defined, no indexes...
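For reference, if that combination really is unique, I suppose it could be enforced with a composite unique index like the sketch below, though nothing like it exists in the schema:
CREATE UNIQUE INDEX UX_dbo_Content__ContentID_CultureCode_Version
ON [dbo].[Content]([Content_ID], [CultureCode], [Version])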
So I tuned it up like this:
CREATE TABLE [dbo].[Content_optimized]
(
[ID] [int] IDENTITY(1,1) NOT NULL,
[Content_ID] [int] NOT NULL,
[CultureCode] [nvarchar](50) NOT NULL,
[Version] [int] NOT NULL,
[UserID] [int] NOT NULL,
[Timestamp] [datetime] NOT NULL,
[Title] [nvarchar](max) NOT NULL,
[Subtitle] [nvarchar](max) NULL,
... 14 more [nvarchar](max) FIELDS...
[NotesPlainText] [nvarchar](max) NULL
)
CREATE INDEX IDX_dbo_Content_optimized__Content_ID
ON [dbo].[Content_optimized]([Content_ID])
CREATE INDEX IDX_dbo_Content_optimized__CultureCode
ON [dbo].[Content_optimized]([CultureCode])
CREATE INDEX IDX_dbo_Content_optimized__Version
ON [dbo].[Content_optimized]([Version])
CREATE INDEX IDX_dbo_Content_optimized__UserID
ON [dbo].[Content_optimized]([UserID])
ALTER INDEX ALL ON [dbo].[Content_optimized] REBUILD WITH (FILLFACTOR=90, ONLINE=ON)
and here is where things get weird: I am not saving even a single second of execution time.
Indeed, code like this:
select *
from [Content]
where Content_ID <> 1049
order by Content_ID, Version
select *
from [Content_optimized]
where Content_ID <> 1049
order by Content_ID, Version
gives 53% vs. 47% relative cost in the execution plan (the indexed table is just ~11% cheaper, despite the indexes).
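Looking back, I wonder whether one composite index matching the ORDER BY would have done better than four separate single-column ones; a sketch of what I mean (untested):
CREATE INDEX IDX_dbo_Content_optimized__Content_ID_Version
ON [dbo].[Content_optimized]([Content_ID], [Version])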
I am admittedly quite new to SQL optimisation, so there may be something obvious I am not seeing here, but right now I am lost.
Any help?
Related
I am using SQL Server version 2012. I have a table which has more than 10 million rows. I have to count records using a SQL filter.
My query is this:
select count(*)
from reconcil
where tenantid = 101
which is taking more than 5 minutes for 5 million records.
Is there any faster way to count records?
The RECONCIL table structure is:
CREATE TABLE [dbo].[RECONCIL]
(
[AckCode] [nvarchar](50) NULL,
[AckExpireTime] [int] NULL,
[AckFileName] [nvarchar](255) NULL,
[AckKey] [int] NULL,
[AckState] [int] NULL,
[AppMsgKey] [nvarchar](30) NULL,
[CurWrkActID] [nvarchar](50) NULL,
[Date_Time] [datetime] NULL,
[Direction] [nvarchar](1) NULL,
[ErrorCode] [nvarchar](50) NULL,
[FGLOGKEY] [int] NOT NULL,
[FolderID] [int] NULL,
[FuncGCtrlNo] [nvarchar](14) NULL,
[INLOGKEY] [int] NULL,
[InputFileName] [nvarchar](255) NULL,
[IntCtrlNo] [nvarchar](14) NULL,
[IsAssoDataPresent] [nvarchar](1) NULL,
[JobState] [int] NULL,
[LOGDATA] [nvarchar](max) NULL,
[MessageID] [nvarchar](25) NULL,
[MessageState] [int] NULL,
[MessageType] [int] NULL,
[NextWrkActID] [nvarchar](50) NULL,
[NextWrkHint] [nvarchar](20) NULL,
[NONFAERRORLOG] [nvarchar](max) NULL,
[NumberOfBytes] [int] NULL,
[NumberOfSegments] [int] NULL,
[OutputFileName] [nvarchar](255) NULL,
[Priority] [nvarchar](1) NULL,
[ReceiverID] [nvarchar](30) NULL,
[RecNo] [int] NULL,
[RecordID] [int] IDENTITY(1,1) NOT NULL,
[RelationKey] [int] NULL,
[SEGLOG] [nvarchar](max) NULL,
[SenderID] [nvarchar](30) NULL,
[ServerID] [nvarchar](255) NULL,
[Standard] [int] NULL,
[TenantID] [int] NULL,
[TPAgreementKey] [int] NULL,
[TSetCtrlNo] [nvarchar](35) NULL,
[UserKey1] [nvarchar](255) NULL,
[UserKey2] [nvarchar](255) NULL,
[UserKey3] [nvarchar](255) NULL,
CONSTRAINT [RECONCIL_PK]
PRIMARY KEY CLUSTERED ([RecordID] ASC)
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
Unless you materialize the count, this non-clustered index on TenantID will provide better performance, because it is narrower than the clustered primary key index and will scan only the matching rows:
CREATE INDEX idx ON [dbo].[RECONCIL](TenantID);
If performance of the aggregate query with this index isn't acceptable, you could create an indexed view with the count. The indexed view provides the fastest performance for this query but incurs additional costs for storage and index maintenance on inserts and deletes. Also, queries that modify the table must have the required SET options for indexed views. Those costs may be justified if the count query is executed often.
SQL Server can use the indexed view automatically in Enterprise (or Developer) editions even if not directly referenced in the query as long as the optimizer can match the semantics of the query using the view. In lesser editions, you'll need to query the indexed view directly and specify the NOEXPAND hint.
CREATE VIEW dbo.VW_RECONCIL_COUNT
WITH SCHEMABINDING
AS
SELECT
TenantID
, COUNT_BIG(*) AS TenentRowCount
FROM [dbo].[RECONCIL]
GROUP BY TenantID;
GO
CREATE UNIQUE CLUSTERED INDEX cdx ON dbo.VW_RECONCIL_COUNT(TenantID);
GO
--Enterprise Edition can use the view index automatically
SELECT COUNT_BIG(*) AS TenentRowCount
FROM [dbo].[RECONCIL]
WHERE TenantID = 101
GROUP BY TenantID;
GO
--other editions require the view to be specified plus the NOEXPAND hint
SELECT TenentRowCount
FROM dbo.VW_RECONCIL_COUNT WITH (NOEXPAND)
WHERE TenantID = 101;
GO
As has been suggested, create an index, or even partition your table by TenantID if you have that many rows. Each partition can then be placed on its own filegroup, which can improve performance.
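A minimal sketch of what such partitioning could look like; the boundary values and names here are made up:
CREATE PARTITION FUNCTION PF_Reconcil_Tenant (int)
AS RANGE LEFT FOR VALUES (100, 200, 300);
CREATE PARTITION SCHEME PS_Reconcil_Tenant
AS PARTITION PF_Reconcil_Tenant ALL TO ([PRIMARY]);
-- the clustered index must then include TenantID and be created on the scheme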
select count(tenantid)
from reconcil
where tenantid = 101 group by tenantid ;
Not sure, but try using this.
I wrote the following script in SQL Server 2012, but it fails on a constraint.
I have a table with 20,000,000 rows.
I created the same table with the same index, but partitioned; when I switched
the table, SQL Server failed.
Here is my code:
CREATE DATABASE test
USE test
create Partition Function
[PF_Table_Log] ([DATETIME2](3)) As Range left For VALUES
('2016-04-05 00:00:00.000','2016-04-06 00:00:00.000',
'2016-04-07 00:00:00.000','2016-04-08 00:00:00.000')
Create Partition Scheme PS_Table_Log_Datetime
As Partition [PF_Table_Log]
All To ([Primary]);
create TABLE [Log](
[LogId] [BIGINT] IDENTITY(1,1) NOT NULL,
[ServiceInstanceId] [UNIQUEIDENTIFIER] NULL,
[ServiceId] [UNIQUEIDENTIFIER] NOT NULL,
[Component] [NVARCHAR](100) NULL,
[MachineName] [NVARCHAR](50) NULL,
[Datetime] [DATETIME2](3) NOT NULL,
[Severity] [INT] NOT NULL,
[LogText] [NVARCHAR](max) NULL,
[MessageId] [UNIQUEIDENTIFIER] NULL,
[MessageRole] [INT] NULL
) ON [PRIMARY]
GO
CREATE CLUSTERED INDEX [PK_Log] ON [Log]
(
[LogId] ASC,
[datetime] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF,
DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON,ALLOW_PAGE_LOCKS=ON)
GO
create TABLE [Log_new1](
[LogId] [BIGINT] IDENTITY(1,1) NOT NULL,
[ServiceInstanceId] [UNIQUEIDENTIFIER] NULL,
[ServiceId] [UNIQUEIDENTIFIER] NOT NULL,
[Component] [NVARCHAR](100) NULL,
[MachineName] [NVARCHAR](50) NULL,
[Datetime] [DATETIME2](3) NOT NULL,
[Severity] [INT] NOT NULL,
[LogText] [NVARCHAR](max) NULL,
[MessageId] [UNIQUEIDENTIFIER] NULL,
[MessageRole] [INT] NULL
) ON PS_Table_Log_Datetime (datetime)
GO
CREATE CLUSTERED INDEX [PK_Log] ON [Log_new1]
(
[LogId] ASC,[Datetime] asc
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF)
ON PS_Table_Log_Datetime ([Datetime])
ALTER TABLE [Log] SWITCH TO [Log_new1] PARTITION 5
"I created the same table with same index with partition, but when I
switched the table, SQL Server failed"
This statement isn't consistent with the script. Table Log is not partitioned but Log_new1 is. You need to either partition Log using the same partition scheme or create a check constraint on the Log table datetime column matching the target partition boundaries. The required check constraint definition for the last partition is:
ALTER TABLE Log ADD CONSTRAINT CK_Log_Datetime CHECK (Datetime > '2016-04-08 00:00:00.000');
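Alternatively, a sketch of the first option, partitioning Log itself on the same scheme by rebuilding its clustered index (the SWITCH then becomes partition-to-partition):
CREATE CLUSTERED INDEX [PK_Log] ON [Log]
(
[LogId] ASC, [Datetime] ASC
) WITH (DROP_EXISTING = ON) ON PS_Table_Log_Datetime ([Datetime]);
ALTER TABLE [Log] SWITCH PARTITION 5 TO [Log_new1] PARTITION 5;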
I have the following query that I am running on my database server, but it takes about 30 seconds to run and I can't work out why.
SELECT *
FROM [dbo].[PackageInstance] AS packInst
INNER JOIN [dbo].[PackageDefinition] AS packageDef
ON packInst.[PackageDefinitionID] = packageDef.[PackageDefinitionID]
LEFT OUTER JOIN [dbo].[PackageInstanceContextDef] AS contextDef
ON packInst.[PackageInstanceID] = contextDef.[PackageInstanceID]
This produced the following execution plan, which looks good to me... so I can't understand why it takes so long to execute when the result is only 100,000 records (which should be a walk in the park for SQL Server).
Any ideas what could be causing this long execution time?
I have looked at the query in Profiler to see what the stats were on it, and they are as follows:
CPU - 4711
Reads - 744453
Writes - 9
Duration - 26329
The following are the table definitions:
CREATE TABLE [dbo].[PackageDefinition](
[PackageDefinitionID] [int] IDENTITY(1,1) NOT NULL,
[ts] [timestamp] NOT NULL,
[ProgramID] [int] NULL,
[VendorID] [int] NULL,
[PackageExecutionTypeID] [int] NULL,
[PackageDefinitionStatusID] [int] NOT NULL,
[IsInternal] [bit] NOT NULL,
[Name] [dbo].[D_Name] NOT NULL,
[Description] [dbo].[D_Description] NOT NULL,
[CreatedDate] [datetime] NOT NULL,
[PublishedDate] [datetime] NULL,
[OwnerUserGuid] [uniqueidentifier] NOT NULL,
[ProcessDefinitionMainID] [int] NULL,
[KeyInfoHtml] [nvarchar](max) NULL,
[DescriptionHtml] [nvarchar](max) NULL,
[WhatToExpectHtml] [nvarchar](max) NULL,
[BestPracticesHtml] [nvarchar](max) NULL,
[RecommendedJourneysHtml] [nvarchar](max) NULL,
[RequiresSLAAgreement] [bit] NOT NULL,
[SLAFileAssetID] [int] NULL,
[ImageDataID] [int] NULL,
[VideoHtml] [nvarchar](max) NULL,
[VideoAssetID] [int] NULL,
[UseMapCosts] [bit] NOT NULL,
[CostMin] [money] NOT NULL,
[CostMax] [money] NOT NULL,
[LandingPageVisitCount] [int] NOT NULL,
[IsDeleted] [dbo].[D_IsDeleted] NOT NULL,
[CreatedByUserGuid] [uniqueidentifier] NOT NULL,
[OrderHtml] [nvarchar](max) NULL,
CONSTRAINT [PK_PackageDefinition] PRIMARY KEY CLUSTERED
(
[PackageDefinitionID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
CREATE TABLE [dbo].[PackageInstance](
[PackageInstanceID] [int] IDENTITY(1,1) NOT NULL,
[ts] [timestamp] NOT NULL,
[PackageDefinitionID] [int] NOT NULL,
[PackageStatusID] [int] NOT NULL,
[Name] [dbo].[D_Description] NOT NULL,
[CampaignID] [int] NULL,
[MarketingPlanID] [int] NULL,
[CountryID] [int] NULL,
[DateEntered] [datetime] NULL,
[DateExecuted] [datetime] NULL,
[ProcessID] [int] NULL,
[OrderedByUserGuid] [uniqueidentifier] NULL,
[RequestedByUserGuid] [uniqueidentifier] NULL,
[SLAEndDate] [datetime] NULL,
CONSTRAINT [PK_PackageInstance] PRIMARY KEY CLUSTERED
(
[PackageInstanceID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
CREATE TABLE [dbo].[PackageInstanceContextDef](
[PackageInstanceContextDefID] [int] IDENTITY(1,1) NOT NULL,
[ts] [timestamp] NOT NULL,
[PackageInstanceID] [int] NOT NULL,
[ContextObjectDefID] [int] NOT NULL,
[EnteredFieldValue] [varchar](max) NULL,
[SelectedListValueID] [int] NULL,
[AssetIdsString] [nvarchar](max) NULL,
[SelectedListValueIdsString] [nvarchar](max) NULL,
[ContextObjectFieldName] [nvarchar](30) NOT NULL,
CONSTRAINT [PK_PackageInstanceContextDef] PRIMARY KEY CLUSTERED
(
[PackageInstanceContextDefID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
Remove the * in SELECT *.
It will always scan because you are asking for all columns. And do you have clustered indexes?
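For example, selecting only the columns actually needed (the column choice below is arbitrary, just to illustrate):
SELECT packInst.[PackageInstanceID],
packInst.[Name],
packageDef.[Name] AS DefinitionName,
contextDef.[ContextObjectFieldName]
FROM [dbo].[PackageInstance] AS packInst
INNER JOIN [dbo].[PackageDefinition] AS packageDef
ON packInst.[PackageDefinitionID] = packageDef.[PackageDefinitionID]
LEFT OUTER JOIN [dbo].[PackageInstanceContextDef] AS contextDef
ON packInst.[PackageInstanceID] = contextDef.[PackageInstanceID]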
The answer turned out to be what @MartinSmith suggested. Because the PackageDefinition table contains about 8 NVARCHAR(MAX) columns, and the resulting join produced over 100k rows, the nvarchar(max) values, which live in off-row pages, were being read over and over. Hence the large number of logical reads.
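In case it helps anyone hitting the same thing: those off-row reads are reported as "lob logical reads" when the query is run with I/O statistics on:
SET STATISTICS IO ON;
-- run the join here; the Messages tab then shows 'lob logical reads'
-- for the pages holding the off-row nvarchar(max) values
SET STATISTICS IO OFF;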
Thanks all for your support; now I just have to figure out how to make Entity Framework produce the query that I want.
What happens if you add the following index...
CREATE NONCLUSTERED INDEX ix ON PackageDefinition(PackageDefinitionID)
...and try the following to reduce the width of the data going into the sort?
SELECT packInst.*,
packageDef2.*,
contextDef.*
FROM [dbo].[PackageInstance] AS packInst
INNER MERGE JOIN [dbo].[PackageDefinition] AS packageDef
ON packInst.[PackageDefinitionID] = packageDef.[PackageDefinitionID]
LEFT OUTER MERGE JOIN [dbo].[PackageInstanceContextDef] AS contextDef
ON packInst.[PackageInstanceID] = contextDef.[PackageInstanceID]
INNER MERGE JOIN [dbo].[PackageDefinition] AS packageDef2
ON packageDef.[PackageDefinitionID] = packageDef2.[PackageDefinitionID]
Of course, * should not be used: even if you need all the columns, you definitely don't need the same columns twice as a result of the JOIN. But this is just to maintain the semantics of your original query.
Using this spatial query I am trying to get the country information that intersects the point (78, 22). The expected result is the row for India, but the query returns no rows.
select * from countryspatial
where
geom.STIntersects((geometry::STGeomFromText('POINT (78 22)', 4326)))>0;
This is the table definition:
CREATE TABLE [dbo].[CountrySpatial](
[ID] [int] IDENTITY(1,1) NOT NULL,
[ObjectID] [bigint] NULL,
[FIPS_CNTRY] [nvarchar](255) NULL,
[GMI_CNTRY] [nvarchar](255) NULL,
[ISO_2DIGIT] [nvarchar](255) NULL,
[ISO_3DIGIT] [nvarchar](255) NULL,
[ISO_NUM] [int] NULL,
[CNTRY_NAME] [nvarchar](255) NULL,
[LONG_NAME] [nvarchar](255) NULL,
[ISOSHRTNAM] [nvarchar](255) NULL,
[UNSHRTNAM] [nvarchar](255) NULL,
[LOCSHRTNAM] [nvarchar](255) NULL,
[LOCLNGNAM] [nvarchar](255) NULL,
[STATUS] [nvarchar](255) NULL,
[POP2005] [bigint] NULL,
[SQKM] [float] NULL,
[SQMI] [float] NULL,
[COLORMAP] [smallint] NULL,
[geom] [geometry] NULL,
PRIMARY KEY CLUSTERED
(
[ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[CountrySpatial] WITH CHECK ADD CONSTRAINT [enforce_srid_geometry_CountrySpatial] CHECK (([geom].[STSrid]=(0)))
GO
ALTER TABLE [dbo].[CountrySpatial] CHECK CONSTRAINT [enforce_srid_geometry_CountrySpatial]
GO
The first thing to note is that earth-surface points should be stored using geography, not geometry; there are differences in storage and in how the (similarly named) functions work. Note also that the check constraint above pins geom to SRID 0 while the query point uses SRID 4326, and SQL Server spatial methods return NULL when the two instances have different SRIDs, which by itself explains why no rows come back.
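As a quick sanity check before migrating, the original geometry query should find the row if the point is created with the table's actual SRID of 0 (assuming the shapes themselves were loaded correctly):
select * from countryspatial
where geom.STIntersects(geometry::STGeomFromText('POINT (78 22)', 0)) > 0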
Here is a working example:
Simplified table:
CREATE TABLE CountrySpatial(
ID int IDENTITY(1,1) NOT NULL PRIMARY KEY,
geog geography NULL)
GO
Insert something that resembles a diamond around India
INSERT INTO CountrySpatial(geog)
VALUES (geography::STGeomFromText('POLYGON((' +
'77.22702 28.67613, ' + -- new delhi (top)
'72.566071 23.059516, ' + -- ahmedabad (left)
'77.593689 13.005227, ' + -- bengaluru (bottom)
'88.374023 22.614011, ' + -- kolkata (right)
'77.22702 28.67613))', 4326));
Find the match. It is UNION-ed with the point being sought. STBuffer increases the point to a 100 km radius so that it shows up when viewed together with the geography record found (switch to the Spatial results tab in the output):
select geog
from countryspatial
where geog.STIntersects(geography::STGeomFromText('POINT (78 22)', 4326))>0
union all
select geography::STGeomFromText('POINT (78 22)', 4326).STBuffer(100000)
Just to find out what my DB size would be in a production environment, I populated my tables with 1.5 million rows of nearly identical data (except for the primary key). It currently shows 261 MB...
Now, can I rely on this? Or, since the data is almost identical in all the other columns, has SQL Server compressed it? I.e., will the size be different if the values in each row are different?
Further: do columns with NULL values contribute to the size of the DB?
Thanks for your time...
Edit: Here is my schema... and I have made some indexes too...
CREATE TABLE [dbo].[Trn_Tickets](
[ObjectID] [bigint] IDENTITY(1,1) NOT NULL,
[TicketSeqNo] [bigint] NULL,
[BookSeqNo] [bigint] NULL,
[MatchID] [int] NULL,
[TicketNumber] [varchar](20) NULL,
[BarCodeNumber] [varchar](20) NULL,
[GateNo] [varchar](5) NULL,
[EntryFrom] [varchar](10) NULL,
[MRP] [decimal](9, 2) NULL,
[Commission] [decimal](9, 2) NULL,
[Discount] [decimal](9, 2) NULL,
[CashPrice] [decimal](9, 2) NULL,
[CashReceived] [decimal](9, 2) NULL,
[BalanceDue] [decimal](9, 2) NULL,
[CollectibleFrom] [char](1) NULL,
[PlaceOfIssue] [varchar](20) NULL,
[DateOfIssue] [datetime] NULL,
[PlaceOfSale] [varchar](20) NULL,
[AgentID] [int] NULL,
[BuyerID] [int] NULL,
[SaleTypeID] [tinyint] NULL,
[SaleDate] [smalldatetime] NULL,
[ApprovedBy] [varchar](15) NULL,
[ApprovedDate] [smalldatetime] NULL,
[InvoiceStatus] [char](1) NULL,
[InvoiceRefNo] [varchar](15) NULL,
[InvoiceDate] [smalldatetime] NULL,
[BookPosition] [char](2) NULL,
[TicketStatus] [char](2) NULL,
[RecordStatus] [char](1) NULL,
[ClosingStatus] [char](2) NULL,
[ClosingDate] [datetime] NULL,
[UpdatedDate] [datetime] NULL,
[UpdatedUser] [varchar](10) NULL,
CONSTRAINT [PK_Trn_Tickets] PRIMARY KEY CLUSTERED
(
[ObjectID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
Hope this helps
SQL Server 2005 and 2008 Express will not compress your data. SQL Server 2008 can use page compression, but only in Enterprise Edition. NULL columns occupy one bit in the row.
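For completeness, on an edition that does support it, page compression is enabled per table like this (SQL Server 2008 Enterprise onwards):
ALTER TABLE [dbo].[Trn_Tickets] REBUILD WITH (DATA_COMPRESSION = PAGE);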
From the description of your data, this sounds more like a problem of ordinary normalization. Separate the repeated values into a lookup table, store only the distinct combinations, and join against the lookup table (see the sketch after the list below). This saves space by schema design and works on all DB platforms, all versions, all SKUs.
Replace ApprovedBy etc. (varchar) with lookups to other tables
Do you need datetime?
Do you expect more than 4 billion rows? Why are the first 3 columns bigint?
Save a few bytes here and there = a big difference. Higher page density (e.g. more rows per 8k page) = less space + smaller indexes.
Compress when you have 1.5 billion rows.
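A minimal sketch of the lookup-table idea for ApprovedBy; the names and types below are illustrative only:
-- lookup table holding the distinct ApprovedBy values
CREATE TABLE [dbo].[Lkp_ApprovedBy](
[ApprovedByID] [smallint] IDENTITY(1,1) NOT NULL PRIMARY KEY,
[ApprovedBy] [varchar](15) NOT NULL UNIQUE
)
-- Trn_Tickets then stores a 2-byte key instead of a 15-byte varchar
ALTER TABLE [dbo].[Trn_Tickets] ADD [ApprovedByID] [smallint] NULL
REFERENCES [dbo].[Lkp_ApprovedBy]([ApprovedByID])
-- and queries join against the lookup
SELECT t.[TicketNumber], a.[ApprovedBy]
FROM [dbo].[Trn_Tickets] AS t
JOIN [dbo].[Lkp_ApprovedBy] AS a ON a.[ApprovedByID] = t.[ApprovedByID]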