SQL Server not using index - sql-server

I'm trying to write a query in SQL Server, but it's doing a table scan on a table with about 30 million rows (TGS_INFO), so the query runs very slowly.
The actual query is more complex but I've reduced it to a simpler version that still exhibits the same issue.
SELECT DISTINCT UNIT_ITEMS.DBKEY,
                UNIT_ITEMS.ID,
                UNIT_ITEMS.LOCATION1,
                UNIT_ITEMS.LOCATION2
FROM UNIT_ITEMS
INNER JOIN TGS.dbo.TGS_INFO
    ON  UNIT_ITEMS.UNIT_ID = TGS_INFO.UNIT_ID
    AND UNIT_ITEMS.ITEM_ID = TGS_INFO.ITEM_ID
    AND UNIT_ITEMS.LOCATION1 = TGS_INFO.LOCATION1
    AND UNIT_ITEMS.LOCATION2 = TGS_INFO.LOCATION2
Here is the execution plan.
StmtText
  |--Sort(DISTINCT ORDER BY:([DbName].[dbo].[UNIT_ITEMS].[DBKEY] ASC, [DbName].[dbo].[UNIT_ITEMS].[ITEM_ID] ASC, [DbName].[dbo].[UNIT_ITEMS].[LOCATION1] ASC, [DbName].[dbo].[UNIT_ITEMS].[LOCATION2] ASC))
       |--Hash Match(Inner Join, HASH:([DbName].[dbo].[UNIT_ITEMS].[UNIT_ID], [DbName].[dbo].[UNIT_ITEMS].[ITEM_ID], [DbName].[dbo].[UNIT_ITEMS].[LOCATION1], [DbName].[dbo].[UNIT_ITEMS].[LOCATION2])=([Expr1008], [Expr1009], [Expr1010], [Expr1011]), RESIDUAL:([DbName].[dbo].[UNIT_ITEMS].[UNIT_ID]=[Expr1008] AND [DbName].[dbo].[UNIT_ITEMS].[ITEM_ID]=[Expr1009] AND [DbName].[dbo].[UNIT_ITEMS].[LOCATION1]=[Expr1010] AND [DbName].[dbo].[UNIT_ITEMS].[LOCATION2]=[Expr1011]))
            |--Table Scan(OBJECT:([DbName].[dbo].[UNIT_ITEMS]))
            |--Compute Scalar(DEFINE:([Expr1008]=CONVERT_IMPLICIT(int,[TGS].[dbo].[TGS_INFO].[UNIT_ID],0), [Expr1009]=CONVERT_IMPLICIT(nvarchar(50),[TGS].[dbo].[TGS_INFO].[ITEM_ID],0), [Expr1010]=CONVERT_IMPLICIT(nvarchar(50),[TGS].[dbo].[TGS_INFO].[LOCATION1],0), [Expr1011]=CONVERT_IMPLICIT(int,[TGS].[dbo].[TGS_INFO].[LOCATION2],0)))
                 |--Table Scan(OBJECT:([TGS].[dbo].[TGS_INFO]))
TGS_INFO and UNIT_ITEMS both have nonclustered indexes on UNIT_ID and ITEM_ID. As mentioned, TGS_INFO has about 30 million rows, but they are evenly distributed across roughly a thousand different UNIT_IDs. UNIT_ITEMS always contains only one UNIT_ID.
Here are the indexes:
CREATE NONCLUSTERED INDEX [IX_UNIT_ID_ITEM_ID] ON [dbo].[TGS_INFO]
(
[UNIT_ID] ASC,
[ITEM_ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
CREATE NONCLUSTERED INDEX [IX_UNIT_ID_ITEM_ID] ON [dbo].[UNIT_ITEMS]
(
[UNIT_ID] ASC,
[ITEM_ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
As I mentioned in the comments, all the columns are VARCHAR(50) in TGS_INFO. All the columns in UNIT_ITEMS are ints.
For the record, I didn't design the schema of TGS_INFO.

If you don't include LOCATION1 and LOCATION2 in your indexes, the join cannot be satisfied from an index alone. Add these columns to the indexes on both tables.
You will probably need to include every other column the query references, too.
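A minimal sketch of what such indexes could look like, assuming the column names from the question. The index names IX_TGS_INFO_JOIN and IX_UNIT_ITEMS_JOIN are made up for illustration, and DBKEY and ID are included only because the SELECT list references them. Each statement would have to run in the database that owns the table.

```sql
-- Hypothetical index names; the column lists are taken from the query in the question.
-- Run this one in the TGS database, where TGS_INFO lives.
CREATE NONCLUSTERED INDEX [IX_TGS_INFO_JOIN] ON [dbo].[TGS_INFO]
(
    [UNIT_ID] ASC,
    [ITEM_ID] ASC,
    [LOCATION1] ASC,
    [LOCATION2] ASC
);

-- Run this one in the database that holds UNIT_ITEMS.
-- INCLUDE adds the selected columns so the query can be answered from the index alone.
CREATE NONCLUSTERED INDEX [IX_UNIT_ITEMS_JOIN] ON [dbo].[UNIT_ITEMS]
(
    [UNIT_ID] ASC,
    [ITEM_ID] ASC,
    [LOCATION1] ASC,
    [LOCATION2] ASC
)
INCLUDE ([DBKEY], [ID]);
```

Whether the optimizer actually picks these indexes still depends on the data types matching across the join columns, so check the plan again after creating them.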

I notice the execution plan shows the following:
|--Compute Scalar(DEFINE:([Expr1008]=CONVERT_IMPLICIT(int,[TGS].[dbo].[TGS_INFO].[UNIT_ID],0), [Expr1009]=CONVERT_IMPLICIT(nvarchar(50),[TGS].[dbo].[TGS_INFO].[ITEM_ID],0), [Expr1010]=CONVERT_IMPLICIT(nvarchar(50),[TGS].[dbo].[TGS_INFO].[LOCATION1],0), [Expr1011]=CONVERT_IMPLICIT(int,[TGS].[dbo].[TGS_INFO].[LOCATION2],0)))
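I can't think of a good reason for the query engine to do an implicit data type conversion on these columns unless the data types of the join columns differ between the two tables. Because the conversion is applied to the TGS_INFO columns themselves, the optimizer generally cannot seek on an index over those columns and ends up reading every row.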
You may also try moving UNIT_ITEMS.LOCATION1 = TGS_INFO.LOCATION1 AND UNIT_ITEMS.LOCATION2 = TGS_INFO.LOCATION2 to the WHERE clause since they're not covered by an index. The query engine is typically smart enough to account for this, but it's something to try.
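One way to test whether the type mismatch is what blocks the index, sketched under the question's statement that the TGS_INFO columns are VARCHAR(50): convert the UNIT_ITEMS side explicitly, so the TGS_INFO columns are compared in their own type and no conversion is applied to them. The aliases ui and ti are only for readability. This changes matching semantics (a stored value such as '007' equals the integer 7 but not the string '7', and casting NVARCHAR to VARCHAR can lose characters), so treat it as an experiment rather than a drop-in fix.

```sql
-- Experiment: cast the small table's columns to the large table's type,
-- so that TGS_INFO columns appear unconverted in the join predicate.
SELECT DISTINCT ui.DBKEY,
                ui.ID,
                ui.LOCATION1,
                ui.LOCATION2
FROM UNIT_ITEMS AS ui
INNER JOIN TGS.dbo.TGS_INFO AS ti
    ON  ti.UNIT_ID   = CAST(ui.UNIT_ID   AS varchar(50))
    AND ti.ITEM_ID   = CAST(ui.ITEM_ID   AS varchar(50))
    AND ti.LOCATION1 = CAST(ui.LOCATION1 AS varchar(50))
    AND ti.LOCATION2 = CAST(ui.LOCATION2 AS varchar(50));
```

If the CONVERT_IMPLICIT entries on TGS_INFO disappear from the plan and an index seek shows up, the mismatch was the cause. The longer-term fix would be aligning the column types in the schema.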

Related

speed up long running query on huge tables

I am facing a problematic query on an Azure SQL database that I need to speed up.
This is my query:
SELECT
[Incidents].[Incident_Number],
[Incidents].[Incidentinteraction],
[Incidents].[Incidentid],
[Address].[Ads_Sk]
FROM
[schema1].[Address] AS Address --3529046 rows
JOIN
[schema2].[Incidents] AS Incidents --3268375 rows
ON Incidents.[Ads_Sk_Incidentaddress] = Address.[Ads_Sk]
The Address table has two indexes, a clustered primary key:
ALTER TABLE [schema1].[ADDRESS]
ADD PRIMARY KEY CLUSTERED ([ADS_SK] ASC, [ISCURRENTRECORD] ASC, [RECORDSTARTDATE] ASC)
WITH (STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF) ON [PRIMARY]
and a non-clustered index:
CREATE NONCLUSTERED INDEX [nci_ADDRESS_ADS_CURRENT]
ON [PROMISE_CDW].[ADDRESS] ([ADS_SK] ASC, [ISCURRENTRECORD] ASC)
WITH (STATISTICS_NORECOMPUTE = OFF, DROP_EXISTING = OFF, ONLINE = OFF) ON [PRIMARY]
The Incidents table also has two indexes:
ALTER TABLE [schema2].[INCIDENTS]
ADD PRIMARY KEY CLUSTERED ([INCIDENTID] ASC, [ISCURRENTRECORD] ASC, [RECORDSTARTDATE] ASC)
WITH (STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF) ON [PRIMARY]
and
CREATE NONCLUSTERED INDEX [nci_ADDRESS_ADS_SK_INCIDENT_NUMBER]
ON [schema2].[INCIDENTS] ([ADS_SK_INCIDENTADDRESS] ASC, [INCIDENT_NUMBER] ASC)
INCLUDE ([INCIDENTID], [INCIDENTINTERACTION])
WITH (STATISTICS_NORECOMPUTE = OFF, DROP_EXISTING = OFF, ONLINE = OFF) ON [PRIMARY]
GO
The results come back after 22 seconds, which is unacceptable for business users.
How can I speed up this query?
Thank you in advance for any hint
That query doesn't need a join: every value you select is available from the Incidents table, since Address.[Ads_Sk] equals Incidents.[Ads_Sk_Incidentaddress] by the join condition.
You're only using the Address table to filter out rows whose Incidents.[Ads_Sk_Incidentaddress] does not appear in Address.[Ads_Sk], which you can easily do in a WHERE clause with IN.
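A sketch of that rewrite, using the table and column names from the question (the aliases i and a are just shorthand). One caveat: the primary key on Address spans ADS_SK, ISCURRENTRECORD, and RECORDSTARTDATE, so ADS_SK alone may repeat. If it does, the original join returns an incident once per matching address row, while this version returns each incident once, so the row counts can differ.

```sql
-- Filter Incidents by membership in Address instead of joining.
-- Ads_Sk is read from Incidents, because the join condition made the two values equal.
SELECT i.[Incident_Number],
       i.[Incidentinteraction],
       i.[Incidentid],
       i.[Ads_Sk_Incidentaddress] AS [Ads_Sk]
FROM [schema2].[Incidents] AS i
WHERE i.[Ads_Sk_Incidentaddress] IN (SELECT a.[Ads_Sk]
                                     FROM [schema1].[Address] AS a);
```

Rows where Ads_Sk_Incidentaddress is NULL are excluded here, as they were by the inner join.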

SQL Server Filtering Index performance

I currently have an index like this:
CREATE UNIQUE NONCLUSTERED INDEX [CDPAYAPP_INDEX03] ON [dbo].[CDPAYAPP]
(
[CLSVC_ID] ASC,
[ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90)
Unfortunately, at some of our clients the value of CLSVC_ID can be zero for 100k or more rows out of a couple of million total. This skew appears to make the optimizer occasionally consider the index less than optimal, resulting in table scans. Updating statistics multiple times a day can help, but not always.
I tried creating a filtered index, like this:
create UNIQUE NONCLUSTERED index CDPAYAPP_INDEX3A ON CDPAYAPP (CLSVC_ID, ID)
WHERE CLSVC_ID > 0;
But I noticed that if I request any column outside the two index columns, the optimizer uses the original index, not the filtered index. If I select only the columns of the index, it uses the filtered index.
Why?

SQL Server - Performance trouble while creating indexes on a table with 135M records

I am trying to create indexes. Yes, the table is big (135.8M records). No application is currently running against it. I create the indexes one after another, and each CREATE takes longer than the last.
IF NOT EXISTS (SELECT * FROM sys.indexes WHERE object_id = OBJECT_ID(N'[dbo].[RL_SAP_01]') AND name = N'RL_SAP_01BELEGNUMMER')
CREATE NONCLUSTERED INDEX [RL_SAP_01BELEGNUMMER] ON [dbo].[RL_SAP_01]
(
[BELEGNUMMER] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
IF NOT EXISTS (SELECT * FROM sys.indexes WHERE object_id = OBJECT_ID(N'[dbo].[RL_SAP_01]') AND name = N'RL_SAP_01DOCTYPE')
CREATE NONCLUSTERED INDEX [RL_SAP_01DOCTYPE] ON [dbo].[RL_SAP_01]
(
[DOCTYPE] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
Any idea?
Let's get this out of the way.
The table is not big. It is actually quite small.
Most likely your computer is just too small. Building an index on a non-trivial amount of data is taxing. That is no problem for a proper database server, but these days many people think their laptop's green hard disk is good enough to be one.
There is nothing you can do. Check the hardware, your server is likely just underpowered for this operation. Some things in databases take time.

Why does FREETEXTTABLE scan the entire table if there is a full-text index available?

There is a simple table of cities
CREATE TABLE [dbo].[cities](
[id] [numeric](6, 0) NOT NULL,
[cityname] [varchar](48) NOT NULL,
)
and a full-text catalog that indexes more than just the names of cities.
The table has 6259 rows.
I have a query:
SELECT * FROM FREETEXTTABLE([cities],[cityname],
N'Hello, I am looking for cities inside this text,
my house is not -far from London PJ-')
The result contains 11 rows and the query takes about 1-2 seconds to execute.
According to the execution plan, 6092 rows were scanned in the index.
Why are so many rows scanned if there is an index?
I use SQL Server 2012.
UPDATE:
There is no special index on table cities. The only index from object explorer looks like:
ALTER TABLE [dbo].[cities] ADD CONSTRAINT [xpkruianobec] PRIMARY KEY CLUSTERED
(
[id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90) ON [PRIMARY]
GO

Outer join on two tables with sequential guid stalls

I'm attempting to perform a full outer join on two tables that are not related. Each table has a location_id which will eventually form the primary/foreign key relationship (once I figure out this performance issue). When executing the outer join, it just clocks away. Queries and triggers performed against each table on its own complete in less than a second.
This table has 21000 records:
CREATE TABLE [dbo].[TBL_LOCATIONS](
[OBJECTID] [int] NOT NULL,
[Loc_Name] [nvarchar](100) NULL,
[Location_ID] [uniqueidentifier] NULL,
[SHAPE] [geometry] NULL,
CONSTRAINT [R33_pk] PRIMARY KEY CLUSTERED
(
[OBJECTID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 75) ON [PRIMARY]
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[TBL_LOCATIONS] WITH CHECK ADD CONSTRAINT [g17_ck] CHECK (([SHAPE].[STSrid]=(26917)))
GO
ALTER TABLE [dbo].[TBL_LOCATIONS] ADD CONSTRAINT [DF_TBL_LOCATIONS_Location_ID] DEFAULT (newsequentialid()) FOR [Location_ID]
GO
CREATE SPATIAL INDEX [S17_idx] ON [dbo].[TBL_LOCATIONS]
(
[SHAPE]
)USING GEOMETRY_GRID
WITH (
BOUNDING_BOX =(224827, 3923750, 323464, 3967780), GRIDS =(LEVEL_1 = HIGH,LEVEL_2 = HIGH,LEVEL_3 = HIGH,LEVEL_4 = HIGH),
CELLS_PER_OBJECT = 16, PAD_INDEX = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
CREATE UNIQUE NONCLUSTERED INDEX [UUID_OID_33] ON [dbo].[TBL_LOCATIONS]
(
[Location_ID] ASC,
[OBJECTID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 75) ON [PRIMARY]
GO
This table has 53000 records
CREATE TABLE [dbo].[TBL_EVENTS](
[OBJECTID] [int] NOT NULL,
[Event_ID] [uniqueidentifier] NULL,
[Location_ID] [uniqueidentifier] NULL,
CONSTRAINT [PK_TBL_EVENTS] PRIMARY KEY CLUSTERED
(
[OBJECTID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[TBL_EVENTS] ADD CONSTRAINT [DF_TBL_EVENTS_Event_ID] DEFAULT (newsequentialid()) FOR [Event_ID]
GO
CREATE UNIQUE NONCLUSTERED INDEX [R36_SDE_ROWID_UK] ON [dbo].[TBL_EVENTS]
(
[OBJECTID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 75) ON [PRIMARY]
GO
And here is the query that is running....and running...1 hour and no results.
SELECT
TBL_LOCATIONS.Loc_Name,
TBL_LOCATIONS.Location_ID,
TBL_LOCATIONS.SHAPE,
TBL_EVENTS.Event_ID
FROM
TBL_EVENTS
FULL OUTER JOIN
TBL_LOCATIONS ON TBL_EVENTS.Location_ID = TBL_LOCATIONS.Location_ID
I've tried every permutation of attribute indexes on both tables, rebuilding and reorganizing them, nothing affects the performance. The use of ObjectID as PK is mandated by the application, as is the sequentialGUID. I don't think those are factors here, as both these tables perform splendidly outside of this query. SQL Server 2008 SP1 64BIT on RAID 10/48 GB RAM.
FULL JOIN works well when the values in the columns used to link the tables are unique.
For rows with duplicated values, FULL JOIN behaves like a CROSS JOIN within each group of duplicates and can cause performance issues.
So the bottleneck probably comes from duplicates in the LOCATION_ID column.
Maybe you need to consider turning off Transaction Logging whilst doing all that.
If the linked field values (location) are not all that unique, the result set could become very large.
In an extreme example, if location only had the value of "1" in both tables, the total rows would be close to the cross join size, about 1,113,000,000 rows (21,000 * 53,000). A query of this size (over a billion rows) will take a long time to run.
EDIT - updating incorrect statement as pointed out in comments
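To check whether duplicates are actually the problem, something along these lines could be run first. It is a sketch using the table and column names from the question; the aliases and the row_count and matched_rows names are made up. The last query estimates how many matched rows the join would produce without materializing them.

```sql
-- Which Location_ID values repeat, and how often, in each table?
SELECT Location_ID, COUNT(*) AS row_count
FROM dbo.TBL_EVENTS
GROUP BY Location_ID
HAVING COUNT(*) > 1
ORDER BY row_count DESC;

SELECT Location_ID, COUNT(*) AS row_count
FROM dbo.TBL_LOCATIONS
GROUP BY Location_ID
HAVING COUNT(*) > 1
ORDER BY row_count DESC;

-- Estimated number of matched rows: for each shared Location_ID,
-- (occurrences in TBL_EVENTS) * (occurrences in TBL_LOCATIONS).
-- The cast to bigint avoids integer overflow on large products.
SELECT SUM(CAST(e.row_count AS bigint) * l.row_count) AS matched_rows
FROM (SELECT Location_ID, COUNT(*) AS row_count
      FROM dbo.TBL_EVENTS
      GROUP BY Location_ID) AS e
JOIN (SELECT Location_ID, COUNT(*) AS row_count
      FROM dbo.TBL_LOCATIONS
      GROUP BY Location_ID) AS l
    ON e.Location_ID = l.Location_ID;
```

NULL Location_ID values never match each other in a join, so they do not contribute to matched_rows. The FULL OUTER JOIN additionally returns the unmatched rows from both tables on top of this number.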
