MS SQL - Delete query taking too much time

I have the following query script:
declare @tblSectionsList table
(
SectionID int,
SectionCode varchar(255)
)
-- assume @tblSectionsList holds 50 section rows
DELETE td
from [dbo].[InventoryDocumentDetails] td
inner join [dbo].InventoryDocuments th
    on th.Id = td.InventoryDocumentDetail_InventoryDocument
inner join @tblSectionsList ts
    on ts.SectionID = th.InventoryDocument_Section
This script involves three tables, where @tblSectionsList is a table variable that may contain around 50 records. I join it to the InventoryDocuments table, which is further joined to the InventoryDocumentDetails table. All joins are on INT foreign keys.
Over the weekend I started this query on the server, and it is still running even after 2 days and 4 hours. Can anybody tell me if I am doing something wrong, or suggest any way to improve its performance? I don't even know how much longer it will take to give me a result.
Before this I also tried to create an index on the InventoryDocumentDetails table with following script:
CREATE NONCLUSTERED INDEX IX_InventoryDocumentDetails_InventoryDocument
ON dbo.InventoryDocumentDetails (InventoryDocumentDetail_InventoryDocument);
But that script also ran for more than a day without finishing, so I cancelled it.
Additional info:
I am using MS SQL 2008 R2.
InventoryDocuments table contains 2,108,137 rows and has primary key 'Id'.
InventoryDocumentDetails table contains 25,055,158 rows and has primary key 'Id'.
Both tables have primary keys defined.
CPU - Intel Xeon - with 32 GB RAM
No other indexes are defined, because when I try to create a new index, that query also gets suspended.
Query execution plan (1): (screenshot not reproduced here)
2nd Part:
The following query gives one row for this request, showing status = 'suspended' and wait_type = 'LCK_M_IX':
SELECT r.session_id as spid, r.[status], r.command, t.[text],
       OBJECT_NAME(t.objectid, t.[dbid]) as object, r.logical_reads,
       r.blocking_session_id as blocked, r.wait_type,
       s.host_name, s.host_process_id, s.program_name, r.start_time
FROM sys.dm_exec_requests AS r
LEFT OUTER JOIN sys.dm_exec_sessions s ON s.session_id = r.session_id
OUTER APPLY sys.dm_exec_sql_text(r.[sql_handle]) AS t
WHERE r.session_id <> @@SPID AND r.session_id > 50
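A wait of LCK_M_IX means the DELETE is waiting to acquire an intent-exclusive lock, i.e. it is blocked by another session; the 'blocked' column above reports which one. A minimal sketch for inspecting that blocker, assuming it was reported as session 137 (a hypothetical value):
DECLARE @blocker int = 137; -- hypothetical: the blocking_session_id seen above
-- If the blocker has an active request, this shows what it is running.
SELECT r.session_id, r.[status], r.wait_type, t.[text]
FROM sys.dm_exec_requests AS r
OUTER APPLY sys.dm_exec_sql_text(r.[sql_handle]) AS t
WHERE r.session_id = @blocker;
-- If it is idle with an open transaction, fall back to its last statement.
SELECT c.session_id, t.[text]
FROM sys.dm_exec_connections AS c
CROSS APPLY sys.dm_exec_sql_text(c.most_recent_sql_handle) AS t
WHERE c.session_id = @blocker;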

What happens when you change the INNER JOIN to EXISTS?
DELETE td
FROM [dbo].[InventoryDocumentDetails] td
WHERE EXISTS (SELECT 1
              FROM [dbo].InventoryDocuments th
              WHERE EXISTS (SELECT 1
                            FROM @tblSectionsList ts
                            WHERE ts.SectionID = th.InventoryDocument_Section)
                AND th.Id = td.InventoryDocumentDetail_InventoryDocument)
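On a 25-million-row table it can also help to delete in batches, so each transaction holds fewer locks and, under the SIMPLE recovery model, the log can be reused between chunks. A sketch against the same tables (the batch size of 10,000 is an arbitrary assumption, the table variable must still be in scope, and the supporting index is still needed to avoid a scan per batch):
DECLARE @rows int = 1;
WHILE @rows > 0
BEGIN
    DELETE TOP (10000) td
    FROM [dbo].[InventoryDocumentDetails] td
    INNER JOIN [dbo].InventoryDocuments th
        ON th.Id = td.InventoryDocumentDetail_InventoryDocument
    INNER JOIN @tblSectionsList ts
        ON ts.SectionID = th.InventoryDocument_Section;
    SET @rows = @@ROWCOUNT; -- stop once a batch deletes nothing
END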

It can sometimes be more efficient time-wise to truncate a table and re-import the records you want to keep. A delete operation on a large table is incredibly slow compared to an insert. Of course, this is only an option if you can take the table offline, and only do it if your recovery model is set to SIMPLE. The steps (sketched in code after this list):
Drop triggers on table A.
Bulk copy table A to B.
Truncate table A.
Enable identity insert.
Insert into A from B, keeping only rows whose IDs are not in the delete list.
Disable identity insert.
Rebuild indexes.
Enable triggers.
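A generic sketch of those steps, where dbo.A(ID, Payload), dbo.B and #ids_to_delete are hypothetical stand-ins (triggers are disabled rather than dropped):
ALTER TABLE dbo.A DISABLE TRIGGER ALL;        -- 1. drop/disable triggers on A
SELECT ID, Payload INTO dbo.B FROM dbo.A;     -- 2. bulk copy A to B
TRUNCATE TABLE dbo.A;                         -- 3. instant and minimally logged
SET IDENTITY_INSERT dbo.A ON;                 -- 4.
INSERT INTO dbo.A (ID, Payload)               -- 5. keep only the survivors
SELECT ID, Payload FROM dbo.B
WHERE ID NOT IN (SELECT ID FROM #ids_to_delete);
SET IDENTITY_INSERT dbo.A OFF;                -- 6.
ALTER INDEX ALL ON dbo.A REBUILD;             -- 7.
ALTER TABLE dbo.A ENABLE TRIGGER ALL;         -- 8.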

Try something like the below. It might at least give you some ideas.
DELETE FROM [DBO].[INVENTORYDOCUMENTDETAILS]
WHERE INVENTORYDOCUMENTDETAILS_PK IN
(SELECT TD.INVENTORYDOCUMENTDETAILS_PK
 FROM [DBO].[INVENTORYDOCUMENTDETAILS] TD
 INNER JOIN [DBO].INVENTORYDOCUMENTS TH ON TH.ID = TD.INVENTORYDOCUMENTDETAIL_INVENTORYDOCUMENT
 INNER JOIN @TBLSECTIONSLIST TS ON TS.SECTIONID = TH.INVENTORYDOCUMENT_SECTION
)

Related

Selecting keys which are not in another table takes forever

I have a query like this:
select key, name from localtab where key not in (select key from remotetab);
The query takes forever, and I don't understand why.
localtab is a local table, and remotetab is a remote table on another server. key is an int column with a unique index in both tables. When I query the two tables separately, each takes just a few seconds.
Linked servers have terrible performance. Get the data you need onto the local server and do the majority of the hard work and processing there, instead of mixing local and remote tables in a single query.
Select the remote table into a temp table:
select [key] into #remote_made_local from remotetab
Use the #temp table when doing the WHERE clause filtering, and use EXISTS instead of IN for better performance. (NOT EXISTS is also safer than NOT IN: if the subquery can return a NULL, NOT IN returns no rows at all.)
select a.[key], a.name from localtab a where not exists (select 1 from #remote_made_local b where b.[key] = a.[key] )
Versus doing:
select [key], name from localtab where key not in (select [key] from #remote_made_local)
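If the local copy is large, it may also help to index the temp table so the probe is a seek rather than a scan (a small sketch; the index name is made up, and uniqueness matches the question's unique key column):
CREATE UNIQUE CLUSTERED INDEX cx_remote_made_local ON #remote_made_local ([key]);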
There is also a solution without using temporary tables.
By using a left join instead of not in (select ...), you can massively speed up the query. Like this:
select l.key, l.name
from localtab l left join remotetab r on l.key = r.key
where r.key is null;

Too many parameter values slowing down query

I have a query that runs fairly fast under normal circumstances, but it is running very slow (at least 20 minutes in SSMS) because of how many values are in the filter.
Here's the generic version of it; you can see that one part filters on over 8,000 values, which makes it slow.
SELECT DISTINCT
    column
FROM
    table_a a
    JOIN table_b b ON (a.KEY = b.KEY)
WHERE
    a.date BETWEEN @Start AND @End
    AND b.ID IN (... over 8,000 values)
    AND b.place IN (... 20 values)
ORDER BY
    a.column ASC
It's to the point where it's too slow to use in the production application.
Does anyone know how to fix this, or optimize the query?
To make a query fast, you need indexes.
You need a separate index for the following columns: a.KEY, b.KEY, a.date, b.ID, b.place.
As gotqn wrote before, if you put your 8,000 items into a temp table and inner join it, the query will get even faster, but without indexes on the other side of the join it will still be slow.
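A sketch of those single-column indexes (index names are made up; review them against the real workload before creating anything):
CREATE INDEX ix_a_key   ON table_a ([KEY]);
CREATE INDEX ix_a_date  ON table_a ([date]);
CREATE INDEX ix_b_key   ON table_b ([KEY]);
CREATE INDEX ix_b_id    ON table_b ([ID]);
CREATE INDEX ix_b_place ON table_b ([place]);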
What you need is to put the filtering values in a temporary table, then apply the filtering with an INNER JOIN instead of WHERE IN. For example:
IF OBJECT_ID('tempdb..#FilterDataSource') IS NOT NULL
BEGIN;
    DROP TABLE #FilterDataSource;
END;

CREATE TABLE #FilterDataSource
(
    [ID] INT PRIMARY KEY
);

INSERT INTO #FilterDataSource ([ID])
SELECT ... -- you need to split the 8,000 values here (see the splitter sketch below)

SELECT DISTINCT column
FROM table_a a
INNER JOIN table_b b
    ON (a.KEY = b.KEY)
INNER JOIN #FilterDataSource FS
    ON b.id = FS.ID
WHERE a.date BETWEEN @Start AND @End
    AND b.place IN (... 20 values)
ORDER BY a.column ASC;
A few important notes:
we are using a temporary table in order to allow parallel execution plans to be used
if you have a fast splitting function (for example, a CLR function), you can join the function itself
it is not good to use IN with many values - SQL Server is not always able to build a good execution plan for it, which may lead to timeouts or internal errors; you can find more information here
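As a sketch of that loading step: on SQL Server 2016 or later, the built-in STRING_SPLIT can populate #FilterDataSource from a comma-separated list (older versions need a user-defined or CLR splitter); @idList is a hypothetical stand-in for the 8,000 values:
DECLARE @idList varchar(max) = '101,102,103'; -- hypothetical CSV of the filter IDs
INSERT INTO #FilterDataSource ([ID])
SELECT DISTINCT CAST(value AS int)
FROM STRING_SPLIT(@idList, ',');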

Why does LEFT JOIN increase query time so much?

I'm using SQL Server 2012 and have encountered a strange problem.
This is the original query I've been using:
DELETE FROM [TABLE_TEMP]
INSERT INTO [TABLE_TEMP]
SELECT H.*, NULL
FROM [TABLE_Accounts_History] H
INNER JOIN [TABLE_For_Filtering] A ON H.[RSIN] = A.[RSIN]
WHERE
H.[NUM] = (SELECT TOP 1 [NUM] FROM [TABLE_Accounts_History]
WHERE [RSIN] = H.[RSIN]
AND [AccountSys] = H.[AccountSys]
AND [Cl_Acc_Typ] = H.[Cl_Acc_Typ]
AND [DATE_DEAL] < @dte
ORDER BY [DATE_DEAL] DESC)
AND H.[TYPE_DEAL] <> 'D'
Table TABLE_Accounts_History contains 3,200,000 records.
Table TABLE_For_Filtering contains around 1,500 records.
The insert took 2m 40s and inserted 1,600,000 records for further work.
But then I decided to attach a column from the pretty small table TABLE_Additional (only around 100 records):
DELETE FROM [TABLE_TEMP]
INSERT INTO [TABLE_TEMP]
SELECT H.*, P.[prof_type]
FROM [TABLE_Accounts_History] H
INNER JOIN [TABLE_For_Filtering] A ON H.[RSIN] = A.[RSIN]
LEFT JOIN [TABLE_Additional] P ON H.[ACCOUNTSYS] = P.[AccountSys]
WHERE H.[NUM] = ( SELECT TOP 1 [NUM]
FROM [TABLE_Accounts_History]
WHERE [RSIN] = H.[RSIN]
AND [AccountSys] = H.[AccountSys]
AND [Cl_Acc_Typ] = H.[Cl_Acc_Typ]
AND [DATE_DEAL] < @dte
ORDER BY [DATE_DEAL] DESC)
AND H.[TYPE_DEAL] <> 'D'
And now it takes ages for this query to complete. Why is that? How can such a small left join tank performance so badly? How can I improve it?
An update: no luck so far with the LEFT JOIN - indexes, no indexes, hinted indexes. For now I've found a workaround: use my first query and run an UPDATE after it:
UPDATE [TABLE_TEMP]
SET [PROF_TYPE] = P1.[prof_type]
FROM [TABLE_TEMP] A1
LEFT JOIN [TABLE_Additional] P1
    ON A1.[ACCOUNTSYS] = P1.[AccountSys]
It takes only 5s and does pretty much the same thing I was trying to achieve. SQL Server performance is still a mystery to me.
The 'small' left join is actually doing a lot of extra work for you: SQL Server has to go back to TABLE_Additional for each row of the inner join between TABLE_Accounts_History and TABLE_For_Filtering. You can help SQL Server speed this up with some indexing (sketched in code after this list). You could:
1) Ensure TABLE_Accounts_History has an index on the Foreign Key H.[ACCOUNTSYS]
2) If you think that TABLE_Additional will always be accessed by AccountSys, i.e. you will be requesting AccountSys in ordered groups, you could create a clustered index on TABLE_Additional.AccountSys (in other words, physically order the table on disk by AccountSys).
3) You could also ensure there is a foreign key index on TABLE_Accounts_History.
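A sketch of suggestions 1) and 2) (index names are made up, and the clustered option assumes TABLE_Additional does not already have a clustered primary key):
CREATE NONCLUSTERED INDEX IX_History_AccountSys
    ON [TABLE_Accounts_History] ([ACCOUNTSYS]);   -- suggestion 1
CREATE CLUSTERED INDEX CIX_Additional_AccountSys
    ON [TABLE_Additional] ([AccountSys]);         -- suggestion 2
-- A covering nonclustered alternative to the clustered index:
-- CREATE NONCLUSTERED INDEX IX_Additional_AccountSys
--     ON [TABLE_Additional] ([AccountSys]) INCLUDE ([prof_type]);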
A left outer join selects all rows from the left table. In your case the left table has 3,200,000 rows, and each one is compared against the right table. One solution is to use indexes, which will reduce retrieval time.

Prevent MORE records returned by JOINING a lookup table?

I am having a problem: my lookup table is producing MORE records than my original query. I feel I am missing something basic. How do I prevent ending up with more records when bringing in a column or two from the second table?
-- 140930
SELECT COUNT(ID)
FROM dbo.USER_ACCOUNTS AS A
-- 143324
LEFT JOIN dbo.DOMAIN AS B
ON A.Domain = B.DOMAIN
As you can see, my count grows to 143,324 after the join. I have tried outer joins as well. There are only 150 or so domains to join on, AND some rows should not even be in the results because no domain match should be found!?
This is SQL SERVER 2008 R2
The count grows because dbo.DOMAIN contains duplicate DOMAIN values, so one account row can match several lookup rows. EXISTS is a semi-join: it tests for a match without multiplying rows.
SELECT COUNT(ID)
FROM dbo.USER_ACCOUNTS AS A
WHERE EXISTS (
SELECT 1
FROM dbo.DOMAIN AS B
WHERE A.Domain = B.DOMAIN
)
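If you actually need a column back from the lookup table, a de-duplicated derived table also cannot multiply rows (a sketch, assuming DOMAIN is the only column you need):
SELECT COUNT(A.ID)
FROM dbo.USER_ACCOUNTS AS A
LEFT JOIN (SELECT DISTINCT [DOMAIN] FROM dbo.DOMAIN) AS B
    ON A.Domain = B.DOMAIN;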

Dealing with large amounts of data, and a query with 12 inner joins in SQL Server 2008

There is an old SSIS package that pulls a lot of data from Oracle into our SQL Server database every day. The data is inserted into a non-normalized database, and I'm working on a stored procedure to select that data and insert it into a normalized database. The Oracle databases were overly normalized, so the query I wrote ended up having 12 inner joins to get all the columns I need. Another problem is that I'm dealing with large amounts of data: one table I'm selecting from has over 12 million records. Here is my query:
Declare @MewLive Table
(
UPC_NUMBER VARCHAR(50),
ITEM_NUMBER VARCHAR(50),
STYLE_CODE VARCHAR(20),
COLOR VARCHAR(8),
SIZE VARCHAR(8),
UPC_TYPE INT,
LONG_DESC VARCHAR(120),
LOCATION_CODE VARCHAR(20),
TOTAL_ON_HAND_RETAIL NUMERIC(14,0),
VENDOR_CODE VARCHAR(20),
CURRENT_RETAIL NUMERIC(14,2)
)
INSERT INTO @MewLive(UPC_NUMBER,ITEM_NUMBER,STYLE_CODE,COLOR,[SIZE],UPC_TYPE,LONG_DESC,LOCATION_CODE,TOTAL_ON_HAND_RETAIL,VENDOR_CODE,CURRENT_RETAIL)
SELECT U.UPC_NUMBER, REPLACE(ST.STYLE_CODE, '.', '')
+ '-' + SC.SHORT_DESC + '-' + REPLACE(SM.PRIM_SIZE_LABEL, '.', '') AS ItemNumber,
REPLACE(ST.STYLE_CODE, '.', '') AS Style_Code, SC.SHORT_DESC AS Color,
REPLACE(SM.PRIM_SIZE_LABEL, '.', '') AS Size, U.UPC_TYPE, ST.LONG_DESC, L.LOCATION_CODE,
IB.TOTAL_ON_HAND_RETAIL, V.VENDOR_CODE, SD.CURRENT_RETAIL
FROM MewLive.dbo.STYLE AS ST INNER JOIN
MewLive.dbo.SKU AS SK ON ST.STYLE_ID = SK.STYLE_ID INNER JOIN
MewLive.dbo.UPC AS U ON SK.SKU_ID = U.SKU_ID INNER JOIN
MewLive.dbo.IB_INVENTORY_TOTAL AS IB ON SK.SKU_ID = IB.SKU_ID INNER JOIN
MewLive.dbo.LOCATION AS L ON IB.LOCATION_ID = L.LOCATION_ID INNER JOIN
MewLive.dbo.STYLE_COLOR AS SC ON ST.STYLE_ID = SC.STYLE_ID INNER JOIN
MewLive.dbo.COLOR AS C ON SC.COLOR_ID = C.COLOR_ID INNER JOIN
MewLive.dbo.STYLE_SIZE AS SS ON ST.STYLE_ID = SS.STYLE_ID INNER JOIN
MewLive.dbo.SIZE_MASTER AS SM ON SS.SIZE_MASTER_ID = SM.SIZE_MASTER_ID INNER JOIN
MewLive.dbo.STYLE_VENDOR AS SV ON ST.STYLE_ID = SV.STYLE_ID INNER JOIN
MewLive.dbo.VENDOR AS V ON SV.VENDOR_ID = V.VENDOR_ID INNER JOIN
MewLive.dbo.STYLE_DETAIL AS SD ON ST.STYLE_ID = SD.STYLE_ID
WHERE (U.UPC_TYPE = 1) AND (ST.ACTIVE_FLAG = 1)
That query pretty much crashes our server. I tried to fix the problem by breaking the query up into smaller queries, but the table variable I use causes the tempdb database to fill the hard drive. I figure this is because the server runs out of memory and crashes. Is there any way to solve this problem?
Have you tried using a real table instead of the table variable? You can use SELECT INTO to create a real table to store the results.
Syntax would be:
SELECT
U.UPC_NUMBER,
REPLACE(ST.STYLE_CODE, '.', ''),
....
INTO
MEWLIVE
FROM
MewLive.dbo.STYLE AS ST INNER JOIN
...
The command will create the table, and may help with the memory issues you are seeing.
Additionally, try looking at the execution plan in Query Analyzer, or try the Index Tuning Wizard, to find indexes that may help speed up the query.
Try running the query from the Oracle server rather than from the SQL server. As it stands, there's most likely going to be a lot of communication over the wire as the query tries to process.
By pre-processing the joins (maybe with a view), you'll only be sending over the results.
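For example (a sketch: 'ORA' is a hypothetical linked server name, and prejoined_view would be a view created on the Oracle side that performs the joins there):
SELECT *
INTO dbo.MewLiveStaging -- hypothetical local staging table
FROM OPENQUERY(ORA, 'SELECT UPC_NUMBER, VENDOR_CODE FROM prejoined_view');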
Regarding the over-normalization: have you tested whether or not it's an issue in terms of speed? I find it hard to believe that it could be too normalized.
Proper indexing will definitely help, provided the number of rows in this query is not in the "zillions". Try the following:
The join on dbo.COLOR is redundant if there is a foreign key dbo.STYLE_COLOR(COLOR_ID) => dbo.COLOR(COLOR_ID).
Proposed indexes (a generous set; review before creating):
USE MewLive
CREATE INDEX ix1 ON dbo.STYLE (STYLE_ID)
INCLUDE (STYLE_CODE, LONG_DESC)
WHERE ACTIVE_FLAG = 1
GO
CREATE INDEX ix2 ON dbo.UPC (SKU_ID)
INCLUDE(UPC_NUMBER)
WHERE UPC_TYPE = 1
GO
CREATE INDEX ix3 ON dbo.SKU(STYLE_ID)
INCLUDE(SKU_ID)
GO
CREATE INDEX ix3_alternative ON dbo.SKU(SKU_ID)
INCLUDE(STYLE_ID)
GO
CREATE INDEX ix4 ON dbo.IB_INVENTORY_TOTAL(SKU_ID, LOCATION_ID)
INCLUDE(TOTAL_ON_HAND_RETAIL)
GO
CREATE INDEX ix5 ON dbo.LOCATION(LOCATION_ID)
INCLUDE(LOCATION_CODE)
GO
CREATE INDEX ix6 ON dbo.STYLE_COLOR(STYLE_ID)
INCLUDE(SHORT_DESC,COLOR_ID)
GO
CREATE INDEX ix7 ON dbo.COLOR(COLOR_ID)
GO
CREATE INDEX ixB ON dbo.STYLE_SIZE(STYLE_ID)
INCLUDE(SIZE_MASTER_ID)
GO
CREATE INDEX ix8 ON dbo.SIZE_MASTER(SIZE_MASTER_ID)
INCLUDE(PRIM_SIZE_LABEL)
GO
CREATE INDEX ix9 ON dbo.STYLE_VENDOR(STYLE_ID)
INCLUDE(VENDOR_ID)
GO
CREATE INDEX ixA ON dbo.VENDOR(VENDOR_ID)
INCLUDE(VENDOR_CODE)
GO
CREATE INDEX ixC ON dbo.STYLE_DETAIL(STYLE_ID)
INCLUDE(CURRENT_RETAIL)
In the SELECT list, replace U.UPC_TYPE with 1 AS UPC_TYPE, since the WHERE clause already fixes it to 1.
Can you segregate the imports - batch them by SKU/location/vendor/whatever and run multiple queries to get the data over? Is there a particular reason it all needs to go across in one hit, apart from the ease of writing the query? A hypothetical batching sketch follows.
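A sketch of batching by vendor: loop over VENDOR_IDs and run the extract once per vendor so tempdb and log usage stay bounded. Only two columns and five of the twelve joins are shown to keep it short, and dbo.NormalizedTarget is a hypothetical destination.
IF OBJECT_ID('dbo.NormalizedTarget') IS NULL
    CREATE TABLE dbo.NormalizedTarget (UPC_NUMBER varchar(50), VENDOR_CODE varchar(20));

DECLARE @vendors TABLE (VENDOR_ID int PRIMARY KEY);
INSERT INTO @vendors SELECT VENDOR_ID FROM MewLive.dbo.VENDOR;

DECLARE @v int;
WHILE EXISTS (SELECT 1 FROM @vendors)
BEGIN
    SELECT TOP (1) @v = VENDOR_ID FROM @vendors ORDER BY VENDOR_ID;

    INSERT INTO dbo.NormalizedTarget (UPC_NUMBER, VENDOR_CODE)
    SELECT U.UPC_NUMBER, V.VENDOR_CODE
    FROM MewLive.dbo.STYLE AS ST
    INNER JOIN MewLive.dbo.SKU AS SK ON ST.STYLE_ID = SK.STYLE_ID
    INNER JOIN MewLive.dbo.UPC AS U ON SK.SKU_ID = U.SKU_ID
    INNER JOIN MewLive.dbo.STYLE_VENDOR AS SV ON ST.STYLE_ID = SV.STYLE_ID
    INNER JOIN MewLive.dbo.VENDOR AS V ON SV.VENDOR_ID = V.VENDOR_ID
    WHERE U.UPC_TYPE = 1 AND ST.ACTIVE_FLAG = 1
      AND SV.VENDOR_ID = @v; -- one vendor per iteration

    DELETE FROM @vendors WHERE VENDOR_ID = @v;
END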
