Adding clustered index on temp table to improve performance - sql-server

I ran an execution plan and noticed that the query spends its time inserting into temp tables. We have multiple queries that insert into temp tables; I have shared two of them below. How do I add a clustered index to a temp table from within the stored procedure? It needs to create the index on the fly and destroy it afterwards.
if object_id('tempdb..#MarketTbl') is not null drop table #MarketTbl else
select
mc.companyId,
mc.pricingDate,
mc.tev,
mc.sharesOutstanding,
mc.marketCap
into #MarketTbl
from ciqMarketCap mc
where mc.pricingDate > @date
and mc.companyId in (select val from #companyId)
---- pricing table: holds pricing data for the stock price
if object_id('tempdb..#PricingTbl') is not null drop table #PricingTbl else
select
s.companyId,
peq.pricingDate,
ti.currencyId,
peq.priceMid
into #PricingTbl
from ciqsecurity s
join ciqtradingitem ti on s.securityid = ti.securityid
join ciqpriceequity peq on peq.tradingitemid = ti.tradingitemid
where s.primaryFlag = 1
and s.companyId in (select val from #companyId)
and peq.pricingDate > @date
and ti.primaryflag = 1
Execution plan (screenshot not reproduced)

What you are doing is pure nonsense. You have to speed up your select, not the insert.
And to speed it up you (maybe) need indexes on the tables from which you select.
What you are doing now is trying to add a clustered index to a table that does not exist (the error tells you so!), and the table does not exist because, if it exists, you drop it.

1. First, if your data is not more than 5 to 10 thousand rows, do not use a temp table; use a table variable.
2. You can create the index after inserting the data, using ALTER TABLE (or CREATE INDEX) syntax, as in the sketch below.
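A minimal sketch of that pattern against the first query in the question (the index name and key columns here are assumptions; choose keys that match your joins and filters):
if object_id('tempdb..#MarketTbl') is not null drop table #MarketTbl
select mc.companyId, mc.pricingDate, mc.tev, mc.sharesOutstanding, mc.marketCap
into #MarketTbl
from ciqMarketCap mc
where mc.pricingDate > @date
and mc.companyId in (select val from #companyId)
-- create the index only after the table exists and is populated
create clustered index IX_MarketTbl on #MarketTbl (companyId, pricingDate)
-- ...use #MarketTbl here...
drop table #MarketTbl -- destroy it when done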

Related

How to optimize SQL Server Merge statement running with millions of records

I use SQL Server 2014 and need to update a newly added datetime column in one table. There are two related tables (both have > 30 million records):
TableA:
CategoryID, itemID, DateCreated, Deleted, and some other string properties.
This table contains multiple records for each item, with different DateCreated values.
TableB:
CategoryID, itemID, LatestUpdatedDate (This is the new added column)
Both CategoryID and itemID are part of an index on this table.
To update TableB's LatestUpdatedDate from TableA on matched CategoryID and itemID, I used the following MERGE statement:
merge [dbo].[TableB] with(HOLDLOCK) as t
using
(
select CategoryID,itemID, max(DateCreated) as LatestUpdatedDate
from dbo.TableA
where TableA.Deleted = 0
group by CategoryID,itemID
) as s on t.CategoryID = s.CategoryID and t.itemID = s.itemID
when matched then
update
set t.LatestUpdatedDate = s.LatestUpdatedDate
when not matched then
insert (CategoryID, itemID, LatestUpdatedDate)
values (s.CategoryID, s.itemID, s.LatestUpdatedDate)
Given that there are millions of records in both tables, how can I optimize this script? Or is there any other way to update the table with better performance?
Note: This is a one-off script and the DB is live; in the future a trigger will be added to TableA on insert to update the date in TableB.
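For reference, a hedged sketch of such a trigger (the trigger name is an assumption; it aggregates the inserted rows so that a multi-row insert updates each target row once):
CREATE TRIGGER trg_TableA_AfterInsert ON dbo.TableA
AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON;
    -- take the max DateCreated per (CategoryID, itemID) among the new rows
    UPDATE b
    SET b.LatestUpdatedDate = i.MaxDate
    FROM dbo.TableB b
    JOIN (SELECT CategoryID, itemID, MAX(DateCreated) AS MaxDate
          FROM inserted
          WHERE Deleted = 0
          GROUP BY CategoryID, itemID) i
      ON i.CategoryID = b.CategoryID AND i.itemID = b.itemID
    WHERE b.LatestUpdatedDate IS NULL OR i.MaxDate > b.LatestUpdatedDate;
END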
As per Optimizing MERGE Statement Performance, the best you can do is:
Create an index on the join columns in the source table that is unique and covering.
Create a unique clustered index on the join columns in the target table.
You may get a performance improvement during the MERGE by creating an index on TableA on (Deleted, CategoryID, itemID) INCLUDE (DateCreated). However, since this is a one-off operation, the resources (time, CPU, space) required to create this index probably won't offset the performance gains vis-à-vis running the query as-is and relying on your existing index.
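If you do want to try it, a sketch of that index (the name is a placeholder):
CREATE NONCLUSTERED INDEX IX_TableA_Deleted_CategoryID_itemID
ON dbo.TableA (Deleted, CategoryID, itemID)
INCLUDE (DateCreated);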

Database Index when SQL statement includes "IN" clause

I have a SQL statement which takes a really long time to execute, and I really need to improve it somehow.
select * from table where ID=1 and GROUP in
(select group from groupteam where department = 'marketing')
My question is: if I create an index on columns ID and GROUP, would it help? Or, if not, should I create an index on the second table's DEPARTMENT column? Or should I create indexes on both tables?
The first table has 249,003 rows.
The second table has 900 rows in total, while the query on that table returns only 2 rows.
That is why I am surprised that the response is so slow.
Thank you
You can also use EXISTS, depending on your database, like so:
select * from table t
where id = 1
and exists (
select 1 from groupteam
where department = 'marketing'
and group = t.group
)
Create a composite index or individual indexes on groupteam's department and group columns.
Create a composite index or individual indexes on table's id and group columns.
Do an EXPLAIN/ANALYZE, depending on your database, to review how the indexes are being used by your database engine. The composite variants are sketched below.
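A hedged sketch of the composite indexes (index names are placeholders; group and table are bracketed because they are reserved words):
create index ix_groupteam_department_group on groupteam (department, [group]);
create index ix_table_id_group on [table] (ID, [group]);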
Try a join instead:
select * from table t
JOIN groupteam gt
ON t.group = gt.group
where ID=1 AND gt.department= 'marketing'
An index on table's group and id columns and on groupteam's group column would help too.

MS SQL - Delete query taking too much time

I have the following query script:
declare @tblSectionsList table
(
SectionID int,
SectionCode varchar(255)
)
--assume @tblSectionsList has 50 section rows
DELETE
td
from
[dbo].[InventoryDocumentDetails] td
inner join [dbo].InventoryDocuments th
on th.Id = td.InventoryDocumentDetail_InventoryDocument
inner join @tblSectionsList ts
on ts.SectionID = th.InventoryDocument_Section
This script involves three tables, where @tblSectionsList is a table variable that may contain about 50 records. I use it in a join condition with the InventoryDocuments table, which is then further joined to the InventoryDocumentDetails table. All joins are based on INT foreign keys.
Over the weekend I put this query on the server, and it is still running even after 2 days and 4 hours. Can anybody tell me if I am doing something wrong, or is there any way to improve its performance? I don't even know how much more time it will take to give me the result.
Before this I also tried to create an index on the InventoryDocumentDetails table with the following script:
CREATE NONCLUSTERED INDEX IX_InventoryDocumentDetails_InventoryDocument
ON dbo.InventoryDocumentDetails (InventoryDocumentDetail_InventoryDocument);
But this script also ran for more than one day and did not finish, so I cancelled it.
Additional info:
I am using MS SQL 2008 R2.
The InventoryDocuments table contains 2,108,137 rows and has primary key 'Id'.
The InventoryDocumentDetails table contains 25,055,158 rows and has primary key 'Id'.
Both tables have primary keys defined.
CPU: Intel Xeon, with 32 GB RAM.
No other indexes are defined, because whenever I try to create a new index, that query also gets suspended.
Query execution plan (screenshots not reproduced).
The following query returns one row for this session, showing status = 'suspended' and wait_type = 'LCK_M_IX':
SELECT r.session_id as spid, r.[status], r.command, t.[text],
       OBJECT_NAME(t.objectid, t.[dbid]) as object, r.logical_reads,
       r.blocking_session_id as blocked, r.wait_type,
       s.host_name, s.host_process_id, s.program_name, r.start_time
FROM sys.dm_exec_requests AS r
LEFT OUTER JOIN sys.dm_exec_sessions s ON s.session_id = r.session_id
OUTER APPLY sys.dm_exec_sql_text(r.[sql_handle]) AS t
WHERE r.session_id <> @@SPID AND r.session_id > 50
What happens when you change the inner join to EXISTS?
DELETE td
FROM [dbo].[InventoryDocumentDetails] td
WHERE EXISTS (SELECT 1
FROM [dbo].InventoryDocuments th
WHERE EXISTS (SELECT 1
FROM @tblSectionsList ts
WHERE ts.SectionID = th.InventoryDocument_Section)
AND th.Id = td.InventoryDocumentDetail_InventoryDocument)
It can sometimes be more efficient, time-wise, to truncate a table and re-import the records you want to keep. A delete operation on a large table is incredibly slow compared to an insert. Of course, this is only an option if you can take your table offline. Also, only do this if your recovery model is set to SIMPLE. The steps, with a sketch after the list:
Drop triggers on table A.
Bulk copy table A to B.
Truncate table A.
Enable identity insert.
Insert into A from B where A.ID is not in the set of IDs to delete.
Disable identity insert.
Rebuild indexes.
Enable triggers.
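A minimal sketch of those steps, assuming a simple dbo.TableA (Id identity, Col1) and a #IdsToDelete temp table; all of these names are placeholders:
select * into dbo.TableA_Copy from dbo.TableA;    -- bulk copy A to a work table
truncate table dbo.TableA;                        -- minimally logged, far faster than DELETE
set identity_insert dbo.TableA on;
insert into dbo.TableA (Id, Col1)
select Id, Col1 from dbo.TableA_Copy
where Id not in (select Id from #IdsToDelete);    -- keep only the surviving rows
set identity_insert dbo.TableA off;
-- then rebuild indexes and re-enable any triggers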
Try something like the below. It might give you some idea at least.
DELETE FROM [dbo].[InventoryDocumentDetails]
WHERE InventoryDocumentDetails_PK IN
(SELECT td.InventoryDocumentDetails_PK
FROM [dbo].[InventoryDocumentDetails] td
INNER JOIN [dbo].InventoryDocuments th ON th.Id = td.InventoryDocumentDetail_InventoryDocument
INNER JOIN @tblSectionsList ts ON ts.SectionID = th.InventoryDocument_Section)

Which query will execute faster: a query which uses a table object or a query which uses a temporary table in SQL Server

I have created a stored procedure in which I have used a table object and inserted some columns into it. Below is the procedure:
CREATE Procedure [dbo].[usp_Security] (@CredentialsList dbo.Type_UserCredentialsList ReadOnly) As
Begin
Declare
@Result Table
(
IdentityColumn Int NOT NULL Identity (1, 1) PRIMARY KEY,
UserCredentials nVarChar(4000),
UserName nVarChar(100),
UserRole nVarChar(200),
RoleID Int,
Supervisor Char(3),
AcctMaintDecn Char(3),
EditPendInfo Char(3),
ReqInstID Char(3)
)
Insert Into @Result
Select Distinct UserCredentials, 'No', D.RoleName, D.RoleID, 'No', 'No', 'No', 'No' From @CredentialsList A
Join SecurityRepository.dbo.SecurityUsers B On CharIndex(B.DomainAccount, A.UserCredentials) > 0
Join SecurityRepository.dbo.SecurityUserRoles C On C.UserID = B.UserID
Join SecurityRepository.dbo.SecurityRoles D On D.RoleID = C.RoleID
Where D.RoleName Like 'AOT.%' And B.IsActive = 1 And D.IsActive = 1
Update A
Set A.UserName = B.UserName
From @Result A
Join @CredentialsList B On A.UserCredentials = B.UserCredentials
-- "Supervisor" Column
Update A
Set A.Supervisor = 'Yes'
From @Result A
Join SecurityRepository.dbo.SecurityUsers B On CharIndex(B.DomainAccount, A.UserCredentials) > 0
Join SecurityRepository.dbo.SecurityUserRoles C On C.UserID = B.UserID
Join SecurityRepository.dbo.SecurityRoles D On D.RoleID = C.RoleID
Where D.RoleName In ('AOT.Manager', 'AOT.Deps Ops Admin', 'AOT.Fraud Manager', 'AOT.Fulfillment Manager')
And B.IsActive = 1 And D.IsActive = 1
-- Return Result
Select * From @Result Order By UserName, UserRole
End
In the above procedure, I have made use of a table object and created a clustered index on it (via the primary key).
However, if I create a temporary table and then process the above info in the SP, will it be faster than using the table object? I tried creating a separate clustered index on a column of the table object, but it does not allow me to, as we cannot run CREATE INDEX on a table object.
I want to make use of a temporary table in the above stored procedure, but will it reduce the cost compared to using the table object?
It depends! - as always there are a lot of factors that come into play here.
A table variable tends to work best for small numbers of rows - e.g. 10, 20 rows - since it never has statistics, cannot have indices on it, and the SQL Server query optimizer will always assume it has just a single row of data. If you have too many rows in a table variable, this will badly skew the execution plan being determined.
Furthermore, the table variable doesn't participate in transaction handling, which can be a good or a bad thing - so if you insert 10 rows into a table variable inside a transaction and then roll back that transaction - those rows are still in your table variable. Just be aware of that!
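A quick demo of that rollback behavior, safe to run as-is:
Declare @t Table (i Int)
Begin Tran
Insert Into @t Values (1)
Rollback
Select Count(*) From @t -- returns 1: the insert survived the rollback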
The temporary table works best if you intend to have rather many rows, if you might even need to index something.
Temporary tables also behave just like regular tables in transactional processing, e.g. a transaction will affect those temporary tables.
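For illustration, a hedged sketch of the temp-table variant of the asker's result table (columns trimmed for brevity; the index name is a placeholder). Unlike a table variable, it accepts CREATE INDEX after creation:
Create Table #Result
(
IdentityColumn Int NOT NULL Identity (1, 1) PRIMARY KEY, -- clustered, as before
UserCredentials nVarChar(4000),
UserName nVarChar(100),
UserRole nVarChar(200)
)
-- an explicit nonclustered index, which a table variable would not allow:
Create NonClustered Index IX_Result_UserName On #Result (UserName, UserRole)
-- ...populate and query #Result as in the procedure...
Drop Table #Result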
But again: the real way to find out is to try it and measure it - and try again and measure again.

Table Valued Parameter has slow performance because of table scan

I have an application that passes parameters to a procedure in SQL. One of the parameters is a table-valued parameter containing items to include in a where clause.
Because the table-valued parameter has no statistics attached to it, when I join my TVP to a table that has 2 million rows I get a very slow query.
What alternatives do I have ?
Again, the goal is to pass certain values to a procedure that will be included in a where clause:
select * from table1 where id in
(select id from @mytvp)
or
select * from table1 t1 join @mytvp tvp on t1.id = tvp.id
Although it looks like it would need to run the query once for each row in table1, EXISTS often optimizes to be more efficient than a JOIN or an IN. So, try this:
select * from table1 t where exists (select 1 from @mytvp p where t.id=p.id)
Also, be sure that t.id is the same datatype as p.id and that t.id has an index.
You can use a temp table with an index to boost performance (assuming you have more than a couple of records in your @mytvp).
Just before you join the table, you could insert the data from the variable @mytvp into a temp table; a sketch of that copy step follows the sample below.
Here's sample code to create a temp table with indexes. The primary key and unique constraints determine which columns to index on:
CREATE TABLE #temp_employee_v3
(rowID int not null identity(1,1)
,lname varchar (30) not null
,fname varchar (30) not null
,city varchar (20) not null
,state char (2) not null
,PRIMARY KEY (lname, fname, rowID)
,UNIQUE (state, city, rowID) )
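And a hedged sketch of the copy step itself; #ids and its single id column are assumptions matching the queries above:
create table #ids (id int not null primary key);
insert into #ids (id) select id from @mytvp;
select t1.* from table1 t1 join #ids i on t1.id = i.id;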
I had the same issue: table-valued parameters were very slow in my context. I came up with a solution that passed the list of values as a comma-separated string to the stored procedure; the procedure then made a PATINDEX(...) > 0 comparison. This was about a factor of 6 faster.
As mentioned here and explained here you can have primary key and unique constraints on the table type. E.g.
CREATE TYPE IdList AS TABLE ( Id UNIQUEIDENTIFIER NOT NULL PRIMARY KEY )
However, check whether it improves performance in your case: these indexes already exist while the TVP is being populated, which might have a counter-effect depending on whether your input is sorted and/or you use more than one column.
In common with table variables, table-valued parameters have no statistics (see the section "restrictions"); the query optimiser works on the assumption that they contain only one row, which if your parameter contains a lot of rows is likely to result in an inappropriate query plan.
One way to improve your chances of a better plan is to add a statement level recompile; this should enable the optimiser to take the size of the TVP into account when selecting a plan.
select * from table1 t where exists (select 1 from @mytvp p where t.id=p.id) OPTION (RECOMPILE)
(incorporating KM's suggestion)
