Including value from temp table slows down query - sql-server

I have a stored procedure that uses a temporary table to make some joins in a select clause. The select clause contains the value from the Id column of the temporary table like this:
CREATE TABLE #TempTable
(
Id INT PRIMARY KEY,
RootVal INT
)
The Select looks like this:
Select value1, value2, #TempTable.Id AS ValKey
From MainTable INNER JOIN #TempTable ON MainTable.RootVal = #TempTable.RootVal
The query takes over a minute to run in real life but if I remove the "#TempTable.Id" from the select list it runs in a second.
Does anyone know why there is such a huge cost to including a value from a #temp table compared to just using it in a join?

Most likely one of:
data type mismatch (e.g. nvarchar vs int)
lack of an index on MainTable.RootVal
Also, why have Id as the PK and then JOIN on another column?
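Both suspected causes can be checked with a few statements. The following is only a sketch: it assumes MainTable lives in the dbo schema and that value1 and value2 are MainTable columns, and the index names are made up for illustration.

```sql
-- Inspect the declared type of the join column in MainTable; if it differs
-- from the INT used in #TempTable, the join needs an implicit conversion.
SELECT c.name AS column_name, t.name AS type_name, c.max_length
FROM sys.columns AS c
JOIN sys.types AS t ON t.user_type_id = c.user_type_id
WHERE c.object_id = OBJECT_ID('dbo.MainTable')
  AND c.name = 'RootVal';

-- Index the join column (hypothetical index name). value1 and value2 are
-- assumed to be MainTable columns, so including them lets the index cover
-- the query.
CREATE NONCLUSTERED INDEX IX_MainTable_RootVal
    ON dbo.MainTable (RootVal)
    INCLUDE (value1, value2);

-- The temp table is joined on RootVal, not on its primary key, so an index
-- on that column may help as well. Id is the clustered key and is therefore
-- carried in this index automatically.
CREATE NONCLUSTERED INDEX IX_TempTable_RootVal
    ON #TempTable (RootVal);
```

Comparing the actual execution plans with and without #TempTable.Id in the select list should show whether either change affects the slow version.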

Related

Query tuning required for expensive query

Can someone help me optimize this code? I could optimize it with a computed column, but we cannot change the schema in production because we are not sure how many APIs push data into this table. The table has millions of rows, and adding a non-clustered index does not help: the query still goes for a scan.
create table testcts(
name varchar(100)
)
go
insert into testcts(
name
)
select 'VK.cts.com'
union
select 'GK.ms.com'
go
DECLARE @list varchar(100) = 'VK,GK'
select * from testcts where replace(replace(name,'.cts.com',''),'.ms.com','') in (select value from string_split(@list,','))
drop table testcts
One possibility might be to strip off the .cts.com and .ms.com subdomain/domain endings before you insert or store the name data in your table. Then, use the following query instead:
SELECT *
FROM testcts
WHERE name IN (SELECT value FROM STRING_SPLIT(@list, ','));
Now SQL Server should be able to use an index on the name column.
If your values are always suffixed by cts.com or ms.com you could add that to the search pattern:
SELECT {YourColumns} --Don't use *
FROM dbo.testcts t
JOIN (SELECT CONCAT(SS.[value], V.Suffix) AS [value]
FROM STRING_SPLIT(@list, ',') SS
CROSS APPLY (VALUES ('.cts.com'),
('.ms.com')) V (Suffix) ) L ON t.[name] = L.[value];

selecting keys which are not in another table takes forever

I have a query like this:
select key, name from localtab where key not in (select key from remotetab);
The query takes forever, and I don't understand why.
localtab is local table, and remotetab is a remote table in another server. key is an int column which has a unique index in both tables. When I query the both tables separately, it takes just a few seconds.
Linked servers have terrible performance. Get the data you need onto the local server and do the bulk of the processing there, instead of mixing local and remote tables in a single query.
First, select remotetab into a temp table:
select [key] into #remote_made_local from remotetab
Then use the #temp table for the where clause filtering, and use exists instead of in for better performance:
select a.[key], a.name from localtab a where not exists (select 1 from #remote_made_local b where b.[key] = a.[key] )
rather than:
select [key], name from localtab where [key] not in (select [key] from #remote_made_local)
There is also a solution without using temporary tables.
By using a left join instead of not in (select ...), you can massively speed up the query. Like this:
select l.[key], l.name
from localtab l left join remotetab r on l.[key] = r.[key]
where r.[key] is null ;
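One caveat when swapping between these forms: NOT IN, NOT EXISTS and LEFT JOIN ... IS NULL return the same rows only when the compared column cannot contain NULL. The sketch below uses table variables as stand-ins for localtab and remotetab (the sample rows are invented) to show the difference.

```sql
-- Stand-ins for the two tables; the remote side deliberately holds a NULL.
DECLARE @localtab  TABLE ([key] INT NOT NULL, name VARCHAR(20) NOT NULL);
DECLARE @remotetab TABLE ([key] INT NULL);

INSERT INTO @localtab ([key], name) VALUES (1, 'a'), (2, 'b'), (3, 'c');
INSERT INTO @remotetab ([key]) VALUES (1), (NULL);

-- NOT IN: returns no rows at all, because the comparison with NULL is
-- UNKNOWN for every candidate row.
SELECT l.[key], l.name
FROM @localtab AS l
WHERE l.[key] NOT IN (SELECT r.[key] FROM @remotetab AS r);

-- NOT EXISTS: returns keys 2 and 3.
SELECT l.[key], l.name
FROM @localtab AS l
WHERE NOT EXISTS (SELECT 1 FROM @remotetab AS r WHERE r.[key] = l.[key]);

-- LEFT JOIN ... IS NULL: also returns keys 2 and 3.
SELECT l.[key], l.name
FROM @localtab AS l
LEFT JOIN @remotetab AS r ON r.[key] = l.[key]
WHERE r.[key] IS NULL;
```

If the key column is declared NOT NULL on both sides, all three queries are equivalent and the choice is purely about the plan the optimizer produces.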

Does MS SQL Server automatically create a temp table if the query contains a lot of IDs in the IN clause?

I have a big query to get multiple rows by id's like
SELECT *
FROM TABLE
WHERE Id in (1001..10000)
This query runs very slowly and ends with a timeout exception.
A temporary fix is to break the query into 10 parts of 1000 IDs each.
I heard that using temp tables may help in this case, but it also looks like SQL Server does this automatically underneath.
What is the best way to handle problems like this?
You could write the query as follows using a temporary table:
CREATE TABLE #ids(Id INT NOT NULL PRIMARY KEY);
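-- A single VALUES list accepts at most 1000 rows, so split larger lists into several INSERT statements.
INSERT INTO #ids(Id) VALUES (1001),(1002),/*add your individual Ids here*/(2000);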
SELECT
t.*
FROM
[Table] AS t
INNER JOIN #ids AS ids ON
ids.Id=t.Id;
DROP TABLE #ids;
My guess is that it will probably run faster than your original query. Lookup can be done directly using an index (if it exists on the [Table].Id column).
Your original query translates to
SELECT *
FROM [TABLE]
WHERE Id=1001 OR Id=1002 OR /*...*/ OR Id=10000;
This would require evaluation of the expression Id=1001 OR Id=1002 OR /*...*/ OR Id=10000 for every row in [Table], which probably takes longer than with a temporary table. The example with a temporary table takes each Id in #ids and looks for a corresponding Id in [Table] using an index.
This all assumes that there are gaps in the Ids between 1001 and 10000. Otherwise it would be easier to write
SELECT *
FROM [TABLE]
WHERE Id BETWEEN 1001 AND 10000;
This would also require an index on [Table].Id to speed it up.
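If the Ids arrive from the application as one delimited string, the temporary table can be filled without writing out a VALUES list. This is a sketch under assumptions: @idList and its contents are hypothetical, and STRING_SPLIT requires SQL Server 2016 or later with database compatibility level 130 or higher.

```sql
-- Hypothetical input: the Ids as a comma-separated string.
DECLARE @idList VARCHAR(MAX) = '1001,1002,1005,10000';

CREATE TABLE #ids (Id INT NOT NULL PRIMARY KEY);

-- DISTINCT guards against duplicate Ids violating the primary key.
INSERT INTO #ids (Id)
SELECT DISTINCT CAST(value AS INT)
FROM STRING_SPLIT(@idList, ',');

-- Same join as in the answer above.
SELECT t.*
FROM [Table] AS t
INNER JOIN #ids AS ids ON ids.Id = t.Id;

DROP TABLE #ids;
```

A table-valued parameter would be another way to hand the list to the server; the string version is shown only because it needs no extra type definition.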

How to use INSERT SELECT?

I have the following table structure:
[Subjects]:
id int Identity Specification yes
Deleted bit
[Juridical]:
id int
Name varchar
typeid int
[Individual]:
id int
Name varchar
Juridical and Individual are child classes of the Subjects class, which means that corresponding rows in Individual and Subjects share the same id.
Now I have a table:
[MyTable]:
typeid varchar
Name varchar
And I want to select data from this table and insert it into my table structure. But I don't know what to do. I tried to use OUTPUT:
INSERT INTO [Individual](Name)
OUTPUT false
INTO [Subjects].[Deleted]
SELECT [MyTable].[Name] as Name
FROM [MyTable]
WHERE [MyTable].[type] = 'Indv'
But the syntax is not correct.
Just use:
INSERT INTO Individual(Name)
SELECT [MyTable].[Name] as Name
FROM [MyTable]
WHERE [MyTable].[type] = 'Indv'
and
INSERT INTO Subjects(Deleted)
SELECT 0 AS Deleted
FROM [MyTable]
WHERE [MyTable].[type] = 'Indv'
A single INSERT statement cannot add rows to two tables; you need two separate statements for that. For that reason I split your initial query into two INSERT statements, to add records to both your Individual and Subjects tables.
Just as @marc_s said, the number of columns in your SELECT statement must exactly match the number of columns you are inserting into.
Other than these two constraints, which are both related to syntax, you are fully allowed to do any filtering in the SELECT part or make any complex logic as you would do in a normal SELECT query.
You need to use this syntax:
INSERT INTO [Individual] (Name)
SELECT [MyTable].[Name]
FROM [MyTable]
WHERE [MyTable].[type] = 'Indv'
You should define the list of columns to insert into in the INSERT INTO line, and then you must have a SELECT that returns exactly that many columns (and the column types need to match, too).
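Neither answer shows how the id generated in Subjects ends up in Individual. One way to carry it across is sketched below. It swaps in a technique the answers do not mention: a MERGE whose OUTPUT clause can see source columns, which a plain INSERT ... OUTPUT cannot. The @map variable and the VARCHAR(100) length are assumptions, and the filter column is written as [type] as in the queries above, although the table listing calls it typeid.

```sql
-- Holds each newly generated Subjects.id next to the Name it belongs to.
DECLARE @map TABLE (id INT NOT NULL, Name VARCHAR(100) NULL);

MERGE INTO Subjects AS tgt
USING (SELECT [Name] FROM MyTable WHERE [type] = 'Indv') AS src
    ON 1 = 0                      -- never matches, so every source row is inserted
WHEN NOT MATCHED THEN
    INSERT (Deleted) VALUES (0)   -- 0 = not deleted
OUTPUT inserted.id, src.[Name] INTO @map (id, Name);

-- Reuse the generated ids so that Individual.id matches Subjects.id.
INSERT INTO Individual (id, Name)
SELECT id, Name
FROM @map;
```

Wrapping both statements in one transaction would keep the two tables consistent if the second insert fails.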

Table Valued Parameter has slow performance because of table scan

I have an application that passes parameters to a procedure in SQL. One of the parameters is a table-valued parameter containing items to include in a where clause.
Because the table-valued parameter has no statistics attached to it, when I join my TVP to a table that has 2 million rows I get a very slow query.
What alternatives do I have?
Again, the goal is to pass certain values to a procedure that will be included in a where clause:
select * from table1 where id in
(select id from @mytvp)
or
select * from table1 t1 join @mytvp
tvp on t1.id = tvp.id
Although it looks like it would need to run the query once for each row in table1, EXISTS often optimizes to be more efficient than a JOIN or an IN. So, try this:
select * from table1 t where exists (select 1 from @mytvp p where t.id=p.id)
Also, be sure that t.id is the same datatype as p.id and that t.id has an index.
You can use a temp table with an index to boost performance (assuming you have more than a couple of records in your @mytvp).
Just before you join the table, you could insert the data from the variable @mytvp into a temp table.
Here is some sample code to create a temp table with indexes. The primary key and unique constraints determine which columns are indexed.
CREATE TABLE #temp_employee_v3
(rowID int not null identity(1,1)
,lname varchar (30) not null
,fname varchar (30) not null
,city varchar (20) not null
,state char (2) not null
,PRIMARY KEY (lname, fname, rowID)
,UNIQUE (state, city, rowID) )
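Applied to the question, the copy step could look like the sketch below. The table variable stands in for the real TVP so the snippet is self-contained, #tvp_ids is a made-up name, and table1 with its id column is assumed to exist as in the question.

```sql
-- Stand-in for the TVP that the procedure would receive (hypothetical data).
DECLARE @mytvp TABLE (id INT NOT NULL);
INSERT INTO @mytvp (id) VALUES (1), (2), (3);

-- Unlike a table variable or TVP, a temp table gets statistics, and the
-- primary key gives the optimizer an index to work with.
CREATE TABLE #tvp_ids (id INT NOT NULL PRIMARY KEY);

-- DISTINCT guards against duplicates in the incoming list.
INSERT INTO #tvp_ids (id)
SELECT DISTINCT id
FROM @mytvp;

SELECT t1.*
FROM table1 AS t1
JOIN #tvp_ids AS i ON i.id = t1.id;

DROP TABLE #tvp_ids;
```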
I had the same issue: table-valued parameters were very slow in my context. I came up with a solution that passed the list of values as a comma-separated string to the stored procedure. The procedure then made a PATINDEX(...) > 0 comparison. This was about a factor of 6 faster.
You can have primary key and unique constraints on the table type. E.g.
CREATE TYPE IdList AS TABLE ( Id UNIQUEIDENTIFIER NOT NULL PRIMARY KEY )
However, check whether it improves performance in your case: these indexes are maintained while the TVP is populated, which can be counterproductive depending on whether your input is sorted and whether you use more than one column.
In common with table variables, table-valued parameters have no statistics; the query optimiser works on the assumption that they contain only one row, which, if your parameter contains a lot of rows, is likely to result in an inappropriate query plan.
One way to improve your chances of a better plan is to add a statement-level recompile; this should enable the optimiser to take the size of the TVP into account when selecting a plan.
select * from table1 t where exists (select 1 from @mytvp p where t.id=p.id) OPTION (RECOMPILE)
(incorporating KM's suggestion)
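Putting the keyed table type and the statement-level recompile together, a procedure might look like the following sketch. dbo.IntIdList and dbo.GetTable1ByIds are hypothetical names, and table1 is assumed to exist with an indexed INT id column.

```sql
-- Table type with a primary key, so the TVP arrives with an index on id.
CREATE TYPE dbo.IntIdList AS TABLE (id INT NOT NULL PRIMARY KEY);
GO

CREATE PROCEDURE dbo.GetTable1ByIds
    @mytvp dbo.IntIdList READONLY   -- TVPs must be declared READONLY
AS
BEGIN
    SET NOCOUNT ON;

    -- OPTION (RECOMPILE) compiles this statement on each call, letting the
    -- optimizer see the actual number of rows in @mytvp.
    SELECT t.*
    FROM table1 AS t
    WHERE EXISTS (SELECT 1 FROM @mytvp AS p WHERE p.id = t.id)
    OPTION (RECOMPILE);
END;
GO

-- Usage: fill a variable of the table type and pass it in.
DECLARE @ids dbo.IntIdList;
INSERT INTO @ids (id) VALUES (1), (2), (3);
EXEC dbo.GetTable1ByIds @mytvp = @ids;
```

The recompile costs some CPU on every call, so it is worth measuring whether the better plan outweighs that overhead for the list sizes you actually pass.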
