Use of the IN condition - sql-server

I can easily create a stored procedure in SQL Server with parameters that I use with =, LIKE and most operators. But when it comes to using IN, I don't really understand what to do, and I can't find a good site to teach me.
Example
CREATE PROCEDURE TEST
#Ids --- What type should go here?
AS BEGIN
SELECT * FROM TableA WHERE ID IN ( #Ids )
END
Is this possible and if so how ?

With SQL Server 2008 and above, you can use Table Valued Parameters.
You declare a table type and can use that as a parameter (read only) for stored procedures that can be used in IN clauses.
For the different options, I suggest reading the relevant article for your version of the excellent Arrays and Lists in SQL Server, by Erland Sommarskog.

I've done this in the past using a Split function that I add to my schema functions as described here
Then you can do the following:
CREATE PROCEDURE TEST
#Ids --- What type should go here?
AS BEGIN
SELECT * FROM TableA WHERE ID IN ( dbo.Split(#Ids, ',') )
END
Just remember that the IN function always expects a table of values as a result. SQL Server is smart enough to convert strings to this table format, so long as they are specifically written in the procedure.
Another option in your specific example though, could be to use a join. This will have a performance improvement, but often does not really meet a real-world example you need. The join version would be:
SELECT *
FROM TableA AS ta
INNER JOIN dbo.Split(#Ids, ',') AS ids
ON ta.Id = ids.items

If your asking what I think your asking, I do this every day..
WITH myData(FileNames)
AS
(
SELECT '0608751970%'
UNION ALL SELECT '1000098846%'
UNION ALL SELECT '1000101277%'
UNION ALL SELECT '1000108488%'
)
SELECT DISTINCT f.*
FROM tblFiles (nolock) f
INNER JOIN myData md
ON b.FileNames LIKE md.FileNames
Or if your doing this based on another table:
WITH myData(FileNames)
AS
(
SELECT RTRIM(FileNames) + '%'
FROM tblOldFiles
WHERE Active=1
)
SELECT DISTINCT f.*
FROM tblFiles (nolock) f
INNER JOIN myData md
ON b.FileNames LIKE md.FileNames

Related

Query tuning required for expensive query

Can someone help me to optimize the code? I have other way to optimize it by using compute column but we can not change the schema on prod as we are not sure how many API's are used to push data into this table. This table has millions of rows and adding a non-clustered index is not helping due to the query cost and it's going for a scan.
create table testcts(
name varchar(100)
)
go
insert into testcts(
name
)
select 'VK.cts.com'
union
select 'GK.ms.com'
go
DECLARE #list varchar(100) = 'VK,GK'
select * from testcts where replace(replace(name,'.cts.com',''),'.ms.com','') in (select value from string_split(#list,','))
drop table testcts
One possibility might be to strip off the .cts.com and .ms.com subdomain/domain endings before you insert or store the name data in your table. Then, use the following query instead:
SELECT *
FROM testcts
WHERE name IN (SELECT value FROM STRING_SPLIT(#list, ','));
Now SQL Server should be able to use an index on the name column.
If your values are always suffixed by cts.com or ms.com you could add that to the search pattern:
SELECT {YourColumns} --Don't use *
FROM dbo.testcts t
JOIN (SELECT CONCAT(SS.[value], V.Suffix) AS [value]
FROM STRING_SPLIT(#list, ',') SS
CROSS APPLY (VALUES ('.cts.com'),
('.ms.com')) V (Suffix) ) L ON t.[name] = L.[value];

Joining Results of Stored Procedure with Temporary Table

I have two tables that I have joined together. I'd like to join the result of the joined table with the results of a stored procedure that has two variables.
I'm not sure whether or not I should create two temporary tables or another function, so I'm a little lost on where I should even start and what the easiest method would be.
Below is my first join.
SELECT *
FROM dbo.Users a WITH (NOLOCK)
JOIN Company b ON a.email = b.email
Below is my stored procedure, all it does is split one column into more rows. Split is another function. I would like to use an inner join.
SELECT a.*, b.*
FROM [dbo].[Menu] a
CROSS APPLY dbo.Split(SalesPersons, ',') b
WHERE ID = #ID AND Date = #Date
The easiest way to do this, assuming that the output from the stored procedure is deterministic would be to populate the output of the stored procedure into a temp table and then join to it.
CREATE TABLE #tmp
(
COL1 INT NOT NULL,
COL2 INT NOT NULL
)
INSERT INTO #tmp
Exec sproc_YourSproc 'Params'
SELECT *
FROM dbo.Users u
INNER JOIN dbo.Company c ON u.email = c.email
INNER JOIN #tmp t ON t.ID = c.ID
That being said, as Martin Smith said above, you probably want to move that logic into the stored procedure if possible.
Also, please don't use (NOLOCK) it doesn't really help the way most people think that it does, and it can cause some really nasty results. (Double reading rows, ghost records, ect)
If you need to be able to perform reads without causing read/write contention, I would investigate using more optimistic isolation levels, find ways to optimize the read performance to reduce possible congestion, or find indexing strategies that would make it possible to satisfy reads without locking the table itself.

Dynamic inner query

Is there a way to code a dynamic inner query? Basically, I find myself typing something like the following query over and over:
;with tempData as (
--this inner query is the part that changes, but there's always a timeGMT column.
select timeGMT, dataCol2, dataCol3
from tbl1 t1
join tbl2 t2 on t1.ID=t2.ID
)
select dateadd(ss,d.gmtOffset,t.timeGMT) timeLocal,
t.*
from tempData t
join dst d on t.timeGMT between d.sTimeGMT and d.eTimeGMT
where d.zone = 'US-Eastern'
The only thing I can think of is a stored proc with the inner query text as the input for some dynamic sql... However, my understanding of the optimizer (which is, admittedly, limited) says this isn't really a good idea.
From a performance perspective, what you have there is the version on which I would expect the optimizer to do the best job.
If the "outer" part of your example is static and code maintenance overrides performance, I'd look to encapsulating the dateadd result in a table-valued function (TVF). Since the time conversion is very much the common thread in these queries, I would definitely focus on that part of the workload.
For example, your query that can vary would look like this:
select timeGMT, dataCol2, dataCol3, lt.timeLocal
from tbl1 t1
join tbl2 t2 on t1.ID = t2.ID
cross apply dbo.LocalTimeGet(timeGMT, 'US-Eastern') AS lt
Where the TVF dbo.LocalTimeGet contains the logic for dateadd(ss,d.gmtOffset,t.timeGMT) and the lookup of the time zone offset value based on the time zone name. The implementation of that function would look something like:
CREATE FUNCTION dbo.LocalTimeGet (
#TimeGMT datetime,
#TimeZone varchar(20)
)
RETURNS TABLE
AS
RETURN (
SELECT DATEADD(ss, d.gmtOffset, #TimeGMT) AS timeLocal
FROM dst AS d
WHERE d.zone = #TimeZone
);
GO
The upside of this approach is when you upgrade to 2008 or later, there are system functions you could use to make this conversion a lot easier to code and you'll only have to alter the TVF. If your result sets are small, I'd consider a system scalar function (SQL 2008) over a TVF, even if it implements those same system functions. Based on your comment, it sounds like the system functions won't do what you need, but you could still stick with your implementation of a dst table, which is encapsulated in the TVF above.
TVFs can be a performance problem because the optimizer assumes they only return 1 row.
If you need to combine encapsulation and performance, then I'd do the time zone calc in the application code instead. Even though you'd have to apply it to each project that uses it, you would only have to implement it 1x in each project (in the Data Access Layer) and treat it as a common utility library if you'll be using across projects.
To answer the OP's follow-on question, a SQL Server 2008 solution would look like this:
First, create permanent definitions:
CREATE TYPE dbo.tempDataType AS TABLE (
timeGMT DATETIME,
dataCol2 int,
dataCol3 int)
GO
CREATE PROCEDURE ComputeDateWithDST
#tempData tempDataType READONLY
AS
SELECT dateadd(ss,d.gmtOffset,t.timeGMT) timeLocal, t.*
FROM #tempData t
JOIN dst d ON t.timeGMT BETWEEN d.sTimeGMT AND d.eTimeGMT
WHERE d.zone = 'US-Eastern'
GO
Thereafter, whenever you want to plug a subquery (which has now become a separate query, no longer a CTE) into the stored procedure:
DECLARE #tempData tempDataType
INSERT #tempData
-- sample subquery:
SELECT timeGMT, dataCol2, dataCol3
FROM tbl1 t1
JOIN tbl2 t2 ON t1.ID=t2.ID
EXEC ComputeDateWithDST #tempData;
GO
Performance could be an issue because you'd be running separately what used to be a CTE instead of letting SQL Server combine it with the main query to optimize the execution plan.

sql server stored procedure

Can I join a table with a Stored Procedure which returns a table ?
Thanks
You need to use INSERT.. EXEC to store the data from the SP into a table or table-variable. Then you can join to that.
Say the SP returns a table (a int, b varchar(10), c datetime)
declare #temp table (a int, b varchar(10), c datetime)
;
insert #temp
exec myproc 1, 10, 'abcdef'
;
select *
from #temp t join othertable o on ... etc
Without creating a temp table, if you also exclude table-variable, then the only option - provided the SP -does not take any- parameters, is to use OPENQUERY to run the SP to return a table. Pseudo:
select *
from OPENQUERY(local_server, 'spname_no_params') t
join othertable o on ... etc
You can't join directly onto a stored procedure. So you either need to use the approach per Richard's answer, or you could convert the sproc to a table valued function.
e.g.
CREATE FUNCTION dbo.fxnExample(#Something INTEGER)
RETURNS TABLE
AS
RETURN
(
SELECT A, B
FROM MyTable
WHERE Something = #Something
)
which you then use/JOIN on in a query like this:
SELECT t1.Foo, f.A, f.B
FROM Table1 t1
JOIN dbo.fxnExample(1) f ON t1.A = f.A
The thing to note is you can't do everything in a user defined function that you can in a sproc so depending on what your sproc does, this may not be possible. Also, for best performance you should make it an inline table valued function like my example above. The alternative is a multi-statement table valued function which could give you poor performance due to the way that the execution plan produced will be based on an assumption of a very low number of rows being returned by it (i.e. 1) - so if it returned a larger number of rows then performance could be poor.
Here's a good MSDN article on it: http://blogs.msdn.com/b/psssql/archive/2010/10/28/query-performance-and-multi-statement-table-valued-functions.aspx
No it's not possible. What you can do is put the output of that SP into a temporary table and use it to your join statement.

Is there a way to optimize the query given below

I have the following Query and i need the query to fetch data from SomeTable based on the filter criteria present in the Someothertable. If there is nothing present in SomeOtherTable Query should return me all the data present in SomeTable
SQL SERVER 2005
SomeOtherTable does not have any indexes or any constraint all fields are char(50)
The Following Query work fine for my requirements but it causes performance problems when i have lots of parameters.
Due to some requirement of Client, We have to keep all the Where clause data in SomeOtherTable. depending on subid data will be joined with one of the columns in SomeTable.
For example the Query can can be
SELECT
*
FROM
SomeTable
WHERE
1=1
AND
(
SomeTable.ID in (SELECT DISTINCT ID FROM SomeOtherTable WHERE Name = 'ABC' and subid = 'EF')
OR
0=(SELECT Count(1) FROM SomeOtherTable WHERE spName = 'ABC' and subid = 'EF')
)
AND
(
SomeTable.date =(SELECT date FROM SomeOtherTable WHERE Name = 'ABC' and subid = 'Date')
OR
0=(SELECT Count(1) FROM SomeOtherTable WHERE spName = 'ABC' and subid = 'Date')
)
EDIT----------------------------------------------
I think i might have to explain my problem in detail:
We have developed an ASP.net application that is used to invoke parametrize crystal reports, parameters to the crystal reports are not passed using the default crystal reports method.
In ASP.net application we have created wizards which are used to pass the parameters to the Reports, These parameters are not directly consumed by the crystal report but are consumed by the Query embedded inside the crystal report or the Stored procedure used in the Crystal report.
This is achieved using a table (SomeOtherTable) which holds parameter data as long as report is running after which the data is deleted, as such we can assume that SomeOtherTable has max 2 to 3 rows at any given point of time.
So if we look at the above query initial part of the Query can be assumed as the Report Query and the where clause is used to get the user input from the SomeOtherTable table.
So i don't think it will be useful to create indexes etc (May be i am wrong).
SomeOtherTable does not have any
indexes or any constraint all fields
are char(50)
Well, there's your problem. There's nothing you can do to a query like this which will improve its performance if you create it like this.
You need a proper primary or other candidate key designated on all of your tables. That is to say, you need at least ONE unique index on the table. You can do this by designating one or more fields as the PK, or you can add a UNIQUE constraint or index.
You need to define your fields properly. Does the field store integers? Well then, an INT field may just be a better bet than a CHAR(50).
You can't "optimize" a query that is based on an unsound schema.
Try:
SELECT
*
FROM
SomeTable
LEFT JOIN SomeOtherTable ON SomeTable.ID=SomeOtherTable.ID AND Name = 'ABC'
WHERE
1=1
AND
(
SomeOtherTable.ID IS NOT NULL
OR
0=(SELECT Count(1) FROM SomeOtherTable WHERE spName = 'ABC')
)
also put 'with (nolock)' after each table name to improve performance
The following might speed you up
SELECT *
FROM SomeTable
WHERE
SomeTable.ID in
(SELECT DISTINCT ID FROM SomeOtherTable Where Name = 'ABC')
UNION
SELECT *
FROM SomeTable
Where
NOT EXISTS (Select spName From SomeOtherTable Where spName = 'ABC')
The UNION will effectivly split this into two simpler queries which can be optiomised separately (depends very much on DBMS, table size etc whether this will actually improve performance -- but its always worth a try).
The "EXISTS" key word is more efficient than the "SELECT COUNT(1)" as it will return true as soon as the first row is encountered.
Or check if the value exists in db first
And you can remove the distinct keyword in your query, it is useless here.
if EXISTS (Select spName From SomeOtherTable Where spName = 'ABC')
begin
SELECT *
FROM SomeTable
WHERE
SomeTable.ID in
(SELECT ID FROM SomeOtherTable Where Name = 'ABC')
end
else
begin
SELECT *
FROM SomeTable
end
Aloha
Try
select t.* from SomeTable t
left outer join SomeOtherTable o
on t.id = o.id
where (not exists (select id from SomeOtherTable where spname = 'adbc')
OR spname = 'adbc')
-Edoode
change all your select statements in the where part to inner jons.
the OR conditions should be union all-ed.
also make sure your indexing is ok.
sometimes it pays to have an intermediate table for temp results to which you can join to.
It seems to me that there is no need for the "1=1 AND" in your query. 1=1 will always evaluate to be true, leaving the software to evaluate the next part... why not just skip the 1=1 and evaluate the juicy part?
I am going to stick to my original Query.

Resources