How to loop with different values in T-SQL query? - sql-server

I have a specific set of values that I want to filter a column on, but I don't want to use an 'in' clause in SQL Server. I want to use a loop that passes in a different value each time.
For example, if there is a name column in my data, I want to run the query 5 times with a different filter value each time.
Please look at the loop query below.
DECLARE @cnt INT = 1;
WHILE @cnt < 94
BEGIN
    SELECT Name, COUNT(*) AS Number_of_Names
    FROM Table
    WHERE name IN ('John')
    AND value IS NOT NULL
    GROUP BY Name
    SET @cnt = @cnt + 1;
END;
I want to pass a different value for the 'name' column on each iteration: John in the case above, then Mary in the next iteration, and so on, based on a set of values I pass in a variable such as @values = John,Mary,Nicole,Matt etc.

Considering the comments on your question, this should give you an idea on how to achieve a solution without using loops and still get all the names even when the name is not present on the table.
SELECT Names.name,
COUNT(t.value) AS Number_of_Names --Only count when value is not null
FROM (VALUES('John'), ('Mary'), ('Nicole'), ('Matt')) Names(name) --This can be replaced by a table-valued parameter or temp table.
LEFT JOIN Table t ON Names.name = t.name
--WHERE name IN ('John') /*No longer needed*/
--AND value IS NOT NULL /*Removed this because it would make the OUTER JOIN behave as an INNER JOIN*/
GROUP BY Names.name; --Qualified, because both Names and t have a name column
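As a minimal sketch of the comment in the query above, the inline VALUES list could be swapped for a table variable. The variable name @Names and the varchar(50) length are assumptions made for illustration, and [Table] stands in for the asker's table (bracketed because TABLE is a reserved word).

```sql
-- Hypothetical table variable holding the names to report on.
DECLARE @Names TABLE (name varchar(50) NOT NULL PRIMARY KEY);

INSERT INTO @Names (name)
VALUES ('John'), ('Mary'), ('Nicole'), ('Matt');

-- One pass over the data instead of one query per name.
SELECT n.name,
       COUNT(t.value) AS Number_of_Names -- counts only non-NULL values
FROM @Names AS n
LEFT JOIN [Table] AS t ON t.name = n.name
GROUP BY n.name;
```

A name that never appears in the table still gets a row, with a count of 0, because the LEFT JOIN keeps it and COUNT(t.value) ignores the resulting NULL.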

Related

Check null parameter in IN Clause

I have a stored procedure with a few parameters. One of them is a varchar containing a possible list of IDs (comma separated values i.e. 1, 2, 5, 9). I need the procedure to ignore the parameter when it is NULL. This is the way I actually do:
CREATE PROCEDURE PROC_TEST
    @MAIN_ID INT = NULL,
    @DETAIL_IDs VARCHAR(2000)
AS
BEGIN
    SELECT ..... FROM TABLE1 INNER JOIN TABLE2 ON ...
    WHERE
    TABLE1.ID = ISNULL(@MAIN_ID, TABLE1.ID) AND
    (
        @DETAIL_IDs IS NULL
        OR TABLE2.ID IN
        (SELECT VALUE FROM STRING_SPLIT(@DETAIL_IDs,','))
    )
END
It seems like the @DETAIL_IDs IS NULL part of the code takes a long time to execute. What am I doing wrong here?
Even if I remove the OR clause and simply add AND @DETAIL_IDs IS NULL, it takes a long time.
The tables have more than 1 million records each.

I want my stored procedure to only populate one instance of a name

I'm creating a stored procedure that populates two tables, tblAirport and tblCountry. tblCountry gets its country names from tblAirport, but I only want one instance of each country name to show up in tblCountry. So far, this is what I have for my stored procedure:
DECLARE @PK INT = (SELECT PK FROM tblAirport WHERE strName = @strName)
IF @PK IS NULL
INSERT INTO tblAirport (ICAOCode,IATACode,strName,strCity,strCountry,degLat,minLat,secLat,Equator,degLong,minLong,secLong,Meridian,strElevation)
VALUES (@ICAOCode,@IATACode,@strName,@strCity,@strCountry,@degLat,@minLat,@secLat,@Equator,@degLong,@minLong,@secLong,@Meridian,@strElevation)
SET @PK = (SELECT PK FROM tblAirport WHERE strName = @strName);
IF EXISTS (SELECT * FROM tblCountry WHERE strCountry = @strCountry)
SET @strCountry = @strCountry + 'x'
INSERT INTO tblCountry (strCountry)
VALUES (@strCountry)
I tried using IF EXISTS (SELECT * FROM tblCountry WHERE strCountry = @strCountry)
SET @strCountry = @strCountry + 'x' just to show any duplicate countries, but I don't know how to eliminate the duplicates from my table. I'm new to SQL and I've only learned the IF EXISTS construct. Any suggestions would be great. Thank you!
This is how to handle a multiline IF ELSE (https://technet.microsoft.com/en-us/library/ms182717(v=sql.110).aspx)
IF NOT EXISTS (SELECT * FROM tblCountry WHERE strCountry = @strCountry)
BEGIN
    INSERT INTO tblCountry (strCountry) VALUES (@strCountry)
END;
In general though, I'd be concerned about a procedure that uses the data to drive the possible values in a lookup list, especially something like countries that should probably be pre-defined up front. You'd hate for them to enter free-form duplicates that are really the same country with a slightly different spelling.
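One way to back up the concern about duplicates, sketched here on the assumption that tblCountry.strCountry is a character column short enough to be indexed: a unique constraint lets the database itself reject a second copy of a country, whatever the procedure does. The constraint name UQ_tblCountry_strCountry, the varchar(100) type, and the sample value are made up for this example.

```sql
-- Hypothetical constraint name; run once, after removing existing duplicates.
ALTER TABLE tblCountry
ADD CONSTRAINT UQ_tblCountry_strCountry UNIQUE (strCountry);

-- Stand-in for the procedure parameter (type and value are assumptions).
DECLARE @strCountry varchar(100) = 'Canada';

-- The guarded insert stays the same; the constraint is a safety net behind it.
IF NOT EXISTS (SELECT 1 FROM tblCountry WHERE strCountry = @strCountry)
BEGIN
    INSERT INTO tblCountry (strCountry) VALUES (@strCountry);
END;
```

A constraint like this only catches exact duplicates under the column's collation. It does nothing about spelling variants of the same country, which is why a pre-defined list is the safer design.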

SQL using a function in a trigger

I am creating a trigger in SQL that will insert into another table after an insert on it. However, I need to fetch a value from the table and increment it for use in the insert.
I have a AirVisionSiteLog table. On insert on the table I would like for it to insert into another SiteLog table. However in order to do this I need to fetch the last Entry Number of the Site from the SiteLog table. Then on its insert take that result and increase by one for the new Entry Number. I am new to Triggers and Functions so I am not sure how to use them correctly. I believe I have a function to retrieve and increment the Entry Number however I am not sure how to use it in the Trigger.
My Function -
CREATE FUNCTION AQB_RMS.F_GetLogEntryNumber
(@LocationID int)
RETURNS INTEGER
AS
BEGIN
    DECLARE
    @MaxEntry Integer,
    @EntryNumber Integer
    Set @MaxEntry = (Select Max(SL.EntryNumber) FROM AQB_MON.AQB_RMS.SiteLog SL
    WHERE SL.LocationID = @LocationID)
    SET @EntryNumber = @MaxEntry + 1
    RETURN @EntryNumber
END
My Trigger and attempt to use the Function -
CREATE TRIGGER [AQB_RMS].[SiteLogCreate] on [AQB_MON].[AQB_RMS].[AirVisionSiteLog]
AFTER INSERT
AS
BEGIN
declare @entrynumber int
declare @corrected int
set @corrected = 0
INSERT INTO [AQB_MON].[AQB_RMS].[SiteLog]
([SiteLogTypeID],[LocationID],[EntryNumber],[SiteLogEntry]
,[EntryDate],[Corrected],[DATE_CREATED],[CREATED_BY])
SELECT st.SiteLogTypeID, l.LocationID,
(select AQB_RMS.F_GetLogEntryNumber from [AQB_MON].[AQB_RMS].[SiteLog] sl
where sl.LocationID = l.LocationID)
, i.SiteLogEntry, i.EntryDate, @corrected, i.DATE_CREATED, i.CREATED_BY
from inserted i
left join AQB_MON.[AQB_RMS].[SiteLogType] st on st.SiteLogType = i.SiteLogType
left join AQB_MON.AQB_RMS.Location l on l.SourceSiteID = i.SourceSiteID
END
GO
I believe that you are close.
At this part of the query in the trigger: (I set the columns vertically so that the difference is more noticeable)
SELECT st.SiteLogTypeID,
l.LocationID,
(select AQB_RMS.F_GetLogEntryNumber from [AQB_MON].[AQB_RMS].[SiteLog] sl where sl.LocationID = l.LocationID),
i.SiteLogEntry,
i.EntryDate,
@corrected,
i.DATE_CREATED,
i.CREATED_BY
...should be:
SELECT st.SiteLogTypeID,
l.LocationID,
AQB_RMS.F_GetLogEntryNumber((select l.LocationID from [AQB_MON].[AQB_RMS].[SiteLog] sl where sl.LocationID = l.LocationID)), -- a subquery passed as an argument needs its own parentheses
i.SiteLogEntry,
i.EntryDate,
@corrected,
i.DATE_CREATED,
i.CREATED_BY
So basically, you call the function by name and pass the query as its parameter; that query should return only one row with a single value.
Note that in my modified example, I added the l.LocationID after the select in the function call, so I'm not sure if this is what you need, but change that to match your needs. Because I'm not sure of the exact column that you need, add a comment should there be other issues.
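Since l.LocationID is already available from the join, a simpler variant is to pass that column straight to the function. The sketch below is only an illustration: it assumes it is run inside the AQB_MON database, that the tables, columns and function exist as posted in the question, and that CREATE OR ALTER is available (SQL Server 2016 SP1 or later). The ISNULL fallback to 1 is my own addition.

```sql
-- Sketch only: two-part names assume the current database is AQB_MON.
CREATE OR ALTER TRIGGER AQB_RMS.SiteLogCreate
ON AQB_RMS.AirVisionSiteLog
AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON;

    INSERT INTO AQB_RMS.SiteLog
        (SiteLogTypeID, LocationID, EntryNumber, SiteLogEntry,
         EntryDate, Corrected, DATE_CREATED, CREATED_BY)
    SELECT st.SiteLogTypeID,
           l.LocationID,
           -- The function returns NULL for a location with no SiteLog rows
           -- (MAX is NULL, NULL + 1 is NULL), so start numbering at 1.
           ISNULL(AQB_RMS.F_GetLogEntryNumber(l.LocationID), 1),
           i.SiteLogEntry,
           i.EntryDate,
           0, -- Corrected
           i.DATE_CREATED,
           i.CREATED_BY
    FROM inserted AS i
    LEFT JOIN AQB_RMS.SiteLogType AS st ON st.SiteLogType = i.SiteLogType
    LEFT JOIN AQB_RMS.Location AS l ON l.SourceSiteID = i.SourceSiteID;
END;
```

One caveat with this approach: if a single INSERT adds several rows for the same location, every one of them sees the same current maximum and so receives the same entry number.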

How to do a conditional JOIN with table valued parameter?

Okay so I have spent some time researching this but cannot seem to find a good solution.
I am currently creating a stored procedure that takes a set of optional parameters. The stored procedure will act as the "universal search query" for multiple tables and columns.
The stored procedure looks something like this (Keep in mind that this is just a stripped down version and the actual stored procedure has more columns etc.)
The '@ProductIdsParam IntList READONLY' is an example table valued parameter that I would like to JOIN if it is not empty. In other words, the query should only search by parameters that are not null/empty.
Calling the procedure and parsing the other parameters works just like it should. I might however have misunderstood and should not do a "universal search query" like this at all.
CREATE PROCEDURE [dbo].[usp_Search]
@ProductIdParam INT = NULL,
@CustomerNameParam NVARCHAR(100) = NULL,
@PriceParam decimal = NULL,
-- THIS IS WHAT I'D LIKE TO JOIN. BUT THE TABLE CAN BE EMPTY
@ProductIdsParam IntList READONLY
AS
BEGIN
SET NOCOUNT ON;
SELECT DISTINCT
CustomerTransactionTable.first_name AS FirstName,
CustomerTransactionTable.last_name AS LastName,
ProductTable.description AS ProductDescription,
ProductTable.price as ProductPrice
FROM dbo.customer AS CustomerTransactionTable
-- JOINS
LEFT JOIN dbo.product AS ProductTable
ON CustomerTransactionTable.product_id = ProductTable.id
WHERE
(ProductTable.id = @ProductIdParam OR @ProductIdParam IS NULL)
AND (CustomerTransactionTable.first_name = @CustomerNameParam OR @CustomerNameParam IS NULL)
AND (CustomerTransactionTable.price = @PriceParam OR @PriceParam IS NULL)
END
You can add the int table as a LEFT JOIN and then add a WHERE condition based on the record count in the filter table. Since @ProductIdsParam is declared as a table, you should first count the records in it and store the result in a variable (@ProductIdsCount below). The joined table also needs an alias (ids below), because its columns cannot be referenced as @ProductIdsParam.id.
AND COALESCE(ids.id, 0) = (CASE WHEN @ProductIdsCount = 0 THEN 0 ELSE ProductTable.id END)
When @ProductIdsCount = 0 the condition always reduces to 0 = 0, so you get all the records; otherwise you select only the records whose id in the filter table equals ProductTable.id.
There are other (maybe cleaner) approaches possible though but I think this works.
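Put together, the approach described above might look roughly like the sketch below. It trims the procedure from the question down to two parameters and assumes the table type was created as `CREATE TYPE dbo.IntList AS TABLE (id INT NOT NULL PRIMARY KEY)`; the question does not show that definition, so the column name id is a guess. The alias ids and the variable @ProductIdsCount are names chosen for illustration.

```sql
-- Sketch only: assumes dbo.IntList has a single INT column named id.
CREATE OR ALTER PROCEDURE dbo.usp_Search
    @ProductIdParam  INT = NULL,
    @ProductIdsParam dbo.IntList READONLY
AS
BEGIN
    SET NOCOUNT ON;

    -- Count once so the WHERE clause can tell "no filter" from "filter".
    DECLARE @ProductIdsCount INT = (SELECT COUNT(*) FROM @ProductIdsParam);

    SELECT DISTINCT
        c.first_name  AS FirstName,
        c.last_name   AS LastName,
        p.description AS ProductDescription,
        p.price       AS ProductPrice
    FROM dbo.customer AS c
    LEFT JOIN dbo.product AS p ON c.product_id = p.id
    -- A table variable must be aliased before its columns can be referenced.
    LEFT JOIN @ProductIdsParam AS ids ON ids.id = p.id
    WHERE (p.id = @ProductIdParam OR @ProductIdParam IS NULL)
      AND COALESCE(ids.id, 0) =
          CASE WHEN @ProductIdsCount = 0 THEN 0 ELSE p.id END;
END;
```

The comparison uses 0 as the "no match" marker, so it relies on no real product having an id of 0. A table-valued parameter that the caller omits arrives as an empty table, which is the case the count check handles.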

tsql bulk update

MyTableA has several million records. On regular occasions every row in MyTableA needs to be updated with values from TheirTableA.
Unfortunately I have no control over TheirTableA and there is no field to indicate if anything in TheirTableA has changed so I either just update everything or I update based on comparing every field which could be different (not really feasible as this is a long and wide table).
Unfortunately the transaction log is ballooning doing a straight update so I wanted to chunk it by using UPDATE TOP, however, as I understand it I need some field to determine if the records in MyTableA have been updated yet or not otherwise I'll end up in an infinite loop:
declare @again as bit;
set @again = 1;
while @again = 1
begin
    update top (10000) MyTableA
    set my.A1 = their.A1, my.A2 = their.A2, my.A3 = their.A3
    from MyTableA my
    join TheirTableA their on my.Id = their.Id
    if @@ROWCOUNT > 0
        set @again = 1
    else
        set @again = 0
end
Is the only way to make this work to add a filter like the following?
where my.A1 <> their.A1 and my.A2 <> their.A2 and my.A3 <> their.A3
This seems like it will be horribly inefficient with many columns to compare.
I'm sure I'm missing an obvious alternative?
Assuming both tables are the same structure, you can get a resultset of rows that are different using
SELECT * into #different_rows from MyTable EXCEPT select * from TheirTable and then update from that using whatever key fields are available.
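A rough sketch of that idea, using the table and column names from the question (Id, A1, A2, A3) in place of SELECT *. It is my own reading of the suggestion, not the answerer's code: the EXCEPT is written from TheirTableA's side so that the temp table holds the new values to copy across.

```sql
-- Rows in TheirTableA that have no identical counterpart in MyTableA.
SELECT their.Id, their.A1, their.A2, their.A3
INTO #different_rows
FROM TheirTableA AS their
EXCEPT
SELECT my.Id, my.A1, my.A2, my.A3
FROM MyTableA AS my;

-- Update only the rows that actually differ, matching on the key.
UPDATE my
SET my.A1 = d.A1,
    my.A2 = d.A2,
    my.A3 = d.A3
FROM MyTableA AS my
JOIN #different_rows AS d ON d.Id = my.Id;

DROP TABLE #different_rows;
```

EXCEPT treats two NULLs as equal, so nullable columns need no extra handling here, unlike a chain of <> comparisons. Rows that exist only in TheirTableA also land in the temp table, but the join in the UPDATE ignores them.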
Well, the first, and simplest solution, would obviously be if you could change the schema to include a timestamp for last update - and then only update the rows with a timestamp newer than your last change.
But if that is not possible, another way to go could be to use the HashBytes function, perhaps by concatenating the fields into an xml that you then compare. The caveat here is an 8kb limit (https://connect.microsoft.com/SQLServer/feedback/details/273429/hashbytes-function-should-support-large-data-types) EDIT: Once again, I have stolen code, this time from:
http://sqlblogcasts.com/blogs/tonyrogerson/archive/2009/10/21/detecting-changed-rows-in-a-trigger-using-hashbytes-and-without-eventdata-and-or-s.aspx
His example is:
select batch_id
from (
select distinct batch_id, hash_combined = hashbytes( 'sha1', combined )
from ( select batch_id,
combined =( select batch_id, batch_name, some_parm, some_parm2
from deleted c -- need old values
where c.batch_id = d.batch_id
for xml path( '' ) )
from deleted d
union all
select batch_id,
combined =( select batch_id, batch_name, some_parm, some_parm2
from some_base_table c -- need current values (could use inserted here)
where c.batch_id = d.batch_id
for xml path( '' ) )
from deleted d
) as r
) as c
group by batch_id
having count(*) > 1
A last resort (and my original suggestion) is to try Binary_Checksum? As noted in the comment, this does open the risk for a rather high collision rate.
http://msdn.microsoft.com/en-us/library/ms173784.aspx
I have stolen the following example from lessthandot.com - link to the full SQL (and other cool functions) is below.
--Data Mismatch
SELECT 'Data Mismatch', t1.au_id
FROM( SELECT BINARY_CHECKSUM(*) AS CheckSum1 ,au_id FROM pubs..authors) t1
JOIN(SELECT BINARY_CHECKSUM(*) AS CheckSum2,au_id FROM tempdb..authors2) t2 ON t1.au_id =t2.au_id
WHERE CheckSum1 <> CheckSum2
Example taken from http://wiki.lessthandot.com/index.php/Ten_SQL_Server_Functions_That_You_Have_Ignored_Until_Now
I don't know if this is better than adding where my.A1 <> their.A1 and my.A2 <> their.A2 and my.A3 <> their.A3, but I would definitely give it a try (assuming SQL Server 2005+):
declare @again as bit;
set @again = 1;
declare @idlist table (Id int);
while @again = 1
begin
    update top (10000) MyTableA
    set my.A1 = their.A1, my.A2 = their.A2, my.A3 = their.A3
    output inserted.Id into @idlist (Id)
    from MyTableA my
    join TheirTableA their on my.Id = their.Id
    left join @idlist i on my.Id = i.Id
    where i.Id is null
    /* alternatively (instead of left join + where):
    where not exists (select * from @idlist where Id = my.Id) */
    if @@ROWCOUNT > 0
        set @again = 1
    else
        set @again = 0
end
That is, declare a table variable for collecting the IDs of the rows being updated and use that table for looking up (and omitting) IDs that have already been updated.
A slight variation on the method would be to use a local temporary table instead of a table variable. That way you would be able to create an index on the ID lookup table, which might result in better performance.
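The temporary-table variation could be sketched as follows. The primary key on #idlist supplies the index mentioned above. The table and column names come from the question; the NOT EXISTS form and the CASE assignment are just one way to write it.

```sql
-- Lookup table of already-updated Ids; the primary key provides the index.
CREATE TABLE #idlist (Id int NOT NULL PRIMARY KEY);

DECLARE @again bit = 1;

WHILE @again = 1
BEGIN
    UPDATE TOP (10000) my
    SET my.A1 = their.A1,
        my.A2 = their.A2,
        my.A3 = their.A3
    OUTPUT inserted.Id INTO #idlist (Id)
    FROM MyTableA AS my
    JOIN TheirTableA AS their ON my.Id = their.Id
    -- Skip rows that an earlier batch has already handled.
    WHERE NOT EXISTS (SELECT 1 FROM #idlist AS i WHERE i.Id = my.Id);

    -- Stop once a batch updates nothing.
    SET @again = CASE WHEN @@ROWCOUNT > 0 THEN 1 ELSE 0 END;
END;

DROP TABLE #idlist;
```

Each batch commits on its own when run outside an explicit transaction, which is what keeps any single log record set small. How much log space is actually reclaimed between batches depends on the recovery model and backup schedule.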
If a schema change is not possible, how about using a trigger to save off the Ids that have changed, and then only import/export those rows?
Or use a trigger to export the change immediately.
