Saving a select count(*) value to an integer (SQL Server)

I'm having some trouble with this statement, owing no doubt to my ignorance of what is returned from this select statement:
declare @myInt as INT
set @myInt = (select COUNT(*) from myTable as count)
if(@myInt <> 0)
begin
print 'there's something in the table'
end
There are records in myTable, but when I run the code above the print statement is never run. Further checks show that myInt is in fact zero after the assignment above. I'm sure I'm missing something, but I assumed that a select count would return a scalar that I could use above?

If @myInt is zero, it means there are no rows in the table: it would be NULL if it had never been set at all.
COUNT will always return a row, even when the table has no rows.
Edit, Apr 2012: the rules for this are described in my answer here: Does COUNT(*) always return a result?
Your count/assignment is correct, and it can be written either way:
select @myInt = COUNT(*) from myTable
set @myInt = (select COUNT(*) from myTable)
However, if you are just looking for the existence of rows, (NOT) EXISTS is more efficient:
IF NOT EXISTS (SELECT * FROM myTable)
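
To see the two points above in action (COUNT(*) always yields one row, and EXISTS answers the "any rows?" question directly), here is a minimal sketch. It assumes SQLite via Python's sqlite3 module as a stand-in for SQL Server, so the syntax is not T-SQL, and the table contents are made up for illustration.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; the table name mirrors the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myTable (id INTEGER PRIMARY KEY)")

# COUNT(*) without GROUP BY yields exactly one row, even on an empty table.
rows_empty = conn.execute("SELECT COUNT(*) FROM myTable").fetchall()

conn.executemany("INSERT INTO myTable (id) VALUES (?)", [(1,), (2,), (3,)])

# Saving the scalar count into a variable.
my_int = conn.execute("SELECT COUNT(*) FROM myTable").fetchone()[0]

# EXISTS only has to find one row to answer the question.
has_rows = conn.execute("SELECT EXISTS (SELECT 1 FROM myTable)").fetchone()[0]

print(rows_empty, my_int, has_rows)
```

So a count of zero really does mean the table was empty at the moment the statement ran.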

select @myInt = COUNT(*) from myTable

Declare @MyInt int
Set @MyInt = ( Select Count(*) From MyTable )
If @MyInt > 0
Begin
Print 'There''s something in the table'
End
I'm not sure if this is your issue, but you have to escape the single quote in the print statement with a second single quote. While you can use SELECT to populate the variable, using SET as you have done here is just fine and clearer IMO. In addition, Count(*) is guaranteed never to return a negative value, so you need only check whether it is greater than zero.

[update] -- Well, my own foolishness provides the answer to this one. As it turns out, I was deleting the records from myTable before running the select COUNT statement.
How did I do that and not notice? Glad you asked. I've been testing a sql unit testing platform (tsqlunit, if you're interested) and as part of one of the tests I ran a truncate table statement, then the above. After the unit test is over everything is rolled back, and records are back in myTable. That's why I got a record count outside of my tests.
Sorry everyone...thanks for your help.

Related

How do I use @@RowCount in a stored procedure, against rows in another table to work out the percentage?

Firstly, may I state that I'm aware of the ability to, e.g., create a new function, declare variables for rowcount1 and rowcount2, run a stored procedure that returns a subset of rows from a table, then determine the entire rowcount for that same table, assign it to the second variable and then 1 / 2 x 100....
However, is there a cleaner way to do this which doesn't result in numerous running of things like this stored procedure? Something like
select (count(*stored procedure name*) / select count(*) from table) x 100) as Percentage...
Sorry for the crap scenario!
EDIT: Someone has asked for more details. Ultimately, and to cut a very long story short, I wish to know what people would consider the quickest and most processor-concise method there would be to show the percentage of rows that are returned in the stored procedure, from ALL rows available in that table. Does that make more sense?
The code in the stored procedure is below:
SET @SQL = 'SELECT COUNT (DISTINCT c.ElementLabel), r.FirstName, r.LastName, c.LastReview,
CASE
WHEN c.LastReview < DateAdd(month, -1, GetDate()) THEN ''OUT of Date''
WHEN c.LastReview >= DateAdd(month, -1, GetDate()) THEN ''In Date''
WHEN c.LastReview is NULL THEN ''Not Yet Reviewed'' END as [Update Status]
FROM [Residents-'+@home_name+'] r
LEFT JOIN [CarePlans-'+@home_name+'] c ON r.PersonID = c.PersonID
WHERE r.Location = '''+@home_name+'''
AND CarePlanType = 0
GROUP BY r.LastName, r.FirstName, c.LastReview
HAVING COUNT(ELEMENTLABEL) >= 14'
Thanks
Ant
I could not tell from your question if you are attempting to get the count and the result set in one query. If it is ok to execute the SP and separately calculate a table count then you could store the results of the stored procedure into a temp table.
CREATE TABLE #Results(ID INT, Value INT)
INSERT #Results EXEC myStoreProc @Parameter1, @Parameter2
SELECT
Result = 100.0 * (SELECT COUNT(*) FROM #Results) / (SELECT COUNT(*) FROM [table]) -- 100.0 forces decimal division; dividing two integer counts first would truncate to 0
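
One thing to watch when computing a percentage from two COUNT(*) values is integer division: both counts are integers, so dividing them before multiplying by 100 truncates. A small sketch of the effect, assuming SQLite (which also truncates integer division) as a stand-in for SQL Server; the table names `results` and `all_rows` are hypothetical.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; both truncate integer / integer.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE all_rows (id INTEGER)")
conn.execute("CREATE TABLE results (id INTEGER)")
conn.executemany("INSERT INTO all_rows VALUES (?)", [(i,) for i in range(8)])
conn.executemany("INSERT INTO results VALUES (?)", [(i,) for i in range(2)])

# Integer division happens first: 2 / 8 is truncated before the multiplication.
truncated = conn.execute(
    "SELECT (SELECT COUNT(*) FROM results) / (SELECT COUNT(*) FROM all_rows) * 100"
).fetchone()[0]

# Multiplying by 100.0 first promotes the whole expression to a decimal type.
percentage = conn.execute(
    "SELECT 100.0 * (SELECT COUNT(*) FROM results) / (SELECT COUNT(*) FROM all_rows)"
).fetchone()[0]

print(truncated, percentage)
```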

Exists vs select count

In SQL Server, performance wise, it is better to use IF EXISTS (select * ...) than IF (select count(1)...) > 0...
However, it looks like Oracle does not allow EXISTS inside the IF statement, what would be an alternative to do that because using IF select count(1) into... is very inefficient performance wise?
Example of code:
IF (select count(1) from _TABLE where FIELD IS NULL) > 0 THEN
UPDATE TABLE _TABLE
SET FIELD = VAR
WHERE FIELD IS NULL;
END IF;
The best way to write your code snippet is
UPDATE _TABLE
SET FIELD = VAR
WHERE FIELD IS NULL;
i.e. just do the update. It will either process rows or not. If you need to check whether it processed any rows, add afterwards
if (sql%rowcount > 0)
then
...
generally in cases where you have logic like
declare
v_cnt number;
begin
select count(*)
into v_cnt
from TABLE
where ...;
if (v_cnt > 0) then..
it's best to use ROWNUM = 1 because you DON'T CARE if there are 40 million rows: just have Oracle stop after finding 1 row.
declare
v_cnt number;
begin
select count(*)
into v_cnt
from TABLE
where rownum = 1
and ...;
if (v_cnt > 0) then..
or
select count(*)
into v_cnt
from dual
where exists (select null
from TABLE
where ...);
whichever syntax you prefer.
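
The "just do the update, then look at the row count" idea can be sketched as follows. This assumes SQLite through Python's sqlite3, where `cursor.rowcount` plays the role that `sql%rowcount` plays in PL/SQL; the table `t` and its data are hypothetical.

```python
import sqlite3

# Assumption: SQLite stands in for Oracle; cursor.rowcount is the analogue of sql%rowcount.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, field TEXT)")
conn.executemany(
    "INSERT INTO t (id, field) VALUES (?, ?)",
    [(1, None), (2, "x"), (3, None)],
)

# No prior COUNT(*) check: the UPDATE itself finds the rows, if there are any.
cur = conn.execute("UPDATE t SET field = ? WHERE field IS NULL", ("var",))
first_pass = cur.rowcount

# Running it again touches nothing, and the row count says so.
cur = conn.execute("UPDATE t SET field = ? WHERE field IS NULL", ("var",))
second_pass = cur.rowcount

print(first_pass, second_pass)
```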
As Per:
http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:3069487275935
You could try:
for x in ( select count(*) cnt
from dual
where exists ( select NULL from foo where bar ) )
loop
if ( x.cnt = 1 )
then
null; -- found: do something
else
null; -- not found
end if;
end loop;
is one way (very fast, only runs the subquery as long as it "needs" to, where exists
stops the subquery after hitting the first row)
That loop always executes at least once and at most once since a count(*) on a table
without a group by clause ALWAYS returns at LEAST one row and at MOST one row (even if
the table itself is empty!)

tsql bulk update

MyTableA has several million records. On regular occasions every row in MyTableA needs to be updated with values from TheirTableA.
Unfortunately I have no control over TheirTableA and there is no field to indicate if anything in TheirTableA has changed so I either just update everything or I update based on comparing every field which could be different (not really feasible as this is a long and wide table).
Unfortunately the transaction log is ballooning doing a straight update so I wanted to chunk it by using UPDATE TOP, however, as I understand it I need some field to determine if the records in MyTableA have been updated yet or not otherwise I'll end up in an infinite loop:
declare @again as bit;
set @again = 1;
while @again = 1
begin
update top (10000) MyTableA
set my.A1 = their.A1, my.A2 = their.A2, my.A3 = their.A3
from MyTableA my
join TheirTableA their on my.Id = their.Id
if @@ROWCOUNT > 0
set @again = 1
else
set @again = 0
end
Is the only way to make this work to add in a
where my.A1 <> their.A1 and my.A2 <> their.A2 and my.A3 <> their.A3
This seems like it will be horribly inefficient with many columns to compare.
I'm sure I'm missing an obvious alternative?
Assuming both tables have the same structure, you can get a result set of the rows that differ using
SELECT * into #different_rows from MyTable EXCEPT select * from TheirTable
and then update from that using whatever key fields are available.
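
A rough sketch of the EXCEPT idea, assuming SQLite via Python's sqlite3 as a stand-in for SQL Server. The column names and sample data are made up, and the EXCEPT here is written in the direction that yields the new values from TheirTable so they can be applied directly.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; both tables share one structure.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (Id INTEGER PRIMARY KEY, A1 TEXT, A2 TEXT)")
conn.execute("CREATE TABLE TheirTable (Id INTEGER PRIMARY KEY, A1 TEXT, A2 TEXT)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?, ?)",
    [(1, "a", "b"), (2, "c", "d"), (3, "e", None)],
)
conn.executemany(
    "INSERT INTO TheirTable VALUES (?, ?, ?)",
    [(1, "a", "b"), (2, "c", "CHANGED"), (3, "e", None)],
)

# Rows of TheirTable with no identical row in MyTable.
# EXCEPT treats two NULLs as equal, so row 3 is not reported as different.
changed = conn.execute(
    "SELECT * FROM TheirTable EXCEPT SELECT * FROM MyTable"
).fetchall()

# Apply only the differing rows, keyed on Id.
conn.executemany(
    "UPDATE MyTable SET A1 = ?, A2 = ? WHERE Id = ?",
    [(a1, a2, row_id) for row_id, a1, a2 in changed],
)

remaining = conn.execute(
    "SELECT * FROM TheirTable EXCEPT SELECT * FROM MyTable"
).fetchall()
print(changed, remaining)
```

Only the differing rows get written, which is what keeps the transaction log small.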
Well, the first, and simplest solution, would obviously be if you could change the schema to include a timestamp for last update - and then only update the rows with a timestamp newer than your last change.
But if that is not possible, another way to go could be to use the HashBytes function, perhaps by concatenating the fields into an xml that you then compare. The caveat here is an 8kb limit (https://connect.microsoft.com/SQLServer/feedback/details/273429/hashbytes-function-should-support-large-data-types) EDIT: Once again, I have stolen code, this time from:
http://sqlblogcasts.com/blogs/tonyrogerson/archive/2009/10/21/detecting-changed-rows-in-a-trigger-using-hashbytes-and-without-eventdata-and-or-s.aspx
His example is:
select batch_id
from (
select distinct batch_id, hash_combined = hashbytes( 'sha1', combined )
from ( select batch_id,
combined =( select batch_id, batch_name, some_parm, some_parm2
from deleted c -- need old values
where c.batch_id = d.batch_id
for xml path( '' ) )
from deleted d
union all
select batch_id,
combined =( select batch_id, batch_name, some_parm, some_parm2
from some_base_table c -- need current values (could use inserted here)
where c.batch_id = d.batch_id
for xml path( '' ) )
from deleted d
) as r
) as c
group by batch_id
having count(*) > 1
A last resort (and my original suggestion) is to try Binary_Checksum? As noted in the comment, this does open the risk for a rather high collision rate.
http://msdn.microsoft.com/en-us/library/ms173784.aspx
I have stolen the following example from lessthandot.com - link to the full SQL (and other cool functions) is below.
--Data Mismatch
SELECT 'Data Mismatch', t1.au_id
FROM( SELECT BINARY_CHECKSUM(*) AS CheckSum1 ,au_id FROM pubs..authors) t1
JOIN(SELECT BINARY_CHECKSUM(*) AS CheckSum2,au_id FROM tempdb..authors2) t2 ON t1.au_id =t2.au_id
WHERE CheckSum1 <> CheckSum2
Example taken from http://wiki.lessthandot.com/index.php/Ten_SQL_Server_Functions_That_You_Have_Ignored_Until_Now
I don't know if this is better than adding where my.A1 <> their.A1 and my.A2 <> their.A2 and my.A3 <> their.A3, but I would definitely give it a try (assuming SQL Server 2005+):
declare @again as bit;
set @again = 1;
declare @idlist table (Id int);
while @again = 1
begin
update top (10000) MyTableA
set my.A1 = their.A1, my.A2 = their.A2, my.A3 = their.A3
output inserted.Id into @idlist (Id)
from MyTableA my
join TheirTableA their on my.Id = their.Id
left join @idlist i on my.Id = i.Id
where i.Id is null
/* alternatively (instead of left join + where):
where not exists (select * from @idlist where Id = my.Id) */
if @@ROWCOUNT > 0
set @again = 1
else
set @again = 0
end
That is, declare a table variable for collecting the IDs of the rows being updated and use that table for looking up (and omitting) IDs that have already been updated.
A slight variation on the method would be to use a local temporary table instead of a table variable. That way you would be able to create an index on the ID lookup table, which might result in better performance.
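
The same bookkeeping idea (remember which Ids are done, skip them in the next batch) can be sketched outside T-SQL. This assumes SQLite via Python's sqlite3, which has neither UPDATE TOP nor an OUTPUT clause, so the batch of Ids is selected first and recorded afterwards; the batch size and data are hypothetical.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; idlist is an indexed temp table of processed Ids.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTableA (Id INTEGER PRIMARY KEY, A1 INTEGER)")
conn.execute("CREATE TABLE TheirTableA (Id INTEGER PRIMARY KEY, A1 INTEGER)")
conn.execute("CREATE TEMP TABLE idlist (Id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO MyTableA VALUES (?, 0)", [(i,) for i in range(1, 26)])
conn.executemany(
    "INSERT INTO TheirTableA VALUES (?, ?)", [(i, i * 10) for i in range(1, 26)]
)

BATCH = 10  # hypothetical chunk size
batches = 0
while True:
    # Pick the next chunk of Ids that have not been processed yet.
    ids = [
        row[0]
        for row in conn.execute(
            "SELECT my.Id FROM MyTableA my "
            "JOIN TheirTableA their ON my.Id = their.Id "
            "WHERE NOT EXISTS (SELECT 1 FROM idlist i WHERE i.Id = my.Id) "
            "LIMIT ?",
            (BATCH,),
        )
    ]
    if not ids:
        break  # nothing left, so the loop cannot run forever
    conn.executemany(
        "UPDATE MyTableA SET A1 = (SELECT A1 FROM TheirTableA WHERE Id = ?) WHERE Id = ?",
        [(i, i) for i in ids],
    )
    conn.executemany("INSERT INTO idlist (Id) VALUES (?)", [(i,) for i in ids])
    conn.commit()  # one short transaction per chunk
    batches += 1

mismatches = conn.execute(
    "SELECT COUNT(*) FROM MyTableA my "
    "JOIN TheirTableA their ON my.Id = their.Id WHERE my.A1 <> their.A1"
).fetchone()[0]
print(batches, mismatches)
```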
If a schema change is not possible, how about using a trigger to save off the Ids that have changed, and then only import/export those rows?
Or use a trigger to export them immediately.

SQL Server 2008: Why table scanning when another logical condition is satisfied first?

Consider the following piece of code:
declare @var bit = 0
select * from tableA as A
where
1=
(case when @var = 0 then 1
when exists(select null from tableB as B where A.id=B.id)
then 1
else 0
end)
Since variable @var is set to 0, the result of evaluating the searched CASE expression is 1. The documentation for CASE says that it is evaluated until the first WHEN that is TRUE. But when I look at the execution plan, I see that tableB is scanned as well.
Does anybody know why this happens? Is there a way to avoid the second table scan when another logical condition evaluates to TRUE?
Because the plan that is compiled and cached needs to work for all possible values of @var.
You would need to use something like
if (@var = 0)
select * from tableA
else
select * from tableA as A
where exists(select * from tableB as B where A.id=B.id)
Even OPTION RECOMPILE doesn't look like it would help actually. It still doesn't give you the plan you would have got with a literal 0=0
declare @var bit = 0
select * from
master.dbo.spt_values as A
where
1=
(case when 0 = @var then 1
when exists(select null from master.dbo.spt_values as B where A.number=B.number)
then 1
else 0
end)
option(recompile)
Plan http://img189.imageshack.us/img189/3977/executionplan.jpg
select * from
master.dbo.spt_values as A
where
1=
(case when 0 = 0 then 1
when exists(select null from master.dbo.spt_values as B where A.number=B.number)
then 1
else 0
end)
Plan http://img193.imageshack.us/img193/3977/executionplan.jpg
RE: Question in comments. Try the following with the "Include Actual Execution Plan" option enabled.
declare @var bit = datepart(second,GETDATE())%2
print @var
if (@var = 0)
select * from
master.dbo.spt_values --8BA71BA5-3025-4967-A0C8-38B9FBEF8BAD
else
select * from
master.dbo.spt_values as A --8BA71BA5-3025-4967-A0C8-38B9FBEF8BAD
where exists(select null from master.dbo.spt_values as B where A.number=B.number)
Then try
SELECT usecounts, cacheobjtype, objtype, text, query_plan
FROM sys.dm_exec_cached_plans
CROSS APPLY sys.dm_exec_sql_text(plan_handle)
CROSS APPLY sys.dm_exec_query_plan(plan_handle)
where text like '%8BA71BA5-3025-4967-A0C8-38B9FBEF8BAD%'
The Compiled Plan will look like
Plan http://img178.imageshack.us/img178/3977/executionplan.jpg
The Actual Execution Plan will show only one path was actually executed though.
if tableB has few rows, a table scan is the fastest way to go.
best source for dynamic search conditions:
Dynamic Search Conditions in T-SQL by Erland Sommarskog
there are a lot of subtle implications on how you do this as to if an index can be used or not. If you are on the proper release of SQL Server 2008 you can just add OPTION (RECOMPILE) to the query and the local variable's value at run time is used for the optimizations.
Consider this, OPTION (RECOMPILE) will take this code (where no index can be used with this mess of ORs):
WHERE
(@search1 IS NULL or Column1=@Search1)
AND (@search2 IS NULL or Column2=@Search2)
AND (@search3 IS NULL or Column3=@Search3)
and optimize it at run time to be (provided that only @Search2 was passed in with a value):
WHERE
Column2=@Search2
and an index can be used (if you have one defined on Column2)
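
The reason that rewrite is safe is that the two predicates are logically equivalent once the NULL parameters are known. A quick check of that equivalence, assuming SQLite via Python's sqlite3 (which has no OPTION (RECOMPILE), so this only demonstrates the logic, not the plan change); the `items` table and its rows are made up.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; this checks equivalence of results only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (Column1 TEXT, Column2 TEXT, Column3 TEXT)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?)",
    [("a", "x", "1"), ("b", "y", "2"), ("c", "x", "3")],
)

# Only the second search parameter carries a value.
params = {"s1": None, "s2": "x", "s3": None}

catch_all = conn.execute(
    "SELECT * FROM items "
    "WHERE (:s1 IS NULL OR Column1 = :s1) "
    "AND (:s2 IS NULL OR Column2 = :s2) "
    "AND (:s3 IS NULL OR Column3 = :s3) "
    "ORDER BY Column1",
    params,
).fetchall()

# What the predicate collapses to when s1 and s3 are NULL.
simplified = conn.execute(
    "SELECT * FROM items WHERE Column2 = :s2 ORDER BY Column1", {"s2": "x"}
).fetchall()

print(catch_all == simplified, simplified)
```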

How do i check if something exist without using count(*) ... limit 1

My code is SELECT COUNT(*) FROM name_list WHERE [name]='a' LIMIT 1
It appears there is no limit clause in SQL Server. So how do i say tell me if 'a' exist in name_list.name?
IF EXISTS(SELECT * FROM name_list WHERE name = 'a')
BEGIN
-- such a record exists
END
ELSE
BEGIN
-- such a record does not exist
END
Points to note:
don't worry about the SELECT * - the database engine knows what you are asking
the IF is just for illustration - the EXISTS(SELECT ...) expression is what answers your question
the BEGIN and END are strictly speaking unnecessary if there is only one statement in the block
COUNT(*) returns a single row anyway, no need to limit.
The SQL Server equivalent of LIMIT is TOP (it is a T-SQL extension, not ANSI): SELECT TOP(1) ... FROM ... WHERE...
And finally, there is EXISTS: IF EXISTS (SELECT * FROM ... WHERE ...).
The TOP clause is the closest equivalent to LIMIT. The following will return all of the fields in the first row whose name field equals 'a' (although if more than one row matches, the row that gets returned will be undefined unless you also provide an ORDER BY clause).
SELECT TOP 1 * FROM name_list WHERE [name]='a'
But there's no need to use it if you're doing a COUNT(*). The following will return a single row with a single field that is the number of rows whose name field equals 'a' in the whole table.
SELECT COUNT(*) FROM name_list WHERE [name]='a'
IF (EXISTS(SELECT [name] FROM name_list where [name] = 'a'))
begin
-- do other work if exists
end
You can also do the opposite:
IF (NOT EXISTS(SELECT [name] FROM name_list where [name] = 'a'))
begin
-- do other work if not exists
end
No nono that is wrong.
First there is top, so you have to say something like:
select top 1 1 from name_list where [name]='a'
You'll get a row with only an unnamed field containing 1 out of the query if there is data, and no rows at all if there is no data.
This query returns exactly what you intended (a single row containing 1 or 0):
SELECT CASE WHEN EXISTS(SELECT * FROM name_list WHERE [name] = 'a') THEN 1 ELSE 0 END
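
The difference between the "first matching row" style and the "1 or 0 flag" style shows up in what comes back when nothing matches. A small sketch, assuming SQLite via Python's sqlite3, where LIMIT 1 plays the role of TOP 1; the helper names `first_match` and `exists_flag` are hypothetical.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; LIMIT 1 is its counterpart of TOP 1.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE name_list (name TEXT)")
conn.executemany("INSERT INTO name_list VALUES (?)", [("a",), ("a",), ("b",)])


def first_match(name):
    """Return (1,) if a row matches, or None when the query yields no rows."""
    return conn.execute(
        "SELECT 1 FROM name_list WHERE name = ? LIMIT 1", (name,)
    ).fetchone()


def exists_flag(name):
    """Always return one value: 1 if a row matches, otherwise 0."""
    return conn.execute(
        "SELECT CASE WHEN EXISTS (SELECT 1 FROM name_list WHERE name = ?) "
        "THEN 1 ELSE 0 END",
        (name,),
    ).fetchone()[0]


print(first_match("a"), first_match("z"), exists_flag("a"), exists_flag("z"))
```

Which one is more convenient depends on whether the caller prefers testing for "no rows" or reading a flag.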
