Performance hit of having many result sets of a single row - sql-server

I have T-SQL code that relies on a stored procedure to select a row.
When implementing a more complex T-SQL script that selects many rows based on a condition, instead of getting one result set of x rows I end up with x result sets of one row each.
My first question is: is this a concern, or is performance close to what I would get with one result set of x rows?
Second question: would it be faster to have my stored procedure insert its result into a temporary table (instead of running a SELECT)?
Edit:
Basically, this stored procedure selects all the items of a given HierarchicalObject.
ALTER PROCEDURE [dbo].[MtdMdl_HierarchicalObject_Collection_Items]
@relatedid int
AS
BEGIN
SET NOCOUNT ON
declare @curkeyid int
declare cur CURSOR static read_only LOCAL
for select distinct [Id] from MtdMdl_Item where [Owner] = @relatedid
open cur
fetch next
from cur into @curkeyid
while @@FETCH_STATUS = 0
BEGIN
-- select the item row from its ID
exec MtdMdl_Item_ItemBase_Read @keyid = @curkeyid
fetch next
from cur into @curkeyid
END
close cur
deallocate cur
END
ALTER PROCEDURE [dbo].[MtdMdl_Item_ItemBase_Read]
@keyid int
AS
BEGIN
SET NOCOUNT ON
SELECT TOP(1) [Id], [TimeStamp], [Name], [Owner], [Value]
FROM [MtdMdl_Item]
WHERE ([Id]=@keyid)
ORDER BY TimeStamp Desc
END

You should certainly place all the single output rows into a temporary table and select the final recordset from that, as in the sketch below. As the code stands, there is no way for it to return one recordset containing all the separate rows produced by iterating over the cursor with the stored procedure.
Your MtdMdl_Item_ItemBase_Read is also relevant here: after turning it into an inline table-valued function, you can avoid the stored procedure and cursor and complete the task with one single query.
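As a rough sketch of that temp-table pattern (my illustration; the column types of MtdMdl_Item are assumed), the loop would capture each call with INSERT ... EXEC and the procedure would end with one SELECT:
ALTER PROCEDURE [dbo].[MtdMdl_HierarchicalObject_Collection_Items]
@relatedid int
AS
BEGIN
SET NOCOUNT ON
-- table variable to collect the single-row results (column types assumed)
declare @result table ([Id] int, [TimeStamp] datetime, [Name] nvarchar(100), [Owner] int, [Value] nvarchar(100))
declare @curkeyid int
declare cur CURSOR static read_only LOCAL
for select distinct [Id] from MtdMdl_Item where [Owner] = @relatedid
open cur
fetch next from cur into @curkeyid
while @@FETCH_STATUS = 0
BEGIN
-- capture the one-row result of the inner procedure
insert into @result
exec MtdMdl_Item_ItemBase_Read @keyid = @curkeyid
fetch next from cur into @curkeyid
END
close cur
deallocate cur
-- one single recordset instead of one per item
select [Id], [TimeStamp], [Name], [Owner], [Value] from @result
END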
upd
Judging by your data structure, your [Id] is not unique, which is the source of the confusion.
There are many ways to do what you need, but here is an example using one query that avoids even a CTE for the temporary result:
DECLARE @relatedid int = 2
SELECT top(1) WITH ties
[Id], [TimeStamp], [Name], [Owner], [Value]
FROM MtdMdl_Item
WHERE [Owner]=@relatedid
ORDER BY row_number() over(partition BY [Id] ORDER BY [TimeStamp] DESC)
TOP(1) WITH TIES returns every row that ties for first place under the ORDER BY, so all rows whose row_number() is 1 within their [Id] partition come back, i.e. the latest row per [Id].
upd2
Example with inline table function:
CREATE FUNCTION MtdMdl_Item_ItemBase_Read (@keyid int)
RETURNS TABLE
AS
RETURN
(
SELECT TOP(1) [Id], [TimeStamp], [Name], [Owner], [Value]
FROM [MtdMdl_Item]
WHERE ([Id]=@keyid)
ORDER BY TimeStamp Desc
)
GO
DECLARE @relatedid int = 2
SELECT DISTINCT A.[Id],B.* FROM MtdMdl_Item A
OUTER apply (SELECT * FROM MtdMdl_Item_ItemBase_Read(A.[Id])) B
WHERE A.[Owner] = @relatedid

Your answer is in the link below; you should use GROUP BY instead of DISTINCT:
SQL/mysql - Select distinct/UNIQUE but return all columns?
And in the following line of your code, list the columns you want in your result:
declare cur CURSOR static read_only LOCAL
for select distinct [Id] from MtdMdl_Item where [Owner] = @relatedid
So your query will be:
declare cur CURSOR static read_only LOCAL
for select rows,you,want,in,result from MtdMdl_Item where [Owner] = @relatedid Order By [column name you want to be distinct]
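For example (a hypothetical sketch, assuming you want the latest row per [Id] as in MtdMdl_Item_ItemBase_Read), the GROUP BY version could look like this:
declare @relatedid int = 2
select i.[Id], i.[TimeStamp], i.[Name], i.[Owner], i.[Value]
from MtdMdl_Item i
join (
-- GROUP BY collapses the duplicate Ids and keeps the newest TimeStamp per Id
select [Id], max([TimeStamp]) as MaxTs
from MtdMdl_Item
where [Owner] = @relatedid
group by [Id]
) latest on latest.[Id] = i.[Id] and latest.MaxTs = i.[TimeStamp]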

Related

How to UPDATE TOP(n) with ORDER BY giving a predictable result?

I'm trying to read the top 100 items of a database table that is being used like a queue. As I do this I'm trying to mark the items as done like this:
UPDATE TOP(@qty)
QueueTable WITH (READPAST)
SET
IsDone = 1
OUTPUT
inserted.Id,
inserted.Etc
FROM
QueueTable
WHERE
IsDone = 0
ORDER BY
CreatedDate ASC;
The only problem is, according to UPDATE (Transact-SQL) on MSDN, the ORDER BY is not valid in an UPDATE and:
The rows referenced in the TOP expression used with INSERT, UPDATE, or
DELETE are not arranged in any order.
How can I achieve what I need which is to update the items at the top of the queue while also selecting them?
SQL Server allows you to update a derived table, CTE or view:
UPDATE x
SET
IsDone = 1
OUTPUT
inserted.Id,
inserted.Etc
FROM (
select TOP (N) *
FROM
QueueTable
WHERE
IsDone = 0
ORDER BY
CreatedDate ASC
) x
No need to compute a set of IDs first. This is faster and usually has more desirable locking behavior.
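Since the answer notes that a CTE is also updatable, the same statement can be phrased against one (an equivalent sketch; N remains the placeholder row count):
WITH x AS (
SELECT TOP (N) *
FROM QueueTable
WHERE IsDone = 0
ORDER BY CreatedDate ASC
)
UPDATE x
SET IsDone = 1
OUTPUT inserted.Id, inserted.Etc;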
Tested in SSMS, it works fine. You may need to modify it to suit your schema.
--create table structure
create table #temp1 (
id int identity(1,1),
value int
)
go
--insert sample data
insert #temp1 values (1)
go 20
--below is solution
declare @qty int = 10
declare @cmd nvarchar(2000) =
N'update #temp1
set value= 100
output inserted.value
where id in
(
select top '+ cast(@qty as nvarchar(5)) +' id from #temp1
order by id
)';
execute sp_executesql @cmd
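A side note (my suggestion, not part of the tested snippet above): since TOP accepts a variable expression in parentheses, the dynamic SQL can likely be avoided entirely:
declare @qty int = 10
update #temp1
set value = 100
output inserted.value
where id in
(
select top (@qty) id from #temp1
order by id
)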
You can use a ranking function (for example, row_number):
update top (100) q
set IsDone = 1
output
inserted.Id,
inserted.Etc
from (
select *, row_number() over(order by CreatedDate asc, (select 0)) rn
from QueueTable) q
where rn <= 100

Insert random text into columns from a reference table variable

I have a table ABSENCE that has 40 employee ids and need to add two columns from a table variable, which acts as a reference table. For each emp id, I need to randomly assign the values from the table variable. Here's the code I tried without randomizing:
USE TSQL2012;
GO
DECLARE @MAX SMALLINT;
DECLARE @MIN SMALLINT;
DECLARE @RECODE SMALLINT;
DECLARE @RE CHAR(100);
DECLARE @rearray table (recode smallint,re char(100));
insert into @rearray values (100,'HIT BY BEER TRUCK')
,(200,'BAD HAIR DAY')
,(300,'ASPIRIN OVERDOSE')
,(400,'MAKEUP DISASTER')
,(500,'GOT LOCKED IN THE SALOON')
DECLARE @REFCURSOR AS CURSOR;
SET @REFCURSOR = CURSOR FOR
SELECT RECODE,RE FROM @REARRAY;
OPEN @REFCURSOR;
SET @MAX = (SELECT DISTINCT @@ROWCOUNT FROM ABSENCE);
SET @MIN = 0;
ALTER TABLE ABSENCE ADD CODE SMALLINT, REASONING CHAR(100);
WHILE (@MIN <= @MAX)
BEGIN
FETCH NEXT FROM @REFCURSOR INTO @RECODE,@RE;
INSERT INTO ABSENCE (CODE, REASONING) VALUES (@RECODE,@RE);
SET @MIN+=1;
END
CLOSE @REFCURSOR
DEALLOCATE @REFCURSOR
SELECT EMPID,CODE,REASONING FROM ABSENCE
Though I am inserting into two columns only, it is attempting to insert into EMPID (which has already been filled) and, as it cannot be NULL, the insertion fails.
Also, how do I randomize the values from the @rearray table variable when inserting them into the ABSENCE table?
Since this is a small dataset, one approach might be to use CROSS APPLY with SELECT TOP(1) ... FROM @rearray ORDER BY NEWID(). This essentially joins your ABSENCE table with your reference table in an UPDATE statement, picking a random row for each employee. In full (targeting the CODE and REASONING columns you added, and correlating the subquery so each outer row gets its own random pick), it would look like:
UPDATE a
SET CODE = x.recode, REASONING = x.re
FROM ABSENCE a
CROSS APPLY (SELECT TOP(1) *
             FROM @rearray
             WHERE a.EMPID = a.EMPID -- correlation forces a fresh random row per outer row
             ORDER BY NEWID()) x(recode, re)

How to use CTE instead of Cursor

I want to use a common table expression (CTE) instead of a cursor in SQL Server 2012. Your assistance is highly appreciated.
This is my situation:
DECLARE
@tp_ID INTEGER
truncate table T_Rep_Exit_Checklist_Table
DECLARE cursorName CURSOR -- Declare cursor
LOCAL SCROLL STATIC
FOR
SELECT
tp_ID
from V_Rep_Exit_Checklist
OPEN cursorName -- open the cursor
FETCH NEXT FROM cursorName INTO @tp_ID
WHILE @@FETCH_STATUS = 0
BEGIN
insert into T_Rep_Exit_Checklist_Table
SELECT
@tp_ID-- AS tp_ID
,Item_Status_Code
,Item_Status_Desc
,Item_Code
,Item_Desc
,Item_Cat_Code
,Item_Cat_Desc
,Item_Cleared_By_No
,Item_Cleared_By_Name
FROM V_Rep_Exit_Checklist c
FETCH NEXT FROM cursorName
INTO @tp_ID
END
CLOSE cursorName -- close the cursor
DEALLOCATE cursorName -- Deallocate the cursor
Something about your query doesn't make sense. Your INSERT statement with the SELECT is missing a WHERE clause, so if your source view has 5 records, for instance, you insert 25 records: you take the id of the first record during the first iteration of the cursor, insert all records with that id, then repeat for each row of the view.
Assuming the above logic is intended, then you should just need a CROSS JOIN:
INSERT T_Rep_Exit_Checklist_Table
SELECT
T1.tp_ID,
T2.Item_Status_Code,
T2.Item_Status_Desc,
T2.Item_Code,
T2.Item_Desc,
T2.Item_Cat_Code,
T2.Item_Cat_Desc,
T2.Item_Cleared_By_No,
T2.Item_Cleared_By_Name
FROM V_Rep_Exit_Checklist T1
CROSS JOIN V_Rep_Exit_Checklist T2
However, you would like to see it as CTE:
;WITH CTE AS (
SELECT * FROM V_Rep_Exit_Checklist
)
INSERT T_Rep_Exit_Checklist_Table
SELECT
T1.tp_ID,
T2.Item_Status_Code,
T2.Item_Status_Desc,
T2.Item_Code,
T2.Item_Desc,
T2.Item_Cat_Code,
T2.Item_Cat_Desc,
T2.Item_Cleared_By_No,
T2.Item_Cleared_By_Name
FROM CTE T1
CROSS JOIN CTE T2
If my assumption is wrong and instead you are trying to just insert all records in the view directly into the table, then why not just a simple INSERT as below?
INSERT T_Rep_Exit_Checklist_Table
SELECT
tp_ID,
Item_Status_Code,
Item_Status_Desc,
Item_Code,
Item_Desc,
Item_Cat_Code,
Item_Cat_Desc,
Item_Cleared_By_No,
Item_Cleared_By_Name
FROM V_Rep_Exit_Checklist
However, if your business requirement is such that you can only insert the records from your view one tp_ID at a time, a WHILE loop could be used to replace your cursor:
DECLARE @Records TABLE (tp_ID INT)
INSERT @Records
SELECT tp_ID FROM V_Rep_Exit_Checklist
DECLARE @tp_ID INTEGER
WHILE EXISTS (SELECT * FROM @Records) BEGIN
SET @tp_ID = (SELECT TOP 1 tp_ID FROM @Records)
INSERT T_Rep_Exit_Checklist_Table
SELECT
tp_ID,
Item_Status_Code,
Item_Status_Desc,
Item_Code,
Item_Desc,
Item_Cat_Code,
Item_Cat_Desc,
Item_Cleared_By_No,
Item_Cleared_By_Name
FROM V_Rep_Exit_Checklist
WHERE tp_ID = @tp_ID
DELETE @Records WHERE tp_ID = @tp_ID
END

SQL Server query with pagination and count

I want to make a database query with pagination. So, I used a common-table expression and a ranked function to achieve this. Look at the example below.
declare @table table (name varchar(30));
insert into @table values ('Jeanna Hackman');
insert into @table values ('Han Fackler');
insert into @table values ('Tiera Wetherbee');
insert into @table values ('Hilario Mccray');
insert into @table values ('Mariela Edinger');
insert into @table values ('Darla Tremble');
insert into @table values ('Mammie Cicero');
insert into @table values ('Raisa Harbour');
insert into @table values ('Nicholas Blass');
insert into @table values ('Heather Hayashi');
declare @pagenumber int = 2;
declare @pagesize int = 3;
declare @total int;
with query as
(
select name, ROW_NUMBER() OVER(ORDER BY name ASC) as line from @table
)
select top (@pagesize) name from query
where line > (@pagenumber - 1) * @pagesize
Here, I can specify the @pagesize and @pagenumber variables to get just the records that I want. However, this example (which comes from a stored procedure) is used for grid pagination in a web application. This web application requires showing the page numbers. For instance, if I have 12 records in the database and the page size is 3, then I'll have to show 4 links, each one representing a page.
But I can't do this without knowing how many records are there, and this example just gives me the subset of records.
Then I changed the stored procedure to return the count(*).
declare @pagenumber int = 2;
declare @pagesize int = 3;
declare @total int;
with query as
(
select name, ROW_NUMBER() OVER(ORDER BY name ASC) as line, total = count(*) over() from @table
)
select top (@pagesize) name, total from query
where line > (@pagenumber - 1) * @pagesize
So, along with each line, it will show the total number of records. But I didn't like it.
My question is whether there's a better way (performance-wise) to do this, maybe setting the @total variable without returning this information in the SELECT. Or is this total column something that won't harm performance too much?
Thanks
Assuming you are using MSSQL 2012, you can use OFFSET and FETCH, which cleans up server-side paging greatly. We've found performance is fine, and in most cases better. As far as getting the total count, just use the window function below inline... it will not include the limits imposed by OFFSET and FETCH.
For Row_Number, you can use window functions the way you did, but I would recommend that you calculate it client side as (pagenumber * pagesize + resultsetRowNumber), with a zero-based page number, so if you're on page 5 of size 10 and on the third row you would output row 53.
When applied to an Orders table with about 2 million orders, I found the following:
FAST VERSION
This ran in under a second. The nice thing about it is that you can do your filtering in the common table expression once and it applies both to the paging process and the count. When you have many predicates in the where clause, this keeps things simple.
declare @skipRows int = 25,
@takeRows int = 100,
@count int = 0
;WITH Orders_cte AS (
SELECT OrderID
FROM dbo.Orders
)
SELECT
OrderID,
tCountOrders.CountOrders AS TotalRows
FROM Orders_cte
CROSS JOIN (SELECT Count(*) AS CountOrders FROM Orders_cte) AS tCountOrders
ORDER BY OrderID
OFFSET @skipRows ROWS
FETCH NEXT @takeRows ROWS ONLY;
SLOW VERSION
This took about 10 sec, and it was the Count(*) that caused the slowness. I'm surprised this is so slow, but I suspect it's simply calculating the total for each row. It's very clean though.
declare @skipRows int = 25,
@takeRows int = 100,
@count int = 0
SELECT
OrderID,
Count(*) Over() AS TotalRows
FROM Location.Orders
ORDER BY OrderID
OFFSET @skipRows ROWS
FETCH NEXT @takeRows ROWS ONLY;
CONCLUSION
We've gone through this performance tuning process before and actually found that it depended on the query, predicates used, and indexes involved. For instance, the second we introduced a view it chugged, so we actually query off the base table and then join up the view (which includes the base table) and it actually performs very well.
I would suggest having a couple of straight-forward strategies and applying them to high-value queries that are chugging.
DECLARE @pageNumber INT = 1 ,
@RowsPerPage INT = 20
SELECT *
FROM TableName
ORDER BY Id
OFFSET ( ( @pageNumber - 1 ) * @RowsPerPage ) ROWS
FETCH NEXT @RowsPerPage ROWS ONLY;
What if you calculate the count beforehand?
declare @pagenumber int = 2;
declare @pagesize int = 3;
declare @total int;
SELECT @total = count(*)
FROM @table;
with query as
(
select name, ROW_NUMBER() OVER(ORDER BY name ASC) as line from @table
)
select top (@pagesize) name, @total total from query
where line > (@pagenumber - 1) * @pagesize
Another way is to calculate max(line). Check the link:
Return total records from SQL Server when using ROW_NUMBER
UPD:
For single query, check marc_s's answer on the link above.
with query as
(
select name, ROW_NUMBER() OVER(ORDER BY name ASC) as line from @table
)
select top (@pagesize) name,
(SELECT MAX(line) FROM query) AS total
from query
where line > (@pagenumber - 1) * @pagesize
@pagenumber = 5
@pagesize = 5
Create a common table expression and write the paging predicate like this (a sketch follows below):
BETWEEN ((@pagenumber - 1) * @pagesize) + 1 AND (@pagenumber * @pagesize)
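A minimal sketch of that idea (my illustration, reusing the @table sample data declared in the question):
declare @pagenumber int = 5;
declare @pagesize int = 5;
with query as
(
select name, ROW_NUMBER() OVER (ORDER BY name ASC) as line from @table
)
select name
from query
where line between ((@pagenumber - 1) * @pagesize) + 1
and (@pagenumber * @pagesize);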
There are many ways we can achieve pagination; I hope this information is useful to you and others.
Example 1: using the OFFSET-FETCH NEXT clause, introduced in SQL Server 2012
declare @table table (name varchar(30));
insert into @table values ('Jeanna Hackman');
insert into @table values ('Han Fackler');
insert into @table values ('Tiera Wetherbee');
insert into @table values ('Hilario Mccray');
insert into @table values ('Mariela Edinger');
insert into @table values ('Darla Tremble');
insert into @table values ('Mammie Cicero');
insert into @table values ('Raisa Harbour');
insert into @table values ('Nicholas Blass');
insert into @table values ('Heather Hayashi');
declare @pagenumber int = 1
declare @pagesize int = 3
--this is a CTE (common table expression), introduced in SQL Server 2005
with query as
(
select ROW_NUMBER() OVER(ORDER BY name ASC) as line, name from @table
)
--an ORDER BY clause is required to use OFFSET-FETCH
select * from query
order by name
offset ((@pagenumber - 1) * @pagesize) rows
fetch next @pagesize rows only
Example 2: using the row_number() function and BETWEEN
declare @table table (name varchar(30));
insert into @table values ('Jeanna Hackman');
insert into @table values ('Han Fackler');
insert into @table values ('Tiera Wetherbee');
insert into @table values ('Hilario Mccray');
insert into @table values ('Mariela Edinger');
insert into @table values ('Darla Tremble');
insert into @table values ('Mammie Cicero');
insert into @table values ('Raisa Harbour');
insert into @table values ('Nicholas Blass');
insert into @table values ('Heather Hayashi');
declare @pagenumber int = 2
declare @pagesize int = 3
SELECT *
FROM
(select ROW_NUMBER() OVER (ORDER BY PRODUCTNAME) AS RowNum, * from Products)
as Product
where RowNum between (((@pagenumber - 1) * @pagesize) + 1)
and (@pagenumber * @pagesize)
I hope these will be helpful to all
I don't like other solutions for being too complex, so here is my version.
Execute three select queries in one go and use output parameters for getting the count values. This query returns the total count, the filter count, and the page rows. It supports sorting, searching, and filtering the source data. It's easy to read and modify.
Let's say you have two tables in a one-to-many relationship: items and their prices, which change over time, so the example query is not too trivial.
create table shop.Items
(
Id uniqueidentifier not null primary key,
Name nvarchar(100) not null
);
create table shop.Prices
(
ItemId uniqueidentifier not null,
Updated datetime not null,
Price money not null,
constraint PK_Prices primary key (ItemId, Updated),
constraint FK_Prices_Items foreign key (ItemId) references shop.Items(Id)
);
Here is the query:
select @TotalCount = count(*) over()
from shop.Items i;
select @FilterCount = count(*) over()
from shop.Items i
outer apply (select top 1 p.Price, p.Updated from shop.Prices p where p.ItemId = i.Id order by p.Updated desc) as p
where (@Search is null or i.Name like '%' + @Search + '%')/**where**/;
select i.Id as ItemId, i.Name, p.Price, p.Updated
from shop.Items i
outer apply (select top 1 p.Price, p.Updated from shop.Prices p where p.ItemId = i.Id order by p.Updated desc) as p
where (@Search is null or i.Name like '%' + @Search + '%')/**where**/
order by /**orderby**/i.Id
offset @SkipCount rows fetch next @TakeCount rows only;
You need to provide the following parameters to the query:
@SkipCount - how many records to skip, calculated from the page number.
@TakeCount - how many records to return, calculated from or equal to the page size.
@Search - a text to search for in some columns, provided by the grid's search box.
@TotalCount - the total number of records in the data source (output parameter).
@FilterCount - the number of records after the search and filtering operations (output parameter).
You can replace the /**orderby**/ comment with a list of columns and their ordering directions if the grid must support sorting rows by columns. You get this info from the grid and translate it into an SQL expression. We still need to order the records by some column initially; I usually use the ID column for that.
If the grid must support filtering data by each column individually, you can replace the /**where**/ comment with an SQL expression for that.
If the user is not searching or filtering the data, but only clicking through the grid pages, this query doesn't change at all and the database server executes it very quickly.
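For concreteness, here is one way the three queries might be wrapped in a procedure with output parameters (a hedged sketch of my own; the procedure name is hypothetical, and the /**where**/ and /**orderby**/ tokens are left where a code generator would substitute its expressions):
create procedure shop.GetItemsPage
@Search nvarchar(100),
@SkipCount int,
@TakeCount int,
@TotalCount int output,
@FilterCount int output
as
begin
set nocount on;
-- total number of records in the data source
select @TotalCount = count(*) from shop.Items;
-- number of records after search/filtering
select @FilterCount = count(*)
from shop.Items i
where (@Search is null or i.Name like '%' + @Search + '%')/**where**/;
-- the page rows themselves
select i.Id as ItemId, i.Name, p.Price, p.Updated
from shop.Items i
outer apply (select top 1 p.Price, p.Updated from shop.Prices p where p.ItemId = i.Id order by p.Updated desc) as p
where (@Search is null or i.Name like '%' + @Search + '%')/**where**/
order by /**orderby**/i.Id
offset @SkipCount rows fetch next @TakeCount rows only;
end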

Do Inserted Records Always Receive Contiguous Identity Values

Consider the following SQL:
CREATE TABLE Foo
(
ID int IDENTITY(1,1),
Data nvarchar(max)
)
INSERT INTO Foo (Data)
SELECT TOP 1000 Data
FROM SomeOtherTable
WHERE SomeColumn = @SomeParameter
DECLARE @LastID int
SET @LastID = SCOPE_IDENTITY()
I would like to know if I can depend on the 1000 rows that I inserted into table Foo having contiguous identity values. In other words, if this SQL block produces a @LastID of 2000, can I know for certain that the ID of the first record I inserted was 1001? I am mainly curious about multiple statements inserting records into table Foo concurrently.
I know that I could add a serializable transaction around my insert statement to ensure the behavior that I want, but do I really need to? I'm worried that introducing a serializable transaction will degrade performance, but if SQL Server won't allow other statements to insert into table Foo while this statement is running, then I don't have to worry about it.
I disagree with the accepted answer. This can easily be tested and disproved by running the following.
Setup
USE tempdb
CREATE TABLE Foo
(
ID int IDENTITY(1,1),
Data nvarchar(max)
)
Connection 1
USE tempdb
SET NOCOUNT ON
WHILE NOT EXISTS(SELECT * FROM master..sysprocesses WHERE context_info = CAST('stop' AS VARBINARY(128) ))
BEGIN
INSERT INTO Foo (Data)
VALUES ('blah')
END
Connection 2
USE tempdb
SET NOCOUNT ON
SET CONTEXT_INFO 0x
DECLARE @Output TABLE(ID INT)
WHILE 1 = 1
BEGIN
/*Clear out table variable from previous loop*/
DELETE FROM @Output
/*Insert 1000 records*/
INSERT INTO Foo (Data)
OUTPUT inserted.ID INTO @Output
SELECT TOP 1000 NEWID()
FROM sys.all_columns
IF EXISTS(SELECT * FROM @Output HAVING MAX(ID) - MIN(ID) <> 999 )
BEGIN
/*Set Context Info so other connection inserting
a single record in a loop terminates itself*/
DECLARE @stop VARBINARY(128)
SET @stop = CAST('stop' AS VARBINARY(128))
SET CONTEXT_INFO @stop
/*Return results for inspection*/
SELECT ID, DENSE_RANK() OVER (ORDER BY Grp) AS ContigSection
FROM
(SELECT ID, ID - ROW_NUMBER() OVER (ORDER BY [ID]) AS Grp
FROM @Output) O
ORDER BY ID
RETURN
END
END
Yes, they will be contiguous because the INSERT is atomic: complete success or full rollback. It is also performed as a single unit of work: you won't get any "interleaving" with other processes.
However (or to put your mind at rest!), consider the OUTPUT clause:
DECLARE @KeyStore TABLE (ID int NOT NULL)
INSERT INTO Foo (Data)
OUTPUT INSERTED.ID INTO @KeyStore (ID) --this line
SELECT TOP 1000 Data
FROM SomeOtherTable
WHERE SomeColumn = @SomeParameter
If you want the Identity values for multiple rows, use OUTPUT:
DECLARE @NewIDs table (PKColumn int)
INSERT INTO Foo (Data)
OUTPUT INSERTED.ID
INTO @NewIDs (PKColumn)
SELECT TOP 1000 Data
FROM SomeOtherTable
WHERE SomeColumn = @SomeParameter
You now have the entire set of values in the @NewIDs table. You can add any columns from the Foo table into the @NewIDs table and insert those columns as well.
It is not good practice to attach any sort of meaning whatsoever to identity values. You should assume that they are nothing more than integers guaranteed to be unique within the scope of your table.
Try adding the following:
option(maxdop 1)
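Presumably the hint is meant to be appended to the INSERT ... SELECT from the question, as in this sketch (my reading of the suggestion; MAXDOP 1 forces a serial plan, which affects ordering within this one statement but not interleaving from other sessions):
INSERT INTO Foo (Data)
SELECT TOP 1000 Data
FROM SomeOtherTable
WHERE SomeColumn = @SomeParameter
OPTION (MAXDOP 1)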
