I am trying to create a dynamic forecast for 18(!) months, where each month depends on the previous columns (months), and I am stuck:
I have three columns:
Stock
SafetyStock
Need for production - another SELECT with a WHERE date = getdate() clause
What I need to achieve:
Index, Stock - current month, SafetyStock - current month, Need for production (SELECT * FROM Nfp WHERE date = getdate()), Stock - current month + 1, SafetyStock - current month + 1, Need for production - current month + 1 ... and so on, out to 18 months
calculations:
Stock - Current month + 1 = Stock previous month + SafetyStock previous month - Needs for production of current month
Is there any possibility to create something like this? It has to be dynamic and run the calculation for the current date and the next 18 months, so right now it would cover 2020-10 through, let's say, 2022-04.
What I have tried:
I prepared 18 CTEs and joined everything together, then did the calculations. It works, but it is slow and I don't think it is professional.
I also tried dynamic SQL; you can see my code below, but I got stuck when I wanted to create a computed column that depends on the previous computed column:
------------------- CODE -------------------------
if object_id('tempdb..#tmp') is not null
    drop table #tmp
if object_id('tempdb..#tmp2') is not null
    drop table #tmp2

declare @cols as int
declare @iteration as int
declare @Mth as nvarchar(30)
declare @data as date
declare @sql as nvarchar(max)
declare @sql2 as nvarchar(max)

set @cols = 18
set @iteration = 0
set @Mth = month(getdate())
set @data = cast(getdate() as date)

select
    10 as SS,
    12 as Stock
into #tmp

WHILE @iteration < @cols
begin
    set @iteration = @iteration + 1
    set @sql =
    '
    alter table #tmp
    add [StockUwzgledniajacSS - ' + cast(concat(year(DATEADD(Month, @iteration, @data)), '-', month(DATEADD(Month, @iteration, @data))) as nvarchar(max)) + '] as (Stock - SS)
    '
    exec (@sql)
    set @Mth = @Mth + 1
    set @sql2 =
    '
    alter table #tmp
    add [StockUwzgledniajacSS - ' + @Mth + '] as ([StockUwzgledniajacSS - ' + @Mth + '])
    '
end
select * from #tmp
thanks in advance!
Update 1 note: I wrote this before you posted your data. This still holds, I believe, but of course stock levels are way different. Given that your NFP data is by day and your report is by month, I suggest adding something to preprocess that data into months, e.g., the sum of NFP values, grouped by month.
Update 2 (next day) note: From the OP's comments below, I've tried to integrate this with what was written, and to answer the question more directly, e.g., by creating a reporting table #tmp.
Given that the OP also mentions millions of rows, I imagine each row represents a specific part/item - I've included this as a field called StockNum.
I have done something that probably doesn't do your calculations properly, but demonstrates the approach and should get you over your current hurdle. Indeed, if you haven't used these before, then updating this code with your own calculations will help you to understand how it works so you can maintain it.
I'm assuming the key issue here for calculation is that this month's stock is based on last month's stock and then new stock minus old stock for this month.
It is possible to calculate this in 18 separate statements (update table set col2 = some function of col1, then update table set col3 = some function of col2, etc). However, updating the same table multiple times is often an anti-pattern causing poor performance - especially if you need to read the base data again and again.
Instead, something like this is often best calculated using a Recursive CTE (here's an example description), where it 'builds' a set of data based on previous results.
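The shape of a recursive CTE can be sketched in isolation (a hypothetical sketch, just generating the 18 month offsets rather than stock values):

```sql
-- Minimal recursive CTE sketch: each row is built from the previous one.
;WITH Months AS
(
    SELECT 0 AS MonthNum            -- anchor: the current month
    UNION ALL
    SELECT MonthNum + 1             -- recursion: previous row plus one
    FROM Months
    WHERE MonthNum < 18
)
SELECT MonthNum
FROM Months
OPTION (MAXRECURSION 18);           -- safety cap matching the horizon
```

The full example that follows uses exactly this shape, but carries stock values forward instead of a bare counter.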
The key difference in this approach is that it
Creates the reporting table (without any data/calculations going in)
Calculates the data as a separate step - but with columns/fields that can be used to link to the reporting table
Inserts the data from calculations into the reporting table as a single insert statement.
I have used temporary tables/etc liberally, to help demonstrate the process.
You haven't explained what safety stock is, nor how you measure what's coming in, so for the example below, I have assumed safety stock is the amount produced and is 5 per month. I've then assumed that NFP is amount going out each month (e.g., forward estimates of sales). The key result will be stock at the end of month (e.g., which you could then review whether it's too high or too low).
As you want to store it in a table that has each month as columns, the first step is to create a list with the relevant buckets (months). These include fields used for matching in later calculations/etc. Note I have included some date fields (startdate and enddate) which may be useful when you customise the code. This part of the SQL is designed to be as straightforward as possible.
We then create the scratch table that has our reference data for stock movements, replacing your SELECT * FROM NFP WHERE date = getdate()
/* SET UP BUCKET LIST TO HELP CALCULATION */
CREATE TABLE #RepBuckets (BucketNum int, BucketName nvarchar(30), BucketStartDate datetime, BucketEndDate datetime)
INSERT INTO #RepBuckets (BucketNum) VALUES
(0),(1),(2),(3),(4),(5),(6),(7),(8),(9),(10),
(11),(12),(13),(14),(15),(16),(17),(18)
DECLARE @CurrentBucketStart date
SET @CurrentBucketStart = DATEFROMPARTS(YEAR(getdate()), MONTH(getdate()), 1)
UPDATE #RepBuckets
SET BucketName = 'StockAtEnd_' + FORMAT(DATEADD(month, BucketNum, @CurrentBucketStart), 'MMM_yy'),
BucketStartDate = DATEADD(month, BucketNum, @CurrentBucketStart),
BucketEndDate = DATEADD(month, BucketNum + 1, @CurrentBucketStart)
/* CREATE BASE DATA */
-- Current stock
CREATE TABLE #Stock (StockNum int, MonthNum int, StockAtStart int, SafetyStock int, NFP int, StockAtEnd int, PRIMARY KEY(StockNum, MonthNum))
INSERT INTO #Stock (StockNum, MonthNum, StockAtStart, SafetyStock, NFP, StockAtEnd) VALUES
(12422, 0, NULL, NULL, NULL, 10)
-- Simulates SELECT * FROM NFP WHERE date = getdate()
CREATE TABLE #NFP_by_month (StockNum int, MonthNum int, StockNFP int, PRIMARY KEY(StockNum, MonthNum))
INSERT INTO #NFP_by_month (StockNum, MonthNum, StockNFP) VALUES
(12422, 1, 4), (12422, 7, 4), (12422, 13, 4),
(12422, 2, 5), (12422, 8, 5), (12422, 14, 5),
(12422, 3, 2), (12422, 9, 2), (12422, 15, 2),
(12422, 4, 7), (12422, 10, 7), (12422, 16, 7),
(12422, 5, 9), (12422, 11, 9), (12422, 17, 9),
(12422, 6, 3), (12422, 12, 3), (12422, 18, 3)
We then use a recursive CTE to calculate our data, storing the results in the table #StockProjections.
What this does is
Starts with your current stock (the last row in the #Stock table). Note that the only value that matters in that row is the stock at the end of the month.
Uses that stock level at the end of last month as the stock level at the start of the new month.
Adds the safety stock, subtracts the NFP, and calculates your stock at end of month.
Note that within the recursive part of the CTE, 'SBM' (StockByMonth) refers to last month's data. This is then used with whatever external data (e.g., #NFP_by_month) to calculate the new data.
These calculations create a table with
StockNum (the ID number of the relevant stock item - for this example, I've used one stock item 12422)
MonthNum (I've used integers here rather than dates, for clarity/simplicity)
BucketName (an nvarchar representing the month, used for column names)
Stock at start of month
Safety stock (which I assume is incoming stock, 5 per month)
NFP (which I assume is outgoing stock, varies by month and comes from a scratch table here - you'll need to adjust this to your select)
Stock at end of month
/* CALCULATE PROJECTIONS */
CREATE TABLE #StockProjections (StockNum int, BucketName nvarchar(30), MonthNum int, StockAtStart int, SafetyStock int, NFP int, StockAtEnd int, PRIMARY KEY (StockNum, BucketName))
; WITH StockByMonth AS
(-- Anchor
SELECT TOP 1 StockNum, MonthNum, StockAtStart, SafetyStock, NFP, StockAtEnd
FROM #Stock S
ORDER BY MonthNum DESC
-- Recursion
UNION ALL
SELECT NFP.StockNum,
SBM.MonthNum + 1 AS MonthNum,
SBM.StockAtEnd AS NewStockAtStart,
5 AS Safety_Stock,
NFP.StockNFP,
SBM.StockAtEnd + 5 - NFP.StockNFP AS NewStockAtEnd
FROM StockByMonth SBM
INNER JOIN #NFP_by_month NFP ON NFP.MonthNum = SBM.MonthNum + 1
WHERE NFP.MonthNum <= 18
)
INSERT INTO #StockProjections (StockNum, BucketName, MonthNum, StockAtStart, SafetyStock, NFP, StockAtEnd)
SELECT StockNum, BucketName, MonthNum, StockAtStart, SafetyStock, NFP, StockAtEnd
FROM StockByMonth
INNER JOIN #RepBuckets ON StockByMonth.MonthNum = #RepBuckets.BucketNum
Now we have the data, we set up a table for reporting purposes. Note that this table has the month names embedded into the column names (e.g., StockAtEnd_Jun_21). It would be easier to use a generic name (e.g., StockAtEnd_Month4) but I've gone for the slightly more complex case here for demonstration.
/* SET UP TABLE FOR REPORTING */
DECLARE @cols int = 18
DECLARE @iteration int = 0
DECLARE @colname nvarchar(30)
DECLARE @sql2 as nvarchar(max)
CREATE TABLE #tmp (StockNum int PRIMARY KEY)
WHILE @iteration <= @cols
BEGIN
SET @colname = (SELECT TOP 1 BucketName FROM #RepBuckets WHERE BucketNum = @iteration)
SET @sql2 = 'ALTER TABLE #tmp ADD ' + QUOTENAME(@colname) + ' int'
EXEC (@sql2)
SET @iteration = @iteration + 1
END
The last step is to add the data to your reporting table. I've used a pivot here but feel free to use whatever you like.
/* POPULATE TABLE */
DECLARE @columnList nvarchar(max) = N'';
SELECT @columnList += QUOTENAME(BucketName) + N' ' FROM #RepBuckets
SET @columnList = REPLACE(RTRIM(@columnList), ' ', ', ')
DECLARE @sql3 nvarchar(max)
SET @sql3 = N'
;WITH StockPivotCTE AS
(SELECT *
FROM (SELECT StockNum, BucketName, StockAtEnd
FROM #StockProjections
) StockSummary
PIVOT
(SUM(StockAtEnd)
FOR [BucketName]
IN (' + @columnList + N')
) AS StockPivot
)
INSERT INTO #tmp (StockNum, ' + @columnList + N')
SELECT StockNum, ' + @columnList + N'
FROM StockPivotCTE'
EXEC (@sql3)
Here's a DB<>fiddle showing it running with results of each sub-step.
Related
I'm running into a wall compiling row updates and new rows in a few tables to save off in another table for trending. I know a cursor could achieve this pretty easily and get me a result set, but I'm struggling to figure out how to get the results into a table with the cursor (or whether I should approach it completely differently).
Background
I want to calculate and save off, daily, the number of new and edited rows from several tables of interest in a production database. These tables' rows are timestamped with the last edit.
My stats database contains a tablestats table that will house the information for each table across 6 columns. My goal is to run an Agent job daily to count the prior day's timestamps and the delta between today's rowcount and the prior day's rowcount, and then merge those into tablestats.
Something like this:
tablename  updyear  updmonth  updday  rowupdates  newrows
table_1    2023     2         5       2509        34
table_1    2023     2         6       3443        90
table_2    2023     2         5       834         255
table_2    2023     2         6       544         433
With that, I can trend/pivot the data as needed.
What I tried
I figured a cursor would in part be the best approach since I was having trouble condensing the query's results with the name of the table I'm pulling from. I adapted this question & answers to get part of the way there, but I'm struggling with how to take the next step. I abbreviated the below code for legibility:
DECLARE @last_upd nvarchar(MAX) = '';
DECLARE @checkdate date = DATEADD(DAY, -1, GETDATE());
SELECT @last_upd = @last_upd + 'SELECT '''
+ QUOTENAME(name)
+ ''',YEAR(last_upd) as updyear /* month, etc. */,COUNT(last_upd) as rowupdates FROM '
+ QUOTENAME(name)
+ ' WHERE last_upd > @checkdate /* GROUP BY year/month/day*/; '
FROM sys.tables
WHERE (name IN ('table_1','table_2','table_3'))
IF @@ROWCOUNT > 0
EXEC sp_executesql @last_upd
, N'@checkdate date'
, @checkdate
Which returns the following:
Query 1

          updyear  updmonth  updday  rowupdates
table_1   2023     2         5       …
table_1   2023     2         6       …

Query 2

          updyear  updmonth  updday  rowupdates
table_2   2023     2         5       …
table_2   2023     2         6       …

Query 3, etc.
Since it returns as 3 separate queries, I'm unsure how to get that into a merge statement, since I can't SELECT * INTO #temptable with these.
The reason I'm interested in MERGE, even though it's a daily run, is to accommodate any potential conflicts with existing data. I haven't gotten to the point of doing a rowcount, but assume at worst I could do a second cursor with the rowcount prior to rolling it all up into a stored procedure.
What you really want is a UNION ALL to combine the results from the various queries into a single result set. If you change your dynamic SELECT to a UNION ALL SELECT, you are most of the way there. What's left is to strip the leading occurrence of the separator using something like SET @last_upd = STUFF(@last_upd, 1, 10, ''), which replaces the first 10 characters with nothing.
If you include a newline immediately after the opening quote of your dynamic statement, the generated SQL will look a lot nicer when you print it out during debugging.
It is also common now to use STRING_AGG() to combine generated code snippets when generating dynamic SQL, but your approach works, so I'll leave it.
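As an illustration (a sketch; STRING_AGG() requires SQL Server 2017 or later, and I'm reusing the same hypothetical table names), the generation step might look like:

```sql
DECLARE @checkdate date = DATEADD(DAY, -1, GETDATE());
DECLARE @last_upd nvarchar(MAX);

-- STRING_AGG joins the generated SELECTs with the separator directly,
-- so no STUFF() cleanup of a leading UNION ALL is needed.
SELECT @last_upd = STRING_AGG(CAST(
        'SELECT ' + QUOTENAME(name, '''')
      + ',YEAR(last_upd) as updyear,COUNT(last_upd) as rowupdates FROM '
      + QUOTENAME(name)
      + ' WHERE last_upd > @checkdate GROUP BY YEAR(last_upd)'
      AS nvarchar(MAX)), ' UNION ALL ')
FROM sys.tables
WHERE name IN ('table_1', 'table_2', 'table_3');

EXEC sp_executesql @last_upd, N'@checkdate date', @checkdate;
```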
For the table name column in the result, you can use QUOTENAME(..., '''') to safely stringify the name inside single quotes instead of [].
The updated code would be something like:
DECLARE @last_upd nvarchar(MAX) = '';
DECLARE @checkdate date = DATEADD(DAY, -1, GETDATE());
SELECT @last_upd = @last_upd + '
UNION ALL
SELECT '
+ QUOTENAME(name, '''')
+ ',YEAR(last_upd) as updyear /* month, etc. */,COUNT(last_upd) as rowupdates FROM '
+ QUOTENAME(name)
+ ' WHERE last_upd > @checkdate GROUP BY YEAR(last_upd) /* year/month/day*/ '
FROM sys.tables
WHERE (name IN ('table_1','table_2','table_3'))
SET @last_upd = STUFF(@last_upd, 1, 10, '')
SELECT @last_upd
IF @@ROWCOUNT > 0
EXEC sp_executesql @last_upd
, N'@checkdate date'
, @checkdate
Generated SQL:
SELECT 'table_1',YEAR(last_upd) as updyear /* month, etc. */,COUNT(last_upd) as rowupdates FROM [table_1] WHERE last_upd > @checkdate GROUP BY YEAR(last_upd) /* year/month/day*/
UNION ALL
SELECT 'table_2',YEAR(last_upd) as updyear /* month, etc. */,COUNT(last_upd) as rowupdates FROM [table_2] WHERE last_upd > @checkdate GROUP BY YEAR(last_upd) /* year/month/day*/
UNION ALL
SELECT 'table_3',YEAR(last_upd) as updyear /* month, etc. */,COUNT(last_upd) as rowupdates FROM [table_3] WHERE last_upd > @checkdate GROUP BY YEAR(last_upd) /* year/month/day*/
Results:
(No column name)  updyear  rowupdates
[table_1]         2023     2
[table_2]         2023     1
[table_3]         2023     3
See this db<>fiddle
I'll leave it to you to finish up the details to get your complete desired result.
I have a requirement to generate Ticket ID's which have the following format:
TicketType+YYMMDD+nnnn
TicketType is 4-characters
YYMMDD is 2-digit year/month/day
nnnn is a 4-digit incrementing number starting at 0001 for each TicketType+YYMMDD
I have something that's been working for a year, but today revealed a flaw.
DECLARE @TktType varchar(4) = CASE @TypeId WHEN 1 THEN 'TKTT' WHEN 2 THEN 'TKTD' WHEN 3 THEN 'TKTV' WHEN 4 THEN 'TKTB' END
DECLARE @DatePart varchar(6) = CAST(YEAR(GetDate()) - 2000 AS varchar(4)) +
RIGHT('0' + CAST(MONTH(GetDate()) AS varchar(2)), 2) +
RIGHT('0' + CAST(DAY(GetDate()) AS varchar(2)), 2)
DECLARE @nextNum varchar(4) = (SELECT CONVERT(INT, MAX(SUBSTRING(SO, 11, 4))) + 1 FROM T_SO WHERE SO LIKE @TktType + @DatePart + '%')
SET @nextNum = RIGHT('000' + COALESCE(@nextNum, '1'), 4)
INSERT INTO tblTickets (TktID, ...)
VALUES (@TktType + @DatePart + @nextNum, ...)
This has been working for a year without a hitch. Can you guess what happened? Today two people hit it at the same time. Both generated the same Ticket ID, and since the TktID column is the primary key, one of them got a nice "Violation of PRIMARY KEY constraint" message.
So I've thought about creating a new table for each ticket type, with an identity column and a bit column: insert a 0 and get back the inserted id. This would mean having to truncate the table and reset the identity seed every midnight, and I'm sure there are unforeseen issues with this.
I've also thought about looping and incrementing the number until the insert is successful. Bad.
And one of my peers suggested using a transaction to lock the table, which would make everyone else wait until I was done. I'm not sure about this.
Has anyone else had to do something similar? I'm looking for suggestions and advice on how to best resolve the issue.
EDIT: I think I have something that works. Feel free to leave your thoughts.
First, I created a table that has a row for each ticket type:
CREATE TABLE [dbo].[T_TicketID](
[id] [int] NOT NULL,
[TicketType] [varchar](4) NOT NULL PRIMARY KEY,
[Date] [date] NOT NULL
)
Then I created a procedure that accepts a ticket type and returns a full Ticket ID:
ALTER PROCEDURE usp_CreateTicketID
@TicketType varchar(4),
@TicketID varchar(14) OUTPUT
AS
SET NOCOUNT ON
DECLARE @Date DATE = GETDATE()
DECLARE @out TABLE (TicketID varchar(14))
UPDATE T_TicketID
SET
id = CASE WHEN [Date] = @Date THEN id + 1 ELSE 1 END,
[Date] = CASE WHEN [Date] = @Date THEN [Date] ELSE @Date END
OUTPUT @TicketType +
CONVERT(varchar, YEAR(@Date) - 2000) +
RIGHT('0' + CONVERT(varchar, MONTH(@Date)), 2) +
RIGHT('0' + CONVERT(varchar, DAY(@Date)), 2) +
RIGHT('000' + CONVERT(varchar, INSERTED.id), 4)
INTO @out
WHERE TicketType = @TicketType
SET @TicketID = (SELECT TicketID FROM @out)
Since the UPDATE is atomic, it serializes the updates and everyone gets a unique TicketID.
I've tested it by having 2 processes, each in a loop, hitting it 10,000 times each with no delays in the loop. I saved the generated TicketID's to a table and then verified there were no duplicates.
If you want to avoid an identity or GUID, you can use a custom sequence, which will handle the race condition you encountered. SQL Server will handle the hard part of dishing out the next number in the sequence, and by using the cycle argument it will wrap around and start back at minvalue when the last number (maxvalue) is used.
create sequence dbo.TicketNumber as smallint
start with 1
increment by 1
minvalue 1
maxvalue 9999
cycle;
-- This will give you the next value in line each time it's run
select next value for dbo.TicketNumber as TicketNumber
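As a sketch of how that sequence value might be composed into the full ticket ID (reusing the variable names from the question; note the sequence cycles every 9999 values rather than resetting each day, so it doesn't reproduce the per-day restart exactly):

```sql
DECLARE @TktType varchar(4) = 'TKTT';
DECLARE @DatePart varchar(6) = FORMAT(GETDATE(), 'yyMMdd');
DECLARE @nextNum int = NEXT VALUE FOR dbo.TicketNumber;

-- Zero-pad the sequence value to 4 digits and append it.
DECLARE @TicketID varchar(14) =
    @TktType + @DatePart + RIGHT('000' + CAST(@nextNum AS varchar(4)), 4);
```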
I would suggest that you don't store the TicketType, date, and incrementing number all in one field. It would be better if you had a column for each value and had your primary key be all 3 columns, e.g.:
create table dbo.tblTickets
(
TicketType char(4) not null,
TicketDate char(6) not null,
TicketNumber smallint not null,
constraint PK_TicketType_TicketDate_TicketNumber primary key
(
TicketType,
TicketDate,
TicketNumber
)
)
Unless you have a case where you will insert more than 9999 tickets in a day, what you have will work fine until 100 years after you started inputting values; then the 2-digit year will get you.
The transaction wrapping is the only change you need to implement to prevent the case you bumped into.
While others will disagree, making your primary key this way is fine: it fits your use case, you know its limitations, and it is more meaningful in your business case than a random GUID.
Using the built-in GUID or an integer identity would have stopped the error you had from occurring, but if your current key of type-date-number has any business value, you would lose that, or still have to maintain that constraint while also maintaining the GUID.
I have a table in SQL Server 2008 which has a DOB column (e.g. 1992-03-15) and, in the same table, an Age column which is currently NULL. I need to update Age according to DOB; I have both columns (Age and DOB) in the same table and need a script that does the job.
The other one, in the same table: I have ArrivalMonth (e.g. 8) and ArrivalYear (e.g. 2011), and according to those I need to update another column, TimeInCountry. For the example (08 (MM), 2011 (YYYY)), it should update TimeInCountry to something like 4.2 - that is, the time from the given month and year up to the current date, expressed as years and months.
Do let me know if you need anything else.
I'm not sure what data type your Age column is, but you can do something like this:
Update TableName
Set Age = DATEDIFF(yy, DOB, getdate())
If you are using decimal:
Age = DATEDIFF(hour, DOB, GETDATE())/8766.0
I believe creating a trigger will be useful if you are adding new rows in the future.
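A minimal sketch of such a trigger (the table name TableName and an Id key column are assumptions; adjust to your schema):

```sql
-- Recomputes Age whenever rows are inserted or updated.
CREATE TRIGGER trg_TableName_SetAge
ON TableName
AFTER INSERT, UPDATE
AS
BEGIN
    SET NOCOUNT ON;
    UPDATE t
    SET Age = DATEDIFF(hour, t.DOB, GETDATE()) / 8766.0
    FROM TableName t
    INNER JOIN inserted i ON i.Id = t.Id;   -- only touch affected rows
END
```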
For your first problem:
UPDATE TABLE_NAME SET AGE = DATEDIFF(hour, DOB_COLUMN, GETDATE())/8766.0
If you want it rounded:
UPDATE TABLE_NAME SET
AGE = CONVERT(int, ROUND(DATEDIFF(hour, DOB_COLUMN, GETDATE())/8766.0, 0))
I'm not sure what you really want to do in the second problem, but my guess is you can try something like:
Update Table_Name set
TimeInCountry = cast(Arrival_year as varchar(4)) + '.' + cast(Arrival_Month as varchar(2))
I have implemented a user-defined function. Here it is; it may help someone.
ALTER FUNCTION [dbo].[TimeInCountry]
(
    @ArrivalMonth varchar(10),
    @ArrivalYear varchar(10)
) RETURNS VARCHAR(10)
BEGIN
    Declare @AgeYear int
    Declare @AgeMonth int
    Declare @Final varchar(10)
    Declare @CurrentMonth int
    Declare @CurrentYear int
    Set @CurrentMonth = (Select DatePart(mm, GetDate()))
    Set @CurrentYear = (Select DatePart(yyyy, GetDate()))
    Select @AgeYear = @CurrentYear - @ArrivalYear
    Select @AgeMonth = @CurrentMonth - @ArrivalMonth
    if (@AgeMonth < 0)
    BEGIN
        Set @AgeYear = @AgeYear - 1
        Set @AgeMonth = @AgeMonth + 12
    END
    Select @Final = (Select Cast(@AgeYear as Varchar(max)) + '.' + Cast(@AgeMonth as varchar(max)))
    Return @Final
END
--- And finally, call this function where you need to update.
--- To check:
Select dbo.TimeInCountry(8, 2013)
--- And finally, the update:
Update [DBName].[dbo].[TableName] Set TimeInCountry = dbo.TimeInCountry(ArrivalMonth, ArrivalYear) from [DBName].[dbo].[TableName]
Thanks again everyone.
I was looking at different ways of writing a stored procedure to return a "page" of data. This was for use with the ASP.NET ObjectDataSource, but it could be considered a more general problem.
The requirement is to return a subset of the data based on the usual paging parameters, startPageIndex and maximumRows, but also a sortBy parameter to allow the data to be sorted. There are also some parameters passed in to filter the data on various conditions.
One common way to do this seems to be something like this:
[Method 1]
;WITH stuff AS (
SELECT
CASE
WHEN @SortBy = 'Name' THEN ROW_NUMBER() OVER (ORDER BY Name)
WHEN @SortBy = 'Name DESC' THEN ROW_NUMBER() OVER (ORDER BY Name DESC)
WHEN @SortBy = ...
ELSE ROW_NUMBER() OVER (ORDER BY whatever)
END AS Row,
.,
.,
.,
FROM Table1
INNER JOIN Table2 ...
LEFT JOIN Table3 ...
WHERE ... (lots of things to check)
)
SELECT *
FROM stuff
WHERE (Row > @startRowIndex)
AND (Row <= @startRowIndex + @maximumRows OR @maximumRows <= 0)
ORDER BY Row
One problem with this is that it doesn't give the total count and generally we need another stored procedure for that. This second stored procedure has to replicate the parameter list and the complex WHERE clause. Not nice.
One solution is to append an extra column to the final select list, (SELECT COUNT(*) FROM stuff) AS TotalRows. This gives us the total but repeats it for every row in the result set, which is not ideal.
[Method 2]
An interesting alternative is given here (https://web.archive.org/web/20211020111700/https://www.4guysfromrolla.com/articles/032206-1.aspx) using dynamic SQL. He reckons that the performance is better because the CASE statement in the first solution drags things down. Fair enough, and this solution makes it easy to get totalRows and slap it into an output parameter. But I hate coding dynamic SQL - all that 'bit of SQL ' + STR(@parm1) + ' bit more SQL' gubbins.
[Method 3]
The only way I can find to get what I want, without repeating code which would have to be synchronized, and keeping things reasonably readable is to go back to the "old way" of using a table variable:
DECLARE @stuff TABLE (Row INT, ...)
INSERT INTO @stuff
SELECT
CASE
WHEN @SortBy = 'Name' THEN ROW_NUMBER() OVER (ORDER BY Name)
WHEN @SortBy = 'Name DESC' THEN ROW_NUMBER() OVER (ORDER BY Name DESC)
WHEN @SortBy = ...
ELSE ROW_NUMBER() OVER (ORDER BY whatever)
END AS Row,
.,
.,
.,
FROM Table1
INNER JOIN Table2 ...
LEFT JOIN Table3 ...
WHERE ... (lots of things to check)
SELECT *
FROM @stuff
WHERE (Row > @startRowIndex)
AND (Row <= @startRowIndex + @maximumRows OR @maximumRows <= 0)
ORDER BY Row
(Or a similar method using an IDENTITY column on the table variable).
Here I can just add a SELECT COUNT on the table variable to get the totalRows and put it into an output parameter.
I did some tests and, with a fairly simple version of the query (no sortBy and no filter), method 1 seems to come out on top (almost twice as quick as the other 2). Then I tested with roughly the complexity I actually need, and with the SQL in stored procedures. With this, method 1 takes nearly twice as long as the other 2 methods, which seems strange.
Is there any good reason why I shouldn't spurn CTEs and stick with method 3?
UPDATE - 15 March 2012
I tried adapting Method 1 to dump the page from the CTE into a temporary table so that I could extract the TotalRows and then select just the relevant columns for the resultset. This seemed to add significantly to the time (more than I expected). I should add that I'm running this on a laptop with SQL Server Express 2008 (all that I have available) but still the comparison should be valid.
I looked again at the dynamic SQL method. It turns out I wasn't really doing it properly (just concatenating strings together). I set it up as in the documentation for sp_executesql (with a parameter-definition string and a parameter list) and it's much more readable. This method also runs fastest in my environment. Why that should be still baffles me, but I guess the answer is hinted at in Hogan's comment.
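For reference, the parameterised pattern looks roughly like this (a sketch with hypothetical table/column names, not my actual query):

```sql
DECLARE @sql nvarchar(max), @params nvarchar(200);
DECLARE @startRowIndex int = 0, @maximumRows int = 20;

SET @sql = N'
;WITH stuff AS (
    SELECT ROW_NUMBER() OVER (ORDER BY Name) AS Row, Name
    FROM Table1
)
SELECT * FROM stuff
WHERE Row > @startRowIndex
  AND Row <= @startRowIndex + @maximumRows
ORDER BY Row';

-- Parameter-definition string first, then the actual arguments:
SET @params = N'@startRowIndex int, @maximumRows int';
EXEC sp_executesql @sql, @params, @startRowIndex, @maximumRows;
```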
I would most likely split the @SortBy argument into two, @SortColumn and @SortDirection, and use them like this:
…
ROW_NUMBER() OVER (
ORDER BY CASE @SortColumn
WHEN 'Name' THEN Name
WHEN 'OtherName' THEN OtherName
…
END *
CASE @SortDirection
WHEN 'DESC' THEN -1
ELSE 1
END
) AS Row
…
And this is how the TotalRows column could be defined (in the main select):
…
COUNT(*) OVER () AS TotalRows
…
I would definitely want to do a combination of a temp table and NTILE for this sort of approach.
The temp table will allow you to do your complicated series of conditions just once. Because you're only storing the pieces you care about, it also means that when you start doing selects against it further in the procedure, it should have a smaller overall memory usage than if you ran the condition multiple times.
I like NTILE() for this better than ROW_NUMBER() because it's doing the work you're trying to accomplish for you, rather than having additional where conditions to worry about.
The example below is one based off a similar query I'm using as part of a research query; I have an ID I can use that I know will be unique in the results. Using an ID that was an identity column would also be appropriate here, though.
--DECLAREs here would be stored procedure parameters
declare @pagesize int, @sortby varchar(25), @page int = 1;
--Create temp with all relevant columns; ID here could be an identity PK to help with paging query below
create table #temp (id int not null primary key clustered, status varchar(50), lastname varchar(100), startdate datetime);
--Insert into #temp based off of your complex conditions, but with no attempt at paging
insert into #temp
(id, status, lastname, startdate)
select id, status, lastname, startdate
from Table1 ...etc.
where ...complicated conditions
SET @pagesize = 50;
SET @page = 5;--OR CAST(@startRowIndex/@pagesize as int)+1
SET @sortby = 'name';
--Only use the id and count to use NTILE
;with paging(id, pagenum, totalrows) as
(
select id,
NTILE((SELECT COUNT(*) cnt FROM #temp)/@pagesize) OVER(ORDER BY CASE WHEN @sortby = 'NAME' THEN lastname ELSE convert(varchar(10), startdate, 112) END),
cnt
FROM #temp
cross apply (SELECT COUNT(*) cnt FROM #temp) total
)
--Use the id to join back to main select
SELECT *
FROM paging
JOIN #temp ON paging.id = #temp.id
WHERE paging.pagenum = @page
--Don't need the drop in the procedure, included here for rerunnability
drop table #temp;
I generally prefer temp tables over table variables in this scenario, largely so that there are definite statistics on the result set you have. (Search for temp table vs table variable and you'll find plenty of examples as to why)
Dynamic SQL would be most useful for handling the sorting method. Using my example, you could do the main query in dynamic SQL and only pull the sort method you want to pull into the OVER().
The example above also includes the total in each row of the return set, which as you mentioned is not ideal. You could, instead, have a @totalrows output variable in your procedure and return it as well as the result set. That would save you the CROSS APPLY that I'm doing above in the paging CTE.
I would create one procedure to stage, sort, and paginate (using NTILE()) into a staging table, and a second procedure to retrieve by page. This way you don't have to run the entire main query for each page.
This example queries AdventureWorks.HumanResources.Employee:
--------------------------------------------------------------------------
create procedure dbo.EmployeesByMaritalStatus
@MaritalStatus nchar(1)
, @sort varchar(20)
as
-- Init staging table
if exists(
select 1 from sys.objects o
inner join sys.schemas s on s.schema_id=o.schema_id
and s.name='Staging'
and o.name='EmployeesByMaritalStatus'
where type='U'
)
drop table Staging.EmployeesByMaritalStatus;
-- Populate staging table with sort value
with s as (
select *
, sr=ROW_NUMBER()over(order by case @sort
when 'NationalIDNumber' then NationalIDNumber
when 'ManagerID' then ManagerID
-- plus any other sort conditions
else EmployeeID end)
from AdventureWorks.HumanResources.Employee
where MaritalStatus=@MaritalStatus
)
select *
into #temp
from s;
-- And now pages
declare @RowCount int; select @RowCount=COUNT(*) from #temp;
declare @PageCount int=ceiling(@RowCount/20.0); --assuming 20 lines/page
select *
, Page=NTILE(@PageCount)over(order by sr)
into Staging.EmployeesByMaritalStatus
from #temp;
go
--------------------------------------------------------------------------
-- procedure to retrieve selected pages
create procedure EmployeesByMaritalStatus_GetPage
@page int
as
declare @MaxPage int;
select @MaxPage=MAX(Page) from Staging.EmployeesByMaritalStatus;
set @page=case when @page not between 1 and @MaxPage then 1 else @page end;
select EmployeeID,NationalIDNumber,ContactID,LoginID,ManagerID
, Title,BirthDate,MaritalStatus,Gender,HireDate,SalariedFlag,VacationHours,SickLeaveHours
, CurrentFlag,rowguid,ModifiedDate
from Staging.EmployeesByMaritalStatus
where Page=@page
GO
--------------------------------------------------------------------------
-- Usage
-- Load staging
exec dbo.EmployeesByMaritalStatus 'M','NationalIDNumber';
-- Get pages 1 through n
exec dbo.EmployeesByMaritalStatus_GetPage 1;
exec dbo.EmployeesByMaritalStatus_GetPage 2;
-- ...etc (this would actually be a foreach loop, but that detail is omitted for brevity)
GO
I use this method, using EXEC():
-- SP parameters:
-- @query: Your query as an input parameter
-- @maximumRows: The number of rows per page
-- @startPageIndex: The number of the page to filter
-- @sortBy: A field name, or field names, with an optional DESC keyword
DECLARE @query nvarchar(max) = 'SELECT * FROM sys.Objects',
@maximumRows int = 8,
@startPageIndex int = 3,
@sortBy as nvarchar(100) = 'name Desc'
SET @query = ';WITH CTE AS (' + @query + ')' +
'SELECT *, (dt.pagingRowNo - 1) / ' + CAST(@maximumRows as nvarchar(10)) + ' + 1 As pagingPageNo' +
', pagingCountRow / ' + CAST(@maximumRows as nvarchar(10)) + ' As pagingCountPage ' +
', (dt.pagingRowNo - 1) % ' + CAST(@maximumRows as nvarchar(10)) + ' + 1 As pagingRowInPage ' +
'FROM ( SELECT *, ROW_NUMBER() OVER (ORDER BY ' + @sortBy + ') As pagingRowNo, COUNT(*) OVER () AS pagingCountRow ' +
'FROM CTE) dt ' +
'WHERE (dt.pagingRowNo - 1) / ' + CAST(@maximumRows as nvarchar(10)) + ' + 1 = ' + CAST(@startPageIndex as nvarchar(10))
EXEC(@query)
Note: in the result set, after your query's own columns, I add some extra columns that you can remove:
pagingRowNo : The row number
pagingCountRow : The total number of rows
pagingPageNo : The current page number
pagingCountPage : The total number of pages
pagingRowInPage : The row number within the page, starting at 1
I am looking for a way to increment a uniqueidentifier by 1 in TSQL. For example, if the id is A6BC60AD-A4D9-46F4-A7D3-98B2A7237A9E, I'd like to be able to select A6BC60AD-A4D9-46F4-A7D3-98B2A7237A9F.
@rein It's for a data import. We have an intermediate table with IDs that we're generating records from, and we join on those IDs later in the import. Unfortunately, some of those records now generate a couple of records in the next table, so we need a new ID that is reproducible.
The way you want to increment the Guid is not correct for SQL Server, as a Guid is a structure whose byte groups have different byte orders; please have a look at:
http://sqlblog.com/blogs/alberto_ferrari/archive/2007/08/31/how-are-guids-sorted-by-sql-server.aspx
and notice the following:
Now, when I run modified Alberto's query, I'm getting the following sequence:
3, 2, 1, 0, 5, 4, 7, 6, 9, 8, 15, 14, 13, 12, 11, 10
That means, that GUID's byte #3 is the least significant and GUID's byte #10 is the most significant [from SQL Server ORDER BY clause perspective].
Here is a simple function to increment a uniqueidentifier, accounting for this:
create function [dbo].[IncrementGuid](@guid uniqueidentifier)
returns uniqueidentifier
as
begin
    declare @guid_binary binary(16), @b03 binary(4), @b45 binary(2), @b67 binary(2), @b89 binary(2), @bAF binary(6)
    select @guid_binary = @guid
    select @b03 = convert(binary(4), reverse(substring(@guid_binary, 1, 4)))
    select @b45 = convert(binary(2), reverse(substring(@guid_binary, 5, 2)))
    select @b67 = convert(binary(2), reverse(substring(@guid_binary, 7, 2)))
    select @b89 = convert(binary(2), substring(@guid_binary, 9, 2))
    select @bAF = convert(binary(6), substring(@guid_binary, 11, 6))
    if (@b03 < 0xFFFFFFFF)
    begin
        select @b03 = convert(binary(4), cast(@b03 as int) + 1)
    end
    else if (@b45 < 0xFFFF)
    begin
        select @b45 = convert(binary(2), cast(@b45 as int) + 1)
    end
    else if (@b89 < 0xFFFF)
    begin
        select @b89 = convert(binary(2), cast(@b89 as int) + 1)
    end
    else
    begin
        select @bAF = convert(binary(6), cast(@bAF as bigint) + 1)
    end
    return convert(binary(16), reverse(convert(char(4), @b03)) + reverse(convert(char(2), @b45)) + reverse(convert(char(2), @b67)) + convert(char(2), @b89) + convert(char(6), @bAF))
end
Note that bytes 6 and 7 are not incremented as they contain the Guid version bits.
But as others have pointed out, you really should not be doing this. In your case it might be better to create a temp table for these GUIDs (with two columns: an integer index and the generated GUID).
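A sketch of that suggestion (hypothetical names): pre-generate one GUID per integer index, then join on the index later in the import instead of deriving one GUID from another:

```sql
-- One pre-generated GUID per index; join on idx in later steps.
CREATE TABLE #ImportGuids (
    idx  int NOT NULL PRIMARY KEY,
    guid uniqueidentifier NOT NULL DEFAULT NEWID()
);

INSERT INTO #ImportGuids (idx)
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM sys.all_objects;   -- any row source with enough rows for your import
```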
Here is one way I've come up with, but I'm hoping there is a better way.
LEFT([ID], 19) + RIGHT(CONVERT(uniqueidentifier, CONVERT(binary(16), CONVERT(binary(16), [ID]) + CONVERT(bigint, 1))), 17) AS 'MyNewID'
You can use this approach, though I'm not accounting for the case of overflowing the lower 8 bytes.
declare @guid uniqueidentifier, @binaryUpper8 binary(8), @binaryLower8 binary(8), @binary16 binary(16), @bigint bigint
set @guid = 'A6BC60AD-A4D9-46F4-A7D3-98B2A7237A9E'
set @binary16 = cast(@guid as binary(16))
--harvest upper and lower 8 bytes
select @binaryUpper8 = substring(@binary16, 1, 8)
, @binaryLower8 = substring(@binary16, 9, 8)
set @bigint = cast(@binaryLower8 as bigint)
--increment
set @bigint = @bigint + 1
--convert back
set @binaryLower8 = cast(@bigint as binary(8))
set @binary16 = @binaryUpper8 + @binaryLower8
set @guid = cast(@binary16 as uniqueidentifier)
select @guid