I have a bunch of production orders and I'm trying to group them by a datetime range, then count the quantity within that range. For example, I want to group from 22:30 to 22:30 each day.
PT.ActualFinish is a datetime (e.g. if PT.ActualFinish is 2020-05-25 23:52:30, it should be counted on 26 May instead of the 25th).
Currently it's grouped by date (midnight to midnight) as opposed to the desired 22:30 to 22:30.
GROUP BY CAST(PT.ActualFinish AS DATE)
I've been trying to reconcile some DATEADD with the GROUP without success. Is it possible?
Just add 1.5 hours (90 minutes) and then extract the date:
group by convert(date, dateadd(minute, 90, PT.ActualFinish))
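To see why the 90-minute shift works, here is a minimal sketch. The table variable @orders and its Quantity column are made up for illustration; only ActualFinish comes from the question. Adding 90 minutes moves 22:30 onto midnight, so anything finished at or after 22:30 lands on the next calendar date.

```sql
-- Hypothetical sample data; only the ActualFinish column name comes from the question
DECLARE @orders TABLE (ActualFinish datetime, Quantity int);
INSERT @orders VALUES
    ('2020-05-25 22:29:59', 5),   -- just before the cutoff: stays on 25 May
    ('2020-05-25 23:52:30', 3),   -- after the cutoff: counted on 26 May
    ('2020-05-26 10:00:00', 2);   -- ordinary daytime row on 26 May

SELECT ProductionDay = CONVERT(date, DATEADD(MINUTE, 90, ActualFinish)),
       TotalQuantity = SUM(Quantity)
FROM @orders
GROUP BY CONVERT(date, DATEADD(MINUTE, 90, ActualFinish))
ORDER BY ProductionDay;
-- 2020-05-25 -> 5
-- 2020-05-26 -> 5
```

For a different cutoff, the offset is simply the number of minutes between the cutoff time and midnight.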
For this kind of thing you can use a function I created called NGroupRangeAB (code below) which can be used to create groups over values with an upper and lower bound.
Note that this:
SELECT f.*
FROM core.NGroupRangeAB(0,1440,12) AS f
ORDER BY f.RN;
Returns:
RN GroupNumber Low High
--- ------------ ------ -------
0 1 0 120
1 2 121 240
2 3 241 360
3 4 361 480
4 5 481 600
5 6 601 720
6 7 721 840
7 8 841 960
8 9 961 1080
9 10 1081 1200
10 11 1201 1320
11 12 1321 1440
This:
SELECT
f.GroupNumber,
L = DATEADD(MINUTE,f.[Low]-SIGN(f.[Low]),CAST('00:00:00.0000000' AS TIME)),
H = DATEADD(MINUTE,f.[High]-1,CAST('00:00:00.0000000' AS TIME))
FROM core.NGroupRangeAB(0,1440,12) AS f
ORDER BY f.RN;
Returns:
GroupNumber L H
------------- ---------------- ----------------
1 00:00:00.0000000 01:59:00.0000000
2 02:00:00.0000000 03:59:00.0000000
3 04:00:00.0000000 05:59:00.0000000
4 06:00:00.0000000 07:59:00.0000000
5 08:00:00.0000000 09:59:00.0000000
6 10:00:00.0000000 11:59:00.0000000
7 12:00:00.0000000 13:59:00.0000000
8 14:00:00.0000000 15:59:00.0000000
9 16:00:00.0000000 17:59:00.0000000
10 18:00:00.0000000 19:59:00.0000000
11 20:00:00.0000000 21:59:00.0000000
12 22:00:00.0000000 23:59:00.0000000
Now for a real-life example that may help you:
-- Sample Data
DECLARE @table TABLE (tm TIME);
INSERT @table VALUES ('00:15'),('11:20'),('21:44'),('09:50'),('02:15'),('02:25'),
('02:31'),('23:31'),('23:54');
-- Solution:
SELECT
GroupNbr = f.GroupNumber,
TimeLow = f2.L,
TimeHigh = f2.H,
Total = COUNT(t.tm)
FROM core.NGroupRangeAB(0,1440,12) AS f
CROSS APPLY (VALUES(
DATEADD(MINUTE,f.[Low]-SIGN(f.[Low]),CAST('00:00:00.0000000' AS TIME)),
DATEADD(MINUTE,f.[High]-1,CAST('00:00:00.0000000' AS TIME)))) AS f2(L,H)
LEFT JOIN @table AS t
ON t.tm BETWEEN f2.L AND f2.H
GROUP BY f.GroupNumber, f2.L, f2.H;
Returns:
GroupNbr TimeLow TimeHigh Total
-------------------- ---------------- ---------------- -----------
1 00:00:00.0000000 01:59:00.0000000 1
2 02:00:00.0000000 03:59:00.0000000 3
3 04:00:00.0000000 05:59:00.0000000 0
4 06:00:00.0000000 07:59:00.0000000 0
5 08:00:00.0000000 09:59:00.0000000 1
6 10:00:00.0000000 11:59:00.0000000 1
7 12:00:00.0000000 13:59:00.0000000 0
8 14:00:00.0000000 15:59:00.0000000 0
9 16:00:00.0000000 17:59:00.0000000 0
10 18:00:00.0000000 19:59:00.0000000 0
11 20:00:00.0000000 21:59:00.0000000 1
12 22:00:00.0000000 23:59:00.0000000 2
Note that an inner join will eliminate the 0-count rows.
CREATE FUNCTION core.NGroupRangeAB
(
@min BIGINT, -- Group Number Lower boundary
@max BIGINT, -- Group Number Upper boundary
@groups BIGINT -- Number of groups required
)
/*****************************************************************************************
[Purpose]:
Creates an auxiliary table that allows for grouping based on a given set of rows (@rows)
and requested number of "row groups" (@groups). core.NGroupRangeAB can be thought of as a
set-based, T-SQL version of Oracle's WIDTH_BUCKET, which:
"...lets you construct equiwidth histograms, in which the histogram range is divided into
intervals that have identical size. (Compare with NTILE, which creates equiheight
histograms.)" https://docs.oracle.com/cd/B19306_01/server.102/b14200/functions214.htm
See usage examples for more details.
[Author]:
Alan Burstein
[Compatibility]:
SQL Server 2008+
[Syntax]:
--===== Autonomous
SELECT ng.*
FROM dbo.NGroupRangeAB(@rows,@groups) AS ng;
[Parameters]:
@rows = BIGINT; the number of rows to be "tiled" (have group number assigned to it)
@groups = BIGINT; requested number of tile groups (same as the parameter passed to NTILE)
[Returns]:
Inline Table Valued Function returns:
GroupNumber = BIGINT; a row number beginning with 1 and ending with @rows
Members = BIGINT; Number of possible distinct members in the group
Low = BIGINT; the lower-bound range
High = BIGINT; the Upper-bound range
[Dependencies]:
core.rangeAB (iTVF)
[Developer Notes]:
1. An inline derived tally table using a CTE or subquery WILL NOT WORK. NTally requires
a correctly indexed tally table named dbo.tally; if you have or choose to use a
permanent tally table with a different name or in a different schema make sure to
change the DDL for this function accordingly. The recommended number of rows is
1,000,000; below is the recommended DDL for dbo.tally. Note the "Beginning" and "End"
of tally code. To learn more about tally tables see:
http://www.sqlservercentral.com/articles/T-SQL/62867/
2. For best results a P.O.C. index should exist on the table that you are "tiling". For
more information about P.O.C. indexes see:
http://sqlmag.com/sql-server-2012/sql-server-2012-how-write-t-sql-window-functions-part-3
3. NGroupRangeAB is deterministic; for more about deterministic and nondeterministic functions
see https://msdn.microsoft.com/en-us/library/ms178091.aspx
[Examples]:
-----------------------------------------------------------------------------------------
--===== 1. Basic illustration of the relationship between core.NGroupRangeAB and NTILE.
-- Consider this query which assigns 3 "tile groups" to 7 rows:
DECLARE @rows BIGINT = 7, @tiles BIGINT = 3;
SELECT t.N, t.TileGroup
FROM ( SELECT r.RN, NTILE(@tiles) OVER (ORDER BY r.RN)
FROM core.rangeAB(1,@rows,1,1) AS r) AS t(N,TileGroup);
Results:
N TileGroup
--- ----------
1 1
2 1
3 1
4 2
5 2
6 3
7 3
To pivot these "equiheight histograms" into "equiwidth histograms" we could do this:
DECLARE @rows BIGINT = 7, @tiles BIGINT = 3;
SELECT TileGroup = t.TileGroup,
[Low] = MIN(t.N),
[High] = MAX(t.N),
Members = COUNT(*)
FROM ( SELECT r.RN, NTILE(@tiles) OVER (ORDER BY r.RN)
FROM core.rangeAB(1,@rows,1,1) AS r) AS t(N,TileGroup)
GROUP BY t.TileGroup;
Results:
TileGroup Low High Members
---------- ---- ----- -----------
1 1 3 3
2 4 5 2
3 6 7 2
This will return the same thing at a tiny fraction of the cost:
SELECT TileGroup = ng.GroupNumber,
[Low] = ng.[Low],
[High] = ng.[High],
Members = ng.Members
FROM core.NGroupRangeAB(1,@rows,@tiles) AS ng;
--===== 2.1. Divide 25 Rows into 4 groups
DECLARE @min BIGINT = 1, @max BIGINT = 25, @groups BIGINT = 4;
SELECT ng.GroupNumber, ng.Members, ng.low, ng.high
FROM core.NGroupRangeAB(@min,@max,@groups) AS ng;
--===== 2.2. Assign group membership to another table
DECLARE @min BIGINT = 1, @max BIGINT = 25, @groups BIGINT = 4;
SELECT
ng.GroupNumber, ng.low, ng.high, s.WidgetId, s.Price
FROM (VALUES('a',$12),('b',$22),('c',$9),('d',$2)) AS s(WidgetId,Price)
JOIN core.NGroupRangeAB(@min,@max,@groups) AS ng
ON s.Price BETWEEN ng.[Low] AND ng.[High]
ORDER BY ng.RN;
Results:
GroupNumber low high WidgetId Price
------------ ---- ----- --------- ---------------------
1 1 7 d 2.00
2 8 13 a 12.00
2 8 13 c 9.00
4 20 25 b 22.00
-----------------------------------------------------------------------------------------
[Revision History]:
Rev 00 - 20190128 - Initial Creation; Final Tuning - Alan Burstein
****************************************************************************************/
RETURNS TABLE WITH SCHEMABINDING AS RETURN
SELECT
RN = r.RN, -- Sort Key
GroupNumber = r.N2, -- Bucket (group) number
Members = g.S-ur.N+1, -- Count of members in this group
[Low] = r.RN*g.S+rc.N+ur.N, -- Lower boundary for the group (inclusive)
[High] = r.N2*g.S+rc.N -- Upper boundary for the group (inclusive)
FROM core.rangeAB(0,@groups-1,1,0) AS r -- Range Function
CROSS APPLY (VALUES((@max-@min)/@groups,(@max-@min)%@groups)) AS g(S,U) -- Size, Underflow
CROSS APPLY (VALUES(SIGN(SIGN(r.RN-g.U)-1)+1)) AS ur(N) -- get Underflow
CROSS APPLY (VALUES(@min+r.RN-(ur.N*(r.RN-g.U)))) AS rc(N); -- Running Count
GO
Here is what I am trying to produce:
Row_Num Person Value Row_Number
1 Leo Math 1
1 Leo Science 2
1 Leo History 3
1 Leo Math,Science,History 4
2 Robert Gym 1
2 Robert Math 2
2 Robert History 3
2 Robert Gym,Math,History 4
3 David Art 1
3 David Science 2
3 David English 3
3 David History 4
3 David Computer Science 5
3 David Art,Science,English,History,Computer Science 6
This is the code I am using:
with part_1 as
(
select
1 as [Row_Num],
'Leo' as [Person],
'Math,Science,History' as [Subjects]
---
union
---
select
'2',
'Robert',
'Gym,Math,History'
---
union
---
select
'3',
'David',
'Art,Science,English,History,Computer Science'
---
)
----------------------------------------------------------------------
select
[Row_Num],
[Person],
[Subjects]
into
#part1
from
part_1;
go
--------------------------------------------------------------------------------
with p_2 as(
select
[Row_Num],
[Person],
--[Subjects],
[value]
from
#part1
cross apply
STRING_SPLIT([Subjects],',')
union all
select
[Row_Num],
[Person],
[Subjects]
from
#part1
)
select
[Row_Num]
,[Person]
,[Value]
,row_number()
over(Partition by Row_Num order by (select 1)) as [Row_Number]
from
p_2
order by
[Row_Num]
,[Row_Number]
Here is what I am producing:
Row_Num Person Value Row_Number
1 Leo Math 1
1 Leo Science 2
1 Leo History 3
1 Leo Math,Science,History 4
2 Robert Gym,Math,History 1
2 Robert Gym 2
2 Robert Math 3
2 Robert History 4
3 David Art 1
3 David Science 2
3 David English 3
3 David History 4
3 David Computer Science 5
3 David Art,Science,English,History,Computer Science 6
It looks good, until you look at Robert. All of the subjects are on the first row, instead of the bottom.
Any suggestions?
STRING_SPLIT is documented to not "care" about ordinal position:
The output rows might be in any order. The order is not guaranteed to match the order of the substrings in the input string. You can override the final sort order by using an ORDER BY clause on the SELECT statement (ORDER BY value).
If the ordinal position of the data is important, don't use STRING_SPLIT. Personally, I recommend using delimitedsplit8k_LEAD, which includes an ItemNumber column.
But ideally, the real solution is to stop storing delimited data in your database. Create two further tables: one with a list of the subjects, and another that creates a relationship between the student and subject.
Note that SQL Server 2022 brings a new parameter to STRING_SPLIT called enable_ordinal which, when 1 is passed to it, will cause STRING_SPLIT to return an additional column (called ordinal) with the ordinal position of the value within the string; so you could add that column to your ORDER BY to ensure the ordering is maintained.
Of course, this doesn't change the fact that you should not be storing delimited data to start with, and should still be aiming to fix your design.
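As a rough sketch of the SQL Server 2022 option, assuming that version or later (or Azure SQL Database): the table variable @part1 stands in for the question's #part1, and the SortKey column is a name I made up. The summary row gets the largest int value as its sort key so that it always sorts after the split values.

```sql
-- @part1 is a stand-in for the question's #part1 temp table
DECLARE @part1 TABLE (Row_Num int, Person varchar(20), Subjects varchar(100));
INSERT @part1 VALUES
    (1, 'Leo',    'Math,Science,History'),
    (2, 'Robert', 'Gym,Math,History');

WITH p_2 AS
(
    -- Third argument 1 = enable_ordinal: adds an "ordinal" column (SQL Server 2022+)
    SELECT p.Row_Num, p.Person, s.value AS [Value], s.ordinal AS SortKey
    FROM @part1 AS p
    CROSS APPLY STRING_SPLIT(p.Subjects, ',', 1) AS s
    UNION ALL
    -- Summary row: 2147483647 (max int) keeps it after every split value
    SELECT p.Row_Num, p.Person, p.Subjects, 2147483647
    FROM @part1 AS p
)
SELECT Row_Num, Person, [Value],
       ROW_NUMBER() OVER (PARTITION BY Row_Num ORDER BY SortKey) AS [Row_Number]
FROM p_2
ORDER BY Row_Num, [Row_Number];
```

Because the ordering now comes from an explicit key rather than from ORDER BY (SELECT 1), Robert's rows should come out as Gym, Math, History, then the full list.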
Here's an easy solution.
DECLARE @x VARCHAR(1000) = 'a,b,c,d,e,f,g';
DECLARE @t TABLE
(
[Index] INT PRIMARY KEY IDENTITY(1, 1)
, [Value] VARCHAR(50)
)
INSERT INTO @t ([Value])
SELECT [Value]
FROM string_split(@x, ',')
SELECT * FROM @t
Wrap it like this:
CREATE FUNCTION SPLIT_STRING2
(
@x VARCHAR(5000)
, @y VARCHAR(5000)
) RETURNS @t TABLE
(
[Index] INT PRIMARY KEY IDENTITY(1, 1)
, [Value] VARCHAR(50)
)
AS
BEGIN
INSERT INTO @t ([Value])
SELECT [Value]
FROM string_split(@x, @y)
RETURN
END
Here is a recursive CTE method to parse Subjects. There's one anchor query and two recursive queries. The first recursive query parses the subjects into tokens. The second recursive query adds the summary row (the full Subjects string). I also handled the special case of a single subject: there the summary row and the parsed row would be identical, so the summary row is filtered out in this example.
This only works because each iteration of a recursive CTE sees only the records from the prior iteration. If looking for set N+1 and referring back to the CTE, the CTE has records from set N, not sets 1 through N. (Reminds me of proof by induction: prove n = 1 is true, then prove that if n is true then n+1 is true. Then it's true for all n > 0.)
DECLARE @delimiter char(1) = ',';
WITH part_1 as (
-- Subjects has 0 or more "tokens" separated by a comma
SELECT *
FROM (values
(1, 'Leo', 'Math,Science,History'),
('2', 'Robert', 'Gym,Math,History'),
('3', 'David', 'Art,Science,English,History,Computer Science'),
('4', 'Lazy', 'Art')
) t ([Row_Num],[Person],[Subjects])
), part_2 as (
-- Anchor on the first token. Every token has a delimiter before and after, even if we have to pretend it exists
SELECT Row_Num, Person, Subjects,
LEN(Subjects) + 1 as [index_max], -- the last index possible is a "pretend" index just after the end of the string
1 as N, -- this is the first token
0 as [index_before], -- for the first token, pretend the delimiter exists just before the first character at index 0
CASE WHEN CHARINDEX(@delimiter, Subjects) > 0 THEN CHARINDEX(@delimiter, Subjects) -- delimiter after exists
ELSE LEN(Subjects) + 1 -- pretend the delimiter exists just after the end of the string at index len + 1
END as [index_after],
CAST(1 as bit) as [is_token] -- needed to stop the 2nd recursion
FROM part_1
-- Recursive part that checks records for token N to add records for token N + 1 if it exists
UNION ALL
SELECT Row_Num, Person, Subjects,
index_max,
N + 1,
index_after, -- the delimiter before is just the prior token's delimiter after.
CASE WHEN CHARINDEX(@delimiter, Subjects, index_after + 1) > 0 THEN CHARINDEX(@delimiter, Subjects, index_after + 1) -- delimiter after exists
ELSE index_max -- pretend the delimiter exists just after the end of the string at index len + 1
END,
CAST(1 as bit) as [is_token] -- needed to stop the 2nd recursion
FROM part_2 -- a recursive CTE has only the prior result for token N, not accumulated result of tokens 1 to N
WHERE index_after > 0 AND index_after < index_max
UNION ALL
-- Another recursive part that checks if the prior token is the last. If the last, add the record with full string that was just parsed.
SELECT Row_Num, Person, Subjects,
index_max,
N + 1, -- this is not a token
0, -- the entire original string is desired
index_max, -- the entire original string is desired
CAST(0 as bit) as [is_token] -- not a token - stops this recursion
FROM part_2 -- this has only the prior result for N, not accumulated result of 1 to N
WHERE index_after = index_max -- the prior token was the last
AND is_token = 1 -- it was a token - stops this recursion
AND N > 1 -- add this to remove the added record if it's identical - 1 token
)
SELECT Row_Num, Person, TRIM(SUBSTRING(Subjects, index_before + 1, index_after - index_before - 1)) as [token], N,
index_max, index_before, index_after, is_token
FROM part_2
ORDER BY Row_Num, N -- Row_Num, is_token DESC, N is not required
Row_Num Person token N index_max index_before index_after is_token
----------- ------ -------------------------------------------- ----------- ----------- ------------ ----------- --------
1 Leo Math 1 21 0 5 1
1 Leo Science 2 21 5 13 1
1 Leo History 3 21 13 21 1
1 Leo Math,Science,History 4 21 0 21 0
2 Robert Gym 1 17 0 4 1
2 Robert Math 2 17 4 9 1
2 Robert History 3 17 9 17 1
2 Robert Gym,Math,History 4 17 0 17 0
3 David Art 1 45 0 4 1
3 David Science 2 45 4 12 1
3 David English 3 45 12 20 1
3 David History 4 45 20 28 1
3 David Computer Science 5 45 28 45 1
3 David Art,Science,English,History,Computer Science 6 45 0 45 0
4 Lazy Art 1 4 0 4 1
I have a Dimension table containing machines.
Each machine has a date created value.
I would like a SELECT statement that generates, for each day after a certain start date, the number of available machines. A machine is available from its created date onwards.
As I have read-only access to the database, I am not able to create a physical calendar table.
I hope somebody can help me solve my issue.
I assume this is what you want. Based on this sample table:
USE tempdb;
GO
CREATE TABLE dbo.Machines
(
MachineID int,
CreatedDate date
);
INSERT dbo.Machines VALUES(1,'20200104'),(2,'20200202'),(3,'20200214');
Then say you wanted the number of active machines starting on January 1st:
DECLARE @StartDate date = '20200101';
;WITH x AS
(
SELECT n = 0 UNION ALL SELECT n + 1 FROM x
WHERE n < DATEDIFF(DAY, @StartDate, GETDATE())
),
days(d) AS
(
SELECT DATEADD(DAY, x.n, @StartDate) FROM x
)
SELECT days.d, MachineCount = COUNT(m.MachineID)
FROM days
LEFT OUTER JOIN dbo.Machines AS m
ON days.d >= m.CreatedDate
GROUP BY days.d
ORDER BY days.d
OPTION (MAXRECURSION 0);
Results:
d MachineCount
---------- ------------
2020-01-01 0
2020-01-02 0
2020-01-03 0
2020-01-04 1
2020-01-05 1
...
2020-01-31 1
2020-02-01 1
2020-02-02 2
2020-02-03 2
...
2020-02-12 2
2020-02-13 2
2020-02-14 3
2020-02-15 3
Clean up:
DROP TABLE dbo.Machines;
(Yes, some people hiss at recursive CTEs. You can replace it with any number of set generation techniques, some I talk about here, here, and here.)
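For what it's worth, here is one possible non-recursive variant as a sketch; it is not necessarily what the linked posts describe. It builds the day list from stacked VALUES rows and ROW_NUMBER(). The names E1 and E4 are my own, and @Machines is a table variable standing in for dbo.Machines so the batch runs on its own with read-only access.

```sql
DECLARE @StartDate date = '20200101';

-- Stand-in for dbo.Machines with the same sample rows
DECLARE @Machines TABLE (MachineID int, CreatedDate date);
INSERT @Machines VALUES (1,'20200104'),(2,'20200202'),(3,'20200214');

WITH E1(n) AS
(
    SELECT 1 FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) AS v(n)   -- 10 rows
),
E4(n) AS
(
    -- 10 x 10 x 10 x 10 = 10,000 rows, i.e. this sketch caps out at about 27 years of days
    SELECT 1 FROM E1 AS a CROSS JOIN E1 AS b CROSS JOIN E1 AS c CROSS JOIN E1 AS d
),
days(d) AS
(
    -- One row per day from @StartDate through today
    SELECT TOP (DATEDIFF(DAY, @StartDate, GETDATE()) + 1)
           DATEADD(DAY, CONVERT(int, ROW_NUMBER() OVER (ORDER BY (SELECT NULL))) - 1, @StartDate)
    FROM E4
)
SELECT days.d, MachineCount = COUNT(m.MachineID)
FROM days
LEFT OUTER JOIN @Machines AS m
    ON days.d >= m.CreatedDate
GROUP BY days.d
ORDER BY days.d;
```

No MAXRECURSION hint is needed here because nothing recurses; add another CROSS JOIN to E4 if the date range can exceed 10,000 days.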
I have a case where I need to write a CTE (at least this seems like the best approach). I have almost everything I need in place except for one last issue. I am using a CTE to generate many millions of records and then I will insert them into a table. The data itself is almost irrelevant except for three columns: two datetime columns and one character column.
The idea behind the CTE is this. I want one datetime field called Start and one int field called DataValue. I will have one variable holding the number of records I want to aim for and another variable holding the number of times I want to repeat each datetime value. I don't think I need to explain the software this data represents, but basically I need 16 rows where the Start value is the same, and then after the 16th row I want to add 15 minutes and repeat. Effectively there will be events in 15-minute intervals and I will need X rows per 15-minute interval to represent those events.
This is my code
Declare @tot as int;
Declare @inter as int;
Set @tot = 26
Set @inter = 3;
WITH mycte(DataValue,start) AS
(
SELECT 1 DataValue, cast('01/01/2011 00:00:00' as datetime) as start
UNION all
if DataValue % @inter = 0
SELECT
DataValue + 1,
cast(DateAdd(minute,15,start) as datetime)
else
select
DataValue + 1,
start
FROM mycte
WHERE DataValue + 1 <= @tot)
select
m.start,
m.start,
m.Datavalue%@inter
from mycte as m
option (maxrecursion 0);
I'll change the select statement into an insert statement once I get it working. The m.DataValue%@inter expression will give me the repeating integer when inserting, so the only thing I need is to figure out how to make start stay the same 16 times in a row and then increment.
It seems that I cannot have an IF statement in the CTE, and I am not sure how to accomplish this otherwise. What I was going to do was basically say that if DataValue%16 is 0, then increase the value of start.
In the end I should hopefully have something like this, where in this case I only repeat it 4 times:
+-----------+-------------------+
| DataValue | start |
+-----------+-------------------+
| 1 | 01/01/01 00:00:00 |
| 2 | 01/01/01 00:00:00 |
| 3 | 01/01/01 00:00:00 |
| 4 | 01/01/01 00:00:00 |
| 5 | 01/01/01 00:15:00 |
| 6 | 01/01/01 00:15:00 |
| 7 | 01/01/01 00:15:00 |
| 8 | 01/01/01 00:15:00 |
Is there another way to accomplish this without conditional statements?
You can use case when as below:
Declare @tot as int;
Declare @inter as int;
Set @tot = 26
Set @inter = 3;
WITH mycte(DataValue,start) AS
(
SELECT 1 DataValue, cast('01/01/2011 00:00:00' as datetime) as start
UNION all
SELECT DataValue+1 [Datavalue],
case when (DataValue % @inter) = 0 then cast(DateAdd(minute,15,start) as datetime) else [start] end [start]
FROM mycte
WHERE (DataValue + 1) <= @tot)
select
m.DataValue,
m.[start]
from mycte as m
option (maxrecursion 0);
This will give the below result
DataValue Start
========= =============
1 2011-01-01 00:00:00.000
2 2011-01-01 00:00:00.000
3 2011-01-01 00:00:00.000
4 2011-01-01 00:15:00.000
5 2011-01-01 00:15:00.000
6 2011-01-01 00:15:00.000
7 2011-01-01 00:30:00.000
8 2011-01-01 00:30:00.000
9 2011-01-01 00:30:00.000
10 2011-01-01 00:45:00.000
11 2011-01-01 00:45:00.000
12 2011-01-01 00:45:00.000
....
26 2011-01-01 02:00:00.000
And if you don't want to use CASE WHEN, you can use a double recursive CTE as below:
WITH mycte(DataValue,start) AS
( --this recursive cte will generate the same record @inter times
SELECT 1 DataValue, cast('01/01/2011 00:00:00' as datetime) as start
UNION all
SELECT DataValue+1 [DataValue],[start]
FROM mycte
WHERE (DataValue + 1) <= @inter)
,Increments as (
-- this recursive cte adds 15 minutes for each further block of @inter rows
select * from mycte
union all
select DataValue+@inter [DataValue]
,DateAdd(minute,15,[start]) [start]
from Increments
WHERE (DataValue + @inter) <= @tot -- stop once the next row in this chain would exceed @tot
)
select
m.DataValue,
m.[start]
from Increments as m
order by DataValue
option (maxrecursion 0);
It will give the same results.
You can do this with a tally table and some basic math. I'm not sure if your total rows are @tot or should be @tot * @inter. If the latter, you just need to change the TOP clause. If you need more rows, you just need to alter the tally table generation.
Declare @tot as int;
Declare @inter as int;
Set @tot = 26
Set @inter = 3;
WITH
E(n) AS(
SELECT n FROM (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0))E(n)
),
E2(n) AS(
SELECT a.n FROM E a, E b
),
E4(n) AS(
SELECT a.n FROM E2 a, E2 b
),
cteTally(n) AS(
SELECT TOP( @tot) ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) n
FROM E4
)
SELECT n, DATEADD( MI, 15* ((n-1)/@inter), '20110101')
FROM cteTally;
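The "basic math" is just integer division: (n-1)/@inter is 0 for the first @inter rows, 1 for the next @inter rows, and so on. A small sketch with @inter = 4 (the repeat count from the question's desired output) and a hard-coded list of n values instead of a tally; the [block] and [start] column names are only for illustration:

```sql
DECLARE @inter int = 4;

SELECT n,
       [block] = (n - 1) / @inter,   -- integer division: 0,0,0,0,1,1,1,1
       [start] = DATEADD(MINUTE, 15 * ((n - 1) / @inter), CAST('20110101' AS datetime))
FROM (VALUES (1),(2),(3),(4),(5),(6),(7),(8)) AS t(n)
ORDER BY n;
-- n = 1..4 -> block 0 -> 2011-01-01 00:00:00.000
-- n = 5..8 -> block 1 -> 2011-01-01 00:15:00.000
```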
I'm trying to write an incremental update statement using SQL Server 2012.
Current Data:
RecNo Budget_ID Item_Code Revision
---------------------------------------
1 16 xxx 2
2 16 xxx NULL
3 16 xxx NULL
12 19 yyy 3
13 19 yyy NULL
14 19 yyy NULL
15 19 yyy NULL
Expected result:
RecNo Budget_ID Item_Code Revision
---------------------------------------
1 16 xxx 2
2 16 xxx 1
3 16 xxx 0
12 19 yyy 3
13 19 yyy 2
14 19 yyy 1
15 19 yyy 0
However with following approach, I ended up with the result set as below.
UPDATE a
SET a.Revision = (SELECT MIN(b.Revision)
FROM [dbo].[foo] b
WHERE b.item_code = a.item_code
AND b.budget_id = a.budget_id
GROUP BY b.item_code ) -1
FROM [dbo].[foo] a
WHERE a.Revision is NULL
Result:
RecNo Budget_ID Item_Code Revision
---------------------------------------
1 16 xxx 2
2 16 xxx 1
3 16 xxx 1
12 19 yyy 3
13 19 yyy 2
14 19 yyy 2
15 19 yyy 2
Can anyone help me to get this right?
Thanks in advance!
Try this:
;with cte as
(select *, row_number() over (partition by budget_id order by RecNo desc) rn from dbo.foo)
update cte
set revision = rn - 1
Basically, since the revision value seems to decrease as RecNo increases, we simply use the row_number() function to number the records within each budget_id, sorted in descending order of RecNo. Since the least possible value of row_number() is 1, we subtract 1 so that the last record in the partition has revision set to 0 instead of 1.
You may test the code here
I found this example from this link https://stackoverflow.com/a/13629639/1692632
First you select MIN value to some variable and then you can update table by decreasing variable at same time.
DECLARE @table TABLE (ID INT, SomeData VARCHAR(10))
INSERT INTO @table (SomeData, ID) SELECT 'abc', 6 ;
INSERT INTO @table (SomeData) SELECT 'def' ;
INSERT INTO @table (SomeData) SELECT 'ghi' ;
INSERT INTO @table (SomeData) SELECT 'jkl' ;
INSERT INTO @table (SomeData) SELECT 'mno' ;
INSERT INTO @table (SomeData) SELECT 'prs' ;
DECLARE @i INT = (SELECT ISNULL(MIN(ID),0) FROM @table)
UPDATE @table
SET ID = @i, @i = @i - 1
WHERE ID IS NULL
SELECT *
FROM @table
I'm not sure if this will do the trick but you can try with
Update top(1) a
SET a.Revision = (Select MIN(b.Revision)
FROM [dbo].[foo] b where b.item_code = a.item_code and b.budget_id = a.budget_id
group by b.item_code ) -1
FROM [dbo].[foo] a
WHERE a.Revision is NULL
and repeat until there's no changes left
Update Data
set Revision = x.Revision
from
(select RecNo, Budget_ID, Item_Code, case when Revision is null then ROW_NUMBER() over(partition by Budget_ID order by RecNo desc) - 1 else Revision end Revision
from Data
) x
where x.RecNo = data.RecNo
You basically use ROW_NUMBER() to count backwards for each Budget_ID, and use that row number minus 1 where Revision is null. This is basically the same as Shree's answer, just without the CTE.
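Here is a self-contained sketch of the ROW_NUMBER() idea against the question's sample data. The table variable @foo stands in for dbo.foo, and partitioning by both Budget_ID and Item_Code is my assumption (the sample data has one item per budget, so it makes no difference there). Like the answers above, it assumes the existing revision equals the number of NULL rows that follow it.

```sql
-- @foo is a stand-in for dbo.foo, loaded with the question's "Current Data"
DECLARE @foo TABLE (RecNo int PRIMARY KEY, Budget_ID int, Item_Code varchar(10), Revision int NULL);
INSERT @foo VALUES
    (1, 16, 'xxx', 2), (2, 16, 'xxx', NULL), (3, 16, 'xxx', NULL),
    (12, 19, 'yyy', 3), (13, 19, 'yyy', NULL), (14, 19, 'yyy', NULL), (15, 19, 'yyy', NULL);

WITH numbered AS
(
    -- Count backwards from the highest RecNo in each group: last row gets 0
    SELECT Revision,
           NewRevision = ROW_NUMBER() OVER (PARTITION BY Budget_ID, Item_Code
                                            ORDER BY RecNo DESC) - 1
    FROM @foo
)
UPDATE numbered
SET Revision = NewRevision
WHERE Revision IS NULL;      -- leave already-populated revisions untouched

SELECT RecNo, Budget_ID, Item_Code, Revision
FROM @foo
ORDER BY RecNo;
-- Revision by RecNo: 1 -> 2, 2 -> 1, 3 -> 0, 12 -> 3, 13 -> 2, 14 -> 1, 15 -> 0
```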
I have a scenario where I'm splitting a number of results into quartiles using the SQL Server NTILE function below. The goal is to have as equal a number of rows in each class as possible.
case NTILE(4) over (order by t2.TotalStd)
when 1 then 'A' when 2 then 'B' when 3 then 'C' else 'D' end as Class
The result table is shown below and there is a (9,9,8,8) split between the 4 class groups A,B,C and D.
There are two results which cause me an issue, both rows have a same total std value of 30 but are assigned to different quartiles.
8 30 A
2 30 B
I'm wondering, is there a way to ensure that rows with the same value are assigned to the same quartile? Can I group or partition by another column to get this behaviour?
Pos TotalStd class
1 16 A
2 23 A
3 21 A
4 29 A
5 25 A
6 26 A
7 28 A
8 30 A
9 29 A
1 31 B
2 30 B
3 32 B
4 32 B
5 34 B
6 32 B
7 34 B
8 32 B
9 33 B
1 36 C
2 35 C
3 35 C
4 35 C
5 40 C
6 38 C
7 41 C
8 43 C
1 43 D
2 48 D
3 45 D
4 47 D
5 44 D
6 48 D
7 46 D
8 57 D
You will need to recreate the NTILE function using the RANK function.
The RANK function gives the same rank to rows with the same value. After a run of ties, the rank 'jumps' to the value ROW_NUMBER would have given.
We can use this behavior to mimic the NTILE function, forcing it to give the same NTILE value to rows with the same value. However, this will cause the NTILE partitions to have different sizes.
See the example below for the new Ntile using 4 bins:
declare @data table ( x int )
insert @data values
(1),(2),
(2),(3),
(3),(4),
(4),(5)
select
x,
1+(rank() over (order by x)-1) * 4 / count(1) over (partition by (select 1)) as new_ntile
from @data
Results:
x new_ntile
---------------
1 1
2 1
2 1
3 2
3 2
4 3
4 3
5 4
Not sure what you're expecting to happen here, really. SQL Server has divided up the data into 4 groups of as-equal-size-as-possible, as you asked. What do you want to happen? Have a look at this example:
declare @data table ( x int )
insert @data values
(1),(2),
(2),(3),
(3),(4),
(4),(5)
select
x,
NTILE(4) over (order by x) as ntile
from @data
Results:
x ntile
----------- ----------
1 1
2 1
2 2
3 2
3 3
4 3
4 4
5 4
Now every ntile group shares a value with the one(s) next to it! But what else should it do?
Try this:
; with a as (
select TotalStd,Class=case ntile(4)over( order by TotalStd )
when 1 then 'A'
when 2 then 'B'
when 3 then 'C'
when 4 then 'D'
end
from t2
group by TotalStd
)
select d.*, a.Class from t2 d
inner join a on a.TotalStd=d.TotalStd
order by Class,Pos;
Here we have a table of 34 rows.
DECLARE @x TABLE (TotalStd INT)
INSERT @x (TotalStd) VALUES (16), (21), (23), (25), (26), (28), (29), (29), (30), (30), (31), (32), (32), (32), (32), (33), (34),
(34), (35), (35), (35), (36), (38), (40), (41), (43), (43), (44), (45), (46), (47), (48), (48), (57)
SELECT '@x', TotalStd FROM @x ORDER BY TotalStd
We want to divide into quartiles. If we use NTILE, the buckets will be roughly the same size (8 to 9 rows each) but ties are broken arbitrarily:
SELECT '@x with NTILE', TotalStd, NTILE(4) OVER (ORDER BY TotalStd) quantile FROM @x
See how 30 appears twice: once in quantile 1 and once in quantile 2. Similarly, 43 appears both in quantiles 3 and 4.
What I ought to find is 8 items in quantile 1, 10 in quantile 2, 7 in quantile 3 and 9 in quantile 4 (i.e. not a perfect 9-9-8-8 split, but such a split is impossible if we are not allowed to break ties arbitrarily). I can do it using NTILE to determine cutoff points in a table variable:
DECLARE @cutoffs TABLE (quantile INT, min_value INT, max_value INT)
INSERT @cutoffs (quantile, min_value)
SELECT y.quantile, MIN(y.TotalStd)
FROM (SELECT TotalStd, NTILE(4) OVER (ORDER BY TotalStd) AS quantile FROM @x) y
GROUP BY y.quantile
-- The max values are the minimum values of the next quantiles
UPDATE c1 SET c1.max_value = ISNULL(c2.min_value, (SELECT MAX(TotalStd) + 1 FROM @x))
FROM @cutoffs c1 LEFT OUTER JOIN @cutoffs c2 ON c2.quantile - 1 = c1.quantile
SELECT '@cutoffs', * FROM @cutoffs
We'll use the boundary values in the @cutoffs table to create the final table:
SELECT x.TotalStd, c.quantile FROM @x x
INNER JOIN @cutoffs c ON x.TotalStd >= c.min_value AND x.TotalStd < c.max_value
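If you'd rather avoid the intermediate cutoffs table, a single-query sketch along the same lines might look like this. It is a different technique from the cutoffs join (a windowed MAX over NTILE), but it should give the same assignment: every tied value is pushed into the highest quantile that NTILE gave any of its rows. The tile, quantile and members aliases are my own.

```sql
DECLARE @x TABLE (TotalStd int);
INSERT @x (TotalStd) VALUES
    (16),(21),(23),(25),(26),(28),(29),(29),(30),(30),(31),(32),(32),(32),(32),(33),(34),
    (34),(35),(35),(35),(36),(38),(40),(41),(43),(43),(44),(45),(46),(47),(48),(48),(57);

SELECT q.quantile, members = COUNT(*)
FROM (
    -- All rows sharing a TotalStd take the highest tile any of them received
    SELECT y.TotalStd,
           quantile = MAX(y.tile) OVER (PARTITION BY y.TotalStd)
    FROM (SELECT TotalStd, tile = NTILE(4) OVER (ORDER BY TotalStd) FROM @x) AS y
) AS q
GROUP BY q.quantile
ORDER BY q.quantile;
-- quantile 1 -> 8, 2 -> 10, 3 -> 7, 4 -> 9
```

Swapping MAX for MIN would push ties into the lower quantile instead.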