I have a table in SQL Server with two fields.
Total Group
35645 24
12400 55
30000 41
I want to split each group into smaller segments of a fixed size of 7000, with the remainder of each group going into the last segment. The output should look like this:
Segment Total Group
1 7000 24
2 7000 24
3 7000 24
4 7000 24
5 7000 24
6 645 24
1 7000 55
2 5400 55
1 7000 41
2 7000 41
3 7000 41
4 7000 41
5 2000 41
This should do it:
-- Sample data in a table variable
declare @t table (Total int,[Group] int)
insert into @t(Total,[Group]) values
(35645,24 ),
(12400,55 ),
(30000,41 )
-- Numbers: 0, 1, 2, ... (one row per potential segment)
;With Numbers as (
select ROW_NUMBER() OVER (ORDER BY number)-1 n
from master..spt_values
)
select
n.n+1 as Segment,
-- full segments get 7000; the last segment gets whatever is left
CASE WHEN (n.n+1)*7000 < t.Total THEN 7000
ELSE t.Total - (n.n*7000) END as Total,
t.[Group]
from
@t t inner join
Numbers n on n.n*7000 < t.Total
(If you already have a Numbers table you can eliminate that part. I'm using spt_values just as a table that I know has plenty of rows in it, so that the ROW_NUMBER() expression should generate all of the necessary numbers)
Results:
Segment Total Group
-------------------- -------------------- -----------
1 7000 24
2 7000 24
3 7000 24
4 7000 24
5 7000 24
6 645 24
1 7000 55
2 5400 55
1 7000 41
2 7000 41
3 7000 41
4 7000 41
5 2000 41
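To check the join logic outside SQL Server, here is a minimal sketch that runs the same query shape in SQLite through Python's sqlite3 module. The recursive Numbers CTE is an assumption standing in for master..spt_values, and the ORDER BY on rowid is only there to make the output order deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE t (Total INTEGER, "Group" INTEGER)')
conn.executemany("INSERT INTO t VALUES (?, ?)",
                 [(35645, 24), (12400, 55), (30000, 41)])

SEGMENT_SIZE = 7000  # fixed segment size from the question

rows = conn.execute(
    """
    WITH RECURSIVE Numbers(n) AS (
        SELECT 0
        UNION ALL
        SELECT n + 1 FROM Numbers WHERE n < 100  -- plenty for this sample data
    )
    SELECT n.n + 1 AS Segment,
           CASE WHEN (n.n + 1) * :size < t.Total THEN :size
                ELSE t.Total - n.n * :size END AS Total,
           t."Group"
    FROM t
    JOIN Numbers n ON n.n * :size < t.Total
    ORDER BY t.rowid, n.n
    """,
    {"size": SEGMENT_SIZE},
).fetchall()

for row in rows:
    print(row)
```

The join condition n.n*7000 < Total decides how many segments each group gets, and the CASE expression only has to distinguish the last segment from the full ones.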
I prepared the following SELECT statement using a CTE and a numbers table function (dbo.NumbersTable is not built in; it is assumed here to return the integers from its first argument to its second):
declare @divisor int = 7000
;with CTE as (
select
Total,
[Group],
@divisor divisor,
(Total / @divisor) quotient,
(Total % @divisor) remainder
from t
), NT as (
SELECT i FROM dbo.NumbersTable(1, (select max(quotient) from CTE) ,1)
)
select
case when i = 0 then remainder else divisor end as Total,
[Group]
from (
-- one row per full segment
select *
from CTE, NT
where quotient >= i
union all
-- one extra row (i = 0) for the remainder, if there is one
select *, 0 as i
from CTE
where remainder > 0
) t
order by [Group], i desc
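The quotient and remainder arithmetic behind this answer can be sketched in a few lines of Python. The function name split_total is purely illustrative. Note that an exact multiple such as 14000 has a remainder of 0, so a remainder row is only wanted when the remainder is positive.

```python
def split_total(total, size=7000):
    """Return the segment sizes for one total: full segments, then the remainder."""
    quotient, remainder = divmod(total, size)
    segments = [size] * quotient
    if remainder > 0:  # skip the remainder row when total is an exact multiple
        segments.append(remainder)
    return segments


print(split_total(35645))
print(split_total(14000))
```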
I want to calculate the running total using a Stored Procedure. The base Table is ~10.000 rows and is as follows:
nWordNr nBitNr tmTotals
------------------------
5 14 86404
5 14 146
2 3 438
10 2 3319
5 12 225
2 3 58
.... .... .....
.... .... .....
I want this to be grouped by nWordNr and nBitNr, with the sum of tmTotals for each group. To do this I started off with the following:
SELECT TOP 10
[nWordNr] as W,
[nBitNr] as B,
SUM([tmTotals]) as total,
COUNT(*) as Amount
FROM Messages_History
GROUP BY nWordNr, nBitNr
ORDER BY total desc
This results in:
W B total Amount
-----------------------
2 3 3578775 745
3 3 3557975 395
5 4 2305229 72
5 3 2183050 33
5 12 2022401 825
5 14 1673295 652
48 12 1658862 302
4 3 1606454 215
48 13 1541729 192
5 9 1463256 761
Now I want to calculate the running total on the column total like this:
W B total Amount running
-------------------------------
2 3 3578775 745 3578775
3 3 3557975 395 7136750
5 4 2305229 72 9441979
5 3 2183050 33 11625029
5 12 2022401 825 etc.
5 14 1673295 652 etc.
48 12 1658862 302 etc.
4 3 1606454 215 etc.
48 13 1541729 192 etc.
5 9 1463256 761 etc.
What I found was:
COUNT([tmTotals]) over (ORDER BY [nWordNr], [nBitNr]) as Running
But here I get the error discussed in this question: "Column invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause", and I can't figure out how to solve it in this case.
It should be SUM(SUM(tmTotals)) OVER (...): the inner SUM is the grouped aggregate, and the outer SUM ... OVER is the window function applied to it.
SELECT TOP 10
[nWordNr] as W,
[nBitNr] as B,
SUM([tmTotals]) as total,
COUNT(*) as Amount,
SUM(SUM([tmTotals])) OVER (ORDER BY [nWordNr], [nBitNr]) as Running
FROM Messages_History
GROUP BY nWordNr, nBitNr
ORDER BY total desc
EDIT:
Looking at your expected result, Running should be
SUM(SUM([tmTotals])) OVER (ORDER BY SUM([tmTotals]) DESC) as Running
If the above is a bit difficult to grasp, you can use a CTE or derived table and compute the running total in the outer query:
; with CTE as
(
SELECT
[nWordNr] as W,
[nBitNr] as B,
SUM([tmTotals]) as total,
COUNT(*) as Amount
FROM Messages_History
GROUP BY nWordNr, nBitNr
)
SELECT TOP 10 *,
SUM(total) OVER (ORDER BY total desc) as Running
FROM CTE
ORDER BY total desc
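As a rough check of the CTE form, the sketch below runs it in SQLite via Python's sqlite3 module, using only the six sample rows shown in the question. SQLite is an assumption standing in for SQL Server here, so TOP 10 is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Messages_History (nWordNr INTEGER, nBitNr INTEGER, tmTotals INTEGER)"
)
conn.executemany(
    "INSERT INTO Messages_History VALUES (?, ?, ?)",
    [(5, 14, 86404), (5, 14, 146), (2, 3, 438),
     (10, 2, 3319), (5, 12, 225), (2, 3, 58)],
)

rows = conn.execute(
    """
    WITH CTE AS (
        SELECT nWordNr AS W, nBitNr AS B,
               SUM(tmTotals) AS total, COUNT(*) AS Amount
        FROM Messages_History
        GROUP BY nWordNr, nBitNr
    )
    SELECT W, B, total, Amount,
           SUM(total) OVER (ORDER BY total DESC) AS Running
    FROM CTE
    ORDER BY total DESC
    """
).fetchall()

for row in rows:
    print(row)
```

With only ORDER BY in the OVER clause, the default frame is RANGE-based, so groups with equal totals would share one running value. Adding ROWS UNBOUNDED PRECEDING makes the total advance strictly row by row.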
ID Date Value Average
1 10/5/2017 15 15
2 10/6/2017 25 20
3 10/7/2017 35 25
4 10/8/2017 45 35
5 10/9/2017 55 45
6 10/10/2017 65 55
7 10/11/2017 75 65
If this is my table, I want Average to be a computed column. Its formula is the average of the Value column over the current row and up to two preceding rows.
(E.g. for the 2nd row it is (25+15)/2.)
How can I do this in a computed column? Is there a better way to achieve it?
Thanks in advance.
I would go with a view and use the AVG window function:
select
id,
date,
value,
avg(value) over (order by id)
from table
Updated answer: you could use a frame clause, as below:
;with cte(id,date,val)
as
(
select 1 ,'10/5/2017' , 15 UNION ALL
select 2 ,'10/6/2017' , 25 UNION ALL
select 3 ,'10/7/2017' , 35 UNION ALL
select 4 ,'10/8/2017' , 45 UNION ALL
select 5 ,'10/9/2017' , 55 UNION ALL
select 6 ,'10/10/2017', 65 UNION ALL
select 7 ,'10/11/2017', 75
)
SELECT *,avg(VAL) OVER (ORDER BY id rows between 2 PRECEDING and current row ) FROM CTE
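To confirm that this frame reproduces the Average column from the question, here is a small sketch that runs the same window in SQLite through Python's sqlite3 module. The table name readings is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (id INTEGER, val INTEGER)")
# ids 1..7 with values 15, 25, ..., 75, as in the question
conn.executemany("INSERT INTO readings VALUES (?, ?)",
                 [(i + 1, 15 + 10 * i) for i in range(7)])

rows = conn.execute(
    """
    SELECT id, val,
           AVG(val) OVER (ORDER BY id
                          ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS average
    FROM readings
    ORDER BY id
    """
).fetchall()

averages = [avg for _, _, avg in rows]
print(averages)
```

One difference to keep in mind: SQLite's AVG returns a floating-point value, whereas SQL Server's AVG over an int column returns an int, so cast the column to a decimal type there if fractional averages matter.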
I have a table with sequence numbers from 1 to 90000, and I want to know how to automatically assign each sequence number to a bucket.
For example, 1 to 1000 should fall under the 1000 bucket,
1001 to 2000 under the 2000 bucket,
and so on up to 90000.
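You can subtract 1, divide the number by 1000, floor it, add 1, and multiply it back by 1000, so that 1 to 1000 map to 1000 and 1001 to 2000 map to 2000:
SELECT 1000*(FLOOR((num-1)/1000) + 1) AS bucket, COUNT(*) AS cnt
FROM mytable
GROUP BY FLOOR((num-1)/1000)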
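The boundary values are where bucket formulas usually go wrong, since the question puts 1000 itself in the 1000 bucket and 1001 in the 2000 bucket. A small Python sketch to check them, assuming positive integers and buckets labeled by their upper bound (the function name is illustrative):

```python
def bucket_upper_bound(num, size=1000):
    """Label a positive sequence number with the upper bound of its bucket."""
    # Shifting by 1 keeps exact multiples (1000, 2000, ...) in their own bucket.
    return ((num - 1) // size + 1) * size


for n in (1, 999, 1000, 1001, 2000, 90000):
    print(n, bucket_upper_bound(n))
```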
The Modulo (%) operator is perfect for something like this...
So easy, it feels like it's cheating.
WITH
cte_n1 (n) AS (SELECT 1 FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) n (n)),
cte_n2 (n) AS (SELECT 1 FROM cte_n1 a CROSS JOIN cte_n1 b),
cte_n3 (n) AS (SELECT 1 FROM cte_n2 a CROSS JOIN cte_n2 b),
Sequense (n) AS (
SELECT TOP 90000
ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM
cte_n3 a CROSS JOIN cte_n3 b
)
SELECT
SequenseNumber = s.n,
GroupNumber = s.n - (s.n % 1000)
FROM
Sequense s;
Results...
SequenseNumber GroupNumber
-------------------- --------------------
1 0
2 0
3 0
4 0
5 0
6 0
.........................
997 0
998 0
999 0
1000 1000
1001 1000
1002 1000
1003 1000
1004 1000
1005 1000
1006 1000
1007 1000
1008 1000
1009 1000
1010 1000
.........................
89990 89000
89991 89000
89992 89000
89993 89000
89994 89000
89995 89000
89996 89000
89997 89000
89998 89000
89999 89000
90000 90000
(90000 rows affected)
The code below does what you want. It uses CEILING and works for values from 0 to 90,000. Note that num must be DECIMAL: if you use INT, num/1000 is integer division and the fractional part is discarded before CEILING is applied.
Test data
declare @tbl table(num decimal)
insert into @tbl
select 1 union
select 999 union
select 1000 union
select 1001 union
select 2001 union
select 3001 union
select 9999 union
select 10000 union
select 10001 union
select 15001 union
select 25001 union
select 77006 union
select 80000 union
select 90000
Query
SELECT distinct
num,
CASE WHEN num <= 10000 THEN 1000*CEILING(num/1000)
WHEN num <= 20000 THEN 10000 + 1000*CEILING((num-10000)/1000)
WHEN num <= 30000 THEN 20000 + 1000*CEILING((num-20000)/1000)
WHEN num <= 40000 THEN 30000 + 1000*CEILING((num-30000)/1000)
WHEN num <= 50000 THEN 40000 + 1000*CEILING((num-40000)/1000)
WHEN num <= 60000 THEN 50000 + 1000*CEILING((num-50000)/1000)
WHEN num <= 70000 THEN 60000 + 1000*CEILING((num-60000)/1000)
WHEN num <= 80000 THEN 70000 + 1000*CEILING((num-70000)/1000)
WHEN num <= 90000 THEN 80000 + 1000*CEILING((num-80000)/1000)
ELSE 0
END
FROM @tbl
Result
1 1000
999 1000
1000 1000
1001 2000
2001 3000
3001 4000
9999 10000
10000 10000
10001 11000
15001 16000
25001 26000
77006 78000
80000 80000
90000 90000
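Because every band offset in the CASE expression is itself a multiple of 1000, the branches all reduce to the same value as a single 1000*CEILING(num/1000). The Python sketch below mirrors both forms and compares them over the whole range; the function names are illustrative, and the code only checks the arithmetic, not the SQL itself.

```python
import math


def bucket_case(num):
    """Mirror of the CASE expression: one branch per 10000-wide band."""
    for base in range(0, 90000, 10000):
        if num <= base + 10000:
            return base + 1000 * math.ceil((num - base) / 1000)
    return 0  # the ELSE branch


def bucket_simple(num):
    """Single-expression equivalent for 1..90000."""
    return 1000 * math.ceil(num / 1000)


samples = [1, 999, 1000, 1001, 2001, 3001, 9999, 10000,
           10001, 15001, 25001, 77006, 80000, 90000]
print([bucket_case(n) for n in samples])
print(all(bucket_case(n) == bucket_simple(n) for n in range(1, 90001)))
```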
I'm going to assume that what Jason posted is what the OP is looking for. This is a slight variation using getnumsAB which was developed for exactly this type of thing. First we'll use it to create some sample data:
Sample data
if object_id('tempdb..#yourdata') is not null drop table #yourdata;
select SequenceNumber = rn
into #yourdata
from dbo.GetNumsAB(1,90000,1,1);
create unique clustered index uq_cl_yourdata on #yourdata(SequenceNumber);
To understand my solution first note this query:
select rn, n1, n2 from dbo.getnumsAB(0,90000,1000,1);
This returns:
rn n1 n2
----- -------- --------
1 0 1000
2 1000 2000
3 2000 3000
4 3000 4000
....
87 86000 87000
88 87000 88000
89 88000 89000
90 89000 90000
Solution
select y.SequenceNumber, GroupNumber = n1
from #yourdata y
join dbo.getnumsAB(0,90000,1000,1) gn
on y.SequenceNumber >= n1 and y.SequenceNumber < n2;
If I'm not missing something, wouldn't you just use a CASE expression?
select case
when Sequence <= 1000
then '1000'
when Sequence between 1001 and 2000
then '2000'
-- and so on up to 90000
end
This is the result of my first sql statement:
SELECT
count(*) countQuarter, Hour, Quarter,
ROW_NUMBER() OVER(ORDER BY Hour, Quarter ASC) AS rownum
FROM
(SELECT [ID] ,[simulationID] ,[time],
replace(str(time/3600,len(ltrim(time/3600))+abs(sign(time/359999)-1)) + ':' + str((time/60)%60,2) + ':' + str(time%60,2),' ','0') dtString,
(time/3600) Hour, (time/60)%60 Minute, case when (time/60)%60<15 then 15 when
(time/60)%60<30 then 30 when (time/60)%60<45 then 45 when (time/60)%60<60 then 60 end
Quarter ,[person] ,[link] ,[vehicle] FROM [TEST].[dbo].[evtLinks]
WHERE simulationID=@simulationID) B
GROUP BY Hour, Quarter
which gives the following results:
Count Hour Quarter Rownum
497 0 15 1
842 0 30 2
1033 0 45 3
1120 0 60 4
1235 1 15 5
1267 1 30 6
1267 1 45 7
1267 1 60 8
1267 2 15 9
1267 2 30 10
I want a result where the column Fullcount is the sum of Count for the current row and the next 3 rows:
Count Hour Quarter Rownum Fullcount
497 0 15 1 3492
842 0 30 2 4230
1033 0 45 3 4655
1120 0 60 4 ...
1235 1 15 5
1267 1 30 6
1267 1 45 7
1267 1 60 8
1267 2 15 9
1267 2 30 10
How can this be done with grouping or analytical functions in SQL Server?
For SQL Server 2012, yes this can be done:
declare @t table ([Count] int,[Hour] int,[Quarter] int,Rownum int)
insert into @t([Count],[Hour],[Quarter],Rownum) values
(497 , 0 , 15 , 1 ),
(842 , 0 , 30 , 2 ),
(1033 , 0 , 45 , 3 ),
(1120 , 0 , 60 , 4 ),
(1235 , 1 , 15 , 5 ),
(1267 , 1 , 30 , 6 ),
(1267 , 1 , 45 , 7 ),
(1267 , 1 , 60 , 8 ),
(1267 , 2 , 15 , 9 ),
(1267 , 2 , 30 , 10 )
-- sum the current row and the three rows after it
select *,SUM([Count]) OVER (
ORDER BY rownum
ROWS BETWEEN CURRENT ROW AND
3 FOLLOWING) as Fullcount
from @t
Here I'm using @t as your current result set. You may be able to adapt this into your current query, or you may have to place your current query in a CTE.
Unfortunately, the ROWS BETWEEN syntax is only valid on 2012 and later.
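For a quick look at what the forward-looking frame returns on all ten rows, the sketch below runs the same window in SQLite via Python's sqlite3 module. SQLite stands in for SQL Server here, and the column name cnt is a stand-in for [Count].

```python
import sqlite3

counts = [497, 842, 1033, 1120, 1235, 1267, 1267, 1267, 1267, 1267]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (cnt INTEGER, rownum INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?)",
                 [(c, i) for i, c in enumerate(counts, start=1)])

rows = conn.execute(
    """
    SELECT rownum, cnt,
           SUM(cnt) OVER (ORDER BY rownum
                          ROWS BETWEEN CURRENT ROW AND 3 FOLLOWING) AS Fullcount
    FROM t
    ORDER BY rownum
    """
).fetchall()

fullcounts = [r[2] for r in rows]
print(fullcounts)
```

The first three values match the expected output in the question (3492, 4230, 4655). The last three rows sum fewer than four values because the frame runs off the end of the result set.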
Tested the logical scenario and it works, but I don't have your data, so in your case it should look roughly like this:
;WITH CTE as (SELECT count(*) countQuarter,Hour,Quarter,
ROW_NUMBER() OVER(ORDER BY Hour, Quarter ASC) AS rownum
FROM
(SELECT [ID] ,[simulationID] ,[time],
replace(str(time/3600,len(ltrim(time/3600))+abs(sign(time/359999)-1)) + ':' + str((time/60)%60,2) + ':' + str(time%60,2),' ','0') dtString,
(time/3600) Hour, (time/60)%60 Minute, case when (time/60)%60<15 then 15 when
(time/60)%60<30 then 30 when (time/60)%60<45 then 45 when (time/60)%60<60 then 60 end
Quarter ,[person] ,[link] ,[vehicle] FROM [TEST].[dbo].[evtLinks]
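WHERE simulationID=@simulationID) B
GROUP BY Hour, Quarter)
SELECT CTE.*, CA.Fullcount
FROM CTE
-- rownum is the only sequential key in the CTE, so the window is rownum .. rownum+3
CROSS APPLY (SELECT SUM(countQuarter) Fullcount FROM CTE C WHERE C.rownum BETWEEN CTE.rownum AND CTE.rownum+3) CA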
I have a scenario where I'm splitting a number of results into quartiles using the SQL Server NTILE function below. The goal is to have as equal a number of rows as possible in each class.
case NTILE(4) over (order by t2.TotalStd)
when 1 then 'A' when 2 then 'B' when 3 then 'C' else 'D' end as Class
The result table is shown below and there is a (9,9,8,8) split between the 4 class groups A,B,C and D.
There are two results which cause me an issue, both rows have a same total std value of 30 but are assigned to different quartiles.
8 30 A
2 30 B
Is there a way to ensure that rows with the same value are assigned to the same quartile? Can I group or partition by another column to get this behaviour?
Pos TotalStd class
1 16 A
2 23 A
3 21 A
4 29 A
5 25 A
6 26 A
7 28 A
8 30 A
9 29 A
1 31 B
2 30 B
3 32 B
4 32 B
5 34 B
6 32 B
7 34 B
8 32 B
9 33 B
1 36 C
2 35 C
3 35 C
4 35 C
5 40 C
6 38 C
7 41 C
8 43 C
1 43 D
2 48 D
3 45 D
4 47 D
5 44 D
6 48 D
7 46 D
8 57 D
You will need to recreate the NTILE function using the RANK function.
RANK gives the same rank to rows with the same value; after a tie, the rank 'jumps' to the value ROW_NUMBER would have given.
We can use this behavior to mimic NTILE while forcing rows with the same value into the same tile. However, the resulting tiles will no longer be equal in size.
See the example below for the new NTILE using 4 bins:
declare @data table ( x int )
insert @data values
(1),(2),
(2),(3),
(3),(4),
(4),(5)
select
x,
-- integer division maps rank 1..count onto tiles 1..4; tied rows share a rank, hence a tile
1+(rank() over (order by x)-1) * 4 / count(1) over (partition by (select 1)) as new_ntile
from @data
Results:
x new_ntile
---------------
1 1
2 1
2 1
3 2
3 2
4 3
4 3
5 4
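To see the rank-based tile next to the built-in one, here is a sketch that computes both in SQLite through Python's sqlite3 module. SQLite is assumed to behave like SQL Server for RANK, NTILE and integer division on this data, and the table name data is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (x INTEGER)")
conn.executemany("INSERT INTO data VALUES (?)",
                 [(v,) for v in (1, 2, 2, 3, 3, 4, 4, 5)])

rows = sorted(conn.execute(
    """
    SELECT x,
           1 + (RANK() OVER (ORDER BY x) - 1) * 4 / COUNT(*) OVER () AS new_ntile,
           NTILE(4) OVER (ORDER BY x) AS plain_ntile
    FROM data
    """
).fetchall())

for row in rows:
    print(row)
```

Each value of x maps to exactly one new_ntile, while plain_ntile splits the tied 2s, 3s and 4s across neighbouring tiles.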
Not sure what you're expecting to happen here, really. SQL Server has divided up the data into 4 groups of as-equal-size-as-possible, as you asked. What do you want to happen? Have a look at this example:
declare @data table ( x int )
insert @data values
(1),(2),
(2),(3),
(3),(4),
(4),(5)
select
x,
NTILE(4) over (order by x) as ntile
from @data
Results:
x ntile
----------- ----------
1 1
2 1
2 2
3 2
3 3
4 3
4 4
5 4
Now every ntile group shares a value with the one(s) next to it! But what else should it do?
Try this:
; with a as (
select TotalStd,Class=case ntile(4)over( order by TotalStd )
when 1 then 'A'
when 2 then 'B'
when 3 then 'C'
when 4 then 'D'
end
from t2
group by TotalStd
)
select d.*, a.Class from t2 d
inner join a on a.TotalStd=d.TotalStd
order by Class,Pos;
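Tiling the distinct values and joining back keeps ties together, but the classes can end up quite uneven. The sketch below counts rows per tile for the 34 TotalStd values from the question, using SQLite via Python's sqlite3 module. It uses SELECT DISTINCT in a subquery where the answer uses GROUP BY, which gives the same set of values.

```python
import sqlite3

values = [16, 21, 23, 25, 26, 28, 29, 29, 30, 30, 31, 32, 32, 32, 32, 33, 34,
          34, 35, 35, 35, 36, 38, 40, 41, 43, 43, 44, 45, 46, 47, 48, 48, 57]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t2 (TotalStd INTEGER)")
conn.executemany("INSERT INTO t2 VALUES (?)", [(v,) for v in values])

rows = conn.execute(
    """
    WITH a AS (
        SELECT TotalStd, NTILE(4) OVER (ORDER BY TotalStd) AS tile
        FROM (SELECT DISTINCT TotalStd FROM t2)
    )
    SELECT a.tile, COUNT(*) AS row_count
    FROM t2 d
    JOIN a ON a.TotalStd = d.TotalStd
    GROUP BY a.tile
    ORDER BY a.tile
    """
).fetchall()

print(rows)
```

There are 24 distinct values, so each tile gets 6 of them, but the number of rows behind those values differs a lot from tile to tile.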
Here we have a table of 34 rows.
DECLARE @x TABLE (TotalStd INT)
INSERT @x (TotalStd) VALUES (16), (21), (23), (25), (26), (28), (29), (29), (30), (30), (31), (32), (32), (32), (32), (33), (34),
(34), (35), (35), (35), (36), (38), (40), (41), (43), (43), (44), (45), (46), (47), (48), (48), (57)
SELECT '@x', TotalStd FROM @x ORDER BY TotalStd
We want to divide the rows into quartiles. If we use NTILE, the buckets will be roughly the same size (8 to 9 rows each), but ties are broken arbitrarily:
SELECT '@x with NTILE', TotalStd, NTILE(4) OVER (ORDER BY TotalStd) quantile FROM @x
See how 30 appears twice: once in quantile 1 and once in quantile 2. Similarly, 43 appears in both quantiles 3 and 4.
What I want instead is for tied values to stay together. Taking the smallest value of each NTILE bucket as that quantile's lower bound gives 8 items in quantile 1, 10 in quantile 2, 7 in quantile 3 and 9 in quantile 4 (not the 9-9-8-8 split that NTILE produces, but an even split is impossible if we are not allowed to break ties arbitrarily). I can do it using NTILE to determine cutoff points in a table variable:
DECLARE @cutoffs TABLE (quantile INT, min_value INT, max_value INT)
INSERT @cutoffs (quantile, min_value)
SELECT y.quantile, MIN(y.TotalStd)
FROM (SELECT TotalStd, NTILE(4) OVER (ORDER BY TotalStd) AS quantile FROM @x) y
GROUP BY y.quantile
-- The max values are the minimum values of the next quantiles
UPDATE c1 SET c1.max_value = ISNULL(C2.min_value, (SELECT MAX(TotalStd) + 1 FROM @x))
FROM @cutoffs c1 LEFT OUTER JOIN @cutoffs c2 ON c2.quantile - 1 = c1.quantile
SELECT '@cutoffs', * FROM @cutoffs
We'll use the boundary values in the @cutoffs table to create the final table:
SELECT x.TotalStd, c.quantile FROM @x x
INNER JOIN @cutoffs c ON x.TotalStd >= c.min_value AND x.TotalStd < c.max_value
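The cutoff approach can be tried end to end with the sketch below, which uses SQLite via Python's sqlite3 module. This is a compact variant under stated assumptions, not the exact statements above: the table variables become CTEs, and LEAD supplies each quantile's upper bound in place of the UPDATE with a self-join.

```python
import sqlite3

values = [16, 21, 23, 25, 26, 28, 29, 29, 30, 30, 31, 32, 32, 32, 32, 33, 34,
          34, 35, 35, 35, 36, 38, 40, 41, 43, 43, 44, 45, 46, 47, 48, 48, 57]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE x (TotalStd INTEGER)")
conn.executemany("INSERT INTO x VALUES (?)", [(v,) for v in values])

rows = conn.execute(
    """
    WITH tiled AS (
        SELECT TotalStd, NTILE(4) OVER (ORDER BY TotalStd) AS quantile FROM x
    ),
    mins AS (
        SELECT quantile, MIN(TotalStd) AS min_value FROM tiled GROUP BY quantile
    ),
    cutoffs AS (
        -- upper bound = lower bound of the next quantile, or max + 1 for the last
        SELECT quantile, min_value,
               COALESCE(LEAD(min_value) OVER (ORDER BY quantile),
                        (SELECT MAX(TotalStd) + 1 FROM x)) AS max_value
        FROM mins
    )
    SELECT c.quantile, COUNT(*) AS row_count
    FROM x
    JOIN cutoffs c ON x.TotalStd >= c.min_value AND x.TotalStd < c.max_value
    GROUP BY c.quantile
    ORDER BY c.quantile
    """
).fetchall()

print(rows)
```

Because each quantile's lower bound is the smallest value NTILE placed in it, both 30s land in quantile 2 and both 43s in quantile 4, so this data splits 8-10-7-9.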