SQL Server Group By Excluding Some Values

I have some records like below:
ID Val Amount
1 0 3
2 0 3
3 0 4
4 1 2
5 1 3
6 2 3
7 2 4
I want to group this data by the column Val and get the sum(amount), but do not group the ones with Val = 0.
The result set I need is like below:
Val Amount
0 3
0 3
0 4
1 5
2 7
I did it in two ways, but neither seems to be the best way:
The first uses a union: select the rows with Val = 0 as they are, group the rows with Val <> 0, and union the two result sets.
The second is a little better. Assume the data is in the table #Table:
WITH g AS
(
SELECT Val, Amount, CASE WHEN Val = '0' then Val + ID
else Val END A FROM #table
)
SELECT CASE WHEN A LIKE '0%' THEN 0 ELSE A END AS A, SUM(Amount)
FROM g
GROUP BY A
This also works, but having to concatenate with the ID column (or a row number) and then using a LEFT function to remove it is not good practice.
So I'm looking for a better approach, both looking better and performing better as well.
I work on SQL Server 2008, but I'm open to any solutions which require newer versions.

The shortest way of doing it is the following:
SELECT Val, SUM(Amount)
FROM mytable
GROUP BY Val, CASE WHEN Val = 0 THEN ID ELSE 0 END
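As a quick check of the query above, here is a minimal sketch that runs it against the sample data. It uses SQLite through Python's sqlite3 module as a stand-in for SQL Server (an assumption made only so the example is self-contained), and the ORDER BY is added just to make the output order deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (ID INTEGER, Val INTEGER, Amount INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [(1, 0, 3), (2, 0, 3), (3, 0, 4), (4, 1, 2), (5, 1, 3), (6, 2, 3), (7, 2, 4)],
)

# Rows with Val = 0 use their own ID as a second grouping key, so each of
# them forms its own group; every other row shares the constant key 0.
rows = conn.execute("""
    SELECT Val, SUM(Amount) AS Amount
    FROM mytable
    GROUP BY Val, CASE WHEN Val = 0 THEN ID ELSE 0 END
    ORDER BY Val, MIN(ID)  -- not in the original query; fixes the row order
""").fetchall()

print(rows)  # [(0, 3), (0, 3), (0, 4), (1, 5), (2, 7)]
```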
You can also do it using window functions:
;WITH CTE AS (
SELECT ID, Val, Amount,
DENSE_RANK() OVER (PARTITION BY Val
ORDER BY CASE
WHEN Val = 0 THEN ID
ELSE 0
END) AS rank
FROM mytable
)
SELECT Val, SUM(Amount) AS total_amount
FROM CTE
GROUP BY Val, rank
The result set returned by the CTE is:
ID Val Amount rank
--------------------
1 0 3 1
2 0 3 2
3 0 4 3
4 1 2 1
5 1 3 1
6 2 3 1
7 2 4 1
So using rank you can differentiate between 0 and the rest of Val values.
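The ranking can be reproduced with a small sketch. SQLite (via Python's sqlite3) stands in for SQL Server here, and the alias rnk is a hypothetical rename of rank; both are assumptions for the sake of a runnable example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (ID INTEGER, Val INTEGER, Amount INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [(1, 0, 3), (2, 0, 3), (3, 0, 4), (4, 1, 2), (5, 1, 3), (6, 2, 3), (7, 2, 4)],
)

# Within Val = 0 the sort key is the ID, so every row gets a distinct rank.
# For other values the sort key is the constant 0, so all rows tie at rank 1.
RANKED = """
    SELECT ID, Val, Amount,
           DENSE_RANK() OVER (
               PARTITION BY Val
               ORDER BY CASE WHEN Val = 0 THEN ID ELSE 0 END
           ) AS rnk
    FROM mytable
"""

ranks = [r[3] for r in conn.execute(RANKED + " ORDER BY ID")]
totals = conn.execute(
    "WITH cte AS (" + RANKED + ") "
    "SELECT Val, SUM(Amount) FROM cte GROUP BY Val, rnk ORDER BY Val, rnk"
).fetchall()

print(ranks)   # [1, 2, 3, 1, 1, 1, 1]
print(totals)  # [(0, 3), (0, 3), (0, 4), (1, 5), (2, 7)]
```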
You can use both methods and see how they compare to each other in terms of performance.

Use a union here. The top of the below union finds aggregate amounts of values which are not zero, and the bottom brings in the zero value records, not aggregated.
SELECT Val, SUM(Amount) AS Amount
FROM g
WHERE Val <> 0
GROUP BY Val
UNION ALL
SELECT Val, Amount
FROM g
WHERE Val = 0
ORDER BY Val;

Related

Classifying rows into a grouping column that shows the data is related to prior rows

I have a set of data that I want to classify into groups based on a prior record id existing on the newer rows. The initial record of the group has a prior sequence id = 0.
The data is as follows:
customer_id sequence_id prior_sequence_id
1 1 0
1 2 1
1 3 2
2 4 0
2 5 4
2 6 0
2 7 6
Ideally, I would like to create the following grouping column and yield the following results:
customer_id sequence_id prior_sequence_id grouping
1 1 0 1
1 2 1 1
1 3 2 1
2 4 0 2
2 5 4 2
2 6 0 3
2 7 6 3
I've attempted to use gaps-and-islands logic with the ROW_NUMBER() function, but without success. I suspect the need here is more along the lines of a recursive CTE, which I am attempting at the moment.
I agree that a recursive CTE will do the job. Something like:
WITH reccte AS
(
/*query that determines starting point for recursion
*
* In this case we want all records with no prior_sequence_id
*/
SELECT
customer_id,
sequence_id,
prior_sequence_id,
/*establish grouping*/
ROW_NUMBER() OVER (ORDER BY sequence_id) as grouping
FROM yourtable
WHERE prior_sequence_id = 0
UNION ALL
/*join the recursive CTE back to the table and iterate*/
SELECT
yourtable.customer_id,
yourtable.sequence_id,
yourtable.prior_sequence_id,
reccte.grouping
FROM reccte
INNER JOIN yourtable ON reccte.sequence_id = yourtable.prior_sequence_id
)
SELECT * FROM reccte;
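A runnable sketch of this recursive approach against the sample data follows. It assumes SQLite (through Python's sqlite3) in place of the asker's database, so the CTE is introduced with WITH RECURSIVE, which SQL Server does not use, and the column is named grp instead of grouping.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE yourtable "
    "(customer_id INTEGER, sequence_id INTEGER, prior_sequence_id INTEGER)"
)
conn.executemany(
    "INSERT INTO yourtable VALUES (?, ?, ?)",
    [(1, 1, 0), (1, 2, 1), (1, 3, 2), (2, 4, 0), (2, 5, 4), (2, 6, 0), (2, 7, 6)],
)

rows = conn.execute("""
    WITH RECURSIVE reccte AS (
        -- Anchor: rows that start a group, numbered in sequence order.
        SELECT customer_id, sequence_id, prior_sequence_id,
               ROW_NUMBER() OVER (ORDER BY sequence_id) AS grp
        FROM yourtable
        WHERE prior_sequence_id = 0
        UNION ALL
        -- Recursive step: each follower inherits the group of its prior row.
        SELECT t.customer_id, t.sequence_id, t.prior_sequence_id, r.grp
        FROM reccte AS r
        INNER JOIN yourtable AS t ON r.sequence_id = t.prior_sequence_id
    )
    SELECT * FROM reccte ORDER BY sequence_id
""").fetchall()

groups = [r[3] for r in rows]
print(groups)  # [1, 1, 1, 2, 2, 3, 3]
```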
It looks like you could use a simple correlated query, at least given your sample data:
select *, (
select Sum(Iif(prior_sequence_id = 0, 1, 0))
from t t2
where t2.sequence_id <= t.sequence_id
) Grouping
from t;
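For comparison, a sketch of the correlated version on the same sample data, again assuming SQLite via Python's sqlite3 as a stand-in. Iif is written as a CASE expression so it runs on older SQLite builds, and the alias grp is hypothetical. Note that this counts group starts purely by sequence_id order, so it relies on each group's rows being contiguous in that order, as they are in the sample.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t "
    "(customer_id INTEGER, sequence_id INTEGER, prior_sequence_id INTEGER)"
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(1, 1, 0), (1, 2, 1), (1, 3, 2), (2, 4, 0), (2, 5, 4), (2, 6, 0), (2, 7, 6)],
)

# For each row, count how many group-starting rows (prior_sequence_id = 0)
# occur at or before it in sequence_id order.
rows = conn.execute("""
    SELECT t.*,
           (SELECT SUM(CASE WHEN t2.prior_sequence_id = 0 THEN 1 ELSE 0 END)
            FROM t AS t2
            WHERE t2.sequence_id <= t.sequence_id) AS grp
    FROM t
    ORDER BY t.sequence_id
""").fetchall()

groups = [r[3] for r in rows]
print(groups)  # [1, 1, 1, 2, 2, 3, 3]
```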

data sequence handling

I am stuck with a weird problem of handling data sequences.
My source data looks like -
Roll no, Marker
1,1
2,0
3,0
5,1
8,1
9,0
10,1
The marker column can only have two values, 1 and 0.
If the roll no values form a sequence, a marker value of 1 indicates the start of the sequence, and all remaining roll nos within that sequence have marker value 0. So for the roll no sequence 1-3, the marker value is 1 for roll no 1 and 0 for the rest. However, if a roll no doesn't fall into a sequence (as in roll no 8), the marker value is 1.
From this data I need to create an output as follows -
Roll range
1
2
3
1-3
5
5-5
8
9
10
8-10
Meaning -
display the roll no in sequence as in the input
after each sequence ends, display a new record containing the start and end roll no of the preceding sequence
How is this possible?
Thanks in advance for help.
This looks like a gaps-and-islands problem.
If I understand correctly, we can use the SUM window function with a condition to solve it.
Generate a group number from a running total of the markers, then take the MIN and MAX per group:
SELECT CONCAT(MIN(Roll),'-',MAX(Roll))
FROM (
SELECT *,
SUM(CASE WHEN Marker = 1 THEN 1 ELSE 0 END) OVER(ORDER BY Roll) grp
FROM T
) t1
GROUP BY grp
As I commented, I am not sure about the logic behind 8-10 in your expected result (why not 8-9 and 10-10?), given your description of the columns. To get 8-10, we can check for the maximum Roll and adjust the group number with some arithmetic:
SELECT CONCAT(MIN(Roll),'-',MAX(Roll))
FROM (
SELECT *,
SUM(CASE WHEN Marker = 1 THEN 1 ELSE 0 END) OVER(ORDER BY Roll) + IIF(MAX(Roll) OVER() = Roll, - Marker,0) grp
FROM T
) t1
GROUP BY grp
To combine both result sets in the final query, we can use UNION ALL:
;WITH CTE AS (
SELECT *,
SUM(CASE WHEN Marker = 1 THEN 1 ELSE 0 END) OVER(ORDER BY Roll) + IIF(MAX(Roll) OVER() = Roll, - Marker,0) grp
FROM T
)
SELECT [Roll range]
FROM (
SELECT CONCAT(MIN(Roll),'-',MAX(Roll)) 'Roll range',MAX(Roll) seq
FROM CTE t1
GROUP BY grp
UNION ALL
SELECT CAST(Roll AS VARCHAR(5)),Roll
FROM CTE t1
) t1
ORDER BY seq
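Here is a sketch of that final query run against the sample data. It assumes SQLite through Python's sqlite3 as a stand-in for SQL Server, so CONCAT becomes the || operator and IIF becomes CASE. The is_range column is a hypothetical addition: ORDER BY seq alone leaves the order of a range row and the plain row sharing its seq value unspecified, so the sketch adds a tiebreaker.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (Roll INTEGER, Marker INTEGER)")
conn.executemany(
    "INSERT INTO T VALUES (?, ?)",
    [(1, 1), (2, 0), (3, 0), (5, 1), (8, 1), (9, 0), (10, 1)],
)

rows = conn.execute("""
    WITH cte AS (
        SELECT Roll, Marker,
               -- Running count of markers gives a group number per sequence;
               -- the last row is pulled back into the previous group when
               -- its own marker is 1.
               SUM(CASE WHEN Marker = 1 THEN 1 ELSE 0 END) OVER (ORDER BY Roll)
               + CASE WHEN MAX(Roll) OVER () = Roll THEN -Marker ELSE 0 END AS grp
        FROM T
    )
    SELECT roll_range
    FROM (
        SELECT MIN(Roll) || '-' || MAX(Roll) AS roll_range,
               MAX(Roll) AS seq, 1 AS is_range
        FROM cte
        GROUP BY grp
        UNION ALL
        SELECT CAST(Roll AS TEXT), Roll, 0
        FROM cte
    ) AS u
    ORDER BY seq, is_range
""").fetchall()

roll_ranges = [r[0] for r in rows]
print(roll_ranges)
# ['1', '2', '3', '1-3', '5', '5-5', '8', '9', '10', '8-10']
```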
SELECT
CASE WHEN a=2 AND CHARINDEX('-',R)=0 THEN CONCAT(R,'-',R) ELSE R END as R,
R2,
a
FROM (
SELECT
1 as a,
CONVERT(VARCHAR(3), Roll) R,
Roll as R2
FROM table1
UNION ALL
SELECT
2,
STRING_AGG(Roll,'-') R,
MAX(Roll) as R2
FROM (
SELECT
Roll,
SUM(Marker) OVER (ORDER BY Roll) S
FROM
table1
) x
GROUP BY S
) x
ORDER BY R2,a
output:
R R2 a
1 1 1
2 2 1
3 3 1
1-2-3 3 2
5 5 1
5-5 5 2
8 8 1
9 9 1
8-9 9 2
10 10 1
10-10 10 2
Columns R2 and a are added for correct sorting.
This groups 8-9 and 10-10 separately; whether 8-10 is the intended range is still an open question (see comment).

COUNT and COUNT DISTINCT for different groups

For a SQL Server based report,
Table:
CID Date ID Service Days
1 3/7/2016 1 Individual 3
2 4/5/2016 2 Individual 4
3 5/24/2016 1 Individual 3
4 4/4/2016 4 Group 2
5 4/4/2016 4 Group 2
6 2/18/2016 4 Group 2
7 5/5/2016 5 Group 1
8 5/5/2016 5 Group 1
I used this code:
SELECT
ID,
Service,
COUNT(CASE WHEN Days = 4 THEN 1 END) AS '4Days',
COUNT(CASE WHEN Days = 3 THEN 1 END) AS '3Days',
COUNT(CASE WHEN Days = 2 THEN 1 END) AS '2Days',
COUNT(CASE WHEN Days = 1 THEN 1 END) AS '1Day'
FROM Table T1
GROUP BY
ID,
Service
which gives me this Output:
ID Service 4Days 3Days 2Days 1Day
1 Individual 0 2 0 0
2 Individual 1 0 0 0
4 Group 0 0 3 0
5 Group 0 0 0 2
What I want to do is not count the Group services as separate services for separate individuals, but just as one service per group. A Count Distinct used with the Date or ID could help me do that but I don't know how to make that play with the Individual services where I just wanna count them individually and not using DISTINCT. So the desired output is:
ID Service 4Days 3Days 2Days 1Day
1 Individual 0 2 0 0
2 Individual 1 0 0 0
4 Group 0 0 2 0
5 Group 0 0 0 1
I'll edit the post in case I oversimplified the problem since this is dummy data.
Looks like you could use distinct this way if you wanted:
count(distinct
case when Days = 1 then case when Service = 'Group' then 1 else "Date" end end
) as [1Day]
Depending on your indexing it's possible that introducing another column in the query would change the query plan. I suspect that probably isn't the case though.
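To illustrate the COUNT(DISTINCT CASE ...) idea against the desired output, here is a sketch of one possible variation, not the exact expression above: Group rows are keyed by their date (so same-day rows collapse into one) and Individual rows by CID (so each row counts). The table name visits, the column name VisitDate and the d4..d1 aliases are hypothetical, and SQLite via Python's sqlite3 stands in for SQL Server. SQLite lets a CASE return either a date string or an integer; in T-SQL both branches would likely need casting to a common type.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE visits "
    "(CID INTEGER, VisitDate TEXT, ID INTEGER, Service TEXT, Days INTEGER)"
)
conn.executemany(
    "INSERT INTO visits VALUES (?, ?, ?, ?, ?)",
    [
        (1, "2016-03-07", 1, "Individual", 3),
        (2, "2016-04-05", 2, "Individual", 4),
        (3, "2016-05-24", 1, "Individual", 3),
        (4, "2016-04-04", 4, "Group", 2),
        (5, "2016-04-04", 4, "Group", 2),
        (6, "2016-02-18", 4, "Group", 2),
        (7, "2016-05-05", 5, "Group", 1),
        (8, "2016-05-05", 5, "Group", 1),
    ],
)

# Key that is unique per row for Individual and per date for Group.
KEY = "CASE WHEN Service = 'Group' THEN VisitDate ELSE CID END"

columns = ", ".join(
    f"COUNT(DISTINCT CASE WHEN Days = {d} THEN {KEY} END) AS d{d}"
    for d in (4, 3, 2, 1)
)
rows = conn.execute(
    f"SELECT ID, Service, {columns} FROM visits GROUP BY ID, Service ORDER BY ID"
).fetchall()

for row in rows:
    print(row)
# (1, 'Individual', 0, 2, 0, 0)
# (2, 'Individual', 1, 0, 0, 0)
# (4, 'Group', 0, 0, 2, 0)
# (5, 'Group', 0, 0, 0, 1)
```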
If I am not mistaken, the '2Days' count for the 'Group' service should be 2 when grouping is based on the Date column. If so, try this:
SELECT
ID,
Service,
CASE WHEN MAX(t.days) = 4 THEN MAX(t.date) ELSE 0 END AS '4Days',
CASE WHEN MAX(t.days) = 3 THEN MAX(t.date) ELSE 0 END AS '3Days',
CASE WHEN MAX(t.days) = 2 THEN MAX(t.date) ELSE 0 END AS '2Days',
CASE WHEN MAX(t.days) = 1 THEN MAX(t.date) ELSE 0 END AS '1Day'
FROM table T1
OUTER APPLY (SELECT days,
COUNT(DISTINCT(date)) date
FROM Table WHERE days = t1.days GROUP BY days) t
GROUP BY id, service
ORDER BY ID
Based on your last edit, this is the most straightforward way I could think of to handle the query:
with cte as (
select id, service, days
from table t1
where service = 'Individual'
union all
select id, service, days
from table t1
where service = 'Group'
group by id, service, days, date
)
select id,
service,
count(case when days = 4 then 'X' end) as [4Days],
count(case when days = 3 then 'X' end) as [3Days],
count(case when days = 2 then 'X' end) as [2Days],
count(case when days = 1 then 'X' end) as [1Day]
from cte
group by id, service

How to query records based on row_num and one of the column value?

Rownum Status
1 2
2 1
3 3
4 2
5 3
6 1
The goal is to return the records that appear before the first record with status = 3; in the scenario above, the expected output is rownum 1 and 2.
If there is no record with status = 3, show everything.
I'm not sure where to start, so I have no attempt to show yet.
If you are using SQL Server 2012+, then you can use window version of SUM with an ORDER BY clause:
SELECT Rownum, Status
FROM (
SELECT Rownum, Status,
SUM(CASE WHEN Status = 3 THEN 1 ELSE 0 END)
OVER
(ORDER BY Rownum) AS s
FROM mytable) t
WHERE t.s = 0
Calculated field s is a running total of Status = 3 occurrences. The query returns all records before the first occurrence of a 3 value.
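A small sketch of the running-total filter, covering both cases from the question. SQLite via Python's sqlite3 is assumed as a stand-in for SQL Server, and an outer ORDER BY is added so the output order is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (Rownum INTEGER, Status INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [(1, 2), (2, 1), (3, 3), (4, 2), (5, 3), (6, 1)],
)

# s stays at 0 until the first Status = 3 row and is at least 1 from then on,
# so filtering on s = 0 keeps only the rows before that first occurrence.
QUERY = """
    SELECT Rownum, Status
    FROM (
        SELECT Rownum, Status,
               SUM(CASE WHEN Status = 3 THEN 1 ELSE 0 END)
                   OVER (ORDER BY Rownum) AS s
        FROM mytable
    ) AS t
    WHERE t.s = 0
    ORDER BY Rownum
"""

before_first_3 = conn.execute(QUERY).fetchall()
print(before_first_3)  # [(1, 2), (2, 1)]

# With no Status = 3 rows at all, s is 0 everywhere and every row is returned.
conn.execute("DELETE FROM mytable WHERE Status = 3")
without_3 = conn.execute(QUERY).fetchall()
print(without_3)  # [(1, 2), (2, 1), (4, 2), (6, 1)]
```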

TSQL - Difficult Grouping

Please see fiddle: http://sqlfiddle.com/#!6/e6768/2
I have data, like below:
DRIVER DROP
1 1
1 2
1 ReturnToBase
1 4
1 5
1 ReturnToBase
1 6
1 7
2 1
2 2
2 ReturnToBase
2 4
I am trying to group my data, so for each driver, each group of return to bases have a grouping number.
My output should look like this:
DRIVER DROP GROUP
1 1 1
1 2 1
1 ReturnToBase 1
1 4 2
1 5 2
1 ReturnToBase 2
1 6 3
1 7 3
1 ReturnToBase 3
2 1 1
2 2 1
2 ReturnToBase 1
2 4 2
I've tried getting this result with a combination of windowed functions but I've been miles off so far
Below is what I had so far, it isn't supposed to be functional I was trying to figure out how it could be done, if it's even possible.
SELECT
ROW_NUMBER() OVER (Partition BY Driver order by Driver Desc) rownum,
Count(1) OVER (Partition By Driver Order By Driver Desc) counter,
Count
DropNo,
Driver,
CASE DropNo
WHEN 'ReturnToBase' THEN 1 ELSE 0 END AS EnumerateRound
FROM
Rounds
You can use the following query:
SELECT id, DRIVER, DROPno,
1 + SUM(flag) OVER (PARTITION BY DRIVER ORDER BY id) -
CASE
WHEN DROPno = 'ReturnToBase' THEN 1
ELSE 0
END AS grp
FROM (
SELECT id, DRIVER, DROPno,
CASE
WHEN DROPno = 'ReturnToBase' THEN 1
ELSE 0
END AS flag
FROM rounds ) AS t
This query uses windowed version of SUM with ORDER BY in the OVER clause to calculate a running total. This version of SUM is available from SQL Server 2012 onwards AFAIK.
Fiddling a bit with this running total value is all we need in order to get the correct GROUP value.
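The adjustment can be checked with a sketch over the question's input rows. It assumes a sequential id column (which the query itself relies on) and uses SQLite via Python's sqlite3 as a stand-in for SQL Server.

```python
import sqlite3

drops = {
    1: ["1", "2", "ReturnToBase", "4", "5", "ReturnToBase", "6", "7"],
    2: ["1", "2", "ReturnToBase", "4"],
}

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE rounds "
    "(id INTEGER PRIMARY KEY AUTOINCREMENT, DRIVER INTEGER, DROPno TEXT)"
)
conn.executemany(
    "INSERT INTO rounds (DRIVER, DROPno) VALUES (?, ?)",
    [(driver, d) for driver, ds in drops.items() for d in ds],
)

# The running total already includes the current row, so a ReturnToBase row
# would jump to the next group; subtracting its own flag keeps it in the
# group it closes.
rows = conn.execute("""
    SELECT id, DRIVER, DROPno,
           1 + SUM(flag) OVER (PARTITION BY DRIVER ORDER BY id) - flag AS grp
    FROM (
        SELECT id, DRIVER, DROPno,
               CASE WHEN DROPno = 'ReturnToBase' THEN 1 ELSE 0 END AS flag
        FROM rounds
    ) AS t
    ORDER BY id
""").fetchall()

groups = [r[3] for r in rows]
print(groups)  # [1, 1, 1, 2, 2, 2, 3, 3, 1, 1, 1, 2]
```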
EDIT: (credit goes to @Conrad Frix)
Using CROSS APPLY instead of an in-line view can considerably simplify things:
SELECT id, DRIVER, DROPno,
1 + SUM(x.flag) OVER (PARTITION BY DRIVER ORDER BY id) - x.flag
FROM rounds
CROSS APPLY (SELECT CASE WHEN DROPno = 'ReturnToBase' THEN 1 ELSE 0 END) AS x(flag)
Added a sequential ID column to your example for use in a recursive CTE:
with cte as (
select ID,DRIVER,DROPno,1 as GRP
FROM rounds
where ID = 1
union all
select a.ID
,a.DRIVER
,a.DROPno
,case when b.DROPno = 'ReturnToBase'
or b.DRIVER <> a.DRIVER then b.GRP + 1
else b.GRP end
from rounds a
inner join cte b
on a.ID = b.ID + 1
)
select * from cte
