So I have a table that binds ProductId and GroupId. A product can be assigned to any of the 5 groups (1-5).
If a product doesn't appear in the table at all, it isn't assigned to any group.
ProductId | GroupId
-------------------
100 | 1
100 | 2
200 | 1
200 | 2
200 | 3
200 | 4
200 | 5
Looking at this table, we know that the product with id 100 is assigned to 2 groups (1, 2) and the product with id 200 is assigned to all 5 groups (1-5).
I'm trying to write a query that will display each product in a separate row, with one column for each of the 5 groups and a bit value (0/1) indicating whether the product belongs to that group. A visualization of the result I need:
ProductId | IsGroup1 | IsGroup2 | IsGroup3 | IsGroup4 | IsGroup5
-----------------------------------------------------------------
100 | 1 | 1 | 0 | 0 | 0 -- this belongs to groups 1, 2
200 | 1 | 1 | 1 | 1 | 1 -- this belongs to all of the groups
I know I could probably solve it by self-joining 5 times on each distinct product, but I'm wondering if there's a more elegant way of solving it.
Any tips would be greatly appreciated.
You could use a pivot. Since you only have 5 groups you don't need a dynamic pivot.
select
    ProductId
    ,IsGroup1 = iif([1] is null, 0, 1)
    ,IsGroup2 = iif([2] is null, 0, 1)
    ,IsGroup3 = iif([3] is null, 0, 1)
    ,IsGroup4 = iif([4] is null, 0, 1)
    ,IsGroup5 = iif([5] is null, 0, 1)
from
    -- GroupId is duplicated so one copy can be aggregated while the other is spread by the PIVOT
    (select ProductId, GroupId, GroupIdValue = GroupId from mytable) x
pivot
    (max(GroupIdValue) for GroupId in ([1],[2],[3],[4],[5])) p
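If you would rather avoid PIVOT altogether, conditional aggregation produces the same shape of result; a minimal sketch, assuming the same table name mytable:

select
    ProductId
    ,IsGroup1 = max(case when GroupId = 1 then 1 else 0 end)
    ,IsGroup2 = max(case when GroupId = 2 then 1 else 0 end)
    ,IsGroup3 = max(case when GroupId = 3 then 1 else 0 end)
    ,IsGroup4 = max(case when GroupId = 4 then 1 else 0 end)
    ,IsGroup5 = max(case when GroupId = 5 then 1 else 0 end)
from mytable
group by ProductId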
I'm using SQL Server 2008 and trying to gather individual customer data that appears over multiple rows in my table. An example of my data is as follows:
custID | status | type | value
-------------------------
1 | 1 | A | 150
1 | 0 | B | 100
1 | 0 | A | 153
1 | 0 | A | 126
2 | 0 | A | 152
2 | 0 | B | 101
2 | 0 | B | 103
For each custID, my task is to produce a flag for whether status=1 in any row, a flag for whether type=B in any row, and the average of value over all rows where type=B. So my result should look like:
custID | statusFlag | typeFlag | valueAv
-------------------------------------------
1 | 1 | 1 | 100
2 | 0 | 1 | 102
I can get answers for this using lots of row_number() over (partition by ..) to create ids, then creating subtables for each column and selecting the desired id. My issue is that this method is awkward and time-consuming, as I have many more columns than shown above to do this over, and many tables to repeat it for. My ideal solution would be to define my own aggregate function so I could just do:
select custID, ag1(statusFlag), ag2(typeFlag)
group by custID
but as far as I can tell, custom aggregates can't be defined in SQL Server. Is there a nicer general approach to this problem that doesn't require defining lots of ids?
Use CASE WHEN to evaluate the value and apply the aggregate function accordingly:
select custID,
statusFlag = max(status),
typeFlag = max(case when type = 'B' then 1 else 0 end),
valueAv = avg(case when type = 'B' then value end)
from samples
group by custID
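Since you mention having many more columns to derive, note that the same pattern simply extends with one CASE expression per derived column, with no extra row_number() machinery. A sketch with one hypothetical extra column (an average over type A rows):

select custID,
    statusFlag   = max(status),
    typeFlag     = max(case when type = 'B' then 1 else 0 end),
    valueAv      = avg(case when type = 'B' then value end),
    valueAvTypeA = avg(case when type = 'A' then value end) -- hypothetical extra column, same pattern
from samples
group by custID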
For each reading, I want to count the readings for the same server in the preceding ten minutes. I can use a traditional subquery approach to do this. For example:
drop table if exists [dbo].[readings]
go
create table [dbo].[readings](
[server] [int] NOT NULL,
[sampled] [datetime] NOT NULL
)
go
insert into readings
values
(1,'20170101 08:00'),
(1,'20170101 08:02'),
(1,'20170101 08:05'),
(1,'20170101 08:30'),
(1,'20170101 08:31'),
(1,'20170101 08:37'),
(1,'20170101 08:40'),
(1,'20170101 08:41'),
(1,'20170101 09:07'),
(1,'20170101 09:08'),
(1,'20170101 09:09'),
(1,'20170101 09:11')
go
-- Count in the last 10 minutes - example periods 08:31 to 08:40, 09:12 to 09:21
select server,sampled,(select count(*) from readings r2 where r2.server=r1.server and r2.sampled <= r1.sampled and r2.sampled > dateadd(minute,-10,r1.sampled)) as countinlast10minutes
from readings r1
order by server,sampled
go
How can I use a window function to obtain the same result? I've tried this:
select server,sampled,
count(case when sampled <= r1.sampled and sampled > dateadd(minute,-10,r1.sampled) then 1 else null end) over (partition by server order by sampled rows between unbounded preceding and current row) as countinlast10minutes
-- count(case when currentrow.sampled <= r1.sampled and currentrow.sampled > dateadd(minute,-10,r1.sampled) then 1 else null end) over (partition by server order by sampled rows between unbounded preceding and current row) as countinlast10minutes
from readings r1
order by server,sampled
But the result is just the running count. Is there a system variable that refers to the current row, something like currentrow.sampled?
This isn't a very pleasing answer, but one possibility is to first create a helper table containing all the minutes between the first and last sample:
CREATE TABLE #DateTimes(datetime datetime primary key);
WITH E1(N) AS
(
SELECT 1 FROM (VALUES(1),(1),(1),(1),(1),
(1),(1),(1),(1),(1)) V(N)
) -- 1*10^1 or 10 rows
, E2(N) AS (SELECT 1 FROM E1 a, E1 b) -- 1*10^2 or 100 rows
, E4(N) AS (SELECT 1 FROM E2 a, E2 b) -- 1*10^4 or 10,000 rows
, E8(N) AS (SELECT 1 FROM E4 a, E4 b) -- 1*10^8 or 100,000,000 rows
,R(StartRange, EndRange)
AS (SELECT MIN(sampled),
MAX(sampled)
FROM readings)
,N(N)
AS (SELECT ROW_NUMBER()
OVER (
ORDER BY (SELECT NULL)) AS N
FROM E8)
INSERT INTO #DateTimes
SELECT TOP (SELECT 1 + DATEDIFF(MINUTE, StartRange, EndRange) FROM R) DATEADD(MINUTE, N.N - 1, StartRange)
FROM N,
R
ORDER BY N.N; -- take the first row numbers so the generated minutes are contiguous
And then, with that in place, you could use ROWS BETWEEN 9 PRECEDING AND CURRENT ROW:
WITH T1 AS
( SELECT Server,
MIN(sampled) AS StartRange,
MAX(sampled) AS EndRange
FROM readings
GROUP BY Server )
SELECT Server,
sampled,
Cnt
FROM T1
CROSS APPLY
( SELECT r.sampled,
COUNT(r.sampled) OVER (ORDER BY N.datetime ROWS BETWEEN 9 PRECEDING AND CURRENT ROW) AS Cnt
FROM #DateTimes N
LEFT JOIN readings r
ON r.sampled = N.datetime
AND r.server = T1.server
WHERE N.datetime BETWEEN StartRange AND EndRange ) CA
WHERE CA.sampled IS NOT NULL
ORDER BY sampled
The above assumes that there is at most one sample per minute and that all the times are exact minutes. If this isn't true, it would need another table expression that pre-aggregates by datetimes rounded down to the minute.
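For example, a sketch of that pre-aggregation step (assuming the same readings table as above): truncate each sample to its minute and count the samples in that minute. The window query would then join this to #DateTimes and use SUM(SamplesInMinute) over the same ROWS BETWEEN 9 PRECEDING AND CURRENT ROW frame instead of COUNT.

WITH PerMinute AS
( SELECT server,
         DATEADD(MINUTE, DATEDIFF(MINUTE, 0, sampled), 0) AS sampled_minute, -- truncate to the minute
         COUNT(*) AS SamplesInMinute
  FROM readings
  GROUP BY server, DATEADD(MINUTE, DATEDIFF(MINUTE, 0, sampled), 0) )
SELECT server, sampled_minute, SamplesInMinute
FROM PerMinute;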
As far as I know, there is not a simple exact replacement for your subquery using window functions.
Window functions operate on a set of rows and allow you to work with them based on partitions and order.
What you are trying to do isn't the type of partitioning that we can work with in window functions; generating the partitions we would need in order to use window functions in this instance would just result in overly complicated code.
I would suggest cross apply() as an alternative to your subquery.
I am not sure if you meant to restrict your results to within 9 minutes, but with sampled > dateadd(...) that is what is happening in your original subquery.
Here is what a window function could look like based on partitioning your samples into 10 minute windows, along with a cross apply() version.
select
r.server
, r.sampled
, CrossApply = x.CountRecent
, OriginalSubquery = (
select count(*)
from readings s
where s.server=r.server
and s.sampled <= r.sampled
/* doesn't include 10 minutes ago */
and s.sampled > dateadd(minute,-10,r.sampled)
)
, Slices = count(*) over(
/* partition by server, 10 minute slices, not the same thing*/
partition by server, dateadd(minute,datediff(minute,0,sampled)/10*10,0)
order by sampled
)
from readings r
cross apply (
select CountRecent=count(*)
from readings i
where i.server=r.server
/* changed to >= */
and i.sampled >= dateadd(minute,-10,r.sampled)
and i.sampled <= r.sampled
) as x
order by server,sampled
results: http://rextester.com/BMMF46402
+--------+---------------------+------------+------------------+--------+
| server | sampled             | CrossApply | OriginalSubquery | Slices |
+--------+---------------------+------------+------------------+--------+
| 1      | 01.01.2017 08:00:00 | 1          | 1                | 1      |
| 1      | 01.01.2017 08:02:00 | 2          | 2                | 2      |
| 1      | 01.01.2017 08:05:00 | 3          | 3                | 3      |
| 1      | 01.01.2017 08:30:00 | 1          | 1                | 1      |
| 1      | 01.01.2017 08:31:00 | 2          | 2                | 2      |
| 1      | 01.01.2017 08:37:00 | 3          | 3                | 3      |
| 1      | 01.01.2017 08:40:00 | 4          | 3                | 1      |
| 1      | 01.01.2017 08:41:00 | 4          | 3                | 2      |
| 1      | 01.01.2017 09:07:00 | 1          | 1                | 1      |
| 1      | 01.01.2017 09:08:00 | 2          | 2                | 2      |
| 1      | 01.01.2017 09:09:00 | 3          | 3                | 3      |
| 1      | 01.01.2017 09:11:00 | 4          | 4                | 1      |
+--------+---------------------+------------+------------------+--------+
Thanks, Martin and SqlZim, for your answers. I'm going to raise a Connect enhancement request for something like %%currentrow that can be used in window aggregates. I'm thinking this would lead to much simpler and more natural SQL:
select count(case when sampled <= %%currentrow.sampled and sampled > dateadd(minute,-10,%%currentrow.sampled) then 1 else null end) over (...whatever the window is...)
We can already use expressions like this:
select count(case when sampled <= getdate() and sampled > dateadd(minute,-10,getdate()) then 1 else null end) over (...whatever the window is...)
so I'm thinking it would be great if we could reference a column from the current row.
Thank you for taking the time to read my question.
What I want to do is count the rows by a column (State) and get 2 different count columns in the result (one for each possible value, 1 or 0).
For example, I have this table (called “product”) with these rows:
ProductID |Subcategory| State |
101 | 201 | 1 |
102 | 201 | 1 |
103 | 201 | 1 |
104 | 202 | 0 |
105 | 202 | 0 |
106 | 203 | 1 |
107 | 203 | 0 |
108 | 203 | 0 |
State: 1=Active, 0=Inactive
So I want to get this:
|Subcategory| Active|Inactive|
| 201 | 3 | 0 |
| 202 | 0 | 2 |
| 203 | 1 | 2 |
Do you know how I can do this?
I was trying with this query and inserting the results into a temp table:
SELECT product.Subcategory,
COUNT(product.Subcategory) AS Products
FROM product,subcategory,category
WHERE product.Subcategory = subcategory.SubcategoryId
AND product.State = 1 or 0
AND subcategory.Category = category.CategoryId
GROUP BY product.Subcategory
But I get just the products with State = 1 or 0
I tried LEFT JOIN but I couldn't get what I need.
This is what I'm trying (but it is not working):
SELECT P.Subcategory,
COUNT(P.State)
FROM product AS P
LEFT JOIN(
SELECT product.Subcategory AS S,
COUNT(product.State) AS C
FROM product
WHERE product.State = 0
GROUP BY product.Subcategory) AS t
ON t.S = p.Subcategory
I swear I checked other user questions like:
Display zero by using count(*) if no result returned for a particular case
SQL - Returning all rows even if count is zero for item
But I couldn't find what I needed.
Could you please help me with this? =)
Thank you!!!
I would do this using conditional aggregation:
select p.subcategory,
sum(case when state = 1 then 1 else 0 end) as active,
sum(case when state = 0 then 1 else 0 end) as inactive
from product p
group by p.subcategory
You can do that in two ways.
One is by using a conditional aggregate; use Count instead of Sum to avoid the Else part of the case expression:
SELECT subcategory,
Count(CASE WHEN state = 1 THEN 1 END) AS Active,
Count(CASE WHEN state = 0 THEN 1 END) AS InActive
FROM product
GROUP BY subcategory
Another way is by using Pivot
SELECT subcategory,
[1] AS Active,
[0] AS InActive
FROM (SELECT subcategory,
State,
ProductID
FROM product) a
PIVOT(Count(ProductID)
FOR State IN([1],[0])) piv
Try this:
SELECT subcategory,
       Count(State) AS TotalC,
       Sum(State) AS Active,
       Count(State) - Sum(State) AS InActive
FROM product
GROUP BY subcategory;
I am currently using the following query to get some numbers:
SELECT gid, count(gid), (SELECT cou FROM size WHERE gid = infor.gid)
FROM infor
WHERE id = 4325
GROUP BY gid;
The output I am getting at my current stage is the following:
+-----+------------+----------------------------------------------+
| gid | count(gid) | (SELECT cou FROM size WHERE gid = infor.gid) |
+-----+------------+----------------------------------------------+
| 19  | 1          | 19                                           |
| 27  | 4          | 27                                           |
| 556 | 1          | 556                                          |
+-----+------------+----------------------------------------------+
I am trying to calculate the weighted average i.e.
(1*19+4*27+1*556)/(19+27+556)
Is there a way to do this using a single query?
Use:
SELECT SUM(x.num * x.gid) / SUM(x.cou)
FROM (SELECT i.gid,
COUNT(i.gid) AS num,
s.cou
FROM infor i
LEFT JOIN SIZE s ON s.gid = i.gid
WHERE i.id = 4325
GROUP BY i.gid, s.cou) x -- include s.cou in the GROUP BY so the non-aggregated column is valid
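One caveat: if this runs on SQL Server and num, gid and cou are all integer columns, the division above will be integer division and the result will be truncated. A hedged tweak (assuming integer columns) is to force a decimal result:

SELECT SUM(x.num * x.gid) * 1.0 / SUM(x.cou) AS weighted_avg
FROM (SELECT i.gid,
             COUNT(i.gid) AS num,
             s.cou
      FROM infor i
      LEFT JOIN size s ON s.gid = i.gid
      WHERE i.id = 4325
      GROUP BY i.gid, s.cou) x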
You could place your original query as a sub-query and SUM the records. I could not test this as I don't have the dataset you do, but it should work in theory ;)
-- SUM(weights) / SUM(gid) matches (1*19 + 4*27 + 1*556) / (19 + 27 + 556)
SELECT SUM(weights) / SUM(gid) AS calculated_average FROM (
SELECT gid, (COUNT(gid) * gid) AS weights
FROM infor
WHERE id = 4325
GROUP BY gid) AS t; -- the derived table needs an alias