SQL Server GROUP BY multiple columns - sql-server

Let's assume I have in SQL Server the following table with only seven days available (SUN - SAT):
Orders
| Day | ProductType | Price |
| SUN | 1 | 10 |
| MON | 1 | 15 |
| MON | 2 | 20 |
| MON | 3 | 10 |
| TUE | 1 | 5 |
| TUE | 3 | 5 |
...
I need to group the data so that I can see the total of Price for each distinct Day, split into two groups of ProductType (= 1 and > 1):
| Day | FirstProductTypeTotal | RestProductsTypesTotal | GrandTotal |
| SUN | 10 | 0 | 10 |
| MON | 15 | 30 | 45 |
| TUE | 5 | 5 | 10 |
...
where FirstProductTypeTotal covers ProductType = 1 and RestProductsTypesTotal covers ProductType > 1.
Is it possible to select this in one select instead of writing two different selects:
Select Day, SUM(Price) as FirstTotal from Orders where ProductType = 1 group by Day
and
Select Day, SUM(Price) as SecondTotal from Orders where ProductType > 1 group by Day
And then add FirstTotal and SecondTotal manually in the code to get the Grand total for each day of the week?

Use a CASE expression:
SELECT Day, SUM(CASE WHEN ProductType = 1 THEN Price ELSE 0 END) AS FirstTotal,
SUM(CASE WHEN ProductType > 1 THEN Price ELSE 0 END) AS SecondTotal,
SUM(Price) AS GrandTotal
FROM Orders
GROUP BY Day

Try conditional aggregation.
Sample data:
CREATE TABLE #Orders ([Day] varchar(10), ProductType int, Price int)
INSERT INTO #Orders ([Day],ProductType, Price)
VALUES
('SUN',1,10)
,('MON',1,15)
,('MON',2,20)
,('MON',3,10)
,('TUE',1,5)
,('TUE',3,5)
Query:
SELECT
o.[Day]
,SUM(CASE WHEN o.ProductType = 1 THEN o.Price ELSE 0 END) FirstTotal
,SUM(CASE WHEN o.ProductType > 1 THEN o.Price ELSE 0 END) SecondTotal
,SUM(o.Price) GrandTotal
FROM #Orders o
GROUP BY o.[Day]
Result:
Day FirstTotal SecondTotal GrandTotal
MON 15 30 45
SUN 10 0 10
TUE 5 5 10
You'd just need to sort out the ordering of the days, because SQL Server does not guarantee any row order unless the query has an ORDER BY clause.
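One possible way to get the days into calendar order is to sort on a CASE expression that maps each day abbreviation to a position. This is only a sketch: the table name #OrdersDemo is hypothetical (it mirrors the sample data above), and the abbreviations WED, THU, FRI and SAT are assumptions, since the question only shows SUN, MON and TUE.

```sql
-- Hypothetical demo table mirroring the sample data above.
CREATE TABLE #OrdersDemo ([Day] varchar(10), ProductType int, Price int);
INSERT INTO #OrdersDemo ([Day], ProductType, Price)
VALUES ('SUN',1,10), ('MON',1,15), ('MON',2,20), ('MON',3,10), ('TUE',1,5), ('TUE',3,5);

SELECT
    o.[Day]
    ,SUM(CASE WHEN o.ProductType = 1 THEN o.Price ELSE 0 END) AS FirstTotal
    ,SUM(CASE WHEN o.ProductType > 1 THEN o.Price ELSE 0 END) AS SecondTotal
    ,SUM(o.Price) AS GrandTotal
FROM #OrdersDemo o
GROUP BY o.[Day]
-- Map each abbreviation to its position in the week (SUN first).
-- WED..SAT are assumed spellings; adjust them to the real data.
ORDER BY CASE o.[Day]
             WHEN 'SUN' THEN 1
             WHEN 'MON' THEN 2
             WHEN 'TUE' THEN 3
             WHEN 'WED' THEN 4
             WHEN 'THU' THEN 5
             WHEN 'FRI' THEN 6
             WHEN 'SAT' THEN 7
         END;
```

The CASE expression depends only on the grouped column, so it is allowed in the ORDER BY of a grouped query.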

Related

Join tables by Column name

I've got a table with the vehicle mark and the sale date. I need a query that returns how many vehicles of each mark have been sold this year, broken down by month, for example:
| January | February | March | April |..............
----------------------------------------------------------
mark1 | | 1 | 5 | |..............
mark2 | 45 | | | 7 |..............
mark3 | 12 | 11 | 5 | 3 |..............
The original table is:
mark | soldDate
----------------------------
mark1 | 01/07/2020
mark2 | 04/07/2020
mark1 | 05/07/2020
mark3 | 06/07/2020
If I want to count how many vehicles of each mark have been sold in a single month, I use this query:
SELECT mark, COUNT(mark) WHERE FORMAT(soldDate, 'MMMM') = 'january' GROUP BY mark
How can I split the counts across every single month?
With conditional aggregation:
select mark,
count(case when month(soldDate) = 1 then 1 end) as January,
count(case when month(soldDate) = 2 then 1 end) as February,
...........................................................
where year(soldDate) = 2020
group by mark
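As a rough, self-contained illustration of the conditional aggregation above, the sketch below spells out a few month columns against a temporary table. The table name #VehicleSalesDemo is hypothetical (the question does not name its table), and reading the sample dates as day/month/year is an assumption.

```sql
-- Hypothetical demo table; sample dates read as day/month/year (an assumption).
CREATE TABLE #VehicleSalesDemo (mark varchar(20), soldDate date);
INSERT INTO #VehicleSalesDemo (mark, soldDate)
VALUES ('mark1', '20200701'), ('mark2', '20200704'),
       ('mark1', '20200705'), ('mark3', '20200706');

SELECT mark,
       -- COUNT ignores NULLs, so each column counts only the rows of its month.
       COUNT(CASE WHEN MONTH(soldDate) = 1 THEN 1 END) AS January,
       COUNT(CASE WHEN MONTH(soldDate) = 2 THEN 1 END) AS February,
       COUNT(CASE WHEN MONTH(soldDate) = 7 THEN 1 END) AS July
       -- add one column per remaining month in the same way
FROM #VehicleSalesDemo
WHERE soldDate >= '20200101' AND soldDate < '20210101'
GROUP BY mark;
```

The date-range predicate is used here instead of YEAR(soldDate) = 2020 only because it can use an index on soldDate; both select the same rows.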
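Alternatively, return one row per mark and month:
SELECT Mark, DATENAME(MONTH, DATEADD(MONTH, MONTH(SalesDate) - 1, '1900-01-01')) AS M, COUNT(*) AS [Count]
FROM VehicleSales
WHERE YEAR(SalesDate) = 2020
GROUP BY Mark, MONTH(SalesDate)
ORDER BY Mark, MONTH(SalesDate) -- calendar order rather than alphabetical by month name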

SQL Query with Average and Grouping

I just want to ask you guys, especially those with MsSQL knowledge, regarding my query.
My goal is to get the average delivery time and group my data by delivery date and route id daily/weekly/monthly.
Here's my query:
SELECT
RouteID,
CONVERT(date, [DeliveryDate]) AS delivery_date,
AVG(
DATEDIFF(
day,
CONVERT(date, [UnloadDate]),
CONVERT(date, [DeliveryDate])
)
) as Average_Delivery_Time
FROM [CARGODB].[dbo].[Cargo_Transactions]
WHERE
[DeliveryDate] IS NOT NULL AND
[UnloadDate] != 0 AND
[StageID] = 'D' AND
( CONVERT(date, [DeliveryDate]) LIKE '%2016%' or
CONVERT(date, [DeliveryDate]) LIKE '%2017%')
GROUP BY CONVERT(date, [DeliveryDate]), [RouteID]
ORDER BY CONVERT(date, [DeliveryDate]) DESC
I am not confident that the average delivery time is correct, so if you think it's wrong or there are other things in my query that need to be corrected, please let me know.
UPDATE:
I was able to get the right query:
SELECT [RouteID],
CAST(DATEPART(YEAR,[DeliveryDate]) as varchar) + ' Week ' +
CAST(DATEPART(WEEK,[DeliveryDate]) AS varchar) AS week_name,
AVG(DATEDIFF(day, CONVERT(date, [UnloadDate]), CONVERT(date,
[DeliveryDate]))) as Average_Delivery_Days
FROM [CARGODB].[dbo].[Cargo_Transactions]
WHERE [DeliveryDate] IS NOT NULL AND [DeliveryDate] != 0
AND CONVERT(date, [DeliveryDate]) BETWEEN '2016-01-01' AND GETDATE()
AND [UnloadDate] IS NOT NULL AND [UnloadDate] != 0 AND [DeliveryDate] >
[UnloadDate]
AND [Deleted] = 0 and [StageID] = 'D'
GROUP BY DATEPART(YEAR,[DeliveryDate]), DATEPART(WEEK,[DeliveryDate]),
[RouteID]
ORDER BY DATEPART(YEAR,[DeliveryDate]), DATEPART(WEEK,[DeliveryDate]),
Average_Delivery_Days desc
But I have a more complicated query to do now. I have this sample data:
RouteID | week_name | yearnum | weeknum | Average_Delivery_Days
=======================================================================
MK | 2016 Week 2 | 2016 | 2 | 1
-----------------------------------------------------------------------
TSM | 2016 Week 2 | 2016 | 2 | 1
-----------------------------------------------------------------------
E | 2016 Week 2 | 2016 | 2 | 1
-----------------------------------------------------------------------
A | 2016 Week 2 | 2016 | 2 | 1
-----------------------------------------------------------------------
D | 2016 Week 2 | 2016 | 2 | 1
-----------------------------------------------------------------------
MP | 2016 Week 2 | 2016 | 2 | 1
-----------------------------------------------------------------------
CTN | 2016 Week 3 | 2016 | 3 | 9
-----------------------------------------------------------------------
BIS | 2016 Week 3 | 2016 | 3 | 8
-----------------------------------------------------------------------
C | 2016 Week 3 | 2016 | 3 | 1
-----------------------------------------------------------------------
PN | 2016 Week 4 | 2016 | 4 |10
-----------------------------------------------------------------------
How can I make the above data be like:
MK and TSM are merged into 1 new routeID like Manila1
E, A, and D are merged into another as Manila2
MP, CTN, AND BIS as Visayas
C and PN as Mindanao
and so on..
And the average delivery days will be changed as well.
Your help is highly appreciated. Thank you!
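A possible starting point, sketched under the assumption that the merged groups are exactly those listed above: map each RouteID to its group with a CASE expression before aggregating, so that the average is recomputed from the underlying rows instead of averaging the per-route averages. The column aliases RouteGroup and DeliveryDays are hypothetical, and the filters are copied from the updated query.

```sql
WITH Grouped AS (
    SELECT
        -- Mapping taken from the examples in the question; extend as needed.
        CASE
            WHEN [RouteID] IN ('MK', 'TSM')        THEN 'Manila1'
            WHEN [RouteID] IN ('E', 'A', 'D')      THEN 'Manila2'
            WHEN [RouteID] IN ('MP', 'CTN', 'BIS') THEN 'Visayas'
            WHEN [RouteID] IN ('C', 'PN')          THEN 'Mindanao'
            ELSE [RouteID]                         -- unmapped routes keep their own ID
        END AS RouteGroup,
        DATEPART(YEAR, [DeliveryDate]) AS yearnum,
        DATEPART(WEEK, [DeliveryDate]) AS weeknum,
        DATEDIFF(day, CONVERT(date, [UnloadDate]), CONVERT(date, [DeliveryDate])) AS DeliveryDays
    FROM [CARGODB].[dbo].[Cargo_Transactions]
    WHERE [DeliveryDate] IS NOT NULL AND [DeliveryDate] != 0
      AND CONVERT(date, [DeliveryDate]) BETWEEN '2016-01-01' AND GETDATE()
      AND [UnloadDate] IS NOT NULL AND [UnloadDate] != 0
      AND [DeliveryDate] > [UnloadDate]
      AND [Deleted] = 0 AND [StageID] = 'D'
)
SELECT RouteGroup, yearnum, weeknum,
       AVG(DeliveryDays) AS Average_Delivery_Days
FROM Grouped
GROUP BY RouteGroup, yearnum, weeknum
ORDER BY yearnum, weeknum, Average_Delivery_Days DESC;
```

Note that AVG over an integer column returns an integer in SQL Server; averaging DeliveryDays * 1.0 instead would keep the fractional part.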

Postgres: Calculate ratio of table entries that match condition

I have the following two tables in a PostgreSQL database:
dummy=# select * from employee;
id | name
----+-------
1 | John
2 | Susan
3 | Jim
4 | Sarah
(4 rows)
dummy=# select * from stats;
id | arrival | day | employee_id
----+----------+------------+-------------
2 | 08:31:34 | monday | 2
4 | 08:15:00 | monday | 3
5 | 08:43:00 | monday | 4
1 | 08:34:00 | monday | 1
7 | 08:29:00 | midweek | 1
8 | 08:31:00 | midweek | 2
9 | 08:10:00 | midweek | 3
10 | 08:40:00 | midweek | 4
11 | 08:28:00 | midweek | 1
12 | 08:33:00 | midweek | 2
14 | 08:21:00 | midweek | 3
15 | 08:45:00 | midweek | 4
16 | 08:25:00 | midweek | 1
17 | 08:35:00 | midweek | 2
18 | 08:44:00 | midweek | 4
19 | 08:10:00 | friday | 1
20 | 08:40:00 | friday | 2
21 | 08:30:00 | friday | 3
22 | 08:30:00 | friday | 4
(19 rows)
I want to select all employees that arrive between 8:25 and 8:35 on midweek and friday. I can accomplish that relatively easily with the following query:
SELECT * FROM stats
WHERE
arrival >= (time '8:30' - interval '5 minutes')
AND
arrival <= (time '8:30' + interval '5 minutes')
AND
(day = 'midweek' or day = 'friday');
However, an additional criterion is that I only want to select those employees that arrive at least 60% of the time within the aforementioned time window. This is where I am stuck. I do not know how to calculate that ratio.
What does a query that fulfills all the criteria look like?
CLARIFICATION
Apparently the above description of the ratio is misleading.
When calculating the ratio, only the rows that meet the criterion (day = 'midweek' or day = 'friday') shall be considered. So in the sample data John and Susan each show up for work four times on midweek and friday. Three out of those four times they are punctual. Hence, the ratio for Susan and John is 75%.
Use a common table expression to calculate needed counts, e.g.
with in_time as (
select *
from stats
where arrival >= (time '8:30' - interval '5 minutes')
and arrival <= (time '8:30' + interval '5 minutes')
and (day = 'midweek' or day = 'friday')
),
count_in_time as (
select employee_id, count(*)
from in_time
group by employee_id
),
total_count as (
select employee_id, count(*)
from stats
where day = 'midweek' or day = 'friday'
group by employee_id
)
select
i.*,
c.count as in_time,
t.count as total_count,
round(c.count* 100.0/t.count, 2) as ratio
from in_time i
join count_in_time c using(employee_id)
join total_count t using(employee_id);
Results:
id | arrival | day | employee_id | in_time | total_count | ratio
----+----------+---------+-------------+---------+-------------+-------
16 | 08:25:00 | midweek | 1 | 3 | 4 | 75.00
11 | 08:28:00 | midweek | 1 | 3 | 4 | 75.00
7 | 08:29:00 | midweek | 1 | 3 | 4 | 75.00
17 | 08:35:00 | midweek | 2 | 3 | 4 | 75.00
12 | 08:33:00 | midweek | 2 | 3 | 4 | 75.00
8 | 08:31:00 | midweek | 2 | 3 | 4 | 75.00
21 | 08:30:00 | friday | 3 | 1 | 3 | 33.33
22 | 08:30:00 | friday | 4 | 1 | 4 | 25.00
(8 rows)
You can add an appropriate condition in the WHERE clause of the final query.
If you want to get aggregated data only with employees and their ratios, use count() with filter:
select employee_id, name, in_time* 1.0/ total as ratio
from (
select
employee_id,
count(*) filter (where arrival >= time '8:30' - interval '5 minutes' and arrival <= time '8:30' + interval '5 minutes') as in_time,
count(*) as total
from stats
where day in ('midweek', 'friday')
group by 1
) s
join employee e on e.id = s.employee_id
where in_time* 1.0/ total >= 0.6;
employee_id | name | ratio
-------------+-------+------------------------
1 | John | 0.75000000000000000000
2 | Susan | 0.75000000000000000000
(2 rows)
You can get the arrival rate like so, for example:
SELECT name,
AVG(CASE WHEN arrival >= (time '8:30' - interval '5 minutes') AND
arrival <= (time '8:30' + interval '5 minutes') THEN 1 ELSE 0 END) AS arrival_rate
FROM employee
INNER JOIN stats ON stats.employee_id = employee.id
WHERE stats.day IN ('midweek', 'friday') -- per the clarification, only these days count
GROUP BY name
To keep only the employees whose rate is at least 60%, add a HAVING condition:
SELECT name,
AVG(CASE WHEN arrival >= (time '8:30' - interval '5 minutes') AND
arrival <= (time '8:30' + interval '5 minutes') THEN 1 ELSE 0 END) AS arrival_rate
FROM employee
INNER JOIN stats ON stats.employee_id = employee.id
WHERE stats.day IN ('midweek', 'friday')
GROUP BY name
HAVING
AVG(CASE WHEN arrival >= (time '8:30' - interval '5 minutes') AND
arrival <= (time '8:30' + interval '5 minutes') THEN 1 ELSE 0 END)
>= 0.6

T-SQL Pivot table

I’ve a table MachineStatus which stores the status history of a machine.
The table looks like this:
| MachineStatusId | From | To | State | MachineId |
----------------------------------------------------------------------------------------------------------------------------------------
| B065FC43-DBE7-E611-9BDB-801F02F47041 | 2017-01-30 07:00:00 | 2017-01-30 08:00:00 | 1 | 92649C7B-E962-4EB1-B631-00086EECA98A |
| B165FC43-DBE7-E611-9BDB-801F02F47041 | 2017-01-30 08:00:00 | 2017-01-30 09:00:00 | 200 | 92649C7B-E962-4EB1-B631-00086EECA98A |
| B265FC43-DBE7-E611-9BDB-801F02F47041 | 2017-01-30 07:00:00 | 2017-01-30 08:00:00 | 1 | A2649C7B-E962-4EB1-B631-00086EECA98A |
| B365FC43-DBE7-E611-9BDB-801F02F47041 | 2017-01-30 08:00:00 | 2017-01-30 09:00:00 | 500 | A2649C7B-E962-4EB1-B631-00086EECA98A |
For each machine and each status change, it stores a record with the interval ([From] to [To]) during which a certain [State] was valid.
I would like to calculate the time each machine spent in each state.
The result should look like this:
| MachineId | Alias | State1 | State200 | State500 |
-------------------------------------------------------------------------------------------------
| 92649C7B-E962-4EB1-B631-00086EECA98A | Somename | 60 | 60 | 0 |
| A2649C7B-E962-4EB1-B631-00086EECA98A | Some other name | 60 | 0 | 60 |
Each state should be represented as a column.
Here is what I have tried so far:
SELECT
MAX(mState.MachineId),
MAX(m.Alias),
SUM(CASE mState.State WHEN 1 THEN mState.Diff ELSE 0 END) AS CritTime,
SUM(CASE mState.State WHEN 200 THEN mState.Diff ELSE 0 END) AS OpTime,
SUM(CASE mState.State WHEN 500 THEN mState.Diff ELSE 0 END) AS OtherTime
FROM
(
SELECT
DATEDIFF(MINUTE, ms.[From], ISNULL(ms.[To], GETDATE())) AS Diff,
ms.State AS State,
MachineId
FROM
MachineStatus ms
WHERE
ms.[From] >= @rangeFrom AND
(ms.[To] <= @rangeEnd OR ms.[To] IS NULL)
) as mState
INNER JOIN Machines m ON m.MachineId = mState.MachineId
GROUP BY
mState.MachineId,
m.Alias,
mState.State
Calculating the time and grouping the result by machine works, but I cannot figure out how to reduce the result set so that it contains only one row per machine, with one column per state.
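A minimal sketch of one possible fix that keeps the conditional aggregation from the attempt above: group by the machine only, because grouping by State as well produces one row per machine and state. The variable values are hypothetical placeholders, and the column aliases follow the desired result table.

```sql
-- Hypothetical range; replace with the real parameters.
DECLARE @rangeFrom datetime = '2017-01-30T00:00:00';
DECLARE @rangeEnd  datetime = '2017-01-31T00:00:00';

SELECT
    mState.MachineId,
    m.Alias,
    SUM(CASE mState.State WHEN 1   THEN mState.Diff ELSE 0 END) AS State1,
    SUM(CASE mState.State WHEN 200 THEN mState.Diff ELSE 0 END) AS State200,
    SUM(CASE mState.State WHEN 500 THEN mState.Diff ELSE 0 END) AS State500
FROM
(
    SELECT
        DATEDIFF(MINUTE, ms.[From], ISNULL(ms.[To], GETDATE())) AS Diff,
        ms.State,
        ms.MachineId
    FROM MachineStatus ms
    WHERE ms.[From] >= @rangeFrom
      AND (ms.[To] <= @rangeEnd OR ms.[To] IS NULL)
) AS mState
INNER JOIN Machines m ON m.MachineId = mState.MachineId
-- State is deliberately left out of the GROUP BY: one row per machine.
GROUP BY
    mState.MachineId,
    m.Alias;
```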
I started from your subquery, without applying any SUM to the calculated data:
SELECT m.MachineId,
m.Alias,
Minutes,
s.State
FROM machines m
INNER JOIN states s ON m.MachineId = s.MachineId
Then you can PIVOT on [State] and take the SUM() for each state, like this:
WITH Calc AS
(
SELECT m.MachineId,
m.Alias,
Minutes,
s.State
FROM machines m
INNER JOIN states s ON m.MachineId = s.MachineId
)
SELECT MachineId, Alias, [State1], [State2], [State500]
FROM
(SELECT MachineId, Alias, State, Minutes FROM Calc) AS SourceTable
PIVOT
(
SUM(Minutes) FOR State IN ([State1],[State2],[State500])
) AS PivotTable;
This is the result:
+--------------------------------------+---------+--------+--------+----------+
| MachineId | Alias | State1 | State2 | State500 |
+--------------------------------------+---------+--------+--------+----------+
| 92649C7B-E962-4EB1-B631-00086EECA98A | Alias 1 | 100 | 100 | 100 |
+--------------------------------------+---------+--------+--------+----------+
| A2649C7B-E962-4EB1-B631-00086EECA98A | Alias 2 | 10 | 20 | 70 |
+--------------------------------------+---------+--------+--------+----------+
Note that you must know in advance which states appear in your data.
You can check it here: http://rextester.com/DHDX77489
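The PIVOT above runs against the answer's own demo tables. Applied to the MachineStatus and Machines tables from the question, it might look like the sketch below. This assumes State is numeric, so the pivoted column names are [1], [200] and [500]; the variable values are hypothetical placeholders.

```sql
-- Hypothetical range; replace with the real parameters.
DECLARE @rangeFrom datetime = '2017-01-30T00:00:00';
DECLARE @rangeEnd  datetime = '2017-01-31T00:00:00';

SELECT p.MachineId,
       p.Alias,
       -- PIVOT yields NULL for states a machine never entered; show 0 instead.
       ISNULL(p.[1],   0) AS State1,
       ISNULL(p.[200], 0) AS State200,
       ISNULL(p.[500], 0) AS State500
FROM (
    SELECT ms.MachineId,
           m.Alias,
           ms.State,
           DATEDIFF(MINUTE, ms.[From], ISNULL(ms.[To], GETDATE())) AS Minutes
    FROM MachineStatus ms
    INNER JOIN Machines m ON m.MachineId = ms.MachineId
    WHERE ms.[From] >= @rangeFrom
      AND (ms.[To] <= @rangeEnd OR ms.[To] IS NULL)
) AS src
PIVOT (
    SUM(Minutes) FOR State IN ([1], [200], [500])
) AS p;
```

PIVOT groups implicitly by every source column that is neither aggregated nor pivoted, here MachineId and Alias, so the derived table should expose only the columns that are needed.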

Complex SQL query, not sure where to start

I have a tough one here I think. I have the following tables:
[Assets]
AssetId | Name
1 | Acura NSX
2 | Dodge Ram
[Assignments]
AssignmentId | AssetId | StartMileage | EndMileage | StartDate | EndDate
1 | 1 | 8000 | 10000 | 4/1/2015 | 5/1/2015
2 | 1 | 10000 | 16000 | 9/15/2015 | 1/5/2016
3 | 2 | 51000 | NULL | 1/1/2016 | NULL
[Reminders]
ReminderId | AssetId | Name | Distance | Time | Active
1 | 1 | Oil Change | 3000 (miles)| 3 (months)| 1
2 | 1 | Tire Rotation | 5000 | 6 | 0
3 | 2 | Oil Change | 3000 | 3 | 1
4 | 2 | Air Filter | 50000 | 48 | 1
[Maintenance]
MaintenanceId | AssetId | ReminderId | Mileage | Date | Vendor
1 | 1 | 1 | 10000 | 5/1/2015 | Jiffy Lube
2 | 2 | 3 | 51000 | 6/1/2015 | Dealership
I need a query that will join these 4 tables and return something like the following.
Name | Name | Current Mileage | Last Mileage | Last Date
Acura NSX | Oil Change | 16000 | 10000 | 5/1/2015
Dodge RAM | Oil Change | 51000 | 51000 | 6/1/2015
Dodge RAM | Air Filter | 51000 | -- | --
I need to take the distance threshold from the Reminders table and add it to the mileage from the Maintenance table then compare it to the start and end mileage from the Assignments table. If the threshold is greater than the start or end mileage then select the asset name, the name of the reminder, the current mileage (start or end mileage from Assignments, whichever is greater), and mileage and date from the last maintenance for that reminder. I need to do the same for time threshold. Add it to the date from the Maintenance table then compare it to today's date. If it's greater then display the asset.
Can one of you SQL gurus help me with this please?
UPDATE:
SELECT
v.Name,
r.Name AS Reminder,
a.CurrentMileage,
i.MaintenanceMileage,
i.MaintenanceDate
FROM
Assets v
LEFT JOIN
(SELECT AssetId,
COALESCE(EndMileage, StartMileage) AS CurrentMileage,
ROW_NUMBER() OVER (PARTITION BY AssetId
ORDER BY AssignmentId DESC) AS window_id
FROM Assignments) a
ON v.AssetId = a.AssetId
AND a.window_id = 1
JOIN
Reminders r
ON v.AssetId = r.AssetId
AND r.ActiveFlag = 1
LEFT JOIN
(SELECT AssetId,
ReminderId,
MAX(Mileage) AS MaintenanceMileage,
MAX([Date]) AS MaintenanceDate
FROM Maintenances
GROUP BY AssetId, ReminderId) i
ON r.ReminderId = i.ReminderId
AND (a.CurrentMileage > (NULLIF(i.MaintenanceMileage, 0) + r.DistanceThreshold))
OR (GETDATE() > DATEADD(m, r.[TimeThreshold], i.MaintenanceDate))
Here is a starting point:
SELECT v.Name AS [Asset Name], r.Name AS Reminder, a.CurrentMileage,
m.Mileage + r.Distance AS [Last Mileage], m.[Date] AS [Last Date]
FROM Assets v
JOIN ( -- get the latest relevant row as window_id = 1
SELECT AssetId, COALESCE(EndMileage, StartMileage) AS CurrentMileage,
COALESCE(EndDate, StartDate) AS AssignDate,
ROW_NUMBER() OVER (partition by AssetId
order by COALESCE(EndDate, StartDate) DESC) AS window_id
FROM Assignments
) a
ON v.AssetId = a.AssetId
AND a.window_id = 1
JOIN Reminders r
ON v.AssetId = r.AssetId
AND r.Active = 1
LEFT JOIN Maintenance m
ON r.AssetId = m.AssetId
AND r.ReminderId = m.ReminderId
-- corrected
AND ((a.CurrentMileage > (NULLIF(m.Mileage, 0) + r.Distance))
-- slightly oversimplified
OR (GETDATE() > DATEADD(MONTH, r.[Time], COALESCE(m.[Date], a.AssignDate))))
The date calculations are slightly oversimplified because they use the latest assignment dates. What you would really want is a column Assets.InServiceDate that would anchor the time before the first maintenance would be due. But this will get you started.

Resources