TSQL Finding maximum values in multiple rows - sql-server

In my query I need to find the supplier with the highest costs for every single year.
SELECT YEAR(ORDERS.OrderDate),
MAX(ORDERS.Freight) AS [Greatest cost]
FROM ORDERS
GROUP BY YEAR(ORDERS.OrderDate)
ORDER BY YEAR(ORDERS.OrderDate) ASC
This code does give me the maximum cost per year, but it doesn't give me the name of the supplier.
SELECT YEAR(ORDERS.OrderDate),
SHIPPERS.ShipperID,
SHIPPERS.CompanyName,
MAX(ORDERS.Freight) AS [Greatest cost]
FROM ORDERS, SHIPPERS
WHERE SHIPPERS.ShipperID = ORDERS.ShipVia
GROUP BY YEAR(ORDERS.OrderDate),
SHIPPERS.ShipperID,
SHIPPERS.CompanyName
ORDER BY YEAR(ORDERS.OrderDate) ASC
This code then gave me too much: it returned every supplier (with their highest cost) for each year, whereas I need only the supplier with the highest cost per year.
Thanks in advance!

There are likely several ways to do this. Here's one: http://sqlfiddle.com/#!6/47d38/3/0
Test data:
create table ORDERS
(
OrderDate datetime,
ShipVia int,
Freight int
);
create table SHIPPERS
(
ShipperID int,
CompanyName nvarchar(100)
);
insert SHIPPERS values (1, 'Shipper1'), (2, 'Shipper2'), (3, 'Shipper3');
insert ORDERS values
('2011-2-1', 1, 10),
('2011-3-1', 1, 20),
('2011-2-2', 2, 5),
('2011-3-2', 2, 10),
('2011-2-3', 3, 18),
('2012-2-1', 1, 10),
('2012-3-1', 1, 20),
('2012-2-2', 2, 25),
('2012-3-2', 2, 40),
('2012-2-3', 3, 18);
Query:
with A as
(
select
YEAR(O.OrderDate) as [year],
S.ShipperID,
SUM(O.Freight) as [totalFreight]
from ORDERS as O
join SHIPPERS as S on O.ShipVia = S.ShipperId
group by YEAR(O.OrderDate), S.ShipperId
)
select A.*, S.CompanyName
from A
join SHIPPERS as S on A.ShipperID = S.ShipperID
where A.totalFreight >=ALL
(select totalFreight from A as Ainner where A.[year] = Ainner.[year]);
Results:
year ShipperID totalFreight CompanyName
2011 1 30 Shipper1
2012 2 65 Shipper2

select * from
(
SELECT YEAR(ORDERS.OrderDate) as [year],
SHIPPERS.ShipperID,
SHIPPERS.CompanyName,
MAX(ORDERS.Freight) over (partition by YEAR(ORDERS.OrderDate), SHIPPERS.ShipperID) as [max],
row_number() over (partition by YEAR(ORDERS.OrderDate)
order by ORDERS.Freight desc) as rn
FROM ORDERS, SHIPPERS
WHERE SHIPPERS.ShipperID = ORDERS.ShipVia
) tt
where tt.rn = 1
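If "highest costs" means the total freight per supplier per year (as in the first answer), the same result can also be reached by ranking the yearly totals. The following is only a sketch against the test data above; the CTE names totals and ranked and the alias rnk are made up for illustration. RANK() is used rather than ROW_NUMBER() so that tied suppliers are all returned, which matches the behaviour of the >= ALL version.

```sql
-- Sketch: total freight per shipper per year, then keep the top-ranked shipper(s).
WITH totals AS
(
    SELECT YEAR(O.OrderDate) AS [year],
           S.ShipperID,
           S.CompanyName,
           SUM(O.Freight) AS totalFreight
    FROM ORDERS AS O
    JOIN SHIPPERS AS S ON S.ShipperID = O.ShipVia
    GROUP BY YEAR(O.OrderDate), S.ShipperID, S.CompanyName
),
ranked AS
(
    -- rnk = 1 marks the shipper(s) with the largest total in each year
    SELECT t.*,
           RANK() OVER (PARTITION BY t.[year] ORDER BY t.totalFreight DESC) AS rnk
    FROM totals AS t
)
SELECT [year], ShipperID, CompanyName, totalFreight
FROM ranked
WHERE rnk = 1
ORDER BY [year];
```

With the test data this should return Shipper1 (30) for 2011 and Shipper2 (65) for 2012, the same rows as the results listed above.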

Related

CTE - LEFT OUTER JOIN Performance Problem

Using SQL Server 2017.
SQL FIDDLE: LINK
CREATE TABLE [TABLE_1]
(
PLAN_NR decimal(28,6) NULL,
START_DATE datetime NULL,
);
CREATE TABLE [TABLE_2]
(
PLAN_NR decimal(28,6) NULL,
PERIOD_NR decimal(28,6) NULL,
);
INSERT INTO TABLE_1 (PLAN_NR, START_DATE)
VALUES (1, '2020-05-01'), (2, '2020-08-05');
INSERT INTO TABLE_2 (PLAN_NR, PERIOD_NR)
VALUES (1, 1), (1, 2), (1, 5), (1, 6), (1, 5), (1, 6), (1, 17),
(2, 2), (2, 3), (2, 5), (2, 2), (2, 17), (2, 28);
CREATE VIEW ALL_PERIODS
AS
WITH rec_cte AS
(
SELECT
PLAN_NR, START_DATE,
1 period_nr, DATEADD(day, 7, START_DATE) next_date
FROM
TABLE_1
UNION ALL
SELECT
PLAN_NR, next_date,
period_nr + 1, DATEADD(day, 7, next_date)
FROM
rec_cte
WHERE
period_nr < 100
),
cte1 AS
(
SELECT
PLAN_NR, period_nr, START_DATE
FROM
rec_cte
UNION ALL
SELECT
PLAN_NR, period_nr, DATEADD(DAY, 1, EOMONTH(next_date, -1))
FROM
rec_cte
WHERE
MONTH(START_DATE) <> MONTH(next_date)
),
cte2 AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY PLAN_NR ORDER BY START_DATE) rn
FROM cte1
)
SELECT PLAN_NR, rn PERIOD_NR, START_DATE
FROM cte2
WHERE rn <= 100
Table_1 lists plans (PLAN_NR) and their start date (START_DATE).
Table_2 lists plan numbers (PLAN_NR) and periods (1 - X). Per plan number periods can appear several times but can also be missing.
A period lasts seven days, unless the period includes a change of month. Then the period is divided into a part before the end of the month and a part after the end of the month.
The view ALL_PERIODS lists 100 periods per plan according to this system.
My problem is the performance of the following select which I would like to use in a view:
SELECT
t2.PLAN_NR
, t2.PERIOD_NR
, a_p.START_DATE
from TABLE_2 as t2
left outer join ALL_PERIODS a_p on t2.PERIOD_NR = a_p.PERIOD_NR and t2.PLAN_NR = a_p.PLAN_NR
From about 4000 entries in TABLE_2 the select becomes incredibly slow.
The join itself does not yet slow down the query. Only with the additional select a_p.START_DATE everything becomes incredibly slow.
I read the view into a temporary table and did the join over that and got no performance issues. (2 seconds for the 4000 entries).
So I assume that the CTE used in the view is the reason for the slow performance.
Unfortunately I can't use temporary tables in views and I would hate to write the data to a normal table.
Is there a way in SQL Server to improve the CTE lag?
Instead of a recursive CTE, generate ALL_PERIODS with a CROSS JOIN between the plan table (TABLE_1) and a "number table", either persisted or built as a non-recursive CTE.
E.g.:
WITH N As
(
select top 100 row_number() over (order by (select null)) i
from (values (1),(2),(3),(4),(5),(6),(7),(8),(9),(10) ) v1(i),
(values (1),(2),(3),(4),(5),(6),(7),(8),(9),(10) ) v2(i)
),
plan_period AS
(
SELECT
PLAN_NR, DATEADD(day, 7*(N.i-1), START_DATE) START_DATE,
N.i period_nr, DATEADD(day, 7*N.i, START_DATE) next_date
FROM TABLE_1 CROSS JOIN N
),
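Combined with the remaining CTEs from the question, a complete non-recursive view might look like the sketch below. It is an untested assumption about how the pieces fit together, not the answerer's exact code: the view name ALL_PERIODS_TALLY is hypothetical, each period's start is computed as START_DATE plus 7*(i-1) days so that the rows match what the recursive version produced, and the row number is cast to int before it is used in DATEADD.

```sql
-- Sketch: ALL_PERIODS rebuilt on an inline tally of 1..100 instead of recursion.
CREATE VIEW ALL_PERIODS_TALLY   -- hypothetical name, chosen to avoid clashing with ALL_PERIODS
AS
WITH N AS
(
    SELECT TOP (100)
           CAST(ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS int) AS i
    FROM (VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10)) AS v1(i)
    CROSS JOIN (VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10)) AS v2(i)
),
plan_period AS
(
    -- one row per plan and seven-day period
    SELECT t.PLAN_NR,
           DATEADD(day, 7 * (N.i - 1), t.START_DATE) AS START_DATE,
           DATEADD(day, 7 * N.i, t.START_DATE)       AS next_date
    FROM TABLE_1 AS t
    CROSS JOIN N
),
cte1 AS
(
    SELECT PLAN_NR, START_DATE
    FROM plan_period
    UNION ALL
    -- extra row starting on the first of the month when a period crosses a month boundary
    SELECT PLAN_NR, DATEADD(DAY, 1, EOMONTH(next_date, -1))
    FROM plan_period
    WHERE MONTH(START_DATE) <> MONTH(next_date)
),
cte2 AS
(
    SELECT PLAN_NR, START_DATE,
           ROW_NUMBER() OVER (PARTITION BY PLAN_NR ORDER BY START_DATE) AS rn
    FROM cte1
)
SELECT PLAN_NR, rn AS PERIOD_NR, START_DATE
FROM cte2
WHERE rn <= 100;
```

Whether this is fast enough for the join against TABLE_2 would have to be measured; a persisted number table gives the optimizer more to work with than the inline VALUES lists.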
If you are able to modify the view, I would recommend the following.
Add a table containing numbers from 0 up to whatever you think you will need in the database. You can use the command below:
create table numbers (id int)
go
;with cte as (
select 0 as num
union all
select num + 1
from cte
where num < 2000 -- change this
)
insert into numbers (id)
select num from cte
option (maxrecursion 0) -- lift the default limit of 100 recursion levels
Then change the first CTE in the view to this:
WITH rec_cte AS
(
SELECT
PLAN_NR
, DATEADD(DAY, 7* id, START_DATE) START_DATE
, id +1 period_nr
, DATEADD(DAY, 7*( id+1), START_DATE) next_date
FROM
TABLE_1 t
CROSS JOIN numbers i
WHERE i.id <100
),...
Also consider using a temp table instead of the CTE; it might help.

TSQL: Group by one column, count all rows and keep value on second column based on row_number

I have a query that returns an Id, a Name and the Row_Number() based on some rules.
The query looks like this:
SELECT
tm.id AS Id,
pn.Name AS Name,
ROW_NUMBER() OVER(PARTITION BY tm.id ORDER BY tm.CreatedDate ASC) AS Row
FROM
#tempTable AS tm
LEFT JOIN
names pn WITH (NOLOCK) ON tm.nameId = pn.NameId
WHERE ....
The output of the above query looks like the table below with the dummy data
CREATE TABLE people
(
id int,
name varchar(55),
row int
);
INSERT INTO people
VALUES (1, 'John', 1), (1, 'John', 2), (2, 'Mary', 1),
(3, 'Jeff', 1), (4, 'Bill', 1), (4, 'Bill', 2),
(4, 'Bill', 3), (4, 'Billy', 4), (5, 'Bobby', 1),
(5, 'Bob', 2), (5, 'Bob' , 3), (5, 'Bob' , 4);
What I'm trying to do is group by the id field and count all rows, but for the name use the one with row = 1.
My attempt is below, but, obviously, I get a separate row per name since I include x.name in the GROUP BY.
SELECT
x.id,
x.name,
COUNT(*) AS Value
FROM
(SELECT
tm.id AS Id,
pn.Name AS Name,
ROW_NUMBER() OVER(PARTITION BY tm.id ORDER BY tm.CreatedDate ASC) AS Row
FROM
#tempTable AS tm
LEFT JOIN
names pn WITH (NOLOCK) ON tm.nameId = pn.NameId
WHERE ....
) x
GROUP BY
x.id, x.name
ORDER BY
COUNT(*) DESC
The desired results from the dummy data are:
id name count
------------------
1 John 2
2 Mary 1
3 Jeff 1
4 Bill 4
5 Bobby 4
You can use FIRST_VALUE() window function to get the name of the row with row number = 1 and with the keyword DISTINCT there is no need to GROUP BY:
SELECT DISTINCT tm.id AS Id
, FIRST_VALUE(pn.Name) OVER (PARTITION BY tm.id ORDER BY tm.CreatedDate ASC) AS Name
, COUNT(*) OVER (PARTITION BY tm.id) AS counter
FROM #tempTable AS tm
LEFT JOIN names pn WITH (NOLOCK) ON tm.nameId = pn.NameId
WHERE ....
If you can't use FIRST_VALUE() then you can do it with conditional aggregation:
SELECT id,
MAX(CASE WHEN Row = 1 THEN Name END) AS NAME,
COUNT(*) AS Counter
FROM (
SELECT tm.id AS Id
, pn.Name AS Name
, ROW_NUMBER() OVER(PARTITION BY tm.id ORDER BY tm.CreatedDate ASC) AS Row
FROM #tempTable AS tm
LEFT JOIN names pn WITH (NOLOCK) ON tm.nameId = pn.NameId
WHERE ....
) t
GROUP BY id
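Both approaches can be tried directly against the people dummy table from the question, where the row column already holds the row number. This is a small sketch under that assumption; the column is written as [row] only as a precaution.

```sql
-- Variant 1: window functions plus DISTINCT, no GROUP BY needed.
SELECT DISTINCT
       p.id,
       FIRST_VALUE(p.name) OVER (PARTITION BY p.id ORDER BY p.[row]) AS name,
       COUNT(*) OVER (PARTITION BY p.id) AS [count]
FROM people AS p
ORDER BY p.id;

-- Variant 2: conditional aggregation; the CASE yields the name only where row = 1,
-- so MAX simply picks up that single non-NULL value per id.
SELECT p.id,
       MAX(CASE WHEN p.[row] = 1 THEN p.name END) AS name,
       COUNT(*) AS [count]
FROM people AS p
GROUP BY p.id
ORDER BY p.id;
```

On the dummy data both statements should give the desired rows listed in the question, for example Bobby with a count of 4 for id 5.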
This could be one solution to your problem: group on both id and the target name (case when p.row = 1 then p.name end) for the counting. Adding with rollup to the grouping will "roll up" the count aggregations. Another aggregation on just id can then be used to merge the row values from the intermediate data set (visible in fiddle).
with cte as
(
select p.id,
case when p.row = 1 then p.name end as name,
count(1) as cnt
from people p
group by p.id, case when p.row = 1 then p.name end with rollup
having grouping(p.id) = 0
)
select cte.id,
max(cte.name) as name,
max(cte.cnt) as [count]
from cte
group by cte.id;
Fiddle
This would be another solution: do a regular count query with grouping on id and fetch the required name afterwards with a cross apply.
with cte as
(
select p.id,
count(1) as cnt
from people p
group by p.id
)
select cte.id,
n.name,
cte.cnt as [count]
from cte
cross apply ( select p.name
from people p
where p.id = cte.id
and p.row = 1 ) n;
Fiddle

Only show value of Max rows with partition by?

The title might be a bit off; however, I'm trying to remove the values of a row without removing the actual row.
This is my table:
SELECT ID,CustomerID,Weight FROM Orders
What I am trying to accomplish is this:
For each CustomerID, keep Weight only on the row with the MAX() ID, and show NULL in Weight on every other row.
Is it possible to do this in one line, with a PARTITION BY?
Something like:
SELECT MAX(ID) over (partition by CustomerID,Weight).... I know this is wrong, but if it is possible to do this without a join or CTE, in a single line of the select statement, that would be great.
One possible approach is using ROW_NUMBER:
SELECT
ID,
CustomerID,
CASE
WHEN ROW_NUMBER() OVER (PARTITION BY CustomerId ORDER BY ID DESC) = 1 THEN [Weight]
ELSE Null
END AS [Weight]
FROM #Orders
ORDER BY ID
Input:
CREATE TABLE #Orders (
ID int,
CustomerID int,
[Weight] int
)
INSERT INTO #Orders
(ID, CustomerID, [Weight])
VALUES
(1, 11, 100),
(2, 11, 17),
(3, 11, 35),
(4, 22, 26),
(5, 22, 78),
(6, 22, 10030)
Output:
ID CustomerID Weight
1 11 NULL
2 11 NULL
3 11 35
4 22 NULL
5 22 NULL
6 22 10030
Try this
;WITH CTE
AS
(
SELECT
MAX_ID = MAX(ID) OVER(PARTITION BY CustomerId),
ID,
CustomerId,
Weight
FROM Orders
)
SELECT
ID,
CustomerId,
Weight = CASE WHEN ID = MAX_ID THEN Weight ELSE NULL END
FROM CTE
You can try this.
SELECT ID,CustomerId,CASE WHEN ID= MAX(ID) OVER(PARTITION BY CustomerId) THEN Weight ELSE NULL END AS Weight FROM Orders

SQL Server - How would I insert a RANK function to rows that are already sorted in ranked order?

So, apparently, I have everything right according to my professor except for one column that shows the rank of the columns shown in the code below. I'm thinking that, essentially, it just has to show the row numbers off to the left side in its own column. Here are the instructions:
The sales manager would now like you to create a report that ranks her
products by both their total sales and total sales quantity (each will
be its own column). Create a stored procedure that returns the
following columns but also with the two new rank columns added.
Product Name | Orders Count | Total Sales Value | Total Sales
Quantity
I know that it doesn't have that extra column in the assignment description, but I guess I need it. Here is what I have so far:
USE OnlineStore
GO
CREATE PROC spManagerProductSalesCount
AS
BEGIN
SELECT
P.Name AS 'Product Name',
Isnull(Count(DISTINCT O.OrderID), 0) AS 'Orders Count',
Sum(Isnull(O.OrderTotal, 0)) AS 'Total Sales Value',
Sum (Isnull(OI.OrderItemQuantity, 0)) AS 'Total Sales Quantity'
FROM
Product P
INNER JOIN
OrderItem OI ON P.ProductID = OI.ProductID
INNER JOIN
Orders O on O.OrderID = OI.OrderID
GROUP BY
P.Name
ORDER BY
'Total Sales Value' DESC, 'Total Sales Quantity' DESC
END
Update: It does need to be in a stored procedure and CTEs can/should be used. I could use some help with the CTEs. Those are pretty difficult for me.
This is just the select part of the stored proc but it should show you what to do:
create table #products
(
Name varchar(50),
id int
)
create table #orderitems
(
id int,
orderid int,
productid int,
orderitemquantity int
)
create table #orders
(
orderid int,
ordertotal decimal(18,2)
)
insert into #products VALUES ('apple', 1)
insert into #products VALUES ('orange', 2)
insert into #products VALUES ('pear', 3)
insert into #products VALUES ('melon', 4)
insert into #orders values(1, 19.0)
insert into #orders values(2, 25.5)
insert into #orders values(3, 9.5)
insert into #orders values(4, 13.5)
insert into #orders values(5, 8.5)
insert into #orderitems VALUES(1, 1, 1, 20)
insert into #orderitems VALUES(2, 1, 2, 10)
insert into #orderitems VALUES(3, 2, 3, 5)
insert into #orderitems VALUES(4, 2, 4, 4)
insert into #orderitems VALUES(5, 3, 1, 10)
insert into #orderitems VALUES(6, 3, 2, 5)
insert into #orderitems VALUES(7, 4, 3, 3)
insert into #orderitems VALUES(8, 4, 4, 2)
insert into #orderitems VALUES(9, 5, 1, 5)
insert into #orderitems VALUES(10, 5, 4, 2)
;WITH summary as
(
SELECT p.Name as ProductName,
COUNT(o.orderid) as 'Orders Count',
ISNULL(Sum(o.ordertotal),0) AS 'Total Sales Value',
ISNULL(Sum(oi.orderitemquantity),0) AS 'Total Sales Quantity'
FROM #products p
INNER JOIN #orderitems oi on oi.productid = p.id
INNER JOIN #orders o on o.orderid = oi.orderid
GROUP BY p.Name
)
SELECT ProductName, [Orders Count], [Total Sales Value], [Total Sales Quantity],
RANK() OVER (ORDER BY [Total Sales Value] DESC) AS ValueRanking,
RANK() OVER (ORDER BY [Total Sales Quantity] DESC) AS QuantityRanking FROM summary
Notice a few things here. This code can be cut and pasted into a Management Studio query window and run as such. It starts with some table declarations and insert of sample data. When asking a question it is always useful if you do this part of the work; people are much more likely to answer, if the most boring bit is done!
COUNT() does not need ISNULL protection; it returns 0, if there are no values.
Given the final data, you will see that the ValueRanking and QuantityRankings are different (I fiddled the data to get this, just to illustrate the point). What this means is that the final result can only be ordered by one of them (or indeed by any other column - order by is not dependent on ranking).
HTH
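Since the update says the result has to come from a stored procedure, the CTE can be wrapped roughly as follows. This is a sketch only: the procedure name spManagerProductSalesRank and the two rank column names are invented for illustration, and the table and column names are taken from the procedure in the question.

```sql
USE OnlineStore
GO
CREATE PROC spManagerProductSalesRank   -- hypothetical name
AS
BEGIN
    SET NOCOUNT ON;

    -- Aggregate once in the CTE, then rank the aggregated rows.
    WITH summary AS
    (
        SELECT P.Name                            AS ProductName,
               COUNT(DISTINCT O.OrderID)         AS OrdersCount,
               SUM(ISNULL(O.OrderTotal, 0))      AS TotalSalesValue,
               SUM(ISNULL(OI.OrderItemQuantity, 0)) AS TotalSalesQuantity
        FROM Product P
        INNER JOIN OrderItem OI ON P.ProductID = OI.ProductID
        INNER JOIN Orders O ON O.OrderID = OI.OrderID
        GROUP BY P.Name
    )
    SELECT ProductName        AS [Product Name],
           OrdersCount        AS [Orders Count],
           TotalSalesValue    AS [Total Sales Value],
           TotalSalesQuantity AS [Total Sales Quantity],
           RANK() OVER (ORDER BY TotalSalesValue DESC)    AS [Sales Value Rank],
           RANK() OVER (ORDER BY TotalSalesQuantity DESC) AS [Sales Quantity Rank]
    FROM summary
    ORDER BY TotalSalesValue DESC, TotalSalesQuantity DESC;
END
GO
```

It would then be run with EXEC spManagerProductSalesRank. The semicolon after SET NOCOUNT ON matters, because a WITH that starts a CTE must follow a terminated statement.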

Query to get date rows older than a start date (not a simple WHERE)

I have a feeling this is quite simple, but I can't put my finger on the query. I'm trying to find all of the activities of an employee which correspond to their start date in a specific location.
create table Locations (EmployeeID int, LocationID int, StartDate date);
create table Activities (EmployeeID int, ActivityID int, [Date] date);
insert into Locations values
(1, 10, '01-01-2010')
, (1, 11, '01-01-2012')
, (1, 11, '01-01-2013');
insert into Activities values
(1, 1, '02-01-2010')
, (1, 2, '04-01-2010')
, (1, 3, '06-06-2014');
Expected result:
EmployeeID LocationID StartDate EmployeeID ActivityID Date
1 10 '01-01-2010' 1 1 '02-01-2010'
1 10 '01-01-2010' 1 2 '04-01-2010'
1 11 '01-01-2013' 1 3 '06-06-2014'
So far, I have this, but it's not quite giving me the result I was hoping for. I somehow have to reference only the information from the most recent Location, which the la.StartDate <= a.Date does not filter out and includes information from older locations as well.
select *
from Locations la
inner join Activities a on la.EmployeeID = a.EmployeeID
and la.StartDate <= a.Date
Give this one a try:
with Locations as (
select
*
from (values
(1, 10, '01-01-2010')
, (1, 11, '01-01-2012')
, (1, 11, '01-01-2013')
) la (EmployeeID, LocationID, StartDate)
),
Activities as (
select
*
from (
values
(1, 1, '02-01-2010')
, (1, 2, '04-01-2010')
, (1, 3, '06-06-2014')
) a (EmployeeID, ActivityID, [Date])
)
select
la.*,
a.*
from Activities a
cross apply (
select
*
from (
select
la.*,
ROW_NUMBER() OVER (
PARTITION BY
EMPLOYEEID
ORDER BY
la.StartDate DESC
) seqnum
from Locations la
where
la.EmployeeID = a.EmployeeID and
la.StartDate <= a.Date
) la
where
la.seqnum = 1
) la
Thank you all, but I managed to find the answer:
select *
from LocationAssociations la
inner join Activities a on la.EmployeeID = a.EmployeeID
and la.StartDate = (select max(StartDate) from LocationAssociations where StartDate >= la.StartDate and StartDate <= a.Date)
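The subquery above does not filter on EmployeeID, so with more than one employee it could pick up another employee's start date. A variant that adds that filter, written against the Locations and Activities tables from the question, might look like this sketch (the alias l2 is made up for illustration):

```sql
-- For each activity, join the location row whose StartDate is the latest one
-- on or before the activity date, for the same employee.
SELECT la.EmployeeID, la.LocationID, la.StartDate,
       a.ActivityID, a.[Date]
FROM Activities AS a
INNER JOIN Locations AS la
    ON la.EmployeeID = a.EmployeeID
   AND la.StartDate = (SELECT MAX(l2.StartDate)
                       FROM Locations AS l2
                       WHERE l2.EmployeeID = a.EmployeeID
                         AND l2.StartDate <= a.[Date])
ORDER BY a.ActivityID;
```

With the sample data, activities 1 and 2 should pair with location 10 (start 2010) and activity 3 with location 11 (start 2013), as in the expected result.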

Resources