How to use Pivot in SQL Server? - sql-server

I have two tables with the columns listed below:
CUSTOMER table as DIM_CUSTOMER:
ID_CUSTOMER, CUSTOMER_NAME
TRANSACTION table as FACT_TRANSACTION:
ID_CUSTOMER, DATE, TOTAL_PRICE, QUANTITY
The problem statement is:
Find the top 100 customers and their average spend and average quantity for each year. Also find the percentage change in their spend.
My approach:
SELECT TOP 100
YEAR(FT.DATE) AS [YEAR],
FT.ID_CUSTOMER AS [CUSTOMER NAME],
FT.TOTAL_PRICE AS [TOTAL AMT],
AVG(FT.TOTAL_PRICE) AS [AVG SPEND],
AVG(FT.QUANTITY) AS [AVG QUANTITY]
FROM
FACT_TRANSACTIONS FT
INNER JOIN
DIM_CUSTOMER DC ON FT.ID_CUSTOMER = DC.ID_CUSTOMER
GROUP BY
FT.DATE, FT.ID_CUSTOMER, FT.TOTAL_PRICE
ORDER BY
3 DESC
This returns the top 100 customers based on their usage.
Now I need to determine the year-over-year percentage change in their spend.
How can I do that? Using the PIVOT option here will probably help, but I'm unsure.

You can try using LAG to access the previous [AVG SPEND] for the current row. The idea is to group the data for each [CUSTOMER NAME] using PARTITION BY and then to order the data by [YEAR]. The function returns the previous row's value, so the difference is easy to calculate.
Try something like this:
SELECT TOP 100
YEAR(FT.DATE) AS [YEAR],
FT.ID_CUSTOMER AS [CUSTOMER NAME],
FT.TOTAL_PRICE AS [TOTAL AMT],
AVG(FT.TOTAL_PRICE) AS [AVG SPEND],
AVG(FT.QUANTITY) AS [AVG QUANTITY]
INTO #DataSource
FROM
FACT_TRANSACTIONS FT
INNER JOIN
DIM_CUSTOMER DC ON FT.ID_CUSTOMER = DC.ID_CUSTOMER
GROUP BY
YEAR(FT.DATE), FT.ID_CUSTOMER, FT.TOTAL_PRICE
ORDER BY
[AVG SPEND] DESC;
SELECT *
,[AVG SPEND] - LAG([AVG SPEND], 1, 0) OVER (PARTITION BY [CUSTOMER NAME] ORDER BY [YEAR]) AS [SPEND CHANGE]
FROM #DataSource;
Note that:
the function requires SQL Server 2012 or later
you can change the partitioning and ordering to suit your real goal (for example, you can use ORDER BY [YEAR] DESC)
you can use the LEAD function to access the next value within the group if you want to calculate the difference against the following row
I materialized the data in a temporary table, but you can use a table variable or whatever you prefer
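The question asks for a percentage change rather than a plain difference. As a minimal sketch, assuming a hypothetical #YearlySpend table with one row per customer and year (the table name, column names, and sample values are illustrative and not from the original post), the previous year's value returned by LAG can serve as the denominator:

```sql
-- Hypothetical sample data: one row per customer per year.
CREATE TABLE #YearlySpend (ID_CUSTOMER int, [YEAR] int, AVG_SPEND decimal(10, 2));
INSERT INTO #YearlySpend VALUES
    (1, 2019, 100.00), (1, 2020, 150.00), (1, 2021, 120.00),
    (2, 2020, 200.00), (2, 2021, 250.00);

SELECT ID_CUSTOMER, [YEAR], AVG_SPEND,
       -- LAG without a default returns NULL for each customer's first year,
       -- so the change is NULL there instead of a misleading number.
       -- NULLIF guards against dividing by a previous value of zero.
       (AVG_SPEND - LAG(AVG_SPEND) OVER (PARTITION BY ID_CUSTOMER ORDER BY [YEAR])) * 100.0
       / NULLIF(LAG(AVG_SPEND) OVER (PARTITION BY ID_CUSTOMER ORDER BY [YEAR]), 0) AS [PCT CHANGE]
FROM #YearlySpend
ORDER BY ID_CUSTOMER, [YEAR];

DROP TABLE #YearlySpend;
```

With these sample rows, customer 1 moves from 100 to 150 between 2019 and 2020, which is a 50 percent increase, and the first year of each customer has no percentage at all.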

Can you please try the following CTE query?
;with topcustomers as(
SELECT distinct top 100
ID_CUSTOMER,
SUM(TOTAL_PRICE) over (partition by ID_CUSTOMER) as TotalSPEND
FROM FACT_TRANSACTION
order by TotalSPEND desc
), cte as (
SELECT
distinct
t.ID_CUSTOMER, YEAR(t.DATE) [YEAR], TotalSPEND,
AVG(t.QUANTITY * 1.0) over (partition by t.ID_CUSTOMER, YEAR(t.DATE)) as AverageQUANTITY,
AVG(t.TOTAL_PRICE * 1.0) over (partition by t.ID_CUSTOMER, YEAR(t.DATE)) as AverageSPEND
FROM FACT_TRANSACTION t
INNER JOIN topcustomers c on c.ID_CUSTOMER = t.ID_CUSTOMER
)
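select
*,
-- percentage change relative to the previous year's average; NULL for the first year
( AverageSPEND - lag(AverageSPEND,1) over (partition by ID_CUSTOMER order by [YEAR]) ) * 100.0
/ nullif(lag(AverageSPEND,1) over (partition by ID_CUSTOMER order by [YEAR]), 0) as [%Change]
from cte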

Related

Latest Customer Donation And Amount

I have two tables:
Customer which has an Id column representing the customer Id.
CustomerDonation that contains CustomerId (FK), Amount and DatePayed
I'd like to get all the customers together with their latest donation and the amount of that donation.
I am receiving duplicate values on my query so I will not paste it here.
You could also use the WITH TIES option
Select Top 1 With Ties *
From YourTable
Order By Row_Number() over (Partition By CustomerId Order By DatePayed Desc)
WITH
SortedDonation AS
(
SELECT
ROW_NUMBER() OVER (PARTITION BY CustomerId ORDER BY DatePayed DESC) AS SeqID,
*
FROM
CustomerDonation
)
SELECT
*
FROM
Customer
LEFT JOIN
SortedDonation
ON SortedDonation.CustomerId = Customer.Id
AND SortedDonation.SeqId = 1
If the same customer can make multiple donations with the same DatePayed, then this will arbitrarily pick just one of them.
If you add additional fields to the ORDER BY you can deterministically pick which one you want.
Or, if you want all of them use DENSE_RANK() instead of ROW_NUMBER()
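The two options above can be sketched side by side. This is only an illustration: the DonationId column used as a tie-breaker and the sample rows are assumptions, not part of the original tables.

```sql
-- Hypothetical sample data; DonationId is an assumed tie-breaker column.
DECLARE @CustomerDonation TABLE (DonationId int, CustomerId int,
                                 Amount decimal(10, 2), DatePayed date);
INSERT INTO @CustomerDonation VALUES
    (1, 10, 25.00, '2020-01-05'),
    (2, 10, 40.00, '2020-03-01'),
    (3, 10, 15.00, '2020-03-01'),  -- same DatePayed as donation 2
    (4, 20, 60.00, '2020-02-10');

SELECT DonationId, CustomerId, Amount, DatePayed
FROM (
    SELECT *,
           -- The extra ORDER BY column makes the pick deterministic.
           ROW_NUMBER() OVER (PARTITION BY CustomerId
                              ORDER BY DatePayed DESC, DonationId DESC) AS SeqId,
           -- DENSE_RANK gives every donation on the latest date the value 1.
           DENSE_RANK() OVER (PARTITION BY CustomerId
                              ORDER BY DatePayed DESC) AS DateRank
    FROM @CustomerDonation
) d
WHERE SeqId = 1;   -- swap for DateRank = 1 to keep all tied donations
```

Filtering on SeqId = 1 keeps one row per customer (donation 3 for customer 10, because of the descending DonationId tie-breaker), while filtering on DateRank = 1 would keep both donations made on 2020-03-01.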
Use the ROW_NUMBER() analytic function:
Select * from (
Select CustomerId, Amount, DatePayed, row_number() over (partition by CustomerId order by DatePayed desc) as rowN
from CustomerDonation)
as tab where rowN = 1
You only need the CustomerDonation table for this. You can join with the Customer table if you want other information about the customer.
WITH cte AS (
SELECT
CustomerId
, MAX(DatePayed) AS LastDate
FROM
CustomerDonation
GROUP BY
CustomerId
)
SELECT
cd.CustomerId
, cd.Amount
, cd.DatePayed
FROM
CustomerDonation cd
JOIN cte ON cd.CustomerId = cte.CustomerId
AND cd.DatePayed = cte.LastDate

SQL select top 10 for each year

I have a fact database from which I want to make a trendline based on top 10 items based on sum quantity for each item per year.
I've done the following, but it does for example select more than 10 entities for my year 2007:
select TOP 10 sum(Quantity) as Quantity,DIM_Time.Year, DIM_Item.Name as Name
from Fact_Purchase
join DIM_Item on DIM_Item.BKey_ItemId = Fact_Purchase.DIM_Item
join DIM_Time on DIM_Time.ID = Fact_Purchase.DIM_Time_DeliveryDate
where Fact_Purchase.DIM_Company = 2 and DIM_Time.ID = FACT_Purchase.DIM_Time_DeliveryDate
Group by dim_item.Name, DIM_Time.Year
Order by Quantity DESC
How do I select the top 10 items with the highest quantity across all my years, with only the top 10 entities for each year?
As you can guess, the company varies and is going to be a parameter in my report.
I think this is what you're going for. My apologies if I messed up on translating your tables across.
select *
from (
select DIM_Time.[Year], dim_item.Name, SUM(Quantity) Quantity, RANK() OVER (PARTITION BY DIM_Time.[Year] ORDER BY SUM(Quantity) DESC) salesrank
from Fact_Purchase
join DIM_Item on DIM_Item.BKey_ItemId = Fact_Purchase.DIM_Item
join DIM_Time on DIM_Time.ID = Fact_Purchase.DIM_Time_DeliveryDate
where Fact_Purchase.DIM_Company = 2 and DIM_Time.ID = FACT_Purchase.DIM_Time_DeliveryDate
group by dim_item.Name, DIM_Time.[Year]
) tbl
where salesrank <= 10
order by [Year], salesrank
The subquery groups by name/year, and the RANK() OVER part sets up a sort of row index that increments by SUM(Quantity) and restarts for each Year. From there you just have to filter out anything with a salesrank (index) that's over 10.
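One detail worth checking is how ties behave, since RANK() gives tied rows the same number. The following sketch uses made-up quantities (not data from the question) to compare RANK, DENSE_RANK, and ROW_NUMBER within each year:

```sql
-- Hypothetical quantities; items B and C tie in 2007.
SELECT [Year], Name, Quantity,
       RANK()       OVER (PARTITION BY [Year] ORDER BY Quantity DESC)       AS rank_with_gaps,
       DENSE_RANK() OVER (PARTITION BY [Year] ORDER BY Quantity DESC)       AS rank_no_gaps,
       ROW_NUMBER() OVER (PARTITION BY [Year] ORDER BY Quantity DESC, Name) AS row_num
FROM (VALUES (2007, 'A', 50), (2007, 'B', 30), (2007, 'C', 30), (2007, 'D', 10),
             (2008, 'A', 20), (2008, 'B', 40)) AS s([Year], Name, Quantity)
ORDER BY [Year], row_num;
```

In this sample, B and C both receive rank 2 in 2007, so a filter such as rank_with_gaps <= 2 would return three rows for that year. If exactly N rows per year are required, ROW_NUMBER() with a tie-breaking column is the stricter choice.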
SELECT
_year,
Name,
_SUM,
RANK_iD
FROM
(
SELECT
_year,
Name,
_SUM,
DENSE_RANK()OVER(PARTITION BY _year ORDER BY _SUM DESC) AS RANK_iD
FROM(
Select
DIM_Time AS _year,
DIM_Item as Name,
sum(Quantity) AS _SUM
from
#ABC
GROUP BY
DIM_Time,
DIM_Item
)A
)B
WHERE RANK_iD<=10

SQL Server create grand total row

I have a table that contains order details. How can I create a row at the end that totals all my subtotals?
SELECT
o.order_id, o.itemDescription as Description,
o.quantity_shipped as [Quantity],
o.itemCost as [Price each],
(o.quantity_shipped * CAST(o.itemCost as float)) as [Sub Total]
FROM
dbo.order_items o
This will give you total by Order Id
SELECT o.order_id, SUM((o.quantity_shipped * CAST (o.itemCost as float))) as [TotalByOrderId]
FROM dbo.order_items o
GROUP BY o.order_id
This will give you grand total
SELECT SUM((o.quantity_shipped * CAST (o.itemCost as float))) as [GrandTotal]
FROM dbo.order_items o
One way (not the most efficient) is the following:
;WITH CTE AS (
SELECT o.order_id, o.itemDescription as Description,
o.quantity_shipped as [Quantity],
o.itemCost as [Price each],
(o.quantity_shipped * CAST(o.itemCost as float)) as [Sub Total]
FROM dbo.order_items o)
SELECT *
FROM CTE
UNION ALL
SELECT NULL, 'Grand Total', NULL, NULL, SUM([Sub Total])
FROM CTE
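As an alternative to UNION ALL, SQL Server's GROUP BY GROUPING SETS can produce the total row in a single pass. This is a different technique from the answer above, shown here as a rough sketch with made-up sample rows, and it assumes each (order_id, itemDescription) pair appears only once, since the detail rows are grouped:

```sql
-- Hypothetical sample rows; column names follow the question's query.
DECLARE @order_items TABLE (order_id int, itemDescription varchar(50),
                            quantity_shipped int, itemCost decimal(10, 2));
INSERT INTO @order_items VALUES
    (1, 'Widget', 2, 10.00),
    (1, 'Gadget', 1, 25.00),
    (2, 'Widget', 3, 10.00);

SELECT order_id,
       -- GROUPING() returns 1 on the row produced by the empty grouping set.
       CASE WHEN GROUPING(itemDescription) = 1 THEN 'Grand Total'
            ELSE itemDescription END AS Description,
       SUM(quantity_shipped * itemCost) AS [Sub Total]
FROM @order_items
GROUP BY GROUPING SETS ((order_id, itemDescription), ())
ORDER BY GROUPING(itemDescription), order_id;
```

The empty grouping set () adds one extra row that aggregates over everything; with the sample rows its [Sub Total] is 75, the sum of 20, 25, and 30.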
Why would you do that? Adding a meaningless total row makes processing a lot more complicated later, unless you are producing hierarchical subtotals for sub-lines.
The normal way to handle this is to store the totals as part of the invoice table.

Deleting duplicates in a time series

I have a large set of measurements, taken every 1 millisecond, stored in a SQL Server 2012 table. Whenever 3 or more consecutive rows hold the same value, I would like to delete the middle duplicates. The highlighted values in this image of sample data are the ones that I want to delete. Is there a way to do this with a SQL query?
You can do this using a CTE and ROW_NUMBER:
WITH CteGroup AS(
SELECT *,
grp = ROW_NUMBER() OVER(ORDER BY MS) - ROW_NUMBER() OVER(PARTITION BY Value ORDER BY MS)
FROM YourTable
),
CteFinal AS(
SELECT *,
RN_FIRST = ROW_NUMBER() OVER(PARTITION BY grp, Value ORDER BY MS),
RN_LAST = ROW_NUMBER() OVER(PARTITION BY grp, Value ORDER BY MS DESC)
FROM CteGroup
)
DELETE
FROM CteFinal
WHERE
RN_FIRST > 1
AND RN_LAST > 1
I'm sure there must be a more efficient way to do this, but you could join the table to itself twice to find the previous and next value in the list, and then delete all of the entries where all three values are the same.
DELETE FROM tbl
WHERE ms IN
(
SELECT T.ms
FROM tbl T
INNER JOIN tbl T1 ON T.ms = T1.ms + 1
INNER JOIN tbl T2 ON T.ms = T2.ms - 1
WHERE T.value = T1.value AND T.value = T2.value
)
If the table is really big, I can see this blowing tempdb though.
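Since the question mentions SQL Server 2012, LAG and LEAD can look at the neighbouring rows without a self-join. The following is only a sketch under assumptions: the #Measurements table, its columns, and the sample values are hypothetical stand-ins for the real table.

```sql
-- Hypothetical sample data mirroring the question's description.
CREATE TABLE #Measurements (MS int PRIMARY KEY, Value int);
INSERT INTO #Measurements VALUES
    (1, 5), (2, 5), (3, 5), (4, 5), (5, 7), (6, 7), (7, 5);

WITH Neighbours AS (
    SELECT MS, Value,
           LAG(Value)  OVER (ORDER BY MS) AS PrevValue,
           LEAD(Value) OVER (ORDER BY MS) AS NextValue
    FROM #Measurements
)
-- A row is a "middle" duplicate when both neighbours carry the same value.
DELETE m
FROM #Measurements AS m
JOIN Neighbours AS n ON n.MS = m.MS
WHERE n.Value = n.PrevValue AND n.Value = n.NextValue;

SELECT MS, Value FROM #Measurements ORDER BY MS;   -- remaining rows

DROP TABLE #Measurements;
```

With the sample rows, only MS 2 and 3 are removed: they sit inside the run of four 5s, while the first and last row of each run are kept. Unlike the self-join, this does not require the MS values to be exactly one apart.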
Yes there is
select * from table group by table.field ->value

Running total query in select statement without views

I have to query a set of running total data by month.
e.g.
Month Amount Total
2014-01-01 100 100
2014-01-02 100 200
2014-01-03 100 300
The application does not allow creating a view or stored procedure. It can only select data from a table directly.
e.g.
select Month,
Amount,
Total -- This is my problem.
from Table -- This is a table only.
Any ideas are welcome, thank you.
You can use OUTER APPLY:
SELECT T.Month,T.Amount,T2.Total
FROM Table1 T
OUTER APPLY
( SELECT Total = SUM(Amount)
FROM Table1 T2
WHERE T2.Month <= T.Month
) T2;
Or a correlated subquery:
SELECT T.Month, T.Amount,
( SELECT SUM(T2.Amount)
FROM Table1 T2
WHERE T2.Month <= T.Month
) AS Total
FROM Table1 T
The easiest way is to use SQL Server 2012 or later, which has cumulative sums built in:
select Month, Amount,
sum(Amount) over (order by Month) as Total
from [Table];
The correlated subquery method follows a similar structure:
select Month, Amount,
(select sum(Amount) from [Table] t2 where t2.Month <= t.Month) as Total
from [Table] t;
These are usually the two methods that I would consider, because both are standard SQL. As Vignesh points out you can do it with cross apply as well (although as I write this, his query is not correct).
Here is a second way to create a running total:
SELECT t.month, t.amount,
SUM(t.amount) OVER(ORDER BY t.month
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) as [Total]
FROM [yourTable] AS t
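The explicit ROWS frame matters when the ordering column contains duplicates. Without a frame clause, a windowed SUM with ORDER BY uses RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, which treats rows with the same ordering value as peers. A small sketch with made-up rows (two entries share 2014-01-02) shows the difference:

```sql
-- Hypothetical rows; two of them share the same date.
SELECT [Month], Amount,
       -- Default frame (RANGE): tied rows all get the total up to the last peer.
       SUM(Amount) OVER (ORDER BY [Month]) AS total_default_frame,
       -- ROWS frame: the total grows row by row, even across ties.
       SUM(Amount) OVER (ORDER BY [Month]
                         ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS total_rows_frame
FROM (VALUES (CAST('2014-01-01' AS date), 100),
             (CAST('2014-01-02' AS date), 100),
             (CAST('2014-01-02' AS date), 50),
             (CAST('2014-01-03' AS date), 100)) AS t([Month], Amount);
```

Here both 2014-01-02 rows show 250 in total_default_frame, whereas total_rows_frame increases on every row. The order in which tied rows are accumulated under ROWS is not guaranteed unless a tie-breaking column is added to the ORDER BY.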

Resources