SQL Pivot sum with specific condition - sql-server

I am trying to create a pivot in order to sum specific month as the total, below is my table and sql pivot:
CREATE TABLE yt
([Store] varchar(10), [Month] varchar(10), [xCount] int)
;
INSERT INTO yt
([Store], [Month], [xCount])
VALUES
('A', 'JAN', 1),
('A', 'FEB', 10),
('B', 'JAN', 2),
('B', 'FEB', 20),
('C', 'JAN', 3),
('C', 'FEB', 70),
('C', 'MAR', 110)
;
select *
from
(
select Store, Month, xCount
from yt
) src
pivot
(
sum(xcount)
for Month in ([JAN], [FEB], [MAR])
) piv;
Store JAN FEB MAR
A 1 10 NULL
B 2 20 NULL
C 3 70 110
The above result I am able to sum the value by the current month. Now I want to sum all the month when Store a, sum only FEB, and MAR in Store b, and Store c sum MAR, so I want my result as
Store JAN FEB MAR
A 6 100 110
B 5 90 110
C 3 70 110
Can I do it in pivot? Or need to create a table to run sql again?
Thanks for the help.

For Sql Server 2008 you need some self join and ordering to make a running sum. Here I take store as this ordering:
WITH cte AS (SELECT * FROM #yt
PIVOT(SUM(xcount) FOR Month IN ([JAN], [FEB], [MAR])) piv)
SELECT c1.Store ,
SUM(c2.jan) AS jan ,
SUM(c2.feb) AS feb ,
SUM(c2.mar) AS mar
FROM cte c1
JOIN cte c2 ON c1.Store <= c2.Store
GROUP BY c1.Store

Your problem is running total order by store desc
In SQL Server2005+, You can try this:
SELECT Store,
SUM([JAN]) OVER(ORDER BY Store DESC) AS [JAN],
SUM([FEB]) OVER(ORDER BY Store DESC) AS [FEB],
SUM([MAR]) OVER(ORDER BY Store DESC) AS [MAR]
FROM (
SELECT *
FROM
(
SELECT Store, Month, xCount
FROM yt
) src
PIVOT
(
SUM(xcount)
for Month in ([JAN], [FEB], [MAR])
) piv
) AS A
ORDER BY Store
But in MySQL, you can reference this link

If you are using SQL SERVER 2012, you can use
SELECT STORE,
SUM(JAN)
OVER (
ORDER BY RN DESC) JAN,
SUM(FEB)
OVER (
ORDER BY RN DESC) FEB,
SUM(MAR)
OVER (
ORDER BY RN DESC) MAR
FROM (SELECT ROW_NUMBER()
OVER (
ORDER BY STORE)RN,
*
FROM (SELECT STORE,
MONTH,
XCOUNT
FROM #YT) SRC
PIVOT ( SUM(XCOUNT)
FOR MONTH IN ([JAN],
[FEB],
[MAR]) ) PIV)A
ORDER BY RN ASC

Related

SQL Server, first of each time series

A table 'readings' has a list of dates
[Date] [Value]
2015-03-19 00:30:00 1.2
2015-03-19 00:40:00 1.2
2015-03-19 00:50:00 0.1
2015-03-19 01:00:00 0.1
2015-03-19 01:10:00 2
2015-03-19 01:20:00 0.5
2015-03-19 01:30:00 0.5
I need to get the most recent instance where the value is below a set point (in this case the value 1.0), but I only want the start (earliest datetime) where the value was below 1 for consecutive times.
So with the above data I want to return 2015-03-19 01:20:00, as the most recent block of times where value < 1, but I want the start of that block.
This SQL just returns the most recent date, rather than the first date whilst the value has been low (so returns 2015-03-19 01:30:00 )
select top 1 *
from readings where value <=1
order by [date] desc
I can't work out how to group the consecutive dates, to therefore only get the first ones
It is SQL Server, the real data isn't at exactly ten min intervals, and the readings table is about 70,000 rows- so fairly large!
Thanks, Charli
Demo
SELECT * FROM (
SELECT [Date]
,Value
,ROW_NUMBER() OVER (PARTITION BY cast([Date] AS DATE) ORDER BY [Date] ASC) AS RN FROM #table WHERE value <= 1
) t WHERE t.RN = 1
Select Max( [date] )
From [dbo].[readings]
Where ( [value] <= 1 )
You seem to want the minimum date for each set of consecutive records having a value that is less than 1. The query below returns exactly these dates:
SELECT MIN([Date])
FROM (
SELECT [Date], [Value],
ROW_NUMBER() OVER (ORDER BY [Date]) -
COUNT(CASE WHEN [Value] < 1 THEN 1 END) OVER (ORDER BY [Date]) AS grp
FROM mytable) AS t
WHERE Value < 1
GROUP BY grp
grp calculated field identifies consecutive records having Value<1.
Note: The above query will work for SQL Server 2012+.
Demo here
Edit:
To get the date value of the last group you can modify the above query to:
SELECT TOP 1 MIN([Date])
FROM (
SELECT [Date], [Value],
ROW_NUMBER() OVER (ORDER BY [Date]) -
COUNT(CASE WHEN [Value] < 1 THEN 1 END) OVER (ORDER BY [Date]) AS grp
FROM mytable) AS t
WHERE Value < 1
GROUP BY grp
ORDER BY grp DESC
Demo here

Rolling Prior13 months with Current Month Sales

Within a SQL Server 2012 database, I have a table with two columns customerid and date. I am interested in getting by year-month, a count of customers that have purchased in current month but not in prior 13 months. The table is extremely large so something efficient would be highly appreciated. Results table is shown after the input data. In essence, it is a count of customers that purchased in current month but not in prior 13 months (by year and month).
---input table-----
declare #Sales as Table ( customerid Int, date Date );
insert into #Sales ( customerid, date) values
( 1, '01/01/2012' ),
( 1, '04/01/2013' ),
( 1, '01/01/2014' ),
( 1, '01/01/2014' ),
( 1, '04/06/2014' ),
( 2, '04/01/2014' ),
( 3, '01/03/2012' ),
( 3, '01/03/2014' ),
( 4, '01/04/2012' ),
( 4, '04/04/2013' ),
( 5, '02/01/2010' ),
( 5, '02/01/2013' ),
( 5, '04/01/2014' )
select customerid, date
from #Sales;
---desired results ----
yearmth monthpurchasers monthpurchasernot13m
201002 1 1
201201 3 3
201302 1 1
201304 2 2
201401 2 1
201404 3 2
Thanks very much for looking at this!
Dev
You didn't provide the expected result, but I believe this is pretty close (at least logically):
;with g as (
select customerid, year(date)*100 + month(date) as mon
from #Sales
group by customerid, year(date)*100 + month(date)
),
x as (
select *,
count(*) over(partition by customerid order by mon
rows between 13 preceding and 1 preceding) as cnt
from g
),
y as (
select mon, count(*) as cnt from x
where cnt = 0
group by mon
)
select g.mon,
count(distinct(g.customerid)) as monthpurchasers,
isnull(y.cnt, 0) as cnt
from g
left join y on g.mon = y.mon
group by g.mon, y.cnt
order by g.mon
Tell me if this query helps. It extracts all the rows which meet your condition into a Table variable. Then, I use your query and join to this table.
declare #startDate datetime
declare #todayDate datetime
declare #tbl_Custs as Table(customerid int)
set #startDate = '04/01/2014' -- mm/dd/yyyy
set #todayDate = GETDATE()
insert into #tbl_Custs
-- purchased only this month
select customerid
from Sales
where ([date] >= #startDate and [date] <= #todayDate)
and customerid NOT in
(
-- purchased in past 13 months
select distinct customerid
from Sales
where ([date] >= DATEADD(MONTH,-13,[date])
and [date] < #startDate)
)
-- your query goes here
select year(date) as year
,month(date) as month
,count(distinct(c.customerid)) as monthpurchasers
from #tbl_Custs as c right join
Sales as s
on c.customerid = s.customerid
group by year(date) , month(date)
order by year(date) , month(date)
Below query will produce what you are looking for. I am not sure how performance will be on a big table (how big is your table?) but it is pretty straight forward so I think it will be ok. I simply calculate the 13 months earlier on CTE to find my sale window. Than join to the Sales table within that window / customer id and grouping records based on the unmatched records. You don't actually need 2 CTE's here you can do the DATEADD(mm,-13,date) on the join part of the second CTE but I thought it might be more clear this way.
P.S. If you need to change the time frame from 13 months to something else all you have to change is the DATEADD(mm,-13,date) this simply substracts 13 months from the date value.
Hope this helps or at least leads to a better solution
;WITH PurchaseWindow AS (
select customerid, date, DATEADD(mm,-13,date) minsaledate
FROM #Sales
), JoinBySaleWindow AS (
SELECT a.customerid, a.date,a.minsaledate,b.date earliersaledate
FROM PurchaseWindow a
LEFT JOIN #sales b ON a.customerid =b.customerid
--Find the sales for the customer within the last 13 months of original sale
AND b.date BETWEEN a.date AND a.minsaledate
)
SELECT DATEPART(yy,date) AS [year], DATEPART(mm, date) AS [month], COUNT(DISTINCT customerid) monthpurchases
FROM JoinBySaleWindow
--Exclude records where a sale within last 13 months occured
WHERE earliersaledate IS NULL
GROUP BY DATEPART(mm, date), DATEPART(yy,date)
Sorry about the typos they are fixed now.

SQL Server: Gap / Island, 365 day "contiguous" block

I have a table that looks like this:-
tblMeterReadings
id meter date total
1 1 03/01/2014 100.1
1 1 04/01/2014 184.1
1 1 05/01/2014 134.1
1 1 06/01/2014 132.1
1 1 07/01/2014 126.1
1 1 08/01/2014 190.1
This is an 8 day "contiguous block" from '2014-01-03' to '2014-01-08'.
In the real table there are "contiguous blocks" of years in length.
I need to select the MOST RESCENT CONTINUOUS 365 DAY BLOCK (filtered by meter column). If 365 cannot be found, then it should select next largest continuous block.
When I say CONTINUOUS I mean there must be no days missing.
This is beyond me, so if someone can solve... I will be very impressed.
using distinct to not count days with 2 sets of data
declare #gapdays int = 2 -- replace this with 365 in your case
;with x as
(
select datediff(d, '2014-01-01', [date])-dense_rank()over(order by [date]) grp
,[date]
from #t
)
select top 1 max([date]) last_date, min([date]) first_date, count(distinct [date]) days_in_a_row
from x
group by grp
having count(distinct [date]) >= #gapdays
order by max([date]) desc
There you go:
declare #tblMeterReadings table (id int, meter int, [date] date, total money)
insert into #tblMeterReadings ( id, meter, date, total )
values
(1, 1, '03/01/2014', 100.1),
(1, 1, '04/01/2014', 184.1),
(1, 1, '05/01/2014', 134.1),
(1, 1, '06/01/2014', 132.1),
(1, 1, '07/01/2014', 126.1),
(1, 1, '08/01/2014', 190.1),
(1, 1, '10/01/2014', 200.1),
(1, 1, '12/01/2014', 202.1),
(1, 1, '13/01/2014', 204.1)
;with data as (
select i = datediff(day, '2014', [date]), *
from #tblMeterReadings l
)
, islands as (
select island = l.i - row_number() over (order by i), l.*
from data l
)
, spans as (
select l = min(i), r = max(i)
from islands i
group by island
)
select *
from spans s
left join data l on s.l = l.i
left join data r on s.r = r.i
Most recent continuous block not exceeding 365 days in length will be as follows:
select top 1 *
from spans s
left join data l on s.l = l.i
left join data r on s.r = r.i
where s.l - s.r < 365
order by s.l - s.r desc, s.r desc
With recursive CTE and datepart(dayofyear, date):
with cte as
(
select id, meter, date, datepart(dayofyear, date) as x, cast(1 as int) as level, t1.date as startDate from tblMeterReadings t1
where meter = 1
and not exists(select * from tblMeterReadings t2 where (datepart(dayofyear, t1.date) - 1) = datepart(dayofyear, t2.date))
union all
select t1.id, t1.meter, t1.date, datepart(dayofyear, t1.date) as x, t2.level + 1, t2.startDate from tblMeterReadings t1
inner join cte t2 ON (datepart(dayofyear, t1.date)) = (datepart(dayofyear, t2.date) + 1)
)
select TOP 365 * from cte
where cte.startDate = (select top 1 startdate
from cte
--where Level <= 365
order by Level desc, startDate desc)
order by Level desc
OPTION ( MAXRECURSION 365 )
SQL Fiddle example

sql query to find the Items with the highest difference

I have my database table ABC as shown below :
ItemId Month Year Sales
1 1 2013 333
1 2 2013 454
2 1 2013 434
and so on .
I would like to write a query to find the top 3 items that have had the highest increase in sales from last month to this month , so that I see somethinglike this in the output.
Output :
ItemId IncreaseInSales
1 +121
9 +33
6 +16
I came up to here :
select
(select Sum(Sales) from ABC where [MONTH] = 11 )
-
(select Sum(Sales) from ABC where [MONTH] = 10)
I cannot use a group by as it is giving an error . Can anyone point me how I can
proceed further ?
Assuming that you want the increase for a given month, you can also do this with an aggregation query:
select top 3 a.ItemId,
((sum(case when year = #YEAR and month = #MONTH then 1.0*sales end) /
sum(case when year = #YEAR and month = #MONTH - 1 or
year = #YEAR - 1 and #Month = 1 and month = 12
then sales end)
) - 1
) * 100 as pct_increase
from ABC a
group by a.ItemId
order by pct_increase desc;
You would put the year/month combination you care about in the variables #YEAR and #MONTH.
EDIT:
If you just want the increase, then do a difference:
select top 3 a.ItemId,
(sum(case when year = #YEAR and month = #MONTH then 1.0*sales end) -
sum(case when year = #YEAR and month = #MONTH - 1 or
year = #YEAR - 1 and #Month = 1 and month = 12
then sales
end)
) as difference
from ABC a
group by a.ItemId
order by difference desc;
Here is the SQL Fiddle that demonstrates the below query:
SELECT TOP(3) NewMonth.ItemId,
NewMonth.Month11Sales - OldMonth.Month10Sales AS IncreaseInSales
FROM
(
SELECT s1.ItemId, Sum(s1.Sales) AS Month11Sales
FROM ABC AS s1
WHERE s1.MONTH = 11
AND s1.YEAR = 2013
GROUP BY s1.ItemId
) AS NewMonth
INNER JOIN
(
SELECT s2.ItemId, Sum(s2.Sales) AS Month10Sales
FROM ABC AS s2
WHERE s2.MONTH = 10
AND s2.YEAR = 2013
GROUP BY s2.ItemId
) AS OldMonth
ON NewMonth.ItemId = OldMonth.ItemId
ORDER BY NewMonth.Month11Sales - OldMonth.Month10Sales DESC
You never mentioned if you could have more than one record for an ItemId with the same Month, so I made the query to handle it either way. Obviously you were lacking the year = 2013 in your query. Once you get past this year you will need that.
Another option could be something on these lines:
SELECT top 3 a.itemid, asales-bsales increase FROM
(
(select itemid, month, sum(sales) over(partition by itemid) asales from ABC where month=2
and year=2013) a
INNER JOIN
(select itemid, month, sum(sales) over(partition by itemid) bsales from ABC where month=1
and year=2013) b
ON a.itemid=b.itemid
)
ORDER BY increase desc
if you need to cater for months without sales then you can do a FULL JOIN and calculate increase as isnull(asales,0) - isnull(bsales,0)
You could adapt this solution based on PIVOT operator:
SET NOCOUNT ON;
DECLARE #Sales TABLE
(
ItemID INT NOT NULL,
SalesDate DATE NOT NULL,
Amount MONEY NOT NULL
);
INSERT #Sales (ItemID, SalesDate, Amount)
VALUES
(1, '2013-01-15', 333), (1, '2013-01-14', 111), (1, '2012-12-13', 100), (1, '2012-11-12', 150),
(2, '2013-01-11', 200), (2, '2012-12-10', 150), (3, '2013-01-09', 900);
-- Parameters (current year & month)
DECLARE #pYear SMALLINT = 2013,
#pMonth TINYINT = 1;
DECLARE #FirstDayOfCurrentMonth DATE = CONVERT(DATE, CONVERT(CHAR(4), #pYear) + '-' + CONVERT(CHAR(2), #pMonth) + '-01');
DECLARE #StartDate DATE = DATEADD(MONTH, -1, #FirstDayOfCurrentMonth), -- Begining of the previous month
#EndDate DATE = DATEADD(DAY, -1, DATEADD(MONTH, 1, #FirstDayOfCurrentMonth)) -- End of the current month
SELECT TOP(3) t.ItemID,
t.[2]-t.[1] AS IncreaseAmount
FROM
(
SELECT y.ItemID, y.Amount,
DENSE_RANK() OVER(ORDER BY y.FirstDayOfSalesMonth ASC) AS MonthNum -- 1=Previous Month, 2=Current Month
FROM
(
SELECT x.ItemID, x.Amount,
DATEADD(MONTH, DATEDIFF(MONTH, 0, x.SalesDate), 0) AS FirstDayOfSalesMonth
FROM #Sales x
WHERE x.SalesDate BETWEEN #StartDate AND #EndDate
) y
) z
PIVOT( SUM(z.Amount) FOR z.MonthNum IN ([1], [2]) ) t
ORDER BY IncreaseAmount DESC;
SQLFiddle demo
Your sample data seems to be incomplete, however, here is my try. I assume that you want to know the three items with the greatest sales-difference from one month to the next:
WITH Increases AS
(
SELECT a1.itemid,
a1.sales - (SELECT a2.sales
FROM dbo.abc a2
WHERE a1.itemid = a2.itemid
AND ( ( a1.year = a2.year
AND a1.month > 1
AND a1.month = a2.month + 1 )
OR ( a1.year = a2.year + 1
AND a1.month = 1
AND a2.month = 12 ) ))AS IncreaseInSales
FROM dbo.abc a1
)
SELECT TOP 3 ItemID, MAX(IncreaseInSales) AS IncreaseInSales
FROM Increases
GROUP BY ItemID
ORDER BY MAX(IncreaseInSales) DESC
Demo
SELECT
cur.[ItemId]
MAX(nxt.[Sales] - cur.[Sales]) AS [IncreaseInSales]
FROM ABC cur
INNER JOIN ABC nxt ON (
nxt.[Year] = cur.[Year] + cur.[month]/12 AND
nxt.[Month] = cur.[Month]%12 + 1
)
GROUP BY cur.[ItemId]
I'd do this this way. It should work in all the tagged versions of SQL Server:
SELECT TOP 3 [ItemId],
MAX(CASE WHEN [Month] = 2 THEN [Sales] END) -
MAX(CASE WHEN [Month] = 1 THEN [Sales] END) [Diff]
FROM t
WHERE [Month] IN (1, 2) AND [Year] = 2013
GROUP BY [ItemId]
HAVING COUNT(*) = 2
ORDER BY [Diff] DESC
Fiddle here.
The reason why I'm adding the HAVING clause is that if any item is added in only one of the months then the numbers will be all wrong. So I'm only comparing items that are only present in both months.
The reason of the WHERE clause would be to filter in advance only the needed months and improve the efficiency of the query.
An SQL Server 2012 solution could also be:
SELECT TOP 3 [ItemId], [Diff] FROM (
SELECT [ItemId],
LEAD([Sales]) OVER (PARTITION BY [ItemId] ORDER BY [Month]) - [Sales] Diff
FROM t
WHERE [Month] IN (1, 2) AND [Year] = 2013
) s
WHERE [Diff] IS NOT NULL
ORDER BY [Diff] DESC

SQL query to list the latest rows for every week

In SQL Server, I have the following data and I want to get the latest row for every week.
I have the following table. The WW (week number of the date) column doesn't exist, just the date, I put it there to explain better the purpose of the query.
Number Date WW
392 2012-07-23 30
439 2012-07-24 30
735 2012-07-25 30
882 2012-07-26 30 *
193 2012-07-30 31
412 2012-07-31 31
425 2012-08-01 31
748 2012-08-02 31
711 2012-08-03 31 *
757 2012-08-07 32
113 2012-08-08 32 *
444 2012-08-15 33 *
The desired result of the query should be
882 30
711 31
113 32
444 33
Basically what I want to get is the latest row of every week. I have found examples like this Get the records of last month in SQL server
to find just the latest one, but I don't know how to extend them to get the list of results for every week. The rows can be stored in any date, for example maybe in a week there are no results, or in a week there are 5 rows and the next week 10 rows. I would really appreciate some hints for this.
You should be able to use something like this:
select t1.number, t2.ww
from yourtable t1
inner join
(
select max(date) mxdate, datepart(week, [Date]) ww
from yourtable
group by datepart(week, [Date])
) t2
on t1.date = t2.mxdate
and datepart(week, t1.[Date]) = t2.ww;
Or you can use CTE:
;with cte as
(
select number, datepart(week, [Date]) ww,
row_number() over(partition by datepart(week, [Date])
order by date desc) rn
from yourtable
)
select number, ww
from cte
where rn = 1
See SQL Fiddle with Demo
If you use the MAX() version of this and you have two numbers with the same date, then you will return two records. Using the row_number() version will only return one record that meets the criteria due to applying the row_number filter (See Demo)
If you are using sql server you can do this:
declare #test table(Number int, Date datetime, WW int);
INSERT INTO #test
(Number, Date, WW)
VALUES
(392, '2012-07-22 17:00:00', 30),
(439, '2012-07-23 17:00:00', 30),
(735, '2012-07-24 17:00:00', 30),
(882, '2012-07-25 17:00:00', 30),
(193, '2012-07-29 17:00:00', 31),
(412, '2012-07-30 17:00:00', 31),
(425, '2012-07-31 17:00:00', 31),
(748, '2012-08-01 17:00:00', 31),
(711, '2012-08-05 17:00:00', 31),
(757, '2012-08-06 17:00:00', 32),
(113, '2012-08-07 17:00:00', 32),
(444, '2012-08-14 17:00:00', 33)
SELECT * FROM (
select number, date, ww,
row_number() over (partition by ww order by date desc) rn
from #test) v
WHERE rn = 1;
This can be done with a correlated sub-query as shown below:
SELECT number, [date]
FROM tablename T
WHERE NOT EXISTS (
SELECT 1
FROM tablename
WHERE DATEPART(wk, [date]) = DATEPART(wk, T.[date])
AND date > T.[date]
)
Note this does not return 2012-08-06 as I believe this is actually in the 32nd week of the year http://www.wolframalpha.com/input/?i=6th+August+2012

Resources