SQL Pivot question - sql-server

I'm having a hard time getting my head around a query I'm trying to build with SQL Server 2005.
I have a table, let's call it sales:
SaleId (int) (pk) EmployeeId (int) SaleDate(datetime)
I want to produce a report listing the total number of sales by each employee for each day in a given date range.
So, for example, I want to see all sales from December 1st 2009 to December 31st 2009 with an output like:
EmployeeId Dec1 Dec2 Dec3 Dec4
1 10 10 1 20
2 25 10 2 2
...etc. However, the dates need to be flexible.
I've messed around with using PIVOT but can't quite seem to get it. Any ideas welcome!

Here's a complete example. You can change the date range to fit your needs.
use sandbox;
create table sales (SaleId int primary key, EmployeeId int, SaleAmt float, SaleDate datetime); -- datetime as in the question; the date type needs SQL Server 2008+
insert into sales values (1,1,10,'2009-12-1');
insert into sales values (2,1,10,'2009-12-2');
insert into sales values (3,1,1,'2009-12-3');
insert into sales values (4,1,20,'2009-12-4');
insert into sales values (5,2,25,'2009-12-1');
insert into sales values (6,2,10,'2009-12-2');
insert into sales values (7,2,2,'2009-12-3');
insert into sales values (8,2,2,'2009-12-4');
SELECT * FROM
(SELECT EmployeeID, DATEPART(d, SaleDate) SaleDay, SaleAmt
FROM sales
WHERE SaleDate between '20091201' and '20091204'
) src
PIVOT (SUM(SaleAmt) FOR SaleDay
IN ([1],[2],[3],[4],[5],[6],[7],[8],[9],[10],[11],[12],[13],[14],[15],[16],[17],[18],[19],[20],[21],[22],[23],[24],[25],[26],[27],[28],[29],[30],[31])) AS pvt;
Results (all 31 day-of-month columns are returned; only the first 4 are shown here):
EmployeeID 1 2 3 4
1 10 10 1 20
2 25 10 2 2

I tinkered a bit, and I think this is how you can do it with PIVOT:
select employeeid
, [2009/12/01] as Dec1
, [2009/12/02] as Dec2
, [2009/12/03] as Dec3
, [2009/12/04] as Dec4
from sales pivot (
count(saleid)
for saledate
in ([2009/12/01],[2009/12/02],[2009/12/03],[2009/12/04])
) as pvt
(this is my table:
CREATE TABLE [dbo].[sales](
[saleid] [int] NULL,
[employeeid] [int] NULL,
[saledate] [date] NULL
)
data is: 10 rows for '2009/12/01' for emp1, 25 rows for '2009/12/01' for emp2, 10 rows for '2009/12/02' for emp1, etc.)
Now, I must say, this is the first time I've used PIVOT and perhaps I am not grasping it, but this seems pretty useless to me. I mean, what good is a crosstab if you cannot specify the columns dynamically?
EDIT: OK, dcp's answer does it. The trick is that you don't have to name the columns explicitly in the SELECT list: * correctly expands to a column for the first 'unpivoted' column, plus a generated column for each value that appears in the FOR..IN clause of the PIVOT construct.
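The IN list itself still has to be spelled out in the query text, so a flexible date range usually means building that list at run time. A minimal sketch of that idea using dynamic SQL, assuming the question's sales(SaleId, EmployeeId, SaleDate) table. The variables @from, @to, @cols and @sql are hypothetical names, and the column headings come out as yyyy-mm-dd rather than Dec1, Dec2:

```sql
-- Hypothetical report parameters (DECLARE and SET are kept separate for SQL Server 2005).
DECLARE @from datetime, @to datetime;
SET @from = '20091201';
SET @to   = '20091231';

DECLARE @cols nvarchar(max), @sql nvarchar(max);

-- Build one bracketed column name per distinct sale day in the range,
-- e.g. [2009-12-01],[2009-12-02],...
SELECT @cols = STUFF((
    SELECT ',' + QUOTENAME(CONVERT(char(10), d.SaleDay, 120))
    FROM (SELECT DISTINCT DATEADD(day, DATEDIFF(day, 0, SaleDate), 0) AS SaleDay
          FROM sales
          WHERE SaleDate >= @from AND SaleDate < DATEADD(day, 1, @to)) AS d
    ORDER BY d.SaleDay
    FOR XML PATH('')
), 1, 1, '');

-- Splice the generated list into both the SELECT list and the PIVOT IN list.
SET @sql = N'
SELECT EmployeeId, ' + @cols + N'
FROM (SELECT EmployeeId, SaleId,
             CONVERT(char(10), SaleDate, 120) AS SaleDay
      FROM sales
      WHERE SaleDate >= @from AND SaleDate < DATEADD(day, 1, @to)) AS src
PIVOT (COUNT(SaleId) FOR SaleDay IN (' + @cols + N')) AS pvt
ORDER BY EmployeeId;';

-- The date range is passed as real parameters; only the column names are concatenated.
EXEC sp_executesql @sql, N'@from datetime, @to datetime', @from = @from, @to = @to;
```

Only days that have at least one sale become columns in this sketch. If every calendar day in the range should appear, the column list would have to come from a calendar or numbers table instead.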

Related

Finding a difference then the largest value over time

How do you get the row that gained most value over a period of time out of the large group set?
I've seen some overly-complicated variations on this question, and none with a good answer. I've tried to put together the simplest possible example:
Given a table like the one below, with row#, ID, year, and value columns, how would you find an ID that gained the most value and display the difference as a new column in the output?
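| Column A | ID  | Year | Value   |
|----------|-----|------|---------|
| row 1    | 322 | 2012 | 150,000 |
| row 2    | 322 | 2013 | 165,000 |
| row 3    | 344 | 2012 | 220,000 |
| row 4    | 344 | 2013 | 290,000 |
Desired output:
| ID  | Value   | Value_Gained |
|-----|---------|--------------|
| 344 | 290,000 | 70,000       |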
SELECT id, year, value
FROM table
WHERE value = (SELECT MAX(value) FROM table);
The FIRST_VALUE window function gives you the value for the last and the first year of each ID. Then it is enough to order by the gain and keep one row using TOP(N).
SELECT TOP(1)
ID,
FIRST_VALUE([Value]) OVER(PARTITION BY [ID] ORDER BY [Year] DESC) AS [Value],
FIRST_VALUE([Value]) OVER(PARTITION BY [ID] ORDER BY [Year] DESC)
- FIRST_VALUE([Value]) OVER(PARTITION BY [ID] ORDER BY [Year]) AS [ValueGained]
FROM tab
ORDER BY [ValueGained] DESC
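The same result can also be computed with one output row per ID instead of one per source row. A sketch under assumptions: the sample rows are loaded into a hypothetical table variable @tab with integer values, and ties for the largest gain are broken arbitrarily by TOP (1):

```sql
-- Sample rows from the question (@tab is a hypothetical name).
DECLARE @tab TABLE (ID int, [Year] int, [Value] int);
INSERT INTO @tab (ID, [Year], [Value]) VALUES
    (322, 2012, 150000), (322, 2013, 165000),
    (344, 2012, 220000), (344, 2013, 290000);

WITH ranked AS (
    -- Number each ID's rows from the earliest and from the latest year.
    SELECT ID, [Year], [Value],
           ROW_NUMBER() OVER (PARTITION BY ID ORDER BY [Year])      AS rn_first,
           ROW_NUMBER() OVER (PARTITION BY ID ORDER BY [Year] DESC) AS rn_last
    FROM @tab
)
SELECT TOP (1)
       ID,
       MAX(CASE WHEN rn_last  = 1 THEN [Value] END) AS [Value],        -- latest value
       MAX(CASE WHEN rn_last  = 1 THEN [Value] END)
     - MAX(CASE WHEN rn_first = 1 THEN [Value] END) AS Value_Gained    -- latest minus earliest
FROM ranked
GROUP BY ID
ORDER BY Value_Gained DESC;
```

With the four sample rows this returns ID 344 with Value 290000 and Value_Gained 70000.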

Joining column spread across multiple rows based on condition

I have an employee table in which a single comment is spread across multiple rows. Those rows need to be joined into a single row. To identify which rows belong together we need to use the date field: if a row has a date and the subsequent rows have no date, the dated row denotes the start of a comment for the employee. If, however, there is a single row with no date and no prior dated row either, it is considered a new comment. The order in which the comments were entered (an identity column) is also provided, so I was trying to use the LEAD function.
Below is the table we have:
| EmployeeId | Date       | Comment                         | Order |
|------------|------------|---------------------------------|-------|
| 1001       | 2021-01-08 | This is only first part         | 1     |
| 1001       | NULL       | this is the second              | 2     |
| 1001       | NULL       | and this is third part          | 3     |
| 1001       | 2021-01-15 | This is a new comment for same  | 4     |
| 1002       | 2021-01-16 | This one has subsequent comment | 5     |
| 1002       | 2021-01-16 | The second comment              | 6     |
| 1003       | NULL       | This is single comment          | 7     |
| 1003       | 2021-01-12 | This is also single comment     | 8     |
The result we expect is:
| EmployeeId | Date       | Comment                                                           | Order |
|------------|------------|-------------------------------------------------------------------|-------|
| 1001       | 2021-01-08 | This is only first part this is the second and this is third part | 1     |
| 1001       | NULL       | This is a new comment for same                                    | 4     |
| 1002       | 2021-01-15 | This one has subsequent comment The second comment                | 5     |
| 1003       | 2021-01-16 | This is single comment                                            | 7     |
| 1003       | 2021-01-16 | This is also single comment                                       | 8     |
I am trying the LEAD function but can't work out how to join an arbitrary number of rows based on a condition. Any help?
SQL :
CREATE TABLE Comments(
[EmployeeID] [int] NOT NULL,
[Date] [date] null,
[Comment] [varchar](100) NULL,
[Order] [int] NULL
)
INSERT INTO Comments VALUES('1001','1/8/2021', 'This is only first part', 1)
INSERT INTO Comments VALUES('1001',NULL, 'this is the second', 2)
INSERT INTO Comments VALUES('1001',NULL, 'and this is third part', 3)
INSERT INTO Comments VALUES('1001','1/15/2021', 'This is a new comment for same', 4)
INSERT INTO Comments VALUES('1002','1/16/2021', 'This one has subsequent comment', 5)
INSERT INTO Comments VALUES('1002','1/16/2021', 'The second comment', 6)
INSERT INTO Comments VALUES('1003',NULL, 'This is single comment', 7)
INSERT INTO Comments VALUES('1003','1/12/2021', 'This is also single comment', 8)
I left a bunch of comments about details in your "expected results" that did not make sense given your requirements. If I go by your stated requirements, then here is a solution:
First, normalize the table so that every row has a date and we can use GROUP BY:
SELECT EmployeeID,
COALESCE([Date],
LAG([Date]) OVER (ORDER BY [Order] ASC), -- Get the prior one if null
MIN([Date]) OVER (PARTITION BY EmployeeID)) AS cDate, -- Get the smallest one if the last two are null
Comment,
[Order]
FROM sometableyoudidnotname
Now that we have this table we can use GROUP BY and STRING_AGG:
SELECT EmployeeID,
MIN(cDate) as [DATE],
STRING_AGG(Comment, ' ') WITHIN GROUP (ORDER BY [ORDER] ASC) AS Comment
FROM (
SELECT EmployeeID,
COALESCE([Date],
LAG([Date]) OVER (ORDER BY [Order] ASC), -- Get the prior one if null
MIN([Date]) OVER (PARTITION BY EmployeeID)) AS cDate, -- Get the smallest one if the last two are null
Comment,
[Order]
FROM sometableyoudidnotname) X
GROUP BY EmployeeID
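Going by the stated rule rather than the expected-results table, the grouping can also be treated as a gaps-and-islands problem: a running count of non-NULL dates per employee gives every dated row a new group number, and the undated rows that follow inherit it. A sketch, assuming the Comments table from the question and SQL Server 2017 or later for STRING_AGG. The CTE name grouped and the column grp are hypothetical:

```sql
WITH grouped AS (
    SELECT EmployeeID, [Date], Comment, [Order],
           -- COUNT([Date]) skips NULLs, so the counter only increases on dated rows.
           COUNT([Date]) OVER (PARTITION BY EmployeeID
                               ORDER BY [Order]
                               ROWS UNBOUNDED PRECEDING) AS grp
    FROM Comments
)
SELECT EmployeeID,
       MIN([Date])  AS [Date],     -- the date of the row that started the comment, if any
       STRING_AGG(Comment, ' ') WITHIN GROUP (ORDER BY [Order]) AS Comment,
       MIN([Order]) AS [Order]     -- position of the first part
FROM grouped
GROUP BY EmployeeID, grp
ORDER BY MIN([Order]);
```

Under this rule rows 5 and 6 (both dated) stay separate, which differs from the expected-results table in the question. Leading undated rows of an employee all fall into group 0 and are joined with each other.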

How to collect all differences in rows between two periods?

I'm trying to see the difference between two periods for a column.
For example, we see that sales decreased at the end of the month, and we need to see which products were not sold at the end of the month.
I can write a SELECT that shows the quantity of each product for one period:
SELECT product_id, count(product_id) AS Count
FROM testDB
WHERE
sales_date IS NOT NULL
AND
delivery_date BETWEEN '2021-02-01 00:00:03.0000000' AND '2021-02-14 23:56:00.0000000'
GROUP BY
product_id
and the same SELECT with another period:
delivery_date BETWEEN '2021-02-14 00:00:03.0000000' AND '2021-02-28 23:56:00.0000000'
So, after these queries I see a list of 10 products with quantities for the first period and a list of 7 products with quantities for the second period. I can't get the difference between the lists returned by the two SELECTs. I tried to use != and NOT IN, but without any results.
I will be very grateful for your help. Thanks
Sorry for the confusion. I meant the difference between the two selects:
The result of the first one (for first period):
Product_ID Count
grapes. 100
lime. 13
lemon. 15
cherry. 222
blueberry. 123
banana. 1
apple. 123
watermelon 56
and second one (for second period):
Product_ID Count
grapes. 10
lime. 1
lemon. 10
cherry. 2
blueberry. 13
banana. 12
and I want to see the difference between these selects:
Product_ID Count
apple. 0
watermelon. 0
So we did not sell any apples and watermelons in second period.
SELECT product_id, count(product_id) AS Count,delivery_date-sales_date as DIFFERENCE
FROM testDB
WHERE
sales_date IS NOT NULL
AND
delivery_date BETWEEN '2021-02-01 00:00:03.0000000' AND '2021-02-14 23:56:00.0000000'
GROUP BY
product_id
This should work for getting the difference between the 2 period columns.
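If the goal is the list of products that appear in the first period but not in the second, one option is to count both periods in a single pass and keep the products whose second-period count is zero. A sketch, assuming the testDB table and columns from the question. The half-open date boundaries and the column aliases are my assumptions:

```sql
SELECT product_id,
       -- Rows delivered in the first half of the month
       SUM(CASE WHEN delivery_date >= '20210201' AND delivery_date < '20210215'
                THEN 1 ELSE 0 END) AS first_period_count,
       -- Rows delivered in the second half of the month
       SUM(CASE WHEN delivery_date >= '20210215' AND delivery_date < '20210301'
                THEN 1 ELSE 0 END) AS second_period_count
FROM testDB
WHERE sales_date IS NOT NULL
  AND delivery_date >= '20210201' AND delivery_date < '20210301'
GROUP BY product_id
-- Keep only products with nothing in the second period
HAVING SUM(CASE WHEN delivery_date >= '20210215' AND delivery_date < '20210301'
                THEN 1 ELSE 0 END) = 0;
```

Because the WHERE clause limits the rows to the two periods, every product that survives the HAVING filter has at least one row in the first period. For the sample lists in the question that would be apple and watermelon, each with a second-period count of 0.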

How Sum with SQL server RollUP|SQL CUBE

Many thanks for taking the time to offer suggestions/help.
I have the data below and would like to calculate the total/sum across all the columns, grouped by year, using SQL Server 2012. The data holds the scores of four countries for different games; I would like to get the total score for all (4 distinct) countries together, grouped by year.
SELECT
Country
,SUM(Football) AS Football
,SUM(Basket) AS Basket
,SUM(Ball) AS Ball
,SUM(Volleyball) Volleyball
,Year
From CountryScore
GROUP BY
GROUPING SETS(Year,())
Sample Data
Expected Result
You can wrap the addition of the columns you want to total in SUM.
CREATE TABLE CountryScore(
year int,
Football int,
Basket int,
Volleyball int
);
INSERT INTO CountryScore VALUES (1996,4,6,7);
INSERT INTO CountryScore VALUES (1996,4,6,7);
INSERT INTO CountryScore VALUES (1996,6,7,7);
INSERT INTO CountryScore VALUES (1996,6,7,7);
INSERT INTO CountryScore VALUES (2000,7,4,8);
INSERT INTO CountryScore VALUES (2000,7,6,5);
INSERT INTO CountryScore VALUES (2000,6,6,6);
INSERT INTO CountryScore VALUES (2000,6,8,8);
Query 1:
SELECT
[Year]
,SUM(Football + Basket + Volleyball) AS 'Total Score'
From CountryScore
GROUP BY
[Year]
Results:
| Year | Total Score |
|------|-------------|
| 1996 | 74 |
| 2000 | 77 |
If a value column can be NULL, you can wrap it in ISNULL to turn NULL into 0.
SELECT
[Year]
,SUM(ISNULL(Football,0) + ISNULL(Basket,0) + ISNULL(Volleyball,0)) AS 'Total Score'
From CountryScore
GROUP BY
[Year]
If you want the totals per game instead, you can use the query below:
SELECT year,sum(Football)Football,sum(Basket)Basket,sum(Volleyball) Volleyball
From #CountryScore
group by grouping sets(year,())
Thanks
Sasi
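The empty grouping set () adds a grand-total row whose Year is NULL. The GROUPING function can be used to label that row and to sort it last. A sketch, assuming the sample data from the answer above loaded into a hypothetical table variable @CountryScore. The YearLabel alias and the 'All years' text are my own:

```sql
DECLARE @CountryScore TABLE ([Year] int, Football int, Basket int, Volleyball int);
INSERT INTO @CountryScore ([Year], Football, Basket, Volleyball) VALUES
    (1996,4,6,7), (1996,4,6,7), (1996,6,7,7), (1996,6,7,7),
    (2000,7,4,8), (2000,7,6,5), (2000,6,6,6), (2000,6,8,8);

SELECT CASE WHEN GROUPING([Year]) = 1 THEN 'All years'   -- row produced by the () set
            ELSE CAST([Year] AS varchar(10)) END AS YearLabel,
       SUM(Football)   AS Football,
       SUM(Basket)     AS Basket,
       SUM(Volleyball) AS Volleyball,
       SUM(Football + Basket + Volleyball) AS [Total Score]
FROM @CountryScore
GROUP BY GROUPING SETS (([Year]), ())
ORDER BY GROUPING([Year]), [Year];                        -- per-year rows first, total last
```

With these rows the query returns 1996 (20, 26, 28, total 74), 2000 (26, 24, 27, total 77) and an 'All years' row (46, 50, 55, total 151).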

Calculating Year to Date Total

I want to generate a Payroll type query whereby the values in Payroll 1 (say for the previous month) should be included in Payroll 2 (for the current month) Year-to-Date Totals.
This can best be explained with an example:
DECLARE @MyTable TABLE(ID INT IDENTITY, PayrollID INT, Description NVARCHAR(MAX), [Current Month] MONEY)
INSERT INTO @MyTable
VALUES (1,'Basic Salary',100),
(1,'Normal Over Time',50),
(1,'Work on Saturday',150),
(1,'Work on Sunday',200),
(2,'Basic Salary',100)
SELECT * ,SUM([Current Month]) OVER (PARTITION BY Description ORDER BY PayrollID) AS [Month to Date]
FROM @MyTable
When I run the above I get
ID EmployeeID PayrollID Description Current Month Month to Date
1 1 1 Basic Salary 100 100
2 1 1 Normal Over Time 50 50
3 1 1 Work on Saturday 150 150
4 1 1 Work on Sunday 200 200
5 1 2 Basic Salary 100 200
The Year-to-Date running totals are per each Description meaning Basic Salary Category has its own running total and so does Saturday and Sunday etc, etc. You will notice that for Basic Salary in Payroll 2 the running Year-to-Date total is 200 (i.e. 100 from Payroll 1 + 100 from Payroll 2)
The challenge I have is that Payroll 1 has data for Basic Salary, Work on Saturday and Work on Sunday whereas Payroll 2 only has Basic Salary as the employee did not work on Saturday nor on Sunday in Payroll 2 (the current month).
However, in the cumulative Year-to-Date column the data from Payroll 1 (previous month) should still be selected and included in the Year-to-Date running Total -
something like this:
ID EmployeeID PayrollID Description Current Month Month to Date
1 1 1 Basic Salary 100 100
2 1 1 Normal Over Time 50 50
3 1 1 Work on Saturday 150 150
4 1 1 Work on Sunday 200 200
5 1 2 Basic Salary 100 200
2 1 1 Normal Over Time NULL 50
3 1 1 Work on Saturday NULL 150
4 1 1 Work on Sunday NULL 200
Although the employee did not work on Saturday nor Sunday in the current month (Payroll 2) the running (Year-to-Date) totals for working on a Saturday should be 150 that he/she worked in the previous month (Payroll 1). The same should apply to working on Sunday where the running total in the current month (Payroll 2) should be the 200 that he/she worked in the previous month (Payroll 1).
How do I do that with a simple Select Statement without writing a complicated Procedure?
EDIT:
I have cleaned up the code as follows:
DECLARE @MyTable TABLE(ID INT IDENTITY, EmployeeID INT, PayrollID INT, Description NVARCHAR(MAX), [Current Month] MONEY)
INSERT INTO @MyTable
VALUES (1,1,'Basic Salary',100),
(1,1,'Normal Over Time',50),
(1,1,'Work on Saturday',150),
(1,1,'Work on Sunday',200),
(1,2,'Basic Salary',100)
WITH pay_elements AS
(
SELECT Description
FROM @MyTable
GROUP BY Description
)
,pay_slips AS
(
SELECT EmployeeID, PayrollID
FROM @MyTable
GROUP BY EmployeeID, PayrollID
)
,pay_lines AS
(
SELECT
mt.ID
,PS.EmployeeID
,PS.PayrollID
,PE.Description
,ISNULL(mt.[Current Month], 0) AS [Current Month]
FROM
pay_slips AS ps
OUTER APPLY
pay_elements AS pe
LEFT JOIN
@MyTable AS mt
ON (mt.EmployeeID = ps.EmployeeID)
AND (mt.PayrollID = ps.PayrollID)
AND (mt.Description = pe.Description)
)
SELECT * ,SUM([Current Month]) OVER (PARTITION BY EmployeeID, Description ORDER BY PayrollID) AS [Month to Date]
FROM pay_lines
And I get this error:
Msg 319, Level 15, State 1, Line 10
Incorrect syntax near the keyword 'with'. If this statement is a common table expression, an xmlnamespaces clause or a change tracking context clause, the previous statement must be terminated with a semicolon.
Msg 102, Level 15, State 1, Line 17
Incorrect syntax near ','.
Msg 102, Level 15, State 1, Line 23
Incorrect syntax near ','.
You first need to build a "structure" of row headings, and then join that onto the actual data.
So for example:
WITH pay_elements AS
(
SELECT Description
FROM @MyTable
GROUP BY Description
)
,pay_slips AS
(
SELECT EmployeeID, PayrollID
FROM @MyTable
GROUP BY EmployeeID, PayrollID
)
,pay_lines AS
(
SELECT
mt.ID
,ps.EmployeeID
,ps.PayrollID
,pe.Description
,ISNULL(mt.[Current Month], 0) AS [Current Month]
FROM
pay_slips AS ps
CROSS JOIN -- every payslip paired with every pay element
pay_elements AS pe
LEFT JOIN
@MyTable AS mt
ON (mt.EmployeeID = ps.EmployeeID)
AND (mt.PayrollID = ps.PayrollID)
AND (mt.Description = pe.Description)
)
SELECT * ,SUM([Current Month]) OVER (PARTITION BY EmployeeID, Description ORDER BY PayrollID) AS [Month to Date]
FROM pay_lines
What we're doing here is getting a list of the different kinds of pay elements in your table. Then we're getting a list of the Employees and Payrolls processed to date, and forcing every Payroll to include a row for each possible pay element.
Once that structure is built, we join onto the base table to get the actual values (replacing NULLs with zeros, for those pay elements that weren't originally included in the base table).
Then we simply query this padded-out table in the same way you did originally.
Note, I've written this on the fly and haven't checked this code so please excuse any minor errors.
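The Msg 319 error in the question's edit is about statement termination rather than the CTE logic: a WITH that starts a common table expression must follow a statement that ends with a semicolon, and the INSERT before it has none. A minimal sketch of the pattern, using a hypothetical table variable @t and made-up amounts:

```sql
DECLARE @t TABLE (PayrollID int, Amount money);

INSERT INTO @t (PayrollID, Amount)
VALUES (1, 100), (2, 100);   -- this terminating semicolon is what Msg 319 asks for

WITH totals AS (
    SELECT PayrollID, SUM(Amount) AS Total
    FROM @t
    GROUP BY PayrollID
)
SELECT PayrollID,
       SUM(Total) OVER (ORDER BY PayrollID) AS RunningTotal   -- cumulative total by payroll
FROM totals;
```

This returns a running total of 100 for payroll 1 and 200 for payroll 2. The later Msg 102 errors in the question are most likely follow-on errors from the same missing semicolon.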
I am a little confused by the Year-to-Date column you mention in your description. I assume it is the [Month to Date] column in your query. Please correct me if I am wrong.
I think what you are trying to achieve is this: the descriptions that are not present in payroll ID 2, like Work on Saturday and Work on Sunday, should also appear in the result set.
Problem is:
SUM ignores NULLs, so a NULL [Current Month] would not by itself stop 50, 150, 200 from showing in the [Month to Date] column; the problem is that payroll ID 2 has no rows at all for those descriptions.
You can have fixed categories against each payroll id:
Normal Over Time
Work on Saturday
Work on Sunday
Basic Salary
Query:
DECLARE @MyTable TABLE(ID INT IDENTITY, PayrollID INT, Description NVARCHAR(MAX), [Current Month] MONEY)
INSERT INTO @MyTable
VALUES (1,'Basic Salary',100),
(1,'Normal Over Time',50),
(1,'Work on Saturday',150),
(1,'Work on Sunday',200),
(2,'Basic Salary',100),
(2,'Normal Over Time',0),
(2,'Work on Saturday',0),
(2,'Work on Sunday',0)
SELECT * ,SUM([Current Month]) OVER (PARTITION BY Description ORDER BY PayrollID) AS [Month to Date]
FROM @MyTable order by ID,PayrollID
