Aggregation in dynamic pivot table in SQL Server 2012 - sql-server

I have a table in SQL Server 2012 with 5 million rows.
The table looks like this:
CustomerID ProdID FavouriteProduct
1 A A
1 A A
1 A A
1 B A
1 A A
1 A A
1 A A
1 B A
2 A C
2 AN C
2 G C
2 C C
2 C C
2 F C
2 D C
2 C C
As you can see, there are many different products.
I wrote this query to pivot them:
DECLARE @DynamicPivotQuery AS NVARCHAR(MAX)
DECLARE @ColumnName AS NVARCHAR(MAX)
--Get distinct values of the PIVOT Column
SELECT
@ColumnName = ISNULL(@ColumnName + ',','') + QUOTENAME(prodID)
FROM
(SELECT DISTINCT ProdID FROM Table) AS Prods
--Prepare the PIVOT query using the dynamic
SET @DynamicPivotQuery =
N'SELECT CustomerID, ' + @ColumnName + '
FROM table
PIVOT(count(CustomerID)
FOR ProdID IN (' + @ColumnName + ')) AS PVTTable'
--Execute the Dynamic Pivot Query
EXEC sp_executesql @DynamicPivotQuery
I expected count(*) or Count(SubID) to count how many of each product each customer bought, but it doesn't. Instead, it fails with:
Invalid Column name CustomerID

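You can't count CustomerID while also selecting it. Try this instead:
SET @DynamicPivotQuery =
N'SELECT CustomerID, ' + @ColumnName + '
FROM table
PIVOT(count(FavouriteProduct)
FOR ProdID IN (' + @ColumnName + ')) AS PVTTable'
PIVOT consumes the column it aggregates, so CustomerID is no longer available to the outer SELECT once it appears inside count(); that is why you get "Invalid column name". You need a different column to count. If FavouriteProduct doesn't work, I suggest you fake a column or find another one.
Since your table has more columns than you described, I modified the query to pass only the needed columns to PIVOT (any extra column would otherwise become an additional grouping column). Try this instead: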
SET @DynamicPivotQuery =
N'SELECT CustomerID, ' + @ColumnName + '
FROM
(SELECT CustomerID, FavouriteProduct, ProdID FROM table) x
PIVOT(count(FavouriteProduct)
FOR ProdID IN (' + @ColumnName + ')) AS PVTTable'
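To see the fix in isolation, here is a minimal static sketch. It assumes a hypothetical table variable @Purchases holding a handful of rows shaped like the question's data, and a hard-coded column list in place of the dynamic one:

```sql
-- Hypothetical sample data mirroring the question's table (names are assumptions).
DECLARE @Purchases TABLE (
    CustomerID       INT,
    ProdID           VARCHAR(10),
    FavouriteProduct VARCHAR(10)
);

INSERT INTO @Purchases (CustomerID, ProdID, FavouriteProduct)
VALUES (1, 'A', 'A'), (1, 'A', 'A'), (1, 'B', 'A'),
       (2, 'C', 'C'), (2, 'C', 'C'), (2, 'A', 'C');

-- The derived table exposes only three columns:
--   CustomerID       -> left over, so it becomes the grouping column
--   ProdID           -> spread into the output columns [A], [B], [C]
--   FavouriteProduct -> consumed by COUNT
SELECT CustomerID, [A], [B], [C]
FROM (SELECT CustomerID, ProdID, FavouriteProduct FROM @Purchases) AS src
PIVOT (COUNT(FavouriteProduct) FOR ProdID IN ([A], [B], [C])) AS pvt
ORDER BY CustomerID;
```

With this data, customer 1 should come back as A = 2, B = 1, C = 0 and customer 2 as A = 1, B = 0, C = 2. Note that COUNT skips NULLs, so the counted column should not be nullable if every row has to be counted.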

Related

Using pivot and then counting the values in each pivot column

I have a table that looks like this:
Agent_id break_id time
1 1 15
1 2 12
1 2 12
I used pivot to get this structure:
Agent_id 1 2
1 15 24
The problem is that I need to get the count for the pivoted columns, in the example I need to have a structure like this:
Agent_id 1 2 count1 count2
1 15 24 1 2
And I'm not sure on how to do it ... this is the query so far.
DECLARE @COLUMNS VARCHAR(MAX)
DECLARE @QUERY nVARCHAR(MAX)
SELECT @COLUMNS = COALESCE(@COLUMNS + ', ','') + QUOTENAME([break_id])
FROM
(SELECT DISTINCT [break_id] FROM test) AS B
ORDER BY B.[break_id]
SET @QUERY = '
SELECT agent_id,
'+@COLUMNS+'
FROM (
SELECT TOP (1000)
agent_id,break_id,time_inbreak
FROM test
) as pivotData
PIVOT (
SUM(time_inbreak)
FOR break_id IN ('+@COLUMNS+')
) as pivotResult
'
EXEC sp_executesql @QUERY
Any help is greatly appreciated
Unfortunately, PIVOT can only apply a single aggregate to a single column. But we can produce several aggregates per pivoted value using conditional aggregation, SUM(CASE WHEN ...) and COUNT(CASE WHEN ...):
DECLARE @COLUMNS VARCHAR(MAX)
DECLARE @QUERY nVARCHAR(MAX)
-- Build one SUM and one COUNT expression per distinct break_id.
-- CONCAT converts a numeric break_id to text; each alias is quoted as a whole, e.g. [Sum1], [Count1].
SELECT @COLUMNS = COALESCE(@COLUMNS + ', ','') +
QUOTENAME(CONCAT('Sum', [break_id])) +
CONCAT(' = SUM(CASE WHEN break_id = ', [break_id], ' THEN time_inbreak END), ') +
QUOTENAME(CONCAT('Count', [break_id])) +
CONCAT(' = COUNT(CASE WHEN break_id = ', [break_id], ' THEN 1 END)')
FROM
(SELECT DISTINCT [break_id] FROM test) AS B
ORDER BY B.[break_id]
OPTION (MAXDOP 1);
SET @QUERY = '
SELECT agent_id,
'+@COLUMNS+'
FROM (
SELECT TOP (1000)
agent_id,break_id,time_inbreak
FROM test
) as pivotData
GROUP BY agent_id;
';
PRINT @QUERY
EXEC sp_executesql @QUERY
I must say, I'm not sure how safe it is to concatenate the column list into a variable like that, especially in the face of parallelism. Preferably use STRING_AGG (SQL Server 2017 and later) or FOR XML PATH(''), TYPE. At the very least, I have added OPTION (MAXDOP 1) to prevent parallelism.
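As a rough sketch of the FOR XML PATH(''), TYPE alternative, the column list can be built in a single statement without relying on variable concatenation. The table variable @test and its three rows are assumptions that mirror the question's sample data, and break_id is assumed to be numeric:

```sql
-- Hypothetical stand-in for the question's test table.
DECLARE @test TABLE (agent_id INT, break_id INT, time_inbreak INT);
INSERT INTO @test (agent_id, break_id, time_inbreak)
VALUES (1, 1, 15), (1, 2, 12), (1, 2, 12);

DECLARE @columns NVARCHAR(MAX);

-- FOR XML PATH('') concatenates the rows in ORDER BY order;
-- TYPE plus .value() returns the text without XML entity escaping;
-- STUFF removes the leading ', ' separator.
SELECT @columns = STUFF((
    SELECT ', ' + QUOTENAME(CONCAT('Sum', b.break_id))
         + CONCAT(' = SUM(CASE WHEN break_id = ', b.break_id, ' THEN time_inbreak END)')
         + ', ' + QUOTENAME(CONCAT('Count', b.break_id))
         + CONCAT(' = COUNT(CASE WHEN break_id = ', b.break_id, ' THEN 1 END)')
    FROM (SELECT DISTINCT break_id FROM @test) AS b
    ORDER BY b.break_id
    FOR XML PATH(''), TYPE
).value('.', 'NVARCHAR(MAX)'), 1, 2, '');

PRINT @columns;

-- Static equivalent of the query the generated list produces for this data.
SELECT agent_id,
       [Sum1]   = SUM(CASE WHEN break_id = 1 THEN time_inbreak END),
       [Count1] = COUNT(CASE WHEN break_id = 1 THEN 1 END),
       [Sum2]   = SUM(CASE WHEN break_id = 2 THEN time_inbreak END),
       [Count2] = COUNT(CASE WHEN break_id = 2 THEN 1 END)
FROM @test
GROUP BY agent_id;
```

For the three sample rows, agent 1 should show Sum1 = 15, Count1 = 1, Sum2 = 24, Count2 = 2, which matches the structure asked for in the question. A table variable is not visible inside sp_executesql, so against a real table you would pass @columns into the dynamic query exactly as in the answer above.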

Dynamic pivot with tweaks

I'm trying to experiment with some SQL pivoting. I have created a pivot by naming the columns, and I am now trying to do it dynamically. From a few searches I have come up with the following code:
USE [LMS]
DECLARE @DynamicPivotQuery AS NVARCHAR(MAX)
DECLARE @ColumnName AS NVARCHAR(MAX)
--Get distinct values of the PIVOT Column
SELECT @ColumnName= ISNULL(@ColumnName + ',','')
+ QUOTENAME(Learning_Path_Title)
FROM (SELECT DISTINCT [Learning_Path_Title] FROM [dbo].[copy from]) AS Courses
--Prepare the PIVOT query using the dynamic
SET @DynamicPivotQuery =
N'SELECT Distinct [Userd], ' + @ColumnName + '
FROM [dbo].[copy from]
PIVOT(MAX([Learning_Path_Complete])
FOR [Learning_Path_Title] IN (' + @ColumnName + ')) AS PVTTable'
--Execute the Dynamic Pivot Query
EXEC sp_executesql @DynamicPivotQuery
My data is structured in the following way
I want the table to be displayed as below

SQL Server Pivot and Sort

I have a shop order table with the item code, description, ReleaseDate, and required quantity.
How can I query it in pivot-table format so that the results are pivoted on the Year+Month of [ReleaseDate], with the Year+Month values as columns sorted from the oldest date to the latest?
This is my query, but it fails.
--Declare necessary variables
DECLARE @SQLQuery AS NVARCHAR(MAX)
DECLARE @PivotColumns AS NVARCHAR(MAX)
--Get unique values of pivot column
SELECT @PivotColumns = COALESCE(@PivotColumns + ',','') + QUOTENAME([YEARMONTH])
FROM (SELECT DISTINCT CONVERT(char(6), cast([releaseddate] as date), 112 ) as [YEARMONTH] FROM [dbo].[ShopOrder]) as PivotQuery
SELECT @PivotColumns
--Create the dynamic query with all the values for
--pivot column at runtime
SET @SQLQuery =
N'SELECT ItemCode, ' + @PivotColumns + '
FROM [dbo].[ShopOrder]
PIVOT( SUM(RequiredQty)
FOR [releaseddate] IN (' + @PivotColumns + ')) AS P'
SELECT @SQLQuery
--Execute dynamic query
EXEC sp_executesql @SQLQuery
This is the original record
Results query must be like this
Here I have tried to execute the pivot with your provided data.
Query
Select
[ItemCode],
[Description],
[2017/8],
[2017/9]
from
(
select cast(year(ReleasedDate) as nvarchar)+'/'+cast(month(ReleasedDate) as nvarchar) as ReleasedDate,ItemCode,Description,RequiredQty
from shoporder) as PivotData
Pivot
(
sum(RequiredQty) for ReleasedDate in
([2017/8],[2017/9])) as Pivoting
order by ItemCode
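You are not pivoting correctly: the FOR column must contain the same YYYYMM values that appear in the column list, so derive it in a subquery instead of pivoting on the raw date. The column names should also be generated as a single string. Try this: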
CREATE TABLE ShopOrder (ItemCode VARCHAR(100),[Description] VARCHAR(100),ReleaseDate DATE, RequiredQty INT)
GO
INSERT INTO ShopOrder
VALUES
('A','SLEEVE NUT','08/01/2017',19200)
,('A','SLEEVE NUT','08/02/2017',18000)
,('A','SLEEVE NUT','09/01/2017',17000)
,('B','STARTER','08/03/2017',10000)
,('B','STARTER','08/04/2017',18000)
,('B','STARTER','09/15/2017',16000)
DECLARE @SQLQuery AS NVARCHAR(MAX)
DECLARE @PivotColumns AS NVARCHAR(MAX)
SET @PivotColumns = STUFF(( SELECT DISTINCT ',[' + CONVERT(char(6), cast(ReleaseDate as date), 112 ) + ']'
FROM ShopOrder
ORDER BY ',[' + CONVERT(char(6), cast(ReleaseDate as date), 112 ) + ']'
FOR XML PATH('')),1,1,'')
SET @SQLQuery =
N'
SELECT ItemCode,'+ @PivotColumns + '
FROM (SELECT ItemCode,CONVERT(char(6), cast(ReleaseDate as date),112) ReleaseDate, RequiredQty
FROM ShopOrder) AS T
PIVOT( SUM(RequiredQty)
FOR ReleaseDate IN ('+@PivotColumns+')) AS P
'
SELECT @SQLQuery
--Execute dynamic query
EXEC sp_executesql @SQLQuery

SQL Pivot Table Dynamic - exclude unused columns

I have the following SQL code that creates a very useful pivot table:
Use [A1_20132014]
DECLARE @cols NVARCHAR (MAX)
SELECT @cols = COALESCE (@cols + ',[' + Link_ID + ']', '[' + Link_ID + ']')
FROM (SELECT DISTINCT Link_ID FROM A1) PV
ORDER BY Link_ID
DECLARE @query NVARCHAR(MAX)
SET @query = 'SELECT * FROM
(
-- We will select the data that has to be shown for pivoting
SELECT date_1, StartHour,Cost, Link_ID
FROM A1
WHERE Section = (''1AB'')
) x
PIVOT
(
-- Values in each dynamic column
SUM(Cost)
-- Select columns from @cols
FOR Link_ID IN (' + @cols + ')
) p;'
EXEC SP_EXECUTESQL @query
from these headings
link_id Section date_1 StartHour Cost data_source
4000000027866016A 8NB 2013-09-02 6 5871 1
4000000027866017B 5EB 2013-10-09 9 8965 2
4000000027856512B 4TB 2013-05-06 15 6754 1
4000000027866015A 6HB 2013-06-08 8 5354 1
4000000027866011A 1AB 2013-06-09 11 2 1
with these source types;
link_Id nvarchar(50)
Section nvarchar(50)
Date_1 smalldatetime
StartHour int
Cost float
data_source int
However, despite the WHERE clause that specifies a certain section, ALL sections still appear in the pivot table, populated with NULL values all the way down.
Is there a way of completely excluding the columns that do not meet the WHERE clause?
Thanks for any help.
Henry.
Put the WHERE clause in the subquery that builds @cols as well. That way you'll only get columns that apply to 1AB:
SELECT @cols = COALESCE (@cols + ',[' + Link_ID + ']', '[' + Link_ID + ']')
FROM (SELECT DISTINCT Link_ID FROM A1 WHERE Section = '1AB') PV
ORDER BY Link_ID
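A small self-contained sketch of the effect, assuming a hypothetical table variable @A1 loaded with the Link_ID and Section values from the question's sample rows. It uses QUOTENAME instead of hand-written brackets, which is a substitution on my part and not part of the original query:

```sql
-- Hypothetical stand-in for table A1, reduced to the two columns that matter here.
DECLARE @A1 TABLE (Link_ID NVARCHAR(50), Section NVARCHAR(50));
INSERT INTO @A1 (Link_ID, Section)
VALUES (N'4000000027866016A', N'8NB'),
       (N'4000000027866017B', N'5EB'),
       (N'4000000027856512B', N'4TB'),
       (N'4000000027866015A', N'6HB'),
       (N'4000000027866011A', N'1AB');

DECLARE @cols NVARCHAR(MAX);

-- Same filter as the pivot source, so only Link_IDs that occur in
-- section 1AB end up in the column list.
SELECT @cols = COALESCE(@cols + ',', '') + QUOTENAME(Link_ID)
FROM (SELECT DISTINCT Link_ID FROM @A1 WHERE Section = N'1AB') AS PV
ORDER BY Link_ID;

PRINT @cols;  -- [4000000027866011A]
```

If no row matches the filter, @cols stays NULL and so does any string concatenated with it, so it is worth checking for NULL before executing the dynamic query.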

How do I filter values out of a TSQL UNPIVOT?

I've got a column-oriented table that specifies manufacturing options for a product. Each option can be a subset of the following choices (' ', 'B', 'L', 'R', 'BX', 'LX', 'RX'). The table has 314 columns. (This is a legacy build and not the way I'd do it today...)
I have a customer request to count the number of times an option was used within a date range. Getting to the filtered data is not a problem, but getting the column oriented table converted to a countable row-oriented table is challenging my pivot/unpivot skills.
I can get the column names with this (as referenced from http://www.simple-talk.com/community/blogs/andras/archive/2007/09/14/37265.aspx)
declare @cols nvarchar(max)
select @cols = coalesce(@cols + ',' + quotename(column_name), quotename(column_name))
from information_schema.columns where table_name = 'mfgOptions'
and data_type = 'char' and character_maximum_length
Most of the time, the options have a ' ' value. I don't want to include these in my pivoted table.
My desired result is a table like:
MfgItemID | MfgOptionName | OptionSelection
-------------------------------------------
1000 | option1 | BX
1000 | option2 | B
2000 | option1 | LX
2000 | option3 | RX
Using the @cols definition above, I created the following UNPIVOT:
declare @query nvarchar(max)
set @query = N'SELECT MfgItemID, MfgOptionName, OptionSelection
from (SELECT MfgItemID, ' + @cols + '
FROM mfgOptions) a
UNPIVOT (OptionSelection for MfgOptionName IN (' + @cols + ') ) AS u
order by MfgItemID, MfgOptionName'
EXECUTE(@query)
This query is executing and appears to mostly do what I need. However it takes an extremely long time as I have 500,000+ rows in the table. If I could eliminate all blank options before unpivoting, I think it would run much faster.
Does anyone have any suggestions on how to filter field values prior to unpivoting?
You can't put a WHERE inside the UNPIVOT clause, and you can't filter in the subquery aliased a because there may be other values on the row that are useful. The best thing I know of is to add WHERE u.OptionSelection <> ' ' after the UNPIVOT, like so:
declare @query nvarchar(max)
set @query = N'SELECT MfgItemID, MfgOptionName, OptionSelection
from (SELECT MfgItemID, ' + @cols + '
FROM mfgOptions) a
UNPIVOT (OptionSelection for MfgOptionName IN (' + @cols + ') ) AS u
WHERE u.OptionSelection <> '' ''
order by MfgItemID, MfgOptionName'
EXECUTE(@query)
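Here is the same idea as a static sketch, assuming a hypothetical three-option version of the table (the table variable @mfgOptions and the column names option1 to option3 are made up for illustration; the real table has 314 columns):

```sql
-- Hypothetical, heavily reduced version of mfgOptions.
-- UNPIVOT requires all unpivoted columns to share one type and length.
DECLARE @mfgOptions TABLE (
    MfgItemID INT,
    option1   CHAR(2),
    option2   CHAR(2),
    option3   CHAR(2)
);

INSERT INTO @mfgOptions (MfgItemID, option1, option2, option3)
VALUES (1000, 'BX', 'B', ' '),
       (2000, 'LX', ' ', 'RX');

SELECT u.MfgItemID, u.MfgOptionName, u.OptionSelection
FROM (SELECT MfgItemID, option1, option2, option3 FROM @mfgOptions) AS a
UNPIVOT (OptionSelection FOR MfgOptionName IN (option1, option2, option3)) AS u
-- Trailing spaces are ignored in the comparison, so the padded
-- CHAR(2) blanks are treated as equal to ' ' and filtered out.
WHERE u.OptionSelection <> ' '
ORDER BY u.MfgItemID, u.MfgOptionName;
```

For these two rows the result should be the four non-blank selections from the question's desired output: (1000, option1, BX), (1000, option2, B), (2000, option1, LX), (2000, option3, RX).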
