SQL Server : long to wide data using PIVOT? [duplicate]

Here is part of my MSSQL 2008 [ERROR CODE] table, which I want to transpose to the structure shown further below. I searched for a workaround but could not find a solution. I don't think PIVOT is feasible because I cannot use an aggregate function. Can someone please help me make this possible?
+----------+-------+---------------------------------------------------+
| SKILL ID | SKILL | PARAMETER |
+----------+-------+---------------------------------------------------+
| 1 | 121 | STANDARD VERBIAGE & PROCEDURES |
| 1 | 121 | ISSUE IDENTIFICATION |
| 1 | 121 | CALL COURTESY |
| 1 | 121 | ISSUE RESOLUTION |
| 2 | BO | COLLECTION PROCESS ADHERENCE |
| 2 | BO | INTELLIGENCE PARAMETER |
| 3 | EM | SOFT SKILLS |
| 3 | EM | PRODUCT KNOWLEDGE |
| 3 | EM | CALL CLOSING |
| 3 | EM | CALL OPENING |
| 4 | FLC | RESOLUTION |
| 4 | FLC | NONE |
| 5 | FTA | OTHERS |
| 5 | FTA | HYGIENE FACTORS |
| 5 | FTA | ACCOUNT SCREEN |
| 5 | FTA | ORDER , DOCUMENTATION AND CONFIGURATION |
| 5 | FTA | VALIDATION SCREEN |
| 5 | FTA | PARTY SCREEN |
| 5 | FTA | ORDER , DOCUMENTATION AND CONFIGURATION |
| 6 | NCE | COMPLIANCE |
| 6 | NCE | CRM |
| 6 | NCE | ACCOUNT LEVEL /INSTALLATION DETAILS CONFIRTMATION |
| 6 | NCE | CONTENTS/BILL DETAILS |
| 6 | NCE | SELFCARE |
| 6 | NCE | FEEDBACK/SATISFACTION |
| 6 | NCE | OBJECTION RESOLUTION |
| 6 | NCE | CUSTOMER HANDLING |
| 6 | NCE | RED ALERT |
| 7 | RTO | ZERO TOLERANCE |
| 7 | RTO | OVERALL IMPRESSION |
| 7 | RTO | SUMMARY AND CLOSING |
| 7 | RTO | PROCESS KNOWLEDGE |
| 7 | RTO | OPENING |
| 8 | SHMNP | SKILL AREA |
| 8 | SHMNP | CONVINCING SKILLS |
+----------+-------+---------------------------------------------------+
This is my expected output
+-------+--------------------------------+------------------------+---------------------------------------------------+
| SKILL | PARAMETER1 | PARAMETER2 | PARAMETER3 |
+-------+--------------------------------+------------------------+---------------------------------------------------+
| 121 | STANDARD VERBIAGE & PROCEDURES | ISSUE IDENTIFICATION | CALL COURTESY |
| BO | COLLECTION PROCESS ADHERENCE | INTELLIGENCE PARAMETER | NULL |
| EM | SOFT SKILLS | PRODUCT KNOWLEDGE | CALL CLOSING |
| FLC | RESOLUTION | NONE | NULL |
| FTA | OTHERS | HYGIENE FACTORS | ACCOUNT SCREEN |
| NCE | COMPLIANCE | CRM | ACCOUNT LEVEL /INSTALLATION DETAILS CONFIRTMATION |
| RTO | ZERO TOLERANCE | OVERALL IMPRESSION | SUMMARY AND CLOSING |
| SHMNP | SKILL AREA | CONVINCING SKILLS | NULL |
+-------+--------------------------------+------------------------+---------------------------------------------------+

You can use the PIVOT function to get the result; you just need row_number() to help.
The base query for this will be:
select skill_id, skill, parameter,
row_number() over(partition by skill, skill_id order by skill_id) rn
from yt;
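I use row_number() to assign a distinct value to each row within each skill and skill_id; that row number then becomes the column you PIVOT on. Because the ORDER BY uses skill_id, which is constant inside each partition, the order in which the parameters are numbered is not guaranteed; order by a sequence or date column if you have one.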
The full code with the PIVOT applied will be:
select skill_id, skill,[Parameter_1], [Parameter_2], [Parameter_3]
from
(
select skill_id, skill, parameter,
'Parameter_'+cast(row_number() over(partition by skill, skill_id
order by skill_id) as varchar(10)) rn
from yt
) d
pivot
(
max(parameter)
for rn in ([Parameter_1], [Parameter_2], [Parameter_3])
) piv;
In your case, it seems like you will have an unknown number of parameters for each skill. If that is true, then you will want to use dynamic SQL to get the result:
DECLARE @cols AS NVARCHAR(MAX),
@query AS NVARCHAR(MAX)
-- build the column list [Parameter_1],[Parameter_2],... from the row numbers that actually occur
select @cols = STUFF((SELECT distinct ',' + QUOTENAME('Parameter_'
+cast(row_number() over(partition by skill, skill_id
order by skill_id) as varchar(10)))
from yt
FOR XML PATH(''), TYPE
).value('.', 'NVARCHAR(MAX)')
,1,1,'')
-- splice the column list into the same PIVOT query as above
set @query = 'SELECT skill_id, skill,' + @cols + ' from
(
select skill_id, skill, parameter,
''Parameter_''+cast(row_number() over(partition by skill, skill_id
order by skill_id) as varchar(10)) rn
from yt
) x
pivot
(
max(parameter)
for rn in (' + @cols + ')
) p '
execute(@query);
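For anyone who wants to try the idea without a SQL Server instance, here is a minimal sketch using Python's built-in sqlite3 module. It is only a stand-in: SQLite has no PIVOT, so the sketch swaps in MAX(CASE WHEN rn = N ...) conditional aggregation, and it builds the column list from the largest row number in the same spirit as the dynamic SQL above. The `seq` column, the `pivot_parameters` function name and the two-skill sample are assumptions made for the example, and window functions need SQLite 3.25 or later.

```python
import sqlite3

# Sample rows (skill_id, skill, parameter) copied from the first two skills in the question.
ROWS = [
    (1, "121", "STANDARD VERBIAGE & PROCEDURES"),
    (1, "121", "ISSUE IDENTIFICATION"),
    (1, "121", "CALL COURTESY"),
    (1, "121", "ISSUE RESOLUTION"),
    (2, "BO", "COLLECTION PROCESS ADHERENCE"),
    (2, "BO", "INTELLIGENCE PARAMETER"),
]


def pivot_parameters(rows):
    """Return (header, data) with one Parameter_N column per row number."""
    conn = sqlite3.connect(":memory:")
    # seq is a hypothetical insertion-order column so the numbering is deterministic.
    conn.execute(
        "CREATE TABLE yt (seq INTEGER PRIMARY KEY, skill_id INTEGER, "
        "skill TEXT, parameter TEXT)"
    )
    conn.executemany(
        "INSERT INTO yt (skill_id, skill, parameter) VALUES (?, ?, ?)", rows
    )
    # Same base query as in the answer: number the rows inside each skill.
    numbered = """
        SELECT skill_id, skill, parameter,
               ROW_NUMBER() OVER (PARTITION BY skill_id, skill ORDER BY seq) AS rn
        FROM yt
    """
    # Dynamic step: the largest row number decides how many columns are needed.
    (max_rn,) = conn.execute(f"SELECT MAX(rn) FROM ({numbered})").fetchone()
    if not max_rn:
        conn.close()
        return ["skill"], []
    # Only integers are interpolated into the SQL text, never user-supplied strings.
    cols = ", ".join(
        f"MAX(CASE WHEN rn = {i} THEN parameter END) AS Parameter_{i}"
        for i in range(1, max_rn + 1)
    )
    cur = conn.execute(
        f"SELECT skill, {cols} FROM ({numbered}) "
        "GROUP BY skill_id, skill ORDER BY skill_id"
    )
    header = [d[0] for d in cur.description]
    data = cur.fetchall()
    conn.close()
    return header, data
```

Calling `pivot_parameters(ROWS)` gives one row per skill, with None (NULL) in the columns a skill does not fill, which is the same shape the PIVOT query returns.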

Related

Getting Top 10 based on column value

I have a query that outputs a long list of work order counts per name, with the total per name, sorted by total, name and count:
;with cte as (
SELECT [Name],
[Emergency],
count([Emergency]) as [CountItem]
FROM tableA
GROUP BY [Name], [Emergency])
select Name,[Emergency],[CountItem] as [Count],SUM([CountItem]) OVER(PARTITION BY Name) as Total from cte
order by Total desc, Name, [CountItem] desc
but I only want to get the top 10 Names with the highest total like the one below:
+-------+-------------------------------+-------+-------+
| Name | Emergency | Count | Total |
+-------+-------------------------------+-------+-------+
| PLB | No | 7 | 15 |
| PLB | No Hot Water | 4 | 15 |
| PLB | Resident Locked Out | 2 | 15 |
| PLB | Overflowing Tub | 1 | 15 |
| PLB | No Heat | 1 | 15 |
| GG | Broken Lock - Exterior | 6 | 6 |
| BOA | Broken Lock - Exterior | 2 | 4 |
| BOA | Garage Door not working | 1 | 4 |
| BOA | Resident Locked Out | 1 | 4 |
| 15777 | Smoke Alarm not working | 3 | 3 |
| FP | No air conditioning | 2 | 3 |
| FP | Flood | 1 | 3 |
| KB | No electrical power | 2 | 3 |
| KB | No | 1 | 3 |
| MEM | Noise Complaint | 3 | 3 |
| ANG | Parking Issue | 2 | 2 |
| ALL | Smoke Alarm not working | 2 | 2 |
| AAS | No air conditioning | 1 | 2 |
| AAS | Toilet - Clogged (1 Bathroom) | 1 | 2 |
+-------+-------------------------------+-------+-------+
Note: I'm not after unique values. As you can see from the example above it gets the top 10 names from a very long table.
What I want to happen is assign a row id for each name so all PLB above will have a row id of 1, GG = 2, BOA = 3, ...
So on my final select I will only add the where clause where row id <= 10. I already tried ROW_NUMBER() OVER(PARTITION BY Name ORDER BY Name) but it's assigning 1 to every unique Name it encounters.
You may try this:
;with cte as (
SELECT [Name],
[Emergency],
count([Emergency]) as [CountItem]
FROM tableA
GROUP BY [Name], [Emergency]),
ct as (
select Name,[Emergency],[CountItem] as [Count],SUM([CountItem]) OVER(PARTITION BY Name) as Total from cte
),
ctname as (
-- dense_rank gives every row of the same Name the same rank; the highest Total ranks first
select dense_rank() over ( order by Total desc, Name ) as RankName, Name,[Emergency],[Count], Total from ct )
select * from ctname where RankName < 11
order by RankName, [Count] desc
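A small runnable sketch of the same DENSE_RANK idea, using Python's sqlite3 module as a stand-in for SQL Server (window functions need SQLite 3.25 or later). The `top_names` function, the `limit` argument and the sample rows are assumptions made for the example; the CTE layout follows the query above.

```python
import sqlite3

# Hypothetical work orders (Name, Emergency): PLB has 4, GG has 2, BOA has 1.
SAMPLE = [
    ("PLB", "No"), ("PLB", "No"), ("PLB", "No"), ("PLB", "No Heat"),
    ("GG", "Broken Lock"), ("GG", "Broken Lock"),
    ("BOA", "Flood"),
]


def top_names(rows, limit):
    """Return every row belonging to the `limit` names with the highest totals."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tableA (Name TEXT, Emergency TEXT)")
    conn.executemany("INSERT INTO tableA VALUES (?, ?)", rows)
    sql = """
        WITH cte AS (
            SELECT Name, Emergency, COUNT(Emergency) AS CountItem
            FROM tableA
            GROUP BY Name, Emergency
        ),
        ct AS (
            SELECT Name, Emergency, CountItem,
                   SUM(CountItem) OVER (PARTITION BY Name) AS Total
            FROM cte
        ),
        ctname AS (
            -- Total and Name are constant within a name, so all of its rows share one rank.
            SELECT DENSE_RANK() OVER (ORDER BY Total DESC, Name) AS RankName,
                   Name, Emergency, CountItem, Total
            FROM ct
        )
        SELECT RankName, Name, Emergency, CountItem, Total
        FROM ctname
        WHERE RankName <= ?
        ORDER BY RankName, CountItem DESC, Emergency
    """
    result = conn.execute(sql, (limit,)).fetchall()
    conn.close()
    return result
```

With `top_names(SAMPLE, 2)` both PLB rows come back with rank 1 and the GG row with rank 2, while BOA is filtered out. ROW_NUMBER() would instead number the rows one by one, which is why it cannot be used to keep whole names together.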

What's an efficient way to count "previous" rows in SQL?

Hard to phrase the title for this one.
I have a table of data which contains a row per invoice. For example:
| Invoice ID | Customer Key | Date | Value | Something |
| ---------- | ------------ | ---------- | ------| --------- |
| 1 | A | 08/02/2019 | 100 | 1 |
| 2 | B | 07/02/2019 | 14 | 0 |
| 3 | A | 06/02/2019 | 234 | 1 |
| 4 | A | 05/02/2019 | 74 | 1 |
| 5 | B | 04/02/2019 | 11 | 1 |
| 6 | A | 03/02/2019 | 12 | 0 |
I need to add another column that counts the number of previous rows per CustomerKey, but only if "Something" is equal to 1, so that it returns this:
| Invoice ID | Customer Key | Date | Value | Something | Count |
| ---------- | ------------ | ---------- | ------| --------- | ----- |
| 1 | A | 08/02/2019 | 100 | 1 | 2 |
| 2 | B | 07/02/2019 | 14 | 0 | 1 |
| 3 | A | 06/02/2019 | 234 | 1 | 1 |
| 4 | A | 05/02/2019 | 74 | 1 | 0 |
| 5 | B | 04/02/2019 | 11 | 1 | 0 |
| 6 | A | 03/02/2019 | 12 | 0 | 0 |
I know I can do this using either a CTE like this...
(
select
count(*)
from table
where
[Customer Key] = t.[Customer Key]
and [Date] < t.[Date]
and Something = 1
)
But I have a lot of data and that's pretty slow. I know I can also use cross apply to achieve the same thing, but as far as I can tell that's not any better performing than just using a CTE.
So; is there a more efficient means of achieving this, or do I just suck it up?
EDIT: I originally posted this without the requirement that only rows where Something = 1 are counted. Mea culpa - I asked it in a hurry. Unfortunately I think that this means I can't use row_number() over (partition by [Customer Key])
Assuming you're using SQL Server 2012+ you can use window functions. End the frame at 1 PRECEDING so that the current row is never counted; for a customer's first row the frame is empty and COUNT returns 0:
COUNT(CASE WHEN Something = 1 THEN 1 END) OVER (PARTITION BY [Customer Key] ORDER BY [Date]
ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) AS [Count]
Old answer, from before the Something = 1 requirement was added:
COUNT([Customer Key]) OVER (PARTITION BY [Customer Key] ORDER BY [Date]
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) -1 AS [Count]
If you're not using 2012+, an alternative for that original requirement (it cannot apply the Something = 1 filter) is ROW_NUMBER:
ROW_NUMBER() OVER (PARTITION BY [Customer Key] ORDER BY [Date]) - 1 AS [Count]
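To check a windowed conditional count against the expected table in the question, here is a minimal sketch using Python's sqlite3 module as a stand-in for SQL Server (SQLite 3.25 or later). The table name `invoices`, the column names without spaces, the ISO date strings and the `previous_counts` function are assumptions made for the example. The frame stops at 1 PRECEDING, so only earlier rows of the same customer are counted.

```python
import sqlite3

# The six invoices from the question, with dates rewritten as ISO strings so they sort correctly.
INVOICES = [
    (1, "A", "2019-02-08", 100, 1),
    (2, "B", "2019-02-07", 14, 0),
    (3, "A", "2019-02-06", 234, 1),
    (4, "A", "2019-02-05", 74, 1),
    (5, "B", "2019-02-04", 11, 1),
    (6, "A", "2019-02-03", 12, 0),
]


def previous_counts(rows):
    """Return (InvoiceID, count of earlier rows for that customer with Something = 1)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE invoices (InvoiceID INTEGER, CustomerKey TEXT, "
        "InvoiceDate TEXT, Value INTEGER, Something INTEGER)"
    )
    conn.executemany("INSERT INTO invoices VALUES (?, ?, ?, ?, ?)", rows)
    sql = """
        SELECT InvoiceID,
               -- CASE yields NULL when Something <> 1, and COUNT ignores NULLs.
               COUNT(CASE WHEN Something = 1 THEN 1 END) OVER (
                   PARTITION BY CustomerKey
                   ORDER BY InvoiceDate
                   ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
               ) AS PrevCount
        FROM invoices
        ORDER BY InvoiceID
    """
    result = conn.execute(sql).fetchall()
    conn.close()
    return result
```

`previous_counts(INVOICES)` returns the counts 2, 1, 1, 0, 0, 0 for invoices 1 to 6, matching the Count column the question asks for. Unlike the correlated subquery, the window version reads the table once per partition ordering rather than once per row.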

Group By same or similar string sql

1) Suppose i have a table like this:-
| id | color_code | fruit |
|:------|--------------|----------------:|
| 1 | 000001 | apple |
| 2 | 000001 | apple |
| 3 | 000001 | apple |
| 4 | 000002 | lemon |
| 5 | 000002 | lemon |
| 6 | 000003 | grapes |
| 7 | 000003 | grapes |
How can I group the rows by the fruit column together with the color_code column in SQL Server?
Something like this, I suppose:-
| id | color_code | fruit | group_concat(id) |
|:------|--------------|-----------------|---------------------|
| 1 | 000001 | apple | 1,2,3 |
| 4 | 000002 | lemon | 4,5 |
| 6 | 000003 | grapes | 6,7 |
2) What if i have 3 tables (like shown below) which require join, how can i achieve this?
series_no table:
| id | desc_seriesno |
|:------|----------------:|
| 7040 | AU1011 |
| 7041 | AU1022 |
| 7042 | AU1033 |
| 7043 | AU1044 |
| 7044 | AU1055 |
| 7045 | AU1066 |
brand table:
| id | desc_brand |
|:------|----------------:|
| 1020 | Audi |
| 1021 | Bentley |
| 1022 | Ford |
| 1023 | BMW |
| 1024 | Mazda |
| 1025 | Toyota |
car_info table:
| seriesno_id | brand_id | color |
|:---------------|------------|--------:|
| 7040 | 1020 | white |
| 7040 | 1020 | black |
| 7040 | 1020 | pink |
| 7041 | 1021 | yellow |
| 7041 | 1021 | brown |
| 7042 | 1022 | purple |
| 7042 | 1022 | black |
| 7042 | 1022 | green |
| 7043 | 1023 | blue |
| 7044 | 1024 | red |
| 7045 | 1025 | maroon |
| 7045 | 1025 | white |
this is my current query with sql server 2014:-
SELECT SN.id AS seriesid, B.id AS brandid, B.desc_brand
FROM [db1].[dbo].[series_no] SN
LEFT JOIN [db1].[dbo].[car_info] CI
ON CI.seriesno_id = SN.id
RIGHT JOIN [db1].[dbo].[brand] B
ON B.id = CI.brand_id
GROUP BY SN.id, B.id
ORDER BY SN.id ASC
but unfortunately it gave me an error since i cannot group by similar string this way.
i want it to be like this:-
| seriesid | brandid | desc_brand | count |
|:-----------|------------|---------------|-------|
| 7040 | 1020 | Audi | 3 |
| 7041 | 1021 | Bentley | 2 |
| 7042 | 1022 | Ford | 3 |
| 7043 | 1023 | BMW | 1 |
| 7044 | 1024 | Mazda | 1 |
| 7045 | 1025 | Toyota | 2 |
1. Fruit Color
Assuming the table name is FruitColor, you can get the desired output by the following query -
SELECT MIN(id) AS id
, color_code
, fruit
, group_concat_id = STUFF((SELECT ',' + CAST(id AS VARCHAR)
FROM FruitColor AS fci
WHERE fci.fruit = fc.fruit AND fci.color_code = fc.color_code
FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'), 1, 1, '')
FROM FruitColor AS fc
GROUP BY color_code, fruit
ORDER BY id;
The MIN() selects the first id of the group.
Since SQL Server 2014 has no built-in equivalent of MySQL's GROUP_CONCAT, you have to combine FOR XML PATH, which glues ',' + id together for every row of the group, with the STUFF function, which removes the leading comma. To learn more about grouped concatenation you can visit this link https://sqlperformance.com/2014/08/t-sql-queries/sql-server-grouped-concatenation
You can customize the WHERE clause to match only by color_code.
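To make the two steps of the STUFF / FOR XML PATH trick easier to see, here is a small Python sketch that imitates them outside the database. It is not how SQL Server implements either feature: `stuff` is a simplified emulation of T-SQL STUFF (1-based start position, None when the start is out of range), and `group_concat_ids` is a hypothetical helper written for this example.

```python
# Rows (id, color_code, fruit) copied from the question's table.
FRUIT_ROWS = [
    (1, "000001", "apple"),
    (2, "000001", "apple"),
    (3, "000001", "apple"),
    (4, "000002", "lemon"),
    (5, "000002", "lemon"),
    (6, "000003", "grapes"),
    (7, "000003", "grapes"),
]


def stuff(text, start, length, replacement):
    """Simplified emulation of T-SQL STUFF: delete `length` characters at the
    1-based position `start` and put `replacement` in their place."""
    if start < 1 or start > len(text):
        return None
    return text[:start - 1] + replacement + text[start - 1 + length:]


def group_concat_ids(rows):
    """Return (min id, color_code, fruit, 'id,id,...') per (color_code, fruit) group."""
    glued = {}     # group -> ",1,2,3"  (what the FOR XML PATH('') step produces)
    first_id = {}  # group -> MIN(id)
    for row_id, color_code, fruit in sorted(rows):
        key = (color_code, fruit)
        glued[key] = glued.get(key, "") + "," + str(row_id)
        first_id[key] = min(first_id.get(key, row_id), row_id)
    # STUFF(..., 1, 1, '') step: drop the leading comma from each glued string.
    result = [
        (first_id[key], key[0], key[1], stuff(glued[key], 1, 1, ""))
        for key in glued
    ]
    return sorted(result)
```

For the question's data, `group_concat_ids(FRUIT_ROWS)` yields `1,2,3` for apple, `4,5` for lemon and `6,7` for grapes, each paired with the smallest id of its group.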
2. You have several options for this -
Option (a): Show counts for series with brands
SELECT seriesno_id AS seriesid, ci.brand_id AS brandid, desc_brand, COUNT(*) AS [count]
FROM db1.dbo.car_info AS ci
LEFT JOIN db1.dbo.brand AS b ON (b.id = ci.brand_id)
GROUP BY seriesno_id, ci.brand_id, desc_brand;
Here you don't need the series_no table if you only want to show counts for cars having brand(s).
You also may not need the RIGHT JOIN on the brand table: if the brand table contains a record that is not in the car_info table, that row's seriesno_id would just be NULL.
Option (b): Show counts for all the series with or without a brand
SELECT sn.id AS seriesid, ci.brand_id AS brandid, desc_brand, COUNT(ci.seriesno_id) AS [count]
FROM db1.dbo.series_no AS sn
LEFT JOIN db1.dbo.car_info AS ci ON (ci.seriesno_id = sn.id)
LEFT JOIN db1.dbo.brand AS b ON (b.id = ci.brand_id)
GROUP BY sn.id, ci.brand_id, desc_brand;
COUNT(ci.seriesno_id) is used here rather than COUNT(*) so that a series with no rows in car_info is reported as 0 instead of 1.
Option (c): The workaround for selecting a column which is not in the GROUP BY
SELECT seriesno_id AS seriesid, ci.brand_id AS brandid, MAX(desc_brand) AS desc_brand, COUNT(*) AS [count]
FROM db1.dbo.car_info AS ci
LEFT JOIN db1.dbo.brand AS b ON (b.id = ci.brand_id)
GROUP BY seriesno_id, ci.brand_id;
Here, if we are certain that each brand has only one desc_brand, we can wrap it in an aggregate.
This works because an aggregate applied to a single value returns that value. I used MAX here.
Personally I would go with option (a) as it makes more sense.
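If you want to experiment with the join-and-count shape before running it on the real database, here is a minimal sketch with Python's sqlite3 module standing in for SQL Server. It follows the option (b) layout, counting `ci.seriesno_id` so that a series without cars shows 0. The `series_counts` function, the trimmed sample data and series 7046 (a series with no cars, which is not in the question) are assumptions made for the example.

```python
import sqlite3


def series_counts():
    """Count car_info rows per series, keeping series that have no cars."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE series_no (id INTEGER, desc_seriesno TEXT);
        CREATE TABLE brand (id INTEGER, desc_brand TEXT);
        CREATE TABLE car_info (seriesno_id INTEGER, brand_id INTEGER, color TEXT);
        INSERT INTO series_no VALUES (7040, 'AU1011'), (7041, 'AU1022'),
                                     (7046, 'AU1077');  -- 7046 is hypothetical
        INSERT INTO brand VALUES (1020, 'Audi'), (1021, 'Bentley');
        INSERT INTO car_info VALUES
            (7040, 1020, 'white'), (7040, 1020, 'black'), (7040, 1020, 'pink'),
            (7041, 1021, 'yellow'), (7041, 1021, 'brown');
    """)
    sql = """
        SELECT sn.id AS seriesid, ci.brand_id AS brandid, b.desc_brand,
               COUNT(ci.seriesno_id) AS cnt  -- NULLs from the LEFT JOIN are not counted
        FROM series_no AS sn
        LEFT JOIN car_info AS ci ON ci.seriesno_id = sn.id
        LEFT JOIN brand AS b ON b.id = ci.brand_id
        GROUP BY sn.id, ci.brand_id, b.desc_brand
        ORDER BY sn.id
    """
    result = conn.execute(sql).fetchall()
    conn.close()
    return result
```

Every non-aggregated column in the SELECT list also appears in the GROUP BY, which is the rule the query in the question breaks by selecting B.desc_brand without grouping on it.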
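Update on the GROUP BY exception caused by desc_brand being NTEXT:
Cast desc_brand to NVARCHAR, both in the SELECT list and in the GROUP BY, to avoid the exception.
CAST(desc_brand AS NVARCHAR(200))
I also highly recommend using VARCHAR / NVARCHAR instead of TEXT / NTEXT, which are deprecated in SQL Server and cannot be compared, sorted or grouped, and instead of CHAR for values of varying length, because fixed-length columns are padded and usually occupy more space.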
SELECT
id = LEFT(group_concat, CHARINDEX(',', group_concat + ',') - 1), -- first id in the list, also when it has more than one digit
color_code,
fruit,
group_concat
FROM(
SELECT distinct
m.color_code,
m.fruit,
group_concat = STUFF((SELECT ',' + CONVERT(varchar(10),md.id)
FROM [Test_1].[dbo].[Stuff] md
WHERE m.fruit = md.fruit
AND m.color_code = md.color_code
FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'), 1, 1, '')
FROM [Test_1].[dbo].[Stuff] m)x
For the first question, use the code below:
SELECT distinct
m.color_code
, m.fruit
, group_concat = STUFF((
SELECT ',' + CONVERT(varchar(10),md.id)
FROM dbo.tablename md
WHERE m.fruit = md.fruit and m.color_code = md.color_code
FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'), 1, 1, '')
FROM dbo.tablename m
For the second question:
SELECT SN.id AS seriesid, B.id AS brandid, B.desc_brand ,count(*)
FROM [db1].[dbo].[series_no] SN
LEFT JOIN [db1].[dbo].[car_info] CI
ON CI.seriesno_id = SN.id
RIGHT JOIN [db1].[dbo].[brand] B
ON B.id = CI.brand_id
GROUP BY SN.id, B.id ,B.desc_brand
ORDER BY 4 ASC

SQL Server : update sequence number across multiple groups

I would like to update a table:
| id | type_id | created_at | sequence |
|----|---------|------------|----------|
| 1 | 1 | 2010-04-26 | NULL |
| 2 | 1 | 2010-04-27 | NULL |
| 3 | 2 | 2010-04-28 | NULL |
| 4 | 3 | 2010-04-28 | NULL |
To this (note that created_at is used for ordering, and sequence is "grouped" by type_id):
| id | type_id | created_at | sequence |
|----|---------|------------|----------|
| 1 | 1 | 2010-04-26 | 1 |
| 2 | 1 | 2010-04-27 | 2 |
| 3 | 2 | 2010-04-28 | 1 |
| 4 | 3 | 2010-04-28 | 1 |
The same question has been raised before (Link), but I need it for SQL Server.
Thanks.
You can use ROW_NUMBER() to get sequence number per type_id slice. Use a CTE to make UPDATE operation simpler:
;WITH ToUpdate AS (
SELECT id, type_id, created_at, sequence,
ROW_NUMBER() OVER (PARTITION BY type_id ORDER BY created_at) AS newSeq
FROM mytable
)
UPDATE ToUpdate
SET sequence = newSeq
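A runnable sketch of the same numbering, using Python's sqlite3 module as a stand-in (SQLite 3.25 or later). SQLite cannot update through a CTE the way SQL Server does, so this version swaps in a two-step approach: compute the ROW_NUMBER() values with a SELECT, then write them back by id. The `assign_sequences` function and the extra `id` tie-breaker in the ORDER BY are assumptions made for the example.

```python
import sqlite3

# (id, type_id, created_at) rows from the question; sequence starts out NULL.
SEQ_ROWS = [
    (1, 1, "2010-04-26"),
    (2, 1, "2010-04-27"),
    (3, 2, "2010-04-28"),
    (4, 3, "2010-04-28"),
]


def assign_sequences(rows):
    """Fill `sequence` with a per-type_id counter ordered by created_at."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE mytable (id INTEGER PRIMARY KEY, type_id INTEGER, "
        "created_at TEXT, sequence INTEGER)"
    )
    conn.executemany(
        "INSERT INTO mytable (id, type_id, created_at) VALUES (?, ?, ?)", rows
    )
    # Step 1: number the rows inside each type_id slice (id breaks ties on equal dates).
    numbered = conn.execute("""
        SELECT ROW_NUMBER() OVER (PARTITION BY type_id ORDER BY created_at, id) AS newSeq,
               id
        FROM mytable
    """).fetchall()
    # Step 2: write each computed number back to its row.
    conn.executemany("UPDATE mytable SET sequence = ? WHERE id = ?", numbered)
    conn.commit()
    result = conn.execute(
        "SELECT id, type_id, created_at, sequence FROM mytable ORDER BY id"
    ).fetchall()
    conn.close()
    return result
```

`assign_sequences(SEQ_ROWS)` produces the sequence values 1, 2, 1, 1 for ids 1 to 4, matching the target table in the question.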
