SQL Server Management Studio 2012 Pivot/Cross Tab Query - sql-server

I'm using SQL Server Management Studio 2012 and would like to create a pivot/cross tab query for a table with over 2300 rows.
The table has 5 columns:
- name
- group
- status
- date
- count
There are about 580 distinct names.
Each name is associated to 4 different groups (A, B, C, and D).
Each group has a complete status of yes or no.
A date is associated to each status when completed. Otherwise, status is NULL.
The count column is only applicable to group B and D and is an integer value.
SAMPLE A:
name group status date count
A.A.1 A yes 5/23 NULL
A.A.1 B yes 5/27 112
A.A.1 C yes 6/4 NULL
A.A.1 D yes 6/15 122
A.B.2 A yes 5/25 NULL
A.B.2 B yes 6/1 119
A.B.2 C no NULL NULL
A.B.2 D no NULL NULL
I am trying to display the status of each name as the field values across 11 columns :
- name
- group A
- group A date
- group B
- group B date
- group B count
- group C
- group C date
- group D
- group D date
- group D count
The 'name' column would have the 580 distinct names with their corresponding group data across A, B, C, and D.
SAMPLE B:
nm grp_A A_day grp_B B_day B_ct grp_C C_day grp_D D_day D_ct
A.A.1 yes 5/23 yes 5/27 112 yes 6/4 yes 6/15 122
A.B.2 yes 5/25 yes 6/1 119 no NULL no NULL NULL
(column names have been changed to fit into this question section's format)
Ultimately, the result should have all 580 distinct names in the first column and its corresponding status for each group, the date of completion (or NULL if it has not been completed yet), and the count for groups B and D.
I've tried using a CASE statement, but it generates the names once for each group, resulting in the original table being spaced out across the 11 columns.
SAMPLE C:
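| nm | grp_A | A_day | grp_B | B_day | B_ct | grp_C | C_day | grp_D | D_day | D_ct |
| A.A.1 | yes | 5/23 | | | | | | | | |
| A.A.1 | | | yes | 5/27 | 112 | | | | | |
| A.A.1 | | | | | | yes | 6/4 | | | |
| A.A.1 | | | | | | | | yes | 6/15 | 122 |
| A.B.2 | yes | 5/25 | | | | | | | | |
| A.B.2 | | | yes | 6/1 | 119 | | | | | |
| A.B.2 | | | | | | no | NULL | | | |
| A.B.2 | | | | | | | | no | NULL | NULL |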
What am I doing wrong? Please help!
-- K-moj

Without seeing your query I am guessing but if you are trying to PIVOT the data with a CASE expression my suggestion would be to add an aggregate function around the CASE.
select
name,
max(case when [group] = 'A' then status end) grp_A,
max(case when [group] = 'A' then date end) A_day,
max(case when [group] = 'A' then [count] end) A_ct,
max(case when [group] = 'B' then status end) grp_B,
max(case when [group] = 'B' then date end) B_day,
max(case when [group] = 'B' then [count] end) B_ct,
max(case when [group] = 'C' then status end) grp_C,
max(case when [group] = 'C' then date end) C_day,
max(case when [group] = 'C' then [count] end) C_ct,
max(case when [group] = 'D' then status end) grp_D,
max(case when [group] = 'D' then date end) D_day,
max(case when [group] = 'D' then [count] end) D_ct
from yourtable
group by name
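As a rough, runnable illustration of why the aggregate matters, the sketch below loads SAMPLE A into an in-memory SQLite database and applies the same MAX(CASE ...) pattern. It assumes SQLite rather than SQL Server, and the column names grp, day and cnt are hypothetical stand-ins for group, date and count; only the columns from SAMPLE B are produced.

```python
import sqlite3

# Hypothetical in-memory copy of SAMPLE A. The columns grp, day and cnt are
# stand-ins for the reserved-word names group, date and count.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE yourtable (name TEXT, grp TEXT, status TEXT, day TEXT, cnt INTEGER)"
)
conn.executemany(
    "INSERT INTO yourtable VALUES (?, ?, ?, ?, ?)",
    [
        ("A.A.1", "A", "yes", "5/23", None),
        ("A.A.1", "B", "yes", "5/27", 112),
        ("A.A.1", "C", "yes", "6/4", None),
        ("A.A.1", "D", "yes", "6/15", 122),
        ("A.B.2", "A", "yes", "5/25", None),
        ("A.B.2", "B", "yes", "6/1", 119),
        ("A.B.2", "C", "no", None, None),
        ("A.B.2", "D", "no", None, None),
    ],
)

# A bare CASE leaves each source row on its own output row (SAMPLE C).
# MAX ignores NULLs, so MAX + GROUP BY folds the four rows per name into one.
pivot_sql = """
SELECT name,
       MAX(CASE WHEN grp = 'A' THEN status END) AS grp_A,
       MAX(CASE WHEN grp = 'A' THEN day    END) AS A_day,
       MAX(CASE WHEN grp = 'B' THEN status END) AS grp_B,
       MAX(CASE WHEN grp = 'B' THEN day    END) AS B_day,
       MAX(CASE WHEN grp = 'B' THEN cnt    END) AS B_ct,
       MAX(CASE WHEN grp = 'C' THEN status END) AS grp_C,
       MAX(CASE WHEN grp = 'C' THEN day    END) AS C_day,
       MAX(CASE WHEN grp = 'D' THEN status END) AS grp_D,
       MAX(CASE WHEN grp = 'D' THEN day    END) AS D_day,
       MAX(CASE WHEN grp = 'D' THEN cnt    END) AS D_ct
FROM yourtable
GROUP BY name
ORDER BY name
"""
pivoted = conn.execute(pivot_sql).fetchall()
for row in pivoted:
    print(row)
```

With this data the query returns one row per name, in the shape of SAMPLE B.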
If you want to use the PIVOT function, you will first need to unpivot the status, date and count columns, then pivot them in the final result.
An UNPIVOT is when you convert multiple columns of data into multiple rows. You can unpivot the status, date and count columns using a variety of methods. Since you are using SQL Server 2012 you can use CROSS APPLY with a VALUES clause. The code to convert the columns into rows will be:
select name,
col = col+'_'+[group],
value
from yourtable
cross apply
(
values
('grp', status),
('day', [date]),
('ct', cast([count] as varchar(10)))
) c(col, value)
This gives a result:
| NAME | COL | VALUE |
| A.A.1 | grp_A | yes |
| A.A.1 | day_A | 5/23 |
| A.A.1 | ct_A | (null) |
| A.A.1 | grp_B | yes |
| A.A.1 | day_B | 5/27 |
| A.A.1 | ct_B | 112 |
Instead of having multiple columns that you want to pivot, you now have all values to be turned into new columns in value and the new column names in col. You can then apply the PIVOT function so the full code will be similar to the following:
select name,
grp_A, day_A, ct_A,
grp_B, day_B, ct_B,
grp_C, day_C, ct_C,
grp_D, day_D, ct_D
from
(
select name,
col = col+'_'+[group],
value
from yourtable
cross apply
(
values
('grp', status),
('day', [date]),
('ct', cast([count] as varchar(10)))
) c(col, value)
) d
pivot
(
max(value)
for col in (grp_A, day_A, ct_A,
grp_B, day_B, ct_B,
grp_C, day_C, ct_C,
grp_D, day_D, ct_D)
) piv
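The unpivot-then-pivot idea can also be tried outside SQL Server. The sketch below is only an approximation of the code above: SQLite has neither CROSS APPLY nor PIVOT, so it swaps in UNION ALL for the unpivot step and MAX(CASE ...) for the pivot step. The column names grp, day and cnt are hypothetical stand-ins for group, date and count, and only two source rows are loaded to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE yourtable (name TEXT, grp TEXT, status TEXT, day TEXT, cnt INTEGER)"
)
conn.executemany(
    "INSERT INTO yourtable VALUES (?, ?, ?, ?, ?)",
    [("A.A.1", "A", "yes", "5/23", None), ("A.A.1", "B", "yes", "5/27", 112)],
)

# Unpivot: one output row per (source row, column). UNION ALL replaces
# CROSS APPLY ... VALUES here. The count is cast to text so that all three
# branches share one type.
unpivot_sql = """
SELECT name, 'grp_' || grp AS col, status AS value FROM yourtable
UNION ALL
SELECT name, 'day_' || grp, day FROM yourtable
UNION ALL
SELECT name, 'ct_' || grp, CAST(cnt AS TEXT) FROM yourtable
"""
unpivoted = conn.execute(unpivot_sql).fetchall()

# Pivot: the generated names in col become the output columns.
pivot_sql = f"""
WITH d AS ({unpivot_sql})
SELECT name,
       MAX(CASE WHEN col = 'grp_A' THEN value END) AS grp_A,
       MAX(CASE WHEN col = 'day_A' THEN value END) AS day_A,
       MAX(CASE WHEN col = 'ct_A'  THEN value END) AS ct_A,
       MAX(CASE WHEN col = 'grp_B' THEN value END) AS grp_B,
       MAX(CASE WHEN col = 'day_B' THEN value END) AS day_B,
       MAX(CASE WHEN col = 'ct_B'  THEN value END) AS ct_B
FROM d
GROUP BY name
"""
pivoted = conn.execute(pivot_sql).fetchall()
print(len(unpivoted), pivoted)
```

Note that the counts come back as text because of the cast, which is the same trade-off the CROSS APPLY version makes with varchar(10).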

Related

how to make your data horizontal

I have two rows of data for the same name and I want to combine them into one row. For example, I have this sample data:
| Name | Status | Bank |
| Thung | Active | ABC Bank |
| Thung | Hold | ABC Bank |
Can I turn it into something like this?
| Name | Status 1 | Bank 1 | Status 2 | Bank 2 |
| Thung | Active | ABC Bank | Hold | ABC Bank |
Sorry, I can't explain it properly.
SQL Fiddle
MS SQL Server 2017 Schema Setup:
create table MyTable(Name varchar(max),BStatus varchar(max),Bank varchar(max))
insert into MyTable (Name,BStatus,Bank)values('Thung','Active', 'ABC Bank')
insert into MyTable (Name,BStatus,Bank)values('Thung','Hold', 'ABC Bank')
Query 1:
with CTE AS (select *,
(CASE WHEN BStatus='Active' THEN BStatus END) AS Status1,
(CASE WHEN BStatus = 'Hold' THEN BStatus END) AS Status2,
(CASE WHEN Bank='ABC Bank' THEN Bank END) AS Bank1,
(CASE WHEN Bank='ABC Bank' THEN Bank END) AS Bank2,
ROW_NUMBER() OVER (PARTITION BY BStatus,Bank Order By Name) as rn
from MyTable
group by Name,BStatus,Bank )
select c.Name
,max(c.Status1) AS Status1
,max(c.Status2) AS Status2
,max(c.Bank1) AS Bank1
,max(c.Bank2) AS Bank2
from cte c
where rn=1
group by c.Name,c.Bank
Results:
| Name | Status1 | Status2 | Bank1 | Bank2 |
|-------|---------|---------|----------|----------|
| Thung | Active | Hold | ABC Bank | ABC Bank |

Duplicates on Self Left Join

I'm trying to pivot out a table of data stored in a vertical model into a more horizontal, SQL Server table-like model. Unfortunately due to the nature of the data, I cannot use the real data here so I worked up a generic example that follows the same model.
There are three columns to the table, an ID, column ID and value, where the ID and column ID form the Primary Key. Additionally none of the data is required (i.e. an ID can be missing column ID = 3 without breaking anything)
PetID | ColumnID | Value
---------------------------
1 | 1 | Gilda
1 | 2 | Cat
2 | 1 | Sonny
2 | 2 | Cat
2 | 3 | Black
Due to the fact that the Primary Key is a composite of two columns I cannot use the built in PIVOT functionality, so I tried doing a self LEFT JOIN:
SELECT T1.PetID
,T2.Value AS [Name]
,T3.Value AS [Type]
,T4.Value AS [Color]
FROM #Temp AS T1
LEFT JOIN #Temp AS T2 ON T1.PetID = T2.PetID
AND T2.ColumnID = 1
LEFT JOIN #Temp AS T3 ON T1.PetID = T3.PetID
AND T3.ColumnID = 2
LEFT JOIN #Temp AS T4 ON T1.PetID = T4.PetID
AND T4.ColumnID = 3;
The idea being that I want to take the ID from T1 and then do a self LEFT JOIN to get each of the values by ColumnID. However I'm getting duplicates in the data:
PetID | Name | Type | Color
------------------------------
1 | Gilda | Cat | NULL
1 | Gilda | Cat | NULL
2 | Sonny | Cat | Black
2 | Sonny | Cat | Black
2 | Sonny | Cat | Black
I am able to get rid of these duplicates using a DISTINCT, but the dataset is rather large, so the required sort action is slowing down the query tremendously. Is there a better way to accomplish this or am I just stuck with a slow query?
You can use a CASE expression inside an aggregate and avoid the joins altogether.
SELECT
PetID,
MAX(CASE WHEN ColumnID = 1 THEN Value ELSE NULL END) AS Name,
MAX(CASE WHEN ColumnID = 2 THEN Value ELSE NULL END) AS Type,
MAX(CASE WHEN ColumnID = 3 THEN Value ELSE NULL END) AS Color
FROM #Temp
GROUP BY PetId
It is essential that (PetID, ColumnID) be your primary key for this to work correctly. Otherwise, when the same ColumnID appears more than once for a PetID, MAX will return only one of the values.
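To see both the duplication and the fix side by side, here is a small sketch that runs the question's self LEFT JOIN and the grouped CASE query against the sample pets. It assumes SQLite via Python, and the table name pets is a hypothetical replacement for #Temp.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pets (PetID INTEGER, ColumnID INTEGER, Value TEXT, "
    "PRIMARY KEY (PetID, ColumnID))"
)
conn.executemany(
    "INSERT INTO pets VALUES (?, ?, ?)",
    [(1, 1, "Gilda"), (1, 2, "Cat"), (2, 1, "Sonny"), (2, 2, "Cat"), (2, 3, "Black")],
)

# The self join from the question: T1 contributes one row per (PetID, ColumnID),
# so every pet is repeated once for each attribute it has.
joined = conn.execute("""
SELECT T1.PetID, T2.Value, T3.Value, T4.Value
FROM pets AS T1
LEFT JOIN pets AS T2 ON T1.PetID = T2.PetID AND T2.ColumnID = 1
LEFT JOIN pets AS T3 ON T1.PetID = T3.PetID AND T3.ColumnID = 2
LEFT JOIN pets AS T4 ON T1.PetID = T4.PetID AND T4.ColumnID = 3
ORDER BY T1.PetID
""").fetchall()

# Conditional aggregation: a single pass over the table, one row per PetID.
aggregated = conn.execute("""
SELECT PetID,
       MAX(CASE WHEN ColumnID = 1 THEN Value END) AS Name,
       MAX(CASE WHEN ColumnID = 2 THEN Value END) AS Type,
       MAX(CASE WHEN ColumnID = 3 THEN Value END) AS Color
FROM pets
GROUP BY PetID
ORDER BY PetID
""").fetchall()

print(len(joined), aggregated)
```

The join yields five rows (two for pet 1, three for pet 2), while the grouped query yields exactly one row per pet without needing DISTINCT.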
You can also use PIVOT if you'd like:
SELECT *
FROM (SELECT PetID,
(CASE ColumnID
WHEN 1 THEN 'Name'
WHEN 2 THEN 'Type'
WHEN 3 THEN 'Color'
END) ValueType,
VALUE
FROM #Temp
) t
PIVOT
( MAX(Value)
FOR ValueType IN ([Name],[Type],[Color])
) p
Another way, without the subquery:
SELECT PetID,
[1] [Name],
[2] [Type],
[3] [Color]
FROM #Temp
PIVOT
( MAX(Value)
FOR ColumnID IN ([1],[2],[3])
) p
I don't understand your concern about sorting. You have a primary key so you also have an index. This is the correct way to do it:
select
PetID,
min(case when ColumnID = 1 then Value end) as Name,
min(case when ColumnID = 2 then Value end) as Type,
min(case when ColumnID = 3 then Value end) as Color
from #Temp
group by PetID
A fix for your duplication is simple, though, and will probably improve performance as well. Replace the first FROM clause of your query with:
FROM (select distinct PetID from #Temp) AS T1
SELECT T1.PetID
,T1.Value AS [Name]
,T2.Value AS [Type]
,T3.Value AS [Color]
--select *
FROM #Temp AS T1
LEFT JOIN #Temp AS T2 ON T1.PetID = T2.PetID
AND T2.ColumnID = 2
LEFT JOIN #Temp AS T3 ON T1.PetID = T3.PetID
AND T3.ColumnID = 3
where t1.ColumnID = 1
Your problem was that you were joining to the main table that had multiple rows.

how to get count of employees whose name starts with alphabet A and B

How can I get a count of employees whose name starts with A or B? The result should look like the table below.
===========
A | B |
===========
5 | 8 |
-----------
You can always use CASE
SELECT
SUM(case when first_name like 'A%' then 1 else 0 end) 'A' ,
SUM(case when first_name like 'B%' then 1 else 0 end) 'B'
FROM tableName
The query adds 1 to column A for every first_name that starts with A, and 1 to column B for every first_name that starts with B.
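A quick way to check the counting logic is to run it on a handful of made-up rows. The sketch below assumes SQLite via Python; the employees table and its names are hypothetical sample data, not taken from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (first_name TEXT)")
# Hypothetical sample data: two names start with A, three with B, one with C.
conn.executemany(
    "INSERT INTO employees VALUES (?)",
    [("Alice",), ("Aaron",), ("Bob",), ("Bella",), ("Ben",), ("Carol",)],
)

# Each CASE yields 1 for a matching row and 0 otherwise, so SUM counts matches.
counts = conn.execute("""
SELECT SUM(CASE WHEN first_name LIKE 'A%' THEN 1 ELSE 0 END) AS A,
       SUM(CASE WHEN first_name LIKE 'B%' THEN 1 ELSE 0 END) AS B
FROM employees
""").fetchone()
print(counts)
```

The single result row holds one count per letter, which matches the one-row, two-column layout asked for.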
Based on my understanding, the query below returns two columns: the starting letter and the count of employees for that letter.
SELECT LEFT(employees, 1) , Count(LEFT(employees, 1)) FROM
TableName GROUP BY LEFT(employees, 1)

Query trick - kind of unpivot

I have the following table
SnapShotDay OperationalUnitNumber IsOpen StatusDate
1-01-2014 001 1 1-01-2014
2-01-2014 NULL NULL NULL
3-01-2014 001 0 3-01-2014
4-01-2014 NULL NULL NULL
5-01-2014 001 1 5-01-2014
I obtain this with a SELECT construct, but what I need to do now is fill in the "NULL"ed rows by taking values from the first Non nulled row before. The latter would give:
SnapShotDay OperationalUnitNumber IsOpen StatusDate
1-01-2014 001 1 1-01-2014
2-01-2014 001 1 1-01-2014
3-01-2014 001 0 3-01-2014
4-01-2014 001 0 3-01-2014
5-01-2014 001 1 5-01-2014
In functional terms: I have event records that give me an event on a date for an operational unit; the event is either IsOpen or IsClosed. Chaining those events together by date gives a sort of range. What I need is to generate daily records for those ranges (the target is a fact table).
I am trying to achieve this in plain SQL query (no stored procedure).
Can you think of a trick ?
create table #t (
SnapShotDay date,
OperationalUnitNumber int,
IsOpen bit,
StatusDate date
)
insert into #t
select '1-01-2014', 001 , 1 , '1-01-2014' union all
select '2-01-2014', NULL, NULL, NULL union all
select '3-01-2014', 001 , 0 ,'3-01-2014' union all
select '4-01-2014', NULL,NULL,NULL union all
select '5-01-2014', 001 ,1,'5-01-2014'
;
with CTE as
(
select *,row_number()over( order by SnapShotDay)rn from #t
)
select *,
case when a.isopen is null then (
select IsOpen from cte where rn=a.rn-1
) else a.isopen end
from cte a
OK, I got it. Add one more CTE, cte1, after the first one:
,cte1 as
(
select top 1 rn ,IsOpen from cte where IsOpen is not null order by rn desc
)
--select * from Statuses
select *,
case
when a.rn<=(select b.rn from cte1 b) and a.IsOpen is null then
(
select
a1.IsOpen
from
cte a1
where
a1.rn=a.rn-1
)
when a.rn>=(select b.rn from cte1 b) and a.IsOpen is null then
(select IsOpen from cte1)
else
a.isopen
end
from
cte a
Try this. In the main query we're looking for the previous date with not null values. Then just JOIN this table with this LastDate.
WITH T1 AS
(
SELECT *, (SELECT MAX(SnapShotDay)
FROM T
WHERE SnapShotDay<=TMain.SnapShotDay
AND OPERATIONALUNITNUMBER IS NOT NULL)
as LastDate
FROM T as TMain
)
SELECT T1.SnapShotDay,
T.OperationalUnitNumber,
T.IsOpen,
T.StatusDate
FROM T1
JOIN T ON T1.LastDate=T.SnapShotDay
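The "find the last non-null date, then join back" approach can be sketched on the sample rows as follows. This assumes SQLite via Python, with dates stored as ISO text (yyyy-mm-dd) so that MAX and comparisons behave chronologically; the table name T follows the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE T (SnapShotDay TEXT, OperationalUnitNumber TEXT, "
    "IsOpen INTEGER, StatusDate TEXT)"
)
conn.executemany(
    "INSERT INTO T VALUES (?, ?, ?, ?)",
    [
        ("2014-01-01", "001", 1, "2014-01-01"),
        ("2014-01-02", None, None, None),
        ("2014-01-03", "001", 0, "2014-01-03"),
        ("2014-01-04", None, None, None),
        ("2014-01-05", "001", 1, "2014-01-05"),
    ],
)

# For every day, LastDate is the latest day up to and including it that has
# a reading. Joining back on LastDate copies that day's values forward.
filled = conn.execute("""
WITH T1 AS (
    SELECT TMain.SnapShotDay,
           (SELECT MAX(T.SnapShotDay)
            FROM T
            WHERE T.SnapShotDay <= TMain.SnapShotDay
              AND T.OperationalUnitNumber IS NOT NULL) AS LastDate
    FROM T AS TMain
)
SELECT T1.SnapShotDay, T.OperationalUnitNumber, T.IsOpen, T.StatusDate
FROM T1
JOIN T ON T1.LastDate = T.SnapShotDay
ORDER BY T1.SnapShotDay
""").fetchall()
for row in filled:
    print(row)
```

Because this is an inner join, any leading days that have no earlier reading would drop out of the result; a LEFT JOIN from T1 would keep them with NULLs.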
SELECT
t1.SnapShotDay,
CASE WHEN t1.OperationalUnitNumber IS NOT NUll
THEN t1.OperationalUnitNumber
ELSE (SELECT TOP 1 t2.OperationalUnitNumber FROM YourTable t2 WHERE t2.SnapShotDay < t1.SnapShotDay AND t2.OperationalUnitNumber IS NOT NULL ORDER BY SnapShotDay DESC)
END AS OperationalUnitNumber,
CASE WHEN t1.IsOpen IS NOT NUll
THEN t1.IsOpen
ELSE (SELECT TOP 1 t2.IsOpen FROM YourTable t2 WHERE t2.SnapShotDay < t1.SnapShotDay AND t2.IsOpen IS NOT NULL ORDER BY SnapShotDay DESC)
END AS IsOpen,
CASE WHEN t1.StatusDate IS NOT NUll
THEN t1.StatusDate
ELSE (SELECT TOP 1 t2.StatusDate FROM YourTable t2 WHERE t2.SnapShotDay < t1.SnapShotDay AND t2.StatusDate IS NOT NULL ORDER BY SnapShotDay DESC)
END AS StatusDate
FROM YourTable t1
You asked for 'plain sql', here is a tested attempt using SQL, with comments, that gives the required answer.
I have tested the code using 'sqlite' and 'mysql' on windows xp. It is pure SQL and should work everywhere.
SQL is about 'sets' and combining them and ordering the results.
This problem seems to be about two separate sets:
1) The 'snap shot day' that have readings.
2) the 'snap shot day' that don't have readings.
I have added extra columns so that we can easily see where values came from.
let us deal with the easy set first:
This is the set of 'supplied' readings.
SELECT dss.SnapShotDay theDay,
'supplied' readingExists,
dss.OperationalUnitNumber,
dss.IsOpen,
dss.StatusDate
FROM dailysnapshot dss
WHERE dss.OperationalUnitNumber IS NOT NULL
results:
theDay readingExists OperationalUnitNumber IsOpen StatusDate
2014-01-01 supplied 001 1 2014-01-01
2014-01-03 supplied 001 0 2014-01-03
2014-01-05 supplied 001 1 2014-01-05
Now let us deal with the set of 'days that have missing readings'. For each such day we need the most recent earlier day that has readings, and we assume its values carry forward to the missing day.
It sounds complex but it isn't. It asks:
For each day without a reading, get me the closest earlier date that has readings, and I will use those readings.
Here is the query:
SELECT emptyDSS.SnapShotDay,
'missing' readingExists,
maxPrevDSS.OperationalUnitNumber,
maxPrevDSS.IsOpen,
maxPrevDSS.StatusDate
FROM dailysnapshot emptyDSS
INNER JOIN dailysnapshot maxPrevDSS ON maxPrevDSS.SnapShotDay =
(SELECT MAX(dss.SnapShotDay)
FROM dailysnapshot dss
WHERE dss.SnapShotDay < emptyDSS.SnapShotDay
AND dss.OperationalUnitNumber IS NOT NULL)
WHERE emptyDSS.OperationalUnitNumber IS NULL
results:
SnapShotDay readingExists OperationalUnitNumber IsOpen StatusDate
2014-01-02 missing 001 1 2014-01-01
2014-01-04 missing 001 0 2014-01-03
This is not about efficiency! It is about getting the correct 'result set' with the easiest to understand SQL code. I assume the database engine will optimize the query. The query can be 'tweaked' later if required.
We now need to combine the two queries and order the results in the manner we require.
The standard way of combining results from SQL queries is with set operators (union, intersection, minus).
We use 'union' and an 'order by' on the result set.
This gives the final query:
SELECT dss.SnapShotDay theDay,
'supplied' readingExists,
dss.OperationalUnitNumber,
dss.IsOpen,
dss.StatusDate
FROM dailysnapshot dss
WHERE dss.OperationalUnitNumber IS NOT NULL
UNION
SELECT emptyDSS.SnapShotDay theDay,
'missing' readingExists,
maxPrevDSS.OperationalUnitNumber,
maxPrevDSS.IsOpen,
maxPrevDSS.StatusDate
FROM dailysnapshot emptyDSS
INNER JOIN dailysnapshot maxPrevDSS ON maxPrevDSS.SnapShotDay =
(SELECT MAX(dss.SnapShotDay)
FROM dailysnapshot dss
WHERE dss.SnapShotDay < emptyDSS.SnapShotDay
AND dss.OperationalUnitNumber IS NOT NULL)
WHERE emptyDSS.OperationalUnitNumber IS NULL
ORDER BY theDay ASC
result:
theDay readingExists dss.OperationalUnitNumber dss.IsOpen dss.StatusDate
2014-01-01 supplied 001 1 2014-01-01
2014-01-02 missing 001 1 2014-01-01
2014-01-03 supplied 001 0 2014-01-03
2014-01-04 missing 001 0 2014-01-03
2014-01-05 supplied 001 1 2014-01-05
I enjoyed doing this.
It should work with most SQL engines.

Grouping results by test

I have a table with this structure:
Test Value Shape
1 1,89 20
1 2,08 27
1 2,05 12
2 2,01 12
2 2,05 35
2 2,03 24
I need a column for each Test value, in this case, something like this:
Test 1 | Test 2
Value | Shape | Value | Shape
I tried to do this with PIVOT, but the results weren't good. Can someone help me?
Regards
There are a few different ways that you can get the result since you are using SQL Server. In order to get the result, you will first need to create a unique value that will allow you to return multiple rows for each Test. I would apply a windowing function like row_number():
select test, value, shape,
row_number() over(partition by test
order by value) seq
from yourtable
This query will be used as the base for the rest of your process. This creates a unique sequence for each test and then when you apply the aggregate function you are able to return multiple rows.
You can get your final result using an aggregate function with a CASE expression:
select
max(case when test = 1 then value end) test1Value,
max(case when test = 1 then shape end) test1Shape,
max(case when test = 2 then value end) test2Value,
max(case when test = 2 then shape end) test2Shape
from
(
select test, value, shape,
row_number() over(partition by test
order by value) seq
from yourtable
) d
group by seq;
If you want to implement the PIVOT function, then you would first need to unpivot the multiple columns of Value and Shape and then apply the PIVOT. You will still use row_number() to generate a unique sequence that will be needed to return multiple rows. The basic syntax will be:
;with cte as
(
-- get unique sequence
select test, value, shape,
row_number() over(partition by test
order by value) seq
from yourtable
)
select test1Value, test1Shape,
test2Value, test2Shape
from
(
-- unpivot the multiple columns
select t.seq,
col = 'test'+cast(test as varchar(10))
+ col,
val
from cte t
cross apply
(
select 'value', value union all
select 'shape', cast(shape as varchar(10))
) c (col, val)
) d
pivot
(
max(val)
for col in (test1Value, test1Shape,
test2Value, test2Shape)
) piv;
Both versions give a result:
| TEST1VALUE | TEST1SHAPE | TEST2VALUE | TEST2SHAPE |
|------------|------------|------------|------------|
| 1,89 | 20 | 2,01 | 12 |
| 2,05 | 12 | 2,03 | 24 |
| 2,08 | 27 | 2,05 | 35 |
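The row_number() plus CASE version can be reproduced on the sample data with the sketch below. It assumes SQLite 3.25 or later (needed for window functions) via Python, and it stores Value as text with a decimal comma, as shown in the question, so the ordering inside each test is a text ordering.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Value is kept as text ('1,89') to mirror the question's formatting.
conn.execute("CREATE TABLE yourtable (test INTEGER, value TEXT, shape INTEGER)")
conn.executemany(
    "INSERT INTO yourtable VALUES (?, ?, ?)",
    [
        (1, "1,89", 20), (1, "2,08", 27), (1, "2,05", 12),
        (2, "2,01", 12), (2, "2,05", 35), (2, "2,03", 24),
    ],
)

# seq numbers the rows inside each test. Grouping by seq pairs the n-th row
# of test 1 with the n-th row of test 2 on a single output row.
result = conn.execute("""
SELECT MAX(CASE WHEN test = 1 THEN value END) AS test1Value,
       MAX(CASE WHEN test = 1 THEN shape END) AS test1Shape,
       MAX(CASE WHEN test = 2 THEN value END) AS test2Value,
       MAX(CASE WHEN test = 2 THEN shape END) AS test2Shape
FROM (
    SELECT test, value, shape,
           ROW_NUMBER() OVER (PARTITION BY test ORDER BY value) AS seq
    FROM yourtable
) AS d
GROUP BY seq
ORDER BY seq
""").fetchall()
for row in result:
    print(row)
```

Without seq there would be nothing to group on except the whole table, and MAX would collapse everything into a single row.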
