I have this data set and I'm trying to create a pivot table from it. I'm having trouble putting the date for each type across as columns.
Member ID Type Date
1 A 12/5/2014
1 b 3/6/2014
2 a 6/9/2015
2 b 3/2/2015
2 c 6/1/2014
3 a 6/5/2014
3 c 7/6/2014
4 c 9/13/2014
5 a 7/25/2014
5 b 6/24/2014
5 c 2/24/2014
I would like it to come out as a pivot table, like this:
Member ID A A date B B date c C date
1 Yes 12/5/2014 yes 3/6/2014 Null Null
2 Yes 6/9/2015 yes 3/2/2015 Yes 6/1/2014
3 Yes 6/5/2014 Null Null Yes 7/6/2014
4 Null Null Null Null Yes 9/13/2014
5 Yes 7/25/2014 yes 6/24/2014 Yes 2/24/2014
I got this far, not including the dates:
SELECT
MemberID
,Type
--,Date
into #Test9
FROM [Data].[dbo].[Test]
I create the temp table, then try to pivot without the date:
Select *
From #Test9
Pivot ( count(type)
for type in ([a]
,[b]
,[c])) as pvt
Would someone please help.
Assuming each member will only have one value for each type you could use conditional aggregation to get the result you want:
select
memberid,
max(case when type='a' then 'Yes' end) as "a",
max(case when type='a' then date end) as "a date",
max(case when type='b' then 'Yes' end) as "b",
max(case when type='b' then date end) as "b date",
max(case when type='c' then 'Yes' end) as "c",
max(case when type='c' then date end) as "c date"
from your_table
group by MemberID
This can quite easily be turned into a dynamic query if you don't have a fixed number of types; there are plenty of good answers here on Stack Overflow that demonstrate how, including the canonical SQL Server Pivot question that I usually link to.
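As a rough illustration of the dynamic variant mentioned above, here is a minimal sketch that builds the conditional-aggregation query from whatever types exist in the data. It is not T-SQL: it runs against SQLite through Python's sqlite3 module, and the table name member_types, the lower-cased type values and the ISO date strings are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table mirroring the question's data (types lower-cased, ISO dates).
conn.execute("CREATE TABLE member_types (member_id INTEGER, type TEXT, date TEXT)")
conn.executemany(
    "INSERT INTO member_types VALUES (?, ?, ?)",
    [
        (1, "a", "2014-12-05"), (1, "b", "2014-03-06"),
        (2, "a", "2015-06-09"), (2, "b", "2015-03-02"), (2, "c", "2014-06-01"),
        (3, "a", "2014-06-05"), (3, "c", "2014-07-06"),
        (4, "c", "2014-09-13"),
        (5, "a", "2014-07-25"), (5, "b", "2014-06-24"), (5, "c", "2014-02-24"),
    ],
)

# Discover the pivot values at run time instead of hard-coding them.
types = [row[0] for row in conn.execute(
    "SELECT DISTINCT type FROM member_types ORDER BY type")]

# The values end up inside the SQL text, so only accept plain alphanumerics.
if not all(t.isalnum() for t in types):
    raise ValueError("unexpected characters in type values")

# Two output columns per type: a Yes flag and the date.
select_parts = []
for t in types:
    select_parts.append(f"MAX(CASE WHEN type = '{t}' THEN 'Yes' END) AS \"{t}\"")
    select_parts.append(f"MAX(CASE WHEN type = '{t}' THEN date END) AS \"{t} date\"")

query = (
    "SELECT member_id, " + ", ".join(select_parts)
    + " FROM member_types GROUP BY member_id ORDER BY member_id"
)
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

In SQL Server the same idea is usually implemented by building the column list into a string and running it with sp_executesql.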
Try this:
create table #t (memberid int, [type] char(1), [date] date)
insert into #t(MemberID,[Type],[Date]) values
( 1,'A','12/5/2014'),
(1,'b','3/6/2014'),
(2,'a','6/9/2015'),
(2,'b','3/2/2015'),
(2,'c','6/1/2014'),
(3,'a','6/5/2014'),
(3,'c','7/6/2014'),
(4,'c','9/13/2014'),
(5,'a','7/25/2014'),
(5,'b','6/24/2014'),
(5,'c','2/24/2014')
select memberid,
       case when a is not null then 'Yes' end as [A], a as [A date],
       case when b is not null then 'Yes' end as [B], b as [B date],
       case when c is not null then 'Yes' end as [C], c as [C date]
from #t t
pivot (max([date]) for [type] in (a, b, c)) p
I have 4 columns in my table, like this:
key cusi isi name
1 46644UAQ1 US46642EAV83 A
1 46644UAR9 XS0062104145 A
1 254206AC9 A
2 05617YAJ8 US86359AXP38 B
2 885220BP7 B
2 null B
3 885220BP5 885220BP7345 c
The key and name column values are duplicated because of the cusi and isi columns. I would like to transpose only some columns, in this case cusi and isi, so that I get one record for key = 1, another for key = 2, and so on. In my use case there can be at most 3 distinct cusi or 3 distinct isi values per key.
The transposed table should look like this:
key name cusi1 cusi2 cusi3 isi1 isi2 isi3
1 A 46644UAQ1 46644UAR9 254206AC9 US46642EAV83 XS0062104145 NULL
2 B 05617YAJ8 885220BP7 NULL US86359AXP38 NULL NULL
3 c 885220BP5 null null 885220BP7345 NULL NULL
In some cases there might be only one row per key, as for key = 3 in the example above.
I know that SQL has PIVOT and UNPIVOT queries, but I am not sure how to use them to transpose selected columns of a table.
Any help would be appreciated.
Thanks
If you know that each key-name group will have a fixed maximum number of records (three, based on the sample data you gave us), then an ordinary non-pivot query should work. In the query below, I use the row number to distinguish each of the three columns you want for cusi and isi in your result set.
-- KEY is a reserved word in SQL Server, so the column name is bracketed.
SELECT t.[key],
       t.name,
       MAX(CASE WHEN t.rn = 1 THEN cusi END) AS cusi1,
       MAX(CASE WHEN t.rn = 2 THEN cusi END) AS cusi2,
       MAX(CASE WHEN t.rn = 3 THEN cusi END) AS cusi3,
       MAX(CASE WHEN t.rn = 1 THEN isi END) AS isi1,
       MAX(CASE WHEN t.rn = 2 THEN isi END) AS isi2,
       MAX(CASE WHEN t.rn = 3 THEN isi END) AS isi3
FROM
(
    SELECT [key],
           cusi,
           isi,
           name,
           -- number the rows within each key
           ROW_NUMBER() OVER(PARTITION BY [key] ORDER BY cusi) AS rn
    FROM yourTable
) t
GROUP BY t.[key],
         t.name
Note that SQL Server also has a PIVOT function, which is an alternative to what I gave.
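For anyone who wants to try the row-number technique without a SQL Server instance, here is a minimal sketch against SQLite (3.25 or later, for window functions) through Python's sqlite3 module. It is a variation on the query above, not the same query: the table name ids is made up, and cusi and isi are ranked independently with NULLs last so that the non-null values are packed into the lowest-numbered columns. The sample data has no column that defines an order, so values come out sorted alphabetically rather than in input order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table holding the question's sample rows.
conn.execute('CREATE TABLE ids ("key" INTEGER, cusi TEXT, isi TEXT, name TEXT)')
conn.executemany(
    "INSERT INTO ids VALUES (?, ?, ?, ?)",
    [
        (1, "46644UAQ1", "US46642EAV83", "A"),
        (1, "46644UAR9", "XS0062104145", "A"),
        (1, "254206AC9", None, "A"),
        (2, "05617YAJ8", "US86359AXP38", "B"),
        (2, "885220BP7", None, "B"),
        (2, None, None, "B"),
        (3, "885220BP5", "885220BP7345", "c"),
    ],
)

# Rank cusi and isi separately within each key, NULLs last,
# then spread the ranked values across columns with MAX(CASE ...).
query = """
SELECT "key", name,
       MAX(CASE WHEN cusi_rn = 1 THEN cusi END) AS cusi1,
       MAX(CASE WHEN cusi_rn = 2 THEN cusi END) AS cusi2,
       MAX(CASE WHEN cusi_rn = 3 THEN cusi END) AS cusi3,
       MAX(CASE WHEN isi_rn = 1 THEN isi END) AS isi1,
       MAX(CASE WHEN isi_rn = 2 THEN isi END) AS isi2,
       MAX(CASE WHEN isi_rn = 3 THEN isi END) AS isi3
FROM (
    SELECT "key", name, cusi, isi,
           ROW_NUMBER() OVER (PARTITION BY "key" ORDER BY cusi IS NULL, cusi) AS cusi_rn,
           ROW_NUMBER() OVER (PARTITION BY "key" ORDER BY isi IS NULL, isi) AS isi_rn
    FROM ids
)
GROUP BY "key", name
ORDER BY "key"
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

In T-SQL the "NULLs last" ordering would need something like ORDER BY CASE WHEN cusi IS NULL THEN 1 ELSE 0 END, cusi, since SQL Server sorts NULLs first by default.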
There was one other similar answer, but it is 2 pages long and my requirement doesn't need that. I have 2 tables, tableA and tableB, and I need to find the counts of rows that are present in tableA but not in tableB, or whose updated_on in tableB is not today's date.
My tables:
tableA:
release_id book_name release_begin_date
----------------------------------------------------
1122 midsummer 2016-01-01
1123 fool's errand 2016-06-01
1124 midsummer 2016-04-01
1125 fool's errand 2016-08-01
tableB:
release_id book_name updated_on
-----------------------------------------
1122 midsummer 2016-08-17
1123 fool's errand 2016-08-16
Expected result: since each book is missing one release id in tableB, count_of_missing is 1 for each. In addition, the existing tableB row for fool's errand has an updated_on date of yesterday rather than today, so it needs to be counted in count_of_not_updated.
book_name count_of_missing count_of_not_updated
-------------------------------------------------------
midsummer 1 0
fool's errand 1 1
Note: Even though fool's errand is present in tableB, I need to count it in count_of_not_updated because its updated_on date is yesterday and not today. I know it has to be a combination of a left join and something else, but the kicker is not only getting the missing rows from the left table but at the same time checking whether the updated_on column holds today's date and, if not, counting that row in count_of_not_updated.
select a.book_name
, sum(case when b.release_id is null then 1 else 0 end) as noReleaseID
, sum(case when datediff(day, b.updated_on, getdate()) > 0 then 1 else 0 end) as releaseDateNotToday
from tableA a
left outer join tableB b on a.release_id = b.release_id
group by a.book_name
This example wraps a CASE expression in SUM to count the rows for which the condition is true. Note that the code assumes, as in your example, that you want to count every tableB row whose updated_on date is before today; more steps would be required if a release can have several rows in tableB and you only want to compare against the most recent one.
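To make the SUM-over-CASE idea easy to check against the expected result in the question, here is a small sketch using SQLite through Python's sqlite3 module. It is not T-SQL, and "today" is passed in as a fixed parameter (2016-08-17, an assumption taken from the sample data) instead of calling GETDATE(), so the output does not depend on when it is run.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tableA (release_id INTEGER, book_name TEXT, release_begin_date TEXT)")
conn.execute("CREATE TABLE tableB (release_id INTEGER, book_name TEXT, updated_on TEXT)")
conn.executemany("INSERT INTO tableA VALUES (?, ?, ?)", [
    (1122, "midsummer", "2016-01-01"),
    (1123, "fool's errand", "2016-06-01"),
    (1124, "midsummer", "2016-04-01"),
    (1125, "fool's errand", "2016-08-01"),
])
conn.executemany("INSERT INTO tableB VALUES (?, ?, ?)", [
    (1122, "midsummer", "2016-08-17"),
    (1123, "fool's errand", "2016-08-16"),
])

# Unmatched rows have NULLs on the tableB side: they count as missing,
# and the <> comparison on NULL falls through to ELSE 0.
query = """
SELECT a.book_name,
       SUM(CASE WHEN b.release_id IS NULL THEN 1 ELSE 0 END) AS count_of_missing,
       SUM(CASE WHEN b.updated_on <> :today THEN 1 ELSE 0 END) AS count_of_not_updated
FROM tableA a
LEFT JOIN tableB b ON a.release_id = b.release_id
GROUP BY a.book_name
ORDER BY a.book_name
"""
counts = conn.execute(query, {"today": "2016-08-17"}).fetchall()
for row in counts:
    print(row)
```

Grouping by book_name rather than release_id is what produces one row per book, as in the expected result.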
Try this:
CREATE TABLE #tableA (release_id INT, book_name NVARCHAR(50), release_begin_date DATETIME)
CREATE TABLE #tableB (release_id INT, book_name NVARCHAR(50), updated_on DATETIME)
INSERT INTO #tableA
VALUES
(1122, 'midsummer', '2016-01-01'),
(1123, 'fool''s errand', '2016-06-01'),
(1124, 'midsummer', '2016-04-01'),
(1125, 'fool''s errand', '2016-08-01')
INSERT INTO #tableB
VALUES
(1122, 'midsummer', '2016-08-17'),
(1123, 'fool''s errand', '2016-08-16')
;WITH TmpTableA
AS
(
SELECT
book_name,
COUNT(1) CountOfTableA
FROM
#tableA
GROUP BY
book_name
), TmpTableB
AS
(
SELECT
book_name,
COUNT(1) CountOfTableB,
SUM(CASE WHEN CONVERT(VARCHAR(11), updated_on, 112) = CONVERT(VARCHAR(11), GETDATE(), 112) THEN 0 ELSE 1 END) count_of_not_updated
FROM
#tableB
GROUP BY
book_name
)
SELECT
A.book_name ,
A.CountOfTableA - ISNULL(B.CountOfTableB, 0) AS count_of_missing,
ISNULL(B.count_of_not_updated, 0) AS count_of_not_updated
FROM
TmpTableA A LEFT JOIN
TmpTableB B ON A.book_name = B.book_name
Result:
book_name count_of_missing count_of_not_updated
-------------------- ---------------- --------------------
fool's errand 1 1
midsummer 1 1
I have the following table
SnapShotDay OperationalUnitNumber IsOpen StatusDate
1-01-2014 001 1 1-01-2014
2-01-2014 NULL NULL NULL
3-01-2014 001 0 3-01-2014
4-01-2014 NULL NULL NULL
5-01-2014 001 1 5-01-2014
I obtain this with a SELECT construct, but what I need to do now is fill in the NULL rows by taking the values from the nearest preceding non-NULL row. That would give:
SnapShotDay OperationalUnitNumber IsOpen StatusDate
1-01-2014 001 1 1-01-2014
2-01-2014 001 1 1-01-2014
3-01-2014 001 0 3-01-2014
4-01-2014 001 0 3-01-2014
5-01-2014 001 1 5-01-2014
In functional terms: I have event records that give me an event on a date for an operational unit; the event is either IsOpen or IsClosed. Chaining those events together by date gives a series of ranges. What I need is to generate daily records for those ranges (the target is a fact table).
I am trying to achieve this in a plain SQL query (no stored procedure).
Can you think of a trick?
create table #t (
SnapShotDay date,
OperationalUnitNumber int,
IsOpen bit,
StatusDate date
)
insert into #t
select '1-01-2014', 001 , 1 , '1-01-2014' union all
select '2-01-2014', NULL, NULL, NULL union all
select '3-01-2014', 001 , 0 ,'3-01-2014' union all
select '4-01-2014', NULL,NULL,NULL union all
select '5-01-2014', 001 ,1,'5-01-2014'
;
with CTE as
(
select *, row_number() over (order by SnapShotDay) as rn from #t
)
select *,
case when a.isopen is null then (
select IsOpen from cte where rn=a.rn-1
) else a.isopen end
from cte a
OK, I got it. Add one more CTE, cte1, after the first:
,cte1 as
(
select top 1 rn ,IsOpen from cte where IsOpen is not null order by rn desc
)
--select * from Statuses
select *,
case
when a.rn<=(select b.rn from cte1 b) and a.IsOpen is null then
(
select
a1.IsOpen
from
cte a1
where
a1.rn=a.rn-1
)
when a.rn>=(select b.rn from cte1 b) and a.IsOpen is null then
(select IsOpen from cte1)
else
a.isopen
end
from
cte a
Try this. For each row, the CTE finds the latest date on or before it that has non-null values (LastDate); the main query then joins the table back to itself on that LastDate.
WITH T1 AS
(
SELECT *, (SELECT MAX(SnapShotDay)
FROM T
WHERE SnapShotDay<=TMain.SnapShotDay
AND OPERATIONALUNITNUMBER IS NOT NULL)
as LastDate
FROM T as TMain
)
SELECT T1.SnapShotDay,
T.OperationalUnitNumber,
T.IsOpen,
T.StatusDate
FROM T1
JOIN T ON T1.LastDate=T.SnapShotDay
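The "latest earlier date with values" join above can be tried out with this minimal sketch, which runs against SQLite through Python's sqlite3 module. The table name dailysnapshot, the ISO date strings and storing the unit number as text are assumptions made for the example; the query shape follows the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dailysnapshot ("
    "SnapShotDay TEXT, OperationalUnitNumber TEXT, IsOpen INTEGER, StatusDate TEXT)"
)
conn.executemany(
    "INSERT INTO dailysnapshot VALUES (?, ?, ?, ?)",
    [
        ("2014-01-01", "001", 1, "2014-01-01"),
        ("2014-01-02", None, None, None),
        ("2014-01-03", "001", 0, "2014-01-03"),
        ("2014-01-04", None, None, None),
        ("2014-01-05", "001", 1, "2014-01-05"),
    ],
)

# For every day, look up the latest day on or before it that has a reading,
# and take the values from that source row.
query = """
SELECT cur.SnapShotDay,
       src.OperationalUnitNumber,
       src.IsOpen,
       src.StatusDate
FROM dailysnapshot cur
JOIN dailysnapshot src
  ON src.SnapShotDay = (SELECT MAX(p.SnapShotDay)
                        FROM dailysnapshot p
                        WHERE p.SnapShotDay <= cur.SnapShotDay
                          AND p.OperationalUnitNumber IS NOT NULL)
ORDER BY cur.SnapShotDay
"""
filled = conn.execute(query).fetchall()
for row in filled:
    print(row)
```

Note that an inner join drops any leading days that have no earlier reading at all; a LEFT JOIN would keep them with NULLs.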
SELECT
    t1.SnapShotDay,
    CASE WHEN t1.OperationalUnitNumber IS NOT NULL
         THEN t1.OperationalUnitNumber
         ELSE (SELECT TOP 1 t2.OperationalUnitNumber
               FROM YourTable t2
               WHERE t2.SnapShotDay < t1.SnapShotDay
                 AND t2.OperationalUnitNumber IS NOT NULL
               ORDER BY t2.SnapShotDay DESC)
    END AS OperationalUnitNumber,
    CASE WHEN t1.IsOpen IS NOT NULL
         THEN t1.IsOpen
         ELSE (SELECT TOP 1 t2.IsOpen
               FROM YourTable t2
               WHERE t2.SnapShotDay < t1.SnapShotDay
                 AND t2.IsOpen IS NOT NULL
               ORDER BY t2.SnapShotDay DESC)
    END AS IsOpen,
    CASE WHEN t1.StatusDate IS NOT NULL
         THEN t1.StatusDate
         ELSE (SELECT TOP 1 t2.StatusDate
               FROM YourTable t2
               WHERE t2.SnapShotDay < t1.SnapShotDay
                 AND t2.StatusDate IS NOT NULL
               ORDER BY t2.SnapShotDay DESC)
    END AS StatusDate
FROM YourTable t1
You asked for 'plain SQL'; here is a tested attempt, with comments, that gives the required answer.
I have tested the code using 'sqlite' and 'mysql' on Windows XP. It is pure SQL and should work on most engines.
SQL is about 'sets' and combining them and ordering the results.
This problem seems to be about two separate sets:
1) The 'snap shot day' that have readings.
2) the 'snap shot day' that don't have readings.
I have added extra columns so that we can easily see where values came from.
Let us deal with the easy set first:
This is the set of 'supplied' readings.
SELECT dss.SnapShotDay theDay,
'supplied' readingExists,
dss.OperationalUnitNumber,
dss.IsOpen,
dss.StatusDate
FROM dailysnapshot dss
WHERE dss.OperationalUnitNumber IS NOT NULL
results:
theDay readingExists OperationalUnitNumber IsOpen StatusDate
2014-01-01 supplied 001 1 2014-01-01
2014-01-03 supplied 001 0 2014-01-03
2014-01-05 supplied 001 1 2014-01-05
Now let us deal with the set of days that have missing readings. For each such day, we need to find the most recent earlier day that has readings and reuse its values.
It sounds complex but it isn't. It asks:
For each day without a reading, get me the closest earlier date that has readings, and I will use those readings.
Here is the query:
SELECT emptyDSS.SnapShotDay,
'missing' readingExists,
maxPrevDSS.OperationalUnitNumber,
maxPrevDSS.IsOpen,
maxPrevDSS.StatusDate
FROM dailysnapshot emptyDSS
INNER JOIN dailysnapshot maxPrevDSS ON maxPrevDSS.SnapShotDay =
(SELECT MAX(dss.SnapShotDay)
FROM dailysnapshot dss
WHERE dss.SnapShotDay < emptyDSS.SnapShotDay
AND dss.OperationalUnitNumber IS NOT NULL)
WHERE emptyDSS.OperationalUnitNumber IS NULL
results:
SnapShotDay readingExists OperationalUnitNumber IsOpen StatusDate
2014-01-02 missing 001 1 2014-01-01
2014-01-04 missing 001 0 2014-01-03
This is not about efficiency! It is about getting the correct 'result set' with the easiest to understand SQL code. I assume the database engine will optimize the query. The query can be 'tweaked' later if required.
We now need to combine the two queries and order the results in the manner we require.
The standard way of combining results from SQL queries is with set operators (union, intersection, minus).
We use UNION and an ORDER BY on the result set.
This gives the final query:
SELECT dss.SnapShotDay theDay,
'supplied' readingExists,
dss.OperationalUnitNumber,
dss.IsOpen,
dss.StatusDate
FROM dailysnapshot dss
WHERE dss.OperationalUnitNumber IS NOT NULL
UNION
SELECT emptyDSS.SnapShotDay theDay,
'missing' readingExists,
maxPrevDSS.OperationalUnitNumber,
maxPrevDSS.IsOpen,
maxPrevDSS.StatusDate
FROM dailysnapshot emptyDSS
INNER JOIN dailysnapshot maxPrevDSS ON maxPrevDSS.SnapShotDay =
(SELECT MAX(dss.SnapShotDay)
FROM dailysnapshot dss
WHERE dss.SnapShotDay < emptyDSS.SnapShotDay
AND dss.OperationalUnitNumber IS NOT NULL)
WHERE emptyDSS.OperationalUnitNumber IS NULL
ORDER BY theDay ASC
result:
theDay readingExists dss.OperationalUnitNumber dss.IsOpen dss.StatusDate
2014-01-01 supplied 001 1 2014-01-01
2014-01-02 missing 001 1 2014-01-01
2014-01-03 supplied 001 0 2014-01-03
2014-01-04 missing 001 0 2014-01-03
2014-01-05 supplied 001 1 2014-01-05
I enjoyed doing this.
It should work with most SQL engines.
I'm using SQL Server Management Studio 2012 and would like to create a pivot/cross tab query for a table with over 2300 rows.
The table has 5 columns:
- name
- group
- status
- date
- count
There are about 580 distinct names.
Each name is associated to 4 different groups (A, B, C, and D).
Each group has a complete status of yes or no.
A date is associated to each status when completed. Otherwise, status is NULL.
The count column is only applicable to group B and D and is an integer value.
SAMPLE A:
name group status date count
A.A.1 A yes 5/23 NULL
A.A.1 B yes 5/27 112
A.A.1 C yes 6/4 NULL
A.A.1 D yes 6/15 122
A.B.2 A yes 5/25 NULL
A.B.2 B yes 6/1 119
A.B.2 C no NULL NULL
A.B.2 D no NULL NULL
I am trying to display the status of each name as the field values across 11 columns :
- name
- group A
- group A date
- group B
- group B date
- group B count
- group C
- group C date
- group D
- group D date
- group D count
The 'name' column would have the 580 distinct names with their corresponding group data across A, B, C, and D.
SAMPLE B:
nm grp_A A_day grp_B B_day B_ct grp_C C_day grp_D D_day D_ct
A.A.1 yes 5/23 yes 5/27 112 yes 6/4 yes 6/15 122
A.B.2 yes 5/25 yes 6/1 119 no NULL no NULL NULL
(column names have been changed to fit into this question section's format)
Ultimately, the result should have all 580 distinct names in the first column and its corresponding status for each group, the date of completion (or NULL if it has not been completed yet), and the count for groups B and D.
I've tried using a CASE statement, but it generates the names once for each group, resulting in the original table being spaced out across the 11 columns.
SAMPLE C:
nm grp_A A_day grp_B B_day B_ct grp_C C_day grp_D D_day D_ct
A.A.1 yes 5/23
A.A.1 yes 5/27 112
A.A.1 yes 6/4
A.A.1 yes 6/15 122
A.B.2 yes 5/25
A.B.2 yes 6/1 119
A.B.2 no NULL
A.B.2 no NULL NULL
What am I doing wrong? Please help!
-- K-moj
Without seeing your query I am guessing but if you are trying to PIVOT the data with a CASE expression my suggestion would be to add an aggregate function around the CASE.
select
name,
max(case when [group] = 'A' then status end) grp_A,
max(case when [group] = 'A' then date end) A_day,
max(case when [group] = 'A' then [count] end) A_ct,
max(case when [group] = 'B' then status end) grp_B,
max(case when [group] = 'B' then date end) B_day,
max(case when [group] = 'B' then [count] end) B_ct,
max(case when [group] = 'C' then status end) grp_C,
max(case when [group] = 'C' then date end) C_day,
max(case when [group] = 'C' then [count] end) C_ct,
max(case when [group] = 'D' then status end) grp_D,
max(case when [group] = 'D' then date end) D_day,
max(case when [group] = 'D' then [count] end) D_ct
from yourtable
group by name
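To see why the aggregate matters, here is a small sketch against SQLite through Python's sqlite3 module that runs both versions: a bare CASE per column (which reproduces the spaced-out SAMPLE C) and the same CASE wrapped in MAX with GROUP BY name (which collapses to one row per name). The table name progress and the column names grp, day and ct are stand-ins chosen to avoid reserved words; they are not from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table with the rows from SAMPLE A.
conn.execute("CREATE TABLE progress (name TEXT, grp TEXT, status TEXT, day TEXT, ct INTEGER)")
conn.executemany("INSERT INTO progress VALUES (?, ?, ?, ?, ?)", [
    ("A.A.1", "A", "yes", "5/23", None),
    ("A.A.1", "B", "yes", "5/27", 112),
    ("A.A.1", "C", "yes", "6/4", None),
    ("A.A.1", "D", "yes", "6/15", 122),
    ("A.B.2", "A", "yes", "5/25", None),
    ("A.B.2", "B", "yes", "6/1", 119),
    ("A.B.2", "C", "no", None, None),
    ("A.B.2", "D", "no", None, None),
])

# Without an aggregate, every source row stays a row: 8 sparse rows.
sparse = conn.execute("""
    SELECT name,
           CASE WHEN grp = 'A' THEN status END AS grp_A,
           CASE WHEN grp = 'B' THEN status END AS grp_B
    FROM progress
""").fetchall()

# MAX ignores the NULLs produced by non-matching rows, so GROUP BY name
# leaves a single row per name holding the one non-null value per column.
collapsed = conn.execute("""
    SELECT name,
           MAX(CASE WHEN grp = 'A' THEN status END) AS grp_A,
           MAX(CASE WHEN grp = 'A' THEN day END)    AS A_day,
           MAX(CASE WHEN grp = 'B' THEN status END) AS grp_B,
           MAX(CASE WHEN grp = 'B' THEN day END)    AS B_day,
           MAX(CASE WHEN grp = 'B' THEN ct END)     AS B_ct,
           MAX(CASE WHEN grp = 'C' THEN status END) AS grp_C,
           MAX(CASE WHEN grp = 'C' THEN day END)    AS C_day,
           MAX(CASE WHEN grp = 'D' THEN status END) AS grp_D,
           MAX(CASE WHEN grp = 'D' THEN day END)    AS D_day,
           MAX(CASE WHEN grp = 'D' THEN ct END)     AS D_ct
    FROM progress
    GROUP BY name
    ORDER BY name
""").fetchall()

print(len(sparse), "rows without aggregation")
for row in collapsed:
    print(row)
```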
If you want to use the PIVOT function, you will first need to unpivot the status, date and count columns, then pivot them in the final result.
An UNPIVOT is when you convert multiple columns of data into multiple rows. You can unpivot the status, date and count columns using a variety of methods. Since you are using SQL Server 2012 you can use CROSS APPLY with a VALUES clause. The code to convert the columns into rows will be:
select name,
col = col+'_'+[group],
value
from yourtable
cross apply
(
values
('grp', status),
('day', [date]),
('ct', cast([count] as varchar(10)))
) c(col, value)
This gives a result:
| NAME | COL | VALUE |
| A.A.1 | grp_A | yes |
| A.A.1 | day_A | 5/23 |
| A.A.1 | ct_A | (null) |
| A.A.1 | grp_B | yes |
| A.A.1 | day_B | 5/27 |
| A.A.1 | ct_B | 112 |
Instead of having multiple columns that you want to pivot, you now have all values to be turned into new columns in value and the new column names in col. You can then apply the PIVOT function so the full code will be similar to the following:
select name,
grp_A, day_A, ct_A,
grp_B, day_B, ct_B,
grp_C, day_C, ct_C,
grp_D, day_D, ct_D
from
(
select name,
col = col+'_'+[group],
value
from yourtable
cross apply
(
values
('grp', status),
('day', [date]),
('ct', cast([count] as varchar(10)))
) c(col, value)
) d
pivot
(
max(value)
for col in (grp_A, day_A, ct_A,
grp_B, day_B, ct_B,
grp_C, day_C, ct_C,
grp_D, day_D, ct_D)
) piv
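The unpivot-then-pivot flow can also be followed step by step in an engine that has neither CROSS APPLY nor PIVOT. The sketch below uses SQLite through Python's sqlite3 module and swaps in other techniques for both steps: UNION ALL for the unpivot and MAX(CASE ...) for the pivot. The table name progress, the column names grp, day and ct, and the reduced two-row data set are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table with just groups A and B for one name.
conn.execute("CREATE TABLE progress (name TEXT, grp TEXT, status TEXT, day TEXT, ct INTEGER)")
conn.executemany("INSERT INTO progress VALUES (?, ?, ?, ?, ?)", [
    ("A.A.1", "A", "yes", "5/23", None),
    ("A.A.1", "B", "yes", "5/27", 112),
])

# Step 1 (unpivot): one row per (name, target column, value).
# All values share one column, so the count is cast to text.
UNPIVOT_SQL = """
    SELECT name, 'grp_' || grp AS col, status AS value FROM progress
    UNION ALL
    SELECT name, 'day_' || grp, day FROM progress
    UNION ALL
    SELECT name, 'ct_' || grp, CAST(ct AS TEXT) FROM progress
"""
unpivoted = conn.execute(UNPIVOT_SQL).fetchall()

# Step 2 (pivot): turn each distinct col value back into a column.
wanted = ["grp_A", "day_A", "ct_A", "grp_B", "day_B", "ct_B"]
pivot_cols = ", ".join(
    f"MAX(CASE WHEN col = '{c}' THEN value END) AS {c}" for c in wanted
)
pivoted = conn.execute(
    f"WITH unpivoted AS ({UNPIVOT_SQL}) "
    f"SELECT name, {pivot_cols} FROM unpivoted GROUP BY name"
).fetchall()

for row in unpivoted:
    print(row)
print(pivoted)
```

The cast in step 1 mirrors the cast([count] as varchar(10)) in the T-SQL version: every unpivoted value has to fit a single data type.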
I'm working on an SSIS package to fix some data from a table. The table looks something like this:
CustID FieldID INT_VAL DEC_VAL VARCHAR_VAL DATE_VAL
1 1 23
1 2 500.0
1 3 David
1 4 4/1/05
1 5 52369871
2 1 25
2 2 896.23
2 3 Allan
2 4 9/20/03
2 5 52369872
I want to transform it into this:
CustID FirstName AccountNumber Age JoinDate Balance
1 David 52369871 23 4/1/05 500.0
2 Allan 52369872 25 9/20/03 896.23
Currently, my SSIS package pulls in the data from the source table, does a conditional split on the field id, then generates a derived column on each split. The part I'm stuck on is joining the data back together on CustID.
However, the Merge Join only lets you join 2 datasets, and in the end I will need to join about 30. Is there a good way to do that without a long chain of merge joins?
That seems a bit awkward, why not just do it in a query?
select
CustID,
max(case when FieldID = 3 then VARCHAR_VAL else null end) as 'FirstName',
max(case when FieldID = 5 then INT_VAL else null end) as 'AccountNumber',
max(case when FieldID = 1 then INT_VAL else null end) as 'Age',
max(case when FieldID = 4 then DATE_VAL else null end) as 'JoinDate',
max(case when FieldID = 2 then DEC_VAL else null end) as 'Balance'
from
dbo.StagingTable
group by
CustID
If your source system is MSSQL, then you can use that query from SSIS or even create a view in the source database (if you're allowed to). If not, then copy the data directly to a staging table in MSSQL and query it from there.