PIVOT two columns and keep others as they are - sql-server

I want to turn some of the rows into columns while keeping other rows as they are.
ID  name   value    RefId
1   Fname  John     32145
2   LName  Smith    32145
3   Fname  Peter    34589
4   LName  Mahlang  34589
Now what I want to achieve is to turn the Fname and Lname rows into columns with their matching value field. ID column doesn't really matter, I don't need it.
Desired Output
Fname  Lname    RefId
John   Smith    32145
Peter  Mahlang  34589
Any help would be appreciated.

Using conditional aggregation:
select
Fname = max(case when name = 'Fname' then value end)
, Lname = max(case when name = 'Lname' then value end)
, RefId
from t
group by RefId
rextester demo: http://rextester.com/MRMY11592
returns:
+-------+---------+-------+
| Fname | Lname   | RefId |
+-------+---------+-------+
| John  | Smith   | 32145 |
| Peter | Mahlang | 34589 |
+-------+---------+-------+
Or using pivot()
select
Fname
, Lname
, RefId
from (select name, value, refid from t) s
pivot(max(value) for name in ([Fname],[Lname]))p
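To try the conditional aggregation without an existing table, a minimal setup could look like the sketch below. The table variable @t is a hypothetical stand-in for t, filled with the rows from the question. Note that the question stores 'LName' while the queries above compare against 'Lname'; that match depends on a case-insensitive collation, so this sketch uses the stored spelling in the comparison.

```sql
-- @t is a hypothetical stand-in for the table t used in the answer.
DECLARE @t TABLE (ID int, name varchar(20), value varchar(50), RefId int);

INSERT INTO @t (ID, name, value, RefId) VALUES
    (1, 'Fname', 'John',    32145),
    (2, 'LName', 'Smith',   32145),
    (3, 'Fname', 'Peter',   34589),
    (4, 'LName', 'Mahlang', 34589);

-- The CASE literals match the stored spelling exactly ('LName'), so the
-- query does not rely on a case-insensitive collation.
SELECT
    MAX(CASE WHEN name = 'Fname' THEN value END) AS Fname,
    MAX(CASE WHEN name = 'LName' THEN value END) AS Lname,
    RefId
FROM @t
GROUP BY RefId
ORDER BY RefId;
```

Each RefId collapses to one row because MAX() ignores the NULLs that the CASE expressions produce for non-matching rows.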

Related

SQL Server: Creating transposed table and joining with existing table

I have a set of data from table [MSPWIP].[MSPWIP].[Event] that looks like this:
| Createdby | StationName | SerialNumber |
------------------------------------------
| Jay       | L1.A1       | 22191321572  |
| Allan     | L1.A2       | 22191321572  |
| Nathan    | L2.A1       | 22191321579  |
| Jane      | L2.A2       | 22191321579  |
I also have other data sets that I have already joined in another query, which is not relevant to this problem.
I want to create a table that separates the operators (denoted by Createdby) by station, where L1.A1 means Line 1 Station 1, for example. For now, the line is not relevant.
My ideal data after I restructure it should look like this:
| SerialNumber | Operator1 | Operator2 |
----------------------------------------
| 22191321572  | Jay       | Allan     |
| 22191321579  | Nathan    | Jane      |
I tried using this code to Join both tables:
Query#1
Declare @Operator1 Table(
SerialNumber Varchar(255),
Operator1 Varchar(255)
)
Insert Into @Operator1 (Serialnumber, Operator1)
Select
SerialNumber,
Createdby as Operator1
From [MSPWIP].[MSPWIP].[Event]
where StationName like '%01'
Declare @Operator2 Table(
SerialNumber Varchar(255),
Operator2 Varchar(255)
)
Insert Into @Operator2 (Serialnumber, Operator2)
Select
SerialNumber,
CreatedBy as Operator2
From [MSPWIP].[MSPWIP].[Event]
where StationName like '%02'
select
a.SerialNumber,
CreatedBy,
b.Operator2
From @Operator1 a
join @Operator2 b
On a.SerialNumber = b.SerialNumber
Where a.SerialNumber In ('22191321572', '22191321574')
Then I would like to join it with that other query using the code below:
Query#2
join @Operator1 i
on a.SerialNumber = i.SerialNumber
join @Operator2 j
on a.SerialNumber = j.SerialNumber
Note that a is a different table.
However with Query#1 it only managed to show the headings and not the data, and this also caused Query#2 to also display heading and nothing else.
Just wondering if there was something wrong with Query#1 where the data failed to be inserted into the columns?
============================================
Update:
Using the answer below (with Modifications) I came up with a code like this
Query#3
SELECT Distinct*
FROM (
SELECT distinct
SerialNumber,
Case When t.StationName like '%A1' then CreatedBy End Operator1,
Case When t.StationName like '%A2' then CreatedBy End Operator2
--, Max(CASE WHEN CAST(RIGHT(t.StationName, 1) AS Varchar(255)) = 1 THEN t.CreatedBy END) Operator1
--, Max(CASE WHEN CAST(RIGHT(t.StationName, 1) AS Varchar(255)) = 2 THEN t.CreatedBy END) Operator2
FROM [MSPWIP].[MSPWIP].[Event] t
where t.CreatedDate > '2019-05-30'
Group BY SerialNumber, StationName, Createdby
) d
However my results now became staggered like so:
| SerialNumber | Operator1 | Operator2 |
----------------------------------------
| 22191321572 | Jay | NULL |
| 22191321572 | NULL | Allan |
| 22191321579 | Nathan | NULL |
| 22191321579 | NULL | Jane |
Did I do something wrong here?
You can save time by doing it in a single query like this:
SELECT *
FROM (
SELECT
SerialNumber
, MAX(CASE WHEN RIGHT(t.StationName, 2) = 'A1' THEN t.CreatedBy END) Operator1
, MAX(CASE WHEN RIGHT(t.StationName, 2) = 'A2' THEN t.CreatedBy END) Operator2
FROM [MSPWIP].[MSPWIP].[Event] t
GROUP BY SerialNumber
) d
Then you just join it with the required tables.
P.S.: If the station part of StationName is not always the last two characters, you can use SUBSTRING(t.StationName, CHARINDEX('.', t.StationName) + 1, LEN(t.StationName)) instead of RIGHT(t.StationName, 2) to get the station part (everything after the dot).
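The staggered rows in Query#3 most likely come from its GROUP BY: grouping by StationName and Createdby as well as SerialNumber keeps one row per station, so each row can fill only one of the two operator columns. A minimal sketch of the single-row-per-serial version follows; the table variable @Event is a hypothetical stand-in for [MSPWIP].[MSPWIP].[Event] and holds only the sample rows from the question.

```sql
-- @Event is a hypothetical stand-in for [MSPWIP].[MSPWIP].[Event].
DECLARE @Event TABLE (
    CreatedBy    varchar(255),
    StationName  varchar(255),
    SerialNumber varchar(255)
);

INSERT INTO @Event (CreatedBy, StationName, SerialNumber) VALUES
    ('Jay',    'L1.A1', '22191321572'),
    ('Allan',  'L1.A2', '22191321572'),
    ('Nathan', 'L2.A1', '22191321579'),
    ('Jane',   'L2.A2', '22191321579');

-- Group by SerialNumber only. MAX() skips the NULLs produced by the CASE
-- expressions, so both operators end up on the same row.
SELECT
    t.SerialNumber,
    MAX(CASE WHEN t.StationName LIKE '%A1' THEN t.CreatedBy END) AS Operator1,
    MAX(CASE WHEN t.StationName LIKE '%A2' THEN t.CreatedBy END) AS Operator2
FROM @Event AS t
GROUP BY t.SerialNumber
ORDER BY t.SerialNumber;
```

With the sample rows this returns Jay and Allan for 22191321572, and Nathan and Jane for 22191321579. If a serial number can pass through the same station more than once, MAX() picks only one of the operators, so that case would need a different rule.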

Building data sets from multiple multivalue records using applys with missing values

I have a SQL Server 2012 database which was imported from a multi-value environment which causes me more headaches than I care to count, however it is what it is and I have to work with it.
I am trying to build a data set using these multi-value records but have hit a stumbling block. This is my scenario:
I have a custom split string TVF that splits a string of "Test,String" into
Rowno | Item
------+---------
1 | Test
2 | String
I have the following data:
Clients Table
Ref | Names       | Surname    | DOB       | IdNo
----+-------------+------------+-----------+------
123 | John,Sally  | Smith      | DOB1,DOB2 | 45,56
456 | Dave,Paul   | Jones,Dann | DOB1,DOB2 | 98
789 | Mary,Moe,Al | Lee        | DOB1      | NULL
What I need to create is a data set that looks like this:
Ref | Names | Surname | DOB  | IdNo
----+-------+---------+------+-----
123 | John  | Smith   | DOB1 | 45
123 | Sally | Smith   | DOB2 | 56
456 | Dave  | Jones   | DOB1 | 98
456 | Paul  | Dann    | DOB2 |
789 | Mary  | Lee     | DOB1 |
789 | Moe   | Lee     |      |
789 | Al    | Lee     |      |
In the past, to solve similar issues, I would tackle this using a query like this:
SELECT
Ref
, SplitForenames.ITEM names
, SplitSurname.ITEM Surname
, SplitDOB.ITEM dob
, SplitNI.ITEM ID
FROM
Clients
CROSS APPLY
dbo.udf_SplitString(Names, ',') SplitForenames
OUTER APPLY
dbo.udf_SplitString(Surname, ',') SplitSurname
OUTER APPLY
dbo.udf_SplitString(DOB, ',') SplitDOB
OUTER APPLY
dbo.udf_SplitString(ID, ',') SplitNI
WHERE
SplitSurname.RowNo = SplitForenames.RowNo
AND SplitDOB.RowNo = SplitForenames.RowNo
AND SplitNI.RowNo = SplitForenames.RowNo
ORDER BY
REF;
However, because the number of surnames can differ from the number of forenames, and DOB and ID values can be missing, I cannot match them this way.
I need to match where there is a match; otherwise DOB and ID should be blank and the first instance of the surname should be used. I am stuck as to how to achieve this.
Does anyone have suggestions as to how I can create my required data set from the original source?
Thanks in advance.
I can't tell what the conditions are for the DOB column to be split or not.
However, with the split function SplitF below:
CREATE FUNCTION SplitF(@str AS NVARCHAR(max))
RETURNS @People TABLE
(Rowno INT,Item NVARCHAR(10))
AS
BEGIN
DECLARE @i INT, @pos INT
DECLARE @subname NVARCHAR(max)
SET @i = 0;
WHILE(LEN(@str)>0)
BEGIN
SET @pos = CHARINDEX(',',@str)
IF @pos = 0 SET @pos = LEN(@str)+1
SET @subname = SUBSTRING(@str,1,@pos-1)
SET @str = SUBSTRING(@str, @pos+1, len(@str))
SET @i = @i + 1
INSERT INTO @People VALUES (@i, @subname)
END
RETURN
END
GO
select * from SplitF('test,my,function')
Rowno       Item
----------- ----------
1           test
2           my
3           function
and basic data:
select Ref, Names, Surname, DOB, IdNo into #clients
from ( select 123 as Ref, 'John,Sally' as Names, 'Smith' as Surname,
'DOB1,DOB2' as DOB, '45,56' as IdNo
union all select 456, 'Dave,Paul','Jones,Dann','DOB1,DOB2', '98'
union all select 789, 'Mary,Moe,Al', 'Lee', 'DOB1', NULL) A
select * from #clients
Ref         Names       Surname    DOB       IdNo
----------- ----------- ---------- --------- -----
123         John,Sally  Smith      DOB1,DOB2 45,56
456         Dave,Paul   Jones,Dann DOB1,DOB2 98
789         Mary,Moe,Al Lee        DOB1      NULL
Using the code below you will get these results:
select
Ref,
RTrim(S_NAM.Item) as Names,
coalesce(S_SURNAM.Item,S_SURNAM_LAST.Item) AS Surname,
coalesce(split_dob.Item, '') as DOB,
coalesce(split_IdNo.Item,'') as IdNo
from
#clients MAIN
outer apply(select Rowno, Item from SplitF(MAIN.Names)) as S_NAM
outer apply(select top 1 Item from SplitF(MAIN.Surname) where Rowno = S_NAM.Rowno) as S_SURNAM
outer apply(select top 1 Item from SplitF(MAIN.Surname) order by Rowno desc) as S_SURNAM_LAST
outer apply(select top 1 Item from SplitF(MAIN.IdNo) where Rowno = S_NAM.Rowno) as split_IdNo
outer apply(select top 1 Item from SplitF(MAIN.DOB) where Rowno = S_NAM.Rowno) as split_dob
order by MAIN.Ref, S_NAM.Rowno
Ref         Names      Surname    DOB        IdNo
----------- ---------- ---------- ---------- ----------
123         John       Smith      DOB1       45
123         Sally      Smith      DOB2       56
456         Dave       Jones      DOB1       98
456         Paul       Dann       DOB2
789         Mary       Lee        DOB1
789         Moe        Lee
789         Al         Lee
I think you can handle this using subqueries and doing the RowNo comparison before the OUTER APPLY:
FROM Clients c CROSS APPLY
dbo.udf_SplitString(Names, ',') SplitForenames OUTER APPLY
(SELECT . . .
FROM dbo.udf_SplitString(Surname, ',') SplitSurname
WHERE SplitSurname.RowNo = SplitForenames.RowNo
) SplitSurname OUTER APPLY
(SELECT . . .
FROM dbo.udf_SplitString(DOB, ',') SplitDOB
WHERE SplitDOB.RowNo = SplitForenames.RowNo
) SplitDOB OUTER APPLY
(SELECT . . .
FROM dbo.udf_SplitString(IdNo, ',') SplitNI
WHERE SplitNI.RowNo = SplitForenames.RowNo
) SplitNI

ORDER BY more than one column in t-sql

I have a classroom table with 3 columns: Name, Class and Age. When I run
Select * from students
it shows these values:
Name | Class | Age
John | D | 7
Mary | A | 10
Jenny | B | 9
Peter | D | 7
I want to sort the values with these conditions
- First, Order by Age DESC
- If there are more 2 people have same age, Order by Name ASC
I use these command
Select * from students order by Age Desc, Name ASC .
but it doesn't sort Class too. Is there anyone can help me?
To sort by class first, then age, then name:
Select Name, Class, Age
FROM students
ORDER BY class ASC, Age DESC, Name ASC;
Should output:
Name | Class | Age
Mary | A | 10
Jenny | B | 9
John | D | 7
Peter | D | 7
To sort by age first, then class, then name:
Select Name, Class, Age
FROM students
ORDER BY Age DESC, class ASC, Name ASC;
Should output the same, because the data provided happens to sort the same way using these alternate criteria.
Name | Class | Age
Mary | A | 10
Jenny | B | 9
John | D | 7
Peter | D | 7
Try this query.
Use ASC on the first column and DESC on the second.
Select * from students order by Name ASC, Age Desc
declare @tab table(Name varchar(30), Class char(1), Age int )
insert into @tab
select 'John' , 'D' , 7
union all
select 'Mary' , 'A' , 10
union all
select 'Jenny' , 'B' , 9
union all
select 'Peter' , 'D' , 7
select * from @tab order by Age desc,name asc
Name Class Age
Mary A 10
Jenny B 9
John D 7
Peter D 7

SQL Server split SELECT XML column as arbitrary individual columns

In my application, I have a few predefined fields for an object, and users can define custom fields. I am using the XML data type to store the custom fields in a name/value format.
E.g. I have an Employees table that has FN, LN, Email as predefined columns and CustomFields as an XML column to hold the user-defined fields.
Different rows can contain different custom fields.
E.g. Row 1 -> John, Smith, jsmith@example.com,
<root>
<phone>123-123-1234</phone>
<country>USA</country>
</root>
and then Row 2 -> Smith, John, sjohn@example.com,
<root>
<age>50</age>
<sex>Male</sex>
</root>
And there can be any number of such custom fields defined for different employee records. The format will always be the same
<root><field>value</field></root>
How can I return Phone and Country as columns while selecting Row1 and return Age and Sex as columns while selecting Row2?
Take this temp table for all examples
CREATE TABLE #tbl (ID INT IDENTITY, FirstName VARCHAR(100),LastName VARCHAR(100),eMail VARCHAR(100),CustomFields XML);
INSERT INTO #tbl VALUES
('John','Smith','john.smith@test.com'
,'<root>
<phone>123-123-1234</phone>
<country>USA</country>
</root>')
, ('Jane','Miller','jane.miller@test.com'
,'<root>
<age>50</age>
<sex>Male</sex>
</root>');
Option 1
This assumes that there is a fixed, known set of custom fields.
It allows type-safe reading (age as INT).
All possible columns are returned; unused ones are NULL.
Try this code:
SELECT tbl.ID
,tbl.FirstName
,tbl.LastName
,tbl.eMail
,tbl.CustomFields.value('(/root/phone)[1]','nvarchar(max)') AS phone
,tbl.CustomFields.value('(/root/country)[1]','nvarchar(max)') AS country
,tbl.CustomFields.value('(/root/age)[1]','int') AS age
,tbl.CustomFields.value('(/root/sex)[1]','nvarchar(max)') AS sex
FROM #tbl AS tbl
This is the result
+----+-----------+----------+----------------------+--------------+---------+------+------+
| ID | FirstName | LastName | eMail                | phone        | country | age  | sex  |
+----+-----------+----------+----------------------+--------------+---------+------+------+
| 1  | John      | Smith    | john.smith@test.com  | 123-123-1234 | USA     | NULL | NULL |
+----+-----------+----------+----------------------+--------------+---------+------+------+
| 2  | Jane      | Miller   | jane.miller@test.com | NULL         | NULL    | 50   | Male |
+----+-----------+----------+----------------------+--------------+---------+------+------+
Option 2
If you do not know the field names in advance, you cannot name the output columns directly.
But you can use generic names, read the data row-wise and do a PIVOT.
Try this:
SELECT p.*
FROM
(
SELECT tbl.FirstName
,tbl.LastName
,tbl.eMail
,N'Col_' + CAST(ROW_NUMBER() OVER(PARTITION BY tbl.ID ORDER BY (SELECT NULL)) AS NVARCHAR(max)) AS ColumnName
,A.cf.value('local-name(.)','nvarchar(max)') + ':' + A.cf.value('.','nvarchar(max)') AS cf
FROM #tbl AS tbl
CROSS APPLY tbl.CustomFields.nodes('/root/*') AS A(cf)
) AS x
PIVOT
(
MAX(cf) FOR ColumnName IN(Col_1,Col_2,Col_3,Col_4 /*add as many as you need*/)
) AS p
This is the result
+-----------+----------+----------------------+--------------------+-------------+-------+-------+
| FirstName | LastName | eMail                | Col_1              | Col_2       | Col_3 | Col_4 |
+-----------+----------+----------------------+--------------------+-------------+-------+-------+
| Jane      | Miller   | jane.miller@test.com | age:50             | sex:Male    | NULL  | NULL  |
+-----------+----------+----------------------+--------------------+-------------+-------+-------+
| John      | Smith    | john.smith@test.com  | phone:123-123-1234 | country:USA | NULL  | NULL  |
+-----------+----------+----------------------+--------------------+-------------+-------+-------+
Option 3
This assumes you do not know the columns, but you need the columns correctly named.
Attention: be aware that such an approach will never be allowed in ad-hoc SQL such as a VIEW or an inline TVF, which might be a major drawback...
This needs dynamic creation of a statement. I will create the statement of Option 1 but replace the fixed list with a dynamically created list:
DECLARE @DynamicColumns NVARCHAR(MAX)=
(
SELECT ',tbl.CustomFields.value(''(/root/' + A.cf.value('local-name(.)','nvarchar(max)') + ')[1]'',''nvarchar(max)'') AS ' + A.cf.value('local-name(.)','nvarchar(max)')
FROM #tbl AS tbl
CROSS APPLY tbl.CustomFields.nodes('/root/*') AS A(cf)
FOR XML PATH('')
);
DECLARE @DynamicSQL NVARCHAR(MAX)=
' SELECT tbl.ID
,tbl.FirstName
,tbl.LastName
,tbl.eMail'
+ @DynamicColumns +
' FROM #tbl AS tbl;'
EXEC(@DynamicSQL);
The result would be the same as in Option 1, but with a completely dynamic approach.
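One thing to watch with a column list built this way: it emits one fragment per XML node across all rows, so a field that appears in several rows is listed once per occurrence. A possible variation is sketched below; it collects each field name once and wraps the alias in QUOTENAME so that names containing characters such as a hyphen still form a valid column alias. The scratch table #tbl_demo and the variable names are hypothetical, chosen for this sketch only.

```sql
-- Hypothetical scratch table; both rows share the <phone> field on purpose.
CREATE TABLE #tbl_demo (ID INT IDENTITY, FirstName VARCHAR(100), CustomFields XML);
INSERT INTO #tbl_demo (FirstName, CustomFields) VALUES
 ('John', '<root><phone>123-123-1234</phone><country>USA</country></root>')
,('Jane', '<root><phone>555-0100</phone><age>50</age></root>');

-- Collect each distinct field name once, then build one .value() call per name.
DECLARE @Cols NVARCHAR(MAX) =
(
    SELECT ',tbl.CustomFields.value(''(/root/' + f.FieldName
         + ')[1]'',''nvarchar(max)'') AS ' + QUOTENAME(f.FieldName)
    FROM
    (
        SELECT DISTINCT A.cf.value('local-name(.)','nvarchar(128)') AS FieldName
        FROM #tbl_demo AS tbl
        CROSS APPLY tbl.CustomFields.nodes('/root/*') AS A(cf)
    ) AS f
    ORDER BY f.FieldName
    FOR XML PATH('')
);

-- Same shape as the Option 1 statement, with the generated column list spliced in.
DECLARE @Sql NVARCHAR(MAX) =
    N'SELECT tbl.ID, tbl.FirstName' + @Cols
  + N' FROM #tbl_demo AS tbl ORDER BY tbl.ID;';

EXEC sys.sp_executesql @Sql;

DROP TABLE #tbl_demo;
```

With these two rows the generated statement has a single phone column alongside age and country, and each row shows NULL for the fields it does not define. All values are still read as nvarchar(max), as in the dynamic statement above.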
Cleanup
DROP TABLE #tbl;

SQL "Where In" for empty subquery

I have the following query, where the intention is to show each record with the time until the next record
Data:
gid time name
1010883478 29/03/2016 0:00:02 John
1010883527 29/03/2016 0:00:04 John
1010883578 29/03/2016 0:00:06 John
SQL:
SELECT A.[gid]
,A.[time]
,A.[name]
,(B.[time] - A.[time]) as timeTilNext
FROM [location] A CROSS JOIN [location] B
WHERE B.[gid] IN (
SELECT MIN(C.[gid])
FROM [location] C
WHERE C.[gid] > A.[gid] AND C.[name] = A.[name] )
ORDER BY A.[gid]
Current Output:
gid time name timeTilNext
1010883478 2016-03-29 00:00:02.000 John 1900-01-01 00:00:02.000
1010883527 2016-03-29 00:00:04.000 John 1900-01-01 00:00:02.000
Expected Output:
gid time name timeTilNext
1010883478 2016-03-29 00:00:02.000 John 1900-01-01 00:00:02.000
1010883527 2016-03-29 00:00:04.000 John 1900-01-01 00:00:02.000
1010883578 2016-03-29 00:00:06.000 John -1 (or whatever)
However, it does not show a record for the highest [gid] for a given [name] (only the second highest).
I'm hoping for the highest [gid] to show -1 for timeTilNext, to indicate that there is no more events.
Any ideas about how to modify my query?
In SQL Server 2012 and later you can use the LEAD window function to get the value of the "next" row.
DECLARE @location TABLE ([gid] int, [time] datetime, [name] varchar(50));
INSERT INTO @location ([gid], [time], [name]) VALUES
(1010883478, '2016-03-29 00:00:02', 'John'),
(1010883527, '2016-03-29 00:00:04', 'John'),
(1010883578, '2016-03-29 00:00:06', 'John');
SELECT
A.[gid]
,A.[time]
,A.[name]
,LEAD(A.[time]) OVER(PARTITION BY A.[name] ORDER BY A.[gid]) AS NextTime
,ISNULL(DATEDIFF(second, A.[time],
LEAD(A.[time]) OVER(PARTITION BY A.[name] ORDER BY A.[gid])), -1) AS SecondsTillNext
FROM @location A
ORDER BY A.[gid];
Result
+------------+-------------------------+------+-------------------------+-----------------+
| gid        | time                    | name | NextTime                | SecondsTillNext |
+------------+-------------------------+------+-------------------------+-----------------+
| 1010883478 | 2016-03-29 00:00:02.000 | John | 2016-03-29 00:00:04.000 | 2               |
| 1010883527 | 2016-03-29 00:00:04.000 | John | 2016-03-29 00:00:06.000 | 2               |
| 1010883578 | 2016-03-29 00:00:06.000 | John | NULL                    | -1              |
+------------+-------------------------+------+-------------------------+-----------------+
If the "next" row is not available, LEAD would return NULL. You can use ISNULL() to replace it with some non-null value if you want.
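For versions that predate LEAD, the same idea can be sketched with OUTER APPLY, which keeps the outer row even when the inner query finds nothing. That is also where the query in the question loses the last row: MIN over an empty set yields NULL, and B.[gid] IN (NULL) never matches. The table variable @location below is a stand-in holding the sample rows, and the alias N is a name picked for this sketch.

```sql
-- @location is a stand-in for the [location] table, with the sample rows.
DECLARE @location TABLE ([gid] int, [time] datetime, [name] varchar(50));
INSERT INTO @location ([gid], [time], [name]) VALUES
    (1010883478, '2016-03-29 00:00:02', 'John'),
    (1010883527, '2016-03-29 00:00:04', 'John'),
    (1010883578, '2016-03-29 00:00:06', 'John');

SELECT
    A.[gid],
    A.[time],
    A.[name],
    -- N.[time] is NULL when there is no later row, so DATEDIFF returns NULL
    -- and ISNULL turns that into -1.
    ISNULL(DATEDIFF(second, A.[time], N.[time]), -1) AS SecondsTillNext
FROM @location AS A
OUTER APPLY
(
    -- The next row for the same name: smallest gid greater than the current one.
    SELECT TOP (1) B.[time]
    FROM @location AS B
    WHERE B.[gid] > A.[gid]
      AND B.[name] = A.[name]
    ORDER BY B.[gid]
) AS N
ORDER BY A.[gid];
```

For the three sample rows this gives 2, 2 and -1 in SecondsTillNext, matching the LEAD result above.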
select
*, -1 as [time until next] from location t1
where time = (select max(time) from location t2 where t1.name = t2.name)
SELECT A.gid,A.name,A.time,
(
(SELECT MIN(B.time) FROM [location] B WHERE B.time>A.time AND B.name=A.name)
-
A.time
) as timeTilNext
FROM [location] A
