SQL Server - Transpose rows into columns - sql-server

I've searched high and low for an answer to this so apologies if it's already answered!
I have the following result from a query in SQL 2005:
ID
1234
1235
1236
1267
1278
What I want is
column1|column2|column3|column4|column5
---------------------------------------
1234 |1235 |1236 |1267 |1278
I can't quite get my head around the pivot operator but this looks like it's going to be involved. I can work with there being only 5 rows for now but a bonus would be for it to be dynamic, i.e. can scale to x rows.
EDIT:
What I'm ultimately after is assigning the values of each resulting column to variables, e.g.
DECLARE #id1 int, #id2 int, #id3 int, #id4 int, #id5 int
SELECT #id1 = column1, #id2 = column2, #id3 = column3, #id4 = column4,
#id5 = column5 FROM [transposed_table]

You also need a value field in your query for each id to aggregate on. Then you can do something like this
select [1234], [1235]
from
(
-- replace code below with your query, e.g. select id, value from table
select
id = 1234,
value = 1
union
select
id = 1235,
value = 2
) a
pivot
(
avg(value) for id in ([1234], [1235])
) as pvt

I think you'll find the answer in this answer to a slightly different question: Generate "scatter plot" result of members against sets from SQL query
The answer uses Dynamic SQL. Check out the last link in mellamokb's answer: http://www.sqlfiddle.com/#!3/c136d/14 where he creates column names from row data.

In case you have a grouped flat data structure that you want to group transpose, like such:
GRP | ID
---------------
1 | 1234
1 | 1235
1 | 1236
1 | 1267
1 | 1278
2 | 1234
2 | 1235
2 | 1267
2 | 1289
And you want its group transposition to appear like:
GRP | Column 1 | Column 2 | Column 3 | Column 4 | Column 5
-------------------------------------------------------------
1 | 1234 | 1235 | 1236 | 1267 | 1278
2 | 1234 | 1235 | NULL | 1267 | NULL
You can accomplish it with a query like this:
SELECT
Column1.ID As column1,
Column2.ID AS column2,
Column3.ID AS column3,
Column4.ID AS column4,
Column5.ID AS column5
FROM
(SELECT GRP, ID FROM FlatTable WHERE ID = 1234) AS Column1
LEFT OUTER JOIN
(SELECT GRP, ID FROM FlatTable WHERE ID = 1235) AS Column2
ON Column1.GRP = Column2.GRP
LEFT OUTER JOIN
(SELECT GRP, ID FROM FlatTable WHERE ID = 1236) AS Column3
ON Column1.GRP = Column3.GRP
LEFT OUTER JOIN
(SELECT GRP, ID FROM FlatTable WHERE ID = 1267) AS Column4
ON Column1.GRP = Column4.GRP
LEFT OUTER JOIN
(SELECT GRP, ID FROM FlatTable WHERE ID = 1278) AS Column5
ON Column1.GRP = Column5.GRP
(1) This assumes you know ahead of time which columns you will want — notice that I intentionally left out ID = 1289 from this example
(2) This basically uses a bunch of left outer joins to append 1 column at a time, thus creating the transposition. The left outer joins (rather than inner joins) allow for some columns to be null if they don't have corresponding values from the flat table, without affecting any subsequent columns.

Related

Joining two tables and need to have MAX aggregate function in ON clause

This is my code! I want to give a part id and purchase order id to my report and it brings all the related information with those specification. The important thing is that, if we have same purchase order id and part id we need the code to return the result with the highest transaction id. The following code is not providing what I expected. Could you please help me?
SELECT MAX(INVENTORY_TRANS.TRANSACTION_ID), INVENTORY_TRANS.PART_ID
, INVENTORY_TRANS.PURC_ORDER_ID, TRACE_INV_TRANS.QTY, TRACE_INV_TRANS.CREATE_DATE, TRACE_INV_TRANS.TRACE_ID
FROM INVENTORY_TRANS
JOIN TRACE_INV_TRANS ON INVENTORY_TRANS.TRANSACTION_ID = TRACE_INV_TRANS.TRANSACTION_ID
WHERE INVENTORY_TRANS.PART_ID = #PartID
AND INVENTORY_TRANS.PURC_ORDER_ID = #PurchaseOrderID
GROUP BY TRACE_INV_TRANS.QTY, TRACE_INV_TRANS.CREATE_DATE, TRACE_INV_TRANS.TRACE_ID, INVENTORY_TRANS.PART_ID
, INVENTORY_TRANS.PURC_ORDER_ID
The sample of trace_inventory_trans table is :
part_id trace_id transaction id qty create_date
x 1 10
x 2 11
x 3 12
the sample of inventory_trans table is :
transaction_id part_id purc_order_id
11 x p20
12 x p20
I wanted to have the result of biggest transaction which is transaction 12 but it shows me transaction 11
I would use a sub-query to find the MAX value, then join that result to the other table.
The ORDER BY + TOP (1) returns the MAX value for transaction_id.
SELECT
inv.transaction_id
,inv.part_id
,inv.purc_order_id
,tr.qty
,tr.create_date
,tr.trace_id
FROM
(
SELECT TOP (1)
transaction_id,
part_id,
purc_order_id
FROM
INVENTORY_TRANS
WHERE
part_id = #PartID
AND
purc_order_id = #PurchaseOrderID
ORDER BY
transaction_id DESC
) AS inv
JOIN
TRACE_INV_TRANS AS tr
ON inv.transaction_id = tr.transaction_id;
Results:
+----------------+---------+---------------+------+-------------+----------+
| transaction_id | part_id | purc_order_id | qty | create_date | trace_id |
+----------------+---------+---------------+------+-------------+----------+
| 12 | x | p20 | NULL | NULL | 3 |
+----------------+---------+---------------+------+-------------+----------+
Rextester Demo

TSQL Conditional Where or Group By?

I have a table like the following:
id | type | duedate
-------------------------
1 | original | 01/01/2017
1 | revised | 02/01/2017
2 | original | 03/01/2017
3 | original | 10/01/2017
3 | revised | 09/01/2017
Where there may be either one or two rows for each id. If there are two rows with same id, there would be one with type='original' and one with type='revised'. If there is one row for the id, type will always be 'original'.
What I want as a result are all the rows where type='revised', but if there is only one row for a particular id (thus type='original') then I want to include that row too. So desired output for the above would be:
id | type | duedate
1 | revised | 02/01/2017
2 | original | 03/01/2017
3 | revised | 09/01/2017
I do not know how to construct a WHERE clause that conditionally checks whether there are 1 or 2 rows for a given id, nor am I sure how to use GROUP BY because the revised date could be greater than or less than than the original date so use of aggregate functions MAX or MIN don't work. I thought about using CASE somehow, but also do not know how to construct a conditional that chooses between two different rows of data (if there are two rows) and display one of them rather than the other.
Any suggested approaches would be appreciated.
Thanks!
you can use row number for this.
WITH T AS
(
SELECT *,
ROW_NUMBER() OVER (PARTITION BY ID ORDER BY Type DESC) AS RN
FROM YourTable
)
SELECT *
FROM T
WHERE RN = 1
Is something like this sufficient?
SELECT *
FROM mytable m1
WHERE type='revised'
or 1=(SELECT COUNT(*) FROM mytable m2 WHERE m2.id=m1.id)
You could use a subquery to take the MAX([type]). In this case it works for [type] since alphabetically we want revised first, then original and "r" comes after "o" in the alphabet. We can then INNER JOIN back on the same table with the matching conditions.
SELECT T2.*
FROM (
SELECT id, MAX([type]) AS [MAXtype]
FROM myTABLE
GROUP BY id
) AS dT INNER JOIN myTable T2 ON dT.id = T2.id AND dT.[MAXtype] = T2.[type]
ORDER BY T2.[id]
Gives output:
id type duedate
1 revised 2017-02-01
2 original 2017-03-01
3 revised 2017-09-01
Here is the sqlfiddle: http://sqlfiddle.com/#!6/14121f/6/0

SQL Server query involving subqueries - performance issues

I have three tables:
Table 1: | dbo.pc_a21a22 |
batchNbr Other columns...
-------- ----------------
12345
12346
12347
Table 2: | dbo.outcome |
passageId record
---------- ---------
00003 200
00003 9
00004 7
Table 3: | dbo.passage |
passageId passageTime batchNbr
---------- ------------- ---------
00001 2015.01.01 12345
00002 2016.01.01 12345
00003 2017.01.01 12345
00004 2018.01.01 12346
What I want to do: for each batchNbr in Table 1 get first its latest passageTime and the corresponding passageID from Table 3. With that passageID, get the relevant rows in Table 2 and establish whether any of these rows contains the record 200. Per passageId there are at most 2 records in Table 2
What is the most efficient way to do this?
I have already created a query that works, but it's awfully slow and thus unfit for tables with millions of rows. Any suggestion on how to either change the query or do it another way? Altering the table structure is not an option, I only have read rights to the database.
My current solution (slow):
SELECT TOP 50000
a.batchNbr,
CAST ( CASE WHEN 200 in (SELECT TOP 2 record FROM dbo.outcome where passageId in (
SELECT SubqueryResults.passageId From (SELECT Top 1 passageId FROM dbo.passage pass WHERE pass.batchNbr = a.batchNbr ORDER BY passageTime Desc) SubqueryResults
)
) then 1 else 0 end as bit) as KGT_IO_END
FROM dbo.pc_a21a22 a
The desired output is:
batchNbr 200present
--------- ----------
12345 1
12346 0
I suggest you use table joining rather than subqueries.
select
a.*, b.*
from
dbo.table1 a
join
dbo.table2 b on a.id = b.id
where
/*your where clause for filtering*/
EDIT:
You could use this as a reference Join vs. sub-query
Try this
SELECT TOP 50000 a.*, (CASE WHEN b.record = 200 THEN 1 ELSE 0 END) AS
KGT_IO_END
FROM dbo.Test1 AS a
LEFT OUTER JOIN
(SELECT record, p.batchNbr
FROM dbo.Test2 AS o
LEFT OUTER JOIN (SELECT MAX(passageId) AS passageId, batchNbr FROM
dbo.Test3 GROUP BY batchNbr) AS p ON o.passageId = p.passageId
) AS b ON a.batchNbr = b.batchNbr;
The MAX subquery is to get the latest passageId by batchNbr.
However, your example won't get the record 200, since the passageId of the record with 200 is 00001, while the latest passageId of the batchNbr 12345 is 00003.
I used LEFT OUTER JOIN since the passageId from Table2 no longer match any of the latest passageId from Table3. The resulting subquery would have no records to join to Table1. Therefore INNER JOIN would not show any records from your example data.
Output from your example data:
batchNbr KGT_IO_END
12345 0
12346 0
12347 0
Output if we change the passageId of record 200 to 00003 (the latest for 12345)
batchNbr KGT_IO_END
12345 1
12346 0
12347 0

Duplicates on Self Left Join

I'm trying to pivot out a table of data stored in a vertical model into a more horizontal, SQL Server table-like model. Unfortunately due to the nature of the data, I cannot use the real data here so I worked up a generic example that follows the same model.
There are three columns to the table, an ID, column ID and value, where the ID and column ID form the Primary Key. Additionally none of the data is required (i.e. an ID can be missing column ID = 3 without breaking anything)
PetID | ColumnID | Value
---------------------------
1 | 1 | Gilda
1 | 2 | Cat
2 | 1 | Sonny
2 | 2 | Cat
2 | 3 | Black
Due to the fact that the Primary Key is a composite of two columns I cannot use the built in PIVOT functionality, so I tried doing a self LEFT JOIN:
SELECT T1.PetID
,T2.Value AS [Name]
,T3.Value AS [Type]
,T4.Value AS [Color]
FROM #Temp AS T1
LEFT JOIN #Temp AS T2 ON T1.PetID = T2.PetID
AND T2.ColumnID = 1
LEFT JOIN #Temp AS T3 ON T1.PetID = T3.PetID
AND T3.ColumnID = 2
LEFT JOIN #Temp AS T4 ON T1.PetID = T4.PetID
AND T4.ColumnID = 3;
The idea being that I want to take the ID from T1 and then do a self LEFT JOIN to get each of the values by ColumnID. However I'm getting duplicates in the data:
PetID | Name | Type | Color
------------------------------
1 | Gilda | Cat | NULL
1 | Gilda | Cat | NULL
2 | Sonny | Cat | Black
2 | Sonny | Cat | Black
2 | Sonny | Cat | Black
I am able to get rid of these duplicates using a DISTINCT, but the dataset is rather large, so the required sort action is slowing down the query tremendously. Is there a better way to accomplish this or am I just stuck with a slow query?
You can use a CASE statement and avoid the joins altogether.
SELECT
PetID,
MAX(CASE WHEN ColumnID = 1 THEN Value ELSE NULL END) AS Name,
MAX(CASE WHEN ColumnID = 2 THEN Value ELSE NULL END) AS Type,
MAX(CASE WHEN ColumnID = 3 THEN Value ELSE NULL END) AS Color
FROM #Temp
GROUP BY PetId
It is essential that PetID, ColumnID be your primary key for this to work correctly. Otherwise it will cause problems when the same ColumnID is used multiple times for the same PetID
You can use pivot if you'd like to..
SELECT *
FROM (SELECT PetID,
(CASE ColumnID
WHEN 1 THEN 'Name'
WHEN 2 THEN 'Type'
WHEN 3 THEN 'Color'
END) ValueType,
VALUE
FROM #Temp
) t
PIVOT
( MAX(Value)
FOR ValueType IN ([Name],[Type],[Color])
) p
Another way without the Sub query would be..
SELECT PetID,
[1] [Name],
[2] [Type],
[3] [Color]
FROM #Temp
PIVOT
( MAX(Value)
FOR ColumnID IN ([1],[2],[3])
) p
I don't understand your concern about sorting. You have a primary key so you also have an index. This is the correct way to do it:
select
PetID,
min(case when ColumnID = 1 then Value end) as Name,
min(case when ColumnID = 2 then Value end) as Type,
min(case when ColumnID = 3 then Value end) as Color
from #Temp
group by PetID
A fix for your duplication is simple though and will probably improve performance as well:
FROM (select distinct PetID from #Temp) AS T1
SELECT T1.PetID
,T1.Value AS [Name]
,T2.Value AS [Type]
,T3.Value AS [Color]
--select *
FROM #Temp AS T1
LEFT JOIN #Temp AS T2 ON T1.PetID = T2.PetID
AND T2.ColumnID = 2
LEFT JOIN #Temp AS T3 ON T1.PetID = T3.PetID
AND T3.ColumnID = 3
where t1.ColumnID = 1
Your problem was that you were joining to the main table that had multiple rows.

sql inserr based on maximum date

I have to just insert the value from one table into another but the condition is that out of same id I have to select that one having maximum date and then insert into another. like :
table 1
a | b
1 | 12/1/13
1 | 18/1/13
2 | 2/4/13
2 | 9/8/13
table 2
a | b
1 | 18/1/13
2 | 9/8/13
please suggest the SQL query for it
Could you try :
INSERT INTO Table2 (idcolumn, datecolumn)
SELECT DISTINCT idcolumn, datecolumn
FROM Table1
GROUP BY idcolumn
ORDER BY datecolumn DESC
INSERT INTO table2(a,b)
SELECT a, MAX(b) AS b
FROM table1
GROUP BY a;

Resources