Pivot millions of records - sql-server

I have a table with 4 columns and more than 100 million records.
Table Design:
ID char(12) PK
Type Char(2) PK (Values 1,2,3)
DCID varchar(10) Null
IND Varchar(2) Null (Values Y, N)
This needs to be pivoted like
ID, DCID1, DCID2, DCID3, IND1, IND2, IND3
If Type has the value 1, then in the pivoted table DCID1 should hold the value; if Type is 2, DCID2 should hold it, and so on. The corresponding IND also needs to be placed in IND1, IND2 or IND3 in the same way.
How to pivot this?

My suggestion would be to look at using both the UNPIVOT and the PIVOT functions to get the result.
The UNPIVOT will be used to convert the multiple DCID and IND columns into multiple rows in a single column. Once that is done, you can pivot the data back into columns.
The UNPIVOT code will be similar to this:
select id,
       col + type as new_col,
       value
from
(
    select id,
           type,
           dcid,
           cast(ind as varchar(10)) ind  -- both unpivoted columns must share a data type
    from yt
) d
unpivot
(
    value
    for col in (DCID, IND)
) unpiv;
See SQL Fiddle with Demo. This gives a result:
| ID | NEW_COL | VALUE |
----------------------------------
| 1 | dcid1 | test |
| 1 | ind1 | Y |
| 2 | dcid2 | est |
| 2 | ind2 | Y |
The new_col contains the DCID and IND names and it has the type value concatenated to the end. This new value will be what you apply the PIVOT to:
select id, DCID1, DCID2, DCID3, IND1, IND2, IND3
from
(
    select id,
           col + type as new_col,
           value
    from
    (
        select id,
               type,
               dcid,
               cast(ind as varchar(10)) ind
        from yt
    ) d
    unpivot
    (
        value
        for col in (DCID, IND)
    ) unpiv
) src
pivot
(
    max(value)
    for new_col in (DCID1, DCID2, DCID3, IND1, IND2, IND3)
) piv;
See SQL Fiddle with Demo. The result will be:
| ID | DCID1 | DCID2 | DCID3 | IND1 | IND2 | IND3 |
-------------------------------------------------------------
| 1 | test | | | Y | | |
| 2 | | est | | | Y | |
| 3 | | | blah | | | Y |
| 4 | yes | | | N | | |
| 5 | | hs | | | N | |
| 6 | | | jr | | | N |
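For reference, a minimal sketch of the sample table the queries above assume (the table name yt comes from the answer, the column definitions from the question, and the rows are made up to mirror the demo output):
CREATE TABLE yt
(
    ID   char(12)    NOT NULL,
    Type char(2)     NOT NULL,   -- '1', '2' or '3' per the question
    DCID varchar(10) NULL,
    IND  varchar(2)  NULL,       -- 'Y' or 'N'
    PRIMARY KEY (ID, Type)
);

INSERT INTO yt (ID, Type, DCID, IND)
VALUES ('1', '1', 'test', 'Y'),
       ('2', '2', 'est',  'Y'),
       ('3', '3', 'blah', 'Y'),
       ('4', '1', 'yes',  'N'),
       ('5', '2', 'hs',   'N'),
       ('6', '3', 'jr',   'N');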

Related

Extract into multiple columns from JSON with PostgreSQL

I have a column item_id that contains data in a JSON-like structure.
+----------+---------------------------------------------------------------------------------------------------------------------------------------+
| id | item_id |
+----------+---------------------------------------------------------------------------------------------------------------------------------------+
| 56711 | {itemID":["0530#2#1974","0538\/2#2#1974","0538\/3#2#1974","0538\/18#2#1974","0539#2#1974"]}" |
| 56712 | {itemID":["0138528#2#4221","0138529#2#4221","0138530#2#4221","0138539#2#4221","0118623\/2#2#4220"]}" |
| 56721 | {itemID":["2704\/1#1#1356"]}" |
| 56722 | {itemID":["0825\/2#2#3349","0840#2#3349","0844\/10#2#3349","0844\/11#2#3349","0844\/13#2#3349","0844\/14#2#3349","0844\/15#2#3349"]}" |
| 57638 | {itemID":["0161\/1#2#3364","0162\/1#2#3364","0163\/2#2#3364"]}" |
| 57638 | {itemID":["109#1#3364","110\/1#1#3364"]}" |
+----------+---------------------------------------------------------------------------------------------------------------------------------------+
I need the last four digits before every comma (if there is one); those last-4-digit values should be deduplicated and split out into individual columns.
The deduplication should happen across id as well, so only one result row with id 57638 is permitted.
Here is a fiddle with a code draft that is not giving the right answer.
The desired result should look like this:
+----------+-----------+-----------+
| id | item_id_1 | item_id_2 |
+----------+-----------+-----------+
| 56711 | 1974 | |
| 56712 | 4220 | 4221 |
| 56721 | 1356 | |
| 56722 | 3349 | |
| 57638 | 3364 | 3365 |
+----------+-----------+-----------+
There can be quite a lot of 'item_id_%' columns in the results.
with the_table (id, item_id) as (
    values
        (56711, '{"itemID":["0530#2#1974","0538\/2#2#1974","0538\/3#2#1974","0538\/18#2#1974","0539#2#1974"]}'),
        (56712, '{"itemID":["0138528#2#4221","0138529#2#4221","0138530#2#4221","0138539#2#4221","0118623\/2#2#4220"]}'),
        (56721, '{"itemID":["2704\/1#1#1356"]}'),
        (56722, '{"itemID":["0825\/2#2#3349","0840#2#3349","0844\/10#2#3349","0844\/11#2#3349","0844\/13#2#3349","0844\/14#2#3349","0844\/15#2#3349"]}'),
        (57638, '{"itemID":["0161\/1#2#3364","0162\/1#2#3364","0163\/2#2#3364"]}'),
        (57638, '{"itemID":["109#1#3365","110\/1#1#3365"]}')
)
select id
      ,(array_agg(itemid))[1] itemid_1
      ,(array_agg(itemid))[2] itemid_2
from (
    select distinct id
          ,split_part(replace(json_array_elements(item_id::json -> 'itemID')::text, '"', ''), '#', 3)::int itemid
    from the_table
    order by 1, 2
) t
group by id
DEMO
You can unnest the json array, get the last 4 characters of each element as a number, then do conditional aggregation:
select
    id,
    max(val) filter(where rn = 1) item_id_1,
    max(val) filter(where rn = 2) item_id_2
from (
    select
        id,
        right(val, 4)::int val,
        -- rank the distinct last-4-digit values within each id
        dense_rank() over(partition by id order by right(val, 4)::int) rn
    from mytable t
    cross join lateral jsonb_array_elements_text(t.item_id -> 'itemID') as x(val)
) t
group by id
You can add more conditional max()s to the outer query to handle more possible values.
Demo on DB Fiddle:
id | item_id_1 | item_id_2
----: | --------: | --------:
56711 | 1974 | null
56712 | 4220 | 4221
56721 | 1356 | null
56722 | 3349 | null
57638 | 3364 | 3365
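As noted above, handling a third value per id is just one more filtered aggregate in the outer query, for example (same mytable / jsonb setup as in the demo):
select
    id,
    max(val) filter(where rn = 1) item_id_1,
    max(val) filter(where rn = 2) item_id_2,
    max(val) filter(where rn = 3) item_id_3  -- one extra filtered max per additional column
from (
    select
        id,
        right(val, 4)::int val,
        dense_rank() over(partition by id order by right(val, 4)::int) rn
    from mytable t
    cross join lateral jsonb_array_elements_text(t.item_id -> 'itemID') as x(val)
) t
group by id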

Merging multiple bytea rows based on id

I'm working with a Postgres database where I need to merge multiple rows into a single row based on ID.
ID | A | B | C |
--------------------
1 | x | | |
1 | x | y | |
2 | x | | z |
3 | | y | |
3 | | | z |
A, B and C are bytea columns.
I need to merge it as follows:
ID | A | B | C |
--------------------
1 | x | y | |
2 | x | | z |
3 | | y | z |
The problem occurs when I do GROUP BY on ID, as I'm not able to find an appropriate aggregate function for bytea columns.
You can always do it with subqueries:
WITH allID as (
    SELECT distinct ID
    FROM YourTable
)
SELECT
    ID,
    -- ascending ORDER BY puts NULLs last, so a non-NULL value is picked when one exists
    (SELECT A FROM YourTable yt where yt.ID = ai.ID ORDER BY A LIMIT 1) as A,
    (SELECT B FROM YourTable yt where yt.ID = ai.ID ORDER BY B LIMIT 1) as B,
    (SELECT C FROM YourTable yt where yt.ID = ai.ID ORDER BY C LIMIT 1) as C
FROM allID as ai
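An alternative sketch, assuming PostgreSQL 9.4+ (for the FILTER clause): array_agg does accept bytea, so you can group once and take the first non-NULL value per column:
SELECT ID,
       (array_agg(A) FILTER (WHERE A IS NOT NULL))[1] AS A,
       (array_agg(B) FILTER (WHERE B IS NOT NULL))[1] AS B,
       (array_agg(C) FILTER (WHERE C IS NOT NULL))[1] AS C
FROM YourTable
GROUP BY ID;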

Getting values from a table that's inside a table (unpivot / cross apply)

I'm having a serious problem with one of my import tables. I've imported an Excel file to a SQL Server table. The table ImportExcelFile now looks like this (simplified):
+----------+-------------------+-----------+------------+--------+--------+-----+---------+
| ImportId | Excelfile | SheetName | Field1 | Field2 | Field3 | ... | Field10 |
+----------+-------------------+-----------+------------+--------+--------+-----+---------+
| 1 | C:\Temp\Test.xlsx | Sheet1 | Age / Year | 2010 | 2011 | | 2018 |
| 2 | C:\Temp\Test.xlsx | Sheet1 | 0 | Value1 | Value2 | | Value9 |
| 3 | C:\Temp\Test.xlsx | Sheet1 | 1 | Value1 | Value2 | | Value9 |
| 4 | C:\Temp\Test.xlsx | Sheet1 | 2 | Value1 | Value2 | | Value9 |
| 5 | C:\Temp\Test.xlsx | Sheet1 | 3 | Value1 | Value2 | | Value9 |
| 6 | C:\Temp\Test.xlsx | Sheet1 | 4 | Value1 | Value2 | | Value9 |
| 7 | C:\Temp\Test.xlsx | Sheet1 | 5 | NULL | NULL | | NULL |
+----------+-------------------+-----------+------------+--------+--------+-----+---------+
I now want to insert those values from Field1 to Field10 into the table AgeYear (in my original table there are about 70 columns and 120 rows). The first row (Age / Year, 2010, 2011, ...) is the header row. The column Field1 is the leading column. I want to save the values in the following format:
+-----------+-----+------+--------+
| SheetName | Age | Year | Value |
+-----------+-----+------+--------+
| Sheet1 | 0 | 2010 | Value1 |
| Sheet1 | 0 | 2011 | Value2 |
| ... | ... | ... | ... |
| Sheet1 | 0 | 2018 | Value9 |
| Sheet1 | 1 | 2010 | Value1 |
| Sheet1 | 1 | 2011 | Value2 |
| ... | ... | ... | ... |
| Sheet1 | 1 | 2018 | Value9 |
| ... | ... | ... | ... |
+-----------+-----+------+--------+
I've tried the following query:
DECLARE @sql NVARCHAR(MAX) =
';WITH cte AS
(
    SELECT i.SheetName,
           ROW_NUMBER() OVER(PARTITION BY i.SheetName ORDER BY i.SheetName) AS rn,
           ' + @columns + ' -- @columns = ''Field1, Field2, Field3, Field4, ...''
    FROM dbo.ImportExcelFile i
    WHERE i.Sheetname LIKE ''Sheet1''
)
SELECT SheetName,
       age Age,
       y.[Year]
FROM cte
CROSS APPLY
(
    SELECT Field1 age
    FROM dbo.ImportExcelFile
    WHERE SheetName LIKE ''Sheet1''
    AND ISNUMERIC(Field1) = 1
) a (age)
UNPIVOT
(
    [Year] FOR [Years] IN (' + @columns + ')
) y
WHERE rn = 1'
EXEC (@sql)
So far I'm getting the desired ages and years. My problem is that I don't know how I could get the values. With UNPIVOT I don't get the NULL values. Instead it fills the whole table with the same values even if they are NULL in the source table.
Could you please help me?
Perhaps an alternative approach. This is not dynamic, but with the help of a CROSS APPLY and a JOIN...
The drawback is that you'll have to define the 70 fields.
Example
;with cte0 as (
    Select A.ImportId
          ,A.SheetName
          ,Age = A.Field1
          ,B.*
     From ImportExcelFile A
     Cross Apply ( values ('Field2',Field2)
                         ,('Field3',Field3)
                         ,('Field10',Field10)
                 ) B (Item,Value)
)
,cte1 as ( Select * from cte0 where ImportId=1 )  -- the header row: maps each Item to its year
Select A.SheetName
      ,[Age]   = try_convert(int,A.Age)
      ,[Year]  = try_convert(int,B.Value)
      ,[Value] = A.Value
 From cte0 A
 Join cte1 B on A.Item=B.Item
 Where A.ImportId>1
Returns the data in the desired SheetName / Age / Year / Value format.
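To actually load the AgeYear table as the question asks, the final SELECT can feed an INSERT (a sketch; the AgeYear column names are assumed from the desired output above, and cte0/cte1 are exactly the CTEs from the answer):
;with cte0 as (
    Select A.ImportId
          ,A.SheetName
          ,Age = A.Field1
          ,B.*
     From ImportExcelFile A
     Cross Apply ( values ('Field2',Field2)
                         ,('Field3',Field3)
                         ,('Field10',Field10)
                 ) B (Item,Value)
)
,cte1 as ( Select * from cte0 where ImportId=1 )
Insert Into AgeYear (SheetName, Age, [Year], [Value])
Select A.SheetName
      ,try_convert(int,A.Age)
      ,try_convert(int,B.Value)
      ,A.Value
 From cte0 A
 Join cte1 B on A.Item=B.Item
 Where A.ImportId>1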

SQL SELECT and INSERT/UPDATE, PIVOT?

I need to take data from a table that looks like this:
name | server | instance | version | user
----------|----------|------------|----------|----------
package_a | x | 1 | 1 | AB
package_b | x | 1 | 1 | TL
package_a | x | 2 | 4 | SK
package_a | y | 1 | 2 | MD
package_c | y | 1 | 4 | SK
package_b | y | 2 | 1 | SK
package_a | y | 2 | 1 | TL
package_b | x | 2 | 3 | TL
package_c | x | 2 | 1 | TL
and I need to put it in a table like that:
name | v_x_1 | u_x_1 | v_x_2 | u_x_2 | v_y_1 | u_y_1 | v_y_2 | u_y_2
----------|-------|-------|-------|-------|-------|-------|-------|-------
package_a | 1 | AB | 4 | SK | 2 | MD | 1 | TL
package_b | 1 | TL | 3 | TL | NULL | NULL | 1 | SK
package_c | NULL | NULL | 1 | TL | 4 | SK | NULL | NULL
I already tried INSERT with (SUB)SELECT, tried to INSERT package names first using DISTINCT and UPDATE afterwards, played around with PIVOT and stuff like that.
But I'm rather new to SQL and programming in general, so I couldn't come up with a solution. Since I not only have a version number in the source table but also nvarchar columns, it seems like PIVOT won't be the way to go, right?
You can use PIVOT on a subquery that uses UNION ALL to separate the user and version values:
insert into YourNewTable (name, [v_x_1],[u_x_1],[v_x_2],[u_x_2],[v_y_1],[u_y_1],[v_y_2],[u_y_2])
select *
from (
    -- put version and user values in one column, with a title that matches the target column name
    select name, cast([version] as varchar(30)) as value, concat('v_',[server],'_',instance) as title from YourTable
    union all
    select name, [user] as value, concat('u_',[server],'_',instance) as title from YourTable
) q
pivot (max(value) FOR title IN (
    [v_x_1],[u_x_1],[v_x_2],[u_x_2],[v_y_1],[u_y_1],[v_y_2],[u_y_2]
    )
) pvt;
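For completeness, a sketch of what YourNewTable could look like for this INSERT (the column names come from the desired result; the types and sizes are assumptions):
create table YourNewTable (
    name  nvarchar(100),
    v_x_1 nvarchar(30), u_x_1 nvarchar(30),
    v_x_2 nvarchar(30), u_x_2 nvarchar(30),
    v_y_1 nvarchar(30), u_y_1 nvarchar(30),
    v_y_2 nvarchar(30), u_y_2 nvarchar(30)
);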

Twice Inner Join on same table with Aggregate function

Being a novice SQL user: I have a simple table storing some records nightly.
Table: T1
+----+-----+----+-----------+------------+
| Id | A | AB | Value | Date |
+----+-----+----+-----------+------------+
| 1 | abc | I | -48936.08 | 2013-06-24 |
| 2 | def | A | 431266.19 | 2013-06-24 |
| 3 | xyz | I | -13523.90 | 2013-06-24 |
| 4 | abc | A | 13523.90 | 2013-06-23 |
| 5 | xyz | I | -13523.90 | 2013-06-23 |
| 6 | def | A | 13523.90 | 2013-06-22 |
| 7 | def | I | -13523.90 | 2013-06-22 |
+----+-----+----+-----------+------------+
I would like to get all values of the columns A, AB and Value for the latest Date per value of column A, filtered on AB = 'I'.
Basically the result should look like:
+----+-----+----+-----------+------------+
| Id | A | AB | Value | Date |
+----+-----+----+-----------+------------+
| 1 | abc | I | -48936.08 | 2013-06-24 |
| 3 | xyz | I | -13523.90 | 2013-06-24 |
| 7 | def | I | -13523.90 | 2013-06-22 |
+----+-----+----+-----------+------------+
I have tried to use an inner join twice on the same table but failed to come up with the correct result.
Any help would be appreciated.
Thanks :)
This will work with SQL Server 2005+:
;WITH a as
(
    SELECT id, A, AB, Value, Date
         , row_number() over (partition by A order by Date desc) rn  -- rn = 1 is the latest Date per A
    FROM t1
    WHERE AB = 'I'
)
SELECT id, A, AB, Value, Date
FROM a
WHERE rn = 1
; WITH x AS (
    SELECT id
         , a
         , ab
         , "value"
         , "date"
         , Row_Number() OVER (PARTITION BY a ORDER BY "date" DESC) As row_num
    FROM your_table
    WHERE ab = 'I'
)
SELECT *
FROM x
WHERE row_num = 1
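For comparison, the self-join approach the question was attempting can also work: join the table to the per-A maximum date (a sketch, using the T1 table from the question):
SELECT t.Id, t.A, t.AB, t.Value, t.Date
FROM T1 t
INNER JOIN (
    SELECT A, MAX(Date) AS MaxDate
    FROM T1
    WHERE AB = 'I'
    GROUP BY A
) m ON m.A = t.A AND t.Date = m.MaxDate
WHERE t.AB = 'I';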
