Convert columns to rows by ID - sql-server

I'm looking for a way to convert columns to rows in SQL Server.
I have a table with the columns below:
[ID] [Action] [Note] [Resolution]
Here is what I want to get as the result with the columns: [ID] [Notes]
And the result values will be:
'1' 'Action1'
'1' 'Note1'
'1' 'Resolution1'
'2' 'Action2'
'2' 'Note2'
'2' 'Note2.1'
'2' 'Resolution2' etc
Any ideas how I could do this in T-SQL? Also for the note field there could be multiple entries. Thanks!

Assuming your source table and data look like this:
-- select * from t:
ID Action Note Resolution
--- ------- ------- -----------
1 Action1 Note1 Resolution1
2 Action2 Note2 Resolution2
2 Action2 Note2.1 Resolution2
This query:
select distinct id, notes
from (select * from t) as source
unpivot (notes for ids in ([action], [note], [resolution])
) as unpivotted_table
will produce this result:
id notes
--- ------
1 Action1
1 Note1
1 Resolution1
2 Action2
2 Note2
2 Note2.1
2 Resolution2
which looks a lot like what you are asking for.
You can find more information on how the UNPIVOT operator works in the SQL Server documentation.
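If UNPIVOT is not available, the same reshaping can be approximated with one SELECT per source column, glued together with UNION. The following is a minimal sketch, assuming the sample table `t` from above and using Python's built-in sqlite3 module purely as a runnable stand-in for SQL Server; the IS NOT NULL filters mimic the fact that UNPIVOT skips NULL values.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (id INTEGER, action TEXT, note TEXT, resolution TEXT);
    INSERT INTO t VALUES
        (1, 'Action1', 'Note1',   'Resolution1'),
        (2, 'Action2', 'Note2',   'Resolution2'),
        (2, 'Action2', 'Note2.1', 'Resolution2');
""")

# One SELECT per column to unpivot. UNION (without ALL) removes duplicates,
# which plays the role of DISTINCT in the UNPIVOT query.
rows = conn.execute("""
    SELECT id, action AS notes FROM t WHERE action IS NOT NULL
    UNION
    SELECT id, note FROM t WHERE note IS NOT NULL
    UNION
    SELECT id, resolution FROM t WHERE resolution IS NOT NULL
    ORDER BY id, notes
""").fetchall()

for row in rows:
    print(row)
```

The SQL text itself is plain enough that it should carry over to T-SQL largely unchanged, although that has not been verified here.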

Related

Sorting Multiple Parent/Child on Recursive SQL Query

I'm converting some Oracle code to SQL Server.
The Oracle code looks like this:
SELECT FLEX_VALUE, DESCRIPTION, ADMIN_ENABLED, PARENT_FLEX_VALUE, DISPLAY_DESC, LEVEL
FROM ( SELECT FLEX_VALUE, DESCRIPTION, ADMIN_ENABLED, PARENT_FLEX_VALUE, vDescField AS DISPLAY_DESC
FROM GL_SEGMENT2
WHERE PERIOD_YEAR = 2015 )
CONNECT BY PRIOR FLEX_VALUE = PARENT_FLEX_VALUE
START WITH PARENT_FLEX_VALUE IS NULL
ORDER SIBLINGS BY DISPLAY_DESC;
And it produces the following CORRECT RESULTS:
The Query groups the data by their parent, and the levels are indicated.
The results are ordered by the parent groups.
The 'children' under the group headings don't seem to be ordered.
The data is stored in a single table.
I have converted the Oracle query to the following SQL Server query:
WITH n ([FLEX_VALUE], [DESCRIPTION], [ADMIN_ENABLED], [PARENT_FLEX_VALUE], [DISPLAY_DESC], [LEVEL]) AS
(SELECT P1.[FLEX_VALUE], P1.[DESCRIPTION], P1.[ADMIN_ENABLED], P1.[PARENT_FLEX_VALUE], P1.[DISPLAY_DESC], 1 AS [LEVEL]
FROM (SELECT [FLEX_VALUE], [DESCRIPTION], [ADMIN_ENABLED], [PARENT_FLEX_VALUE], [FLEX_VALUE] + ' - ' + [DESCRIPTION] AS [DISPLAY_DESC]
FROM dbo.FIN_REP_GL_SEGMENT2
WHERE [PERIOD_YEAR] = 2015 ) AS P1
WHERE LEN(LTRIM(RTRIM(ISNULL(P1.[PARENT_FLEX_VALUE],'')))) = 0
UNION ALL
SELECT C1.[FLEX_VALUE], C1.[DESCRIPTION], C1.[ADMIN_ENABLED], C1.[PARENT_FLEX_VALUE], C1.[DISPLAY_DESC], Parent.[LEVEL] + 1
FROM (SELECT [FLEX_VALUE], [DESCRIPTION], [ADMIN_ENABLED], [PARENT_FLEX_VALUE], [FLEX_VALUE] + ' - ' + [DESCRIPTION] AS [DISPLAY_DESC]
FROM dbo.FIN_REP_GL_SEGMENT2
WHERE [PERIOD_YEAR] = 2015 ) AS C1
JOIN n Parent ON Parent.[FLEX_VALUE] = C1.[PARENT_FLEX_VALUE] )
SELECT [FLEX_VALUE], [DESCRIPTION], [ADMIN_ENABLED], [PARENT_FLEX_VALUE], [DISPLAY_DESC], [LEVEL]
FROM n
ORDER BY [DISPLAY_DESC]
The above SQL Server query produces INCORRECT SORTING as illustrated below:
The LEVELS seem correct but the children are being displayed under the incorrect parent categories (note the B145 and Cnnn values). The B145 record should display under the F000 parent, and the Cnnn records should display under the L000 parent.
Currently the SQL Query puts these under the B000 parent which is incorrect!
The SQL query seems to be sorting on the FLEX_VALUE column, irrespective of what 'parent' the 'child' actually belongs to.
The root cause [sic] of the issue seems to be that there are MULTIPLE root records with NULL in their PARENT_FLEX_VALUE, and I actually want to ignore the alphabetic sorting on FLEX_VALUE (I'm only concerned with the PARENT SORT ORDER).
Everything I try with the SQL query doesn't change the sorting order.
Other than the sorting/grouping issue, the query is basically working.
A re-worked example of my current SQL Server query attempt with an explanation of why it currently doesn't work will be very much appreciated.
This recursive Oracle query, similar to yours, should help. I think you can easily adapt it to SQL Server:
with t(FV, DSC, PFV, path, lvl) as (
select FLEX_VALUE, DESCRIPTION, PARENT_FLEX_VALUE, flex_value, 1
from gl_segment2 where parent_flex_value is null
union all
select g.FLEX_VALUE, g.DESCRIPTION, g.PARENT_FLEX_VALUE,
t.path||'/'||g.flex_value, t.lvl+1
from gl_segment2 g join t on g.parent_flex_value = t.fv )
select t.*, lpad(' ', (lvl-1)*2, ' ')||fv hierarchy from t order by path
To keep the hierarchy intact I added a path column, which enables correct ordering.
Of course you don't need the PATH, LVL and HIERARCHY columns in the output; I added them only for presentation purposes.
Output (from SQLFiddle):
FV    DSC               PFV   PATH            LVL  HIERARCHY
----- ----------------- ----- --------------- ---- ----------
A000  DESCRIPTION A000        A000            1    A000
A010  DESCRIPTION A010  A000  A000/A010       2      A010
A100  DESCRIPTION A100  A010  A000/A010/A100  3        A100
A101  DESCRIPTION A101  A010  A000/A010/A101  3        A101
A011  DESCRIPTION A011  A000  A000/A011       2      A011
B000  DESCRIPTION B000        B000            1    B000
B010  DESCRIPTION B010  B000  B000/B010       2      B010
B011  DESCRIPTION B011  B000  B000/B011       2      B011
F000  DESCRIPTION F000        F000            1    F000
B145  DESCRIPTION B145  F000  F000/B145       2      B145
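The path idea can be tried out without an Oracle or SQL Server instance. Below is a minimal sketch using Python's built-in sqlite3 module as a stand-in; the table contents are a made-up subset of the rows shown above, and the CTE name `n` is borrowed from the question. Each row carries the chain of ancestor keys, so sorting by that chain keeps every child directly under its own parent.

```python
import sqlite3

# In-memory SQLite database as a stand-in (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE gl_segment2 (flex_value TEXT, description TEXT, parent_flex_value TEXT);
    INSERT INTO gl_segment2 VALUES
        ('A000', 'DESCRIPTION A000', NULL),
        ('A010', 'DESCRIPTION A010', 'A000'),
        ('B000', 'DESCRIPTION B000', NULL),
        ('B010', 'DESCRIPTION B010', 'B000'),
        ('F000', 'DESCRIPTION F000', NULL),
        ('B145', 'DESCRIPTION B145', 'F000');
""")

ordered = conn.execute("""
    WITH RECURSIVE n (flex_value, parent_flex_value, path, lvl) AS (
        -- anchor: root rows start their own path
        SELECT flex_value, parent_flex_value, flex_value, 1
        FROM gl_segment2
        WHERE parent_flex_value IS NULL
        UNION ALL
        -- recursive step: append the child's key to the parent's path
        SELECT c.flex_value, c.parent_flex_value,
               n.path || '/' || c.flex_value, n.lvl + 1
        FROM gl_segment2 AS c
        JOIN n ON c.parent_flex_value = n.flex_value
    )
    SELECT flex_value, path, lvl FROM n ORDER BY path
""").fetchall()

for row in ordered:
    print(row)
```

In this sketch B145 sorts under F000 rather than next to B000, which is the behaviour the question asks for. When porting to SQL Server, the RECURSIVE keyword is dropped, `||` becomes `+`, and the path column in the anchor usually needs an explicit CAST to a wide varchar so the anchor and recursive members agree on type. Sorting by a delimited path is only reliable when the key values cannot themselves collide with the delimiter ordering, as with the fixed-width codes here.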

Max Value with unique values in more than one column

I feel like I'm missing something really obvious here.
Using T-SQL/SQL-Server:
I have unique values in more than one column but want to select the max version based on one particular column.
Dataset:
Example
ID | Name| Version | Code
------------------------
1 | Car | 3 | NULL
1 | Car | 2 | 1000
1 | Car | 1 | 2000
Target status: I want my query to only select the row with the highest version value. Running a MAX on the version column pulls all three because of the distinct values in the 'Code' column:
SELECT ID
,Name
,MAX(Version)
,Code
FROM Table
GROUP BY ID, Name, Code
The net result is that I get all three entries as per the data set due to the unique values in the Code column, but I only want the top row (Version 3).
Any help would be appreciated.
You need to identify the row with the highest version in one query, then use an outer query to pull all the fields for that row. Like so:
SELECT t.ID, t.Name, GRP.Version, t.Code
FROM (
SELECT ID
,Name
,MAX(Version) as Version
FROM Table
GROUP BY ID, Name
) GRP
INNER JOIN Table t on GRP.ID = t.ID and GRP.Name = t.Name and GRP.Version = t.Version
You can also use row_number() to do this kind of logic, for example like this:
select ID, Name, Version, Code
from (
select *, row_number() over (order by Version desc) as RN
from Table1
) X where RN = 1
Example in SQL Fiddle
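Note that row_number() with only an ORDER BY numbers the whole table, so it returns a single row overall. If the table holds several IDs and the latest version of each is wanted, a PARTITION BY is needed. A minimal sketch, using Python's built-in sqlite3 module as a stand-in for SQL Server; the table name `items` and the 'Bike' rows are hypothetical additions for illustration.

```python
import sqlite3

# In-memory SQLite database as a stand-in (needs SQLite 3.25+ for window functions).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (id INTEGER, name TEXT, version INTEGER, code INTEGER);
    INSERT INTO items VALUES
        (1, 'Car',  3, NULL),
        (1, 'Car',  2, 1000),
        (1, 'Car',  1, 2000),
        (2, 'Bike', 1, 3000),   -- hypothetical second ID
        (2, 'Bike', 2, 4000);
""")

# Number the rows within each id, newest version first, and keep row 1 of each group.
latest = conn.execute("""
    SELECT id, name, version, code
    FROM (
        SELECT *, ROW_NUMBER() OVER (PARTITION BY id ORDER BY version DESC) AS rn
        FROM items
    ) AS x
    WHERE rn = 1
    ORDER BY id
""").fetchall()

for row in latest:
    print(row)
```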
Add a TOP clause to force the return of a single row, and add an ORDER BY clause:
SELECT top 1 ID
,Name
,MAX(Version)
,Code
FROM Table
GROUP BY ID, Name, Code
order by max(version) desc

TSQL - View with cross apply and pivot

This is my base table:
docID | rowNumber | Column1 | Column2 | Column3
I use cross apply and pivot to transform the records in Column1 to actual columns and use the values in column2 and column3 as records for the new columns. In my fiddle you can see base and transformed select statement.
I have columns like Plant and Color which are numbered, e.g. Plant1, Plant2, Plant3, Color1, Color2 etc.
For each plant that exists in all plant columns I want to create a new row with a comma separated list of colors in one single column.
What I want to achieve is also shown in the screenshot below:
This should become a view to use in Excel. How do I need to modify the view to get to the desired result?
Additional question: The Length-column is numeric. Is there any way to switch the decimal separator from within Excel as a user and apply it to this or all numeric column(s) so that it will be recognized by Excel as a number?
I used to have an old php web query where I would pass the separator from a dropdown cell in Excel as a parameter.
Thank you.
First off, man the way your data is stored is a mess. I would recommend reading up on good data structures and fixing yours if you can. Here's a TSQL query that gets you the data in the correct format.
WITH CTE_no_nums
AS
(
SELECT docID,
CASE
WHEN PATINDEX('%[0-9]%',column1) > 0
THEN SUBSTRING(column1,0,PATINDEX('%[0-9]%',column1))
ELSE column1
END AS cols,
COALESCE(column2,column3) AS vals
FROM miscValues
WHERE column2 IS NOT NULL
OR column3 IS NOT NULL
),
CTE_Pivot
AS
(
SELECT docID,partNumber,prio,[length],material
FROM CTE_no_nums
PIVOT
(
MAX(vals) FOR cols IN (partNumber,prio,[length],material)
) pvt
)
SELECT A.docId + ' # ' + B.vals AS [DocID # Plant],
A.docID,
A.partNumber,
A.prio,
B.vals AS Plant,
A.partNumber + '#' + A.material + '#' + A.[length] AS Identification,
A.[length],
SUBSTRING(CA.colors,0,LEN(CA.colors)) colors --substring removes last comma
FROM CTE_Pivot A
INNER JOIN CTE_no_nums B
ON A.docID = B.docID
AND B.cols = 'Plant'
CROSS APPLY ( SELECT vals + ','
FROM CTE_no_nums C
WHERE cols = 'Color'
AND C.docID = A.docID
FOR XML PATH('')
) CA(colors)
Results:
DocID # Plant docID partNumber prio Plant Identification length colors
---------------- ------ ---------- ---- ---------- ------------------ ------- -------------------------
D0001 # PlantB D0001 X001 1 PlantB X001#MA123#10.87 10.87 white,black,blue
D0001 # PlantC D0001 X001 1 PlantC X001#MA123#10.87 10.87 white,black,blue
D0002 # PlantA D0002 X002 2 PlantA X002#MA456#16.43 16.43 black,yellow
D0002 # PlantC D0002 X002 2 PlantC X002#MA456#16.43 16.43 black,yellow
D0002 # PlantD D0002 X002 2 PlantD X002#MA456#16.43 16.43 black,yellow

Keep nulls with two IN()

I'm refactoring very old code. Currently, PHP generates a separate select for every value. Say loc contains 1,2 and data contains a,b, it generates
select val from tablename where loc_id=1 and data_id=a;
select val from tablename where loc_id=1 and data_id=b;
select val from tablename where loc_id=2 and data_id=a;
select val from tablename where loc_id=2 and data_id=b;
...etc which all return either a single value or nothing. That meant I always had n(loc_id)*n(data_id) results, including nulls, which is necessary for subsequent processing. Knowing the order, this was used to generate an HTML table. Both data_id and loc_id can in theory scale up to a couple thousands (which is obviously not great in a table, but that's another concern).
           +-----------+-----------+
           | data_id 1 | data_id 2 |
+----------+-----------+-----------+
| loc_id 1 |     -     |  999.99   |
+----------+-----------+-----------+
| loc_id 2 |  888.88   |     -     |
+----------+-----------+-----------+
To speed things up, I was looking at replacing this with a single query:
select val from tablename where loc_id in (1,2) and data_id in (a,b) order by loc_id asc, data_id asc;
to get a result like (below) and iterate to build my table.
Rownum VAL
------- --------
1 null
2 999.99
3 777.77
4 null
Unfortunately that approach drops the nulls from the resultset so I end up with
Rownum VAL
------- --------
1 999.99
2 777.77
Note that it is possible that neither data_id nor loc_id has any match, in which case I would still need a null, null.
So I don't know which value matches which. I could match against the expected loc_id/data_id combination in PHP if I also select loc_id and data_id... but that's getting messy.
Still a novice in SQL in general and that's absolutely the first time I work on PostgreSQL so hopefully that's not too obvious... As I post this I'm looking at two ways to solve this: any in array[] and joins. Will update if anything new is found.
tl;dr question
How do I do a where loc_id in (1,2) and data_id in (a,b) and keep the nulls so that I always get n(loc)*n(data) results?
You can achieve that in a single query with two steps:
Generate a matrix of all desired rows in the output.
LEFT [OUTER] JOIN to actual rows.
You get at least one row for every cell in your table.
If (loc_id, data_id) is unique, you get exactly one row.
SELECT t.val
FROM (VALUES (1), (2)) AS l(loc_id)
CROSS JOIN (VALUES ('a'), ('b')) AS d(data_id) -- generate total grid of rows
LEFT JOIN tablename t USING (loc_id, data_id) -- attach matching rows (if any)
ORDER BY l.loc_id, d.data_id;
Works for any number of columns with any number of values.
For your simple case:
SELECT t.val
FROM (
VALUES
(1, 'a'), (1, 'b')
, (2, 'a'), (2, 'b')
) AS ld (loc_id, data_id) -- total grid of rows
LEFT JOIN tablename t USING (loc_id, data_id) -- attach matching rows (if any)
ORDER BY ld.loc_id, ld.data_id;
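To see the grid-plus-LEFT-JOIN idea produce the missing NULLs, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for PostgreSQL. The two stored values are taken from the question's expected result; the grid is built with CTEs because SQLite does not accept the `AS l(loc_id)` alias form used above.

```python
import sqlite3

# In-memory SQLite database as a stand-in for PostgreSQL (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tablename (loc_id INTEGER, data_id TEXT, val REAL);
    INSERT INTO tablename VALUES (1, 'b', 999.99), (2, 'a', 777.77);
""")

grid = conn.execute("""
    WITH l(loc_id)  AS (VALUES (1), (2)),
         d(data_id) AS (VALUES ('a'), ('b'))
    SELECT l.loc_id, d.data_id, t.val
    FROM l
    CROSS JOIN d                      -- every loc_id/data_id combination
    LEFT JOIN tablename AS t          -- attach the stored value, if there is one
           ON t.loc_id = l.loc_id AND t.data_id = d.data_id
    ORDER BY l.loc_id, d.data_id
""").fetchall()

for row in grid:
    print(row)
```

The result always has n(loc) * n(data) rows as long as (loc_id, data_id) is unique in the table, and cells with no stored value come back as None. Selecting loc_id and data_id alongside val also makes each row self-describing, so the consuming code does not have to rely on row order alone.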
where (loc_id in (1,2) or loc_id is null)
and (data_id in (a,b) or data_id is null)
Select the fields you use for filtering, so you know where the values came from:
select loc_id, data_id, val from tablename where loc_id in (1,2) and data_id in (a,b);
You won't get nulls this way either, but it's not a problem anymore. You know which fields are missing, and you know those are nulls.

Query to find the record with most matching columns, where the number of columns and names of columns is unknown?

I have two tables, X and Y, with identical schema but different records. Given a record from X, I need a query to find the closest matching record in Y that contains NULL values for non-matching columns. Identity columns should be excluded from the comparison. For example, if my record looked like this:
------------------------
id | col1 | col2 | col3
------------------------
0 |'abc' |'def' | 'ghi'
And table Y looked like this:
------------------------
id | col1 | col2 | col3
------------------------
6 |'abc' |'def' | 'zzz'
8 | NULL |'def' | NULL
Then the closest match would be record 8, since where the columns don't match, there are NULL values. 6 WOULD have been the closest match, but the 'zzz' disqualified it.
What's unique about this problem is that the schema of the tables is unknown besides the id column and the data types. There could be 4 columns, or there could be 7 columns. We just don't know - it's dynamic. All we know is that there is going to be an 'id' column and that the columns will be strings, either varchar or nvarchar.
What is the best query in this case to pick the closest matching record out of Y, given a record from X? I'm actually writing a function. The input is an integer (the id of a record in X) and the output is an integer (the id of a record in Y, or NULL). I'm an SQL novice, so a brief explanation of what's happening in your solution would help me greatly.
There could be 4 columns, or there could be 7 columns.... I'm actually writing a function.
This is an impossible task as stated. A SQL Server user-defined function cannot execute dynamic SQL, so you cannot write a function that works on an arbitrary table structure. A stored procedure, sure, but not a function.
However, the below shows you a way using FOR XML and some decomposing of the XML to unpivot rows into column names and values which can then be compared. The technique used here and the queries can be incorporated into a stored procedure.
MS SQL Server 2008 Schema Setup:
-- this is the data table to match against
create table t1 (
id int,
col1 varchar(10),
col2 varchar(20),
col3 nvarchar(40));
insert t1
select 6, 'abc', 'def', 'zzz' union all
select 8, null , 'def', null;
-- this is the data with the row you want to match
create table t2 (
id int,
col1 varchar(10),
col2 varchar(20),
col3 nvarchar(40));
insert t2
select 0, 'abc', 'def', 'ghi';
GO
Query 1:
;with unpivoted1 as (
select n.n.value('local-name(.)','nvarchar(max)') colname,
n.n.value('.','nvarchar(max)') value
from (select (select * from t2 where id=0 for xml path(''), type)) x(xml)
cross apply x.xml.nodes('//*[local-name()!="id"]') n(n)
), unpivoted2 as (
select x.id,
n.n.value('local-name(.)','nvarchar(max)') colname,
n.n.value('.','nvarchar(max)') value
from (select id,(select * from t1 where id=outr.id for xml path(''), type) from t1 outr) x(id,xml)
cross apply x.xml.nodes('//*[local-name()!="id"]') n(n)
)
select TOP(1) WITH TIES
B.id,
sum(case when A.value=B.value then 1 else 0 end) matches
from unpivoted1 A
join unpivoted2 B on A.colname = B.colname
group by B.id
having max(case when A.value <> B.value then 1 end) is null
ORDER BY matches DESC;
Results:
| ID | MATCHES |
----------------
| 8 | 1 |
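The matching rule itself is easier to follow outside SQL. The sketch below restates it in plain Python under a few assumptions: rows are represented as dictionaries with None standing in for NULL, the function name `closest_match` is hypothetical, and a candidate must have at least one matching column to be returned (as in the query above, where an all-NULL row produces no XML elements and therefore never joins).

```python
from typing import Dict, List, Optional

Row = Dict[str, object]  # column name -> value; None stands in for SQL NULL

def closest_match(target: Row, candidates: List[Row], id_col: str = "id") -> Optional[object]:
    """Return the id of the candidate with the most equal columns,
    skipping any candidate that has a non-NULL value different from the target."""
    best_id, best_score = None, 0
    for cand in candidates:
        score = 0
        disqualified = False
        for col, value in target.items():
            if col == id_col:
                continue  # identity column is excluded from the comparison
            other = cand.get(col)
            if value is None or other is None:
                continue  # NULL neither counts as a match nor disqualifies
            if other == value:
                score += 1
            else:
                disqualified = True  # a differing non-NULL value rules the row out
                break
        if not disqualified and score > best_score:
            best_id, best_score = cand[id_col], score
    return best_id

x_row = {"id": 0, "col1": "abc", "col2": "def", "col3": "ghi"}
y_rows = [
    {"id": 6, "col1": "abc", "col2": "def", "col3": "zzz"},
    {"id": 8, "col1": None, "col2": "def", "col3": None},
]
print(closest_match(x_row, y_rows))
```

Unlike TOP(1) WITH TIES, this sketch returns only the first of several equally good candidates.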
