How to convert COLUMNS to ROWS in SQL Server [duplicate] - sql-server

Looking for an elegant (or any) solution to convert columns to rows.
Here is an example: I have a table with the following schema:
[ID] [EntityID] [Indicator1] [Indicator2] [Indicator3] ... [Indicator150]
Here is what I want to get as the result:
[ID] [EntityId] [IndicatorName] [IndicatorValue]
And the result values will be:
1 1 'Indicator1' 'Value of Indicator 1 for entity 1'
2 1 'Indicator2' 'Value of Indicator 2 for entity 1'
3 1 'Indicator3' 'Value of Indicator 3 for entity 1'
4 2 'Indicator1' 'Value of Indicator 1 for entity 2'
And so on..
Does this make sense? Do you have any suggestions on where to look and how to get it done in T-SQL?

You can use the UNPIVOT operator to convert the columns into rows:
select id, entityId,
       indicatorname,
       indicatorvalue
from yourtable
unpivot
(
    indicatorvalue
    for indicatorname in (Indicator1, Indicator2, Indicator3)
) unpiv;
Note: the datatypes of the columns you are unpivoting must be the same, so you might have to convert them prior to applying the unpivot.
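For example, if the indicator columns have mixed datatypes, you can normalize them in a derived table first (a sketch; varchar(100) is an arbitrary common type):

select id, entityId,
       indicatorname,
       indicatorvalue
from
(
    select id, entityId,
           cast(Indicator1 as varchar(100)) as Indicator1,
           cast(Indicator2 as varchar(100)) as Indicator2,
           cast(Indicator3 as varchar(100)) as Indicator3
    from yourtable
) src
unpivot
(
    indicatorvalue
    for indicatorname in (Indicator1, Indicator2, Indicator3)
) unpiv;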
You could also use CROSS APPLY with UNION ALL to convert the columns:
select id, entityid,
       indicatorname,
       indicatorvalue
from yourtable
cross apply
(
    select 'Indicator1', Indicator1 union all
    select 'Indicator2', Indicator2 union all
    select 'Indicator3', Indicator3 union all
    select 'Indicator4', Indicator4
) c (indicatorname, indicatorvalue);
Depending on your version of SQL Server, you can even use CROSS APPLY with the VALUES clause (the table value constructor requires SQL Server 2008+):
select id, entityid,
       indicatorname,
       indicatorvalue
from yourtable
cross apply
(
    values
        ('Indicator1', Indicator1),
        ('Indicator2', Indicator2),
        ('Indicator3', Indicator3),
        ('Indicator4', Indicator4)
) c (indicatorname, indicatorvalue);
Finally, if you have 150 columns to unpivot and you don't want to hard-code the entire query, then you can generate the SQL statement using dynamic SQL:
DECLARE @colsUnpivot AS NVARCHAR(MAX),
        @query AS NVARCHAR(MAX)

select @colsUnpivot
    = stuff((select ','+quotename(C.column_name)
             from information_schema.columns as C
             where C.table_name = 'yourtable' and
                   C.column_name like 'Indicator%'
             for xml path('')), 1, 1, '')

set @query
    = 'select id, entityId,
              indicatorname,
              indicatorvalue
       from yourtable
       unpivot
       (
            indicatorvalue
            for indicatorname in ('+ @colsUnpivot +')
       ) u'

exec sp_executesql @query;
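For reference, with three matching Indicator columns the generated statement comes out like this (quotename adds the brackets):

select id, entityId,
       indicatorname,
       indicatorvalue
from yourtable
unpivot
(
    indicatorvalue
    for indicatorname in ([Indicator1],[Indicator2],[Indicator3])
) u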

Well, if you have 150 columns, then I think UNPIVOT is not an option. So you could use an XML trick:
;with CTE1 as (
    select ID, EntityID, (select t.* for xml raw('row'), type) as Data
    from temp1 as t
), CTE2 as (
    select
        c.ID, c.EntityID,
        F.C.value('local-name(.)', 'nvarchar(128)') as IndicatorName,
        F.C.value('.', 'nvarchar(max)') as IndicatorValue
    from CTE1 as c
    outer apply c.Data.nodes('row/@*') as F(C)
)
select * from CTE2 where IndicatorName like 'Indicator%'
sql fiddle demo
You could also write dynamic SQL, but I like XML more - for dynamic SQL you have to have permission to select data directly from the table, and that's not always an option.
UPDATE: As there's a big flame war in the comments, I think I'll add some pros and cons of XML/dynamic SQL. I'll try to be as objective as I can and not mention elegance or ugliness. If you have any other pros and cons, edit the answer or write them in the comments.
cons
it's not as fast as dynamic SQL; rough tests gave me that XML is about 2.5 times slower than dynamic (it was one query on a ~250000 row table, so this estimate is by no means exact). You can compare it yourself if you want; here's a sqlfiddle example - on 100000 rows it was 29s (XML) vs 14s (dynamic);
it may be harder to understand for people not familiar with XPath;
pros
it's in the same scope as your other queries, and that can be very handy. A few examples come to mind:
you can query the inserted and deleted tables inside your trigger (not possible with dynamic SQL at all);
the user doesn't need permission to select directly from the table. What I mean is, if you have a stored-procedure layer and the user has permission to run the sp but not to query the tables directly, you can still use this query inside the stored procedure;
you can query a table variable you have populated in your scope (to pass it into dynamic SQL you have to either make it a temporary table instead, or create a type and pass it as a parameter) - see the sketch after this list;
you can use this query inside a function (scalar or table-valued); it's not possible to use dynamic SQL inside functions;
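For instance, here's the table-variable case from the list above, reusing the same XML query (a minimal sketch; @vals and its columns are illustrative):

declare @vals table (ID int, EntityID int, Indicator1 varchar(50), Indicator2 varchar(50));
insert into @vals values (1, 1, 'a', 'b');

;with CTE1 as (
    select t.ID, t.EntityID, (select t.* for xml raw('row'), type) as Data
    from @vals as t
), CTE2 as (
    select
        c.ID, c.EntityID,
        F.C.value('local-name(.)', 'nvarchar(128)') as IndicatorName,
        F.C.value('.', 'nvarchar(max)') as IndicatorValue
    from CTE1 as c
    outer apply c.Data.nodes('row/@*') as F(C)
)
select * from CTE2 where IndicatorName like 'Indicator%';

The same table variable would be out of scope inside sp_executesql, so the dynamic SQL version of this fails.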

Just to help new readers, I've created an example to better understand @bluefeet's answer about UNPIVOT.
SELECT id
,entityId
,indicatorname
,indicatorvalue
FROM (VALUES
(1, 1, 'Value of Indicator 1 for entity 1', 'Value of Indicator 2 for entity 1', 'Value of Indicator 3 for entity 1'),
(2, 1, 'Value of Indicator 1 for entity 2', 'Value of Indicator 2 for entity 2', 'Value of Indicator 3 for entity 2'),
(3, 1, 'Value of Indicator 1 for entity 3', 'Value of Indicator 2 for entity 3', 'Value of Indicator 3 for entity 3'),
(4, 2, 'Value of Indicator 1 for entity 4', 'Value of Indicator 2 for entity 4', 'Value of Indicator 3 for entity 4')
) AS Category(ID, EntityId, Indicator1, Indicator2, Indicator3)
UNPIVOT
(
indicatorvalue
FOR indicatorname IN (Indicator1, Indicator2, Indicator3)
) UNPIV;
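For the sample data above, this returns one row per (ID, indicator) pair:

id  entityId  indicatorname  indicatorvalue
1   1         Indicator1     Value of Indicator 1 for entity 1
1   1         Indicator2     Value of Indicator 2 for entity 1
1   1         Indicator3     Value of Indicator 3 for entity 1
2   1         Indicator1     Value of Indicator 1 for entity 2
...and so on, 12 rows in total.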

Just because I did not see it mentioned.
If you are on 2016+ (and database compatibility level 130+, which OPENJSON requires), here is yet another option to dynamically unpivot data without actually using dynamic SQL.
Example
Declare @YourTable Table ([ID] varchar(50),[Col1] varchar(50),[Col2] varchar(50))
Insert Into @YourTable Values
 (1,'A','B')
,(2,'R','C')
,(3,'X','D')
Select A.[ID]
      ,Item  = B.[Key]
      ,Value = B.[Value]
 From @YourTable A
 Cross Apply ( Select *
                 From OpenJson((Select A.* For JSON Path,Without_Array_Wrapper ))
                Where [Key] not in ('ID','Other','Columns','ToExclude')
             ) B
Returns
ID Item Value
1 Col1 A
1 Col2 B
2 Col1 R
2 Col2 C
3 Col1 X
3 Col2 D

I needed a solution to convert columns to rows in Microsoft SQL Server, without knowing the column names (used in a trigger) and without dynamic SQL (dynamic SQL is too slow for use in a trigger).
I finally found this solution, which works fine:
SELECT
    insRowTbl.PK,
    insRowTbl.Username,
    attr.insRow.value('local-name(.)', 'nvarchar(128)') as FieldName,
    attr.insRow.value('.', 'nvarchar(max)') as FieldValue
FROM ( SELECT
           i.ID as PK,
           i.LastModifiedBy as Username,
           convert(xml, (select i.* for xml raw)) as insRowCol
       FROM inserted as i
     ) as insRowTbl
CROSS APPLY insRowTbl.insRowCol.nodes('/row/@*') as attr(insRow)
As you can see, I convert the row into XML (the subquery select i.* for xml raw converts all columns into one XML column).
Then I CROSS APPLY a function to each XML attribute of this column, so that I get one row per attribute.
Overall, this converts columns into rows, without knowing the column names and without using dynamic sql. It is fast enough for my purpose.
(Edit: I just saw Roman Pekar's answer above, which does the same thing.
I used the dynamic SQL trigger with cursors first, which was 10 to 100 times slower than this solution, but maybe that was caused by the cursor, not by the dynamic SQL. Anyway, this solution is very simple and universal, so it's definitely an option.)
I am leaving this comment at this place, because I want to reference this explanation in my post about the full audit trigger, that you can find here: https://stackoverflow.com/a/43800286/4160788

DECLARE @TableName varchar(max) = NULL
SELECT @TableName = COALESCE(@TableName+',','')+t.TABLE_CATALOG+'.'+t.TABLE_SCHEMA+'.'+o.name
FROM sysindexes AS i
INNER JOIN sysobjects AS o ON i.id = o.id
INNER JOIN INFORMATION_SCHEMA.TABLES t ON t.TABLE_NAME = o.name
WHERE i.indid < 2
AND OBJECTPROPERTY(o.id,'IsMSShipped') = 0
AND i.rowcnt > 350
AND o.xtype != 'TF'
ORDER BY o.name ASC
PRINT @TableName
This gives you the list of tables with row counts > 350, returned as a single comma-separated row.

The opposite of this is to flatten a column into a CSV (STRING_SPLIT requires SQL Server 2016+, STRING_AGG 2017+), e.g.
SELECT STRING_AGG ([value],',') FROM STRING_SPLIT('Akio,Hiraku,Kazuo', ',')
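Note that STRING_SPLIT does not guarantee the order of its output rows; if order matters when re-aggregating, you can pin it with WITHIN GROUP (a small sketch):

SELECT STRING_AGG([value], ',') WITHIN GROUP (ORDER BY [value])
FROM STRING_SPLIT('Akio,Hiraku,Kazuo', ',');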

Related

Test if a SQL Server column exists without using the schema?

A 3rd party DB we read in ADO.Net recently added a column in a new version of their code. It's a foreign key to a new table.
We have read-only access to the data tables, so in theory we cannot rely on the schema to do this. So...
1) is INFORMATION_SCHEMA always available for items you can access, or is it possible we will not have rights even to tables we can read?
2) if (1) is "bad", what would be the canonical solution? In SQL itself I would do a SELECT * FROM x WHERE 1=0 and then test the headers; is there an equivalent test in .Net?
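For illustration, the T-SQL side of that test could look like this (a sketch; x and NewColumn are placeholders):

-- returns zero rows but full column metadata
SELECT * FROM x WHERE 1 = 0;
-- or probe a single column: COL_LENGTH returns NULL if the column does not exist
SELECT COL_LENGTH('dbo.x', 'NewColumn');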
You can list the column names (values are optional) from any Table or Query via a little XML.
Example
Declare @AnyTableOrQuery Table (EmpID int,EmpName varchar(50),Salary int,Location varchar(100))
Insert Into @AnyTableOrQuery Values
 (1,'Arul',100,null)
,(2,'Jane',120,'New York')
Select B.*
 From ( values (cast((Select Top 1 * From @AnyTableOrQuery for XML RAW,ELEMENTS XSINIL) as xml))) A(XMLData)
 Cross Apply (
              Select Column_Name  = a.value('local-name(.)','varchar(100)')
                    ,Column_Value = a.value('.','varchar(max)')
               From A.XMLData.nodes('/row') as C1(n)
               Cross Apply C1.n.nodes('./*') as C2(a)
             ) B
Returns
Column_Name Column_Value
EmpID 1
EmpName Arul
Salary 100
Location
EDIT
@MauryMarkowitz provided a much better solution (see comment below):
sp_describe_first_result_set @tsql = N'Select * from YourTable'
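The same metadata is also exposed as a table-valued function, sys.dm_exec_describe_first_result_set, which is convenient because you can filter it directly (a sketch; YourTable and SomeColumn are placeholders):

SELECT name, system_type_name
FROM sys.dm_exec_describe_first_result_set(N'Select * from YourTable', NULL, 0)
WHERE name = N'SomeColumn'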

What is easiest and optimize way to find specific value from database tables?

As per my requirement, I have to find in which tables and columns a value like xyz@test.com exists. The database is very large, with more than 2500 tables.
Can anyone please provide an optimal way to find this type of value in the database? I've created a loop query, which took more than 9 hrs to run.
9 hours is clearly a long time. Furthermore, 2,500 tables seems close to insanity to me.
Here is one approach that will run 1 query per table, not one per column. Now, I have no idea how this will perform against 2,500 tables; I suspect it may be horrible. That said, I would strongly suggest a test filter first, like Table_Name like 'OD%'.
Example
Declare @Search varchar(max) = 'cappelletti'  -- Exact match '"cappelletti"'
Create Table #Temp (TableName varchar(500),RecordData xml)
Declare @SQL varchar(max) = ''
Select @SQL = @SQL+';Insert Into #Temp Select TableName='''+concat(quotename(Table_Schema),'.',quotename(table_name))+''',RecordData = (Select A.* for XML RAW) From '+concat(quotename(Table_Schema),'.',quotename(table_name))+' A Where (Select A.* for XML RAW) like ''%'+@Search+'%'''+char(10)
 From INFORMATION_SCHEMA.Tables
 Where Table_Type ='BASE TABLE'
   and Table_Name like 'OD%' -- **** Would REALLY Recommend a REASONABLE Filter *** --
Exec(@SQL)
Select A.TableName
      ,B.*
      ,A.RecordData
 From #Temp A
 Cross Apply (
              Select ColumnName = a.value('local-name(.)','varchar(100)')
                    ,Value      = a.value('.','varchar(max)')
               From A.RecordData.nodes('/row') as C1(n)
               Cross Apply C1.n.nodes('./@*') as C2(a)
               Where a.value('.','varchar(max)') Like '%'+@Search+'%'
             ) B
Drop Table #Temp
If it helps, the individual generated queries would look like this:
Select TableName='[dbo].[OD]'
,RecordData= (Select A.* for XML RAW)
From [dbo].[OD] A
Where (Select A.* for XML RAW) like '%cappelletti%'
On a side-note, you can search numeric data and even dates.
Make a procedure that collects each table name and (VARCHAR-typed) column name from the system tables into a temp table.
Then build one dynamic query per record, using an = condition against the input parameter holding the email address.
If the condition matches in any statement (checked with IF EXISTS), store that table name and column name in another temp table, and retrieve the list of those records from the temp table at the end of the execution. A rough sketch follows.
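Here is a sketch of that idea (all object and variable names are illustrative; note this is the slow per-column approach the question complains about, so filter the candidate list aggressively):

DECLARE @Email varchar(255) = 'xyz@test.com';

-- 1. Collect candidate character-typed columns from the system views
SELECT c.TABLE_SCHEMA, c.TABLE_NAME, c.COLUMN_NAME
INTO #Candidates
FROM INFORMATION_SCHEMA.COLUMNS c
WHERE c.DATA_TYPE IN ('varchar','nvarchar','char','nchar');

CREATE TABLE #Hits (SchemaName sysname, TableName sysname, ColumnName sysname);

-- 2. Loop, running one IF EXISTS probe per column
DECLARE @s sysname, @t sysname, @c sysname, @sql nvarchar(max);
DECLARE cur CURSOR LOCAL FAST_FORWARD FOR
    SELECT TABLE_SCHEMA, TABLE_NAME, COLUMN_NAME FROM #Candidates;
OPEN cur;
FETCH NEXT FROM cur INTO @s, @t, @c;
WHILE @@FETCH_STATUS = 0
BEGIN
    SET @sql = N'IF EXISTS (SELECT 1 FROM ' + QUOTENAME(@s) + N'.' + QUOTENAME(@t)
             + N' WHERE ' + QUOTENAME(@c) + N' = @e)'
             + N' INSERT INTO #Hits VALUES (@s, @t, @c);';
    EXEC sp_executesql @sql,
         N'@e varchar(255), @s sysname, @t sysname, @c sysname',
         @e = @Email, @s = @s, @t = @t, @c = @c;
    FETCH NEXT FROM cur INTO @s, @t, @c;
END
CLOSE cur;
DEALLOCATE cur;

SELECT * FROM #Hits;
DROP TABLE #Candidates, #Hits;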

Querying a table with an additional column containing table name

I am looking for a way to query a table and add a column with the table name, without explicitly writing the actual table name within the select statement. Is there a way to do this?
For example, I want:
Table name: Construction
The original columns would be Modif_num, modif_desc.
I'd like a query with these results:
MODIF_NUM TABLE_NAME MODIF_DESC
2 Construction Quality
2 Construction Quality
2 Construction Quality
2 Construction Quality
A regular select * would yield
MODIF_NUM MODIF_DESC
2 Quality
2 Quality
2 Quality
2 Quality
In this instance I would use Excel.
column A : table name
column B : ="select cast('"&A1&"' as nvarchar(50)) as tablename ,* into TARGETTABLE from "& A1
Then fill column A with all your table names, then copy and paste column B into SSMS.
This assumes, based on your comment, that this is a one-off task. If it's not a one-off task, use the same logic to generate a bunch of strings and execute them.
Ah wait, sorry - you cannot do SELECT INTO repeatedly, what am I thinking. More like this:
In your select statement you can return a column based on a string, for example:
SELECT 'Construction' As Table_Name, MODIF_NUM FROM MyTable
OR
SELECT 'Construction' As Table_Name, * FROM MyTable
To bring them together, a UNION may work:
SELECT 'Construction' As Table_Name, MODIF_NUM, MODIF_DESC FROM tblConstruction
UNION ALL
SELECT 'Demolition' As Table_Name, MODIF_NUM, MODIF_DESC FROM tblDemolition
UNION ALL
SELECT 'Reconstruction' As Table_Name, MODIF_NUM, MODIF_DESC FROM tblReconstruction
Does this help?
Try this query:
SELECT TABLE_NAME, a.*
FROM [Construction] a,
     INFORMATION_SCHEMA.TABLES
WHERE TABLE_NAME = 'Construction'
(Without the WHERE clause you would get one copy of every row for each table in the database.)

Parse XML and generate new rows through SQL Query

I've the input data in SQL table in below format:
ID Text
1 <Key><Name>Adobe</Name><Display>Ado</Display></Key><Key>.....</Key>
2 <Key><Name></Name><Display>Microsoft</Display><Version>1.1</Version></Key>
There can be multiple keys for each ID. There could be several thousand rows in a table in the above format. I have to generate the final SQL output in the below format:
ID Name Display Version
1 Adobe Ado
1 xyz yz 1.2
2 Microsoft 1.1
I am using the below query to parse the Text column, but I am getting all the data in one row. How can I split that data into multiple rows as indicated above?
SELECT
CAST(CAST(Text AS XML).query('data(/Key/Name)') AS VARCHAR(MAX)) AS Name,
CAST(CAST(Text AS XML).query('data(/Key/Display)') as VARCHAR(MAX)) AS DisplayName,
CAST(CAST(Text AS XML).query('data(/Key/Version)') AS VARCHAR(MAX)) AS Version
FROM
ABC where ID = 1
Currently I am running this query for one ID at a time. Is there a way to run it for all IDs together? Also, is there any other, more efficient way to get the desired output?
Here is the example:
-- Sample demonstrational schema
declare @t table (
Id int primary key,
TextData nvarchar(max) not null
);
insert into @t
values
(1, N'<Key><Name>Adobe</Name><Display>Ado</Display></Key><Key><Name>xyz</Name><Display>yz</Display><Version>1.2</Version></Key>'),
(2, N'<Key><Name></Name><Display>Microsoft</Display><Version>1.1</Version></Key>');
-- The actual query
with cte as (
select t.Id, cast(t.TextData as xml) as [XMLData]
from @t t
)
select c.Id,
k.c.value('./Name[1]', 'varchar(max)') as [Name],
k.c.value('./Display[1]', 'varchar(max)') as [DisplayName],
k.c.value('./Version[1]', 'varchar(max)') as [Version]
from cte c
cross apply c.XMLData.nodes('/Key') k(c);
Different types can be corrected with an in-place cast/convert done in the CTE (or an equivalent subquery).

Select records with a substring from another table

I have these two tables:
data
id | email
1  | xxx@gmail.com
2  | yyy@gmial.com
3  | zzzgimail.com

errors
error      | correct
@gmial.com | @gmail.com
gimail.com | @gmail.com
How can I select from data all the records with an email error? Thanks.
SELECT d.id, d.email
FROM data d
INNER JOIN errors e ON d.email LIKE '%' + e.error
Would do it; however, doing a LIKE with a wildcard at the start of the value being matched prevents an index from being used, so you may see poor performance.
An optimal approach would be to define a computed column on the data table that is the REVERSE of the email field, and index it. This turns the above query into a LIKE condition with the wildcard at the end, like so:
SELECT d.id, d.email
FROM data d
INNER JOIN errors e ON d.emailreversed LIKE REVERSE(e.error) + '%'
In this case, performance would be better as it would allow an index to be used.
I blogged a full write up on this approach a while ago here.
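The setup for that would look something like this (a sketch; the computed column and index names are my own):

-- computed reversed-email column; REVERSE is deterministic, so it can be indexed
ALTER TABLE data ADD emailreversed AS REVERSE(email);
CREATE INDEX IX_data_emailreversed ON data (emailreversed);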
Assuming the error is always at the end of the string:
declare @data table (
    id int,
    email varchar(100)
)
insert into @data
    (id, email)
select 1, 'xxx@gmail.com' union all
select 2, 'yyy@gmial.com' union all
select 3, 'zzzgimail.com'
declare @errors table (
    error varchar(100),
    correct varchar(100)
)
insert into @errors
    (error, correct)
select '@gmial.com', '@gmail.com' union all
select 'gimail.com', '@gmail.com'
select d.id,
    d.email,
    isnull(replace(d.email, e.error, e.correct), d.email) as CorrectedEmail
from @data d
    left join @errors e
        on right(d.email, LEN(e.error)) = e.error
Well, in reality you can't with the info you have provided.
In SQL you would need to maintain a table of "correct" domains. With that you could do a simple query to find non-matches, as sketched below.
You could use some "non-SQL" functionality in SQL Server to do a regular expression check, however that kind of logic does not belong in SQL (IMO).
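For example (a sketch; the domain list and its contents are illustrative):

-- maintain a table of known-good domains
declare @domains table (domain varchar(100) primary key);
insert into @domains values ('@gmail.com'), ('@hotmail.com');
-- emails that don't end in any known-good domain
select d.id, d.email
from data d
where not exists (
    select 1
    from @domains v
    where d.email like '%' + v.domain
);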
select * from
(select 1 as id, 'xxx@gmail.com' as email union
 select 2 as id, 'yyy@gmial.com' as email union
 select 3 as id, 'zzzgimail.com' as email) data join
(select '@gmial.com' as error, '@gmail.com' as correct union
 select 'gimail.com' as error, '@gmail.com' as correct ) errors
on data.email like '%' + error + '%'
I think that if you didn't use a wildcard at the beginning, but only later in the pattern, the query could benefit from an index. A full-text search could help too.
