INSERTing rows from a SELECT statement into a sequence of columns - sql-server

This may be a bit of a noob question, but is there a nice, simple way of inserting ROWS from a SELECT statement into COLUMNS of another table?
I'm not just talking about doing an INSERT / SELECT.
I have a function which splits some CSV into rows. So say I have two rows of data like this
joe,bloggs,joe.bloggs#domain.fake;jane,soap,jane.soap#domain.notreal;
I split first by semi-colon, then by comma
Result of first split call
Id Data
1 joe,bloggs,joe.bloggs#domain.fake
2 jane,soap,jane.soap#domain.notreal
Then on each of these I run the split function again
Id Data
1 Joe
2 Bloggs
3 joe.bloggs#domain.fake
With this returned data, I want to do an insert statement that looks like this
INSERT INTO Customers (#first,#last,#email)
SELECT [row1].[col2],[row2].[col2],[row3].[col2]
Is there any simple way to do this?

An easy way to load all the data from a CSV file into a database table is to use the BULK INSERT statement. All you need to do is supply the correct parameters (in your case FIELDTERMINATOR = ',' and ROWTERMINATOR = ';'):
BULK INSERT Customers
FROM 'c:\split.csv'
WITH
(FIELDTERMINATOR = ',',
ROWTERMINATOR = ';'
)
GO
After the first split you have the data in table format, so you can use the XML method of splitting a column that holds a delimited string into multiple columns:
INSERT dbo.Customers([first], [last], [email])
SELECT Split.a.value('/M[1]', 'VARCHAR(100)' ) AS [first],
Split.a.value('/M[2]', 'VARCHAR(100)' ) AS [last],
Split.a.value('/M[3]', 'VARCHAR(100)' ) AS [email]
FROM (SELECT CAST('<M>' + REPLACE(Data , ',' , '</M><M>' ) + '</M>' AS XML) AS xmlData
FROM dbo.testCSV
) AS x CROSS APPLY XMLDATA.nodes('.') AS Split(a)
Demo on SQLFiddle
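For comparison, the split-by-semicolon-then-comma logic described in the question can also be sketched outside the database. This is only a minimal Python illustration, not part of the answer above; the function name `parse_customers` is made up, and it assumes every record has exactly three fields.

```python
def parse_customers(raw):
    """Split 'a,b,c;d,e,f;' into a list of (first, last, email) tuples."""
    rows = []
    for record in raw.split(";"):
        record = record.strip()
        if not record:
            # Skip the empty piece left after the trailing ';'
            continue
        fields = [f.strip() for f in record.split(",")]
        if len(fields) != 3:
            # Assumption: exactly three fields per record
            raise ValueError(f"expected 3 fields, got {len(fields)}: {record!r}")
        rows.append(tuple(fields))
    return rows


rows = parse_customers(
    "joe,bloggs,joe.bloggs#domain.fake;jane,soap,jane.soap#domain.notreal;"
)
```

Each tuple corresponds to one row of the target table, so the list could be handed to a parameterized INSERT by whatever client library is in use.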

Related

Replace specials chars with HTML entities

I have the following in table TABLE
id content
-------------------------------------
1 Hellö world, I äm text
2 ènd there äré many more chars
3 that are speçial in my dat£base
I now need to export these records into HTML files, using bcp:
set @command = 'bcp "select [content] from [TABLE] where [id] = ' +
    @id + '" queryout "' + @filename + '.html" -S ' + @instance +
    ' -c -U ' + @username + ' -P ' + @password
exec xp_cmdshell @command, no_output
To make the output look correct, I first need to replace all special characters with their respective HTML entities (pseudocode):
insert into [#temp_html] ..
replace(replace([content], 'ö', '&ouml;'), 'ä', '&auml;')
But by now, I have 30 nested replaces and it's starting to look insane.
After much searching, I found this post which uses a HTML conversion table but it is too advanced for me to understand:
The table does not list the special chars itself as they are in my text (ö, à etc) but UnicodeHex. Do I need to add them to the table to make the conversions that I need?
I am having trouble understanding how to update my script to replace all special chars. Can someone please show me a snippet of (pseudo) code?
One way to do that with a translation table is to use a recursive CTE to do the replacements, plus one more CTE to keep only the last row of each translated value.
First, create and populate a sample table (please save us this step in your future questions):
DECLARE @T AS TABLE
(
    id int,
    content nvarchar(100)
)
INSERT INTO @T (id, content) VALUES
(1, 'Hellö world, I äm text'),
(2, 'ènd there äré many more chars'),
(3, 'that are speçial in my dat£base')
Then, create and populate the translation table (I don't know the HTML entities for these chars, so I've just used numbers [plus it's easier to see in the results]). Also, please note that this can be done using yet another CTE in the chain.
DECLARE @Translations AS TABLE
(
    str nchar(1),
    replacement nvarchar(10)
)
INSERT INTO @Translations (str, replacement) VALUES
('ö', '-1-'),
('ä', '-2-'),
('è', '-3-'),
('ä', '-4-'),
('é', '-5-'),
('ç', '-6-'),
('£', '-7-')
Now, the first CTE does the replacements, and the second CTE just adds a ROW_NUMBER so that, for each id, the row with the highest lvl gets rn = 1:
;WITH CTETranslations AS
(
    SELECT id, content, 1 AS lvl
    FROM @T
    UNION ALL
    SELECT id, CAST(REPLACE(content, str, replacement) AS nvarchar(100)), lvl+1
    FROM CTETranslations
    JOIN @Translations
        ON content LIKE '%' + str + '%'
), cteNumberedTranslation AS
(
    SELECT id, content, ROW_NUMBER() OVER(PARTITION BY Id ORDER BY lvl DESC) rn
    FROM CTETranslations
)
Select from the second CTE where rn = 1. I've joined the original table to show the source and translation side by side:
SELECT r.id, s.content, r.content
FROM @T s
JOIN cteNumberedTranslation r
    ON s.Id = r.Id
WHERE rn = 1
ORDER BY Id
Results:
id  content                          content
1   Hellö world, I äm text           Hell-1- world, I -4-m text
2   ènd there äré many more chars    -3-nd there -4-r-5- many more chars
3   that are speçial in my dat£base  that are spe-6-ial in my dat-7-base
Please note that if your content has more than 100 special chars, you will need to add the MAXRECURSION 0 hint to the final select:
SELECT r.id, s.content, r.content
FROM @T s
JOIN cteNumberedTranslation r
    ON s.Id = r.Id
WHERE rn = 1
ORDER BY Id
OPTION ( MAXRECURSION 0 );
See a live demo on rextester.
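If the exported files can be post-processed outside SQL Server, the entity replacement itself is short in a scripting language. The following is a minimal Python sketch, offered as an alternative rather than as part of the answer above; `to_html_entities` is a made-up name, and the sketch only handles non-ASCII characters (it does not escape `&`, `<` or `>`).

```python
from html.entities import codepoint2name


def to_html_entities(text):
    """Replace non-ASCII characters with named or numeric HTML entities."""
    out = []
    for ch in text:
        code = ord(ch)
        if code < 128:
            # Plain ASCII passes through unchanged
            out.append(ch)
        elif code in codepoint2name:
            # Named entity from the standard library table, e.g. ö -> &ouml;
            out.append(f"&{codepoint2name[code]};")
        else:
            # Numeric character reference as a fallback
            out.append(f"&#{code};")
    return "".join(out)


converted = to_html_entities("Hellö world, I äm text")
```

The lookup table plays the same role as the translation table in the answer, except that the standard library already ships the character-to-entity mapping.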

SQL Server Regular expression extract pattern from DB column

I have a question about SQL Server: I have a database column with a pattern which is like this:
1. up to 10 digits
2. then a comma
3. up to 10 digits
4. then a semicolon
e.g.
100000161, 100000031; 100000243, 100000021;
100000161, 100000031; 100000243, 100000021;
and I want to extract within the pattern the first digits (up to 10) (1.) and then a semicolon (4.)
(or, in other words, remove everything from the semicolon to the next semicolon)
100000161; 100000243; 100000161; 100000243;
Can you please advise me how to do this in SQL Server? I'm not very familiar with regex and therefore have no clue how to approach this.
Thanks,
Alex
Try this
DECLARE @Sql TABLE (SqlCol nvarchar(max))
INSERT INTO @Sql
SELECT '100000161,100000031;100000243,100000021;100000161,100000031;100000243,100000021;'
;WITH cte
AS (SELECT Row_number()
             OVER(
               ORDER BY (SELECT NULL)) AS Rno,
           split.a.value('.', 'VARCHAR(1000)') AS Data
    FROM (SELECT Cast('<S>'
                      + Replace(Replace(sqlcol, ';', ','), ',', '</S><S>')
                      + '</S>' AS XML) AS Data
          FROM @Sql) AS A
         CROSS APPLY data.nodes('/S') AS Split(a))
-- keep only the odd-numbered, non-empty tokens and glue them back together
SELECT Stuff((SELECT '; ' + data
              FROM cte
              WHERE rno % 2 <> 0
                AND data <> ''
              FOR XML PATH ('')), 1, 2, '') AS ExpectedData
ExpectedData
-------------
100000161; 100000243; 100000161; 100000243
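The logic of that query (turn every separator into a comma, split, keep every other token, join again) may be easier to follow in a few lines of procedural code. This is a minimal Python sketch of the same idea, with a made-up function name; it is an illustration, not a replacement for the T-SQL.

```python
def keep_first_of_each_pair(value):
    """Keep the first number of every 'a,b;' pair and join them with '; '."""
    # Same trick as the query: treat ';' as just another ','
    tokens = value.replace(";", ",").split(",")
    # rno % 2 <> 0 in the query (1-based) corresponds to even 0-based indexes
    kept = [t.strip() for i, t in enumerate(tokens) if i % 2 == 0 and t.strip()]
    return "; ".join(kept)


result = keep_first_of_each_pair(
    "100000161,100000031;100000243,100000021;"
    "100000161,100000031;100000243,100000021;"
)
```

Like the query, this depends on the tokens strictly alternating between "keep" and "drop", so a missing number anywhere in the string would shift everything after it.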
I believe this will get you what you are after as long as that pattern truly holds. If not it's fairly easy to ensure it does conform to that pattern and then apply this
Select Substring(TargetCol, 1, 10) + ';' From TargetTable
You can take advantage of SQL Server's XML support to convert the input string into an XML value and query it with XQuery and XPath expressions.
For example, the following query replaces each ; with </b><a> and each , with </a><b>, which turns each string into <a>100000161</a><b>100000031</b><a>100000243</a><b>100000021</b><a></a>. The .query('a') call then keeps only the <a> elements, i.e. <a>100000161</a><a>100000243</a><a />. After that, you can select individual <a> nodes with /a[1], /a[2]:
declare @table table (it nvarchar(200))
insert into @table values
('100000161, 100000031; 100000243, 100000021;'),
('100000161, 100000031; 100000243, 100000021;')
select
    xCol.value('/a[1]','nvarchar(200)'),
    xCol.value('/a[2]','nvarchar(200)')
from (
    select convert(xml, '<a>'
        + replace(replace(replace(it,';','</b><a>'),',','</a><b>'),' ','')
        + '</a>')
        .query('a') as xCol
    from @table) as tmp
-------------------------
A1          A2
100000161   100000243
100000161   100000243
value extracts a single value from an XML field. nodes returns a table of nodes that match the XPath expression. The following query will return all "keys":
select
    a.value('.','nvarchar(200)')
from (
    select convert(xml, '<a>'
        + replace(replace(replace(it,';','</b><a>'),',','</a><b>'),' ','')
        + '</a>')
        .query('a') as xCol
    from @table) as tmp
cross apply xCol.nodes('a') as y(a)
where a.value('.','nvarchar(200)')<>''
------------
100000161
100000243
100000161
100000243
With 200K rows of data, though, I'd seriously consider transforming the data when loading it and storing it in individual, indexable columns, or adding a separate, related table. Applying string manipulation functions to a column means that the server can't use any covering indexes to speed up queries.
If that's not possible (why?), I'd at least consider adding a separate XML-typed column that contains the same data in XML form, to allow the creation of an XML index.
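If the data is transformed while it is being loaded, the pattern from the question is straightforward to express as a regular expression in client code. A minimal Python sketch, assuming the four-part pattern (digits, comma, digits, semicolon) holds; the names `PAIR` and `first_numbers` are made up for this example.

```python
import re

# up to 10 digits, a comma, up to 10 digits, a semicolon (whitespace tolerated)
PAIR = re.compile(r"(\d{1,10})\s*,\s*\d{1,10}\s*;")


def first_numbers(value):
    """Return only the first number of each pair, each followed by ';'."""
    return " ".join(f"{m.group(1)};" for m in PAIR.finditer(value))


cleaned = first_numbers("100000161, 100000031; 100000243, 100000021;")
```

Text that does not match the pattern is silently skipped here; a stricter loader might prefer to reject such rows.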

SQL - Separate string into columns

I am bulk inserting a csv file into SQL Server 2012. The data is currently | pipe delimited as one long string for each row. I'd like to separate the data into the different columns at each pipe.
Here is how the data looks as its imported:
ID|ID2|Person|Person2|City|State
"1"|"ABC"|"Joe"|"Ben"|"Boston"|"MA"
"2"|"ABD"|"Jack"|"Tim"|"Nashua"|"NH"
"3"|"ADC"|"John"|"Mark"|"Hartford"|"CT"
I'd like to separate the data into the columns at each pipe:
ID ID2 Person Person2 City State
1 ABC Joe Ben Boston MA
2 ABD Jack Tim Nashua NH
3 AFC John Mark Hartford CT
I'm finding it difficult to use the CHARINDEX and SUBSTRING functions because of the number of columns in the data. I've also tried PARSENAME, since that is available in 2012, but that's not working either: all the columns come out as NULL.
The file contains about 300k rows. I've found a solution using XML (xmlname below), but it is very slow, i.e. it takes a minute to separate the data.
Here's the slow xml solution:
CREATE TABLE #tbl(iddata varchar(200))
DECLARE @i int = 0
-- each iteration adds three sample rows (300k rows in total)
WHILE @i < 100000
BEGIN
    SET @i = @i + 1
    INSERT INTO #tbl(iddata)
    SELECT '"1"|"ABC"|"Joe"|"Ben"|"Boston"|"MA"'
    UNION ALL
    SELECT '"2"|"ABD"|"Jack"|"Tim"|"Nashua"|"NH"'
    UNION ALL
    SELECT '"3"|"AFC"|"John"|"Mark"|"Hartford"|"CT"'
END
;WITH XMLData
AS
(
SELECT idData,
CONVERT(XML,'<IDs><id>'
+ REPLACE(iddata,'|', '</id><id>') + '</id></IDs>') AS xmlname
FROM (
SELECT REPLACE(iddata,'"','') as iddata
FROM #tbl
)x
)
SELECT xmlname.value('/IDs[1]/id[1]','varchar(100)') AS ID,
xmlname.value('/IDs[1]/id[2]','varchar(100)') AS ID2,
xmlname.value('/IDs[1]/id[3]','varchar(100)') AS Person,
xmlname.value('/IDs[1]/id[4]','varchar(100)') AS Person2,
xmlname.value('/IDs[1]/id[5]','varchar(100)') AS City,
xmlname.value('/IDs[1]/id[6]','varchar(100)') AS State
FROM XMLData
This will do it for you.
CREATE TABLE #Import (
ID NVARCHAR(MAX),
ID2 NVARCHAR(MAX),
Person NVARCHAR(MAX),
Person2 NVARCHAR(MAX),
City NVARCHAR(MAX),
State NVARCHAR(MAX))
SET QUOTED_IDENTIFIER OFF
BULK INSERT #Import
FROM 'C:\MyFile.csv'
WITH
(
FIRSTROW = 2,
FIELDTERMINATOR = '|',
ROWTERMINATOR = '\n',
ERRORFILE = 'C:\myRubbishData.log'
)
select * from #Import
DROP TABLE #Import
Unfortunately using BULK INSERT will not deal with text qualifiers, so you will end up with "ABC" rather than ABC.
Either remove the text qualifiers from the csv file, or run a replace on your table once the data has been imported.
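Removing the text qualifiers from the file before the import can be done with a small script. The following is a minimal Python sketch using the standard `csv` module; the function name `strip_qualifiers` and the in-memory sample are assumptions for illustration, and real code would open the input and output files instead.

```python
import csv
import io

sample = (
    'ID|ID2|Person|Person2|City|State\n'
    '"1"|"ABC"|"Joe"|"Ben"|"Boston"|"MA"\n'
    '"2"|"ABD"|"Jack"|"Tim"|"Nashua"|"NH"\n'
)


def strip_qualifiers(src, dst):
    """Copy pipe-delimited rows from src to dst without the double quotes."""
    reader = csv.reader(src, delimiter="|", quotechar='"')
    # QUOTE_NONE writes fields bare; it raises csv.Error if a field
    # itself contains a pipe, which is a useful early warning here.
    writer = csv.writer(dst, delimiter="|", quoting=csv.QUOTE_NONE,
                        lineterminator="\n")
    for row in reader:
        writer.writerow(row)


out = io.StringIO()
strip_qualifiers(io.StringIO(sample), out)
cleaned = out.getvalue()
```

The cleaned output keeps the pipe delimiter, so the BULK INSERT statement above can stay as it is.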
To save you the pain and misery of having to deal with pipes, I would strongly recommend that you process your input file to convert those pipes into commas, and then use SQL Server's built-in capacity to parse CSV into a table.
If you are using Java, replacing the pipes would literally take just one line of code:
String line = "\"1\"|\"ABC\"|\"Joe\"|\"Ben\"|\"Boston\"|\"MA\"";
// String.replace treats "|" literally; replaceAll would parse it as a regex alternation
line = line.replace("|", ",");
// then write this line back out to file
BULK INSERT YourTable
FROM 'input.csv'
WITH
(
FIRSTROW = 2,
FIELDTERMINATOR = ',',
ROWTERMINATOR = '\n',
ERRORFILE = 'C:\CSVDATA\SchoolsErrorRows.csv',
TABLOCK
)
If you cannot work with the BULK suggestions (due to rights) you might speed your query up by about 30% with this:
SELECT AsXml.value('x[1]','varchar(100)') AS ID
,AsXml.value('x[2]','varchar(100)') AS ID2
,AsXml.value('x[3]','varchar(100)') AS Person
,AsXml.value('x[4]','varchar(100)') AS Person2
,AsXml.value('x[5]','varchar(100)') AS City
,AsXml.value('x[6]','varchar(100)') AS State
FROM #tbl
CROSS APPLY(SELECT CAST('<x>' + REPLACE(SUBSTRING(iddata,2,LEN(iddata)-2),'"|"','</x><x>') + '</x>' AS XML)) AS a(AsXml)

Converting Rows To Columns With Unknown Number Of Elements

I am trying to achieve this:
Initial table:
PARM1 |PARM2 |DATE
-------------------
VALUE1|VALUE2|DATE1
VALUE3|VALUE4|DATE2
Final result:
PARM |DATE1 |DATE2 |...
-----------------------
PARM1|VALUE1|VALUE3|...
PARM2|VALUE2|VALUE4|...
Briefly, I want to convert my parameter names into lines and to have a column for every date, where the cells contain the parameter values for the date and parameter.
So far, I managed to get this:
SELECT *
FROM
(
    SELECT [Parameter], [DATE], [VALUE]
    FROM
    (
        SELECT [DATE], PARM1, PARM2 FROM PARAMETER_VALUES
    ) SOURCE_TABLE
    UNPIVOT
    (
        [VALUE] FOR [Parameter] IN (PARM1, PARM2)
    ) UNPIVOTED_TABLE
) T
The problem is, I can't PIVOT the results now, because I don't know how many DATEs there are. I want it to be dynamic.
Is it possible?
In short, you can't use the PIVOT command with unknown columns.
Your only option is to retrieve the data and reformat, using dynamic SQL or some kind of front end.
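The "retrieve the data and reformat" route can be quite small in front-end code. Below is a minimal Python sketch, assuming the rows arrive as dictionaries keyed by column name and that each date appears only once; `pivot_by_date` is a made-up name.

```python
def pivot_by_date(rows, parameters):
    """Turn one row per date into one row per parameter, one key per date."""
    dates = []
    for row in rows:
        if row["DATE"] not in dates:
            # Keep the dates in first-seen order for the column headers
            dates.append(row["DATE"])
    pivoted = []
    for parm in parameters:
        line = {"PARM": parm}
        for row in rows:
            line[row["DATE"]] = row[parm]
        pivoted.append(line)
    return dates, pivoted


rows = [
    {"PARM1": "VALUE1", "PARM2": "VALUE2", "DATE": "DATE1"},
    {"PARM1": "VALUE3", "PARM2": "VALUE4", "DATE": "DATE2"},
]
dates, pivoted = pivot_by_date(rows, ["PARM1", "PARM2"])
```

Because the date columns are discovered at run time, nothing here needs to know in advance how many dates exist.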
You can pivot using dynamic columns if you build the column list for the pivot beforehand.
SELECT @listColYouwantInPivot = STUFF((SELECT DISTINCT '], [' + [columnName]
                                       FROM tableName
                                       FOR XML PATH('')
                                      ), 1, 2, '') + ']'
Then concatenate @listColYouwantInPivot into the text of the PIVOT statement and run the result as dynamic SQL.
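The same "build the column list first, then concatenate it into the statement" idea can be sketched in client code. This is a minimal Python illustration that only assembles the T-SQL text; the helper names are made up, the table and column names come from the question, and the choice of MAX as the aggregate is an assumption.

```python
def quote_name(name):
    """Bracket-quote an identifier, doubling any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def build_pivot_sql(dates):
    """Assemble a PIVOT statement with one output column per date."""
    cols = ", ".join(quote_name(d) for d in dates)
    lines = [
        "SELECT [Parameter], " + cols,
        "FROM (",
        "    SELECT [DATE], [Parameter], [VALUE]",
        "    FROM (SELECT [DATE], PARM1, PARM2 FROM PARAMETER_VALUES) AS src",
        "    UNPIVOT ([VALUE] FOR [Parameter] IN (PARM1, PARM2)) AS u",
        ") AS t",
        # Assumption: one value per parameter and date, so MAX just picks it
        "PIVOT (MAX([VALUE]) FOR [DATE] IN (" + cols + ")) AS p",
    ]
    return "\n".join(lines)


sql = build_pivot_sql(["DATE1", "DATE2"])
```

The date list should come from a query against the table itself rather than from user input, since it ends up inside the statement text.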

comparing a column to a list of values in t-sql

I am displaying records on a page, and I need a way for the user to select a subset of those records to be displayed on another page. These records aren't stored anywhere; they are dynamically generated.
What is the best way in SQL to say "where the unique ID is in this list of IDs" when the list is not in a table? I know I could dynamically construct the SQL with a bunch of ORs, but that seems like a hack. Does anyone have any suggestions?
this is the best source:
http://www.sommarskog.se/arrays-in-sql.html
create a split function, and use it like:
SELECT
    *
FROM YourTable y
INNER JOIN dbo.splitFunction(@Parameter) s ON y.ID=s.Value
I prefer the number table approach
For this method to work, you need to do this one time table setup:
SELECT TOP 10000 IDENTITY(int,1,1) AS Number
INTO Numbers
FROM sys.objects s1
CROSS JOIN sys.objects s2
ALTER TABLE Numbers ADD CONSTRAINT PK_Numbers PRIMARY KEY CLUSTERED (Number)
Once the Numbers table is set up, create this function:
CREATE FUNCTION [dbo].[FN_ListToTable]
(
     @SplitOn char(1)      --REQUIRED, the character to split the @List string on
    ,@List    varchar(8000)--REQUIRED, the list to split apart
)
RETURNS TABLE
AS
RETURN
(
    ----------------
    --SINGLE QUERY-- --this will not return empty rows
    ----------------
    SELECT
        ListValue
    FROM (SELECT
              LTRIM(RTRIM(SUBSTRING(List2, number+1, CHARINDEX(@SplitOn, List2, number+1)-number - 1))) AS ListValue
          FROM (
                   SELECT @SplitOn + @List + @SplitOn AS List2
               ) AS dt
          INNER JOIN Numbers n ON n.Number < LEN(dt.List2)
          WHERE SUBSTRING(List2, number, 1) = @SplitOn
         ) dt2
    WHERE ListValue IS NOT NULL AND ListValue!=''
);
GO
GO
You can now easily split a CSV string into a table and join on it:
select * from dbo.FN_ListToTable(',','1,2,3,,,4,5,6777,,,')
OUTPUT:
ListValue
-----------------------
1
2
3
4
5
6777
(6 row(s) affected)
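How the function finds the items may be clearer when written procedurally: wrap the list in delimiters, visit every position that holds a delimiter, and take the text up to the next delimiter. The following is a minimal Python sketch of that logic, not a translation that has been validated against the T-SQL in every edge case; `list_to_table` is a made-up name.

```python
def list_to_table(split_on, values):
    """Mimic the Numbers-table split: return the non-empty, trimmed items."""
    wrapped = split_on + values + split_on        # plays the role of List2
    result = []
    # Positions are 0-based here; the Numbers table is 1-based.
    # The final delimiter is excluded, like n.Number < LEN(List2).
    for number in range(len(wrapped) - 1):
        if wrapped[number] != split_on:
            continue
        # Next delimiter after this one, like CHARINDEX(..., number+1)
        end = wrapped.index(split_on, number + 1)
        item = wrapped[number + 1:end].strip()
        if item:
            result.append(item)
    return result


items = list_to_table(",", "1,2,3,,,4,5,6777,,,")
```

In the T-SQL version the loop is replaced by the join to the Numbers table, which is what lets the whole split run as a single set-based query.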
You can pass a CSV string into a procedure and process only the rows for the given IDs:
SELECT
    y.*
FROM YourTable y
INNER JOIN dbo.FN_ListToTable(',',@GivenCSV) s ON y.ID=s.ListValue
You can use the solution Joel Spolsky recently gave for this problem.
SELECT * FROM MyTable
WHERE ',' + 'comma,separated,list,of,words' + ','
LIKE '%,' + MyTable.word + ',%';
That solution is clever but slow. The better solution is to split the comma-separated string, and construct a dynamic SQL query with the IN() predicate, adding a query parameter placeholder for each element in your list of values:
SELECT * FROM MyTable
WHERE word IN ( ?, ?, ?, ?, ?, ?, ?, ? );
The number of placeholders is what you have to determine when you split your comma-separated string. Then pass one value from that list per parameter.
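As a rough sketch of the placeholder approach, the following Python example splits the list, generates one `?` per value, and passes the values as parameters. SQLite (from the standard library) stands in for the real database purely so the example is runnable; the function name and the sample rows are made up.

```python
import sqlite3


def select_words(conn, csv_list):
    """Return the words in MyTable that also appear in the CSV list."""
    values = [v.strip() for v in csv_list.split(",") if v.strip()]
    if not values:
        # An empty IN () list is not valid SQL, so short-circuit
        return []
    # One placeholder per value; the values never enter the SQL text
    placeholders = ", ".join("?" for _ in values)
    sql = f"SELECT word FROM MyTable WHERE word IN ({placeholders}) ORDER BY word"
    return [row[0] for row in conn.execute(sql, values)]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (word TEXT)")
conn.executemany("INSERT INTO MyTable VALUES (?)",
                 [("comma",), ("list",), ("other",)])
found = select_words(conn, "comma,separated,list,of,words")
```

Only the number of placeholders is built dynamically, so the statement stays parameterized even though its text varies with the length of the list.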
If you have too many values in the list and making a long IN() predicate is unwieldy, then insert the values to a temporary table, and JOIN against your main table:
CREATE TEMPORARY TABLE TempTableForSplitValues (word VARCHAR(20));
...split your comma-separated list and INSERT each value to a separate row...
SELECT * FROM MyTable JOIN TempTableForSplitValues USING (word);
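The temporary-table variant can be sketched the same way. This Python example again uses SQLite only so that it runs as written; the statements mirror the generic SQL above, whereas SQL Server would use a `#`-prefixed temp table and `JOIN ... ON`, since T-SQL has no `USING` clause. The function name and sample rows are made up.

```python
import sqlite3


def select_via_temp_table(conn, csv_list):
    """Load the split values into a temp table and join against MyTable."""
    values = [(v.strip(),) for v in csv_list.split(",") if v.strip()]
    conn.execute(
        "CREATE TEMPORARY TABLE TempTableForSplitValues (word VARCHAR(20))"
    )
    try:
        # One row per list element
        conn.executemany(
            "INSERT INTO TempTableForSplitValues (word) VALUES (?)", values
        )
        cur = conn.execute(
            "SELECT word FROM MyTable "
            "JOIN TempTableForSplitValues USING (word) ORDER BY word"
        )
        return [row[0] for row in cur]
    finally:
        # Clean up so the function can be called again on the same connection
        conn.execute("DROP TABLE TempTableForSplitValues")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (word TEXT)")
conn.executemany("INSERT INTO MyTable VALUES (?)",
                 [("comma",), ("list",), ("other",)])
matched = select_via_temp_table(conn, "comma,separated,list,of,words")
```

Duplicate values in the list would produce duplicate rows in the join, so de-duplicating before the insert may be worthwhile.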
Also see many other similar questions on SO, including:
Dynamic SQL Comma Delimited Value Query
Passing a varchar full of comma delimited values to a SQL Server IN function
Parameterized Queries with Like and In
Parameterizing a SQL IN clause?
