TSQL - Split delimited String into columns - sql-server

The Problem
I have a number of filename strings that I want to parse into columns using a tilda as delimiter. The strings take on the static format:
Filepath example C:\My Documents\PDF
Surname example Walker
First Name example Thomas
Birth Date example 19991226
Document Created Datetime example 20180416150322
Document Extension example .pdf
So a full concatenated example would be something like:
C:\My Documents\PDF\Walker~Thomas~19991226~20180416150322.pdf
I want to ignore the file path and extension given in the string and only parse the following values into columns:
Surname, First Name, Birth Date, Document Created Datetime
So something like:
SELECT Surname = --delimitedString[0]
FirstName = --delimitedString[1]
--etc.
What I have tried
I know that I have several tasks I would need to perform in order to split the string, first I would need to trim off the extension and file path so that I can return a string delimited by tildas (~).
This is problem one for me, however problem 2 is splitting the new delimted string itself i.e.
Walker~Thomas~19991226~20180416150322
Ive had a good read through this very comprehensive question and It seems (as im using SQL Server 2008R2) the only options are to use either a function with loops or recursive CTE's or attempt a very messy attempt using SUBSTRING() with charIndex().
Im aware that If I had access to SQL Server 2016 I could use string_split but unfortunately I cant upgrade.
I do have access to SSIS but im very new to it so decided to attempt the bulk of the work within a SQL statement

Here is a way without a splitter that shouldn't be too complicated...
declare #var table (filepath varchar(256))
insert into #var values
('C:\My Documents\PDF\Walker~Thomas~19991226~20180416150322.pdf')
;with string as(
select
x = right(filepath,charindex('\',reverse(filepath))-1)
from #var
)
select
SurName= substring(x,1,charindex('~',x) - 1)
,FirstName = substring(x,charindex('~',x) + 1,charindex('~',x) - 1)
from string

I know you mentioned wanting to avoid the charindex() option if at all possible, but I worked it out in a hopefully semi-readable way. I find it somewhat easy to read complex functions like this when I space each parameter on a different line and use indent levels. It's not the most proper looking, but it helps with legibility:
with string as (select 'C:\My Documents\PDF\Walker~Thomas~19991226~20180416150322.pdf' as filepath)
select
substring(
filepath,
len(filepath)-charindex('\',reverse(filepath))+2, --start location, after last '\'
len(filepath)- --length of path
(len(filepath)-charindex('\',reverse(filepath))+2)- --less characters up to last '\'
(len(filepath)-charindex('.',filepath)) --less file extention
)
from string

Fritz already have a great start, my answer just add on top it
with string as (select 'C:\My Documents\PDF\Walker~Thomas~19991226~20180416150322.pdf' as filepath)
, newstr as (
select
REPLACE(substring(
filepath,
len(filepath)-charindex('\',reverse(filepath))+2, --start location, after last '\'
len(filepath)- --length of path
(len(filepath)-charindex('\',reverse(filepath))+2)- --less characters up to last '\'
(len(filepath)-charindex('.',filepath)) --less file extention
) , '~', '.') as new_part
from string
)
SELECT
PARSENAME(new_part,4) as Surname,
PARSENAME(new_part,3) as [First Name],
PARSENAME(new_part,2) as [Birth Date],
PARSENAME(new_part,1) as [Document Created Datetime]
FROM newstr

Related

How can I pull out multiple substrings from within a single string

For example, I have this string in an SSMS DB:
Sent for processing:1 ;DK A/C-WestlySnipes:ACCT NOT FOUND ;DK A/C-SonyaBlade:ACCT NOT FOUND
What I want to be able to do is pull out WestleySnipes and SonyaBlade from within that string and store it in a temp table. The names can be different and there can be more than one name within a particular string.
I've tried using substrings combined with CharIndex but I can only pull out 1 value. Not sure how to pull out multiple names from within the same string where the names can change in any given string.
In SQL Server 2016 and later, STRING_SPLIT(string, delimiter) is a table valued function. You can split your text string with a ; delimiter like so. (fiddle).
SELECT value FROM STRING_SPLIT(
'Sent for processing:1 ;DK A/C-WestlySnipes:ACCT NOT FOUND ;DK A/C-SonyaBlade:ACCT NOT FOUND',
';')
You get back this:
value
Sent for processing:1
DK A/C-WestlySnipes:ACCT NOT FOUND
DK A/C-SonyaBlade:ACCT NOT FOUND
Then you can use ordinary string-editing functions on value in your query to extract the precise substrings you need. Something like this will work for your specific example. (fiddle.)
SELECT
value,
REPLACE(REPLACE(value, 'DK A/C-', ''), ':ACCT NOT FOUND', '') val
FROM STRING_SPLIT(
'Sent for processing:1 ;DK A/C-WestlySnipes:ACCT NOT FOUND ;DK A/C-SonyaBlade:ACCT NOT FOUND',
';')
WHERE value LIKE 'DK %'

TSQL - Replace string before symbol

I have a table storing pathes to files on sql server. I need to replace the path before the last backslash:
C:\Users\APP\AppData\Local\Temp\test\abc deg.pdf
to for example:
\app\pp\abc deg.pdf
EDIT: The table containts many pathes - I need to run through the whole table and change all pathes.
You can use:
CHARINDEX('\', REVERSE(#str))
to get the index of the first backslash starting from the end.
Using RIGHT with this index you can extract the string after the last backslash and concatenate it to the new path:
DECLARE #str VARCHAR(50) = 'C:\Users\APP\AppData\Local\Temp\test\abc deg.pdf'
SELECT '\app\pp' + RIGHT(#str, CHARINDEX('\', REVERSE(#str)))
Reverse the input string (using REVERSE) and find the index of the first backslash (using CHARINDEX).
Take the left part up to that index (using LEFT) and concatenate with the reverse of your replacement string (using + operator).
Then reverse that to get your final result.
Try this:
declare #a varchar(max)='C:\Users\APP\AppData\Local\Temp\test\abc deg.pdf'
select REPLACE(#a,SUBSTRING(#a,1,(LEN(#a)-charindex('\',reverse(#a),1))),'\app\pp')
Update: For updatingall the table column values.
select REPLACE([column-name],SUBSTRING([column-name],1,(LEN([column-name])-charindex('\',reverse([column-name]),1))),'\app\pp')
FROM [Your-table]
This is how it can be done.
UPDATE TABLE
SET PATH = REPLACE(PATH, 'C:\Users\APP\AppData\Local\Temp\test', '\app\pp')
WHERE ...
This is going to replace 'C:\Users\APP\AppData\Local\Temp\test' with '\app\pp'. Or you can modify the path as required.
Please test before executing this UPDATE statement. I havent specified filters here

Concat the values in a string with SQL Server

I want to select a list of items and part numbers for for each item as a string:
SELECT top 100 *
FROM ii
OUTER APPLY
(SELECT def, ( ipr.part_number + ',') as prt
FROM ipr
WHERE ii.item_id = ipr.item_id
FOR XML PATH('') ) PN
The error is:
[Error Code: 8155, SQL State: S0002] No column name was specified for
column 1 of 'PN'.
How can I fix this?
I think that your whole OUTER APPLY statement generates one XML for both default_part_number and concatenated string, which(the whole XML) doesn't have a name.
What you could try to do would be adding alias like this AS PN(TestThis).
However, I don't think that you're expecting result you're going to get. It would better if you'd give us some example data and expected output. It will be easier for us to solve your problem in that case.
The combination of XML and STUFF is funny but perfectly fitting to your needs.
First you concat your strings with the ', ' in front, then you must return your XML with ", TPYE). You must read the result with ".value()" and use STUFF to replace the first ', '.
You'll find a lot of exampels in the net...

SQL Server String extract based on pattern

I have string data in the following format:
MODELNUMBER=Z12345&HELLOWORLD=WY554&GADTYPE=PLA&ID=Z-12345
/DTYPE=PLA&ID=S-10758&UN_JTT_REDIRECT=UN_JTT_IOSV
and need to extract IDs based on two conditions
Starting after a pattern &ID=
Ending till the last character or
if it hits a & stop right there.
So in the above example I'm using the following code:
SUBSTRING(MyCol,(PATINDEX('%&id=%',[MyCol])+4),(LEN(MyCol) - PATINDEX('%&id%',[MyCol])))
Essentially looking the pattern &id=% and extract string after that till end of the line. Would anyone advise on how to handle the later part of the logic ..
My current results are
Z-12345
Z-12345&UN_JTT_REDIRECT=UN_JTT_IOSV
What I need is
Z-12345
Z-12345
Try this
SUBSTRING(MyCol, (PATINDEX('%[A-Z]-[0-9][0-9][0-9][0-9][0-9]%',[MyCol])),7)
if you run into performance issues add the where clause below
-- from Mytable
WHERE [MyCol] like '%[A-Z]-[0-9][0-9][0-9][0-9][0-9]%'
maybe not the most elegant solution but it works for me.
Correct syntax of PATINDEX
Here's one example how to do it:
select
substring(d.data, s.s, isnull(nullif(e.e,0),2000)-s.s) as ID,
d.data
from data d
cross apply (
select charindex('&ID=', d.data)+4 as s
) s
cross apply (
select charindex('&', d.data, s) as e
) e
where s.s > 4
This assumes there data column is varchar(2000) and the where clause leaves out any rows that don't have &ID=
The first cross apply searches for the start position, the second one for the end. The isnull+nulliff in the actual select handles the case where & is not found and replaces it with 2000 to make sure the whole string is returned.

Building dynamic query for Sql Server 2008 when table name contains " ' "

I need to fetch Table's TOP_PK, IDENT_CURRENT, IDENT_INCR, IDENT_SEED for which i am building dynamic query as below:
sGetSchemaCommand = String.Format("SELECT (SELECT TOP 1 [{0}] FROM [{1}]) AS TOP_PK, IDENT_CURRENT('[{1}]') AS CURRENT_IDENT, IDENT_INCR('[{1}]') AS IDENT_ICREMENT, IDENT_SEED('[{1}]') AS IDENT_SEED", pPrimaryKey, pTableName)
Here pPrimaryKey is name of Table's primary key column and pTableName is name of Table.
Now, i am facing problem when Table_Name contains " ' " character.(For Ex. KIN'1)
When i am using above logic and building query it would be as below:
SELECT (SELECT TOP 1 [ID] FROM [KIL'1]) AS TOP_PK, IDENT_CURRENT('[KIL'1]') AS CURRENT_IDENT, IDENT_INCR('[KIL'1]') AS IDENT_ICREMENT, IDENT_SEED('[KIL'1]') AS IDENT_SEED
Here, by executing above query i am getting error as below:
Incorrect syntax near '1'.
Unclosed quotation mark after the character string ') AS IDENT_SEED'.
So, can anyone please show me the best way to solve this problem?
Escape a single quote by doubling it: KIL'1 becomes KIL''1.
If a string already has adjacent single quotes, two becomes four, or four becomes eight... it can get a little hard to read, but it works :)
Using string methods from .NET, your statement could be:
sGetSchemaCommand = String.Format("SELECT (SELECT TOP 1 [{0}] FROM [{1}]) AS TOP_PK, IDENT_CURRENT('[{2}]') AS CURRENT_IDENT, IDENT_INCR('[{2}]') AS IDENT_ICREMENT, IDENT_SEED('[{2}]') AS IDENT_SEED", pPrimaryKey, pTableName, pTableName.Replace("'","''"))
EDIT:
Note that the string replace is now only on a new, third substitution string. (I've taken out the string replace for pPrimaryKey, and for the first occurrence of pTableName.) So now, single quotes are only doubled, when they will be within other single quotes.
You need to replace every single quote into two single quotes http://beyondrelational.com/modules/2/blogs/70/posts/10827/understanding-single-quotes.aspx

Resources