SQL Server string or VARCHAR manipulation containing numerics - sql-server

In SQL Server, I have VARCHAR values.
I need a view that automatically reformats the data.
Data that is stored in the following form:
hawthorn104freddy@hawthorn.com
scotland2samantha@gmail.com3
birmingham76roger@outlook.co.uk1905student
needs to be reformatted into the following:
hawthorn  104freddy@hawthorn.com0000
scotland  002samantha@gmail.com 0003
birmingham076roger@outlook.co.uk1905student
Reformatting rules:
Numeric values within the strings are left-padded with zeros to the length of the longest number in the same position.
All other characters are padded with trailing spaces so that the numbers line up.
Does anyone know how this is done?
Note: Bear in mind that a string may contain any combination of words and numbers.

You should split your values into four columns (to find the maximum length in each column), then add the leading zeros and trailing spaces, then concatenate the parts.
Here is code to split a value; I hope you will have no problems adding the zeros and spaces:
declare @v varchar(255) = 'hawthorg104freddy@hawthorn.com50'
select
FirstPart = left(@v, patindex('%[a-z][0-9]%', @v)),
SecondPart = substring(@v, patindex('%[0-9]%', @v), patindex('%[0-9][a-z]%', @v) - patindex('%[a-z][0-9]%', @v)),
ThirdPart = substring(@v, patindex('%[0-9][a-z]%', @v) + 1, len(@v) - patindex('%[0-9][a-z]%', @v) - patindex('%[0-9][a-z]%', reverse(@v))),
FourthPart = right(@v, patindex('%[0-9][a-z]%', reverse(@v)))
Notes:
patindex('%[a-z][0-9]%', @v) - position of the last letter in the leading word (nickname?)
patindex('%[0-9][a-z]%', @v) - position of the last digit in the first number (104)
patindex('%[0-9][a-z]%', reverse(@v)) - length of the last number
You can also use CLR and RegEx to split values to groups:
https://github.com/zzzprojects/Eval-SQL.NET/wiki/SQL-Server-Regex-%7C-Use-regular-expression-to-search,-replace-and-split-text-in-SQL
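For the padding step that the split above leaves open, here is a minimal sketch. It assumes the values are already split into four parts. The table variable @parts and its sample rows are made up for illustration, and the trailing "student" part of the third sample is left out to keep it short. Text parts are right-padded with SPACE, numeric parts are left-padded with REPLICATE('0', ...).

```sql
-- Hypothetical, already-split sample data (not from the original answer)
DECLARE @parts TABLE (
    FirstPart  varchar(50),
    SecondPart varchar(50),
    ThirdPart  varchar(50),
    FourthPart varchar(50)
);

INSERT INTO @parts (FirstPart, SecondPart, ThirdPart, FourthPart) VALUES
    ('hawthorn',   '104', 'freddy@hawthorn.com', ''),
    ('scotland',   '2',   'samantha@gmail.com',  '3'),
    ('birmingham', '76',  'roger@outlook.co.uk', '1905');

SELECT
      LEFT(p.FirstPart + SPACE(m.Len1), m.Len1)                 -- text: trailing spaces
    + RIGHT(REPLICATE('0', m.Len2) + p.SecondPart, m.Len2)      -- number: leading zeros
    + LEFT(p.ThirdPart + SPACE(m.Len3), m.Len3)
    + RIGHT(REPLICATE('0', m.Len4) + p.FourthPart, m.Len4) AS Reformatted
FROM @parts AS p
CROSS JOIN (
    -- Longest value in each position decides the padded width
    SELECT MAX(LEN(FirstPart))  AS Len1,
           MAX(LEN(SecondPart)) AS Len2,
           MAX(LEN(ThirdPart))  AS Len3,
           MAX(LEN(FourthPart)) AS Len4
    FROM @parts
) AS m;
```

With these sample rows the result should line up like the desired output in the question (minus the "student" suffix). In a view you would replace @parts with whatever query does the splitting.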

You can use PATINDEX:
declare @str varchar(100)='hawthorn104freddy@hawthorn.com'
SELECT SUBSTRING(@str,0,PATINDEX('%[0-9]%',@str)),
SUBSTRING(@str,PATINDEX('%[0-9]%',@str),LEN(@str)-LEN(SUBSTRING(@str,0,PATINDEX('%[0-9]%',@str))))

Related

SQL split a comma delimited string by fixed length

I have a string of comma-separated values that I am loading into a system with a maximum field length I need to abide by. Once the string reaches the maximum length, the remaining values should move to another column, with each value kept intact. For the sake of the example below, I only need to split the string into two columns.
For example my string value = val1,val2,val3,val4,val5
Max length of output fields = 15
Output should be two columns:
ValueList1      ValueList2
val1,val2,val3  val4,val5
I'm trying to do this in T-SQL, but it is not a problem I commonly need to solve and I am stumped. Any help would be greatly appreciated.
You can try this:
DECLARE @stringvalue VARCHAR(5000) = 'val1,val2,val3,val4,val5'
DECLARE @MaxLengthofOutputFields INT = 15
SELECT
CASE WHEN LEN(@stringvalue) > @MaxLengthofOutputFields
THEN LEFT(@stringvalue, @MaxLengthofOutputFields - CHARINDEX(',',REVERSE(LEFT(@stringvalue,@MaxLengthofOutputFields))))
ELSE @stringvalue END ,
CASE WHEN LEN(@stringvalue) > @MaxLengthofOutputFields THEN
SUBSTRING(@stringvalue, @MaxLengthofOutputFields - CHARINDEX(',',REVERSE(LEFT(@stringvalue,@MaxLengthofOutputFields))) + 2, LEN(@stringvalue))
END
Result:
-------------------- -------------
val1,val2,val3       val4,val5
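If the values live in a table rather than a variable, the same idea can be written once per row with CROSS APPLY so the cut position is not repeated. This is only a sketch: the table variable @t, its column ValueList and the second sample row are assumptions for illustration.

```sql
DECLARE @MaxLen int = 15;

-- Hypothetical sample table (not from the question)
DECLARE @t TABLE (ValueList varchar(5000));
INSERT INTO @t (ValueList) VALUES
    ('val1,val2,val3,val4,val5'),
    ('a,b');

SELECT
    CASE WHEN LEN(t.ValueList) > @MaxLen
         THEN LEFT(t.ValueList, c.CutAt)
         ELSE t.ValueList
    END AS ValueList1,
    CASE WHEN LEN(t.ValueList) > @MaxLen
         THEN SUBSTRING(t.ValueList, c.CutAt + 2, LEN(t.ValueList))
    END AS ValueList2
FROM @t AS t
CROSS APPLY (
    -- Position of the last character before the last comma
    -- found within the first @MaxLen characters
    SELECT @MaxLen - CHARINDEX(',', REVERSE(LEFT(t.ValueList, @MaxLen))) AS CutAt
) AS c;
```

The first row should split into val1,val2,val3 and val4,val5; the short row stays in ValueList1 with ValueList2 as NULL. Like the answer above, this assumes there is at least one comma within the first @MaxLen characters of any string that is too long.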

1+1=3? Space characters in nvarchar variables and string lengths

I've just stumbled upon this:
Why doesn't the following code:
DECLARE @s nvarchar(10) = N' '
PRINT CONCAT('#', @s, '#')
PRINT CONCAT('#', LEN(@s), '#')
result in either the output
##
#0#
or
# #
#1#
On SQL Server 2017, however, this code produces the output
# #
#0#
Which seems contradictory to me.
Either the string has the length 0 and is '' or the length 1 and is ' '.
The whole thing becomes even stranger if you add the following code:
DECLARE @s nvarchar(10) = N' '
PRINT CONCAT('#', @s, '#')
PRINT CONCAT('#', LEN(@s), '#')
DECLARE @l1 int = LEN(CONCAT('#', @s, '#'))
PRINT LEN(@s)
PRINT LEN('#')
PRINT @l1
Which outputs the following:
# #
#0#
0
1
3
So we have three substrings, one with length 0, two with length 1. The total string then has length 3? I'm confused.
If you fill @s with several spaces, it looks even funnier, e.g. 5 spaces result in this output:
#     #
#0#
0
1
7
So here 1×0 + 2×1 even equals 7. I wish my bank would calculate my account balance like this.
Can someone explain to me what's going on?
Many thanks for your help!
LEN
Returns the number of characters of the specified string expression,
excluding trailing spaces.
So LEN(' ') = 0 (only spaces), but LEN(' x') = 2 (no trailing spaces).
LEN excludes trailing spaces. If that is a problem, consider using the
DATALENGTH (Transact-SQL) function which does not trim the string. If
processing a unicode string, DATALENGTH will return twice the number
of characters.
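To see the difference side by side, here is a small sketch using five spaces as in the question. The column aliases are just illustrative names. The last column shows a common workaround: append a non-space character so the spaces are no longer trailing, then subtract 1.

```sql
DECLARE @s nvarchar(10) = N'     ';  -- five spaces

SELECT
    LEN(@s)              AS LenTrimmed,     -- 0: trailing spaces are not counted
    DATALENGTH(@s)       AS Bytes,          -- 10: nvarchar stores 2 bytes per character here
    DATALENGTH(@s) / 2   AS CharCount,      -- 5
    LEN(@s + N'#') - 1   AS LenViaSentinel; -- 5: the spaces are no longer trailing
```

This is also why LEN(CONCAT('#', @s, '#')) returns 7 in the question: inside the concatenated string the spaces sit between two '#' characters, so none of them are trailing.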

SQL: Fix for CSV import mistake

I have a database table with multiple columns populated with various numeric fields. While populating it from a CSV, I must have mucked up the assignment of the delimited fields. The end result is a column that contains its correct information but also the next column's data, separated by a comma.
So instead of column UPC1 containing "958634", it contains "958634,95877456". The "95877456" is supposed to be in the UPC2 column; instead, UPC2 is NULL.
Is there a way for me to split on the comma and send the data to UPC2 while keeping the UPC1 data before the comma intact?
Thanks.
You can do this with string functions. To query the values and verify the logic, try this:
SELECT
LEFT(UPC1, CHARINDEX(',', UPC1) - 1),
SUBSTRING(UPC1, CHARINDEX(',', UPC1) + 1, 1000)
FROM myTable;
If the result is what you want, turn it into an update:
UPDATE myTable SET
UPC1 = LEFT(UPC1, CHARINDEX(',', UPC1) - 1),
UPC2 = SUBSTRING(UPC1, CHARINDEX(',', UPC1) + 1, 1000);
The expression for UPC1 takes the left side of UPC1 up to one character before the comma.
The expression for UPC2 takes the remainder of the UPC1 string starting one character after the comma.
The third argument to SUBSTRING needs some explaining. It's the number of characters you want to include, starting at the start position (which in this case is one character after the comma's location). If you specify a value that's longer than the rest of the string, SUBSTRING will just return everything up to the end of the string. Using 1000 here is a lot easier than calculating the exact number of characters you need to get to the end.
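One caveat, as far as I can tell: for a row whose UPC1 has no comma, CHARINDEX returns 0, so LEFT(UPC1, -1) gets a negative length and the statement fails. A minimal sketch of a guarded version follows; the table variable @myTable and its rows stand in for the real table and are made up for illustration.

```sql
-- Hypothetical stand-in for myTable
DECLARE @myTable TABLE (UPC1 varchar(50), UPC2 varchar(50));
INSERT INTO @myTable (UPC1, UPC2) VALUES
    ('958634,95877456', NULL),      -- broken row from the CSV import
    ('123456',          '654321');  -- already correct, must stay untouched

UPDATE @myTable SET
    UPC1 = LEFT(UPC1, CHARINDEX(',', UPC1) - 1),
    UPC2 = SUBSTRING(UPC1, CHARINDEX(',', UPC1) + 1, 1000)
WHERE CHARINDEX(',', UPC1) > 0;     -- only touch rows that actually contain a comma

SELECT UPC1, UPC2 FROM @myTable;
```

Both SET expressions read the value of UPC1 from before the update, so UPC2 still receives the part after the comma even though UPC1 is shortened in the same statement.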

SQL Select command to selectively extract data from a string

I have data in a column which contains values for various fields from an application and all these fields are concatenated into one field on the database side and separated with commas.
If the field in the application is blank, then the value between the two commas will just be blank.
I need a select statement to select each of the individual fields if they are populated. I would like to specify each field as a variable which I will declare at the top of the statement.
An example of the string in the database field is:
,"FIELD1","FIELD2","FIELD3",FIELD4,FIELD5,FIELD6,,,,"FIELD10",FIELD11,FIELD12,FIELD13
As you can see, fields 7-9 were blank in this example so they are blank in the string.
I just need a way to selectively select the field I need using the commas as my marker. The string always starts with a comma so field1 always comes after the first comma.
I hope this makes sense!
Try this:
DECLARE @STRING VARCHAR(255) = ',"FIELD1","FIELD2","FIELD3",FIELD4,FIELD5,FIELD6,,,,"FIELD10",FIELD11,FIELD12,FIELD13'
DECLARE @FieldToReturn INT = 12 -- Pick which field you want
SET @STRING = RIGHT(@STRING, LEN(@STRING) - 1) + ',' -- Strip leading comma & add comma to the end
WHILE @FieldToReturn > 1
BEGIN
SET @STRING = SUBSTRING(@STRING,PATINDEX('%,%',@STRING), LEN(@STRING))
SET @FieldToReturn = @FieldToReturn - 1
SET @STRING = RIGHT(@STRING, LEN(@STRING) - 1)
END
SELECT SUBSTRING(@STRING,0,PATINDEX('%,%', @STRING))
If the field is not populated, this will return a blank.
Edit: I know that I could have put all of the string manipulation in one line within the WHILE, but chose not to for readability. To me that is more important than the possibility of a teeny tiny bit of overhead in this example.

Splitting string with three delimiters in SQL Server

String pattern:
1#5,7;2#;3#4
These are three sets of values separated by semicolon.
Digit before # goes in one column, digits after # (separated by comma) go in another column (so the second set in this case only has one value)
How can I do this?
This is what I found on the net:
DECLARE @S VARCHAR(MAX) = '1,100,12345|2,345,433|3,23423,123|4,33,55'
DECLARE @x xml = '<r><c>' +
REPLACE(REPLACE(@S, ',','</c><c>'),'|','</c></r><r><c>') +
'</c></r>'
SELECT x.value('c[1]','int') AS seq,
x.value('c[2]','int') AS invoice,
x.value('c[3]','int') AS amount
FROM @x.nodes('/r') x(x)
This, however, has a fixed number of values after every delimiter, and it only uses two delimiters.
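The XML trick above can probably be stretched to the pattern in the question. The following is only a sketch, assuming the delimiters are literally `#`, `,` and `;` as shown, that every value is an integer, and that the string contains no characters that are special in XML. The element names r, k, v and the column aliases are made up.

```sql
DECLARE @S varchar(max) = '1#5,7;2#;3#4';

-- ';' closes one set and opens the next, '#' separates key from values,
-- ',' separates the values. Produces e.g. <r><k>1</k><v>5</v><v>7</v></r>...
DECLARE @x xml =
    '<r><k>' +
    REPLACE(REPLACE(REPLACE(@S, ';', '</v></r><r><k>'), '#', '</k><v>'), ',', '</v><v>') +
    '</v></r>';

SELECT
    grp.node.value('(k/text())[1]', 'int')   AS KeyValue,
    item.node.value('(./text())[1]', 'int')  AS ItemValue
FROM @x.nodes('/r') AS grp(node)
CROSS APPLY grp.node.nodes('v') AS item(node);  -- one row per value after '#'
```

For the sample string this should give one row per value, i.e. (1, 5), (1, 7), (2, NULL) and (3, 4). The set with nothing after `#` still yields a row because the empty `<v>` element exists, with ItemValue coming back as NULL.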
