T-SQL 2008: Parse String - sql-server

I have the following location as a string:
\\Windows\UnitB\CU1234_001\
I want to return the CU1234_001 part only. The query which I need to use needs to be dynamic since this string will change and it could be longer or shorter (it will all the time end in "\".
I've tried to used something like this but this just eliminate the last "\" and returns the rest of the string:
select
substring('\\Windows\UnitB\CU1234_001\',
1, (len('\\Windows\UnitB\CU1234_001\') - (Charindex('\',
reverse(rtrim('\\Windows\UnitB\CU1234_001\'))))))

You can use a combination of string functions to extract what you want:
SELECT REVERSE(SUBSTRING(REVERSE(col),
2,
CHARINDEX('/', REVERSE(col), 2) - 2))
FROM yourTable

Related

SQL - Replace string function is not working as intended

I have a simple string; for example,'01023201580001'.
I would like to replace the last two characters of this string; '01', with '00'.
I could extract the last two characters from this string as RIGHT(columname,2) and then use
REPLACE([columname], RIGHT([columname], 2), '00') as newColumnString
But in the result, it replaces the first two characters as well?
Expected result: 01023201580000
Result I get: 00023201580000
What am I doing wrong?
The second argument to the replace() function defines a pattern to match. The function will look for all instances of that pattern in the target string (first argument) and replace them with the replacement text (third argument).
If you know you only need to change the last two characters, you can take the value excluding those characters and then append the characters you want:
select left(columname, len(columname) - 2) + '00';
If you are doing this for an entire column and some of the rows might not end with '01', you can filter those out:
update MyTable
set columname = left(columname, len(columname) - 2) + '00'
where columname like '%01';
You could also use stuff() in a similar way.
In SQL server, you can use substring like so:
DECLARE #s NVARCHAR(20) = N'01023201580001';
DECLARE #ReplaceWith NVARCHAR(20) = N'00';
SELECT SUBSTRING(#s, 0, LEN(#s) - 1) + #ReplaceWith;
Output: 01023201580000

Select substring between two words in a long string

I have a table with a large string column called "HL7_MESSAGE" that I need to pull a string from several key words, as you'll see in the code below. I'm an Oracle person so the code was written and works in Oracle SQL but I need to convert it into SQL server code. SQL server doesn't have Regexp_substr function but I haven't been able to get this to work using charindex or Patindex. Basically I select a string between two strings and in the decode statements I look for if there is data missing between the two words/sections. If it just finds '.br\' then it's missing data and I just flag missing or filled. Anyway, code is below...if someone can decode it to SQL server version 2011 I would appreciate it.
CODE:
select
primary_key,
trim(REPLACE(trim(regexp_substr(hl7_message, 'RHRN:(.*)BIRTHDATE:', 1, 1, null, 1)),'\.br\',' ')) AS PATIENT_RHRN,
trim(REPLACE(trim(regexp_substr(hl7_message, 'PATIENT NAME:(.*)RHRN:', 1, 1, null, 1)),'\.br\',' ')) AS PATIENT_NAME,
trim(REPLACE(trim(regexp_substr(hl7_message, 'ULI:(.*)GENDER:', 1, 1, null, 1)),'\.br\',' ')) AS PATIENT_ULI,
decode(replace(to_char(regexp_substr(hl7_message,'FINDINGS:(.*)ADVERSE EVENTS:',1,1,'',1)),'\.br\'),NULL, 'missing', 'filled') FINDINGS_TO_ADVS_EVENTS_FLAG,
decode(replace(to_char(regexp_substr(hl7_message,'IMPRESSIONS:(.*)RECOMMENDATIONS:',1,1,'',1)),'\.br\'),NULL, 'missing', 'filled') IMPRESSION_TO_RECOMM_FLAG,
decode(replace(to_char(regexp_substr(hl7_message,'RECOMMENDATIONS:(.*)_____________________________',1,1,'',1)),'\.br\'),NULL, 'missing', 'filled') RECOMM_TO_SIG_UNDERLINE
from TEST;
Thanks
I have provided below a quick example using patindex and substring to extract information from a string between two other strings, Hopefully this will act as a basis for you to be able to do the conversion.
DECLARE #a varchar(300) = 'this is a long string I will extract data from'
PRINT SUBSTRING(#a, PATINDEX('% long %', #a) + LEN('% long %')-2, PATINDEX('% extract %', #a) - (PATINDEX('% long %', #a) + LEN('% long %')-2))
The key is in using the two patindex patterns to determine the start point and length:
First find the end of the first string pattern (Patindex to find the start, len - 2 for the end):
Patindex(pattern, string) + LEN(pattern, string) - 2
Then to find the length, use Patindex to find the start of the second string, and subtract the start point found above.
Patindex(pattern, string) - (Patindex(start pattern, string) + LEN(start pattern, string) - 2 )
I hope that this helps.

Make substring using a specific delimiter in SQL

I want to make a substring of a column value using a specific delimiter.I tried SUBSTRING_INDEX,but it doesn't work for SQL.Is there any way to achieve this??
Column values are:
ARTCSOFT-1111
ARTCSOFT-1112
ARTCSOFT-1113
and I want to achieve the same thing in SQL:
SUBSTRING_INDEX(Code,'SOFT-',1))
i.e I want the number after SOFT- in my substring.I can't use only - because before SOFT- there is chance that - may occur(rare case,but I don't want to take a chance)
Try using just SUBSTRING . For example
SELECT
SUBSTRING(code, CHARINDEX('SOFT-', code) + 5, LEN(code)) AS [name] from dbo.yourtable
hope this helps.
Tested Result:
SELECT RIGHT(Code , CHARINDEX ('-' ,REVERSE(Code))-1)
Read this as: Get the rightmost string after the first '-' in a reversed string - which is the same as the string after the last '-' character.
Try This Query:
select substring(col,charindex('-',col)+1,len(col)-charindex('-',col)) from #Your_table
Explanation of Query:
Here Charindex find the '-' delimeter [length] IN Given String and now that Result[length+1] is our starting point and ending length is [len(col)-starting length] gives ending point and then use substring Function to split a string according to our requirement.
Result of Query:
Required_col
1111
1112
1113

SQL: Fix for CSV import mistake

I have a database that has multiple columns populated with various numeric fields. While trying to populate from a CSV, I must have mucked up assigning delimited fields. The end result is a column containing It's Correct information, but also contains the next column over's data- seperated by a comma.
So instead of Column UPC1 containing "958634", it contains "958634,95877456". The "95877456" is supposed to be in the UPC2 column, instead UPC2 is NULL.
Is there a way for me to split on the comma and send the data to UPC2 while keeping UPC1 data before the comma in tact?
Thanks.
You can do this with string functions. To query the values and verify the logic, try this:
SELECT
LEFT(UPC1, CHARINDEX(',', UPC1) - 1),
SUBSTRING(UPC1, CHARINDEX(',', UPC1) + 1, 1000)
FROM myTable;
If the result is what you want, turn it into an update:
UPDATE myTable SET
UPC1 = LEFT(UPC1, CHARINDEX(',', UPC1) - 1),
UPC2 = SUBSTRING(UPC1, CHARINDEX(',', UPC1) + 1, 1000);
The expression for UPC1 takes the left side of UPC1 up to one character before the comma.
The expression for UPC2 takes the remainder of the UPC1 string starting one character after the comma.
The third argument to SUBSTRING needs some explaining. It's the number of characters you want to include after the starting position of the string (which in this case is one character after the comma's location). If you specify a value that's longer than the string SUBSTRING will just return to the end of the string. Using 1000 here is a lot easier than calculating the exact number of characters you need to get to the end.

How do I match a substring of variable length?

I am importing data into my SQL database from an Excel spreadsheet.
The imp table is the imported data, the app table is the existing database table.
app.ReceiptId is formatted as "A" followed by some numbers. Formerly it was 4 digits, but now it may be 4 or 5 digits.
Examples:
A1234
A9876
A10001
imp.ref is a free-text reference field from Excel. It consists of some arbitrary length description, then the ReceiptId, followed by an irrelevant reference number in the format " - BZ-0987654321" (which is sometimes cropped short, or even missing entirely).
Examples:
SHORT DESC A1234 - BZ-0987654321
LONGER DESCRIPTION A9876 - BZ-123
REALLY LONG DESCRIPTION A2345 - B
REALLY REALLY LONG DESCRIPTION A23456
The code below works for a 4-digit ReceiptId, but will not correctly capture a 5-digit one.
UPDATE app
SET
[...]
FROM imp
INNER JOIN app
ON app.ReceiptId = right(right(rtrim(replace(replace(imp.ref,'-',''),'B','')),5)
+ rtrim(left(imp.ref,charindex(' - BZ-',imp.ref))),5)
How can I change the code so it captures either 4 (A1234) or 5 (A12345) digits?
As ughai rightfully wrote in his comment, it's not recommended to use anything other then columns in the on clause of a join.
The reason for that is that using functions prevents sql server for using any indexes on the columns that it might use without the functions.
Therefor, I would suggest adding another column to imp table that will hold the actual ReceiptId and be calculated during the import process itself.
I think the best way of extracting the ReceiptId from the ref column is using substring with patindex, as demonstrated in this fiddle:
SELECT ref,
RTRIM(SUBSTRING(ref, PATINDEX('%A[0-9][0-9][0-9][0-9]%', ref), 6)) As ReceiptId
FROM imp
Update
After the conversation with t-clausen-dk in the comments, I came up with this:
SELECT ref,
CASE WHEN PATINDEX('%[ ]A[0-9][0-9][0-9][0-9][0-9| ]%', ref) > 0
OR PATINDEX('A[0-9][0-9][0-9][0-9][0-9| ]%', ref) = 1 THEN
SUBSTRING(ref, PATINDEX('%A[0-9][0-9][0-9][0-9][0-9| ]%', ref), 6)
ELSE
NULL
END As ReceiptId
FROM imp
fiddle here
This will return null if there is no match,
when a match is a sub string that contains A followed by 4 or 5 digits, separated by spaces from the rest of the string, and can be found at the start, middle or end of the string.
Try this, it will remove all characters before the A[number][number][number][number] and take the first 6 characters after that:
UPDATE app
SET
[...]
FROM imp
INNER JOIN app
ON app.ReceiptId in
(
left(stuff(ref,1, patindex('%A[0-9][0-9][0-9][0-9][ ]%', imp.ref + ' ') - 1, ''), 5),
left(stuff(ref,1, patindex('%A[0-9][0-9][0-9][0-9][0-9][ ]%', imp.ref + ' ') - 1, ''), 6)
)
When using equal, the spaces after is not evaluated

Resources