I have a table with a large string column called "HL7_MESSAGE", and I need to pull out the text that sits between several pairs of keywords, as you'll see in the code below. I'm an Oracle person, so the code was written in Oracle SQL and works there, but I need to convert it to SQL Server. SQL Server doesn't have a REGEXP_SUBSTR function, and I haven't been able to get this to work using CHARINDEX or PATINDEX. Basically, I select the string between two strings, and in the DECODE statements I check whether data is missing between the two words/sections: if all it finds is '\.br\', the data is missing, and I just flag the section as missing or filled. Anyway, the code is below. If someone can convert it to SQL Server version 2011, I would appreciate it.
CODE:
select
primary_key,
trim(REPLACE(trim(regexp_substr(hl7_message, 'RHRN:(.*)BIRTHDATE:', 1, 1, null, 1)),'\.br\',' ')) AS PATIENT_RHRN,
trim(REPLACE(trim(regexp_substr(hl7_message, 'PATIENT NAME:(.*)RHRN:', 1, 1, null, 1)),'\.br\',' ')) AS PATIENT_NAME,
trim(REPLACE(trim(regexp_substr(hl7_message, 'ULI:(.*)GENDER:', 1, 1, null, 1)),'\.br\',' ')) AS PATIENT_ULI,
decode(replace(to_char(regexp_substr(hl7_message,'FINDINGS:(.*)ADVERSE EVENTS:',1,1,'',1)),'\.br\'),NULL, 'missing', 'filled') FINDINGS_TO_ADVS_EVENTS_FLAG,
decode(replace(to_char(regexp_substr(hl7_message,'IMPRESSIONS:(.*)RECOMMENDATIONS:',1,1,'',1)),'\.br\'),NULL, 'missing', 'filled') IMPRESSION_TO_RECOMM_FLAG,
decode(replace(to_char(regexp_substr(hl7_message,'RECOMMENDATIONS:(.*)_____________________________',1,1,'',1)),'\.br\'),NULL, 'missing', 'filled') RECOMM_TO_SIG_UNDERLINE
from TEST;
Thanks
Below is a quick example that uses PATINDEX and SUBSTRING to extract the text between two other strings. Hopefully it will serve as a basis for your conversion.
DECLARE @a varchar(300) = 'this is a long string I will extract data from'
PRINT SUBSTRING(@a, PATINDEX('% long %', @a) + LEN('% long %')-2, PATINDEX('% extract %', @a) - (PATINDEX('% long %', @a) + LEN('% long %')-2))
The key is in using the two PATINDEX patterns to determine the start point and the length.
First, find the end of the first pattern. PATINDEX gives the start of the match, and adding LEN(pattern) - 2 skips past it, where the 2 accounts for the two % wildcards in the pattern:
PATINDEX(start pattern, string) + LEN(start pattern) - 2
Then, to find the length, use PATINDEX to find the start of the second pattern and subtract the start point found above:
PATINDEX(end pattern, string) - (PATINDEX(start pattern, string) + LEN(start pattern) - 2)
I hope that this helps.
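As a rough sketch of how the same start-point-and-length idea might carry over to one of the HL7 columns, here is a version that uses CHARINDEX with literal labels instead of PATINDEX patterns. The sample message, the variable names and the RHRN_FLAG column are made up for illustration; only the labels and the '\.br\' marker come from the question. LTRIM(RTRIM(...)) is used because TRIM is not available on older SQL Server versions.

```sql
-- Hypothetical sample message; only the labels and '\.br\' come from the question.
DECLARE @msg varchar(max) =
    'PATIENT NAME: DOE, JANE\.br\RHRN: 12345\.br\BIRTHDATE: 1970-01-01';

DECLARE @startTag varchar(50) = 'RHRN:';
DECLARE @endTag   varchar(50) = 'BIRTHDATE:';

-- Start of the start label (0 if absent) and first character after it.
DECLARE @hit  int = CHARINDEX(@startTag, @msg);
DECLARE @from int = @hit + LEN(@startTag);

-- Start of the end label, searched only after the start label (0 if absent).
DECLARE @to int = CASE WHEN @hit > 0
                       THEN CHARINDEX(@endTag, @msg, @from)
                       ELSE 0 END;

-- Text between the two labels, or NULL when either label is missing.
DECLARE @segment varchar(max) =
    CASE WHEN @to > 0 THEN SUBSTRING(@msg, @from, @to - @from) END;

SELECT LTRIM(RTRIM(REPLACE(@segment, '\.br\', ' '))) AS PATIENT_RHRN,  -- 12345
       CASE WHEN REPLACE(@segment, '\.br\', '') <> ''
            THEN 'filled' ELSE 'missing' END AS RHRN_FLAG;             -- filled
```

Two differences from the Oracle original are worth keeping in mind. The (.*) in the regular expression is greedy, so it runs to the last occurrence of the end label, while CHARINDEX stops at the first one after the start label; the results differ only if the end label repeats. Also, T-SQL ignores trailing spaces when comparing strings, so a section that holds nothing but spaces and '\.br\' is flagged 'missing' here.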
Related
I have a simple string; for example,'01023201580001'.
I would like to replace the last two characters of this string; '01', with '00'.
I could extract the last two characters from this string as RIGHT(columname,2) and then use
REPLACE([columname], RIGHT([columname], 2), '00') as newColumnString
But in the result, it replaces the first two characters as well?
Expected result: 01023201580000
Result I get: 00023201580000
What am I doing wrong?
The second argument to the replace() function defines a pattern to match. The function will look for all instances of that pattern in the target string (first argument) and replace them with the replacement text (third argument).
If you know you only need to change the last two characters, you can take the value excluding those characters and then append the characters you want:
select left(columname, len(columname) - 2) + '00';
If you are doing this for an entire column and some of the rows might not end with '01', you can filter those out:
update MyTable
set columname = left(columname, len(columname) - 2) + '00'
where columname like '%01';
You could also use stuff() in a similar way.
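A minimal sketch of the STUFF() variant, using a local variable in place of the column (the variable name is only for illustration):

```sql
DECLARE @s varchar(20) = '01023201580001';

-- Start at the second-to-last character, delete 2 characters, insert '00'.
SELECT STUFF(@s, LEN(@s) - 1, 2, '00') AS newColumnString;  -- 01023201580000
```

Note that STUFF() returns NULL when the start position is 0 or negative, so values shorter than two characters would come back as NULL rather than raising an error.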
In SQL server, you can use substring like so:
DECLARE @s NVARCHAR(20) = N'01023201580001';
DECLARE @ReplaceWith NVARCHAR(20) = N'00';
SELECT SUBSTRING(@s, 0, LEN(@s) - 1) + @ReplaceWith;
Output: 01023201580000
I am trying to parse out values from a text string. The values are nested within the string. For example, I get flat files that look something like this: "John Q Public (12345-01) - Customer referral". I want to parse out the account number; e.g., 12345-01, from the text string. The starting position isn't constant nor is the length of the account number. The pattern is #####-##.
Any help would be most appreciated.
Some basic string manipulation will capture the values you want. No idea if this works in your real situation because I have no idea how "stable" or consistent your data is.
declare @Something varchar(100) = 'John Q Public (12345-01) - Customer referral'
select substring(@Something
, charindex('(', @Something) + 1 --starting position of the values you want
, charindex(')', @Something) - charindex('(', @Something) - 1 --length of the values you want
)
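If the parentheses are not always present, another option is to search for the digit pattern itself with PATINDEX. This is only a sketch and it assumes the account number really is always five digits, a hyphen and two digits, as the #####-## pattern in the question suggests; it will not help if the length varies.

```sql
DECLARE @Something varchar(100) = 'John Q Public (12345-01) - Customer referral';

-- Position of the first #####-## match, or 0 when there is none.
DECLARE @pos int =
    PATINDEX('%[0-9][0-9][0-9][0-9][0-9]-[0-9][0-9]%', @Something);

-- The match is always 8 characters long under the assumption above.
SELECT CASE WHEN @pos > 0
            THEN SUBSTRING(@Something, @pos, 8)
       END AS AccountNumber;  -- 12345-01
```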
I want to use SUBSTRING in SQL Server to capture the text between a specific string and the following char(10). The problem is that char(10) occurs several times in the complete string, so I need some code to locate the first char(10) after my specific string. The code below results in an error due to a negative length (the first char(10) occurs before my specific string).
SELECT SUBSTRING(col, LEN(LEFT(col, CHARINDEX ('my specific string', col))) + 1, LEN(col) - LEN(LEFT(col,
CHARINDEX ('my specific string', col))) - LEN(RIGHT(col, LEN(col) - CHARINDEX (char(10), col))) - 1);
Error: Invalid length parameter passed to the LEFT or SUBSTRING
function.
A very straightforward answer to the specific data in this question: you could pass a third parameter to the CHARINDEX function call that looks for the CHAR(10) after a specific starting index. As the third parameter, you can pass the index of the 'my specific string' you found:
SELECT SUBSTRING(col, LEN(LEFT(col, CHARINDEX ('my specific string', col))) + 1, LEN(col) - LEN(LEFT(col,
CHARINDEX ('my specific string', col))) - LEN(RIGHT(col, LEN(col) - CHARINDEX (char(10), col, CHARINDEX ('my specific string', col)))) - 1);
I am not sure if this will actually solve your final issue, however. Your col might not contain a CHAR(10) after 'my specific string', or it might contain 'my specific string' more than once. If you want all the logic that handles your (more complex) requirements in a single SELECT statement, things can get messy very quickly.
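For what it's worth, here is a sketch of the same idea broken into steps, with the two edge cases above handled explicitly. The sample value and variable names are invented for illustration, and unlike the query in the question it starts the capture right after the search string.

```sql
-- Hypothetical sample value with a line feed before and after the search string.
DECLARE @col varchar(200) =
    'line one' + CHAR(10) + 'my specific string: value' + CHAR(10) + 'line three';
DECLARE @tag varchar(50) = 'my specific string';

-- Start of the search string (0 if absent) and first character after it.
DECLARE @start      int = CHARINDEX(@tag, @col);
DECLARE @valueStart int = @start + LEN(@tag);

-- First line feed at or after @valueStart (0 if there is none).
DECLARE @lf int = CASE WHEN @start > 0
                       THEN CHARINDEX(CHAR(10), @col, @valueStart)
                       ELSE 0 END;

SELECT CASE
         WHEN @start = 0 THEN NULL                                -- search string missing
         WHEN @lf = 0    THEN SUBSTRING(@col, @valueStart, LEN(@col))  -- no line feed: take the rest
         ELSE SUBSTRING(@col, @valueStart, @lf - @valueStart)
       END AS captured;  -- ': value'
```

Only the first occurrence of the search string is considered here; handling repeated occurrences would need a loop or a split.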
I have the following location as a string:
\\Windows\UnitB\CU1234_001\
I want to return the CU1234_001 part only. The query needs to be dynamic, since this string will change and could be longer or shorter (it will always end in "\").
I've tried to use something like this, but it just eliminates the last "\" and returns the rest of the string:
select
substring('\\Windows\UnitB\CU1234_001\',
1, (len('\\Windows\UnitB\CU1234_001\') - (Charindex('\',
reverse(rtrim('\\Windows\UnitB\CU1234_001\'))))))
You can use a combination of string functions to extract what you want:
SELECT REVERSE(SUBSTRING(REVERSE(col),
2,
CHARINDEX('\', REVERSE(col), 2) - 2))
FROM yourTable
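To see why the reverse-and-search approach works, here is a sketch on the sample path from the question with the intermediate values written out. It uses the backslash as the separator, as in the question, and the variable names are only for illustration.

```sql
DECLARE @path varchar(260) = '\\Windows\UnitB\CU1234_001\';

-- Reversed, the trailing separator becomes character 1.
DECLARE @rev varchar(260) = REVERSE(@path);   -- '\100_4321UC\BtinU\swodniW\\'

-- Next separator, searching from character 2 onwards.
DECLARE @next int = CHARINDEX('\', @rev, 2);  -- 12

-- Take the characters between the two separators and flip them back.
SELECT CASE WHEN @next > 0
            THEN REVERSE(SUBSTRING(@rev, 2, @next - 2))
       END AS last_folder;                    -- CU1234_001
```

The CASE guard returns NULL instead of raising an error when the value contains no second separator.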
I have a string with a specific pattern:
23;chair,red [$3]
i.e., a number followed by a semicolon, then a name followed by a left square bracket.
Assuming the semicolon ; always exists and the left square bracket [ always exists in the string, how do I extract the text between (and not including) the ; and the [ in a SQL Server query? Thanks.
Combine the SUBSTRING(), LEFT(), and CHARINDEX() functions.
SELECT LEFT(SUBSTRING(YOUR_FIELD,
CHARINDEX(';', YOUR_FIELD) + 1, 100),
CHARINDEX('[', YOUR_FIELD) - 1)
FROM YOUR_TABLE;
This assumes your field length will never exceed 100, but you can make it smarter to account for that if necessary by employing the LEN() function. I didn't bother since there's enough going on in there already, and I don't have an instance to test against, so I'm just eyeballing my parentheses, etc.
Assuming they always exist and are not part of your data, this will work:
declare @string varchar(8000) = '23;chair,red [$3]'
select substring(@string, charindex(';', @string) + 1, charindex(' [', @string) - charindex(';', @string) - 1)
An alternative to the answer provided by @Marc
SELECT SUBSTRING(LEFT(YOUR_FIELD, CHARINDEX('[', YOUR_FIELD) - 1), CHARINDEX(';', YOUR_FIELD) + 1, 100)
FROM YOUR_TABLE
WHERE CHARINDEX('[', YOUR_FIELD) > 0 AND
CHARINDEX(';', YOUR_FIELD) > 0;
This makes sure the delimiters exist, and solves an issue with the currently accepted answer where doing the LEFT last is working with the position of the last delimiter in the original string, rather than the revised substring.
select substring(your_field, CHARINDEX(';',your_field)+1
,CHARINDEX('[',your_field)-CHARINDEX(';',your_field)-1)
from your_table
Can't get the others to work. I believe you just want what is in between ';' and '[' in all cases regardless of how long the string in between is. After specifying the field in the substring function, the second argument is the starting location of what you will extract. That is, where the ';' is + 1 (fourth position - the c), because you don't want to include ';'. The next argument takes the location of the '[' (position 14) and subtracts the location of the spot after the ';' (fourth position - this is why I now subtract 1 in the query). This basically says substring(field,location I want substring to begin, how long I want substring to be). I've used this same function in other cases. If some of the fields don't have ';' and '[', you'll want to filter those out in the "where" clause, but that's a little different than the question. If your ';' was say... ';;;', you would use 3 instead of 1 in the example. Hope this helps!
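The arithmetic described above can be checked on the sample value from the question. This is just a sketch with a local variable standing in for the column:

```sql
DECLARE @s varchar(50) = '23;chair,red [$3]';

SELECT CHARINDEX(';', @s) AS semicolon_pos,  -- 3
       CHARINDEX('[', @s) AS bracket_pos,    -- 14
       -- start = 3 + 1 = 4, length = 14 - 3 - 1 = 10
       SUBSTRING(@s,
                 CHARINDEX(';', @s) + 1,
                 CHARINDEX('[', @s) - CHARINDEX(';', @s) - 1) AS between_text;
```

Here between_text is 'chair,red ' including the space before the bracket; wrapping the expression in RTRIM() drops that space if it is not wanted.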
If you need to split something into 3 pieces, such as an email address, and you don't know the length of the middle part, try this (I just ran it on SQL Server 2012, so I know it works):
SELECT top 2000
emailaddr_ as email,
SUBSTRING(emailaddr_, 1,CHARINDEX('@',emailaddr_) -1) as username,
SUBSTRING(emailaddr_, CHARINDEX('@',emailaddr_)+1, (LEN(emailaddr_) - charindex('@',emailaddr_) - charindex('.',reverse(emailaddr_)) )) domain
FROM
emailTable
WHERE
charindex('@',emailaddr_)>0
AND
charindex('.',emailaddr_)>0;
GO
Hope this helps.
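The query above returns the first two pieces. As a sketch of how the third piece (the part after the last dot) might be pulled out the same way, here is a version on a single made-up address, with '@' as the separator. The variable names and the suffix column are only for illustration.

```sql
DECLARE @email varchar(100) = 'user@example.com';  -- hypothetical sample value

DECLARE @at      int = CHARINDEX('@', @email);           -- 5
DECLARE @lastDot int = CHARINDEX('.', REVERSE(@email));  -- 4, counted from the end

SELECT SUBSTRING(@email, 1, @at - 1)                            AS username,  -- user
       SUBSTRING(@email, @at + 1, LEN(@email) - @at - @lastDot) AS domain,    -- example
       RIGHT(@email, @lastDot - 1)                              AS suffix;    -- com
```

Like the original query, this assumes the last dot comes after the '@'; an address such as 'a.b@c' would give a negative length for the domain, so such rows need to be filtered out first.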