Dealing with errors while parsing strings - sql-server

I'm tasked with pulling relevant data out of a field which is essentially free text. I have been able to get the information I need 98% of the time by looking for keywords and using CASE statements to break the field down into 5 different fields.
My issue is I can't get around the last 2% because the errors don't follow any logical order - they are mostly misspellings.
I could bypass the field with a TRY CATCH, but I don't like giving up 4 good pieces of data when the routine is choking on one.
Is there any way to handle blanket errors within a CASE statement, or is there another option?
Current code, the 'b' with the commented out section is where it's choking right now:
CASE WHEN #Location = 0 THEN
    CASE WHEN #Duration = 0 THEN
        CASE WHEN #Timing = 0 THEN
            SUBSTRING(#Comment,#Begin, #Context-#Begin)
        ELSE
            SUBSTRING(#Comment,#Begin, #Timing-#Begin)
        END
    ELSE SUBSTRING(#Comment,#Begin, #Duration-#Begin)
    END
ELSE SUBSTRING(#Comment,#Begin, #Location-#Begin)
END AS Complaint
,CASE WHEN #Location = 0 THEN ''
ELSE
    CASE WHEN #Duration = 0 THEN
        CASE WHEN #Timing = 0 THEN SUBSTRING(#Comment,#Location+10, (#CntBegin-11))
        ELSE SUBSTRING(#Comment,#Location+10, #Timing-(#Location+10))
        END
    ELSE SUBSTRING(#Comment,#Location+10, #Duration-(#Location+10))
    END
END AS Location
,CASE WHEN #Timing = 0 THEN ''
ELSE
    CASE WHEN #CntBegin = 0 THEN
        SUBSTRING(#Comment,#Timing+#TimingEnd, (#Location+#Context)-(#Timing+#TimingEnd))
    ELSE
        'b'--SUBSTRING(#Comment,#Timing+#TimingEnd, (#Location+#CntBegin-1)-(#Timing+#TimingEnd))
    END
END AS Timing
It fails on this statement, which has a comma in an odd spot. I usually have to reference the comma for #CntBegin, but in this case it makes (#Location+#CntBegin-1) smaller than (#Timing+#TimingEnd):
'Pt also presents with/for mild check MGP/MGD located in OU, since 12/2015 ? Stability.'
Please take into account, I'm not necessarily trying to fix this error, I'm looking for a way to handle any error that comes up as who knows what someone is going to type. I'd like to just display 'ERR' in that particular field when the code runs into something it can't handle. I just don't want the routine to die.

I'm assuming your error is due to the length parameter in SUBSTRING being less than 0. I always alias my computed parameters using CROSS APPLY and then validate them before calling SUBSTRING(). Something like this should work:
SELECT
    CASE WHEN CA.StringLen > 0 /*Ensure valid length*/
        THEN SUBSTRING(#comment,#Timing+#TimingEnd,CA.StringLen)
        ELSE 'Error'
    END
FROM YourTable
CROSS APPLY (SELECT StringLen = (#Location+#CntBegin-1)-(#Timing+#TimingEnd)) AS CA
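As a minimal sketch of the same idea on sample data: the table variable @Comments and its StartPos/EndPos columns are hypothetical stand-ins for the positions the question computes, and 'ERR' is the placeholder the question asked for. SUBSTRING is only reached when the length has been validated, so a bad length should produce 'ERR' for that one field instead of stopping the query.

```sql
-- Hypothetical sample data; the positions are invented for illustration.
DECLARE @Comments TABLE (Comment varchar(200), StartPos int, EndPos int);

INSERT INTO @Comments VALUES
    ('mild dryness located in OU, since 12/2015', 6, 13),   -- EndPos > StartPos: valid length
    ('mild dryness located in OU, since 12/2015', 20, 10);  -- EndPos < StartPos: negative length

SELECT
    c.Comment,
    CASE WHEN CA.StringLen > 0                       -- validate before calling SUBSTRING
         THEN SUBSTRING(c.Comment, c.StartPos, CA.StringLen)
         ELSE 'ERR'                                  -- flag the field, keep the row
    END AS Extracted
FROM @Comments AS c
CROSS APPLY (SELECT StringLen = c.EndPos - c.StartPos) AS CA;
-- First row returns 'dryness', second row returns 'ERR'.
```

Applying the same guard to each of the five output fields means one unparseable piece does not cost you the other four.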

Related

Find a word in a variable and render 1 else 0

What is the best solution in finding a word in a variable? This needs to be within the Select statement, and if the word is found then '1' should be returned, else '0'.
Charindex or Regex?
I am trying something like this:
Select top 100
[ReportingEntity]
,if CHARINDEX('Issuers', [ReportingEntity]) = '1' ELSE '0' END AS 'Issuers'
FROM [MSC].[dbo].[dsl_file]
And [ReportingEntity] can have values like:
Tickert Issuers
Fund_Manager-Issuers
Issuers of Event
...
Keen also to understand how that would work with regex
You can do so with IIF or CASE.
IIF
SELECT TOP 100 [ReportingEntity]
,IIF(ReportingEntity LIKE '%Issuers%', 1, 0) AS 'Issuers'
FROM [MSC].[dbo].[dsl_file]
CASE
SELECT TOP 100 [ReportingEntity]
,CASE WHEN ReportingEntity LIKE '%Issuers%' THEN 1 ELSE 0 END AS 'Issuers'
FROM [MSC].[dbo].[dsl_file]
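If you would rather keep CHARINDEX, a sketch along these lines should give the same flag. CHARINDEX returns the position of the first match (0 when the word is absent), so the test is > 0 rather than = 1. The table variable @Entities is a hypothetical stand-in for dsl_file. Note that LIKE only supports simple wildcard patterns, not full regular expressions.

```sql
-- Hypothetical sample rows standing in for [MSC].[dbo].[dsl_file].
DECLARE @Entities TABLE (ReportingEntity varchar(50));

INSERT INTO @Entities VALUES
    ('Tickert Issuers'),
    ('Fund_Manager-Issuers'),
    ('Issuers of Event'),
    ('Fund Manager');          -- no match expected

SELECT
    ReportingEntity,
    -- CHARINDEX gives the match position, or 0 if 'Issuers' does not occur
    CASE WHEN CHARINDEX('Issuers', ReportingEntity) > 0 THEN 1 ELSE 0 END AS Issuers
FROM @Entities;
```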

Combining IF ELSE clause with UPDATE in TSQL

Trying to combine an UPDATE clause for rows in a table if different conditions are met. I'm doing this via TSQL on Azure SQL.
I could run multiple TSQL statements in the format of the green-ed code that are mutually exclusive from each other, but I'd rather use a nested IF ELSE statement in order to make the code cleaner.
Is this possible?
Do it all in one hit with a case expression.
update student set
p2 = case when total_score > 500 then 'pass' else 'fail' end
where p2 is null;
You could combine both updates as follows by using a case expression to test whether or not the column needs an update:
update student set
p1 = case when p1 is null and age > 20 then 'old' else p1 end
, p2 = case when p2 is null then case when total_score > 500 then 'pass' else 'fail' end else p2 end
where p2 is null or p1 is null;
Note: As a design issue, setting a column value based on the age of a student isn't usually a good idea, because it only gives you a point-in-time value, so you then need to keep updating it. It is far better to calculate it when you query the table.
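A rough sketch of that query-time alternative, assuming the same column names as above (the table variable @student and its sample rows are hypothetical):

```sql
-- Hypothetical sample data using the column names from the update above.
DECLARE @student TABLE (name varchar(20), age int, total_score int);

INSERT INTO @student VALUES
    ('Ann', 25, 620),
    ('Bob', 19, 410);

-- Derive the labels when reading, so they never go stale.
SELECT
    name,
    CASE WHEN age > 20 THEN 'old' END AS p1,                        -- NULL when the condition is not met
    CASE WHEN total_score > 500 THEN 'pass' ELSE 'fail' END AS p2
FROM @student;
```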

SQL String are same but case equals method returns false

I am using SQL to compare two columns and return TRUE/FALSE if they are equal.
In some cases, the two columns contain exactly the same string (no spaces or anything) but I am still getting false.
What may the reason for this be?
I am using this code:
CASE WHEN column1 = column2 THEN 0 ELSE 1 END AS [check]
The values are different despite the displayed value.
Using T-SQL, run a query like this to see the exact difference in the underlying raw values:
SELECT
column1
, CAST(column1 AS varbinary(MAX)) AS column1Binary
, column2
, CAST(column2 AS varbinary(MAX)) AS column2Binary
FROM dbo.YourTable;
This will reveal underlying differences like tabs or subtle character differences.
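For instance, here is a small sketch with two variables that display almost identically but differ by a trailing tab. The variable names are hypothetical and simply mirror the column names in the question.

```sql
DECLARE @column1 varchar(20) = 'abc';
DECLARE @column2 varchar(20) = 'abc' + CHAR(9);  -- same text plus a trailing tab

SELECT
    CASE WHEN @column1 = @column2 THEN 0 ELSE 1 END AS [check],  -- 1: not equal
    CAST(@column1 AS varbinary(MAX)) AS column1Binary,           -- 0x616263
    CAST(@column2 AS varbinary(MAX)) AS column2Binary,           -- 0x61626309
    DATALENGTH(@column1) AS column1Bytes,                        -- 3
    DATALENGTH(@column2) AS column2Bytes;                        -- 4
```

The extra 09 byte at the end of the second binary value is the tab that the displayed text hides.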
In fact, a likely explanation for what you are seeing is that one or both of the strings have leading and/or trailing whitespace. LTRIM only removes leading spaces, so on SQL Server you may try trimming both sides:
CASE WHEN LTRIM(RTRIM(column1)) = LTRIM(RTRIM(column2)) THEN 0 ELSE 1 END AS [check]
If the above does not detect the problematic records, then try checking the length (LEN ignores trailing spaces; use DATALENGTH if those matter):
CASE WHEN LEN(column1) = LEN(column2) THEN 0 ELSE 1 END AS [check2]

Error converting varchar to bigint in very peculiar situation

My intent is to retrieve every CLIENT_CODE converted to BigInt, so I can compare it with a value passed as a parameter in the where clause of a 400-line SQL query. When I execute the code below, I get the following error message:
Error 8114 from SQL Server: "Error converting varchar to bigint".
Test Code:
select CASE when (len (CLIENT_CODE) > 2 and isNumeric(CLIENT_CODE) = 1)
then (CAST(SUBSTRING(TAB.CLIENT_CODE, 1, LEN(TAB.CLIENT_CODE)-1) AS BIGINT))
else CLIENT_CODE end from TABLE TAB
Code Nested:
--HUGE_SQL...
AND ((CASE when (len (CLIENT_CODE) > 2 and isNumeric(CLIENT_CODE) = 1)
then (CAST(SUBSTRING(TAB.CLIENT_CODE, 1, LEN(TAB.CLIENT_CODE)-1) AS BIGINT))
else CLIENT_CODE end) = #MyClient_Code)
--... HUGE_SQL
Our CLIENT_CODE is varchar(20), some have 0 characters, and some have letters, but almost every record is a number.
In my understanding, the CASE condition must be evaluated first, but that doesn't appear to be what happens.
When I put isNumeric(CLIENT_CODE) = 1 in the where clause of the test code, it works. My problem is that I can't do that in this particular case, because the expression is already nested in the where clause of a huge SQL query, and adding isNumeric(CLIENT_CODE) = 1 there doesn't work because of the many other conditions.
What is the best way to retrieve this data? Can someone figure out how to do it?
(An explanation of how functions, CASE, and WHERE are evaluated relative to each other would be very helpful.)
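Your CASE expression returns BIGINT in one branch and VARCHAR in the ELSE branch.
A CASE expression has a single result type, chosen by data type precedence. BIGINT ranks above VARCHAR, so the CLIENT_CODE returned by the ELSE branch is implicitly converted to BIGINT, and that conversion fails for non-numeric values.
Also, instead of using ISNUMERIC(), use the following: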
select CASE
when (len (CLIENT_CODE) > 2 and CLIENT_CODE NOT LIKE '%[^0-9]%')
then (CAST(SUBSTRING(TAB.CLIENT_CODE, 1, LEN(TAB.CLIENT_CODE)-1) AS BIGINT))
end
from TABLE TAB
ISNUMERIC() returns 1 for values like 123e1 or 346d2, because it treats them as numbers in exponent notation. Therefore use NOT LIKE '%[^0-9]%' to match only strings that consist entirely of digits.
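A minimal sketch comparing the two checks on sample values; the table variable @Clients and its rows are hypothetical. Because the CASE has no varchar ELSE branch, its result type is bigint throughout, and rows that fail the digit test come back as NULL.

```sql
-- Hypothetical sample codes: plain digits, empty, letters, and exponent notation.
DECLARE @Clients TABLE (CLIENT_CODE varchar(20));

INSERT INTO @Clients VALUES ('12345'), (''), ('AB12'), ('123e1');

SELECT
    CLIENT_CODE,
    ISNUMERIC(CLIENT_CODE) AS IsNumericResult,       -- 1 for '123e1' even though it is not all digits
    CASE WHEN LEN(CLIENT_CODE) > 2
          AND CLIENT_CODE NOT LIKE '%[^0-9]%'        -- digits only
         THEN CAST(SUBSTRING(CLIENT_CODE, 1, LEN(CLIENT_CODE) - 1) AS bigint)
    END AS CodeAsBigint                              -- NULL when the guard fails
FROM @Clients;
```

Since a NULL never equals the parameter, the same CASE expression can be compared directly to the bigint parameter inside the larger WHERE clause.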

Divide in query and error Arithmetic overflow error converting expression to data type int

I want to write this query (all of the fields are int):
select SUM(Service.discount+((Service.price*Factor.discount)/(Factor.amount-Factor.discount)))
But sometimes I get this error:
Arithmetic overflow error converting expression to data type int.
Other times I get this error:
Divide by zero error encountered.
How can I rewrite this query to avoid these errors?
I tried this, but I still get the overflow error:
select case when(Factor.amount-Factor.discount)<>0 then
Service.discount+((Service.price*Factor.discount)/(Factor.amount-Factor.discount))
else
Service.discount
end
from Factor inner join Service on Factor.code=Service.factorCode
Arithmetic overflow: don't use sum at all, take SUM off and take the brackets off either end.
Divide by zero: see Jonny's answer (I think he means //something as in whatever you want to do when factor.amount-factor.discount is zero....)
so maybe:
select case when discount2 <> 0 then discount+((price*discount)/(discount2)) else
discount+(price*discount) end FROM SERVICE
SELECT CASE
    WHEN (Factor.amount-Factor.discount) <> 0
    THEN
        CONVERT(FLOAT,Service.discount+((Service.price*Factor.discount)/(Factor.amount-Factor.discount)))
    ELSE
        Service.discount
    END
FROM Factor INNER JOIN Service ON Factor.code=Service.factorCode
It might be better to decide how many decimal places you want to see:
CONVERT(decimal(10,2),Service.discount+((Service.price*Factor.discount)/(Factor.amount-Factor.discount)))
select
CASE (Factor.amount-Factor.discount)
WHEN 0
-- choose the result when Factor.amount-Factor.discount = 0 and replace this line
ELSE
SUM(Service.discount+((Service.price*Factor.discount)/
(Factor.amount-Factor.discount)))
END
...
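Another option, offered only as a sketch under assumptions: since all the fields are int, the overflow may come from the int multiplication itself or from the running SUM, in which case widening one operand before multiplying should help, and NULLIF can turn a zero divisor into NULL instead of an error. The table variables and sample rows below are hypothetical, and decimal(19, 4) is an arbitrary choice of precision.

```sql
-- Hypothetical sample data using the table and column names from the question.
DECLARE @Factor  TABLE (code int, amount int, discount int);
DECLARE @Service TABLE (factorCode int, price int, discount int);

INSERT INTO @Factor  VALUES (1, 1000, 100), (2, 500, 500);     -- second row: amount - discount = 0
INSERT INTO @Service VALUES (1, 2000000000, 10), (2, 300, 5);  -- first row: price * discount exceeds int

SELECT SUM(
           s.discount
           + COALESCE(
                 CAST(s.price AS decimal(19, 4)) * f.discount   -- widen before multiplying
                 / NULLIF(f.amount - f.discount, 0),            -- zero divisor becomes NULL
                 0)                                             -- fall back to discount only
       ) AS TotalDiscount
FROM @Factor AS f
INNER JOIN @Service AS s ON f.code = s.factorCode;
```

When the divisor is zero this falls back to Service.discount alone, matching the ELSE branch in the question's own attempt.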

Resources