T-SQL field parsing - sql-server


There is a field in a third-party database that I need to group on for a report I'm writing. The field can contain a few different types of data. First, it could contain a 3-digit number; I need to break these out into groups such as 101 to 200 and 201 to 300. The field could also be prefixed with a particular letter, such as M or K, followed by a few numbers. It is defined as VARCHAR(8). Any help with handling both cases, where the value may start with a particular letter or fall within a numeric range, would be appreciated. If I could write it as a case statement and return a department based either on the numeric value or the first letter that would be the best so I can group in my report.
Thanks,
Steven

If I could write it as a case statement and return a department based either on the numeric value or the first letter that would be the best so I can group in my report.
case when substring( field, 1, 1 ) = 'M' then ...
when substring( field, 1, 1 ) = 'K' then ...
else floor( (cast( field as int) - 1 ) / 100) end
select ....
group by
case when substring( field, 1, 1 ) = 'M' then ...
when substring( field, 1, 1 ) = 'K' then ...
else floor( (cast( field as int) - 1 ) / 100) end
Matt Hamilton asks,
Any reason why you've opted to use substring(field, 1, 1) rather than simply left(field, 1)? I notice that @jms did it too, in the other answer.
I know substring is specified in ANSI-92; I don't know that left is. And anyway, left isn't a primitive, as it can be written in terms of substring, so using substring seems a little cleaner.
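A minimal sketch of how the CASE expression above could be filled in, assuming SQL Server 2012 or later (for TRY_CAST). The table variable @t, its sample values, and the bucket labels are made up for illustration; real department names would replace them.

```sql
-- Hypothetical sample data standing in for the third-party table.
DECLARE @t TABLE (field VARCHAR(8));
INSERT INTO @t (field)
VALUES ('101'), ('150'), ('200'), ('201'), ('M12'), ('K345');

SELECT dept, COUNT(*) AS row_count
FROM (
    SELECT CASE
               WHEN SUBSTRING(field, 1, 1) = 'M' THEN 'M'
               WHEN SUBSTRING(field, 1, 1) = 'K' THEN 'K'
               -- Integer division: 101..200 gives 1, 201..300 gives 2.
               -- TRY_CAST yields NULL instead of an error for unexpected text.
               ELSE CAST((TRY_CAST(field AS INT) - 1) / 100 AS VARCHAR(8))
           END AS dept
    FROM @t
) AS bucketed
GROUP BY dept
ORDER BY dept;
```

Computing the bucket once in a derived table avoids repeating the whole CASE expression in the GROUP BY clause.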

select
CASE (CASE WHEN substring(field,1,1) between '0' and '9' then 'N' Else 'C' END)
WHEN 'N' THEN
CASE field
WHEN ... THEN ...
WHEN ... THEN ...
END
WHEN 'C' THEN
CASE field
WHEN ... THEN ...
WHEN ... THEN ...
END
END

Related

Defeat these dashed dashes in SQL server

I have a table that contains the names of various recording artists. One of them has a dash in their name. If I run the following:
Select artist
, substring(artist,8,1) as substring_artist
, ascii(substring(artist,8,1)) as ascii_table
, ascii('-') as ascii_dash_key /*The dash key next to zero */
, len(artist) as len_artist
From [dbo].[mytable] where artist like 'Sleater%'
Then the following is returned. This seems to indicate that a dash (ascii 45) is being stored in the artist column
However, if I change the where clause to:
From [dbo].[mytable] where artist like 'Sleater' + char(45) + '%'
I get no results returned. If I copy and paste the output from the artist column into a hex editor, I can see that the dash is actually stored as E2 80 90, the Unicode byte sequence for the multi-byte hyphen character.
So, I'd like to find and replace such occurrences with a standard ASCII hyphen, but I am at a loss as to what criteria to use to find these E2 80 90 hyphens.

Your character is the Unicode hyphen; there is information on it here:
https://www.charbase.com/2010-unicode-hyphen
You can see that its code point is U+2010 (hexadecimal), so in T-SQL you can build it with
SELECT NCHAR(0x2010)
From there you can use that character in any SQL statement, for example in a select like:
Select artist
From [dbo].[mytable] where artist like N'Sleater' + NCHAR(0x2010) + N'%'
or, as you want, in a
REPLACE( artist, NCHAR(0x2010), '-' )
with a "real" dash.
EDIT:
If the collation of your DB gives you some trouble with NCHAR(0x2010), you can also try the character N'‐' itself, copied and pasted from the charbase link I gave you:
REPLACE( artist , N'‐' , '-' )
You can even take it from the string here (made with the special character), so it is all done for you:
update mytable set artist=REPLACE( artist, N'‐' , '-' )

I don't know your table definition and COLLATION, but I'm almost sure that you are mixing NCHAR and CHAR types and converting Unicode multi-byte characters to single-byte representations. Take a look at this demo:
WITH Demo AS
(
SELECT N'ABC'+NCHAR(0x2010)+N'DEF' T
)
SELECT
T,
CASE WHEN T LIKE 'ABC'+CHAR(45)+'%' THEN 1 ELSE 0 END [Char],
CASE WHEN T LIKE 'ABC-%' THEN 1 ELSE 0 END [Hyphen],
CASE WHEN T LIKE N'ABC‐%' THEN 1 ELSE 0 END [Unicode-Hyphen],--unicode hyphen is used here
CASE WHEN T LIKE N'ABC'+NCHAR(45)+N'%' THEN 1 ELSE 0 END [NChar],
CASE WHEN CAST(T AS varchar(MAX)) LIKE 'ABC-%' THEN 1 ELSE 0 END [ConvertedToAscii],
ASCII(NCHAR(0x2010)) ConvertedToAscii,
CAST(SUBSTRING(T, 4, 1) AS varbinary) VarbinaryRepresentation
FROM Demo
My results:
T       Char Hyphen Unicode-Hyphen NChar ConvertedToAscii ConvertedToAscii VarbinaryRepresentation
------- ---- ------ -------------- ----- ---------------- ---------------- -----------------------
ABC‐DEF 0    0      1              0     1                45               0x1020
The 3-byte UTF-8 sequence E2 80 90 is the same character as code point 2010 (hexadecimal) in Unicode.
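To confirm which character is actually stored and then replace it, something like the following sketch may help. It assumes an NVARCHAR column; the table variable @artists and its rows are made up for illustration, and the binary collation is used only so that U+2010 and the ASCII hyphen are never treated as equal.

```sql
-- Hypothetical sample: one name with U+2010, one with an ASCII hyphen.
DECLARE @artists TABLE (artist NVARCHAR(100));
INSERT INTO @artists (artist)
VALUES (N'Sleater' + NCHAR(0x2010) + N'Kinney'),
       (N'Sleater-Kinney');

-- UNICODE() returns the code point: 8208 (0x2010) versus 45 for '-'.
SELECT artist,
       UNICODE(SUBSTRING(artist, 8, 1)) AS code_point
FROM @artists;

-- Replace only the rows that really contain U+2010.
UPDATE @artists
SET artist = REPLACE(artist COLLATE Latin1_General_BIN2, NCHAR(0x2010), N'-')
WHERE artist COLLATE Latin1_General_BIN2 LIKE N'%' + NCHAR(0x2010) + N'%';

SELECT artist,
       UNICODE(SUBSTRING(artist, 8, 1)) AS code_point
FROM @artists;
```

Unlike ASCII(), UNICODE() does not convert the character to the single-byte code page first, so it shows the stored code point rather than a best-fit substitute.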

Dutch zipcodes regex in SQL Server 2014 always returns 0

I might be missing something. I have to check Dutch zipcodes, but I have some user-entered data in my database. I want to check whether each zipcode could be an actual zipcode. Format for Dutch zipcodes: 1000-9999AA-ZZ
So any integer between 1000 and 9999 in combination with 2 letters can be a valid zipcode (there are some additional parameters, but I am not worrying about them for now).
I didn't get my regex to work with this code:
iif(ZipCode like '^[1-9][0-9]{3}\s[a-zA-Z]{2}$',1,0) as MatchIndicator
Yet it always returns zero.
I even tried it with a simpler regex
iif(ZipCode like '^[1-9]',1,0) as MatchIndicator
Returns 0 every time as well.
I found myself an alternative, but I think the regex code is better to use in the long run for more complicated text.
Alternative
case when LEFT(ZipCode,1) between '1' and '9'
and substring(ZipCode,2,1) between '0' and '9'
and substring(ZipCode,3,1) between '0'and '9'
and substring(ZipCode,4,1) between '0' and '9'
and substring(ZipCode,5,1) between 'A' and 'Z'
and substring(ZipCode,6,1) between 'A' and 'Z' then 1 else 0 end as MatchIndicator
And
patindex('[1-9][0-9][0-9][0-9][a-zA-Z][a-zA-Z]',ZipCode)
Any thoughts?

SQL Server doesn't support 'proper' regex. So how about:
CREATE TABLE #Test (Postcode VARCHAR(6))
INSERT INTO #Test
VALUES
('1234AZ'),
('9876ZQ'),
('1900Sz'),
('ABCDe1'),
('XwYx1A'),
('5000A1')
SELECT
PostCode,
CASE WHEN
TRY_CAST(SUBSTRING(PostCode, 1, 4) AS INT) BETWEEN 1000 AND 9999
AND
PATINDEX('%[A-Z][A-Z]%' COLLATE Latin1_General_Bin, SUBSTRING(PostCode, 5, 2)) > 0 THEN 1 ELSE 0
END IsValid
FROM #Test
PostCode IsValid
-------- -----------
1234AZ 1
9876ZQ 1
1900Sz 0
ABCDe1 0
XwYx1A 0
5000A1 0
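Since LIKE has no quantifiers such as {3} and no \s, each position has to be spelled out. A rough sketch of the pattern from the question, written as plain LIKE patterns, might look like this. The table variable @zip and its sample rows are made up, and the binary collation is an assumption chosen so that [A-Za-z] means exactly the unaccented letters.

```sql
-- Hypothetical sample values.
DECLARE @zip TABLE (ZipCode VARCHAR(10));
INSERT INTO @zip (ZipCode)
VALUES ('1234AZ'),   -- intended as valid: four digits, two letters
       ('1234 az'),  -- intended as valid: optional space, lowercase letters
       ('0123AB'),   -- intended as invalid: leading zero
       ('12345A');   -- intended as invalid: five digits

SELECT ZipCode,
       CASE
           -- One pattern without the space, one with it.
           WHEN ZipCode COLLATE Latin1_General_BIN
                LIKE '[1-9][0-9][0-9][0-9][A-Za-z][A-Za-z]'
             OR ZipCode COLLATE Latin1_General_BIN
                LIKE '[1-9][0-9][0-9][0-9] [A-Za-z][A-Za-z]'
           THEN 1 ELSE 0
       END AS MatchIndicator
FROM @zip;
```

Because the patterns have no leading or trailing %, the whole value has to match, which plays the role of ^ and $ in the original regex.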

Parsing Chars In SQL Using PATINDEX

I'm trying to validate a string using raw SQL.
I tried using:
DECLARE @AlphaNumeric varchar(50)
SET @AlphaNumeric = '1017a'
SELECT SUBSTRING(@AlphaNumeric, 1, (PATINDEX('%[^0-9]%', @AlphaNumeric) - 1)) AS 'Numeric',
SUBSTRING(@AlphaNumeric, PATINDEX('%[^0-9]%', @AlphaNumeric), DATALENGTH(@AlphaNumeric)) AS 'Alpha'
But if the user types 101a7a, this doesn't work properly. What I want exactly is this:
the variable should always be numeric followed by alphabetic characters; the length doesn't matter.
For example :
2303A OK
23A434A NOT OK
A344 NOT OK.
4324AAC OK
This would be dead easy if I could do it in regex, but SQL gives me headaches :(

Numbers followed by letters are OK; letters followed by numbers aren't; all characters must be letters or numbers. Hence...
select * from yourtable
where yourfield like '%[0-9][a-z]%'
and not (yourfield like '%[a-z][0-9]%')
and not (yourfield like '%[^0-9a-z]%')

I think this will do what you want. At least, it works on your sample data:
with t as (
select '2303A' as col union all
select '23A434A' union all
select 'A344'
)
select *,
(case when col like '%[0-9]%' and
substring(col, patindex('%[A-Z]%', col), len(col)) not like '%[^A-Z]%'
then 'OK'
else 'NOT OK'
end)
from t;
The two conditions are as follows. First, check that the character string has a number somewhere. Then, check that there are only letters after the first letter is found. I'm assuming that all letters are uppercase.
EDIT:
There might be an easier way. You can check that a number is followed by a letter somewhere in the string, but that a letter is never followed by a number. For this, you only need like:
select (case when col not like '%[^A-Z0-9]%' and
col like '%[0-9][A-Z]%' and
col not like '%[A-Z][0-9]%'
then 'OK'
else 'NOT OK'
end)

I have an approach that should work in your situation. Basically, identify the position of the last integer and compare it to the position of the first non-integer. You can get the position of the last integer like this:
len(@AlphaNumeric) - PATINDEX('%[0-9]%', Reverse(@AlphaNumeric))+1
and you can get the position of the first non-integer like this:
PATINDEX('%[^0-9]%', @AlphaNumeric)
so that would make your where clause (where all integers precede any non-integers) like this:
Where (len(@AlphaNumeric) - PATINDEX('%[0-9]%', Reverse(@AlphaNumeric))+1 ) < PATINDEX('%[^0-9]%', @AlphaNumeric)
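As a quick check, the position comparison above can be run against the four sample values from the question. The table variable @samples and the column name val are made up for this sketch.

```sql
-- The four sample values from the question.
DECLARE @samples TABLE (val VARCHAR(50));
INSERT INTO @samples (val)
VALUES ('2303A'), ('23A434A'), ('A344'), ('4324AAC');

SELECT val,
       CASE
           -- Position of the last digit must come before the first non-digit.
           WHEN LEN(val) - PATINDEX('%[0-9]%', REVERSE(val)) + 1
                < PATINDEX('%[^0-9]%', val)
           THEN 'OK'
           ELSE 'NOT OK'
       END AS result
FROM @samples;
-- 2303A -> OK, 23A434A -> NOT OK, A344 -> NOT OK, 4324AAC -> OK
```

Note that this test alone does not reject punctuation: a value such as '12-A' would also pass, so a further condition like val NOT LIKE '%[^0-9A-Z]%' may be needed.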

Conversion failed when converting the nvarchar to int

I have a field which is varchar and contains numbers and dates as strings. I want to update all numbers in this field that are greater than 720. I first attempted a select, but I get this error:
Conversion failed when converting the nvarchar value '16:00' to data type int.
This is my query:
select id, case(isnumeric([other08])) when 1 then [other08] else 0 end
from CER where sourcecode like 'ANE%' --and other08 > 720
It fails when I uncomment the last part.
I am trying to get all numerics greater than 720, but I can't do the comparison. It also fails when casting and converting.
Thanks all for any help

You also need to perform the checks and conversion in the WHERE clause:
SELECT
id,
CASE WHEN isnumeric([other08]) = 1 THEN CAST([other08] AS INT) ELSE 0 END
FROM CER
WHERE sourcecode LIKE 'ANE%'
AND CASE WHEN isnumeric([other08]) = 1 THEN CAST([other08] AS INT) ELSE 0 END > 720

You need to use IsNumeric in your where clause, to avoid trying to compare strings to the number 720. Eg:
select id, case(isnumeric([other08])) when 1 then [other08] else 0 end
from CER
where sourcecode like 'ANE%' and ISNUMERIC(other08) = 1 and other08 > 720
EDIT
As @Abs pointed out, the above approach won't work. We can use a CTE to compute a reliable field to filter on, however:
WITH Data AS (
select
case WHEN isnumeric([other08]) = 1 THEN CAST([other08] AS int) else 0 end AS FilteredOther08
, CER.* -- already includes id; listing id separately would duplicate the column name in the CTE
from CER
where sourcecode like 'ANE%'
)
SELECT *
FROM Data
WHERE [FilteredOther08] > 720
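On SQL Server 2012 or later, TRY_CAST may be a simpler option than ISNUMERIC, because it returns NULL instead of raising an error and does not accept values such as '$' or '1e5' that ISNUMERIC treats as numeric. A minimal sketch, with a made-up table variable @CER standing in for the real CER table:

```sql
-- Hypothetical rows mimicking the mixed content of other08.
DECLARE @CER TABLE (id INT, sourcecode VARCHAR(20), other08 NVARCHAR(50));
INSERT INTO @CER (id, sourcecode, other08)
VALUES (1, 'ANE01', N'900'),
       (2, 'ANE02', N'16:00'),
       (3, 'ANE03', N'100'),
       (4, 'XYZ01', N'1000');

SELECT id,
       TRY_CAST(other08 AS INT) AS other08_int
FROM @CER
WHERE sourcecode LIKE 'ANE%'
  -- '16:00' becomes NULL, and NULL > 720 is not true, so the row is skipped.
  AND TRY_CAST(other08 AS INT) > 720;
-- Only id 1 satisfies both conditions.
```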

SQL Server: sort a column numerically if possible, otherwise alpha

I am working with a table that comes from an external source, and cannot be "cleaned". There is a column which is an nvarchar(20) and contains an integer about 95% of the time, but occasionally contains an alpha. I want to use something like
select * from sch.tbl order by cast(shouldBeANumber as integer)
but this throws an error on the odd "3A" or "D" or "SUPERCEDED" value.
Is there a way to say "sort it like a number if you can, otherwise just sort by string"? I know there is some sloppiness in that statement, but that is basically what I want.
Let's say, for example, the values were
7,1,5A,SUPERCEDED,2,5,SECTION
I would be happy if these were sorted in any of the following ways (because I really only need to work with the numeric ones)
1,2,5,7,5A,SECTION,SUPERCEDED
1,2,5,5A,7,SECTION,SUPERCEDED
SECTION,SUPERCEDED,1,2,5,5A,7
5A,SECTION,SUPERCEDED,1,2,5,7

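I really only need to work with the numeric ones
this will give you only the numeric ones, sorted numerically:
SELECT
*
FROM YourTable
WHERE ISNUMERIC(YourColumn)=1
-- cast so that 10 sorts after 9; a plain ORDER BY YourColumn would sort as text
ORDER BY CASE WHEN ISNUMERIC(YourColumn)=1 THEN CAST(YourColumn AS int) END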

select
*
from
sch.tbl
order by
case isnumeric(shouldBeANumber)
when 1 then cast(shouldBeANumber as integer)
else 0
end

Provided that your numbers are not more than 100 characters long:
WITH chars AS
(
SELECT 1 AS c
UNION ALL
SELECT c + 1
FROM chars
WHERE c <= 99
),
rows AS
(
SELECT '1,2,5,7,5A,SECTION,SUPERCEDED' AS mynum
UNION ALL
SELECT '1,2,5,5A,7,SECTION,SUPERCEDED'
UNION ALL
SELECT 'SECTION,SUPERCEDED,1,2,5,5A,7'
UNION ALL
SELECT '5A,SECTION,SUPERCEDED,1,2,5,7'
)
SELECT rows.*
FROM rows
ORDER BY
(
SELECT SUBSTRING(mynum, c, 1) AS [text()]
FROM chars
WHERE SUBSTRING(mynum, c, 1) BETWEEN '0' AND '9'
FOR XML PATH('')
) DESC

SELECT
(CASE ISNUMERIC(shouldBeANumber)
WHEN 1 THEN
RIGHT(CONCAT('00000000',shouldBeANumber), 8)
ELSE
shouldBeANumber
END) AS stringSortSafeAlpha
FROM sch.tbl -- table name taken from the question
ORDER BY
stringSortSafeAlpha
This will add leading zeros to all shouldBeANumber values that truly are numbers and leave all remaining values alone. This way, when you sort, you can use an alpha sort but still get the correct values (with an alpha sort, "100" would be less than "50", but if you change "50" to "050", it works fine). Note, for this example, I added 8 leading zeros, but you only need enough leading zeros to cover the largest possible integer in your column.
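Another possible sketch, assuming SQL Server 2012 or later, sorts on TRY_CAST directly: values that convert to an integer come first in numeric order, and everything else follows in text order. The table variable @tbl is made up; its values are the ones from the question plus '10', added to show that the sort is numeric.

```sql
-- Sample values from the question, plus '10'.
DECLARE @tbl TABLE (shouldBeANumber NVARCHAR(20));
INSERT INTO @tbl (shouldBeANumber)
VALUES (N'7'), (N'1'), (N'5A'), (N'SUPERCEDED'),
       (N'2'), (N'5'), (N'SECTION'), (N'10');

SELECT shouldBeANumber
FROM @tbl
ORDER BY
    -- 0 for values that convert to INT, 1 for the rest, so numbers come first.
    CASE WHEN TRY_CAST(shouldBeANumber AS INT) IS NULL THEN 1 ELSE 0 END,
    -- Numeric order among the convertible values.
    TRY_CAST(shouldBeANumber AS INT),
    -- Text order among the remaining values.
    shouldBeANumber;
-- 1, 2, 5, 7, 10, 5A, SECTION, SUPERCEDED
```

This matches the first of the orderings the question lists as acceptable and needs no assumption about the maximum number of digits.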
