ASCII Comparisons when Using SQL_Latin1_General_CP1_CI_AS - sql-server

I ran the query below to see how strings would compare in SQL Server, since a different query based on comparison of fields with this type of data yielded different results than expected. I wasn't sure whether the angle bracket would be considered a lower value than a number (based on the ASCII table it would not, but wanted to check). As a result of the outcome, I've reviewed multiple posts regarding string comparison, collation and expected values and they all seem to reinforce that this should not be working this way. The collation on the database (as on the field that caused trouble) is SQL_Latin1_General_CP1_CI_AS.
SELECT ASCII('<') AS [<]
,ASCII('0') AS [0]
,CASE WHEN '<0.1' < '0.1' THEN 1 ELSE 0 END AS TEST1
,CASE WHEN '<' < '0' THEN 1 ELSE 0 END AS TEST2
,CASE WHEN ASCII('<') < ASCII('0') THEN 1 ELSE 0 END AS TEST3
Result:
<    0    TEST1  TEST2  TEST3
60   48   1      1      0
Any ideas to point me in the right direction would be very much appreciated!
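One likely explanation, offered as a sketch rather than a definitive answer: a non-binary collation such as SQL_Latin1_General_CP1_CI_AS compares strings by the collation's own sort rules, not by raw ASCII codes, and the results above show that those rules place '<' before '0'. Forcing a binary collation on the comparison makes it follow code points, which agrees with what ASCII() reports. Latin1_General_BIN2 is used here only as an illustrative choice, and the column aliases are made up for the example.

```sql
-- Same comparisons, once under the database collation and once under a binary collation.
SELECT CASE WHEN '<' < '0' COLLATE SQL_Latin1_General_CP1_CI_AS THEN 1 ELSE 0 END AS CollationOrder
      ,CASE WHEN '<' < '0' COLLATE Latin1_General_BIN2 THEN 1 ELSE 0 END AS BinaryOrder
      ,CASE WHEN '<0.1' < '0.1' COLLATE Latin1_General_BIN2 THEN 1 ELSE 0 END AS BinaryOrderTest1;
```

With the binary collation both comparisons should return 0, because code 60 ('<') is greater than code 48 ('0'), while the first column should still return 1 as in the original result.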

Related

SQL Equivalent of Countif or Index Match on row level instead of columnar level

I have a table which comprises 30 columns, all adjacent to one another. Of these, 5 are text fields indicating certain details pertaining to that entry and 25 are value fields. The value fields are named Val00, Val01, Val02 ... up to Val24.
Based on logic defined elsewhere, the first n value fields hold a value and all subsequent fields drop to 0.
e.g.
When n is 5 the output will be
Val00  Val01  Val02  Val03  Val04  Val05  Val06  Val07  Val24
1.5    1.5    1.5    1.5    1.5    0      0      0      0
As can be seen, all columns from Val05 to Val24 will be 0.
Given this output, I want to find what this n value is and create a new column ValCount to store it.
In Excel this would be fairly straightforward to achieve with the formula below:
=COUNTIF(Val00:Val24,">0")
However, I'm not sure how to go about this in SQL. I understand that the COUNT function works on a columnar level and not on a row level.
I found a solution which is rather long but it should do the job so I hope it helps you.
SELECT CASE WHEN Val00 > 0 THEN 1 ELSE 0 END
+ CASE WHEN Val01 > 0 THEN 1 ELSE 0 END
+ CASE WHEN Val02 > 0 THEN 1 ELSE 0 END -- repeat for Val03 through Val24
AS ValCount
FROM YourTable -- placeholder: replace with your table name
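A more compact row-level count is possible if this is SQL Server, which the question does not state. The sketch below unpivots each row's value columns with a VALUES list inside CROSS APPLY and counts the positive ones. The table #Vals, its Id column and the sample data are hypothetical, and only three of the 25 value columns are shown.

```sql
-- Hypothetical sample table; extend the column list up to Val24 for the real table.
CREATE TABLE #Vals (Id INT, Val00 DECIMAL(9,2), Val01 DECIMAL(9,2), Val02 DECIMAL(9,2));
INSERT INTO #Vals VALUES (1, 1.5, 1.5, 0), (2, 1.5, 0, 0);

SELECT t.Id, c.ValCount
FROM #Vals AS t
CROSS APPLY (
    -- Turn the row's columns into rows, then count those greater than 0.
    SELECT COUNT(*) AS ValCount
    FROM (VALUES (t.Val00), (t.Val01), (t.Val02)) AS v(Val)
    WHERE v.Val > 0
) AS c;

DROP TABLE #Vals;
```

For the sample rows this gives a ValCount of 2 for Id 1 and 1 for Id 2, mirroring what COUNTIF(...,">0") does per row.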

Is there a rule for naming columns a certain name?

So we have a SQL Server 2017 instance with the default collation of Vietnamese_100_CI_AS_KS.
When we create a Temp Table, we see that if a specific name is used, it somehow sees one letter as case sensitive.
Example:
CREATE TABLE #MaterialTest
(
Product VARCHAR(50),
Quality VARCHAR (50),
MaterialOriginGroup VARCHAR (50)
)
INSERT INTO #MaterialTest VALUES ('Papers', 'Good', 'Test1')
If I query this table using:
Select Product, Quality, MaterialOriginGroup from #MaterialTest
It works fine. If I query it using pretty much anything else, it also works fine.
But if I use a lower case "G" for "Group" in MaterialOriginGroup it fails.
Example:
Use TempDB
Select product, quality, materialorigingroup from #MaterialTest
Msg 207, Level 16, State 1, Line 2
Invalid column name 'materialorigingroup'.
If I query it with:
Use TempDB
Select product, quality, materialoriginGroup from #MaterialTest
It works.
Any idea why?
Building on my own investigations and @Larnu's SQL Fiddle, I've devised a little script to test two-letter combinations:
DECLARE @Letter VARCHAR(2);
SET @Letter = 'ch';
DECLARE @LetterA VARCHAR(1);
DECLARE @LetterB VARCHAR(1);
SET @LetterA = LEFT(@Letter, 1);
SET @LetterB = RIGHT(@Letter, 1);
SELECT CASE WHEN V.G = UPPER(@Letter) COLLATE Vietnamese_100_CI_AS_KS THEN 1 ELSE 0 END,
CASE WHEN V.G = LOWER(@Letter) COLLATE Vietnamese_100_CI_AS_KS THEN 1 ELSE 0 END,
CASE WHEN V.G = CONCAT(LOWER(@LetterA), UPPER(@LetterB)) COLLATE Vietnamese_100_CI_AS_KS THEN 1 ELSE 0 END,
CASE WHEN V.G = CONCAT(UPPER(@LetterA), LOWER(@LetterB)) COLLATE Vietnamese_100_CI_AS_KS THEN 1 ELSE 0 END
FROM (VALUES(@Letter COLLATE Vietnamese_100_CI_AS_KS)) V(G);
This will output 1 or 0 depending on whether each comparison succeeds for the two-letter combination. The comparisons, in column order, are:
Both letters upper case (e.g. NG)
Both letters lower case (e.g. ng)
Lower case followed by upper case (e.g. nG)
Upper case followed by lower case (e.g. Ng)
Running this for ab produces:
1 1 1 1
So ab does not exhibit this issue. However, for ch you get:
1 1 0 1
So ch does exhibit the same issue.
This seems related to declared consonants in the Vietnamese alphabet, which include these two-letter consonants:
Ch = 1 1 0 1 (has the same issue)
Gh = 1 1 0 1 (has the same issue)
Gi = 1 1 0 1 (has the same issue)
Kh = 1 1 0 1 (has the same issue)
Ng = 1 1 0 1 (has the same issue)
Nh = 1 1 0 1 (has the same issue)
Ph = 1 1 0 1 (has the same issue)
Qu = 1 1 0 1 (has the same issue)
Th = 1 1 0 1 (has the same issue)
Tr = 1 1 0 1 (has the same issue)
Unfortunately, I do not know why these are affected by this issue, which is not helpful. In addition, I don't know why it only affects the column name when it is a lowercase followed by an uppercase letter.
Perhaps someone else may see this and know the answer?
The reference for the alphabet is here.

Convert Decode from Oracle to Case from MS SQL Server

I am struggling to convert one line of a query from Oracle to SQL Server 2012. The line is:
DECODE(SUM(DECODE(a.canceldate, NULL, 1,0)), 1, NULL, To_Date(MAX(TO_CHAR(a.canceldate,'yyyymmdd')), 'yyyymmdd')) dCancelDate,
My interpretation is to convert it like this:
case a.canceldate
(when sum(case a.canceldate when Null then 1 else 0 end))
when 1
then 0
else convert(datetime,a.canceldate)
end max(a.canceldate) as dCancelDate,
I would appreciate some assistance; my line is not correct for SQL Server 2012.
The decode formula is equivalent to
case sum(case when a.canceldate is null then 1 else 0 end) when 1 then null
else to_date( ... ) end dCancelDate, ...
One mistake I saw in your translation is that you have when sum(...) when 1. You can't have it both ways, it is either when sum(...) = 1 or sum(...) when 1. It may be the only mistake, I didn't look too hard.
What you have within the to_date() is horrible; are you converting dates to character strings, then taking the max IN ALPHABETICAL ORDER and then translating back to a date? Why? Perhaps just to drop the time-of-day component? That is a lot easier done with trunc(max(a.canceldate)).
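Since trunc() and to_date() are Oracle functions, here is a sketch of how the same logic might look in SQL Server 2012. It keeps the DECODE semantics literally (NULL only when exactly one row in the group has no cancel date) and uses CAST(... AS DATE) to drop the time of day. The table #Accounts, its id column and the sample rows are hypothetical; only a.canceldate and dCancelDate come from the question.

```sql
-- Hypothetical sample data to make the example self-contained.
CREATE TABLE #Accounts (id INT, canceldate DATETIME NULL);
INSERT INTO #Accounts VALUES
    (1, '2020-01-15T10:30:00'), (1, NULL),
    (2, '2021-03-01T08:00:00'), (2, '2021-06-30T17:45:00');

SELECT a.id,
       CASE WHEN SUM(CASE WHEN a.canceldate IS NULL THEN 1 ELSE 0 END) = 1
            THEN NULL                              -- exactly one NULL cancel date in the group
            ELSE CAST(MAX(a.canceldate) AS DATE)   -- latest cancel date, time of day removed
       END AS dCancelDate
FROM #Accounts AS a
GROUP BY a.id;

DROP TABLE #Accounts;
```

For the sample rows, id 1 should yield NULL and id 2 should yield 2021-06-30. If the intent was "NULL when any row is still open", the comparison would need to be >= 1 instead of = 1; that is a guess about intent, not something the original line does.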

Dutch zipcodes regex in SQL Server 2014 always returns 0

I might be missing something. I have to check Dutch zipcodes, but I got some user entered data in my database. I want to check if the zipcode can be an actual zipcode. Format for Dutch zipcodes: 1000-9999AA-ZZ
So any integer between 1000 and 9999 in combination with 2 letters can be a valid zipcode (there are some additional parameters, but I am not worrying about them for now).
I didn't get my regex to work with this code:
iif(ZipCode like '^[1-9][0-9]{3}\s[a-zA-Z]{2}$',1,0) as MatchIndicator
Yet it always returns zero.
I even tried it with a simpler regex
iif(ZipCode like '^[1-9]',1,0) as MatchIndicator
It returns 0 every time as well.
I found myself an alternative, but I think regex would be better to use in the long run for more complicated text.
Alternative
case when LEFT(ZipCode,1) between '1' and '9'
and substring(ZipCode,2,1) between '0' and '9'
and substring(ZipCode,3,1) between '0'and '9'
and substring(ZipCode,4,1) between '0' and '9'
and substring(ZipCode,5,1) between 'A' and 'Z'
and substring(ZipCode,6,1) between 'A' and 'Z' then 1 else 0 end as MatchIndicator
And
patindex('[1-9][0-9][0-9][0-9][a-zA-Z][a-zA-Z]',ZipCode)
Any thoughts?
SQL Server doesn't support 'proper' regex. So how about:
CREATE TABLE #Test (Postcode VARCHAR(6))
INSERT INTO #Test
VALUES
('1234AZ'),
('9876ZQ'),
('1900Sz'),
('ABCDe1'),
('XwYx1A'),
('5000A1')
SELECT
PostCode,
CASE WHEN
TRY_CAST(SUBSTRING(PostCode, 1, 4) AS INT) BETWEEN 1000 AND 9999
AND
PATINDEX('%[A-Z][A-Z]%' COLLATE Latin1_General_Bin, SUBSTRING(PostCode, 5, 2)) > 0 THEN 1 ELSE 0
END IsValid
FROM #Test
PostCode IsValid
-------- -----------
1234AZ 1
9876ZQ 1
1900Sz 0
ABCDe1 0
XwYx1A 0
5000A1 0
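For completeness, the original LIKE attempts return 0 because T-SQL LIKE is not regex: it understands %, _ and bracketed character sets, but not anchors, quantifiers such as {3}, or \s, so those characters are matched literally. A pattern that spells out each position works. The sketch below accepts the code with or without a space before the letters, which is an assumption based on the \s in the question; the sample values are made up.

```sql
SELECT v.ZipCode,
       CASE WHEN v.ZipCode LIKE '[1-9][0-9][0-9][0-9][A-Z][A-Z]'
              OR v.ZipCode LIKE '[1-9][0-9][0-9][0-9] [A-Z][A-Z]'
            THEN 1 ELSE 0 END AS MatchIndicator
FROM (VALUES ('1234AZ'), ('1234 AZ'), ('0123AB'), ('5000A1')) AS v(ZipCode);
```

The first two values should match and the last two should not. Under a case-insensitive collation [A-Z] also matches lower-case letters; add a binary COLLATE clause, as in the PATINDEX example above, if only upper case should pass.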

Why is '-' equal to 0 (zero) in SQL?

When you run the following query in SQL Server Management Studio the result will be 1.
SELECT
CASE WHEN '-' = 0 THEN
1
ELSE
0
END
That scares me a bit, because I have to check for a 0 value numerous times and it seems the check is vulnerable to treating the value '-' as equal.
You're looking at it the wrong way around.
'-' is a string, so it will get implicitly cast to an integer value when comparing it with an integer:
select cast('-' as int) -- outputs 0
To make sure that you are actually comparing a value to the string '0', make your comparison like this instead:
select case when '-' = '0' then 1 else 0 end
In general, you're asking for trouble when you compare values of different data types, since implicit conversions happen behind the scenes, so avoid it wherever you can.
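If the column really holds text and you still need a numeric zero check, one option is to confirm that the string contains only digits before converting it. This is a sketch under assumptions: the temp table #Inputs and its sample values are hypothetical, and only short non-negative integer strings are handled (no signs, decimals or values too large for INT).

```sql
-- Hypothetical sample data.
CREATE TABLE #Inputs (Input VARCHAR(10));
INSERT INTO #Inputs VALUES ('-'), ('+'), ('0'), ('00'), ('abc'), ('7');

SELECT i.Input,
       CASE
           WHEN i.Input = '' OR i.Input LIKE '%[^0-9]%' THEN 0  -- empty or contains a non-digit
           WHEN CAST(i.Input AS INT) = 0 THEN 1                 -- digits only, safe to convert
           ELSE 0
       END AS IsZero
FROM #Inputs AS i;

DROP TABLE #Inputs;
```

Only '0' and '00' should be flagged as zero here; '-' and '+' are rejected by the digit check before any conversion takes place.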
