SQL Server Convert Data - sql-server

I'm having trouble cleaning a database because SQL Server doesn't differentiate '2¹59' from '2159', but when I try to convert the value to INT it obviously returns an error.
In this case I need to replace every non-numeric value with NULL.
Can someone help, please? (I'm using SQL Server 2008.)

Starting with SQL Server 2012 there is a new function called TRY_PARSE.
It returns NULL instead of raising an error when the value cannot be parsed as the requested type:
select TRY_PARSE('2¹59' as int)
The output of the above query will be NULL.

You can use a different collation to change the way the strings are compared:
select
case when N'2¹59' = N'2159' collate Latin1_General_BIN then 1 else 0 end
This will select 0 as you'd expect.
More importantly, since MS SQL understands unicode properly, you can do this:
select cast(N'2¹59' as varchar)
which will give you '2159' - properly replacing the "broken" digits.
If you have no other option, you could also build a helper table to handle indexing the string (just a single column with numbers 1..1000 for example), and do something like this:
exists
(
select 1 from [Numbers]
where
[Numbers].[Index] < len([Value]) + 1
and
unicode(substring([Value], [Numbers].[Index], 1)) > 127
)
Needless to say, this is going to be rather slow. For simple integers, though, this can work as a decent validation - simply use (unicode(substring([Value], [Numbers].[Index], 1)) not between 48 and 57) and ([Numbers].[Index] <> 1 or substring([Value], 1, 1) <> '-') for example, which flags any character that is not a digit unless it is a leading minus sign.
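On SQL Server 2008, where TRY_PARSE is not available, the binary-collation idea above can be combined with a pattern check. The following is only a sketch: the inline sample values and the names v, [Value] and [AsInt] are made up for illustration, and it assumes the column should hold plain unsigned integers that fit in an int.

```sql
-- Sketch only: sample values and column names are hypothetical.
SELECT
    v.[Value],
    CASE
        -- Under a binary collation the range [0-9] covers only the ASCII digits,
        -- so look-alike characters such as the superscript one are rejected.
        WHEN v.[Value] <> N''
         AND v.[Value] COLLATE Latin1_General_BIN NOT LIKE N'%[^0-9]%'
        THEN CAST(v.[Value] AS int)
        ELSE NULL
    END AS [AsInt]
FROM (VALUES (N'2159'), (N'2¹59'), (N'abc')) AS v([Value]);
```

With this check N'2¹59' and N'abc' should come back as NULL while N'2159' is converted. A digit string that is too long for an int would still overflow, so a length check may be needed for real data.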

Related

In MS SQL an index on a computed column that uses RIGHT and CHARINDEX results in Invalid length parameter passed to RIGHT

I have this computed column
ALTER TABLE mytable
ADD vMessage AS (CONVERT([nvarchar] (200),RIGHT(Message,CHARINDEX('.',REVERSE(Message),1)-1),0))
And I am trying to make an index on vMessage
CREATE NONCLUSTERED INDEX [IX_vMessage]
ON mytable ([vMessage])
I get this error
Invalid length parameter passed to the RIGHT function.
Creating the computed column and running this query works
SELECT
Message,
RIGHT(Message,charindex('.',reverse(Message),1)-1),
CONVERT([nvarchar](200),RIGHT(Message,CHARINDEX('.',REVERSE(Message),1)-1),0)
FROM mytable
The data is similar to this sample data.
DECLARE @mytable TABLE (message nvarchar(1024))
INSERT INTO @mytable (message) VALUES
('Services.Common.Contracts.InternalContract'),
('Services.Common.Contracts.ItemArchivedContract'),
('Services.Common.Contracts.ItemCreatedContract'),
('Services.Common.Contracts.ItemInformationUpdatedContract'),
('Services.Common.Contracts.EmailContract'),
('Services.Common.Contracts.Customer.SetCredentialsContract'),
('Services.Common.Contracts.InternalItemContract')
SELECT
Message,
RIGHT(Message,charindex('.',reverse(Message),1)-1),
CONVERT([nvarchar](200),RIGHT(Message,CHARINDEX('.',REVERSE(Message),1)-1),0)
FROM @mytable
No message is empty. All messages have at least one dot character in them. Below returns nothing.
SELECT * FROM mytable WHERE CHARINDEX('.', Message) = 0
The problem is that CHARINDEX can return 0 if it doesn't find the value being searched for. To avoid causing errors because of this, you should always put CHARINDEX into NULLIF to null out the 0
ALTER TABLE mytable
ADD vMessage AS (
CONVERT(nvarchar(200),
RIGHT(
Message,
NULLIF(
CHARINDEX(
'.',
REVERSE(Message)
),
0
) - 1
)
)
);
I say always, because using a WHERE doesn't always help, as SQL Server often rearranges expressions. NULLIF uses a CASE internally, which is the only guaranteed way for this not to happen.
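To see the effect of the NULLIF wrapper in isolation, here is a small sketch. The table variable @demo and the value N'NoDotHere' are made up for illustration; the second row deliberately contains no dot.

```sql
-- Hypothetical sample data: the second row has no dot at all.
DECLARE @demo TABLE (Message nvarchar(1024));

INSERT INTO @demo (Message) VALUES
    (N'Services.Common.Contracts.EmailContract'),
    (N'NoDotHere');

SELECT
    Message,
    -- 0 when there is no dot anywhere in the string
    CHARINDEX('.', REVERSE(Message)) AS DotPosFromEnd,
    -- NULLIF turns that 0 into NULL, so RIGHT receives NULL instead of -1
    RIGHT(Message, NULLIF(CHARINDEX('.', REVERSE(Message)), 0) - 1) AS LastSegment
FROM @demo;
```

The first row should yield EmailContract, and the row without a dot should yield NULL rather than raising the "Invalid length parameter" error.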
The code starts by reversing the string and then using CHARINDEX() to look for a . character. Since you don't want to keep the . in the final result, you then subtract 1 from the returned value. This is where the problem comes in.
CHARINDEX() returns 0 if the value isn't found (it uses 1-based rather than 0-based indexing for the string). When you subtract 1 from that and pass the result (-1) to the RIGHT() function, you have an invalid argument and will see this error.
But I also see this:
All messages have at least one dot character in them.
I suggest checking that again. Perhaps the test is running in a different environment from production. We can see your query runs fine on the provided sample data when we load it to db fiddle:
https://dbfiddle.uk/bUMPVALz

Convert from string with leading zeros to bigint in SQL Server CE isn't working

I have a SQL Server Compact (3.5) database with a nvarchar column with lots of data that looks like 000000000011070876. I'm trying to copy that data to another column that is a BIGINT, using the CONVERT function.
My first try was:
UPDATE Mobile_Reservation
SET SAPNo = CONVERT(BIGINT, ItemNumber)
If I run this query in SQL Server 2008 R2, it works fine. 000000000011070876 becomes 11070876. Unfortunately, in SQL Server CE, it becomes 0. Apparently it cannot handle the leading zeros. But it will turn 000000004000010576 into 40, which I assumed meant it was only looking at the first 10 digits. Then I tried:
UPDATE Mobile_Reservation
SET SAPNo = CONVERT(BIGINT, SUBSTRING(ItemNumber, 8, 10))
With a start index of 8, I assumed it would start just before the 4 (The first 8 digits are always 0s, but may be more than 8 0s). This worked somewhat better, but not successfully. 000000000011070876 became 1107 and 000000004000010576 became 40000105.
Then I tried the hardcoded string:
UPDATE Mobile_Reservation
SET SAPNo = CONVERT(BIGINT, '4000010576')
And this worked fine, which confused me even more. I tried a few different combinations of strings, and the logic it seems to use is: for every leading 0 in the string, a char from the other end is removed. '1234' becomes 1234, but '01234' becomes 123. But it's not a hard-and-fast rule, because 04000010576 becomes 40000105, which means the single leading 0 is removing two digits from the end...
Is this a problem with SQL Server CE's implementation of CONVERT, or perhaps something else I'm not noticing? Any thoughts on how to fix this?
I wound up solving this problem with:
UPDATE Mobile_Reservation
SET SAPNo = CONVERT(BIGINT, REPLACE(LTRIM(REPLACE(ItemNumber, '0', ' ')), ' ', '0'))
Not the nicest solution, but it works.
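For readers wondering why the nested REPLACE/LTRIM works, the steps can be looked at one at a time. This is a sketch in plain T-SQL using one of the sample values from the question; it has not been run on SQL Server CE, and the column aliases are made up for illustration.

```sql
SELECT
    -- Step 1: every '0' becomes a space, including the ones inside the number
    REPLACE(N'000000004000010576', '0', ' ') AS ZerosToSpaces,
    -- Step 2: LTRIM removes only the leading spaces (the former leading zeros)
    LTRIM(REPLACE(N'000000004000010576', '0', ' ')) AS LeadingTrimmed,
    -- Step 3: the remaining spaces are turned back into zeros
    REPLACE(LTRIM(REPLACE(N'000000004000010576', '0', ' ')), ' ', '0') AS ZerosRestored;
```

The last column is the string that gets passed to CONVERT. Two edge cases follow from the logic: a value made up only of zeros ends up as an empty string, and any space already present in the data is turned into a zero.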
There is an implicit conversion between bigint and nvarchar() -- what about just trying
UPDATE Mobile_Reservation SET SAPNo = ItemNumber
If not that, then sometimes you also can get away with something like
UPDATE Mobile_Reservation SET SAPNo = 0 + ItemNumber
I'm sorry I don't have SQL CE to test on.

ms sql server executes 'then' before 'when' in case

When I try to select
select
case when (isnumeric(SUBSTRING([VZWECK2],1,9)) = 1)
then CONVERT(decimal,SUBSTRING([VZWECK2],1,9))
else null
end as [NUM]
from table
SQL Server gives me:
Msg 8114, Level 16, State 5, Line 2
Error converting data type varchar to numeric.
[VZWECK2] is a char(27). Is this a known bug? It seems to me that it executes the CONVERT before it evaluates the CASE condition, which defeats the purpose of my select. I know that some values are obviously not numeric, which is why I need the CASE expression to weed them out.
For some reason, selecting
select
case when (isnumeric(SUBSTRING([VZWECK2],1,9)) = 1)
then 99
else null
end as [NUM]
from table
yields no errors and behaves as expected.
The problem is that ISNUMERIC is very forgiving, and the fact that ISNUMERIC returns 1 is unfortunately no guarantee that CONVERT will work. This is why SQL Server 2012 introduced TRY_CAST and TRY_CONVERT.
If you are converting whole numbers, a more reliable check is to make sure the string consists of only digits with NOT LIKE '%[^0-9]%' (that is, it must not contain a non-digit anywhere). This is too restrictive for some formats (like floating point) but for integers it works nicely.
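A minimal sketch of that digits-only check, assuming the nine-character substring should contain nothing but digits. The inline sample values and the alias v(Candidate) are made up for illustration and stand in for SUBSTRING([VZWECK2],1,9).

```sql
-- Sketch only: Candidate stands in for SUBSTRING([VZWECK2], 1, 9).
SELECT
    v.Candidate,
    CASE
        -- Reject empty strings and anything containing a non-digit character
        WHEN v.Candidate <> ''
         AND v.Candidate NOT LIKE '%[^0-9]%'
        THEN CONVERT(decimal(9, 0), v.Candidate)
        ELSE NULL
    END AS [NUM]
FROM (VALUES ('123456789'), ('12345$789'), ('-'), ('')) AS v(Candidate);
```

Only the first value should be converted; the others should come back as NULL. Because the source column is a char(27), a substring padded with spaces would also be rejected by this check, which may or may not be what you want.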
Do you know the value which throws the error? IsNumeric is not exactly fool-proof, for example:
select ISNUMERIC('$')
select ISNUMERIC('+')
select ISNUMERIC('-')
all yield 1
Alternatively, you could go with TRY_PARSE instead.
Edit: TRY_PARSE was introduced in SQL Server 2012, so it may not be available to you.

Differences between Excel and SQL sorting

Programs used:
SQL Server 2000, Excel 2003
We have a table in our database called Samples. Using the following query...
SELECT [Sample], [Flag] FROM Samples
ORDER BY [Sample]
... we get the following results:
Sample Flag
---------- ----
12-ABC-345 1
123-45-AB 0
679-ADC-12 1
When the user has the same data in an Excel spreadsheet, and sorts by the Sample column, they get the following sort order:
Sample Flag
---------- ----
123-45-AB 0
12-ABC-345 1
679-ADC-12 1
Out of curiosity, why is there a discrepancy between the sort in SQL and Excel (other than, "because it's Microsoft").
Is there a way in SQL to sort on the Sample column in the same method as the Excel method, or vice versa?
The SQL server sorting is determined by the database, table, or field collation. By default, this is a standard lexicographical string sort (the character code for the hyphen is numerically lower than the character code for 1). Unfortunately, according to this Microsoft link, Excel ignores hyphens and apostrophes when sorting, except for tie-breaking. There's no collation that does this specifically (that I'm aware of), so you'll have to fake it.
To achieve the same result in SQL Server, you'd need to do:
SELECT [Sample], [Flag] FROM Samples
ORDER BY REPLACE(REPLACE([Sample], '-', ''), '''', ''),
(CASE WHEN CHARINDEX('-', [Sample]) > 0 THEN 1 ELSE 0 END) +
(CASE WHEN CHARINDEX('''', [Sample]) > 0 THEN 1 ELSE 0 END) ASC
This orders the results by the string as if it had all hyphens and apostrophes removed, then orders by a computed value that yields 1 for any value that contains a hyphen or an apostrophe, 2 for any value that contains both, and 0 for a value that contains neither. The second expression causes any value that contains a hyphen and/or apostrophe to sort after a value that is otherwise equivalent, just like Excel.
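As a quick check against the three sample rows from the question, the primary sort key can be tried on inline data. This is a sketch: the derived table s is made up for illustration, it uses UNION ALL so that it should also run on older versions, and the tie-breaker is left out because these samples contain no ties.

```sql
SELECT s.[Sample], s.[Flag]
FROM (
    SELECT '12-ABC-345' AS [Sample], 1 AS [Flag]
    UNION ALL SELECT '123-45-AB', 0
    UNION ALL SELECT '679-ADC-12', 1
) AS s
-- Compare the values as if hyphens and apostrophes were not there
ORDER BY REPLACE(REPLACE(s.[Sample], '-', ''), '''', '');
```

With the hyphens stripped, the keys are 12ABC345, 12345AB and 679ADC12, so 123-45-AB should sort first, as it does in Excel.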
I personally consider SQL Server sorting order correct and I'd intervene on Excel, as it's the one following an "unusual" method (at least, from my experience).
Here's an explanation of how Excel sorts alphanumeric data, and how to fix it: How to correctly sort alphanumeric data in Excel.

How to use ORDER BY, LOWER in SQL SERVER 2008 with non-unicode data

The question is about Armenian. I'm using sql server 2005, collation
SQL_Latin1_General_CP1_CI_AS, data mostly is in Armenian and we can't use unicode.
I tested on ms sql 2008 with a windows collation for armenian language ( Cyrillic_General_100_ ), I have found here, ( http://msdn.microsoft.com/en-us/library/ms188046.aspx ) but it didn't help.
I have a function that orders hex values and a lower function that takes each char in each string and converts it to its lowercase form, but that's not an acceptable solution: calling those functions on every column of a huge table is really slow.
Is there any solution for this issue not using unicode and not working with hex values manually?
UPDATE:
On the left side are mixed case words, sorted in the right order and with lower case representations on the right side. Hope this will help. Thank You.
Words are written in unicode.
ԱբԳդԵզ -> աբգդեզ
ԱգԳսԴԼ -> ագգսդլ
ԲաԴֆդԴ -> բադֆդդ
ԳԳԼասա -> գգլասա
ԴմմլօՏ -> դմմլօտ
ԵլԲնՆն -> ելբննն
ԶՎլուտ -> զվլուտ
էԹփձջՐ -> էթփձջր
ԸխԾդսՂ -> ըխծդսղ
ԹԶէըԿր -> թզէըկր
One solution would be to create a computed column for each text column which converts the value into Armenian collation and sets it to lower case like so:
Alter Table TableName
Add TextValueArmenian As ( LOWER(TextColumn COLLATE Latin1_General_CI_AS) ) PERSISTED
Once you do this, you can put indexes on these columns and query for them.
If that isn't your flavor of tea, then another solution would be an indexed view where you create a view with SCHEMABINDING that casts each of the various columns to lower case and to the right collation and then put indexes on that view.
EDIT I notice in your examples that you are using a case-insensitive, accent-sensitive collation. Perhaps the simple solution to your ordering issues would be to use Latin1_General_CS_AS or Cyrillic_General_100_CS_AS if available.
EDIT
Whew. After quite a bit of research, I think I have an answer, which unfortunately may not be the one you want. First, yes, I can copy the text you provided into code or something like Notepad++ because StackOverflow is encoded using UTF-8 and Armenian will fit into UTF-8. Second, this hints at what you are trying to achieve: storing UTF-8 in SQL Server. Unfortunately, SQL Server 2008 (or any prior version) does not natively support UTF-8. In order to store data in UTF-8, you have a handful of choices:
Store it in binary and convert it to UTF-8 on the client (which pretty much eliminates any sorting done on the server)
Store it in Unicode and convert it to UTF-8 on the client. It should be noted that the SQL Server driver will already convert most strings to Unicode and your example does work fine in Unicode. The obvious downside is that it eats up twice the space.
Create a CLR user defined type in SQL Server to store UTF-8 values. Microsoft provides a sample that comes with SQL Server to do just this. You can download the samples from CodePlex from here. You can also find more information on the sample in this article in the Books Online. The downside is that you have to have the CLR enabled in SQL Server and I'm not sure how well it will perform.
Now, that said, I was able to get your sample working with no problem using Unicode in SQL Server.
If object_id('tempdb..#Test') Is Not Null
Drop Table #Test
GO
Create Table #Test
(
EntrySort int identity(1,1) not null
, ProperSort int
, MixedCase nvarchar(50)
, Lowercase nvarchar(50)
)
GO
Insert #Test(ProperSort, MixedCase, Lowercase)
Select 1, N'ԱբԳդԵզ',N'աբգդեզ'
Union All Select 6, N'ԵլԲնՆն',N'ելբննն'
Union All Select 2, N'ԱգԳսԴԼ',N'ագգսդլ'
Union All Select 3, N'ԲաԴֆդԴ',N'բադֆդդ'
Union All Select 4, N'ԳԳԼասա',N'գգլասա'
Union All Select 5, N'ԴմմլօՏ',N'դմմլօտ'
Union All Select 9, N'ԸխԾդսՂ',N'ըխծդսղ'
Union All Select 7, N'ԶՎլուտ',N'զվլուտ'
Union All Select 10, N'ԹԶէըԿր',N'թզէըկր'
Union All Select 8,N'էԹփձջՐ',N'էթփձջր'
Select * From #Test Order by ProperSort
Select * From #Test Order by Lowercase
Select * From #Test Order by Lower(MixedCase)
All three of these queries return the same result.
Did you get an error like this?
Msg 448, Level 16, State 1, Line 1
Invalid collation 'Cyrillic_General_100_'.
Try:
ORDER BY Name COLLATE Cyrillic_General_100_CI_AS
Or pick one you prefer from this list:
select * from fn_helpcollations()
where name like 'Cyrillic_General_100_%'
