I want to write a query to see if a category field is within a certain range. The problem is that the field can contain null, text, or numeric text prefixed by a '#' character.
Does anybody know of SQL that will strip the non-numerics and allow me to do the following check?
category > 1 and category < 100
Here is a sample of what the field category can contain:
#230.1
#200
Null
text
I am using SQL Server 2000
It appears astander's solution is functional. You should, however, consider a few points:
If the table holds more than a few thousand rows, and if this type of query is to be run frequently, it may be beneficial to introduce a new column to hold the numeric value of the category (if available, null otherwise). This will be more efficient for two reasons: as written, SQL Server needs to scan the whole table, i.e., review every single row; it also needs to perform all these conversions, which are a bit expensive CPU-wise.
You may consider introducing some extra logic to normalize the category field, for example to get rid of common leading or trailing characters. This will "rescue" several category codes that would otherwise translate to null and not be able to participate in these filters.
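The first point could be sketched roughly as follows. This is only an illustration: the table name dbo.Items and the new column category_num are hypothetical, and the parsing rule simply mirrors the ISNUMERIC/CAST logic from astander's query. ISNUMERIC also accepts some strings (such as '$') that CAST(... AS FLOAT) rejects, so real data may need extra guarding.

```sql
-- Hypothetical names: dbo.Items (table), category (existing varchar column),
-- category_num (new column; stays NULL when the category is not numeric).
ALTER TABLE dbo.Items ADD category_num FLOAT NULL
GO

-- Populate the new column once, using the same rule as the ad hoc query:
-- either the whole value is numeric, or everything after the first character is.
UPDATE dbo.Items
SET category_num =
    CASE
        WHEN ISNUMERIC(category) = 1
            THEN CAST(category AS FLOAT)
        WHEN LEN(category) > 1 AND ISNUMERIC(RIGHT(category, LEN(category) - 1)) = 1
            THEN CAST(RIGHT(category, LEN(category) - 1) AS FLOAT)
    END
GO

-- An index lets range filters seek instead of scanning and converting every row.
CREATE INDEX IX_Items_category_num ON dbo.Items (category_num)
GO

SELECT *
FROM dbo.Items
WHERE category_num > 1 AND category_num < 100
```

The column would have to be kept in sync when category changes, for example by repeating the UPDATE for the affected rows.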
Try something like this
DECLARE @Table TABLE(
    Val VARCHAR(200)
)
INSERT INTO @Table (Val) SELECT '#230.1'
INSERT INTO @Table (Val) SELECT '#200'
INSERT INTO @Table (Val) SELECT '210'
INSERT INTO @Table (Val) SELECT NULL
INSERT INTO @Table (Val) SELECT 'text'
SELECT *
FROM (
    SELECT CASE
               WHEN ISNUMERIC(Val) = 1
                   THEN CAST(Val AS FLOAT)
               WHEN LEN(Val) > 1 AND ISNUMERIC(RIGHT(Val,LEN(Val)-1)) = 1
                   THEN CAST(RIGHT(Val,LEN(Val)-1) AS FLOAT)
           END Num
    FROM @Table
    WHERE Val IS NOT NULL
      AND (
          ISNUMERIC(Val) = 1
          OR (
              LEN(Val) > 1
              AND ISNUMERIC(RIGHT(Val,LEN(Val)-1)) = 1
          )
      )
) Numbers
WHERE Num BETWEEN 205 AND 230
Related
I have a varchar column that has numbers with .0
This column has both numeric data and non-numeric data.
I first tried to convert data type to integer, but since there is non-numeric data type, it would not let me.
How do I remove .0 (from all numbers that have .0)?
So, for example, 100.0 should be 100
I am not trying to use select, cast or truncate as I need to actually modify the existing data.
Thanks.
Since the column has both numeric and non-numeric data it is not enough to just check if it ends with '.0'.
You should also check if it is a numeric value, which can be done with TRY_CAST():
UPDATE tablename
SET col = LEFT(col, LEN(col) - 2)
WHERE col LIKE '%.0' AND TRY_CAST(col AS FLOAT) IS NOT NULL
See the demo.
Assuming you want to update your table, where X is your table name and yourfieldName is the field you need to update:
UPDATE X
SET yourfieldName = LEFT(yourfieldName, LEN(yourfieldName) - 2)
WHERE RIGHT(yourfieldName, 2) = '.0'
-- or perhaps WHERE yourfieldName LIKE '%.0' would be faster...
This should update all fields ending in .0. If speed is a consideration, you would need to test which WHERE clause is faster given your indexes. If not, and this is a one-and-done fix, does it matter?
Be sure to test on a subset/copy table!
This assumes you don't have spaces after the .0, or any non-display characters. If you do, you'll need to trim off the spaces and replace the non-display characters with an empty string ''.
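If trailing spaces or non-display characters are a concern, a cleanup pass along these lines could be run first. It is a sketch only: X and yourfieldName are the placeholder names used in this answer, and the set of characters stripped (tab, line feed, carriage return) is an assumption about what the data might contain.

```sql
-- X and yourfieldName are placeholders, not real object names.
-- Remove tabs, line feeds and carriage returns, then trim surrounding spaces.
UPDATE X
SET yourfieldName = LTRIM(RTRIM(
        REPLACE(REPLACE(REPLACE(yourfieldName, CHAR(9), ''), CHAR(10), ''), CHAR(13), '')
    ))
WHERE yourfieldName IS NOT NULL
```

As with the main update, this is best tried on a copy of the table first.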
Just another option
Example
Declare @YourTable table (SomeCol varchar(50))
Insert Into @YourTable values
 ('100.0')
,('1001.0')
,('Not Numeric')
,('-200.05')
,('10,250.0')
Update @YourTable
set SomeCol = format(try_convert(money,SomeCol),'#.######')
From @YourTable
Where try_convert(money,SomeCol) is not null
The Updated Table
SomeCol
100
1001
Not Numeric
-200.05
10250
I have a table like this (simplified):
CREATE TABLE #table (Id INT, Field NVARCHAR(MAX))
INSERT INTO #table VALUES (1, 'SomeText')
INSERT INTO #table VALUES (2, '1234')
For some reason I need to query this table and get the sum of Field if it is numeric and return '' if it is not. I tried it like this:
SELECT CASE WHEN ISNUMERIC(Field) = 1 THEN SUM(CONVERT(MONEY, Field)) ELSE '' END
FROM #table
GROUP BY Field
But this query leads to the following exception:
Cannot convert a char value to money. The char value has incorrect syntax.
I even changed the ELSE case from '' to 0 but I still get the same message.
Why do I get the exception? As far as I know, SUM(...) should not be executed when ISNUMERIC(Field) returns 0.
Select sum(case when ISNUMERIC(Field)=1 then cast(field as money) else 0 end)
from #table
Group By Field
Returns
(No column name)
1234.00
0.00
Working with mixed datatypes can be a real pain. Where possible, consider table designs that avoid this. To further complicate matters, IsNumeric does not always return what you might expect.
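As a small illustration of the ISNUMERIC caveat, the query below shows a few strings it accepts. The column aliases are made up for readability; the point is that ISNUMERIC only reports whether a string converts to some numeric type, not necessarily the one you plan to CAST to.

```sql
-- ISNUMERIC returns 1 when the string is valid for at least one numeric type,
-- which may not be the type used in a later CAST or CONVERT.
SELECT ISNUMERIC('$')     AS currency_symbol,  -- 1, but CAST('$' AS FLOAT) fails
       ISNUMERIC('1,000') AS with_comma,       -- 1, but CAST('1,000' AS FLOAT) fails
       ISNUMERIC('1e5')   AS sci_notation,     -- 1, but CAST('1e5' AS INT) fails
       ISNUMERIC('abc')   AS plain_text        -- 0
```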
Filtering out the non-numerics before aggregating is one way to go:
SELECT
SUM(CONVERT(MONEY, Field))
FROM
#table
WHERE
ISNUMERIC(Field) = 1
GROUP BY
Field
;
I've got a column of type Text. In the column are numeric values such as 4, 8, 3.2, etc., and also values such as 'Negative', 'Positive', '27A', '2pos 1neg'.
The user needs to be able to say: "Give me all the values between 10 and 30, and also the values that are 'Negative'." The WHERE clause would need to do something along the lines of:
WHERE Tbl.Col > 10
AND Tbl.Col < 30
AND Tbl.Col = 'Negative'
This is problematic for obvious reasons. I've tried using the ISNUMERIC function to alleviate the issue but can't seem to get exactly what I need. I can either get all the alpha values in the column, or all the numeric values in the column as floats, but can't seem to filter on both at the same time. To grab all the numeric values I've been using this:
SELECT Num.Val FROM
(SELECT Val = CASE ISNUMERIC(CAST(TBL.COL AS VARCHAR)) WHEN 1
THEN CAST(CAST(TBL.COL AS VARCHAR) AS FLOAT) ELSE NULL END
FROM Table TBL
WHERE TBL.COL IS NOT NULL ) as Num
WHERE Num.val IS NOT NULL
AND Num.val > 10
If I understand the issue correctly something like this should get you close.
with MyNumbers as
(
select t.Col
from Tbl t
--where ISNUMERIC(t.Col) = 1
where t.Col NOT LIKE '%[^0-9.]%'
)
, MyAlpha as
(
select t.Col
from Tbl t
where ISNUMERIC(t.Col) = 0
)
select Col
from MyNumbers
where Col > 10
and Col < 30
union all
select Col
from MyAlpha
where Col = 'Negative'
First I would go slap the person who designed the table (hopefully it isn't you) :>
Go to here and get the split table function. I would then convert the text column (like you have in example above) into varchar(max) and supply it as the parameter to the split function. Then you could select from the table results of the split function using the user supplied parameters.
I have found the answer to my problem:
SELECT
al_Value = Table.Column
FROM Table
WHERE (
ISNUMERIC(CAST(Table.Column AS VARCHAR)) = 1 AND
CONVERT(FLOAT, CAST(Table.Column AS VARCHAR)) > 1.0 AND
CONVERT(FLOAT, CAST(Table.Column AS VARCHAR)) < 10.0
)
OR (
CAST(Table.Column AS VARCHAR) IN ('negative', 'no bueno')
)
This returns one column named 'al_Value', filtering Table.Column (which is of datatype TEXT) with the conditions in the WHERE clause above.
Thanks everyone for trying to help me with this issue.
I am stuck on converting a varchar column UserID to INT. I know, please don't ask why this UserID column was not created as INT initially, long story.
So I tried this, but it doesn't work and gives me an error:
select CAST(userID AS int) from audit
Error:
Conversion failed when converting the varchar value
'1581............................................................................................................................' to data type int.
I did select len(userID) from audit and it returns 128 characters, which are not spaces.
I checked the ASCII codes of the characters trailing the ID number, and the ASCII value is 0.
I have also tried LTRIM, RTRIM, and replacing char(0) with '', but that does not work.
The only way it works is when I specify a fixed number of characters, as below, but UserID is not always 4 characters.
select CAST(LEFT(userID, 4) AS int) from audit
You could try updating the table to get rid of these characters:
UPDATE dbo.[audit]
SET UserID = REPLACE(UserID, CHAR(0), '')
WHERE CHARINDEX(CHAR(0), UserID) > 0;
But then you'll also need to fix whatever is putting this bad data into the table in the first place. In the meantime perhaps try:
SELECT CONVERT(INT, REPLACE(UserID, CHAR(0), ''))
FROM dbo.[audit];
But that is not a long term solution. Fix the data (and the data type while you're at it). If you can't fix the data type immediately, then you can quickly find the culprit by adding a check constraint:
ALTER TABLE dbo.[audit]
ADD CONSTRAINT do_not_allow_stupid_data
CHECK (CHARINDEX(CHAR(0), UserID) = 0);
EDIT
Ok, so that is definitely a 4-digit integer followed by six instances of CHAR(0). And the workaround I posted definitely works for me:
DECLARE @foo TABLE(UserID VARCHAR(32));
INSERT @foo SELECT 0x31353831000000000000;
-- this succeeds:
SELECT CONVERT(INT, REPLACE(UserID, CHAR(0), '')) FROM @foo;
-- this fails:
SELECT CONVERT(INT, UserID) FROM @foo;
Please confirm that this code on its own (well, the first SELECT, anyway) works for you. If it does then the error you are getting is from a different non-numeric character in a different row (and if it doesn't then perhaps you have a build where a particular bug hasn't been fixed). To try and narrow it down you can take random values from the following query and then loop through the characters:
SELECT UserID, CONVERT(VARBINARY(32), UserID)
FROM dbo.[audit]
WHERE UserID LIKE '%[^0-9]%';
So take a random row, and then paste the output into a query like this:
DECLARE @x VARCHAR(32), @i INT;
SET @x = CONVERT(VARCHAR(32), 0x...); -- paste the value here
SET @i = 1;
WHILE @i <= LEN(@x)
BEGIN
    PRINT RTRIM(@i) + ' = ' + RTRIM(ASCII(SUBSTRING(@x, @i, 1)))
    SET @i = @i + 1;
END
This may take some trial and error before you encounter a row that fails for some other reason than CHAR(0) - since you can't really filter out the rows that contain CHAR(0) because they could contain CHAR(0) and CHAR(something else). For all we know you have values in the table like:
SELECT '15' + CHAR(9) + '23' + CHAR(0);
...which also can't be converted to an integer, whether you've replaced CHAR(0) or not.
I know you don't want to hear it, but I am really glad this is painful for people, because now they have more war stories to push back when people make very poor decisions about data types.
This question has got 91,000 views so perhaps many people are looking for a more generic solution to the issue in the title "error converting varchar to INT"
If you are on SQL Server 2012+ one way of handling this invalid data is to use TRY_CAST
SELECT TRY_CAST (userID AS INT)
FROM audit
On previous versions you could use
SELECT CASE
WHEN ISNUMERIC(RTRIM(userID) + '.0e0') = 1
AND LEN(userID) <= 11
THEN CAST(userID AS INT)
END
FROM audit
Both return NULL if the value cannot be cast.
In the specific case that you have in your question with known bad values I would use the following however.
CAST(REPLACE(userID COLLATE Latin1_General_Bin, CHAR(0),'') AS INT)
Trying to replace the null character is often problematic except if using a binary collation.
This is more for someone searching for a result than for the original poster. This worked for me...
declare @value varchar(max) = 'sad';
select sum(cast(iif(isnumeric(@value) = 1, @value, 0) as bigint));
returns 0
declare @value varchar(max) = '3';
select sum(cast(iif(isnumeric(@value) = 1, @value, 0) as bigint));
returns 3
I would try trimming the number to see what you get:
select len(rtrim(ltrim(userid))) from audit
If that returns the correct value, then just do:
select convert(int, rtrim(ltrim(userid))) from audit
If that doesn't return the correct value, then I would do a replace to remove the empty space:
select convert(int, replace(userid, char(0), '')) from audit
This is how I solved the problem in my case:
First of all I made sure the column I need to convert to integer doesn't contain any spaces:
update data set col1 = TRIM(col1)
I also checked whether the column only contains numeric digits.
You can check it by:
select * from data where col1 like '%[^0-9]%' order by col1
If any nonnumeric values are present, you can save them to another table and remove them from the table you are working on.
select * into nonnumeric_data from data where col1 like '%[^0-9]%'
delete from data where col1 like '%[^0-9]%'
Those were the problems with my data. After fixing them, I added a bigint column and copied the values of the varchar column into it.
alter table data add int_col1 bigint
update data set int_col1 = CAST(col1 AS BIGINT)
This worked for me, hope you find it useful as well.
I have a periodic check of a certain query (which by the way includes multiple tables) to add informational messages to the user if something has changed since the last check (once a day).
I tried to make it work with checksum_agg(binary_checksum(*)), but it does not help, so this question doesn't help much, because I have the following case (oversimplified):
select checksum_agg(binary_checksum(*))
from
(
select 1 as id,
1 as status
union all
select 2 as id,
0 as status
) data
and
select checksum_agg(binary_checksum(*))
from
(
select 1 as id,
0 as status
union all
select 2 as id,
1 as status
) data
Both of the above cases result in the same check-sum, 49, and it is clear that the data has been changed.
This doesn't have to be a simple function or a simple solution, but I need some way to uniquely identify differences like these in SQL Server 2000.
checksum_agg appears to simply add the results of binary_checksum together for all rows. Although each row has changed, the sum of the two checksums has not (i.e. 17+32 = 16+33). This is not really the norm for checking for updates, but the recommendations I can come up with are as follows:
1. Instead of using checksum_agg, concatenate the checksums into a delimited string, and compare strings, along the lines of SELECT CAST(binary_checksum(*) AS VARCHAR) + ',' FROM MyTable FOR XML PATH(''). This is a much longer string to check and to store, but there will be much less chance of a false positive comparison.
2. Instead of using the built-in checksum routine, use HASHBYTES to calculate MD5 checksums in 8000-byte blocks, and XOR the results together. This will give you a much more resilient checksum, although still not bullet-proof (i.e. it is still possible to get false matches, but very much less likely). I'll paste the HASHBYTES demo code that I wrote below.
3. The last option, and absolute last resort, is to actually store the entire table in XML format, and compare that. This is really the only way you can be absolutely certain of no false matches, but it is not scalable and involves storing and comparing large amounts of data.
Every approach, including the one you started with, has pros and cons, with varying degrees of data size and processing requirements against accuracy. Depending on what level of accuracy you require, use the appropriate option. The only way to get 100% accuracy is to store all of the table data.
Alternatively, you can add a date_modified field to each table, which is set to GetDate() using after insert and update triggers. You can then do SELECT COUNT(*) FROM #test WHERE date_modified > @date_last_checked. This is a more common way of checking for updates. The downside of this one is that deletions cannot be tracked.
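A minimal sketch of the date_modified idea might look like this. The table dbo.Orders, its columns, and the trigger name are hypothetical, and the one-day window in the final query simply stands in for the time of the last check.

```sql
-- Hypothetical table used only for illustration.
CREATE TABLE dbo.Orders (
    id            INT      NOT NULL PRIMARY KEY,
    status        INT      NOT NULL,
    date_modified DATETIME NOT NULL DEFAULT GETDATE()
)
GO

-- Stamp every inserted or updated row with the current time.
CREATE TRIGGER trg_Orders_date_modified ON dbo.Orders
AFTER INSERT, UPDATE
AS
    UPDATE o
    SET date_modified = GETDATE()
    FROM dbo.Orders o
    INNER JOIN inserted i ON i.id = o.id
GO

-- Periodic check: how many rows changed since the last run?
DECLARE @date_last_checked DATETIME
SET @date_last_checked = DATEADD(day, -1, GETDATE())

SELECT COUNT(*) AS changed_rows
FROM dbo.Orders
WHERE date_modified > @date_last_checked
```

The trigger relies on direct trigger recursion being off, which is the default database setting.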
Another approach is to create a modified table, with table_name (VARCHAR) and is_modified (BIT) fields, containing one row for each table you wish to track. Using insert, update and delete triggers, the flag against the relevant table is set to True. When you run your schedule, you check and reset the is_modified flag (in the same transaction), along the lines of UPDATE tblModified SET @is_modified = is_modified, is_modified = 0
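The flag-table approach could be sketched as follows. Only tblModified, table_name and is_modified come from the description above; the tracked table dbo.Customers and the trigger name are hypothetical.

```sql
-- One row per tracked table.
CREATE TABLE tblModified (
    table_name  VARCHAR(128) NOT NULL PRIMARY KEY,
    is_modified BIT          NOT NULL DEFAULT 0
)
GO

-- Hypothetical table to be tracked.
CREATE TABLE dbo.Customers (
    id   INT         NOT NULL PRIMARY KEY,
    name VARCHAR(50) NOT NULL
)
GO

INSERT INTO tblModified (table_name, is_modified) VALUES ('Customers', 0)
GO

-- Any insert, update or delete raises the flag, so deletions are covered too.
CREATE TRIGGER trg_Customers_flag ON dbo.Customers
AFTER INSERT, UPDATE, DELETE
AS
    UPDATE tblModified SET is_modified = 1 WHERE table_name = 'Customers'
GO

-- Scheduled check: read the old flag and reset it in a single statement.
DECLARE @is_modified BIT

UPDATE tblModified
SET @is_modified = is_modified,  -- receives the value from before the update
    is_modified = 0
WHERE table_name = 'Customers'

SELECT @is_modified AS was_modified
```

Reading and resetting in one UPDATE avoids a gap in which a change could arrive between the check and the reset.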
The following script generates three result sets, each corresponding to an option in the numbered list earlier in this response. I have commented which output corresponds to which option, just before the SELECT statement. To see how the output was derived, you can work backwards through the code.
-- Create the test table and populate it
CREATE TABLE #Test (
f1 INT,
f2 INT
)
INSERT INTO #Test VALUES(1, 1)
INSERT INTO #Test VALUES(2, 0)
INSERT INTO #Test VALUES(2, 1)
/*******************
OPTION 1
*******************/
SELECT CAST(binary_checksum(*) AS VARCHAR) + ',' FROM #test FOR XML PATH('')
-- Declaration: Input and output MD5 checksums (@in and @out), input string (@input), and counter (@i)
DECLARE @in VARBINARY(16), @out VARBINARY(16), @input VARCHAR(MAX), @i INT
-- Initialize @input string as the XML dump of the table
-- Use this as your comparison string if you choose to not use the MD5 checksum
SET @input = (SELECT * FROM #Test FOR XML RAW)
/*******************
OPTION 3
*******************/
SELECT @input
-- Initialise counter and output MD5.
SET @i = 1
SET @out = 0x00000000000000000000000000000000
WHILE @i <= LEN(@input)
BEGIN
    -- calculate MD5 for this batch (up to 8000 characters; the + 1 keeps the final character)
    SET @in = HASHBYTES('MD5', SUBSTRING(@input, @i, CASE WHEN LEN(@input) - @i + 1 > 8000 THEN 8000 ELSE LEN(@input) - @i + 1 END))
    -- xor the results with the output
    SET @out = CAST(CAST(SUBSTRING(@in, 1, 4) AS INT) ^ CAST(SUBSTRING(@out, 1, 4) AS INT) AS VARBINARY(4)) +
               CAST(CAST(SUBSTRING(@in, 5, 4) AS INT) ^ CAST(SUBSTRING(@out, 5, 4) AS INT) AS VARBINARY(4)) +
               CAST(CAST(SUBSTRING(@in, 9, 4) AS INT) ^ CAST(SUBSTRING(@out, 9, 4) AS INT) AS VARBINARY(4)) +
               CAST(CAST(SUBSTRING(@in, 13, 4) AS INT) ^ CAST(SUBSTRING(@out, 13, 4) AS INT) AS VARBINARY(4))
    SET @i = @i + 8000
END
/*******************
OPTION 2
*******************/
SELECT @out