Select rows that can't be casted - sql-server

One column of my table stores the house number of each address.
Unfortunately my previous colleagues were not fans of thinking, so they made the column varchar and did not validate input in the software. Now I'm stuck with a bunch of rows where the house/apartment number is "N.I.", "Not Info", "Unknown", etc. instead of a meaningful number.
I would like to select only the rows that are not numbers, something like select * from table where CAST(column as int) throws exception

Take a look at IsNumeric, IsInt, IsNumber. You can't rely on ISNUMERIC alone, because it returns 1 for lone signs and other input that is not a usable number.
For example, both of these return 1:
SELECT ISNUMERIC('2d5'),
ISNUMERIC('+')

select * from table where ISNumeric(column)=0
but ISNUMERIC also returns 1 for some values that are not usable numbers (such as '+'), so this query can miss rows.

Try this:
SELECT * FROM table WHERE ISNUMERIC(column + 'e0') = 0
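To see why appending 'e0' helps, here is a minimal sketch that compares the plain check with the 'e0' check side by side. The table variable and its sample values are hypothetical, chosen only for illustration. The idea is that a string which already contains a sign, a letter, or an exponent stops being a valid number once 'e0' is appended, while a plain run of digits stays valid.

```sql
-- Hypothetical sample data; @addresses and house_number are illustrative names.
DECLARE @addresses TABLE (house_number varchar(20));

INSERT INTO @addresses (house_number)
VALUES ('12'), ('N.I.'), ('Unknown'), ('+'), ('2d5');

-- plain_check accepts '+' and '2d5' (as noted in the earlier answer);
-- e0_check should reject them because '+e0' and '2d5e0' are not valid numbers.
SELECT house_number,
       ISNUMERIC(house_number)        AS plain_check,
       ISNUMERIC(house_number + 'e0') AS e0_check
FROM @addresses;
```

Note that a decimal such as '12.5' still passes the 'e0' check, so this filters for "numeric", not strictly "integer".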

Create a function that tries the cast, plus any other logic you need, and returns 1 if the value meets your requirements and 0 if it doesn't. Then use the function in the WHERE clause.
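A minimal sketch of such a function, assuming SQL Server 2012 or later so that TRY_CAST is available (a scalar function cannot use TRY...CATCH, so TRY_CAST stands in for "trying the cast"). The function name dbo.IsCastableToInt and the sample table variable are hypothetical.

```sql
-- Hypothetical helper: returns 1 when the value can be cast to int, otherwise 0.
-- Assumes SQL Server 2012+ (TRY_CAST). A NULL input also yields 0.
CREATE FUNCTION dbo.IsCastableToInt (@value varchar(50))
RETURNS bit
AS
BEGIN
    RETURN CASE WHEN TRY_CAST(@value AS int) IS NULL THEN 0 ELSE 1 END;
END;
GO

-- Usage against hypothetical sample data: list the rows that are not integers.
DECLARE @addresses TABLE (house_number varchar(50));

INSERT INTO @addresses (house_number)
VALUES ('12'), ('N.I.'), ('Not Info'), ('Unknown'), ('7');

SELECT house_number
FROM @addresses
WHERE dbo.IsCastableToInt(house_number) = 0;
```

Be aware that some odd inputs, such as an empty string, cast to 0 rather than failing, so extra checks may be needed depending on the data. Calling a scalar function per row can also be slow on large tables; using TRY_CAST directly in the WHERE clause avoids that overhead.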

Related

Snowflake: Trouble getting numbers to return from a PIVOT function

I am moving a query from SQL Server to Snowflake. Part of the query creates a pivot table. The pivot table part works fine (I have run it in isolation, and it pulls numbers I expect).
However, the following parts of the query rely on the pivot table, and those parts fail. Some of the fields come back as string types. I believe the problem is that Snowflake is having issues converting string data to numeric data. I have tried CAST and TRY_TO_DOUBLE/NUMBER, but these just return 0.
I will put the code down below, and I appreciate any insight as to what I can do!
CREATE OR REPLACE TEMP TABLE ATTR_PIVOT_MONTHLY_RATES AS (
SELECT
Market,
Coverage_Mo,
ZEROIFNULL(TRY_TO_DOUBLE('Starting Membership')) AS Starting_Membership,
ZEROIFNULL(TRY_TO_DOUBLE('Member Adds')) AS Member_Adds,
ZEROIFNULL(TRY_TO_DOUBLE('Member Attrition')) AS Member_Attrition,
((ZEROIFNULL(CAST('Starting Membership' AS FLOAT))
+ ZEROIFNULL(CAST('Member Adds' AS FLOAT))
+ ZEROIFNULL(CAST('Member Attrition' AS FLOAT)))-ZEROIFNULL(CAST('Starting Membership' AS FLOAT)))
/ZEROIFNULL(CAST('Starting Membership' AS FLOAT)) AS "% Change"
FROM
(SELECT * FROM ATTR_PIVOT
WHERE 'Starting Membership' IS NOT NULL) PT)
I realize this is a VERY big question with a lot of moving parts... So my main question is: How can I successfully change the data type to numeric value, so that hopefully the formulas work in the second half of the query?
Thank you so much for reading through it all!
EDITED FOR SHORTENING THE QUERY WITH UNNEEDED SYNTAX
I have tried CAST(), TRY_TO_DOUBLE(), and TRY_TO_NUMBER(). I have also put the fields (Starting Membership, Member Adds) in single and double quotation marks.
Unless you are quoting your field names in this post just to highlight them for some reason, the way you've written this query would indicate that you are trying to cast a string value to a number.
For example:
ZEROIFNULL(TRY_TO_DOUBLE('Starting Membership'))
This is simply trying to cast a string literal value of Starting Membership to a double. This will always be NULL. And then your ZEROIFNULL() function is turning your NULL into a 0 (zero).
Without seeing the rest of your query that defines the column names, I can't provide you with a correction, but try using field names, not quoted string values, in your query and see if that gives you what you need.
Your first mistake is that all your single-quoted column names are being treated as string literals.
For example, take your inner select:
with ATTR_PIVOT(id, studentname) as (
select * from values
(1, 'student_a'),
(1, 'student_b'),
(1, 'student_c'),
(2, 'student_z'),
(2, 'student_a')
)
SELECT *
FROM ATTR_PIVOT
WHERE 'Starting Membership' IS NOT NULL
There is no "starting membership" column, and we get all the rows:
ID   STUDENTNAME
1    student_a
1    student_b
1    student_c
2    student_z
2    student_a
So you need to change 'Starting Membership' -> "Starting Membership", and likewise for the other column names.
As Mike mentioned, you get 0 because TRY_TO_DOUBLE always fails on the string literal, and the resulting NULL is always turned into zero.
Now, with real string values in properly named columns:
with ATTR_PIVOT(Market, Coverage_Mo, "Starting Membership", "Member Adds", "Member Attrition") as (
select * from values
(1, 10 ,'student_a', '23', '150' )
)
SELECT
Market,
Coverage_Mo,
ZEROIFNULL(TRY_TO_DOUBLE("Starting Membership")) AS Starting_Membership,
ZEROIFNULL(TRY_TO_DOUBLE("Member Adds")) AS Member_Adds,
ZEROIFNULL(TRY_TO_DOUBLE("Member Attrition")) AS Member_Attrition
FROM ATTR_PIVOT
WHERE "Starting Membership" IS NOT NULL
we get what we would expect:
MARKET   COVERAGE_MO   STARTING_MEMBERSHIP   MEMBER_ADDS   MEMBER_ATTRITION
1        10            0                     23            150
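Putting the two answers together, a corrected version of the percentage calculation might look like the sketch below. It assumes the pivot really produces columns named "Starting Membership", "Member Adds" and "Member Attrition" (the question does not show the pivot itself), and the sample row is made up. NULLIF on the divisor is an addition of this sketch, not part of the original query: it yields NULL instead of a division-by-zero error when the starting membership is 0 or not numeric.

```sql
-- Hypothetical sample data standing in for the real ATTR_PIVOT table.
WITH ATTR_PIVOT(Market, Coverage_Mo, "Starting Membership", "Member Adds", "Member Attrition") AS (
    SELECT * FROM VALUES
        (1, 10, '200', '23', '-15')
)
SELECT
    Market,
    Coverage_Mo,
    ZEROIFNULL(TRY_TO_DOUBLE("Starting Membership")) AS Starting_Membership,
    ZEROIFNULL(TRY_TO_DOUBLE("Member Adds"))         AS Member_Adds,
    ZEROIFNULL(TRY_TO_DOUBLE("Member Attrition"))    AS Member_Attrition,
    -- Same formula as the question: (start + adds + attrition - start) / start,
    -- written with double-quoted identifiers instead of string literals.
    ((ZEROIFNULL(TRY_TO_DOUBLE("Starting Membership"))
      + ZEROIFNULL(TRY_TO_DOUBLE("Member Adds"))
      + ZEROIFNULL(TRY_TO_DOUBLE("Member Attrition")))
     - ZEROIFNULL(TRY_TO_DOUBLE("Starting Membership")))
    / NULLIF(TRY_TO_DOUBLE("Starting Membership"), 0) AS "% Change"
FROM ATTR_PIVOT
WHERE "Starting Membership" IS NOT NULL;
```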

SQL Server CHOOSE() function behaving unexpectedly with RAND() function

I've encountered an interesting SQL Server behaviour while trying to generate random values in T-SQL using the RAND and CHOOSE functions.
My goal was to return one of two given values using RAND() as the random number generator. Pretty easy, right?
For those of you who don't know it, the CHOOSE function accepts an index number (int) along with a list of values and returns the value at the specified index. Pretty straightforward.
At first attempt my SQL looked like this:
select choose(ceiling((rand()*2)) ,'a','b')
To my surprise, this expression returned one of three values: null, 'a' or 'b'. Since I didn't expect the null value, I started digging. The RAND() function returns a float in the range from 0 (inclusive) to 1 (exclusive). Since I'm multiplying it by 2, it should return values anywhere in the range from 0 (inclusive) to 2 (exclusive). Therefore, after applying CEILING, the final value should be one of 0, 1, 2. After realising that, I extended the value list with 'c' to check whether that would perhaps be returned. I also checked the docs page of CEILING and learnt that:
Return values have the same type as numeric_expression.
I had assumed that the CEILING function returned an int. Since it returns a float here, the value must be implicitly cast to int before being used in CHOOSE, which sure enough is stated on the docs page:
If the provided index value has a numeric data type other than int,
then the value is implicitly converted to an integer.
Just in case I added an explicit cast. My SQL query looks like this now:
select choose(cast(ceiling((rand()*2)) as int) ,'a','b','c')
However, the result set didn't change. To check which values cause the problem I tried generating the value beforehand and selecting it alongside the CHOOSE result. It looked like this:
declare @int int = cast(ceiling((rand()*2)) as int)
select @int,choose( @int,'a','b','c')
Interestingly enough, now the result set changed to (1,a), (2,b), which was my original goal. After delving deeper into the CHOOSE docs page and some testing, I learned that NULL is returned in one of two cases:
Given index is a null
Given index is out of range
In this case that would mean that the index value, when generated inside the SELECT statement, is either 0 or greater than 2 (or 3, with the extended list). I'm assuming that negative numbers are not possible here and that CHOOSE indexes from 1. As I've stated before, 0 should be one of the possible results of:
ceiling((rand()*2))
but for some reason it's never 0 (at least when I tried it 1 million+ times like this):
set nocount on
declare @test table(ceiling_rand int)
declare @counter int = 0
while @counter<1000000
begin
insert into @test
select ceiling((rand()*2))
set @counter=@counter+1
end
select distinct ceiling_rand from @test
Therefore I assume that the value generated in the SELECT is greater than 2 (or 3) or NULL. Why would it be like this only when generated in the SELECT statement? Perhaps the order of resolving CAST, CEILING or RAND inside SELECT is different than it would seem? It's true I've only tried it a limited number of times, but at this point the chances of it being a statistical fluctuation are extremely small. Is it somehow a floating-point error? I am truly stumped and looking forward to any explanation.
TL;DR: When a random number is generated inside a SELECT statement, the set of possible results is different than when it's generated before the SELECT statement.
Cheers,
NFSU
EDIT: Formatting
You can see what's going on if you look at the execution plan.
SET SHOWPLAN_TEXT ON
GO
SELECT (select choose(ceiling((rand()*2)) ,'a','b'))
Returns
|--Constant Scan(VALUES:((CASE WHEN CONVERT_IMPLICIT(int,ceiling(rand()*(2.0000000000000000e+000)),0)=(1) THEN 'a' ELSE CASE WHEN CONVERT_IMPLICIT(int,ceiling(rand()*(2.0000000000000000e+000)),0)=(2) THEN 'b' ELSE NULL END END)))
The CHOOSE is expanded out to
SELECT CASE
WHEN ceiling(( rand() * 2 )) = 1 THEN 'a'
ELSE
CASE
WHEN ceiling(( rand() * 2 )) = 2 THEN 'b'
ELSE NULL
END
END
and rand() is referenced twice. Each evaluation can return a different result.
You will get the same problem with the below rewrite being expanded out too
SELECT CASE ceiling(( rand() * 2 ))
WHEN 1 THEN 'a'
WHEN 2 THEN 'b'
END
Avoid CASE for this and any of its variants.
One method would be
SELECT JSON_VALUE ( '["a", "b"]' , CONCAT('$[', FLOOR(rand()*2) ,']') )
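Another way to sidestep the double evaluation, in line with what the question's own experiment showed, is to evaluate RAND() exactly once into a variable and only then pass it to CHOOSE. The sketch below is one possible form; using FLOOR(...) + 1 instead of CEILING is a choice made here so that the index is always 1 or 2, even in the rare case where RAND() returns exactly 0.

```sql
-- RAND() is evaluated once here, so the expanded CASE inside CHOOSE
-- compares the same stored value in every branch.
DECLARE @idx int = CAST(FLOOR(RAND() * 2) AS int) + 1;  -- 1 or 2

SELECT @idx                   AS idx,
       CHOOSE(@idx, 'a', 'b') AS picked;
```

This only helps for a single value per batch; for one random pick per row of a table, a per-row source of randomness would be needed instead of a variable.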

Inconsistent result from isnumeric

I have the following queries.
select ISNUMERIC(result+ 'E0') from t1
select ISNUMERIC('7' + 'E0')
select ISNUMERIC('7' + '.E0')
The data type of the result column is varchar(50). The first query yields 0 even when result holds values like 2 or 3, and returns 1 only for floats, whereas the second and third queries work fine for both integers and floats. Am I missing anything? My requirement is to check whether the result column holds a number (integer or float). I know ISNUMERIC returns 1 for types like money, smallmoney, real, etc., but that is not the case here, as I don't have such values in my result column and I am only receiving 0.
The reason for the seemingly inconsistent result might be that there is a space in your result column value. Try trimming the text and feeding the trimmed text to ISNUMERIC:
select ISNUMERIC(ltrim(rtrim(result))+ 'E0') from t1
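One way to test the trailing-space theory is to compare the untrimmed and trimmed checks side by side. The sketch below uses a hypothetical table variable with made-up values, some with a trailing space. With a trailing space, the concatenation produces a string like '7 E0', which has a space in the middle and should no longer parse as a number.

```sql
-- Hypothetical sample data; @t1 stands in for the real t1 table.
DECLARE @t1 TABLE (result varchar(50));

INSERT INTO @t1 (result)
VALUES ('7'), ('7 '), ('2.5'), ('2.5 ');

-- Brackets make any trailing space visible in the output.
SELECT '[' + result + ']'                      AS raw_value,
       ISNUMERIC(result + 'E0')                AS untrimmed_check,
       ISNUMERIC(LTRIM(RTRIM(result)) + 'E0')  AS trimmed_check
FROM @t1;
```

If the two check columns differ for the padded rows, whitespace in the stored data is the likely cause.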

Why does SUM(...) on an empty recordset return NULL instead of 0?

I understand why null + 1 or (1 + null) returns null: null means "unknown value", and if a value is unknown, its successor is unknown as well. The same is true for most other operations involving null.[*]
However, I don't understand why the following happens:
SELECT SUM(someNotNullableIntegerField) FROM someTable WHERE 1=0
This query returns null. Why? There are no unknown values involved here! The WHERE clause returns zero records, and the sum of an empty set of values is 0.[**] Note that the set is not unknown, it is known to be empty.
I know that I can work around this behaviour by using ISNULL or COALESCE, but I'm trying to understand why this behaviour, which appears counter-intuitive to me, was chosen.
Any insights as to why this makes sense?
[*] with some notable exceptions such as null OR true, where obviously true is the right result since the unknown value simply does not matter.
[**] just like the product of an empty set of values is 1. Mathematically speaking, if I were to extend $(Z, +)$ to $(Z union {null}, +)$, the obvious choice for the identity element would still be 0, not null, since x + 0 = x but x + null = null.
The ANSI-SQL-Standard defines the result of the SUM of an empty set as NULL. Why they did this, I cannot tell, but at least the behavior should be consistent across all database engines.
Reference: http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt on page 126:
b) If AVG, MAX, MIN, or SUM is specified, then
Case:
i) If TXA is empty, then the result is the null value.
TXA is the operative resultset from the selected column.
If by an empty table you mean a table that contains only NULL values, the aggregate functions return NULL there as well. You can consider this to be by design in SQL Server.
Example 1
CREATE TABLE testSUMNulls
(
ID TINYINT
)
GO
INSERT INTO testSUMNulls (ID) VALUES (NULL),(NULL),(NULL),(NULL)
SELECT SUM(ID) FROM testSUMNulls
Example 2
CREATE TABLE testSumEmptyTable
(
ID TINYINT
)
GO
SELECT SUM(ID) Sums FROM testSumEmptyTable
In both examples you will get NULL as output.
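The COALESCE workaround mentioned in the question can be sketched as follows. The table variable is hypothetical and reuses the column name from the question. COUNT(*) is included for contrast: unlike SUM, it returns 0 rather than NULL over an empty set.

```sql
-- Hypothetical empty table matching the question's column.
DECLARE @someTable TABLE (someNotNullableIntegerField int NOT NULL);

SELECT SUM(someNotNullableIntegerField)              AS raw_sum,      -- NULL
       COALESCE(SUM(someNotNullableIntegerField), 0) AS sum_or_zero,  -- 0
       COUNT(*)                                      AS row_count     -- 0
FROM @someTable
WHERE 1 = 0;
```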

'Converting varchar to data type numeric' error after successful conversion to decimal(18,2)

I have a temporary table I'm using for parsing, #rp.
#rp contains an nvarchar(max) column, #rp.col8, which holds positive and negative numbers to two decimal places of precision, e.g. '1234.26'.
I'm able to run the following query and get a set of converted values out:
select * from
(
select CONVERT(decimal(18,2),rp.col8) as PARSEAMT
from #rp
where
--#rp filtering criteria
)q
However, when I try to query for PARSEAMT = 0 in the following manner, I get the standard '8114, Error converting data type varchar to numeric.':
select * from
(
select CONVERT(decimal(18,2),col8) as PARSEAMT
from #rp
where
--#rp filtering criteria
)q
where q.PARSEAMT = 0
Without that where clause, the query runs fine and generates the expected values.
I've also tried other clauses like where q.PARSEAMT = 0.00 and where q.PARSEAMT = convert(decimal(18,2),0).
What am I doing wrong in my comparison?
I was going to suggest you select PARSEAMT into another temp-table/table-variable but I can see you've already done that from your comments.
Out of interest what does the following yield?
select
col8
from
#rp
where
-- ISNUMERIC returns 1 when the input expression evaluates to a valid
-- numeric data type; otherwise it returns 0. Valid numeric data types
-- include the following:
isnumeric(col8) <> 1
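ISNUMERIC can return 1 for values that still fail to convert to decimal (scientific notation such as '1e5' is one example), so a stricter diagnostic is to test the exact conversion. The sketch below assumes SQL Server 2012 or later for TRY_CONVERT and uses a hypothetical temp table with made-up values; against the real data, the same WHERE clause would be applied to #rp.

```sql
-- Hypothetical stand-in for #rp with a few illustrative values.
CREATE TABLE #rp_demo (col8 nvarchar(max));

INSERT INTO #rp_demo (col8)
VALUES (N'1234.26'), (N'-17.50'), (N'1e5'), (N'n/a');

-- Rows whose text cannot be converted to decimal(18,2).
-- TRY_CONVERT returns NULL instead of raising error 8114.
SELECT col8
FROM #rp_demo
WHERE col8 IS NOT NULL
  AND TRY_CONVERT(decimal(18,2), col8) IS NULL;

DROP TABLE #rp_demo;
```

If this finds offending rows that the inner filtering criteria were supposed to exclude, a plausible explanation is that SQL Server evaluated the CONVERT before applying that filter once the outer predicate was added; the optimizer does not guarantee the order in which predicates and expressions are evaluated.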
