Sybase, data type - sql-server

I have 2 queries:
(1)
declare @m varchar
set @m='10'
select * from test where month=@m
(2)
declare @m varchar(2)
set @m='10'
select * from test where month=@m
The number of rows in the result is different: the second variant returns more rows than the first. What could be the reason?

That's because when you don't specify the length of a varchar variable, the engine uses the default, which is 1:
When n isn't specified in a data definition or variable declaration
statement, the default length is 1. If n isn't specified when using
the CAST and CONVERT functions, the default length is 30.
So, in the first case the value '10' is truncated to '1' and you effectively have:
select * from test where month='1'
and in the second:
select * from test where month='10'
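The truncation can be observed directly. This is a minimal sketch, assuming SQL Server (where the quoted default-length rule comes from); the variable names @short and @long are illustrative only:

```sql
-- varchar with no length defaults to varchar(1) in a variable declaration,
-- so the assigned value is silently truncated.
DECLARE @short varchar;
DECLARE @long  varchar(2);
SET @short = '10';
SET @long  = '10';

SELECT @short AS short_value, DATALENGTH(@short) AS short_bytes,
       @long  AS long_value,  DATALENGTH(@long)  AS long_bytes;
```

On SQL Server the first variable ends up holding '1' (1 byte) and the second '10' (2 bytes), which is why the two queries filter on different values.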

Related

SQL Server CHOOSE() function behaving unexpectedly with RAND() function

I've encountered an interesting SQL server behaviour while trying to generate random values in T-sql using RAND and CHOOSE functions.
My goal was to try to return one of two given values using RAND() as rng. Pretty easy right?
For those of you who don't know it, the CHOOSE function accepts an index number (int) along with a list of values and returns the value at the specified index. Pretty straightforward.
At first attempt my SQL looked like this:
select choose(ceiling((rand()*2)) ,'a','b')
To my surprise, this expression returned one of three values: null, 'a' or 'b'. Since I didn't expect the null value, I started digging. The RAND() function returns a float in the range from 0 (included) to 1 (excluded). Since I'm multiplying it by 2, it should return values anywhere in the range from 0 (included) to 2 (excluded). Therefore, after applying the CEILING function, the final value should be one of: 0, 1, 2. After realising that, I extended the value list with 'c' to check whether that would perhaps be returned. I also checked the docs page of CEILING and learnt that:
Return values have the same type as numeric_expression.
I had assumed the CEILING function returned int. Since it does not, the value would have to be implicitly cast to int before being used in CHOOSE, which sure enough is stated on the docs page:
If the provided index value has a numeric data type other than int,
then the value is implicitly converted to an integer.
Just in case I added an explicit cast. My SQL query looks like this now:
select choose(cast(ceiling((rand()*2)) as int) ,'a','b','c')
However, the result set didn't change. To check which values cause the problem I tried generating the value beforehand and selecting it alongside the CHOOSE result. It looked like this:
declare @int int = cast(ceiling((rand()*2)) as int)
select @int,choose( @int,'a','b','c')
Interestingly enough, now the result set changed to (1,a), (2,b), which was my original goal. After delving deeper into the CHOOSE docs page and some testing, I learned that null is returned in one of two cases:
Given index is a null
Given index is out of range
In this case that would mean that the index value, when generated inside the SELECT statement, is either 0 or greater than the number of values in the list (I'm assuming that negative numbers are not possible here and that CHOOSE indexes from 1). As I've stated before, 0 should be one of the possible results of:
ceiling((rand()*2))
but for some reason it's never 0 (at least not when I tried it 1 million+ times, like this):
set nocount on
declare @test table(ceiling_rand int)
declare @counter int = 0
while @counter<1000000
begin
    insert into @test
    select ceiling((rand()*2))
    set @counter=@counter+1
end
select distinct ceiling_rand from @test
Therefore I assume that the value generated in the SELECT is either greater than the length of the list or NULL. Why would it be like this only when generated in the SELECT statement? Perhaps the order of resolving CAST, CEILING or RAND inside SELECT is different than it would seem? It's true I've only tried it a limited number of times, but at this point the chances of it being a statistical fluctuation are extremely small. Is it somehow a floating-point error? I am truly stumped and looking forward to any explanation.
TL;DR: When a random number is generated inside a SELECT statement, the set of possible values is different than when it's generated before the SELECT statement.
Cheers,
NFSU
EDIT: Formatting
You can see what's going on if you look at the execution plan.
SET SHOWPLAN_TEXT ON
GO
SELECT (select choose(ceiling((rand()*2)) ,'a','b'))
Returns
|--Constant Scan(VALUES:((CASE WHEN CONVERT_IMPLICIT(int,ceiling(rand()*(2.0000000000000000e+000)),0)=(1) THEN 'a' ELSE CASE WHEN CONVERT_IMPLICIT(int,ceiling(rand()*(2.0000000000000000e+000)),0)=(2) THEN 'b' ELSE NULL END END)))
The CHOOSE is expanded out to
SELECT CASE
WHEN ceiling(( rand() * 2 )) = 1 THEN 'a'
ELSE
CASE
WHEN ceiling(( rand() * 2 )) = 2 THEN 'b'
ELSE NULL
END
END
and rand() is referenced twice. Each evaluation can return a different result.
The rewrite below has the same problem, because it is expanded out in the same way:
SELECT CASE ceiling(( rand() * 2 ))
WHEN 1 THEN 'a'
WHEN 2 THEN 'b'
END
Avoid CASE for this and any of its variants.
One method would be
SELECT JSON_VALUE ( '["a", "b"]' , CONCAT('$[', FLOOR(rand()*2) ,']') )
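Another way to sidestep the double evaluation, along the lines of what the question already found to work, is to evaluate RAND() once into a variable and pass that variable to CHOOSE. A minimal sketch; the variable name @idx is illustrative, and FLOOR(...) + 1 is used here instead of CEILING so that an index of 0 cannot occur:

```sql
-- RAND() is evaluated exactly once, so both CASE branches that CHOOSE
-- expands into compare against the same stored value.
DECLARE @idx int = CAST(FLOOR(RAND() * 2) AS int) + 1;  -- 1 or 2

SELECT @idx AS idx, CHOOSE(@idx, 'a', 'b') AS picked;
```

Because the index is fixed before the SELECT runs, the expanded CASE expression no longer sees two different random numbers.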

T-SQL Scientific notation field conversion

I loaded an Excel file into SQL Server as varchar(max) and ended up with a value in scientific (e) notation. I now need to convert it to a numeric type so that I can compare it, and this is where I run into a problem.
The main question: how, and to what type, can I convert this so that I can compare it with a whole integer value?
In the picture you can see how this looks in Excel. Even when formatted as text, it was still loaded into varchar(max) in scientific notation rather than as a plain string of digits, as my test code shows.
DECLARE @C VARCHAR(MAX) = '1.1001562717e+011', @Nc VARCHAR(MAX) = '110015627174';
SELECT @c, LEN(@c) LenC ,
ISNUMERIC(@c) NumYN
---, CAST(@c AS DECIMAL(38,2)) cDec ---CAST(@c AS NUMERIC) cNum --, CAST(@c AS BIGINT) cInt
WHERE @c LIKE '%[^0-9]%'
AND ISNUMERIC(@c) = 1
To start, ISNUMERIC is a terrible function, it does not give good results; it is often wrong. If you try ISNUMERIC('1.1001562717e+011') you'll notice that you get the value 1, however, CONVERT(numeric(13,1),'1.1001562717e+011') will produce an error. A far better function is TRY_CONVERT (or TRY_CAST), which returns NULL if the conversion fails for the specific data type: TRY_CONVERT(numeric(13,1),'1.1001562717e+011').
Being specific about the data type is actually important here, as ISNUMERIC could be (incorrectly) suggesting that the value can be converted to at least 1 of the numeric data types; but that doesn't mean all of them. For values in scientific notation, the only data type you can convert to directly is float/real:
SELECT TRY_CONVERT(numeric(13,1),'1.1001562717e+011') AS Numeric,
TRY_CONVERT(bigint,'1.1001562717e+011') AS int,
TRY_CONVERT(float,'1.1001562717e+011') AS float,
TRY_CONVERT(money,'1.1001562717e+011') AS money;
Notice that only float has a value here. As you want a numeric as the final value, then you'll need to CONVERT the value twice:
CONVERT(numeric(13,1),TRY_CONVERT(float,'1.1001562717e+011'))
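For the comparison against a whole integer value that the question asks about, the same two-step conversion can target bigint instead. A minimal sketch using the two sample values from the question (the column aliases are illustrative):

```sql
DECLARE @c  varchar(max) = '1.1001562717e+011';
DECLARE @nc varchar(max) = '110015627174';

-- Go through float first, then narrow to bigint for the comparison.
SELECT TRY_CONVERT(bigint, TRY_CONVERT(float, @c)) AS c_as_bigint,
       TRY_CONVERT(bigint, @nc)                    AS nc_as_bigint;
```

Note that the scientific-notation string only carries 11 significant digits: written out, 1.1001562717e+011 is 110015627170, so the trailing 4 of 110015627174 was already lost before the data reached SQL Server and no conversion can bring it back. An exact equality comparison between the two values should therefore not be expected to match.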

Convert INT to BIT

I tried the query below:
DECLARE @Input INT = 300
DECLARE @Ouput TINYINT
SET @Ouput = @Input
SELECT @Ouput
When executing the above statement, I received the following error:
Arithmetic overflow error for data type tinyint, value = 300.
The input value exceeds the limit, so the error is displayed.
I then tried another query:
DECLARE @Input INT = 300
DECLARE @Ouput BIT
SET @Ouput = @Input
SELECT @Ouput
When I execute this statement, to my surprise it doesn't show any error. If the input value is <> 0 (negative or positive), the output value is always 1.
Converting to bit promotes any nonzero value to 1.
SQL Server tries to convert values implicitly between the input and output types when you do not use the CAST or CONVERT function yourself.
When such an implicit conversion fails, it raises an error specific to the target type.
Here the ranges of int and tinyint differ: 300 does not fit in a tinyint, which allows a maximum value of 255.
The data type conversion chart in the documentation shows which implicit and explicit conversions are supported.
You receive the arithmetic overflow error when assigning an int value to a tinyint because the range of int is larger than that of tinyint (hence "tiny").
A bit, on the other hand, can only hold 0 or 1. You get 1 because the input is nonzero; if it were 0, you would get 0.
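The bit behaviour described above is easy to check with a few literals. A minimal sketch (the column aliases are illustrative):

```sql
-- Any nonzero integer, positive or negative, becomes 1; only 0 stays 0.
SELECT CAST(300 AS bit) AS from_300,       -- 1
       CAST(-5  AS bit) AS from_negative,  -- 1
       CAST(0   AS bit) AS from_zero;      -- 0
```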

'Converting varchar to data type numeric' error after successful conversion to decimal(18,2)

I have a temporary table I'm using for parsing, #rp.
#rp contains an nvarchar(max) column, #rp.col8, which holds positive and negative numbers with two decimal places, e.g. '1234.26'.
I'm able to run the following query and get a set of converted values out:
select * from
(
select CONVERT(decimal(18,2),rp.col8) as PARSEAMT
from #rp
where
--#rp filtering criteria
)q
However, when I try to query for PARSEAMT = 0 in the following manner, I get the standard '8114, Error converting data type varchar to numeric.':
select * from
(
select CONVERT(decimal(18,2),col8) as PARSEAMT
from #rp
where
--#rp filtering criteria
)q
where q.PARSEAMT = 0
Without that where clause, the query runs fine and generates the expected values.
I've also tried other clauses like where q.PARSEAMT = 0.00 and where q.PARSEAMT = convert(decimal(18,2),0).
What am I doing wrong in my comparison?
I was going to suggest you select PARSEAMT into another temp-table/table-variable but I can see you've already done that from your comments.
Out of interest what does the following yield?
select
col8
from
#rp
where
-- ISNUMERIC returns 1 when the input expression evaluates to a valid
-- numeric data type; otherwise it returns 0. Valid numeric data types
-- include the following:
isnumeric(col8) <> 1
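Since ISNUMERIC can report 1 for strings that still cannot be converted to decimal, a stricter variant of the same diagnostic is to ask TRY_CONVERT for the exact target type. A minimal sketch, assuming SQL Server 2012 or later; the sample rows are hypothetical stand-ins for the real contents of #rp:

```sql
-- Hypothetical sample data standing in for the question's #rp table.
CREATE TABLE #rp (col8 nvarchar(max));
INSERT INTO #rp (col8) VALUES (N'1234.26'), (N'-17.50'), (N'n/a'), (N'');

-- List the rows that would make CONVERT(decimal(18,2), col8) fail.
SELECT col8
FROM #rp
WHERE col8 IS NOT NULL
  AND TRY_CONVERT(decimal(18,2), col8) IS NULL;

DROP TABLE #rp;
```

Any row this returns is a candidate for the 8114 error, whether or not the filtering criteria of the original query were meant to exclude it.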

Can I use a hash of fields instead of direct field comparison to simplify comparison of records?

I am integrating between 4 data sources:
InternalDeviceRepository
ExternalDeviceRepository
NightlyDeviceDeltas
MidDayDeviceDeltas
Changes flow into the InternalDeviceRepository from the other three sources.
All sources eventually are transformed to have the definition of
FIELDS
=============
IdentityField
Contract
ContractLevel
StartDate
EndDate
ContractStatus
Location
IdentityField is the PrimaryKey, Contract Key is a secondary Key only if a match exists, otherwise a new record needs to be created.
Currently I compare all the fields in a WHERE clause in SQL Statements and also in a number of places in SSIS packages. This creates some unclean looking SQL and SSIS packages.
I've been mulling computing a hash of ContractLevel, StartDate, EndDate, ContractStatus, and Location and adding that to each of the input tables. This would allow me to use a single value for comparison, instead of 5 separate ones each time.
I've never done this before, nor have I seen it done. Is there a reason that it should be used, or is that a cleaner way to do it?
It is a valid approach. Consider introducing a computed column holding the hash, with an index on it.
You may use either CHECKSUM function or write your own hash function like this:
CREATE FUNCTION dbo.GetMyLongHash(@data VARBINARY(MAX))
RETURNS VARBINARY(MAX)
WITH RETURNS NULL ON NULL INPUT
AS
BEGIN
    DECLARE @res VARBINARY(MAX) = 0x
    DECLARE @position INT = 1, @len INT = DATALENGTH(@data)
    -- Hash the input in 8000-byte chunks and concatenate the digests
    WHILE 1 = 1
    BEGIN
        SET @res = @res + HASHBYTES('MD5', SUBSTRING(@data, @position, 8000))
        SET @position = @position+8000
        IF @position > @len
            BREAK
    END
    -- Re-hash until a single 16-byte digest remains
    WHILE DATALENGTH(@res) > 16 SET @res= dbo.GetMyLongHash(@res)
    RETURN @res
END
which will give you a 16-byte value. You may take all 16 bytes as a GUID, or only the first 8 bytes as a bigint, and compare that.
Adapt the function to your needs, for example to accept a string as the parameter, or even all of your fields, instead of varbinary.
BUT
be careful with string casing and datetime formats
if using CHECKSUM, also compare the other fields, because CHECKSUM produces duplicates
avoid using a 4-byte hash result on a relatively big table
