Casting binary to bit - sql-server

Based on the recent question.
Can someone point me to explanation of the following?
If I cast binary(4) constant 0x80000000 to int, take the resulting value and cast it to bit type, the result is 1.
select cast(0x80000000 as int) --> -2147483648
select cast(-2147483648 as bit) --> 1
But if I cast 0x80000000 to bit type directly the result is 0.
select cast(0x80000000 as bit) --> 0
I hoped to get 1 in this case as well, thinking that this expression is probably equivalent to
select cast(cast(0x80000000 as binary(1)) as bit)
but this is not the case. Instead, it seems that the last (rightmost) byte of the binary constant is taken and converted to bit. So, effectively it is something like
select cast(cast(right(0x80000000, 1) as binary(1)) as bit)
I'm clear on the first part, binary -> int -> bit. What I'm not clear on is the second part, binary -> bit. I was not able to find this behavior explained in the documentation, where only
Converting to bit promotes any nonzero value to 1.
is stated.

binary is not a number, it's a string of bytes. When you cast binary to another type, a conversion is performed. When binary is longer than the target data-type, it is truncated from the left. When it's shorter than the target, it is padded with zeroes from the left. The exception is when casting to another string type (e.g. varchar or another binary) - there it's padding and truncation from the right, which may be a bit confusing at first :)
So what happens here?
select cast(cast(0x0F as binary(1)) as bit) -- 1 - 0x0F is nonzero
select cast(cast(0x01 as binary(1)) as bit) -- 1 - 0x01 is nonzero
select cast(cast(0x01 as binary(2)) as bit) -- 0 - padded to 0x0100, then truncated to 0x00, which is zero
select cast(cast(0x0100 as binary(2)) as bit) -- 0 - truncated to 0x00
select cast(cast(0x0001 as binary(2)) as bit) -- 1 - truncated to 0x01, nonzero
As the documentation says:
When data is converted from a string data type (char, varchar, nchar, nvarchar, binary, varbinary, text, ntext, or image) to a binary or varbinary data type of unequal length, SQL Server pads or truncates the data on the right. When other data types are converted to binary or varbinary, the data is padded or truncated on the left. Padding is achieved by using hexadecimal zeros.
Which is something you can use, because:
select cast(0x0100 as binary(1)) -- 0x01
So if you need a non-zero test on the whole value, you basically need to convert to an integer data type, if possible. If you want the rightmost byte, use cast as bit, and if you want the leftmost, use cast as binary(1). Any other byte can be reached by using the string manipulation functions (binary is a string, just not a string of characters). Comparing binary with a number, as in 0x01000 = 0, does not test the whole value either: it involves an implicit conversion to int (in this case), so the usual rules apply and 0x0100000000 = 0 is true.
Also note that there are no guarantees that conversions from binary are consistent between SQL Server versions - they're not really managed.
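To make the two rules easier to play with outside SQL Server, here is a minimal Python sketch of them. The helper names to_binary and to_bit are made up for illustration, and the functions only mimic the behaviour described in this answer; they are not how SQL Server implements it.

```python
# Illustrative model only: these helpers are hypothetical and merely mimic
# the conversion rules described above; they are not SQL Server code.

def to_binary(data: bytes, length: int) -> bytes:
    """binary -> binary(length): truncate or zero-pad on the right."""
    return data[:length].ljust(length, b"\x00")


def to_bit(data: bytes) -> int:
    """binary -> bit as observed above: only the rightmost byte is examined."""
    last_byte = data[-1:]  # an empty input yields b"" and is treated as zero
    return 1 if last_byte not in (b"", b"\x00") else 0


print(to_bit(to_binary(b"\x01", 2)))      # value becomes 0x0100, last byte is 0x00
print(to_bit(to_binary(b"\x00\x01", 2)))  # value stays 0x0001, last byte is 0x01
print(to_binary(b"\x01\x00", 1).hex())    # leftmost byte survives
```

Under this model to_bit(bytes.fromhex("80000000")) is 0, which matches the result in the question.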

Yes, in general when converting from an arbitrary length binary or varbinary value to a fixed size type, it's the rightmost bits or bytes that are converted:
select
CAST(CAST(0x0102030405060708 as bigint) as varbinary(8)),
CAST(CAST(0x0102030405060708 as int) as varbinary(8)),
CAST(CAST(0x0102030405060708 as smallint) as varbinary(8)),
CAST(CAST(0x0102030405060708 as tinyint) as varbinary(8))
Produces:
------------------ ------------------ ------------------ ------------------
0x0102030405060708 0x05060708 0x0708 0x08
I can't actually find anywhere in the documentation that specifically states this, but then again, the documentation does basically state that conversions between binary and other types are not guaranteed to follow any specific conventions:
Converting any value of any type to a binary value of large enough size and then back to the type, will always result in the same value if both conversions are taking place on the same version of SQL Server. The binary representation of a value might change from version to version of SQL Server.
So, the above shown conversions were the "expected" results running on SQL Server 2012, on my machine, but others may get different results.
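For comparison, the same "keep the rightmost bytes" behaviour can be sketched in Python. The helper names and the SIZES table are my own, and the big-endian, two's complement reading is an assumption based on the output above rather than on anything the documentation promises.

```python
# Hypothetical helpers that mimic the observed results; not SQL Server code.
# Byte widths of the SQL Server integer types.
SIZES = {"bigint": 8, "int": 4, "smallint": 2, "tinyint": 1}


def binary_to_integer(data: bytes, type_name: str) -> int:
    """Keep the rightmost bytes (zero-padding on the left if too short)."""
    size = SIZES[type_name]
    kept = data[-size:].rjust(size, b"\x00")
    signed = type_name != "tinyint"  # tinyint is the only unsigned type here
    return int.from_bytes(kept, "big", signed=signed)


def integer_to_binary(value: int, type_name: str) -> bytes:
    """Convert back to the type's natural width, as in the query above."""
    signed = type_name != "tinyint"
    return value.to_bytes(SIZES[type_name], "big", signed=signed)


source = bytes.fromhex("0102030405060708")
for name in SIZES:
    print(name, integer_to_binary(binary_to_integer(source, name), name).hex())
```

The same model also gives -2147483648 for 0x80000000 read as int, as in the question.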

Related

Is the SQL Server data type called "money" a binary-fixed-point or decimal-fixed-point type?

I'm expanding on a similar question titled Is SQL Server 'MONEY' data type a decimal floating point or binary floating point?
The accepted answer told us that the "money" datatype is fixed-point rather than floating-point type, but didn't say whether it was binary-fixed-point or decimal-fixed-point.
I'm assuming it's decimal-fixed-point but I can't find confirmation of this anywhere.
The documentation tells us the range and size, but not the underlying implementation.
Not sure why you care about the underlying implementation but you can CAST a money data type value to binary(8) to see the value's bits:
DECLARE @money money;
--same as min 64-bit signed integer (two's complement) with 4 decimal places assumed
SET @money = -922337203685477.5808;
SELECT CAST(@money AS binary(8)); --0x8000000000000000
--same as max 64-bit signed integer with 4 decimal places assumed
SET @money = 922337203685477.5807;
SELECT CAST(@money AS binary(8)); --0x7FFFFFFFFFFFFFFF
So money looks to be a 64-bit signed integer with 4 decimal places assumed. The precision and scale are not stored with the value for money (or its smallmoney cousin).
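The same reading can be checked with a small Python sketch. It assumes, as the output above suggests, that the stored value is simply the amount multiplied by 10,000 and kept as a big-endian two's complement 64-bit integer; money_to_binary is a made-up helper name.

```python
from decimal import Decimal


def money_to_binary(amount: str) -> bytes:
    """Model money as a 64-bit signed integer counting ten-thousandths."""
    scaled = Decimal(amount) * 10000
    if scaled != scaled.to_integral_value():
        raise ValueError("more than 4 decimal places")
    # to_bytes raises OverflowError if the value does not fit in 8 bytes
    return int(scaled).to_bytes(8, "big", signed=True)


print(money_to_binary("-922337203685477.5808").hex())
print(money_to_binary("922337203685477.5807").hex())
```

If the model is right, the two printed values correspond to the two binary(8) results shown in the T-SQL comments.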

how does SQL Server actually store russian symbols in char?

I have a column NAME, which is CHAR(50).
It contains the value 'Рулон комбинированный СТЕРИТ 50мм ? 200 м'
whose integer representation is:
'1056,1091,1083,1086,1085,32,1082,1086,1084,1073,1080,1085,1080,1088,1086,1074,1072,1085,1085,1099,1081,32,1057,1058,1045,1056,1048,1058,32,53,48,1084,1084,32,63,32,50,48,48,32,1084'
but CHAR implies that it contains 8 bit. How does SQL Server store values like '1056,1091,1083,1086,1085' which are UNICODE symbols?
Also, the ? symbol is actually × (215), the multiplication sign.
If SQL Server can represent '1056', why can't it represent '215'?
What the 256 possible values in a char mean is determined by the database collation. For Russia this is typically Cyrillic_General_CI_AS (where CI means Case Insensitive and AS means Accent Sensitive).
There's a good chance this matches Windows code page 1251, so л is stored as hex EB or decimal 235. You can verify this with T-SQL:
create database d1 collate Cyrillic_General_CI_AS;
use d1
select ascii('л')
-->
235
In the Cyrillic code page, decimal 215 means Ч, not the multiplication sign. Because SQL Server can't match the multiplication sign to the Cyrillic code page, it replaces it with a question mark:
select ascii('×'), ascii('?')
-->
63 63
In the Cyrillic code page, the char 8-bit representation of the multiplication sign and the question mark are both decimal 63, the question mark.
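You can reproduce the same mapping outside SQL Server with Python's cp1251 codec. This is only a sketch, assuming the collation really does map to Windows code page 1251; to_char_bytes is a made-up helper name.

```python
def to_char_bytes(text: str, codepage: str = "cp1251") -> bytes:
    """Encode text into an 8-bit code page, replacing unmappable characters with '?'."""
    return text.encode(codepage, errors="replace")


print(to_char_bytes("л"))             # a single byte, decimal 235
print(to_char_bytes("×"))             # not in cp1251, so it becomes b'?'
print(bytes([215]).decode("cp1251"))  # byte 215 is a Cyrillic letter here
print(ord("Р"))                       # 1056 is the Unicode code point, not a stored byte
```

The last line also shows where the numbers in the question come from: 1056 is the Unicode code point of Р, not what a char column stores.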
I have a column NAME, which is CHAR(50).
It contains the value 'Рулон комбинированный СТЕРИТ 50мм ? 200 м'
whose integer representation is:
'1056,1091,1083,1086,1085,32,1082,1086,1084,1073,1080,1085,1080,1088,1086,1074,1072,1085,1085,1099,1081,32,1057,1058,1045,1056,1048,1058,32,53,48,1084,1084,32,63,32,50,48,48,32,1084'
What is quoted above is wrong.
I ran a test in a database with a Cyrillic collation and the integer representation is different from what you showed us, so either your data type is not char or your integer representation is wrong. And yes, "but CHAR implies that it contains 8 bit" is correct, and here is how you can prove it to yourself:
--create table dbo.t (name char(50));
--insert into dbo.t values ('Рулон комбинированный СТЕРИТ 50мм ? 200 м')
select cast (name as binary(50))
from dbo.t;
select substring(cast (name as binary(50)), n, 1) as bin_substr,
cast(substring(cast (name as binary(50)), n, 1) as int) as int_,
char(substring(cast (name as binary(50)), n, 1)) as cyr_char
from dbo.t cross join nums.dbo.nums;
Here dbo.Nums is an auxiliary table containing integers. I just convert your string from the char field into binary, split it byte by byte, and convert each byte into int and char.
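The byte-by-byte split can be sketched in Python as well. This assumes code page 1251 and space padding up to the declared width, and split_char_column is a made-up name, so treat it as an illustration of the idea rather than a reproduction of the query.

```python
def split_char_column(value: str, width: int, codepage: str = "cp1251"):
    """Return (hex, int, char) for each byte of a space-padded char(width) value."""
    raw = value.encode(codepage, errors="replace")[:width].ljust(width, b" ")
    rows = []
    for i in range(width):
        one = raw[i:i + 1]
        rows.append((one.hex(), one[0], one.decode(codepage, errors="replace")))
    return rows


for row in split_char_column("Рулон", 8):
    print(row)
```

Every integer in the second position is below 256, which is the point: each character occupies exactly one byte.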

Using CAST and bigint

I am trying to understand what this statement does
SUM(CAST(FILEPROPERTY(name, 'SpaceUsed') AS bigint) * 8192.)/1024 /1024
Also, why is there a dot after 8192? Can anybody explain this query bit by bit? Thanks!
FILEPROPERTY() returns an int value. Note that the SpaceUsed property is not in bytes but in "pages" - and in SQL Server a page is 8KiB, so multiplying by 8192 to get the size in bytes is appropriate.
I've never encountered a trailing dot without fractional digits before - the documentation for constants/literals in T-SQL does not give an example of this usage, but reading it implies it's a decimal:
decimal constants are represented by a string of numbers that are not enclosed in quotation marks and contain a decimal point.
Thus multiplying the bigint value by a decimal yields a decimal value, which is desirable if you want to preserve fractional digits when dividing by 1024 (and then 1024 again). Those two divisors are int literals, but because the left operand is already decimal they are implicitly converted too, so the divisions are decimal divisions rather than truncating integer divisions.
I haven't tested it, but you could try just this:
SELECT
SUM( FILEPROPERTY( name, 'SpaceUsed' ) ) * ( 8192.0 / 1048576 ) AS TotalMegabytes
FROM
...
If you're reading through code and you need to do research to understand what it's doing - do the next person who reads the code a favour by adding an explanatory comment to save them from having to do that research, e.g. "gets the total number of 8KiB pages used by all files, then converts it to megabytes".
The dot . after an integer literal makes it a decimal constant. It is most likely there to force the result to be decimal as well (not an integer). Only one operand of the operation needs to be decimal for the result to take that type.
This probably has to do with bytes and pages, given the numbers 8192 and 1024 (most likely used to convert to a larger unit). One can also infer this from the name of the property, which indicates how much space is being used by a file.
A page is 8 KiB, so multiplying the page count by 8192 gives the number of bytes in use. Dividing twice by 1024 then converts the result to megabytes.
Explanation on functions used:
FILEPROPERTY returns a property value for a file name stored within the database. If the file is not present, NULL is returned
CAST is for casting the value to type bigint
SUM is an aggregate function used in a query to sum values for a specified group
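To see why the dot matters, here is a rough Python sketch of the arithmetic, with Decimal standing in for the T-SQL decimal type and integer division standing in for int arithmetic. The function names and the page counts are made up for illustration.

```python
from decimal import Decimal

PAGE_BYTES = 8192  # one SQL Server page is 8 KiB


def used_megabytes(pages_per_file):
    """Decimal arithmetic, like SUM(pages * 8192.) / 1024 / 1024."""
    total_bytes = sum(Decimal(p) * PAGE_BYTES for p in pages_per_file)
    return total_bytes / 1024 / 1024


def truncated_megabytes(pages_per_file):
    """Integer arithmetic, like the same expression without the dot.
    For non-negative values // matches T-SQL's truncating int division."""
    total_bytes = sum(p * PAGE_BYTES for p in pages_per_file)
    return total_bytes // 1024 // 1024


print(used_megabytes([1000]))       # keeps the fractional part
print(truncated_megabytes([1000]))  # fractional part is lost
```

With 1000 pages the decimal version keeps 7.8125 while the integer version drops to 7, which is presumably what the trailing dot is there to avoid.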

Some doubts related Microsoft SQL Server bigint

I have the following doubt related to Microsoft SQL Server. If a bigint column has the value 0E-9, does it mean that this cell can contain a value with 9 decimal digits, or what?
BIGINT: -9,223,372,036,854,775,808 TO 9,223,372,036,854,775,807
INT: -2,147,483,648 TO 2,147,483,647
SMALLINT: -32,768 TO 32,767
TINYINT: 0 TO 255
These are for storing non-decimal values. You need to use DECIMAL or NUMERIC to store values such as 112.455. When maximum precision is used, valid values are from -10^38 + 1 through 10^38 - 1.
0E-9 isn't a NUMERIC or INTEGER value. It's a VARCHAR unless you mean something else, like scientific notation.
https://msdn.microsoft.com/en-us/library/ms187746.aspx
https://msdn.microsoft.com/en-us/library/ms187745.aspx
No, the value would be stored as an integer. Any decimal amounts will be dropped off and only values to the left of the decimal will be kept (no rounding).
More specifically, bigint stores whole numbers between -2^63 and 2^63-1.
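If 0E-9 is read as scientific notation (an assumption, since the question does not say where the value was displayed), it is just zero written with nine decimal places. A small Python sketch of that reading, with a made-up to_bigint helper that truncates and range-checks like the description above:

```python
from decimal import Decimal

BIGINT_MIN, BIGINT_MAX = -2**63, 2**63 - 1


def to_bigint(text: str) -> int:
    """Drop the fractional part (no rounding) and enforce the bigint range."""
    whole = int(Decimal(text))  # int() truncates toward zero
    if not BIGINT_MIN <= whole <= BIGINT_MAX:
        raise OverflowError("value does not fit in bigint")
    return whole


zero = Decimal("0E-9")
print(zero == 0, zero.as_tuple().exponent)  # zero, written with exponent -9
print(to_bigint("0E-9"), to_bigint("112.455"))
```

So under that reading the exponent describes how the value was printed, not what a bigint column can hold.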

Why SQL binary convert function results a non-0101... value?

Why, when I use this command in SQL Server 2005,
select convert(varbinary(16),N'123')
is the result 0x310032003300 and not 1111011?
Basically each character of '123' gets converted to its UCS-2 value (basically the ASCII value padded to make it a double byte), giving the three double bytes 0x3100, 0x3200 and 0x3300, which are concatenated together into a varbinary.
Hopefully that answers why you see this behavior. If you convert to an int first you may see what you were perhaps hoping for instead:
select convert(varbinary(16),cast(N'123' as int))
produces hex value 0x0000007B which in binary is 1111011
See http://www.asciitable.com/ for the entry for the digit 3: the hex representation is 0x33, which corresponds to the same entry in Unicode: http://www.ssec.wisc.edu/~tomw/java/unicode.html (this pattern does not necessarily hold true for all ASCII/Unicode characters, but does for the 10 digits).
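The difference between the two conversions can be reproduced in Python, assuming that N'123' is stored as little-endian two-byte code units and that a 4-byte int is written big-endian, as the two results above suggest. The helper names are made up.

```python
def nvarchar_bytes(text: str) -> bytes:
    """Bytes of the characters themselves, two bytes per character."""
    return text.encode("utf-16-le")


def int_bytes(value: int) -> bytes:
    """Bytes of the number, as a 4-byte signed integer."""
    return value.to_bytes(4, "big", signed=True)


print(nvarchar_bytes("123").hex())  # the text '1', '2', '3'
print(int_bytes(int("123")).hex())  # the number 123
print(format(123, "b"))             # the same number in base 2
```

The first line is the encoding of three characters, while the last two are the number 123 itself, which is why only the int route leads to 1111011.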
