Difference when casting a datetime to a decimal on different SQL servers - sql-server

On two different SQL Servers I run the following query:
declare @myDatetime as datetime = '2017-07-04 23:42:32.400'
select CAST(@myDatetime AS DECIMAL(20,5))
I get two different results:
42918.98788 and 42918.98787
If I cast to a DECIMAL(20,6) it works fine (42918.987875) but let's say I need to put it in a decimal(20,5).
Where can I find the source of this difference in rounding behaviour? Is it an option somewhere that rounds the final 5 up or down? Is it some sort of locale, international setting, collation, or something else?
Is it the different versions of SQL Server (12.0.5000.0 vs 13.0.4202.2)?

According to this document:
https://support.microsoft.com/en-us/help/4010261/sql-server-2016-improvements-in-handling-some-data-types-and-uncommon
Microsoft has made some changes to how it handles some "uncommon" conversions:
SQL Server 2016 includes improvements to the precision of the following operations under compatibility level 130:
Uncommon data type conversions. These include the following:
float/integer to/from datetime/smalldatetime
real/float to/from numeric/money/smallmoney
float to real
So I would suspect that is what causes the different rounding in this specific situation.
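Note that the version numbers in the question, 12.0.5000.0 and 13.0.4202.2, are SQL Server 2014 and SQL Server 2016 respectively, which fits this explanation. If you want to verify it, you can compare the compatibility levels of the two databases and, on a scratch database you can alter (MyTestDb below is a made-up name), rerun the cast under both levels:
-- check the compatibility level of the current database
select name, compatibility_level from sys.databases where name = db_name()
-- flip a test database between pre-2016 and 2016 behaviour, rerunning the CAST each time
alter database MyTestDb set compatibility_level = 120
alter database MyTestDb set compatibility_level = 130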

Related

Pandas read_sql changing large number IDs when reading

I transferred an Oracle database to SQL Server and all seems to have gone well. The various ID columns are large numbers, so I had to use DECIMAL, as they were too large for BIGINT.
I am now trying to read the data using pandas.read_sql over a pyodbc connection with ODBC Driver 17 for SQL Server:
df = pandas.read_sql("SELECT * FROM table1", con)
The numbers are coming out as float64. When I try to print them or use them in SQL statements they come out in scientific notation, and when I try to use '{:.0f}'.format(df.loc[i,'Id']) it turns several numbers into the same number, such as 90300111000003078520832. It is like precision is lost when it goes to scientific notation.
I also tried pd.options.display.float_format = '{:.0f}'.format before the read_sql but this did not help.
Clearly I must be doing something wrong as the Ids in the database are correct.
Any help is appreciated. Thanks.
pandas' read_sql method has an option named coerce_float which defaults to True and it …
Attempts to convert values of non-string, non-numeric objects (like decimal.Decimal) to floating point, useful for SQL result sets.
However, in your case it is not useful, so simply specify coerce_float=False.
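For example, with the same query as above:
df = pandas.read_sql("SELECT * FROM table1", con, coerce_float=False)
The IDs then come back as decimal.Decimal objects instead of float64, so no precision is lost.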
I've had this problem too, especially when working with long ids: read_sql works fine for the primary key, but not for other columns (like the retweeted_status_id from Twitter API calls). Setting coerce_float to False does nothing for me, so instead I cast retweeted_status_id to a character format in my SQL query.
Using psql, I do:
df = pandas.read_sql("SELECT *, Id::text FROM table1", con)
But in SQL Server it'd be something like
df = pandas.read_sql("SELECT *, CONVERT(text, Id) FROM table1", con)
or
df = pandas.read_sql("SELECT *, CAST(Id AS varchar) FROM table1", con)
Obviously there's a cost here if you're asking to cast many rows, and a more efficient option might be to pull from SQL server without using pandas (as a nested list or JSON or something else) which will also preserve your long integer formats.
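A minimal sketch of that last option with pyodbc (which the question is already using; connection_string stands in for whatever you already connect with):
import pyodbc
con = pyodbc.connect(connection_string)
rows = con.cursor().execute("SELECT Id FROM table1").fetchall()
ids = [int(row.Id) for row in rows]  # pyodbc returns DECIMAL columns as decimal.Decimal, so int() is exact
No float ever appears in that path, so the long IDs survive intact.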

SQL Server appears to correctly interpret non-supported string literal formats

I have an instance of SQL Server 2012 that appears to correctly interpret string literal dates whose formats are not listed in the docs (though note these docs are for SQL Server 2017).
E.g. I have a TSV with a column of dates in the format %d-%b-%y (see https://devhints.io/datetime#date-1), which looks like "25-FEB-93". However, this throws type errors when trying to copy the data into the SQL Server table (via the mssql-tools bcp binary). Yet, when testing on another table in SQL Server, I can do something like...
select top 10 * from account where BIRTHDATE > '25-FEB-93'
without any errors. All this, even though the given format is not listed in the docs as an acceptable date format, and it apparently also can't be used as a castable string literal when writing new records. Can anyone explain what is going on here?
the given format is not listed in the docs for acceptable date formats
That means it's not supported and does not have documented behavior. There are lots of strings that will convert under certain regional settings due to quirks in the parsing implementation.
It's a performance-critical code path, and so the string formats are not rigorously validated on conversion. You're expected to ensure that the strings are in a supported format.
So you may need to load the column as a varchar(n) and then convert it. E.g.:
declare @v varchar(200) = '25-FEB-93'
select convert(datetime, replace(@v, '-', ' '), 6)
Per the docs, style 6 is dd mon yy. Note that this conversion also "works" without replacing the - with a space, but that's exactly an example of the undocumented behavior you observed.
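Applied to the bulk-load scenario, a sketch (the staging table and its column are made-up names; account/BIRTHDATE come from the question):
create table #staging (birthdate_raw varchar(200))
-- bcp the TSV into #staging, then:
insert into account (BIRTHDATE)
select convert(datetime, replace(birthdate_raw, '-', ' '), 6)
from #staging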

How is the format of the geography data type in SQL Server built?

I'm not able to understand how the geography data type in SQL Server is built...
For example I have the following data:
0xE6100000010CCEAACFD556484340B2F336363BCA21C0
what I know:
0x is prefix for hexadecimal
last 16 numbers are the longitude: B2F336363BCA21C0 (a double in little-endian hex)
the 16 numbers before those are the latitude: CEAACFD556484340 (a double in little-endian hex)
first 4 numbers are the SRID: E610 (little-endian hex for 4326, i.e. WGS84)
what I don't understand:
numbers from 5 to 12 : 0000010C
what is this?
From what I read this seems linked to WKB (Well Known Binary) or EWKB (Extended Well Known Binary); anyway, I was not able to find a definition for EWKB...
And for WKB this is supposed to be the geometry type (a 4-byte integer), but the value doesn't match the geometry type codes (this example is a single point coordinate).
Can you help to understand this format?
The spatial types (geometry and geography) in SQL Server are implemented as CLR data types. As with any such data type, you get a binary representation when you query the value directly. Unfortunately, it's not (as far as I know) WKB, but rather whatever format Microsoft decided was best for their implementation. We (the users) should work with the published interface of methods (for instance the geography method reference). Which is to say that you should only try to decipher the MS binary representation if you're curious, not for actually working with it.
That said, if you need/want to work with WKB, you can! For example, you can use the STGeomFromWKB() static method to create a geography instance from WKB that you provide and STAsBinary() can be called on a geography instance to return WKB to you.
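For example (the point values here are just illustrative):
declare @g geography = geography::STGeomFromText('POINT(-8.8949 38.5651)', 4326)
select @g.STAsBinary()                                             -- standard WKB
select geography::STGeomFromWKB(@g.STAsBinary(), 4326).ToString()  -- WKB back to an instance, shown as WKT
select convert(varbinary(max), @g)                                 -- Microsoft's internal serialization, the format in the question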
The Format spec can be found here:
https://msdn.microsoft.com/en-us/library/ee320529(v=sql.105).aspx
As that page shows, it used to change very frequently, but has slowed down significantly over the past 2 years
I currently need to dig into the spec to serialize from JVM code into a bcp file, so that I can use SQLServerBulkCopy rather than plain JDBC to upload data into tables (it is about 7x faster to write a bcp file than to use JDBC), but this is proving to be more complicated than I originally anticipated.
After testing with bcp: you can upload geographies by specifying an off-row format (varchar(max)) and storing the well-known text; SQL Server will see this and assume you want a geography, based on the WKT it sees.
In my case converting to nvarchar resolved the issue.

MS SQL server - convert HEX string to integer

This answer to what looks like the same question:
Convert integer to hex and hex to integer
...does not work for me.
I am not able to go from a HEX string to an integer using MS SQL Server 2005 CAST or CONVERT. Am I missing something trivial? I have searched extensively, and the best I can find are long-winded user functions to go from a hex string to something that looks like a decimal int. Surely there is a simple way to do this directly in a query using built-in functions, rather than writing a user function?
Thanks
Edit to include examples:
select CONVERT(INT, 0x89)
works as expected, but
select CONVERT(INT, '0x' + substring(msg, 66, 2)) from sometable
gets me:
"Conversion failed when converting the varchar value '0x89' to data type int."
an extra explicit CAST:
select CONVERT(INT, CAST('0x89' AS VARBINARY))
executes, but returns 813185081 (the ASCII bytes of the string '0x89', i.e. 0x30783839, read as an int).
Substituting 'Int', 'Decimal', etc. for 'Varbinary' results in an error. In general, strings that appear to be numeric are interpreted as numeric when required, but not in this case, and there does not appear to be a CAST that recognizes hex. I would like to think there is something simple and obvious that I've just missed.
Microsoft SQL Server Management Studio Express 9.00.3042.00
Microsoft SQL Server 2005 - 9.00.3080.00 (Intel X86) Sep 6 2009 01:43:32 Copyright (c) 1988-2005 Microsoft Corporation Express Edition with Advanced Services on Windows NT 5.1 (Build 2600: Service Pack 3)
To sum up: I want to take a hex string which is a value in a table, and display it as part of a query result as a decimal integer, using only system defined functions, not a UDF.
Thanks for giving some more explicit examples. As far as I can tell from the documentation and Googling, this is not possible in MSSQL 2005 without a UDF or other procedural code. In MSSQL 2008 the CONVERT() function's style parameter now supports binary data, so you can do it directly like this:
select convert(int, convert(varbinary, '0x89', 1))
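Applied to the substring example from your question, that would look something like this (style 1 expects the 0x prefix; style 2 expects the bare hex digits):
select convert(int, convert(varbinary, '0x' + substring(msg, 66, 2), 1)) from sometable
-- or, skipping the prefix concatenation:
select convert(int, convert(varbinary, substring(msg, 66, 2), 2)) from sometable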
In previous versions, your choices are:
Use a UDF (TSQL or CLR; CLR might actually be easier for this)
Wrap the SELECT in a stored procedure (but you'll probably still have the equivalent of a UDF in it anyway)
Convert it in the application front end
Upgrade to MSSQL 2008
If converting the data is only for display purposes, the application might be the easiest solution: data formatting usually belongs there anyway. If you must do it in a query, then a UDF is easiest but the performance may not be great (I know you said you preferred not to use a UDF but it's not clear why). I'm guessing that upgrading to MSSQL 2008 just for this probably isn't realistic.
Finally, FYI, the version number you included is the version of Management Studio, not the version of your server. To get that, query the server itself with select @@version or select serverproperty('ProductVersion').

What datatype should I bind as a query parameter to use with a NUMBER(15) column in Oracle ODBC?

I have just been bitten by the issue described in the SO question Binding int64 (SQL_BIGINT) as query parameter causes error during execution in Oracle 10g ODBC.
I'm porting a C/C++ application using ODBC 2 from SQL Server to Oracle. For numeric fields exceeding NUMBER(9) it uses the __int64 datatype, which is bound to queries as SQL_C_SBIGINT. Apparently such binding is not supported by Oracle ODBC, so I must now do an application-wide conversion to another method. Since I don't have much time (it's an unexpected issue), I would rather use a proven solution, not trial and error.
What datatype should be used to bind as e.g. NUMBER(15) in Oracle? Is there documented recommended solution? What are you using? Any suggestions?
I'm especially interested in solutions that do not require any additional conversions. I can easily provide and consume numbers in the form of __int64 or char* (normal non-exponential form without thousands separators or a decimal point). Any other format requires additional conversion on my part.
What I have tried so far:
SQL_C_CHAR
Looks like it's going to work for me. I was worried about the variability of number formats, but in my use case it doesn't seem to matter: apparently only the fraction-point character changes with system language settings.
And I don't see why I should use an explicit cast (e.g. TO_NUMBER) in the SQL INSERT or UPDATE command. Everything works fine when I bind the parameter with SQL_C_CHAR as the C type and SQL_NUMERIC (with proper precision and scale) as the SQL type. I couldn't reproduce any data corruption effect.
SQL_NUMERIC_STRUCT
I've noticed SQL_NUMERIC_STRUCT added with ODBC 3.0 and decided to give it a try. I am disappointed.
In my situation it is enough, as the application doesn't really use fractional numbers. But as a general solution... Simply, I don't get it. I mean, I finally understood how it is supposed to be used. What I don't get is why anyone would introduce a new struct of this kind and then make it work this way.
SQL_NUMERIC_STRUCT has all the fields needed to represent any NUMERIC (or NUMBER, or DECIMAL) value with its precision and scale. Only they are not used.
When reading, ODBC sets the precision of the number (based on the precision of the column; except that Oracle returns a bigger precision, e.g. 20 for NUMBER(15)). But if your column has a fractional part (scale > 0), it is truncated by default. To read a number with the proper scale you need to set precision and scale yourself with a SQLSetDescField call before fetching the data.
When writing, Oracle thankfully respects the scale contained in the SQL_NUMERIC_STRUCT. But the ODBC spec doesn't mandate it, and MS SQL Server ignores this value. So, back to SQLSetDescField again.
See HOWTO: Retrieving Numeric Data with SQL_NUMERIC_STRUCT and INF: How to Use SQL_C_NUMERIC Data Type with Numeric Data for more information.
Why doesn't ODBC fully use its own SQL_NUMERIC_STRUCT? I don't know. It looks like it works, but I think it's just too much work.
I guess I'll use SQL_C_CHAR.
My personal preference is to make the bind variables character strings (VARCHAR2) and let Oracle do the conversion from character to its own internal storage format. It's easy enough (in C) to get data values represented as null-terminated strings in an acceptable format.
So, instead of writing SQL like this:
SET MY_NUMBER_COL = :b1
, MY_DATE_COL = :b2
I write the SQL like this:
SET MY_NUMBER_COL = TO_NUMBER( :b1 )
, MY_DATE_COL = TO_DATE( :b2 , 'YYYY-MM-DD HH24:MI:SS')
and supply character strings as the bind variables.
There are a couple of advantages to this approach.
One is that it works around the issues and bugs one encounters when binding other data types.
Another advantage is that bind values are easier to decipher on an Oracle event 10046 trace.
Also, an EXPLAIN PLAN (I believe) assumes all bind variables are VARCHAR2, which means the statement being explained may be slightly different from the actual statement being executed (due to the implicit data conversions when the datatypes of the bind arguments in the actual statement are not VARCHAR2).
And (less importantly), when I'm testing the statement in TOAD, it's easier to just type strings into the input boxes and not have to muck with changing the datatype in a dropdown list box.
I also let the built-in TO_NUMBER and TO_DATE functions validate the data. (In earlier versions of Oracle at least, I encountered issues with binding a DATE value directly: it bypassed (at least some of) the validity checking and allowed invalid date values to be stored in the database.)
This is just a personal preference, based on past experience. I use this same approach with Perl DBD.
I wonder what Tom Kyte (asktom.oracle.com) has to say about this topic?
