Bulk insert in SQL Server with a regional separator


I want to bulk insert data in which the decimal separator is a comma, as in my regional settings.
The data looks like this:
RegionName Value_1 Value_2 Value_3
Region 1 27,48 66,41 32,82
Region 2 38,93 45,80 61,83
Region 3 38,17 58,02 35,11
Region 4 34,35 16,03 29,01
Region 5 67,94 58,02 17,56
I run the bulk insert with this script:
create table RegVaues (
RegionName varchar(30)
,Value_1 float
,Value_2 float
,Value_3 float
)
go
bulk insert RegVaues
from N'A:\TestValues.txt'
with
(
DATAFILETYPE = 'widechar'
,fieldterminator = '\t'
,rowterminator = '\n'
,firstrow = 2
,keepnulls
)
go
After running the script I get this error:
Msg 4864, Level 16, State 1, Line 2 Bulk load data conversion error
(type mismatch or invalid character for the specified codepage) for
row 2, column 2 (Value_1).
When I try the same with a dot as the separator, everything works. I have tried inserting the data with different types (float, decimal, numeric). In my SSMS, under Tools->Options->International Settings, the Language is set to "Same as in Microsoft Windows". The database collation is Ukrainian_CI_AS. Still, data with a comma separator cannot be inserted. What am I doing wrong?

The error is self-explanatory: values containing commas, such as 27,48 or 66,41, are not valid float values. When you insert them into a float column, SQL Server tries to convert them to float and fails, hence the error message.
A simple solution is to load the data first into a holding/staging table whose columns have a character data type (varchar), then replace the commas with decimal points, and then insert the converted values into your final destination table.
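The clean-up step of that approach can be sketched outside the database as well. The Python snippet below is only an illustration under assumptions: the sample rows are copied from the question, the helper name parse_comma_decimal is made up, and it assumes the values contain no thousands separators. Inside SQL Server the same replacement would be applied to the varchar columns of the staging table.

```python
from decimal import Decimal

def parse_comma_decimal(text):
    """Convert a string such as '27,48' to an exact Decimal."""
    # Swap the regional decimal comma for the period that parsers expect.
    return Decimal(text.strip().replace(",", "."))

# Rows as they arrive in the text file (tab-separated, decimal commas).
staged_rows = [
    "Region 1\t27,48\t66,41\t32,82",
    "Region 2\t38,93\t45,80\t61,83",
]

converted = []
for row in staged_rows:
    # First field is the region name, the rest are the numeric values.
    name, *values = row.split("\t")
    converted.append((name, [parse_comma_decimal(v) for v in values]))
```

Each entry of converted then holds the region name and three exact decimal values, ready to be written to a table with DECIMAL or NUMERIC columns.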
Also keep in mind that float is an approximate data type and should only be used for approximate values (the mass of the Earth, the distance between planets, and so on). For exact values, use the DECIMAL or NUMERIC data types.
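The difference between approximate and exact types is easy to see in a small experiment. This is a sketch in Python rather than T-SQL: Python's float is an 8-byte binary floating-point number, like SQL Server's default float, and Python's Decimal stands in here for DECIMAL/NUMERIC.

```python
from decimal import Decimal

# Binary floating point cannot represent most decimal fractions exactly,
# so a sum of "simple" values can be slightly off.
float_sum = 0.1 + 0.2
decimal_sum = Decimal("0.1") + Decimal("0.2")

# Compare each sum with the value one would expect on paper.
float_is_exact = (float_sum == 0.3)
decimal_is_exact = (decimal_sum == Decimal("0.3"))
```

Here float_is_exact ends up False while decimal_is_exact is True, which is why exact types are the safer choice for values such as money or percentages.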

Related

Snowflake float type column out of range error

I got a "Numeric value out of range" error when trying to insert two values into a FLOAT column.
create or replace table num_test(float_num float); -- create table and column
INSERT INTO num_test (float_num) VALUES (1.0528618730874378E10), (-3.694822225952521E-13);
The error I got says: "Numeric value '10528618730.874378' is out of range."
But when I insert these two values separately, it works fine.
INSERT INTO num_test (float_num) VALUES (1.0528618730874378E10); -- ok
INSERT INTO num_test (float_num) VALUES (-3.694822225952521E-13); -- ok
I couldn't find anything in the Snowflake documentation about out-of-range issues for the values I tried to insert.

If I had to guess what the problem was, it would be that the type guessed from the first value in the VALUES clause is smaller than the type of the second value.
So if we just select those values, with no target table involved:
select column1, system$typeof(column1)
from VALUES
(1.0528618730874378E10),
(-3.694822225952521E-13);
this triggers the same error:
Numeric value '10528618730.874378' is out of range
One at a time we get:
COLUMN1                SYSTEM$TYPEOF(COLUMN1)
-0.0000000000003695    NUMBER(29,28)[SB16]
10,528,618,730.874378  NUMBER(17,6)[SB8]
So sure enough the two random "numbers" are cast to two different types, and these are deemed "too different". Thus my guess was correct.
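One possible reading of why the two types are "too different", offered as an assumption rather than confirmed Snowflake behaviour: a single NUMBER type covering both literals would need 11 integer digits plus 28 fractional digits, 39 digits in total, which is more than the maximum precision of 38 that NUMBER allows. The Python sketch below only reproduces that arithmetic. The helper number_type is hypothetical and does not describe how Snowflake is implemented.

```python
from decimal import Decimal

MAX_PRECISION = 38  # maximum precision of Snowflake's NUMBER type

def number_type(literal):
    """Return (precision, scale) needed to hold the literal exactly."""
    _sign, digits, exponent = Decimal(literal).as_tuple()
    scale = max(-exponent, 0)                        # digits after the point
    integer_digits = max(len(digits) + exponent, 1)  # digits before the point
    return integer_digits + scale, scale

first = number_type("1.0528618730874378E10")
second = number_type("-3.694822225952521E-13")

# A common type must keep the largest integer part and the largest scale.
common_scale = max(first[1], second[1])
common_integer_digits = max(first[0] - first[1], second[0] - second[1])
common_precision = common_integer_digits + common_scale
fits = common_precision <= MAX_PRECISION
```

With these two literals, first and second come out as (17, 6) and (29, 28), matching the types reported above, and common_precision is 39, so fits is False.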
What to do about this? Cast them inline:
select column1, system$typeof(column1)
from VALUES
(1.0528618730874378E10::double),
(-3.694822225952521E-13::double)
;
we get:
COLUMN1              SYSTEM$TYPEOF(COLUMN1)
10,528,618,730.8744  FLOAT[DOUBLE]
-0.0000000000003695  FLOAT[DOUBLE]
So the values are safe if we first tell the database what they are and avoid the automatic type guessing.
Thus, in your context:
INSERT INTO num_test (float_num)VALUES
(1.0528618730874378E10::double),
(-3.694822225952521E-13::double)
;
number of rows inserted
2

How to convert VARCHAR columns to DECIMAL without rounding in SQL Server?

In my SQL class, I'm working with a table whose columns are all VARCHAR. I'm trying to convert each column to a more appropriate data type.
For example, I have a column called Item_Cost that has a value like:
1.25000000000000000000
I tried to run this query:
ALTER TABLE <table>
ALTER COLUMN Item_Cost DECIMAL
This query does run successfully, but it turns it into 1 instead of 1.25.
How do I prevent the rounding?

Check the documentation for the decimal data type. The type is defined by the optional parameters p (precision) and s (scale). The latter determines the number of digits to the right of the decimal point.
Extract from the documentation (the important bit is that the default scale is 0):
s (scale)
The number of decimal digits that are stored to the right of
the decimal point. This number is subtracted from p to determine the
maximum number of digits to the left of the decimal point. Scale must
be a value from 0 through p, and can only be specified if precision is
specified. The default scale is 0 and so 0 <= s <= p. Maximum storage
sizes vary, based on the precision.
Defining a suitable precision and scale fixes your issue.
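The effect of the scale can be illustrated with a short Python sketch. It is only an analogy: the helper to_decimal is hypothetical, it models the scale but not the precision limit, and the half-up rounding mode is an assumption (for the value 1.25 at scale 0, every common rounding mode gives the same result).

```python
from decimal import Decimal, ROUND_HALF_UP

def to_decimal(text, scale):
    """Round text to `scale` fractional digits, like DECIMAL(p, scale)."""
    quantum = Decimal(1).scaleb(-scale)  # scale=3 gives Decimal('0.001')
    return Decimal(text).quantize(quantum, rounding=ROUND_HALF_UP)

item_cost = "1.25000000000000000000"
with_default_scale = to_decimal(item_cost, 0)   # plain DECIMAL: scale 0
with_explicit_scale = to_decimal(item_cost, 3)  # DECIMAL(10, 3)
```

With scale 0 the fractional part is lost and the value becomes 1, while with scale 3 it is kept as 1.250.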
Sample data
create table MyData
(
Item_Cost nvarchar(100)
);
insert into MyData (Item_Cost) values ('1.25000000000000000000');
Solution
ALTER TABLE MyData Alter Column Item_Cost DECIMAL(10, 3);
Result
Item_Cost
---------
1.250

How does SQL Server actually store Russian symbols in char?

I have a column NAME, which is CHAR(50).
It contains the value 'Рулон комбинированный СТЕРИТ 50мм ? 200 м'
whose integer representation is:
'1056,1091,1083,1086,1085,32,1082,1086,1084,1073,1080,1085,1080,1088,1086,1074,1072,1085,1085,1099,1081,32,1057,1058,1045,1056,1048,1058,32,53,48,1084,1084,32,63,32,50,48,48,32,1084'
But CHAR implies 8 bits per character. How does SQL Server store values like '1056,1091,1083,1086,1085', which are Unicode code points?
Also, the ? symbol is actually × (215, the multiplication sign).
If SQL Server can represent 1056, why can't it represent 215?

What the 256 possible values of a char byte mean is determined by the database collation. For Russian this is typically Cyrillic_General_CI_AS (where CI means Case Insensitive and AS means Accent Sensitive).
There's a good chance this matches Windows code page 1251, so л is stored as hex EB or decimal 235. You can verify this with T-SQL:
create database d1 collate Cyrillic_General_CI_AS;
use d1
select ascii('л')
-->
235
In the Cyrillic code page, decimal 215 means Ч, not the multiplication sign. Because SQL Server can't match the multiplication sign to the Cyrillic code page, it replaces it with a question mark:
select ascii('×'), ascii('?')
-->
63 63
So in a char column that uses the Cyrillic code page, the multiplication sign and the question mark are both stored as decimal 63, the code of the question mark.
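The same mapping can be checked outside SQL Server. The Python sketch below assumes the column really uses Windows code page 1251. The helper stored_byte is made up for illustration, and Python's errors="replace" option stands in for SQL Server's substitution of unmappable characters.

```python
def stored_byte(ch, codepage="cp1251"):
    """Return the single byte value that `ch` would occupy in the code page."""
    # errors="replace" substitutes '?' for characters the code page lacks.
    return ch.encode(codepage, errors="replace")[0]

unicode_code_point = ord("Р")       # what the question calls the "integer representation"
byte_for_er = stored_byte("Р")      # the byte actually stored for Cyrillic Р
byte_for_el = stored_byte("л")
byte_for_che = stored_byte("Ч")
byte_for_times = stored_byte("×")   # not part of code page 1251
```

Under that assumption, 1056 is only the Unicode code point of Р; the stored byte is 208. Likewise л is stored as 235, Ч occupies 215, and × falls back to 63.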

I have a column NAME, which is CHAR(50).
It contains the value 'Рулон комбинированный СТЕРИТ 50мм ? 200 м'
which integer representation is:
'1056,1091,1083,1086,1085,32,1082,1086,1084,1073,1080,1085,1080,1088,1086,1074,1072,1085,1085,1099,1081,32,1057,1058,1045,1056,1048,1058,32,53,48,1084,1084,32,63,32,50,48,48,32,1084'
The integer representation quoted above is wrong.
I ran a test in a database with a Cyrillic collation, and the integer representation is different from what you showed us. So either your data type is not char, or your integer representation is wrong. And yes, "but CHAR implies that it contains 8 bit" is correct; here is how you can prove it to yourself:
--create table dbo.t (name char(50));
--insert into dbo.t values ('Рулон комбинированный СТЕРИТ 50мм ? 200 м')
select cast (name as binary(50))
from dbo.t;
select substring(cast (name as binary(50)), n, 1) as bin_substr,
cast(substring(cast (name as binary(50)), n, 1) as int) as int_,
char(substring(cast (name as binary(50)), n, 1)) as cyr_char
from dbo.t cross join nums.dbo.nums;
Here dbo.Nums is an auxiliary table containing integers. I just convert your string from the char column to binary, split it byte by byte, and convert each byte to int and char.

Netezza Double Precision Output Truncates Values

I've noticed that nzsql and nzunload simply truncate the mantissa values of double precision columns. Here is the issue:
select tot_amt from table1;
tot_amt
~~~~~~~
123.124
567.678
whereas when I use other clients, like Aginity for data analytics, the output I get is:
tot_amt
~~~~~~~
123.1240535
567.6780122
I've also found that the 'truncation' happens when Netezza encounters a 0 after 3 mantissa digits.
We are trying to migrate this database to Oracle, and because of this issue the whole project is in trouble and the client doesn't trust our migration scripts. Has anyone encountered this issue? The only workaround, even from an IBM engineer, is to cast with TO_CHAR(col, '999,999.999'). That will kill the unload scripts if I have to do it for billions of rows.

I can reproduce this issue with a table whose column is declared as FLOAT(6), for example:
USERDB.USER(USER)=> create table ZZ (
USERDB.USER(USER)(> YY FLOAT(6)
USERDB.USER(USER)(> );
CREATE TABLE
USERDB.USER(USER)=> insert into ZZ (yy) values (123.123456789);
INSERT 0 1
USERDB.USER(USER)=> insert into ZZ (yy) values (12.123456789);
INSERT 0 1
USERDB.USER(USER)=> select * from ZZ;
YY
---------
123.123
12.1234
(2 rows)
USERDB.USER(USER)=> select CAST ( YY as FLOAT(15) ) from ZZ;
?COLUMN?
----------------
123.1234588623
12.123399734497
(2 rows)
USERDB.USER(USER)=>
I can cast the column values to a wider type, but the problem I see is that the value returned is not the same as the value I inserted. The same is true if I query with Aginity: the values are incorrect.
Check the precision (and scale) of the tot_amt column in table1. My guess is that the data type used to store the values is quite small (FLOAT(6), maybe?) and that nzsql is showing you the correct values as enforced by that data type.
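The loss of digits can be reproduced without Netezza, assuming that FLOAT(6) is stored as a 4-byte single-precision value (an assumption about the storage format that is worth confirming in the Netezza documentation). The Python sketch below round-trips the first inserted value through 4-byte IEEE 754 storage. The helper to_float32 is made up for illustration.

```python
import struct

def to_float32(value):
    """Round-trip a Python float through 4-byte IEEE 754 storage."""
    return struct.unpack("<f", struct.pack("<f", value))[0]

inserted = 123.123456789
stored = to_float32(inserted)

# Only about seven significant digits survive single-precision storage.
lost_precision = (stored != inserted)
```

Under that assumption, stored agrees with the 123.1234588623 shown in the CAST output above, which suggests the extra digits were already lost when the row was written, not when it was displayed.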

'Converting varchar to data type numeric' error after successful conversion to decimal(18,2)

I have a temporary table I'm using for parsing, #rp.
#rp contains an nvarchar(max) column, #rp.col8, which holds positive and negative numbers with two decimal places, e.g. '1234.26'.
I'm able to run the following query and get a set of converted values out:
select * from
(
select CONVERT(decimal(18,2),rp.col8) as PARSEAMT
from #rp
where
--#rp filtering criteria
)q
However, when I try to query for PARSEAMT = 0 in the following manner, I get the standard '8114, Error converting data type varchar to numeric.':
select * from
(
select CONVERT(decimal(18,2),col8) as PARSEAMT
from #rp
where
--#rp filtering criteria
)q
where q.PARSEAMT = 0
Without that where clause, the query runs fine and generates the expected values.
I've also tried other clauses like where q.PARSEAMT = 0.00 and where q.PARSEAMT = convert(decimal(18,2),0).
What am I doing wrong in my comparison?

I was going to suggest you select PARSEAMT into another temp-table/table-variable but I can see you've already done that from your comments.
Out of interest what does the following yield?
select
col8
from
#rp
where
-- ISNUMERIC returns 1 when the input expression evaluates to a valid
-- numeric data type; otherwise it returns 0. Valid numeric data types
-- include the following:
isnumeric(col8) <> 1
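The idea behind that check, listing the values that cannot be converted before converting anything, can be sketched in Python. The sample values and the helper is_convertible are hypothetical, and ISNUMERIC in SQL Server applies its own rules (for example, it accepts some currency symbols), so its results will not match this sketch exactly.

```python
from decimal import Decimal, InvalidOperation

def is_convertible(text):
    """Return True if text parses as a finite decimal number."""
    try:
        # is_finite() rejects 'NaN' and 'Infinity', which Decimal would accept.
        return Decimal(text.strip()).is_finite()
    except InvalidOperation:
        return False

# Hypothetical contents of col8, including a few values that would break CONVERT.
sample_col8 = ["1234.26", "-17.50", "", "12,5", "N/A"]
bad_values = [v for v in sample_col8 if not is_convertible(v)]
```

Any value that ends up in bad_values is a candidate for the row that raises error 8114 once the extra where clause changes how the query is evaluated.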
