I have a very simple (but big) CSV file and I want to import it into my database in Microsoft SQL Server 2014 (Database/Tasks/Import Data), but I receive the following error:
The conversion returned status value 2 and status text "The value could not be converted because of a potential loss of data".
Here is a sample of my CSV file (containing ~9 million rows):
1393013,297884,'20150414 15:46:25'
1393010,301242,'20150414 15:46:58'
Ideally my first and second columns are bigint and the third is datetime. In the wizard I chose 'unsigned 8 byte integer' for the first two and 'timestamp' for the third, and I receive the error. Even when I try to use string as the data type for all three columns, I still receive the same error.
I also tried the bcp command at the command line. It reports no error and inserts nothing! Using the BULK INSERT command gives me this error:
the column is too long! verify your terminators
But they are set correctly!
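For reference, the kind of statement I'm running looks roughly like this (the table name and file path here are just placeholders, not my real ones):
BULK INSERT dbo.MyBigTable
FROM 'C:\data\myfile.csv'
WITH (
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n'    -- for a file with bare LF line endings, '0x0a' is sometimes needed instead
);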
I appreciate any idea you have as a solution to this simple-looking problem.
You are trying to change the input types: unsigned 8 byte integer is a setting on the source.
You don't need to change the source settings at all. 'string [DT_STR]' and the default length of 50 will work.
'timestamp' is a binary type. I believe the type you are after is datetime, but that is set on the destination, not the source. The source is still a string regardless.
You still will not be able to import your date value as a datetime data type.
This would work though (with dashes added) -> 2015-04-14 15:46:25. Import what you have as a string and fix it after import unless you can get your text file changed.
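For example, once the raw values are sitting in a staging table as strings, the dashes can be added and the value converted in T-SQL. This is only a sketch; the table and column names are placeholders, and the REPLACE is there in case the single quotes around the timestamp come through as part of the data:
SELECT CONVERT(datetime2, STUFF(STUFF(REPLACE(RawDate, '''', ''), 5, 0, '-'), 8, 0, '-')) AS EventTime
FROM dbo.CsvStaging;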
I have an SSIS job to import data from a flat file into a SQL Server table. I'm having an issue with the encoding of the source file and the destination table.
The file is a UTF-8 encoded CSV file with some standard accented Latin characters (ã, ó, é, etc.). My destination table is defined with the Latin1_General_CI_AS collation, which means I can manually insert the following text with no problem: "JOÃO ANTÓNIO".
When I declare the Flat File source, it automatically determines the file as having the 65001 code page (UTF-8), and infers the string [DT_STR] data type for each column. However, the SSIS package automatically assumes the destination table as having the 1252 Code Page, giving me the following error:
Validation error. <STEPNAME>: <STEPNAME>: The code page 65002 specified on output column "<MYCOLUMN>" (180) is not valid. Select a different code page for output column "<MYCOLUMN>".
I understand why, since the database collation is defined as having that Code Page. However, if I try to set the Flat File datasource as having the Latin1 1252 encoding, the SSIS executes but it imports characters incorrectly:
JOÃO ANTÓNIO (flat file) -> JOAO ANTÓNIO (database).
I have already tried configuring the flat file source as Unicode, but then, after I configure each column with a Unicode-compliant data type, I can't update the destination step, since SSIS infers data types directly from the database and doesn't allow me to change them.
Is there a way to keep the flat file source as being CP 1252, but also importing the correct characters? What am I missing here?
Thanks to Larnu's comment I've been able to get around this problem.
Since SSIS doesn't allow implicit data conversion, I needed to set up a data conversion step first (a Derived Column transformation). Since the source columns were already set up as DT_STR with the 65002 code page, I had to configure new derived columns from an expression, converting from the source code page to the destination code page, with the following expression:
(DT_STR, 50, 1252)<SourceColumn>
This makes a direct cast to DT_STR, stating that the column will have a maximum size of 50 characters and that the data will be represented with the 1252 code page.
I am querying a SQL Server database that has a column of type datetimeoffset. I am using 'pyodbc' and SQL Server 2017. The datetime is being returned as strings as follows:
"b'\xe3\x07\n\x00\x0e\x00\x12\x00\x03\x00\x05\x00#\xe1\x9d\x18\x00\x00\x00\x00'"
Pandas doesn't recognize it as a timestamp, and I have tried using the Python 3 'struct' module to unpack it like this:
import struct
raw = 'b\xe3\x07\n\x00\x0e\x00\x12\x00\x03\x00\x05\x00#\xe1\x9d\x18\x00\x00\x00\x00'
unpacked, = struct.unpack('<Q', raw)
That errors out because 'raw' is a string. If I enter the string directly as an argument to 'unpack', it errors out because of the wrong number of bytes.
How do I convert the column values to pandas datetime?
Additional Note:
This site indicates that SQL Server uses a particular type that pyodbc doesn't handle natively, as suggested by mostert. That said, they seem to have no problem retrieving a human-readable value.
[SOLVED] So the solution at this site does work. TIL: when adding the converter you need to pass the type as an integer, in this case -155. This site has the integer codes for some other types.
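A different workaround, if you can change the query rather than register a converter, is to cast the column on the SQL Server side to a type pyodbc handles natively; pandas then sees ordinary datetimes. This is only a sketch with placeholder table and column names:
SELECT
    CONVERT(datetime2, MyDtoColumn) AS LocalTime,                        -- local clock time, offset dropped
    CONVERT(datetime2, SWITCHOFFSET(MyDtoColumn, '+00:00')) AS UtcTime   -- same instant, normalized to UTC
FROM dbo.MyTable;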
Every time I try to import an Excel file into SQL Server I get a particular error. When I try to edit the mappings, the default type for all numerical fields is float. None of the fields in my table have decimals in them and they aren't a money data type; they're only 8-digit numbers. Since I don't want my primary key stored as a float when it's an int, how can I fix this? It gives me a truncation error of some sort; I'll post a screen cap if needed. Is this a common problem?
It should be noted that I cannot import Excel 2007 files (I think I've found the remedy to this), but even when I try to import .xls files, every value that contains numerals is automatically imported as a float, and when I try to change it I get an error.
http://imgur.com/4204g
SSIS doesn't implicitly convert data types, so you need to do it explicitly. The Excel connection manager can only handle a few data types and it tries to make a best guess based on the first few rows of the file. This is fully documented in the SSIS documentation.
You have several options:
Change your destination data type to float
Load to a 'staging' table with data type float using the Import Wizard and then INSERT into the real destination table using CAST or CONVERT to convert the data (see the sketch after this list)
Create an SSIS package and use the Data Conversion transformation to convert the data
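A sketch of the staging-table option, assuming 8-digit values that the wizard loaded as float (the table and column names are placeholders):
INSERT INTO dbo.MyTable (MyId, OtherNumber)
SELECT CAST(MyId AS int), CAST(OtherNumber AS int)
FROM dbo.MyTable_Staging;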
You might also want to note the comments in the Import Wizard documentation about data type mappings.
Going off of what Derloopkat said, which still can fail on conversion (no offense Derloopkat) because Excel is terrible at this:
Paste from excel into Notepad and save as normal (.txt file).
From within excel, open said .txt file.
Select next as it is obviously tab delimited.
Select "none" for text qualifier, then next again.
Select the first row, hold Shift, select the last row, and select the Text radio button. Click Finish.
It will open; check it to make sure it's accurate, and then save it as an Excel file.
There is a workaround.
Import the Excel sheet with the numbers as float (the default).
After importing, go to the table's Design view.
Change the data type of the column from float to int or bigint.
Save Changes
Change DataType of the column from Bigint to any Text Type (Varchar, nvarchar, text, ntext etc)
Save Changes.
That's it.
When Excel finds mixed data types in the same column, it guesses what the right format for the column is (the majority of the values determines the type of the column) and dismisses all other values by inserting NULLs. But Excel does this badly (e.g. if a column is considered text and Excel finds a number, it decides the number is a mistake and inserts a NULL instead; or if some cells containing numbers are formatted as text, you may get NULL values in an integer column of the database).
Solution:
Create a new excel sheet with the name of the columns in the first row
Format the columns as text
Paste the rows without formatting (use CSV format or copy/paste via Notepad to get plain text only)
Note that formatting the columns on an existing Excel sheet is not enough.
There seems to be a really easy solution when dealing with data type issues.
Basically, add ;IMEX=1 inside the Extended Properties section of the Excel connection string:
Provider=Microsoft.Jet.OLEDB.4.0;Data Source=\\YOURSERVER\shared\Client Projects\FOLDER\Data\FILE.xls;Extended Properties="EXCEL 8.0;HDR=YES;IMEX=1";
This will resolve data type issues such as columns where values are mixed with text and numbers.
To get to the connection property, right-click on the Excel connection manager below the control flow and hit Properties. It'll be to the right, under Solution Explorer. Hope that helps.
To avoid float type field in a simple way:
Open your Excel sheet.
Insert a blank row after the header row and type (any text) in all of its cells.
Right-click on the header of the columns that cause the float issue and select Format Cells, then choose the Text category and press OK.
And then export the excel sheet to your SQL server.
This simple way worked with me.
A workaround to consider in a pinch:
save a copy of the excel file, modify the column to format type 'text'
copy the column values and paste to a text editor, save the file (call it tmp.txt).
modify the data in the text file to start and end with a character so that the SQL Server import mechanism will recognize it as text. If you have a fancy editor, use its included tools. I use awk in Cygwin on my Windows laptop. For example, I start and end the column value with a single quote, like "$ awk '{print "\x27"$1"\x27"}' ./tmp.txt > ./tmp2.txt"
copy and paste the data from tmp2.txt over top of the necessary column in the excel file, and save the excel file
run the sql server import for your modified excel file... be sure to double check the data type chosen by the importer is not numeric... if it is, repeat the above steps with a different set of characters
The data in the database will have the quotes once the import is done... you can update the data later on to remove the quotes, or use the "replace" function in your read query, such as "replace([dbo].[MyTable].[MyColumn], '''', '')"
I have a CSV file that I'm trying to import into SQL Management Server Studio.
In Excel, the column giving me trouble looks like this:
Tasks > Import Data > Flat File Source > select file
I set the data type for this column to DT_NUMERIC, adjust the DataScale to 2 in order to get 2 decimal places, but when I click over to Preview, I see that it's clearly not recognizing the numbers appropriately:
The column mapping for this column is set to type = decimal; precision 18; scale 2.
Error message: Data Flow Task 1: Data conversion failed. The data conversion for column "Amount" returned status value 2 and status text "The value could not be converted because of a potential loss of data.".
(SQL Server Import and Export Wizard)
Can someone identify where I'm going wrong here? Thanks!
I believe I figured it out... the CSV Amount column was formatted such that the numbers still contained commas separating the thousands. I adjusted XX,XXX.XX to XXXXX.XX and it seems to have worked.
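If editing the CSV isn't practical, another option is to import the Amount column as a string and strip the commas in T-SQL afterwards. A sketch, with a placeholder staging table name:
SELECT CAST(REPLACE(Amount, ',', '') AS decimal(18, 2)) AS Amount
FROM dbo.AmountStaging;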
I'm facing a problem with a package that imports some data from a MySQL table into an Oracle table and an MS SQL Server table. It works well from MySQL to SQL Server; however, I get an error when I import into Oracle.
The table I want to import contains an attribute (unitPrice) of data type DT_R8.
The destination data type for Oracle is DT_NUMERIC, as you can see in the capture.
I added a conversion step to convert the unitPrice data from DT_R8 to DT_NUMERIC.
It doesn't work, I get the following error.
I found the detail of the error :
An ORA-01722 ("invalid number") error occurs when an attempt is made to convert a character string into a number, and the string cannot be converted into a valid number. Valid numbers contain the digits '0' through '9', with possibly one decimal point, a sign (+ or -) at the beginning or end of the string, or an 'E' or 'e' (if it is a floating point number in scientific notation). All other characters are forbidden.
However, I don't know how to fix it.
EDIT : I added a component to redirect rows/errors to an Excel file.
The following screenshot show the result of the process including errors :
Browsing the only ~3,000 rows that got recorded, it seems the process accepts only integer values, not reals: if the price is 10 it's OK, but if it's 10,5 it fails.
Any idea to solve this issue ?
Your NLS environment does not match the expected one. By default, Oracle assumes that "," is the grouping character and "." is the decimal separator. Make sure that your session uses the correct value for the NLS_NUMERIC_CHARACTERS parameter.
See Setting Up a Globalization Support Environment in the Oracle documentation.
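For example, if the incoming values use a comma as the decimal separator (10,5), the session can be told to expect that. This is only a sketch; check what the session currently uses and adjust the two characters (decimal separator first, then group separator) to match the data:
SELECT value FROM nls_session_parameters WHERE parameter = 'NLS_NUMERIC_CHARACTERS';
ALTER SESSION SET NLS_NUMERIC_CHARACTERS = ',.';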