Postgres table import gives "Invalid input syntax for double precision"

I exported a postgres table as CSV:
"id","notify_me","score","active","is_moderator","is_owner","is_creator","show_marks","course_id","user_id"
8,False,36,"A",False,True,True,True,2,8
29,False,0,"A",False,False,False,True,2,36
30,False,25,"A",False,False,False,True,2,37
33,False,2,"A",False,False,False,False,2,40
Then I tried to import it using pgAdmin:
But I ended up getting the following error:
I checked the values of the score column, but it doesn't contain the value "A":
This is the existing data in the coursehistory table (for schema details):
What's going wrong here?
PS:
Earlier there was a grade column with all NULL values:
But it was giving me the following error:
I got the same error even when using \copy:
db=# \copy courseware_coursehistory FROM '/root/db_scripts/data/couse_cpp.csv' WITH (FORMAT csv)
ERROR: value too long for type character varying(2)
CONTEXT: COPY courseware_coursehistory, line 1, column grade: "NULL"
I assumed the import utility would respect the order of the columns in the CSV header, especially since there is a Header switch in the UI. It seems that it doesn't, and the switch just decides whether to start reading from the first row or the second.

This is your content, with an "A" as the fourth value:
8,False,36,"A",False,True,True,True,2,8
And your table course_history has the column "score" in the fourth position, typed as double precision.
The error message makes sense to me: "A" is not a valid double precision value.

The order of columns matters in the kind of import you are doing. If you need a more flexible way to import CSV files, you could use a Python script that actually takes your header into account; then column order is not relevant as long as the names and types match and NOT NULL constraints are satisfied (for existing tables).
Like this:
import pandas as pd
from sqlalchemy import create_engine

# SQLAlchemy connection URL: postgresql://user:password@host:port/database
engine = create_engine('postgresql://user:password@ip_host:5432/database_name')
# header=0: the first CSV row holds the column names
data_df = pd.read_csv('course_cpp_courseid22.csv', sep=',', header=0)
# Columns are matched by name, so their order in the CSV does not matter
data_df.to_sql('courseware_coursehistory', engine, schema='public', if_exists='append', index=False)

I ended up copying this CSV (also shown in the postscript of the original question; it also contains the grade column and has no header row):
using the \copy command at the psql prompt.
Start the psql prompt:
root@50ec9abb3214:~# psql -U user_role db_name
Copy from csv as explained here:
db_name=# \copy db_table FROM '/root/db_scripts/data/course_cpp2.csv' delimiter ',' NULL AS 'NULL' csv
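For completeness: \copy can also be given an explicit column list, which is how a different column order in the CSV is normally handled. The list names the table columns in the order the fields appear in the file (the HEADER option only skips the header line; it does not match columns by name). A sketch using the column names from the CSV header in the question; adjust the table name and path to your setup, and note that columns not listed (such as grade) are simply left NULL or at their default:
db_name=# \copy courseware_coursehistory (id, notify_me, score, active, is_moderator, is_owner, is_creator, show_marks, course_id, user_id) FROM '/root/db_scripts/data/couse_cpp.csv' WITH (FORMAT csv, HEADER true, NULL 'NULL')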

Related

Uploading excel file to sql server [duplicate]

Every time I try to import an Excel file into SQL Server I get a particular error. When I try to edit the mappings, the default value for all numerical fields is float. None of the fields in my table have decimals in them and they aren't a money data type; they're only 8-digit numbers. However, since I don't want my primary key stored as a float when it's an int, how can I fix this? It gives me a truncation error of some sort; I'll post a screen cap if needed. Is this a common problem?
It should be noted that I cannot import Excel 2007 files (I think I've found the remedy for this), but even when I try to import .xls files, every value that contains numerals is automatically imported as a float, and when I try to change it I get an error.
http://imgur.com/4204g
SSIS doesn't implicitly convert data types, so you need to do it explicitly. The Excel connection manager can only handle a few data types and it tries to make a best guess based on the first few rows of the file. This is fully documented in the SSIS documentation.
You have several options:
Change your destination data type to float
Load to a 'staging' table with data type float using the Import Wizard, then INSERT into the real destination table using CAST or CONVERT to convert the data (a sketch follows below)
Create an SSIS package and use the Data Conversion transformation to convert the data
You might also want to note the comments in the Import Wizard documentation about data type mappings.
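A minimal T-SQL sketch of the staging-table option; the table and column names below are placeholders, not from the original post:
-- The wizard loads everything into a staging table with the permissive float type,
-- then an explicit CAST moves the rows into the strongly typed destination.
INSERT INTO dbo.RealTable (Id, AccountNumber)
SELECT CAST(Id AS int), CAST(AccountNumber AS bigint)
FROM dbo.StagingTable;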
Going off of what Derloopkat said, which still can fail on conversion (no offense Derloopkat) because Excel is terrible at this:
Paste from Excel into Notepad and save as a normal .txt file.
From within Excel, open said .txt file.
Select Next, as it is obviously tab delimited.
Select "none" for the text qualifier, then Next again.
Select the first row, hold Shift, select the last row, and select the Text radio button. Click Finish.
It will open; check it to make sure it's accurate, and then save it as an Excel file.
There is a workaround (a scripted equivalent is sketched after these steps):
Import the Excel sheet with numbers as float (the default).
After importing, go to the table's Design view.
Change the data type of the column from float to int or bigint.
Save the changes.
Change the data type of the column from bigint to any text type (varchar, nvarchar, text, ntext, etc.).
Save the changes.
That's it.
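For reference, the same two-step type change can be scripted instead of done through the designer; a rough sketch with placeholder table and column names, assuming every float value fits into a bigint:
-- Step 1: float -> bigint (any fractional part is dropped)
ALTER TABLE dbo.ImportedTable ALTER COLUMN AccountNumber bigint;
-- Step 2: bigint -> a text type
ALTER TABLE dbo.ImportedTable ALTER COLUMN AccountNumber varchar(50);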
When Excel finds mixed data types in the same column, it guesses the right format for the column (the majority of the values determines the column's type) and discards the other values by inserting NULLs. But Excel does this badly (e.g. if a column is considered text and Excel finds a number, it decides the number is a mistake and inserts a NULL instead; or if some cells containing numbers are formatted as "text", you may get NULL values in an integer column of the database).
Solution:
Create a new excel sheet with the name of the columns in the first row
Format the columns as text
Paste the rows without formatting (use CSV format or copy/paste via Notepad to get only text)
Note that formatting the columns on an existing Excel sheet is not enough.
There seems to be a really easy solution when dealing with data type issues.
Basically, at the end of the Excel connection string, add ;IMEX=1 inside the Extended Properties:
Provider=Microsoft.Jet.OLEDB.4.0;Data Source=\\YOURSERVER\shared\Client Projects\FOLDER\Data\FILE.xls;Extended Properties="EXCEL 8.0;HDR=YES;IMEX=1";
This will resolve data type issues such as columns where values are mixed with text and numbers.
To get to the connection properties, right-click the Excel connection manager below the control flow and hit Properties. They will appear to the right, under Solution Explorer. Hope that helps.
To avoid the float type field in a simple way:
Open your Excel sheet.
Insert a blank row after the header row and type any text in all of its cells.
Right-click the header of the columns that cause the float issue, select Format Cells, choose the Text category, and press OK.
Then export the Excel sheet to your SQL Server.
This simple approach worked for me.
A workaround to consider in a pinch:
Save a copy of the Excel file and change the column's format to 'text'.
Copy the column values, paste them into a text editor, and save the file (call it tmp.txt).
Modify the data in the text file so each value starts and ends with a character that the SQL Server import mechanism will recognize as text. If you have a fancy editor, use its included tools; I use awk in Cygwin on my Windows laptop. For example, I start and end each column value with a single quote, like "$ awk '{print "\x27"$1"\x27"}' ./tmp.txt > ./tmp2.txt"
Copy and paste the data from tmp2.txt over the necessary column in the Excel file, and save the Excel file.
Run the SQL Server import for your modified Excel file. Be sure to double-check that the data type chosen by the importer is not numeric; if it is, repeat the above steps with a different set of characters.
The data in the database will have the quotes once the import is done. You can update the data later to remove the quotes, or use the "replace" function in your read query, such as replace([dbo].[MyTable].[MyColumn], '''', '').
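If the quotes do end up in the table, a one-off cleanup with the same placeholder names could look like this:
-- Remove the wrapping single quotes that were added before the import
UPDATE [dbo].[MyTable]
SET [MyColumn] = REPLACE([MyColumn], '''', '');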

Pentaho: Cannot import Boolean Value to table with PostgreSQL Bulk Loader?

I am currently trying to import some data (from a CSV file) into a PostgreSQL database. For this, I am using the CSV file input step to import the CSV file into Kettle. Second, I am using the Modified Java Script Value step to alter some values, and I am also adding a new column named VALID. This column should always be true. I added the column VALID to the fields in the lower half of the step window. My step looks like the following:
To import the data from Kettle into the PostgreSQL database table, I am using the PostgreSQL Bulk Loader (as there are millions of rows to import). This step looks like the following:
As you can see in this image, the table column name is valid and the stream field is VALID (which comes from the JavaScript Value step). Both are boolean, so it should work. But instead, I get the following error message when I run the transformation:
2018/02/12 14:52:50 - PostgreSQL Bulk Loader.0 - Caused by:
org.postgresql.util.PSQLException: ERROR: invalid input syntax for type
boolean: "1.0"
Wobei: COPY adac_test, line 1, column valid: "1.0"
Any suggestions on how to fix this?
Cast "valid" as a string. In Postgres, boolean values are "kind of" strings: not really, but boolean values in Postgres can be inserted as strings:
https://www.postgresql.org/docs/9.6/static/datatype-boolean.html
Based on this documentation, this should work:
var VALID = 't';
and select the type String instead of Boolean below the code window.
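For reference, a quick way to check which string literals PostgreSQL's boolean type accepts, per the documentation linked above ("1.0" is not among them, which is exactly why the bulk load rejects it):
-- Each of these literals casts cleanly to boolean; '1.0' would raise the same
-- "invalid input syntax for type boolean" error seen in the transformation.
SELECT 't'::boolean, 'true'::boolean, 'yes'::boolean, '1'::boolean,
       'f'::boolean, 'false'::boolean, 'no'::boolean, '0'::boolean;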

Import CSV to Microsoft SQL Server 2014 Wizard

I have a very simple (but big) CSV file and I want to import it into my database in Microsoft SQL Server 2014 (Database/Tasks/Import Data). But I receive the following error:
The conversion returned status value 2 and status text "The value could not be converted because of a potential loss of data".
Here is a sample of my CSV file (containing ~9 million rows):
1393013,297884,'20150414 15:46:25'
1393010,301242,'20150414 15:46:58'
Ideally my first and second columns are bigint and the third is datetime. In the wizard, I choose 'unsigned 8 byte integer' for the first two and 'timestamp' for the third, and I receive the error. Even when I try to use string as the data type for all three columns, I still receive the same error.
I also tried using the bcp command on the command line. It reports no errors and inserts nothing! Using the "bulk insert" command gives me this error:
the column is too long! verify your terminators
But they are set correctly!
I appreciate any idea you have as a solution to this simple-looking problem.
You are trying to change the input types: unsigned 8 byte integer is a setting on the source.
You don't need to change the source settings at all. 'string [DT_STR]' and the default length of 50 will work.
'timestamp' is a binary type. I believe the type you are after is datetime, but that is set on the destination, not the source. The source is still a string regardless.
You still will not be able to import your date value as a datetime data type.
This would work, though (with dashes added): 2015-04-14 15:46:25. Import what you have as a string and fix it after the import, unless you can get your text file changed.
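A rough T-SQL sketch of that post-import fix, assuming (hypothetically) that the rows were loaded as strings into a staging table dbo.ImportStaging, the third field landed in a varchar column RawTime still wrapped in the single quotes from the CSV, and a datetime column EventTime was added afterwards:
-- '20150414 15:46:25' (quoted) -> strip quotes, insert dashes, convert to datetime
UPDATE dbo.ImportStaging
SET EventTime = CONVERT(datetime,
    STUFF(STUFF(REPLACE(RawTime, '''', ''), 5, 0, '-'), 8, 0, '-'));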

Sqoop to SQL Server. Blank string error

I'm trying to load data into SQL Server using Sqoop. I'm running:
sqoop export --connect "jdbc:sqlserver://<server name>;username=<user>;
password=<pass>;database=<db>" --table test_out --input-fields-terminated-by ~
--export-dir /user/test.out
but I get an error when a row in test.out has a blank string:
1~a
<nul>~b
<blank>~c
In this example, the third line returns an error:
Failed map tasks=1
Any ideas?
I would suggest taking a look at the failed map task's log, as it usually contains quite detailed information about the failure.
Sqoop always expects the number of columns on every line of the exported data to be equal to the number of columns in the target table. When exporting from a text file, the number of columns is determined by the number of separators present on the line. Based on the provided example, it seems that the target table has 2 columns, whereas the third line has zero separators and is therefore assumed to be one single column. This discrepancy would cause Sqoop to fail.

Loading 532 columns from a CSV file into a DB2 table

Summary: Is there a limit to the number of columns that can be imported/loaded from a CSV file? If yes, what is the workaround? Thanks.
I am very new to DB2, and I am supposed to import a | (pipe) delimited CSV file which contains 532 columns into a DB2 table that also has 532 columns in the exact same positions as the CSV. I also have a smaller file with only 27 columns in both the CSV and the table. For that one, I am using the following command:
IMPORT FROM "C:\myfile.csv" OF DEL MODIFIED BY COLDEL| METHOD P (1, 2,....27) MESSAGES "C:\messages.txt" INSERT INTO PRE_SUBS_GPRS2_1010 (col1,col2,....col27);
This works fine.
But for the second file, where the command is like this:
IMPORT FROM "C:\myfile.csv" OF DEL MODIFIED BY COLDEL| METHOD P (1, 2,....532) MESSAGES "C:\messages.txt" INSERT INTO PRE_SUBS_GPRS_1010 (col1,col2,....col532);
It does not work. It gives me an error that says:
SQL3037N An SQL error "-206" occurred during Import processing.
Explanation:
An SQL error occurred during processing of the Action String (for
example, "REPLACE into ...") parameter.
The command cannot be processed.
User Response:
Look at the SQLCODE (message number) in the message for more
information. Make changes and resubmit the command.
I am using the Control Center to run the query, not the command prompt.
The problem was that one of the column names in the column list of the INSERT statement was more than 30 characters long. It was getting truncated and was not recognized.
Hope this helps others in the future. Please let me know if you need further details.
The specific error code is SQL0206, and the documentation for this error is here:
http://publib.boulder.ibm.com/infocenter/db2luw/v9r7/topic/com.ibm.db2.luw.messages.sql.doc/doc/msql00206n.html
As for the limits, I think the maximum number of columns in an import should be the maximum permitted for a table. Take a look in the Information Center under:
Database fundamentals > SQL > SQL and XML limits
Maximum number of columns in a table: 1012
Try to import just one row. If you have problems, it is probably due to incompatible types, column order, or rows that duplicate those already present in the table.
