Talend: Write data to PostgreSQL database error

I am trying to write data from a .csv file to my PostgreSQL database. The connection is fine, but when I run my job I get the following error:
Exception in component tPostgresqlOutput_1
org.postgresql.util.PSQLException: ERROR: zero-length delimited identifier at or near """"
at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:1592)
at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1327)
at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:192)
at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:451)
at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:336)
at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:328)
at talend_test.exporttoexcel_0_1.exportToExcel.tFileInputDelimited_1Process(exportToExcel.java:568)
at talend_test.exporttoexcel_0_1.exportToExcel.runJobInTOS(exportToExcel.java:1015)
at talend_test.exporttoexcel_0_1.exportToExcel.main(exportToExcel.java:886)
My job is very simple:
tFileInputDelimiter -> PostgreSQL_Output
I think that the error means that the double quotes should be single quotes ("" -> ''), but how can I edit this in Talend?
Or is there another reason?
Can anyone help me with this one?
Thanks!

If you are using the customer.csv file from the repository, then you have to change the properties of the customer file by going through Metadata -> File delimited -> customer in the Repository pane.
You should be able to right-click the customer file and choose Edit file delimited. On the third screen, if the file extension is .csv, then under Escape char settings you have to select CSV options. Typical escape sequences (as used by Excel and other programs) have the escape char as "\"" and the text enclosure also as "\"".
You should also check that the encoding is set to UTF-8 in the file settings. You can then refresh the preview to view a sample of your file in a table format. If this matches your expectations of the data, save the metadata entry and propagate the update to your jobs.
If your file is not in the repository, then click on the component that reads your file and do all of the above CSV configuration steps in the component's Basic settings.
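For context, an Excel-style CSV row that needs those settings might look like this (made-up data), with every field wrapped in double quotes and any quote inside a field escaped by doubling it:
"1001","Acme ""Widgets"" Ltd","Berlin"
With the escape char and text enclosure both set to "\"", Talend should strip the enclosures and un-double the embedded quotes instead of passing stray " characters through to the output component.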

Related

Failing to convert a CSV file to UTF-8-BOM (w/ Notepad++) in order to migrate it to SQL w/ SSIS and keep regional (Polish) letters

I have a CSV in Polish that I want to get into SQL w/ SSIS.
I open it in Notepad++ and it says UTF-8.
If it doesn't actually say UTF-8-BOM in the status bar then Notepad++ is only guessing the encoding. Try selecting Encoding > Encode in UTF-8-BOM, save the file, then close and reopen it to confirm the change. After saving it with a BOM (Byte Order Mark) try importing it via SSIS again using code page 65001 (UTF-8) setting and see if it works.
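If you would rather script the conversion than rely on Notepad++, a few lines of C# can re-save the file with a BOM. This is only a minimal sketch, and the path is a made-up example:
using System.IO;
using System.Text;

class ResaveWithBom
{
    static void Main()
    {
        // Hypothetical path to the CSV that SSIS will import.
        string path = @"C:\data\input.csv";

        // Read the existing UTF-8 text (written without a BOM)...
        string text = File.ReadAllText(path, new UTF8Encoding(false));

        // ...and rewrite it; new UTF8Encoding(true) emits the EF BB BF
        // byte order mark, so the file shows up as UTF-8-BOM.
        File.WriteAllText(path, text, new UTF8Encoding(true));
    }
}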
@AlwaysLearning
I converted the file as the user above suggested and it now shows UTF-8-BOM in the corner. I saved it.
So here's the vicious circle:
a) When choosing the CSV in SSIS as UTF-8, in the preview I can see my Polish letters properly. Until I hit Run. Then I get this error:
Error at Data Flow Task [SQL Server Destination [9]]: The column "ColumnName" cannot be processed because more than one code page (65001 and 1252) are specified for it.
I get it for each column.
b) When I change the code page in the connection manager to 1252, I can immediately see in the preview that my Polish letters are lost. But now running it works like a charm and I get no errors.
(Screenshots of both previews were attached here.)
Here's what I've tried:
Changing the code page to 1250, 65001, etc.
Ticking Unicode
Changing Locale to Polish, Polish (Poland), English
Googling
Searching Stack Overflow
Posting this question to Stack Overflow

SSIS Export OLEDB Source to a Flat File with UTF-8

I am trying to export an OLEDB source (from a stored procedure) to a UTF-8 flat file, but am getting the following error:
[Flat File Destination [2]]
Error: Data conversion failed. The data conversion for column "name" returned status value 4 and status text "Text was truncated or one or more characters had no match in the target code page.".
The name column is defined in the stored procedure as nvarchar(30).
In the advanced editor of the OLEDB source, I have the AlwaysUseDefaultCodePage set to true and the DefaultCodePage set to 65001.
In the advanced editor for the flat file, for the External Columns and Input Columns, the Data Type is Unicode string [DT_WSTR] with a length of 30.
The connection manager for the flat file has the Unicode checkbox un-checked and the Code page is: 65001 (UTF-8).
I am stumped right now and any help would be appreciated.
Thanks,
David
EDIT:
I added a redirect of errors and truncations to a flat file destination but nothing was sent to the file.
Also, when I have a data viewer on the OLE DB source it comes up with all the records.
The data viewer for the Destination also shows all the records. The length of the name in both viewers is 30 characters (from Excel).
I gave up on getting the data flow to work and coded a C# script task instead.
I changed the output of my data flow to produce a Unicode file by checking the Unicode check box in the flat file connection manager.
I then have the new C# script read the Unicode file one line at a time and output each line to another flat file using Encoding.UTF8, adding a newline character at the end of the line variable.
After the new file is created, I delete the input file and rename the new file to be the same path and name as the original input file. This is also done in the C# script.
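For anyone who wants to reproduce that workaround, here is a rough sketch of the logic described above; the paths are made up, and a real script task would read them from package variables rather than hard-coding them:
using System.IO;
using System.Text;

public class ConvertUnicodeFileToUtf8
{
    public static void Main()
    {
        // Hypothetical paths; in a real SSIS script task these would
        // come from Dts.Variables instead of being hard-coded.
        string inputPath = @"C:\export\names_unicode.txt";
        string tempPath = @"C:\export\names_utf8.txt";

        // The data flow wrote a Unicode (UTF-16) file; read it line by
        // line and rewrite each line as UTF-8.
        using (var reader = new StreamReader(inputPath, Encoding.Unicode))
        using (var writer = new StreamWriter(tempPath, false, Encoding.UTF8))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                writer.WriteLine(line); // WriteLine appends the newline
            }
        }

        // Delete the original Unicode file and give the UTF-8 file
        // the original path and name.
        File.Delete(inputPath);
        File.Move(tempPath, inputPath);
    }
}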

Error importing data from CSV with OpenRowset in SQL Server - Mysterious value of "S7"

I have a file dump which needs to be imported into SQL Server on a daily basis, for which I have created a scheduled task so it runs unattended. All of the CSV files are delimited by ',' and are Windows CR/LF files encoded as UTF-8.
To import data from these CSV files, I mainly use OpenRowset. It worked well until I ran into a file containing the value "S7". If a file contains the value "S7", that column is recognized as a numeric datatype during the OpenRowset import, which causes the other alphabetic values in that column to fail to import, leaving only NULL values.
This is what I have tried so far:
Using IMEX=1: openrowset('Microsoft.ACE.OLEDB.15.0','text;IMEX=1;HDR=Yes;
Using text driver: OpenRowset('MSDASQL','Driver=Microsoft Access Text Driver (*.txt, *.csv);
Using Bulk Insert with or without a format file.
The interesting part is that if I use Bulk Insert, it gives me a warning about an unexpected end of file. To solve this, I have tried various row terminators like '0x0a', '\n', '\r\n', or none at all, but they all failed. I finally managed to import some of the records by using a row terminator of ',\n'. However, the original file contains about 1000 records and only 100 get imported, without any errors or warnings.
Any tips or helps would be much appreciated.
Edit 1:
The file ends with a newline character, as far as I can tell from Notepad++. I managed to import the files that gave the unexpected end of file error by removing the last record in those files. However, even with this method I still cannot import all the records; only part of them come through.

Text Qualifier þ (thorn) in SSIS [duplicate]

I'm trying to read a flat file in SSIS which is in this format
col1 þ col2 þ col 3
I'm using the flatfile connection manager but there is no option for the 'þ' character in the column delimiter section of the connection manager.
What would be the workaround for this, other than reading the file and replacing the thorn character with an SSIS-supported delimiter?
Being a dumb 'merican, I think the lower case thorn character is 0xFE while upper case is 0xDE. This will become important soon.
I created an SSIS package with a Flat File Connection Manager. I pointed it at a comma delimited file that looked like
col 1,col 2,col 3
This allowed me to get the metadata set for the file. Once I have all the columns defined and my package is otherwise good, save it. Commit it to your version control system. If you're not using version control, shame on you, but then make a copy of your .dtsx file and put it somewhere handy.
Replace the comma delimited file with a thorn delimited one.
What we're doing
What we're going to do is edit the XML that is our SSIS package by hand to exchange the delimiter of a , with a þ. It's a straightforward operation, but since you are going off the reservation, it's easy to foul up, and then your package won't open properly in the editor.
How to fix it
If you have the package open, close the package but leave Visual Studio open. Right click on the file and select "View Code".
In an SSIS 2012 package, you'll be looking for
DTS:ColumnDelimiter="_x002C_"
In a 2008 package,
<DTS:Property DTS:Name="ColumnDelimiter" xml:space="preserve">_x002C_</DTS:Property>
What we're going to do is substitute _x00FE_ (thorn) for _x002C_ (comma). Save the file and then double-click to open it back up.
Your connection manager should now show the thorn symbol on the Columns tab.
Interestingly enough, after you open the package, if you go back into the code view, the editor will have swapped the actual thorn character into the file in place of the hexadecimal character code. Weird.

Tilde (~) Delimited File Read in SSIS

I'm trying to load a tilde (~) delimited .DAT file into a SQL Server DB using SSIS. When I use a flat file source to read the file, I don't see the option of a ~ delimiter. I'm pasting a row from my file below:
7318~97836: LRX PAIN MONTHLY DX~001~ALL OTHER NSAIDs~1043676~001~1043676~001~OSR~401~01~ORALS,SOL,TAB/CAP RE~156720~50MG~ANSAID~100 0170-07
Here, I need to split the row into columns at each ~, i.e.
Column 1 should have '7318', Column 2 should have '97836: LRX PAIN MONTHLY DX'.
Can someone help me with this? Can this be done using a Flat File Source or do I need to use a Script Task?
Sure you can; you just need to configure the "Column delimiter" property in the "Flat File Connection Manager Editor". There are some predetermined choices there, but you can click in the box and type any separator you want.
After that you can click "refresh" and then "OK".
