Error while loading CSV file through SnowSQL COPY INTO command - snowflake-cloud-data-platform

I was trying to load CSV files from an AWS S3 bucket with the COPY INTO command. One of the CSV files throws an error like:
End of record reached while expected to parse column
'"RAW_PRODUCTS"["PACK_COUNT_UNITS":25]
With VALIDATION_MODE = RETURN_ALL_ERRORS it also gives me the 2 rows that have errors, but I am not sure what the errors actually are.
My concern is: can we get the specific error so that we can fix it in the file?

You might try using the VALIDATE table function. https://docs.snowflake.com/en/sql-reference/functions/validate.html
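For example, something like this (a minimal sketch, assuming the data was loaded into a table called RAW_PRODUCTS and you want the errors from the most recent COPY INTO run in the session):
select * from table(validate(RAW_PRODUCTS, job_id => '_last'));
You can also pass the query ID of a specific COPY INTO statement instead of '_last'.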

Thanks Eda, I already reviewed the above link, but that did not work with a plain COPY INTO table from the S3 bucket, so I created a stage, placed that CSV file on the stage, and then ran the VALIDATE command, which gave me the same error row.
There is another way to identify errors while executing the COPY INTO command: you can add VALIDATION_MODE = RETURN_ALL_ERRORS and you will get the same result.
By the way, I resolved the error: it was due to '/,,'. I removed the '/' and the file loaded successfully. '/' or '/,' works, as it did in other rows, but '/,,' did not.
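For anyone hitting the same thing, the validation run looks roughly like this (a minimal sketch; the stage name my_s3_stage and the file name products.csv are placeholders, not taken from the original post):
copy into RAW_PRODUCTS
from @my_s3_stage/products.csv
file_format = (type = csv field_optionally_enclosed_by = '"')
validation_mode = RETURN_ALL_ERRORS;
Instead of loading anything, the statement returns one row per error, including the file, line and character where parsing failed, which is usually enough to locate the bad record.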

Related

Is there a way to find out details of a data type error in Snowflake?

I am pretty new to the Snowflake Cloud offering and was just trying to load a simple .csv file from an AWS S3 staging area to a table in Snowflake using the COPY command.
Here is what I used as the command:
copy into "database name"."schema"."table name"
from @S3_ACCESS
file_format = (format_name = format name);
When I run the above code, I get the following error: Numeric value '63' is not recognized
Please see the attached image. I'm not sure what this error is, and I'm not able to find any lead in the Snowflake UI itself to find out what could be wrong with the value.
Thanks in Advance!
The error says it was expecting a numeric value, but it got "63", and this value cannot be converted to a numeric value.
From the image you shared, I can see that there are some weird characters around the 6 and 3. There could be an issue with the file encoding, or the data is corrupted.
Please check the encoding option for the file format:
https://docs.snowflake.com/en/sql-reference/sql/create-file-format.html#format-type-options-formattypeoptions
By the way, I recommend you always use UTF-8.
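If you need to set it explicitly, a file format along these lines should do it (a minimal sketch; my_csv_format and the other options are placeholders for your own format):
create or replace file format my_csv_format
type = csv
field_delimiter = ','
skip_header = 1
encoding = 'UTF8';
You would then reference it in the COPY INTO with file_format = (format_name = 'my_csv_format').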

Error importing data from CSV with OpenRowset in SQL Server - Mysterious value of "S7"

I have a file dump which needs to be imported into SQL Server on a daily basis, for which I have created a scheduled task so it runs unattended. All CSV files are delimited by ',' and they are Windows CR/LF files encoded with UTF-8.
To import data from these CSV files, I mainly use OpenRowset. It worked well until I ran into a file containing the value "S7". If the file contains the value "S7", then that column is recognized as a numeric data type during the OpenRowset import, which leads to a failure to import the other alphabetic values, leaving only NULL values.
This is what I have tried so far:
Using IMEX=1: openrowset('Microsoft.ACE.OLEDB.15.0','text;IMEX=1;HDR=Yes;
Using text driver: OpenRowset('MSDASQL','Driver=Microsoft Access Text Driver (*.txt, *.csv);
Using Bulk Insert with or without a format file.
The interesting part is that if I use Bulk Insert, it gives me a warning of an unexpected end of file. To solve this, I tried various row terminators like '0x0a', '\n', '\r\n', or none at all, but they all failed. Finally I managed to import some of the records using a row terminator of ',\n'. However, the original file contains around 1000 records and only 100 are imported, without any errors or warnings.
Any tips or help would be much appreciated.
Edit 1:
The file ends with a newline character, which I can tell from Notepad++. I managed to import the files that gave an unexpected-end-of-file error by removing the last record from those files. However, even with this method I still cannot import all records; only some of them can be imported.
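For context, the "S7" symptom most likely comes from the text driver sampling only the first rows of the column and guessing a numeric type; one way to override the guessing, if it fits this setup, is a schema.ini placed next to the file that declares every column as Text. A rough sketch (the folder C:\dumps, the file name daily_dump.csv and the column names are made up):
-- contents of C:\dumps\schema.ini, so the driver stops guessing types:
--   [daily_dump.csv]
--   Format=CSVDelimited
--   ColNameHeader=True
--   CharacterSet=65001
--   Col1=Code Text
--   Col2=Description Text
select *
from openrowset(
  'MSDASQL',
  'Driver={Microsoft Access Text Driver (*.txt, *.csv)};Dbq=C:\dumps;',
  'select * from daily_dump.csv');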

SSIS read flat file skip first row

First of all, I did spend quite some time on research, and I know there are many related questions, though I can't find the right answer to this question.
I'm creating an SSIS package, which does the following:
1. Download and store a CSV file locally, using an HTTP connection.
2. Read in the CSV file and store it in SQL Server.
Due to the structure of my flat file, the flat file connection keeps giving me errors, both in SSIS and in the SQL Import Wizard.
The structure of the file is:
"name of file"
"columnA","columnB"
"valueA1","valueB1"
"valueA2","valueB2"
Hence the row delimiter is end of line {CR}{LF} and the column delimiter is a comma {,}, with text qualifier ".
I want to import only the values, not the name of the file or the column names.
I played around with the settings and got the right preview with the following settings (see image below):
- Header rows to skip: 0
- Column names in the first data row: no
- 2 self-configured columns (string with columnWidth = 255)
- Data rows to skip: 2
When I run the SSIS Package or SQL Import Wizard I get the following error:
[SSIS.Pipeline] Error: SSIS Error Code DTS_E_PRIMEOUTPUTFAILED. The
PrimeOutput method on Flat File Source returned error code 0xC0202091.
The component returned a failure code when the pipeline engine called
PrimeOutput(). The meaning of the failure code is defined by the
component, but the error is fatal and the pipeline stopped executing.
There may be error messages posted before this with more information
about the failure.
I can't figure out what goes wrong and what I can do to make this import work.
If you want to skip the file name and the column names, you need to set Header rows to skip to 2. You should also check whether the file actually uses line feeds (LF) instead of CR+LF. Checking the line breaks in a text editor isn't enough to detect the difference, as most editors correctly display files with either CR+LF or LF.
You can check the results of your settings by clicking the "Preview" button in your flat file source. If the settings are correct, you'll see a grid with your data properly aligned. If not, you'll get an error, or the data will be wrong in some way, e.g. a very large number of columns, column names in the first data row, etc.

SSIS error handling: redirect rows that have a zip code field longer than 5 characters from a flat file

I have been given a task to load a simple flat file into another using an SSIS package. The source flat file contains a zip code field; my task is to extract the rows with a correct zip code, which is a 5-digit zip code, load them into another flat file, and redirect the invalid rows to a new file.
Since I am new to SSIS, any help or ideas are much appreciated.
You can add a derived column which determines the length of the field. Then you can add a conditional split based on that column: <= 5 goes down the good path, > 5 goes down the reject path.

Talend: Write data to PostgreSQL database error

I am trying to write data from a .csv file to my PostgreSQL database. The connection is fine, but when I run my job I get the following error:
Exception in component tPostgresqlOutput_1
org.postgresql.util.PSQLException: ERROR: zero-length delimited identifier at or near """"
at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:1592)
at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1327)
at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:192)
at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:451)
at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:336)
at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:328)
at talend_test.exporttoexcel_0_1.exportToExcel.tFileInputDelimited_1Process(exportToExcel.java:568)
at talend_test.exporttoexcel_0_1.exportToExcel.runJobInTOS(exportToExcel.java:1015)
at talend_test.exporttoexcel_0_1.exportToExcel.main(exportToExcel.java:886)
My job is very simple:
tFileInputDelimited -> PostgreSQL_Output
I think that the error means that the double quotes should be single quotes ("" -> ''), but how can I edit this in Talend?
Or is it another reason?
Can anyone help me on this one?
Thanks!
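For what it's worth, that PostgreSQL error usually has nothing to do with single vs double quotes: it is raised whenever a quoted identifier is empty, i.e. an empty column name ended up in the generated statement. A minimal illustration (the table and column names here are made up):
insert into "customer" ("id", "") values (1, 'x');      -- fails: zero-length delimited identifier at or near """"
insert into "customer" ("id", "name") values (1, 'x');  -- fine once every column has a real name
In a Talend job this typically points back at the schema or the CSV header producing an empty column name, which is what the escape and enclosure settings in the answer below are about.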
If you are using the customer.csv file from the repository, then you have to change the properties of the customer file by clicking through Metadata -> File delimited -> customer in the Repository pane.
You should be able to right-click the customer file and then choose Edit file delimited. In the third screen, if the file extension is .csv, then under Escape char settings you have to select CSV options. Typical escape sequences (as used by Excel and other programs) have "\"" as the escape char and "\"" as the text enclosure as well.
You should also check that the encoding is set to UTF-8 in the file settings. You can then refresh the preview to view a sample of your file in table format. If this matches your expectations of the data, you should be able to save the metadata entry and propagate the update to your jobs.
If your file is not in the repository, then click on the component that reads your file and do all of the above CSV configuration steps in the basic settings of the component.
