Snowflake COPY error due to a field enclosed with doubled double quotes

The COPY fails with the error
Found character 'B' instead of field delimiter ',' File
while trying to load a record such as
123,""BigB" - some data", 345 ....
I tried tackling this with the ESCAPE_UNENCLOSED_FIELD = '"' file format parameter and was able to load without any errors, but with a few defects:
1) The field in question got loaded as BigB" - some data (the double quote in the middle of the field was not trimmed).
2) The format of some other fields got messed up because of ESCAPE_UNENCLOSED_FIELD = '"'.
Any idea or recommendation to overcome this scenario in Snowflake?

Try adding FIELD_OPTIONALLY_ENCLOSED_BY to your File format.
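A minimal sketch of that suggestion, assuming a hypothetical stage my_stage, file my_file.csv and target table my_table:

CREATE OR REPLACE FILE FORMAT my_csv_format
  TYPE = 'CSV'
  FIELD_DELIMITER = ','
  FIELD_OPTIONALLY_ENCLOSED_BY = '"';

COPY INTO my_table
  FROM @my_stage/my_file.csv
  FILE_FORMAT = (FORMAT_NAME = 'my_csv_format');

With this option set, a field wrapped in double quotes can contain the field delimiter, and a doubled quote ("") inside an enclosed field is read back as a single literal double quote.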

Related

Load csv.gz file containing both double quotes and comma with in a field string to Snowflake

I have a set of .csv.gz files that are to be loaded into my Snowflake table from an S3 bucket. The problem is that a particular field in the file contains both double quotes and commas, and hence I am unable to use FIELD_OPTIONALLY_ENCLOSED_BY = '"'.
Any help is appreciated!

Varchar fields containing a single double quote get exported with two double quotes from snowflake web UI

I have a table with a single Varchar column and a row with the value A”B”C.
While downloading it from the Snowflake UI as a CSV file, the result shows A””B””C.
In this case, if you are using the data unloading feature, you need to use a file format in the COPY INTO statement.
In the file format you need to use the option FIELD_OPTIONALLY_ENCLOSED_BY.
Ref - https://docs.snowflake.com/en/user-guide/data-unload-considerations.html#empty-strings-and-null-values
As that page notes: when a field contains this character, escape it using the same character. For example, if the value is the double quote character and a field contains the string "A", escape the double quotes as follows: ""A"".
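A minimal unload sketch of that advice, assuming a hypothetical stage my_stage and that the table is called my_table:

COPY INTO @my_stage/unload/
  FROM my_table
  FILE_FORMAT = (TYPE = 'CSV' FIELD_OPTIONALLY_ENCLOSED_BY = '"');

Each field is then wrapped in double quotes on unload, and an embedded double quote is escaped by doubling it; a CSV parser (or a load back into Snowflake with the same option) reads the doubled quote as a single literal quote.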

Copying S3 file with row of data containing Arabic text throws off the end of record and Copy fails

I unloaded a table from Redshift to S3. The table is 212 columns wide. Some fields in some rows contain Arabic text.
Here's the Redshift UNLOAD command I used:
unload ('select * from dataw.testing')
to 's3://uarchive-live/rpt_all/rpt_all.txt'
iam_role 'arn:aws:iam::12345678988:role/service-role'
GZIP
DELIMITER '\t'
null as ''
;
When I attempt to COPY this file into Snowflake an error occurs.
End of record reached while expected to parse column '"RPT_ALL"["AUTO_TRAF_RETR_CNT":211]' File 'rpt_all_250/rpt_all.txt0000_part_113.gz', line 9684, character 1187 Row 9684, column "RPT_ALL"["AUTO_TRAF_RETR_CNT":211]
The field name referenced in the error is not the last field in the record; there are two more after that one.
I removed the Arabic text from the fields and left them blank, then attempted the COPY again, and this time it copied with no errors.
Here's the Snowflake File Format I'm using:
CREATE FILE FORMAT IF NOT EXISTS "DEV"."PUBLIC"."ff_noheader" TYPE = 'CSV' RECORD_DELIMITER = '\n' FIELD_DELIMITER = '\t' SKIP_HEADER = 0 COMPRESSION = 'GZIP' TIMESTAMP_FORMAT = 'AUTO' TRIM_SPACE = TRUE REPLACE_INVALID_CHARACTERS = TRUE;
Here's the Snowflake Copy command I'm using:
COPY INTO "DEV"."PUBLIC"."RPT_ALL" FROM #"stg_All"/snowflk_test.csv FILE_FORMAT="DEV"."PUBLIC"."ff_noheader";
What do I need to configure in Snowflake to accept this Arabic text so that the end of record is not corrupted?
Thanks
I'm not a Snowflake expert, but I have used it and I have debugged a lot of issues like this.
My initial thought as to why you are getting an unexpected EOR, which is \n, is that your data contains \n. If your data has \n, then it will look like an EOR when the data is read. I don't believe there is a way to change the EOR in the Redshift UNLOAD command, so you need the ESCAPE option in the Redshift UNLOAD command to add a backslash before characters like \n. You will also need to tell Snowflake what the escape character is - ESCAPE = '\' (I think you need a double backslash in this statement). [There's a chance you may need to quote your fields as well, but you will know that when you hit any issues hidden by this one.]
The other way would be to use a different unload format that doesn't suffer from overloaded character meaning.
There's a chance that the issue is in character encodings related to your Arabic text but I expect not since both Redshift and Snowflake are UTF-8 based systems. Possible but not likely.
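A minimal sketch of both halves of that suggestion, reusing the names and placeholders from the question; note that the Snowflake parameter shown here is ESCAPE_UNENCLOSED_FIELD, since the unloaded fields are not quoted, while ESCAPE is its counterpart for enclosed fields:

-- Redshift: ESCAPE adds a backslash before embedded newlines, carriage returns,
-- the delimiter, backslashes and quote characters in the unloaded data
unload ('select * from dataw.testing')
to 's3://uarchive-live/rpt_all/rpt_all.txt'
iam_role 'arn:aws:iam::12345678988:role/service-role'
GZIP
DELIMITER '\t'
ESCAPE
null as ''
;

-- Snowflake: declare the same escape character in the file format
CREATE OR REPLACE FILE FORMAT "DEV"."PUBLIC"."ff_noheader"
  TYPE = 'CSV'
  RECORD_DELIMITER = '\n'
  FIELD_DELIMITER = '\t'
  ESCAPE_UNENCLOSED_FIELD = '\\'
  COMPRESSION = 'GZIP'
  TRIM_SPACE = TRUE
  REPLACE_INVALID_CHARACTERS = TRUE;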

Snowflake - COPY delimiter error: character "\" in the data itself, like "BOB\Y", so the data copy errors

I'm unable to use COPY to load data into a Snowflake table.
error-- Field delimiter ',' found while expecting record delimiter
'\n' File 'txn_type_text.csv.gz', line 690, character 35 Row 689,
column "TRIPS_TRANS_TEXT"["TRIPS_TRANS_DESC":3]
any help!
It is a late reply, but it may help others who see it.
If the field delimiter (,) we define for the CSV also appears inside a record, this error will appear. Normally, CSV files wrap such fields in " (you can see this by opening the CSV file in a text editor).
Use FIELD_OPTIONALLY_ENCLOSED_BY = '"' in the COPY statement. This will take care of the extra delimiter appearing in the record itself.
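A minimal sketch of that COPY with the file format given inline, assuming a hypothetical stage my_stage and that the target table matches the one in the error message; ESCAPE_UNENCLOSED_FIELD = NONE is an extra assumption added here so that a literal backslash such as the one in "BOB\Y" is loaded as-is rather than being treated as the default escape character for unenclosed fields:

COPY INTO trips_trans_text
  FROM @my_stage/txn_type_text.csv.gz
  FILE_FORMAT = (TYPE = 'CSV'
                 FIELD_DELIMITER = ','
                 FIELD_OPTIONALLY_ENCLOSED_BY = '"'
                 ESCAPE_UNENCLOSED_FIELD = NONE);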

Load table issue - BCP from flat file - Sybase IQ

I am getting the below error while trying to bcp from a flat delimited file into a Sybase IQ table.
Could not execute statement.
Non-space text found after ending quote character for an enclosed field.
I couldn't see any non-space text in the file, but this error is stopping me from doing the bulk copy. | is the column delimiter, " is the text qualifier, and \n is the row delimiter.
Below is the sample template I am using:
LOAD TABLE TABLE_NAME(a NULL('(null)'),b NULL('(null)'),c NULL('(null)'))
USING CLIENT FILE '/home/...../a.txt' //unix
QUOTES ON
FORMAT bcp
STRIP RTRIM
DELIMITED BY '|'
ROW DELIMITED BY '\n'
When I perform the same query with QUOTES OFF, the load is successful, but the same query fails with QUOTES ON. I would like to get the quotes stripped off as well.
Sample Data
12345|"abcde"|(null)
12346|"abcdf"|"zxf"
12347|(null)|(null)
12348|"abcdg"|"zyf"
Any leads would be helpful!
If IQ bcp is the same as ASE, then I think those '(null)' fields are being interpreted as strings, not fields that are NULL.
You'd need to stream-edit out those (null) values.
You're on Unix, so use sed or perl.
E.g. pipe the file through perl -pe 's/\(null\)//g' on its way to the loading command or into a new file, so each literal (null) marker becomes an empty field.
QUOTES OFF might seem to work, but I wonder whether, when you look at your loaded data, you'll see double quotes inside the 2nd field and '(null)' where you expect a field to be NULL.
