SNOWFLAKE COPY Command not copying data inside

I'm getting started with Snowflake and there is something I don't understand. I tried to issue a COPY command as below, but it shows no rows processed.
copy into customer
from @bulk_copy_example_stage
FILES = ('dataDec-9-2020.csv')
file_format = (type = csv field_delimiter = '|' skip_header = 1)
FORCE=TRUE;
I tried with another file from the same S3 folder
copy into customer
from @bulk_copy_example_stage
FILES = ('generated_customer_data.csv')
file_format = (type = csv field_delimiter = '|' skip_header = 1)
FORCE=TRUE;
And this worked.
At this stage I'm pretty sure that something was wrong with my first file, but my question is: how do we get it to print out what the error was? All it shows in the console is the "no rows processed" result above, which is not really helpful.

You could try looking at the copy_history to find out what's wrong with the file.
Reference: copy_history
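For example, a query along these lines (assuming the target table is customer and the load ran within the last day) returns the status and first error recorded for each file:
select file_name, status, first_error_message, first_error_line_number
from table(information_schema.copy_history(
    table_name => 'CUSTOMER',
    start_time => dateadd(hours, -24, current_timestamp())
));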

Related

snowflake: continue on error but also list all the errors

I am inserting data into Snowflake using the below statement:
copy into "sampletable"
from 's3://test/test/' credentials=(aws_key_id='xxxx' aws_secret_key='yyyyy')
file_format = (type = csv field_delimiter = '|' skip_header = 1)
on_error = 'continue';
But after the ingestion is done, I also want to know which rows were not inserted and what the reason was, since I am using the option on_error = 'continue'.
Any idea how I can do this?
You can use the VALIDATE function: https://docs.snowflake.com/en/sql-reference/functions/validate.html
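For this load it could look something like the following (using '_last' on the assumption that the COPY above was the most recent load job into "sampletable"; you can also pass the query ID of the COPY statement instead):
select * from table(validate("sampletable", job_id => '_last'));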

Found character ':' instead of field delimiter ','

Again I am facing an issue with loading a file into Snowflake.
My file format is:
TYPE = CSV
FIELD_DELIMITER = ','
FIELD_OPTIONALLY_ENCLOSED_BY = '\042'
NULL_IF = ''
ERROR_ON_COLUMN_COUNT_MISMATCH = FALSE
[ COMMENT = '<string_literal>' ]
Now, by running:
copy into trips from @citibike_trips
file_format=CSV;
I am receiving the following error:
Found character ':' instead of field delimiter ','
File 'citibike-trips-json/2013-06-01/data_01a304b5-0601-4bbe-0045-e8030021523e_005_7_2.json.gz', line 1, character 41
Row 1, column "TRIPS"["STARTTIME":2]
If you would like to continue loading when an error is encountered, use other values such as 'SKIP_FILE' or 'CONTINUE' for the ON_ERROR option. For more information on loading options, please run 'info loading_data' in a SQL client.
I am a little confused about the file I am trying to load. Actually, I got the file from a tutorial on YouTube, and in the video it works properly. However, the stage contains not only CSV datasets but also JSON and Parquet files. I think this could be the problem, but I am not sure how to solve it, since the command above specifies file_format = CSV.
Remove FIELD_OPTIONALLY_ENCLOSED_BY = '\042', recreate the file format, and run the copy statement again.
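A sketch of what that recreated format might look like, keeping the remaining options from the definition above (the format name csv matches the one referenced in the COPY statement):
create or replace file format csv
  type = csv
  field_delimiter = ','
  null_if = ('')
  error_on_column_count_mismatch = false;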
You're trying to import a JSON file using a CSV file format. In most cases all you need to do is specify JSON as the file type in the COPY INTO statement.
[ FILE_FORMAT = ( { FORMAT_NAME = '[<namespace>.]<file_format_name>' |
                    TYPE = { CSV | JSON | AVRO | ORC | PARQUET | XML } [ formatTypeOptions ] } ) ]
You're using CSV, but it should be JSON:
FILE_FORMAT = (TYPE = JSON)
If you're more comfortable using a named file format, use the builder (or plain SQL) to create a named file format that's of type JSON and reference it in the COPY statement.
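In SQL, that might look something like this (my_json_format is just an illustrative name):
-- create a named file format of type JSON
create or replace file format my_json_format
  type = json;
-- then reference it in the COPY statement:
-- file_format = (format_name = 'my_json_format')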
I found a thread in the Snowflake Community forum that explains what I think you might have been facing. There are now three different kinds of files in the stage: CSV, Parquet, and JSON. The copy process given in the tutorial expects there to be only CSV. You can use this syntax to exclude non-CSV files from the copy:
copy into trips from @citibike_trips
on_error = skip_file
pattern = '.*\.csv\.gz$'
file_format = csv;
Using the PATTERN option with a regular expression, you can filter the load down to just the CSV files.
https://community.snowflake.com/s/feed/0D53r0000AVKgxuCQD
And if you also run into an error related to timestamps, you will want to set this file format before you do the copy:
create or replace file format citibike.public.csv
  type = 'csv'
  field_optionally_enclosed_by = '\042';
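With that named format in place, the pattern-filtered copy above can reference it directly (a sketch combining the two suggestions):
copy into trips from @citibike_trips
  on_error = skip_file
  pattern = '.*\.csv\.gz$'
  file_format = (format_name = 'citibike.public.csv');
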
S3 to Snowflake (loading CSV data in S3 to a Snowflake table throws the following error)

Copying the same file into a table using the COPY command & Snowpipe

I couldn't load the same file into a table in Snowflake using the COPY command / Snowpipe.
I always get the following result:
Copy executed with 0 files processed.
I have re-created the table and truncated the table, but copy_history doesn't show any data:
select * from table(information_schema.copy_history(table_name=>'mytable', start_time=> dateadd(hours, -10, current_timestamp())));
I have used FORCE = TRUE in the COPY command, but it still didn't load the same file into the table. I have explicitly mentioned the file path in the COPY command:
copy into mytable
from @STAGE_DEV/myfile/05/28/16/myfile_1.csv
file_format = (
    format_name = STANDARD_CSV_FORMAT skip_header = 1 field_optionally_enclosed_by = '"' null_if = 'NULL'
)
on_error = continue
force = true;
Has anyone faced a similar issue, and what would the process be to load the same file again using the COPY command or Snowpipe? I don't have the option to change the file name or put the files in a different S3 bucket.
LIST @stage shows the following files:
I have reloaded the files to the S3 bucket and it's working now. Thank you guys for all the responses.

Snowflake-Internal Stage data load error: How to load "\" character

In a file, a few of the rows have \ in a column value. For example, I have rows in the below format:
101,Path1,Z:\VMC\PSPS,abc
102,Path5,C:\wintm\PSPS,abc
I was wondering how to load the \ character.
COPY INTO TEST_TABLE from @database.schema.stage_name FILE_FORMAT = ( TYPE = CSV FIELD_OPTIONALLY_ENCLOSED_BY = '\"' SKIP_HEADER = 1 );
Is there anything I can add to the file_format line?
Are you still getting this error? I just tried to recreate it by creating a CSV based off your sample data and a test table. I loaded the CSV into an internal stage and then ran your COPY command. It worked for me. Please see the screenshot below.
Could you provide more details on the error you are facing? Perhaps there was something off with your table definition.
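For reference, a minimal version of that reproduction might look like the following (the table definition and file name are assumptions; SKIP_HEADER is omitted because the sample rows have no header line):
-- a hypothetical table shaped like the sample rows
create or replace table test_table (id int, name string, path string, val string);
-- from SnowSQL or another client, upload a local copy of the sample data:
-- put file:///tmp/sample.csv @database.schema.stage_name;
copy into test_table
from @database.schema.stage_name
file_format = (type = csv field_optionally_enclosed_by = '"');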

What regex parser is used for the files_pattern for the 'COPY INTO' sql query?

(Submitted on behalf of a Snowflake User)
I have a test s3 folder called s3://bucket/path/test=integration_test_sanity/file.parquet
I want to be able to load this into Snowflake using the COPY INTO command, but I want to load all the test folders, which have a structure like test=*/file.parquet.
I've tried:
COPY INTO raw.test_sanity_test_parquet
FROM 's3://bucket/path/'
CREDENTIALS=(AWS_KEY_ID='XXX' AWS_SECRET_KEY='XXX')
PATTERN='test=(.*)/.*'
FILE_FORMAT = (TYPE = parquet)
and also
COPY INTO raw.test_sanity_test_parquet
FROM 's3://bucket/path/'
CREDENTIALS=(AWS_KEY_ID='XXX' AWS_SECRET_KEY='XXX')
PATTERN='test=.*/.*'
FILE_FORMAT = (TYPE = parquet)
Neither of these works. I was wondering what regex parser is used by Snowflake and which regex I should use to get this to work.
This works, but I can't filter on just the test folders, which can cause issues:
COPY INTO raw.test_sanity_test_parquet
FROM 's3://bucket/path/'
CREDENTIALS=(AWS_KEY_ID='XXX' AWS_SECRET_KEY='XXX')
PATTERN='.*/.*'
FILE_FORMAT = (TYPE = parquet)
Any recommendations? Thanks!
Try this:
COPY INTO raw.test_sanity_test_parquet
FROM 's3://bucket/path/'
CREDENTIALS=(AWS_KEY_ID='XXX' AWS_SECRET_KEY='XXX')
PATTERN='.*/test.*[.]parquet'
FILE_FORMAT = (TYPE = parquet)
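If the bucket path is wrapped in a named external stage (my_test_stage below is hypothetical), you can also check which files a pattern matches with LIST before running the COPY:
-- hypothetical external stage over the same bucket and path
create or replace stage my_test_stage
  url = 's3://bucket/path/'
  credentials = (aws_key_id='XXX' aws_secret_key='XXX');
-- preview which staged files the regular expression matches
list @my_test_stage pattern = '.*/test.*[.]parquet';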
