Unload Snowflake data to S3 without the extension/file format

How can I unload Snowflake data to S3 without any file extension?
To unload the data with a specific extension, we use a file format in Snowflake.
For example:
copy into 's3://mybucket/unload/'
from mytable
storage_integration = myint
file_format = (format_name = my_csv_format);
But what I want is to store data without any extension.

SINGLE is what I was looking for. It is one of the copy options you can use with the COPY command, and it creates the file without an extension.
Code:
copy into 's3://mybucket/unload/'
from mytable
storage_integration = myint
file_format = (format_name = my_csv_format)
SINGLE = TRUE;
See the note at the link below for a better understanding:
https://docs.snowflake.com/en/sql-reference/sql/create-file-format.html#:~:text=comma%20(%2C)-,FILE_EXTENSION,-%3D%20%27string%27%20%7C%20NONE
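As a side note (a sketch, not from the original answer): with SINGLE = TRUE you can also put the desired file name directly in the location path, and Snowflake writes exactly that name, adding no extension unless you include one yourself. "report_output" below is a hypothetical name:
-- Sketch: explicit output file name combined with SINGLE = TRUE
copy into 's3://mybucket/unload/report_output'
from mytable
storage_integration = myint
file_format = (format_name = my_csv_format)
single = true;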

You can add the parameter FILE_EXTENSION = NONE to your file format. With this parameter, Snowflake does not add a file extension based on your file format type (in this case .csv) but uses the extension you pass (NONE or any other value).
copy into 's3://mybucket/unload/'
from mytable
storage_integration = myint
file_format = (format_name = my_csv_format file_extension = NONE);
https://docs.snowflake.com/en/sql-reference/sql/copy-into-location.html

Related

snowflake: continue on error but also list all the errors

I am inserting data into Snowflake using the statement below:
copy into "sampletable"
from s3://test/test/ credentials=(aws_key_id='xxxx' aws_secret_key='yyyyy')
file_format = (type = csv field_delimiter = '|' skip_header = 1)
on_error = 'continue';
But after the ingestion is done, I also want to know which rows were not inserted and what the reason was, since I am using the option on_error = 'continue'.
Any idea how I can do this?
You can use the VALIDATE function: https://docs.snowflake.com/en/sql-reference/functions/validate.html
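For example (a sketch using the table name from the question), immediately after the COPY you can list every row that was rejected in that job:
-- '_last' points VALIDATE at the most recent COPY INTO executed in this session
select * from table(validate("sampletable", job_id => '_last'));
Each returned row includes the error message, the file, and the line that failed, so you can see exactly what on_error = 'continue' skipped.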

How to load data into a Snowflake table from a json.gz file

I would like to insert records from my json.gz file into a Snowflake table.
I created these steps:
CREATE FILE FORMAT test_gz TYPE = JSON;
create stage my_test_stage
storage_integration = MY_S3
url = 's3://mybucket/'
file_format = test_gz;
copy into test_table
from @my_test_stage;
I get this error: JSON file can produce one and only one column of type variant or object or array.
I also tried changing the file format to gzip, but it's not working.
All you need to do is add COMPRESSION = GZIP to the file format:
CREATE FILE FORMAT test_gz TYPE = JSON COMPRESSION = GZIP;
Make sure your destination table looks like this:
CREATE OR REPLACE TABLE test_table (JSON_DATA VARIANT);
Alternatively, you can execute this all at once:
copy into test_table
from 's3://mybucket/'
storage_integration = MY_S3
file_format = (type = json compression = gzip);
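Once the COPY succeeds, every record sits in the single VARIANT column and can be queried with path notation. A minimal sketch, assuming your JSON has an attribute called "name" (hypothetical):
select json_data:name::string as name
from test_table;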

Unloading Snowflake table data into S3 in Parquet format

I am trying to unload Snowflake table data into an S3 bucket in Parquet format, but I'm getting the error below.
`SQL compilation error: COPY statement only supports simple SELECT from stage statements for import.`
Below is my copy statement:
`create or replace stage STG_LOAD
url='s3://bucket/foler'
credentials=(aws_key_id='xxxx',aws_secret_key='xxxx')
file_format = (type = PARQUET);
copy into STG_LOAD from
(select OBJECT_CONSTRUCT(country_cd,source)
from table_1
file_format = (type='parquet')
header='true';`
Please let me know if I am missing anything here.
You have to reference named stages using the @ symbol. Also, the header option is true rather than 'true':
copy into @STG_LOAD from
(select OBJECT_CONSTRUCT(country_cd, source)
from table_1)
file_format = (type = 'parquet')
header = true;
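One thing to keep in mind: OBJECT_CONSTRUCT folds both columns into a single object per row, so the Parquet file ends up with one column holding that object. If you want one Parquet column per table column instead, you can select the columns directly. A sketch, reusing the names from the question:
copy into @STG_LOAD from
(select country_cd, source from table_1)
file_format = (type = 'parquet')
header = true;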

Copy the same file into a table using the COPY command & Snowpipe

I couldn't load the same file into a table in Snowflake using the COPY command / Snowpipe.
I always get the following result:
Copy executed with 0 files processed.
I have re-created the table and truncated it, but copy_history doesn't show any data:
select * from table(information_schema.copy_history(table_name=>'mytable', start_time=> dateadd(hours, -10, current_timestamp())));
I have used FORCE = TRUE in the COPY command, but it still didn't load the same file into the table. I have explicitly mentioned the file path in the COPY command:
FROM
@STAGE_DEV/myfile/05/28/16/myfile_1.csv
) file_format = (
format_name = STANDARD_CSV_FORMAT Skip_header = 1 FIELD_OPTIONALLY_ENCLOSED_BY = '"' NULL_IF = 'NULL'
)
on_error = continue
Force = True;
Has anyone faced a similar issue, and what would be the process to load the same file again using the COPY command or Snowpipe? I don't have the option to change the file name or put the files in a different S3 bucket.
ls @stage shows the files in the stage.
I have reloaded the files to the S3 bucket and it's working. Thank you all for the responses.
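If anyone else hits "0 files processed", it is worth confirming first that the path in the COPY statement actually resolves to files before suspecting FORCE. A sketch, using the stage and path from the question:
-- should return the file(s) you expect to load; an empty result means the path is wrong
list @STAGE_DEV/myfile/05/28/16/;
If LIST returns nothing, the COPY will process zero files no matter which options you set.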

What regex parser is used for the files_pattern in the 'COPY INTO' SQL query?

(Submitted on behalf of a Snowflake User)
I have a test S3 path called s3://bucket/path/test=integration_test_sanity/file.parquet
I want to be able to load this into Snowflake using the COPY INTO command, but I want to be able to load all the test folders, which have a structure like test=*/file.parquet.
I've tried:
COPY INTO raw.test_sanity_test_parquet
FROM 's3://bucket/path/'
CREDENTIALS=(AWS_KEY_ID='XXX' AWS_SECRET_KEY='XXX')
PATTERN='test=(.*)/.*'
FILE_FORMAT = (TYPE = parquet)
and also
COPY INTO raw.test_sanity_test_parquet
FROM 's3://bucket/path/'
CREDENTIALS=(AWS_KEY_ID='XXX' AWS_SECRET_KEY='XXX')
PATTERN='test=.*/.*'
FILE_FORMAT = (TYPE = parquet)
Neither of these works. I was wondering what regex parser is used by Snowflake and which regex I should use to get this to work.
This works, but I can't filter on just the test folders, which can cause issues:
COPY INTO raw.test_sanity_test_parquet
FROM 's3://bucket/path/'
CREDENTIALS=(AWS_KEY_ID='XXX' AWS_SECRET_KEY='XXX')
PATTERN='.*/.*'
FILE_FORMAT = (TYPE = parquet)
Any recommendations? Thanks!
Try this:
COPY INTO raw.test_sanity_test_parquet
FROM 's3://bucket/path/'
CREDENTIALS=(AWS_KEY_ID='XXX' AWS_SECRET_KEY='XXX')
PATTERN='.*/test.*[.]parquet'
FILE_FORMAT = (TYPE = parquet)
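The PATTERN option is treated as a regular expression that Snowflake matches against the full file path, which is why a pattern starting with test= never matches anything. Building on the answer above, a pattern that restricts the load to just the test= folders might look like this (a sketch, not verified against that bucket):
COPY INTO raw.test_sanity_test_parquet
FROM 's3://bucket/path/'
CREDENTIALS=(AWS_KEY_ID='XXX' AWS_SECRET_KEY='XXX')
PATTERN='.*/test=[^/]*/.*[.]parquet'
FILE_FORMAT = (TYPE = parquet)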
