I would like to insert records from my json.gz file into a Snowflake table.
These are the steps I created:
CREATE FILE FORMAT test_gz TYPE = JSON
create stage my_test_stage
storage_integration = MY_S3
url = 's3://mybucket/'
file_format = test_gz;
copy into test_table
from @my_test_stage;
I have an error: JSON file can produce one and only one column of type variant or object or array.
I also tried changing the file format to GZIP, but it's not working.
All you need to do is use
CREATE FILE FORMAT test_gz TYPE = JSON COMPRESSION = GZIP
instead of just TYPE = JSON.
Make sure your destination table looks like this:
CREATE OR REPLACE TABLE test_table (JSON_DATA VARIANT)
Alternatively, you can execute it all at once:
copy into test_table
from 's3://mybucket/'
storage_integration = MY_S3
file_format = (type = json COMPRESSION=GZIP)
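Once the data is loaded, the single VARIANT column can be queried (and flattened) directly with SQL. A minimal sketch, assuming the JSON_DATA column from the table above and a hypothetical attribute named some_field inside your JSON documents:
-- query an attribute straight from the VARIANT column (some_field is hypothetical)
select json_data:some_field::string as some_field
from test_table;
-- if each file holds one JSON array, LATERAL FLATTEN explodes it into one row per element
select f.value:some_field::string as some_field
from test_table,
lateral flatten(input => json_data) f;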
My data is as follows:
[ {
"InvestorID": "10014-49",
"InvestorName": "Blackstone",
"LastUpdated": "11/23/2021"
},
{
"InvestorID": "15713-74",
"InvestorName": "Bay Grove Capital",
"LastUpdated": "11/19/2021"
}]
So far I have tried:
CREATE OR REPLACE TABLE STG_PB_INVESTOR (
Investor_ID string, Investor_Name string, Last_Updated DATETIME
);
created table
create or replace file format investorformat
type = 'JSON'
strip_outer_array = true;
created file format
create or replace stage investor_stage
file_format = investorformat;
created stage
copy into STG_PB_INVESTOR from @investor_stage;
I am getting an error:
SQL compilation error: JSON file format can produce one and only one column of type variant or object or array. Use CSV file format if you want to load more than one column.
You should be loading your JSON data into a table with a single column that is a VARIANT. Once in Snowflake you can either flatten that data out with a view or a subsequent table load. You could also flatten it on the way in using a SELECT on your COPY statement, but that tends to be a little slower.
Try something like this:
CREATE OR REPLACE TABLE STG_PB_INVESTOR_JSON (
var variant
);
create or replace file format investorformat
type = 'JSON'
strip_outer_array = true;
create or replace stage investor_stage
file_format = investorformat;
copy into STG_PB_INVESTOR_JSON from @investor_stage;
create or replace table STG_PB_INVESTOR as
SELECT
var:InvestorID::string as Investor_id,
var:InvestorName::string as Investor_Name,
TO_DATE(var:LastUpdated::string,'MM/DD/YYYY') as last_updated
FROM STG_PB_INVESTOR_JSON;
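The answer above also mentions flattening with a view or flattening on the way in with a SELECT on the COPY statement; here are minimal sketches of each. The view name STG_PB_INVESTOR_VW is just an example, and the transform-on-load variant assumes the stage and file format defined above.
-- alternative 1: expose the flattened columns as a view over the VARIANT staging table
create or replace view STG_PB_INVESTOR_VW as
select
var:InvestorID::string as investor_id,
var:InvestorName::string as investor_name,
to_date(var:LastUpdated::string, 'MM/DD/YYYY') as last_updated
from STG_PB_INVESTOR_JSON;
-- alternative 2: flatten on the way in with a SELECT inside the COPY statement
copy into STG_PB_INVESTOR
from (
select
$1:InvestorID::string,
$1:InvestorName::string,
to_date($1:LastUpdated::string, 'MM/DD/YYYY')
from @investor_stage
);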
How can I unload Snowflake data to S3 without using any file format?
When unloading data in Snowflake, the file format determines the extension of the output files.
For example:
copy into 's3://mybucket/unload/'
from mytable
storage_integration = myint
file_format = (format_name = my_csv_format);
But what I want is to store data without any extension.
SINGLE is what I was looking for. It is one of the parameters we can use with the COPY command, and it creates the file without an extension.
Code:
copy into 's3://mybucket/unload/'
from mytable
storage_integration = myint
file_format = (format_name = my_csv_format)
SINGLE = TRUE;
Go through the note at the link below for a better understanding:
https://docs.snowflake.com/en/sql-reference/sql/create-file-format.html#:~:text=comma%20(%2C)-,FILE_EXTENSION,-%3D%20%27string%27%20%7C%20NONE
You can add the parameter FILE_EXTENSION = NONE to your file format. With this parameter, Snowflake does not add a file extension based on your file format (in this case .csv), but uses the passed extension (NONE or any other).
copy into 's3://mybucket/unload/'
from mytable
storage_integration = myint
file_format = (format_name = my_csv_format file_extension = NONE);
https://docs.snowflake.com/en/sql-reference/sql/copy-into-location.html
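If you prefer, the option can also be baked into the named file format itself rather than overridden in the COPY statement. A sketch, assuming my_csv_format is a plain comma-delimited format (keep whatever other options your format already uses):
create or replace file format my_csv_format
type = CSV
field_delimiter = ','
file_extension = NONE;
copy into 's3://mybucket/unload/'
from mytable
storage_integration = myint
file_format = (format_name = my_csv_format);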
We are facing the below issue while loading a CSV file from S3 to Snowflake.
SQL Compilation error: Insert column value list does not match column list expecting 7 but got 6
We have tried removing a column from the table and trying again, but this time it shows expecting 6 but got 5.
Below are the commands that we used for the stage creation and the COPY command.
create or replace stage mystage
url='s3://test/test'
STORAGE_INTEGRATION = test_int
file_format = (type = csv FIELD_OPTIONALLY_ENCLOSED_BY='"' COMPRESSION=GZIP);
copy into mytable
from @mystage
MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE
FILE_FORMAT = (TYPE = CSV FIELD_OPTIONALLY_ENCLOSED_BY='"' COMPRESSION=GZIP error_on_column_count_mismatch=false TRIM_SPACE=TRUE NULL_IF=(''))
FORCE = TRUE
ON_ERROR = Continue
PURGE=TRUE;
You cannot use MATCH_BY_COLUMN_NAME for CSV files; that is why you get this error.
This copy option is supported for the following data formats:
JSON
Avro
ORC
Parquet
https://docs.snowflake.com/en/sql-reference/sql/copy-into-table.html
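For CSV you can get a similar effect by mapping columns by position instead: list the target columns on the table and select the corresponding fields from the stage. A minimal sketch, assuming mytable has seven hypothetical columns col1 through col7 in the same order as the file:
copy into mytable (col1, col2, col3, col4, col5, col6, col7)
from (
select $1, $2, $3, $4, $5, $6, $7
from @mystage
)
file_format = (TYPE = CSV FIELD_OPTIONALLY_ENCLOSED_BY='"' COMPRESSION=GZIP TRIM_SPACE=TRUE NULL_IF=(''))
ON_ERROR = CONTINUE;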
I am trying to unload Snowflake table data into an S3 bucket in Parquet format, but I am getting the below error.
`SQL compilation error: COPY statement only supports simple SELECT from stage statements for import.`
Below is the syntax of the COPY statement:
`create or replace stage STG_LOAD
url='s3://bucket/foler'
credentials=(aws_key_id='xxxx',aws_secret_key='xxxx')
file_format = (type = PARQUET);
copy into STG_LOAD from
(select OBJECT_CONSTRUCT(country_cd,source)
from table_1
file_format = (type='parquet')
header='true';`
Please let me know if I am missing anything here.
You have to identify named stages using the @ symbol. Also, the header option is true rather than 'true'.
copy into @STG_LOAD from
(select OBJECT_CONSTRUCT(country_cd,source)
from table_1 )
file_format = (type='parquet')
header=true;
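One thing to note: OBJECT_CONSTRUCT(country_cd, source) uses the runtime value of country_cd as the key name. If you want fixed key names in the Parquet output, you can pass literal keys instead (same two columns, just named explicitly):
copy into @STG_LOAD from
(select OBJECT_CONSTRUCT('country_cd', country_cd, 'source', source)
from table_1)
file_format = (type='parquet')
header=true;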
I am trying to load data from azure blob storage.
The data has already been staged.
But the issue appears when I try to run:
copy into random_table_name
from @stage_name_i_created
file_format = (type='csv')
pattern ='*.csv'
Below is the error I encounter:
raise error_class(
snowflake.connector.errors.ProgrammingError: 001757 (42601): SQL compilation error:
Table 'random_table_name' does not exist
Basically, it says the table does not exist, which it does not, but the syntax on the website is the same as mine.
In my case the table name was case-sensitive. Snowflake seems to convert everything to upper case. I changed the database/schema/table names to all upper case and it started working.
First, run the below query to fetch the column header:
select $1 FROM @stage_name_i_created/filename.csv limit 1
Assuming the below is the header line from your CSV file:
id;first_name;last_name;email;age;location
Create a CSV file format:
create or replace file format semicolon
type = 'CSV'
field_delimiter = ';'
skip_header=1;
Then define the data types and field names as below:
create or replace table <yourtable> as
select $1::varchar as id
,$2::varchar as first_name
,$3::varchar as last_name
,$4::varchar as email
,$5::int as age
,$6::varchar as location
FROM @stage_name_i_created/yourfile.csv
(file_format => 'semicolon');
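Once the table exists (whether created this way or with an explicit CREATE TABLE), the original COPY INTO will resolve it. A sketch reusing the semicolon file format defined above:
copy into <yourtable>
from @stage_name_i_created
file_format = (format_name = 'semicolon')
pattern = '.*\.csv';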
The table must exist prior to running a COPY INTO command. In your post, you say that the table does not exist...so that is your issue.
If your table exists, try forcing the full table path like this:
copy into <database>.<schema>.<random_table_name>
from #stage_name_i_created
file_format = (type='csv')
pattern = '.*\.csv';
or by setting the context first:
use database <database_name>;
use schema <schema_name>;
copy into <database_name>.<schema_name>.random_table_name
from #stage_name_i_created
file_format = (type='csv')
pattern = '.*\.csv';
rbachkaniwala, what do you mean by 'How do I create a table? (according to Snowflake syntax it is not possible to create empty tables)'?
You can just do the below to create a table:
CREATE TABLE random_table_name (FIELD1 VARCHAR, FIELD2 VARCHAR)
The table does need to exist. You should check the documentation for COPY INTO.
Other areas to consider are:
do you have the right context set for the database and schema
does the user/role have access to the table or object
It basically seems like you don't have the table defined yet. You should:
ensure the table is created
ensure all columns in the CSV exist as columns in the table
ensure the order of the columns is the same as in the CSV
I'd check data types too.
"COPY INTO" is not a query command, it is the actual data transfer execution from source to destination, which both must exist as others commented here but If you want just to query without loading the files then run the following SQL:
//Display list of files in the stage to verify stage
LIST @stage_name_i_created;
//Create a file format
CREATE OR REPLACE FILE FORMAT RANDOM_FILE_CSV
type = csv
COMPRESSION = 'GZIP' FIELD_DELIMITER = ',' RECORD_DELIMITER = '\n' SKIP_HEADER = 0 FIELD_OPTIONALLY_ENCLOSED_BY = '\042'
TRIM_SPACE = FALSE ERROR_ON_COLUMN_COUNT_MISMATCH = FALSE ESCAPE = 'NONE' ESCAPE_UNENCLOSED_FIELD = 'NONE' DATE_FORMAT = 'AUTO' TIMESTAMP_FORMAT = 'AUTO'
NULL_IF = ('\\N');
//Now select the data in the files
Select $1 as first_col, $2 as second_col //add as many columns as necessary ...etc
from @stage_name_i_created
(FILE_FORMAT => 'RANDOM_FILE_CSV');
More information can be found in the documentation here:
https://docs.snowflake.com/en/user-guide/querying-stage.html