Unable to load CSV file into Snowflake with the COPY INTO command

End of record reached while expected to parse column '"VEGETABLE_DETAILS_PLANT_HEIGHT"["HIGH_END_OF_RANGE":5]'
File 'veg_plant_height.csv', line 8, character 14
Row 3, column "VEGETABLE_DETAILS_PLANT_HEIGHT"["HIGH_END_OF_RANGE":5]
If you would like to continue loading when an error is encountered, use other values such as 'SKIP_FILE' or 'CONTINUE' for the ON_ERROR option. For more information on loading options, please run 'info loading_data' in a SQL client.
This is my table:
create or replace table VEGETABLE_DETAILS_PLANT_HEIGHT (
PLANT_NAME text(7),
VEG_HEIGHT_CODE text(1),
UNIT_OF_MEASURE text(2),
LOW_END_OF_RANGE number(2),
HIGH_END_OF_RANGE number(2)
);
and the COPY INTO command I used:
copy into vegetable_details_plant_height
from @like_a_window_into_an_s3_bucket
files = ( 'veg_plant_height.csv')
file_format = ( format_name=VEG_CHALLENGE_CC );
and the CSV file: https://uni-lab-files.s3.us-west-2.amazonaws.com/veg_plant_height.csv

The error "End of record reached while expected to parse column" means Snowflake detected there were less than expected columns when processing the current row.
Please review your CSV file and make sure each row has correct number of columns. The error said on line 8.
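If you simply want the load to skip bad rows instead of aborting, the ON_ERROR copy option mentioned in the error message can be added to the same COPY command. A sketch reusing the names from the question:
copy into vegetable_details_plant_height
from @like_a_window_into_an_s3_bucket
files = ('veg_plant_height.csv')
file_format = (format_name = VEG_CHALLENGE_CC)
on_error = 'CONTINUE'; -- or 'SKIP_FILE' to drop the whole file on error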

The table has five columns but the source file contains values for only four columns, which is why the COPY command returns the error. To resolve the issue you can modify the COPY command as shown below:
copy into vegetable_details_plant_height(PLANT_NAME, UNIT_OF_MEASURE, LOW_END_OF_RANGE, HIGH_END_OF_RANGE)
from (select $1, $2, $3, $4 from @like_a_window_into_an_s3_bucket)
files = ( 'veg_plant_height.csv')
file_format = ( format_name=VEG_CHALLENGE_CC );

As you can see in the CSV file, the data in one column is wrapped in "" and the names are separated by commas, so you need to use the FIELD_OPTIONALLY_ENCLOSED_BY = '"' option in your file format.
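For example, the file format could be recreated with that option. A minimal sketch; any other options VEG_CHALLENGE_CC already needs (header skipping, null handling) would have to be kept as well:
create or replace file format VEG_CHALLENGE_CC
type = csv
field_delimiter = ','
field_optionally_enclosed_by = '"'; -- quoted fields may then contain commas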

Related

How to solve error "Field delimiter ',' found while expecting record delimiter '\n'" while loading JSON data to the stage

I am trying to use the COPY INTO command to load data from S3 into Snowflake.
Below are the steps I followed to create the stage and load the file from the stage into Snowflake.
JSON file
{
"Name":"Umesh",
"Desigantion":"Product Manager",
"Location":"United Kingdom"
}
create or replace stage emp_json_stage
url='s3://mybucket/emp.json'
credentials=(aws_key_id='my id' aws_secret_key='my key');
-- create the table with a VARIANT column
CREATE TABLE emp_json_raw (
json_data_raw VARIANT
);
-- load data from the stage into Snowflake
COPY INTO emp_json_raw from @emp_json_stage;
I am getting the below error:
Field delimiter ',' found while expecting record delimiter '\n'
File 'emp.json', line 2, character 18
Row 2, column "emp_json_raw"["JSON_DATA_RAW":1]
I am using a simple JSON file, and I don't understand this error.
What causes it and how can I solve it?
The file format is not specified and is defaulting to CSV, hence the error.
Try this:
COPY INTO emp_json_raw
from @emp_json_stage
file_format=(TYPE=JSON);
There are other options besides TYPE that can be specified with file_format. Refer to the documentation here: https://docs.snowflake.com/en/sql-reference/sql/copy-into-table.html#type-json
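Once the load succeeds, you can sanity-check the VARIANT column with path notation. A quick check, not part of the original answer:
select json_data_raw:Name::string as name,
json_data_raw:Location::string as location
from emp_json_raw;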
try:
file_format = (type = csv field_optionally_enclosed_by='"')
The default settings do not expect the " wrapping around your data.
So you could either strip all the " characters from the file, or just set field_optionally_enclosed_by to a ". This does mean that if your data has " in it, things get messy.
https://docs.snowflake.com/en/user-guide/getting-started-tutorial-copy-into.html
https://docs.snowflake.com/en/sql-reference/sql/create-file-format.html#type-csv
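If the data itself contains double quotes, Snowflake expects them doubled inside an enclosed field. A small illustration; my_table and @my_stage are made-up names:
-- raw line in the file: 1,"he said ""hi""",abc
copy into my_table
from @my_stage
file_format = (type = csv field_optionally_enclosed_by = '"');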
Also, it is standard practice to specify the file type explicitly, whether CSV, JSON, Avro, Parquet, etc.
https://docs.snowflake.com/en/sql-reference/sql/create-file-format.html

ORA-01722: invalid number error while executing select query on external table

I'm executing select * from owner_name.tablename; where tablename is an external table backed by a dat file.
When executing this, it does not return any rows, and even count(0) fetches no results. I can see the below error in the log file:
error processing column ID in row 1 for datafile /external/ab/output/user/table_name.dat
ORA-01722: invalid number
...
error processing column ID in row 140489 for datafile /external/ab/output/user/table_name.dat
ORA-01722: invalid number
But the same dat file and the same table execute fine in the owner schema.
I checked everything: the dat file, the DDL of the table, the file location, spaces in the data file, the delimiters. Everything looks the same, but I still get ORA-01722.
What am I missing here? Previous Stack Overflow questions cover insert queries, not external tables.
DDL:
CREATE TABLE "OWNER"."TABLE_EXT"
( "ID" NUMBER(22,0),
"SOURCE_CODE" NUMBER(3,0)
)
ORGANIZATION EXTERNAL
( TYPE ORACLE_LOADER
DEFAULT DIRECTORY "OWNER_EXT"
ACCESS PARAMETERS
( records delimited BY newline FIELDS TERMINATED BY "Ç" missing field VALUES are NULL
(
CLAIM_ID ,
SOURCE_CODE ,
) )
LOCATION
( "OWNER_EXT":'table.dat'
)
)
REJECT LIMIT UNLIMITED
PARALLEL 16 ;
DAT file:
Ç3Ç5278260051Ç557065100415Ç5278260051ÇÇÇÇÇÇÇ

Snowflake-Internal Stage data load error: How to load "\" character

In a file, a few of the rows have \ in a column value. For example, I have rows in the below format:
101,Path1,Z:\VMC\PSPS,abc
102,Path5,C:\wintm\PSPS,abc
I was wondering how to load the \ character.
COPY INTO TEST_TABLE from @database.schema.stage_name FILE_FORMAT = ( TYPE = CSV FIELD_OPTIONALLY_ENCLOSED_BY = '\"' SKIP_HEADER = 1 );
Is there anything I can add in the file_format line?
Are you still getting this error? I just tried to recreate it by creating a CSV based on your sample data and a test table. I loaded the CSV into an internal stage and then ran your COPY command, and it worked for me.
Could you provide more details on the error you are facing? Perhaps there was something off with your table definition.
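One hedged suggestion while you investigate: by default ESCAPE_UNENCLOSED_FIELD is \, so a backslash in an unenclosed field can be treated as an escape character. Disabling it explicitly is a common workaround; a sketch with the same names as the question:
COPY INTO TEST_TABLE
from @database.schema.stage_name
FILE_FORMAT = ( TYPE = CSV
FIELD_OPTIONALLY_ENCLOSED_BY = '"'
ESCAPE_UNENCLOSED_FIELD = 'NONE' -- keep \ as a literal character
SKIP_HEADER = 1 );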

Data load in Snowflake: NULL result in a non-nullable column

I am getting the error message "NULL result in a non-nullable column" when loading my parquet files into Snowflake.
I have NOT NULL columns in Snowflake, for example NAME2 and NAME3, but the values against them in the parquet files are empty strings.
So my question is: how can I resolve this constraint violation without changing my table definition or removing the NOT NULL constraint?
COPY INTO "DB_STAGE"."SCH_ABC_INIT"."T_TAB" FROM (
SELECT
$1:OPSYS::VARCHAR,
$1:MANDT::VARCHAR,
$1:LIFNR::VARCHAR,
$1:LAND1::VARCHAR,
$1:NAME1::VARCHAR,
$1:NAME2::VARCHAR,
$1:NAME3::VARCHAR,
$1:NAME4::VARCHAR,
..
..
$1:OPTYPE::VARCHAR
FROM @DB_STAGE.SCH_ABC_INIT.initial_load_stage_ABC)
file_format = (type = 'parquet', NULL_IF=('NULL','',' ','NULL','NULL','//N'))
pattern = '.*/ABC-TAB-prod/.*snappy.parquet';
I believe that this line
file_format = (type = 'parquet', NULL_IF=('NULL','',' ','NULL','NULL','//N'))
is explicitly asking Snowflake to take the empty strings and make them into NULL values, which obviously won't work going into fields that are NOT NULL in your table. You should probably try something like this:
file_format = (type = 'parquet', NULL_IF=('NULL','//N'))
Your other option is to remove the NOT NULL constraint from your table and allow the conversion to NULL.
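If neither option is acceptable, a third route is to keep the constraint and substitute a placeholder during the COPY transform itself, assuming the functions used are among those supported in COPY transformations. A sketch, where NAME2 comes from the question and 'N/A' is an arbitrary placeholder:
-- inside the SELECT list of the COPY statement:
coalesce(nullif($1:NAME2::VARCHAR, ''), 'N/A'), -- empty string becomes 'N/A' rather than NULL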
Looking at these options, there is some guidance on NULLs in the docs, particularly for Parquet:
NULL_IF = ( 'string1' [ , 'string2' , ... ] )
Use: Data loading only
Definition: String used to convert to and from SQL NULL. Snowflake replaces these strings in the data load source with SQL NULL. To specify more than one string, enclose the list of strings in parentheses and use commas to separate each value.
This file format option is applied to the following actions only when loading Parquet data into separate columns using the MATCH_BY_COLUMN_NAME copy option.
Note that Snowflake converts all instances of the value to NULL, regardless of the data type. For example, if 2 is specified as a value, all instances of 2 as either a string or number are converted.
For example:
NULL_IF = ('\N', 'NULL', 'NUL', '')
Note that this option can include empty strings.
Default: \N (i.e. NULL, which assumes the ESCAPE_UNENCLOSED_FIELD value is \)
from the docs: https://docs.snowflake.com/en/sql-reference/sql/create-file-format.html
There's also more content I found under communities: https://community.snowflake.com/s/question/0D50Z00009Vw7ktSAB/how-can-i-get-schema-file-when-copy-into-file-returns-empty-row
https://community.snowflake.com/s/question/0D50Z00008UE4MKSA1/while-loading-a-parquet-file-to-snowflake-all-the-optional-field-in-parquet-schema-are-coming-as-null-any-idea-why-it-is-happening-all-other-fields-which-are-mandatory-in-parquet-schema-as-coming-as-expected
Let me know if these help
Try setting EMPTY_FIELD_AS_NULL = FALSE in the file_format.

COPY INTO query on Snowflake returns TABLE does not exist error

I am trying to load data from Azure blob storage.
The data has already been staged.
But, the issue is when I try to run
copy into random_table_name
from @stage_name_i_created
file_format = (type='csv')
pattern ='*.csv'
Below is the error I encounter:
raise error_class(
snowflake.connector.errors.ProgrammingError: 001757 (42601): SQL compilation error:
Table 'random_table_name' does not exist
Basically, it says the table does not exist, which it does not, but the syntax on the website is the same as mine.
In my case the table name was case-sensitive. Snowflake converts unquoted identifiers to upper case, so I changed the database/schema/table names to all upper-case and it started working.
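To illustrate the identifier rules, a small sketch reusing the stage name from the question:
-- unquoted names fold to upper case, so both of these resolve to RANDOM_TABLE_NAME:
create table random_table_name (c1 varchar);
copy into RANDOM_TABLE_NAME from @stage_name_i_created file_format = (type='csv');
-- a quoted lower-case name would be a different, case-sensitive identifier:
-- create table "random_table_name" (c1 varchar);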
First run the below query to fetch the column headers:
select $1 FROM @stage_name_i_created/filename.csv limit 1
Assuming the below is the header line from your csv file:
id;first_name;last_name;email;age;location
Create a CSV file format:
create or replace file format semicolon
type = 'CSV'
field_delimiter = ';'
skip_header=1;
Then define the data types and field names as below:
create or replace table <yourtable> as
select $1::varchar as id
,$2::varchar as first_name
,$3::varchar as last_name
,$4::varchar as email
,$5::int as age
,$6::varchar as location
FROM @stage_name_i_created/yourfile.csv
(file_format => semicolon );
The table must exist prior to running a COPY INTO command. In your post, you say that the table does not exist...so that is your issue.
If your table exists, try forcing the full table path like this:
copy into <database>.<schema>.<random_table_name>
from @stage_name_i_created
file_format = (type='csv')
pattern ='*.csv'
or by setting the context first, like this:
use database <database_name>;
use schema <schema_name>;
copy into random_table_name
from @stage_name_i_created
file_format = (type='csv')
pattern ='*.csv';
rbachkaniwala, what do you mean by 'How do I create a table? (according to Snowflake syntax it is not possible to create empty tables)'?
You can just do the below to create a table:
CREATE TABLE random_table_name (FIELD1 VARCHAR, FIELD2 VARCHAR);
The table does need to exist. You should check the documentation for COPY INTO.
Other areas to consider (a quick sanity-check sketch follows below):
do you have the right context set for the database and schema?
does the user / role have access to the table or object?
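For example, with illustrative names:
use database my_db;
use schema my_schema;
show tables like 'RANDOM_TABLE_NAME';
show grants on table my_db.my_schema.random_table_name;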
It basically seems like you don't have the table defined yet. You should
ensure the table is created
ensure all columns in the CSV exist as columns in the table
ensure the order of the columns is the same as in the CSV
I'd check data types too.
"COPY INTO" is not a query command, it is the actual data transfer execution from source to destination, which both must exist as others commented here but If you want just to query without loading the files then run the following SQL:
//Display list of files in the stage to verify stage
LIST @stage_name_i_created;
//Create a file format
CREATE OR REPLACE FILE FORMAT RANDOM_FILE_CSV
type = csv
COMPRESSION = 'GZIP'
FIELD_DELIMITER = ','
RECORD_DELIMITER = '\n'
SKIP_HEADER = 0
FIELD_OPTIONALLY_ENCLOSED_BY = '\042'
TRIM_SPACE = FALSE
ERROR_ON_COLUMN_COUNT_MISMATCH = FALSE
ESCAPE = 'NONE'
ESCAPE_UNENCLOSED_FIELD = 'NONE'
DATE_FORMAT = 'AUTO'
TIMESTAMP_FORMAT = 'AUTO'
NULL_IF = ('\\N');
//Now select the data in the files
Select $1 as first_col, $2 as second_col //add as many columns as necessary
from @stage_name_i_created
(FILE_FORMAT => RANDOM_FILE_CSV)
More information can be found in the documentation link here
https://docs.snowflake.com/en/user-guide/querying-stage.html
