Unload Snowflake table data into S3 in Parquet format - snowflake-cloud-data-platform

After unloading a Snowflake table into an S3 bucket, I see NULL for every value under all the columns.
Below is the code I used:
create or replace stage STG_LOAD
url='s3://bucket/foler'
credentials=(aws_key_id='xxxx',aws_secret_key='xxxx')
file_format = (type = PARQUET);
copy into @STG_LOAD from
(select OBJECT_CONSTRUCT(country_cd,source)
from table_1)
file_format = (type='parquet')
header = true;
Please let me know if I am missing something here.

Seeing NULL values here is expected behaviour. OBJECT_CONSTRUCT omits any key-value pair whose value is NULL, so for those rows your SELECT produces an empty object, which is written as NULL values:
create or replace table table_1
( country_cd varchar, source varchar ) as
select * from values
('US','Jack'),
('UK','Joe'),
('NL','Jim'),
('EU', null);
select OBJECT_CONSTRUCT(country_cd,source) output
from table_1;
+------------------+
| OUTPUT |
+------------------+
| { "US": "Jack" } |
| { "UK": "Joe" } |
| { "NL": "Jim" } |
| {} |
+------------------+
If you don't want to write null values, you can filter them out in your SELECT:
copy into @STG_LOAD from
(select OBJECT_CONSTRUCT(country_cd,source ) output
from table_1
where output != OBJECT_CONSTRUCT() ) -- no empty objects
file_format = (type=parquet)
overwrite = true;
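To make the behaviour above concrete, here is a rough Python emulation of how OBJECT_CONSTRUCT treats NULL values, using the same four sample rows. The function name `object_construct` is illustrative only; this is a model of the null-omitting behaviour, not Snowflake code.

```python
def object_construct(*args):
    """Emulate OBJECT_CONSTRUCT: build a dict from alternating keys and
    values, dropping any pair whose key or value is None (SQL NULL)."""
    if len(args) % 2 != 0:
        raise ValueError("expected an even number of arguments")
    pairs = zip(args[0::2], args[1::2])
    return {k: v for k, v in pairs if k is not None and v is not None}


# Same sample data as table_1 above; None stands in for SQL NULL.
rows = [("US", "Jack"), ("UK", "Joe"), ("NL", "Jim"), ("EU", None)]

objects = [object_construct(country_cd, source) for country_cd, source in rows]

# Dropping empty objects mirrors the "no empty objects" WHERE filter.
non_empty = [obj for obj in objects if obj]

print(objects)
print(non_empty)
```

The last row yields an empty dict because its only value is None, which is the row the WHERE filter removes.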

Related

snowflake: mention date format staged data files

I have a TSV as below (formatted here for readability) in an S3 bucket:
col_1 | col_2 | col_3
2017/12/01 | 1996 | 20101201
.. | .. | ..
All of the above columns are of DATE type.
I create a stage to load this file from S3.
I have created the following table:
CREATE OR REPLACE TABLE
"test"
(
col_1 Date,
col_2 Date,
col_3 Date
);
Now I want to ingest the above CSV into this table:
-- create file_format
create or replace file format my_file_format
type = csv
field_delimiter = '|'
skip_header = 1;
-- create stage
CREATE or replace STAGE my_stage
URL='s3://xxxx/yyyy'
CREDENTIALS=(AWS_KEY_ID='XXXXXXXXXXXXXX' AWS_SECRET_KEY='YYYYY');
-- copy into
copy into "test"
from @my_stage
file_format = (format_name = my_file_format);
-- or insert into
insert INTO "test" (select $1,$2,$3 from @my_stage (file_format => my_file_format));
I get the error:
Can't parse '2017/12/01' as date with format 'AUTO'
I can't change the CSV. Is there any way I can specify the date format for each column while ingesting?
Can you try to use the proper date format for each column, based on its values?
insert INTO "test" (select TO_DATE($1,'YYYY/MM/DD'),TO_DATE($2,'YYYY'),
TO_DATE($3,'YYYYMMDD') from @my_stage (file_format => my_file_format));
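The idea of one explicit format per column can be sketched outside Snowflake as well. The following Python snippet is only an analogy: `COLUMN_FORMATS` and `parse_row` are hypothetical names, and the `strptime` codes are the Python counterparts of 'YYYY/MM/DD', 'YYYY' and 'YYYYMMDD'.

```python
from datetime import datetime

# One strptime format per column, in column order.
COLUMN_FORMATS = ["%Y/%m/%d", "%Y", "%Y%m%d"]


def parse_row(line, delimiter="|"):
    """Split one data line and parse each field with its column's format."""
    fields = [field.strip() for field in line.split(delimiter)]
    return [
        datetime.strptime(value, fmt).date()
        for value, fmt in zip(fields, COLUMN_FORMATS)
    ]


parsed = parse_row("2017/12/01 | 1996 | 20101201")
print([d.isoformat() for d in parsed])
```

A year-only value such as 1996 has no month or day, so Python fills in January 1 for the missing parts.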

How to add lots of values to a Postgres column with a where statement

I have a table with 1000 rows and have added a new column, but now I need to add the data to it. Below is an example of my table.
location | name | display_name
-----------------+--------+-------
liverpool | Dan |
london | Louise |
stoke-on-trent | Amel |
itchen-hampshire| Mark |
I also have a CSV with the extra data that looks like this:
location,name,display_name
Liverpool,Dan,Liverpool
London,Louise,London
stoke-on-trent,Amel,Stoke on Trent
itchen-hampshire,Mark,itchen (hampshire)
I know how to update a single row, but I'm not sure how to do it for the 1000 rows of data I have.
Updating a single row:
UPDATE info_table
SET display_name = 'Itchen (Hampshire)'
WHERE id = 'itchen-hampshire';
You should first load that CSV data into another table and then do an update join on the first table:
UPDATE yourTable t1
SET display_name = t2.display_name
FROM csvTable t2
WHERE t2.location = t1.location;
If you only want to update display names that are currently null, then use:
WHERE t2.location = t1.location AND t1.display_name IS NULL;
To update more than one column, you can use this generalized query:
update test as t set
column_a = c.column_a
from (values
('123', 1),
('345', 2)
) as c(column_b, column_a)
where c.column_b = t.column_b;
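Note that in the sample data the table stores 'liverpool' while the CSV has 'Liverpool', so an exact match on location would skip those rows. As a hedged alternative to the update join, the sketch below applies the CSV from client code with one parameterized UPDATE per row and a case-insensitive comparison. It assumes Python with an in-memory SQLite database as a stand-in for Postgres; the table name `info_table` is taken from the question, everything else is illustrative.

```python
import csv
import io
import sqlite3

# The CSV from the question, inlined so the sketch is self-contained.
CSV_TEXT = """location,name,display_name
Liverpool,Dan,Liverpool
London,Louise,London
stoke-on-trent,Amel,Stoke on Trent
itchen-hampshire,Mark,itchen (hampshire)
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE info_table (location TEXT, name TEXT, display_name TEXT)"
)
conn.executemany(
    "INSERT INTO info_table (location, name) VALUES (?, ?)",
    [
        ("liverpool", "Dan"),
        ("london", "Louise"),
        ("stoke-on-trent", "Amel"),
        ("itchen-hampshire", "Mark"),
    ],
)

# One parameterized UPDATE per CSV row; lower() absorbs the case mismatch.
rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))
conn.executemany(
    "UPDATE info_table SET display_name = :display_name "
    "WHERE lower(location) = lower(:location)",
    rows,
)
conn.commit()

result = dict(conn.execute("SELECT location, display_name FROM info_table"))
print(result)
```

For a large file, loading the CSV into a table and running a single update join, as described above, is usually the better fit; the same lower() comparison can be used in its WHERE clause.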

Search with LIKE in PostgreSQL array

I have this table:
id | name | tags
----+----------+-------------------------
1 | test.jpg | {sometags,other_things}
I need to get rows that contain specific tags by searching in array with regular expression or LIKE, like this:
SELECT * FROM images WHERE 'some%' LIKE any(tags);
But this query returns nothing.
with images (id, name, tags) as (values
(1, 'test.jpg', '{sometags, other_things}'::text[]),
(2, 'test2.jpg', '{othertags, other_things}'::text[])
)
select *
from images
where (
select bool_or(tag like 'some%')
from unnest(tags) t (tag)
);
id | name | tags
----+----------+-------------------------
1 | test.jpg | {sometags,other_things}
unnest returns a set, which you aggregate with the convenient bool_or function.
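The operand order is what matters here: in `'some%' LIKE any(tags)` each array element is used as the pattern and 'some%' is the string being tested, so nothing matches. The Python sketch below models both forms; `sql_like` is a hypothetical, simplified LIKE emulation (it handles only % and _, with no escape character).

```python
import re


def sql_like(value, pattern):
    """Minimal LIKE emulation: % matches any run of characters, _ matches one."""
    regex = "".join(
        ".*" if ch == "%" else "." if ch == "_" else re.escape(ch)
        for ch in pattern
    )
    return re.fullmatch(regex, value, flags=re.DOTALL) is not None


tags = ["sometags", "other_things"]

# 'some%' LIKE ANY(tags): each tag is used as the pattern.
reversed_form = any(sql_like("some%", tag) for tag in tags)

# bool_or(tag LIKE 'some%'): each tag is the value being tested.
unnest_form = any(sql_like(tag, "some%") for tag in tags)

print(reversed_form, unnest_form)
```

Only the second form finds the row, which matches the result of the unnest query above.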

update one table from another selected table

I select one column from a table and generate a second column with a CASE expression:
(select Id , case
when education=0 then '0::ALL'
when education=1 then '1::HIGH_SCHOOL'
when education=2 then '2::UNDERGRAD'
when education=3 then '3::ALUM'
when education=4 then '4::HIGH_SCHOOL_GRAD'
when education=5 then '5::SOME_COLLEGE'
when education=6 then '6::ASSOCIATE_DEGREE'
when education=7 then '7::IN_GRAD_SCHOOL'
when education=8 then '8::SOME_GRAD_SCHOOL'
when education=9 then '9::MASTER_DEGREE'
when education=10 then '10::PROFESSIONAL_DEGREE'
when education=11 then '11::DOCTORATE_DEGREE'
when education=12 then '12::UNSPECIFIED'
end as myeducation
from ids_table where Id = '4fcc-a519-15db04651b91')
Assume it returns:
------------------------------------------------
| Id myeducation |
| 4fcc-a519-15db04651b91, 9::MASTER_DEGREE |
------------------------------------------------
In the same table (ids_table), I have an empty column called allEducations.
I want to set allEducations = myeducation where the Id in the result above equals the Id in ids_table.
Before:
ids_table:
----------------------------------------------
| Id allEducation |
| 4fcc-a519-15db04651b91, |
------------------------------------------------
After:
----------------------------------------------
| Id allEducation |
| 4fcc-a519-15db04651b91, 9::MASTER_DEGREE |
------------------------------------------------
I tried something like:
;WITH b AS (THE SQL QUERY ABOVE) update ids_table c set c.allEducations = b.myeducation where c.id = b.id
Any help is appreciated!
This should be enough:
begin tran updateEducation
update ids_table set allEducations =
case
when education=0 then '0::ALL'
when education=1 then '1::HIGH_SCHOOL'
when education=2 then '2::UNDERGRAD'
when education=3 then '3::ALUM'
when education=4 then '4::HIGH_SCHOOL_GRAD'
when education=5 then '5::SOME_COLLEGE'
when education=6 then '6::ASSOCIATE_DEGREE'
when education=7 then '7::IN_GRAD_SCHOOL'
when education=8 then '8::SOME_GRAD_SCHOOL'
when education=9 then '9::MASTER_DEGREE'
when education=10 then '10::PROFESSIONAL_DEGREE'
when education=11 then '11::DOCTORATE_DEGREE'
when education=12 then '12::UNSPECIFIED'
end
---- if it is not good
-- rollback
---- if it is good
-- commit

How to retrieve old values in OUTPUT clause with SQL MERGE statement

I'm using MERGE statement to update a product table containing (Name="a", Description="desca"). My source table contains (Name="a", Description="newdesca") and I merge on the Name field.
In my Output clause, I would like to get back the field BEFORE the update -> Description = "desca".
I couldn't find a way to do that, I'm always getting back the new value ("newdesca"). Why?
Can you not just use the deleted memory-resident table? For example:
IF OBJECT_ID(N'tempdb..#T', 'U') IS NOT NULL
DROP TABLE #T;
CREATE TABLE #T (Name VARCHAR(5), Description VARCHAR(20));
INSERT #T (Name, Description)
VALUES ('a', 'desca'), ('b', 'delete');
MERGE #T AS t
USING (VALUES ('a', 'newdesca'), ('c', 'insert')) AS m (Name, Description)
ON t.Name = m.Name
WHEN MATCHED THEN
UPDATE SET Description = m.Description
WHEN NOT MATCHED BY TARGET THEN
INSERT (Name, Description)
VALUES (m.Name, m.Description)
WHEN NOT MATCHED BY SOURCE THEN
DELETE
OUTPUT $Action, inserted.*, deleted.*;
IF OBJECT_ID(N'tempdb..#T', 'U') IS NOT NULL
DROP TABLE #T;
The output of this would be:
$Action | Name | Description | Name | Description
--------+-------+-------------+------+--------------
INSERT | c | insert | NULL | NULL
UPDATE | a | newdesca | a | desca
DELETE | NULL | NULL | b | delete
