Update name in Snowflake variant column

I have copied some JSON files into Snowflake from a stage, and one of the property names contains a hyphen. When I try to query that property (as shown below), I get this error:
select my_variant:test-id from mytable;
SQL compilation error: error line 1 at position 44 invalid identifier 'ID'.
I assume it doesn't like the hyphen. Is there any way I can rename this hyphenated name in my variant column so I don't get the error?

You just need to quote the column name in the variant:
select my_variant:"test-id" from mytable;
If you want to update it, see below. It assumes that you have a key per row, so that we can aggregate it back to rebuild the variant at the row level.
Setup test table:
create or replace table test (k int, a variant);
insert into test
select 1, parse_json('{"test-id": 1, "test-id2": "2"}')
union all
select 2, parse_json('{"test-1": 1, "test-2": "2"}');
select * from test;
+---+--------------------+
| K | A                  |
|---+--------------------|
| 1 | {                  |
|   |   "test-id": 1,    |
|   |   "test-id2": "2"  |
|   | }                  |
| 2 | {                  |
|   |   "test-1": 1,     |
|   |   "test-2": "2"    |
|   | }                  |
+---+--------------------+
Update the table:
update test t
set t.a = b.value
from (
    with t as (
        select
            k,
            replace(f.key, '-', '_') as key,
            f.value as value
        from test,
        lateral flatten(a) f
    )
    select k, object_agg(key, value) as value
    from t
    group by k
) b
where t.k = b.k;
select * from test;
+---+--------------------+
| K | A                  |
|---+--------------------|
| 1 | {                  |
|   |   "test_id": 1,    |
|   |   "test_id2": "2"  |
|   | }                  |
| 2 | {                  |
|   |   "test_1": 1,     |
|   |   "test_2": "2"    |
|   | }                  |
+---+--------------------+
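For intuition, the rename logic above can be sketched in plain Python, treating each VARIANT as a parsed JSON object: flatten it to key/value pairs, rewrite the keys, and aggregate them back per row. This is a hypothetical model of the FLATTEN / OBJECT_AGG approach, not Snowflake code.

```python
import json

def rename_keys(variant_json: str) -> str:
    """Model of: flatten the variant, replace(f.key, '-', '_'),
    then object_agg(key, value) back into one object per row."""
    obj = json.loads(variant_json)
    return json.dumps({k.replace("-", "_"): v for k, v in obj.items()})

# One entry per row key (k), mirroring the test table
rows = {
    1: '{"test-id": 1, "test-id2": "2"}',
    2: '{"test-1": 1, "test-2": "2"}',
}
updated = {k: rename_keys(a) for k, a in rows.items()}
```

As in the UPDATE, this only works cleanly because each row has its own key to aggregate back on.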

Related

Use column values as filenames when COPYing into a stage

Is there an out-of-the-box method for Snowflake to use values from a column as a filename when using COPY INTO @mystage? The goal is to copy X number of files into an S3 stage (essentially PARTITION BY column1), but straight to the stage, not creating subfolders. X would be the number of distinct values in a column.
This can obviously be done manually:
copy into @mystage/mycustomfilename
However, the better option would be something like this:
copy into @mystage/$column1
Is there a version of this that Snowflake supports?
As mentioned above, the PARTITION BY option splits the data into subfolders named after the values in the specified column, but Snowflake still uses a generic filename within each subfolder.
Created structure -
create temporary table temp_tab_split_members(seq_id number, member_id number, name varchar2(30));
+----------------------------------------------------+
| status |
|----------------------------------------------------|
| Table TEMP_TAB_SPLIT_MEMBERS successfully created. |
+----------------------------------------------------+
Fake data -
insert into temp_tab_split_members
with cte as
(select seq4(),(trim(mod(seq4(),4))+1)::integer,'my name-'||seq4() from table(generator(rowcount=>12)))
select * from cte;
+-------------------------+
| number of rows inserted |
|-------------------------|
| 12 |
+-------------------------+
Checking data format -
select * from TEMP_TAB_SPLIT_MEMBERS order by member_id;
+--------+-----------+------------+
| SEQ_ID | MEMBER_ID | NAME |
|--------+-----------+------------|
| 0 | 1 | my name-0 |
| 4 | 1 | my name-4 |
| 8 | 1 | my name-8 |
| 1 | 2 | my name-1 |
| 5 | 2 | my name-5 |
| 9 | 2 | my name-9 |
| 2 | 3 | my name-2 |
| 6 | 3 | my name-6 |
| 10 | 3 | my name-10 |
| 3 | 4 | my name-3 |
| 7 | 4 | my name-7 |
| 11 | 4 | my name-11 |
+--------+-----------+------------+
Check that the stage is empty:
list @test_row_stage;
+------+------+-----+---------------+
| name | size | md5 | last_modified |
|------+------+-----+---------------|
+------+------+-----+---------------+
Main procedure to generate files
EXECUTE IMMEDIATE $$
DECLARE
    company varchar2(30);
    BU varchar2(30);
    eval_desc varchar2(30);
    member_id varchar2(30);
    file_name varchar2(30);
    c1 CURSOR FOR SELECT distinct member_id FROM temp_tab_split_members;
BEGIN
    for record in c1 do
        member_id := record.member_id;
        file_name := 'load' || '_' || member_id || '.csv';
        execute immediate 'copy into @test_row_stage/' || :file_name || ' from
            (select * from temp_tab_split_members where member_id=' || :member_id || ') overwrite=false';
    end for;
    RETURN 0;
END;
$$
;
+-----------------+
| anonymous block |
|-----------------|
| 0 |
+-----------------+
Check stage contents after procedure execution
list @test_row_stage; -- output truncated columnwise
+----------------------------------------+------+
| name                                   | size |
|----------------------------------------+------+
| test_row_stage/load_1.csv_0_0_0.csv.gz | 48   |
| test_row_stage/load_2.csv_0_0_0.csv.gz | 48   |
| test_row_stage/load_3.csv_0_0_0.csv.gz | 48   |
| test_row_stage/load_4.csv_0_0_0.csv.gz | 48   |
+----------------------------------------+------+
File contents cross-check
select $1,$2,$3 from @test_row_stage/load_1.csv_0_0_0.csv.gz union
select $1,$2,$3 from @test_row_stage/load_2.csv_0_0_0.csv.gz union
select $1,$2,$3 from @test_row_stage/load_3.csv_0_0_0.csv.gz union
select $1,$2,$3 from @test_row_stage/load_4.csv_0_0_0.csv.gz;
+----+----+------------+
| $1 | $2 | $3 |
|----+----+------------|
| 0 | 1 | my name-0 |
| 4 | 1 | my name-4 |
| 8 | 1 | my name-8 |
| 1 | 2 | my name-1 |
| 5 | 2 | my name-5 |
| 9 | 2 | my name-9 |
| 2 | 3 | my name-2 |
| 6 | 3 | my name-6 |
| 10 | 3 | my name-10 |
| 3 | 4 | my name-3 |
| 7 | 4 | my name-7 |
| 11 | 4 | my name-11 |
+----+----+------------+
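The cursor loop above can be modelled in plain Python (a rough, illustrative analogue, not Snowflake code): gather the distinct member_ids, then write one CSV per value, the way each COPY INTO @test_row_stage/load_<member_id>.csv call does for its partition.

```python
import csv
import io

# Sample rows: (seq_id, member_id, name), mirroring temp_tab_split_members
rows = [(0, 1, "my name-0"), (4, 1, "my name-4"), (1, 2, "my name-1")]

def export_per_member(rows):
    """One file per distinct member_id, named load_<member_id>.csv,
    holding that member's rows -- the dict stands in for the stage."""
    files = {}
    for member_id in sorted({r[1] for r in rows}):
        buf = io.StringIO()
        writer = csv.writer(buf)
        for r in rows:
            if r[1] == member_id:
                writer.writerow(r)
        files[f"load_{member_id}.csv"] = buf.getvalue()
    return files

files = export_per_member(rows)
```

In the real procedure the filename string is built the same way and passed to COPY INTO via EXECUTE IMMEDIATE; Snowflake then appends its own `_0_0_0.csv.gz` suffix per file.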
There is no out-of-the-box option for this as far as I know, but you can write custom code that fetches the values and uses them to name the files when copying them to the stage/S3.
Please refer below for something similar -
EXECUTE IMMEDIATE $$
DECLARE
    company varchar2(30);
    BU varchar2(30);
    eval_desc varchar2(30);
    member_id varchar2(30);
    file_name varchar2(30);
    c1 CURSOR FOR SELECT * FROM test_pivot;
BEGIN
    for record in c1 do
        company := record.company;
        BU := record.BU;
        eval_desc := record.eval_desc;
        member_id := record.member_id;
        file_name := 'load' || '_' || member_id || '.csv';
        create or replace temporary table temp_test_pvt(company varchar2(30), BU varchar2(30), eval_desc varchar2(30), member_id varchar2(30));
        insert into temp_test_pvt values (:company, :bu, :eval_desc, :member_id);
        execute immediate 'copy into @test_row_stage/' || :file_name || ' from (select * from temp_test_pvt) overwrite=false';
    end for;
    RETURN 0;
END;
$$
;
Also, see a similar post here -
Copy JSON data from Snowflake into S3

Count of current values in column

I'd like to know the count of certain values in a column.
+--------------+---------------------+--+------------+------------+
| | | | 1st column | 2nd column |
+--------------+---------------------+--+------------+------------+
| How many 'A' | (there should be 1) | | A | B |
+--------------+---------------------+--+------------+------------+
| How many 'B' | (there should be 2) | | B | B |
+--------------+---------------------+--+------------+------------+
| | | | B | A |
+--------------+---------------------+--+------------+------------+
I'm trying to do this in Google Sheets.
try:
=QUERY({D2:D; E2:E},
"select Col1,count(Col1)
where Col1 is not null
group by Col1
label count(Col1)''", 0)
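The QUERY formula is essentially a grouped count over the two stacked ranges. As a cross-check, here is a minimal Python sketch of the same aggregation (sample values hard-coded; column names are illustrative):

```python
from collections import Counter

col1 = ["A", "B", "B"]   # 1st column (D2:D)
col2 = ["B", "B", "A"]   # 2nd column (E2:E)

# {D2:D; E2:E} stacks the ranges; "where Col1 is not null" skips blanks;
# "group by Col1" with count(Col1) is a per-value tally.
counts = Counter(v for v in col1 + col2 if v)
```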

Getting values from a table that's inside a table (unpivot / cross apply)

I'm having a serious problem with one of my import tables. I've imported an Excel file to a SQL Server table. The table ImportExcelFile now looks like this (simplified):
+----------+-------------------+-----------+------------+--------+--------+-----+---------+
| ImportId | Excelfile | SheetName | Field1 | Field2 | Field3 | ... | Field10 |
+----------+-------------------+-----------+------------+--------+--------+-----+---------+
| 1 | C:\Temp\Test.xlsx | Sheet1 | Age / Year | 2010 | 2011 | | 2018 |
| 2 | C:\Temp\Test.xlsx | Sheet1 | 0 | Value1 | Value2 | | Value9 |
| 3 | C:\Temp\Test.xlsx | Sheet1 | 1 | Value1 | Value2 | | Value9 |
| 4 | C:\Temp\Test.xlsx | Sheet1 | 2 | Value1 | Value2 | | Value9 |
| 5 | C:\Temp\Test.xlsx | Sheet1 | 3 | Value1 | Value2 | | Value9 |
| 6 | C:\Temp\Test.xlsx | Sheet1 | 4 | Value1 | Value2 | | Value9 |
| 7 | C:\Temp\Test.xlsx | Sheet1 | 5 | NULL | NULL | | NULL |
+----------+-------------------+-----------+------------+--------+--------+-----+---------+
I now want to insert those values from Field1 to Field10 into the table AgeYear (in my original table there are about 70 columns and 120 rows). The first row (Age / Year, 2010, 2011, ...) is the header row, and the column Field1 is the leading column. I want to save the values in the following format:
+-----------+-----+------+--------+
| SheetName | Age | Year | Value |
+-----------+-----+------+--------+
| Sheet1 | 0 | 2010 | Value1 |
| Sheet1 | 0 | 2011 | Value2 |
| ... | ... | ... | ... |
| Sheet1 | 0 | 2018 | Value9 |
| Sheet1 | 1 | 2010 | Value1 |
| Sheet1 | 1 | 2011 | Value2 |
| ... | ... | ... | ... |
| Sheet1 | 1 | 2018 | Value9 |
| ... | ... | ... | ... |
+-----------+-----+------+--------+
I've tried the following query:
DECLARE @sql NVARCHAR(MAX) =
';WITH cte AS
(
    SELECT i.SheetName,
           ROW_NUMBER() OVER(PARTITION BY i.SheetName ORDER BY i.SheetName) AS rn,
           ' + @columns + ' -- @columns = ''Field1, Field2, Field3, Field4, ...''
    FROM dbo.ImportExcelFile i
    WHERE i.Sheetname LIKE ''Sheet1''
)
SELECT SheetName,
       age Age,
       y.[Year]
FROM cte
CROSS APPLY
(
    SELECT Field1 age
    FROM dbo.ImportExcelFile
    WHERE SheetName LIKE ''Sheet1''
    AND ISNUMERIC(Field1) = 1
) a (age)
UNPIVOT
(
    [Year] FOR [Years] IN (' + @columns + ')
) y
WHERE rn = 1'
EXEC (@sql)
So far I'm getting the desired ages and years. My problem is that I don't know how to get the values. With UNPIVOT I don't get the NULL values; instead it fills the whole table with the same values even when they are NULL in the source table.
Could you please help me?
Perhaps an alternative approach. This is not dynamic, but with the help of a CROSS APPLY and a JOIN...
The drawback is that you'll have to define the 70 fields.
Example
;with cte0 as (
Select A.ImportId
,A.SheetName
,Age = A.Field1
,B.*
From ImportExcelFile A
Cross Apply ( values ('Field2',Field2)
,('Field3',Field3)
,('Field10',Field10)
) B (Item,Value)
)
,cte1 as ( Select * from cte0 where ImportId=1 )
Select A.SheetName
,[Age] = try_convert(int,A.Age)
,[Year] = try_convert(int,B.Value)
,[Value] = A.Value
From cte0 A
Join cte1 B on A.Item=B.Item
Where A.ImportId>1
Returns the unpivoted (SheetName, Age, Year, Value) rows.
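For intuition, here is a plain-Python sketch of the same idea (names and sample data are illustrative, and only a subset of the fields is listed): unpivot every row into (field, value) pairs, treat the ImportId 1 row as the header that supplies the years, and join the data rows to it by field name.

```python
# Subset of the ~70 field columns; the real query lists them all
fields = ["Field2", "Field3", "Field10"]

rows = [
    {"ImportId": 1, "SheetName": "Sheet1", "Field1": "Age / Year",
     "Field2": "2010", "Field3": "2011", "Field10": "2018"},
    {"ImportId": 2, "SheetName": "Sheet1", "Field1": "0",
     "Field2": "Value1", "Field3": "Value2", "Field10": "Value9"},
]

def unpivot(rows, fields):
    # cte1: the header row (ImportId 1) maps each field name to its year
    header = {f: r[f] for r in rows if r["ImportId"] == 1 for f in fields}
    # cte0 + join: each data row's (field, value) pair picks up its year
    return [
        {"SheetName": r["SheetName"], "Age": int(r["Field1"]),
         "Year": int(header[f]), "Value": r[f]}
        for r in rows if r["ImportId"] > 1 for f in fields
    ]

result = unpivot(rows, fields)
```

The join on the field name is what replaces UNPIVOT's positional matching, which is why the NULL-dropping behaviour of UNPIVOT no longer matters here.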

SQL SELECT and INSERT/UPDATE, PIVOT?

I need to take data from a table that looks like this:
name | server | instance | version | user
----------|----------|------------|----------|----------
package_a | x | 1 | 1 | AB
package_b | x | 1 | 1 | TL
package_a | x | 2 | 4 | SK
package_a | y | 1 | 2 | MD
package_c | y | 1 | 4 | SK
package_b | y | 2 | 1 | SK
package_a | y | 2 | 1 | TL
package_b | x | 2 | 3 | TL
package_c | x | 2 | 1 | TL
and I need to put it in a table like that:
name | v_x_1 | u_x_1 | v_x_2 | u_x_2 | v_y_1 | u_y_1 | v_y_2 | u_y_2
----------|-------|-------|-------|-------|-------|-------|-------|-------
package_a | 1 | AB | 4 | SK | 2 | MD | 1 | TL
package_b | 1 | TL | 3 | TL | NULL | NULL | 1 | SK
package_c | NULL | NULL | 1 | TL | 4 | SK | NULL | NULL
I already tried INSERT with (SUB)SELECT, tried to INSERT the package names first using DISTINCT and UPDATE afterwards, and played around with PIVOT and the like.
But I'm rather new to SQL and programming in general, so I couldn't come up with a solution. Since I have not only a version number in the source table but also nvarchar columns, it seems like PIVOT won't be the way to go, right?
You can use PIVOT on a sub query that uses UNION to separate the user and version values.
insert into YourNewTable (name, [v_x_1],[u_x_1],[v_x_2],[u_x_2],[v_y_1],[u_y_1],[v_y_2],[u_y_2])
select *
from (
select name, cast([version] as varchar(30)) as value, concat('v_',[server],'_',instance) as title from YourTable
union all
select name, [user] as value, concat('u_',[server],'_',instance) as title from YourTable
) q
pivot (max(value) FOR title IN (
[v_x_1],[u_x_1],[v_x_2],[u_x_2],[v_y_1],[u_y_1],[v_y_2],[u_y_2]
)
) pvt;
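The union-then-pivot idea can be modelled in plain Python (illustrative, not T-SQL): emit one long entry per source row for the version and one for the user, titled v_<server>_<instance> and u_<server>_<instance>, then collect them into one wide row per name.

```python
rows = [
    {"name": "package_a", "server": "x", "instance": 1, "version": 1, "user": "AB"},
    {"name": "package_a", "server": "x", "instance": 2, "version": 4, "user": "SK"},
]

def pivot(rows):
    wide = {}
    for r in rows:
        suffix = f"{r['server']}_{r['instance']}"
        row = wide.setdefault(r["name"], {"name": r["name"]})
        # The UNION ALL: one long row for the version (cast to text,
        # since PIVOT needs one value type), one for the user
        row[f"v_{suffix}"] = str(r["version"])
        row[f"u_{suffix}"] = r["user"]
    return wide

wide = pivot(rows)
```

Casting the version to text is the same trick as the `cast([version] as varchar(30))` in the query: it lets one MAX(value) pivot cover both the numeric and the nvarchar columns.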

How to create a cross tab (in crystal) from multiple columns (in sql)

I have 5 columns in SQL that I need to turn into a cross tab in Crystal.
This is what I have:
Key | RELATIONSHIP | DISABLED | LIMITED | RURAL | IMMIGRANT
-----------------------------------------------------------------
1 | Other Dependent | Yes | No | No | No
2 | Victim/Survivor | No | No | No | No
3 | Victim/Survivor | Yes | No | No | No
4 | Child | No | No | No | No
5 | Victim/Survivor | No | No | No | No
6 | Victim/Survivor | No | No | No | No
7 | Child | No | No | No | No
8 | Victim/Survivor | No | Yes | Yes | Yes
9 | Child | No | Yes | Yes | Yes
10 | Child | No | Yes | Yes | Yes
This is what I want the cross tab to look like (Distinct count on Key):
| Victim/Survivor | Child | Other Dependent | Total |
--------------------------------------------------------------
| DISABLED | 1 | 0 | 1 | 2 |
--------------------------------------------------------------
| LIMITED | 1 | 2 | 0 | 3 |
--------------------------------------------------------------
| RURAL | 1 | 2 | 0 | 3 |
--------------------------------------------------------------
| IMMIGRANT | 1 | 2 | 0 | 3 |
--------------------------------------------------------------
| TOTAL | 4 | 6 | 1 | 11 |
--------------------------------------------------------------
I used this formula in Crystal in an effort to combine 4 columns (field name = {@OTHERDEMO})...
IF {TABLE.DISABLED} = "YES" THEN "DISABLED" ELSE
IF {TABLE.LIMITED} = "YES" THEN "LIMITED" ELSE
IF {TABLE.IMMIGRANT} = "YES" THEN "IMMIGRANT" ELSE
IF {TABLE.RURAL} = "YES" THEN "RURAL"
...then made the cross-tab with #OTHERDEMO as the rows, RELATIONSHIP as the Columns with a distinct count on KEY:
Problem is, once Crystal hits the first "Yes" it stops evaluating, so rows are not categorized correctly in the cross tab. I get a table that counts DISABLED first and displays it correctly, then counts LIMITED and gives some info, but then dumps everything else.
In the past, I have done multiple conditional formulas...
IF {TABLE.DISABLED} = "YES" AND {TABLE.RELATIONSHIP} = "Victim/Survivor" THEN {TABLE.KEY} ELSE {@NULL}
(the @NULL formula is because Crystal, notoriously, gets confused by nulls.)
...then did a distinct count on Key, and finally summed it in the footer.
I am convinced there is another way to do this. Any help/ideas would be greatly appreciated.
If you unpivot those "DEMO" columns into rows it will make the crosstab super easy...
select
u.[Key],
u.[RELATIONSHIP],
u.[DEMO]
from
Table1
unpivot (
[b] for [DEMO] in ([DISABLED], [LIMITED], [RURAL], [IMMIGRANT])
) u
where
u.[b] = 'Yes'
or if you were stuck on SQL2000 compatibility level you could manually unpivot the Yes values...
select [Key], [RELATIONSHIP], [DEMO] = cast('DISABLED' as varchar(20))
from Table1
where [DISABLED] = 'Yes'
union
select [Key], [RELATIONSHIP], [DEMO] = cast('LIMITED' as varchar(20))
from Table1
where [LIMITED] = 'Yes'
union
select [Key], [RELATIONSHIP], [DEMO] = cast('RURAL' as varchar(20))
from Table1
where [RURAL] = 'Yes'
union
select [Key], [RELATIONSHIP], [DEMO] = cast('IMMIGRANT' as varchar(20))
from Table1
where [IMMIGRANT] = 'Yes'
For the crosstab, use a count on the Key column (aka row count), [DEMO] on rows, and [RELATIONSHIP] on columns.
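To see why the unpivot makes the crosstab easy, here is a small Python model (sample rows abridged): keep only the 'Yes' cells, then count keys per (DEMO, RELATIONSHIP) pair, which is exactly what the crosstab tallies.

```python
from collections import Counter

demos = ["DISABLED", "LIMITED", "RURAL", "IMMIGRANT"]
rows = [
    {"Key": 1, "RELATIONSHIP": "Other Dependent", "DISABLED": "Yes",
     "LIMITED": "No", "RURAL": "No", "IMMIGRANT": "No"},
    {"Key": 8, "RELATIONSHIP": "Victim/Survivor", "DISABLED": "No",
     "LIMITED": "Yes", "RURAL": "Yes", "IMMIGRANT": "Yes"},
]

# UNPIVOT + where b = 'Yes': one (demo, relationship) pair per Yes cell,
# so a row with several Yes flags is counted once per matching demo row
crosstab = Counter(
    (demo, r["RELATIONSHIP"])
    for r in rows for demo in demos if r[demo] == "Yes"
)
```

Because a key can appear under several DEMO rows, this avoids Crystal's stop-at-the-first-Yes problem that the nested IFs run into.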
