How can I replace a db name using a variable in Snowflake?

I can use variables for query filtering conditions. E.g.,
set mytime = '2021-12-12 09:00:00';
select col1
from db1.schema.table1
where event_time > $mytime
However, if I use the same approach to replace db1 with a variable, it does not work:
set mytime = '2021-12-12 09:00:00';
set db_name = 'db1';
select col1
from $db_name.schema.table1
where event_time > $mytime

If your variable holds a fully qualified name (database.schema.table), the IDENTIFIER function can help.
create table test.test.db1(id number);
set db_name = 'test.test.db1';
insert into test.test.db1 values (1),(2),(3);
then this works:
select id
from identifier ($db_name);
ID
1
2
3
but composing the string on the fly does not presently work (here just_db_name is a session variable assumed to hold the database name, e.g. 'test'):
select id
from identifier ($just_db_name||'.test.db1');
but you can do this in two steps:
set fqn_db_name = $just_db_name||'.test.db1';
select id
from identifier ($fqn_db_name);
ID
1
2
3
Snowflake Scripting:
Using snowflake scripting, it can be done as a single "statement", like so:
begin
let fqn text := $just_db_name || '.test.db1';
let res resultset := (select id from identifier(:fqn));
return table(res);
end;
ID
1
2
3
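If the name is assembled outside SQL, for example in a Snowflake JavaScript stored procedure or in client code, the same two-step idea applies: build the complete fully qualified name first, then hand the finished string to IDENTIFIER. Below is a minimal sketch of the first step. The helper qualifyName is hypothetical (it is not part of any Snowflake API), and it assumes that only plain unquoted identifiers are allowed:

```javascript
// Hypothetical helper (not a Snowflake API): joins database, schema and
// table into one fully qualified name. It rejects anything that is not a
// plain unquoted identifier (letters, digits, underscores, dollar signs,
// not starting with a digit).
function qualifyName(db, schema, table) {
  var plain = /^[A-Za-z_][A-Za-z0-9_$]*$/;
  var parts = [db, schema, table];
  for (var i = 0; i < parts.length; i++) {
    if (typeof parts[i] !== "string" || !plain.test(parts[i])) {
      throw new Error("not a plain identifier: " + String(parts[i]));
    }
  }
  return parts.join(".");
}

// The resulting string is what would be stored in the session variable
// (or bound as a parameter) before calling IDENTIFIER on it.
var fqn = qualifyName("test", "test", "db1");
console.log(fqn);
```

Validating each part before joining keeps a stray quote or semicolon from ending up in the object name.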

Related

Dynamic query with variable value from a table in SSIS 2015

I have Table A that has only one row :
CODE | DATE
202211 | 2022-11
This table updates itself automatically at the end of every month (e.g., next month it will change to 202212 and 2022-12).
I want to use 'CODE' and 'DATE' to make my query dynamic, using variables and an Execute SQL Task in SSIS.
My original query looks like this:
SELECT * FROM X
WHERE PERIOD = '202211', EXPDATE > '2022-11'
I want to make it so that whenever Table A changes, I don't have to change the query too.
This is what I tried already :
DECLARE @Period varchar(50)
DECLARE @Expdate varchar(50)
SET @Period = ?
SET @Expdate = ?
SELECT * FROM X
WHERE PERIOD = @Period, EXPDATE > @Expdate
When I try to run it using '?' just as the documentation says, it doesn't work, but it runs when I replace the '?' with hard-coded values, so I'm pretty sure at least my query works. Am I missing something, or am I setting the variables wrong?
This is my variable settings
Name | Scope | Data type | Value | Expression
position | MyDtsx | String | |
date | MyDtsx | String | |
This is my SQL Task setting
General
__________________________
Result Set = Single Row
SQLSourceType = Direct Input
SQLStatement = 'SELECT CODE as position, DATE as date FROM A'
Result Set
__________________________
Result Name | Variable Name
position | User::position
date | User::date
You have invalid syntax
SELECT *
FROM X
WHERE PERIOD = @Period, EXPDATE > @Expdate
Make that
SELECT *
FROM X
WHERE PERIOD = @Period AND EXPDATE > @Expdate
However, if table A (the source of Period and ExpDate) is in the same database as table X, skip the extraneous Execute SQL Task and the variables and just use this query:
SELECT X.*
FROM X
WHERE EXISTS (SELECT *
FROM A
WHERE X.PERIOD = A.Code AND X.EXPDATE > A.[Date]);
The comment indicates
Table A and Table X are from different databases. Sadly the query is actually fine, since when I change it to '@Period = '202211'' and '@Expdate = '2022-11'' it runs.
A working repro.
My working notes for those following along at home, as the name changes messed me up a few times.
Table A's Code = SSIS Variable position = Table X's Period
Table A's Date = SSIS Variable date = Table X's ExpDate
SQL Setup
Execute sql task to ensure I have data
drop table if exists dbo.so_74394641;
create table dbo.so_74394641
(
Col1 bigint, Period varchar(50), ExpDate varchar(50)
);
insert into dbo.so_74394641
SELECT row_number() over (order by (SELECT NULL)) AS col1, '202211', '2022-12'
FROM sys.all_objects;
SQL Get Values
Execute sql task. Hard coded as I didn't want to create another table in a different database
SELECT '202211' AS Code, '2022-11' AS [Date];
DFT Get Data
A data flow. OLE DB Source component using the following query
DECLARE @Period varchar(50)
DECLARE @Expdate varchar(50)
SET @Period = ?
SET @Expdate = ?
SELECT * FROM dbo.SO_74394641 AS X
WHERE PERIOD = @Period and EXPDATE > @Expdate;
Parameters mapped
0 User::position Input
1 User::date Input
Control flow
Data Flow

How do I parametrize Lua script to go through table values executing queries

I'm new to Lua but trying.
I have multiple "Create table" queries that I need to execute; the only things that change are the schema and table names.
At the moment I am explicitly defining each query.
I want to parametrize the Lua script from the table below, passing the table name as an argument, since there are 100+ tables that need to be generated this way.
MappingTable
targetSchema | targetTable | originSchema | originTable
schema1 | table1 | schema3 | table3
schema2 | table2 | schema4 | table4
Current solution
CREATE LUA SCRIPT "ScriptName" () RETURNS ROWCOUNT AS
query([[
Create or replace table schema1.table1 as
select * from schema3.table3;
]])
query([[
Create or replace table schema2.table2 as
select * from schema4.table4;
]])
What I've tried:
CREATE OR REPLACE LUA SCRIPT "ScriptName"('MappingTable') RETURNS ROWCOUNT AS
map_table = execute[[ SELECT * FROM .."'MappingTableName'"..;]] -- passing argument of the script, mapping table name
-- passing values from the columns
load = [[Create or replace table ]]..
[[']]..targetSchema..[['.']]..
[[']]..targetTable..]]..
[[as select * from]]..
[[']]..originSchema..[['.']]..
[[']]..originTable..[[']]
Not sure about the syntax, also I guess I need to loop through the values of the table.
Thank you
Here is a sample script:
create or replace lua script ScriptName (
t_MappingTable
, s_ConditionColumn
, s_ConditionValue
)
returns rowcount as
-- passing argument of the script, mapping table name
local map_table = query ([[
select * from ::MappingTable where ::ConditionColumn = :ConditionValue
]],{
MappingTable = t_MappingTable
, ConditionColumn = s_ConditionColumn
, ConditionValue = s_ConditionValue
});
-- passing values from the columns
for i = 1, #map_table do
query ([[
create or replace table ::targetSchema.::targetTable as
select * from ::originSchema.::originTable
]],{
targetSchema = map_table[i].TARGETSCHEMA
, targetTable = map_table[i].TARGETTABLE
, originSchema = map_table[i].ORIGINSCHEMA
, originTable = map_table[i].ORIGINTABLE
});
end
/
You may want to read the values from map_table in a different way.
If you have case-sensitive column names:
targetSchema = map_table[i]["targetSchema"]
, targetTable = map_table[i]["targetTable"]
, originSchema = map_table[i]["originSchema"]
, originTable = map_table[i]["originTable"]
If you are sure of the column order and don't want to worry about column names:
targetSchema = map_table[i][1]
, targetTable = map_table[i][2]
, originSchema = map_table[i][3]
, originTable = map_table[i][4]

UNION ALL on all tables starting with a certain string

I would like to combine tables starting with the same name into one table.
For example let's say I have a database with tables 'EXT_ABVD', 'EXT_ADAD','EXT_AVSA','OTHER', and I want to combine all tables beginning with 'EXT_', I would want the result of
select col1 ,col2 from EXT_ABVD
union all
select col1 ,col2 from EXT_ADAD
union all
select col1 ,col2 from EXT_AVSA;
I would like to do this on a regular basis (daily for example), and every time this runs there may be new tables starting with 'EXT_'. I don't want to update the union_all query manually.
I am new to Snowflake and don't know how to do that. Can I use a script inside Snowflake?
Given these tables:
CREATE TABLE TEST_DB.PUBLIC.EXT_ABVD (col1 INTEGER, col2 INTEGER);
CREATE TABLE TEST_DB.PUBLIC.EXT_ADAD (col1 INTEGER, col2 INTEGER);
CREATE TABLE TEST_DB.PUBLIC.EXT_ADAQ (col1 INTEGER, col2 INTEGER);
A view like this could be dynamically created:
CREATE OR REPLACE VIEW TEST_DB.PUBLIC.union_view AS
SELECT * FROM TEST_DB.PUBLIC.EXT_ABVD
UNION ALL
SELECT * FROM TEST_DB.PUBLIC.EXT_ADAD
UNION ALL
SELECT * FROM TEST_DB.PUBLIC.EXT_ADAQ
Using this Procedure:
create or replace procedure TEST_DB.PUBLIC.CREATE_UNION_VIEW(TBL_PREFIX VARCHAR)
returns VARCHAR -- return final create statement
language javascript
as
$$
// build query to get tables from information_schema
var get_tables_stmt = "SELECT Table_Name FROM TEST_DB.INFORMATION_SCHEMA.TABLES \
WHERE TABLE_TYPE = 'BASE TABLE' AND TABLE_NAME LIKE '"+ TBL_PREFIX + "%';"
var get_tables_stmt = snowflake.createStatement({sqlText:get_tables_stmt });
// get result set containing all table names
var tables = get_tables_stmt.execute();
// to control if UNION ALL should be added or not
// this could likely be handled more elegantly but i don't know JavaScript :)
var row_count = get_tables_stmt.getRowCount();
var rows_iterated = 0;
// define view name
var create_statement = "CREATE OR REPLACE VIEW TEST_DB.PUBLIC.union_view AS \n";
// loop over result set to build statement
while (tables.next()) {
rows_iterated += 1;
// we get values from the first (and only) column in the result set
var table_name = tables.getColumnValue(1);
// this will obviously fail if the column count doesnt match
create_statement += "SELECT * FROM TEST_DB.PUBLIC." + table_name
// add union all to all but last row
if (rows_iterated < row_count){
create_statement += "\n UNION ALL \n"
}
}
// create the view
var create_statement = snowflake.createStatement( {sqlText: create_statement} );
create_statement.execute();
// return the create statement as text
return create_statement.getSqlText();
$$
;
Which we would call like this: CALL CREATE_UNION_VIEW('EXT_A');
This is just a basic example so logic for column counts, schemas etc. likely needs to be added. But given this I think you will be able to figure out how to deal with result sets, parameters and statements.
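The row counting used to decide where UNION ALL goes can be avoided by collecting the table names into an array and joining the SELECT statements with the separator. Below is a minimal sketch of that idea as plain JavaScript. The helper name buildUnionView and its parameters are hypothetical, and the table names are hard-coded here instead of being read from the result set:

```javascript
// Hypothetical helper: builds the CREATE VIEW text from a list of table
// names. Joining with the separator places UNION ALL between the
// statements only, so no row counter is needed.
function buildUnionView(viewName, schemaPrefix, tableNames) {
  if (tableNames.length === 0) {
    throw new Error("no tables matched the prefix");
  }
  var selects = tableNames.map(function (name) {
    return "SELECT * FROM " + schemaPrefix + "." + name;
  });
  return "CREATE OR REPLACE VIEW " + viewName + " AS\n" +
    selects.join("\nUNION ALL\n");
}

// Example with the tables from this answer.
var sql = buildUnionView(
  "TEST_DB.PUBLIC.union_view",
  "TEST_DB.PUBLIC",
  ["EXT_ABVD", "EXT_ADAD", "EXT_ADAQ"]
);
console.log(sql);
```

Inside the procedure, the array would presumably be filled in the while (tables.next()) loop with tables.getColumnValue(1), and the returned text passed to snowflake.createStatement as before. The empty-list check also prevents creating a view with no SELECT when nothing matches the prefix.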
Edit: See here for how to set up a task that would run a procedure on daily basis. The most basic would in this case look like this:
create or replace task create_union_task
warehouse = COMPUTE_WH
schedule = '1440 minute' -- once every day
as
CALL CREATE_UNION_VIEW('EXT_A');
The only way you can achieve this currently is via a Snowflake Stored Procedure.
You don't specify how you want to consume the result of the query, but a convenient way is via a VIEW. So the Stored Procedure has to generate a VIEW definition containing the query in your question.

T-SQL Check if list has values, select and Insert into Table

I'm quite new to T-SQL and currently struggling with an insert statement in my stored procedure: I use as a parameter in the stored procedure a list of ids of type INT.
If the list is NOT empty, I want to store the ids into the table Delivery.
To pass the list of ids, I use a table type:
CREATE TYPE tIdList AS TABLE
(
ID INT NULL
);
GO
Maybe you know a better way to pass a list of ids into a stored procedure?
However, my procedure looks as follows:
-- parameter
@DeliveryModelIds tIdList READONLY
...
DECLARE @StoreId INT = 1;
-- Delivery
IF EXISTS (SELECT * FROM @DeliveryModelIds)
INSERT [MyDB].[Delivery] ([DeliveryModelId], [StoreId])
OUTPUT inserted.DeliveryId
SELECT ID FROM @DeliveryModelIds;
If the list has values, I want to store the values into the DB as well as the StoreId which is always 1.
If I insert the DeliveryModelIds 3, 7, 5, the result in table Delivery should look like this:
DeliveryId | StoreId | DeliveryModelId
1 | 1 | 3
2 | 1 | 7
3 | 1 | 5
Do you have an idea on how to solve this issue?
THANKS !
You can add @StoreId to the SELECT that feeds your INSERT.
...
IF EXISTS (SELECT * FROM @DeliveryModelIds)
INSERT [MyDB].[Delivery] ([DeliveryModelId], [StoreId])
OUTPUT inserted.DeliveryId
SELECT ID, @StoreId FROM @DeliveryModelIds;
Additionally, if you only want to insert DeliveryModelId values that do not already exist in the target table, you can use NOT EXISTS in the WHERE clause like so:
...
IF EXISTS (SELECT * FROM @DeliveryModelIds)
INSERT [MyDB].[Delivery] ([DeliveryModelId], [StoreId])
OUTPUT inserted.DeliveryId
SELECT dmi.ID, @StoreId
FROM @DeliveryModelIds dmi
where not exists (
select 1
from MyDb.Delivery i
where i.StoreId = @StoreId
and i.DeliveryModelId = dmi.ID
);
You need to modify the INSERT statement to:
INSERT [MyDB].[Delivery] ([DeliveryModelId], [StoreId])
OUTPUT inserted.DeliveryId
SELECT ID, 1 FROM @DeliveryModelIds;
So you are also selecting a literal, 1, along with the ID field.

Copy records with dynamic column names

I have two tables with different columns in PostgreSQL 9.3:
CREATE TABLE person1(
NAME TEXT NOT NULL,
AGE INT NOT NULL
);
CREATE TABLE person2(
NAME TEXT NOT NULL,
AGE INT NOT NULL,
ADDRESS CHAR(50),
SALARY REAL
);
INSERT INTO person2 (Name, Age, ADDRESS, SALARY)
VALUES ('Piotr', 20, 'London', 80);
I would like to copy records from person2 to person1, but column names can change in program, so I would like to select joint column names in program. So I create an array containing the intersection of column names. Next I use a function: insert into .... select, but I get an error, when I pass the array variable to the function by name. Like this:
select column_name into name1 from information_schema.columns where table_name = 'person1';
select column_name into name2 from information_schema.columns where table_name = 'person2';
select * into cols from ( select * from name1 intersect select * from name2) as tmp;
-- Create array with name of columns
select array (select column_name::text from cols) into cols2;
CREATE OR REPLACE FUNCTION f_insert_these_columns(VARIADIC _cols text[])
RETURNS void AS
$func$
BEGIN
EXECUTE (
SELECT 'INSERT INTO person1 SELECT '
|| string_agg(quote_ident(col), ', ')
|| ' FROM person2'
FROM unnest(_cols) col
);
END
$func$ LANGUAGE plpgsql;
select * from cols2;
array
------------
{name,age}
(1 row)
SELECT f_insert_these_columns(VARIADIC cols2);
ERROR: column "cols2" does not exist
What's wrong here?
You seem to assume that SELECT INTO in SQL would assign a variable. But that is not so.
It creates a new table and its use is discouraged in Postgres. Use the superior CREATE TABLE AS instead. Not least, because the meaning of SELECT INTO inside plpgsql is different:
Combine two tables into a new one so that select rows from the other one are ignored
Concerning SQL variables:
User defined variables in PostgreSQL
Hence you cannot call the function like this:
SELECT f_insert_these_columns(VARIADIC cols2);
This would work:
SELECT f_insert_these_columns(VARIADIC (TABLE cols2 LIMIT 1));
Or cleaner:
SELECT f_insert_these_columns(VARIADIC array) -- "array" being the unfortunate column name
FROM cols2
LIMIT 1;
About the short TABLE syntax:
Is there a shortcut for SELECT * FROM?
Better solution
To copy all rows with columns sharing the same name between two tables:
CREATE OR REPLACE FUNCTION f_copy_rows_with_shared_cols(
IN _tbl1 regclass
, IN _tbl2 regclass
, OUT rows int
, OUT columns text)
LANGUAGE plpgsql AS
$func$
BEGIN
SELECT INTO columns -- proper use of SELECT INTO!
string_agg(quote_ident(attname), ', ')
FROM (
SELECT attname
FROM pg_attribute
WHERE attrelid IN (_tbl1, _tbl2)
AND NOT attisdropped -- no dropped (dead) columns
AND attnum > 0 -- no system columns
GROUP BY 1
HAVING count(*) = 2
) sub;
EXECUTE format('INSERT INTO %1$s(%2$s) SELECT %2$s FROM %3$s'
, _tbl1, columns, _tbl2);
GET DIAGNOSTICS rows = ROW_COUNT; -- return number of rows copied
END
$func$;
Call:
SELECT * FROM f_copy_rows_with_shared_cols('public.person2', 'public.person1');
Result:
rows | columns
-----+---------
3 | name, age
Major points
Note the proper use of SELECT INTO for assignment inside plpgsql.
Note the use of the data type regclass. It allows (optionally) schema-qualified table names and defends against SQL injection attempts:
Table name as a PostgreSQL function parameter
About GET DIAGNOSTICS:
Count rows affected by DELETE
About OUT parameters:
Returning from a function with OUT parameter
The manual about format().
Information schema vs. system catalogs.
