Drop multiple tables at once - snowflake-cloud-data-platform

In SQL you can drop multiple tables at once with a simple query like:
drop table a, b, c
In Snowflake this doesn't work. Is there a way to drop multiple tables at once?

Maybe you can create a simple SP to do this for you:
create or replace procedure drop_tables(list varchar)
returns string
language javascript
as
$$
var l = LIST.split(',');
var sqls = [];
for (var i=0; i<l.length; i++) {
var sql = "DROP TABLE IF EXISTS " + l[i].trim();
var rs = snowflake.execute( {sqlText: sql});
sqls.push(sql);
}
return JSON.stringify(sqls);
$$;
call drop_tables('mytest,test , my_table ');
+---------------------------------------------------------------------------------------------+
| DROP_TABLES |
|---------------------------------------------------------------------------------------------|
| ["DROP TABLE IF EXISTS mytest","DROP TABLE IF EXISTS test","DROP TABLE IF EXISTS my_table"] |
+---------------------------------------------------------------------------------------------+
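The procedure above concatenates each name straight into the DROP statement, so a malformed entry in the list would be executed as SQL. Below is a minimal sketch of a guard that could run before the loop. It assumes plain unquoted identifiers, optionally qualified by database and schema. The helper name buildDropStatements and the regular expression are illustrative assumptions, not part of the original answer.

```javascript
// Hypothetical helper: turns 'a, b ,c' into validated DROP statements.
// Assumption: names are unquoted identifiers, optionally qualified (db.schema.table).
function buildDropStatements(list) {
  var identifier = /^[A-Za-z_][A-Za-z0-9_$]*(\.[A-Za-z_][A-Za-z0-9_$]*){0,2}$/;
  return list
    .split(',')
    .map(function (name) { return name.trim(); })
    .filter(function (name) { return name.length > 0; }) // skip empty entries such as a trailing comma
    .map(function (name) {
      if (!identifier.test(name)) {
        throw new Error('Not a valid table name: ' + name);
      }
      return 'DROP TABLE IF EXISTS ' + name;
    });
}

console.log(buildDropStatements('mytest,test , my_table '));
```

Inside the procedure, each returned string could then be passed to snowflake.execute({sqlText: ...}) exactly as in the loop above.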

You can do this in two steps: first generate the DDL from the list of tables to be deleted, then run the generated statements.
SELECT 'DROP TABLE ' || table_name || ';'
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_NAME IN ('TAB1', 'TAB2');

Related

Use query return results as DB names, add schema and table, then use them in another query

I have a catalog DB that stores the names of other DBs. These DBs contain the same schema and tables. Now I want to extract all the DB names from the catalog DB and query a specific table in all those DBs.
Here is an example:
catalog DB name: CatalogDB
schema name: schemaExp
table name: tableExp
CatalogDB contains a list of otherDBs, e.g., otherDB1, otherDB2, otherDBXYZ, etc.
So I can get all the other DB names by
select DBName
from CatalogDB;
I can query the table in otherDB1 using the following query
select *
from otherDB1.schemaExp.tableExp;
I want to query the same tableExp in all the other DBs. How can I do that?
EDIT: I am not interested in combining tables since table content can get updated. Is it possible to query the catalog DB, put the returned DB names in a parameter, and then run a query that selects from each DB in that parameter?
So if you wanted to select row count for "all those tables" you could:
create table test.test.cat_tab(cat_db string, cat_schema string, cat_table string, other_dbs array);
create database test_a;
create schema test_a.test;
create table test_a.test.table_exp(val int);
insert into test_a.test.table_exp values (1),(2);
create database test_b;
create schema test_b.test;
create table test_b.test.table_exp(val int);
insert into test_b.test.table_exp values (3);
insert into test.test.cat_tab select 'test', 'test', 'table_exp', array_construct('test_a','test_b');
then dynamically count those great data driven rows:
declare
counts int;
total_counts int := 0;
begin
let c1 cursor for select f.value::text ||'.'|| cat_schema ||'.'|| cat_table as fqn from test.test.cat_tab,table(flatten(input=>other_dbs)) f where cat_table = 'table_exp';
for record in c1 do
let str := 'select count(*) from ' || record.fqn;
execute immediate str;
select $1 into counts from table(result_scan(last_query_id()));
total_counts := total_counts + counts;
end for;
return total_counts;
end;
anonymous block
3
if you want to select from all those tables in a union greatness:
declare
sql string := '';
res resultset;
begin
let c1 cursor for select f.value::text ||'.'|| cat_schema ||'.'|| cat_table as fqn from test.test.cat_tab,table(flatten(input=>other_dbs)) f where cat_table = 'table_exp';
for record in c1 do
if (sql <> '') then
sql := sql || ' union all ';
end if;
sql := sql || 'select * from ' || record.fqn;
end for;
res := (execute immediate :sql);
return table(res);
end;
gives:
VAL
1
2
3

Return number of rows affected by update statement

How can I get the number of rows affected by an update query in a procedure in Snowflake?
Create procedure prc_upd_tables
As
Begin
Var stmt= "select distinct 'update' || Table || '.' || Table_schema || '.' || table_name ||
Set column_cd= 8 where column_cd = 4; as upd from table_name
Sample output
Table1 -24 rows
Table2 - 30 rows
Table3 - 0 rows
Table4 -73 rows
You can try RESULT_SCAN and capture the count returned by the last query. Run it immediately after the UPDATE, because LAST_QUERY_ID(-1) refers to the most recently executed statement:
create table test_hk (fld1 varchar2(10));
insert into test_hk values('12121212');
update test_hk set fld1 = 2;
SELECT $1 FROM TABLE(RESULT_SCAN(LAST_QUERY_ID(-1)));
select * from test_hk;
Your sample looks oversimplified but if it's a JS stored procedure (because I see something like "var stmt="), you can use getNumRowsAffected() method of the Statement object:
https://docs.snowflake.com/en/sql-reference/stored-procedures-api.html#getNumRowsAffected
CREATE OR REPLACE PROCEDURE delete_some_rows()
RETURNS FLOAT
LANGUAGE JAVASCRIPT
AS
$$
var sql_command = "DELETE FROM deneme";
var stmt = snowflake.createStatement( {sqlText: sql_command} );
stmt.execute();
return stmt.getNumRowsAffected()
$$
;
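To get the per-table report the question asks for, the same getNumRowsAffected() call could be made once per table inside a loop. This is only a sketch under assumptions: the function name updateAndReport and the table list are placeholders, the column_cd values are taken from the question, and the snowflake object is passed in as a parameter purely so the logic can be tried outside Snowflake with a stub.

```javascript
// Hypothetical helper: runs one UPDATE per table and reports the affected rows.
// `sf` is the `snowflake` object available inside a JavaScript stored procedure.
function updateAndReport(sf, tableNames) {
  var report = [];
  for (var i = 0; i < tableNames.length; i++) {
    var stmt = sf.createStatement({
      sqlText: 'UPDATE ' + tableNames[i] + ' SET column_cd = 8 WHERE column_cd = 4'
    });
    stmt.execute();
    report.push(tableNames[i] + ' - ' + stmt.getNumRowsAffected() + ' rows');
  }
  return report.join('\n');
}

// Stub standing in for Snowflake so the sketch runs locally.
// The counts are copied from the sample output in the question, not real results.
var fakeCounts = { Table1: 24, Table2: 30, Table3: 0 };
var stub = {
  createStatement: function (cmd) {
    var table = cmd.sqlText.split(' ')[1]; // second word of 'UPDATE <table> SET ...'
    return {
      execute: function () {},
      getNumRowsAffected: function () { return fakeCounts[table]; }
    };
  }
};

console.log(updateAndReport(stub, ['Table1', 'Table2', 'Table3']));
```

Within a real procedure the call would be updateAndReport(snowflake, [...]) and the returned string would be the procedure's return value.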

Combine multiple tables into one in Snowflake

Let's say I have the following monthly tables with table names formatted such that the number after the underscore refers to the month. What I want to do is to combine these 12 tables into one without having to write 10-30 insert/union all statements
table_1
table_2
table_3
table_4
table_5
table_6
table_7
table_8
table_9
table_10
table_11
table_12 -- (only 12 in this instance but could be as many as 36)
My current approach is to first create the master table with data from table_1.
create temporary table master_table_1_12 as
select * -- * to keep it simple for this example
from table_1;
Then use variables such that I can simply keep hitting the run button until it errors out with "table_13 does not exist"
set month_id=(select max(month_id) from master_table_1_12) + 1;
set table_name=concat('table_',$month_id);
insert into master_table_1_12
select *
from identifier($table_name);
Note: All monthly tables have a month_id column
Sure, it saves some space on the console (compared to multiple inserts), but I still have to run it 12 times. Are Snowflake Tasks something I could use for this? I couldn't find a fitting example in their documentation to code that up, but if anyone has had success with that or with a JavaScript-based SP for a problem like this, please enlighten me.
Here's a stored procedure that will insert into master_table_1_12 from selects on table_1 through table_12. Modify as required:
create or replace procedure FILL_MASTER_TABLE()
returns string
language javascript
as
$$
var rows = 0;
for (var i=1; i<=12; i++) {
rows += insertRows(i);
}
return rows + " rows inserted into master_table_1_12.";
// End of main function
function insertRows(i) {
var sql =
`insert into master_table_1_12
select *
from table_${i};`;
return doInsert(sql);
}
function doInsert(queryString) {
var stmt = snowflake.createStatement({sqlText: queryString});
var rs = stmt.execute();
rs.next();
// an INSERT returns one row whose first column is the number of rows inserted
return rs.getColumnValue(1);
}
$$;
call fill_master_table();
By the way, if you don't have any processing to do and just need to consolidate the tables, you can do something like this:
insert into master_table_1_12
select * from table_1
union all
select * from table_2
union all
select * from table_3
union all
select * from table_4
union all
select * from table_5
union all
select * from table_6
union all
select * from table_7
union all
select * from table_8
union all
select * from table_9
union all
select * from table_10
union all
select * from table_11
union all
select * from table_12
;
Can you not create a view on top of these 12 tables? The view would be a union of all these tables.
Based on the comments below, I have elaborated on my answer. Please try this approach. In my experience, keeping the data partitioned across separate tables gives better performance when the tables are large.
CREATE TABLE SALES_2000 (REGION VARCHAR, UNITS_SOLD NUMBER);
CREATE TABLE SALES_2001 (REGION VARCHAR, UNITS_SOLD NUMBER);
CREATE TABLE SALES_2002 (REGION VARCHAR, UNITS_SOLD NUMBER);
CREATE TABLE SALES_2003 (REGION VARCHAR, UNITS_SOLD NUMBER);
INSERT INTO SALES_2000 VALUES('ASIA', 25);
INSERT INTO SALES_2001 VALUES('ASIA', 50);
INSERT INTO SALES_2002 VALUES('ASIA', 55);
INSERT INTO SALES_2003 VALUES('ASIA', 65);
CREATE VIEW ALL_SALES AS
SELECT * FROM SALES_2000
UNION ALL
SELECT * FROM SALES_2001
UNION ALL
SELECT * FROM SALES_2002
UNION ALL
SELECT * FROM SALES_2003;
SELECT * FROM ALL_SALES WHERE UNITS_SOLD = 25;
I ended up creating a UDF that spits out a create view statement and a stored procedure that executes it to create a temporary view. I work with tables following specific naming convention, so you might have to tweak this solution a little for your use case. The separation of UDF and stored proc actually helps with that as you'd mostly need to tweak the SQL UDF. I am sharing a simplified version of what I actually have in the interest of keeping it representative of the tables I listed in my question.
SQL UDF FOR GENERATING A CREATE VIEW STATEMENT
create or replace function sandbox.public.define_view(table_pattern varchar, start_month varchar, end_month varchar)
returns table ("" varchar) as
$$
with cte1(month_id) as
(select start_month::int + row_number() over (order by 1) - 1
from table(generator(rowcount=> end_month::int - start_month::int + 1)))
,cte2(month_id,statement) as
(select 0,
concat('create or replace temporary view master_',
split_part(table_pattern,'.',-1),
start_month,
'_',
end_month,
' as ')
union all
select month_id,
concat('select * from ',
table_pattern,
month_id,
case when month_id=end_month::int then ';' else ' union all ' end)
from cte1)
select listagg(statement, '\n') within group (order by month_id) as create_view_statement
from cte2
$$;
PROCEDURE FOR EXECUTING THE OUTPUT OF THE UDF ABOVE
create or replace procedure sandbox.public.create_view(TABLE_PATTERN varchar, START_MONTH varchar,END_MONTH varchar)
returns varchar not null
language Javascript
execute as caller
as
$$
var sql_command = 'select * from table(sandbox.public.define_view(:1, :2, :3))';
var stmt = snowflake.createStatement({sqlText: sql_command ,binds: [TABLE_PATTERN, START_MONTH, END_MONTH]}).execute();
stmt.next();
var ddl = stmt.getColumnValue(1);
var run=snowflake.createStatement({sqlText: ddl}).execute();
run.next();
var message=run.getColumnValue(1);
return "Temporary " + message;
$$;
USAGE DEMO
set table_pattern ='sandbox.public.table_';
set start_month ='1';
set end_month = '12';
set master_view='master_'||split_part($table_pattern,'.',-1)||$start_month||'_'||$end_month;
call create_view($table_pattern, $start_month, $end_month);
select top 100 *
from identifier($master_view);

Finding NULL column names in SQL Server

In SQL Server, is there any quick way to find out null column names for a particular record other than using CASE expression?
For example:
I have one record whose values are like:
id|FName|LName|Dept
1 |NULL |Smith|NULL
Expected result is: FName, Dept
So, every time I will have only one record and I need to find the list of NULL columns for it.
If it's just a result set, then you can't do it in SQL Server itself, but you can write a small query tool in another language to do it, like the C# code below:
void Main(){
var nullColumnsName = GetNullColumnsName("select 1 id,null FName,'Smith' LName ,null Dept");
Console.WriteLine(nullColumnsName); //result : FName,Dept
}
IEnumerable<string> GetNullColumnsName(string sql)
{
using (var cnn = Connection)
{
if(cnn.State == ConnectionState.Closed) cnn.Open();
using(var cmd = cnn.CreateCommand()){
cmd.CommandText = sql;
using(var reader = cmd.ExecuteReader(CommandBehavior.SequentialAccess|CommandBehavior.SingleResult|CommandBehavior.SingleRow))
{
if(reader.Read())
{
for (int i = 0; i < reader.FieldCount; i++)
{
var value = reader.GetValue(i);
if(value is DBNull)
yield return reader.GetName(i);
}
}
while (reader.Read()) {};
}
}
}
}
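The same client-side idea does not depend on C#: once the single row is available as a key/value object, the NULL columns are simply the keys whose value is null. Below is a minimal JavaScript sketch, assuming the row has already been fetched by whatever driver is in use. The function name nullColumnNames and the sample object are illustrative.

```javascript
// Hypothetical helper: returns the names of the columns whose value is null.
function nullColumnNames(row) {
  return Object.keys(row).filter(function (name) {
    return row[name] === null;
  });
}

// Sample row taken from the question.
var row = { id: 1, FName: null, LName: 'Smith', Dept: null };
console.log(nullColumnNames(row).join(', ')); // prints "FName, Dept"
```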
As per your statement: "every time I will have only one record and I need to find the list of NULL columns for it"
You can store these values like follows:
Column | Values
FName | Null
LName | Smith
Dept | Null
I assume that id is the primary key, so you can store it as well if your requirements call for it; otherwise it's your choice. Because Column, table and Values are reserved words, they are bracketed here:
SELECT [Column] FROM [table]
WHERE [Values] IS NULL;
You can use the following script. The script:
opens a cursor over all columns in the table definition
then, for each column, it queries all values in the column
if there is no non-null value in the column, it prints the name of the column
create table tmpTable (id int, FName varchar(32), LName varchar(32), Dept varchar(32))
insert into tmpTable values(1, null, 'Smith', null)
go
declare @column varchar(128)
declare @sql varchar(max)
declare c cursor for
select c.name from sys.tables t join sys.columns c on t.object_id = c.object_id
where t.name = 'tmpTable'
open c
fetch next from c into @column
while @@fetch_status = 0
begin
select @sql = 'if not exists (select * from tmpTable where ' + @column + ' is not null) begin print ''' + @column + ''' end'
exec(@sql)
fetch next from c into @column
end
close c
deallocate c
go
drop table tmpTable
For your test case, this is the output of the script:
FName
Dept
Note: This script will work regardless how many rows you have in the table.
Edit:
I had written my answer before the following clarification was added "It is a result set". My solution works with a table (as I am querying the system catalogs) and will not cover the clarified scenario (using a result set).

UNION ALL on all tables starting with a certain string

I would like to combine tables starting with the same name into one table.
For example let's say I have a database with tables 'EXT_ABVD', 'EXT_ADAD','EXT_AVSA','OTHER', and I want to combine all tables beginning with 'EXT_', I would want the result of
select col1 ,col2 from EXT_ABVD
union all
select col1 ,col2 from EXT_ADAD
union all
select col1 ,col2 from EXT_AVSA;
I would like to do this on a regular basis (daily for example), and every time this runs there may be new tables starting with 'EXT_'. I don't want to update the union_all query manually.
I am new to Snowflake and don't know how to do that. Can I use a script inside Snowflake?
Given these tables:
CREATE TABLE TEST_DB.PUBLIC.EXT_ABVD (col1 INTEGER, col2 INTEGER);
CREATE TABLE TEST_DB.PUBLIC.EXT_ADAD (col1 INTEGER, col2 INTEGER);
CREATE TABLE TEST_DB.PUBLIC.EXT_ADAQ (col1 INTEGER, col2 INTEGER);
A view like this could be dynamically created:
CREATE OR REPLACE VIEW TEST_DB.PUBLIC.union_view AS
SELECT * FROM TEST_DB.PUBLIC.EXT_ABVD
UNION ALL
SELECT * FROM TEST_DB.PUBLIC.EXT_ADAD
UNION ALL
SELECT * FROM TEST_DB.PUBLIC.EXT_ADAQ
Using this Procedure:
create or replace procedure TEST_DB.PUBLIC.CREATE_UNION_VIEW(TBL_PREFIX VARCHAR)
returns VARCHAR -- return final create statement
language javascript
as
$$
// build query to get tables from information_schema
var get_tables_stmt = "SELECT Table_Name FROM TEST_DB.INFORMATION_SCHEMA.TABLES \
WHERE TABLE_TYPE = 'BASE TABLE' AND TABLE_NAME LIKE '"+ TBL_PREFIX + "%';"
var get_tables_stmt = snowflake.createStatement({sqlText:get_tables_stmt });
// get result set containing all table names
var tables = get_tables_stmt.execute();
// to control if UNION ALL should be added or not
// this could likely be handled more elegantly but i don't know JavaScript :)
var row_count = get_tables_stmt.getRowCount();
var rows_iterated = 0;
// define view name
var create_statement = "CREATE OR REPLACE VIEW TEST_DB.PUBLIC.union_view AS \n";
// loop over result set to build statement
while (tables.next()) {
rows_iterated += 1;
// we get values from the first (and only) column in the result set
var table_name = tables.getColumnValue(1);
// this will obviously fail if the column count doesn't match
create_statement += "SELECT * FROM TEST_DB.PUBLIC." + table_name
// add union all to all but last row
if (rows_iterated < row_count){
create_statement += "\n UNION ALL \n"
}
}
// create the view
var create_statement = snowflake.createStatement( {sqlText: create_statement} );
create_statement.execute();
// return the create statement as text
return create_statement.getSqlText();
$$
;
Which we would call like this: CALL CREATE_UNION_VIEW('EXT_A');
This is just a basic example so logic for column counts, schemas etc. likely needs to be added. But given this I think you will be able to figure out how to deal with result sets, parameters and statements.
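The counter that decides where UNION ALL goes can be avoided by collecting one SELECT per table in an array and joining them. The following is a sketch, assuming the table names have already been read from INFORMATION_SCHEMA. The function name buildUnionView and the explicit column list (which sidesteps the column-count mismatch mentioned in the code comments) are illustrative additions, not part of the procedure above.

```javascript
// Hypothetical helper: builds the CREATE VIEW text from a list of table names.
// Assumption: every table has the listed columns and the names are already trusted.
function buildUnionView(viewName, schemaPrefix, tableNames, columns) {
  if (tableNames.length === 0) {
    // avoid executing a CREATE VIEW with an empty body
    throw new Error('No tables matched the prefix');
  }
  var selects = tableNames.map(function (table) {
    return 'SELECT ' + columns.join(', ') + ' FROM ' + schemaPrefix + '.' + table;
  });
  return 'CREATE OR REPLACE VIEW ' + viewName + ' AS\n' + selects.join('\nUNION ALL\n');
}

console.log(buildUnionView(
  'TEST_DB.PUBLIC.union_view',
  'TEST_DB.PUBLIC',
  ['EXT_ABVD', 'EXT_ADAD', 'EXT_ADAQ'],
  ['col1', 'col2']
));
```

Joining the array places the separator only between elements, so no row count is needed, and the empty-list check stops the procedure before it runs an incomplete statement.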
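Edit: See here for how to set up a task that would run a procedure on a daily basis. The most basic version would in this case look like this (note that a newly created task is suspended, so run ALTER TASK create_union_task RESUME afterwards):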
create or replace task create_union_task
warehouse = COMPUTE_WH
schedule = '1440 minute' -- once every day
as
CALL CREATE_UNION_VIEW('EXT_A');
The only way you can achieve this currently is via a Snowflake Stored Procedure.
You don't specify how you want to consume the result of the query, but a convenient way is via a VIEW. So the Stored Procedure has to generate a VIEW definition containing the query in your question.

Resources