Snowflake IDENTIFIER() not always working

In a Snowflake stored procedure, I'm executing dynamic SQL. I want to pass the table names into the queries using bound variables. Because they are variables, I need to wrap them with IDENTIFIER() (https://docs.snowflake.com/en/sql-reference/identifier-literal.html).
However, this seems to work in some cases and not others.
This works (using JavaScript string interpolation to put the target table into the query, but using bound variables for the TMP table):
var deleteResults = snowflake.execute({sqlText:`
DELETE FROM ${threePartTargetTable} TARGET
USING IDENTIFIER(?) TMP
WHERE TARGET.DOCUMENT_DESCRIPTOR_ID = TMP.DOCUMENT_DESCRIPTOR_ID
AND TARGET.DOCUMENT_DESCRIPTOR_ID NOT IN (SELECT DOCUMENT_DESCRIPTOR_ID FROM IDENTIFIER(?) WHERE END_ON = '12/31/9999 12:00:00 AM');`
,binds:[
tempTableName,
tempTableName
]});
But this (using bound variables for all of the table names) returns
invalid identifier 'TARGET.DOCUMENT_DESCRIPTOR_ID'
var deleteResults = snowflake.execute({sqlText:`
DELETE FROM IDENTIFIER(?) TARGET
USING IDENTIFIER(?) TMP
WHERE TARGET.DOCUMENT_DESCRIPTOR_ID = TMP.DOCUMENT_DESCRIPTOR_ID
AND TARGET.DOCUMENT_DESCRIPTOR_ID NOT IN (SELECT DOCUMENT_DESCRIPTOR_ID FROM IDENTIFIER(?) WHERE END_ON = '12/31/9999 12:00:00 AM');`
,binds:[
threePartTargetTable,
tempTableName,
tempTableName
]});
This is another statement taken from the same procedure, which uses the same bound variable for the target table, and this works fine:
var insertResults = snowflake.execute({sqlText:`INSERT INTO IDENTIFIER(?)
(
DOCUMENT_DESCRIPTOR_ID
, NAME
)
SELECT TMP.DOCUMENT_DESCRIPTOR_ID
, TMP.NAME
FROM ${tempTableName} TMP
WHERE TMP.END_ON = '12/31/9999 12:00:00 AM'
AND TMP.DOCUMENT_DESCRIPTOR_ID NOT IN (SELECT DOCUMENT_DESCRIPTOR_ID FROM IDENTIFIER(?));`
,binds:[
threePartTargetTable,
threePartTargetTable
]});
The documentation doesn't mention any limitations of where IDENTIFIER() can be used. Does anyone know why the second piece of code doesn't work?

The issue appears to be the table alias: when the target table of the DELETE is supplied as a bound variable through IDENTIFIER(), the alias is not resolved properly. The same behavior can be observed with Snowflake Scripting.
Demo:
CREATE OR REPLACE TABLE test1(ID) AS SELECT 1 UNION ALL SELECT 2;
CREATE OR REPLACE TABLE test2(ID) AS SELECT 2;
Using session variables (works):
SET a = 'test1';
SET b = 'test2';
DELETE FROM IDENTIFIER($a) AS TARGET
USING IDENTIFIER($b) AS TMP
WHERE TARGET.ID = TMP.ID;
Using a JavaScript procedure:
create or replace procedure sp_delete(a STRING, b STRING)
returns float
language javascript
as
$$
var deleteResults = snowflake.execute({sqlText:`
DELETE FROM IDENTIFIER(?) TARGET
USING IDENTIFIER(?) TMP
WHERE TARGET.ID = TMP.ID;`
,binds:[A, B]});
$$ ;
CALL sp_delete('test1', 'test2');
SQL compilation error: invalid identifier 'TARGET.ID' At Snowflake.execute
Using Snowflake Scripting:
create or replace procedure sp_delete(a STRING, b STRING)
returns float
language SQL
as
$$
BEGIN
DELETE FROM IDENTIFIER(:A) AS TARGET
USING IDENTIFIER(:B) AS TMP
WHERE TARGET.ID = TMP.ID;
RETURN 0;
END;
$$;
CALL sp_delete('test1', 'test2');
invalid identifier 'TARGET.ID'
The workaround is to rewrite the query to an equivalent form without the alias:
create or replace procedure sp_delete(a STRING, b STRING)
returns float
language javascript
as
$$
var deleteResults = snowflake.execute({sqlText:`
DELETE FROM IDENTIFIER(?)
WHERE ID IN (SELECT TMP.ID FROM IDENTIFIER(?) AS TMP) ;`
,binds:[A, B]});
$$;
For the actual query:
DELETE FROM IDENTIFIER(?) TARGET
USING IDENTIFIER(?) TMP
WHERE TARGET.DOCUMENT_DESCRIPTOR_ID = TMP.DOCUMENT_DESCRIPTOR_ID
AND TARGET.DOCUMENT_DESCRIPTOR_ID NOT IN (SELECT DOCUMENT_DESCRIPTOR_ID
FROM IDENTIFIER(?)
WHERE END_ON = '12/31/9999 12:00:00 AM');
=>
DELETE FROM IDENTIFIER(?)
WHERE DOCUMENT_DESCRIPTOR_ID IN (SELECT DOCUMENT_DESCRIPTOR_ID FROM IDENTIFIER(?))
AND DOCUMENT_DESCRIPTOR_ID NOT IN (SELECT DOCUMENT_DESCRIPTOR_ID
FROM IDENTIFIER(?)
WHERE END_ON = '12/31/9999 12:00:00 AM');
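If rewriting the query without the alias is not practical, the interpolation variant from the question still works. A minimal sketch of how the interpolated name could be checked first, assuming the table name only uses unquoted identifiers (the helper names assertSafeTableName and buildDelete are hypothetical, not part of the Snowflake API):

```javascript
// Validate a 1- to 3-part table name made of unquoted identifiers
// (letters, digits, underscores, dollar signs; first character a letter or underscore).
// Quoted identifiers are deliberately rejected in this sketch.
function assertSafeTableName(name) {
  const part = /^[A-Za-z_][A-Za-z0-9_$]*$/;
  const parts = String(name).split('.');
  if (parts.length > 3 || !parts.every((p) => part.test(p))) {
    throw new Error(`Unsafe table name: ${name}`);
  }
  return name;
}

// Build the DELETE text with the target table interpolated (after validation)
// and the temp table still passed as a bound variable.
function buildDelete(targetTable) {
  const target = assertSafeTableName(targetTable);
  return `
DELETE FROM ${target} TARGET
USING IDENTIFIER(?) TMP
WHERE TARGET.DOCUMENT_DESCRIPTOR_ID = TMP.DOCUMENT_DESCRIPTOR_ID`;
}
```

Inside the procedure, the returned text would be passed as sqlText to snowflake.execute together with binds: [tempTableName]. The validation step is only there because interpolated names bypass the protection that bound variables give.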

Related

How do I parametrize Lua script to go through table values executing queries

I'm new to Lua, but trying.
I have multiple "Create table" queries that I need to execute; the only things that change are the schema and the table name.
At the moment I am explicitly defining each query.
I want to parametrize the Lua script from the table below, passing the table name as an argument, since there are 100+ tables that need to be generated this way.
MappingTable

targetSchema  targetTable  originSchema  originTable
schema1       table1       schema3       table3
schema2       table2       schema4       table4
Current solution
CREATE LUA SCRIPT "ScriptName" () RETURNS ROWCOUNT AS
query([[
Create or replace table schema1.table1 as
select * from schema3.table3;
]])
query([[
Create or replace table schema2.table2 as
select * from schema4.table4;
]])
What I've tried:
CREATE OR REPLACE LUA SCRIPT "ScriptName"('MappingTable') RETURNS ROWCOUNT AS
map_table = execute[[ SELECT * FROM .."'MappingTableName'"..;]] -- passing argument of the script, mapping table name
-- passing values from the columns
load = [[Create or replace table ]]..
[[']]..targetSchema..[['.']]..
[[']]..targetTable..]]..
[[as select * from]]..
[[']]..originSchema..[['.']]..
[[']]..originTable..[[']]
Not sure about the syntax, also I guess I need to loop through the values of the table.
Thank you
Here is a sample script:
create or replace lua script ScriptName (
t_MappingTable
, s_ConditionColumn
, s_ConditionValue
)
returns rowcount as
-- passing argument of the script, mapping table name
local map_table = query ([[
select * from ::MappingTable where ::ConditionColumn = :ConditionValue
]],{
MappingTable = t_MappingTable
, ConditionColumn = s_ConditionColumn
, ConditionValue = s_ConditionValue
});
-- passing values from the columns
for i = 1, #map_table do
query ([[
create or replace table ::targetSchema.::targetTable as
select * from ::originSchema.::originTable
]],{
targetSchema = map_table[i].TARGETSCHEMA
, targetTable = map_table[i].TARGETTABLE
, originSchema = map_table[i].ORIGINSCHEMA
, originTable = map_table[i].ORIGINTABLE
});
end
/
You may want to read the values from map_table in a different way.
If your column names are case-sensitive:
targetSchema = map_table[i]["targetSchema"]
, targetTable = map_table[i]["targetTable"]
, originSchema = map_table[i]["originSchema"]
, originTable = map_table[i]["originTable"]
If you are sure of the column order and don't want to worry about column names:
targetSchema = map_table[i][1]
, targetTable = map_table[i][2]
, originSchema = map_table[i][3]
, originTable = map_table[i][4]

Stored procedure handling multiple SQL statements in Snowflake

I'm creating a stored procedure in Snowflake that will eventually be called by a task.
However I'm getting the following error:
Multiple SQL statements in a single API call are not supported; use one API call per statement instead
I'm not sure how to apply the advised solution within my JavaScript implementation.
Here's what I have
CREATE OR REPLACE PROCEDURE myStoreProcName()
RETURNS VARCHAR
LANGUAGE javascript
AS
$$
var rs = snowflake.execute( { sqlText:
`set curr_date = '2015-01-01';
CREATE OR REPLACE TABLE myTableName AS
with cte1 as (
SELECT
*
FROM Table1
where date = $curr_date
)
,cte2 as (
SELECT
*
FROM Table2
where date = $curr_date
)
select * from
cte1 as 1
inner join cte2 as 2
on(1.key = 2.key)
`
} );
return 'Done.';
$$;
You could write your own helper function (idea from user waldente):
this.executeMany = (s) => s.split(';').map(sqlText => snowflake.createStatement({sqlText}).execute());
executeMany(`set curr_date = '2015-01-01';
CREATE OR REPLACE TABLE ...`);
The last statement should not end with ; because split would then produce an empty trailing statement. The helper may also fail if a ; appears inside a statement where it was not intended as a separator.
You can't have:
var rs = snowflake.execute( { sqlText:
`set curr_date = '2015-01-01';
CREATE OR REPLACE TABLE myTableName AS
...
`
Instead, you need to call execute twice (or more), once for each statement.
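The two caveats of splitting on every ; (a trailing separator and a ; inside a string literal) can be reduced with a slightly more careful splitter. This is only a sketch: splitStatements and the execute callback are hypothetical names, and comments or $$ blocks are not handled.

```javascript
// Split a SQL script into statements on ';', ignoring semicolons that
// appear inside single-quoted string literals. Empty statements are dropped.
function splitStatements(script) {
  const statements = [];
  let current = '';
  let inString = false;
  for (const ch of script) {
    if (ch === "'") {
      // A doubled quote ('') inside a literal toggles twice, so the state is preserved.
      inString = !inString;
    }
    if (ch === ';' && !inString) {
      if (current.trim() !== '') statements.push(current.trim());
      current = '';
    } else {
      current += ch;
    }
  }
  if (current.trim() !== '') statements.push(current.trim());
  return statements;
}

// Run each statement with its own call. `execute` stands in for something like
// (sqlText) => snowflake.execute({ sqlText }) inside a stored procedure.
function executeMany(script, execute) {
  return splitStatements(script).map((sqlText) => execute(sqlText));
}
```

Passing the executor as a parameter keeps the splitter usable outside Snowflake, which makes it easy to try on a sample script before putting it in a procedure.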

select query on a table is hanging when a stored procedure to update the table is run

I have a simple stored procedure to update a table, shown below.
The procedure updates the table properly, but when I execute a select query on the po_tran table, the query hangs.
Is there any mistake in the stored procedure?
alter procedure po_tran_upd @locid char(3)
as
SET NOCOUNT ON;
begin
update t
set t.lastndaysale = (select isnull(sum( qty)*-1, 0)
from exp_tran
where exp_tran.loc_id =h.loc_id and
item_code = t.item_code and
exp_tran.doc_date > dateadd(dd,-30,getdate() )
and exp_tran.doc_type in ('PI', 'IN', 'SR')),
t.stk_qty = (select isnull(sum( qty), 0)
from exp_tran
where exp_tran.loc_id =h.loc_id and
item_code = t.item_code )
from po_tran t, po_hd h
where t.entry_no=h.entry_no and
h.loc_id=@locid and
h.entry_date> getdate()-35
end
;
Try the following possible ways to optimize your procedure.
Read this article, where I explain a similar example that updates a field of a table using a CURSOR.
Important: remove the subquery. As I can see, you have used a subquery to update the field.
You can use a join, or save the result of your query in a variable and use that variable in the update.
e.g.
DECLARE @lastndaysale AS FLOAT
DECLARE @stk_qty AS INT
select @lastndaysale = isnull(sum( qty)*-1, 0) from exp_tran where exp_tran.loc_id =h.loc_id and
item_code = t.item_code and exp_tran.doc_date > dateadd(dd,-30,getdate() ) and exp_tran.doc_type in ('PI', 'IN', 'SR')
select @stk_qty = isnull(sum( qty), 0) from exp_tran where exp_tran.loc_id =h.loc_id and item_code = t.item_code
update t set t.lastndaysale =@lastndaysale,
t.stk_qty = @stk_qty
from po_tran t, po_hd h where t.entry_no=h.entry_no and h.loc_id=@locid and h.entry_date> getdate()-35
This is just a sample; you will need to adapt it to your case.
I added a possibly more performant update; however, I do not fully understand your question. If "any" query is running slowly against po_tran, then I suggest you examine the indexing on that table and ensure it has a proper clustered index. If "this" query is running slowly, then I suggest you look into "covering indexes". The two fields entry_no and item_code seem like good candidates to include in a covering index.
update t
set t.lastndaysale =
CASE WHEN e.doc_date > dateadd(dd,-30,getdate()) AND e.doc_type in ('PI', 'IN', 'SR') THEN
isnull(sum(qty) OVER (PARTITION BY e.loc_id, t.item_code) *-1, 0)
ELSE 0
END,
t.stk_qty = isnull(SUM(qty) OVER (PARTITION BY e.loc_id, t.item_code),0)
from
po_tran t
INNER JOIN po_hd h ON h.entry_no=t.entry_no AND h.entry_date> getdate()-35
INNER JOIN exp_tran e ON e.loc_id = h.loc_id AND e.item_code = t.item_code
where
h.loc_id=@locid

Getting error in sql query "The column 'nextid' was specified multiple times"

I am getting an error when executing this SQL query:
The column 'nextid' is specified multiple times
It is working fine in SQL Server but when I try to run the query on Azure Data flow as source query, I get the error.
SELECT
CASE
WHEN bp.nextid IS NULL
THEN
CASE
WHEN nextid = '100000' THEN '100000'
WHEN nextid = '300000' THEN '300000'
WHEN nextid = '400000' THEN '400000'
ELSE '500000'
END
WHEN bp.nextid IS NOT NULL
THEN bp.nextid
END 'LastBudgetPoolId',
*
FROM
staging.nextid bp
LEFT JOIN
staging.gca ga ON bp.nextid = ga.nextid;
As @larnu said, don't use *. I changed the query to list the column names explicitly, and after that it works fine:
SELECT
CASE
WHEN bp.nextid IS NULL
THEN
CASE
WHEN nextid = '100000' THEN '100000'
WHEN nextid = '300000' THEN '300000'
WHEN nextid = '400000' THEN '400000'
ELSE '500000'
END
WHEN bp.nextid IS NOT NULL
THEN bp.nextid
END 'LastBudgetPoolId',
id,name,username ,nextid
FROM
staging.nextid bp
LEFT JOIN
staging.gca ga ON bp.nextid = ga.nextid;
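The reason * triggers the error is that both joined tables contribute a nextid column, so the expanded projection contains the same name twice. A small sketch of that check follows; the column lists are hypothetical, since the real table definitions are not shown in the question.

```javascript
// Return the column names that occur more than once in a projection,
// compared case-insensitively.
function duplicateColumns(columns) {
  const seen = new Set();
  const dupes = new Set();
  for (const col of columns) {
    const key = col.toLowerCase();
    if (seen.has(key)) dupes.add(key);
    seen.add(key);
  }
  return [...dupes];
}

// Hypothetical column lists for the two joined tables (assumed for illustration).
const bpColumns = ['nextid'];
const gaColumns = ['id', 'name', 'username', 'nextid'];

// What "CASE ... END 'LastBudgetPoolId', *" would expand to under that assumption.
const starExpansion = ['LastBudgetPoolId', ...bpColumns, ...gaColumns];
```

Under these assumed columns, duplicateColumns(starExpansion) reports nextid, which is the name the error message complains about. Listing the columns explicitly, as in the fixed query, removes the duplicate.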

Using Select From table in Insert statement of a merge query

The code below produces this error in SQL:
Incorrect syntax near the keyword 'Select'.
This is my code for the merge, where InstalledSoftwareList is a user-defined table.
MERGE [dbo].[TableName] AS TargetTable
USING UDTableName AS SourceTable
ON (TargetTable.[EId] = SourceTable.[EId]
AND TargetTable.[MId] = SourceTable.[MId]
AND TargetTable.PackageId = (SELECT Id FROM [PackagesDummyTable] SP WHERE SP.[Version] = SourceTable.[Version] AND SP.[Name] = SourceTable.[Name])
)
WHEN NOT MATCHED BY TARGET -- If the records in the Customer table is not matched?-- then INSERT the record
THEN INSERT ([Guid], [PackageId], [CName], [UUID], [MAC], [Date], [isUninstalled], [LastUpdatedDateTime], [DataCapturedTime], [SGuid], [UniqueId], [MId], [EId])
Select SourceTable.Guid,SP.PackageId,SourceTable.CName,SourceTable.UUID,SourceTable.MAC,SourceTable.Date,SourceTable.isUninstalled,GETUTCDATE(),SourceTable.DataCapturedTime,SourceTable.SGuid, SourceTable.UniqueId, SourceTable.MId, SourceTable.EId
FROM [PackagesDummyTable] SP WHERE SP.[Version] = SourceTable.[Version] AND SP.[Name] = SourceTable.[Name];
I was referring to https://msdn.microsoft.com/en-us/library/bb510625.aspx, and my syntax seems to be right.
Can anyone help me with this? I am using SQL Azure.
As @gotqn said, if you only need to process the new data, you can just use an INSERT INTO statement.
If you are required to use MERGE INTO, you can change your script as below:
MERGE [dbo].[TableName] AS TargetTable
USING (
SELECT UN.[EId],UN.[MId],SP.ID ,UN.Guid,SP.PackageId,UN.CName,UN.UUID,UN.MAC,UN.Date,UN.isUninstalled
,UN.DataCapturedTime,UN.SGuid, UN.UniqueId
FROM UDTableName AS UN
INNER JOIN [PackagesDummyTable] SP ON SP.[Version] = UN.[Version] AND SP.[Name] = UN.[Name]
) SourceTable
ON TargetTable.[EId] = SourceTable.[EId]
AND TargetTable.[MId] = SourceTable.[MId]
AND TargetTable.PackageId = SourceTable.Id
WHEN NOT MATCHED BY TARGET -- If the records in the Customer table is not matched?-- then INSERT the record
THEN INSERT ([Guid], [PackageId], [CName], [UUID], [MAC], [Date], [isUninstalled], [LastUpdatedDateTime], [DataCapturedTime], [SGuid], [UniqueId], [MId], [EId])
VALUES( SourceTable.Guid,SourceTable.PackageId,SourceTable.CName,SourceTable.UUID,SourceTable.MAC,SourceTable.Date,SourceTable.isUninstalled,GETUTCDATE(),SourceTable.DataCapturedTime,SourceTable.SGuid, SourceTable.UniqueId, SourceTable.MId, SourceTable.EId)
;
MERGE is nice if you want to do more than one CRUD operation. In this case, we only need to insert new records. Could you try something like this:
INSERT INTO [dbo].[TableName] ([Guid], [PackageId], [CName], [UUID], [MAC], [Date], [isUninstalled], [LastUpdatedDateTime], [DataCapturedTime], [SGuid], [UniqueId], [MId], [EId])
SELECT SourceTable.Guid,SP.PackageId,SourceTable.CName,SourceTable.UUID,SourceTable.MAC,SourceTable.Date,SourceTable.isUninstalled,GETUTCDATE(),SourceTable.DataCapturedTime,SourceTable.SGuid, SourceTable.UniqueId, SourceTable.MId, SourceTable.EId
-- we need these two tables in order to import data
FROM UDTableName AS SourceTable
INNER JOIN [PackagesDummyTable] SP
ON SP.[Version] = SourceTable.[Version]
AND SP.[Name] = SourceTable.[Name]
-- we are joining this table in order to check if there is new data for import
LEFT JOIN [dbo].[TableName] AS TargetTable
ON TargetTable.[EId] = SourceTable.[EId]
AND TargetTable.[MId] = SourceTable.[MId]
-- we are importing only the data that is new
WHERE TargetTable.PackageId IS NULL;
