Snowflake regexp_replace not working as expected - snowflake-cloud-data-platform

I am trying to produce paths whose elements are each enclosed in double quotes (e.g. "path"."to"."element"). The expression should also strip any bracket-enclosed array element references (like "[0]").
var path_name = "regexp_replace(regexp_replace("customers[0].name",'\\[(.+)\\]'),'(\\w+)','"\\1"')" ;
I tried this method, but it raises an error.

So this is a really poorly written question, but let's play the guessing game anyway.
So you have a JavaScript stored procedure with that line inside it, and it doesn't work as you expect. Let's guess it looks like this:
create or replace procedure sp()
returns VARCHAR
language javascript
as
$$
var txt = '"customers[0].name"';
var sql_regexp1 = '\\\\[(.+)\\\\]';
var sql_regexp2 = '(\\\\w+)';
var sql_rep_2 = '\"\\\\1\"';
var full_rep1 = "regexp_replace('" + txt + "','"+ sql_regexp1 +"')";
var full_rep2 = "select regexp_replace(" + full_rep1 + ",'"+ sql_regexp2 +"','"+ sql_rep_2 + "');";
//return full_rep2;
var statement = snowflake.createStatement( {sqlText: full_rep2} );
var result_set1 = statement.execute();
result_set1.next()
return result_set1.getColumnValue(1);
$$;
If you uncomment the early return, you can see the SQL held in full_rep2.
That lets you test the inner SQL on its own:
select regexp_replace('"customers[0].name"','\\[(.+)\\]');
gives:
REGEXP_REPLACE('"CUSTOMERS[0].NAME"','\[(.+)\]')
"customers.name"
Let's assume that's correct!
Then you can check the outer replace:
select regexp_replace(regexp_replace('"customers[0].name"','\\[(.+)\\]'),'(\\w+)','"\\1"');
which gives:
REGEXP_REPLACE(REGEXP_REPLACE('"CUSTOMERS[0].NAME"','\[(.+)\]'),'(\W+)','"\1"')
""customers"."name""
and if we call the stored procedure:
call sp();
we get:
SP
""customers"."name""
So this was "how I debugged the SQL/JavaScript" to get "valid working SQL". The question then becomes: what output did you want, and can you get there from here?
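If the output you wanted was "customers"."name" without the doubled outer quotes, note that the extra quotes come from the double quotes already inside txt, which the second replace leaves in place. One option, sketched below as plain JavaScript, is to do both replacements in JavaScript before any SQL runs. This assumes the path is available as a JavaScript string; quotePath is a made-up name, not something from the original code.

```javascript
// Hypothetical helper: strip [..] array references, then wrap each
// identifier in double quotes. Not part of the original answer.
function quotePath(path) {
  return path
    .replace(/\[[^\]]*\]/g, '')   // drop bracketed references such as [0]
    .replace(/\w+/g, '"$&"');     // $& stands for the matched text
}

console.log(quotePath('customers[0].name')); // "customers"."name"
```

Unlike the greedy '\\[(.+)\\]' pattern above, this sketch removes each bracketed reference separately, so a path with several array references keeps the elements between them.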

Related

Binding value in SHOW USER statement using JS in Snowflake Stored Procedures

It looks like Snowflake doesn't process parameter binding for SHOW USER statements in JS like this.
var sql_cmd = "SHOW USERS LIKE ?;";
var username = "user.name"
var stmt = snowflake.createStatement({sqlText: sql_cmd, binds:[username]});
var users = stmt.execute();
It just gives me an error saying that
SQL compilation error: syntax error line 1 at position 16 unexpected '?'. At Statement.execute, line 14 position 18
How do I make it work?
Are there more accurate docs on what the binds feature supports? I feel like it should support all SQL statements, but according to another thread I found here it doesn't work on CREATE either.
Can you try this as an alternative? Note that to use the SHOW USERS command you must execute the procedure with caller's rights: https://docs.snowflake.com/en/sql-reference/stored-procedures-rights.html#caller-s-rights-stored-procedures
CREATE OR REPLACE PROCEDURE user_bind(username VARCHAR)
RETURNS VARCHAR
LANGUAGE JAVASCRIPT
EXECUTE AS CALLER
AS
$$
var sql_command = "SHOW USERS LIKE '" + USERNAME + "'";
var stmt = snowflake.createStatement( {sqlText: sql_command} );
var result1 = stmt.execute();
result1.next();
return result1.getColumnValue(1);
$$
;
Alternatively, you don't need a parameter; you could just use a variable within the SP:
var username= 'USER1';
var sql_command = "SHOW USERS LIKE '" + username + "'";
Then call the stored procedure:
CALL user_bind ('USER1'); -- version that takes the parameter
CALL user_bind (); -- version that sets the variable inside the SP
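Because the value is concatenated into the SQL text instead of being bound, it is worth escaping it first. The following is a minimal sketch in plain JavaScript; quoteSqlLiteral is a hypothetical helper, not a Snowflake API, and it only covers the single-quote and backslash cases, so treat it as a starting point and not as a complete defense against injection.

```javascript
// Hypothetical helper, not a Snowflake API: wrap a value in single quotes
// and escape the characters that would otherwise end or alter the literal.
function quoteSqlLiteral(value) {
  const escaped = String(value)
    .replace(/\\/g, '\\\\')  // backslash is an escape character in Snowflake string literals
    .replace(/'/g, "''");    // a single quote is written as two single quotes
  return "'" + escaped + "'";
}

const sqlCommand = 'SHOW USERS LIKE ' + quoteSqlLiteral("O'Brien");
console.log(sqlCommand); // SHOW USERS LIKE 'O''Brien'
```

Inside the procedure, the same helper could be applied to USERNAME before building sql_command.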

How to bind JavaScript based date column in Snowflake SQL

I am creating a Snowflake JavaScript-based stored procedure. How can I refer to a date-typed variable in Snowflake SQL?
Here is the sample code.
In the code below, please suggest how I can use the 'dnblatestdt' variable in the SQL statement.
create or replace procedure test_proc_registration_master_perished_dt(PARAM_REG_SUB_UUID VARCHAR)
returns varchar not null
language javascript
as
$$
/*get latest ingestion_uuid for the given state*/
var step01=`select distinct dnb_applicable_dt,ingestion_uuid from temp_registration_hash_master `;
var statement01=snowflake.createStatement( {sqlText: step01,binds: [PARAM_REG_SUB_UUID]} );
variable1= statement01.execute();
variable1.next();
dnblatestdt=variable1.getColumnValue(1);
ingsuuid=variable1.getColumnValue(2);
/* check if the ingestion is successful or not*/
var step02=`select INGESTION_SUCCESSFUL from FILE_INGESTION_HISTORY where ingestion_uuid=:1 and date=:2::TIMESTAMP_LTZ::DATE`;
var statement02=snowflake.createStatement( {sqlText: step02,binds: [ingsuuid,dnblatestdt]} );
variable2= statement02.execute();
variable2.next();
ingsindc=variable2.getColumnValue(1);
return 'success'
$$
So I wrote a much simpler function that uses a similar pattern to your code:
create or replace procedure test_proc()
returns varchar not null
language javascript
as
$$
var step01 = `SELECT 6::number, '2022-01-27'::timestamp_ntz;`;
var statement01 = snowflake.createStatement( {sqlText: step01} );
results1 = statement01.execute();
results1.next();
ingsuuid = results1.getColumnValue(1);
dnblatestdt = results1.getColumnValue(2);
/* check if the ingestion is successful or not*/
var step02=`SELECT :1 * 2, DATEADD(year,-1, :2::timestamp_ntz);`;
var statement02 = snowflake.createStatement( {sqlText: step02,binds: [ingsuuid , dnblatestdt]} );
results2 = statement02.execute();
results2.next();
ingsindc = results2.getColumnValue(1);
return 'success'
$$
;
and using it works for me:
call test_proc();
TEST_PROC
success
I swapped the order in which the two columns are read from the first statement, but that should not be a problem.
This makes me think the casting in your second statement is what's not working:
:2::TIMESTAMP_LTZ::DATE
So I would suggest moving that cast into the first statement, which you can test outside the stored procedure:
SELECT DISTINCT dnb_applicable_dt::TIMESTAMP_LTZ::DATE, ingestion_uuid
FROM temp_registration_hash_master
Once that works, you shouldn't need any casting when the values are used the second time.
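Another option, if the cast in the second statement keeps causing trouble, is to turn the value into a plain date string in JavaScript before binding it. This is a minimal sketch, assuming the value read from the result set behaves like a JavaScript Date and that the UTC calendar date is the one you want; toSqlDate is a made-up name.

```javascript
// Hypothetical helper: format a Date as YYYY-MM-DD using its UTC fields.
function toSqlDate(value) {
  return value.toISOString().slice(0, 10);
}

// Stand-in for the value returned by getColumnValue(1) in the procedure.
const dnblatestdt = new Date(Date.UTC(2022, 0, 27));
console.log(toSqlDate(dnblatestdt)); // 2022-01-27
```

The resulting string could then be passed in binds in place of the date object and compared with something like date = :2::DATE.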

Stored procedure - get anticipated columns before fully executing statement?

I'm working through a stored procedure and wondering if there's a way to retrieve the anticipated result column list from a sql statement before fully executing.
Scenarios:
dynamic SQL
a UDF that might vary the columns outside of our control
EX:
//inbound parameter
SET QUERY_DEFINITION_ID = 12345;
//Initial statement pulls query text from bank of queries
var sqlText = getQueryFromQueryBank(QUERY_DEFINITION_ID);
//now we run our query
var cmd = {sqlText: sqlText };
stmt = snowflake.createStatement(cmd);
What I'd like to be able to do is say "right - before you run this, give me the anticipated column list" so I can compare it to what's expected.
EX:
Expected: [col1, col2, col3, col4]
Got: [col1]
Result: Oops. Don't run.
Rationale here is that I want to short-circuit the execution if something is missing - before it potentially runs for a while. I can validate all of this after the fact, but it would be really helpful to stop early.
Any ideas very much appreciated!
This sample SP code shows how to get a list of columns that a query will project into the result before you run the query. It should only be used for large, long running queries because it will take a few seconds to get the column list.
There are a couple of caveats. 1) It will only return the names of the columns. It won't tell you how they were built, that is, whether they're aliased, direct from a table, calculated, etc. 2) The example query I used is straight from the Snowflake documentation here https://docs.snowflake.com/en/user-guide/sample-data-tpcds.html#functional-query-definition. For convenience, I minimized the query to a single line. The output of the columns includes object qualifiers in addition to the column names, so V1.I_CATEGORY, V1.D_YEAR, V1.D_MOY, etc. If you don't want them to make it easier to compare names, you can strip off the qualifiers using the JavaScript split function on the dot and take index 1 of the resulting array.
create or replace procedure EXPLAIN_BEFORE_RUNNING()
returns string
language javascript
execute as caller
as
$$
// Set the context for the session to the TPC-DS sample data:
executeNonQuery("use schema snowflake_sample_data.tpcds_sf10tcl;");
// Here's a complex query from the Snowflake docs (minimized to one line for convenience):
var sql = `with v1 as( select i_category, i_brand, cc_name, d_year, d_moy, sum(cs_sales_price) sum_sales, avg(sum(cs_sales_price)) over(partition by i_category, i_brand, cc_name, d_year) avg_monthly_sales, rank() over (partition by i_category, i_brand, cc_name order by d_year, d_moy) rn from item, catalog_sales, date_dim, call_center where cs_item_sk = i_item_sk and cs_sold_date_sk = d_date_sk and cc_call_center_sk= cs_call_center_sk and ( d_year = 1999 or ( d_year = 1999-1 and d_moy =12) or ( d_year = 1999+1 and d_moy =1)) group by i_category, i_brand, cc_name , d_year, d_moy), v2 as( select v1.i_category ,v1.d_year, v1.d_moy ,v1.avg_monthly_sales ,v1.sum_sales, v1_lag.sum_sales psum, v1_lead.sum_sales nsum from v1, v1 v1_lag, v1 v1_lead where v1.i_category = v1_lag.i_category and v1.i_category = v1_lead.i_category and v1.i_brand = v1_lag.i_brand and v1.i_brand = v1_lead.i_brand and v1.cc_name = v1_lag.cc_name and v1.cc_name = v1_lead.cc_name and v1.rn = v1_lag.rn + 1 and v1.rn = v1_lead.rn - 1) select * from v2 where d_year = 1999 and avg_monthly_sales > 0 and case when avg_monthly_sales > 0 then abs(sum_sales - avg_monthly_sales) / avg_monthly_sales else null end > 0.1 order by sum_sales - avg_monthly_sales, 3 limit 100;`;
// Before actually running the query, generate an explain plan.
executeNonQuery("explain " + sql);
// Now read the column list from the explain plan from the result set.
var columnList = executeSingleValueQuery("COLUMN_LIST", `select "expressions" as COLUMN_LIST from table(result_scan(last_query_id())) where "operation" = 'Result';`);
// For now, just exit with the column list as the output...
return columnList;
// Your code here...
// Helper functions:
function executeNonQuery(queryString) {
var stmt = snowflake.createStatement({sqlText: queryString});
stmt.execute();
}
function executeSingleValueQuery(columnName, queryString) {
var stmt = snowflake.createStatement({sqlText: queryString});
try{
var rs = stmt.execute();
rs.next();
return rs.getColumnValue(columnName);
}
catch(err) {
if (err.message.substring(0, 18) == "ResultSet is empty"){
throw "ERROR: No rows returned in query.";
} else {
throw "ERROR: " + err.message.replace(/\n/g, " ");
}
}
}
$$;
call Explain_Before_Running();
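The qualifier-stripping and comparison step described in the caveats could look roughly like the sketch below. It assumes the "expressions" value comes back as a comma-separated string such as V1.I_CATEGORY, V1.D_YEAR, V1.D_MOY; the helper names stripQualifiers and missingColumns are made up for illustration.

```javascript
// Hypothetical helpers for comparing the explain output with an expected list.
// Assumes the "expressions" value is a comma-separated string of names.
function stripQualifiers(columnList) {
  return columnList
    .split(',')
    .map(function (name) { return name.trim(); })
    .filter(function (name) { return name.length > 0; })
    .map(function (name) {
      var parts = name.split('.');
      return parts[parts.length - 1]; // last piece is the bare column name
    });
}

// Return the expected names that do not appear in the actual list,
// compared case-insensitively.
function missingColumns(expected, actual) {
  var have = actual.map(function (name) { return name.toUpperCase(); });
  return expected.filter(function (name) {
    return have.indexOf(name.toUpperCase()) === -1;
  });
}

var actual = stripQualifiers('V1.I_CATEGORY, V1.D_YEAR, V1.D_MOY');
var missing = missingColumns(['i_category', 'd_year', 'sum_sales'], actual);
console.log(missing); // sum_sales is the only expected column not found
```

In the procedure, a non-empty result from missingColumns would be the signal to return early instead of running the full query.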

Use of Variable in Snowflake Stored Procedure

I have to add a variable MaxDate in my SQL stored procedure (shown below). The code errors out because MaxDate is not replaced by its value. Any idea how I can pass a variable into the SQL in a stored procedure?
create or replace procedure Load_Employee()
returns varchar not null
language javascript
EXECUTE AS CALLER
as
$$
//Variable Initialization
var IntegrationTable ='EMPLOYEE';
var TypeID=0;
var MaxDate=' ';
var cmd = "Select max(COMPLETED_DATE) from SCHEMA.TABLE where TARGET_TABLE_NAME= " + "'" + IntegrationTable + "'" ;
var sql = snowflake.createStatement({sqlText: cmd});
var result = sql.execute();
result.next();
MaxDate=result.getColumnValue(1);
var cmd=` Insert into PersonTable
select SHA1(concat(Person_id,'|','Person')) ,12345678,SHA1(concat('Payroll','|','Pay','|', Load_Date)) ,current_timestamp() , Tenant
from Schema.PERSONTABLE where Date_Added >= MaxDate
where TYPE='ABC' ;`;
$$
;
If your query to get MaxDate works right, then the value should be in the variable. The problem is that it's not being substituted into the string defining the insert statement.
Since you're using backticks to open and close the string, you can use a special JavaScript notation to replace the variable with its value, ${MaxDate}.
Your definition of the insert statement would look like this:
var cmd=` Insert into PersonTable
select SHA1(concat(Person_id,'|','Person')) ,12345678,SHA1(concat('Payroll','|','Pay','|', Load_Date)) ,current_timestamp() , Tenant
from Schema.PERSONTABLE where Date_Added >= ${MaxDate}
and TYPE='ABC' ;`;
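If that doesn't work, try cutting the SP short with return MaxDate; to see what got assigned to that variable. It's also very helpful to check the query history view to see what SQL actually ran inside a stored procedure. Keep in mind that a query block takes only one where clause, so the TYPE='ABC' condition has to be joined with and, and that a date value generally has to reach the SQL as a quoted literal in a format Snowflake can parse.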
Also, I think this is the same SP that was having an issue with a null return. You'll need to return a string value using something like return 'Success'; or something to avoid getting an error for the null return. That's because of the returns varchar not null in the definition.
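To see the difference the ${...} notation makes, here is a small plain-JavaScript sketch that can be run outside Snowflake. The date value and the shortened query are stand-ins chosen for illustration, and quoting the value as a string literal is an assumption about how the date should reach the SQL.

```javascript
const MaxDate = '2022-01-27'; // stand-in for the value read from the first query

// Without ${...}, the text "MaxDate" is sent to Snowflake literally.
const notSubstituted = `select * from PERSONTABLE where Date_Added >= MaxDate`;

// With ${...}, JavaScript inserts the value; the quotes make it a SQL string literal.
const substituted = `select * from PERSONTABLE where Date_Added >= '${MaxDate}'`;

console.log(notSubstituted);
console.log(substituted);
```

Returning the built string from the procedure, as suggested above for MaxDate, is a quick way to confirm which of the two forms is actually being sent.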

REGEXP_SUBSTR function in stored procedure returns null

I have the following string:
'AAA|BBB||CCC|1.23'
I would like to return: 'CCC|1.23'
When using the regexp: \w+\|\d(.\d+|$) I am able to get the desired results.
Running the following query in Snowflake returns the correct result:
SELECT REGEXP_SUBSTR('AAA|BBB||CCC|1.23', '\\w+\\|\\d(.\\d+|$)') AS regexp_return;
However when used in a stored procedure as follows:
CREATE OR REPLACE PROCEDURE dnr.regexp_issue ()
returns string
language javascript
execute as owner
AS
$$
var sql_statement = `SELECT REGEXP_SUBSTR('AAA|BBB||CCC|1.23', '\\w+\\|\\d(.\\d+|$)') AS regexp_return;`
var query = snowflake.createStatement({sqlText: sql_statement});
var query_res = query.execute();
query_res.next();
result = query_res.getColumnValue(1);
return result;
$$;
The resulting CALL dnr.regexp_issue(); returns a NULL as if no matching pattern was found.
Any ideas?
The backslashes need to be doubled twice (four backslashes for each one the regex should see), because the string goes through two string parsers: JavaScript first, then Snowflake SQL.
CREATE OR REPLACE PROCEDURE regexp_issue ()
returns string
language javascript
execute as owner
AS
$$
var sql_statement = `SELECT REGEXP_SUBSTR('AAA|BBB||CCC|1.23', '\\\\w+\\\\|\\\\d(.\\\\d+|$)') AS regexp_return;`
var query = snowflake.createStatement({sqlText: sql_statement});
var query_res = query.execute();
query_res.next();
result = query_res.getColumnValue(1);
return result;
$$;
call regexp_issue();
gives:
REGEXP_ISSUE
CCC|1.23
To add to Simeon's answer, you can also append .replace(/\\/g, "\\\\") to a string written with double backslashes for Snowflake. That avoids writing quadruple backslashes to satisfy both the JavaScript and the Snowflake SQL escaping rules, and it can make for more legible strings. It would look like this:
var sql_statement = `SELECT REGEXP_SUBSTR('AAA|BBB||CCC|1.23', '\\w+\\|\\d(.\\d+|$)') AS regexp_return;`.replace(/\\/g, "\\\\");
You can also put it in two separate lines for even more clarity and a few microseconds more processing time.
var sql_statement = `SELECT REGEXP_SUBSTR('AAA|BBB||CCC|1.23', '\\w+\\|\\d(.\\d+|$)') AS regexp_return;`
sql_statement = sql_statement.replace(/\\/g, "\\\\");
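The two approaches can be checked against each other in plain JavaScript, outside Snowflake. In the sketch below, sqlLiteralUnescape is a rough, made-up stand-in for the SQL string-literal parser (it only collapses doubled backslashes), and JavaScript's own RegExp stands in for Snowflake's regex engine, so it illustrates the layering and is not an exact emulation.

```javascript
// What JavaScript produces when the source is written with four backslashes.
const quadrupled = '\\\\w+\\\\|\\\\d(.\\\\d+|$)';

// The same text, written with two backslashes and doubled afterwards.
const doubledThenReplaced = '\\w+\\|\\d(.\\d+|$)'.replace(/\\/g, '\\\\');

// Rough stand-in for the SQL string-literal parser: collapse each \\ into \.
function sqlLiteralUnescape(text) {
  return text.replace(/\\\\/g, '\\');
}

const seenByRegexEngine = sqlLiteralUnescape(quadrupled);
const match = 'AAA|BBB||CCC|1.23'.match(new RegExp(seenByRegexEngine));

console.log(quadrupled === doubledThenReplaced); // true
console.log(match[0]); // CCC|1.23
```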

Resources