Can we use Parameterized Views in Snowflake?

Can we use parameterized views in Snowflake, for example by passing the table name or database name as a parameter instead of hardcoding it?

I think your best bet is to use session variables in conjunction with a regular view.
A session variable can be referenced in the view DDL, and will need to be set in any sessions querying the view.
To do this, you can make use of the IDENTIFIER function in Snowflake, which lets you use text as an object identifier.
create table t1 (col1 number, col2 number);
create table t2 (col1 number, col2 number);
set ti = 't1';
create view v1 as select col1, col2 from identifier($ti);
Before you query the view, you will need to set the session variable (ti in this case) to the table name (fully qualified if need be).
set ti = 't1';
select * from v1; -- returns data from t1
set ti = 't2';
select * from v1; -- returns data from t2

I have not found a way to do this, so in the past I've created what I call a "wrapper view" when I needed something like this. An example follows.
I hope this helps...Rich
--create source tables and test records
CREATE TABLE t1 (id NUMBER, str VARCHAR);
CREATE TABLE t2 (id NUMBER, str VARCHAR);
CREATE TABLE t3 (id NUMBER, str VARCHAR);
INSERT INTO t1 VALUES(1, 'record from t1');
INSERT INTO t1 VALUES(2, 'record from t1');
INSERT INTO t2 VALUES(100, 'record from t2');
INSERT INTO t2 VALUES(101, 'record from t2');
INSERT INTO t3 VALUES(998, 'record from t3');
INSERT INTO t3 VALUES(999, 'record from t3');
--create the "wrapper" view
CREATE VIEW vw_t AS (
SELECT 't1' as table_name, * FROM t1
UNION ALL
SELECT 't2' as table_name, * FROM t2
UNION ALL
SELECT 't3' as table_name, * FROM t3);
--try it out
SELECT *
FROM vw_t
WHERE table_name = 't3';
--results
TABLE_NAME  ID   STR
t3          998  record from t3
t3          999  record from t3

I think the best way to handle something like this would be to create a UDTF that acts like a parameterized view. In essence, you reference the UDTF as you would a view and pass the parameters into it, and it returns the data that you wish to use. Note that Snowflake has two options for UDTFs (SQL and JavaScript):
https://docs.snowflake.net/manuals/sql-reference/udf-table-functions.html
https://docs.snowflake.net/manuals/sql-reference/udf-js-table-functions.html
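As a rough sketch of that idea, a SQL UDTF can take a value parameter and return rows much as a view would. The function name t1_by_col1 and its parameter min_col1 are hypothetical, and the table t1 (col1, col2) is borrowed from the earlier answer. Note that this parameterizes a filter value, not the table name itself.

```sql
-- Hypothetical SQL UDTF acting as a "parameterized view" over t1.
create or replace function t1_by_col1(min_col1 number)
  returns table (col1 number, col2 number)
  as
  $$
    select col1, col2
    from t1
    where col1 >= min_col1
  $$;

-- Query it like a view, passing the parameter in the call.
select * from table(t1_by_col1(10));
```

The caller wraps the function in table(...) in the FROM clause, so it can be joined and filtered like any other row source.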

When using the interactive SQL worksheet on Snowflake, you can do this:
SET target_table_name='myTable';
SELECT * FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME=$target_table_name;
That does not work programmatically. Instead, a parameterized query uses this syntax:
SELECT * FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME=(?)
The placeholders are positional, so yes, the name of that first (?) parameter is 1.
I'm working on my SnowflakeSQLHelper (an adaptation of the Microsoft Patterns and Practices) that will help when attaching parameters.

Related

Checking whether value exists in results of a stored procedure

I have a stored procedure which returns a list of IDs for a particular set of generators. I want to be able to use the results of this stored procedure as part of another query.
Can I write a query like:
select * from table where id in (exec dbo.storedprocedurename)
Using a table variable and a JOIN you can achieve this. Store the procedure's result in the table variable:
DECLARE @ProcOutput TABLE (Id INT);
INSERT INTO @ProcOutput (Id)
EXEC [dbo].[storedprocedurename];
SELECT T.*
FROM [Table] T
JOIN @ProcOutput O ON O.Id = T.Id;
If the procedure returns more than one column, redesign the table variable's schema to match its output.
If the procedure's output has two columns, you can try this:
INSERT INTO MyTable
(
Col1,
Col2
)
EXEC [dbo].[storedprocedurename]
GO
SELECT * FROM TABLE WHERE ID IN (SELECT Col1 from Mytable)

TSQL - subquery inside Begin End

Consider the following query:
begin
;with
t1 as (
select top(10) x from tableX
),
t2 as (
select * from t1
),
t3 as (
select * from t1
)
-- --------------------------
select *
from t2
join t3 on t3.x=t2.x
end
go
I was wondering whether t1 is evaluated twice, so that tableX is read twice (which means t1 acts like a table),
or just once, with its rows saved in t1 for the whole query (like a variable in a programming language).
I'm just trying to figure out how the T-SQL engine optimises this. It is important to know, because if t1 has millions of rows and is evaluated many times in the whole query, generating the same result each time, then there should be a better way to do it.
Just create the table:
CREATE TABLE tableX
(
x int PRIMARY KEY
);
INSERT INTO tableX
VALUES (1)
,(2)
Turn on execution plan generation and execute the query. The plan shows tableX being read once for each reference to t1.
So, yes, the table is queried two times. If you are using a complex common table expression and working with a huge amount of data, I would advise storing the result in a temporary table.
Sometimes I get very bad execution plans for complex CTEs which worked nicely in the past. Also, you are allowed to define indexes on temporary tables and improve performance further.
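A minimal sketch of that suggestion, reusing tableX from the question: the rows the CTE would produce are materialized once into a temporary table, which can then be indexed and joined to itself. The names #t1 and ix_t1_x are illustrative, not from the original answer.

```sql
-- #t1 is a local temporary table (the # prefix is intentional).
-- Materialize the rows once instead of repeating the CTE.
SELECT TOP (10) x
INTO #t1
FROM tableX;

-- Unlike a CTE, a temporary table can be indexed.
CREATE CLUSTERED INDEX ix_t1_x ON #t1 (x);

-- Both sides of the join now read the stored rows.
SELECT t2.x AS t2_x, t3.x AS t3_x
FROM #t1 AS t2
JOIN #t1 AS t3 ON t3.x = t2.x;

DROP TABLE #t1;
```

Whether this beats the plain CTE depends on data volume and the plan the optimizer picks, so it is worth measuring both.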
To be honest, there is no answer... The only answer is "Race your horses" (Eric Lippert).
The way you write your query does not tell you how the engine will execute it. This depends on many, many influences...
You tell the engine what you want to get, and the engine decides how to get it.
This may even differ between identical calls depending on statistics, currently running queries, existing cached results etc.
Just as a hint, try this:
USE master;
GO
CREATE DATABASE testDB;
GO
USE testDB;
GO
--I create a physical test table with 1.000.000 rows
CREATE TABLE testTbl(ID INT IDENTITY PRIMARY KEY, SomeValue VARCHAR(100));
WITH MioRows(Nr) AS (SELECT TOP 1000000 ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) FROM master..spt_values v1 CROSS JOIN master..spt_values v2 CROSS JOIN master..spt_values v3)
INSERT INTO testTbl(SomeValue)
SELECT CONCAT('Test',Nr)
FROM MioRows;
--Now we can start to test this
GO
CHECKPOINT;
GO
DBCC DROPCLEANBUFFERS;
GO
DECLARE @dt DATETIME2 = SYSUTCDATETIME();
--Your approach with CTEs
;with t1 as (select * from testTbl)
,t2 as (select * from t1)
,t3 as (select * from t1)
select t2.ID AS t2_ID,t2.SomeValue AS t2_SomeValue,t3.ID AS t3_ID,t3.SomeValue AS t3_SomeValue INTO target1
from t2
join t3 on t3.ID=t2.ID;
SELECT 'Final CTE',DATEDIFF(MILLISECOND,@dt,SYSUTCDATETIME());
GO
CHECKPOINT;
GO
DBCC DROPCLEANBUFFERS;
GO
DECLARE @dt DATETIME2 = SYSUTCDATETIME();
--Writing the intermediate result into a physical table
SELECT * INTO test1 FROM testTbl;
SELECT 'Write into test1',DATEDIFF(MILLISECOND,@dt,SYSUTCDATETIME());
select t2.ID AS t2_ID,t2.SomeValue AS t2_SomeValue,t3.ID AS t3_ID,t3.SomeValue AS t3_SomeValue INTO target2
from test1 t2
join test1 t3 on t3.ID=t2.ID;
SELECT 'Final physical table',DATEDIFF(MILLISECOND,@dt,SYSUTCDATETIME());
GO
CHECKPOINT;
GO
DBCC DROPCLEANBUFFERS;
GO
DECLARE @dt DATETIME2 = SYSUTCDATETIME();
--Same as before, but with a primary key on the intermediate table
SELECT * INTO test2 FROM testTbl;
SELECT 'Write into test2',DATEDIFF(MILLISECOND,@dt,SYSUTCDATETIME());
ALTER TABLE test2 ADD PRIMARY KEY (ID);
SELECT 'Add PK',DATEDIFF(MILLISECOND,@dt,SYSUTCDATETIME());
select t2.ID AS t2_ID,t2.SomeValue AS t2_SomeValue,t3.ID AS t3_ID,t3.SomeValue AS t3_SomeValue INTO target3
from test2 t2
join test2 t3 on t3.ID=t2.ID;
SELECT 'Final physical table with PK',DATEDIFF(MILLISECOND,@dt,SYSUTCDATETIME());
--Clean up (Careful with real data!!!)
GO
USE master;
GO
--DROP DATABASE testDB;
GO
On my system the first takes 674 ms, the second 1,205 ms (297 ms for writing into test1), and the third 1,727 ms (285 ms for writing into test2 and ~650 ms for creating the index).
Although the query is performed twice, the engine can take advantage of cached results.
Conclusion
The engine is really smart... Don't try to be smarter...
If the table covered many more columns and much more data per row, the whole test might turn out differently...
If your CTEs (sub-queries) involve much more complex data with joins, views, functions and so on, the engine might have trouble finding the best approach.
If performance matters, you can race your horses to test it out. One hint: I have sometimes used a query hint quite successfully: FORCE ORDER. This performs the joins in the order specified in the query.
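For illustration, FORCE ORDER is attached to a statement through an OPTION clause at the end of the query. The sketch below reuses the test1 table from the timing script above; whether the hint helps is something to measure, not assume.

```sql
-- OPTION (FORCE ORDER) keeps the join order as written in the FROM clause
-- instead of letting the optimizer reorder the joins.
SELECT t2.ID AS t2_ID, t3.ID AS t3_ID
FROM test1 AS t2
JOIN test1 AS t3 ON t3.ID = t2.ID
OPTION (FORCE ORDER);
```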
Here is a simple example to test the theories:
First, via a temporary table variable, which evaluates the query only once.
declare @r1 table (id int, v uniqueidentifier);
insert into @r1
SELECT * FROM
(
select id=1, NewId() as 'v' union
select id=2, NewId()
) t
-- -----------
begin
;with
t1 as (
select * from @r1
),
t2 as (
select * from t1
),
t3 as (
select * from t1
)
-- ----------------
select * from t2
union all select * from t3
end
go
On the other hand, if we put the query inside t1 instead of reading it from the temporary table, it gets evaluated twice (t2 and t3 show different NewId() values).
t1 as (
select id=1, NewId() as 'v' union
select id=2, NewId()
)
Hence, my conclusion is to use a temporary table and not rely on cached results.
Also, I've implemented this in a large-scale query that evaluated the same sub-query only twice, and after moving it to a temporary table the execution time was cut in half.

How to compare varbinary data type in where clause

I have a linked server that was created to pull user details from a specific Organisational Unit with a scheduled SQL Agent job.
The table created to hold the user details has a column for the ObjectGUID number, and its type is defined as varbinary(50) (I am not sure why).
The process checks whether there is a new user by comparing the ObjectGUID number against the saved Users table; if there is a new number, it inserts the new user into the table.
However, I have noticed that the comparison is not working properly.
SELECT
tbl.objectGUID AS UserGUID
FROM [dbo].[ActiveDirectoryUsers] tbl
WHERE tbl.objectGUID NOT IN (SELECT UserGUID FROM dbo.Users)
When I create a new user, the new user appears in the ActiveDirectoryUsers view,
but when the WHERE clause is added to compare the results with the Users table, the result is always empty. It looks like I need to cast or convert the varbinary to varchar and then do the comparison. I tried casting the varbinary to varchar and to uniqueidentifier, but it still does not work.
Any idea how I should do the comparison?
Update
CREATE VIEW [dbo].[ActiveDirectoryUsers] AS
SELECT "SAMAccountName" AS sAMAccountName, "mail" AS Email,
"objectGUID" AS objectGUID
FROM OpenQuery(ADSI, 'SELECT SAMAccountName, mail, objectGUID
FROM ''ldapconnectionstring.com''')
An example of objectGUID in the Users table
0x1DBCC071C69C8242B4895D42750969B1
You do not need to cast varbinary to anything in particular to be able to use it in a WHERE clause.
Your problem is that you use NOT IN where NULL values are present. If the subquery returns even one NULL, the comparison against it is UNKNOWN, so NOT IN is never true and no rows come back.
Try to execute my code first as it is (it will return 1 row), then uncomment the NULL value insert and execute it again.
This time you'll get 0 rows:
declare @t1 table (guid varbinary(50))
insert into @t1
values(0x1DBCC071C69C8242B4895D42750969B1)--, (null);
declare @t2 table (guid varbinary(50))
insert into @t2
values(0x1DBCC071C69C8242B4895D42750969B1), (0x1DBCC071C69C8242B4895D42750969B2);
select *
from @t2 t2
where t2.guid not in (select guid from @t1);
To fix your problem, try to use NOT EXISTS instead of NOT IN like this:
select *
from @t2 t2
where not exists (select *
from @t1 t1
where t1.guid = t2.guid);
In your case the code should be like this:
SELECT tbl.objectGUID AS UserGUID
FROM [dbo].[ActiveDirectoryUsers] tbl
WHERE not exists (SELECT *
FROM dbo.Users u
where u.UserGUID = tbl.objectGUID );
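To make the NULL behaviour concrete, here is a small sketch assuming SQL Server's default ANSI_NULLS ON setting. The first query shows the effect with literals; the second is an alternative to NOT EXISTS that keeps NOT IN but filters NULLs out of the subquery, using the table and column names from the question.

```sql
-- 2 NOT IN (1, NULL) expands to (2 <> 1) AND (2 <> NULL).
-- The second comparison is UNKNOWN, so the WHERE clause rejects the row.
SELECT 'row returned' AS result
WHERE 2 NOT IN (1, NULL);

-- Removing NULLs from the list restores the expected behaviour.
SELECT tbl.objectGUID AS UserGUID
FROM [dbo].[ActiveDirectoryUsers] tbl
WHERE tbl.objectGUID NOT IN (SELECT UserGUID
                             FROM dbo.Users
                             WHERE UserGUID IS NOT NULL);
```

NOT EXISTS remains the more robust choice, since it does not depend on remembering the NULL filter.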

How to transfer data from one table to another

I have two tables. I want to transfer all data from the first table to the second table when that data does not already exist in the second table. How do I do this with an MS SQL Server query?
It could be something like:
INSERT INTO tableB(FieldA, FieldB, FieldC)
SELECT a.FieldA, a.FieldB, a.FieldC
FROM tableA a
WHERE NOT EXISTS
(
SELECT *
FROM tableB b
/* Primary key field(s)*/
WHERE b.FieldA =a.FieldA
)
In MS SQL you could do something like this:
INSERT INTO mytable(column1, column2) select value1, value2 from mytable2;
but you must make sure that column1 and value1 have the same data type, and likewise column2 and value2.
Hope it helps. ;)
If the second table doesn't exist yet, you can use:
SELECT * INTO SECOND_TABLE
FROM FIRST_TABLE;
If you want it to run even if the table exists, you can precede this query with the following (note that it drops the existing table and its data):
IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[YOUR_SCHEMA].[SECOND_TABLE]') AND type in (N'U'))
DROP TABLE [YOUR_SCHEMA].[SECOND_TABLE];

Copy image data type from one table to another

How do you copy an image data type (or varbinary(max)) from one table to another in SQL Server, without having to save the data to a file first?
You select the records from one table and insert into another. As you do it in the same query, the data doesn't leave the database, so you don't have to store it anywhere.
Example:
insert into SomeTable (SomeId, SomeBinaryField)
select SomeId, SomeBinaryField
from SomeOtherTable
where SomeId = 42
You can make it as complex as you like.
I prefer to copy image data from one table to another by updating the field with the value of the same field selected from the other table.
Update [Database].[dbo].[DataTableA$Attachment]
SET [Store Pointer ID] = (SELECT [Store Pointer ID]
FROM [Database].[dbo].[DataTableB$Attachment]
WHERE [No_] = '35975') WHERE [No_] = '35975'
You can just use an insert statement with a SELECT clause, for example:
declare @t1 table (t1 image)
declare @t2 table (t2 image)
insert into @t2 select t.t1 as t2 from @t1 as t
You can get full details about the INSERT statement here:
http://msdn.microsoft.com/en-us/library/ms174335.aspx
