Loading data in Snowflake using bind variables - snowflake-cloud-data-platform

We're building dynamic data loading statements for Snowflake using the Python interface.
We want to create a stage at query runtime, and use that stage in a subsequent statement. Table and stage names are dynamic using bind variable.
Yet, it doens't seem like we can find the correct syntax as we tried everything on https://docs.snowflake.com/en/user-guide/python-connector-api.html
COPY INTO IDENTIFIER( %(table_name)s )(SRC, LOAD_TIME, ROW_HASH)
FROM (SELECT t.$1, CURRENT_TIMESTAMP(0), MD5(t.$1) FROM "'%(stage_name)s'" t)
PURGE = TRUE;
Is this even possible? Does it work for anyone?

Your code does not create stage as you mentioned, and you don't need create a stage, instead use table stage or user stage. The SQL below uses table stage.
You also need to change your syntax a little and use more pythonic way : f-strings
sql = f"""COPY INTO {table_name} (SRC, LOAD_TIME, ROW_HASH)
FROM (SELECT t.$1, CURRENT_TIMESTAMP(0), MD5(t.$1) FROM #%{table_name} t)
PURGE = TRUE"""

Related

ODI 12c CUSTOM_TEMPLATE extract option not working

I'm using the CUSTOM_TEMPLATE extract option on the source table to force a select actually from another table. Which then would be used by a custom IKM I'm using in order to get the column list of the "forced" table with the odiRef.getColList API. But the template select query is not considered at all in the execution, so the IKM still gets the columns from the original table and I don't need them.
The code in the CUSTOM_TEMPLATE is:
select *
from <%=odiRef.getObjectName("L", "#V_OFFL_TABLE_NAME", "OFFLOAD_AREA_HIST", "DWH_LCL", "D") %>
where src_date_from_dt = to_date('V_OFFL_TRANSFER_DATE','YYYY-MM-DD')
The code in the SOURCE tab of the custom IKM I made is:
select <%=odiRef.getSrcColList("","[COL_NAME]",",\n","")%>
from <%=odiRef.getObjectName("L", "#V_OFFL_TABLE_NAME", "OFFLOAD_AREA_HIST", "DWH_LCL", "D") %>
where src_date_from_dt = to_date('V_OFFL_TRANSFER_DATE','YYYY-MM-DD')
in this case I'm trying with odiRef.getSrcColList in the IKM, but I;ve also tried with odiRef.getColList - same result.
Try to create a Dummy Datastore in the Model where you have that table.
You can add dummy or general attributes into that Datastore(Var1,Var2,Var3...Num1,Num2..etc).
Drag this datastore to the source area and add the CUSTOM_TEMPLATE there.
*Make sure this dummy is having the same logical schema as the table in your custom query.
This can also work with multi table query.

Accessing properties from Mappers

I have a linked server in SQL Server so when I query something, it has to be something like this:
SELECT * FROM [SERVERNAME].[DBNAME].[SCHEMA].[TABLE]
Now I have to implement this way of querying to an existing project with the servername, dbname and schema provided in my application.properties.
Is there any way to access these properties from my Mapper(xml)?
You can use properties.
With MyBatis-Spring-Boot, you can define properties in your application.properties with the prefix mybatis.configuration.variables. [1].
mybatis.configuration.variables.db_servername=YOUR_SERVER_NAME
mybatis.configuration.variables.db_dbname=YOUR_DB_NAME
mybatis.configuration.variables.db_schema=YOUR_SCHEMA
It is also possible to reference variables defined in the same application.properties.
mybatis.configuration.variables.db_servername=${servername}
mybatis.configuration.variables.db_dbname=${dbname}
mybatis.configuration.variables.db_schema=${schema}
Then you can use these variables in mappers using ${}.
SELECT * FROM [${db_servername}].[${db_dbname}].[${db_schema}].[TABLE]
Note: #{} won't work. See this FAQ entry for the difference.
[1] The doc says that the prefix is mybatis.configuration-properties., but I just tested it and it didn't work. It could be my mistake, though. I plan to investigate when I have some spare time.

Fetching ElasticSearch Results into SQL Server by calling Web Service using SQL CLR

Code Migration due to Performance Issues :-
SQL Server LIKE Condition ( BEFORE )
SQL Server Full Text Search --> CONTAINS ( BEFORE )
Elastic Search ( CURRENTLY )
Achieved So Far :-
We have a web page created in ASP.Net Core which has a Auto Complete Drop Down of 2.5+ Million Companies Indexed in Elastic Search https://www.99corporates.com/
Due to performance issues we have successfully shifted our code from SQL Server Full Text Search to Elastic Search and using NEST v7.2.1 and Elasticsearch.Net v7.2.1 in our .Net Code.
Still looking for a solution :-
If the user does not select a company from the Auto Complete List and simply enters a few characters and clicks on go then a list should be displayed which we had done earlier by using the SQL Server Full Text Search --> CONTAINS
Can we call the ASP.Net Web Service which we have created using SQL CLR and code like SELECT * FROM dbo.Table WHERE Name IN( dbo.SQLWebRequest('') )
[System.Web.Script.Services.ScriptMethod()]
[System.Web.Services.WebMethod]
public static List<string> SearchCompany(string prefixText, int count)
{
}
Any better or alternate option
While that solution (i.e. the SQL-APIConsumer SQLCLR project) "works", it is not scalable. It also requires setting the database to TRUSTWORTHY ON (a security risk), and loads a few assemblies as UNSAFE, such as Json.NET, which is risky if any of them use static variables for caching, expecting each caller to be isolated / have their own App Domain, because SQLCLR is a single, shared App Domain, hence static variables are shared across all callers, and multiple concurrent threads can cause race-conditions (this is not to say that this is something that is definitely happening since I haven't seen the code, but if you haven't either reviewed the code or conducted testing with multiple concurrent threads to ensure that it doesn't pose a problem, then it's definitely a gamble with regards to stability and ensuring predictable, expected behavior).
To a slight degree I am biased given that I do sell a SQLCLR library, SQL#, in which the Full version contains a stored procedure that also does this but a) handles security properly via signatures (it does not enable TRUSTWORTHY), b) allows for handling scalability, c) does not require any UNSAFE assemblies, and d) handles more scenarios (better header handling, etc). It doesn't handle any JSON, it just returns the web service response and you can unpack that using OPENJSON or something else if you prefer. (yes, there is a Free version of SQL#, but it does not contain INET_GetWebPages).
HOWEVER, I don't think SQLCLR is a good fit for this scenario in the first place. In your first two versions of this project (using LIKE and then CONTAINS) it made sense to send the user input directly into the query. But now that you are using a web service to get a list of matching values from that user input, you are no longer confined to that approach. You can, and should, handle the web service / Elastic Search portion of this separately, in the app layer.
Rather than passing the user input into the query, only to have the query pause to get that list of 0 or more matching values, you should do the following:
Before executing any query, get the list of matching values directly in the app layer.
If no matching values are returned, you can skip the database call entirely as you already have your answer, and respond immediately to the user (much faster response time when no matches return)
If there are matches, then execute the search stored procedure, sending that list of matches as-is via Table-Valued Parameter (TVP) which becomes a table variable in the stored procedure. Use that table variable to INNER JOIN against the table rather than doing an IN list since IN lists do not scale well. Also, be sure to send the TVP values to SQL Server using the IEnumerable<SqlDataRecord> method, not the DataTable approach as that merely wastes CPU / time and memory.
For example code on how to accomplish this correctly, please see my answer to Pass Dictionary to Stored Procedure T-SQL
In C#-style pseudo-code, this would be something along the lines of the following:
List<string> = companies;
companies = SearchCompany(PrefixText, Count);
if (companies.Length == 0)
{
Response.Write("Nope");
}
else
{
using(SqlConnection db = new SqlConnection(connectionString))
{
using(SqlCommand batch = db.CreateCommand())
{
batch.CommandType = CommandType.StoredProcedure;
batch.CommandText = "ProcName";
SqlParameter tvp = new SqlParameter("ParamName", SqlDbType.Structured);
tvp.Value = MethodThatYieldReturnsList(companies);
batch.Paramaters.Add(tvp);
db.Open();
using(SqlDataReader results = db.ExecuteReader())
{
if (results.HasRows)
{
// deal with results
Response.Write(results....);
}
}
}
}
}
Done. Got the solution.
Used SQL CLR https://github.com/geral2/SQL-APIConsumer
exec [dbo].[APICaller_POST]
#URL = 'https://www.-----/SearchCompany'
,#JsonBody = '{"searchText":"GOOG","count":10}'
Let me know if there is any other / better options to achieve this.

OrientDB - find "orphaned" binary records

I have some images stored in the default cluster in my OrientDB database. I stored them by implementing the code given by the documentation in the case of the use of multiple ORecordByte (for large content): http://orientdb.com/docs/2.1/Binary-Data.html
So, I have two types of object in my default cluster. Binary datas and ODocument whose field 'data' lists to the different record of binary datas.
Some of the ODocument records' RID are used in some other classes. But, the other records are orphanized and I would like to be able to retrieve them.
My idea was to use
select from cluster:default where #rid not in (select myField from MyClass)
But the problem is that I retrieve the other binary datas and I just want the record with the field 'data'.
In addition, I prefer to have a prettier request because I don't think the "not in" clause is really something that should be encouraged. Is there something like a JOIN which return records that are not joined to anything?
Can you help me please?
To resolve my problem, I did like that. However, I don't know if it is the right way (the more optimized one) to do it:
I used the following SQL request:
SELECT rid FROM (FIND REFERENCES (SELECT FROM CLUSTER:default)) WHERE referredBy = []
In Java, I execute it with the use of the couple OCommandSQL/OCommandRequest and I retrieve an OrientDynaElementIterable. I just iterate on this last one to retrieve an OrientVertex, contained in another OrientVertex, from where I retrieve the RID of the orpan.
Now, here is some code if it can help someone, assuming that you have an OrientGraphNoTx or an OrientGraph for the 'graph' variable :)
String cmd = "SELECT rid FROM (FIND REFERENCES (SELECT FROM CLUSTER:default)) WHERE referredBy = []";
List<String> orphanedRid = new ArrayList<String>();
OCommandRequest request = graph.command(new OCommandSQL(cmd));
OrientDynaElementIterable objects = request.execute();
Iterator<Object> iterator = objects.iterator();
while (iterator.hasNext()) {
OrientVertex obj = (OrientVertex) iterator.next();
OrientVertex orphan = obj.getProperty("rid");
orphanedRid.add(orphan.getIdentity().toString());
}

Where EF6 doing where clause at SQL or at client

I am trying to start using EF6 for a project. My database is already filled with millions of records.
I can't find right explanation how does EF send T-SQL to SQL Server? I am afraid that I am going to download bunch of data to user for no reason.
In code below I have found three way to get my data to List<> but I am not sure which is right way to do WHERE clause at SQL.
I do not want to fill client with millions of record and to query (filter) that data at client.
using (rgtBaza baza = new rgtBaza())
{
var t = baza.Database.SqlQuery<CJE_DOC>("select * from cje_doc where datum between #od and #do",new SqlParameter("od", this.dateTimePickerOD.Value.Date ) ,new SqlParameter("do", this.dateTimePickerOD.Value.Date)).ToList();
var t = baza.CJE_DOC.Where(s => s.DATUM.Value >= this.dateTimePickerOD.Value.Date && s.DATUM.Value <= this.dateTimePickerDO.Value.Date).ToList();
var query = from b in baza.CJE_DOC
where b.DATUM >= this.dateTimePickerOD.Value.Date && b.DATUM.Value <= this.dateTimePickerDO.Value.Date
select b;
var t = query.ToList();
this.dataGridViewCJENICI.DataSource = t;
}
In all 3 cases, the filtering will happen on the database side, the filtering (or WHERE clause) will not take place on the client side.
If you want to verify that this is true, especially for your last 2 options, add some logging so that you can see the generated SQL:
baza.Database.Log = s => Console.WriteLine(s);
In this case, since you are using EF already, choose the 2nd or 3rd options, they are both equivalent with different syntax. Pick your favorite syntax.
In all of those examples, EF6 will generate a SQL query including the where clause - it won't perform the where clause on the client.
It won't actually retrieve any data from the database until you iterate through the results, which in the examples above, is when you call .ToList().
EF6 would only run the filter on the client if you called something like:
baza.CJE_DOC.ToList().Where(x => x.Field == value)
In this instance, it would retrieve the entire table when you called ToList(), and then use a client-side Linq query to filter the results in the where clause.
Any of the 3 will run the query on the SQL Server.
EF relies on LINQ's deferred execution model to build up an expression tree. Once you take an action that causes the expression to be enumerated (e.g. calling ToList(), ToArray(), or any of the other To*() methods), it will convert the expression tree to SQL, send the query to the server, and then start returning the results.
One of the side effects of this is that when using the query or lambda syntax, expressions that EF does not understand how to convert to SQL will cause an exception.
If you absolutely need to use some code that EF can't handle, you can break your code into multiple segments -- filtering things down as far as possible via code that can be converted to SQL, using the AsEnumerable() method to "close off" the EF expression, and doing your remaining filtering or transformations using Linq to Objects.

Resources