How to generate SQL dynamically - scalikejdbc

I would like to use this library only for generating SQL, without executing it.
Can you please show me a good example of how to use SQLSyntax just for generation?
For example:
val query: String = ... // built with SQLSyntax
println(query)
// prints: select * from TABLE where A = ?
val boundParameters: List[String] = ...

You can use #statement and #parameters like this:
scala> val q = sql"select * from users where id = ${123}"
q: scalikejdbc.SQL[Nothing,scalikejdbc.NoExtractor] = scalikejdbc.SQLToTraversableImpl@37157995
scala> q.statement
res0: String = select * from users where id = ?
scala> q.parameters
res1: Seq[Any] = List(123)
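If you want to build the query compositionally first, the sqls interpolator returns SQLSyntax values that can be embedded into a larger sql"..." literal. A minimal sketch (untested; the final .map(_.toString) is just one way to get the List[String] the question asks for):

import scalikejdbc._

// build a reusable WHERE fragment, then embed it in the full query
val condition: SQLSyntax = sqls"A = ${42}"
val q = sql"select * from TABLE where $condition"

val query: String = q.statement // "select * from TABLE where A = ?"
val boundParameters: List[String] = q.parameters.toList.map(_.toString) // List("42")
println(query)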

Related

Flink job that waits for the event to appear in Iceberg table

I'm writing code that queries an Iceberg table with a Flink batch job:
val id = "1111"
val batchStream: DataStream[RowData] =
  FlinkSource.forRowData()
    .env(flinkEnv)
    .tableLoader(tableLoader)
    .streaming(false)
    .build()
val tableEnv = StreamTableEnvironment.create(flinkEnv)
val inputTable = tableEnv.fromDataStream(batchStream)
tableEnv.createTemporaryView("InputTable", inputTable)
val resultTable = tableEnv.sqlQuery(
  s"""SELECT id FROM InputTable WHERE id = '${id}' AND `year` = ${year}
     |  AND `month` = ${month} AND `day` = ${day}""".stripMargin)
  .execute()
val results: CloseableIterator[Row] = resultTable.collect()
var report: List[String] = List[String]()
while (results.hasNext) {
  val event = results.next().toString
  println("Result test " + event)
  report ::= event
}
This works if I already know the event is in the table. Can you help me rewrite it so that it's streaming rather than batch, and keeps waiting (Thread.sleep(100) is OK too) for id = "1111" to appear in my Iceberg table?
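One possible direction, as a sketch rather than a tested answer: flip the source to streaming mode so Flink keeps scanning the table for new snapshots, and let the blocking result iterator do the waiting. This assumes your Iceberg version's FlinkSource builder supports streaming(true) and monitorInterval:

import java.time.Duration

// unbounded scan: the source keeps polling the Iceberg table for new snapshots
val streamingSource: DataStream[RowData] =
  FlinkSource.forRowData()
    .env(flinkEnv)
    .tableLoader(tableLoader)
    .streaming(true)                        // continuous instead of batch
    .monitorInterval(Duration.ofSeconds(1)) // how often to check for new data
    .build()

val tableEnv = StreamTableEnvironment.create(flinkEnv)
tableEnv.createTemporaryView("InputTable", tableEnv.fromDataStream(streamingSource))

// in streaming mode collect() blocks on hasNext until a matching row arrives,
// so this loop effectively waits for id = "1111" to show up
val results = tableEnv
  .sqlQuery(s"SELECT id FROM InputTable WHERE id = '${id}'")
  .execute()
  .collect()

while (results.hasNext) {
  println("Result test " + results.next())
}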

NIFI - upload binary.zip to SQL Server as varbinary

I am trying to upload binary.zip to SQL Server as the content of a varbinary column.
Target table:
CREATE TABLE myTable ( zip_file varbinary(MAX) );
My NiFi flow is very simple:
-> GetFile:
   filter: binary.zip
-> UpdateAttribute:
   sql.args.1.type = -3    # varbinary according to the JDBC types enumeration
   sql.args.1.value = ???  # I don't know what to put here! (I've tried everything!)
   sql.args.1.format = ??? # Is it required? I tried 'hex'
-> PutSQL:
   SQL statement = INSERT INTO myTable (zip_file) VALUES (?);
What should I put in sql.args.1.value?
I think it should be the flowfile payload, but would it work as part of the INSERT in PutSQL? Not so far!
Thanks!
SOLUTION UPDATE:
Based on https://issues.apache.org/jira/browse/NIFI-8052
(Note that I'm passing some data as flowfile attributes.)
import org.apache.nifi.controller.ControllerService
import groovy.sql.Sql

def flowFile = session.get()
if (!flowFile) return

def lookup = context.controllerServiceLookup
def dbServiceName = flowFile.getAttribute('DatabaseConnectionPoolName')
def tableName = flowFile.getAttribute('table_name')
def fieldName = flowFile.getAttribute('field_name')

// find the DBCP controller service by name and borrow a connection from it
def dbcpServiceId = lookup.getControllerServiceIdentifiers(ControllerService)
    .find { cs -> lookup.getControllerServiceName(cs) == dbServiceName }
def conn = lookup.getControllerService(dbcpServiceId)?.getConnection()
def sql = new Sql(conn)

// stream the flowfile content directly into the varbinary parameter
flowFile.read { rawIn ->
    def parms = [rawIn]
    sql.executeInsert "INSERT INTO " + tableName + " (date, " + fieldName +
        ") VALUES (CAST(GETDATE() AS Date), ?)", parms
}

conn?.close()
session.transfer(flowFile, REL_SUCCESS)
session.commit()
Maybe there is a NiFi-native way to insert a blob; however, you could use ExecuteGroovyScript instead of UpdateAttribute and PutSQL.
Add an SQL.mydb parameter at the processor level and link it to the required DBCP pool.
Use the following script body:
def ff = session.get()
if (!ff) return

def statement = "INSERT INTO myTable (zip_file) VALUES (:p_zip_file)"
def params = [
    p_zip_file: SQL.mydb.BLOB(ff.read()) // cast flowfile content to the BLOB sql type
]
SQL.mydb.executeInsert(params, statement) // committed automatically on flowfile success

// transfer to success without changes
REL_SUCCESS << ff
Inside the script, SQL.mydb is a reference to a groovy.sql.Sql object.

Stored procedure - Use array list in SQL query for insert in Snowflake

Suppose I have created/generated a list of elements during processing in a stored procedure, say rownum = [1,2,3,4].
Now I want to use this list in a SQL statement in the same stored procedure to filter out rows, say select * from mytable where rownum not in (1,2,3,4).
How can I achieve this?
Please guide. Thanks.
The general solution to this would be to use binding variables. However, set types are not supported as bind variables in the Stored Procedure APIs currently.
The JavaScript APIs do permit you to generate your SQL dynamically using string and array transform functions, so the following approaches can be taken to work around the problem.
Inline the list of values into the query by forming the SQL syntax for a set of values:
CREATE OR REPLACE PROCEDURE SAMPLE()
RETURNS RETURNTYPE
LANGUAGE JAVASCRIPT
AS
$$
  var lst = [2, 3, 4]
  var lstr = lst.join(',') // "2,3,4"
  var sql_command = `SELECT * FROM T WHERE C NOT IN (${lstr})` // (2,3,4)
  var stmt = snowflake.createStatement( {sqlText: sql_command} )
  // Runs: SELECT * FROM T WHERE C NOT IN (2,3,4) [literal query string]
  [...]
$$;
Or if the list of values used could be unsafe, you can generate the query to carry just the right number of bind variables:
CREATE OR REPLACE PROCEDURE SAMPLE()
RETURNS RETURNTYPE
LANGUAGE JAVASCRIPT
AS
$$
  var lst = [2, 3, 4]
  var lst_vars = Array(lst.length).fill("?").join(", ") // "?, ?, ?"
  var sql_command = `SELECT * FROM T WHERE C NOT IN (${lst_vars})` // (?, ?, ?)
  var stmt = snowflake.createStatement( {sqlText: sql_command, binds: lst} )
  // Runs: SELECT * FROM T WHERE C NOT IN (2, 3, 4) [after bind substitution]
  [...]
$$;
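The same placeholder-expansion trick works from any JDBC client, not just inside a stored procedure. A minimal Scala sketch for comparison (the connection and table names are hypothetical; obtain the connection however you normally do):

import java.sql.Connection

val lst = List(2, 3, 4)
val placeholders = List.fill(lst.length)("?").mkString(", ") // "?, ?, ?"
val sqlCommand = s"SELECT * FROM T WHERE C NOT IN ($placeholders)"

val conn: Connection = ??? // an open JDBC connection (hypothetical)
val stmt = conn.prepareStatement(sqlCommand)
lst.zipWithIndex.foreach { case (v, i) => stmt.setInt(i + 1, v) } // JDBC params are 1-based
val rs = stmt.executeQuery()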
Snowflake also has the ARRAY_CONTAINS( <variant> , <array> ) function. Note that the first argument must be a VARIANT, hence the cast.
Example:
ARRAY_CONTAINS(5::VARIANT, ARRAY_CONSTRUCT(1, 2, 3, 4))

Spark: how to change DataFrame array<string> to RDD[Array[String]]

I have transactions as a DataFrame array<string>:
transactions: org.apache.spark.sql.DataFrame = [collect_set(b): array<string>]
I want to change it to RDD[Array[String]], but when I convert it to an RDD, it becomes org.apache.spark.rdd.RDD[org.apache.spark.sql.Row]:
val sam: RDD[Array[String]] = transactions.rdd
<console>:42: error: type mismatch;
found : org.apache.spark.rdd.RDD[org.apache.spark.sql.Row]
required: org.apache.spark.rdd.RDD[Array[String]]
val sam: RDD[Array[String]] = transactions.rdd
transactions.rdd will return RDD[Row], as the error message says.
You can manually convert each Row to an Array:
val sam = transactions.rdd.map(x => x.getList(0).toArray.map(_.toString))
A more Spark 2.0 style version would be:
val sam = transactions.select("columnName").as[Array[String]].rdd
Replace columnName with the proper column name from the DataFrame; you should probably rename collect_set(b) to a more user-friendly name first.
A DataFrame is essentially a collection of Rows, so collecting a DataFrame produces an Array[Row], and converting it to an RDD gives you an RDD[Row].
So if you want RDD[Array[String]], you can do it this way:
val sam = transactions.rdd.map(x => x.toString().stripPrefix("[").stripSuffix("]").split(fieldSeparator))
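For reference, a self-contained sketch of the typed conversion (assuming a local SparkSession named spark; the column name items and the sample data are made up for illustration):

import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.collect_set

val spark = SparkSession.builder().master("local[*]").appName("demo").getOrCreate()
import spark.implicits._

// build a DataFrame with a single array<string> column, like collect_set(b)
val transactions = Seq(("a", "x"), ("a", "y"), ("b", "z"))
  .toDF("a", "b")
  .groupBy($"a")
  .agg(collect_set($"b").as("items"))
  .select("items")

// typed conversion: Dataset[Array[String]] first, then .rdd
val sam: RDD[Array[String]] = transactions.as[Array[String]].rdd
sam.collect().foreach(arr => println(arr.mkString(",")))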

What is the LINQ equivalent of SQL’s "IN" keyword?

How can I write the SQL query below in LINQ?
select * from Product where ProductTypePartyID IN
(
select Id from ProductTypeParty where PartyId = 34
)
There is no direct equivalent in LINQ. Instead, you can use Contains() or some other trick to implement it. Here's an example that uses Contains:
string[] s = new string[5];
s[0] = "34";
s[1] = "12";
s[2] = "55";
s[3] = "4";
s[4] = "61";
var result = from d in context.TableName
             where s.Contains(d.fieldname)
             select d;
Check this link for details: in clause Linq
int[] productList = new int[] { 1, 2, 3, 4 };
var myProducts = from p in db.Products
                 where productList.Contains(p.ProductID)
                 select p;
Syntactic variations aside, you can write it in practically the same way.
from p in ctx.Product
where (from ptp in ctx.ProductTypeParty
       where ptp.PartyId == 34
       select ptp.Id).Contains(p.ProductTypePartyID)
select p
I prefer using the existential quantifier, though:
from p in ctx.Product
where (from ptp in ctx.ProductTypeParty
       where ptp.PartyId == 34
          && ptp.Id == p.ProductTypePartyID
       select ptp).Any()
select p
I expect that this form will resolve to an EXISTS (SELECT * ...) in the generated SQL.
You'll want to profile both, in case there's a big difference in performance.
Something similar to this:
var partyProducts = from p in dbo.Product
                    join pt in dbo.ProductTypeParty on p.ProductTypePartyID equals pt.Id
                    where pt.PartyId == 34
                    select p;
You use Contains in a Where clause.
Something along these lines (untested):
var results = Product.Where(product => ProductTypeParty
    .Where(ptp => ptp.PartyId == 34)
    .Select(ptp => ptp.Id)
    .Contains(product.ProductTypePartyID)
);
