I am trying to batch insert records into an SQL table using Kotlin Exposed. I have set up the code as per the Exposed documentation; however, the SQL statements being executed are individual insert statements rather than a single batch insert statement.
The documentation located here: https://github.com/JetBrains/Exposed/wiki/DSL
has the following on Batch Inserting:
Batch Insert
Batch Insert allows mapping a list of entities into DB rows in one SQL statement. It is more efficient than inserting one by one as it issues only one statement. Here is an example:
val cityNames = listOf("Paris", "Moscow", "Helsinki")
val allCitiesID = cities.batchInsert(cityNames) { name ->
this[cities.name] = name
}
My code is as follows:
val mappings: List<Triple<String, String, String>> = listOf(triple1, triple2, triple3)
transaction {
    TableName.batchInsert(mappings) {
        this[TableName.value1] = it.first
        this[TableName.value2] = it.second
        this[TableName.value3] = it.third
    }
}
What I expect to see printed out is 1 batch insert statement which follows the syntax of
INSERT INTO TableName (value1, value2, value3) values
(triple1value1, triple1value2, triple1value3),
(triple2value1, triple2value2, triple2value3),
(triple3value1, triple3value2, triple3value3), ...
but instead it prints 3 individual insert statements with the following syntax
INSERT INTO TableName (value1, value2, value3) values (triple1value1, triple1value2, triple1value3)
INSERT INTO TableName (value1, value2, value3) values (triple2value1, triple2value2, triple2value3)
INSERT INTO TableName (value1, value2, value3) values (triple3value1, triple3value2, triple3value3)
As this seems like the documented correct way to batch insert, what am I doing incorrectly here?
The docs explain:
NOTE: The batchInsert function will still create multiple INSERT
statements when interacting with your database. You most likely want
to couple this with the rewriteBatchedInserts=true (or
rewriteBatchedStatements=true) option of your relevant JDBC driver,
which will convert those into a single bulk insert. You can find the
documentation for this option in the MySQL and PostgreSQL JDBC driver docs.
https://github.com/JetBrains/Exposed/wiki/DSL#batch-insert
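For reference, that driver option is set on the JDBC connection URL (or in the driver properties). A minimal sketch, with host, port, and database name as placeholders:

```
# MySQL Connector/J
jdbc:mysql://localhost:3306/mydb?rewriteBatchedStatements=true

# PostgreSQL (pgjdbc)
jdbc:postgresql://localhost:5432/mydb?reWriteBatchedInserts=true
```

With the flag enabled, the driver rewrites the batched single-row INSERTs that Exposed sends into one multi-row INSERT on the wire.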
Related
I'm building an app using Cassandra as the DB and I was wondering if there is any way to copy and "sync" a column value with another.
I've tried to use materialized views, but I wasn't able to add additional regular columns in the same row where the view was created. I also saw a diagram about "links" in CQL.
Can anyone help me find a way of doing this, please?
There isn't a way to do that in Cassandra. You will need to use CQL BATCH statements to keep your tables synchronised. It will group inserts, updates and deletes into one atomic transaction. Have a look at this article where I've explained it in a bit more detail -- https://community.datastax.com/articles/2744/.
For example, if you had these tables to maintain:
movies
movies_by_actor
movies_by_genre
then you would group the updates in a CQL BATCH like this:
BEGIN BATCH
INSERT INTO movies (...) VALUES (...);
INSERT INTO movies_by_actor (...) VALUES (...);
INSERT INTO movies_by_genre (...) VALUES (...);
APPLY BATCH;
Note that it is also possible to do UPDATE and DELETE statements as well as conditional writes in a batch.
The above example just illustrates the syntax in cqlsh and isn't what you would use in an application. Here is an equivalent BatchStatement using the Java driver:
SimpleStatement insertMovies =
    SimpleStatement.newInstance(
        "INSERT INTO movies (...) VALUES (?, ...)", <some_values>);
SimpleStatement insertMoviesByActor =
    SimpleStatement.newInstance(
        "INSERT INTO movies_by_actor (...) VALUES (?, ...)", <some_values>);
SimpleStatement insertMoviesByGenre =
    SimpleStatement.newInstance(
        "INSERT INTO movies_by_genre (...) VALUES (?, ...)", <some_values>);
BatchStatement batch =
    BatchStatement.builder(DefaultBatchType.LOGGED)
        .addStatement(insertMovies)
        .addStatement(insertMoviesByActor)
        .addStatement(insertMoviesByGenre)
        .build();
For details, see Java driver Batch statements. Cheers!
I'm unable to find a solution online for my question. If it is even possible, how do I write an SQL Insert statement that uses parameter values as well as selecting a value from another table.
Example:
"INSERT INTO Users (user_name, user_csn, user_adid, user_contact, user_adminpriviledge, user_datestart, user_active, user_team)
VALUES (@username, @usercsn, @useradid, @usercontact, @userauth, @userstart, @useractive, @userteam = (SELECT team_id FROM teaminfo WHERE team_name = '" & ddlAddTeam.SelectedValue & "'))"
I understand that the example is wrong, just trying my best to represent what I'm looking for in code.
Also another question would be regarding aliasing and datareaders. I seem to be unable to do "reader("column_name")" for aliased column names?
Example:
query = "SELECT u.*, t.team_name FROM Users u
JOIN teaminfo t ON u.user_team = t.team_id WHERE user_csn = '" & GV.userCSN & "'"
I tried to use
reader("u.user_name")
but failed as well.
You need a different form of the INSERT statement: INSERT INTO ... SELECT ... FROM ...:
INSERT INTO Users (user_name, user_csn, user_adid, user_contact, user_adminpriviledge, user_datestart, user_active, user_team)
SELECT @username, @usercsn, @useradid, @usercontact, @userauth, @userstart, @useractive, team_id -- <-- here's your column
FROM teaminfo
WHERE team_name = @param
Also, it looks like this is .NET (C# or VB) code, so be aware that you are open to SQL injection when you concatenate your string with user input!
In my SQL I already put @param in the proper place; with the SqlCommand you are probably using, call the Add method on the SqlCommand.Parameters collection and supply it with the value of ddlAddTeam.SelectedValue.
Try this code:
Using connection = New SqlConnection("connString")
    Using com = New SqlCommand
        com.Connection = connection
        com.CommandText = "INSERT INTO Users (user_name, user_csn, user_adid, user_contact, user_adminpriviledge, user_datestart, user_active, user_team)
            SELECT @username, @usercsn, @useradid, @usercontact, @userauth, @userstart, @useractive, team_id
            FROM teaminfo
            WHERE team_name = @param"
        ' Add the remaining @username, @usercsn, ... parameters the same way.
        com.Parameters.Add("@param", SqlDbType.VarChar).Value = ddlAddTeam.SelectedValue
        connection.Open()
        com.ExecuteNonQuery()
    End Using
End Using
And for column aliases: in the data reader you use column aliases without the table name (the u before the dot in our example). Try to give aliases to all your columns to avoid such problems.
The data source for an INSERT statement can be a SELECT statement—see the <dml_table_source> part of the statement definition at the linked page—and a SELECT statement can include parameters in the select list. Here's a simple example:
declare @Target table (Id bigint, Datum char(1));
declare @Source table (Id bigint);
declare @Datum char(1) = 'X';
insert @Source values (1);
insert @Target
select
    Id = S.Id,      -- Value from another table
    Datum = @Datum  -- Parameter
from
    @Source S;
There are more examples at the page linked above; scroll down to the "Inserting Data From Other Tables" section header.
Also, if you're going to build a query in (C#?) code as you've shown in your example, you should really pass any arguments as parameters rather than trying to build them directly into the query text. Read up on SQL injection attacks to see why.
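To see the INSERT ... SELECT pattern end to end, here is a minimal runnable sketch using Python's sqlite3 (table, column, and value names are invented for illustration; SQLite uses ? placeholders where SQL Server uses @name):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE teaminfo (team_id INTEGER, team_name TEXT)")
conn.execute("CREATE TABLE Users (user_name TEXT, user_team INTEGER)")
conn.execute("INSERT INTO teaminfo VALUES (7, 'Blue')")

# INSERT ... SELECT: the bound parameters supply the literal values,
# while the SELECT supplies team_id looked up from the other table.
conn.execute(
    "INSERT INTO Users (user_name, user_team) "
    "SELECT ?, team_id FROM teaminfo WHERE team_name = ?",
    ("alice", "Blue"),
)

print(conn.execute("SELECT user_name, user_team FROM Users").fetchall())
# → [('alice', 7)]
```

The same shape works in T-SQL: parameters go in the select list, and the FROM/WHERE does the lookup.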
Your INSERT query should be like:
"INSERT INTO Users (user_name, user_csn, user_adid, user_contact, user_adminpriviledge, user_datestart, user_active, user_team)
VALUES (@username, @usercsn, @useradid, @usercontact, @userauth, @userstart, @useractive, (SELECT team_id FROM teaminfo WHERE team_name = @userteam))"
Second, when fetching from the reader, it should be like:
reader("user_name") // I am not sure about this. You can put break point and open the object in watch window
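The alias behaviour is easy to demonstrate outside .NET; this sketch uses Python's sqlite3 (names invented), where the result set likewise exposes plain column names and aliases rather than table-qualified names:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # access columns by name, like a DataReader
conn.execute("CREATE TABLE Users (user_name TEXT, user_team INTEGER)")
conn.execute("CREATE TABLE teaminfo (team_id INTEGER, team_name TEXT)")
conn.execute("INSERT INTO Users VALUES ('alice', 7)")
conn.execute("INSERT INTO teaminfo VALUES (7, 'Blue')")

row = conn.execute(
    "SELECT u.user_name, t.team_name AS team_name "
    "FROM Users u JOIN teaminfo t ON u.user_team = t.team_id"
).fetchone()

# The result set exposes 'user_name', not 'u.user_name'.
print(row["user_name"], row["team_name"])
# → alice Blue
```

.NET's SqlDataReader behaves analogously: the table prefix is not part of the returned column name, so index by the bare name or alias.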
I need to process data in Azure Data Lake.
My flow is as follows:
I would like to select from the database list of IDs for next processing. This I have done.
I need to iterate through IDs (from the first step) and I need to successively export data into separated files (partitioned by ID)
The problem is the following statement:
U-SQL’s procedures do not provide any imperative code-flow constructs
such as a for or while loops.
Any idea how to process data in similar way as with cursor?
I didn't find any documentation regarding cursors in U-SQL.
Thank you!
There are no cursors in U-SQL, for the reason given in the statement you quoted: imperative code-flow constructs impede the optimizer's ability to globally optimize your script.
You should think of approaching your problem declaratively. For example, if you have a list of IDs (either in a table or SqlArray or even a file), use a declarative join. For example, you want to add 42 to every value where the key is in a list of existing keys:
// Two options for providing the "looping data"
// Option 1: Array variable
DECLARE @keys_var SqlArray<string> = new SqlArray<string>{"k1", "k2", "k3"};
// Option 2: Rowset (e.g. from an EXTRACT from a file, a table or another place)
@keys = SELECT * FROM (VALUES ("k1"), ("k2"), ("k3")) AS T(key);
// This is the data you want to iterate over to add 42 to the value column for every matching key
@inputdata = SELECT * FROM (VALUES (1, "k1"), (2, "k1"), (3, "k2"), (6, "k5")) AS T(value, key);
// Option 1:
@res = SELECT value + 42 AS newval, key FROM @inputdata WHERE @keys_var.Contains(key);
OUTPUT @res TO "/output/opt1.csv" USING Outputters.Csv();
// Option 2:
@res = SELECT value + 42 AS newval, i.key
       FROM @inputdata AS i INNER JOIN @keys AS k
       ON i.key == k.key;
OUTPUT @res TO "/output/opt2.csv" USING Outputters.Csv();
Now in your case, you want to have data-driven output file sets. This feature is currently being worked on (it is one of our top asks). Until then you would have to write a script to generate the script (I will provide an example on your other question).
If you really want iterative behaviour, you need to call the U-SQL from PowerShell.
For example:
ForEach ($Date in $Dates)
{
    $USQLProcCall = '[dbo].[usp_OutputDailyAvgSpeed]("' + $Date + '");'
    $JobName = 'Output daily avg dataset for ' + $Date

    Write-Host $USQLProcCall

    $job = Submit-AzureRmDataLakeAnalyticsJob `
        -Name $JobName `
        -AccountName $DLAnalyticsName `
        -Script $USQLProcCall `
        -DegreeOfParallelism $DLAnalyticsDoP

    Write-Host "Job submitted for " $Date
}
Source: https://www.purplefrogsystems.com/paul/2017/05/recursive-u-sql-with-powershell-u-sql-looping/
I am new to Scala and have been learning about Slick (3.1.1). While writing code to insert data into a table, I have to insert a row if it does not exist, else update a particular column in that table. For a single-row update I have written the following code, which works fine:
def updateDate(id: Int, country: Country, lastDate: DateTime)(implicit ec: ExecutionContext) =
  byPKC.applied((id, country)).map(_.lastMessageDate).update(lastDate) flatMap {
    case 0 ⇒
      create(User.withLastDate(id, country, lastDate))
    case x ⇒ DBIO.successful(x)
  }
Now I am unsure of how to do a bulk operation in scala for this. I tried the following, which even though inefficient, should work but there is no row being inserted into the table.
def updateDates(ids: Set[Int], country: Country, lastDate: DateTime)(implicit ec: ExecutionContext) = {
  ids.foreach(e ⇒ updateDate(e, country, lastDate))
  DBIO.successful(1)
}
How do I do a bulk write in scala? Also, why does this bulk operation not work? Any help would be much appreciated.
Two-part answer to the two-part question.
How do I do a bulk write in scala?
You can't do bulk updates in SQL; it's simply not supported. You can bulk insert, though. Slick supports bulk inserts with the ++= operator on TableQuery.
Why does this bulk operation not work?
You're generating the DBIO values, but you never execute them via db.run(). A DBIO only describes the SQL operations to perform; nothing is sent to the database until you run it.
I have a page that asks users their opinion about a topic. Their responses are then saved into a table. What I want to do is check how many users selected option 1, 2, 3, or 4.
What I have now are multiple T-SQL queries that run successfully, but I believe there is a simpler version of the code I have written. I would be grateful if someone could simplify my queries into one single query. Thank you.
A sample of the data in the database table was attached as a screenshot (not reproduced here).
$sql4 = "SELECT COUNT(CO) FROM GnAppItms WHERE CO='1' AND MountID='".$mountID."'";
$stmt4 = sqlsrv_query($conn2, $sql4);
$row4 = sqlsrv_fetch_array($stmt4);
$sql5="SELECT COUNT(CO) FROM GnAppItms WHERE CO='2' AND MountID='".$mountID."'";
$stmt5=sqlsrv_query($conn2,$sql5);
$row5=sqlsrv_fetch_array($stmt5);
$sql6="SELECT COUNT(CO) FROM GnAppItms WHERE CO='3' AND MountID='".$mountID."'";
$stmt6=sqlsrv_query($conn2,$sql6);
$row6=sqlsrv_fetch_array($stmt6);
$sql7="SELECT COUNT(CO) FROM GnAppItms WHERE CO='4' AND MountID='".$mountID."'";
$stmt7=sqlsrv_query($conn2,$sql7);
$row7=sqlsrv_fetch_array($stmt7);
You can do it by using GROUP BY in SQL Server.
Example:
create table a
(
    id int,
    mountid nvarchar(100),
    co int
)
insert into a values (1,'aa',1)
insert into a values (2,'aa',2)
insert into a values (3,'aa',1)
insert into a values (4,'aa',2)
insert into a values (5,'aa',3)
Query
select co, count(co) as countofco
from a
where mountid = 'aa'
group by co
result
co countofco
1 2
2 2
3 1
Note: Beware of SQL injection when you are writing a SQL query, so always use parameterized queries. You can edit the example above and make it a parameterized query to prevent SQL injection.
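As a runnable sketch of both points (one GROUP BY query, parameterized), here is the same data in Python's sqlite3; the original code targets SQL Server via sqlsrv, so this only mirrors the idea:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE GnAppItms (id INTEGER, MountID TEXT, CO INTEGER)")
conn.executemany(
    "INSERT INTO GnAppItms VALUES (?, ?, ?)",
    [(1, "aa", 1), (2, "aa", 2), (3, "aa", 1), (4, "aa", 2), (5, "aa", 3)],
)

# One query instead of four: count every CO value at once, and pass
# MountID as a bound parameter instead of concatenating it into the SQL.
rows = conn.execute(
    "SELECT CO, COUNT(CO) AS countofco FROM GnAppItms "
    "WHERE MountID = ? GROUP BY CO ORDER BY CO",
    ("aa",),
).fetchall()
print(rows)
# → [(1, 2), (2, 2), (3, 1)]
```

The PHP version is the same query string with sqlsrv_query($conn2, $sql, array($mountID)) supplying the parameter.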