I am new to Scala and have been learning about Slick (3.1.1). I need to insert a row into a table if it does not exist, or else update a particular column of that row. For a single row I have written the following code, which works fine:
def updateDate(id: Int, country: Country, lastDate: DateTime)(implicit ec: ExecutionContext) =
  byPKC.applied((id, country)).map(_.lastMessageDate).update(lastDate) flatMap {
    case 0 ⇒
      create(User.withLastDate(id, country, lastDate))
    case x ⇒ DBIO.successful(x)
  }
Now I am unsure how to do a bulk operation for this in Scala. I tried the following, which, even though inefficient, should work; but no rows are being inserted into the table.
def updateDates(ids: Set[Int], country: Country, lastDate: DateTime)(implicit ec: ExecutionContext) = {
  ids.foreach(e ⇒ updateDate(e, country, lastDate))
  DBIO.successful(1)
}
How do I do a bulk write in Scala? Also, why does this bulk operation not work? Any help would be much appreciated.
Two-part answer to the two-part question.
How do I do a bulk write in Scala?
You can't do a bulk insert-or-update in one portable SQL statement, and Slick 3.1 does not expose one. You can bulk insert, though: Slick supports bulk inserts with the ++= operator on TableQuery.
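For example, a minimal sketch (not from the original post), assuming a users TableQuery matching the question's table and a db: Database in scope:

val rows = ids.toSeq.map(id ⇒ User.withLastDate(id, country, lastDate))
val bulkInsert = users ++= rows // sent to the database as one JDBC batch insert
db.run(bulkInsert)              // yields Future[Option[Int]] (drivers may not report a row count)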
Why does this bulk operation not work?
You're generating the DBIO values, but you aren't executing them via db.run(). A DBIO is only a description of a database action; nothing is sent to the database until you run it. In updateDates, the actions created inside foreach are discarded unexecuted.
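One way to fix updateDates is to compose the per-id actions into a single DBIO and run that; a sketch under the same assumptions as above:

def updateDates(ids: Set[Int], country: Country, lastDate: DateTime)(implicit ec: ExecutionContext) =
  DBIO.sequence(ids.toSeq.map(id ⇒ updateDate(id, country, lastDate)))

// The caller then executes the composed action, e.g.:
// db.run(updateDates(ids, country, lastDate).transactionally)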
Related
I’m building an app using Cassandra as the DB and I was wondering if there is any way to copy and "sync" a column value with another.
I’ve tried to use materialized views, but I wasn’t able to add additional regular columns in the same row where the view was created. I also saw a diagram about "links" in CQL.
Can anyone help me find a way of doing this, please?
There isn't a way to do that in Cassandra. You will need to use CQL BATCH statements to keep your tables synchronised: a logged batch groups inserts, updates and deletes so they are all applied as a single atomic unit. Have a look at this article where I've explained it in a bit more detail -- https://community.datastax.com/articles/2744/.
For example, if you had these tables to maintain:
movies
movies_by_actor
movies_by_genre
then you would group the updates in a CQL BATCH like this:
BEGIN BATCH
INSERT INTO movies (...) VALUES (...);
INSERT INTO movies_by_actor (...) VALUES (...);
INSERT INTO movies_by_genre (...) VALUES (...);
APPLY BATCH;
Note that it is also possible to do UPDATE and DELETE statements as well as conditional writes in a batch.
The example above just illustrates the syntax in cqlsh; in application code you would use the driver's batch API instead. Here is an example BatchStatement using the Java driver:
SimpleStatement insertMovies =
    SimpleStatement.newInstance(
        "INSERT INTO movies (...) VALUES (?, ...)", <some_values>);
SimpleStatement insertMoviesByActor =
    SimpleStatement.newInstance(
        "INSERT INTO movies_by_actor (...) VALUES (?, ...)", <some_values>);
SimpleStatement insertMoviesByGenre =
    SimpleStatement.newInstance(
        "INSERT INTO movies_by_genre (...) VALUES (?, ...)", <some_values>);

BatchStatement batch =
    BatchStatement.builder(DefaultBatchType.LOGGED)
        .addStatement(insertMovies)
        .addStatement(insertMoviesByActor)
        .addStatement(insertMoviesByGenre)
        .build();
For details, see Java driver Batch statements. Cheers!
I am trying to batch insert records into an SQL table using Kotlin Exposed. I have set up the code as per the Exposed documentation; however, the SQL statements being executed are individual INSERT statements rather than one batch INSERT statement.
The documentation located here: https://github.com/JetBrains/Exposed/wiki/DSL
has the following on Batch Inserting:
Batch Insert
Batch Insert allows mapping a list of entities into DB rows in one SQL statement. It is more efficient than inserting one by one, as it initiates only one statement. Here is an example:
val cityNames = listOf("Paris", "Moscow", "Helsinki")
val allCitiesID = cities.batchInsert(cityNames) { name ->
this[cities.name] = name
}
My code is as follows:
val mappings: List<Triple<String, String, String>> = listOf(triple1, triple2, triple3)

transaction {
    TableName.batchInsert(mappings) {
        this[TableName.value1] = it.first
        this[TableName.value2] = it.second
        this[TableName.value3] = it.third
    }
}
What I expect to see printed out is one batch INSERT statement following this syntax:
INSERT INTO TableName (value1, value2, value3) values
(triple1value1, triple1value2, triple1value3),
(triple2value1, triple2value2, triple2value3),
(triple3value1, triple3value2, triple3value3), ...
but instead it prints three individual INSERT statements with the following syntax:
INSERT INTO TableName (value1, value2, value3) values (triple1value1, triple1value2, triple1value3)
INSERT INTO TableName (value1, value2, value3) values (triple2value1, triple2value2, triple2value3)
INSERT INTO TableName (value1, value2, value3) values (triple3value1, triple3value2, triple3value3)
As this seems like the documented correct way to batch insert, what am I doing incorrectly here?
The docs explain:
NOTE: The batchInsert function will still create multiple INSERT statements when interacting with your database. You most likely want to couple this with the rewriteBatchedInserts=true (or rewriteBatchedStatements=true) option of your relevant JDBC driver, which will convert those into a single bulkInsert. You can find the documentation for this option for MySQL here and PostgreSQL here.
https://github.com/JetBrains/Exposed/wiki/DSL#batch-insert
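For reference, a sketch of where the option goes; the JDBC URLs below are illustrative placeholders, not taken from the docs:

// PostgreSQL's JDBC driver spells it reWriteBatchedInserts (note the capital W)
val postgresUrl = "jdbc:postgresql://localhost:5432/mydb?reWriteBatchedInserts=true"
// MySQL Connector/J spells it rewriteBatchedStatements
val mysqlUrl = "jdbc:mysql://localhost:3306/mydb?rewriteBatchedStatements=true"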
I have a page that asks users their opinion about a topic. Their responses are then saved into a table. What I want to do is check how many users selected options 1, 2, 3, and 4.
What I have now are multiple T-SQL queries that run successfully, but I believe there is a simpler version of the code I have written. I would be grateful if someone could combine my queries into one single query. Thank you.
(A sample of the table's data was shown as a screenshot in the original post.)
$sql4 = "SELECT COUNT(CO) FROM GnAppItms WHERE CO='1' AND MountID='".$mountID."'";
$stmt4 = sqlsrv_query($conn2, $sql4);
$row4 = sqlsrv_fetch_array($stmt4);
$sql5="SELECT COUNT(CO) FROM GnAppItms WHERE CO='2' AND MountID='".$mountID."'";
$stmt5=sqlsrv_query($conn2,$sql5);
$row5=sqlsrv_fetch_array($stmt5);
$sql6="SELECT COUNT(CO) FROM GnAppItms WHERE CO='3' AND MountID='".$mountID."'";
$stmt6=sqlsrv_query($conn2,$sql6);
$row6=sqlsrv_fetch_array($stmt6);
$sql7="SELECT COUNT(CO) FROM GnAppItms WHERE CO='4' AND MountID='".$mountID."'";
$stmt7=sqlsrv_query($conn2,$sql7);
$row7=sqlsrv_fetch_array($stmt7);
You can do it by using GROUP BY in SQL Server.
Example:
create table a
(
  id int,
  mountid nvarchar(100),
  co int
)

insert into a values (1, 'aa', 1)
insert into a values (2, 'aa', 2)
insert into a values (3, 'aa', 1)
insert into a values (4, 'aa', 2)
insert into a values (5, 'aa', 3)
Query:

select co, count(co) as countofco
from a
where mountid = 'aa'
group by co
Result:

co    countofco
1     2
2     2
3     1
Note: beware of SQL injection when you are writing a SQL query, so always use parameterized queries. You can edit the example code above to make it a parameterized query and prevent SQL injection.
Say you have a stored procedure or function returning multiple rows, as discussed in How to return multiple rows from the stored procedure? (Oracle PL/SQL)
What would be a good way, using Scala, to "select * from table (all_emps);" (taken from URL above) and read the multiple rows of data that would be the result?
As far as I can see it is not possible to do this using Squeryl. Is there a scalaified tool like Squeryl that I can use, or do I have to drop to JDBC?
Functions that return tables are an Oracle-specific feature; I doubt an ORM (be it Scala or even Java) would have support for such a proprietary extension.
So I think you're more or less on your own :).
Probably the easiest way is to use a plain JDBC java.sql.Statement and execute "select * from table (all_emps)" with the executeQuery method.
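A minimal sketch of that plain-JDBC route from Scala (the connection string is a placeholder, and all_emps is the function from the linked question):

import java.sql.DriverManager

val conn = DriverManager.getConnection("jdbc:oracle:thin:@//host:1521/service", "user", "pass")
try {
  val stmt = conn.createStatement()
  val rs = stmt.executeQuery("select * from table(all_emps)")
  while (rs.next())
    println(rs.getString(1)) // read each row's columns by index or name
} finally conn.close()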
To address the second part of your question about a way to select from a table in a more Scala-esque way, I am using Slick. Quoting from their example documentation:
case class Coffee(name: String, supID: Int, price: Double)

implicit val getCoffeeResult = GetResult(r => Coffee(r.<<, r.<<, r.<<))

Database.forURL("...") withSession {
  Seq(
    Coffee("Colombian", 101, 7.99),
    Coffee("Colombian_Decaf", 101, 8.99),
    Coffee("French_Roast_Decaf", 49, 9.99)
  ).foreach(c => sqlu"""
    insert into coffees values (${c.name}, ${c.supID}, ${c.price})
  """.execute)

  val sup = 101
  val q = sql"select * from coffees where sup_id = $sup".as[Coffee] // $sup becomes a bind variable, preventing SQL injection
  q.foreach(println)
}
Though I am not sure how it deals (if at all) with stored procs/functions.
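That said, the plain-SQL interpolator should in principle accept the table-function select directly. A hedged sketch mirroring the Coffee example above (Emp and its GetResult mapping are assumptions, not from the question):

case class Emp(id: Int, name: String)
implicit val getEmpResult = GetResult(r => Emp(r.<<, r.<<))

val q = sql"select * from table (all_emps)".as[Emp]
q.foreach(println)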
I am trying to use Dapper to support data access in my server app.
My server app works alongside another application that drops records into my database at a rate of 400 per minute.
My app pulls them out in batches, processes them, and then deletes them from the database.
Since data continues to flow into the database while I am processing, I don't have a good way to say delete from myTable where allProcessed = true.
However, I do know the PK values of the rows to delete, so I want to do delete from myTable where Id in @listToDelete.
The problem is that if my server goes down for even 6 minutes, I have over 2100 rows to delete.
Since Dapper takes my @listToDelete and turns each entry into a parameter, my call to delete fails (causing my data purging to get even further behind).
What is the best way to deal with this in Dapper?
NOTES:
I have looked at Table-Valued Parameters, but from what I can see they are not very performant. This piece of my architecture is the bottleneck of my system, and I need it to be very fast.
One option is to create a temp table on the server and then use the bulk-load facility to upload all the IDs into that table at once. Then use a join, EXISTS, or IN clause to delete only the records that you uploaded into your temp table.
Bulk loading is a well-optimized path in SQL Server and should be very fast.
For example:
Execute the statement CREATE TABLE #RowsToDelete(ID INT PRIMARY KEY)
Use a bulk load to insert keys into #RowsToDelete
Execute DELETE FROM myTable WHERE Id IN (SELECT ID FROM #RowsToDelete)
Execute DROP TABLE #RowsToDelete (the table will also be dropped automatically if you close the session)
(Assuming Dapper) code example:
conn.Open();

var columnName = "ID";

conn.Execute(string.Format("CREATE TABLE #{0}s({0} INT PRIMARY KEY)", columnName));

using (var bulkCopy = new SqlBulkCopy(conn))
{
    bulkCopy.BatchSize = ids.Count;
    bulkCopy.DestinationTableName = string.Format("#{0}s", columnName);
    var table = new DataTable();
    table.Columns.Add(columnName, typeof(int));
    bulkCopy.ColumnMappings.Add(columnName, columnName);
    foreach (var id in ids)
    {
        table.Rows.Add(id);
    }
    bulkCopy.WriteToServer(table);
}

// or do other things with your table instead of deleting here
conn.Execute(string.Format(@"DELETE FROM myTable WHERE Id IN
                             (SELECT {0} FROM #{0}s)", columnName));

conn.Execute(string.Format("DROP TABLE #{0}s", columnName));
To get this code working, I went over to the dark side.
Since Dapper turns my list into parameters, and SQL Server can't handle a large number of parameters (I have never needed even double-digit parameter counts before), I had to go with dynamic SQL.
So here was my solution:
string listOfIdsJoined = "("+String.Join(",", listOfIds.ToArray())+")";
connection.Execute("delete from myTable where Id in " + listOfIdsJoined);
Before everyone grabs their torches and pitchforks, let me explain.
This code runs on a server whose only input is a data feed from a Mainframe system.
The list I am dynamically creating is a list of longs/bigints.
The longs/bigints are from an Identity column.
I know constructing dynamic SQL is bad juju, but in this case, I just can't see how it leads to a security risk.
Dapper can also take a list of objects whose property names match the query's parameters; it then executes the statement once per element of the list. So in the above case, passing a list of objects that each have an Id property works:

connection.Execute("delete from myTable where Id in (@Id)", listOfIds.AsEnumerable().Select(i => new { Id = i }).ToList());