Dapper-Plus BulkInsert - How to return number of rows affected? - dapper

In Dapper-Plus, is there a way to return the number of rows affected in the database? This is my code:
using (SqlConnection connection = new SqlConnection(Environment.GetEnvironmentVariable("sqldb_connection")))
{
    connection.BulkInsert(myList);
}
I see you can do it for inserting a single row, but I can't find that functionality on the Dapper Plus bulk insert.

Since Dapper Plus allows you to chain multiple methods, the method doesn't directly return this value.
However, you can do it with the following code:
var resultInfo = new Z.BulkOperations.ResultInfo();

connection.UseBulkOptions(options => {
    options.UseRowsAffected = true;
    options.ResultInfo = resultInfo;
}).BulkInsert(orders);

// Show RowsAffected
Console.WriteLine("Rows Inserted: " + resultInfo.RowsAffectedInserted);
Console.WriteLine("Rows Affected: " + resultInfo.RowsAffected);
Fiddle: https://dotnetfiddle.net/mOMNng
Keep in mind that using that option will make the bulk operation slightly slower.
EDIT: Answering the comment:
will it make it as slow as using the regular dapper insert method or is this way still faster?
It will still be way faster than a regular insert.

Related

Is it necessary to avoid loops for updating models in Laravel?

I'm trying to sort multiple records of a model based on a field and store their ranks in the DB, like below:
$instances = Model::orderBy('field')->get();
$rank = 1;

foreach ($instances as $instance) {
    $instance->update([
        'rank' => $rank,
    ]);
    $rank++;
}
I have two questions:
1- Are there any alternative ways to avoid using a loop? For example, could I put the ranks in an array and update all the records with a single method, like this:
$instances = Model::orderBy('field')->get();
$rank = 1;
$ranks_array = array();

foreach ($instances as $instance) {
    array_push($ranks_array, $rank);
    $rank++;
}

$instances->magicMethod($ranks_array);
2- Is it necessary at all to do so? Do loops have a heavy effect on the server or not? I should mention that the number of records I'm going to update will not exceed 50.
For insert queries, inserting all records at once is much faster than inserting them one by one. However, for update queries, if you need to update specific rows with specific data, there is no other way than to update them one by one.
I recently came across a similar issue where I needed to update 90k+ rows in my DB.
Because I needed to set specific values on each row, I had to update each row individually.
What I found was that instead of doing
$DBModel = Model::get();

foreach ($DBModel as $row) {
    $row->notify = $row->paid;
    // the date is calculated from a specific carbon date from another column
    // I did not include the logic here for the example
    $row->due = "0000-00-00 00:00:00";
    $row->save();
}
Running the previous query took 5m33s
But doing it like this
$DBModel = Model::get();

DB::beginTransaction();
foreach ($DBModel as $row) {
    DB::update(
        "update TABLE_NAME set due = ?, notify = ? where id = ?",
        ["0000-00-00 00:00:00", $row->paid, $row->id]
    );
}
DB::commit();
This version took only 1m54s to execute.

Does the feature "execute a command multiple times" result in multiple round-trips to the database?

In the Dapper documentation, it says you can use an IEnumerable parameter to execute a command multiple times. It gives the following example:
connection.Execute(@"insert MyTable(colA, colB) values (@a, @b)",
    new[] { new { a=1, b=1 }, new { a=2, b=2 }, new { a=3, b=3 } }
).IsEqualTo(3); // 3 rows inserted: "1,1", "2,2" and "3,3"
Will this result in multiple round-trips to the database (i.e. one for each T in the IEnumerable<T>)? Or is Dapper smart enough to transform the multiple queries into a batch and just do one round-trip? The documentation says an example usage is batch loading, so I suspect it only does one round-trip, but I want to be sure before I use it for performance-sensitive code.
As a follow-up question, and depending on the answer to the first, I'd be curious how transactions are handled? That is, is there one transaction for the whole set of Ts, or one transaction per T?
I finally got around to looking at this again. Looking at the source code (in \Dapper\SqlMapper.cs), I found the following snippet in method ExecuteImpl:
// ...
foreach (var obj in multiExec)
{
    if (isFirst)
    {
        masterSql = cmd.CommandText;
        isFirst = false;
        identity = new Identity(command.CommandText, cmd.CommandType, cnn, null, obj.GetType(), null);
        info = GetCacheInfo(identity, obj, command.AddToCache);
    }
    else
    {
        cmd.CommandText = masterSql; // because we do magic replaces on "in" etc
        cmd.Parameters.Clear(); // current code is Add-tastic
    }
    info.ParamReader(cmd, obj);
    total += cmd.ExecuteNonQuery();
}
// ...
The interesting part is on the second-to-last line, where ExecuteNonQuery is called. That method is called on each iteration of the foreach loop, so I guess it is not being batched in the sense of a set-based operation. Therefore, multiple round-trips are required. However, it is batched in the sense that all operations are performed on the same connection, and within the same transaction if one is specified.
The only way I can think of to do a set-based operation is to create a custom table-valued type (in the database) for the object of interest. Then, in the .NET code pass a DataTable object containing matching names and types as a command parameter. If there were a way to do this without having to create a table-valued type for every object, I'd love to hear about it.
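For what it's worth, here is a minimal sketch of that table-valued-parameter idea with Dapper. Both dbo.MyTableType and MyTable are made-up names, and the connection string lookup is only a placeholder; Dapper's AsTableValuedParameter extension wraps a DataTable as a structured parameter:
using System;
using System.Data;
using System.Data.SqlClient;
using Dapper;

// Assumed to already exist in the database (hypothetical names):
//   CREATE TYPE dbo.MyTableType AS TABLE (ColA int, ColB int);
//   CREATE TABLE MyTable (colA int, colB int);
var table = new DataTable();
table.Columns.Add("ColA", typeof(int));
table.Columns.Add("ColB", typeof(int));
table.Rows.Add(1, 1);
table.Rows.Add(2, 2);
table.Rows.Add(3, 3);

using var connection = new SqlConnection(Environment.GetEnvironmentVariable("sqldb_connection"));

// The whole set travels as a single structured parameter, so this is one
// round-trip and one set-based INSERT on the server.
var rows = connection.Execute(
    "insert MyTable(colA, colB) select ColA, ColB from @rows",
    new { rows = table.AsTableValuedParameter("dbo.MyTableType") });
The downside remains the one mentioned above: a matching table type has to be created in the database for each shape of data you want to pass.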

Entity Framework insert (of one object) slow when table has large number of records

I have a large asp.net mvc application that runs on a database that is rapidly growing in size. When the database is empty, everything works quickly, but one of my tables now has 350K records in it and an insert is now taking 15s. Here is a snippet:
foreach (var packageSheet in packageSheets)
{
    // Create OrderSheets
    var orderSheet = new OrderSheet { Sheet = packageSheet.Sheet };

    // Add Default Options
    orderSheet.AddDefaultOptions();
    orderSheet.OrderPrints.Add(
        new OrderPrint
        {
            OrderPose = CurrentOrderSubject.OrderPoses.Where(op => op.Id == orderPoseId).Single(),
            PrintId = packageSheet.Sheet.Prints.First().Id
        });

    // Create OrderPackageSheets and add it to the order package held in the session
    var orderPackageSheet =
        new OrderPackageSheet
        {
            OrderSheet = orderSheet,
            PackageSheet = packageSheet
        };

    _orderPackageRepository.SaveChanges();
    ...
}
When I call SaveChanges at this point it takes 15s on the first iteration of the loop; each iteration after that is fast. I have indexed the tables in question, so I believe the database is tuned properly. It's the OrderPackageSheets table that contains 350K rows.
Can anyone tell me how I can optimize this to get rid of the delay?
Thank you!
EF can be slow if you are inserting a lot of rows at the same time.
context.Configuration.AutoDetectChangesEnabled = false; won't do much for you if this really is a web app.
You would need to share your table definition, but for instance you can use the Simple recovery model, which will improve insert performance.
Or, as mentioned, if you need to insert a lot of rows, use bulk inserts (see the sketch below).
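For reference, a rough sketch of that bulk-insert route using ADO.NET's SqlBulkCopy outside of EF (the connection string, table, and column names below are placeholders):
using System;
using System.Data;
using System.Data.SqlClient;

string connectionString = "Server=.;Database=MyDb;Trusted_Connection=True;"; // placeholder
var table = new DataTable();
table.Columns.Add("OrderSheetId", typeof(int));    // placeholder columns
table.Columns.Add("PackageSheetId", typeof(int));
table.Rows.Add(1, 1);                              // fill from your in-memory objects

using (var connection = new SqlConnection(connectionString))
{
    connection.Open();
    using (var bulkCopy = new SqlBulkCopy(connection))
    {
        bulkCopy.DestinationTableName = "dbo.OrderPackageSheets"; // placeholder table
        bulkCopy.ColumnMappings.Add("OrderSheetId", "OrderSheetId");
        bulkCopy.ColumnMappings.Add("PackageSheetId", "PackageSheetId");
        // One streamed bulk operation instead of row-by-row INSERT statements.
        bulkCopy.WriteToServer(table);
    }
}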
If the number of records is too high, you can use a stored procedure instead of EF.
If you need to use EF itself, disable automatic change detection on the context using
context.Configuration.AutoDetectChangesEnabled = false;
and save the context once after the loop, for example as sketched below.
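A minimal sketch of that approach, assuming an EF6-style DbContext (the OrdersContext name and the OrderPackageSheets DbSet are assumptions based on the question's entities):
using (var context = new OrdersContext()) // hypothetical DbContext
{
    // Skip per-entity change detection while adding many entities.
    context.Configuration.AutoDetectChangesEnabled = false;

    foreach (var packageSheet in packageSheets)
    {
        var orderSheet = new OrderSheet { Sheet = packageSheet.Sheet };
        orderSheet.AddDefaultOptions();

        context.OrderPackageSheets.Add(new OrderPackageSheet
        {
            OrderSheet = orderSheet,
            PackageSheet = packageSheet
        });
    }

    // A single SaveChanges for the whole batch instead of one per iteration.
    context.SaveChanges();
}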
Check these links
Efficient way to do bulk insert/update with Entity Framework
http://weblog.west-wind.com/posts/2013/Dec/22/Entity-Framework-and-slow-bulk-INSERTs

JDBC batched updates: good key retrieval strategy

I insert a lot of data into a table with an autogenerated key using the batchUpdate functionality of JDBC. Because JDBC doesn't say anything about batchUpdate and getGeneratedKeys, I need some database-independent workaround.
My ideas:
Somehow pull the next set of keys that will be handed out from the database's sequence before inserting, and then use those keys manually. But JDBC hasn't got a getTheNextFutureKeys(howMany). So how can this be done? Is pulling keys, e.g. in Oracle, also transaction safe, so that only one transaction can ever pull the same set of future keys?
Add an extra column with a fake id that is only valid during the transaction.
Use all the other columns as a secondary key to fetch the generated key. This isn't really 3NF-conformant...
Are there better ideas or how can I use idea 1 in a generalized way?
Partial answer
Is pulling keys e.g. in Oracle also transaction save?
Yes, getting values from a sequence is transaction safe, by which I mean even if you roll back your transaction, a sequence value returned by the DB won't be returned again under any circumstances.
So you can prefetch the ids from a sequence and use them in the batch insert.
I had never run into this, so I dived into it a little.
First of all, there is a way to retrieve the generated ids from a JDBC statement:
String sql = "INSERT INTO AUTHORS (LAST, FIRST, HOME) " +
             "VALUES ('PARKER', 'DOROTHY', 'USA')";
int rows = stmt.executeUpdate(sql, Statement.RETURN_GENERATED_KEYS);

ResultSet rs = stmt.getGeneratedKeys();
if (rs.next()) {
    ResultSetMetaData rsmd = rs.getMetaData();
    int colCount = rsmd.getColumnCount();
    do {
        for (int i = 1; i <= colCount; i++) {
            String key = rs.getString(i);
            System.out.println("key " + i + " is " + key);
        }
    } while (rs.next());
} else {
    System.out.println("There are no generated keys.");
}
See this: http://download.oracle.com/javase/1.4.2/docs/guide/jdbc/getstart/statement.html#1000569
Also, theoretically it could be combined with the JDBC batchUpdate, although this combination seems to be rather non-trivial; on that, please refer to this thread.
I suggest trying this, and if you do not succeed, falling back to pre-fetching from the sequence.
getGeneratedKeys() will also work with a batch update, as far as I remember.
It returns a ResultSet with all newly created ids - not just a single value.
But that requires that the ID is populated through a trigger during the INSERT operation.

Hitting the 2100 parameter limit (SQL Server) when using Contains()

from f in CUSTOMERS
where depts.Contains(f.DEPT_ID)
select f.NAME
depts is a list (IEnumerable<int>) of department ids
This query works fine until you pass a large list (say around 3000 dept ids)... then I get this error:
The incoming tabular data stream (TDS) remote procedure call (RPC) protocol stream is incorrect. Too many parameters were provided in this RPC request. The maximum is 2100.
I changed my query to:
var dept_ids = string.Join(" ", depts.ToStringArray());
from f in CUSTOMERS
where dept_ids.IndexOf(Convert.ToString(f.DEPT_id)) != -1
select f.NAME
Using IndexOf() fixed the error but made the query slow. Is there any other way to solve this? Thanks so much.
My solution (Guids is a list of ids you would like to filter by):
List<MyTestEntity> result = new List<MyTestEntity>();
for (int i = 0; i < Math.Ceiling((double)Guids.Count / 2000); i++)
{
    var nextGuids = Guids.Skip(i * 2000).Take(2000);
    result.AddRange(db.Tests.Where(x => nextGuids.Contains(x.Id)));
}
this.DataContext = result;
Why not write the query in SQL and attach your entity?
It's been a while since I worked in LINQ, but here goes:
IQuery q = Session.CreateQuery(@"
    select *
    from customerTable f
    where f.DEPT_id in (" + string.Join(",", depts.ToStringArray()) + ")");
q.AttachEntity(CUSTOMER);
Of course, you will need to protect against injection, but that shouldn't be too hard.
You will want to check out the LINQKit project, since somewhere in there is a technique for batching up such statements to solve this issue. I believe the idea is to use the PredicateBuilder to break the local collection into smaller chunks, but I haven't reviewed the solution in detail because I've instead been looking for a more natural way to handle this.
Unfortunately it appears from Microsoft's response to my suggestion to fix this behavior that there are no plans set to have this addressed for .NET Framework 4.0 or even subsequent service packs.
https://connect.microsoft.com/VisualStudio/feedback/ViewFeedback.aspx?FeedbackID=475984
UPDATE:
I've opened up some discussion regarding whether this was going to be fixed for LINQ to SQL or the ADO.NET Entity Framework on the MSDN forums. Please see these posts for more information regarding these topics and to see the temporary workaround that I've come up with using XML and a SQL UDF.
I had a similar problem, and I found two ways to fix it:
the Intersect method
a join on the IDs
To get values that are NOT in the list, I used the Except method or a left join.
Update
EntityFramework 6.2 runs the following query successfully:
var employeeIDs = Enumerable.Range(3, 5000);
var orders =
from order in Orders
where employeeIDs.Contains((int)order.EmployeeID)
select order;
Your post was from a while ago, but perhaps someone will benefit from this. Entity Framework does a lot of query caching; every time you send in a different parameter count, a new entry gets added to the cache. Using a "Contains" call will cause SQL to generate a clause like "WHERE x IN (@p1, @p2, ..., @pn)", which bloats the EF cache.
Recently I looked for a new way to handle this, and I found that you can create an entire table of data as a parameter. Here's how to do it:
First, you'll need to create a custom table type, so run this in SQL Server (in my case I called the custom type "TableId"):
CREATE TYPE [dbo].[TableId] AS TABLE(
    Id [int] PRIMARY KEY
)
Then, in C#, you can create a DataTable and load it into a structured parameter that matches the type. You can add as many data rows as you want:
DataTable dt = new DataTable();
dt.Columns.Add("id", typeof(int));
This is an arbitrary list of IDs to search on. You can make the list as large as you want:
dt.Rows.Add(24262);
dt.Rows.Add(24267);
dt.Rows.Add(24264);
Create an SqlParameter using the custom table type and your data table:
SqlParameter tableParameter = new SqlParameter("@id", SqlDbType.Structured);
tableParameter.TypeName = "dbo.TableId";
tableParameter.Value = dt;
Then you can call a bit of SQL from your context that joins your existing table to the values from your table parameter. This will give you all records that match your ID list:
var items = context.Dailies.FromSqlRaw<Dailies>("SELECT * FROM dbo.Dailies d INNER JOIN @id id ON d.Daily_ID = id.id", tableParameter).AsNoTracking().ToList();
You could always partition your list of depts into smaller sets before you pass them as parameters to the IN statement generated by Linq. See here:
Divide a large IEnumerable into smaller IEnumerable of a fix amount of item
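As a small sketch of that partitioning idea (depts, CUSTOMERS, DEPT_ID, and NAME come from the question and db is its data context; Enumerable.Chunk needs .NET 6+, otherwise use Skip/Take as in the answer above):
var names = new List<string>();
// Keep each generated IN (...) clause well under the 2100-parameter cap.
foreach (var chunk in depts.Chunk(2000))
{
    names.AddRange(
        (from f in db.CUSTOMERS
         where chunk.Contains(f.DEPT_ID)
         select f.NAME).ToList());
}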
