Dapper insert or update?

I haven't started using Dapper just yet, but I stumbled across it today while researching bulk inserts/updates. Currently I'm using EF6, but I would like to look at using Dapper for my bulk operations in the future. For this app there could be ~15k records, but I have other apps that could be ~100k records.
I'm trying to work out how to convert the following EF code to Dapper. All it does is read a record from a file and check whether that employee exists in the DB; if so, it updates the properties with the values from the file, and if not, it creates a new object with the values from the file.
I couldn't find any examples when I was looking around. All I could find was how to do a simple insert or a simple update; I didn't find a good example of a bulk insert/update. It's very possible I'm just not understanding how to use Dapper yet.
How would I do this with Dapper?
int count = 1;
string record;
var ctx = new DataContext();
try
{
    ctx.Configuration.AutoDetectChangesEnabled = false;
    ctx.Configuration.ValidateOnSaveEnabled = false;
    while ((record = srFile.ReadLine()) != null)
    {
        int employeeId = int.Parse(record.Substring(2, 8));
        bio_employee employee = ctx.bio_employee.FirstOrDefault(e => e.emp_id == employeeId);
        if (employee != null)
        {
            SetEmployeeData(employee, record);
            ctx.Entry(employee).State = System.Data.Entity.EntityState.Modified;
        }
        else
        {
            employee = new bio_employee();
            employee.emp_id = employeeId;
            SetEmployeeData(employee, record);
            ctx.bio_employee.Add(employee);
        }
        if (count % batchSize == 0)
        {
            // Recreating the context keeps the change tracker small.
            // (A variable declared in a using statement is read-only, so the
            // recreation has to happen outside a using block.)
            ctx.SaveChanges();
            ctx.Dispose();
            ctx = new DataContext();
        }
        count++;
    }
    ctx.SaveChanges(); // save any remaining
}
finally
{
    ctx.Dispose();
}

Dapper provides multiple methods to query data, but none to perform save operations other than executing a command the way you would without an ORM.
However, several third-party libraries cover this scenario for Dapper:
Dapper Plus (Recommended)
Dapper Contrib
Dapper Extensions
Dapper FastCRUD
Dapper SimpleCRUD
Disclaimer: I'm the owner of the project Dapper Plus
Dapper Plus is by far the fastest library, providing BulkInsert, BulkDelete, BulkUpdate, and BulkMerge operations. It can easily support scenarios with millions of records.
// CONFIGURE & MAP entity
DapperPlusManager.Entity<Employee>()
    .Table("Employee")
    .Identity(x => x.EmployeeID);

// SAVE entities
connection.BulkMerge(employeeList);
EDIT: To answer the subquestion:
"Is the .BulkMerge in your Dapper Plus doing an upsert?"
Yes, BulkMerge is an upsert operation.
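If you want to stay on plain Dapper rather than a third-party library, SQL Server's MERGE statement expresses the same insert-or-update logic. A minimal sketch, assuming a bio_employee table with emp_id and emp_name columns (the column set is invented for illustration; adjust it to the real schema):

```csharp
using System;

// Sketch: upsert with plain Dapper via a SQL Server MERGE statement.
// Table and column names (bio_employee, emp_id, emp_name) are assumptions.
public static class EmployeeUpsert
{
    public const string MergeSql = @"
MERGE bio_employee AS target
USING (VALUES (@EmpId, @EmpName)) AS source (emp_id, emp_name)
    ON target.emp_id = source.emp_id
WHEN MATCHED THEN
    UPDATE SET emp_name = source.emp_name
WHEN NOT MATCHED THEN
    INSERT (emp_id, emp_name) VALUES (source.emp_id, source.emp_name);";

    // With Dapper, passing a sequence of parameter objects executes the
    // statement once per element:
    //
    //   connection.Execute(MergeSql,
    //       employees.Select(e => new { EmpId = e.Id, EmpName = e.Name }));
}
```

This issues one statement per row rather than a true bulk operation, so for very large batches the bulk libraries above will still be much faster.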
You can also specify more than one mapping for the same entity by using a mapping key.
// Using the key from the database (x => x.EmployeeID)
DapperPlusManager.Entity<Employee>()
    .Table("Employee");

connection.BulkInsert(employees);

// Using a custom key
DapperPlusManager.Entity<Employee>("customKey")
    .Table("Employee")
    .Key(x => x.Code);

connection.BulkInsert("customKey", employees);

Related

Dapper-Plus BulkInsert - How to return number of rows affected?

In Dapper-Plus, is there a way to return the number of rows affected in the database? This is my code:
using (SqlConnection connection = new SqlConnection(Environment.GetEnvironmentVariable("sqldb_connection")))
{
    connection.BulkInsert(myList);
}
I see you can do it when inserting a single row, but I can't find that functionality on the Dapper Plus bulk insert.
Since Dapper Plus allows chaining multiple methods, the method doesn't directly return this value.
However, you can do it with the following code:
var resultInfo = new Z.BulkOperations.ResultInfo();

connection.UseBulkOptions(options => {
    options.UseRowsAffected = true;
    options.ResultInfo = resultInfo;
}).BulkInsert(orders);

// Show RowsAffected
Console.WriteLine("Rows Inserted: " + resultInfo.RowsAffectedInserted);
Console.WriteLine("Rows Affected: " + resultInfo.RowsAffected);
Fiddle: https://dotnetfiddle.net/mOMNng
Keep in mind that using that option will make the bulk operations slightly slower.
EDIT: To answer the comment:
"Will it make it as slow as using the regular Dapper insert method, or is this way still faster?"
It will still be way faster than regular Insert.

Entity Framework insert (of one object) slow when table has large number of records

I have a large asp.net mvc application that runs on a database that is rapidly growing in size. When the database is empty, everything works quickly, but one of my tables now has 350K records in it and an insert is now taking 15s. Here is a snippet:
foreach (var packageSheet in packageSheets)
{
    // Create OrderSheets
    var orderSheet = new OrderSheet { Sheet = packageSheet.Sheet };

    // Add Default Options
    orderSheet.AddDefaultOptions();

    orderSheet.OrderPrints.Add(
        new OrderPrint
        {
            OrderPose = CurrentOrderSubject.OrderPoses.Where(op => op.Id == orderPoseId).Single(),
            PrintId = packageSheet.Sheet.Prints.First().Id
        });

    // Create OrderPackageSheets and add it to the order package held in the session
    var orderPackageSheet =
        new OrderPackageSheet
        {
            OrderSheet = orderSheet,
            PackageSheet = packageSheet
        };

    _orderPackageRepository.SaveChanges();
    ...
}
When I call SaveChanges at this point, it takes 15s on the first iteration; each iteration after that is fast. I have indexed the tables in question, so I believe the database is tuned properly. It's the OrderPackageSheets table that contains the 350K rows.
Can anyone tell me how I can optimize this to get rid of the delay?
Thank you!
EF can be slow if you are inserting a lot of rows at the same time.
context.Configuration.AutoDetectChangesEnabled = false; won't do much for you if this is really a web app.
You would need to share your table definition; for instance, using the Simple recovery model can improve insert performance.
Or, as mentioned, if you need to insert a lot of rows, use bulk inserts.
If the number of records is too high, you can use a stored procedure instead of EF.
If you need to use EF itself, disable automatic change detection using
context.Configuration.AutoDetectChangesEnabled = false;
and save the context after the loop.
Check these links:
Efficient way to do bulk insert/update with Entity Framework
http://weblog.west-wind.com/posts/2013/Dec/22/Entity-Framework-and-slow-bulk-INSERTs

Is there any way to do a Insert or Update / Merge / Upsert in LLBLGen

I'd like to do an upsert using LLBLGen without first fetching and then saving the entity.
I already found a way to update without fetching the entity first, but then I have to know it is already there.
Updating entries would happen about as often as inserting new ones.
Is there a possibility to do this in one step?
Would it make sense to do it in one step?
Facts:
LLBLgen Pro 2.6
SQL Server 2008 R2
.NET 3.5 SP1
I know I'm a little late to this, but as I remember from working with LLBLGen Pro, it is totally possible; one of its beauties is that everything is possible!
I don't have my samples at hand, but I'm pretty sure there is a method named UpdateEntitiesDirectly that can be used like this:
// suppose we have Product and Order entities
using (var daa = new DataAccessAdapter())
{
    // entity carrying the new field values (only changed fields are written)
    var newValues = new OrderEntity { State = 1 };
    var filter = new RelationPredicateBucket(
        OrderFields.ProductId == 23 & OrderFields.Date > DateTime.Now.AddDays(-2));
    int numberOfUpdatedEntities = daa.UpdateEntitiesDirectly(newValues, filter);
}
When using LLBLGen Pro we were able to do pretty much everything that is possible with an ORM framework; it's just great!
It also has a method to do a batch delete, called DeleteEntitiesDirectly, that may be useful in scenarios where you need to delete an entity and replace it with another one.
Hope this is helpful.
I think you can achieve what you're looking for by using an EntityCollection. First fetch the entities you want to update with the FetchEntityCollection method of DataAccessAdapter; then change anything you want in that collection, insert new entities into it, and save it with DataAccessAdapter's SaveEntityCollection method. This way existing entities are updated and new ones are inserted into the database. For example, in a product/order scenario in which you want to manipulate the orders of a specified product, you could use something like this:
int productId = 23;
var orders = new EntityCollection<OrderEntity>();

using (DataAccessAdapter daa = new DataAccessAdapter())
{
    daa.FetchEntityCollection(orders, new RelationPredicateBucket(OrderFields.ProductId == productId));

    foreach (var order in orders)
    {
        order.State = 1;
    }

    OrderEntity newOrder = new OrderEntity();
    newOrder.ProductId = productId;
    newOrder.State = 0;
    orders.Add(newOrder);

    daa.SaveEntityCollection(orders);
}
As far as I know, this is not possible, and arguably could not be.
If you were to just call adapter.Save(entity) on an entity that was not fetched, the framework would assume it was new. If you think about it, how could the framework know whether to emit an UPDATE or an INSERT statement? No matter what, something somewhere would have to query the database to see if the row exists.
It would not be too difficult to create something that did this more or less automatically for single entity (non-recursive) saves. The steps would be something like:
Create a new entity and set its fields.
Attempt to fetch an entity of the same type using the PK or a unique constraint (there are other options as well, but none as uniform)
If the fetch fails, just save the new entity (INSERT)
If the fetch succeeds, map the fields of the created entity to the fields of the fetched entity.
Save the fetched entity (UPDATE).
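The steps above can be sketched as follows, with a Dictionary standing in for the database table (keyed by primary key) so the control flow is visible without an actual LLBLGen project; all names here are illustrative:

```csharp
using System;
using System.Collections.Generic;

// In-memory sketch of the fetch-or-insert upsert steps above.
public static class UpsertSketch
{
    // Returns which statement a real implementation would emit.
    public static string Apply(IDictionary<int, string> table, int key, string newValue)
    {
        // Step 2: attempt to fetch an existing row by primary key.
        if (table.TryGetValue(key, out _))
        {
            // Steps 4-5: map the new fields onto the fetched row and save it (UPDATE).
            table[key] = newValue;
            return "UPDATE";
        }

        // Step 3: the fetch failed, so save the new entity (INSERT).
        table[key] = newValue;
        return "INSERT";
    }
}
```

Note that, exactly as the answer says, the decision still requires one fetch per key; the database round trip is hidden, not eliminated.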

How to use MS Sync Framework to filter client-specific data?

Let's say I've got a SQL 2008 database table with lots of records associated with two different customers, Customer A and Customer B.
I would like to build a fat client application that fetches all of the records that are specific to either Customer A or Customer B based on the credentials of the requesting user, then stores the fetched records in a temporary local table.
Thinking I might use the MS Sync Framework to accomplish this, I started reading about row filtering when I came across this little chestnut:
"Do not rely on filtering for security. The ability to filter data from the server based on a client or user ID is not a security feature. In other words, this approach cannot be used to prevent one client from reading data that belongs to another client. This type of filtering is useful only for partitioning data and reducing the amount of data that is brought down to the client database."
So, is this telling me that the MS Sync Framework is only a good option when you want to replicate an entire table between point A and point B?
Doesn't that seem to be an extremely limiting characteristic of the framework? Or am I just interpreting this statement incorrectly? Or is there some other way to use the framework to achieve my purposes?
Ideas anyone?
Thanks!
No, it is only a security warning.
We use filtering extensively in our semi-connected app.
Here is some code to get you started:
// helper
void PrepareFilter(string tablename, string filter)
{
    SyncAdapters.Remove(tablename);

    var ab = new SqlSyncAdapterBuilder(this.Connection as SqlConnection);
    ab.TableName = "dbo." + tablename;
    ab.ChangeTrackingType = ChangeTrackingType.SqlServerChangeTracking;
    ab.FilterClause = filter;

    var cpar = new SqlParameter("@filterid", SqlDbType.UniqueIdentifier);
    cpar.IsNullable = true;
    cpar.Value = DBNull.Value;
    ab.FilterParameters.Add(cpar);

    var nsa = ab.ToSyncAdapter();
    nsa.TableName = tablename;
    SyncAdapters.Add(nsa);
}
// usage
void SetupFooBar()
{
    var tablename = "FooBar";
    var filter = "FooId IN (SELECT BarId FROM dbo.GetAllFooBars(@filterid))";
    PrepareFilter(tablename, filter);
}

Hitting the 2100 parameter limit (SQL Server) when using Contains()

from f in CUSTOMERS
where depts.Contains(f.DEPT_ID)
select f.NAME
depts is a list (IEnumerable<int>) of department ids
This query works fine until you pass a large list (say, around 3000 dept ids); then I get this error:
The incoming tabular data stream (TDS) remote procedure call (RPC) protocol stream is incorrect. Too many parameters were provided in this RPC request. The maximum is 2100.
I changed my query to:
var dept_ids = string.Join(" ", depts.ToStringArray());
from f in CUSTOMERS
where dept_ids.IndexOf(Convert.ToString(f.DEPT_id)) != -1
select f.NAME
Using IndexOf() fixed the error but made the query slow. Is there any other way to solve this? Thanks so much.
My solution (Guids is a list of ids you would like to filter by):
List<MyTestEntity> result = new List<MyTestEntity>();
for (int i = 0; i < Math.Ceiling((double)Guids.Count / 2000); i++)
{
    var nextGuids = Guids.Skip(i * 2000).Take(2000);
    result.AddRange(db.Tests.Where(x => nextGuids.Contains(x.Id)));
}
this.DataContext = result;
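The same batching idea can be pulled out into a small generic helper so it works for any key type; the chunking itself is plain LINQ-to-objects (the 2000 batch size is just a value safely under SQL Server's 2100-parameter limit):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Split a key list into chunks small enough to stay under SQL Server's
// 2100-parameter limit, then run one Contains() query per chunk.
public static class BatchedContains
{
    public static IEnumerable<List<T>> Chunk<T>(IReadOnlyList<T> source, int size)
    {
        for (int i = 0; i < source.Count; i += size)
            yield return source.Skip(i).Take(size).ToList();
    }
}
```

Each chunk then becomes one query, e.g. `foreach (var batch in BatchedContains.Chunk(Guids, 2000)) result.AddRange(db.Tests.Where(x => batch.Contains(x.Id)));`.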
Why not write the query in SQL and attach your entity?
It's been a while since I worked in LINQ, but here goes:
IQuery q = Session.CreateQuery(@"
    select *
    from customerTable f
    where f.DEPT_id in (" + string.Join(",", depts.ToStringArray()) + ")");
q.AttachEntity(CUSTOMER);
Of course, you will need to protect against SQL injection, but that shouldn't be too hard.
You will want to check out the LINQKit project, since somewhere in there is a technique for batching up such statements to solve this issue. I believe the idea is to use the PredicateBuilder to break the local collection into smaller chunks, but I haven't reviewed the solution in detail because I've been looking for a more natural way to handle this.
Unfortunately, it appears from Microsoft's response to my suggestion to fix this behavior that there are no plans to address it for .NET Framework 4.0 or even subsequent service packs.
https://connect.microsoft.com/VisualStudio/feedback/ViewFeedback.aspx?FeedbackID=475984
UPDATE:
I've opened up some discussion regarding whether this was going to be fixed for LINQ to SQL or the ADO.NET Entity Framework on the MSDN forums. Please see these posts for more information regarding these topics and to see the temporary workaround that I've come up with using XML and a SQL UDF.
I had a similar problem, and I found two ways to fix it:
the Intersect method
a join on the IDs
To get values that are NOT in the list, I used the Except method or a left join.
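A LINQ-to-objects illustration of those two directions (with LINQ to SQL or EF, the join form translates to a server-side JOIN instead of an `IN (...)` parameter list; the int sequences here are stand-ins for real entities):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class JoinInsteadOfContains
{
    // "join on IDs": rows whose key appears in the id list.
    public static List<int> InList(IEnumerable<int> rows, IEnumerable<int> ids) =>
        rows.Join(ids.Distinct(), r => r, id => id, (r, _) => r).ToList();

    // "Except / left join": rows whose key does NOT appear in the id list.
    public static List<int> NotInList(IEnumerable<int> rows, IEnumerable<int> ids) =>
        rows.Except(ids).ToList();
}
```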
Update
EntityFramework 6.2 runs the following query successfully:
var employeeIDs = Enumerable.Range(3, 5000);
var orders =
from order in Orders
where employeeIDs.Contains((int)order.EmployeeID)
select order;
Your post was from a while ago, but perhaps someone will benefit from this. Entity Framework does a lot of query caching; every time you send in a different parameter count, that query gets added to the cache. Using a "Contains" call causes SQL to generate a clause like "WHERE x IN (@p1, @p2 ... @pn)", which bloats the EF cache.
Recently I looked for a new way to handle this, and I found that you can create an entire table of data as a parameter. Here's how to do it:
First, you'll need to create a custom table type, so run this in SQL Server (in my case I called the custom type "TableId"):
CREATE TYPE [dbo].[TableId] AS TABLE(
    Id [int] PRIMARY KEY
)
Then, in C#, you can create a DataTable and load it into a structured parameter that matches the type. You can add as many data rows as you want:
DataTable dt = new DataTable();
dt.Columns.Add("id", typeof(int));
This is an arbitrary list of IDs to search on. You can make the list as large as you want:
dt.Rows.Add(24262);
dt.Rows.Add(24267);
dt.Rows.Add(24264);
Create an SqlParameter using the custom table type and your data table:
SqlParameter tableParameter = new SqlParameter("@id", SqlDbType.Structured);
tableParameter.TypeName = "dbo.TableId";
tableParameter.Value = dt;
Then you can call a bit of SQL from your context that joins your existing table to the values from your table parameter. This will give you all records that match your ID list:
var items = context.Dailies.FromSqlRaw<Dailies>("SELECT * FROM dbo.Dailies d INNER JOIN @id id ON d.Daily_ID = id.id", tableParameter).AsNoTracking().ToList();
You could always partition your list of depts into smaller sets before you pass them as parameters to the IN statement generated by LINQ. See here:
Divide a large IEnumerable into smaller IEnumerable of a fix amount of item
