I have a table that consists of a column of pre-populated numbers. My API, using NHibernate, grabs the first 10 rows where the 'Used' flag is false.
What would be the best way to avoid concurrency issues when multiple sessions try to grab rows from the table?
After selecting the rows, I can update the flag column to true so subsequent calls will not use the same numbers.
Given such a general context, it could be done this way:
// RepeatableRead ensures the rows read do not get concurrently updated by
// another session.
using (var tran = session.BeginTransaction(IsolationLevel.RepeatableRead))
{
    var entities = session.Query<Entity>()
        .Where(e => !e.Used)
        .OrderBy(e => e.Id)
        .Take(10)
        .ToList();

    foreach (var entity in entities)
    {
        entity.Used = true;
    }

    // If your session flush mode is not the default one and does not cause
    // commits to flush the session, add a session.Flush(); call before committing.
    tran.Commit();
    return entities;
}
It is simple, but it may fail with a deadlock, in which case you would have to throw away the session, get a new one, and retry.
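For instance, a minimal retry wrapper could look like this (a sketch: openSession stands for however you obtain a fresh session, and ObtainEntities for the transaction shown above):
// Hypothetical wrapper; openSession yields a fresh ISession and
// ObtainEntities runs the RepeatableRead transaction shown above.
public static IList<Entity> ObtainWithRetries(Func<ISession> openSession, int maxRetries = 3)
{
    for (var attempt = 1; ; attempt++)
    {
        using (var session = openSession())
        {
            try
            {
                return ObtainEntities(session);
            }
            catch (HibernateException) when (attempt < maxRetries)
            {
                // Deadlock victim: let the using block dispose the broken
                // session, then loop to retry with a new one.
            }
        }
    }
}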
Using an optimistic update pattern could be an alternative solution, but it too requires some code for recovering from failed attempts.
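A sketch of that variant, assuming the entity is mapped with a version property: conflicting updates then surface as StaleObjectStateException when the session flushes.
// Optimistic sketch: relies on a mapped <version> column instead of locks.
using (var tran = session.BeginTransaction())
{
    var entities = session.Query<Entity>()
        .Where(e => !e.Used)
        .OrderBy(e => e.Id)
        .Take(10)
        .ToList();

    foreach (var entity in entities)
    {
        entity.Used = true;
    }

    try
    {
        tran.Commit();
        return entities;
    }
    catch (StaleObjectStateException)
    {
        // Another session updated one of the rows first: discard the
        // session and retry, as with the deadlock case above.
        throw;
    }
}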
A solution without explicit locks, which avoids the deadlock risk, could do it too, but it requires more queries:
const int entitiesToObtain = 10;
// Could initialize here with null instead, but then you would have to check
// for null after the while loop too.
var obtainedEntities = new List<Entity>();
while (obtainedEntities.Count == 0)
{
    // Assuming an int Id; adjust the element type to your Id type.
    int[] candidatesIds;
    using (var tran = session.BeginTransaction())
    {
        candidatesIds = session.Query<Entity>()
            .Where(e => !e.Used)
            .Select(e => e.Id)
            .OrderBy(id => id)
            .Take(entitiesToObtain)
            .ToArray();
        tran.Commit();
    }

    if (candidatesIds.Length == 0)
        // No available entities.
        break;

    using (var tran = session.BeginTransaction())
    {
        var updatedCount = session.CreateQuery(
            @"update Entity e set e.Used = true
              where e.Used = false
                and e.Id in (:ids)")
            .SetParameterList("ids", candidatesIds)
            .ExecuteUpdate();
        if (updatedCount == candidatesIds.Length)
        {
            // All good, get them.
            obtainedEntities = session.Query<Entity>()
                .Where(e => candidatesIds.Contains(e.Id))
                .ToList();
            tran.Commit();
        }
        else
        {
            // Some or all of them were no longer available, and there is
            // no reliable way to know which ones, so just try again.
            tran.Rollback();
        }
    }
}
This uses NHibernate DML-style operations as suggested here. A strongly typed alternative is available in NHibernate v5.0.
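With NHibernate 5.0, the same update can be expressed through the strongly typed LINQ DML syntax; a sketch, reusing candidatesIds from above:
// NHibernate 5.0+ sketch: still issues a single update statement,
// like the HQL above.
var updatedCount = session.Query<Entity>()
    .Where(e => !e.Used && candidatesIds.Contains(e.Id))
    .Update(e => new Entity { Used = true });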
I sync data from an API and detect whether an insert or update is necessary.
From time to time I receive DbUpdateExceptions, and then fall back to single insert/update + SaveChanges instead of AddRange/UpdateRange + SaveChanges.
Because single entities are so slow, I wanted to remove only the failing entity from change tracking and try to save everything again, but unfortunately MSSQL returns all entities in DbUpdateException.Entries instead of only the one that is failing.
IntelliSense tells me:
Gets the entries that were involved in the error. Typically this is a single entry, but in some cases it may be zero or multiple entries.
Interestingly, this is true if I try it on a MySQL server: there, only one entity is returned. But MSSQL returns all of them, which makes it impossible for me to exclude only the failing one.
Is there any setting to change the MSSQL behaviour?
Both the MySQL and MSSQL servers are Azure-hosted resources.
Here is an example:
var addList = new List<MyEntity>();
var updateList = new List<MyEntity>();

//load existing data from db
var existingData = Context.Set<MyEntity>()
    .AsNoTracking()
    .Take(2).ToList();
if (existingData.Count < 2)
    return;

//addList
addList.Add(new MyEntity
{
    NotNullableProperty = "Value",
    RequiredField1 = Guid.Empty,
    RequiredField2 = Guid.Empty,
});
addList.Add(new MyEntity
{
    NotNullableProperty = "Value",
    RequiredField1 = Guid.Empty,
    RequiredField2 = Guid.Empty,
});
addList.Add(existingData.ElementAt(0)); //this should fail due to duplicate key
addList.Add(new MyEntity
{
    NotNullableProperty = "Value",
    RequiredField1 = Guid.Empty,
    RequiredField2 = Guid.Empty,
});

//updateList
existingData.ElementAt(1).NotNullableProperty = null; //this should fail due to invalid value
updateList.Add(existingData.ElementAt(1));

//save a new entity that should fail
var newKb = new MyEntity
{
    NotNullableProperty = "Value",
    RequiredField1 = Guid.Empty,
    RequiredField2 = Guid.Empty,
};
Context.Add(newKb);
Context.SaveChanges();
newKb.NotNullableProperty = "01234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890"; //this should fail due to length
updateList.Add(newKb);

try
{
    if (addList.IsNotNullOrEmpty())
        Context.Set<MyEntity>().AddRange(addList);
    if (updateList.IsNotNullOrEmpty())
        Context.Set<MyEntity>().UpdateRange(updateList);
    Context.SaveChanges();
}
catch (DbUpdateException updateException)
{
    //updateException.Entries contains all entries that were added/updated,
    //although only three should fail
}
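For illustration, the recovery I am after looks roughly like this (a sketch; the helper name is mine, and it only works when Entries pinpoints the failing entities, as MySQL does here):
// Hypothetical helper: detach the entries reported as failing, then retry.
void SaveChangesSkippingFailures(DbContext context)
{
    try
    {
        context.SaveChanges();
    }
    catch (DbUpdateException updateException)
    {
        // Only viable when Entries contains just the offending entities.
        foreach (var entry in updateException.Entries)
            entry.State = EntityState.Detached;
        context.SaveChanges(); // retry without the failing entities
    }
}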
I have this function, and it works perfectly:
public DemandeConge Creat(DemandeConge DemandeConge)
{
    try
    {
        var _db = Context;
        int numero = 0;

        //??CompanyStatique
        var session = _httpContextAccessor.HttpContext.User.Claims.ToList();
        int currentCompanyId = int.Parse(session[2].Value);

        numero = _db.DemandeConge.AsEnumerable()
            .Where(t => t.companyID == currentCompanyId)
            .Select(p => Convert.ToInt32(p.NumeroDemande))
            .DefaultIfEmpty(0)
            .Max();
        numero++;
        DemandeConge.NumeroDemande = numero.ToString();

        //_db.Entry(DemandeConge).State = EntityState.Added;
        _db.DemandeConge.Add(DemandeConge);
        _db.SaveChanges();
        return DemandeConge;
    }
    catch (Exception e)
    {
        return null;
    }
}
But when I try to insert another leave demand directly after inserting one (without waiting or refreshing the page), an error appears saying that this new demand's ID already exists.
I think I need to add a refresh after saving changes?
Any help is appreciated, and thanks.
Code like this:
numero = _db.DemandeConge.AsEnumerable()
    .Where(t => t.companyID == currentCompanyId)
    .Select(p => Convert.ToInt32(p.NumeroDemande))
    .DefaultIfEmpty(0)
    .Max();
numero++;
is a very poor pattern. You should leave the generation of your "numero" (ID) up to the database via an identity column. Set this up in your DB (if DB-first) and set up your mapping for this column as DatabaseGenerated.Identity.
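For example, with data annotations (a sketch; it assumes NumeroDemande is changed to an int and left to the database):
public class DemandeConge
{
    [Key]
    [DatabaseGenerated(DatabaseGeneratedOption.Identity)]
    public int NumeroDemande { get; set; } // assigned by the database on insert

    // ... other mapped properties ...
}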
However, your code raises lots of questions. Why is it a string instead of an int? This will be a bugbear for using an identity column.
The reason you want to avoid code like this is that each request has to query the database to get the "max" ID. As soon as two requests run relatively simultaneously, both will see that the max ID is "100" before either can reserve and insert 101, so both try to insert 101. With identity columns, the database receives the two inserts and hands out IDs first-come-first-served. EF can manage associating FKs around these new IDs automatically for you when you set up navigation properties for the relations (rather than trying to set FKs manually, which is the typical culprit when developers try to fetch a new ID app-side).
If you're stuck with an existing schema where the PK is a combination of the company ID and this Numero column as a string, then about all you can do is implement a retry strategy to account for duplicates:
const int MAXRETRIES = 5;
var session = _httpContextAccessor.HttpContext.User.Claims.ToList();
int currentCompanyId = int.Parse(session[2].Value);
int insertAttemptCount = 0;
while (insertAttemptCount < MAXRETRIES)
{
    try
    {
        int numero = Context.DemandeConge
            .Where(t => t.companyID == currentCompanyId)
            .Select(p => Convert.ToInt32(p.NumeroDemande))
            .DefaultIfEmpty(0)
            .Max() + 1;
        DemandeConge.NumeroDemande = numero.ToString();
        Context.DemandeConge.Add(DemandeConge);
        Context.SaveChanges();
        break;
    }
    catch (DbUpdateException)
    {
        insertAttemptCount++;
        if (insertAttemptCount >= MAXRETRIES)
            throw; // Could not insert; throw and handle the exception rather than return null.
    }
}
return DemandeConge;
Even this won't be foolproof and can result in failures under load; plus, it is a lot of code to work around a poor DB design, so my first recommendation would be to fix the schema, because coding like this is brittle and prone to errors.
I am using Node to copy 2 million rows from SQL Server to another database, so of course I use the "streaming" option, like this:
const sql = require('mssql')
...
const request = new sql.Request()
request.stream = true
request.query('select * from verylargetable')
request.on('row', row => {
    promise = write_to_other_database(row);
})
My problem is that I have to do an asynchronous operation for each row (an insert into another database), which takes time.
Reading is faster than writing, so the "row" events just keep coming, memory eventually fills up with pending promises, and Node eventually crashes. This is frustrating; the whole point of "streaming" is to avoid this, isn't it?
How can I solve this problem?
To stream millions of rows without crashing, intermittently pause your request.
sql.connect(config, err => {
    if (err) console.log(err);

    const request = new sql.Request();
    request.stream = true; // You can set streaming differently for each request
    request.query('select * from dbo.YourAmazingTable');
    // or: request.execute(procedure);

    request.on('recordset', columns => {
        // Emitted once for each recordset in a query
        //console.log(columns);
    });

    let rowsToProcess = [];
    request.on('row', row => {
        // Emitted for each row in a recordset
        rowsToProcess.push(row);
        if (rowsToProcess.length >= 3) { // small demo threshold; batch more in practice
            request.pause();
            processRows();
        }
        console.log(row);
    });

    request.on('error', err => {
        // May be emitted multiple times
        console.log(err);
    });

    request.on('done', result => {
        // Always emitted as the last one
        processRows();
        //console.log(result);
    });

    const processRows = () => {
        // process rows
        rowsToProcess = [];
        request.resume();
    };
});
The problem seems to be caused by reading the stream using "row" events, which don't let you control the flow of the stream. This should be possible with the "pipe" method, but then you end up implementing a writable stream yourself, which may be tricky.
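For completeness, a sketch of that pipe-based approach (assuming write_to_other_database returns a promise):
const { Writable } = require('stream')

const writer = new Writable({
    objectMode: true,
    write(row, _encoding, callback) {
        // The source stays paused until callback fires: built-in backpressure.
        write_to_other_database(row)
            .then(() => callback())
            .catch(callback)
    }
})

request.pipe(writer)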
A simple solution would be to use Scramjet, so your code would be complete in a couple of lines:
const sql = require('mssql')
const { DataStream } = require("scramjet");
//...

const request = new sql.Request()
request.stream = true
request.query('select * from verylargetable')

request.pipe(new DataStream({ maxParallel: 1 }))
    // pipe to a new DataStream with no parallel processing
    .batch(64)
    // optionally batch the requests that someone mentioned
    .consume(async (row) => write_to_other_database(row));
    // flow control will be done automatically
Scramjet will use promises to control the flow. You can also try increasing the maxParallel option, but keep in mind that in that case the last line could start pushing rows simultaneously.
My own answer: instead of writing to the target database at the same time, I convert each row into an "insert" statement and push the statement to a message queue (RabbitMQ, a separate process). This is fast and can keep up with the rate of reading. Another Node process pulls from the queue (more slowly) and writes to the target database. Thus the big backlog of rows is handled by the message queue itself, which is good at that sort of thing.
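The publishing side, sketched with amqplib (the queue name and connection URL are placeholders, and the row is simply serialized here rather than turned into an insert statement):
const amqp = require('amqplib')

async function publishRows(request) {
    const conn = await amqp.connect('amqp://localhost')
    const channel = await conn.createChannel()
    await channel.assertQueue('inserts')
    request.on('row', row => {
        // Publishing is fast and buffered client-side, so it keeps up with
        // the reader; a separate consumer drains the queue at its own pace.
        channel.sendToQueue('inserts', Buffer.from(JSON.stringify(row)))
    })
}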
I have an interesting little problem. My controller is assigning values to the properties in my model using two tables. In one of the tables, I have some entries that I made a while ago and some that I've just added recently. The old entries are assigned values correctly, but the new entries get NULL, even though they're in the same table and were created in the same fashion.
Controller
[HttpPost]
[Authorize]
public ActionResult VerifyReservationInfo(RoomDataView model)
{
    string loginName = User.Identity.Name;
    UserManager UM = new UserManager();
    UserProfileView UPV = UM.GetUserProfile(UM.GetUserID(loginName));

    RoomAndReservationModel RoomResModel = new RoomAndReservationModel();
    List<RoomProfileView> RoomsSelectedList = new List<RoomProfileView>();
    GetSelectedRooms(model, RoomsSelectedList);
    RoomResModel.RoomResRmProfile = RoomsSelectedList;

    RoomResModel.GuestId = UPV.SYSUserID;
    RoomResModel.FirstName = UPV.FirstName;
    RoomResModel.LastName = UPV.LastName;
    RoomResModel.PhoneNumber = UPV.PhoneNumber;
    return View(RoomResModel);
}
GetUserProfile from the manager
public UserProfileView GetUserProfile(int userID)
{
    UserProfileView UPV = new UserProfileView();
    using (ResortDBEntities db = new ResortDBEntities())
    {
        var user = db.SYSUsers.Find(userID);
        if (user != null)
        {
            UPV.SYSUserID = user.SYSUserID;
            UPV.LoginName = user.LoginName;
            UPV.Password = user.PasswordEncryptedText;

            var SUP = db.SYSUserProfiles.Find(userID);
            if (SUP != null)
            {
                UPV.FirstName = SUP.FirstName;
                UPV.LastName = SUP.LastName;
                UPV.PhoneNumber = SUP.PhoneNumber;
                UPV.Gender = SUP.Gender;
            }

            var SUR = db.SYSUserRoles.Find(userID);
            if (SUR != null)
            {
                UPV.LOOKUPRoleID = SUR.LOOKUPRoleID;
                UPV.RoleName = SUR.LOOKUPRole.RoleName;
                UPV.IsRoleActive = SUR.IsActive;
            }
        }
    }
    return UPV;
}
The issue I see is that this database has a somewhat poor design, and that particular record fell into the trap of that poor design. Consider that you have two IDs on that table:
SYSUserProfileID
SYSUserID
That's usually an indication of a bad design (though I'm not sure you can change it); if you can, you should merge anything that uses SYSUserID to use SYSUserProfileID.
This is bad because that last row has two different IDs. When you use db.Find(someId), Entity Framework looks for the primary key (SYSUserProfileID in this case), which is 19 for that row. But by the sounds of it, you also need to find it by the SYSUserID, which is 28 for that row.
Personally, I'd ditch SYSUserID if at all possible. Otherwise, you need to correct the code so that it looks for the right ID column at the right times (this will be a massive PITA in the future), or correct that record so that SYSUserID and SYSUserProfileID match. Either of these should fix this problem, but changing that record may break other things.
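For instance, where a row must be located by SYSUserID rather than by its primary key, use a query instead of Find (a sketch against the code above, assuming both tables carry a SYSUserID column):
// Find() matches the primary key (SYSUserProfileID); when what you hold is
// a SYSUserID, query that column explicitly instead.
var SUP = db.SYSUserProfiles.FirstOrDefault(p => p.SYSUserID == userID);
var SUR = db.SYSUserRoles.FirstOrDefault(r => r.SYSUserID == userID);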
I'm trying to get programmatically what I can get manually from SSMS using Tasks > Generate Scripts.
The code below works fine, EXCEPT it doesn't generate any constraints. I don't get any ALTER TABLE [foo] ADD CONSTRAINT ... ON DELETE CASCADE, etc. I've tried a lot of combinations of the Dri options, and on different databases as well. I'm stumped.
Thanks for any insight!
Scripter scrp = new Scripter(srv)
{
    Options =
    {
        ScriptDrops = false,
        WithDependencies = false,
        Indexes = true,
        Triggers = false,
        Default = true,
        DriAll = true,
        //ScriptData = true,
        ScriptSchema = true,
    }
};

var urns = new List<Urn>();
foreach (Table tb in db.Tables)
{
    if (tb.IsSystemObject == false)
    {
        urns.Add(tb.Urn);
    }
}

var inserts = scrp.EnumScript(urns.ToArray());
File.WriteAllLines(path, inserts);
Well, I found a solution, which is to use the Script method of each object to produce the schema, and the EnumScript method (with ScriptSchema=false) to produce the inserts for the table content.
foreach (Table tb in db.Tables)
{
    if (tb.IsSystemObject == false)
    {
        foreach (var s in tb.Script(schemaOptions))
            strings.Add(s);
        if (scriptData)
        {
            foreach (var i in tb.EnumScript(insertOptions))
                strings.Add(i);
        }
    }
}
I confess this solution feels a bit hollow because I never found out why the original method didn't work. It's a Repair without a Diagnosis, but a repair nonetheless.
As to why I wrote this thing in the first place, my database is on a shared server and there isn't any way to get an automated backup that I could use offline or somewhere else. So this is my backup scheme.
The solution above follows the code example given by Microsoft here: Scripting. The problem with this approach is that the tables are scripted in no particular order, but they need to be in order of their dependencies for the constraints to be defined and for rows to be inserted: you can't reference a foreign key in a table that doesn't exist yet.
The best solution I have so far is to use DependencyWalker.DiscoverDependencies() to get a dependency tree, DependencyWalker.WalkDependencies() to get a linear list, and to iterate over that list, as follows:
var urns = new List<Urn>();
Scripter schemaScripter = new Scripter(srv) { Options = schemaOptions };
Scripter insertScripter = new Scripter(srv) { Options = insertOptions };
var dw = new DependencyWalker(srv);

foreach (Table t in db.Tables)
    if (t.IsSystemObject == false)
        urns.Add(t.Urn);

DependencyTree dTree = dw.DiscoverDependencies(urns.ToArray(), true);
DependencyCollection dColl = dw.WalkDependencies(dTree);

foreach (var d in dColl)
{
    foreach (var s in schemaScripter.Script(new Urn[] { d.Urn }))
        strings.Add(s);
    strings.Add("GO");
    if (scriptData)
    {
        int n = 0;
        foreach (var i in insertScripter.EnumScript(new Urn[] { d.Urn }))
        {
            strings.Add(i);
            if ((++n) % 100 == 0)
                strings.Add("GO");
        }
    }
}
...
File.WriteAllLines(path, strings);
Adding a "GO" every so often keeps the batch size small so SSMS doesn't run out of memory.
To complete the example, the database gets scripted thus:
foreach (var s in db.Script(new ScriptingOptions { ScriptSchema = true }))
    strings.Add(s);
strings.Add("GO");
strings.Add("use " + dbName);
strings.Add("GO");
Users, views, and stored procedures are scripted thus:
foreach (User u in db.Users)
{
    if (u.IsSystemObject == false)
    {
        foreach (var s in u.Script(new ScriptingOptions { ScriptSchema = true }))
            strings.Add(s);
    }
}
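Views and stored procedures can be handled the same way (a sketch, using SMO's Views and StoredProcedures collections):
foreach (View v in db.Views)
{
    if (v.IsSystemObject == false)
    {
        foreach (var s in v.Script(new ScriptingOptions { ScriptSchema = true }))
            strings.Add(s);
    }
}

foreach (StoredProcedure sp in db.StoredProcedures)
{
    if (sp.IsSystemObject == false)
    {
        foreach (var s in sp.Script(new ScriptingOptions { ScriptSchema = true }))
            strings.Add(s);
    }
}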
The file produced by this code can be used to recreate the database. I have it set up on an old laptop to pull a snapshot of my online database every hour. Poor man's log shipping / backups / mirroring.