Django: lock particular rows in a table

I have the following django method:
def setCurrentSong(request, player):
    try:
        newCurrentSong = ActivePlaylistEntry.objects.get(
            song__player_lib_song_id=request.POST['lib_id'],
            song__player=player,
            state=u'QE')
    except ObjectDoesNotExist:
        toReturn = HttpResponseNotFound()
        toReturn[MISSING_RESOURCE_HEADER] = 'song'
        return toReturn

    try:
        currentSong = ActivePlaylistEntry.objects.get(song__player=player, state=u'PL')
        currentSong.state = u'FN'
        currentSong.save()
    except ObjectDoesNotExist:
        pass
    except MultipleObjectsReturned:
        # This is bad. It means that this function isn't getting executed
        # atomically like we hoped it would be. I think we may actually
        # need a mutex to protect this critical section :(
        ActivePlaylistEntry.objects.filter(song__player=player, state=u'PL').update(state=u'FN')

    newCurrentSong.state = u'PL'
    newCurrentSong.save()
    PlaylistEntryTimePlayed(playlist_entry=newCurrentSong).save()
    return HttpResponse("Song changed")
Essentially, I want it to be so that, for a given player, there is only one ActivePlaylistEntry with a 'PL' (playing) state at any given time. However, I have actually experienced cases where, as a result of quickly calling this method twice in a row, I get two songs for the same player with a state of 'PL'. This is bad, as I have other application logic that relies on a player having only one playing song at any given time (plus, semantically, it doesn't make sense to be playing two different songs at the same time on the same player).

Is there a way for me to do this update atomically? Just running the method as a transaction with the commit_on_success decorator doesn't seem to work. Is there a way to lock the table for all songs belonging to a particular player? I was thinking of adding a lock column (boolean field) to my model and either spinning on it or pausing the thread for a few milliseconds and checking again, but these feel super hackish and dirty. I was also thinking about creating a stored procedure, but that's not really database independent.

Row-level locking with select_for_update() was added in Django 1.4:
with transaction.commit_manually():
    # select_for_update() locks the matching rows until the transaction commits
    aple = ActivePlaylistEntry.objects.select_for_update().get(...)
    aple.state = ...
    aple.save()
    transaction.commit()
(commit_manually() is the Django 1.4-era API; from 1.6 onward transaction.atomic() replaces it.)
But you should consider refactoring so that a separate table with a ForeignKey is used to indicate the "active" song.
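For example, a minimal sketch of that refactoring (the model and field names here are illustrative, not from the original code):
# Hypothetical refactoring: one row per player that points at its playing
# entry. The OneToOneField enforces "at most one current song per player"
# at the schema level, instead of relying on a state column.
from django.db import models

class CurrentSong(models.Model):
    player = models.OneToOneField('Player')
    entry = models.ForeignKey('ActivePlaylistEntry')

# Switching songs then becomes a single-row UPDATE, which the database
# applies atomically:
#   CurrentSong.objects.filter(player=player).update(entry=new_entry)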

Related

Swift threading issue in Array

In my project, I have a data provider which delivers data every 2 milliseconds. Following is the delegate method in which the data is received.
func measurementUpdated(_ measurement: Double) {
    measurements.append(measurement)
    guard measurements.count >= 300 else { return }
    ecgView.measurements = Array(measurements.suffix(300))
    DispatchQueue.main.async {
        self.ecgView.setNeedsDisplay()
    }
    guard measurements.count >= 50000 else { return }
    let olderMeasurementsPrefix = measurements.count - 50000
    measurements = Array(measurements.dropFirst(olderMeasurementsPrefix))
    print("Measurement Count : \(measurements.count)")
}
What I am trying to do is that when the array has more than 50000 elements, delete the older measurements at the first n indexes of the array, for which I am using the dropFirst method of Array.
But I am getting a crash with the following message:
Fatal error: Can't form Range with upperBound < lowerBound
I think the issue is due to threading: both appending and deletion might happen at the same time, since the delegate is firing at an interval of 2 milliseconds. Can you suggest an optimized way to resolve this issue?
So to really fix this, we need to first address two of your claims:
1) You said, in effect, that measurementUpdated() would be called on the main thread (you said both append and dropFirst would be called on the main thread). You also said several times that measurementUpdated() would be called every 2ms. You do not want to be calling a method every 2ms on the main thread. You'll pile up quite a lot of calls very quickly, and get many delays in their processing, as the main thread is going to have UI work to do, and that always eats up time.
So first rule: measurementUpdated() should always be called on another thread. Keep it on the same thread each time, though.
Second rule: the entire code path, from whatever collects the data to where measurementUpdated() is called, must also be on a non-main thread. It can be the same thread that calls measurementUpdated(), but doesn't have to be.
Third rule: you do not need your UI graph to update every 2ms. The human eye cannot perceive UI changes faster than about 150ms. Also, the device's main thread will get totally bogged down trying to re-render as frequently as every 2ms; I bet your graph UI can't even render a single pass in 2ms! So give your main thread a break by only updating the graph every, say, 150ms: measure the current time in ms and compare it against the last time you updated the graph from this routine.
Fourth rule: don't change any array (or any object) in two different threads without doing a mutex lock, as they'll sometimes collide (one thread will be trying to do an operation on it while another is too). An excellent article that covers all the current swift ways of doing mutex locks is Matt Gallagher's Mutexes and closure capture in Swift. It's a great read, and has both simple and advanced solutions and their tradeoffs.
One other suggestion: you're allocating or reallocating a few arrays every 2ms. It's unnecessary, and adds undue stress on the memory pools under the hood, I'd think. I suggest not doing append and dropFirst calls. Try rewriting such that you have a single array that holds 50,000 doubles and never changes size. Simply change values in the array, and keep 2 indexes so that you always know where the "start" and the "end" of the data set are within the array, i.e. pretend the next array element after the last is the first array element (pretend the array loops around to the front). Then you're not churning memory at all, and it'll operate much quicker too. You can surely find Array extensions people have written to make this trivial to use (a sketch of the idea follows below). Every 150ms you can copy the data into a second pre-allocated array in the correct order for your graph UI to consume, or just pass the two indexes to your graph UI if you own your graph UI and can adjust it to accommodate.
I don't have time right now to write a code example that covers all of this (maybe someone else does), but I'll try to revisit this tomorrow. It'd actually be a lot better for you if you made a renewed stab at it yourself, and then asked a new StackOverflow question if you get stuck.
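To make the ring-buffer suggestion above concrete, here is a minimal sketch of the idea (written in Python for brevity; the class and method names are illustrative, not from the answer):
# Fixed-capacity ring buffer: one preallocated array, two indexes,
# no per-sample allocation or reallocation.
class RingBuffer:
    def __init__(self, capacity=50000):
        self.buf = [0.0] * capacity
        self.capacity = capacity
        self.start = 0   # index of the oldest sample
        self.count = 0   # number of valid samples

    def append(self, value):
        end = (self.start + self.count) % self.capacity
        self.buf[end] = value
        if self.count < self.capacity:
            self.count += 1
        else:
            # Buffer full: overwrite the oldest sample and advance start.
            self.start = (self.start + 1) % self.capacity

    def latest(self, n):
        # Newest n samples in chronological order, e.g. for the graph view.
        n = min(n, self.count)
        first = (self.start + self.count - n) % self.capacity
        return [self.buf[(first + i) % self.capacity] for i in range(n)]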
Update: as @Smartcat correctly pointed out, this solution has the potential to cause memory issues if the main thread is not fast enough to consume the arrays at the same pace the worker thread produces them.
The problem seems to be caused by ecgView's measurements property: you are writing to it on the thread receiving the data, while the view tries to read from it on the main thread, and simultaneous accesses to the same data from multiple threads are (unfortunately) likely to generate race conditions.
In conclusion, you need to make sure that both reads and writes happen on the same thread. This can easily be achieved by moving the setter call within the async dispatch:
let ecgViewMeasurements = Array(measurements.suffix(300))
DispatchQueue.main.async {
    self.ecgView.measurements = ecgViewMeasurements
    self.ecgView.setNeedsDisplay()
}
According to what you say, I will assume the delegate is calling the measurementUpdated method from a concurrent thread.
If that's the case, and the problem is really related to threading, this should fix your problem:
// Use a single serial queue, stored as a property, so every call reuses it.
// (Creating DispatchQueue(label:) inside the method would make a brand-new
// queue on every call and would serialize nothing.)
private let measurementQueue = DispatchQueue(label: "MySerialQueue")

func measurementUpdated(_ measurement: Double) {
    measurementQueue.async {
        self.measurements.append(measurement)
        guard self.measurements.count >= 300 else { return }
        self.ecgView.measurements = Array(self.measurements.suffix(300))
        DispatchQueue.main.async {
            self.ecgView.setNeedsDisplay()
        }
        guard self.measurements.count >= 50000 else { return }
        let olderMeasurementsPrefix = self.measurements.count - 50000
        self.measurements = Array(self.measurements.dropFirst(olderMeasurementsPrefix))
        print("Measurement Count : \(self.measurements.count)")
    }
}
This puts the code on a serial queue, ensuring that only one of these blocks runs at a time.

Most efficient way to increment a value of everything in Firebase

Say I have a list of entries under an Estimates node, each with a priority field, and I want to increment the priority field by 1 for every item in the list of Estimates.
I can grab the estimates like this:
var estimates = firebase.child('Estimates');
After that how would I auto increment every Estimates priority by 1?
FOR THE FIRESTORE API ONLY, NOT THE FIREBASE REALTIME DATABASE
Thanks to the latest Firestore patch (March 13, 2019), you don't need to follow the other answers above.
Firestore's FieldValue class now hosts an increment method that atomically updates a numeric document field in the Firestore database. You can use this FieldValue sentinel with either the set (with merge enabled) or update methods of the DocumentReference object.
The usage is as follows (from the official docs, this is all there is):
DocumentReference washingtonRef = db.collection("cities").document("DC");
// Atomically increment the population of the city by 50.
washingtonRef.update("population", FieldValue.increment(50));
If you're wondering, it's available from version 18.2.0 of firestore. For your convenience, the Gradle dependency configuration is implementation 'com.google.firebase:firebase-firestore:18.2.0'
Note: Increment operations are useful for implementing counters, but keep in mind that you can update a single document only once per second. If you need to update your counter above this rate, see the Distributed counters page.
EDIT 1: FieldValue.increment() is purely "server" side (it happens in Firestore), so you don't need to expose the current value to the client(s).
EDIT 2: While using the admin APIs, you can use admin.firestore.FieldValue.increment(1) for the same functionality. Thanks to @Jabir Ishaq for voluntarily letting me know about the undocumented feature. :)
EDIT 3: If the target field which you want to increment/decrement is not a number or does not exist, the increment method sets the field to the given value! This is helpful when you are creating a document for the first time.
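Applied to the original question, a short sketch using the Python client library (the collection and field names come from the question; treat the rest as an assumption about your setup):
from google.cloud import firestore

db = firestore.Client()

# Increment is applied server-side, so each update is atomic and the
# client never needs to read the current priority first.
for doc in db.collection('Estimates').stream():
    doc.reference.update({'priority': firestore.Increment(1)})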
This is one way to loop over all items and increase their priority:
var estimatesRef = firebase.child('Estimates');
estimatesRef.once('value', function(estimatesSnapshot) {
  estimatesSnapshot.forEach(function(estimateSnapshot) {
    estimateSnapshot.ref().update({
      priority: estimateSnapshot.val().priority + 1
    });
  });
});
It loops over all children of Estimates and increases the priority of each.
You can also combine the calls into a single update() call:
var estimatesRef = firebase.child('Estimates');
estimatesRef.once('value', function(estimatesSnapshot) {
  var updates = {};
  estimatesSnapshot.forEach(function(estimateSnapshot) {
    updates[estimateSnapshot.key + '/priority'] = estimateSnapshot.val().priority + 1;
  });
  estimatesRef.update(updates);
});
The performance will be similar to the first solution (Firebase is very efficient when it comes to handling multiple requests). But in the second case a single command is sent to the server, so the whole update will either fail or succeed completely.

Storing the value with the Ref, as long as it's not in the datastore

I have a List<Ref<Entity>>. I add new entries to the list like this:
entities.add(Ref.create(new_entry));
modified.add(new_entry);
When I store the entity that contains the list, I store the list itself and all the entities in the modified list. This works fine.
The problem is that I have to work with the entities-list while I add new entries to it. This requires iterating the list multiple times. The trouble is that the refs in the list point to both old entries (which are already in the datastore) and new entries (which are not yet in the datastore).
This causes the Ref.get() method to return null for all the not-yet-stored entries in the list (the ones that are still in the modified-list).
I worked around this by doing this when inserting:
Ref<T> ref = new DeadRef<>(
        Key.create(data),
        data
);
this.entities.add(ref);
this.modified.add(data);
This way, I can mix stored and unstored entries in one list and Ref.get() always returns a value.
This works, but I have noticed that the refs in the entities-list stay DeadRefs when I store them to the datastore and load them in again.
Will this be a problem? Is there maybe even a better way to accomplish this?
This seems like a bad idea, although I don't know what specific problems you will run into.
The "right answer" is to save your entities first.
Edit: Also look at the documentation for ofy().defer().save(), which can prevent you from issuing a lot of unnecessary save operations.

app engine datastore: model for progressively updated terrain height map

Users submit rectangular axis-aligned regions associated with "terrain maps". At any time users can delete regions they have created.
class Region(db.Model):
    terrain_map = reference to TerrainMap
    top_left_x = integer
    top_left_y = integer
    bottom_right_x = integer
    bottom_right_y = integer
I want to maintain a "terrain height map", which progressively updates as new rectangular regions are added or deleted.
class TerrainMap(db.Model):
    terrain_height = blob
Here's a "picture" of what the map could look like with two overlapping regions: http://pastebin.com/4yzXSFC5
So I thought I could do this by adding a RegionUpdate model, created when a Region entity is either created or deleted, and also enqueuing a Task which would churn through a query for "RegionUpdate.applied == False".
class RegionUpdate(db.Model):
    terrain_map = reference to TerrainMap
    top_left_x = integer
    top_left_y = integer
    bottom_right_x = integer
    bottom_right_y = integer
    operation = string, either "increase" or "decrease"
    applied = False/True
The problem is that it seems all RegionUpdate and Region entities have to be in the same entity group as their TerrainMap: RegionUpdates must only be created when Region entities are created or deleted, so this must be done transactionally; and TerrainMap.terrain_height is updated in a Task, so this must be an idempotent operation, which I can only think of doing by transactionally grabbing a batch of RegionUpdate entities and then applying them to the TerrainMap.
That makes my entity group much larger than the "rule of thumb" size of about "a single user's worth of data or smaller".
Am I overlooking some better way to model this?
As I suggested in the Reddit question, I think that Brett's data pipelines talk has all the information you need to build this. The basic approach is this: every time you insert or update a Region, add a 'marker' entity in the same entity group and enqueue a task. Then, in that task, update the corresponding TerrainMap with the new data, and leave a Marker entity as a child of that, too, indicating that you've applied that update. See the talk for full details.
On the other hand, you haven't specified how many Regions you expect per TerrainMap, or how frequently they'll be updated. If the update rate to a single terrain map isn't huge, you could simply store all the regions as child entities of the TerrainMap to which they apply, and update the map synchronously or on the task queue in a single transaction, which is much simpler.
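A minimal sketch of that simpler, single-entity-group approach (the decode_heights/encode_heights helpers and the exact property types are assumptions, not from the question):
from google.appengine.ext import db

def add_region(terrain_map_key, tlx, tly, brx, bry):
    # The Region insert and the height-map update run in one transaction,
    # so they succeed or fail together. This requires Region to be a child
    # of its TerrainMap (same entity group).
    def txn():
        tm = db.get(terrain_map_key)
        heights = decode_heights(tm.terrain_height)  # app-specific blob codec (assumed)
        for x in range(tlx, brx):
            for y in range(tly, bry):
                heights[x][y] += 1
        tm.terrain_height = encode_heights(heights)  # assumed counterpart helper
        region = Region(parent=terrain_map_key,
                        terrain_map=terrain_map_key,
                        top_left_x=tlx, top_left_y=tly,
                        bottom_right_x=brx, bottom_right_y=bry)
        db.put([tm, region])
    db.run_in_transaction(txn)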

How do I avoid a memory leak with LINQ-To-SQL?

I have been having some issues with LINQ-To-SQL around memory usage. I'm using it in a Windows Service to do some processing, and I'm looping through a large amount of data that I'm pulling back from the context. Yes - I know I could do this with a stored procedure but there are reasons why that would be a less than ideal solution.
Anyway, what I see basically is that memory is not being released even after I call context.SubmitChanges(). So I end up having to do all sorts of weird things like only pulling back 100 records at a time, or creating several contexts and having them each do separate tasks. If I keep the same DataContext and use it later for other calls, it just eats up more and more memory. Even if I call Clear() on the "var tableRows" array that the query returns to me, set it to null, and call System.GC.Collect(), it still doesn't release the memory.
Now I've read a bit about how you should use DataContexts quickly and dispose of them quickly, but it seems like there ought to be a way to force the context to dump all its data (or all its tracking data for a particular table) at a certain point to guarantee the memory is free.
Anyone know what steps guarantee that the memory is released?
A DataContext tracks all the objects it has ever fetched. It won't release that memory until it is garbage collected. Also, as it implements IDisposable, you must call Dispose or use the using statement.
This is the right way to go:
using (DataContext myDC = new DataContext())
{
    // Do stuff
} // myDC is disposed here
If you don't need object tracking, set DataContext.ObjectTrackingEnabled to false. If you do need it, you can use reflection to call the internal DataContext.ClearCache(), although you have to be aware that since it's internal, it's subject to disappear in a future version of the framework. And as far as I can tell, the framework itself doesn't use it, but it does clear the object cache.
As Amy points out, you should dispose of the DataContext using a using block.
It seems that your primary concern is about creating and disposing a bunch of DataContext objects. This is how linq2sql is designed. The DataContext is meant to have short lifetime. Since you are pulling a lot of data from the database, it makes sense that there will be a lot of memory usage. You are on the right track, by processing your data in chunks.
Don't be afraid of creating a ton of DataContexts. They are designed to be used that way.
Thanks guys - I will check out the ClearCache method. Just for clarification (for future readers), the situation in which I was seeing the memory usage was something like this:
using (DataContext context = new DataContext())
{
    int skipAmount = 0;
    while (true)
    {
        var rows = context.tables.Where(x => x.Dept == "Dept")
                                 .Skip(skipAmount).Take(100).ToList();
        if (rows.Count == 0) break; // out of rows
        foreach (table t in rows)
        {
            // make changes to t
        }
        context.SubmitChanges();
        skipAmount += rows.Count;
        rows.Clear();
        rows = null;
        // At this point, even though the rows have been cleared and changes
        // have been submitted, the context is still holding onto a reference
        // somewhere to the removed rows. So unless you create a new context,
        // memory usage keeps on growing.
    }
}
I just ran into a similar problem. In my case, setting DataContext.ObjectTrackingEnabled to false helped.
But it only works when iterating through the rows, as follows:
using (var db = new DataContext())
{
    db.ObjectTrackingEnabled = false;
    var documents = from d in db.GetTable<T>()
                    select d;
    foreach (var doc in documents)
    {
        ...
    }
}
If, for example, you materialize the query with ToArray() or ToList(), it has no effect.
