I have a simple multi-threaded application. All the threads only do put operations to the same database. Before a thread does a put, it acquires a mutex lock to increment the key number, releases the lock, and then performs the put; in other words, the threads may insert items at the same time, but always with different keys. That's what my application does.
What I am still confused about is whether this simple app needs the DB_INIT_LOCK flag or the DB_INIT_CDB flag. I have read the documentation for these flags. DB_INIT_CDB means multiple readers/single writer, but in my app the threads write concurrently, not through a single writer, so I do not need it. As for DB_INIT_LOCK, since the threads never insert items with the same key, I do not need it either. Am I right?
Please correct me if I am wrong. Many thanks.
You correctly state that DB_INIT_CDB gives you a multi-reader, single-writer environment. This puts Berkeley DB in a completely different mode of operation. But, since you've got more than one writer, you can't use it.
You'll need at least these two flags:
DB_INIT_LOCK: You're doing your own locking around your database key generation. But when you insert records into the database, Berkeley DB is going to touch some of the same pieces of memory. For example, the very first two records you insert will be right next to each other in the database. Unless they are large, they'll be on the same database "page" of memory. You need this flag to tell BDB to do its own locking.
It's the same as if you implemented your own in-memory binary tree that multiple threads were changing. You'd have to use some kind of locking to prevent the threads from completely destroying the tree with incompatible updates.
DB_THREAD: This flag lets BDB know that multiple threads will be using the same database environment.
You may find that you need to use transactions, or at the very least to allow BDB to use them internally. That's DB_INIT_TXN. And I've always needed DB_INIT_MPOOL and DB_PRIVATE to allow BDB to use malloc() to manage some of its own memory.
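For reference, here is a minimal sketch of opening an environment with these flags; the home directory path and error handling are illustrative, and you would add DB_INIT_TXN (together with DB_INIT_LOG) if you decide you want transactions:

#include <db.h>
#include <stdio.h>

int open_env(DB_ENV **envp)
{
    DB_ENV *env;
    int ret = db_env_create(&env, 0);
    if (ret != 0)
        return ret;

    /* One shared cache (mpool), BDB-internal locking, thread-safe handles,
       and region memory allocated on the heap rather than backed by files. */
    ret = env->open(env, "/path/to/env/home",
                    DB_CREATE | DB_INIT_MPOOL | DB_INIT_LOCK |
                    DB_THREAD | DB_PRIVATE,
                    0);
    if (ret != 0) {
        fprintf(stderr, "env->open: %s\n", db_strerror(ret));
        env->close(env, 0);
        return ret;
    }
    *envp = env;
    return 0;
}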
(Just as an aside, if you have a simple increment for your key, consider using an atomic increment operation instead of a mutex. If you're using C with gcc, the builtin __sync_fetch_and_add (or __sync_add_and_fetch) can do this for you. From C++, you can use std::atomic's post-increment.)
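For example, a minimal sketch of a lock-free key counter using that builtin (the counter and function names are illustrative):

/* Shared key counter; __sync_add_and_fetch makes the increment atomic,
   so no mutex is needed around key generation. */
static long next_key = 0;

long get_next_key(void)
{
    return __sync_add_and_fetch(&next_key, 1);  /* returns the new value */
}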
I am trying to figure out how to insert an item into a B+ tree using locks and don't really understand the theory behind it.
For searching, my understanding is that I lock the root node, decide which child node to descend into and lock it, at which point I can release the lock on the parent, and repeat this until I reach the leaf node.
But inserting is a lot more complicated, because I can't allow other threads to interfere with the insertion. My idea is to put a lock on each node along the path to the leaf, but holding that many locks is quite expensive. And then what happens when the leaf node splits because it is too full?
Does anyone know how to properly insert an item into a B+ tree using locks?
There are many different strategies for dealing with locking in B-Trees in general; most of these actually deal with B+Trees and their variations, since those have been dominating the field for decades. Summarising these strategies would be tantamount to summarising the progress of four decades; it's virtually impossible. Here are some highlights.
One strategy for minimising the amount of locking during initial descent is to lock not the whole path starting from the root, but only the sub-path beginning at the last 'stable' node (i.e. a node that won't split or merge as a result of the currently planned operation).
Another strategy is to assume that no split or merge will happen, which is true most of the time anyway. This means the descent can be done by locking only the current node and the child node one will descend into next, then release the lock on the previously 'current' node and so on. If it turns out that a split or merge is necessary after all then re-descend from the root under a heavier locking regime (i.e. path rooted at last stable node).
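As a rough illustration, here is a sketch of that lock-coupling descent in C with one read-write lock per node. The node layout, fanout and key comparison are all illustrative, split/merge handling is left out, and a real insert would take a write latch on the leaf (re-descending under heavier locking if a split turns out to be needed):

#include <pthread.h>
#include <stdbool.h>

#define FANOUT 64

struct bt_node {
    pthread_rwlock_t latch;
    bool is_leaf;
    int nkeys;
    int keys[FANOUT];
    struct bt_node *child[FANOUT + 1];
};

/* Returns the leaf that may contain 'key', holding only that leaf's latch. */
struct bt_node *descend(struct bt_node *root, int key)
{
    pthread_rwlock_rdlock(&root->latch);
    struct bt_node *cur = root;
    while (!cur->is_leaf) {
        int i = 0;
        while (i < cur->nkeys && key >= cur->keys[i])
            i++;                                  /* pick the child to follow */
        struct bt_node *child = cur->child[i];
        pthread_rwlock_rdlock(&child->latch);     /* lock the child first ...  */
        pthread_rwlock_unlock(&cur->latch);       /* ... then release the parent */
        cur = child;
    }
    return cur;   /* caller releases cur->latch when done */
}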
Another staple in the bag of tricks is to ensure that each node 'descended through' is stable by preventative splitting/merging; that is, when the current node would split or merge under a change bubbling up from below then it gets split/merged right away before continuing the descent. This can simplify operations (including locking) and it is somewhat popular in reinventions of the wheel - homework assignments and 'me too' implementations, rather than sophisticated production-grade systems.
Some strategies allow most normal operations to be performed without any locking at all but usually they require that the standard B+Tree structure be slightly modified; see B-link trees for example. This means that different concurrent threads operating on the tree can 'see' different physical views of this tree - depending on when they got where and followed which link - but they all see the same logical view.
Seminal papers and good overviews:
Efficient Locking for Concurrent Operations on B-Trees (Lehman/Yao 1981)
Concurrent Operations on B*-Trees with Overtaking (Sagiv 1986)
A survey of B-tree locking techniques (Graefe 2010)
B+Tree Locking (slides from Stanford U, including Blink trees)
A Blink Tree method and latch protocol for synchronous deletion in a high concurrency environment (Malbrain 2010)
A Lock-Free B+Tree (Braginsky/Petrank 2012)
The problem is like this:
I have an array of 500 pointers which point to 500 elements in a doubly linked list. There are 10 threads which run in parallel. Each thread runs a loop 50 times, and in each iteration tries to free some element of the list.
The list is sorted (it contains simple integers), and there are 10 other threads running in parallel, searching for the node that contains a particular integer and accessing the satellite data in that node. So a node looks like:
struct node
{
    int key;            // key used to search for this node
    int x, y, z;        // satellite data
    struct node *prev;
    struct node *next;
};
The problem is easily solvable if I just lock the list before search / delete. But that is too coarse grained. How do I synchronize these threads so that I can achieve better concurrency?
Edits:
This is not a homework question. I do not belong to academia.
The array holding 500 pointers may seem weird. I set it up like that to present my problem with the least possible complexity.
I can think of a couple of broad approaches which don't involve a global lock, and should allow some degree of forward progress:
1. mark but don't remove
When a deletion thread identifies its victim, mark it as deleted but leave it in place.
When a search thread encounters a node with this deleted mark, it just ignores it.
You'll need to issue a write/release barrier after marking the node deleted, and an acquire barrier before inspecting the mark: that means platform-specific, compiler-specific extensions; otherwise you're writing those barriers in assembler yourself.
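As one concrete possibility, here is a sketch using C11 atomics (one portable way to get that release/acquire pairing); the node follows the layout in the question, plus an added flag:

#include <stdatomic.h>

struct node {
    int key;
    int x, y, z;
    atomic_int deleted;          /* 0 = live, 1 = logically deleted */
    struct node *prev;
    struct node *next;
};

/* Deletion thread: mark the victim, publishing earlier writes first. */
void mark_deleted(struct node *n)
{
    atomic_store_explicit(&n->deleted, 1, memory_order_release);
}

/* Search thread: skip nodes whose mark is visible. */
int is_live(struct node *n)
{
    return atomic_load_explicit(&n->deleted, memory_order_acquire) == 0;
}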
2. genuine removal with a lockfree list
As per the paper in Peeyush's answer; similar platform- or compiler-specific requirements for CAS, and significant care is required. Options such as refcounts or hazard pointers can allow the node to be genuinely deleted once no-one is looking at it. You may find you need to replace your prev/next pointers by short indices you can pack into a single word for CAS to work: this means bounding the number of nodes and allocating them in an array.
Also note that although every thread should be able to make progress with this sort of scheme, individual operations (eg. traversing to the next node) may become much more expensive due to the synchronisation requirements.
You might consider a lock-free linked list using the compare-and-swap (CAS) operation.
link to paper
You need to lock any data that can change. If you will be doing a lot of work in parallel, create one lock per item in the list. A thread has to have the previous, the current, and the next item locked in order to remove the middle one. Make sure to always acquire locks in the same order to avoid deadlocks.
Other delete threads and the search threads will have to wait until the object is removed and the new links set up. Then the locks are released and they can continue.
I'm trying to implement a (special kind of) doubly-linked list in C, in a pthreads environment but using only C-wrapped synchronization instructions like atomic CAS, etc. rather than pthread primitives. (The elements of the list are fixed-size chunks of memory and almost surely cannot fit pthread_mutex_t etc. inside them.) I don't actually need full arbitrary doubly-linked list methods, only:
insertion at the end of the list
deletion from the beginning of the list
deletion at arbitrary points in the list based on a pointer to the member to be removed, which was obtained from a source other than by traversing the list.
So perhaps a better way to describe this data structure would be a queue/fifo with the possibility of removing items mid-queue.
Is there a standard approach to synchronizing this? I'm getting stuck on possible deadlock issues, some of which are probably inherent to the algorithms involved and others of which might stem from the fact that I'm trying to work in a confined space with other constraints on what I can do.
Edit: In particular, I'm stuck on what to do if adjacent objects are to be removed simultaneously. Presumably when removing an object, you need to obtain locks on both the previous and next objects in the list and update their next/prev pointers to point to one another. But if either neighbor is already locked, this would result in a deadlock. I've tried to work out a way that any/all of the removals taking place could walk the locked part of the list and determine the maximal sublist that's currently in the process of removal, then lock the nodes adjacent to that sublist so that the whole sublist gets removed as a whole, but my head is starting to hurt.. :-P
Conclusion(?): To follow up, I do have some code I want to get working, but I'm also interested in the theoretical problem. Everyone's answers have been quite helpful, and combined with details of the constraints outside what I expressed here (you really don't want to know where the pointer-to-element-to-be-removed came from and the synchronization involved there!) I've decided to abandon the local-lock code for now and focus on:
using a larger number of smaller lists which each have individual locks.
minimizing the number of instructions over which locks are held and poking at memory (in a safe way) prior to acquiring a lock to reduce the possibility of page faults and cache misses while a lock is held.
measuring the contention under artificially-high load and evaluating whether this approach is satisfactory.
Thanks again to everybody who gave answers. If my experiment doesn't go well I might come back to the approaches outlined (especially Vlad's) and try again.
Why not just apply a coarse-grained lock? Just lock the whole queue.
A more elaborate (though not necessarily more efficient; it depends on your usage pattern) solution would be to use a read-write lock for reading and writing, respectively.
Using lock-free operations seems to me not a very good idea for your case. Imagine that some thread is traversing your queue, and at the same moment the "current" item is deleted. No matter how many additional links your traversal algorithm holds, all of those items may be deleted, so your code would have no chance to finish the traversal.
Another issue with compare-and-swap is that with pointers you never know whether the pointer really still refers to the same old structure, or whether that structure has been freed and a new one allocated at the same address (the ABA problem). This may or may not be an issue for your algorithms.
For the case of "local" locking (i.e., the ability to lock each list item separately), an idea would be to make the locks ordered. Ordering the locks guarantees that deadlock is impossible. So your operations look like this:
Delete by the pointer p to the previous item:
lock p, check (using perhaps special flag in the item) that the item is still in the list
lock p->next, check that it's not zero and in the list; this way you ensure that the p->next->next won't be removed in the meantime
lock p->next->next
set a flag in p->next indicating that it's not in the list
(p->next->next->prev, p->next->prev) = (p, null); (p->next, p->next->next) = (p->next->next, null)
release the locks
Insert into the beginning:
lock head
set the flag in the new item indicating that it's in the list
lock the new item
lock head->next
(head->next->prev, new->prev) = (new, head); (new->next, head) = (head, new)
release the locks
This seems to be correct; I haven't tried this idea, however.
Essentially, this makes the double-linked list work as if it were a single-linked list.
If you don't have the pointer to the previous list element (which is of course usually the case, as it's virtually impossible to keep such a pointer in a consistent state), you can do the following:
Delete by the pointer c to the item to be deleted:
lock c, check if it is still a part of the list (this has to be a flag in the list item), if not, operation fails
obtain pointer p = c->prev
unlock c (now, c may be moved or deleted by another thread, and p may be moved or deleted from the list as well) [in order to avoid the deallocation of c, you need something like a shared pointer, or at least some kind of refcounting for list items, here]
lock p
check if p is a part of the list (it could be deleted after step 3); if not, unlock p and restart from the beginning
check if p->next equals c, if not, unlock p and restart from the beginning [here we can maybe optimize out the restart, not sure ATM]
lock p->next; here you can be sure that p->next==c and is not deleted, because the deletion of c would have required locking of p
lock p->next->next; now all the locks are taken, so we can proceed
set the flag that c is not a part of the list
perform the customary (p->next, c->next, c->prev, c->next->prev) = (c->next, null, null, p)
release all the locks
Note that just having a pointer to some list item cannot ensure that the item is not deallocated, so you'll need to have a kind of refcounting, so that the item is not destroyed at the very moment you try to lock it.
Note that in the last algorithm the number of retries is bounded. Indeed, new items cannot appear on the left of c (insertion is at the rightmost position). If our step 5 fails and thus we need a retry, this can be caused only by having p removed from the list in the meanwhile. Such a removal can occur not more than N-1 times, where N is the initial position of c in the list. Of course, this worst case is rather unlikely to happen.
Please don't take this answer harshly, but don't do this.
You will almost certainly wind up with bugs, and very hard bugs to find at that. Use the pthreads lock primitives. They are your friends, and have been written by people who deeply understand the memory model provided by your processor of choice. If you try to do the same thing with CAS and atomic increment and the like, you will almost certainly make some subtle mistake that you won't find until it's far too late.
Here's a little code example to help illustrate the point. What's wrong with this lock?
volatile int lockTaken = 0;
void EnterSpinLock() {
while (!__sync_bool_compare_and_swap(&lockTaken, 0, 1)) { /* wait */ }
}
void LeaveSpinLock() {
lockTaken = 0;
}
The answer is: there's no memory barrier when releasing the lock, meaning that some of the write operations executed inside the lock may not have become visible before the next thread gets into the lock. Yikes! (There are probably many more bugs too; for example, the function doesn't do the platform-appropriate yield inside the spin loop and so is hugely wasteful of CPU cycles. Etc.)
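For illustration only, the specific release-side gap described above could be closed with GCC's paired __sync_lock builtins; this is a sketch of that one point, not a recommendation to roll your own locks:

volatile int lockTaken = 0;

void EnterSpinLock() {
    /* __sync_lock_test_and_set is an acquire barrier on most targets. */
    while (__sync_lock_test_and_set(&lockTaken, 1)) { /* wait */ }
}

void LeaveSpinLock() {
    /* Writes 0 with release semantics, so writes made inside the lock
       become visible before the lock appears free to the next thread. */
    __sync_lock_release(&lockTaken);
}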
If you implement your double-linked list as a circular list with a sentinel node, then you only need to perform two pointer assignments in order to remove an item from the list, and four to add an item. I'm sure you can afford to hold a well-written exclusive lock over those pointer assignments.
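A minimal sketch of what that looks like, assuming a single mutex guards the list; the node and lock names are illustrative:

#include <pthread.h>

struct cnode { struct cnode *prev, *next; /* payload omitted */ };

static struct cnode sentinel = { &sentinel, &sentinel };
static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;

void list_remove(struct cnode *n)
{
    pthread_mutex_lock(&list_lock);
    n->prev->next = n->next;           /* assignment 1 */
    n->next->prev = n->prev;           /* assignment 2 */
    pthread_mutex_unlock(&list_lock);
}

void list_push_tail(struct cnode *n)   /* insert just before the sentinel */
{
    pthread_mutex_lock(&list_lock);
    n->prev = sentinel.prev;           /* assignment 1 */
    n->next = &sentinel;               /* assignment 2 */
    sentinel.prev->next = n;           /* assignment 3 */
    sentinel.prev = n;                 /* assignment 4 */
    pthread_mutex_unlock(&list_lock);
}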
Note that I am assuming that you are not one of the few people who deeply understand memory models only because there are very few of them in the world. If you are one of these people, the fact that even you can't figure it out ought to be an indication of how tricky it is. :)
I am also assuming that you're asking this question because you have some code you'd actually like to get working. If this is simply an academic exercise in order to learn more about threading (perhaps as a step on your way to becoming a deep low-level concurrency expert) then by all means, ignore me, and do your research on the details of the memory model of the platform you're targeting. :)
You can avoid deadlock if you maintain a strict hierarchy of locks: if you're locking multiple nodes, always lock the ones closer to the head of the list first. So, to delete an element, first lock the node's predecessor, then lock the node, then lock the node's successor, unlink the node, and then release the locks in reverse order.
This way, if multiple threads try to delete adjacent nodes simultaneously (say, nodes B and C in the chain A-B-C-D), then whichever thread first gets the lock to node B will be the one that will unlink first. Thread 1 will lock A, then B, then C, and thread 2 will lock B, then C, then D. There's only competition for B, and there's no way that thread 1 can hold a lock while waiting for a lock held by thread 2 and while thread 2 is waiting on the lock held by thread 1 (i.e. deadlock).
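A sketch of that ordering with one mutex per node; it assumes sentinel nodes at both ends (so prev/next are never NULL) and leaves out the flags or reference counts you would need if a neighbour can be freed while you wait, as other answers discuss:

#include <pthread.h>

struct lnode {
    int key;
    struct lnode *prev, *next;
    pthread_mutex_t lock;
};

void unlink_node(struct lnode *n)
{
    /* Head-to-tail order: predecessor, then the node, then the successor. */
    pthread_mutex_lock(&n->prev->lock);
    pthread_mutex_lock(&n->lock);
    pthread_mutex_lock(&n->next->lock);

    n->prev->next = n->next;           /* splice the node out */
    n->next->prev = n->prev;

    /* Release in reverse order. */
    pthread_mutex_unlock(&n->next->lock);
    pthread_mutex_unlock(&n->lock);
    pthread_mutex_unlock(&n->prev->lock);
}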
You cannot get away without a lock for the whole list. Here's why:
Insert into an Empty List
Threads A and B both want to insert an object.
Thread A examines the list, finds it empty
A context switch occurs.
Thread B examines the list, finds it empty and updates the head and tail to point to its object.
A context switch occurs
Thread A updates the head and tail to point to its object. Thread B's object has been lost.
Delete an item from the middle of the list
Thread A wants to delete node X. For this it first has to lock X's predecessor, X itself and X's successor since all of these nodes will be affected by the operation. To lock X's predecessor you must do something like
spin_lock(&(X->prev->lockFlag));
Although I've used function call syntax, if spin_lock is a function, you are dead in the water because that involves at least three operations before you actually have the lock:
place the address of the lock flag on the stack (or in a register)
call the function
do the atomic test and set
There are two places there where thread A can be swapped out and another thread can get in and remove X's predecessor without thread A knowing that X's predecessor has changed. So you have to implement the spin lock itself atomically, i.e. you have to add an offset to X to get X->prev, then dereference it to get *(X->prev), add an offset to that to get lockFlag, and then do the atomic test and set, all as one atomic unit. Otherwise there is always an opportunity for something to sneak in after you have committed to locking a particular node but before you have actually locked it.
I note that the only reason you need a doubly-linked list here is the requirement to delete, from the middle of the list, nodes that were obtained without walking the list. A simple FIFO can obviously be implemented with a singly-linked list (with both head and tail pointers).
You could avoid the deletion-from-the-middle case by introducing another layer of indirection - if the list nodes simply contain a next pointer and a payload pointer, with the actual data pointed to elsewhere (you say memory allocation is not possible at the point of insertion, so you'll just need to allocate the list node structure at the same point that you allocate the payload itself).
In the delete-from-the-middle case, you simply set the payload pointer to NULL and leave the orphaned node in the list. If the FIFO pop operation encounters such an empty node, it just frees it and tries again. This deferral lets you use a singly-linked list, and a lockless singly-linked list implementation is significantly easier to get right.
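A sketch of that indirection, using a single mutex for the queue ends for brevity (the question ultimately wants this lock-free; the structural point here is the singly linked nodes and the deferred removal). All names are illustrative:

#include <pthread.h>
#include <stdlib.h>

struct qnode {
    struct qnode *next;
    void *payload;                   /* NULL once logically deleted */
};

static struct qnode *head, *tail;
static pthread_mutex_t qlock = PTHREAD_MUTEX_INITIALIZER;

void q_push(void *payload)           /* insert at the end */
{
    struct qnode *n = malloc(sizeof *n);
    n->next = NULL;
    n->payload = payload;
    pthread_mutex_lock(&qlock);
    if (tail) tail->next = n; else head = n;
    tail = n;
    pthread_mutex_unlock(&qlock);
}

void q_cancel(struct qnode *n)       /* "delete" from the middle: no links change */
{
    pthread_mutex_lock(&qlock);
    n->payload = NULL;
    pthread_mutex_unlock(&qlock);
}

void *q_pop(void)                    /* pop from the front, discarding cancelled nodes */
{
    void *p = NULL;
    pthread_mutex_lock(&qlock);
    while (head && p == NULL) {
        struct qnode *n = head;
        head = n->next;
        if (head == NULL)
            tail = NULL;
        p = n->payload;              /* NULL if this node was cancelled */
        free(n);
    }
    pthread_mutex_unlock(&qlock);
    return p;
}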
Of course, there is still an essential race here around the removal of a node in the middle of the queue - nothing appears to stop that node coming to the front of the queue and being removed by another thread before the thread that has decided it wants to remove it actually gets a chance to do so. This race appears to be outside the scope of the details provided in your question.
Two ideas.
First, to avoid the deadlock problem I would do some sort of spinlock:
lock the item that is to be deleted
try to lock one of the neighbors; if you have cheap random bits available, choose the side randomly
if this doesn't succeed abandon your first lock
and loop
try to lock the other one
if this succeeds delete your item
else abandon both locks
and loop
Since splicing an element out of a list is not a lengthy operation, this shouldn't cost you much performance overhead. And even if you really do have a rush to delete all elements at the same time, it should still give you some good parallelism.
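A sketch of that loop with one pthread mutex per node; it assumes sentinels at both ends (so prev/next are never NULL) and that each node is deleted by exactly one thread:

#include <pthread.h>
#include <stdlib.h>
#include <sched.h>

struct dnode {
    struct dnode *prev, *next;
    pthread_mutex_t lock;
};

void delete_node(struct dnode *n)
{
    for (;;) {
        pthread_mutex_lock(&n->lock);
        struct dnode *a = n->prev, *b = n->next;
        if (rand() & 1) { struct dnode *t = a; a = b; b = t; }  /* random side first */

        if (pthread_mutex_trylock(&a->lock) == 0) {
            if (pthread_mutex_trylock(&b->lock) == 0) {
                n->prev->next = n->next;          /* splice the node out */
                n->next->prev = n->prev;
                pthread_mutex_unlock(&b->lock);
                pthread_mutex_unlock(&a->lock);
                pthread_mutex_unlock(&n->lock);
                return;
            }
            pthread_mutex_unlock(&a->lock);       /* second neighbour was busy */
        }
        pthread_mutex_unlock(&n->lock);           /* back off and retry */
        sched_yield();
    }
}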
The second idea is lazy deletion. Mark the elements that are to be deleted, and only remove them physically when they reach the end of the list. Since you are only interested in the head and the tail, the actual users of the list items can do this. The advantage is that, because elements are at the end when they are removed, the deadlock problem disappears. The disadvantage is that this makes the final removal a sequential operation.
I have been searching concurrent linked list implementations/academic papers that allow for concurrent insertions to disjoint places in the list. I would prefer a lock based approach.
Unfortunately, all the implementations I've checked out so far use list based locking as opposed to something akin to node based locking.
Any help people?
EDIT 1: Thanks all for the initial responses. Using node-based locking means that for inserting after a node, or deleting a node, I need to lock the previous and the next node. Now it is entirely possible that by the time Thread 1 tries to lock the previous node, it has already been deleted by Thread 2. How do I guard against such accidents?
I'm not able to recommend any libraries that do this for C specifically, but if you end up doing it yourself you could avoid having to keep thousands of locks by re-using a small number of locks and some "hashing" to decide which one to use for each node. You'd get quite a number of cases where there wouldn't be any contention, provided the number of locks is suitably larger than the number of threads, for little space overhead (and it's fixed, not per node).
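A sketch of that hashing scheme (the pool size and the hash are illustrative); note that if one operation needs two striped locks at once, it should compare the indices, lock in a fixed order, and skip the second lock when both nodes hash to the same stripe:

#include <pthread.h>
#include <stdint.h>

#define NLOCKS 64                    /* fixed overhead, independent of list length */

static pthread_mutex_t stripe[NLOCKS];

void stripes_init(void)
{
    for (int i = 0; i < NLOCKS; i++)
        pthread_mutex_init(&stripe[i], NULL);
}

/* Map a node's address to one of the locks in the pool. */
static pthread_mutex_t *lock_for(const void *node)
{
    uintptr_t h = (uintptr_t)node;
    h ^= h >> 7;                     /* cheap mixing of the address bits */
    return &stripe[h % NLOCKS];
}

/* Usage: pthread_mutex_lock(lock_for(n)); ... touch n ...; pthread_mutex_unlock(lock_for(n)); */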
Update, for EDIT 1
You could work around this by having a per-list multiple-reader, single-writer lock (rwlock), where you acquire the "read" lock prior to taking the per-node locks for an insert, but take the single "write" lock for a delete. You avoid unnecessary synchronisation issues for the read/insert operations fairly easily, and deleting is simple enough. (The assumption, though, is that deletes are much rarer than inserts.)
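A sketch of that arrangement; the node type and helper names are illustrative, and sentinel nodes are assumed so prev/next are never NULL:

#include <pthread.h>

struct lnode {
    struct lnode *prev, *next;
    pthread_mutex_t lock;                 /* per-node lock, used by inserts */
};

static pthread_rwlock_t list_guard = PTHREAD_RWLOCK_INITIALIZER;

void insert_after(struct lnode *pos, struct lnode *n)
{
    pthread_rwlock_rdlock(&list_guard);   /* many disjoint inserts may run at once */
    pthread_mutex_lock(&pos->lock);
    struct lnode *succ = pos->next;
    pthread_mutex_lock(&succ->lock);

    n->prev = pos;                        /* splice n in between pos and succ */
    n->next = succ;
    succ->prev = n;
    pos->next = n;

    pthread_mutex_unlock(&succ->lock);
    pthread_mutex_unlock(&pos->lock);
    pthread_rwlock_unlock(&list_guard);
}

void delete_node(struct lnode *n)
{
    pthread_rwlock_wrlock(&list_guard);   /* excludes every insert and other deletes */
    n->prev->next = n->next;
    n->next->prev = n->prev;
    pthread_rwlock_unlock(&list_guard);
}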
You may want to look at using a lock-free implementation. The idea is to use an atomic test-and-set operation when inserting/deleting a node.
Unfortunately, there are not many widely known implementations. You may have to roll your own. Here is the gcc documentation about atomic operation support:
http://gcc.gnu.org/onlinedocs/gcc-4.1.2/gcc/Atomic-Builtins.html
The trouble with node-based locking is that you normally have to lock two nodes for each insertion. This can be more expensive in some situations.
Worse, you get dining-philosophers-style deadlock possibilities that you have to deal with.
So list-based locking is easier, and that's why you see more about it.
If the performance characteristics of list-based locking are not favorable to your application, consider switching to a different data structure than a singly linked list.