Multi-threaded applications in Berkeley DB

I have a simple multi-threaded application. All the threads only do put operations to the same database. Before a thread performs a put, it first acquires a mutex lock to increment the key number, releases the lock, and then does the put; in other words, the threads may insert items with different key numbers at the same time. That's what I did in my application.
What I am still confused about is whether this simple app needs the DB_INIT_LOCK flag or the DB_INIT_CDB flag. I have read the documentation about these flags. DB_INIT_CDB means multiple readers/single writer; in my simple app the threads can write concurrently, not just a single writer, so I do not need it. For DB_INIT_LOCK, since the threads never insert items with the same key, I do not need it either, am I right?
Please correct me if I am wrong. Many thanks.

You correctly state that DB_INIT_CDB gives you a multi-reader, single-writer environment. This puts Berkeley DB in a completely different mode of operation. But, since you've got more than one writer, you can't use it.
You'll need at least these two flags:
DB_INIT_LOCK: You're doing your own locking around your database key generation. But when you insert records into the database, Berkeley DB is going to touch some of the same pieces of memory. For example, the very first two records you insert will be right next to each other in the database. Unless they are large, they'll be on the same database "page" of memory. You need this flag to tell BDB to do its own locking.
It's the same as if you implemented your own in-memory binary tree that multiple threads were changing. You'd have to use some kind of locking to prevent the threads from completely destroying the tree with incompatible updates.
DB_THREAD: This flag lets BDB know that multiple threads will be using the same database environment.
You may find that you need to use transactions, or at the very least to allow BDB to use them internally. That's DB_INIT_TXN. And I've always needed DB_INIT_MPOOL and DB_PRIVATE to allow BDB to use malloc() to manage some of its own memory.
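For reference, a minimal sketch of opening an environment and database with these flags might look like the following. Error handling is omitted, the paths are placeholders, and DB_INIT_LOG is added as an assumption because a transactional environment also needs a log; adjust the flag set to your actual needs.

    #include <db.h>

    /* Minimal sketch only: paths ("/tmp/myenv", "data.db") are placeholders. */
    static int open_env(DB_ENV **envp, DB **dbp)
    {
        db_env_create(envp, 0);
        (*envp)->open(*envp, "/tmp/myenv",
                      DB_CREATE | DB_THREAD | DB_PRIVATE |
                      DB_INIT_MPOOL | DB_INIT_LOCK | DB_INIT_LOG | DB_INIT_TXN,
                      0);

        db_create(dbp, *envp, 0);
        return (*dbp)->open(*dbp, NULL, "data.db", NULL, DB_BTREE,
                            DB_CREATE | DB_THREAD | DB_AUTO_COMMIT, 0);
    }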
(Just as an aside, if you have a simple increment for your key, consider using an atomic increment operation instead of a mutex. If you're using C with gcc, the builtin __sync_fetch_and_add (or __sync_add_and_fetch) can do this for you. From C++, you can use std::atomic's post-increment.)
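For example, a minimal sketch using the gcc builtin mentioned above (the counter name is illustrative):

    /* Shared key counter bumped without a mutex. */
    static volatile unsigned long next_key;

    unsigned long new_key(void)
    {
        /* Atomically adds 1 and returns the value *before* the add. */
        return __sync_fetch_and_add(&next_key, 1);
    }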

Related

Multithreaded access to data structure

I am writing an application in Linux using C, pthreads and sockets.
It will be a client-server application; the server will have N+2 threads, where N is the number of active clients: one thread for accepting new connections and creating threads for clients, and the last one for accepting user input.
I will be using a linked list to store some data relevant to my application; with every client there will be one associated node in the list. The client threads will update the information stored in their nodes at some interval - it could be one second, it could be two minutes, and it will change dynamically.
Now here is the problem: if the user requests it, the information stored in the linked list needs to be written to standard output. Of course, during writing I should hold a mutex. I am worried that one mutex for the whole list will hurt performance.
I was thinking about associating a mutex with every node, but that complicates removal of a specified node (first, I would need to make sure that the 'stdout writer' thread isn't traversing the list; I would also need to acquire the mutex of my node and of the previous one to change the pointer that points to the next node, and so on - either I would have to traverse all the way back to the previous node or make the list doubly linked).
So I am wondering whether the solution involving multiple mutexes is even worth it, with the much more complicated code, conditions, and all of this locking, waiting and unlocking.
You are right that having a per-node mutex will make code more complex. That's a tradeoff you will have to decide the value of. You can either have a single lock for the entire list, that might cause lock contention, but the code is largely not impacted by the presence of the lock and thus easier to write, or you can have more locks with considerably less opportunity for contention, leading to better performance, but the code is harder to write and get correct. You could even have something in the middle by having a lock per group of nodes - allocate a few nodes together and have a lock for that group - but then you'll have issues with tracking a free list and the potential for fragmentation.
You'll need to consider the relative frequency of add operations, delete operations, and full-list iterations, as well as others (reorganization, searching, whatever else your application will require). If add/delete are extremely frequent, but walking the list is once every third blue moon, the single lock could easily be appropriate. But if walking the list (whether for a full dump of the data, or to search or something else) is very common, the more granular approach becomes more attractive. You might even need to consider reader/writer locks instead of mutexes.
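As a rough sketch of the reader/writer variant with a single whole-list lock (the node type and names here are illustrative, not from the question): many dump or search operations can proceed in parallel under the read lock, while structural changes take the write lock.

    #include <pthread.h>
    #include <stdio.h>

    struct node { struct node *next; int client_id; /* ... per-client data ... */ };

    static struct node *head;
    static pthread_rwlock_t list_lock = PTHREAD_RWLOCK_INITIALIZER;

    /* Many dump/search operations can hold the read lock at the same time. */
    void dump_list(void)
    {
        pthread_rwlock_rdlock(&list_lock);
        for (struct node *n = head; n != NULL; n = n->next)
            printf("client %d\n", n->client_id);
        pthread_rwlock_unlock(&list_lock);
    }

    /* Structural changes (and node updates, in this simple variant) take the write lock. */
    void push_front(struct node *n)
    {
        pthread_rwlock_wrlock(&list_lock);
        n->next = head;
        head = n;
        pthread_rwlock_unlock(&list_lock);
    }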
You don't need to traverse the list all the way back: while you traverse it, you test whether the next element is the one you want to remove, and if so you lock both nodes - always in the same order throughout the code, so you avoid deadlock. Also, you can use the double-checking idiom and lock the node's mutex when you need to be sure of what it contains.
remove:
    for node in list
        if node->next is the desired node
            lock(node)
            lock(node->next)
            if node->next is the desired node
                do removing stuff
            else
                treat concurrent modification - retry, maybe?
            release(node->next)
            release(node)
With this idiom you don't need to lock the entire list while reading it, and it also catches a modification performed between the first test and the locking. I don't believe the code gets that much more complicated with an array of mutexes, and the locking overhead is nothing compared with the operations you may perform, such as I/O.
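In C with pthreads, that idiom might look roughly like the sketch below. The node type and the locking convention (every writer takes the predecessor's lock before touching its next pointer) are assumptions of this sketch, and safe reclamation of removed nodes is left out.

    #include <pthread.h>
    #include <stddef.h>

    struct node {
        struct node *next;
        pthread_mutex_t lock;
        int key;
    };

    /* Returns 1 if a node with 'key' was unlinked, 0 otherwise. */
    int remove_key(struct node *head, int key)
    {
        for (struct node *n = head; n != NULL; n = n->next) {
            if (n->next != NULL && n->next->key == key) {   /* unlocked test */
                pthread_mutex_lock(&n->lock);               /* predecessor first, always */
                struct node *next = n->next;                /* stable while n->lock is held */
                if (next != NULL && next->key == key) {     /* re-check under the lock */
                    pthread_mutex_lock(&next->lock);
                    n->next = next->next;                   /* unlink */
                    pthread_mutex_unlock(&next->lock);
                    pthread_mutex_unlock(&n->lock);
                    /* free(next) only once no other thread can still reach it */
                    return 1;
                }
                pthread_mutex_unlock(&n->lock);             /* concurrent change: keep scanning */
            }
        }
        return 0;
    }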
Unless you have tens or even hundreds of thousands of users, it won't take that long to read the list. You might want to create a local, intermediate copy so the original is not locked while writing the output, which might take some time. This also means you get a snapshot of the list at one point in time. If you lock individual nodes instead, you could remove element A, then remove element B, and yet have A appear in the displayed list while B does not.
As I understand it, if you do want to lock individual nodes, your list must be singly linked. Additions and removals get rather tricky. In Java, there are several system classes that do this using fast compare-and-swap techniques. There must be code like it in C, but I don't know where to look for it. And you will get those chronologically-challenged results.
If you are going to have N threads for N active clients, then think about the option of using pthread_setspecific and pthread_getspecific.
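A minimal sketch of that suggestion, where struct client_state and the handler body are hypothetical:

    #include <pthread.h>
    #include <stdlib.h>

    struct client_state { int fd; /* ... whatever each client thread needs ... */ };

    static pthread_key_t client_key;

    /* Call once before spawning client threads; free() runs when each thread exits. */
    void init_client_key(void)
    {
        pthread_key_create(&client_key, free);
    }

    void *client_thread(void *arg)
    {
        struct client_state *st = calloc(1, sizeof *st);
        st->fd = (int)(long)arg;                  /* illustrative: socket fd passed in */
        pthread_setspecific(client_key, st);

        /* ...anywhere deeper in this thread's call stack... */
        struct client_state *mine = pthread_getspecific(client_key);
        (void)mine;
        return NULL;
    }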

Poor performance of reads in pthreads hash table using read-write locks

I've put together a simple key value store which speaks a subset of the Redis protocol. It uses pthreads on Linux to share the hash table; I use pthreads rwlocks to manage access to this table. I've been testing the K-V store using the Redis benchmark tool.
With a single client, I can do about 2500 SET operations a second. However, it can only do about 25 GETs per second; I'd expect the other way around, so this surprises me. It scales to some extent, so if I throw 10 clients at it I'll get nearly 9000 SETs per second and around 250 GETs per second.
My GET code is pretty simple; I lock the table, find the appropriate hash table location, and check for a matching key in the linked-list there. For a GET, I use pthread_rwlock_rdlock and pthread_rwlock_unlock when I'm done. For SET, I use pthread_rwlock_wrlock and pthread_rwlock_unlock. SET is quite a bit more complex than GET.
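For reference, a GET along the lines just described might look roughly like this; the table layout and hash function are assumptions for illustration, not the actual code from the question.

    #include <pthread.h>
    #include <string.h>

    #define NBUCKETS 4096

    struct entry { struct entry *next; char *key; char *value; };

    struct table {
        pthread_rwlock_t lock;
        struct entry *buckets[NBUCKETS];
    };

    /* GET path: a shared (read) lock, so many GETs may run at once. */
    char *table_get(struct table *t, const char *key, unsigned (*hash)(const char *))
    {
        char *result = NULL;
        pthread_rwlock_rdlock(&t->lock);
        for (struct entry *e = t->buckets[hash(key) % NBUCKETS]; e != NULL; e = e->next) {
            if (strcmp(e->key, key) == 0) {
                result = e->value;
                break;
            }
        }
        pthread_rwlock_unlock(&t->lock);
        return result;
    }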
I also made the code work on Plan 9, using shared-memory processes and their own implementation of read/write locks. There, GETs are almost as fast as SETs, instead of 100x slower. This makes me think my hash table code is probably ok; I use exactly the same hash table code for both OSes, I simply use #defines to select the appropriate lock for each OS (the interface is the same in both cases, lucky!).
I'm not very experienced with pthreads. Can anyone help me figure out why my performance is sucking so badly?
(Note: this isn't meant to be a high-performing K-V store, it's meant to be a naively written test application/benchmark. It handles requests in about the simplest method possible, by spinning off a new thread for every client)
I don't know rwlocks, but my experience with condition variables in pthreads was that, with a realtime kernel, the waiting threads were woken up faster. You can even tune the program's priority with the chrt command.

Synchronizing Database Access in a Distributed App

A common bit of programming logic I find myself implementing often is something like the following pseudo-code:
Let X = some value
Let Database = some external Database handle
if !Database.contains(X):
    SomeCalculation()
    Database.insert(X)
However, in a multi-threaded program we have a race condition here. Thread A might check if X is in Database, find that it's not, and then proceed to call SomeCalculation(). Meanwhile, Thread B will also check if X is in Database, find that it's not, and insert a duplicate entry.
So of course, this needs to be synchronized like:
Let X = some value
Let Database = some external Database handle
LockMutex()
if !Database.contains(X):
    SomeCalculation()
    Database.insert(X)
UnlockMutex()
This is fine, except what if the application is a distributed app, running across multiple computers, all of which communicate with the same back-end database machine? In this case, a Mutex is useless, because it only synchronizes a single instance of the app with other local threads. To make this work, we'd need some kind of "global" distributed synchronization technique. (Assume that simply disallowing duplicates in Database is not a feasible strategy.)
In general, what are some practical solutions to this problem?
I realize this question is very generic, but I don't want to make this a language-specific question because this is an issue that comes up across multiple languages and multiple Database technologies.
I intentionally avoided specifying whether I'm talking about an RDBMS or SQL Database, versus something like a NoSQL Database, because again - I'm looking for generalized answers based on industry practices. For example, is this situation something that Atomic Stored Procedures might solve? Or Atomic Transactions? Or is this something that requires something like a "Distributed Mutex"? Or more generally, is this problem generally addressed by the Database system, or is it something the Application itself should handle?
If it turns out this question is impossible to answer at all without further information, please tell me so I can modify it.
One sure way to ensure against data stomping is to lock the data row. Many databases allow you to do that, via transactions. Some don't support transactions.
However, this is overkill for most cases, where contention is low in general. You might want to read up on Isolation levels to get more background on the topic.
A better general approach is often Optimistic Concurrency. The idea behind it is that each data row includes a signature, a timestamp works fine but the signature need not be time oriented. It could be a hash value, for example. This is a general concurrency management approach and is not limited to relational stores.
The app that changes data first reads the row, and then performs whatever calculations it requires, and then at some point, writes the updated data back to the data store. Via Optimistic concurrency, the app writes the update with the stipulation (expressed in SQL if it is a SQL database) that the data row must be updated only if the signature has not changed in the interim. And, each time a data row is updated, the signature must be updated as well.
The result is that updates don't get stomped on. But for a more rigorous explanation of the concurrency issues, refer to that article on DB Isolation levels.
All distributed updaters must follow the OCC convention (or something stronger, like transactional locking) in order for this to work.
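As a small sketch of the write side, assuming a hypothetical items(id, payload, version) table, the update can be phrased so it only succeeds when the row is unchanged; the caller checks the affected-row count and retries (re-read, recompute, re-write) when it is zero.

    /* Hypothetical schema: items(id, payload, version). Bind the new payload,
     * the row id, and the version value as it was originally read. */
    static const char *occ_update_sql =
        "UPDATE items "
        "   SET payload = ?, version = version + 1 "
        " WHERE id = ? AND version = ?";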
You can obviously move the "synch" part to the DB layer itself, using an exclusive lock on a specific resource.
This is a bit extreme; in most cases, attempting the insert and handling the exception when you discover that someone already inserted the row would be more adequate, I think.
Well, since you ask a general question, I will try to provide another option. It's not very orthodox, but it may be useful: you could "define" a machine or a process responsible for doing that. For example:
Let X = some value
Let Database = some external Database handle
xResponsible = Definer.defineResponsibleFor(X)
if xResponsible == me:
    if !Database.contains(X):
        SomeCalculation()
        Database.insert(X)
The trick here is to make defineResponsibleFor always return the same value independent of who is calling. So, if you have a fair distributed range of X and a fair Definer, all machines will have work to do. And you can use a simple thread mutex to avoid race conditions. Of course, now you have to take care of fault tolerance (if a machine or process goes out of business, your Definer must know and not assign any job to it). But you should do this anyway... :)
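A minimal sketch of such a definer, assuming every machine applies the same deterministic rule so that exactly one machine "owns" each key; num_machines and my_id are hypothetical values taken from your configuration.

    /* Any agreed-upon deterministic hash of X works; modulo is the simplest. */
    static int responsible_for(unsigned long x, int num_machines)
    {
        return (int)(x % (unsigned long)num_machines);
    }

    /* Only the owning machine performs the contains/insert pair for this key. */
    static int i_should_handle(unsigned long x, int my_id, int num_machines)
    {
        return responsible_for(x, num_machines) == my_id;
    }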

List insertion, disjoint n parallel?

I have been searching for concurrent linked list implementations/academic papers that allow concurrent insertions at disjoint places in the list. I would prefer a lock-based approach.
Unfortunately, all the implementations I've checked out so far use list based locking as opposed to something akin to node based locking.
Any help people?
EDIT 1: Thanks all for the initial responses. Using node-based locking means that, for insertion after a node or deletion of a node, I need to lock the previous and the next node. Now it is entirely possible that by the time Thread 1 tries to lock the previous node, it has already been deleted by Thread 2. How do I guard against such accidents?
I'm not able to recommend any libraries that do this for C specifically, but if you end up doing it yourself you could potentially avoid having to have thousands of locks by re-using a small number of locks and some "hashing" to decide which to use for each node. You'd get quite a number of cases where there wouldn't be any contention if the number of locks is suitably larger than the number of nodes for little space overhead (and it's fixed, not per node).
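A minimal sketch of that lock-pool idea; the pool size and the address hash are arbitrary illustrative choices.

    #include <pthread.h>
    #include <stdint.h>

    #define NLOCKS 64                      /* fixed pool of locks, regardless of list size */
    static pthread_mutex_t stripe[NLOCKS];

    void stripes_init(void)
    {
        for (int i = 0; i < NLOCKS; i++)
            pthread_mutex_init(&stripe[i], NULL);
    }

    /* Pick a lock for a node by hashing its address; distinct nodes usually
     * map to distinct locks, so contention stays low without per-node storage. */
    pthread_mutex_t *lock_for(const void *node)
    {
        uintptr_t h = (uintptr_t)node;
        h ^= h >> 7;                       /* cheap mixing of the low bits */
        return &stripe[h % NLOCKS];
    }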
Update, for EDIT 1
You could work around this by having a per-list multiple-reader, single-writer lock (rwlock), where you acquire a "read" lock before getting the per-node lock for inserts, but for a delete you need to take the single "write" lock. You avoid unnecessary synchronisation issues for the read/insert operations fairly easily, and deleting is simple enough. (The assumption is that deletes are much rarer than inserts, though.)
You may want to look at using a lock-free implementation. The idea is to use an atomic test-set operation when inserting/deleting a node.
Unfortunately, there are not many widely known implementations. You may have to roll your own. Here is the gcc documentation about atomic operation support:
http://gcc.gnu.org/onlinedocs/gcc-4.1.2/gcc/Atomic-Builtins.html
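For a taste of what that looks like, here is a hedged sketch of a lock-free push to the head of a singly linked list using those builtins; anything beyond this (deletes, insertion at arbitrary positions, e.g. Harris's algorithm) is considerably more involved.

    struct lf_node { struct lf_node *next; int value; };

    static struct lf_node *lf_head;

    /* Retry the compare-and-swap until no other thread changed the head in between. */
    void lf_push_front(struct lf_node *n)
    {
        struct lf_node *old;
        do {
            old = lf_head;
            n->next = old;
        } while (!__sync_bool_compare_and_swap(&lf_head, old, n));
    }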
The trouble with node based locking is that you normally have to lock two nodes for each insertion. This can be more expensive in some situations.
Worse, you get dining-philosophers-style deadlock possibilities that you have to handle.
So list-based locking is easier, and that's why you see more written about it.
If the performance characteristics of list-based locking are not favorable for your application, consider switching to a data structure other than a singly linked list.

When I update/insert a single row should it lock the entire table?

I have two long-running queries that both run inside transactions and access the same table, but completely separate rows in that table. These queries also perform some updates and inserts based on what they read.
It appears that when these run concurrently they encounter a lock of some kind, which prevents the task from finishing; it locks up when it goes to update one of the rows. I'm using an exclusive row lock on the rows being read, and the lock that shows up on the process is an LCK_M_IX lock.
Two questions:
When I update/insert a single row does it lock the entire table?
What can be done to work around this sort of issue?
Typically no, but it depends (most often used answer for SQL Server!)
SQL Server will have to lock the data involved in a transaction in some way. It has to lock the data in the table itself, and the data in any affected indexes, while you perform a modification. In order to improve concurrency, there are several "granularities" of locking that the server might decide to use, in order to allow multiple processes to run: row locks, page locks, and table locks are common (there are more). Which scale of locking is in play depends on how the server decides to execute a given update. Complicating things, there are also classifications of locks like shared, exclusive, and intent exclusive, that control whether the locked object can be read and/or modified.
It's been my experience that SQL Server mainly uses page locks for changes to small portions of tables, and past some threshold will automatically escalate to a table lock, if a larger portion of a table seems (from stats) to be affected by an update or delete. The idea is that it is faster to lock a table (one lock) than obtaining and managing thousands of individual row or page locks for a big update.
To see what is happening in your specific case, you'd need to look at the query logic and, while your stuff is running, examine the locking/blocking conditions in sys.dm_tran_locks, sys.dm_os_waiting_tasks or other DMVs. You would want to discover what exactly is getting locked by what step in each of your processes, to discover why one is blocking the other.
The short version:
No
Fix your code.
The long version:
LCK_M_IX is an intent lock, meaning the operation will place an X lock on a subordinate element. E.g., when updating a row in a table, the operation takes an IX lock on the table before taking an X lock on the row being updated/inserted/deleted. Intent locks are a common strategy to deal with hierarchies, like table/page/row, because the lock manager cannot understand the physical structure of the resources requested to be locked (i.e. it cannot know that an X lock on page P1 is incompatible with an S lock on row R1, because R1 is contained in P1). For more details, see Lock Modes.
The fact that you are seeing contention on intent locks means you are trying to obtain high level object locks, like table locks. You will need to analyze your source code for the request being blocked (the one requesting the lock incompatible with LCK_M_IX) and remove the cause of the object level lock request. What that means will depend on your source code, I cannot know what you're doing there. My guess is that you use an erroneous lock hint.
A more general approach is to rely on SNAPSHOT isolation. But this, most likely, will not solve the problem you're seeing, since snapshot isolation can only help with row-level contention issues, not with applications that request table locks.
A frequent goal when using transactions: keep them as short and sweet as possible. I get the sense from your wording in the question that you are opening a transaction, then doing all kinds of things, some of which take a long time, and then expecting multiple users to be able to run this same code concurrently. Unfortunately, if you perform an insert at the beginning of that set of code, then do 40 other things before committing or rolling back, it is possible that that insert will block everyone else from running the same type of insert, essentially turning your operation from a free-for-all into a serial one.
Find out what each query is doing, and see if you are getting lock escalations that you wouldn't expect. Just because you say WITH (ROWLOCK) on a query doesn't mean SQL Server will be able to comply... if you are touching multiple indexes, indexed views, persisted computed columns etc., then there are all kinds of reasons why your rowlock may not hold any water. You also might have things later in the transaction that are taking longer than you think, and maybe you don't realize that the locks on all of the objects involved in the transaction (not just the statement that is currently running) can be held for the duration of the transaction.
Different databases have different locking mechanisms; SQL Server and Oracle, for example, support several types of locks.
The default on SQL Server appears to be pessimistic page locking - so if you have a small number of records, all of them may get locked.
Most databases should not lock when running a script, so I'm wondering whether you're potentially running multiple queries concurrently without transactions.
