I am writing two programs that simulate a banking service: a server program and a user program. The server sets up multiple threads that function as "electronic counters", which read the user's requests and carry them out.
The users' accounts are stored on the server inside an array and are accessed depending on the requests. My problem is the following: imagine thread A is transferring money from John to Maria. How can I stop the other threads from accessing John's and Maria's accounts while the transaction is taking place? I know about semaphores, mutexes and condition variables, but I can't find a way to use them that doesn't block access to the entire array.
EDIT: I was told to create N mutexes, where N = number of accounts, and have each mutex associated with an account. Is there a better solution to solve this problem?
There are several options, among them:
Option 1
Give every account its own mutex. Ensure that when a thread wants to lock two records (e.g. for a transfer), it always locks them in the same order, e.g. lowest account number first.
Threads will then simply acquire the mutexes of the records they need to modify (always observing correct locking order to avoid deadlock), make their modifications, and then release the mutexes.
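A minimal sketch of option 1 in Python, assuming accounts are identified by their index in the array. The `Bank` class, its `transfer` method and the `demo` function are hypothetical names used for illustration only; the same ordering rule applies to pthread mutexes in C.

```python
import threading


class Bank:
    """Accounts stored in a list, with one lock per account (hypothetical layout)."""

    def __init__(self, balances):
        self.balances = list(balances)
        self.locks = [threading.Lock() for _ in self.balances]

    def transfer(self, src, dst, amount):
        """Move `amount` from `src` to `dst`; return False if funds are short."""
        if src == dst:
            return True  # nothing to do; locking the same mutex twice would deadlock
        first, second = sorted((src, dst))  # fixed global order: lowest index first
        with self.locks[first], self.locks[second]:
            if self.balances[src] < amount:
                return False
            self.balances[src] -= amount
            self.balances[dst] += amount
            return True


def demo():
    """Run opposing transfers concurrently; the ordering rule prevents deadlock."""
    bank = Bank([100, 100])

    def worker(src, dst):
        for _ in range(1000):
            bank.transfer(src, dst, 1)

    threads = [
        threading.Thread(target=worker, args=(0, 1)),
        threading.Thread(target=worker, args=(1, 0)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return bank.balances
```

Without the `sorted` call, the two workers in `demo` could each hold one lock while waiting for the other. Threads that touch unrelated accounts never contend with each other.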
Option 2
Roll your own record-level locks. Establish a variable for each account recording whether that account is locked. This can be inside the account array or in a separate data structure. Use a single mutex to protect access to all the lock flags, and a CV to assist threads in waiting for a lock to become available.
Threads then operate in this pattern:
1. Lock the mutex.
2. If all required records are unlocked, turn on their lock flags and go to step 4.
3. Wait on the CV, then go back to step 2.
4. Release the mutex.
5. Perform all (other) account modifications.
6. Re-lock the mutex.
7. Turn off all the record locks acquired in step 2.
8. Broadcast to the CV and release the mutex.
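The steps above can be sketched in Python roughly as follows. This is an illustration under assumptions, not a definitive implementation: `RecordLocks`, `balances` and `transfer` are hypothetical names, and the lock flags are kept in a set rather than inside the account array.

```python
import threading


class RecordLocks:
    """Per-record lock flags guarded by one mutex and one condition variable."""

    def __init__(self):
        self._cv = threading.Condition()  # wraps the single mutex
        self._locked = set()              # ids of records currently flagged as locked

    def acquire(self, *records):
        wanted = set(records)
        with self._cv:                    # step 1: lock the mutex
            while self._locked & wanted:  # step 2: is any required record locked?
                self._cv.wait()           # step 3: wait, then re-check
            self._locked |= wanted        # step 2: turn on the lock flags
        # step 4: the mutex is released on leaving the `with` block

    def release(self, *records):
        with self._cv:                    # step 6: re-lock the mutex
            self._locked -= set(records)  # step 7: turn off the lock flags
            self._cv.notify_all()         # step 8: broadcast; mutex released on exit


balances = {"john": 100, "maria": 0}
locks = RecordLocks()


def transfer(src, dst, amount):
    locks.acquire(src, dst)
    try:
        balances[src] -= amount           # step 5: modify the accounts
        balances[dst] += amount
    finally:
        locks.release(src, dst)
```

Because all required flags are set atomically under the one mutex, no lock ordering is needed here. The cost is that every acquire and release goes through the same mutex.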
Option 2 has more thread contention than does option 1, and therefore probably somewhat less concurrency in practice, but that is the tradeoff involved in using only one mutex. You could address that to some extent with a hybrid solution that divided the accounts into groups, and implemented option 2 on a per-group basis.
Let's say we have an inventory system that tracks the available number of products in a shop (quantity). So we can have something similar to this:
| Id | Name   | Quantity |
|----|--------|----------|
| 1  | Laptop | 10       |
We need to think about two things here:
Be sure that Quantity is never negative
If we have simultaneous requests for a product we must ensure valid Quantity.
In other words, we can have:
request1 for 5 laptops (this request will be processed on thread1)
request2 for 1 laptop (this request will be processed on thread2)
When both requests are processed, the database should contain
| Id | Name   | Quantity |
|----|--------|----------|
| 1  | Laptop | 4        |
However, that might not be the case, depending on how we write our code.
If on our server we have something similar to this:
var product = _database.GetProduct();
if (product.Quantity - requestedQuantity >= 0)
{
    product.Quantity -= requestedQuantity;
    _database.Save();
}
With this code, it's possible that both requests (that are executed on separate threads) would hit the first line of the code at the exact same time.
thread1: _database.GetProduct(); // Quantity is 10
thread2: _database.GetProduct(); // Quantity is 10
thread1: product.Quantity = 10 - 5 = 5
thread2: product.Quantity = 10 - 1 = 9
thread1: _database.Save(); // Quantity is 5
thread2: _database.Save(); // Quantity is 9
What has just happened? We sold 6 laptops, but deducted only 1 from the inventory.
How to approach this problem?
To ensure the quantity never goes negative we can use a DB constraint (to imitate an unsigned int).
To deal with the race condition we usually use a lock or a similar technique.
Depending on the case that might work if we have a single instance of the server. But what should we do when we have multiple instances of the server, each running in a multithreaded environment?
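For the single-instance case, a minimal sketch of the lock-based fix in Python. The `Inventory` class is a hypothetical in-memory stand-in for the product row, not the database code from the question.

```python
import threading


class Inventory:
    """In-memory stand-in for the product row (hypothetical, single process only)."""

    def __init__(self, quantity):
        self.quantity = quantity
        self._lock = threading.Lock()

    def reserve(self, requested):
        # The check and the update happen under one lock, so no other thread
        # can act on a stale quantity between them.
        with self._lock:
            if self.quantity - requested >= 0:
                self.quantity -= requested
                return True
            return False


inventory = Inventory(10)
threads = [threading.Thread(target=inventory.reserve, args=(n,)) for n in (5, 1)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(inventory.quantity)  # 4
```

This only protects threads inside one process. A second server instance would have its own lock object and would not be excluded by it.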
It seems to me that the moment you have more than one web server, your only reasonable option for locking is the database. Why do I say reasonable? Because we have Mutex.
A lock allows only one thread to enter the part that's locked and the lock is not shared with any other processes.
A mutex is the same as a lock but it can be system-wide (shared by multiple processes).
Now, this is my personal opinion, but I expect that managing a Mutex between a few processes is tricky and messy in a microservice-oriented world, where a new instance of the server can spin up at any second and an existing instance can die at any second. (Is there a GitHub example?)
How to solve the problem then?
Stored procedure - offload the responsibility to the database. Write a new stored procedure and wrap the whole logic in a transaction. Each of the servers will call this SP and we don't need to worry about anything. But this might be slow?
SELECT ...FOR UPDATE - I saw this while I was investigating the problem. With this approach, we still try to solve the problem on 'database' level.
Taking into account all of the above, what should be the best approach to solve this problem? Is there any other solution I am missing? What would you suggest?
I am working in .NET and using EF Core with PostgreSQL, but I think that this is really a language-agnostic question and that principle for solving the issue is similar in all environments (and similar for many relational databases).
After reading the majority of the comments let's assume that you need a solution for a relational database.
The main thing that you need to guarantee is that the write operation at the end of your code only happens if the precondition is still valid (e.g. product.Quantity - requestedQuantity >= 0).
This precondition is evaluated in memory on the application side. But the application only sees a snapshot of the data from the moment the database read happened (_database.GetProduct();), and that snapshot can become stale as soon as someone else updates the same data. If you want to avoid SERIALIZABLE as the transaction isolation level (which has performance implications anyway), the application should detect, at the moment of writing, whether the precondition is still valid or, put differently, whether the data has remained unchanged while it was working on it.
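One way to check the precondition at write time is to fold it into the UPDATE statement itself. The sketch below uses Python's sqlite3 module with an in-memory database purely so that it runs as-is; the table layout and the `reserve_if_available` function are assumptions for illustration, and the same SQL shape should carry over to PostgreSQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT, quantity INTEGER)")
conn.execute("INSERT INTO product VALUES (1, 'Laptop', 10)")
conn.commit()


def reserve_if_available(conn, product_id, requested):
    """Deduct `requested` only if enough stock is left at write time."""
    with conn:  # one transaction: commit on success, roll back on error
        cur = conn.execute(
            "UPDATE product SET quantity = quantity - ? "
            "WHERE id = ? AND quantity >= ?",
            (requested, product_id, requested),
        )
    return cur.rowcount == 1  # 0 rows touched means the precondition failed
```

The database evaluates the condition against the current row rather than an application-side snapshot, so two concurrent requests cannot both succeed on stale data.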
This can be done by using offline concurrency patterns: Either an optimistic offline lock or a pessimistic offline lock. Many ORM frameworks support these features by default.
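A hand-rolled sketch of an optimistic offline lock, assuming a `version` column that is incremented on every write. It uses Python's sqlite3 module with an in-memory database only so the example is runnable; the schema, the `reserve_optimistic` function and the retry count are hypothetical. An ORM with built-in concurrency tokens would do the equivalent check for you.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT,"
    " quantity INTEGER, version INTEGER NOT NULL DEFAULT 0)"
)
conn.execute("INSERT INTO product (id, name, quantity) VALUES (1, 'Laptop', 10)")
conn.commit()


def reserve_optimistic(conn, product_id, requested, max_retries=3):
    """Read, check in memory, then write only if the row version is unchanged."""
    for _ in range(max_retries):
        quantity, version = conn.execute(
            "SELECT quantity, version FROM product WHERE id = ?", (product_id,)
        ).fetchone()
        if quantity - requested < 0:
            return False  # precondition fails on the snapshot we just read
        with conn:
            cur = conn.execute(
                "UPDATE product SET quantity = ?, version = version + 1 "
                "WHERE id = ? AND version = ?",
                (quantity - requested, product_id, version),
            )
        if cur.rowcount == 1:
            return True  # nobody changed the row since we read it
        # Another writer got in first: re-read the row and try again.
    return False
```

No lock is held between the read and the write. A conflicting update is detected by the version mismatch, and the caller decides whether to retry or give up.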
I'm building a web service which reserves unique items to users.
The service is required to handle high volumes of concurrent requests that should avoid blocking each other as much as possible. Each incoming request must reserve n unique items of the desired type, and then either process them successfully or release them back to the list of reservable items so they can be reserved by another request. Successful processing involves multiple steps, such as communicating with integrated services and other time-consuming work, so keeping items reserved with a DB transaction until the end would not be an efficient solution.
Currently I've implemented a solution where reservable items are stored in a buffer DB table, where items are locked and deleted by incoming requests with SELECT FOR UPDATE SKIP LOCKED. As the service must support multiple item types, this buffer table contains only n items per type at a time; otherwise the table would grow too big, since there are about ten thousand different types. When all items of a certain type are reserved (selected and removed), the request locks the item type and adds more reservable items to the buffer. This fill operation requires integration calls and may take some time. During the fill, all other operations need to wait until the fill finishes and items become available. This is where the problem arises. When thousands of requests wait for the same item type to become available in the buffer, each needs to poll this information somehow.
What could be an efficient solution for this kind of polling?
I think the "real" answer is to start the refill process when the stock gets low, rather than when it is completely depleted. Then it would already be refilled by the time anyone needs to block on it. Or perhaps you could make the refill process work asynchronously, so that the new rows are generated near-instantly and then the integrations are called later. So you would enqueue the integrations, rather than the consumers.
But barring that, it seems like you want the waiters to lock the "item type" in a mode incompatible with how the refiller locks it. Then they will naturally block, and be released once the refiller is done. The problem is that if you want to assemble an order of 50 things and the 47th is depleted, do you want to maintain the reservation on the previous 46 things while you wait?
Presumably your reservation is not blocking anyone else, unless the one you have reserved is the last one available. In which case you are not really blocking them, just forcing them to go through the refill process, which would have had to be done eventually anyway.
I have a simple multi-threaded application. All the threads only do put operations to the same database. Before a thread performs a put operation, it first acquires a mutex lock to increase the key number, then releases the lock, and then does the put operation, i.e. the threads insert items with different key numbers, possibly at the same time. That's what I did in my application.
What I am still confused about is whether this simple app needs to specify DB_INIT_LOCK flag or DB_INIT_CDB flag? I have read the document about these flags. DB_INIT_CDB means multiple reads/single writer, however, in my simple app, the threads can operate concurrently, not single writer, so I do not need it. For DB_INIT_LOCK, since the threads never insert the item with the same key, I do not need it, am I right?
Please correct me if I am wrong. Many thanks.
You correctly state that DB_INIT_CDB gives you a multi-reader, single-writer environment. This puts Berkeley DB in a completely different mode of operation. But, since you've got more than one writer, you can't use it.
You'll need at least these two flags:
DB_INIT_LOCK: You're doing your own locking around your database key generation. But when you insert records into the database, Berkeley DB is going to touch some of the same pieces of memory. For example, the very first two records you insert will be right next to each other in the database. Unless they are large, they'll be on the same database "page" of memory. You need this flag to tell BDB to do its own locking.
It's the same as if you implemented your own in-memory binary tree that multiple threads were changing. You'd have to use some kind of locking to prevent the threads from completely destroying the tree with incompatible updates.
DB_THREAD: This flag lets BDB know that multiple threads will be using the same database environment.
You may find that you need to use transactions, or at the very least to allow BDB to use them internally. That's DB_INIT_TXN. And I've always needed DB_INIT_MPOOL and DB_PRIVATE to allow BDB to use malloc() to manage some of its own memory.
(Just as an aside, if you have a simple increment for your key, consider using an atomic increment operation instead of a mutex. If you're using C with gcc, the builtin __sync_fetch_and_add (or __sync_add_and_fetch) can do this for you. From C++, you can use std::atomic's post-increment.)
I read article from this link
https://msdn.microsoft.com/en-us/library/ms189823.aspx
One thing I don't clearly understand is the difference between the @LockMode values: Shared, Update, IntentShared, IntentExclusive, and Exclusive.
Depending on the lock mode you take, other transactions using the same resource can either acquire a lock or not. The meaning of the locks and their effect on other lock takers are described here:
SQL Server lock compatibility matrix.
Short version:
Shared (aka "Read"): Lets others take Shared locks, too but prevents Exclusive locks from being taken.
Update: Only one transaction at a time can have an Update lock. Others can take Shared locks. Exclusive locks are prevented.
Exclusive: What it says on the label. Every other lock is prevented.
Intent ...: Not a very useful mode for an application lock. These come from resource hierarchies like index trees and mean that you don't wish to lock the actual resource but one that depends on it (which may or may not lead to a change on the intent-locked resource).
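The short version above can be written down as a small lookup table. This is only an illustration of the three non-intent modes as described in this answer, in Python; the `COMPATIBLE` table and `can_grant` function are hypothetical names and are not part of SQL Server. The intent modes are left out.

```python
# Which lock modes can coexist on the same resource, per the short version above.
COMPATIBLE = {
    ("Shared", "Shared"): True,
    ("Shared", "Update"): True,
    ("Shared", "Exclusive"): False,
    ("Update", "Update"): False,
    ("Update", "Exclusive"): False,
    ("Exclusive", "Exclusive"): False,
}


def can_grant(requested, held):
    """Return True if `requested` is compatible with every mode in `held`."""

    def ok(a, b):
        # The table is symmetric, so look the pair up in either order.
        return COMPATIBLE.get((a, b), COMPATIBLE.get((b, a), False))

    return all(ok(requested, mode) for mode in held)
```

For example, `can_grant("Update", ["Shared"])` is true, while `can_grant("Update", ["Update"])` is false because only one transaction at a time can hold an Update lock.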
For instance, suppose there are tables A and B, and I need to process an update in A and then in B, and I decide to table-lock both of them during use (as demanded by my architect). Simultaneously, another procedure is called which table-locks B, then locks A.
Will this transaction complete? I have a feeling it's a deadlock; I'm quite sure of it, as it's not releasing any resources...
Yes, it is a possible deadlock.
The deadlock scenario is
Your task locks A
Other task locks B
then
Your task tries to lock B but it can't, as the other task has the lock
and
other task tries to lock A but it can't as you have it.
So one of these tasks has to fail and roll back so that the other can complete. Depending on the RDBMS used, the database will choose one of them to terminate.
Often the solution is a guideline that all processes must lock resources in the same order. Usually this has to be enforced manually.
Yes. This approach will end in a classic cyclic deadlock as mentioned here
Using a table-level lock for an update is overkill. What is the rationale behind doing this? If you have the correct indexes, locks will be acquired at the key level, which lets multiple processes access the tables in question concurrently.
Still it is a best practice to access the tables in the same order when possible.