I have a PL/SQL API that can be called from multiple threads concurrently. However, there is a piece of code in the API that I want to be accessed by only one thread at a time. I am using dbms_lock.request to handle the concurrency, and I am using the following query to check the number of sessions waiting on the lock:
SELECT
    l.*,
    substr(a.name, 1, 41) name,
    substr(s.program, 1, 45) program,
    p.spid spid,
    s.osuser,
    l.sid sid,
    s.process pid,
    s.terminal,
    s.status
FROM
    sys.dbms_lock_allocated a,
    v$lock l,
    v$session s,
    v$process p
WHERE
    a.lockid = l.id1
    AND l.type = 'UL'
    AND l.sid = s.sid
    AND p.addr = s.paddr;
I see only around 200 sessions waiting on the lock, but there are actually thousands of threads invoking the API.
I want to know what determines the maximum number of threads that can wait on a lock. And what happens to the other threads that are accessing the API?
There is no limit, but most likely you have hit a limit higher up in the stack.
For example, if I had a connection pool of (say) 100 sessions, then from the database perspective I would see:
1 session from the pool holding the lock
99 sessions from the pool waiting for the lock
but from upstream, I might also see another 500 app sessions waiting to get a slot from the (currently full) connection pool.
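The effect can be modelled in a few lines of Python (purely illustrative -- the pool size, lock, and function names here are assumptions, not your actual stack): no matter how many application threads call the API, the database never sees more lock waiters than the pool has slots.

import threading

POOL_SIZE = 100                         # assumed connection pool size
pool = threading.Semaphore(POOL_SIZE)   # stands in for the app's connection pool
db_lock = threading.Lock()              # stands in for the dbms_lock-protected section

def do_serialized_work():
    pass  # placeholder for the single-threaded part of the API

def call_api():
    with pool:         # thousands of threads queue here, invisible to the database
        with db_lock:  # at most POOL_SIZE sessions ever wait here (1 holder + 99 waiters)
            do_serialized_work()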
"Thousands of threads" is concerning from an app design perspective
Why does it say "by a sequence of short transactions"? If the transactions were long, there should be no difference, no?
However, care must be taken to avoid the following scenario. Suppose a transaction T2 has a shared-mode lock on a data item, and another transaction T1 requests an exclusive-mode lock on the data item. T1 has to wait for T2 to release the shared-mode lock. Meanwhile, a transaction T3 may request a shared-mode lock on the same data item. The lock request is compatible with the lock granted to T2, so T3 may be granted the shared-mode lock. At this point T2 may release the lock, but still T1 has to wait for T3 to finish. But again, there may be a new transaction T4 that requests a shared-mode lock on the same data item, and is granted the lock before T3 releases it. In fact, it is possible that there is a sequence of transactions that each requests a shared-mode lock on the data item, and each transaction releases the lock a short while after it is granted, but T1 never gets the exclusive-mode lock on the data item. The transaction T1 may never make progress, and is said to be starved.
Long transactions (in time) are actually more susceptible to blocking problems than short transactions are. Consequently, it is usually recommended that transactions be designed to hold blocking locks for as short a time as possible.
So, in the scenario above a series of "long" transactions are actually much more likely to cause this problem. However, the writer refers to a series of "short" transactions to emphasize that this problem can happen even when the transactions are short (if there are enough nearly simultaneous compatible transactions).
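The scenario is easy to reproduce. Below is a minimal Python sketch (my own illustration, not from the book) of a naive reader-favoring lock: shared requests are granted even while an exclusive requester is waiting, so a stream of overlapping readers can starve a writer indefinitely.

import threading

class ReaderFavoringLock:
    """Grants shared requests even while an exclusive request is
    waiting -- exactly the starvation-prone policy described above."""

    def __init__(self):
        self._mutex = threading.Lock()
        self._no_readers = threading.Condition(self._mutex)
        self._readers = 0

    def acquire_shared(self):
        with self._mutex:
            self._readers += 1        # T3, T4, ... get in ahead of waiting T1

    def release_shared(self):
        with self._mutex:
            self._readers -= 1
            if self._readers == 0:
                self._no_readers.notify_all()

    def acquire_exclusive(self):
        self._mutex.acquire()
        while self._readers > 0:      # T1 waits; new readers keep arriving
            self._no_readers.wait()
        # Returns still holding _mutex, so no new readers can enter.

    def release_exclusive(self):
        self._mutex.release()

If the readers' hold times overlap so that _readers never drops to zero, acquire_exclusive never returns: T1 is starved, however short each individual reader's hold is.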
I am writing two programs that simulate a banking service. There's the server program and the user program. The server sets up multiple threads that function as "electronic counters" that read the users' requests and carry them out.
The users' accounts are stored on the server inside an array, and they can be accessed depending on the requests. My problem is the following: imagine thread A is transferring money from John to Maria. How can I stop the other threads from accessing John's and Maria's accounts while the transfer is taking place? I know about semaphores, mutexes and condition variables, but I can't find a way to use them that doesn't block access to the entire array.
EDIT: I was told to create N mutexes, where N = the number of accounts, and to have each mutex associated with one account. Is there a better solution to this problem?
There are several options, among them:
Option 1
Give every account its own mutex. Ensure that when a thread wants to lock two records (e.g. for a transfer), it always locks them in the same order -- e.g. lowest account number first.
Threads will then simply acquire the mutexes of the records they need to modify (always observing correct locking order to avoid deadlock), make their modifications, and then release the mutexes.
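A minimal sketch of option 1 (the question doesn't name a language, so this is illustrative Python; NUM_ACCOUNTS and the starting balances are made up):

import threading

NUM_ACCOUNTS = 100                            # assumed number of accounts
accounts = [1000.0] * NUM_ACCOUNTS            # the shared account array
locks = [threading.Lock() for _ in accounts]  # one mutex per account

def transfer(src, dst, amount):
    if src == dst:
        return
    # Always acquire the lower-numbered account's mutex first to avoid deadlock.
    first, second = sorted((src, dst))
    with locks[first]:
        with locks[second]:
            accounts[src] -= amount
            accounts[dst] += amount

Only the two accounts involved are locked; every other account in the array stays freely accessible to other threads.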
Option 2
Roll your own record-level locks. Establish a variable for each account recording whether that account is locked. This can be inside the account array or in a separate data structure. Use a single mutex to protect access to all the lock flags, and a CV to assist threads in waiting for a lock to become available.
Threads then operate in this pattern:
1. Lock the mutex.
2. If all required records are unlocked, then turn on their lock flags and go to step 4.
3. Wait on the CV, then go back to step 2.
4. Release the mutex.
5. Perform all (other) account modifications.
6. Re-lock the mutex.
7. Turn off all the record locks acquired in step 2.
8. Broadcast to the CV and release the mutex.
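The same transfer written against option 2, again as an illustrative Python sketch, with a single mutex and CV guarding per-account lock flags:

import threading

NUM_ACCOUNTS = 100
accounts = [1000.0] * NUM_ACCOUNTS
locked = [False] * NUM_ACCOUNTS  # one lock flag per account
cv = threading.Condition()       # single mutex + CV protecting all the flags

def transfer(src, dst, amount):
    with cv:                               # steps 1-4: claim both records
        while locked[src] or locked[dst]:
            cv.wait()
        locked[src] = locked[dst] = True
    accounts[src] -= amount                # step 5: modify without the mutex held
    accounts[dst] += amount
    with cv:                               # steps 6-8: release the records
        locked[src] = locked[dst] = False
        cv.notify_all()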
Option 2 has more thread contention than option 1 does, and therefore probably somewhat less concurrency in practice, but that is the tradeoff involved in using only one mutex. You could address that to some extent with a hybrid solution that divides the accounts into groups and implements option 2 on a per-group basis.
Problem:
I have a list of 2M+ users in my Datastore project. I would like to send a weekly newsletter to all of them. The mailing API accepts a maximum of 50 email addresses per API call.
Previous Solution:
Used an App Engine backend and a simple Datastore query to process all the records in one go. But what happens is, sometimes I get a memory overflow critical error log and the process starts all over again. Because of this, some users get the same email more than once. So I moved to Dataflow.
Current Solution:
I use the FlatMap function to pass each email ID to a function that then sends an email to each user individually.
def process_datastore(project, pipeline_options):
    p = beam.Pipeline(options=pipeline_options)
    query = make_query()
    # Read every user entity from Datastore.
    entities = (p | 'read from datastore' >> ReadFromDatastore(project, query))
    # Send one email per entity -- one mail API call per user.
    entities | beam.FlatMap(lambda entity: sendMail([entity.properties.get('emailID', "")]))
    return p.run()
With Cloud Dataflow, I have ensured that each user gets a mail only once and that nobody is missed out. There are no memory errors.
But this current process takes 7 hours to finish. I have tried replacing FlatMap with ParDo, on the assumption that ParDo would parallelize the process, but even that takes the same time.
Question:
How can I bunch the email IDs into groups of 50, so that each mail API call is used effectively?
How can I parallelize the process so that the total time taken is less than an hour?
You could use query cursors to split your users into batches of 50 and do the actual batch processing (the email sending) inside push queue or deferred tasks. This would be a GAE-only solution, without Cloud Dataflow; IMHO a lot simpler.
You can find an example of such processing in Google appengine: Task queue performance (taking the answer into account as well). That solution uses the deferred library, but it is almost trivial to use push queue tasks instead.
The answer touches on the parallelism aspect in the sense that you may want to limit it to keep costs down.
You could also split the batching itself across the tasks to obtain an indefinitely scalable solution (any number of recipients, without hitting memory or deadline-exceeded failures), with each task re-enqueueing itself to continue the work from where it left off.
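A rough sketch of that pattern using the ndb client and the deferred library (the User model, emailID property, and sendMail call are stand-ins for the code in the question):

from google.appengine.ext import deferred, ndb

BATCH_SIZE = 50  # the mail API's per-call limit

class User(ndb.Model):                 # stand-in for the question's user entity
    emailID = ndb.StringProperty()

def send_newsletter(cursor=None):
    # Fetch the next 50 users, starting where the previous task stopped.
    users, next_cursor, more = User.query().fetch_page(
        BATCH_SIZE, start_cursor=cursor)
    sendMail([u.emailID for u in users])  # one mail API call for up to 50 recipients
    if more:
        # Re-enqueue to continue from where this batch left off.
        deferred.defer(send_newsletter, next_cursor)

Because each task processes one batch and then hands the cursor to a fresh task, no single request ever holds 2M+ records in memory, and a crash re-sends at most one batch.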
I have an error log which reports a deadlock:
Transaction (Process ID 55) was deadlocked on lock | communication buffer resources with another process and has been chosen as the deadlock victim. Rerun the transaction.
I am trying to reproduce this error, but my standard deadlock SQL code produces a different error:
Transaction (Process ID 54) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.
I want to be very clear that I am not asking what a deadlock is. I do understand the basics.
My question is: what is the meaning of lock | communication buffer resources in this context? What are "communication buffer resources"? Does the lock | signify anything?
My best guess is that a communication buffer is used when parallel threads combine their results. Can anyone confirm or deny this?
My ultimate goal is to somehow trigger the first error to occur again.
I would interpret the message as a deadlock on some combination of lock resources and communication buffer resources. "Lock resources" are ordinary object locks, and "communication buffer resources" are exchangeEvents, which are used for combining the results of parallel queries. These are described further in https://blogs.msdn.microsoft.com/bartd/2008/09/24/todays-annoyingly-unwieldy-term-intra-query-parallel-thread-deadlocks/ where the relevant paragraph is:
An "exchangeEvent" resource indicates the presence of parallelism operators in a query plan. The idea is that the work for an operation like a large scan, sort, or join is divided up so that it can be executed on multiple child threads. There are "producer" threads that do the grunt work and feed sets of rows to "consumers". Intra-query parallelism requires signaling between these worker threads: the consumers may have to wait on producers to hand them more data, and the producers may have to wait for consumers to finish processing the last batch of data. Parallelism-related waits show up in SQL DMVs as CXPACKET or EXCHANGE wait types (note that the presence of these wait types is normal and simply indicates the presence of parallel query execution -- by themselves, these waits don't indicate that this type or any other type of deadlock is occurring).
The deadlock graph for one of these that I've seen included a set of processes with only one SPID and a graph of objectlocks and exchangeEvents. I guess the message "Transaction (Process ID 55) was deadlocked on lock | communication buffer resources with another process and has been chosen as the deadlock victim. Rerun the transaction." appears instead of "Intra-query parallelism caused your server command (process ID #51) to deadlock. Rerun the query without intra-query parallelism by using the query hint option (maxdop 1)." because of the combination of objectlocks and exchangeEvents, or else the message has been changed in SQL Server since the article was written.
Your issue is parallelism-related, and the error message has "no meaning" in the sense that it does not reflect your actual problem; and no, do not go and change the MAXDOP setting. In order to get to the cause of the error, you need to use trace flag 1204; have a look at how to enable the trace flag and what information you get from it.
When you do this you'll get the answer as to why, where, and in what line of code the lock was caused. I guess you'll be able to google yourself onward from that point; if not, post the output and you'll get the answer you need.
You can use MAXDOP 1 as a query hint - i.e. run that query on one CPU - without affecting the rest of the server: just append OPTION (MAXDOP 1) to the statement.
This will avoid the error for that query. It doesn't tell you why the query is failing, but it does provide a workaround if you have to get it working fast :-)
I have one variable pool, shared by all clients, that stores all remaining enemies in a game. When a client starts a game, he gets some enemies from pool. When he finishes, enemies that he did not kill are put back into pool. The game also checks to see if all the enemies have been killed (i.e., pool is empty).
What is the best way to implement pool? I'm concerned about the 5 updates per second limit on a datastore entity.
How often do you expect a game to be started/finished? If the answer is definitely less than 5 per second, then just use a single entity to represent pool, and use transactions to atomically get and update it as games start and finish.
If you're really expecting so many clients to share a single pool that it will be updated at a sustained rate of more than 5 times per second, then consider sharding pool into multiple pieces. When a client starts a game, remove entities from just one of the shards. To test whether the sharded pool is empty, just retrieve all the shards and see if they are all empty. (When modifying a shard, you'll still need to use a transaction.)
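A rough ndb sketch of the sharded pool (the shard count and model are illustrative; with NUM_SHARDS = 1 it degenerates to the single-entity case above):

import random
from google.appengine.ext import ndb

NUM_SHARDS = 20  # size this so each shard stays under ~5 writes/sec

class PoolShard(ndb.Model):
    enemies = ndb.IntegerProperty(default=0)  # enemies remaining in this shard

@ndb.transactional
def take_enemies(shard_id, wanted):
    # Atomically remove up to `wanted` enemies from one shard.
    shard = PoolShard.get_by_id(shard_id) or PoolShard(id=shard_id)
    taken = min(wanted, shard.enemies)
    shard.enemies -= taken
    shard.put()
    return taken

@ndb.transactional
def return_enemies(shard_id, count):
    # Put a finished game's unkilled enemies back into a shard.
    shard = PoolShard.get_by_id(shard_id) or PoolShard(id=shard_id)
    shard.enemies += count
    shard.put()

def start_game(wanted):
    # Pick a random shard; a fuller version might retry other shards if this one is empty.
    return take_enemies(str(random.randrange(NUM_SHARDS)), wanted)

def pool_is_empty():
    # Retrieve all the shards and check that every one is empty.
    return all(s.enemies == 0 for s in PoolShard.query())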