I have a C program that uses a PCRE regex to determine if a process in a cgroup should be added to one variable or another. I spawn a thread to read the cpuacct.stat file in each running cgroup, where the number of threads never exceeds the number of cores. These samples and results are then combined into one of two variables.
The relevant snippet of code is:
pcreExecRet = pcre_exec(reCompiled,
                        pcreExtra,
                        queue,
                        strlen(queue),  // length of string
                        0,              // start looking at this point
                        0,              // OPTIONS
                        subStrVec,
                        30);            // length of subStrVec
//CRITICAL SECTION?
pthread_mutex_lock(&t_lock);             // lock mutex
while (sumFlag == 0) {
    pthread_cond_wait(&ok_add, &t_lock); // wait on ok signal
}

if (pcreExecRet > 0) {
    sumOne += loadavg;
} else if (pcreExecRet == PCRE_ERROR_NOMATCH) {
    sumTwo += loadavg;
} else {
    perror("Could not determine sum!\n"); // if this fails
}

sumFlag = 1;
pthread_cond_signal(&ok_add);            // signal that it is ok to add
pthread_mutex_unlock(&t_lock);           // unlock mutex
My question is whether the pcre_exec() call is thread-safe, and whether it should be moved into the critical section. I know the compiled regex is thread-safe, but I'm not sure about pcreExtra (const pcre_extra) or subStrVec (int *ovector). These variables are global for now.
Yes, it is thread-safe; all PCRE functions are, but you should be careful under certain conditions.
The following is from the manual pages for PCRE:
MULTITHREADING
The PCRE functions can be used in multi-threading applications, with the proviso that the memory management functions pointed to by pcre_malloc, pcre_free, pcre_stack_malloc, and pcre_stack_free, and the callout and stack-checking functions pointed to by pcre_callout and pcre_stack_guard, are shared by all threads.
The compiled form of a regular expression is not altered during matching, so the same compiled pattern can safely be used by several threads at once.
If the just-in-time optimization feature is being used, it needs separate memory stack areas for each thread. See the pcrejit documentation for more details.
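One caveat that bears directly on the question's globals: pcre_exec() writes the match offsets into the ovector you pass it, so while the compiled pattern and the study data can be shared read-only, each thread should supply its own ovector. A minimal sketch, reusing the question's names:

#include <pcre.h>
#include <string.h>

extern pcre *reCompiled;       /* shared: compiled once, read-only */
extern pcre_extra *pcreExtra;  /* shared: read-only during matching */

static int match_queue(const char *queue)
{
    int subStrVec[30];         /* per-thread: pcre_exec() writes into it */
    return pcre_exec(reCompiled, pcreExtra, queue,
                     (int)strlen(queue),
                     0,        /* start offset */
                     0,        /* options */
                     subStrVec, 30);
}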
For child processes, the wait() and waitpid() functions can be used to suspend execution of the current process until a child has exited. But these functions cannot be used for non-child processes.
Is there another function that can wait for the exit of any process?
Nothing equivalent to wait(). The usual practice is to poll using kill(pid, 0), checking for a return value of -1 with errno set to ESRCH to indicate that the process is gone.
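A minimal sketch of that polling loop (the 100 ms interval is arbitrary, and note that this cannot detect pid reuse):

#include <errno.h>
#include <signal.h>
#include <unistd.h>

static void wait_for_exit_polling(pid_t pid)
{
    /* kill(pid, 0) sends no signal, it only checks for existence;
     * EPERM still means the process exists (we just can't signal it) */
    while (kill(pid, 0) == 0 || errno != ESRCH)
        usleep(100000);  /* poll every 100 ms */
}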
Update: since Linux kernel 5.3 there is a pidfd_open syscall, which creates an fd for a given pid that can be polled to get a notification when the pid has exited.
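A sketch of that approach, assuming a kernel and headers of at least 5.3 (older glibc has no wrapper for pidfd_open, hence syscall(2)):

#define _GNU_SOURCE
#include <poll.h>
#include <sys/syscall.h>
#include <unistd.h>

static int wait_for_exit_pidfd(pid_t pid)
{
    int pidfd = syscall(SYS_pidfd_open, pid, 0);
    if (pidfd < 0)
        return -1;                      /* no such process, or old kernel */
    struct pollfd pfd = { .fd = pidfd, .events = POLLIN };
    int rc = poll(&pfd, 1, -1);         /* becomes readable when pid exits */
    close(pidfd);
    return rc < 0 ? -1 : 0;
}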
On BSDs and OS X, you can use kqueue with EVFILT_PROC+NOTE_EXIT to do exactly that. No polling required. Unfortunately there's no Linux equivalent.
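A minimal C sketch of the kqueue approach (the helper name is made up):

#include <sys/event.h>
#include <sys/types.h>
#include <unistd.h>

static int wait_for_exit_kqueue(pid_t pid)
{
    int kq = kqueue();
    if (kq < 0)
        return -1;
    struct kevent ke;
    EV_SET(&ke, pid, EVFILT_PROC, EV_ADD | EV_ENABLE, NOTE_EXIT, 0, NULL);
    /* register the filter and block until the single NOTE_EXIT event */
    int rc = kevent(kq, &ke, 1, &ke, 1, NULL);
    close(kq);
    return rc < 0 ? -1 : 0;
}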
So far I've found three ways to do this on Linux:
Polling: you check for the existence of the process every so often, either by using kill or by testing for the existence of /proc/$pid, as in most of the other answers
Use the ptrace system call to attach to the process like a debugger so you get notified when it exits, as in a3nm's answer
Use the netlink interface to listen for PROC_EVENT_EXIT messages - this way the kernel tells your program every time a process exits and you just wait for the right process ID. I've only seen this described in one place on the internet.
Shameless plug: I'm working on a program (open source of course; GPLv2) that does any of the three.
You could also create a socket or a FIFO and read from it. The FIFO is especially simple: connect the standard output of your child with the FIFO and read. The read will block until the child exits (for any reason) or until it emits some data, so you'll need a little loop to discard the unwanted text data.
If you have access to the source of the child, open the FIFO for writing when it starts and then simply forget about it. The OS will close the open file descriptor when the child terminates, and your waiting "parent" process will wake up.
Now this might be a process that you didn't start or own. In that case, you can replace the binary executable with a script that starts the real binary but also adds monitoring as explained above.
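A sketch of the reading side, assuming the monitored process was started with its output connected to a FIFO created beforehand with mkfifo (the path is whatever you chose):

#include <fcntl.h>
#include <unistd.h>

/* Blocks until every writer has closed the FIFO, i.e. until the
 * monitored process exits; any data it writes is simply discarded. */
static void wait_on_fifo(const char *fifo_path)
{
    char buf[4096];
    int fd = open(fifo_path, O_RDONLY);  /* blocks until a writer appears */
    if (fd < 0)
        return;
    while (read(fd, buf, sizeof buf) > 0)
        ;                                /* the little discard loop */
    close(fd);
}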
Here is a way to wait for any process (not necessarily a child) in linux to exit (or get killed) without polling:
Using inotify to wait for /proc/[pid] to be deleted would be the perfect solution, but unfortunately inotify does not work with pseudo file systems like /proc.
However, we can use it on the executable file of the process.
While the process still exists, this file is being held open.
So we can use inotify with IN_CLOSE_NOWRITE to block until the file is closed.
Of course it can be closed for other reasons (e.g. if another process with the same executable exits) so we have to filter those events by other means.
We can use kill(pid, 0), but that cannot guarantee that it is still the same process. If we are really paranoid about this, we can do something else.
Here is a way that should be 100% safe against pid-reuse trouble: we open the pseudo directory /proc/[pid], and keep it open until we are done. If a new process is created in the meantime with the same pid, the directory file descriptor that we hold will still refer to the original one (or become invalid, if the old process ceases to exist), but will never refer to the new process with the reused pid. Then we can check whether the original process still exists by checking, for example, if the file "cmdline" exists in the directory with openat(). When a process exits or is killed, those pseudo files cease to exist too, so openat() will fail.
Here is some example code:
#include <fcntl.h>
#include <stdio.h>
#include <sys/inotify.h>
#include <unistd.h>

// return -1 on error, or 0 if everything went well
int wait_for_pid(int pid)
{
    char path[32];
    int in_fd = inotify_init();
    if (in_fd < 0)
        return -1;
    snprintf(path, sizeof(path), "/proc/%i/exe", pid);
    if (inotify_add_watch(in_fd, path, IN_CLOSE_NOWRITE) < 0) {
        close(in_fd);
        return -1;
    }
    snprintf(path, sizeof(path), "/proc/%i", pid);
    int dir_fd = open(path, O_RDONLY);
    if (dir_fd < 0) {
        close(in_fd);
        return -1;
    }
    int res = 0;
    while (1) {
        // a watch on a single file produces events without a name,
        // so a bare struct inotify_event is a big enough buffer here
        struct inotify_event event;
        if (read(in_fd, &event, sizeof(event)) < 0) {
            res = -1;
            break;
        }
        // the executable was closed by someone; use the held directory fd
        // to check whether the original process is still there
        int f = openat(dir_fd, "fd", O_RDONLY);
        if (f < 0)
            break;  // the pseudo files are gone: the process has exited
        close(f);
    }
    close(dir_fd);
    close(in_fd);
    return res;
}
You could attach to the process with ptrace(2). From the shell, strace -p PID >/dev/null 2>&1 seems to work. This avoids busy-waiting, though it will slow down the traced process, and will not work on all processes (only yours, which is a bit better than only child processes).
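A rough sketch of doing the same from C; the helper name is made up, and the signal handling here is simplified (a real tracer needs more care):

#include <signal.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>

static int wait_via_ptrace(pid_t pid)
{
    if (ptrace(PTRACE_ATTACH, pid, NULL, NULL) < 0)
        return -1;
    for (;;) {
        int status;
        /* once traced, even a non-child can be waited on */
        if (waitpid(pid, &status, 0) < 0)
            return -1;
        if (WIFEXITED(status) || WIFSIGNALED(status))
            return 0;  /* the traced process is gone */
        /* it merely stopped on a signal: suppress the attach-time SIGSTOP,
         * forward everything else, and keep waiting */
        int sig = WSTOPSIG(status) == SIGSTOP ? 0 : WSTOPSIG(status);
        ptrace(PTRACE_CONT, pid, NULL, (void *)(long)sig);
    }
}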
None I am aware of. Apart from the solution from chaos, you can use semaphores if you can change the program you want to wait for.
The library functions are sem_open(3), sem_init(3), sem_wait(3), ...
sem_wait(3) performs a wait, so you don't have to busy-wait as in chaos' solution. Of course, using semaphores makes your programs more complex and it may not be worth the trouble.
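A sketch of that idea, assuming you can modify the watched program. The semaphore name "/watched-exit" is made up, and note that a process killed by a fatal signal never gets to post:

#include <fcntl.h>
#include <semaphore.h>
#include <stdlib.h>

/* In the watched program: create the semaphore at startup and post it on
 * the way out (atexit() keeps it in one place; a fatal signal skips it). */
static sem_t *exit_sem;

static void post_exit_sem(void) { sem_post(exit_sem); }

void announce_exit_on_shutdown(void)
{
    exit_sem = sem_open("/watched-exit", O_CREAT, 0600, 0);
    atexit(post_exit_sem);
}

/* In the waiting program: sem_wait() blocks, no busy waiting. */
int wait_for_watched_program(void)
{
    sem_t *s = sem_open("/watched-exit", 0);
    if (s == SEM_FAILED)
        return -1;
    sem_wait(s);
    sem_close(s);
    return 0;
}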
Maybe it could be possible to wait for /proc/[pid] or /proc/[pid]/[something] to disappear?
There are poll() and other file event waiting functions, maybe that could help?
Simply poll fields number 22 and 2 of /proc/[PID]/stat.
Field 2 contains the name of the executable and field 22 contains the start time.
If either changes, some other process has taken the same (freed) PID. Thus the method is very reliable.
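A sketch of reading those two fields (the function name is made up); note that the executable name can itself contain spaces and parentheses, so the robust way is to scan from the last ')':

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Returns 1 and fills comm/starttime while the pid exists, 0 once it is
 * gone (or on a parse error). Poll it and compare against the first
 * snapshot: any change means the pid was reused by another process. */
static int read_proc_stat(int pid, char *comm, size_t commsz,
                          unsigned long long *starttime)
{
    char path[64], line[1024];
    snprintf(path, sizeof path, "/proc/%d/stat", pid);
    FILE *f = fopen(path, "r");
    if (!f)
        return 0;
    if (!fgets(line, sizeof line, f)) {
        fclose(f);
        return 0;
    }
    fclose(f);

    /* field 2 (comm) sits between the first '(' and the last ')' */
    char *lp = strchr(line, '(');
    char *rp = strrchr(line, ')');
    if (!lp || !rp || rp < lp)
        return 0;
    size_t n = (size_t)(rp - lp) - 1;
    if (n >= commsz)
        n = commsz - 1;
    memcpy(comm, lp + 1, n);
    comm[n] = '\0';

    /* starttime is field 22 overall, i.e. the 20th field after ')' */
    char *p = rp + 2;  /* now at field 3 (state) */
    for (int field = 3; field < 22; field++) {
        p = strchr(p, ' ');
        if (!p)
            return 0;
        p++;
    }
    *starttime = strtoull(p, NULL, 10);
    return 1;
}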
You can use eBPF to achieve this.
The bcc toolkit implements many excellent monitoring capabilities based on eBPF. Among them, exitsnoop traces process termination, showing the command name and reason for termination, either an exit or a fatal signal. It catches processes of all users, processes in containers, as well as processes that become zombies.
This works by tracing the kernel sched_process_exit() function using dynamic tracing, and
will need updating to match any changes to this function.
Since this uses BPF, only the root user can use this tool.
You can refer to this tool for a related implementation, and get more information about it from the links below:
Github repo: tools/exitsnoop: Trace process termination (exit and fatal signals). Examples.
Linux Extended BPF (eBPF) Tracing Tools
ubuntu manpages: exitsnoop-bpfcc
You can first install this tool and use it to see if it meets your needs, and then refer to its implementation for coding, or use some of the libraries it provides to implement your own functions.
exitsnoop examples:
Trace all process termination
# exitsnoop
Trace all process termination, and include timestamps:
# exitsnoop -t
Exclude successful exits, only include non-zero exit codes and fatal signals:
# exitsnoop -x
Trace PID 181 only:
# exitsnoop -p 181
Label each output line with 'EXIT':
# exitsnoop --label EXIT
Another option
Wait for a (non-child) process' exit using Linux's PROC_EVENTS
Reference project:
https://github.com/stormc/waitforpid
As mentioned in the project:
Wait for a (non-child) process' exit using Linux's PROC_EVENTS. Thanks
to the CAP_NET_ADMIN POSIX capability permitted to the waitforpid
binary, it does not need to be set suid root. You need a Linux kernel
having CONFIG_PROC_EVENTS enabled.
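For reference, a condensed sketch of the proc connector subscription such a tool performs; error handling is mostly omitted, the function name is made up, and it requires CAP_NET_ADMIN plus CONFIG_PROC_EVENTS:

#include <linux/cn_proc.h>
#include <linux/connector.h>
#include <linux/netlink.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

static int wait_via_proc_events(pid_t target)
{
    int nl = socket(PF_NETLINK, SOCK_DGRAM, NETLINK_CONNECTOR);
    if (nl < 0)
        return -1;

    struct sockaddr_nl sa = { .nl_family = AF_NETLINK,
                              .nl_groups = CN_IDX_PROC,
                              .nl_pid = getpid() };
    bind(nl, (struct sockaddr *)&sa, sizeof sa);

    /* subscribe to the proc event multicast group */
    struct {
        struct nlmsghdr hdr;
        struct cn_msg msg;
        enum proc_cn_mcast_op op;
    } req;
    memset(&req, 0, sizeof req);
    req.hdr.nlmsg_len = sizeof req;
    req.hdr.nlmsg_type = NLMSG_DONE;
    req.msg.id.idx = CN_IDX_PROC;
    req.msg.id.val = CN_VAL_PROC;
    req.msg.len = sizeof req.op;
    req.op = PROC_CN_MCAST_LISTEN;
    send(nl, &req, sizeof req, 0);

    /* read events until our pid exits */
    for (;;) {
        char buf[4096] __attribute__((aligned(NLMSG_ALIGNTO)));
        ssize_t n = recv(nl, buf, sizeof buf, 0);
        if (n <= 0)
            break;
        struct nlmsghdr *h = (struct nlmsghdr *)buf;
        struct cn_msg *cn = (struct cn_msg *)NLMSG_DATA(h);
        struct proc_event *ev = (struct proc_event *)cn->data;
        if (ev->what == PROC_EVENT_EXIT &&
            (pid_t)ev->event_data.exit.process_pid == target) {
            close(nl);
            return 0;
        }
    }
    close(nl);
    return -1;
}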
I appreciate Hongli's answer for macOS with kqueue. I implemented it in Swift:
import Darwin
import Foundation

/// Wait for any pids, including non-child pids. Blocks until all pids exit.
/// - Parameters:
///   - timeout: wait until interval, nil means no timeout
/// - Throws: WaitOtherPidError
/// - Returns: isTimeout
func waitOtherPids(_ pids: [Int32], timeout: TimeInterval? = nil) throws -> Bool {
    // create a kqueue
    let kq = kqueue()
    if kq == -1 {
        throw WaitOtherPidError.createKqueueFailed(String(cString: strerror(errno)!))
    }

    // input
    // multiple changes have an OR relation; kevent will return if any of them matches
    var changes: [Darwin.kevent] = pids.map({ pid in
        Darwin.kevent.init(ident: UInt(pid), filter: Int16(EVFILT_PROC), flags: UInt16(EV_ADD | EV_ENABLE), fflags: NOTE_EXIT, data: 0, udata: nil)
    })

    let timeoutDeadline = timeout.map({ Date(timeIntervalSinceNow: $0) })
    let remainTimeout: () -> timespec? = {
        if let deadline = timeoutDeadline {
            let d = max(deadline.timeIntervalSinceNow, 0)
            let fractionalPart = d - TimeInterval(Int(d))
            return timespec(tv_sec: Int(d), tv_nsec: Int(fractionalPart * 1000 * 1000 * 1000))
        } else {
            return nil
        }
    }

    // output
    var events = changes.map { _ in Darwin.kevent.init() }
    while !changes.isEmpty {
        // watch changes (synchronous call)
        let numOfEvent: Int32
        if var timeout = remainTimeout() {
            numOfEvent = kevent(kq, changes, Int32(changes.count), &events, Int32(events.count), &timeout)
        } else {
            numOfEvent = kevent(kq, changes, Int32(changes.count), &events, Int32(events.count), nil)
        }
        if numOfEvent < 0 {
            throw WaitOtherPidError.keventFailed(String(cString: strerror(errno)!))
        }
        if numOfEvent == 0 {
            // timeout. Return directly.
            return true
        }

        // handle the result
        let realEvents = events[0..<Int(numOfEvent)]
        let handledPids = Set(realEvents.map({ $0.ident }))
        changes = changes.filter({ c in
            !handledPids.contains(c.ident)
        })
        for event in realEvents {
            if Int32(event.flags) & EV_ERROR > 0 { // see 'man kevent'
                let errorCode = event.data
                if errorCode == Int(ESRCH) {
                    // "The specified process to attach to does not exist" - ignored
                } else {
                    print("[Error] kevent result failed with code \(errorCode), pid \(event.ident)")
                }
            } else {
                // succeeded event: the pid has exited
            }
        }
    }
    return false
}

enum WaitOtherPidError: Error {
    case createKqueueFailed(String)
    case keventFailed(String)
}
PR_SET_PDEATHSIG can be used to wait for parent process termination
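A minimal sketch; this only works when the watched process is (or can be made) the parent of the waiter:

#include <signal.h>
#include <sys/prctl.h>
#include <unistd.h>

/* Call this early in a child of the process you want to watch. */
static void wait_for_parent_death(void)
{
    prctl(PR_SET_PDEATHSIG, SIGHUP);  /* deliver SIGHUP when the parent dies */
    /* guard against the race where the parent already died before the
     * prctl() call: our parent would have been reparented to init (pid 1) */
    if (getppid() == 1)
        raise(SIGHUP);
    pause();                          /* block until the signal arrives */
}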
My use case is as follows: I have a program that enforces that only one instance of it can be running at any given time, so at startup it always tries to grab hold of a lock file in a standard location, and terminates if the file is already locked. That's all working fine, but now I want to enhance the program with a new command-line option which, when specified, will cause the program to just print out a status report and then terminate (prior to the main lock guard described above). The report will include whether the lock file is already locked, the pid of the running process (if one exists), and some program state queried from a database.
So as you can see, when invoked in this "status report" mode, my program should not actually acquire the lock if it is available. I just want to know if the file is already locked or not, so I can inform the user as part of the status report.
From my searching, there does not appear to be any way of doing this. Rather, the only possible solution seems to be to call flock() with the non-blocking flag, and then, if you actually acquired the lock, you can release it immediately. Something like this:
if (flock(fileno(lockFile), LOCK_EX|LOCK_NB) == -1) {
    if (errno == EWOULDBLOCK) {
        printf("lock file is locked\n");
    } else {
        // error
    } // end if
} else {
    flock(fileno(lockFile), LOCK_UN);
    printf("lock file is unlocked\n");
} // end if
I suppose it's not such a big deal to acquire the lock and then release it immediately, but I was wondering if there's any better solution out there that doesn't involve a brief and unnecessary acquisition of the lock?
Note: There are already a couple of similar questions whose titles may make it seem like they're identical to this question, but it is clear from the contents of those questions that the OPs are interested in actually writing to the file after acquiring the lock, so this is a distinct question:
Check if a file is already locked using flock()?
How to check if a file is locked or not?
You cannot do this reliably. Processes are asynchronous: when you fail to acquire the lock, there is no guarantee that the file will still be locked by the time you print the locked status. Similarly, if you manage to acquire the lock, you then immediately release it, so by the time you print the unlocked status, the file may have been locked by another process. If there are a lot of contenders trying to lock this file, the likelihood of the status message being out of sync is high. Attackers can take advantage of this kind of approximation to penetrate systems.
If you were to rely on this check in a script to perform any kind of concurrent work, all bets are off. If it is just producing an informative status, you should use the past tense in the status messages:
if (flock(fileno(lockFile), LOCK_EX|LOCK_NB) == -1) {
    if (errno == EWOULDBLOCK) {
        printf("lock file was locked\n");
    } else {
        // error
    }
} else {
    flock(fileno(lockFile), LOCK_UN);
    printf("lock file was unlocked\n");
}
I don't see what's wrong with the approach of placing a lock on the file and immediately releasing it. In my opinion, you are doing it just as I would do it.
That said, there is another locking API in Unix: fcntl locks. See man fcntl on Linux. It has F_SETLK to acquire or release a lock, and F_GETLK to test whether a lock could be placed. The fcntl locks are slightly different from flock locks: they are advisory record locks placed on a region of the file, not on the whole file.
There is a third API too: lockf(3). You can use F_LOCK to lock a file, and F_TEST to test if the file region can be locked. The lockf(3) API has been implemented as a wrapper on top of fcntl(2) locking on Linux, but that may not be true on other operating systems.
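A sketch of the F_GETLK probe (the function name is made up). The same caveat as in the accepted answer applies: the result can be stale the moment it is returned, and on Linux F_GETLK only sees fcntl-style locks, not flock() locks on local files:

#include <fcntl.h>

/* Probe (without acquiring) whether a write lock could be placed on the
 * whole file. Returns 1 if another process holds a conflicting lock,
 * 0 if not, -1 on error. */
static int is_fcntl_locked(int fd)
{
    struct flock probe;
    probe.l_type = F_WRLCK;   /* the lock we would want to place */
    probe.l_whence = SEEK_SET;
    probe.l_start = 0;
    probe.l_len = 0;          /* zero length = the entire file */
    if (fcntl(fd, F_GETLK, &probe) == -1)
        return -1;
    /* on return, l_type is F_UNLCK if the lock could have been placed;
     * otherwise it describes a conflicting lock (and l_pid its holder) */
    return probe.l_type != F_UNLCK;
}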
Do not use flock(). It does not work reliably if the lock file directory happens to be a network filesystem (for example, NFS) and the OS you're using does not implement flock() using fcntl() advisory record locking.
(For example, in current Linux systems, flock() and fcntl() locks are separate and do not interact on local files, but do interact on files residing on NFS filesystems. It is not that strange to have /var/lock on an NFS filesystem in server clusters, especially failover and web server systems, so this is, in my opinion, a real issue you should consider.)
Edited to add: If for some external reason you are constrained to use flock(), you can use flock(fd, LOCK_EX|LOCK_NB) to try to obtain the exclusive lock. This call will never block (wait for the lock to be released), but will fail with -1 and errno == EWOULDBLOCK if the file is already locked. Similar to the fcntl() locking scheme explained in detail below, you try to obtain the exclusive lock (without blocking); if successful, you keep the lock file descriptor open, and let the operating system release the lock automatically when the process exits. If the nonblocking lock fails, you must choose whether you will abort, or proceed anyway.
You can accomplish your goals by using POSIX.1 functions and fcntl() advisory record locks (covering the entire file). The semantics are standard across all POSIXy systems, so this approach will work on all POSIXy and unix-like systems.
Features of fcntl() locks are simple, but nonintuitive. When any descriptor referring to the lock file is closed, the advisory locks on that file are released. When the process exits, the advisory locks on all open files are automatically released. Locks are maintained across an exec*(). Locks are not inherited via fork(), nor are they released in the parent (even when marked close-on-exec). (If the descriptors are close-on-exec, then they will be automatically closed in the child process. Otherwise the child process will have an open descriptor to the file, but not any fcntl() locks. Closing the descriptors in the child process will not affect the parent's lock on the file.)
Therefore the correct strategy is very simple: Open the lock file exactly once, and use fcntl(fd,F_SETLK,&lock) to place an exclusive all-file advisory lock without blocking: if there is a conflicting lock, it will fail immediately, instead of blocking until the lock can be acquired. Keep the descriptor open, and let the operating system auto-release the lock when your process exits.
For example:
#define _POSIX_C_SOURCE 200809L
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <errno.h>

/* Open and exclusive-lock file, creating it (-rw-------)
 * if necessary. If fdptr is not NULL, the descriptor is
 * saved there. The descriptor is never one of the standard
 * descriptors STDIN_FILENO, STDOUT_FILENO, or STDERR_FILENO.
 * If successful, the function returns 0.
 * Otherwise, the function returns nonzero errno:
 *     EINVAL: Invalid lock file path
 *     EMFILE: Too many open files
 *     EALREADY: Already locked
 * or one of the open(2)/creat(2) errors.
 */
static int lockfile(const char *const filepath, int *const fdptr)
{
    struct flock lock;
    int used = 0;  /* Bits 0 to 2: stdin, stdout, stderr */
    int fd;

    /* In case the caller is interested in the descriptor,
     * initialize it to -1 (invalid). */
    if (fdptr)
        *fdptr = -1;

    /* Invalid path? */
    if (filepath == NULL || *filepath == '\0')
        return errno = EINVAL;

    /* Open the file. */
    do {
        fd = open(filepath, O_RDWR | O_CREAT, 0600);
    } while (fd == -1 && errno == EINTR);
    if (fd == -1) {
        if (errno == EALREADY)
            errno = EIO;
        return errno;
    }

    /* Move fd away from the standard descriptors. */
    while (1)
        if (fd == STDIN_FILENO) {
            used |= 1;
            fd = dup(fd);
        } else
        if (fd == STDOUT_FILENO) {
            used |= 2;
            fd = dup(fd);
        } else
        if (fd == STDERR_FILENO) {
            used |= 4;
            fd = dup(fd);
        } else
            break;

    /* Close the standard descriptors we temporarily used. */
    if (used & 1)
        close(STDIN_FILENO);
    if (used & 2)
        close(STDOUT_FILENO);
    if (used & 4)
        close(STDERR_FILENO);

    /* Did we run out of descriptors? */
    if (fd == -1)
        return errno = EMFILE;

    /* Exclusive lock, cover the entire file (regardless of size). */
    lock.l_type = F_WRLCK;
    lock.l_whence = SEEK_SET;
    lock.l_start = 0;
    lock.l_len = 0;
    if (fcntl(fd, F_SETLK, &lock) == -1) {
        /* Lock failed. Close file and report locking failure. */
        close(fd);
        return errno = EALREADY;
    }

    /* Save descriptor, if the caller wants it. */
    if (fdptr)
        *fdptr = fd;

    return 0;
}
The reason the above makes sure it does not accidentally reuse a standard descriptor is because I've been bitten by it in a very rare case. (I wanted to exec a user-specified process while holding a lock, but redirecting the standard input and output to the currently controlling terminal.)
The use is very simple:
int result;

result = lockfile(YOUR_LOCKFILE_PATH, NULL);
if (result == 0) {
    /* Have an exclusive lock on YOUR_LOCKFILE_PATH */
} else
if (result == EALREADY) {
    /* YOUR_LOCKFILE_PATH is already locked by another process */
} else {
    /* Cannot lock YOUR_LOCKFILE_PATH, see strerror(result). */
}
Edited to add: I used internal linkage (static) for the above function just out of habit. If the lock file is user-specific, it should use ~/.yourapplication/lockfile; if it is system-wide, it should use e.g. /var/lock/yourapplication/lockfile. I have a habit of keeping the functions related to this kind of initialization stuff, including defining/building the lockfile path etc. as well automatic plugin registration function (using opendir()/readdir()/dlopen()/dlsym()/closedir()), in the same file; the lockfile function tends to be called internally (by the function that builds the lockfile path), and thus ends up having internal linkage.
Feel free to use, reuse, or modify the function as you wish; I consider it to be in public domain, or licensed under CC0 where public domain dedication is not possible.
The descriptor is "leaked" intentionally, so that it will be closed (and the lock on it released) by the operating system when the process exits, but not before.
If there is a lot of post-work cleanups your process does, during which you do wish to allow another copy of this process, you can retain the descriptor, and just close(thatfd) at the point where you wish to release the lock.
This question is based on:
When is it safe to destroy a pthread barrier?
and the recent glibc bug report:
http://sourceware.org/bugzilla/show_bug.cgi?id=12674
I'm not sure about the semaphores issue reported in glibc, but presumably it's supposed to be valid to destroy a barrier as soon as pthread_barrier_wait returns, as per the above linked question. (Normally, the thread that got PTHREAD_BARRIER_SERIAL_THREAD, or a "special" thread that already considered itself "responsible" for the barrier object, would be the one to destroy it.) The main use case I can think of is when a barrier is used to synchronize a new thread's use of data on the creating thread's stack, preventing the creating thread from returning until the new thread gets to use the data; other barriers probably have a lifetime equal to that of the whole program, or controlled by some other synchronization object.
In any case, how can an implementation ensure that destruction of the barrier (and possibly even unmapping of the memory it resides in) is safe as soon as pthread_barrier_wait returns in any thread? It seems the other threads that have not yet returned would need to examine at least some part of the barrier object to finish their work and return, much like how, in the glibc bug report cited above, sem_post has to examine the waiters count after having adjusted the semaphore value.
I'm going to take another crack at this with an example implementation of pthread_barrier_wait() that uses mutex and condition variable functionality as might be provided by a pthreads implementation. Note that this example doesn't try to deal with performance considerations (specifically, when the waiting threads are unblocked, they are all re-serialized when exiting the wait). I think that using something like Linux Futex objects could help with the performance issues, but Futexes are still pretty much out of my experience.
Also, I doubt that this example handles signals or errors correctly (if at all in the case of signals). But I think proper support for those things can be added as an exercise for the reader.
My main fear is that the example may have a race condition or deadlock (the mutex handling is more complex than I like). Also note that it is an example that hasn't even been compiled. Treat it as pseudo-code. Also keep in mind that my experience is mainly in Windows - I'm tackling this more as an educational opportunity than anything else. So the quality of the pseudo-code may well be pretty low.
However, disclaimers aside, I think it may give an idea of how the problem asked in the question could be handled (i.e., how the pthread_barrier_wait() function can allow the pthread_barrier_t object it uses to be destroyed by any of the released threads without danger of one or more threads using the barrier object on their way out).
Here goes:
/*
* Since this is a part of the implementation of the pthread API, it uses
* reserved names that start with "__" for internal structures and functions
*
* Functions such as __mutex_lock() and __cond_wait() perform the same function
* as the corresponding pthread API.
*/
// struct __barrier_waitdata is intended to hold all the data
// that `pthread_barrier_wait()` will need after releasing
// waiting threads. This will allow the function to avoid
// touching the passed in pthread_barrier_t object after
// the wait is satisfied (since any of the released threads
// can destroy it)
struct __barrier_waitdata {
    struct __mutex cond_mutex;
    struct __cond cond;
    unsigned waiter_count;
    int wait_complete;
};

struct __barrier {
    unsigned count;
    struct __mutex waitdata_mutex;
    struct __barrier_waitdata* pwaitdata;
};

typedef struct __barrier pthread_barrier_t;
int __barrier_waitdata_init( struct __barrier_waitdata* pwaitdata)
{
    int rc;

    pwaitdata->waiter_count = 0;
    pwaitdata->wait_complete = 0;

    rc = __mutex_init( &pwaitdata->cond_mutex, NULL);
    if (rc) {
        return rc;
    }

    rc = __cond_init( &pwaitdata->cond, NULL);
    if (rc) {
        __mutex_destroy( &pwaitdata->cond_mutex);
        return rc;
    }

    return 0;
}
int pthread_barrier_init(pthread_barrier_t *barrier, const pthread_barrierattr_t *attr, unsigned int count)
{
    int rc;

    rc = __mutex_init( &barrier->waitdata_mutex, NULL);
    if (rc) return rc;

    barrier->pwaitdata = NULL;
    barrier->count = count;

    //TODO: deal with attr

    return 0;
}
int pthread_barrier_wait(pthread_barrier_t *barrier)
{
    int rc;
    struct __barrier_waitdata* pwaitdata;
    unsigned target_count;

    // potential waitdata block (only one thread's will actually be used)
    struct __barrier_waitdata waitdata;

    // nothing to do if we only need to wait for one thread...
    if (barrier->count == 1) return PTHREAD_BARRIER_SERIAL_THREAD;

    rc = __mutex_lock( &barrier->waitdata_mutex);
    if (rc) return rc;

    if (!barrier->pwaitdata) {
        // no other thread has claimed the waitdata block yet -
        // we'll use this thread's
        rc = __barrier_waitdata_init( &waitdata);
        if (rc) {
            __mutex_unlock( &barrier->waitdata_mutex);
            return rc;
        }

        barrier->pwaitdata = &waitdata;
    }

    pwaitdata = barrier->pwaitdata;
    target_count = barrier->count;

    // all data necessary for handling the return from a wait is pointed to
    // by `pwaitdata`, and `pwaitdata` points to a block of data on the stack of
    // one of the waiting threads. We have to make sure that the thread that owns
    // that block waits until all others have finished with the information
    // pointed to by `pwaitdata` before it returns. However, after the 'big' wait
    // is completed, the `pthread_barrier_t` object that's passed into this
    // function isn't used. The last operation done to `*barrier` is to set
    // `barrier->pwaitdata = NULL` to satisfy the requirement that this function
    // leaves `*barrier` in a state as if `pthread_barrier_init()` had been
    // called - and that operation is done by the thread that signals the wait
    // condition completion before the completion is signaled.

    // note: we're still holding `barrier->waitdata_mutex`;

    rc = __mutex_lock( &pwaitdata->cond_mutex);
    pwaitdata->waiter_count += 1;

    if (pwaitdata->waiter_count < target_count) {
        // need to wait for other threads
        __mutex_unlock( &barrier->waitdata_mutex);
        do {
            // TODO: handle the return code from `__cond_wait()` to break out
            // of this if a signal makes that necessary
            __cond_wait( &pwaitdata->cond, &pwaitdata->cond_mutex);
        } while (!pwaitdata->wait_complete);
    }
    else {
        // this thread satisfies the wait - unblock all the other waiters
        pwaitdata->wait_complete = 1;

        // 'release' our use of the passed in pthread_barrier_t object
        barrier->pwaitdata = NULL;

        // unlock the barrier's waitdata_mutex - the barrier is
        // ready for use by another set of threads
        __mutex_unlock( &barrier->waitdata_mutex);

        // finally, unblock the waiting threads
        __cond_broadcast( &pwaitdata->cond);
    }

    // at this point, barrier->waitdata_mutex is unlocked, the
    // barrier->pwaitdata pointer has been cleared, and no further
    // use of `*barrier` is permitted...

    // however, each thread still has a valid `pwaitdata` pointer - the
    // thread that owns that block needs to wait until all others have
    // dropped the pwaitdata->waiter_count

    // also, at this point the `pwaitdata->cond_mutex` is locked, so
    // we're in a critical section

    rc = 0;
    pwaitdata->waiter_count--;
    if (pwaitdata == &waitdata) {
        // this thread owns the waitdata block - it needs to hang around until
        // all other threads are done

        // as a convenience, this thread will be the one that returns
        // PTHREAD_BARRIER_SERIAL_THREAD
        rc = PTHREAD_BARRIER_SERIAL_THREAD;

        while (pwaitdata->waiter_count != 0) {
            __cond_wait( &pwaitdata->cond, &pwaitdata->cond_mutex);
        }

        __mutex_unlock( &pwaitdata->cond_mutex);
        __cond_destroy( &pwaitdata->cond);
        __mutex_destroy( &pwaitdata->cond_mutex);
    }
    else {
        if (pwaitdata->waiter_count == 0) {
            // we're the last non-owning waiter out - wake the owning thread
            __cond_signal( &pwaitdata->cond);
        }
        // every non-owning thread must release the mutex on its way out
        __mutex_unlock( &pwaitdata->cond_mutex);
    }

    return rc;
}
17 July 2011: Update in response to a comment/question about process-shared barriers
I forgot completely about the situation with barriers that are shared between processes. And as you mention, the idea I outlined will fail horribly in that case. I don't really have experience with POSIX shared memory use, so any suggestions I make should be tempered with scepticism.
To summarize (for my benefit, if no one else's):
When any of the threads gets control after pthread_barrier_wait() returns, the barrier object needs to be in the 'init' state (however the most recent pthread_barrier_init() on that object set it). Also implied by the API is that once any of the threads returns, one or more of the following things could occur:
another call to pthread_barrier_wait() to start a new round of synchronization of threads
pthread_barrier_destroy() on the barrier object
the memory allocated for the barrier object could be freed or unshared if it's in a shared memory region.
These things mean that before the pthread_barrier_wait() call allows any thread to return, it pretty much needs to ensure that all waiting threads are no longer using the barrier object in the context of that call. My first answer addressed this by creating a 'local' set of synchronization objects (a mutex and an associated condition variable) outside of the barrier object that would block all the threads. These local synchronization objects were allocated on the stack of the thread that happened to call pthread_barrier_wait() first.
I think that something similar would need to be done for barriers that are process-shared. However, in that case simply allocating those sync objects on a thread's stack isn't adequate (since the other processes would have no access). For a process-shared barrier, those objects would have to be allocated in process-shared memory. I think the technique I listed above could be applied similarly:
the waitdata_mutex that controls the 'allocation' of the local sync variables (the waitdata block) would be in process-shared memory already by virtue of it being in the barrier struct. Of course, when the barrier is set to PTHREAD_PROCESS_SHARED, that attribute would also need to be applied to the waitdata_mutex
when __barrier_waitdata_init() is called to initialize the local mutex & condition variable, it would have to allocate those objects in shared memory instead of simply using the stack-based waitdata variable.
when the 'cleanup' thread destroys the mutex and the condition variable in the waitdata block, it would also need to clean up the process-shared memory allocation for the block.
in the case where shared memory is used, there needs to be some mechanism to ensure that the shared memory object is opened at least once in each process, and closed the correct number of times in each process (but not closed entirely before every thread in the process is finished using it). I haven't thought through exactly how that would be done...
I think these changes would allow the scheme to operate with process-shared barriers. The last bullet point above is a key item to figure out. Another is how to construct a name for the shared memory object that will hold the 'local' process-shared waitdata. There are certain attributes you'd want for that name:
you'd want the storage for the name to reside in the struct pthread_barrier_t structure so all process have access to it; that means a known limit to the length of the name
you'd want the name to be unique to each 'instance' of a set of calls to pthread_barrier_wait(), because it might be possible for a second round of waiting to start before all threads have gotten all the way out of the first round of waiting (so the process-shared memory block set up for the waitdata might not have been freed yet). So the name probably has to be based on things like process id, thread id, address of the barrier object, and an atomic counter.
I don't know whether or not there are security implications to having the name be 'guessable'. If so, some randomization needs to be added - no idea how much. Maybe you'd also need to hash the data mentioned above along with the random bits. Like I said, I really have no idea if this is important or not.
As far as I can see, there is no need for pthread_barrier_destroy to be an immediate operation. You could have it wait until all threads that are still in their wakeup phase are woken up.
E.g., you could have an atomic counter awakening that is initially set to the number of threads being woken up. It would then be decremented as the last action before pthread_barrier_wait returns, and pthread_barrier_destroy could just spin until that counter falls to 0.
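A minimal sketch of that idea using C11 atomics; the names here are made up, and everything except the counter is elided:

#include <sched.h>
#include <stdatomic.h>

struct barrier {
    atomic_uint awakening;  /* threads still in their wakeup phase */
    /* ... the rest of the barrier's state ... */
};

/* The wait path sets `awakening` to the thread count when the barrier
 * trips; each waking thread decrements it as its very last action: */
static inline void wakeup_done(struct barrier *b)
{
    atomic_fetch_sub_explicit(&b->awakening, 1, memory_order_release);
}

/* destroy() then just spins until every waker has left: */
static void barrier_destroy(struct barrier *b)
{
    while (atomic_load_explicit(&b->awakening, memory_order_acquire) != 0)
        sched_yield();  /* be polite while spinning */
    /* ... free the rest of the barrier's resources ... */
}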
Is there a function analogous to IsBadReadPtr in Unix, or at least something providing some of its functionality?
I want to write a procedure that would react if something bad happens to a process (like SIGSEGV) and recover some information. But I want to check the pointers to make sure that the data is not corrupt and that they can be accessed safely. Otherwise the crash-handling procedure itself will crash, thus becoming useless.
Any suggestions?
The usual way to do this on POSIX systems is to use the write() system call. It will return EFAULT in errno rather than raising a signal if the memory cannot be read:
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

int nullfd = open("/dev/random", O_WRONLY);
if (write(nullfd, pointer, size) < 0)
{
    /* Not OK: errno == EFAULT means the memory cannot be read */
}
close(nullfd);
(/dev/random is a good device to use for this on Linux, because it can be written by any user and will actually try to read the memory given. On OSes without /dev/random or where it isn't writeable, try /dev/null). Another alternative would be an anonymous pipe, but if you want to test a large region you'll need to regularly clear the reading end of the pipe.
How can you do it?
You try to do it and then handle the error.
To do this, first you set up a sigsetjmp and a SIGSEGV signal handler. Then attempt to use the pointer. If it was a bad pointer then the SIGSEGV handler is called and you can jump to safety and report the error to the user.
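A sketch of that technique; note that longjmp-ing out of a handler is only well-defined for synchronous signals like this one, and the probe is not thread-safe as written:

#include <setjmp.h>
#include <signal.h>
#include <string.h>

static sigjmp_buf probe_env;

static void segv_handler(int sig)
{
    (void)sig;
    siglongjmp(probe_env, 1);
}

/* Returns 1 if `p` could be read, 0 if reading it faulted. */
static int pointer_readable(const volatile char *p)
{
    struct sigaction sa, old;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = segv_handler;
    sigaction(SIGSEGV, &sa, &old);

    int ok = 0;
    if (sigsetjmp(probe_env, 1) == 0) {
        (void)*p;  /* the actual probe: may fault */
        ok = 1;
    }
    sigaction(SIGSEGV, &old, NULL);
    return ok;
}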
You can never tell "whether a pointer can be accessed safely", on Windows or on Unix. But for some similar information on some unix platforms, check out cat /proc/self/maps.
I ran into the same issue trying to read a 'pixel' from a framebuffer while running Ubuntu inside VirtualBox. There seemed to be no secure way to check access without crashing or actually hanging gdb. The suggestion made by StasM pointed me towards the following working 'local' solution using fork.
#include <stdbool.h>
#include <sys/wait.h>
#include <unistd.h>

void *some_address;  /* the pointer to probe */

int pid = fork();
if (pid == 0)
{
    /* child: touch the memory; a bad pointer crashes only the child */
    volatile char probe = *(volatile char *)some_address;
    (void)probe;
    _exit(123);
}

bool access_ok = true;
int status;
int result = waitpid(pid, &status, 0);
if (result == -1 || WIFEXITED(status) == 0 || WEXITSTATUS(status) != 123)
{
    access_ok = false;
}