shm_open: Differences between Mac and Linux - c

I have a queue in shared memory. It does work on Linux (kernel 4.3.4), but not on Mac OS X. Are there any differences between how Mac OS X handles shared memory and how linux does, which may explain this?
I get the shared memory via:
int sh_fd = shm_open(shmName, O_RDWR | O_CREAT,
S_IROTH | S_IWOTH // others hav read/write permission
| S_IRUSR | S_IWUSR // I have read/write permission
);
// bring the shared memory to the desired size
ftruncate(sh_fd, getpagesize());
The queue is very simple as well. Here is the basic struct:
typedef struct {
// this is to check whether the queue is initialized.
// on linux, this will be 0 initially
bool isInitialized;
// mutex to protect concurrent access
pthread_mutex_t access;
// condition for the reader, readers should wait here
pthread_cond_t reader;
// condition for the writer, writers should wait here
pthread_cond_t writer;
// whether the queue can still be used.
bool isOpen;
// maximum capacity of the queue.
int32_t capacity;
// current position of the reader and number of items.
int32_t readPos, items;
// entries in the queue. The array actually is longer, which means it uses the space behind the struct.
entry entries[1];
} shared_queue;
Basically everyone who wants access acquires the mutex, readPos indicates where the next value should be read (incrementing readPos afterwards), (readPos+items) % capacity is where new items go. The only somewhat fancy trick is the isInitialized byte. ftruncate fills the shared memory with zeros if it had length 0 before, so I rely on isInitiualized to be zero on a fresh shared memory page and write a 1 there as soon as I initialize the struct.
As I said, it works on Linux, so I don't think it is a simple implementation bug. Is there any subtle difference between shm_open on Mac vs. Linux which I may not be aware of? The bug I see looks like the reader tries to read from an empty queue, so, maybe the pthread mutex/condition does not work on shared memory in a Mac?

The problem is that PTHREAD_PROCESS_SHARED is not supported on mac.
http://alesteska.blogspot.de/2012/08/pthreadprocessshared-not-supported-on.html

You must set PTHREAD_PROCESS_SHARED on both the mutex and condition variables.
So for a mutex:
pthread_mutexattr_t mutex_attr;
pthread_mutex_t the_mutex;
pthread_mutexattr_init(&mutex_attr);
pthread_mutexattr_setpshared(&mutex_attr, PTHREAD_PROCESS_SHARED);
pthread_mutexattr(&the_mutex, &mutex_attr);
Basically the same steps for the condition variables, but replace mutexattr with condattr.
If the the pthread_*attr_setpshared functions don't exist or return an error, then it may not be supported on your platform.
To be on the safe side, you might want to set PTHREAD_MUTEX_ROBUST if supported. This will prevent deadlock over the mutex (though not guarantee queue consistency) if a process exits while holding the lock.
EDIT: As an added caution, having a boolean "is initialized" flag is an insufficient plan on its own. You need more than that to really guarantee only one process can initialize the structure. At the very least you need to do:
// O_EXCL means this fails if not the first one here
fd = shm_open(name, otherFlags | O_CREAT | O_EXCL );
if( fd != -1 )
{
// initialize here
// Notify everybody the mutex has been initialized.
}
else
{
fd = shm_open(name, otherFlags ); // NO O_CREAT
// magically somehow wait until queue is initialized.
}
Are you sure really need to roll your own queue? Will POSIX message queues (see mq_open man page) do the job? If not, what about one of many messaging middleware solutions out there?
Update 2016-Feb-10: Possible mkfifo based solution
One alternative to implementing your own queue in shared memory is to use an OS provided named FIFO using mkfifo. A key difference between a FIFO and a named pipe is that you are allowed to have multiple simultaneous readers and writers.
A "catch" to this, is that the reader sees end-of-file when the last writer exits, so if you want readers to go indefinitely, you may need to open a dummy write handle.
FIFOs are super easy to use on the command line, like so:
reader.sh
mkfifo my_queue
cat my_queue
write.sh
echo "hello world" > my_queue
Or slightly more effort in C:
reader.c
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <errno.h>
int main(int argc, char**argv)
{
FILE * fifo;
FILE * wfifo;
int res;
char buf[1024];
char * linePtr;
/* Try to create the queue. This may belong on reader or writer side
* depending on your setup. */
if( 0 != mkfifo("work_queue", S_IRUSR | S_IWUSR ) )
{
if( errno != EEXIST )
{
perror("mkfifo:");
return -1;
}
}
/* Get a read handle to the queue */
fifo = fopen("work_queue", "r");
/* Get a write handle to the queue */
wfifo = fopen("work_queue", "w");
if( !fifo )
{
perror("fopen: " );
return -1;
}
while(1)
{
/* pull a single message from the queue at a time */
linePtr = fgets(buf, sizeof(buf), fifo);
if( linePtr )
{
fprintf(stdout, "new command=%s\n", linePtr);
}
else
{
break;
}
}
return 0;
}
writer.c
#include <stdio.h>
#include <unistd.h>
int main(int argc, char**argv)
{
FILE * pipe = fopen("work_queue", "w");
unsigned int job = 0;
int my_pid = getpid();
while(1)
{
/* Write one 'entry' to the queue */
fprintf(pipe, "job %u from %d\n", ++job, my_pid);
}
}

Related

Linux select() not blocking

I'm trying to understand the difference between select() and poll() better. For this I tried to implement a simple program that will open a file as write-only, add its file descriptor to the read set and than execute select in hopes that the function will block until the read permission is granted.
As this didnt work (and as far as I understood, this is intended behaviour) I tried to block access to the file using flock before the select() executen. Still, the program did not block its execution.
My sample code is as follows:
#include <stdio.h>
#include <poll.h>
#include <sys/file.h>
#include <errno.h>
#include <sys/select.h>
int main(int argc, char **argv)
{
printf("[+] Select minimal example\n");
int max_number_fds = FOPEN_MAX;
int select_return;
int cnt_pollfds;
struct pollfd pfds_array[max_number_fds];
struct pollfd *pfds = pfds_array;
fd_set fds;
int fd_file = open("./poll_text.txt", O_WRONLY);
struct timeval tv;
tv.tv_sec = 10;
tv.tv_usec = 0;
printf("\t[+] Textfile fd: %d\n", fd_file);
//create and set fds set
FD_ZERO(&fds);
FD_SET(fd_file, &fds);
printf("[+] Locking file descriptor!\n");
if(flock(fd_file,LOCK_EX) == -1)
{
int error_nr = errno;
printf("\t[+] Errno: %d\n", error_nr);
}
printf("[+] Executing select()\n");
select_return = select(fd_file+1, &fds, NULL, NULL, &tv);
if(select_return == -1){
int error_nr = errno;
printf("[+] Select Errno: %d\n", error_nr);
}
printf("[+] Select return: %d\n", select_return);
}
Can anybody see my error in this code? Also: I first tried to execute this code with two FDs added to the read list. When trying to lock them I had to use flock(fd_file,LOCK_SH) as I cannot exclusively lock two FDs with LOCK_EX. Is there a difference on how to lock two FDs of the same file (compared to only one fd)
I'm also not sure why select will not block when a file, that is added to the Read-set is opened as Write-Only. The program can never (without a permission change) read data from the fd, so in my understanding select should block the execution, right?
As a clarification: My "problem" I want to solve is that I have to check if I'm able to replace existing select() calls with poll() (existing in terms of: i will not re-write the select() call code, but will have access to the arguments of select.). To check this, I wanted to implement a test that will force select to block its execution, so I can later check if poll will act the same way (when given similar instructions, i.e. the same FDs to check).
So my "workflow" would be: write tests for different select behaviors (i.e. block and not block), write similar tests for poll (also block, not block) and check if/how poll can be forced do exactly what select is doing.
Thank you for any hints!
When select tells you that a file descriptor is ready for reading, this doesn't necessarily mean that you can read data. It only means that a read call will not block. A read call will also not block when it returns an EOF or error condition.
In your case I expect that read will immediately return -1 and set errno to EBADF (fd is not a valid file descriptor or is not open for reading) or maybe EINVAL (fd is attached to an object which is unsuitable for reading...)
Edit: Additional information as requested in a comment:
A file can be in a blocking state if a physical operation is needed that will take some time, e.g. if the read buffer is empty and (new) data has to be read from the disk, if the file is connected to a terminal and the user has not yet entered any (more) data or if the file is a socket or a pipe and a read would have to wait for (new) data to arrive...
The same applies for write: If the send buffer is full, a write will block. If the remaining space in the send buffer is smaller than your amount of data, it may write only the part that currently fits into the buffer.
If you set a file to non-blocking mode, a read or write will not block but tell you that it would block.
If you want to have a blocking situation for testing purposes, you need control over the process or hardware that provides or consumes the data. I suggest to use read from a terminal (stdin) when you don't enter any data or from a pipe where the writing process does not write any data. You can also fill the write buffer on a pipe when the reading process does not read from it.

Write atomically to a file using Write() with snprintf()

I want to be able to write atomically to a file, I am trying to use the write() function since it seems to grant atomic writes in most linux/unix systems.
Since I have variable string lengths and multiple printf's, I was told to use snprintf() and pass it as an argument to the write function in order to be able to do this properly, upon reading the documentation of this function I did a test implementation as below:
int file = open("file.txt", O_CREAT | O_WRONLY);
if(file < 0)
perror("Error:");
char buf[200] = "";
int numbytes = snprintf(buf, sizeof(buf), "Example string %s" stringvariable);
write(file, buf, numbytes);
From my tests it seems to have worked but my question is if this is the most correct way to implement it since I am creating a rather large buffer (something I am 100% sure will fit all my printfs) to store it before passing to write.
No, write() is not atomic, not even when it writes all of the data supplied in a single call.
Use advisory record locking (fcntl(fd, F_SETLKW, &lock)) in all readers and writers to achieve atomic file updates.
fcntl()-based record locks work over NFS on both Linux and BSDs; flock()-based file locks may not, depending on system and kernel version. (If NFS locking is disabled like it is on some web hosting services, no locking will be reliable.) Just initialize the struct flock with .l_whence = SEEK_SET, .l_start = 0, .l_len = 0 to refer to the entire file.
Use asprintf() to print to a dynamically allocated buffer:
char *buffer = NULL;
int length;
length = asprintf(&buffer, ...);
if (length == -1) {
/* Out of memory */
}
/* ... Have buffer and length ... */
free(buffer);
After adding the locking, do wrap your write() in a loop:
{
const char *p = (const char *)buffer;
const char *const q = (const char *)buffer + length;
ssize_t n;
while (p < q) {
n = write(fd, p, (size_t)(q - p));
if (n > 0)
p += n;
else
if (n != -1) {
/* Write error / kernel bug! */
} else
if (errno != EINTR) {
/* Error! Details in errno */
}
}
}
Although there are some local filesystems that guarantee write() does not return a short count unless you run out of storage space, not all do; especially not the networked ones. Using a loop like above lets your program work even on such filesystems. It's not too much code to add for reliable and robust operation, in my opinion.
In Linux, you can take a write lease on a file to exclude any other process opening that file for a while.
Essentially, you cannot block a file open, but you can delay it for up to /proc/sys/fs/lease-break-time seconds, typically 45 seconds. The lease is granted only when no other process has the file open, and if any other process tries to open the file, the lease owner gets a signal. (If the lease owner does not release the lease, for example by closing the file, the kernel will automagically break the lease after the lease-break-time is up.)
Unfortunately, these only work in Linux, and only on local files, so they are of limited use.
If readers do not keep the file open, but open, read, and close it every time they read it, you can write a full replacement file (must be on the same filesystem; I recommend using a lock-subdirectory for this), and hard-link it over the old file.
All readers will see either the old file or the new file, but those that keep their file open, will never see any changes.

read on many real file descriptors

Working on a Linux (Ubuntu) application. I need to read many files in a non-blocking fashion. Unfortunately epoll doesn't support real file descriptor (file descriptor from file), it does support file descriptor that's network socket. select does work on real file descriptors, but it has two drawbacks, 1) it's slow, linearly go through all the file descriptors that are set, 2) it's limited, it typically won't allow more than 1024 file descriptors.
I can change each file descriptors to be non-blocking and use non-blocking "read" to poll, but it's very expensive especially when there are a large number of file descriptors.
What are the options here?
Thanks.
Update 1
The use case here is to create some sort of file server, with many clients requesting for files, serve them in a non-blocking fashion. Due to network side implementation (not standard TCP/IP stack), can't use sendfile().
You could use multiple select calls combined with either threading or forking. This would reduce the number of FD_ISSET calls per select set.
Perhaps you can provide more details about your use-case. It sounds like you are using select to monitor file changes, which doesn't work as you would expect with regular files. Perhaps you are simply looking for flock
You could use Asynchronous IO on Linux. The relevant AIO manpages (all in section 3) appear to have quite a bit of information. I think that aio_read() would probably be the most useful for you.
Here's some code that I believe you should be able to adapt for your usage:
...
#define _GNU_SOURCE
#include <aio.h>
#include <unistd.h>
typedef struct {
struct aiocb *aio;
connection_data *conn;
} cb_data;
void callback (union sigval u) {
// recover file related data prior to freeing
cb_data data = u.sival_ptr;
int fd = data->aio->aio_fildes;
uint8_t *buffer = data->aio->aio_buf;
size_t len = data->aio->aio_nbytes;
free (data->aio);
// recover connection data pointer then free
connection_data *conn = data->conn;
free (data);
...
// finish handling request
...
return;
}
...
int main (int argc, char **argv) {
// initial setup
...
// setup aio for optimal performance
struct aioinit ainit = { 0 };
// online background threads
ainit.aio_threads = sysconf (_SC_NPROCESSORS_ONLN) * 4;
// use defaults if using few core system
ainit.aio_threads = (ainit.aio_threads > 20 ? ainit.aio_threads : 20)
// set num to the maximum number of likely simultaneous requests
ainit.aio_num = 4096;
ainit.aio_idle_time = 5;
aio_init (&ainit);
...
// handle incoming requests
int exit = 0;
while (!exit) {
...
// the [asynchronous] fun begins
struct aiocb *cb = calloc (1, sizeof (struct aiocb));
if (!cb)
// handle OOM error
cb->aio_fildes = file_fd;
cb->aio_offset = 0; // assuming you want to send the entire file
cb->aio_buf = malloc (file_len);
if (!cb->aio_buf)
// handle OOM error
cb->aio_nbytes = file_len;
// execute the callback in a separate thread
cb->aio_sigevent.sigev_notify = SIGEV_THREAD;
cb_data *data = malloc (sizeof (cb_data));
if (!data)
// handle OOM error
data->aio = cb; // so we can free() later
// whatever you need to finish handling the request
data->conn = connection_data;
cb->aio_sigevent.sigev_value.sival_ptr = data; // passed to callback
cb->aio_sigevent.sigev_notify_function = callback;
if ((err = aio_read (cb))) // and you're done!
// handle aio error
// move on to next connection
}
...
return 0;
}
This will result in you no longer having to wait on files being read in your main thread. Of course, you can create more performant systems using AIO, but those are naturally likely to be more complex and this should work for a basic use case.

Shared memory is not so shared with processes in C

While trying to resolve some debugging issues , I added some printf-s to my code :
I used that code :
struct PipeShm
{
int init;
sem_t sema;
...
...
}
struct PipeShm * sharedPipe = NULL;
func2:
int func2()
{
if (!sharedPipe)
{
int myFd = shm_open ("/myregion", O_CREAT | O_TRUNC | O_RDWR, 0666);
if (myFd == -1)
error_out ("shm_open");
// allocate some memory in the region in the size of the struct
int retAlloc = ftruncate (myFd, sizeof * sharedPipe);
if (retAlloc < 0) // check if allocation failed
error_out("ftruncate");
// map the region and shared in with all the processes
sharedPipe = mmap (NULL, sizeof * sharedPipe,PROT_READ | PROT_WRITE,MAP_SHARED , myFd, 0);
if (sharedPipe == MAP_FAILED) // check if the allocation failed
error_out("mmap");
// put initial value
int value = -10;
// get the value of the semaphore
sem_getvalue(&sharedPipe->semaphore, &value);
if (sharedPipe->init != TRUE) // get in here only if init is NOT TRUE !
{
if (!sem_init (&sharedPipe->semaphore, 1, 1)) // initialize the semaphore to 0
{
sharedPipe->init = TRUE;
sharedPipe->flag = FALSE;
sharedPipe->ptr1 = NULL;
sharedPipe->ptr2 = NULL;
sharedPipe->status1 = -10;
sharedPipe->status2 = -10;
sharedPipe->semaphoreFlag = FALSE;
sharedPipe->currentPipeIndex = 0;
printf("\nI'm inside the critical section! my init is: %d\n" , sharedPipe->init);
}
else
perror ("shm_pipe_init");
printf("\nI'm out the critical section! my init is: %d\n" , sharedPipe->init);
}
}
return 1; // always successful
}
With that main :
int main()
{
int spd, pid, rb;
char buff[4096];
fork();
func2();
return 0;
}
And got this :
shm_pipe_mkfifo: File exists
I'm inside the critical section! my init is: 1
I'm out the critical section! my init is: 1
Output:hello world!
I'm inside the critical section! my init is: 1
I'm out the critical section! my init is: 1
It seems that the shared memory is not so shared , why ?
The segment is shared between all processes due to MAP_SHARED | MAP_ANONYMOUS , so why both processes have the same before and after values ?
It seems that each process has its own semaphore even though it was shared between them , so what went wrong ?
Thanks
Since you use the MAP_ANONYMOUS flag to mmap, the myFd argument is ignored, and you create two independent shared memory chunks, one in each process, which have no relation to each other.
MAP_ANONYMOUS
The mapping is not backed by any file; its contents are initialā€
ized to zero. The fd and offset arguments are ignored; however,
some implementations require fd to be -1 if MAP_ANONYMOUS (or
MAP_ANON) is specified, and portable applications should ensure
this. The use of MAP_ANONYMOUS in conjunction with MAP_SHARED
is only supported on Linux since kernel 2.4.
If you get rid of MAP_ANONYMOUS you'll then only have one shared memory chunk, but you then have the problem of not calling sem_init. On Linux with NPTL it will actually work, as clearing a sem_t to all 0 bytes (the initial state here) is equivalent to sem_init(&sema, anything, 0); (NPTL ignores the pshared flag), but that's not portable to other systems.
Per Karoly's comment on another answer, there's also a race condition due O_TRUNC on the open call. If the second thread calls open after the first thread has already started modifying the semaphore, that TRUNC will clobber the semaphore state. Probably the best solution is to move the code creating, opening, and mmaping the shared memory to a different function that is called BEFORE calling fork.
edit
To fix the O_TRUNC problem, you can't have more than one process calling shm_open with O_TRUNC. But if you just get rid of the O_TRUNC, then you have the startup problem that if the shared memory object already exists (from a previous run of the program), it may not be in a predictable state. On possibility is to split off the beginning of func2:
main() {
func1();
fork();
func2();
}
func1() {
int myFd = shm_open ("/myregion", O_CREAT | O_TRUNC | O_RDWR, 0666);
if (myFd == -1)
error_out ("shm_open");
// allocate some memory in the region in the size of the struct
int retAlloc = ftruncate (myFd, sizeof *sharedPipe);
if (retAlloc < 0) // check if allocation failed
error_out("ftruncate");
// map the region and shared in with all the processes
sharedPipe = mmap (NULL, sizeof *sharedPipe, PROT_READ|PROT_WRITE, MAP_SHARED, myFd, 0);
if (sharedPipe == MAP_FAILED) // check if the allocation failed
error_out("mmap");
}
func2() {
// put initial value
int value = -10;
// get the value of the semaphore
sem_getvalue(&sharedPipe->semaphore, &value);
:
Alternately you could keep the same code (just get rid of O_TRUNC) and add a cleanup before the fork:
main() {
shm_unlink("/myregion");
fork();
func2();
In all cases you'll still have problem if you run multiple copies of your program at the same time.
A few thoughts...
I think this is a fundemental misunderstanding of how POSIX semaphores work. I don't see a call to sem_init or sem_open. You shouldn't be able to use them across processes without doing so much more explicitly than you've done.
I'm not so fresh on the implementation of mmap on Linux and how MAP_ANONYMOUS might affect this, but in general writes to mapped regions can't really be instantaneous. The manpage on linux.die says:
MAP_SHARED
Share this mapping. Updates to the mapping are visible to other processes that map this file, and are carried through to the underlying file. The file may not actually be updated until msync(2) or munmap() is called.
The reason for this is that your memory access gets trapped in a page fault and at that point the kernel will fill contents from the file descriptor, then let you do your write in RAM, then at some later point the kernel will flush back to the file descriptor.

Initializing shared memory safely

I have several processes communicating with each through POSIX shared memory on OS X.
My issue is these processes could spawn in any order, and try to initialize the shared memory segment at the same time.
I tried using advisory locks with fcntl and flock but both fail telling me I'm passing an invalid file descriptor (I'm positive the file descriptor is not invalid). So clearly that's out of the picture.
Are there any alternatives to this? Or is there any details about using locks with shared memory that I'm not aware of?
Edit:
My attempt at using locks looks like this:
// Some declarations...
struct Queue {
int index[ENTRIES_PER_QUEUE];
sem_t lock;
sem_t readWait;
sem_t writeSem;
struct Entry slots[ENTRIES_PER_QUEUE];
};
struct ipc_t {
int fd;
char name[512];
struct Queue* queue;
};
ipc_t ipc_create(const char* name, int owner) {
int isInited = 1;
struct Queue* queue;
struct flock lock = {
.l_type = F_WRLCK,
.l_whence = SEEK_SET,
.l_start = 0,
.l_len = 0
};
ipc_t conn = malloc(sizeof(struct ipc_t));
sprintf(conn->name, "/arqvenger_%s", name);
conn->fd = shm_open(conn->name, O_CREAT | O_RDWR, 0666);
if (conn->fd == -1) {
free(conn);
perror("shm_open failed");
return NULL;
}
if (fcntl(conn->fd, F_SETLKW, &lock) == -1) {
perror("Tanked...");
}
// Do stuff with the lock & release it
The output I get is:
Tanked...: Bad file descriptor
A common technique is to first call shm_open with O_CREAT|O_EXCL. This will succeed for only one process that then has to do the setup. The others then would have to do the open as before and wait a bit, probably polling, that the setup is finished.
Edit: To show how this could work as discussed in the comments.
struct head {
unsigned volatile flag;
pthread_mutex_t mut;
};
void * addr = 0;
/* try shm_open with exclusive, and then */
if (/* we create the segment */) {
addr = mmap(something);
struct head* h = addr;
pthread_mutex_init(&h->mut, aSharedAttr);
pthread_mutex_lock(&h->mut);
h->flag = 1;
/* do the rest of the initialization, and then */
pthread_mutex_unlock(&h->mut);
} else {
/* retry shm_open without exclusive, and then */
addr = mmap(something);
struct head* h = addr;
/* initialy flag is guaranteed to be 0 */
/* this will break out of the loop whence the new value is written to flag */
while (!h->flag) sched_yield();
pthread_mutex_lock(&h->mut);
pthread_mutex_unlock(&h->mut);
}
I have been able to reproduce the problem. Then I found a sad notice in the standard:
When the file descriptor fildes refers to a shared memory object, the
behavior of fcntl() shall be the same as for a regular file except the
effect of the following values for the argument cmd shall be
unspecified: F_SETFL, F_GETLK, F_SETLK, and F_SETLKW.
So it's likely it's not yet supported. For such a simple setup I would use a semaphore for signaling "the memory is ready".
EDIT
It seems I need to mention semaphores can be created using "O_EXCL" (so there are no races).

Resources