Load shared memory in C

So I'm writing a Load.c file that will basically load a bunch of "students" into shared memory.
The students are stored in a struct that looks like this:
struct StudentInfo{
char fName[20];
char lName[20];
char telNumber[15];
char whoModified[10];
};
Anyway, I need to load this into shared memory. We were given some sample code, and we are reading the data from a file that looks like this:
John Blakeman
111223333
560 Southbrook Dr. Lexington, KY 40522
8591112222
Paul S Blair
111223344
3197 Trinity Rd. Lexington, KY 40533
etc....
Here's my idea for the code (header.h just has the struct definition and the semaphore count; I'm unsure what the count needs to be, so right now it's set to 5):
#include <stdio.h>
#include <stdlib.h> /* exit() */
#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/sem.h>
#include "header.h"
int main(void)
{
int i,id;
struct StudentInfo *infoptr;
int sema_set;
id = shmget(KEY, SEGSIZE,IPC_CREAT|0666); /* get shared memory to store data */
if (id < 0){
perror("create: shmget failed");
exit(1);
}
infoptr=(struct StudentInfo *)shmat(id,0,0); /* attach the shared memory segment to the process's address space */
if(infoptr == (struct StudentInfo *) -1) { /* shmat returns (void *) -1 on failure, not NULL */
perror("create: shmat failed");
exit(2);
}
/* store data in shared memory segment */
//here's where I need help
That's about as far as I got. Now, I know I can store data using strcpy(infoptr->fName,"Joe"); (for example),
but I need to read some number X of names. How would I go about storing them? Using some sort of push/pop vector of structs? What would that look like?
And I assume I adjust the semaphores based on how many "entries" there are? I'm a little confused about how to choose the number of semaphores.
Oh BTW here's my header file just in case (SSN's are fake obviously)
/* header.h */
#define KEY ((key_t)(10101)) /* Change to Last 5 digits of SSN */
#define SEGSIZE sizeof(struct StudentInfo)
#define NUM_SEMAPHS 5
#define SEMA_KEY ((key_t) (1010)) /* Last 4 digits of SSN */
struct StudentInfo{
char fName[20];
char lName[20];
char telNumber[15];
char whoModified[10];
};
void Wait(int semaph, int n);
void Signal(int semaph, int n);
int GetSemaphs(key_t k, int n);

Uhm... I'm not sure here, but this is what I understood:
You're loading struct StudentInfo blocks into a shared memory space, and you want to be able to access it from other processes?
First, consider that your structure is fixed-size. If you want to store 10 names, you need sizeof(struct StudentInfo) * 10 bytes; if you want 400, make that * 400. So you don't need to push and pop your students from any kind of queue, since you can just use math to calculate from where and up to where you need to read. Getting students 10 to 20 is just reading the shared memory space from byte offset sizeof(struct StudentInfo) * 10 up to sizeof(struct StudentInfo) * 20. Note that SEGSIZE in your header is sizeof(struct StudentInfo), which is room for only one student, so it has to be multiplied by the number of records you want to hold.
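To make the sizing concrete, here is a minimal sketch of the loading step. It assumes each student takes four lines in the data file (name, ID, address, phone), that the first and last words of the name line are the first and last names, and that no line is longer than 127 characters. segment_size(), read_line() and load_students() are hypothetical helper names, not part of the sample code you were given.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct StudentInfo {
    char fName[20];
    char lName[20];
    char telNumber[15];
    char whoModified[10];
};

/* Bytes needed for `count` records; this is what shmget() should be
 * asked for instead of sizeof(struct StudentInfo) alone. */
size_t segment_size(size_t count)
{
    return count * sizeof(struct StudentInfo);
}

/* Read one line and strip the line ending; returns 0 at end of input. */
int read_line(FILE *in, char *buf, size_t len)
{
    if (fgets(buf, (int)len, in) == NULL)
        return 0;
    buf[strcspn(buf, "\r\n")] = '\0';
    return 1;
}

/* Fill slots[0..max-1] from `in` and return how many records were stored.
 * Assumed layout: four lines per student (name, ID, address, phone). */
size_t load_students(FILE *in, struct StudentInfo *slots, size_t max)
{
    char name[128], id[128], addr[128], tel[128];
    size_t n = 0;

    assert(in != NULL && slots != NULL);

    while (n < max &&
           read_line(in, name, sizeof name) && read_line(in, id, sizeof id) &&
           read_line(in, addr, sizeof addr) && read_line(in, tel, sizeof tel)) {
        struct StudentInfo *s = &slots[n];
        char *last = strrchr(name, ' ');   /* last word = surname (assumption) */

        memset(s, 0, sizeof *s);
        if (last != NULL) {
            *last = '\0';
            snprintf(s->lName, sizeof s->lName, "%s", last + 1);
        }
        name[strcspn(name, " ")] = '\0';   /* keep the first word only */
        snprintf(s->fName, sizeof s->fName, "%s", name);
        snprintf(s->telNumber, sizeof s->telNumber, "%s", tel);
        n++;
    }
    return n;
}
```

In Load.c you would ask shmget() for segment_size(N) bytes, with N being whatever upper bound on students you pick, and then pass infoptr as slots. Student i is simply infoptr[i].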
As for mutual exclusion (if you're going to have multiple readers or writers, which I assume is what you wanted the semaphores for), I do not recommend semaphores. They are adequate for simpler kinds of exclusion, like "don't use this function if I'm using it", but not for locking large sets of data.
I'd use file locking. In Unix, you can use file locking primitives to create advisory locks over specific bytes in a file, even if the file is 0-bytes-long. What does this mean?
Advisory means you don't enforce them, other processes must respect them willingly. 0-byte-long means you can open a file that doesn't exist, and lock portions of it corresponding to your student structure positions in shared memory. You don't need the file to actually have data, you can use it to represent your shared memory database without writing anything to it.
What's the advantage of this over semaphores? You have fine-grained control of your locks with a single file descriptor!
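Here is a rough sketch of what I mean, using fcntl() byte-range locks. lock_slot() is a hypothetical helper name, and the lock file itself can be any path that all of your processes agree on; it never needs to contain data, because fcntl() locks may cover bytes past the end of the file.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

struct StudentInfo {
    char fName[20];
    char lName[20];
    char telNumber[15];
    char whoModified[10];
};

/* Lock or unlock the byte range that stands for student number `index`.
 * `type` is F_RDLCK (shared), F_WRLCK (exclusive) or F_UNLCK (release).
 * Blocks until the lock is granted; returns 0 on success, -1 on error. */
int lock_slot(int fd, size_t index, short type)
{
    struct flock fl = {0};

    assert(type == F_RDLCK || type == F_WRLCK || type == F_UNLCK);

    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = (off_t)(index * sizeof(struct StudentInfo));
    fl.l_len = (off_t)sizeof(struct StudentInfo);

    return fcntl(fd, F_SETLKW, &fl);
}
```

A writer would call lock_slot(fd, i, F_WRLCK), update infoptr[i] in shared memory, then call lock_slot(fd, i, F_UNLCK). Keep in mind these locks belong to the process, so they only arbitrate between different processes, not between threads of one process.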
Whew, that got long. Hope I helped.

Related

Linux kernel: help understanding xfs structs

I'm currently working on a project that involves exploring the Linux kernel, particularly the XFS filesystem. Right now I'm trying to find out where a file resides on disk, and I stumbled upon the xfs_ifork struct, which looks like this:
typedef struct xfs_ifork {
int if_bytes; /* bytes in if_u1 */
int if_real_bytes; /* bytes allocated in if_u1 */
struct xfs_btree_block *if_broot; /* file's incore btree root */
short if_broot_bytes; /* bytes allocated for root */
unsigned char if_flags; /* per-fork flags */
union {
xfs_bmbt_rec_host_t *if_extents;/* linear map file exts */
xfs_ext_irec_t *if_ext_irec; /* irec map file exts */
char *if_data; /* inline file data */
} if_u1;
union {
xfs_bmbt_rec_host_t if_inline_ext[XFS_INLINE_EXTS];
/* very small file extents */
char if_inline_data[XFS_INLINE_DATA];
/* very small file data */
xfs_dev_t if_rdev; /* dev number if special */
uuid_t if_uuid; /* mount point value */
} if_u2;
} xfs_ifork_t;
My lead is the if_u1 union, which covers the different ways a file's data fork can be stored, but I can't figure out the exact way to read its members. The xfs_bmbt_rec_host_t struct refers to the disk address, packed into two uint64 values that together encode the entire address I need,
but somehow the address I'm able to read isn't the correct one. The same goes for the xfs_ext_irec_t struct, which holds the xfs_bmbt_rec_host_t struct and counters.
If someone happens to know these structs well or has used them, I would appreciate an explanation.

Can anyone help me make a Shared Memory Segment in C

I need to make a shared memory segment so that I can have multiple readers and writers access it. I think I know what I am doing as far as the semaphores and readers and writers go...
BUT I am clueless as to how to even create a shared memory segment. I want the segment to hold an array of 20 structs. Each struct will hold a first name, an int, and another int.
Can anyone help me at least start this? I am desperate and everything I read online just confuses me more.
EDIT: Okay, so I do something like this to start
int memID = shmget(IPC_PRIVATE, sizeof(startData[0])*20, IPC_CREAT);
with startData as the array of structs holding my data initialized and I get an error saying
"Segmentation Fault (core dumped)"

The modern way to obtain shared memory is to use the POSIX shared memory API (shm_open() together with mmap()), which is specified by the Single UNIX Specification. Here is an example with two processes: one creates a shared memory object and puts some data inside, the other one reads it.
First process:
#include <stdio.h>
#include <unistd.h>
#include <sys/mman.h>
#include <fcntl.h>
#define SHM_NAME "/test"
typedef struct
{
int item;
} DataItem;
int main (void)
{
int smfd, i;
DataItem *smarr;
size_t size = 20*sizeof(DataItem);
// Create a shared memory object
smfd = shm_open(SHM_NAME, O_RDWR | O_CREAT, 0600);
// Resize to fit
ftruncate(smfd, size);
// Map the object
smarr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, smfd, 0);
// Put in some data
for (i = 0; i < 20; i++)
smarr[i].item = i;
printf("Press Enter to remove the shared memory object\n");
getc(stdin);
// Unmap the object
munmap(smarr, size);
// Close the shared memory object handle
close(smfd);
// Remove the shared memory object
shm_unlink(SHM_NAME);
return 0;
}
The process creates a shared memory object with shm_open(). The object is created with an initial size of zero, so it is enlarged using ftruncate(). Then the object is memory mapped into the virtual address space of the process using mmap(). The important thing here is that the mapping is read/write (PROT_READ | PROT_WRITE) and it is shared (MAP_SHARED). Once the mapping is done, it can be accessed as a regular dynamically allocated memory (as a matter of fact, malloc() in glibc on Linux uses anonymous memory mappings for larger allocations). Then the process writes data into the array and waits until Enter is pressed. Then it unmaps the object using munmap(), closes its file handle and unlinks the object with shm_unlink().
Second process:
#include <stdio.h>
#include <sys/mman.h>
#include <fcntl.h>
#define SHM_NAME "/test"
typedef struct
{
int item;
} DataItem;
int main (void)
{
int smfd, i;
DataItem *smarr;
size_t size = 20*sizeof(DataItem);
// Open the shared memory object
smfd = shm_open(SHM_NAME, O_RDONLY, 0600);
// Map the object
smarr = mmap(NULL, size, PROT_READ, MAP_SHARED, smfd, 0);
// Read the data
for (i = 0; i < 20; i++)
printf("Item %d is %d\n", i, smarr[i].item);
// Unmap the object
munmap(smarr, size);
// Close the shared memory object handle
close(smfd);
return 0;
}
This one opens the shared memory object for read access only and also memory maps it for read access only. Any attempt to write to the elements of the smarr array would result in a segmentation fault (SIGSEGV) being delivered.
Compile and run the first process. Then in a separate console run the second process and observe the output. When the second process has finished, go back to the first one and press Enter to clean up the shared memory block.
For more information consult the man pages of each function or the memory management portion of the SUS (it's better to consult the man pages as they document the system-specific behaviour of these functions).

mmap, msync and linux process termination

I want to use mmap to implement persistence of certain portions of program state in a C program running under Linux by associating a fixed-size struct with a well known file name using mmap() with the MAP_SHARED flag set. For performance reasons, I would prefer not to call msync() at all, and no other programs will be accessing this file. When my program terminates and is restarted, it will map the same file again and do some processing on it to recover the state that it was in before the termination. My question is this: if I never call msync() on the file descriptor, will the kernel guarantee that all updates to the memory will get written to disk and be subsequently recoverable even if my process is terminated with SIGKILL? Also, will there be general system overhead from the kernel periodically writing the pages to disk even if my program never calls msync()?
EDIT: I've settled the problem of whether the data is written, but I'm still not sure about whether this will cause some unexpected system loading over trying to handle this problem with open()/write()/fsync() and taking the risk that some data might be lost if the process gets hit by KILL/SEGV/ABRT/etc. Added a 'linux-kernel' tag in hopes that some knowledgeable person might chime in.

I found a comment from Linus Torvalds that answers this question
http://www.realworldtech.com/forum/?threadid=113923&curpostid=114068
The mapped pages are part of the filesystem cache, which means that even if the user process that made a change to that page dies, the page is still managed by the kernel and as all concurrent accesses to that file will go through the kernel, other processes will get served from that cache.
In some old Linux kernels it was different, which is why some kernel documents still say to force msync.
EDIT: Thanks to RobH for correcting the link.
EDIT:
A new flag, MAP_SYNC, was introduced in Linux 4.15. It provides the guarantee quoted below, but only for mappings created with MAP_SHARED_VALIDATE on files that support DAX (direct mapping of persistent memory):
Shared file mappings with this flag provide the guarantee that
while some memory is writably mapped in the address space of
the process, it will be visible in the same file at the same
offset even after the system crashes or is rebooted.
references:
http://man7.org/linux/man-pages/man2/mmap.2.html search MAP_SYNC in the page
https://lwn.net/Articles/731706/

I decided to be less lazy and answer the question of whether the data is written to disk definitively by writing some code. The answer is that it will be written.
Here is a program that kills itself abruptly after writing some data to an mmap'd file:
#include <stdint.h>
#include <string.h>
#include <stdlib.h>
#include <stdio.h>
#include <errno.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
typedef struct {
char data[100];
uint16_t count;
} state_data;
const char *test_data = "test";
int main(int argc, const char *argv[]) {
int fd = open("test.mm", O_RDWR|O_CREAT|O_TRUNC, (mode_t)0700);
if (fd < 0) {
perror("Unable to open file 'test.mm'");
exit(1);
}
size_t data_length = sizeof(state_data);
if (ftruncate(fd, data_length) < 0) {
perror("Unable to truncate file 'test.mm'");
exit(1);
}
state_data *data = (state_data *)mmap(NULL, data_length, PROT_READ|PROT_WRITE, MAP_SHARED|MAP_POPULATE, fd, 0);
if (MAP_FAILED == data) {
perror("Unable to mmap file 'test.mm'");
close(fd);
exit(1);
}
memset(data, 0, data_length);
for (data->count = 0; data->count < 5; ++data->count) {
data->data[data->count] = test_data[data->count];
}
kill(getpid(), 9);
}
Here is a program that validates the resulting file after the previous program is dead:
#include <stdint.h>
#include <string.h>
#include <stdlib.h>
#include <stdio.h>
#include <errno.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <assert.h>
typedef struct {
char data[100];
uint16_t count;
} state_data;
const char *test_data = "test";
int main(int argc, const char *argv[]) {
int fd = open("test.mm", O_RDONLY);
if (fd < 0) {
perror("Unable to open file 'test.mm'");
exit(1);
}
size_t data_length = sizeof(state_data);
state_data *data = (state_data *)mmap(NULL, data_length, PROT_READ, MAP_SHARED|MAP_POPULATE, fd, 0);
if (MAP_FAILED == data) {
perror("Unable to mmap file 'test.mm'");
close(fd);
exit(1);
}
assert(5 == data->count);
unsigned index;
for (index = 0; index < 4; ++index) {
assert(test_data[index] == data->data[index]);
}
printf("Validated\n");
}

I found something adding to my confusion:
munmap does not affect the object that was mapped; that is, the call to munmap
does not cause the contents of the mapped region to be written
to the disk file. The updating of the disk file for a MAP_SHARED
region happens automatically by the kernel's virtual memory algorithm
as we store into the memory-mapped region.
This is excerpted from Advanced Programming in the UNIX® Environment.
From the Linux manpage:
MAP_SHARED Share this mapping with all other processes that map this
object. Storing to the region is equivalent to writing to the
file. The file may not actually be updated until msync(2) or
munmap(2) are called.
The two seem contradictory. Is APUE wrong?

I did not find a very precise answer to your question, so I decided to add one more:
First, about losing data: both the write path and the mmap/memcpy path write to the page cache, and the OS syncs the dirty pages to the underlying storage in the background according to its writeback settings. For example, Linux has vm.dirty_expire_centisecs, which determines how old dirty data must be before it is eligible to be flushed, and vm.dirty_writeback_centisecs, which sets how often the flusher threads wake up to write it out. Even if your process dies after the write call has succeeded (or after the store into the mapping), the data is not lost, because it is already in kernel pages that will eventually be written to storage. The only case in which you would lose data is if the OS itself crashes (kernel panic, power off, etc.). The way to make absolutely sure your data has reached storage is to call fsync, or msync for mmapped regions.
About the system load concern: yes, calling msync/fsync for each request is going to reduce your throughput drastically, so do that only if you have to. Remember that you are really only protecting against data loss on OS crashes, which I would assume are rare and probably something most could live with. One common optimization is to issue the sync at regular intervals, say every 1 second, to get a good balance.
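For reference, an explicit flush of a mapped region could look like the sketch below. It reuses the state_data layout from the test programs earlier in this thread; save_state() and DATA_LEN are names I made up for illustration, and the file path is whatever your program uses.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

enum { DATA_LEN = 100 };

typedef struct {
    char data[DATA_LEN];
    uint16_t count;
} state_data;

/* Store `text` in the file at `path` through a shared mapping, then
 * force it to storage with msync(MS_SYNC).
 * Returns 0 on success, -1 on failure. */
int save_state(const char *path, const char *text)
{
    size_t len = strlen(text);
    assert(len < DATA_LEN);

    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, (off_t)sizeof(state_data)) < 0) {
        close(fd);
        return -1;
    }

    state_data *st = mmap(NULL, sizeof *st, PROT_READ | PROT_WRITE,
                          MAP_SHARED, fd, 0);
    close(fd);                      /* the mapping stays valid after close */
    if (st == MAP_FAILED)
        return -1;

    memcpy(st->data, text, len + 1);
    st->count = (uint16_t)len;

    /* MS_SYNC blocks until the dirty pages have been written back. */
    int rc = msync(st, sizeof *st, MS_SYNC);
    munmap(st, sizeof *st);
    return rc;
}
```

If you go with the periodic approach, you would keep the mapping open and call msync() from a timer instead of after every update.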

Either the Linux manpage information is incorrect or Linux is horribly non-conformant. msync is not supposed to have anything to do with whether the changes are committed to the logical state of the file, or whether other processes using mmap or read to access the file see the changes; it's purely an analogue of fsync and should be treated as a no-op except for the purposes of ensuring data integrity in the event of power failure or other hardware-level failure.

According to the manpage,
The file may not actually be
updated until msync(2) or munmap() is called.
So you will need to make sure you call munmap() prior to exiting at the very least.

Using system calls from C, how do I get the utilization of the CPU(s)?

In C on FreeBSD, how does one access the CPU utilization?
I am writing some code to handle HTTP redirects. If the CPU load goes above a threshold on a FreeBSD system, I want to redirect client requests. Looking over the man pages, kvm_getpcpu() seems to be the right answer, but the man pages (that I read) don't document the usage.
Any tips or pointers would be welcome - thanks!
After reading the answers here, I was able to come up with the below. Due to the poor documentation, I'm not 100% sure it is correct, but top seems to agree. Thanks to everyone who answered.
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/sysctl.h>
#include <unistd.h>
#define CP_USER 0
#define CP_NICE 1
#define CP_SYS 2
#define CP_INTR 3
#define CP_IDLE 4
#define CPUSTATES 5
int main()
{
long cur[CPUSTATES], last[CPUSTATES];
size_t cur_sz = sizeof cur;
int state, i;
long sum;
double util;
memset(last, 0, sizeof last);
for (i=0; i<6; i++)
{
if (sysctlbyname("kern.cp_time", &cur, &cur_sz, NULL, 0) < 0)
{
printf ("Error reading kern.cp_time sysctl\n");
return -1;
}
sum = 0;
for (state = 0; state<CPUSTATES; state++)
{
long tmp = cur[state];
cur[state] -= last[state];
last[state] = tmp;
sum += cur[state];
}
util = 100.0L - (100.0L * cur[CP_IDLE] / (sum ? (double) sum : 1.0L));
printf("cpu utilization: %7.3f\n", util);
sleep(1);
}
return 0;
}
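The delta arithmetic from the loop above can also be pulled out into a small function that takes two samples of the kern.cp_time counters, which makes it easier to check by hand. This is only a sketch: cpu_utilization() is a name I made up, and unlike the loop above it reports 0 rather than 100 when no ticks have elapsed between the samples.

```c
#include <assert.h>

enum { CP_USER, CP_NICE, CP_SYS, CP_INTR, CP_IDLE, CPUSTATES };

/* Percentage of non-idle time between two samples of the tick counters.
 * `prev` is the older sample and `cur` the newer one. */
double cpu_utilization(const long prev[CPUSTATES], const long cur[CPUSTATES])
{
    long total = 0;

    for (int s = 0; s < CPUSTATES; s++) {
        long delta = cur[s] - prev[s];
        assert(delta >= 0);         /* the counters only ever increase */
        total += delta;
    }
    if (total == 0)
        return 0.0;                 /* no ticks elapsed between samples */

    long idle = cur[CP_IDLE] - prev[CP_IDLE];
    return 100.0 - 100.0 * (double)idle / (double)total;
}
```

For example, if 100 ticks elapsed between the samples and 80 of them were idle, the function returns 20.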

From the MAN pages
NAME
kvm_getmaxcpu, kvm_getpcpu -- access per-CPU data
LIBRARY
Kernel Data Access Library (libkvm, -lkvm)
SYNOPSIS
#include <sys/param.h>
#include <sys/pcpu.h>
#include <sys/sysctl.h>
#include <kvm.h>
int
kvm_getmaxcpu(kvm_t *kd);
void *
kvm_getpcpu(kvm_t *kd, int cpu);
DESCRIPTION
The kvm_getmaxcpu() and kvm_getpcpu() functions are used to access the
per-CPU data of active processors in the kernel indicated by kd. The
kvm_getmaxcpu() function returns the maximum number of CPUs supported by
the kernel. The kvm_getpcpu() function returns a buffer holding the per-
CPU data for a single CPU. This buffer is described by the struct pcpu
type. The caller is responsible for releasing the buffer via a call to
free(3) when it is no longer needed. If cpu is not active, then NULL is
returned instead.
CACHING
These functions cache the nlist values for various kernel variables which
are reused in successive calls. You may call either function with kd set
to NULL to clear this cache.
RETURN VALUES
On success, the kvm_getmaxcpu() function returns the maximum number of
CPUs supported by the kernel. If an error occurs, it returns -1 instead.
On success, the kvm_getpcpu() function returns a pointer to an allocated
buffer or NULL. If an error occurs, it returns -1 instead.
If either function encounters an error, then an error message may be
retrieved via kvm_geterr(3).
EDIT
Here's the kvm_t struct:
struct __kvm {
/*
* a string to be prepended to error messages
* provided for compatibility with sun's interface
* if this value is null, errors are saved in errbuf[]
*/
const char *program;
char *errp; /* XXX this can probably go away */
char errbuf[_POSIX2_LINE_MAX];
#define ISALIVE(kd) ((kd)->vmfd >= 0)
int pmfd; /* physical memory file (or crashdump) */
int vmfd; /* virtual memory file (-1 if crashdump) */
int unused; /* was: swap file (e.g., /dev/drum) */
int nlfd; /* namelist file (e.g., /kernel) */
struct kinfo_proc *procbase;
char *argspc; /* (dynamic) storage for argv strings */
int arglen; /* length of the above */
char **argv; /* (dynamic) storage for argv pointers */
int argc; /* length of above (not actual # present) */
char *argbuf; /* (dynamic) temporary storage */
/*
* Kernel virtual address translation state. This only gets filled
* in for dead kernels; otherwise, the running kernel (i.e. kmem)
* will do the translations for us. It could be big, so we
* only allocate it if necessary.
*/
struct vmstate *vmst;
};

I believe you want to look into 'man sysctl'.

I don't know the exact library, command, or system call; however, if you really get stuck, download the source code to top. It displays per-cpu stats when you use the "-P" flag, and it has to get that information from somewhere.

How does one keep an int and an array in shared memory in C?

I'm attempting to write a program in which children processes communicate with each other on Linux.
These processes are all created from the same program and as such they share code.
I need them to have access to two integer variables as well as an integer array.
I have no idea how shared memory works and every resource I've searched has done nothing but confuse me.
Any help would be greatly appreciated!
Edit: Here is an example of some code I've written so far just to share one int but it's probably wrong.
int segmentId;
int sharedInt;
const int shareSize = sizeof(int);
/* Allocate shared memory segment */
segmentId = shmget(IPC_PRIVATE, shareSize, S_IRUSR | S_IWUSR);
/* attach the shared memory segment */
sharedInt = (int) shmat(segmentId, NULL, 0);
/* Rest of code will go here */
/* detach shared memory segment */
shmdt(sharedInt);
/* remove shared memory segment */
shmctl(segmentId, IPC_RMID, NULL);

You are going to need to increase the size of your shared memory. How big an array do you need? Whatever value it is, you're going to need to select it before creating the shared memory segment - dynamic memory isn't going to work too well here.
When you attach to shared memory, you get a pointer to the start address. It will be sufficiently well aligned to be used for any purpose. So, you can create pointers to your two variables and array along these lines (cribbing some of the skeleton from your code example) - note the use of pointers to access the shared memory:
enum { ARRAY_SIZE = 1024 * 1024 };
int segmentId;
int *sharedInt1;
int *sharedInt2;
int *sharedArry;
const int shareSize = sizeof(int) * (2 + ARRAY_SIZE);
/* Allocate shared memory segment */
segmentId = shmget(IPC_PRIVATE, shareSize, S_IRUSR | S_IWUSR);
/* attach the shared memory segment */
sharedInt1 = (int *) shmat(segmentId, NULL, 0);
sharedInt2 = sharedInt1 + 1;
sharedArry = sharedInt1 + 2;
/* Rest of code will go here */
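/* ...fork your child processes... */
/* ...the children can use the three pointers to shared memory... */
/* ...worry about synchronization... */
/* ...you may need to use semaphores too - but they *are* complex... */
/* ...note that ordinary pthread mutexes are no help with independent processes unless the mutex itself lives in the shared memory and is created with the PTHREAD_PROCESS_SHARED attribute... */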
/* detach shared memory segment */
shmdt(sharedInt1);
/* remove shared memory segment */
shmctl(segmentId, IPC_RMID, NULL);
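An alternative to the pointer arithmetic above is to put the two integers and the array in one struct and let the compiler work out the offsets. The following is a minimal sketch of that idea, not the only way to do it: struct Shared, share_with_child() and the small ARRAY_SIZE are hypothetical names and values chosen for illustration, and the only synchronization is the parent waiting for the child to exit.

```c
#include <assert.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

enum { ARRAY_SIZE = 16 };

/* Everything the processes share, laid out as a single struct. */
struct Shared {
    int first;
    int second;
    int values[ARRAY_SIZE];
};

/* Create a segment, let a forked child fill in n array entries, and
 * return first + second + the sum of those entries as seen by the
 * parent. Returns -1 if any step fails. */
int share_with_child(int n)
{
    assert(n >= 0 && n <= ARRAY_SIZE);

    int id = shmget(IPC_PRIVATE, sizeof(struct Shared),
                    IPC_CREAT | S_IRUSR | S_IWUSR);
    if (id < 0)
        return -1;

    struct Shared *sh = shmat(id, NULL, 0);
    if (sh == (void *)-1) {         /* shmat signals failure with (void *) -1 */
        shmctl(id, IPC_RMID, NULL);
        return -1;
    }

    pid_t pid = fork();
    if (pid == 0) {                 /* child: inherits the attachment */
        sh->first = n;
        sh->second = 2 * n;
        for (int i = 0; i < n; i++)
            sh->values[i] = i * i;
        shmdt(sh);
        _exit(0);
    }

    int total = -1;
    if (pid > 0 && waitpid(pid, NULL, 0) == pid) {
        /* The child has exited, so its writes are complete. */
        total = sh->first + sh->second;
        for (int i = 0; i < n; i++)
            total += sh->values[i];
    }

    shmdt(sh);
    shmctl(id, IPC_RMID, NULL);
    return total;
}
```

With n = 4 the child stores first = 4, second = 8 and the squares 0, 1, 4, 9, so the parent reads back a total of 26. If parent and children run concurrently instead of one after the other, you still need semaphores or a similar mechanism around the accesses.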

From your comment it seems you're using IPC_PRIVATE, and that definitely looks wrong ("private" kind of suggests it's not meant for sharing, no?-). Try something like:
#include <sys/ipc.h>
#include <sys/shm.h>
...
int segid = shmget((key_t)0x0BADD00D, shareSize, IPC_CREAT | 0600);
if (segid < 0) { /* insert error processing here! */ }
int *p = (int*) shmat(segid, 0, 0);
if (p == (int *) -1) { /* shmat returns (void *) -1 on failure; insert error processing here! */ }

This guide looks useful: http://www.cs.cf.ac.uk/Dave/C/node27.html. It includes some example programs.
There are also Linux man pages online.

Shared memory is just a segment of memory allocated by one process under a unique id; the other process attaches to the same segment by using the same id. The size of the segment is the size of the structure you are using, so here you would have a structure with 2 integers and an integer array.
Now they both have a pointer to the same memory, so the writes of one will overwrite whatever else was there, and the other has immediate access to it.