fork() failing repeatedly - c

I was trying to create a child process using fork(), but it repeatedly returns -1. I tried to find the causes and came to this:
fork() will fail and no child process will be created if:

[EAGAIN]  The system-imposed limit on the total number of processes
          under execution would be exceeded. This limit is
          configuration-dependent.

[EAGAIN]  The system-imposed limit MAXUPRC (<sys/param.h>) on the total
          number of processes under execution by a single user would be
          exceeded.

[ENOMEM]  There is insufficient swap space for the new process.
Now I don't know how to check the first and third causes, but for MAXUPRC I looked into sys/param.h:
//<sys/param.h>
#define MAXUPRC CHILD_MAX /* max simultaneous processes */
CHILD_MAX has been mentioned here (unistd.h):
//<unistd.h> - DEFINED IN MY SYSTEM
#define _SC_CHILD_MAX 2
CHILD_MAX - _SC_CHILD_MAX
The maximum number of simultaneous processes per user ID.
Must not be less than _POSIX_CHILD_MAX (25).
Now I can't establish whether keeping _SC_CHILD_MAX less than 25 is the reason, or whether I have to look into the first and third causes (they are hard to check, as the system is z/OS with limited access and I don't have much idea about it).
perror(""); isn't printing anything and errno is printing 655360.
#include <stdio.h>
#include <errno.h>
#include <unistd.h>
#include <stdlib.h>

int main()
{
    long int prc_id;

    prc_id = fork();
    if (prc_id == 0)
        calling_child();            /* defined elsewhere in my code */
    else if (prc_id < 0)
    {
        printf("errno - %d\n", errno);
        printf("failed %ld\n", prc_id);
        exit(0);
    }
    return 0;
}
This code runs fine and creates a child process on my own laptop (CentOS), but in the dev environment I guess there are some restrictions.
calling_child is never called, as the prc_id returned is -1 (not even the first print statement in the function is printed).

I copied your sample program to my z/OS system running z/OS Version 2.3, with a minor enhancement since you left out the calling_child function (UPDATED to add the sysconf() value for _SC_CHILD_MAX):
#include <stdio.h>
#include <errno.h>
#include <unistd.h>
#include <stdlib.h>

static void calling_child(void);

int main()
{
    long int prc_id;

    printf("SC_CHILD_MAX = %ld\n", sysconf(_SC_CHILD_MAX));
    prc_id = fork();
    if (prc_id == 0)
        calling_child();
    else if (prc_id < 0)
    {
        printf("errno - %d\n", errno);
        printf("failed %ld\n", prc_id);
        exit(0);
    }
    return 0;
}

static void calling_child(void)
{
    printf("Hello from PID %x\n", (unsigned)getpid());
    return;
}
I put the code in a file called test2.c and built it with this shell command: c99 -o test2 -g test2.c
It compiles cleanly and I was able to run it with no problems. It produces this output:
SC_CHILD_MAX = 32767
Hello from PID 40100b2
Most likely your build or execution environment isn't configured properly. It absolutely works fine on my pretty basic system, and I didn't have to do anything unusual at all to get it running.
A few small hints...
How are you getting into the z/OS UNIX shell? If you're logging onto TSO then running the ISPF shell or the OMVS command, you might prefer simply SSH'ing into your z/OS system. I usually find this is the cleanest batch environment.
Second thing is that you probably want to double-check your C/C++ environment. There are some good debugging features built into the IBM XL C compiler - try man c99 (or whatever dialect you use) and have a read.
Third thing is that IBM includes the dbx debugger in z/OS, so if you really get stuck, just try running your executable under dbx and you can step through your program line at a time.
As for those ERRNOs and so on, don't forget to also look at the __errno2() values - they usually have a very specific reason code that goes along with the more generic errors. For example, your z/OS security administrator can certainly do things to limit your use of z/OS UNIX functions - that would be revealed in the __errno2() values pretty clearly.
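For example, a minimal sketch of that (assuming z/OS Language Environment, where __errno2() is declared in <errno.h>; your system may also need feature-test macros to expose fork()) could look like:

#include <stdio.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

int main(void)
{
    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        printf("errno  = %d\n", errno);
        printf("errno2 = %08x\n", __errno2()); /* z/OS-specific reason code */
    }
    return 0;
}

The __errno2() value can usually be decoded with the bpxmtext shell command to get a human-readable explanation.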
Stick with it though - if you know UNIX or Linux, all the skills you have from using the shell to coding pretty much transfer 100% to z/OS if you put in a little time to learning the basics.

If you've been playing with fork(2), you may well have used up your limit of processes; see ulimit for your shell:
$ help ulimit | grep process
Ulimit provides control over the resources available to processes
-d the maximum size of a process's data segment
-l the maximum size a process may lock into memory
-u the maximum number of user processes
If so, logging out and logging back in again will solve the problem.
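If you'd rather check from inside the program, a small sketch using getrlimit(2) might look like this (RLIMIT_NPROC is the per-user process limit on Linux and the BSDs; it isn't available on every platform):

#include <stdio.h>
#include <sys/resource.h>

int main(void)
{
    struct rlimit lim;

    if (getrlimit(RLIMIT_NPROC, &lim) == 0)
        printf("max user processes: soft=%ld hard=%ld\n",
               (long)lim.rlim_cur, (long)lim.rlim_max); /* RLIM_INFINITY prints as -1 */
    return 0;
}

A soft limit that has already been reached would explain fork() failing with EAGAIN.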

Related

Why do these instruction counts of ls differ so much? (ptrace vs perf vs qemu)

I want to count the total number of instructions executed when running /bin/ls.
I used 3 methods whose results differ heavily, and I don't have a clue why.
1. Instruction counting with ptrace
I wrote a piece of code that invokes an instance of ls and singlesteps through it with ptrace:
#include <stdio.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <sys/user.h>
#include <sys/reg.h>
#include <sys/syscall.h>

int main()
{
    pid_t child;

    child = fork(); // create child
    if (child == 0) {
        ptrace(PTRACE_TRACEME, 0, NULL, NULL);
        char *child_argv[] = {"/bin/ls", NULL};
        execv("/bin/ls", child_argv);
    }
    else {
        int status;
        long long ins_count = 0;
        while (1)
        {
            // stop tracing if child terminated successfully
            wait(&status);
            if (WIFEXITED(status))
                break;
            ins_count++;
            ptrace(PTRACE_SINGLESTEP, child, NULL, NULL);
        }
        printf("\n%lld Instructions executed.\n", ins_count);
    }
    return 0;
}
Running this code gives me 516.678 Instructions executed.
2. QEMU singlestepping
I simulated ls using qemu in singlestep mode and logged all incoming instructions into a log file using the following command:
qemu-x86_64 -singlestep -D logfile -d in_asm /bin/ls
According to qemu ls executes 16.836 instructions.
3. perf
sudo perf stat ls
This command gave me 8.162.180 instructions executed.
I know that most of these instructions come from the dynamic linker and it is fine that they get counted. But why do these numbers differ so much? Shouldn't they all be the same?
Your method of counting instructions with qemu was wrong: the in_asm option only logs each instruction once, when its block is translated, so after QEMU's TB-chaining kicks in, execution jumps directly to already-translated blocks and nothing more is logged. That makes the qemu count lower than the other tools. A better way in practice is -d nochain,exec together with -singlestep.
Still, there is an instruction-count difference between these tools. I have tried running qemu from different directories to produce the logs; even with a statically linked guest program, the log files show different instruction counts, so some glibc startup or init code that consumes the environment and arguments is probably causing the difference.
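For reference, a corrected invocation along those lines (assuming the same qemu-x86_64 setup as in the question; the exact -d flag spellings can vary slightly between QEMU versions) would be:

qemu-x86_64 -singlestep -d nochain,exec -D logfile /bin/ls

With chaining disabled, every executed translation block is logged, and with -singlestep each block holds a single guest instruction, so counting the exec trace lines in the log approximates the dynamic instruction count.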
Why do these instruction counts differ so much? Because they really measure different things, and only the unit of measure is the same. It's as if you were weighing something you brought from the store, and one person weighed everything without packages nor even stickers on it, another was weighing it in packages and included the shopping bags too, and yet another also added the mud you brought into the house on your boots.
That's pretty much what is happening here: the instruction counts are not the instruction counts only of what's inside the ls binary, but can also include the libraries it uses, the services of the kernel loader needed to bring those libraries in, and finally the code executed in the process but in the kernel context. The methods you used all behave differently in that respect. So the question is: what do you need out of that measurement? If you need the "total effort", then certainly the largest number is what you want: this will include some of the overhead caused by the kernel. If you need the "I just want to know what happened in ls", then the smallest number is the one you want.
Your program using PTRACE_SINGLESTEP should count all user-space instructions executed in the process. A syscall instruction counts as one because you can't single-step into the kernel; that's opaque to ptrace.
That should be pretty similar to perf stat --all-user or perf stat -e instructions:u to count user-space instructions. (Probably counting the same within a few instructions out of however many millions). That perf option or :u event modifier tells it to program the HW performance counters to only count the event while the CPU is not at privilege level 0 (kernel mode); modern x86 CPUs have hardware support for this, so perf doesn't have to run instructions inside the kernel on every transition to stop and restart the counters.
Both of these include everything that happens in user-space, including ld-linux.so dynamic linker code that runs before execution reaches _start in a dynamic executable.
See also How do I determine the number of x86 machine instructions executed in a C program? which includes hand-written asm source for a static executable that only runs 2 instructions in user-space. perf stat --all-user counts 3 instructions for it on my Skylake. That Q&A also has a bunch of other discussion about what happens in a user-space process, and hopefully useful links.
Qemu counting is totally different because it does dynamic translation. See wen liang's answer and What instructions does qemu trace? which Peter Maydell linked in a comment on Kuba's answer here.
If you want to use a tool like this, you might want Intel's SDE, which uses Intel PIN dynamic instrumentation. It can histogram instruction types for you, as well as counting a total. See my answer on How do I determine the number of x86 machine instructions executed in a C program? for links.

In C, is there a way to pass a return value of one program to another program?

I have 2 basic questions since I'm new at C:
How do I go about passing the return value of a C program to another program? prog1 lists items (the number of items can vary on each execution), and I'd like to store and pass ONLY the last item's value to prog2 for a different purpose. Basically the output of prog1 is below, and I'd like to extract the last item on the list, /dev/hidraw2, for prog2. prog2 currently uses a hardcoded value, and I'd like to make it more dynamic.
prog1 output:
/dev/hidraw0
/dev/hidraw1
/dev/hidraw2
prog1 code can be found here:
https://pastebin.pl/view/379db296
prog2 code snippet is below:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <errno.h>
#include <fcntl.h>          /* for open() and O_RDWR */
#include <linux/hidraw.h>   /* for the hidraw structs */

const char *bus_str(int bus);

int main(int argc, char **argv)
{
    int fd;
    int i, res, desc_size = 0;
    char buf[256];
    struct hidraw_report_descriptor rpt_desc;
    struct hidraw_devinfo info;
    char *device = "/dev/hidraw2"; /*** replace hardcoded value here dynamically ***/

    if (argc > 1)
        device = argv[1];

    /* Open the Device with non-blocking reads. In real life,
       don't use a hard coded path; use libudev instead. */
    fd = open(device, O_RDWR);
If question 1 above is resolved, can I incorporate prog1 into prog2 directly even though they have different compilation parameters? Do you foresee any issue there?
prog1 compile command:
gcc -Wall -g -o prog1 prog1.c -ludev
prog2 compile command:
gcc -o prog2 prog2.c
One way, and there are many, is to use fork. fork() is a little unlike other C function calls. When you call it, you get an extra copy of the running process. You can then compare the return value to determine which process is the parent and which process is the child.
The overall logic would then look a little like:
while (looping) {
    char device[] = "..."; /* set to the desired value */
    int childPID = fork();

    if (childPID == 0) {
        /* this is the child */
        doWhatever(device);
        exit(0);
    } else {
        /* this is the parent */
        /* let's wait on the child to complete */
        wait(NULL);
    }
}
Now this example is very simple. There are many other considerations; but this approach avoids other more complex ways to pass information between processes. Basically the child is created as a copy of the program at the moment in time that fork() is called, so the child sees the contents of device exactly as they were at that moment.
Some of the problems to watch out for: children need to report their exit codes to their parent processes, so if you want more than one child running at a time, you'll need a more complex way to wait() on the children.
If you kill the parent process in this example, the children may also be killed. That may or may not be what you desire.
Other approaches (some much better than others):
Using IPC to create a producer/consumer pattern based on shared memory and semaphores (to control who's writing to the shared memory, so reading processes don't read partially written entries and writers don't remove items mid-read).
Using the above approach (fork), but creating an unnamed pipe in the parent that the child then reads from. This can permit the child to process more than one request (a popen-based sketch follows after this list of approaches).
Using a named pipe (a very special kind of file) that exists independently of the two programs. This allows independent launching of the child(ren).
Using a regular file, written by the producer and read by the consumer(s). This will require some sort of signalling for when the file is safe to write to (so consumers don't get bad reads) and when the file is safe to read / shorten (so the writers don't corrupt the file).
I'm sure there are many other approaches; but, when you have two programs cooperating to solve a problem, you start to encounter a lot of issues that you don't normally have when only considering one program solving the problem.
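To make this concrete for the question asked, here is a minimal sketch of the pipe idea using popen(3) (assumptions: prog1 is in the current directory and prints one device path per line, as shown above; error handling kept short):

#include <stdio.h>
#include <string.h>

int main(void)
{
    char line[256], last[256] = "";
    FILE *p = popen("./prog1", "r");

    if (p == NULL) {
        perror("popen");
        return 1;
    }
    while (fgets(line, sizeof(line), p) != NULL)
        if (line[0] != '\n')
            strcpy(last, line);        /* keep only the most recent line */
    pclose(p);

    last[strcspn(last, "\n")] = '\0';  /* strip the trailing newline */
    printf("device to open: %s\n", last);
    /* prog2 would now open(last, O_RDWR) instead of the hardcoded path */
    return 0;
}

This also keeps prog1's -ludev link dependency out of prog2, sidestepping the different compilation parameters from question 2.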
Some of the things to consider:
Whether the communication is reliable - files, sockets, networks, etc. all sometimes fail. You need to verify your sends were actually sent, and provide some means of knowing your data was not corrupted in transport.
The act of communicating requires time - you will need to handle delays in the packaging, insertion, retrieval, and unpackaging of the transmission; and often those delays can change dramatically from message to message.
The number of messages that can be handled is finite - transmission requires time, and time implies a rate of transmission. Since there is a rate involved, you cannot design a working program that ignores the limits of the communication path.
The message can be corrupted or eavesdropped - While we like to think that computers never make errors, when communicating, errors in the data occur more frequently. These can be due to lots of reasons (people testing sockets with telnet, electrical interference on networks, pipe files being removed, etc.). In addition, the data itself is more easily read by others, which is a problem for some programs.
The way the transmission occurs changes - Even if you are using files instead of networks to transmit information, administrators can move files and insert symlinks. Networks may send every bit of information through a different path. Don't assume a static path.
You can't provide instructions to avoid critical issues - No computer has only one administrator, even if it is your own personal computer. Automated systems and subroutines, as well as other people, ensure that you'll never be able to provide instructions on how to work around issues to all the right people. Write your solutions avoiding a need for workarounds that are implemented by people following a custom "required" procedure.
Moving data is not free - it costs time, electricity, RAM, CPU, and possibly disk or network. These costs can grow (if not managed) to prevent your program from functioning, even if all other parts of the solution are correct.
Transportation of data is often not homogeneous - Once you commit to a way of communicating information, odds are that it will not be easy to replace it with another way. Many of the solutions provide additional features that aren't present in other approaches, and even if you decide on "network only" transport, the differences between networks may make your solution less generic than you might think.
Armed with these realizations, it will be much easier for you to create a solution that works, and doesn't fall apart when some tiny detail changes.

Counter for main() in Linux OS

I want to implement a counter in Linux which keeps track of how many times main() is called by any process.
When I start this counter thing as a process, from that time on it should tell me how many times main() was called, not by my program, but anywhere in the entire OS.
Example:
I start this as a daemon and then I create a simple program:
#include <stdio.h>

int main() {
    // some code
    return 0;
}
Now here main is called so the counter will increment by one.
Can anyone explain how this can be done?
Thanks
You might want to take a look at: Proc connector and socket filters
The Proc Connector and Socket Filters Posted on February 9, 2011 by scott
The proc connector is one of those interesting kernel features that most people rarely come across, and even more rarely find documentation on. Likewise the socket filter. This is a shame, because they’re both really quite useful interfaces that might serve a variety of purposes if they were better documented.
The proc connector allows you to receive notification of process events such as fork and exec calls, as well as changes to a process's uid, gid or sid (session id). These are provided through a socket-based interface by reading instances of struct proc_event defined in the kernel header....
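To give a flavor of that interface, here is a condensed sketch of a proc connector listener that counts exec events (assumptions: a Linux kernel with CONFIG_PROC_EVENTS, and root or CAP_NET_ADMIN to bind the netlink socket; error handling and message validation are mostly omitted):

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/connector.h>
#include <linux/cn_proc.h>

int main(void)
{
    int sock = socket(PF_NETLINK, SOCK_DGRAM, NETLINK_CONNECTOR);

    struct sockaddr_nl addr;
    memset(&addr, 0, sizeof(addr));
    addr.nl_family = AF_NETLINK;
    addr.nl_pid = getpid();
    addr.nl_groups = CN_IDX_PROC;
    bind(sock, (struct sockaddr *)&addr, sizeof(addr));

    /* Subscribe: send PROC_CN_MCAST_LISTEN wrapped in cn_msg + nlmsghdr. */
    char buf[NLMSG_SPACE(sizeof(struct cn_msg) + sizeof(enum proc_cn_mcast_op))]
        __attribute__((aligned(8)));
    memset(buf, 0, sizeof(buf));
    struct nlmsghdr *nlh = (struct nlmsghdr *)buf;
    struct cn_msg *cn = (struct cn_msg *)NLMSG_DATA(nlh);

    nlh->nlmsg_len = NLMSG_LENGTH(sizeof(*cn) + sizeof(enum proc_cn_mcast_op));
    nlh->nlmsg_type = NLMSG_DONE;
    cn->id.idx = CN_IDX_PROC;
    cn->id.val = CN_VAL_PROC;
    cn->len = sizeof(enum proc_cn_mcast_op);
    *(enum proc_cn_mcast_op *)cn->data = PROC_CN_MCAST_LISTEN;
    send(sock, nlh, nlh->nlmsg_len, 0);

    /* Each PROC_EVENT_EXEC is, roughly, one main() starting somewhere. */
    unsigned long execs = 0;
    for (;;) {
        char rbuf[4096] __attribute__((aligned(8)));
        ssize_t len = recv(sock, rbuf, sizeof(rbuf), 0);
        if (len <= 0)
            break;
        struct cn_msg *rcn = (struct cn_msg *)NLMSG_DATA((struct nlmsghdr *)rbuf);
        struct proc_event *ev = (struct proc_event *)rcn->data;
        if (ev->what == PROC_EVENT_EXEC)
            printf("exec #%lu: pid %d\n", ++execs,
                   ev->event_data.exec.process_pid);
    }
    return 0;
}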
main() is just a name in the source code; the compiler and linker turn function names into addresses, so at run time there is nothing that identifies a call to main(). You can't implement such a counter directly.
main() is usually called once per program; what you probably mean is a counter for how many programs are executed.
There is popen() on Linux, which runs the given command and yields the result as a FILE *, so you can execute the ps command and parse its output to get the list of processes. You can invoke popen() periodically and count the number of programs.
char buffer[1024];
int count = 0;
FILE *ptr = popen("ps", "r");

if (NULL != ptr)
{
    while (fgets(buffer, sizeof(buffer), ptr) != NULL)
        count++;               /* ps prints one line per process (plus a header) */
    pclose(ptr);
}
/* count now holds the number of lines ps printed */

Why is heap overflow "allowed" to freeze the system?

This code takes a number as input on the command line and calls the heapOverflow() function that many times:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void heapOverflow()
{
    unsigned *some_array = malloc(50000);   /* deliberately leaked */
    (void)some_array;
}

int main(int argc, char *argv[])
{
    unsigned num_calls = atoi(argv[1]);

    while (num_calls > 0)
    {
        heapOverflow();
        num_calls--;
    }
    return 0;
}
On Linux Mint 17.1, running this with a large enough input (e.g. 10000000 in my case) freezes the system for a few minutes, before bash returns with "Killed", and then the system remains slow for a couple more minutes.
Why does the OS allow a process to take over memory to such a degree? Shouldn't the scheduler and memory manager work together to kill a process when it becomes clear that it will request too much heap memory? Or is there a situation in which giving all this memory to one process could be useful (i.e. could the process actually be doing useful work even while the rest of the system, or at least the X GUI system, is frozen?)
Why does the OS allow a process to take over memory to such a degree?
Because it is configured to do so.
The Linux kernel supports, among other features, per-process resource limits as standardized in POSIX.1-2008; see e.g. prlimit for a command-line access to these, and getrlimit()/setrlimit() for the C library interface.
In most Linux distributions, these limits are set by a Pluggable Authentication Module pam_limits in limits.conf.
The problem is, those limits are very task-specific. They vary a lot from system to system, and indeed even from user to user: some don't like their system to start paging (slowing down, as the OP described) and would rather have the process fail; others prefer to wait for a while, since they actually need the results from the resource-hungry process. Setting the limits is the responsibility of the system administrator.
I guess one could easily write a program that checks the current configuration (in particular, /proc/meminfo), and set the resource limits for a single-user desktop/laptop machine. However, you could just as well create a helper script, say /usr/local/bin/run-limited,
#!/bin/sh
exec prlimit --as=1073741824 --rss=262144 "$@"
so you can run any of your programs with the address space limited to 1 GiB and the resident set size (the amount of RAM actually used) limited to 256 KiB.
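For completeness, here is a rough equivalent from inside C using setrlimit(2), so the malloc() loop from the question fails cleanly instead of paging the machine to death (the 1 GiB figure is just an example):

#include <stdio.h>
#include <stdlib.h>
#include <sys/resource.h>

int main(void)
{
    /* Cap our own address space at 1 GiB before allocating. */
    struct rlimit lim = { 1UL << 30, 1UL << 30 };
    if (setrlimit(RLIMIT_AS, &lim) != 0)
        perror("setrlimit");

    size_t total = 0;
    for (;;) {
        if (malloc(50000) == NULL) {   /* now fails instead of freezing */
            printf("malloc failed after ~%zu MiB\n", total >> 20);
            return 0;
        }
        total += 50000;
    }
}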

Number of Running Processes on a Minix system from C code

So, this seemed simple at first, but after crawling Google and here, the answer doesn't seem as simple as I first thought.
Basically, I'm editing a MINIX kernel as part of a practical for my Operating Systems course, and I have to add a little function that spits out the number of running processes when you hit a function key in the Information Server. I've figured out how to integrate the functionality so all the other stuff works, but for the life of me, I cannot figure out how to get the current number of processes running in the system into a variable in my C code so I can print it out.
First I thought there'd be a nifty Syscall like SYS_NUMPROCS or something that'd return the value, but no luck.
Then, I tried piping output from system("ps -ax | wc -l") to a file, and the file wouldn't get created. I tried using popen() and had no luck there either - even with a simple "ls" read into a buffer, it just bombs and "hangs" the run of the code, so there's no output.
So now I'm truly stumped, and any help would be super awesome, because at this point I've exhausted all the obvious options.
The only two things I can think of now would be a loop counting all the processes (but first you have to get at the system's process list), and I've heard vague things said about /proc/ as a directory, but I haven't a clue how to access it or run through it, or how it'd link up to getting the number of processes in the first place.
Thanks a stack (lol pun), guys :)
Also, I haven't included code explicitly because, aside from basic printf'ing for cosmetic output, nothing I've written gave me any joy :/
Edit notes: Guys, this is a kernel edit - I'm writing the function to printf the information in a system C file, then recompiling the kernel and rebooting the system to test. It's a UNIX (MINIX) kernel, not a Linux kernel, and it's not a user mode program.
My code for popen(), as some of you requested, is as follows:
void cos_dmp(void)
{
    char buffer[512];
    FILE *f;

    f = popen("ps -ax | wc -l", "r");
    fgets(buffer, sizeof(buffer), f);
    /* buffer should now contain the result of popen() */
    printf("%s", buffer);
}
That's a bit of a hacked together version from what I remember and keeping it ultra simple and showing you guys that's what I was trying to do. Again though, there must be a better way to do this aside from essentially calling the output of a system() call.
Edit again: the above code works perfectly from a user program but won't work from the kernel function. Anybody have an idea why? :/
struct kinfo kinfo;
int nr_tasks, nr_procs;

getsysinfo(PM_PROC_NR, SI_KINFO, &kinfo);
nr_procs = kinfo.nr_procs;
This will get you the number of processes running
try looking to see what ps does. Look at its source code; it knows how many processes there are
Perhaps you could show us the code you wrote for capturing the result of system("ps -ax | wc -l"), or the code you wrote to use popen, and we could help you diagnose the problem with it.
Regardless, the most efficient way I can think of to count the number of existing (not the same as running) processes on the system is to opendir("/proc") and count the number of entries that are strings of decimal digits. Each process in the system will be represented by a subdirectory of /proc, named after the decimal process id number of that process.
So, if you find "/proc/3432", for example, then you know that there exists a process with a pid of "3432". Simply count the number of subdirectories you find whose names are decimal numbers.
Assumptions:
You are asking about Linux, not MINIX.
You are writing a user-mode program, not modifiying the kernel.
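Under those assumptions, the /proc-counting approach described above could look like this (each directory whose name is all decimal digits is one process):

#include <stdio.h>
#include <ctype.h>
#include <dirent.h>

int main(void)
{
    DIR *dir = opendir("/proc");
    struct dirent *ent;
    int count = 0;

    if (dir == NULL) {
        perror("opendir");
        return 1;
    }
    while ((ent = readdir(dir)) != NULL) {
        const char *p = ent->d_name;
        while (isdigit((unsigned char)*p))
            p++;
        if (p != ent->d_name && *p == '\0')   /* name was all digits */
            count++;
    }
    closedir(dir);
    printf("%d processes\n", count);
    return 0;
}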
So I have been having the same problem and found a solution (MINIX 3.1). Within the method that counts the processes, use this code (plain ANSI C):
It simply runs through the process table and counts the number of processes.
I know this is an old thread, but it might help someone in the future.
#include "../pm/mproc.h"
/* inside function */
struct mproc *mp;
int i, n=0;
printf("Number of running processes:\n");
getsysinfo(PM_PROC_NR, SI_PROC_TAB, mproc);
for (i = 0; i<NR_PROCS; i++) {
mp = &mprocs[i];
if (mp->mp_pid == 0 && i != PM_PROCS_NR) continue;
n++;
}
printf("%d", n);
/* function end */
I had the same assignment at my university, so I will post my solution in case someone needs it in the future. I am using MINIX 3.3 and VMware Player for the virtual machines.
In the pm server at /usr/src/minix/servers/pm there is a glo.h file which contains various global variables used by the pm server. In that file there is, fortunately, a variable called procs_in_use, defined as EXTERN int procs_in_use;
So a simple printf("%d\n", procs_in_use); from a system call will show the number of processes currently running. You can test this by adding a fork() call in the middle of a loop in a user-space program.
One more note: the first answer, which says
struct kinfo kinfo;
int nr_tasks, nr_procs;
getsysinfo(PM_PROC_NR, SI_KINFO, &kinfo);
nr_procs = kinfo.nr_procs;
didn't work for me. SI_KINFO no longer exists, so you should use SI_PROC_TABLE. Also, there can be problems with permissions, so you may not be able to call this function from your regular system call. There is an alternative function, sys_getkinfo(&kinfo), that can be called from your fresh system call and will do the same as the above. The catch is that kinfo.nr_procs will not return the number of current processes but the maximum number of user processes the operating system can hold, which is 256 by default and can be changed manually in the file where NR_PROCS is defined. On the other hand, kinfo.nr_tasks will return the maximum number of kernel processes the operating system can hold, which is 5 by default.
Check this out: http://procps.sourceforge.net/
It's got source to a number of small utilities that do these kinds of things. It'll be a good learning experience :) and I think ps is in there, as pm100 noted.
If you're editing the kernel, the most efficient way to solve this problem is to maintain a count every time a process (i.e., a task_struct entry) is created (and make sure you decrement the count wherever a process terminates).
You could always loop through the list of processes in the kernel using the built-in macro (but it's expensive, so you should try to avoid it):

struct task_struct *task;
unsigned int count = 0;

rcu_read_lock();            /* the task list must be traversed under RCU (or tasklist_lock) */
for_each_process(task) {
    count++;
}
rcu_read_unlock();
Check this out: http://sourceforge.net/p/readproc/code/ci/master/tree/
#include"read_proc.h"
int main(void)
{
struct Root * root=read_proc();
printf("%lu\n",root->len);
return 0;
}
