I am trying to understand why the address of a stack variable in a given process has different values when the process is executed on its own from the command line and when another process starts it with an execl call (without forking). I am running these programs on an Intel 64-bit Kali Linux machine with ASLR disabled.
Here is the code for the program that has its stack variable address printed:
#include <stdio.h>

int main(int argc, char *argv[]){
    char buffer;
    printf("buffer: %p\n", &buffer);
    return 1;
}
Here is the code for the program that executes the code above without forking:
#include <stdio.h>
#include <unistd.h>

int main(int argc, char * argv[]){
    int var;
    printf("var is at %p\n", &var);
    execl("./a", "a", NULL);
    return 1;
}
Since executing a program from the shell is, to my knowledge, the same as the shell forking itself and then exec'ing, I tested having the code above fork first and then having the child perform the exec. Here is the code for the third experiment:
#include <stdio.h>
#include <unistd.h>

int main(int argc, char * argv[]){
    int var;
    printf("var is at %p\n", &var);
    if(fork()==0)
    {
        execl("./a", "a", NULL);
    }
    return 1;
}
All three programs above are compiled with gcc with the options: -fno-stack-protector -z execstack.
Here are the results of the various tests I conducted:
1.) Executing the first program from the shell:
buffer: 0x7fffffffe2df
2.) Executing the first program from a process using exec (without forking):
var is at 0x7fffffffe2dc
buffer: 0x7fffffffe2ef
3.) Executing the first program from a process using exec (with forking):
var is at 0x7fffffffe2cc
buffer: 0x7fffffffe2df
I expected the location of buffer in the first two tests to be the same, since each process supposedly has no effect on the other's memory. I also tested whether the locations of global variables changed, and sure enough they stayed at the same addresses in every test.
To my knowledge, one reason the stack is shifted when a process executes is that it inherits environment variables from the shell. So I theorized that executing a program from a different process via exec causes it to inherit that process's stack-resident environment variables.
However, my third test says otherwise. It simulated how the shell executes a process, and it produced the same address as executing the program from the shell directly. I expected the result to match my second experiment, since it uses the same process context (and thus inherits the same environment); the only difference was that I forked first and had the child perform the exec call.
I would deeply appreciate it if someone could point me in the direction of why the stack behaves this way.
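EDIT: for anyone who wants to poke at this, here is a minimal diagnostic sketch (my own addition, not one of the three tests above) that prints where the kernel placed the argv and environment strings. Those strings sit at the very top of the initial stack, so any difference in their total length (e.g. argv[0] being "a" versus "./a", or variables only the shell exports) shifts every stack address below them:

#include <stdio.h>

extern char **environ;

int main(int argc, char *argv[]) {
    char buffer;
    // the local variable lives below the argv/env strings
    printf("buffer:  %p\n", (void *)&buffer);
    // execve() copies the argv[] and environ[] strings to the top of
    // the initial stack, so their addresses reveal the shift
    printf("argv[0]: %p (\"%s\")\n", (void *)argv[0], argv[0]);
    printf("env[0]:  %p (\"%s\")\n", (void *)environ[0], environ[0]);
    return 0;
}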
I just learned C and am fascinated by pointers. Recently I discovered the C function system(). I am able to get return values from a program I executed via system("program.exe"). E.g., program.c:
#include <stdio.h>

int main(){
    printf("hello world\n");
    return 123;
}
and this code, call.c, calls program.exe:
#include <stdio.h>
#include <stdlib.h>

int main(){
    int a;
    printf("calling program.exe:\n");
    a=system("program.exe");
    printf("program.exe returned %d at exit\n",a);
    return 0;
}
When I execute call.exe, I get this:
calling program.exe:
hello world
program.exe returned 123 at exit
I was like, wow! This return value and system() function thing is like a new way of doing interprocess communication for me. But my question is: can I get a string return from the system() function?
I tried changing "int main()" in program.c to "char * main()" and return 123 to return "bohemian rhapsody", and in call.c changing "int a;" to "char *a;" and the printf format %d to %s, but I only get funny characters when I execute call.exe. I wonder what's wrong?
No, you can't return a string from system(); indeed, under most modern desktop operating systems a program can only "return" an integer exit code when it terminates.
If you do want to get a string back from the executed program, one way would be to use popen() to invoke the program, and then have that program write the string to stdout.
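For example, a minimal sketch, assuming a POSIX system (on Windows the counterparts are _popen()/_pclose()) and reusing the asker's program.exe name:

#include <stdio.h>

int main(void){
    char line[256];
    // run the child and read whatever it writes to stdout
    FILE *fp = popen("./program.exe", "r");
    if (fp == NULL) {
        perror("popen");
        return 1;
    }
    while (fgets(line, sizeof line, fp) != NULL)
        printf("child said: %s", line);
    // pclose() waits for the child and returns its exit status
    int status = pclose(fp);
    printf("child status: %d\n", status);
    return 0;
}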
The program that you call 'call.c' invokes system(3), which does the following:
suspends the current process,
starts a child process,
and waits for the child process to complete
The return value from the called process, 'program.c/exe', is an integer (the size is system-dependent, usually 16 bits). Behind the scenes, the system(3) call uses the wait(2) call to suspend execution (blocking) until the child process exits.
Note that the string is not returned; rather, the child process prints it to stdout. See the popen(3) call if you want to obtain the string (or binary) output from the child process.
See the manual page for wait(2) to see how to process the results returned by the called program, e.g. 'man -s2 wait':
WAIT(2) System Calls Manual WAIT(2)
NAME
wait, wait3, wait4, waitpid -- wait for process termination
SYNOPSIS
#include <sys/wait.h>
pid_t
wait(int *stat_loc);
pid_t
wait3(int *stat_loc, int options, struct rusage *rusage);
pid_t
wait4(pid_t pid, int *stat_loc, int options, struct rusage *rusage);
pid_t
waitpid(pid_t pid, int *stat_loc, int options);
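As a sketch of how that manual page applies here: on POSIX systems, system() returns a wait()-style status word rather than the raw exit code, and the macros from <sys/wait.h> decode it (this assumes POSIX behavior; on Windows the exit code is returned directly):

#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>

int main(void){
    int status = system("./program.exe");
    if (status == -1) {
        perror("system");
        return 1;
    }
    // extract the child's exit code from the status word
    if (WIFEXITED(status))
        printf("child exited with code %d\n", WEXITSTATUS(status));
    return 0;
}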
Calling tzset() after forking appears to be very slow. I only see the slowness if I first call tzset() in the parent process before forking. My TZ environment variable is not set. I dtruss'd my test program, and it revealed that the child process reads /etc/localtime on every tzset() invocation, while the parent process reads it only once. This file access seems to be the source of the slowness, but I wasn't able to determine why the child process accesses it every time.
Here is my test program foo.c:
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

void check(char *msg);

int main(int argc, char **argv) {
    check("before");
    pid_t c = fork();
    if (c == 0) {
        check("fork");
        exit(0);
    }
    wait(NULL);
    check("after");
}

void check(char *msg) {
    struct timeval tv;
    gettimeofday(&tv, NULL);
    time_t start = tv.tv_sec;
    suseconds_t mstart = tv.tv_usec;
    for (int i = 0; i < 10000; i++) {
        tzset();
    }
    gettimeofday(&tv, NULL);
    double delta = (double)(tv.tv_sec - start);
    delta += (double)(tv.tv_usec - mstart)/1000000.0;
    printf("%s took: %fs\n", msg, delta);
}
I compiled and executed foo.c like this:
[muir#muir-work-mb scratch]$ clang -o foo foo.c
[muir#muir-work-mb scratch]$ env -i ./foo
before took: 0.002135s
fork took: 1.122254s
after took: 0.001120s
I'm running Mac OS X 10.10.1 (also reproduced on 10.9.5).
I originally noticed the slowness via ruby (Time#localtime slow in child process).
Ken Thomases's response may be correct, but I was curious about a more specific answer, because I still find the slowness unexpected for a single-threaded program performing such a simple/common operation after forking. After examining http://opensource.apple.com/source/Libc/Libc-997.1.1/stdtime/FreeBSD/localtime.c (not 100% sure this is the correct source), I think I have an answer.
The code uses passive notifications to determine if the time zone has changed (as opposed to stat()ing /etc/localtime every time). It appears that the registered notification token becomes invalid in the child process after forking. Furthermore, the code treats the error from using an invalid token as a positive notification that the time zone has changed, and proceeds to read /etc/localtime every time. I guess this is the kind of undefined behavior you can get after forking? It would be nice if the library noticed the error and re-registered for the notification, though.
Here is the snippet of code from localtime.c that mixes the error value with the status value:
nstat = notify_check(p->token, &ncheck);
if (nstat || ncheck) {
I demonstrated that the registration token becomes invalid after fork using this program:
#include <notify.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

void bail(char *msg) {
    printf("Error: %s\n", msg);
    exit(1);
}

int main(int argc, char **argv) {
    int token, something_changed, ret;
    notify_register_check("com.apple.system.timezone", &token);
    ret = notify_check(token, &something_changed);
    if (ret)
        bail("notify_check #1 failed");
    if (!something_changed)
        bail("expected change on first call");
    ret = notify_check(token, &something_changed);
    if (ret)
        bail("notify_check #2 failed");
    if (something_changed)
        bail("expected no change");
    pid_t c = fork();
    if (c == 0) {
        ret = notify_check(token, &something_changed);
        if (ret) {
            if (ret == NOTIFY_STATUS_INVALID_TOKEN)
                printf("ret is invalid token\n");
            if (!notify_is_valid_token(token))
                printf("token is not valid\n");
            bail("notify_check in fork failed");
        }
        if (something_changed)
            bail("expected not changed");
        exit(0);
    }
    wait(NULL);
}
And ran it like this:
muir-mb:projects muir$ clang -o notify_test notify_test.c
muir-mb:projects muir$ ./notify_test
ret is invalid token
token is not valid
Error: notify_check in fork failed
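As a possible workaround (my own untested idea, not something the library does), the child could detect the invalid token and simply register again; notify_register_check() hands out a fresh token. A minimal sketch, assuming re-registering after fork yields a usable token:

#include <notify.h>
#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    int token;
    notify_register_check("com.apple.system.timezone", &token);

    pid_t c = fork();
    if (c == 0) {
        // the inherited token is invalid in the child, so re-register
        if (!notify_is_valid_token(token))
            notify_register_check("com.apple.system.timezone", &token);
        int changed;
        int ret = notify_check(token, &changed);
        printf("child: ret=%d changed=%d\n", ret, changed);
        _exit(0);
    }
    wait(NULL);
    return 0;
}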
You're lucky you didn't experience nasal demons!
POSIX states that only async-signal-safe functions are legal to call in the child process after the fork() and before a call to an exec*() function. From the standard (emphasis added):
… the child process may only execute async-signal-safe operations until such time as one of the exec functions is called.
…
There are two reasons why POSIX programmers call fork(). One reason is to create a new thread of control within the same program (which was originally only possible in POSIX by creating a new process); the other is to create a new process running a different program. In the latter case, the call to fork() is soon followed by a call to one of the exec functions.
The general problem with making fork() work in a multi-threaded world is what to do with all of the threads. There are two alternatives. One is to copy all of the threads into the new process. This causes the programmer or implementation to deal with threads that are suspended on system calls or that might be about to execute system calls that should not be executed in the new process. The other alternative is to copy only the thread that calls fork(). This creates the difficulty that the state of process-local resources is usually held in process memory. If a thread that is not calling fork() holds a resource, that resource is never released in the child process because the thread whose job it is to release the resource does not exist in the child process.
When a programmer is writing a multi-threaded program, the first described use of fork(), creating new threads in the same program, is provided by the pthread_create() function. The fork() function is thus used only to run new programs, and the effects of calling functions that require certain resources between the call to fork() and the call to an exec function are undefined.
There are lists of async-signal-safe functions here and here. For any other function, if it's not specifically documented that the implementations on the platforms to which you're deploying add a non-standard safety guarantee, then you must consider it unsafe and its behavior on the child side of a fork() to be undefined.
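To make that rationale concrete, here is a minimal sketch of the fork-then-exec pattern it describes, keeping the child restricted to functions on the async-signal-safe list (execv() and _exit() both are; /bin/ls is just a stand-in program):

#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    pid_t pid = fork();
    if (pid == 0) {
        // only async-signal-safe calls between fork() and exec
        char *args[] = {"ls", "-l", NULL};
        execv("/bin/ls", args);
        _exit(127); // exec failed; _exit() is safe, exit() is not
    }
    int status;
    waitpid(pid, &status, 0); // the parent is unrestricted
    return 0;
}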
#include <stdio.h>
#include <mpi.h>

int a=1;
int *p=&a;

int main(int argc, char **argv)
{
    MPI_Init(&argc,&argv);
    int rank,size;
    MPI_Comm_rank(MPI_COMM_WORLD,&rank);
    MPI_Comm_size(MPI_COMM_WORLD,&size);
    //printf("Address val: %u \n",p);
    *p=*p+1;
    MPI_Barrier(MPI_COMM_WORLD);
    MPI_Finalize();
    printf("Value of a : %d\n",*p);
    return 0;
}
Here, I am trying to execute the program with 3 processes, where each tries to increment the value of a by 1, so the value at the end of execution of all processes should be 4. Why, then, is the value printed by the printf statement after MPI_Finalize() only 2? And isn't it the case that parallel execution stops at MPI_Finalize(), so there should be only one process running after it? Then why do I get the print statement 3 times, once for each process, during execution?
It is a common misunderstanding to think that mpi_init starts up the requested number of processes (or whatever mechanism is used to implement MPI) and that mpi_finalize stops them. It's better to think of mpi_init as starting the MPI system on top of a set of operating-system processes. The MPI standard is silent on what MPI actually runs on top of and how the underlying mechanism(s) is/are started. In practice a call to mpiexec (or mpirun) is likely to fire up a requested number of processes, all of which are alive when the program starts. It is also likely that the processes will continue to live after the call to mpi_finalize until the program finishes.
This means that prior to the call to mpi_init, and after the call to mpi_finalize, it is likely that there are a number of o/s processes running, each of them executing the same program. This explains why the printf statement is executed once for each of your processes.
As to why the value of a is set to 2 rather than to 4, well, essentially you are running n copies of the same program (where n is the number of processes) each of which adds 1 to its own version of a. A variable in the memory of one process has no relationship to a variable of the same name in the memory of another process. So each process sets a to 2.
To get any data from one process to another the processes need to engage in message-passing.
EDIT, in response to OP's comment
Just as a variable in the memory of one process has no relationship to a variable of the same name in the memory of another process, a pointer (which is a kind of variable) has no relationship to a pointer of the same name in the memory of another process. Do not be fooled: if the "same" pointer has the "same" address in multiple processes, those addresses are in different address spaces and are not the same; the pointers don't point to the same place.
An analogy: 1 High Street, Toytown is not the same address as 1 High Street, Legotown; there is a coincidence in names across address spaces.
To get any data (pointer or otherwise) from one process to another, the processes need to engage in message-passing. You seem to be clinging to a notion that MPI processes share memory in some way. They don't; let go of that notion.
Since MPI only gives you the option to communicate between separate processes, you have to do message passing. For your purpose there is something like MPI_Allreduce, which can sum data over the separate processes. Note that this sums the values, so in your case you want to sum the increments and add the total to *p afterwards:
int inc = 1;
MPI_Allreduce(MPI_IN_PLACE, &inc, 1, MPI_INT, MPI_SUM, MPI_COMM_WORLD);
*p += inc;
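Folding that into the question's program might look like this (a sketch that keeps the OP's global a and p; with 3 processes every rank ends up printing 4):

#include <stdio.h>
#include <mpi.h>

int a = 1;
int *p = &a;

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    // sum every rank's increment into every rank
    int inc = 1;
    MPI_Allreduce(MPI_IN_PLACE, &inc, 1, MPI_INT, MPI_SUM, MPI_COMM_WORLD);
    *p += inc;   // 1 + 3 = 4 when run with 3 processes

    MPI_Finalize();
    printf("Value of a : %d\n", *p);
    return 0;
}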
In your implementation there is no communication between the spawned processes. Each process has its own int a variable, which it increments and prints to the screen. Making the variable global doesn't make it shared between processes, and all the pointer gimmicks suggest to me that you don't know what you are doing. I would suggest learning a little more about C and operating systems before you move on.
Anyway, you have to make the processes communicate. Here's what an example might look like:
#include <stdio.h>
#include <mpi.h>

// this program will count the number of spawned processes in a *very* bad way
int main(int argc, char **argv)
{
    int partial = 1;
    int sum;
    int my_id = 0;

    // let's just assume the process with id 0 is root
    int root_process = 0;

    // spawn processes, etc.
    MPI_Init(&argc,&argv);

    // every process learns its id
    MPI_Comm_rank(MPI_COMM_WORLD, &my_id);

    // all processes add their 'partial' to the 'sum'
    MPI_Reduce(&partial, &sum, 1, MPI_INT, MPI_SUM, root_process, MPI_COMM_WORLD);

    // de-init MPI
    MPI_Finalize();

    // the root process communicates the summation result
    if (my_id == root_process)
    {
        printf("Sum total : %d\n", sum);
    }
    return 0;
}
I am writing a program in C for one of my systems classes. We write C code and run it in a Unix environment. I have looked all over the internet, but can't seem to find any way to make the kill(int PID) command work. The code will compile and run fine, but if I use the
ps -u username
command in a Unix command prompt (after execution has completed, of course), it says that all of the processes I tried to kill in my C code are still running. I can kill them from the Unix command prompt by manually entering their PIDs, but for the life of me, I cannot figure out how to do it inside of my program.
In this particular program, I am trying to kill process CC, which is a process that just infinitely calls usleep(100); until terminated.
I tried using kill(C3, -9); and variations of execlp("kill", "kill", C3, (char *)0); but still no luck. Does anyone have any idea what I am doing wrong here? My only guess is that the kill command is being passed the wrong PID parameter, but if that's the case, I have no idea how I would get the correct one.
EDIT: Also, the kill command returns a value of zero, which I believe means that it "succeeded" in executing the command.
EDIT: Just noticed that the solution to my problem was in the instructions for the assignment all along. Yup. I'm stupid.
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

int main(int args, char* argv[])
{
    //
    //Step 7
    //
    //Create process C3
    int C3=fork();
    if (C3==0)
    {
        execlp("CC", "CC", (char *)0);
    }
    else
    {
        usleep(500000);
        //
        //Step 8
        //
        int ps=fork();
        if (ps==0)
        {
            execlp("ps", "ps", "-u", "cooley", (char *)0);
        }
        else
        {
            wait(NULL);
            kill(C3);
        }
    }
    exit(0);
}
You're calling the kill system call with only one argument when it takes two. This leads to undefined behavior, since the second argument can then be anything. You should get a warning about this when compiling.
The second argument should be a value from <signal.h> (see the signal(7) manual page).
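A minimal self-contained sketch of the two-argument call (a pause()ing child stands in for the asker's CC program):

#include <signal.h>
#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    pid_t child = fork();
    if (child == 0)
    {
        pause();  // stand-in for the looping CC process
        _exit(0);
    }
    // two arguments: the PID and the signal to deliver
    if (kill(child, SIGKILL) == -1)
        perror("kill");
    wait(NULL);   // reap the child so it doesn't linger as a zombie
    return 0;
}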
Is there a way to set environment variables in Linux using C?
I tried setenv() and putenv(), but they don't seem to be working for me.
I'm going to make a wild guess here, but the normal reason these functions appear not to work is not that they don't work, but that the user doesn't really understand how environment variables work. For example, if I have this program:
#include <stdlib.h>

int main(int argc, char **argv)
{
    putenv("SomeVariable=SomeValue");
    return 0;
}
If I then run it from the shell, it won't modify the shell's environment - there's no way for a child process to do that. That's why the shell commands that modify the environment are builtins, and why you need to source a script that contains variable settings you want to add to your shell, rather than simply running it.
Any unix program runs in a separate process from the process which starts it; this is a 'child' process.
When a program is started up -- be that at the command line or any other way -- the system creates a new process which is (more-or-less) a copy of the parent process. That copy includes the environment variables in the parent process, and this is the mechanism by which the child process 'inherits' the environment variables of its parent. (this is all largely what other answers here have said)
That is, a process only ever sets its own environment variables.
Others have mentioned sourcing a shell script, as a way of setting environment variables in the current process, but if you need to set variables in the current (shell) process programmatically, then there is a slightly indirect way that it's possible.
Consider this:
% cat envs.c
#include <stdio.h>
int main(int argc, char**argv)
{
    int i;
    for (i=1; i<argc; i++) {
        printf("ENV%d=%s\n", i, argv[i]);
    }
}
% echo $ENV1
% ./envs one two
ENV1=one
ENV2=two
% eval `./envs one two`
% echo $ENV1
one
%
The built-in eval evaluates its argument as if that argument were typed at the shell prompt. This is a sh-style example; the csh-style variant is left as an exercise!
The environment variable set by setenv()/putenv() will be set for the process that executed these functions, and it will be inherited by processes launched by that process. However, it will not be propagated up into the shell that executed your program.
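A minimal sketch of that inheritance (MYVAR is just a made-up name for illustration):

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    // set a variable in this process's environment...
    if (setenv("MYVAR", "hello", 1) != 0) {
        perror("setenv");
        return 1;
    }
    // ...and show that the shell started by system() inherits it
    system("echo MYVAR in the child is: $MYVAR");
    return 0;
}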
The environment block is process-local, and copied to child processes. So if you change variables, the new value only affects your process and child processes spawned after the change. Assuredly it will not change the shell you launched from.
Not an answer to this question; I just want to say that putenv is dangerous, so use setenv instead.
putenv(char *string) is dangerous because all it does is append the address of your key-value pair string to the environ array. Therefore, if we subsequently modify the bytes pointed to by string, the change will affect the process environment.
#include <stdlib.h>

int main(void) {
    char new_env[] = "A=A";
    putenv(new_env);
    // modifying `new_env` also modifies the environment
    // variable
    new_env[0] = 'B';
    return EXIT_SUCCESS;
}
Since environ stores only the address of our string argument, the string has to have static storage duration to prevent a dangling pointer.
#include <stdlib.h>

void foo();

int main(void) {
    foo();
    return EXIT_SUCCESS;
}

void foo() {
    char new_env[] = "A=B";
    putenv(new_env);
}
When the stack frame for foo function is deallocated, the bytes of new_env are gone, and the address stored in environ becomes a dangling pointer.
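A sketch of the same function written with setenv() instead, which copies the name and value into the environment, so no dangling pointer is left behind when foo's frame goes away:

#include <stdlib.h>

void foo();

int main(void) {
    foo();
    return EXIT_SUCCESS;
}

void foo() {
    char value[] = "B";
    // setenv() copies "A" and value into the environment,
    // so the automatic array can safely disappear when foo returns
    setenv("A", value, 1);
}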