(Disclaimer: This is homework)
I am creating a shell program; let's call it fancysh. I am trying to add PATH (and other environment variable) functionality to my shell, and so far all is good. My naive approach was to store all these variables as static variables in fancysh.c. Now, however, I am trying to implement the environment variable SHLVL, which holds the current "depth" of the shell. For example, when running in the first instance of fancysh, SHLVL should read 1; upon calling fancysh again, SHLVL should increment (and decrement when a shell is exited).
What I have tried...
fancysh.h
#ifndef FANCYSH_H
#define FANCYSH_H
extern int SHLVL;
#endif
fancysh.c
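#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include "fancysh.h"
int SHLVL;
int main(void){
/* some fancy code to determine if SHLVL is initialized */
/* if not init to 0 */
SHLVL++;
printf("%d\n", SHLVL);
/* Test Code Only */
pid_t pid = fork();
if(pid == 0 && SHLVL < 10)
execlp("fancysh", "fancysh", (char *)NULL); /* replace the child with a new fancysh */
wait(NULL); /* parent waits for the child to finish */
/* Test Code Only */
/* shell code */
SHLVL--;
printf("%d\n", SHLVL);
exit(0);
}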
I used the answers here and here as part of this solution.
So how would I go about implementing the fancy code to determine if SHLVL is initialized? I had some ideas about using a combination of #ifdef and #define but I'm not 100% sure how to do this.
You need to get a grasp of the fact that different shell processes are different processes. Just because one instance of the shell is started within the scope of another instance of the shell does not mean that the former automatically inherits any data from the latter.
Or not directly, anyway. Any new instance of your shell will receive an environment from the process that starts it. If that environment contains a SHLVL variable then the new shell process can of course read that value, and it may possibly present a different value of that environment variable within its own scope.
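A minimal sketch of that idea, assuming the shell keeps SHLVL in its real process environment rather than in a C global. The helper name bump_shlvl is hypothetical; getenv(), strtol(), and setenv() are the standard calls.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read the SHLVL inherited from the parent, add one,
 * and store the result in this process's own environment so that any
 * child started later inherits it. Returns the new level, or -1 if
 * setenv() fails. */
int bump_shlvl(void)
{
    long level = 0;                      /* a missing SHLVL counts as 0 */
    const char *old = getenv("SHLVL");
    if (old != NULL) {
        char *end;
        long parsed = strtol(old, &end, 10);
        /* accept only a clean, non-negative number; ignore garbage */
        if (end != old && *end == '\0' && parsed >= 0 && parsed < 1000000)
            level = parsed;
    }
    level++;

    char buf[32];
    snprintf(buf, sizeof buf, "%ld", level);
    if (setenv("SHLVL", buf, 1) != 0)
        return -1;
    return (int)level;
}
```

Called once at startup, before any fork()/exec, this gives each nested fancysh a level one higher than its parent's. No explicit decrement is needed on exit: the outer shell's environment was never modified by the inner one.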
Related
I'm having an issue creating named semaphores inside two processes. This is the relevant content for the header file that is being called inside both programs (semctrl.h):
int init_sems()
{
char names[8][5] = {"sem1", "sem2", "sem3", "sem4", "sem5", "sem6", "sem7", "sem8"};
int failed = 0; /* start with "no failure"; any failing sem_open() sets it to 1 */
for (int i = 0; i < 8; i++)
{
sem[i] = sem_open(names[i], O_CREAT,00777, i>3);
failed = failed || (sem[i] == SEM_FAILED);
}
return failed;
}
sem is a global array of semaphores but I doubt this is relevant.
Both programs call this function successfully but when I print the addresses for the semaphores, I get different values for each program. This means they both create a set of 8 different semaphores, therefore, they can't sync properly.
What am I missing?
On Linux:
A named semaphore is backed by a file in the virtual file system mounted at /dev/shm
The individual named semaphores have the format: sem.name, where name is the name passed to sem_open()
To use the semaphore functions, link with the -pthread option
Always check the value returned by the semaphore functions. Most return 0 on success and -1 on failure, with errno set to the cause; sem_open() instead returns a semaphore pointer on success and SEM_FAILED on failure.
Note that named semaphore files do not 'go away' when the application exits, but rather a call to sem_unlink() is needed.
Note that if the named semaphore file is not unlinked via sem_unlink(), then it will still be active when that program is again run and the value left in the semaphore from the prior run will still be there.
The 'access bits' must be set so all the programs that want to access the semaphore can do so.
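The points above can be sketched as follows. This is only an illustration: the helper names open_shared_sem and destroy_shared_sem are hypothetical, and the name passed in is assumed to have the /somename form that Linux expects. Note that the pointer returned by sem_open() is an address in the calling process, so two processes opening the same name can print different addresses and still share one semaphore.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>      /* O_CREAT */
#include <semaphore.h>
#include <stdio.h>

/* Hypothetical helper: open (creating if necessary) a named semaphore.
 * The initial value is only used when the semaphore is first created;
 * if it already exists, the existing value is kept. */
sem_t *open_shared_sem(const char *name, unsigned int initial)
{
    sem_t *s = sem_open(name, O_CREAT, 0666, initial);
    if (s == SEM_FAILED)
        perror("sem_open");          /* errno holds the cause */
    return s;
}

/* Hypothetical helper: close this process's handle and remove the name,
 * so the next run starts from a fresh semaphore.
 * Returns 0 on success, -1 on failure. */
int destroy_shared_sem(sem_t *s, const char *name)
{
    if (sem_close(s) == -1) {
        perror("sem_close");
        return -1;
    }
    if (sem_unlink(name) == -1) {
        perror("sem_unlink");
        return -1;
    }
    return 0;
}
```

Build with -pthread. In a multi-process setup, only one process (typically the last one to finish) should unlink the name.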
I am studying fork(). As I learned, after fork() the parent and child both have the same "image", i.e. their page tables point to the same pages, and all page table entries are marked read-only (when the kernel handles the syscall). When a page is written to, e.g. when updating a variable, a private anonymous copy of the page is created and the change is stored there, so in practice the child and parent have no influence on each other's variables. I encountered a strange case that I can't figure out: what happens when the value returned by fork() ends up in a static variable, and when exactly is the split made?
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>
static int a = 0;
static int r = 0;
int main() {
r = fork();
if (r > 0){
a = 1;
}
if (a == 0) {
fork();
}
return 0;
}
How many fork() calls are executed? The first one clearly occurs; will the second one occur? When I run the code with some printing (and checking that fork succeeds), the result changes from one run to another, although from what I learned there should always be 2 forks. Is it some problem with my computer or the program I'm using to run the code, or am I missing something that explains this changing behavior?
Can someone help me understand how the system handles variables that are set before a process makes a fork() call? Below is a small test program I wrote to try to understand what is going on behind the scenes.
I understand that the current state of a process is "cloned", variables included, at the time of the fork. My thought was that if I malloc'd a 2D array before calling fork, I would need to free the array in both the parent and the child processes.
As you can see from the results below the sample code, the two values act as if they are totally separate from each other, yet they have the exact same address. I expected that my final result for tmp would be -4 no matter which process completed first.
I am newer to C and Linux, so could someone explain how this is possible? Perhaps the variable tmp becomes a pointer to a pointer which is distinct in each process? Thanks so much.
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>
#include <stdlib.h>
int main()
{
int tmp = 1;
pid_t forkReturn= fork();
if(!forkReturn) {
/*child*/
tmp=tmp+5;
printf("Value for child %d\n",tmp);
printf("Address for child %p\n",&tmp);
}
else if(forkReturn > 0){
/*parent*/
tmp=tmp-10;
printf("Value for parent %d\n",tmp);
printf("Address for parent %p\n",&tmp);
}
else {
/*Error calling fork*/
printf("Error calling fork\n");
}
return 0;
}
RESULTS of standard out:
Value for child 6
Address for child 0xbfb478d8
Value for parent -9
Address for parent 0xbfb478d8
It did indeed copy the entire address space, and changing memory in the child process does not affect the parent. The key to understanding this is to remember that a pointer can only point to something in your own process, and the copy happens at a lower level.
However, you should not call malloc() or free() at all in the child of fork(). This can deadlock if another thread was inside malloc() when you called fork(). The only functions safe to call in the child are the ones also listed as safe for signal handlers (async-signal-safe). I used to be able to claim this was true only if you wrote multithreaded code; however, Apple was kind enough to spawn a background thread in the standard library, so the deadlock is real all the time. The child of fork() should never be allowed to drop out of the if block. Call _exit() to make sure it doesn't.
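A minimal sketch of that pattern, using the same arithmetic as the question. The function name fork_demo and its out-parameters are hypothetical; the child reports its value through its exit status so that it calls nothing but _exit().

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: the child changes its private copy of tmp and
 * reports it through its exit status; the parent's tmp is untouched.
 * Returns 0 on success, -1 on error. */
int fork_demo(int *child_value, int *parent_value)
{
    int tmp = 1;
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        tmp = tmp + 5;      /* modifies the child's copy only */
        _exit(tmp);         /* never fall out of the if block */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    tmp = tmp - 10;         /* the parent still starts from 1 */
    *child_value = WEXITSTATUS(status);
    *parent_value = tmp;
    return 0;
}
```

Both processes see tmp at the same virtual address, but each address space maps it to its own physical page after the first write, which is why the two results never combine.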
I have been writing C code for many years, but I recently came across a feature that I have never used: a static variable inside a function. Therefore, I was wondering in what ways you have used this feature where it was the right design decision.
E.g.
int count(){
static int n;
n = n + 1;
return n;
}
is a BAD design decision. Why? Because later you might want to decrement the count, which would involve changing the function parameters, changing all calling code, ...
Hopefully this is clear enough,
thanks!
void first_call()
{
static int n = 0;
if(!n) {
/* do stuff here only on first call */
n++;
}
/* other stuff here */
}
I have used static variables in test code for lazy initialization of state. Using static local variables in production code is fraught with peril and can lead to subtle bugs. It seems (at least in the code that I generally work on) that nearly any bit of code that starts out as single-threaded has a nasty habit of eventually ending up in a concurrent situation, and a static variable in a concurrent environment can produce issues that are difficult to debug. This is because the resulting state change is essentially a hidden side effect.
I have used static variables as a way to control the execution of another thread.
For instance, thread #1 (the main thread) first declares and initializes a control variable such as:
/* on thread #1 */
static bool run_thread = true;
// then initialize the worker thread
and then it starts the execution of thread #2, which is going to do some work until thread #1 decides to stop it:
/* thread #2 */
while (run_thread)
{
// work until thread #1 stops me
}
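As a hedged sketch of the same pattern: a plain bool shared between threads is formally a data race in C11, so this variant swaps in atomic_bool from <stdatomic.h>. The names worker, iterations, and run_and_stop_worker are hypothetical, and the counter merely stands in for real work.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Written by thread #1, read by thread #2. */
static atomic_bool run_thread = true;
static atomic_long iterations = 0;

static void *worker(void *arg)
{
    (void)arg;
    do {
        atomic_fetch_add(&iterations, 1);   /* stand-in for real work */
    } while (atomic_load(&run_thread));     /* work until told to stop */
    return NULL;
}

/* Hypothetical driver: start the worker, ask it to stop, and wait for it.
 * Returns 0 on success, -1 on failure. */
int run_and_stop_worker(void)
{
    pthread_t tid;
    if (pthread_create(&tid, NULL, worker, NULL) != 0)
        return -1;
    atomic_store(&run_thread, false);       /* thread #1 stops thread #2 */
    return pthread_join(tid, NULL) == 0 ? 0 : -1;
}
```

The do/while form guarantees at least one iteration even if the stop request arrives before the worker is scheduled.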
One prominent case where a variable very much needs to be static is a mutex that protects a critical section. As an example for POSIX threads:
static pthread_mutex_t mut = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_lock(&mut);
/* critical code comes here */
pthread_mutex_unlock(&mut);
This wouldn't work with an auto variable.
POSIX has such static initializers for mutexes, condition variables and once variables.
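A short sketch combining two of those static initializers, as an illustration of a thread-safe version of the counting function discussed earlier. The names setup, init_calls, and next_count are hypothetical.

```c
#include <pthread.h>

static pthread_once_t init_once = PTHREAD_ONCE_INIT;
static pthread_mutex_t count_lock = PTHREAD_MUTEX_INITIALIZER;
static int init_calls = 0;   /* how many times setup() actually ran */
static int counter = 0;

/* One-time initialization; pthread_once() guarantees a single run. */
static void setup(void)
{
    init_calls++;
}

/* Hypothetical thread-safe counter: setup() runs exactly once no matter
 * how many threads call next_count(), and the mutex serializes the
 * increment. Returns the new count. */
int next_count(void)
{
    pthread_once(&init_once, setup);
    pthread_mutex_lock(&count_lock);
    int value = ++counter;
    pthread_mutex_unlock(&count_lock);
    return value;
}
```

Because both objects have static storage duration, every caller sees the same once-control and the same mutex, which is what makes the protection effective.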
Is there a way to set environment variables in Linux using C?
I tried setenv() and putenv(), but they don't seem to be working for me.
I'm going to make a wild guess here, but the normal reason that these functions appear to not work is not because they don't work, but because the user doesn't really understand how environment variables work. For example, if I have this program:
#include <stdlib.h>
int main(int argc, char **argv)
{
putenv("SomeVariable=SomeValue");
return 0;
}
And then I run it from the shell, it won't modify the shell's environment - there's no way for a child process to do that. That's why the shell commands that modify the environment are builtins, and why you need to source a script that contains variable settings you want to add to your shell, rather than simply running it.
Any unix program runs in a separate process from the process which starts it; this is a 'child' process.
When a program is started up -- be that at the command line or any other way -- the system creates a new process which is (more-or-less) a copy of the parent process. That copy includes the environment variables in the parent process, and this is the mechanism by which the child process 'inherits' the environment variables of its parent. (this is all largely what other answers here have said)
That is, a process only ever sets its own environment variables.
Others have mentioned sourcing a shell script as a way of setting environment variables in the current process, but if you need to set variables in the current (shell) process programmatically, there is a slightly indirect way to do it.
Consider this:
% cat envs.c
#include <stdio.h>
int main(int argc, char**argv)
{
int i;
for (i=1; i<argc; i++) {
printf("ENV%d=%s\n", i, argv[i]);
}
}
% echo $ENV1
% ./envs one two
ENV1=one
ENV2=two
% eval `./envs one two`
% echo $ENV1
one
%
The built-in eval evaluates its argument as if that argument were typed at the shell prompt. This is a sh-style example; the csh-style variant is left as an exercise!
The environment variable set by setenv()/putenv() will be set for the process that executed these functions and will be inherited by the processes launched by it. However, it will not be propagated back to the shell that executed your program.
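The inheritance direction can be checked with a small sketch like the one below. The function name child_inherits is hypothetical, and the caller is assumed to be single-threaded so that calling getenv() in the forked child is safe.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical check: set name=value in this process, then fork a child
 * and let it report whether it sees the same value.
 * Returns 1 if the child inherited it, 0 if not, -1 on error. */
int child_inherits(const char *name, const char *value)
{
    if (setenv(name, value, 1) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* child: it received a copy of the parent's environment */
        const char *seen = getenv(name);
        _exit(seen != NULL && strcmp(seen, value) == 0 ? 0 : 1);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status) == 0 ? 1 : 0;
}
```

The variable flows from parent to child only; the shell that launched this program never sees it.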
Why isn't my wrapper around setenv() working?
The environment block is process-local, and copied to child processes. So if you change variables, the new value only affects your process and child processes spawned after the change. Assuredly it will not change the shell you launched from.
Not an answer to this question, just wanna say that putenv is dangerous, use setenv instead.
putenv(char *string) is dangerous for the reason that all it does is simply append the address of your key-value pair string to the environ array. Therefore, if we subsequently modify the bytes pointed to by string, the change will affect the process environment.
#include <stdlib.h>
int main(void) {
char new_env[] = "A=A";
putenv(new_env);
// modifying your `new_env` also modifies the environment
// variable
new_env[0] = 'B';
return EXIT_SUCCESS;
}
Since environ only stores the address of our string argument, string has to be static to prevent the dangling pointer.
#include <stdlib.h>
void foo();
int main(void) {
foo();
return EXIT_SUCCESS;
}
void foo() {
char new_env[] = "A=B";
putenv(new_env);
}
When the stack frame for foo function is deallocated, the bytes of new_env are gone, and the address stored in environ becomes a dangling pointer.
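For contrast, a minimal sketch of the setenv() alternative recommended above. glibc documents that setenv() copies the name and value strings, so a local buffer is safe here. The function name set_from_local is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>

/* Hypothetical counterpart to foo(): setenv() stores its own copy of the
 * value, so the local buffer can be overwritten or go out of scope.
 * Returns 0 on success, -1 on failure. */
int set_from_local(void)
{
    char value[] = "B";
    if (setenv("A", value, 1) != 0)
        return -1;
    value[0] = 'X';     /* does not affect the environment */
    return 0;
}
```

After the call returns, getenv("A") still yields "B", with no dangling pointer left in environ.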