How do I program my own setenv()? - c

My school wants me to implement the behavior of the standard C library function setenv(). I'm not allowed to use setenv() itself in this implementation. How can I do that?

On many implementations of the C programming language and especially on POSIX, the environment is accessible from the environ global variable. You may need to declare it manually as it's not declared in any standard header file:
extern char **environ;
environ points to a NULL-terminated array of pointers to variable=value strings. For example, if your environment has the variables foo, bar, and baz, the entries in environ might be:
environ[0] = "foo=a";
environ[1] = "bar=b";
environ[2] = "baz=c";
environ[3] = NULL;
To alter the environment without using the setenv() or putenv() functions, check if the key you want to set already exists. If it does, overwrite the entry for that key. Else you need to copy the content of environ into a new array and add the new entry to its end. You can use malloc() or calloc() and memcpy() for this purpose. Since this is homework, I'm not going to supply further details.
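Without giving the whole exercise away, the first step (checking whether the key already exists) could look something like the sketch below. find_env_index is a made-up helper name, not a library function, and the sketch assumes a POSIX-style environ.

```c
#include <stddef.h>
#include <string.h>

extern char **environ;

/* Hypothetical helper: return the index of the "key=..." entry in
 * environ, or -1 if there is no entry for key. */
int find_env_index(const char *key)
{
    size_t len = strlen(key);

    if (environ == NULL)
        return -1;
    for (int i = 0; environ[i] != NULL; i++) {
        /* The name must be followed directly by '=', so that
         * "foo" does not match an entry such as "foobar=1". */
        if (strncmp(environ[i], key, len) == 0 && environ[i][len] == '=')
            return i;
    }
    return -1;
}
```

A return value of -1 means the copy-and-append path is needed; any other value is the slot to overwrite.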

Related

How to declare a variable using string concatenation and use that variable to print an integer defined as variable in C? [duplicate]

I have a program test.c
int global_var=10;
printf("Done");
I compiled it with
gcc -g test.c -o test
My question is:
Is there a way I can take a variable name as an argument (say "global_var") and print its value?
Thanks
No, C doesn't have introspection. Once the compiler has generated code, the program cannot look up variable names.
The way these things are usually solved is by having a collection of all the special variables that need to be looked up by name, containing both the actual name as a string and the variable itself.
Usually it's an array of structures, something like
struct
{
    const char *name;
    int value;
} variables[] = {
    { "global_var", 10 }
};
The program can then look through the array variables to search for "global_var" and use (or change) the value in the structure.
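A search over such a table could be sketched as follows. This is a variation rather than the exact structure above: it stores a pointer to each variable instead of a copy of its value, so changes made through the table reach the real global. The names named_int, find_int_by_name and other_var are invented for the example.

```c
#include <stddef.h>
#include <string.h>

int global_var = 10;
int other_var = 20;   /* hypothetical second variable */

/* One table entry: the name as a string plus the variable's address. */
struct named_int {
    const char *name;
    int *addr;
};

static struct named_int variables[] = {
    { "global_var", &global_var },
    { "other_var",  &other_var  },
};

/* Return the address of the registered variable called name,
 * or NULL if the name is not in the table. */
int *find_int_by_name(const char *name)
{
    size_t count = sizeof variables / sizeof variables[0];

    for (size_t i = 0; i < count; i++) {
        if (strcmp(variables[i].name, name) == 0)
            return variables[i].addr;
    }
    return NULL;
}
```

With this in place, a name taken from argv[1] can be passed to find_int_by_name and the result dereferenced after a NULL check.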
General answer: No. There is no connection between a variable name and its string representation (you can get the string representation of a variable name at compile time with the preprocessor, though).
For identifiers with external linkage, there are (platform-dependent) ways: See e.g. dlsym for POSIX systems.
You can compile with debugging information and access (most) variables by names from input. Unless you really write something like a debugger, this would be a horrible design, however (and even then, you don’t access the variables used in the debugger itself but of the programme being debugged).
Finally, you could implement your own lookup table mapping from string representations to values.
No.
We only have variable names so humans don't get confused.
After your program gets turned into assembly and eventually machine code, the computer doesn't care what you name your variables.
Alternatively you could use a structure in which you would store the value and the name as a string:
struct tag_name {
    char *member1;
    int member2;
};
In general, it is not possible to access at runtime global variables by name. Sometimes, it might depend upon the operating system, and how the compiler is invoked. I still assume you want to dereference a global variable, and you know its type.
Then on Linux and some other systems, you could use dlopen(3) with a NULL path (to get a handle for the executable), then use dlsym on the global variable name to get its address; you can then cast that void* pointer to a pointer of the appropriate type and dereference it. Notice that you need to know the type (or at least have a convention to encode the type of the variable in its name; C++ is doing that with name mangling). If you compiled and linked with debug information (i.e. with gcc -g) the type information is in its DWARF sections of your ELF executable, so there is some way to get it.
This works if you link your executable with -rdynamic and -ldl.
Another possibility might be to customize your recent GCC with your own MELT extension which would remember and later re-use some of the compiler internal representations (i.e. the GCC Tree-s related to global variables). Use MELT register_finish_decl_first function to register a handler on declarations. But this will require some work (in coding your MELT extension).
Using preprocessor tricks
You could use (portable) preprocessor tricks to achieve your goals (accessing variable by name at runtime).
The simplest way might be to define and follow your own conventions. For example you could have your own globvar.def header file containing just lines like
/* file globvar.def */
MY_GLOBAL_VARIABLE(globalint,int)
MY_GLOBAL_VARIABLE(globalint2,int)
MY_GLOBAL_VARIABLE(globalstr,char*)
#undef MY_GLOBAL_VARIABLE
And you adopt the convention that all global variables are in the above globvar.def file. Then you would #include "globvar.def" several times. For instance, in your global header, expand MY_GLOBAL_VARIABLE to some extern declaration:
/* in yourheader.h */
#define MY_GLOBAL_VARIABLE(Nam,Typ) extern Typ Nam;
#include "globvar.def"
In your main.c you'll need a similar trick to declare your globals.
Elsewhere you might define a function to get integer variables by name:
/* return the address of global int variable or else NULL */
int* global_int_var_by_name (const char*name) {
#define MY_GLOBAL_VARIABLE(Nam,Typ) \
    if (!strcmp(#Typ,"int") && !strcmp(name,#Nam)) return (int*)&Nam;
#include "globvar.def"
    return NULL;
}
etc etc... I'm using stringification of macro arguments.
Such preprocessor tricks are purely standard C and would work with any C99 compliant compiler.
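For anyone who wants to try the idea in a single file, here is a sketch of the same trick with the variable list kept in a macro instead of a separate globvar.def. MY_GLOBAL_VARIABLES, DEFINE_GLOBAL and MATCH_INT are names made up for this example; the lookup logic mirrors the function above.

```c
#include <stddef.h>
#include <string.h>

/* The list that globvar.def would hold, kept in a macro here so
 * that the example fits in one file. */
#define MY_GLOBAL_VARIABLES(X) \
    X(globalint, int) \
    X(globalint2, int) \
    X(globalstr, char *)

/* First expansion: define each global variable. */
#define DEFINE_GLOBAL(Nam, Typ) Typ Nam;
MY_GLOBAL_VARIABLES(DEFINE_GLOBAL)
#undef DEFINE_GLOBAL

/* Second expansion: return the address of the global int variable
 * called name, or NULL if there is no int variable of that name. */
int *global_int_var_by_name(const char *name)
{
#define MATCH_INT(Nam, Typ) \
    if (!strcmp(#Typ, "int") && !strcmp(name, #Nam)) return (int *)&Nam;
    MY_GLOBAL_VARIABLES(MATCH_INT)
#undef MATCH_INT
    return NULL;
}
```

The type check compares the stringified type against "int", so globalstr is skipped even though it is in the list.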

Which C header defines the common variable names (PATH, HOME, IFS...)?

Chapter 8 of the POSIX standard defines a list of commonly used environment variables "that are frequently exported by widely used command interpreters and applications".
However I cannot find any C header providing their names in any of my unix-like systems.
I'm looking for something like:
#define ENV_PATH "PATH"
#define ENV_USER "USER"
#define ENV_IFS "IFS"
...
Where can I find such a header? Any OS-specific header would work: I just don't want to invent names for the constants myself.
edit
If you are only used to mainstream operating systems, you might ask: why would you want to use constants here? $PATH is always $PATH everywhere!
This is not actually true.
In Plan 9 from Bell Labs, environment variables are usually lowercase (apparently due to aesthetics).
In Jehanne, a new operating system derived from Plan 9, I'm reconsidering this design choice, to ease the integration of POSIX tools. However, since I like the lowercase environment variables, I'd like to be able to easily switch back to lowercase names when Jehanne becomes "the one true operating system" :-D
As stated in the comments, there is no header file that provides any POSIX-specified list of environment variables used by applications and utilities.
A list of "certain variables that are frequently exported by widely used command interpreters and applications" can be found at http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap08.html#tag_08:
It is unwise to conflict with certain variables that are frequently
exported by widely used command interpreters and applications:
ARFLAGS   IFS          MAILPATH    PS1
CC        LANG         MAILRC      PS2
CDPATH    LC_ALL       MAKEFLAGS   PS3
CFLAGS    LC_COLLATE   MAKESHELL   PS4
CHARSET   LC_CTYPE     MANPATH     PWD
COLUMNS   LC_MESSAGES  MBOX        RANDOM
DATEMSK   LC_MONETARY  MORE        SECONDS
DEAD      LC_NUMERIC   MSGVERB     SHELL
EDITOR    LC_TIME      NLSPATH     TERM
ENV       LDFLAGS      NPROC       TERMCAP
EXINIT    LEX          OLDPWD      TERMINFO
FC        LFLAGS       OPTARG      TMPDIR
FCEDIT    LINENO       OPTERR      TZ
FFLAGS    LINES        OPTIND      USER
GET       LISTER       PAGER       VISUAL
GFLAGS    LOGNAME      PATH        YACC
HISTFILE  LPDEST       PPID        YFLAGS
HISTORY   MAIL         PRINTER
HISTSIZE  MAILCHECK    PROCLANG
HOME      MAILER       PROJECTDIR
To access the value of an environment variable, use the getenv() function.
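As a small illustration, getenv() returns NULL when the variable is not set, so callers usually want a fallback. The wrapper below is only a sketch; env_or_default is a name invented here, not a standard function.

```c
#include <stdlib.h>

/* Hypothetical helper: return the value of the environment variable
 * name, or fallback if the variable is not set. */
const char *env_or_default(const char *name, const char *fallback)
{
    const char *value = getenv(name);

    return value != NULL ? value : fallback;
}
```

For example, env_or_default("TMPDIR", "/tmp") yields the value of TMPDIR when it is set and "/tmp" otherwise.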
The exec() function documentation specifies the char **environ variable:
In addition, the following variable, which must be declared by the
user if it is to be used directly:
extern char **environ;
is initialized as a pointer to an array of character pointers to the
environment strings. The argv and environ arrays are each terminated
by a null pointer. The null pointer terminating the argv array is not counted in argc.
Applications can change the entire environment in a single operation
by assigning the environ variable to point to an array of character
pointers to the new environment strings. After assigning a new value
to environ, applications should not rely on the new environment
strings remaining part of the environment, as a call to getenv(),
putenv(), setenv(), unsetenv(), or
any function that is dependent on an environment variable may, on
noticing that environ has changed, copy the environment strings to a
new array and assign environ to point to it.
Any application that directly modifies the pointers to which the
environ variable points has undefined behavior.
Conforming multi-threaded applications shall not use the environ
variable to access or modify any environment variable while any other
thread is concurrently modifying any environment variable. A call to
any function dependent on any environment variable shall be considered
a use of the environ variable to access that environment variable.
You can do something like this, modifying whatever you want inside the get_env_variables function. Write something like a strncmp check there to decide whether a given variable should be modified.
#include <stddef.h>
char *get_env_variables(char *str); /* declared before its use in main */
int main(int ac, char **av, char **env) {
    int i = 0;
    while (env[i] != NULL) {
        env[i] = get_env_variables(env[i]);
        i++;
    }
    return 0;
}
char *get_env_variables(char *str) {
    /* PUT SOME CODE HERE: return a replacement string, or str unchanged */
    return str;
}
EDIT: don't forget to return the new env[i].

Is it safe to use the argv pointer globally?

Is it safe to use the argv pointer globally? Or is there a circumstance where it may become invalid?
i.e: Is this code safe?
#include <stdio.h>
char **largs;
void function_1()
{
    printf("Argument 1: %s\r\n", largs[1]);
}
int main(int argc, char **argv)
{
    largs = argv;
    function_1();
    return 1;
}
Yes, it is safe to use argv globally; you can use it as you would use any char** in your program. The C99 standard even specifies this:
The parameters argc and argv and the strings pointed to by the argv array shall be modifiable by the program, and retain their last-stored values between program startup and program termination.
The C++ standard does not have a similar paragraph, but the same is implicit with no rule to the contrary.
Note that C++ and C are different languages and you should just choose one to ask your question about.
It should be safe so long as the main() function has not exited. A few examples of things that can happen after main() exits are:
Destructors of global and static variables
Threads running longer than main()
A stored argv must not be used in those.
The reference doesn't say anything which would give a reason to assume that the lifetimes of the arguments to main() function differ from the general rules for lifetimes of function arguments.
So long as argv pointer itself is valid, the C/C++ runtime must guarantee that the content to which this pointer points is valid (of course, unless something corrupts memory). So it must be safe to use the pointer and the content that long. After main() returns, there is no reason for the C/C++ runtime to keep the content valid either. So the above reasoning applies to both the pointer and the content it points to.
is it safe to use the argv pointer globally
This requires a little more clarification. As the C11 spec says in chapter §5.1.2.2.1, Program startup
[..].. with two parameters (referred to here as argc and argv, though any names may be used, as they are local to the function in which they are declared)
That means, the variables themselves have a scope limited to main(). They are not global themselves.
Again the standard says,
The parameters argc and argv and the strings pointed to by the argv array shall be modifiable by the program, and retain their last-stored values between program startup and program termination.
That means the lifetime of these variables lasts until main() finishes execution.
So, if you're using a global variable to hold the value from main(), you can safely use those globals to access the same in any other function(s).
This thread on the comp.lang.c.moderated newsgroup discusses the issue at length from a C standard point of view, including a citation showing that the contents of the argv arrays (rather than the argv pointer itself, if e.g. you took an address &argv and stored that) last until "program termination", and an assertion that it is "obvious" that program termination has not yet occurred in a way relevant to this while the atexit-registered functions are executing:
The program has not terminated during atexit-registered
function processing. We thought that was pretty obvious.
(I'm not sure who Douglas A. Gwyn is, but it sounds like "we" means the C standard committee?)
The context of the discussion was mainly concerning storing a copy of the pointer argv[0] (program name).
The relevant C standard text is 5.1.2.2.1:
The parameters argc and argv and the strings pointed to by the
argv array shall be modifiable by the program, and retain their
last-stored values between program startup and program
termination.
Of course, C++ is not C, and its standard may subtly differ on this issue or not address it.
You can either pass them as parameters, or store them in global variables. As long as you don't return from main and try to process them in an atexit handler or the destructor of a variable at global scope, they still exist and will be fine to access from any scope.
Yes, it is safe for either C or C++, because no thread is left running after main has finished.

Printing Environmental variables in Linux

I am new to Linux. I came across this piece of code to print environmental variables. It is kind of confusing me. How can this code print the environmental variables?
#include <stdio.h>
extern char **environ;
int main()
{
    char **var;
    for(var=environ; *var!=NULL;++var)
        printf("%s\n",*var);
    return 0;
}
What is extern here?
If you don't know what extern means, please find a book to learn C from. It simply means 'defined somewhere else, but used here'.
The environ global variable is unique amongst POSIX global variables in that it is not declared in any header. It is like the argv array to the program, an array of character pointers each of which points at an environment variable in the name=value format. The list is terminated by a null pointer, just like argv is. There is no count for the environment, though.
for (var = environ; *var != NULL; ++var)
    printf("%s\n", *var);
So, on the first iteration, var points at the first environment variable; then it is incremented to the next, until the value *var (a char *) is NULL, indicating the end of the list.
That loop could also be written as:
char **var = environ;
while (*var != 0)
    puts(*var++);
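Since there is no count for the environment, a program that needs one has to walk the list itself. A minimal sketch, assuming a POSIX-style environ (count_env_entries is a name chosen for this example):

```c
#include <stddef.h>

extern char **environ;

/* Count the entries in environ by walking to the terminating
 * null pointer. */
size_t count_env_entries(void)
{
    size_t n = 0;

    if (environ == NULL)
        return 0;
    while (environ[n] != NULL)
        n++;
    return n;
}
```

The result is the index of the terminating null pointer, which is also where a new entry would go in a larger copy of the array.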
From Wikipedia, http://en.wikipedia.org/wiki/External_variable:
Definition, declaration and the extern keyword
To understand how external variables relate to the extern keyword, it is necessary to know the difference between defining and declaring a variable. When a variable is defined, the compiler allocates memory for that variable and possibly also initializes its contents to some value. When a variable is declared, the compiler requires that the variable be defined elsewhere. The declaration informs the compiler that a variable by that name and type exists, but the compiler need not allocate memory for it since it is allocated elsewhere.
The extern keyword means "declare without defining". In other words, it is a way to explicitly declare a variable, or to force a declaration without a definition. It is also possible to explicitly define a variable, i.e. to force a definition. It is done by assigning an initialization value to a variable. If neither the extern keyword nor an initialization value are present, the statement can be either a declaration or a definition. It is up to the compiler to analyse the modules of the program and decide.
A variable must be defined once in one of the modules of the program. If there is no definition or more than one, an error is produced, possibly in the linking stage. A variable may be declared many times, as long as the declarations are consistent with each other and with the definition (something which header files facilitate greatly). It may be declared in many modules, including the module where it was defined, and even many times in the same module. But it is usually pointless to declare it more than once in a module.
An external variable may also be declared inside a function. In this case the extern keyword must be used, otherwise the compiler will consider it a definition of a local variable, which has a different scope, lifetime and initial value. This declaration will only be visible inside the function instead of throughout the function's module.
The extern keyword applied to a function prototype does absolutely nothing (the extern keyword applied to a function definition is, of course, non-sensical). A function prototype is always a declaration and never a definition. Also, in ANSI C, a function is always external, but some compiler extensions and newer C standards allow a function to be defined inside a function.
An external variable must be defined, exactly once, outside of any
function; this sets aside storage for it. The variable must also be
declared in each function that wants to access it; this states the
type of the variable. The declaration may be an explicit extern
statement or may be implicit from context. ... You should note that we
are using the words definition and declaration carefully when we refer
to external variables in this section. Definition refers to the place
where the variable is created or assigned storage; declaration refers
to places where the nature of the variable is stated but no storage is
allocated.
—The C Programming Language
Scope, lifetime and the static keyword
An external variable can be accessed by all the functions in all the modules of a program. It is a global variable. For a function to be able to use the variable, a declaration or the definition of the external variable must lie before the function definition in the source code. Or there must be a declaration of the variable, with the keyword extern, inside the function.
The static keyword (static and extern are mutually exclusive), applied to the definition of an external variable, changes this a bit: the variable can only be accessed by the functions in the same module where it was defined. But it is possible for a function in the same module to pass a reference (pointer) of the variable to another function in another module. In this case, even though the function is in another module, it can read and modify the contents of the variable—it just cannot refer to it by name.
It is also possible to use the static keyword on the definition of a local variable. Without the static keyword, the variable is automatically allocated when the function is called and released when the function exits (thus the name "automatic variable"). Its value is not retained between function calls. With the static keyword, the variable is allocated when the program starts and released when the program ends. Its value is not lost between function calls. The variable is still local, since it can only be accessed by name inside the function that defined it. But a reference (pointer) to it can be passed to another function, allowing it to read and modify the contents of the variable (again without referring to it by name).
External variables are allocated and initialized when the program starts, and the memory is only released when the program ends. Their lifetime is the same as the program's.
If the initialization is not done explicitly, external (static or not) and local static variables are initialized to zero. Local automatic variables are uninitialized, i.e. contain "trash" values.
The static keyword applied to a function definition prevents the function from being called by name from outside its module (it remains possible to pass a function pointer out of the module and use that to invoke the function).
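To make the point about static local variables concrete, here is a short sketch. next_id is a hypothetical function, not part of the quoted article.

```c
/* Hypothetical example: a counter that keeps its value between calls. */
int next_id(void)
{
    /* Allocated once for the whole run of the program and
     * zero-initialized because there is no explicit initializer. */
    static int last_id;

    last_id++;
    return last_id;
}
```

Each call returns one more than the previous call, because last_id is not released when the function returns. Without the static keyword it would be an uninitialized automatic variable.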
Example (C programming language)
File 1:
int GlobalVariable; // implicit definition
void SomeFunction(); // function prototype (declaration)
int main() {
    GlobalVariable = 1;
    SomeFunction();
    return 0;
}
File 2:
extern int GlobalVariable; // explicit declaration
void SomeFunction() { // function header (definition)
    ++GlobalVariable;
}
In this example, the variable GlobalVariable is defined in File 1. In order to utilize the same variable in File 2, it must be declared. Regardless of the number of files, a global variable is only defined once, however, it must be declared in any file outside of the one containing the definition.
If the program is in several source files, and a variable is defined in file1 and used in file2 and file3, then extern declarations are needed in file2 and file3 to connect the occurrences of the variable. The usual practice is to collect extern declarations of variables and functions in a separate file, historically called a header, that is included by #include at the front of each source file. The suffix .h is conventional for header names.
extern declares a variable or a function that is defined elsewhere, so that it can be used from other files... I highly advise reading some of the many articles available on the Internet on C programming: https://www.google.ca/search?client=opera&rls=en&q=learn+c&sourceid=opera&ie=utf-8&oe=utf-8&channel=suggest
extern char **environ;
The variable environ comes from the C library that your program is linked against.
It holds the environment variables of the current process on your
Linux system. That's why the code can print them.

Is main() a pre-defined function in C? [duplicate]

Possible Duplicate:
main() in C, C++, Java, C#
I'm new to programming in general, and C in particular. Every example I've looked at has a "main" function - is this pre-defined in some way, such that the name takes on a special meaning to the compiler or runtime... or is it merely a common idiom among C programmers (like using "foo" and "bar" for arbitrary variable names).
No, you need to define main in your program. Since it's called from the run-time, however, the interface your main must provide is pre-defined (must return an int, must take either zero arguments or two, the first an int, and the second a char ** or, equivalently, char *[]). The C and C++ standards do specify that a function with external linkage named main acts as the entry point for a program [1].
At least as the term is normally used, a predefined function would be one such as sin or printf that's in the standard library so you can use it without having to write it yourself.
[1] If you want to get technical, that's only true for a "hosted" implementation -- i.e., the kind most of us use most of the time that produces programs that run on an operating system. A "free-standing" implementation (one that produces programs that run directly on the "bare metal" with no operating system under it) is free to define the entry point(s) as it sees fit. A free-standing implementation can also leave out most of the normal run-time library, providing only a handful of headers (e.g., <stddef.h>) and virtually no standard library functions.
Yes, main is a predefined function in the general sense of the word "defined". In other words, the C language standard specifies that the function called at program startup shall be named main. It is not merely a convention used by programmers as we have with foo or bar.
The fine print: from the perspective of the technical meaning of the word "defined" in the context of C programming, no, the main function is not "predefined" -- the compiler or C library do not supply a predefined function named main. You need to define your own implementation of the main function (and, obviously, you should name it main).
There is typically a piece of code that normal C programs are linked to which does:
extern int main(int argc, char * argv[], char * envp[]);
FILE * stdin;
FILE * stdout;
FILE * stderr;
/* ** setup argv ** */
...
/* ** setup envp ** */
...
/* ** setup stdio ** */
stdin = fdopen(0, "r");
stdout = fdopen(1, "w");
stderr = fdopen(2, "w");
int rc;
rc = main(argc, argv, envp); // envp may not be present here with some systems
exit(rc);
Note that this code is C, not C++, so it expects main to be a C function.
Also note that my code does no error checking and leaves out a lot of other system dependent stuff that probably happens. It also ignores some things that happen with C++, objective C, and various other languages that may be linked to it (notably constructor and destructor calling, and possibly having main be within a C++ try/catch block).
Anyway, this code knows that main is a function and takes arguments. If your main looks like:
int main(void) {
Then it still gets passed arguments, but they are ignored.
This code is specially linked so that it is called when the program starts up.
You are completely free to write your own version of this code on many architectures, but it relies on intimate knowledge of how the operating system starts a new program as well as the C (and C++ and possibly Objective C) run time. It is likely to require some assembly programming and/or use of compiler extensions to build properly.
The C compiler driver (the command you usually call when you call the compiler) passes the object file containing all of this (often called crt0.o, for C Run Time ...) along with the rest of your program to the linker, unless told not to.
When building an operating system kernel or an embedded program you often do not want to use the standard crt*.o file. You also may not want to use it if you are building a normal application in another programming language, or have some other non-standard requirements.
No, or you couldn't define one.
It's not predefined, but its meaning as an entry point, if it is present, is defined.

Resources