I'm currently working on a program, and I would like to print special output if an environment variable is set.
For example, suppose I want environment variable "DEBUG".
In my bash command prompt, I set DEBUG by typing the command:
DEBUG=
Then in my C program, I can verify this environment variable is set by printing out all the content of char **environ. DEBUG does show up in this environment printout.
However, I don't know how to retrieve this environment variable for conditional checking. I've tried using the function getenv like so:
getenv("DEBUG")
If I were to try to print out this output like below I get a seg fault:
printf("get env: %s\n", getenv("DEBUG"));
I even tried this on a known environment variable like "HOME":
printf("get env: %s\n", getenv("HOME"));
which still produces a seg fault.
Does anyone have any experience checking whether an environment variable is set from a C program? I can't even read a single environment variable, which is preventing me from doing so.
getenv returns NULL when the environment variable for which it is asked is not set. Your check could thus simply be
if(getenv("DEBUG")) {
// DEBUG is set
} else {
// DEBUG is not set
}
Note that there is a difference between shell and environment variables; if you want a variable to show up in the environment of a shell's subprocess, you have to export it in the shell:
export DEBUG=some_value
or
DEBUG=some_value
export DEBUG
It is not enough to just say DEBUG=some_value.
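A minimal sketch of such a check, in case it helps. The helper names env_is_set and env_is_nonempty are made up for illustration. The sketch separates "set but empty" (what DEBUG= gives you once the variable is exported) from "not set at all":

```c
#include <stdlib.h>

/* Hypothetical helper: 1 if the variable exists in the environment,
   even with an empty value (as after `export DEBUG=`), otherwise 0. */
int env_is_set(const char *name)
{
    return getenv(name) != NULL;
}

/* Hypothetical helper: 1 only if the variable exists and is not empty. */
int env_is_nonempty(const char *name)
{
    const char *value = getenv(name);
    return value != NULL && value[0] != '\0';
}
```

With this, if (env_is_set("DEBUG")) treats an empty DEBUG as enabled, while env_is_nonempty("DEBUG") requires an actual value. Which one you want is a design choice.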
It's because you are not including stdlib.h and the compiler is assuming getenv() returns int.
You have two options. You can declare getenv() yourself:
char *getenv(const char *);
or include stdlib.h. The same applies to printf(), but in that case the header is stdio.h.
You should enable compiler warnings. On Linux, gcc and clang both support -Wall -Wextra -Werror. The most important one here is -Werror, which turns the implicit-declaration warning into an error and so prevents compilation in this case.
You need to make sure that getenv (and printf) are correctly declared.
For getenv, you need:
#include <stdlib.h>
If you don't declare it, segfaults are likely when you call it. If you were to get that far, trying to use the value it returns would probably also segfault.
Undeclared functions are handled as though they were declared to accept either integer or double arguments (depending on what is provided) and as though they return int. If int is the same size as a pointer, that might work, but in the common case where pointers are 64 bits and ints are only 32, treating the pointer that getenv returns as though it were an int drops half of its bits on the floor, making it pretty much unusable as a pointer.
Always specify -Wall when you compile your code, and make sure you pay attention to the warnings. They are important.
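As a sketch of what the fixed version could look like with both headers included: the helpers env_or and describe_env are hypothetical names, and the output is formatted into a buffer so it can be inspected. Note that getenv may legitimately return NULL, and passing NULL for %s is itself undefined, so the sketch substitutes a fallback string.

```c
#include <stdio.h>   /* declares snprintf */
#include <stdlib.h>  /* declares getenv, so it is known to return char * */

/* Hypothetical helper: value of the variable, or fallback if it is unset. */
const char *env_or(const char *name, const char *fallback)
{
    const char *value = getenv(name);
    return value != NULL ? value : fallback;
}

/* Formats "get env: <value>" into buf; returns what snprintf returns. */
int describe_env(char *buf, size_t size, const char *name)
{
    return snprintf(buf, size, "get env: %s\n", env_or(name, "(not set)"));
}
```

Calling describe_env(buf, sizeof buf, "HOME") and then fputs(buf, stdout) gives the output the question was after, without relying on an implicit declaration.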
Code snippet:
if(NULL == getenv("TIME_ELAPSED"))
{
putenv("TIME_ELAPSED=1");
}
We also have to handle errors from putenv. It returns a nonzero value on failure and sets errno, for example to:
ENOMEM Insufficient space to allocate new environment.
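A sketch of the same snippet with the return value checked, assuming a POSIX system. The function name ensure_time_elapsed and the static buffer are made up for illustration. putenv keeps the pointer it is given rather than copying the string, so the string has to outlive the call:

```c
#define _XOPEN_SOURCE 700   /* ask the headers to declare putenv */
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* putenv() stores this pointer in the environment, so it must stay valid. */
static char time_elapsed_default[] = "TIME_ELAPSED=1";

/* Hypothetical helper: returns 0 if TIME_ELAPSED is set afterwards, -1 on error. */
int ensure_time_elapsed(void)
{
    if (getenv("TIME_ELAPSED") != NULL)
        return 0;                        /* already set, leave it alone */

    if (putenv(time_elapsed_default) != 0) {
        /* errno is typically ENOMEM here */
        fprintf(stderr, "putenv failed: %s\n", strerror(errno));
        return -1;
    }
    return 0;
}
```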
I want to know how a C compiler treats string literals that are never used.
I am using Code::Blocks on Windows 7 with GCC.
int main()
{
"1145"; "ho";
printf("hello");
}
So I want to know whether the unused strings consume memory space or not.
First you need to understand l(eft)-values and r(ight)-values.
l-values actually are memory locations, where objects are stored.
r-values are data that is supposed to be stored somewhere in memory (in an l-value).
So your construct "1145"; "ho";
makes two expression statements whose values are not assigned anywhere. You can even write this (perfectly valid) code:
int main(){
;;
printf("hello");
}
This is allowed because a lone ; is the null statement. You will often see expressions like
while(*ptr++); // advances the pointer until the value it points to is 0
where the body that while executes on every iteration is just ;
I'm 99% sure that these strings don't use any space at all, because GCC, even without any options, recognized the unused statements and didn't generate any code for that line.
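To make the while(*ptr++); idiom concrete, here is a small sketch that uses it to measure a string. The function name length_by_null_statement is made up for illustration. All the work happens in the loop condition, and the body is the null statement:

```c
#include <stddef.h>

/* Length of a NUL-terminated string, using a loop whose body is the null statement. */
size_t length_by_null_statement(const char *s)
{
    const char *ptr = s;

    while (*ptr++)
        ;   /* null statement: the condition both tests and advances */

    /* ptr now points one past the terminating '\0'. */
    return (size_t)(ptr - s) - 1;
}
```

Putting the ; on its own line is a common way to show the reader that the empty body is intentional.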
Compiling the code shown and assuming you enabled enough warnings you can expect the following being issued by the compiler:
warning: statement with no effect [-Wunused-value]
So the compiler seems to have noticed that those strings are "unused". Knowing this and being told to "optimise" the compilation those strings might very well be removed and would not use any memory at all.
If the compiler has been told to not optimise the strings will be part of the program and use at least sizeof "1145" + sizeof "ho" bytes.
Further readings:
To enable GCC's warnings use its -Wxyz options.
To steer optimisation with GCC use its -O option.
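The size estimate sizeof "1145" + sizeof "ho" can be worked out directly. sizeof applied to a string literal counts the terminating '\0', so the two literals take 5 and 3 bytes if the compiler keeps them. A minimal sketch (the function name is hypothetical):

```c
#include <stddef.h>

/* Bytes the two literals occupy if they end up in the program:
   "1145" is 4 characters + '\0', "ho" is 2 characters + '\0'. */
size_t unused_literal_bytes(void)
{
    return sizeof "1145" + sizeof "ho";   /* 5 + 3 */
}
```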
I understand that if printf is given no arguments it outputs an unexpected value.
Example:
#include <stdio.h>
int main() {
int test = 4 * 4;
printf("The answer is: %d\n");
return 0;
}
This prints a seemingly random number. After playing around with different formats such as %p, %x, etc., it still doesn't print 16 (because I didn't pass the variable as an argument). What I'd like to know is: where are these values being taken from? Is it the top of the stack? It's not a new value every time I compile, which is weird; it's like a fixed value.
printf("The answer is: %d\n");
invokes undefined behavior. C requires a conversion specifier to have an associated argument. While it is undefined behavior and anything can happen, on most systems you end up dumping the stack. It's the kind of trick used in format string attacks.
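For comparison, a sketch of the call with its argument supplied. It formats into a buffer with snprintf only so that the result can be checked, and format_answer is a made-up name:

```c
#include <stdio.h>

/* Writes "The answer is: 16\n" into buf; returns what snprintf returns. */
int format_answer(char *buf, size_t size)
{
    int test = 4 * 4;

    /* One %d conversion, one int argument: specifier and argument match. */
    return snprintf(buf, size, "The answer is: %d\n", test);
}
```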
It is called undefined behavior and it is scary (see this answer).
If you want an explanation, you need to dive into implementation-specific details. So study the generated assembly code (e.g. compile with gcc -Wall -Wextra -S -fverbose-asm + your optimization flags, then look into the generated .s assembly file) and the ABI of your system.
The printf function will go looking for the argument on the stack, even if you don't supply one. Anything that's on there will be used, if it can't find an integer argument. Most times, you will get nonsensical data. The data chosen varies depending on the settings of your compiler. On some compilers, you may even get 16 as a result.
For example:
int printf(char*, int d){...}
This would be how printf works (not really, just an example). It doesn't report an error if the argument for d is missing; it just looks on the stack for the argument that's supposed to be there to display.
Printf is a variable argument function. Most compilers push arguments onto the stack and then call the function, but, depending on machine, operating system, calling convention, number of arguments, etc, there are also other values pushed onto the stack, which might be constant in your function.
printf reads this area of memory and prints it.
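A small variadic function shows why the callee cannot notice a missing argument. This is only a sketch with a made-up function, sum_ints, not how printf is really implemented. It trusts its count parameter exactly as printf trusts its format string:

```c
#include <stdarg.h>

/* Adds up `count` int arguments. The function has no way to verify
   that the caller really passed that many. */
int sum_ints(int count, ...)
{
    va_list args;
    int total = 0;

    va_start(args, count);
    for (int i = 0; i < count; i++)
        total += va_arg(args, int);   /* fetches the next argument slot */
    va_end(args);

    return total;
}
```

sum_ints(3, 10, 5, 2) is well defined. sum_ints(3) would compile just the same, and va_arg would then read whatever happens to be in the places where the arguments should have been, which is undefined behaviour.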
A question was asked in a multiple choice test: What will be the output of the following program:
#include <stdio.h>
int main(void)
{
int a = 10, b = 5, c = 2;
printf("%d %d %d\n");
return 0;
}
and the choices were various permutations of 10, 5, and 2. For some reason, it works in Turbo C++, which we use in college. However, it doesn't when compiled with gcc (which gives a warning when -Wall is enabled) or clang (which has -Wformat enabled and gives a warning by default) or in Visual C++. The output is, as expected, garbage values. My guess is that it has something to do with the fact that either Turbo C++ is 16-bit, and running on 32-bit Windows XP, or that TCC is terrible when it comes to standards.
The code has undefined behaviour.
In Turbo C++, it just so happens that the three variables live at the exact positions on the stack where the missing printf() argument would be. This results in the undefined behaviour manifesting itself by having the "correct" values printed.
However, you can't reasonably rely on this to be the case. Even the slightest change to your build environment (e.g. different compiler options) could break things in an arbitrarily nasty way.
The answer here is that the program could do anything -- this is undefined behavior. According to printf()s documentation (emphasis mine):
By default, the arguments are used in the order given, where each '*' and each conversion specifier asks for the next argument (and it is an error if insufficiently many arguments are given).
If your multiple-choice test does not have a choice for "undefined behavior" then it is a flawed test. Under the influence of undefined behavior, any answer on such a multiple-choice test question is technically correct.
It is undefined behaviour, so it could print anything.
Try to use
printf("%d %d %d", a, b, c);
Reason: local variables are allocated on the stack, and printf in Turbo C++ sees them in the same order in which they were placed on the stack.
Suggestion (from comments):
Understanding why it behaves in a particular way with a particular compiler can be useful in diagnosing problems, but don't make any other use of the information.
What's actually going on is that arguments are normally passed on the call stack. Local variables are also stored on the call stack, and so printf() sees those values, in whatever order the compiler decided to store them there.
This behavior, like many others, is allowed under the umbrella of undefined behavior.
No, it's not related to architecture. It is related to how TurboC++ handles the stack. Variables a, b, and c are locals and as such allocated in the stack. printf also expects the values in the stack. Apparently, TurboC++ does not add anything else to the stack after the locals and printf is able to take them as parameters. Just coincidence.
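The warnings from gcc and clang mentioned in the question can also be had for your own printf-style wrappers. A sketch, assuming a GCC-compatible compiler: the format attribute is a GCC/clang extension, and PRINTF_LIKE and format_line are made-up names. With the attribute in place, a call such as format_line(buf, n, "%d %d %d\n") with no arguments is flagged by -Wformat.

```c
#include <stdarg.h>
#include <stdio.h>

#if defined(__GNUC__)
#define PRINTF_LIKE(fmt_index, first_arg) \
    __attribute__((format(printf, fmt_index, first_arg)))
#else
#define PRINTF_LIKE(fmt_index, first_arg)   /* no checking on other compilers */
#endif

/* Hypothetical wrapper: argument 3 is the format, checking starts at argument 4. */
int format_line(char *buf, size_t size, const char *fmt, ...) PRINTF_LIKE(3, 4);

int format_line(char *buf, size_t size, const char *fmt, ...)
{
    va_list args;
    int written;

    va_start(args, fmt);
    written = vsnprintf(buf, size, fmt, args);
    va_end(args);

    return written;
}
```

Called correctly, format_line(buf, sizeof buf, "%d %d %d\n", a, b, c) with a = 10, b = 5, c = 2 produces "10 5 2" on every conforming compiler, not just Turbo C++.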
I'm a beginner, so please go easy on me. I am compiling the simple code below. I am not assigning any values to my variables, but the C program prints some random values. Why is that? (Only the 2nd variable produces random integers.)
So where do these values come from?
#include<stdio.h>
int main(void) {
int var1;
int var2;
printf("Var1 is %d and Var2 is %d.", var1, var2);
return 0; //Book says I should use this for getting an output but my compiler anyways compile and return me values whether I use it or not
}
//Output 1st compiled: var1 = 19125, var2 = 8983
//Output 2nd compiled: var1 = 19125, var2 = 9207
//Output 3rd compiled: var1 = 19125, var2 = 9127
Your C program is compiled to some executable program. Notice that if you compile on Linux using gcc -Wall, you'll get warnings about uninitialized variables.
The var1 and var2 variables get compiled into using some stack slots, or some registers. These contain some apparently random number, which your program prints. (that number is not really random, it is just unpredictable garbage).
The C language does not mandate the implicit initialization of variables (in contrast with e.g. Java).
In practice, in C I strongly suggest to always explicitly initialize local variables (often, the compiler may be smart enough to even avoid emitting useless initialization).
What you observe is called undefined behavior.
You'll probably observe a different output for var1 if you compiled with a different compiler, or with different optimization flags, or with a different environment (probably typing export SOMEVAR=something before running again your program could change the output for var1, or running your program with a lot of program arguments, etc...).
You could (on Linux) compile your source code yoursource.c with gcc -fverbose-asm -S, adding various optimization flags (e.g. -O1 or -O2 ...), and look inside the generated yoursource.s assembler code with some editor.
In C, when you declare local variables, a typical implementation reserves some space for them on the stack. The stack is how the program keeps track of which arguments are passed to which function, where variables are stored if you declare them locally within a function, where return values are stored, and so on. Each time you call a function, it pushes values on the stack; that is, it writes those values to the next available space on the stack, and updates the stack pointer to account for this. When a function returns, it restores the stack pointer to where it pointed before the call.
If you declare a variable, but you don't initialize it, you simply get whatever value was in there before. If another function has been called, you may get the arguments passed in to that function; or you might get the return address for the function you are returning to.
In the case that you present, you are showing the main() function, with no other functions called. However, in the process of loading your program, the dynamic linker has probably called several functions within your process space. So the values that you are seeing are probably left over from that.
You cannot depend on what these values are, however. They could be anything; they could be initialized to 0, they could be random data, they could be any sort of internal data.
The contents of var1 and var2 are undefined. Thus, they can contain any valid value (depending on many external factors).
It is pure luck that only the second variable seems to be random. Try it on another day, after a reboot or after launching a few other programs, and I bet the first variable will have changed.
These are called local variables. Local variables have the auto storage class by default, and in C these are typically located on the stack.
Since you haven't initialized these variables, they take whatever value happens to be there, called a garbage value or indeterminate value (the language standard doesn't impose any requirement that it be a specific value), so you are getting random-looking values.
It's purely coincidence that you are getting the same value for var1 but not for var2.
On any other system it might give different values, and even on your system it may do so after some time.
So using uninitialized variables is undefined behaviour.
In C,
if variables are declared as global or static, they are automatically initialised to zero. But if they are declared as local (and not static), their values are indeterminate, i.e. some garbage value that depends on the compiler.
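A short sketch of the difference between the storage classes; all names here are made up for illustration. Objects with static storage start at zero without any initializer, while automatic locals only have a known value if you give them one:

```c
int global_counter;                 /* static storage duration: starts at 0 */

/* Counts its own calls; the static local also starts at 0. */
int next_call_number(void)
{
    static int calls;
    return ++calls;
}

/* Automatic locals: initialize them explicitly before reading them. */
int sum_of_initialized_locals(void)
{
    int var1 = 0;
    int var2 = 0;
    return var1 + var2;
}
```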
I'm looking at example abo3.c from Insecure Programming and I'm not grokking the casting in the example below. Could someone enlighten me?
int main(int argv,char **argc)
{
extern system,puts;
void (*fn)(char*)=(void(*)(char*))&system;
char buf[256];
fn=(void(*)(char*))&puts;
strcpy(buf,argc[1]);
fn(argc[2]);
exit(1);
}
So - what's with the casting for system and puts? They both return an int so why cast it to void?
I'd really appreciate an explanation of the whole program to put it in perspective.
[EDIT]
Thank you both for your input!
Jonathan Leffler, there is actually a reason for the code to be 'bad'. It's supposed to be exploitable, overflowing buffers and function pointers etc. mishou.org has a blog post on how to exploit the above code. A lot of it is still above my head.
bta, I gather from the above blog post that casting system would somehow prevent the linker from removing it.
One thing that is not immediately clear is that the system and puts addresses are both written to the same location, I think that might be what gera is talking about “so the linker doesn’t remove it”.
While we are on the subject of function pointers, I'd like to ask a follow-up question now that the syntax is clearer. I was looking at some more advanced examples using function pointers and stumbled upon this abomination, taken from a site hosting shellcode.
#include <stdio.h>
char shellcode[] = "some shellcode";
int main(void)
{
fprintf(stdout,"Length: %d\n",strlen(shellcode));
(*(void(*)()) shellcode)();
}
So the array is getting cast to a function returning void, referenced and called? That just looks nasty - so what's the purpose of the above code?
[/EDIT]
Original question
User bta has given a correct explanation of the cast - and commented on the infelicity of casting system.
I'm going to add:
The extern line is at best weird. It is erroneous under strict C99 because there is no type, which makes it invalid; under C89, the type will be assumed to be int. The line says 'there is an externally defined integer called system, and another called puts', which is not correct - there are a pair of functions with those names. The code may actually 'work' because the linker might associate the functions with the supposed integers. But it is not safe for a 64-bit machine where pointers are not the same size as int. Of course, the code should include the correct headers (<stdio.h> for puts() and <stdlib.h> for system() and exit(), and <string.h> for strcpy()).
The exit(1); is bad on two separate counts.
It indicates failure - unconditionally. You exit with 0 or EXIT_SUCCESS to indicate success.
In my view, it is better to use return at the end of main() than exit(). Not everyone necessarily agrees with me, but I do not like to see exit() as the last line of main(). About the only excuse for it is to avoid problems from other bad practices, such as functions registered with atexit() that depend on the continued existence of local variables defined in main().
/usr/bin/gcc -g -std=c99 -Wall -Wextra -Wmissing-prototypes -Wstrict-prototypes -Wold-style-definition -c nasty.c
nasty.c: In function ‘main’:
nasty.c:3: warning: type defaults to ‘int’ in declaration of ‘system’
nasty.c:3: warning: type defaults to ‘int’ in declaration of ‘puts’
nasty.c:3: warning: built-in function ‘puts’ declared as non-function
nasty.c:8: warning: implicit declaration of function ‘strcpy’
nasty.c:8: warning: incompatible implicit declaration of built-in function ‘strcpy’
nasty.c:10: warning: implicit declaration of function ‘exit’
nasty.c:10: warning: incompatible implicit declaration of built-in function ‘exit’
nasty.c: At top level:
nasty.c:1: warning: unused parameter ‘argv’
Not good code! I worry about a source of information that contains such code and doesn't explain all the awfulness (because the only excuse for showing such messy code is to dissect it and correct it).
There's another weirdness in the code:
int main(int argv,char **argc)
That is 'correct' (it will work) but 100% unconventional. The normal declaration is:
int main(int argc, char **argv)
The names are short for 'argument count' and 'argument vector', and using argc as the name for the vector (array) of strings is abnormal and downright confusing.
Having visited the site referenced, you can see that it is going through a set of graduated examples. I'm not sure whether the author simply has a blind spot on the argc/argv issue or is deliberately messing around ('abo1' suggests that he is playing, but it is not helpful in my view). The examples are supposed to feed your mind, but there isn't much explanation of what they do. I don't think I could recommend the site.
Extension question
What's the cast in this code doing?
#include <stdio.h>
char shellcode[] = "some shellcode";
int main(void)
{
fprintf(stdout,"Length: %d\n",strlen(shellcode));
(*(void(*)()) shellcode)();
}
This takes the address of the string 'shellcode' and treats it as a pointer to a function that takes an indeterminate set of arguments and returns no values and executes it with no arguments. The string contains the binary assembler code for some exploit - usually running the shell - and the objective of the intruder is to get a root-privileged program to execute their shellcode and give them a command prompt, with root privileges. From there, the system is theirs to own. For practicing, the first step is to get a non-root program to execute the shellcode, of course.
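The shape of that cast is easier to read when taken apart. The sketch below uses the same "cast, dereference, call" expression, but on a pointer that really does refer to a function, so it is well defined; converting the address of a data array to a function pointer, as the shellcode example does, is not something ISO C defines. The names generic_fn, twice and call_twice_via_cast are made up for illustration.

```c
/* Generic "pointer to function" type; a function pointer can be
   converted to it and back without losing information. */
typedef void (*generic_fn)(void);

static int twice(int x)
{
    return 2 * x;
}

int call_twice_via_cast(int x)
{
    generic_fn stored = (generic_fn)twice;   /* keep the address under a generic type */

    /* Read inside out, same shape as (*(void(*)()) shellcode)():
       (int (*)(int))stored  cast back to "pointer to function taking int, returning int"
       *( ... )              dereference the function pointer
       ( ... )(x)            call the function with x */
    return (*(int (*)(int))stored)(x);
}
```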
Reviewing the analysis
The analysis at Mishou's web site is not as authoritative as I'd like:
One, this code uses the extern keyword in the C language to make the system and puts functions available. What this does (I think) is basically references directly the location of a function defined in the (implied) header files…I get the impression that GDB is auto-magically including the header files stdlib.h for system and stdio.h for puts. One thing that is not immediately clear is that the system and puts addresses are both written to the same location, I think that might be what gera is talking about “so the linker doesn’t remove it”.
Dissecting the commentary:
The first sentence isn't very accurate; it tells the compiler that the symbols system and puts are defined (as integers) somewhere else. When the code is linked, the address of puts()-the-function is known; the code will treat it as an integer variable, but the address of the integer variable is, in fact, the address of the function - so the cast forces the compiler to treat it as a function pointer after all.
The second sentence is not fully accurate; the linker resolves the addresses of the external 'variables' via the function symbols system() and puts() in the C library.
GDB has nothing whatsoever to do with the compilation or linking process.
The last sentence does not make any sense at all. The addresses only get written to the same location because you have an initialization and an assignment to the same variable.
This didn't motivate me to read the whole article, it must be said. Due diligence forces me onwards; the explanation afterwards is better, though still not as clear as I think it could be. But the operation of overflowing the buffer with an overlong but carefully crafted argument string is the core of the operation. The code mentions both puts() and system() so that when run in non-exploit mode, the puts() function is a known symbol (otherwise, you'd have to use dlopen() to find its address), and so that when run in exploit mode, the code has the symbol system() available for direct use. Unused external references are not made available in the executable - a good thing when you realize how many symbols there are in a typical system header compared with the number used by a program that includes the header.
There are some neat tricks shown, though the implementation of those tricks is not shown on the specific page; I assume (without having verified it) that the information for the getenvaddr program is available.
The abo3.c code can be written as:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(int argc, char **argv)
{
void (*fn)(char*) = (void(*)(char*))system;
char buf[256];
fn = (void(*)(char*))puts;
strcpy(buf, argv[1]);
fn(argv[2]);
exit(1);
}
Now it compiles with only one warning with the fussy compilation options I originally used - and that's the accurate warning that 'argc' is not used. It is just as exploitable as the original; it is 'better' code though because it compiles cleanly. The indirections were unnecessary mystique, not a crucial part of making the code exploitable.
Both system and puts normally return int. The code is casting them to a pointer to a function that returns void, presumably because they want to ignore whatever value is returned. This should be equivalent to using (void)fn(argc[2]); as the penultimate line if the cast didn't change the return type. Casting away the return type is sometimes done for callback functions, and this code snippet seems to be a simplistic example of a callback.
Why the cast for system if it is never used is beyond me. I'm assuming that there's more code that isn't shown here.
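For contrast, a sketch of how the same pointer could be declared without any cast, by giving it the real signature that puts and system share. The names string_fn, choose_fn and say are made up for illustration, and this is ordinary, non-exploitable code, not a rewrite of abo3.c:

```c
#include <stdio.h>
#include <stdlib.h>

/* Pointer type matching the real signatures of puts() and system(). */
typedef int (*string_fn)(const char *);

/* Picks one of the two functions; no cast is needed because the types agree. */
string_fn choose_fn(int use_system)
{
    return use_system ? system : puts;
}

/* Prints text through the pointer; returns 0 on success, -1 if puts fails. */
int say(const char *text)
{
    string_fn fn = choose_fn(0);
    return fn(text) >= 0 ? 0 : -1;
}
```

If the return value is not wanted, (void)fn(text); discards it at the call site without changing the pointer's type.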