What implementation of getenv is used if no header is included?

I am working on a simple program for learning purposes and I see the following behavior:
If I attempt to read an environment variable using getenv it works as expected:
#include <stdio.h>
#include <stdlib.h>

int main() {
    printf("PATH is %s", getenv("PATH"));
    return 0;
}
If I do not include stdlib.h, I get the expected warning:
env.c:7:26: warning: implicit declaration of function ‘getenv’; did you mean ‘getline’? [-Wimplicit-function-declaration]
The program still compiles. The version with the header runs as expected; the version without the header segfaults.
If I inspect the disassembly of both programs, I see that there is a call to getenv@PLT:
call getenv@PLT
When I compare the disassembly of the two programs (where the only difference in the C source code is the header include), the only difference I see is that when we include the header there's an instruction to set %eax to 0 before calling getenv.
movl $0, %eax
If I step through the program without the included header in gdb, it does jump into getenv and runs a whole lot of code within the C runtime.
I am wondering: what is the exact reason that not including the header here causes this kind of behavior? Why does C let this compile?
Some info:
gcc version 7.4.0 (Ubuntu 7.4.0-1ubuntu1~18.04.1)
I am compiling like this:
gcc -O0 -S env.c

In C, if you forget to include the header file, the getenv() function is treated as implicitly declared, i.e. as int getenv() returning int. If you include the header file, getenv() is declared as char *getenv(const char *). Because of how the result is passed on to printf(), the only difference in your case is the size of the return value: if sizeof(int) == sizeof(char *), you have no problem (this happens for 32-bit programs); if sizeof(int) != sizeof(char *), the returned pointer is truncated to an int and you get a crash (this happens for 64-bit programs).
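As a minimal sketch of the fix (purely illustrative; in real code you would simply #include <stdlib.h>), it is enough that a correct prototype is in scope before the call:

#include <stdio.h>

/* Correct prototype, normally provided by <stdlib.h>. */
char *getenv(const char *name);

int main(void) {
    /* With the prototype visible, the full 64-bit pointer returned by
       getenv() is preserved instead of being truncated to an int. */
    printf("PATH is %s\n", getenv("PATH"));
    return 0;
}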

Related

Defining _POSIX_C_SOURCE as 2 causes error when changing code page on Windows CMD with MinGW GCC

I've been writing a Linux program that's meant to write non-English characters to the terminal, and I've recently been porting it to Windows. I've run into some issues when trying to change the code page and the font of the terminal: having the symbolic constant _POSIX_C_SOURCE previously defined seems to change the behavior of the code and makes it incapable of properly printing non-English characters. For reference, this is my code:
#include <windows.h>
#include <stdio.h>

int main()
{
    SetConsoleCP(CP_UTF8);
    SetConsoleOutputCP(CP_UTF8);
    HANDLE hStdOut = GetStdHandle(STD_OUTPUT_HANDLE);
    CONSOLE_FONT_INFOEX cfie;
    ZeroMemory(&cfie, sizeof(cfie));
    cfie.cbSize = sizeof(cfie);
    lstrcpyW(cfie.FaceName, L"Lucida Console");
    SetCurrentConsoleFontEx(hStdOut, 0, &cfie);
    printf("Ћирилични текст\n");
    return 0;
}
This is what the program prints out depending on whether I do or don't define the constant in a command line argument while compiling.
C:\Users\User\Desktop>gcc test.c
C:\Users\User\Desktop>a.exe
Ћириличан текст
C:\Users\User\Desktop>gcc -D_POSIX_C_SOURCE=2 test.c
C:\Users\User\Desktop>a.exe
������������������ ����������
This is because, when POSIX compliance is in effect, output to standard output is done literally byte by byte: a different implementation of what happens inside the printf function is used.
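If the goal is simply to get the Cyrillic text onto the console no matter which printf implementation ends up being linked, one possible workaround (my own sketch, not part of the original answer) is to bypass printf for console output and use the wide-character console API directly:

#include <windows.h>
#include <wchar.h>

int main(void)
{
    HANDLE hStdOut = GetStdHandle(STD_OUTPUT_HANDLE);
    /* WriteConsoleW takes UTF-16 text and does not depend on the active
       code page or on which printf implementation is linked in. */
    const wchar_t *text = L"Ћирилични текст\n";
    DWORD written = 0;
    WriteConsoleW(hStdOut, text, (DWORD)wcslen(text), &written, NULL);
    return 0;
}

Note that WriteConsoleW only works when standard output really is a console; if the output may be redirected to a file, a different code path is needed.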

How to force a crash in C, is dereferencing a null pointer a (fairly) portable way?

I'm writing my own test-runner for my current project. One feature (that's probably quite common with test-runners) is that every testcase is executed in a child process, so the test-runner can properly detect and report a crashing testcase.
I want to also test the test-runner itself, therefore one testcase has to force a crash. I know "crashing" is not covered by the C standard and just might happen as a result of undefined behavior. So this question is more about the behavior of real-world implementations.
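(As a side note, a minimal sketch of that kind of crash detection on POSIX systems could look like the following; the function name run_testcase_in_child is just an illustration, not the real test-runner code.)

#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run one testcase in a child process and report whether it crashed. */
static int run_testcase_in_child(void (*testcase)(void))
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;              /* fork failed */
    if (pid == 0) {             /* child: run the test, then exit cleanly */
        testcase();
        _exit(EXIT_SUCCESS);
    }
    int status = 0;
    waitpid(pid, &status, 0);
    if (WIFSIGNALED(status)) {  /* killed by SIGSEGV, SIGABRT, ... */
        printf("testcase crashed with signal %d\n", WTERMSIG(status));
        return 1;
    }
    return 0;
}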
My first attempt was to just dereference a null-pointer:
int c = *((int *)0);
This worked in a debug build on GNU/Linux and Windows, but failed to crash in a release build because the unused variable c was optimized out, so I added
printf("%d", c); // to prevent optimizing away the crash
and thought I was settled. However, trying my code with clang instead of gcc revealed a surprise during compilation:
[CC] obj/x86_64-pc-linux-gnu/release/src/test/test/test_s.o
src/test/test/test.c:34:13: warning: indirection of non-volatile null pointer
will be deleted, not trap [-Wnull-dereference]
int c = *((int *)0);
^~~~~~~~~~~
src/test/test/test.c:34:13: note: consider using __builtin_trap() or qualifying
pointer with 'volatile'
1 warning generated.
And indeed, the clang-compiled testcase didn't crash.
So, I followed the advice of the warning and now my testcase looks like this:
PT_TESTMETHOD(test_expected_crash)
{
    PT_Test_expectCrash();

    // crash intentionally
    int *volatile nptr = 0;
    int c = *nptr;
    printf("%d", c); // to prevent optimizing away the crash
}
This solved my immediate problem: the testcase "works" (aka crashes) with both gcc and clang.
I guess that because dereferencing the null pointer is undefined behavior, clang is free to compile my first code into something that doesn't crash. The volatile qualifier removes the compiler's ability to prove at compile time that this really will dereference null.
Now my questions are:
Does this final code guarantee the null dereference actually happens at runtime?
Is dereferencing null indeed a fairly portable way for crashing on most platforms?
I wouldn't rely on that method as being robust if I were you.
Can't you use abort(), which is part of the C standard and is guaranteed to cause an abnormal program termination event?
The answer referring to abort() was great; I really didn't think of that, and it's indeed a perfectly portable way of forcing an abnormal program termination.
Trying it with my code, I found that msvcrt (Microsoft's C runtime) implements abort() in a particularly chatty way; it outputs the following to stderr:
This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.
That's not so nice; at the very least it unnecessarily clutters the output of a complete test run. So I had a look at __builtin_trap(), which is also referenced in clang's warning. It turns out this gives me exactly what I was looking for:
LLVM code generator translates __builtin_trap() to a trap instruction if it is supported by the target ISA. Otherwise, the builtin is translated into a call to abort.
It's also available in gcc starting with version 4.2.4:
This function causes the program to exit abnormally. GCC implements this function by using a target-dependent mechanism (such as intentionally executing an illegal instruction) or by calling abort.
As this does something similar to a real crash, I prefer it over a simple abort(). For the fallback, it's still an option to try your own illegal operation, like the null pointer dereference, and just add a call to abort() in case the program somehow makes it there without crashing.
So, all in all, the solution looks like this, testing for a minimum GCC version and using the much more handy __has_builtin() macro provided by clang:
#undef HAVE_BUILTIN_TRAP
#ifdef __GNUC__
# define GCC_VERSION (__GNUC__ * 10000 \
                      + __GNUC_MINOR__ * 100 + __GNUC_PATCHLEVEL__)
# if GCC_VERSION > 40203
# define HAVE_BUILTIN_TRAP
# endif
#else
# ifdef __has_builtin
# if __has_builtin(__builtin_trap)
# define HAVE_BUILTIN_TRAP
# endif
# endif
#endif
#ifdef HAVE_BUILTIN_TRAP
# define crashMe() __builtin_trap()
#else
# include <stdio.h>
# include <stdlib.h>
# define crashMe() do { \
    int *volatile iptr = 0; \
    int i = *iptr; \
    printf("%d", i); \
    abort(); } while (0)
#endif
// [...]
PT_TESTMETHOD(test_expected_crash)
{
    PT_Test_expectCrash();

    // crash intentionally
    crashMe();
}
You can write to memory instead of reading it:
*((int *)0) = 0;
No, dereferencing a NULL pointer is not a portable way of crashing a program. It is undefined behavior, which means just that: you have no guarantees about what will happen.
As it happens, for the most part under any of the three main OSes used today on desktop computers, those being macOS, Linux and Windows NT (*), dereferencing a NULL pointer will immediately crash your program.
That said: "The worst possible result of undefined behavior is for it to do what you were expecting."
I purposely put a star beside Windows NT, because under Windows 95/98/ME, I can craft a program that has the following source:
#include <stddef.h>  /* for NULL */

int main()
{
    int *pointer = NULL;
    int i = *pointer;
    return 0;
}
that will run without crashing. Compile it as a TINY mode .COM file under 16-bit DOS, and you'll be just fine.
Ditto running the same source with just about any C compiler under CP/M.
Ditto running that on some embedded systems. I've not tested it on an Arduino, but I would not want to bet either way on the outcome. I do know for certain that were a C compiler available for the 8051 systems I cut my teeth on, that program would run fine on those.
The program below should work. It might cause some collateral damage, though.
#include <string.h>

void crashme(char *str)
{
    char *omg;
    for (omg = strtok(str, ""); omg; omg = strtok(NULL, "")) {
        strcat(omg, "wtf");
    }
    *omg = 0; // always NUL-terminate a NULL string !!!
}

int main(void)
{
    char buff[20];
    // crashme( "WTF" );  // works!
    // crashme( NULL );   // works, too
    crashme(buff);        // Maybe a bit too slow ...
    return 0;
}

Force gcc to use syscalls

So I am currently learning assembly language (AT&T syntax). We all know that gcc has an option to generate assembly code from C code with the -S argument. Now I would like to look at how some code looks in assembly. The problem is, in our lab sessions we compile with as + ld, and for now we cannot use C libraries. So, for example, we cannot use printf; we are supposed to do it via syscalls (32-bit is enough). And now I have this code in C:
#include <stdio.h>

int main()
{
    int a = 5;
    int b = 3;
    int c = a + b;
    printf("%d", c);
    return 0;
}
This is simple code, so I know how it will look with syscalls. But if I have some more complicated code, I don't want to mess around replacing every call to printf and fixing up registers by hand, because gcc generated the code that calls printf and I need it to use syscalls. So can I somehow make gcc generate assembly code that uses syscalls (for example for I/O on the console or files) instead of the C library?
Under Linux there exists the macro family _syscallX for generating a syscall, where X names the number of parameters. It is marked as obsolete, but IMHO it still works. E.g., the following code should work (not tested here):
_syscall3(int, syswrite, int, handle, char *, str, int, len);
// ---
char str[] = "Hello, world!\n";
// file handle 1 is stdout
syswrite(1, str, 14);
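A more current alternative (not part of the original answer, but available on any glibc-based Linux system) is the generic syscall() wrapper from <unistd.h>, which avoids the obsolete _syscallX macros:

#define _GNU_SOURCE         /* exposes syscall() in <unistd.h> on glibc */
#include <sys/syscall.h>    /* SYS_write */
#include <unistd.h>

int main(void)
{
    const char msg[] = "Hello, world!\n";
    /* write(2) as a raw system call: file descriptor 1 is stdout */
    syscall(SYS_write, 1, msg, sizeof msg - 1);
    return 0;
}

Note that syscall() is itself a tiny libc wrapper, and gcc has no option to rewrite printf calls into raw syscalls for you; for a fully libc-free binary the system calls have to be made explicitly (or via inline assembly).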

This C function should always return false, but it doesn’t

I stumbled over an interesting question in a forum a long time ago and I want to know the answer.
Consider the following C function:
f1.c
#include <stdbool.h>

bool f1()
{
    int var1 = 1000;
    int var2 = 2000;
    int var3 = var1 + var2;
    return (var3 == 0) ? true : false;
}
This should always return false since var3 == 3000. The main function looks like this:
main.c
#include <stdio.h>
#include <stdbool.h>

int main()
{
    printf(f1() == true ? "true\n" : "false\n");
    if (f1())
    {
        printf("executed\n");
    }
    return 0;
}
Since f1() should always return false, one would expect the program to print only one false to the screen. But after compiling and running it, executed is also displayed:
$ gcc main.c f1.c -o test
$ ./test
false
executed
Why is that? Does this code have some sort of undefined behavior?
Note: I compiled it with gcc (Ubuntu 4.9.2-10ubuntu13) 4.9.2.
As noted in other answers, the problem is that you use gcc with no compiler options set. If you do this, it defaults to what is called "gnu90", which is a non-standard implementation of the old, withdrawn C90 standard from 1990.
In the old C90 standard there was a major flaw in the C language: if you didn't declare a prototype before using a function, it would default to int func() (where the empty parentheses mean "accept any parameters"). This changes how the caller treats the function's return value, but it doesn't change the actual function definition. Since bool and int have different sizes, your code invokes undefined behavior when the function is called.
This dangerous nonsense behavior was fixed in the year 1999, with the release of the C99 standard. Implicit function declarations were banned.
Unfortunately, GCC up to version 5.x.x still uses the old C standard by default. There is probably no reason why you should want to compile your code as anything but standard C. So you have to explicitly tell GCC that it should compile your code as modern C code, instead of some 25+ years old, non-standard GNU crap.
Fix the problem by always compiling your program as:
gcc -std=c11 -pedantic-errors -Wall -Wextra
-std=c11 tells it to make a half-hearted attempt to compile according to the (current) C standard (informally known as C11).
-pedantic-errors tells it to whole-heartedly do the above, and give compiler errors when you write incorrect code which violates the C standard.
-Wall means give me some extra warnings that might be good to have.
-Wextra means give me some other extra warnings that might be good to have.
You don't have a prototype declared for f1() in main.c, so it is implicitly declared as int f1(), meaning it is a function that takes an unknown number of arguments and returns an int.
If int and bool are of different sizes, this will result in undefined behavior. For example, on my machine, int is 4 bytes and bool is one byte. Since the function is defined to return bool, only one byte of the return value is actually set when it returns. However, since it's implicitly declared to return int in main.c, the calling code will try to read 4 bytes of return value.
The default compiler options in gcc won't tell you that it's doing this. But if you compile with -Wall -Wextra, you'll get this:
main.c: In function ‘main’:
main.c:6: warning: implicit declaration of function ‘f1’
To fix this, add a declaration for f1 in main.c, before main:
bool f1(void);
Note that the argument list is explicitly set to void, which tells the compiler the function takes no arguments, as opposed to an empty parameter list, which means an unknown number of arguments. The definition of f1 in f1.c should also be changed to reflect this.
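One way to keep the declaration and the definition in sync (a sketch; the file name f1.h is just an illustration) is to put the prototype into a header that both translation units include:

/* f1.h -- hypothetical shared header */
#ifndef F1_H
#define F1_H

#include <stdbool.h>

bool f1(void);   /* (void): takes no arguments */

#endif

With #include "f1.h" at the top of both f1.c and main.c, the compiler sees the same prototype in both files and will complain if the definition ever drifts out of sync with it.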
I think it's interesting to see where the size-mismatch mentioned in Lundin's excellent answer actually happens.
If you compile with --save-temps, you will get assembly files that you can look at. Here's the part where f1() does the == 0 comparison and returns its value:
cmpl $0, -4(%rbp)
sete %al
The returning part is sete %al. In the x86-64 calling convention used here, return values of 4 bytes or smaller (which includes int and bool) are returned via register %eax. %al is the lowest byte of %eax. So the upper 3 bytes of %eax are left in an uncontrolled state.
Now in main():
call f1
testl %eax, %eax
je .L2
This checks whether the whole of %eax is zero, because it thinks it's testing an int.
Adding an explicit function declaration changes main() to:
call f1
testb %al, %al
je .L2
which is what we want.
Please compile with a command such as this one:
gcc -Wall -Wextra -Werror -std=gnu99 -o main.exe main.c
Output:
main.c: In function 'main':
main.c:14:5: error: implicit declaration of function 'f1' [-Werror=implicit-function-declaration]
printf( f1() == true ? "true\n" : "false\n");
^
cc1.exe: all warnings being treated as errors
With such a message, you should know what to do to correct it.
Edit: After reading a (now deleted) comment, I tried to compile your code without the flags. Well, this led me to linker errors with no compiler warnings, instead of compiler errors. And those linker errors are more difficult to understand, so even if -std=gnu99 is not necessary, please try to always use at least -Wall -Werror; it will save you a lot of pain in the ass.

putw() function in C

The following program using putw is not writing the required data to the file.
#include <stdio.h>

int main(void)
{
    FILE *fp;
    fp = fopen("a.txt", "w");
    putw(25, fp);
    putw(325, fp);
    putw(425, fp);
    fclose(fp);
    return 0;
}
Program is compiled and executed like the following
gcc filename.c
./a.out
It is writing something to the file. Also, if we read the integer back using getw(), it reads a value that does not match what appears in the file; it is not the ASCII value either.
When it is compiled with gcc filename.c -std=c99, it shows an implicit declaration warning.
Is it required to link any library files to use putw/getw in C?
There is no function called putw in standard C, which is why you get compiler warnings. You probably meant to use putwc in wchar.h.
putw is an ancient function that exists on some platforms. Use fwrite and fread instead. You should also check the return value from putw. It may be telling you why it is failing.
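As a rough sketch of the fwrite/fread approach (writing the same three integers in their binary representation and reading them back; the file name a.bin is just an illustration):

#include <stdio.h>

int main(void)
{
    int out[3] = { 25, 325, 425 };
    int in[3] = { 0, 0, 0 };

    /* Write the integers as raw bytes, not as human-readable text. */
    FILE *fp = fopen("a.bin", "wb");
    if (fp == NULL)
        return 1;
    if (fwrite(out, sizeof out[0], 3, fp) != 3) {
        fclose(fp);
        return 1;
    }
    fclose(fp);

    /* Read them back into memory; the file itself holds binary data. */
    fp = fopen("a.bin", "rb");
    if (fp == NULL)
        return 1;
    if (fread(in, sizeof in[0], 3, fp) == 3)
        printf("%d %d %d\n", in[0], in[1], in[2]);
    fclose(fp);
    return 0;
}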
