Undefined behaviour after multiple calls to a print function - c

I'm writing a basic compiler and the code generated does not work as intended.
I'm using a naive graph coloring algorithm to allocate variables in registers based on their liveness.
The problem is that the generated assembly code seems perfectly fine, but, at some point, it produces undefined behaviour.
If, instead of using registers to store variables, I just use the stack, everything works fine.
I also discovered that I can't use the %edx register around an imull instruction and I wondered if something similar is happening right now with %ebx and %ecx.
I compile the code using gcc -m32 "test.s" runtime.c -o test, where runtime.c is a helper C file containing the print and input functions.
I've also tried to remove parts of the program (every print except the last one) and then the last print will work.
If I call a single print function before the last call it won't work.
The runtime.c file:
#include <stdio.h>
#include <ctype.h>
#include <stdlib.h>
int input() {
int num;
char term;
scanf("%d%c", &num, &term);
return num;
}
void print_int_nl(int i) {
printf("%d\n", i);
}
Source file:
a = 10
b = input()
c = - 10
d = -input()
print a
print b
print c
print d
Generated assembly code:
https://pastebin.com/ChSRbWgt
After compiling the .s file and running it from the console (./test), it asks for two inputs (as intended).
I give it 1 and 2.
Then the output is:
10
1
-10
1415880
instead of
10
1
-10
-2

You need to observe the calling convention (see e.g. Calling conventions for different C++ compilers and operating systems by Agner Fog).
Namely, there are caller-save and callee-save registers in the convention for your C compiler.
Your generated code needs to preserve the callee-save registers in order to be able to return to its C caller.
Similarly, printf() will preserve the callee-save registers, but it can trash the caller-save registers, meaning that if your generated code calls printf(), it will need to preserve the caller-save registers across the calls to printf() or any other C function.

You need to clear the input buffer after your first input, because the buffer still contains the newline character. I'd recommend this:
int input() {
int num;
char term;
scanf("%d %c", &num, &term);
return num;
}

Related

Why GCC won't give me stackerror on long arguments?

My point is to show someone that every argument you send to a C function is pushed on the stack. It also happens in Ruby, Python, Lua, JavaScript, etc.
So I have written a Ruby code that generates a C code:
#!/usr/bin/env ruby
str = Array.new(10, &:itself)
a = <<~EOF
#include <stdio.h>
void x(#{str.map { |x| "int n#{x}" }.join(?,)}) {
printf("%d\\n", n#{str.length - 1}) ;
}
int main() { x(#{str.join(?,)}) ; }
EOF
IO.write('p.c', a)
After running this code with the Ruby interpreter, I get a file called p.c, which has this content:
#include <stdio.h>
void x(int n0,int n1,int n2,int n3,int n4,int n5,int n6,int n7,int n8,int n9) {
printf("%d\n", n9) ;
}
int main() { x(0,1,2,3,4,5,6,7,8,9) ; }
Which is good, and does compile and execute just fine.
But if I give the Ruby program an array size of 100,000, it should generate a C file whose function takes the arguments n0 to n99999. That means 100,000 arguments.
A quick google search shows me that C's arguments are stored on the stack.
Passing these arguments should give me a stackerror, but it doesn't. GCC compiles it just fine, I also get output of 99999.
But with Clang, I get:
p.c:4:17: error: use of undeclared identifier 'n99999'
printf("%d\n", n99999) ;
^
p.c:8:195690: error: too many arguments to function call, expected 34464, have 100000
p.c:3:6: note: 'x' declared here
2 errors generated.
How does GCC deal with that many arguments? In most cases, I get a stack error in other programming languages when the stack size is 10900.
The best way to prove this to your friend is to write an infinite recursive function:
#include <stdio.h>
void recurse(int x) {
static int iterations=0;
printf("Iteration: %d\n", ++iterations);
recurse(x);
}
int main() {
recurse(1);
}
This will always overflow the stack assuming there is a stack (not all architectures use stacks). It will tell you how many stack frames you get to before the stack overflow happens; this will give you an idea of the depth of the stack.
As for why gcc compiles, gcc does not know the target stack size so it cannot check for a stack overflow. It's theoretically possible to have a stack large enough to accommodate 100,000 arguments. That's less than half a megabyte. Not sure why clang behaves differently; it would depend on seeing the generated C code.
If you can share what computer system/architecture you are using, it would be helpful. You cited information that applies to 64-bit Intel systems (e.g. PC/Windows).

How can I access interpreter path address at runtime in C?

By using the objdump command I figured out that the address 0x02a8 in memory contains the start of the path /lib64/ld-linux-x86-64.so.2, and this path ends with a 0x00 byte, due to the C standard.
So I tried to write a simple C program that will print this line (I used a sample from the book "RE for beginners" by Denis Yurichev - page 24):
#include <stdio.h>
int main(){
printf(0x02a8);
return 0;
}
But I was disappointed to get a segmentation fault instead of the expected /lib64/ld-linux-x86-64.so.2 output.
I find it strange to use such a "fast" call of printf without specifiers or at least pointer cast, so I tried to make the code more natural:
#include <stdio.h>
int main(){
char *p = (char*)0x02a8;
printf(p);
printf("\n");
return 0;
}
And after running this I still got a segmentation fault.
I don't believe this is happening because of restricted memory areas, because in the book it all goes well at the 1st try. I am not sure, maybe there is something more that wasn't mentioned in that book.
So I need a clear explanation of why the segmentation faults keep happening every time I try running the program.
I'm using the latest fully-upgraded Kali Linux release.
Disappointing to see that your "RE for beginners" book does not go into the basics first, and spits out this nonsense. Nonetheless, what you are doing is obviously wrong, let me explain why.
Normally on Linux, GCC produces ELF executables that are position independent. This is done for security purposes. When the program is run, the operating system is able to place it anywhere in memory (at any address), and the program will work just fine. This technique is called Address Space Layout Randomization, and is a feature of the operating system that nowadays is enabled by default.
Normally, an ELF program would have a "base address", and would be loaded exactly at that address in order to work. However, in case of a position independent ELF, the "base address" is set to 0x0, and the operating system and the interpreter decide where to put the program at runtime.
When using objdump on a position independent executable, every address that you see is not a real address, but rather, an offset from the base of the program (that will only be known at runtime). Therefore it is not possible to know the position of such a string (or any other variable) at runtime.
If you want the above to work, you will have to compile an ELF that is not position independent. You can do so like this:
gcc -no-pie -fno-pie prog.c -o prog
It no longer works like that. The 64-bit Linux executables that you're likely using are position-independent and they're loaded into memory at an arbitrary address. In that case the ELF file does not contain any fixed base address.
While you could make a position-dependent executable as instructed by Marco Bonelli, that is not how things work for arbitrary executables on modern 64-bit linuxen, so it is more worthwhile to learn to do this with position-independent ones, even though it is a bit trickier.
This worked for me to print ELF i.e. the elf header magic, and the interpreter string. This is dirty in that it probably only works for a small executable anyway.
#include <stdio.h>
#include <stdlib.h>
#include <inttypes.h>
int main(){
// convert main to uintptr_t
uintptr_t main_addr = (uintptr_t)main;
// clear bottom 12 bits so that it points to the beginning of page
main_addr &= ~0xFFFLLU;
// subtract one page so that we're in the elf headers...
main_addr -= 0x1000;
// elf magic
puts((char *)main_addr);
// interpreter string, offset from hexdump!
puts((char *)main_addr + 0x318);
}
There is another trick to find the beginning of the ELF executable in memory: the so-called auxiliary vector and getauxval:
The getauxval() function retrieves values from the auxiliary vector,
a mechanism that the kernel's ELF binary loader uses to pass certain
information to user space when a program is executed.
The location of the ELF program headers in memory will be
#include <sys/auxv.h>
char *program_headers = (char*)getauxval(AT_PHDR);
The actual ELF header is 64 bytes long, and the program headers start at byte 64 so if you subtract 64 from this you will get a pointer to the magic string again, therefore our code can be simplified to
#include <stdio.h>
#include <inttypes.h>
#include <sys/auxv.h>
int main(){
char *elf_header = (char *)getauxval(AT_PHDR) - 0x40;
puts(elf_header + 0x318); // or whatever the offset was in your executable
}
And finally, an executable that figures out the interpreter position from the ELF headers alone, provided that you've got a 64-bit ELF, magic numbers from Wikipedia...
#include <stdio.h>
#include <inttypes.h>
#include <sys/auxv.h>
int main() {
// get pointer to the first program header
char *ph = (char *)getauxval(AT_PHDR);
// elf header at this position
char *elfh = ph - 0x40;
// segment type 0x3 is the interpreter;
// program header item length 0x38 in 64-bit executables
while (*(uint32_t *)ph != 3) ph += 0x38;
// the offset is a 64-bit field at 0x8 in the program header entry;
// its value is measured from the beginning of the executable
uint64_t offset = *(uint64_t *)(ph + 0x8);
// print the interpreter path...
puts(elfh + offset);
}
I guess it segfaults because of the way you use printf: you don't use the format parameter the way it is designed to be used.
When you use the printf function to display data, the first argument it takes is a string that controls how the output is formatted: int printf(const char *fmt, ...), where the ... represents the data you want to display according to the format string.
So if you want to print a string:
//format as text
printf("%s\n", pointer_to_beginning_of_string);
//
If this does not work, and it probably will not, it is because you are trying to read memory that you are not supposed to access.
Try adding the extra flags "-Werror -Wextra -Wall -pedantic" to your compiler invocation and show us the errors, please.

Why is the output 5 when sum doesn't return anything? [duplicate]

Consider:
#include <stdio.h>
char toUpper(char);
int main(void)
{
char ch, ch2;
printf("lowercase input: ");
ch = getchar();
ch2 = toUpper(ch);
printf("%c ==> %c\n", ch, ch2);
return 0;
}
char toUpper(char c)
{
if(c>='a' && c<='z')
c = c - 32;
}
In the toUpper function, the return type is char, but there isn't any "return" in toUpper(). I compiled the source code with gcc (GCC) 4.5.1 20100924 (Red Hat 4.5.1-4) on Fedora 14.
Of course, a warning is issued: "warning: control reaches end of non-void function", but the program works well.
What happened in that code during compilation with gcc?
When the C program was compiled into assembly language, your toUpper function ended up like this, perhaps:
_toUpper:
LFB4:
pushq %rbp
LCFI3:
movq %rsp, %rbp
LCFI4:
movb %dil, -4(%rbp)
cmpb $96, -4(%rbp)
jle L8
cmpb $122, -4(%rbp)
jg L8
movzbl -4(%rbp), %eax
subl $32, %eax
movb %al, -4(%rbp)
L8:
leave
ret
The subtraction of 32 was carried out in the %eax register. And in the x86 calling convention, that is the register in which the return value is expected to be! So... you got lucky.
But please pay attention to the warnings. They are there for a reason!
It depends on the Application Binary Interface and which registers are used for the computation.
E.g. on x86, the return value is stored in EAX (and some register-based calling conventions also pass the first function parameter there), so gcc is most likely using this register to store the result of the calculation as well.
Essentially, c is pushed into the spot that should later be filled with the return value; since it's not overwritten by use of return, it ends up as the value returned.
Note that relying on this (in C, or any other language where this isn't an explicit language feature, like Perl), is a Bad Idea™. In the extreme.
One missing thing that's important to understand is that it's rarely a diagnosable error to omit a return statement. Consider this function:
int f(int x)
{
if (x!=42) return x*x;
}
As long as you never call it with an argument of 42, a program containing this function is perfectly valid C and does not invoke any undefined behavior, despite the fact that it would invoke UB if you called f(42) and subsequently attempted to use the return value.
As such, while it's possible for a compiler to provide warning heuristics for missing return statements, it's impossible to do so without false positives or false negatives. This is a consequence of the impossibility of solving the halting problem.
I can't tell you the specifics of your platform as I don't know it, but there is a general answer to the behaviour you see.
When some function that has a return is compiled, the compiler will use a convention on how to return that data. It could be a machine register, or a defined memory location such as via a stack or whatever (though generally machine registers are used). The compiled code may also use that location (register or otherwise) while doing the work of the function.
If the function doesn't return anything, then the compiler will not generate code that explicitly fills that location with a return value. However, like I said above, it may use that location during the function. When you write code that reads the return value (ch2 = toUpper(ch);), the compiler will write code that uses its convention on how retrieve that return from the conventional location. As far as the caller code is concerned, it will just read that value from the location, even if nothing was written explicitly there. Hence you get a value.
Now look at Ray's example. The compiler used the EAX register to store the results of the upper casing operation. It just so happens, this is probably the location that return values are written to. On the calling side, ch2 is loaded with the value that's in EAX - hence a phantom return. This is only true of the x86 range of processors, as on other architectures the compiler may use a completely different scheme in deciding how the convention should be organised.
However, good compilers will try to optimise according to a set of local conditions, knowledge of code, rules, and heuristics. So an important thing to note is that it is just luck that this works. The compiler could optimise and not do this or whatever - you should not rely on the behaviour.
You should keep in mind that such code may crash depending on the compiler. For example, Clang generates a ud2 instruction at the end of such function and your app will crash at run time.
There are no local variables, so the value on the top of the stack at the end of the function will be the parameter c. The value at the top of the stack upon exiting, is the return value. So whatever c holds, that's the return value.
I have tried a small program:
#include <stdio.h>
int f1() {
}
int main() {
printf("TEST: <%d>\n", f1());
printf("TEST: <%d>\n", f1());
printf("TEST: <%d>\n", f1());
printf("TEST: <%d>\n", f1());
printf("TEST: <%d>\n", f1());
}
Result:
TEST: <1>
TEST: <10>
TEST: <11>
TEST: <11>
TEST: <11>
I have used the MinGW-GCC compiler, so there might be differences.
You could just play around and try, e.g., a char function.
As long as you don't use the result value, it will still work fine.
#include <stdio.h>
char f1() {
}
int main() {
f1();
}
But I would still recommend either declaring the function void or returning some value.
Your function seems to need a return:
char toUpper(char c)
{
if(c>='a'&&c<='z')
c = c - 32;
return c;
}

Force gcc to use syscalls

So I am currently learning assembly language (AT&T syntax). We all know that gcc has an option to generate assembly code from C code with the -S argument. Now, I would like to see how some code looks in assembly. The problem is that in our labs we build with as+ld, and for now we cannot use C libraries. So, for example, we cannot use printf. We should do it with syscalls (32 bit is enough). And now I have this code in C:
#include <stdio.h>
int main()
{
int a = 5;
int b = 3;
int c = a + b;
printf("%d", c);
return 0;
}
This is simple code, so I know how it will look with syscalls. But if I have some more complicated code, I don't want to mess around and replace every call printf and modify other registers, cuz gcc generated code for printf, and I should have it with syscalls. So can I somehow make gcc generate assembly code with syscalls (for example for I/O (console, files)), not with C libs?
Under Linux there exists the macro family _syscallX to generate a syscall, where the X names the number of parameters. It is marked as obsolete, but IMHO still working. E.g., the following code should work (not tested here):
_syscall3(int,syswrite,int,handle,char*,str,int,len);
// ---
char str[]="Hello, world!\n";
// file handle 1 is stdout
syswrite(1,str,14);

incompatible return type from struct function - C

When I attempt to run this code as it is, I receive the compiler message "error: incompatible types in return". I marked the location of the error in my code. If I take the line out, then the compiler is happy.
The problem is I want to return a value representing invalid input to the function (which in this case is calling f2(2).) I only want a struct returned with data if the function is called without using 2 as a parameter.
I feel the only two ways to go is to either:
make the function return a struct pointer instead of a dead-on struct but then my caller function will look funny as I have to change y.b to y->b and the operation may be slower due to the extra step of fetching data in memory.
Allocate extra memory, zero-byte fill it, and set the return value to the struct in that location in memory. (example: return x[nnn]; instead of return x[0];). This approach will use more memory and some processing to zero-byte fill it.
Ultimately, I'm looking for a solution that will be fastest and cleanest (in terms of code) in the long run. If I have to be stuck with using -> to address members of elements then I guess that's the way to go.
Does anyone have a solution that uses the least cpu power?
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
typedef struct{
long a;
char b;
}ab;
static char dat[500];
ab f2(int set){
ab* x=(ab*)dat;
if (set==2){return NULL;}
if (set==1){
x->a=1;
x->b='1';
x++;
x->a=2;
x->b='2';
x=(ab*)dat;
}
return x[0];
}
int main(){
ab y;
y=f2(1);
printf("%c",y.b);
y.b='D';
y=f2(0);
printf("%c",y.b);
return 0;
}
If you care about speed, it is implementation specific.
Notice that the Linux x86-64 ABI defines that a struct of two (exactly) scalar members (that is, integers, doubles, or pointers, -which all fit in a single machine register- but not struct etc... which are aggregate data) is returned thru two registers (without going thru the stack), and that is quite fast.
BTW
if (set==2){return NULL;} //wrong
is obviously wrong. You could code:
if (set==2) return (ab){0,0};
Also,
ab* x=(ab*)dat; // suspicious
looks suspicious to me (since you return x[0]; later). You are not guaranteed that dat is suitably aligned (e.g. to 8 or 16 bytes), and on some platforms (notably x86-64) if dat is misaligned you are at least losing performance (actually, it is undefined behavior).
BTW, I would suggest always returning with statements like return (ab){l,c}; (where l is an expression convertible to long and c is an expression convertible to char); this is probably the easiest to read, and will be optimized to load the two return registers.
Of course if you care about performance, for benchmarking purposes, you should enable optimizations (and warnings), e.g. compile with gcc -Wall -Wextra -O2 -march=native if using GCC; on my system (Linux/x86-64 with GCC 5.2) the small function
ab give_both(long x, char y)
{ return (ab){x,y}; }
is compiled (with gcc -O2 -march=native -fverbose-asm -S) into:
.globl give_both
.type give_both, @function
give_both:
.LFB0:
.file 1 "ab.c"
.loc 1 7 0
.cfi_startproc
.LVL0:
.loc 1 7 0
xorl %edx, %edx # D.2139
movq %rdi, %rax # x, x
movb %sil, %dl # y, D.2139
ret
.cfi_endproc
you see that all the code is using registers, and no memory is used at all..
I would use the return value as an error code, and the caller passes in a pointer to his struct such as:
int f2(int set, ab *outAb); // returns -1 if invalid input and 0 otherwise

Resources