I came across this piece of code (for the whole program see this page, see the program named "srop.c").
My question is regarding how func is used in the main method. I have only kept the code which I thought could be related.
It is the line *ret = (int)func +4; that confuses me.
There are three questions I have regarding this:
func(void) is a function, should it not be called with func() (note the brackets)
Accepting that this might be some way of calling a function that is unknown to me, how can it be cast to an int when it should return void?
I understand that the author doesn't want to save the frame pointer nor update it (the prologue), as his comment indicates. How is this skipping-two-lines ahead achieved with casting the function to an int and adding four?
(gdb) disassemble func
Dump of assembler code for function func:
0x000000000040069b <+0>: push %rbp
0x000000000040069c <+1>: mov %rsp,%rbp
0x000000000040069f <+4>: mov $0xf,%rax
0x00000000004006a6 <+11>: retq
0x00000000004006a7 <+12>: pop %rbp
0x00000000004006a8 <+13>: retq
End of assembler dump.
Possibly relevant is that when compiled gcc tells me the following:
warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
Please see below for the code.
void func(void)
{
asm("mov $0xf,%rax\n\t");
asm("retq\n\t");
}
int main(void)
{
unsigned long *ret;
/*...*/
/* overflowing */
ret = (unsigned long *)&ret + 2;
*ret = (int)func +4; //skip gadget's function prologue
/*...*/
return 0;
}
[Edit] Following the very helpful advice, here are some further information:
calling func returns a pointer to the start of the function: 0x400530
casting this to an int is dangerous (in hex) 400530
casting this to an int in decimal 4195632
safe cast to unsigned long 4195632
size of void pointer: 8
size of int: 4
size of unsigned long: 8
[Edit 2:] @cmaster: Could you please point me to some more information on how to put the assembler function in a separate file and link to it? The original program will not compile because it doesn't know what the function prog (when put in the assembler file) is, so it must be added either before or during compilation?
Additionally, gcc -S, when run on a C file that includes only the assembly commands, seems to add a lot of extra information. Could func(void) not be represented by the following assembler code?
func:
mov $0xf,%rax
retq
This code assumes a lot more than what is good for it. Anyway, the snippet that you have shown only tries to produce a pointer to the assembler function body, it does not attempt to call it. Here is what it does, and what it assumes:
func by itself produces a pointer to the function.
Assumption 1:
The pointer actually points to the start of the assembler code for func. That assumption is not necessarily right, there are architectures where a function pointer is actually a pointer to a pair of pointers, one of which points to the code, the other of which points to a data segment.
func + 4 increments this pointer to point to the first instruction of the body of the function.
Assumption 2:
Function pointers can be incremented, and their increment is in terms of bytes. I believe that this is not covered by the C standard, but I may be wrong on that one.
Assumption 3:
The prolog that is inserted by the compiler is precisely four bytes long. There is absolutely nothing that dictates what kind of prolog the compiler should emit, there is a multitude of variants allowed, with very different lengths. The code you've given tries to control the length of the prolog by not passing/returning any parameters, but still there can be compilers that produce a different prolog. Worse, the size of the prolog may depend on the optimization level.
The resulting pointer is cast to an int.
Assumption 4:
sizeof(void (*)(void)) == sizeof(int). This is false on most 64 bit systems: on these systems int is usually still four bytes while a pointer occupies eight bytes. On such a system, the pointer value will be truncated. When the int is cast back into a function pointer and called, this will likely crash the program.
My advice:
If you really want to program in assembler, compile a file with only an empty function with gcc -S. This will give you an assembler source file with all the cruft that's needed for the assembler to produce a valid object file and show you where you can add the code for your own function. Modify that file in any way you like, and then compile it together with some calling C code as normal. That way you avoid all these dangerous little assumptions.
The name of a function is a pointer to the start of the function. So the author is not calling the function at that point. Just saving a reference to the start of it.
It's not a void. It's a function pointer. More precisely in this case it is of type: void (*)(void). A pointer is just an address so can be cast to an int (but the address may be truncated if compiled for a 64 bit system as ints are 32 bits in that case).
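The first two instructions of the function (push %rbp, one byte, and mov %rsp,%rbp, three bytes) form the prologue that saves and sets up the frame pointer. By adding 4, both are skipped, so the resulting address points at mov $0xf,%rax, which sits at offset +4 in your disassembly. Note that in the snippet you gave, the function has not been called. The call is probably in the part of the code that you have not included.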
Related
I would like to provoke a stack underflow in a C function to test security measures in my system. I could do this using inline assembler. But C would be more portable. However I can not think of a way to provoke a stack underflow using C since stack memory is safely handled by the language in that regard.
So, is there a way to provoke a stack underflow using C (without using inline assembler)?
As stated in the comments: Stack underflow means having the stack pointer point to an address below the beginning of the stack ("below" for architectures where the stack grows from low to high).
There's a good reason why it's hard to provoke a stack underflow in C. The reason is that standards-compliant C does not have a stack.
Have a read of the C11 standard, you'll find out that it talks about scopes but it does not talk about stacks. The reason for this is that the standard tries, as far as possible, to avoid forcing any design decisions on implementations. You may be able to find a way to cause stack underflow in pure C for a particular implementation but it will rely on undefined behaviour or implementation specific extensions and won't be portable.
You can't do this in C, simply because C leaves stack handling to the implementation (compiler). Similarly, you cannot write a bug in C where you push something on the stack but forget to pop it, or vice versa.
Therefore, it is impossible to produce a "stack underflow" in pure C. You cannot pop from the stack in C, nor can you set the stack pointer from C. The concept of a stack is something on an even lower level than the C language. In order to directly access and control the stack pointer, you must write assembler.
What you can do in C is to purposely write out of bounds of the stack. Suppose we know that the stack starts at 0x1000 and grows upwards. Then we can do this:
volatile uint8_t* const STACK_BEGIN = (volatile uint8_t*)0x1000;
for(volatile uint8_t* p = STACK_BEGIN; p<STACK_BEGIN+n; p++)
{
*p = garbage; // write outside the stack area, at whatever memory comes next
}
Why you would need to test this in a pure C program that doesn't use assembler, I have no idea.
In case someone incorrectly got the idea that the above code invokes undefined behavior, this is what the C standard actually says, normative text C11 6.5.3.2/4 (emphasis mine):
The unary * operator denotes indirection. If the operand points to a function, the result is
a function designator; if it points to an object, the result is an lvalue designating the
object. If the operand has type ‘‘pointer to type’’, the result has type ‘‘type’’. If an
invalid value has been assigned to the pointer, the behavior of the unary * operator is
undefined 102)
The question is then what's the definition of an "invalid value", as this is no formal term defined by the standard. Foot note 102 (informative, not normative) provides some examples:
Among the invalid values for dereferencing a pointer by the unary * operator are a null pointer, an
address inappropriately aligned for the type of object pointed to, and the address of an object after the
end of its lifetime.
In the above example we are clearly not dealing with a null pointer, nor with an object that has passed the end of its lifetime. The code may indeed cause a misaligned access - whether this is an issue or not is determined by the implementation, not by the C standard.
And the final case of "invalid value" would be an address that is not supported by the specific system. This is obviously not something that the C standard mentions, because memory layouts of specific systems are not covered by the C standard.
It is not possible to provoke stack underflow in C. In order to provoke underflow the generated code should have more pop instructions than push instructions, and this would mean the compiler/interpreter is not sound.
In the 1980s there were implementations of C that ran C by interpretation, not by compilation. Really some of them used dynamic vectors instead of the stack provided by the architecture.
stack memory is safely handled by the language
Stack memory is not handled by the language, but by the implementation. It is possible to run C code and not to use stack at all.
Neither the ISO 9899 nor K&R specifies anything about the existence of a stack in the language.
It is possible to make tricks and smash the stack, but it will not work on any implementation, only on some implementations. The return address is kept on the stack and you have write-permissions to modify it, but this is neither underflow nor portable.
Regarding already existing answers: I don't think that talking about undefined behaviour in the context of exploitation mitigation techniques is appropriate.
Clearly, if an implementation provides a mitigation against stack underflows, a stack is provided. In practice, void foo(void) { char crap[100]; ... } will end up having the array on the stack.
A note prompted by comments to this answer: undefined behaviour is a thing and in principle any code exercising it can end up being compiled to absolutely anything, including something not resembling the original code in the slightest. However, the subject of exploit mitigation techniques is closely tied to the target environment and what happens in practice. In practice, the code below should "work" just fine. When dealing with this kind of stuff you always have to verify generated assembly to be sure.
Which brings me to what in practice will give an underflow (volatile added to prevent the compiler from optimising it away):
static void
underflow(void)
{
volatile char crap[8];
int i;
for (i = 0; i != -256; i--)
crap[i] = 'A';
}
int
main(void)
{
underflow();
}
Valgrind nicely reports the problem.
By definition, a stack underflow is a type of undefined behaviour, and thus any code which triggers such a condition must be UB. Therefore, you can't reliably cause a stack underflow.
That said, the following abuse of variable-length arrays (VLAs) will cause a controllable stack underflow in many environments (tested with x86, x86-64, ARM and AArch64 with Clang and GCC), actually setting the stack pointer to point above its initial value:
#include <stdint.h>
#include <stdio.h>
#include <string.h>
int main(int argc, char **argv) {
uintptr_t size = -((argc+1) * 0x10000);
char oops[size];
strcpy(oops, argv[0]);
printf("oops: %s\n", oops);
}
This allocates a VLA with a "negative" (very very large) size, which will wrap the stack pointer around and result in the stack pointer moving upwards. argc and argv are used to prevent optimizations from taking out the array. Assuming that the stack grows down (default on the listed architectures), this will be a stack underflow.
strcpy will either trigger a write to an underflowed address when the call is made, or when the string is written if strcpy is inlined. The final printf should not be reachable.
Of course, this all assumes a compiler which doesn't just make the VLA some kind of temporary heap allocation - which a compiler is completely free to do. You should check the generated assembly to verify that the above code does what you actually expect it to do. For example, on ARM (gcc -O):
8428: e92d4800 push {fp, lr}
842c: e28db004 add fp, sp, #4, 0
8430: e1e00000 mvn r0, r0 ; -argc
8434: e1a0300d mov r3, sp
8438: e0433800 sub r3, r3, r0, lsl #16 ; r3 = sp - (-argc) * 0x10000
843c: e1a0d003 mov sp, r3 ; sp = r3
8440: e1a0000d mov r0, sp
8444: e5911004 ldr r1, [r1]
8448: ebffffc6 bl 8368 <strcpy@plt> ; strcpy(sp, argv[0])
This assumption:
C would be more portable
is not true. C doesn't tell anything about a stack and how it is used by the implementation. On your typical x86 platform, the following (horribly invalid) code would access the stack outside of the valid stack frame (until it is stopped by the OS), but it would not actually "pop" from it:
#include <stdarg.h>
#include <stdio.h>
int underflow(int dummy, ...)
{
va_list ap;
va_start(ap, dummy);
int sum = 0;
for(;;)
{
int x = va_arg(ap, int);
fprintf(stderr, "%d\n", x);
sum += x;
}
return sum;
}
int main(void)
{
return underflow(42);
}
So, depending on what exactly you mean with "stack underflow", this code does what you want on some platform. But as from a C point of view, this just exposes undefined behavior, I wouldn't suggest to use it. It's not "portable" at all.
Is it possible to do it reliably in standard compliant C? No
Is it possible to do it on at least one practical C compiler without resorting to inline assembler? Yes
void * foo(char * a) {
return __builtin_return_address(0);
}
void * bar(void) {
char a[100000];
return foo(a);
}
typedef void (*baz)(void);
int main() {
void * a = bar();
((baz)a)();
}
Build that on gcc with "-O2 -fomit-frame-pointer -fno-inline"
https://godbolt.org/g/GSErDA
Basically the flow in this program goes as follows
main calls bar.
bar allocates a bunch of space on the stack (thanks to the big array),
bar calls foo.
foo takes a copy of the return address (using a gcc extension). This address points into the middle of bar, between the "allocation" and the "cleanup".
foo returns the address to bar.
bar cleans up its stack allocation.
bar returns the return address captured by foo to main.
main calls the return address, jumping into the middle of bar.
the stack cleanup code from bar runs, but bar doesn't currently have a stack frame (because we jumped into the middle of it). So the stack cleanup code underflows the stack.
We need -fno-inline to stop the optimiser inlining stuff and breaking our carefully laid-down structure. We also need the compiler to free the space on the stack by calculation rather than by use of a frame pointer; -fomit-frame-pointer is the default on most gcc builds nowadays, but it doesn't hurt to specify it explicitly.
I believe this technique should work for gcc on pretty much any CPU architecture.
There is a way to underflow the stack, but it is very complicated. The only way that I can think of is define a pointer to the bottom element then decrement its address value. I.e. *(ptr)--. My parentheses may be off, but you want to decrement the value of the pointer, then dereference the pointer.
Generally the OS is just going to see the error and crash. I am not sure what you are testing. I hope this helps. C allows you to do bad things, but it tries to look after the programmer. Most ways to get around this protection is through manipulation of pointers.
Do you mean stack overflow? Putting more things into the stack than the stack can accommodate? If so, recursion is the easiest way to accomplish that.
void foo(void)
{ foo(); }
If you mean attempting to remove things from an empty stack, then please post your question to the stackunderflow web site, and let me know where you've found that! :-)
So there are older library functions in C which are not protected. strcpy is a good example of this. It copies one string to another until it reaches a null terminator. One funny thing to do is pass a program that uses this a string with the null terminator removed. It will run amuck until it reaches a null terminator somewhere. Or have a string copy to itself. So back to what I was saying before is C supports pointers to just about anything. You can make a pointer to an element in the stack at the last element. Then you can use the pointer iterator built into C to decrement the value of the address, change the address value to a location preceding the last element in the stack. Then pass that element to the pop. Now if you are doing this to the Operating system process stack that would get very dependent on the compiler and operating system implementation. In most cases a function pointer to the main and a decrement should work to underflow the stack. I have not tried this in C. I have only done this in Assembly Language, great care has to be taken in working like this. Most operating systems have gotten good at stopping this since it was for a long time an attack vector.
Suppose I need to write to zero address (e.g. I've mmapped something there and want to access it, for whatever reason including curiosity), and the address is known at compile time. Here're some variants I could think of to obtain the pointer, one of these works and another three don't:
#include <stdint.h>
void testNullPointer()
{
// Obviously UB
unsigned* p=0;
*p=0;
}
void testAddressZero()
{
// doesn't work for zero, GCC detects it as NULL
uintptr_t x=0;
unsigned* p=(unsigned*)x;
*p=0;
}
void testTrickyAddressZero()
{
// works, but the resulting assembly is not as terse as it could be
unsigned* p;
asm("xor %0,%0\n":"=r"(p));
*p=0;
}
void testVolatileAddressZero()
{
// p is updated, but the code doesn't actually work
unsigned*volatile p=0;
*p=0; // because this doesn't dereference p! // EDIT: pointee should also be volatile, then this will work
}
I compile this with
gcc test.c -masm=intel -O3 -c -o test.o
and then objdump -d test.o -M intel --no-show-raw-insn gives me (alignment bytes are skipped here):
00000000 <testNullPointer>:
0: mov DWORD PTR ds:0x0,0x0
a: ud2a
00000010 <testAddressZero>:
10: mov DWORD PTR ds:0x0,0x0
1a: ud2a
00000020 <testTrickyAddressZero>:
20: xor eax,eax
22: mov DWORD PTR [eax],0x0
28: ret
00000030 <testVolatileAddressZero>:
30: sub esp,0x10
33: mov DWORD PTR [esp+0xc],0x0
3b: mov eax,DWORD PTR [esp+0xc]
3f: add esp,0x10
42: ret
Here the testNullPointer obviously has UB since it dereferences what is null pointer by definition.
The principle of testAddressZero would give the expected code for any other than 0 address, e.g. 1, but for zero GCC appears to detect that address zero corresponds to null pointer, so also generates UD2.
The asm way of getting the zero address certainly inhibits the compiler's checks, but the price of that is that one has to write different assembly code for each architecture even if the principle of testAddressZero might have been successful (i.e. the same flat memory model on each arch) if not UD2 and similar traps. Also, the code appears not as terse as in the above two variants.
The way of volatile pointer would seem to be the best, but the code generated here appears to not dereference the address for some reason, so it's also broken.
The question now: if I'm targeting GCC, how can I seamlessly access zero address without any traps or other consequences of UB, and without the need to write in assembly?
As a workaround you can use the GCC option -fno-delete-null-pointer-checks, which stops the compiler from assuming that a null pointer can never be safely dereferenced.
The option exists to turn off the -fdelete-null-pointer-checks optimization on targets where address zero may hold valid code or data, so it fits specific cases such as this one.
I would put the pointer into a global variable:
const uintptr_t zero = 0;
unsigned* zeroAddress= (unsigned *)zero;
void testZeroAddressPointer()
{
*zeroAddress=0;
}
Provided you expose the address beyond the scope of optimization (so the compiler can't figure out you don't set it somewhere else), that should do the trick, albeit slightly less efficiently.
Edit: make this code independent of implicit zero to null conversion.
The 0 address is the C99 NULL pointer (actually the "implementation" of the null pointer, which you can often write as 0....) on all the architectures I know about.
The null pointer has a very specific status in hosted C99: when a pointer can be (or was) dereferenced, it is guaranteed (by the language specification) to not be NULL (otherwise, it is undefined behavior).
Hence, the GCC compiler has the right to optimize (and actually will optimize)
int *p = something();
int x = *p;
/// the compiler is permitted to skip the following
/// because p has been dereferenced so cannot be NULL
if (p == NULL) { doit(); return; };
In your case, you might want to compile for the freestanding subset of the C99 standard. So compile with gcc -ffreestanding (beware, this option can bring some infelicities).
BTW, you might declare some extern char strange[] __attribute__((weak)); (perhaps even add asm("0") ...) and have some assembler or linker trick to make that strange have a 0 address. The compiler would not know that such a strange symbol is in fact at the 0 address...
My strong suggestion is to avoid dereferencing the 0 address.... See this. If you really need to dereference the address 0, be prepared to suffer.... (so code some asm, lower the optimization, etc...).
(If you have mmap-ed the first page, just avoid using its first byte at address 0; that is often not a big deal.)
(IIRC, you are touching a grey area of GCC optimizations - and perhaps even of the C99 language specification, and you certainly want the free standing flavor of C; notice that -O3 optimization for free standing C is not well tested in the GCC compiler and might have residual bugs....)
You could consider changing the GCC compiler so that the null pointer has the numerical address 42. That would take some work.
#include <stdio.h>
#include <stdlib.h>
int (*fptr1)(int);
int square(int num){
return num*num;
}
void main(){
fptr1 = &square;
printf("%d\n",fptr1(5));
}
Can someone briefly explain what happens in stack when we call a function pointer? What is the difference between calling a function directly in main() and calling it by function pointer in C language by the means of physical memory and process?
I tried to understand what happens in memory when we call a function with function pointer but it is not enough to me.
When we call a function by pointer, does pointer have this function's location at code space?
When called function is running is it same as normally called function in main()?
What is the difference of calling function directly or using function pointer when code is running in a pipelined branch predictive processor?
The best way to answer this is to look at the disassembly (slightly modified sample):
fptr1 = &square;
int result1 = fptr1(5);
int result2 = square(5);
Results in this x64 asm:
fptr1 = &square;
000000013FA31A61 lea rax,[square (013FA31037h)]
000000013FA31A68 mov qword ptr [fptr1 (013FA40290h)],rax
int result1 = fptr1(5);
000000013FA31A6F mov ecx,5
000000013FA31A74 call qword ptr [fptr1 (013FA40290h)]
000000013FA31A7A mov dword ptr [result1],eax
int result2 = square(5);
000000013FA31A7E mov ecx,5
000000013FA31A83 call square (013FA31037h)
000000013FA31A88 mov dword ptr [result2],eax
As you can see the assembly is virtually identical between calling the function directly and via a pointer. In both cases the CPU needs to have access to the location where the code is located and call it. The direct call has the benefit of not having to dereference the pointer (as the offset will be baked into the assembly).
Yes, you can see in the assignment of the function pointer that it stores the code address of the 'square' function.
From a stack setup/tear down perspective: Yes. From a performance perspective, there is a slight difference as noted above.
There are no branches, so there is no difference here.
Edit: If we were to interject branches into the above sample, it wouldn't take very long to exhaust the interesting scenarios, so I will address them here:
In the case where we have a branch before loading (or assignment) of the function pointer, for example (in pseudo assembly):
branch zero foobar
lea square
call ptr
Then we could have a difference. Assume that the pipeline chose to load and start processing the instructions at foobar, then when it realized that we weren't actually going to take that branch, it would have to stall in order to load the function pointer, and dereference it. If we were just calling a known address, then there would not be a stall.
Case two:
lea square
branch zero foobar
call ptr
In this case there wouldn't be any difference between direct calls vs through a function pointer, as everything we need is already known if the processor starts executing down the wrong path and then resets to start executing the call.
The third scenario is when the branch follows the call, and that is obviously not very interesting from a pipeline perspective as we've already executed the subroutine.
So to fully re-answer question 3, I would say Yes, there is a difference. But then the real question is whether or not the compiler/optimizer is smart enough to move the branch after the function pointer assignment, so it falls into case 2 and not case 1.
The pointer to function contains the address of the start of the function in the text segment of the program.
Once called, the function runs identically whether it is called directly or by a pointer to function.
I'm not sure. Often, there won't be much difference; the pointer to function doesn't change very often, if at all (e.g. because you loaded a shared library dynamically, so you have to use a pointer to function to call the function).
What is the difference between calling a function directly in main() and calling it by function pointer?
The only difference is possibly an extra memory reference to fetch the function pointer from memory.
Yes
Absolutely
No difference
Calling a function through a function pointer is just a convenience; and makes it possible to pass a function as a parameter to another function. See for example, qsort()
There's no difference between calling a function by its name and calling it through a function pointer. The purpose of function pointers is so that you can call a function that's specified somewhere else. For instance, the qsort() function takes a function pointer as an argument that it calls to determine how to order the elements of the array, rather than having only one comparison method.
Like this link http://gcc.gnu.org/onlinedocs/gcc-3.3.1/gcc/Labels-as-Values.html
I can get the memory address of a label. So if I declare a label, take its address, and add 1 to that address, will I get to the next instruction? An illustration:
int main () {
void *ptr;
label:
instruction 1;
instruction 2;
ptr = &&label;
// So if I do it...
ptr = ptr + 1;
// I will get the instruction 2 correct??
Thanks for all answers.
No, I don't think so.
First off, you seem to take the address of a label, which doesn't work. The label is interpreted by the compiler but it does not represent an actual address in your code.
Second, every statement in C/C++ (in fact any language) can be translated to many machine language instructions, so instruction 1 could be translated to 3, 5, 10 or even more machine instructions.
Third, your pointer points to void. The C compiler does not know how to increment a void pointer. Normally when you increment a pointer, it adds the size of the data type you are pointing to to the address. So incrementing a long-pointer will add 4 bytes (on a platform where long is 4 bytes wide); incrementing a char-pointer will add 1 byte. In this case you have a void-pointer, which points to nothing, and thus cannot be incremented.
Fourth, I don't think that all instructions in x86 machine language are represented by the same number of bytes. So you cannot expect from adding something to a pointer that it gets to the next instruction. You might also end up in the middle of the next instruction.
You can't perform arithmetic on a void*, and the compiler wouldn't know what to add to the pointer to have it point to the next 'instruction' anyway - there is no 1 to 1 correspondence between C statement and the machine code emitted by the compiler. Even for CPUs which have a 'regular' instruction set where instructions are the same size (as opposed to something like the x86 where instructions have a variable number of bytes), a single C statement may result in several CPU instructions (or maybe only one - who knows?).
Expanding on an example in the GCC docs, you might be able to get by with something like the following, but it requires a label for each statement you want to target:
void *statements[] = { &&statement1, &&statement2 };
void** ptr;
statement1:
instruction 1;
statement2:
instruction 2;
ptr = statements;
// goto **ptr; // <== this will jump to 'instruction 1'
// goto **(ptr+1); // <== this will jump to 'instruction 2'
Note that the &&label syntax is described under C Extensions section in GCC docs. It's not C, it's GCC.
Plus, void* does not allow pointer arithmetic - it's a catch-all sort of type in C for pointing at anything. The assumption is that the compiler does not know size of the object it points to (but the programmer should :).
Even more, instruction sizes are widely different on different architectures - four bytes on SPARC, but variable length on x86, for example.
I.e. it doesn't work in C. You will have to use inline assembler for this sort of things.
No, because you can't increment void *.
void fcn() { printf("hello, world\n"); }
int main()
{
void (*pt2Function)() = fcn;
pt2Function(); // calls fcn();
// error C2171: '++' : illegal on operands of type 'void (__cdecl *)(void)'
// ++pt2Function;
return 0;
}
This is VC++, but I suspect gcc is similar.
Edited to add
Just for fun, I tried this—it crashed:
int nGlobal = 0;
__declspec(naked) void fcn()
{
// nop is 1-byte instruction that does nothing
_asm { nop }
++nGlobal;
_asm { ret }
}
int main()
{
void (*pt2Function)() = fcn;
// this works, incrementing nGlobal:
pt2Function();
printf("nGlobal: %d", nGlobal);
char *p = (char *) pt2Function;
++p; // point past the NOP?
pt2Function = (void (*)()) p;
// but this crashes...
pt2Function();
printf("nGlobal: %d", nGlobal);
return 0;
}
It crashed because this line doesn't do what I thought it did:
void (*pt2Function)() = fcn;
I thought it would take the address of the first instruction of fcn(), and put it in pt2Function. That way my ++p would make it point to the second instruction (nop is one byte long).
It doesn't. It puts the address of a jmp instruction (found in a big jump table) into pt2Function. When you increment it by one byte, it points to a meaningless location in the jump table.
I assume this is implementation-specific.
I would say "probably not". The value of the pointer will be right, because the compiler knows, but I doubt that the + 1 will know the length of instructions.
Let us suppose there's a way to get the address of a label (that is not an extension of a specific compiler). Then the problem would really be "the next instruction" idea: it can be very hard to know which is the next instruction. It depends on the processor, and on processors like x86, to know the length of an instruction you have to decode it, not fully of course, but it is still a fairly complex job. On notable RISC architectures, determining an instruction's length is a lot easier, and getting the next instruction could be as easy as incrementing the address by 4. But there's no general way to do it at runtime. At compile time it could be easier, but to allow it in a C-coherent way, C would need a type "instruction", so that "instruction *" could be a pointer to an instruction, and incrementing such a pointer would point correctly to the next instruction, provided the code is known at compile time (so such a pointer can't really point to everything a pointer can point to in general). At compile time the compiler could implement this feature easily by adding another "label" just beyond the generated instruction pointed to by the "first" "label". But it would be cheating...
Moreover, let us suppose you get the address of a C label, or C function, or whatever. If you skip the first instruction, likely you won't be able to "use" that address to execute the code (less the first instruction), since without that single instruction the code may become buggy... unless you know for sure you can skip that single instruction and obtain what you want, but you can't be sure... unless you take a look at the code (which can be different from compiler to compiler), and then all the point of doing such a thing from C disappears.
So, briefly, the answer is no, you can't compute the pointer to the next instruction; and if you do someway, the fact that you're pointing to code becomes meaningless since you can't jump to that address and be sure of the final behaviour.
Imagine I have the following simple C program:
int main() {
int a=5, b= 6, c;
c = a +b;
return 0;
}
Now, I would like to know the address of the expression c=a+b, that is the program address
where this addition is carried out. Is there any possibility that I could use printf?
Something along the line:
int main() {
int a=5, b= 6, c;
printf("Address of printf instruction in memory: %x", current_address_pointer_or_something)
c = a +b;
return 0;
}
I know I could find the address by using gdb and then info line file.c:line. However, I would like to know whether I could also do that directly with printf.
In gcc, you can take the address of a label using the && operator. So you could do this:
int main()
{
int a=5, b= 6, c;
sum:
c = a+b;
printf("Address of sum label in memory: %p", &&sum);
return 0;
}
The result of &&sum is the target of the jump instruction that would be emitted if you did a goto sum. So, while it's true that there's no one-to-one address-to-line mapping in C/C++, you can still say "get me a pointer to this code."
Visual C++ has the _ReturnAddress intrinsic, which can be used to get some info here.
For instance:
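#include <intrin.h>
#include <stdio.h>
#pragma intrinsic(_ReturnAddress)
__declspec(noinline) void PrintCurrentAddress()
{
printf("%p", _ReturnAddress()); // address of the instruction after the call
}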
Which will give you an address close to the expression you're looking at. In the event of some optimizations, like tail folding, this will not be reliable.
Tested in Visual Studio 2008:
int addr;
__asm
{
call _here
_here: pop eax
; eax now holds the PC.
mov [addr], eax
}
printf("%x\n", addr);
Credit to this question.
Here's a sketch of an alternative approach:
Assume that you haven't stripped debug symbols, and in particular you have the line number to address table that a source-level symbolic debugger needs in order to implement things like single step by source line, set a break point at a source line, and so forth.
Most tool chains use reasonably well documented debug data formats, and there are often helper libraries that implement most of the details.
Given that and some help from the preprocessor macro __LINE__ which evaluates to the current line number, it should be possible to write a function which looks up the address of any source line.
Advantages are that no assembly is required, portability can be achieved by calling on platform-specific debug information libraries, and it isn't necessary to directly manipulate the stack or use tricks that break the CPU pipeline.
A big disadvantage is that it will be slower than any approach based on directly reading the program counter.
For x86:
int test()
{
__asm {
mov eax, [esp]
}
}
__declspec(noinline) int main() // or whatever noinline feature your compiler has
{
int a = 5;
int aftertest;
aftertest = test()+3; // aftertest = disasms to 89 45 F8 mov dword ptr [a],eax.
printf("%i", a+9);
printf("%x", test());
return 0;
}
I don't know the details, but there should be a way to make a call to a function that can then crawl the return stack for the address of the caller, and then copy and print that out.
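Using gcc on i386 or x86-64 (the operand must name the label's address, not load the bytes stored at the label):
#include <stdio.h>
#if defined(__x86_64__)
/* RIP-relative lea yields the label's address and also works in PIE builds. */
#define ADDRESS_HERE() ({ void *p; __asm__("1: lea 1b(%%rip), %0" : "=r" (p)); p; })
#else
/* On i386, $1b is the label's address as an immediate (non-PIC code only). */
#define ADDRESS_HERE() ({ void *p; __asm__("1: mov $1b, %0" : "=r" (p)); p; })
#endif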
int main(void) {
printf("%p\n", ADDRESS_HERE());
return 0;
}
Note that due to the presence of compiler optimizations, the apparent position of the expression might not correspond to its position in the original source.
The advantage of using this method over the &&foo label method is it doesn't change the control-flow graph of the function. It also doesn't break the return predictor unit like the approaches using call :)
On the other hand, it's very much architecture-dependent... and because it doesn't perturb the CFG there's no guarantee that jumping to the address in question would make any sense at all.
If the compiler is any good, this addition happens in registers and is never stored in memory, at least not in the way you are thinking. In fact, a good compiler will see that your program does nothing: code that manipulates values within a function but never sends them anywhere outside the function can result in no code being generated at all.
If you were to:
c = a+b;
printf("%u\n",c);
Then a good compiler will still never store the value of c in memory; it will stay in registers, although that depends on the processor as well. If, for example, compilers for that processor use the stack to pass arguments to functions, then the value of c will be computed in registers (a good compiler will see that c is always 11 and just assign it) and then put on the stack when it is passed to printf. Naturally, printf itself may well need temporary storage in memory due to its complexity (it can't fit everything it needs to do in registers).
Where I am heading is that there is no answer to your question. It is heavily dependent on the processor, compiler, etc. There is no generic answer. I have to wonder what the root of the question is; if you were hoping to probe with a debugger, then this is not the question to ask.
Bottom line: disassemble your program and look at it. For that particular build, with that compiler and those settings, you will be able to see where the compiler has placed intermediate values. Even if the compiler assigns a memory location to the variable, that doesn't mean the program will ever store the variable in that location. It depends on optimizations.