How do I get argument values using inline assembly in C without glibc?
I need this code for Linux on the x86_64 and i386 architectures.
If you know how to do this on Mac OS X or Windows, please post that as well.
void exit(int code)
{
//This function not important!
//...
}
void _start()
{
//How Get arguments value using inline assembly
//in C without Glibc?
//argc
//argv
exit(0);
}
New Update
https://gist.github.com/apsun/deccca33244471c1849d29cc6bb5c78e
and
#define ReadRdi(To) asm("movq %%rdi,%0" : "=r"(To));
#define ReadRsi(To) asm("movq %%rsi,%0" : "=r"(To));
long argcL;
long argvL;
ReadRdi(argcL);
ReadRsi(argvL);
int argc = (int) argcL;
//char **argv = (char **) argvL;
exit(argc);
But it still returns 0.
So this code is wrong!
Please help.
As specified in the comment, argc and argv are provided on the stack, so you cannot use a regular C function to get them, even with inline assembly: the compiler will adjust the stack pointer to allocate local variables, set up the stack frame, and so on. Hence _start must be written in assembly, as is done in glibc (x86; x86_64). A small stub can grab the values and forward them to your "real" C entry point according to the regular calling convention.
Here is a minimal example of a program (for both x86 and x86_64) that reads argc and argv, prints every string in argv to stdout (one per line), and exits with argc as its status code. It can be compiled with the usual gcc -nostdlib (plus -static to make sure ld.so isn't involved; not that it does any harm here).
#ifdef __x86_64__
asm(
".global _start\n"
"_start:\n"
" xorl %ebp,%ebp\n" // mark outermost stack frame
" movq 0(%rsp),%rdi\n" // get argc
" lea 8(%rsp),%rsi\n" // the argument pointers sit just above argc, so argv = %rsp + 8
" call bare_main\n" // call our bare_main
" movq %rax,%rdi\n" // take the main return code and use it as first argument for...
" movl $60,%eax\n" // ... the exit syscall
" syscall\n"
" int3\n"); // just in case
asm(
"bare_write:\n" // write syscall wrapper; the calling convention is pretty much ok as is
" movq $1,%rax\n" // 1 = write syscall on x86_64
" syscall\n"
" ret\n");
#endif
#ifdef __i386__
asm(
".global _start\n"
"_start:\n"
" xorl %ebp,%ebp\n" // mark outermost stack frame
" movl 0(%esp),%edi\n" // argc is on the top of the stack
" lea 4(%esp),%esi\n" // as above, but with 4-byte pointers
" sub $8,%esp\n" // the stack starts 16-byte aligned and we have to push 2*4 bytes; "waste" 8 more bytes
" pushl %esi\n" // to keep it aligned after pushing our arguments
" pushl %edi\n"
" call bare_main\n" // call our bare_main
" add $8,%esp\n" // fix the stack after call (actually useless here)
" movl %eax,%ebx\n" // take the main return code and use it as first argument for...
" movl $1,%eax\n" // ... the exit syscall
" int $0x80\n"
" int3\n"); // just in case
asm(
"bare_write:\n" // write syscall wrapper; convert the user-mode calling convention to the syscall convention
" pushl %ebx\n" // ebx is callee-preserved
" movl 8(%esp),%ebx\n" // just move stuff from the stack to the correct registers
" movl 12(%esp),%ecx\n"
" movl 16(%esp),%edx\n"
" mov $4,%eax\n" // 4 = write syscall on i386
" int $0x80\n"
" popl %ebx\n" // restore ebx
" ret\n"); // notice: the return value is already ok in %eax
#endif
int bare_write(int fd, const void *buf, unsigned count);
unsigned my_strlen(const char *ch) {
const char *ptr;
for(ptr = ch; *ptr; ++ptr);
return ptr-ch;
}
int bare_main(int argc, char *argv[]) {
for(int i = 0; i < argc; ++i) {
int len = my_strlen(argv[i]);
bare_write(1, argv[i], len);
bare_write(1, "\n", 1);
}
return argc;
}
Notice that here several subtleties are ignored - in particular, the atexit bit. All the documentation about the machine-specific startup state has been extracted from the comments in the two glibc files linked above.
This answer is for x86-64, 64-bit Linux ABI, only. All the other OSes and ABIs mentioned will be broadly similar, but different enough in the fine details that you will need to write your custom _start once for each.
You are looking for the specification of the initial process state in the "x86-64 psABI", or, to give it its full title, "System V Application Binary Interface, AMD64 Architecture Processor Supplement (With LP64 and ILP32 Programming Models)". I will reproduce figure 3.9, "Initial Process Stack", here:
Purpose                            Start Address    Length
------------------------------------------------------------------------
Information block, including                        varies
argument strings, environment
strings, auxiliary information
...
------------------------------------------------------------------------
Null auxiliary vector entry                         1 eightbyte
Auxiliary vector entries...                         2 eightbytes each
0                                                   eightbyte
Environment pointers...                             1 eightbyte each
0                                  8+8*argc+%rsp    eightbyte
Argument pointers...               8+%rsp           argc eightbytes
Argument count                     %rsp             eightbyte
It goes on to say that the initial registers are unspecified except for %rsp, which is of course the stack pointer, and %rdx, which may contain "a function pointer to register with atexit".
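To make the layout concrete, here is a minimal sketch in C that walks a block of memory laid out like the figure above. The struct and function names (startup_info, parse_initial_stack) are made up for illustration, and the pointer passed in is assumed to be the value %rsp had on entry to _start (or, for experimenting, any array arranged the same way). It also assumes an LP64 target, where a pointer occupies one eightbyte.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical container for what the initial stack holds. */
struct startup_info {
    long   argc;
    char **argv;   /* argc pointers followed by a NULL */
    char **envp;   /* environment pointers followed by a NULL */
};

/* sp: address of the "Argument count" slot, i.e. the initial %rsp. */
static struct startup_info parse_initial_stack(void *sp)
{
    struct startup_info info;
    uintptr_t count;

    memcpy(&count, sp, sizeof count);        /* argument count lives at %rsp */
    info.argc = (long)count;
    info.argv = (char **)sp + 1;             /* argument pointers start at 8+%rsp */
    info.envp = info.argv + info.argc + 1;   /* one slot past the 0 at 8+8*argc+%rsp */
    return info;
}
```

Given a fake stack such as { 2, "prog", "abc", NULL, "HOME=/root", NULL }, this yields argc == 2, argv[1] pointing at "abc" and envp[0] pointing at "HOME=/root".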
So all the information you are looking for is already present in memory, but it hasn't been laid out according to the normal calling convention, which means you must write _start in assembly language. It is _start's responsibility to set everything up to call main with, based on the above. A minimal _start would look something like this:
_start:
xorl %ebp, %ebp # mark the deepest stack frame
# Current Linux doesn't pass an atexit function,
# so you could leave out this part of what the ABI doc says you should do
# You can't just keep the function pointer in a call-preserved register
# and call it manually, even if you know the program won't call exit
# directly, because atexit functions must be called in reverse order
# of registration; this one, if it exists, is meant to be called last.
testq %rdx, %rdx # is there "a function pointer to
je skip_atexit # register with atexit"?
movq %rdx, %rdi # if so, do it
call atexit
skip_atexit:
movq (%rsp), %rdi # load argc
leaq 8(%rsp), %rsi # calc argv (pointer to the array on the stack)
leaq 8(%rsp,%rdi,8), %rdx # calc envp (starts after the NULL terminator for argv[])
call main
movl %eax, %edi # pass return value of main to exit
call exit
hlt # should never get here
(Completely untested.)
(In case you're wondering why there's no adjustment to maintain stack pointer alignment, this is because upon a normal procedure call, 8(%rsp) is 16-byte aligned, but when _start is called, %rsp itself is 16-byte aligned. Each call instruction displaces %rsp down by eight, producing the alignment situation expected by normal compiled functions.)
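The alignment arithmetic can be checked with a few lines of C. This is only a worked example with a made-up address; the helper names are illustrative and not part of any ABI document.

```c
#include <stdint.h>

/* Returns 1 if addr is a multiple of 16, otherwise 0. */
static int is_16_byte_aligned(uintptr_t addr)
{
    return (addr & 15u) == 0;
}

/* A call instruction pushes an 8-byte return address, so %rsp drops by 8. */
static uintptr_t rsp_after_call(uintptr_t rsp)
{
    return rsp - 8;
}
```

Starting from a 16-byte-aligned value such as 0x7ffd1000 (what _start sees), rsp_after_call gives 0x7ffd0ff8: %rsp itself is no longer 16-byte aligned, but 8(%rsp) is, which is the state a compiled function such as main expects on entry.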
A more thorough _start would do more things, such as clearing all the other registers, arranging for greater stack pointer alignment than the default if desired, calling into the C library's own initialization functions, setting up environ, initializing the state used by thread-local storage, doing something constructive with the auxiliary vector, etc.
You should also be aware that if there is a dynamic linker (a PT_INTERP program header in the executable), it receives control before _start does. Glibc's ld.so cannot be used with any C library other than glibc itself; if you are writing your own C library, and you want to support dynamic linkage, you will also need to write your own ld.so. (Yes, this is unfortunate; ideally, the dynamic linker would be a separate development project and its complete interface would be specified.)
As a quick and dirty hack, you can make an executable with a compiled C function as the ELF entry point. Just make sure you use exit or _exit instead of returning.
(Link with gcc -nostartfiles to omit the CRT but still link other libraries, and write a _start() in C. Beware of ABI violations like stack alignment; e.g. use -mincoming-stack-boundary=2 or an __attribute__ on _start, as in "Compiling without libc".)
If it's dynamically linked, you can still use glibc functions on Linux (because the dynamic linker runs glibc's init functions). Not all systems are like this, e.g. on cygwin you definitely can't call libc functions if you (or the CRT start code) haven't called the libc init functions in the correct order. I'm not sure it's even guaranteed that this works on Linux, so don't depend on it except for experimentation on your own system.
I have used a C _start(void){ ... } + calling _exit() for making a static executable to microbenchmark some compiler-generated code with less startup overhead for perf stat ./a.out.
Glibc's _exit() works even if glibc wasn't initialized (gcc -O3 -static), or use inline asm to run xor %edi,%edi / mov $60, %eax / syscall (sys_exit(0) on Linux) so you don't have to even statically link libc. (gcc -O3 -nostdlib)
With even more dirty hacking and UB, you can access argc and argv by knowing the x86-64 System V ABI that you're compiling for (see @zwol's answer for a quote from the ABI doc), and how the process startup state differs from the function calling convention:
argc is where the return address would be for a normal function (pointed to by RSP). GNU C has a builtin for accessing the return address of the current function (or for walking up the stack.)
argv[0] is where the 7th integer/pointer arg should be (the first stack arg, just above the return address). It happens to / seems to work to take its address and use that as an array!
// Works only for the x86-64 SystemV ABI; only tested on Linux.
// DO NOT USE THIS EXCEPT FOR EXPERIMENTS ON YOUR OWN COMPUTER.
#include <stdio.h>
#include <stdlib.h>
// tell gcc *this* function is called with a misaligned RSP
__attribute__((force_align_arg_pointer))
void _start(int dummy1, int dummy2, int dummy3, int dummy4, int dummy5, int dummy6, // register args
char *argv0) {
int argc = (int)(long)__builtin_return_address(0); // load (%rsp), casts to silence gcc warnings.
char **argv = &argv0;
printf("argc = %d, argv[argc-1] = %s\n", argc, argv[argc-1]);
printf("%f\n", 1.234); // segfaults if RSP is misaligned
exit(0);
//_exit(0); // without flushing stdio buffers!
}
# with a version without the FP printf
peter@volta:~/src/SO$ gcc -nostartfiles _start.c -o bare_start
peter@volta:~/src/SO$ ./bare_start
argc = 1, argv[argc-1] = ./bare_start
peter@volta:~/src/SO$ ./bare_start abc def hij
argc = 4, argv[argc-1] = hij
peter@volta:~/src/SO$ file bare_start
bare_start: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=af27c8416b31bb74628ef9eec51a8fc84e49550c, not stripped
# I could have used -fno-pie -no-pie to make a non-PIE executable
This works with or without optimization, with gcc7.3. I was worried that without optimization, the address of argv0 would be below rbp where it copies the arg, rather than its original location. But apparently it works.
gcc -nostartfiles links glibc but not the CRT start files.
gcc -nostdlib omits both libraries and CRT startup files.
Very little of this is guaranteed to work, but it does in practice work with current gcc on current x86-64 Linux, and has worked in the past for years. If it breaks, you get to keep both pieces. IDK what C features are broken by omitting the CRT startup code and just relying on the dynamic linker to run glibc init functions. Also, taking the address of an arg and accessing pointers above it is UB, so you could maybe get broken code-gen. gcc7.3 happens to do what you'd expect in this case.
Things that definitely break
atexit() cleanup, e.g. flushing stdio buffers.
static destructors for static objects in dynamically-linked libraries. (On entry to _start, RDX is a function pointer you should register with atexit for this reason. In a dynamically linked executable, the dynamic linker runs before your _start and sets RDX before jumping to your _start. Statically linked executables have RDX=0 under Linux.)
gcc -mincoming-stack-boundary=3 (i.e. 2^3 = 8 bytes) is another way to get gcc to realign the stack, because the -mpreferred-stack-boundary=4 default of 2^4 = 16 is still in place. But that makes gcc assume under-aligned RSP for all functions, not just for _start, which is why I looked in the docs and found an attribute that was intended for 32-bit when the ABI transitioned from only requiring 4-byte stack alignment to the current requirement of 16-byte alignment for ESP in 32-bit mode.
The SysV ABI requirement for 64-bit mode has always been 16-byte alignment, but gcc options let you make code that doesn't follow the ABI.
// test call to a function the compiler can't inline
// to see if gcc emits extra code to re-align the stack
// like it would if we'd used -mincoming-stack-boundary=3 to assume *all* functions
// have only 8-byte (2^3) aligned RSP on entry, with the default -mpreferred-stack-boundary=4
void foo() {
int i = 0;
atoi(NULL);
}
With -mincoming-stack-boundary=3, we get stack-realignment code there, where we don't need it. gcc's stack-realignment code is pretty clunky, so we'd like to avoid that. (Not that you'd really ever use this to compile a significant program where you care about efficiency, please only use this stupid computer trick as a learning experiment.)
But anyway, see the code on the Godbolt compiler explorer with and without -mpreferred-stack-boundary=3.
I read this question about noreturn attribute, which is used for functions that don't return to the caller.
Then I have made a program in C.
#include <stdio.h>
#include <stdnoreturn.h>
noreturn void func()
{
printf("noreturn func\n");
}
int main()
{
func();
}
And generated assembly of the code using this:
.LC0:
.string "func"
func:
pushq %rbp
movq %rsp, %rbp
movl $.LC0, %edi
call puts
nop
popq %rbp
ret // ==> the function returns here
main:
pushq %rbp
movq %rsp, %rbp
movl $0, %eax
call func
Why does function func() return after providing noreturn attribute?
Function specifiers in C are hints to the compiler; the degree to which they are honored is implementation-defined.
First of all, _Noreturn function specifier (or, noreturn, using <stdnoreturn.h>) is a hint to the compiler about a theoretical promise made by the programmer that this function will never return. Based on this promise, compiler can make certain decisions, perform some optimizations for the code generation.
IIRC, if a function specified with the noreturn function specifier eventually returns to its caller, either
by using an explicit return statement
by reaching the end of the function body
the behaviour is undefined. You MUST NOT return from the function.
To make it clear, using the noreturn function specifier does not stop a function from returning to its caller. It is a promise made by the programmer to the compiler, allowing it more freedom to generate optimized code.
Now, if you made a promise earlier and later choose to violate it, the result is UB. Compilers are encouraged, but not required, to produce warnings when a _Noreturn function appears to be capable of returning to its caller.
According to chapter §6.7.4, C11, Paragraph 8
A function declared with a _Noreturn function specifier shall not return to its caller.
and, the paragraph 12, (Note the comments!!)
EXAMPLE 2
_Noreturn void f () {
abort(); // ok
}
_Noreturn void g (int i) { // causes undefined behavior if i <= 0
if (i > 0) abort();
}
For C++, the behaviour is quite similar. Quoting from chapter §7.6.4, C++14, paragraph 2 (emphasis mine)
If a function f is called where f was previously declared with the noreturn attribute and f eventually
returns, the behavior is undefined. [ Note: The function may terminate by throwing an exception. —end
note ]
[ Note: Implementations are encouraged to issue a warning if a function marked [[noreturn]] might
return. —end note ]
3 [ Example:
[[ noreturn ]] void f() {
throw "error"; // OK
}
[[ noreturn ]] void q(int i) { // behavior is undefined if called with an argument <= 0
if (i > 0)
throw "positive";
}
—end example ]
Why function func() return after providing noreturn attribute?
Because you wrote code that told it to.
If you don't want your function to return, call exit() or abort() or similar so it doesn't return.
What else would your function do other than return after it had called printf()?
The C Standard in 6.7.4 Function specifiers, paragraph 12 specifically includes an example of a noreturn function that can actually return - and labels the behavior as undefined:
EXAMPLE 2
_Noreturn void f () {
abort(); // ok
}
_Noreturn void g (int i) { // causes undefined behavior if i<=0
if (i > 0) abort();
}
In short, noreturn is a restriction that you place on your code - it tells the compiler "MY code won't ever return". If you violate that restriction, that's all on you.
noreturn is a promise. You're telling the compiler, "It may or may not be obvious, but I know, based on the way I wrote the code, that this function will never return." That way, the compiler can avoid setting up the mechanisms that would allow the function to return properly. Leaving out those mechanisms might allow the compiler to generate more efficient code.
How can a function not return? One example would be if it called exit() instead.
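Another way to keep the promise is longjmp. The sketch below is illustrative only: fail, try_operation, recovery_point and last_error are made-up names, and a real program would need more care (this version is not re-entrant, for example).

```c
#include <setjmp.h>

static jmp_buf recovery_point;
static volatile int last_error;

/* Never returns to its caller: control is transferred to the setjmp site. */
static _Noreturn void fail(int code)
{
    last_error = code;
    longjmp(recovery_point, 1);
}

/* Returns 0 on success, or the error code passed to fail(). */
int try_operation(int input)
{
    if (setjmp(recovery_point) != 0)
        return last_error;      /* we arrive here via longjmp */

    if (input < 0)
        fail(-input);           /* does not come back */
    return 0;
}
```

Here try_operation(5) returns 0, while try_operation(-3) returns 3 by way of fail, which keeps its noreturn promise even though the program carries on.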
But if you promise the compiler that your function won't return, and the compiler doesn't arrange for it to be possible for the function to return properly, and then you go and write a function that does return, what's the compiler supposed to do? It basically has three possibilities:
Be "nice" to you and figure out a way to have the function return properly anyway.
Emit code that crashes or behaves in arbitrarily unpredictable ways when the function improperly returns.
Give you a warning or error message pointing out that you broke your promise.
The compiler might do 1, 2, 3, or some combination.
If this sounds like undefined behavior, that's because it is.
The bottom line, in programming as in real life, is: Don't make promises you can't keep. Someone else might have made decisions based on your promise, and bad things can happen if you then break your promise.
The noreturn attribute is a promise that you make to the compiler about your function.
If you do return from such a function, behavior is undefined, but this doesn't mean a sane compiler will allow you to mess the state of the application completely by removing the ret statement, especially since the compiler will often even be able to deduce that a return is indeed possible.
However, if you write this:
noreturn void func(void)
{
printf("func\n");
}
int main(void)
{
func();
some_other_func();
}
then it's perfectly reasonable for the compiler to remove the call to some_other_func completely, if it feels like it.
As others have mentioned, this is classic undefined behavior. You promised func wouldn't return, but you made it return anyway. You get to pick up the pieces when that breaks.
Although the compiler compiles func in the usual manner (despite your noreturn), the noreturn affects calling functions.
You can see this in the assembly listing: the compiler has assumed, in main, that func won't return. Therefore, it literally deleted all of the code after the call func (see for yourself at https://godbolt.org/g/8hW6ZR). The assembly listing isn't truncated, it literally just ends after the call func because the compiler assumes any code after that would be unreachable. So, when func actually does return, main is going to start executing whatever crap follows the main function - be it padding, immediate constants, or a sea of 00 bytes. Again - very much undefined behavior.
This is transitive - a function that calls a noreturn function in all possible code paths can, itself, be assumed to be noreturn.
According to this
If the function declared _Noreturn returns, the behavior is undefined. A compiler diagnostic is recommended if this can be detected.
It is the programmer's responsibility to make sure that this function never returns, e.g. exit(1) at the end of the function.
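As a sketch of that advice, here is the question's func with an exit call at the end. Because the function can no longer hand control back, it is run in a child process by a small POSIX-only helper (run_func_in_child, a made-up name, using fork and waitpid) that reports the exit status. The exit code 3 is arbitrary.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

_Noreturn void func(void)
{
    printf("noreturn func\n");
    exit(3);                 /* never returns, so the promise holds */
}

/* Hypothetical helper: runs func in a child and returns its exit status, or -1. */
int run_func_in_child(void)
{
    int status;
    pid_t pid;

    fflush(stdout);          /* avoid duplicating buffered output in the child */
    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        func();              /* child: control never comes back here */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```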
ret simply means that the function returns control back to the caller. So, main does call func, the CPU executes the function, and then, with ret, the CPU continues execution of main.
Edit
So, it turns out, noreturn does not prevent the function from returning at all; it's just a specifier that tells the compiler that the function is written in such a way that it won't return. So, what you should do here is make sure that this function actually doesn't return control to the caller. For example, you could call exit inside it.
Also, given what I've read about this specifier it seems that in order to make sure the function won't return to its point of invocation, one should call another noreturn function inside it and make sure that the latter is always run (in order to avoid undefined behavior) and doesn't cause UB itself.
A noreturn function does not need to save registers on entry, because it never has to restore them. This makes optimisation easier. It is great for a scheduler routine, for example.
See the example here:
https://godbolt.org/g/2N3THC and spot the difference
TL:DR: It's a missed-optimization by gcc.
noreturn is a promise to the compiler that the function won't return. This allows optimizations, and is useful especially in cases where it's hard for the compiler to prove that a loop won't ever exit, or otherwise prove there's no path through a function that returns.
GCC already optimizes main to fall off the end of the function if func() returns, even with the default -O0 (minimum optimization level) that it looks like you used.
The output for func() itself could be considered a missed optimization; it could just omit everything after the function call (since having the call not return is the only way the function itself can be noreturn). It's not a great example since printf is a standard C function that is known to return normally (unless you setvbuf to give stdout a buffer that will segfault?)
Let's use a different function that the compiler doesn't know about.
void ext(void);
//static
int foo;
_Noreturn void func(int *p, int a) {
ext();
*p = a; // using function args after a function call
foo = 1; // requires save/restore of registers
}
void bar() {
func(&foo, 3);
}
(Code + x86-64 asm on the Godbolt compiler explorer.)
gcc7.2 output for bar() is interesting. It inlines func(), and eliminates the foo=3 dead store, leaving just:
bar:
sub rsp, 8 ## align the stack
call ext
mov DWORD PTR foo[rip], 1
## fall off the end
Gcc still assumes that ext() is going to return, otherwise it could have just tail-called ext() with jmp ext. But gcc doesn't tailcall noreturn functions, because that loses backtrace info for things like abort(). Apparently inlining them is ok, though.
Gcc could have optimized by omitting the mov store after the call as well. If ext returns, the program is hosed, so there's no point generating any of that code. Clang does make that optimization in bar() / main().
func itself is more interesting, and a bigger missed optimization.
gcc and clang both emit nearly the same thing:
func:
push rbp # save some call-preserved regs
push rbx
mov ebp, esi # save function args for after ext()
mov rbx, rdi
sub rsp, 8 # align the stack before a call
call ext
mov DWORD PTR [rbx], ebp # *p = a;
mov DWORD PTR foo[rip], 1 # foo = 1
add rsp, 8
pop rbx # restore call-preserved regs
pop rbp
ret
This function could assume that it doesn't return, and use rbx and rbp without saving/restoring them.
Gcc for ARM32 actually does that, but still emits instructions to return otherwise cleanly. So a noreturn function that does actually return on ARM32 will break the ABI and cause hard-to-debug problems in the caller or later. (Undefined behaviour allows this, but it's at least a quality-of-implementation problem: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82158.)
This is a useful optimization in cases where gcc can't prove whether a function does or doesn't return. (It's obviously harmful when the function does simply return, though. Gcc warns when it's sure a noreturn function does return.) Other gcc target architectures don't do this; that's also a missed optimization.
But gcc doesn't go far enough: optimizing away the return instruction as well (or replacing it with an illegal instruction) would save code size and guarantee noisy failure instead of silent corruption.
And if you're going to optimize away the ret, optimizing away everything that's only needed if the function will return makes sense.
Thus, func() could be compiled to:
sub rsp, 8
call ext
# *p = a; and so on assumed to never happen
ud2 # optional: illegal insn instead of fall-through
Every other instruction present is a missed optimization. If ext is declared noreturn, that's exactly what we get.
Any basic block that ends with a return could be assumed to never be reached.
I am a beginner at assembly, and I am curious to know what the stack frame looks like here, so that I can access the argument through understanding rather than by rote.
P.S.: the assembly function is process
#include <stdio.h>
# define MAX_LEN 120 // Maximal line size
extern int process(char*);
int main(void) {
char buf[MAX_LEN];
int str_len = 0;
printf("Enter a string:");
fgets(buf, MAX_LEN, stdin);
str_len = process(buf);
So, I know that when I want to access the process function's argument, which is in assembly, I have to do the following:
push ebp
mov ebp, esp ; now ebp is pointing to the same address as esp
pushad
mov ebx, dword [ebp+8]
Now I also would like someone to correct me on things I think are correct:
At the start, esp is pointing to the return address of the function, and [esp+8] is the slot in the stack under it, which is the function's argument
Since the function process has one argument and no inner declarations (not sure about the declarations) then the stack frame, from high to low, is 8 bytes for the argument, 8 bytes for the return address.
Thank you.
There's no way to tell other than by means of debugger. You are using ia32 conventions (ebp, esp) instead of x64 (rbp, rsp), but expecting int / addresses to be 64 bit. It's possible, but not likely.
Compile the program (gcc -O -g foo.c), then run with gdb a.out
#include <stdio.h>
int process(char* a) { printf("%p", (void*)a); return 0; }
int main()
{
process((char *)0xabcd1234);
}
Break at process; run; disassemble; inspect registers values and dump the stack.
- break process
- run
- disassemble
- info frame
- info args
- info registers
- x/32x $sp - 16 // dump 32 words, starting 16 bytes below the stack pointer
Then add more parameters, a second subroutine or local variables with known values. Single step to the printf routine. What does the stack look like there?
You can also use gdb as a calculator: what is the difference between sp and rax?
It's print $sp - $rax, if you ever want to know.
Tickle your compiler to produce assembler output (on Unixy systems usually with the -S flag). Play around with debugging/non-debugging flags; the extra hints for the debugger might help in referring back to the source. Don't give optimization flags, as the reorganizing done by the compiler can lead to thorough confusion. Add a simple function call to your code to see how it is set up and torn down, too.
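If you would rather check the layout from C than in the debugger, GCC and Clang provide builtins for it. The sketch below assumes an x86 or x86-64 target compiled with GCC or Clang, where the function sets up a frame pointer: the saved frame pointer sits at the frame address and the return address in the slot directly above it. The function name is made up.

```c
#include <stdio.h>

/* noinline so that the function gets a frame of its own */
__attribute__((noinline))
int return_address_is_above_frame_pointer(void)
{
    void **frame = __builtin_frame_address(0);   /* ebp/rbp after the prologue */
    void *ret = __builtin_return_address(0);

    printf("frame pointer:  %p\n", (void *)frame);
    printf("return address: %p (stored at %p)\n", ret, (void *)&frame[1]);
    return frame[1] == ret;   /* [ebp+4] on i386, [rbp+8] on x86-64 */
}
```

On i386 the first stack argument would then be at frame[2], i.e. [ebp+8], which is where the question's mov ebx, dword [ebp+8] reads it. On x86-64 the first argument arrives in rdi instead.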
How can I print out the current value at the stack pointer in C in Linux (Debian and Ubuntu)?
I tried google but found no results.
One trick, which is not portable or really even guaranteed to work, is to simply print out the address of a local as a pointer.
void print_stack_pointer() {
void* p = NULL;
printf("%p", (void*)&p);
}
This will essentially print out the address of p, which is a good approximation of the current stack pointer.
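A slightly fuller sketch of the same trick, assuming GCC or Clang (for the noinline attribute) and an ordinary contiguous stack. The function names are made up, and the values are only approximations of the stack pointer.

```c
#include <stdint.h>

/* Stores the address of a local, which lives near the current stack pointer. */
__attribute__((noinline))
void approx_stack_pointer(uintptr_t *out)
{
    volatile char marker = 0;
    *out = (uintptr_t)&marker;
}

/* Same measurement, taken one call deeper with a 256-byte buffer in between. */
__attribute__((noinline))
void approx_stack_pointer_nested(uintptr_t *out)
{
    volatile char pad[256];

    pad[0] = 1;
    approx_stack_pointer(out);
    pad[255] = (char)*out;      /* keeps pad alive across the call */
}
```

Calling both from the same function gives two different addresses; on targets where the stack grows downward, such as x86, the nested value is typically the lower one.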
There is no portable way to do that.
In GNU C, this may work for target ISAs that have a register named SP, including x86 where gcc recognizes "SP" as short for ESP or RSP.
// broken with clang, but usually works with GCC
register void *sp asm ("sp");
printf("%p", sp);
This usage of local register variables is now deprecated by GCC:
The only supported use for this feature is to specify registers for input and output operands when calling Extended asm
Defining a register variable does not reserve the register. Other than when invoking the Extended asm, the contents of the specified register are not guaranteed. For this reason, the following uses are explicitly not supported. If they appear to work, it is only happenstance, and may stop working as intended due to (seemingly) unrelated changes in surrounding code, or even minor changes in the optimization of a future version of gcc. ...
It's also broken in practice with clang where sp is treated like any other uninitialized variable.
In addition to duedl0r's answer with specifically GCC you could use __builtin_frame_address(0) which is GCC specific (but not x86 specific).
This should also work on Clang (but there are some bugs about it).
Taking the address of a local (as JaredPar answered) is also a solution.
Notice that AFAIK the C standard does not require any call stack in theory.
Remember Appel's paper: garbage collection can be faster than stack allocation; A very weird C implementation could use such a technique! But AFAIK it has never been used for C.
One could dream of other techniques. And you could have split stacks (at least on recent GCC), in which case the very notion of a stack pointer makes much less sense (because then the stack is not contiguous, and could be made of many segments of a few call frames each).
On Linux you can use the proc pseudo-filesystem to print the stack pointer.
Have a look here, at the /proc/your-pid/stat pseudo-file, at the fields 28, 29.
startstack %lu
The address of the start (i.e., bottom) of the
stack.
kstkesp %lu
The current value of ESP (stack pointer), as found
in the kernel stack page for the process.
You just have to parse these two values!
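A minimal parsing sketch, assuming the field numbering quoted above (startstack is field 28, kstkesp is field 29). The function name is made up. It takes the line as a string so it can be tried on any input; a real program would first read /proc/self/stat into a buffer. Field 2 (the command name) may itself contain spaces or parentheses, so the scan starts after the last ')'. Recent Linux kernels may report kstkesp as 0 for a running process, so check the value before relying on it.

```c
#include <stdlib.h>
#include <string.h>

/* Extracts fields 28 and 29 from one line of /proc/<pid>/stat.
 * Returns 0 on success, -1 if the line has fewer than 29 fields. */
int parse_stat_stack_fields(const char *line,
                            unsigned long long *startstack,
                            unsigned long long *kstkesp)
{
    const char *p = strrchr(line, ')');   /* end of field 2 (comm) */
    int field = 2;

    if (p == NULL)
        return -1;
    p++;
    while (*p != '\0') {
        while (*p == ' ')
            p++;                          /* skip separators */
        if (*p == '\0')
            break;
        field++;                          /* p now points at this field */
        if (field == 28)
            *startstack = strtoull(p, NULL, 10);
        if (field == 29) {
            *kstkesp = strtoull(p, NULL, 10);
            return 0;
        }
        while (*p != ' ' && *p != '\0')
            p++;                          /* skip the field's text */
    }
    return -1;
}
```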
You can also use an extended assembler instruction, for example:
#include <stdint.h>
uint64_t getsp( void )
{
uint64_t sp;
asm( "mov %%rsp, %0" : "=rm" ( sp ));
return sp;
}
For a 32 bit system, 64 has to be replaced with 32, and rsp with esp.
You have that info in the file /proc/<your-process-id>/maps, on the line where the string [stack] appears (so it is independent of the compiler or machine). That line gives the address range of the stack mapping rather than the current stack pointer. A process can read its own maps file; reading the file of a process owned by another user requires root.
Try lldb or gdb. For example we can set backtrace format in lldb.
settings set frame-format "frame #${frame.index}: ${ansi.fg.yellow}${frame.pc}: {pc:${frame.pc},fp:${frame.fp},sp:${frame.sp}} ${ansi.normal}{ ${module.file.basename}{\`${function.name-with-args}{${frame.no-debug}${function.pc-offset}}}}{ at ${ansi.fg.cyan}${line.file.basename}${ansi.normal}:${ansi.fg.yellow}${line.number}${ansi.normal}{:${ansi.fg.yellow}${line.column}${ansi.normal}}}{${function.is-optimized} [opt]}{${frame.is-artificial} [artificial]}\n"
So we can print the fp and sp while debugging, for example:
frame #10: 0x208895c4: pc:0x208895c4,fp:0x01f7d458,sp:0x01f7d414 UIKit`-[UIApplication _handleDelegateCallbacksWithOptions:isSuspended:restoreState:] + 376
Look more at https://lldb.llvm.org/use/formatting.html
You can use setjmp. The exact details are implementation dependent, look in the header file.
#include <setjmp.h>
jmp_buf jmp;
setjmp(jmp);
printf("%08x\n", jmp[0].j_esp);
This is also handy when executing unknown code. You can check the sp before and after and do a longjmp to clean up.
If you are using msvc you can use the provided function _AddressOfReturnAddress()
It'll return the address of the return address, which is guaranteed to be the value of RSP at a function's entry. Once you return from that function, the RSP value will be increased by 8 since the return address is popped off.
Using that information, you can write a simple function that returns the current address of the stack pointer like this:
#include <intrin.h>
#include <stdint.h>
#include <stdio.h>
uintptr_t GetStackPointer() {
return (uintptr_t)_AddressOfReturnAddress() + 0x8;
}
int main(int argc, const char *argv[]) {
uintptr_t rsp = GetStackPointer();
printf("Stack pointer: %p\n", (void *)rsp);
}
You may use the following:
uint32_t msp_value = __get_MSP(); // Read Main Stack pointer
By the same way if you want to get the PSP value:
uint32_t psp_value = __get_PSP(); // Read Process Stack pointer
In assembly language, you can read MSP and PSP with the MRS instruction:
MRS R0, MSP // Read Main Stack pointer to R0
MRS R0, PSP // Read Process Stack pointer to R0
Source: http://milw0rm.org/papers/145
#include <stdio.h>
#include <stdlib.h>
int main()
{
char scode[]="\x31\xc0\xb0\x01\x31\xdb\xcd\x80";
(*(void(*) ()) scode) ();
}
This paper is a tutorial about shellcode on the Linux platform; however, it does not explain how the statement "(*(void(*) ()) scode) ();" works. I looked in "The C Programming Language, 2nd ed." by Brian W. Kernighan and Dennis M. Ritchie but found no answer. Could someone point me in the right direction, maybe to a website or another C reference book where I can find an answer?
scode holds machine code (assembled instructions); the statement casts it to a callable void function pointer and calls it. GMan demonstrated an equivalent, clearer approach:
typedef void(*void_function)(void);
int main()
{
char scode[]="\x31\xc0\xb0\x01\x31\xdb\xcd\x80";
void_function f = (void_function)scode;
f(); //or (*f)();
}
scode contains x86 machine code which disassembles into (thanks Michael Berg)
31 c0 xor %eax,%eax
b0 01 mov $0x1,%al
31 db xor %ebx,%ebx
cd 80 int $0x80
This is the code for a system call in Linux (interrupt 0x80). According to the system call table, this is calling the sys_exit() system call (eax=1) with parameter 0 (in ebx). This causes the process to exit immediately, as if it called _exit(0).
Jonathan Leffler pointed out that this is most commonly used to call shellcode, "a small piece of code used as the payload in the exploitation of a software vulnerability." Thus, modern OSes take measures to prevent this.
If the stack is non-executable, this code will fail horribly. The shellcode is stored in a local variable on the stack, and then we jump to that location. If the stack is non-executable, a CPU fault will occur as soon as the CPU tries to execute the code, and control will pass to the kernel's fault handler, which will then kill the process abnormally. One case where the stack might be non-executable is a CPU that supports Physical Address Extension with the NX (no-execute) bit set in the page tables.
There may also be instruction cache issues on some CPUs -- if the instruction cache hasn't been flushed, the CPU may read stale data (instead of the shell code we explicitly loaded into the stack) and start executing random instructions.
In C:
(some_type) some_var
casts some_var to be of type some_type.
In your code sample "void(*) ()" is the some_type: a pointer to a function that returns nothing (in C the empty parentheses leave the parameters unspecified; the code passes none).
"(void(*) ()) scode" casts scode to be a function pointer.
"(*(void(*) ()) scode)" dereferences that function pointer.
And the final () calls the function defined in scode.
And the bytes in scode disassemble to the following i386 assembly:
31 c0 xor %eax,%eax
b0 01 mov $0x1,%al
31 db xor %ebx,%ebx
cd 80 int $0x80
This code stores some machine code (the bytes in scode) in an array, converts the address of that array into a function pointer of type void (*)(), and then calls it.
In C/C++, this function pointer type can be expressed as a typedef:
typedef void (* basicFunctionPtr) (void);
A typedef helps:
// function that takes and returns nothing
typedef void(*generic_function)(void);
// cast to function
generic_function f = (generic_function)scode;
// call
(*f)();
// same thing written differently:
// call
f();
scode is an address. (void(*)()) casts scode to a pointer to a function returning void. The leading * dereferences that function pointer, and the trailing () calls it with no arguments.
To learn a lot more about shell-coding technique, look at the book:
The Shellcoder's Handbook, 2nd Edn
There are several other similar books as well - I think this is the best, but could be persuaded otherwise. You can also find numerous related resources with Google and "shellcoder's handbook" (or your search engine of choice, no doubt).
The character array contains executable code and the cast is a function cast.
(void(*) ()) means "cast to a pointer to a function that produces void", i.e. nothing. The () at the end is the function call operator.
The characters encoded in scode are the char/byte representations of some compiled assembly code. The code you have posted takes that assembly, encoded as characters for simplicity, and then calls that string as a function.
The assembly seems to translate out to:
xor %eax,%eax
mov $0x1,%al
xor %ebx,%ebx
int $0x80
Despite being called shellcode, this would not spawn a shell in Linux: it just makes the exit system call with status 0.