I encountered some code in a tutorial about buffer overflows.
It's a program that exploits a simple program that is vulnerable to a buffer overflow (if some stack protection mechanisms are turned off).
My question is: what is the for loop doing? I mean the line within the for loop:
*(void **)(buf + i) = addr;
It's a bit of a strange syntax that I haven't seen before, or maybe I have seen it but it just confuses me.
The idea of the program is that buf is passed as an argument to the vulnerable program, where a strcpy overwrites the return address on the stack so that the program runs the shellcode that is passed in an environment variable.
Thanks!
The full code:
int main(int argc, char **argv) {
void *addr = (char *) 0xc0000000 - 4 - (strlen(VULN) + 1) - (strlen(&shellcode) + 1);
char buf[768];
size_t i;
for (i = 0; i < sizeof(buf); i += sizeof(void *)) {
*(void **)(buf + i) = addr;
}
char *params[] = { VULN, buf, NULL };
char *env[] = { &shellcode, NULL };
execve(VULN, params, env);
perror("execve");
return -1;
}
C has a kind of Treehorn type system. For any object x of type T, you can pretend it's an object of a different type. To do so, you cast the address of the object. So, in steps:
T x; is an object of type T.
&x is the address of the object, it's of type T * – "pointer to T".
Now pretend this is a pointer to something else: (U *)(&x) – a "pointer to U", but it's the same value.
If we dereference that, we treat the object x as though it were a U: *(U *)(&x)
Now apply all this to T = char, x = buf[i] and U = void * in your code. Note that &buf[i] is identical to buf + i. Also note that i is incremented in strides of sizeof(void *) so that each round of the loop doesn't step on the memory touched by the previous rounds.
A word of warning: it is generally not allowed to treat one object as though it were one of a different type; this is undefined behavior. There are only some exceptions; e.g. you can treat an int as though it were an unsigned int, and you can treat any object x as though it were a char[sizeof x]. (None of these are the case in your code, which is not well-formed.)
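To make the effect of the loop concrete, here is a minimal sketch that writes the same bytes into the buffer but uses memcpy, which avoids the aliasing problem described above. The helper name fill_with_pointer and its parameters are made up for illustration and are not part of the tutorial.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: store the bytes of the pointer value `addr` into `buf`
   once per sizeof(void *) bytes, like the loop in the question, but with
   memcpy instead of casting buf + i to void **. */
static void fill_with_pointer(char *buf, size_t len, void *addr)
{
    size_t i;
    /* Stop before a partial slot so the copy never runs past the buffer. */
    for (i = 0; i + sizeof addr <= len; i += sizeof addr) {
        memcpy(buf + i, &addr, sizeof addr);
    }
}
```

Calling fill_with_pointer(buf, sizeof buf, addr) would take the place of the for loop in the question.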
First, it calculates a value which will remain constant throughout the execution of the for loop:
0xc0000000 - 4 - (strlen(VULN) + 1) - (strlen(&shellcode) + 1)
Then, inside the for loop, it writes this constant value into every "4-byte entry" in the buf array:
buf[0...3] = the constant value
buf[4...7] = the constant value
buf[8...11] = the constant value
...
buf[764...767] = the constant value
I have a question. I want to pass my own 2D array to a function named pass and change the array inside that function, so the function returns it. What I know is that the code below is accepted by the compiler, but I don't know why. When I take int (* aaa)[3]; out of the main function, it works well. But when it is inside main, an exception is thrown saying that the uninitialized aaa cannot be used. I wonder why this happens.
int* pass(int (*a)[3]) {
a=(int*)malloc(sizeof(int*)*2);
a[0][1] = 1;
a[0][2] = 2;
return a;
}
int (* aaa)[3];
int main() {
aaa = pass(aaa);
printf("%d", aaa[0][2]);
}
This version works.
int* pass(int (*a)[3]) {
a=(int*)malloc(sizeof(int*)*2);
a[0][1] = 1;
a[0][2] = 2;
return a;
}
int main() {
int (* aaa)[3];
aaa = pass(aaa);
printf("%d", aaa[0][2]);
}
But this version does not work.
When int (* aaa)[3]; appears outside of any function, aaa is automatically initialized to a null pointer. When it appears inside a function, it is not initialized.
The code aaa = pass(aaa); passes aaa to the routine named pass. This is a use of the value of aaa. When aaa has been initialized, that is fine. But, when aaa is not initialized and you attempt to pass its value, the behavior is not defined by the C standard. This is what the compiler is warning you about.
Next, let’s examine this code:
int* pass(int (*a)[3]) {
a=(int*)malloc(sizeof(int*)*2);
a[0][1] = 1;
a[0][2] = 2;
return a;
}
This code never uses the value of a that is passed to it. When a function is called, its parameter, a in this case, is given a value (which comes from the argument the caller passed). This parameter is a separate variable from the argument. Assigning a a value with a=(int*)malloc(sizeof(int*)*2); does not change the value of aaa in the calling routine. So this code assigns a new value to a without using the old value.
Because of that, the routine does not need a parameter passed to it. It could be written to use a local variable instead, like this:
int (*pass(void))[3] {
int (*a)[3] = malloc(2 * sizeof *a);
a[0][1] = 1;
a[0][2] = 2;
return a;
}
The void in this means pass does not take any arguments.
Note that I changed malloc(sizeof(int*)*2) to malloc(2 * sizeof *a). sizeof(int*)*2 is wrong because it requests space for two pointers to int. But a points to arrays of three int, so, to get two of those, you need space for two arrays of three int. That is 2 * sizeof(int [3]). However, it is easier to write this as malloc(2 * sizeof *a), which means “two of whatever a points to”. This is also better because it reduces the frequency with which errors are made: Even if the declaration of a is changed, this sizeof *a will automatically adjust without needing to be edited. With sizeof(int [3]), any edit to the declaration of a would require another edit to the sizeof.
Also, I removed the (int*) to cast the result of malloc. In C, a void *, which is the type malloc returns, will automatically be converted to whatever object pointer type it is assigned to. There is no need for an explicit cast, and using an explicit cast can mask certain errors. (However, if you compile the program with a C++ compiler, it will complain about the lack of a cast, because the rules are different in C++.)
Since the function is returning a pointer to an array of three int, not a pointer to an int, I changed its declaration to int (*pass(void))[3].
With these changes, the program could be:
#include <stdio.h>
#include <stdlib.h>
int (*pass(void))[3]
{
int (*a)[3] = malloc(2 * sizeof *a);
a[0][1] = 1;
a[0][2] = 2;
return a;
}
int main(void)
{
int (*aaa)[3] = pass();
printf("%d\n", aaa[0][2]);
}
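If the caller really does want the function to set the caller's own pointer, one option is to pass the address of that pointer. The following is only a sketch of that idea; the name alloc_rows and its error convention (0 on success, -1 on failure) are assumptions made for illustration.

```c
#include <stdlib.h>

/* Hypothetical variant of pass(): the caller hands in the address of its own
   pointer, so the assignment inside the function is visible to the caller. */
static int alloc_rows(int (**out)[3], size_t rows)
{
    int (*a)[3];

    if (rows == 0) {
        return -1;                  /* nothing to allocate, a[0] would not exist */
    }
    a = calloc(rows, sizeof *a);    /* rows arrays of three int, zero-filled */
    if (a == NULL) {
        return -1;                  /* allocation failed */
    }
    a[0][1] = 1;
    a[0][2] = 2;
    *out = a;
    return 0;
}
```

A caller would declare int (*aaa)[3] = NULL;, call alloc_rows(&aaa, 2), check the return value, and later call free(aaa).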
Maybe this helps you see that C is 'flexible' when it comes to arrays. In the first part, the size of the array is given by datalen in the malloc call inside initAAA, which returns a pointer to the allocated memory. In the for loop we can still access the data by index.
The second part of main declares the same data, bbb, as the first part's aaa, but this time as an array rather than a pointer, and the elements are initialized to zero with the braces {}. A boring for loop over all the indexes that sets each element to 0 would also do it, but who wants more code than needed?
#include <stdio.h>
#include <stdlib.h> // malloc, free
#include <string.h> // memset
int *initAAA(int *p, unsigned entrys) {
    size_t datalen = entrys * sizeof *p;
    p = malloc(datalen); // p is a pointer here.
    if (p == NULL) {
        return NULL; // allocation failed
    }
    // set datalen bytes, starting at address p, to the byte value 0
    // easier than writing a for loop to initialize safely.
    memset(p, 0, datalen); // declared in string.h
    return p;
}
int main(void) {
    // enum constants are constant expressions, so bbb below is an ordinary
    // array (not a variable-length array) and may have an initializer.
    enum { maxAssets = 3, entrysPerAsset = 2 };
    int *aaa = NULL; // always a good idea to set pointers to NULL before allocating memory for them, because you can check if (aaa==NULL) initAAA(...
    unsigned entrys = maxAssets * entrysPerAsset;
    aaa = initAAA(aaa, entrys);
    if (aaa == NULL) {
        return 1;
    }
    printf("address:%p items:%u \n", (void *)aaa, entrys);
    for (unsigned i = 0; i < entrys; i++) {
        printf("%d ", aaa[i]);
    }
    free(aaa); // malloc without free, bad idea!
    printf("\n---\n");
    int bbb[maxAssets][entrysPerAsset] = {0,0,0,0,0,0};
    for (unsigned a = 0; a < maxAssets; a++) {
        for (unsigned e = 0; e < entrysPerAsset; e++) {
            printf("%d ", bbb[a][e]);
        }
    }
    // bbb does not need free(bbb); because it is released when the function ends.
    // and yep, there was no malloc for bbb. so we are done.
}
and by the way. welcome to C.
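As a side note, the malloc plus memset pair in initAAA can usually be collapsed into a single calloc call. The sketch below assumes a helper named make_zeroed, which is not part of the answer above.

```c
#include <stdlib.h>

/* Hypothetical helper: allocate `entries` ints that all start at 0.
   calloc zero-fills the block, so no separate memset is needed.
   Returns NULL if the allocation fails; the caller must free() the result. */
static int *make_zeroed(size_t entries)
{
    return calloc(entries, sizeof(int));
}
```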
I'm confused about these errors. str1 is a string that is being passed in, but I get a warning at the comparison and an error at the if statement.
int stringcmp(void* str1, void* str2) {
int a = strlen(str1);
int b = strlen(str2);
int x;
if ( a < b ) {
x = a;
} else {
x = b;
}
int c = 0;
while ( c < x ) {
if (str1[c] < str2[c]) { //errors happen here
return 0;
}
if (str1[c] > str2[c]) {
return 1;
}
c++;
}
if ( a == x ) {
return 0;
}
return 1;
}
Your function receives void* arguments, so in the line you pointed out you are dereferencing pointers to void, and that is why you get the warnings. The error "void value not ignored as it ought to be" has the same cause: str1[c] has type void, and the comparison tries to use its value.
The calls to strlen are accepted in C, because a void* converts implicitly to const char*, but a C++ compiler would reject them.
So you have to change your function's signature to
int stringcmp(const char* str1, const char* str2);
to suppress the warnings and to be able to index into the strings.
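For reference, here is a sketch of the function from the question with only the parameter types changed (plus size_t for the lengths). It keeps the question's return convention, which is an interpretation of the original code: 0 when str1 sorts before or equal to str2, and 1 otherwise.

```c
#include <string.h>

/* Same comparison logic as the question, but with const char * parameters so
   that str1[c] and str2[c] are valid expressions.
   Returns 0 if str1 sorts before or equal to str2, and 1 otherwise. */
static int stringcmp(const char *str1, const char *str2)
{
    size_t a = strlen(str1);
    size_t b = strlen(str2);
    size_t x = (a < b) ? a : b;   /* length of the shorter string */
    size_t c;

    for (c = 0; c < x; c++) {
        if (str1[c] < str2[c]) {
            return 0;
        }
        if (str1[c] > str2[c]) {
            return 1;
        }
    }
    /* Common prefix: str1 is "not greater" if it is the shorter or equal one. */
    return (a == x) ? 0 : 1;
}
```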
Dereferencing void * makes no sense, ever. In C, void means "no type"/"nothing", if you have void *p;, what is *p supposed to be?
In C, void * is used as a generic pointer type, "pointer to anything". To use p above, you must cast it to the type of the object it is pointing at (passed in some other way, unbeknownst to the compiler). E.g.:
int i;
void *p = (void *) &i;
...
int j = *(int *)p;
You know what p points to, the compiler doesn't.
The type void * is often used to pass around opaque data, or to write generic functions (like qsort, it gets an array of unspecified elements and a function that compares them).
That said, as a (misguided) extension GCC allows pointer arithmetic on void *, so that p + 1 is like ((char *)p + 1), it points at the next char position.
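The qsort case mentioned above can be sketched as follows. The names compare_ints and sort_ints are made up for this example; qsort itself is the standard library function from stdlib.h.

```c
#include <stdlib.h>

/* Comparison callback for qsort: it receives void pointers and casts them
   back to the real element type, which only the programmer knows. */
static int compare_ints(const void *lhs, const void *rhs)
{
    int a = *(const int *)lhs;
    int b = *(const int *)rhs;
    return (a > b) - (a < b);   /* -1, 0, or 1, without risking overflow */
}

/* Hypothetical wrapper that sorts an int array in ascending order. */
static void sort_ints(int *values, size_t count)
{
    qsort(values, count, sizeof values[0], compare_ints);
}
```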
Your error arises from the fact that you are attempting to use the array notation str1[c] on a void-pointer. To understand why this is wrong we need to look at what the array notation in C actually means.
Let us assume that we define char* str1. Here we have said that str1 is an address to some place in memory where there is a char.
To get the data that is stored on the address which str1 is referring to we can use *str1.
This is equivalent to saying: "Go to the address that str1 is holding and give me what is stored on that address".
When we use the array notation we can use str1[0], this will be a value fetched from a place in memory where there is an element of the same type that str1 was defined as. It is the same thing as saying *str1 (go to the address that str1 is pointing to and give me the value that is stored there).
An array is just a bunch of data stored in a sequence in memory and strings are just arrays of characters of exactly 1 byte in size stored immediately after one another.
When we say str1[1] we are telling the compiler to move by the size of the type that str1 was defined to be pointing at, in this case 1 byte(char), and then get us whatever is stored at that location. In the case of strings, this should be another char.
Now when we have defined str1 as void*, how would the compiler know how much it should move in memory to get the next element in the array? Since void has no size it is impossible.
Hopefully you now understand what you need to change in this line to get rid of your errors
int stringcmp(void* str1, void* str2)
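When the element type really is unknown at compile time, the element size has to be supplied by hand, which is roughly what generic library functions do. A minimal sketch, assuming a helper named element_at that is invented for this example:

```c
#include <stddef.h>

/* Hypothetical helper: return the address of element `index` in an array
   whose element size is only known at run time. The void pointer is cast to
   unsigned char * because pointer arithmetic needs a pointed-to type with a
   size, and unsigned char has size 1. */
static void *element_at(void *base, size_t index, size_t elem_size)
{
    return (unsigned char *)base + index * elem_size;
}
```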
I have tried to understand void* recently by testing a few things, here is the test code:
#include <stdio.h>
int main(int argc, char** argv)
{
float nb2 = 2.2;
void* multi_type_ptr = &nb2;
printf("\nmulti_type_ptr:\n");
printf("Elem 1: %f\n",*(float*)multi_type_ptr);
// *(multi_type_ptr + 1) = 4;
// *(int*)(multi_type_ptr + 1) = 4;
*((int*)multi_type_ptr + 1) = 4;
printf("Elem 2: %d\n",*((int*)multi_type_ptr + 1));
return 0;
}
Does this instruction fail because the language/compiler doesn't know how many bytes it should add to the address of multi_type_ptr (it doesn't know the size of the type pointed to by void*)?
*(multi_type_ptr + 1) = 4;
Also a void* cannot be dereferenced, so it should be cast to the right type, right? This line still can't work because of the issue mentioned above.
*(int*)(multi_type_ptr + 1) = 4;
Does this assignment work because the language/compiler understands that it has to add (1 * sizeof(int)) to the address of multi_type_ptr? I use the cl compiler included in Visual Studio.
*((int*)multi_type_ptr + 1) = 4;
It looks like the result of (multi_type_ptr + 1) is still an (int*) when its content is accessed by the * at the beginning of the line, am I right?
Thank you in advance for your future correction, explanation and reinforcement.
*(multi_type_ptr + 1) = 4;
The expression multi_type_ptr + 1 is not valid C because C does not allow pointer arithmetic with void *.
*((int*)multi_type_ptr + 1) = 4;
This statement compiles in your machine because (int*) multi_type_ptr + 1 does pointer arithmetic with int *. Nevertheless pointer addition in C is undefined behavior if the resulting pointer is not in the same array object as the operand (or one past the last element) and dereferencing the resulting pointer is undefined behavior if the resulting pointer is outside the array object, so in your case the * expression (and thus the statement) invokes undefined behavior.
There are several questions here:
Does this instruction fail because the language/compiler doesn't know how many bytes it should add to the address of multi_type_ptr (it doesn't know the size of the type pointed to by void*)?
*(multi_type_ptr + 1) = 4;
Yes, you are right. Some compilers (GCC for example) allow pointer arithmetic with void pointers and treat them as if pointing to objects of size 1 (just like char*), to avoid the frequent cast from void* just to move around.
Does this assignment work because the language/compiler understands that it has to add (1 * sizeof(int)) to the address of multi_type_ptr? I use the cl compiler included in Visual Studio.
*((int*)multi_type_ptr + 1) = 4;
Yes, right again. But note that while this code is syntactically correct, and it will compile, it is technically Undefined Behaviour for two reasons:
You are type punning or type aliasing the nb2 variable. That is you are accessing it using a different type than that defined (defined as float, used as int) (the exception would be that you are using char pointers, but that is not the case).
If 2*sizeof(int) > sizeof(float) (quite likely) then your pointer arithmetic is taking you outside of nb2 bounds, and you are probably corrupting your stack.
It looks like the result of (multi_type_ptr + 1) is still an (int*) when its content is accessed by the * at the beginning of the line, am I right?
No. The cast operator has higher precedence than +, so this line:
*((int*)multi_type_ptr + 1) = 4;
is actually read as:
*(((int*)multi_type_ptr) + 1) = 4;
That is, like:
int *temp1 = (int*)multi_type_ptr;
int *temp2 = temp1 + 1;
*temp2 = 4;
But again, beware the type aliasing!
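If the goal is to look at the bits of a float as an integer, a commonly used alternative that avoids the aliasing problem is to copy the bytes with memcpy. The sketch below assumes a 32-bit float in IEEE 754 format, which is typical but not guaranteed by the C standard; the function name float_bits is made up.

```c
#include <stdint.h>
#include <string.h>

/* Copy the object representation of a float into an integer of the same
   size. memcpy works on bytes, so no aliasing rule is violated. */
static uint32_t float_bits(float value)
{
    uint32_t bits = 0;
    /* Assumption: float is 32 bits wide, as on most current platforms.
       If it is not, the function returns 0 instead of copying. */
    if (sizeof bits == sizeof value) {
        memcpy(&bits, &value, sizeof bits);
    }
    return bits;
}
```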
I'm learning C and this is probably the most stupid question ever but here it goes!
My code:
char tmp = *x++;
char *g = &tmp;
Can I do this line of code in a one liner? With something like:
char *g = &(*x++);
More info:
char *x = "1234";
//Turn each 'number' to a int
while(*x != '\0'){
char tmp = *x++;
char *x = &tmp;
int t = atoi(x);
}
If x is a pointer to char (and points to an existing object), then the declaration char *g = &(*x++); is allowed in standard C and has defined behavior.
The result of the * operator is an lvalue, per C 2011 6.5.3.2 4, so its address may be taken with &.
In detail:
x++ increments x and produces the original value of x. (Note: Incrementing a pointer requires that the pointer point to an object. It does not require that it be an array element or that there be another object after it; you are allowed to increment to point one past an object, as long as you do not then dereference a pointer to a non-existent object.)
*x++ dereferences the pointer (the original value of x). The result is an lvalue.
&(*x++) takes the address of the lvalue, which is the original value of x.
Then this value is used to initialize g.
Additionally, C 2011 6.5.3.2 3 specifies that the combination of & and * cancel, except that the result is not an lvalue and the usual constraints apply, so the & and * operations are not actually evaluated. Thus, this statement is the same as char *g = x++;.
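The equivalence can be checked with a small sketch. The helper take_and_advance is hypothetical; it wraps the expression from the question so that its effect on the caller's pointer can be observed.

```c
/* Hypothetical helper mirroring `char *g = &(*x++);` from the question:
   it returns the address x pointed to before the increment and advances x. */
static const char *take_and_advance(const char **x)
{
    return &(*(*x)++);   /* same result as: return (*x)++; */
}
```

After const char *x = "1234"; const char *g = take_and_advance(&x);, g points at the '1' and x points at the '2'.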
Just as a hint: Are you aware that you are shadowing the outer x variable?
What you are currently doing is:
char *x = "1234"; //declare c style string
while(*x != '\0'){ //for the condition, the "outer" x will be used for the comparison
char tmp = *x++; //increment x to point to whatever comes sizeof(char) bytes afterwards; dereference what x previously pointed to (postfix ++) and save it in local char tmp
char *x = &tmp; //take the address of the local char and save it in a new local/inner char*
int t = atoi(x); //call atoi on the inner x
}
Although this may work it may be confusing style to shadow variables like this. (Confusing for other developers especially)
Also take a look at the signature of atoi:
int atoi (const char * str);
Here you can see that atoi takes a pointer to char, so you can pass x directly (keep in mind that atoi then converts the whole remaining string, not a single digit):
int t = atoi(x);
++x;
Or preferably:
int t = atoi(x++);
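If the aim is the one stated in the question's comment, turning each digit character into an int, then atoi is not really needed. A minimal sketch, assuming a made-up helper named digits_to_ints:

```c
#include <stddef.h>

/* Hypothetical helper: convert up to `max` leading decimal digit characters
   of `text` into their numeric values and return how many were stored.
   Subtracting '0' works because C guarantees that the digits '0' to '9'
   have consecutive character codes. */
static size_t digits_to_ints(const char *text, int *out, size_t max)
{
    size_t count = 0;
    while (*text >= '0' && *text <= '9' && count < max) {
        out[count++] = *text++ - '0';
    }
    return count;
}
```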
Can anyone explain the logic how to add a and b?
#include <stdio.h>
int main()
{
int a=30000, b=20, sum;
char *p;
p = (char *) a;
sum = (int)&p[b]; //adding a and b!
printf("%d",sum);
return 0;
}
The + is hidden here:
&p[b]
this expression is equivalent to
(p + b)
So we actually have:
(int) &p[b] == (int) &((char *) a)[b] == (int) ((char *) a + b) == a + b
Note that this technically invokes undefined behavior as (char *) a has to point to an object and pointer arithmetic outside an object or one past the object invokes undefined behavior.
C standard says that E1[E2] is equivalent to *((E1) + (E2)). Therefore:
&p[b] = &*((p) + (b)) = ((p) + (b)) = ((a) + (b)) = a + b
p[b] is the b-th element of the array p. It's like writing *(p + b).
Now, when adding & it'll be like writing: p + b * sizeof(char) which is p + b.
Now, you'll have (int)((char *) a + b) which is.. a + b.
But.. when you still have + in your keyboard, use it.
As #gerijeshchauhan clarified in the comments, * and & are inverse operations, they cancel each other. So &*(p + b) is p + b.
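The same identity, &p[b] being p + b, can be demonstrated without leaving defined behavior by keeping every pointer inside a real array. This is only an illustrative sketch; add_by_indexing and its 1024-byte arena are assumptions, and the sum a + b must not exceed the arena size.

```c
#include <stddef.h>

/* Hypothetical demonstration of &p[b] == p + b that stays inside a real
   array, so every pointer involved is valid.
   Requires a + b <= sizeof arena. */
static size_t add_by_indexing(size_t a, size_t b)
{
    static char arena[1024];
    char *p = arena + a;                /* p plays the role of (char *)a  */
    return (size_t)(&p[b] - arena);     /* distance from the start: a + b */
}
```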
p is made a pointer to char
a is converted to a pointer to char, thus making p point to memory with address a
Then the subscript operator is used to get to an object at an offset of b beyond the address pointed to by p. b is 20, so p+20 = 30020. Then the address-of operator takes the address of that object, and the cast converts that address back to int, which gives the effect of a+b.
The below comments might be easier to follow:
#include <stdio.h>
int main()
{
int a=30000, b=20, sum;
char *p; //1. p is a pointer to char
p = (char *) a; //2. a is converted to a pointer to char and p points to memory with address a (30000)
sum = (int)&p[b]; //3. p[b] is the b-th (20-th) element from address of p. So the address of the result of that is equivalent to a+b
printf("%d",sum);
return 0;
}
char *p;
p is a pointer (to element with size 1 byte)
p=(char *)a;
now p points to memory with address a
sum= (int)&p[b];
the pointer p can be used like an array p[] (the start address of this array in memory is a)
p[b] means to get b-th element - this element address is a+b
[ (start address)a + b (b-th element * size of element (1 byte)) ]
&p[b] means to get address of element at p[b] but its address is a+b
if you use pointer to int (mostly 4 bytes)
int* p
p = (int*)a;
your sum will be a+(4*b)
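The scaling by the element size can be observed with a valid array instead of a made-up address. A small sketch, using an invented helper int_byte_offset and a 64-element array chosen only for the example:

```c
#include <stddef.h>

/* Hypothetical helper: how many bytes lie between element 0 and element
   `index` of an int array. `index` must be at most 64 here. */
static size_t int_byte_offset(size_t index)
{
    static int values[64];
    int *p = values;
    /* Converting both addresses to char * makes the difference count bytes. */
    return (size_t)((char *)&p[index] - (char *)p);
}
```

On a platform where int is 4 bytes, int_byte_offset(20) gives 80, matching the a+(4*b) rule above.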
int a=30000, b=20, sum;
char *p; //1. p is a pointer to char
p = (char *) a;
a is of type int, and has the value 30000. The above assignment converts the value 30000 from int to char* and stores the result in p.
The semantics of converting integers to pointers are (partially) defined by the C standard. Quoting the N1570 draft, section 6.3.2.3 paragraph 5:
An integer may be converted to any pointer type. Except as previously
specified, the result is implementation-defined, might not be
correctly aligned, might not point to an entity of the referenced
type, and might be a trap representation.
with a (non-normative) footnote:
The mapping functions for converting a pointer to an integer or an
integer to a pointer are intended to be consistent with the addressing
structure of the execution environment.
The standard makes no guarantees about the relative sizes of types int and char*; either could be bigger than the other, and the conversion could lose information. The result of this particular conversion is very unlikely to be a valid pointer value. If it's a trap representation, then the behavior of the assignment is undefined.
On a typical system you're likely to be using, char* is at least as big as int, and integer-to-pointer conversions probably just reinterpret the bits making up the integer's representation as the representation of a pointer value.
sum = (int)&p[b];
p[b] is by definition equivalent to *(p+b), where the + denotes pointer arithmetic. Since the pointer points to char, and a char is by definition 1 byte, the addition advances the pointed-to address by b bytes in memory (in this case 20).
But p is probably not a valid pointer, so any attempt to perform arithmetic on it, or even to access its value, has undefined behavior.
In practice, most C compilers generate code that doesn't perform extra checks. The emphasis is on fast execution of correct code, not on detection of incorrect code. So if the previous assignment to p set it to an address corresponding to the number 30000, then adding b, or 20, to that address will probably yield an address corresponding to the number 30020.
That address is the result of (p+b); now the [] operator implicitly applies the * operator to that address, giving you the object that that address points to -- conceptually, this is a char object stored at an address corresponding to the integer 30020.
We immediately apply the & operator to that object. There's a special-case rule that says applying & to the result of a [] operator is equivalent to just doing the pointer addition; see 6.5.3.2p3 in the above referenced standard draft.
So this:
&p[b]
is equivalent to:
p + b
which, as I said above, yields an address (of type char*) corresponding to the integer value 30020 -- assuming, of course, that integer-to-pointer conversions behave in a certain way and that the undefined behavior of constructing and accessing an invalid pointer value doesn't do anything surprising.
Finally, we use a cast operator to convert this address to type int. Conversion of a pointer value to an integer is also implementation-defined, and possibly undefined. Quoting 6.3.2.3p6:
Any pointer type may be converted to an integer type. Except as
previously specified, the result is implementation-defined. If the
result cannot be represented in the integer type, the behavior is
undefined. The result need not be in the range of values of any
integer type.
It's not uncommon for a char* to be bigger than an int (for example, I'm typing this on a system with 32-bit int and 64-bit char*). But we're relatively safe from overflow in this case, because the char* value is the result of converting an in-range int value. There's no guarantee that converting a given value from int to char* and back to int will yield the original result, but it commonly works that way, at least for values that are in range.
So if a number of implementation-specific assumptions happen to be satisfied by the implementation on which the code happens to be running, then this code is likely to yield the same result as 30000 + 20.
Incidentally, I've worked on a system where this would have failed. The Cray T90 was a word-addressed machine, with hardware addresses pointing to 64-bit words; there was no hardware support for byte addressing. But char was 8 bits, so char* and void* pointers had to be constructed and manipulated in software. A char* pointer consisted of a 64-bit word pointer with a byte offset stored in the otherwise unused high-order 3 bits. Conversions between pointers and integers did not treat these high-order bits specially; they were simply copied. So ptr + 1 and (char*)((int)ptr + 1) could yield very different results.
But hey, you've managed to add two small integers without using the + operator, so there's that.
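When a pointer genuinely has to be stored in an integer, the usual tool is uintptr_t from stdint.h rather than int. The type is optional in the standard but widely available. A minimal sketch, with a made-up function name:

```c
#include <stdint.h>

/* Convert a pointer to an integer and back. Where uintptr_t exists, a valid
   void * converted to it and back compares equal to the original pointer. */
static void *round_trip(void *ptr)
{
    uintptr_t as_integer = (uintptr_t)ptr;
    return (void *)as_integer;
}
```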
An alternative to the pointer arithmetic is to use bitops:
#include <stdio.h>
#include <string.h>
unsigned addtwo(unsigned one, unsigned two);
unsigned addtwo(unsigned one, unsigned two)
{
unsigned carry;
for( ;two; two = carry << 1) {
carry = one & two;
one ^= two;
}
return one;
}
int main(int argc, char **argv)
{
unsigned one, two, result;
if ( argc < 3 ) return 1; /* both operands must be given on the command line */
if ( sscanf(argv[1], "%u", &one ) < 1) return 0;
if ( sscanf(argv[2], "%u", &two ) < 1) return 0;
result = addtwo(one, two);
fprintf(stdout, "One:=%u Two=%u Result=%u\n", one, two, result );
return 0;
}
On a completely different note, perhaps what was being looked for was an understanding of how binary addition is done in hardware, with XOR, AND, and bit shifting. In other words, an algorithm something like this:
int add(int a, int b)
{
    int partial_sum = a ^ b;
    int carries = a & b;
    if (carries)
        return add(partial_sum, carries << 1);
    else
        return partial_sum;
}
Or an iterative equivalent (although gcc, at least, recognizes the tail recursion and optimizes it into an iterative version anyway; other compilers probably would as well).
Probably needs a little more study for the negative cases, but this at least works for positive numbers.
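One way to handle the negative cases is to do the bit manipulation on unsigned values, where shifts and wraparound are well defined, and convert back at the end. This is a sketch under the assumption of a two's complement platform; the name add_bitwise is made up, and the final conversion back to int is implementation-defined for results above INT_MAX.

```c
/* Hypothetical variant of add() that works on unsigned values internally. */
static int add_bitwise(int a, int b)
{
    unsigned int sum = (unsigned int)a;
    unsigned int carry = (unsigned int)b;

    while (carry != 0) {
        unsigned int shifted = (sum & carry) << 1;  /* carries move one bit left */
        sum ^= carry;                               /* add without carries */
        carry = shifted;
    }
    /* Converting a value above INT_MAX back to int is implementation-defined;
       on common two's complement platforms it yields the expected negative. */
    return (int)sum;
}
```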
/*
by sch.
001010101 = 85
001000111 = 71
---------
010011100 = 156
*/
#include <stdio.h>
#define SET_N_BIT(i,sum) ((1 << (i)) | (sum))
int sum(int a, int b)
{
int t = 0;
int i = 0;
int ia = 0, ib = 0;
int sum = 0;
int mask = 0;
for(i = 0; i < sizeof(int) * 8; i++)
{
mask = 1 << i;
ia = a & mask;
ib = b & mask;
if(ia & ib)
if(t)
{
sum = SET_N_BIT(i,sum);
t = 1;
/*i(1) t=1*/
}
else
{
t = 1;
/*i(0) t=1*/
}
else if (ia | ib)
if(t)
{
t = 1;
/*i(0) t=1*/
}
else
{
sum = SET_N_BIT(i,sum);
t = 0;
/*i(1) t=0*/
}
else
if(t)
{
sum = SET_N_BIT(i,sum);
t = 0;
/*i(1) t=0*/
}
else
{
t = 0;
/*i(0) t=0*/
}
}
return sum;
}
int main()
{
int a = 85;
int b = 71;
int i = 0;
while(scanf("%d %d", &a, &b) == 2) /* stop at end of input or on invalid input */
{
printf("%d: %d + %d = %d\n", ++i, a, b, sum(a, b));
}
return 0;
}