Pointer Upcast and Downcast - c

Pointer Downcast
int* ptrInt;
char * ptrChar;
void* ptrVoid;
unsigned char indx;
int sample = 0x12345678;
ptrInt = &sample;
ptrVoid = (void *)(ptrInt);
ptrChar = (char *)(ptrVoid);
/*manipulating ptrChar */
for (indx = 0; indx < 4; indx++)
{
    printf ("\n Value: %x \t Address: %p", *(ptrChar + indx), ( ptrChar + indx));
}
Output:
Value: 00000078 Address: 0022FF74
Value: 00000056 Address: 0022FF75
Value: 00000034 Address: 0022FF76
Value: 00000012 Address: 0022FF77
Question:
Why was sample split into char-sized pieces? And when pointer arithmetic was performed, how was it able to reach the remaining bytes?
How was this possible?
Pointer Upcast
unsigned int * ptrUint;
void * ptrVoid;
unsigned char sample = 0x08;
ptrVoid = (void *)&sample;
ptrUint = (unsigned int *) ptrVoid;
printf(" \n &sample: %p \t ptrUint: %p ", &sample, ptrUint );
printf(" \n sample: %p \t *ptrUint: %p ", sample, *ptrUint );
Output:
&sample: 0022FF6F ptrUint: 0022FF6F
sample: 00000008 *ptrUint: 22FF6F08 <- Problem Point
Question:
Why is it that there is a garbage value in *ptrUint? Why is the garbage value similar
to ptrUint? Should malloc() or calloc() be used to avoid this garbage value? What kind of remedy would you suggest to remove the garbage value?

In the first example, you are using a char pointer, so the data is accessed one byte at a time. Memory is byte addressable, so when you add one to a char pointer, you access the next higher memory address. This is what is happening in the for loop. Using a byte pointer tells the compiler to access only a single byte, and the rest of the bits show up as 0 when you print with %x.
In the second example, I think what is happening is that one byte is allocated for sample, and the following 4 bytes are allocated to ptrUint. So when you read starting at the memory address of sample through a pointer to a 4-byte type, you see the value in sample plus the first 3 bytes of ptrUint. If you cast this to a char pointer and print, you would only see 8 in the output.

These aren't upcasts and downcasts, that would imply some kind of inheritance hierarchy.
In your first example you treat a pointer to an integer as if it were a pointer to char(s). Incrementing a pointer to int adds 4 to the address, incrementing a pointer to char adds 1 to it (assuming 32-bit ints and 8-bit chars). Dereferencing them yields an int and a char, respectively. Hence the fragmentation into bytes.
In your second example you treat the unsigned char variable called sample as if it were a pointer to int, and dereference it. You are essentially reading garbage from the 0x08 memory address. I suppose you forgot a &. You are also passing a 1-byte char and a 4-byte int to the second printf instead of 4+4 bytes, which confuses printf: it reads 3 more bytes from the stack than you have given it. Those bytes coincidentally are part of the ptrUint value given to the first call of printf. Using %c instead of %p should fix it.

The other answers have already explained why you're seeing what you're seeing.
I will add that your second example relies on undefined behaviour. It is not valid to dereference an int * that points to data that wasn't originally an int. i.e.:
char x = 5;
int *p = (int *)&x;
printf("%d\n", *p); // undefined behaviour

Related

Overwriting of Arbitrary memory using format string

I was reading an old article about format string exploits from the 2000s.
On page 15, the author describes a way to overwrite a variable's content by advancing printf's internal stack pointer, like so:
unsigned char canary[5];
unsigned char foo[4];
memset (foo, '\x00', sizeof (foo));
/* 0 * before */ strcpy (canary, "AAAA");
/* 1 */ printf ("%16u%n", 7350, (int *) &foo[0]);
/* 2 */ printf ("%32u%n", 7350, (int *) &foo[1]);
/* 3 */ printf ("%64u%n", 7350, (int *) &foo[2]);
/* 4 */ printf ("%128u%n", 7350, (int *) &foo[3]);
/* 5 * after */ printf ("%02x%02x%02x%02x\n", foo[0], foo[1],
foo[2], foo[3]);
printf ("canary: %02x%02x%02x%02x\n", canary[0],
canary[1], canary[2], canary[3]);
Returns the output “10204080” and “canary: 00000041”
Unfortunately the author doesn't explain why the stack gets pushed like this. In other words, what part of the printf procedure is causing the overwrite in memory?
Edit:
I do understand that the instruction in /* 1 */ will create a right-justified field of width 16 and then write the number of bytes written so far (16) to the address of foo[0].
The question is why it overwrites the adjacent memory. You would normally think that it would only write to the address of foo[0], which is one byte, not 4.
The code has undefined behavior and is invalid.
Anyway, the statement:
printf ("%16u%n", 7350, (int *) &foo[0]);
is basically doing:
*(int *)&foo[0] = 16;
The biggest issue is with the last one:
printf("%128u%n", 7350, (int *) &foo[3]);
it's doing:
*(int *)&foo[3] = 128;
but foo[3] is an unsigned char. Assuming sizeof(int) = 4, i.e. int has 4 bytes, this writes 3 bytes out of bounds past foo + 3. The stack on x86 grows downward, so here the memory reserved for canary ends up right after the memory for foo. The stack memory looks like this:
<-- foo ---><--- canary -->
[0][1][2][3][0][1][2][3][4]
         ^^^^^^^^^^^^
         storing (int)128 here in little endian
Because x86 is little endian, foo[3] is assigned the value of 128, and canary[0..2] are zeroed (because 128 = 0x00000080).
You can do:
// I want it to print 0xDEAD
// I swap the bytes for endianness and get 0xADDE
// I then shift it left by 8 bits and get 0xADDE00
// 0xADDE00 = 11394560
// The following printf will do foo[3] = 0x00
// but also: canary[0] = 0xDE, canary[1] = 0xAD and canary[2] = 0x00
printf("%11394560u%n", 7350, (int *) &foo[3]);
printf("0x%02x%02x\n", canary[0], canary[1]);
// will output 0xdead
why does it overwrite to the adjacent memory?
Undefined behavior (UB).
Lines 1 through 4 below exhibit UB. What OP sees is one possible result, but it is not specified by C.
/* 0 * before */ strcpy (canary, "AAAA"); // In bounds: "AAAA" plus its terminator fills all 5 bytes of canary
/* 1 */ printf ("%16u%n", 7350, (int *) &foo[0]); // Writing out of bounds, alignment issues.
/* 2 */ printf ("%32u%n", 7350, (int *) &foo[1]);
/* 3 */ printf ("%64u%n", 7350, (int *) &foo[2]);
/* 4 */ printf ("%128u%n", 7350, (int *) &foo[3]);
After reading the GNU C Library documentation on the %n conversion:
It states that %n uses an argument which must be a pointer to an int.
From my understanding an int is at least 16 bits long, but depending on the compiler it can be stored as a 4-byte word, or even 8 bytes on some machines.
My guess is that the article was written at a time when compilers were already storing ints as 4-byte words, so %n treats each unsigned char pointer as an int pointer and overwrites 4 bytes of memory on each call, starting from the char's address.

Problem creating a pointer to a given address

I am trying to create a pointer to an address in memory given by the user, but I am having problems. I manage to create the pointer and print its address, but I never succeed in printing or changing its content. Would anyone kindly explain why?
Here is the code I'm trying to run:
int m = 10;
printf("&m = %d (address in dec.)\n", &m);
printf("Enter an address: ");
int address;
scanf("%d", &address);
int* p = (int*) address;
printf("p = %d (address in dec.)\n", p);
printf("*p = %d (value)\n", *p);
and here is a console of me interacting with the program:
&m = 1220033252 (address in dec.)
Enter an address: 1220033252
p = 1220033252 (address in dec.)
Segmentation fault
Same thing happens if I try an address different from &m.
Thanks!
int is not necessarily big enough to hold a pointer. It's usually 32 bits, but pointers are 64 bits on most desktop and server systems now.
You can use the typedef intptr_t from <stdint.h> to declare an integer big enough to hold an object pointer. The PRIiPTR and SCNiPTR format macros come from <inttypes.h>.
int m = 10;
printf("&m = %" PRIiPTR " (address in dec.)\n", (intptr_t)&m);
printf("Enter an address: ");
intptr_t address;
scanf("%" SCNiPTR, &address);
int* p = (int*) address;
printf("p = %" PRIiPTR " (address in dec.)\n", (intptr_t)p);
printf("*p = %d (value)\n", *p);
For more information, see Why / when to use `intptr_t` for type-casting in C? and string format for intptr_t and uintptr_t
In short, you're hitting the fact that int and int * aren't the same type.
First, %d should be used for values of type int, which is not the same as a pointer type. Most likely, on your machine the size of a pointer is bigger than the size of an int, so you are printing not the pointer but just a part of it. For example, in my case the pointer value is 140734913221608 (0x7fff668293e8), but 1719833576 (0x668293e8) gets printed. Because of that, you enter a pointer to random unallocated memory, and when you dereference it, a segmentation fault occurs.
Second, the code for reading the pointer value has the same problem. In addition to the wrong format string for scanf, you are trying to store the value in a variable of type int. A variable of pointer type should be used instead.
As for getting the code working, check out Correct format specifier to print pointer or address? for an in-depth explanation.

How does this memory allocator work?

So, I just sat down and decided to write a memory allocator. I was tired, so I just threw something together. What I ended up with was this:
#include <stdio.h>
#include <stdlib.h>
#define BUFSIZE 1024
char buffer[BUFSIZE];
char *next = buffer;
void *alloc(int n){
    if(next + n <= buffer + BUFSIZE){
        next += n;
        return (void *)(next - n);
    }else{
        return NULL;
    }
}
void afree(void *c){
    next = (char *)c;
}
int main(){
    int *num = alloc(sizeof(int));
    *num = 5643;
    printf("%d: %d", *num, sizeof(int));
    afree(num);
}
For some reason, this works. But I can not explain why it works. It may have to do with the fact that I am tired, but I really can not see why it works. So, this is what it should be doing, logically, and as I understand it:
It creates a char array with a pointer which points to the first element of the array.
When I call alloc with a value of 4 (which is the size of an int, as I have tested down below), it should set next to point to the fourth element of the array. It should then return a char pointer to the first 4 bytes of the array, cast to a void pointer.
I then set that value to something greater than the max value of a char. C should realise that that isn't possible and should then truncate it to *num % sizeof(char).
I have one guess as to why this works: when the char pointer is cast to a void pointer and then turned into an integer pointer, it somehow changes the size of the pointer so that it is able to point to an integer. (I haven't only tried this memory allocator with integers, but with structures as well, and it seems to work with them too.)
Is this guess correct, or am I too tired to think?
EDIT:
EDIT 2: I think I've understood it. I realised that my phrasing from yesterday was quite bad. The thing which threw me off was the fact that the returned pointer actually points to a char, but I am still somehow able to store an integer value.
The allocator posted implements a mark and release allocation scheme:
alloc(size) returns a valid pointer if there is at least size unallocated bytes available in the arena. The available size is reduced accordingly. Note that this pointer can only be used to store bytes, as it is not properly aligned for anything else. Furthermore, from a strict interpretation of the C Standard, even if the pointer is properly aligned, using it as a pointer to any other type would violate the strict aliasing rule.
afree(ptr) resets the arena to the state it was in before alloc() returned ptr. It would be a useful extension to make afree(NULL) reset the arena to its initial state.
Note that the main() function attempts to use the pointer returned by alloc(sizeof(int)) as a pointer to int. This invokes undefined behavior because there is no guarantee that buffer is properly aligned for this, and because of the violation of the strict aliasing rule.
Note also that the printf format printf("%d: %d", *num, sizeof(int)); is incorrect for the second argument, which has type size_t. It should be printf("%d: %zu", *num, sizeof(int)); or printf("%d: %d", *num, (int)sizeof(int)); if the C runtime library is too old to support %zu.
Actually! I came up with a reason for the behaviour! This is what I was wondering about, but I wasn't too good at putting my thoughts into words yesterday (sorry). I modified my code to something like this:
#include <stdio.h>
#include <stdlib.h>
#define BUFSIZE 1024
char buffer[BUFSIZE];
char *next = buffer;
void *alloc(int n){
    if(next + n <= buffer + BUFSIZE){
        next += n;
        return (void *)(next - n);
    }else{
        return NULL;
    }
}
void afree(void *c){
    next = (char *)c;
}
int main(){
    int *num = alloc(sizeof(int));
    *num = 346455;
    printf("%d: %d\n", *num, (int)sizeof(int));
    printf("%d, %d, %d, %d", *(next - 4), *(next - 3), *(next - 2), *(next - 1));
    afree(num);
}
Now, the last printf produces "87, 73, 5, 0".
If you convert all the values into a big binary value you get this: 00000000 00000101 01001001 01010111. If you take that binary value and convert it to decimal you get the original value of *num, which is 346455. So, basically it separates the integer into 4 bytes and puts them into different elements of the array. I think this is implementation-defined and has to do with little endian and big endian. Is this correct? My first prediction was that it would truncate the integer and basically set the value to (integer value) % sizeof(char).
int *num = alloc(sizeof(int));
Says - 'here is a pointer (alloc) that points to some space, let's say it points to an integer (int*).'
Then you say
*num = 5643;
Which says - set that integer to 5643.
Why wouldn't it work, given that alloc did in fact return a pointer to a block of good memory that can hold an integer?

Memset not zeroing char pointer

I know there's tons of these and it's probably a simple question, but I can't seem to figure it out :(
char * char_buffer = (char *) malloc(64);
printf("%x\n", char_buffer);
memset(char_buffer,
0,
64);
printf("%x\n", char_buffer);
Output :
50000910
50000910
Why isn't char_buffer zero'd? Can someone explain what's happening?
You are confused about the difference between a pointer and space being pointed to.
Your code doesn't zero the pointer. It zeroes the space being pointed to. The memset function behaves that way. In fact you would not want to zero the pointer, as then it would no longer point to the memory you allocated.
Your printf statement attempts to print the pointer's value, which is the address of the space being pointed to. Not the contents of the space being pointed to.
Actually the printf statement causes undefined behaviour, because you mismatched format specifiers. Your compiler should have warned about this.
Here is some correct code:
printf("The buffer's address is %p\n", char_buffer);
printf("The buffer contains: %02X %02X %02X ...\n",
(unsigned char)char_buffer[0],
(unsigned char)char_buffer[1],
(unsigned char)char_buffer[2]);
To use the %X specifier you must pass non-negative values in, which is why the cast is necessary. You could declare the buffer as unsigned char * instead, in which case the cast would not be necessary.
You're printing the address of the buffer, not its contents. The contents are zero after the memset(). Just try:
printf("%d", char_buffer[0]);
and see the zero ;)
You're printing out the pointer, not the buffer values.
memset(p,v,n) writes the byte v to the byte pointed at by p and n-1 bytes after it. It doesn't touch the pointer p at all (it's passed in by value, how can it?)
To print out the buffer, you'd have to loop through it. I use something like this:
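void printbuf(const char *buf, int size)
{
    int i = 0;
    unsigned char b = 0;   /* unsigned, so bytes >= 0x80 print as two hex digits */
    for (i = 0; i < size; i++) {
        if (i % 8 == 0) {
            printf("\n%3d: ", i);   /* start a new row every 8 bytes */
        }
        b = (unsigned char)buf[i];
        printf("%02hhX ", b);
    }
}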

Finding the distance between pointers in the same block or array

If I have a block of memory allocated with char * a = malloc (10*sizeof(char*)),
and I have two char pointers, b and c, pointing inside this block,
how can I find the distance between these two pointers?
If the memory addresses go from 0x00 to, say, a+5, then
how can I accurately get the distance between b and c?
If b and c are char* pointers that point at memory inside the block a, you can just use c-b to find the distance between them.
Consider this example:
#include <stdio.h>
#include <stdlib.h>
int main() {
char *a = calloc(10, 1);
a[3] = 'h';
a[6] = 'i';
char *b = &a[3];
char *c = &a[6];
/* Print out the pointer address. */
printf("%p\n", b);
printf("%p\n", c);
/* Subtracting the pointers gives a ptrdiff_t, which is printed with %td. */
printf("%td\n", c-b);
return 0;
}
Running this gives:
0xe4b013
0xe4b016
3
Obviously the first two are going to change each run but it gets the distance correct, because calloc and malloc are going to give you a contiguous memory block or fail; it's all right next to each other. They give you the pointer to the first address in that block.
edit for clarification:
The pointers have a certain type, which is used in the pointer arithmetic. It's a coincidence that this question was about char pointers, where the type is only 1 byte long; if the type were something of 20 bytes, the pointer arithmetic would be the same, returning a result of 3.
It is essentially "how many items of the pointed-to size did it move along": we're calculating the offset, and the number returned is (address_2 - address_1) / size_of_the_type.

Resources