I have a case in which a uint64_t*, a char* (6 chars long), and a uint16_t* point to the addresses x, x+2, and x respectively. So if the char* points to "This i", then the uint16_t* should point to the value 0 (decimal), and the uint64_t* should point to the value 115,588,096,157,780 (decimal).
But when I look at these values, I get 0 for the uint64_t. I do not understand why this would be. Could someone explain?
Edit (added code):
FILE *file = fopen("/Users/Justin/Desktop/test.txt", "rb");
uint64_t *window = malloc(8);
char *buffer = (char*)(window+2);
uint16_t *data = (uint16_t*)window;
uint16_t *read = (uint16_t*)(window+6);
fread(buffer, 6, 1, file);
printf("%p\n", window);
printf("%p\n", buffer);
printf("%p\n", data);
printf("%p\n", read);
printf("%s\n", buffer);
printf("%llu\n", *window);
and the output is:
0x100105440
0x100105450
0x100105440
0x100105470
This i
0
You're getting tripped up by pointer math.
As window is declared as uint64_t *, window + 2 does not increment the address by two bytes; it increments it by 2 * sizeof(uint64_t) (that is, sixteen bytes). As a result, the memory you're looking at is uninitialized (and, in fact, lies outside the allocated block).
If you actually want the address to be incremented by two bytes, you'll need to cast the pointer to char * before adding 2 to it:
char *buffer = ((char *) window) + 2;
I think you're misunderstanding what the +2 does here:
uint64_t *window = malloc(8);
char *buffer = (char*)(window+2);
It helps to visualize the data that we got back from malloc, using | to help show 8-byte boundaries:
|-------|-------|-------|-------|-------|-------|-------|-------
^
window
Now, buffer doesn't point two bytes ahead of window. It points two uint64_t's ahead. Or in other words, ((char*)window) + 2 * sizeof(*window):
|-------|-------|-------|-------|-------|-------|-------|-------
^               ^
window          buffer
which you then fread into
|-------|-------This i--|-------|-------|-------|-------|-------
^               ^
window          buffer
If you want to just point two bytes ahead, you have to add the 2 to a char*:
char* buffer = ((char*)window) + 2;
In addition to the pointer math issues that @duskwuff pointed out and @Barry clarified, there is another issue: the answer you get depends on the architecture of the computer on which this code runs.
On big-endian machines you'll get one answer; on little-endian machines you'll get a different one.
Oh, and one more thing. Printing buffer with %s is very dangerous. fread does not add a terminating zero byte, so if there are no zero bytes in the data read from the file, printf will wander through memory until it finds one.
I was reading an old article about format string exploits from the 2000s; the link can be found here: Article
On page 15, the author describes a means of overwriting a variable's content by increasing printf's internal stack pointer, like so: Stack pushed
unsigned char canary[5];
unsigned char foo[4];
memset (foo, '\x00', sizeof (foo));
/* 0 * before */ strcpy (canary, "AAAA");
/* 1 */ printf ("%16u%n", 7350, (int *) &foo[0]);
/* 2 */ printf ("%32u%n", 7350, (int *) &foo[1]);
/* 3 */ printf ("%64u%n", 7350, (int *) &foo[2]);
/* 4 */ printf ("%128u%n", 7350, (int *) &foo[3]);
/* 5 * after */ printf ("%02x%02x%02x%02x\n", foo[0], foo[1],
foo[2], foo[3]);
printf ("canary: %02x%02x%02x%02x\n", canary[0],
canary[1], canary[2], canary[3]);
Returns the output “10204080” and “canary: 00000041”
Unfortunately the author doesn't explain why the stack gets pushed like this. In other words, what part of the printf procedure causes the overwrite in memory?
Edit:
I do understand that the instruction in /1/ will create a right-justified field of width 16 and then write the number of bytes written so far (16) to the address of foo[0].
The question is: why does it overwrite the adjacent memory? You would normally think that it would only write to the address of foo[0], which is one byte, not 4.
The code has undefined behavior and is invalid.
Anyway, the statement:
printf ("%16u%n", 7350, (int *) &foo[0]);
is basically doing:
*(int *)&foo[0] = 16;
The biggest issue is with the last one:
printf("%128u%n", 7350, (int *) &foo[3]);
it's doing:
*(int *)&foo[3] = 128;
but foo[3] is an unsigned char. Assuming sizeof(int) == 4, i.e. int has 4 bytes, this writes 3 bytes out of bounds, past the end of foo. In the layout the compiler chose here on x86, the memory reserved for canary comes right after the memory for foo. The stack memory looks like this:
<-- foo ---><--- canary -->
[0][1][2][3][0][1][2][3][4]
         ^^^^^^^^^^^^
         storing (int)128 here in little endian
Because x86 is little endian, foo[3] is assigned the value of 128, and canary[0..2] are zeroed (because 128 = 0x00000080).
You can do:
// I want it to print 0xDEAD
// I swap the bytes for endianness and get 0xADDE
// I then shift it left by 8 bits and get 0xADDE00
// 0xADDE00 = 11394560
// The following printf will do foo[3] = 0x00
// but also: canary[0] = 0xDE, canary[1] = 0xAD and canary[2] = 0x00
printf("%11394560u%n", 7350, (int *) &foo[3]);
printf("0x%02x%02x\n", canary[0], canary[1]);
// will output 0xDEAD
why does it overwrite to the adjacent memory?
Undefined behavior (UB).
The printf lines below exhibit UB. What OP sees is one potential result, but it is not specified by C.
/* 0 * before */ strcpy (canary, "AAAA"); // Fits exactly (5 bytes with the terminator), but canary is unsigned char[], not char[]
/* 1 */ printf ("%16u%n", 7350, (int *) &foo[0]); // Writing out of bounds, alignment issues.
/* 2 */ printf ("%32u%n", 7350, (int *) &foo[1]);
/* 3 */ printf ("%64u%n", 7350, (int *) &foo[2]);
/* 4 */ printf ("%128u%n", 7350, (int *) &foo[3]);
After reading the GNU C Library documentation on the %n conversion here: Gnu C
It states that %n takes an argument which must be a pointer to an int.
From my understanding, an int is (edit) at least 16 bits long, but depending on the compiler it can be stored as a 4-byte word, or even 8 bytes on some modern machines.
My guess is that this article was written at a time when compilers already stored ints as 4-byte words. Because each unsigned char pointer is cast to an int pointer, %n writes 4 bytes of memory on each call, starting from the char's address.
So, I just sat down and decided to write a memory allocator. I was tired, so I just threw something together. What I ended up with was this:
#include <stdio.h>
#include <stdlib.h>
#define BUFSIZE 1024
char buffer[BUFSIZE];
char *next = buffer;
void *alloc(int n){
if(next + n <= buffer + BUFSIZE){
next += n;
return (void *)(next - n);
}else{
return NULL;
}
}
void afree(void *c){
next = (char *)c;
}
int main(){
int *num = alloc(sizeof(int));
*num = 5643;
printf("%d: %d", *num, sizeof(int));
afree(num);
}
For some reason, this works. But I can not explain why it works. It may have to do with the fact that I am tired, but I really can not see why it works. So, this is what it should be doing, logically, and as I understand it:
It creates a char array with a pointer which points to the first element of the array.
When I call alloc with a value of 4 (which is the size of an int, as I have tested down below), it should set next to point to the fourth element of the array. It should then return a char pointer to the first 4 bytes of the array casted to a void pointer.
I then set that value to something greater than the max value of a char. C should realise that that isn't possible and should then truncate it to *num % sizeof(char).
I have one guess as to why this works: When the char pointer is casted to a void pointer and then gets turned into an integer it somehow changes the size of the pointer so that it is able to point to an integer. (I haven't only tried this memory allocator with integers, but with structures as well, and it seems to work with them as well).
Is this guess correct, or am I too tired to think?
EDIT 2: I think I've understood it. I realised that my phrasing from yesterday was quite bad. The thing which threw me off was the fact that the returned pointer actually points to a char, but I am still somehow able to store an integer value.
The allocator posted implements a mark and release allocation scheme:
alloc(size) returns a valid pointer if there are at least size unallocated bytes available in the arena. The available size is reduced accordingly. Note that this pointer can only be used to store bytes, as it is not properly aligned for anything else. Furthermore, from a strict interpretation of the C Standard, even if the pointer is properly aligned, using it as a pointer to any other type would violate the strict aliasing rule.
afree(ptr) resets the arena to the state it was in before alloc() returned ptr. It would be a useful extension to make afree(NULL) reset the arena to its initial state.
Note that the main() function attempts to use the pointer returned by alloc(sizeof(int)) as a pointer to int. This invokes undefined behavior because there is no guarantee that buffer is properly aligned for this, and because of the violation of the strict aliasing rule.
Note also that the printf format printf("%d: %d", *num, sizeof(int)); is incorrect for the second argument, which has type size_t. It should be printf("%d: %zu", *num, sizeof(int)); or printf("%d: %d", *num, (int)sizeof(int)); if the C runtime library is too old to support %zu.
Actually! I came up with a reason for the behaviour! This is what I was wondering about; however, I wasn't too good at putting my thoughts into words yesterday (sorry). I modified my code to something like this:
#include <stdio.h>
#include <stdlib.h>
#define BUFSIZE 1024
char buffer[BUFSIZE];
char *next = buffer;
void *alloc(int n){
if(next + n <= buffer + BUFSIZE){
next += n;
return (void *)(next - n);
}else{
return NULL;
}
}
void afree(void *c){
next = (char *)c;
}
int main(){
int *num = alloc(sizeof(int));
*num = 346455;
printf("%d: %d\n", *num, (int)sizeof(int));
printf("%d, %d, %d, %d", *(next - 4), *(next - 3), *(next - 2), *(next - 1));
afree(num);
}
Now, the last printf produces "87, 73, 5, 0".
If you convert all the values into a big binary value you get this: 00000000 00000101 01001001 01010111. If you take that binary value and convert it to decimal you get the original value of *num, which is 346455. So, basically it separates the integer into 4 bytes and puts them into different elements of the array. I think this is implementation-defined and has to do with little endian and big endian. Is this correct? My first prediction was that it would truncate the integer and basically set the value to (integer value) % sizeof(char).
int *num = alloc(sizeof(int));
This says: 'here is a pointer (from alloc) that points to some space; let's say it points to an integer (int *).'
Then you say
*num = 5643;
which says: set that integer to 5643.
Why wouldn't it work, given that alloc did in fact return a pointer to a block of good memory that can hold an integer?
I have this code right here.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main()
{
int *size;
int i = 0;
char buf[] = "Thomas was alone";
size = (int*)calloc(1, sizeof(buf)+1);
for(i=0; i<strlen(buf); i++)
{
*(size+i) = buf[i];
printf("%c", *(size+i));
}
free(size);
}
To my understanding calloc reserves a memspace the size of the first arg multiplied by the second, in this case 18. The length of buf is 17 and thus the for loop should not have any problems at all.
Running this program results in the expected results ( It prints Thomas was alone ), however it crashes immediately too. This persists unless I crank up the size of calloc ( like multiplied by ten ).
Am I perhaps understanding something wrongly?
Should I use a function to prevent this from happening?
int *size means you need:
size = calloc(sizeof(int), sizeof(buf));
You allocated enough space for an array of char, but not an array of int (unless you're on an odd system where sizeof(char) == sizeof(int), which is a theoretical possibility rather than a practical one). That means your code writes well beyond the end of the allocated memory, which is what leads to the crashing. Or you can use char *size in which case the original call to calloc() is OK.
Note that sizeof(buf) includes the terminal null; strlen(buf) does not. That means you overallocate slightly with the +1 term.
You could also perfectly sensibly write size[i] instead of *(size+i).
Change the type of size to char *.
You are using an int *, and when you add to the pointer here, *(size+i), you go out of bounds.
Pointer arithmetic takes account of the type, which in your case is int, not char. sizeof(int) is larger than sizeof(char) on your system.
You allocate space for a char array, not for an int array:
char is 1 byte in memory (always, since sizeof(char) is 1 by definition)
int is 4 bytes in memory (most often)
so you allocate 1 * (sizeof(buf) + 1) = 18 bytes
For example, the elements of a char array sit at consecutive addresses:
&buf[0] = 0x34523
&buf[1] = 0x34524
&buf[2] = 0x34525
&buf[3] = 0x34526
but when you use *(size + 1) you don't move the pointer by 1 byte but by sizeof(int), so by 4 bytes.
So in memory it will look like:
&size[0] = 0x4560
&size[1] = 0x4564
&size[2] = 0x4568
&size[3] = 0x456C
so after a few iterations you are writing past the end of the allocated block.
Change calloc(1, sizeof(buf) + 1); to calloc(sizeof(int), sizeof(buf) + 1); to have enough memory.
Second thing: I assume this is an example you are using to learn how this works?
My suggestion:
Use the same type for the pointer and the variable.
When you assign between variables of different types, use an explicit conversion, in this example
*(size+i) = (int)buf[i];
I have written a program in C which will read the bytes at a specific memory address from its own address space.
it works like this:
first it reads a DWORD from a File.
then it uses this DWORD as a memory address and reads a byte from this memory address in the current process' address space.
Here is a summary of the code:
FILE *fp;
char buffer[4];
fp=fopen("input.txt","rb");
// buffer will store the DWORD read from the file
fread(buffer, 1, 4, fp);
printf("the memory address is: %x", *buffer);
// I have to do all these type castings so that it prints only the byte example:
// 0x8b instead of 0xffffff8b
printf("the byte at this memory address is: %x\n", (unsigned)(unsigned char)(*(*buffer)));
// And I perform comparisons this way
if((unsigned)(unsigned char)(*(*buffer)) == 0x8b)
{
// do something
}
While this program works, I wanted to know if there is another way to read the byte from a specific memory address and perform comparisons? Because each time, I need to write all the type castings.
Also, now when I try to write the byte to a file using the following syntax:
// fp2 is the file pointer for the output file
fwrite(fp2, 1, 1, (unsigned)(unsigned char)(*(*buffer)));
I get the warnings:
test.c(64) : warning C4047: 'function' : 'FILE *' differs in levels of indirection from 'unsigned int'
test.c(64) : warning C4024: 'fwrite' : different types for formal and actual parameter 4
thanks.
You can use the C language union construct to represent an alias for your type as shown
typedef union {
char bytes[4];
char *pointer;
} alias;
alias buffer;
This assumes a 32-bit architecture (you could adjust the 4 at compile time, but would then also need to change the fread() byte count).
Then, you can simply use *(buffer.pointer) to reference the contents of the memory location.
From your question, the application is not clear, and the technique seems error prone. How do you take into account the movement of addresses in memory as things change? There may be some point in using the linker maps to extract symbolic information for locations to avoid the absolute addresses.
Take note of the definition of fwrite,
size_t fwrite(const void *ptr, size_t size, size_t nmemb, FILE *stream);
which means that the warnings at the last part of your question are because you should be writing from a character pointer rather than writing the actual value of the character.
You can remove the extra type castings by assigning the pointer you read from the file to another variable of the correct type.
Examples to think about:
#include <stdio.h>
int main() {
union {
char buffer[8];
char *character;
long long number;
} indirect;
/* indirect is a single 8-byte variable that can be accessed
* as either a character array, a character pointer, or as
* an 8-byte integer! */
char *x = "hi";
long long y;
char *z;
printf("stored in the memory beginning at x: '%s'\n", x); /* 'hi' */
printf("bytes used to represent the pointer x: %zu\n", sizeof(x)); /* 8 */
printf("exact value (memory location) of (pointed to by) the pointer x: %p\n", (void *)x); /* 4006c8 */
y = (long long) x;
printf("%llx\n", y); /* 4006c8 */
z = (char *) y;
printf("%s\n", z); /* 'hi' */
/* the cool part--we can access the exact same 8 bytes of data
* in three different ways, as a 64-bit character pointer,
* as an 8-byte character buffer, or as
* an 8-byte integer */
indirect.character = z;
printf("%s\n", indirect.character); /* 'hi' */
printf("%s\n", indirect.buffer); /* binary garbage which is the raw pointer */
printf("%lld\n", indirect.number); /* 4196040 */
return 0;
}
By the way, reading arbitrary locations from memory seems concerning. (You say that you are reading from a specific memory address within the program's own address space, but how do you make sure of that?)
fp=fopen("input.txt","rb");
The file has a .txt extension, but you are reading it as a binary file. Please name files accordingly. On Windows, name binary files with a .bin extension. On Linux, file extensions do not matter.
// buffer will store the DWORD read from the file
fread(buffer, 1, 4, fp);
If you want to read 4 bytes, declare an unsigned int variable and read 4 bytes into it as shown below
fread(&uint, 1, 4, fp);
Why do you want to use a character array ? That is incorrect.
printf("the memory address is: %x", *buffer);
What are you trying to do here? buffer is an array of char, which decays to a pointer to its first element, so the above statement prints the hex value of the first character in the array. The above statement is equivalent to
printf("the memory address is: %x", buffer[0]);
(*(*buffer))
How is this working? Aren't there any compiler warnings or errors? Is it Windows or Linux? (*buffer) is a char, and dereferencing it again should produce an error unless it is properly cast, which I see you are not doing.
Pointer Downcast
int* ptrInt;
char * ptrChar;
void* ptrVoid;
unsigned char indx;
int sample = 0x12345678;
ptrInt = &sample;
ptrVoid = (void *)(ptrInt);
ptrChar = (char *)(ptrVoid);
/*manipulating ptrChar */
for (indx = 0; indx < 4; indx++)
{
printf ("\n Value: %x \t Address: %p", *(ptrChar + indx), ( ptrChar + indx));
}
Output:
Value: 00000078 Address: 0022FF74
Value: 00000056 Address: 0022FF75
Value: 00000034 Address: 0022FF76
Value: 00000012 Address: 0022FF77
Question:
Why was sample divided into char sized data? And when pointer arithmetic is performed, how was it able to get its remaining value?
How this was possible?
Pointer Upcast
unsigned int * ptrUint;
void * ptrVoid;
unsigned char sample = 0x08;
ptrVoid = (void *)&sample;
ptrUint = (unsigned int *) ptrVoid;
printf(" \n &sample: %p \t ptrUint: %p ", &sample, ptrUint );
printf(" \n sample: %p \t *ptrUint: %p ", sample, *ptrUint );
Output:
&sample: 0022FF6F ptrUint: 0022FF6F
sample: 00000008 *ptrUint: 22FF6F08 <- Problem Point
Question:
Why is it that there is a garbage value in *ptrUint? Why is the garbage value similar
to ptrUint? Should malloc() or calloc() be used to avoid this garbage value? What kind of remedy would you suggest to remove the garbage value?
In the first example, you are using a char pointer, so the data is accessed one byte at a time. Memory is byte addressable, so when you add one to a pointer to char, you access the next higher memory address. This is what is happening in the for loop. Using a byte pointer tells the compiler to access only the single byte, and the rest of the bits will show up as 0 when you are printing with %p.
In the second example, I think what is happening is that one byte is allocated for the sample byte, and the following 4 bytes are allocated to ptrUint. So when you take the memory address of sample and convert it to a pointer to a 4-byte type, you see the value in sample plus the first 3 bytes of ptrUint. If you cast this to a char pointer and print, you would only see 8 in the output.
These aren't upcasts and downcasts, that would imply some kind of inheritance hierarchy.
In your first example you treat a pointer to an integer as if it were a pointer to char(s). Incrementing a pointer to int adds 4 to it; incrementing a pointer to char adds 1 to it (assuming 32-bit ints and 8-bit chars). Dereferencing them yields an int and a char, respectively. Hence the fragmentation into bytes.
In your second example you treat the unsigned char variable called sample as if it were a pointer to int, and dereference it. You are essentially reading garbage from the 0x08 memory address. I suppose you forgot a &. You are also passing a 1-byte char and a 4-byte int to the second printf, instead of 4+4 bytes. That messes up printf, which reads 3 more bytes from the stack than you have given it. Those bytes coincidentally are part of the ptrUint value given to the first call of printf. Using %c instead of %p should fix it.
The other answers have already explained why you're seeing what you're seeing.
I will add that your second example relies on undefined behaviour. It is not valid to dereference an int * that points to data that wasn't originally an int. For example:
char x = 5;
int *p = (int *)&x;
printf("%d\n", *p); // undefined behaviour