C compiler relocates pointer that overlaps another variable - c

I am doing some experiments to see how C allocates variables on the stack, and I am getting some odd behavior with the following code. C appears to grow the stack downward, so in the following example the unsigned char smallRand is allocated in the byte immediately before the unsigned short nameSum. I then create an unsigned int pointer bigRandP and point it at the location occupied by smallRand, so the "int" it sees overlaps the space on the stack occupied by nameSum. I then try to assign something to the location referenced by the int pointer.
unsigned short nameSum = 0;
unsigned char smallRand = 0;
unsigned int* bigRandP;
//The "int" pointed to by bigRandP should overlap nameSum
bigRandP = (unsigned int*)(&smallRand);
printf("%p %p %p\n", &nameSum, &smallRand, bigRandP);
printf("%u %u %u\n", smallRand, nameSum, *bigRandP);
*bigRandP = 0;
printf("%p %p %p\n", &nameSum, &smallRand, bigRandP);
printf("%u %u %u\n", smallRand, nameSum, *bigRandP);
0028FF1A 0028FF19 0028FF19
0 0 419430400
0028FF1A 0028FF19 0028FF00
0 0 4210788
The printed results are interesting. Not only does the assignment fail (the int pointed to by bigRandP is not set to 0), the int pointer itself is silently relocated to point somewhere else further down the stack. What is going on? Is this the C compiler's way of keeping me from overwriting other variables with overlapping pointers?

bigRandP is a pointer to unsigned int.
You pointed it to an unsigned char object, then you modified the unsigned int object that bigRandP points to.
Apparently smallRand and bigRandP are stored close to each other in memory. By trying to modify sizeof (unsigned int) bytes of a 1-byte object, you clobbered part of the pointer object itself.
Bottom line: Your program's behavior is undefined.
Also, though this probably isn't related to the behavior you're seeing, the %p format requires a void* argument. If you want to print some other type of pointer, you should convert it to void*:
printf("%p %p %p\n", (void*)&nameSum, (void*)&smallRand, (void*)bigRandP);
It's likely to "work" with or without the casts on systems where all pointers have the same representation, but the version with the casts is more correct on all systems.

Related

Memory address and content

I have written
int a;
printf("addr = %p and content = %x\n", (void*)(&a), *(&a));
printf("addr = %p and content = %x\n", (void*)(&a)+1, *(&a)+1);
What I see in the output is
addr = 0x7fffffffde3a and content = 55554810
addr = 0x7fffffffde3b and content = 5555
I expect to see one byte in each address. However, I don't see such thing. Why?
First of all, pointer arithmetic and the dereference operator honor the data type.
Remember that pointer arithmetic which generates a pointer one past the last element of an array is valid, but attempting to dereference that pointer is undefined behavior.
Attempting to dereference a pointer that points to an invalid memory location is also undefined behavior.
That said,
Quoting C11,
The unary * operator denotes indirection. [...] If the operand has type ‘‘pointer to type’’, the result has type ‘‘type’’.
So, in your case, *(&a) is the same as a, which is of type int, and the format specifier prints the integer value stored in a.
If you want to see the byte-by-byte value, you need to cast the pointer (address of a) to a char * and then, dereference the pointer to see value stored in each byte.
So, (void*)(&a)+1 should be changed to (char*)(&a)+1 to point to the next byte of memory.
If you want to print bytes, then play with bytes:
use unsigned char* or uint8_t* for pointers
use %hhx to tell printf that the argument is a char-sized value.
Example:
int a = 0x12345678;
printf("addr = %p and content = %hhx\n", (void*)&a, *(uint8_t*)&a);
printf("addr = %p and content = %hhx\n", (void*)((uint8_t*)&a+1), *((uint8_t*)&a+1));
Result in my little-endian env:
addr = 0xbedbacac and content = 78
addr = 0xbedbacad and content = 56
And don't forget to use *((uint8_t*)&a+1) instead of *(uint8_t*)&a+1. In the example, the latter would yield 79 (78+1).
You've used the formatting directive %x. %x expects an unsigned int. The type of what you pass as the corresponding argument to printf does not change that. printf will read an unsigned int value from its arguments, no matter what you pass to it. If you pass something that isn't compatible with what printf reads, the behavior of your program is undefined.
You happen to pass an int. An int is not an unsigned int, but it's close enough that when you read an int expecting an unsigned int, the value remains the same if the integer is positive or zero. Negative integers are mapped to a large positive value (UINT_MAX + 1 + a on machines using two's complement representation, which is almost all machines).
If you pass a char (signed or not) to printf when it expects an int (signed or not), the behavior is well-defined due to another feature of the C language, which is promotions. Values of integer types that are smaller than int (i.e. char and short) are converted to int¹. So the following program snippet is well-defined and prints the value of a byte at the address p (assuming that … is replaced by a valid pointer value):
unsigned char *p = …;
printf("addr = %p and content = %x\n", (void*)p, *p);
*(&a) is the same thing as a. To see just one byte of a, you could cast the pointer &a to the type unsigned char *.
printf("addr = %p and content = %x\n", (void*)&a, *(unsigned char *)(&a));
This will print one byte of the representation of a in memory. Note that the representation depends on the machine's endianness.
Your code snippet doesn't initialize a, so the first call to printf prints whatever garbage happens to be at the location of a in memory at this time. This assumes that int does not have any trap representation, which is the case on virtually all C implementations.
The second call to printf tries to print *(&a)+1. This is just a+1. The output you get is surprising: are you sure you didn't actually run the program with *(&a+1)? This seems to be what you wanted to explore. With *(&a+1), the behavior of your program would be undefined, because this looks one int past a, and a is not in an array of two or more ints. In practice, you'd be likely to get whatever was just below a on the stack, but that's not something you can count on.
If you wanted to see the byte value at an address which is 1 past the start of a in memory, you'd need to cast the pointer to a byte pointer first. When you add an integer n to a pointer, this doesn't add n to the address stored in the pointer, it adds n to the pointer itself. This is only useful when the pointer points to a value inside an array; then p + n points to n array elements past p. In fact, p[n] is equivalent to *(p+n). If you want to add 1 to the address, then you need to obtain a byte pointer, i.e. a pointer to unsigned char. Contrast:
int a[2] = {0x12345678, 0x9abcdef0};
printf("addr = %p and content = %x\n", (void*)((unsigned char*)a + 1), *((unsigned char*)a + 1));
printf("addr = %p and content = %x\n", (void*)(a + 1), (unsigned)*(a + 1));
This is well-defined (but with an implementation-defined value since it depends on the platform's endianness) provided that int consists of at least two bytes (which is not strictly mandatory in C, but is the case everywhere except in a few embedded systems).
(void*)(&a) + 1 is not standard C, because void doesn't have a size so it doesn't make sense to move a pointer to void one void element further. However some implementations such as GCC treat void* as byte pointers, just like unsigned char *, so adding an integer to a void* adds this integer to the address stored in the pointer.
¹ Or to unsigned int if the smaller type doesn't fit in a signed int, e.g. on a platform where short and int have the same size.

casting int pointer to char pointer

I've read several posts about casting int pointers to char pointers but i'm still confused on one thing.
I understand that integers take up four bytes of memory (on most 32-bit machines?) and characters take up one byte of memory. By casting an integer pointer to a char pointer, will they both contain the same address? Does the cast operation change the value of what the char pointer points to? I.e., does it only point to the first 8 bits of an integer and not all 32 bits? I'm confused as to what actually changes when I cast an int pointer to a char pointer.
By casting a integer pointer to a char pointer, will they both contain the same address?
Both pointers would point to the same location in memory.
Does the cast operation change the value of what the char pointer points to?
No, it changes the default interpretation of what the pointer points to.
When you read from an int pointer in an expression *myIntPtr you get back the content of the location interpreted as a multi-byte value of type int. When you read from a char pointer in an expression *myCharPtr, you get back the content of the location interpreted as a single-byte value of type char.
Another consequence of casting a pointer is in pointer arithmetic. When you have two int pointers pointing into the same array, subtracting one from the other produces the difference in ints, for example
int a[20] = {0};
int *p = &a[3];
int *q = &a[13];
ptrdiff_t diff1 = q - p; // This is 10
If you cast p and q to char*, you would get the distance in terms of chars, not in terms of ints:
char *x = (char*)p;
char *y = (char*)q;
ptrdiff_t diff2 = y - x; // This is 10 times sizeof(int)
The int pointer points to a list of integers in memory. They may be 16, 32, or possibly 64 bits, and they may be big-endian or little-endian. By casting the pointer to a char pointer, you reinterpret those bits as characters. So, assuming 16-bit big-endian ints, if we point to an array of two integers, 0x4142 0x4300, the pointer is reinterpreted as pointing to the string "ABC" (0x41 is 'A', and the last byte is nul). However, if integers are little-endian, the same data would be reinterpreted as the string "BA".
Now, for practical purposes you are unlikely to want to reinterpret integers as ASCII strings. However, it's often useful to reinterpret them as unsigned chars, and thus as just a stream of raw bytes.
Casting a pointer just changes how it is interpreted; no change to its value or the data it points to occurs. Using it may change the data it points to, just as using the original may change the data it points to; how it changes that data may differ (which is likely the point of doing the casting in the first place).
A pointer is a variable that stores the memory address where another object begins. It doesn't matter whether that object is an int or a char: if its first byte is at the same position in memory, a pointer to it holds the same address.
The difference appears when you operate on the pointer. If p is an int pointer, then p++ increases the address it contains by sizeof(int) bytes (4 on most current platforms).
If p is a char pointer, then p++ increases the address it contains by 1 byte.
This code example will help you understand:
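#include <stdio.h>
int main(void){
int* pi;
int i;
char* pc;
char c;
pi = &i;
pc = &c;
printf("%p\n", (void*)pi); // 0x7fff5f72c984
pi++; // advances by sizeof(int) bytes
printf("%p\n", (void*)pi); // 0x7fff5f72c988
printf("%p\n", (void*)pc); // 0x7fff5f72c977
pc++; // advances by 1 byte
printf("%p\n", (void*)pc); // 0x7fff5f72c978
return 0;
}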

passing a pointer address vs pointer to a pointer in C

I'm slightly confused between these two pieces of code:
version 1: (gives warnings after compiling)
int func(int *ptr2)
{
*ptr2 += 1;
}
int main()
{
int a = 5;
int *ptr = &a;
printf("Address of a: %x\n", &a);
printf("Before: %x\n", ptr);
func(&ptr);
printf("After: %x\n", ptr);
return 0;
}
Output:
Address of a: 5770a18c
Before: 5770a18c
After: 5770a18d
version 2:
int func(int **ptr2)
{
*ptr2 += 1;
}
int main()
{
int a = 5;
int *ptr = &a;
printf("address of a: %x\n", &a);
printf("Before: %x\n", ptr);
func(&ptr);
printf("After: %x\n", ptr);
return 0;
}
Output:
Address of a: cc29385c
Before: cc29385c
After: cc293860
If I'm understanding pointers in C correctly when we pass by reference, we are creating a pointer to that location. This allows us to change the value at the address held by the pointer through the dereference operator.
However, if we want to change the value held by a pointer, we use a pointer to a pointer. We pass the address of the pointer and create a new pointer to hold said address. If we want to change the value, we use the dereference operator to access our pointer's (defined elsewhere) value.
Hopefully I'm on the right track, but I'm struggling to visualize what's happening with version 1 specifically. Mainly, I'd just like to understand the difference in make-up and output between these two programs. I assume version 1 is still a pointer to a pointer, but why are the incremented values different between both programs? If version 1 is successfully incrementing ptr's value (which I suspect is not), why is that I cannot find code with the same syntax? I think I'm missing something fairly trivial here... Any help is appreciated
Based on your output, you appear to be compiling for a 32-bit system where addresses and int are of that size.
When you increment the value at *ptr with that type being int, it will simply add 1.
When *ptr resolves to an int* then it will increment by sizeof(int) because the value at the current address in this case is 4 bytes long, so we have to increase the address by the number of bytes that an int consumes so that we're pointing at the next int. Note that doing this is only valid if you actually have allocated memory at the subsequent address.
Generally you pass a T** when the callee needs to modify the address to point to - such as say, the callee performs a malloc() to allocate space for the pointer.
&ptr is a pointer to a pointer, but in version 1 what func() receives is a pointer to int, converted from &ptr in an implementation-defined manner. Then *ptr2 += 1; performs an int increment: it adds 1 to what ptr2 points to (the pointer ptr in main(), which happens to have the same representation as int on your system).
In version 2, the pointer to a pointer is correctly passed to func(). Therefore pointer arithmetic is performed, and the size of int is added to the address.
Note that you invoked undefined behavior by passing data having wrong type to printf(). The correct way to print pointers is like this:
printf("Before: %p\n", (void*)ptr);
As you see, cast the pointer to void* and use %p specifier.

typecasting a pointer to an int

I can't understand the output of this program.
What I gather is that, first of all, the pointers p, q, r, s are pointing to null.
Then there is a typecast. But how did the output come out as 1 4 4 8? I might be very wrong in my thinking, so please correct me if I am.
int main()
{
int a, b, c, d;
char* p = (char*)0;
int *q = (int *)0;
float* r = (float*)0;
double* s = (double*)0;
a = (int)(p + 1);
b = (int)(q + 1);
c = (int)(r + 1);
d = (int)(s + 1);
printf("%d %d %d %d\n", a, b, c, d);
_getch();
return 0;
}
Pointer arithmetic, in this case adding an integer value to a pointer value, advances the pointer value in units of the type it points to. If you have a pointer to an 8-byte type, adding 1 to that pointer will advance the pointer by 8 bytes.
Pointer arithmetic is valid only if both the original pointer and the result of the addition point to elements of the same array object, or just past the end of it.
The way the C standard describes this is (N1570 6.5.6 paragraph 8):
When an expression that has integer type is added to or subtracted
from a pointer, the result has the type of the pointer operand. If the
pointer operand points to an element of an array object, and the array
is large enough, the result points to an element offset from the
original element such that the difference of the subscripts of the
resulting and original array elements equals the integer expression.
[...]
If both the pointer operand and the result point to elements of the
same array object, or one past the last element of the array object,
the evaluation shall not produce an overflow; otherwise, the behavior
is undefined. If the result points one past the last element of the
array object, it shall not be used as the operand of a unary *
operator that is evaluated.
A pointer just past the end of an array is valid, but you can't dereference it. A single non-array object is treated as a 1-element array.
Your program has undefined behavior. You add 1 to a null pointer. Since the null pointer doesn't point to any object, pointer arithmetic on it is undefined.
But compilers aren't required to detect undefined behavior, and your program will probably treat a null pointer just like any valid pointer value, and perform arithmetic on it in the same way. So if the null pointer points to address 0 (this is not guaranteed, BTW, but it's very common), then adding 1 to it will probably give you a pointer to address N, where N is the size in bytes of the type it points to.
You then convert the resulting pointer to int (which is at best implementation-defined, will lose information if pointers are bigger than int, and may yield a trap representation) and you print the int value. The result, on most systems, will probably show you the sizes of char, int, float, and double, which are commonly 1, 4, 4, and 8 bytes, respectively.
Your program's behavior is undefined, but the way it actually behaves on your system is typical and unsurprising.
Here's a program that doesn't have undefined behavior that illustrates the same point:
#include <stdio.h>
int main(void) {
char c;
int i;
float f;
double d;
char *p = &c;
int *q = &i;
float *r = &f;
double *s = &d;
printf("char: %p --> %p\n", (void*)p, (void*)(p + 1));
printf("int: %p --> %p\n", (void*)q, (void*)(q + 1));
printf("float: %p --> %p\n", (void*)r, (void*)(r + 1));
printf("double: %p --> %p\n", (void*)s, (void*)(s + 1));
return 0;
}
and the output on my system:
char: 0x7fffa67dc84f --> 0x7fffa67dc850
int: 0x7fffa67dc850 --> 0x7fffa67dc854
float: 0x7fffa67dc854 --> 0x7fffa67dc858
double: 0x7fffa67dc858 --> 0x7fffa67dc860
The output is not as clear as your program's output, but if you examine the results closely you can see that adding 1 to a char* advances it by 1 byte, an int* or float* by 4 bytes, and a double* by 8 bytes. (Other than char, which by definition has a size of 1 byte, these may vary on some systems.)
Note that the output of the "%p" format is implementation-defined, and may or may not reflect the kind of arithmetic relationship you might expect. I've worked on systems (Cray vector computers) where incrementing a char* pointer would actually update a byte offset stored in the high-order 3 bits of the 64-bit word. On such a system, the output of my program (and of yours) would be much more difficult to interpret unless you know the low-level details of how the machine and compiler work.
But for most purposes, you don't need to know those low-level details. What's important is that pointer arithmetic works as it's described in the C standard. Knowing how it's done on the bit level can be useful for debugging (that's pretty much what %p is for), but is not necessary to writing correct code.
Adding 1 to a pointer advances the pointer to the next address appropriate for the pointer's type.
When the (null)pointers+1 are recast to int, you are effectively printing the size of each of the types being pointed to by the pointers.
printf("%zu %zu %zu %zu\n", sizeof(char), sizeof(int), sizeof(float), sizeof(double));
does pretty much the same thing. If you want to increment each pointer by only 1 byte, you'll need to cast it to (char *) before incrementing, so the compiler knows to step in bytes.
Search for information about pointer arithmetic to learn more.
You're casting the pointers to primitive data types, rather than casting them to pointer types and then using the * (indirection) operator to read the value they point to. For instance, (int)(p + 1) means that p, a char pointer initialized to null, is first advanced to the next address in memory (0x1 in this case), and then this 0x1 is cast to an int. So the output makes sense.
The output you get is related to the size of each of the relevant types. When you do pointer arithmetic as such, it increases the value of the pointer by the added value times the base type size. This occurs to facilitate proper array access.
Because the size of char, int, float, and double are 1, 4, 4, and 8 respectively on your machine, those are reflected when you add 1 to each of the associated pointers.

malloc returns negative value

I am running this piece of code on a hardware.
unsigned char *buf;
buf = malloc(sizeof(int));
printf("Address of buf %d\n" , &buf);
if(!buf)
return MEMORYALLOC_FAILURE;
The malloc is returning negative value. What could be the problem?
The address returned by malloc is neither negative nor positive; it is just an address. Use %p to print it, not %d:
printf("Address of buf %p\n" , (void*)&buf);
And if you want to print the address returned from malloc, remove the ampersand:
printf("Address of buf %p\n" , (void*)buf);
You're printing the address of a memory location as a signed integer. If the memory address--for example on a 32bit machine--is more than 2,147,483,647 (0x7FFFFFFF) it will display as a negative number.
In this case you're also printing the address of a local variable on the stack rather than the address returned by malloc.
The error with using %d to print a pointer-sized value is that pointers may vary in size. The correct approach therefore would be to use the printf specifier for pointers, %p:
// nb: we don't take the address of buf,
// buf is already a pointer (thus its *value* is an address)
printf("Address of buf %p\n", buf);
Type mismatch:
you try to print an address with the specifier for an int ("%d"). You should use "%p" and cast the value to void*
printf("Address of buf %p\n" , (void*)&buf);
Also note the above will not tell you where the allocated memory is. For that you'd need
printf("Address of newly allocated memory %p\n" , (void*)buf);
The cast to void* is mandated by the C99 standard (emphasis is mine)
The argument shall be a pointer to void. The value of the pointer is converted to a sequence of printing characters, in an implementation-defined manner.
Also note that pointers to void need not have the same representation as pointers to other types.
malloc returns either NULL (aka 0) or a memory address. Memory addresses cannot be negative. You just converted the pointer itself to a number, resulting in a negative number.
malloc returns a pointer. It never returns a negative value. If you think it does, then something else is badly broken. On the other hand, if you mean it returns a null pointer, then it was unable to allocate that memory.
