What is the effect of ` c = (int *) ((char *) c + 1)`? - c

I came across this question in an OS course. Here is the code from the 6.828 (Operating System) online course. It is meant to let learners practice pointers in the C programming language.
#include <stdio.h>
#include <stdlib.h>
void
f(void)
{
int a[4];
int *b = malloc(16);
int *c;
int i;
printf("1: a = %p, b = %p, c = %p\n", a, b, c);
c = a;
for (i = 0; i < 4; i++)
a[i] = 100 + i;
c[0] = 200;
printf("2: a[0] = %d, a[1] = %d, a[2] = %d, a[3] = %d\n",
a[0], a[1], a[2], a[3]);
c[1] = 300;
*(c + 2) = 301;
3[c] = 302;
printf("3: a[0] = %d, a[1] = %d, a[2] = %d, a[3] = %d\n",
a[0], a[1], a[2], a[3]);
c = c + 1;
*c = 400;
printf("4: a[0] = %d, a[1] = %d, a[2] = %d, a[3] = %d\n",
a[0], a[1], a[2], a[3]);
c = (int *) ((char *) c + 1);
*c = 500;
printf("5: a[0] = %d, a[1] = %d, a[2] = %d, a[3] = %d\n",
a[0], a[1], a[2], a[3]);
b = (int *) a + 1;
c = (int *) ((char *) a + 1);
printf("6: a = %p, b = %p, c = %p\n", a, b, c);
}
int
main(int ac, char **av)
{
f();
return 0;
}
I copied it to a file, compiled it with gcc, and got this output:
$ ./pointer
1: a = 0x7ffd3cd02c90, b = 0x55b745ec72a0, c = 0x7ffd3cd03079
2: a[0] = 200, a[1] = 101, a[2] = 102, a[3] = 103
3: a[0] = 200, a[1] = 300, a[2] = 301, a[3] = 302
4: a[0] = 200, a[1] = 400, a[2] = 301, a[3] = 302
5: a[0] = 200, a[1] = 128144, a[2] = 256, a[3] = 302
6: a = 0x7ffd3cd02c90, b = 0x7ffd3cd02c94, c = 0x7ffd3cd02c91
I can easily understand outputs 1, 2, 3 and 4, but output 5 is hard for me to understand. Specifically, why are a[1] = 128144 and a[2] = 256?
It seems this output is the result of
c = (int *) ((char *) c + 1);
*c = 500;
I have trouble understanding what the statement c = (int *) ((char *) c + 1) does.
c is a pointer, declared as int *c. Before the 5th line of output, c points to the second element of array a, because of c = a and c = c + 1. So what is the meaning of (char *) c, then ((char *) c + 1), and finally (int *) ((char *) c + 1)?

Although this is undefined behavior per the standard, it has a clear meaning in "ancient C", and it clearly works that way on the machine/compiler you're working with.
First, it casts c to a (char *), which means that pointer arithmetic will work in units of sizeof(char) (i.e. one byte) instead of sizeof(int). Then it adds one byte. Then it converts the result back to (int *). The result is an int pointer that now refers to an address one byte higher than it used to. Since c was pointing at a[1] beforehand, afterwards *c = 500 will write to the last three bytes of a[1] and the first byte of a[2].
On many machines (but not x86) this is an outright illegal thing to do. An unaligned access like that would simply crash your program. The C standard goes further and says that that code is allowed to do anything: when the compiler sees it, it can generate code that crashes, does nothing, writes to a completely unrelated bit of memory, or causes a small gnome to pop out of the side of your monitor and hit you with a mallet. However, sometimes the easiest thing to do in the case of UB is also the straightforward obvious thing, and this is one of those cases.
Your course material is trying to show you something about how numbers are stored in memory, and how the same bytes can be interpreted in different ways depending on what you tell the CPU. You should take it in that spirit, and not as a guide to writing decent C.

At the first output, c is uninitialized, so it holds an indeterminate address.
After c = a;, c points to a, so when you change c[0], c[1], *(c + 2) or 3[c], the corresponding element of a changes.
At the following line:
c = c + 1;
c now points to a[1], and its address is 0x7ffd3cd02c94.
Now look at the line you are asking about: c = (int *) ((char *) c + 1);. It does the following:
Converts c to a pointer to char, which still points to the same address, 0x7ffd3cd02c94.
Increases that pointer by 1, so the address is now 0x7ffd3cd02c95.
Converts the new address back to int * and assigns it to c.
Before that statement, *c covers the bytes at addresses 0x7ffd3cd02c94 to 0x7ffd3cd02c97. Afterwards it covers 0x7ffd3cd02c95 to 0x7ffd3cd02c98. That is the reason for the values in output 5:
[![enter image description here][1]][1]
Now it is clear why the values changed as you observed.
NOTE: This is correct for a little-endian system. On a big-endian system the result would be different. On some embedded platforms that do not allow unaligned access, you would get an exception at that line.
[1]: https://i.stack.imgur.com/eU0Tb.png

This is a result of undefined behavior. After c = c + 1, c holds the address of a[1]. Adding one byte through a char * and casting back to int * produces a pointer that is not correctly aligned for int, and storing through it with *c = 500 is undefined. Compiling with -Wall -pedantic -std=c99 is still a good habit, although those flags are not guaranteed to report this particular problem.
To answer your question about (char *) c and ((char *) c + 1).
(char *) c: c has type int * (pointer to int). The cast yields a char * that holds the same address, here the address of a[1], the second element of a. Pointer arithmetic on that char * moves in units of one byte instead of sizeof(int) bytes.
((char *) c + 1): this is the address one byte past the start of a[1]. It is not the same as c + 1, which is &c[1] and lies sizeof(int) bytes further on. The outer (int *) then converts that byte address back to a pointer to int.

Related

Values are not getting assigned to continuous addresses

Why does the piece of code below give a runtime error?
#include<stdio.h>
int main()
{
int i;
int *p;
int a = 10;
p= &a;
printf("address of a = %x\n",p);
*(p + 0) = 5;
*(p + 1) = 6;
*(p + 2) = 7;
*(p + 3) = 8;
for(i=0; i < 4; i++)
{
printf("address = %x value = %x\n",(p+i),*(p+i));
}
return 0;
}
In this code I assign a value through the address of the variable a, and then the values 6, 7 and 8 are assigned to the addresses that consecutively follow a.
*(p + 1) = 6;
p is an int* - meaning that when you increment it by one, it doesn't jump one byte forwards - it jumps sizeof(int) bytes forward (probably 4 bytes). If you want to assign to the bytes separately, cast the pointer to a char*:
*((char*)p + 1) = 6;
When you write code like *(p + 1) = 6; - your program is very likely to crash. Per the standard this is undefined behavior, in practice what usually really happens behind the scenes is that since p == &a and a is on the stack, p + 1 points to 4 bytes in the stack above a - which likely contains some random value like a stack canary or a return address - and you are corrupting this value.
These expressions:
*(p + 1) = 6;
*(p + 2) = 7;
*(p + 3) = 8;
create pointers that are past the memory bounds of a, which are then dereferenced. Reading or writing memory past the bounds of an object (or even attempting to create such a pointer, unless it points just past the end of the object) triggers undefined behavior.
In this particular case it caused your program to crash, but there is no guarantee that will happen.
You should allocate that memory before accessing it. Try using malloc().
Your code should look like this:
#include<stdio.h>
int main()
{
int i;
int a = 10;
char *p= (char *)&a;
printf("address of a = %p\n",p);
for (i = 0; i < sizeof(a); ++i) {
*(p + i) = i + 5;
}
for(i = 0; i < sizeof(a); ++i) {
printf("address = %p value = %d\n", p + i, *(p + i));
}
return 0;
}
One solution is to define p as a pointer to char. Another approach, as suggested in other answers, is to cast p to a pointer to char before doing any arithmetic. With pointer arithmetic, the number of bytes you "jump" equals the size of the pointed-to type, so p + 1 jumps 4 bytes if int is 4 bytes. This is why you should use a pointer to char if you want to move one byte at a time.
In addition, your loops should run N times, where N is the number of bytes, so my suggestion is to use sizeof.
Last thing: to print an int you should use %d, and to print pointers (i.e. addresses) you should use %p.

Dereferencing char * array[] and storing in char ** array[] (C)

Edit. Sorry for the minimal information included previously
Say I have the following code:
char ** a[16];
a[15] = '\0';
int i;
for (i = 0; i < 5; i++) {
char * b[3];
b[2] = '\0';
b[0] = "foo";
b[1] = "bar";
if (i == 4) {
b[0] = "hello";
b[1] = "world";
}
a[i] = b;
}
Straight after the for loop, if I include the following two lines:
printf("%s %s\n", a[0][0], a[0][1]);
printf("%s %s\n", a[4][0], a[4][1]);
I want the output to be:
foo bar
hello world
However it is instead:
hello world
hello world
I am aware this is because of my assignment, a[i] = b;, where b is an array of char pointers. On each iteration, the character pointers stored in b[0] and b[1] are changed.
In the final loop they are set to "hello" and "world" respectively. Since I assigned b to a[i], every index of a now points to the same thing.
What I would like to do is dereference b, such that a[i] is given the value b points to rather than b itself. Therefore after the loop all indexes of a are not the same.
I tried using the following but both resulted in segmentation faults:
*a[i] = *b
and
**a[i] = **b
Any help would be much appreciated, as I'm totally lost. Thank you :)
What I would like to do is dereference b, such that a[i] is given the value b points to rather than b itself. Therefore after the loop all indexes of a are not the same.
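Note that b itself is an array, so you cannot simply assign "the value b points to": b does not point to a single value.
To copy the elements instead (not sure why you would want this), replace the line:
a[i] = b;
with:
a[i][0] = b[0];
a[i][1] = b[1];
a[i][2] = b[2];
This only works if a[i] already points to storage for three char * elements. In the code as posted, a[i] is uninitialized, so these assignments are undefined behavior unless you allocate that storage first (for example with malloc()).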

Pointer of pointer arithmetic regarding matrices

I have a doubt regarding pointer of pointer arithmetic in C.
If we do
int ** ptr = 0x0;
printf("%p",ptr+=1);
The output will be ptr+(# of bytes needed for storing a pointer, in my case 8).
Now if we declare a matrix:
int A[100][50];
A[0] is a pointer of pointer.
A[0]+1 will now point to A[0]+(# of bytes needed for storing an integer, in my case 4).
Why "normally" 8 bytes are added and now 4?
A[0]+1 will point to A[0][1], so it is useful, but how does it work?
Thank you!
Consider this program, run on a 64-bit machine (a Mac running macOS Mojave 10.14.6, with GCC 9.2.0 to be precise):
#include <stdio.h>
int main(void)
{
int A[100][50];
printf("Size of void * = %zu and size of int = %zu\n", sizeof(void *), sizeof(int));
printf("Given 'int A[100][50];\n");
printf("Size of A = %zu\n", sizeof(A));
printf("Size of A[0] = %zu\n", sizeof(A[0]));
printf("Size of A[0][0] = %zu\n", sizeof(A[0][0]));
putchar('\n');
printf("Address of A[0] = %p\n", (void *)A[0]);
printf("Address of A[0] + 0 = %p\n", (void *)(A[0] + 0));
printf("Address of A[0] + 1 = %p\n", (void *)(A[0] + 1));
printf("Difference = %td\n", (char *)(A[0] + 1) - (char *)(A[0] + 0)); /* char *: arithmetic on void * is a GCC extension */
putchar('\n');
printf("Address of &A[0] = %p\n", (void *)&A[0]);
printf("Address of &A[0] + 0 = %p\n", (void *)(&A[0] + 0));
printf("Address of &A[0] + 1 = %p\n", (void *)(&A[0] + 1));
printf("Difference = %td\n", (char *)(&A[0] + 1) - (char *)(&A[0] + 0)); /* char *: arithmetic on void * is a GCC extension */
return 0;
}
The output is:
Size of void * = 8 and size of int = 4
Given 'int A[100][50];
Size of A = 20000
Size of A[0] = 200
Size of A[0][0] = 4
Address of A[0] = 0x7ffee5b005e0
Address of A[0] + 0 = 0x7ffee5b005e0
Address of A[0] + 1 = 0x7ffee5b005e4
Difference = 4
Address of &A[0] = 0x7ffee5b005e0
Address of &A[0] + 0 = 0x7ffee5b005e0
Address of &A[0] + 1 = 0x7ffee5b006a8
Difference = 200
Therefore, it is possible to deduce that A[0] is an array of 50 int — it is not a 'pointer of pointer'. Nevertheless, when used in an expression such as A[0] + 1, it 'decays' into a 'pointer to int' (pointer to the type of the element of the array), and hence A[0] + 1 is one integer's worth further through the array.
The last block of output shows that the address of an array has a different type — int (*)[50] in the case of A[0].

I am confused how to understand this code. contains double pointers

I don't understand why the code below changes the array b:
int a[] = { 3, 6, 9 };
int b[] = { 2, 4, 6, 8, 10 };
int **c;
int **d[2];
c = (int **)malloc (b[1] * sizeof(int *));
*c = &a[1];
c[1] = c[0] + 1;
*d = c;
c = c + 2;
*c = b;
c[1] = &c[0][3];
*(d + 1) = c;
d[0][3][1] = d[1][0][0];
d[1][0][2] = d[0][1][0];
I have run this code and found the values of array a and array b but I am unable to understand how these values come.
Array a remains unchanged while b becomes 2, 4, 9, 8, 2. How does this happen?
c = (int**)malloc(b[1] * sizeof(int*)); //int **c[4] ???
Is c an array of double pointers? *c = &a[1] means that c[0] holds the address of the second element of array a, but I don't see how to interpret the rest.
The code contains actual statements, therefore it must be part of a function body, hence all declarations herein have automatic storage. It is highly convoluted, with purposely contrived double indirections... Let's analyse it one line at a time:
int a[] = { 3, 6, 9 }; -- a is an array of 3 ints initialized with some explicit values.
int b[] = { 2, 4, 6, 8, 10 }; -- likewise, b is an array of 5 ints initialized with some explicit values.
int **c; -- c is an uninitialized pointer to a pointer to int, that can be made to point to an array of pointers to int.
int **d[2]; -- d is an uninitialized array of 2 pointers to pointers to int, each of which can be made to point to an array of pointers to int.
c = (int **)malloc(b[1] * sizeof(int *)); -- c is set to point to a block of uninitialized memory with a size of 4 pointers to int. In short, c now points to an uninitialized array of 4 pointers to int. Let's call this array A.
*c = &a[1]; -- The element pointed to by c (aka A[0]) is set to point to the second element of a (aka a[1], with a value of 6). The value of A[0] is &a[1].
c[1] = c[0] + 1; -- The second element in the array pointed to by c (aka A[1]) is set to point to the element after the one pointed to by c[0], hence it points to the third element of a (aka a[2] with a value of 9). The value of A[1] is &a[2].
*d = c; -- The first element of d is set to the value of pointer c, which is the address of A[0]. The value of d[0] is &A[0].
c = c + 2; -- The pointer c is incremented by 2, it now points to the third element of the array A allocated with malloc(), A[2].
*c = b; -- The element pointed to by c, A[2], which is itself a pointer, is set to point to the first element of b, b[0]. The value of A[2] is &b[0].
c[1] = &c[0][3]; -- The element after that, A[3], the 4th element of the array allocated by malloc, is set to point to the 4th element of the array pointed to by the element c points to. &c[0][3] is equivalent to c[0] + 3 or &(*c)[3] or simply *c + 3. This element is b[3] which has the value 8. The value of A[3] is &b[3].
*(d + 1) = c; -- This is equivalent to d[1] = c; which sets the second element of d to the value of the pointer c, which is the address of the 3rd element of the array allocated with malloc(), A[2], which points to b[0]. The value of d[1] is &A[2].
d[0][3][1] = d[1][0][0]; -- Let's rewrite these terms:
d[0][3][1] => (&A[0])[3][1] => A[3][1] => (&b[3])[1] => *((b + 3) + 1) => b[4]
d[1][0][0] => (&A[2])[0][0] => (*&A[2])[0] => A[2][0] => (&b[0])[0] => b[0] which is the value 2.
Hence b[4] = 2;.
d[1][0][2] = d[0][1][0]; -- Let's rewrite these:
d[1][0][2] => (&A[2])[0][2] => (*&A[2])[2] => A[2][2] => (&b[0])[2] => (b + 0)[2] => b[2].
d[0][1][0] => (&A[0])[1][0], ie A[1][0] => (&a[2])[0] => *&a[2] => a[2] that has a value of 9.
Hence b[2] = 9;
As a consequence, the array b now has elements { 2, 4, 9, 8, 2 }.
You can run the program:
#include <stdio.h>
#include <stdlib.h>
int main() {
int a[] = { 3, 6, 9 };
int b[] = { 2, 4, 6, 8, 10 };
int **c;
int **d[2];
c = (int **)malloc (b[1] * sizeof(int *));
*c = &a[1];
c[1] = c[0] + 1;
*d = c;
c = c + 2;
*c = b;
c[1] = &c[0][3];
*(d + 1) = c;
d[0][3][1] = d[1][0][0];
d[1][0][2] = d[0][1][0];
printf("a = { ");
for (size_t i = 0; i < sizeof a / sizeof *a; i++)
printf("%d, ", a[i]);
printf("};\n");
printf("b = { ");
for (size_t i = 0; i < sizeof b / sizeof *b; i++)
printf("%d, ", b[i]);
printf("};\n");
return 0;
}

Printing a variable in C that was not assigned a value

I put this code into eclipse and run it
main()
{
int *p, *q, *r;
int a = 10, b = 25;
int c[4] = {6,12,18,24};
p = c;
printf("p = %d\n" ,p);
}
the output I get is
p = 2358752
what is this number supposed to represent? Is it the address of the variable?
If what i'm saying above is true would my answer to the following question be correct?
so lets say the following are stored at the following locations
address variables
5000 p
5004 q
5008 r
500C a
5010 b
5014 c[0]
5018 c[1]
501C c[2]
5020 c[3]
so would the result of the line
p = c;
be 5014?
int *p,
The above statement defines p to be a pointer to an integer.
In the statement below, c is implicitly converted to a pointer to the first element of the array c.
p = c;
// equivalent to
p = &c[0];
Therefore, p contains the address of the first element of the array. Also, the conversion specifier to print an address is %p.
printf("p = %p\n", (void *)p);
// prints the same address
printf("c = %p\n", (void *)c);
Yes, p is the address of c, which is the same as the address of c[0]. And yes, in your second example, p would be equal to 5014.
