I don't understand why the output is nt in this program.
Can anyone explain this program?
#include <stdio.h>
#include <stdlib.h>
int main(){
printf(3+"excellent"+4); //output is "nt"
return 0;
}
"excellent" is an array of type char[10], the elements of which are the 9 letters of the word and the terminating '\0'. And then, C11 6.3.2.1p3,
Except when it is the operand of the sizeof operator, the _Alignof operator, or the unary & operator, or is a string literal used to initialize an array, an expression that has type "array of type" is converted to an expression with type "pointer to type" that points to the initial element of the array object and is not an lvalue. [...]
i.e. it is converted to a pointer to the first character of the string, (e), and then has the type char *.
Now we have two additions:
(3 + (char *)"excellent") + 4
The C standard says (simplified, C11 6.5.6p8) that when adding an integer and a pointer together, the result will be a pointer of the same type, and will be interpreted so that if the pointer p was pointing to element n of an array, then p + m will result in a pointer that will point to element n + m of the same array, or one past the end, or, if n + m is outside the bounds of the array or one past the end, the behaviour is undefined.
I.e. 3 + "excellent" will give a pointer that will point to the 2nd letter e of excellent. Now of course since the parenthesized expression has type char * and it points to the element 3 of the array, if we add 4 to it, we get a pointer that points to the element 7, i.e. 8th letter, the n.
<-------------- char [10] -------------->
+---+---+---+---+---+---+---+---+---+---+
| e | x | c | e | l | l | e | n | t | \0|
+---+---+---+---+---+---+---+---+---+---+
  ^           ^               ^
  |           |               |
  + first character, "excellent" after array-to-pointer conversion
              |               |
              + 3 + "excellent"
                              |
                              + 3 + "excellent" + 4
Now finally, what will happen when we call printf giving such a pointer as an argument? printf will consider the argument as being a pointer to a first character of a null terminated string that is the format string. Other than special sequences that start with %, all characters are copied verbatim to the output until the terminating null is met.
Another way to look into these is to remember that
*(a + b)
is equal to
a[b] (or even b[a])
and since &*x is equivalent to x,
&*(a + b) == (a + b) == (b + a) == &a[b] == &b[a]
and we get that
3 + "excellent" + 4
equals
&"excellent"[3] + 4
which equals
&"excellent"[3 + 4]
i.e.
&"excellent"[7]
This
printf(3+"excellent"+4);
can be written in a slightly longer but much clearer way:
const char *str = "excellent";
const char *to_print = str + 3 + 4; // equivalent to &str[7] which points to 'n'
printf(to_print); // or printf("%s", to_print); which prints "nt"
It is because it prints everything after the first 7 characters. Each addition moves the starting point of the print forward. If you change it to printf(2+"excellent"+4), you get "ent".
The following lines of code work as you'd expect
#include <stdio.h>
int main(void)
{
int n;
int a[5];
int *p;
a[2] = 1024;
p = &n;
/*
* write your line of code here...
* Remember:
* - you are not allowed to use a
* - you are not allowed to modify p
* - only one statement
* - you are not allowed to code anything else than this line of code
*/
/* ...so that this prints 98\n */
printf("a[2] = %d\n", a[2]);
return (0);
}
This prints out a[2] = 1024
Now I was asked to modify this code so that a[2] = 98 gets printed instead. There were a ton of constraints. I couldn't use the variable a anywhere else in the code again and a couple other things. I found a solution online but I don't understand it at all.
#include <stdio.h>
int main(void)
{
int n;
int a[5];
int *p;
a[2] = 1024;
p = &n;
/*
* write your line of code here...
* Remember:
* - you are not allowed to use a
* - you are not allowed to modify p
* - only one statement
* - you are not allowed to code anything else than this line of code
*/
p[5] = 98;
/* ...so that this prints 98\n */
printf("a[2] = %d\n", a[2]);
return (0);
}
So, setting p[5] = 98; results in a[2] = 98 being printed, which is the intended result. I'm fairly new to C programming and pointers in general but I have absolutely no idea why this works the way it does.
This is all about the layout of the stack, the memory where the local variables of your function are stored. Variables are pushed onto the stack, so first the variable n is pushed; let us assume that it ends up at address 1000 (this is just a fictional address). Since it is an integer, it takes up the space of an integer (4 bytes if integers are 32 bits). Then the array a is pushed onto the stack, and we assume it ends up right next to n.
Since n was placed at address 1000 and took up 4 bytes of memory, a will be placed at address 1004 (1000 + 4). a is an array of 5 integers, each taking up 4 bytes, so a occupies the 20 bytes from 1004 up to, but not including, 1024.
Your variable p is an integer pointer, and you set it to point to n. That means that p holds the address 1000, which is the address of n. You then write p[5], which in C is equivalent to the expression *(p + 5): take the value at the address 5 elements past p. Since p is an integer pointer and each integer takes up 4 bytes, you are essentially asking for the value at address 1000 + (5 * 4) = 1020.
In the array a you stored the value 1024 at index 2, which corresponds to address 1004 + 2 * 4 = 1012, so when you print a[2] you are printing the value at address 1012. Under this layout, the value you are setting to 98 is not a[2] but a[4] (1004 + 4 * 4 = 1020).
The reason why I am mentioning this, is that in my case it did not print 98 but 1024. As people have already mentioned you are working with undefined behavior, and although it might work on some setups, it might not work on all.
I can't tell you the answer to your problem. But I can write a program which might, maybe, find the answer to your problem.
Try running this program:
#include <stdio.h>
int main2(int off)
{
int n;
int a[5];
int *p;
a[2] = 1024;
p = &n;
p[off] = 98;
return a[2];
}
int main()
{
int i;
for(i = -5; i <= 5; i++)
if(main2(i) == 98)
printf("the magic offset is %d\n", i);
}
On my computer, with one of my compilers, today, this program prints
the magic offset is 4
That tells me that (on my computer, with that compiler, today) the "solution" to your ridiculous problem would be
#include <stdio.h>
int main()
{
int n;
int a[5];
int *p;
a[2] = 1024;
p = &n;
p[4] = 98;
printf("a[2] = %d\n", a[2]);
}
I'm not even going to try to explain why this works, because the reasons are so obscure, unrepeatable, and meaningless. (See this question's other good answers for more details.) Basically the first program automates the search for a magic offset from p, more or less as you discovered.
And, in fact, under the first compiler I tried, it didn't even work. Despite using the magic number 4 that the first program discovered, the second program printed a[2] = 1024. That's not too surprising: the relative positions of variables like a, p, and n are not specified by any standard. They're totally up to the compiler. The compiler is perfectly within its rights to arrange them one way in function main2 in my first program, and a completely different way in function main in my second program.
I tried my first program under a different compiler, and it printed
the magic offset is -3
and then crashed with a segmentation fault. But then, under that compiler, when I changed the relevant line in the second program to
p[-3] = 98;
it "worked", printing a[2] = 98 as required.
(And then I tried turning up the optimization level, and it stopped working.)
To be perfectly clear, the fact that my approach did not work under that first compiler, because it failed to arrange things in a consistent or predictable way, does not mean there's anything wrong with that compiler! Quite the contrary: the fault is entirely in the broken programs I wrote, and the broken assignment of yours that motivated them.
Here is an alternative exercise which will teach you something useful about arrays and pointers, without requiring that you "learn" false, unrepeatable facts about how variables are or aren't guaranteed to be arranged in stack frames.
#include <stdio.h>
int main(void)
{
int a[5];
int *p;
a[2] = 1024;
p = &a[4];
/*
* write your line of code here...
* Remember:
* - you are not allowed to use a
* - you are not allowed to modify p
* - only one statement
* - you are not allowed to code anything else than this line of code
*/
/* ...so that this prints 98\n */
printf("a[2] = %d\n", a[2]);
}
This problem has a similar solution — you can easily work it out — but the solution is unique and guaranteed to work, because it depends on well-defined properties of arrays and pointer arithmetic in C, not on accidental details of the stack layout.
It "works" by accident. It relies on n, p, and a being laid out in memory in a specific order (each box represents 4 bytes):
Address   Item
-------   --------
              +---+
0x8000    n:  |   |          p[0]
              +---+
0x8004    p:  |   |          p[1]
              +---+
0x8008        |   |          p[2]
              +---+
0x800c    a:  |   |   a[0]   p[3]
              +---+
0x8010        |   |   a[1]   p[4]
              +---+
0x8014        |   |   a[2]   p[5]
              +---+
0x8018        |   |   a[3]   p[6]
              +---+
0x801c        |   |   a[4]   p[7]
              +---+
Some background:
A pointer is any expression whose value is the location of an object or function in a running program's execution environment - essentially, an address. A variable of pointer type stores an address value. However, pointers have associated type semantics - a pointer to int is a different type than a pointer to double, which is a different type than a pointer to struct foo, which is a different type than a pointer to an array of char, etc.
When you add 1 to a pointer value, the result is a pointer to the next object of the pointed-to type immediately following:
char *cp = &some_char;
short *sp = &some_short;
long *lp = &some_long;
           +---+                          +---+                         +---+
some_char: |   | <-- cp       some_short: |   | <-- sp       some_long: |   | <-- lp
           +---+                          |   |                         |   |
           |   | <-- cp + 1               |   |                         |   |
           +---+                          +---+                         |   |
           |   | <-- cp + 2               |   | <-- sp + 1              |   |
           +---+                          |   |                         |   |
           |   | <-- cp + 3               |   |                         |   |
           +---+                          +---+                         +---+
           |   | <-- cp + 4               |   | <-- sp + 2              |   | <-- lp + 1
           +---+                          |   |                         |   |
            ...                            ...                           ...
This is exactly how array subscripting works - the array subscript operation a[i] is defined as *(a + i) - given a starting address a, offset i elements (not bytes!) from that address and dereference the result. Arrays are not pointers; rather, array expressions "decay" to pointers to their first element under most circumstances. But this means you can use the [] subscript operator on pointer variables as well, so if you set p to point to n with
int *p = &n;
then you can apply the [] operator to p and treat it as though it was an array. So, if p == 0x8000 (the address of n), then p + 5 == 0x8014, which is the address of a[2]. Thus, *(p + 5) == p[5] == a[2] == *(a + 2).
But...
This behavior is undefined - it may work, it may not. It may result in garbled output, it may branch into some random subroutine, it may invoke Rogue. Neither the compiler nor the runtime environment is required to handle it in any particular way - any result is equally correct as far as the language is concerned.
We're pretending n is the first element of an array of int when it really isn't, so we're indexing out of bounds with p[1], p[2], etc. We're assuming objects are laid out in a specific order, but the compiler is under no obligation to lay variables out that way. The compiler may optimize things such that p is stored in a register, rather than on the stack.
This is a horrible way to teach pointers. It's unsafe, it's unportable, it's bad practice, it's confusing, it's an atypical use case, it doesn't explain why we use pointers. Whoever gave you this code shouldn't be teaching anyone how to program in C. If they write C for a living they are a menace.
Can you explain the following output:
main()
{
char f[] = "qwertyuiopasd";
printf("%s\n", f + f[6] - f[8]);
printf("%s", f + f[4] - f[8]);
}
output:
uiopasd
yuiopasd
For example regarding the first printf:
f[8] should represent the char 'o'
f[6] should represent the char 'u'
%s format prints the string (printf("%s", f) is giving the whole "qwertyuiopasd")
So how does it come together, what is the byte manipulation here?
There are multiple problems in the code posted:
The missing return type for main is obsolete syntax. You should use int main().
The prototype for printf is not in scope when the calls are compiled, which has undefined behavior. You should include <stdio.h>.
The expression f + f[6] - f[8] has undefined behavior: addition is left associative, so f + f[6] - f[8] is evaluated as (f + f[6]) - f[8]. f[6], which is the letter u, is unlikely to have a value less than 14 (in ASCII, its value is 117), so f + f[6] points well beyond the end of the string and is an invalid pointer. Computing f + f[6] - f[8] therefore has undefined behavior, in spite of the fact that 'u' - 'o' has the value 6 for the ASCII character set. The expression should be changed to f + (f[6] - f[8]).
Assuming ASCII, the letters o, u and t have values 111, 117 and 116.
f + (f[6] - f[8]) is f + ('u' - 'o') which is f + (117 - 111) or f + 6.
f + 6 is the address of f[6], hence a pointer to the 7th character of the string "qwertyuiopasd". Printing this string produces uiopasd.
Assuming the characters follow the ASCII scheme, the ASCII values of the relevant characters are:
o (f[8]): 111
u (f[6]): 117
t (f[4]): 116
f is a char array that, in this expression, is converted to a pointer to its first element. The first statement evaluates to f + 6, a pointer to the element at index 6 (the 7th character), so printing starts there and continues until it encounters \0.
Similarly, the second statement evaluates to f + 5, thus you get yuiopasd as output.
What does f + n mean?
You can perform the following arithmetic on pointers: ++, --, +, -. A pointer stores a memory address, and incrementing a pointer increases the address by the size of the pointed-to type.
For example, if f points to an int at address 1000 and an int takes 4 bytes, then f + 1 will point to address 1004, which is the next element in the array.
It is simple pointer arithmetic, which will be easier to understand with this example:
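#include <stdio.h>
int main(void)
{
char f[] = "9876543210";
/* the last argument is parenthesized so the pointer stays inside the array */
printf("%s , f[6]=%d, f[8]=%d, f[6]-f[8]=%d, f + f[6] - f[8] = %s\n",f, f[6], f[8], f[6]-f[8], f + (f[6] - f[8]));
return 0;
}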
The result is :
9876543210 , f[6]=51, f[8]=49, f[6]-f[8]=2, f + f[6] - f[8] = 76543210
f[n] is the integer value of the element at index n of the array.
In this example the difference between the ASCII codes of the elements at indices 6 and 8 is 2.
When we add 2 to the char pointer it will reference the element 2 chars ahead, which in our case is '7'.
This is all about pointer arithmetic. The expression f + f[6] - f[8] evaluates to a char* pointer (like its first operand, because an array name used in such an expression is converted to a pointer to its first element), and will expand to this:
f + (int)'u' - (int)'o'
(where 'u' and 'o' represent f[6] and f[8], respectively).
The values that represent the characters 'u' and 'o' are (on almost all modern systems, which use ASCII-compatible encodings) separated by 6, so the expression adds 6 to the f address and prints the string starting from its 7th element.
Similarly for the expression f + f[4] - f[8] - but here, the difference is only 5 ('t' - 'o').
I have this code, I'm trying to figure out what the second line of code does.
static int table [][4]= {{1,2,3,4},{2,3,4,5},{3,4,5,6}};
int valore = *(*(table+2)+1);
printf("%d",valore);
I have a basic knowledge of pointers in C. Can you please explain what the second line of code does?
Your table is simply a 2D array of integers. In C a 2D array is really an "array of arrays". Your table has the dimensions static int table[3][4]; (3 rows x 4 cols): it is an array of 3 integer arrays with 4 elements each. Since it is an array, all values will be sequential in memory. You can think of the memory layout as follows.
         +---+---+---+---+
table[0] | 1 | 2 | 3 | 4 |
         +---+---+---+---+
table[1] | 2 | 3 | 4 | 5 |
         +---+---+---+---+
table[2] | 3 | 4 | 5 | 6 |
         +---+---+---+---+
An array is converted to a pointer on access (except in 4 limited circumstances, not relevant here; see C11 Standard - 6.3.2.1 Other Operands - Lvalues, arrays, and function designators(p3) for details).
You are introduced to "pointer notation" in the question. You can access any element of an array using "array indexes" or "pointer notation". In pointer notation *(a + b) is equivalent to a[b] in array index notation. You have:
*(*(table+2)+1)
If you take it piece by piece, *(table + 2) is simply table[2]. Next, *(table[2] + 1) is simply table[2][1]. So you are accessing the 2nd value in the 3rd row with either notation (which is simply 4).
Look things over and let me know if you have further questions.
table is an array of 3 arrays of 4 int.
When an array is used in an expression, it is converted to a pointer to its first element, except when:
It is the operand of sizeof.
It is the operand of unary &.
It is a string literal used to initialize an array.
So, in *(*(table+2)+1), table is converted to a pointer to its first element, producing &table[0]. Then we have:
*(*(&table[0]+2)+1)
Next, we have the addition &table[0] + 2. This uses pointer arithmetic. Adding an integer to a pointer (into an array) moves the pointer backward or forward by a number of elements. So &table[0] + 2 produces a pointer to table[2], which is &table[2]. Then we have:
*(*(&table[2])+1)
The inner parentheses are no longer needed, so we have:
*(*&table[2]+1)
Then * &table[2] is the thing that &table[2] points to, which means it is table[2]:
*(table[2] + 1)
Since table is an array of 3 arrays of 4 int, table[2] is an array of 4 int. Since it is an array, it is converted to a pointer to its first element, producing &table[2][0]:
*(&table[2][0] + 1)
Now we have pointer arithmetic again. &table[2][0] is a pointer to element 0 of the array table[2], so adding 1 produces a pointer to element 1, &table[2][1]:
*(&table[2][1])
Again we have parentheses that are no longer needed:
*&table[2][1]
And, finally, * &table[2][1] is the thing that &table[2][1] points to, so it is just:
table[2][1]
I've been facing difficulties to understand the fourth line of code after the first curly brace,
#include<stdio.h>
int main()
{
int arr[] = {10,20,36,72,45,36};
int *j,*k;
j = &arr[4];
k = (arr+4);
if(j==k)
printf("The two pointers are pointing at the same location");
else
printf("The two pointers are not pointing at the same location");
}
I just wanted to know what the fourth line of code after the first curly brace i.e. k = (arr+4); does?
Since k is a pointer, I thought it had to be assigned something that uses the "address of" operator. If there is no "address of" operator here, what does the code k = (arr+4) actually do?
For any array or pointer arr and index i, the expression arr[i] is exactly equal to *(arr + i).
Now considering that arrays naturally can decay to pointers to their first element, arr + i is a pointer to element i.
Without & or the sizeof operator, an array converts to a pointer to the first element of the array. In your case the array will convert to a pointer to int and will point to the first element. arr + 1 will point to the second element, arr + 2 will point to the third element, etc. arr + 1 means advancing the address by sizeof(int) bytes.
This is a simplified diagram of the first 2 elements of the array. Let's assume int is 4 bytes long.
|     first element     |    second element     |
-------------------------------------------------
|     |     |     |     |     |     |     |     |
|     |     |     |     |     |     |     |     |
-------------------------------------------------
0x00  0x01  0x02  0x03  0x04  0x05  0x06  0x07
arr will contain 0x00
arr + 1 will contain 0x04
*arr means take the value from address 0x00. It's equivalent to *(arr+0), *(0+arr), arr[0] and 0[arr]. Since arr is converted to type int*, it will read a four-byte value.
With int *k = arr, k will contain the same address as arr.
k = (arr + 4) will contain the address of the 5th element.
j = &arr[4]; will also store the address of the 5th element
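As k is defined as a pointer, it can store an address as its value and so point to a location.
Here (arr+4) takes the address of the first element of arr and moves it forward by 4 elements.
The addition is scaled by the element size: if int takes 4 bytes the address grows by 16 bytes, and if it takes 2 bytes it grows by 8, but in every case the result points to the 5th element, arr[4]. Only the byte offset depends on the system, not the element that is reached.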
Something that you should be aware of (C Standards#6.3.2.1p3):
Except when it is the operand of the sizeof operator, the _Alignof operator, or the unary & operator, or is a string literal used to initialize an array, an expression that has type ''array of type'' is converted to an expression with type ''pointer to type'' that points to the initial element of the array object and is not an lvalue. If the array object has register storage class, the behavior is undefined.
The statement:
int arr[] = {10,20,36,72,45,36};
arr is an array of int.
The expression arr[i], can also be written as:
*(arr+i)
So, &arr[i] can be written as:
&(*(arr+i))
The operator & is used to get the address and the operator * is used for dereferencing. These operators cancel the effect of each other when used one after another. Hence, &(*(arr+i)) is equivalent to arr+i.
I just wanted to know what the fourth line of code after the first curly brace i.e. k = (arr+4); does?
In the statement:
k = (arr+4);
none of the operators sizeof, _Alignof and unary & is used. So,
arr is converted to a pointer to int. That means arr+4 gives the address of the element four positions past the object (i.e. int) pointed to by arr, which is nothing but &arr[4].
&arr[4] --> &(*(arr+4)) --> (arr+4)
You need to understand pointer arithmetic here.
arr[i] is interpreted as *(arr + i),
where * is the 'dereferencing' or 'value at' operator and arr, in this expression, stands for the address of the first element of the array (numerically the same as the address of the array itself), which literally means,
valueat(starting address of arr + i)
Now suppose the address of arr is 100 and an int takes 4 bytes. You are adding i elements of the array's element type to that address, not the raw value of i, that is,
valueat(100 + i elements of type int)
Now according to your code, here you are assigning the address of the element at index 4 (the 5th element) to pointer j,
j = &arr[4];
j = &(valueat(100+ 4 elements of type int));
j = &(valueat(100+ 16)); ->
j = &(valueat(116)); ->
j = &(45) that is j = 116
Now when you do
k = (arr+4);
k = (starting address of arr + 4 elements of type int);
k = (100 + 16); that is k = 116, and that is why the output,
The two pointers are pointing at the same location
Hope this helps.
Array declaration:
int arr [ ]={34, 65, 23, 75, 76, 33};
Four notations: (consider i=0)
arr[i]
and
*(arr+i)
and
*(i+arr)
and
i[arr]
Let's take a look at how your array is laid out in memory:
low address        high address
|                             |
v                             v
+----+----+----+----+----+----+
| 34 | 65 | 23 | 75 | 76 | 33 |
+----+----+----+----+----+----+
  ^    ^    ^    ^
  |    |    |    ...etc
  |    |    |
  |    |    arr[2]
  |    |
  |    arr[1]
  |
  arr[0]
That the first element is arr[0] and the second is arr[1] is pretty clear; that's what everybody learns. What is less clear is that the compiler actually translates an expression such as arr[i] to *(arr + i).
What *(arr + i) does is first get a pointer to the first element, then do pointer arithmetic to get a pointer to the wanted element at index i, and then dereference the pointer to get its value.
Due to the commutative property of addition, the expression *(arr + i) is equal to *(i + arr) which due to the above mentioned translation is equal to i[arr].
The equivalence of arr[i] and *(arr + i) is also what's behind the decay of an array to a pointer to its first element.
The pointer to the array's first element would be &arr[0]. Now we know that arr[0] should be equal to *(arr + 0), which means &arr[0] has to be equal to &*(arr + 0). Adding zero to anything is a no-op, leading to the expression &*(arr). Parentheses with only one term and no operator can also be removed, leaving &*arr. And lastly, the address-of and dereference operators are each other's opposites and cancel each other out, leaving us with simply arr. So &arr[0] is equal to arr.
Each element in the array has a position in memory, and those positions are sequential. Arrays in C are not pointers, but in most expressions an array is converted to a pointer to its first element.
arr[i] => Gets the value at position i in the array. It is the same as *(arr + i).
*(arr+i) => Gets the value stored i elements past the address that arr converts to.
*(i+arr) => The same as *(arr+i), because the sum is commutative.
i[arr] => The same as *(i+arr); it is just another way of writing it.
They are the same because the C language specification says so. Read n1570
The notation a[i] is syntactic sugar for *(a+i).
The first one is mathematical syntax (closer to the notation people are taught), while the second one maps more directly onto the address computation the machine performs.
On the other hand *(a+i)=*(i+a)=i[a] because adding a pointer and an integer is commutative.
These are the same because of how the array subscript operator [] is defined.
From section 6.5.2.1 of the C standard:
2 A postfix expression followed by an expression in square brackets []
is a subscripted designation of an element of an array object. The
definition of the subscript operator [] is that E1[E2] is
identical to (*((E1)+(E2))). Because of the conversion rules that
apply to the binary + operator, if E1 is an array object
(equivalently, a pointer to the initial element of an array object)
and E2 is an integer, E1[E2] designates the E2-th element of
E1 (counting from zero).
The expression arr[i] in your example is of the form E1[E2]. Because the standard states that this is the same as *(E1+E2) that means that arr[i] is the same as *(arr + i).
Because of the commutative property of addition, *(arr + i) is the same as *(i + arr). Applying the equivalence rule above to this expression gives i[arr].
So in short, those 4 expressions are equivalent because of how the standard defines array subscripting and because of the commutative property of addition.
It works because an array variable in C (i.e. arr in your example), when used in an expression like this, is converted to a pointer to the beginning of the array's memory. A pointer is a number which represents the address of a specific memory location. When you put a '*' in front of a pointer, it means "give me the data in that memory location".
So, if arr is a pointer to the beginning of the array, *(arr) or *(arr + 0) is the data in the 0th index of the array, and *(arr + 1) is the data in the 1st index, and so on.
An expression which looks like A[B] essentially gets translated into something like *(A+B). So, arr[0] = *(arr + 0) and arr[i] = *(arr+i), etc.
And because A+B = B+A, the two are interchangeable. Meaning *(arr+i) = *(i+arr).
And because arr[i] = *(arr+i) and *(arr+i) = *(i+arr), it should make sense that arr[i] = i[arr].