Confusion with string pointers [duplicate] - c

#include<stdio.h>
int main()
{
switch(*(1+"AB" "CD"+1))
{
case 'A':printf("A is here");
break;
case 'B':printf("B is here");
break;
case 'C':printf("C is here");
break;
case 'D':printf("D is here");
break;
}
}
The output is: C is here.
Can anyone explain this to me? It's confusing me.

First of all, string literals only separated by white-space (and comments) are concatenated into single strings. This happens before expression parsing (see e.g. this translation phase reference for more information). This means that the expression *(1+"AB" "CD"+1) is really parsed as *(1+"ABCD"+1).
The second thing to remember is that string literals like "ABCD" are really read-only arrays, and as such one can use normal array-indexing with them, or let them decay to pointers to their first element.
The third thing is that for any array or pointer p and index i, the expression *(p + i) is equal to p[i]. That means *(1+"ABCD"+1) (which is really the same as *("ABCD"+2)) is the same as "ABCD"[2]. Which gives you the third character in the string. And the third character in the string is 'C'.
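The three steps above can be checked with a minimal sketch. The function names question_expression and indexed_form are hypothetical, chosen only for this illustration:

```c
/* Hypothetical helper: evaluates the expression from the question.
   "AB" "CD" is concatenated to "ABCD" before the expression is parsed,
   so this is *("ABCD" + 2). */
char question_expression(void)
{
    return *(1 + "AB" "CD" + 1);
}

/* The same character, written with ordinary array indexing. */
char indexed_form(void)
{
    return "ABCD"[2];
}
```

Both functions return 'C', which is why the switch selects the 'C' case.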

In C, adjacent string literals, such as "AB" "CD", are concatenated. (This is a convenience that allows long strings to be easily broken up over multiple lines and enables certain features such as macros like PRIx64 in <inttypes.h> to work.) The result is "ABCD".
A string literal is an array of characters. In most circumstances, an array is automatically converted to a pointer to its first element. (The exceptions are in contexts where you want the actual array, such as applying sizeof.) So "ABCD" becomes a pointer to the A character.
When one is added to a pointer (to an element in an array), the result points to the next element in the array. So 1+"ABCD" points to the B. And 1+"ABCD"+1 points to the C.
Then the * operator produces the object the pointer points to, so *(1+"ABCD"+1) is the C character, whose value is 'C'.
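Two points from this answer, the sizeof exception and the PRIx64 use of concatenation, can be sketched as follows. The helper names literal_size and format_hex are assumptions made for the example, not part of any library:

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdio.h>

/* sizeof is one of the contexts where the array is not converted to a
   pointer: "ABCD" has type char[5] (four letters plus the null). */
size_t literal_size(void)
{
    return sizeof "ABCD";
}

/* PRIx64 expands to a string literal, so it is concatenated with its
   neighbours into a single format string. */
int format_hex(char *buf, size_t len, uint64_t value)
{
    return snprintf(buf, len, "value = 0x%" PRIx64 "\n", value);
}
```

Calling format_hex with the value 255 should leave "value = 0xff" followed by a newline in the buffer.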

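Here, switch(*(1+"AB" "CD"+1)) is evaluated like switch(*(2+"ABCD")). 2+"ABCD" points to the character C, and * dereferences that pointer to give 'C'. That's why the output of your code is C is here.
The string literal decays to a pointer to its first character, and * dereferences the pointer that results from the addition.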

Related

how can a read-only string literal be used as a pointer?

In C one can do this
printf("%c", *("hello there"+7));
which prints h
How can a read-only string literal like "hello there" be used almost like a pointer? How does this work exactly?
Using 'anonymous' string literals can be fun.
It's common to express dates with the appropriate ordinal suffix. (Eg "1st of May" or "25th of December".)
The following 'collapses' the 'Day of Month' value (1-31) down to values 0-3, then uses that value to index into a "segmented" string literal. This works!
// Folding DoM 'down' to use a compact array of suffixes.
i = DoM;
if( i > 20 ) i %= 10; // Change 21-99 to 0-9.
if( i > 3 ) i = 0; // Every other one ends with "th"
// 0 1 2 3
suffix = &"th\0st\0nd\0rd"[ i * 3 ]; // Acknowledge 3byte regions.
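As a sketch, the snippet above can be wrapped in a function. The name ordinal_suffix and the assumption that the day of month lies in 1..31 are mine, not the original author's:

```c
/* Hypothetical wrapper around the snippet above; assumes dom is 1..31. */
const char *ordinal_suffix(int dom)
{
    int i = dom;
    if (i > 20) i %= 10;  /* 21..31 become 1..9, 0, 1 */
    if (i > 3)  i = 0;    /* everything else takes "th" */
    /* Each suffix occupies 3 bytes: two letters plus '\0'. */
    return &"th\0st\0nd\0rd"[i * 3];
}
```

With this, printf("%d%s", 22, ordinal_suffix(22)) would print 22nd, while 11, 12 and 13 all get "th" because they fall in the 4..20 range.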
A string literal is a character array (char[]) and is thus implicitly converted to a char pointer (char *) to the first element of the array.
Thus, in the example in the question ("hello there"+7), 7 is added to a pointer to the first character (h) giving a pointer to the 7th character (counting zero based) which also happens to be a h (the "h" in "there").
Notice that the pointer is to char, not const char. However, it is important to know that writing to the location pointed at by a string literal is undefined behavior, which means the C standard places no requirements on what happens. Depending on the compiler implementation, it may be impossible (the string literal may be stored in read-only memory), it may have unforeseen side effects, it may change the string literal without any side effects or ... basically anything.
It is allowed for two identical or overlapping string literals such as "hello there" and "there" to share the same memory location. Hence, the following expressions may be either true or false depending on the compiler implementation:
"hello" == "hello"
"hello there" + 6 == "there"
Once you know how it is stored, you will understand.
String constants are typically stored in the .rodata section, separate from the code, which is stored in the .text section. So when the program is running, it needs to know the address of those string constants in order to use them. Strings and arrays vary in length, so unlike integers and floats they cannot simply be stored in and passed by register; thus strings are accessed through pointers, the same as arrays.
In fact, values that cannot be encoded directly in instructions are all stored in sections such as .data and .rodata.
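Because identical or overlapping literals may or may not share storage, comparing their addresses is unreliable, while comparing their contents is not. A minimal sketch, with hypothetical function names:

```c
#include <string.h>

/* Compares addresses: the result depends on whether the compiler
   merges identical literals, so it may be 0 or 1. */
int same_address(void)
{
    const char *a = "hello";
    const char *b = "hello";
    return a == b;
}

/* Compares contents: 1 whenever the text is equal. */
int same_text(void)
{
    return strcmp("hello there" + 6, "there") == 0;
}
```

Code that needs to know whether two strings hold the same text should use strcmp rather than ==.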

How char array behaves for longer strings?

I asked this question as one of multiple questions here. But people asked me to ask them separately. Hence this question.
Consider below code lines:
char a[5] = "geeks"; //1
char a3[] = {'g','e','e','k','s'}; //d
printf("a:%s,%u\n",a,sizeof(a)); //5
printf("a3:%s,%u\n",a3,sizeof(a3)); //j
printf("a[5]:%d,%c\n",a[5],a[5]);
printf("a3[5]:%d,%c\n",a3[5],a3[5]);
Output:
a:geeksV,5
a3:geeks,5
a[5]:86,V
a3[5]:127,
However the output in original question was:
a:geeks,5
a3:geeksV,5
The question 1 in original question was:
Does line #1 add \0? Notice that sizeof prints 5 in line #5, indicating \0 is not there. But then, why does #5 not print something like geeksU as in the case of line #j? I feel \0 does indeed get added in line #1, but is not counted by sizeof, while it is considered by printf. Am I right about this?
Realizing that the output changed (for the same online compiler) when I took out only those code lines related to the first question in the original question, I now doubt what's going on here. I believe this is undefined behavior by the C standard. Can someone shed more light? Possibly for another compiler?
Sorry again for asking 2nd question.
char a[5] = "geeks"; //1
Here, you specify the array's size as '5', and initialize it with 5 characters.
Therefore, you do not have a "C string", which by definition is terminated by a NUL (0).
printf("a:%s,%u\n",a,sizeof(a)); //5
The array itself still has a size of 5, which is correctly reported by the sizeof operator, but your call to printf is undefined behaviour and could print anything after the array's contents. In practice it will typically keep looking at the next address until it finds a 0 somewhere. That could be immediately, or it could print 1000000 garbage characters, or it could cause some sort of segfault or other crash.
char a3[] = {'g','e','e','k','s'}; //d
Because you don't specify the array's size, the compiler will, through the initialization syntax, determine the size of the array. However, the way you chose to initialize a3, it will still only provide 5 bytes of length.
The reason for that is that your initialization just is an initialization list, and not a "string". Therefore, your subsequent call to printf also is undefined behaviour, and it is just luck that at the position a3[5] there seems to be a 0 in your case.
Effectively, both examples have the very same error.
You could have it different thus:
char a3[] = "geeks";
Using a string literal for initialization of the array with unspecified size will cause the compiler to allocate enough memory to hold the string and the additional NUL-terminator, and sizeof (a3) will now yield 6.
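The difference in size, and one way to print an unterminated array safely, can be sketched like this. The function names are hypothetical, and the use of a "%.*s" precision to bound the read is my addition rather than something proposed in the answer:

```c
#include <stddef.h>
#include <stdio.h>

/* Initialised from a string literal with no explicit size:
   the terminating null is included, so sizeof yields 6. */
size_t size_with_terminator(void)
{
    char s[] = "geeks";
    return sizeof s;
}

/* An unterminated array can still be printed if the precision
   limits how many characters are read. */
int print_bounded(char *out, size_t outlen)
{
    char a3[] = {'g', 'e', 'e', 'k', 's'};  /* 5 bytes, no '\0' */
    return snprintf(out, outlen, "%.*s", (int)sizeof a3, a3);
}
```

The precision stops the read after five characters, so no byte beyond the array is touched.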
"geeks" here is a string literal in C.
When you write "geeks", the compiler automatically adds the null character to the end. This makes it 6 characters long.
But you are using it to initialize char a[5], which has room for only 5 of them. In C this is allowed; the undefined behaviour comes later, when the unterminated array is printed with %s.
As mentioned by @DavidBowling, in this case the following condition applies
(Section 6.7.8.14) C99 standard.
An array of character type may be initialized by a character string literal, optionally enclosed in braces. Successive characters of the character string literal (including the terminating null character if there is room or if the array is of unknown size) initialize the elements of the array
The characters of "geeks" will be copied into the array 'a', but the null character will not be copied.
So in this case when you try to print the array, it will continue printing until it encounters a \0 in the memory.
From the further print statements it is seen that a[5] has the value V. Presumably the next byte on your system is \0 and the array print stops.
So, in your system, at that instance, "geeksV" is printed.

in C can i do write(&'\n', 1) where write(char[], int)

Can I do
write(&'\n', 1);
and is it equivalent to
char a = '\n';
write(&a, 1);
How would you solve this elegantly?
I'm trying to write the new-line character with a function that only takes a char array as its first argument and its size as the second argument (the size has to be specified because \0 is a valid writable character).
As others already pointed out, you cannot take the address of a character literal.
And even if you could, it would be the wrong type: in C a character constant has type int, so its address would be an int * rather than a char *.
What you are looking for is:
write("\n", 1);
The reason that you can't take the address of a character literal is that the compiler actually replaces it with the character's numeric value (10 for '\n' in ASCII). So your code actually is:
write(&10, 1);
And you can't take the address of a number literal, because it's most likely not stored in memory but part of the actual generated code.
You can do this:
write("\n", 1);
What you're asking is for two different things:
'\n' is a character literal, and will be treated as an integer literal.
char a is a character variable.
you can get the address of a variable, not a literal.
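The options discussed in these answers can be put side by side in a short sketch. write_chars is a hypothetical stand-in for the asker's write(char[], int), implemented here with fwrite only so the example is self-contained. The last variant uses a C99 compound literal, which is my addition and not something the answers mention:

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the asker's write(char[], int):
   sends n bytes to stdout and returns the number written. */
size_t write_chars(const char buf[], size_t n)
{
    return fwrite(buf, 1, n, stdout);
}

/* A named variable has an address. */
size_t newline_via_variable(void)
{
    char a = '\n';
    return write_chars(&a, 1);
}

/* A string literal is already an array of char. */
size_t newline_via_string_literal(void)
{
    return write_chars("\n", 1);
}

/* C99 compound literal: an unnamed char object whose address can be taken. */
size_t newline_via_compound_literal(void)
{
    return write_chars(&(char){'\n'}, 1);
}
```

All three pass a valid pointer to one char, whereas &'\n' does not compile.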

What does the line starting with double quote mean in C?

I was asked in one of the interviews, what does the following line print in C? In my opinion following line has no meaning:
"a"[3<<1];
Does anyone know the answer?
Surprisingly, it does have a meaning: it's an indexing into an array of characters that represent a string literal. Incidentally, this particular one indexes at 6, which is outside the limits of the literal, and is therefore undefined behavior.
You can construct an expression that works following the same basic pattern:
char c = "quick brown fox"[3 << 1];
will have the same effect as
char c = 'b';
Think of this:
"Hello world"[0]
is 'H'
"Hello world" is a string literal. A string literal is an array of char and is converted to a pointer to the first element of the array in an expression. "Hello world"[0] means the first element of the array.
It does have meaning. Hint: a[b] means exactly the same as *(a+b). (I don't think this is a great interview question, though.)
"a" is an array of 2 characters, 'a', and 0. 3 << 1 is 3*2 = 6, so it's trying to access the 7th element of a 2-element array. That is undefined behavior.
(Also, the code doesn't print anything, even if the undefined behavior is removed, since no printing functions are called.)
"some_string"[i] returns the character at index i of the given string. 3<<1 is 6. So "a"[3<<1] tries to return the character at index 6 of the string "a".
In other words the code invokes undefined behavior (and thus, in a sense, really does have no meaning) because it's accessing a char array out of bounds.
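An in-bounds version of the same pattern, using the "quick brown fox" literal from the answer above, can be sketched as follows. The function names are hypothetical:

```c
/* 3 << 1 is 6, and index 6 of "quick brown fox" is 'b'. */
char in_bounds_index(void)
{
    return "quick brown fox"[3 << 1];
}

/* a[b] means exactly *(a + b)... */
char pointer_form(void)
{
    return *("quick brown fox" + (3 << 1));
}

/* ...and since addition commutes, the operands can even be swapped. */
char swapped_form(void)
{
    return (3 << 1)["quick brown fox"];
}
```

All three return 'b'. Replacing the literal with "a" would make each of them read past the end of a 2-element array, which is the undefined behavior described above.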

needed explanation for a c-program

In a c-book I bought, an exercise program is given as
what is the output for the following code snippet?
printf(3+"Welcome"+2);
the answer I got is me (by executing it in TC++)
But I can't get the actual mechanism.
please explain me the actual mechanism behind it.
It's called pointer arithmetic: 2+3=5, and "me" is the rest of the string starting at offset 5.
PS: throw away that book.
When this is compiled, the "Welcome" string literal is converted to a char * pointing to the first character of the string. In C, with character strings (like any pointer), you can do pointer arithmetic. This means pointer + 5 points 5 places beyond pointer.
Therefore ("Welcome" + 5) will point 5 characters past the "W", to the substring "me."
On a side note, as others have suggested, this doesn't sound like a good book.
A string (like "Welcome") is an array of characters terminated by the NUL-character (so it's actually "Welcome\0").
What you are doing is accessing the character at index 5 (3 + 2 = 5). This character is 'm' (array indices start at 0).
printf will continue to read till it hits the NUL-character.
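The mechanism can be checked with a small sketch. The function names are hypothetical, and the string is passed through "%s" instead of being used directly as the format, which is a safer habit than the book's printf(3+"Welcome"+2):

```c
#include <stdio.h>

/* 3 + "Welcome" + 2 points at index 5, the 'm'. */
const char *tail(void)
{
    return 3 + "Welcome" + 2;
}

/* Prints from that point up to the terminating null character. */
int print_tail(void)
{
    return printf("%s\n", tail());
}
```

print_tail prints me followed by a newline, matching the output the asker saw.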
