Why string literals are comparable with pointers? [duplicate] - c

This question already has answers here:
C Strings Comparison with Equal Sign
(5 answers)
Closed 8 years ago.
If we say:
char *p="name";
then how can we do
if(p=="name"){
printf("able");//this if condition is true but why?
}
Since "name" here is a string literal and p is a pointer that holds the base address of the string, why does the above statement work?

It is unspecified behavior whether identical string literals can be considered the same and thus have the same address. So this is not portable behavior. From the draft C99 standard section 6.4.5 String literals:
It is unspecified whether these arrays are distinct provided their elements have the
appropriate values. [...]
If you want to compare two strings, you should use strcmp.

The C Standard allows the comparison to be true, but it is also allowed to be false (the behavior is unspecified). It depends on the compiler performing common string merging (for which gcc has an option to turn it on or off).
From the gcc 4.8.1 manual:
-fmerge-constants
Attempt to merge identical constants (string constants and floating-point constants)
across compilation units.
This option is the default for optimized compilation if the assembler and linker support it.
Use -fno-merge-constants to inhibit this behavior.
Enabled at levels -O, -O2, -O3, -Os.
-fmerge-all-constants
Attempt to merge identical constants and identical variables.
So what you observe is a compiler performing string merging for the two "name" literals.
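A small sketch for checking what a particular toolchain does with two identical literals. The function name literals_share_storage is made up for illustration, and either return value is conforming, so nothing should be built on the result.

```c
#include <stdio.h>

/* Returns 1 if this build gave the two identical literals one address,
   0 otherwise. The standard allows both outcomes. */
int literals_share_storage(void)
{
    const char *a = "name";
    const char *b = "name";

    /* %p expects a void pointer, hence the casts */
    printf("a = %p\nb = %p\n", (void *)a, (void *)b);
    return a == b;
}
```

Building this with different optimization levels, or with and without -fno-merge-constants, may or may not change the printed addresses; that is worth checking on your own compiler rather than assuming.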

Ok, let's dissect what this does.
p is a pointer to name\0. So, here you are comparing p (a pointer) to "name" (also a pointer). Well, the only way this will ever be true is if somewhere you have p="name", and even then, "name" is not guaranteed to point to the same place everywhere.
I believe what you are actually looking for is either strcmp, to compare the entire string, or if (*p == 'n'), to compare the first character of the p string to the n character.
You want to use strcmp() == 0 to compare strings instead of a simple ==, which will just compare if the pointers are the same.
The expression strcmp( p, "name") == 0 will check if the contents of the two strings are the same.
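The three kinds of comparison mentioned above can be put side by side. This is only a sketch; the helper names same_address, starts_with_n and same_contents are invented here and are not library functions.

```c
#include <string.h>

/* Address comparison: true only if p and q point at the very same array. */
int same_address(const char *p, const char *q) { return p == q; }

/* First-character comparison, as in *p == 'n'. */
int starts_with_n(const char *p) { return *p == 'n'; }

/* Content comparison: true when every character matches. */
int same_contents(const char *p, const char *q) { return strcmp(p, q) == 0; }
```

Given two separate arrays char a[] = "name"; and char b[] = "name";, same_address(a, b) is 0 because the arrays are distinct objects, while same_contents(a, b) is 1.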

String literals are nothing but memory references, so when you write char *p="name"; it means nothing but:
| n | a | m | e | \0 |
 /
p
p points to first character of the string literal. So doing this:
p=="name" evaluates to someAddress==someAddress. However this behavior is unspecified.

Related

how can a read-only string literal be used as a pointer?

In C one can do this
printf("%c", *("hello there"+7));
which prints h
How can a read-only string literal like "hello there" be used almost like a pointer? How does this work exactly?
Using 'anonymous' string literals can be fun.
It's common to express dates with the appropriate ordinal suffix. (Eg "1st of May" or "25th of December".)
The following 'collapses' the 'Day of Month' value (1-31) down to values 0-3, then uses that value to index into a "segmented" string literal. This works!
// Folding DoM 'down' to use a compact array of suffixes.
i = DoM;
if( i > 20 ) i %= 10; // Change 21-99 to 0-9.
if( i > 3 ) i = 0; // Every other one ends with "th"
// 0 1 2 3
suffix = &"th\0st\0nd\0rd"[ i * 3 ]; // Acknowledge 3byte regions.
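Wrapped in a function, the fragment above might look like the following sketch. The name ordinal_suffix and the assumption that the argument is a valid day of month (1 to 31) are mine; a negative argument would index before the start of the literal.

```c
/* Returns "st", "nd", "rd" or "th" for a day of month in 1..31. */
const char *ordinal_suffix(int dom)
{
    int i = dom;
    if (i > 20) i %= 10;   /* 21-31 fold down to 0-9 */
    if (i > 3)  i = 0;     /* everything else takes "th" */
    /* four 3-byte segments: "th\0" "st\0" "nd\0" "rd\0" */
    return &"th\0st\0nd\0rd"[i * 3];
}
```

The returned pointer refers into the middle of the literal, and the embedded \0 bytes make each segment a complete string of its own, so the result can be passed straight to printf with %s.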
A string literal is a character array (char[]) and is thus implicitly converted to a char pointer (char *) to the first element of the array.
Thus, in the example in the question ("hello there"+7), 7 is added to a pointer to the first character (h) giving a pointer to the 7th character (counting zero based) which also happens to be a h (the "h" in "there").
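To make the arithmetic concrete, here are three spellings of the same access. The function names are invented for this sketch.

```c
/* Pointer arithmetic directly on the literal, as in the question. */
char via_pointer_arithmetic(void) { return *("hello there" + 7); }

/* Array indexing, which means exactly the same thing. */
char via_indexing(void) { return "hello there"[7]; }

/* The same steps written out with a named pointer. */
char via_named_pointer(void)
{
    const char *p = "hello there"; /* the array decays to a pointer to 'h' */
    p += 7;                        /* now points at the 'h' in "there" */
    return *p;
}
```

All three return 'h', the character at index 7.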
Notice that the pointer is to char, not const char. However, it is important to know that writing to the location pointed at by a string literal is undefined behavior, which means the standard places no requirements on what happens. Depending on the compiler implementation, it may be impossible (the string literal may be stored in read-only memory), it may have unforeseen side effects, it may change the string literal without any side effects, or ... basically anything.
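When the text does need to change, one safe route is to edit a copy held in an ordinary array. A minimal sketch, with edit_copy as a made-up name:

```c
/* Edits a writable copy of the text and returns its new first character.
   Writing through a pointer to the literal itself, e.g.
       char *p = "hello there"; p[0] = 'J';
   would be undefined behavior and is deliberately not done here. */
char edit_copy(void)
{
    char copy[] = "hello there"; /* array initialized from the literal */
    copy[0] = 'J';               /* fine: copy is an ordinary array */
    return copy[0];
}
```

Declaring pointers to literals as const char * is a common habit, since the compiler can then reject accidental writes.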
It is allowed for two identical or overlapping string literals such as "hello there" and "there" to share the same memory location. Hence, the following expressions may be either true or false depending on the compiler implementation:
"hello" == "hello"
"hello there" + 6 == "there"
Once you know how it is stored, you will understand.
String constants are typically stored in the .rodata section, separate from the code, which is stored in the .text section. So when the program is running, it needs to know the address of those string constants in order to use them. Because strings and arrays vary in length, there is no simple way to pass them by value (integers and floats can be stored in and passed by registers), so strings are accessed through pointers, the same as arrays.
In fact, values that cannot be encoded directly in instructions are all stored in sections such as .data and .rodata.

Confusion with string pointers [duplicate]

This question already has answers here:
Switch case expression
(3 answers)
Closed 5 years ago.
#include<stdio.h>
int main()
{
switch(*(1+"AB" "CD"+1))
{
case 'A':printf("A is here");
break;
case 'B':printf("B is here");
break;
case 'C':printf("C is here");
break;
case 'D':printf("D is here");
break;
}
}
The output is: C is here.
Can anyone explain this to me? It's confusing me.
First of all, string literals only separated by white-space (and comments) are concatenated into single strings. This happens before expression parsing (see e.g. this translation phase reference for more information). This means that the expression *(1+"AB" "CD"+1) is really parsed as *(1+"ABCD"+1).
The second thing to remember is that string literals like "ABCD" are really read-only arrays, and as such one can use normal array-indexing with them, or letting them decay to pointers to their first element.
The third thing is that for any array or pointer p and index i, the expression *(p + i) is equal to p[i]. That means *(1+"ABCD"+1) (which is really the same as *("ABCD"+2)) is the same as "ABCD"[2]. Which gives you the third character in the string. And the third character in the string is 'C'.
In C, adjacent string literals, such as "AB" "CD", are concatenated. (This is a convenience that allows long strings to be easily broken up over multiple lines and enables certain features such as macros like PRIx64 in <inttypes.h> to work.) The result is "ABCD".
A string literal is an array of characters. In most circumstances, an array is automatically converted to a pointer to its first element. (The exceptions are in contexts where you want the actual array, such as applying sizeof.) So "ABCD" becomes a pointer to the A character.
When one is added to a pointer (to an element in an array), the result points to the next element in the array. So 1+"ABCD" points to the B. And 1+"ABCD"+1 points to the C.
Then the * operator produces the object the pointer points to, so *(1+"ABCD"+1) is the C character, whose value is 'C'.
Here, switch(*(1+"AB" "CD"+1)) is evaluated like switch(*(2+"ABCD")). 2+"ABCD" points to the character C, so *(2+"ABCD") is 'C'. That's why the output of your code is C is here.
*(anything) dereferences the pointer inside the parentheses; here that pointer points into the string literal.
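The two rules at work, concatenation of adjacent literals and array-to-pointer conversion, can be checked with a short sketch. The function names third_char and joined_size are invented for illustration.

```c
#include <stddef.h>

/* Adjacent literals are joined before parsing, so this is "ABCD"[2]. */
char third_char(void) { return *(1 + "AB" "CD" + 1); }

/* sizeof is one of the contexts with no pointer conversion: it sees the
   whole array, terminating null character included. */
size_t joined_size(void) { return sizeof("AB" "CD"); }
```

third_char() returns 'C', and joined_size() returns 5: four letters plus the terminator.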

C - Comparison of Arrays with == works, why? [duplicate]

This question already has answers here:
C optimisation of string literals
(2 answers)
Closed 7 years ago.
I'm a little bit confused. I have the following function:
int comp(char s1[], char s2[]) {
return s1 == s2;
}
As far as I know this compares only the addresses of the first elements from char array s1 and char array s2.
But the strange thing is that if I compare (in Visual Studio) two equal char arrays like
comp("test","test");
I get 1 (true) instead of 0 (false). Shouldn't the addresses be different, so that the result is always 0?
I'd say this is the result of a compiler optimisation using the same instance of the string. If you did something like this you'd prove == doesn't work as you suggest:
char s1[10];
char s2[10];
strcpy(s1, "test");
strcpy(s2, "test");
printf("%d\n", comp(s1, s2));
This is because identical strings are stored as one string in the string pool during compilation. Therefore both point to the same address, as there is only one "test" string in the string pool.
String literals are often reused by optimizing compilers, so if you use the same string literal twice, both uses may refer to exactly the same array. Your function compares pointers, and since both string literals are the same array, you are comparing the same pointer with itself, which of course gives you a "true" value.
Read about the concept of mutable and immutable strings. Pointers compare equal only when they point to the same object, so a pointer to a string on the stack and a pointer to a string on the heap compare unequal even when the contents match.
The second question is whether you are using a predefined function to compare the two strings; such a function works as follows:
Compare(s1,s2) returns a negative value, zero, or a positive value according to whether s1 precedes, equals, or follows s2 in lexicographical ordering based on character codes.
Regards.

Identical string literals are considered equal? [duplicate]

This question already has answers here:
Why is "a" != "a" in C?
(11 answers)
Closed 8 years ago.
I have written the following program:
#include <stdio.h>
int main(void)
{
if("ddd" == "ddd")
printf("equal");
else
printf("not equal");
}
The output is "equal", but according to me, the output should be "not equal" because the string literals are stored in the literal pool or some read only memory (I guess it depends on OS), so both strings should have two different addresses as they are stored at different addresses in memory.
Previously, I have done the same type of example (one year back), and that time the output was "not equal". Now, could anyone tell me, is this due to a change in the C standard, or am I missing something?
It's unspecified whether string literals with the same content have the same address. So the output of your program could be equal or it could be not equal; your compiler happens to put them in the same place.
C11 6.4.5 String literals
It is unspecified whether these arrays are distinct provided their elements have the
appropriate values.
Of course, what you do in that condition is a comparison between pointers (use strcmp to compare C strings).
So, I think it is a compiler translation/optimization that "maps" identical literals at the same location in the memory.
EDIT 1:
The following sample confirms what I wrote:
#include <stdio.h>
char* a = "ddd";
char* b = "ddd";
char* c = "ddd";
int main() {
printf ("a => %p\nb => %p\nc => %p\n", (void *)a, (void *)b, (void *)c);
}
The previous program, compiled with gcc using -O0 and executed will print:
a => 0x40060c
b => 0x40060c
c => 0x40060c
I don't know how other compilers will treat the same situation.
When you compare two character values (which are not pointers), it is a numeric comparison.
But when you compare two strings, the base addresses of the strings are compared. If the compiler places both strings at the same location, the output is equal; otherwise it is not equal.
If the two strings are stored at different locations, you are comparing two different memory addresses, so the result is not equal.
Even though this is read-only memory, you are using it only for comparison; you are not modifying or writing anything.

C Strings Comparison with Equal Sign

I have this code:
char *name = "George";
if(name == "George")
printf("It's George");
I thought that C strings could not be compared with the == sign and that I have to use strcmp. For some unknown reason, when I compile with gcc (version 4.7.3) this code works. I thought that this was wrong because it is like comparing pointers, so I searched on Google, and many people say that it's wrong and that comparing with == can't be done. So why does this comparison method work?
I thought that C strings could not be compared with the == sign and that I have to use strcmp
Right.
I thought that this was wrong because it is like comparing pointers, so I searched on Google, and many people say that it's wrong and that comparing with == can't be done
That's right too.
So why does this comparison method work?
It doesn't "work". It only appears to be working.
The reason why this happens is probably a compiler optimization: the two string literals are identical, so the compiler really generates only one instance of them, and uses that very same pointer/array whenever the string literal is referenced.
Just to provide a reference to @H2CO3's answer:
C11 6.4.5 String literals
It is unspecified whether these arrays are distinct provided their elements have the
appropriate values. If the program attempts to modify such an array, the behavior is
undefined.
This means that in your example, name (which points to the string literal "George") and "George" may or may not share the same location; it's up to the implementation. So don't count on this; it may behave differently on other machines.
The comparison you have done compares the location of the two strings, rather than their content. It just so happens that your compiler decided to only create one string literal containing the characters "George". This means that the location of the string stored in name and the location of the second "George" are the same, so the comparison returns non-zero.
The compiler is not required to do this, however - it could just as easily create two different string literals, with different locations but the same content, and the comparison would then return zero.
This can fail, since you are comparing two pointers to what may be two separate strings.
If this code still works, it is the result of an optimization by GCC, which keeps only one copy of the string to reduce size.
Use strcmp().
If you compare two strings with ==, you are comparing the base addresses of those strings, not the actual characters in them. To compare strings, use the strcmp() and strcasecmp() library functions, or write a function like this. The code below is not a full program, just the logic required for string comparison.
int mystrcmp(const char *s1, const char *s2)
{
    /* advance while the characters match and s1 has not ended */
    while (*s1 != '\0' && *s1 == *s2) {
        s1++;
        s2++;
    }
    /* negative, zero, or positive, like strcmp() */
    return (unsigned char)*s1 - (unsigned char)*s2;
}
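Note that strcasecmp() comes from POSIX (declared in <strings.h>) rather than ISO C, so it may be missing on some platforms. A portable stand-in could be sketched as follows; the name my_strcasecmp is hypothetical, and the case folding relies on tolower() in the current locale.

```c
#include <ctype.h>

/* Case-insensitive comparison in the spirit of strcasecmp(); returns a
   negative, zero, or positive value. */
int my_strcasecmp(const char *s1, const char *s2)
{
    for (;; s1++, s2++) {
        /* cast first: tolower() needs a value representable as unsigned char */
        int c1 = tolower((unsigned char)*s1);
        int c2 = tolower((unsigned char)*s2);
        if (c1 != c2 || c1 == '\0')
            return c1 - c2;
    }
}
```

With this, my_strcasecmp("George", "george") returns 0, whereas strcmp on the same arguments would not.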
