Understanding two ways of declaring a C string [duplicate] - c

This question already has answers here:
How to declare strings in C [duplicate]
(4 answers)
Closed 8 years ago.
A few weeks ago I started learning the programming language C. I have knowledge in web technologies like HMTL/CSS, Javscript, PHP, and basic server administration, but C is confusing me. To my understanding, the C language does not have a data type for strings, just characters, however I may be wrong.
I have heard there are two ways of declaring a string. What is the difference between these two lines of declaring a string:
a.) char stringName[];
b.) char *stringName;
I get that char stringName[]; is an array of characters. However, the second line confuses me. To my understanding the second line makes a pointer variable. Aren't pointer variables supposed to be the memory address of another variable?

In the C language, a "string" is, as you say, an array of char. Most string functions built into the C spec expect the string to be "NUL terminated", meaning the last char of the string is a 0. Not the code representing the numeral zero, but the actual value of 0.
For example, if you're platform uses ASCII, then the following "string" is "ABC":
char myString[4] = {65, 66, 67, 0};
When you use the char varName[] = "foo" syntax, you're allocating the string on the stack (or if its in a global space, you're allocating it globally, but not dynamically.)
Memory management in C is more manual than in many other langauges you may have experience with. In particular, there is the concept of a "pointer".
char *myString = "ABC"; /* Points to a string somewhere in memory, the compiler puts somewhere. */
Now, a char * is "an address that points to a char or char array". Notice the "or" in that statement, it is important for you, the programmer, to know what the case is.
It's important to also ensure that any string operations you perform don't exceed the amount of memory you've allocated to a pointer.
char myString[5];
strcpy(myString, "12345"); /* copy "12345" into myString.
* On no! I've forgot space for my nul terminator and
* have overwritten some memory I don't own. */
"12345" is actually 6 characters long (don't forget the 0 at the end), but I've only reserved 5 characters. This is what's called a "buffer overflow", and is the cause of many serious bugs.
The other difference between "[]" and "*", is that one is creating an array (as you guessed). The other one is not reserving any space (other than the space to hold the pointer itself.) That means that until you point it somewhere that you know is valid, the value of the pointer should not be used, for either reading or writing.
Another point (made by someone in the comment)
You cannot pass an array as a parameter to a function in C. When you try, it gets converted to a pointer automatically. This is why we pass around pointers to strings rather than the strings themselves

In C, a string is a sequence of character values followed by a 0-valued byte1 . All the library functions that deal with strings use the 0 terminator to identify the end of the string. Strings are stored as arrays of char, but not all arrays of char contain strings.
For example, the string "hello" is represented as the character sequence {'h', 'e', 'l', 'l', 'o', 0}2 To store the string, you need a 6-element array of char - 5 characters plus the 0 terminator:
char greeting[6] = "hello";
or
char greeting[] = "hello";
In the second case, the size of the array is computed from the size of the string used to initialize it (counting the 0 terminator). In both cases, you're creating a 6-element array of char and copying the contents of the string literal to it. Unless the array is declared at file scope (oustide of any function) or with the static keyword, it only exists for the duration of the block in which is was declared.
The string literal "hello" is also stored in a 6-element array of char, but it's stored in such a way that it is allocated when the program is loaded into memory and held until the program terminates3, and is visible throughout the program. When you write
char *greeting = "hello";
you are assigning the address of the first element of the array that contains the string literal to the pointer variable greeting.
As always, a picture is worth a thousand words. Here's a simple little program:
#include <string.h>
#include <stdio.h>
#include <ctype.h>
int main( void )
{
char greeting[] = "hello"; // greeting contains a *copy* of the string "hello";
// size is taken from the length of the string plus the
// 0 terminator
char *greetingPtr = "hello"; // greetingPtr contains the *address* of the
// string literal "hello"
printf( "size of greeting array: %zu\n", sizeof greeting );
printf( "length of greeting string: %zu\n", strlen( greeting ) );
printf( "size of greetingPtr variable: %zu\n", sizeof greetingPtr );
printf( "address of string literal \"hello\": %p\n", (void * ) "hello" );
printf( "address of greeting array: %p\n", (void * ) greeting );
printf( "address of greetingPtr: %p\n", (void * ) &greetingPtr );
printf( "content of greetingPtr: %p\n", (void * ) greetingPtr );
printf( "greeting: %s\n", greeting );
printf( "greetingPtr: %s\n", greetingPtr );
return 0;
}
And here's the output:
size of greeting array: 6
length of greeting string: 5
size of greetingPtr variable: 8
address of string literal "hello": 0x4007f8
address of greeting array: 0x7fff59079cf0
address of greetingPtr: 0x7fff59079ce8
content of greetingPtr: 0x4007f8
greeting: hello
greetingPtr: hello
Note the difference between sizeof and strlen - strlen counts all the characters up to (but not including) the 0 terminator.
So here's what things look like in memory:
Item Address 0x00 0x01 0x02 0x03
---- ------- ---- ---- ---- ----
"hello" 0x4007f8 'h' 'e' 'l' 'l'
0x4007fc 'o' 0x00 ??? ???
...
greetingPtr 0x7fff59079ce8 0x00 0x00 0x00 0x00
0x7fff59879cec 0x00 0x40 0x7f 0xf8
greeting 0x7fff59079cf0 'h' 'e' 'l' 'l'
0x7fff59079cf4 'o' 0x00 ??? ???
The string literal "hello" is stored at a vary low address (on my system, this corresponds to the .rodata section of the executable, which is for static, constant data). The variables greeting and greetingPtr are stored at much higher addresses, corresponding to the stack on my system. As you can see, greetingPtr stores the address of the string literal "hello", while greeting stores a copy of the string contents.
Here's where things can get kind of confusing. Let's look at the following print statements:
printf( "greeting: %s\n", greeting );
printf( "greetingPtr: %s\n", greetingPtr );
greeting is a 6-element array of char, and greetingPtr is a pointer to char, yet we're passing them both to printf in exactly the same way, and the string is being printed out correctly; how can that work?
Unless it is the operand of the sizeof or unary & operators, or is a string literal used to initialize another array in a declaration, an expression of type "N-element array of T" will be converted ("decay") to an expression of type "pointer to T", and the value of the expression will be the address of the first element of the array.
In the printf call, the expression greeting has type "6-element array of char"; since it isn't the operand of the sizeof or unary & operators, it is converted ("decays") to an expression of type "pointer to char" (char *), and the address of the first element is actually passed to printf. IOW, it behaves exactly like the greetingPtr expression in the next printf call4.
The %s conversion specifer tells printf that its corresponding argument has type char *, and that it it should print out the character values starting from that address until it sees the 0 terminator.
Hope that helps a bit.
1. Often referred to as the NUL terminator; this should not be confused with the NULL pointer constant, which is also 0-valued but used in a different context.
2. You'll also see the terminating 0-valued byte written as '\0'. The leading backslash "escapes" the value, so instead of being treated as the character '0' (ASCII 48), it's treated as the value 0 (ASCII 0)).
3. In practice, space is set aside for it in the generated binary file, often in a section marked read-only; attempting to modify the contents of a string literal invokes undefined behavior.
4. This is also why the declaration of greeting copies the string contents to the array, while the declaration of greetingPtr copies the address of the first element of the string. The string literal "hello" is also an array expression. In the first declaration, since it's being used to initialize another array in a declaration, the contents of the array are copied. In the second declaration, the target is a pointer, not an array, so the expression is converted from an array type to a pointer type, and the resulting pointer value is copied to the variable.

In C (and in C++), arrays and pointers are represented similarly; an array is represented by the address of the first element in the array (which is sufficient to gain access to the other elements, since elements are contiguous in memory within an array). This also means that an array does not, by itself, indicate where it ends, and thus you need some way of identifying the end of the array, either by passing around the length as a separate variable or by using some convention (such as that there is a sentinel value that is placed in the last position of the array to indicate the end of the array). For strings, the latter is the common convention, with '\0' (the NUL character) indicating the end of the string.

Related

Something I don't get about C strings

A few questions regarding C strings:
both char* and char[] are pointers?
I've learned about pointers and I can tell that char* is a pointer, but why is it automatically a string and not just a char pointer that points to 1 char; why can it hold strings?
Why, unlike other pointers, when you assign a new value to the char* pointer you are actually allocating new space in memory to store the new value and, unlike other pointers, you just replace the value stored in the memory address the pointer is pointing at?
A pointer is not a string.
A string is a constant object having type array of char and, also, it has the property that the last element of the array is the null character '\0' which, in turn, is an int value (converted to char type) having the integer value 0.
char* is a pointer, but char[] is not. The type char[] is not a "real" type, but an incomplete type. The C language is specified in such a way that, in the moment that you define a concrete variable (object) having array of char type, the size of the array is well determined in some way or another. Thus, none variable has type char[] because this is not a type (for a given object).
However, automatically every object having type array of N objects of type char is promoted to char *, that is, a pointer to char pointing to the initial object of the array.
On the other hand, this promotion is not always performed. For example, the operator sizeof() will give different results for char* than for an array of N chars. In the former case, the size of a pointer to char is given (which is in general the same amount for every pointer...), and in the last case gives you the value N, that is, the size of the array.
The behaviour is differente when you declare function arguments as char* and char[]. Since the function cannot know the size of the array, you can think of both declarations as equivalent.
Actually, you are right here: char * is a pointer to just 1 character object. However, it can be used to access strings, as I will explain you now: In the paragraph 1. I showed you that the strings are considered objects in memory having type array of N chars for some N. This value N is big enough to allow an ending null character (as all "string" is supposed to be in C).
So, what's the deal here?
The key point to understand this issues is the concept of object (in memory).
When you have a string or, more generally, an array of char, this means that you have figured out some manner to hold an array object in memory.
This object determines a portion of RAM memory that you can access safely, because C has assigned enough memory for it.
Thus, when you point to the first byte of this object with a char* variable, actually you have guaranteed access to all the adjacent elements to the "right" of that memory place, because those places are well defined by C as having the bytes of the array above.
Briefly: the adjacent (to the right) bytes of the byte pointed by a char* variable can be accessed, they are valid places to access, so the pointer can be "iterated" to walk through these bytes, up to the end of the string, without "risks", since all the bytes in an array are contiguous well defined positions in memory.
This is a complicated question, but it reveals that you are not understanding the relationship between pointers, arrays, and string literals in C.
A pointer is just a variable pointing to a position in memory.
A pòinter to char points to just 1 object having type char.
If the adjacent bytes of the pointed position correspond to an array of chars, they will be accessible by the pointer, so the pointer can "walk on" the memory bytes occupied by the array object.
A string literal is considered as an array of char object, which implictely add an ending byte with value 0 (the null character).
In any case, an array of T object has a well defined "size".
A string literal has an additional property: it's a constant object.
Try to fit and gather these concepts in your mind to figure out what's going on.
And ask me for clarification.
ADDITIONAL REMARKS:
Consider the following piece of code:
#include <stdio.h>
int main(void)
{
char *s1 = "not modifiable";
char s2[] = "modifiable";
printf("%s ---- %s\n\n", s1, s2);
printf("Size of array s2: %d\n\n", (int)sizeof(s2));
s2[1] = '0', s2[3] = s2[5] = '1', s2[4] = '7',
s2[6] = '4', s2[7] = '8', s2[9] = '3';
printf("New value of s2: %s\n\n",s2);
//s1[0] = 'X'; // Attempting to modify s1
}
In the definition and initialization of s1 we have the string literal "not modifiable", which has constant content and constant address. Its address is assigned to the pointer s1 as initialization.
Any attempt to modify the bytes of the string will give some kind of error, because the array content is read-only.
In the definition and initialization of s2, we have the string literal "modifiable", which has, again, constant content and constant address. However, what happens now is that, as part of the initialization, the content of the string is copied to the array of char s2. The size of the array s2 is not specified (the declaration char s2[] gives an incomplete type), but after initialization the size of the array is well determined and defined as the exact size of the copied string (plus 1 character used to hold the null character, or end-of-string mark).
So, the string literal "modifiable" is used to initialize the bytes of the array s2, which is modifiable.
The right manner to do that is by changing a character at the time.
For more handy ways of modifying and assigning strings, it has to be used the standard header <string.h>.
char *s is a pointer, char s[] is an array of characters. Ex.
char *s = "hello";
char c[] = "world";
s = c; //Legal
c = address of some other string //Illegal
char *s is not a string; it points to an address. Ex
char c[] = "hello";
char *s = &c[3];
Assigning a pointer is not creating memory; you are pointing to memory. Ex.
char *s = "hello";
In this example when you type "hello" you are creating special memory to hold the string "hello" but that has nothing to do with the pointer, the pointer simply points to that spot.

I am pretty curious about char pointers

I have learned that char *p means "a pointer to char type"
and also I think I've also learned that char means to read
that amount of memory once that pointer reaches its' destination.
so conclusively, in
char *p = "hello World";
the p contains the strings' address and
the p is pointing right at it
Qusetions.
if the p points to the string, shouldn't it be reading only the 'h'???
since it only reads the sizeof a char?
why does `printf("%s", p) print the whole string???
I also learned in Rithcie's book that pointer variables don't possess a data type.
is that true???
So your string "hello world" occupies some memory.
[h][e][l][l][o][ ][w][o][r][l][d][\0]
[0][1][2][3][4][5][6][7][8][9][A][B ]
The pointer p does, in fact, only point to the first byte. In this case, byte 0. But if you use it in the context of
printf("%s", p)
Then printf knows to print until it gets the null character \0, which is why it will print the entire string and not just 'h'.
As for the second question, pointers do posses a data type. If you were to say it outloud, the name would probably be something like type "pointer to a character" in the case of p, or "pointer to an integer" for int *i.
Pointer variables don't hold a data type, they hold an address. But you use a data type so you know how many bytes you'll advance on each step, when reading from memory using that pointer.
When you call printf, the %s in the expression is telling the function to start reading at the address indicated by *p (which does hold the byte value for 'h', as you said), and stop reading when it reaches a terminating character. That's a character that has no visual representation (you refer to it as \0 in code). It tells the program where a string ends.
Here *p is a pointer to some location in memory, that it assumes to be 1 byte (or char). So it points to the 'h' letter. So p[0] or *(p+0) will give you p. But, your string ends with invisible \0 character, so when you use printf function it outputs all symbols, starting from the one, where *p points to and till `\0'.
And pointer is just a variable, that is able to hold some address (4, 8 or more bytes).
For question:
I also learned in Rithcie's book that pointer variables don't possess a data type. is that true???
Simply put, YES.
Data types in C are used to define a variable before its use. The definition of a variable will assign storage for the variable and define the type of data that will be held in the location.
C has the following basic built-in datatypes.int,float,double,char.
Quoting from C Data types
Pointer is derived data type, each data type can have a pointer associated with. Pointers don't have a keyword, but are marked by a preceding * in the variable and function declaration/definition. Most compilers supplies the predefined constant NULL, which is equivalent to 0.
if the p points to the string, shouldn't it be reading only the 'h'???
since it only reads the sizeof a char? why does printf("%s", *p) print
the whole string???
change your printf("%s", *p) to printf("%c", *p) you see what you want. both calls the printf in different ways on the basis of format specifier i.e string(%s) or char(%c).
to print the string use printf("%s", p);
to print the char use printf("%c", *p);
Second Ans. : Pointers possesses a data type. that's why you used char *p .

Passing a string literal as a function parameter defined as a pointer

I am reading the chapter on arrays and pointers in Kernighan and Richie's The C Programming Language.
They give the example:
/* strlen:  return length of string s */
int strlen(char *s)
{
    int n;
    for (n = 0; *s != '\0'; s++)
        n++;
    return n;
}
And then say:
“Since s is a pointer, incrementing it is perfectly legal; s++ has no effect on the character string in the function that called strlen, but merely increments strlen’s private copy of the pointer. That means that calls like
strlen("hello, world");  /* string constant */
strlen(array);           /* char array[100]; */
strlen(ptr);             /* char *ptr; */
all work.”
I feel like I understand all of this except the first call example: Why, or how, is the string literal "hello, world" treated as a char *s? How is this a pointer? Does the function assign this string literal as the value of its local variable *s and then use s as the array name/pointer?
To understand how a string like "Hello World" is converted to a pointer, it is important to understand that, the string is actually hexadecimal data starting at an address and moving along till it finds a NULL
So that means, every string constant such as "Hello World" is stored in the memory somewhere
Possibility would be:
0x10203040 : 0x48 [H]
0x10203041 : 0x65 [e]
0x10203042 : 0x6C [l]
0x10203043 : 0x6C [l]
0x10203044 : 0x6F [o]
0x10203045 : 0x20 [' ']
0x10203046 : 0x57 [W]
0x10203047 : 0x6F [o]
0x10203048 : 0x72 [r]
0x10203049 : 0x6C [l]
0x1020304A : 0x64 [d]
0x1020304B : 0x00 [\0]
So, when this function is called with the above values in the memory, [left side is address followed by ':' and the right side is ascii value of the character]
int strlen(const char *s)
{
int n;
for (n = 0; *s != ′\0′; s++)
n++;
return n;
}
strlen("Hello World");
at that time, what gets passed to strlen is the value 0x10203040 which is the address of the first element of the character array.
Notice, the address is passed by value.. hence, strlen has its own copy of the address of "Hello World". starting from n = 0, following uptil I find \0 in the memory, I increment n and also the address in s(which then gets incremented to 0x10203041) and so on, until it finds \0 at the address 0x1020304B and returns the string length.
"hello, world"
is an array of char (type is char[13]). The value of an array of char in an expression is a pointer to char. The pointer points to the first element of the array (i.e., the value of "hello, world" is &"hello, world"[0]).
Note that:
A pointer is (basically) a value pointing to a memory address.
A static string like "hello, word" is stored somewhere in memory
So, a pointer could as easily simply point to a static string as to any other (dynamical) structure that is stored in memory (like an array of characters). There is really no difference with the other provided examples.
Does the function assign this string literal as the value of its local variable *s
and then use s as the array name/pointer?
Yes
As it says in the first paragraph of the same page (Page 99, K&R2):
"By definition, the value of a variable or expression of type array is
the address of element zero of the array."
The value of "hello, world" would be the address of 'h'.
char str[] = "Hello, world";
strlen(str);
string in C are array of characters with terminating NULL ('\0'), that mean it should be in the memory some where.
so what is the difference of sending a stored string as above and sending it directly as below
strlen("Hello, World");
The answer is both are same, but where the string is stored and how it's handled, here comes the compiler and stack.
The compiler at compile time pushes the string in to stack and send the starting address (char*) in the stack, the calling function sees a pointer and access the string.
The compiler also add codes after exist form the function to restore the stack in the correct place thus deleting the temporary string created
Note: the above is implementation dependent of the compiler, but most of the compiler works this way

C: Behaviour of arrays when assigned to pointers

#include <stdio.h>
main()
{
char * ptr;
ptr = "hello";
printf("%p %s" ,"hello",ptr );
getchar();
}
Hi, I am trying to understand clearly how can arrays get assign in to pointers. I notice when you assign an array of chars to a pointer of chars ptr="hello"; the array decays to the pointer, but in this case I am assigning a char of arrays that are not inside a variable and not a variable containing them ", does this way of assignment take a memory address specially for "Hello" (what obviously is happening) , and is it possible to modify the value of each element in "Hello" wich are contained in the memory address where this array is stored. As a comparison, is it fine for me to assign a pointer with an array for example of ints something as vague as thisint_ptr = 5,3,4,3; and the values 5,3,4,3 get located in a memory address as "Hello" did. And if not why is it possible only with strings? Thanks in advanced.
"hello" is a string literal. It is a nameless non-modifiable object of type char [6]. It is an array, and it behaves the same way any other array does. The fact that it is nameless does not really change anything. You can use it with [] operator for example, as in "hello"[3] and so on. Just like any other array, it can and will decay to pointer in most contexts.
You cannot modify the contents of a string literal because it is non-modifiable by definition. It can be physically stored in read-only memory. It can overlap other string literals, if they contain common sub-sequences of characters.
Similar functionality exists for other array types through compound literal syntax
int *p = (int []) { 1, 2, 3, 4, 5 };
In this case the right-hand side is a nameless object of type int [5], which decays to int * pointer. Compound literals are modifiable though, meaning that you can do p[3] = 8 and thus replace 4 with 8.
You can also use compound literal syntax with char arrays and do
char *p = (char []) { "hello" };
In this case the right-hand side is a modifiable nameless object of type char [6].
The first thing you should do is read section 6 of the comp.lang.c FAQ.
The string literal "hello" is an expression of type char[6] (5 characters for "hello" plus one for the terminating '\0'). It refers to an anonymous array object with static storage duration, initialized at program startup to contain those 6 character values.
In most contexts, an expression of array type is implicitly converted a pointer to the first element of the array; the exceptions are:
When it's the argument of sizeof (sizeof "hello" yields 6, not the size of a pointer);
When it's the argument of _Alignof (a new feature in C11);
When it's the argument of unary & (&arr yields the address of the entire array, not of its first element; same memory location, different type); and
When it's a string literal in an initializer used to initialize an array object (char s[6] = "hello"; copies the whole array, not just a pointer).
None of these exceptions apply to your code:
char *ptr;
ptr = "hello";
So the expression "hello" is converted to ("decays" to) a pointer to the first element ('h') of that anonymous array object I mentioned above.
So *ptr == 'h', and you can advance ptr through memory to access the other characters: 'e', 'l', 'l', 'o', and '\0'. This is what printf() does when you give it a "%s" format.
That anonymous array object, associated with the string literal, is read-only, but not const. What that means is that any attempt to modify that array, or any of its elements, has undefined behavior (because the standard explicitly says so) -- but the compiler won't necessarily warn you about it. (C++ makes string literals const; doing the same thing in C would have broken existing code that was written before const was added to the language.) So no, you can't modify the elements of "hello" -- or at least you shouldn't try. And to make the compiler warn you if you try, you should declare the pointer as const:
const char *ptr; /* pointer to const char, not const pointer to char */
ptr = "hello";
(gcc has an option, -Wwrite-strings, that causes it to treat string literals as const. This will cause it to warn about some C code that's legal as far as the standard is concerned, but such code should probably be modified to use const.)
#include <stdio.h>
main()
{
char * ptr;
ptr = "hello";
//instead of above tow lines you can write char *ptr = "hello"
printf("%p %s" ,"hello",ptr );
getchar();
}
Here you have assigned string literal "hello" to ptr it means string literal is stored in read only memory so you can't modify it. If you declare char ptr[] = "hello";, then you can modify the array.
Say what?
Your code allocates 6 bytes of memory and initializes it with the values 'h', 'e', 'l', 'l', 'o', and '\0'.
It then allocates a pointer (number of bytes for the pointer depends on implementation) and sets the pointer's value to the start of the 5 bytes mentioned previously.
You can modify the values of an array using syntax such as ptr[1] = 'a'.
Syntactically, strings are a special case. Since C doesn't have a specific string type to speak of, it does offer some shortcuts to declaring them and such. But you can easily create the same type of structure as you did for a string using int, even if the syntax must be a bit different.

Question about pointers and strings in C [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
What is the difference between char s[] and char *s in C?
Difference between char *str = “…” and char str[N] = “…”?
I have some code that has had me puzzled.
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char *argv[])
{
char* string1 = "this is a test";
char string2[] = "this is a test";
printf("%i, %i\n", sizeof(string1), sizeof(string2));
system("PAUSE");
return 0;
}
When it outputs the size of string1, it prints 4, which is to be expected because the size of a pointer is 4 bytes. But when it prints string2, it outputs 15. I thought that an array was a pointer, so the size of string2 should be the same as string1 right? So why is it that it prints out two different sizes for the same type of data (pointer)?
Arrays are not pointers. Array names decay to pointers to the first element of the array in certain situations: when you pass it to a function, when you assign it to a pointer, etc. But otherwise arrays are arrays - they exist on the stack, have compile-time sizes that can be determined with sizeof, and all that other good stuff.
Arrays and pointers are completely different animals. In most contexts, an expression designating an array is treated as a pointer.
First, a little standard language (n1256):
6.3.2.1 Lvalues, arrays, and function designators
...
3 Except when it is the operand of the sizeof operator or the unary & operator, or is a string literal used to initialize an array, an expression that has type "array of type" is converted to an expression with type "pointer to type" that points to the initial element of the array object and is not an lvalue. If the array object has register storage class, the behavior is undefined.
The string literal "this is a test" is a 15-element array of char. In the declaration
char *string1 = "this is a test";
string1 is being declared as a pointer to char. Per the language above, the type of the expression "this is a test" is converted from char [15] to char *, and the resulting pointer value is assigned to string1.
In the declaration
char string2[] = "this is a test";
something different happens. More standard language:
6.7.8 Initialization
...
14 An array of character type may be initialized by a character string literal, optionally
enclosed in braces. Successive characters of the character string literal (including the
terminating null character if there is room or if the array is of unknown size) initialize the elements of the array.
...
22 If an array of unknown size is initialized, its size is determined by the largest indexed element with an explicit initializer. At the end of its initializer list, the array no longer has incomplete type.
In this case, string2 is being declared as an array of char, its size is computed from the length of the initializer, and the contents of the string literal are copied to the array.
Here's a hypothetical memory map to illustrate what's happening:
Item Address 0x00 0x01 0x02 0x03
---- ------- ---- ---- ---- ----
no name 0x08001230 't' 'h' 'i' 's'
0x08001234 ' ' 'i' 's' ' '
0x08001238 'a' ' ' 't' 'e'
0x0800123C 's' 't' 0
...
string1 0x12340000 0x08 0x00 0x12 0x30
string2 0x12340004 't' 'h' 'i' 's'
0x12340008 ' ' 'i' 's' ' '
0x1234000C 'a' ' ' 't' 'e'
0x1234000F 's' 't' 0
String literals have static extent; that is, the memory for them is set aside at program startup and held until the program terminates. Attempting to modify the contents of a string literal invokes undefined behavior; the underlying platform may or may not allow it, and the standard places no restrictions on the compiler. It's best to act as though literals are always unwritable.
In my memory map above, the address of the string literal is set off somewhat from the addresses of string1 and string2 to illustrate this.
Anyway, you can see that string1, having a pointer type, contains the address of the string literal. string2, being an array type, contains a copy of the contents of the string literal.
Since the size of string2 is known at compile time, sizeof returns the size (number of bytes) in the array.
The %i conversion specifier is not the right one to use for expressions of type size_t. If you're working in C99, use %zu. In C89, you would use %lu and cast the expression to unsigned long:
C89: printf("%lu, %lu\n", (unsigned long) sizeof string1, (unsigned long) sizeof string2);
C99: printf("%zu, %zu\n", sizeof string1, sizeof string2);
Note that sizeof is an operator, not a function call; when the operand is an expression that denotes an object, parentheses aren't necessary (although they don't hurt).
string1 is a pointer, but string2 is an array.
The second line is something like int a[] = { 1, 2, 3}; which defines a to be a length-3 array (via the initializer).
The size of string2 is 15 because the initializer is nul-terminated (so 15 is the length of the string + 1).
An array of unknown size is equivalent to a pointer for sizeof purposes. An array of static size counts as its own type for sizeof purposes, and sizeof reports the size of the storage required for the array. Even though string2 is allocated without an explicit size, the C compiler treats it magically because of the direct initialization by a quoted string and converts it to an array with static size. (Since the memory isn't allocated in any other way, there's nothing else it can do, after all.) Static size arrays are different types from pointers (or dynamic arrays!) for the purpose of sizeof behavior, because that's just how C is.
This seems to be a decent reference on the behaviors of sizeof.
The compiler know that test2 is an array, so it prints out the number of bytes allocated to it(14 letters plus null terminator). Remember that sizeof is a compiler function, so it can know the size of a stack variable.
array is not pointer. Pointer is a variable pointing to a memory location whereas array is starting point of sequential memory allocated
Its because
string1 holds pointer, where pointer has contiguous chars & its
immutable.
string2 is location where your chars sit.
basically C compiler iterprets these 2 differently. beautifully explained here http://c-faq.com/aryptr/aryptr2.html.

Resources