How does the compiler store a C string? [duplicate] - c

This question already has answers here:
Sizeof string literal
(2 answers)
Closed 9 years ago.
I have a question about how the compiler stores a C string. Here are some pieces of code:
#include <stdio.h>
#define STRING_MACRO "macro"
const char * string_const = "w";
int main(void){
    printf("%zu\n", sizeof(STRING_MACRO)); /* sizeof yields a size_t, so print it with %zu */
    printf("%zu\n", sizeof(string_const));
    return 0;
}
output:
6
4 -- my system is x86, so it is 4
So I am confused about how the compiler stores a C string. Is it the same for the macro style and the variable style?
I think most people misunderstood my question, so I tried another piece of code myself.
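#include <stdio.h>
#define TEST "a"
int main(void)
{
    char hello[] = "aa";
    /* &hello has type char (*)[3]; the cast is needed to store it in a char (*)[10] */
    char (*a)[10] = (char (*)[10])&hello;
    printf("%zu\n", sizeof(TEST));
    printf("%zu\n", sizeof(hello));
    printf("%zu\n", sizeof(*a)); /* sizeof does not dereference a; it only uses the type of *a */
    return 0;
}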
output:
2
3
10
So I came to the conclusion that the compiler stores a macro-style C string as type char[], not char *.

sizeof(STRING_MACRO);
is seen by the compiler as:
sizeof("macro");
This gives you the size of the string literal "macro". String literals are typically stored in a read-only region; the exact location is implementation defined.
const char * string_const = "w";
string_const is a pointer that points to the string literal "w".
So,
sizeof(string_const);
gives the size of the pointer, i.e. of const char *, which is apparently 4 on your system.

sizeof yields the size in bytes of its operand.
#define STRING_MACRO "macro"
This is 6 because the compiler allocates 6 bytes for "macro": 5 for the characters plus 1 for the string terminator.
const char * string_const = "w";
This is 4 because it's a pointer, and you're working on a 32-bit platform, so 4 bytes for the pointer to char.

The string "macro" is defined as a macro in your code, not as a variable.
If you build your code with gcc -E you will get the preprocessed code, and in it you will find that
printf("%zu\n", sizeof(STRING_MACRO));
is replaced with
printf("%zu\n", sizeof("macro"));
The preprocessed code is what the preprocessor produces before compilation proper: every macro in your original source is replaced with the content of the macro.
As for "w", it is a string literal, and string_const is a pointer to that literal. The size of a pointer is typically 4 on 32-bit systems and 8 on 64-bit systems.

In your second code,
TEST is nothing but a name for a piece of code, namely the literal "a". This name is expanded to the actual code during the preprocessing stage.
All string literals are null-terminated. This is why,
sizeof("a") = 1 // the size of the character 'a'
+ 1 // size of the '\0' character
-----
= 2
"a" and "aa" are stored in read-only parts of memory (implementation defined).
hello is an array stored on the stack. It does not point to the literal; it holds its own copy of the characters 'a', 'a' and '\0'.
Since hello is declared as a character array (whose size is implicitly 3 since no size was explicitly specified), sizeof(hello) = length of the array = length ("aa\0") = 3.
a is a pointer to a character array of length 10.
sizeof(a) would be 4 (on your x86 system) since a is a pointer.
sizeof(*a) = 10 because *a has type char[10], an array of 10 characters, regardless of what a actually points to.

The compiler doesn't store anything. It evaluates the size of the constant during parsing. The size of STRING_MACRO is evaluated as the length of the string plus the terminator character ('\0'). The size of string_const is evaluated as the size of a pointer (because that's what it is), and on your system the pointer size is 4 bytes (which corresponds to a 32-bit system).

Related

C: different String definition, I get different size using sizeof()

I was testing the use of sizeof() on the same string content, "abc". My function looks like this:
#include <stdio.h>
int main(void){
    char* pass1 = "abc";
    char pass2[] = "abc";
    char pass3[4] = "abc";
    char pass4[] = "";
    /* Caution: pass4 holds only 1 byte, so reading "abc" into it overflows the array (undefined behavior) */
    scanf("%s", pass4);
    printf("sizeof(pass1) is: %zu\n", sizeof(pass1));
    printf("sizeof(pass2) is: %zu\n", sizeof(pass2));
    printf("sizeof(pass3) is: %zu\n", sizeof(pass3));
    printf("sizeof(pass4) is: %zu\n", sizeof(pass4));
    return 0;
}
I input "abc" for pass4, and the output is:
sizeof(pass1) is: 8
sizeof(pass2) is: 4
sizeof(pass3) is: 4
sizeof(pass4) is: 1
I was expecting all 4s. I thought the four string definitions above were the same.
Why does sizeof(pass1) return 8? Why is sizeof(pass4) 1?
When you apply sizeof to a pointer, you get the size in bytes of a memory address. In this case an address is 8 bytes. Applied to a string literal, or to a char array initialised from one, sizeof returns the actual size of the array in bytes, including the null byte.
sizeof gives the size of its operand. To understand the results you are seeing, you need to understand what pass1, pass2, pass3, and pass4 actually are.
pass1 is a pointer to char (i.e. a char *) so sizeof pass1 gives the size of a pointer (a variable which contains a memory address, not an array). That is 8 with your compiler. The size of a pointer is implementation defined, so this may give different results with different compilers. The fact you have initialised pass1 so it points at the first character of a string literal "abc" does not change the fact that pass1 is declared as a pointer, not an array.
pass2 is an array initialised using the literal "abc" which, by convention, is represented in C using an array of four characters (the three letters 'a' to 'c', plus an additional character with value zero, '\0').
pass3 is also an array of four char, since it is declared that way: char pass3[4] = <etc>. If you had written char pass3[4] = "abcdef", most C compilers would warn that the initializer is too long (and C++ rejects it outright); where it is accepted, sizeof pass3 is still 4, and the 4 elements of pass3 are 'a' to 'd' (the other characters 'e', 'f', and '\0' in the string literal "abcdef" are not used to initialise pass3).
Since both pass2 and pass3 are arrays of four characters, their size is 4 (in general, the size of an array is the size of the array element multiplied by number of elements). The standard defines sizeof char to be 1, and 1*4 has a value 4.
pass4 is initialised using the literal "". That string literal is represented using a single char with value '\0' (and no characters before it, since none are between the double quotes). So pass4 has size 1 for the same reason that pass2 has size 4.
In this declaration
char* pass1 = "abc";
the initializer on the right side is a character array of size 4. String literals have character array types. Note that a string literal includes the terminating zero. You can check this as follows:
printf("sizeof( \"abc\") is: %zu\n", sizeof( "abc"));
In this declaration the array is used to initialize a pointer. Used as the initializer of a pointer, the array is implicitly converted to a pointer to its first element.
Thus the pointer pass1 points to the first element of the string literal "abc". The size of the pointer itself in your system is equal to 8 bytes.
In these declarations
char pass2[] = "abc";
char pass3[4] = "abc";
the string literal is used to initialize arrays. In this case each element of the arrays is initialized by the corresponding element of the string literal. All other elements of the arrays are zero initialized. If the size of an array is not specified then it is calculated from the number of initializers.
So in this declaration
char pass2[] = "abc";
the array pass2 will have 4 elements because the string literal provides four initializers.
In this declaration
char pass3[4] = "abc";
it is explicitly specified that the array has 4 elements.
Thus both arrays have a size of 4, and the pointer declared first has a size of 8 bytes.

Size of pointer, pointer to pointer in C

How can I justify the output of the below C program?
#include <stdio.h>
char *c[] = {"Mahesh", "Ganesh", "999", "333"};
char *a;
char **cp[] = {c+3, c+2, c+1, c};
char ***cpp = cp;
int main(void) {
printf("%d %d %d %d ",sizeof(a),sizeof(c),sizeof(cp),sizeof(cpp));
return 0;
}
Prints
4 16 16 4
Why?
char *c[] = {"Mahesh", "Ganesh", "999", "333"};
c is an array of char* pointers. The initializer gives it a length of 4 elements, so it's of type char *[4]. The size of that type, and therefore of c, is 4 * sizeof (char*).
char *a;
a is a pointer of type char*.
char **cp[] = {c+3, c+2, c+1, c};
cp is an array of char** pointers. The initializer has 4 elements, so it's of type char **[4]. Its size is 4 * sizeof (char**).
char ***cpp = cp;
cpp is a pointer to pointer to pointer to char, or char***. Its size is sizeof (char***).
Your code uses %d to print the size values. This is incorrect -- but it happens to work on your system. Probably int and size_t are the same size. To print a size_t value correctly, use %zu -- or, if the value isn't very large, you can cast it to int and use %d. (The %zu format was introduced in C99; there might still be some implementations that don't support it.)
The particular sizes you get:
sizeof a == 4
sizeof c == 16
sizeof cp == 16
sizeof cpp == 4
are specific to your system. Apparently your system uses 4-byte pointers. Other systems may have pointers of different sizes; 8 bytes is common. Almost all systems use the same size for all pointer types, but that's not guaranteed; it's possible, for example, for char* to be larger than char***. (Some systems might require more information to specify a byte location in memory than a word location.)
(You'll note that I omitted the parentheses on the sizeof expressions. That's legal because sizeof is an operator, not a function; its operand is either an expression (which may or may not be parenthesized) or a type name in parentheses, like sizeof (char*).)
a is an ordinary pointer, which holds a memory address. On a 32-bit system, a 32-bit (4-byte) value is used to represent the address. Therefore, sizeof(a) is 4.
c is an array with 4 elements, each of which is a pointer, so its size is 4*4 = 16.
cp is also an array. Each element is a pointer (the first *), which points to another pointer (the second *). The latter pointer points to a string in memory. The element size is therefore the size of a pointer, and sizeof(cp) = 4*4 = 16.
cpp is a pointer to a pointer to a pointer. It also holds a 32-bit memory address, therefore its size is also 4.
a is a pointer. cpp is also a pointer, just to a different type (pointer to pointer to pointer).
Now c is an array. You have 4 elements, each of which is a pointer, so you have 4 * 4 = 16 (it would be different if you ran it on x64).
The same goes for cp. Try changing the type to int and you will see the difference.
So the reason you got 4 16 16 4 is that a is simply a pointer on its own, which requires only 4 bytes (a pointer holds a 32-bit address on your architecture). c is declared as char *c[], which is an array of pointers; since you initialised it with 4 items, it holds 4 pointers, hence 4x4 = 16, and likewise for cp. For cpp you may ask, "wouldn't it be 16 as well, since it was initialised from cp?" The answer is no: cpp is its own separate variable and still just a pointer (a pointer to a pointer to a pointer), so it requires only 4 bytes of memory.

sizeof() function in C [duplicate]

This question already has answers here:
What does sizeof(&array) return?
(4 answers)
Closed 9 years ago.
#include <stdio.h>
int main(void)
{
    char a[] = "Visual C++";
    char *b = "Visual C++";
    printf("\n %zu %zu", sizeof(a), sizeof(b));
    printf("\n %zu %zu", sizeof(*a), sizeof(*b));
    return 0;
}
sizeof(a) gives me output: 11 ( that is length of the string)
Why is it so ?
Why isn't the output sizeof(a)=4 since when I try to print a it gives me an address value and hence an integer?
Whenever you refer to the name of an array in your program, it normally decays to a pointer to the first element of the array. One of the exceptions to this rule is the sizeof operator. So consider the following code.
#include <stdio.h>
int main(void)
{
    char a[] = "Visual C++";
    printf("sizeof(a)=%zu\n", sizeof(a)); /* Here sizeof(a) is the size of the whole array */
    printf("a=%p", (void *)a); /* Here the array name, passed as an argument to printf, decays into a pointer of type (char *) */
    return 0;
}
In the declaration char a[] = "Visual C++", a is an array of 11 char. So its size is 11 bytes.
In the declaration char *b = "Visual C++", b is a pointer to char. So its size is four bytes (in the C implementation you are using).
In the expression printf("%s", a), a is also an array. However, it is automatically converted to a pointer to the first element of the array. So a pointer to char is passed to printf.
This conversion happens automatically unless an array is the argument of &, sizeof, or _Alignof or is a string literal used to initialize an array of char. Because it happens automatically, people tend to think of array names as pointers. However, they are not.
Incidentally, sizeof is an operator, not a function.
When sizeof is applied to the name of a static array (not an array allocated through malloc), the result is the size in bytes of the whole array. This is one of the few exceptions to the rule that the name of an array is converted to a pointer to the first element of the array, and is possible just because the actual array size is fixed and known at compile time, when sizeof operator is evaluated.
There are lots of errors, here.
"sizeof(a) gives me output: 11 (length of the string)"
The length of the string is 10, not 11. sizeof(a) gives you the length of the array.
"why is it so, why isn't the output sizeof(a)=4 since when I try to print a it gives me an address value"
Here are two methods of "printing a" which do not "give you an address value":
puts(a);
and:
printf("%s\n", a);
so your logic is flawed, and this is the source of your confusion. "Printing a" only "gives you an address value" when you explicitly or implicitly elect to do so.
sizeof(a) gives 11 in this case because the C language defines the sizeof operator to give you the size of an array when an array is the operand. This, I'd argue, is the most natural behavior people would expect, so presumably that's why it is defined as such.
"and hence an integer."
In any case, an address is an address, not an integer. At best you could argue that it ought to give you the size of a pointer, but certainly not the size of an integer.

Another way for string creation in C? [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
C -> sizeof string is always 8
I found the following way to create a string:
#include <stdio.h>
#include <string.h>
int main(void) {
char *ptr = "this is a short string";
int count = sizeof ptr / sizeof ptr[0];
int count2 = strlen(ptr);
printf("the size of array is %d\n", count);
printf("the size of array is %d\n", count2);
return 0;
}
I can't use the usual way to get the length, sizeof ptr / sizeof ptr[0]. Is this way of creating a string valid? Are there any pros and cons to this notation?
When you do this
char *ptr = "this is a short string";
ptr is actually a pointer on the stack,
which points to a string literal in read-only memory.
[ptr] ----> "this is a short string"
So sizeof ptr will always evaluate to the size of a pointer on that particular machine.
But when you try this
char ptr[]="this is a short string";
then memory for the whole string is created on the stack, with ptr referring to the starting address of the array:
|t|h|i|s| |i|s| |a| |s|h|o|r|t| |s|t|r|i|n|g|\0|
Each |' '| cell in the diagram above is one byte, and ptr refers to the address of |t|.
So sizeof ptr will give you the size of the whole array.
Because ptr is a pointer and not an array. If you used char ptr[] instead of char *ptr, you would have gotten an almost-correct result - instead of 22, it would have resulted in 23, since the size of the array incorporates the terminating NUL byte also, which is not counted by strlen().
By the way, "creating" a string like this is (almost) valid, but if you don't use a character array, the compiler will initialize the pointer with the address of an array of constant characters, so you should really write
`const char *ptr`
instead of what you have now (i. e. add the const qualifier).
Yes, using strlen to get the length of a string is valid. Even more, strlen is the proper way to get the length of a string. The sizeof solution here is simply wrong: sizeof ptr returns 8 on a 64-bit system because it is the size of a pointer, and dividing it by the size of a char, which is 1, still gives only 8.
The only case where you can use sizeof to get the size of a string is when it is declared as an array of characters (char[]).
Modifying the string through this pointer is undefined behavior and will often crash (compilers commonly place string literals in read-only memory). Besides, the code gives incorrect results, as sizeof(ptr) returns the pointer size rather than the string size.
You should rather write:
char ptr[] = "this is a short string";
Because this code lets you find the string length as well as modify the string contents.

Why does sizeof return different values for same string in C? [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Sizeof doesn't return the true size of variable in C
C -> sizeof string is always 8
sizeof prints out 6 for:
printf("%zu\n", sizeof("abcde"));
But it prints out 4 for:
char* str = "abcde";
printf("%zu\n", sizeof(str));
Can someone explain why?
The string literal "abcde" is a character array. It is 6 bytes long, including the null terminator.
A variable of type char* is a pointer to a character. Its size is the size of a pointer, which on 32-bit systems is 4 bytes. sizeof is a compile time operation†, so it only looks at the variable's static type, which in this case is char*. It has no idea what's being pointed to.
† Except in the case of variable-length arrays, a feature introduced in the C99 language standard
In the first example, sizeof returns the size of the string literal, including its terminator.
In the second example, sizeof returns the size of the pointer: 32 bits, so 4 bytes.
Because here
printf("%zu\n", sizeof("abcde"));
the operand is a string, which is 6 bytes long counting the null terminator,
and
char* str = "abcde";
printf("%zu\n", sizeof(str));
the operand is a pointer, which requires 32 bits, hence 4 bytes :-)
