I have a piece of C code and I don't understand how the sizeof(...) function works:
#include <stdio.h>
int main(){
const char firstname[] = "bobby";
const char* lastname = "eraserhead";
printf("%zu\n", sizeof(firstname) + sizeof(lastname));
return 0;
}
In the above code sizeof(firstname) is 6 and sizeof(lastname) is 8.
But bobby is 5 characters wide and eraserhead is 10 (11 with the terminator). I expected 16.
Why is sizeof behaving differently for the character array and pointer to character?
Can anyone clarify?
firstname is a char array carrying a trailing 0-terminator. lastname is a pointer. On a 64-bit system pointers are 8 bytes wide.
sizeof an array is the size of the total array, in the case of "bobby", it's 5 characters and one trailing \0 which equals 6.
sizeof a pointer is the size of the pointer, which is normally 4 bytes on a 32-bit machine and 8 bytes on a 64-bit machine.
The size of your first array is the size of bobby\0. \0 is the terminator character, so it is 6.
The second size is the size of a pointer, which is 8 bytes on your 64-bit system. Its size doesn't depend on the assigned string's length.
how the sizeof(...) function works
sizeof() looks like a function but it's not a function; it is an operator. A function computes something at run-time.
sizeof() asks the compiler, at compile-time, how much memory it allocates for the operand (the one exception is a variable-length array, whose size is only known at run-time). BTW sizeof() has no idea how much of it you actually use later at run time. In other words, you've "hardcoded" the printf arguments in your example.
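To make the compile-time point concrete, here is a minimal sketch (the function name counter_after_sizeof is made up for illustration). It applies sizeof to an expression with a side effect; because the operand is not a variable-length array, the expression is never evaluated and only its type is inspected.

```c
#include <stddef.h>

/* Returns the value of i after sizeof has been applied to i++.
   The operand is not evaluated, so the increment never happens. */
int counter_after_sizeof(void)
{
    int i = 0;
    size_t n = sizeof(i++);   /* only the type (int) is inspected */
    (void)n;                  /* silence "unused variable" warnings */
    return i;                 /* still 0 */
}
```

Some compilers warn about the unevaluated side effect, which is a useful hint that the operand is being treated as a type query rather than as code to run.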
Why is sizeof behaving differently for the character array and pointer
to character?
A pointer rarely requires the same amount of memory as an array.
In general, the amount of memory allocated for a pointer is different to what is allocated for its pointee.
firstname is an array of 6 chars, including the terminating '\0' character at the end of the string. That's why sizeof firstname is 6.
lastname is a pointer to char, and will have whatever size such a pointer has on your system. Typical values are 4 and 8. The size of lastname will be the same no matter what it is pointing to (or even if it is pointing to nothing at all).
firstname[] is null-terminated, which adds 1 to the length.
sizeof(lastname) is giving the size of the pointer instead of the size of the string it points to.
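The answers above can be sketched in a few lines, assuming the same two declarations as in the question. The three helper functions are hypothetical names used only to expose the values; strlen is the run-time way to get the length of the string a pointer refers to.

```c
#include <stddef.h>
#include <string.h>

static const char firstname[] = "bobby";      /* array: 5 chars + '\0' */
static const char *lastname = "eraserhead";   /* pointer to a string literal */

/* Size of the whole array, terminator included: 6. */
size_t array_bytes(void)    { return sizeof(firstname); }

/* Size of the pointer object itself, whatever it points to. */
size_t pointer_bytes(void)  { return sizeof(lastname); }

/* Number of characters before the '\0', counted at run time: 10. */
size_t lastname_chars(void) { return strlen(lastname); }
```

So the sum in the question is 6 plus the pointer size of the platform, which is 8 on a typical 64-bit system.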
I've been trying to switch from Python to C for some time. I was just checking out a few functions, and what caught my attention is the sizeof operator, which returns the size of an object in bytes. I created an array of strings and want to find the size of the array. I know that it can be done with sizeof(array)/sizeof(array[0]). However, I find this a bit confusing.
I expected that the large array would be 2D (which is just a 1D array represented differently) and that each character array within it would occupy as many bytes as the longest character array in it. Example below:
#include <stdio.h>
#include <string.h>
const char *words[] = {"this","that","Indian","he","she","sometimes","watch","now","browser","whatsapp","google","telegram","cp","python","cpp","vim","emacs","jupyter","space","earphones","laptop","charger","whiteboard","chalk","marker","matrix","theory","optimization","gradient","descent","numpy","sklearn","pandas","torch","array"};
const int length = sizeof(words)/sizeof(words[0]);
int main()
{
printf("%s",words[1]);
printf("%i",length);
printf("\n%zu",sizeof(words[0]));
printf("\n%zu %zu %s",sizeof(words[27]),strlen(words[27]),words[27]);
return 0;
}
[OUT]
that35
8
8 12 optimization
each of the character arrays occupy 8 bytes, including the character array "optimization". I don't understand what is going on here. The strlen function gives the expected output, since it just looks for the null character in the character array, so I'd expected the output of the sizeof operator to be 1 more than the output of strlen.
PS: I didn't find any resource that addresses this issue.
It's happening because words[27] is a pointer, so sizeof(words[27]) gives the size of a pointer. Pointers have a fixed size on a given machine, usually 8 bytes on an x86_64 CPU. Also, words is an array of pointers.
each of the character arrays occupy 8 bytes, including the character array "optimization".
No, each word in words occupies its own fixed amount of memory (its length plus the terminator). The 8 bytes is the size of the pointer, which stores the address of the word in words.
const int length = sizeof(words)/sizeof(words[0]);
The above line gives 35 because words is an array, and an array does not decay to a pointer when it is the operand of sizeof. That words is a global variable stored in the program's data section does not matter for this.
Read More about pointer decaying:
https://www.geeksforgeeks.org/what-is-array-decay-in-c-how-can-it-be-prevented/
https://www.opensourceforu.com/2016/09/decayintopointers/
words is an array of pointers to const char, statically initialized: each element holds the address of a string stored elsewhere.
In practice, the elements of words will probably point to multiple entries in read-only data. To use words in this manner, it is totally appropriate to use strlen.
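A shortened sketch of the same idea, assuming a three-element version of the words array. The helpers word_count and word_storage are hypothetical names: the first uses the sizeof idiom on the array itself, the second uses strlen to find out how many bytes one of the strings really needs.

```c
#include <stddef.h>
#include <string.h>

static const char *words[] = {"this", "that", "optimization"};

/* Number of elements: size of the whole array divided by the size of one
   element (one pointer). Valid only where words is still an array. */
size_t word_count(void)
{
    return sizeof(words) / sizeof(words[0]);
}

/* Bytes needed to store word i, including its terminating '\0'. */
size_t word_storage(size_t i)
{
    return strlen(words[i]) + 1;
}
```

Here sizeof(words[2]) would still be the size of a pointer, while word_storage(2) gives the 13 bytes that "optimization" occupies with its terminator.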
I am starting C, having learned Python before, and I am having some doubts about some concepts.
I am running this example in a 64-bit machine.
/* I understand that "vid" is only a char like any other, not an array of char,
and its sizeof is 1 byte. The decimal int is 100 and the char is 'D'.
Why? 'vid' does not exist in the ASCII table. How does the compiler deal with that? */
char name = "vid";
/* sizeof is 8 bytes. I am not sure why. If char is an int, then an array
of char would
be an array of int, and since an int takes 2 or 4 bytes of storage, a string that is
3 chars long plus the null byte ('\0') gives 3 * 2 bytes + 1 * 2 bytes = 8 bytes.
Am I correct? And why do we need to use * to declare it? Isn't * for pointers?
How does this syntax work? */
char *name_ = "vid";
A string constant like "vid" decays into a pointer to its first byte, and when you convert a pointer to a char, the program will truncate the pointer's value to make it fit. Apparently, that happens to produce a number whose ASCII value is D on your machine. You get an initialization makes integer from pointer without a cast warning for that, if you compile with GCC.
sizeof(name_) == sizeof(char*), which is 8 on a machine with 64-bit pointers. sizeof("vid") == 4, per definition: sizeof measures size in char units.
In the first example, name = "vid", you are not assigning the string "vid" to name. By convention, a string constant evaluates to a pointer to its first element, so in the first statement you're assigning the address of "vid". As others said, it is by accident that the number stored in name, after the address was truncated to fit 1 byte, was the ASCII code of 'D'. If you turn on warnings, you will get a message telling you that you're trying to make a char from a char *, which is not compatible, as a char can hold only 1 byte.
In the second example, char *name_ = "vid", you're assigning the address of "vid" to name_, which is right, as it is a pointer to char.
Note that you are not storing the string "vid" in name_. The string constant "vid" is stored somewhere in a read only memory and the address of the first element of that string constant is assigned to name_.
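For comparison, a small sketch of the three things that are easy to mix up here: the literal itself, an array initialized from it, and a pointer to it. The function names are invented for illustration.

```c
#include <stddef.h>

/* The literal "vid" has type char[4]: 'v', 'i', 'd', '\0'. */
size_t literal_bytes(void)
{
    return sizeof("vid");
}

/* An array initialized from the literal holds its own 4-byte copy. */
size_t array_copy_bytes(void)
{
    char name[] = "vid";
    return sizeof(name);
}

/* A pointer only stores the address; the characters live elsewhere. */
char first_char(void)
{
    const char *name_ = "vid";
    return name_[0];            /* 'v' */
}
```

If the goal is a variable that actually contains the characters, the array form char name[] = "vid"; is the one to use.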
For your first example, I am not sure how that compiles. You are attempting to assign an array of characters to a single character. This shouldn't be allowed without some kind of warning.
For the second, you are taking sizeof a char*, which is a pointer. Anytime you add the * to a type, you make it a pointer. In your case, this is 8 bytes, regardless of how much data it is pointing to. If you want to know the size of the data and not the size of the pointer, then you'd need to do the following;
sizeof(name_[0]) * 4
or
sizeof(char) * 4
Since your array is 3 characters, +1 for the null character, making it 4 characters long. This takes the size of the first element (a single character, 1 byte) and multiplies it by the length of the string. Thus, your data size should be 4 bytes.
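Rather than hardcoding the 4, the length can be computed at run time. This is only a sketch, and string_bytes is a hypothetical helper name; it relies on strlen counting the characters before the terminating '\0'.

```c
#include <stddef.h>
#include <string.h>

/* Bytes occupied by the string that s points to, including the '\0'. */
size_t string_bytes(const char *s)
{
    return (strlen(s) + 1) * sizeof(char);   /* sizeof(char) is always 1 */
}
```

With the pointer from the question, string_bytes(name_) gives 4, independent of the pointer size.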
Your first line of code initializes name with a pointer to an array of four characters placed in the static data segment. So when name is treated as a character you get the lowest byte of the pointer: 0x??????????????64 where '??' are unknown bytes.
About the second line, you're getting the sizeof of the pointer. On 64-bit systems pointers are 64 bits, or 8 bytes. That is what you get.
Guys, I have a few queries about pointers. Kindly help me resolve them.
char a[]="this is an array of characters"; // declaration type 1
char *b="this is an array of characters";// declaration type 2
Question 1: What is the difference between these 2 types of declaration?
printf("%s",*b); // gives a segmentation fault
printf("%s",b); // displays the string
Question 2: I didn't get how this is working.
char *d=malloc(sizeof(char)); // 1)
scanf("%s",d); // 2)
printf("%s",d);// 3)
Question 3: How many bytes are being allocated to the pointer c?
When I try to input a string, it takes just a word and not the whole string. Why so?
char c=malloc(sizeof(char)); // 4)
scanf("%c",c); // 5)
printf("%c",c);// 6)
Question 4: When I try to input a character, why does it throw a segmentation fault?
Thanks in advance. Waiting for your reply, guys.
printf("%s",*b); // gives a segmentation fault
printf("%s",b); // displays the string
%s expects a pointer to the first char of a null-terminated array of chars.
char *c=malloc(sizeof(char)); // you are allocating only 1 byte (one char), not an array of char!
scanf("%s",c); // %s needs a buffer with room for the whole string, not a single char
printf("%s",c);// %s prints a whole string, but c has room for only one char
You need to do this:
int sizeofstring = 200; // max size of buffer
char *c = malloc(sizeof(char) * sizeofstring); // roughly like declaring char c[200]
scanf("%199s",c); // leave room for the terminating '\0'
printf("%s",c);
question.3 how many bytes are being allocated to the pointer c? when i
try to input a string, it takes just a word and not the whole string.
why so ?
In your code, you are allocating only 1 byte, because sizeof(char) is 1 byte (8 bits on typical platforms). You need to allocate sizeof(char)*N, where N is your "string" size including the terminating null.
char a[]="this is an array of characters"; // declaration type 1
char *b="this is an array of characters";// declaration type 2
Here you are declaring two variables, a and b, and initializing them. "this is an array of characters" is a string literal, which in C has type array of char. a has type array of char. In this specific case, the array does not get converted to a pointer, and a gets initialized with the array "this is an array of characters". b has type pointer to char, the array gets converted to a pointer, and b gets initialized with a pointer to the array "this is an array of characters".
printf("%s",*b); // gives a segmentation fault
printf("%s",b); // displays the string
In an expression, *b dereferences the pointer b, so it evaluates to the char pointed to by b, i.e. 't'. This is not an address (which is what "%s" is expecting), so you get undefined behavior, most probably a crash (but don't try to do this on embedded systems, you could get mysterious behaviour and corrupted data, which is worse than a crash). In the second case, %s expects a pointer to a char, gets it, and can proceed to do its thing.
char *d=malloc(sizeof(char)); // 1)
scanf("%s",d); // 2)
printf("%s",d);// 3)
In C, sizeof returns the size in bytes of an object (= region of storage). In C, a char is defined to be the same as a byte, which has at least 8 bits, but can have more (but some standards put additional restrictions, e.g: POSIX requires 8-bit bytes, i.e: octets). So, you are allocating 1 byte. When you call scanf(), it writes in the memory pointed to by d without restraint, overwriting everything in sight. scanf() allows maximum field widths, so:
Allocate more memory, at least enough for what you want + 1 terminating ASCII NUL.
Tell scanf() to stop, e.g: scanf("%19s") for a maximum 19 characters (you'll need 20 bytes to store that, counting the terminating ASCII NUL).
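Both points can be combined in a short sketch. To keep it self-contained it uses sscanf on an in-memory string instead of scanf on standard input; the field width works the same way. read_word is a hypothetical helper name, and the 20-byte buffer is just the size used in the example above.

```c
#include <stdio.h>
#include <stdlib.h>

/* Reads the first whitespace-delimited word of input into a freshly
   allocated 20-byte buffer. Returns NULL on allocation or parse failure;
   the caller must free the result. */
char *read_word(const char *input)
{
    char *d = malloc(20);               /* 19 characters + terminating '\0' */
    if (d == NULL)
        return NULL;
    if (sscanf(input, "%19s", d) != 1) { /* never writes more than 20 bytes */
        free(d);
        return NULL;
    }
    return d;
}
```

Longer input is simply cut off after 19 characters instead of running past the end of the allocation.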
And last:
char c=malloc(sizeof(char)); // 4)
scanf("%c",c); // 5)
printf("%c",c);// 6)
c is not a pointer, so you are trying to store an address where you shouldn't. In scanf, "%c" expects a pointer to char, which should point to an object (=region of storage) with enough space for the specified field width, 1 by default. Since c is not a pointer, the above may crash in some platforms (and cause worse things on others).
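A sketch of what the corrected version might look like: no malloc at all, just a plain char object whose address is passed for "%c". As before, sscanf on a string stands in for scanf so the example does not depend on standard input, and first_char_of is an invented name.

```c
#include <stdio.h>

/* Returns the first character of input, or '\0' if there is none. */
char first_char_of(const char *input)
{
    char c = '\0';                        /* a plain char, no malloc needed */
    if (sscanf(input, "%c", &c) != 1)     /* &c is the pointer %c expects */
        return '\0';
    return c;
}
```

With scanf the call has the same shape: scanf("%c", &c) followed by printf("%c", c).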
I see several problems in your code.
Question 1: The difference is:
a gets allocated in writable memory, the so-called data segment. Here you can read and write as much as you want. sizeof a is the length of the string plus 1, the so-called string terminator (just a null byte).
b, however, is just a pointer to a string which is located in the rodata. That means, in a data area which is read only. sizeof b is whatever is the pointer size on your system, maybe 4 or 8 on a PC or 2 on many embedded systems.
Question 2: The printf() format wants a pointer to a string. With *b, you dereference the pointer you have and give it the first byte of data, which is a t (ASCII 116). The callee, however, treats it as a pointer, dereferences it and BAM.
With b, however, everything goes fine, as it is exactly the right call.
Question 3: malloc(sizeof(char)) allocates exactly one byte. sizeof(char) is 1 by definition, so the call is effectively malloc(1). The input just takes a word because %s is defined that way.
Question 4:
char c=malloc(sizeof(char)); // 4)
should give you a warning: malloc() returns a pointer which you try to put into a char. I think you mean char *...
As you continue, you give that pointer to scanf(), which receives e.g. instead of 0x80043214 a mere 0x14, interprets it as a pointer and BAM again.
The correct way would be
char * c=malloc(1024);
scanf("%1023s", c); /* width 1023 leaves room for the terminating '\0' */
printf("%s", c);
Why? Well, you want to read a string. 1 byte is too small, better allocate more.
In scanf() you should take care that you don't allow reading more than your buffer can hold - thus the limitation in the format specifier.
And on printing, you should use %s, because you want the whole string to be printed and not only the first character. (At least, I suppose so.)
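Since %s stops at the first whitespace, reading a whole line needs a different conversion. One option is a scanset, shown here as a sketch with sscanf so it runs without standard input; read_line is a hypothetical name and the 100-byte buffer size is an assumption. fgets is the other common choice for real input.

```c
#include <stdio.h>

/* Copies the first line of input (at most 99 characters, newline excluded)
   into buf, which must have room for at least 100 bytes.
   Returns 1 on success, 0 if the line is empty or input is exhausted. */
int read_line(const char *input, char *buf)
{
    return sscanf(input, "%99[^\n]", buf) == 1;
}
```

The width 99 plays the same role as the limit in the answer above: it keeps the conversion from writing past the end of the buffer.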
Ad Q1: The first is an array of chars; a names the array itself and is not a separate pointer object. sizeof(a) will return strlen(a)+1, which is 31 here. Trying to assign something to a (like a = b) will fail, since arrays are not assignable.
The second is a pointer pointing to an array of char, and hence sizeof(b) is usually 4 on 32-bit or 8 on 64-bit. Assigning something to b will work, since the pointer can take a new value.
Of course, *a or *b work on both.
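A rough sketch of those differences, with invented function names. The array owns a writable copy of the characters; the pointer can be aimed at another string but the literal it points to should be treated as read-only, which is why it is declared const char * here.

```c
#include <stddef.h>

/* a is an array: its elements can be modified in place. */
char modified_first(void)
{
    char a[] = "this is an array of characters";
    a[0] = 'T';                  /* fine: a has its own storage */
    return a[0];
}

/* sizeof applied to the array counts every char plus the '\0'. */
size_t a_size(void)
{
    char a[] = "this is an array of characters";
    return sizeof(a);            /* strlen(a) + 1 */
}

/* b is a pointer: it can be reassigned to point at another string. */
const char *reassigned(void)
{
    const char *b = "this is an array of characters";
    b = "another string";        /* fine: only the pointer changes */
    return b;
}
```

Writing a = b; inside modified_first would not compile, while writing through b to the literal would be undefined behavior.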
Ad Q2: printf() with the %s argument takes a pointer to a char (those are the "strings" in C). Hence, printf("%s", *b) will crash, since the "pointer" used by printf() will contain the byte value of *b.
What you could do, is printf("%c", *b), but that would only print the first character.
Ad Q3: sizeof(char) is 1 (by definition), hence you allocate 1 byte. The scanf will most likely read more than one byte (remember that each string will be terminated by a null character occupying one char). Hence the scanf will trash memory, which is likely to cause a crash or corrupted data sometime later on.
Ad 4: Maybe that's the trashed memory.
Both declarations hold the same text, but a is an array and b is a pointer.
b points to the first byte, so when you say *b it's the first character.
printf("%s", *b)
Will fail as %s accepts a pointer to a string.
char is one byte.
char str[] = " http://www.ibegroup.com/";
char *p = str ;
void Foo ( char str[100]){
}
void *p = malloc( 100 );
What is the sizeof of str, p, str, p in the above 4 cases, in turn?
I've tested it on my machine (which seems to be 64-bit) with these results:
25 8 8 8
But I don't understand the reason yet.
sizeof applied to a char array returns the number of bytes in the array, i.e. strlen()+1 for a null-terminated C string filling the entire array. Arrays don't decay to pointers in sizeof. str is an array, and the string has 25 characters plus a null byte, so sizeof(str) should be 26. Did you add a space to the value?
The size of a pointer is determined by the platform, not by what it points to, so both instances of p are typically 8 bytes on 64-bit architectures and 4 bytes on 32-bit architectures.
In function parameters, arrays do decay to pointers, so you're getting the same result that you get for a pointer. Therefore, the following declarations are equivalent:
void foo(char s[42]);
void foo(char s[100]);
void foo(char* s);
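A compilable sketch of that equivalence, with hypothetical function names. The three prototypes can coexist because the compiler adjusts each array parameter to char *, and sizeof inside the function therefore reports the pointer size, while a local array of 100 chars still reports 100.

```c
#include <stddef.h>

/* Compatible redeclarations: the array bound in a parameter is ignored
   and the parameter type is adjusted to char *. */
size_t param_size(char s[42]);
size_t param_size(char s[100]);
size_t param_size(char *s);

size_t param_size(char *s)
{
    return sizeof(s);            /* size of a pointer */
}

size_t local_array_size(void)
{
    char str[100] = "";
    return sizeof(str);          /* 100: str is a real array here */
}
```

Many compilers warn if sizeof is applied to a parameter that was written with array syntax, precisely because the result is the pointer size.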
The first is the sizeof of a built-in array, which is the number of elements times the element size (24 characters + the null at the end of the string, 1 byte each).
The second is the sizeof of a pointer which is the native word size of your system, in your case 64 bit or 8 bytes.
The third is the sizeof of a pointer to the first element of an array, which has the same size as any other pointer, the native word size of your system. Why a pointer to the first element of an array? Because the size information of an array is lost when it is passed to a function; it gets implicitly converted to a pointer to the first element instead.
The fourth is the sizeof of a pointer which has the same size as any other pointer.
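Because the callee cannot recover the array size, a common workaround is to pass it as a separate argument, computed with sizeof at the point where the array is still an array. The helper free_space below is a made-up example of that pattern, not something from the question.

```c
#include <stddef.h>
#include <string.h>

/* Bytes still unused in a buffer of the given capacity that currently
   holds the null-terminated string buf. The capacity must come from the
   caller, since sizeof(buf) here would only be the pointer size. */
size_t free_space(const char *buf, size_t capacity)
{
    return capacity - (strlen(buf) + 1);
}
```

A caller with char str[100] = "abc"; would write free_space(str, sizeof str), so the 100 is captured before the array decays.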
str is an array of 8-bit characters, including null terminator.
p is a pointer, which is typically the machine's native word size (32 bit or 64 bit).
The size taken up by a pointer stays constant, regardless of the size of the memory to which it points.
EDIT
In C and C++, a function parameter declared as an array is adjusted to a pointer to its first element, that's why the second instance of str has sizeof 8.
In the first case, the size of
char str[] = " http://www.ibegroup.com/";
is known to be 25 (24+1), because that much memory is actually allocated.
In the case of
void Foo ( char str[100]){
no memory is allocated