Compile Error: sizeof() operator in C - c

I am not sure if this has been asked before. I tried my best to avoid duplication.
I was using the sizeof() operator in C.
First I tried this:
char *name;
sizeof(name);
which returned size as 4 bytes. No issues till now.
Next time I tried this:
char name[];
sizeof(name); // I even tried name[]
which gave me a Compile error.
Anyone please explain why this occurs?
EDIT: I also tried inputting a string to *name which far exceeded 4 bytes in length. Yet it was able to handle it properly. However, sizeof(name) still returns 4 bytes. Even when the compiler has dynamically allocated enough memory to *name, it still reports occupying 4 bytes. Is this a sizeof() fault?

char name[]; at file scope is a tentative definition of an array. It has incomplete type. You do not know its size yet. You cannot do sizeof until after the definition has been completed.
Note that this has nothing to do with char *name; - arrays and pointers are different.
Re. your EDIT: you are confusing a pointer with the items being pointed to. char *name; takes up 4 bytes, and it points at another char. That is what a pointer does: it points at another object. It doesn't necessarily own what it points at. The semantics of a string is a series of char objects followed by a null terminator, and name should point at the first item of the series.
This all has nothing to do with sizeof name, which is the size of the pointer, not the size of the list of items being pointed at.

sizeof cannot be used with objects of incomplete array types.
char name[]; // name is of an incomplete type
C defines incomplete types as types that describe objects but lack
information needed to determine their sizes.
If you complete the type:
char name[];
char name[42]; // type of name is now completed
then using sizeof name would be valid.

The size of a pointer to char is the space in memory required for a variable of that type.
The size of an array of chars is the number of chars in the array, which is unknown for an array declared without a size or an initializer.

Related

char * vs char[] and much more

I'm confused about the way C handles strings and char * vs char[].
char name[10] = "asd";
printf("%p\n%p", &name, &name[0]); //0x7ffed617acd
//0x7ffed617acd
If this code gives the same addresses for both arguments, does it mean that the C compiler takes char arrays (strings) as a pointer to the first char in the array and moves in the memory till it gets the null terminator? Why wouldn't the same happen if we changed the char name[] to char *name? (I know they differ but what makes C take both in a different way?)
I know that arrays can't be assigned after declaration (unless you used something like strcpy, strcat) which is also confusing. Why wouldn't C take them as any other data type? (Something tells me the compiler has a specific addr for it while you can assign char* to whatever location in the mem since its a pointer).
I know that char * have fixed size unlike char[] which makes char * not usable for first argument of strcat.
In C, a "string" is an array of type char (terminated with \0).
When you refer to an array in most expressions in C, it is converted to a pointer to its first element, in this case a char *.
The array itself is not a pointer; the operands of sizeof and unary & are among the contexts where this conversion does not happen.
Being able to write name instead of &name[0] follows from this conversion.
In the same way, accessing an array element by writing name[i] is equivalent to writing *(name+i).
does it mean that the c compiler takes char arrays (strings) as a pointer to the first char in the array
An array is not a pointer. But an array will implicitly convert to a pointer to first element. Such conversion is called "decaying".
... and moves in the memory till it gets the null terminator???
You can write such a loop if you know the pointer points to an element of a null-terminated string. If you write that loop, the compiler will produce a program that does exactly that.
Why wouldn't the same happen if we changed the char name[] to char *name?
Your premise is faulty. You can iterate an array directly, as well as using a pointer.
If this code gives the same addresses for both arguments, does it mean
The address of an object is the first byte of the object. What this "same address" means is that the first byte of the first element of the array is in the same address as the first byte of the array as a whole.
I know that arrays can't be assigned after declaration (unless you used something like strcpy, strcat) which is also confusing.
Neither strcpy nor strcat assigns an array. They assign elements of the array, which you can also do without calling those functions.
Why wouldn't C take them as any other data type?
This question is unclear. What do you mean by "C taking them"? Why do you think C should take another data type? Which data type do you think it should take?
char name[10] = "asd";
printf("%p\n%p", &name, &name[0]);
The arguments are of type char(*)[10] and char* respectively. The %p format specifier requires an argument of type void*, and neither argument has that type. Passing an argument of a type other than the one required by the format specifier results in undefined behaviour. You should cast other pointer types to void* when using %p.

mallocing array of structs creates too small of an array

I'm a little new to structs in C and I'm having a problem with creating an array to store them. As the title says when I try to malloc out an array of structs my array ends up being too small by quite a large margin.
Here is my struct:
struct Points
{
char file_letter;
char *operation;
int cycle_time;
};
And here is how I'm trying to create the array:
struct Points *meta_data;
meta_data = malloc(number_of_delims * sizeof(struct Points));
number_of_delims is an int representing the number of Points I'm trying to create and therefore the number of elements in my array.
With number_of_delims being 64 I get an array size of about 8.
Note: this is more or less a project for school and I can't use typedef when declaring my struct as the prof. wants each struct explicitly declared as one each time it is used. This may actually be the source of my problem but we'll see!
struct Points *meta_data;
At this point we have a declaration of an object, meta_data that has type struct Points *... and struct Points *, being a pointer type, typically requires 8 bytes on common implementations. This is observable through the following program:
#include <stdio.h>
struct Points;
int main(void) {
struct Points *meta_data;
printf("sizeof meta_data: %zu\n", sizeof meta_data);
}
Remember, the sizeof operator evaluates the size of the type of the expression, which in this case is a pointer. Pointers don't carry size information about the arrays they point into. You need to keep track of that yourself (preferably by pairing number_of_delims with meta_data, if you require both values later on).
With number_of_delims being 64 I get an array size of about 8.
No. You get an array size of exactly 64, as you expected. Your pointer doesn't automatically carry that size information around with it (you're expected to track it yourself), so there is no portable way to conclude from the pointer alone that your allocation can store 64 elements. The only way you could come to this conclusion is erroneously (i.e. by attempting to use sizeof, which as I've explained doesn't work as you expect).
As an exercise, what happens if you declare a pointer to an array of 64 struct Points, like so?
struct Points (*foo)[64] = NULL;
For a start, how many elements can NULL contain? What is sizeof foo and sizeof *foo? Do you see what I mean when I say sizeof evaluates the size of the type of an expression?

Why the char has to be a pointer instead of a type of char?

#include <stdio.h>
typedef struct {
char * name;
int age;
} person;
int main() {
person john;
/* testing code */
john.name = "John";
john.age = 27;
printf("%s is %d years old.", john.name, john.age);
}
This code works well; I just have a small question.
In the struct, if I delete the * before name, the code no longer works, but whether age is an int or a pointer, it always works fine. Can anyone tell me why name has to be a pointer rather than just a char?
char type is short for character and can hold one character. C has no string type, instead a string in C is an array of char terminated with '\0' - the null character (null terminated strings).
Thus to use a string you need a pointer to memory that contains a sequence of characters. So why does it work for an int with or without the *? Well, we can either have the age as an int or have a pointer to memory that stores the age. Either works well. But we can't store a string in one character.
This also has to do with the format specifiers you have in the printf call. %s outputs a string (it reads a region of memory), while %d interprets whatever it gets as an integer, so even a pointer sort of works. However, you shouldn't do that; it's undefined behavior.
I suggest you read some good books on C to get a good grasp of such things; a good list is The Definitive C Book Guide and List.
but no matter the age's type is int or a pointer, it always works fine.
That's undefined behaviour.
To elaborate, a double-quote delimited string (as seen above) is a string literal, and when used as an initializer, it basically gives you a pointer to the start of the literal, so it needs a pointer variable to be stored in. So, name has to be a pointer.
OTOH, the initializer 27 is an integer literal (integer constant) and it needs to be stored in an int variable, not an int *. If you use 27 to initialize an int * and use that, it works (or rather, seems to work) only by accident: it invokes undefined behavior later, by attempting to use an invalid memory location.
FWIW, if you try something like
typedef struct {
char * name;
int *age;
} person;
and then
john.age = 27; //incompatible assignment
the compiler will warn you about the conversion from integer to pointer.
char *name: name is a pointer to type char. When you make it point to "John", the compiler stores John\0 (5 chars) somewhere in memory and gives you the starting address of that memory. So, when you print it using %s (the string format specifier), printf reads from that address until it reaches \0.
char name : Here name is just one char having 1 byte of memory. So, you can't store anything more than one char. Also, when you would try to read, you should always read just one char (%c) because trying to read more than that will take you to the memory region which is not assigned to you and hence, will invoke Undefined Behavior.
int age: age typically occupies 4 bytes, so you can store an integer in it and read it back, e.g. printf("%d", age);
int *age: age is a pointer to type int and it stores the address of some memory. Unlike with strings and %s, you do not print an integer by passing its address (loosely speaking, to avoid complexity); you have to dereference the pointer. So first you need to allocate some memory (or take the address of an existing int) and store that address in age. Writing *age = 27 before age points at valid storage does not make the compiler find memory for you; it writes through an uninitialized pointer, which is undefined behavior. Once age points at valid storage, *age = 27; stores the value, and it can be read back with printf("%d", *age);

char *filenames[1] or char *filenames What is the difference?

I am writing a program which will store a list of file names as a string array. When I declare it as
char *filenames[1]
I have no errors... but when I do
char *filenames
I get a few errors. Not at the declaration but in later use. for example when I do:
filenames[3]= (char*)malloc( strlen(line) + 1);//here ERROR is : Cannot assign char* to char.
But with the first declaration with [1] it is all fine. I was just wondering what is the difference between them?
Trust me I tried looking for the answer on google but can't find any good ones regarding this case.
At the outset, char *filenames represents a char * variable named filenames whereas, char *filenames[1] represents an array of char * variables, with one element.
In terms of usage, both hold pointers to char, but the major difference is that the first has to be used as a single variable, while the second can be indexed as an array.
If you need only one variable, don't use an array. You may be in danger of using out-of-bounds indexes if you're not careful enough.
Also, as a note, please see this discussion on why not to cast the return value of malloc() and family in C.
Relative to this statement
filenames[3]= (char*)malloc( strlen(line) + 1);//here ERROR is : Cannot assign char* to char.
both declarations are wrong.
The first declaration
char *filenames[1]
is wrong because it declares an array of pointers to char with only one element, but the statement with the memory allocation uses index 3. That is an attempt to access memory beyond the array, because the only valid index for an array that has one element is 0.
The second declaration
char *filenames
is wrong because it does not declare an array of pointers. So applying the subscript operator to the identifier filenames
filenames[3]
gives a scalar object of type char instead of a pointer of type char *.

What's the difference between char [] and char * in struct?

There is a struct like this:
struct sdshdr {
int len;
int free;
char buf[];
};
And the result of printf("%d\n", sizeof(struct sdshdr)); is 8. If I change char buf[] to char *buf, the result is 16. Why does char buf[] take no space here (sizeof(int) is 4)? When should I choose char buf[] over char *buf?
The construct with the empty brackets [] is allowed as the last member of the struct. It lets you allocate additional space beyond sizeof(struct sdshdr) for the elements of the array, so the array data lives in the same allocation as the struct itself.
Pointers, on the other hand, store the data in a separately managed segment of memory, and require an additional call to free at the end. Unlike the [] way, pointers let you have more than one variable-length array inside the same struct, and the element can be placed anywhere in the struct, not only at the end of the struct.
Taking "char[]" more generally:
char[] will actually allocate a number of characters inside the struct. (A struct with char x[17] will grow by 17 bytes and so on.) A char* will just hold a pointer.
An actual char x[] (with no size specified) at the end of the struct is a special case called a "flexible array member"; a size of 0 is accepted by GCC as an extension with a similar effect. It is discussed in the linked question and in the other answer.
Remember also that sizeof is normally determined at compile time. Since char buf[] is a flexible array member, its size cannot be known at compile time, so it is omitted from the calculation of sizeof.
char * is a pointer to a char variable, and its size is known, so it is included (however, that is the size of the pointer, not of the array it points to).

Resources