I'm learning C right now and came to a little problem I encountered while trying out some code snippets from my uni course.
It's about typedef'd pointers to structs and their usage in the sizeof() function.
#include <stdio.h>
#include <stdlib.h>
// Define struct
struct IntArrayStruct
{
    int length;
    int *array;
};
// Set typedef for pointer to struct
typedef struct IntArrayStruct *IntArrayRef;
// Declare function makeArray()
IntArrayRef makeArray(int length);
// Main function
int main(void)
{
    // Use makeArray() to create a new array
    int arraySize = 30;
    IntArrayRef newArray = makeArray(arraySize);
}
// Define makeArray() with function body
IntArrayRef makeArray(int length)
{
    IntArrayRef newArray = malloc(sizeof(*IntArrayRef)); // ERROR
    newArray->length = length;
    newArray->array = malloc(length * sizeof(int));
    return newArray;
}
And the code really works in the IDE used in class (Virtual C), but when I try the exact same example in VSCode and compile it using GNU Make or GCC, it returns an error because sizeof(*IntArrayRef) in the malloc() function call is marked as an unexpected type name.
error: unexpected type name 'IntArrayRef': expected expression
However, when I change it to sizeof(struct IntArrayStruct), everything works splendidly.
Isn't *IntArrayRef the same thing as struct IntArrayStruct?
IntArrayRef is the name of a type, therefore *IntArrayRef is invalid syntax. What you can (and should) do instead is give the name of the variable and dereference that:
IntArrayRef newArray = malloc(sizeof(*newArray));
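For reference, here is a minimal sketch of how the whole makeArray() might look with that fix applied. The NULL checks, the assert on length, and the freeArray() helper are additions of mine for illustration and are not part of the course code.

```c
#include <assert.h>
#include <stdlib.h>

struct IntArrayStruct
{
    int length;
    int *array;
};

typedef struct IntArrayStruct *IntArrayRef;

/* Allocates the struct and its element buffer; returns NULL on failure. */
IntArrayRef makeArray(int length)
{
    assert(length > 0);

    /* sizeof *newArray is the size of the struct, not of the pointer. */
    IntArrayRef newArray = malloc(sizeof *newArray);
    if (newArray == NULL)
        return NULL;

    newArray->length = length;
    newArray->array = malloc((size_t)length * sizeof *newArray->array);
    if (newArray->array == NULL)
    {
        free(newArray);
        return NULL;
    }
    return newArray;
}

/* Hypothetical helper (not in the original): releases both allocations. */
void freeArray(IntArrayRef ref)
{
    if (ref != NULL)
    {
        free(ref->array);
        free(ref);
    }
}
```

The operand of sizeof is the variable being declared; that is fine, because the name is already in scope inside its own initializer and sizeof does not evaluate it.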
Here's how your type names relate to each other:
struct IntArrayStruct * == IntArrayRef
Thus, newArray has type IntArrayRef which is the same as struct IntArrayStruct *
So, if you want the size of the pointer type, you'd use one of
sizeof (IntArrayRef)
sizeof (struct IntArrayStruct *)
sizeof newArray
If you want the size of the pointed-to type (the actual struct type), you'd use one of
sizeof (struct IntArrayStruct)
sizeof *newArray
sizeof is an operator, not a function - parentheses are only required if the operand is a type name (including typedef names). It doesn't hurt to use parentheses around non-type operands like sizeof (*newArray), but they're not necessary.
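The forms listed above can be put side by side in a small sketch. The function names pointeeSize() and pointerSize() are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>

struct IntArrayStruct
{
    int length;
    int *array;
};

typedef struct IntArrayStruct *IntArrayRef;

/* Type-name operands need parentheses; both names denote the same type. */
static_assert(sizeof(IntArrayRef) == sizeof(struct IntArrayStruct *),
              "the typedef is just another name for the pointer type");

/* Size of the struct that an IntArrayRef points to. */
size_t pointeeSize(IntArrayRef ref)
{
    /* Expression operand: no parentheses needed, and ref is not
       dereferenced at run time because sizeof does not evaluate it. */
    return sizeof *ref;
}

/* Size of the pointer itself. */
size_t pointerSize(IntArrayRef ref)
{
    return sizeof ref;
}
```

Because the operand is never evaluated, calling pointeeSize(NULL) is harmless and still yields sizeof (struct IntArrayStruct).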
As a stylistic note, it's generally a bad idea to hide pointer types behind typedefs, especially if the user of the type has to know it's a pointer type to use it correctly. IOW, if the user of the type ever has to dereference something, then the pointerness of that something should be explicit. Even if the user never needs to dereference it explicitly, you still shouldn't hide the pointerness of the type (take the FILE * type in the standard library as an example - you never actually dereference a FILE * object, but its pointerness is still made explicit).
Otherwise, be prepared to write a full API that hides all pointer operations from the user.
Compare
sizeof(struct IntArrayStruct *)
sizeof(IntArrayRef)
vs
sizeof(struct IntArrayStruct)
The first two are the same, and they are the size of just the pointer. On mainstream platforms that is also what sizeof(int*), sizeof(long*), sizeof(void*) etc. give.
The third is the size of the actual data structure. That's what you want if you are creating space for it with malloc.
Also, pointers and references are two different things in C++, so it might be less confusing to signal that something is a pointer with the abbreviation "Ptr" rather than "Ref".
Finally, as mentioned, creating a new type name just to represent a pointer to a struct type is unconventional. It would confuse other people without much benefit.
Related
I have a struct defined in .h
struct buf_stats {
// ***
};
then in .c file
struct buf_stats *bs = malloc(sizeof(struct buf_states*)) ;
where buf_states is a typo.
But gcc does not warn me, even though I used -Wall,
and this typo cost me 3 hours to track down.
How can I make gcc warn about an undefined struct like this?
In your code
struct buf_stats *bs = malloc(sizeof(struct buf_states*)) ;
is wrong for several reasons:
You are using an undefined type (as you mentioned).
You are allocating too little memory (the size of a pointer to the type instead of the size of the type).
But your compiler can't help much in this case, for this particular kind of error, because
a pointer to any struct type has a defined size on a given platform, and for that the structure it points to need not be complete (defined). This is the reason we can have self-referencing structures.
malloc() has no idea about the target variable's type. It just reads the argument for the requested size and returns a pointer of type void * to the allocated memory; on assignment, that pointer is converted to the target type. Nothing along the way compares the size of the target type with the size that was allocated.
The simplest and most convenient way to avoid this kind of mistake is not to hard-code the type as the operand of sizeof, but to use the variable itself, dereferenced.
Something like
struct buf_stats *bs = malloc(sizeof *bs) ; // you can write that as (sizeof (*bs)) also
// sizeof *bs === sizeof (struct buf_stats)
which is equivalent to
struct buf_stats *bs = malloc(sizeof(struct buf_stats)) ;
but is more robust and less error prone.
Notes:
You don't need the parentheses if the operand is not a type name.
This statement does not need any modification upon changing the type of target variable bs.
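To make the difference concrete, here is a small sketch. The fields of struct buf_stats and the helper name new_buf_stats() are assumptions, since the question elides the struct body.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical fields; the original elides the struct body. */
struct buf_stats
{
    long reads;
    long writes;
    char label[64];
};

/* The typo compiles silently: `struct buf_states` is merely introduced as
   an incomplete type, and all pointers to struct types have the same size. */
static_assert(sizeof(struct buf_states *) == sizeof(struct buf_stats *),
              "pointer size does not depend on the struct it points to");

/* Allocates one zero-initialised buf_stats, or returns NULL on failure. */
struct buf_stats *new_buf_stats(void)
{
    /* No type name appears here, so there is nothing to mistype. */
    struct buf_stats *bs = malloc(sizeof *bs);
    if (bs != NULL)
    {
        bs->reads = 0;
        bs->writes = 0;
        bs->label[0] = '\0';
    }
    return bs;
}
```

With sizeof *bs, a misspelled variable name is a hard compile error, whereas a misspelled struct tag inside a pointer type is accepted.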
You can't. Using a type name like struct foo * (a pointer to some struct type) declares that struct as an incomplete type. Its size isn't known, but the size of the struct isn't needed to know the size of a pointer to it.
That said, the code looks wrong, as you need the size of the struct (not the size of the pointer), so with the following code:
struct buf_stats *bs = malloc(sizeof(struct buf_states));
you would get an error.
There's a better way to write such code:
struct buf_stats *bs = malloc(sizeof *bs);
The expression *bs has the correct type for sizeof, even when you later change the type.
I am a little confused about dereferencing a structure pointer into a
structure variable.
It will be good if I demonstrate my problem with an example.
So here I am:
#include <stdio.h>
struct my_struct{
    int num1;
    int num2;
}tmp_struct;
void Display_struct(void * dest_var){
    struct my_struct struct_ptr;
    struct_ptr = *((struct my_struct *)dest_var);
    printf("%d\t%d\n",struct_ptr.num1,struct_ptr.num2);
}
int main()
{
    tmp_struct.num1 = 100;
    tmp_struct.num2 = 150;
    Display_struct(&tmp_struct);
    return 0;
}
When I run this example, the code compiles cleanly and the output is correct.
What I don't understand is whether this is a correct way to dereference a structure pointer into a structure variable, the way we do with simple
data types like this:
#include <stdio.h>
int example_num;
void Display_struct(void * dest_var){
    int example_num_ptr;
    example_num_ptr = *((int *)dest_var);
    printf("%d\n",example_num_ptr);
}
int main()
{
    example_num = 100;
    Display_struct(&example_num);
    return 0;
}
Here we can dereference the int pointer to int variable as it is a simple data
type but in my opinion we can't just dereference the structure pointer in similar manner to a structure variable as it is not simple data type but a complex data type or data structure.
Please help me in resolving the concept behind this.
The only problem is that you have to guarantee that the passed void* points to a variable of the correct struct type. As long as it does, everything will work fine.
The question is why you would use a void pointer and not the expected struct, but I assume this function is part of some generic programming setup, otherwise it wouldn't make sense to use void pointers.
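One common generic setup of that kind is the standard qsort(), which passes elements to its comparator as const void *. The sketch below reuses struct my_struct from the question; the function names are made up for illustration.

```c
#include <assert.h>
#include <stdlib.h>

struct my_struct
{
    int num1;
    int num2;
};

/* qsort() only knows element addresses; the comparator has to know
   (and guarantee) that they really point at struct my_struct objects. */
static int compare_by_num1(const void *lhs, const void *rhs)
{
    const struct my_struct *a = lhs;
    const struct my_struct *b = rhs;
    return (a->num1 > b->num1) - (a->num1 < b->num1);
}

/* Sorts items in ascending order of num1. */
void sort_by_num1(struct my_struct *items, size_t count)
{
    assert(items != NULL);
    qsort(items, count, sizeof *items, compare_by_num1);
}
```

In C a void * converts to any object pointer type implicitly, so the comparator needs no casts.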
However, if you would attempt something "hackish" like this:
int arr[2] = {100, 150};
Display_struct(arr); // BAD
Then there are no longer any guarantees: the above code will compile just fine, but it invokes undefined behavior and therefore may crash and burn. The struct may contain padding bytes between or after its members, and the code also breaks the "strict aliasing" rules of C.
(Aliasing refers to the rules stated in the C standard, section 6.5 "Expressions", paragraph 7.)
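If the data really does start out as an int array, a well-defined alternative is to build the struct member by member. This is only a sketch, and from_pair() is a name invented here.

```c
#include <assert.h>
#include <stddef.h>

struct my_struct
{
    int num1;
    int num2;
};

/* Builds a struct from two ints without reinterpreting the array's
   storage, so padding and aliasing rules never come into play. */
struct my_struct from_pair(const int pair[2])
{
    assert(pair != NULL);

    struct my_struct result;
    result.num1 = pair[0];
    result.num2 = pair[1];
    return result;
}
```

The result can then be passed as Display_struct(&result), which satisfies the requirement that the void * points at an object of the struct type.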
You are thinking up a problem where there isn't any. A struct type (an aggregate data type) is technically not very different from any other type.
If we look at things on the lower level, a variable of any type (including a struct type) is just some number of bits in memory.
The type determines the number of bits in a variable and their interpretation.
Effectively, whether you dereference a pointer-to-int or a pointer-to-struct, you just get the chunk of bits your pointer points to.
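A short sketch of what that means in practice: dereferencing a pointer-to-struct and assigning the result copies the whole object, exactly as it would for an int. The function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct my_struct
{
    int num1;
    int num2;
};

/* Copies the pointed-to struct, changes only the copy, and returns
   the sum of the copy's members. The original object is untouched. */
int sum_of_modified_copy(const void *src)
{
    assert(src != NULL);

    struct my_struct copy = *(const struct my_struct *)src;
    copy.num1 += 1;
    return copy.num1 + copy.num2;
}
```

After the call, the caller's struct still holds its old values, because the assignment made an independent copy.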
In your main function, tmp_struct is a variable of type struct my_struct, not a pointer. That is fine, because you pass the address of tmp_struct to the function void Display_struct(void * dest_var).
The function then receives its argument as a void *, which holds the address of tmp_struct.
Inside the function you dereference it correctly:
struct_ptr = *((struct my_struct *)dest_var);
Here you cast the void * to struct my_struct * and dereference it. This is correct because the object you passed really has that type; otherwise the behavior is undefined and would likely show up as a run-time problem.
No matter how complex your data type or data structure is, dereferencing works the same way.
But if the parameter type is void *, make sure you pass the address of a struct my_struct to the function.
I have a question about some code in Eric Roberts' Programming Abstractions in C. He uses several libraries of his own both to simplify things for readers and to teach how to write libraries. (All of the library code for the book can be found on this site.)
One library, genlib provides a macro for generic allocation of a pointer to a struct type. I don't understand part of the macro. I'll copy the code below, plus an example of how it is meant to be used, then I'll explain my question in more detail.
/*
* Macro: New
* Usage: p = New(pointer-type);
* -----------------------------
* The New pseudofunction allocates enough space to hold an
* object of the type to which pointer-type points and returns
* a pointer to the newly allocated pointer. Note that
* "New" is different from the "new" operator used in C++;
* the former takes a pointer type and the latter takes the
* target type.
*/
#define New(type) ((type) GetBlock(sizeof *((type) NULL)))
/* GetBlock is a wrapper for malloc. It encapsulates the
* common sequence of malloc, check for NULL, return or
* error out, depending on the NULL check. I'm not going
* to copy that code since I'm pretty sure it isn't
* relevant to my question. It can be found here though:
* ftp://ftp.awl.com/cseng/authors/roberts/cs1-c/standard/genlib.c
*/
Roberts intends for the code to be used as follows:
typedef struct {
    string name;
    /* etc. */
} *employeeT;
employeeT emp;
emp = New(employeeT);
He prefers to use a pointer to the record as the type name, rather than the record itself. So New provides a generic way to allocate such struct records.
In the macro New, what I don't understand is this: sizeof *((type) NULL). If I'm reading that correctly, it says "take the size of the dereferenced cast of NULL to whatever pointer type type represents in a given call". I think I understand the dereferencing: we want to allocate enough space for the struct; the size of the pointer is not what we need, so we dereference to get at the size of the underlying record type. But I don't understand the idea of casting NULL to a type.
My questions:
You can cast NULL? What does that even mean?
Why is the cast necessary? When I tried removing it, the compiler says error: expected expression. So, sizeof *(type) is not an expression? That confused me since I can do the following to get the sizes of arbitrary pointers-to-structs:
#define struct_size(s_ptr) do { \
    printf("sizeof dereferenced pointer to struct %s: %zu\n", \
           #s_ptr, sizeof *(s_ptr)); \
} while(0)
Edit: As many people point out below, the two examples aren't the same:
/* How genlib uses the macro. */
New(struct MyStruct*)
/* How I was using my macro. */
struct MyStruct *ptr; struct_size(ptr)
For the record, this isn't homework. I'm an amateur trying to improve at C. Also, there's no problem with the code, as far as I can tell. That is, I'm not asking how I can do something different with it. I'm just trying to better understand (1) how it works and (2) why it must be written the way it is. Thanks.
The issue is that the macro needs to get the size of the type pointed at by the pointer type.
As an example, suppose that you have the pointer type struct MyStruct*. Without removing the star from this expression, how would you get the size of struct MyStruct? You couldn't write
sizeof(*(struct MyStruct*))
since that's not legal C code.
On the other hand, if you had a variable of type struct MyStruct*, you could do something like this:
struct MyStruct* uselessPointer;
sizeof(*uselessPointer);
Since sizeof doesn't actually evaluate its argument (it just determines the static size of the type of the expression), this is safe.
Of course, in a macro that has to expand to an expression, you can't declare a new variable. However, you can conjure up a pointer of type struct MyStruct* by casting an existing pointer value. Here, NULL is a good candidate - it's a pointer value that you can legally cast to struct MyStruct*. Therefore, if you were to write
sizeof(* ((struct MyStruct*)NULL))
the code would
Cast NULL to a struct MyStruct*, yielding a pointer of static type struct MyStruct*.
Determine the size of the object that would be formed by dereferencing the pointer. Since the pointer has type struct MyStruct*, it points at an object of type struct MyStruct, so this yields the size of struct MyStruct.
In other words, it's a simple way to get an object of the pointer type so that you can dereference it and obtain an object of the underlying type.
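Here is a minimal sketch of the same trick in isolation. NEW is a stand-in that I wrote for this answer, not genlib's actual New: it calls malloc directly instead of GetBlock, so unlike the original it can return NULL. The record fields are also made up, with const char * replacing genlib's string type.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for genlib's New. The cast gives NULL the static
   type ptr_type; sizeof then reports the size of what that type points to,
   without ever evaluating the dereference. */
#define NEW(ptr_type) ((ptr_type) malloc(sizeof *((ptr_type) NULL)))

/* The struct has no tag and no typedef of its own, so the pointer type
   is the only name through which its size can be reached. */
typedef struct
{
    const char *name;
    int id;
} *employeeT;

/* Allocates and fills one record; returns NULL if allocation fails. */
employeeT new_employee(const char *name, int id)
{
    employeeT emp = NEW(employeeT);
    if (emp != NULL)
    {
        emp->name = name;
        emp->id = id;
    }
    return emp;
}
```

This also shows why the macro takes the pointer type: with an anonymous struct like this one there is no struct name to hand to sizeof.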
I've worked with Eric on some other macros and he is a real pro with the preprocessor. I'm not surprised that this works, and I'm not surprised that it's tricky, but it certainly is clever!
As a note - in C++, this sort of trick used to be common until the introduction of the declval utility (std::declval, a function template added in C++11), which is a less hacky version of this operation.
Hope this helps!
It's a hack. It relies on the fact that the argument to the sizeof operator isn't actually evaluated.
To answer your specific questions:
Yes. NULL is a null pointer constant, and like any other pointer value it may be cast to a pointer type.
sizeof operates on either a type or an expression. *(type) would be neither (after macro substitution has occurred), it would be a syntax error.
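That the operand is not evaluated can be seen directly with a side effect. This is a small sketch with an invented function name; the one exception to the rule, variable length arrays, does not apply here.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the value of counter after it has appeared, with ++ applied,
   as the operand of sizeof. */
int counter_after_sizeof(void)
{
    int counter = 0;

    /* Only the type of the operand (int) matters; the increment
       is never executed. */
    size_t size = sizeof(counter++);
    assert(size == sizeof(int));

    return counter;
}
```

The function returns 0, which is the same reason dereferencing a cast NULL inside sizeof is harmless.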
Consider the following C code:
typedef char * MYCHAR;
MYCHAR x;
My understanding is that the result would be that x is a pointer to char. However, if the declaration of x were to occur far away from the typedef, a human reader of the code would not immediately know that x is a pointer. Alternatively, one could use
typedef char MYCHAR;
MYCHAR *x;
Which is considered to be better form? Is this more than a matter of style?
If the pointer is never meant to be dereferenced or otherwise manipulated directly -- IOW, you only pass it as an argument to an API -- then it's okay to hide the pointer behind a typedef.
Otherwise, it's better to make the "pointerness" of the type explicit.
I would use pointer typedefs only in situations when the pointer nature of the resultant type is of no significance. For example, pointer typedef is justified when one wants to declare an opaque "handle" type which just happens to be implemented as a pointer, but is not supposed to be usable as a pointer by the user.
typedef struct HashTableImpl *HashTable;
/* 'struct HashTableImpl' is (or is supposed to be) an opaque type */
In the above example, HashTable is a "handle" for a hash table. The user will receive that handle initially from, say, CreateHashTable function and pass it to, say, HashInsert function and such. The user is not supposed to care (or even know) that HashTable is a pointer.
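A sketch of what such a handle-based API might look like follows. Only HashTable, CreateHashTable and HashInsert are named above; the signatures, HashLookup, DestroyHashTable and the int-to-int table layout are assumptions made purely for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* --- What a header would expose: the handle and functions only. --- */
typedef struct HashTableImpl *HashTable;

HashTable CreateHashTable(void);
int HashInsert(HashTable table, int key, int value);      /* 1 on success */
int HashLookup(HashTable table, int key, int *value_out); /* 1 if found */
void DestroyHashTable(HashTable table);

/* --- What the implementation file would keep to itself. --- */
enum { BUCKET_COUNT = 16 };

struct Entry
{
    int key;
    int value;
    struct Entry *next;
};

struct HashTableImpl
{
    struct Entry *buckets[BUCKET_COUNT];
};

static size_t bucket_index(int key)
{
    return (unsigned int)key % BUCKET_COUNT;
}

HashTable CreateHashTable(void)
{
    HashTable table = malloc(sizeof *table);
    if (table != NULL)
        for (size_t i = 0; i < BUCKET_COUNT; i++)
            table->buckets[i] = NULL;
    return table;
}

int HashInsert(HashTable table, int key, int value)
{
    assert(table != NULL);
    size_t i = bucket_index(key);

    /* Overwrite the value if the key is already present. */
    for (struct Entry *e = table->buckets[i]; e != NULL; e = e->next)
        if (e->key == key)
        {
            e->value = value;
            return 1;
        }

    struct Entry *fresh = malloc(sizeof *fresh);
    if (fresh == NULL)
        return 0;
    fresh->key = key;
    fresh->value = value;
    fresh->next = table->buckets[i];
    table->buckets[i] = fresh;
    return 1;
}

int HashLookup(HashTable table, int key, int *value_out)
{
    assert(table != NULL && value_out != NULL);
    for (struct Entry *e = table->buckets[bucket_index(key)]; e != NULL; e = e->next)
        if (e->key == key)
        {
            *value_out = e->value;
            return 1;
        }
    return 0;
}

void DestroyHashTable(HashTable table)
{
    if (table == NULL)
        return;
    for (size_t i = 0; i < BUCKET_COUNT; i++)
    {
        struct Entry *e = table->buckets[i];
        while (e != NULL)
        {
            struct Entry *next = e->next;
            free(e);
            e = next;
        }
    }
    free(table);
}
```

Client code never writes * or -> on a HashTable; every operation goes through a function, which is what makes hiding the pointer defensible here.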
But in cases when the user is supposed to understand that the type is actually a pointer and is usable as a pointer, pointer typedefs are significantly obfuscating the code. I would avoid them. Declaring pointers explicitly makes code more readable.
It is interesting to note that the C standard library avoids such pointer typedefs. For example, FILE is obviously intended to be used as an opaque type, which means that the library could have defined it as typedef <some pointer type> FILE instead of making us use FILE * all the time. But for some reason they decided not to.
I don't particularly like typedef to a pointer, but there is one advantage to it. It removes confusion and common mistakes when you declare more than one pointer variable in a single declaration.
typedef char *PSTR;
...
PSTR str1, str2, str3;
is arguably clearer than:
char *str1, str2, str3; // oops
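The difference can be checked with sizeof. This is a sketch with invented function names; the raw variables get initial values only to keep the example tidy.

```c
#include <assert.h>
#include <stddef.h>

typedef char *PSTR;

/* With the typedef, every declarator in the list is a char *. */
size_t size_of_second_with_typedef(void)
{
    PSTR str1 = NULL, str2 = NULL, str3 = NULL;
    static_assert(sizeof str3 == sizeof str1, "str3 is a pointer as well");
    return sizeof str2;
}

/* Without it, the * belongs to the first declarator only:
   raw2 and raw3 are plain char objects. */
size_t size_of_second_without_typedef(void)
{
    char *raw1 = NULL, raw2 = 'b', raw3 = 'c';
    static_assert(sizeof raw3 == sizeof(char), "raw3 is a plain char");
    (void)raw1;
    return sizeof raw2;
}
```

The first function returns sizeof (char *), the second returns 1.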
I prefer leaving the *; it shows there's a pointer. And your second example should be shortened to char *x; the typedef there serves no purpose.
I also think this is a matter of style/convention. In Apple's Core Graphics library they frequently "hide" the pointer and use a convention of appending "Ref" to the end of the type. So for example, CGImage * corresponds to CGImageRef. That way you still know it's a pointer reference.
Another way to look at it is from the perspective of types. A type defines the operations that are possible on that type, and the syntax to invoke these operations. From this perspective, MYCHAR is whatever it is. It is the programmer's responsibility to know the operations allowed on it. If it is declared like the first example, then it supports the * operator. You can always name the identifier appropriately to clarify its use.
Other cases where it is useful to declare a type that is a pointer is when the nature of the parameter is opaque to the user (programmer). There may be APIs that want to return a pointer to the user, and expect the user to pass it back to the API at some other point. Like an opaque handle or a cookie, to be used by the API only internally. The user does not care about the nature of the parameter. It would make sense not to muddy the waters or expose its exact nature by exposing the * in the API.
If you look at several existing APIs, it looks as if not putting the pointerness into the type seems better style:
the already mentioned FILE *
the MYSQL * returned by MySQL's mysql_real_connect()
the MYSQL_RES * returned by MySQL's mysql_store_result() and mysql_use_result()
and probably many others.
For an API it is not necessary to hide structure definitions and pointers behind "abstract" typedefs.
/* This is part of the (hypothetical) WDBC- API
** It could be found in wdbc_api.h
** The struct connection and struct statement are both incomplete types,
** but we are allowed to use pointers to incomplete types, as long as we don't
** dereference them.
*/
struct connection *wdbc_connect (char *connection_string);
int wdbc_disconnect (struct connection *con);
struct statement *wdbc_prepare (struct connection * con, char *statement);
int main(void)
{
    struct connection *conn;
    struct statement *stmt;
    int rc;
    conn = wdbc_connect( "host='localhost' database='pisbak' username='wild' password='plasser'" );
    stmt = wdbc_prepare (conn, "Select id FROM users where name='wild'" );
    rc = wdbc_disconnect (conn);
    return 0;
}
The above fragment compiles fine. (but it fails to link, obviously)
Is this more than a matter of style?
Yes. For instance, this:
typedef int *ip;
const ip p;
is not the same as:
const int *p; // p is non-const pointer to const int
It is the same as:
int * const p; // p is constant pointer to non-const int
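A sketch that exercises both declarations shows which operations each one permits. The function names are invented, and the lines that would not compile are left as comments.

```c
#include <assert.h>

typedef int *ip;

/* const ip means int * const: the pointer is fixed, the int is writable. */
int write_through_const_ip(void)
{
    int x = 1;
    const ip p = &x;
    *p = 5;            /* allowed: the pointee is not const */
    /* p = &x; */      /* would not compile: p itself is const */
    return x;
}

/* const int * is the opposite: the pointer can be reseated,
   but the int cannot be modified through it. */
int reseat_pointer_to_const(void)
{
    int x = 1, y = 2;
    const int *q = &x;
    q = &y;            /* allowed: q itself is not const */
    /* *q = 5; */      /* would not compile: the pointee is const via q */
    assert(q == &y);
    return *q;
}
```

The typedef name is treated as a single unit, so const applies to the pointer as a whole and cannot reach inside it to the pointed-to int.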
Read more about this under "typedef pointer const weirdness".
./drzwoposzukiwanbinarnych.c:84:24: error: expected expression before ')' token
char getNewSlowo(){
    slowa *wyraz = (wyraz*) malloc(sizeof(slowa)); //LINE WITH ERROR
    scanf("%s",wyraz->slowo);
    return wyraz->slowo;
}
What I am trying to do?
So, I have a struct:
typedef struct node{
    char *word;
    unsigned int arity;
    struct node *left,*right,*parent;
}baza;
I want the pointer word to point to the char slowo[30] array defined below.
typedef struct word{
    char slowo[30];
}slowa;
And the point I am stuck on is the error at the top of this question.
I am extremely tired of coding and my mind is completely overheated right now, so I am sorry if my question is not well formed.
But why am I trying to do this?
I had a problem with assigning a globally defined word to the pointer: I noticed that when I read a new word into that global word, the word in the struct (reached through the pointer) changed as well.
Just remove the cast (wyraz*) and all will be fine. If you insist on keeping it (although it is unneeded and often considered detrimental), it should be (slowa *) instead.
This:
slowa *wyraz = (wyraz*) malloc(sizeof(slowa));
tries to cast to wyraz, which is the name of the variable being declared, not a type. It's better to write it like so:
slowa *my_slowa = malloc(sizeof *my_slowa);
This removes the pointless cast, and uses the sizeof operator to ensure the number of bytes allocated matches the type of the pointer.
Code like this is a pretty good argument for never having this cast, in my opinion.
There's a reason typecasting is called type casting and not variable-name casting. What you're trying to do is use the name of the just-declared variable as a type name, which of course makes no sense. If you intend to cast the return value of malloc(), you should use a type, and not a variable name:
slowa *wyraz = (slowa *)malloc(sizeof(slowa));
However, in C, you should not cast the return value of malloc. Furthermore, it would be less error prone if you used sizeof(*ThePointer) versus sizeof(TheType) just in case the type ever changes. All in all, write this:
slowa *wyraz = malloc(sizeof(*wyraz));
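Put together, a helper along these lines might look as follows. This is only a sketch: new_slowa() is a name I made up, it returns the struct pointer rather than a char, and it copies from a string argument instead of calling scanf.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct word
{
    char slowo[30];
} slowa;

/* Hypothetical helper: copies text into a freshly allocated slowa,
   truncating it to 29 characters. Returns NULL if allocation fails. */
slowa *new_slowa(const char *text)
{
    assert(text != NULL);

    slowa *wyraz = malloc(sizeof *wyraz);   /* no cast, no type name */
    if (wyraz == NULL)
        return NULL;

    /* snprintf never writes past the buffer and always terminates it. */
    snprintf(wyraz->slowo, sizeof wyraz->slowo, "%s", text);
    return wyraz;
}
```

Each call returns separate storage, so a node's word pointer can be set to the slowo member of its own slowa and will not change when another word is read later. If you keep scanf, a width such as "%29s" keeps the input within the 30-byte array.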