I recently wrote a wrapper for LevelDB in C and stumbled upon the following problem. The LevelDB function to store data in a database looks like this:
leveldb_put(leveldb_t* db, const leveldb_writeoptions_t* options, const char* key, size_t keylen, const char* val, size_t vallen, char** errptr);
For the key and value, they use a char*. That means I would have to cast any arguments that aren't char pointers, which happens a lot because I often store structs in the database.
After thinking about this I decided to use a void* for key and data in my wrapper function. It then looks something like this:
int db_put(db_t db, void *key, size_t keylen, void *value, size_t valuelen)
{
char *k = (char*)key;
char *v = (char*)value;
/* Call leveldb_put() here with k and v as parameters. */
return 0;
}
This way I don't have to cast the arguments I pass to my db_put() function. I think this solution is more elegant, but I guess LevelDB knew what they were doing when they chose the char pointers.
Is there a reason not to use void* to pass arbitrary data to a function?
Is there a reason not to use void* to pass arbitrary data to a function?
No. In fact, void * exists to facilitate passing arbitrary data without the need for ugly casting. That's why ptr-to-void was standardized. In C at least. C++ is a different beast.
At LevelDB they have to deal with historical code born with char *, or pre-C89 compilers, or some other veiled reason causing refactoring inertia. Their code would work with ptrs-to-void just as well.
Note that in your version of db_put the casts should be removed as they are redundant.
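For illustration, here is a minimal sketch of the cast-free wrapper (the db_t layout and its handle/write_options fields are my assumptions; leveldb_put() and leveldb_free() are the real C API):

#include <leveldb/c.h>

typedef struct {
    leveldb_t *handle;                     /* assumed field names */
    leveldb_writeoptions_t *write_options;
} db_t;

int db_put(db_t db, const void *key, size_t keylen,
           const void *value, size_t valuelen)
{
    char *err = NULL;
    /* void* converts implicitly to const char* in C -- no casts needed */
    leveldb_put(db.handle, db.write_options,
                key, keylen, value, valuelen, &err);
    if (err != NULL) {
        leveldb_free(err);  /* LevelDB allocates the error string */
        return -1;
    }
    return 0;
}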
The currently accepted answer exists only to flatter the OP; it's actually slightly invalid.
Consider that your arbitrary struct may (most likely) have padding bytes somewhere. The value of those padding bytes is indeterminate, and may well be significant here.
Consider what might happen if you put a struct as key that has padding bytes, and you then attempt to get the value for that key which is otherwise equal except for the padding bytes.
Consider also how you might handle pointer members, if you choose to do so in the future.
If you intend to use a struct as a key, it would be a good idea to serialise it, so you can guarantee retrieval of the corresponding value without worrying about those padding bytes.
Perhaps you could pass a function pointer telling your wrapper how to serialise the key into a string...
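Building on that idea, a hypothetical sketch (serialize_fn, db_put_serialized and the 256-byte key buffer are all illustrative assumptions; db_put is the OP's wrapper):

/* The caller tells the wrapper how to turn a key object into bytes,
 * so padding never reaches the database. */
typedef size_t (*serialize_fn)(const void *obj, char *out, size_t outcap);

int db_put_serialized(db_t db, const void *key, serialize_fn serialize,
                      void *value, size_t valuelen)
{
    char buf[256];  /* assumption: keys serialize to at most 256 bytes */
    size_t keylen = serialize(key, buf, sizeof buf);
    return db_put(db, buf, keylen, value, valuelen);
}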
A void* can be considered a black box that holds a pointer. By holding it in a void* you are effectively saying that you don't care what it points to at that point, so no assumptions can be made about it.
void* is a "true" generic pointer, and can be directly assigned to and from any object pointer type without a cast.
Meanwhile, a char* explicitly specifies the type of the pointed-to object. Initially there was no void*, and char* was used to represent generic pointers as well. When char* is used that way, an explicit cast is required; the use of char* as a generic pointer is not recommended, because it may create confusion, as it did back then, when it was hard to tell whether a char* held a string or some generic data.
Also, it is legal to perform arithmetic on a char*, but not on a void*.
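A short example of that difference:

#include <stdio.h>

int main(void)
{
    char buffer[8] = "abcdefg";

    char *c = buffer;
    c += 3;                     /* fine: arithmetic on char* is defined */
    printf("%c\n", *c);         /* prints 'd' */

    void *v = buffer;
    /* v += 3; */               /* constraint violation in standard C
                                   (GCC allows it only as an extension) */
    char *w = (char *)v + 3;    /* portable: convert to char* first */
    printf("%c\n", *w);         /* prints 'd' */
    return 0;
}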
The downside of using void* follows from this very genericity: it hides the actual type of the data you're storing, which prevents the compiler and other tools from detecting type errors.
In your specific situation there is no problem in using void* instead of char*, so you can use void* without worries.
Edit: Updated and reformulated the answer to correct some wrong info.
You should be serializing to some standard format like JSON instead of dealing with raw data like that. It looks very error-prone unless you always assume that the arbitrary data is just a byte buffer, in which case I would use a uint8_t pointer (which is in practice an unsigned char*) and cast all data structures to it, so that the routine just thinks it is dealing with a byte buffer.
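As a minimal illustration of that byte-buffer view (struct record is just an example type; note that memcpy copies any padding bytes verbatim, which is exactly what another answer warns about):

#include <stdint.h>
#include <string.h>

struct record { int id; double score; };   /* example type */

/* Copy the struct's object representation into a plain byte buffer. */
void store_bytes(const struct record *r, uint8_t *out)
{
    memcpy(out, r, sizeof *r);
}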
A note on void*: I almost never use them. Think carefully when you introduce void pointers, because in most cases you can get by with the standard types, which helps avoid future bugs. The only places where you should use void* are places like malloc(), where there is not really a better way.
Related
I'm currently reprogramming some C libraries. So my question is: why does memchr take a const void *s pointer and not simply a character pointer?
I see that you could probably search for other stuff, but the function is literally called memchr so why use a void pointer?
Thanks in advance!
C’s early development involved a fair amount of treating things as individual bytes and allowing programmers access to the bytes that represented objects. The name memchr reflects that history and cannot readily be changed now.
void was added to the language to provide a type that represents, in a sense, "something is there, but this is flexible." The first parameter to memchr has type const void * because the routine may be used to search for a byte in any type of object, not necessarily an object that is primarily an array of characters.
It's the same reason why other functions declared in string.h use void*: to avoid a cast.
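For example, this compiles without a cast precisely because the parameter is const void*:

#include <stdio.h>
#include <string.h>

int main(void)
{
    int values[4] = { 0x11223344, 0, 0, 0 };

    /* memchr scans the bytes of any object; no cast is needed because
       any object pointer converts implicitly to const void*. */
    unsigned char *hit = memchr(values, 0x22, sizeof values);
    if (hit != NULL)
        printf("found at byte offset %td\n",
               hit - (unsigned char *)values);
    return 0;
}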
Maybe you're right, the name is misleading, especially considering it's part of string.h. Then again, so are memmove, memcmp, memset, etc...
Before the void data type was introduced in the 1989 standard (-std=c89), functions like memchr() used to take char *, because that was the closest thing to a generic pointer at the time.
If the parameter were still const char * today, you would need to cast from x * to const char * at every call.
You can also understand this using analogy:
strcpy() -> string copy
memcpy() -> memory (binary) copy
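A side-by-side sketch of that analogy:

#include <string.h>

int main(void)
{
    char src[] = "hello", dst[sizeof src];
    int  nums[3] = { 1, 2, 3 }, copy[3];

    strcpy(dst, src);                /* string copy: char*, stops at '\0' */
    memcpy(copy, nums, sizeof nums); /* memory copy: void*, exactly n bytes,
                                        works on any object type */
    return 0;
}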
This question goes out to the C gurus out there:
In C, it is possible to declare a pointer as follows:
char (* p)[10];
.. which basically states that this pointer points to an array of 10 chars. The neat thing about declaring a pointer like this is that you will get a compile-time error if you try to assign a pointer to an array of a different size to p. It will also give you a compile-time error if you try to assign the value of a simple char pointer to p. I tried this with gcc and it seems to work with ANSI, C89 and C99.
It looks to me like declaring a pointer like this would be very useful - particularly, when passing a pointer to a function. Usually, people would write the prototype of such a function like this:
void foo(char * p, int plen);
If you were expecting a buffer of a specific size, you would simply test the value of plen. However, you cannot be guaranteed that the person who passes p to you will really give you plen valid memory locations in that buffer. You have to trust that the person who called this function is doing the right thing. On the other hand:
void foo(char (*p)[10]);
..would force the caller to give you a buffer of the specified size.
This seems very useful, but I have never seen a pointer declared like this in any code I have ever run across.
My question is: Is there any reason why people do not declare pointers like this? Am I not seeing some obvious pitfall?
What you are saying in your post is absolutely correct. I'd say that every C developer comes to exactly the same discovery and to exactly the same conclusion when (if) they reach a certain level of proficiency with the C language.
When the specifics of your application area call for an array of specific fixed size (array size is a compile-time constant), the only proper way to pass such an array to a function is by using a pointer-to-array parameter
void foo(char (*p)[10]);
(in C++ language this is also done with references
void foo(char (&p)[10]);
).
This will enable language-level type checking, which will make sure that the array of exactly correct size is supplied as an argument. In fact, in many cases people use this technique implicitly, without even realizing it, hiding the array type behind a typedef name
typedef int Vector3d[3];
void transform(Vector3d *vector);
/* equivalent to `void transform(int (*vector)[3])` */
...
Vector3d vec;
...
transform(&vec);
Note additionally that the above code is invariant with relation to Vector3d type being an array or a struct. You can switch the definition of Vector3d at any time from an array to a struct and back, and you won't have to change the function declaration. In either case the functions will receive an aggregate object "by reference" (there are exceptions to this, but within the context of this discussion this is true).
However, you won't see this method of array passing used explicitly too often, simply because too many people get confused by a rather convoluted syntax and are simply not comfortable enough with such features of C language to use them properly. For this reason, in average real life, passing an array as a pointer to its first element is a more popular approach. It just looks "simpler".
But in reality, using the pointer to the first element for array passing is a very niche technique, a trick, which serves a very specific purpose: its one and only purpose is to facilitate passing arrays of different size (i.e. run-time size). If you really need to be able to process arrays of run-time size, then the proper way to pass such an array is by a pointer to its first element with the concrete size supplied by an additional parameter
void foo(char p[], unsigned plen);
Actually, in many cases it is very useful to be able to process arrays of run-time size, which also contributes to the popularity of the method. Many C developers simply never encounter (or never recognize) the need to process a fixed-size array, thus remaining oblivious to the proper fixed-size technique.
Nevertheless, if the array size is fixed, passing it as a pointer to an element
void foo(char p[])
is a major technique-level error, which unfortunately is rather widespread these days. A pointer-to-array technique is a much better approach in such cases.
Another reason that might hinder the adoption of the fixed-size array passing technique is the dominance of the naive approach to typing of dynamically allocated arrays. For example, if the program calls for fixed arrays of type char[10] (as in your example), an average developer will malloc such arrays as
char *p = malloc(10 * sizeof *p);
This array cannot be passed to a function declared as
void foo(char (*p)[10]);
which confuses the average developer and makes them abandon the fixed-size parameter declaration without giving it a further thought. In reality though, the root of the problem lies in the naive malloc approach. The malloc format shown above should be reserved for arrays of run-time size. If the array type has compile-time size, a better way to malloc it would look as follows
char (*p)[10] = malloc(sizeof *p);
This, of course, can be easily passed to the above declared foo
foo(p);
and the compiler will perform the proper type checking. But again, this is overly confusing to an unprepared C developer, which is why you won't see it too often in the "typical" average everyday code.
I would like to add to AndreyT's answer (in case anyone stumbles upon this page looking for more info on this topic):
As I begin to play more with these declarations, I realize that there is a major handicap associated with them in C (apparently not in C++). It is fairly common to have a situation where you would like to give a caller a const pointer to a buffer you have written into. Unfortunately, this is not possible when declaring a pointer like this in C. In other words, the C standard (6.7.3, paragraph 8) is at odds with something like this:
int array[9];
const int (* p2)[9] = &array; /* Not legal unless array is const as well */
This constraint does not seem to be present in C++, making these types of declarations far more useful. But in the case of C, it is necessary to fall back to a regular pointer declaration whenever you want a const pointer to the fixed-size buffer (unless the buffer itself was declared const to begin with).
This is a severe constraint in my opinion and it could be one of the main reasons why people do not usually declare pointers like this in C. The other being the fact that most people do not even know that you can declare a pointer like this as AndreyT has pointed out.
The obvious reason is that this code doesn't compile:
extern void foo(char (*p)[10]);
void bar() {
char p[10];
foo(p);
}
By default, an array argument decays to an unqualified pointer to its first element, so the call passes a char*, not a char (*)[10].
Also see this question, using foo(&p) should work.
I also want to use this syntax to enable more type checking.
But I also agree that the syntax and mental model of using pointers is simpler, and easier to remember.
Here are some more obstacles I have come across.
Accessing the array requires using (*p)[]:
void foo(char (*p)[10])
{
char c = (*p)[3];
(*p)[0] = 1;
}
It is tempting to use a local pointer-to-char instead:
void foo(char (*p)[10])
{
char *cp = (char *)p;
char c = cp[3];
cp[0] = 1;
}
But this would partially defeat the purpose of using the correct type.
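Note, though, that the cast isn't even needed for that shortcut: *p has type char[10], which decays to char* by itself, keeping the type relationship visible:

void foo(char (*p)[10])
{
    char *cp = *p;      /* char[10] decays to char*; no cast required */
    char c = cp[3];
    cp[0] = 1;
    (void)c;            /* silence the unused-variable warning */
}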
One has to remember to use the address-of operator when assigning an array's address to a pointer-to-array:
char a[10];
char (*p)[10] = &a;
The address-of operator gets the address of the whole array in &a, with the correct type to assign it to p. Without the operator, a is automatically converted to the address of the first element of the array, same as in &a[0], which has a different type.
Since this automatic conversion is already taking place, I am always puzzled that the & is necessary. It is consistent with the use of & on variables of other types, but I have to remember that an array is special and that I need the & to get the correct type of address, even though the address value is the same.
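The three forms side by side:

char a[10];

char *p1 = a;           /* decays to &a[0]: type char*          */
char *p2 = &a[0];       /* same value, same type                */
char (*p3)[10] = &a;    /* same address, but type char (*)[10]  */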
One reason for my problem may be that I learned K&R C back in the 80s, which did not allow using the & operator on whole arrays yet (although some compilers ignored that or tolerated the syntax). Which, by the way, may be another reason why pointers-to-arrays have a hard time to get adopted: they only work properly since ANSI C, and the & operator limitation may have been another reason to deem them too awkward.
When typedef is not used to create a type for the pointer-to-array (in a common header file), then a global pointer-to-array needs a more complicated extern declaration to share it across files:
fileA:
char (*p)[10];
fileB:
extern char (*p)[10];
Well, simply put, C doesn't do things that way. An array of type T is passed around as a pointer to the first T in the array, and that's all you get.
This allows for some cool and elegant algorithms, such as looping through the array with expressions like
*dst++ = *src++
The downside is that management of the size is up to you. Unfortunately, failure to do this conscientiously has also led to millions of bugs in C coding, and/or opportunities for malevolent exploitation.
What comes close to what you ask in C is to pass around a struct (by value) or a pointer to one (by reference). As long as the same struct type is used on both sides of this operation, both the code that hands out the reference and the code that uses it are in agreement about the size of the data being handled.
Your struct can contain whatever data you want; it could contain your array of a well-defined size.
Still, nothing prevents you or an incompetent or malevolent coder from using casts to fool the compiler into treating your struct as one of a different size. The almost unshackled ability to do this kind of thing is a part of C's design.
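A minimal sketch of that struct approach (the names buf10, fill and copy are just illustrative):

struct buf10 {
    char data[10];          /* the size travels with the type */
};

void fill(struct buf10 *b)  /* "by reference": size fixed by the type */
{
    b->data[0] = 'x';
}

struct buf10 copy(struct buf10 b)  /* by value: the array is copied too */
{
    return b;
}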
You can declare an array of characters a number of ways:
char p[10];
char* p = (char*)malloc(10 * sizeof(char));
The prototype to a function that takes an array by value is:
void foo(char* p); //cannot modify p
or by reference:
void foo(char** p); //can modify p, dereference by *p[0] = 'f';
or by array syntax:
void foo(char p[]); //same as char*
I would not recommend this solution
typedef int Vector3d[3];
since it obscures the fact that Vector3d is an array type, which you must know about when using it. Programmers usually don't expect variables of the same type to have different sizes. Consider:
void foo(Vector3d a) {
Vector3d b;
}
where sizeof a != sizeof b
Maybe I'm missing something, but... since arrays are constant pointers, basically that means that there's no point in passing around pointers to them.
Couldn't you just use void foo(char p[10], int plen); ?
type (*)[];
// pointer to an array, e.g.
int (*ptr)[5];
// points to an array of 5 ints;
// it holds the address of the whole array
type *[];
// an array of pointers, e.g.
int *ptr[5];
// an array of five int pointers,
// i.e. it holds 5 addresses.
On my compiler (VS2008) it treats char (*p)[10] as an array of character pointers, as if there were no parentheses, even when I compile it as a C file. Is compiler support for this "variable"? If so, that is a major reason not to use it.
I'm currently making a small library which can access random.org and get random strings or integers. I've run into a slight problem now though, design-wise, and I can't decide which of my two approaches would be the best, so I will write down my thoughts about both and ask the questions that I find relevant to both.
I currently have a struct like this:
typedef struct {
char **array;
size_t row;
size_t col;
size_t size;
} MemoryStruct;
And this is the source of my "problem", as I have now added integer handling to my library.
As can be seen, the pointer is currently a char pointer, and as I want to be able to handle both integers and chars, should I add another pointer - an integer pointer - or should I instead make it a void pointer and make a function that will return the correct type of pointer?
The addition of an integer pointer would be the easiest, but I'm not sure how obvious it would be, to the guy who might use my library at some point in the future, that using the wrong pointer in the struct is why his program segfaults.
However, adding in a void pointer instead means that I will have to write a context-based function that will return a pointer of the correct type (if this is even possible - I quite honestly don't know if it is), which leads to the question:
Which return type would a function that can return different kinds of pointers have? (...it's obvious: a void pointer!) But then I still don't have the wanted type included for easy use by the programmer who's using my library - here, put into pseudocode, is what I would like to do:
pointer-with-type-info *fun(MemoryStruct *memory, RandomSession *session)
switch case on session->type
return pointer of appropriate type
It should be added here that from session->type it can be inferred which type the pointer should be.
Thank you in advance for reading this.
The way I understood your question, you know the pointer data type. I would use a void* instead of two pointers, because you may add more data types later without creating more fields.
To retrieve the correct value, one can create two functions: int getAsString(MemoryStruct, char **value) and int getAsInteger(MemoryStruct, int *value). These functions would return a non-zero value in case of success. This way, you will not only be able to retrieve the correct value, but you will also have a way to know whether the value can be retrieved as that data type.
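Here's a hypothetical sketch of those accessors, assuming the struct gains a void *data member and a type tag (all names are illustrative, not part of the OP's library):

#include <stddef.h>

typedef enum { RND_STRING, RND_INTEGER } RandomType;  /* assumed tag */

typedef struct {
    void      *data;   /* points at char** or int*, depending on type */
    RandomType type;
    size_t     count;
} MemoryStructV;       /* hypothetical void* variant of MemoryStruct */

/* Each accessor returns non-zero on success, zero if the stored
 * data is not of the requested kind. */
int getAsStrings(const MemoryStructV *m, char ***value)
{
    if (m->type != RND_STRING)
        return 0;
    *value = m->data;  /* implicit void* conversion, no cast needed */
    return 1;
}

int getAsIntegers(const MemoryStructV *m, int **value)
{
    if (m->type != RND_INTEGER)
        return 0;
    *value = m->data;
    return 1;
}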
So I'm trying to write a buffering library for the 64th time and I'm starting to get into some pretty advanced stuff. Thought I'd ask for some professional input on this.
In my first header file I have this:
typedef struct StdBuffer { void* address; } StdBuffer;
extern void StdBufferClear(StdBuffer);
In another header file that #includes the first header file I have this:
typedef struct CharBuffer { char* address; } CharBuffer;
void (*CharBufferClear)(CharBuffer) = (void*) StdBufferClear;
Will declaring this function pointer void interfere with the call? They have matching by-value signatures. I have never seen a function pointer declared void before, but it's the only way to get it to compile cleanly.
Stackwise it should not make any difference at all from what I learned in assembler coding.
(Irrelevant: OMG! I just said Stackwise on StackOverflow!)
Hmm.. Looks like I've assumed too much here. Allow me to clarify, if I may. I don't care what 'type' of data is stored at the address. All that I am concerned with is the size of a 'unit' and how many units are at the address. Take a look at the interface agreement contract for the API, if you will:
typedef struct StdBuffer {
size_t width; ///< The number of bytes that complete a data unit.
size_t limit; ///< The maximum number of data units that can be allocated for this buffer.
void * address; ///< The memory address for this buffer.
size_t index; ///< The current unit position indicator.
size_t allocated; ///< The current number of allocated addressable units.
StdBufferFlags flags;///< The API contract for this buffer.
} StdBuffer;
You see, memcpy, memmove and the like don't really care what's at an address; all they want are the specifics, which I'm clearly keeping track of here.
Have a look now at the first prototype to follow this contract:
typedef struct CharBuffer {
size_t width; ///< The number of bytes that complete a data unit.
size_t limit; ///< The maximum number of data units that can be allocated for this buffer.
char * address; ///< The memory address for this buffer.
size_t index; ///< The current unit position indicator.
size_t allocated; ///< The current number of allocated addressable units.
CharBufferFlags flags;///< The API contract for this buffer.
} CharBuffer;
As you can clearly see, the data type is irrelevant in this context. You can say that C handles it differently depending on the case, but at the end of the day an address is an address, a byte is a byte and a long is a long, for as long as we are dealing with memory on the same machine.
The purpose of this system, when brought together, is to remove all of this type-based juggling C seems to be so proud of (and rightfully so...). It's just pointless for what I would like to do, which is to create a contract-abiding prototype for any standard size of data (1, 2, 4, 8, sizeof(RandomStruct)) located at any address.
Having the ability to perform my own casting with code and manipulate that data with API functions that operate on specific-length blocks of memory with specific-length memory units. However, the prototype must contain the official data pointer type, because it just doesn't make sense for the end user to have to recast their data every time they would like to do something with that address pointer. It would not make sense to call it a CharBuffer if the pointer was void.
The StdBuffer is a generic type that is never EVER used except within the API itself, to manage all contract-abiding data types.
The API that this system will incorporate is from my latest edition of buffering, which is quite clearly documented here at Google Code. I am aware that some things will need to change to bring this all together; namely, I won't have the ability to manipulate data directly from within the API safely without lots of proper research and opinion gathering.
This just brought to my attention that I also need a Signed/Unsigned bit flag in the StdBufferFlags members.
Perhaps the final piece to this puzzle is also in order for your perusal.
/** \def BIT(I)
\brief A macro for setting a single constant bit.
*
* This macro sets the bit indicated by I to enabled.
* \param I the (1-based) index of the desired bit to set.
*/
#define BIT(I) (1UL << (I - 1))
/** \enum StdBufferFlags
\brief Flags that may be applied to all StdBuffer structures.
* These flags determine the contract of operations between the caller
* and the StdBuffer API for working with data. Bits 1-4 are for the
* API control functions. All other bits are undefined/don't care bits.
*
* If your application would like to use the don't care bits, it would
* be smart not to use bits 5-8, as these may become used by the API
* in future revisions of the software.
*/
typedef enum StdBufferFlags {
BUFFER_MALLOCD = BIT(1), ///< The memory address specified by this buffer was allocated by an API
BUFFER_WRITEABLE = BIT(2), ///< Permission to modify buffer contents using the API
BUFFER_READABLE = BIT(3), ///< Permission to retrieve buffer contents using the API
BUFFER_MOVABLE = BIT(4) ///< Permission to resize or otherwise relocate buffer contents using the API
}StdBufferFlags;
This code requires a diagnostic:
void (*CharBufferClear)(CharBuffer) = (void*) StdBufferClear;
You're converting a void * pointer to a function pointer without a cast. In C, a void * pointer can convert to pointers to object types without a cast, but not to function pointer types. (In C++, a cast is needed to convert void * to object types also, for added safety.)
What you want here is just to cast between function pointer types, i.e.:
void (*CharBufferClear)(CharBuffer) = (void (*)(CharBuffer)) StdBufferClear;
Then you are still doing the same type punning because the functions are different types. You are trying to call a function which takes a StdBuffer using a pointer to a function which takes a CharBuffer.
This type of code is not well-defined C. Having defeated the type system, you're on your own, relying on testing, examining the object code, or obtaining some assurances from the compiler writers that this sort of thing works with that compiler.
What you learned in assembler coding doesn't apply because assembly languages have only a small number of rudimentary data types such as "machine address" or "32 bit word". The concept that two data structures with an identical layout and low-level representation might be incompatible types does not occur in assembly language.
Even if two types look the same at the low level (another example: unsigned int and unsigned long are sometimes exactly the same) C compilers can optimize programs based on the assumption that the type rules have not been violated. For instance, suppose that A and B point to the same memory location. If you assign to an object A->member, a C compiler can assume that the object B->member is not affected by this, if A->member and B->member have incompatible types, like one being char * and the other void *. The generated code keeps caching the old value of B->member in a register, even though the in-memory copy was overwritten by the assignment to A->member. This is an example of invalid aliasing.
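A contrived sketch of that scenario (not taken from the question's code):

struct A { char *member; };
struct B { void *member; };

char *f(struct A *a, struct B *b)
{
    a->member = "first";
    b->member = 0;       /* if a and b alias the same storage ...     */
    return a->member;    /* ... the compiler may still return "first",
                            assuming the char* member was untouched */
}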
The standard does not define the results of casting a function-pointer to void *.
Equally, converting between function pointers and then calling through the wrong one is also undefined behaviour.
There are some constructs which any standards-conforming C compiler is required to implement consistently, and there are some constructs which 99% of C compilers do implement consistently, but which standards-conforming compilers would be free to implement differently. Attempting to cast a pointer to a function which takes one type of pointer, into a pointer to a function which takes another type of pointer, falls into the latter category.
Although the C standard specifies that a void* and a char* must be the same size, there is nothing that would require that they share the same bit-level storage format, much less the parameter-passing convention. While most machines allow bytes to be accessed in much the same way as words, such ability is not universal. The designer of an application binary interface [the document which specifies, among other things, how parameters are passed to routines] might specify that a char* be passed in a way which maximizes the efficiency of byte access, while a void* should be passed in a way that maximizes the efficiency of word access while retaining the ability to hold an unaligned byte address (perhaps by using a supplemental word to hold a zero or one to indicate LSB/MSB).
On such a machine, having a routine that expects a void* called from code that expects to pass a char* could cause the routine to access arbitrary wrong data.
No, it doesn't matter what data type is used to store the data. It only matters the type C uses to read and write that data, and that the data is of sufficient size.