Portability of a C code - c

I have the following code
int main()
{
int a=6;
void *p;
p=&a;
p++;
}
Does the void pointer here increment by a particular value (if it is holding the address of any data type) ?
In the above case p increments by 1 even though it is pointing to an integer value. According to me the above code invokes Implementation Defined behavior.

The code in not valid from standard C point of view. It is illegal to increment void * pointers, or any other pointers to incomplete types.
Your compiler implements it as an extension, which is, of course, non-portable.

Applying the ++ operator to void* is a GCC extension, which I'm told makes certain things a bit more convenient for very low-level programming (not having to cast so much, basically). I don't think I've ever hit a situation where I seriously don't want to cast to unsigned char*.
Use the -pedantic flag if you're writing code that's supposed to be portable.

Beside the answer of AndreyT:
Have a look at stdint.h which makes your code more portable.

Related

Syntax for defining a pointer to dynamically allocated memory [duplicate]

Now before people start marking this a dup, I've read all the following, none of which provide the answer I'm looking for:
C FAQ: What's wrong with casting malloc's return value?
SO: Should I explicitly cast malloc()’s return value?
SO: Needless pointer-casts in C
SO: Do I cast the result of malloc?
Both the C FAQ and many answers to the above questions cite a mysterious error that casting malloc's return value can hide; however, none of them give a specific example of such an error in practice. Now pay attention that I said error, not warning.
Now given the following code:
#include <string.h>
#include <stdio.h>
// #include <stdlib.h>
int main(int argc, char** argv) {
char * p = /*(char*)*/malloc(10);
strcpy(p, "hello");
printf("%s\n", p);
return 0;
}
Compiling the above code with gcc 4.2, with and without the cast gives the same warnings, and the program executes properly and provides the same results in both cases.
anon#anon:~/$ gcc -Wextra nostdlib_malloc.c -o nostdlib_malloc
nostdlib_malloc.c: In function ‘main’:
nostdlib_malloc.c:7: warning: incompatible implicit declaration of built-in function ‘malloc’
anon#anon:~/$ ./nostdlib_malloc
hello
So can anyone give a specific code example of a compile or runtime error that could occur because of casting malloc's return value, or is this just an urban legend?
Edit I've come across two well written arguments regarding this issue:
In Favor of Casting: CERT Advisory: Immediately cast the result of a memory allocation function call into a pointer to the allocated type
Against Casting (404 error as of 2012-02-14: use the Internet Archive Wayback Machine copy from 2010-01-27.{2016-03-18:"Page cannot be crawled or displayed due to robots.txt."})
You won't get a compiler error, but a compiler warning. As the sources you cite say (especially the first one), you can get an unpredictable runtime error when using the cast without including stdlib.h.
So the error on your side is not the cast, but forgetting to include stdlib.h. Compilers may assume that malloc is a function returning int, therefore converting the void* pointer actually returned by malloc to int and then to your pointer type due to the explicit cast. On some platforms, int and pointers may take up different numbers of bytes, so the type conversions may lead to data corruption.
Fortunately, modern compilers give warnings that point to your actual error. See the gcc output you supplied: It warns you that the implicit declaration (int malloc(int)) is incompatible to the built-in malloc. So gcc seems to know malloc even without stdlib.h.
Leaving out the cast to prevent this error is mostly the same reasoning as writing
if (0 == my_var)
instead of
if (my_var == 0)
since the latter could lead to a serious bug if one would confuse = and ==, whereas the first one would lead to a compile error. I personally prefer the latter style since it better reflects my intention and I don't tend to do this mistake.
The same is true for casting the value returned by malloc: I prefer being explicit in programming and I generally double-check to include the header files for all functions I use.
One of the good higher-level arguments against casting the result of malloc is often left unmentioned, even though, in my opinion, it is more important than the well-known lower-level issues (like truncating the pointer when the declaration is missing).
A good programming practice is to write code, which is as type-independent as possible. This means, in particular, that type names should be mentioned in the code as little as possible or best not mentioned at all. This applies to casts (avoid unnecessary casts), types as arguments of sizeof (avoid using type names in sizeof) and, generally, all other references to type names.
Type names belong in declarations. As much as possible, type names should be restricted to declarations and only to declarations.
From this point of view, this bit of code is bad
int *p;
...
p = (int*) malloc(n * sizeof(int));
and this is much better
int *p;
...
p = malloc(n * sizeof *p);
not simply because it "doesn't cast the result of malloc", but rather because it is type-independent (or type-agnositic, if you prefer), because it automatically adjusts itself to whatever type p is declared with, without requiring any intervention from the user.
Non-prototyped functions are assumed to return int.
So you're casting an int to a pointer. If pointers are wider than ints on your platform, this is highly risky behavior.
Plus, of course, that some people consider warnings to be errors, i.e. code should compile without them.
Personally, I think the fact that you don't need to cast void * to another pointer type is a feature in C, and consider code that does to be broken.
If you do this when compiling in 64-bit mode, your returned pointer will be truncated to 32-bits.
EDIT:
Sorry for being too brief. Here's an example code fragment for discussion purposes.
main()
{
char * c = (char *)malloc(2) ;
printf("%p", c) ;
}
Suppose that the returned heap pointer is something bigger than what is representable in an int, say 0xAB00000000.
If malloc is not prototyped to return a pointer, the int value returned will initially be in some register with all the significant bits set. Now the compiler say, "okay, how do I convert and int to a pointer". That's going to be either a sign extension or zero extension of the low order 32-bits that it has been told malloc "returns" by omitting the prototype. Since int is signed I think the conversion will be sign extension, which will in this case convert the value to zero. With a return value of 0xABF0000000 you'll get a non-zero pointer that will also cause some fun when you try to dereference it.
A Reusable Software Rule:
In the case of writing an inline function in which used malloc(), in order to make it reusable for C++ code too, please do an explicit type casting (e.g. (char*)); otherwise compiler will complain.
A void pointer in C can be assigned to any pointer without an explicit cast. The compiler will give warning but it can be reusable in C++ by type casting malloc() to corresponding type. With out type casting also it can be use in C, because C is no strict type checking. But C++ is strictly type checking so it is needed to type cast malloc() in C++.
The malloc() function could often require a conversion cast before.
For the returned type from malloc it is a pointer to void and not a particular type, like may be a char* array, or a string.
And sometimes the compiler could not know, how to convert this type.
int size = 10;
char* pWord = (char*)malloc(size);
The allocation functions are available for all C packages.
So, these are general functions, that must work for more C types.
And the C++ libraries are extensions of the older C libraries.
Therefore the malloc function returns a generic void* pointer.
Cannot allocate an object of a type with another of different type.
Unless the objects are not classes derived from a common root class.
And not always this is possible, there are different exceptions.
Therefore a conversion cast might be necessary in this case.
Maybe the modern compilers know how to convert different types.
So this could not be a great issue, when is doing this conversion.
But a correct cast can be used, if a type conversion is possible.
As an example: it cannot be cast "apples" to "strawberries". But these both, so called "classes", can be converted to "fruits".
There are custom structure types, which cannot be cast directly.
In this case, any member variable has to be assigned separately.
Or a custom object would have to set its members independently.
Either if it is about a custom object class, or whatever else...
Also a cast to a void pointer must be used when using a free call.
This is because the argument of the free function is a void pointer.
free((void*)pWord);
Casts are not bad, they just don't work for all the variable types.
A conversion cast is also an operator function, that must be defined.
If this operator is not defined for a certain type, it may not work.
But not all the errors are because of this conversion cast operator.
With kind regards, Adrian Brinas

Does C's void* serve any purpose other than making code more idiomatic?

Does C's void* lend some benefit in the form of compiler optimization etc, or is it just an idiomatic equivalent for char*? I.e. if every void* in the C standard library was replaced by char*, would anything be damaged aside from code legibility?
In the original K&R C there was no void * type, char * was used as the generic pointer.
void * serves two main purposes:
It makes it clear when a pointer is being used as a generic pointer. Previously, you couldn't tell whether char * was being used as a pointer to an actual character buffer or as a generic pointer.
It allows the compiler to catch errors where you try to dereference or perform arithmetic on a generic pointer, rather than casting it to the appropriate type first. These operations aren't permitted on void * values. Unfortunately, some compilers (e.g. gcc) allow arithmetic on void * pointers by default, so it's not as safe as it could be.
C got along just fine for years with char * as the generic pointer type before void * was introduced later. So it clearly isn't strictly necessary. But programming languages are communication--not just telling a compiler what to do, but telling other human beings reading the program what is intended. Anything that makes a language more expressive is a good thing.
Does C's void* serve any purpose other than making code more idiomatic?
if every void* in the C standard library was replaced by char*, would anything be damaged aside from code legibility?
C11 introduced _Genric. Because of that, _Generic(some_C_standard_library_function()) ... code can compile quite differently depending on if the return type was a void *, char *, int *, etc.

What does this line mean in C99?

static int* p= (int*)(&foo);
I just know p points to a memory in the code segment.
But I don't know what exactly happens in this line.
I thought maybe it's a pointer to a function but the format to point a function is:
returnType (*pointerName) (params,...);
pointerName = &someFunc; // or pointerName=someFunc;
You take the address of foo and cast it to pointer to int.
If foo and p are of different types, the compiler might issue a warning about type mismatch. The cast is to supress that warning.
For example, consider the following code, which causes a warning from the compiler (initialization from incompatible pointer type):
float foo = 42;
int *p = &foo;
Here foo is a float, while p points to an int. Clearly - different types.
A typecasting makes the compiler treat one variable as if it was of different type. You typecast by putting new type name in parenthesis. Here we will make pointer to float to be treated like a pointer to int and the warning will be no more:
float foo = 5;
int *p = (int*)(&foo);
You could've omitted one pair of parenthesis as well and it'd mean the same:
float foo = 5;
int *p = (int*)&foo;
The issue is the same if foo is a function. We have a pointer to a function on right side of assignment and a pointer to int on left side. A cast would be added to make a pointer to function to be treated as an address of int.
A pointer of a type which points to an object (i.e. not void* and not a
pointer to a function) cannot be stored to a pointer to any other kind of
object without a cast, except in a few cases where the types are identical
except for qualifiers. Conforming compilers are required to issue a
diagnostic if that rule is broken.
Beyond that, the Standard allows compilers to interpret code that casts
pointers in nonsensical fashion unless code aides by some restrictions
which, as written make such casts essentially useless, for the nominal
purpose of promoting optimization. When the rules were written, most
compilers would probably do about half of the optimizations that would
be allowed under the rules, but would still process pointer casts sensibly
since doing so would cost maybe 5% of the theoretically-possible
optimizations. Today, however, it is more fashionable for compiler writers
to seek out all cases where an optimization would be allowed by the
Standard without regard for whether they make sense.
Compilers like gcc have an option -fno-strict-aliasing that blocks this
kind of optimization, both in cases where it would offer big benefits and
little risk, as well as in the cases where it would almost certainly break
code and be unlikely to offer any real benefit. It would be helpful if they
had an option to block only the latter, but I'm unaware of one. Thus, my
recommendation is that unless one wants to program in a very limited subset
of Dennis Ritchie's language, I'd suggest targeting the -fno-strict-aliasing
dialect.

memcpy vs assignment in C -- should be memmove?

As pointed out in an answer to this question, the compiler (in this case gcc-4.1.2, yes it's old, no I can't change it) can replace struct assignments with memcpy where it thinks it is appropriate.
I'm running some code under valgrind and got a warning about memcpy source/destination overlap. When I look at the code, I see this (paraphrasing):
struct outer
{
struct inner i;
// lots of other stuff
};
struct inner
{
int x;
// lots of other stuff
};
void frob(struct inner* i, struct outer* o)
{
o->i = *i;
}
int main()
{
struct outer o;
// assign a bunch of fields in o->i...
frob(&o.i, o);
return 0;
}
If gcc decides to replace that assignment with memcpy, then it's an invalid call because the source and dest overlap.
Obviously, if I change the assignment statement in frob to call memmove instead, then the problem goes away.
But is this a compiler bug, or is that assignment statement somehow invalid?
I think that you are mixing up the levels. gcc is perfectly correct to replace an assignment operation by a call to any library function of its liking, as long as it can guarantee the correct behavior.
It is not "calling" memcpy or whatsoever in the sense of the standard. It is just using one function it its library for which it might have additional information that guarantees correctness. The properties of memcpy as they are described in the standard are properties seen as interfaces for the programmer, not for the compiler/environment implementor.
Whether or not memcpy in that implementation in question implements a behavior that makes it valid for the assignment operation is another question. It should not be so difficult to check that or even to inspect the code.
As far as I can tell, this is a compiler bug. i is allowed to alias &o.i according to the aliasing rules, since the types match and the compiler cannot prove that the address of o.i could not have been previously taken. And of course calling memcpy with overlapping (or same) pointers invokes UB.
By the way note that, in your example, o->i is nonsense. You meant o.i I think...
I suppose that there is a typo: "&o" instead of "0".
Under this hypothesis, the "overlap" is actually a strict overwrite: memcpy(&o->i,&o->i,sizeof(o->i)). In this particular case memcpy behaves correctly.

Specifically, what's dangerous about casting the result of malloc?

Now before people start marking this a dup, I've read all the following, none of which provide the answer I'm looking for:
C FAQ: What's wrong with casting malloc's return value?
SO: Should I explicitly cast malloc()’s return value?
SO: Needless pointer-casts in C
SO: Do I cast the result of malloc?
Both the C FAQ and many answers to the above questions cite a mysterious error that casting malloc's return value can hide; however, none of them give a specific example of such an error in practice. Now pay attention that I said error, not warning.
Now given the following code:
#include <string.h>
#include <stdio.h>
// #include <stdlib.h>
int main(int argc, char** argv) {
char * p = /*(char*)*/malloc(10);
strcpy(p, "hello");
printf("%s\n", p);
return 0;
}
Compiling the above code with gcc 4.2, with and without the cast gives the same warnings, and the program executes properly and provides the same results in both cases.
anon#anon:~/$ gcc -Wextra nostdlib_malloc.c -o nostdlib_malloc
nostdlib_malloc.c: In function ‘main’:
nostdlib_malloc.c:7: warning: incompatible implicit declaration of built-in function ‘malloc’
anon#anon:~/$ ./nostdlib_malloc
hello
So can anyone give a specific code example of a compile or runtime error that could occur because of casting malloc's return value, or is this just an urban legend?
Edit I've come across two well written arguments regarding this issue:
In Favor of Casting: CERT Advisory: Immediately cast the result of a memory allocation function call into a pointer to the allocated type
Against Casting (404 error as of 2012-02-14: use the Internet Archive Wayback Machine copy from 2010-01-27.{2016-03-18:"Page cannot be crawled or displayed due to robots.txt."})
You won't get a compiler error, but a compiler warning. As the sources you cite say (especially the first one), you can get an unpredictable runtime error when using the cast without including stdlib.h.
So the error on your side is not the cast, but forgetting to include stdlib.h. Compilers may assume that malloc is a function returning int, therefore converting the void* pointer actually returned by malloc to int and then to your pointer type due to the explicit cast. On some platforms, int and pointers may take up different numbers of bytes, so the type conversions may lead to data corruption.
Fortunately, modern compilers give warnings that point to your actual error. See the gcc output you supplied: It warns you that the implicit declaration (int malloc(int)) is incompatible to the built-in malloc. So gcc seems to know malloc even without stdlib.h.
Leaving out the cast to prevent this error is mostly the same reasoning as writing
if (0 == my_var)
instead of
if (my_var == 0)
since the latter could lead to a serious bug if one would confuse = and ==, whereas the first one would lead to a compile error. I personally prefer the latter style since it better reflects my intention and I don't tend to do this mistake.
The same is true for casting the value returned by malloc: I prefer being explicit in programming and I generally double-check to include the header files for all functions I use.
One of the good higher-level arguments against casting the result of malloc is often left unmentioned, even though, in my opinion, it is more important than the well-known lower-level issues (like truncating the pointer when the declaration is missing).
A good programming practice is to write code, which is as type-independent as possible. This means, in particular, that type names should be mentioned in the code as little as possible or best not mentioned at all. This applies to casts (avoid unnecessary casts), types as arguments of sizeof (avoid using type names in sizeof) and, generally, all other references to type names.
Type names belong in declarations. As much as possible, type names should be restricted to declarations and only to declarations.
From this point of view, this bit of code is bad
int *p;
...
p = (int*) malloc(n * sizeof(int));
and this is much better
int *p;
...
p = malloc(n * sizeof *p);
not simply because it "doesn't cast the result of malloc", but rather because it is type-independent (or type-agnositic, if you prefer), because it automatically adjusts itself to whatever type p is declared with, without requiring any intervention from the user.
Non-prototyped functions are assumed to return int.
So you're casting an int to a pointer. If pointers are wider than ints on your platform, this is highly risky behavior.
Plus, of course, that some people consider warnings to be errors, i.e. code should compile without them.
Personally, I think the fact that you don't need to cast void * to another pointer type is a feature in C, and consider code that does to be broken.
If you do this when compiling in 64-bit mode, your returned pointer will be truncated to 32-bits.
EDIT:
Sorry for being too brief. Here's an example code fragment for discussion purposes.
main()
{
char * c = (char *)malloc(2) ;
printf("%p", c) ;
}
Suppose that the returned heap pointer is something bigger than what is representable in an int, say 0xAB00000000.
If malloc is not prototyped to return a pointer, the int value returned will initially be in some register with all the significant bits set. Now the compiler say, "okay, how do I convert and int to a pointer". That's going to be either a sign extension or zero extension of the low order 32-bits that it has been told malloc "returns" by omitting the prototype. Since int is signed I think the conversion will be sign extension, which will in this case convert the value to zero. With a return value of 0xABF0000000 you'll get a non-zero pointer that will also cause some fun when you try to dereference it.
A Reusable Software Rule:
In the case of writing an inline function in which used malloc(), in order to make it reusable for C++ code too, please do an explicit type casting (e.g. (char*)); otherwise compiler will complain.
A void pointer in C can be assigned to any pointer without an explicit cast. The compiler will give warning but it can be reusable in C++ by type casting malloc() to corresponding type. With out type casting also it can be use in C, because C is no strict type checking. But C++ is strictly type checking so it is needed to type cast malloc() in C++.
The malloc() function could often require a conversion cast before.
For the returned type from malloc it is a pointer to void and not a particular type, like may be a char* array, or a string.
And sometimes the compiler could not know, how to convert this type.
int size = 10;
char* pWord = (char*)malloc(size);
The allocation functions are available for all C packages.
So, these are general functions, that must work for more C types.
And the C++ libraries are extensions of the older C libraries.
Therefore the malloc function returns a generic void* pointer.
Cannot allocate an object of a type with another of different type.
Unless the objects are not classes derived from a common root class.
And not always this is possible, there are different exceptions.
Therefore a conversion cast might be necessary in this case.
Maybe the modern compilers know how to convert different types.
So this could not be a great issue, when is doing this conversion.
But a correct cast can be used, if a type conversion is possible.
As an example: it cannot be cast "apples" to "strawberries". But these both, so called "classes", can be converted to "fruits".
There are custom structure types, which cannot be cast directly.
In this case, any member variable has to be assigned separately.
Or a custom object would have to set its members independently.
Either if it is about a custom object class, or whatever else...
Also a cast to a void pointer must be used when using a free call.
This is because the argument of the free function is a void pointer.
free((void*)pWord);
Casts are not bad, they just don't work for all the variable types.
A conversion cast is also an operator function, that must be defined.
If this operator is not defined for a certain type, it may not work.
But not all the errors are because of this conversion cast operator.
With kind regards, Adrian Brinas

Resources