Malloc confusion - c

Hi after going through answer here
1: Do I cast the result of malloc? I understood that one of the reason why we do not cast malloc is that
casting malloc is redundant
But what I am still trying to figure out is the warning that will be suppressed when we do cast the malloc function
I also read this answer but I have the follwing doubts
#include<stdio.h>
main()
{
int *a=malloc(20);
}
I understood the point in the answer that compiler will think that malloc returns an int while we are trying to give that value to an int * which will gives us error cannot convert from int * to int or something like that but the basic question is
Won't the compiler in absence of stdlib.h treat malloc as a user defined function and wont it look for the declaration of it and it will give some error related to missing delcaration/defination

In the original C language - C89/90 - calling an undeclared function is not an error. For this reason, a pre-C99 compiler will not produce any "error" due to a missing function declaration. The compiler will simply assume that function returns an int.
It will also automatically and quietly "guess" (infer, derive) the function parameter types from the argument types you supplied in your call. In your example, you supplied 20, which will make the compiler to guess that the "unknown" malloc function takes a single parameter of type int. Note that this is also incorrect, because the real malloc takes a size_t parameter.
In C99 and later the function declaration is required. Which means that forgetting to declare malloc (e.g. forgetting to include <stdlib.h>) is indeed an error, which will result in a diagnostic message. (The parameter-guessing behavior is still there in the language though.)
Note also that in C99 and later declaring function main without an explicit return type int is illegal. The "implicit int" rule is specific to the original version of C language specification only. It no longer exists in C99 and later. You have to declare it as int main(... explicitly.

In the absence of stdlib.h the compiler thinks that the malloc() function will return int(For C89/90 and not from c99) and you are trying to assign that value to int * and hence there is a type mismatch and the compiler will report it

Related

Syntax for defining a pointer to dynamically allocated memory [duplicate]

Now before people start marking this a dup, I've read all the following, none of which provide the answer I'm looking for:
C FAQ: What's wrong with casting malloc's return value?
SO: Should I explicitly cast malloc()’s return value?
SO: Needless pointer-casts in C
SO: Do I cast the result of malloc?
Both the C FAQ and many answers to the above questions cite a mysterious error that casting malloc's return value can hide; however, none of them give a specific example of such an error in practice. Now pay attention that I said error, not warning.
Now given the following code:
#include <string.h>
#include <stdio.h>
// #include <stdlib.h>
int main(int argc, char** argv) {
char * p = /*(char*)*/malloc(10);
strcpy(p, "hello");
printf("%s\n", p);
return 0;
}
Compiling the above code with gcc 4.2, with and without the cast gives the same warnings, and the program executes properly and provides the same results in both cases.
anon#anon:~/$ gcc -Wextra nostdlib_malloc.c -o nostdlib_malloc
nostdlib_malloc.c: In function ‘main’:
nostdlib_malloc.c:7: warning: incompatible implicit declaration of built-in function ‘malloc’
anon#anon:~/$ ./nostdlib_malloc
hello
So can anyone give a specific code example of a compile or runtime error that could occur because of casting malloc's return value, or is this just an urban legend?
Edit I've come across two well written arguments regarding this issue:
In Favor of Casting: CERT Advisory: Immediately cast the result of a memory allocation function call into a pointer to the allocated type
Against Casting (404 error as of 2012-02-14: use the Internet Archive Wayback Machine copy from 2010-01-27.{2016-03-18:"Page cannot be crawled or displayed due to robots.txt."})
You won't get a compiler error, but a compiler warning. As the sources you cite say (especially the first one), you can get an unpredictable runtime error when using the cast without including stdlib.h.
So the error on your side is not the cast, but forgetting to include stdlib.h. Compilers may assume that malloc is a function returning int, therefore converting the void* pointer actually returned by malloc to int and then to your pointer type due to the explicit cast. On some platforms, int and pointers may take up different numbers of bytes, so the type conversions may lead to data corruption.
Fortunately, modern compilers give warnings that point to your actual error. See the gcc output you supplied: It warns you that the implicit declaration (int malloc(int)) is incompatible to the built-in malloc. So gcc seems to know malloc even without stdlib.h.
Leaving out the cast to prevent this error is mostly the same reasoning as writing
if (0 == my_var)
instead of
if (my_var == 0)
since the latter could lead to a serious bug if one would confuse = and ==, whereas the first one would lead to a compile error. I personally prefer the latter style since it better reflects my intention and I don't tend to do this mistake.
The same is true for casting the value returned by malloc: I prefer being explicit in programming and I generally double-check to include the header files for all functions I use.
One of the good higher-level arguments against casting the result of malloc is often left unmentioned, even though, in my opinion, it is more important than the well-known lower-level issues (like truncating the pointer when the declaration is missing).
A good programming practice is to write code, which is as type-independent as possible. This means, in particular, that type names should be mentioned in the code as little as possible or best not mentioned at all. This applies to casts (avoid unnecessary casts), types as arguments of sizeof (avoid using type names in sizeof) and, generally, all other references to type names.
Type names belong in declarations. As much as possible, type names should be restricted to declarations and only to declarations.
From this point of view, this bit of code is bad
int *p;
...
p = (int*) malloc(n * sizeof(int));
and this is much better
int *p;
...
p = malloc(n * sizeof *p);
not simply because it "doesn't cast the result of malloc", but rather because it is type-independent (or type-agnositic, if you prefer), because it automatically adjusts itself to whatever type p is declared with, without requiring any intervention from the user.
Non-prototyped functions are assumed to return int.
So you're casting an int to a pointer. If pointers are wider than ints on your platform, this is highly risky behavior.
Plus, of course, that some people consider warnings to be errors, i.e. code should compile without them.
Personally, I think the fact that you don't need to cast void * to another pointer type is a feature in C, and consider code that does to be broken.
If you do this when compiling in 64-bit mode, your returned pointer will be truncated to 32-bits.
EDIT:
Sorry for being too brief. Here's an example code fragment for discussion purposes.
main()
{
char * c = (char *)malloc(2) ;
printf("%p", c) ;
}
Suppose that the returned heap pointer is something bigger than what is representable in an int, say 0xAB00000000.
If malloc is not prototyped to return a pointer, the int value returned will initially be in some register with all the significant bits set. Now the compiler say, "okay, how do I convert and int to a pointer". That's going to be either a sign extension or zero extension of the low order 32-bits that it has been told malloc "returns" by omitting the prototype. Since int is signed I think the conversion will be sign extension, which will in this case convert the value to zero. With a return value of 0xABF0000000 you'll get a non-zero pointer that will also cause some fun when you try to dereference it.
A Reusable Software Rule:
In the case of writing an inline function in which used malloc(), in order to make it reusable for C++ code too, please do an explicit type casting (e.g. (char*)); otherwise compiler will complain.
A void pointer in C can be assigned to any pointer without an explicit cast. The compiler will give warning but it can be reusable in C++ by type casting malloc() to corresponding type. With out type casting also it can be use in C, because C is no strict type checking. But C++ is strictly type checking so it is needed to type cast malloc() in C++.
The malloc() function could often require a conversion cast before.
For the returned type from malloc it is a pointer to void and not a particular type, like may be a char* array, or a string.
And sometimes the compiler could not know, how to convert this type.
int size = 10;
char* pWord = (char*)malloc(size);
The allocation functions are available for all C packages.
So, these are general functions, that must work for more C types.
And the C++ libraries are extensions of the older C libraries.
Therefore the malloc function returns a generic void* pointer.
Cannot allocate an object of a type with another of different type.
Unless the objects are not classes derived from a common root class.
And not always this is possible, there are different exceptions.
Therefore a conversion cast might be necessary in this case.
Maybe the modern compilers know how to convert different types.
So this could not be a great issue, when is doing this conversion.
But a correct cast can be used, if a type conversion is possible.
As an example: it cannot be cast "apples" to "strawberries". But these both, so called "classes", can be converted to "fruits".
There are custom structure types, which cannot be cast directly.
In this case, any member variable has to be assigned separately.
Or a custom object would have to set its members independently.
Either if it is about a custom object class, or whatever else...
Also a cast to a void pointer must be used when using a free call.
This is because the argument of the free function is a void pointer.
free((void*)pWord);
Casts are not bad, they just don't work for all the variable types.
A conversion cast is also an operator function, that must be defined.
If this operator is not defined for a certain type, it may not work.
But not all the errors are because of this conversion cast operator.
With kind regards, Adrian Brinas

strlen function still work without including header file

I'm new to C, just a question on strlen. I know that it is defined in <string.h>. But if I don't include <string.h> as:
#include <stdio.h>
int main()
{
char greeting[6] = { 'z', 'e', 'l', 'l', 'o', '\0' };
printf("message size: %d\n", strlen(greeting));
return 0;
}
it still works and prints 5, I did receive a warning which says warning: incompatible implicit declaration of built-in function ‘strlen’ but how come it still compile? and who provides strlen()?
There are several things going on here.
As of the 1990 version of C, it was legal to call a function without a visible declaration. Your call strlen(greeting) would cause the compiler to assume that it's declared as
int strlen(char*);
Since the return type is actually size_t and not int, the call has undefined behavior. Since strlen is a standard library function, the compiler knows how it should be declared and warns you that the implicit warning created by the call doesn't match.
As of the 1999 version of the language, a call to a function with no visible declaration is invalid (the "implicit int" rule was dropped). The language requires a diagnostic -- if you've told the compiler to conform to C99 or later. Many C compilers are more lax by default, and will let you get away with some invalid constructs.
If you compiled with options to conform to the current standard, you would most likely get a more meaningful diagnostic -- even a fatal error if that's what you want (which is a good idea). If you're using gcc or something reasonable compatible with it, try
gcc -std=c11 -pedantic-errors ...
Of course the best solution is to add the required #include <string.h> -- and be aware that you can't always rely on your C compiler to tell you everything that's wrong with your code.
Some more problems with your code:
int main() should be int main(void) though this is unlikely to be a real problem.
Since strlen returns a result of type size_t, use %zu, not %d, to print it. (The %zu format was introduced in C99. In the unlikely event that you're using an implementation that doesn't support it, you can cast the result to a known type.)
The C standard library is linked automatically. That's what gives you access to the strlen function.
Including the header file helps the compiler out by telling it what kind of function strlen is. That is, it tells it what kind of value it returns and what kind of parameters it takes.
That's why your compiler is warning you. It's saying, "You're calling this function but I haven't been told what kind of function it is." Often a compiler will make a guess. You're passing in a char[] so the compiler figures the function must take a char* as an argument. As for the return type, the compiler makes the guess that the return value is an int.

Strange integers in c language

I have code:
#include <stdio.h>
int main() {
int a = sum(1, 3);
return 0;
}
int sum(int a, int b, int c) {
printf("%d\n", c);
return a + b + c;
}
I know that I have to declare functions first, and only after that I can call them, but I want to understand what happends.
(Compiled by gcc v6.3.0)
I ignored implicit declaration of function warning and ran program several times, output was this:
1839551928
-2135227064
41523672
// And more strange numbers
I have 2 questions:
1) What do these numbers mean?
2) How function main knows how to call function sum without it declaration?
I'll assume that the code in your question is the code you're actually compiling and running:
int main() {
int a = sum(1, 3);
return 0;
}
int sum(int a, int b, int c) {
printf("%d\n", c);
return a + b + c;
}
The call to printf is invalid, since you don't have the required #include <stdio.h>. But that's not what you're asking about, so we'll ignore it. The question was edited to add the include directive.
In standard C, since the 1999 standard, calling a function (sum in this case) with no visible declaration is a constraint violation. That means that a diagnostic is required (but a conforming compiler can still successfully compile the program if it chooses to). Along with syntax errors, constraint violations are the closest C comes to saying that something is illegal. (Except for #error directives, which must cause a translation unit to be rejected.)
Prior to C99, C had an "implicit int" rule, which meant that if you call a function with no visible declaration an implicit declaration would be created. That declaration would be for a function with a return type of int, and with parameters of the (promoted) types of the arguments you passed. Your call sum(1, 3) would create an implicit declaration int sum(int, int), and generate a call as if the function were defined that way.
Since it isn't defined that way, the behavior is undefined. (Most likely the value of one of the parameters, perhaps the third, will be taken from some arbitrary register or memory location, but the standard says nothing about what the call will actually do.)
C99 (the 1999 edition of the ISO C standard) dropped the implicit int rule. If you compile your code with a conforming C99 or later compiler, the compiler is required to diagnose an error for the sum(1, 3) call. Many compilers, for backward compatibility with old code, will print a non-fatal warning and generate code that assumes the definition matches the implicit declaration. And many compilers are non-conforming by default, and might not even issue a warning. (BTW, if your compiler did print an error or warning message, it is tremendously helpful if you include it in your question.)
Your program is buggy. A conforming C compiler must at least warn you about it, and possibly reject it. If you run it in spite of the warning, the behavior is undefined.
This is undefined behavior per 6.5.2.2 Function calls, paragraph 9 of the C standard:
If the function is defined with a type that is not compatible with the type (of the expression) pointed to by the expression that denotes the called function, the behavior is undefined.
Functions without prototypes are allowed under 6.5.2.2 Function calls, paragraph 6:
If the expression that denotes the called function has a type that does not include a prototype, the integer promotions are performed on each argument, and arguments that have type float are promoted to double. These are called the default argument promotions. If the number of arguments does not equal the number of parameters, the behavior is undefined. ...
Note again: if the parameters passed don't match the arguments expected, the behavior is undefined.
In strictly standard conforming C, if you don't declare a function before using it, it will assume certain default argument types for the function.This is based on early versions of C with a weaker type system, and retained only for backwards compatibility. It should not be used generally.
Ill skip the details here, but in your case it assumes sum takes 2 ints and returns an int.
Calling a function with the wrong number of parameters, as you are doing here, is undefined behaviour. When you call sum, the compiler thinks that it takes two integers, so it passes two integers to it. When the function is actually called, however, it tries to read one more integer, c. Since you only passed 2 ints, the space for c contains random crap, which is what you're seeing when you print out. Note that it doesn't have to do this, since this is undefined behaviour, it could do anything. It could have given values for b & c, for example.
Obviously this behaviour is confusing, and you should not rely on undefined behaviour, so you'd be better off compiling with stricter compiler settings so this program wouldn't compile. (The proper version would declare sum above main.)
1) Since you haven't provided value for parameter "c" when calling function "sum" its value inside the function is undefined. If you declared function before main, your program wouldn't even compile, and you would get "error: too few arguments to function call" error.
2) Normally, it doesn't. Function has to be declared before the call so the compiler can check function signature. Compiler optimizations solved this for you in this case.
I'm not 100% sure if C works exactly like this but, your function calls work like a stack in memory. When you call a function your arguments are put on that stack so when in the fuction you can access them by selecting less x positions on memory. So:
You call summ(1, 3)
the stack will have 1 and the on the top a 3.
when executing the fuction it will see the last position of memory for the 1º argument (it recovers the 1) and then the position before that for the 2º argument (recovering the 3), however, there is a 3º argument so it accesses the position before that as well.
This position is garbige as not put by you and different everytime you run it.
Hope it was clear enought. Remeber that the stack works is inverted so every time you add something it goes to the previous memory position, not the next.

What does "char *p, *getenv();" do? Why a function with no args in a variable declaration line?

I'm reading a bit of code in C, and in the variable declaration line, there's "char *p, *getenv();"
I understand "char *p" of course. What does "char *getenv()" do? Later on in the code, the function getenv() is called with a variable passed to it. But what's the point of "char *getenv();" without any arguments?
Sorry if this is basic. I'm just starting to learn C.
It is "valid C" (I would almost say "unfortunately") - but it is not particularly useful. It is a "declaration without declaration" - "I will be using a function getenv() but I'm not going to tell you what the arguments will be". It does have the advantage of preventing a compiler warning / error - something that would normally be prevented with a "real" function prototype, for example by including the appropriate .h file (in this case, stdlib.h).
To clarify: the following code
#include <stdio.h>
int main(void) {
char *p;
p = getenv("PATH");
printf("the path is %s\n", p);
return 0;
}
will throw a compiler warning (and you should never ignore compiler warnings):
nonsense.c: In function ‘main’:
nonsense.c:5: warning: implicit declaration of function ‘getenv’
nonsense.c:5: warning: assignment makes pointer from integer without a cast
Either adding #include <stdlib.h> or the line you had above, will make the warning go away (even with -Wall -pedantic compiler flags). The compiler will know what to do with the return value - and it figures out the type of the arguments on the fly because it knows the type when it sees it.
And that is the key point: until you tell the compiler the type of the return value of the function it does not know what to do (it will assume int, and make appropriate conversions for the variable you are assigning to - this can be inappropriate. For example, a pointer is often bigger than an int, so information will be lost in the conversion.)
It's a declaration (an old-style one without a prototype, since the argument list is missing) for the function getenv. Aside from being arguably bad style to hide a function declaration alongside the declaration of an object like this, it's bad to omit the prototype and bad not to be using the standard headers (stdlib.h in this case) to get the declarations of standard functions.

Specifically, what's dangerous about casting the result of malloc?

Now before people start marking this a dup, I've read all the following, none of which provide the answer I'm looking for:
C FAQ: What's wrong with casting malloc's return value?
SO: Should I explicitly cast malloc()’s return value?
SO: Needless pointer-casts in C
SO: Do I cast the result of malloc?
Both the C FAQ and many answers to the above questions cite a mysterious error that casting malloc's return value can hide; however, none of them give a specific example of such an error in practice. Now pay attention that I said error, not warning.
Now given the following code:
#include <string.h>
#include <stdio.h>
// #include <stdlib.h>
int main(int argc, char** argv) {
char * p = /*(char*)*/malloc(10);
strcpy(p, "hello");
printf("%s\n", p);
return 0;
}
Compiling the above code with gcc 4.2, with and without the cast gives the same warnings, and the program executes properly and provides the same results in both cases.
anon#anon:~/$ gcc -Wextra nostdlib_malloc.c -o nostdlib_malloc
nostdlib_malloc.c: In function ‘main’:
nostdlib_malloc.c:7: warning: incompatible implicit declaration of built-in function ‘malloc’
anon#anon:~/$ ./nostdlib_malloc
hello
So can anyone give a specific code example of a compile or runtime error that could occur because of casting malloc's return value, or is this just an urban legend?
Edit I've come across two well written arguments regarding this issue:
In Favor of Casting: CERT Advisory: Immediately cast the result of a memory allocation function call into a pointer to the allocated type
Against Casting (404 error as of 2012-02-14: use the Internet Archive Wayback Machine copy from 2010-01-27.{2016-03-18:"Page cannot be crawled or displayed due to robots.txt."})
You won't get a compiler error, but a compiler warning. As the sources you cite say (especially the first one), you can get an unpredictable runtime error when using the cast without including stdlib.h.
So the error on your side is not the cast, but forgetting to include stdlib.h. Compilers may assume that malloc is a function returning int, therefore converting the void* pointer actually returned by malloc to int and then to your pointer type due to the explicit cast. On some platforms, int and pointers may take up different numbers of bytes, so the type conversions may lead to data corruption.
Fortunately, modern compilers give warnings that point to your actual error. See the gcc output you supplied: It warns you that the implicit declaration (int malloc(int)) is incompatible to the built-in malloc. So gcc seems to know malloc even without stdlib.h.
Leaving out the cast to prevent this error is mostly the same reasoning as writing
if (0 == my_var)
instead of
if (my_var == 0)
since the latter could lead to a serious bug if one would confuse = and ==, whereas the first one would lead to a compile error. I personally prefer the latter style since it better reflects my intention and I don't tend to do this mistake.
The same is true for casting the value returned by malloc: I prefer being explicit in programming and I generally double-check to include the header files for all functions I use.
One of the good higher-level arguments against casting the result of malloc is often left unmentioned, even though, in my opinion, it is more important than the well-known lower-level issues (like truncating the pointer when the declaration is missing).
A good programming practice is to write code, which is as type-independent as possible. This means, in particular, that type names should be mentioned in the code as little as possible or best not mentioned at all. This applies to casts (avoid unnecessary casts), types as arguments of sizeof (avoid using type names in sizeof) and, generally, all other references to type names.
Type names belong in declarations. As much as possible, type names should be restricted to declarations and only to declarations.
From this point of view, this bit of code is bad
int *p;
...
p = (int*) malloc(n * sizeof(int));
and this is much better
int *p;
...
p = malloc(n * sizeof *p);
not simply because it "doesn't cast the result of malloc", but rather because it is type-independent (or type-agnositic, if you prefer), because it automatically adjusts itself to whatever type p is declared with, without requiring any intervention from the user.
Non-prototyped functions are assumed to return int.
So you're casting an int to a pointer. If pointers are wider than ints on your platform, this is highly risky behavior.
Plus, of course, that some people consider warnings to be errors, i.e. code should compile without them.
Personally, I think the fact that you don't need to cast void * to another pointer type is a feature in C, and consider code that does to be broken.
If you do this when compiling in 64-bit mode, your returned pointer will be truncated to 32-bits.
EDIT:
Sorry for being too brief. Here's an example code fragment for discussion purposes.
main()
{
char * c = (char *)malloc(2) ;
printf("%p", c) ;
}
Suppose that the returned heap pointer is something bigger than what is representable in an int, say 0xAB00000000.
If malloc is not prototyped to return a pointer, the int value returned will initially be in some register with all the significant bits set. Now the compiler say, "okay, how do I convert and int to a pointer". That's going to be either a sign extension or zero extension of the low order 32-bits that it has been told malloc "returns" by omitting the prototype. Since int is signed I think the conversion will be sign extension, which will in this case convert the value to zero. With a return value of 0xABF0000000 you'll get a non-zero pointer that will also cause some fun when you try to dereference it.
A Reusable Software Rule:
In the case of writing an inline function in which used malloc(), in order to make it reusable for C++ code too, please do an explicit type casting (e.g. (char*)); otherwise compiler will complain.
A void pointer in C can be assigned to any pointer without an explicit cast. The compiler will give warning but it can be reusable in C++ by type casting malloc() to corresponding type. With out type casting also it can be use in C, because C is no strict type checking. But C++ is strictly type checking so it is needed to type cast malloc() in C++.
The malloc() function could often require a conversion cast before.
For the returned type from malloc it is a pointer to void and not a particular type, like may be a char* array, or a string.
And sometimes the compiler could not know, how to convert this type.
int size = 10;
char* pWord = (char*)malloc(size);
The allocation functions are available for all C packages.
So, these are general functions, that must work for more C types.
And the C++ libraries are extensions of the older C libraries.
Therefore the malloc function returns a generic void* pointer.
Cannot allocate an object of a type with another of different type.
Unless the objects are not classes derived from a common root class.
And not always this is possible, there are different exceptions.
Therefore a conversion cast might be necessary in this case.
Maybe the modern compilers know how to convert different types.
So this could not be a great issue, when is doing this conversion.
But a correct cast can be used, if a type conversion is possible.
As an example: it cannot be cast "apples" to "strawberries". But these both, so called "classes", can be converted to "fruits".
There are custom structure types, which cannot be cast directly.
In this case, any member variable has to be assigned separately.
Or a custom object would have to set its members independently.
Either if it is about a custom object class, or whatever else...
Also a cast to a void pointer must be used when using a free call.
This is because the argument of the free function is a void pointer.
free((void*)pWord);
Casts are not bad, they just don't work for all the variable types.
A conversion cast is also an operator function, that must be defined.
If this operator is not defined for a certain type, it may not work.
But not all the errors are because of this conversion cast operator.
With kind regards, Adrian Brinas

Resources