I am learning C and am a bit confused about why I don't get any warnings or errors from GCC for the following snippet. I am allocating the space of 1 char for a pointer to int. Is GCC changing something behind the scenes (like silently allocating enough space for an int)?
#include <stdlib.h>
#include <stdio.h>
typedef int *int_ptr;
int main()
{
int_ptr ip;
ip = calloc(1, sizeof(char));
*ip = 1000;
printf("%d", *ip);
free(ip);
return 0;
}
Update
Having read the answers below, would it still be unsafe and risky if I did it the other way around, e.g. allocating space of an int to a pointer to char? The source of my confusion is the following answer in the Rosetta Code, in the function StringArray StringArray_new(size_t size) the coder seems to exactly be doing this this->elements = calloc(size, sizeof(int)); where this->elements is a char** elements.
The result of calloc is of type void*, which is implicitly converted to int* on assignment. The C language and GCC simply trust the programmer to request a sensible size and thus do not produce any warnings. Your code compiles as valid C, but writing an int into a 1-byte allocation is an invalid memory write (undefined behavior) at runtime. So no, GCC does not implicitly allocate space for an integer.
If you would like to see warnings of this kind before running the program, you may want to use a static analyzer, e.g. the Clang Static Analyzer.
If you would like to see errors of this kind at runtime, run your program with Valgrind.
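For comparison, here is a minimal sketch of the same allocation sized from the pointer itself, so the byte count always matches the pointed-to type. The helper name make_int is made up for illustration and is not part of the question's code.

```c
#include <stdlib.h>

/* Hypothetical helper: allocate room for exactly one int and store a value.
   Returns NULL if the allocation fails; the caller frees the result. */
int *make_int(int value)
{
    int *ip = calloc(1, sizeof *ip);   /* sizeof *ip is sizeof(int) */
    if (ip == NULL)
        return NULL;
    *ip = value;                       /* fits: the block holds a whole int */
    return ip;
}
```

A caller would use it as int *p = make_int(1000); and later free(p);.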
Update
Allocating space for 1 int (i.e. 4 bytes, generally) and then interpreting it as a char (1 char is 1 byte) will not result in any memory errors, as the space required for an int is larger than the space required for a char. In fact, you could use the result as an array of 4 char's.
The sizeof operator returns the size of that type as a number of bytes. The calloc function then allocates that number of bytes; it is not aware of what type will be stored in the allocated segment.
While this does not produce any errors, it can indeed be considered a "risky and unsafe" programming practice. Exceptions exist for advanced applications where you'd want to reuse the same memory segment for storing values of a different type.
The code on Rosetta Code you linked to contains a bug in exactly that line. It should allocate memory for a char* instead of an int. These are generally not equal. On my machine, the size of an int is 4 bytes, while the size of a char* is 8 bytes.
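A possible shape of the corrected allocation, as a sketch only: the struct below is a hypothetical stand-in for the Rosetta Code type (only the elements member comes from the question), and string_array_init is an invented name.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the Rosetta Code type; only `elements`
   is taken from the question, the rest is assumed. */
typedef struct {
    size_t size;
    char **elements;
} StringArray;

/* Give `sa` room for `size` string slots (assumes size > 0).
   Returns 0 on success, -1 if the allocation fails. */
int string_array_init(StringArray *sa, size_t size)
{
    /* sizeof *sa->elements is sizeof(char *), not sizeof(int) */
    sa->elements = calloc(size, sizeof *sa->elements);
    if (sa->elements == NULL) {
        sa->size = 0;
        return -1;
    }
    sa->size = size;   /* slots are zero-filled, which reads as NULL
                          on mainstream platforms */
    return 0;
}
```

Sizing from the member itself means the call stays correct even if elements later changes type.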
C has very little type safety and malloc has none. It allocates exactly as many bytes as you tell it to allocate. It's not the compiler's duty to warn about it, it is the programmer's duty to get the parameters right.
The reason why it "seems to work" is undefined behavior. *ip = 1000; might as well crash. See: What is undefined behavior and how does it work?
Also you should never hide pointers behind typedef. This is very bad practice and only serves to confuse the programmer and everyone reading the code.
The compiler only cares that you pass the right number and types of arguments to calloc - it doesn’t check to see if those arguments make sense, since that’s a runtime issue.
Yes, you could probably add some special case logic to the compiler when both arguments are constant expressions and sizeof operations like in this case, but how would it handle a case where both arguments are runtime variables like calloc( num, size );?
This is one of those cases where C assumes you’re smart enough to know what you’re doing.
The compiler only checks syntax and types, not whether the sizes you pass make sense.
Your code's syntax is OK, but its semantics are not.
Related
I'm trying to create a 2D array that will be able to store each character of a .txt file as an element in the 2D array.
How do I dynamically allocate space for it?
This is what I've done so far to malloc it. (This was copied from GeeksForGeeks.)
char *arr[rownum2];
for (i = 0; i < rownum2; i++) {
arr[i] = (char *)malloc(colnum * sizeof(char));
}
However, I think this is the source of serious memory related issues later on in my program, and I've also been told some parts of this are unnecessary.
Can I please get the most suitable way to dynamically allocate memory for the 2D array in this specific scenario?
The code you have posted is 'OK', so long as you remember to call free() on the allocated memory, later in your code, like this:
for (i=0;i<rownum2;i++) free(arr[i]);
...and I've also been told some parts of this are unnecessary.
The explicit cast is unnecessary, so, instead of:
arr[i] = (char *)malloc(colnum*sizeof(char));
just use:
arr[i] = malloc(colnum*sizeof(char));
The sizeof(char) is also, strictly speaking, unnecessary (char will always have a size of 1) but you can leave that, for clarity.
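Putting the pieces of this answer together, a minimal sketch of the row-by-row allocation with a failure check and a matching cleanup. The function names alloc_rows and free_rows are invented for this example, and cols is assumed to be greater than zero.

```c
#include <stdlib.h>

/* Allocate `rows` buffers of `cols` chars each (assumes cols > 0).
   `arr` must have room for `rows` pointers. Returns 0 on success; on
   failure frees the rows allocated so far and returns -1. */
int alloc_rows(char **arr, size_t rows, size_t cols)
{
    for (size_t i = 0; i < rows; i++) {
        arr[i] = malloc(cols);          /* no cast; sizeof(char) is 1 */
        if (arr[i] == NULL) {
            while (i > 0)
                free(arr[--i]);         /* undo the earlier rows */
            return -1;
        }
    }
    return 0;
}

/* Release every row allocated by alloc_rows(). */
void free_rows(char **arr, size_t rows)
{
    for (size_t i = 0; i < rows; i++)
        free(arr[i]);
}
```

With the question's variables, this would be called as alloc_rows(arr, rownum2, colnum) and later free_rows(arr, rownum2).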
Technically, it's not a 2D array but an array of pointers to separately allocated rows. The difference is that you can't make a 2D array with rows of different sizes, but you can with your array of row pointers.
If you don't need that, you can allocate rownum2*colnum elements in one block and access each element as arr[x+colnum*y]. This is used often because all the data is kept in one place, which is friendlier to the CPU cache and avoids the allocator's bookkeeping overhead for every separately allocated chunk.
Also, even rows of different sizes can be placed into a 1D array and accessed like a 2D one (at least if they do not change size, or are read-only). You can allocate char body[total_size], read the whole file into it, allocate char* arr[rownum2] and set each arr[i]=body+line_beginning_offset.
BTW, don't forget that these are not actual C strings, because they are not null-terminated. You'll need an additional column for the null terminator. If you store ASCII art, a 2D array is a very good solution.
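The single-block layout can be sketched as follows. The names grid_alloc and grid_at are hypothetical; the indexing is the x + colnum*y formula from this answer.

```c
#include <stdlib.h>

/* Allocate rows*cols chars in one zero-filled block; NULL on failure.
   calloc is used so the size multiplication is left to the library. */
char *grid_alloc(size_t rows, size_t cols)
{
    return calloc(rows, cols);
}

/* Address of the element at column x, row y (row-major layout). */
char *grid_at(char *grid, size_t cols, size_t x, size_t y)
{
    return &grid[x + cols * y];
}
```

One free(grid) releases everything, which is the main practical difference from the row-by-row version.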
The only serious problem I see in your code is that you are casting the returned value of malloc(3), and probably you have forgotten to #include <stdlib.h> also (this is a dangerous cocktail), and this way, you are destroying the returned value of the call with the cast you put before malloc(3). Let me explain:
First, you have (or haven't, but I have to guess) a 64bit architecture (as it is common today) and pointers are 64bit wide in your system, while int integers are 32bit wide.
You have probably forgotten to #include <stdlib.h> in your code (which is something I have to guess also), so the compiler is assuming that malloc(3) is actually a function returning int (this is legacy in C, if you don't provide a prototype for a function external to the compilation unit), so the compiler is generating code to get just a 32 bit value from the malloc(3) function, and not the 64bit pointer that (probably, but I have to guess also) malloc(3) actually returns.
You are casting that 32-bit int value (already incorrect) to a 64-bit pointer (far more incorrect, but I have to guess...). The cast makes any warning about conversions between integer values and pointers disappear, because the compiler assumes that, as the wise programmer you are, you have put the cast there on purpose and know what you are doing.
The first (undefined behaviour) returned value is being (undefined behaviour) just cut to 32 bit, and then converted (from int to char *, with more undefined behaviour) to be used in your code. This makes the original pointer returned from malloc(3) to be completely different value when reinterpreted and cast to (char *). This makes your pointers to point to a different place, and break your program on execution.
Your code should be something like (again, a snippet has to be used, as your code is not complete):
#include <stdlib.h> /* for malloc() */
/* ... */
char *arr[rownum2];
for (i = 0; i < rownum2; i++) {
arr[i] = malloc(colnum); /* sizeof(char) is always 1 */
}
Finally, I need to make a recommendation:
Please read (and follow) the "how to create a minimal, verifiable example" page, as your probable missing #include error is something I had to guess. Posting snippets of code often makes your mistakes go away, and we have to guess what may be happening. This is the most important thing you have to learn from this answer. Post complete, compilable and verifiable code (that is, code that you can check fails before posting, not a snippet you selected where you guess the problem may be). The code you posted does not allow anybody to verify why it may be failing, because it must be completed (and probably repaired) to make it executable.
I am a beginner with C. I am wondering how malloc works.
Here is some sample code I wrote while trying to understand how it works.
CODE:
#include<stdio.h>
#include<stdlib.h>
int main() {
int i;
int *array = malloc(sizeof *array);
for (i = 0; i < 5; i++) {
array[i] = i+1;
}
printf("\nArray is: \n");
for (i = 0; i < 5; i++) {
printf("%d ", array[i]);
}
free(array);
return 0;
}
OUTPUT:
Array is:
1 2 3 4 5
In the program above, I have only allocated space for 1 element, but the array now holds 5 elements. Since the program runs smoothly without any error, what is the purpose of realloc()?
Could anybody explain why?
Thanks in advance.
The fact that the program runs smoothly does not mean it is correct!
Try increasing the 5 in the for loop by some amount (500000, for instance, should suffice). At some point, it will stop working and give you a segfault.
This is called Undefined Behaviour.
valgrind would also warn you about the issue with something like the following.
==16812== Invalid write of size 4
==16812== at 0x40065E: main (test.cpp:27)
If you don't know what valgrind is, check this out: How do I use valgrind to find memory leaks? (BTW, it's a fantastic tool.)
This should help give you some more clarification: Accessing unallocated memory C++
This is typical undefined behavior (UB).
You are not allowed to code like that. As a beginner, think of it as a mistake, a fault, a sin, something very dirty, etc.
Could anybody explain why?
If you need to understand what is really happening (and the details are complex) you need to dive into your implementation details (and you don't want to). For example, on Linux, you could study the source code of your C standard library, of the kernel, of the compiler, etc. And you need to understand the machine code generated by the compiler (so with GCC compile with gcc -S -O1 -fverbose-asm to get an .s assembler file).
See also this (which has more references).
Read as soon as possible Lattner's blog on What Every C programmer should know about undefined behavior. Every one should have read it!
The worst thing about UB is that sadly, sometimes, it appears to "work" like you want it to (but in fact it does not).
So learn as quickly as possible to avoid UB systematically.
BTW, enabling all warnings in the compiler might help (but perhaps not in your particular case). Take the habit to compile with gcc -Wall -Wextra -g if using GCC.
Notice that your program doesn't have any arrays. The array variable is a pointer (not an array), so it is very badly named. You need to read more about pointers and C dynamic memory allocation.
int *array = malloc(sizeof *array); //WRONG
is very wrong. The name array is very poorly chosen (it is a pointer, not an array; you should spend days in reading what is the difference - and what do "arrays decay into pointers" mean). You allocate for a sizeof(*array) which is exactly the same as sizeof(int) (and generally 4 bytes, at least on my machine). So you allocate space for only one int element. Any access beyond that (i.e. with any even small positive index, e.g. array[1] or array[i] with some positive i) is undefined behavior. And you don't even test against failure of malloc (which can happen).
If you want to allocate memory space for (let's say) 8 int-s, you should use:
int* ptr = malloc(sizeof(int) * 8);
and of course you should check against failure, at least:
if (!ptr) { perror("malloc"); exit(EXIT_FAILURE); };
and you need to initialize that array (the memory you've got contain unpredictable junk), e.g.
for (int i=0; i<8; i++) ptr[i] = 0;
or you could clear all bits (with the same result on all machines I know of) using
memset(ptr, 0, sizeof(int)*8);
Notice that even after a successful such malloc (or a failed one) you always have sizeof(ptr) be the same (on my Linux/x86-64 box, it is 8 bytes), since it is the size of a pointer (even if you malloc-ed a memory zone for a million int-s).
In practice, when you use C dynamic memory allocation you need to know conventionally the allocated size of that pointer. In the code above, I used 8 in several places, which is poor style. It would have been better to at least
#define MY_ARRAY_LENGTH 8
and use MY_ARRAY_LENGTH everywhere instead of 8, starting with
int* ptr = malloc(MY_ARRAY_LENGTH*sizeof(int));
In practice, allocated memory has often a runtime defined size, and you would keep somewhere (in a variable, a parameter, etc...) that size.
Study the source code of some existing free software project (e.g. on github), you'll learn very useful things.
Read also (perhaps in a week or two) about flexible array members. Sometimes they are very useful.
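As a rough illustration of both points (keeping the allocated length next to the data, and flexible array members), here is a small sketch. The type int_vec and the function int_vec_new are invented for this example.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical type: the length travels with the data. */
struct int_vec {
    size_t len;
    int data[];      /* flexible array member (C99) */
};

/* Allocate a zero-initialized vector of `len` ints; NULL on failure. */
struct int_vec *int_vec_new(size_t len)
{
    struct int_vec *v;

    /* refuse sizes whose byte count would overflow size_t */
    if (len > (SIZE_MAX - sizeof *v) / sizeof v->data[0])
        return NULL;

    /* one block: the header plus len trailing ints */
    v = calloc(1, sizeof *v + len * sizeof v->data[0]);
    if (v != NULL)
        v->len = len;
    return v;
}
```

Code that receives a struct int_vec * can loop up to v->len instead of relying on a constant such as MY_ARRAY_LENGTH, and a single free(v) releases the whole thing.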
So as the programs runs smoothly without any error
That's just because you were lucky. Keep running this program and you might segfault soon. You were relying on undefined behaviour (UB), which is always A Bad Thing™.
What is the purpose of realloc()?
From the man pages:
void *realloc(void *ptr, size_t size);
The realloc() function changes the size of the memory block pointed to by ptr to size bytes. The contents will be unchanged in the range from the start of the region up to the minimum of the old and new sizes. If the new size is larger than the old size, the added memory will not be initialized. If ptr is NULL, then the call is equivalent to malloc(size), for all values of size; if size is equal to zero, and ptr is not NULL, then the call is equivalent to free(ptr). Unless ptr is NULL, it must have been returned by an earlier call to malloc(), calloc() or realloc(). If the area pointed to was moved, a free(ptr) is done.
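A minimal sketch of how realloc() is typically used to grow the one-element block from the question to five elements. The helper name resize_ints is made up; the important detail is assigning the result to a temporary first, so the original block is not lost if realloc() fails.

```c
#include <stdlib.h>

/* Resize *arr to hold new_count ints (assumes new_count > 0).
   Returns 0 on success. On failure returns -1 and leaves *arr
   untouched, still valid and still owned by the caller. */
int resize_ints(int **arr, size_t new_count)
{
    int *tmp;

    if (new_count > (size_t)-1 / sizeof **arr)   /* byte count would overflow */
        return -1;

    tmp = realloc(*arr, new_count * sizeof **arr);
    if (tmp == NULL)
        return -1;

    *arr = tmp;        /* the block may have moved; use the new address */
    return 0;
}
```

Starting from int *array = malloc(sizeof *array);, a call such as resize_ints(&array, 5) would make array[0] through array[4] valid to use.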
I implemented sizeof as recommended. It works OK when I want to print the size of a variable, but I can't use it as an array size.
This is the code:
#include <stdio.h>
#include <stdlib.h>
#define my_sizeof(var) (size_t)((char *)(&var+1)-(char*)(&var))
int s = 7;
void main()
{
int arr[sizeof(s)]; //works OK
int arr2[my_sizeof(s)];//error
printf("%d\n", my_sizeof(s));//works OK
int temp = 0;
}
Error 1 error C2057: expected constant expression
Error 2 error C2466: cannot allocate an array of constant size 0
Error 3 error C2133: 'arr2' : unknown size
Your implementation my_sizeof is not exactly equivalent to the sizeof operator in C, which is a compile-time operator, whereas yours can only calculate the size at run time.
So,
int arr[sizeof(s)];
declares an array with size sizeof(s) whereas
int arr2[my_sizeof(s)];
does the same, but the array size is calculated at run time rather than at compile time. For this to work, you'll need support for C99's VLAs, which your compiler doesn't have, so it errors out.
When you define
int arr[sizeof(s)];
the compiler reserves its space on the stack as soon as it enters the function, so it needs a constant expression that can be evaluated at compile time, not at run time (C99 relaxed this with variable-length arrays). With my_sizeof you are using pointer arithmetic that must be resolved at run time.
You could use my_sizeof() if you allocate the array in the heap using malloc()
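A short sketch of that heap-based variant. The macro is the one from the question with its argument parenthesized; the function alloc_by_my_sizeof is a hypothetical name used only for this example.

```c
#include <stdlib.h>

/* The question's macro, with the argument parenthesized. */
#define my_sizeof(var) ((size_t)((char *)(&(var) + 1) - (char *)&(var)))

/* Allocate an int array whose length is computed at run time with
   my_sizeof(); stores the length in *count. Returns NULL on failure. */
int *alloc_by_my_sizeof(size_t *count)
{
    int s = 7;

    *count = my_sizeof(s);                  /* evaluated at run time */
    return malloc(*count * sizeof(int));    /* run-time sizes are fine here */
}
```

Unlike an array declaration, malloc() accepts any run-time value, so no constant expression is needed; the caller is responsible for calling free() on the result.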
Just so you know, you need to differentiate between compile time and run time. These two concepts are critically different in the C world.
For example, the following code is valid, since the size is known at compile time:
typedef struct {
char chars[sizeof(s)];
} anon_struct;
However, the following is not valid, since the size is unknown until run time and VLAs are not allowed as struct members:
typedef struct {
char chars[my_sizeof(s)];
} anon_struct;
I suggest you buy a good textbook and have a good read.
gcc can compile this on Ubuntu. The problem you're seeing is likely due to the address not yet known at compile time. It will be known at runtime, though, so it looks like gcc does what it takes to make this work "under the hood". But I gather your compiler is MS-based.
I should add, the line that compiles will allocate arr with a number of int's equal to the number of bytes required for one int (s). That is probably not what you normally want. I understand if it's just an exercise for comparison, though.
I am reading in a book that the malloc function in C takes the number of 'chunks' of memory you wish to allocate as a parameter and determines how many bytes the chunks are based on what you cast the value returned by malloc to. For example on my system an int is 4 bytes:
int *pointer;
pointer = (int *)malloc(10);
Would allocate 40 bytes because the compiler knows that ints are 4 bytes.
This confuses me for two reasons:
I was reading up, and the size parameter is actually the number of bytes you want to allocate and is not related to the sizes of any types.
Malloc is a function that returns an address. How does it adjust the size of the memory it allocated based on an external cast of the address it returned from void to a different type? Is it just some compiler magic I am supposed to accept?
I feel like the book is wrong. Any help or clarification is greatly appreciated!
Here is what the book said:
char *string;
string = (char *)malloc(80);
The 80 sets aside 80 chunks of storage. The chunk size is set by the typecast, (char *), which means that malloc() is finding storage for 80 characters of text.
Yes, the book is wrong and you are correct. Please throw away that book.
Also, do let everyone know of the name of the book so we can permanently put it in our never to recommend black list.
Good Read:
What is the Best Practice for malloc?
When using malloc(), use the sizeof operator and apply it to the object being allocated, not the type of it.
Not a good idea:
int *pointer = malloc (10 * sizeof (int)); /* Works, but harder to maintain */
Better method:
int *pointer = malloc (10 * sizeof *pointer);
Rationale: If you change the data type that pointer points to, you don't need to change the malloc() call as well. Maintenance win.
Also, this method is less prone to errors during code development. You can check that it's correct without having to look at the declaration in cases where the malloc() call occurs apart from the variable declaration.
Regarding your question about casting malloc(), note that there is no need for a cast on malloc() in C today. Also, if the data type should change in a future revision, any cast there would have to be changed as well or be another error source. Also, always make sure you have <stdlib.h> included. Often people put in the cast to get rid of a warning that is a result from not having the include file. The other reason is that it is required in C++, but you typically don't write C++ code that uses malloc().
The exact details of how malloc() works internally are not defined in the spec; in effect it is a black box with a well-defined interface. To see how it works in an implementation, you'd want to look at an open source implementation for examples. But malloc() can (and does) vary wildly between platforms.
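To make the byte-count point concrete, here is a small sketch of a helper that allocates room for a given number of ints. The name alloc_ints is invented; the multiplication by the element size is done by the caller's code, not by malloc() and not by any cast.

```c
#include <stdint.h>
#include <stdlib.h>

/* Allocate room for `count` ints. malloc() takes a byte count, so the
   element size has to be multiplied in here. Returns NULL on failure. */
int *alloc_ints(size_t count)
{
    int *p;

    if (count > SIZE_MAX / sizeof *p)   /* count * size would overflow */
        return NULL;

    p = malloc(count * sizeof *p);      /* e.g. 10 * 4 = 40 bytes if int is 4 bytes */
    return p;
}
```

By contrast, malloc(10) on its own requests 10 bytes, whatever pointer type the result is converted to.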
#include<stdio.h>
#include<stdlib.h>
void main()
{
char *arr;
arr=(char *)malloc(sizeof (char)*4);
scanf("%s",arr);
printf("%s",arr);
}
In the above program, do I really need to allocate the arr?
It is giving me the result even without using the malloc.
My second doubt: I am expecting an error in the 9th line because I think it must be
printf("%s",*arr);
or something.
do I really need to allocate the arr?
Yes, otherwise you're dereferencing an uninitialised pointer (i.e. writing to a random chunk of memory), which is undefined behaviour.
do I really need to allocate the arr?
You need to set arr to point to a block of memory you own, either by calling malloc or by setting it to point to another array. Otherwise it points to a random memory address that may or may not be accessible to you.
In C, casting the result of malloc is discouraged (see note 1 below); it's unnecessary, and in some cases can mask an error if you forget to include stdlib.h or otherwise don't have a prototype for malloc in scope.
I usually recommend malloc calls be written as
T *ptr = malloc(N * sizeof *ptr);
where T is whatever type you're using, and N is the number of elements of that type you want to allocate. sizeof *ptr is equivalent to sizeof (T), so if you ever change T, you won't need to duplicate that change in the malloc call itself. Just one less maintenance headache.
It is giving me the result even without using the malloc
Because you don't explicitly initialize it in the declaration, the initial value of arr is indeterminate (see note 2 below); it contains a random bit string that may or may not correspond to a valid, writable address. The behavior on attempting to read or write through an invalid pointer is undefined, meaning the compiler isn't obligated to warn you that you're doing something dangerous. One of the possible outcomes of undefined behavior is that your code appears to work as intended. In this case, it looks like you're accessing a random segment of memory that just happens to be writable and doesn't contain anything important.
My second doubt is ' I am expecting an error in 9th line because I think it must be printf("%s",*arr); or something.
The %s conversion specifier tells printf that the corresponding argument is of type char *, so printf("%s", arr); is correct. If you had used the %c conversion specifier, then yes, you would need to dereference arr with either the * operator or a subscript, such as printf("%c", *arr); or printf("%c", arr[i]);.
Also, unless your compiler documentation explicitly lists it as a valid signature, you should not define main as void main(); either use int main(void) or int main(int argc, char **argv) instead.
1. The cast is required in C++, since C++ doesn't allow you to assign void * values to other pointer types without an explicit cast
2. This is true for pointers declared at block scope. Pointers declared at file scope (outside of any function) or with the static keyword are implicitly initialized to NULL.
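Even with a properly allocated 4-byte buffer, a plain %s can write past its end. A minimal sketch of a width-limited read follows; it uses sscanf() on a string instead of scanf() on stdin so the behaviour is easy to check, and the helper name read_word3 is made up.

```c
#include <stdio.h>

/* Read one whitespace-delimited word of at most 3 characters from `input`
   into `buf`, which must hold at least 4 bytes.
   Returns 1 if a word was read, 0 otherwise. */
int read_word3(const char *input, char buf[4])
{
    /* "%3s" stores at most 3 characters plus the terminating '\0' */
    return sscanf(input, "%3s", buf) == 1;
}
```

The same "%3s" conversion works with scanf() for the question's malloc(sizeof (char)*4) buffer.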
Personally, I think this a very bad example of allocating memory.
A char * will take up, on a modern OS/compiler, at least 4 bytes, and on a 64-bit machine, 8 bytes. So you use four (or eight) bytes to store the location of the four bytes for your three-character string. Not only that, but malloc will have overheads that probably add between 16 and 32 bytes to the actual allocated memory. So, we're using something like 20 to 40 bytes to store 4 bytes. That's 5-10 times more than it actually needs.
The code also casts the result of malloc, which is unnecessary in C and can hide a missing #include <stdlib.h>.
And with only four bytes in the buffer, the chances of scanf overflowing is substantial.
Finally, there is no call to free to return the memory to the system.
It would be MUCH better to use:
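int len;
char arr[5];
if (fgets(arr, sizeof(arr), stdin) != NULL) {
len = (int)strlen(arr); /* needs #include <string.h> */
if (len > 0 && arr[len - 1] == '\n') arr[len - 1] = '\0'; /* last character, not arr[len] */
}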
This will not overflow the string, and only use 9 bytes of stackspace (not counting any padding...), rather than 4-8 bytes of stackspace and a good deal more on the heap. I added an extra character to the array, so that it allows for the newline. Also added code to remove the newline that fgets adds, as otherwise someone would complain about that, I'm sure.
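Another common way to drop the newline that fgets() leaves behind is strcspn(), which avoids the index arithmetic altogether. A sketch, with strip_newline as an invented helper name:

```c
#include <string.h>

/* Remove the first '\n' (and anything after it) from s, in place.
   If s contains no newline, it is left unchanged. */
void strip_newline(char *s)
{
    /* strcspn returns the index of the first '\n', or strlen(s) if none */
    s[strcspn(s, "\n")] = '\0';
}
```

After fgets(arr, sizeof(arr), stdin) succeeds, strip_newline(arr) leaves just the typed characters.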
In the above program, do I really need to allocate the arr?
You bet you do.
It is giving me the result even without using the malloc.
Sure, that's entirely possible... arr is a pointer. It points to a memory location. Before you do anything with it, it's uninitialized... so it's pointing to some random memory location. The key here is that wherever it's pointing is a place your program is not guaranteed to own. That means the scanf() may happen to succeed and the value will go to whatever random location arr points to, but that location may belong to something else in your program, which can overwrite the data or be corrupted by it, or the write may simply crash.
When you say malloc(X) you're telling the computer that you need X bytes of memory for your own usage that no one else can touch. Then when arr captures the data it will be there safely for your usage until you call free() (which you forgot to do in your program BTW)
This is a good example of why you should always initialize your pointers to NULL when you create them... it reminds you that you don't own what they're pointing at and you better point them to something valid before using them.
I am expecting an error in 9th line because I think it must be printf("%s",*arr)
Incorrect. scanf() wants an address, which is what arr holds; that's why you don't need to write scanf("%s", &arr). And printf's "%s" specifier wants a pointer to a null-terminated string of characters, which again is what arr is, so there is no need to dereference.