Fortune while returning C arrays? - c

I'm a newbie with C and stunned by some magic while using the following C functions. This code works for me, and prints all the data.
typedef struct string_t {
char *data;
size_t len;
} string_t;
string_t *generate_test_data(size_t size) {
string_t test_data[size];
for(size_t i = 0; i < size; ++i) {
string_t string;
string.data = "test";
string.len = 4;
test_data[i] = string;
}
return test_data;
}
int main() {
size_t test_data_size = 10;
string_t *test_data = generate_test_data(test_data_size);
for(size_t i = 0; i < test_data_size; ++i) {
printf("%zu: %s\n", test_data[i].len, test_data[i].data);
}
}
Why does the function generate_test_data work only when "test_data_size = 10", but when "test_data_size = 20" the process finishes with exit code 11? How is that possible?

This code will never work perfectly, it just happens to be working. In C, you have to manage the memory yourself. If you make a mistake, the program might continue to work... or something might scribble all over the memory you thought was yours. This often manifests itself as weird errors like you're having: it works when the length is X, but fails when the length is Y.
If you turn on -Wall, or if you're using clang even better -Weverything, you'll get a warning like this.
test.c:18:12: warning: address of stack memory associated with local variable 'test_data' returned
[-Wreturn-stack-address]
return test_data;
^~~~~~~~~
The two important kinds of memory in C are: stack and heap. Very basically, stack memory is only good for the duration of the function. Anything declared on the stack will be freed automatically when the function returns, sort of like local variables in other languages. The rule of thumb is if you don't explicitly allocate it, it's on the stack. string_t test_data[size]; is stack memory.
Heap memory you allocate and free yourself, usually using malloc or calloc or realloc or some other function doing this for you like strdup. Once allocated, heap memory stays around until it's explicitly deallocated.
Rule of thumb: heap memory can be returned from a function, stack memory cannot... well, you can but that memory slot might then be used by something else. That's what's happening to you.
So you need to allocate memory, not just once, but a bunch of times.
Allocate memory for the array of pointers to string_t structs.
Allocate memory for each string_t struct in the array.
Allocate memory for each char string (really an array) in each struct.
And then you have to free all that. Sound like a lot of work? It is! Welcome to C. Sorry. You probably want to write functions to allocate and free string_t.
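static string_t *string_t_new(size_t size) {
    /* size is reserved for a data buffer; this version does not allocate one */
    (void)size;
    string_t *string = malloc(sizeof(string_t));
    if (string == NULL) {
        return NULL;
    }
    string->data = NULL;
    string->len = 0;
    return string;
}
static void string_t_destroy(string_t *self) {
    /* Frees only the struct; the data pointer is not owned here */
    free(self);
}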
Now your test data function looks like this.
static string_t **generate_test_data_v3(size_t size) {
/* Allocate memory for the array of pointers */
string_t **test_data = calloc(size, sizeof(string_t*));
for(size_t i = 0; i < size; ++i) {
/* Allocate for the string */
string_t *string = string_t_new(5);
string->data = "test";
string->len = 4;
test_data[i] = string;
}
/* Return a pointer to the array, which is also a pointer */
return test_data;
}
int main() {
size_t test_data_size = 20;
string_t **test_data = generate_test_data_v3(test_data_size);
for(size_t i = 0; i < test_data_size; ++i) {
printf("%zu: %s\n", test_data[i]->len, test_data[i]->data);
}
/* Free each string_t in the array */
for(size_t i = 0; i < test_data_size; i++) {
string_t_destroy(test_data[i]);
}
/* Free the array */
free(test_data);
}
Instead of using pointers you could instead copy all the memory you use, which is sort of what you were previously doing. That's easier for the programmer, but inefficient for the computer. And if you're coding in C, it's all about being efficient for the computer.

Because the space for test_data in v1 gets created in the function, and that space gets reclaimed when the function returns (and can thus be used for other things); in v2, the space is set aside outside of the function, so doesn't get reclaimed.

Why function generate_test_data_v1 works only when "test_data_size = 10", but when "test_data_size = 20" process finished with exit code 11?
I see no reason why function generate_test_data_v1() should ever fail, but you cannot use its return value. It returns a pointer to an automatic variable belonging to the function's scope, and automatic variables cease to exist when the function to which they belong returns. Your program produces undefined behavior when it dereferences that pointer. I can believe that it appears to work as you intended for some sizes, but even in those cases the program is wrong.
Moreover, your program is very unlikely to be producing an exit code of 11, but it may well be terminating abruptly with a segmentation fault, which is signal 11.
And why generate_test_data_v2 works perfectly?
Function generate_test_data_v2() populates elements of an existing array belonging to function main(). That array is in scope for substantially the entire life of the program.

Related

C program crashes after freeing the pointers in an array of char *

I'm a student learning C and I was puttering around with arrays of strings and malloc().
I have the following code that is supposed to load an array of strings (statically created) with dynamically created strings (please forgive / correct me if my terminology does not align with the code I have).
The problem is, once I go to free that memory, I get the following error: free(): invalid pointer
Here is the code:
#include <stdio.h>
#include <stdlib.h>
#define RAM_SIZE 5
char* ram [RAM_SIZE];
int next_free_cell = 0;
void freeAndNullRam(){
for (int i = 0 ; i < RAM_SIZE ; i++){
printf("%d\n", i);
free(ram[i]);
ram[i] = NULL;
}
}
int main(int argc, const char *argv[])
{
for (int i= 0; i < RAM_SIZE; i++){
ram[i] = (char*)malloc(sizeof(char*)*5);
ram[i] = "aaaa";
}
for (int i= 0; i < RAM_SIZE; i++){
int empty = (ram[i] ==NULL);
if(!empty){
printf("%s\n", ram[i]);
}
}
freeAndNullRam();
for (int i= 0; i < RAM_SIZE; i++){
int empty = (ram[i] ==NULL);
printf("%d\n", empty);
}
return 0;
}
I know the issue is definitely in the freeAndNullRam() function (obviously), but I don't understand why. My understanding is that at compile time, an array of 5 pointers to char arrays is created, but to actually fill the cells of the array, I need to malloc them some memory. Why does the program complain when I free the pointers in the array, but not when I give them memory?
Thanks!
ram[i] = "aaaa"; reassigns the pointer at ram[i] to point to static memory, discarding (and leaking) the result of malloc. Later on you pass those pointers to free, which fails because they were not the result of an *alloc function.
Use strcpy (declared in <string.h>) to instead copy the string from static memory into your allocated destination.
strcpy(ram[i], "aaaa");
Here's a reworked version of your code to be more idiomatic C:
#include <stdio.h>
#include <stdlib.h>
#include <string.h> // strdup() is POSIX (and C23), declared here
// Create an array of arbitrary size
char** alloc_array(size_t size) {
    // calloc() will give you a pre-zeroed (NULL) allocation, malloc() may not
    return calloc(size, sizeof(char*));
}
// Clears out all entries in the array, leaving only NULL
void clear_array(char** array, size_t size) {
    for (size_t i = 0; i < size; ++i) {
        // free(NULL) doesn't do anything, and is easier than a test
        free(array[i]);
        array[i] = NULL;
    }
}
// Clears, then frees the array
void free_array(char** array, size_t size) {
    clear_array(array, size);
    free(array);
}
int main(int argc, const char *argv[])
{
    // Whenever possible use local variables, not global variables
    size_t size = 5;
    char** entries = alloc_array(size);
    if (entries == NULL) {
        return 1;
    }
    for (size_t i = 0; i < size; ++i) {
        // Make a copy with strdup() so this can be released with free()
        // later on. A string like "..." is static, it was never allocated.
        entries[i] = strdup("aaaa");
    }
    for (size_t i = 0; i < size; i++) {
        // Express conditions in the if statement directly
        if (entries[i] != NULL) {
            printf("%s\n", entries[i]);
        }
    }
    clear_array(entries, size);
    for (size_t i = 0; i < size; i++) {
        printf("%d\n", entries[i] != NULL);
    }
    // Don't forget to release any allocated memory.
    free_array(entries, size);
    return 0;
}
There are a lot of bad habits in your original code that you should work to expunge as quickly as possible so these things don't take root. In particular, global variables are a huge problem and should be avoided.
One thing to remember is unless something was explicitly allocated with malloc() or a variant like calloc(), or was given to your code with an understanding that it was allocated in such a fashion, you should not call free() on it.
Not every pointer was allocated dynamically, and not every dynamically allocated pointer was allocated with malloc(). Some C code can be very confusing as a result of this.
C's syntax strongly suggests that "aaaa" is a "string". People even talk of this syntax that way: they call it "strings". But "aaaa" is no such thing. It's the unfortunately named string literal, which is not a string value you can assign around - neither in C nor in C++. A char * is not a string either - it's a pointer-typed value. It's used to represent strings, but itself is not a string - not even close.
You quite reasonably expected that "aaaa" would behave like any other rvalue of the "obvious" type. Alas, while 1 is an integer literal of type int, "aaaa" is a string literal whose type is an array of char (char[5] in C, const char[5] in C++), and in an assignment such as ram[i] = "aaaa" it decays to a pointer to its first character. What you assign is not a string, but a pointer!
It's as if when you wrote 42, C gave you a const int * pointing to 42. That's what "string" literals do. That's the awfully deplorable side of C :(
In C++, there actually is a string type (std::string), and you can even write literals of that type using the user-defined literal syntax introduced in C++11 (the standard s suffix itself arrived in C++14): "aaaa"s is an rvalue* of type std::string, and you can assign it exactly as you would expect of any other value type like int.
Since you're already thinking a bit like in C++, perhaps you can investigate that language next. It takes much less effort to do plenty of basic things in C++ compared to C.
*technically a prvalue

C - Avoiding warning "address of stack memory associated with local variable returned"

I've written the following simple program that sums up the numbers from 0 to 9:
#include <stdio.h>
#include <stdlib.h>
int* allocArray() {
int arr[10];
return arr;
}
int main(void){
int* p;
int summe = 0;
p = allocArray();
for(int i = 0; i != 10; ++i) {
p[i] = i;
}
for(int i = 0; i != 10; ++i) {
summe += p[i];
}
printf("Sum = %d\n", summe);
}
The code compiles and delivers the expected result "45". However I get the following warning: 'address of stack memory associated with local variable 'arr' returned'. What am I doing wrong?
This is undefined behaviour, plain and simple. The only reason it "works" is because with this particular compiler the stack hasn't been trashed yet, but it is living on borrowed time.
The lifetime of arr ends immediately when the function call is complete. After that point any pointers to arr are invalidated and cannot be used. [1]
Your compiler uses stack memory to store local variables, and the warning indicates that you're returning an address to a now-invalidated stack variable.
The usual way to work around this is to allocate memory dynamically and return that:
int* allocArray() {
int* arr = calloc(10, sizeof(int));
return arr;
}
Where you're now responsible for releasing that memory later.
You can also use static to extend the lifetime of arr:
int* allocArray() {
static int arr[10];
return arr;
}
Though this is not without consequences, either, as now arr is shared, not unique to each function call.
[1] Like a lot of things in C there's significant overlap between what you "cannot do" because they lead to undefined behaviour and/or crashes and what you "can do" because it's permitted by the syntax and compiler. It's often your responsibility to know the consequences of any code you write, implied or otherwise.
To keep it in your code:
int arr[10];
will allocate the memory on the stack. Once you leave the function, that stack space is free to be reused, so the content of the array is likely to be overwritten soon afterwards. You want to allocate this on the heap.
You would need to use
int* arr = malloc(sizeof(int)*10);
and in the main function, after you've used it (at the end of main), call
free(p);
(here p is the pointer main() received from allocArray(); memory obtained from malloc is released with free, while delete[] is C++ and only pairs with new[]).
Nevertheless, this code could be better if the ownership of the array were handled properly. If you later move to C++, you will want to make yourself familiar with its containers and shared/unique pointers.

How to initialize array size in a library in C?

I'm creating a C library with .h and .c files for a ring buffer. Ideally, you would initialize this ring buffer library in the main project with something like ringbuff_init(int buff_size); and the size that is passed will be the size of the buffer. How can I do this when arrays in C need to have their size fixed statically?
I have already tried dynamically allocating arrays, but I did not get it to work. Surely this task is possible somehow?
What I would like to do is something like this:
int buffSize[];
int main(void)
{
ringbuffer_init(100); // initialize buffer size to 100
}
void ringbuffer_init(int buff_size)
{
buffSize[buff_size];
}
This obviously doesn't compile because the array should have been initialized at the declaration. So my question is really, when you make a library for something like a buffer, how can you initialize it in the main program (so that in the .h/.c files of the buffer library) the buffer size is set to the wanted size?
You want to use dynamic memory allocation. A direct translation of your initial attempt would look like this:
size_t buffSize;
int * buffer;
int main(void)
{
ringbuffer_init(100); // initialize buffer size to 100
}
void ringbuffer_init(size_t buff_size)
{
buffSize = buff_size;
buffer = malloc(buff_size * sizeof(int));
}
This solution here is however extremely bad. Let me list the problems here:
There is no check of the result of malloc. It could return NULL if the allocation fails.
Buffer size needs to be stored along with the buffer, otherwise there's no way to know its size from your library code. It isn't exactly clean to keep these global variables around.
Speaking of which, these global variables are absolutely not thread-safe. If several threads call functions of your library, the results are unpredictable. You might want to store your buffer and its size in a struct that would be returned from your init function.
Nothing keeps you from calling the init function several times in a row, meaning that the buffer pointer will be overwritten each time, causing memory leaks.
Allocated memory must be eventually freed using the free function.
In conclusion, you need to think very carefully about the API you expose in your library, and the implementation while not extremely complicated, will not be trivial.
Something more correct would look like:
#include <stdlib.h>
typedef struct {
    size_t buffSize;
    int * buffer;
} RingBuffer;
int ringbuffer_init(size_t buff_size, RingBuffer * buf)
{
    if (buf == NULL)
        return 0;
    /* buf is a pointer, so members are accessed with -> rather than . */
    buf->buffSize = buff_size;
    buf->buffer = malloc(buff_size * sizeof(int));
    return buf->buffer != NULL;
}
void ringbuffer_free(RingBuffer * buf)
{
    free(buf->buffer);
}
int main(void)
{
    RingBuffer buf;
    int ok = ringbuffer_init(100, &buf); // initialize buffer size to 100
    if (!ok)
        return 1; // allocation failed, nothing to free
    // ...
    ringbuffer_free(&buf);
}
Even this is not without problems, as there is still a potential memory leak if the init function is called several times for the same buffer, and the client of your library must not forget to call the free function.
Static/global arrays can't have dynamic sizes.
If you must have a global dynamic array, declare a global pointer instead and initialize it with a malloc/calloc/realloc call.
You might want to also store its size in an accompanying integer variable as sizeof applied to a pointer won't give you the size of the block the pointer might be pointing to.
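#include <assert.h>
#include <stdlib.h>
int *buffer;
int buffer_nelems;
/* Returns the allocated buffer, or NULL if the allocation failed */
int *ringbuffer_init(int buff_size)
{
    assert(buff_size > 0);
    if ( (buffer = malloc(buff_size*sizeof(*buffer)) ) )
        buffer_nelems = buff_size;
    return buffer;
}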
You should use malloc function for a dynamic memory allocation.
It dynamically allocates a single block of memory of the specified size. It returns a pointer of type void *, which converts implicitly to any object pointer type in C, or NULL if the allocation fails.
Example:
// Dynamically allocate memory using malloc()
buffSize= (int*)malloc(n * sizeof(int));
// Initialize the elements of the array
for (i = 0; i < n; ++i) {
buffSize[i] = i + 1;
}
// Print the elements of the array
for (i = 0; i < n; ++i) {
printf("%d, ", buffSize[i]);
}
I know I'm three years late to the party, but I feel I have an acceptable solution that does not use dynamic allocation.
You may need to avoid dynamic allocation for whatever reason (I have a similar issue in an embedded environment, and would like to avoid it).
You can do the following:
Library:
int * buffSize;
int buffSizeLength;
void ringbuffer_init(int buff_size, int * bufferAddress)
{
buffSize = bufferAddress;
buffSizeLength = buff_size;
}
Main :
#define BUFFER_SIZE 100
int LibraryBuffer[BUFFER_SIZE];
int main(void)
{
ringbuffer_init(BUFFER_SIZE, LibraryBuffer); // initialize buffer size to 100
}
I have been using this trick for a while now, and it's greatly simplified some parts of working with a library.
One drawback: you can technically mess with the variable in your own code, breaking the library. I don't have a solution to that yet. If anyone has a solution to that I would love to hear it. Basically good discipline is required for now.
You can also combine this with @SirDarius's typedef for the ring buffer above. I would in fact recommend it.

Pointer to local variable outside the scope of its declaration

Let's say I have a structure representing a PDF document pdf and a structure representing one of its pages pdf_page:
typedef struct pdf_page {
    int page_no;
    struct pdf_page *next_page; /* the typedef name is not yet visible here */
    char *content;
} pdf_page;
typedef struct {
pdf_page *first_page, *last_page;
} pdf;
From my main(), I call create_pdf_file(pdf *doc):
void main() {
pdf doc;
create_pdf_file(&doc);
// reading the linked list of pages here
}
Assume that create_pdf_file is something along these lines:
void
create_pdf_file(pdf *doc) {
    for (int i = 0; i < 10; i++) {
        pdf_page p;
        p.page_no = i;
        p.content = "Hello, World!";
        doc->last_page->next_page = &p; /* stores the address of a local variable */
    }
}
(This is merely an example source code, so no list processing is shown. Obviously, the first_page and last_page members of pdf need to be set first.)
My question: If I access doc->first_page - as well as the other pages in the linked list - after the create_pdf_file() call in my main(), is it possible that I get segmentation faults because of "taking the local variable p out of its context"?
(I am not sure whether I have guaranteed that the corresponding memory location will not be used for something else.)
If so, how do I avoid this?
Yes, p is a local variable stored on the stack. When its lifetime ends (at the end of every loop iteration), any pointer to it becomes invalid. You need to allocate every page with malloc() and free() it after you are finished.
This would look similar to:
for (int i = 0; i < 10; i++)
{
    pdf_page* p = malloc(sizeof(pdf_page));
    if (p == NULL)
        break; /* allocation failed */
    p->page_no = i;
    p->content = "Hello, World!";
    p->next_page = NULL;
    doc->last_page->next_page = p;
    doc->last_page = p; /* advance the tail so the next page is appended after this one */
}
and when you call your function you have to pass a pointer to doc:
create_pdf_file(&doc);
is it possible that I get segmentation faults because of "taking the local variable p out of its context"?
Once the block in which p is declared terminates, any pointer to p is invalid (a "dangling pointer") and attempting to dereference such a pointer is Undefined Behaviour. In other words, don't do it: you could get segmentation faults, or any other behaviour (including random memory corruption or the use of the wrong data without any error condition.)
(I am not sure whether I have guaranteed that the corresponding memory location will not be used for something else.)
You've guaranteed that the lifetime of p is shorter than a pointer to p.
If so, how do I avoid this?
Use malloc to dynamically allocate a memory region of the correct size to hold the datum. Don't forget to free the memory when you no longer need it.

seg fault from 2d array allocation

i have a struct "cell" defined as
typedef struct{
int id;
terrainType terrain;
} cell;
i then make a 2d array of cells with
cell** makeCellGrid(int sizeX, int sizeY)
{
cell** theArray;
int i;
theArray = (cell**) malloc(sizeX*sizeof(cell*));
for ( i = 0; i < sizeX; i++)
{
theArray[i] = (cell*) malloc(sizeY*sizeof(cell));
}
return theArray;
}
at first i thought this was working fine but a few seg faults later i discovered that with some values (e.g. makeCellGrid(32, 87) ) it breaks.
im fairly fresh with C pointers and memory junk and was hoping some one could point me in the right direction here.
with lower number bounds i had no issue accessing it with
map[i][j].id = x;
and so on
EDIT: forgot to add, from testing, the seg fault originate from
theArray[i] = (cell*) malloc(sizeY*sizeof(cell));
The code lacks error checking for the malloc() calls (malloc() is a library function, not a system call).
So if the first call to malloc() failed, the second one (in the loop) tries to store its result through a NULL pointer, which indeed leads to the segmentation violation you are witnessing.
You might consider modifying your code like so:
#include <stdlib.h>
typedef struct {
int id;
TerrainType terrain;
} CellType;
void freeCellGrid(CellType ** ppCells, size_t sizeX)
{
size_t i = 0;
for (; i < sizeX; ++i)
{
free(ppCells[i]);
}
free(ppCells);
}
CellType ** makeCellGrid(size_t sizeX, size_t sizeY)
{
CellType ** ppCells = malloc(sizeX * sizeof(*ppCells));
if (ppCells)
{
size_t i = 0;
for (; i < sizeX; ++i)
{
ppCells[i] = malloc(sizeY * sizeof(**ppCells));
if (NULL == ppCells[i])
{
freeCellGrid(ppCells, i);
ppCells = NULL;
break;
}
}
}
return ppCells;
}
Notes on my modifications:
Always check library and system calls for errors (malloc() returns NULL on error)
Better use an unsigned type to access memory/array indices; size_t is meant for this
In C there is no need to cast the value returned by a function returning void *, such as malloc()
Always try to initialise variables as soon as possible; uninitialised variables very easily lead to "irrational" behaviour of the application
If working with pointers, it might be helpful to 'code' the level of indirection into their names (I did this here by using the prefix pp to indicate that it's a 2-level indirection)
Types are different from variables: one way to distinguish them is by starting type names with capitals (CellType) and variables with small letters (ppCells).
If you allocate memory to a pointer and the size of the allocation has to match the pointer's type, it is always safer to use the (dereferenced) pointer itself as the argument to the sizeof operator rather than some type name. The declaration of the pointer might be changed during development, and adjusting the argument to malloc() is then easily forgotten. To cut it short: doing as I did is less error prone.
If you encapsulate the dynamic creation of structures (including arrays), it is a good idea to also implement a function that deallocates them (here: freeCellGrid()). Even better, start off by coding this deallocator first, as then you have it at hand when coding the allocator's error handling (as shown for the second call to malloc()).
