I want to make a program that dynamically allocates memory for each element of an array while it is entered from stdin and stored into an array. The reading should stop when 0 is entered. If I try to make it directly in main(), in looks like this:
int *a;
int i = 0;
a = malloc(sizeof(int));
do
{
scanf("%d", &a[i]);
a = realloc(a, (i + 2) * sizeof(int)); // enough space for storing another number
i++;
} while (a[i-1] != 0);
But I don't know how to make a function that does this. This is what I've tried, but it crashes everytime.
void read(int **a, int *cnt)
{
a = malloc(sizeof(int));
*a = malloc(sizeof(int));
*cnt = 0;
do
{
scanf("%d", a[*cnt]);
*a = realloc(*a, (*cnt + 2) * sizeof(int)); // enough space for storing another number
(*cnt)++;
} while (a[*cnt-1] != 0);
}
how about putting everything inside a function and returning a;
int *read()
{
int *a;
int i = 0;
a = malloc(sizeof(int));
if( !a ) return NULL;
do
{
scanf("%d", &a[i]);
a = realloc(a, (i + 2) * sizeof(int)); // enough space for storing another number
if( !a ) return NULL;
i++;
} while (a[i-1] != 0);
return a;
}
Assuming you are calling this in the usual way:
void read(int **a, int *cnt)
{
a = malloc(sizeof(int)); // This overwrites local a disconnecting it from the main a
*a = malloc(sizeof(int)); // so this will only change the memory pointed by local a and leak memory
...
}
int main()
{
int *a;
int cnt = 0;
read(&a, &cnt);
...
}
What is happening you’re giving the address to the pointer a to the function and then in the function you’re immediately overwriting it with the memory allocation. Matter this the a in the function and a in the main are completely separate entities. If you then allocate memory for *a you’re only storing that in the local a and the main a will remain pointing to whatever it happened to be. So it is uninitialized and causes undefined behavior.
So remove the line a = malloc(sizeof(int)) and your code will properly affect the main a also.
You also have to use *a for everything in read, including scanf and while. So it might be better to make the function handle allocation and return a pointer as was suggested in another answer.
Also note you should check realloc for return values so you won’t leak memory or crash there and you should use sizeof(int*) when allocating a pointer, no matter if size of int and int* were the same. It looks clearer.
You can pattern your function along the POSIX getline() function.
The pattern is very simple. Your function receives a reference to the pointer (i.e., a pointer to a pointer) used for the data, resized dynamically; and a pointer to the size allocated to that pointer. It will return the number of elements read to the array.
If you were reading doubles rather than ints, and wished to read all doubles from the input until end-of-input (either end of file, if redirected from a file, or until the user types a non-number and presses Enter, or until the user presses Ctrl+D at the beginning of the line), the code would look something like this:
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <errno.h>
size_t read_doubles(double **dataptr, size_t *sizeptr, FILE *in)
{
double *data; /* A local copy of the pointer */
size_t used = 0; /* Number of doubles in data */
size_t size; /* Number of doubles allocated for data */
/* Sanity checks against NULL pointers. */
if (!dataptr || !sizeptr || !in) {
errno = EINVAL; /* "Invalid parameter" */
return 0;
}
/* If *dataptr is NULL, or *sizeptr is zero,
there is no memory allocated yet. */
if (*dataptr != NULL && *sizeptr > 0) {
data = *dataptr;
size = *sizeptr;
} else {
data = NULL;
size = 0;
*dataptr = NULL;
*sizeptr = 0;
}
while (1) {
/* Ensure there is room in the data. */
if (used >= size) {
/* Need to allocate more.
Note: we have a copy of (data) in (*dataptr),
and of (size) in (*sizeptr). */
/* Reallocation policy. This one is simple,
reallocating in fixed-size chunks, but
better ones are well known: you probably
wish to ensure the size is at least some
sensible minimum (maybe a thousand or so),
then double the size up to a few million,
then increase the size in fixed-size chunks
of a few million, in a real-world application. */
size = used + 500;
/* Note: malloc(size) and realloc(NULL, size)
are equivalent; no need to check for NULL. */
data = realloc(data, size * sizeof data[0]);
if (!data) {
/* Reallocation failed, but the old data
pointer in (*dataptr) is still valid,
it isn't lost. Return an error. */
errno = ENOMEM;
return 0;
}
/* Reallocation succeeded; update the originals,
that are visible to the caller. */
*dataptr = data;
*sizeptr = size;
}
/* Read one more element, if possible.
Note: "&(data[used])" and "data+used"
are completely equivalent expressions.
*/
if (fscanf(input, " %lf", data + used) != 1)
break; /* No more input, or not a number. */
/* Yes, read a new data element. */
used++;
}
/* If we encountered a true read error,
return an error. */
if (ferror(input)) {
errno = EIO;
return 0;
}
/* Not an error; there just weren't more
data, or the data was not a number.
*/
/* Normally, programs do not set errno
except in case of an error. However,
here, used==0 just means there was no
data, it does not indicate an error per se.
For simplicity, because we know no error
has occurred, we just set errno=0 here,
rather than check if used==0 and only then
set errno to zero.
This also means it is safe to examine errno
after a call to this function, no matter what
the return value is. errno will be zero if no
errors have occurred, and nonzero in error cases.
*/
errno = 0;
return used;
}
The <errno.h> was included for the library to expose errno, and <string.h> for strerror(). These are both standard C.
However, the error constants I used above, EINVAL, ENOMEM, and EIO, are only defined by POSIXy systems, and might not exist in all systems. That is okay; you can just pick any smallish nonzero values and use them instead, because the function always sets errno. In that case, however, you need to check each of them and print the appropriate error message for each. In my case, all the systems I use define those three error codes for me, and I can just use strerror(errno) to convert the code to a standard error message (Invalid argument, Not enough space, and Input/output error, respectively, in non-localized programs).
Using a function defined like above, is very simple:
int main(void)
{
double *data = NULL; /* NULL for "Not allocated yet" */
size_t size = 0; /* 0 for "Not allocated yet" */
size_t used;
size_t i; /* Just a loop variable. */
used = read_doubles(&data, &size, stdin);
if (!used) {
/* No data read. Was it an actual error, or just no data? */
if (errno)
fprintf(stderr, "Error reading standard input: %s.\n", strerror(errno));
else
fprintf(stderr, "No numbers in standard input!\n");
return EXIT_FAILURE;
}
printf("Read %zu numbers from standard input.\n", used);
printf("(The dynamically allocated array has room for %zu.)\n", size);
for (i = 0; i < used; i++)
printf(" %f\n", data[i]);
/* Array no longer needed, so we can free it.
Explicitly NULLing and zeroing them means
we can safely reuse them later, if we were
to extend this program. So, it's not necessary
to clear them this way, but it is a good practice
considering it makes long-term maintenance easier. */
free(data);
data = NULL;
size = 0;
used = 0;
/* This version of the program has nothing more to do. */
return EXIT_SUCCESS;
}
Essentially, you just set the pointer you supply the address of to NULL, and the size you supply the address of also to 0, before the call to indicate no array has been dynamically allocated yet. There is no need to malloc() an initial array; realloc(NULL, size) is completely safe, and does exactly what malloc(size) does. Indeed, I often write code that has no malloc() anywhere in it, and use only realloc().
Note that the above code snippets are untested, so there might be typos in them. (And I did choose to use doubles instead of ints and a different end-of-input condition, to ensure you don't just copy-paste the code and use as-is, without reading and understanding it first.) If you find or suspect you found any, let me know in a comment, and I'll check.
Also note that the above code snippets are long only because I tried to write descriptive comments -- literally most of the "code" in them is comments. Writing descriptive comments -- those that describe the intent of the code, and not just what the code actually does; the latter is easy to read from the code itself, but the former is what you or others later reading the code need to know, to check if the code is sound or buggy --, is very hard, and even after over two decades, I'm still trying to get better at it.
If you like writing code, I do warmly recommend you start practicing writing good, intent-describing comments right away, rather than battle with it for decades like I have. It is surprising how much good comments, and occasionally a good nights sleep to review the code with fresh pair of eyes, helps.
Related
How do I free dynamically allocated memory?
Suppose input (assume it is given by user) is 1000 and now if I allocate memory of 1000 and after this(second time) if user gives input as 500 can I reuse already allocated memory ?
If user now inputs value as say 3000 , how do I go with it ? can I reuse already allocated 1000 blocks of memory and then create another 2000 blocks of memory ? or should I create all 3000 blocks of memory ?
which of these is advisable?
#include <stdio.h>
#include <stdlib.h>
typedef struct a
{
int a;
int b;
}aa;
aa* ptr=NULL;
int main() {
//code
int input=2;
ptr=malloc(sizeof(aa)*input);
for(int i=0;i<input;i++)
{
ptr[i].a=10;
ptr[i].b=20;
}
for(int i=0;i<input;i++)
{
printf("%d %d\n",ptr[i].a,ptr[i].b);
}
return 0;
}
I believe, you need to read about the "lifetime" of allocated memory.
For allocator functions, like malloc() and family, (quoting from C11, chapter §7.22.3, for "Memory management functions")
[...] The lifetime of an allocated object extends from the allocation
until the deallocation. [....]
So, once allocated, the returned pointer to the memory remains valid until it is deallocated. There are two ways it can be deallocated
Using a call to free() inside the program
Once the program terminates.
So, the allocated memory is available, from the point of allocation, to the termination of the program, or the free() call, whichever is earlier.
As it stands, there can be two aspects, let me clarify.
Scenario 1:
You allocate memory (size M)
You use the memory
You want the allocated memory to be re-sized (expanded/ shrinked)
You use some more
You're done using
is this is the flow you expect, you can use realloc() to resize the allocated memory size. Once you're done, use free().
Scenario 2:
You allocate memory (size M)
You use the memory
You're done using
If this is the case, once you're done, use free().
Note: In both the cases, if the program is run multiple times, there is no connection between or among the allocation happening in each individual invocation. They are independent.
When you use dynamically allocated memory, and adjust its size, it is important to keep track of exactly how many elements you have allocated memory for.
I personally like to keep the number of elements in use in variable named used, and the number of elements I have allocated memory for in size. For example, I might create a structure for describing one-dimensional arrays of doubles:
typedef struct {
size_t size; /* Number of doubles allocated for */
size_t used; /* Number of doubles in use */
double *data; /* Dynamically allocated array */
} double_array;
#define DOUBLE_ARRAY_INIT { 0, 0, NULL }
I like to explicitly initialize my dynamically allocated memory pointers to NULL, and their respective sizes to zero, so that I only need to use realloc(). This works, because realloc(NULL, size) is exactly equivalent to malloc(NULL). I also often utilize the fact that free(NULL) is safe, and does nothing.
I would probably write a couple of helper functions. Perhaps a function that ensures there is room for at_least entries in the array:
void double_array_resize(double_array *ref, size_t at_least)
{
if (ref->size < at_least) {
void *temp;
temp = realloc(ref->data, at_least * sizeof ref->data[0]);
if (!temp) {
fprintf(stderr, "double_array_resize(): Out of memory (%zu doubles).\n", at_least);
exit(EXIT_FAILURE);
}
ref->data = temp;
ref->size = at_least;
}
/* We could also shrink the array if
at_least < ref->size, but usually
this is not needed/useful/desirable. */
}
I would definitely write a helper function that not only frees the memory used, but also updates the fields to reflect that, so that it is completely safe to call double_array_resize() after freeing:
void double_array_free(double_array *ref)
{
if (ref) {
free(ref->data);
ref->size = 0;
ref->used = 0;
ref->data = NULL;
}
}
Here is how a program might use the above.
int main(void)
{
double_array stuff = DOUBLE_ARRAY_INIT;
/* ... Code and variables omitted ... */
if (some_condition) {
double_array_resize(&stuff, 321);
/* stuff.data[0] through stuff.data[320]
are now accessible (dynamically allocated) */
}
/* ... Code and variables omitted ... */
if (weird_condition) {
/* For some reason, we want to discard the
possibly dynamically allocated buffer */
double_array_free(&stuff);
}
/* ... Code and variables omitted ... */
if (other_condition) {
double_array_resize(&stuff, 48361242);
/* stuff.data[0] through stuff.data[48361241]
are now accessible. */
}
double_array_free(&stuff);
return EXIT_SUCCESS;
}
If I wanted to use the double_array as a stack, I might do
void double_array_clear(double_array *ref)
{
if (ref)
ref->used = 0;
}
void double_array_push(double_array *ref, const double val)
{
if (ref->used >= ref->size) {
/* Allocate, say, room for 100 more! */
double_array_resize(ref, ref->used + 100);
}
ref->data[ref->used++] = val;
}
double double_array_pop(double_array *ref, const double errorval)
{
if (ref->used > 0)
return ref->data[--ref->used];
else
return errorval; /* Stack was empty! */
}
The above double_array_push() reallocates for 100 more doubles, whenever the array runs out of room. However, if you pushed millions of doubles, this would mean tens of thousands of realloc() calls, which is usually considered wasteful. Instead, we usually apply a reallocation policy, that grows the size proportionally to the existing size.
My preferred policy is something like (pseudocode)
If (elements in use) < LIMIT_1 Then
Resize to LIMIT_1
Else If (elements in use) < LIMIT_2 Then
Resize to (elements in use) * FACTOR
Else
Resize to (elements in use) + LIMIT_2
End If
The LIMIT_1 is typically a small number, the minimum size ever allocated. LIMIT_2 is typically a large number, something like 220 (two million plus change), so that at most LIMIT_2 unused elements are ever allocated. FACTOR is between 1 and 2; many suggest 2, but I prefer 3/2.
The goal of the policy is to keep the number of realloc() calls at an acceptable (unnoticeable) level, while keeping the amount of allocated but unused memory low.
The final note is that you should only try to keep around a dynamically allocated buffer, if you reuse it for the same (or very similar) purpose. If you need an array of a different type, and don't need the earlier one, just free() the earlier one, and malloc() a new one (or let realloc() in the helpers do it). The C library will try to reuse the same memory anyway.
On current desktop machines, something like a hundred or a thousand malloc() or realloc() calls is probably unnoticeable compared to the start-up time of the program. So, it is not that important to minimize the number of those calls. What you want to do, is keep your code easily maintained and adapted, so logical reuse and variable and type names are important.
The most typical case where I reuse a buffer, is when I read text input line by line. I use the POSIX.1 getline() function to do so:
char *line = NULL;
size_t size = 0;
ssize_t len; /* Not 'used' in this particular case! :) */
while (1) {
len = getline(&line, &size, stdin);
if (len < 1)
break;
/* Have 'len' chars in 'line'; may contain '\0'! */
}
if (ferror(stdin)) {
fprintf(stderr, "Error reading standard input!\n");
exit(EXIT_FAILURE);
}
/* Since the line buffer is no longer needed, free it. */
free(line);
line = NULL;
size = 0;
The following code compiled fine yesterday for a while, started giving the abort trap: 6 error at one point, then worked fine again for a while, and again started giving the same error. All the answers I've looked up deal with strings of some fixed specified length. I'm not very experienced in programming so any help as to why this is happening is appreciated. (The code is for computing the Zeckendorf representation.)
If I simply use printf to print the digits one by one instead of using strings the code works fine.
#include <string.h>
// helper function to compute the largest fibonacci number <= n
// this works fine
void maxfib(int n, int *index, int *fib) {
int fib1 = 0;
int fib2 = 1;
int new = fib1 + fib2;
*index = 2;
while (new <= n) {
fib1 = fib2;
fib2 = new;
new = fib1 + fib2;
(*index)++;
if (new == n) {
*fib = new;
}
}
*fib = fib2;
(*index)--;
}
char *zeckendorf(int n) {
int index;
int newindex;
int fib;
char *ans = ""; // I'm guessing the error is coming from here
while (n > 0) {
maxfib(n, &index, &fib);
n -= fib;
maxfib(n, &newindex, &fib);
strcat(ans, "1");
for (int j = index - 1; j > newindex; j--) {
strcat(ans, "0");
}
}
return ans;
}
Your guess is quite correct:
char *ans = ""; // I'm guessing the error is coming from here
That makes ans point to a read-only array of one character, whose only element is the string terminator. Trying to append to this will write out of bounds and give you undefined behavior.
One solution is to dynamically allocate memory for the string, and if you don't know the size beforehand then you need to reallocate to increase the size. If you do this, don't forget to add space for the string terminator, and to free the memory once you're done with it.
Basically, you have two approaches when you want to receive a string from function in C
Caller allocates buffer (either statically or dynamically) and passes it to the callee as a pointer and size. Callee writes data to buffer. If it fits, it returns success as a status. If it does not fit, returns error. You may decide that in such case either buffer is untouched or it contains all data fitting in the size. You can choose whatever suits you better, just document it properly for future users (including you in future).
Callee allocates buffer dynamically, fills the buffer and returns pointer to the buffer. Caller must free the memory to avoid memory leak.
In your case the zeckendorf() function can determine how much memory is needed for the string. The index of first Fibonacci number less than parameter determines the length of result. Add 1 for terminating zero and you know how much memory you need to allocate.
So, if you choose first approach, you need to pass additional two parameters to zeckendorf() function: char *buffer and int size and write to the buffer instead of ans. And you need to have some marker to know if it's first iteration of the while() loop. If it is, after maxfib(n, &index, &fib); check the condition index+1<=size. If condition is true, you can proceed with your function. If not, you can return error immediately.
For second approach initialize the ans as:
char *ans = NULL;
after maxfib(n, &index, &fib); add:
if(ans==NULL) {
ans=malloc(index+1);
}
and continue as you did. Return ans from function. Remember to call free() in caller, when result is no longer needed to avoid memory leak.
In both cases remember to write the terminating \0 to buffer.
There is also a third approach. You can declare ans as:
static char ans[20];
inside zeckendorf(). Function shall behave as in first approach, but the buffer and its size is already hardcoded. I recommend to #define BUFSIZE 20 and either declare variable as static char ans[BUFSIZE]; and use BUFSIZE when checking available size. Please be aware that it works only in single threaded environment. And every call to zeckendorf() will overwrite the previous result. Consider following code.
char *a,*b;
a=zeckendorf(10);
b=zeckendorf(15);
printf("%s\n",a);
printf("%s\n",b);
The zeckendorf() function always return the same pointer. So a and b would pointer to the same buffer, where the string for 15 would be stored. So, you either need to store the result somewhere, or do processing in proper order:
a=zeckendorf(10);
printf("%s\n",a);
b=zeckendorf(15);
printf("%s\n",b);
As a rule of thumb majority (if not all) Linux standard C library function uses either first or third approach.
I am a newbie in C and I am trying to program a simple text editor, I have already written a 100 lines of stupid messy code, but it just worked. Until this SEGFAULT started showing up. I am going with the approach of switching terminal to canonical mode, and getting letter by letter from the user and do the necessary with each of 'em. The letters are added to a buffer, which is realloced extra 512 byte when the buffer is half filled, which I know is a stupid thing to do. But the cause of the SEGFAULT cant be determined. Help would be appreciated. Here's my code:
char* t_buf
int t_buf_len = 0;
int cur_buf_sz = 0;
void mem_mgr(char *buffer, unsigned long bytes){ //Function to allocate requested memory
if(buffer == NULL){
if((buffer = malloc(sizeof(char) * bytes)) == NULL){
printf("ERROR:Cannot get memory resources(ABORT)\n\r");
exit(1);
}
}
else{
realloc(buffer, bytes);
}
}
void getCharacter(){
if(t_buf_len >= (cur_buf_sz/2))
mem_mgr(t_buf, cur_buf_sz+=512);
strcpy(t_buf, "Yay! It works!");
printf("%s %d", t_buf, cur_buf_sz);
}
There are things you need to understand first,
The buffer pointer is a local variable inside the mem_mgr() function, it points to the same memory t_buf points initially, but once you modify it, it's no longer related to t_buf in any way.
So, when you return from mem_mgr() you lose the reference to the allocated memory and.
To fix this, you can pass a poitner to the pointer, and alter the actual pointer by dereferencing it.
The realloc() function, behaves exactly like malloc() if the first argument is NULL, if you read the documentation you would know that.
Memory allocation functions MUST be checked to ensure they returned a valid legal pointer, that's why you need a temporary poitner to store the return value of realloc(), because if it returns NULL, meaning that there was no memory to fulfill the request, you would lose reference to the original block of memory and you can't free it anymore.
You need to pass a pointer to your pointer to mem_mgr(), like this
int
mem_mgr(char **buffer, unsigned long bytes)
{
void *tmp = realloc(*buffer, bytes);
if (tmp != NULL) {
*buffer = tmp;
return 0;
}
return -1;
}
And then, to allocate memory
void
getCharacter()
{
if (t_buf_len >= (cur_buf_sz / 2)) {
if (mem_mgr(&t_buf, cur_buf_sz += 512) != -1) {
strcpy(t_buf, "Yay! It works!");
printf("%s %d", t_buf, cur_buf_sz);
}
}
}
The call to
mem_mgr(t_buf, cur_buf_sz+=512);
cannot change the actual parameter t_buf. You will either have to return the buffer from mem_mgr
t_buf = mem_mgr(t_buf, cur_buf_sz+=512);
or pass a pointer to t_buf
mem_mgr(&t_buf, cur_buf_sz+=512);
Furthermore, a call to realloc may change the address of the memory buffer, so you will have to use
char *tmpbuf = realloc(buffer, bytes);
if (!tmpbuf)
// Error handling
else
buffer = tmpbuf;
realloc(NULL, bytes); will behave like a malloc, so you don't need a separate branch here. This makes in total:
char *mem_mgr(char *buffer, unsigned long bytes){ //Function to allocate requested memory
char *tmpbuf = realloc(buffer, bytes);
if (!tmpbuf) {
// Error handling
}
return tmpbuf;
}
which somehow questions the reason of existence of the function mem_mgr.
Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 9 years ago.
Improve this question
I'm trying to implement a solution to copy a large string in memory in C.
Can you give me any advice about implementation or any reference?
I'm thinking to copy byte by byte since I don't know the length (probably I can't calculate it with strlen() since the string is very large).
Another concern is that I will have to reallocate memory on every step and I don't know how is the best way to do that. Is there any way that I can reallocate using only the reference to the last position of the memory already alocated and filled? Thus if the memory allocation fails, it will not affect the rest of the memory already filled.
What is the best value to return from this function? Should I return the number of bytes that were succesfully copied?
If there is a memory allocation fail, does realloc() set any global variable that I can check in the main function after I call the copying function? As I don't want to just return NULL from it if at some point realloc() fails, but I want to return a value more useful.
strlen() won't fail, as it uses size_t to descirbe the string's size, and size_t is large enough to hold the size of any object on the machine the program runs on.
So simply do
#define _XOPEN_SOURCE 500 /* for strdup */
#include <string.h>
int duplicate_string(const char * src, char ** pdst)
{
int result = 0;
if (NULL == ((*pdst) = strdup(src)))
{
result = -1;
}
return result;
}
If this fails try using an more clever structure to hold the data, for example by chopping it into slices:
#define _XOPEN_SOURCE 700 /* for strndup */
#include <string.h>
int slice_string(const char * src, char *** ppdst, size_t s)
{
int result = 0;
size_t s_internal = s + 1; /* Add one for the 0-terminator. */
size_t len = strlen(src) + 1;
size_t n =len/s_internal + (len%s_internal ?1 :0);
*ppdst = calloc(n + 1, sizeof(**ppdst)); /* +1 to have a stopper element. */
if (NULL == (*ppdst))
{
result = -1;
goto lblExit;
}
for (size_t i = 0; i < n; ++i)
{
(*ppdst)[i] = strndup(src, s);
if (NULL == (*ppdst)[i])
{
result = -1;
while (--i > 0)
{
free((*ppdst)[i]);
}
free(*ppdst);
*ppdst = NULL;
goto lblExit;
}
src += s;
}
lblExit:
return result;
}
Use such functions by trying dump copy first and if this fails by slicing the string.
int main(void)
{
char * s = NULL;
read_big_string(&s);
int result = 0;
char * d = NULL;
char ** pd = NULL;
/* 1st try dump copy. */
result = duplicate_string(s, &d);
if (0 != result)
{
/*2ndly try to slice it. */
{
size_t len = strlen(s);
do
{
len = len/2 + (len%2 ?1 :0);
result = slice_string(s, &pd, len);
} while ((0 != result) || (1 == len));
}
}
if (0 != result)
{
fprintf(stderr, "Duplicating the string failed.\n");
}
/* Use copies. */
if (NULL != d)
{
/* USe result from simple duplication. */
}
if (NULL != pd)
{
/* Use result from sliced duplication. */
}
/* Free the copies. */
if (NULL != pd)
{
for (size_t i = 0; pd[i]; ++i)
{
free(pd[i]);
}
}
free(pd);
free(d);
return 0;
}
realloc() failing
If there is a memory allocation fail, does realloc() set any global variable that I can check in the main function after I call the copying function? As I don't want to just return NULL from it if at some point realloc() fails, but I want to return a value more useful.
There's no problem with realloc() returning null if you use realloc() correctly. If you use realloc() incorrectly, you get what you deserve.
Incorrect use of realloc()
char *space = malloc(large_number);
space = realloc(space, even_larger_number);
If the realloc() fails, this code has overwritten the only reference to the previously allocated space with NULL, so not only have you failed to allocate new space but you also cannot release the old space because you've lost the pointer to it.
(For the fastidious: the fact that the original malloc() might have failed is not critical; space will be NULL, but that's a valid first argument to realloc(). The only difference is that there would be no previous allocation that was lost.)
Correct use of realloc()
char *space = malloc(large_number);
char *new_space = realloc(space, even_larger_number);
if (new_space != 0)
space = new_space;
This saves and tests the result of realloc() before overwriting the value in space.
Continually growing memory
Another concern is that I will have to reallocate memory on every step and I don't know how is the best way to do that. Is there any way that I can reallocate using only the reference to the last position of the memory already allocated and filled? Thus if the memory allocation fails, it will not affect the rest of the memory already filled.
The standard technique for avoiding quadratic behaviour (which really does matter when you're dealing with megabytes of data) is to double the space allocated for your working string when you need to grow it. You do that by keeping three values:
Pointer to the data.
Size of the data area that is allocated.
Size of the data area that is in use.
When the incoming data won't fit in the space that is unused, you reallocate the space, doubling the amount that is allocated unless you need more than that for the new space. If you think you're going to be adding more data later, then you might add double the new amount. This amortizes the cost of the memory allocations, and saves copying the unchanging data as often.
struct String
{
char *data;
size_t length;
size_t allocated;
};
int add_data_to_string(struct String *str, char const *data, size_t datalen)
{
if (str->length + datalen >= str->allocated)
{
size_t newlen = 2 * (str->allocated + datalen + 1);
char *newdata = realloc(str->data, newlen);
if (newdata == 0)
return -1;
str->data = newdata;
str->allocated = newlen;
}
memcpy(str->data + str->length, data, datalen + 1);
str->length += datalen;
return 0;
}
When you've finished adding to the string, you can release the unused space if you wish:
void release_unused(struct String *str)
{
char *data = realloc(str->data, str->length + 1);
str->data = data;
str->allocated = str->length + 1;
}
It is very unlikely that shrinking a memory block will move it, but the standard says:
The realloc function deallocates the old object pointed to by ptr and returns a
pointer to a new object that has the size specified by size. The contents of the new
object shall be the same as that of the old object prior to deallocation, up to the lesser of
the new and old sizes.
The realloc function returns a pointer to the new object (which may have the same
value as a pointer to the old object), or a null pointer if the new object could not be
allocated.
Note that 'may have the same value as a pointer to the old object' also means 'may have a different value from a pointer to the old object'.
The code assumes that it is dealing with null terminated strings; the memcpy() code copies the length plus one byte to collect the terminal null, for example, and the release_unused() code keeps a byte for the terminal null. The length element is the value that would be returned by strlen(), but it is crucial that you don't keep doing strlen() on megabytes of data. If you are dealing with binary data, you handle things subtly differently.
use a smart pointer and avoid copying in the first place
OK, let's use Cunningham's Question to help figure out what to do. Cunningham's Question (or Query - your choice :-) is:
What's the simplest thing that could possibly work?
-- Ward Cunningham
IMO the simplest thing that could possibly work would be to allocate a large buffer, suck the string into the buffer, reallocate the buffer down to the actual size of the string, and return a pointer to that buffer. It's the caller's responsibility to free the buffer they get when they're done with it. Something on the order of:
#define BIG_BUFFER_SIZE 100000000
char *read_big_string(FILE *f) /* read a big string from a file */
{
char *buf = malloc(BIG_BUFFER_SIZE);
fgets(buf, BIG_BUFFER_SIZE, f);
realloc(buf, strlen(buf)+1);
return buf;
}
This is example code only. There are #includes which are not included, and there's a fair number of possible errors which are not handled in the above, the implementation of which are left as an exercise for the reader. Your mileage may vary. Dealer contribution may affect cost. Check with your dealer for price and options available in your area. Caveat codor.
Share and enjoy.
I'm not an expert in C, but here's what I'm trying to do:
int main(void) {
double *myArray;
...
myFunction(myArray);
...
/* save myArray contents to file */
...
free(myArray);
...
return 0;
}
int myFunction(double *myArray) {
int len=0;
...
/* compute len */
...
myArray = malloc( sizeof(double) * len );
if (myArray == NULL)
exit(1);
...
/* populate myArray */
...
return 0;
}
I'd like to save the contents of myArray inside main, but I don't know the size required until the program is inside myFunction.
Since I'm using CentOS 6.2 Linux, which I could only find a gcc build available up to 4.4.6 (which doesn't support C99 feature of declaring a variable-length array; see "broken" under "Variable-length arrays in http://gcc.gnu.org/gcc-4.4/c99status.html), I'm stuck using -std=c89 to compile.
Simple answer is no.
You are not passing back the pointer.
use
int main(void) {
double *myArray;
...
myFunction(&myArray);
...
/* save myArray contents to file */
...
free(myArray);
...
return 0;
}
int myFunction(double **myArray) {
int len=0;
...
/* compute len */
...
*myArray = malloc( sizeof(double) * len );
if (NULL == *myArray)
exit(1);
...
EDIT
poputateThis = *myArray;
/* populate poputateThis */
END OF EDIT
...
return 0;
EDIT
Should simplify thigs for your
}
What you are doing is not OK since myFunction doesn't change the value myArray holds in main; it merely changes its own copy.
Other than that, it's OK even if stylistically debatable.
As a question of good design and practice (apart from syntax issues pointed out in other answers) this is okay as long as it is consistent with your code base's best practices and transparent. Your function should be documented so that the caller knows it has to free and furthermore knows not to allocate its own memory. Furthermore consider making an abstract data type such as:
// myarray.h
struct myarray_t;
int myarray_init(myarray_t* array); //int for return code
int myarray_cleanup(myarray_t* array); // will clean up
myarray_t will hold a dynamic pointer that will be encapsulated from the calling function, although in the init and cleanup functions it will respectively allocate and deallocate.
What you want to do is fine, but your code doesn't do it -- main never gets to see the allocated memory. The parameter myArray of myFunction is initialized with the value passed in the function call, but modifying it thereafter doesn't modify the otherwise-unrelated variable of the same name in main.
It appears in your code snippet that myFunction always returns 0. If so then the most obvious way to fix your code is to return myArray instead (and take no parameter). Then the call in main would look like myArray = myFunction();.
If myFunction in fact already uses its return value then you can pass in a pointer to double*, and write the address to the referand of that pointer. This is what Ed Heal's answer does. The double ** parameter is often called an "out-param", since it's a pointer to a location that the function uses to store its output. In this case, the output is the address of the buffer.
An alternative would be to do something like this:
size_t myFunction(double *myArray, size_t buf_len) {
int len=0;
...
/* compute len */
...
if (buf_len < len) {
return len;
}
/* populate myArray */
...
return len;
}
Then the callers have the freedom to allocate memory any way they like. Typical calling code might look like this:
size_t len = myFunction(NULL, 0);
// warning -- watch the case where len == 0, if that is possible
double *myArray = malloc(len * sizeof(*myArray));
if (!myArray) exit(1);
myFunction(myArray, len);
...
free(myArray);
What you've gained is that the caller can allocate the memory from anywhere that's convenient. What you've lost is that the caller has to write more code.
For an example of how to use that freedom, a caller could write:
#define SMALLSIZE 10;
void one_of_several_jobs() {
// doesn't usually require much space, occasionally does
double smallbuf[SMALLSIZE];
double *buf = 0;
size_t len = myFunction(smallbuf, SMALLSIZE);
if (len > SMALLSIZE) {
double *buf = malloc(len * sizeof(*buf));
if (!buf) {
puts("this job is too big, skipping it and moving to the next one");
return;
}
} else {
buf = smallbuf;
}
// use buf and len for something
...
if (buf != smallbuf) free(buf);
}
It's usually an unnecessary optimization to avoid a malloc in the common case where only a small buffer is needed -- this is only one example of why the caller might want a say in how the memory is allocated. A more pressing reason might be that your function is compiled into a different dll from the caller's function, perhaps using a different compiler, and the two don't use compatible implementations of malloc/free.