Closed. This question needs debugging details. It is not currently accepting answers.
Edit the question to include desired behavior, a specific problem or error, and the shortest code necessary to reproduce the problem. This will help others answer the question.
Closed 3 years ago.
Improve this question
I have a question about how to make a temporary string in C. What I mean is I would like to create a string on every iteration step and free the variable after it is not useful anymore.
I have seen question similar to this one, but they differ in significant ways.
So right now I have something similar to this:
for (int i = 0; i < array_size; i++) {
//aray1 and array2 are arrays of strings
char* temporary_value = make_hash(array1[i], array2[i], size[i]);
if (is_valid(temporary_value)) {
//Code, that doesn't interferate in memory, but uses temporary_value - mostly just compare to it
}
free(temporary_value);
}
Where make_hash mallocs the memory depending on size[i].
But it feels so wrong and sometimes returns segment fault.
My ideas to improve this are:
Make string array and free it after the loop
Put "make_hash" code inside the for-loop and just realloc memory during iteration and free the temporary_value after the for-loop
But these solutions seem to be also bad. How would you approach this kind of problem?
When functions return objects of a known size, it is often better to let the caller handle allocations than the functions themselves, the caller often know what kind of allocation is best (automatic, static, heap, etc..). Just pass the pointer to where you want the result when calling the function.
Hash functions often returns hashes of fixed size, so i would go for this:
for (int i = 0; i < array_size; i++) {
char buffer[HASH_SIZE];
/*
`make_hash` writes result into `buffer` and returns `buffer` on success, or
`NULL` on error
*/
char* temporary_value = make_hash(buffer, array1[i], array2[i], size[i]);
if (is_valid(temporary_value)) {
//Code, that doesn't interferate in memory, but uses temporary_value - mostly just compare to it
}
}
In case your hash function does not return a fixed-size hash value, and you want the possibility to realloc your buffer, then pass a pointer to a pointer to your buffer, together to a pointer to a variable holding the buffer size:
make_hash(char **buffer, size_t *buffer_size, const char *str1, const char *str2, size_t s)
{
size_t new_size = .....;
if (new_size > *buffer_size)
{
char *tmp = realloc(*buffer, new_size);
if (!tmp)
return NULL;
*buffer = tmp;
*buffer_size = new_size;
}
/*
Calculate hash, and store it wherever `b` is pointing
*/
char *b = *buffer;
.......
return b; /* or `NULL` on error */
}
char *buffer = NULL;
size_t buffer_size = 0;
for (int i = 0; i < array_size; i++) {
char* temporary_value = make_hash(&buffer, &buffer_size, array1[i], array2[i], size[i]);
if (is_valid(temporary_value)) {
//Code, that doesn't interferate in memory, but uses temporary_value - mostly just compare to it
}
}
free(buffer);
If you have a cheap way of calculating the hash size, without calling make_hash() you could also go for the first solution together with a variable-length array:
for (int i = 0; i < array_size; i++) {
size_t buffer_size = hash_size(.....);
char buffer[buffer_size];
/*
`make_hash` writes result into `buffer` and returns `buffer` on success, or
`NULL` on error
*/
char* temporary_value = make_hash(buffer, array1[i], array2[i], size[i]);
if (is_valid(temporary_value)) {
//Code, that doesn't interferate in memory, but uses temporary_value - mostly just compare to it
}
}
How I would approach this is not freeing within the loop. malloc/free are usually very expensive system calls which you do not want to do if you know that you will have to call malloc again.
The correct way to do this would be malloc once, and then realloc on subsequent calls using the same block of memory, and then free once you're outside the loop.
The segmentation fault might come from a number of factors, are the arrays the same length, array1, array2 and size. you are also freeing memory without checking if it was allocated.
I will rather have something like this.
if (temporary_value != NULL) {
free(temporary_value);
}
It has been long since coding in C but that should help in troubleshooting.
Related
I have a function
populateAvailableExtensions(const char** gAvailableExtensions[], int gCounter)
which take a pointer to an array of strings and the number of elements in the array as parameters.
I allocate initial memory to that array using malloc(0). Specs say that it will either return a null pointer or a unique pointer that can be passed to free().
int currentAvailableExtensionCount = gCounter;
This variable will store number of string in gAvailableExtensions.
Inside this for loop
for (int i = 0; i < availableExtensionCount; ++i)
I have this piece of code
size_t sizeOfAvailableExtensionName =
sizeof(availableExtensionProperties[i].name);
reallocStatus = realloc(*gAvailableExtensions, sizeOfAvailableExtensionName);
memcpy(&(*gAvailableExtensions)[currentAvailableExtensionCount],
&availableExtensionProperties[i].name,
sizeOfAvailableExtensionName);
++currentAvailableExtensionCount;
where
availableExtensionProperties[i].name
returns a string.
This is how that struct is defined
typedef struct Stuff {
char name[MAX_POSSIBLE_NAME];
...
...
} Stuff;
realloc(*gAvailableExtensions, sizeOfAvailableExtensionName);
should add memory of size sizeOfAvailableExtensionName to *gAvailableExtensions de-referenced array.
memcpy(&(*gAvailableExtensions)[currentAvailableExtensionCount],
&availableExtensionProperties[i].name,
sizeOfAvailableExtensionName);
should copy the string (this sizeOfAvailableExtensionName much memory) from
&availableExtensionPropterties[i].name
address to
&(*gAvailableExtensions)[currentAvailableExtensionCount]
address.
But I don't think the code does what I think it should because I'm getting this error
realloc(): invalid next size
Aborted
(core dumped) ./Executable
EDIT: Full code
uint32_t populateAvailableExtensions(const char** gAvailableExtensions[], int gCounter) {
int currentAvailableExtensionCount = gCounter;
void* reallocStatus;
uint32_t availableExtensionCount = 0;
vkEnumerateInstanceExtensionProperties(
VK_NULL_HANDLE, &availableExtensionCount, VK_NULL_HANDLE);
VkExtensionProperties availableExtensionProperties[availableExtensionCount];
vkEnumerateInstanceExtensionProperties(
VK_NULL_HANDLE, &availableExtensionCount, availableExtensionProperties);
for (int i = 0; i < availableExtensionCount; ++i) {
size_t sizeOfAvailableExtensionName =
sizeof(availableExtensionProperties[i].extensionName);
reallocStatus = realloc(*gAvailableExtensions, sizeOfAvailableExtensionName);
memcpy(&(*gAvailableExtensions)[currentAvailableExtensionCount],
availableExtensionProperties[i].extensionName,
sizeOfAvailableExtensionName);
++currentAvailableExtensionCount;
}
return currentAvailableExtensionCount;
}
This is how an external function calls on that one,
uint32_t availableExtensionCount = 0;
availableExtensions = malloc(0);
availableExtensionCount = populateAvailableExtensions(&availableExtensions);
and
const char** availableExtensions;
is declared in header file.
EDIT 2: Updated the code, now gCounter holds the number of elements in gAvailableExtensions
This loop is totally messy:
for (int i = 0; i < availableExtensionCount; ++i) {
size_t sizeOfAvailableExtensionName =
sizeof(availableExtensionProperties[i].extensionName);
reallocStatus = realloc(*gAvailableExtensions, sizeOfAvailableExtensionName);
memcpy(&(*gAvailableExtensions)[currentAvailableExtensionCount],
availableExtensionProperties[i].extensionName,
sizeOfAvailableExtensionName);
++currentAvailableExtensionCount;
}
I assume the only lines that does what you expect them to do, are the lines for (int i = 0; i < availableExtensionCount; ++i) and ++currentAvailableExtensionCount;
First, the typical way to use realloc is like this:
foo *new_p = realloc(p, new_size);
if (!new_p)
handle_error();
else
p = new_p;
The point is that realloc will not update the value of p if a reallocation happens. It is your duty to update 'p'. In your case you never update *gAvailableExtensions. I also suspect that you don't calculate sizeOfAvailableExtensionCount correctly. The operator sizeof always return a compile time constant, so the realloc doesn't actuall make any sense.
The memcpy doesn't actally make any sense either, since you are copying the string into the memory of a pointer array (probably with an additional buffer overflow).
You said that *gAvailableExtensions is a pointer to an array of pointers to strings.
That means that you have to realloc the buffer to hold the correct number of pointers, and malloc memory for each string you want to store.
For this example, I assume that .extensionName is of type char * or char[XXX]:
// Calculate new size of pointer array
// TODO: Check for overflow
size_t new_array_size =
(currentAvailableExtensionCount + availableExtensionCount) * sizeof(*gAvailableExtensions);
char **tmp_ptr = realloc(*gAvailableExtensions, new_array_size);
if (!tmp_ptr)
{
//TODO: Handle error;
return currentAvailableExtensionCount;
}
*gAvailableExtensions = tmp_ptr;
// Add strings to array
for (int i = 0; i < availableExtensionCount; ++i)
{
size_t length = strlen(availableExtensionProperties[i].extensionName);
// Allocate space for new string
char *new_s = malloc(length + 1);
if (!new_s)
{
//TODO: Handle error;
return currentAvailableExtensionCount;
}
// Copy string
memcpy (new_s, availableExtensionProperties[i].extensionName, length + 1);
// Insert string in array
(*gAvailableExtensions)[currentAvailableExtensionCount] = new_s;
++currentAvailableExtensionCount;
}
If you can guarantee that the lifetime of availableExtensionProperties[i].extensionName is longer than *gAvailableExtensions, you can simplify this a little bit by dropping malloc and memcpy in the loop, and do:
char *new_s = availableExtensionProperties[i].extensionName;
(*gAvailableExtensions)[currentAvailableExtensionCount] = new_s;
Some harsh words at the end: It seems like you have the "Infinite number of Monkeys" approach to programming, just hitting the keyboard until it works.
Such programs will just only give the illusion of working. They will break in spectacular ways sooner or later.
Programming is not a guessing game. You have to understand every piece of code you write before you move to the next one.
int currentAvailableExtensionCount =
sizeof(*gAvailableExtensions) / sizeof(**gAvailableExtensions) - 1;
is just a obfuscated way of saying
int currentAvailableExtensionCount = 0;
I stopped reading after that, because i assume that is not what you intend to write.
Pointers in c doesn't know how many elements there are in the sequence they are pointing at. They only know the size of a single element.
In your case *gAvailableExtensions is of type of char ** and **gAvailableExtensions is of type char *. Both are pointers and have the same size on a typical desktop system. So on a 64 bit desktop system the expression turns into
8/8 - 1, which equals zero.
Unless you fix this bug, or clarify that you actually want the value to always be zero, the rest of the code does not make any sense.
I'm creating a C-library with .h and .c files for a ring buffer. Ideally, you would initialize this ring buffer library in the main project with something like ringbuff_init(int buff_size); and the size that is sent, will be the size of the buffer. How can I do this when arrays in C needs to be initialized statically?
I have tried some dynamically allocating of arrays already, I did not get it to work. Surely this task is possible somehow?
What I would like to do is something like this:
int buffSize[];
int main(void)
{
ringbuffer_init(100); // initialize buffer size to 100
}
void ringbuffer_init(int buff_size)
{
buffSize[buff_size];
}
This obviously doesn't compile because the array should have been initialized at the declaration. So my question is really, when you make a library for something like a buffer, how can you initialize it in the main program (so that in the .h/.c files of the buffer library) the buffer size is set to the wanted size?
You want to use dynamic memory allocation. A direct translation of your initial attempt would look like this:
size_t buffSize;
int * buffer;
int main(void)
{
ringbuffer_init(100); // initialize buffer size to 100
}
void ringbuffer_init(size_t buff_size)
{
buffSize = buff_size;
buffer = malloc(buff_size * sizeof(int));
}
This solution here is however extremely bad. Let me list the problems here:
There is no check of the result of malloc. It could return NULL if the allocation fails.
Buffer size needs to be stored along with the buffer, otherwise there's no way to know its size from your library code. It isn't exactly clean to keep these global variables around.
Speaking of which, these global variables are absolutely not thread-safe. If several threads call functions of your library, results are inpredictible. You might want to store your buffer and its size in a struct that would be returned from your init function.
Nothing keeps you from calling the init function several times in a row, meaning that the buffer pointer will be overwritten each time, causing memory leaks.
Allocated memory must be eventually freed using the free function.
In conclusion, you need to think very carefully about the API you expose in your library, and the implementation while not extremely complicated, will not be trivial.
Something more correct would look like:
typedef struct {
size_t buffSize;
int * buffer;
} RingBuffer;
int ringbuffer_init(size_t buff_size, RingBuffer * buf)
{
if (buf == NULL)
return 0;
buf.buffSize = buff_size;
buf.buffer = malloc(buff_size * sizeof(int));
return buf.buffer != NULL;
}
void ringbuffer_free(RingBuffer * buf)
{
free(buf.buffer);
}
int main(void)
{
RingBuffer buf;
int ok = ringbuffer_init(100, &buf); // initialize buffer size to 100
// ...
ringbuffer_free(&buf);
}
Even this is not without problems, as there is still a potential memory leak if the init function is called several times for the same buffer, and the client of your library must not forget to call the free function.
Static/global arrays can't have dynamic sizes.
If you must have a global dynamic array, declare a global pointer instead and initialize it with a malloc/calloc/realloc call.
You might want to also store its size in an accompanying integer variable as sizeof applied to a pointer won't give you the size of the block the pointer might be pointing to.
int *buffer;
int buffer_nelems;
char *ringbuffer_init(int buff_size)
{
assert(buff_size > 0);
if ( (buffer = malloc(buff_size*sizeof(*buffer)) ) )
buffer_nelems = buff_size;
return buffer;
}
You should use malloc function for a dynamic memory allocation.
It is used to dynamically allocate a single large block of memory with the specified size. It returns a pointer of type void which can be cast into a pointer of any form.
Example:
// Dynamically allocate memory using malloc()
buffSize= (int*)malloc(n * sizeof(int));
// Initialize the elements of the array
for (i = 0; i < n; ++i) {
buffSize[i] = i + 1;
}
// Print the elements of the array
for (i = 0; i < n; ++i) {
printf("%d, ", buffSize[i]);
}
I know I'm three years late to the party, but I feel I have an acceptable solution without using dynamic allocation.
If you need to do this without dynamic allocation for whatever reason (I have a similar issue in an embedded environment, and would like to avoid it).
You can do the following:
Library:
int * buffSize;
int buffSizeLength;
void ringbuffer_init(int buff_size, int * bufferAddress)
{
buffSize = bufferAddress;
buffSizeLength = buff_size;
}
Main :
#define BUFFER_SIZE 100
int LibraryBuffer[BUFFER_SIZE];
int main(void)
{
ringbuffer_init(BUFFER_SIZE, LibraryBuffer ) // initialize buffer size to 100
}
I have been using this trick for a while now, and it's greatly simplified some parts of working with a library.
One drawback: you can technically mess with the variable in your own code, breaking the library. I don't have a solution to that yet. If anyone has a solution to that I would love to here it. Basically good discipline is required for now.
You can also combine this with #SirDarius 's typedef for ring buffer above. I would in fact recommend it.
I'm newbie with C and stunned with some magic while using following C functions. This code works for me, and prints all the data.
typedef struct string_t {
char *data;
size_t len;
} string_t;
string_t *generate_test_data(size_t size) {
string_t test_data[size];
for(size_t i = 0; i < size; ++i) {
string_t string;
string.data = "test";
string.len = 4;
test_data[i] = string;
}
return test_data;
}
int main() {
size_t test_data_size = 10;
string_t *test_data = generate_test_data(test_data_size);
for(size_t i = 0; i < test_data_size; ++i) {
printf("%zu: %s\n", test_data[i].len, test_data[i].data);
}
}
Why function generate_test_data works only when "test_data_size = 10", but when "test_data_size = 20" process finished with exit code 11? HOW does it possible?
This code will never work perfectly, it just happens to be working. In C, you have to manage the memory yourself. If you make a mistake, the program might continue to work... or something might scribble all over the memory you thought was yours. This often manifests itself as weird errors like you're having: it works when the length is X, but fails when the length is Y.
If you turn on -Wall, or if you're using clang even better -Weverything, you'll get a warning like this.
test.c:18:12: warning: address of stack memory associated with local variable 'test_data' returned
[-Wreturn-stack-address]
return test_data;
^~~~~~~~~
The two important kinds of memory in C are: stack and heap. Very basically, stack memory is only good for the duration of the function. Anything declared on the stack will be freed automatically when the function returns, sort of like local variables in other languages. The rule of thumb is if you don't explicitly allocate it, it's on the stack. string_t test_data[size]; is stack memory.
Heap memory you allocate and free yourself, usually using malloc or calloc or realloc or some other function doing this for you like strdup. Once allocated, heap memory stays around until it's explicitly deallocated.
Rule of thumb: heap memory can be returned from a function, stack memory cannot... well, you can but that memory slot might then be used by something else. That's what's happening to you.
So you need to allocate memory, not just once, but a bunch of times.
Allocate memory for the array of pointers to string_t structs.
Allocate memory for each string_t struct in the array.
Allocate memory for each char string (really an array) in each struct.
And then you have to free all that. Sound like a lot of work? It is! Welcome to C. Sorry. You probably want to write functions to allocate and free string_t.
static string_t *string_t_new(size_t size) {
string_t *string = malloc(sizeof(string_t));
string->len = 0;
return string;
}
static void string_t_destroy(string_t *self) {
free(self);
}
Now your test data function looks like this.
static string_t **generate_test_data_v3(size_t size) {
/* Allocate memory for the array of pointers */
string_t **test_data = calloc(size, sizeof(string_t*));
for(size_t i = 0; i < size; ++i) {
/* Allocate for the string */
string_t *string = string_t_new(5);
string->data = "test";
string->len = 4;
test_data[i] = string;
}
/* Return a pointer to the array, which is also a pointer */
return test_data;
}
int main() {
size_t test_data_size = 20;
string_t **test_data = generate_test_data_v3(test_data_size);
for(size_t i = 0; i < test_data_size; ++i) {
printf("%zu: %s\n", test_data[i]->len, test_data[i]->data);
}
/* Free each string_t in the array */
for(size_t i = 0; i < test_data_size; i++) {
string_t_destroy(test_data[i]);
}
/* Free the array */
free(test_data);
}
Instead of using pointers you could instead copy all the memory you use, which is sort of what you were previously doing. That's easier for the programmer, but inefficient for the computer. And if you're coding in C, it's all about being efficient for the computer.
Because the space for test_data in v1 gets created in the function, and that space gets reclaimed when the function returns (and can thus be used for other things); in v2, the space is set aside outside of the function, so doesn't get reclaimed.
Why function generate_test_data_v1 works only when "test_data_size = 10", but when "test_data_size = 20" process finished with exit code 11?
I see no reason why function generate_test_data_v1() should ever fail, but you cannot use its return value. It returns a pointer to an automatic variable belonging to the function's scope, and automatic variables cease to exist when the function to which they belong returns. Your program produces undefined behavior when it dereferences that pointer. I can believe that it appears to work as you intended for some sizes, but even in those cases the program is wrong.
Moreover, your program is very unlikely to be producing an exit code of 11, but it may well be terminating abruptly with a segmentation fault, which is signal 11.
And why generate_test_data_v2 works perfectly?
Function generate_test_data_v2() populates elements of an existing array belonging to function main(). That array is in scope for substantially the entire life of the program.
I'm working on a basic shell (as in the console program that awaits commands and executes them in UNIX systems) replica in C, and need to be able to manipulate 2d arrays of char to store the environment variables.
I wrote a small function to create that 2d array and initialize each string to NULL before I fill it up elsewhere in my code.
Except that it crashes as soon as the program is launched, for some reason.
I have similar issues (namely occasional segfaults, probably due to me reading/writing in an inapropriate place) with two other functions, respectively to free those 2d arrays when needed, and to get the length of one of those 2d array.
If I don't use these two functions and malloc the 2d array within the rest of my code, without initializing anything except the last entry to NULL, but instead copy the env strings directly after the malloc, I have something that works. But it'd be better to be able to prevent the memory leaks, and to have that ft_tabnew function to work so that I could reuse it in future projects.
char **ft_tabnew(size_t size)
{
char **mem;
size_t i;
if (!(mem = (char **)malloc(size + 1)))
return (NULL);
i = 0;
while (i < size + 1)
{
mem[i] = NULL;
i++;
}
return (mem);
}
void ft_tabdel(char ***as)
{
int i;
int len;
if (as == NULL)
return ;
i = 0;
len = ft_tablen(*as);
while (i < len)
{
if (*as[i])
ft_strdel(&(*as[i]));
i++;
}
free(*as);
*as = NULL;
return ;
}
size_t ft_tablen(char **tab)
{
size_t i;
i = 0;
while (tab[i])
i++;
return (i);
}
NOTE : The ft_strdel function used in ft_tabdel is freeing a string that was dynamically allocated, and sets the pointer to NULL. I've been using it for a few months in several projects and it has not failed me yet.
Hopefully, you wonderful people will be able to tell me what misconception or misunderstanding I have about 2d arrays of chars, or what stupid error I'm making here.
Thank you.
You're not allocating enough space.
if (!(mem = (char **)malloc(size + 1)))
only allocates size+1 bytes. But you need to allocate space for size+1 pointers, and pointers are typically 4 bytes. You need to multiply the number of elements by the size of each element:
if (!(mem = malloc((size + 1) * sizeof(*mem))))
In the code
char **mem;
while (i < size + 1)
{
mem[i] = NULL;
i++;
}
mem is a "pointer to a pointer to a char" and hence its size is that of a pointer, not of a char. When you say mem[i] and incement i, you increment with the size of pointer, not of char, and so overwrite memory outside your allocated memory. Try:
if (!(mem = (char **)malloc((size + 1)*sizeof(void *))))
Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 9 years ago.
Improve this question
I'm trying to implement a solution to copy a large string in memory in C.
Can you give me any advice about implementation or any reference?
I'm thinking to copy byte by byte since I don't know the length (probably I can't calculate it with strlen() since the string is very large).
Another concern is that I will have to reallocate memory on every step and I don't know how is the best way to do that. Is there any way that I can reallocate using only the reference to the last position of the memory already alocated and filled? Thus if the memory allocation fails, it will not affect the rest of the memory already filled.
What is the best value to return from this function? Should I return the number of bytes that were succesfully copied?
If there is a memory allocation fail, does realloc() set any global variable that I can check in the main function after I call the copying function? As I don't want to just return NULL from it if at some point realloc() fails, but I want to return a value more useful.
strlen() won't fail, as it uses size_t to descirbe the string's size, and size_t is large enough to hold the size of any object on the machine the program runs on.
So simply do
#define _XOPEN_SOURCE 500 /* for strdup */
#include <string.h>
int duplicate_string(const char * src, char ** pdst)
{
int result = 0;
if (NULL == ((*pdst) = strdup(src)))
{
result = -1;
}
return result;
}
If this fails try using an more clever structure to hold the data, for example by chopping it into slices:
#define _XOPEN_SOURCE 700 /* for strndup */
#include <string.h>
int slice_string(const char * src, char *** ppdst, size_t s)
{
int result = 0;
size_t s_internal = s + 1; /* Add one for the 0-terminator. */
size_t len = strlen(src) + 1;
size_t n =len/s_internal + (len%s_internal ?1 :0);
*ppdst = calloc(n + 1, sizeof(**ppdst)); /* +1 to have a stopper element. */
if (NULL == (*ppdst))
{
result = -1;
goto lblExit;
}
for (size_t i = 0; i < n; ++i)
{
(*ppdst)[i] = strndup(src, s);
if (NULL == (*ppdst)[i])
{
result = -1;
while (--i > 0)
{
free((*ppdst)[i]);
}
free(*ppdst);
*ppdst = NULL;
goto lblExit;
}
src += s;
}
lblExit:
return result;
}
Use such functions by trying dump copy first and if this fails by slicing the string.
int main(void)
{
char * s = NULL;
read_big_string(&s);
int result = 0;
char * d = NULL;
char ** pd = NULL;
/* 1st try dump copy. */
result = duplicate_string(s, &d);
if (0 != result)
{
/*2ndly try to slice it. */
{
size_t len = strlen(s);
do
{
len = len/2 + (len%2 ?1 :0);
result = slice_string(s, &pd, len);
} while ((0 != result) || (1 == len));
}
}
if (0 != result)
{
fprintf(stderr, "Duplicating the string failed.\n");
}
/* Use copies. */
if (NULL != d)
{
/* USe result from simple duplication. */
}
if (NULL != pd)
{
/* Use result from sliced duplication. */
}
/* Free the copies. */
if (NULL != pd)
{
for (size_t i = 0; pd[i]; ++i)
{
free(pd[i]);
}
}
free(pd);
free(d);
return 0;
}
realloc() failing
If there is a memory allocation fail, does realloc() set any global variable that I can check in the main function after I call the copying function? As I don't want to just return NULL from it if at some point realloc() fails, but I want to return a value more useful.
There's no problem with realloc() returning null if you use realloc() correctly. If you use realloc() incorrectly, you get what you deserve.
Incorrect use of realloc()
char *space = malloc(large_number);
space = realloc(space, even_larger_number);
If the realloc() fails, this code has overwritten the only reference to the previously allocated space with NULL, so not only have you failed to allocate new space but you also cannot release the old space because you've lost the pointer to it.
(For the fastidious: the fact that the original malloc() might have failed is not critical; space will be NULL, but that's a valid first argument to realloc(). The only difference is that there would be no previous allocation that was lost.)
Correct use of realloc()
char *space = malloc(large_number);
char *new_space = realloc(space, even_larger_number);
if (new_space != 0)
space = new_space;
This saves and tests the result of realloc() before overwriting the value in space.
Continually growing memory
Another concern is that I will have to reallocate memory on every step and I don't know how is the best way to do that. Is there any way that I can reallocate using only the reference to the last position of the memory already allocated and filled? Thus if the memory allocation fails, it will not affect the rest of the memory already filled.
The standard technique for avoiding quadratic behaviour (which really does matter when you're dealing with megabytes of data) is to double the space allocated for your working string when you need to grow it. You do that by keeping three values:
Pointer to the data.
Size of the data area that is allocated.
Size of the data area that is in use.
When the incoming data won't fit in the space that is unused, you reallocate the space, doubling the amount that is allocated unless you need more than that for the new space. If you think you're going to be adding more data later, then you might add double the new amount. This amortizes the cost of the memory allocations, and saves copying the unchanging data as often.
struct String
{
char *data;
size_t length;
size_t allocated;
};
int add_data_to_string(struct String *str, char const *data, size_t datalen)
{
if (str->length + datalen >= str->allocated)
{
size_t newlen = 2 * (str->allocated + datalen + 1);
char *newdata = realloc(str->data, newlen);
if (newdata == 0)
return -1;
str->data = newdata;
str->allocated = newlen;
}
memcpy(str->data + str->length, data, datalen + 1);
str->length += datalen;
return 0;
}
When you've finished adding to the string, you can release the unused space if you wish:
void release_unused(struct String *str)
{
char *data = realloc(str->data, str->length + 1);
str->data = data;
str->allocated = str->length + 1;
}
It is very unlikely that shrinking a memory block will move it, but the standard says:
The realloc function deallocates the old object pointed to by ptr and returns a
pointer to a new object that has the size specified by size. The contents of the new
object shall be the same as that of the old object prior to deallocation, up to the lesser of
the new and old sizes.
The realloc function returns a pointer to the new object (which may have the same
value as a pointer to the old object), or a null pointer if the new object could not be
allocated.
Note that 'may have the same value as a pointer to the old object' also means 'may have a different value from a pointer to the old object'.
The code assumes that it is dealing with null terminated strings; the memcpy() code copies the length plus one byte to collect the terminal null, for example, and the release_unused() code keeps a byte for the terminal null. The length element is the value that would be returned by strlen(), but it is crucial that you don't keep doing strlen() on megabytes of data. If you are dealing with binary data, you handle things subtly differently.
use a smart pointer and avoid copying in the first place
OK, let's use Cunningham's Question to help figure out what to do. Cunningham's Question (or Query - your choice :-) is:
What's the simplest thing that could possibly work?
-- Ward Cunningham
IMO the simplest thing that could possibly work would be to allocate a large buffer, suck the string into the buffer, reallocate the buffer down to the actual size of the string, and return a pointer to that buffer. It's the caller's responsibility to free the buffer they get when they're done with it. Something on the order of:
#define BIG_BUFFER_SIZE 100000000
char *read_big_string(FILE *f) /* read a big string from a file */
{
char *buf = malloc(BIG_BUFFER_SIZE);
fgets(buf, BIG_BUFFER_SIZE, f);
realloc(buf, strlen(buf)+1);
return buf;
}
This is example code only. There are #includes which are not included, and there's a fair number of possible errors which are not handled in the above, the implementation of which are left as an exercise for the reader. Your mileage may vary. Dealer contribution may affect cost. Check with your dealer for price and options available in your area. Caveat codor.
Share and enjoy.