Memory allocation for simple variadic string concatenation in C

I have the following test function to copy and concatenate a variable number of string arguments, allocating automatically:
char *copycat(char *first, ...) {
va_list vl;
va_start(vl, first);
char *result = (char *) malloc(strlen(first) + 1);
char *next;
strcpy(result, first);
while (next = va_arg(vl, char *)) {
result = (char *) realloc(result, strlen(result) + strlen(next) + 1);
strcat(result, next);
}
return result;
}
Problem is, if I do this:
puts(copycat("herp", "derp", "hurr", "durr"));
it should print out a 16-byte string, "herpderphurrdurr". Instead, it prints out a 42-byte string, which is the correct 16 bytes plus 26 more bytes of junk characters.
I'm not quite sure why yet. Any ideas?

The variable-argument-list functions don't magically know how many arguments there are, so you're most likely walking the stack until you happen to hit a NULL.
You either need an argument numStrings, or supply an explicit null-terminator argument after your list of strings.
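A minimal sketch of the count-based option, assuming a hypothetical function name copycat_n and a leading size_t count parameter (neither is from the question). It keeps the asker's grow-as-you-go approach but stops after exactly count arguments:

```c
#include <stdarg.h>
#include <stdlib.h>
#include <string.h>

/* Concatenates exactly `count` strings; the caller states the count up
   front, so no sentinel is needed. Returns NULL if allocation fails. */
char *copycat_n(size_t count, ...)
{
    va_list vl;
    size_t used = 0;                 /* characters stored so far */
    char *result = malloc(1);

    if (result == NULL)
        return NULL;
    result[0] = '\0';

    va_start(vl, count);
    for (size_t i = 0; i < count; i++) {
        const char *next = va_arg(vl, char *);
        size_t len = strlen(next);
        char *tmp = realloc(result, used + len + 1);
        if (tmp == NULL) {           /* keep the old block so it can be freed */
            free(result);
            va_end(vl);
            return NULL;
        }
        result = tmp;
        memcpy(result + used, next, len + 1);   /* copies the '\0' too */
        used += len;
    }
    va_end(vl);
    return result;
}
```

A call would look like copycat_n(4, "herp", "derp", "hurr", "durr"), with the result passed to free() once it is no longer needed. The drawback is that the count and the argument list can drift apart.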

You need a sentinel marker on your list:
puts(copycat("herp", "derp", "hurr", "durr", NULL));
Otherwise, va_arg doesn't actually know when to stop. The fact that you're getting junk is pure accident since you're invoking undefined behaviour. For example, when I ran your code as-is, I got a segmentation fault.
Variable argument functions, such as printf, need some sort of indication as to how many items are passed in: printf itself uses the format string up front to figure this out.
The two general methods are a count (or format string) and a sentinel (a marker at the end). A count is useful when you can't reserve one of the possible values as a sentinel.
If you can use a sentinel (like NULL in the case of pointers, or -1 in the case of non-negative signed integers), that's usually better, because you don't have to count the elements (and possibly get the element count and element list out of step).
Keep in mind that puts(copycat("herp", "derp", "hurr", "durr")); is a memory leak since you're allocating memory then losing the pointer to it. Using:
char *s = copycat("herp", "derp", "hurr", "durr");
puts(s);
free (s);
is one way to fix that, and you may want to put in error checking code in case the allocations fail.
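For illustration, here is one possible shape for a sentinel-terminated copycat with allocation checking. It is a sketch, not the asker's code: instead of calling realloc per argument it walks the list twice (using va_copy) so that a single malloc suffices, and it assumes every caller ends the list with (char *)NULL.

```c
#include <stdarg.h>
#include <stdlib.h>
#include <string.h>

/* Concatenates its arguments up to a terminating (char *)NULL.
   Returns a malloc'd string the caller must free, or NULL on failure. */
char *copycat(const char *first, ...)
{
    va_list vl, vl_count;
    const char *next;
    size_t total = 0;
    char *result, *end;

    va_start(vl, first);
    va_copy(vl_count, vl);

    /* Pass 1: add up the lengths so a single allocation suffices. */
    for (next = first; next != NULL; next = va_arg(vl_count, char *))
        total += strlen(next);
    va_end(vl_count);

    result = malloc(total + 1);
    if (result != NULL) {
        /* Pass 2: copy each piece to the current end of the buffer. */
        end = result;
        for (next = first; next != NULL; next = va_arg(vl, char *)) {
            size_t len = strlen(next);
            memcpy(end, next, len);
            end += len;
        }
        *end = '\0';
    }
    va_end(vl);
    return result;
}
```

The caller checks the result for NULL, uses it, and then calls free() on it, exactly as in the three-line snippet above.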

What I understand from your code is that you assume va_arg will return NULL once every argument has been "popped". That's wrong, as va_arg has absolutely no way to determine the number of arguments: your while loop will keep running until it happens to hit a NULL.
Solution: either provide the number of arguments, or call your function with an additional NULL argument.
PS: if you are wondering why printf doesn't require such an additional argument, it's because the number of expected arguments is deduced from the format string (the number of % conversion specifications).

As an addition to the other answers, you should cast the NULL to the expected type when using it as an argument to a variadic function: (char *)NULL. If NULL is defined as 0, then an int will be passed instead, which will accidentally work when int has the same size as the pointer and a null pointer is represented by all bits 0. But none of this is guaranteed, so you may run into strange behaviour that's hard to debug when porting the code or even when only changing the compiler.
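One way to make the typed sentinel hard to forget is to hide it behind a variadic macro. The sketch below uses a hypothetical helper total_length and a hypothetical wrapper macro TOTAL_LENGTH purely for illustration; the same trick could wrap any sentinel-terminated function.

```c
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

/* Sums the lengths of the strings up to a (char *)NULL sentinel. */
size_t total_length(const char *first, ...)
{
    va_list vl;
    size_t total = 0;

    va_start(vl, first);
    for (const char *s = first; s != NULL; s = va_arg(vl, char *))
        total += strlen(s);
    va_end(vl);
    return total;
}

/* Hypothetical wrapper: appends the correctly typed sentinel so that
   callers cannot leave it out or pass a plain 0 by mistake. */
#define TOTAL_LENGTH(...) total_length(__VA_ARGS__, (char *)NULL)
```

With the macro, TOTAL_LENGTH("herp", "derp") expands to a call that already ends in (char *)NULL.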

As others have mentioned, va_arg does not know when to stop. It is up to you to provide NULL (or some other marker) when you call the function. Just a few side notes:
You must call free on pointers you obtain from malloc and realloc.
There is no reason to cast the result of malloc or realloc in C.
When calling realloc, it is best to store the return value into a temporary variable. If realloc is unable to reallocate enough memory, it returns NULL but the original pointer is not freed. If you use realloc the way you do, and it is unable to reallocate the memory, then you have lost the original pointer and your subsequent call to strcat will likely fail. You could use it like this:
char *tmp = realloc(result, strlen(result) + strlen(next) + 1);
if (tmp == NULL)
{
// handle error here and free the memory
free(result);
}
else
{
// reallocation was successful, re-assign the original pointer
result = tmp;
}

Related

In C, why is dereferencing the pointer s to a string value not working?

Why does the program give s = null when I run it?
The get_string function can be changed to make the program work by using: string s = malloc(sizeof(string));
But I can't call free(s); at the end of the function after return s;, and if I call it before return s; I lose the data I stored.
I tried to search for more about dereferencing pointers but found nothing.
#include <stdio.h>
typedef char* string;
string get_string(string q)
{
string s = NULL;
printf("%s",q);
scanf("%s",s);
return s;
}
int main(void)
{
string a = get_string("name : ");
printf("name is %s\n",a);
}
Here are two correct uses of scanf to read a string:
char s1[10];
scanf("%9s", s1);
char *s2 = malloc(100);
scanf("%99s", s2);
Notice that in both cases — s1 and s2 — I had to explicitly say how much memory I wanted for my string. Then, when I called scanf, I included that information — 1 less than the overall string size — in the %s format, so that I could tell scanf not to read a bigger string than my string variable could hold.
Notice, by contrast, that in your get_string function, you did not allocate any memory to hold your string at all. Your variable s was a null pointer, explicitly pointing to no memory at all.
This is something that's very easy to overlook at first: Most of the time, C does not allocate memory for strings for you. Most of the time, this is your responsibility.
Now, an additional concern is that even when you do allocate memory for a string, you have to think about how long that memory will stick around, and whether anyone has to explicitly deallocate it. And there are some additional mistakes that are easy to make. In particular, suppose you took my first example (s1) to heart, and tried to fix your get_string function like this:
char *get_string(char *q)
{
char s[100]; /* WRONG */
printf("%s",q);
scanf("%99s",s);
return s;
}
Here you have given scanf a proper array to read in to, but it's local to the get_string function, meaning that it will disappear after get_string returns, and it will be useless to the caller.
Another possibility is
#include <stdio.h>
#include <stdlib.h>
char *get_string(char *q)
{
char *s = malloc(100); /* better */
if(s == NULL) {
fprintf(stderr, "out of memory!\n");
exit(1);
}
printf("%s",q);
scanf("%99s",s);
return s;
}
This will work just fine. Note that I have checked to see whether malloc succeeded. (If it fails, it returns a NULL pointer.) But now we have the issue that some memory has been allocated which might never be freed. That is, get_string returns a pointer to dynamically-allocated memory, and it's the caller's responsibility to free that memory when it's no longer needed. The caller doesn't have to, but if there end up being 1,000,000 calls to get_string, and if none of the allocated blocks ever gets freed, that's a lot of memory wasted.
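To make the ownership rule concrete, here is a small sketch of the same idea with the input stream passed in as a parameter (the name read_word and the FILE * parameter are assumptions added for illustration; passing stdin gives the behaviour of get_string without the prompt). Unlike the version above, it returns NULL on failure rather than exiting.

```c
#include <stdio.h>
#include <stdlib.h>

/* Reads one whitespace-delimited word of at most 99 characters from `in`.
   Returns malloc'd memory that the caller must free, or NULL on failure. */
char *read_word(FILE *in)
{
    char *s = malloc(100);
    if (s == NULL)
        return NULL;
    if (fscanf(in, "%99s", s) != 1) {   /* end of input or read error */
        free(s);                        /* nothing to hand back, so clean up here */
        return NULL;
    }
    return s;
}

/* Typical caller:
 *     char *a = read_word(stdin);
 *     if (a != NULL) {
 *         printf("name is %s\n", a);
 *         free(a);                     // the caller owns the memory
 *     }
 */
```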
First, as other people have noted in the comments, the typedef in this case isn't very helpful as it hides the fact that it's a pointer. Also, char* is very commonly used and not a type complicated enough to need a typedef, IMO.
As for your other issues, the problem appears to be that you are thinking of the value as a C++ string instead of a char pointer. In C there aren't string objects; instead people use char*, which can point to a block of chars, and we mark the end of the string by putting a null character at the end of the list of characters.
So the reason your original version fails is that it is undefined behavior to pass a NULL pointer to scanf or printf for %s. When you change it to s = malloc(sizeof(string)); the pointer is no longer null but instead points to the start of a block of memory that is sizeof(string) bytes long. That is the size of a char pointer (typically 4 or 8 bytes), not the size of the text you want to store. You should instead allocate room for the longest input you intend to accept plus one byte for the terminating null character, for example malloc(sizeof(char)*(NUM_CHARS+1));, and give scanf the same limit in the %s format.
When it comes to the free call: you can't call free(s) after return s; because no statement after return s; will execute, as the function has already returned. As for calling it before, free(s) deallocates the block of memory that s points to (the one obtained from malloc(sizeof(string))). Here you have to remember that the function isn't returning the memory or the string itself; it returns the pointer to the string. So if you free the memory the pointer points to, the caller receives a pointer to memory it can no longer use.

Freeing a C pointer after altering its value

Can I free a pointer such as:
unsigned char *string1=NULL;
string1=malloc(4096);
After altering its value like:
*string1+=2;
Can free(string1) recognize the corresponding memory block to free after incrementing it (for example to point to a portion of a string), or do I need to keep the original pointer value for freeing purposes?
For example, for an implementation of the Visual Basic 6 function LTrim in C, I need to pass **string as a parameter, but in the end I will return *string+=string_offset_pointer to start beyond any blank spaces/tabs.
I think that here I am altering the pointer so if I do this in this way I will need to keep a copy of the original pointer to free it. It will probably be better to overwrite the non-blank contents into the string itself and then terminate it with 0 to avoid requiring an additional copy of the pointer just to free the memory:
void LTrim(unsigned char **string)
{
unsigned long string_length;
unsigned long string_offset_pointer=0;
if(*string==NULL)return;
string_length=strlen(*string);
if(string_length==0)return;
while(string_offset_pointer<string_length)
{
if(
*(*string+string_offset_pointer)!=' ' &&
*(*string+string_offset_pointer)!='\t'
)
{
break;
}
string_offset_pointer++;
}
*string+=string_offset_pointer;
}
It would probably be best to make the function to overwrite the string with a substring of it but without altering the actual value of the pointer to avoid requiring two copies of it:
void LTrim(unsigned char **string)
{
unsigned long string_length;
unsigned long string_offset_pointer=0;
unsigned long string_offset_rebase=0;
if(*string==NULL)return;
string_length=strlen(*string);
if(string_length==0)return;
//Detect the first leftmost non-blank
//character:
///
while(string_offset_pointer<string_length)
{
if(
*(*string+string_offset_pointer)!=' ' &&
*(*string+string_offset_pointer)!='\t'
)
{
break;
}
string_offset_pointer++;
}
//Copy the non-blank spaces over the
//originally blank spaces at the beginning
//of the string, from the first non-blank
//character up to the string length:
///
while(string_offset_pointer<string_length)
{
*(*string+string_offset_rebase)=
*(*string+string_offset_pointer);
string_offset_rebase++;
string_offset_pointer++;
}
//Terminate the newly-copied substring
//with a null byte for an ASCIIZ string.
//If the string was fully blank we will
//just get an empty string:
///
*(*string+string_offset_rebase)=0;
//Free the now unused part of the
//string. It assumes that realloc()
//will keep the current contents of our
//memory buffers and will just truncate them,
//like in this case where we are requesting
//to shrink the buffer:
///
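{
unsigned char *shrunk=realloc(*string,string_offset_rebase+1);
if(shrunk!=NULL)*string=shrunk; //realloc may move the block; on failure the old one stays valid
}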
}
Since you're actually doing
unsigned char *string1=NULL;
string1=malloc(4096);
*string1+=2;
free(string1);
free(string1) IS being passed the result of a malloc() call.
The *string1 += 2 will - regardless of the call of free() - have undefined behaviour if string1[0] is uninitialised. (i.e. If there is some operation that initialises string1[0] between the second and third lines above, the behaviour is perfectly well defined).
If the asterisk is removed from *string1 += 2 to form a statement string1 += 2 then the call of free() will have undefined behaviour. It is necessary for free() to be passed a value that was returned by malloc() (or calloc() or realloc()) that has not otherwise been deallocated.
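As a sketch of the "do not move the pointer" approach, the trimming can be done by shifting the characters inside the buffer, so the value originally returned by malloc() is still the one passed to free(). The function name ltrim_in_place is an assumption for illustration, and plain char is used instead of unsigned char to match the string.h prototypes.

```c
#include <string.h>

/* Removes leading spaces and tabs by shifting the remaining characters
   to the front of the same buffer. The pointer itself never changes,
   so a malloc'd buffer can still be freed through it afterwards. */
void ltrim_in_place(char *s)
{
    size_t skip;

    if (s == NULL)
        return;
    skip = strspn(s, " \t");                          /* number of leading blanks */
    if (skip > 0)
        memmove(s, s + skip, strlen(s + skip) + 1);   /* +1 moves the '\0' too */
}
```

memmove is used rather than strcpy because the source and destination regions overlap.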
The value passed to free() must be a pointer returned by malloc(), calloc(), or realloc(). Any other value results in undefined behavior.
So you have to save the original pointer if you modify it. In your code you don't actually modify the pointer, you just increment the contents of the location that it points to, so you don't have that problem.
Why is the language specified this way, you might ask? It allows for very efficient implementations. A common design is to store the allocation size in the memory locations just before the data. So the implementation of free() simply reads the memory before that address to determine how much to reclaim. If you give some other address, there's no way for it to know that this is in the middle of an allocation and it needs to scan back to the beginning to find the information.
A more complicated design would keep a list of all the allocations, and then determine which one the address points into. But this would use more memory and would be much less efficient, since it would have to search for the containing allocation.
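The "size stored just before the data" design can be mimicked with a toy wrapper around malloc(). This is only an illustration of the idea; the names toy_alloc, toy_size and toy_free are made up, and real C libraries use their own, more elaborate bookkeeping.

```c
#include <stddef.h>
#include <stdlib.h>

/* Header placed immediately before the user data. The union with
   max_align_t keeps the user data suitably aligned for any object type. */
typedef union {
    size_t size;
    max_align_t align;
} block_header;

void *toy_alloc(size_t size)
{
    block_header *h = malloc(sizeof *h + size);
    if (h == NULL)
        return NULL;
    h->size = size;
    return h + 1;                /* user data starts right after the header */
}

/* Only works for the exact pointer toy_alloc returned: any other address
   would make this read unrelated bytes as if they were the header. */
size_t toy_size(const void *p)
{
    const block_header *h = (const block_header *)p - 1;
    return h->size;
}

void toy_free(void *p)
{
    if (p != NULL)
        free((block_header *)p - 1);   /* step back to the real block start */
}
```

Passing toy_free a pointer that has been advanced by a few bytes would step back to the wrong place, which is the same reason free() needs the original pointer.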

malloc() causing crash with de-referenced 2-D character array in loop

The arrays are initialized as:
char** aldos = NULL;
char** aldoFilenames = NULL;
Function definition is:
int readFilesFromDirectory(char*** dest, char*** nameDest)
Passed to function via:
readFilesFromDirectory(&aldos, &aldoFilenames);
After counting the files, dest and nameDest are initialized:
*dest = (char**)malloc(sizeof(char*)*count);
*nameDest = (char**)malloc(sizeof(char*)*count);
count = 0; //resetting to read in the files again
First filename for nameDest is read in like:
*nameDest[count] = (char*) malloc(sizeof(char)*strlen(findData.cFileName) + 1);
strcpy(*nameDest[count], findData.cFileName);
//can confirm in my program, the value exists properly in *nameDest[count]
count++;
Here's where the problem comes in: when I put it in a loop, it crashes (with no real useful error codes):
while (FindNextFile(hfind, &findData) != 0)
{
*nameDest[count] = (char*) malloc(sizeof(char)*strlen(findData.cFileName) + 1); //doesnt make it past here, CRASH
sprintf(*nameDest[count],"%s\0",findData.cFileName);
count++;
}
Any insight would be appreciated; I'll be quick to add more information if requested.
In *nameDest[count], the indexing operator binds more tightly than the dereference operator, making the code equivalent to *(nameDest[count]), which is not what you want since nameDest points to the variable that holds the array. You need to do the pointer dereference before the array indexing by using parentheses: (*nameDest)[count]
I should also note that polling the OS twice for the directory listing - once for the count and once for the actual names - is unreliable, as between the two polls, the count might have changed. Consider using realloc to resize the array as you find more entries.
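A rough sketch of the grow-as-you-go idea, using a hypothetical helper append_name that takes the same kind of char*** out-parameter as readFilesFromDirectory. It is independent of the Windows FindNextFile API; the caller would invoke it once per directory entry.

```c
#include <stdlib.h>
#include <string.h>

/* Appends a copy of `name` to the array at *list, growing it by one slot.
   Returns 0 on success, -1 if an allocation fails (the list is unchanged). */
int append_name(char ***list, size_t *count, const char *name)
{
    char *copy = malloc(strlen(name) + 1);
    char **grown;

    if (copy == NULL)
        return -1;
    strcpy(copy, name);

    grown = realloc(*list, (*count + 1) * sizeof **list);
    if (grown == NULL) {
        free(copy);
        return -1;
    }
    *list = grown;
    (*list)[*count] = copy;      /* dereference first, then index */
    (*count)++;
    return 0;
}
```

Starting from char **names = NULL and size_t n = 0, each call append_name(&names, &n, findData.cFileName) adds one entry, so the directory only has to be scanned once. Growing by one slot per call keeps the sketch short; doubling the capacity would reduce the number of realloc calls.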
Several problems in the code:
1) The expression sizeof(char) is defined as 1, and multiplying anything by 1 has no effect, so as part of a malloc() parameter it just clutters the code and accomplishes nothing.
Suggest removing the sizeof(char) expressions.
2) The memory allocation family (malloc, calloc, realloc) has a return type of void*, which can be assigned to any other object pointer, so the cast is unneeded; it just clutters the code and is a real headache when debugging and/or maintaining the code.
Suggest removing the casts of the returned values from malloc().
3) In C, array offsets start at 0 and end at array size - 1, so when an array of size count is allocated, the valid offsets are 0...count-1.
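However, because *nameDest[count] parses as *(nameDest[count]), the posted code indexes nameDest itself, which points to a single char** variable. Any index other than 0 is therefore past the end of that object; this is undefined behaviour and can lead to a seg fault.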

Write a function to dynamically create a new copy of string and return a pointer to the new copy (C)

this is what I have so far but I can't figure out what is wrong with it
void newCopy(char *s)
{
char newString = malloc(sizeof(int * strlen(s)));
newString = s;
return &newString;
}
void newCopy(char *s)
{
char newString = malloc(sizeof(int * strlen(s)));
First and second problems are here.
First, you're assigning the return value of malloc, which is a pointer, to a variable declared as char. The variable should be declared as char*.
Second is, your input to sizeof is wrong.
int * strlen(s) is nonsense and won't compile, because you're trying to multiply a type and an integer. You meant sizeof(int) (which is an integer) * strlen(s) (also an integer) which will compile.
You should use sizeof(char) instead of sizeof(int), since it is a string.
You should add 1 to the size, since strings in C need to be null terminated by an extra \0 that strlen does not report being part of the string length.
Putting it all together, sizeof(char)*(strlen(s)+1)
newString = s;
Third problem is here. = is not a magic operator - it assigns the value in the variable s (which is a pointer) to the variable newString (which, after fixing the above mistake, will also be a pointer). It does nothing beside that.
What you want to do instead is use strcpy, which is a function that copies the contents of one string (by following its pointer) to the contents of another string (by following its pointer). http://www.cplusplus.com/reference/cstring/strcpy/
return &newString;
Fourth and fifth problems are here.
Fourth, you have declared the function as void and here you are trying to return a char*.
Fifth is, you are trying to return a pointer to something that was declared on the stack (a local variable). As soon as the function returns, anything on the stack for that function is trash and cannot be referenced anymore.
However, if you correctly make newString of type char*, all you need to do is return newString; And you correctly return a pointer by value which points into the heap (thanks to the earlier malloc).
}
Finally, judging by this code, I should inform you that C is not a newbie friendly language, where you can just type things that 'look like' what you want to happen and pray it works. If you're even slightly wrong your code will crash and you will have zero idea why, because you don't know the right way to do things. Either read a really good book on C and teach yourself everything from basic to advanced step by step so you know how it all works, or pick up a more user friendly language.
I should start by pointing out that in my opinion, given the number and (especially) nature of the mistakes in this code, you probably need to get a good book on C.
Your newString = s; overwrites the pointer instead of copying the string into the space you just allocated. Thus, you lose the pointer to what you just allocated (leaking the memory) without making a copy. You probably want to use strcpy instead of direct assignment.
Your computation of the size you allocate isn't what you really want either. Typically, for a string of length N, you want to allocate N+1 bytes. You're currently attempting to allocate sizeof(int * strlen(s)) bytes, which shouldn't even compile.
A corrected version should be like:
char *newCopy(char *s)
{
if (s == NULL)
return NULL;
char *newString = malloc(strlen(s) + 1);
if (newString == NULL)
return NULL;
strcpy(newString, s);
return newString;
}

malloc/free, appear to be getting multiple frees

I've written a function to test if a given path is a valid Maildir directory (standard Maildir has the three subfolders "cur" "new" and "tmp" ). Function takes in the supposed directory, checks for those subfolders, and returns appropriately.
I'm getting a segfault at the second free statement with the current code, and I similarly got an "invalid next size" error with code of slightly different organization. Even more confusing, it only segfaults on some directories, while successfully completing on others, with no discernible reason (though it is consistent on which ones it will segfault on). With the second free() commented out, all accurately-formatted directories complete successfully.
Obviously I'm double-freeing. My question is, why and how? If the first free is inside the conditional statement and we return immediately after freeing, we never get to the second free. If we get to the second free, that means we skipped the first one... right?
I realize in this context it's perfectly fine because the system will reclaim the memory at the end of the program, but I'm more interested in the reason this is happening than in just making the code work. What if I were looking at a different situation, functions called by functions called by functions etc. and memory could possibly be a concern? Don't I need that 2nd free to reclaim memory?
int is_valid_folder(char* maildir)
{
struct stat *buf;
buf = (struct stat *) malloc(sizeof(struct stat));
char* new = strdup(maildir);
char* cur = strdup(maildir);
char* tmp = strdup(maildir);
strcat (cur, "/cur"); strcat (new, "/new"); strcat (tmp, "/tmp");
if(stat(cur, buf) || stat(tmp, buf) || stat(new, buf))
{
printf("Problem stat-ing one of the cur/new/tmp folders\n");
printf("Error number %d\n", errno);
free(buf);
return 1;
}
free(buf);
return 0; //a valid folder path for this function
}
You have several buffer overflows: strdup() probably allocates a char array that is just large enough to hold the maildir string, and the calls to strcat() will then overflow the arrays. (strcat(), as opposed to strdup(), does not create a new char array, so you must ensure yourself that the array you give it is large enough to hold the resulting string.)
By the way, valgrind is your friend when it comes to tracking down memory management bugs.
There's not enough space in the duplicate strings for the concatenation.
Try:
char *new = calloc(strlen(maildir) + 5, 1); /* room for "/new" and the terminating '\0' */
strcpy(new, maildir);
etc.
I know you got it, but just as a tip... (too big for a comment)
Check the return value of strdup() for NULL and free() those pointers when you are done with them. If you don't, memory will leak (it is leaking in your current code).
The strdup() function shall return a pointer to a new string, which is a duplicate of the string pointed to by s1. The returned pointer can be passed to free(). A null pointer is returned if the new string cannot be created.
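Putting the advice from these answers together, one possible rewrite is sketched below. The helper join_path is a made-up name, the function assumes a POSIX system for stat(), and it keeps the question's convention of returning 0 for a valid folder. Each path buffer is sized for the full result before anything is written into it, and everything allocated is freed on every route out.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>

/* Returns a malloc'd "dir/sub" string, or NULL if allocation fails. */
char *join_path(const char *dir, const char *sub)
{
    size_t size = strlen(dir) + 1 + strlen(sub) + 1;   /* '/' and '\0' */
    char *path = malloc(size);

    if (path != NULL)
        snprintf(path, size, "%s/%s", dir, sub);
    return path;
}

/* Returns 0 if cur, new and tmp can all be stat-ed under maildir,
   1 otherwise (including allocation failure). */
int is_valid_folder(const char *maildir)
{
    static const char *const subs[] = { "cur", "new", "tmp" };
    struct stat buf;                 /* automatic storage: nothing to free */
    int bad = 0;

    for (size_t i = 0; i < 3 && !bad; i++) {
        char *path = join_path(maildir, subs[i]);
        if (path == NULL || stat(path, &buf) != 0)
            bad = 1;
        free(path);                  /* free(NULL) is a harmless no-op */
    }
    return bad;
}
```

Like the original, this only checks that the three names exist; checking S_ISDIR(buf.st_mode) as well would confirm that they are directories.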

Resources