strcpy behaving differently if destination string is uninitialized - c

I'm working in C trying to create a Huffman decoder. This piece of code only works if codearray comes in uninitialized; otherwise it gives me a segmentation fault. However, valgrind complains that codearray is uninitialized if I do it that way. I went through it with ddd and the segmentation fault happens once strcpy is called, and I cannot figure out why.
void printtree_inorder(node* n,char* code,char* letarray,char** codearray)
{
if (n == NULL) {
return;
}
static int counter=0;
appenddigit(code,'0');
printtree_inorder(n -> left,code,letarray,codearray);
remdigit(code);
if (n->let!='\0') {
letarray[counter]=n->let;
strcpy(codearray[counter],code);
counter++;
}
appenddigit(code,'1');
printtree_inorder(n -> right,code,letarray,codearray);
remdigit(code);
}
Here is the calling function:
char code[100]={'\0'};
char** codearray=(char**)malloc(numchars*sizeof(char*));
for (i=0;i<numchars;i++) {
codearray[i]=(char*)malloc(100*sizeof(char));
}
char* letarray=(char*)malloc((numchars+1)*sizeof(char));
letarray[0]='\0';
printtree_inorder(root,code,letarray,codearray);

for (i=0;i<numchars;i++) {
codearray[i]=(char*)malloc(100*sizeof(char));
}
Is this the code you are talking about? It is not really initialization code; it makes room for the data.
char** codearray=(char**)malloc(numchars*sizeof(char*));
only creates an array of char *, whose elements do not yet point to any valid memory.
So your "initialization code" is what makes sure that every element points to allocated memory.
The other thing that really worries me is that your counter variable is static.
Calling
printtree_inorder(root,code,letarray,codearray);
printtree_inorder(root,code,letarray,codearray);
can also end in a segmentation fault, since counter is already at least numchars when you call the function a second time (from outside), so every write lands past the end of the arrays.
So, let's rewrite your code a bit and make it safer:
char* code = (char *)malloc(numchars + 1);
memset(code, 0, numchars + 1);
char* letarray = (char *)malloc(numchars + 1);
memset(letarray, 0, numchars + 1);
char** codearray = (char **)malloc(numchars * sizeof(char *));
memset(codearray, 0, numchars * sizeof(char *));
int counter = 0;
printtree_inorder(root, code, letarray, codearray, &counter);
free(code);
// do not forget to free the other allocations later as well
// counter is passed by pointer so that increments made while visiting the
// left subtree are still visible when the right subtree is visited
void printtree_inorder(node* n,char* code,char* letarray,char** codearray, int* counter)
{
if (n == NULL) {
return;
}
appenddigit(code,'0');
printtree_inorder(n -> left,code,letarray,codearray, counter);
remdigit(code);
if (n->let!='\0')
{
letarray[*counter] = n->let;
codearray[*counter] = strdup(code); // allocates memory for the copy
++*counter;
}
appenddigit(code,'1');
printtree_inorder(n -> right,code,letarray,codearray, counter);
remdigit(code);
}

Probably in the "initialized" call the array isn't really correctly initialized at all, therefore the function crashes.
When "not initialized", the array probably contains values that (by chance) don't lead to a segmentation fault, depending on what the program has done previously with the memory that ends up being used for codearray.
The function tries to copy a string to wherever codearray[counter] points to:
strcpy(codearray[counter],code);
In the function call you show this codearray[counter] is a random value since only the array was malloc'ed, but the elements weren't initialized to any specific values. strcpy() then tries to write to that random memory address.
You have to allocate memory for the copy of the string, for example by using strdup() instead of strcpy().
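A minimal sketch of that fix follows. The helper name store_code is hypothetical and not part of the original program; it does by hand what strdup() does on POSIX systems (allocate strlen + 1 bytes, then copy), so each slot of codearray ends up pointing at memory it owns.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: store a private copy of `code` in slot `index`.
 * Returns 0 on success, -1 if the copy could not be allocated. */
int store_code(char **codearray, size_t index, const char *code)
{
    size_t len = strlen(code) + 1;   /* +1 for the terminating '\0' */
    char *copy = malloc(len);
    if (copy == NULL)
        return -1;
    memcpy(copy, code, len);
    codearray[index] = copy;         /* the slot now points to valid memory */
    return 0;
}
```

Because every slot gets its own copy, later changes to the shared code buffer do not affect the stored strings. Each stored string must eventually be released with free().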


How to fix segfault caused by a realloc going out of bounds?

Hello and TIA for your help. As I am new to posting questions, I welcome any feedback on how this question has been asked. I have researched a lot on SO without finding what I was looking for.
I'm still working on it, and I'm not really good at C.
My purpose is extracting data from certain specific tags from a given XML and writing it to file. My issue arises because as I try to fill up the data struct I created for this purpose, at a certain point the realloc() function gives me a pointer to an address that's out of bounds.
If you look at this example
#include <stdio.h>
int main() {
char **arrayString = NULL;
char *testString;
testString = malloc(sizeof("1234567890123456789012345678901234567890123456789"));
strcpy(testString, "1234567890123456789012345678901234567890123456789");
int numElem = 0;
while (numElem < 50) {
numElem++;
arrayString = realloc(arrayString, numElem * sizeof(char**));
arrayString[numElem-1] = malloc(strlen(testString)+1);
strcpy(arrayString[numElem-1], testString);
}
printf("done\n");
return 0;
}
it does a similar, but simplified thing to my code. Basically tries to fill up the char** with c strings but it goes to segfault. (Yes I understand I am using strcpy and not its safer alternatives, but as far as I understand it copies until the '\0', which is automatically included when you write a string between "", and that's all I need)
I'll explain in more depth below.
In this code I make use of libxml2, but you don't need to know it to help me.
I have a custom struct declared this way:
struct List {
char key[24][15];
char **value[15];
int size[15];
};
struct List *list; //i've tried to make this static after reading that it could make a difference but to no avail
It is filled with the necessary key values. list->size[] is initialized with zeros, to keep track of how many values I've inserted in value.
value is declared this way because, for each key, I need an array of char* to store each and every value associated with it. (I thought this through, but it could be the wrong approach and I welcome suggestions, though that's not the purpose of the question.)
I loop through the xml file, and for each node I do a strcmp between the name of the node and each of my keys. When there is a match, the index of that key is used as an index in the value matrix. I then try to extend the allocated memory for the c string matrix and then afterwards for the single char*.
The "broken" code follows, where:
read is the index of the key mentioned above.
reader is the xmlNode.
string contained the name of the xmlNode but is then freed, so consider it a new char*.
list is the struct declared above.
if (xmlTextReaderNodeType(reader) == 3 && read >= 0)
{
/* pull out the node value */
xmlChar *value;
value = xmlTextReaderValue(reader);
if (value != NULL) {
free(string);
string=strdup(value);
/*increment array size */
list->size[read]++;
/* allocate char** */
list->value[read]=realloc(list->value[read],list->size[read] * sizeof(char**));
if (list->value[read] == NULL)
return 16;
/*allocate string (char*) memory */
list->value[read][list->size[read]-1] = realloc(list->value[read][list->size[read]-1], sizeof(char*)*sizeof(string));
if (list->value[read][list->size[read]-1] == NULL)
return 16;
/*write string in list */
strcpy(list->value[read][list->size[read]-1], string);
}
/*free memory*/
xmlFree(value);
}
xmlFree(name);
free(string);
I'd expect this to allocate the char**, and then the char*, but after a few iterations of this code (which is a function wrapped in a while loop) I get a segfault.
Analyzing this with gdb (I'm not an expert with it, I just learned it on the fly), I noticed that the code does seem to work as expected for 15 iterations. At the 16th iteration, after the size is incremented, list->value[read][list->size[read]-1] points to 0x51, which gdb marks as an address out of bounds. The realloc only changes it to 0x3730006c6d782e31, still marked as out of bounds. I would expect it to point at the last allocated value.
Here is an image of that: https://imgur.com/a/FAHoidp
How can I properly allocate the needed memory without going out of bounds?
Your code has quite a few problems:
You are not including all the appropriate headers. How did you get this to compile? If you are using malloc and realloc, you need to #include <stdlib.h>. If you are using strlen and strcpy, you need to #include <string.h>.
Not really a mistake, but unless you are applying sizeof to a type itself, you don't need the enclosing parentheses.
Stop using sizeof str to get the length of a string. The correct and safe approach is strlen(str)+1. If you apply sizeof to a pointer someday you will run into trouble.
Don't use sizeof(type) as argument to malloc, calloc or realloc. Instead, use sizeof *ptr. This will avoid your incorrect numElem * sizeof(char**) and instead replace it with numElem * sizeof *arrayString, which correctly translates to numElem * sizeof(char*). This time, though, you were saved by the pure coincidence that sizeof(char**) == sizeof(char*), at least on GCC.
If you are dynamically allocating memory, you must also deallocate it manually when you no longer need it. Use free for this purpose: free each arrayString[i] first, then free(arrayString); and free(testString);.
Not really a mistake, but if you want to cycle through elements, use a for loop, not a while loop. This way your intention is known by every reader.
This code compiles fine on GCC:
#include <stdio.h> //NULL, printf
#include <stdlib.h> //malloc, realloc, free
#include <string.h> //strlen, strcpy
int main()
{
char** arrayString = NULL;
char* testString;
testString = malloc(strlen("1234567890123456789012345678901234567890123456789") + 1);
strcpy(testString, "1234567890123456789012345678901234567890123456789");
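for (int numElem = 1; numElem <= 50; numElem++) //50 elements, as in the original loop
{
arrayString = realloc(arrayString, numElem * sizeof *arrayString);
arrayString[numElem - 1] = malloc(strlen(testString) + 1);
strcpy(arrayString[numElem - 1], testString);
}
for (int i = 0; i < 50; i++) //free every string before the array itself
free(arrayString[i]);
free(arrayString);
free(testString);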
printf("done\n");
return 0;
}
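The XML code in the question has a further problem that the simplified example does not show. After the outer realloc() grows the array, the new last slot holds an indeterminate value, and passing that value to the inner realloc() is undefined behavior (the 0x51 that gdb reports looks like such a leftover value). The size expression sizeof(char*)*sizeof(string) also measures pointers, not the length of the string. A minimal sketch of a safer append, where append_string is a hypothetical helper name and not something from the question:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: append a copy of `s` to the array `*arr`, which
 * currently holds `*count` strings. Returns 0 on success, -1 on failure. */
int append_string(char ***arr, size_t *count, const char *s)
{
    /* Grow the pointer array by one slot; keep the old pointer on failure. */
    char **grown = realloc(*arr, (*count + 1) * sizeof *grown);
    if (grown == NULL)
        return -1;
    *arr = grown;

    /* The new slot is uninitialized, so allocate it with malloc, not realloc. */
    size_t len = strlen(s) + 1;
    char *copy = malloc(len);
    if (copy == NULL)
        return -1;
    memcpy(copy, s, len);

    grown[*count] = copy;
    (*count)++;
    return 0;
}
```

With a helper like this, the loop body in the question would reduce to one call per text node, for example append_string(&list->value[read], &count, string), assuming the element count is kept in a size_t and list->value[read] starts out as NULL.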

Function receives a pointer to double, allocates memory and fills resulted array of doubles [duplicate]

My goal is to pass a pointer to double to a function, dynamically allocate memory inside of the function, fill resulted array with double values and return filled array. After lurking attentively everywhere in StackOverflow, I have found two related topics, namely Initializing a pointer in a separate function in C and C dynamically growing array. Accordingly, I have tried to write my own code. However, the result was not the same as it was described in aforementioned topics. This program was run using both gcc and Visual Studio.
First trial.
int main()
{
double *p;
int count = getArray(&p);
<...print content of p...>
return 0;
}
int getArray(double *p)
{
int count = 1;
while(1)
{
if(count == 1)
p = (double*)malloc(sizeof(double));
else
p = (double*)realloc(p, count*sizeof(double));
scanf("%lf", &p[count-1]);
<...some condition to break...>
count++;
}
<... print the content of p ...>
return count;
}
(Here comes the warning from compiler about incompatible argument type. Ignore it).
Input:
1.11
2.22
3.33
Output:
1.11
2.22
3.33
0.00
0.00
0.00
Second trial.
int main()
{
double *p;
int count = getArray(&p);
<...print content of p...>
return 0;
}
int getArray(double **p)
{
int count = 1;
while(1)
{
if(count == 1)
*p = (double*)malloc(sizeof(double));
else
{
double ** temp = (double*)realloc(*p, count*sizeof(double));
p = temp;
}
scanf("%lf", &(*p)[count-1]);
<...some condition to break...>
count++;
}
<... print the content of p ...>
return count;
}
Input:
1.11
2.22
Segmentation error.
I tried this method on several different *nix machines, and it fails when the loop uses realloc. SURPRISINGLY, this code works perfectly using Visual Studio.
My questions are: first code allows to allocate and reallocate the memory and even passes all this allocated memory to main(), however, all the values are zeroed. What is the problem? As for the second program, what is the reason of the segmentation error?
The right way of doing it is like this:
int getArray(double **p)
{
int count = 0;
while(1)
{
if(count == 0)
*p = malloc(sizeof(**p));
else
*p = realloc(*p, (count+1)*sizeof(**p));
scanf("%lf", &((*p)[count]));
<...some condition to break...>
count++;
}
<...print content of p...>
return count;
}
If you pass a pointer to a function and you want to change not only the value it points at but also the address it holds, the function needs a pointer to that pointer (a double pointer). The only alternative is to return the new pointer and have the caller assign it.
And save yourself some trouble by using sizeof(var) instead of sizeof(type). If you write int *p; p = malloc(sizeof(int));, then you are writing the same thing (int) twice, which means that you can mess things up if they don't match, which is exactly what happened to you. This also makes it harder to change the code afterwards, because you need to change at multiple places. If you instead write int *p; p = malloc(sizeof(*p)); that risk is gone.
Plus, don't cast malloc. It's completely unnecessary.
One more thing you always should do when allocating (and reallocating) is to check if the allocation was successful. Like this:
if(count == 0)
*p = malloc(sizeof(**p));
else
*p = realloc(*p, (count+1)*sizeof(**p));
if(!*p) { /* Handle error */ }
Also note that it is possible to reallocate a NULL pointer, so in this case the malloc is not necessary. Just use the realloc call on its own, without the if statement. One thing worth mentioning is that if you want to be able to continue execution when realloc fails, you should NOT assign its return value directly to *p. If realloc fails, it returns NULL and you will lose the pointer to whatever you had before. Do it like this instead:
int getArray(double **p)
{
int count = 0;
// If *p is not pointing to allocated memory or NULL, the behavior
// of realloc will be undefined.
*p = NULL;
while(1)
{
void *tmp = realloc(*p, (count+1)*sizeof(**p));
if(!tmp) {
fprintf(stderr, "Fail allocating");
exit(EXIT_FAILURE);
}
*p = tmp;
// I prefer fgets and sscanf. Makes it easier to avoid problems
// with remaining characters in stdin and it makes debugging easier
const size_t str_size = 100;
char str[str_size];
if(! fgets(str, str_size, stdin)) {
fprintf(stderr, "Fail reading");
exit(EXIT_FAILURE);
}
if(sscanf(str, "%lf", &((*p)[count])) != 1) {
fprintf(stderr, "Fail converting");
exit(EXIT_FAILURE);
}
count++;
// Just an arbitrary exit condition
if ((*p)[count-1] < 1) {
printf("%lf\n", (*p)[count-1]);
break;
}
}
return count;
}
You mentioned in comments below that you're having trouble with pointers in general. That's not unusual. It can be a bit tricky, and it takes some practice to get used to it. My best advice is to learn what * and & really mean and really think things through. * is the dereference operator, so *p is the value that exists at address p. **p is the value that exists at address *p. The address operator & is kind of an inverse to *, so *&x is the same as x. Also remember that the [] operator used for indexing is just syntactic sugar. It works like this: p[5] translates to *(p+5), which has the funny effect that p[5] is the same as 5[p].
In my first version of above code, I used p = tmp instead of *p = tmp and when I constructed a complete example to find that bug, I also used *p[count] instead of (*p)[count]. Sorry about that, but it does emphasize my point. When dealing with pointers, and especially pointers to pointers, REALLY think about what you're writing. *p[count] is equivalent to *(*(p+count)) while (*p)[count] is equivalent to *((*p) + count) which is something completely different, and unfortunately, none of these mistakes was caught even though I compiled with -Wall -Wextra -std=c18 -pedantic-errors.
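The two points above can be checked with a small sketch. Both function names are hypothetical and exist only for illustration.

```c
#include <stddef.h>

/* Return element i of the array that *pp points to.
 * The parentheses matter: (*pp)[i] dereferences first and then indexes,
 * whereas *pp[i] would index pp itself and then dereference. */
double element_via_double_pointer(double **pp, size_t i)
{
    return (*pp)[i];
}

/* p[i] is defined as *(p + i), so all three spellings read the same
 * element. Returns 1 when they agree. */
int same_element(const int *p, size_t i)
{
    return p[i] == *(p + i) && p[i] == i[p];
}
```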
You mentioned in comments below that you need to cast the result of realloc. That probably means that you're using a C++ compiler, and in that case you need to cast, and it should be (double *). In that case, change to this:
double *tmp = (double*)realloc(*p, (count+1)*sizeof(**p));
if(!tmp) {
fprintf(stderr, "Fail allocating");
exit(EXIT_FAILURE);
}
*p = tmp;
Note that I also changed the type of the pointer. In C, it does not matter what type of pointer tmp is, but in C++ it either has to be a double* or you would need to do another cast: *p = (double*)tmp
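For reference, here is a compact sketch of the same pattern that does not depend on stdin. It is a hypothetical variant, not the code from the question: parse_doubles reads whitespace-separated numbers from a string with strtod, grows the array through a temporary pointer, and hands the result back through the double pointer.

```c
#include <stdlib.h>

/* Hypothetical variant of getArray: parses whitespace-separated doubles
 * from `text`. Stores the new array in *out and returns the number of
 * values, or -1 if an allocation fails. */
int parse_doubles(const char *text, double **out)
{
    double *arr = NULL;   /* realloc(NULL, n) behaves like malloc(n) */
    int count = 0;

    for (;;) {
        char *end;
        double value = strtod(text, &end);
        if (end == text)
            break;                    /* no further number to convert */

        double *tmp = realloc(arr, (count + 1) * sizeof *tmp);
        if (tmp == NULL) {
            free(arr);                /* the old block is still valid */
            *out = NULL;
            return -1;
        }
        arr = tmp;
        arr[count++] = value;
        text = end;                   /* continue after the parsed number */
    }
    *out = arr;   /* write through the double pointer so the caller sees it */
    return count;
}
```

A caller would write double *p; int n = parse_doubles("1.11 2.22 3.33", &p); and later free(p). If the text contains no numbers, *out is set to NULL and the count is 0.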

Mysterious segfault though pointer is initialised

I am a newbie in C and I am trying to program a simple text editor. I have already written about 100 lines of stupid, messy code, but it just worked, until this SEGFAULT started showing up. My approach is to switch the terminal to canonical mode, get input letter by letter from the user, and do what is necessary with each of them. The letters are added to a buffer, which is realloc'ed with an extra 512 bytes when the buffer is half filled, which I know is a stupid thing to do. But I can't determine the cause of the SEGFAULT. Help would be appreciated. Here's my code:
char* t_buf;
int t_buf_len = 0;
int cur_buf_sz = 0;
void mem_mgr(char *buffer, unsigned long bytes){ //Function to allocate requested memory
if(buffer == NULL){
if((buffer = malloc(sizeof(char) * bytes)) == NULL){
printf("ERROR:Cannot get memory resources(ABORT)\n\r");
exit(1);
}
}
else{
realloc(buffer, bytes);
}
}
void getCharacter(){
if(t_buf_len >= (cur_buf_sz/2))
mem_mgr(t_buf, cur_buf_sz+=512);
strcpy(t_buf, "Yay! It works!");
printf("%s %d", t_buf, cur_buf_sz);
}
There are a few things you need to understand first.
The buffer pointer is a local variable inside the mem_mgr() function. It initially holds the same address as t_buf, but once you assign to it, it is no longer related to t_buf in any way.
So, when you return from mem_mgr(), you lose the reference to the allocated memory, and t_buf still holds its old value.
To fix this, you can pass a pointer to the pointer and alter the actual pointer by dereferencing it.
The realloc() function behaves exactly like malloc() if the first argument is NULL, as its documentation states.
The return value of memory allocation functions MUST be checked to ensure it is a valid pointer. That is why you need a temporary pointer to store the return value of realloc(): if it returns NULL, meaning that there was no memory to fulfill the request, assigning it directly would overwrite your only reference to the original block of memory, and you could not free it anymore.
You need to pass a pointer to your pointer to mem_mgr(), like this:
int
mem_mgr(char **buffer, unsigned long bytes)
{
void *tmp = realloc(*buffer, bytes);
if (tmp != NULL) {
*buffer = tmp;
return 0;
}
return -1;
}
And then, to allocate memory
void
getCharacter()
{
if (t_buf_len >= (cur_buf_sz / 2)) {
if (mem_mgr(&t_buf, cur_buf_sz += 512) != -1) {
strcpy(t_buf, "Yay! It works!");
printf("%s %d", t_buf, cur_buf_sz);
}
}
}
The call to
mem_mgr(t_buf, cur_buf_sz+=512);
cannot change the actual parameter t_buf. You will either have to return the buffer from mem_mgr
t_buf = mem_mgr(t_buf, cur_buf_sz+=512);
or pass a pointer to t_buf
mem_mgr(&t_buf, cur_buf_sz+=512);
Furthermore, a call to realloc may change the address of the memory buffer, so you will have to use
char *tmpbuf = realloc(buffer, bytes);
if (!tmpbuf)
// Error handling
else
buffer = tmpbuf;
realloc(NULL, bytes); will behave like a malloc, so you don't need a separate branch here. This makes in total:
char *mem_mgr(char *buffer, unsigned long bytes){ //Function to allocate requested memory
char *tmpbuf = realloc(buffer, bytes);
if (!tmpbuf) {
// Error handling
}
return tmpbuf;
}
which calls into question whether the function mem_mgr needs to exist at all.
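Putting both answers together, a growable text buffer could be sketched as follows. The name buf_push and the separate len/cap bookkeeping are assumptions made for illustration; only the 512-byte growth step comes from the question.

```c
#include <stdlib.h>

#define GROW_STEP 512  /* growth increment taken from the question */

/* Hypothetical helper: append `c` to the buffer, growing it when full.
 * Returns 0 on success, -1 if realloc fails (the old buffer stays valid). */
int buf_push(char **buf, size_t *len, size_t *cap, char c)
{
    if (*len + 1 >= *cap) {                 /* keep room for the terminator */
        char *tmp = realloc(*buf, *cap + GROW_STEP);
        if (tmp == NULL)
            return -1;
        *buf = tmp;                         /* update the caller's pointer */
        *cap += GROW_STEP;
    }
    (*buf)[(*len)++] = c;
    (*buf)[*len] = '\0';                    /* keep it a valid C string */
    return 0;
}
```

The caller starts with char *buf = NULL; size_t len = 0, cap = 0; and calls buf_push(&buf, &len, &cap, ch) for every character read. Because the address of buf is passed, the caller always sees the block that realloc returned.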

Trying to understand C sizing better

First things first, the code works, but it didn't for a while, and I'm trying to understand why what I did fixes it.
So I have a function:
int array_size(const char **array) {
int i = 0;
while (array[i] != NULL) ++i;
return i;
}
I also have this pointer which I started with one element and a call to a function which modifies local_mig:
int main(void) {
char **local_mig = malloc(sizeof(char *) * 1);
populate_local_mig(&local_mig);
int size = array_size(local_mig); // 9
}
This function looks like this (note the comment on second to last line):
void populate_local_mig(char ***local_mig) {
// ...above here reads a directory with 5 .sql files
while ((directory = readdir(dir)) != NULL) {
int d_name_len = strlen(directory->d_name);
char *file_name = malloc(sizeof(char) * (d_name_len + 1));
strcpy(file_name, (const char *)directory->d_name);
size_t len = strlen(file_name);
if (len > 4 && strcmp(file_name + len - 4, ".sql") == 0) {
(*local_mig)[i] = malloc(sizeof(char) * (len + 1));
strcpy((*local_mig)[i], file_name);
++i;
*local_mig = realloc(*local_mig, sizeof(char *) * (i + 1));
}
}
//(*local_mig)[i] = NULL;
}
Still with me? Good.
Later on, I call array_size(local_mig); and it returns 9. What the? I was expecting 5. So naturally when I iterate over local_mig later, I eventually segfault when it tries to read the 6th element.
So, I added (*local_mig)[i] = NULL; and suddenly everything was ok and it returned 5, like it should have.
All along I figured since I allocated exactly enough space to fit each character array, that the size would obviously be the number of times I resized local_mig.
Turns out I was wrong... very very wrong. But why, I ask...
If you don't set the last pointer in your list to NULL, you will encounter undefined behavior in your array_size function, as it rolls right past the end of the array (with no marker to stop it) and into memory that you probably do not own and is not initialized.
The unpredicted size of 9 is the result of the aforementioned undefined behavior. It's probably the result of whatever was in memory at the time. Really, though, with UB, anything can happen.
The loop in array_size eventually gets up to testing array[i] != NULL, where i is the last index in the space you allocated with realloc.
If you actually did set this entry to NULL then all is well. But if you didn't: uninitialized values are different to null pointers. Reading an uninitialized value may cause a crash, or the compiler may optimize the program based on the assumption that you never read uninitialized values because the language specification says you aren't meant to do that!
A likely result is that this last entry will appear to contain a junk value which probably does not match NULL. Your loop then continues to read past the end of the allocated space, with unpredictable results.
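One way to make the terminator hard to forget is to keep the array NULL-terminated after every append. The sketch below assumes the file names are already available as an array of strings; filter_sql is a hypothetical helper, and the directory reading from the question is left out.

```c
#include <stdlib.h>
#include <string.h>

/* Count the entries of a NULL-terminated array of strings. */
size_t array_size(char *const *array)
{
    size_t i = 0;
    while (array[i] != NULL)
        ++i;
    return i;
}

/* Hypothetical helper: copy every name ending in ".sql" into a new
 * NULL-terminated array. Returns NULL if an allocation fails. */
char **filter_sql(const char *const *names, size_t n)
{
    char **out = malloc(sizeof *out);      /* one slot for the sentinel */
    if (out == NULL)
        return NULL;
    size_t count = 0;
    out[0] = NULL;                         /* terminated from the start */

    for (size_t i = 0; i < n; i++) {
        size_t len = strlen(names[i]);
        if (len <= 4 || strcmp(names[i] + len - 4, ".sql") != 0)
            continue;

        char *copy = malloc(len + 1);
        char **grown = realloc(out, (count + 2) * sizeof *grown);
        if (grown != NULL)
            out = grown;
        if (copy == NULL || grown == NULL) {   /* clean up and give up */
            free(copy);
            for (size_t j = 0; j < count; j++)
                free(out[j]);
            free(out);
            return NULL;
        }
        memcpy(copy, names[i], len + 1);
        out[count++] = copy;
        out[count] = NULL;                 /* re-terminate after each append */
    }
    return out;
}
```

Since the sentinel is written after each append, array_size never reads an uninitialized slot, no matter where the loop stops.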

malloc convert memory for struct

What is the right way to malloc memory? And what is the difference between them?
void parse_cookies(const char *cookie, cookie_bank **my_cookie, int *cookies_num)
{
*my_cookie = malloc(sizeof(cookie_bank) * 1);
*my_cookie = (cookie_bank *)malloc(sizeof(cookie_bank) * 1);
my_cookie = (cookie_bank **)malloc(sizeof(cookie_bank) * 1);
///
}
I'm trying to malloc an array of cookie_bank structs inside a function.
I'm assuming that you want the function to allocate memory for an array and pass the result back via a pointer parameter. So, you want to write T * x = malloc(...), and assign the result through the pointer argument, *y = x:
cookie_bank * myarray;
parse_cookies(..., &myarray, ...);
/* now have myarray[0], myarray[1], ... */
So the correct invocation should be, all rolled into one line,
parse_cookies(..., cookie_bank ** y, ...)
{
*y = malloc(sizeof(cookie_bank) * NUMBER_OF_ELEMENTS);
}
Your second example is the most correct. You don't need the *1 obviously.
*my_cookie = (cookie_bank *)malloc(sizeof(cookie_bank) * 1);
Your first example is also correct C, although a C++ compiler will complain about the implicit conversion from void*:
*my_cookie = malloc(sizeof(cookie_bank) * 1);
If you want to allocate more than one entry, you'd generally use calloc(), because it zeros the memory too:
*my_cookie = (cookie_bank*)calloc(1, sizeof(cookie_bank));
Your third example is just wrong:
my_cookie = (cookie_bank **)malloc(sizeof(cookie_bank) * 1);
This will overwrite the local my_cookie pointer, and the memory will be lost on function return.
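A minimal sketch of the working pattern, under the assumption that cookie_bank is a plain struct. The field layout below is invented for illustration because the question does not show the real definition, and alloc_cookies is a hypothetical helper name.

```c
#include <stdlib.h>

/* Hypothetical layout; the question does not show the real cookie_bank. */
typedef struct {
    char name[32];
    char value[64];
} cookie_bank;

/* Allocate `n` zero-initialized cookie_bank structs and hand them back
 * through `out`. Returns 0 on success, -1 on failure or when n is 0. */
int alloc_cookies(cookie_bank **out, size_t n)
{
    if (n == 0)
        return -1;
    cookie_bank *arr = calloc(n, sizeof *arr);   /* zeroed array of n structs */
    if (arr == NULL)
        return -1;
    *out = arr;   /* assign through the pointer so the caller sees the array */
    return 0;
}
```

The caller declares cookie_bank *cookies; calls alloc_cookies(&cookies, count); and then uses cookies[0] through cookies[count - 1], releasing the whole array with a single free(cookies).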
I would recommend reading a C textbook. It seems to me that you do not have a clear understanding of how pointers work in C.
Anyway, here is some example to allocate memory with malloc.
#include <stdlib.h>
void parse_cookies(const char *cookie, cookie_bank **my_cookie, int *cookies_num)
{
if (cookies_num == NULL || *cookies_num == 0) {
return;
}
if (my_cookie == NULL) {
my_cookie = (cookie_bank**)malloc(sizeof(cookie_bank*) * *cookies_num);
}
for (int i = 0; i < *cookies_num; i++) {
*my_cookie = (cookie_bank*)malloc(sizeof(cookie_bank));
my_cookie++;
}
}
Of course, this example does not include any error handling. Here my_cookie is a pointer to a pointer: it points to the memory location that holds an array of pointers. The first malloc allocates that array, using the size of a pointer times the requested number of cookie structures. The second malloc then allocates memory for each structure. Note that the array allocated by the first malloc is assigned only to the local copy of my_cookie, so the caller never sees it; the function is only usable when the caller passes in an array with room for *cookies_num pointers.
The problem with this function is that it can easily cause memory leaks unless it is used very carefully.
Anyway, it is important to understand how C pointers work.
