Reading line from file causes crash - c

I'm trying to read a line from a file character by character and place the characters in a string; here's my code:
char *str = "";
size_t len = 1; /* I also count the terminating character */
char temp;
while ((temp = getc(file)) != EOF)
{
str = realloc(str, ++len * sizeof(char));
str[len-2] = temp;
str[len-1] = '\0';
}
The program crashes on the realloc line. If I move that line outside of the loop or comment it out, it doesn't crash. If I'm just reading the characters and then sending them to stdout, it all works fine (ie. the file is opened correctly). Where's the problem?

You can't realloc a pointer that wasn't generated with malloc in the first place.
You also have an off-by-one error that will give you some trouble.

Change your code to:
char *str = NULL; // realloc can be called with NULL
size_t len = 1; /* I also count the terminating character */
int temp; /* int, not char, so EOF can be told apart from valid characters */
while ((temp = getc(file)) != EOF)
{
str = (char *)realloc(str, ++len * sizeof(char));
str[len-2] = temp;
str[len-1] = '\0';
}
The crash occurs because you called realloc with a pointer to memory that was not allocated with malloc, calloc or realloc (here, a string literal), which is not allowed.
From the realloc manpage:
realloc() changes the size of the memory block pointed to by ptr to size bytes.
The contents will be unchanged to the minimum of the old and new
sizes; newly allocated memory will be uninitialized. If ptr is NULL,
then the call is equivalent to malloc(size), for all values of size;
if size is equal to zero, and ptr is not NULL, then the call is
equivalent to free(ptr). Unless ptr is NULL, it must have been
returned by an earlier call to malloc(), calloc() or realloc(). If
the area pointed to was moved, a free(ptr) is done.
On a side note, you should really not grow the buffer one character at a time. Instead, keep two counters, one for the buffer capacity and one for the number of characters used, and only enlarge the buffer when it is full. Otherwise, every appended character can trigger a reallocation, and your algorithm will have really poor performance.
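A minimal sketch of the capacity/length approach described above. The helper name read_all, the starting capacity of 16 and the doubling policy are assumptions for illustration, not part of the original answer.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads everything up to EOF from `file` into a freshly
 * allocated, null-terminated string. Returns NULL on allocation failure.
 * The caller frees the result. */
char *read_all(FILE *file)
{
    size_t cap = 16;            /* bytes currently allocated (assumed start) */
    size_t len = 0;             /* characters stored so far */
    char *str = malloc(cap);
    if (str == NULL)
        return NULL;

    int c;                      /* int, so EOF is distinguishable from any char */
    while ((c = getc(file)) != EOF) {
        if (len + 1 >= cap) {   /* keep one byte free for the terminator */
            size_t new_cap = cap * 2;
            char *tmp = realloc(str, new_cap);
            if (tmp == NULL) {
                free(str);      /* the old block is still valid on failure */
                return NULL;
            }
            str = tmp;
            cap = new_cap;
        }
        str[len++] = (char)c;
    }
    str[len] = '\0';
    return str;
}
```

With doubling, the number of realloc calls grows with the logarithm of the input size rather than with the input size itself.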

You can't realloc a string literal. Also, reallocing for every new char isn't a very efficient way of doing this. Look into getline, originally a GNU extension and part of POSIX since POSIX.1-2008.
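For reference, a small sketch of how getline might be used here. It assumes a POSIX system. The wrapper name read_line and the choice to strip the trailing newline are illustrative assumptions.

```c
#define _POSIX_C_SOURCE 200809L  /* exposes getline() on POSIX systems */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical wrapper: returns the next line from `file` without its
 * trailing newline, or NULL at EOF or on error. The caller frees the result. */
char *read_line(FILE *file)
{
    char *line = NULL;   /* getline allocates when given NULL and 0 */
    size_t cap = 0;
    ssize_t n = getline(&line, &cap, file);
    if (n < 0) {
        free(line);      /* the buffer should be freed even on failure */
        return NULL;
    }
    if (n > 0 && line[n - 1] == '\n')
        line[n - 1] = '\0';
    return line;
}
```

getline grows the buffer itself, so no manual realloc bookkeeping is needed in the caller.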

Related

What is the difference between char *var= NULL; and char var[LENGTH + 1];?

I am creating a function to load a Hash Table and I'm getting a segmentation fault if my code looks like this
bool load(const char *dictionary)
{
// initialize vars
char *line = NULL;
size_t len = 0;
unsigned int hashed;
//open file and check it
FILE *fp = fopen(dictionary, "r");
if (fp == NULL)
{
return false;
}
while (fscanf(fp, "%s", line) != EOF)
{
//create node
node *data = malloc(sizeof(node));
//clear memory if things go south
if (data == NULL)
{
fclose(fp);
unload();
return false;
}
//put data in node
//data->word = *line;
strcpy(data->word, line);
hashed = hash(line);
hashed = hashed % N;
data->next = table[hashed];
table[hashed] = data;
dictionary_size++;
}
fclose(fp);
return true;
}
However, if I replace
char *line = NULL; by char line[LENGTH + 1]; (where length is 45)
It works. What is going on? Aren't they "equivalent"?
When you do fscanf(fp, "%s", line) it'll try to read data into the memory pointed to by line - but char *line = NULL; does not allocate any memory.
When you do char line[LENGTH + 1]; you allocate an array of LENGTH + 1 chars.
Note that if a word in the file is longer than LENGTH your program will write out of bounds. Always use bounds checking operations.
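Example (unlike printf, scanf does not take the field width from an argument, and * means "suppress assignment", so the width, 45 here for LENGTH, is written into the format string):
while (fscanf(fp, "%45s", line) == 1)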
They are not equivalent.
In the first case char *line = NULL; you have a pointer-to-char which is initialised to NULL. When you call fscanf() it tries to write data to it and this will cause it to dereference the NULL pointer. Hence segfault.
One option to fix that would have been to allocate (malloc() and friends) the required memory first, check the pointer is not NULL (allocation failed) before using it. Then you would need to free() the resources once you no longer need the data.
In the second case char line[LENGTH +1] you have an array-of-char of size LENGTH + 1. This memory has been allocated for you on the stack (the compiler ensures this happens automatically for arrays), and the memory is only 'valid' for use during the lifetime of the function: once you return you must no longer use it. Now, when you pass the pointer to fscanf() (to the first element of the array in this case), fscanf() has a memory buffer to write to. As long as the buffer is large enough to hold the data being written this works correctly.
char *line = NULL;
Says "I want a variable named 'line' that can point to characters, but is not currently pointing to anything." The compiler will allocate memory that can hold a memory address, and will fill it with zero (or some other internal representation of "points to nothing").
char line[10];
Says "allocate memory for 10 characters, and I would like to use the name 'line' for the address of the first one". It does not allocate space to hold the memory address, because that's a constant, but it does allocate space for the characters (and does not initialize them).
Declaring a pointer as NULL doesn't allocate memory for the array. When you access the pointer, you read from or write through a null pointer, which is not what you want. fscanf writes into the buffer you pass it, so the buffer must be allocated beforehand. If you want to use a pointer, then you ought to do:
char* line = malloc(LEN + 1);
When declaring as an array, then the compiler allocates memory for it, not you. This is better, in case you forget to free the memory, which the compiler won't do. Note that if you do use an array (which is a local variable in this case), it cannot be used by functions higher up on the call stack, because as I stated above, the memory gets freed upon return from the function.
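To make the array variant concrete, here is a hedged sketch of a bounded read loop. The helper name count_words is hypothetical, and LENGTH is taken to be 45 as stated in the question.

```c
#include <stdio.h>

#define LENGTH 45   /* maximum word length, as stated in the question */

/* Hypothetical helper: counts whitespace-separated words in `fp`, reading
 * each one into a fixed-size array with a field width of LENGTH. */
size_t count_words(FILE *fp)
{
    char line[LENGTH + 1];   /* storage for LENGTH chars plus '\0' */
    size_t count = 0;

    /* "%45s" stores at most 45 characters and then appends '\0'.
     * Comparing with 1 also stops on a matching failure, not only on EOF. */
    while (fscanf(fp, "%45s", line) == 1)
        count++;
    return count;
}
```

Note that with a width limit, a word longer than LENGTH is not rejected. It is read in several pieces, so a real loader may want to detect that case separately.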

realloc fails after multiple calls only when not debugging

The code below occasionally fails on the buffer = (char*) realloc(buffer, allocated * sizeof(char)); call (marked down below) that I use to dynamically allocate space for a char*, by allocating 1 char initially and doubling the allocated amount every time the memory I already have is insufficient to store the string.
I have very similar code in many other parts of my project, with the same memory allocation policy and calls (changing only the type of the void* I pass to realloc).
I am using VS2010 to debug the problem, and when I start the program on debug mode, the function always completes successfully.
However, when calling the program from the command line, there is a good chance that one of the calls to realloc will fail after some time with an "Access violation reading location" error - though it doesn't happen all the time, and only happens after the function below has been called multiple times, with many reallocations having already taken place.
What's weirder, I put some prints before and after the realloc call to assert if the pointer location was changed, and, when I did so and ran the program, the calls to realloc stopped failing randomly.
What am I doing wrong?
TOKEN
next_token_file(FILE* file,
STATE_MACHINE* sm,
STATE_MACHINE* wsssm)
{
char* buffer = (char*) malloc(sizeof(char));
size_t allocated = 1;
size_t i = 0;
while(1)
{
/*
... code that increments i by one and messes with sm a bit. Does nothing to the buffer.
*/
// XXX: This fails when using realloc. Why?
if(i + 1 >= allocated)
{
allocated = allocated << 1;
buffer = (char*) realloc(buffer, allocated * sizeof(char));
}
buffer[i] = sm->current_state->state;
/*
... more code that doesn't concern the buffer
*/
}
// Null-terminate string.
buffer[++i] = 0;
TOKEN t = {ret, buffer};
return t;
}
Due to these lines
char* buffer = (char*) malloc(16 * sizeof(char));
size_t allocated = 1;
the program shrinks buffer for the first 4 re-allocations. So the program writes to unallocated memory from i=16 on, which is undefined behaviour, so anything could happen. Also this most likely smashes the memory management which in turn makes realloc() fail.
You might like to change those two lines to be:
size_t allocated = 16; /* or = 1 if the 16 was a typo. */
char * buffer = malloc(allocated);
Other notes:
sizeof(char) is always 1.
Do not cast the result of malloc/calloc/realloc as it is not necessary nor recommended: https://stackoverflow.com/a/605858/694576.
Do check the result of library calls that can fail, such as malloc and realloc.
Regarding the last note, the following modifications should be applied:
char * buffer = malloc(allocated);
might become:
char * buffer = malloc(allocated);
if (NULL == buffer)
{
/* Error handling goes here. */
}
and
buffer = (char*) realloc(buffer, allocated * sizeof(char));
might become:
{
char * pctmp = realloc(buffer, allocated);
if (NULL == pctmp)
{
/* Error handling goes here. */
}
else
{
buffer = pctmp;
}
}
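The temporary-pointer pattern above could be wrapped in a small helper. This is only a sketch: the name grow_buffer, the initial capacity of 16 and the overflow check are assumptions, not part of the original answer.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: doubles the capacity of *buf. On success it updates
 * *buf and *cap and returns 1. On failure it returns 0 and leaves both
 * untouched, so the caller still owns (and can free) the old block. */
int grow_buffer(char **buf, size_t *cap)
{
    if (*cap > SIZE_MAX / 2)
        return 0;                         /* doubling would overflow */
    size_t new_cap = (*cap == 0) ? 16 : *cap * 2;
    char *tmp = realloc(*buf, new_cap);   /* realloc(NULL, n) acts like malloc(n) */
    if (tmp == NULL)
        return 0;
    *buf = tmp;
    *cap = new_cap;
    return 1;
}
```

Keeping the capacity and the pointer updated in one place makes it harder for the two to drift apart, which is the mismatch described at the start of this answer.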
More of a comment than an answer but I don't have 50 points to comment.
This:
char* buffer = (char*) malloc(16 * sizeof(char));
should be
char* buffer = (char*) malloc(1 * sizeof(char));
or
allocated = 16.
I don't know when you are increasing or decreasing i.
But judging by this snippet, I would bet your problem is that you are reallocating without bound, and since you are not checking whether realloc returns NULL, that will crash your program ;)
As already said, and as the printf calls that change the behaviour confirm, you are writing outside your memory block. The crash then happens when you realloc a block whose surroundings were overwritten by an out-of-range write (which is undefined behaviour anyway).
Or it happens if you work with an invalid return value (that is, NULL, which could go unnoticed because you aren't checking for it).
Or if you request a zero-sized area (size parameter is 0), get back a non-null pointer, and then work with that one.
But the 2nd case probably won't happen in your program ;)

Appending Char array to Char pointer

I have been on this for quite some time now and I can't seem to figure it out.
I have this code:
unsigned char *src;
int length = (parameterArray[i].sizeInBits/8) + 1; // check how long array should be
unsigned char tmp[length]; // declare array
memcpy(tmp, (char*)&parameterArray[i].valueU8, length); // in this case copy char to array
src = realloc(src, strlen(src) + strlen(tmp)); // reallocate space for total string
strncat(src, tmp, strlen(tmp)); // merge
Every time, the code crashes on the reallocating part.
I have tried almost everything and nothing works. Please help.
src is an uninitialized pointer and will hold an indeterminate memory address. The preconditions for realloc() state, from the linked reference page:
Reallocates the given area of memory. It must be previously allocated by malloc(), calloc() or realloc() and not yet freed with free(), otherwise, the results are undefined.
When using realloc() store the result to a temporary variable to avoid a memory leak in the event of failure.
Additionally, calling strlen() on src will also result in undefined behaviour. As first pointed out by mani tmp must be null terminated in order for strlen() and strcpy() to work correctly. The space calculated in the realloc() must be increased by one to allocate an additional char for the terminating null character.
Example code fix:
unsigned char tmp[length + 1];
memcpy(tmp, parameterArray[i].valueU8, length);
tmp[length] = 0;
unsigned char* src = NULL;
unsigned char* src_tmp = realloc(src, (src ? strlen(src) : 0) + strlen(tmp) + 1);
if (src_tmp)
{
if (!src) *src_tmp = 0; /* Ensure null character present before strcat(). */
src = src_tmp;
strcat(src, tmp);
}
In the line memcpy(tmp, (char*)&parameterArray[i].valueU8, length); you are copying valueU8 into tmp, which must end with a null terminator. Otherwise strlen(tmp) reads past the end of the array, and the program can crash in the line src = realloc(src, strlen(src) + strlen(tmp));
From man pages of realloc
Unless ptr is NULL, it must have been returned by an earlier call to malloc(), calloc() or realloc().
and your src is an uninitialized pointer.
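Putting the points from both answers together, a hedged sketch of an append helper for plain char strings follows. The name append_str is hypothetical, and the sketch assumes both strings are null-terminated.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: appends `src` to the heap string *dst, which may be
 * NULL (treated as empty). Returns 1 on success, 0 on allocation failure,
 * in which case *dst is left unchanged. */
int append_str(char **dst, const char *src)
{
    size_t old_len = (*dst != NULL) ? strlen(*dst) : 0;
    size_t add_len = strlen(src);

    char *tmp = realloc(*dst, old_len + add_len + 1);  /* +1 for '\0' */
    if (tmp == NULL)
        return 0;

    memcpy(tmp + old_len, src, add_len + 1);  /* copies the terminator too */
    *dst = tmp;
    return 1;
}
```

The destination pointer must start out as NULL or as a pointer returned by malloc, calloc or realloc. An uninitialized pointer, as in the question, is not acceptable.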

Program with realloc behave differently in Valgrind

I wrote a function to read a string with fgets that uses realloc() to make the buffer grow when needed:
char * read_string(char * message){
printf("%s", message);
size_t buffsize = MIN_BUFFER;
char *buffer = malloc(buffsize);
if (buffer == NULL) return NULL;
char *p;
for(p = buffer ; (*p = getchar()) != '\n' && *p != EOF ; ++p)
if (p - buffer == buffsize - 1) {
buffer = realloc(buffer, buffsize *= 2) ;
if (buffer == NULL) return NULL;
}
*p = 0;
p = malloc(p - buffer + 1);
if (p == NULL) return NULL;
strcpy(p, buffer);
free(buffer);
return p;
}
I compiled the program and tried it, and it worked as expected. But when I run it with valgrind, the function returns NULL when the read string is >= MIN_BUFFER and valgrind says:
(...)
==18076== Invalid write of size 1
==18076== at 0x8048895: read_string (programme.c:73)
==18076== by 0x804898E: main (programme.c:96)
==18076== Address 0x41fc02f is 0 bytes after a block of size 7 free'd
==18076== at 0x402BC70: realloc (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
==18076== by 0x8048860: read_string (programme.c:76)
(...)
==18076== Warning: silly arg (-48) to malloc()
(...)
I added a printf statement between *p=0; and p=malloc... and it confirmed that the arg passed had a value of -48.
I didn't know that programs don't run the same way when launched alone and with valgrind. Is there something wrong in my code or is it just a valgrind bug?
When you realloc the buffer, your pointer 'p' still points at the old buffer.
That will stomp memory, and also cause future allocations to use bogus values.
realloc returns a pointer to a new buffer of the requested size with the same contents as the pointer passed in, assuming that the pointer passed in was previously returned by malloc or realloc. It does not guarantee that it's the same pointer. Valgrind very likely modifies the behavior of realloc, but keeps it within the specification.
Since you are resizing memory in a loop, you would be better served by tracking your position in buffer as an offset from the beginning of buffer rather than a pointer.
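A sketch of the offset-based approach suggested above. To keep it testable it reads from a FILE * parameter instead of stdin and is named read_string_from. Both of those, and the value of MIN_BUFFER, are assumptions rather than details from the question.

```c
#include <stdio.h>
#include <stdlib.h>

#define MIN_BUFFER 8   /* assumed value; the question does not show it */

/* Variant of read_string that indexes the buffer by offset, so nothing
 * dangles when realloc moves the block. The caller frees the result. */
char *read_string_from(FILE *in)
{
    size_t size = MIN_BUFFER;
    size_t pos = 0;                 /* offset of the next free byte */
    char *buffer = malloc(size);
    if (buffer == NULL)
        return NULL;

    int c;                          /* int, so EOF can be told apart from chars */
    while ((c = getc(in)) != '\n' && c != EOF) {
        if (pos == size - 1) {      /* only room left for the terminator */
            char *tmp = realloc(buffer, size * 2);
            if (tmp == NULL) {
                free(buffer);
                return NULL;
            }
            buffer = tmp;
            size *= 2;
        }
        buffer[pos++] = (char)c;
    }
    buffer[pos] = '\0';
    return buffer;
}
```

Because pos is an integer offset, it stays meaningful no matter where realloc places the block. A saved pointer such as p does not.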
As man 3 realloc says
...The function may move the memory block to a new location.
What this means, is that
p = malloc(p - buffer + 1);
is the problem. If realloc() was called, buffer might be pointing to a new block of memory and expression
(p - buffer)
does not make any sense.

What does memcpy do exactly in this program?

I am writing a program where the input will be taken from stdin. The first input will be an integer which says the number of strings to be read from stdin.
I just read the string character by character into dynamically allocated memory and display it once the string ends.
But when the string is larger than allocated size, I am reallocating the memory using realloc. But even if I use memcpy, the program works. Is it undefined behavior to not use memcpy? But the example Using Realloc in C does not use memcpy. So which one is the correct way to do it? And is my program shown below correct?
/* ss.c
* Gets number of input strings to be read from the stdin and displays them.
* Realloc dynamically allocated memory to get strings from stdin depending on
* the string length.
*/
#include <stdio.h>
#include <stdlib.h>
int display_mem_alloc_error();
enum {
CHUNK_SIZE = 31,
};
int display_mem_alloc_error() {
fprintf(stderr, "\nError allocating memory");
exit(1);
}
int main(int argc, char **argv) {
int numStr; //number of input strings
int curSize = CHUNK_SIZE; //currently allocated chunk size
int i = 0; //counter
int len = 0; //length of the current string
int c; //will contain a character
char *str = NULL; //will contain the input string
char *str_cp = NULL; //will point to str
char *str_tmp = NULL; //used for realloc
str = malloc(sizeof(*str) * CHUNK_SIZE);
if (str == NULL) {
display_mem_alloc_error();
}
str_cp = str; //store the reference to the allocated memory
scanf("%d\n", &numStr); //get the number of input strings
while (i != numStr) {
if (i >= 1) { //reset
str = str_cp;
len = 0;
}
c = getchar();
while (c != '\n' && c != '\r') {
*str = (char *) c;
printf("\nlen: %d -> *str: %c", len, *str);
str = str + 1;
len = len + 1;
*str = '\0';
c = getchar();
if (curSize/len == 1) {
curSize = curSize + CHUNK_SIZE;
str_tmp = realloc(str_cp, sizeof(*str_cp) * curSize);
if (str_tmp == NULL) {
display_mem_alloc_error();
}
memcpy(str_tmp, str_cp, curSize); // NB: seems to work without memcpy
printf("\nstr_tmp: %d", str_tmp);
printf("\nstr: %d", str);
printf("\nstr_cp: %d\n", str_cp);
}
}
i = i + 1;
printf("\nEntered string: %s\n", str_cp);
}
return 0;
}
/* -----------------
//input-output
gcc -o ss ss.c
./ss < in.txt
// in.txt
1
abcdefghijklmnopqrstuvwxyzabcdefghij
// output
// [..snip..]
Entered string:
abcdefghijklmnopqrstuvwxyzabcdefghij
-------------------- */
Thanks.
Your program is not quite correct. You need to remove the call to memcpy to avoid an occasional, hard to diagnose bug.
From the realloc man page
The realloc() function changes the size of the memory block pointed to
by ptr to size bytes. The contents will be unchanged in the range
from the start of the region up to the minimum of the old and new
sizes
So, you don't need to call memcpy after realloc. In fact, doing so is wrong because your previous heap cell may have been freed inside the realloc call. If it was freed, it now points to memory with unpredictable content.
C11 standard (PDF), section 7.22.3.4 paragraph 2:
The realloc function deallocates the old object pointed to by ptr and returns a pointer to a new object that has the size specified by size. The contents of the new object shall be the same as that of the old object prior to deallocation, up to the lesser of the new and old sizes. Any bytes in the new object beyond the size of the old object have indeterminate values.
So in short, the memcpy is unnecessary and indeed wrong. Wrong for two reasons:
If realloc has freed your previous memory, then you are accessing memory that is not yours.
If realloc has just enlarged your previous memory, you are giving memcpy two pointers that point to the same area. memcpy has a restrict qualifier on both its input pointers which means it is undefined behavior if they point to the same object. (Side note: memmove doesn't have this restriction)
realloc enlarges the memory reserved for your string. If it can enlarge the block without moving the data, the data stays in place. If it cannot, it allocates a larger memory region, copies the data from the previous region into it itself, and frees the previous region.
In short, it is normal that you don't have to call memcpy after realloc.
From the man page:
The realloc() function tries to change the size of the allocation pointed
to by ptr to size, and returns ptr. If there is not enough room to
enlarge the memory allocation pointed to by ptr, realloc() creates a new
allocation, copies as much of the old data pointed to by ptr as will fit
to the new allocation, frees the old allocation, and returns a pointer to
the allocated memory. If ptr is NULL, realloc() is identical to a call
to malloc() for size bytes. If size is zero and ptr is not NULL, a new,
minimum sized object is allocated and the original object is freed. When
extending a region allocated with calloc(3), realloc(3) does not
guarantee that the additional memory is also zero-filled.
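As a small illustration of the behaviour quoted above, the following sketch grows a block with realloc and relies on the contents being preserved, with no memcpy afterwards. The helper name grow_copy is hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: makes a heap copy of `text`, grows it to `new_size`
 * bytes with realloc and returns it. No memcpy follows the realloc call:
 * the first strlen(text) + 1 bytes are preserved whether or not the block
 * moves. The caller frees the result. */
char *grow_copy(const char *text, size_t new_size)
{
    size_t old_size = strlen(text) + 1;
    if (new_size < old_size)
        return NULL;              /* this sketch only grows */

    char *p = malloc(old_size);
    if (p == NULL)
        return NULL;
    memcpy(p, text, old_size);    /* initial fill, not a post-realloc copy */

    char *bigger = realloc(p, new_size);
    if (bigger == NULL) {
        free(p);                  /* on failure the old block is still valid */
        return NULL;
    }
    /* `p` must not be used from here on; only `bigger` is valid. */
    return bigger;
}
```

Only the bytes up to the old size are defined after the call. Anything beyond that has indeterminate values until it is written.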

Resources