I am writing a program where the input will be taken from stdin. The first input will be an integer which says the number of strings to be read from stdin.
I just read the string character-by-character into a dynamically allocated memory and displays it once the string ends.
But when the string is larger than allocated size, I am reallocating the memory using realloc. But even if I use memcpy, the program works. Is it undefined behavior to not use memcpy? But the example Using Realloc in C does not use memcpy. So which one is the correct way to do it? And is my program shown below correct?
/* ss.c
* Gets number of input strings to be read from the stdin and displays them.
* Realloc dynamically allocated memory to get strings from stdin depending on
* the string length.
*/
#include <stdio.h>
#include <stdlib.h>
int display_mem_alloc_error();
enum {
CHUNK_SIZE = 31,
};
int display_mem_alloc_error() {
fprintf(stderr, "\nError allocating memory");
exit(1);
}
int main(int argc, char **argv) {
int numStr; //number of input strings
int curSize = CHUNK_SIZE; //currently allocated chunk size
int i = 0; //counter
int len = 0; //length of the current string
int c; //will contain a character
char *str = NULL; //will contain the input string
char *str_cp = NULL; //will point to str
char *str_tmp = NULL; //used for realloc
str = malloc(sizeof(*str) * CHUNK_SIZE);
if (str == NULL) {
display_mem_alloc_error();
}
str_cp = str; //store the reference to the allocated memory
scanf("%d\n", &numStr); //get the number of input strings
while (i != numStr) {
if (i >= 1) { //reset
str = str_cp;
len = 0;
}
c = getchar();
while (c != '\n' && c != '\r') {
*str = (char *) c;
printf("\nlen: %d -> *str: %c", len, *str);
str = str + 1;
len = len + 1;
*str = '\0';
c = getchar();
if (curSize/len == 1) {
curSize = curSize + CHUNK_SIZE;
str_tmp = realloc(str_cp, sizeof(*str_cp) * curSize);
if (str_tmp == NULL) {
display_mem_alloc_error();
}
memcpy(str_tmp, str_cp, curSize); // NB: seems to work without memcpy
printf("\nstr_tmp: %d", str_tmp);
printf("\nstr: %d", str);
printf("\nstr_cp: %d\n", str_cp);
}
}
i = i + 1;
printf("\nEntered string: %s\n", str_cp);
}
return 0;
}
/* -----------------
//input-output
gcc -o ss ss.c
./ss < in.txt
// in.txt
1
abcdefghijklmnopqrstuvwxyzabcdefghij
// output
// [..snip..]
Entered string:
abcdefghijklmnopqrstuvwxyzabcdefghij
-------------------- */
Thanks.
Your program is not quite correct. You need to remove the call to memcpy to avoid an occasional, hard to diagnose bug.
From the realloc man page
The realloc() function changes the size of the memory block pointed to
by ptr to size bytes. The contents will be unchanged in the range
from the start of the region up to the minimum of the old and new
sizes
So, you don't need to call memcpy after realloc. In fact, doing so is wrong because your previous heap cell may have been freed inside the realloc call. If it was freed, it now points to memory with unpredictable content.
C11 standard (PDF), section 7.22.3.4 paragraph 2:
The realloc function deallocates the old object pointed to by ptr and returns a pointer to a new object that has the size specified by size. The contents of the new object shall be the same as that of the old object prior to deallocation, up to the lesser of the new and old sizes. Any bytes in the new object beyond the size of the old object have indeterminate values.
So in short, the memcpy is unnecessary and indeed wrong. Wrong for two reasons:
If realloc has freed your previous memory, then you are accessing memory that is not yours.
If realloc has just enlarged your previous memory, you are giving memcpy two pointers that point to the same area. memcpy has a restrict qualifier on both its input pointers which means it is undefined behavior if they point to the same object. (Side note: memmove doesn't have this restriction)
Realloc enlarge the memory size where reserved for your string. If it is possible to enlarge it without moving the datas, those will stay in place. If it cannot, it malloc a lager memory plage, and memcpy itself the data contained in the previous memory plage.
In short, it is normal that you dont have to call memcpy after realloc.
From the man page:
The realloc() function tries to change the size of the allocation pointed
to by ptr to size, and returns ptr. If there is not enough room to
enlarge the memory allocation pointed to by ptr, realloc() creates a new
allocation, copies as much of the old data pointed to by ptr as will fit
to the new allocation, frees the old allocation, and returns a pointer to
the allocated memory. If ptr is NULL, realloc() is identical to a call
to malloc() for size bytes. If size is zero and ptr is not NULL, a new,
minimum sized object is allocated and the original object is freed. When
extending a region allocated with calloc(3), realloc(3) does not guaran-
tee that the additional memory is also zero-filled.
Related
I am creating a function to load a Hash Table and I'm getting a segmentation fault if my code looks like this
bool load(const char *dictionary)
{
// initialize vars
char *line = NULL;
size_t len = 0;
unsigned int hashed;
//open file and check it
FILE *fp = fopen(dictionary, "r");
if (fp == NULL)
{
return false;
}
while (fscanf(fp, "%s", line) != EOF)
{
//create node
node *data = malloc(sizeof(node));
//clear memory if things go south
if (data == NULL)
{
fclose(fp);
unload();
return false;
}
//put data in node
//data->word = *line;
strcpy(data->word, line);
hashed = hash(line);
hashed = hashed % N;
data->next = table[hashed];
table[hashed] = data;
dictionary_size++;
}
fclose(fp);
return true;
}
However If I replace
char *line = NULL; by char line[LENGTH + 1]; (where length is 45)
It works. What is going on aren't they "equivalent"?
When you do fscanf(fp, "%s", line) it'll try to read data into the memory pointed to by line - but char *line = NULL; does not allocate any memory.
When you do char line[LENGTH + 1]; you allocate an array of LENGTH + 1 chars.
Note that if a word in the file is longer than LENGTH your program will write out of bounds. Always use bounds checking operations.
Example:
while (fscanf(fp, "%*s", LENGTH, line) != EOF)
They are not equivalent.
In the first case char *line = NULL; you have a pointer-to-char which is initialised to NULL. When you call fscanf() it tries to write data to it and this will cause it to dereference the NULL pointer. Hence segfault.
One option to fix that would have been to allocate (malloc() and friends) the required memory first, check the pointer is not NULL (allocation failed) before using it. Then you would need to free() the resources once you no longer need the data.
In the second case char line[LENGTH +1] you have an array-of-char of size LENGTH + 1. This memory has been allocated for you on the stack (the compiler ensures this happens automatically for arrays), and the memory is only 'valid' for use during the lifetime of the function: once you return you must no longer use it. Now, when you pass the pointer to fscanf() (to the first element of the array in this case), fscanf() has a memory buffer to write to. As long as the buffer is large enough to hold the data being written this works correctly.
char *line = NULL;
Says "I want a variable named 'line' that can point to characters, but is not currently pointing to anything." The compiler will allocate memory that can hold a memory address, and will fill it with zero (or some other internal representation of "points to nothing").
char line[10];
Says "allocate memory for 10 characters, and I would like to use the name 'line' for the address of the first one". It does not allocate space to hold the memory address, because that's a constant, but it does allocate space for the characters (and does not initialize them).
Declaring a pointer as NULL doesn't allocate memory for the array. When you access the pointer, then what gets executed is reading / writing to a null pointer, which is not what you want. How fscanf works is it writes out to the buffer you sent, hence meaning that the buffer must be allocated before hand. If you want to use a pointer, then you ought to do:
char* line = malloc(LEN + 1);
When declaring as an array, then the compiler allocates memory for it, not you. This is better, in case you forget to free the memory, which the compiler won't do. Note that if you do use an array (which is a local variable in this case), it cannot be used by functions higher up on the call stack, because as I stated above, the memory gets freed upon return from the function.
When we allocating memory spaces for a string, do the following 2 ways give the same result?
char *s = "abc";
char *st1 = (char *)malloc(sizeof(char)*strlen(s));
char *st2 = (char *)malloc(sizeof(s));
In other words, does allocate the memory based on the size of its characters give the same result as allocating based on the size of the whole string?
If I do use the later method, is it still possible for me to add to that memory spaces character by character such as:
*st = 'a';
st++;
*st = 'b';
or do I have to add a whole string at once now?
Let's see if we can't get you straightened out on your question and on allocating (and reallocating) storage. To begin, when you declare:
char *s = "abc";
You have declared a pointer to char s and you have assigned the starting address for the String Literal "abc" to the pointer s. Whenever you attempt to use sizeof() on a_pointer, you get sizeof(a_pointer) which is typically 8-bytes on x86_64 (or 4-bytes on x86, etc..)
If you take sizeof("abc"); you are taking the size of a character array with size 4 (e.g. {'a', 'b', 'c', '\0'}), because a string literal is an array of char initialized to hold the string "abc" (including the nul-terminating character). Also note, that on virtually all systems, a string literal is created in read-only memory and cannot be modified, it is immutable.
If you want to allocate storage to hold a copy of the string "abc", you must allocate strlen("abc") + 1 characters (the +1 for the nul-terminating character '\0' -- which is simply ASCII 0, see ASCII Table & Description.
Whenever you allocate memory, you have 2 responsibilities regarding any block of memory allocated: (1) always preserve a pointer to the starting address for the block of memory so, (2) it can be freed when it is no longer needed. So if you allocate for char *st = malloc (len + 1); characters, you do not want to iterate with the pointer st (e.g. no st++). Instead, declare a second pointer, char *p = st; and you are free to iterate with p.
Also, in C, there is no need to cast the return of malloc, it is unnecessary. See: Do I cast the result of malloc?.
If you want to add to an allocation, you use realloc() which will create a new block of memory for you and copy your existing block to it. When using realloc(), you always reallocate using a temporary pointer (e.g. don't st = realloc (st, new_size);) because if when realloc() fails, it returns NULL and if you assign that to your pointer st, you have just lost the original pointer and created a memory leak. Instead, use a temporary pointer, e.g. void *tmp = realloc (st, new_size); then validate realloc() succeeds before assigning st = tmp;
Now, reading between the lines that is where you are going with your example, the following shows how that can be done, keeping track of the amount of memory allocated and the amount of memory used. Then when used == allocated, you reallocate more memory (and remembering to ensure you have +1 bytes available for the nul-terminating character.
A short example would be:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define THISMANY 23
int main (void) {
char *s = "abc", *st, *p; /* string literal and pointer st */
size_t len = strlen(s), /* length of s */
allocated = len + 1, /* number of bytes in new block allocated */
used = 0; /* number of bytes in new block used */
st = malloc (allocated); /* allocate storage for copy of s */
p = st; /* pointer to allocate, preserve st */
if (!st) { /* validate EVERY allocation */
perror ("malloc-st");
return 1;
}
for (int i = 0; s[i]; i++) { /* copy s to new block of memory */
*p++ = s[i]; /* (could use strcpy) */
used++; /* advance counter */
}
*p = 0; /* nul-terminate copy */
for (size_t i = 0; i < THISMANY; i++) { /* loop THISMANY times */
if (used + 1 == allocated) { /* check if realloc needed (remember '\0') */
/* always realloc using temporary pointer */
void *tmp = realloc (st, 2 * allocated); /* realloc 2X current */
if (!tmp) { /* validate EVERY reallocation */
perror ("realloc-st");
break; /* don't exit, original st stil valid */
}
st = tmp; /* assign reallocated block to st */
allocated *= 2; /* update allocated amount */
}
*p++ = 'a' + used++; /* assign new char, increment used */
}
*p = 0; /* nul-terminate */
printf ("result st : %s\n" /* output final string, length, allocated */
"length st : %zu bytes\n"
"final size : %zu bytes\n", st, strlen(st), allocated);
free (st); /* don't forget to free what you have allocated */
}
Example Use/Output
$ ./bin/sizeofs
result st : abcdefghijklmnopqrstuvwxyz
length st : 26 bytes
final size : 32 bytes
Look things over and let me know if this answered your questions, and if not, leave a comment and I'm happy to help further.
If you are still shaky on what a pointer is, and would like more information, here are a few links that provide basic discussions of pointers that may help. Difference between char pp and (char) p? and Pointer to pointer of structs indexing out of bounds(?)... (ignore the titles, the answers discuss pointer basics)
I am trying to write a basic CSV parser in C that generates a dynamic array of char* when given a char* and a separator character, such as a comma:
char **filldoc_parse_csv(char *toparse, char sepchar)
{
char **strings = NULL;
char *buffer = NULL;
int j = 0;
int k = 1;
for(int i=0; i < strlen(toparse); i++)
{
if(toparse[i] != sepchar)
{
buffer = realloc(buffer, sizeof(char)*k);
strcat(buffer, (const char*)toparse[i]);
k++;
}
else
{
strings = realloc(strings, sizeof(buffer)+1);
strings[j] = buffer;
free(buffer);
j++;
}
}
return strings;
}
However, when I call the function as in the manner below:
char **strings = filldoc_parse_csv("hello,how,are,you", ',');
I end up with a segmentation fault:
Program received signal SIGSEGV, Segmentation fault.
__strcat_sse2 () at ../sysdeps/x86_64/multiarch/../strcat.S:166
166 ../sysdeps/x86_64/multiarch/../strcat.S: No such file or directory.
(gdb) backtrace
#0 __strcat_sse2 () at ../sysdeps/x86_64/multiarch/../strcat.S:166
#1 0x000000000040072c in filldoc_parse_csv (toparse=0x400824 "hello,how,are,you", sepchar=44 ',') at filldocparse.c:20
#2 0x0000000000400674 in main () at parsetest.c:6
The problem is centered around allocating enough space for the buffer string. If I have to, I will make buffer a static array, however, I would like to use dynamic memory allocation for this purpose. How can I do it correctly?
Various problems
strcat(buffer, (const char*)toparse[i]); attempts to changes a char to a string.
strings = realloc(strings, sizeof(buffer)+1); reallocates the same amount of space. sizeof(buffer) is the size of the pointer buffer, not the size of memory it points to.
The calling function has no way to know how many entries in strings. Suggest andding a NULL sentinel.
Minor: better to use size_t rather than int. Use more descriptive names. Do not re-call strlen(toparse) repetitively. Use for(int i=0; toparse[i]; i++) . Make toparse a const char *
You have problems with your memory allocations. When you do e.g. sizeof(buffer) you will get the size of the pointer and not what it points to. That means you will in the first run allocate five bytes (on a 32-bit system), and the next time the function is called you will allocate five bytes again.
There are also many other problems, like you freeing the buffer pointer once you assigned the pointer to strings[j]. The problem with this is that the assignment only copies the pointer and not what it points to, so by freeing buffer you also free strings[j].
Both the above problems will lead to your program having undefined behavior, which is the most common cause of runtime-crashes.
You should also avoid assigning the result of realloc to the pointer you're trying to reallocate, because if realloc fails it will return NULL and you loose the original pointer causing a memory leak.
I am doing an exercise for fun from KandR C programming book. The program is for finding the longest line from a set of lines entered by the user and then prints it.
Here is what I have written (partially, some part is taken from the book directly):-
#include <stdio.h>
#include <stdlib.h>
int MAXLINE = 10;
int INCREMENT = 10;
void copy(char longest[], char line[]){
int i=0;
while((longest[i] = line[i]) != '\0'){
++i;
}
}
int _getline(char s[]){
int i,c;
for(i=0; ((c=getchar())!=EOF && c!='\n'); i++){
if(i == MAXLINE - 1){
s = (char*)realloc(s,MAXLINE + INCREMENT);
if(s == NULL){
printf("%s","Unable to allocate memory");
// goto ADDNULL;
exit(1);
}
MAXLINE = MAXLINE + INCREMENT;
}
s[i] = c;
}
if(c == '\n'){
s[i] = c;
++i;
}
ADDNULL:
s[i]= '\0';
return i;
}
int main(){
int max=0, len;
char line[MAXLINE], longest[MAXLINE];
while((len = _getline(line)) > 0){
printf("%d", MAXLINE);
if(len > max){
max = len;
copy(longest, line);
}
}
if(max>0){
printf("%s",longest);
}
return 0;
}
The moment I input more than 10 characters in a line, the program crashes and displays:-
*** Error in `./a.out': realloc(): invalid old size: 0x00007fff26502ed0 ***
======= Backtrace: =========
/lib64/libc.so.6[0x3d6a07bbe7]
/lib64/libc.so.6[0x3d6a07f177]
/lib64/libc.so.6(realloc+0xd2)[0x3d6a0805a2]
./a.out[0x400697]
./a.out[0x40083c]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x3d6a021b45]
./a.out[0x400549]
I also checked realloc invalid old size but could not follow the logic of passing a pointer to a pointer to the function modifying the array.
realloc call will re-allocate memory by taking a pointer to a storage area on the heap i.e a dynamically allocated memory result of a call to malloc.
In your case the problem is that you are allocating memory on the stack and not dynamically by a call to malloc which results memory allocation on the heap. And, passing the pointer of the automatic character array line to _getline which uses it for call to realloc. So, you got the error.
Try dynamically allocating the character array line :
char* line = (char *) malloc(MAXLINE);
char* longest = ( char *) malloc(MAXLINE);
You get an invalid old size error when your code writes the memory that malloc/realloc allocated for "housekeeping information". This is where they store the "old" allocated size. This also happens when the pointer that you pass to realloc has not been properly initialized, i.e. it's neither a NULL nor a pointer previously returned from malloc/calloc/realloc.
In your case, the pointer passed to realloc is actually an array allocated in automatic memory - i.e. it's not a valid pointer. To fix, change the declarations of line and longest as follows:
char *line = malloc(MAXLINE), *longest = malloc(MAXLINE);
To avoid memory leaks, make sure that you call free(line) and free(longest) at the end of your program.
You are trying to realloc() memory that was not allocated dynamically with malloc(). You cannot do that.
Also, if realloc() fails, the original memory is still allocated, and thus still needs to be freed with free(). So do not assign the return value of realloc() to the original pointer unless it is not NULL, otherwise you are leaking the original memory. Assign the return value of realloc() to a temp variable first, check its value, and then assign to the original pointer only if realloc() was successful.
if _getline() reads 10 or more characters, it will call realloc() on memory that was not allocated with malloc(). This is undefined behavior.
Additionally, the memory allocated from realloc() will be leaked at the end of the call to _getline().
Additionally, let's assume that the input string is "0123456789\n". Then you will attempt to write to longest that value, but you will never call realloc() before you do that, which is required.
Your error is here:
int _getline(char s[])
It means _getline is a function returning int getting a pointer to char by value.
You actually want to pass that pointer (which must point to memory allocated with malloc() or be NULL) by reference.
That means, you need to pass a pointer to a pointer to char.
Correcting that error will force you to correct all follow-on errors too.
Only use free / realloc on NULL resp. on pointers returned from malloc, calloc, realloc or a function specified to return such a pointer.
Here is my situation:
main allocates memory based on string and calls function by passing an address. The function then appropriately resizes the passed memory to accommodate more data. After which when I try to release the memory I get heap error.
Here is the code:
typedef char * string;
typedef string * stringRef;
/**************************
main
**************************/
int main()
{
string input = "Mary had";
string decoded_output = (string)calloc(strlen(input), sizeof(char));
sprintf(decoded_output, "%s", input);
gen_binary_string(input, &decoded_output);
free(decoded_output); /*this causes issue*/
return 0;
}
void gen_binary_string(string input,stringRef output)
{
int i=0, t=0;
size_t max_chars = strlen(input);
/*
the array has to hold total_chars * 8bits/char.
e.g. if input is Mary => array size 4*8=32 + 1 (+1 for \0)
*/
string binary_string = (string)calloc((BINARY_MAX*max_chars) + 1, sizeof(char));
int offset = 0;
/* for each character in input string */
while (*(input+i))
{
/* do some binary stuff... */
}
/* null terminator */
binary_string[BINARY_MAX*max_chars] = '\0';
int newLen = strlen(binary_string);
string new_output = (string) realloc((*output), newLen);
if (new_output == NULL)
{
printf("FATAL: error in realloc!\n");
free(binary_string);
return;
}
strcpy(new_output, binary_string);
(*output) = new_output;
free(binary_string);
}
You may be misunderstanding the purpose of realloc. Calling realloc will not necessarily return a newly allocated object. If possible, it will return the same object, extended to hold more bytes. Also, it automatically copies the object's contents. Theferore: (1) you should not copy and (2) you should not free the old buffer.
The realloc() function changes the size of the memory block pointed
to by ptr to size bytes. The contents will be unchanged in the range
from the start of the region up to the minimum of the old and new
sizes... (snip) Unless ptr is NULL, it must have been returned by an
earlier call to malloc(), calloc() or realloc(). If the area
pointed to was moved, a free(ptr) is done.
After a better reading of your code, I don't understand why you're using realloc at all here, as you're not using the old contents of output. You'd get the same behaviour (and the same error) if you replaced realloc with malloc. I think your real problem is that you're not allocating enough bytes: you should have strlen(binary_string) + 1 to accommodate the '\0' at the end of the string.
A better option would be to pass in a char** from the caller, let the callee allocate the char* and then pass back the pointer at the end of the function.
This prevents the need for two allocations and one free (always a bad sign).