I'm trying to read a CSV file and am writing a function to parse a line of data into an array of strings, which dynamically changes the size of the array and updates size and str_size accordingly. I have written a properly-working function called find_key() to locate the fseek() location of the line in question. I'm coming across a problem that I think relates to the allocation of the string array: I get a segmentation fault on the line at the bottom of the while loop, where it reads data[data_count][str_pos] = curr. The program breaks when I try to access data[0][0], even though as far as I can tell I've allocated memory properly. Any help would be appreciated!
/**
* @brief Get a row from the provided CSV file by first item. Dynamically
* allocated memory to data array
*
* @param file
* @param key First item of row
* @param data Array of strings containing data
* @param size Size of array
* @param str_size Size of strings in array
* @return 0 if successful, -1 if the row cannot be found, or 1 otherwise
*/
int csv_get_row(FILE *file, char *key, char **data, size_t *size, size_t *str_size) {
if(!file || !key) return 1;
/* Get the position of the beginning of the line starting with the key */
long pos = find_key(file, key);
if(pos == -1) return -1;
fseek(file, pos, SEEK_SET);
/* If these parameters aren't useful values, assign default values */
if(*size < 1) *size = DEFAULT_ARRAY_SIZE;
if(*str_size < 1) *str_size = DEFAULT_BUFFER_SIZE;
/* If the memory for the array hasn't been allocated, do so now */
if(!data) data = (char**) malloc(*size * *str_size);
/* Get characters one-by-one, keeping track of the current amount of elements and the current buffer position */
size_t data_count = 0;
size_t str_pos = 0;
char curr;
while(fscanf(file, "%c", &curr)) {
if(data_count >= *size) data = (char**) realloc(data, (*size *= 2) * *str_size);
if(str_pos >= *str_size) data = (char**) realloc(data, *size * (*str_size *= 2));
if(curr == ',') {
data[data_count][str_pos] = '\0';
data_count++;
str_pos = 0;
continue;
}
if(curr == '\n') {
data[data_count][str_pos] = '\0';
data_count++;
break;
}
data[data_count][str_pos] = curr;
str_pos++;
}
/* Resize the array to fit */
*size = data_count;
data = (char**) realloc(data, *size * *str_size);
return 0;
}
Assume that *size starts at 1. You set data_count to 0. Then, in the first iteration,
you do not have data_count >= *size, so you don't realloc(). You increment data_count
to 1, though, so in the next iteration, you grow the data buffer to 2.
So, at this point, data has length 2 and data_count is 1.
Let's then say we do not go into the first if statement but the second.
There, you increment data_count to two. You then access data[data_count], which is one
past the last element. That is a problem.
That might be the problem you see at the end of the while-loop, but it is far from the
only problem. data is a char **, yet no memory is ever allocated for the individual
strings: the single malloc() block is read as an array of uninitialized pointers, so
dereferencing data[0] is already invalid. In addition, whenever you malloc() or realloc()
data, you are invalidating the pointer the caller has, because you might be freeing the
memory at the original location. You never give the caller a pointer to the new data back;
all the newly allocated data will be leaked when you return from the function, and the
caller must never access data after the call, or risk a segfault.
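One way to address both points is to return the row through an extra level of indirection (char ***) and to give every field its own allocation. What follows is only a minimal sketch under some assumptions: the file position has already been moved to the start of the row (for example with find_key() and fseek() as in the question), and csv_parse_row, csv_free_row and push_field are hypothetical names that do not come from the question.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Free a row produced by csv_parse_row (hypothetical helper). */
void csv_free_row(char **row, size_t count) {
    for (size_t i = 0; i < count; i++) free(row[i]);
    free(row);
}

/* Append one finished field to the row, growing the pointer array as needed.
 * Returns 0 on success, 1 on allocation failure. */
static int push_field(char ***row, size_t *count, size_t *cap, char *field) {
    if (*count == *cap) {
        size_t new_cap = *cap ? *cap * 2 : 4;
        char **tmp = realloc(*row, new_cap * sizeof **row);
        if (!tmp) return 1;
        *row = tmp;
        *cap = new_cap;
    }
    (*row)[(*count)++] = field;
    return 0;
}

/* Read one comma-separated line starting at the current file position.
 * On success returns 0 and stores a malloc'd array of malloc'd strings
 * in *out and the number of fields in *out_count. Returns 1 on failure. */
int csv_parse_row(FILE *file, char ***out, size_t *out_count) {
    assert(file && out && out_count);
    char **row = NULL;
    size_t count = 0, cap = 0;
    char *field = NULL;
    size_t len = 0, field_cap = 0;
    int done = 0;

    while (!done) {
        int c = fgetc(file);
        done = (c == EOF || c == '\n');
        int end_of_field = done || c == ',';
        /* Make room for one more byte (a character or the terminator). */
        if (len + 1 > field_cap) {
            size_t new_cap = field_cap ? field_cap * 2 : 16;
            char *tmp = realloc(field, new_cap);
            if (!tmp) goto fail;
            field = tmp;
            field_cap = new_cap;
        }
        if (!end_of_field) {
            field[len++] = (char)c;
            continue;
        }
        field[len] = '\0';
        if (push_field(&row, &count, &cap, field)) goto fail;
        field = NULL;            /* ownership moved into row */
        len = field_cap = 0;
    }
    *out = row;
    *out_count = count;
    return 0;

fail:
    free(field);
    csv_free_row(row, count);
    return 1;
}
```

The caller would declare char **row; size_t n; call csv_parse_row(file, &row, &n), use row[0] through row[n - 1], and finally call csv_free_row(row, n). Note that this sketch does not handle quoted fields, and a call at end of file yields a single empty field.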
I want to write a function that adds an Entry to the first free slot of an array list (if there is a slot free) and a function that merges all those entries to one "String" (I know there are no strings) separated by ', '.
Example list contains: "Anna", "Alex", "Anton" in this order
Output:"Anna,Alex,Anton".
This is the .c file (only the toDo's should be done)
/** The function adds an entry name to the array list at the
* first free position, if such a position exists.
* If there is no free space, i.e. no NULL entry,
* the array remains unchanged. When inserting a name,
* enough memory for it is first allocated on the heap using malloc
* (note the space for the final \0). After that,
* strncpy is used to copy the string name into the allocated
* memory area.
* @param list: Array of \0 terminated strings
* @param listsize: size of array
* @param name: \0 terminated string that should be copied into the array
*/
void addNameToList(char **list, size_t listsize, char *name) {
//toDo
}
/** The function joins all entries in the list array into one ','-separated,
* zero-terminated string that is stored in destbuffer.
* Entries with the value NULL are not added.
* So that chained calls are possible, the function returns the
* pointer destbuffer as its return value.
* Example: The list contains the names "Anna", "Alex", "Anton" in this
* order. The function returns "Anna, Alex, Anton".
* @param list: array of \0 terminated strings
* @param listsize: size of array
* @param destbuffer: destination buffer
* @param buffersize: size of destbuffer
* @return destbuffer
*/
char *printListToString(char **list, size_t listsize, char *destbuffer, size_t buffersize) {
//toDo
return destbuffer;
}
void freeList(char **list, size_t listsize) {
size_t index = 0;
for (index = 0; index < listsize; index++)
if (list[index] != NULL)
free(list[index]);
free(list);
}
char **initList(size_t size) {
return calloc(size, sizeof(char *));
}
This is the given Main (not to change):
int main(void) {
char **names;
char outputbuffer[100];
names = initList(LIST_SIZE);
addNameToList(names, LIST_SIZE, "Alice");
addNameToList(names, LIST_SIZE, "Bob");
addNameToList(names, LIST_SIZE, "Carla");
addNameToList(names, LIST_SIZE, "Dana");
addNameToList(names, LIST_SIZE, "Eve");
printListToString(names, LIST_SIZE, outputbuffer, 100);
printf("%s\n", outputbuffer); //Output: Alice,Bob,Carla,Dana,Eve
freeList(names,LIST_SIZE);
}
What I have tried so far (not working):
char *printListToString(char **list, size_t listsize, char *destbuffer, size_t buffersize) {
int k = 1;
for(int i = 0; i < listsize; i++) {
if(list != NULL) {
strcpy_s(destbuffer, sizeof list, list);
}
if(list == '\0') {
destbuffer[i] =',';
destbuffer[k] = ' ';
}
k++;
}
return destbuffer;
}
The code above:
k is always one step ahead of i, so it can add a space right after the ',' (which is added at i)
I iterate thru the list and check whether an entry is NULL or not if its not NULL it should copy the name from the list into the destbuffer
Since the names end with \0 I thought I can just add , and a space right after I copied the name
void addNameToList(char **list, size_t listsize, char *name) {
malloc(sizeof name);
if(sizeof list != listsize) {
for(int i = 0; i < listsize; i++) {
if(list[i] == NULL) {
list[i] = name;
}
}
}
}
the code above:
saving memory for the name
check if list is full
if not I add the name at the first place thats null
(Note that I dont have any experience in C, only Python and Java. The "code above" section is what the code meant to do, not what its actually doing)
Let's consider the last snippet
void addNameToList(char **list, size_t listsize, char *name) {
// ^^^^^^^^^^
// 'name' is a POINTER. Not an array, nor a C-string (null-terminated array
// of char). It may point to the first element of one of those, though.
malloc(sizeof name);
// ^^^^^^^^^^^ This returns the size, in bytes, of a POINTER to char.
// Maybe 8, but surely not the number of characters up to the first '\0'
// from the one pointed by 'name'. To get that, you need
// strlen(name)
// Besides, 'malloc' RETURNS a pointer and you NEED it, because the whole
// point of that function call is to allocate enough memory to store
// your object and you better know WHERE this space is. If only to later
// release that resource and avoid any leak.
if(sizeof list != listsize) {
// ^^^^^^^^^^^ Again, it's the size of a pointer to a pointer to char...
// Unfortunately, it's also probably different from listsize, so that
// it's harder to catch this error. Also, you don't need this check...
for(int i = 0; i < listsize; i++) {
// ^^^^^^^^^^^^ ... Because there already is this.
if(list[i] == NULL) {
list[i] = name;
// ^ This is a shallow copy, only the value OF the pointer
// (let's say the memory address) is copied, NOT all the elements
// of the string.
// Here is where the memory should be allocated, something like
// size_t size = strlen(name) + 1; // + 1 for the terminating '\0'
// list[i] = malloc(size);
// if ( list[i] == NULL ) {
/* malloc may fail... */
// }
// Then, the array must be copied. Your teacher mentioned
// 'strncpy'. After that, leave the loop (break or return), otherwise
// every remaining free slot gets filled with the same name.
}
}
}
}
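Putting those comments together, the function could look roughly like the sketch below. It keeps the signature from the assignment and assumes, as the task text says, that a free slot is marked by NULL; silently returning when malloc fails is my own choice, since the void signature leaves no way to report it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Copies name into the first NULL slot of list. The list is left unchanged
 * if there is no free slot or if the allocation fails. */
void addNameToList(char **list, size_t listsize, char *name) {
    assert(list && name);
    for (size_t i = 0; i < listsize; i++) {
        if (list[i] != NULL) continue;   /* slot already taken */
        size_t size = strlen(name) + 1;  /* + 1 for the terminating '\0' */
        char *copy = malloc(size);
        if (copy == NULL) return;        /* malloc failed: give up quietly */
        strncpy(copy, name, size);       /* copies the '\0' as well */
        list[i] = copy;
        return;                          /* stop after the first free slot */
    }
}
```

Because each entry now owns its own heap copy, the freeList function given in the assignment can safely call free on every non-NULL element.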
The other function, printListToString is broken too, perhaps even worse than the previous one.
k is always one step ahead of i, so it can add a space right after the ',' (which is added at i)
To do that, i + 1 could be enough, but that's not the point. There should be two different indices, one to iterate over list and one for destbuffer.
I iterate thru the list and check whether an entry is NULL or not if its not NULL it should copy the name from the list into the destbuffer
Unfortunately, in all its occurrences, list is written without an [i], so that no name in the list is actually accessed.
Since the names end with \0 I thought I can just add , and a space right after I copied the name
I'd do something like
char *printListToString( char **list, size_t listsize
, char *destbuffer, size_t buffersize) {
// Without any room there is nothing that can be written safely.
if ( buffersize == 0 )
return destbuffer;
size_t j = 0;
for (size_t i = 0; i < listsize; ++i) {
if ( list[i] != NULL ) {
// Add ", ", but only if it's not the first name.
if ( j != 0 ) {
// Keep one byte free for the terminating '\0'.
if ( j + 2 >= buffersize )
break;
destbuffer[j++] = ',';
destbuffer[j++] = ' ';
}
// Copy the name. Feel free to use a library function instead.
// The bound j + 1 < buffersize again reserves room for the '\0'.
for ( size_t k = 0; list[i][k] != '\0' && j + 1 < buffersize; ++k, ++j ) {
destbuffer[j] = list[i][k];
}
}
}
destbuffer[j] = '\0';
return destbuffer;
}
I want to store an array of strings read from stdin in a C program, where neither the number of strings nor their length is known in advance (the length may be effectively unlimited). That means I cannot define something like char buf[10][100]; in the program. Is there a good solution for this case?
The C standard doesn't have such a function but getline() which is POSIX does what you want. This may or may not be what you're looking for, depending on what OS you're planning to run this on.
You just do something like:
char *inf_line = NULL;
size_t n = 0;
ssize_t input = getline(&inf_line, &n, stdin);
Alternatively, you could try filling up an array with getchar() in some loop, dynamically reallocating memory with realloc() as you reach the end of the array, for example.
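To also cover the "unknown number of strings" part of the question, getline() can be combined with a pointer array that doubles when it is full. The following is just a sketch, assuming a POSIX system; read_lines is a hypothetical helper name, and stripping the trailing newline is my own choice rather than something the question asks for.

```c
#define _POSIX_C_SOURCE 200809L  /* for getline(); POSIX only */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Reads every remaining line of in into a malloc'd array of malloc'd strings.
 * Stores the number of lines in *count and returns the array (NULL if no
 * line was read). If growing the array fails, the lines read so far are
 * returned. */
char **read_lines(FILE *in, size_t *count) {
    assert(in && count);
    char **lines = NULL;
    size_t used = 0, cap = 0;
    char *line = NULL;
    size_t n = 0;

    while (getline(&line, &n, in) != -1) {
        line[strcspn(line, "\n")] = '\0';  /* drop the trailing newline */
        if (used == cap) {
            size_t new_cap = cap ? cap * 2 : 8;
            char **tmp = realloc(lines, new_cap * sizeof *lines);
            if (!tmp) break;
            lines = tmp;
            cap = new_cap;
        }
        lines[used++] = line;  /* the array now owns this buffer */
        line = NULL;           /* make getline allocate a fresh one */
        n = 0;
    }
    free(line);
    *count = used;
    return lines;
}
```

A caller would use it as char **lines = read_lines(stdin, &count); and later free each lines[i] followed by lines itself.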
See the following code as an example how to read input until EOF is reached (in terminal, try Ctrl-Z or Ctrl-D to emulate an EOF, depending on your OS), by using fixed size chunks and creating a full string after the last chunk was read.
#define CHUNK_SIZE 4 // testing size
//#define CHUNK_SIZE 1024 // my suggested production size
struct node
{
char data[CHUNK_SIZE];
struct node* next;
};
int main()
{
// will be allocated and filled after reading all input
char* full_text = NULL;
// head node
struct node* start = NULL;
// iterator node
struct node* current = NULL;
// for tail allocation
struct node** next = &start;
// count the number of chunks (n-1 full and one partially filled)
size_t count = 0;
// size of the last read - will be the count of characters in the partially filled chunk
size_t last_size;
// will be initialized to the full text size (without trailing '\0' character)
size_t full_size;
while (!feof(stdin))
{
// casting malloc result is bad practice, but working with VS here and it's complaining otherwise
// also, you may want to check the result for NULL values.
*next = (struct node*)calloc(1, sizeof (struct node));
last_size = fread_s((*next)->data, CHUNK_SIZE, 1/* sizeof char */, CHUNK_SIZE, stdin);
next = &((*next)->next);
++count;
}
// calculate the full size and copy each chunk data into the combined text
if (count > 0)
{
full_size = CHUNK_SIZE * (count - 1) + last_size;
// one additional character for the null terminator character
full_text = (char*)malloc(full_size + 1);
full_text[full_size] = '\0';
count = 0;
current = start;
while (current && current->next)
{
memcpy(&full_text[count * CHUNK_SIZE], current->data, CHUNK_SIZE);
current = current->next;
++count;
}
if (current)
{
memcpy(&full_text[count * CHUNK_SIZE], current->data, last_size);
}
}
else
{
full_text = (char*)calloc(1, 1);
}
// full_text now contains all text
// TODO free the node structure
return 0;
}
side note: I use calloc instead of malloc so I get zero-initialized storage.
side note: I use the binary fread_s (MSVC-specific; plain fread is the portable equivalent) instead of fgets. Unlike fgets, it doesn't zero-terminate the data it reads, so the terminator has to be added separately, and reading raw bytes may not play nicely with non-ASCII input. So make sure you understand your input format before using this as-is.
I wrote a program for managing a library. It compiles, but at run time I get an "Allocation Error" in case 2 and I don't know why.
The first case works correctly but if I entered more than one name in the first case, the second case doesn't work.
What did I do wrong? I hope I was clear enough.
typedef struct {
char name[80];
char **books;
int books_num;
} Subscription;
int main() {
// Variables declaration:
int option = 0, subs_num = 0, i = 0, books_num = 0;
Subscription *subs_library;
char **books;
char subs_new_name[80], book_new_name[80];
printf("Choose an option\n");
do {
scanf("%d", &option);
switch (option) {
case 1:
printf("Case 1: enter a new name\n");
scanf("%s", subs_new_name);
if (subs_num == 0) {
subs_library = malloc(sizeof(Subscription));
} else {
subs_library = realloc(subs_library, sizeof(Subscription));
}
strcpy(subs_library[subs_num].name, subs_new_name);
subs_library[subs_num].books_num = 0;
subs_num++;
printf("ADDED\n");
break;
case 2:
printf("Case 2: enter the book name\n");
scanf("%s", book_new_name);
if (books_num == 0) {
books = malloc(sizeof(char*));
books[books_num] = malloc(80 * sizeof(char));
} else {
books = realloc(books, sizeof(char*));
books[books_num] = malloc(80 * sizeof(char));
}
if (books[books_num] == NULL) {
printf("Allocation Error\n");
exit(1);
}
strcpy(books[books_num], book_new_name);
books_num++;
printf("ADDED\n");
break;
}
} while (option != 7);
return 0;
}
Your code to reallocate the arrays is incorrect. You do not allocate enough room for the new array sizes. When you reallocate these arrays, you pass the size of a single element, therefore the array still has a length of 1 instead of subs_num + 1. The size passed to realloc should be the number of elements times the size of a single element in bytes.
Initialize subs_library and books to NULL and change your array reallocations:
if (subs_num == 0) {
subs_library = malloc(sizeof(Subscription));
} else {
subs_library = realloc(subs_library, sizeof(Subscription));
}
Into this:
subs_library = realloc(subs_library, (subs_num + 1) * sizeof(*subs_library));
And do the same for books, change:
if (books_num == 0) {
books = malloc(sizeof(char*));
books[books_num] = malloc(80 * sizeof(char));
} else {
books = realloc(books, sizeof(char*));
books[books_num] = malloc(80 * sizeof(char));
}
To this:
books = realloc(books, (books_num + 1) * sizeof(*books));
books[books_num] = malloc(80 * sizeof(char));
Or simpler:
books = realloc(books, (books_num + 1) * sizeof(*books));
books[books_num] = strdup(book_new_name);
I guess the problem is with scanf reading a string only until a separator, in your case - a whitespace separating multiple names entered. The characters after separator remain in the input buffer and get immediately processed by other calls to scanf.
You should consider using getline for reading name(s) and checking return values from other calls to scanf.
The problem is your reallocation calls. For example you do
realloc(books,sizeof(char*))
This reallocates the memory pointed to by books to be one pointer to character in size, which is exactly what you already have. This will lead you to index out of bounds of the allocated memory, which is undefined behavior.
If you want to allocate more than one element you need to multiply the base type size with the number of elements you want to allocate, e.g.
realloc(books, (books_num + 1) * sizeof(char *))
Your reallocation realloc(books, sizeof(char *)) only allocates the size of one pointer char *, not the size of the enlarged array that you need:
books=realloc(books,sizeof(char*));
You need to multiply the size of a pointer (char *) by the number of books you plan on storing in the array. You maintain the number of books in books_num.
As Joachim Pileborg said, with every allocation/reallocation, you want this to be one more than the current size. For the first allocation (malloc()), you want to allocate for one book, which is 1 times sizeof(char *). This happens to be equivalent to your existing code, which is fine. But the reallocation (realloc()) reallocates for the same size every time (only enough for one pointer), so you're not enlarging the allocation. You need to multiply the size required for one pointer (sizeof(char *)) by the number of pointers you want, which is books_num + 1. As in Joachim's answer, this is
books = realloc(books, (books_num + 1)*sizeof(char *));
This will enlarge the allocation of the array books by one more pointer. Then, on the next line you correctly allocate a string of size 80.
Your subs_library has the same reallocation issue.
Less frequent reallocation
You might want to resize an allocation less frequently. In this situation, you are reallocating every time you add an entry. One simple technique to reduce the number of reallocations is to double the allocation size every time it gets full. But you have to maintain the allocation size (capacity) and check for it whenever you add something. For example:
char **buffer; /* buffer of pointers to char */
int capacity = 1; /* number of elements allocated for */
int size = 0; /* number of elements actually used */
Then the initial allocation is
/* Initial allocation */
buffer = malloc(capacity*sizeof(*buffer));
And to add some char *new_item to the buffer
/* When adding an element */
if ( size == capacity ) {
/* Double allocation every time */
capacity *= 2;
/* Reallocate the buffer to new capacity */
buffer = realloc(buffer, capacity*sizeof(*buffer));
}
/* Item will fit, add to buffer */
buffer[size++] = new_item;
Notice that I have used sizeof(*buffer) instead of sizeof(char *). This makes the compiler figure out what the type and size is. That way, if I change the type of buffer for some reason, I don't have to change more places in the code. Another thing that I left out for brevity is that you should always check the return values to make sure they are not NULL.
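For completeness, here is how the same doubling scheme might look with the return-value check included. This is only a sketch: the StrVec type and the strvec_push function are names I made up for illustration, and the result of realloc goes into a temporary so the old block is not lost if the call fails.

```c
#include <assert.h>
#include <stdlib.h>

/* A growable array of string pointers (hypothetical type for illustration). */
typedef struct {
    char **items;
    size_t size;      /* number of elements actually used */
    size_t capacity;  /* number of elements allocated for */
} StrVec;

/* Appends item, doubling the capacity whenever the array is full.
 * Returns 0 on success, -1 if realloc fails (the vector is left unchanged). */
int strvec_push(StrVec *v, char *item) {
    assert(v);
    if (v->size == v->capacity) {
        size_t new_cap = v->capacity ? v->capacity * 2 : 1;
        char **tmp = realloc(v->items, new_cap * sizeof *v->items);
        if (tmp == NULL) return -1;  /* v->items is still valid here */
        v->items = tmp;
        v->capacity = new_cap;
    }
    v->items[v->size++] = item;
    return 0;
}
```

Starting from StrVec v = {0};, the first push allocates room for one element, and later pushes grow the capacity to 2, 4, 8 and so on. The vector stores the pointers only; it does not copy the strings.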
I have a file that stores a sequence of integers. The total number of integers is unknown, so I keep calling malloc() to request new memory each time I read an integer from the file.
I don't know whether I can keep asking for memory and adding the values at the end of the array. Xcode keeps reporting 'EXC_BAD_ACCESS' on the malloc() line.
How can I do this if I keep reading integers from a file?
int main()
{
//1.read from file
int *a = NULL;
int size=0;
//char ch;
FILE *in;
//open file
if ( (in=fopen("/Users/NUO/Desktop/in.text","r")) == NULL){
printf("cannot open input file\n");
exit(0); //if file open fail, stop the program
}
while( ! feof(in) ){
a = (int *)malloc(sizeof(int));
fscanf(in,"%d", &a[size] );;
printf("a[i]=%d\n",a[size]);
size++;
}
fclose(in);
return 0;
}
Calling malloc() repeatedly like that doesn't do what you think it does. Each time malloc(sizeof(int)) is called, it allocates a separate, new block of memory that's only large enough for one integer. Writing to a[size] ends up writing off the end of that array for every value past the first one.
What you want here is the realloc() function, e.g.
a = realloc(a, sizeof(int) * (size + 1));
if (a == NULL) { ... handle error ... }
Reworking your code such that size is actually the size of the array, rather than its last index, would help simplify this code, but that's neither here nor there.
Instead of using malloc, use realloc.
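Don't use feof(in) as the condition of a while loop. It only becomes true after a read has already hit the end of the file, so the loop body runs one extra time with no valid data. Check the return value of fscanf instead: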
int number;
while( fscanf(in, "%d", &number) == 1 ){
a = realloc(a, sizeof(int)*(size+1));
if ( a == NULL )
{
// Problem.
exit(0);
}
a[size] = number;
printf("a[i]=%d\n", a[size]);
size++;
}
Your malloc() is overwriting your previous storage with just enough space for a single integer!
a = (int *)malloc(sizeof(int));
^^^ assignment overwrites what you have stored!
Instead, realloc() the array:
a = realloc(a, sizeof(int)*(size+1));
You haven't allocated an array of integers, you've allocated one integer here. So you'll need to allocate a default array size, then resize if you're about to overrun it. The code below doubles the capacity each time the array is full. Doubling might not be in your best interest; you could also reallocate for each additional element instead.
size_t size = 0;
size_t current_size = 2;
a = (int *)malloc(sizeof(int) * current_size);
if(!a)
handle_error();
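int number;
/* Loop on the result of fscanf rather than feof, which only turns true after a failed read */
while( fscanf(in, "%d", &number) == 1 ){
if(size >= current_size) {
current_size *= 2;
a = (int *)realloc(a, sizeof(int) * current_size);
if(!a)
handle_error();
}
a[size] = number;
printf("a[i]=%d\n",a[size]);
size++;
}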
The usual approach is to allocate some amount of space at first (large enough to cover most of your cases), then double it as necessary, using the realloc function.
An example:
#define INITIAL_ALLOCATED 32 // or enough to cover most cases
...
size_t allocated = INITIAL_ALLOCATED;
size_t size = 0;
...
int *a = malloc( sizeof *a * allocated );
if ( !a )
{
// panic
}
int val;
while ( fscanf( in, "%d", &val ) == 1 )
{
if ( size == allocated )
{
int *tmp = realloc( a, sizeof *a * allocated * 2 ); // double the size of the buffer
if ( tmp )
{
a = tmp;
allocated *= 2;
}
else
{
// realloc failed - you can treat this as a fatal error, or you
// can give the user a choice to continue with the data that's
// been read so far. Either way, stop storing new values.
break;
}
}
// there is room now, whether or not the buffer had to grow
a[size++] = val;
}
We start by allocating 32 elements to a. Then we read a value from the file. If we're not at the end of the array (size is not equal to allocated), we add that value to the end of the array. If we are at the end of the array, we then double the size of it using realloc. If the realloc call succeeds, we update the allocated variable to keep track of the new size and add the value to the array. We keep going until we reach the end of the input file.
Doubling the size of the array each time we reach the limit reduces the total number of realloc calls, which can save performance if you're loading a lot of values.
Note that I assigned the result of realloc to a different variable tmp. realloc will return NULL if it cannot extend the array for any reason. If we assign that NULL value to a, we lose our reference to the memory that was allocated before, causing a memory leak.
Note also that we check the result of fscanf instead of calling feof, since feof won't return true until after we've already tried to read past the end of the file.
I have an array that keeps getting overwritten with new values. For example, this is the output:
instruc[0] = PRINTNUM
instruc[1] = PRINTNUM
instruc[2] = PRINTNUM
where PRINTNUM is supposed to be the last thing in the array and the first two elements are supposed to be something else.
Here is my code for the specific segment:
//array of instructions
char** instruc = malloc(numLines * 200);
c = fgets(inputString, 200, in_file);
while (c != NULL){
instruc[i]=inputString;
i++;
c = fgets(inputString, 200, in_file);
}
//print out what's in the array
i=0;
for (i=0; i<numLines; i++){
printf("instruc[%d] = %s\n", i, instruc[i]);
}
Thanks in advance!
No separate memory is allocated for each line read from the file. Memory is allocated for instruc, but instruc is of type char **, so that block only holds pointers; every element ends up pointing at the single inputString buffer, which fgets overwrites on each call.
If you wish to store pointers to all the records in the file, memory must be allocated not only for the data read from the file but also for the pointers to the start of each record.
In this code each record is read into a fixed 200-byte slot. The array of pointers that a char ** variable would point to is not strictly necessary: the start of the next record can be found simply by adding 200 bytes to the char * pointer to the start of the data read from the file.
To store the pointers to the start of each record, something like this would be required:
char **fileRecPtrs = malloc(sizeof(char *) * numLines);
char *instruc = malloc(200 * numLines);
Then in the loop to read each record of the file:
c = fgets(instruc, 200, in_file);
while (c != NULL) {
fileRecPtrs[i] = instruc;
instruc += 200;
i++;
c = fgets(instruc, 200, in_file);
}
You're pointing every index in the array to the same memory address (inputString). I think you'd be better off allocating a buffer for each line and copying into it with strcpy.
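As a rough illustration of that suggestion, the loop could copy each line into its own allocation. This is a sketch only: load_instructions and LINE_LEN are hypothetical names, and it assumes instruc already has room for numLines pointers (for example from malloc(numLines * sizeof *instruc)).

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define LINE_LEN 200  /* same buffer size as in the question */

/* Reads up to numLines lines from in_file and gives each one its own heap
 * copy. Returns the number of lines actually stored in instruc. */
size_t load_instructions(FILE *in_file, char **instruc, size_t numLines) {
    assert(in_file && instruc);
    char inputString[LINE_LEN];
    size_t i = 0;
    while (i < numLines && fgets(inputString, sizeof inputString, in_file)) {
        char *copy = malloc(strlen(inputString) + 1);  /* + 1 for '\0' */
        if (copy == NULL) break;
        strcpy(copy, inputString);  /* the copy survives the next fgets */
        instruc[i++] = copy;
    }
    return i;
}
```

Each stored line keeps its trailing newline, just as fgets delivers it, and every instruc[i] has to be released with free once it is no longer needed.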