I'm having trouble creating a dynamic array of strings in C. I'm not getting the expected results and I want to know why.
The readLine() function reads each line separately and makes some changes if necessary:
char *readLine(FILE *f, size_t *len)
{
char *line = NULL;
ssize_t nread;
if (f == NULL)
{
return NULL;
}
if ((nread = getline(&line, len, f)) != -1)
{
if (line[nread - 1] == '\n')
{
line[strlen(line)-1] = '\0';
*len = strlen(line);
}
return line;
}
else
{
return NULL;
}
}
readFile() function will return an array of strings after reading all of the lines using readLine and then storing them into an array of strings :
char **readFile(const char *filename, size_t *fileLen)
{
char *result;
int idx = 0;
char **array = calloc(1, sizeof(char*) );
if (filename == NULL || fileLen == NULL)
{
return NULL;
}
FILE *f = fopen(filename, "r");
if (f == NULL)
{
return NULL;
}
while (1)
{
result = readLine(f, fileLen);
if (result == NULL)
break;
else
{
*(array + idx) = malloc(LENGTH * sizeof(char *));
strncpy(array[idx], result, strlen(result) + 1);
idx++;
array = realloc(array, (idx + 1) * sizeof(char *));
}
}
return array;
}
In main I created a temporary file to test my functions but it didn't work properly :
int main()
{
char filename[] = "/tmp/prefXXXXXX";
int fd;
size_t len = 0;
FILE *f;
if (-1 == (fd = mkstemp(filename)))
perror("internal error: mkstemp");
if (NULL == (f = fdopen(fd, "w")))
perror("internal error: fdopen");
for (int i = 0; i < 10000; i++)
fprintf(f, "%d\n", i);
fclose(f);
char **number = readFile(filename, &len);
for (int i = 0; i < sizeof(number) / sizeof(number[0]); i++)
printf("number[%i] = %s\n", i, number[i]);
return 0;
}
When I execute the program, I get the following output:
number[0] = 0
What am I doing wrong here ?
There are lots of issues in that code, it's difficult to find where to start...
Let's look at each function.
char *readLine(FILE *f, size_t *len)
{
char *line = NULL;
ssize_t nread;
if (f == NULL)
{
return NULL;
}
if ((nread = getline(&line, len, f)) != -1)
{
if (line[nread - 1] == '\n')
{
line[strlen(line)-1] = '\0';
*len = strlen(line);
}
return line;
}
else
{
return NULL;
}
}
There is not much wrong here. But the man page for getline tells us:
If *lineptr is set to NULL before the call, then getline() will
allocate a buffer for storing the line. This buffer should be
freed by the user program even if getline() failed.
You do not free the buffer if nread == -1; you only return NULL, which may leak memory.
You should also check whether len == NULL, as you already do with f.
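The two points above can be sketched as follows. This is only a minimal sketch, assuming a POSIX system that provides getline(). The separate cap variable is my own addition (not in the original code), so that *len always ends up holding the string length rather than getline's buffer size.

```c
#define _POSIX_C_SOURCE 200809L /* for getline() and ssize_t */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Reads one line from f, strips a trailing '\n' and stores the
 * resulting string length in *len. Returns a malloc'ed string that
 * the caller must free(), or NULL on EOF, error or bad arguments. */
char *readLine(FILE *f, size_t *len)
{
    char *line = NULL;
    size_t cap = 0;   /* buffer capacity, managed by getline() (assumption) */
    ssize_t nread;

    if (f == NULL || len == NULL)
        return NULL;

    nread = getline(&line, &cap, f);
    if (nread == -1) {
        free(line);   /* getline() may allocate even when it fails */
        return NULL;
    }
    if (nread > 0 && line[nread - 1] == '\n')
        line[--nread] = '\0';
    *len = (size_t)nread;
    return line;
}
```

Using nread instead of calling strlen() twice also avoids rescanning the line.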
Then look at the next function:
char **readFile(const char *filename, size_t *fileLen)
{
char *result;
int idx = 0;
char **array = calloc(1, sizeof(char*) );
if (filename == NULL || fileLen == NULL)
{
return NULL;
}
FILE *f = fopen(filename, "r");
if (f == NULL)
{
return NULL;
}
while (1)
{
result = readLine(f, fileLen);
if (result == NULL)
break;
else
{
*(array + idx) = malloc(LENGTH * sizeof(char *));
strncpy(array[idx], result, strlen(result) + 1);
idx++;
array = realloc(array, (idx + 1) * sizeof(char *));
}
}
return array;
}
In this function you fail to free(array) when you leave through one of the return NULL; exits.
readLine puts strlen(result) into fileLen. Why don't you use it to allocate memory? Instead you take some unknown fixed length LENGTH that may or may not be sufficient to hold the string. You should use *fileLen + 1 or strlen(result) + 1, as you do with strncpy.
You are also using the size of the wrong type. The memory you allocate holds char elements, not char * elements. As the size of char is defined to be 1, you can just drop the size part here.
Then, the length parameter for strncpy should hold the length of the destination, not the source. Otherwise it is completely useless to use strncpy at all.
As you (should) already use the string length to allocate the memory, just use strcpy.
Then, passing fileLen on to readLine does not make sense. In readLine it means the length of a line, while in readFile that meaning would not make any sense; there it should mean the number of lines. And since we are on the topic: you should pass that value back to the caller.
Finally, you should not assign the return value of realloc directly to the variable you passed into it. In case of an error, NULL is returned and you cannot access or free the old pointer any longer.
This block should look like this:
{
array[idx] = malloc(*fileLen + 1);
strcpy(array[idx], result);
idx++;
void *temp = realloc(array, (idx + 1) * sizeof(char *));
if (temp != NULL)
array = temp;
// TODO: else <error handling>
}
}
*fileLen = idx;
return array;
}
This still has the flaw that you have allocated memory for one more pointer that you do not use. You can change this as further optimization.
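For reference, here is one way the whole function could look once the points above are combined, including growing the array before storing so that no spare slot is left over. It is a sketch under my own assumptions, not the one true fix: the name readAllLines is hypothetical, it takes an already opened FILE * instead of a file name, and it calls getline() directly (POSIX) rather than going through readLine.

```c
#define _POSIX_C_SOURCE 200809L /* for getline() and ssize_t */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Reads every line of f into a newly allocated array of strings.
 * On success stores the number of lines in *count and returns the array.
 * On failure frees everything, leaves *count at 0 and returns NULL. */
char **readAllLines(FILE *f, size_t *count)
{
    char **array = NULL;
    char *line = NULL;   /* reused by getline() across iterations */
    size_t cap = 0;
    size_t n = 0;
    ssize_t nread;

    if (f == NULL || count == NULL)
        return NULL;
    *count = 0;

    while ((nread = getline(&line, &cap, f)) != -1) {
        if (nread > 0 && line[nread - 1] == '\n')
            line[--nread] = '\0';

        /* Grow through a temporary so the old block is not lost on failure. */
        char **tmp = realloc(array, (n + 1) * sizeof *tmp);
        char *copy = malloc((size_t)nread + 1);
        if (tmp != NULL)
            array = tmp;
        if (tmp == NULL || copy == NULL) {
            free(copy);
            while (n > 0)
                free(array[--n]);
            free(array);
            free(line);
            return NULL;
        }
        memcpy(copy, line, (size_t)nread + 1); /* includes the '\0' */
        array[n++] = copy;
    }
    free(line);
    *count = n;
    return array;
}
```

Note that an empty file also yields NULL with *count == 0, so a caller that has to tell "empty" from "failed" would need an extra status value.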
Lastly the main function:
int main()
{
char filename[] = "/tmp/prefXXXXXX";
int fd;
size_t len = 0;
FILE *f;
if (-1 == (fd = mkstemp(filename)))
perror("internal error: mkstemp");
if (NULL == (f = fdopen(fd, "w")))
perror("internal error: fdopen");
for (int i = 0; i < 10000; i++)
fprintf(f, "%d\n", i);
fclose(f);
char **number = readFile(filename, &len);
for (int i = 0; i < sizeof(number) / sizeof(number[0]); i++)
printf("number[%i] = %s\n", i, number[i]);
return 0;
}
char **number = readFile(filename, &len); You get an array holding all the lines of a file. number is a very poor name for this.
You return NULL from readFile in case of an error. You should check for that after calling.
Then you forgot that arrays are not pointers and pointers are not arrays. They behave similarly in many places but are very different at the same time.
i < sizeof(number) / sizeof(number[0])
Here number is a pointer and its size is the size of a pointer. Also number[0] is a pointer again. Different type, but same size. So the division yields 1 and the loop runs exactly once.
What you want is the number of lines which you get from readFile. Use that variable.
This part should look like this:
char **all_lines = readFile(filename, &len);
if (all_lines != NULL)
{
for (int i = 0; i < len; i++)
printf("all_lines[%i] = %s\n", i, all_lines[i]);
And you should not forget that you have allocated a lot of memory which you should also free.
(This might not strictly be necessary when you terminate your program, but you should keep in mind to clean up behind you)
if (all_lines != NULL)
{
for (int i = 0; i < len; i++)
printf("all_lines[%i] = %s\n", i, all_lines[i]);
for (int i = 0; i < len; i++)
free(all_lines[i]);
free(all_lines);
}
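To make the sizeof point concrete, here is a small sketch. The function names are made up for illustration. The second function is written with two separate variables only so the intent stays visible; on common platforms, where all object pointers have the same size, it returns 1 no matter how many strings the pointer refers to.

```c
#include <stddef.h>

/* With a real array the compiler knows the total size,
 * so the usual idiom yields the element count. */
size_t countFromArray(void)
{
    const char *lines[4] = { "a", "b", "c", "d" };
    return sizeof(lines) / sizeof(lines[0]);   /* 4 */
}

/* With a pointer, sizeof only sees the pointer itself. */
size_t countFromPointer(char **lines)
{
    size_t whole = sizeof(lines);     /* size of a char ** */
    size_t one = sizeof(lines[0]);    /* size of a char *  */
    return whole / one;               /* typically 1 */
}
```

This is why the element count has to travel alongside the pointer, for example through the size_t * parameter of readFile.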
I'm trying to read the following file line by line into an array of strings where each line is an element of the array:
AATGC
ATGCC
GCCGT
CGTAC
GTACG
TACGT
ACGTA
CGTAC
GTACG
TACGA
ACGAA
My code is as follows:
void **get_genome(char *filename) {
FILE *file = fopen(filename, "r");
int c;
int line_count = 0;
int line_length = 0;
for (c = getc(file); c != EOF; c = getc(file)) {
if (c == '\n') line_count++;
else line_length++;
}
line_length /= line_count;
rewind(file);
char **genome = calloc(line_length * line_count, sizeof(char));
for (int i = 0; i < line_count; i++) {
genome[i] = calloc(line_length, sizeof(char));
fscanf(file, "%s\n", genome[i]);
}
printf("%d lines of %d length\n", line_count, line_length);
for (int i = 0; i < line_count; i++)
printf("%s\n", genome[i]);
}
However, for some reason I get garbage output for the first 2 elements of the array. The following is my output:
`NP��
�NP��
GCCGT
CGTAC
GTACG
TACGT
ACGTA
CGTAC
GTACG
TACGA
ACGAA
You seem to assume that all lines have the same line length. If such is the case, you still have some problems:
the memory for the row pointers is allocated incorrectly, it should be
char **genome = calloc(line_count, sizeof(char *));
or better and less error prone:
char **genome = calloc(line_count, sizeof(*genome));
the memory for each row should be one byte longer, for the null terminator.
the \n in the fscanf() format string matches any sequence of whitespace characters. It is redundant, as %s skips leading whitespace anyway.
it is safer to count items separated by whitespace, to avoid miscounting the items if the file contains any blank characters.
you do not close file.
you do not return genome at the end of the function.
you do not check for errors.
Here is a modified version:
char **get_genome(const char *filename) {
FILE *file = fopen(filename, "r");
if (file == NULL)
return NULL;
int line_count = 1;
int item_count = 0;
int item_length = -1;
int length = 0;
int c;
while ((c = getc(file)) != EOF) {
if (isspace(c)) {
if (length == 0)
continue; // ignore subsequent whitespace
item_count++;
if (item_length < 0) {
item_length = length;
} else
if (item_length != length) {
printf("inconsistent item length on line %d\n", line_count);
fclose(file);
return NULL;
}
length = 0;
} else {
length++;
}
}
if (length) {
printf("line %d truncated\n", line_count);
fclose(file);
return NULL;
}
rewind(file);
char **genome = calloc(item_count, sizeof(*genome));
if (genome == NULL) {
printf("out of memory\n");
fclose(file);
return NULL;
}
for (int i = 0; i < item_count; i++) {
genome[i] = calloc(item_length + 1, sizeof(*genome[i]));
if (genome[i] == NULL) {
while (i-- > 0) { // free the rows allocated so far
free(genome[i]);
}
free(genome);
printf("out of memory\n");
fclose(file);
return NULL;
}
fscanf(file, "%s", genome[i]);
}
fclose(file);
printf("%d items of %d length on %d lines\n",
item_count, item_length, line_count);
for (int i = 0; i < item_count; i++)
printf("%s\n", genome[i]);
return genome;
}
char **genome = calloc(line_length * line_count, sizeof(char));
must be
char **genome = calloc(line_count, sizeof(char*));
or, more robustly,
char **genome = calloc(line_count, sizeof(*genome));
in case you change the type of genome.
Otherwise the allocated block is not large enough on a 64-bit system, because line_length is 5 while a pointer takes 8 bytes, so you write past its end, which is undefined behavior.
You also need to return genome at the end of the function.
It would also be possible not to count the lines in advance and to use realloc to grow the array while reading the file.
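To put numbers on it: with 11 lines of 5 characters the original call reserves 5 * 11 = 55 bytes, while 11 pointers need 11 * 8 = 88 bytes on a typical 64-bit system. A minimal sketch of the corrected allocation pattern could look like this. alloc_rows and free_rows are hypothetical helper names, not something from the question.

```c
#include <stdlib.h>

/* Frees an array of rows allocated by alloc_rows(). */
void free_rows(char **rows, size_t row_count)
{
    if (rows == NULL)
        return;
    for (size_t i = 0; i < row_count; i++)
        free(rows[i]);
    free(rows);
}

/* Allocates row_count zero-filled strings, each able to hold
 * row_length characters plus the terminating null byte.
 * Returns NULL on allocation failure. */
char **alloc_rows(size_t row_count, size_t row_length)
{
    char **rows = calloc(row_count, sizeof *rows); /* one pointer per row */
    if (rows == NULL)
        return NULL;
    for (size_t i = 0; i < row_count; i++) {
        rows[i] = calloc(row_length + 1, sizeof *rows[i]);
        if (rows[i] == NULL) {
            free_rows(rows, i); /* frees rows[0..i-1] and the array */
            return NULL;
        }
    }
    return rows;
}
```

Writing sizeof *rows instead of naming the type keeps the size right even if the element type changes later.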
As far as I can see, the lines all have the same length. Your function should inform the caller how many lines have been read. There is no need to read the whole file twice. There is no need for calloc (which is a more expensive function). Always check the result of the memory allocation functions.
Here is a slightly different version of the function:
char **get_genome(char *filename, size_t *line_count) {
FILE *file = fopen(filename, "r");
int c;
size_t line_length = 0;
char **genome = NULL, **tmp;
*line_count = 0;
if(file)
{
while(1)
{
c = getc(file);
if( c == EOF || c == '\n') break;
line_length++;
}
rewind(file);
while(1)
{
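char *line = malloc(line_length + 2); // room for '\n' and the terminator
if(!line) break; // out of memory: stop reading
if(!fgets(line, line_length + 2, file))
{
free(line);
break;
}
line[strcspn(line, "\n")] = 0; // strip the trailing newline, if any (needs <string.h>)
tmp = realloc(genome, (*line_count + 1) * sizeof(*genome));
if(!tmp)
{
free(line); // keep the lines read so far and stop
break;
}
genome = tmp;
genome[*line_count] = line;
*line_count += 1;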
}
fclose(file);
}
return genome;
}
So I'm trying to create a function that takes in a text file, which contains a bunch of words separated by the newline character, and reads the text file into a char** array.
When I run this code in netbeans on windows, it works fine but if I run it in Linux, I get a segmentation fault error.
// globals
FILE *words_file;
char **dic;
int num_words = 0;
void read_to_array() {
words_file = fopen("words.txt", "r");
char *line = NULL;
int i = 0;
size_t len = 0;
dic = (char **)malloc(99999 * sizeof(char *));
// read dic to array
while (getline(&line, &len, words_file) != -1) {
dic[i] = (char*)malloc(len);
strcpy(dic[i], line);
// get rid of \n after word
if (dic[i][strlen(dic[i]) - 1] == '\n') {
dic[i][strlen(dic[i]) - 1] = '\0';
}
++i;
num_words++;
}
//printf("%s", dic[i][strlen(dic[i]) - 1]); //testing
fclose(words_file);
dic[i] = NULL;
}
What am I missing here?
There are some problems in your program that may cause the undefined behavior that you observe:
You do not test if the file was open successfully, causing undefined behavior if the file is not where you expect it or has a different name.
You do not limit the number of lines read into the array, causing undefined behavior if the file contains more than 99998 lines, which may well be the case on Linux: /usr/share/dict/words has 139716 lines on my system, for example.
Your memory allocation scheme is suboptimal but correct: you should compute the length of the word and strip the newline before allocating the copy. As coded, you allocate too much memory. Yet you should free line before returning from read_to_array and you should avoid using global variables.
Here is a modified version:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char **read_to_array(const char *filename, int *countp) {
FILE *words_file;
char *line = NULL;
size_t line_size = 0;
char **dic = NULL;
int dic_size = 0;
int i = 0;
words_file = fopen(filename, "r");
if (words_file == NULL) {
fprintf(stderr, "cannot open dictionary file %s\n", filename);
return NULL;
}
dic_size = 99999;
dic = malloc(dic_size * sizeof(char *));
if (dic == NULL) {
fprintf(stderr, "cannot allocate dictionary array\n");
fclose(words_file);
return NULL;
}
// read dic to array
while (getline(&line, &line_size, words_file) != -1) {
size_t len = strlen(line);
/* strip the newline if any */
if (len > 0 && line[len - 1] == '\n') {
line[--len] = '\0';
}
if (i >= dic_size - 1) {
/* too many lines: should reallocate the dictionary */
fprintf(stderr, "too many lines\n");
break;
}
dic[i] = malloc(len + 1);
if (dic[i] == NULL) {
/* out of memory: report the error */
fprintf(stderr, "cannot allocate memory for line %d\n", i);
break;
}
strcpy(dic[i], line);
i++;
}
dic[i] = NULL;
*countp = i;
fclose(words_file);
free(line);
return dic;
}
int main(int argc, char **argv) {
const char *filename = (argc > 1) ? argv[1] : "words.txt";
int num_words;
char **dic = read_to_array(filename, &num_words);
if (dic != NULL) {
printf("dictionary loaded: %d lines\n", num_words);
while (num_words > 0)
free(dic[--num_words]);
free(dic);
}
return 0;
}
Output:
chqrlie> readdic /usr/share/dict/words
too many lines
dictionary loaded: 99998 lines
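The "should reallocate the dictionary" comment in the code above could be fleshed out roughly like this. It is only a sketch: struct WordList and the two functions are hypothetical names introduced for illustration, and doubling the capacity is just one common growth policy.

```c
#include <stdlib.h>
#include <string.h>

struct WordList {        /* hypothetical container, not in the original */
    char **words;
    size_t count;        /* number of words stored */
    size_t capacity;     /* number of slots allocated */
};

/* Appends a copy of word. Returns 0 on success, or -1 on allocation
 * failure, in which case no word is added. */
int wordlist_push(struct WordList *list, const char *word)
{
    if (list->count == list->capacity) {
        size_t new_cap = list->capacity ? list->capacity * 2 : 16;
        char **tmp = realloc(list->words, new_cap * sizeof *tmp);
        if (tmp == NULL)
            return -1;   /* list->words is still valid */
        list->words = tmp;
        list->capacity = new_cap;
    }
    size_t len = strlen(word);
    char *copy = malloc(len + 1);
    if (copy == NULL)
        return -1;
    memcpy(copy, word, len + 1);
    list->words[list->count++] = copy;
    return 0;
}

/* Releases every word and resets the list to the empty state. */
void wordlist_free(struct WordList *list)
{
    for (size_t i = 0; i < list->count; i++)
        free(list->words[i]);
    free(list->words);
    list->words = NULL;
    list->count = list->capacity = 0;
}
```

Start from a zero-initialized struct (struct WordList list = { NULL, 0, 0 };) and call wordlist_push once per stripped line; the fixed limit of 99999 entries then goes away.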
In C, I am trying to implement a function that uses getline() to read all the lines from a file. It is implemented similarly to getline(), specifically in that it uses realloc() to resize a char** if there is not enough memory allocated to store the next pointer to a line. Unfortunately I am getting seg faults during the string duplication process.
After a little poking around, I discovered that the segfault happens during the second iteration while attempting to store the second line in the char pointer array.
ssize_t fgetlines(char*** linesptr, size_t* n, FILE* fp)
{
char* line = NULL;
size_t sz_line = 0;
size_t cur_len = 0;
size_t needed;
if (linesptr == NULL || n == NULL) {
errno = EINVAL;
return -1;
}
if (*linesptr == NULL) {
if (*n == 0)
*n = sizeof(**linesptr) * 30; /* assume 30 lines */
*linesptr = malloc(*n);
if (*linesptr == NULL) {
*n = 0;
return -1;
}
}
while (getline(&line, &sz_line, fp) > 0) {
needed = (cur_len + 1) * sizeof(**linesptr);
while (needed > *n) {
char** new_linesptr;
*n *= 2;
new_linesptr = realloc(*linesptr, *n);
if (new_linesptr == NULL) {
*n /= 2;
free(line);
return -1;
}
*linesptr = new_linesptr;
}
*linesptr[cur_len] = strdup(line);
printf("%s", *linesptr[cur_len]);
if (*linesptr[cur_len] == NULL) {
free(line);
free(*linesptr);
return -1;
}
++cur_len;
}
free(line);
return cur_len;
}
And I call the function like so:
char **settings = NULL;
size_t sz_settings = sizeof(*settings) * 6;
int count = fgetlines(&settings, &sz_settings, f_cfg);
Due to the function not being able to successfully complete I do not get any output. But after printing back the string after strdup() I managed to get one line of f_cfg, "Hello World" before a seg fault.
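You should change
*linesptr[cur_len] => (*linesptr)[cur_len]
because [] binds more tightly than unary *, so *linesptr[cur_len] is parsed as *(linesptr[cur_len]).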
The modified function is as follows:
ssize_t fgetlines(char *** linesptr, size_t *n, FILE *fp)
{
char *line = NULL;
size_t sz_line = 0;
size_t cur_len = 0;
size_t needed;
if (linesptr == NULL || n == NULL) {
errno = EINVAL;
return -1;
}
if (*linesptr == NULL) {
if (*n == 0)
*n = sizeof(**linesptr) * 30; /* assume 30 lines */
*linesptr = malloc(*n);
if (*linesptr == NULL) {
*n = 0;
return -1;
}
}
while (getline(&line, &sz_line, fp) > 0) {
needed = (cur_len + 1) * sizeof(**linesptr);
while (needed > *n) {
char **new_linesptr;
*n *= 2;
new_linesptr = realloc(*linesptr, *n);
if (new_linesptr == NULL) {
*n /= 2;
free(line);
return -1; // Possible memory leak
}
*linesptr = new_linesptr;
}
(*linesptr)[cur_len] = strdup(line);
if ((*linesptr)[cur_len] == NULL) {
free(line);
free(*linesptr);
return -1; // Possible memory leak
}
printf("%s", (*linesptr)[cur_len]); // debug output, only once the copy is known to be valid
++cur_len;
}
free(line);
return cur_len;
}
In addition, when a memory allocation fails, the strings already returned by strdup are not freed, which leaks memory.
As chux pointed out, the operator precedence here was not what was intended. References to *linesptr[cur_len] must be changed to (*linesptr)[cur_len]. Also the code hole involving *n == 0 and *n *= 2 has been fixed.
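The precedence issue can be isolated in a tiny helper. store_line is a hypothetical name used only for this sketch.

```c
#include <stddef.h>

/* Stores s in slot index of the array that *linesptr points to. */
void store_line(char ***linesptr, size_t index, char *s)
{
    /* [] binds tighter than unary *, so *linesptr[index] would mean
     * *(linesptr[index]): it indexes the char *** itself, which is only
     * valid for index 0. The parentheses dereference first, then index. */
    (*linesptr)[index] = s;
}
```

This also explains the symptom in the question: with cur_len == 0 both spellings touch the same element, so the first line works and the crash only shows up on the second iteration.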
I've debugged the heck out of this and cannot figure why my fgets is not working. Before I changed my code such that it dynamically resizes arrays, fgets works perfectly well. As I am a beginner in C, this problem has baffled me for quite a long time.
Here is the faulty code:
int readNumbers(int **array, char* fname, int hexFlag) {
int numberRead = 0;
FILE* fp;
int counter = 0;
char arr[100];
char* ptr;
size_t curSize = 16;
int radix = hexFlag ? 16 : 10;
*array = malloc(0 * sizeof(*array));
fp = fopen(fname, "r");
if (fp == NULL) {
printf("Error opening file\n");
return -1;
}
while (fgets(arr, sizeof(arr), fp)) { //Seg faults here when it reaches end of file.
ptr = strtok(arr, " \n");
while(ptr) {
if (counter >= curSize) {
curSize += 16;
array = realloc(*array, curSize * sizeof(**array));
}
(*array)[counter++] = strtol(ptr, NULL, radix);
++numberRead;
ptr = strtok(NULL , " \n");
}
}
if (ferror(fp)) {
fclose(fp);
return -1;
}
Here is the working code before the changes to make the array resize:
int readNumbers(int array[], char* fname, int hexFlag) {
int numberRead = 0;
FILE* fp;
int counter = 0;
char arr[100];
char* ptr;
fp = fopen(fname, "r");
if (fp == NULL) {
printf("Error opening file\n");
return -1;
}
while (fgets(arr, sizeof(arr), fp)) {
ptr = strtok(arr, " \n");
while(ptr) {
if (hexFlag == 0) {
array[counter++] = strtol(ptr , NULL , 10);
} else {
array[counter++] = strtol(ptr, NULL, 16);
}
++numberRead;
ptr = strtok(NULL , " \n");
}
}
if (ferror(fp)) {
fclose(fp);
return -1;
}
The newly added changes seg faults when the end of file is reached. I strongly suspect that this has to do with the double pointers. Any help is strongly appreciated!
I didn't go through the whole code, but in *array = malloc(0 * sizeof(*array)) the malloc call will not allocate any usable memory.
In addition to the problem Amit Sharma pointed out:
You initially allocate your dynamic array using:
*array = malloc(0 * sizeof(*array));
And when you store into the dynamic array, you use:
(*array)[counter++] = strtol(ptr, NULL, radix);
However, your subsequent reallocations use:
array = realloc(*array, curSize * sizeof(**array));
which should likely be:
*array = realloc(*array, curSize * sizeof(**array));
Note that it's OK (unusual, but not unheard of) to use malloc(0), as long as your code is prepared to deal with a NULL pointer return or an allocation that can't be read from or written to.
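One way to sidestep both the malloc(0) question and the wrong assignment is to start from a NULL pointer and let realloc do all the allocating, since realloc(NULL, n) behaves like malloc(n). The sketch below assumes a hypothetical helper appendNumber and keeps the growth step of 16 from the question.

```c
#include <stdlib.h>

/* Appends value to the dynamic array *array, growing it by 16 elements
 * whenever it is full. *count is the number of elements in use and
 * *capacity the number allocated; start with NULL, 0 and 0.
 * Returns 0 on success, -1 if the array could not be grown. */
int appendNumber(int **array, size_t *count, size_t *capacity, int value)
{
    if (*count >= *capacity) {
        size_t new_cap = *capacity + 16;
        int *tmp = realloc(*array, new_cap * sizeof **array);
        if (tmp == NULL)
            return -1;      /* *array is still valid and unchanged */
        *array = tmp;       /* assign to *array, not to array */
        *capacity = new_cap;
    }
    (*array)[(*count)++] = value;
    return 0;
}
```

Inside readNumbers the inner loop would then reduce to a call such as appendNumber(array, &counter, &curSize, strtol(ptr, NULL, radix)) with counter and curSize both declared as size_t and starting at 0.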
This is an assignment for my CS course.
I'm trying to write code that reads a file line by line and puts the input into a struct element. The struct looks like this:
typedef char* Name;
struct Room
{
int fStatus;
Name fGuest;
};
The status is 0 for available and 1 for booked. The name will be empty if the room is available.
There are 2 functions: one to read the values and put them into a struct element, and another one to print them out.
int openRoomFile()
{
FILE *roomFile;
char *buffer = NULL;
size_t length = 0;
size_t count = 0;
roomFile = fopen("roomstatus.txt", "r+");
if (roomFile == NULL)
return 1;
while (getline(&buffer, &length, roomFile) != -1) {
if (count % 2 == 0) {
sscanf(buffer, "%d", &AllRooms[count].fStatus);
} else {
AllRooms[count].fGuest = buffer;
}
count++;
}
fclose(roomFile);
free(buffer);
return 0;
}
The print function:
void printLayout(const struct Room rooms[])
{
for (int i=0; i<3; i++) {
printf("%3d \t", rooms[i].fStatus);
puts(rooms[i].fGuest);
}
}
The output is not what I expected. Given this input file:
1
Johnson
0
1
Emilda
I get this output:
1 (null)
0
0 (null)
I don't know what went wrong. Am I reading the file the right way? All the code is adapted from different sources on the internet.
Here is a fixed version of openRoomFile():
int openRoomFile(void)
{
FILE *roomFile;
char *buffer = NULL;
size_t length = 0;
size_t count = 0;
roomFile = fopen("roomstatus.txt", "r+");
if (roomFile == NULL)
return 1;
while (1) {
buffer = NULL;
if (getline(&buffer, &length, roomFile) == -1) {
free(buffer); // getline may allocate even when it fails
break;
}
sscanf(buffer, "%d", &AllRooms[count].fStatus);
free(buffer);
buffer = NULL;
if (getline(&buffer, &length, roomFile) == -1) {
fprintf(stderr, "syntax error\n");
free(buffer);
fclose(roomFile);
return 1;
}
AllRooms[count].fGuest = buffer;
count++;
}
fclose(roomFile);
return 0;
}
When you no longer need the fGuest strings, you should call free on each of them.
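That cleanup step might look like the sketch below. freeRooms is a hypothetical helper; it assumes every fGuest is either NULL or a buffer obtained from getline (or malloc), as in the fixed function above.

```c
#include <stdlib.h>

typedef char *Name;

struct Room {
    int fStatus;
    Name fGuest;
};

/* Frees the guest name of each of the first roomCount rooms. */
void freeRooms(struct Room rooms[], size_t roomCount)
{
    for (size_t i = 0; i < roomCount; i++) {
        free(rooms[i].fGuest);    /* free(NULL) is a no-op */
        rooms[i].fGuest = NULL;   /* avoid leaving a dangling pointer */
    }
}
```

Resetting the pointer to NULL makes it safe to call the helper twice or to reuse the array afterwards.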
If your input is guaranteed to be valid (as were many of my inputs in my CS classes), I'd use something like this for reading in the file.
while(!feof(ifp)){
fscanf(ifp,"%d%s",&AllRooms[i].fStatus, AllRooms[i].fGuest); //syntax might not be right here
//might need to play with the '&'s
//and maybe make the dots into
//arrows
//do work here
i++;
}
You are not allocating separate memory for each Name: every fGuest ends up pointing at the same buffer, which is freed at the end of the function. Check this. In the example below I have not included the free() calls for the allocated memory. You need to call free on each fGuest pointer in the AllRooms array once you are done with it.
#include<stdio.h>
#include<stdlib.h>
typedef char* Name;
struct Room
{
int fStatus;
Name fGuest;
}Room_t;
struct Room AllRooms[10];
int openRoomFile()
{
FILE *roomFile;
char *buffer = NULL;
size_t length = 0;
size_t count = 0;
size_t itemCount = 0;
roomFile = fopen("roomstatus.txt", "r+");
if (roomFile == NULL)
return 1;
while (itemCount < 10 && getline(&buffer, &length, roomFile) != -1) { // AllRooms has 10 entries
if (count % 2 == 0) {
sscanf(buffer, "%d", &AllRooms[itemCount].fStatus);
} else {
AllRooms[itemCount].fGuest = buffer; // the room now owns this buffer
buffer = NULL; // make getline() allocate a fresh buffer for the next line
length = 0;
itemCount++;
}
count++;
}
}
fclose(roomFile);
free(buffer);
return 0;
}
void printLayout(const struct Room rooms[])
{
int i;
for (i=0; i<3; i++) {
printf("%3d \t", rooms[i].fStatus);
puts(rooms[i].fGuest);
}
}
int main(void)
{
openRoomFile();
printLayout(AllRooms);
// free all memory allocated using malloc()
return 0;
}