I have a homework task that requires me to process .txt files by scanning them into a flexible data structure and then searching the files for words with capital letters. I'm having issues scanning them into the data structure I'm using. It needs to be flexible because it has to be able to process any .txt file.
The data structure I want to use is an array of pointers, each pointing to an array that contains the content of one line. I'm open to using a different structure if it's easier.
I've tried to scan it in line by line using fgets, and using malloc to allocate just enough to store the line, but it doesn't seem to work.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define STEPSIZE 100
int main()
{
FILE *inputFile;
//Opens the file in read mode
inputFile = fopen("testfile.txt", "r");
//Error message if file cannot open
if (inputFile == NULL)
{
printf("Unable to open file");
return 1;
}
int arrayLen = STEPSIZE;
// Allocate space for 100 lines. The **lines is the data structure used to store all the lines
char **lines = (char **)malloc(STEPSIZE * sizeof(char*));
char buffer[3000];
int i = 0;
while (fgets(buffer, 3000, inputFile))
{
//Checks if the array is full, and extends it
if(i == arrayLen)
{
arrayLen += arrayLen;
char ** newLines = realloc(lines, 200 * sizeof(char*));
if(!newLines)
{
printf("cant realloc\n");
}
lines= newLines;
}
// Get length of buffer
int lengthOfBuffer = strlen(buffer);
//Allocate space for string. The +1 is for the terminating character
char *string = (char *)malloc((lengthOfBuffer + 1) * sizeof(char));
//copy string from buffer to string
strcpy(string, buffer);
//Attach string to data structure
lines[i] = string;
//Increment counter
i++;
printf("%s", lines[i]);
}
//Closes the file
fclose(inputFile);
for (int j = 0; j < 100; j++){
printf("%s \n", lines[i]);
}
return 0;
}
When the final for loop runs, ideally the contents of the file get printed, just to show that they have been stored and can be processed, but currently I get exit code 11.
Any help would be appreciated.
There is a problem here:
//Increment counter
i++;
printf("%s", lines[i]); // you're printing the next line, which does not exist yet
Correct code:
printf("%s", lines[i]);
//Increment counter
i++;
And another one here:
for (int j = 0; j < 100; j++) { // your loop variable is j
printf("%s \n", lines[i]); // but you use i here.
}
Correct code:
for (int i = 0; i < 100; i++) {
printf("%s \n", lines[i]);
}
And still another one here:
arrayLen += arrayLen;
char ** newLines = (char**)realloc(lines, 200 * sizeof(char*));
// here the new length of your array is unconditionally 200
// but the new array length should be arrayLen
Correct code:
arrayLen += arrayLen;
char ** newLines = (char**)realloc(lines, arrayLen * sizeof(char*));
There may be more problems though, I didn't check everything.
BTW: sizeof(char) is 1 by definition, so you can just drop it.
BTW2: arrayLen += arrayLen; are you sure this is what you want? You double the size of your array each time. This is not necessarily wrong but using this method the array length will very quickly grow to a very big number. You probably wanted this: arrayLen += STEPSIZE;
BTW3:
while (fgets(buffer, 3000, inputFile))
this is not actually wrong, but you'd better write this:
while (fgets(buffer, sizeof buffer, inputFile))
which eliminates one of the two hard coded constants 3000.
BTW4: at the end you only print the first 100 lines you've read. You should be able to correct this yourself.
BTW5: you should also free all the memory you have allocated. I leave this as an exercise to you. Hint: it's about three lines of code to add at the end of main.
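The fixes listed above can be combined into one sketch. The helper names read_all_lines and free_lines are hypothetical (they are not in the original code), and the sketch assumes that a line longer than the 3000-byte buffer may simply be split across several entries.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define STEPSIZE 100  /* growth increment, as in the question */

/* Hypothetical helper: reads every line of fp into a malloc'd array of
 * malloc'd strings. Returns NULL on allocation failure; *count receives
 * the number of lines stored. */
char **read_all_lines(FILE *fp, size_t *count)
{
    size_t capacity = STEPSIZE;
    size_t n = 0;
    char buffer[3000];
    char **lines = malloc(capacity * sizeof *lines);
    if (lines == NULL)
        return NULL;

    while (fgets(buffer, sizeof buffer, fp)) {
        if (n == capacity) {
            /* Array is full: grow it by STEPSIZE pointers */
            capacity += STEPSIZE;
            char **grown = realloc(lines, capacity * sizeof *lines);
            if (grown == NULL)
                goto fail;
            lines = grown;
        }
        lines[n] = malloc(strlen(buffer) + 1);  /* +1 for the '\0' */
        if (lines[n] == NULL)
            goto fail;
        strcpy(lines[n], buffer);
        n++;
    }
    *count = n;
    return lines;

fail:
    while (n > 0)
        free(lines[--n]);
    free(lines);
    *count = 0;
    return NULL;
}

/* Frees each line, then the array of pointers itself. */
void free_lines(char **lines, size_t count)
{
    for (size_t i = 0; i < count; i++)
        free(lines[i]);
    free(lines);
}
```

A caller would print with a loop bounded by the returned count rather than by 100, then pass the same count to free_lines.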
Related
I'm trying to read a file that contains a five letter word on each line for 12972 lines. I'm not sure why I'm getting this error even with freeing storage.
#include <stdio.h>
#include <stdlib.h>
int main()
{
FILE *file;
file = fopen("words.txt", "r"); // Opens file
char** gssArr = (char**)malloc(12972 * sizeof(char*)); // Allocates memory for 2d array
for(int i = 0; i < 12972; i++)
{
gssArr[i] = (char*)malloc(5 * sizeof(char));
}
char word[6]; // Current word being looked at from file
int current = 0; // Used for indexing x coordinate of 2d array
while(fgets(word,6,file) != NULL) // Not at end of file
{
if(word[0] != '\n') // Not at end of line
{
for(int j = 0; j < 5; j++) // Loops through all 5 letters in word, adding them to gssArr
{
gssArr[current][j] = word[j];
}
}
current++; // Index increase by 1
}
fclose(file); // Close file, free memory
free(gssArr);
FYI - your reader loop - you may want to make sure that your current index is not going beyond 12971.
Your problem is here:
current++;
Why?
Because it's done even when you are at the end of the line.
One solution is to move it inside the if statement but instead I'll recommend that you use fgets with a much larger buffer.
Details
If every line holds a five letter word then
fgets(word,6,file)
will read the five letters and zero terminate it. Then the next fgets will just read the "newline". Still you increment the index counter and in the end, you write outside allocated memory.
Try
while(fgets(word,6,file) != NULL) // Not at end of file
{
printf("current=%d\n", current);
and you see the problem.
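The larger-buffer approach recommended above could look roughly like this. It is a sketch, not the asker's code: read_words and WORD_LEN are hypothetical names, and the choice to skip lines that are not exactly five characters long is an assumption about what the program should do with bad input.

```c
#include <stdio.h>
#include <string.h>

#define WORD_LEN 5

/* Hypothetical helper: stores up to max_words five-letter words from fp
 * in words[][] and returns how many were stored. */
size_t read_words(FILE *fp, char words[][WORD_LEN + 1], size_t max_words)
{
    char line[64];   /* much larger than a word, so the newline fits too */
    size_t n = 0;

    /* The n < max_words check keeps the index inside the array */
    while (n < max_words && fgets(line, sizeof line, fp) != NULL) {
        line[strcspn(line, "\r\n")] = '\0';   /* strip the line ending */
        if (strlen(line) != WORD_LEN)
            continue;                         /* skip blank or malformed lines */
        strcpy(words[n], line);               /* fits: 5 chars plus '\0' */
        n++;
    }
    return n;
}
```

Because each fgets call now consumes a whole line including its newline, the index advances exactly once per word. Each row is also one byte longer than in the question, so the stored words are proper null-terminated strings.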
I want to get all lines from the text file and store them in my char** pointer (array of strings). The problem is that when I try to set indices for pointer's strings, the program assigns the last scanned sentence for all indices.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAX_LINE 10000
int main()
{
FILE *fp = NULL;
char line[MAX_LINE];
char** lines = (char**) malloc(10000*200*sizeof(char));
int count = 0;
fp = fopen("test.txt","r");
while(fgets(line,10000,fp)) {
lines[count] = line;
count++;
}
fclose(fp);
for(int i =0; i<2000;i++){
printf("%s",lines[i]);
}
return 0;
}
Let's assume test.txt is like this:
Alice was beginning to get very tired of sitting by her sister on the
bank, and of having nothing to do: once or twice she had peeped into the
book her sister was reading, but it had no pictures or conversations in
it, and what is the use of a book, thought Alice without pictures or
conversations?
When I print like this, every time I get the last sentence (in this case conversations? ) in my text file. However, I want to set every scanned sentence from the text file to the different index in my char**. For example, I want to set like this:
lines[0] gives "Alice was beginning to get very tired of sitting by her sister on the"
lines[1] gives "bank, and of having nothing to do: once or twice she had peeped into the"
and so on.
You can't copy characters from one string buffer to another simply by assigning a pointer (all that does is to make the destination point to the source, as you have noticed).
Instead, you must actually copy the characters, using the strcpy function. So, instead of:
lines[count] = line; // Just makes each pointer point to the same buffer
use:
strcpy(lines[count], line); // Copies the CURRENT contents of "line"
You also have a severe problem in the way you are using your char** lines buffer. If you want an array of 200 lines, each with a maximum length of 10000 characters, you should allocate them as follows:
char** lines = malloc(200 * sizeof(char*)); // Make 200 pointers
// Now allocate 10000 chars to each of these pointers:
for (int i = 0; i < 200; ++i) lines[i] = malloc(10000 * sizeof(char));
Note: The 200 buffers will be uninitialized (contain random data) so, in your print loop, you should only use those you have copied real data to, using the count variable as the loop limit:
for(int i = 0; i < count; i++) {
printf("%s", lines[i]);
}
Also, don't forget to free the memory allocated when you're done:
for (int i = 0; i < 200; ++i) free(lines[i]); // Free each line buffer...
free(lines); // ... then free the array of pointers itself
strdup resolves the issue. Free the resources when finished, as Adrian said.
int main()
{
FILE *fp = NULL;
char line[MAX_LINE];
char** lines = (char**) malloc(10000*200*sizeof(char));
int count = 0;
fp = fopen("test.txt","r");
while(fgets(line,10000,fp)) {
lines[count] = strdup(line);
count++;
}
fclose(fp);
for(int i =0; i<count;i++){
printf("%s",lines[i]);
}
for (int i = 0; i < count; ++i) free(lines[i]);
free(lines);
return 0;
}
If you are looking for better performance look at my repo (https://github.com/PatrizioColomba/strvect)
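One caveat about the strdup approach: strdup comes from POSIX and only became part of ISO C with C23, so a strict older compiler mode may not declare it. A minimal stand-in, assuming the hypothetical name dup_string, might look like this:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical portable stand-in for strdup: returns a malloc'd copy of s,
 * or NULL if allocation fails. The caller frees the result. */
char *dup_string(const char *s)
{
    size_t len = strlen(s) + 1;      /* include the terminating '\0' */
    char *copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, s, len);
    return copy;
}
```

Each call returns a separate block, so overwriting the shared line buffer on the next fgets call no longer changes the lines already stored.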
I'm supposed to copy fp to lines.
I first find the length of the texts in fp
then I dynamically allocate lines and retrieve the texts using fgets.
I keep getting a "Your return code was -11 but it was supposed to be 0" on my auto grader. This is only part of the code of course. I have a makefile and main.
Where is my seg fault??
void read_lines(FILE* fp, char*** lines, int* num_lines){
int num_chars=0;
int index=0;
int lengths[index];
int i=0;
//find the length of the rows n cols in fp
//while there is still character in the text
while(!feof(fp)){
//get that character
char current_char= fgetc(fp);
//implement the number character
num_chars++;
//enter at the end of the first then each line
if(current_char=='\n'){
//find the length of the next line of sentence/word.
// This array stores the length of characters of each line
lengths[index]= num_chars;
//update index
index++;
// Reset the number of characters for next iteration
num_chars = 0;
// Increment the number of lines read so far
(*num_lines)++;
}
}
//now we need to copy the characters in fp to lines
(*lines)=(char**) malloc((*num_lines)*sizeof(char*));
for(i=0;i<*num_lines;i++){
(*lines)[i]=(char*)malloc(lengths[i]*sizeof(char));
fgets(*lines[i],(lengths[i]+1),fp);
fseek(fp,0,SEEK_SET);
}
}
I'm seeing two problems, here.
First, lengths is declared as int lengths[index] while index is 0, so it is an array of zero elements and every write to lengths[index] is out of bounds. That can and will never work. You will need to either create a lengths array with a maximum size (say, 256 lines maximum) or make lengths a linked list so that it can grow with the index. Alternatively, you can make two passes through the file: once to get the number of lines (after which you allocate your lines array) and once to get the number of characters per line.
Second, although it is a nitpick, you can simplify the code by removing num_lines from your while loop. After the loop, just set
*num_lines = index;
The reason for the segfault is that you dereference the lines pointer the wrong way. Indexing binds tighter than the unary *, so *lines[i] means *(lines[i]):
fgets(*lines[i],(lengths[i]+1),fp);
The correct way is:
fgets((*lines)[i],(lengths[i]+1),fp);
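To make the precedence point concrete, here is a small sketch that uses the same char *** out-parameter convention as read_lines. The function make_two_lines is hypothetical and exists only for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: allocates two strings and returns them through
 * the out-parameter, the same calling convention as read_lines. */
int make_two_lines(char ***lines)
{
    *lines = malloc(2 * sizeof **lines);   /* room for two char pointers */
    if (*lines == NULL)
        return -1;
    for (int i = 0; i < 2; i++) {
        /* (*lines)[i]: dereference first, then index the new array.
         * *lines[i] would index the caller's pointer variable instead
         * and read past it for any i > 0. */
        (*lines)[i] = malloc(8);
        if ((*lines)[i] == NULL) {
            while (i > 0)
                free((*lines)[--i]);
            free(*lines);
            *lines = NULL;
            return -1;
        }
        strcpy((*lines)[i], i == 0 ? "first" : "second");
    }
    return 0;
}
```

A caller declares char **lines and passes &lines. With i equal to 0 the two spellings happen to name the same object, which is why the original code survives the first line and fails on a later one.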
Fix it like this:
void read_lines(FILE *fp, char ***lines, int *num_lines){
int num_chars=0;
/* int index=0; int lengths[index];//lengths[0] is bad. */
int ch, i = 0, max_length = 0;
while((ch=fgetc(fp))!=EOF){ // while(!feof(fp)) is wrong: it runs the loop body one extra time
num_chars++;
if(ch == '\n'){
++i;//count line
if(num_chars > max_length)
max_length = num_chars;
//reset
num_chars = 0;
}
}
if(num_chars != 0)//There is no newline in the last line
++i;
*num_lines = i;
rewind(fp); // rewind is needed before reading the file a second time
char *line = malloc(max_length + 1);
*lines = malloc(*num_lines * sizeof(char*));
for(i = 0; i < *num_lines; i++){
fgets(line, max_length+1, fp);
(*lines)[i] = malloc(strlen(line)+1);
strcpy((*lines)[i], line);
}
free(line);
}
I have a text file with names that looks as follows:
"MARY","PATRICIA","LINDA","BARBARA","ELIZABETH","JENNIFER","MARIA","SUSAN","MARGARET",
I have used the following code to attempt to put the names into an array:
char * names[9];
int i = 0;
FILE * fp = fopen("names.txt", "r");
for (i=0; i < 9; i++) {
fscanf(fp, "\"%s\",", names[i]);
}
The program comes up with a segmentation fault when I try to run it. I have debugged carefully, and I notice that the fault comes when I try and read in the second name.
Does anybody know why my code isn't working, and also why the segmentation fault is happening?
You have undefined behavior in your code, because you don't allocate memory for the pointers you write to in the fscanf call.
You have an array of nine uninitialized pointers, and as they are part of a local variable they have an indeterminate value, i.e. they will point to seemingly random locations. Writing to random locations in memory (which is what will happen when you call fscanf) will do bad things.
The simplest way to solve the problem is to use an array of arrays, like e.g.
char names[9][20];
This gives you an array of nine arrays, each sub-array being 20 characters (which allows you to have names up to 19 characters long).
To avoid writing out of bounds, you should also modify your call so that you don't read too many characters:
fscanf(fp, "\"%19s\",", names[i]);
There is however another problem with your use of the fscanf function, and that is that the format to read a string, "%s", reads until it finds a whitespace in the input (or until the limit is reached, if a field width is provided).
In short: You can't use fscanf to read your input.
Instead I suggest you read the whole line into memory at once, using fgets, and then split the string on the comma using e.g. strtok.
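The fgets-plus-strtok suggestion could be sketched as follows. The helper split_names is hypothetical, and the sketch assumes that no name contains a comma and that every field is wrapped in double quotes as in the sample file.

```c
#include <string.h>

/* Hypothetical helper: splits a line such as "MARY","PATRICIA", in place
 * and stores pointers to the unquoted names in names[]. Returns the count.
 * The pointers refer into line, so line must outlive them. */
size_t split_names(char *line, char *names[], size_t max_names)
{
    size_t n = 0;
    /* Commas separate fields; line-ending characters are delimiters too */
    for (char *tok = strtok(line, ",\r\n");
         tok != NULL && n < max_names;
         tok = strtok(NULL, ",\r\n")) {
        size_t len = strlen(tok);
        if (len >= 2 && tok[0] == '"' && tok[len - 1] == '"') {
            tok[len - 1] = '\0';   /* drop the closing quote */
            tok++;                 /* skip the opening quote */
        }
        names[n++] = tok;
    }
    return n;
}
```

The line itself would come from a call such as fgets(line, sizeof line, fp) with a buffer large enough for the whole file line. Since the names point into that buffer, it has to stay alive for as long as the names are used.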
One way of handling arbitrarily long lines as input from a file (pseudoish-code):
#define SIZE 256
size_t current_size = SIZE;
char *buffer = malloc(current_size);
buffer[0] = '\0'; // Terminator at first character, makes the string empty
for (;;)
{
// Read into temporary buffer
char temp[SIZE];
fgets(temp, sizeof(temp), file_pointer);
// Append to actual buffer
strcat(buffer, temp);
// If the last character is a newline (which `fgets` stores
// when it reaches the end of the line) then the whole line has
// been read and we are done
if (last_character_is_newline(buffer))
break;
// Still more data to read from the line
// Allocate a larger buffer
current_size += SIZE;
buffer = realloc(buffer, current_size);
// Continues the loop to try and read the next part of the line
}
// After the loop the pointer `buffer` points to memory containing the whole line
[Note: The above code snippet doesn't contain any error handling.]
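A runnable variant of the same idea, with the error handling filled in, might look like this. The name read_long_line and the CHUNK constant are hypothetical, and instead of a temporary buffer plus strcat this sketch reads straight into the unused tail of the growing buffer.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define CHUNK 256

/* Hypothetical helper: reads one line of any length from fp into a
 * malloc'd string (newline included, if present). Returns NULL at end
 * of file, on a read error, or on allocation failure. The caller frees
 * the result. */
char *read_long_line(FILE *fp)
{
    size_t size = CHUNK;
    size_t len = 0;
    char *buffer = malloc(size);
    if (buffer == NULL)
        return NULL;

    /* Read directly into the unused tail of the buffer */
    while (fgets(buffer + len, (int)(size - len), fp) != NULL) {
        len += strlen(buffer + len);
        if (len > 0 && buffer[len - 1] == '\n')
            return buffer;               /* whole line read */

        /* Line continues: grow the buffer and read the next part */
        char *grown = realloc(buffer, size + CHUNK);
        if (grown == NULL) {
            free(buffer);
            return NULL;
        }
        buffer = grown;
        size += CHUNK;
    }

    if (len == 0 || ferror(fp)) {        /* nothing usable was read */
        free(buffer);
        return NULL;
    }
    return buffer;                       /* last line without a newline */
}
```

Calling it in a loop until it returns NULL yields every line of the file, however long each one is.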
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void) {
char *names[9], buff[32];
int i = 0;
FILE *fp = fopen("names.txt", "r");
for(i = 0; i < 9; i++) {
if(1==fscanf(fp, "\"%31[^\"]\",", buff)){//"\"%s\"," does not do what you want: %s also consumes the closing quote and everything up to the next whitespace
size_t len = strlen(buff) + 1;
names[i] = malloc(len);//Space is required to load the strings of each
memcpy(names[i], buff, len);
}
}
fclose(fp);
//check print & deallocate
for(i = 0; i< 9; ++i){
puts(names[i]);
free(names[i]);
}
return 0;
}
try this...
for (i=0; i < 9; i++)
{
names[i]=malloc(15);// you should take care about size
fscanf(fp, "\"%s\",", names[i]);
}
This might be a very inefficient way to do it, but it's sort of working.
This code reads through a file, stores 8 lines of text at a time in a global array (I would like a better option if possible) and dispatches them for further processing.
here's the code
int count = 0; //global
char *array_buffer[8]; //global
void line(char *line_char)
{
int lent = strlen(line_char);
array_buffer[count] = line_char;
printf("%d\n",count);
if (count == 8)
{
int row,col;
for(row = 0; row<count; row++){
printf("%d\n",row);
for(col = 0; col<=lent; col++) {
printf("%c", array_buffer[row][col]);
}
printf("\n");
}
count = 0;
}
count++;
}
int main(int argc,char **argv)
{
clock_t start = clock();
FILE *fp = fopen(argv[1], "r");
if(fp == NULL )
{
printf("Couldn't open file %s",argv[1]);
}
char buff[512];
while (fgets(buff, 512, fp) != NULL )
{
line(buff); /*sending out an array having one line*/
}
return 0;
}
The issue is that when printing the contents of array_buffer, it prints the last line in the buffer 8 times (i.e. the 8th line read in every cycle). It's pretty obvious that
array_buffer[0]
....
array_buffer[7]
all point to the address of line 8.
Any help in solving this? I know it might not be the correct way to buffer something at all!
The problem with your approach that leads to the behavior that you see is that your code never copies the data from the buffer. This line
array_buffer[count] = line_char;
puts a pointer to the same char buff[512] from main at all eight locations. Subsequent calls to fgets overwrite the content of previous reads, so you end up with eight copies of the last line.
You can fix this issue by making a copy, e.g. with strdup or by allocating memory with malloc and making a copy. You need to free everything that you allocate, though.
void line(char *line_char){
if (count == 8){
int row,col;
for(row = 0; row<count; row++){
printf("%2d:",row);
printf("%s", array_buffer[row]);
free(array_buffer[row]);
}
count = 0;
}
int lent = strlen(line_char);
array_buffer[count] = malloc((lent + 1)*sizeof(char));
strcpy(array_buffer[count], line_char);
//printf("%d\n", count);
count++;
}
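Since the question also asks for an alternative to the global array, here is one possible sketch that keeps the batch in a struct owned by the caller. The names struct batch, batch_add and batch_clear are hypothetical and not part of the original code.

```c
#include <stdlib.h>
#include <string.h>

#define BATCH_SIZE 8

/* Hypothetical container replacing the two globals from the question */
struct batch {
    char *rows[BATCH_SIZE];
    size_t count;
};

/* Releases every stored line and empties the batch. */
void batch_clear(struct batch *b)
{
    for (size_t i = 0; i < b->count; i++)
        free(b->rows[i]);
    b->count = 0;
}

/* Stores a private copy of text. Returns 1 when the batch has just become
 * full, 0 when there is room left, and -1 if the batch was already full
 * or the allocation failed. */
int batch_add(struct batch *b, const char *text)
{
    if (b->count == BATCH_SIZE)
        return -1;
    char *copy = malloc(strlen(text) + 1);
    if (copy == NULL)
        return -1;
    strcpy(copy, text);
    b->rows[b->count++] = copy;
    return b->count == BATCH_SIZE;
}
```

In main, a zero-initialized struct batch would be filled from the fgets loop. Whenever batch_add returns 1, the eight rows can be processed and then released with batch_clear before reading continues.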
You have a pointer aliasing problem: every stored pointer refers to the same buffer. Here is what happens:
while (fgets(buff, 512, fp) != NULL )
{
//buff updated
line(buff);
//...
//inside of the line function
somepointertopointers[currIndex]=buff;
Now every element points at buff, so all of them see the same location. You need to copy the characters, or use a larger buffer and make sure each stored pointer refers to a different part of it. You can also use 8 separate char arrays.
This will give you the result you want
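char buff[8][512]; // 8 rows of 512 chars, one row per line
int row = 0;
while (row < 8 && fgets(buff[row], sizeof buff[row], fp) != NULL)
{
line(buff[row]); // each call receives a different row
row++; // reset row to 0 after each batch of 8 to keep reading
}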
or alternatively you could allocate the buffer that is passed
#define BUFF_SIZE 512
#define BUFF_ARRAY_LEN 8
//put this somewhere before calling line
//to initialize your array_buffer
for(i=0;i<BUFF_ARRAY_LEN;i++)
{
array_buffer[i]=NULL;
}
...
//update in function line
//makes more sense to just use
//the max len of a line
if(array_buffer[count] == NULL)
array_buffer[count]=(char*)malloc(sizeof(char)*BUFF_SIZE);
strcpy(array_buffer[count],line_char);
...
//you will also need to
//clean up after you are
//done with the memory
for(i=0;i<BUFF_ARRAY_LEN;i++)
{
free(array_buffer[i]);
}