How can I read a specific line from a file, in C?

All right: So I have a file, and I must do things with it. Oversimplifying, the file has this format:
n
first name
second name
...
nth name
random name
do x⁽¹⁾, y⁽¹⁾ and z⁽¹⁾
random name
do x⁽²⁾, y⁽²⁾, z⁽²⁾
...
random name
do x⁽ⁿ⁾, y⁽ⁿ⁾, z⁽ⁿ⁾
So, the actual details are not important.
The problem is: I'll have to declare a variable n and an array name[MAX], and fill this array with the names, from name[0] to name[n-1].
How can I read this input if I don't know in advance how many names there are?
For example, I could do it just fine if this were user input from the keyboard. I would do it like this:
int n; char name[MAX][LEN]; /* LEN: assumed maximum length of one name */
scanf("%d", &n);
int i; for (i = 0; i < n; i++)
scanf("%s", name[i]);
And I could go on and write the whole code, but you get the point. My input now comes from a file, though, and I don't know how to read it. All I can think of is to fscanf() the whole file, but since I don't know its size (the first number determines it), I can't do that. As far as I know (please correct me if that's not true, I am very new to this), we can't use a for loop and read the values gradually as if they were coming from the keyboard, right?
So the only way out I see is to find a way to read a particular line from the file. If I can do that, the rest is easy. But how can I do that?
I googled it and even found some questions on here, but they didn't make any sense to me. Apparently, reading a particular line from a file is really complicated.
This is from a beginner problem set, so I doubt it is something that complicated. I must be missing something very simple, though I just don't know what it is.
So, the question is: How would you do it, for instance?
How do I scan the first number n from the file, and then scan the other n names, assigning each one to an element of an array (first name = name[0], last name = name[n - 1])?

I would suggest looking into End Of File.
while(!feof(fd))
{
...code...
}
Mind you my C knowledge is rusty, but this should get you started.
IIRC feof() returns nonzero only after a read has already hit the end of the file, so check the result of each read inside the loop as well. Here fd is the FILE pointer (from fopen) of the file you are reading.
Then after parse of text or count of lines you have your 'n'.
EDIT: Since I'm obviously more tired than I thought (I didn't notice your 'n' at the top):
Read first line
malloc for 'n' size array
for loop to iterate names.

Here you go. I leave compiling and debugging as an exercise for the student.
The idea is to slurp the whole file into a single array, if your files are always small.
This is so much more efficient than scanf().
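char buf[100000], *bp, *N[1000]; // plenty big
int n = 0;
// fd is the FILE * returned by fopen(); fread() reads past newlines, fgets() would stop at the first one
size_t len = fread( buf, 1, sizeof buf - 1, fd );
if ( len == sizeof buf - 1 )
{ // file too long for buffer (or exactly fills it)
fprintf( stderr, "trouble: file too large: %d\n", (int)(sizeof buf));
exit(EXIT_FAILURE);
}
buf[len] = '\0';
// now replace each \n with a \0, remembering where each line starts.
for ( bp = buf; *bp != '\0' && n < 1000; )
{
N[n++] = bp;
char *nl = strchr( bp, '\n' );
if ( nl == NULL ) break; // last line has no trailing newline
*nl = '\0';
bp = nl + 1;
}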
If you want to read files of any size, you need to read the file in chunks, calloc()ing each chunk before a read, and carefully handling the line fragment left at the end of the current buffer by moving it to the next buffer before continuing your reads.
Unless you have a limit on how many lines you can read, N may also need to grow in chunks, and this time realloc() might be your friend.

Since the given format seems to imply that the number of names n is given as the first entry in the file, it would be possible to use the style of reading that the OP describes when reading from stdin. Use fscanf to read the first integer from the file (n), then use malloc to allocate the array(s) for the names, then use a for loop up to n to read the names.
However, I am unsure of the meaning of the example data following that with the do x⁽¹⁾, y⁽¹⁾ and z⁽¹⁾ format. Perhaps I am not understanding part of the question. If it means there are potentially more than n names, then you can use realloc to grow the size of the array. One way of growing the array that is not uncommon is to double the length each time.
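The fscanf/malloc/for approach described above could look roughly like the sketch below. The names read_names, Name and NAME_LEN are made up for illustration, the 64-byte limit is an assumption, and "%63s" assumes each name is a single word without spaces (as the question's scanf("%s") does).

```c
#include <stdio.h>
#include <stdlib.h>

#define NAME_LEN 64              /* assumed maximum name length, terminator included */
typedef char Name[NAME_LEN];     /* hypothetical helper type: one fixed-size name */

/* Reads the count n from the start of fp, then the n names that follow.
   Returns a malloc'd array of n names (the caller frees it) and stores n
   in *count. Returns NULL on a read or allocation failure. */
Name *read_names(FILE *fp, int *count)
{
    int n;

    if (fscanf(fp, "%d", &n) != 1 || n <= 0)
        return NULL;

    Name *names = malloc((size_t)n * sizeof *names);
    if (names == NULL)
        return NULL;

    for (int i = 0; i < n; i++) {
        /* %63s skips leading whitespace and stops at the next whitespace */
        if (fscanf(fp, "%63s", names[i]) != 1) {
            free(names);
            return NULL;
        }
    }
    *count = n;
    return names;
}
```

After the call, the file position sits right after the nth name, so the rest of the file can be read with further fscanf or fgets calls on the same FILE pointer.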

Related

Pthreads, fread(), and printf(): Getting random D4's in my string

The Scoop:
I am creating a method that runs through a lengthy file in chunks: using pthreads. I am calling fread() to read the file in this sort of fashion:
fread( thread_data[i].buffer, 1, 50, f )
/*
thread_data is a data structure for each thread (hence i)
buffer is in thread_data as an array of length 50
*/
I am then directly calling a print statement to see what each thread is doing, as a weird pattern was showing up in some of the parts that I was printing. Namely, my print statement would look something like this:
this is suppose to be 50 characters, but it is only a fewgD4
That D4 directly above is what I have my question on. Every thread that I make, at the end of the string, we are printing D4, and in this case, followed by a g. Other times, it is followed by a d, and most commonly a �. Now, I did read the wikipedia page on this character, which states:
replacement character used to replace an unknown or unrepresentable character
My question:
What kind of an error am I running into? Why is the end of each read statement containing unknown characters, especially the weird gD4 guy?
Aside:
I am trying to make a function in C that utilizes pthreads to find the frequency of each word in a file, in case anyone was wondering. These weird characters were showing up in my list, which is something that I find slightly unpleasant. Finally, don't bother linking me to the Obligatory Unicode article, I am already aware of it, and the characters are not outside of what I am working with.
The strings you are printing out are not null-terminated — fread() does not null-terminate its output, it simply reads in as many raw bytes as you asked for (or fewer). So when you print out your buffer, your print function is walking past the end of the data and printing out whatever garbage memory comes after the buffer, which in your case just happens to be gD4.
You need to either explicitly null-terminate your buffer; or, if your print function supports it, tell it exactly how many characters to print. Either way, you need to save the return value from fread to know how many characters you read. For example:
size_t n = fread(thread_data[i].buffer, 1, 49, f); // read at most 49 so the 50-byte buffer has room for the terminator
if (n == 0 && ferror(f)) /* Handle error */ ;
// Explicitly add a null terminator
thread_data[i].buffer[n] = 0;
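The other option mentioned above, telling the print function exactly how many characters to print, could be sketched like this. The function name read_and_print_chunk and its parameters are hypothetical. The "%.*s" conversion takes the maximum character count as an int argument, so the buffer does not need a terminator.

```c
#include <stdio.h>

/* Reads up to cap bytes from f into buf and prints exactly the bytes that
   were read to out. Returns the number of bytes read. */
size_t read_and_print_chunk(FILE *f, char *buf, size_t cap, FILE *out)
{
    size_t n = fread(buf, 1, cap, f);

    /* The precision (int)n caps how many chars are printed, so printing
       never walks past the data fread() actually stored */
    fprintf(out, "%.*s", (int)n, buf);
    return n;
}
```

Note that "%.*s" still stops early at an embedded null byte; for arbitrary binary data, fwrite(buf, 1, n, out) is the safer choice.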

C multidimensional array size using scanf

I'm having some trouble using multidimensional arrays for a program. Essentially, the program uses scanf to read a user ID and a string of chars from a redirected file. The file format is a three-digit user ID, a space, and a string of chars representing the answers to multiple-choice problems on a test, one user per line, e.g.:
111 dabac
102 dcbdc
251 dbbac
The problem I'm running into is that I don't know how many users there are, and I can't read the file data multiple times. I've tried using
for (lineNumber = 0; lineNumber != -1; lineNumber++)
{
int result = scanf("%d ", &data);
if (result == EOF)
break;
for(i = 0; i < numProblems; i++)
{
scanf("%c", &input);
}
}
to get the number of lines in the file, then set the size of the array. The array is then passed to another function that reads the data, using the same for loop but with
input = arrayName[numProblems][lineNumber];
in the second for loop. The issue I'm running into is that scanf can only read the data from the file once, and I can't store the data in the array until I initialize it, which requires me to know how many users there are.
The way I have it set up, the program can either find the number of lines(users) or store the data in the array (if I set the size to an arbitrary number), but not both.
I have to use scanf because the filename isn't constant (and also this is for a class... the professor requires scanf to be used), and I can't figure out how to get the number of lines in the file and still be able to read the data. If anyone knows of a workaround to either find the number of lines without using scanf, or to read the data twice, I would really appreciate some help. If it would help to post the entire program, I can do that as well.
Thank you,
Erik
There are multiple ways to do this. I would recommend a linked list using structs, and keeping a count of what has been read so far.
If you really want to continue using arrays, allocate the array using malloc first, and then reallocate the array using realloc. You'll have a running count of the size of the array for reallocation.
It should look something like this:
POINTER *array = realloc(orig_array_pt, size);
if (array == NULL)
{
// realloc failed
}
else
{
orig_array_pt = array;
}
As long as the input is an actual file (and not a terminal or some other device), you can use rewind(stdin); to set the FILE pointer back to the beginning of the file...
Luckily scanf() will parse the whole line into multiple variables in one pass. Try something more like
scanf( "%d %s", &uid, answers );
Just count the number of times you're able to successfully parse data out of the stdin and call that your lineNumber count ( if you even need to know the total number of lines ).
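Putting the suggestions above together (parse each line in one scanf-style call, count the successful parses, and grow the array with realloc), a single-pass reader might look like the following sketch. Record, read_records and the 32-byte answers buffer are assumptions for illustration. It uses fscanf on a FILE pointer, so passing stdin behaves like the scanf calls in the question.

```c
#include <stdio.h>
#include <stdlib.h>

typedef struct {
    int  uid;
    char answers[32];        /* assumed upper bound on answers per line */
} Record;

/* Reads "uid answers" pairs from in until parsing fails or input ends.
   Returns a malloc'd array (the caller frees it) and stores the number of
   records in *count. Returns NULL if nothing was read or realloc failed. */
Record *read_records(FILE *in, size_t *count)
{
    Record *recs = NULL;
    size_t used = 0, cap = 0;
    Record r;

    *count = 0;
    while (fscanf(in, "%d %31s", &r.uid, r.answers) == 2) {
        if (used == cap) {
            /* double the capacity each time the array fills up */
            size_t new_cap = cap ? cap * 2 : 16;
            Record *tmp = realloc(recs, new_cap * sizeof *recs);
            if (tmp == NULL) {   /* the old block is still valid here */
                free(recs);
                return NULL;
            }
            recs = tmp;
            cap = new_cap;
        }
        recs[used++] = r;
    }
    *count = used;
    return recs;
}
```

Because the data is stored while it is counted, the file never has to be read twice; *count ends up being the number of users.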

Why is this C code giving me a bus error?

I have, as usual, been reading quite a few posts on here. I found a particularly useful post on bus errors in general, see here. My problem is that I cannot understand why my particular code is giving me an error.
My code is an attempt to teach myself C. It's a modification of a game I made when I learned Java. The goal of my game is to take a huge 5049 x 1 text file of words, randomly pick a word, jumble it, and have the player try to guess it. I know how to do all of that. Each line of the text file contains a word, like:
5049
must
lean
better
program
now
...
So, I created a string array in C and tried to read the file into it. I didn't do anything else. Once I get the file into C, the rest should be easy. Weirder yet is that it compiles. My problem comes when I run it with the ./blah command.
The error I get is simple. It says:
zsh: bus error ./blah
My code is below. I suspect it might have to do with memory or overflowing the buffer, but that's completely unscientific and a gut feeling. So my question is simple, why is this C code giving me this bus error msg?
#include<stdio.h>
#include<stdlib.h>
//Preprocessed Functions
void jumblegame();
void readFile(char* [], int);
int main(int argc, char* argv[])
{
jumblegame();
}
void jumblegame()
{
//Load File
int x = 5049; //Rows
int y = 256; //Colums
char* words[x];
readFile(words,x);
//Define score variables
int totalScore = 0;
int currentScore = 0;
//Repeatedly pick a random work, randomly jumble it, and let the user guess what it is
}
void readFile(char* array[5049], int x)
{
char line[256]; //This is to to grab each string in the file and put it in a line.
FILE *file;
file = fopen("words.txt","r");
//Check to make sure file can open
if(file == NULL)
{
printf("Error: File does not open.");
exit(1);
}
//Otherwise, read file into array
else
{
while(!feof(file))//The file will loop until end of file
{
if((fgets(line,256,file))!= NULL)//If the line isn't empty
{
array[x] = fgets(line,256,file);//store string in line x of array
x++; //Increment to the next line
}
}
}
}
This line has a few problems:
array[x] = fgets(line,256,file);//store string in line x of array
You've already read the line in the condition of the immediately preceding if statement: the current line that you want to operate on is already in the buffer and now you use fgets to get the next line.
You're indexing the array with x, which starts out as the array size: instead you'll want to keep a separate variable for the array index that starts at 0 and increments each time through the loop.
Finally, you're trying to copy the strings using =. This will only copy references, it won't make a new copy of the string. So each element of the array will point to the same buffer: line, which will go out of scope and become invalid when your function exits. To populate your array with the strings, you need to make a copy of each one for the array: allocate space for each new string using malloc, then use strncpy to copy each line into your new string. Alternately, if you can use strdup, it will take care of allocating the space for you.
But I suspect that this is the cause of your bus error: you're passing in the array size as x, and in your loop, you're assigning to array[x]. The problem with this is that array[x] doesn't belong to the array, the array only has useable indices of 0 to (x - 1).
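A minimal sketch of a reader that applies the points above: one fgets per line, a separate index starting at 0, a bounds check, and a heap copy of each string. The name read_lines and its signature are made up for illustration and are not the asker's readFile.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Reads up to max_lines lines from file into array, giving each line its
   own heap copy. Returns the number of lines stored; the caller frees
   each stored string. */
int read_lines(FILE *file, char *array[], int max_lines)
{
    char line[256];
    int count = 0;                   /* separate index, starting at 0 */

    /* fgets drives the loop: it returns NULL at end of file or on error */
    while (count < max_lines && fgets(line, sizeof line, file) != NULL) {
        line[strcspn(line, "\n")] = '\0';        /* strip the trailing newline */

        char *copy = malloc(strlen(line) + 1);   /* +1 for the terminator */
        if (copy == NULL)
            break;
        strcpy(copy, line);
        array[count++] = copy;       /* never writes past max_lines - 1 */
    }
    return count;
}
```

With the question's data, the caller would declare char *words[5049] and call read_lines(file, words, 5049); the return value says how many entries were actually filled.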
You are passing the value 5049 for x. The first time that the line
array[x] = ...
executes, it's accessing an array location that does not exist.
It looks like you are learning C. Great! A skill you need to master early is basic debugger use. In this case, if you compile your program with
gcc -g myprogram.c -o myprogram
and then run it with
gdb ./myprogram
(I am assuming Linux), then type run at the gdb prompt and bt once it crashes, you will get a stack trace that shows the line where the bus error occurred. This should be enough to help you figure out the error yourself, which in the long run is much better than asking others.
There are many other ways a debugger is useful, but this is high on the list. It gives you a window into your running program.
You are storing the lines in the line buffer, which is defined inside the readFile function, and storing pointers to it in the array. There are two problems with that: you are overwriting the value every time a new string is read, and the buffer is on the stack, so it is invalid once the function returns.
You have at least a few problems:
array[x] = fgets(line,256,file)
This stores the address of line into each array element. line is no longer valid when readFile() returns, so you'll have an array of useless pointers. Even if line had a longer lifetime, it wouldn't be useful to have all your array elements holding the same pointer (they'd each just point to whatever happened to be written in the buffer last).
while(!feof(file))
This is an antipattern for reading a file. See http://c-faq.com/stdio/feof.html and "Using feof() incorrectly". This antipattern is likely responsible for your program looping more than you might expect when reading the file.
You allocate the array to hold 5049 pointers, but you simply read however much is in the file. There's no check on whether you read the expected number, and nothing prevents reading too many. You should think about allocating the array dynamically as you read the file, or have a mechanism to ensure you read the right amount of data (not too little and not too much) and handle the error when it's not right.
I suspect the problem is with (fgets(line,256,file))!=NULL). A better way to read a file is with fread() (see http://www.cplusplus.com/reference/clibrary/cstdio/fread/). Specify the buffer, the size of each element, the number of elements, and the FILE* (a file stream in C). The routine returns the number of elements read. If the return value is less than the number requested, then either the EOF has been reached or an error occurred.
char buff[256];
size_t got = fread(buff, sizeof(char), 256, file);

Reading and comparing numbers from txt file C

I am new to C programming, so I am having difficulties with the problem below.
I have a text file inp.txt which contains information like the following:
400;499;FIRST;
500;599;SECOND;
670;679;THIRD;
I need to type a number and my program needs to compare it with numbers from the inp.txt file.
For example, if I type 450, it's between 400 and 499, so I need to write the word FIRST to the file out.txt.
I have no idea how to convert a character array to an int.
I think you'll want these general steps in your program (but I'll leave it to you to figure out how you want to do it exactly)
Load each of the ranges and the text "FIRST", "SECOND", etc. from the file inp.txt, into an array, or several arrays, or similar. As I said in the comment above, fscanf might be handy. This page describes how to use it (the page is about C++, but using it in C should be the same): http://www.cplusplus.com/reference/clibrary/cstdio/fscanf/. Roughly speaking, the idea is that you give fscanf a format specifier for what you want to extract from a line in a file, and it puts the bits it finds into the variables you specify.
Prompt the user to enter a number.
Look through the array(s) to work out which range the number fits into, and therefore which text to output
Edit: I'll put some more detail in, as asker requested. This is still a kind of skeleton to give you some ideas.
Use the fopen function, something like this (declare a pointer FILE* input_file):
input_file = fopen("c:\\test\\inp.txt", "r"); /* "r" opens inp.txt for reading */
Then, it's good to check that the file was successfully opened, by checking if input_file == NULL.
Then use fscanf to read details from one line of the file. Loop through the lines of the file until you've read the whole thing. You give fscanf pointers to the variables you want it to put the information from each line of the file into. (It's a bit like a printf formatting specifier in reverse).
So, you could declare int range_start, range_end, and char range_name[20]. (To make things simple, let's assume that all the words are at most 19 characters long, which leaves room for the terminating null. This might not be a good plan in the long run though.)
while (!feof(input_file)) { /* check for end-of-file */
/* %19[^;] reads the name up to (not including) the next ';' and cannot overflow range_name */
if (fscanf(input_file, "%d;%d;%19[^;];", &range_start, &range_end, range_name) != 3) {
break; /* Something weird happened on this line (or we hit end-of-file), so let's give up */
} else {
printf("I got the following numbers: %d, %d, %s\n", range_start, range_end, range_name);
}
}
Hopefully that gives you a few ideas. I've tried running this code and it did seem to work. However, worth saying that fscanf has some drawbacks (see e.g. http://mrx.net/c/readfunctions.html), so another approach is to use fgets to get each line (the advantage of fgets is that you get to specify a maximum number of characters to read, so there's no danger of overrunning a string buffer length) and then sscanf to read from the string into your integer variables. I haven't tried this way though.
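The fgets-then-sscanf alternative mentioned above could be sketched as follows. find_range and RANGE_NAME_LEN are hypothetical names, and the 128-byte line buffer is an assumption about the longest line. sscanf also takes care of converting the digit characters to int values.

```c
#include <stdio.h>
#include <string.h>

#define RANGE_NAME_LEN 20    /* assumed maximum label length, terminator included */

/* Scans lines of the form "start;end;NAME;" and, for the first range that
   contains value, copies NAME into name_out (at least RANGE_NAME_LEN bytes).
   Returns 1 if a range matched, 0 otherwise (name_out is then untouched). */
int find_range(FILE *in, int value, char *name_out)
{
    char line[128];
    char name[RANGE_NAME_LEN];
    int start, end;

    while (fgets(line, sizeof line, in) != NULL) {
        /* %19[^;] reads up to 19 characters, stopping before the next ';' */
        if (sscanf(line, "%d;%d;%19[^;]", &start, &end, name) != 3)
            continue;                /* skip malformed lines */
        if (value >= start && value <= end) {
            strcpy(name_out, name);
            return 1;
        }
    }
    return 0;
}
```

The caller would open inp.txt with fopen, read the user's number, call find_range, and on a match write name_out to out.txt with fprintf.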

Read file in array line by line

Can you set any index of an array as the starting point for reading from a file? I was afraid the buffer might get corrupted in the process.
#include <stdio.h>
int main()
{
FILE *f = fopen("C:\\dummy.txt", "rt");
char lines[30]; //large enough array depending on file size
fpos_t index = 0;
while(fgets(&lines[index], 10, f)) //line limit is 10 characters
{
fgetpos (f, &index );
}
fclose(f);
}
You can, but since your code is trying to read the full contents of the file, you can do that much more directly with fread:
char lines[30];
// Will read as much of the file as can fit into lines:
fread(lines, sizeof(*lines), sizeof(lines) / sizeof(*lines), f);
That said, if you really wanted to read line by line and do it safely, you should change your fgets line to:
// As long as index < sizeof(lines), guaranteed not to overflow buffer
fgets(&lines[index], sizeof(lines) - index, f);
Not like this no. There is a function called fseek that will take you to a different location in the file.
Your code will read the file into a different part of the buffer (rather than reading a different part of the file).
lines[index] is the index'th character of the array lines. Its address is not the index'th line.
If you want to skip to a particular line, say 5, then in order to read the 5th line, read 4 lines and do nothing with them, then read the next line and do something with it.
If you need to skip to a particular BYTE within a file, then what you want to use is fseek().
Also: be careful that the number of bytes you tell fgets to read (10) never exceeds the space left in the array you are putting the line into (30 bytes in total, minus the current offset). Right now nothing enforces that.
If you need to read a part of a line starting from a certain character within that line, you still need to read the whole line, then just choose to use a chunk of it starting someplace other than the beginning.
Both of these examples are like requesting a part of a document from a website or a library - they're not going to tear out a page for you, you get the whole document, and you have to flip to what you want.
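The "read and discard the earlier lines" idea can be sketched as below. read_line_number is a made-up helper name, lines are counted from 1, and the sketch assumes every line fits in the caller's buffer (a longer line would be split across several fgets calls and throw the count off).

```c
#include <stdio.h>

/* Copies line number target (1-based, counted from the current file
   position) into buf. Returns 1 on success, 0 if target is invalid or
   the file has fewer lines. */
int read_line_number(FILE *fp, int target, char *buf, size_t size)
{
    if (target < 1)
        return 0;

    for (int current = 1; current <= target; current++) {
        /* each earlier line is simply overwritten by the next fgets */
        if (fgets(buf, (int)size, fp) == NULL)
            return 0;        /* ran out of lines before reaching target */
    }
    return 1;
}
```

On success buf holds the requested line, including its trailing newline if there was one.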

Resources