Can you set any index of an array as the starting index, i.e. where to read from the file? I was afraid the buffer might get corrupted in the process.
#include <stdio.h>
int main()
{
FILE *f = fopen("C:\\dummy.txt", "rt");
char lines[30]; //large enough array depending on file size
fpos_t index = 0;
while(fgets(&lines[index], 10, f)) //line limit is 10 characters
{
fgetpos (f, &index );
}
fclose(f);
}
You can, but since your code is trying to read the full contents of the file, you can do that much more directly with fread:
char lines[30];
// Will read as much of the file as can fit into lines:
fread(lines, sizeof(*lines), sizeof(lines) / sizeof(*lines), f);
That said, if you really wanted to read line by line and do it safely, you should change your fgets line to:
// As long as index < sizeof(lines), guaranteed not to overflow buffer
fgets(&lines[index], sizeof(lines) - index, f);
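For the line-by-line variant, here is a minimal sketch of how the loop could look. It assumes a plain size_t write offset instead of fpos_t (fpos_t is an opaque type and is not meant to be used as an array index), and the helper name read_lines_into is made up for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads lines from f back to back into buf (capacity
   cap), using a plain size_t offset as the write index. Returns the number
   of bytes stored, not counting the terminating '\0'. */
size_t read_lines_into(FILE *f, char *buf, size_t cap)
{
    assert(f != NULL && buf != NULL && cap > 0);
    size_t index = 0;
    buf[0] = '\0';
    /* fgets needs room for at least one character plus the '\0',
       so stop as soon as only one byte is left. */
    while (cap - index > 1 && fgets(&buf[index], (int)(cap - index), f)) {
        index += strlen(&buf[index]);   /* advance past what was just read */
    }
    return index;
}
```

Called as read_lines_into(f, lines, sizeof lines), this stops once the array is full rather than writing past its end.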
Not like this no. There is a function called fseek that will take you to a different location in the file.
Your code will read the file into a different part of the buffer (rather than reading a different part of the file).
lines[index] is the index'th character of the array lines. Its address is not the index'th line.
If you want to skip to a particular line, say 5, then in order to read the 5th line, read 4 lines and do nothing with them, then read the next line and do something with it.
If you need to skip to a particular BYTE within a file, then what you want to use is fseek().
Also: be careful that the number of bytes you tell fgets to read (10) never exceeds the space left in the array you are putting the line into (30 bytes in total). Your loop does not guarantee this right now.
If you need to read a part of a line starting from a certain character within that line, you still need to read the whole line, then just choose to use a chunk of it starting someplace other than the beginning.
Both of these examples are like requesting a part of a document from a website or a library - they're not going to tear out a page for you, you get the whole document, and you have to flip to what you want.
This would be enough to read the first character 'a' inside fp
file.txt
abcdef
// readchar.c
FILE *fp = fopen("file.txt", "r");
int c = fgetc(fp);
But how can I read (e.g.) the 3rd character?
A text file is a kind of sequential file. To read a specific item, you must read everything preceding it in the file.
Operating systems let you read a file anywhere by moving a so-called file pointer. The position in the file where reads take place can be changed by seeking within the file. There are several ways to handle file I/O. The one you found uses fopen to open the file and get a file pointer, and fgetc to read the next character where the file pointer points and advance it by one character. You have fgets to read a complete line, fseek to move the file pointer somewhere else, fclose to close the file, and other similar functions.
Back to the text file. Assume we have a file containing two lines:
Hello world,
Programming rocks!
If you want to read the 5th character of the first line, it is easy: just position the file pointer with fseek at offset 4 in the file (the first character is at offset zero). Then read it with fgetc.
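A minimal sketch of that seek-then-read step. The helper name char_at_offset is made up for illustration, and it assumes offsets are zero-based byte offsets, so the 5th character sits at offset 4. On systems that translate line endings in text mode, arbitrary offsets are only dependable when the file is opened in binary mode.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: returns the character at zero-based byte offset
   `offset` in f, or EOF if the seek fails or the offset is past the end. */
int char_at_offset(FILE *f, long offset)
{
    assert(f != NULL && offset >= 0);
    if (fseek(f, offset, SEEK_SET) != 0)
        return EOF;
    return fgetc(f);
}
```

With the two-line file above, char_at_offset(fp, 4) would give the 'o' of "Hello".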
Now if you need to read the 5th character of the second line, whatever the first line is, you cannot use fseek because you don't know the length of the first line without reading the line first.
To read the Nth character on the Mth line, you must read M lines, throwing away all of them except the last one (you simply read every line in a for loop into the same buffer). Then access the Nth character in the buffer, which now holds the last line read. Make that buffer an array of char and you have direct access to the Nth character.
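The read-M-lines-then-index idea could be sketched like this. The name char_at_line_col is hypothetical, M and N are treated as 1-based, and the sketch assumes no line is longer than 1023 bytes.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns the n-th character (1-based) of the m-th
   line (1-based) of f, or EOF if that line or column does not exist.
   Assumes no line is longer than 1023 bytes. */
int char_at_line_col(FILE *f, int m, int n)
{
    assert(f != NULL && m >= 1 && n >= 1);
    char buf[1024];
    rewind(f);
    /* Read m lines into the same buffer; only the last one survives. */
    for (int line = 1; line <= m; line++) {
        if (fgets(buf, sizeof buf, f) == NULL)
            return EOF;                  /* fewer than m lines */
    }
    size_t len = strcspn(buf, "\n");     /* line length without the newline */
    if ((size_t)n > len)
        return EOF;
    return (unsigned char)buf[n - 1];
}
```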
I'm building a program that reads in a file and then stores each line in an array for manipulation. The input file has a single string on each line, and I want to store each read word in its own slot in a single array. This in an example input file:
This
is
a
test
file
I'm trying to use this with the kernel level read command. This is what I got:
const int recordSize = 1024;
char buffer [recordSize];
int n = 0;
char word[10][50];
while ((n = read(fd_in, buffer, recordSize)) > 0) {
sscanf(buffer,"%s\n%s",word[0],word[1]);
}
The file is read in and stored in buffer. Then I want to put each line into the word array. I made it to hold 10 words of 50 characters length. The purpose of doing something like this is so that I can do something like, change word[0] in one way and alter word[3] in another way.
What I tried is using sscanf. The only issue is that in order for it to know to read on to the next line, I need to use \n and another %s. Since I don't know how long the input file is, this isn't a viable solution.
Right now I'm stuck on how to nondeterministically read line 1, store it in array slot 0, and move on to the next line, repeating for line 2 and slot 1, etc.
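One way to fill the slots without knowing the number of lines in advance is to walk the buffer and copy each newline-terminated piece into the next slot. The sketch below is only an illustration: split_lines and WORD_LEN are made-up names, and it assumes the whole file arrived in a single read() call (a line split across two reads would end up in two slots).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define WORD_LEN 50   /* slot size, matching char word[10][50] */

/* Hypothetical helper: copies each newline-separated line of buf[0..len)
   into words[i]. Lines longer than WORD_LEN - 1 bytes are truncated and
   empty lines are skipped. Returns the number of slots filled. */
int split_lines(const char *buf, size_t len, char words[][WORD_LEN], int max_words)
{
    assert(buf != NULL && words != NULL);
    int count = 0;
    size_t start = 0;
    for (size_t i = 0; i <= len && count < max_words; i++) {
        if (i == len || buf[i] == '\n') {   /* end of a line or of the data */
            size_t n = i - start;
            if (n > 0) {
                if (n > WORD_LEN - 1)
                    n = WORD_LEN - 1;
                memcpy(words[count], buf + start, n);
                words[count][n] = '\0';
                count++;
            }
            start = i + 1;
        }
    }
    return count;
}
```

After n = read(fd_in, buffer, recordSize), a call such as split_lines(buffer, (size_t)n, word, 10) would leave "This" in word[0], "is" in word[1], and so on.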
I am writing an academic project in C and I can use only <fcntl.h> and <unistd.h> libraries to file operations.
I have the function to read file line by line. The algorithm is:
Set pointer at the beginning of the file and get current position.
Read data to the buffer (char buf[100]) with constant size, iterate character by character and detect end of line '\n'.
Increment current position: curr_pos = curr_pos + length_of_read_line;
Set pointer to current position using lseek(fd, current_position, SEEK_SET);
SEEK_SET - set pointer to given offset from the beginning of the file. In my pseudo code current_position is the offset.
And actually it works fine, but I always move the pointer starting at the beginning of the file - I use SEEK_SET - it isn't optimized.
lseek also accepts the argument SEEK_CUR, which means the current position. How can I move the pointer back relative to its current position (SEEK_CUR)? I tried to pass a negative offset, but it didn't work.
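A negative offset with SEEK_CUR does work as long as the value really is a signed off_t. A common pitfall is negating an unsigned size_t, which produces a huge positive number instead. The following is a rough sketch under that assumption; read_line_fd and count_lines are hypothetical names, and lines longer than the buffer are simply returned in pieces.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: reads one line from fd into out (capacity cap),
   without the '\n'. Returns the line length, or -1 at end of file or on
   error. Lines longer than cap - 1 bytes are returned in pieces. */
ssize_t read_line_fd(int fd, char *out, size_t cap)
{
    assert(out != NULL && cap > 1);
    ssize_t got = read(fd, out, cap - 1);
    if (got <= 0)
        return -1;
    ssize_t used = 0;
    while (used < got && out[used] != '\n')
        used++;
    /* Bytes belonging to this line, including its '\n' if one was found. */
    ssize_t consumed = (used < got) ? used + 1 : used;
    /* Step back over the bytes read past the end of this line.
       consumed - got is a signed, negative value here. */
    if (consumed < got && lseek(fd, (off_t)(consumed - got), SEEK_CUR) == (off_t)-1)
        return -1;
    out[used] = '\0';
    return used;
}

/* Hypothetical usage: counts the lines of the file at path, or returns -1
   if it cannot be opened. */
int count_lines(const char *path)
{
    char line[100];
    int count = 0;
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    while (read_line_fd(fd, line, sizeof line) >= 0)
        count++;
    close(fd);
    return count;
}
```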
The most efficient way to read lines of data from a file is typically to read a large chunk of data that may span multiple lines, process lines of data from the chunk until one reaches the end, move any partial line from the end of the buffer to the start, and then read another chunk of data. Depending upon the target system and task to be performed, it may be better to read enough to fill whatever space remains after the partial line, or it may be better to always read a power-of-two number of bytes and make the buffer large enough to accommodate a chunk that size plus a maximum-length partial line (left over from the previous read). The one difficulty with this approach is that all data must be read from the stream using the same buffer. In cases where that is practical, however, it will often allow better performance than using many separate calls to fread, and may be nicer than using fgets.
While it should be possible for a standard-library function to facilitate line input, the design of fgets is rather needlessly hostile since it provides no convenient indication of how much data it has read. After reading each line, code that wants a string containing the printable portion will have to use strlen to try to ascertain how much data was read (hopefully the input won't contain any zero bytes) and then check the byte before the trailing zero to see if it's a newline. Not impossible, but awkward at the very least. If the fread-and-buffer approach will satisfy an application's needs, it's likely to be at least as efficient as using fgets, if not moreso, and since the effort required to use fgets() robustly will be comparable to that required to use a buffering approach, one may as well use the latter.
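A rough sketch of the chunk-and-carry-over approach, using the second variant described above (a fixed power-of-two chunk plus room for one maximum-length partial line). The names for_each_line, line_fn and add_length, and the limits CHUNK and MAX_LINE, are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define CHUNK 4096       /* bytes requested per fread (assumed) */
#define MAX_LINE 1024    /* assumed upper bound on a line's length */

typedef void (*line_fn)(const char *line, size_t len, void *ctx);

/* Hypothetical helper: calls fn once per line (without the '\n'; the
   line is NOT '\0'-terminated). Returns the number of lines, or -1 if a
   partial line grows beyond MAX_LINE bytes. */
long for_each_line(FILE *f, line_fn fn, void *ctx)
{
    assert(f != NULL && fn != NULL);
    char buf[MAX_LINE + CHUNK];
    size_t have = 0;     /* unprocessed bytes at the start of buf */
    long lines = 0;
    for (;;) {
        size_t got = fread(buf + have, 1, CHUNK, f);
        have += got;
        size_t start = 0;
        char *nl;
        while ((nl = memchr(buf + start, '\n', have - start)) != NULL) {
            size_t len = (size_t)(nl - (buf + start));
            fn(buf + start, len, ctx);
            lines++;
            start += len + 1;
        }
        /* Move the partial line left at the end to the front. */
        have -= start;
        memmove(buf, buf + start, have);
        if (got == 0) {              /* end of file or read error */
            if (have > 0) {          /* last line had no newline */
                fn(buf, have, ctx);
                lines++;
            }
            return lines;
        }
        if (have > MAX_LINE)
            return -1;
    }
}

/* Example callback: adds each line's length to the size_t ctx points to. */
void add_length(const char *line, size_t len, void *ctx)
{
    (void)line;
    *(size_t *)ctx += len;
}
```

Because the callback receives an explicit length, it never has to call strlen or hunt for the newline, which is the awkwardness with fgets described above.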
Since your question is tagged as posix, I would go with getline(), without having to manually take care of moving the file pointer.
Example:
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
FILE* fp;
char* line = NULL;
size_t len = 0;
ssize_t read;
fp = fopen("input.txt", "r");
if(fp == NULL)
return -1;
while((read = getline(&line, &len, fp)) != -1)
{
printf("Read line of length %zd:\n", read);
printf("%s", line);
}
fclose(fp);
if(line)
free(line);
return 0;
}
Output with custom input:
Read line of length 11:
first line
Read line of length 12:
second line
Read line of length 11:
third line
I have a file that I'd like to iterate through without processing the current line in any way. What I am looking for is the best way to go to a given line of a text file. For example, storing the current line in a variable seems useless until I get to the target line.
Example :
file.txt
foo
fooo
fo
here
Normally, in order to get here, I would have done something like :
FILE* file = fopen("file.txt", "r");
if (file == NULL)
perror("Error when opening file ");
char currentLine[100];
while(fgets(currentLine, 100, file))
{
if(strstr(currentLine, "here") != NULL)
return currentLine;
}
But fgets will have to read three full lines uselessly, and currentLine will have to store foo, fooo and fo.
Is there a better way to do this, knowing that here is line 4? Something like a goto, but for files?
Since you do not know the length of every line, no, you will have to go through the previous lines.
If you knew the length of every line, you could probably play with how many bytes to move the file pointer. You could do that with fseek().
You cannot directly access a given line of a text file (unless all lines have the same size in bytes; and with UTF-8 everywhere, a Unicode character can take a variable number of bytes, 1 to 4; and in most cases line lengths vary from one line to the next). So you cannot use fseek (because you don't know the file offset in advance).
However (at least on Linux systems), lines are ending with \n (the newline character). So you could read byte by byte and count them:
int c= EOF;
int linecount=1;
while ((c=fgetc(file)) != EOF) {
if (c=='\n')
linecount++;
}
You then don't need to store the entire line.
So you could reach line #45 this way (using while (linecount<45 && (c=fgetc(file)) != EOF) ..., with the line count tested first so that no character of line 45 is consumed) and only then read entire lines with fgets or better yet getline(3) on POSIX systems. Notice that the implementation of fgets or of getline is likely to be built above fgetc, or at least share some code with it. Remember that <stdio.h> is buffered I/O; see setvbuf(3) and related functions.
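Wrapped into a function, the counting loop might look like the sketch below. The name skip_to_line is made up, lines are numbered from 1, and the function rewinds the stream first, which is an assumption of this example rather than something the approach requires.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: advances f from its start to the beginning of line
   `target` (1-based) by consuming characters and counting newlines.
   Returns 1 on success, 0 if the file ends before that line starts. */
int skip_to_line(FILE *f, int target)
{
    assert(f != NULL && target >= 1);
    int c;
    int linecount = 1;
    rewind(f);
    /* Test the count first so no character of the target line is consumed. */
    while (linecount < target && (c = fgetc(f)) != EOF) {
        if (c == '\n')
            linecount++;
    }
    return linecount == target;
}
```

After skip_to_line(file, 4) succeeds, a single fgets or getline call reads the wanted line, and nothing from the earlier lines was ever stored.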
Another way would be to read the file in two passes. A first pass stores the offset (using ftell(3)...) of every line start in some efficient data structure (a vector, a hashtable, a tree...). A second pass uses that data structure to retrieve the offset (of the line start), then uses fseek(3) (with that offset).
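A sketch of the two-pass idea, using a growable array of long offsets as the data structure. Both function names are hypothetical, and line numbers are 0-based here for easy array indexing.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* First pass (hypothetical helper): records the ftell() offset of every
   line start. Returns a malloc'd array of *count offsets that the caller
   frees, or NULL if allocation fails. */
long *build_line_index(FILE *f, size_t *count)
{
    assert(f != NULL && count != NULL);
    size_t cap = 16, n = 0;
    long *offsets = malloc(cap * sizeof *offsets);
    if (offsets == NULL)
        return NULL;
    rewind(f);
    int at_line_start = 1;
    for (;;) {
        long pos = at_line_start ? ftell(f) : 0L;
        int c = fgetc(f);
        if (c == EOF)
            break;
        if (at_line_start) {
            if (n == cap) {              /* grow the array by doubling */
                cap *= 2;
                long *tmp = realloc(offsets, cap * sizeof *offsets);
                if (tmp == NULL) {
                    free(offsets);
                    return NULL;
                }
                offsets = tmp;
            }
            offsets[n++] = pos;
        }
        at_line_start = (c == '\n');
    }
    *count = n;
    return offsets;
}

/* Second pass (hypothetical helper): seeks to line `lineno` (0-based) and
   reads it into buf. Returns buf, or NULL if there is no such line. */
char *read_indexed_line(FILE *f, const long *offsets, size_t count,
                        size_t lineno, char *buf, int cap)
{
    if (lineno >= count || fseek(f, offsets[lineno], SEEK_SET) != 0)
        return NULL;
    return fgets(buf, cap, f);
}
```

The index costs one full pass over the file, so it tends to pay off only when many lines are looked up afterwards.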
A third way, POSIX specific, would be to memory-map the file using mmap(2) into your virtual address space (this works well for not too huge files, e.g. of less than a few gigabytes). With care (you might need to mmap an extra ending page, to ensure the data is zero-byte terminated) you would then be able to use strchr(3) with '\n'
In some cases, you might consider parsing your text file line by line (using fgets, or on Linux getline, as appropriate, or generating your parser with flex and bison) and storing each line in a relational database (such as PostgreSQL or SQLite).
PS. BTW, the notion of lines (and the end-of-line mark) varies from one OS to the next. On Linux the end-of-line is a \n character. On Windows, lines end with \r\n, etc...
A FILE * in C is a stream of chars. In a seekable file, you can address these chars using the file pointer with fseek(). But apart from that, there are no "special characters" in files, a newline is just another normal character.
So in short, no, you can't jump directly to a line of a text file, as long as you don't know the lengths of the lines in advance.
This model in C corresponds to the files provided by typical operating systems. If you think about it, to know the starting points of individual lines, your file system would have to store this information somewhere. This would mean treating text files specially.
What you can do however is just count the lines instead of pattern matching, something like this:
#include <stdio.h>
int main(void)
{
    char linebuf[1024];
    FILE *input = fopen("seekline.c", "r");
    int lineno = 0;
    char *line;
    if (input == NULL)
    {
        perror("seekline.c");
        return 1;
    }
    /* count lines as they are read and stop at the fourth one */
    while ((line = fgets(linebuf, sizeof linebuf, input)) != NULL)
    {
        ++lineno;
        if (lineno == 4)
        {
            fputs("4: ", stdout);
            fputs(line, stdout);
            break;
        }
    }
    fclose(input);
    return 0;
}
If you don't know the length of each line, you have to go through all of them. But if you know the line you want to stop at, you can do this:
while (!found && fgets(line, sizeof line, file) != NULL) /* read a line */
{
if (count == lineNumber)
{
//you arrived at the line
//in case of a return first close the file with "fclose(file);"
found = true;
}
else
{
count++;
}
}
At least you can avoid so many calls to strstr.
All right: So I have a file, and I must do things with it. Oversimplifying, the file has this format:
n
first name
second name
...
nth name
random name
do x⁽¹⁾, y⁽¹⁾ and z⁽¹⁾
random name
do x⁽²⁾, y⁽²⁾, z⁽²⁾
...
random name
do x⁽ⁿ⁾, y⁽ⁿ⁾, z⁽ⁿ⁾
So, the actual details are not important.
The problem is: I'll have to declare a variable n, I have an array name[MAX], and I'll fill this array with the names, from name[0] to name[n-1].
Alright, the problem is: How can I get this input, if I don't know previously how many names do I have?
For example, I could do it just fine if that was user input from the keyboard. I would do it like this:
int n; char name[MAX][50]; /* MAX names of up to 49 characters each */
scanf( "%d", &n);
int i; for (i = 0; i < n && i < MAX; i++)
scanf( "%49s", name[i]);
And I could go on, do the whole code, but you get the point. But, my input now comes from a file. I don't know how can I get the input, all I can do is to fscanf() the whole file, but since I don't know its size (the first number will determine it), I can't do it. As far as I know (please correct me if that's not true, I am very new to this), we can't use the command "for" and get the numbers gradually as if that was coming from the keyboard, right?
So, the only exit I see is to find a way to read a particular line from the file. If I can do this, the rest is easy. The thing is, how can I do that?
I google'd it, I even found some questions in there, though it didn't make any sense at all. Apparently, reading a particular line from a file is really complicated.
This is from a beginner problem set, so I doubt it is something that complicated. I must be missing something very simple, though I just don't know what it is.
So, the question is: How would you do it, for instance?
How to scan the first number n from the file, and then, scan the others 'n' names, assigning each one to an element in an array (first name = name[0], last name = name[n - 1])?
I would suggest looking into End Of File.
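char name[50];
while (fscanf(fp, "%49s", name) == 1)
{
...code...
}
Mind you my C knowledge is rusty, but this should get you started.
The read functions signal end of file through their return value (fscanf and fgetc return EOF, which is a negative value such as -1), so that's what you compare against. Standard C has no eof() function, and testing feof() before reading runs the loop body once too often. Here fp is the FILE * of the file you are reading.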
Then after parse of text or count of lines you have your 'n'.
EDIT: Since I'm obviously more tired than I thought (didn't notice your 'n' at the top):
Read first line
malloc for 'n' size array
for loop to iterate names.
Here you go. I leave compiling and debugging as an exercise for the student.
The idea is to slurp the whole file into a single array if your files are always small.
This is so much more efficient than scanf().
char buf[100000], *bp, *N[1000]; // plenty big
// fd is an open FILE *; fread (unlike fgets) does not stop at the first newline
size_t len = fread( buf, 1, sizeof buf - 1, fd );
if ( len > 0 )
{
    int n = 0;
    if ( len == sizeof buf - 1 && fgetc( fd ) != EOF )
    { // file too long for buffer
        fprintf( stderr, "trouble: file too large: %d\n", (int)(sizeof buf));
        exit(EXIT_FAILURE);
    }
    buf[len] = '\0';
    // now replace each \n with a \0, remembering where each line is.
    for ( bp = buf; *bp != '\0' && n < 1000; )
    {
        N[n++] = bp;
        char *nl = strchr( bp, '\n' );
        if ( nl == NULL )
            break;
        *nl = '\0';
        bp = nl + 1;
    }
}
If you want to read files of any size, you need to read the file in chunks, calloc()ing each chunk before a read, carefully handling the line fragment left at the end of the current buffer by moving it to the next buffer, and then properly continuing your reads.
Unless you have a limit on how many lines you can read, N may also need to be set up in chunks, but this time realloc() might be your friend.
Since the given format seems to imply that the number of names n is given as the first entry in the file, it would be possible to use the style of reading that the OP describes when reading from stdin. Use fscanf to read the first integer from the file (n), then use malloc to allocate the array(s) for the names, then use a for loop up to n to read the names.
However, I am unsure of the meaning of the example data following that with the do x⁽¹⁾, y⁽¹⁾ and z⁽¹⁾ format. Perhaps I am not understanding part of the question. If it means there are potentially more than n names, then you can use realloc to grow the size of the array. One way of growing the array that is not uncommon is to double the length each time.
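The read-the-count-first approach could be sketched as below. This is only an illustration: read_names, name_t and NAME_LEN are made-up names, and it assumes each name sits on its own line and is shorter than 50 bytes.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define NAME_LEN 50   /* assumed maximum name length, including the '\0' */

typedef char name_t[NAME_LEN];

/* Hypothetical helper: reads the count n from the start of f, then n
   one-line names. Returns a malloc'd array of n names that the caller
   frees and stores n in *n_out, or returns NULL on malformed input or
   allocation failure. */
name_t *read_names(FILE *f, int *n_out)
{
    assert(f != NULL && n_out != NULL);
    int n;
    if (fscanf(f, "%d", &n) != 1 || n <= 0)
        return NULL;
    name_t *names = malloc((size_t)n * sizeof *names);
    if (names == NULL)
        return NULL;
    for (int i = 0; i < n; i++) {
        /* " %49[^\n]" skips leading whitespace, then reads up to 49
           characters of one line; 49 must stay in step with NAME_LEN. */
        if (fscanf(f, " %49[^\n]", names[i]) != 1) {
            free(names);
            return NULL;
        }
    }
    *n_out = n;
    return names;
}
```

The stream is left positioned right after the nth name, so whatever follows in the file can be read with further fscanf or fgets calls.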