sscanf On A Multi-Line File - c

I'm building a program that reads in a file and then stores each line in an array for manipulation. The input file has a single string on each line, and I want to store each word read in its own slot in a single array. This is an example input file:
This
is
a
test
file
I'm trying to use this with the kernel level read command. This is what I got:
const int recordSize = 1024;
char buffer [recordSize];
int n = 0;
char word[10][50];
while ((n = read(fd_in, buffer, recordSize)) > 0) {
sscanf(buffer,"%s\n%s",word[0],word[1]);
}
The file is read in and stored in buffer. Then I want to put each line into the word array. I made it to hold 10 words of 50 characters length. The purpose of doing something like this is so that I can do something like, change word[0] in one way and alter word[3] in another way.
What I tried is using sscanf. The only issue is that in order for it to know to read on to the next line, I need to use \n and another %s. Since I don't know how long the input file is, this isn't a viable solution.
Right now I'm stuck on how to read line 1, store it in array slot 0, and move on to the next line, repeating for line 2 and slot 1, etc., without knowing the number of lines in advance.
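One way to approach this is to walk the buffer once and copy each newline-terminated piece into the next slot, instead of asking sscanf to match a fixed number of %s fields. The following is only a minimal sketch: split_lines, MAX_WORDS and WORD_LEN are hypothetical names chosen to match word[10][50] above, and it assumes the whole file arrives in a single read() call.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_WORDS 10   /* matches word[10][50] in the question */
#define WORD_LEN  50

/* Copy each newline-separated line of buf[0..len) into words[i].
 * Returns the number of lines stored. Lines longer than WORD_LEN - 1
 * characters are truncated; lines beyond max_words are ignored. */
size_t split_lines(const char *buf, size_t len,
                   char words[][WORD_LEN], size_t max_words)
{
    assert(buf != NULL && words != NULL);
    size_t count = 0;
    size_t start = 0;

    for (size_t i = 0; i <= len && count < max_words; i++) {
        /* A line ends at '\n' or at the end of the data. */
        if (i == len || buf[i] == '\n') {
            size_t n = i - start;
            if (n > 0) {                       /* skip empty lines */
                if (n > WORD_LEN - 1)
                    n = WORD_LEN - 1;
                memcpy(words[count], buf + start, n);
                words[count][n] = '\0';
                count++;
            }
            start = i + 1;
        }
    }
    return count;
}
```

It would be called as split_lines(buffer, (size_t)n, word, MAX_WORDS) right after a successful read(). If the file is larger than recordSize, a line that straddles two read() calls would come out as two pieces, so that case needs extra handling.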

Related

Parsing words in C; Translating program

I'm developing a program that will translate a string from the user (English) into Spanish.
For the assignment I'm given a file that contains a list of 100 words and their Spanish equivalents. I've successfully opened that file and read it into a two-dimensional array of strings.
What I'm having difficulty with is parsing the words so that I can find the equivalent version of the given words; any words that aren't in the list are supposed to be replaced with asterisks (*). Any ideas on how I can parse the words from the user's input string?
Below are snippets of the source code to save some time.
--Thanks
char readFile[100][25];
fp = fopen("words.dat", "r");
if (fp == NULL){
printf ("File failed to load\n");
}
//This is how I stored the file into the two dimensional string.
while (fgets(readFile, 100, fp)){
x++;
}
printf ("User please input string\n");
gets (input);
That's as far as I've gotten. I commented out the for-loop that outputs the words so I can see the words (for the sake of curiosity) and it was successful. The format of the file string is
(english word), (spanish word).
First off, the array you declare is 100 arrays of 25-character arrays. If we talk about "lines" it means you have 100 lines where each line can be 24 characters (remember we need one extra for the terminating '\0' character). If you want 25 lines of 99 characters each, swap the two sizes.
Secondly, you overwrite the same bytes of the array over and over again. And since each sub-array is actually only 25 characters, you can overwrite up to four of those arrays with that fgets call.
I suggest something like this instead:
size_t count = 0;
for (size_t i = 0; i < sizeof(readFile) / sizeof(readFile[0]) &&
                   fgets(readFile[i], sizeof(readFile[i]), fp); i++, count++)
{
    /* empty body: the fgets call in the loop condition does the work */
}
This will make sure you don't read more than you can store, and automatically reads into the correct "line" in the array. After the loop count will contain the number of lines you read.
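For the parsing part of the question, one option is strtok, which splits the user's string into words in place. Below is a rough sketch rather than a finished solution: struct entry, lookup and translate are hypothetical names, the dictionary is assumed to be already split into English/Spanish pairs, and replacing each unknown word with a single "*" is my guess at what the assignment wants.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical dictionary entry: one "(english word), (spanish word)" pair. */
struct entry {
    const char *english;
    const char *spanish;
};

/* Return the Spanish word for `word`, or NULL if it is not in the table. */
const char *lookup(const struct entry *dict, size_t n, const char *word)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(dict[i].english, word) == 0)
            return dict[i].spanish;
    return NULL;
}

/* Split `input` on whitespace (strtok modifies it in place) and write the
 * translation of each word to `out`, separated by single spaces.
 * Unknown words become "*" (an assumption about the assignment). */
void translate(char *input, const struct entry *dict, size_t n,
               char *out, size_t outsize)
{
    assert(input != NULL && out != NULL && outsize > 0);
    size_t used = 0;
    out[0] = '\0';

    for (char *tok = strtok(input, " \t\n"); tok != NULL;
         tok = strtok(NULL, " \t\n")) {
        const char *es = lookup(dict, n, tok);
        if (es == NULL)
            es = "*";
        int w = snprintf(out + used, outsize - used, "%s%s",
                         used ? " " : "", es);
        if (w < 0 || (size_t)w >= outsize - used)
            break;              /* output buffer full: stop here */
        used += (size_t)w;
    }
}
```

The comparison is case-sensitive and does not strip punctuation, so "Hello," would not match "hello"; whether that matters depends on the assignment.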

How can I read a specific line from a file, in C?

All right: So I have a file, and I must do things with it. Oversimplifying, the file has this format:
n
first name
second name
...
nth name
random name
do x⁽¹⁾, y⁽¹⁾ and z⁽¹⁾
random name
do x⁽²⁾, y⁽²⁾, z⁽²⁾
...
random name
do x⁽ⁿ⁾, y⁽ⁿ⁾, z⁽ⁿ⁾
So, the actual details are not important.
The problem is: I'll have to declare a variable n, I have an array name[MAX], and I'll fill this array with the names, from name[0] to name[n-1].
Alright, the problem is: how can I get this input if I don't know in advance how many names I have?
For example, I could do it just fine if that was user input from the keyboard. I would do it like this:
int n; char name[MAX];
scanf( "%d", &n);
int i; for (i = 0; i < n; i++)
scanf( "%s", &N[i]);
And I could go on and write the whole code, but you get the point. But my input now comes from a file. I don't know how to get the input; all I can do is fscanf() the whole file, but since I don't know its size (the first number will determine it), I can't do it. As far as I know (please correct me if that's not true, I am very new to this), we can't use a "for" loop and get the values gradually as if they were coming from the keyboard, right?
So, the only way out I see is to find a way to read a particular line from the file. If I can do this, the rest is easy. The thing is, how can I do that?
I googled it and even found some questions about it, though they didn't make any sense to me. Apparently, reading a particular line from a file is really complicated.
This is from a beginner problem set, so I doubt it is something that complicated. I must be missing something very simple, though I just don't know what it is.
So, the question is: How would you do it, for instance?
How do I scan the first number n from the file, and then scan the other n names, assigning each one to an element in an array (first name = name[0], last name = name[n - 1])?
I would suggest looking into End Of File.
while (fgets(line, sizeof line, fp) != NULL)
{
...code...
}
Mind you my C knowledge is rusty, but this should get you started.
IIRC fgets returns NULL once it hits end of file, so that's why you need to compare it to something. Here fp is the FILE pointer of the file you are reading and line is a char array big enough for one line.
Then after parse of text or count of lines you have your 'n'.
EDIT: Since I'm obviously more tired than I thought (didn't notice your 'n' at the top):
Read first line
malloc for 'n' size array
for loop to iterate names.
Here you go. I leave compiling and debugging as an exercise for the student.
The idea is to slurp the whole file into a single array if your files are always small.
This is so much more efficient than scanf().
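char buf[100000], *N[1000]; // plenty big
memset( buf, '\0', sizeof buf );
// fread() slurps the whole file; fgets() would stop at the first newline
if ( fread( buf, 1, sizeof buf - 1, fd ) > 0 )
{
    int n = 0;
    char *bp;
    if ( buf[(sizeof buf)-2] != '\0' )
    { // file too long for buffer
        fprintf( stderr, "trouble: file too large: %d\n", (int)(sizeof buf));
        exit(EXIT_FAILURE);
    }
    // now replace each \n with a \0, remembering where each line is.
    N[n++] = buf;
    for ( bp = strchr( buf, '\n' ); bp != NULL && n < 1000; bp = strchr( bp, '\n' ) )
    {
        *bp++ = '\0';        // terminate the current line
        if ( *bp != '\0' )
            N[n++] = bp;     // next line starts right after the old \n
    }
}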
If you want to read files of any size you need to read the file in chunks, calloc()ing each chunk before a read, carefully handling the line fragment left at the end of the current buffer by moving it to the next buffer, and then properly continuing your reads.
Unless you have a limit on how many lines you can read, N may also need to be set up in chunks, but this time realloc() might be your friend.
Since the given format seems to imply that the number of names n is given as the first entry in the file, it would be possible to use the style of reading that the OP describes when reading from stdin. Use fscanf to read the first integer from the file (n), then use malloc to allocate the array(s) for the names, then use a for loop up to n to read the names.
However, I am unsure of the meaning of the example data following that with the do x⁽¹⁾, y⁽¹⁾ and z⁽¹⁾ format. Perhaps I am not understanding part of the question. If it means there are potentially more than n names, then you can use realloc to grow the size of the array. One way of growing the array that is not uncommon is to double the length each time.
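The fscanf, malloc, for-loop approach described above could look roughly like this. It is a sketch under some assumptions: read_names, name_t and NAME_LEN are names I made up, each name is taken to be a single word without spaces (as with the %s in the question), and 63 characters is an arbitrary limit.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define NAME_LEN 64   /* assumed maximum name length, including '\0' */

typedef char name_t[NAME_LEN];

/* Read a count n from the start of fp, then n single-word names.
 * Returns a malloc'ed array of n names (caller frees) and stores n in
 * *count, or returns NULL on a read or allocation failure. */
name_t *read_names(FILE *fp, int *count)
{
    assert(fp != NULL && count != NULL);
    int n;
    if (fscanf(fp, "%d", &n) != 1 || n <= 0)
        return NULL;

    name_t *names = malloc((size_t)n * sizeof *names);
    if (names == NULL)
        return NULL;

    for (int i = 0; i < n; i++) {
        /* %63s skips leading whitespace and stops at the next whitespace */
        if (fscanf(fp, "%63s", names[i]) != 1) {
            free(names);
            return NULL;
        }
    }
    *count = n;
    return names;
}
```

After the call the file position sits just past the nth name, so the rest of the file can be read with further fscanf or fgets calls. No jumping to a particular line is needed.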

array size in fgets()

I'm using fgets to read a line from a .txt file, passing an array as the first argument. Different lines fill different amounts of space in the array, but I want to know the exact length of the line that was read and make a decision based on that. Is it possible?
FILE * old;
old = fopen("m2p1.txt","r");
char third[100];
fgets(third,sizeof(third),old);
Now if I ask for sizeof(third), it's obviously 100 because I declared it so myself (I can't declare the 'third' array without specifying the size), but I need to get the exact size of the line read from the file (as it may not fill the entire array).
Is it possible? What should I do?
If fgets succeeds, it'll read a string into your buffer. Use strlen() to find its length.
char third[100];
if(fgets(third,sizeof(third),old) != NULL) {
size_t len = strlen(third);
..
}
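One detail worth keeping in mind is that fgets stores the trailing '\n' in the buffer, so strlen() counts it too. A small sketch of a helper that strips it before reporting the length (read_line_len is a made-up name, not a library function):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Read one line from fp into buf and strip the trailing newline, if any.
 * Returns the length of the stored line, or -1 at end of file or on error. */
long read_line_len(FILE *fp, char *buf, size_t size)
{
    assert(fp != NULL && buf != NULL && size > 1);
    if (fgets(buf, (int)size, fp) == NULL)
        return -1;

    size_t len = strlen(buf);           /* characters actually read */
    if (len > 0 && buf[len - 1] == '\n')
        buf[--len] = '\0';              /* do not count the newline */
    return (long)len;
}
```

If the line is longer than the buffer, fgets returns only the first size - 1 characters and no newline is found; the rest of the line is picked up by the next call.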

c programming read() and write() content to file

The user should input some file names in the command line and the program will read each file name from argv[] array. I have to perform error checking etc.
I want to read each filename. For example, if argv[2] is 'myfile.txt', the program should read the content of 'myfile.txt' and store value in char buffer[BUFSIZ] and then write the content of buffer into another file.
However before the content is written, the program should also write the name of the file and the size. Such that the file can be easily extracted later. A bit like the tar function.
The file I write the content of buffer, depending on the number of files added by user, should be a string like:
myfile.txt256Thisisfilecontentmyfile2.txt156Thisisfile2content..............
My question is
1) How do I write the value of argv[2] into the file using write()? I'm having problems writing a char array: what should I put as the size (sizeof(?)) inside write(), given that I don't know the length of the file name entered by the user? See below.
2) Do I use '&' to write an integer value into the file after the name, for example to write 4 bytes after the file name for the size of the file?
Here is the code I have written,
char buffer[BUFSIZ];
int numfiles=5; //say this is no of files user entered at command
open(file.....
lseek(fdout, 0, SEEK_SET); //start at beginning of file and move along each file in some for loop
for(i=0-; ......
//for each file write filename,filesize,data....filename,filesize,data......
int bytesread=read(argv[i],buffer,sizeof(buffer));
write(outputfile, argv[i], sizeof(argv)); //write filename size of enough to store value of filename
write(outputfile, &bytesread, sizeof(bytesread));
write(outputfile, buffer, sizeof(buffer));
But the code is not working as I expected.
Any suggestions?
Since argv consists of null-terminated arrays, the length you can write is strlen(argv[2])+1 to write both the argument and null terminator:
size_t sz = strlen (argv[2]);
write (fd, argv[2], sz + 1);
Alternatively, if you want the length followed by the characters, you can write the size_t itself returned from strlen followed by that many characters.
size_t sz = strlen (argv[2]);
write (fd, &sz, sizeof (size_t));
write (fd, argv[2], sz);
You probably also need to write the length of the file as well so that you can locate the next file when reading it back.
1. You can write the string the following way:
size_t size = strlen(string);
write(fd, string, size);
However, most of the time it's not this simple: you will need the size of the string so you'll know how much you need to read. So you should write the string size too.
2. An integer can be written the following way:
write(fd, &integer, sizeof(integer));
This is simple, but if you plan to use the file on different architectures, you'll need to deal with endianness too.
It sounds like your best bet is to use a binary format. In your example, is the file called myfile.txt with a content length of 256, or myfile.txt2 with a content length of 56, or myfile.txt25 with a content length of 6? There's no way to distinguish between the end of the filename and the start of the content length field. Similarly there is no way to distinguish between the end of the content length and the start of the content. If you must use a text format, fixed width fields will help with this. I.e. 32 characters of filename followed by 6 digits of content length. But binary format is more efficient.
You get the filename length using strlen(), don't use sizeof(argv) as you will get completely the wrong result. sizeof(argv[i]) will also give the wrong result.
So write 4 bytes of filename length followed by the filename then 4 bytes of content length followed by the content.
If you want the format to be portable you need to be aware of byte order issues.
Lastly, if the file won't all fit in your buffer then you are stuffed. You need to get the size of the file you are reading to write it to your output file first, and then make sure you read that number of bytes from the first file into the second file. There are various techniques to do this.
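To make the "4 bytes of length, then the data" layout concrete, here is a hedged sketch. put_u32, write_all and write_record are hypothetical helper names, and storing the lengths most-significant-byte first is just one way to sidestep the byte order issue. It assumes the content is already in memory and that both lengths fit in 32 bits.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

/* Store a 32-bit length as 4 bytes, most significant first, so the file
 * reads back the same on any architecture. */
void put_u32(unsigned char out[4], uint32_t v)
{
    out[0] = (unsigned char)(v >> 24);
    out[1] = (unsigned char)(v >> 16);
    out[2] = (unsigned char)(v >> 8);
    out[3] = (unsigned char)v;
}

/* write() may write fewer bytes than asked; loop until all are written.
 * Returns 0 on success, -1 on error. */
int write_all(int fd, const void *data, size_t len)
{
    const unsigned char *p = data;
    while (len > 0) {
        ssize_t w = write(fd, p, len);
        if (w < 0)
            return -1;
        p += w;
        len -= (size_t)w;
    }
    return 0;
}

/* Write one record: name length, name, content length, content. */
int write_record(int fd, const char *name, const void *content, uint32_t clen)
{
    assert(name != NULL && (content != NULL || clen == 0));
    unsigned char hdr[4];
    uint32_t nlen = (uint32_t)strlen(name);

    put_u32(hdr, nlen);
    if (write_all(fd, hdr, 4) < 0 || write_all(fd, name, nlen) < 0)
        return -1;
    put_u32(hdr, clen);
    if (write_all(fd, hdr, 4) < 0 || write_all(fd, content, clen) < 0)
        return -1;
    return 0;
}
```

For files that do not fit in one buffer, the content length would have to be found first (for example with fstat) and the content copied in a loop; that part is left out here.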
Thanks for the replies, guys.
I decided not to use size_t and instead just used (int) and (char) types, so I know the exact number of bytes to read() out. I.e. I know to start at the beginning of the file and read 4 bytes (int) to get the length of the filename, which I use as the size in the next read().
So, when I am writing the user's input file to the output file (copying the file exactly, with the same name), I write it as one long string like this (without the spaces, obviously; they are just to make it readable here):
filenamesize filename filecontentsize filecontent
ie 10 myfile.txt 5 hello
So when it comes to reading that data back, I start at the beginning of the file using lseek(), and I know the first 4 bytes are an (int) holding the length of the filename, so I put that into int namelen using the read function.
My problem is that I want to use the value read for the filename size (first 4 bytes) to declare an array of the right length to store the filename. How do I pass this array to read() so that read() stores the value inside that char array? Please see below:
int namelen; //value read from first 4 bytes of file: length of filename, to go in the next read()
char filename[namelen];
read(fd, filename[namelen], namelen);//filename should have 'myfile.txt' if user entered that filename
So my question is: once I have read the first 4 bytes from the file, giving me the length of the filename stored in namelen, how do I then read namelen bytes to get the filename of the original file, so I can create the copied file inside the directory?
Thanks
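A possible way to do this, sketched under the assumption that the length was written as a native int with write(fd, &len, sizeof len) on the same machine: read the int first, allocate namelen + 1 bytes once the value is known, then pass the array itself to read(). read_all and read_name are hypothetical helper names.

```c
#include <assert.h>
#include <stdlib.h>
#include <unistd.h>

/* read() may return fewer bytes than asked; loop until len bytes arrive.
 * Returns 0 on success, -1 on error or premature end of file. */
int read_all(int fd, void *buf, size_t len)
{
    unsigned char *p = buf;
    while (len > 0) {
        ssize_t r = read(fd, p, len);
        if (r <= 0)
            return -1;
        p += r;
        len -= (size_t)r;
    }
    return 0;
}

/* Read a 4-byte int length followed by that many bytes of file name.
 * Returns a malloc'ed, '\0'-terminated string (caller frees) or NULL. */
char *read_name(int fd)
{
    assert(fd >= 0);
    int namelen;
    if (read_all(fd, &namelen, sizeof namelen) < 0 || namelen < 0)
        return NULL;

    char *filename = malloc((size_t)namelen + 1);   /* +1 for '\0' */
    if (filename == NULL)
        return NULL;

    /* Pass the array itself (filename), not filename[namelen]. */
    if (read_all(fd, filename, (size_t)namelen) < 0) {
        free(filename);
        return NULL;
    }
    filename[namelen] = '\0';
    return filename;
}
```

The two points that differ from the snippet above are that the buffer is sized only after namelen has a value, and that read() receives filename (a pointer to the first element) instead of the single char filename[namelen].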

Read file in array line by line

Can you set any index of array as starting index i.e where to read from file? I was afraid if the buffer might get corrupted in the process.
#include <stdio.h>
int main()
{
FILE *f = fopen("C:\\dummy.txt", "rt");
char lines[30]; //large enough array depending on file size
fpos_t index = 0;
while(fgets(&lines[index], 10, f)) //line limit is 10 characters
{
fgetpos (f, &index );
}
fclose(f);
}
You can, but since your code is trying to read the full contents of the file, you can do that much more directly with fread:
char lines[30];
// Will read as much of the file as can fit into lines:
fread(lines, sizeof(*lines), sizeof(lines) / sizeof(*lines), f);
That said, if you really wanted to read line by line and do it safely, you should change your fgets line to:
// As long as index < sizeof(lines), guaranteed not to overflow buffer
fgets(&lines[index], sizeof(lines) - index, f);
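Put together as a loop, the idea might look like the sketch below, where index advances by the length of what fgets just stored instead of by a file position. fill_lines is a name I made up; the buffer ends up holding the lines back to back, newlines included.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Read lines from fp into `lines` one after another until the file ends
 * or the buffer is full. Returns the number of characters stored. */
size_t fill_lines(FILE *fp, char *lines, size_t size)
{
    assert(fp != NULL && lines != NULL && size > 0);
    size_t index = 0;
    lines[0] = '\0';

    /* size - index is the space left, including room for the '\0' */
    while (size - index > 1 &&
           fgets(&lines[index], (int)(size - index), fp) != NULL)
        index += strlen(&lines[index]);   /* next line goes after this one */

    return index;
}
```

Each fgets call overwrites the previous terminator, so the result is a single string; the buffer cannot overflow because the size passed to fgets shrinks as index grows.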
Not like this, no. There is a function called fseek that will take you to a different location in the file.
Your code will read the file into a different part of the buffer (rather than reading a different part of the file).
lines[index] is the index'th character of the array lines. Its address is not the index'th line.
If you want to skip to a particular line, say 5, then in order to read the 5th line, read 4 lines and do nothing with them, then read the next line and do something with it.
If you need to skip to a particular BYTE within a file, then what you want to use is fseek().
Also: be careful that the number of bytes you tell fgets to read for you (10) is the same as the size of the array you are putting the line into (30); that is not the case right now.
If you need to read a part of a line starting from a certain character within that line, you still need to read the whole line, then just choose to use a chunk of it starting someplace other than the beginning.
Both of these examples are like requesting a part of a document from a website or a library - they're not going to tear out a page for you, you get the whole document, and you have to flip to what you want.
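The "read and discard" approach for reaching a given line can be sketched as follows. read_nth_line is a hypothetical helper, lines are numbered from 1, and it assumes every line fits in the buffer (otherwise a long line would be counted more than once).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Read line number `target` (1-based) of fp into buf by reading and
 * discarding the lines before it. Returns 0 on success, -1 if the file
 * has fewer lines. Assumes every line fits in buf (size bytes). */
int read_nth_line(FILE *fp, int target, char *buf, size_t size)
{
    assert(fp != NULL && buf != NULL && size > 1 && target >= 1);
    rewind(fp);                         /* always count from the top */
    for (int line = 1; line <= target; line++) {
        if (fgets(buf, (int)size, fp) == NULL)
            return -1;                  /* ran out of lines */
    }
    buf[strcspn(buf, "\n")] = '\0';     /* drop the trailing newline */
    return 0;
}
```

Each earlier line is simply overwritten by the next one, so only the requested line is left in buf when the loop finishes.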

Resources