I am trying to learn C and have received a homework assignment to write code which can read data from a .txt file and print out particular lines.
I wrote the following:
#include <stdio.h>
void main() {
char str[5];
FILE *fp;
fp=fopen("data.txt","r");
int i;
for (i=1;i<=5;i++){
fgets(str,5,fp);
printf("%d \n",i);
if (i==1||i==3||i==5) {
printf("%s \n \n",str);
}
}
}
The file data.txt is just the following:
3.21
5.22
4.67
2.31
2.51
1.11
I had read that each time fgets is run, the pointer is updated to point to the next line. I thought I could keep running fgets and then only print the string str when at the correct value for i (the line I want output on the console).
It partially worked, here is the output:
1
3.21
2
3
5.22
4
5
4.67
Process returned 8 (0x8) execution time : 0.024 s
Press any key to continue.
It did only print when i had the correct values, but for some reason it only printed the first 3 lines, even though fgets was supposed to have been run 5 times by the last iteration, and so the pointer should have been reading the last line.
Can someone explain why the pointer did not update as expected, and whether there is an easier way to slice or index through a file in C?
You need to account for (at least) two additional characters, in addition to the numbers you have in the file. There is the end-of-line delimiter (\n on UNIX/Mac, or possibly \r\n on Windows... so maybe 3 additional characters), plus (from the fgets documentation):
A terminating null character is automatically appended after the characters copied to str.
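A lot of the C functions that manipulate character arrays (i.e. strings) will give you this extra null "for free" and it can be tricky if you forget about it. In your program, fgets(str,5,fp) can store at most 4 characters plus the null, so the first call reads "3.21" and leaves the newline in the stream; the second call then reads only that leftover newline. Every line therefore costs two calls, which is why five calls got you only as far as the third line.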
Also, a better way to loop over the lines might be:
#define MAX_CHARS 7
char buf[MAX_CHARS];
while (fgets(buf, MAX_CHARS, fp) != NULL) {
    printf("%s", buf); /* buf already ends with the newline read from the file */
}
It's still not the best way to do it (no error checking) but a little more compact/readable and idiomatic C, IMO.
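If the goal is to pull out particular lines by number, a small helper along these lines might be enough. This is only a sketch: get_line is a made-up name, and it assumes every line (including its line ending) fits in the buffer you pass in.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy line number `wanted` (1-based) of fp into buf,
   without its line ending. Returns 1 on success, 0 if the file has fewer
   lines. Assumes each line fits in `size` bytes, terminator included. */
int get_line(FILE *fp, int wanted, char *buf, size_t size)
{
    int current;

    rewind(fp); /* always count from the top of the file */
    for (current = 1; fgets(buf, (int)size, fp) != NULL; current++) {
        if (current == wanted) {
            buf[strcspn(buf, "\r\n")] = '\0'; /* drop "\n" or "\r\n" */
            return 1;
        }
    }
    return 0; /* ran out of lines (or a read error occurred) */
}
```

With the data.txt from the question you could call get_line(fp, 1, buf, sizeof buf), then again with 3 and 5, using something like char buf[64]. Rewinding on every call is simple but rereads the file each time, which is fine for small files.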
I would like to write a lottery program in C, that reads the chosen numbers of former weeks into an array. I have got a text file in which there are 5 columns that are separated with tabulators. My questions would be the following:
What should I separate the columns with? (e.g. a comma, a semicolon, a tabulator or something else)
Should I include a kind of EOF in the last row? (e.g. -1, "EOF") Is there any accepted or "official" convention to do this?
Which function should I use for reading the numbers? Is there any proper or "accepted" way of reading data from text files?
I once wrote a C program for a "Who Wants to Be a Billionaire" game. In that one I used a function that read each line into an array big enough to hold a whole line. After that I separated its data into variables like this:
line: "text1";"text2";"text3";"text4"endline (-> line loaded into a buffer array)
text1 -> answer1 (until reaching the semicolon)
text2 -> answer2 (until reaching the semicolon)
text3 -> answer3 (until reaching the semicolon)
text4 -> answer4 (until reaching the end of the line)
endline -> start over, that is read a new line and separate its contents into variables.
It worked properly, but I don't know if it was good enough for a programmer. (btw I'm not a programmer yet, I study Computer Science at a university)
Any answers and advice are welcome. Thanks in advance for your kind help!
The scanf() family of functions don't care about newlines, so if you want to process lines, you need to read the lines first and then process the lines with sscanf(). The scanf() family of functions also treats white space — blanks, tabs, newlines, etc. — interchangeably. Using tabs as separators is fine, but blanks will work too. Clearly, if you're reading and processing a line at a time, newlines won't really factor into the scanning.
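int lottery[100][5];
int line;
char buffer[4096];
/* Test the line count first, so a line is only read when there is room for it */
for (line = 0; line < 100 && fgets(buffer, sizeof(buffer), stdin) != NULL; line++)
{
    if (sscanf(buffer, "%d %d %d %d %d", &lottery[line][0], &lottery[line][1],
               &lottery[line][2], &lottery[line][3], &lottery[line][4]) != 5)
    {
        /* Print the text that failed to parse, with its 1-based line number */
        fprintf(stderr, "Faulty line %d: [%s]\n", line + 1, buffer);
        break;
    }
}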
This stops on EOF, too many lines, and a faulty line (one which doesn't start with 5 numbers; you can check their values etc in the loop if you want to — but what are the tests you need to run?). If you want to validate the white space separators, you have to work harder.
Maybe you want to test for nothing but spaces and newlines after the 5 numbers; that's a bit trickier (it can be done; look up the %n conversion specification in sscanf()).
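For what it's worth, here is roughly how the %n check could look. Treat it as a sketch: parse_five is a name invented for this example, and it accepts any white space (blanks, tabs, the newline) between and after the numbers.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: returns 1 if `line` holds exactly five integers
   followed by nothing but white space, 0 otherwise. */
int parse_five(const char *line, int out[5])
{
    int used = 0; /* %n stores how many characters sscanf consumed */

    /* %n does not count as a conversion, so success is still 5 */
    if (sscanf(line, "%d %d %d %d %d%n",
               &out[0], &out[1], &out[2], &out[3], &out[4], &used) != 5)
        return 0;

    /* Anything other than trailing white space makes the line faulty */
    while (isspace((unsigned char)line[used]))
        used++;
    return line[used] == '\0';
}
```

In the loop above you would call parse_five(buffer, lottery[line]) in place of the sscanf test. A line with a sixth number, or with stray text after the fifth, is then rejected instead of silently accepted.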
For my assignment, I'm required to use fread/fwrite. I wrote
#include <stdio.h>
#include <string.h>
struct rec{
int account;
char name[100];
double balance;
};
int main()
{
struct rec rec1;
int c;
FILE *fptr;
fptr = fopen("clients.txt", "r");
if (fptr == NULL)
printf("File could not be opened, exiting program.\n");
else
{
printf("%-10s%-13s%s\n", "Account", "Name", "Balance");
while (!feof(fptr))
{
//fscanf(fptr, "%d%s%lf", &rec.account, rec.name, &rec.balance);
fread(&rec1, sizeof(rec1),1, fptr);
printf("%d %s %f\n", rec1.account, rec1.name, rec1.balance);
}
fclose(fptr);
}
return 0;
}
clients.txt file
100 Jones 564.90
200 Rita 54.23
300 Richard -45.00
output
Account Name Balance
540028977 Jones 564.90
200 Rita 54.23
300 Richard -45.00╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠
╠╠ü☻§9x°é -92559631349317831000000000000000000000000000000000000000000000.000000
Press any key to continue . . .
I can do this with fscanf (which I've commented out), but I'm required to use fread/fwrite.
Why does it start with a massive number for Jones's account?
Why is there garbage after? Shouldn't feof stop this?
Are there any drawbacks using this method? or fscanf method?
How can I fix these?
Many thanks in advance
As the comments say, fread reads the bytes in your file without any interpretation. The file clients.txt consists of 50 characters, 16 in the first line plus 14 in the second plus 18 in the third line, plus two newline characters. (Your clients.txt does not contain a newline after the third line, as you will soon see.) The newline character is a single byte \n on UNIX or Mac OS X machines, but (probably) two bytes \r\n on Windows machines - hence either 50 or 52 characters. Here is the sequence of ASCII bytes in hexadecimal:
3130 3020 4a6f 6e65 7320 3536 342e 3930 100 Jones 564.90
0a32 3030 2052 6974 6120 3534 2e32 330a \n200 Rita 54.23\n
3330 3020 5269 6368 6172 6420 2d34 352e 300 Richard -45.
3030 00
Your fread statement copies these bytes without any interpretation directly into your rec1 data structure. That structure begins with int account;, which says to interpret the first four bytes as an int. As one of the comments noted, you are running your program on a little-endian machine (most likely an Intel machine), so the least significant byte is the first and the most significant byte is the fourth. Thus, your fread said to interpret the sequence of four ASCII characters "100 " as the four byte integer 0x20303031, which equals, in decimal, 540028977. The next member of your struct is char name[100];, which means that the next 100 bytes of data in rec1 will be the name. But the fread was told to read sizeof(rec1)=112 bytes (4 byte account, 100 byte name, 8 byte balance). Since your file is only 50 (or 52) characters, fread will have only been able to fill in that many bytes of rec1. The return value of fread, had you not discarded it, would have told you that the read stopped short of the number of bytes you requested. Since you hit EOF, the feof call breaks out of the loop after that first pass, having consumed the entire file in one gulp.
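If you want to convince yourself of the 540028977 figure, a tiny experiment like the following should do it. The function name is made up for this illustration, and uint32_t stands in for the 4-byte int on your machine.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reinterpret the first four bytes of `bytes` as a
   32-bit integer, the same way fread filled rec1.account. */
uint32_t first_four_bytes_as_u32(const char *bytes)
{
    uint32_t value;

    /* memcpy copies the raw bytes with no conversion of any kind */
    memcpy(&value, bytes, sizeof value);
    return value;
}
```

On a little-endian machine, first_four_bytes_as_u32("100 ") gives 0x20303031, which is 540028977 in decimal. A big-endian machine would give 0x31303020 instead, which is one more reason raw binary files are not portable between architectures.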
All of your output was produced by the first and only call to printf. The number 540028977 and the following space were produced by the "%d " and the rec1.account argument. The next bit is only partly determinate, and you got lucky: The "%s" specifier and the corresponding rec1.name argument will print the next characters as ASCII until a \0 byte is found. Thus, the output will begin with the 50-4 (or 52-4) remaining characters of your file -- including the two newlines -- and potentially continue forever, because there are no \0 bytes in your file (or in any text file), which means that after printing the last character of your file, what you are seeing is whatever garbage happened to be in the automatic variable rec1 when your program started. (That kind of unintentional output is similar to the famous Heartbleed bug in OpenSSL.) You were lucky the garbage included a \0 byte after only a few dozen more characters. Note that printf has no way to know that rec1.name was declared to be only a 100 byte array -- it only got the pointer to the beginning of name -- it was your responsibility to guarantee that rec1.name contained a terminating \0 byte, and you never did that.
We can tell a little bit more. The number -9.2559631349317831e61 (which is pretty ugly in "%f" format) is the value of rec1.balance. The 8 bytes for that double value on an IEEE 754 machine (like your Intel and all modern computers) are in hex 0xcccccccccccccccc. Sixty-four of the peculiar ╠ symbol appear in the "%s" output corresponding to rec1.name, while only 100-46 = 54 characters remain of the 100, so your "%s" output has run off the end of rec1.name, and includes rec1.balance into the bargain, and we learn that your terminal program interpreted the non-ASCII character 0xcc as ╠. There are many ways to interpret bytes bigger than 127 (0x7f); in latin-1 it would have been Ì for example. The graphical character ╠ is the representation of the 0xcc (204) byte in the ancient MS-DOS character set, Windows code page 437. Not only are you running on an Intel machine, it is a Windows machine (of course the most likely possibility to begin with).
That answers your first two questions. I'm not sure I understand your third question. The "drawbacks" I hope are obvious.
As for how to fix it, there is no reasonably simple way to read and interpret a text file using fread. To do so, you would need to duplicate much of the code in the libc fscanf function. The only sensible way is to first use fwrite to create a binary file; then fread will work naturally to read it back. So there have to be two programs -- one to write a binary clients.bin file, and a second to read it back. Of course, that does not solve the problem of where the data for that first program should come from in the first place. It could come from reading clients.txt using fscanf. Or it could be included in the source code of the fwrite program, for example by initializing an array of struct rec like this:
struct rec recs[] = {{100, "Jones", 564.90},
{200, "Rita", 54.23},
{300, "Richard", -45.00}};
Or it could come from reading a MySQL database, or... The one place it is unlikely to originate is in a binary file (easily) readable with fread.
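As a rough sketch of the two halves, something like this could work. The helper names write_records and read_records are invented here; the struct is the one from the question. Note that the read loop is driven by the return value of fread, not by feof.

```c
#include <stdio.h>

struct rec {
    int account;
    char name[100];
    double balance;
};

/* Hypothetical helper: write n records as raw bytes.
   Returns how many records were actually written. */
size_t write_records(FILE *fp, const struct rec *recs, size_t n)
{
    return fwrite(recs, sizeof recs[0], n, fp);
}

/* Hypothetical helper: read up to max records back.
   The loop ends as soon as fread cannot deliver one whole record. */
size_t read_records(FILE *fp, struct rec *recs, size_t max)
{
    size_t n = 0;

    while (n < max && fread(&recs[n], sizeof recs[n], 1, fp) == 1)
        n++;
    return n;
}
```

The writing program would open clients.bin with fopen("clients.bin", "wb") and the reading program with "rb"; the "b" matters on Windows. The file is only guaranteed to be readable by a program built with the same compiler, settings and architecture, since struct padding and byte order go into the file as-is.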
Say I'm calling a program:
$ ./dataset < filename
where filename is any file with x amount of line pairs where the first line contains a string and second line contains 10 numbers separated by spaces. The last line ends with "END"
How can I then start putting the first lines of pairs (string) into:
char *experiments[20] // max of 20 pairs
and the second lines of the pairs (numbers) into:
int data[10][20] // max of 20, 10 integers each
Any guidance? I don't even understand how I'm supposed to scan the file into my arrays.
Update:
So say this is my file:
Test One
0 1 2 3 4 5 6 7 8 9
END
Then redirecting this file would mean if I want to put the first line into my *experiments, that I would need to scan it as such?
scanf("%s", *experiments[0]);
Doing so gives me an error: Segmentation fault (core dumped)
What is incorrect about this?
Say my file is simply numbers, for ex:
0 1 2 3 4 5 6 7 8 9
Then,
scanf("%d", data[0][0]); works, and will hold value of '1'. Is there an easier way to do this for the whole line of data? i.e. data[0-9][0].
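Here is some pseudo-code; it shows how to read the input:
#include <stdio.h>
#include <string.h>

int main(void)
{
    char str[100]; // make sure that this size is enough to hold a single line
    int no_line = 1;
    // fgets() bounds the read; gets() cannot, and was removed from the language in C11
    while (fgets(str, sizeof(str), stdin) != NULL)
    {
        str[strcspn(str, "\r\n")] = '\0'; // strip the line ending kept by fgets
        if (strcmp(str, "END") == 0)
            break;
        if (no_line % 2 == 0)
        {
            /* read integer values from the string "str" using sscanf; sscanf can be called in a loop with %d until it fails */
        }
        else
        {
            /* store the string in your variable "experiments"; before copying, allocate memory for each entry */
        }
        no_line++;
    }
    return 0;
}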
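To make the two commented branches concrete, here is one way the whole thing might be filled in. It is a sketch under a few assumptions: read_experiments and the constants are names made up here, names are stored in fixed-size buffers instead of malloc'd strings, and the data array is laid out as [experiment][value], i.e. [20][10] rather than the [10][20] in the question.

```c
#include <stdio.h>
#include <string.h>

#define MAX_EXPERIMENTS 20
#define VALUES_PER_EXPERIMENT 10
#define LINE_LEN 256

/* Hypothetical helper: read name/number line pairs until "END" or EOF.
   Returns the number of complete pairs, or -1 on a malformed pair. */
int read_experiments(FILE *in,
                     char names[MAX_EXPERIMENTS][LINE_LEN],
                     int data[MAX_EXPERIMENTS][VALUES_PER_EXPERIMENT])
{
    char line[LINE_LEN];
    int count = 0;

    while (count < MAX_EXPERIMENTS && fgets(line, sizeof line, in) != NULL) {
        const char *p;
        int i;

        line[strcspn(line, "\r\n")] = '\0'; /* strip the line ending */
        if (strcmp(line, "END") == 0)
            break;
        strcpy(names[count], line); /* safe: both buffers are LINE_LEN bytes */

        if (fgets(line, sizeof line, in) == NULL)
            return -1; /* a name with no numbers line after it */

        /* Walk along the line: %n reports how far each %d advanced */
        p = line;
        for (i = 0; i < VALUES_PER_EXPERIMENT; i++) {
            int used;
            if (sscanf(p, "%d%n", &data[count][i], &used) != 1)
                return -1; /* fewer than 10 numbers on the line */
            p += used;
        }
        count++;
    }
    return count;
}
```

Called as read_experiments(stdin, names, data) under ./dataset < filename, it reads the redirected file. This also shows why scanf("%s", *experiments[0]) crashed: %s needs a pointer to memory that already exists, and it stops at the first space anyway, so "Test One" would come back as just "Test".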
The redirected file is associated with the FILE * stdin. It's already opened for you.
Otherwise, you can treat it the same as any other text file, and/or use the functions that are dedicated to standard input. The one exception is that you cannot rely on being able to seek in the input or to retrieve its size.
For the data sizes you're talking about, by far the easiest thing to do is just slurp all of the content into a buffer and work on that: you don't have to be super-stingy, just make sure that you don't overrun.
If you want to be super-stingy with memory, preallocate a 4kB buffer with malloc(), progressively read() into it from standard input (read() is the POSIX call and takes the descriptor STDIN_FILENO; fread() on stdin is the portable equivalent), and realloc() another 4kB every time the input exceeds what you've already allocated. If you don't care so much about being stingy with memory (e.g. on a modern machine with gigabytes of memory), just malloc() something much bigger than the expected input (e.g. a megabyte) and bug out if the input is more than that: this is far simpler to implement but less general/elegant.
You then have all of the input in a buffer and you can do what you like with it, which depends too strongly on the format of the input for me to say how you should approach that part.
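A sketch of the grow-as-you-go variant might look like this. It swaps in the standard fread() rather than POSIX read() so that it works on any FILE *; slurp and CHUNK are names chosen for this example only.

```c
#include <stdio.h>
#include <stdlib.h>

#define CHUNK 4096 /* grow the buffer 4kB at a time */

/* Hypothetical helper: read everything from `in` into one malloc'd,
   null-terminated buffer. Stores the byte count in *len_out.
   Returns NULL on allocation or read failure; the caller frees the result. */
char *slurp(FILE *in, size_t *len_out)
{
    size_t cap = CHUNK, len = 0;
    char *buf = malloc(cap + 1); /* +1 leaves room for the final '\0' */

    if (buf == NULL)
        return NULL;
    for (;;) {
        char *bigger;

        len += fread(buf + len, 1, cap - len, in);
        if (len < cap)
            break; /* short read: end of input or an error */
        cap += CHUNK;
        bigger = realloc(buf, cap + 1);
        if (bigger == NULL) {
            free(buf);
            return NULL;
        }
        buf = bigger;
    }
    if (ferror(in)) {
        free(buf);
        return NULL;
    }
    buf[len] = '\0';
    *len_out = len;
    return buf;
}
```

After char *text = slurp(stdin, &n); the whole input is one string that can be picked apart with strtok, sscanf or plain pointer walking, and released with free(text) when done.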
I'm trying to delete a string in a data-file.
The format of the data is just like following records:
4253 1
3119 1
5709 1
576 1
857 1
5859 1
5896 1
116 1
2396 1
1088 1
4180 1
This is part of the file; I am not able to post an image.
Each record is made up of two numbers separated by a space, and records are separated by the invisible character '\n'.
There are thousands of records in the file. I just want to delete some useless records as I scan the file, and it has to be implemented in C.
Sorry for not providing a more detailed description of the data format.
Files in C are sequential entities. Unless you impose your own structure on them (such as treating NUL characters as non-existent ones), the only real way to delete characters or lines is to overwrite them, shifting the part of the file following them a little towards the front.
You can either do this in-place with things like fseek and truncate (that last is not ISO C) or by reading from one file and writing to another.
For example, the following program copies standard input to standard output, leaving out any line that consists of exactly 11:
#include <stdio.h>
#include <string.h>
int main (void) {
    char buff[1024];
    while (fgets (buff, sizeof(buff), stdin) != NULL)
        if (strcmp (buff, "11\n") != 0)
            printf ("%s", buff);
    return 0;
}
Beware the usual caveats about lines that are too long for the input buffer.
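For records shaped like the ones in the question (two numbers per line), the read-one-file, write-another approach could be sketched as below. The name drop_records and the choice to match on the first number are assumptions made for the example; substitute whatever test marks a record as useless.

```c
#include <stdio.h>

/* Hypothetical helper: copy `in` to `out`, leaving out every record whose
   first number equals unwanted_key. Returns how many records were dropped. */
int drop_records(FILE *in, FILE *out, int unwanted_key)
{
    char line[1024];
    int dropped = 0;

    while (fgets(line, sizeof line, in) != NULL) {
        int key, flag;

        /* Only well-formed "key flag" lines are candidates for removal */
        if (sscanf(line, "%d %d", &key, &flag) == 2 && key == unwanted_key) {
            dropped++;
            continue;
        }
        fputs(line, out); /* everything else is copied through unchanged */
    }
    return dropped;
}
```

A typical way to use it is to write to a temporary file and, once the copy has succeeded, replace the original with remove() and rename(). That way a crash halfway through leaves the original data untouched.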
I have a txt file named prob which contains:
6 2 8 3
4 98652
914
143 789
1
527 146
85
1 74 8
7 6 3
Each line has 9 chars and there are 9 lines. Since I can't make a string array in C, I'm using a two-dimensional array. Be careful running the code: infinite loops are common and it prints weird output. I'm also curious where it stops taking in the string. At the newline?
Expected result for each "save": 6 2 8 3
or whatever the line contained.
#include <stdio.h>
FILE *prob;
main()
{
prob = fopen("prob.txt", "r");
char grid_values[9][9];
char save[9];
int i;
for (i = 0; (fscanf(prob, "%s", save) != EOF); i++)
{
int n;
for (n = 0; n <= 9; n++)
{
grid_values[i][n] = save[n];
printf("%c", grid_values[i][n]);
}
}
fclose(prob);
}
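If you use fscanf with %s, it stops at the first whitespace character.
Try fgets instead; it reads line by line. Note that fgets returns NULL (not EOF) at end of file, and that save must be big enough for the 9 characters, the newline and the terminating null (at least 11 bytes):
for (i = 0; fgets(save, sizeof(save), prob) != NULL; i++)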
the detail of fgets usage can be found here:
http://www.cplusplus.com/reference/clibrary/cstdio/fgets/
--edited--
Here's a second way:
while(!feof(file))
{
if (fgets(s, sizeof(s), file) == NULL) break; /* feof only becomes true after a read has failed */ ......
}
I think it'll work well.
This looks like a homework problem, so I will try to give you some good advice.
First, read the description of the fscanf function and the description of the "%s" conversion.
Here is a snip from the description I have for "%s":
Matches a sequence of non-white-space characters; the next pointer must be a pointer to a character array that is long enough to hold the input sequence and the terminating null
character (’\0’), which is added automatically. The input string stops at white space or
at the maximum field width, whichever occurs first.
Here are the two important points:
Each of your input lines contains numbers and whitespace characters. So the function will read a number, reach whitespace, and stop. It will not read 9 characters.
If it did read 9 characters, you do not have enough room in your array to store the 10 bytes required. Note that a "terminating null character" will be added. 9 characters read, plus 1 null, equals 10. This is a common mistake in C programming and it is best to learn now to always account for the terminating null in any C string.
Now, to fix this to read characters into a two dimensional array: You need to use a different function. Look through your list of C stdio functions.
See anything useful sounding?
If you haven't, I will give you a hint: fread. It will read a fixed number of bytes from the input stream. In your case you could tell it to always read 9 bytes.
That would only work if each line is guaranteed to be padded out to 9 characters.
Another function is fgets. Again, carefully read the function documentation. fgets is another function that appends a terminating null. However! In this case, if you tell fgets a size of 9, fgets will only read 8 characters and it will write the terminating null as the 9th character.
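To show the size arithmetic with fgets, here is a rough sketch that reads the grid row by row. The names read_grid and GRID are invented for the example. The line buffer is deliberately roomy, and rows shorter than 9 characters are padded with spaces, which is an assumption about how you want trailing blanks treated.

```c
#include <stdio.h>
#include <string.h>

#define GRID 9

/* Hypothetical helper: read up to GRID lines into grid, one row per line.
   Each row gets GRID characters plus a terminating null, hence GRID + 1.
   Returns the number of rows read, or -1 if a line is longer than GRID. */
int read_grid(FILE *fp, char grid[GRID][GRID + 1])
{
    char line[64]; /* 9 characters + "\r\n" + '\0' need 12; 64 is generous */
    int row = 0;

    while (row < GRID && fgets(line, sizeof line, fp) != NULL) {
        size_t len = strcspn(line, "\r\n"); /* length without line ending */

        if (len > GRID)
            return -1;
        memset(grid[row], ' ', GRID);  /* pad short rows with spaces */
        memcpy(grid[row], line, len);
        grid[row][GRID] = '\0';        /* make the row a proper C string */
        row++;
    }
    return row;
}
```

Declared as char grid_values[9][10], every row can then be passed straight to printf("%s\n", grid_values[i]).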
But there is even another way! Back to fscanf!
If you look at the other conversion specifiers, you could use "%9c" to read 9 characters. If you use this operation, it will not add a terminating null to the string.
With both fread and fscanf "%9c" if you wanted to use those 9 bytes as a string in other functions such as printf, you would need to make your buffers 10 bytes and after every fread or fscanf function you would need to write save[9] = '\0'.
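The "%9c" route might look something like this. It is a sketch that assumes every line really is padded out to 9 characters; read_fixed_line is a made-up name.

```c
#include <stdio.h>

/* Hypothetical helper: read exactly 9 characters into save and terminate
   them by hand, then skip the rest of the line.
   Returns 1 on success, 0 at end of file or on a read failure. */
int read_fixed_line(FILE *fp, char save[10])
{
    int c;

    if (fscanf(fp, "%9c", save) != 1)
        return 0;
    save[9] = '\0'; /* %c never adds the terminating null itself */

    /* Consume the newline (and anything else) left on this line */
    while ((c = getc(fp)) != '\n' && c != EOF)
        ;
    return 1;
}
```

Unlike "%s", the "%c" conversion does not skip leading white space, so blanks inside the line are kept. That also means a line shorter than 9 characters would pull the newline and part of the next line into save, which is why the padding assumption matters.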
Always read the documentation carefully. C string functions sometimes do it one way. But not always.