I'm really stuck on something.
I have a text file, which has 1 word followed by ~100 float numbers. The float numbers are separated by space, tab, or newline. This format repeats several times throughout the text file.
For example, this is what the text file looks like:
one 0.00591 0.07272 -0.78274 ...
0.0673 ...
0.0897 ...
two 0.0654 ...
0.07843 ...
0.0873 ...
three ...
...
...
This is a snippet of my code:
char word[30];
double a[1000];
double j;
while (!feof(fp))
{
fscanf(fp, "%s", word);
printf("%s\n", word);
while (!feof(fp) && (fscanf(fp, " %lf", &j)) == 1)
{
a[z] = j;
z++;
num_of_vectors++;
}
z = 0;
}
The word "nine" in the text file, is printed as "ne".
And the word "in" doesn't even print, a floating point number gets printed.
What am I doing wrong?
Any help would be much appreciated.
Thanks.
As per the standard:
An input item is defined as the longest sequence of input characters which does not exceed any specified field width and which is, or is a prefix of, a matching input sequence.
The likely reason that nine is giving you ne is because, when reading a double value, nan is one of the acceptable values. Hence, the n and i are read to establish that it's not nan.
Similarly, the word in is a valid prefix of inf, which represents infinity.
The standard also states in a footnote:
fscanf pushes back at most one input character onto the input stream.
so it's quite possible that this is why the i in nine is not being pushed back.
Bottom line is that it's basically unsafe to assume where the file pointer will end up when fscanf operations fail for some reason.
One way to fix this is to use ftell and fseek to save the file position after each successfully read item, so that you can move back to the correct file position if the thing you're attempting to read is not successful.
Let's say you have the input file:
one 1 2 3 4 5
nine 9 8 7 6 5
in 3.14159 2.71828
The following code will save and restore file positions to make it work as you wish:
#include <stdio.h>
int main(void) {
    char buff[50]; double dbl; long pos;
    FILE *fin = fopen("inputFile.txt", "r");
    if (fin == NULL) return 1;
    while (fscanf(fin, "%49s", buff) == 1) {
        printf("Got string [%s]\n", buff);
        pos = ftell(fin);
        /* Read doubles from the file, remembering the position after each success. */
        while (fscanf(fin, "%lf", &dbl) == 1) {
            printf("Got double [%f]\n", dbl);
            pos = ftell(fin);
        }
        /* The failed %lf may have consumed part of the next word; go back. */
        fseek(fin, pos, SEEK_SET);
    }
    fclose(fin);
    return 0;
}
By commenting out the fseek, you can see similar behaviour to what you describe:
Got string [one]
Got double [1.000000]
Got double [2.000000]
Got double [3.000000]
Got double [4.000000]
Got double [5.000000]
Got string [ne]
Got double [9.000000]
Got double [8.000000]
Got double [7.000000]
Got double [6.000000]
Got double [5.000000]
Got double [3.141590]
Got double [2.718280]
I consider this solution a little messy in that it's continuously having to call ftell and occasionally fseek to get it to work.
Another way is to just read everything as strings and decide whether it's a numeric or string with a sscanf operation after reading it in, as in the following code (with the afore-mentioned input file):
#include <stdio.h>
int main(void) {
char buff[50]; double dbl;
FILE *fin = fopen("inputFile.txt", "r");
while (fscanf(fin, "%49s", buff) == 1) {
if (sscanf(buff, "%lf", &dbl) == 1) {
printf("Got double [%f]\n", dbl);
} else {
printf("Got string [%s]\n", buff);
}
}
fclose(fin);
return 0;
}
This works because the text of a floating point value is itself a valid %s token (i.e., it has no embedded whitespace), so it can always be read as a string first.
The output of both those programs above is:
Got string [one]
Got double [1.000000]
Got double [2.000000]
Got double [3.000000]
Got double [4.000000]
Got double [5.000000]
Got string [nine]
Got double [9.000000]
Got double [8.000000]
Got double [7.000000]
Got double [6.000000]
Got double [5.000000]
Got string [in]
Got double [3.141590]
Got double [2.718280]
which is basically what was desired.
One thing you need to be aware of is that scanning something like inf or nan as a double will actually work - that is the intended behaviour of the library (and how your original code would have worked had it not had the issues). If that's not acceptable, you can do something like evaluate the string before trying to scan it as a double, to ensure it's not one of those special values.
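A minimal sketch of that pre-check, assuming it is enough to require that a token start with a digit or a decimal point (after an optional sign) before handing it to strtod. The name parse_plain_double is hypothetical, not part of any library:

```c
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical helper: returns 1 and stores the value in *out only when tok
   begins with a digit or decimal point (after an optional sign) and strtod()
   consumes the whole token. Returns 0 for words such as "nine" or "in" and
   for the special spellings "nan", "inf" and "infinity". */
int parse_plain_double(const char *tok, double *out)
{
    const char *p = tok;
    char *end;
    double value;

    if (*p == '+' || *p == '-')
        p++;                               /* skip an optional sign */
    if (!isdigit((unsigned char)*p) && *p != '.')
        return 0;                          /* rejects nan, inf and plain words */

    value = strtod(tok, &end);
    if (end == tok || *end != '\0')
        return 0;                          /* nothing converted, or trailing junk */
    *out = value;
    return 1;
}
```

Each token read with %s can be passed to this function; a return of 0 means the token should be treated as a word.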
Say I'm given a text file that looks like this:
a,b,c
x,y,z
where a is a char *, b contains a float and c contains a double.
For an example, the input file can look like this:
apple,$12.34,test130.8
x,y,z
I want to use fscanf() to read a, b, c and assign each one of them to a corresponding variable.
"apple" will be assigned to A of the same data type; "12.34"(not "$12.34") will be assigned to B with a float data type; so on.
My attempt was as follows:
fp = the file pointer
char A[50];
float B;
double C;
fscanf(fp, "%[^,],%[^,],%[^,]\n", A, B, C);
But I realized that %[^,] can only specify type char *; ergo, I'm not allowed to assign type char * to a float or double variable.
Is there a way to make %[^,] produce a value of type float?
if I only use this:
fscanf(fp, "%s,%f,%lf\n", A, B, C);
It will be thrown off by the "$" in "12.34", and it will give me 0.000000.
Using sscanf() (instead of fscanf()) for ease of testing:
#include <stdio.h>
#include <stdlib.h>
int main() {
char *s = "apple,$12.34,test130.8\npear,$23.45,abc";
for(int offset = 0, n;; offset += n) {
char *symbol;
float price;
char *note;
if(sscanf(s + offset, " %m[^,],$%f,%m[^\n]%n", &symbol, &price, &note, &n) != 3) {
break;
}
printf("symbol: %s, price: %f, note: %s\n", symbol, price, note);
free(note);
free(symbol);
}
}
and the matching output (note how it demonstrates the evils of using floating point for money):
symbol: apple, price: 12.340000, note: test130.8
symbol: pear, price: 23.450001, note: abc
I used %m to have scanf() allocate the strings. If I knew the maximum size of the strings, I would reuse fixed-size strings instead of dynamically allocating and freeing them.
When using fscanf(), instead of break you could use feof() to see whether you are done or whether the input is invalid. If it's invalid, you may want to resync to the next \n by reading characters with fscanf(..., "%c", &ch). For the above, s[offset] == '\0' will tell you whether you are at the end, but see below.
You may find it's much easier to get a line with fgets(), then use sscanf() similar to the above to extract each item. If that fails, you can report the line and just read the next one. fgets() will return NULL if you have no more data, and it leads to cleaner code when you separate I/O and parsing.
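A rough sketch of that fgets() plus sscanf() split, assuming fixed 50-byte fields and a 256-byte line buffer (both sizes, the struct, and the function names are illustrative assumptions, not from the answer above):

```c
#include <stdio.h>

/* Hypothetical record layout for lines such as "apple,$12.34,test130.8". */
struct record {
    char symbol[50];
    float price;
    char note[50];
};

/* Parses one line already read by fgets(); returns 1 on success, 0 otherwise. */
int parse_record(const char *line, struct record *r)
{
    return sscanf(line, " %49[^,],$%f,%49[^\n]",
                  r->symbol, &r->price, r->note) == 3;
}

/* Reads fp line by line, reports malformed lines on stderr, and returns the
   number of lines that parsed successfully. */
int read_records(FILE *fp)
{
    char line[256];
    int good = 0;

    while (fgets(line, sizeof line, fp) != NULL) {
        struct record r;
        if (parse_record(line, &r)) {
            printf("symbol: %s, price: %.2f, note: %s\n",
                   r.symbol, r.price, r.note);
            good++;
        } else {
            fprintf(stderr, "skipping malformed line: %s", line);
        }
    }
    return good;
}
```

Because a bad line is consumed whole by fgets(), no resynchronization step is needed: the next call simply starts at the following line.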
There's already an answer from @AllanWind (using dynamic allocation for strings, which my old library doesn't do). Here's an alternative solution (that is much the same).
First, the input file used for testing:
apple,$12.34,test130.8
banana,$20.67,testing201.45
Then the code using fscanf() with a complicated format string:
#include <stdio.h>
int main( void ) {
FILE *fp = fopen( "test.txt", "r" );
if( fp == NULL) {
fprintf( stderr, "fopen() failed\n" );
return -1;
}
char txt[50], word[12];
double dval1, dval2;
while( fscanf( fp, " %49[^,],%*c%lf,%11[^0123456789]%lf", txt, &dval1, word, &dval2 ) == 4 )
printf( "'%s' / %.2lf / '%s' / %.2lf\n", txt, dval1, word, dval2 );
fclose( fp );
return 0;
}
Finally, the output
'apple' / 12.34 / 'test' / 130.80
'banana' / 20.67 / 'testing' / 201.45
I have the following code, which reads from a given input file into a struct I have made.
OFFFile ReadOFFFile(OFFFile fileData, FILE* srcFile)
{
int nvert, nfaces;
fscanf(srcFile, "%s\n");
fscanf(srcFile, "%d %d %s\n", &nvert, &nfaces);
fileData.nvert = nvert;
fileData.nfaces = nfaces;
fileData.vertices = (int *) malloc(fileData.nvert * sizeof(float));
fileData.triFaces = (int *) malloc(fileData.nfaces * sizeof(int));
// Print to check correct size was allocated
printf("%d\n", (fileData.nvert * sizeof(float)));
printf("%d\n", (fileData.nfaces * sizeof(int)));
int i;
float ftemp1, ftemp2, ftemp3;
int itemp1, itemp2, itemp3;
fscanf(srcFile, "%f", &ftemp1);
printf("%lf", ftemp1);
fscanf(srcFile, "%f", &ftemp2);
// fscanf(srcFile, " %lf", &ftemp3);
/* for (i = 0; i < nvert; ++i)
{
fscanf(srcFile, "%f %f %f\n", &ftemp1, &ftemp2, &ftemp3);
fileData.vertices[i].x = ftemp1;
fileData.vertices[i].y = ftemp2;
fileData.vertices[i].z = ftemp3;
}
*/
return(fileData);
}
The problem I am having is with the whole last section that is currently in quotes (The 2 fscanf lines above it are me attempting to test). If I have just one float being read it works fine, but when I add the second or third the whole function fails to even run, although it still compiles. I believe it to be caused by the negative sign in the input, but I don't know how I can fix it.
The data is in the form
OFF
4000 7000 0
0.8267261981964111 -1.8508968353271484 0.6781123280525208
0.7865174412727356 -1.8490413427352905 0.7289819121360779
With the floats continuing on for 4000 lines (hence for loop). These are the structs I have made
typedef struct
{
float x;
float y;
float z;
} Point3D;
typedef struct
{
int face1;
int face2;
int face3;
} triFace;
typedef struct
{
int nvert;
int nfaces;
Point3D *vertices;
triFace *triFaces;
} OFFFile;
Text dump of another file with a lot less lines, also does not work in the for loop. Only using this for testing. https://justpaste.it/9ohcc
Your main problem is the first line in the ReadOFFFile function:
fscanf(srcFile, "%s\n");
This tries to read a string (presumably the string OFF on the first line of the file), but you don't give fscanf any place to store the string, so it crashes. (As an aside, your compiler really should have warned you about this problem. If it didn't, it's old-fashioned, and there are lots of easy mistakes that it's probably not going to warn you about, and learning C is going to be much harder than it ought to be. Or perhaps you just need to find an option flag or checkbox to enable more warnings.)
You can tell fscanf to read and discard something, without storing it anywhere, using the * modifier. Here's a modified version of your program, that works for me.
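#include <stdio.h>
#include <stdlib.h>
/* Uses the Point3D, triFace and OFFFile structs from the question. */
void ReadOFFFile(OFFFile *fileData, FILE* srcFile)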
{
fscanf(srcFile, "%*s");
if(fscanf(srcFile, "%d %d %*s", &fileData->nvert, &fileData->nfaces) != 2) {
exit(1);
}
fileData->vertices = malloc(fileData->nvert * sizeof(Point3D));
fileData->triFaces = malloc(fileData->nfaces * sizeof(triFace));
int i;
for (i = 0; i < fileData->nvert; ++i)
{
if(fscanf(srcFile, "%f %f %f", &fileData->vertices[i].x,
&fileData->vertices[i].y,
&fileData->vertices[i].z) != 3) {
exit(1);
}
}
}
I have made a few other changes. The other fscanf call, which reads three values but only stores two, also needs a * modifier. I check the return value of fscanf to catch errors (via a crude exit) if the input is not as expected. I got rid of the \n characters in the fscanf calls, since they're not necessary, and potentially misleading. I got rid of some unnecessary temporary variables, and I had the ReadOFFFile function accept a pointer to an OFFFile structure to fill in, rather than passing and returning it.
Here is the main program I tested it with:
int main()
{
OFFFile fd;
FILE *fp = fopen("dat", "r");
ReadOFFFile(&fd, fp);
for (int i = 0; i < fd.nvert; ++i)
printf("%f %f %f\n", fd.vertices[i].x, fd.vertices[i].y, fd.vertices[i].z);
}
This is still an incomplete program: there are several more places where you need to check for errors (opening the file, calling malloc, etc.), and when you do detect an error, you need to at least print a useful error message before exiting or whatever.
One more thing. As I mentioned, those \n characters you had in the fscanf format strings were unnecessary and misleading. To illustrate what I mean, once you get the program working, have it try to read a data file like this:
OFF 2 0
0 0.8267261981964111
-1.8508968353271484 0.6781123280525208
0.7865174412727356 -1.8490413427352905 0.7289819121360779
Totally malformed, but the program reads it without complaint! This is one reason (one of several dozen reasons, actually) why the scanf family of functions is basically useless for most things. These functions claim to "scan formatted data", but their definition of "formatted" is quite loose, in that they actually read free-form input, generally without any regard for line boundaries.
For some advice on graduating beyond scanf and using better, more reliable methods for reading input, see this question. See also this section and this section in some online C programming course notes.
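As one possible illustration of a stricter check, here is a sketch that validates a single vertex line after it has been read with fgets(). It assumes a vertex line should contain exactly three floats and nothing else; parse_vertex_line is a hypothetical name:

```c
#include <stdio.h>

/* Hypothetical helper: accepts a line only if it holds exactly three floats
   followed by nothing but whitespace. Returns 1 on success, 0 otherwise. */
int parse_vertex_line(const char *line, float *x, float *y, float *z)
{
    int used = 0;

    /* %n records how many characters were consumed so far; the space before
       it skips trailing whitespace, including the newline kept by fgets(). */
    if (sscanf(line, "%f %f %f %n", x, y, z, &used) != 3)
        return 0;
    return line[used] == '\0';   /* leftover text means a malformed line */
}
```

With this check, the malformed file shown above is rejected at its second line, because that line holds only two numbers.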
The line:
fscanf(srcFile, "%s\n");
is invoking undefined behavior. The compiler should warn you about that. Once you've invoked UB, there's no point in speculating further about what is happening.
It's not clear to me what you intended that line to do, but if you use %s in a scanf, you need to give it a valid place to write data. You should always check the value returned by scanf to ensure that you have actually read some data, and you should never use "%s" without a width modifier. Perhaps you want something like:
char buf[256];
if( fscanf(srcFile, "%255s ", buf) == 1 ){
/* Do something with the string in buf */
}
From your comment, it seems that you are intending to use that scanf to skip a line. I strongly recommend using a while(fgetc) loop instead of scanf to do that. If you do want to use scanf, you could try something like fscanf(srcFile, "%*s\n"), but beware that it will stop at the first whitespace, and not necessarily consume an entire line. You could also do fscanf(srcFile, "%*[^\n]%*c"); to consume the line, but you are really better off using a fgetc in a while loop.
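A small sketch of the fgetc loop suggested here; the function name skip_line is an assumption made for illustration:

```c
#include <stdio.h>

/* Discards the rest of the current line. Returns 1 if a newline was
   consumed, 0 if end of file (or a read error) came first. */
int skip_line(FILE *fp)
{
    int c;   /* int, not char, so that EOF can be told apart from data */

    while ((c = fgetc(fp)) != EOF) {
        if (c == '\n')
            return 1;
    }
    return 0;
}
```

Calling skip_line(srcFile) once would discard the OFF header line regardless of how many words it contains.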
Addressing title question:
"How do I read multiple floats from one line of a file"
...with suggestions for a non-scanf() approach.
Assuming the file is opened (and a file pointer fp is obtained), the first two lines are already handled, and the header values have been converted to ints, say with the number of data lines stored in int lines;
And given your struct definition (modified to use double to accommodate type compatibility in code below):
typedef struct
{
double x;
double y;
double z;
} Point3D;
In a function somewhere here is one way to parse the contents of each data line into the 3 respective struct values using fgets(), strtok() and strtod():
char delim[] = " \n";
char *tok = NULL;
char newLine[100] = {0};
Point3D *point = calloc(lines, sizeof(*point));
if(point)
{
int i = 0;
while(i < lines && fgets(newLine, sizeof newLine, fp)) /* never write past the allocated entries */
{
tok = strtok(newLine, delim);
if(tok)
{
if(parseDbl(tok, &point[i].x))
{
tok = strtok(NULL, delim);
if(tok)
{
if(parseDbl(tok, &point[i].y))
{
tok = strtok(NULL, delim);
if(tok)
{
if(!parseDbl(tok, &point[i].z))
{
;//handle error
}else ;//continue
}else ;//handle error
}else ;//handle error
}else ;//handle error
}else ;//handle error
}else ;//handle error
i++;//increment for next read
}//end of while
}else ;//handle error
Where parseDbl is defined as:
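#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>
bool parseDbl(const char *str, double *val)
{
    char *temp = NULL;
    bool rc = true;
    errno = 0;
    *val = strtod(str, &temp);
    if (temp == str || errno == ERANGE) /* no digits found, or value out of range */
        rc = false;
    return rc;
}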
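The nested strtok() checks above can also be flattened. The following is only a sketch of an alternative, assuming each data line holds exactly three whitespace-separated numbers; it calls strtod() directly in a loop, and parse_point is a hypothetical name:

```c
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical helper: reads exactly three whitespace-separated doubles from
   line into out[0..2]. Returns 1 on success, 0 on any malformed line. */
int parse_point(const char *line, double out[3])
{
    const char *p = line;

    for (int i = 0; i < 3; i++) {
        char *end;
        out[i] = strtod(p, &end);   /* strtod skips leading whitespace itself */
        if (end == p)
            return 0;               /* no number where one was expected */
        p = end;
    }
    while (isspace((unsigned char)*p))
        p++;                        /* allow trailing whitespace or newline */
    return *p == '\0';              /* anything else left over is an error */
}
```

Unlike strtok(), this leaves the line buffer unmodified, so the original text is still available for an error message.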
I'm new to using strings in C and am needing to read from a file lines of data that contain strings and numbers, parsing them as I go along. I've done similar programs reading in just numbers, such as a list of ordered pairs, using a for loop so this is the strategy I am leaning towards.
Example of data line in the file: PART,2.000,-1,0.050,V
When I compile I get an error in the for loop declaration of "expected expression before 'char'". What is missing or needs reviewing in this code?
#define flush fflush(stdin)
#define N 50
int main()
{
flush;
const char part[] = "PART"; // String for PART variable
char descriptor[N]; // Starting string of data set
double p_dim[N]; // Array for part dimensions
int t_sens[N]; // Array for sensitivity values: -1 or +1
double t[N]; // Array for part tolerance, bilateral
char t_status[N]; // Array for tolerance status, F(ixed) or V(ariable)
double key_max; // Maximum value of key characteristic
double key_min; // Minimum value of key characteristic
FILE* fpin;
if((fpin = fopen("input.txt","r"))==(FILE*)NULL)
{
printf("File input does not exist\n"); exit(-1);
}
// For loop to parse data lines from file
for(N; char* fgets(descriptor, int N, FILE* fpin); N-1);
{
compare(descriptor, part);
if (descriptor == part)
{
fscanf(fpin, "%lf,%d,%lf,%s", p_dim[N], t_sens[N], t[N], t_status[N]);
}
else if (descriptor != part)
{
fscanf(fpin, "%lf, %lf", &key_min, &key_max);
}
}
1.) #define flush fflush(stdin)
Flushing stdin invokes undefined behaviour.
2.) if((fpin = fopen("input.txt","r"))==(FILE*)NULL)
The cast to (FILE*) is superfluous.
3.) for(N; ... ; N-1);
You defined N as a constant (#define N 50) so this loop won't ever exit.
4.) for(... ; char* fgets(descriptor, int N, FILE* fpin); ...);
This is just plain wrong ...
I'd lean more toward breaking the string apart
See question 3501338 for reading a file line by line
See question 15472299 using strtok to break apart the string
If you need to convert the strings to numbers, use sscanf
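For the sample line PART,2.000,-1,0.050,V, a sketch along those lines might look as follows. It uses sscanf alone rather than strtok, and the struct, its field names, and the 16-byte descriptor size are assumptions made for illustration:

```c
#include <stdio.h>

/* Hypothetical struct for one line such as "PART,2.000,-1,0.050,V". */
struct part_line {
    char descriptor[16];   /* leading word, e.g. "PART" */
    double dim;            /* part dimension */
    int sens;              /* sensitivity: -1 or +1 */
    double tol;            /* bilateral tolerance */
    char status;           /* 'F' (fixed) or 'V' (variable) */
};

/* Parses a line already read with fgets(). Returns 1 if all five fields
   were converted, 0 otherwise. */
int parse_part_line(const char *line, struct part_line *p)
{
    return sscanf(line, " %15[^,],%lf,%d,%lf, %c",
                  p->descriptor, &p->dim, &p->sens, &p->tol, &p->status) == 5;
}
```

Note that the descriptor should then be compared with strcmp(p.descriptor, "PART") == 0; using == on two char arrays compares addresses, not contents.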
I have a file format like this
1.9969199999999998 2.4613199999999997 130.81278270000001 AA
2.4613199999999997 2.5541999999999998 138.59131554109211 BB
2.5541999999999998 2.9953799999999995 146.83238401449094 CC
...........................
I have to read the first three columns as float and the last column as a char array in C. All the columns are tab separated and there is a newline character at the end of each line. Everything works fine with fscanf(fp1, "%f\t%f\t%f\t%s\n", ...) as long as I have some text at the end of each line (the char string part).
There are cases where, instead of AA/BB/CC, I have an empty string in the file. How do I handle that case? I have tried fscanf(fp1, "%f\t%f\t%f\t%s[^\n]\n", ...) and many other things, but I am unable to figure out the right way. Can you please help me out here?
Using float rather than double will throw away about half the digits shown. You get 6-7 decimal digits with float; you get 15+ digits with double.
As to your main question: use fgets() (or POSIX getline()) to read lines and then sscanf() to parse the line that is read. This will avoid confusion. When the input is line-based but not regular enough, don't use fscanf() and family to read the data; the file-reading scanf() functions don't care about newlines, even when you do.
Note that sscanf() will return either 3 or 4, indicating whether there was a string at the end of a line or not (or EOF, 0, 1 or 2 if it is given an empty string, or a string which doesn't start with a number, or a string which only contains one or two numbers). Always test the return value from scanf() and friends — but do so carefully. Look for the number of values that you expect (3 or 4 in this example), rather than 'not EOF'.
This leads to roughly:
#include <stdio.h>
int main(void)
{
double d[3];
char text[20];
char line[4096];
while (fgets(line, sizeof(line), stdin) != 0)
{
int rc = sscanf(line, "%lf %lf %lf %19s", &d[0], &d[1], &d[2], &text[0]);
if (rc == 4)
printf("%13.6f %13.6f %13.6f [%s]\n", d[0], d[1], d[2], text);
else if (rc == 3)
printf("%13.6f %13.6f %13.6f -NA-\n", d[0], d[1], d[2]);
else
printf("Format error: return code %d\n", rc);
}
return 0;
}
If given this file as standard input:
1.9969199999999998 2.4613199999999997 130.81278270000001 AA
2.4613199999999997 2.5541999999999998 138.59131554109211 BB
2.5541999999999998 2.9953799999999995 146.83238401449094 CC
19.20212223242525 29.3031323334353637 3940.41424344454647
19.20212223242525 29.3031323334353637 3940.41424344454647 PolyVinyl-PolySaccharide
the output is:
1.996920 2.461320 130.812783 [AA]
2.461320 2.554200 138.591316 [BB]
2.554200 2.995380 146.832384 [CC]
19.202122 29.303132 3940.414243 -NA-
19.202122 29.303132 3940.414243 [PolyVinyl-PolySacch]
You can tweak the output format to suit yourself. Note that the %19s avoids buffer overflow even when the text is longer than 19 characters.
I have the following in a text file called: values.txt
1 4
2.5 3.76
122 10
277.543
165.4432
I am trying to read the content of this text file, and add each two pairs together and output the result ...
the output would be something like this :
1 Pair:(1, 4) = 5
2 Pair:(2.5, 3.76)= 6.26
and so on ..
I am opening the file like this
int c;
FILE myfile;
myfile= fopen("values.txt", "r");
if ( myfile == NULL ) {
printf("Cannot open TEXT file\n");
return 1;
}
double aa,bb;
while ( (c = getc(myfile) ) != EOF ) {
// HERE SHOULD I DO THE OUTPUT BUT HOW?
}
Any help is really appreciated ..
Language = C
The following code does what you expect. myfile should be declared as FILE*, because fopen returns a pointer to a FILE structure. If the file is very large, I would recommend reading it in large buffers (e.g. 65535 bytes), parsing each buffer character by character, and converting the text to float values yourself. That reduces system call overhead, which takes more time than converting text to float values.
#include <stdio.h>
int main(void)
{
FILE* myfile;
myfile = fopen("values.txt", "r");
if ( myfile == NULL ) {
printf("Cannot open TEXT file\n");
return 1;
}
double aa,bb;
while (2 == fscanf(myfile, "%lf %lf", &aa, &bb)) {
printf("%lf\n", aa+bb);
}
return 0;
}
For this simple task, use:
double a, b;
if (fscanf(myfile, "%lf %lf", &a, &b) == 2)
    printf("%f + %f = %f\n", a, b, a+b);
Looks like a homework problem, but fscanf can read a number into a variable like this:
int n;
fscanf (myfile,"%d",&n);
You haven't shown what you need as output for the single-value lines, but this looks like a case for fgets() and sscanf(), unless you really want the two lines with a single value to be processed as a unit.
char buffer[256];
int rownum = 0;
while (fgets(buffer, sizeof(buffer), myfile) != 0)
{
double aa, bb;
int n = sscanf(buffer, "%lf %lf", &aa, &bb);
if (n == 2)
printf("%d Pair:(%g, %g) = %g\n", ++rownum, aa, bb, aa+bb);
else if (n == 1)
printf("%d Solo:(%g) = %g\n", ++rownum, aa, aa);
else
{
printf("Failed to find any numbers in <<%s>>\n", buffer);
}
}
If you used fscanf(myfile, "%lf %lf", &aa, &bb), then it would read over newlines (they count as white space) looking for numbers, so it would read one number from one line, and the second from another line. This is not usually what people are after (but when it is what you need, it is extremely useful). Error recovery with fscanf() tends to be more fraught than with fgets() and sscanf().
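To make that concrete, here is a small sketch of the fscanf() behaviour; the function name sum_pairs and the caller-supplied sums array are assumptions for illustration:

```c
#include <stdio.h>

/* Reads numbers two at a time with fscanf(), ignoring line boundaries, and
   stores each pair's sum in sums[]. Returns the number of pairs read. */
int sum_pairs(FILE *fp, double *sums, int max_pairs)
{
    double aa, bb;
    int n = 0;

    /* Newlines count as white space, so a pair may span two lines. */
    while (n < max_pairs && fscanf(fp, "%lf %lf", &aa, &bb) == 2)
        sums[n++] = aa + bb;
    return n;
}
```

Run on the values.txt from the question, this yields four pairs: the two single-value lines, 277.543 and 165.4432, are combined into one pair.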
It's in C++, sorry :( I don't know C.
This is very simple logic for simple minds :D I'm a beginner too. I haven't tested this program, so sorry if something goes wrong, but my parser worked on exactly the same principle and it worked fine. So the method is sound, if not very efficient.
Do not use this program straight away; understand its logic, which will help you a lot. Copying it won't teach you anything.
...parser tutors are so rare....
#include <cstdlib>    // atof
#include <fstream>    // std::ifstream
#include <iostream>

int x = 0;               // index of the next free element in bigch
char ch = 'r';           // dummy start value so ch is never used uninitialized
char bigch[10] = {0};    // characters of the number currently being read
int checknumber = 0;     // flag: how many numbers of the current pair are complete
float firstnumber = 0;
float secondnumber = 0;
float result = 0;

// An array argument is really a pointer, so any change made here directly
// affects bigch itself. The function overwrites the first xar elements with
// spaces so that no digits from the previous number are left behind.
// xar is the x from main: the number of filled elements.
void clearar(char frombigar[], int xar)
{
    for (int i = 0; i < xar; i++) {
        frombigar[i] = ' ';
    }
}

int main()
{
    std::ifstream myfile("values.txt");   // file opening goes here
    while (myfile.get(ch))                // get() reads one character and moves the cursor forward
    {
        if (ch != ' ' && ch != '\n')      // a space or newline marks where one number ends
        {
            if (x < 9) {                  // leave room for the terminating '\0'
                bigch[x] = ch;            // store the character in bigch
                x++;
            }
            continue;
        }
        if (x == 0) continue;             // ignore repeated separators
        checknumber++;                    // first separator sets the flag to 1, second to 2
        if (checknumber == 1)
        {
            firstnumber = atof(bigch);    // atof converts the character array to a number
        }
        else
        {
            secondnumber = atof(bigch);   // atof returns 0 if the text is not a number
            result = firstnumber + secondnumber;
            std::cout << firstnumber << " + " << secondnumber << " = " << result << "\n";
            checknumber = 0;              // both numbers read: reset the flag for the next pair
        }
        clearar(bigch, x);                // wipe the buffer so no ghost digits remain
        x = 0;                            // start filling bigch from the beginning again
    }
}