Read Txt file Language C - c

Hi guys I have this file struct:
0
2 4
0: 1(ab) 5(b)
1: 2(b) 6(a)
2: 0(a) 2(b)
3: 2(a) 6(b)
4: 5(ab)
5: 2(a) 6(b)
6: 4(b) 6(ab)
Each line will feed a struct with its data (numbers + letters).
What's the best way to read the line and get the strings I want?
Example:
0
2 4
0,1,ab,5,b
1,2,b,5,a
...
The lines may vary in size because we can have 1, 2, 3, .... numbers.
I already did it :
//struct
#define MAX_ 20
struct otherstats{ //struct otherStats
int conectstat[MAX_];//connections to other states
int transitions[MAX_];//symbols on each connection (ASCII)
}tableStats[MAX_];
struct sAutomate{
int stat_initial; //initial state
int stats_finals[MAX_]; //final state(s)
struct otherstats tableStats[MAX_]; //otherStats 0 1 2 3 4 5 6
};
/* Example of what I want, using the file above:
sAutomate.stat_initial=0
sAutomate.stats_finals[0]=2
sAutomate.stats_finals[1]=4
Others Stats table
//0
sAutomate.tableStats[0].conectstat[0]=1;
sAutomate.tableStats[0].conectstat[1]=5;
sAutomate.tableStats[0].transitions[0]=ab;
sAutomate.tableStats[0].transitions[1]=b;
//1
sAutomate.tableStats[1].conectstat[0]=2;
sAutomate.tableStats[1].conectstat[1]=6;
sAutomate.tableStats[1].transitions[0]=b;
sAutomate.tableStats[1].transitions[1]=a;
///etc
*/
void scanfile(){ //function to read the file
struct sAutomate st; //initialize st struct
char filename[] = "txe.txt";
FILE *file = fopen ( filename, "r" );
char buf[81];
char parts[5][11];
fscanf(file,"%d", &st.stat_initial);//read first line
printf("initial state : %d \n", st.stat_initial);
fscanf(file,"%d",&st.stats_finals);
fscanf(file,"%d",&st.stats_finals);
while (fgets(buf, sizeof(buf), stdin) != NULL)
{
if (sscanf(buf, "%10[^:]: (%10[^(], %10[^)]), (%10[^(], %10[^)])",
parts[0], parts[1], parts[2], parts[3], parts[4]) == 5)
{
printf("parts: %s, %s, %s, %s, %s\n",
parts[0], parts[1], parts[2], parts[3], parts[4]);
}
else
{
printf("Invalid input: %s", buf);
}
}
//fclose

First problem I see is you're overwriting stats_finals:
fscanf(file,"%d",&st.stats_finals);
fscanf(file,"%d",&st.stats_finals);
What you wanted to do here was:
fscanf(file,"%d",&st.stats_finals[0]);
fscanf(file,"%d",&st.stats_finals[1]);
To save off both the "2" and the "4" from the text file.
Second major problem is you're reading from stdin:
while (fgets(buf, sizeof(buf), stdin) != NULL)
That doesn't read your text file, that reads input from the keyboard... So you wanted that to be:
while (fgets(buf, sizeof(buf), file) != NULL)
Third (minor) problem is that fscanf() will not read newlines, and fgets() will. This means when you go from reading your second stats_finals to the first read in the while loop, your first input will just be the left over newline character. That's not a big deal since you check for "invalid input", but it's worth noting.
Finally, your sscanf looks wrong to me:
sscanf(buf, "%10[^:]: (%10[^(], %10[^)]), (%10[^(], %10[^)])",
The "%10" is a field width of 10; I don't think that's what you wanted. And why are you checking for commas? You didn't have any in your text file.
I think this is more what you were looking for:
sscanf(buf, "%[0-9]: %[0-9](%[^)]) %[0-9](%[^)])",
       parts[0], parts[1], parts[2], parts[3], parts[4]);
Here each %[0-9] matches a run of digits (0 to 9).
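The leftover newline from the third point above can also be consumed explicitly before switching from fscanf() to fgets(). The helper below is only a sketch; the name skip_rest_of_line is hypothetical and not part of the original code.

```c
#include <stdio.h>

/* Hypothetical helper: discard everything up to and including the next
 * newline, or up to EOF if no newline remains. */
void skip_rest_of_line(FILE *fp)
{
    int ch;

    while ((ch = fgetc(fp)) != EOF && ch != '\n')
        ;  /* nothing to do, just consume the character */
}
```

Calling it once after the last fscanf() means the first fgets() in the loop starts on the line for state 0 instead of on an empty remainder.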
EDIT
Missed your original point. If you don't know how long the strings will be that you're reading, you can't use sscanf(). It's that simple. :)
The scanf family assumes you know how many objects you'll be parsing and the format string takes in that many. There are other options however.
Read a single line with fgets as you're doing, then tokenize it, either with the C function strtok or by hand with a for loop.
One note however:
Since you don't know how long it is, this: char parts[5][11]; is not your best bet. This limits you to 2 entries... probably it would be better to do this dynamically (read the line then allocate the correct size to store your tokens in.)
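One way to size the token storage dynamically, as suggested above, is to count the tokens first and then allocate exactly that many pointers. This is a minimal sketch; the function name split_tokens and the choice of delimiters are assumptions, not something from the question.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: split `line` in place on whitespace.
 * Returns a malloc'd array of pointers into `line` and stores the token
 * count in *n_out. The caller frees only the array.
 * Returns NULL if there are no tokens or the allocation fails. */
char **split_tokens(char *line, size_t *n_out)
{
    const char *delims = " \t\r\n";
    size_t n = 0;

    /* First pass: count tokens without modifying the line. */
    for (const char *p = line; *p != '\0'; ) {
        p += strspn(p, delims);            /* skip separators */
        size_t len = strcspn(p, delims);   /* length of the next token */
        if (len == 0)
            break;
        n++;
        p += len;
    }

    *n_out = 0;
    if (n == 0)
        return NULL;

    char **tokens = malloc(n * sizeof *tokens);
    if (tokens == NULL)
        return NULL;

    /* Second pass: strtok writes '\0' terminators into `line`. */
    size_t i = 0;
    for (char *tok = strtok(line, delims); tok != NULL && i < n;
         tok = strtok(NULL, delims))
        tokens[i++] = tok;

    *n_out = i;
    return tokens;
}
```

For the line "0: 1(ab) 5(b)" this yields three tokens, which can then be examined one at a time.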

If you really don't know how many numbers and letters the line will contain, why are you reading a fixed amount of numbers and letters?
You could read the whole line with fgets and then parse it with a tokenizer like strtok, something like this:
const char* const DELIMITERS = " ";
int i; // index for tableStats
char* token;
token = strtok(line, DELIMITERS);
// first integer
if (token == NULL || sscanf(token, "%d:", &i) < 1) {
    // error
}
/* it seems like you should have at least one element in your "list",
 * otherwise this is not necessary
 */
// NOTE: "%[^)]" stores a string, so this assumes transitions[] is changed
// from int to a char array, e.g. char transitions[MAX_][MAX_] with MAX_ 20
token = strtok(NULL, DELIMITERS);
if (token == NULL || sscanf(token, "%d(%19[^)])",
        &(tableStats[i].conectstat[0]),
        tableStats[i].transitions[0]) < 2) {
    // error
}
// read optional part
for (int j = 1; j < MAX_ && (token = strtok(NULL, DELIMITERS)) != NULL; ++j)
    if (sscanf(token, "%d(%19[^)])", &(tableStats[i].conectstat[j]),
            tableStats[i].transitions[j]) < 2)
        break;
Remember that strtok modifies the string it tokenizes; make a copy first if you still need the original.
Obviously this code is for the arbitrarily long lines; reading the first two lines is trivial.
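As a rough, self-contained illustration of parsing one transition line such as "0: 1(ab) 5(b)", the sketch below uses sscanf with %n instead of strtok, so the input line is not modified. The types struct transition and struct state_row, the function parse_state_line, and the size limits are all hypothetical; symbols are stored as short strings rather than in the question's int array.

```c
#include <stdio.h>

#define MAX_TRANS 20   /* assumed upper bound on transitions per state */
#define MAX_SYMS  16   /* assumed upper bound on symbols per transition */

struct transition {
    int  target;              /* state reached by this transition */
    char symbols[MAX_SYMS];   /* symbols that trigger it, e.g. "ab" */
};

struct state_row {
    int state;                            /* state number before the ':' */
    int count;                            /* transitions actually parsed */
    struct transition trans[MAX_TRANS];
};

/* Parse a line like "0: 1(ab) 5(b)".
 * Returns the number of transitions read, or -1 if the "N:" prefix is
 * missing. */
int parse_state_line(const char *line, struct state_row *row)
{
    int used = 0;

    row->count = 0;
    /* %n is only reached if the ':' literal matched, so used stays 0
     * when the prefix is malformed. */
    if (sscanf(line, "%d :%n", &row->state, &used) != 1 || used == 0)
        return -1;

    const char *p = line + used;
    while (row->count < MAX_TRANS) {
        struct transition *t = &row->trans[row->count];

        used = 0;
        if (sscanf(p, "%d(%15[^)])%n", &t->target, t->symbols, &used) != 2
            || used == 0)
            break;               /* no further "N(symbols)" entry */
        p += used;               /* continue after the closing ')' */
        row->count++;
    }
    return row->count;
}
```

Each line returned by fgets() can be passed to parse_state_line directly, and any number of entries up to MAX_TRANS is accepted.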

Related

How to parse each column in a CSV file using C

I'm trying to use C to read a CSV file, iterate line by line (until EOF), and delimit/split each line by the comma. Then I wish to separate each column into "bins" and put add them to a struct (which isn't shown here; I defined it in a helper file) based on type.
For example, if I have 1,Bob, I'd like to split 1 and Bob into two variables. Here's what I've written so far.
void readFile(char file[25]) {
FILE *fp;
char line[1000];
fp = fopen(file, "r");
while(fgets(line, 1000, fp)) {
char* tmp = strdup(line);
char* token;
while((token = strsep(&tmp, ","))) {
printf("%s\n", token); // I want to split token[0] and token[1]
}
}
fclose(fp);
}
The above code does compile and run. I just don't know how to access each split of the token, like token[0] or token[1]. In Python, this would be simple enough. I could just access 1 using token[0] and Bob using token[1] for each line. But here in C, I can't do that.
For testing purposes, all I'm doing right now is printing each line (in the second while loop), just to see how each split looks. I haven't implemented the code where I put each split line into its respective struct member.
I've searched Stack Overflow and found a multitude of threads on this topic. None of them seemed to help me except for this one, which I have drawn from. But I wasn't able to get the storing of split columns working.
In python, this would be simple enough. I could just access 1 using token[0] and Bob using token[1] for each line. But here in C, I can't do that.
Yes, you can, if only you define the array.
while (fgets(line, sizeof line, fp))
{
char *tmp = strchr(line, '\n');
if (tmp) *tmp = '\0'; // remove the '\n'
tmp = strdup(line);
#define MAXCOLUMNS 2
char *token[MAXCOLUMNS];
int c = 0;
while (tmp)
{
if (c == MAXCOLUMNS) puts("too many columns"), exit(1);
token[c++] = strsep(&tmp, ",");
}
if (1 <= c) printf("column 1: %s\n", token[0]);
if (2 <= c) printf("column 2: %s\n", token[1]);
// ONLY if the line's tokens are no longer needed:
free(*token);
}
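To go one step further and put the columns into a struct, something like the following could work. It is only a sketch: struct person and parse_person are hypothetical names (the question's real struct is not shown), and it uses strchr and strtol from standard C rather than strsep, which is not part of the C standard.

```c
#include <stdlib.h>
#include <string.h>

struct person {        /* hypothetical record type for lines like "1,Bob" */
    int  id;
    char name[32];
};

/* Parse "id,name" (optionally ending in a newline) into *out.
 * Returns 1 on success, 0 if the line is malformed. */
int parse_person(const char *line, struct person *out)
{
    const char *comma = strchr(line, ',');
    if (comma == NULL)
        return 0;

    char *end;
    long id = strtol(line, &end, 10);
    if (end == line || end != comma)   /* no digits, or junk before the comma */
        return 0;

    size_t len = strcspn(comma + 1, "\r\n");   /* name length without newline */
    if (len == 0 || len >= sizeof out->name)
        return 0;

    out->id = (int)id;
    memcpy(out->name, comma + 1, len);
    out->name[len] = '\0';
    return 1;
}
```

Because the function only reads from the line, no strdup() or free() is needed for each row.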

How to read only integers from a file with strings, spaces, new lines and integers in C

I know this is a very trivial question but I just need quick help. I have been trying to figure this out for a while now. All I am trying to do is read only integers from a text file that has the form
8 blah blah
10 blah blah
2 blah blah
3 blah blah
I ultimately want to take the numbers only, store them in an array and put those numbers in a BST. My BST works fine when I have a file with just numbers, but not with the specified file format.
It doesn't matter what blah is I just want to get the numbers and store them in an array. I can do this if I take out the blah's. Using fscanf, I got my code to store the first number which is 8, but it stops there. Also in this example there are four lines but it doesn't matter how many lines are in the file. It could be 12 or 6. How can I properly do this. Below is my poor attempt to solve this.
fscanf(instructionFile, "%d", &num);
I also tried doing something like
while(!feof(instructionFile)){
fscanf("%d %s %s", &num, string1, string2);
}
To store everything and only use the integers, but my BST doesn't work when I do something like that.
Use fgets() to fetch a line of input, and sscanf() to get the integer. In your example use of fscanf(), the first call would read an int, and the next calls would fail since the next item in the input stream is not an int. After each failure, the bad input is left in the input stream. By getting a line at a time, you can scan the line at your leisure, before fetching another line of input.
Here is an example of how you might do this. And note that you should not use feof() to control the read loop; instead, use the return value from fgets(). This code assumes that the first entry on a line is the data you want, perhaps with leading whitespace. The format string can be modified for slightly more complex circumstances. You can also use strtok() if you need finer control over parsing of the lines.
#include <stdio.h>
#include <stdlib.h>
#define MAX_LINES 100
int main(void)
{
FILE *fp = fopen("data.txt", "r");
if (fp == NULL) {
fprintf(stderr, "Unable to open file\n");
exit(EXIT_FAILURE);
}
char buffer[1000];
int arr[MAX_LINES];
size_t line = 0;
while ((fgets(buffer, sizeof buffer, fp) != NULL)) {
if (sscanf(buffer, "%d", &arr[line]) != 1) {
fprintf(stderr, "Line formatting error\n");
exit(EXIT_FAILURE);
}
++line;
}
for (size_t i = 0; i < line; i++) {
printf("%5d\n", arr[i]);
}
fclose(fp);
return 0;
}
It would be good to add a check for empty lines before the call to sscanf(); right now an empty line is considered badly formatted data.
Output for your example file:
8
10
2
3
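The empty-line check suggested above might be written as the small helper below. The name is_blank_line is an assumption for illustration; it treats a line made up only of whitespace as empty.

```c
#include <ctype.h>

/* Hypothetical helper: return 1 if the line is empty or contains only
 * whitespace, otherwise 0. */
int is_blank_line(const char *s)
{
    while (*s != '\0') {
        if (!isspace((unsigned char)*s))
            return 0;
        s++;
    }
    return 1;
}
```

Inside the read loop, a call such as `if (is_blank_line(buffer)) continue;` placed before sscanf() would skip such lines instead of reporting a formatting error.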
If you want to pick out only integers from a mess of a file, then you actually need to work through each line you read with a pointer to identify each beginning digit (or a leading '-' sign for negative numbers), converting each integer found one at a time. You can do this with a pointer and sscanf, or you can do this with strtol, making use of the endptr parameter to move to the next character following any successful conversion. You can also use character-oriented input (e.g. getchar or fgetc), manually performing the digit identification and conversion if you like.
Given you started with the fgets and sscanf approach, the following continues with it. Whether you use sscanf or strtol, the whole key is to advance the start of your next read to the character following each integer found, e.g.
#include <stdio.h>
#include <stdlib.h>
#define MAXC 256
int main (int argc, char **argv) {
char buf[MAXC] = ""; /* buffer to hold MAXC chars at a time */
int nval = 0; /* total number of integers found */
FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;
if (!fp) { /* validate file open for reading */
fprintf (stderr, "error: file open failed '%s'.\n", argv[1]);
return 1;
}
while (fgets (buf, MAXC, fp)) {
char *p = buf; /* pointer to line */
int val, /* int val parsed */
nchars = 0; /* number of chars read */
/* while chars remain in buf and a valid conversion to int takes place
* output the integer found and update p to point to the start of the
* next digit.
*/
while (*p) {
if (sscanf (p, "%d%n", &val, &nchars) == 1) {
printf (" %d", val);
if (++nval % 10 == 0) /* output 10 int per line */
putchar ('\n');
}
p += nchars; /* move p nchars forward in buf */
/* find next number in buf */
for (; *p; p++) {
if (*p >= '0' && *p <= '9') /* positive value */
break;
if (*p == '-' && *(p+1) >= '0' && *(p+1) <= '9') /* negative */
break;
}
}
}
printf ("\n %d integers found.\n", nval);
if (fp != stdin) fclose (fp); /* close file if not stdin */
return 0;
}
Example Input
The following two input files illustrate picking only integers out of mixed input. Your file:
$ cat dat/blah.txt
8 blah blah
10 blah blah
2 blah blah
3 blah blah
A really messy file
$ cat ../dat/10intmess.txt
8572,;a -2213,;--a 6434,;
a- 16330,;a
- The Quick
Brown%3034 Fox
12346Jumps Over
A
4855,;*;Lazy 16985/,;a
Dog.
11250
1495
Example Use/Output
In your case:
$ ./bin/fgets_sscanf_int_any_ex < dat/blah.txt
8 10 2 3
4 integers found.
With the really messy file:
$ ./bin/fgets_sscanf_int_any_ex <dat/10intmess.txt
8572 -2213 6434 16330 3034 12346 4855 16985 11250 1495
10 integers found.
Look things over and let me know if you have any questions.
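For comparison, here is a rough sketch of the strtol route mentioned at the start of this answer. The function name extract_ints and its interface are assumptions; it scans a single line already read with fgets and uses endptr to continue after each number.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: copy up to `max` integers found in `s` into `out`,
 * skipping every other character. Returns how many were stored. */
size_t extract_ints(const char *s, long *out, size_t max)
{
    size_t n = 0;

    while (*s != '\0' && n < max) {
        int starts_number = isdigit((unsigned char)*s) ||
                            (*s == '-' && isdigit((unsigned char)s[1]));
        if (!starts_number) {
            s++;                      /* not the start of an integer */
            continue;
        }

        char *end;
        errno = 0;
        long val = strtol(s, &end, 10);
        if (errno != ERANGE)          /* drop values that overflow long */
            out[n++] = val;
        s = end;                      /* continue right after the digits */
    }
    return n;
}
```

The scanning rule is the same as in the sscanf version: a number starts at a digit, or at a '-' immediately followed by a digit.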
A simple way to "read only integers" is to use fscanf(file_pointer, "%d", ...) and to call fgetc() when that fails:
int x;
int count;
while ((count = fscanf(file_pointer, "%d", &x)) != EOF) {
if (count == 1) {
// Use the `int` in some fashion (store them in an array)
printf("Success, an int was read %d\n", x);
} else {
fgetc(file_pointer); // Quietly consume 1 non-numeric character
}
}
I got my code to store the first number which is 8, but it stops there.
That is because the offending non-numeric input remains in the FILE stream. That text needs to be consumed in some other way. Calling fscanf(instructionFile, "%d", &num); again simply results in the same problem: fscanf() fails because the initial input is non-numeric.
Note: OP's code is missing the FILE pointer
// fscanf(????, "%d %s %s", &num, string1, string2);

C - How to read a list of space-separated text file of numbers into a List

I am trying to read a textfile like this
1234567890 1234
9876543210 22
into a List struct in my program. I read in the file via fgets() and then use strtok to separate the numbers, put them into variables and then finally into the List. However, I find that in doing this and printing the resulting strings, strtok always takes the final string in the final line to be NULL, thus resulting in a segmentation fault.
fgets(fileOutput,400,filePointer); //Read in a line from the file
inputPlate = strtok(fileOutput," "); // Take the first token, store into inputPlate
while(fileOutput != NULL)
{
string = strtok(NULL," ");
mileage = atoi(string); //Convert from string to integer and store into mileage
car = initializeCar(mileage,dateNULL,inputPlate);
avail->next = addList(avail->next,car,0);
fgets(fileOutput,400,filePointer);
inputPlate = strtok(fileOutput," ");
}
How do I resolve this?
Reading a text file line by line with fgets() is good.
Not checking the return value of fgets() is weak. This caused OP's code to process beyond the last line.
// Weak code
// fgets(fileOutput,400,filePointer); //Read in a line from the file
// ...
// while(fileOutput != NULL)
// {
Better to check the result of fgets() to determine when input is complete:
#define LINE_SIZE 400
...
while (fgets(fileOutput, LINE_SIZE, filePointer) != NULL)
{
Then process the string. A simple way to assess parsing success is to append " %n" to a sscanf() format to record the offset of the scan.
char inputPlate[LINE_SIZE];
int mileage;
int n = -1;
sscanf(fileOutput, "%s%d %n", inputPlate, &mileage, &n);
// Was `n` not changed? Did scanning stop before the string end?
if (n < 0 || fileOutput[n] != '\0') {
Handle_Bad_input();
break;
} else {
car = initializeCar(mileage, dateNULL, inputPlate);
avail->next = addList(avail->next,car,0);
}
}
You could write a simpler parser with fscanf():
FILE *filePointer;
... // code not shown for opening the file, initializing the list...
char inputPlate[32];
int mileage;
while (fscanf(filePointer, "%31s%d", inputPlate, &mileage) == 2) {
car = initializeCar(mileage, dateNULL, inputPlate);
avail->next = addList(avail->next, car, 0);
}
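If you would rather keep the original strtok approach, the crash can be avoided by checking every token for NULL before using it. The sketch below assumes a hypothetical helper parse_plate_line and leaves the list handling (initializeCar, addList) to the caller.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: split "PLATE MILEAGE" in place.
 * On success stores a pointer to the plate text and the mileage, and
 * returns 1. Returns 0 if a field is missing or the mileage is not a
 * plain decimal number. */
int parse_plate_line(char *line, char **plate, int *mileage)
{
    const char *delims = " \t\r\n";

    char *first = strtok(line, delims);
    if (first == NULL)
        return 0;                     /* empty line */

    char *second = strtok(NULL, delims);
    if (second == NULL)
        return 0;                     /* mileage column missing */

    char *end;
    long val = strtol(second, &end, 10);
    if (end == second || *end != '\0')
        return 0;                     /* not a number */

    *plate = first;
    *mileage = (int)val;
    return 1;
}
```

Unlike atoi(), which must never be handed a NULL pointer, this reports a bad or empty line through its return value, so the read loop can skip it or stop.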

Analyzing Strings with sscanf

I need to analyze a string previously read with fgets,
so I have rows like these:
name age steps\n
mario 10 1 2 3 4\n
joe 15 3 5\n
max 20 9 3 2 4 5\n
there is a variable number of steps in each row,
and I can read name and age with
sscanf(mystring, "%s %d", name, &age);
after this I have a loop to read all the steps
int step[20];
int index=0;
while(sscanf(mystring,"%d", &step[index++])>0);
but this loop never ends, filling the whole array with the age column.
The reason this never ends is because you are constantly providing the same string to scan.
sscanf provides the %n conversion specifier, which stores the number of characters consumed so far in an int. That lets you move forward in your input string by that many characters before rescanning.
This'll work:
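int step[20];
int index=0;
int readLen;
// assumes mystring points at the first step value;
// index only advances after a successful conversion
while(index < 20 && sscanf(mystring,"%d%n", &step[index], &readLen)==1) {
mystring += readLen;
index++;
}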
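Putting the name, age, and steps together, a complete row could be parsed along these lines. This is a sketch under assumptions: parse_row and MAX_STEPS are hypothetical, and the name buffer is assumed to hold at least 32 bytes.

```c
#include <stdio.h>

#define MAX_STEPS 20   /* assumed capacity, matching step[20] above */

/* Parse "name age step step ..." from `line`.
 * `name` must have room for at least 32 bytes.
 * Returns the number of steps read, or -1 if name or age is missing. */
int parse_row(const char *line, char *name, int *age, int steps[MAX_STEPS])
{
    int used = 0;

    /* %n records how far the name and age reached into the line. */
    if (sscanf(line, "%31s %d%n", name, age, &used) != 2)
        return -1;

    const char *p = line + used;
    int count = 0;
    while (count < MAX_STEPS &&
           sscanf(p, "%d%n", &steps[count], &used) == 1) {
        p += used;     /* skip past the number just read */
        count++;
    }
    return count;
}
```

The header row "name age steps" fails the first sscanf (its second field is not a number), so it is rejected with -1 rather than parsed.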
A working solution is given in the answer from sokkyoku.
Another possibility to read variable length lines is to use strtok like in the following code snippet:
int getlines (FILE *fin)
{
int nlines = 0;
int count = 0;
char line[BUFFSIZE]={0};
char *p;
if(NULL == fgets(line, BUFFSIZE, fin)) // read and discard the header line
return -1;
while(fgets(line, BUFFSIZE, fin) != NULL) {
//Remove the '\n' or '\r' character
line[strcspn(line, "\r\n")] = 0;
count = 0;
printf("line[%d] = %s\n", nlines, line);
for(p = line; (p = strtok(p, " \t")) != NULL; p = NULL) {
printf("%s ", p);
++count;
}
printf("\n\n");
++nlines;
}
return nlines;
}
Explanation of the above function getlines:
Each line in the file fin is read using fgets and stored in the variable line.
Then each substring in line (separated by a white space or \t character) is extracted and the pointer to that substring stored in p, by means of the function strtok in the for loop (see for example this post for further example on strtok).
The function then just prints p, but you can do anything you like with the substring here.
I also count (++count) the number of items found in each line. At the end, the function getlines returns the number of lines read.

file input int strings c [duplicate]

So I'm having issues right now with my program. I'm trying to get it to open a file, count the lines, rewind, and then go through the file to store the variables.
The format of the file is String String String int, but I'm having issues after I count the lines. I can print the numbers to the screen, but then I get a seg fault right after the print. I don't know why.
int countLines(FILE * fin){
int count=0;
char street[100];
char city[100];
char state[3];
int zip;
do{
fgets(street, 100, fin);
fgets(city, 100, fin);
fgets(state, 3, fin);
fscanf(fin, "%d\n", &zip);
count++;
}while(!feof(fin));
rewind(fin);
return count;
}
lines=countLines(fin); is how I call the function. What am I doing wrong?
Do not mix fgets() with fscanf() until you are very comfortable with these functions. They do not play well together. That \n in the format is a white space and will match any number of consecutive white space including multiple \n, spaces, tabs, etc.
// fscanf(fin, "%d\n", &zip);
Recommend avoiding feof() and using the return value from fgets().
feof() does not become true until a file read is attempted and fails to provide a char. This is different than "true when none left". Example: you read the last char of a file. feof() is still false. Code attempts to read more (and fails). Now feof() is true. (and remains true).
Do count the lines in a simple fashion and use symmetry. Further consider more error checking. Read the zip code line as a string and then parse it as an integer.
int countLines(FILE * fin){
int count=0;
char street[100];
char city[100];
char state[100];
char zips[100];
unsigned zip;
while (fgets(street, sizeof street, fin) != NULL) {
count++;
}
rewind(fin);
if (count%4 != 0) Handle_LineCountNotMultipleof4();
// You could return here, but let's read the file again and get the data.
// This is likely part of OP's next step.
for (int i=0; i<count; i += 4) {
if ((NULL == fgets(street, sizeof street, fin)) ||
(NULL == fgets(city, sizeof city, fin)) ||
(NULL == fgets(state, sizeof state, fin)) ||
(NULL == fgets(zips, sizeof zips, fin)) ||
(1 != sscanf(zips, "%u", &zip))) handle_error();
// Remember street, city, state, still have an ending \n
do_something(street, city, state, zip);
}
return count;
}
Alternatively, to count the lines use the following. One difficulty in reading arises if you have long lines, so let's check that as we go. Take out the line-length tracking if you prefer a simpler answer. You could use Maxline+1 as your buffer size instead of a fixed 100.
size_t Maxline = 0;
size_t Curline = 0;
int ch;
while ((ch = fgetc(fin)) != EOF) {
Curline++;
if (ch == '\n') {
count++;
if (Curline > Maxline) Maxline = Curline;
Curline = 0;
}
}
if ((Maxline + 1) > 100) TroubleAhead(); // Trouble with a future fgets(buf, 100, fin); use bigger buffers
rewind(fin);
fgets tries to read a whole line, not just a word, and a word seems to be what you are hoping for.
So the buffers you are passing it aren't big enough and won't receive what you were hoping for, you are incrementing the count only once per group of lines, and fscanf usually won't consume a whole line, leaving the remainder of that line to interfere with your next read.
If you want to read words, try scanf with %s. If you want to read a fixed number of chars, try fread.
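If each record really is four whitespace-separated words, the %s approach could look like the sketch below. This rests on a strong assumption that is not confirmed by the question: street, city, and state must each be a single word with no spaces. The function name count_records is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: count "street city state zip" records in `fin`,
 * assuming every field is a single word. Rewinds the file before
 * returning so it can be read again. */
int count_records(FILE *fin)
{
    char street[100], city[100], state[3];
    int zip;
    int count = 0;

    /* The widths keep each %s within its buffer (size minus one). */
    while (fscanf(fin, "%99s %99s %2s %d", street, city, state, &zip) == 4)
        count++;

    rewind(fin);
    return count;
}
```

The loop is driven by the return value of fscanf() rather than by feof(), so it stops cleanly at end of file or at the first malformed record.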

Resources