How to save specific strings in a file to variables in C - c

So if for example I have a file with the following content:
STUDENTS: Three
NAME 1: Andy
NAME 2: Becky
NAME 3: Chris
TYPE: Undergrads
I would like to extract the names of the students into an array.
I have tried to implement this using fscanf, for instance this works and I can save "Three" to student struct:
fscanf(fptr, "STUDENTS: %s\n", student.count);
So I've tried some variations of this (where count is the number of lines in the file), but the names array remains empty:
int *num = NULL;
*num = 1;
int j;
for (j=0; j<count; j++) {
if (j != 0 && j != count-1) {
fscanf(fptr, "NAME %d: %s\n", num, student.names[j]);
*num+=1;
}
}
Is there a better method than fscanf, for example fseek(), which I am not really familiar with? Any ideas would be appreciated, thanks.
edit:
struct Students {
char *name;
char *type;
char *connections[6];
};
struct Students student;

The scanf family of functions isn't great for scanning lines that have variable formats. In this case a reasonable approach is to first scan the input line as a tag and string value separated by a colon.
char tag[MAX_TAG_SIZE], value[MAX_VALUE_SIZE];
if (fscanf(f, "%[^:]: %s ", tag, value) != 2) error("bad line format");
This format string gets any series of characters other than : into tag. Then it skips a : followed by whitespace. Then it gets a non-whitespace word into value followed by skipping whitespace (including newlines). The last bit gets the input ready to scan the next tag, which is important. The biggest mistake new C programmers make with scanf is forgetting to deal correctly with whitespace in the input stream.
Now you can inspect the tag to see what to do next:
if (strcmp("STUDENTS", tag) == 0) {
... Handle students value
} else if(strcmp("TYPE", tag) == 0) {
... Handle type value
} else if (strncmp("NAME", tag, 4) == 0) {
if (sscanf(tag + 4, "%d", &name_number) != 1) error("bad name number");
... Handle name_number and value
} else error("unexpected tag");
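The same idea can be sketched as a function that works on a line already read with fgets(), which keeps the parsing separate from the file handling. This is only a sketch: the names classify_line and enum line_kind, and the buffer sizes, are made up for illustration and are not from the question.

```c
#include <stdio.h>
#include <string.h>

#define MAX_TAG_SIZE 32
#define MAX_VALUE_SIZE 64

/* Kinds of lines in the example file (hypothetical names). */
enum line_kind { LINE_BAD, LINE_STUDENTS, LINE_TYPE, LINE_NAME };

/*
 * Split one line of the form "TAG: value" and classify it.
 * On LINE_NAME, *name_number receives the number that follows "NAME".
 * value must have room for MAX_VALUE_SIZE bytes.
 */
enum line_kind classify_line(const char *line, char *value, int *name_number)
{
    char tag[MAX_TAG_SIZE];

    /* The widths (31 and 63) keep sscanf from overflowing the buffers. */
    if (sscanf(line, " %31[^:]: %63s", tag, value) != 2)
        return LINE_BAD;

    if (strcmp(tag, "STUDENTS") == 0)
        return LINE_STUDENTS;
    if (strcmp(tag, "TYPE") == 0)
        return LINE_TYPE;
    /* The tag is e.g. "NAME 2"; the number starts after the first 4 chars. */
    if (strncmp(tag, "NAME", 4) == 0 &&
        sscanf(tag + 4, "%d", name_number) == 1)
        return LINE_NAME;
    return LINE_BAD;
}
```

A caller would loop over fgets(), call classify_line() on each line, and copy value into the right slot when LINE_NAME comes back, for example index name_number - 1 of a names array.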

Related

parsing a file while reading in c

I am trying to read each line of a file and store binary values into appropriate variables.
I can see that there are many many other examples of people doing similar things and I have spent two days testing out different approaches that I found but still having difficulties getting my version to work as needed.
I have a txt file with the following format:
in = 00000000000, out = 0000000000000000
in = 00000000001, out = 0000000000001111
in = 00000000010, out = 0000000000110011
......
I'm attempting to use fscanf to consume the unwanted characters "in = ", "," and "out = "
and keep only the characters that represent binary values.
My goal is to store the first column of binary values, the "in" values into one variable
and the second column of binary values, the "out" value into another buffer variable.
I have managed to get fscanf to consume the "in" and "out" characters but I have not been
able to figure out how to get it to consume the "," "=" characters. Additionally, I thought that fscanf should consume the white space but it doesn't appear to be doing that either.
I can't seem to find any comprehensive list of available directives for scanners, other than the generic "%d, %s, %c....." and it seems that I need a more complex combination of directives to filter out the characters that I'm trying to ignore than I know how to format.
I could use some help with figuring this out. I would appreciate any guidance you could
provide to help me understand how to properly filter out "in = " and ", out = " and how to store
the two columns of binary characters into two separate variables.
Here is the code I am working with at the moment. I have tried other iterations of this code using fgetc() in combination with fscanf() without success.
int main()
{
FILE * f = fopen("hamming_demo.txt","r");
char buffer[100];
rewind(f);
while((fscanf(f, "%s", buffer)) != EOF) {
fscanf(f,"%[^a-z]""[^,]", buffer);
printf("%s\n", buffer);
}
printf("\n");
return 0;
}
The outputs from my code appear as follows:
= 00000000000,
= 0000000000000000
= 00000000001,
= 0000000000001111
= 00000000010,
= 0000000000110011
Thank you for your time.
The scanf family of functions is said to be a poor man's parser because it is not very tolerant of input errors. But if you are sure of the format of the input data, it allows for simple code. The only magic here is that a space in the format string matches any run of whitespace characters, including newlines, or none at all. Your code could become:
#include <stdio.h>
int main(void)
{
FILE * f = fopen("hamming_demo.txt", "r");
if (NULL == f) { // always test open
perror("Unable to open input file");
return 1;
}
char in[50], out[50]; // directly get in and out
// BEWARE: fscanf returns the number of converted items (or EOF if input ends
// before the first conversion), so compare against the expected count
while (fscanf(f, " in = %49[01], out = %49[01]", in, out) == 2) {
printf("%s - %s\n", in, out);
}
printf("\n");
return 0;
}
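If the binary digits are wanted as numbers rather than as strings, one possible follow-up is to parse each line with sscanf() and convert the two fields with strtoul() in base 2. This is a sketch under assumptions: the function name parse_in_out is hypothetical, and it assumes that neither field is longer than 32 digits.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Parse one line such as "in = 00000000001, out = 0000000000001111".
 * Returns 1 and stores both fields as numbers on success, 0 otherwise.
 */
int parse_in_out(const char *line, unsigned long *in_val, unsigned long *out_val)
{
    char in[33], out[33];   /* up to 32 binary digits plus the terminator */

    if (sscanf(line, " in = %32[01], out = %32[01]", in, out) != 2)
        return 0;

    /* Base 2 tells strtoul to read the digits as binary. */
    *in_val = strtoul(in, NULL, 2);
    *out_val = strtoul(out, NULL, 2);
    return 1;
}
```

Reading the file line by line with fgets() and passing each line to a function like this also makes it easy to report which line was malformed.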
So basically you want to filter '0' and '1'? In this case fgets and a simple loop will be enough: just count the number of 0's and 1's and null-terminate the string at the end:
#include <stdio.h>
int main(void)
{
char str[50];
char *ptr;
// Replace stdin with your file
while ((ptr = fgets(str, sizeof str, stdin)))
{
int count = 0;
while (*ptr != '\0')
{
if ((*ptr >= '0') && (*ptr <= '1'))
{
str[count++] = *ptr;
}
ptr++;
}
str[count] = '\0';
puts(str);
}
}

C - reading ints and chars into arrays from a file

I have a .txt file with values written in this format: LetterNumber, LetterNumber, LetterNumber etc (example: A1, C8, R43, A298, B4). I want to read the letters and the numbers into two separate arrays (example: array1 would be A C R A B; array2 would be 1 8 43 298 4). How can I make it happen?
At the moment I only figured out how to read all the values, both numbers and letters and the commas and everything, into one array of chars:
FILE *myfile;
myfile = fopen("input1.txt", "r");
char input[677]; //I know there are 676 characters in my .txt file
int i;
if (myfile == NULL) {
printf("Error Reading File\n");
exit (0);
}
for (i=0; i<677; i++) {
fscanf(myfile, "%c", &input[i]);
}
fclose(myfile);
But ideally I want two arrays: one containing only letters and one containing only numbers. Is it even possible?
I would appreciate any kind of help, even just a hint. Thank you!
Define another array for integers,
int inputD[677];
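Then, in the for loop, read one letter, one integer and the optional comma that follows in a single call, skipping any surrounding whitespace, and stop the loop when fscanf no longer returns 2: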
fscanf(myfile, " %c%d %*[,] ", &input[i], &inputD[i]);
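Wrapped in a loop that checks the return value, that call could look roughly like the following. The function name read_pairs and its parameters are hypothetical; the format string is the one from this answer without its trailing space.

```c
#include <stdio.h>

/*
 * Read up to max "LetterNumber" items (e.g. "A1, C8, R43") from fp.
 * Returns how many pairs were stored.
 */
size_t read_pairs(FILE *fp, char *letters, int *numbers, size_t max)
{
    size_t count = 0;

    /* " %c" skips whitespace and reads the letter, "%d" reads the number,
       " %*[,]" skips whitespace and discards a comma if one is present. */
    while (count < max &&
           fscanf(fp, " %c%d %*[,]", &letters[count], &numbers[count]) == 2)
        count++;
    return count;
}
```

The loop ends at end of file, at the first item that is not a letter followed by a number, or when the arrays are full, so the caller only has to size the arrays and look at the returned count.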
I would actually define a struct to keep letter and number together; the data format strongly suggests that they have a close relation. Here is a program that exemplifies the idea.
The scanf format is somewhat tricky to get right (meaning as simple as possible, but no simpler). RoadRunner, for example, forgot to skip whitespace preceding the letter in his answer.
It helps that we have (I assume) only single letters. It is helpful to remember that all standard formats except %c skip whitespace. (Both parts of that sentence should be remembered.)
#include<stdio.h>
#define ARRLEN 10000
// Keep pairs of data together in one struct.
struct CharIntPair
{
char letter;
int number;
};
// test data. various space configurations
// char *data = " A1, B22 , C333,D4,E5 ,F6, Z12345";
void printParsedPairs(struct CharIntPair pairs[], int count)
{
printf("%d pairs:\n", count);
for(int i = 0; i<count; i++)
{
printf("Pair %6d. Letter: %-2c, number: %11d\n", i, pairs[i].letter, pairs[i].number);
}
}
int main()
{
setbuf(stdout, NULL);
setbuf(stdin, NULL);
// For the parsing results
struct CharIntPair pairs[ARRLEN];
//char dummy [80];
int parsedPairCount = 0;
for(parsedPairCount=0; parsedPairCount<ARRLEN; parsedPairCount++)
{
// The format explained:
// -- " ": skips any optional whitespace
// -- "%c": reads the next single character
// -- "%d": expects and reads a number after optional whitespace
// (the %d format, like all standard formats except %c,
// skips whitespace).
// -- " ": reads and discards optional whitespace
// -- ",": expects, reads and discards a comma.
// The position after this scanf returns with 2 will be
// before optional whitespace and the next letter-number pair.
int numRead
= scanf(" %c%d ,",
&pairs[parsedPairCount].letter,
&pairs[parsedPairCount].number);
//printf("scanf returned %d\n", numRead);
//printf("dummy was ->%s<-\n", dummy);
if(numRead < 0) // I/O error or, more likely, EOF. Use feof()/ferror() on stdin to tell.
{
printf("scanf returned %d\n", numRead);
break;
}
else if(numRead == 0)
{
printf("scanf returned %d\n", numRead);
printf("Data format problem: No character? How weird is that...\n");
break;
}
else if(numRead == 1)
{
printf("scanf returned %d\n", numRead);
printf("Data format problem: No number after first non-whitespace character ->%c<- (ASCII %d).\n",
pairs[parsedPairCount].letter, (int)pairs[parsedPairCount].letter);
break;
}
// It's 2; we have parsed a pair.
else
{
printf("Parsed pair %6d. Letter: %-2c, number: %11d\n", parsedPairCount,
pairs[parsedPairCount].letter, pairs[parsedPairCount].number);
}
}
printf("parsed pair count: %d\n", parsedPairCount);
printParsedPairs(pairs, parsedPairCount);
}
I was struggling a bit with my Cygwin environment with bash and mintty on Windows 8. The %c would sometimes encounter a newline (ASCII 10), which should have been eaten by the preceding whitespace-eating space, and this derailed the parsing. (More robust parsing would, after an error, try to read char by char until the next comma is encountered, and try to recover from there.)
This happened when I typed Ctrl-D (or, I think, also Ctrl-Z in a console window) in an attempt to signal EOF; the following Enter keystroke would cause a newline to "reach" the %c. Of course text I/O in a POSIX emulation on a Windows system is tricky; I must assume that somewhere between translating CR-NL sequences back and forth this bug slips in. On a Linux system via ssh/putty it works as expected.
You basically just have to create one char array and one int array, then use fscanf to read the values from the file stream.
For simplicity, using a while loop in this case makes the job easier, as you can keep reading for as long as fscanf returns 2, which stops at EOF or at the first malformed item.
Something like this is the right idea:
#include <stdio.h>
#include <stdlib.h>
// Wasn't really sure what the buffer size should be, it's up to you.
#define MAXSIZE 677
int
main(void) {
FILE *myFile;
char letters[MAXSIZE];
int numbers[MAXSIZE], count = 0, i;
myFile = fopen("input1.txt", "r");
if (myFile == NULL) {
fprintf(stderr, "Error reading file\n");
exit(EXIT_FAILURE);
}
while (count < MAXSIZE && fscanf(myFile, " %c%d ,", &letters[count], &numbers[count]) == 2) {
count++;
}
for (i = 0; i < count; i++) {
printf("%c%d ", letters[i], numbers[i]);
}
printf("\n");
fclose(myFile);
return 0;
}

C - How to read a list of space-separated text file of numbers into a List

I am trying to read a textfile like this
1234567890 1234
9876543210 22
into a List struct in my program. I read in the file via fgets() and then use strtok to separate the numbers, put them into variables and then finally into the List. However, I find that in doing this and printing the resulting strings, strtok always takes the final string in the final line to be NULL, thus resulting in a segmentation fault.
fgets(fileOutput,400,filePointer); //Read in a line from the file
inputPlate = strtok(fileOutput," "); // Take the first token, store into inputPlate
while(fileOutput != NULL)
{
string = strtok(NULL," ");
mileage = atoi(string); //Convert from string to integer and store into mileage
car = initializeCar(mileage,dateNULL,inputPlate);
avail->next = addList(avail->next,car,0);
fgets(fileOutput,400,filePointer);
inputPlate = strtok(fileOutput," ");
}
How do I resolve this?
Reading a text file line by line with fgets() is good.
Not checking the return value of fgets() is weak. This caused OP's code to process beyond the last line.
// Weak code
// fgets(fileOutput,400,filePointer); //Read in a line from the file
// ...
// while(fileOutput != NULL)
// {
Better to check the result of fgets() to determine when input is complete:
#define LINE_SIZE 400
...
while (fgets(fileOutput, LINE_SIZE, filePointer) != NULL)
{
Then process the string. A simple way to assess parsing success is to append " %n" to a sscanf() format to record the offset of the scan.
char inputPlate[LINE_SIZE];
int mileage;
int n = -1;
sscanf(fileOutput, "%s%d %n", inputPlate, &mileage, &n);
// Was `n` not changed? Did scanning stop before the string end?
if (n < 0 || fileOutput[n] != '\0') {
Handle_Bad_input();
break;
} else {
car = initializeCar(mileage, dateNULL, inputPlate);
avail->next = addList(avail->next,car,0);
}
}
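The " %n" check can also be packaged as a small function so that the calling loop stays short. This is a sketch: parse_car_line is a hypothetical name, and the 32-byte plate buffer is an assumption rather than something stated in the question.

```c
#include <stdio.h>

/*
 * Parse "PLATE MILEAGE" from one line. plate must hold at least 32 bytes.
 * Returns 1 only if both fields were read and nothing else follows.
 */
int parse_car_line(const char *line, char *plate, int *mileage)
{
    int n = -1;   /* stays -1 unless sscanf reaches the %n directive */

    sscanf(line, "%31s%d %n", plate, mileage, &n);

    /* n >= 0: both conversions succeeded; line[n] == '\0': no trailing junk. */
    return n >= 0 && line[n] == '\0';
}
```

Because %n is only reached when every earlier directive matched, a single test on n covers a missing mileage, a non-numeric mileage and extra text after the number.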
You could write a simpler parser with fscanf():
FILE *filePointer;
... // code not shown for opening the file, initializing the list...
char inputPlate[32];
int mileage;
while (fscanf(filePointer, "%31s%d", inputPlate, &mileage) == 2) {
car = initializeCar(mileage, dateNULL, inputPlate);
avail->next = addList(avail->next, car, 0);
}

fscanf to structure array

I'm trying to take some input from a text file, put it into a structure and print it out. The sample text file looks like this:
2
Curtis
660-------
Obama
2024561111
(Digits on the first number dashed out (for privacy), second is the Whitehouse.gov one, I called, they can't help me.)
Sample output:
202-456-1111 Obama
660--------- Curtis
(Formatting and sorting shouldn't be a problem when I figure out the rest.)
My question is marked by the question marks below: in the first for loop, how do I get specific lines out of the text file to create the structures?
#include <stdio.h>
#include <string.h>
struct telephone {
char name[80];
long long int number;
}
main() {
struct telephone a, b;
char text[80];
int amount, i;
FILE *fp;
fp = fopen("phone.txt", "r");
fscanf(fp, "%d", amount);
struct telephone list[amount];
for(i = 0; i < amount; i++) {
strcpy(list[i].name, ???);
list[i].number, ???);
}
fclose(fp);
for(i = 0; i < amount; i++) {
DisplayStruct(list[i]);
}
}
DisplayStruct(struct telephone input) {
printf("%lld %s\n", input.number, input.name);
}
Use fgets to read one line at a time.
int lnum = 0;
char line[100];
while( fgets(line, sizeof(line), fp) ) {
lnum++;
printf( "Line %d : %s\n", lnum, line );
}
You can then use sscanf or strtok or numerous other approaches to pull data out of the string you just read.
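One detail to watch when going this way is that fgets() keeps the newline at the end of each line. A small helper can strip it before the data is copied into a structure. This is a minimal sketch, and read_line is a hypothetical name.

```c
#include <stdio.h>
#include <string.h>

/*
 * Read one line into buf and drop the trailing newline, if any.
 * Returns 1 on success, 0 at end of file or on a read error.
 */
int read_line(FILE *fp, char *buf, size_t size)
{
    if (fgets(buf, (int)size, fp) == NULL)
        return 0;
    buf[strcspn(buf, "\r\n")] = '\0';   /* cut at the first CR or LF */
    return 1;
}
```

For the phone file in the question, the first call would give the count line (to be converted with sscanf), and each following pair of calls would give a name and a number.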
I advise against storing your phone number as an integer. Phone numbers are better represented as strings.
If you can guarantee that neither names nor phone numbers have blanks in them, you can utilize fscanf() to read this data:
for(i = 0; i < amount; i++) {
if (fscanf(fp, "%79s %lld", list[i].name, &list[i].number) != 2) break; // conversion error or EOF
}
Things to keep in mind:
You must check for conversion errors
This approach is less tolerant of input errors (with fgets() it might be easier to recover and drop the malformed entry, unless the record has the wrong number of fields).
Agree with @paddy, use a string to store phone numbers. (Cope with leading 0s, variable length, #, *, pause, etc.) Might as well also make sure it is big enough for an int64_t.
Note: The web has examples of 22 digits.
struct telephone {
char name[80];
char number[21];
};
To read in the data ...
for (i = 0; i < amount; i++) {
// +1 for buffer size as string read has a \n which is not stored.
char na[sizeof list[0].name + 1];
char nu[sizeof list[0].number + 1];
if ((fgets(na, sizeof na, fp) == NULL) || (fgets(nu, sizeof nu, fp) == NULL)) {
break; // TBD, Handle unexpected missing data
}
// The leading space in the format will skip leading white-spaces.
if (1 != sscanf(na, " %79[^\n]", list[i].name)) {
break; // TBD, Handle empty string
}
if (1 != sscanf(nu, " %20[^\n]", list[i].number)) {
break; // TBD, Handle empty string
}
}
if (fgetc(fp) != EOF) {
; // Handle unexpected extra data
}
amount = i;
To write
// Pass address of structure
for(i = 0; i < amount; i++) {
DisplayStruct(&list[i]);
}
void DisplayStruct(const struct telephone *input) {
if (strlen(input->number) == 10) {
printf("%.3s-%.3s-%4s", input->number, &input->number[3], &input->number[6]);
}
else { // tbd format for unexpected telephone number length
printf("%13s", input->number);
}
// Suggest something around %s like \"%s\" to encapsulate names with spaces
printf(" %s\n", input->name);
}

Scanning from files

I am currently trying to scan a single line in from a file but having a snag at strings.
This is the example line my professor told me to work on.
enum status{MEM,PREP,TRAV}
union type { double int day, char* title, float cost}
13953 P 12 26 2011 1 5 2012 2 A 3.30 249.00 A 2.0 148.00 MEM Cuba Christmas 3 0 2 Sierra Del Rosario, Cuba
I'm fine with everything except the point (MEM Cuba Christmas) when I'm scanning it in from a FILE. I read the first part of the data just using fscanf(), but MEM is an enumerated type with a union type that dictates the following input. My problem is with the syntax of the scanning. I tried using getline starting at MEM but I hit snags with the tokenizing since the city / country can have spaces. I am not sure what other scan functions to use; I was looking at sscanf() but wasn't sure whether it works with files.
UPDATED:
int main(void)
{
int m, length = 100;
char *word, file_name[100];
FILE *file_point;
printf("Please enter file name with .txt extension:");
scanf("%s", file_name);
file_point = fopen(file_name,"r");
while (fscanf(file_point, "%d", &m) != EOF)
{
temp.dest_code = m;
fscanf(file_point, " %c %d %d %d %d %d %d %d",
&temp.area_code,
&temp.Smonth, &temp.Sday, &temp.Syear,
&temp.Emonth, &temp.Eday, &temp.Eyear,
&temp.leg_num);
for (n=0; n < temp.leg_num; n++)
{
fscanf(file_point," %c %f %f",
&temp.tleg[n].travel_type,
&temp.tleg[n].travel_time,
&temp.tleg[n].cost);
}
fscanf(file_point," %d %d %d ",
&temp.adult,
&temp.child,
&temp.infant);
temp_name = (char *)malloc(length + 1);
getline (&temp_name, &length, file_point);
word = strtok(temp_name, ",");
temp.dest_name=(char *)malloc(strlen(word)+1);
strcpy(temp.dest_name, word);
word = strtok(NULL, ",");
temp.dest_country=(char *)malloc(strlen(word)+1);
strcpy(temp.dest_country, word);
printf("name:%s country:%s\n", temp.dest_name, temp.dest_country);
printf("adult:%d , child:%d , infant:%d \n", temp.adult, temp.child, temp.infant);
}
}
This was the code I was using as a base that I came up with but not sure how to handle the enumerated and union. I was thinking of doing something like:
getline(&status, &length, file_point);
but how do I convert string to integer or float?
If I understand your problem properly (I'm not sure I do), then you face the problem of seeing 'MEM' (or 'PREP' or 'TRAV') as a string in the input, and you have to understand how to handle the following data. The enum suggests that you might want to convert the string MEM to the value of MEM in the enumeration.
It is hard to fully automate such a conversion. The simplest approach is to recognize the strings and decide what to do based on the string:
if (strcmp(found_string, "MEM") == 0)
...do the MEM stuff...
else if (strcmp(found_string, "PREP") == 0)
...do the PREP stuff...
else if (strcmp(found_string, "TRAV") == 0)
...do the TRAV stuff...
else
...report unknown type code...
However, you can create a structure to handle the conversion from string to enumeration value.
struct StateConv
{
const char *string;
enum status number; // the enum from the question is named status
};
static struct StateConv converter[] =
{
{ "MEM", MEM },
{ "PREP", PREP },
{ "TRAV", TRAV },
};
enum { NUM_STATECONV = sizeof(converter) / sizeof(converter[0]) };
enum status state_conversion(const char *string)
{
for (int i = 0; i < NUM_STATECONV; i++)
{
if (strcmp(string, converter[i].string) == 0)
return(converter[i].number);
}
fprintf(stderr, "Failed to find conversion for %s\n", string);
exit(1);
}
You need a better error handling strategy than 'exit on error'.
Your scanning code will need to read the word, and then call state_conversion(). Then depending on what you get back, you can read the remaining (following) data in the correct way for the state you were given.
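As one possible alternative to exiting on an unknown word, the lookup can report failure through its return value and hand the enumeration value back through a pointer. This is a sketch: status_from_string and status_names are hypothetical names, and the enum is written out here the way the question declares it.

```c
#include <string.h>

enum status { MEM, PREP, TRAV };

/* Table mapping each keyword in the file to its enumeration value. */
static const struct {
    const char *name;
    enum status value;
} status_names[] = {
    { "MEM",  MEM  },
    { "PREP", PREP },
    { "TRAV", TRAV },
};

/*
 * Look up text in the table. Returns 1 and stores the value in *out
 * when the keyword is known, 0 otherwise, leaving the decision about
 * how to react to the caller.
 */
int status_from_string(const char *text, enum status *out)
{
    size_t i;

    for (i = 0; i < sizeof status_names / sizeof status_names[0]; i++) {
        if (strcmp(text, status_names[i].name) == 0) {
            *out = status_names[i].value;
            return 1;
        }
    }
    return 0;
}
```

The scanning loop can then read the word with a bounded %s, call the function, and skip or report the record when it returns 0 instead of terminating the whole program.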
No, you can't do that in the way you are trying. MEM in your file is a string; you need to parse it as a string and then set the value of your enum according to that string.
For example, when you want to parse your status type (MEM,PREP,TRAV):
char typeBuffer[6];
fscanf(file_point,"%5s",typeBuffer);
Then manually compare the content of typeBuffer:
enum status stat;
if (strcmp(typeBuffer, "MEM") == 0){
stat = MEM;
}
The conversion between string type and enum cannot be implicit.
