C Programming, Reading specific sections from file - c

My question is: how can I read specific sections from a file? For instance, if my file was:
454545454 Joe Brown 70 50 40
656565656 David Smith 80 90 100
383838383 George Williams 95 100 80
How could I read the first string (9-Digit #), skip over the name, and then read the 3 sets of numbers?

Notice that whitespace is your delimiter. One option is to read the whole file into a char buffer and scan for that delimiter to split the text into tokens.
Another option is to test whether each token is a number or a word. Note that atoi (ASCII to int) cannot report a failed conversion, since it returns 0 for non-numeric input; strtol with its end pointer is better suited to validation. You can also read about fread and fseek.
I think the best way is to combine both approaches: find each delimiter and try to parse the token as a number.
The main idea is to find a pattern in the file that suggests the algorithm.
In C, most of the time you have to work out the logic yourself.
Hope it helps!

Instead of "reading specific sections," read the file line by line, save the information you want, and discard the rest. scanf reads formatted input from standard input into program variables (use fscanf to read from a FILE *). Since scanf returns the number of items it successfully assigned, you can use that to do some error checking.
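#define STR_LEN 64 // not defined in the original answer; 64 is an assumed value
char num_string[STR_LEN];
int numbers[3];
char dummy1[STR_LEN], dummy2[STR_LEN];
// %63s limits each string to STR_LEN - 1 characters plus the terminating null
int num_read = scanf( "%63s%63s%63s%d%d%d", num_string, dummy1, dummy2, &numbers[0], &numbers[1], &numbers[2] );
if( num_read != 6 )
{
  // error
}
else
{
  // do stuff with num_string, and numbers[0]-numbers[2]
}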
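As a sketch of the same idea applied to a file, the code below reads one line at a time with fgets() and parses it with sscanf(), using the assignment-suppressing %*s conversion to skip the two name words instead of dummy buffers. The struct record type and the parse_record() and print_records() helpers are illustrative names, not part of the original answer, and the format assumes every name is exactly two words.

```c
#include <stdio.h>

/* Hypothetical record type for one line of the file. */
struct record {
    char id[10];      /* 9 digits plus the terminating null */
    int scores[3];
};

/* Parse "ID First Last n1 n2 n3"; returns 1 on success, 0 on failure.
 * %*s reads a word and discards it, so no dummy buffers are needed. */
int parse_record(const char *line, struct record *out)
{
    int n = sscanf(line, "%9s %*s %*s %d %d %d",
                   out->id, &out->scores[0], &out->scores[1], &out->scores[2]);
    return n == 4;    /* suppressed conversions are not counted */
}

/* Read every line of fp and print the fields that parsed correctly. */
void print_records(FILE *fp)
{
    char line[256];
    struct record r;

    while (fgets(line, sizeof line, fp) != NULL) {
        if (parse_record(line, &r))
            printf("%s: %d %d %d\n", r.id, r.scores[0], r.scores[1], r.scores[2]);
        else
            fprintf(stderr, "skipping malformed line: %s", line);
    }
}
```

Calling print_records() on a FILE * opened with fopen() processes the whole file; lines that do not match the expected layout are reported rather than silently misread.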

Related

How to read two words in one string

I have sample input file like this
1344 Muhammad Ayyubi 1
1344 Muhammad Ali Ayyubi 1
The ID, first name, surname and last number are separated by tab characters. However, a person may have two first names; in that case the names are separated by a space.
I am trying to read from input file and store them in related variables.
Here is my code that successfully reads when a person has only one name.
fscanf(fp, "%d\t%s\t%s\t%d", &id, firstname, surname, &roomno);
The question is: is there any way to read an input file in which a person may have two first names?
Thanks in advance.
Read the line with fgets(), which saves it as a string.
Then parse the string. Save the fields into adequately sized buffers.
A "\t" in a scanf format matches any amount of whitespace, zero or more characters, not just one tab. Use TABFMT below to scan exactly 1 tab character.
Test results along the way.
This code uses " %n" to check that parsing reached that point and that nothing more follows on the line.
#define LINE_N 100
char line[LINE_N];
int id;
char firstname[LINE_N];
char surname[LINE_N];
int roomno;
if (fgets(line, sizeof line, fp)) {
  int n = 0;
  #define TABFMT "%*1[\t]"
  #define NAMEFMT "%[^\t]"
  // n is set only if every field matched; " %n" also skips the trailing newline
  sscanf(line, "%d" TABFMT NAMEFMT TABFMT NAMEFMT TABFMT "%d %n",
      &id, firstname, surname, &roomno, &n);
  if (n == 0 || line[n]) {
    fprintf(stderr, "Failed to parse <%s>\n", line);
  } else {
    printf("Success: %d <%s> <%s> %d\n", id, firstname, surname, roomno);
  }
}
If the first name or surname is empty, this code treats that as an error.
An alternate approach would read the line into a string and then use strcspn(), strchr() or strtok() to look for tabs and split the line into the 4 sub-strings.
The larger issue missed by OP is what to do about ill-formatted input? Error handling is often dismissed with "input will be well formed", yet in real life, bad input does happen and also is the crack the hackers look for. Defensive coding takes steps to validate input. Pedantic code would not use *scanf() at all, but instead fgets(), strcspn(), strspn(), strchr(), strtol() and test, test, test. This answer is a middle-of-the-road testing effort.
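The alternate approach without *scanf() could look roughly like the sketch below: split the line in place at each tab with strcspn(), then convert the numeric fields with strtol() and check that the whole field was consumed. The split_tabs() and field_to_int() helpers are hypothetical names introduced for illustration.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Split line in place at tab characters into at most max fields.
 * Returns the number of fields stored; empty fields are preserved,
 * unlike with strtok(), which merges adjacent delimiters. */
size_t split_tabs(char *line, char *fields[], size_t max)
{
    size_t count = 0;

    line[strcspn(line, "\r\n")] = '\0';     /* drop the line ending */
    while (count < max) {
        size_t len = strcspn(line, "\t");   /* length up to next tab or end */
        fields[count++] = line;
        if (line[len] == '\0')
            break;
        line[len] = '\0';                   /* terminate this field */
        line += len + 1;                    /* continue after the tab */
    }
    return count;
}

/* Convert a whole field to int; returns 1 only if the field is a valid number. */
int field_to_int(const char *s, int *out)
{
    char *end;
    long v;

    errno = 0;
    v = strtol(s, &end, 10);
    if (end == s || *end != '\0' || errno == ERANGE || v < INT_MIN || v > INT_MAX)
        return 0;
    *out = (int)v;
    return 1;
}
```

A caller would treat any result other than exactly 4 fields, or a failed field_to_int() on the first or last field, as a malformed line.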
You can use the %[ specifier to read a string that contains spaces:
fscanf(fp, "%d\t%[^\t]\t%[^\t]\t%d", &id, firstname, surname, &roomno);
The answers to the question as stated are reasonable, but the question is wrong.
The end-goal here is to read human-names. Human names come in quite a variety - not always first, [middle,] last. Baking in this assumption is an error in design.
This is a many, many times repeated error. Better not to repeat.
Simplest solution is to re-order the data fields, and make no assumptions about the structure of names. So the input data becomes:
1344 1 Muhammad Ayyubi
1344 1 Muhammad Ali Ayyubi
Scanning code then can pull off the first two numeric fields, and use the remainder of the line for name (making no assumptions about structure).
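A minimal sketch of that scanning code, assuming the reordered layout shown above: sscanf() reads the two numbers, " %n" records where the name begins, and the rest of the line is copied as the name without interpreting it. The parse_reordered() function name and its parameters are illustrative only.

```c
#include <stdio.h>
#include <string.h>

/* Parse "ID ROOM NAME..." where NAME is everything after the second number.
 * Returns 1 on success and copies the name into name (name_size bytes). */
int parse_reordered(const char *line, int *id, int *roomno,
                    char *name, size_t name_size)
{
    int pos = 0;

    /* " %n" skips whitespace after the second number and records the offset. */
    if (sscanf(line, "%d %d %n", id, roomno, &pos) != 2 || pos == 0)
        return 0;

    size_t len = strcspn(line + pos, "\r\n");   /* name length without line ending */
    if (len == 0 || len >= name_size)
        return 0;                                /* empty name or buffer too small */
    memcpy(name, line + pos, len);
    name[len] = '\0';
    return 1;
}
```

Because the name is taken as one opaque string, one-word, two-word and longer names are all handled by the same code path.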
More generally, if you do need to scan fields with embedded whitespace, remember the 32 "control" characters in the ASCII character table, of which ~24 have no assigned semantics (in current use). You can use some of them to add structure to a text file, for example these (from man ascii):
034 28 1C FS (file separator)
035 29 1D GS (group separator)
036 30 1E RS (record separator)
037 31 1F US (unit separator)
There is almost no case where text fields are allowed these characters.

How to slice/index data in c

I am trying to learn C and have received a homework assignment to write code which can read data from a .txt file and print out particular lines.
I wrote the following:
#include <stdio.h>
void main() {
char str[5];
FILE *fp;
fp=fopen("data.txt","r");
int i;
for (i=1;i<=5;i++){
fgets(str,5,fp);
printf("%d \n",i);
if (i==1||i==3||i==5) {
printf("%s \n \n",str);
}
}
}
The file data.txt is just the following:
3.21
5.22
4.67
2.31
2.51
1.11
I had read that each time fgets is run, the pointer is updated to point to the next line. I thought I could keep running fgets and then only print the string str when at the correct value for i (the line I want output on the console).
It partially worked, here is the output:
1
3.21
2
3
5.22
4
5
4.67
Process returned 8 (0x8) execution time : 0.024 s
Press any key to continue.
It did only print when i had the correct values, but for some reason it only printed the first 3 lines, even though fgets had been run 5 times by the last iteration, so the pointer should have reached the fifth line.
Can someone explain why the pointer did not update as expected, and whether there is an easier way to slice or index through a file in C?
You need to account for (at least) two additional characters, in addition to the numbers you have in the file. There is the end-of-line delimiter (\n on UNIX/Mac, or possibly \r\n on Windows... so maybe 3 additional characters), plus (from the fgets documentation):
A terminating null character is automatically appended after the characters copied to str.
A lot of the C functions that manipulate character arrays (i.e. strings) will give you this extra null "for free", and it can be tricky if you forget about it.
Also, a better way to loop over the lines might be:
#define MAX_CHARS 7
char buf[MAX_CHARS];
while((fgets(buf, MAX_CHARS, fp)) != NULL) {
printf("%s\n", buf);
}
It's still not the best way to do it (no error checking) but a little more compact/readable and idiomatic C, IMO.
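To spell out why the question's program printed only the first three values: with a 5-byte buffer, fgets() stores at most 4 characters per call, so "3.21" fills one call and the leftover newline is returned by the next one. The five calls therefore return "3.21", "\n", "5.22", "\n" and "4.67". Below is a minimal sketch of the intended behaviour, assuming every line fits in 63 characters; print_odd_lines() is a hypothetical helper name.

```c
#include <stdio.h>
#include <string.h>

/* Print lines 1, 3 and 5 of fp to out; returns the number of lines printed.
 * The buffer is assumed to be large enough for any line in the file. */
int print_odd_lines(FILE *fp, FILE *out)
{
    char buf[64];
    int lineno = 0;
    int printed = 0;

    while (lineno < 5 && fgets(buf, sizeof buf, fp) != NULL) {
        lineno++;
        buf[strcspn(buf, "\r\n")] = '\0';   /* strip the line ending */
        if (lineno % 2 == 1) {
            fprintf(out, "%d: %s\n", lineno, buf);
            printed++;
        }
    }
    return printed;
}
```

Passing stdout as the second argument prints to the console; the function stops early if the file has fewer than five lines.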

reading file unknown format in C

I need some help with this exercise in C language.
I would like to know how to read data from a file when I don't know its format.
-The file will contain int(1-999) and char: "OL"=overloaded, "ND"=noData, "LB"=lowBattery.
Example:
My_file.txt
Can be made like this:
25
764
OL
ND
34
LB
624
235
ND
........
Or like this:
534 ND 356 LB LB 234 765 123 ND ND......
235 976 LB 156 ND......
I know that this:
FILE *f;
int str1; /* int rather than char, so that EOF can be told apart from a valid character */
f=fopen(filename,"r");
str1=fgetc(f);
while(str1 != EOF)
{
printf("%c",str1);
str1=fgetc(f);
}
fclose(f);
can read the file until EOF. But I can't use it because I need to assign those values to some int or char variables (what if I use an enum?).
I am fairly sure that I can't use fscanf. But the real question is: how do I read the file, and how do I assign those values to a struct or something similar,
so that I can then use them in operations (like a sum and more)?
Thank you very much guys...
I don't know its format
Hmm... it seems to me that you know the format exactly:
The file will contain int(1-999) and char: "OL"=overloaded, "ND"=noData, "LB"=lowBattery
Your file contains a whitespace separated sequence of tokens, each of which is either OL, ND, LB or an integer in the specified range.
So to parse that file read one character at a time. Whitespace? Ignore and continue with the next. A digit? Now should come up to 2 more digits. Read them and convert to an integer. 'O', 'N' or 'L'? Look for the next character to be the correct one. Everything else? Parse error!
To save each token create a structure like:
struct Token
{
enum
{ TokenOverLoad
, TokenNoData
, TokenLowBattery
, TokenData
} kind;
short data; // only if kind == TokenData
};
Then store these in either a list or a dynamic array during parsing. Afterwards you can iterate over that list/array to implement any required functions, such as a sum.
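The token scheme above can be sketched as follows. As a simplification, this version reads whole whitespace-delimited words with fscanf() and classifies each one with strcmp() and strtol(), rather than examining one character at a time; the enum is also given a name so it can be used outside the struct. classify_token() and sum_readings() are hypothetical helper names.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

enum TokenKind { TokenOverLoad, TokenNoData, TokenLowBattery, TokenData };

struct Token {
    enum TokenKind kind;
    short data;            /* only meaningful if kind == TokenData */
};

/* Classify one whitespace-delimited word; returns 1 on success, 0 on a parse error. */
int classify_token(const char *word, struct Token *tok)
{
    char *end;
    long v;

    tok->data = 0;
    if (strcmp(word, "OL") == 0) { tok->kind = TokenOverLoad;   return 1; }
    if (strcmp(word, "ND") == 0) { tok->kind = TokenNoData;     return 1; }
    if (strcmp(word, "LB") == 0) { tok->kind = TokenLowBattery; return 1; }

    v = strtol(word, &end, 10);
    if (end == word || *end != '\0' || v < 1 || v > 999)
        return 0;          /* not a number, trailing junk, or out of range */
    tok->kind = TokenData;
    tok->data = (short)v;
    return 1;
}

/* Sum all numeric readings in fp; returns -1 if any token is invalid. */
long sum_readings(FILE *fp)
{
    char word[16];
    struct Token tok;
    long sum = 0;

    /* %15s skips any whitespace, so both file layouts are handled the same way. */
    while (fscanf(fp, "%15s", word) == 1) {
        if (!classify_token(word, &tok))
            return -1;
        if (tok.kind == TokenData)
            sum += tok.data;
    }
    return sum;
}
```

Since %s treats spaces and newlines alike, the one-value-per-line file and the many-values-per-line file go through the same loop. Storing each struct Token in an array instead of summing immediately would follow the same pattern.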
I asked a friend. He said that I can use fscanf.
I only need to define a struct with character arrays.
With fscanf I will read each token with %s into a buffer such as char char_name[20];
Then I can use atoi/atof for the numbers and strcmp for the status codes.
If anybody knows an easier solution, please answer :)
I will post the code soon; I'm working on it :)

How to read one whole line from text file using <

I am trying to get one whole line from some text file instead of one word until it meets white space, here is source code:
#include <stdio.h>
void main() {
int lineNum=0;
char lineContent[100];
scanf("%s", &lineContent);
printf("%s\n", lineContent);
}
And here is my text file, called test.txt, content containing 2 lines:
111 John Smith 100 98 1.2 2.5 3.6
222 Bob Smith 90 91 3.2 6.5 9.6
And I run it with following command:
a.out < test.txt
My output is just:
111
What I want is:
111 John Smith 100 98 1.2 2.5 3.6
Of course, I can simply use while statement and read recursively until it meets EOF, but that is not what I want. I just want to read one whole line per each time I read from file.
How can I do this?
Thank you very much.
fgets() is the most convenient standard library function for reading files one line at a time. GNU getline() is even more convenient, but it is non-standard.
If you want to use scanf(), however, then you can do so by using the [ field descriptor instead of s:
char lineContent[100];
scanf("%99[^\n]", lineContent);
getchar();
Note the use of an explicit field width to protect against overrunning the bounds of lineContent in the event that a long line is encountered. Note also that the [ descriptor differs from s in that [ does not skip leading whitespace. If preserving leading whitespace is important to you, then scanf with an s field is a non-starter.
The getchar() reads the terminating newline, presuming that there is one, and that scanf() did not read 99 characters without reaching the end of the line.
If you want to read line by line, then use fgets() or getline() (available on POSIX systems) instead of scanf(). scanf() stops at first whitespace (or matching failure) for %s.
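A minimal sketch of the fgets() approach, assuming lines shorter than the buffer: it reads one line and removes the trailing newline that fgets() keeps. read_line() is a hypothetical wrapper name, not a standard function.

```c
#include <stdio.h>
#include <string.h>

/* Read one line from fp into buf (size bytes) without the trailing newline.
 * Returns 1 if a line was read, 0 at end of file or on a read error. */
int read_line(FILE *fp, char *buf, size_t size)
{
    if (fgets(buf, (int)size, fp) == NULL)
        return 0;
    buf[strcspn(buf, "\n")] = '\0';   /* remove the newline, if present */
    return 1;
}
```

With input redirected as in a.out < test.txt, calling read_line(stdin, lineContent, sizeof lineContent) once would fetch the first line, and each further call would fetch the next one.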

read from text file and put into struct

I have an assembler source file (actually a plain text file) like this:
1 # Test case 1 for assembler
2
3 .text
4 test1: lwa $1,val1
5 prh $1
6 val12: lwa $2,val2
7 prh $2
..................
I am reading each line with fgets into a char buffer named "linebuffer",
and parsing linebuffer with sscanf.
while(fgets(linebuffer,sizeof(linebuffer),ifp)!=NULL)
{
sscanf(linebuffer,"%s%s%s%s",line[i].label,line[i].opcode,line[i].operands,line[i].comment);
......
}
and I want to keep them in a struct:
struct instruction{
char label[8];
char opcode[4];
char operands[15];
char comment[100];
}line[65536];
The problem is that some columns are empty (just spaces), so sscanf skips the spaces, reads the next string, and stores it in the first column. Sorry if I have not explained this clearly; I hope somebody understands.
For example, I want this for the 3rd line:
line[2].label=NULL
line[2].opcode=".text"
line[2].operands=NULL
line[2].comment=NULL
for 4th line;
line[3].label="test1:"
line[3].opcode="lwa"
line[3].operands="$1,val1"
line[3].comment=NULL
The problem starts with the 5th line. It has to be like this:
line[4].label=NULL
line[4].opcode="prh"
line[4].operands="$1"
line[4].comment=NULL
but when I run the code I get this result:
line[4].label="prh"
line[4].opcode="$1"
line[4].operands=NULL
line[4].comment=NULL
How can I split this linebuffer correctly?
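OK, so your first problem is that fgets() does not always return exactly one line: it reads at most sizeof(linebuffer) - 1 characters and stops early after a newline, so a line longer than the buffer is split across several calls. You can see its man page here: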
http://linux.die.net/man/3/gets
Second, assuming that you do have exactly one line in "linebuffer", use the return value of sscanf to determine how many tokens appear in the line (the scanf family of functions returns the number of input items successfully assigned).
Third, note that scanf treats only whitespace (spaces, tabs and newlines) as token separators, so it will not split the string "$1,val1" into two sub-strings; you will need to do that manually.
And finally, there is a string-parsing function that may make your life easier: strtok_r. You can see its man page here:
http://linux.die.net/man/3/strtok_r
Amnon.
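One possible way to keep the columns apart is sketched below. It relies on two assumptions drawn from the sample file rather than from any assembler specification: a label always ends in ':' and a comment starts with '#'. The field sizes are enlarged compared with the question's struct (".text" does not fit in char opcode[4]), and parse_asm_line() is a hypothetical helper name.

```c
#include <stdio.h>
#include <string.h>

struct instruction {
    char label[16];
    char opcode[8];
    char operands[16];
    char comment[100];
};

/* Split one source line into its fields; missing fields become empty strings.
 * Assumes a label always ends in ':' and a comment starts with '#'. */
void parse_asm_line(const char *src, struct instruction *ins)
{
    char text[256];
    char first[16] = "";
    int used = 0;
    char *hash;
    const char *rest;

    memset(ins, 0, sizeof *ins);
    snprintf(text, sizeof text, "%s", src);      /* work on a private copy */
    text[strcspn(text, "\r\n")] = '\0';

    hash = strchr(text, '#');                    /* peel off the comment first */
    if (hash != NULL) {
        snprintf(ins->comment, sizeof ins->comment, "%s", hash);
        *hash = '\0';
    }

    rest = text;
    if (sscanf(rest, "%15s%n", first, &used) != 1)
        return;                                  /* blank or comment-only line */
    if (first[strlen(first) - 1] == ':') {       /* first word is a label */
        strcpy(ins->label, first);
        rest += used;                            /* opcode starts after the label */
    }
    sscanf(rest, "%7s %15s", ins->opcode, ins->operands);
}
```

Empty strings stand in for the NULL values in the question, since the struct members are arrays and cannot be set to NULL. Splitting "$1,val1" further at the comma could be done afterwards with strchr().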

Resources