How can I retrieve rows in a text file - c

I started programming in C, and I have some problems reading text files. Let me explain.
I have a text file which is organized like this:
Tony
12.23
John
09.45
Tayris
03.99
I would like to retrieve all the scores less than ten and display them, but I can't...
Could anybody help me?
Thanks a lot.

C provides four functions that can be used to read files from disk:
fscanf(): field-oriented function.
fgets(): line-oriented function.
fgetc(): character-oriented function.
fread(): block-oriented function.
See this article for more information.

Check out the fgets function. It reads up to (and including) the newline character that ends the line (you can strip it from the destination string if you want).
http://people.cs.uchicago.edu/~iancooke/osstuff/ccc.html offers an example:
Here's a more complicated example.
Readline() uses fgets() to read up to
MAX_LINE - 1 characters into the
buffer 'in'. It strips preceding
whitespace and returns a pointer to
the first non-whitespace character.
char *Readline(char *in) {
    char *cptr;
    if ((cptr = fgets(in, MAX_LINE, stdin))) {
        /* kill preceding whitespace but leave \n
           so we're guaranteed to have something */
        while (*cptr == ' ' || *cptr == '\t') {
            cptr++;
        }
        return cptr;
    } else {
        return 0;
    }
}
That should be enough I think.
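For the file layout in the question (a name on one line, the score on the next), a minimal sketch could read the lines in pairs with fgets() and convert the second one with strtod(). The function name print_scores_below and the buffer sizes are my own assumptions, not something from the question:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: reads name/score line pairs from `in` and prints
   every entry whose score is below `limit` to `out`.
   Returns the number of entries printed. */
int print_scores_below(FILE *in, FILE *out, double limit)
{
    char name[128];        /* assumed to be long enough for one name */
    char score_line[128];
    int printed = 0;

    assert(in != NULL && out != NULL);

    /* Each record is two lines: the name, then the score. */
    while (fgets(name, sizeof name, in) &&
           fgets(score_line, sizeof score_line, in)) {
        char *end;
        double score;

        name[strcspn(name, "\r\n")] = '\0';   /* strip the line ending */
        score = strtod(score_line, &end);
        if (end == score_line)
            continue;                          /* not a number: skip the pair */
        if (score < limit) {
            fprintf(out, "%s %.2f\n", name, score);
            printed++;
        }
    }
    return printed;
}
```

You would call it as print_scores_below(fp, stdout, 10.0) after opening the file with fopen(). Names longer than the buffer are not handled here.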

Related

how to scan line in c program not from file

How to scan total line from user input with c program?
I tried scanf("%99[^\n]",st), but it does not work when I scan something before this scan statement. It works if this is the first scan statement.
How to scan total line from user input with c program?
There are many ways to read a line of input, and your usage of the word scan suggests you're already focused on the scanf() function for the job. This is unfortunate, because, although you can (to some extent) achieve what you want with scanf(), it's definitely not the best tool for reading a line.
As already stated in the comments, your scanf() format string will stop at a newline, so the next scanf() will first find that newline and it can't match [^\n] (which means anything except newline). As a newline is just another whitespace character, adding a blank in front of your conversion will silently eat it up ;)
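To illustrate the leading blank, here is a small sketch that reads a number and then the rest of the following line. The function name read_number_then_line is made up for the example, and it takes a FILE * only so that it can be tried on something other than stdin:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: reads an int, then the next non-blank line
   (at most 99 characters) into `line`. Returns 1 on success, 0 otherwise. */
int read_number_then_line(FILE *in, int *n, char line[100])
{
    assert(in != NULL && n != NULL && line != NULL);

    if (fscanf(in, "%d", n) != 1)
        return 0;
    /* The blank before % skips the newline that %d left in the stream.
       Without it, [^\n] would fail immediately on that newline. */
    if (fscanf(in, " %99[^\n]", line) != 1)
        return 0;
    return 1;
}
```

Keep in mind that the blank eats all leading whitespace, so empty lines and leading spaces are silently skipped as well.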
But now for the better solution: Assuming you only want to use standard C functions, there's already one function for exactly the job of reading a line: fgets(). The following code snippet should explain its usage:
char line[1024];
char *str = fgets(line, 1024, stdin); // read from the standard input
if (!str)
{
    // couldn't read input for some reason, handle error here
    exit(1); // <- for example
}
// fgets includes the newline character that ends the line, but if the line
// is longer than 1022 characters, it will stop early here (it will never
// write more bytes than the second parameter you pass). Often you don't
// want that newline character, and the following line overwrites it with
// 0 (which is "end of string") **only** if it was there:
line[strcspn(line, "\n")] = 0;
Note that you might want to check for the newline character with strchr() instead, so you actually know whether you have the whole line or maybe your input buffer was too small. In the latter case, you might want to call fgets() again.
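A rough sketch of the strchr() check might look like this. The name read_line and its return convention are assumptions of mine, and instead of calling fgets() again it simply throws away the rest of an overlong line, which is only one possible policy:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper. Reads one line into buf (size bytes).
   Returns 1 for a complete line, 0 if the line was truncated (the rest
   of it is discarded), and -1 at end of input or on a read error. */
int read_line(FILE *in, char *buf, size_t size)
{
    char *nl;
    int c;

    assert(in != NULL && buf != NULL && size > 1);

    if (!fgets(buf, (int)size, in))
        return -1;

    nl = strchr(buf, '\n');
    if (nl) {
        *nl = '\0';            /* whole line read: drop the newline */
        return 1;
    }

    /* No newline stored: the buffer was too small, or the input ended
       without a final newline. Look at what comes next to find out. */
    c = getc(in);
    if (c == EOF || c == '\n')
        return 1;              /* the text itself fit into buf */
    while (c != '\n' && c != EOF)
        c = getc(in);          /* discard the remainder of the line */
    return 0;
}
```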
How to scan total line from user input with c program?
scanf("%99[^\n]",st) reads a line, almost.
With the C Standard Library a line is
A text stream is an ordered sequence of characters composed into lines, each line consisting of zero or more characters plus a terminating new-line character. Whether the last line requires a terminating new-line character is implementation-defined. C11dr §7.21.2 2
scanf("%99[^\n]",st) fails to read the end of the line, the '\n'.
That is why on the 2nd call, the '\n' remains in stdin to be read and scanf("%99[^\n]",st) will not read it.
There are ways to use scanf("%99[^\n]",st);, or a variation of it as a step in reading user input, yet they suffer from 1) Not handling a blank line "\n" correctly 2) Missing rare input errors 3) Long line issues and other nuances.
The preferred portable solution is to use fgets(). Loop example:
#define LINE_MAX_LENGTH 200
char buf[LINE_MAX_LENGTH + 1 + 1]; // +1 for long lines detection, +1 for \0
while (fgets(buf, sizeof buf, stdin)) {
    size_t eol = strcspn(buf, "\n"); // ** see the note below
    buf[eol] = '\0'; // trim potential \n
    if (eol >= LINE_MAX_LENGTH) {
        // IMO, user input exceeding a sane generous threshold is a potential hack
        fprintf(stderr, "Line too long\n");
        // TBD : Handle excessive long line
    }
    // Use `buf[]`
}
Many platforms support getline() to read a line.
Shortcomings: it is not standard C, and it allows a hacker to overwhelm system resources with insanely long lines.
In C, there is not a great solution. What is best depends on the various coding goals.
** I prefer size_t eol = strcspn(buf, "\n\r"); to read lines in a *nix environment that may end with "\r\n".
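As a small sketch of that variant, a helper (the name chomp is borrowed from other languages, it is not a C library function) can cut the string at the first carriage return or newline:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: cuts the string at the first '\r' or '\n'.
   Returns the length of what remains. */
size_t chomp(char *s)
{
    size_t len;

    assert(s != NULL);
    len = strcspn(s, "\r\n");   /* index of first CR or LF, or strlen(s) */
    s[len] = '\0';
    return len;
}
```

Because strcspn() returns the full length when neither character is present, the assignment is harmless for lines that have no line ending at all.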
scanf() should never be used for user input. The best way to get input from the user is with fgets().
Read more: http://sekrit.de/webdocs/c/beginners-guide-away-from-scanf.html
char str[1024];
char *alline = fgets(str, 1024, stdin);
scanf("%[^'\n']s",alline);
I think the correct solution should be like this. It worked for me.
Hope it helps.

Scans multiple lines in C

I am new to C programming. I am taking a class where I have to:
The program will take all input from standard input, possibly transform it, and output it to standard output.
The program will read in input line by line. Transformations, if any, will be done per line. Then print out the transformed line.
You will have to read from the user until there is no more text left. Ctrl+D can be typed into the terminal to indicate there is no text left.
I am not a student who is looking for the answer to be done for me, but I am completely lost here. I tried to use:
char*buf = NULL;
while (fscanf(stdin, "%ms", &buf) > 0)
{ do transform }
but I have no luck. So any help is appreciated. Also I have no idea about the Ctrl+D part.
char*buf = NULL;
while (fscanf(stdin, "%ms", &buf) > 0)
has the following problems.
buf does not point to anything valid where input can be read and stored.
%ms is not a standard C format specifier (it is supported on POSIX-compliant platforms, thanks @JonathanLeffler).
It will be better to use fgets to read lines of text.
I suggest:
// Make LINE_LENGTH large enough for your needs.
#define LINE_LENGTH 200
char buf[LINE_LENGTH];
while ( fgets(buf, LINE_LENGTH, stdin) != NULL )
{
// Use buf
}
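Filled in for the assignment, the loop could look roughly like the sketch below. The upper-casing is only a stand-in for whatever transformation is required, and the function name upcase_lines is invented for the example. On a Unix terminal, Ctrl+D at the start of a line makes fgets() return NULL, which is what ends the loop:

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

#define LINE_LENGTH 200

/* Hypothetical transformation: copies `in` to `out`, upper-casing
   each line. Returns the number of fgets() reads that succeeded. */
int upcase_lines(FILE *in, FILE *out)
{
    char buf[LINE_LENGTH];
    int reads = 0;

    assert(in != NULL && out != NULL);

    /* fgets() returns NULL once there is no more input (EOF). */
    while (fgets(buf, sizeof buf, in) != NULL) {
        for (char *p = buf; *p != '\0'; p++)
            *p = (char)toupper((unsigned char)*p);
        fputs(buf, out);       /* buf still holds its newline, if any */
        reads++;
    }
    return reads;
}
```

In main() you would call upcase_lines(stdin, stdout). A line longer than LINE_LENGTH - 1 characters arrives in several pieces, which does not matter for a per-character transformation but might for others.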
Ctrl+D usually signals EOF, so just check for that: fscanf(stdin, "%ms", &buf) != EOF
Also, you reserved just a pointer to char; you should either statically reserve an array or do dynamic allocation.
char buf[255];
or
char *buf = (char*) malloc(255);
EDIT:
As noted by Jonathan Leffler, fscanf() is a really terrible idea if your lines don't have a specific format; use fgets() instead: https://www.tutorialspoint.com/c_standard_library/c_function_fgets.htm
Since you tagged as C++, try this:
std::string text;
std::getline(std::cin, text);
The std::string will dynamically expand as necessary.
The getline function will read until an end-of-line character is read.
Much safer than reading into a character array.

When using fscanf to parse words, how can I check when I skipped a line

I'm working on a program that reads text from a file and parses the text to words and manipulates them. I'm parsing with fscanf like that
while (fscanf (fp, " %32[^ ,.\t\n]%*c", word) == 1)
{
/*manipulate the text word by word */
…
}
I wanna write next to each word that I find in which line I found it.
Is there a way that I can check when I moved down a line
when using the function fscanf?
The soundest advice is to use fgets() or perhaps POSIX getline() to read lines and then consider using sscanf() to parse each line. You will probably need to consider how to use sscanf() in a loop. There are also numerous other options for parsing the line instead of sscanf(), such as strtok_r() or the less desirable strtok() (or, on Windows, strtok_s()); strspn(), strcspn(), strpbrk(); and other functions that are not as standardized.
If you feel you must use fscanf(), then you probably need to capture the trailing context. A simple version of that would be:
char c;
while (fscanf(fp, " %32[^ ,.\t\n]%c", word, &c) == 2)
…
This captures the character after the word, assuming there is one. If your file doesn't end with a newline, it is possible a word will be lost. It's also rather too easy to miss a newline. For example, if the line ends with a full stop (period) before the newline, then c will hold the . and the newline will be skipped by the next iteration of the loop. You could overcome that with:
char s[33];
while (fscanf(fp, " %32[^ ,.\t\n]%32[ ,.\t\n]", word, s) == 2)
…
Note that the length in the format string must be one less than the length in the variable declaration!
After a successful call to fscanf(), the string s could contain multiple newlines and blanks and so on. The fscanf() functions mostly don't care about newlines, and the scan set for s would read multiple newlines in a row if that's what's in the data file.
If you explicitly capture the status from fscanf(), you can be more sensitive to files that end without a newline (or a punctuation character), or that cause other problems:
char s[33];
int rc;
while ((rc = fscanf(fp, " %32[^ ,.\t\n]%32[ ,.\t\n]", word, s)) != EOF)
{
    switch (rc)
    {
    case 2:
        …proceed as normal, checking s for newlines.
        break;
    case 1:
        …probably an overlong word or EOF without a newline.
        break;
    case 0:
        …probably means the next character is one of comma or dot.
        …spaces, tabs, newlines will be skipped without detection
        …by the leading space in the format string.
        break;
    default:
        assert(0);
        break;
    }
}
If you start to care about !, ?, ;, :, ' or " characters — not to mention ( and ) — life gets more complex still. In fact, at that point, the alternatives to sscanf() start looking much better.
It is very hard to use the scanf() family of functions correctly. They're anything but tools for the novice, at least once you start needing to do anything complex. You could look at A beginner's guide to not using scanf(), which contains much valuable information. I'm not wholly convinced by the last couple of examples which are supposed to be bomb-proof uses of scanf(). (It is a little easier to use sscanf() correctly, but you still need to understand what you're up to in detail.)
Read lines with fgets() and then parse them using sscanf:
char buff[1024];
int lineno = 0;
int offset = 0;
while (fgets(buff, 1024, fp)) {
    lineno++;
    offset = 0;
    while (sscanf(buff + offset, " %32[^ ,.\t\n]%*c", word) == 1)
    {
        /* manipulate the text word by word */
    }
}
In the second loop you must advance the buffer offset appropriately in order to parse the line correctly. For this you can use %n, for example, to get the number of bytes read.
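One way to do the offset bookkeeping is sketched here. It is a variation on the loop above, not the exact same format string: delimiters are skipped with strspn() and %n reports how many characters the word consumed. The function name print_words_with_lines is made up:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes "lineno: word" for every word in `fp`.
   Returns the number of words found. */
int print_words_with_lines(FILE *fp, FILE *out)
{
    char buff[1024];
    char word[33];
    int lineno = 0;
    int count = 0;

    assert(fp != NULL && out != NULL);

    while (fgets(buff, sizeof buff, fp)) {
        size_t offset = 0;
        int used = 0;

        lineno++;
        for (;;) {
            /* Step over blanks, commas, dots, tabs and the newline. */
            offset += strspn(buff + offset, " ,.\t\n");
            /* %n stores how many characters this call has consumed. */
            if (sscanf(buff + offset, "%32[^ ,.\t\n]%n", word, &used) != 1)
                break;                 /* nothing left on this line */
            fprintf(out, "%d: %s\n", lineno, word);
            offset += (size_t)used;
            count++;
        }
    }
    return count;
}
```

Words longer than 32 characters come out in pieces, and a physical line longer than the buffer is counted as more than one line, so treat the limits as assumptions.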

What happens with extra memory using fscanf?

I'm new to C and I have a couple questions about fscanf. I wrote a simple program that reads the contents of a file and spits it back out on the command line:
#include <stdio.h>
#include <stdlib.h>
int main (int argc, char *argv[])
{
    if (argc != 2)
    {
        printf("Usage: fscanf txt\n");
        return 1;
    }
    char* txt = argv[1];
    FILE* fp = fopen(txt, "r");
    if (fp == NULL)
    {
        printf("Could not open %s.\n", txt);
        return 2;
    }
    char s[50];
    while (fscanf(fp, "%49s", s) == 1)
        printf("%s\n", s);
    fclose(fp);
    return 0;
}
Let's say the contents of my text file is just "C is cool.", which will output:
C
is
cool.
So I have two questions here:
1) Does fscanf assume that the placeholder "%s" will be a single word (an array of chars only)? According to this program's output, spaces and line breaks seem to prompt the function to return. But what if I wanted to read a whole paragraph? Would I use fread() instead?
2) More importantly I'm wondering what happens with all of the unused space in the array. On the first iteration, I think s[0] = "C" and s[1] = "\0", so are s[2] - s[49] just wasted?
EDIT: while (fscanf(fp, "%49s", s) == 1) - thanks to @M Oehm for pointing this out - enforcing a hard limit here to prevent dangerous buffer overflows
1) Does fscanf assume that the placeholder "%s" will be a single word
(an array of chars only)? According to this program's output, spaces
and line breaks seem to prompt the function to return. But what if I
wanted to read a whole paragraph? Would I use fread() instead?
The %s specifier reads single words that are delimited by white space. The scanf family of functions are very crude; they do not normally distinguish between line breaks and spaces, for example.
A line is anything up to the next newline. There is no concept of paragraph, but you might consider anything between blank lines a paragraph. The function to read lines of text is fgets, so you could read lines until you find an empty one. (fgets retains the newline at the end, mind.)
fread is a function for reading binary data. It is not useful for reading structured texts. (But it can be used to read the contents of a whole text file at once.)
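If you do want the whole file in memory at once, a sketch with fread() could look like this. The name slurp and the initial capacity are arbitrary choices for the example:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads the rest of `fp` into a heap buffer,
   appends a terminating '\0' and stores the byte count in *len.
   Returns NULL on allocation or read failure; the caller frees it. */
char *slurp(FILE *fp, size_t *len)
{
    size_t cap = 4096, used = 0, n;
    char *data;

    assert(fp != NULL && len != NULL);

    data = malloc(cap);
    if (!data)
        return NULL;

    /* Always leave one byte free for the terminator. */
    while ((n = fread(data + used, 1, cap - used - 1, fp)) > 0) {
        used += n;
        if (used == cap - 1) {             /* buffer full: double it */
            char *bigger = realloc(data, cap * 2);
            if (!bigger) {
                free(data);
                return NULL;
            }
            data = bigger;
            cap *= 2;
        }
    }
    if (ferror(fp)) {
        free(data);
        return NULL;
    }
    data[used] = '\0';
    *len = used;
    return data;
}
```

The result is one big string; finding lines or paragraphs in it is then a matter of searching for newline characters.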
2) More importantly I'm wondering what happens with all of the unused
space in the array. On the first iteration, I think c[0] = 'C' and
c[1] = '\0', so are c[2] - c[49] just wasted?
You are right, the data after the null terminator isn't used. "Wasted" is too negative: with user input you don't know whether you will encounter a longer word eventually. Because dynamic allocation requires some care in C, allocating "enough for most cases" is a good practice in C. You should enforce the hard limit when reading, though, to prevent buffer overruns:
fscanf(fp, "%49s", s)
The issue of "wasted" memory becomes more serious if you have an array of arrays of 50 chars. Most of the words will be much shorter than 50 chars. Here, the extra memory might eventually hurt you. 48 extra characters for reading a line are okay, though.
(A strategy to save "compact" arrays of chars is to have a running array of chars that is a concatenation of all strings, including their terminators. The word array is then an array of pointers into that master string.)
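A minimal sketch of that strategy, with fixed capacities so that the pointers stay valid (all names and sizes here are assumptions for illustration):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define POOL_SIZE 4096   /* assumed capacity of the master string */
#define MAX_WORDS 512    /* assumed capacity of the word array */

struct word_pool {
    char   chars[POOL_SIZE];   /* "C\0is\0cool.\0" stored back to back */
    size_t used;               /* bytes of chars[] in use */
    char  *words[MAX_WORDS];   /* pointers into chars[] */
    size_t count;              /* number of words stored */
};

/* Copies `word` into the pool. Returns 1 on success, 0 if it is full. */
int pool_add(struct word_pool *p, const char *word)
{
    size_t n;

    assert(p != NULL && word != NULL);
    n = strlen(word) + 1;                      /* include the terminator */
    if (p->count == MAX_WORDS || n > POOL_SIZE - p->used)
        return 0;
    memcpy(p->chars + p->used, word, n);
    p->words[p->count++] = p->chars + p->used;
    p->used += n;
    return 1;
}

/* Reads whitespace-delimited words from fp into the pool.
   Returns the number of words now stored. */
size_t pool_read(struct word_pool *p, FILE *fp)
{
    char s[50];

    while (fscanf(fp, "%49s", s) == 1 && pool_add(p, s))
        ;
    return p->count;
}
```

The pool must start out zeroed, for example as a static object. With a growable master string you would store offsets instead of pointers, because realloc() may move the data.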
You use the specifier %s, which reads and stores data in the array s until it encounters a space or newline. As soon as it encounters a space, fscanf returns.
I think c[0] = "C" and c[1] = "\0", so are c[2] - c[49] just wasted?
Yes, s[0]='C' and s[1]='\0', and you probably can't do anything about the array being much larger than needed.
If you want complete string "C is cool" stored in array use fgets.
#define len 1000
char s[len];
while(fgets(s,len,fp)!=NULL) {
//your code
}

File Handling question on C programming

I want to read line-by-line from a given input file, process each line (i.e. its words) and then move on to the next line...
So I am using fscanf(fptr,"%s",words) to read the word and it should stop once it encounters end of line...
but this is not possible in fscanf, I guess... so please tell me what to do...
I should read all the words in the given line (i.e. end of line should be encountered) to terminate and then move on to the next line, and repeat the same process..
Use fgets(). Yeah, link is to cplusplus, but it originates from c stdio.h.
You may also use sscanf() to read words from string, or just strtok() to separate them.
In response to comment: this behavior of fgets() (leaving \n in the string) allows you to determine if the actual end-of-line was encountered. Note, that fgets() may also read only part of the line from file if supplied buffer is not large enough. In your case - just check for \n in the end and remove it, if you don't need it. Something like this:
// actually you'll get str contents from fgets()
char str[MAX_LEN] = "hello there\n";
size_t len = strlen(str);
if (len && str[len-1] == '\n') {
str[len-1] = 0;
}
Simple as that.
If you are working on a system with the GNU extensions available there is something called getline (man 3 getline) which allows you to read a file on a line by line basis, while getline will allocate extra memory for you if needed. The manpage contains an example which I modified to split the line using strtok (man 3 strtok).
#include <stdio.h>
#include <stdlib.h>
#include <string.h>     /* strtok */
#include <sys/types.h>  /* ssize_t */
int main(void)
{
    FILE *fp;
    char *line = NULL;
    size_t len = 0;
    ssize_t read;
    fp = fopen("/etc/motd", "r");
    if (fp == NULL)
    {
        printf("File open failed\n");
        return 1;
    }
    while ((read = getline(&line, &len, fp)) != -1) {
        // At this point we have a line held within 'line'
        printf("Line: %s", line);
        const char *delim = " \n";
        char *ptr = strtok(line, delim);
        while (ptr != NULL)
        {
            printf("Word: %s\n", ptr);
            ptr = strtok(NULL, delim);
        }
    }
    free(line);   // free(NULL) is a no-op, so no check is needed
    fclose(fp);
    return 0;
}
Given the buffering inherent in all the stdio functions, I would be tempted to read the stream character by character with getc(). A simple finite state machine can identify word boundaries, and line boundaries if needed. An advantage is the complete lack of buffers to overflow, aside from whatever buffer you collect the current word in if your further processing requires it.
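A bare-bones version of such a state machine, counting words and lines with getc(), might look like the following. The struct and function names are invented for the sketch, and a "word" here is simply any run of non-whitespace characters:

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

struct counts {
    long lines;   /* number of '\n' characters seen */
    long words;   /* number of whitespace-delimited words */
};

/* Hypothetical helper: two-state machine (inside a word / between words). */
struct counts count_words(FILE *fp)
{
    struct counts c = {0, 0};
    int in_word = 0;
    int ch;

    assert(fp != NULL);

    while ((ch = getc(fp)) != EOF) {
        if (ch == '\n')
            c.lines++;                 /* line boundary */
        if (isspace(ch)) {
            in_word = 0;               /* word boundary */
        } else if (!in_word) {
            in_word = 1;               /* first character of a new word */
            c.words++;
        }
    }
    return c;
}
```

To process the words rather than count them, you would collect the characters into a bounded buffer while in_word is set, which is the one buffer mentioned above.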
You might want to do a quick benchmark comparing the time required to read a large file completely with getc() vs. fgets()...
If an outside constraint requires that the file really be read a line at a time (for instance, if you need to handle line-oriented input from a tty) then fgets() probably is your friend as other answers point out, but even then the getc() approach may be acceptable as long as the input stream is running in line-buffered mode which is common for stdin if stdin is on a tty.
Edit: To have control over the buffer on the input stream, you might need to call setbuf() or setvbuf() to force it to a buffered mode. If the input stream ends up unbuffered, then using an explicit buffer of some form will always be faster than getc() on a raw stream.
Best performance would probably use a buffer related to your disk I/O, at least two disk blocks in size and probably a lot more than that. Often, even that performance can be beat by arranging the input to be a memory mapped file and relying on the kernel's paging to read and fill the buffer as you process the file as if it were one giant string.
Regardless of the choice, if performance is going to matter then you will want to benchmark several approaches and pick the one that works best in your platform. And even then, the simplest expression of your problem may still be the best overall answer if it gets written, debugged and used.
but this is not possible in fscanf,
It is, with a bit of wickedness ;)
Update: More clarification on evilness
but unfortunately a bit wrong. I assume [^\n]%*[^\n] should read [^\n]%*. Moreover, one should note that this approach will strip whitespaces from the lines. – dragonfly
Note that xstr(MAXLINE) [^\n] reads up to MAXLINE characters, which can be anything except the newline character (i.e. \n). The second part of the specifier, i.e. *[^\n], rejects anything (that's why the * character is there) if the line has more than MAXLINE characters, up to but NOT including the newline character. The newline character tells scanf to stop matching. What if we did as dragonfly suggested? The only problem is that scanf will not know where to stop and will keep suppressing assignment until the next newline is hit (which is another match for the first part). Hence you will trail by one line of input when reporting.
What if you wanted to read in a loop? A little modification is required. We need to add a getchar() to consume the unmatched newline. Here's the code:
#include <stdio.h>
#define MAXLINE 255
/* stringify macros: these work only in pairs, so keep both */
#define str(x) #x
#define xstr(x) str(x)
int main(void) {
    char line[ MAXLINE + 1 ];
    /*
      Wickedness explained: we read from `stdin` to `line`.
      The format specifier is the only tricky part: We don't
      bite off more than we can chew -- hence the specification
      of maximum number of chars i.e. MAXLINE. However, this
      width has to go into a string, so we stringify it using
      macros. The careful reader will observe that once we have
      read MAXLINE characters we discard the rest up to and
      including a newline.
    */
    int n = fscanf(stdin, "%" xstr(MAXLINE) "[^\n]%*[^\n]", line);
    if (!feof(stdin)) {
        getchar();
    }
    while (n == 1) {
        printf("[line:] %s\n", line);
        n = fscanf(stdin, "%" xstr(MAXLINE) "[^\n]%*[^\n]", line);
        if (!feof(stdin)) {
            getchar();
        }
    }
    return 0;
}

Resources