Reading the string with defined number of characters from the input - c

I am trying to read a fixed number of characters from the input. Say I want to read 30 characters and put them into a string. I managed to do this with a for loop, and then I cleared the rest of the input buffer, as shown below.
for(i=0;i<30;i++){
string[i]=getchar();
}
string[30]='\0';
/* discard the rest of the line; c must be an int so it can hold EOF */
while((c=getchar())!='\n' && c!=EOF){
}
And this is working for me, but I was wondering if there is another way to do this. I was researching and some of them are using sprintf() for this problem, but I didn't understand that solution. Then I found that you can use scanf with %s. And some of them use %3s when they want to read 3 characters. I tried this myself, but this command only reads the string till the first empty space. This is the code that I used:
scanf("%30s",string);
When I run my program with this line and type, for example, "Today is a beatiful day. It is raining, but it's okay i like rain.", I expected the first 30 characters to be saved into the string. But when I print the string with puts(string); it only shows "Today".
If I use scanf("%s",string) or gets(string) that would rewrite some parts of my memory if the number of characters on input is greater than 30.

You can use scanf("%30[^\n]",s)
This is a scanset, which lets you choose which characters to accept. Here the caret '^' denotes negation, i.e. the conversion accepts every character except \n. The width 30 limits the input to at most 30 characters, so s must have room for 31 bytes, including the terminating '\0'.
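As a minimal sketch of that suggestion (the helper name read_prefix30 and the FILE * parameter are assumptions made for illustration, not part of the original answer), the scanset can be combined with a loop that discards the remainder of the line:

```c
#include <stdio.h>

/* Hypothetical helper (not from the question): reads at most 30 characters
 * of the current line from `in` into `out`, which must hold at least 31 bytes.
 * Returns 1 if something was stored, 0 on an empty line or at end of input. */
int read_prefix30(FILE *in, char out[31])
{
    int c;
    /* %30[^\n] stores up to 30 non-newline characters plus a '\0'. */
    int matched = fscanf(in, "%30[^\n]", out);
    if (matched != 1)
        out[0] = '\0';          /* nothing matched: leave an empty string */

    /* Discard whatever is left of the line, including the '\n'. */
    while ((c = getc(in)) != '\n' && c != EOF)
        ;
    return matched == 1;
}
```

Calling read_prefix30(stdin, string) with a 31-byte string would then keep the first 30 characters of the typed line, spaces included.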

The API you're looking for is fgets(). The man page describes
char *fgets(char *s, int size, FILE *stream);
fgets() reads in at most one less than size characters from stream and stores them into the buffer pointed to by s. Reading stops after an EOF or a newline. If a newline is read, it is stored into the buffer. A terminating null byte ('\0') is stored after the last character in the buffer.
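A possible way to apply this to the 30-character case is sketched below. The helper read_line_truncated is hypothetical. It assumes you want to keep the first size-1 characters, drop the newline, and throw away anything that did not fit:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line from `in` into `buf` (capacity `size`),
 * keeps at most size-1 characters, strips the '\n', and drops the rest of an
 * over-long line. Returns 1 if a line was read, 0 at end of input or on error. */
int read_line_truncated(FILE *in, char *buf, int size)
{
    if (fgets(buf, size, in) == NULL)
        return 0;

    size_t len = strcspn(buf, "\n");   /* index of '\n', or strlen if absent */
    if (buf[len] == '\n') {
        buf[len] = '\0';               /* whole line fit: just strip newline */
    } else {
        int c;                         /* line was cut: skip its remainder */
        while ((c = getc(in)) != '\n' && c != EOF)
            ;
    }
    return 1;
}
```

With char string[31]; a call such as read_line_truncated(stdin, string, sizeof string) stores at most 30 characters and leaves the stream positioned at the start of the next line.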

Related

fgets doesn't read in at most one less than size characters from stream

I am learning fgets from the manpage. I did some tests on fgets to make sure I understand it. One of the tests I did results in behaviour contrary to what is specified in the man page. The man page says:
char *fgets(char s[restrict .size], int size, FILE *restrict stream);
fgets() reads in at most one less than size characters from stream and
stores them into the buffer pointed to by s. Reading stops after an EOF
or a newline. If a newline is read, it is stored into the buffer. A
terminating null byte ('\0') is stored after the last character in the
buffer.
But it doesn't "read in at most one less than size characters from stream". As demonstrated by the following program:
#include<stdio.h>
#include<stdlib.h>
int main(){
FILE *fp;
fp=fopen("sample", "r");
char *s=calloc(50, sizeof(char));
while(fgets(s,2,fp)!=NULL) printf("%s",s);
}
The sample file:
thiis is line no. 1
joke joke 2 joke joke
arch linux btw 3
4th line
5th line
The output of the compiled binary:
thiis is line no. 1
joke joke 2 joke joke
arch linux btw 3
4th line
5th line
The expected output according to the man page:
t
j
a
4
5
Is the man page wrong, or am I missing something?
Is the man page wrong or am i missing something?
I won't say that the man page is wrong but it could be more clear.
There are 3 things that may stop fgets from reading from the stream:
1. The buffer is full (i.e. only room left for the termination character)
2. A newline character was read from the stream
3. End-of-file occurred
The quoted man page only mentions two of those conditions clearly.
Reading stops after an EOF or a newline.
That is, #2 and #3 are mentioned very explicitly, while #1 is (kind of) derived from
reads in at most one less than size characters from stream
Here is another description from https://man7.org/linux/man-pages/man3/fgets.3p.html
... read bytes from stream into the array pointed to by s until n-1 bytes are read, or a newline is read and transferred to s, or an end-of-file condition is encountered.
where the 3 cases are clearly mentioned.
But yes... you are missing something. Once the buffer gets full, the rest of the current line is neither read nor discarded. It stays in the stream and is available for the next read, so nothing is lost. You just need more fgets calls to read all the data.
As suggested in a number of comments (e.g. Fe2O3 and Lundin) you can see this if you change the print statement so that it includes a delimiter of some kind. For instance (from Lundin):
printf("|%s|",s);
This will make clear exactly what you got from the individual fgets calls.
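To make the effect countable, here is a small sketch. The helper count_fgets_calls and its 64-byte internal buffer are assumptions for illustration. It drains a stream with a given size argument, prints each piece between delimiters, and returns the number of successful fgets calls:

```c
#include <stdio.h>

/* Hypothetical helper: counts how many fgets() calls with the given
 * size are needed to drain the stream, printing each piece between
 * '|' delimiters. Returns -1 if size does not suit the local buffer. */
int count_fgets_calls(FILE *fp, int size)
{
    char s[64];
    int calls = 0;
    if (size < 2 || size > (int)sizeof s)
        return -1;                 /* keep the sketch within its buffer */
    while (fgets(s, size, fp) != NULL) {
        printf("|%s|", s);         /* one delimited piece per call */
        calls++;
    }
    return calls;
}
```

With a size of 2, every call delivers exactly one character (newlines included), so a file of six characters takes six calls. With a large size, it takes one call per line.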
The quote you provided states clearly:
If a newline is read, it is stored into the buffer.
Where do you see that this call fgets(s,2,fp) reads the new line character for example when reading this line?
thiis is line no. 1
The line contains only one new line character at its end.
This call reads only one character at a time, and each character it reads is followed by the terminating zero character '\0'.
So the read strings look like
{ 't', '\0' }
{ 'h', '\0' },
{ 'i', '\0' }
// ...
{ '1', '\0' }
{ '\n', '\0' }
If you have a call of fgets like that
fgets(s,n,fp)
then at most n-1 characters are read from the input stream. One character is reserved for the terminating zero character '\0' to build a string.
From the C Standard (7.21.7.2 The fgets function)
2 The fgets function reads at most one less than the number of
characters specified by n from the stream pointed to by stream into
the array pointed to by s. No additional characters are read after a
new-line character (which is retained) or after end-of-file. A null
character is written immediately after the last character read into
the array

Skipping oversized inputs when using fgets(3)

This is probably quite easy to figure out, and maybe I'm just looking in the wrong places, but how does one test whether fgets has read an oversized input? In the code below, I'm trying to skip further processing for empty lines and oversized ones and go straight to the next line. For empty lines it works just fine.
When I print strlen(buffer) for line lengths < maxsize, it gives me the expected values.
However, when I enter lines that exceed the maxsize, it prints a value over 9000, which should still exceed the maxsize and therefore enter the if-clause, but this doesn't happen. I've tried casting the return value of strlen to an int, but that didn't work.
What am I missing here? Thanks for any replies :)
char buffer[102];
while (fgets(buffer,100,stdin)!=NULL){
size_t maxsize = 102;
printf("%ld",strlen(buffer));
if(strcmp(buffer,"\n")==0||strlen(buffer)>maxsize){
continue;
}
//further processing
}
In the code:
char buffer[102];
while (fgets(buffer,100,stdin)!=NULL){
You don't need to make buffer two characters larger than the size you pass to fgets. You can simply pass the result of the sizeof operator as the size parameter, as in:
char buffer[102];
while (fgets(buffer, sizeof buffer, stdin) != NULL) {
That will give you space for lines of up to 101 characters (leaving space for the string terminator), including (or not, see below) the newline character.
But, to answer your question: I understand that you want to know what happens when a line of input is longer than the buffer size you provided, what happens to the rest of that input, and how fgets deals with it.
fgets() reads characters until it finds a \n in the input or the buffer fills up (counting the \0 character that it must append to terminate the string). So fgets() stores at most as many characters as the buffer holds, minus one reserved for the null terminator, and the rest of the line is read by the next fgets() call (or by a call to any other function of the stdio package).
So, basically, lines longer than one less than the buffer size are split into pieces. Every piece except the last has a length of exactly the size you specified minus one and does not end in a newline. The last piece includes the newline and is only as long as it needs to be (always less than or equal to the size specified minus one).
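Based on that behaviour, one way to skip empty and oversized lines is sketched below. The helper count_usable_lines and its 128-byte cap are assumptions for illustration. The idea is that a piece without a trailing newline, read before end of file, must be the start of an oversized line, so the rest of that line is drained before continuing:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: counts the lines that are neither empty nor too long
 * for a buffer of `size` bytes (capped at 128 here), skipping the others. */
int count_usable_lines(FILE *in, int size)
{
    char buffer[128];
    int usable = 0;
    if (size < 2 || size > (int)sizeof buffer)
        return -1;

    while (fgets(buffer, size, in) != NULL) {
        size_t len = strlen(buffer);
        if (len > 0 && buffer[len - 1] != '\n' && !feof(in)) {
            /* No newline stored and end of file not yet seen:
             * treat the line as oversized and drop its remainder. */
            int c;
            while ((c = getc(in)) != '\n' && c != EOF)
                ;
            continue;
        }
        if (strcmp(buffer, "\n") == 0)
            continue;              /* empty line */
        usable++;                  /* further processing would go here */
    }
    return usable;
}
```

Note that comparing strlen(buffer) against the buffer size can never detect an oversized line, because fgets never stores more than size-1 characters. The missing newline is the signal to look for.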

I/O in C Errors

I've been trying for hours to find the answer to this question I got at university. I tried running this on a file with two lines:
hello
world
and it reads the file perfectly, so I can't find the answer. I would appreciate your help!
A student wrote the next function for reading a text file and printing it exactly as it is.
void ReadFile(FILE *fIn)
{
char nextLine[MAX_LINE_LENGTH];
while(!feof(fIn))
{
fscanf(fIn,"%s",nextLine);
printf("%s\n",nextLine);
}
}
What are the two errors in this function?
You can assume that each line in the file is not longer than MAX_LINE_LENGTH characters, and that it is a text file that contains only alphabet characters, and that each line is terminated by '\n'.
Thanks.
It discards white space. Try adding multiple spaces and tabs.
It may process the last item twice, because feof() only becomes true after a read has already failed, and if there is a read error, the loop never terminates.
See: Why is “while ( !feof (file) )” always wrong?
Reading strings via scanf with a bare %s is dangerous. There is no bounds checking. You may write past the end of your MAX_LINE_LENGTH buffer (and boom! Segfault).
The main error is that fscanf( fIn, "%s", nextLine ) doesn't scan a complete line.
From man page:
s
Matches a sequence of non-white-space characters; the next pointer must be a pointer to character array that is long enough to hold the input sequence and the terminating null byte ('\0'), which is added automatically. The input string stops at white space or at the maximum field width, whichever occurs first.
Thus if you have a line "a b" the first fscanf() will scan just "a" and the second one "b" and both are printed in two different lines. You can use fgets() to read a whole line.
The second one is probably that it's stated that "each line in the file is not longer than MAX_LINE_LENGTH characters", but nextLine can contain at most MAX_LINE_LENGTH-1 characters (+ '\0'). That problem becomes even more important if you replace fscanf() with fgets(), because then nextLine must also have the capacity to store '\n' or '\r\n' (depending on the platform you're on).
A correct way of doing that is:
void ReadFile(FILE *fIn)
{
char nextLine[MAX_LINE_LENGTH];
while(fgets(nextLine, MAX_LINE_LENGTH, fIn)) {
printf("%s", nextLine);
}
}
As some have posted, using feof to control a loop is not a good idea, and neither is using fscanf to read lines.

C - fgets() - length of newline char

I am trying to read 1 line and I am not sure how newline char is represented. Should I consider it as 2 chars or 1 char, when reading it from file by fgets() ? For example, I have a line of 15 chars + new line in file. So how should I safely allocate string and read that line?
At first, I tried this:
char buf[16];
fgets(buf, 16, f);
It read the line correctly without newline char and I assume that buf[15] holds the null character.
However, when I want to read and store the newline char, it doesn't work as I thought. As far as I know, '\n' should be considered as one char and take just one byte, so to read it, I just need to read one more char.
But when i try this
char buf[17];
fgets(buf, 17, f);
it does exactly the same thing as the previous example: there is no newline char stored in my string (I am not sure where the null char is stored in this case).
To read entire line with newline I need to do this
char buf[18];
fgets(buf, 18, f);
OR this (it works, but I am not sure if it's safe)
char buf[17];
fgets(buf, 18, f);
So the question is, why do I need to allocate and read 18 chars, when the line has only 15 chars + newline?
You need to provide buffer space for the 15-chars of text, up to 2 characters for the new line (to handle Windows line termination of \r\n), and one more for the null termination. So that's 18.
Like you did here:
char buf[18]; fgets(buf, 18, f);
The num parameter to fgets tells the call the size of your buffer it's writing to.
I am trying to read 1 line and I am not sure how newline char is represented.
In text mode, newline is '\n', and that's true on any conforming C implementation. I wouldn't use fgets on anything but a text-mode stream (I don't know, and I don't want to know, how it works in binary mode on an implementation using \r as the end-of-line marker, or worse, an out-of-band end-of-line marker; I wouldn't be surprised if it looked for a \n, never found one, and thus tried to read until the end of the file).
You should allocate space for the maximal line length, including the newline, plus the terminating NUL and, more importantly, you must never lie to fgets about the length of the buffer. You can check whether the buffer was long enough, as the newline won't be present if it wasn't.
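The sizing arithmetic can be checked with a short sketch. The helper line_fits is hypothetical, and the example assumes the line ends in a single '\n' as seen through a text-mode stream. For 15 characters plus a newline, a size of 16 leaves the newline unread, while 17 (15 + '\n' + '\0') is enough:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads with fgets() using a size of `size` bytes and
 * reports whether the whole line, newline included, fit. Stores the number
 * of characters read in *len. Returns 1 if the '\n' was stored, 0 if it was
 * not, and -1 if nothing could be read or size does not suit the buffer. */
int line_fits(FILE *f, int size, size_t *len)
{
    char buf[64];
    if (size < 2 || size > (int)sizeof buf)
        return -1;
    if (fgets(buf, size, f) == NULL)
        return -1;
    *len = strlen(buf);
    /* The newline is present only if the buffer was long enough. */
    return *len > 0 && buf[*len - 1] == '\n';
}
```

If the file actually contains "\r\n" and is read without translation, one more byte is needed, which matches the 18 observed in the question.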
The matter is about the escape sequence that lets you test for a newline. In the file it may be the two bytes 0x0d 0x0a, but the C escape code \n is a single character, so when you use strncmp you need to provide a string and a length of one:
if(strncmp(&buff[i], "\n", 1) == 0)
With a length of two, the comparison would also include the terminating '\0' of "\n", so it would only match when the newline is the last character in buff.

How To Read in Strings that only Contain Alphabet letters with fscanf?

I have been struggling to figure out the fscanf formatting. I just want to read in a file of words delimited by spaces. And I want to discard any strings that contain non-alphabetic characters.
char temp_text[100];
while(fscanf(fcorpus, "%101[a-zA-Z]s", temp_text) == 1) {
printf("%s\n", temp_text);
}
I've tried the above code both with and without the 's'. I read in another stackoverflow thread that the s when used like that will be interpreted as a literal 's' and not as a string. Either way - when I include the s and when I do not include the s - I can only get the first word from the file I am reading through to print out.
The %[ scan specifier does not skip leading spaces. Either add a space before it or at the end in place of your s. Also you have your 100 and 101 backwards and thus a serious buffer overflow bug.
The s isn't needed.
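A minimal sketch of one way to get the behaviour described in the question follows. Note that it swaps the scanset for a width-limited %99s plus an isalpha check, since a scanset alone stops at the first non-letter instead of discarding the whole word. The helper name print_alpha_words is an assumption for illustration:

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: prints every whitespace-delimited word that consists
 * only of letters and returns how many were printed. Words longer than 99
 * characters are read in pieces by %99s. */
int print_alpha_words(FILE *fcorpus)
{
    char temp_text[100];
    int kept = 0;
    /* %99s skips leading whitespace and stores at most 99 chars + '\0'. */
    while (fscanf(fcorpus, "%99s", temp_text) == 1) {
        int alpha_only = 1;
        for (const char *p = temp_text; *p != '\0'; p++) {
            if (!isalpha((unsigned char)*p)) {
                alpha_only = 0;    /* word contains a non-letter: discard */
                break;
            }
        }
        if (alpha_only) {
            printf("%s\n", temp_text);
            kept++;
        }
    }
    return kept;
}
```

The field width is one less than the array size, which also removes the overflow caused by pairing a width of 101 with a 100-byte buffer.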
Here are a few things to try:
Print out the return value from fscanf, and make sure it is 1.
Make sure that the fscanf is consuming the whitespace by using fgetc to get the next character and printing it out.

Resources