Scanf: detect that the input was too long - c

We can easily limit the length of the input accepted by scanf:
char str[101];
scanf("%100s", str);
Is there any efficient way to find out that the string was trimmed? We could, for example, report an error in such a case.
We could read "%101s" into char strx[102] and check with strlen() but this involves extra cost.

Use the %n conversion to write the scan position to an integer. If it was 100 past the beginning then the string was too big.
I find that %n is useful for all kinds of things.
I thought the above was plenty of information for anyone who had read the scanf docs / man page and had actually tried it.
The idea is that you make your buffer and your scan limit bigger than whatever size string you expect to find. Then if you find a scan result that is exactly as big as your scan limit you know it is an invalid string. Then you report an error or exit or whatever it is that you do.
Also, if you're about to say "But I want to report an error and continue on the next line but scanf left my file in an unknown position."
That is why you read a line at a time using fgets and then use sscanf instead of scanf. It removes the possibility of ending the scan in the middle of the line and makes it easy to count line numbers for error reporting.
So here is the code that I just wrote:
#include <stdio.h>
#include <stdlib.h>
int scan_input(const char *input) {
char buf[101];
int position = 0;
int matches = sscanf(input, "%100s%n", buf, &position);
printf("'%s' matches=%d position=%d\n", buf, matches, position);
if (matches < 1)
return 2;
if (position >= 100)
return 3;
return 0;
}
int main(int argc, char *argv[]) {
if (argc < 2)
exit(1);
const char *input = argv[1];
return scan_input(input);
}
And here is what happens:
$ ./a.out 'This is a test string'
'This' matches=1 position=4
$ ./a.out 'This-is-a-test-string'
'This-is-a-test-string' matches=1 position=21
$ ./a.out '01234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789'
'0123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789' matches=1 position=100
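The fgets-plus-sscanf idea with line numbers for error reporting can be sketched as follows. This is only an illustration: first_overlong_line and TOKEN_LIMIT are made-up names, the 4096-byte line buffer is an assumption about the longest input line, and the extra leading " %n" is my addition so that leading blanks are not counted toward the token length.

```c
#include <stdio.h>

#define TOKEN_LIMIT 100   /* hypothetical name; matches the "%100s" below */

/* Returns the 1-based number of the first line whose first token
 * filled the whole scan limit, or 0 if no such line was found.
 * Assumes no input line is longer than the 4096-byte line buffer. */
int first_overlong_line(FILE *in)
{
    char line[4096];
    char token[TOKEN_LIMIT + 1];
    int lineno = 0;

    while (fgets(line, sizeof line, in) != NULL) {
        int start = 0, end = 0;
        lineno++;
        /* " %n" skips leading blanks and records where the token starts;
         * the final %n records where the scan stopped. */
        if (sscanf(line, " %n%100s%n", &start, token, &end) < 1)
            continue;                 /* blank line: nothing to check */
        if (end - start >= TOKEN_LIMIT)
            return lineno;            /* token reached the scan limit */
    }
    return 0;
}
```

Because each line is consumed whole by fgets, an oversized token never leaves the stream in the middle of a line, so scanning can simply continue with the next line.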

You could use fgets() to read an entire line. Then you verify if the newline character is in the string. However, this has a few disadvantages:
It will consume the entire line, and maybe that's not what you want. Notice that fgets() is not equivalent to scanf("%100s") -- the latter only reads until the first blank character appears;
If the input stream is closed before a newline character is supplied, you will be undecided;
You have to go through the array to search for the newline character.
So the better option seems to be as such:
char str[101];
int c;
scanf("%100s", str);
c = getchar();
ungetc(c, stdin);
if (c == EOF || isspace(c)) {
/* successfully read everything */
}
else {
/* input was too long */
}
This reads the string normally and checks for the next character. If it's a blank or if the stream has been closed, then everything was read.
The ungetc() is there in case you don't want your test to modify the input stream. But it's probably unnecessary.
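Wrapped in a function, the same peek-at-the-next-character test might look like the sketch below. The name read_word and its return convention are assumptions made for illustration, and it takes a FILE * instead of reading stdin directly so it can be tried on any stream.

```c
#include <ctype.h>
#include <stdio.h>

/* Reads one whitespace-delimited word of at most 100 characters from in.
 * Returns 1 if the whole word fit, 0 if it was cut off at the limit,
 * and EOF if no word could be read. buf needs room for 101 bytes. */
int read_word(FILE *in, char buf[101])
{
    if (fscanf(in, "%100s", buf) != 1)
        return EOF;

    int c = getc(in);          /* look at the character after the word */
    if (c != EOF)
        ungetc(c, in);         /* put it back for the next reader */

    return (c == EOF || isspace(c)) ? 1 : 0;
}
```

After a 0 result the rest of the oversized word is still in the stream, so the next call returns its remaining characters unless the caller discards them first.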

fgets() is a better way to go, read the line of user input and then parse it.
But if OP still wants to use scanf()....
Since it is not possible to "detect that the input was too long" without attempting to read more than the n maximum characters, code needs to read beyond.
unsigned char sentinel;
char str[101];
str[0] = '\0';
if (scanf("%100s%c", str, &sentinel) == 2) {
    ungetc(sentinel, stdin); // put back for next input function
    if (isspace(sentinel)) NoTrimOccurred();
    else TrimOccurred();
} else {
    // no character followed the string (end of input), so nothing was cut off
    NoTrimOccurred();
}

A very rough but easy way of doing this would be to add a getchar() call after the scanf().
scanf() leaves the newline in the input buffer after reading the actual input. If the supplied input is shorter than the maximum field width, getchar() returns the newline (or whatever whitespace ended the word). Otherwise, it returns the first unconsumed character of the input.
That said, the ideal way of doing it is to actually read a bit more than the required value and see if anything appears in the buffer area. You can make use of fgets() and then, check for the 100th element value to be a newline or not but this also comes with additional cost.
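A rough sketch of the fgets() variant is shown here. The function name read_line_checked and its return codes are assumptions for illustration. It looks for the newline at the end of what fgets() stored and, if it is missing, peeks at the stream to tell an oversized line from a final line that simply has no newline.

```c
#include <stdio.h>
#include <string.h>

/* Reads one line into buf (size bytes). Returns:
 *    1  the whole line fit (a trailing newline is removed),
 *    0  the line was too long; the excess has been discarded,
 *   -1  end of input or a read error before anything was read. */
int read_line_checked(FILE *in, char *buf, size_t size)
{
    if (fgets(buf, (int)size, in) == NULL)
        return -1;

    size_t len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';
        return 1;
    }

    /* No newline was stored. Either the line was too long, it fit
     * exactly, or the input ended without a newline. */
    int c = getc(in);
    if (c == EOF || c == '\n')
        return 1;

    /* Drop the rest of the oversized line. */
    while ((c = getc(in)) != '\n' && c != EOF)
        ;
    return 0;
}
```

The extra cost is one strlen() over the buffer, plus one getc() in the uncommon case where the newline is missing.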

Related

While loops repeats with scanf equal to number of characters read

I have a program that is meant to take commands. The first question asks how the commands will be supplied, on the command line or from a file, by typing c or f.
If neither is typed, the while loop repeats without allowing input, once for each character in the incorrect input, instead of stopping and letting scanf grab input again. I don't use its return value at any point, so I am at a loss as to why this happens. Correctly entering 'f' or 'c' does not cause the problem.
Any help would be greatly appreciated.
#include<stdio.h>
#include <math.h>
#include <stdlib.h>
#include <string.h>
#define true 1
#define false 0
typedef int bool;
double **temp_array;
double temp1d_array[36];
char consolep[100];
char *fp1;
FILE *fp;
char string_IO1[50];
char string_temp[50];
char buffer[50];
char current_command[10];
int halt = 0;
char *strtodptr;
void main(){
printf("welcome \n");
char IO;
char read[250];
char file_console;
int IO_method = 0;
char command[10];
char type_IO;
char type_of_var_IO;
char dim_IO[3];
char array_string_IO[40];
//console or file
//decide IO Method loop 1
while (IO_method==0)
{
printf("please type 'c'for console or 'f' for file to select input type\n");
scanf("%c", &file_console);
//if console
if(file_console =='c')
{
IO_method=1;
printf("method is console\n");
}
//if file
else if(file_console=='f')
{
IO_method=2;
printf("method is file\n");
printf("please enter a file directory\n");
scanf("%s",&string_IO1);
}
else
{
printf("invalid entry\n");
file_console=NULL;
IO_method=0;
}
}}//code here continues but i compiled it without and has no bearing on the error.
The calls to scanf() in the posted code leave characters behind in the input stream. If, for example, the user enters g at the first prompt, pressing ENTER after, the \n character is left behind. If the user enters more than one character, the extra characters are left behind. The later calls to I/O functions will pick up these unexpected characters, causing the program to misbehave.
One solution is to write a little function to clear the input stream after such I/O function calls:
void clear_input(void)
{
int c;
while ((c = getchar()) != '\n' && c != EOF) {
continue;
}
}
This function discards any characters that remain in the input stream (up to and including the first newline character). Note that c must be an int to ensure that EOF is handled correctly. Also note that this function should only be called when the input stream is not empty; an empty input stream will cause the call to getchar() to block, waiting for input.
For example, after the first call to scanf() you know that there is at least a \n character still in the input stream (maybe more characters preceding the newline); just call clear_input() to clean the input stream before the next I/O call:
scanf("%c", &file_console);
clear_input();
The value returned by scanf() should be checked in robust code; the number of successful assignments made is returned, or EOF if an input failure (such as end of file or a read error) occurs before any conversion. This can help to validate input.
A better option would be to use fgets() to read from stdin and fetch a line of input to a buffer, and then use sscanf() to parse the buffer. One advantage here is that fgets() will read all characters up to, and including, a newline character, provided there is adequate space in the buffer. So, allocate a generous buffer[] to make it likely that no reasonable input can fail to be contained in the buffer. If you need to be more careful, you can check the input buffer for a \n character (using strchr(), for example). If the \n character is found in the buffer, then the input stream is empty, otherwise there are extra characters left behind, and the clear_input() function can be called to clean things up:
#include <stdlib.h>
#include <string.h>
...
char buffer[1000];
char end;
while (IO_method==0)
{
printf("please type 'c'for console or 'f' for file to select input type\n");
if (fgets(buffer, sizeof buffer, stdin) == NULL) {
/* Handle input error */
perror("Error in fgets()");
exit(EXIT_FAILURE);
}
/* May need to clear input stream, if input is too large */
if (strchr(buffer, '\n') == NULL) {
clear_input();
}
/* Input again if input is not as expected */
if (sscanf(buffer, "%c%c", &file_console, &end) != 2 || end != '\n') {
continue;
}
...
Here, buffer[] is declared with a generous size to hold all reasonable inputs. fgets() places the input in buffer, up to and including the newline (space permitting). Note that the return value from fgets() is checked; a null pointer is returned at end of file or if there is a rare I/O error. Next, strchr() is used to check for the \n in buffer; it is expected to be present, but if not, a null pointer is returned, signalling that there are still characters in the input stream to be cleared. Next, sscanf() is used to parse the buffer. Here, note that end is used to store the character after the user-input character. In expected input, this is a \n character. If the user enters too many characters, testing end will reveal this, and input is taken again.
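The parsing step on its own can be pulled out into a small helper, sketched here under the assumption that the line came from fgets(). The name parse_choice is hypothetical. It uses the same "%c%c" trick, with the second character acting as a check that nothing but a newline followed the answer.

```c
#include <stdio.h>

/* Returns 'c' or 'f' when line holds exactly that letter followed by a
 * newline, and 0 for any other input. */
char parse_choice(const char *line)
{
    char choice, end;

    /* Two conversions must succeed and the second must be the newline. */
    if (sscanf(line, "%c%c", &choice, &end) != 2 || end != '\n')
        return 0;
    if (choice != 'c' && choice != 'f')
        return 0;
    return choice;
}
```

A caller would loop, reading a line with fgets() and calling parse_choice() until it returns a non-zero value.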
Also note that in the posted code, string_IO1 is declared as a character array (and it is not a great name, since the characters in IO1 are difficult to distinguish on a screen), so the call to scanf() should have looked like:
scanf("%s",string_IO1);
And file_console has been declared as a char, so the assignment file_console = NULL; is wrong, since NULL is the null pointer macro, not a character value.

C string compare wont allow space bar

using the cprogrammingsimplified tutorial for writing my own stringcompare.
Finished reformatting it and ran it.
works fine for single words,
But
typing space bar skips the second scan and immediately outputs
'words aren't the same'
anyone any idea how to allow the use of even a single space bar?
Thanks in advance.
#include <stdio.h>
int mystrcmp(char s1[], char s2[]);
int main(){
char s1[10], s2[10];
int flag;
printf("Type a string of 10\n\n");
scanf("%s",&s1);
printf("type another string of 10 to compare\n\n");
scanf("%s",&s2);
flag = mystrcmp(s1,s2);
if(flag==0)
printf("the words are the same\n\n");
else
printf("the words are not the same\n\n");
return 0;
}
int mystrcmp(char s1[], char s2[]){
int l=0;
while (s1[l] == s2[l]) {
if (s1[l] == '\0' || s2[l] == '\0')
break;
l++;
}
if (s1[l] == '\0' && s2[l] == '\0')
return 0;
else
return -1;
}
Use fgets() to read full lines, rather than scanf() to read space-separated words.
Remember that fgets() will include the linefeed in the string, though.
It is not strcmp that wouldn't allow space bar, it's scanf with %s format specifier. The input is truncated at the space, so the second string that you read is actually the continuation of the first string.
You can fix this by using %9[^\n] instead of %s in your format specifier:
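printf("Type a string of 10\n\n");
scanf(" %9[^\n]",s1); //s1 is char [10]; the leading space skips leftover white space
printf("type another string of 10 to compare\n\n");
scanf(" %9[^\n]",s2); //s2 is char [10]; without the space, the newline left by the first read makes this call fail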
9 limits input to nine characters, because you are using a ten-character buffer.
Many answers have told you that scanf("%s",s1) only reads word by word. This is because by default scanf("%s",s1) is delimited by all white spaces, this includes \t, \n, <space>, or any other you can think of.
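What scanf("%[^\n]",s1) does is set the delimiter to \n, so in effect it reads all other white space as part of the string. (The s often written after the closing bracket, as in "%[^\n]s", is not part of the conversion; it is a literal s, and here it can never match because the next character is always a newline or end of input.)
#dasablinklight has also specified a 9 before the [^\n]; this tells scanf() to read at most 9 characters from the input buffer.
IMO scanf() is a really nice function due to its hidden features. I suggest you read more about it in its documentation.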
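The difference between the two conversions is easy to see with sscanf() on a fixed string. This is a small sketch; scan_word and scan_line are names made up for the illustration, and both assume a ten-character buffer as in the question.

```c
#include <stdio.h>

/* Copies the first whitespace-delimited word of text into out
 * (at most nine characters). Returns what sscanf returns. */
int scan_word(const char *text, char out[10])
{
    return sscanf(text, "%9s", out);
}

/* Copies the start of the first line of text into out, spaces
 * included (at most nine characters). Returns what sscanf returns. */
int scan_line(const char *text, char out[10])
{
    return sscanf(text, "%9[^\n]", out);
}
```

Given "abc def\n", scan_word stores "abc" while scan_line stores "abc def". If the text starts with a newline, scan_line returns 0 and stores nothing, which is why leftover newlines matter so much with this format.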
The problem is that if you type abc def on the first line, the first scanf("%s", s1) (no ampersand required — it should be absent) reads abc and the second reads def. And those are not equal. Type very very and you'd find the words are equal. %s stops reading at a space.
Your buffers of size 10 are too small for comfort.
Fix: read lines (e.g. char s1[1024], s2[1024];) with fgets() or POSIX's getline(), remove trailing newlines (s1[strcspn(s1, "\n")] = '\0'; is a reliable way to do it) and then go ahead and compare the lines.
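A minimal sketch of that fix follows. The helper names read_trimmed_line and next_two_lines_equal are assumptions, and strcmp() stands in for the hand-written mystrcmp() from the question.

```c
#include <stdio.h>
#include <string.h>

/* Reads one line from in into buf and removes the trailing newline.
 * Returns 1 on success, 0 at end of input or on a read error. */
int read_trimmed_line(FILE *in, char *buf, size_t size)
{
    if (fgets(buf, (int)size, in) == NULL)
        return 0;
    buf[strcspn(buf, "\n")] = '\0';   /* safe even when no newline is present */
    return 1;
}

/* Returns 1 when the next two lines of in are identical, 0 when they
 * differ, and -1 when fewer than two lines could be read. */
int next_two_lines_equal(FILE *in)
{
    char s1[1024], s2[1024];

    if (!read_trimmed_line(in, s1, sizeof s1) ||
        !read_trimmed_line(in, s2, sizeof s2))
        return -1;
    return strcmp(s1, s2) == 0;
}
```

With this approach, typing "very very" on both lines compares equal, spaces included.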

how to find out if there is a newline or number in c?

I have this assignment where I have to read till the "?" char and then check if it is followed by a number and a newline, or by a newline, then the number, and then another newline.
I checked the first char after the "?"
if (scanf("%c",c)=='\n') ...;
but that only works if the first one is a newline, and when it isn't and I want to read the number instead, it cuts off the first digit ... for example, it doesn't read 133 but only 33
... how do i do this?
I also tried putting the char back, but that wouldn't work
please help :)
One advantage of getline over either fgets (or a distant scanf) is that getline returns the actual number of characters successfully read. This allows a simple check for a newline at the end by using the return to getline. For example:
while ((nchr = getline (&line, &n, stdin)) != -1)
{
    if (nchr > 0 && line[nchr - 1] == '\n') /* check whether the last character is newline */
        line[--nchr] = 0; /* replace the newline with null-termination */
                          /* while decrementing nchr to new length */
    ...
}
Use fgets(3), or better yet, getline(3) to read the entire line, then parse the line using strtol(3) or sscanf(3).
Don't forget to carefully read the documentation of every function you are using. Handle the error cases - perhaps using perror then exit to show a meaningful message. Notice that scanf and sscanf return the number of scanned items, and know about %n, and that strtol can set some end pointer.
Remember that on some OSes (e.g. Linux), the terminal is a tty and is often line-buffered by the kernel; so nothing is sent to your program until you press the return key (you could do raw input on a terminal, but that is OS specific; consider also readline on Linux).
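Once the text is in memory, the strtol() end pointer makes the "number after the question mark" check straightforward. The sketch below assumes the relevant text has already been read into one string; number_after_question is a hypothetical name, and range checking via errno is left out for brevity.

```c
#include <stdlib.h>
#include <string.h>

/* Finds the first '?' in text and converts the integer that follows it.
 * strtol skips leading white space, newlines included, so both
 * "?133\n" and "?\n133\n" are accepted.
 * Returns 1 and stores the value in *out on success, 0 otherwise. */
int number_after_question(const char *text, long *out)
{
    const char *q = strchr(text, '?');
    if (q == NULL)
        return 0;                 /* no question mark at all */

    char *end;
    long value = strtol(q + 1, &end, 10);
    if (end == q + 1)
        return 0;                 /* nothing convertible followed the '?' */

    *out = value;
    return 1;
}
```

Because nothing is consumed from a stream one character at a time, the first digit of 133 cannot get lost the way it did in the question.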
this line: if (scanf("%c",c)=='\n') ...; will NEVER work.
scanf returns a value that indicates the number of successful parameter conversions.
suggest:
// note: 'c' must be defined as int, not char
// for several reasons including:
// 1) getchar returns an int
// 2) EOF is defined as an int, and a char cannot reliably
//    distinguish it from every valid character
if( '\n' == (c = getchar() ) )
{ // then found newline
...
#include <stdio.h>
int main (void){
int num;
scanf("%*[^?]?");//read till the "?"
while(1==scanf("%d", &num)){
printf("%d\n", num);
}
return 0;
}

Changing the scanf() delimiter

My objective is to change the delimiter of scanf to "\n".
I tried using scanf("%[^\n]s",sen); and works fine for single inputs.
But when i put the same line inside a for loop for multiple sentences it gives me garbage values.
Does anyone know why?
Here's my code:
char sen[20];
for (i=0;i<2;i++)
{
scanf("%[^\n]s",sen);
printf("%s\n",sen);
}
Consider this (C99) code:
#include <stdio.h>
int main(void)
{
char buffer[256];
while (scanf("%255[^\n]", buffer) == 1)
printf("Found <<%s>>\n", buffer);
int c;
if ((c = getchar()) != EOF)
printf("Failed on character %d (%c)\n", c, c);
return(0);
}
When I run it and type in a string 'absolutely anything with spaces TABTABtabs galore!', it gives me:
Found <<absolutely anything with spaces tabs galore!>>
Failed on character 10 (
)
ASCII (UTF-8) 10 (decimal) is newline, of course.
Does this help you understand your problem?
It works in this case (for a single line) but if I want to take multiple lines of input into an array of arrays then it fails. And I don't get how scanf returns a value in your code?
There are reasons why many (most?) experienced C programmers avoid scanf() and fscanf() like the plague; they're too hard to get to work correctly. I'd recommend this alternative, using sscanf(), which does not get the same execration that scanf() and fscanf() do.
#include <stdio.h>
int main(void)
{
char line[256];
char sen[256];
while (fgets(line, sizeof(line), stdin) != 0)
{
if (sscanf(line, "%255[^\n]", sen) != 1)
break;
printf("Found <<%s>>\n", sen);
}
int c;
if ((c = getchar()) != EOF)
printf("Failed on character %d (%c)\n", c, c);
return(0);
}
This reads the line of input using fgets(), which ensures no buffer overflow (pretend that the gets() function, if you've heard of it, melts your computer to a pool of metal and silicon), then uses sscanf() to process that line. This deals with newlines, which are the downfall of the original code.
char sen[20];
for (i=0;i<2;i++)
{
scanf("%[^\n]s",sen);
printf("%s\n",sen);
}
Problems:
You do not check whether scanf() succeeded.
You leave the newline in the buffer on the first iteration; the second iteration generates a return value of 0 because the first character to read is newline, which is the character excluded by the scan set.
The gibberish you see is likely the first line of input, repeated. Indeed, if it were not for the bounded loop, it would not wait for you to type anything more; it would spit out the first line over and over again.
Return value from scanf()
The definition of scanf() (from ISO/IEC 9899:1999) is:
§7.19.6.4 The scanf function
Synopsis
#include <stdio.h>
int scanf(const char * restrict format, ...);
Description
2 The scanf function is equivalent to fscanf with the argument stdin interposed
before the arguments to scanf.
Returns
3 The scanf function returns the value of the macro EOF if an input failure occurs before
any conversion. Otherwise, the scanf function returns the number of input items
assigned, which can be fewer than provided for, or even zero, in the event of an early
matching failure.
Note that when the loop in my first program exits, it is because scanf() returned 0, not EOF.
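The three possible outcomes described in the standard's wording can be told apart like this. It is a sketch using sscanf() on a string so the result does not depend on what is typed; the enum and the function name classify_int_scan are invented for the example.

```c
#include <stdio.h>

enum scan_result { SCAN_OK, SCAN_MISMATCH, SCAN_NO_INPUT };

/* Tries to read one int from text and reports which case occurred. */
enum scan_result classify_int_scan(const char *text, int *out)
{
    int rc = sscanf(text, "%d", out);

    if (rc == EOF)
        return SCAN_NO_INPUT;   /* input ran out before any conversion */
    if (rc == 0)
        return SCAN_MISMATCH;   /* early matching failure, nothing assigned */
    return SCAN_OK;             /* one item assigned */
}
```

For sscanf(), reaching the end of the string plays the role of end of file, so an empty or all-blank string gives SCAN_NO_INPUT while "apples" gives SCAN_MISMATCH.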
%[^\n] leaves the newline in the buffer. %[^\n]%*c eats the newline character.
In any case, %[^\n] can read any number of characters and cause buffer overflow or worse.
I use the format string %*[^\n]%*c to gobble the remainder of a line of input from a file. For example, one can read a number and discard the remainder of the line by %d%*[^\n]%*c. This is useful if there is a comment or label following the number, or other data that is not needed.
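One caveat with the combined format: if the number is immediately followed by the newline, %*[^\n] matches nothing, scanf stops there, and %*c never runs, so the newline stays in the stream. A sketch that avoids this by discarding the rest of the line in two separate steps (read_int_skip_rest is a hypothetical name):

```c
#include <stdio.h>

/* Reads one int from in, then discards the rest of that line.
 * Returns 1 if a number was read, 0 if the line did not start with
 * a number, and EOF at end of input. */
int read_int_skip_rest(FILE *in, int *out)
{
    int rc = fscanf(in, "%d", out);
    if (rc == EOF)
        return EOF;

    /* Discard everything up to the newline; this may match nothing. */
    int skipped = fscanf(in, "%*[^\n]");
    (void)skipped;

    /* Consume the newline itself (or hit end of input). */
    (void)getc(in);

    return rc;
}
```

Splitting the skip into a scanset call plus a getc() means the newline is consumed whether or not any other characters followed the number.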
char sen[20];
for (i=0;i<2;i++)
{
scanf("%[^\n]s",sen);
printf("%s\n",sen);
getchar();
}
Hope this helps. The "\n" actually remains in the input stream buffer, so we need to consume it before scanf is invoked again.
I know I am late, but I ran into same problem after testing C after a long time.
The problem here is the new line is considered as input for next iteration.
So, here is my solution, use getchar() to discard the newline the input stream:
char s[10][25];
int i;
for(i = 0; i < 10; i++){
printf("Enter string: ");
scanf("%24s", s[i]); /* width keeps the word inside the 25-byte row */
getchar();
}
Hope it helps :)
While using scanf("%[^\n]", sen) in a loop, the problem that occurs is that the \n stays within the input buffer and is not flushed. As a result next time, when the same input syntax is used, it reads the \n and considers it as a null input. A simple but effective solution to address this problem is to use:
char sen[20];
for (i=0;i<2;i++)
{
scanf("%[^\n]%*c",sen);
printf("%s\n",sen);
}
%*c gets rid of the \n character in the input buffer.

what happens when you input things like 12ab to scanf("%d",&argu)?

I came across this problem when I want to check what I input is number. The scanf function will return 1 if I successfully input a number. So here is what I wrote:
int argu;
while(scanf("%d",&argu)!=1){
printf("Please input a number!\n");
}
But when I input things like abcd to it, the loop would go forever and not stop for prompt.
I looked it up online and found that it had something to do with the cache and I need to clean it up so scanf can get new data. So I tried fflush but it didn't work.
Then I saw this:
int argu,j;
while(scanf("%d",&argu)!=1){
printf("Please input a number!\n");
while((j=getchar())!='\n' && j != EOF);
}
Then when I input things like 'abcd' it worked well and it prompted for my input. But when I input things like '12ab', it wouldn't work again.
So is there a way I can check the input for scanf("%d", &argu) is actually a number and prompt for another input if it isn't?
EDIT:
I saw the answers and solved my problem by using while(*eptr != '\n').
Notice that the fgets function actually reads '\n' into the array and gets doesn't. So be careful.
It's better to read a full line, using fgets(), and then inspecting it, rather than trying to parse "on the fly" from the input stream.
It's easier to ignore non-valid input, that way.
Use fgets() and then just strtol() to convert to a number, it will make it easy to see if there is trailing data after the number.
For instance:
char line[128];
while(fgets(line, sizeof line, stdin) != NULL)
{
char *eptr = NULL;
long v = strtol(line, &eptr, 10);
if(eptr == line || !isspace((unsigned char)*eptr)) /* eptr == line means no digits were converted */
{
printf("Invalid input: %s", line);
continue;
}
/* Put desired processing code here. */
}
But when I input things like abcd to it, the loop would go forever and not stop for prompt.
That's because if scanf encounters a character that does not match the conversion specifier, it leaves it in the input stream. Basically, what's happening is that scanf reads the character a from the input stream, determines that it's not a valid match for the %d conversion specifier, and then pushes it back onto the input stream. The next time through the loop it does the same thing. And again. And again. And again.
fflush is not a good solution, because it isn't defined to work on input streams.
For the input "12ab", scanf will read and convert "12", leaving "ab" in the input stream.
The best solution is to read all your input as text, then convert to numeric types using strtol (for integral values) and strtod (for real values). For example:
char input[SIZE]; // assume SIZE is big enough for whatever input we get
int value;
if (fgets(input, sizeof input, stdin) != NULL)
{
char *chk;
int tmp = (int) strtol(input, &chk, 10);
if (chk != input && (isspace((unsigned char)*chk) || *chk == 0))
value = tmp;
else
printf("%s is not a valid integer string\n", input);
}
chk points to the first character in the input string that strtol could not convert. If no digits were converted (chk still equals input), or if this character is neither whitespace nor the 0 terminator, then the input string wasn't a valid integer. This will detect and reject inputs like "12ab" as well as "abcd".
scanf is a good solution if you know your input is always going to be properly formed and well-behaved. If there's a chance that your input isn't well-behaved, use fgets and convert as needed.
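A slightly fuller sketch of the strtol() check, packaged as a function, is shown here. The name parse_int_line is an assumption, and the range test against INT_MIN and INT_MAX is an addition that the snippets above leave out.

```c
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Converts line to an int. Accepts surrounding white space (including
 * the newline that fgets leaves) but nothing else.
 * Returns 1 and stores the value on success, 0 otherwise. */
int parse_int_line(const char *line, int *out)
{
    char *end;

    errno = 0;
    long value = strtol(line, &end, 10);

    if (end == line)
        return 0;                         /* no digits: "abcd" or "" */
    if (errno == ERANGE || value < INT_MIN || value > INT_MAX)
        return 0;                         /* does not fit in an int */
    while (isspace((unsigned char)*end))
        end++;                            /* allow trailing blanks and newline */
    if (*end != '\0')
        return 0;                         /* trailing junk: "12ab" */

    *out = (int)value;
    return 1;
}
```

A prompt loop would read a line with fgets(), call parse_int_line(), and ask again whenever it returns 0.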
I would suggest reading the input as a string and checking it for non-numeric characters. If the input is valid, convert the string to int with sscanf(str,"%d",&i); otherwise display an error.
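Just call scanf("%*[^\n]"); followed by getchar(); inside the loop to discard the rest of the offending line (the "cache"). Note that a trailing "\n" in the format, as in scanf("%*[^\n]\n"), matches any amount of white space, so at a terminal it keeps waiting until the next non-blank character is typed.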

Resources