How to get string length using scanf function - c

How to get the string length without using strlen function or counters, like:
scanf("???",&len);
printf("%d",len);
Input: abcde
Expected output: 5

You can use assignment-suppression (character *) and %n which will store the number of characters consumed into an int value:
int count;
scanf( "%*s%n", &count );
printf( "string length: %d\n", count );
Explanation:
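%*s will parse a string (skipping any leading whitespace, then reading up to the next whitespace character) but will not store it because of the *.
Then %n will store the number of characters consumed so far into count. This equals the length of the parsed string as long as the input has no leading whitespace, because whitespace skipped by %*s is counted too.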
Please note that %n is not necessarily counted for the return value of scanf():
The C standard says: "Execution of a %n directive does not increment the assignment count returned at the completion of execution" but the Corrigendum seems to contradict this. Probably it is wise not to make any assumptions on the effect of %n conversions on the return value.
quoted from the man page where you will find everything else about scanf() too.

Use the %n format specifier to get the number of characters consumed so far and write it to len of type int:
char buf[50];
int len;
if ( scanf("%49s%n", buf, &len) != 1 )
{
// error routine.
}
printf("%d", len);

You can do this with the n specifier:
%n stores the number of characters read so far.
char str[20];
int len;
scanf("%19s%n", str, &len); // pass str, not &str, and limit the width to fit the buffer

Explanation: Here [] defines a scanset, and ^\n makes it match every character except a newline, so the input is read, spaces included, up to the end of the line. To get the length of the whole string, we can use %n. Nothing is expected from the input for %n; instead, the number of characters consumed so far is stored through the next pointer, which must be a pointer to int. This is not a conversion. The man page notes that the assignment can be suppressed with the * flag, but the effect on the return value is then undefined, so %*n should not be used.
This technique avoids a separate strlen() call or counting loop after the read, although scanf() itself still examines every character of the input.
scanf("%[^\n]%n",str,&len);
printf("%s %i",str,len);

Related

Why does sscanf read more than expected?

sscanf supports %n to count how many bytes are read.
Why does sscanf sometimes read additional bytes?
#include <stdio.h>
int main()
{
char *data = "X\n \n\x09\n \x10\n";
int len = 0;
sscanf(data, "X%n", &len);
printf("%i\n", len);
sscanf(data, "X\n%n", &len);
printf("%i\n", len);
return 0;
}
This program prints:
1
7
I would expect:
1
2
(1 for X and 2 for X\n.)
Why does it read more bytes than expected?
From cppreference:
The format string consists of
non-whitespace multibyte characters except %: each such character in the format string consumes exactly one identical character from the
input stream, or causes the function to fail if the next character on
the stream does not compare equal.
whitespace characters: any single whitespace character in the format string consumes all available consecutive whitespace characters from
the input (determined as if by calling isspace in a loop). Note that
there is no difference between "\n", " ", "\t\t", or other whitespace
in the format string.
Thus, your \n in the second format string will cause the function to consume all remaining whitespace characters – which is actually all 6 characters following the X and preceding the 0x10.

If I use fscanf to read a file, can I use it to store the number of characters it reads?

I'm trying to use fscanf to read in a bunch of words (all on separate lines) from a file and I want to find the length of each word. Does fscanf allow me to do that or if not, is there any other way to do so? I've been looking for an explanation on "input items successfully matched and assigned", but it's still a bit confusing.
To find the text length of a scanned word (or int, double, ...), precede and follow the specifier with a pair of "%n". "%n" saves the scan offset and does not contribute to the scanf() return value.
int n1;
int n2;
char buff[100];
// v-------- space to consume leading white-space
while (scanf(" %n%99s%n", &n1, buff, &n2) == 1) {
printf("word <%s> length: %d\n", buff, n2 - n1);
}
The above works equally well had the item been a double with "%lf", an int with "%d", ....
For "%s", the length of the string can be found with strlen() after the scan.
char buff[100];
while (scanf("%99s", buff) == 1) {
printf("word <%s> length: %zu\n", buff, strlen(buff));
}
Advanced
Pedantically, input may include embedded null characters. Null characters are not typically found in text files unless by accident (reading a UTF-16 as UTF-8) or by nefarious users. With "%s", reading a null character is treated like any other non-white-space. Thus the " %n%99s%n" approach can result in a length more than strlen().
I've been looking for an explanation on "input items successfully matched and assigned", but it's still a bit confusing.
The return value of scanf() is roughly the count of matched specifiers, not the length of a word. The return value may be less than the number of specifiers if scanning stopped early, or EOF if end-of-file was encountered before any conversion. "%n" does not contribute to this count. Specifiers with a "*" do not contribute either.

Scanning with %c or %s

I had to do a program for college in which I should separate, between a certain amount of people, those who liked and the ones who disliked something.
so I did this:
char like[100];
printf("Like? Y or N \n");
scanf ("%c", like);
The program compiled, but didn't work the way it should. The user was not able to write "y or n" when asked "Like?"
So I tried this:
char like[100];
printf("Like? Y or N \n");
scanf ("%s", like);
And it worked. But I don't know why it worked. Can somebody please explain to me the difference between %c and %s in scanf?
First, please do some basic research before coming here - questions like this can usually be answered with a quick Google search or checking your handy C reference manual.
char inputChar; // stores a single character
char inputString[100] = {0}; // stores a string up to 99 characters long
scanf( " %c", &inputChar ); // read the next non-whitespace character into inputChar
// ^ Note & operator in expression
scanf( "%s", inputString ); // read the next *sequence* of non-whitespace characters into inputString
// ^ Note no & operator in expression
You would use %c when you want to read a single character from the input stream and store it to a char object. The %c conversion specifier will not skip over any leading whitespace, so if you want to read the next non-whitespace character, you need a blank before the %c specifier in your format string, as shown above.
You would use %s when you want to read a sequence of non-whitespace characters from the input stream and store them to an array of char. Your target array must be large enough to store the input string plus a terminating 0-valued character. The %s conversion specifier skips over any leading whitespace and stops reading at the first whitespace character following the non-whitespace characters.
Both %c and %s expect their corresponding argument to have type char * (pointer to char); however, in the first case, it's assumed that the pointer points to a single object, whereas in the second case, it's assumed that the pointer points to the first element of an array. For inputChar, we must use the unary & operator to obtain the pointer value. For inputString, we don't, because under most circumstances an expression of type "array of T" will be converted ("decay") to an expression of type "pointer to T", and the value of the expression will be the address of the first element of the array.
Your code works fine as it is, but it's a bit confusing to read a single character and store it to an array.
Using %s without an explicit field width is risky; if someone types in 100 or more non-whitespace characters, scanf will happily store the extra characters (and the terminating 0) to memory following inputString, potentially clobbering something important. It's generally safer to write something like
scanf( "%99s", inputString ); // reads no more than 99 characters into inputString
or to use fgets() to read input strings instead:
fgets( inputString, sizeof inputString, stdin );
Please check §7.21.6.2 of the online draft of the C language standard for a complete description of all of the conversion specifiers for the *scanf functions.
%s is used for a string of characters: it skips leading whitespace and reads subsequent characters until it finds whitespace (blank, newline or tab). %c is used for a single character and reads the next character, whitespace included. If more than one character is available, the remaining ones stay in the input and are read by the next call.

Taking formatted input : sscanf not ignoring white spaces

I have to find out the input hours and minutes after taking inputs from the user of the form :
( Number1 : Number2 )
eg: ( 12 : 21 )
I should report 12 hours and 21 minutes and then again wait for input. If there is a mismatch in the given format, I should report it as invalid input. I wrote this code :
#include<stdio.h>
int main()
{
int hourInput=0,minutesInput=0;
char *buffer = NULL;
size_t size;
do
{
puts("\nEnter current time : ");
getline ( &buffer, &size, stdin );
if ( 2 == sscanf( buffer, "%d:%d", &hourInput, &minutesInput ) && hourInput >= 0 && hourInput <= 24 && minutesInput >=0 && minutesInput <= 60 )
{
printf("Time is : %d Hours %d Minutes", hourInput, minutesInput );
}
else
{
puts("\nInvalid Input");
}
}
while ( buffer!=NULL && buffer[0] != '\n' );
return 0;
}
Q. If someone puts spaces between the number and the :, my program treats it as invalid input, while it should be treated as valid.
Can someone explain why this happens and how to fix it? As far as I understand, sscanf should ignore all white space?
To allow optional spaces before ':', replace
"%d:%d"
with
"%d :%d"
sscanf() ignores white space where its format directives tell it to ignore, not everywhere. A whitespace character in the directive such as ' ' will ignore all white spaces. %d as well as other integer and floating point directives will ignore leading white space. Thus a space before %d is redundant.
C11 7.21.6.2 paragraph 8: "Input white-space characters (as specified by the isspace function) are skipped, unless the specification includes a [, c, or n specifier."
Additional considerations include using %u and unsigned as an alternate way to not accept negative numbers. strptime() is a common function used for scanning strings for time info.
I think if you put a space before and after the colon, it will ignore any whitespace and still work when they don't put spaces before and after the colon.
Like this:
sscanf( buffer, "%d : %d", &hourInput, &minutesInput )
Use "%d : %d" as the format string. It will work with and without spaces.
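First, note how buffer is managed: getline() allocates the buffer itself when buffer is NULL, but size is best initialized to 0 before the first call, and the buffer should be released with free() when you are done.
Second, is this a C++ program or C? getline() is a POSIX function, not part of the C standard library.
Check this: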
#include <stdio.h>
int main(void)
{
int x=0,y=0;
char bff[]="7 8";
sscanf(bff,"%d%d",&x,&y); // the second %d skips the blank before 8
printf("%d %d",x,y);
return 0;
}
Output: 7 8
From the sscanf man page:
RETURN VALUE OF SSCANF -
These functions return the number of input items assigned. This can be fewer than provided for, or even zero, in the event of a matching failure. Zero indicates that, although there was input available, no conversions were assigned; typically this is due to an invalid input character, such as an alphabetic character for a `%d' conversion. The value EOF is returned if an input failure occurs before any conversion such as an end-of-file occurs. If an error or end-of-file occurs after conversion has begun, the number of conversions which were successfully completed is returned.
Now,
if someone gives spaces between the number and :, my program considers
it as invalid input
Yes, it is treated as invalid, because sscanf() reads from the buffer exactly as the format string %d:%d dictates. If a character in the input conflicts with the format string, the function stops with a matching failure.
Characters outside of conversion specifications are expected to match the sequence of characters in the input stream; the matched characters in the input stream are scanned but not stored.
That is, sscanf() skips whitespace only while reading a value for a conversion such as %d, not before a literal character such as ':'.
Be careful when comparing the sscanf() return value: in your case it depends on the user input, and it changes if the user puts spaces between the inputs.

using scanf to read a string and an int separated by /

The input consists a string and an integer, which are separated by a '/', like this:
hello/17
And I want to read the input into a string and an int, like this:
char str[20];
int num;
scanf("%s/%d", str, &num); // this how I tried to do it.
I can't seem to make it, any advice?
scanf awaits a whitespace terminated string when it tries to read %s.
Try to specify the forbidden character set directly:
scanf("%19[^/]/%d", str, &num);
You can read more about the formatting codes here
You only need to run the following program:
#include <stdio.h>
int main (void) {
char str[20] = {'\0'};
int count, num = 42;
count = sscanf ("hello/17", "%s/%d", str, &num);
printf ("String was '%s'\n", str);
printf ("Number was %d\n", num);
printf ("Count was %d\n", count);
return 0;
}
to see why this is happening. The output is:
String was 'hello/17'
Number was 42
Count was 1
The reason has to do with the %s format specifier. From C99 7.19.6.2 The fscanf function (largely unchanged in C11, and the italics are mine):
s: matches a sequence of non-white-space characters.
Since / is not white space, it gets included in the string bit, as does the 17 for the same reason. That's also indicated by the fact that sscanf returns 1, meaning that only one item was scanned.
What you'll then be looking for is something that scans any characters other than / into the string (including white space). The same section of the standard helps out there as well:
[: matches a nonempty sequence of characters from a set of expected characters (the scanset). The conversion specifier includes all subsequent characters in the format string, up to and including the matching right bracket (]). The characters between the brackets (the scanlist) compose the scanset, unless the character after the left bracket is a circumflex (^), in which case the scanset contains all characters that do not appear in the scanlist between the circumflex and the right bracket.
In other words, something like:
#include <stdio.h>
int main (void) {
char str[20] = {'\0'};
int count, num = 42;
count = sscanf ("hello/17", "%19[^/]/%d", str, &num); /* width 19 leaves room for the terminator in str[20] */
printf ("String was '%s'\n", str);
printf ("Number was %d\n", num);
printf ("Count was %d\n", count);
return 0;
}
which gives you:
String was 'hello'
Number was 17
Count was 2
One other piece of advice: never ever use scanf with an unbounded %s or %[; you're asking for a buffer overflow attack. If you want a robust user input function, see this answer.
Once you have it in as a string, you can sscanf it to your heart's content without worrying about buffer overflow (since you've limited the size on input).
Could be like that:
char str[20];
int num;
scanf("%19[^/]%*c%d", str, &num);
%*c reads one character and discards it
