What does the 2nd argument in strtoul() function do? - c

According to this document,
The second argument (char **endptr) seems to be a waste of space! If
it is set to NULL, STRTOL seems to work its way down the string until
it finds an invalid character and then stops. All valid chars read are
then converted if the string starts with an invalid character the
function returns ZERO (0).
It means that the following code should detect 2 as the hex number:
#include <stdio.h>
#include <stdlib.h>
int main()
{
char * string = "p1pp2ppp";
unsigned integer = strtoul(string, NULL, 16);
printf("%u", integer);
return 0;
}
but, it is returning zero.
Why?

The man page says the following about the second argument:
If endptr is not NULL, strtol() stores the address of the first
invalid character in *endptr. If there were no digits at all,
strtol() stores the original value of nptr in *endptr (and
returns 0). In particular, if *nptr is not '\0' but **endptr is
'\0' on return, the entire string is valid.
For example:
char str[] = "123xyz45";
char *p;
long x = strtol(str, &p, 10);
printf("x=%ld\n", x);
printf("p - str = %td\n", p - str);
printf("*p = %c\n", *p);
printf("p (as string) = %s\n", p);
Output:
x=123
p - str = 3
*p = x
p (as string) = xyz45
We can see that when strtol returns, p points to the first character in str that cannot be converted. This can be used to parse through the string a bit at a time, or to see whether the entire string can be converted or whether there are some extra characters.
In the case of your example, the first character in string, namely "p", is not a base 16 digit, so nothing gets converted and the function returns 0.

Why?
It's returning 0 because "p..." does not follow any rules about integer representation. The 2nd argument is not relevant for your question.

The char **endptr argument in all the strto* functions is intended to receive the address of the first character that isn’t part of a valid integer (decimal, hex, or octal) or floating point number. Far from useless, it’s handy for checking invalid input. For example, if I meant to type in 1234 but fat-fingered something like 12w4, strtoul will return 12 and set the endptr argument to point to w.
Basically, if the character endptr points to isn’t whitespace or 0, then the input should most likely be rejected.

Related

Why printf works as intended here?

Why does this code actually print out "HI!"? char *s is the address of the first character of a string, so in the next line of code, when we pass the variable s to printf, printf should receive the address of that character, which obviously can't be represented as a string with %s. But it does. Why?
#include <stdio.h>
int main(void)
{
char *s = "HI!";
printf("%s\n", s);
}
If you want to print the address, you have to write printf("%p\n", s); instead of printf("%s\n", s);
7.21.6.1 The fprintf function...
8 The conversion specifiers and their meanings are:
...
s If no l length modifier is present, the argument shall be a pointer to the initial element of an array of character type.280) Characters from the array are written up to (but not including) the terminating null character. If the precision is specified, no more than that many bytes are written. If the precision is not specified or is greater than the size of the array, the array shall contain a null character.
...
280) No special provisions are made for multibyte characters.
C 2011 Online Draft
The %s conversion specifier expects a pointer to the first character of a string - it will print that character and all characters following it until it sees the 0 terminator.
When you pass an array expression as an argument:
char s[] = "Hi!";
printf( "%s\n", s );
that array expression "decays" to a pointer to the first element.

Printf output of pointer string explanation from an interview

I had an interview and I was given this code and asked what is the output for each one of these printf statements.
I have my answers as comments, but I am not sure about the rest.
Can anyone explain the different outputs for statements 1, 3 and 7 and why?
Thank you!
#include <stdio.h>
int main(int argc, const char * argv[]) {
char *s = "12345";
printf("%d\n", s); // 1.Outputs "3999" is this the address of the first pointer?
printf("%d\n", *s); // 2.The decimal value of the first character
printf("%c\n", s); // 3.Outputs "\237" What is this value?
printf("%c\n", *s); // 4.Outputs "1"
printf("%c\n", *(s+1)); // 5.Outputs "2"
printf("%s\n", s); // 6.Outputs "12345"
printf("%s\n", *s); // 7.I get an error, why?
return 0;
}
This call
printf("%d\n", s);
has undefined behavior because an invalid format specifier is used with a pointer.
This call
printf("%d\n", *s);
outputs the internal code (for example ASCII code) of the character '1'.
This call
printf("%c\n", s);
has undefined behavior due to using an invalid format specifier with a pointer.
These calls
printf("%c\n", *s);
printf("%c\n", *(s+1));
are valid. The first one outputs the character '1' and the second one outputs the character '2'.
This call
printf("%s\n", s);
is correct and outputs the string "12345".
This call
printf("%s\n", *s);
is invalid because an invalid format specifier is used with an object of the type char.
1. This code is undefined behaviour (UB). You are passing a pointer where the function requires an int value. For example, on a 64-bit architecture a pointer is 64 bits and an int is 32 bits, so you may be printing a truncated value.
2. You are passing the first char value (automatically converted to an int by the compiler) and printing it in decimal. You probably got 49 (the ASCII code for '1'). This is legal, but be careful about surprises, as you can get negative values if your platform's char type is signed.
3. You are printing the passed pointer reinterpreted as a char value. This is undefined behaviour, because a pointer is passed where %c expects an int.
4. You are printing the value pointed to by s as a char, so you get the first character of the string "12345" ('1').
5. You are printing the char after the first one pointed to by s, so you get the second character of the string ('2').
6. You are printing the string pointed to by s, so you get the whole string. This is legal and, indeed, the common way to print a string.
7. You are passing the first character of the string to be interpreted as a pointer to a null-terminated string (which it isn't). This is undefined behaviour again: you are reinterpreting a char value as a pointer to a null-terminated string. A SIGSEGV is common in this case (but not guaranteed :) ). The signal is sent when the program tries to access unallocated memory before reaching the supposed null character that terminates the string (but it could find a '\0' on the way and just print rubbish).
The 7th line fails because a C-style string is expected as input, and you are passing a character instead.
Take a look at:
What does %s and %d mean in printf in the C language
C style strings guide
I used the following online C compiler in order to run your code,
and here are the results:
1. 4195988 - undefined behaviour (UB), manifesting here as the address
of the char array as you stated (for a 64 bit address you might or
might not get truncation)
2. 49 - ASCII value of '1'
3. � - undefined behaviour, manifesting here as unsupported ASCII value
for a truncation of the address of the array of chars
(placing 32-bit address into a char - assuming a 32-bit system)
4. 1 - obvious
5. 2 - obvious
6. 12345 - obvious
7. Segmentation fault - undefined behaviour, trying to place the first char
of a char array into a string reserved position
(placing char into a string)
Note on point number 3: we can deduce what took place during run-time.
In the specific example provided in the question -
printf("%c\n", s); // 3.Outputs "\237". What is this value?
This is a hardware/compiler/OS related behavior when handling the UB.
Why? Due to the output "\237" -> this implies truncation under the specific hardware system executing this code!
Please see the explanation below (assumption - 32-bit system):
char *s = "12345"; // Declaring a char pointer pointing to a char array
char c = s; // Placement of the pointer into a char - our UB
printf("Pointer to character array: %08x\n", s); // Get the raw bytes
printf("Pointer to character: %08x\n", c); // Get the raw bytes
printf("%c\n", s); // place the pointer as a character
// display is dependent on the ASCII value and the OS
// definitions for 128-255 ASCII values
The outputs:
Pointer to character array: 004006e4 // Classic 32-bit pointer
Pointer to character: ffffffe4 // Truncation to a signed char
// (Note signed MSB padding to 32 bit display)
� // ASCII value E4 = 228 is not displayed properly
The final printf command is equivalent to char c = s; printf("%c\n", c);.
Why? Thanks to truncation.
An additional example with a legitimate ASCII character output:
char *fixedPointer = 0xABCD61; // Declaring a char pointer pointing to a dummy address
char c = fixedPointer; // Placement of the pointer into a char - our UB
printf("Pointer to 32-bit address: %08x\n", fixedPointer); // Get the raw bytes
printf("Pointer to character: %08x\n", c); // Get the raw bytes
printf("%c\n", fixedPointer);
And the actual outputs:
Pointer to 32-bit address: 00abcd61
Pointer to character: 00000061
a

atoi ignores a letter in the string to convert

I'm using atoi to convert a string integer value into integer.
But first I wanted to test different cases of the function so I have used the following code
#include <stdio.h>
int main(void)
{
char *a ="01e";
char *b = "0e1";
char *c= "e01";
int e=0,f=0,g=0;
e=atoi(a);
f=atoi(b);
g=atoi(c);
printf("e= %d f= %d g=%d ",e,f,g);
return 0;
}
this code returns e= 1 f= 0 g=0
I don't get why it returns 1 for "01e"
That's because atoi is an unsafe and obsolete function for parsing integers.
It parses and stops when a non-digit is encountered, even if the text as a whole is not a number.
If the first character encountered is not a space or a digit (or a plus/minus sign), it just returns 0.
Good luck figuring out whether user input is valid with it. (At least the scanf-type functions return 0 or 1 depending on whether the string could be parsed as an integer at all, although they behave the same way for strings that merely start with an integer.)
It's safer to use functions such as strtol, which report where parsing stopped, so you can check that the whole string is a number and, if not, tell which character is invalid.
Example of usage:
const char *string_as_number = "01e";
char *temp;
long value = strtol(string_as_number,&temp,10); // using base 10
if (temp != string_as_number && *temp == '\0')
{
// okay, string is not empty (or not only spaces) & properly parsed till the end as an integer number: we can trust "value"
}
else
{
printf("Cannot parse string: junk chars found at %s\n",temp);
}
You are missing an opportunity: Write your own atoi. Call it Input2Integer or something other than atoi.
int Input2Integer(const char *str)
Note that you have a pointer to a string, and you need to decide where to start, how to calculate the result, and when to stop.
First: Set the return value to zero.
Second: Loop over the string until you reach the terminating null character '\0'.
Third: return when the input character is not a valid digit.
Fourth: modify the return value based on the valid input character.
Then come back and explain why atoi works the way it does. You will learn. We will smile.

C: strtof returns 0.000, what's wrong?

I'm having some trouble with this part of my C code. Everything should work well except for the function "strtof", which returns 0.000 instead of a float number.
What the code should do:
read a line, e.g. "a 12"
if the first character is "a", then, using strtof, it should set the pointer to the next white space and save the value between the two white spaces to x... (probably wrong)
(All libraries are included and MAX_LINE is defined.)
Thank you for any answer :).
int run(void) {
char line[MAX_LINE];
fgets(line, sizeof(line), stdin);
char * ptr;
ptr = strtok (line," ");
if (strcmp(ptr, "a") == 0){
{
float x;
x = strtof(line, &ptr); /*HERE*/
printf("%f", x);
}
}
return 0;
}
You don't read the floating value after the "a". I think you need to do this:
ptr = strtok(NULL," ");
x = strtof(ptr, NULL);
The next call to strtok will return the "12", and strtof will convert it to a float and store it in the x variable.
You already pointed it out:
What the code should do: read a line, e.g. "a 12"; if the first character is "a", then, using strtof, it should set the pointer to the next white space and save the value between the two white spaces to x... (probably wrong)
From the glibc manual (strtod and strtof are equivalent):
If the string is empty, contains only whitespace, or does not contain an initial substring that has the expected syntax for a floating-point number, no conversion is performed. In this case, strtod returns a value of zero and the value returned in *tailptr is the value of string.
According to the signature float strtof (const char* str, char** endptr);
if your first char is 'a', then you should call:
x = strtof(line+2,&ptr);
If line is used with no offset, strtof tries to parse "a 12" from the start and stops at 'a' without converting anything. You could have checked the value of ptr to see where the parsing stopped.

What does the n stand for in `sscanf(s, "%d %n", &i, &n)`?

The man page states that the signature of sscanf is
sscanf(const char *restrict s, const char *restrict format, ...);
I have seen an answer on SO in which sscanf is used like this to check whether an input is an integer.
bool is_int(char const* s) {
int n;
int i;
return sscanf(s, "%d %n", &i, &n) == 1 && !s[n];
}
Looking at !s[n] it seems to suggest that we check if sscanf scanned the character sequence until the termination character \0. So I assume n stands for the index where sscanf will be in the string s when the function ends.
But what about the variable i? What does it mean?
Edit:
To be more explicit: I see that sscanf takes a pointer of type char * as its first parameter, a format string as its second parameter so that it knows how to parse the character sequence, and then as many variables as there are conversion specifiers. I understand now that i is for holding the parsed integer.
Since there is only one format specifier, I tried to deduce the function of n.
Is my assumption above for n correct?
Looks like the op has his answer already, but since I bothered to look this up for myself and run the code...
From "C The Pocket Reference" (2nd Ed by Herbert Schildt) scanf() section:
%n Receives an integer of value equal to the number of characters read so far
and for the return value:
The scanf() function returns a number equal to the number of fields
that were successfully assigned values
The sscanf() function works the same way; it just takes its input from the supplied buffer argument (s in this case). The "== 1" test makes sure that exactly one integer was parsed, and the !s[n] makes sure that the input buffer ends right after the parsed integer, i.e. that there's really only one integer in the string.
Running this code, an s value like "32" gives a "true" value (we don't have bool defined as a type on our system), but s as "3 2" gives a "false" value, because s[n] in that case is "2" and n has the value 2 ("3 " is consumed while parsing the int in that case). If s is " 3 " this function will still return true, as all that white space is ignored and n has the value 3.
Another example input, "3m", gives a "false" value as you'd expect.
Verbatim from sscanf()'s man page:
Conversions
[...]
n
Nothing is expected; instead, the number of characters
consumed thus far from the input is stored through the next pointer,
which must be a pointer to int. This is not a
conversion, although it can be suppressed with the * assignment-suppression character. The C
standard says: "Execution of
a %n directive does not increment the assignment count returned at the completion of
execution" but the Corrigendum seems to contradict this. Probably it is wise not
to make any assumptions on the effect of %n conversions on the return value.
I would like to point out a weak spot in the original code:
bool is_int(char const* s) {
int n;
int i;
return sscanf(s, "%d %n", &i, &n) == 1 && !s[n];
}
I will explain why, and I will interpret the sscanf format string.
First, the weak spot:
n is uninitialized, and sscanf only writes to it if it gets as far as the %n directive. If %d fails to match, sscanf returns 0 (or EOF for an empty string) and never touches n. The code is safe only because of the order of the && expression: in that case the part
sscanf(s, "%d %n", &i, &n) == 1
is false, so s[n] is never evaluated. Given input "1", which is the integer one, sscanf stores 1 into i and returns 1, meaning 1 field assigned. The space in the format matches zero or more white-space characters, so it succeeds even though nothing follows the digit, and %n then stores 1 into n. s[1] is the terminating '\0', so the function returns true. Initializing n (for example int n = 0;) keeps the function robust if the condition is ever reordered.
Interpreting the format:
"%d %n"
%d skips leading white space and scans an optionally signed decimal integer. It does not accept a decimal point or scientific notation. The space then consumes any white space that follows (a space, \n, \t, and certain other non-printable characters), possibly none. Finally, %n sets n to the number of characters scanned to that point, including that white space.
If you only need to know whether the string starts with an integer, this simpler variant may be enough:
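#include <stdbool.h>
#include <stdio.h>
static bool is_int(char const* s)
{
int i;
int fld;
/* sscanf returns the number of fields assigned: 1 if an integer was read */
return (fld = sscanf(s, "%i", &i)) == 1;
}
int main(int argc, char * argv[])
{
bool ans = false;
ans = is_int("1"); /* true */
ans = is_int("m"); /* false */
return 0;
}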
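This code relies on the return value of sscanf: if s starts with an integer, sscanf will scan it and fld will be exactly one. Otherwise fld will be zero or -1 (EOF): zero if something else is there, like a word, and -1 if there is nothing but an empty string. Note that it does not look at what follows the integer, so a string such as "1m" also passes, and that %i accepts hexadecimal and octal prefixes as well as decimal.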
The variable i is where the integer value that was read gets stored.
What are you trying to ask, though? It's not too clear! The code will (try to) read an integer from the string into 'i'.
