Sscanf uninitialized value when using prior argument in address - c

I am trying to use sscanf (on a string that I know is well formed and not malicious) to write a value to a specific part of an array.
#include <stdio.h>
int main(void){
int arr[10], i;
sscanf("5 5", "%d %d", &i, arr + i);
}
When I run this in valgrind, I am told that the sscanf line reads an uninitialized value. However, arr is initialized as a pointer to an automatically allocated array and i is initialized because there is a sequence point associated with every format specifier (this is not the case for general functions which do not have sequence points between arguments). I might be inclined to believe that this is only happening because valgrind does not know about that format specifier sequence point rule, but in my full program the line causes an actual segmentation fault. Why is this happening? Am I mistaken about format specifiers creating sequence points?

The problem is that the sequence point of the function call means that i is definitely accessed for the argument computation of arr + i before it is set by sscanf. So you are accessing it before it is initialized.
You'll need two sscanf calls; something like:
int len, i;
if (sscanf(buf, "%d%n", &i, &len) >= 1 && sscanf(buf+len, "%d", arr + i) >= 1) {
// successful sscanf
}
Note that you should ALWAYS check the return value of the sscanf call to see whether it worked.

Related

What does scanf do when passing a char and an integer specifier?

For the purposes of an exercise, I was given a snippet of code and told to find the bug. I removed a bunch of noise and the part that is tripping me up is the following:
int main() {
char *p;
char n;
scanf("%i", n);
if (n < get_int()) {
p = malloc(n);
}
}
Here, if I enter a number for n, I get a seg fault. If I enter a character, n is set to 0. What is scanf doing that makes this so?
Edit: the exercise I'm trying to figure out is Exercise 2 from this page
It is simply UB.
C does not specify any particular behavior here. "%i" expects an int *, not an uninitialized char converted to an int.
"What is scanf doing that makes this so?" implies defined behavior, but undefined behavior has no specified outcome.
"If I enter a character, n is set to 0. " --> scanf() does not attempt to change n, it uses a copy of n (passed by value).
The usual scanf() usage is like the below, where the address of nn is passed, not nn itself.
int nn;
if (scanf("%i", &nn) == 1) Success();
else Failure();
You aren't just passing the wrong kind of variable to scanf; you are also passing its value instead of a pointer to it. scanf has no way of knowing that this value isn't an actual pointer to store the scanned data into, so that's exactly what it tries to do: scan the input and place the result at whatever memory address n, treated as a pointer value, happens to hold. In the vast majority of cases this attempts to access unmapped or protected memory and causes a segfault or access violation.
Entering a character simply terminates the scan prematurely, which avoids the segfault and leaves n untouched. But since n isn't initialized, its value can be just about anything: whatever junk happened to be on the stack at that point in time.

Why does the strange value come out with an unnecessary % conversion in the code in C language?

I started learning programming only a few days ago, so basically I have no knowledge.
I'm starting with C, and I wrote a very simple code which is:
int main (int argc, const char * argv[])
{
printf("%d + %d", 1 + 3);
return 0;
}
With the code above, I got the output 4 + 1606416608 and later found that it is wrong because I used more %d conversions than necessary. My question is: where did that strange value actually come from? If anyone knows, please help me. Thank you!!
You know what you did wrong already, so to explain what your particular implementation of C probably did:
When you call printf, a new stack frame is pushed to the call stack. The call stack is a last-in, first-out structure with one 'frame' per called function. So if main called logStuff, which called printf, then three consecutive frames would be for main, then logStuff, then printf. When printf returns, its frame is removed from the structure and execution continues with logStuff.
So a frame usually contains at least the parameters passed to the function and storage for local variables. Those things may be one and the same; it's implementation dependent.
With a variadic function like printf there's a stream of unnamed parameters. The bit patterns will be put into an appropriate place in the frame. But C is not a reflective language. Each bit pattern doesn't inherently have a meaning: any one could be an integer, a float, or anything else. It also isn't a language that invests in bounds checking. You're trusted to write code that acts correctly.
printf determines the types and number of unnamed parameters from the string. So if you've given it false information, it will interpret the bit patterns with something other than their correct meaning and it may think there are fewer or more than are really there.
You told it there were more. So what probably happened was that the parameters were in the equivalent of an array and it read a value from beyond the end of the array. As it's all implementation dependent, that value may have been meant to represent anything. It could be the address of the caller. It could be uninitialised storage for another local variable. It could be bookkeeping. It could be the format string, incorrectly interpreted as an integer.
What it isn't is any reliable value. It may not even always be safe to read.
You are in undefined behavior land: you are telling a variadic function that you have two int-sized parameters but supplying only one, so it ends up leaking something from the stack.
1) %d is a conversion specifier; it tells printf how to interpret the corresponding argument (here, as an integer).
2) For every conversion specifier you need to provide a corresponding variable or value; otherwise the behavior is undefined, and in practice you will often see "garbage", i.e. some arbitrary value.
Example:
#include <stdio.h>
int main(void)
{
int a = 65;
printf("\na = %d", a); // here the value stored in a is printed as an integer.
printf("\na = %c", a); // the same value is printed as a character.
return 0;
}
In the above example, '%d' in the first printf statement tells printf to print the value stored in a as an integer (output: 65).
In the second printf statement, '%c' prints the same value as a character (output: A on an ASCII system).
Your code expects two numerical parameters to be printed, and you're giving it one.
Expected:
printf("%d + %d", <some_num>, <another_num>);
You're giving it:
printf("%d + %d", <some_num>);
Where <some_num> is what 1+3 evaluates to. The function expects another argument, but receives garbage instead.
What you should do is
printf("%d + %d = %d", 1, 3, 1+3);

Loop ending condition is not working - C

I have homework on dynamic arrays, so I was trying to understand how they work with simple programs.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int main()
{
int cnt,i=0;
char temp[1001];
char *obj[5];
scanf("%d",cnt);
while(i<cnt){
scanf("%s",temp);
obj[i]=malloc(sizeof(char)*(strlen(temp)+1));
obj[i]=temp;
printf("%s\n",obj[i]);
printf("%d\n",i);
i++;
}
return 0;
}
When I set "cnt" to 5 by reading it from stdin, the program runs forever, even though the ending condition is met. But when I set "cnt" to 5 by assigning it at the very beginning of the program (not by using scanf), the program works just fine.
What might be the reason for this?
This:
scanf("%d",cnt);
should be:
/* Always check return value of scanf(),
which returns the number of assignments made,
to ensure the variables have been assigned a value. */
if (scanf("%d",&cnt) == 1)
{
}
as scanf() requires the address of cnt.
Also:
Don't cast result of malloc().
sizeof(char) is guaranteed to be 1 so can be omitted from the space calculation in malloc().
Check result of malloc() to ensure memory was allocated.
free() whatever was malloc()d.
Prevent buffer overrun with scanf("%s") by specifying the maximum number of characters to read, which must be one less than the target buffer to allow a space for the terminating null character. In your case scanf("%1000s", temp).
There is no protection against out-of-bounds access on the array obj. The while loop's terminating condition is i<cnt, but if cnt > 5 then an out-of-bounds access will occur, causing undefined behaviour.
This assigns the address of temp to obj[i]:
obj[i]=temp;
it does not copy (and causes a memory leak). Use strcpy() instead:
obj[i] = malloc(strlen(temp) +1 );
if (obj[i])
{
strcpy(obj[i], temp);
}
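You should use this:
scanf("%d",&cnt);
BTW:
scanf("%s",temp);
is used in a while loop to read your strings. You can add a space at the beginning of the format, as in " %s", to skip the newline left over from the previous read. %s already skips leading whitespace by itself, so the space is optional here; it matters for conversions such as %c that do not skip whitespace.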
Undefined behavior. You need to pass the address of the variable to scanf():
scanf("%d", &cnt);
But you better not use scanf() anyway. fgets() is simpler and safer to use.

Get number of characters read by sscanf?

I'm parsing a string (a char*) and I'm using sscanf to parse numbers from the string into doubles, like so:
// char* expression;
double value = 0;
sscanf(expression, "%lf", &value);
This works great, but I would then like to continue parsing the string through conventional means. I need to know how many characters have been parsed by sscanf so that I may resume my manual parsing from the new offset.
Obviously, the easiest way would be to somehow calculate the number of characters that sscanf parses, but if there's no simple way to do that, I am open to alternative double parsing options. However, I'm currently using sscanf because it's fast, simple, and readable. Either way, I just need a way to evaluate the double and continue parsing after it.
You can use the format specifier %n and provide an additional int * argument to sscanf():
int pos;
sscanf(expression, "%lf%n", &value, &pos);
Description for format specifier n from the C99 standard:
No input is consumed. The corresponding argument shall be a pointer to signed integer into which is to be written the number of characters read from the input stream so far by this call to the fscanf function. Execution of a %n directive does not increment the assignment count returned at the completion of execution of the fscanf function. No argument is converted, but one is consumed. If the conversion specification includes an assignment suppressing character or a field width, the behavior is undefined.
Always check the return value of sscanf() to ensure that assignments were made, and subsequent code does not mistakenly process variables whose values were unchanged:
/* Number of assignments made is returned,
which in this case must be 1. */
if (1 == sscanf(expression, "%lf%n", &value, &pos))
{
/* Use 'value' and 'pos'. */
}
int i, j, k;
char s[20];
if (sscanf(somevar, "%d %19s %d%n", &i, s, &j, &k) != 3)
...something went wrong...
The variable k contains the character count up to the point where the end of the integer stored in j was scanned.
Note that the %n is not counted in the successful conversions. You can use %n several times in the format string if you need to.

how can I printf in c

How can I make this print properly without using two printf calls?
char* second = "Second%d";
printf("First%d"second,1,2);
The code you showed us is syntactically invalid, but I presume you want to do something that has the same effect as:
printf("First%dSecond%d", 1, 2);
As you know, the first argument to printf is the format string. It doesn't have to be a literal; you can build it any way you like.
Here's an example:
#include <stdio.h>
#include <string.h>
int main(void)
{
char *second = "Second%d";
char format[100];
strcpy(format, "First%d");
strcat(format, second);
printf(format, 1, 2);
putchar('\n');
return 0;
}
Some notes:
I've added a newline after the output. Output text should (almost) always be terminated by a newline.
I've set an arbitrary size of 100 bytes for the format string. More generally, you could declare
char *format;
and initialize it with a call to malloc(), allocating the size you actually need (and checking that malloc() didn't signal failure by returning a null pointer); you'd then want to call free(format); after you're done with it.
As templatetypedef says in a comment, this kind of thing can be potentially dangerous if the format string comes from an uncontrolled source.
(Or you could just call printf twice; it's not that much more expensive than calling it once.)
Make second a macro, so that the preprocessor substitutes a string literal and the two adjacent literals are concatenated at compile time.
#define second "Second%d"
printf("First%d"second,1,2);
Do not do this in a real program.
char *second = "Second %d";
char *first = "First %d";
char largebuffer[256];
strcpy (largebuffer, first);
strcat (largebuffer, second);
printf (largebuffer, 1, 2);
The problem with generated formats such as the one above is that printf(), because it takes a variable-length argument list, has no way of knowing how many arguments were actually provided. It uses the format string, and the types described in it, to pick that number and those types of arguments from the argument list.
If you provide the correct number of arguments like in the example above in which there are two %d formats and there are two integers provided to be printed in those places, everything is fine. However what if you do something like the following:
char *second = "Second %s";
char *first = "First %d";
char largebuffer[256];
strcpy (largebuffer, first);
strcat (largebuffer, second);
printf (largebuffer, 1);
In this example the printf() function is expecting the format string as well as a variable number of arguments. The format string says that there will be two additional arguments, an integer and a zero terminated character string. However only one additional argument is provided so the printf() function will just use what ever is next on the stack as being a pointer to a zero terminated character string.
If you are lucky, the data that the printf() function interprets as a pointer will be a valid memory address for your application, and the memory area pointed to will be a couple of characters terminated by a zero. If you are less lucky, the pointer will be zero or garbage and you will get an access violation right then, and it will be easy to find the cause of the application crash. If you have no luck at all, the pointer will be good enough that it points to a valid address holding about 2K of characters, and the result is that printf() will totally mess up your stack and go into the weeds, and the resulting crash data will be pretty useless.
char *second = "Second%d";
char tmp[256];
memset(tmp, 0, 256);
sprintf(tmp, second, 2);
printf("First%d%s", 1,tmp);
Or something like that
I'm assuming you want the output:
First 1 Second 2
To do this we need to understand printf's functionality a little better. The real reason that printf is so useful is that it not only prints strings, but also formats variables for you. Depending on how you want your variable formatted, you need to use different conversion specifiers. %d tells printf to format the variable as a signed integer, which you already know. However, there are other formats, such as %f for floats and doubles, %ld for long integers, and %s for strings, or char*.
Using the %s formatting character to print your char* variable, second, our code looks like this:
char* second = "Second";
printf ( " First %d %s %d ", 1, second, 2 );
This tells printf that you want the first variable formatted as an integer, the second as a string, and the third as another integer.
