Why doesn't sizeof work as expected? - c

#include <stdio.h>
int main(void)
{
printf("%d", sizeof (getchar()) );
}
What I expect is,
1. Type input.
2. Read input and return input value.
3. Evaluate sizeof value.
4. Print the sizeof value.
But the first step never happens.
Why doesn't the first step happen?

The sizeof operator does not evaluate its operand unless its type is a variable length array type: It looks at the type and returns the size. This is perfectly safe:
char *ptr = NULL; // NULL pointer!
printf("%zu", sizeof *ptr);
It will return 1, since it does not have to evaluate the expression to know the answer.

What I expect is, 1. Type input. 2. Read input and return input value. 3. Evaluate sizeof value 4. Print the sizeof value.
But the first step never happens. Why doesn't the first step happen?
Because, with a very few exceptions, the sizeof operator does not evaluate its operand. Your usage is not one of the exceptions. Not evaluating getchar() means getchar() is not called.
In any event, I'm not sure what you expect from your code. Even if getchar() were called, the result always has the same type (int), which does not depend on the input.
Do also pay attention to @P.P.'s comments. Your printf() format does not match the type of the data being printed, size_t. As he observes, the printf() call has undefined behavior as a result.

In C, the sizeof operator evaluates its operand at run time only when the operand has a variable length array (VLA) type. In all other cases the operator does not evaluate its operand: it deduces the type of the expression and returns the size of an object of that type.

Because the return type of getchar() is int, not char, and sizeof(int) is 4 on your platform.
Also, you should use %zu to print size_t values. Using an incorrect format specifier is undefined behaviour.

Related

Explanation of output of program

Can anyone explain why this program prints 4 1 instead of 4 2?
Shouldn't pre increment operator which has higher precedence get executed first and print 4 2?
#include <stdio.h>
int main() {
int a=1;
printf ("%ld %d",sizeof(++a),a);
return 0;
}
Although you've already gotten several answers, I want to provide one more, because your question actually contained three separate misunderstandings, and I want to touch on all of them.
First of all, sizeof is a special operator which, by definition, does not evaluate its argument (that is, whatever subexpression it's taking the size of). So sizeof(++a) does not increment a. And sizeof(x = 5) would not assign 5 to x. And sizeof(printf("Hello!")) would not print "Hello".
Second, if we got rid of the sizeof, and simply wrote
printf("%d %d", ++a, a);
we would not be able to use precedence to figure out the behavior. Precedence is an important concept, but in general it does not help you figure out the behavior of confusing operations involving ++.
Finally, the perhaps surprising answer is that if you write
printf("%d %d", ++a, a);
it is not possible to figure out what it will do at all. It's basically undefined. (Specifically: in any function call like printf("%d %d", x, y) it's unspecified which order the arguments get evaluated in, so you don't know whether x or y gets evaluated first -- although there is an order. But then, when one of them is a and one of them is ++a, you have a situation where a is both being modified and having its value used, so there's no way to know whether the old or the new value gets used, and this makes the expression undefined. See this question for more on this issue.)
P.S. One more issue I forgot to mention, as noted by @Vlad from Moscow: %ld is not a reliable way to print the result of sizeof, which is a value of type size_t. You should use %zu if you can, or %u after casting to (unsigned) if you can't.
From the C Standard (6.5.3.4 The sizeof and alignof operators)
2 The sizeof operator yields the size (in bytes) of its operand, which
may be an expression or the parenthesized name of a type. The size is
determined from the type of the operand. The result is an integer. If
the type of the operand is a variable length array type, the operand
is evaluated; otherwise, the operand is not evaluated and the result
is an integer constant.
So the task of the sizeof operator is to determine the type of the expression used as an operand and then knowing the type of its operand to return the size of an object of the type. If the operand is not a variable length array then the expression used as an operand is not evaluated and the value returned by the sizeof operator is calculated at compile-time.
Thus this call
printf ("%ld %d",sizeof(++a),a);
is equivalent to the call
printf ("%ld %d",sizeof( int ),a);
and in your system sizeof( int ) is equal to 4.
So it does not matter which expression is used as the operand (unless its type is a variable length array type, whose size is calculated at run time). It is the type of the expression that is important. For example, you could even write
printf ( "%zu %d\n", sizeof( ( ++a, ++a, ++a, ++a, ++a ) ), a );
and get the same result.
Note that you should use the conversion specifier %zu, which is meant for values of type size_t, instead of %ld, which is meant for signed long values. That is, you need to write
printf ("%zu %d",sizeof(++a),a);
According to the C99 standard,
the sizeof operator only takes into account the type of the operand, which may be an expression or the name of a type (e.g. int, double, float), and not the value obtained by evaluating the expression.
Hence, unless the operand has a variable length array type, the operand of the sizeof operator is not evaluated.

Why in C does the function sizeof() output the size of right most operand when more than one operands are passed separated by comma?

I have the following code in C:
#include <stdio.h>
void main() {
printf("%d %d\n",sizeof(5),sizeof(5,5));
printf("%d %d\n",sizeof(5),sizeof(5.0,5));
printf("%d %d\n",sizeof(5),sizeof(5,5.0));
}
And I get the output:
4 4
4 4
4 8
I understand that sizeof(5) would return me the size of integer and sizeof(5.0) would return the size of a double, but why does it give the size of the rightmost operand in case more than one arguments are passed separated by comma? Why not the first argument or the collective size of all the arguments?
I am compiling online using OnlineGDB.com compiler for C.
Thanks for your help.
The simple reason is: Because sizeof is not a function! It is an operator that takes some expression on its right. Syntactically, it behaves much like the return keyword. When the operand is an expression, the parentheses are only added by programmers for clarity and are not needed; they are required only when the operand is a type name:
sizeof(foo); //no surprise, take the size of the variable/object
sizeof foo; //same as above, the parentheses are not needed
sizeof(void*); //you cannot pass a type to a function, but you can pass it to the sizeof operator
sizeof void*; //error: a type name operand must be parenthesized
typedef char arrayType[20];
arrayType* bar; //pointer to an array
sizeof(*bar); //you cannot pass an array to a function, but you can pass it to the sizeof operator
sizeof*bar; //same as above
//compare to the behavior of `return`:
return foo; //no surprise
return(foo); //same as above, any expression may be enclosed in parentheses
So, what happens when you say sizeof(5, 5.0)? Well, since sizeof is an operator, the parentheses are not a function call, but rather interpreted like the parentheses in 1*(2 + 3) == 5. In both cases, the ( follows an operator, and is thus not interpreted as a function call. As such, the comma does not separate function call arguments (because there is no function call), rather it's interpreted as the comma operator. And the comma operator is defined to evaluate both its operands, and then return the value of the last operand. The operator nature of the sizeof dictates how the expression on its right is parsed.
Because the comma operator takes its operands from left to right and yields the rightmost one.
Only the type and value of the rightmost expression are used; the results of the others are discarded (their side effects, if any, are sequenced before the rightmost operand).
Therefore,
sizeof(5.0,5) is equivalent to sizeof(5)
and
sizeof(5,5.0) is equivalent to sizeof(5.0)

Meaning of &st[3]-st in printf("%ld", &st[3]-st)

I'm new to C and in an exercise, I have to write the output of the following portion of code, which is 3. But I couldn't understand why it is that.
int main() {
char st[100]="efgh";
printf ("\n%ld\n",&st[3]-st);
return 0;
}
When you use an array in an expression, unless it is the argument of & or sizeof, it evaluates to the address of its first element.
Thus &st[3] - st evaluates as &st[3] - &st[0], which is just pointer arithmetic: The difference between the addresses of two array elements is just the difference between their indices, i.e. 3 - 0, which gives 3.
The only problem is that the result is of type ptrdiff_t, but printf %ld expects a long int. If those types are different on your machine, it won't work. In a printf() format string, the correct length modifier for ptrdiff_t is t, so use "\n%td\n".
By definition, &st[3] is the same as st+3. st+3-st is 3. (st in that expression decays from array to a pointer. For portability, the printf format string should technically have %td instead of %ld.)

How does the sizeof operator behave in the code snippet below?

Please explain the output of the code snippet below:
int *a="";
char *b=NULL;
float *c='\0' ;
printf(" %d",sizeof(a[1])); // prints 4
printf(" %d",sizeof(b[1])); // prints 1
printf(" %d",sizeof(c[1])); // prints 4
The compiler interprets a[1] as *(a+1). So a holds some address, and a+1 steps 4 bytes ahead, where there is presumably some garbage value. How is the output 4 bytes then? Even if I use a[0] it still prints 4, although a points to an empty string, so how come its size is 4 bytes?
Here we are finding the size of the object the pointer points to. So sizeof(a[1]) means the size of *(a+1). Now a holds the address of a string constant, which is an empty string. After adding 1 to that address it moves 4 bytes ahead to some new address. How do we know the size of the value there? It could be an integer, a character, a float, anything. How do we reach a conclusion?
The sizeof operator does not evaluate its operand except in one case.
From the C Standard (6.5.3.4 The sizeof and alignof operators)
2 The sizeof operator yields the size (in bytes) of its operand, which
may be an expression or the parenthesized name of a type. The size is
determined from the type of the operand. The result is an integer.
If the type of the operand is a variable length array type, the operand is evaluated; otherwise, the operand is not evaluated and the
result is an integer constant.
In this code snippet
int *a="";
char *b=NULL;
float *c='\0' ;
printf(" %d",sizeof(a[1])); // prints 4
printf(" %d",sizeof(b[1])); // prints 1
printf(" %d",sizeof(c[1])); // prints 4
the type of the expression a[1] is int, the type of the expression b[1] is char and the type of the expression c[1] is float.
So the printf calls output 4, 1, and 4 respectively.
However the format specifiers in the calls are specified incorrectly. Instead of "%d" there must be "%zu" because the type of the value returned by the sizeof operator is size_t.
From the same section of the C Standard
5 The value of the result of both operators is implementation-defined,
and its type (an unsigned integer type) is size_t, defined in
<stddef.h> (and other headers).
This is all done statically, i.e. no dereferencing is happening at runtime. This is how the sizeof operator works, unless you use variable-length arrays (VLAs), then it must do work at runtime.
Which is why you can get away with sizeof:ing through a NULL pointer, and other things.
You should still be getting trouble for
int *a = "";
which makes no sense. I really dislike the c initializer too, but at least that makes sense.
The sizeof operator is evaluated at compile time (except for VLAs). It looks at the type of an expression, not the actual data, so even something like this will work:
sizeof(((float *)NULL)[1])
and give you the size of a float. Which on your system is 4 bytes.
Even though this looks super bad, it is all well defined, since no dereference ever actually occurs. This is all operations on type information at compile time.
sizeof() is based on the data type, so whilst it's getting the sizes outside the bounds of memory allocated to your variables, it doesn't matter as it's worked out at compile time rather than run time.

What is the reason for the double negation -(-n)?

I'm going through some legacy code and I've seen something like
char n = 65;
char str[1024];
sprintf(str, "%d", -(-n));
Why has the author (no longer present) written -(-n) rather than just n? Wouldn't --n suffice?
The first thing to note is that --n actually decreases n by 1 and evaluates to the new value, with the type char; so it does something very different to -(-n). Don't change the code to that!
-n performs a unary negation of n and is also an expression of type int due to the type promotion rules of C. The further negation sets it back to the original value but with the type int retained.
So -(-n) is actually a verbose way of writing +n, which is often thought to be a no-op but in this case it converts the type of n to an int.
I suspect the author is guarding themselves against errant refactoring and they were worried about mismatching the type of the argument with the format specifier %d.
But in this particular case it does not matter: sprintf will automatically promote the char type to an int, so it's perfectly safe to write
sprintf(str, "%d", n);
Do also consider reducing the size of the str buffer if that's "real" code, and consider using the safer snprintf variant.
(As a final remark note that a double negation can yield signed integral type overflow, so do use with caution.)

Resources