short and int in C

short int a,b,c;
scanf("%d%d",&a,&b);
c = a + b;
printf("%d %d %d",a,b,c);
Input: 5 8
Output: 0 8 8
Why is the value of a 0? Can anyone explain this?
Platform: GCC, Ubuntu 10.04

scanf with a "%d" format requires an int* argument. You're giving it a short int* argument, thus your program's behavior is undefined.
If you actually want to know why you're getting the results you're getting, I can speculate on that, but it's much easier just to correct your code:
scanf("%hd%hd", &a, &b);
You can continue to use "%d" for printf, since the short int arguments are promoted to int. You could use "%hd" with printf, but it's not necessary. (There is no similar promotion of short int* arguments to int*.)
You can safely stop reading here.
The following is some speculation about what's probably happening in your incorrect code. This is not a solution; the solution is to correct your code so it does what you want it to do. But it might be instructive to see just how incorrect code can misbehave.
Assume short is 16 bits and int is 32 bits, which is typical. The first "%d" in the format string tells scanf to read a value (you gave it 5) and store it into a 32-bit int pointed to by the second argument, &a. Since a is only 16 bits, it will store half the 32-bit value in a and the other half in some adjacent chunk of memory. The second "%d" does the same thing with &b; it stores half of the 32-bit representation of 8 in b, and the other half somewhere else.
Based on your output, it appears that the second "%d" caused scanf to store the low-order 16 bits of the value 8 in b, and the high-order 16 bits (with value 0) in a, overwriting the value stored by the first "%d". Note that the high-order 16 bits from the first "%d" were probably stored somewhere else, perhaps clobbering some other variable or perhaps writing to some otherwise unused memory.
So the result is that you've stored 0 in a and 8 in b, which explains the output you're getting.
All this is very speculative, and many, many other results are possible. This kind of analysis is useful only for tracking down the behavior of incorrect code, with the goal of correcting it. Writing code that deliberately takes advantage of this kind of thing is an extraordinarily bad idea. The language says absolutely nothing about what incorrect code like this will do; its behavior can vary wildly on different systems, with different compiler settings, or even depending on the phase of the moon.

According to this:
http://www.cplusplus.com/reference/clibrary/cstdio/scanf/
you need to specify a length modifier (h) if your variable is a short int and not a regular int.

In VC++ 6.0, the value of c is 13.
#include <stdio.h>
int main()
{
    short int a, b, c;
    scanf("%d%d", &a, &b);
    c = a + b;
    printf("%d + %d = %d\n", a, b, c);
    return 0;
}

Related

C: What happens technically when int type is stored in long long type? [duplicate]

#include <stdio.h>
int main() {
    long long a, b;
    scanf("%d %d", &a, &b);
    printf("%lld", a + b);
    return 0;
}
The code above reads two numbers and prints their sum.
I know the correct format specifier for long long is %lld rather than %d, and I believe the mismatch is the cause of the compiler warning/error.
However, the problem is that some compilers, such as the one at https://www.programiz.com/c-programming/online-compiler/, run the code without any error but print a strange value like the one below, which I don't understand at all.
Input: 123 -123
Output: 235046380240896 (this value changes from run to run)
What is happening on the foundational level when int type is stored in long long type?
Formally it is undefined behavior, since the format specifiers don't match the types. So in theory anything can happen. Compilers aren't required to diagnose format strings that don't match the arguments provided.
In practice, many (32/64-bit) compilers likely read 4 bytes and place them in the 4 least significant positions (little endian) of the long long, whereas the 4 most significant bytes keep their indeterminate values; that is, garbage, since the variable was never initialized.
So in practice if you initialize the long long to zero you might actually get the expected output, but there are no guarantees for it.
This is undefined behavior, so anything could happen.
What's most likely happening in practice is that scanf() is storing each number you enter into the low 32-bit half of its 64-bit long long variable. Since you never initialized the variables, the other halves contain unpredictable values. This is why you're getting a different result each time.

Why can I printf with the wrong specifier and still get output?

My question involves the memory layout and mechanics behind the C printf() function. Say I have the following code:
#include <stdio.h>
int main()
{
    short m_short;
    int m_int;

    m_int = -5339876;
    m_short = m_int;
    printf("%x\n", m_int);
    printf("%x\n", m_short);
    return 0;
}
On GCC 7.5.0 this program outputs:
ffae851c
ffff851c
My question is, where is the ffff actually coming from in the second hex number? If I'm correct, those fs should be outside the bounds of the short, but printf is getting them from somewhere.
When I properly format with specifier %hx, the output is rightly:
ffae851c
851c
As far as I have studied, the compiler simply truncates the top half of the number, as shown in the second output. So in the first output, are the four leading fs the result of the program actually reading memory it shouldn't? Or does the C compiler behind the scenes still reserve a full int even for a short, sign-extended, with the high half being undefined behavior if used?
Note: I am performing research, in a real-world application, I would never try to abuse the language.
When a char or short (including signed and unsigned versions) is used as a function argument where there is no specific type (as with the ... arguments to printf(format,...))1, it is automatically promoted to an int (assuming it is not already as wide as an int2).
So printf("%x\n", m_short); has an int argument. What is the value of that argument? In the assignment m_short = m_int;, you attempted to assign it the value −5339876 (represented with bytes 0xffae851c). However, −5339876 will not fit in this 16-bit short. In assignments, a conversion is automatically performed, and, when a conversion of an integer to a signed integer type does not fit, the result is implementation-defined. It appears your implementation, as many do, uses two’s complement and simply takes the low bits of the integer. Thus, it puts the bytes 0x851c in m_short, representing the value −31460.
Recall that this is being promoted back to int for use as the argument to printf. In this case, it fits in an int, so the result is still −31460. In a two’s complement int, that is represented with the bytes 0xffff851c.
Now we know what is being passed to printf: An int with bytes 0xffff851c representing the value −31460. However, you are printing it with %x, which is supposed to receive an unsigned int. With this mismatch, the behavior is not defined by the C standard. However, it is a relatively minor mismatch, and many C implementations let it slide. (GCC and Clang do not warn even with -Wall.)
Let’s suppose your C implementation does not treat printf as a special known function and simply generates code for the call as you have written it, and that you later link this program with a C library. In this case, the compiler must pass the argument according to the specification of the Application Binary Interface (ABI) for your platform. (The ABI specifies, among other things, how arguments are passed to functions.) To conform to the ABI, the C compiler will put the address of the format string in one place and the bits of the int in another, and then it will call printf.
The printf routine will read the format string, see %x, and look for the corresponding argument, which should be an unsigned int. In every C implementation and ABI I know of, an int and an unsigned int are passed in the same place. It may be a processor register or a place on the stack. Let’s say it is in register r13. So the compiler designed your calling routine to put the int with bytes 0xffff851c in r13, and the printf routine looked for an unsigned int in r13 and found bytes 0xffff851c.
So the result is that printf interprets the bytes 0xffff851c as if they were an unsigned int, formats them with %x, and prints “ffff851c”.
Essentially, you got away with this because (a) a short is promoted to an int, which is the same size as the unsigned int that printf was expecting, and (b) most C implementations are not strict about mismatching integer types of the same width with printf. If you had instead tried printing an int using %ld, you might have gotten different results, such as “garbage” bits in the high bits of the printed value. Or you might have a case where the argument you passed is supposed to be in a completely different place from the argument printf expected, so none of the bits are correct. In some architectures, passing arguments incorrectly could corrupt the stack and break the program in a variety of ways.
Footnotes
1 This automatic promotion happens in many other expressions too.
2 There are some technical details regarding these automatic integer promotions that need not concern us at the moment.

what happens when we type cast from lower datatype to higher datatype

Does the accessible memory actually change, or does the cast just tell the compiler to treat the variable as the mentioned type?
Example:
#include <stdio.h>

int main()
{
    char a;
    a = 123456789;
    printf("ans is %d\n", (int)a);
    return 0;
}
Compiler warning: overflow in implicit constant conversion, a = 123456789.
Output: ans is 21.
Here I know why it's causing overflow. But I want to know how memory is accessed when an overflow occurs.
This is fairly simple: since char holds only one byte (typically 8 bits), only a single byte of 123456789 will be copied to a. Exactly how depends on whether char is signed or unsigned (which one it is, is implementation-specific). For the exact details see e.g. this integer conversion reference.
What typically happens (I haven't seen any compiler do differently) is that the last byte of the value is copied, unmodified, into a.
For 123456789, if you view the hexadecimal representation of the value it will be 0x75bcd15. Here you can easily see that the last byte is 0x15 which is 21 in decimal.
What happens with the cast to int when you print the value is actually nothing that wouldn't happen anyway: when using variable-argument functions like printf, values of a smaller type than int are promoted to int. Your printf call is exactly equivalent to
printf("ans is %d\n",a);

C program : help about variable definition sequence

void main()
{
    float x = 8.2;
    int r = 6;
    printf("%f", r / 4);
}
It is clearly odd that I am not explicitly casting r (of int type) to float in the printf call. However, if I change the sequence of declarations and declare r first and then x, I get different results (in this case a garbage value), even though I am not using x anywhere in the program. These are the things I meant to be wrong; I want to keep them the way they are. But when I execute the first piece of code I get 157286.375011 as the result (a garbage value).
void main()
{
    int r = 6;
    float x = 8.2;
    printf("%f", r / 4);
}
and if I execute the code above I get 0.000000 as the result. I know the results can go wrong because I am using %f in the printf when it should have been %d; the results may be wrong. But my question is: why do the results change when I change the sequence of variable definitions? Shouldn't they be the same, whether right or wrong?
Why is this happening?
printf does not have any type checking. It relies on you to do that checking yourself, verifying that all of the types match the formatting specifiers.
If you don't do that, you enter into the realm of undefined behavior, where anything can happen. The printf function is trying to interpret the specified value in terms of the format specifier you used. And if they don't match, boom.
It's nonsense to specify %f for an int, but you already knew that...
The f conversion specifier takes a double argument, but you are passing an int argument. Passing an int argument to the f conversion specifier is undefined behavior.
In this expression:
r / 4
both operands are of type int and the result is also of type int.
Here is what you want:
printf ("%f", r / 4.0);
When printf grabs the optional variables (i.e. the variables after the char * that tells it what to print), it has to get them off the stack. double is usually 64 bits (8 bytes) whereas int is 32 bits (4 bytes).
Moreover, floating point numbers have an odd internal structure as compared to integers.
Since you're passing an int in place of a double, printf is trying to get 8 bytes off the stack instead of four, and it's trying to interpret the bytes of an int as the bytes of a double.
So not only are you getting 4 bytes of memory containing no one knows what, but you're also interpreting that memory -- that's 4 bytes of int and 4 bytes of random stuff from nowhere -- as if it were a double.
So yeah, weird things are going to happen. When you re-compile (or sometimes even just re-run) a program that wantonly picks things out of memory where it hasn't malloc'd and hasn't stored, you're going to get unpredictable and wildly-changing values.
Don't do it.

strange behavior of scanf for short int

the code is as follows:
#include <stdio.h>
int main()
{
    int m = 123;
    int n = 1234;
    short int a;

    a = ~0;
    if ((a >> 5) != a) {
        printf("Logical Shift\n");
        m = 0;
    }
    else {
        printf("Arithmetic Shift\n");
        m = 1;
    }
    scanf("%d", &a);
    printf("%d\n", m);
    return 0;
}
After the line scanf("%d", &a); the value of m becomes 0.
I know it may be caused by the scanf: a's type is short and the input's type is int. But how can this affect the value of m?
Thanks a lot!
The most likely reason for m being 0 in your snippet is because you assign m to have this value in the body of your if-statement, but since the code contains undefined behavior no one can say that for sure.
The likely story about passing a short* when scanf expects an int*
Assuming sizeof(short) == 2 and sizeof(int) == 4.
When entering your main function the stack on which the variables reside would normally look something like the below:
_
|short int (a) : scanf will try to read an int (4 bytes).
|_ 2 bytes : This part of memory will most
|int (n) : likely be overwritten
| :..
|
|_ 4 bytes
|int (m)
|
|
|_ 4 bytes
When you read a %d (i.e. an int) into the variable a, that shouldn't affect variable m, though n will most likely have parts of it overwritten.
Undefined Behavior
Though it's all a guessing game since you are invoking what we normally refer to as "undefined behavior" when using your scanf statement.
Everything the standard doesn't guarantee is UB, and the result could be anything. Maybe you will write data to another segment that is part of a different variable, or maybe you might make the universe implode.
Nobody can guarantee that we will live to see another day when UB is present.
How to read a short int using scanf
Use %hd, and be sure to pass it a short *. We've had enough of UB for one night!
Assuming that int and short are four- and two-byte integers, respectively, on your platform (which is a likely assumption, but not guaranteed by the standard), you're asking scanf to read in an integer and store it in four bytes: the two bytes of a, and whatever two bytes follow it in memory. (Well, technically this is undefined behavior, and no specific behavior is guaranteed; but that's what it's likely to do.) Apparently your compiler is using the two bytes after a as the first two bytes of m. Which is a bit surprising; I certainly wouldn't expect a and m to be adjacent, and it rather implies that your compiler isn't aligning shorts and ints to the beginning of four-byte blocks. But it's perfectly legal.
You can see better what's going on if you add
printf("&a: %p\n&m: %p\n", (void *)&a, (void *)&m);
which will show you where a and m are stored, relative to each other. (Just as a test, I mean. You wouldn't want that in "real" code.)
You are correct: %d expects, and writes, an int. If you enter a value less than 65536, its high-order two bytes are zero, and those are the bytes that land outside the short, which is why you see 0 when you print m back. I tried reading a short and printing it back; I entered 65536123 and got 123, which makes perfect sense (65536000 is a multiple of 65536, so only the remaining 123 shows through the two bytes of the short). This behavior is dangerous, because the other two bytes of the int end up in a "variable next door" to the short, which is very, very bad. I hope this convinces you not to do it.
P.S. To read a short with scanf, declare a temporary int variable, read the value into it using scanf, and then assign it to the short.
You're invoking Undefined Behavior when passing a pointer to a non-int to scanf's %d.
Likely, the compiler introduces padding bytes for alignment purposes and the values get stored in the padding bytes and not the "useful" bytes.
However, the compiler is free to do anything, from raising a segfault / access violation to invoking nasal demons.
If you had actually used variable n, then it would probably have been the one that got clobbered, rather than m. Since you didn't use n, the compiler optimized it away, and that means that it was m that got clobbered by scanf() writing 4 bytes (because it was told that it got a pointer to a (4-byte) integer) instead of the 2 bytes. This depends on quite a lot of details of your hardware, such as endian-ness and alignment (if int had to be aligned on a 4-byte boundary, you wouldn't see the problem; I guess you are on an Intel machine rather than, say, PowerPC or SPARC).
Don't fib to your compiler - even accidentally. It will get its own back.
