strange behavior of scanf for short int - c

the code is as follows:
#include <stdio.h>
main()
{
int m=123;
int n = 1234;
short int a;
a=~0;
if((a>>5)!=a){
printf("Logical Shift\n");
m=0;
}
else{
printf("Arithmetic Shift\n");
m=1;
}
scanf("%d",&a);
printf("%d\n", m);
}
after the line scanf("%d",&a); the value of m becomes 0.
I know it may be caused by the scanf: a's type is short and the input's type is int. But how can this affect the value of m?
Thanks a lot !

The most likely reason for m being 0 in your snippet is that you assign 0 to m in the body of your if-statement, but since the code contains undefined behavior, no one can say that for sure.
The likely story about passing a short* when scanf expects an int*
Assuming sizeof(short) = 2 and sizeof(int) == 4.
When entering your main function the stack on which the variables reside would normally look something like the below:
_
|short int (a) : scanf will try to read an int (4 bytes).
|_ 2 bytes : This part of memory will most
|int (n) : likely be overwritten
| :..
|
|_ 4 bytes
|int (m)
|
|
|_ 4 bytes
When you read a %d (ie. an int) into the variable a that shouldn't affect variable m, though n will most likely have parts of it overwritten.
Undefined Behavior
Though it's all a guessing game since you are invoking what we normally refer to as "undefined behavior" when using your scanf statement.
The standard places no requirements on UB, so the result could be anything. Maybe you will write data to memory that belongs to a different variable, or maybe you might make the universe implode.
Nobody can guarantee that we will live to see another day when UB is present.
How to read a short int using scanf
Use %hd, and be sure to pass it a short*.. we've had enough of UB for one night!

Assuming that int and short are four- and two-byte integers, respectively, on your platform (which is a likely assumption, but not guaranteed by the standard), you're asking scanf to read in an integer and store it in four bytes: the two bytes of a, and whatever two bytes follow it in memory. (Well, technically this is undefined behavior, and no specific behavior is guaranteed; but that's what it's likely to do.) Apparently your compiler is using the two bytes after a as the first two bytes of m. Which is a bit surprising (I certainly wouldn't expect a and m to be adjacent, and it rather implies that your compiler isn't aligning shorts and ints to the beginning of four-byte blocks) but perfectly legal.
You can see better what's going on if you add
printf("&a: %p\n&m: %p\n", (void *)&a, (void *)&m);
which will show you where a and m are stored, relative to each other. (Just as a test, I mean. You wouldn't want that in "real" code.)

You are correct, %d expects and writes an int. Depending on byte order, a value that fits in 16 bits may land entirely in the bytes outside the short, in which case you see 0 when you print a back. I tried reading a short and printing it back; I entered 65536123 and got 123, which makes perfect sense: 65536123 = 1000 × 65536 + 123, so the low 16 bits hold 123, and that is what you see through the two bytes of the short. This behavior is dangerous, because the other two bytes of the int end up in a "variable next door" to the short, which is very, very bad. I hope this convinces you not to do it.
P.S. To read a short with scanf, declare a temporary int variable, read the value into it using scanf, and then cast it to short.

You're invoking Undefined Behavior when passing a pointer to a non-int to scanf's %d.
Likely, the compiler introduces padding bytes for alignment purposes and the values get stored in the padding bytes and not the "useful" bytes.
However, the compiler is free to do anything from raise a segfault / access violation to invoke nasal demons.

If you had actually used variable n, then it would probably have been the one that got clobbered, rather than m. Since you didn't use n, the compiler optimized it away, and that means that it was m that got clobbered by scanf() writing 4 bytes (because it was told that it got a pointer to a (4-byte) integer) instead of the 2 bytes. This depends on quite a lot of details of your hardware, such as endian-ness and alignment (if int had to be aligned on a 4-byte boundary, you wouldn't see the problem; I guess you are on an Intel machine rather than, say, PowerPC or SPARC).
Don't fib to your compiler - even accidentally. It will get its own back.

Related

what happens when we type cast from lower datatype to higher datatype

Will the accessibility of memory space get changed or just informing the compiler take the variable of mentioned type?
Example:
int main()
{
char a;
a = 123456789;
printf("ans is %d\n",(int)a);
}
Output:
overflow in implicit constant conversion a= 123456789.
ans is 21.
Here I know why it's causing overflow. But I want to know how memory is accessed when an overflow occurs.
This is kind of simple: Since char typically only holds one byte, only a single byte of 123456789 will be copied to a. Exactly how depends on if char is signed or unsigned (it's implementation-specific which one it is). For the exact details see e.g. this integer conversion reference.
What typically happens (I haven't seen any compiler do any different) is that the last byte of the value is copied, unmodified, into a.
For 123456789, if you view the hexadecimal representation of the value it will be 0x75bcd15. Here you can easily see that the last byte is 0x15 which is 21 in decimal.
What happens with the casting to int when you print the value is actually nothing that wouldn't happen anyway... When using variable-argument functions like printf values of a smaller type than int will be promoted to an int. Your printf call is exactly equal to
printf("ans is %d\n",a);

Typecasting of a pointer to some basic data types(what happens internally)

What happens when a pointer is typecasted to a basic data type. Why do we get some value?
For example:
int h=4;
int * ph=&h;
printf("%p",ph);
printf("%d",ph);
Both the print statements print different values...
printf("%p",ph);
says to the runtime: "See over there, that memory is a pointer, load it and print it out as hex please". Note that this is not said to the compiler; the compiler doesn't know what printf is doing (actually most modern compilers sneak a look inside printf format strings, so you probably got a warning).
printf("%d",ph);
says: "See that piece of memory, it's an integer, please load it and print it as a human-readable base 10 number."
Given that ph is a pointer to an int, the first one does the correct thing: it prints out the value of the pointer.
The second one's behavior depends on the size and representation of int and pointers on your system. The value is 'really' a pointer but you are telling the runtime it's an int. On many systems pointers and ints are both 32 bits. In that case the load will read 32 bits and the print will interpret those bits as an int and print out the base 10 value of the pointer. On other systems pointers might be 64 bits while int is still 32 bits. Since you don't say what values you get, it's hard to know what's going on, but if I had to guess I would say that you are getting the same value, once in hex and once in decimal.
Note that the second one is what's called 'undefined behavior'. You are lying to the system: bad, confusing, unexplained things can happen.
This program has undefined behaviour. The type of the actual parameter ph (int*) doesn't match the format specifier "%d". It can print anything or nothing, or simply make your PC explode.

taking integer value in character pointer

int main()
{
int i=21;
char *p;
p=(char*)&i;
printf("%d",*p);
getch();
return 0;
}
The printf statement gave me the right answer, but I think it shouldn't have. Since p is a character pointer, it can hold the base address, but an int takes up two bytes, so *p shouldn't be able to give me the integer value: it points to some address, say X, while the int is stored in two bytes, so the value needs to be collected from addresses X and X+1. Yet I ran this code and it gave me the value. Or do I have the wrong insight on this?
p=(char*)&i;
This points p to the lowest address in i. Whether that is the address of the low order byte or the high order byte depends on the endianness of your system. (It could even be an internal byte ... PDP-11's are little-endian but longs (32 bits) were stored with the high order 16-bit word first, so the byte order was 2,3,0,1.) Likely you're running on a little-endian machine (x86's are) so it points to the low order byte.
*p
Given little-endianness, this fetches the low order byte of i, which is (char)21, and then does the default conversion to an int, giving (int)21, and prints 21. If i contained a value > 255, you would get the "wrong" result. Also if it contained a value > 127 and < 256 and char is signed on your system -- it would print a negative value.
Since the result depends on the endianness of the machine and is implementation-defined and thus is not portable, you should not do this sort of thing unless your specific goal is to determine the endianness of your machine. Beginning programmers should spend a lot less time trying to understand why bad code sometimes "works" and instead learn how to write good code. A general rule (with plenty of exceptions): code with casts is bad code.

How are pointers stored in memory?

I'm a little confused about this.
On my system, if I do this:
printf("%d", sizeof(int*));
this will just yield 4. Now, the same happens for sizeof(int). Conclusion: if both integers and pointers are 4 bytes, a pointer can be safely "converted" to an int
(i.e. the memory it points to could be stored in an int). However, if I do this:
int* x;
printf("%p", x);
The returned hex address is far beyond the int scope, and thus any attempt to store the value in an int fails obviously.
How is this possible? If the pointer takes 4 bytes of memory, how can it store more than 2^32?
EDIT:
As suggested by a few users, I'm posting the code and the output:
#include <stdio.h>
int main()
{
printf ("%d\n", sizeof(int));
printf ("%d\n", sizeof(int*));
int *x;
printf ("%d\n", sizeof(x));
printf ("%p\n", x);
}
The output:
4
4
4
0xb7778000
C11, 6.3.2.3, paragraphs 5 and 6:
An integer may be converted to any pointer type. Except as previously specified, the
result is implementation-defined, might not be correctly aligned, might not point to an
entity of the referenced type, and might be a trap representation.
Any pointer type may be converted to an integer type. Except as previously specified, the
result is implementation-defined. If the result cannot be represented in the integer type,
the behavior is undefined. The result need not be in the range of values of any integer
type.
So the conversions are allowed, but the result is implementation defined (or undefined if the result cannot be stored in an integer type). (The "previously specified" is referring to NULL.)
As for your claim that the printed pointer is larger than what 4 bytes of data can represent: that is not the case, as 0xb7778000 is within range of a 32-bit unsigned integer type.
The returned hex address is far beyond the int scope, and thus any attempt to store the value in an int fails obviously.
4
4
4
0xb7778000
And 0xb7778000 is a 32-bit value, so an object of 4 bytes can hold it.
No, they cannot be "safely" converted. Certainly they use the same amount of storage space, but there is no guarantee that they interpret a number of set bits in the same manner.
As for the second question (and one question per question please), there is no guaranteed size for int, or for a pointer. An int is roughly the optimum size of data transfer on the bus (also known as a word). It can differ between platforms, but must be at least as large as a short or a char. This is why the standard defines the name INT_MAX, but not a fixed value for it.
A pointer is roughly as many bits wide as necessary to address a memory location. The original IBM PC had an 8-bit data bus and 16-bit registers, but formed 20-bit addresses (by shifting a segment value and adding an offset) to extend its memory range past its register size.

short and int in c

short int a,b,c;
scanf("%d%d",&a,&b);
c = a + b;
printf("%d %d %d",a,b,c);
Input: 5 8
Output: 0 8 8
Why is the value of a 0? Can any one explain this?
Platform --- GCC ubuntu 10.04
scanf with a "%d" format requires an int* argument. You're giving it a short int* argument, thus your program's behavior is undefined.
If you actually want to know why you're getting the results you're getting, I can speculate on that, but it's much easier just to correct your code:
scanf("%hd%hd", &a, &b);
You can continue to use "%d" for printf, since the short int arguments are promoted to int. You could use "%hd" with printf, but it's not necessary. (There is no similar promotion of short int* arguments to int*.)
You can safely stop reading here.
The following is some speculation about what's probably happening in your incorrect code. This is not a solution; the solution is to correct your code so it does what you want it to do. But it might be instructive to see just how incorrect code can misbehave.
Assume short is 16 bits and int is 32 bits, which is typical. The first "%d" in the format string tells scanf to read a value (you gave it 5) and store it into a 32-bit int pointed to by the second argument, &a. Since a is only 16 bits, it will store half the 32-bit value in a and the other half in some adjacent chunk of memory. The second "%d" does the same thing with &b; it stores half of the 32-bit representation of 8 in b, and the other half somewhere else.
Based on your output, it appears that the second "%d" caused scanf to store the low-order 16 bits of the value 8 in b, and the high-order 16 bits (with value 0) in a, overwriting the value stored by the first "%d". Note that the high-order 16 bits from the first "%d" were probably stored somewhere else, perhaps clobbering some other variable or perhaps writing to some otherwise unused memory.
So the result is that you've stored 0 in a and 8 in b, which explains the output you're getting.
All this is very speculative, and many many other results are possible. This kind of analysis is useful only for tracking down the behavior of incorrect code, with the goal of correcting it. Writing code that deliberately takes advantage of this kind of thing is an extraordinarily bad idea. The language says absolutely nothing about what incorrect code like this will do; its behavior can vary wildly on different systems, with different compiler settings, or even depending on the phase of the moon.
According to this:
http://www.cplusplus.com/reference/clibrary/cstdio/scanf/
you need to specify a length modifier if your variable is a "short" int and not a regular int
In VC++ 6.0, the value of c is 13.
#include <stdio.h>
int main()
{
short int a,b,c;
scanf("%d%d",&a,&b);
c = a + b;
printf("%d + %d = %d\n",a,b,c);
return 0;
}

Resources