If I have the following:
char v = 32; // 0010 0000
then I do:
v << 2
the number becomes negative. // 1000 0000 -128
I read the standard but it is only written:
If E1 has a signed type and nonnegative value, and E1 × 2^E2 is
representable in the result type, then that is the resulting value;
otherwise, the behavior is undefined.
so I don't understand whether there is a rule that the number must become
negative when a bit is shifted into the leftmost bit.
I'm using GCC.
Left shifting it twice would give 1000 0000 (binary) = 128 (decimal).
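If 128 is representable in char, i.e. you're on a machine (with a supporting compiler) that provides a char wider than 8 bits, or one where char is unsigned, then 128 is the value you get (since it's representable in such a type).
Otherwise, if char is a signed 8-bit type, as on most common machines, two's complement gives a representable range of [-128, 127], and 128 does not fit. Strictly speaking, the shift itself is still well defined: v is promoted to int before shifting, so v << 2 is the int 128. The trouble starts when you store that result back into a char, where the out-of-range conversion gives an implementation-defined result (commonly -128). The undefined behaviour in the passage you quoted applies when the result is not representable in the promoted type, e.g. when an int is shifted too far.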
Signed types such as signed char (and plain char on your implementation) typically use two's complement (http://en.wikipedia.org/wiki/Twos_complement) to encode values. You are probably looking for unsigned char, which has no sign bit and therefore no negative values.
Try using unsigned char instead. Both types have the same number of bits, but a signed char spends the top bit on the sign, so it only reaches 127; unsigned char uses all eight bits for the magnitude and reaches 255.
unsigned char var = 32;
unsigned char v = var << 2; // 128
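A minimal sketch of the two cases discussed above, assuming a small nonnegative shift count and a value that fits in int after shifting. The helper names shift_promoted and shift_u8 are made up for illustration. The first helper shows that the shift is carried out in int after promotion; the second shows the result being stored in an unsigned char, which is always well defined.

```c
/* Hypothetical helper: v is promoted to int before the shift,
   so the result is computed and returned as an int. */
int shift_promoted(char v, int n)
{
    return v << n;
}

/* Hypothetical helper: the int result is converted back to unsigned char,
   which is well defined (reduction modulo UCHAR_MAX + 1). */
unsigned char shift_u8(unsigned char v, int n)
{
    return (unsigned char)(v << n);
}
```

Here shift_promoted(32, 2) is the int 128 and shift_u8(32, 2) is 128 as well. Storing 128 into a signed 8-bit char instead would be an implementation-defined conversion.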
I tried the following piece of code, expecting the output to be positive 64:
char val = 0x80;
printf("%d",val>>1);
My understanding of what happens is (please correct me if I'm wrong, as I probably am):
Referring to the ASCII table, there is no mapping of 0x80 to any character, so I assume this is stored as an unsigned integer.
This is represented as 1000 0000 in bitwise format, so a right shift of 1 would result in 0100 0000
When printed as an integer value, this will then show as positive 64.
However it shows -64.
In contrast:
char val = 0x40;
printf("%d",val>>1);
gives positive 32.
Is the value implicitly converted to a signed integer in the first case and not in the second?
Your C implementation uses an eight-bit signed char. (The C standard permits char to be signed or unsigned.) In char val = 0x80;, a char cannot represent the value you initialize it with, 128. In this case, the value 128 is converted to char which, per C 2018 6.3.1.3 3, yields either an implementation-defined value or a trap. Your implementation likely produces −128. (This is a common result because 128 in binary is 10000000, and converting an out-of-range result to an eight-bit two’s complement integer often simply reinterprets the low eight bits of the value as eight-bit two’s complement. In two’s complement, 10000000 represents −128.)
So val>>1 asks to shift −128 right one bit. Per C 2018 6.5.7 5, shifting a negative value right yields an implementation-defined value. Producing −64 is a common result.
(In detail, in val>>1, val is automatically promoted from char to int. It has the same value, −128. However, with a 32-bit int, it would then be represented as 11111111111111111111111110000000 instead of 10000000. Then shifting right “arithmetically,” which propagates the sign bit, yields 11111111111111111111111111000000, which is −64, the result you got. Some C implementations might shift right “logically,” which sets the sign bit to zero, yielding 01111111111111111111111111000000. In this case, the printf would show “2147483584”, which is 2^31 − 64.)
Whether ASCII has any character with code 0x80 is irrelevant. The C rules apply to the values involved, regardless of what character encoding scheme is used.
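The arithmetic versus logical distinction described above can be sketched with fixed-width types. This is only an illustration under stated assumptions: the names shr_logical and shr_arithmetic are hypothetical, the shift count must be less than 32, and the arithmetic version is written so that it never right-shifts a negative value, which avoids the implementation-defined case.

```c
#include <stdint.h>

/* Logical shift: operate on the unsigned bit pattern, so zeros enter
   from the left. Requires n < 32. */
uint32_t shr_logical(int32_t x, unsigned n)
{
    return (uint32_t)x >> n;
}

/* Arithmetic shift without right-shifting a negative value:
   for negative x, ~x is nonnegative, so ~x >> n is fully defined,
   and complementing again restores the sign-propagating result. */
int32_t shr_arithmetic(int32_t x, unsigned n)
{
    if (x >= 0)
        return x >> n;
    return ~(~x >> n);
}
```

With these helpers, shr_arithmetic(-128, 1) is -64 and shr_logical(-128, 1) is 2147483584, matching the two outcomes described in the answer.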
Right shift of a negative signed integer is implementation-defined. In most modern systems signed integers are two's complement and the compiler translates the shift to an arithmetic shift.
After the shift, the low eight bits of the result are 0xc0, which is -64 in the two's complement encoding.
val is first converted (promoted) to int and then passed to the function. If you put some effort into your question and add a few more lines to your code you would discover it yourself.
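#include <stdio.h>

int main(void)
{
    char c = 0x80;              /* out of range for a signed 8-bit char */

    /* c is promoted to int before the shift */
    printf("%d\n", c >> 1);
    printf("%x\n", c >> 1);
    /* hh converts the argument back to (un)signed char before printing */
    printf("%hhd\n", c >> 1);
    printf("%hhx\n", c >> 1);

    c >>= 1;                    /* store the shifted value back into the char */
    printf("%d\n", c);
    printf("%x\n", c);
    printf("%hhd\n", c);
    printf("%hhx\n", c);
}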
https://godbolt.org/z/YsaGos
You can also see that if the MSB is 0, an arithmetic shift behaves exactly like a logical shift, thus 0x40 >> 1 == 0x20.
I am going through 'The C language by K&R'. Right now I am doing the bitwise section. I am having a hard time in understanding the following code.
int mask = ~0 >> n;
I was playing on using this to mask n left side of another binary like this.
0000 1111
1010 0101 // random number
My problem is that when I print mask, it is still negative: -1. Assuming n is 4, I thought shifting ~0, which is -1, would give 15 (0000 1111).
thanks for the answers
Performing a right shift on a negative value yields an implementation-defined value. Most hosted implementations will shift in 1 bits on the left, as you've seen in your case; however, that doesn't necessarily have to be the case.
Unsigned types as well as nonnegative values of signed types always shift in 0 bits on the left when shifting right. So you can get the desired behavior by using unsigned values:
unsigned int mask = ~0u >> n;
This behavior is documented in section 6.5.7 of the C standard:
5 The result of E1 >> E2 is E1 right-shifted E2 bit positions. If E1 has an unsigned type or if E1 has a signed type and a nonnegative
value, the value of the result is the integral part of the quotient
of E1 / 2^E2. If E1 has a signed type and a negative value, the
resulting value is implementation-defined.
Right-shifting negative signed integers is an implementation-defined behavior, which is usually (but not always) filling the left with ones instead of zeros. That's why no matter how many bits you've shifted, it's always -1, as the left is always filled by ones.
When you shift unsigned integers, the left will always be filled by zeros. So you can do this:
unsigned int mask = ~0U >> n;
                      ^
You should also note that int is typically 2 or 4 bytes, meaning if you want to get 15, you need to right-shift 12 or 28 bits instead of only 4. You can use a char instead:
unsigned char mask = ~0U;
mask >>= 4;
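The point about the shift count depending on the width of the type can be sketched as follows. The helper name low_bits_mask is hypothetical, and the sketch assumes unsigned int has no padding bits and that k is between 1 and the width of unsigned int.

```c
#include <limits.h>

/* Hypothetical helper: mask with only the k lowest bits set.
   Shifts ~0u right by (width - k), e.g. by 28 for k == 4 on a 32-bit int. */
unsigned int low_bits_mask(unsigned int k)
{
    unsigned int width = (unsigned int)(sizeof(unsigned int) * CHAR_BIT);
    return ~0u >> (width - k);
}
```

For example, low_bits_mask(4) is 15 regardless of whether unsigned int is 16 or 32 bits wide.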
In C, and many other languages, >> is (usually) an arithmetic right shift when performed on signed variables (like int). This means that the new bit shifted in from the left is a copy of the previous most-significant bit (MSB). This has the effect of preserving the sign of a two's complement negative number (and in this case the value).
This is in contrast to a logical right shift, where the MSB is always replaced with a zero bit. This is applied when your variable is unsigned (e.g. unsigned int).
From Wikipedia:
The >> operator in C and C++ is not necessarily an arithmetic shift. Usually it is only an arithmetic shift if used with a signed integer type on its left-hand side. If it is used on an unsigned integer type instead, it will be a logical shift.
In your case, if you plan to be working at a bit level (i.e. using masks, etc.) I would strongly recommend two things:
Use unsigned values.
Use types with specific sizes from <stdint.h> like uint32_t
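Following those two recommendations, the mask from the question could be written with a fixed-width unsigned type. This is a sketch only: the helper names clear_top_mask8 and clear_top_bits8 are made up for illustration, and n is assumed to be between 0 and 8.

```c
#include <stdint.h>

/* Hypothetical helper: 8-bit mask whose n leftmost bits are 0. */
uint8_t clear_top_mask8(unsigned n)
{
    return (uint8_t)(UINT8_C(0xFF) >> n);
}

/* Hypothetical helper: clear the n leftmost bits of an 8-bit value. */
uint8_t clear_top_bits8(uint8_t value, unsigned n)
{
    return (uint8_t)(value & clear_top_mask8(n));
}
```

With n = 4 the mask is 0000 1111, and applying it to 1010 0101 leaves 0000 0101.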
For example:
int x = 65535; char y = x; printf("%d\n", y);
This will output -1. Any way to derive this by hand?
In order to derive this by hand you need to know several implementation-defined aspects of your system - namely
If char is signed or not
If the char is signed, what representation scheme is used
How does your system treat narrowing conversions of values that cannot be represented exactly in the narrow type.
Although the standard allows implementations to decide, a very common approach to narrowing conversions is to truncate the bits that do not fit in a narrow type. Assuming that this is the approach taken by your system, the first part of figuring out the output is to find the last eight bits of the int value being converted. In your case, 65535 is 1111111111111111 in binary, so the last eight bits are all ones.
Now you need to decide the interpretation of binary 11111111. On your system char is signed, and the system uses two's complement representation of negative values, so this pattern is interpreted as an eight-bit -1 value.
When you call printf, the eight-bit signed char value is promoted to int, which preserves the value, so -1 is printed.
On systems where char is unsigned by default the same pattern would be interpreted as 255.
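The by-hand derivation above can be written out as a small sketch. It assumes an 8-bit char, a truncating narrowing conversion, and two's complement, which are exactly the implementation-defined choices listed earlier; the function names are hypothetical.

```c
/* Hypothetical helper: what an unsigned 8-bit char would hold,
   i.e. the low eight bits of x. */
int as_unsigned_char(int x)
{
    return (int)((unsigned int)x & 0xFFu);
}

/* Hypothetical helper: the same eight bits read as two's complement.
   A set top bit (value >= 128) means the pattern stands for value - 256. */
int as_signed_char(int x)
{
    int low = as_unsigned_char(x);
    return low >= 128 ? low - 256 : low;
}
```

For 65535 this gives -1 in the signed case and 255 in the unsigned case.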
65535 is 0xffff
When converting to char, the left bits are left off:
0xffff AND 0xff is 0xff
When passing a char to a variadic function such as printf, it is promoted to int. As the left-most bit of the char is a 1, it will be sign-extended, so it will become 0xffffffff (32 bits).
This is -1, so -1 is printed.
As Dasblinkenlight points out, it matters if the char is signed or unsigned (whether as a default in the implementation or as the declaration unsigned char y). My last lines would read for unsigned char:
When passing an unsigned char to a variadic function, it is also promoted to int, but the value is preserved. As it is unsigned, just zeroes are added at the left, so it will become 0x000000ff (32 bits).
This is 255, so 255 is printed.
65535 when converted to binary is 1111111111111111.
When you assign this to a char variable, it gets trimmed to the least significant 8 bits, which are 11111111.
In 8-bit two's complement, this pattern represents -1. Taking the two's complement of a number (invert all bits, then add 1) gives its negative, hence for 00000001:
11111110 + 1 = 11111111, which is -1.
int x = 65535;
The value of 'x', 65535, is 0xffff in hexadecimal (binary 1111 1111 1111 1111), which is a 2-byte (16-bit) number.
Then the value of 'x' is converted to the char 'y'. A char is 1 byte (8 bits) here, so the left-side (high-order) bits of x are cut off when assigning to 'y', because it has only 1 byte of storage.
So the value of 'y' is 0xff (binary 1111 1111), which is equal to -1 in the signed (two's complement) integer representation.
Finally, we display the char value 'y' as an integer with '%d'.
Consider the following program
void main(){
char t = 179;
printf("%d ",t);
}
Output is -77.
But binary representation of 179 is
10110011
So, shouldn't the output be -51, considering the 1st bit is the sign bit?
Binary representation of -77 is
11001101
It seems the bit order is reversed. What's going on? Could somebody please advise?
You claimed that the binary representation of -77 is 11001101, but the reversal is yours. The binary representation of -77 is 10110011.
Binary 10110011 unsigned is decimal 179.
Binary 10110011 signed is decimal -77.
You assigned the out-of-range value 179 to a signed char. Strictly speaking, the result of that conversion is implementation-defined (or an implementation-defined signal is raised), but in practice it would be a very poor compiler that placed anything but that 8-bit value in the signed char.
But when printed, it's interpreted as a negative number because b7 is set.
Looks like char is a signed type on your system. The valid range for a char would be [-128, 127]
By using
char t = 179;
the compiler typically keeps the low eight bits of 179 and interprets them as two's complement (which most likely gives -77), and assigns that value to t.
To convert between a positive and a negative number in two's complement, you invert all the bits and then add 1.
10110011
01001100 (invert all bits)
01001101 (add one)
That is 77 decimal, so 10110011 represents -77.
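The invert-and-add-one step shown above can be sketched in code. The helper name negate8 is hypothetical, and the sketch works on the 8-bit pattern as an unsigned value so that every step is well defined.

```c
#include <stdint.h>

/* Hypothetical helper: two's complement negation of an 8-bit pattern.
   Invert all bits, add one, and keep only the low eight bits. */
uint8_t negate8(uint8_t bits)
{
    return (uint8_t)(~(unsigned int)bits + 1u);
}
```

Applied to 10110011 (0xB3) it yields 01001101 (0x4D, decimal 77), and applying it again returns the original pattern.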
Char could be signed or unsigned: that's up to your compiler.
If signed, it could use two's complement or ones' complement (or sign-magnitude).
Furthermore, it could be larger than 8 bits, although sizeof char is defined to be 1.
So it's inadvisable to rely on a specific representation.
Please note that on many systems char is signed. Hence, when you assign 179 (which is of type int) to a char, this value is outside the range of char, so the result is implementation-defined.
6.3.1.3 Signed and unsigned integers:
When a value with integer type is converted to another integer type other than _Bool, if the value can be represented by the new type, it is unchanged. Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type. Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.
If you change the type to unsigned char your program will perform correctly.
Also note that char, signed char and unsigned char are 3 distinct types, unlike int and signed int. The signedness of char is implementation defined.
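The two statements above can be checked with a small sketch, assuming a C11 compiler. The macro CHAR_KIND and the function plain_char_is_signed are illustrative names, not standard ones. The _Generic selection only compiles because the three character types are distinct; listing the same type twice would be an error.

```c
#include <limits.h>

/* Illustrative macro: yields 0, 1 or 2 depending on which of the three
   distinct character types the expression has, -1 for anything else. */
#define CHAR_KIND(x) _Generic((x), \
    char: 0,                       \
    signed char: 1,                \
    unsigned char: 2,              \
    default: -1)

/* Plain char is signed exactly when CHAR_MIN is negative. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}
```

Even on an implementation where plain char behaves like signed char, CHAR_KIND still tells the two apart.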
I understand that character variable holds from (signed)-128 to 127 and (unsigned)0 to 255
char x;
x = 128;
printf("%d\n", x);
But how does it work? Why do I get -128 for x?
printf is a variadic function, only providing an exact type for the first argument.
That means the default argument promotions are applied to the remaining arguments, so all integers of rank less than int are promoted to int or unsigned int, and float is promoted to double.
If your implementation has CHAR_BIT of 8, plain char is signed, and you have an obliging 2s-complement implementation, you thus get
128 (literal) to -128 (char/signed char) to -128 (int) printed as int => -128
If all the listed conditions except the obliging 2s-complement implementation are fulfilled, you get a signal or some implementation-defined value.
Otherwise you get output of 128, because 128 fits in char / unsigned char.
Standard quote for case 2 (Thanks to Matt for unearthing the right reference):
6.3.1.3 Signed and unsigned integers
1 When a value with integer type is converted to another integer type other than _Bool, if
the value can be represented by the new type, it is unchanged.
2 Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or
subtracting one more than the maximum value that can be represented in the new type
until the value is in the range of the new type.
3 Otherwise, the new type is signed and the value cannot be represented in it; either the
result is implementation-defined or an implementation-defined signal is raised.
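Paragraph 2 of the quote can be followed literally in code. This is a sketch for illustration only, with a hypothetical function name; a real program would simply write the cast, which the standard defines to give the same result.

```c
#include <limits.h>

/* Hypothetical helper: convert to unsigned char by repeatedly adding or
   subtracting one more than the maximum value (UCHAR_MAX + 1), as the
   quoted rule describes. */
unsigned char to_uchar_by_rule(long value)
{
    const long modulus = (long)UCHAR_MAX + 1;

    while (value < 0)
        value += modulus;
    while (value > UCHAR_MAX)
        value -= modulus;
    return (unsigned char)value;   /* now in range, so the value is unchanged */
}
```

For any input, the result equals a plain (unsigned char) cast of the same value.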
This all has nothing to do with variadic functions, default argument promotions etc.
Assuming your system has signed chars, then x = 128; is performing an out-of-range assignment. The behaviour of this is implementation-defined, meaning that the compiler may choose an action but it must document what it does (and therefore, do it reliably). This action is allowed to include raising a signal.
The usual behaviour of modern compilers for an out-of-range assignment is to truncate the representation of the value to fit in the destination type.
In binary representation, 128 is 000....00010000000.
Truncating this into a signed char gives the signed char of binary representation 10000000. In two's complement representation, which is used by all modern C systems for negative numbers, this is the representation of the value -128. (For historical curiosity: in ones' complement this is -127, and in sign-magnitude, this is -0, which may be a trap representation and thus raise a signal.)
Finally, printf accurately prints out this char's value of -128. The %d conversion specifier works for char because of the default argument promotions and the fact that INT_MIN <= CHAR_MIN and INT_MAX >= CHAR_MAX; this behaviour is guaranteed except on systems which have plain char as unsigned and sizeof(int)==1 (which do exist, but you'd know about it if you were on one).
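The three historical interpretations of the pattern 10000000 mentioned above can be compared with a sketch that decodes an 8-bit pattern by hand. The function names are hypothetical, and the code does plain arithmetic on the bits rather than relying on how the host machine represents negative numbers.

```c
/* Two's complement: a set top bit means the pattern stands for value - 256. */
int decode_twos_complement(unsigned int bits)
{
    int b = (int)(bits & 0xFFu);
    return b >= 128 ? b - 256 : b;
}

/* Ones' complement: a negative number is the bitwise inverse of its magnitude. */
int decode_ones_complement(unsigned int bits)
{
    int b = (int)(bits & 0xFFu);
    return b >= 128 ? -(int)(~bits & 0x7Fu) : b;
}

/* Sign-magnitude: the top bit is the sign, the low seven bits the magnitude. */
int decode_sign_magnitude(unsigned int bits)
{
    int b = (int)(bits & 0xFFu);
    int magnitude = b & 0x7F;
    return b >= 128 ? -magnitude : magnitude;
}
```

For 0x80 these give -128, -127 and 0 respectively; the last is the "negative zero" case, which an int cannot distinguish from ordinary zero.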
Let's look at the binary representation of 128 when stored into 8 bits:
1000 0000
And now let's look at the binary representation of -128 when stored into 8 bits:
1000 0000
The default for char in your current setup looks to be signed char (note that this is not required by the C standard). Thus, when you assign the value 128 to x, you're storing the bit pattern 1000 0000, and when you print it, you get the signed value of that binary representation (meaning -128).
It turns out my environment is the same in assuming char is actually signed char. As expected if I cast x to be an unsigned char then I get the expected output of 128:
#include <stdio.h>
#include <stdlib.h>
int main() {
char x;
x = 128;
printf("%d %d\n", x, (unsigned char)x);
return 0;
}
gives me the output of -128 128
Hope this helps!