For example:
unsigned int i = ~0;
Result: Max number I can assign to i
and
signed int y = ~0;
Result: -1
Why do I get -1? Shouldn't I get the maximum number that I can assign to y?
Both 4294967295 (a.k.a. UINT_MAX) and -1 have the same binary representation of 0xFFFFFFFF, or 32 bits all set to 1. This is because signed numbers are represented using two's complement. A negative number has its MSB (most significant bit) set to 1, and its value is determined by flipping all of the bits, adding 1 and multiplying by -1. So if you have the MSB set to 1 and the rest of the bits also set to 1, you flip them all (get 32 zeros), add 1 (get 1) and multiply by -1 to finally get -1.
This makes it easier for the CPU to do the math as it needs no special exceptions for negative numbers. For example, try adding 0xFFFFFFFF (-1) and 1. Since there is only room for 32 bits, this will overflow and the result will be 0 as expected.
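For instance, here is a minimal sketch of that wrap-around (assuming a 32-bit unsigned int):

#include <stdio.h>

int main(void) {
    unsigned int x = 0xFFFFFFFFu; /* same bit pattern as (int)-1 on a 32-bit int */
    printf("%u\n", x + 1u);       /* wraps modulo 2^32: prints 0 */
    return 0;
}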
See more at:
http://en.wikipedia.org/wiki/Two%27s_complement
unsigned int i = ~0;
Result: Max number I can assign to i
Usually, but not necessarily. The expression ~0 evaluates to an int with all (non-padding) bits set. The C standard allows three representations for signed integers,
two's complement, in which case ~0 = -1 and assigning that to an unsigned int results in (-1) + (UINT_MAX + 1) = UINT_MAX.
ones' complement, in which case ~0 is either a negative zero or a trap representation; if it's a negative zero, the assignment to an unsigned int results in 0.
sign-and-magnitude, in which case ~0 is INT_MIN == -INT_MAX, and assigning it to an unsigned int results in (UINT_MAX + 1) - INT_MAX. (The width of an integer type is its number of value bits, plus one for the sign bit in signed types.) That result is 1 in the unlikely case that the width of unsigned int is smaller than that of int, and 2^(WIDTH - 1) + 1 in the common case that the width of unsigned int is the same as the width of int.
The initialisation
unsigned int i = ~0u;
will always result in i holding the value UINT_MAX.
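A quick self-check of that guarantee (a minimal sketch using only standard headers):

#include <assert.h>
#include <limits.h>

int main(void) {
    unsigned int i = ~0u; /* ~ applied to an unsigned operand: all value bits set */
    assert(i == UINT_MAX);
    return 0;
}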
signed int y = ~0;
Result: -1
As stated above, only if the representation of signed integers uses two's complement (which nowadays is by far the most common representation).
~0 is just an int with all bits set to 1. When interpreted as unsigned this will be equivalent to UINT_MAX. When interpreted as signed this will be -1.
Assuming 32 bit ints:
0 = 0x00000000 = 0 (signed) = 0 (unsigned)
~0 = 0xffffffff = -1 (signed) = UINT_MAX (unsigned)
Paul's answer is absolutely right. Instead of using ~0, you can use:
#include <limits.h>
signed int y = INT_MAX;
unsigned int x = UINT_MAX;
And now if you check values:
printf("x = %u\ny = %d\n", UINT_MAX, INT_MAX);
you can see max values on your system.
No, because ~ is the bitwise NOT operator, not the maximum value for type operator. ~0 corresponds to an int with all bits set to 1, which, interpreted as an unsigned gives you the max number representable by an unsigned, and interpreted as a signed int, gives you -1.
You must be on a two's complement machine.
Look up http://en.wikipedia.org/wiki/Two%27s_complement and learn a little about Boolean algebra and logic design. Learning how to count in binary, and how addition and subtraction work in binary, will also explain this further.
The C language uses this form of numbers, so to find the largest signed value you use 0x7FFFFFFF (two hex digits per byte, each byte FF except the leftmost, which is 7F). To understand this you need to look up hexadecimal numbers and how they work.
Now to explain the unsigned equivalent. With signed numbers, half of the bit patterns represent negative values (and since 0 takes one of the non-negative patterns, the negative side actually reaches one value further than the positive side). Unsigned numbers are all non-negative. So in theory your highest number for a 32-bit int is 2^32, except that 0 is one of the values, so it's actually 2^32 - 1. For signed numbers, half of those patterns are negative: dividing 2^32 by 2 (since 32 is an exponent, that leaves 2^31 patterns on each side) and keeping 0 on the non-negative side gives a signed 32-bit int the range (-2^31, 2^31 - 1).
Now just comparing ranges:
unsigned 32 bit int: (0, 2^32-1)
signed 32 bit int: (-2^31, 2^31-1)
unsigned 16 bit int: (0, 2^16-1)
signed 16 bit int: (-2^15, 2^15-1)
you should be able to see the pattern here.
To explain the ~0 thing takes a bit more; it has to do with subtraction in binary. Negating a number is just flipping all the bits and then adding 1, and a subtraction is then performed by adding the negated value. C does this for you behind the scenes, and so do many processors (including the x86 and x64 lines of processors).
Because of this it's best to think of negative numbers as counting down from all-bits-set, and in two's complement the added 1 is hidden in the encoding. Because 0 takes one of the non-negative patterns, negative numbers can't reuse a pattern for 0, so the negative range reaches one value further than the positive range. When decoding negative numbers we have to account for this.
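As a sketch of that flip-then-add-one rule in C (assuming a two's complement machine, as essentially all current ones are):

#include <stdio.h>

int main(void) {
    int x = 5;
    int neg = ~x + 1;    /* flip all bits, then add 1 */
    printf("%d\n", neg); /* prints -5 on a two's complement machine */
    return 0;
}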
Related
I've been reading about bit fields in C for about 3-4 hours now and I can't understand how they work. For example, I can't understand why this program has the output: -1, 2, -3
#include <stdio.h>
struct REGISTER {
    int bit1 : 1;
    int      : 2;
    int bit3 : 4;
    int bit4 : 4;
};

int main(void) {
    struct REGISTER bit = { 1, 2, 13 };
    printf("%d, %d, %d\n", bit.bit1, bit.bit3, bit.bit4);
    return 0;
}
Can someone give me an explanation? I tend to think that if I use unsigned in the struct then the output would be positive. But I don't know where that -3 comes from.
Your compiler treats a bit-field declared with the plain type int as signed int.
Consider the binary representation of initializers (it is enough to consider one byte)
1 -> 0b00000001
2 -> 0b00000010
13 -> 0b00001101
So the first bit-field, having a width of 1, gets the bit 1. For an integer with one bit, that bit is the sign bit. So such a bit-field can represent only the two values 0 and -1 (in the two's complement representation).
The second initialized bit-field has the width equal to 4. So it stores the bit representation 0010 that is equal to 2.
The third initialized bit-field also has the width equal to 4. So its stored bit combination is equal to 1101. The most significant bit is the sign bit and it is set. So the bit-field contains a negative number. This number is equal to -3.
1101 ( = -3 )
+
0011 ( = 3 )
====
0000 ( = 0 )
It is implementation-defined whether a plain int bit-field is signed or unsigned, so relying on its sign is effectively an error in any program where you care about the value; if you care about the value, qualify the field explicitly as signed or unsigned.
Now, your compiler considers bitfields without specified signedness as signed. I.e. int bit4: 4 tells that that bit-field is 4 bits wide and signed.
13 cannot be represented in a signed 4-bit bit-field, as the maximum value is 7 no matter the representation of negative numbers (two's complement, ones' complement, sign-and-magnitude). An implementation-defined conversion now occurs: in your case the bit pattern 1101 is stored as-is in the two's complement signed bit-field, and it is read back as the two's complement negative value -3.
Same happens for 1-bit signed bitfield: the one bit is the sign bit, hence there are only 2 possible values: 0 and -1 on two's complement systems. On one's complement system, or sign-and-magnitude, one-bit bitfield does not make any sense at all because it can only store 0 or a trap value.
If you want it to be able to store value 13, you will have to use at least 5 bits or use unsigned int: 4.
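As a sketch of the 5-bit fix (the struct and field names here are made up for illustration; this assumes the compiler treats a plain int bit-field as signed, as in the question):

#include <stdio.h>

struct WIDE {
    int bit4 : 5; /* 5 bits: signed range -16..15, so 13 fits */
};

int main(void) {
    struct WIDE w = { 13 };
    printf("%d\n", w.bit4); /* prints 13 */
    return 0;
}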
Don't use plain signed int bit-fields in this situation; that is what leads to the negative results here. If we use unsigned int instead, the output does not come out negative.
What happened behind the scenes is that the value 13 was stored in a 4-bit signed integer, giving the pattern 1101. The MSB is 1, so it's a negative number, and you need to calculate the two's complement of the binary number to get its actual value, which is what is done internally. By calculating the two's complement you arrive at 0011, which is the decimal number 3, and since the stored number was negative you get -3.
#include <stdio.h>
struct REGISTER {
    unsigned int bit1 : 1;
    unsigned int      : 2;
    unsigned int bit3 : 4;
    unsigned int bit4 : 4;
};

int main(void) {
    struct REGISTER bit = { 1, 2, 13 };
    printf("%d, %d, %d\n", bit.bit1, bit.bit3, bit.bit4);
    return 0;
}
Here the output will be exactly 1, 2, 13.
It's pretty easy, you're initializing the bitfields as follows:
1 as bit length 1. That means the only bit is 1, and since the type is int that bit is the sign bit. So the resulting value is -0 - 1 = -1.
2 as bit length 4. MSB (sign bit) is 0, so the result is positive, ie 2.
13 as bit length 4. In binary that is 1101, so the MSB is 1 (negative), and the resulting value is -2 - 1 = -3.
You can read more about two's complement (the format in which numbers are stored on Intel architectures) here. The short version is that negative numbers have MSB = 1, and to calculate their value you take the bitwise NOT of the remaining bits (without the sign bit), negate it, and subtract one.
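If you want to do that decoding yourself, one common trick is to XOR away the sign bit and re-bias (a sketch; the helper name sext4 is made up for illustration):

#include <stdio.h>

/* Hypothetical helper: sign-extend the low 4 bits of a two's complement nibble. */
static int sext4(unsigned int nibble) {
    return (int)((nibble & 0xFu) ^ 0x8u) - 8; /* flip the sign bit, then re-bias */
}

int main(void) {
    printf("%d %d\n", sext4(0x2u), sext4(0xDu)); /* prints: 2 -3 */
    /* For the 1-bit field the same idea collapses to: bit set -> -1, clear -> 0. */
    return 0;
}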
I read about twos complement on wikipedia and on stack overflow, this is what I understood but I'm not sure if it's correct
signed int
the leftmost bit is interpreted as -2^31, and this is how we can have negative numbers
unsigned int
the leftmost bit is interpreted as +2^31, and this is how we achieve large positive numbers
update
What will the compiler see when we store 3 vs -3?
I thought 3 is always 00000000000000000000000000000011
and -3 is always 11111111111111111111111111111101
example for 3 vs -3 in C:
unsigned int x = -3;
int y = 3;
printf("%d %d\n", x, y); // -3 3
printf("%u %u\n", x, y); // 4294967293 3
printf("%x %x\n", x, y); // fffffffd 3
Two's complement is a way to represent negative integers in binary.
First of all, here are the standard 32-bit integer ranges:
Signed = -(2 ^ 31) to ((2 ^ 31) - 1)
Unsigned = 0 to ((2 ^ 32) - 1)
In two's complement, a negative is represented by inverting the bits of its positive equivalent and adding 1:
10 which is 00001010 becomes -10 which is 11110110 (if the numbers were 8-bit integers).
Also, the binary representation is only important if you plan on using bitwise operators.
If you're doing basic arithmetic, then this is unimportant.
The only other time this may give unexpected results is taking the absolute value of INT_MIN (-2^31): that value has no positive counterpart in a signed int, so the result typically wraps back around to a negative value.
Your problem does not have to do with the representation, but the type.
A negative number stored in an unsigned integer keeps the same bit representation; the difference is that it reads as a very large number, since the value must be non-negative and the high bit contributes +2^31 instead of acting as a sign.
You should also realize that ((2^32) - 5) is the exact same bit pattern as -5 once the value is converted to unsigned, etc.
Therefore, the following holds true:
unsigned int x = UINT_MAX - 4; /* (2^32) - 5; UINT_MAX is from <limits.h> */
unsigned int y = -5;           /* -5 converted modulo 2^32: also (2^32) - 5 */
if (x == y) {
    printf("Negative values wrap around in unsigned integers on underflow.");
}
else {
    printf("Unsigned integer underflow is undefined!");
}
The numbers don't change, just the interpretation of the numbers. For most two's complement processors, add and subtract do the same math, but set a carry / borrow status assuming the numbers are unsigned, and an overflow status assuming the number are signed. For multiply and divide, the result may be different between signed and unsigned numbers (if one or both numbers are negative), so there are separate signed and unsigned versions of multiply and divide.
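A small sketch of why division needs separate signed and unsigned forms (assuming a 32-bit int): the same bit pattern divided by 2 gives different answers.

#include <stdio.h>

int main(void) {
    int a = -2;
    unsigned int b = (unsigned int)a; /* same bit pattern: 0xFFFFFFFE */
    printf("%d\n", a / 2);            /* signed divide:   -1 */
    printf("%u\n", b / 2u);           /* unsigned divide: 2147483647 */
    return 0;
}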
For 32-bit integers, for both signed and unsigned numbers, the n-th bit can be read as contributing +2^n.
For signed numbers with the 31st bit set, the result is then adjusted by -2^32.
Example:
1111 1111 1111 1111 1111 1111 1111 1111 (binary) as an unsigned int is interpreted as 2^31 + 2^30 + ... + 2^1 + 2^0. The interpretation of this as a signed int would be the same MINUS 2^32, i.e. 2^31 + 2^30 + ... + 2^1 + 2^0 - 2^32 = -1.
(Well, it can be said that for signed numbers with the 31st bit set, this bit is interpreted as -2^31 instead of +2^31, like you said in the question. I find this way a little less clear.)
Your representation of 3 and -3 is correct: 3 = 0x00000003, and -3 + 2^32 = 0xFFFFFFFD.
Yes, you are correct; allow me to explain a bit further for clarification purposes.
The difference between int and unsigned int is how the bits are interpreted. The machine processes the bits of signed and unsigned values the same way; for a signed type, the most significant bit simply serves as the sign bit. Two's complement notation is very readable once you are used to it.
Example:
The two's complement of 5 (0101) is 1011: invert the bits to get 1010, then add 1.
In C (and C++), which data type to use depends on the situation: use unsigned values when a quantity can never be negative, or when functions or operators return such values. ALUs handle signed and unsigned variables very similarly.
The exact rules for writing a number in two's complement are as follows:
If the number is positive, write it in plain binary; values count up to 2^(32-1) - 1.
If it is 0, use all zeroes.
For negatives, flip all the 1's and 0's of the positive value and add 1.
Example 2 (the beauty of two's complement):
-2 + 2 = 0 is computed as 0010 + 1110, which is 10000; the carry out of the top bit overflows away, leaving our result as 0000.
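Here is that 4-bit addition in C, masking to 4 bits to model the discarded carry (a minimal sketch):

#include <stdio.h>

int main(void) {
    unsigned int neg2 = 0xEu;             /* 1110: -2 in 4-bit two's complement */
    unsigned int pos2 = 0x2u;             /* 0010: +2 */
    printf("%u\n", (neg2 + pos2) & 0xFu); /* carry discarded: prints 0 */
    return 0;
}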
Why is the output of this program
-1, 2, -3? (Especially the -3; how does that come about?)
struct b1 {
    int a : 1;
    int b : 3;
    int c : 4;
};

int main()
{
    struct b1 c = { 1, 2, 13 };
    printf("%d, %d, %d", c.a, c.b, c.c);
    return 0;
}
Compiled with the 32-bit VC++ compiler. Many thanks.
Signed integers are represented in two's complement. The range of a 1-bit two's complement number is -1 to 0: its only bit is the sign bit, and that is the bit you have set there.
See here:
sign extend 1-bit 2's complement number?
The same applies to the third number: 13 overflows the range -8 to 7, which is the range of a 4-bit signed integer.
What you probably meant to do is declare all of those fields as unsigned int.
See here for twos complement explanation:
http://www.ele.uri.edu/courses/ele447/proj_pages/divid/twos.html
Because a and c are signed integers with their sign bit set, they are negative:
a: 1d = 1b (1 bit)
b: 2d = 010b (3 bits)
c: 13d = 1101b (4 bits)
Signed integer values are stored in two's complement, with the highest bit representing the sign (1 means "negative"). Hence, when you read the bitfield values as signed int's, you get back negative values for a and c (sign extended and converted back from their two's complement representation), but a positive 2 for b.
To get the absolute value of a negative two's complement number, you subtract one and invert the result:
1101b
-0001b
======
1100b
1100b inverted becomes 0011b, which is equal to 3d. Since the sign bit was set (which you had to examine before doing the previous calculation anyway), the result is -3d.
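Mechanically, that subtract-then-invert decoding can be sketched like this for a 4-bit pattern (variable names are just for illustration):

#include <stdio.h>

int main(void) {
    unsigned int bits = 0xDu;                /* 1101b, the stored pattern */
    unsigned int mag  = ~(bits - 1u) & 0xFu; /* subtract one, invert, keep 4 bits */
    printf("-%u\n", mag);                    /* prints -3 */
    return 0;
}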
What does signed mean in C? I have this table to show:
This says signed char 128 to +127. 128 is also a positive integer, so how can this be something like +128 to +127? Or do 128 and +127 have different meanings? I am referring to the book Apress Beginning C.
A signed integer can represent negative numbers; unsigned cannot.
Signed integers have undefined behavior if they overflow, while unsigned integers wrap around using modulo.
Note that that table is incorrect. First off, it's missing the - signs (such as -128 to +127). Second, the standard does not guarantee that those types must fall within those ranges.
By default, numerical values in C are signed, which means they can be both negative and positive. Unsigned values on the other hand, don't allow negative numbers.
Because it's all just about memory, in the end all numerical values are stored in binary. A 32-bit unsigned integer can hold values from all binary 0s to all binary 1s. With a 32-bit signed integer, one of its bits (the most significant) is a flag which marks the value as positive or negative. So it's a matter of interpretation which tells whether a value is signed.
Positive signed values are stored the same way as unsigned values, but negative numbers are stored using two's complement method.
If you want to write a negative value in binary, first write the positive number, then invert all the bits, and finally add 1. When a negative value in two's complement is added to a positive number of the same magnitude, the result will be 0.
In the example below, let's deal with 8-bit numbers, because they'll be simple to inspect:
positive 95: 01011111
negative 95: invert 01011111 to get 10100000, then add 1 to get 10100001 (which reads as 161 if taken as unsigned)
their sum:   01011111 + 10100001 = 100000000; as we're dealing with 8-bit numbers, the ninth (carry) bit is dropped, which results in 0
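The same arithmetic can be checked in C using an 8-bit type, so the carry out of bit 7 is discarded automatically (a minimal sketch):

#include <stdio.h>
#include <stdint.h>

int main(void) {
    uint8_t pos = 0x5F;                /* 01011111 = 95 */
    uint8_t neg = (uint8_t)(~pos + 1); /* 10100001 = -95 (reads as 161 unsigned) */
    printf("%u\n", (unsigned)(uint8_t)(pos + neg)); /* ninth bit discarded: prints 0 */
    return 0;
}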
The table is missing the minuses. The range of signed char is -128 to +127; likewise for the other types on the table.
It was a typo in the book; signed char goes from -128 to 127.
Signed integers are stored using the two's complement representation, in which the first bit is used to indicate the sign.
In C, chars are just 8-bit integers. This means that they can go from -(2^7) to 2^7 - 1. That's because we use the last 7 bits for the number and the first bit for the sign: 0 means positive and 1 means negative (in two's complement representation).
The biggest positive number is (01111111)b = 2^7 - 1 = 127.
The smallest negative number is (10000000)b = -(2^7) = -128.
Unsigned chars don't have a sign bit, so they can use all 8 bits, going from (00000000)b = 0 to (11111111)b = 255.
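You can confirm the ranges on your own implementation with <limits.h> (a sketch; the values in the comments are the typical 8-bit ones):

#include <stdio.h>
#include <limits.h>

int main(void) {
    printf("signed char:   %d to %d\n", SCHAR_MIN, SCHAR_MAX); /* typically -128 to 127 */
    printf("unsigned char: 0 to %u\n", (unsigned)UCHAR_MAX);   /* typically 255 */
    return 0;
}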
Signed numbers are those that can have either a + or a - in front of them.
E.g. +2 and -6 are signed numbers.
Signed numbers can store both positive and negative values, so their range extends below zero,
i.e. -32768 to 32767 for a 16-bit int.
Unsigned numbers are simply numbers with no sign; they are always non-negative, and their range (for 16 bits) is 0 to 65535.
Signed usually means the number can carry a + or - sign in front of it, i.e. it can be negative. This means that unsigned int, unsigned short, etc. cannot be negative.
Nobody mentioned this, but the range of int in the table is also wrong:
it is
-2^(31) to 2^(31)-1
i.e.,
-2,147,483,648 to 2,147,483,647
A signed integer can have both negative and positive values, while an unsigned integer can only have non-negative values.
For signed integers using two's complement, which is most commonly used, the range is (depending on the bit width of the integer):
signed char s -> range -128 to 127
whereas an unsigned char has the range:
unsigned char s -> range 0 to 255
First, your table is wrong... the negative numbers are missing. Referring to the type char: you can represent 256 possibilities in all, since char is one byte, i.e. 2^8. So you have two alternatives for the range: either -128 to +127, or 0 to 255. The first is a signed char, the second an unsigned char. If you are using integers, be aware of the platform you are on (16-bit, 32-bit or 64-bit), since the width of int can vary with it; char is always just an 8-bit value.
It means that there may be a sign (a + or - symbol) in front of your value (+12345 or -12345).
int v;
int sign; // the sign of v ;
sign = -(int)((unsigned int)((int)v) >> (sizeof(int) * CHAR_BIT - 1));
Q1: Since v is already defined with type int, why bother to cast it to int again? Is it related to portability?
Edit:
Q2:
sign = v >> (sizeof(int) * CHAR_BIT - 1);
this snippet isn't portable, since the right shift of a (negative) signed int is implementation-defined: how the vacated high bits are filled is up to the compiler. So
-(int)((unsigned int)((int)v) >> (sizeof(int) * CHAR_BIT - 1))
does the portable trick. Please explain why this works.
Isn't a right shift of an unsigned int always guaranteed to fill the vacated high bits with 0?
It's not strictly portable, since it is theoretically possible that int and/or unsigned int have padding bits.
In a hypothetical implementation where unsigned int has padding bits, shifting right by sizeof(int)*CHAR_BIT - 1 would produce undefined behaviour since then
sizeof(int)*CHAR_BIT - 1 >= WIDTH
But for all implementations where unsigned int has no padding bits - and as far as I know that means all existing implementations - the code
int v;
int sign; // the sign of v ;
sign = -(int)((unsigned int)((int)v) >> (sizeof(int) * CHAR_BIT - 1));
must set sign to -1 if v < 0 and to 0 if v >= 0. (Note - thanks to Sander De Dycker for pointing it out - that if int has a negative zero, that would also produce sign = 0, since -0 == 0. If the implementation supports negative zeros and the sign for a negative zero should be -1, neither this shifting, nor the comparison v < 0 would produce that, a direct inspection of the object representation would be required.)
The cast to int before the cast to unsigned int before the shift is entirely superfluous and does nothing.
It is - disregarding the hypothetical padding bits problem - portable because the conversion to unsigned integer types and the representation of unsigned integer types is prescribed by the standard.
Conversion to an unsigned integer type is reduction modulo 2^WIDTH, where WIDTH is the number of value bits in the type, so that the result lies in the range 0 to 2^WIDTH - 1 inclusive.
Since without padding bits in unsigned int the size of the range of int cannot be larger than that of unsigned int, and the standard mandates (6.2.6.2) that signed integers are represented in one of
sign and magnitude
ones' complement
two's complement
the smallest possible representable int value is -2^(WIDTH-1). So a negative int of value -k is converted to 2^WIDTH - k >= 2^(WIDTH-1) and thus has the most significant bit set.
A non-negative int value, on the other hand cannot be larger than 2^(WIDTH-1) - 1 and hence its value will be preserved by the conversion and the most significant bit will not be set.
So when the result of the conversion is shifted by WIDTH - 1 bits to the right (again, we assume no padding bits in unsigned int, hence WIDTH == sizeof(int)*CHAR_BIT), it will produce a 0 if the int value was non-negative, and a 1 if it was negative.
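As a quick check of that conclusion, here is a sketch that runs the shift expression over a few sample values (assuming, as above, no padding bits):

#include <stdio.h>
#include <limits.h>

int main(void) {
    int tests[] = { INT_MIN, -42, -1, 0, 1, INT_MAX };
    for (size_t k = 0; k < sizeof tests / sizeof tests[0]; ++k) {
        int v = tests[k];
        int sign = -(int)((unsigned int)v >> (sizeof(int) * CHAR_BIT - 1));
        printf("v = %11d  sign = %d\n", v, sign); /* -1 for negatives, 0 otherwise */
    }
    return 0;
}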
It should be quite portable because when you convert int to unsigned int (via a cast), you receive a value that is 2's complement bit representation of the value of the original int, with the most significant bit being the sign bit.
UPDATE: A more detailed explanation...
I'm assuming there are no padding bits in int and unsigned int, and that all bits in the two types are used to represent integer values. That's a reasonable assumption for modern hardware; padding bits are a thing of the past that the current and recent C standards still carry around for backward compatibility (i.e. to be able to run code on old machines).
With that assumption, if int and unsigned int have N bits in them (N = CHAR_BIT * sizeof(int)), then per the C standard we have 3 options to represent int, which is a signed type:
sign-and-magnitude representation, allowing values from -(2^(N-1) - 1) to 2^(N-1) - 1
one's complement representation, also allowing values from -(2^(N-1) - 1) to 2^(N-1) - 1
two's complement representation, allowing values from -2^(N-1) to 2^(N-1) - 1 or, possibly, from -(2^(N-1) - 1) to 2^(N-1) - 1
The sign-and-magnitude and one's complement representations are also a thing of the past, but let's not throw them out just yet.
When we convert int to unsigned int, the rule is that a non-negative value v (>= 0) doesn't change, while a negative value v (< 0) changes to the positive value 2^N + v, hence (unsigned int)-1 = UINT_MAX.
Therefore, (unsigned int)v for a non-negative v will always be in the range from 0 to 2^(N-1) - 1, and the most significant bit of (unsigned int)v will be 0.
Now, for a negative v in the range from -2^(N-1) to -1 (this range is a superset of the negative ranges for the three possible representations of int), (unsigned int)v will be in the range from 2^N + (-2^(N-1)) to 2^N + (-1), which simplifies to the range from 2^(N-1) to 2^N - 1. Clearly, the most significant bit of this value will always be 1.
If you look carefully at all this math, you will see that the value of (unsigned)v looks exactly the same in binary as v in 2's complement representation:
...
v = -2: (unsigned)v = 2^N - 2 = 111...110 in binary
v = -1: (unsigned)v = 2^N - 1 = 111...111 in binary
v = 0:  (unsigned)v = 0       = 000...000 in binary
v = 1:  (unsigned)v = 1       = 000...001 in binary
...
So, there, the most significant bit of the value (unsigned)v is going to be 0 for v>=0 and 1 for v<0.
Now, let's get back to the sign-and-magnitude and one's complement representations. These two representations may allow two zeroes, a +0 and a -0. But arithmetic computations do not visibly distinguish between +0 and -0; it's still a 0, whether you add it, subtract it, multiply it or compare it. You, as an observer, normally wouldn't see +0 or -0 or any difference from having one or the other.
Trying to observe and distinguish +0 and -0 is generally pointless and you should not normally expect or rely on the presence of two zeroes if you want to make your code portable.
(unsigned int)v won't tell you the difference between v=+0 and v=-0, in both cases (unsigned int)v will be equivalent to 0u.
So, with this method you won't be able to tell whether internally v is a -0 or a +0, you won't extract v's sign bit this way for v=-0.
But again, you gain nothing of practical value from differentiating between the two zeroes and you don't want this differentiation in portable code.
So, with this I dare to declare the method for sign extraction presented in the question quite/very/pretty-much/etc portable in practice.
This method is overkill, though. And the (int)v in the original code is unnecessary, since v is already an int.
This should be more than enough and easy to comprehend:
int sign = -(v < 0);
Nope, it's just excessive casting. There is no need to cast it to an int. It doesn't hurt, however.
Edit: It's worth noting that it may have been written that way so that the type of v can be changed to something else, or because v once had another data type and the cast was never removed after it became an int.
It isn't, strictly speaking. The Standard does not fully pin down the representation of integers (padding bits are permitted, for one), so it's impossible to guarantee exactly what the result of this will be on every conforming implementation. The only fully portable way to get the sign of an integer is to do a comparison.