As has been said again and again, negative numbers are represented in 2's complement, while unsigned numbers don't reserve a bit for the sign. An integer can be either signed or unsigned. How does the computer figure out which encoding scheme applies to a given integer?
Some operations (such as addition) work identically on both signed and unsigned integers.
But that's not the case for all operations. When right-shifting, we shift in zeroes for unsigned integers, and we shift in the sign bit for signed integers.
In these cases, the processor provides the means to achieve both operations. It's possible for the processor to offer two different instructions, or two variations of one.
But whatever the case, there is no decision making on the processor's part. The processor just executes the instructions selected by the compiler. It's up to the compiler to emit instructions that achieve the desired result based on the type of the values involved.
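As a minimal sketch of the two right-shift flavours, the helpers below work on raw 32-bit patterns. The names logical_shr and arithmetic_shr are made up for this example, and the arithmetic version is emulated with unsigned operations only, because right-shifting a negative signed value is implementation-defined in C.

```c
#include <assert.h>
#include <stdint.h>

/* Logical right shift: zeros are shifted in from the left.
   This is what >> does for unsigned operands. Requires n < 32. */
uint32_t logical_shr(uint32_t bits, unsigned n)
{
    assert(n < 32);
    return bits >> n;
}

/* Arithmetic right shift of a 32-bit pattern: copies of the sign bit
   (bit 31) are shifted in. Emulated with unsigned operations only,
   so the result does not depend on the implementation. Requires n < 32. */
uint32_t arithmetic_shr(uint32_t bits, unsigned n)
{
    assert(n < 32);
    uint32_t shifted = bits >> n;
    if (bits & UINT32_C(0x80000000)) {
        /* Set the n vacated high bits to 1. */
        shifted |= ~(UINT32_MAX >> n);
    }
    return shifted;
}
```

On x86, for instance, the two operations correspond to the shr and sar instructions; the compiler picks one or the other from the declared type of the operand.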
In C bitwise left shift operation invokes Undefined Behaviour when the left side operand has negative value.
Relevant quote from ISO C99 (6.5.7/4)
The result of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are filled with zeros. If E1 has an unsigned type, the value of the result is E1 × 2^E2, reduced modulo one more than the maximum value representable in the result type. If E1 has a signed type and nonnegative value, and E1 × 2^E2 is representable in the result type, then that is the resulting value; otherwise, the behavior is undefined.
But in C++ the behaviour is well defined.
ISO C++-03 (5.8/2)
The value of E1 << E2 is E1 (interpreted as a bit pattern) left-shifted E2 bit positions; vacated bits are zero-filled. If E1 has an unsigned type, the value of the result is E1 multiplied by the quantity 2 raised to the power E2, reduced modulo ULONG_MAX+1 if E1 has type unsigned long, UINT_MAX+1 otherwise.
[Note: the constants ULONG_MAX and UINT_MAX are defined in the header <climits>). ]
That means
int a = -1, b=2, c;
c= a << b ;
invokes Undefined Behaviour in C but the behaviour is well defined in C++.
What forced the ISO C++ committee to consider that behaviour well defined as opposed to the behaviour in C?
On the other hand the behaviour is implementation defined for bitwise right shift operation when the left operand is negative, right?
My question is why does left shift operation invoke Undefined Behaviour in C and why does right shift operator invoke just Implementation defined behaviour?
P.S : Please don't give answers like "It is undefined behaviour because the Standard says so". :P
The paragraph you copied is talking about unsigned types. The behavior is undefined in C++. From the last C++0x draft:
The value of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are zero-filled. If E1 has an unsigned type, the value of the result is E1 × 2^E2, reduced modulo one more than the maximum value representable in the result type. Otherwise, if E1 has a signed type and non-negative value, and E1 × 2^E2 is representable in the result type, then that is the resulting value; otherwise, the behavior is undefined.
EDIT: I had a look at the C++98 standard. It just doesn't mention signed types at all, so it's still undefined behavior.
Right-shifting a negative value is implementation-defined, right. Why? In my opinion, it's easy to make implementation-defined because no bits are truncated on the left. When you shift left, you must say not only what's shifted in from the right but also what happens to the rest of the bits, e.g. with two's complement representation, which is another story.
In C bitwise left shift operation invokes Undefined Behaviour when the
left side operand has negative value.
[...]
But in C++ the behaviour is well defined.
[...] why [...]
The easy answer is: Because the standards say so.
A longer answer is: It has probably something to do with the fact that C and C++ both allow other representations for negative numbers besides 2's complement. Giving fewer guarantees on what's going to happen makes it possible to use the languages on other hardware including obscure and/or old machines.
For some reason, the C++ standardization committee felt like adding a little guarantee about how the bit representation changes. But since negative numbers still may be represented via 1's complement or sign+magnitude the resulting value possibilities still vary.
Assuming 16 bit ints, we'll have
-1 = 1111111111111111 // 2's complement
-1 = 1111111111111110 // 1's complement
-1 = 1000000000000001 // sign+magnitude
Shifted to the left by 3, we'll get
-8 = 1111111111111000 // 2's complement
-15 = 1111111111110000 // 1's complement
8 = 0000000000001000 // sign+magnitude
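The table above can be reproduced with a small sketch that shifts the raw 16-bit pattern and then decodes it under each representation. The function names are invented for this illustration; nothing here depends on how the host machine represents negative numbers, because all bit manipulation is done on unsigned values.

```c
#include <assert.h>
#include <stdint.h>

/* Interpret a 16-bit pattern as a 2's complement value. */
int32_t decode_twos_complement(uint16_t bits)
{
    return (bits & 0x8000u) ? (int32_t)bits - 65536 : (int32_t)bits;
}

/* Interpret a 16-bit pattern as a 1's complement value:
   a negative number is the bitwise inverse of its magnitude. */
int32_t decode_ones_complement(uint16_t bits)
{
    return (bits & 0x8000u) ? -(int32_t)(uint16_t)~bits : (int32_t)bits;
}

/* Interpret a 16-bit pattern as sign + magnitude:
   bit 15 is the sign, bits 0..14 are the magnitude. */
int32_t decode_sign_magnitude(uint16_t bits)
{
    int32_t magnitude = bits & 0x7FFFu;
    return (bits & 0x8000u) ? -magnitude : magnitude;
}

/* Left shift of the raw 16-bit pattern; bits shifted out of the top
   are lost and zeros are shifted in. Requires n < 16. */
uint16_t shift_left_16(uint16_t bits, unsigned n)
{
    assert(n < 16);
    return (uint16_t)((uint32_t)bits << n);
}
```

Decoding shift_left_16(0xFFFF, 3), shift_left_16(0xFFFE, 3) and shift_left_16(0x8001, 3) with the matching decoder gives -8, -15 and 8 respectively, as listed above.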
What forced the ISO C++ committee to consider that behaviour well
defined as opposed to the behaviour in C?
I guess they made this guarantee so that you can use << appropriately when you know what you're doing (i.e. when you're sure your machine uses 2's complement).
On the other hand the behaviour is implementation defined for bitwise
right shift operation when the left operand is negative, right?
I'd have to check the standard, but you may be right. A right shift without sign extension on a 2's complement machine isn't particularly useful. So the current state is definitely better than requiring vacated bits to be zero-filled, because it leaves room for machines that do sign extension, even though sign extension is not guaranteed.
To answer your real question as stated in the title: as for any operation on a signed type, this has undefined behavior if the result of the mathematical operation doesn't fit in the target type (under- or overflow). Signed integer types are designed like that.
For the left shift operation, if the value is positive or 0, the definition of the operator as a multiplication by a power of 2 makes sense, so everything is fine unless the result overflows. Nothing surprising there.
If the value is negative, you could have the same interpretation of multiplication with a power of 2, but if you just think in terms of bit shift, this would be perhaps surprising. Obviously the standards committee wanted to avoid such ambiguity.
My conclusion:
- if you want to do real bit pattern operations, use unsigned types
- if you want to multiply a value (signed or not) by a power of two, do just that, something like
i * (1u << k)
your compiler will transform this into decent assembler in any case.
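One detail worth checking in i * (1u << k): when i is a signed int, the usual arithmetic conversions convert it to unsigned before the multiplication, so a negative i yields a large unsigned product. A minimal sketch of a signed variant follows. The helper name mul_pow2 is hypothetical, and it assumes int has no padding bits so that sizeof(int) * CHAR_BIT is its width.

```c
#include <limits.h>
#include <stdbool.h>

/* Multiply a signed value by 2^k without ever shifting a negative
   operand. Returns false (leaving *out untouched) if k is too large
   or if the product does not fit in an int.
   Assumption: int has no padding bits. */
bool mul_pow2(int value, unsigned k, int *out)
{
    if (k >= sizeof(int) * CHAR_BIT - 1) {
        return false;               /* 1 << k would not be a valid int */
    }
    int factor = 1 << k;            /* positive and representable */
    if (value > INT_MAX / factor || value < INT_MIN / factor) {
        return false;               /* product would overflow */
    }
    *out = value * factor;
    return true;
}
```

The range check is conservative: it rejects every k for which 1 << k itself is not representable, even in the rare case where the product would still fit.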
A lot of these things are a balance between what common CPUs can actually support in a single instruction and what's useful enough to expect compiler writers to guarantee even if it takes extra instructions. Generally, a programmer using bit-shifting operators expects them to map to single instructions on CPUs that have such instructions. That's why there's undefined or implementation-defined behaviour where CPUs handle "edge" conditions differently, rather than a mandated behaviour that would make the operation unexpectedly slow. Keep in mind that the additional pre/post-processing or handling instructions might have to be emitted even for the simpler use cases.
Undefined behaviour may have been necessary where some CPUs generated traps/exceptions/interrupts (as distinct from C++ try/catch-type exceptions) or generally useless/inexplicable results, whereas if all the CPUs considered by the Standards Committee at the time provided at least some defined behaviour, the committee could make the behaviour implementation-defined.
My question is why does left shift operation invoke Undefined Behaviour in C and why does right shift operator invoke just Implementation defined behaviour?
The folks at LLVM speculate the shift operator has constraints because of the way the instruction is implemented on various platforms. From What Every C Programmer Should Know About Undefined Behavior #1/3:
... My guess is that this originated because the underlying shift operations on various CPUs do different things with this: for example, X86 truncates 32-bit shift amount to 5 bits (so a shift by 32-bits is the same as a shift by 0-bits), but PowerPC truncates 32-bit shift amounts to 6 bits (so a shift by 32 produces zero). Because of these hardware differences, the behavior is completely undefined by C...
Note that the discussion was about shifting by an amount at least as large as the register size. But it's the closest I've found to an explanation of the shift constraints from an authority.
I think a second reason is the potential sign change on a 2's complement machine. But I've never read it anywhere (no offense to @sellibitze (and I happen to agree with him)).
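The two hardware behaviours described in the quote can be modelled in portable C. This is only a sketch of the count handling as the quote describes it, not a statement about what any particular compiler emits; the function names are invented for the example.

```c
#include <stdint.h>

/* Model of a 32-bit shifter that keeps only the low 5 bits of the
   count, as the quote describes for x86: a count of 32 acts like 0. */
uint32_t shl_count_5_bits(uint32_t value, unsigned count)
{
    return value << (count & 31u);
}

/* Model of a 32-bit shifter that keeps the low 6 bits of the count,
   as the quote describes for PowerPC: counts 32..63 produce zero. */
uint32_t shl_count_6_bits(uint32_t value, unsigned count)
{
    count &= 63u;
    return count >= 32u ? 0u : value << count;
}
```

Both models agree for counts 0 to 31 and diverge from 32 upwards, which is exactly the range the C standard leaves undefined.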
In C89, the behavior of left-shifting negative values was unambiguously defined on two's-complement platforms which did not use padding bits on signed and unsigned integer types. The value bits that signed and unsigned types had in common had to be in the same places, and the only place the sign bit for a signed type could go was in the same place as the upper value bit for unsigned types, which in turn had to be to the left of everything else.
The C89 mandated behaviors were useful and sensible for two's-complement platforms without padding, at least in cases where treating them as multiplication would not cause overflow. The behavior may not have been optimal on other platforms, or on implementations that seek to reliably trap signed integer overflow. The authors of C99 probably wanted to allow implementations flexibility in cases where the C89 mandated behavior would have been less than ideal, but nothing in the rationale suggests an intention that quality implementations shouldn't continue to behave in the old fashion in cases where there was no compelling reason to do otherwise.
Unfortunately, even though there have never been any implementations of C99 that don't use two's-complement math, the authors of C11 declined to define the common-case (non-overflow) behavior; IIRC, the claim was that doing so would impede "optimization". Having the left-shift operator invoke Undefined Behavior when the left-hand operand is negative allows compilers to assume that the shift will only be reachable when the left-hand operand is non-negative.
I'm dubious as to how often such optimizations are genuinely useful, but the rarity of such usefulness actually weighs in favor of leaving the behavior undefined. If the only situations where two's-complement implementations wouldn't behave in commonplace fashion are those where the optimization would actually be useful, and if no such situations actually exist, then implementations would behave in commonplace fashion with or without a mandate, and there's no need to mandate the behavior.
The behavior in C++03 is the same as in C++11 and C99, you just need to look beyond the rule for left-shift.
Section 5p5 of the Standard says that:
If during the evaluation of an expression, the result is not mathematically defined or not in the range of representable values for its type, the behavior is undefined
The left-shift expressions which are specifically called out in C99 and C++11 as being undefined behavior, are the same ones that evaluate to a result outside the range of representable values.
In fact, the sentence about unsigned types using modular arithmetic is there specifically to avoid generating values outside the representable range, which would automatically be undefined behavior.
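To make the two rules concrete, here is a minimal sketch under the assumption of a 32-bit unsigned type and an int without padding bits. The first helper computes E1 × 2^E2 reduced modulo 2^32 in wider arithmetic; the second reports whether a signed left shift stays within the representable range. Both names are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* E1 * 2^E2 reduced modulo 2^32, computed in 64-bit arithmetic so the
   intermediate product cannot wrap. Requires e2 < 32. */
uint32_t mul_pow2_mod32(uint32_t e1, unsigned e2)
{
    assert(e2 < 32);
    uint64_t product = (uint64_t)e1 * (UINT64_C(1) << e2);
    return (uint32_t)(product % (UINT64_C(1) << 32));
}

/* True when e1 << e2 has a defined value for int: e1 is non-negative,
   the count is below the width, and e1 * 2^e2 fits in an int.
   Assumption: int has no padding bits. */
bool signed_shl_is_defined(int e1, unsigned e2)
{
    if (e1 < 0 || e2 >= sizeof(int) * CHAR_BIT) {
        return false;
    }
    return e1 <= (INT_MAX >> e2);
}
```

For unsigned operands, mul_pow2_mod32(x, n) and x << n agree for every n below 32, which is the modular rule the standard text spells out.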
The result of shifting depends upon the numeric representation. Shifting behaves like multiplication only when numbers are represented as two's complement. But the problem is not exclusive to negative numbers. Consider a 4-bit signed number represented in excess-8 (aka offset binary). The number 1 is represented as 1+8 or
1001
If we left shift this as bits, we get
0010
which is the representation for -6. Similarly, -1 is represented as -1+8
0111
which becomes
1110
when left-shifted, the representation for +6. The bitwise behavior is well-defined, but the numeric behavior is highly dependent on the system of representation.
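The excess-8 example can be checked with a few lines of C. This is a sketch with made-up helper names; patterns are kept in the low 4 bits of an unsigned value.

```c
#include <assert.h>

/* Encode a value in the range -8..7 as a 4-bit excess-8 pattern. */
unsigned encode_excess8(int value)
{
    assert(value >= -8 && value <= 7);
    return (unsigned)(value + 8) & 0xFu;
}

/* Decode a 4-bit excess-8 pattern back to its numeric value. */
int decode_excess8(unsigned bits)
{
    return (int)(bits & 0xFu) - 8;
}

/* Left shift of a 4-bit pattern; bits shifted past bit 3 are lost.
   Requires n < 4. */
unsigned shift_left_4bit(unsigned bits, unsigned n)
{
    assert(n < 4);
    return (bits << n) & 0xFu;
}
```

With these helpers, decode_excess8(shift_left_4bit(encode_excess8(1), 1)) is -6 and decode_excess8(shift_left_4bit(encode_excess8(-1), 1)) is 6, matching the patterns above.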
Context:
This is a follow-up to that other question of mine. I asked about both C and C++ and soon got an answer about C++, because the latest draft for C++20 explicitly requires that signed integer types use two's complement and that padding bits (if any) cannot give trap representations. Unfortunately this is not true for C.
Of course, I know that most modern systems only use two's complement representations of integers with no padding bits, meaning that no trap representation can be observed. Nevertheless, the C standard still seems to allow three representations of signed types: sign and magnitude, ones' complement and two's complement. And at least the C18 draft (n2310, 6.2.6 Representations of types) explicitly allows padding bits for integer types other than char. This is still true for the latest version (n2454) I could find.
Question
So in the context of possible padding bits, or non two's complement signed representation, int variables could contain trap values for conformant implementations. Is there a reliable way to make sure that an int variable contains a valid value?
I'm writing some SSE/AVX code and there's a task to divide packed signed 32-bit (2's complement) integers by a power of 2. When the values are positive a right shift works fine, but it produces wrong results for negative values, because the sign bit gets shifted too.
Is there any SIMD operation that lets me shift preserving the position of the sign bit? Thanks
SSE2/AVX2 has a choice of arithmetic (see footnote 1) vs. logical right shifts for 16 and 32-bit element sizes. (For 64-bit elements, only logical is available until AVX512.)
Use _mm_srai_epi32 (psrad) instead of _mm_srli_epi32 (psrld).
See Intel's intrinsics guide, and other links in the SSE tag wiki https://stackoverflow.com/tags/sse/info. (Filter it to exclude AVX512 if you want, because it's pretty cluttered these days with all the masked versions for all 3 sizes...)
Or just look at the asm instruction-set reference, which includes intrinsics for instructions that have them. Searching for "arithmetic" in http://felixcloutier.com/x86/index.html finds the shifts you want.
Note the a=arithmetic vs. l=logical, instead of the usual intrinsics naming scheme of epu32 for unsigned. The asm mnemonics are simple and consistent (e.g. Packed Shift Right Arithmetic Dword = psrad).
Arithmetic right shifts are also available for AVX2 variable-shifts (vpsravd), and for the one-variable-for-all-elements version of the immediate shifts.
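A minimal sketch of the difference between the two intrinsics, for four 32-bit lanes and a fixed shift count of 2. The wrapper names are invented for this example, and the scalar fallback is only there so the code still builds where SSE2 is not available (an assumption about the build setup, not part of the answer).

```c
#include <stdint.h>
#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>   /* SSE2 intrinsics */
#define HAVE_SSE2 1
#endif

/* Shift four signed 32-bit lanes right by 2, shifting in copies of
   the sign bit (psrad). */
void shift_right_arith_by2(const int32_t in[4], int32_t out[4])
{
#ifdef HAVE_SSE2
    __m128i v = _mm_loadu_si128((const __m128i *)in);
    v = _mm_srai_epi32(v, 2);
    _mm_storeu_si128((__m128i *)out, v);
#else
    /* Scalar fallback: floor division by 4 gives the same values. */
    for (int i = 0; i < 4; ++i) {
        int32_t q = in[i] / 4;
        if (in[i] % 4 < 0) --q;
        out[i] = q;
    }
#endif
}

/* Same, but shifting in zeros (psrld); negative inputs come out as
   large positive numbers. */
void shift_right_logic_by2(const int32_t in[4], int32_t out[4])
{
#ifdef HAVE_SSE2
    __m128i v = _mm_loadu_si128((const __m128i *)in);
    v = _mm_srli_epi32(v, 2);
    _mm_storeu_si128((__m128i *)out, v);
#else
    for (int i = 0; i < 4; ++i) {
        out[i] = (int32_t)((uint32_t)in[i] >> 2);
    }
#endif
}
```

For the input {-8, -7, 8, 7} the arithmetic version yields {-2, -2, 2, 1}, while the logical version turns -8 into 0x3FFFFFFE.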
Footnote 1:
Arithmetic right shifts shift in copies of the sign bit, instead of zero.
This correctly implements 2's complement signed division by powers of 2 with rounding towards negative infinity, unlike the truncation toward zero you get from C signed division. Look at the asm output for int foo(int a){return a/4;} to see how compilers implement signed division semantics in terms of shifts.
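The rounding difference can be seen in scalar code too. The sketch below uses hypothetical helper names: floor_div4 gives the value an arithmetic right shift by 2 produces on a 2's complement machine, and trunc_div4_via_floor shows the usual bias trick (add 3 to negative inputs before the floor step) that recovers C's truncating a / 4.

```c
#include <stdint.h>

/* Floor division by 4: rounds toward negative infinity, like an
   arithmetic right shift by 2 on a 2's complement value. */
int32_t floor_div4(int32_t a)
{
    int32_t q = a / 4;      /* C division truncates toward zero */
    if (a % 4 < 0) {
        --q;                /* step down when a negative remainder was cut off */
    }
    return q;
}

/* Truncating division by 4 built from floor division: negative inputs
   get a bias of 3 (divisor - 1) before the floor step. */
int32_t trunc_div4_via_floor(int32_t a)
{
    int32_t bias = (a < 0) ? 3 : 0;
    return floor_div4(a + bias);
}
```

For a = -7, floor_div4 gives -2 while a / 4 and trunc_div4_via_floor both give -1.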
Rephrased: list of platforms supported by the C standard
The C standard is very loosely defined:
- it covers two's complement, ones' complement, signed magnitude
- integers can be of various width, with padding bits
- certain bit patterns may not represent valid values.
There is an obvious downside to this: it makes portable code harder to write. Does anyone know of platforms for which there is still active development work, but where
- integers are not 2's complement, or
- the integer width is not 32 bits or 64 bits, or
- some integer types have padding bits, or
- on a 2's complement machine, the bit pattern with sign bit 1 and all value bits zero is not a valid negative number, or
- integer conversion from signed to unsigned (and vice versa) is not done by verbatim copying of bit patterns, or
- right shift of an integer is not an arithmetic shift, or
- the number of value bits in an unsigned type is not the number of value bits in the corresponding signed type + 1, or
- conversion from a wider int type to a smaller type is not done by truncating the leftmost bits that would not fit?
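Several of these properties can be probed from C itself. The following is a rough sketch with invented function names. It only reports what the current implementation does, and the right-shift probe relies on implementation-defined behaviour by design.

```c
#include <limits.h>
#include <stdbool.h>

/* The low two bits of -1 distinguish the three representations:
   11 for 2's complement, 10 for 1's complement, 01 for sign+magnitude. */
bool int_is_twos_complement(void)
{
    return (-1 & 3) == 3;
}

/* Count the value bits of unsigned int by shifting UINT_MAX to zero. */
unsigned uint_value_bits(void)
{
    unsigned bits = 0;
    for (unsigned v = UINT_MAX; v != 0; v >>= 1) {
        ++bits;
    }
    return bits;
}

/* Padding bits exist if the storage is wider than the value bits. */
bool uint_has_padding(void)
{
    return uint_value_bits() != sizeof(unsigned) * CHAR_BIT;
}

/* Implementation-defined probe: on a 2's complement implementation,
   an arithmetic shift keeps -1 at -1. */
bool right_shift_is_arithmetic(void)
{
    return (-1 >> 1) == -1;
}
```

On mainstream desktop and server compilers one would expect 2's complement, no padding and an arithmetic right shift, but the point of the probes is that the standard does not promise it.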
Yes, it is still used in embedded systems and in microcontrollers.
It is also used for educational purposes.
Yes, we see this all the time when working with customizable microcontrollers and DSPs for things like audio processing.