In programming languages, operators like & and | are called bitwise operators. My question is: aren't addition (+) and subtraction (-), or for that matter any mathematical expressions, bitwise operations too? After all, the calculation happens on binary data, since the machine cannot understand decimal numbers. I assume addition is also implemented with an adder built from gates, so why are only operators like & and | (or) called bitwise operators?
Because the bitwise operators only operate on the bits, they do nothing "more" and there's no question of the underlying format.
Addition treats a bunch of bits as a number, which might be signed (or even floating point); this means it must interpret the bits in a particular way (e.g. two's complement, signed magnitude, floating point, and so on), while the bitwise operators treat the bits as just "raw" bits, with no interpretation and no dependencies between bits as there might be in the higher-level numerical formats.
Also, you forgot some: there's also the ^ bitwise XOR operator, ~ which is bitwise not, and of course the shifting operators << and >>.
In the C language, there are a lot of operators called bitwise: & | ^ << >> ~ &= |= ^= <<= >>=. They have in common that they are only used for bit manipulation on the "raw binary level", regardless of what kind of data the variable contains.
But of course, all operators have the purpose of altering bits. Bitwise is just a naming convention by the C language. Strictly speaking, C groups operators together in different groups with related operators, like this (C11 6.5):
Additive operators + -
Bitwise shift operators >> <<
Bitwise AND operator &
Bitwise exclusive OR operator ^
Bitwise inclusive OR operator |
And so on.
^ (XOR) and the shift operators are also bitwise operators. The difference between these and other operators is mainly that bitwise operators do not assume a particular encoding of a value, whereas arithmetic operators do (for example the two's complement representation of integers). A small exception to this rule is that >> applied to a negative signed value (an arithmetic shift) makes no sense unless the leftmost bit is interpreted as a sign bit.
In bitwise operations, the value of the bit at some particular position in the result may depend on the value of the bit in the same position in the operands, but does not depend on bit values in any other positions.
With one operand, there are 2 possible inputs (false, true) so 4 bitwise operations are possible (x, not x, 0 and 1). With two operands, there are 4 possible input combinations so in total 16 such operations are possible (and, or, xor, not x, not y, x, y, 0, 1, etc).
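A minimal sketch of that independence property, in C. The helper names are made up for illustration and are not from any library. Each function flips bit k of x and reports which bits of the result change: for & it can only be bit k itself, while for + a carry may ripple into higher positions.

```c
#include <stdint.h>

/* Illustrative helpers (hypothetical names).
   Each returns the set of result bits that change when bit k (0..31) of x is flipped. */
uint32_t changed_by_and(uint32_t x, uint32_t y, unsigned k)
{
    uint32_t flipped = x ^ ((uint32_t)1 << k);
    return (x & y) ^ (flipped & y);   /* at most bit k is set */
}

uint32_t changed_by_add(uint32_t x, uint32_t y, unsigned k)
{
    uint32_t flipped = x ^ ((uint32_t)1 << k);
    return (x + y) ^ (flipped + y);   /* carries can change higher bits too */
}
```

For example, with x = 0x0F, y = 0x01 and k = 0, the addition changes five result bits (0x10 versus 0x0F), whereas the AND of 0x0F and 0xFF changes only bit 0.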
Does anyone know what &- does in C?
limit= address+ (n &- sizeof(uint));
This isn't really one operator, but two:
(n) & (-sizeof(uint))
i.e. this is performing a bitwise and operation between n and -sizeof(uint).
What does this mean?
Let's assume -sizeof(uint) is -4 - then by two's complement representation, -sizeof(uint) is 0xFFFFFFFC or
1111 1111 1111 1111 1111 1111 1111 1100
We can see that this bitwise and operation will zero out the last two bits of n. This effectively rounds n down to the nearest multiple of sizeof(uint).
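The same trick as a small C sketch. The function name align_down is hypothetical, and the code assumes that the alignment is a power of two, as sizeof(uint) is in the example above.

```c
#include <stddef.h>

/* Rounds n down to a multiple of align.
   Assumption: align is a power of two (1, 2, 4, 8, ...). */
size_t align_down(size_t n, size_t align)
{
    /* Unsigned negation wraps around, so -4 becomes ...11111100. */
    return n & -align;
}
```

With an alignment of 4, the value 13 becomes 12, while 16 stays 16.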
&- is the binary bitwise AND operator written together with - which is the unary minus operator. Operator precedence (binary & having lowest precedence here) gives us the operands of & as n and -sizeof(uint).
The purpose is to create a bit mask in a very obscure way, relying on unsigned integer arithmetic. Assuming uint is 4 bytes (don't use homebrewed types either btw, use stdint.h), then the code is equivalent to this
n & -(size_t)4
size_t being the type returned by sizeof, which is guaranteed to be a large, unsigned integer type. Applying unary minus on unsigned types is of course nonsense too. Though even if it is obscure, applying minus on unsigned arithmetic results in well-defined wrap-around 1), so in case of the value 4, we get 0xFFFFFFFFFFFFFFFC on a typical PC where size_t is 64 bits.
n & 0xFFFFFFFFFFFFFFFC will clear the 2 least significant bits of n and leave all other bits unchanged.
What the relation between these 2 bits and the size of the type used is, I can only guess. A 4 byte type has 4 possible byte offsets, which fit in the two least significant bits: binary 0, 1, 10, 11. Clearing them rounds n down to a multiple of the type's size, so the purpose is probably to mask out misaligned addresses or some such.
Assuming I guessed correctly, we can write the same code without any obfuscation practices as far more readable code:
~(sizeof(uint32_t)-1)
Which gives us 4-1 = 0x3, ~0x3 = 0xFFFF...FC. Or in case of 8 byte types, 0xFFFF...F8. And so on.
So I'd rewrite the code as
#include <stddef.h>
#include <stdint.h>
size_t mask = ~(sizeof(uint32_t)-1); /* size_t, so the upper bits of a wider n are not truncated away */
limit = address + (n & mask);
1) C17 6.3.1.3
Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type. 60)
Where footnote 60) says:
The rules describe arithmetic on the mathematical value, not the value of a given type of expression.
In this case, SIZE_MAX+1 is added once to the mathematical value -4, which gives SIZE_MAX-3, a value that fits inside a size_t variable.
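A short C sketch of why the two spellings of the mask agree. The function names are hypothetical. In unsigned arithmetic, negating a value and complementing that value minus one give the same bit pattern.

```c
#include <stddef.h>
#include <stdint.h>

/* Mask built the obscure way: unsigned negation wraps modulo SIZE_MAX + 1. */
size_t mask_by_negation(size_t size)
{
    return -size;
}

/* Mask built the readable way: all ones except the low bits below size.
   Like the version above, it is only useful as an alignment mask when
   size is a power of two. */
size_t mask_by_complement(size_t size)
{
    return ~(size - 1);
}
```

For a size of 4, both return SIZE_MAX - 3, which is 0xFFFFFFFFFFFFFFFC when size_t is 64 bits wide.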
I am learning about masking techniques in C.
Here is a practice problem I am working on:
Need to find the complement of 0x87654321 while leaving the least significant byte intact which should look like this 0x789ABC21
The only mask I'm familiar with right now is using x & 0xFF to strip all but the last byte. I don't know which bitwise operator to use to get the complement of a hex number. How do I approach that?
My book doesn't explain what the one's complement of a hex number is, but I googled it and found that a shortcut is to compute 15 - hexDigit = complement for each digit. Please correct me if I'm wrong.
The bitwise complement operator in C is the tilde ~. This flips every bit in a bit pattern.
The bitwise XOR operator (^) can also be used to do a bitwise complement. The bitwise XOR truth table is below:
^| 0 1
------
0| 0 1
1| 1 0
Notice in particular that 1^0 = 1 and that 1^1 = 0. Thus, bitwise XOR with a one will also have the effect of flipping bits.
Thus, bitwise XOR with 0xFFFFFF00 will flip all but the last eight bits of a number.
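Applied to the practice problem, this could look like the following C sketch. The function names are made up for illustration. The second function gets the same result with ~ and two masks.

```c
#include <stdint.h>

/* Flips every bit except the low eight: XOR with 1 flips, XOR with 0 keeps. */
uint32_t complement_keep_low_byte(uint32_t x)
{
    return x ^ UINT32_C(0xFFFFFF00);
}

/* Same result: complement the upper bytes, then put the original low byte back. */
uint32_t complement_keep_low_byte_masked(uint32_t x)
{
    return (~x & UINT32_C(0xFFFFFF00)) | (x & UINT32_C(0xFF));
}
```

For the input 0x87654321, both functions return 0x789ABC21.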
Suppose we have the following code:
int j = -1 & 0xFF;
The resulting value in j could be one of the following based on the underlying representation:
System             Value
Two's complement   0xFF
One's complement   0xFE
Sign/Magnitude     0x01
But are the &, |, and ^ operators in C always defined in terms of two's complement (thus making j always be equal to 0xFF), or are they defined in terms of the underlying representation of the system?
They're defined in terms of the actual bit representation. From the C11 final draft:
The result of the binary & operator is the bitwise AND of the operands (that is, each bit in the result is set if and only if each of the corresponding bits in the converted operands is set).
...
The result of the ^ operator is the bitwise exclusive OR of the operands (that is, each bit in the result is set if and only if exactly one of the corresponding bits in the converted operands is set).
...
The result of the | operator is the bitwise inclusive OR of the operands (that is, each bit in the result is set if and only if at least one of the corresponding bits in the converted operands is set).
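If the result must not depend on the representation, one option is to convert to an unsigned type first. A minimal C sketch, with a hypothetical function name: the conversion of a negative int to unsigned is defined in terms of the value (modulo UINT_MAX + 1), not the bits, so -1 always becomes all ones.

```c
/* Low eight bits of v, independent of how negative ints are represented.
   (unsigned)-1 is UINT_MAX on every conforming implementation. */
unsigned low_byte(int v)
{
    return (unsigned)v & 0xFFu;
}
```

Here low_byte(-1) is 0xFF regardless of whether the implementation uses two's complement, ones' complement or sign-magnitude.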
So, let's say I have a signed integer (a couple of examples):
-1101363339 = 10111110 01011010 10000111 01110101 in binary.
-2147463094 = 10000000 00000000 01010000 01001010 in binary.
-20552 = 11111111 11111111 10101111 10111000 in binary.
Now: -1101363339 >> 31, for example, should equal 1, right? But on my computer I am getting -1. Regardless of which negative integer x I pick, x >> 31 = -1. Why? Clearly in binary it should be 1.
Per C99 6.5.7 Bitwise shift operators:
If E1 has a signed type and a negative value, the resulting value is implementation-defined.
where E1 is the left-hand side of the shift expression. So it depends on your compiler what you'll get.
In most languages when you shift to the right it does an arithmetic shift, meaning it preserves the most significant bit. Therefore in your case you have all 1's in binary, which is -1 in decimal. If you use an unsigned int you will get the result you are looking for.
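A small C sketch of the unsigned variant, with a hypothetical function name. Converting to uint32_t first makes the shift insert zeroes on the left, so the result is just the former top bit.

```c
#include <stdint.h>

/* Returns the most significant bit of x as 0 or 1.
   The cast makes this a shift of an unsigned value, which is fully defined. */
uint32_t top_bit(int32_t x)
{
    return (uint32_t)x >> 31;
}
```

For -1101363339 this returns 1, which is the result the question expected.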
Per C 2011 6.5.7 Bitwise shift operators:
The result of E1 >> E2 is E1 right-shifted E2 bit positions. If E1 has an unsigned type or if E1 has a signed type and a nonnegative value, the value of the result is the integral part of the quotient of E1 / 2^E2. If E1 has a signed type and a negative value, the resulting value is implementation-defined.
Basically, the right-shift of a negative signed integer is implementation defined but most implementations choose to do it as an arithmetic shift.
The behavior you are seeing is called an arithmetic shift, which is when right shifting extends the sign bit. This means that the MSBs will carry the same value as the original sign bit. In other words, a negative number will always be negative after a right shift operation.
Note that this behavior is implementation defined and cannot be guaranteed with a different compiler.
What you are seeing is an arithmetic shift, in contrast to the bitwise shift you were expecting; i.e., the compiler, instead of "brutally" shifting the bits, is propagating the sign bit, thus dividing by 2^N.
When talking about unsigned ints and positive ints, a right shift is a very simple operation - the bits are shifted to the right by N places (inserting 0 on the left), regardless of their meaning. In such cases, the operation is equivalent to dividing by 2^N (and actually the C standard defines it like that).
The distinction comes up when talking about negative numbers. Several representations of negative numbers exist, although 2's complement is currently the most common one for integers.
The problem with a "brutal" bitwise shift here is, for starters, that one of the bits is used in some way to express the sign; thus, shifting the binary digits without regard to the representation of negative integers can give unexpected results.
For example, in 2's complement representation the most significant bit is 1 for negative numbers and 0 for nonnegative numbers; applying a bitwise shift (with zeroes inserted on the left) to a negative number would (among other things) make it positive, not resulting in the (usually expected) division by 2^N.
So, arithmetic shift is introduced; negative numbers represented in 2's complement have an interesting property: the division by 2^N behavior of the shift is preserved (rounding toward negative infinity) if, instead of inserting zeroes from the left, you insert bits that have the same value as the original sign bit.
In this way, signed divisions by 2^N can be performed with just a bit of extra logic in the shift, without having to resort to a fully-fledged division routine.
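A hedged C sketch of an arithmetic right shift that does not rely on the implementation-defined behavior of >> for negative values. The name asr32 is hypothetical. It only ever shifts nonnegative values, and it relies on int32_t being two's complement, which the standard requires of that type.

```c
#include <stdint.h>

/* Arithmetic right shift of x by n bits, for 0 <= n <= 31.
   For negative x, ~x equals -x - 1 and is nonnegative, so shifting it is
   well-defined; complementing again restores the sign and fills the top
   bits with ones. */
int32_t asr32(int32_t x, unsigned n)
{
    if (x >= 0)
        return x >> n;      /* nonnegative: plain division by 2^n */
    return ~(~x >> n);      /* negative: rounds toward negative infinity */
}
```

Note that the result rounds toward negative infinity, so asr32(-7, 1) is -4, whereas the C expression -7 / 2 truncates toward zero and gives -3.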
Now, is arithmetic shift guaranteed for signed integers? In some languages yes 1), but in C it's not like that - the behavior of the shift operators when dealing with negative integers is left as an implementation-defined detail.
As often happens, this is due to different hardware support for the operation; C is used on vastly different platforms, and, especially in the past, there was quite a difference in the "cost" of operations depending on the platform.
For example, if the standard mandated arithmetic shift and the processor did not provide an arithmetic right shift instruction, the compiler would be forced to emit a much slower DIV instruction of some kind, which could be a problem in an inner loop on slower processors. For these reasons, the C standard leaves it up to the implementor to do the most appropriate thing for the current platform.
In your case, your implementation probably chose arithmetic shift because you are running on an x86 processor, that uses 2's complement arithmetic and provides both bitwise and arithmetic shift as single CPU instructions.
1) Actually, languages like Java even have separate arithmetic and bitwise shift operators - this is mainly due to the fact that they do not have unsigned types to e.g. store bitfields.
I know that the behavior of >> on signed integer can be implementation dependent (specifically, when the left operand is negative).
What about the others: ~, <<, &, ^, |?
When their operands are signed integers of built-in type (short, int, long, long long), are the results guaranteed to be the same (in terms of bit content) as if their type is unsigned?
For negative operands, << has undefined behavior and the result of >> is implementation-defined (usually as "arithmetic" right shift). << and >> are conceptually not bitwise operators. They're arithmetic operators equivalent to multiplication or division by the appropriate power of two for the operands on which they're well-defined.
As for the genuine bitwise operators ^, ~, |, and &, they operate on the bit representation of the value in the (possibly promoted) type of the operand. Their results are well-defined for each possible choice of signed representation (twos complement, ones complement, or sign-magnitude) but in the latter two cases it's possible that the result will be a trap representation if the implementation treats the "negative zero" representation as a trap. Personally, I almost always use unsigned expressions with bitwise operators so that the result is 100% well-defined in terms of values rather than representations.
Finally, note that this answer as written may only apply to C. C and C++ are very different languages and while I don't know C++ well, I understand it may differ in some of these areas from C...
A left shift << of a negative value has undefined behaviour;
A right shift >> of a negative value gives an implementation-defined result;
The result of the &, | and ^ operators is defined in terms of the bitwise representation of the values. Three possibilities are allowed for the representation of negative numbers in C: two's complement, ones' complement and sign-magnitude. The method used by the implementation will determine the numerical result when these operators are used on negative values.
Note that the value with sign bit 1 and all value bits zero (for two's complement and sign-magnitude), or with sign bit and all value bits 1 (for ones’ complement) is explicitly allowed to be a trap representation, and in this case if you use arguments to these operators that would generate such a value the behaviour is undefined.
The bit content will be the same, but the resulting values will still be implementation dependent.
You really shouldn't see the values as signed or unsigned when using bitwise operations, because that is working on a different level.
Using unsigned types saves you from some of this trouble.
The C89 Standard defined the behavior of left-shifting signed numbers based upon bit positions. If neither signed nor unsigned types have padding bits, the required behavior for unsigned types, combined with the requirement that positive signed types share the same representation as unsigned types, would imply that the sign bit is immediately to the left of the most significant value bit.
Thus, in C89, -1<<1 would be -2 on two's-complement implementations which don't have padding bits and -3 on ones'-complement implementations which don't have padding bits. If there are any sign-magnitude implementations without padding bits, -1<<1 would equal 2 on those.
The C99 Standard changed left-shifts of negative values to Undefined Behavior, but nothing in the rationale gives any clue as to why (or even mentions the change at all). The behavior required by C89 may have been less than ideal in some ones'-complement implementations, and so it would have made sense to allow those implementations the freedom to select something better. I've seen no evidence to suggest that the authors of the Standard didn't intend that quality two's-complement implementations should continue to provide the same behavior mandated by C89, but unfortunately they didn't actually say so.