Is the least significant bit (LSB) always the "first" bit?

I'm reading Modern C (version of Feb 13, 2018), and on page 42 it says that the bit with index 4 is the least significant bit. Shouldn't the bit with index 0 be the least significant bit? (Same question about the MSB.)
Which is right? What's the correct terminology?

Their definition of "most significant bit" and "least significant bit" is misleading:
8-bit binary number:  1 1 1 1 0 0 0 0
Bit number:           7 6 5 4 3 2 1 0
                      |     |       |
                      |     |       least significant bit
                      |     least significant bit that is 1
                      most significant bit that is 1 (and also simply the most significant bit)

The book's definition does not align with common or mainstream usage. See Wikipedia, for instance:
In computing, the least significant bit (LSB) is the bit position in a binary integer giving the units value, that is, determining whether the number is even or odd.
The book, on the other hand, seems to consider only bits that are 1, so that in an 8-bit byte representing the number 16, which we can write:
00010000
the bit that is 1 has index 4 (it's b4 in the book's notation), and then it claims that that particular number's LSB is four.
The conventional definition uses LSB to denote the bit whose place value is 1, i.e. the "units" bit, so the LSB is simply the rightmost bit. This latter definition is more useful, and I really think the book is wrong.

They're using an unusual definition of LSB and MSB, which only refers to the bits that are set to 1. So in the case of 240, the first 1 bit is b4, not b0, because b0 through b3 are all 0.
I'm not sure why the book considers this definition of LSB/MSB to be useful. It's not generally interesting for integers, although it does come into play in floating point. Floating point numbers are scaled so integers above 1 have the low-order zero bits shifted away, and the exponent is incremented to make up for this (conversely, fractions have their high-order bits shifted away, and the exponent is decremented).
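To make the two notions concrete, here is a minimal C sketch (the variable names are mine) that prints both the conventional LSB, bit 0, and the index of the lowest set bit, which is what the book appears to call the LSB:

#include <stdio.h>

int main(void)
{
    unsigned v = 240;          /* 11110000 in binary */

    /* Conventional LSB: bit 0, the units bit; 0 here, so 240 is even. */
    printf("bit 0 of %u is %u\n", v, v & 1u);

    /* The lowest set bit, which is what the book appears to call the LSB.
       v & -v isolates it; counting trailing zeros gives its index. */
    unsigned lowest = v & -v;  /* 00010000, i.e. 16 */
    int index = 0;
    while (lowest >> (index + 1))
        index++;
    printf("lowest set bit of %u is b%d (value %u)\n", v, index, lowest);
    return 0;
}

For 240 this prints index 4, matching the diagram above, while the conventional LSB of 240 is bit 0 with value 0.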


What is the difference between logical OR operation and binary addition?

I'm trying to understand how binary addition and a logical OR table differ.
Do both carry a 1 forward? If not, which one performs a carry operation and which does not?
The exclusive-or (XOR) operation is like binary addition, except that
there is no carry from one bit position to the next. Thus, each bit
position can be evaluated independently of the rest.
I'll attempt to clarify a few points with a few illustrations.
First, addition. Basically like adding numbers in grade school. But if you have a 1-bit aligned with a 1-bit, you get a 0 with a 1 carry (i.e. 10, essentially analogous to 5 plus 5 in base-10). Otherwise, add them like 'regular' (base-10) numbers. For instance:
 ₁₁₁
  1001
+ 1111
------
 11000
Note that in the left-most column two 1's are added to give 10, which with another 1 gives 11 (similar to 5 + 5 + 5).
Now, assuming by "logical OR" you mean something along the lines of bitwise OR (an operation which basically performs the logical OR (inclusive) operation on each pair of corresponding bits), then you have this:
  1001
| 1111
------
  1111
The only case where you get a 0 bit here is when both bits are 0.
Finally, since you tagged this question xor, here is the XOR (which I assume is bitwise as well):
  1001
^ 1111
------
  0110 = 110₂
In this case, two 1-bits give a 0, and of course two 0-bits give 0.
With a logical OR you get a logical (Boolean) result. In other words, true OR true is true (anything other than false OR false is true). In some languages (like C) any numeric value other than 0 means true, and some languages use an explicit data type for true/false (bool, Boolean).
In the case of binary OR, you are ORing the bits of two binary values, i.e. 1 (which is binary 1) bitwise-OR 2 (which is binary 10) is binary 11:
  01
| 10
----
  11
which is 3. Thus binary OR is also an addition when the values do not have shared bits (like flag values).
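For completeness, a short runnable C sketch (the values are chosen arbitrarily) showing addition, OR and XOR side by side, including the disjoint-bits case where OR and addition coincide:

#include <stdio.h>

int main(void)
{
    unsigned a = 9;   /* 1001 */
    unsigned b = 15;  /* 1111 */

    printf("a + b = %u\n", a + b);  /* 24 = 11000: carries propagate        */
    printf("a | b = %u\n", a | b);  /* 15 =  1111: no carry, per-bit OR     */
    printf("a ^ b = %u\n", a ^ b);  /*  6 =  0110: addition without carries */

    /* OR coincides with addition exactly when no bits are shared,
       which is why OR-ing disjoint flag values "adds" them: */
    unsigned f1 = 1, f2 = 2;        /* binary 01 and 10 */
    printf("(f1 | f2) == (f1 + f2): %d\n", (f1 | f2) == (f1 + f2));
    return 0;
}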

What does (size + 7) & ~7 mean?

I'm reading the Multiboot2 specification. You can find it here. Compared to the previous version, it names all of its structures "tags". They're defined like this:
3.1.3 General tag structure
Tags constitutes a buffer of structures following each other padded on u_virt size. Every structure has
following format:
        +-------------------+
u16     | type              |
u16     | flags             |
u32     | size              |
        +-------------------+
type is divided into 2 parts. Lower contains an identifier of contents of the rest of the tag. size contains the size of tag including header fields. If bit 0 of flags (also known as optional) is set, the bootloader may ignore this tag if it lacks relevant support. Tags are terminated by a tag of type 0 and size 8.
Then later in example code:
for (tag = (struct multiboot_tag *) (addr + 8);
     tag->type != MULTIBOOT_TAG_TYPE_END;
     tag = (struct multiboot_tag *) ((multiboot_uint8_t *) tag
                                     + ((tag->size + 7) & ~7)))
The last part confuses me. In Multiboot 1 the code was substantially simpler; you could just do multiboot_some_structure * mss = (multiboot_some_structure *) mbi->some_addr and get the members directly, without confusing code like this.
Can somebody explain what ((tag->size + 7) & ~7) means?
As mentioned by chux in his comment, this rounds tag->size up to the nearest multiple of 8.
Let's take a closer look at how that works.
Suppose size is 16:
 00010000   // 16 in binary
+00000111   // add 7
---------
 00010111   // results in 23
The expression ~7 takes the value 7 and inverts all bits. So:
 00010111   // 23 (from previous step)
&11111000   // bitwise AND with ~7
---------
 00010000   // results in 16
Now suppose size is 17:
 00010001   // 17 in binary
+00000111   // add 7
---------
 00011000   // results in 24
Then:
 00011000   // 24 (from previous step)
&11111000   // bitwise AND with ~7
---------
 00011000   // results in 24
So if the lower 3 bits of size are all zero, i.e. size is already a multiple of 8, adding 7 sets some of those low bits and the AND then clears them again, for no net effect. But if any one of those bits is 1, adding 7 carries into bit 3 (the 8s place) and the lower bits are then cleared, i.e. the number is rounded up to the next multiple of 8.
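Here is a minimal sketch you can compile to check the round-up behaviour over a range of sizes (round_up8 is a name I made up for illustration):

#include <assert.h>
#include <stdio.h>

/* Round n up to the next multiple of 8. This works because 8 is a power
   of two: adding 7 carries into bit 3 whenever any of the low three bits
   is set, and & ~7 then clears those low bits. */
static unsigned round_up8(unsigned n)
{
    return (n + 7u) & ~7u;
}

int main(void)
{
    assert(round_up8(16) == 16);  /* already a multiple of 8: unchanged */
    assert(round_up8(17) == 24);  /* 17 through 24 all map to 24 */
    assert(round_up8(24) == 24);
    for (unsigned n = 0; n <= 16; n++)
        printf("%2u -> %2u\n", n, round_up8(n));
    return 0;
}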
~ is a bitwise NOT; & is a bitwise AND.
Assuming 16 bits are used:
 7 is 0000 0000 0000 0111
~7 is 1111 1111 1111 1000
Any bit AND'd with 0 is 0, and any bit AND'd with 1 is itself. Thus, for each bit N:
N & 0 = 0
N & 1 = N
So when you AND with ~7, you clear the lowest three bits and leave all of the other bits unchanged.
Thanks to @chux for the answer. According to him, it rounds the size up to a multiple of 8, if needed. This is very similar to a technique used in 15bpp drawing code:
//+7/8 will cause this to round up...
uint32_t vbe_bytes_per_pixel = (vbe_bits_per_pixel + 7) / 8;
Here's the reasoning:
Things were pretty simple up to now but some confusion is introduced by the 16bpp format. It's actually 15bpp since the default format is actually RGB 5:5:5 with the top bit of each u_int16 being unused. In this format, each of the red, green and blue colour components is represented by a 5-bit number giving 32 different levels of each and 32768 possible different colours in total (true 16bpp would be RGB 5:6:5 where there are 65536 possible colours). No palette is used for 16bpp RGB images - the red, green and blue values in the pixel are used to define the colours directly.
& ~7 sets the last three bits to 0.
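As a side-by-side check, here is a small sketch (the bpp values are arbitrary) of the division form of the same idea, where any partial byte costs a whole byte:

#include <stdio.h>

int main(void)
{
    /* (bits + 7) / 8 rounds up: a pixel that uses any part of a byte
       needs the whole byte. */
    unsigned bpp[] = { 1, 8, 15, 16, 24, 32 };
    for (int i = 0; i < 6; i++)
        printf("%2u bpp -> %u byte(s) per pixel\n",
               bpp[i], (bpp[i] + 7u) / 8u);
    return 0;
}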

Possible values for operands in bitwise-and expression

Given the following C code:
int x = atoi(argv[1]);
int y = (x & -x);
if (x == y)
    printf("Wow... they are the same!\n");
What values of x will result in "Wow... they are the same!" getting printed? Why?
So. It generally depends, but I can assume that your architecture represents signed numbers in two's complement (U2) format (none of the following holds if it's not two's complement). Let's have an example.
We take 3, whose representation is:
0011
and -3, which is obtained by flipping the bits and adding 1:
~ 0011
+    1
------
  1101
and we AND them:
  1101
& 0011
------
  0001
so:
0011 != 0001
and the message is not printed for 3. That's what happens under the hood. You have to find the numbers that fit this pattern; based on this, you can work out which ones do.
The question is asking about the binary & operator and 2's complement arithmetic.
I would look at how numbers are represented in 2's complement, and at what the binary & operator does.
Assuming a 2's complement representation for negative numbers, the only values for which this is true are positive numbers of the form 2^n where n >= 0, and 0.
When you take the 2's complement of a number, you flip all bits and then add one. So the least significant bit will always match. The next bit won't match unless the prior bit carried over, and the same holds for each bit after that.
An int is typically 32 bits, however I'll use 5 bits in the following examples for simplicity.
For example, 5 is 00101. Flipping all bits gives us 11010, then adding 1 gives us 11011. Then 00101 & 11011 = 00001. The only bit that matches a set bit is the last one, so 5 doesn't work.
Next we'll try 12, which is 01100. Flipping the bits gives us 10011, then adding 1 gives us 10100. Then 01100 & 10100 = 00100. Because of the carry-over the third bit is set, however the second bit is not, so 12 doesn't work either.
So the most significant bit which is set won't match unless all lower bits carry over when 1 is added. This is true only for numbers with one bit set, i.e. powers of 2.
If we now try 8, which is 01000, flipping the bits gives us 10111 and adding 1 gives us 11000. And 01000 & 11000 = 01000. In this case, the second bit is set, which is the only bit set in the original number. So the condition holds.
Negative numbers cannot satisfy this condition because positive numbers have the most significant bit set to 0, while negative numbers have the most significant bit set to 1. So a bitwise AND of a number and its negative will always have the most significant bit set to 0, meaning this number cannot be negative.
0 is a special case since it is its own negative. 0 & 0 = 0, so it also satisfies this condition.
Another special case is the smallest number you can represent. In the case of a 5-bit number this is -16, which is represented by 10000. Flipping all the bits gives you 01111 and adding 1 gives you 10000, which is the same number. On the surface it seems this number also satisfies the condition, however this is an overflow condition and implementations may not handle this case correctly. See this link for more details.
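A quick way to convince yourself is to brute-force a small range. This sketch (mine, and it deliberately avoids INT_MIN so the overflow case above does not arise) prints every qualifying value:

#include <stdio.h>

int main(void)
{
    /* Print every x in a small range for which x == (x & -x).
       Assuming two's complement, that is 0 and the powers of two;
       no negative value in this range qualifies. */
    for (int x = -16; x <= 64; x++) {
        if (x == (x & -x))
            printf("%d\n", x);  /* prints 0 1 2 4 8 16 32 64 */
    }
    return 0;
}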

C MSB to LSB explanation

Could someone explain to me how this algorithm converts MSB to LSB or LSB to MSB on a 32-bit system?
unsigned char b = x;
b = ((b * 0x0802LU & 0x22110LU) | (b * 0x8020LU & 0x88440LU)) * 0x10101LU >> 16;
I've seen hex values end with LU or just U in code before, what do they mean?
Thanks!
Presumably, a char has eight bits, so unsigned char b = x takes the low eight bits of x.
The mask with 0x22110 extracts bits 4, 8, 13, and 17 (numbering from 0 for the least significant bit). So, in the multiplication by 0x0802, we only care about what it places at those bits. In 0x802, bits 1 and 11 are on, so this multiplication places a copy of the eight bits of b in bits 1 through 8 and another copy in bits 11 through 18. There is no overlap, so there are no effects from adding bits that overlap in more general multiplications.
From this product, we take these bits:
Bit 4, which is bit 3 of b. (Bit 4 comes from the copy starting at bit 1, so it is bit 4 - 1 = 3 of b.)
Bit 8, which is bit 7 of b. (8 - 1 = 7.)
Bit 13, which is bit 2 of b. (13 - 11 = 2.)
Bit 17, which is bit 6 of b. (17 - 11 = 6.)
Similarly, the mask by 0x88440 extracts bits 6, 10, 15, and 19. The multiplication by 0x8020 places a copy of b in bits 5 to 12 and another copy in bits 15 to 22. From this product, we take these bits:
Bit 6, which is bit 1 of b.
Bit 10, which is bit 5 of b.
Bit 15, which is bit 0 of b.
Bit 19, which is bit 4 of b.
Then we OR those together, producing:
Bit 4, which is bit 3 of b.
Bit 6, which is bit 1 of b.
Bit 8, which is bit 7 of b.
Bit 10, which is bit 5 of b.
Bit 13, which is bit 2 of b.
Bit 15, which is bit 0 of b.
Bit 17, which is bit 6 of b.
Bit 19, which is bit 4 of b.
Call this result t.
We are going to multiply that by 0x10101, shift right by 16, and assign to b. The assignment converts to unsigned char, so only the low eight bits are kept. The low eight bits after the shift are bits 24 to 31 before the shift. So we only care about bits 24 to 31 in the product.
The multiplier 0x10101 has bits 0, 8, and 16 set. Thus, bit 24 in the result is the sum of bits 24, 16, and 8 in t, plus any carry from elsewhere. However, there is no carry: Observe that none of the set bits in t are eight apart, as the bits in the multiplier are. Therefore, none of them can directly contribute to the same bit in the product. Each bit in the product is the result of at most one bit in t. We just need to figure out which bit that is.
Bit 24 must come from bit 8, 16, or 24 in t. Only bit 8 can be set, and it is bit 7 from b. Deducing all the bits this way:
Bit 24 is bit 8 in t, which is bit 7 in b.
Bit 25 is bit 17 in t, which is bit 6 in b.
Bit 26 is bit 10 in t, which is bit 5 in b.
Bit 27 is bit 19 in t, which is bit 4 in b.
Bit 28 is bit 4 in t, which is bit 3 in b.
Bit 29 is bit 13 in t, which is bit 2 in b.
Bit 30 is bit 6 in t, which is bit 1 in b.
Bit 31 is bit 15 in t, which is bit 0 in b.
Thus, bits 24 to 31 in the product are bits 7 to 0 in b, so the eight bits finally produced are bits 7 to 0 in b.
View b as an 8 bit value abcdefgh where each of those letters is a single bit (0 or 1), with a the most significant bit and h the least significant. Then look at what each of the operations do to those bits:
b * 0x0802LU = 00000abcdefgh00abcdefgh0
b * 0x0802LU & 0x22110LU = 000000b000f0000a000e0000
b * 0x8020LU = 0abcdefgh00abcdefgh00000
b * 0x8020LU & 0x88440LU = 0000d000h0000c000g000000
((b * 0x0802LU & 0x22110LU) | (b * 0x8020LU & 0x88440LU))
= 0000d0b0h0f00c0a0g0e0000
so at this point, it has shuffled the bits and spread them out.
(....) * 0x10101LU =                 d0b0h0f00c0a0g0e0000
                   +         d0b0h0f00c0a0g0e000000000000
                   + d0b0h0f00c0a0g0e00000000000000000000
                   = d0b0h0f0dcbahgfedcbahgfe0c0a0g0e0000
(...) * 0x10101LU >> 16 = d0b0h0f0dcbahgfedcba
b = hgfedcba
The multiply is equivalent to shift/add/add (three bits set in the constant), which lines up the bits where they should end up. Then the final shift and reduction to 8 bits gives you the final bit-reversed result.
To answer your second question: u means to treat the hex constant as unsigned (if there is a need to expand it to a longer width), and l means to treat it as a long.
It's difficult to visualize what this algorithm is doing when you look at it as multiplications and hex. It becomes more clear when you convert it to binary and replace the multiplications with an equivalent sum of shift operations. Essentially what it is doing is it is spreading out parts of the byte by shifting and masking it, and then implementing a parallel half-adder that reconstructs the parts in place, which happens to be the reverse of where they started.
For example,
b * 0x0802 = b << 11 | b << 1
Plug in some values (in binary) for b and follow along.
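Following that suggestion, here is a small test harness (reverse_naive is my own reference implementation, not from the question) that checks the multiply/mask trick against a plain bit loop for all 256 byte values:

#include <stdio.h>

/* Reference: reverse the 8 bits of b one at a time. */
static unsigned char reverse_naive(unsigned char b)
{
    unsigned char r = 0;
    for (int i = 0; i < 8; i++)
        if (b & (1u << i))
            r |= (unsigned char) (1u << (7 - i));
    return r;
}

int main(void)
{
    /* Compare the multiply/mask trick against the plain loop for every byte. */
    for (unsigned v = 0; v < 256; v++) {
        unsigned char b = (unsigned char) v;
        b = ((b * 0x0802LU & 0x22110LU) | (b * 0x8020LU & 0x88440LU))
            * 0x10101LU >> 16;
        if (b != reverse_naive((unsigned char) v)) {
            printf("mismatch at %u\n", v);
            return 1;
        }
    }
    printf("all 256 byte values reversed correctly\n");
    return 0;
}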

Can anyone explain why '>>2' shift means 'divided by 4' in C codes?

I know and understand the result.
For example:
7 (decimal) = 00000111 (binary)
7 >> 2 = 00000001 (binary)
00000001 (binary) is the same as 7 / 4 = 1
So 7 >> 2 = 7 / 4
But I'd like to know how this logic was created.
Can anyone elaborate on this logic?
(Maybe it just popped up in a genius' head?)
And are there any other similar logics like this?
It didn't "pop up" in a genius's head. Right-shifting a binary number divides it by 2, and left-shifting multiplies it by 2. This is because 10 is 2 in binary: multiplying a number by 10 (be it binary, decimal or hexadecimal) appends a 0 to the number, which is effectively a left shift, and dividing by 10 (or 2) removes a digit from the number, effectively a right shift. This is how the logic really works.
There is plenty of such bit-twiddlery (a word I invented a minute ago) in the computer world. Here is one for starters: http://graphics.stanford.edu/~seander/bithacks.html
And this is my favorite book on bit-twiddlery: http://www.amazon.com/Hackers-Delight-Edition-Henry-Warren/dp/0321842685/ref=dp_ob_image_bk
It is actually defined that way in the C standard.
From section 6.5.7:
The result of E1 >> E2 is E1 right-shifted E2 bit positions. [...]
the value of the result is the integral part of the quotient of E1 / 2^E2
On most architectures, x >> 2 is only equal to x / 4 for non-negative numbers. For negative numbers, it usually rounds in the opposite direction.
Compilers have always been able to optimize x / 4 into x >> 2. This technique is called "strength reduction", and even the oldest compilers can do this. So there is no benefit to writing x / 4 as x >> 2.
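A tiny demonstration of that rounding difference; note that right-shifting a negative value is implementation-defined in C, so the results shown assume a typical two's-complement machine with arithmetic shifts:

#include <stdio.h>

int main(void)
{
    /* Non-negative: shift and divide agree. */
    printf("7 / 4 = %d,  7 >> 2 = %d\n", 7 / 4, 7 >> 2);    /* 1, 1 */

    /* Negative: division truncates toward zero, while an arithmetic
       right shift rounds toward negative infinity. */
    printf("-7 / 4 = %d, -7 >> 2 = %d\n", -7 / 4, -7 >> 2); /* -1, -2 */
    return 0;
}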
Elaborating on Aniket Inge's answer:
Number: 307₁₀ = 100110011₂

How multiplying by 10 works in the decimal system:

10 * (307₁₀)
  = 10 * (3*10^2 + 7*10^0)
  = 3*10^(2+1) + 7*10^(0+1)
  = 3*10^3 + 7*10^1
  = 3070₁₀
  = 307₁₀ << 1

Similarly, multiplying by 2 in binary:

2 * (100110011₂)
  = 2 * (1*2^8 + 1*2^5 + 1*2^4 + 1*2^1 + 1*2^0)
  = 1*2^(8+1) + 1*2^(5+1) + 1*2^(4+1) + 1*2^(1+1) + 1*2^(0+1)
  = 1*2^9 + 1*2^6 + 1*2^5 + 1*2^2 + 1*2^1
  = 1001100110₂
  = 100110011₂ << 1
I think you are confused by the "2" in:
7 >> 2
and are thinking it should divide by 2.
The "2" here means shift the number ("7" in this case) "2" bit positions to the right.
Shifting a number 1 bit position to the right will have the effect of dividing by 2:
8 >> 1 = 4 // In binary: (00001000) >> 1 = (00000100)
and shifting a number 2 bit positions to the right will have the effect of dividing by 4:
8 >> 2 = 2 // In binary: (00001000) >> 2 = (00000010)
It's inherent in the binary number system used in computers.
A similar logic: left-shifting n times means multiplying by 2^n.
An easy way to see why it works is to look at the familiar decimal (base-ten) number system: 050 is fifty; shift it to the right and it becomes 005, five, which is equivalent to dividing by 10. The same goes for shifting left: 050 becomes 500, five hundred, equivalent to multiplying by 10.
All the other numeral systems work the same way.
They do that because shifting is more efficient than actual division. You're just moving all the digits to the right or left, logically multiplying or dividing by 2 per shift.
If you're wondering why 7 / 4 = 1, that's because the remainder of the result (3/4) is truncated off so that it's an integer.
Just my two cents: I did not see any mention of the fact that shifting right does not always produce the same result as dividing by 2. Since right-shifting rounds toward negative infinity and integer division rounds toward zero, some values (like -1 in two's complement) will just not work as expected when divided.
It's because >> and << operators are shifting the binary data.
Binary value 1000 is double the binary value 0100.
Binary value 0010 is a quarter of the binary value 1000.
You can call it an idea of a genius mind or just the need of the computer language.
To my mind, a computer as a device never truly divides or multiplies numbers; it only has logic for adding bits and for shifting them from here to there. You can write an algorithm that tells your computer to multiply or subtract, but when the logic reaches the actual processing, the result will come out of shifting or adding bits.
You can simply think that for getting the result of a number being divided by 4, the computer actually right shifts the bits to two places, and gives the result:
7 in 8-bit binary = 00000111
Shift Right 2 places = 00000001 // (Which is for sure equal to Decimal 1)
Further examples:
//-- We can divide 9 by four by Right Shifting 2 places
9 in 8-bit binary = 00001001
Shift right 2 places: 00000010 // (Which is equal to 9/4 or Decimal 2)
A person with deep knowledge of assembly language programming can explain it with more examples. If you want to know the real sense behind all this, I guess you need to study bit-level arithmetic and the assembly language of computers.
