What does (size + 7) & ~7 mean? - c

I'm reading the Multiboot2 specification. You can find it here. Compared to the previous version, it names all of its structures "tags". They're defined like this:
3.1.3 General tag structure
Tags constitutes a buffer of structures following each other padded on u_virt size. Every structure has
following format:
+-------------------+
u16 | type |
u16 | flags |
u32 | size |
+-------------------+
type is divided into 2 parts. Lower contains an identifier of
contents of the rest of the tag. size contains the size of tag
including header fields. If bit 0 of flags (also known as
optional) is set if bootloader may ignore this tag if it lacks
relevant support. Tags are terminated by a tag of type 0 and size
8.
Then later in example code:
for (tag = (struct multiboot_tag *) (addr + 8);
tag->type != MULTIBOOT_TAG_TYPE_END;
tag = (struct multiboot_tag *) ((multiboot_uint8_t *) tag
+ ((tag->size + 7) & ~7)))
The last part confuses me. In Multiboot 1, the code was substantially simpler, you could just do multiboot_some_structure * mss = (multiboot_some_structure *) mbi->some_addr and get the members directly, without confusing code like this.
Can somebody explain what ((tag->size + 7) & ~7) means?

As mentioned by chux in his comment, this rounds tag->size up to the nearest multiple of 8.
Let's take a closer look at how that works.
Suppose size is 16:
00010000 // 16 in binary
+00000111 // add 7
--------
00010111 // results in 23
The expression ~7 takes the value 7 and inverts all bits. So:
00010111 // 23 (from previous step)
&11111000 // bitwise-AND ~7
--------
00010000 // results in 16
Now suppose size is 17:
00010001 // 17 in binary
+00000111 // add 7
--------
00011000 // results in 24
Then:
00011000 // 24 (from previous step)
&11111000 // bitwise-AND ~7
--------
00011000 // results in 24
So if the lower 3 bits of size are all zero, i.e. a multiple of 8, (size+7)&~7 sets those bits and then clears them, so no net effect. But if any one of those bits is 1, the bit corresponding to 8 gets incremented, then the lower bits are cleared, i.e. the number is rounded up to the nearest multiple of 8.

~ is a bitwise not. & is a bitwise AND
assuming 16 bits are used:
7 is 0000 0000 0000 0111
~7 is 1111 1111 1111 1000
Anything and'd with a 0 is 0. Anything and'd with 1 is itself. Thus
N & 0 = 0
N & 1 = N
So when you AND with ~7, you essentially clear the lowest three bits and all of the other bits remain unchanged.

Thanks to @chux for the answer. According to him, it rounds the size up to a multiple of 8 if needed. This is very similar to a technique used in 15bpp drawing code:
//+7/8 will cause this to round up...
uint32_t vbe_bytes_per_pixel = (vbe_bits_per_pixel + 7) / 8;
Here's the reasoning:
Things were pretty simple up to now but some confusion is introduced
by the 16bpp format. It's actually 15bpp since the default format is
actually RGB 5:5:5 with the top bit of each u_int16 being unused. In
this format, each of the red, green and blue colour components is
represented by a 5 bit number giving 32 different levels of each and
32768 possible different colours in total (true 16bpp would be RGB
5:6:5 where there are 65536 possible colours). No palette is used for
16bpp RGB images - the red, green and blue values in the pixel are
used to define the colours directly.

& ~7 sets the last three bits to 0

Related

Is the least significant bit (LSB) always the "first" bit?

I'm reading Modern C (version Feb 13, 2018.) and on page 42 it says
It says that the bit with index 4 is the least significant bit. Shouldn't the bit with index 0 be the least significant bit? (Same question about MSB.)
Which is right? What's the correct terminology?
Their definition of "most significant bit" and "least significant bit" is misleading:
8 bit Binary number : 1 1 1 1 0 0 0 0
Bit number            7 6 5 4 3 2 1 0
                      |     |       |
                      |     |       least significant bit
                      |     |
                      |     |
                      |     least significant bit that is 1
                      |
                      most significant bit that is 1 and also just most significant bit
The book's definition does not align with common/typical/mainstream/correct usage. See Wikipedia, for instance:
In computing, the least significant bit (LSB) is the bit position in a binary integer giving the units value, that is, determining whether the number is even or odd.
The book, on the other hand, seems to consider only bits that are 1, so that in an 8-bit byte representing the number 16, which we can write:
00010000
the bit that is 1 has index 4 (it's b4 in the book's notation), and then it claims that that particular number's LSB is four.
The proper definition just uses LSB to denote the bit whose place value is 1, i.e. the "units" position, and with that the LSB is always the rightmost bit, whether it is set or not. This latter definition is more useful, and I really think the book is wrong.
They're using an unusual definition of LSB and MSB, which only refers to the bits that are set to 1. So in the case of 240, the first 1 bit is b4, not b0, because b0 through b3 are all 0.
I'm not sure why the book considers this definition of LSB/MSB to be useful. It's not generally interesting for integers, although it does come into play in floating point. Floating point numbers are scaled so integers above 1 have the low-order zero bits shifted away, and the exponent is incremented to make up for this (conversely, fractions have their high-order bits shifted away, and the exponent is decremented).

How to set last three bit of a byte efficiently?

I want to set the last three bits of a byte to a specific value. What's the best way to achieve this goal?
I can think of at least two solutions… Let’s assume I have the following Byte: 1101 0110 and want to set the three last bits to 011.
Solution 1:
1101 0110
&1111 1000 //and with mask to clear last three bits
|0000 0011 //add the target bits
Solution 2:
1101 0110 >> 3 //shift right to remove last three
0001 1010 << 3 //shift left to clear last three
|0000 0011 //add target bits
Is there a better/shorter/more efficient way?
The best way is to say
b = (b & ~7u) | 3
because 3=0...011 and 7u=0..111 in binary, and the complement of 7u is ~7u=11...1000, so the operation does what you want. It first clears the last three bits (by doing b & ~7u) and then sets the first and the second bits (by doing bitwise-OR with 3).
If the C source code has a >> in it, that does not mean the generated code will have shift instructions.
((x>>3)<<3) | 3 may generate the exact same code as (x & ~7) | 3. Compilers are very sophisticated in their optimization.
Use what is simplest, as @Martin James says.
I recommend @blazs's solution, as it is simple to understand and copes well with signed integer issues.
(x & ~7u) | 3
I would recommend you to do your second solution, sure it depends on your hardware architecture but almost always SHIFT operations are faster than ADD.

what is the difference between logical OR operation and binary addition?

I'm trying to understand how a binary addition and logical OR table differs.
Do both carry a 1 forward? If not, which one performs the carry and which does not?
The exclusive-or (XOR) operation is like binary addition, except that
there is no carry from one bit position to the next. Thus, each bit
position can be evaluated independently of the rest.
I'll attempt to clarify a few points with a few illustrations.
First, addition. Basically like adding numbers in grade school. But if you have a 1-bit aligned with a 1-bit, you get a 0 with a 1 carry (i.e. 10, essentially analogous to 5 plus 5 in base-10). Otherwise, add them like 'regular' (base-10) numbers. For instance:
  ₁₁₁
  1001
+ 1111
______
 11000
Note that in the left-most column two 1's are added to give 10, which with another 1 gives 11 (similar to 5 + 5 + 5).
Now, assuming by "logical OR" you mean something along the lines of bitwise OR (an operation which basically performs the logical OR (inclusive) operation on each pair of corresponding bits), then you have this:
  1001
| 1111
______
  1111
The only case where you get a 0 bit is when both bits are 0.
Finally, since you tagged this question xor, here is XOR, which I assume is meant bitwise as well:
  1001
^ 1111
______
  0110 = 110₂
In this case, two 1-bits give a 0, and of course two 0-bits give 0.
With a logical OR you get a logical result (Boolean). IOW true OR true is true (anything other than false OR false is true). In some languages (like C) any numeric value other than 0 means true. And some languages use an explicit datatype for true, false (bool, Boolean).
In case of binary OR, you are ORing the bits of two binary values. ie: 1 (which is binary 1) bitwise OR 2 (which is binary 10) is binary 11:
01
10
11
which is 3. Thus binary OR is also an addition when the values do not have shared bits (like flag values).

Getting a second bit of the internal representation of the number

Please explain why doing the following gives me the second bit of the number stored in i, in its internal representation.
(i & 2) / 2;
Doing i & 2 masks out all but the second bit in i. [1]
That means the expression evaluates to either 0 or 2 (binary 00 and 10 respectively).
Dividing that by 2 gives either 0 or 1 which is effectively the value of the second bit in i.
For example, if i = 7 i.e. 0111 in binary:
i & 2 gives 0010.
0010 is 2 in decimal.
2/2 gives 1 i.e. 0001.
[1] & is the bitwise AND in C. See here for an explanation on how bitwise AND works.
i & 2 masks out all but the second bit.
Dividing it by 2 is the same as shifting down 1 bit.
e.g.
i = 01100010
(i & 2) == (i & 00000010) = 00000010
(i & 2) / 2 == (i & 2) >> 1 = 00000001
The & operator is bitwise AND: for each bit, the result is 1 only if the corresponding bits of both arguments are 1. Since the only 1 bit in the number 2 is the second-lowest bit, a bitwise AND with 2 will force all the other bits to 0. The result of (i & 2) is either 2 if the second bit in i is set, or 0 otherwise.
Dividing by 2 just changes the result to 1 instead of 2 when the second bit of i is set. It isn't necessary if you're just concerned with whether the result is zero or nonzero.
2 is 10 in binary. & is a bitwise conjunction. So, i & 2 isolates the second-from-the-end bit of i. And dividing by 2 is the same as bit-shifting by 1 to the right, which moves that bit into the last position, so the result is its value (0 or 1).
Actually, shifting to the right would be better here, as it clearly states your intent. So, this code would be normally written like this: (i & 0x02) >> 1

What does this condition written in bitwise operators really do?

What does the following condition effectively check in C :
if(a & (1<<b))
I have been wracking my brains but I can't find a pattern.
Any help?
Also I have seen this used a lot in competitive programming, could anyone explain when and why this is used?
It is checking whether the bth bit of a is set.
1<<b will shift over a single set bit b times so that only one bit in the bth position is set.
Then the & will perform a bitwise and. Since we already know the only bit that is set in 1<<b, either it is set in a, in which case we get 1<<b, or it isn't, in which case we get 0.
In mathematical terms, this condition verifies whether a's binary representation contains 2^b. In terms of bits, this checks whether bit b of a is set to 1 (the least significant bit is bit number zero).
Recall that shifting 1 to the left by b positions produces a mask consisting of all zeros and a single 1 in position b, counting from the right. The value of this mask is 2^b.
When you perform a bitwise "AND" with such a mask, the result is non-zero if, and only if, a's binary representation contains 2^b.
Let's say, for example, a = 12 (binary: 1100) and you want to check whether the third bit (bits are counted from right to left) is set to 1. To do that you can use the & bitwise operator, which works as follows:
1 & 0 = 0
0 & 1 = 0
0 & 0 = 0
1 & 1 = 1
To check if the third bit in a is set to 1 we can do:
1100
0100 &
------
0100 (4 in decimal) True
If a = 8 (binary: 1000), on the other hand:
1000
0100 &
------
0000 (0 in decimal) False
Now, to get the 0100 value, we can left-shift 1 by 2 (1 << 2), which appends two zeros on the right and gives 100. In binary, leading zeros don't change the value, so 100 is the same as 0100.
