How to do data alignment in C? - c

I want to know why the following macro works for data alignment in C.
#define CMIALIGN(x,n) (size_t)((~(n-1))&((x)+(n-1)))
Say n is equal to 8; why should the following macro work?
#define ALIGN8(x) (size_t)((~7)&((x)+7))
Can you please show it with an example and explain why this formula works? Is there any other practical formula for data alignment?

The purpose of the ~N (where N is one less than the alignment you seek) is to ensure all high-order bits already present in your number are kept lit after the alignment, including the bits pushed there by the add operation. The actual round-up for alignment is done by the addition of N. This ensures that any carry bits are pushed into higher bit locations, which the bitwise-AND with ~N is then guaranteed to retain while the bottom bits are swept away, as they are not needed.
Imagine this:
00100011 : 35
+ 00000111 : 7
-------- ----
00101010 : 42
& 11111000 : ~7
---------- ----
00101000 : 40
Another example:
11101111 : 239
+ 00000111 : 7
-------- ----
11110110 : 246
& 11111000 : ~7
-------- ----
11110000 : 240
And finally, an example that ends up doing nothing, as it is already aligned:
10100000 : 160
+ 00000111 : 7
-------- ----
10100111 : 167
& 11111000 : ~7
-------- ----
10100000 : 160
I should note that this offers no real protection against overflow, other than the caller checking for a zero return value, which is clearly not what you would want as a rounded-up alignment value.
11111110 : 254
+ 00000111 : 7
-------- ----
00000101 : 5 (overflow)
& 11111000 : ~7
-------- ----
00000000 : 0
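As a minimal sketch of the formula above, the macro can be written as a function, next to a variant that reports overflow instead of wrapping to zero. The names align_up and align_up_checked are illustrative and not from the question, and both assume n is a power of two.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Round x up to the next multiple of n; n must be a power of two. */
size_t align_up(size_t x, size_t n)
{
    assert(n != 0 && (n & (n - 1)) == 0);   /* power-of-two check */
    return (x + (n - 1)) & ~(n - 1);
}

/* Overflow-aware variant: returns false instead of wrapping around. */
bool align_up_checked(size_t x, size_t n, size_t *out)
{
    assert(n != 0 && (n & (n - 1)) == 0);
    if (x > SIZE_MAX - (n - 1))
        return false;                        /* x + (n - 1) would wrap */
    *out = (x + (n - 1)) & ~(n - 1);
    return true;
}
```

With n = 8 this reproduces the diagrams: 35 gives 40, 239 gives 240, and 160 stays 160.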

Related

Bitwise carry applications

Call me naive but in this area I always struggled. So I was just browsing through the code for adding two numbers without + operator and bumped into this code:
int Add(int x, int y)
{
    // Iterate till there is no carry
    while (y != 0)
    {
        // carry now contains common set bits of x and y
        int carry = x & y;
        // Sum of bits of x and y where at least one of the bits is not set
        x = x ^ y;
        // Carry is shifted by one so that adding it to x gives the
        // required sum
        y = carry << 1;
    }
    return x;
}
Now I understand how it calculates the carry, but why the y != 0 condition, and how does this code achieve the result of adding two numbers?
Basics first. Exclusive or'ing two bits is the same as the bottom digit of their sum. And'ing two bits is the same as the top bit of their sum.
A | B | A&B | A^B | A+B
-----------------------
0 | 0 | 0 | 0 | 00
0 | 1 | 0 | 1 | 01
1 | 0 | 0 | 1 | 01
1 | 1 | 1 | 0 | 10
As you can see the exclusive-or result is the same as the last digit of the sum. You can also see that the first digit of the sum is only 1 when A is 1 and B is 1.
[If you have a circuit with two inputs and two outputs, one of which is the exclusive or of the inputs and the other is the and of the inputs, it is called a half adder - because there is no facility to also input a carry (from a previous digit).]
So, to sum two bits, you calculate the XOR to get the lowest digit of the result and the AND to get the highest digit of the result.
For each individual pair of bits in a pair of numbers, I can calculate the sum of those two bits by doing both an XOR and an AND. Using four bit numbers, for example 3 and 5
3 0011
5 0101
------
0110 3^5 = 6 (low bit)
0001 3&5 = 1 (high bit)
In order to treat the 3 and 5 as single numbers rather than collections of four bits, each of those high bits needs to be treated as a carry and added to the next low bit to the left. We can do this by shifting the 3&5 left 1 bit and adding to the 3^5 which we do by repeating the two operations
6 0110
1<<1 0010
----
0100 6^(1<<1) = 4
0010 6&(1<<1) = 2
Unfortunately, one of the additions resulted in another carry being generated. So we can just repeat the operation.
4 0100
2<<1 0100
----
0000 4^(2<<1) = 0
0100 4&(2<<1) = 4
We still have a carry, so round we go again.
0 0000
4<<1 1000
----
1000 0^(4<<1) = 8
0000 0&(4<<1) = 0
This time, all the carries are 0 so more iterations are not going to change anything. We've finished.
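The rounds above can be replayed in code. This is a sketch rather than the original poster's function: it uses unsigned operands so the left shift is always well defined, and the name add_no_plus and the optional steps counter are additions made for illustration.

```c
#include <stddef.h>

/* Add two unsigned values using only XOR, AND and shift.
   If steps is non-NULL it receives the number of loop iterations. */
unsigned add_no_plus(unsigned x, unsigned y, unsigned *steps)
{
    unsigned n = 0;
    while (y != 0) {
        unsigned carry = x & y;  /* positions where both bits are 1 */
        x = x ^ y;               /* per-bit sum, carries ignored */
        y = carry << 1;          /* carries move one position left */
        n++;                     /* bookkeeping only, not part of the sum */
    }
    if (steps != NULL)
        *steps = n;
    return x;
}
```

For 3 and 5 the loop runs four times, matching the four rounds worked through above, and returns 8.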
I will try to explain it with a simple 3-bit example (you can skip this example and go to the actual explanation, which starts at "Now to the way we achieve the same flow from the posted code").
Let's say we want to add x=0b011 and y=0b101. First we add the least significant bits: 1+1 = 0b10
carry: x10
x: 011
+
y: 101
-----
xx0
Then we add the second bits (and by the book we need to add also the carry from the previous stage but we can also skip it for later stage): 1+0 = 0b1
carry: 010
x: 011
+
y: 101
-----
x10
Do the same for the third bit: 0+1 = 0b1
carry: 010
x: 011
+
y: 101
-----
110
So now we have carry = 0b010 and some partial result 0b110.
Remember my comment earlier that we take care of carry at some later stage? So now is this "later stage". Now we add the carry to the partial result we got (note that it is the same if we added the carry for each bit separately at the earlier stages). LSB bits addition:
NEW carry: x00
carry: 010
+
part. res.: 110
-----
xx0
Second bits addition:
NEW carry: 100
carry: 010
+
part. res.: 110
-----
x00
Third bit addition:
NEW carry: 100
carry: 010
+
part. res.: 110
-----
new part. res. 100
Now carry = NEW carry, part. res. = new part. res. and we do the same iteration once again.
For LSB
NEW carry: x00
carry: 100
+
part. res.: 100
-----
xx0
For the second bits:
NEW carry: 000
carry: 100
+
part. res.: 100
-----
x00
Third bits:
NEW carry: 1000 --> 000 since we are working with 3 bits only
carry: 100
+
part. res.: 100
-----
000
Now NEW carry is 0, so we have finished the calculation. The final result is 0b000 (overflow).
I am sure I haven't discovered anything new so far. Now to the way we achieve the same flow from the posted code:
The partial result is the result without the carry, which means when x and y have different bits at the same position, the sum of these bits will be 1. If the bits are identical, the result will be 0 (1+1 => 0, carry 1 and 0+0 => 0, carry 0).
Thus partial result is x ^ y (see the properties of the XOR operation). In the posted code it is x = x ^ y;.
Now let's look at the carry. We will get carry from a single bit addition only if both bits are 1. So the bits which will set the carry bits to 1 are marked as 1 in the following expression: x & y (only the set bits at the same position will remain 1). But the carry should be added to the next (more significant) bit! Thus
carry = (x & y) << 1; // in the posted code it is y = carry << 1
And the iterations are performed until carry is 0 (like in our example).
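To reproduce the 3-bit walk-through, including the overflow to 0b000, the same loop can be limited to a fixed register width. This is a sketch under assumptions: add_n_bits and its bits parameter are hypothetical, and the width is restricted to 1..15 so the shift stays valid even where unsigned is only 16 bits wide.

```c
#include <assert.h>

/* Add two values in a register that is only `bits` wide. */
unsigned add_n_bits(unsigned x, unsigned y, unsigned bits)
{
    assert(bits >= 1 && bits <= 15);
    unsigned mask = (1u << bits) - 1u;
    x &= mask;
    y &= mask;
    while (y != 0) {
        unsigned carry = (x & y) << 1;  /* carries, moved one place left */
        x = x ^ y;                      /* partial result */
        y = carry & mask;               /* a carry out of the top bit is dropped */
    }
    return x;
}
```

With three bits, 0b011 plus 0b101 ends at 0b000, as in the example; with four bits the same inputs give 0b1000.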

What does (size + 7) & ~7 mean?

I'm reading the Multiboot2 specification. You can find it here. Compared to the previous version, it names all of its structures "tags". They're defined like this:
3.1.3 General tag structure
Tags constitutes a buffer of structures following each other padded on u_virt size. Every structure has
following format:
+-------------------+
u16 | type |
u16 | flags |
u32 | size |
+-------------------+
type is divided into 2 parts. Lower contains an identifier of
contents of the rest of the tag. size contains the size of tag
including header fields. If bit 0 of flags (also known as
optional) is set if bootloader may ignore this tag if it lacks
relevant support. Tags are terminated by a tag of type 0 and size
8.
Then later in example code:
for (tag = (struct multiboot_tag *) (addr + 8);
tag->type != MULTIBOOT_TAG_TYPE_END;
tag = (struct multiboot_tag *) ((multiboot_uint8_t *) tag
+ ((tag->size + 7) & ~7)))
The last part confuses me. In Multiboot 1, the code was substantially simpler, you could just do multiboot_some_structure * mss = (multiboot_some_structure *) mbi->some_addr and get the members directly, without confusing code like this.
Can somebody explain what ((tag->size + 7) & ~7) means?
As mentioned by chux in his comment, this rounds tag->size up to the nearest multiple of 8.
Let's take a closer look at how that works.
Suppose size is 16:
00010000 // 16 in binary
+00000111 // add 7
--------
00010111 // results in 23
The expression ~7 takes the value 7 and inverts all bits. So:
00010111 // 23 (from previous step)
&11111000 // bitwise-AND ~7
--------
00010000 // results in 16
Now suppose size is 17:
00010001 // 17 in binary
+00000111 // add 7
--------
00011000 // results in 24
Then:
00011000 // 24 (from previous step)
&11111000 // bitwise-AND ~7
--------
00011000 // results in 24
So if the lower 3 bits of size are all zero, i.e. a multiple of 8, (size+7)&~7 sets those bits and then clears them, so no net effect. But if any one of those bits is 1, the bit corresponding to 8 gets incremented, then the lower bits are cleared, i.e. the number is rounded up to the nearest multiple of 8.
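To see why the loop needs the rounding, here is a sketch that walks a buffer of tags the same way. It is not the Multiboot2 example code: struct tag_header and count_tags are hypothetical names, the header layout follows the excerpt quoted in the question (u16 type, u16 flags, u32 size), and the header is copied out with memcpy so the sketch does not rely on the buffer's alignment.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 8-byte header laid out like the quoted excerpt. */
struct tag_header {
    uint16_t type;
    uint16_t flags;
    uint32_t size;   /* includes the header itself, excludes padding */
};

/* Count the tags in buf that precede the terminating tag of type 0. */
size_t count_tags(const unsigned char *buf, size_t len)
{
    size_t offset = 0, count = 0;
    while (offset + sizeof(struct tag_header) <= len) {
        struct tag_header h;
        memcpy(&h, buf + offset, sizeof h);
        if (h.type == 0 || h.size < sizeof h || h.size > len - offset)
            break;                                    /* terminator or malformed tag */
        count++;
        offset += ((size_t)h.size + 7) & ~(size_t)7;  /* next 8-byte boundary */
    }
    return count;
}
```

A tag with size 12 occupies 16 bytes in the buffer, so the next header is read at offset 16 rather than 12.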
~ is a bitwise not. & is a bitwise AND
assuming 16 bits are used:
7 is 0000 0000 0000 0111
~7 is 1111 1111 1111 1000
Anything and'd with a 0 is 0. Anything and'd with 1 is itself. Thus
N & 0 = 0
N & 1 = N
So when you AND with ~7, you essentially clear the lowest three bits and all of the other bits remain unchanged.
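Taken on its own, the AND with ~7 rounds a value down to a multiple of 8. A small sketch, assuming a 64-bit operand; the function name align_down8 is made up for illustration.

```c
#include <stdint.h>

/* Clear the lowest three bits: rounds x down to a multiple of 8. */
uint64_t align_down8(uint64_t x)
{
    /* The cast matters: ~7u would be a 32-bit mask on typical platforms
       and would also clear bits 32-63 once widened to 64 bits. */
    return x & ~(uint64_t)7;
}
```

So 23 becomes 16, 24 stays 24, and 7 becomes 0.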
Thanks to @chux for the answer. According to him, it rounds the size up to a multiple of 8, if needed. This is very similar to a technique used in 15bpp drawing code:
//+7/8 will cause this to round up...
uint32_t vbe_bytes_per_pixel = (vbe_bits_per_pixel + 7) / 8;
Here's the reasoning:
Things were pretty simple up to now but some confusion is introduced
by the 16bpp format. It's actually 15bpp since the default format is
actually RGB 5:5:5 with the top bit of each u_int16 being unused. In
this format, each of the red, green and blue colour components is
represented by a 5 bit number giving 32 different levels of each and
32768 possible different colours in total (true 16bpp would be RGB
5:6:5 where there are 65536 possible colours). No palette is used for
16bpp RGB images - the red, green and blue values in the pixel are
used to define the colours directly.
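The arithmetic in that passage can be checked with a short sketch. The function names are illustrative and not taken from any VBE header; the first is the (bits + 7) / 8 round-up from the snippet, the second counts colours from the bits per channel.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes needed to hold one pixel of the given bit depth (rounds up). */
uint32_t bytes_per_pixel(uint32_t bits_per_pixel)
{
    return (bits_per_pixel + 7) / 8;
}

/* Number of colours for an RGB layout with the given bits per channel. */
uint32_t colour_count(unsigned r_bits, unsigned g_bits, unsigned b_bits)
{
    assert(r_bits + g_bits + b_bits < 32);
    return (uint32_t)1 << (r_bits + g_bits + b_bits);
}
```

Both 15bpp and 16bpp come out at 2 bytes per pixel, RGB 5:5:5 yields 32768 colours, and RGB 5:6:5 yields 65536.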
& ~7 sets the last three bits to 0

basic bit field C

I don't know why this code works. It is supposed to print out every student who takes chem. But why does a number such as 21&4 (student 123001) evaluate to true while a number like 49&4 (student 123008) doesn't?
I think it is due to bit operation AND.
In binary
49 is 110001
4 is 000100
& = 000000
So it evaluates to false
whereas
21 is 10101
4 is 00100
& = 00100
So you get a non-zero result which is true.
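The same test can be written as a small helper. The question's code is not shown, so the flag name COURSE_CHEM and the function follows_chem are assumptions; only the value 4 and the two sample numbers come from the question.

```c
#include <stdbool.h>

/* Hypothetical flag: the question only says chemistry is tested with & 4. */
enum { COURSE_CHEM = 4 };   /* binary 000100, i.e. bit 2 */

/* True when the chemistry bit is set in the student's course bit field. */
bool follows_chem(unsigned courses)
{
    return (courses & COURSE_CHEM) != 0;
}
```

follows_chem(21) is true because 21 has bit 2 set, while follows_chem(49) is false because 49 does not.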

Bit operation of extract flags into 24 bit integer

I saw a line in xen's kernel code (file: xen/include/asm-x86/x86_64/page.h), but cannot understand why they are doing this:
/* Extract flags into 24-bit integer, or turn 24-bit flags into a pte mask. */
#define get_pte_flags(x) (((int)((x) >> 40) & ~0xFFF) | ((int)(x) & 0xFFF))
#define put_pte_flags(x) (((intpte_t)((x) & ~0xFFF) << 40) | ((x) & 0xFFF))
As to
#define get_pte_flags(x) (((int)((x) >> 40) & ~0xFFF) | ((int)(x) & 0xFFF))
I understand ((int)(x) & 0xFFF) will extract the last 24 bits of x, but why they need the first part ((int)((x) >> 40) & ~0xFFF) ?
As to
#define put_pte_flags(x) (((intpte_t)((x) & ~0xFFF) << 40) | ((x) & 0xFFF))
I'm lost at the purpose of ((intpte_t)((x) & ~0xFFF) << 40). It should be 0 in my opinion. Then why do we need it?
Thanks,
I had to look twice at their code, because it took me a minute to realize that 0xFFF is not 24 bits, it's only 12 bits. So take an example 64-bit input: 0xAABBCCDDEEFF1122. Shift it right by 40, and you get 0x0000000000AABBCC. ~0xFFF is shorthand in this case for 0xFFFFFFFFFFFFF000. AND them together, and you get 0x0000000000AAB000. So basically, they grabbed the top 12 bits and moved them down. Then they OR that with the bottom 12 bits, so they end up with 0x0000000000AAB122.
The other half does the opposite, takes 24 bits at the bottom, cuts them in half and puts 12 at the top and 12 at the bottom.
0xFFF is not 24 one-bits, it's only 12.
Knowing this, you'll see that the purpose of get_pte_flags is to move the top 12 bits into bit positions 12-23, like so:
xxxxxxxx xxxx0000 00000000 00000000 00000000 00000000 0000yyyy yyyyyyyy
becomes
00000000 00000000 00000000 00000000 00000000 xxxxxxxx xxxxyyyy yyyyyyyy
Of course, put_pte_flags does the inverse, moving the bits back to the most significant position.
Think 64 bit.
On a 32 bit system the result would be 0, of course.
But, when you shift 24 bit 40 bits left, you have
xxxxxxxx yyyyyyyy zzzzzzzz 00000000 00000000 00000000 00000000 00000000
which is a valid 64 bit value.
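The round trip can be sketched with fixed-width unsigned types. These are not Xen's macros: get_flags24 and put_flags24 are illustrative names, and uint32_t/uint64_t stand in for int and intpte_t so that no signed conversion is involved. The mask 0xFFF000 selects the same bits 12-23 that ~0xFFF leaves in a 24-bit value.

```c
#include <stdint.h>

/* Pack: top 12 bits of a 64-bit entry go to bits 12-23, low 12 bits stay put. */
uint32_t get_flags24(uint64_t pte)
{
    return ((uint32_t)(pte >> 40) & 0xFFF000u) | ((uint32_t)pte & 0xFFFu);
}

/* Unpack: the inverse, bits 12-23 return to the top 12 bits of the entry. */
uint64_t put_flags24(uint32_t flags)
{
    return ((uint64_t)(flags & 0xFFF000u) << 40) | (flags & 0xFFFu);
}
```

For the input 0xAABBCCDDEEFF1122 the packed value is 0xAAB122, and unpacking 0xAAB122 gives 0xAAB0000000000122.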

C GPIO hex numbering

I have been given the following bit of code as an example:
Make port 0 bits 0-2 outputs, other to be inputs.
FIO0DIR = 0x00000007;
Set P0.0, P0.1, P0.2 all low (0)
FIO0CLR = 0x00000007;
I have been told that the port has 31 LEDs attached to it. I can't understand why, to enable the first 3 outputs, it is 0x00000007 and not 0x00000003?
These GPIO config registers are bitmaps.
Use your Windows calculator to convert the hex to binary:
0x00000007 = 111, or with 32 bits - 00000000000000000000000000000111 // three outputs
0x00000003 = 11, or with 32 bits - 00000000000000000000000000000011 // only two outputs
Because the value you write to the register is a binary bit-mask, with a bit being one meaning "this is an output". You don't write the "number of outputs I'd like to have", you are setting 32 individual flags at the same time.
The number 7 in binary is 00000111, so it has the lower-most three bits set to 1, which here seems to mean "this is an output". The decimal value 3, on the other hand, is just 00000011 in binary, thus only having two bits set to 1, which clearly is one too few.
Bits are indexed from the right, starting at 0. The decimal value of bit number n is 2^n. The decimal value of a binary number with more than one bit set is simply the sum of all the values of all the set bits.
So, for instance, the decimal value of the number with bits 0, 1 and 2 set is 2^0 + 2^1 + 2^2 = 1 + 2 + 4 = 7.
Here is an awesome ASCII table showing the 8 bits of a byte and their individual values:
+---+---+---+---+---+---+---+---+
index | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
+---+---+---+---+---+---+---+---+
value |128| 64| 32| 16| 8 | 4 | 2 | 1 |
+---+---+---+---+---+---+---+---+
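The same idea in code, as a sketch: the helpers below build masks from bit positions rather than from hand-written hex. The names bit and low_bits_mask are made up for illustration, and no real register such as FIO0DIR is touched here.

```c
#include <assert.h>
#include <stdint.h>

/* Mask with a single bit set: bit n has the value 2^n. */
uint32_t bit(unsigned n)
{
    assert(n < 32);
    return (uint32_t)1 << n;
}

/* Mask with the lowest `count` bits set, e.g. 3 -> 0x00000007. */
uint32_t low_bits_mask(unsigned count)
{
    assert(count <= 32);
    return count == 32 ? UINT32_MAX : (((uint32_t)1 << count) - 1u);
}
```

Here bit(0) | bit(1) | bit(2) and low_bits_mask(3) both give 0x00000007, whereas low_bits_mask(2) gives 0x00000003 and would cover only two pins.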

Resources