c language bitwise trick - c

Here is some code I saw in a C program. I know that it sets the bit corresponding to the character c in an ASCII bitmap.
field[ (c & 0x7f) >> 3 ] |= 1 << (c & 0x07);
field is an array of 16 characters, each 8 bits wide.
For example, 97 is lowercase 'a'; if we set c to 97, then bit position 97 will be set to 1.
Does anyone know why the above code sets the bit in the bitmap corresponding to the character c?
And what are the magic numbers 0x7f, 0x07, 3 and 1 for?

If your array is 16 bytes long, it has 128 bits (16 x 8). So the first mask (0x7f) guarantees that you only ever address those 128 bits. Once you shift the masked value 3 bits right, you have 4 bits left, which are used to select the byte (the number (c & 0x7F) >> 3 is between 0 and 15). So this part uses the upper 4 of the 7 bits to address the byte.
Now you need to address the bit within the byte, so you use the mask 0x07 to limit the value to the range 0 - 7 (corresponding to bits 0 to 7). You use this number to shift the 1 that many positions to the left.
At the end, you have a bit set in a position 0 to 127 (16 bytes of 8 bits). I hope this helps!

First, to clear up the magic numbers:
0x7f is 0111 1111 in binary. This means only the lower 7 bits of c are significant. The result is then shifted right by 3, so that only the upper 4 of those bits (0xxx x000) remain. Since these bits have been shifted down by 3, they count from 0 to 15.
0x07 is 0000 0111 in binary. This means only the lower 3 bits are significant. The number 1 is shifted left by the value in these 3 bits, resulting in a single bit set at position 0 to 7 within the byte.
In the end, the expression only uses the lower 7 bits of c, which are the only significant bits in an ASCII character. It uses the upper 4 of them to address the byte in the array and the bottom 3 to address the bit in the addressed byte.
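The two steps described above can be sketched as a pair of small helpers. This is only an illustration, assuming field is a 16-byte array of unsigned char; the names bitmap_set, bitmap_test and FIELD_BYTES are hypothetical and do not come from the original program.

```c
#include <assert.h>

#define FIELD_BYTES 16  /* 16 bytes x 8 bits = 128 bits, one per ASCII code */

/* Set the bit that corresponds to character code c (hypothetical helper). */
void bitmap_set(unsigned char field[FIELD_BYTES], int c)
{
    unsigned idx = ((unsigned)c & 0x7f) >> 3;  /* upper 4 of the 7 bits: byte 0..15 */
    unsigned bit = (unsigned)c & 0x07;         /* lower 3 bits: bit 0..7 in that byte */

    assert(idx < FIELD_BYTES);                 /* always true because of the 0x7f mask */
    field[idx] |= (unsigned char)(1u << bit);
}

/* Return 1 if the bit for character code c is set, otherwise 0. */
int bitmap_test(const unsigned char field[FIELD_BYTES], int c)
{
    unsigned idx = ((unsigned)c & 0x7f) >> 3;
    unsigned bit = (unsigned)c & 0x07;

    return (field[idx] >> bit) & 1;
}
```

For c = 97 this sets bit 1 of field[12], because 97 >> 3 is 12 and 97 & 7 is 1.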

Related

What does it mean "bytes numbered from 0 (LSB) to 3 (MSB)"?

I need to extract byte n from word x.
Example: getByte(0x12345678,1) = 0x56.
The task also says that bytes are numbered from 0 (LSB) to 3 (MSB), and I can't understand what that means.
Thank you.
Consider your 32 bit word (0x12345678) as 4 bytes:
Word  : 12 34 56 78 (hex)
Byte #:  3  2  1  0
        MSB <---- LSB
MSB = Most Significant Byte
LSB = Least Significant Byte
It means that you are supposed to consider x as composed of bytes, $x = \sum_{n \in [0,4)} b_n \times 256^n$, and given x you are supposed to compute $b_n$. That is, $b_0$ is the least-significant byte and $b_3$ is the most-significant byte.
MSB and LSB mean Most Significant Byte and Least Significant Byte, respectively. A byte is an 8-bit number that can be written directly as 2 hexadecimal digits. So the number 0x12345678 is a word containing 4 bytes: 12 34 56 78. The rightmost is the LSB and the leftmost is the MSB. Byte 1 is therefore the SECOND byte counting from right to left.
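One possible way to write such a function is to shift the wanted byte down to the lowest position and mask off everything else. This is a sketch only: the name get_byte, the use of uint32_t and the range check are assumptions, and the original assignment may restrict which operators are allowed.

```c
#include <assert.h>
#include <stdint.h>

/* Return byte n of x, where byte 0 is the least significant byte. */
uint8_t get_byte(uint32_t x, unsigned n)
{
    assert(n < 4);                            /* a 32-bit word has bytes 0..3 */
    return (uint8_t)((x >> (8 * n)) & 0xFF);  /* move byte n to the bottom, keep 8 bits */
}
```

With x = 0x12345678 and n = 1, the shift by 8 gives 0x00123456 and the mask leaves 0x56, matching the example in the question.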

Understanding shifting and logical operations

I am trying to read the 'size' of an SD card. The sample code I have contains the following lines:
unsigned char xdata *pchar; // Pointer to external mem space for FLASH Read function;
pchar += 9; // Size indicator is in the 9th byte of CSD (Card specific data) register;
// Extract size indicator bits;
size = (unsigned int)((((*pchar) & 0x03) << 1) | (((*(pchar+1)) & 0x80) >> 7));
I don't understand what the last line, which extracts the indicator bits, is actually doing. Can somebody help me understand it?
The size is made up of bits from two bytes. One byte is at pchar, the other at pchar + 1.
((*pchar) & 0x03) takes the 2 least significant bits (chopping off the 6 most significant ones).
This result is shifted one bit to the left using << 1. For example:
11011010 (& 0x03/00000011)==> 00000010 (<< 1)==> 00000100 (-----10-)
Something similar is done with pchar + 1. For example:
11110110 (& 0x80/10000000)==> 10000000 (>> 7)==> 00000001 (-------1)
Then these two values are OR-ed together with |. So in this example you'd get:
00000100 | 00000001 = 00000101 (-----101)
But note that the 5 most significant bits will always be 0 (indicated above with -) because they were &-ed away.
To summarize, the first byte holds two bits of size, while the second byte only one bit.
It seems the size indicator, say SI, consists of 3 bits, where *pchar contains the two most significant bits of SI in its lowest two bits (0x03) and *(pchar+1) contains the least significant bit of SI in its highest bit (0x80).
The first and second line figure out how to point to the data that you want.
Let's now go through the steps involved, from left to right.
The first portion of the expression takes the byte pointed to by pchar, performs a bitwise AND of that byte with 0x03, and shifts the result left by one bit.
That result is then bitwise ORed with the next byte, *(pchar+1), which has first been ANDed with 0x80 and right shifted by seven bits. Essentially, this portion isolates the most significant bit of that byte and moves it down to the least significant position.
What the result is essentially this:
Imagine pchar points to the byte where bits are represented by letters: ABCDEFGH.
The first part ANDs with 0x03, so we are left with 000000GH. This is then left shifted by one bit, so we are left with 00000GH0.
The same goes for the right portion. The byte at pchar+1 is represented by IJKLMNOP. After the AND, we are left with I0000000. This is then right shifted seven times, so we have 0000000I. This is combined with the left-hand portion using the OR, so we have 00000GHI, which is then cast to an unsigned int, which holds your size.
Basically, there are three bits that hold the size, but they are not byte aligned. As a result, some manipulation is necessary.
size = (unsigned int)((((*pchar) & 0x03) << 1) | (((*(pchar+1)) & 0x80) >> 7));
Can somebody help me in understanding this?
We have byte *pchar and byte *(pchar+1). Each byte consists of 8 bits.
Let's label the bits of *pchar as 76543210, and the bits of *(pchar+1) likewise as 76543210 (x marks a bit that has been zeroed).
1. ((*pchar) & 0x03) << 1 means "zero all bits of *pchar except bits 0 and 1, then shift the result to the left by 1 bit":
76543210 --> xxxxxx10 --> xxxxx10x
2. (((*(pchar+1)) & 0x80) >> 7) means "zero all bits of *(pchar+1) except bit 7, then shift the result to the right by 7 bits":
76543210 --> 7xxxxxxx --> xxxxxxx7
3. ((((*pchar) & 0x03) << 1) | (((*(pchar+1)) & 0x80) >> 7)) means "combine all non-zero bits of the left and right operands into one byte":
xxxxx10x | xxxxxxx7 --> xxxxx107
So, in the result we have two low bits from *pchar and one high bit from *(pchar+1).
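The extraction walked through above can be sketched as a standalone function. The bit layout is taken from the code in the question; the function name csd_size_indicator, the plain pointer (without the xdata qualifier) and the length check are assumptions made for illustration, and csd is assumed to point at the start of the buffer so that csd[9] is the byte the question reaches with pchar += 9.

```c
#include <assert.h>
#include <stddef.h>

/* Combine the low 2 bits of csd[9] with the top bit of csd[10]
 * into a 3-bit value (hypothetical helper). */
unsigned int csd_size_indicator(const unsigned char *csd, size_t len)
{
    assert(csd != NULL && len >= 11);          /* both bytes must be inside the buffer */

    unsigned int hi = (csd[9] & 0x03u) << 1;   /* 000000GH -> 00000GH0 */
    unsigned int lo = (csd[10] & 0x80u) >> 7;  /* I0000000 -> 0000000I */

    return hi | lo;                            /* 00000GHI */
}
```

With the example bytes used earlier, 11011010 followed by 11110110, this returns binary 101, i.e. 5.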

Left shift using bitwise AND

The following lines of code shift left by 5 bits, i.e. make the bottom 3 bits the 3 MSBs of a byte:
DWORD dwControlLocAddress2;
DWORD dwWriteDataWordAddress; // Assume some initial value
dwControlLocAddress2 = ((dwWriteDataWordAddress & '\x07') * 32);
Can somebody help me understand how?
0x07 is 00000111 in binary, so you are masking the input value and keeping just the rightmost three bits. Then you are multiplying by 32, which is 2 * 2 * 2 * 2 * 2. Shifting left by 1 is the same as multiplying by 2, so shifting left five times is the same as multiplying by 32.
Multiplying by x, where x is a power of two, is the same as left shifting by log2(x):
x *= 2 -> x <<= 1
x *= 4 -> x <<= 2
.
.
.
x *= 32 -> x <<= 5
The & doesn't do the shift - it just masks the bottom three bits. The syntax used in your example is a bit weird - it's using a hexadecimal character literal '\x07', but that's literally identical to hex 0x07, which in turn in binary is:
00000111
Since any bit ANDed with 0 yields 0 and any bit ANDed with 1 is itself, the & operation in your example simply gives a result of being the bottom three bits of dwWriteDataWordAddress.
It's a bit obtuse but essentially you're anding with 0x07 and then multiplying by 32 which is the same as shifting by 5. I'm not sure why a character literal is used rather than an integer literal but perhaps so that it is represented as a single byte rather than a word.
The equivalent would be:
( ( dw & 0x07 ) << 5 )
The & 0x07 keeps only the lowest 3 bits and << 5 does a left shift by 5 bits.
& '\x07' - masks in the bottom three bits only (hex 7 is 111 in binary)
* 32 - left shifts by 5 (32 is 2^5)
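The equivalence of the two forms can be checked with a short sketch. uint32_t stands in for DWORD here, and the function names are hypothetical; neither comes from the original code.

```c
#include <assert.h>
#include <stdint.h>

/* The form used in the question: mask the low 3 bits, then multiply by 32. */
uint32_t low3_times_32(uint32_t v)
{
    return (v & 0x07u) * 32u;
}

/* The equivalent form: mask the low 3 bits, then shift left by 5. */
uint32_t low3_shift_5(uint32_t v)
{
    return (v & 0x07u) << 5;
}

/* Compare both forms for every possible value of the low byte. */
void check_equivalence(void)
{
    for (uint32_t v = 0; v < 256; ++v) {
        assert(low3_times_32(v) == low3_shift_5(v));
    }
}
```

For an input of 0xFF both forms give 0xE0: the mask leaves 00000111, and moving it up 5 places gives 11100000.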

reading 2 bits off a register

I'm looking at a datasheet specification of a NIC and it says:
bits 2:3 of the register contain the NIC speed, bit 4 contains the link state, etc. How can I isolate these bits using bitwise operations?
For example, I've seen the code to isolate the link state which is something like:
(link_reg & (1 << 4))>>4
But I don't quite get why the right shift. I must say, I'm still not fairly comfortable with the bitwise ops, even though I understand how to convert to binary and what each operation does, but it doesn't ring as practical.
It depends on what you want to do with that bit. The link state, call it L is in a variable/register somewhere
    43210
xxxxLxxxx
To isolate that bit you want to and it with a 1, a bitwise operation:
  xxLxxxx
& 0010000
=========
  00L0000
1<<4 = 1 with 4 zeros or 0b10000, the number you want to and with.
status&(1<<4)
This will give a result of either zero or 0b10000. You can do a boolean comparison to determine if it is false (zero) or true (not zero)
if(status&(1<<4))
{
//bit was on/one
}
else
{
//bit was off/zero
}
If you want to have the result be a 1 or zero, you need to shift the result to the ones column
(0b00L0000 >> 4) = 0b0000L
If the result of the and was zero then shifting still gives zero, if the result was 0b10000 then the shift right of 4 gives a 0b00001
so
(status&(1<<4))>>4 gives either a 1 or 0;
(xxxxLxxxx & (00001<<4))>>4 =
(xxxxLxxxx & (10000))>>4 =
(0000L0000) >> 4 =
0000L
Another way to do this using fewer operations is
(status>>4)&1;
xxxxLxxxx >> 4 = 0000xxxxL
0000xxxxL & 000000001 = 00000000L
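Both orderings can be written as small functions and compared. This is a sketch with hypothetical names; the limit of 16 on pos is an assumption chosen because unsigned int is guaranteed to have at least 16 bits.

```c
#include <assert.h>

/* Mask first, then shift the isolated bit down to the ones column. */
unsigned bit_mask_then_shift(unsigned status, unsigned pos)
{
    assert(pos < 16);
    return (status & (1u << pos)) >> pos;
}

/* Shift first, then mask: one shift fewer than the version above. */
unsigned bit_shift_then_mask(unsigned status, unsigned pos)
{
    assert(pos < 16);
    return (status >> pos) & 1u;
}
```

Both return 1 for bit 4 of 00111010 (0x3A) and 0 for bit 2 of the same value.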
Easiest to look at some binary numbers.
Here's a possible register value, with the bit index underneath:
00111010
76543210
So, bit 4 is 1. How do we get just that bit? We construct a mask containing only that bit (which we can do by shifting a 1 into the right place, i.e. 1<<4), and use &:
  00111010
& 00010000
----------
  00010000
But we want a 0 or a 1. So, one way is to shift the result down: 00010000 >> 4 == 1. Another alternative is !!val, which turns 0 into 0 and nonzero into 1 (note that this only works for single bits, not a two-bit value like the link speed).
Now, if you want bits 3:2, you can use a mask with both of those bits set. You can write 3 << 2 to get 00001100 (since 3 has two bits set). Then we & with it:
  00111010
& 00001100
----------
  00001000
and shift down by 2 to get 10, the desired two bits. So, the statement to get the two-bit link speed would be (link_reg & (3<<2))>>2.
If you want to treat bits 2 and 3 (starting the count at 0) as a number, you can do this:
unsigned int n = (link_reg & 0xF) >> 2;
The bitwise and with 15 (which is 0b1111 in binary) sets all but the bottom four bits to zero, and the following right-shift by 2 gets you the number in bits 2 and 3.
You can use this to determine if the bit at position pos is set in val:
#define CHECK_BIT(val, pos) ((val) & (1U<<(pos)))
if (CHECK_BIT(reg, 4)) {
/* bit 4 is set */
}
The bitwise AND operator (&) sets each bit in the result to 1 if both operands have the corresponding bit set to 1. Otherwise, the result bit is 0.
The problem is that isolating the bits is not enough: you also need to shift them down to get the correct magnitude of the value.
In your example you have bits 2 and 3 for the speed (I'm assuming that the least significant bit is bit 0), which means it is a value in the range [0,3]. Now you can mask these bits with reg & (0x03<<2) or, equivalently, (reg & 0x0C), but this is not enough:
reg   0110 1010 &
0x0C  0000 1100
---------------
0x08  0000 1000
As you can see, the result is 1000b, which is 8 and therefore outside the range. To solve this you need to shift the result back so that the least significant bit of the value you are interested in lines up with the least significant bit of the containing byte:
0000 1000 >> 2 = 10b = 2
which is now correct.
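The mask-and-shift pattern generalizes to a field of any width. The following is a minimal sketch: get_field is a hypothetical helper, and the field positions (speed in bits 3:2, link state in bit 4) are the ones given in the question, so the real layout should be checked against the datasheet.

```c
#include <assert.h>
#include <stdint.h>

/* Extract `width` bits starting at bit `lsb` (bit 0 is least significant). */
uint32_t get_field(uint32_t reg, unsigned lsb, unsigned width)
{
    assert(lsb < 32 && width >= 1 && width < 32);

    uint32_t mask = (UINT32_C(1) << width) - 1;  /* width 2 gives binary 11 */
    return (reg >> lsb) & mask;                  /* shift down first, then mask */
}
```

For the register value 0110 1010 used above, get_field(reg, 2, 2) yields 2 for the speed, and get_field(reg, 4, 1) yields 0 for the link state.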

Decomposition of an IP header

I have to do a sniffer as an assignment for the security course. I am using C and the pcap library. I got everything working well (since I got a code from the internet and changed it). But I have some questions about the code.
u_int ip_len = (ih->ver_ihl & 0xf) * 4;
ih is of type ip_header, and it is currently pointing to the IP header in the packet.
ver_ihl gives the version of the IP.
I can't figure out what is: & 0xf) * 4;
& is the bitwise AND operator. In this case you're ANDing ver_ihl with 0xf, which has the effect of clearing all the bits other than the least significant 4:
0xff & 0x0f = 0x0f
ver_ihl is defined as first 4 bits = version + second 4 bits = Internet Header Length. The AND operation removes the version data, leaving the length data by itself. The length is recorded as a count of 32-bit words, so the *4 turns ip_len into the count of bytes in the header.
In response to your comment:
bitwise and ands the corresponding bits in the operands. When you and anything with 0 it becomes 0 and anything with 1 stays the same.
0xf = 0x0f = binary 0000 1111
So when you and 0x0f with anything the first 4 bits are set to 0 (as you are anding them against 0) and the last 4 bits remain as in the other operand (as you are anding them against 1). This is a common technique called bit masking.
http://en.wikipedia.org/wiki/Bitwise_operation#AND
Reading from RFC 791 that defines IPv4:
A summary of the contents of the internet header follows:
    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |Version|  IHL  |Type of Service|          Total Length         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
The first 8 bits of the IP header are a combination of the version, and the IHL field.
IHL: 4 bits
Internet Header Length is the length of the internet header in 32
bit words, and thus points to the beginning of the data. Note that
the minimum value for a correct header is 5.
What the code you have is doing, is taking the first 8 bits there, and chopping out the IHL portion, then converting it to the number of bytes. The bitwise AND by 0xF will isolate the IHL field, and the multiply by 4 is there because there are 4 bytes in a 32-bit word.
The ver_ihl field contains two 4-bit integers, packed as the low and high nybble. The length is specified as a number of 32-bit words. So, if you have a Version 4 IP frame, with 20 bytes of header, you'd have a ver_ihl field value of 69. Then you'd have this:
  01000101
& 00001111
  --------
  00000101
So, yes, the "&0xf" keeps the low 4 bits and clears the rest.
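Splitting the packed byte into its two fields can be sketched as follows. The function names are hypothetical; only the layout (version in the high nibble, IHL in the low nibble, IHL counted in 32-bit words with a minimum of 5) comes from the RFC 791 text quoted above.

```c
#include <assert.h>
#include <stdint.h>

/* High nibble of the first header byte: the IP version. */
unsigned ip_version(uint8_t ver_ihl)
{
    return (unsigned)(ver_ihl >> 4);
}

/* Low nibble is the header length in 32-bit words; times 4 gives bytes. */
unsigned ip_header_bytes(uint8_t ver_ihl)
{
    return (ver_ihl & 0x0Fu) * 4u;
}

/* RFC 791 states that the minimum value for a correct header is 5 words. */
int ip_ihl_is_valid(uint8_t ver_ihl)
{
    return (ver_ihl & 0x0Fu) >= 5u;
}

/* Worked example from the answer above: 69 (0x45) is version 4, 20 bytes. */
void ip_header_selfcheck(void)
{
    assert(ip_version(69) == 4);
    assert(ip_header_bytes(69) == 20);
}
```

A sniffer would typically check ip_ihl_is_valid before using the computed length, since a captured packet can carry a malformed header.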

Resources