Decomposition of an IP header - c

I have to do a sniffer as an assignment for the security course. I am using C and the pcap library. I got everything working well (since I got a code from the internet and changed it). But I have some questions about the code.
u_int ip_len = (ih->ver_ihl & 0xf) * 4;
ih is of type ip_header, and it's currently pointing to the IP header in the packet.
ver_ihl gives the version of the IP.
I can't figure out what is: & 0xf) * 4;

& is the bitwise AND operator. In this case you're ANDing ver_ihl with 0xf, which has the effect of clearing all the bits other than the least significant 4:
0xff & 0x0f = 0x0f
ver_ihl is defined as first 4 bits = version + second 4 = Internet header length. The and operation removes the version data leaving the length data by itself. The length is recorded as count of 32 bit words so the *4 turns ip_len into the count of bytes in the header
In response to your comment:
Bitwise AND combines the corresponding bits in the operands. Any bit ANDed with 0 becomes 0, and any bit ANDed with 1 stays the same.
0xf = 0x0f = binary 0000 1111
So when you AND 0x0f with anything, the first 4 bits are set to 0 (as you are ANDing them against 0) and the last 4 bits remain as in the other operand (as you are ANDing them against 1). This is a common technique called bit masking.
http://en.wikipedia.org/wiki/Bitwise_operation#AND
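As a minimal sketch of the masking described above, the two helpers below split a ver_ihl byte into its two halves. The function names are made up for illustration; only ver_ihl comes from the question.

```c
#include <stdint.h>

/* Hypothetical helpers: the names are illustrative, not part of pcap. */

/* High nibble of the first header byte: the IP version. */
static unsigned ip_version(uint8_t ver_ihl)
{
    return (unsigned)(ver_ihl >> 4);
}

/* Low nibble is the IHL in 32-bit words; multiply by 4 to get bytes. */
static unsigned ip_header_bytes(uint8_t ver_ihl)
{
    return (ver_ihl & 0x0fu) * 4u;
}
```

For a typical IPv4 header whose first byte is 0x45, ip_version gives 4 and ip_header_bytes gives 20.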

Reading from RFC 791 that defines IPv4:
A summary of the contents of the internet header follows:
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Version|  IHL  |Type of Service|          Total Length         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
The first 8 bits of the IP header are a combination of the version, and the IHL field.
IHL: 4 bits
Internet Header Length is the length of the internet header in 32
bit words, and thus points to the beginning of the data. Note that
the minimum value for a correct header is 5.
What the code you have is doing, is taking the first 8 bits there, and chopping out the IHL portion, then converting it to the number of bytes. The bitwise AND by 0xF will isolate the IHL field, and the multiply by 4 is there because there are 4 bytes in a 32-bit word.
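Since the RFC text quoted above says the minimum valid IHL is 5, a sniffer may want to check the field before trusting it. The following is only a sketch: ipv4_header_len and its caplen parameter (the number of bytes actually captured from the start of the IP header) are hypothetical names, not part of pcap.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns the IPv4 header length in bytes, or 0 if
 * the buffer is empty, the IHL is below the RFC 791 minimum of 5, or the
 * header would run past the captured bytes.
 */
static size_t ipv4_header_len(const uint8_t *ip, size_t caplen)
{
    if (ip == NULL || caplen < 1)
        return 0;

    size_t ihl_words = ip[0] & 0x0f;   /* low 4 bits of the first byte */
    size_t len = ihl_words * 4;        /* 32-bit words -> bytes */

    if (ihl_words < 5 || len > caplen)
        return 0;
    return len;
}
```

A return value of 0 lets the caller skip a malformed or truncated packet instead of reading past the end of the buffer.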

The ver_ihl field contains two 4-bit integers, packed as the low and high nybble. The length is specified as a number of 32-bit words. So, if you have a Version 4 IP frame, with 20 bytes of header, you'd have a ver_ihl field value of 69. Then you'd have this:
  01000101
& 00001111
  --------
  00000101
So, yes, the "& 0xf" keeps only the low 4 bits and clears the high 4.

Related

How do I get the MSB and LSB of a 16 bit unsigned short in C?

I'm trying to return the 10 least significant bits (while setting the 6 most significant bits to 0) and 6 most significant bits (while setting the 10 least significant bits to 0) from a 16-bit unsigned short and I'm stuck on how to accomplish this. For example, if I have an unsigned short 0x651A, then the bit representation would be:
// MSB LSB
// +-----------+-------------------+
// |0 1 1 0 0 1|0 1 0 0 0 1 1 0 1 0|
// +-----------+-------------------+
// | | |
// bit offset: 16 10 0
So, if I were to get the 6 most significant bits, then the returned short would be 0b0000000000011001. I'm very new to C and I'm still trying to understand bit management and bit shifting. Any advice or feedback is appreciated in helping me understand C better.
To get the least significant bits, you would use a bit mask. You apply a bitwise and to the number at hand with a mask of 10 1's in this case. The number needed can be obtained by a bitshift of 1, and then subtracting 1. That is (1 << 10)-1 in this case. So the result for any x is x & ((1 << 10)-1).
To get the most significant bits is easier. You just shift off the lower bits, eg x >> 10
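A short sketch of both operations, using the 10/6 split from the question. The helper names are illustrative only. The third function covers the other reading of the question, where the top 6 bits stay in place and the low 10 are cleared.

```c
#include <stdint.h>

/* Illustrative helper names; the 10/6 split comes from the question. */

/* Keep the 10 least significant bits, clear the top 6. */
static uint16_t low10(uint16_t x)
{
    return (uint16_t)(x & ((1u << 10) - 1));   /* mask is 0x03FF */
}

/* Move the 6 most significant bits down to the bottom. */
static uint16_t high6(uint16_t x)
{
    return (uint16_t)(x >> 10);
}

/* Keep the 6 most significant bits where they are, clear the low 10. */
static uint16_t high6_in_place(uint16_t x)
{
    return (uint16_t)(x & 0xFC00u);
}
```

With the example value 0x651A, low10 gives 0x011A, high6 gives 0x19 (binary 011001) and high6_in_place gives 0x6400.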

Confused with network byte order and host byte order

I'm getting terribly confused with host byte order and network byte order. I know network byte order is big endian. And I know host byte order in my case is little endian.
So, if I'm printing data I would need to convert to host byte order in order to get the correct value right?
My problem is I am trying to print the value of data returned by htonl. Here is my example:
#include <stdio.h>
#include <stdint.h>
#include <arpa/inet.h>
#include <netinet/in.h>
int main(int argc, char *argv[])
{
    int bits = 12;
    char *ip = "132.89.39.0";
    struct in_addr addr, tmp;   /* inet_ntoa() takes a struct in_addr */
    uint32_t network, netmask, last_addr;
    uint32_t total_hosts;
    inet_aton(ip, &addr);
    printf("Starting IP:\t%s\n", inet_ntoa(addr));
    netmask = (0xFFFFFFFFUL << (32 - bits)) & 0xFFFFFFFFUL;
    netmask = htonl(netmask);
    tmp.s_addr = netmask;
    printf("Netmask:\t%s\n", inet_ntoa(tmp));
    network = addr.s_addr & netmask;
    tmp.s_addr = network;
    printf("Network:\t%s\n", inet_ntoa(tmp));
    printf("Total Hosts:\t%d\n", ntohl(netmask));
    return 0;
}
printf("Total Hosts:\t%d\n", ntohl(netmask)); prints the correct value, but it prints with a minus sign. If I use %u I get the wrong value.
Where am I going wrong?
With %d output is:
Starting IP: 132.89.39.0
Netmask: 255.240.0.0
Network: 132.80.0.0
Total Hosts: -1048576
With %u output is:
Starting IP: 132.89.39.0
Netmask: 255.240.0.0
Network: 132.80.0.0
Total Hosts: 4293918720
I've been stuck on this for 2 days. Something seemingly so simple has thrown me off completely. I don't want anyone to solve the problem, but a push in the right direction would be very helpful.
If you see, the prototype of htonl() is
uint32_t htonl(uint32_t hostlong);
so it returns a uint32_t, which is an unsigned type. Printing that value using %d (which expects an argument of type signed int) is improper.
At the least, you need to use %u to get the unsigned value. Better, where possible, use the PRIu32 macro from <inttypes.h> for printing fixed-width 32-bit unsigned integers.
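As a small sketch of the PRIu32 approach, the helper below formats a uint32_t into a buffer. The name format_u32 is hypothetical; PRIu32 itself is the standard macro from <inttypes.h>.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: formats a uint32_t into buf with the PRIu32 macro
 * and returns what snprintf returns (the number of characters needed).
 */
static int format_u32(char *buf, size_t size, uint32_t value)
{
    return snprintf(buf, size, "%" PRIu32, value);
}
```

The same format string works directly in printf, for example printf("Total Hosts:\t%" PRIu32 "\n", ntohl(netmask));. This fixes only the format specifier, not the value being printed.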
There are currently a variety of systems that can change between little-endian and big-endian
byte ordering, sometimes at system reset, sometimes at run-time.
We must deal with these byte ordering differences as network programmers because
networking protocols must specify a network byte order. For example, in a TCP segment, there
is a 16-bit port number and a 32-bit IPv4 address. The sending protocol stack and the
receiving protocol stack must agree on the order in which the bytes of these multibyte fields
will be transmitted. The Internet protocols use big-endian byte ordering for these multibyte
integers.
In theory, an implementation could store the fields in a socket address structure in host byte
order and then convert to and from the network byte order when moving the fields to and from
the protocol headers, saving us from having to worry about this detail. But, both history and
the POSIX specification say that certain fields in the socket address structures must be
maintained in network byte order. Our concern is therefore converting between host byte order
and network byte order. We use the following four functions to convert between these two byte
orders.
#include <netinet/in.h>
uint16_t htons(uint16_t host16bitvalue) ;
uint32_t htonl(uint32_t host32bitvalue) ;
Both return: value in network byte order
uint16_t ntohs(uint16_t net16bitvalue) ;
uint32_t ntohl(uint32_t net32bitvalue) ;
Both return: value in host byte order
In the names of these functions, h stands for host, n stands for network, s stands for short,
and l stands for long. The terms "short" and "long" are historical artifacts from the Digital VAX
implementation of 4.2BSD. We should instead think of s as a 16-bit value (such as a TCP or
UDP port number) and l as a 32-bit value (such as an IPv4 address). Indeed, on the 64-bit
Digital Alpha, a long integer occupies 64 bits, yet the htonl and ntohl functions operate on
32-bit values.
When using these functions, we do not care about the actual values (big-endian or little-endian)
for the host byte order and the network byte order. What we must do is call the
appropriate function to convert a given value between the host and network byte order. On
those systems that have the same byte ordering as the Internet protocols (big-endian), these
four functions are usually defined as null macros.
We will talk more about the byte ordering problem, with respect to the data contained in a
network packet as opposed to the fields in the protocol headers,
We have not yet defined the term "byte." We use the term to mean an 8-bit quantity since
almost all current computer systems use 8-bit bytes. Most Internet standards use the term
octet instead of byte to mean an 8-bit quantity. This started in the early days of TCP/IP
because much of the early work was done on systems such as the DEC-10, which did not use
8-bit bytes.
Another important convention in Internet standards is bit ordering. In many Internet
standards, you will see "pictures" of packets that look similar to the following (this is the first
32 bits of the IPv4 header from RFC 791):
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Version|  IHL  |Type of Service|          Total Length         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
This represents four bytes in the order in which they appear on the wire; the leftmost bit is the
most significant. However, the numbering starts with zero assigned to the most significant bit.
This is a notation that you should become familiar with to make it easier to read protocol
definitions in RFCs.
A common network programming error in the 1980s was to develop code on Sun
workstations (big-endian Motorola 68000s) and forget to call any of these four functions.
The code worked fine on these workstations, but would not work when ported to little-endian
machines (such as VAXes).
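To make the effect of the four conversion functions quoted above concrete, here is a minimal sketch that stores a 32-bit value in network byte order and reads it back. The helper names put_be32 and get_be32 are hypothetical; htonl and ntohl are the real calls.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: stores value in network byte order at out[0..3].
 * After the call, out[0] holds the most significant byte on any host.
 */
static void put_be32(uint8_t out[4], uint32_t value)
{
    uint32_t net = htonl(value);
    memcpy(out, &net, sizeof net);
}

/* Reads four bytes in network byte order back into a host-order value. */
static uint32_t get_be32(const uint8_t in[4])
{
    uint32_t net;
    memcpy(&net, in, sizeof net);
    return ntohl(net);
}
```

Storing 0x84592700 (the address 132.89.39.0 as a host-order integer) yields the bytes 0x84, 0x59, 0x27, 0x00 in that order, whatever the host's own byte order is.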
The problem here is not your conversion between network and host order. That part of your code works perfectly.
The problem is your belief that the netmask, interpreted as an integer, is the number of hosts which match that mask. That is exactly the inverse of the truth.
Consider your 12-bit netmask, 255.240.0.0. Or, in binary:
1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
As your code indicates, for a host address to match a network address with this netmask, the two addresses need to be identical where the netmask has a 1 bit. Bit positions corresponding to a 0 in the netmask can be freely chosen. The number of such addresses can be determined by considering only the 0 bits. But of course we can't leave those bits as 0s; to count the number of qualifying addresses, we need to prepend a 1. So the count, in this case, is (exactly as you suspect) 1,048,576:
1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
One way to compute this value would be to invert the netmask and add 1:
1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
---------------------------------------------------------------
0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 (bitwise invert)
                                                              1 (+ 1)
---------------------------------------------------------------
                      1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
In 2's-complement arithmetic, this is precisely the same as arithmetic negation. So it is not surprising that when you print the netmask out as a signed integer, you'll see the negative of the expected count. (Printing an unsigned uint32_t as a signed int is technically undefined behaviour but will probably work as expected on 2's-complement machines with 32-bit ints.)
In short, what you should do to compute the number of qualifying addresses from the netmask is:
uint32_t address_count = ~ntohl(netmask) + 1;
(which most compilers will optimize to a unary negation opcode, if available.)
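The calculation can be wrapped up as follows. This is a sketch under the assumption that the netmask is kept in network byte order, as in the question. Both function names are hypothetical.

```c
#include <arpa/inet.h>
#include <stdint.h>

/* Hypothetical helper: netmask in network byte order for a prefix length. */
static uint32_t netmask_from_prefix(unsigned bits)
{
    if (bits == 0)
        return 0;                        /* shifting by 32 is undefined */
    if (bits > 32)
        bits = 32;
    return htonl((uint32_t)(UINT32_MAX << (32 - bits)));
}

/*
 * Number of addresses covered by a netmask given in network byte order.
 * An all-zero mask wraps to 0 because 2^32 does not fit in 32 bits.
 */
static uint32_t address_count(uint32_t netmask)
{
    return (uint32_t)(~ntohl(netmask) + 1u);
}
```

For the 12-bit prefix in the question, address_count(netmask_from_prefix(12)) is 1048576. A /24 gives 256 and a /32 gives 1.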

Left shift using bitwise AND

The following lines of code shift left 5 bits, i.e. make the bottom 3 bits the 3 MSBs:
DWORD dwControlLocAddress2;
DWORD dwWriteDataWordAddress; // Assume some initial value
dwControlLocAddress2 = ((dwWriteDataWordAddress & '\x07') * 32);
Can somebody help me understand how?
The 0x07 is 00000111 in binary. So you are masking the input value and getting just the right three bits. Then you are multiplying by 32 which is 2 * 2 * 2 * 2 * 2... which, if you think about it, shifting left by 1 is the same as multiplying by 2. So, shifting left five times is the same as multiplying by 32.
Multiplying by a power of two x is the same as left shifting by log2(x):
x *= 2 -> x <<= 1
x *= 4 -> x <<= 2
.
.
.
x *= 32 -> x <<= 5
The & doesn't do the shift - it just masks the bottom three bits. The syntax used in your example is a bit weird - it's using a hexadecimal character literal '\x07', but that's literally identical to hex 0x07, which in turn in binary is:
00000111
Since any bit ANDed with 0 yields 0 and any bit ANDed with 1 is itself, the & operation in your example simply gives a result of being the bottom three bits of dwWriteDataWordAddress.
It's a bit obtuse but essentially you're anding with 0x07 and then multiplying by 32 which is the same as shifting by 5. I'm not sure why a character literal is used rather than an integer literal but perhaps so that it is represented as a single byte rather than a word.
The equivalent would be:
( ( dw & 0x07 ) << 5 )
The & 0x07 keeps only the lowest 3 bits and << 5 does a left shift by 5 bits.
& '\x07' - masks in the bottom three bits only (hex 7 is 111 in binary)
* 32 - left shifts by 5 (32 is 2^5)
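To check the equivalence claimed above, the sketch below writes both forms side by side. dword_t is an assumed stand-in for the Windows DWORD type so the code compiles anywhere, and the function names are illustrative.

```c
#include <stdint.h>

/* Illustrative stand-in for the Windows DWORD type (32-bit unsigned). */
typedef uint32_t dword_t;

/* The form from the question: mask the low 3 bits, then multiply by 32. */
static dword_t control_loc_mul(dword_t addr)
{
    return (addr & '\x07') * 32;
}

/* The equivalent form: mask the low 3 bits, then shift left by 5. */
static dword_t control_loc_shift(dword_t addr)
{
    return (addr & 0x07u) << 5;
}
```

Both return the same value for every input. For example, an input of 5 (binary 101) gives 160 (binary 1010 0000) either way.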

reading 2 bits off a register

I'm looking at a datasheet specification of a NIC and it says:
bits 2:3 of register contain the NIC speed, 4 contains link state, etc. How can I isolate these bits using bitwise?
For example, I've seen the code to isolate the link state which is something like:
(link_reg & (1 << 4))>>4
But I don't quite get why the right shift. I must say, I'm still not fairly comfortable with the bitwise ops, even though I understand how to convert to binary and what each operation does, but it doesn't ring as practical.
It depends on what you want to do with that bit. The link state, call it L is in a variable/register somewhere
    43210
xxxxLxxxx
To isolate that bit you want to and it with a 1, a bitwise operation:
  xxLxxxx
& 0010000
=========
  00L0000
1<<4 = 1 with 4 zeros or 0b10000, the number you want to and with.
status&(1<<4)
This will give a result of either zero or 0b10000. You can do a boolean comparison to determine if it is false (zero) or true (not zero)
if(status&(1<<4))
{
//bit was on/one
}
else
{
//bit was off/zero
}
If you want to have the result be a 1 or zero, you need to shift the result to the ones column
(0b00L0000 >> 4) = 0b0000L
If the result of the and was zero then shifting still gives zero, if the result was 0b10000 then the shift right of 4 gives a 0b00001
so
(status&(1<<4))>>4 gives either a 1 or 0;
(xxxxLxxxx & (00001<<4))>>4 =
(xxxxLxxxx & (10000))>>4 =
(0000L0000) >> 4 =
0000L
Another way to do this using fewer operations is
(status>>4)&1;
xxxxLxxxx >> 4 = xxxxxxL
xxxxxxL & 00001 = 00000L
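Both orderings can be sketched in C as below. The function names are hypothetical, and bit 4 is the link state bit from the question.

```c
#include <stdint.h>

/* Mask first, then shift the isolated bit down to position 0. */
static unsigned link_state_mask_then_shift(uint32_t status)
{
    return (status & (1u << 4)) >> 4;
}

/* Shift first, then mask: the mask is the constant 1, same result. */
static unsigned link_state_shift_then_mask(uint32_t status)
{
    return (status >> 4) & 1u;
}
```

Either function returns 1 for a status of 0x3A (binary 0011 1010) and 0 for 0x2A (binary 0010 1010).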
Easiest to look at some binary numbers.
Here's a possible register value, with the bit index underneath:
00111010
76543210
So, bit 4 is 1. How do we get just that bit? We construct a mask containing only that bit (which we can do by shifting a 1 into the right place, i.e. 1<<4), and use &:
  00111010
& 00010000
----------
  00010000
But we want a 0 or a 1. So, one way is to shift the result down: 00010000 >> 4 == 1. Another alternative is !!val, which turns 0 into 0 and nonzero into 1 (note that this only works for single bits, not a two-bit value like the link speed).
Now, if you want bits 3:2, you can use a mask with both of those bits set. You can write 3 << 2 to get 00001100 (since 3 has two bits set). Then we & with it:
  00111010
& 00001100
----------
  00001000
and shift down by 2 to get 10, the desired two bits. So, the statement to get the two-bit link speed would be (link_reg & (3<<2))>>2.
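One way to keep such expressions readable is to name the masks and shifts. The sketch below assumes the layout from the question (speed in bits 3:2, link state in bit 4). The NIC_* macro names and function names are invented for illustration and do not come from any real datasheet.

```c
#include <stdint.h>

/* Assumed layout from the question: speed in bits 3:2, link state in bit 4. */
#define NIC_SPEED_SHIFT 2
#define NIC_SPEED_MASK  (3u << NIC_SPEED_SHIFT)
#define NIC_LINK_BIT    (1u << 4)

/* Two-bit speed field as a number in the range 0..3. */
static unsigned nic_speed(uint32_t link_reg)
{
    return (link_reg & NIC_SPEED_MASK) >> NIC_SPEED_SHIFT;
}

/* 1 if the link bit is set, 0 otherwise; !! is fine for a single bit. */
static unsigned nic_link_up(uint32_t link_reg)
{
    return !!(link_reg & NIC_LINK_BIT);
}
```

For the example register value 00111010 (0x3A), nic_speed returns 2 and nic_link_up returns 1.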
If you want to treat bits 2 and 3 (starting the count at 0) as a number, you can do this:
unsigned int n = (link_reg & 0xF) >> 2;
The bitwise and with 15 (which is 0b1111 in binary) sets all but the bottom four bits to zero, and the following right-shift by 2 gets you the number in bits 2 and 3.
you can use this to determine if the bit at position pos is set in val:
#define CHECK_BIT(val, pos) ((val) & (1U<<(pos)))
if (CHECK_BIT(reg, 4)) {
/* bit 4 is set */
}
the bitwise and operator (&) sets each bit in the result to 1 if both operands have the corresponding bit set to 1. otherwise, the result bit is 0.
The problem is that isolating bits is not enough: you need to shift them to get the correct magnitude of the value.
In your example you have bits 2 and 3 for the speed (I'm assuming that the least significant bit is bit 0), which means the value is in the range [0,3]. Now you can mask these bits with reg & (0x03<<2) or, converted, (reg & 0x0C), but this is not enough:
reg   0110 1010 &
0x0C  0000 1100
---------------
0x08  0000 1000
As you can see the result is 1000b which is 8, which is over the range. To solve this you need to shift back the result so that the least significant bit of the value you are interested in corresponds to the least significant bit of the containing byte:
0000 1000 >> 2 = 10b = 2
which now is correct.

c language bitwise trick

Here is some code I saw in a C program. I know this piece of code sets a bit in an ASCII bitmap corresponding to the character c.
field[ (c & 0x7f) >> 3 ] |= 1 << (c & 0x07);
field is an array of 16 characters, each character is 8 bits.
For example, 97 is lowercase 'a'; if we set c to 97, then bit position 97 will be set to 1.
Does anyone know why the code above sets the bit corresponding to the character c?
And what are the magic numbers 0x7f, 0x07, 3 and 1 for?
If your array is 16 bytes long, it has 128 bits (16 x 8). So the first mask (0x7f) guarantees that you only address those 128 bits. Once you shift it 3 bits right, you have 4 bits left that are used to address your bitfield (the number (c & 0x7F) >> 3 is between 0 and 15). So this part uses the upper 4 bits to address the byte.
Now, you need to address the bit in the byte, so you use the mask 0x07 to limit the value to the range 0 - 7 (corresponding to the bits 0 to 7). You use this number to shift the 1 so many positions.
At the end, you have a bit set in a position 0 to 127 (16 bytes of 8 bits). I hope this helps!
First, to clear up the magic numbers
0x7f is 0111 1111 in binary. This means the lower 7 bits of c are significant. This is then shifted by 3 so that only the original 0xxx x000 (4) bits are significant. But since these bits are shifted by 3 they count 0 to 15.
0x07 is 0000 0111 in binary. This means only the lower 3 bits are significant. The number 1 is shifted left by the value in these 3 bits, resulting a bit set in bit positions 0 to 7 within the byte.
In the end, the function only uses the lower 7 bits in the byte, which are the only significant bits in an ascii character. It uses the upper 4 for addressing the byte in the array and the bottom 3 to address the bit in the addressed byte.
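The byte/bit addressing described above can be sketched as a pair of helpers. The names bitmap_set and bitmap_test are hypothetical; field is the 16-byte array from the question.

```c
/* Hypothetical helpers; field is the 16-byte (128-bit) map from the question. */

/* Set the bit for character c: the upper 4 of its low 7 bits pick the byte,
 * the low 3 bits pick the bit inside that byte. */
static void bitmap_set(unsigned char field[16], unsigned char c)
{
    field[(c & 0x7f) >> 3] |= (unsigned char)(1u << (c & 0x07));
}

/* Returns 1 if the bit for character c is set, 0 otherwise. */
static int bitmap_test(const unsigned char field[16], unsigned char c)
{
    return (field[(c & 0x7f) >> 3] >> (c & 0x07)) & 1;
}
```

For 'a' (97, binary 0110 0001), the byte index is 97 >> 3 = 12 and the bit index is 97 & 7 = 1, so bitmap_set turns on bit 1 of field[12].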
