I am working on a driver for the Linux kernel. For the success of my project, I need to determine the amount of padding added to Ethernet frames smaller than the minimum size of 60 bytes (not counting the FCS). I am not generating these frames; I am receiving them on a NIC for processing.
Having a struct sk_buff, is it possible to determine the amount of trailing zeros added to the packet directly?
I can of course determine that value by going through the entire packet, figuring out where the content of the highest layer ends and then simply subtracting that position from the frame size (in this case, 60 bytes). But is there a more efficient way to do it directly from the information stored on a struct sk_buff?
EDIT: As far as I know, there is no way to check for zero padding using only the sk_buff structure; you have to look at the Ethernet header, which is simple enough.
That said, with some simple pointer arithmetic and subtraction, you can use the total-length field in the IP header to figure out the padding.
This is a good reference for sk_buff:
http://vger.kernel.org/~davem/skb_data.html
And here is a good reference for the packet structure, showing the 'length' field in the bottom picture within 'data'.
http://nerdcrunch.com/wp-content/uploads/2011/05/Ethernet-Frame-Explained.png
I think this is the way it must be done, but it doesn't require parsing as you had previously maintained. The header fields sit at known offsets, so they can be read directly through a pointer or array index without parsing. By subtracting the header length plus the data length from the raw frame length you get the padding, all without inspecting the payload.
Hope that helps.
Also, as a matter of best practice, your driver should probably account for both framing variants in use. You can tell them apart by inspecting the EtherType/length field. If the value is 1536 (0x0600) or greater, then you know it's an Ethernet II frame and the field contains an EtherType, which tells you what the frame encapsulates. Values of 1500 or less are an 802.3 length. There are some popular EtherTypes listed if you search Wikipedia for "EtherType."
For example, IPv4 = 0x0800. If the field holds an EtherType, you must resort to finding the length field inside the encapsulated data in order to find the padding. If it holds a length instead, as some traffic on Ethernet-based LANs still does, then you can use that value directly to do your job.
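To make the EtherType/length check concrete, here is a minimal user-space sketch that works on a raw frame buffer rather than on an sk_buff. It assumes an untagged frame (no VLAN tag) with the FCS already stripped, and it only knows how to find the payload length for 802.3 length frames and for IPv4. The function and macro names are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_HEADER_BYTES     14      /* dst MAC (6) + src MAC (6) + type/length (2) */
#define ETH_MAX_LENGTH_FIELD 1500    /* values up to this are an 802.3 length */
#define ETHERTYPE_MIN        0x0600  /* values from here up are an EtherType */
#define ETHERTYPE_IPV4       0x0800

/* Read a big-endian 16-bit value one byte at a time (no alignment issues). */
static uint16_t read_be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/*
 * Return the number of trailing padding bytes in the frame, or -1 if the
 * frame is too short, inconsistent, or of a type this sketch does not handle.
 */
int eth_padding(const uint8_t *frame, size_t frame_len)
{
    assert(frame != NULL);
    if (frame_len < ETH_HEADER_BYTES)
        return -1;

    uint16_t type_or_len = read_be16(frame + 12);
    size_t payload_len;

    if (type_or_len <= ETH_MAX_LENGTH_FIELD) {
        /* 802.3 length field: it states the payload length directly. */
        payload_len = type_or_len;
    } else if (type_or_len < ETHERTYPE_MIN) {
        return -1;  /* 1501..1535 is neither a length nor an EtherType */
    } else if (type_or_len == ETHERTYPE_IPV4) {
        /* IPv4 total length lives in bytes 2..3 of the IP header. */
        if (frame_len < ETH_HEADER_BYTES + 4)
            return -1;
        payload_len = read_be16(frame + ETH_HEADER_BYTES + 2);
    } else {
        return -1;  /* other EtherTypes need their own length lookup */
    }

    if (ETH_HEADER_BYTES + payload_len > frame_len)
        return -1;  /* truncated frame */
    return (int)(frame_len - ETH_HEADER_BYTES - payload_len);
}
```

For a 60-byte frame carrying a 28-byte IPv4 datagram, this reports 60 - 14 - 28 = 18 bytes of padding.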
IPv4 does almost the same thing, so there is probably no better way. See ip_rcv():
    len = ntohs(iph->tot_len);
    if (skb->len < len) {
        IP_INC_STATS_BH(dev_net(dev), IPSTATS_MIB_INTRUNCATEDPKTS);
        goto drop;
    } else if (len < (iph->ihl*4))
        goto inhdr_error;
    /* Our transport medium may have padded the buffer out. Now we know it
     * is IP we can trim to the true length of the frame.
     * Note this now means skb->len holds ntohs(iph->tot_len).
     */
    if (pskb_trim_rcsum(skb, len)) {
        IP_INC_STATS_BH(dev_net(dev), IPSTATS_MIB_INDISCARDS);
        goto drop;
    }
Related
I'm making a program using pcap to parse .pcap files.
I'm currently working on the DNS protocol. I'm able to get the header and display its information. Now I'd like to display its Resource Records (Question, Answer, Authority, Additional).
I found this interesting doc: http://www.zytrax.com/books/dns/ch15/
And, as I did before for parsing the different headers, I wanted to create a structure and cast my packet in it.
Following this doc I created my structure as follows:
    struct question_s {
        u_short *qname;
        u_short qtype;
        u_short qclass;
    };
and I'm casting:
struct question_s *record = (struct question_s*)(data + offset);
Where data is the packet representation, and offset is the total size of previous protocols.
Now I'm having trouble understanding some points, and as my English is not perfect, it's possible that I missed something in the documentation. Here are my questions:
As qname is of variable size, am I doing it right by making it a pointer on u_short?
All pointers are 8 bytes long, so my structure should be 12 bytes long, but where is the name in memory? Should I add 12 to my offset without taking care of the name length?
I tried to display qname, working on it as if it was a char*, but it doesn't seem to work (seg. fault), here is what I did:
    void test(u_short *qname) {
        for (int c = 0; qname[c] != 0; ++c)
            write(1, &qname[c], 1);
    }
But maybe there isn't a '\0' in the string?
Maybe that's an endianness issue? I use htons and htonl on all my u_short and u_int values because the network byte order isn't the same as mine, but I'm not sure it applies to pointers.
If you want to see how to dissect DNS records, first read and understand RFC 1035, and then take a look at the tcpdump code to dissect DNS records. It's harder than you think; you can't just overlay a structure on top of the raw packet data.
And you can't ever overlay a structure with a pointer in it on top of raw packet data. The pointer will almost certainly point to some bogus location in your address space; protocols don't send raw pointers over the network, as a pointer is a pointer in a particular address space, and two processes on the network will have different address spaces.
(In fact, just about everything in packet dissection is harder than people think when they first try to write code to dissect packets.)
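As a rough illustration of walking the bytes instead of overlaying a struct, the sketch below decodes one question entry by hand: QNAME is a run of length-prefixed labels ended by a zero byte, followed by 16-bit big-endian QTYPE and QCLASS (RFC 1035, section 4.1.2). The struct and function names are hypothetical, and this version simply rejects compression pointers rather than following them, so it is not a full dissector.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Decoded form of a question entry; the names here are hypothetical. */
struct dns_question {
    char     name[256];  /* dotted name, NUL-terminated */
    uint16_t qtype;
    uint16_t qclass;
};

/*
 * Parse one question starting at data[offset]. Returns the offset of the
 * byte following the question, or 0 on malformed or compressed input.
 */
size_t parse_question(const uint8_t *data, size_t len, size_t offset,
                      struct dns_question *q)
{
    assert(data != NULL && q != NULL);
    size_t out = 0;

    for (;;) {
        if (offset >= len)
            return 0;
        uint8_t label_len = data[offset++];
        if (label_len == 0)
            break;                  /* root label ends the name */
        if (label_len > 63)
            return 0;               /* compression pointer or reserved bits */
        if (offset + label_len > len ||
            out + label_len + 2 > sizeof q->name)
            return 0;               /* label runs past the packet or the buffer */
        if (out > 0)
            q->name[out++] = '.';
        memcpy(q->name + out, data + offset, label_len);
        out += label_len;
        offset += label_len;
    }
    q->name[out] = '\0';

    if (offset + 4 > len)
        return 0;
    /* Assemble the 16-bit fields from bytes; no htons/ntohs needed. */
    q->qtype  = (uint16_t)((data[offset] << 8) | data[offset + 1]);
    q->qclass = (uint16_t)((data[offset + 2] << 8) | data[offset + 3]);
    return offset + 4;
}
```

The returned offset is where the next question (or the first answer record) begins, which is why the name length cannot be ignored when advancing through the packet.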
(Excuse me if I am not able to put the question correctly. English is not my primary language.)
I am trying to parse a SyncE ESMC packet. It is an Ethernet slow protocol packet.
Approach 1:
For parsing this packet, I used byte by byte approach similar to what has been done here.
Approach 2:
The other way to parse the packet would be to define a 'structure' to represent the whole packet and access individual fields to retrieve the value at a particular offset.
However, in this approach structure padding and alignment may come into the picture (I am not sure), but on Linux various packet headers are defined as structures, e.g. iphdr in ip.h. An IP packet buffer can be cast to 'iphdr' to retrieve the IP header fields, so it must work.
Which approach is better to parse a network packet in C?
Do structure padding and alignment make any difference while parsing a packet via approach 2? If yes, how did the Linux headers overcome this problem?
Approach 1 is the best for portability. It allows you to safely avoid misaligned accesses, for example. In particular, if your machine is little-endian, this approach lets you take care of the byte-swapping very easily.
Approach 2 is sometimes convenient and is often faster to write. If structure padding gets in your way, your compiler probably provides an attribute or pragma (like __attribute__((__packed__)) or #pragma pack) to work around it. If you have a little-endian machine, however, you're still going to have to byte-swap multi-byte fields all over the place.
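A side-by-side sketch of the two approaches, using a made-up 7-byte header (one 8-bit field, one 16-bit field, one 32-bit field, all big-endian on the wire) rather than the real ESMC layout. The packed attribute is the GCC/Clang spelling, and the struct and function names are hypothetical.

```c
#include <arpa/inet.h>   /* ntohs, ntohl (POSIX) */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 7-byte wire header used only for this illustration. */
#define DEMO_HDR_LEN 7

/* Host-side result, in native byte order. */
struct demo_fields {
    uint8_t  subtype;
    uint16_t code;
    uint32_t serial;
};

/* Approach 1: assemble each field from individual bytes. */
int parse_bytewise(const uint8_t *buf, size_t len, struct demo_fields *out)
{
    assert(buf != NULL && out != NULL);
    if (len < DEMO_HDR_LEN)
        return -1;
    out->subtype = buf[0];
    out->code    = (uint16_t)((buf[1] << 8) | buf[2]);
    out->serial  = ((uint32_t)buf[3] << 24) | ((uint32_t)buf[4] << 16) |
                   ((uint32_t)buf[5] << 8)  |  (uint32_t)buf[6];
    return 0;
}

/* Approach 2: a packed struct that mirrors the wire layout. */
struct demo_wire {
    uint8_t  subtype;
    uint16_t code;     /* big-endian on the wire */
    uint32_t serial;   /* big-endian on the wire */
} __attribute__((__packed__));

int parse_overlay(const uint8_t *buf, size_t len, struct demo_fields *out)
{
    assert(buf != NULL && out != NULL);
    if (len < sizeof(struct demo_wire))
        return -1;
    struct demo_wire w;
    memcpy(&w, buf, sizeof w);  /* copy instead of casting the buffer pointer */
    out->subtype = w.subtype;
    out->code    = ntohs(w.code);
    out->serial  = ntohl(w.serial);
    return 0;
}
```

Without the packed attribute the compiler would typically insert a padding byte after subtype, so sizeof(struct demo_wire) would no longer match the 7 bytes on the wire. The byte-wise version needs neither the attribute nor the byte-order helpers.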
I'm writing a toy database management system, and running up against some alignment and endianness issues.
First, allow me to explain the data that is being stored, and where it's being stored. So first some definitions. The layout of a record is broken up into a Record Directory and Record Data.
[Field count=N] [Field offset[0]] [...] [Field offset[N-1]] [Data for fields 0 to N]
The field count and offsets combined are called the Record Directory.
The data is called the Record Data.
The field count is of type uint16_t.
The field offset is of type uint16_t.
The data fields can be treated as a variable length byte buffer pointed to by (uint8_t *) with a length of at least N bytes.
The field count cannot exceed: 4095 or 0x0FFF (in big endian).
The records are stored in a Page:
Pages are of size: 4096 bytes.
Pages need to store 2 bytes of data for each record.
The last 6 bytes of the page stores the running free space offset, and data for a slot directory. The metadata is irrelevant to the question, so I will not bore anyone with the details.
We store records on the page by appending them at the running free space offset. Records can later be altered and deleted. This will leave unused space fragments on the page. This space is not reused until compaction.
At the moment, we store a fragment byte of 0x80 in unused space (since the free space cannot exceed 0x0FFF, the first byte will never be 0x80).
However this becomes a problem during compaction time. We end up scanning everything until we hit the first byte that is not 0x80. We consider this the start of the free space. Well unfortunately, this is not portable and will only work on big endian machines.
But just to restate the issue here, the problem is distinguishing between: 0x808000 and 0x800080 where the first two bytes (read right to left) are two valid Field count fields depending on the endianness of the platform.
I want to try aligning records on even bytes. I just don't have the foresight to see if this would be a correct workaround for this issue.
At any given time, the free space offset should always sit on an even byte boundary. This means after inserting a record, you advance the free space pointer to the next even boundary.
The problem then becomes an issue of marking the fragments. Fragments are created upon deletion or altering a record (growing/shrinking by some number of bytes). I wanted to store what I would call 2-byte fragment markers: 0xFFFF. But that doesn't seem possible when altering.
This is where I'm stuck. Sorry for the long-winded problem explanation. We (my partner, this is an academic assignment) battled the problem of data ambiguity several times, and it keeps masking itself under different solutions.
Any insight would help. I hope the problem statement can be followed.
I would try this:
Align records to at least 2-byte boundaries.
Scan the list for free space as a list of uint16_t rather than char,
then look for length & 0x8000.
If you let the machine interpret integers as such instead of trying to scan
them as characters, endianness shouldn't be an issue here (at least until
you want to read your database on a different machine than the one that
wrote it).
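A minimal sketch of that scan, assuming records and fragment markers are both aligned to even offsets and that a fragment is marked by a 16-bit word with the top bit set, written in the machine's native byte order. The marker value and function name are assumptions for illustration, not part of the original design.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical marker bit; valid field counts stay at or below 0x0FFF. */
#define FRAGMENT_FLAG 0x8000u

/*
 * Starting at an even offset, skip 16-bit words that carry the fragment
 * flag and return the offset of the first word that does not, or `end`
 * if every remaining word is a fragment marker.
 */
size_t skip_fragments(const uint8_t *page, size_t start, size_t end)
{
    assert(page != NULL && start % 2 == 0 && end % 2 == 0);
    size_t pos = start;
    while (pos + sizeof(uint16_t) <= end) {
        uint16_t word;
        memcpy(&word, page + pos, sizeof word);  /* native-order 16-bit read */
        if ((word & FRAGMENT_FLAG) == 0)
            break;                               /* a real field count starts here */
        pos += sizeof word;
    }
    return pos;
}
```

Because the marker is written and read as a whole uint16_t, it no longer matters which of its two bytes comes first in memory, as long as the same machine does both.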
I am currently working on a school project which asks me to implement a DNS client, without using any library functions.
I have got to the point where I send a DNS request and receive the reply. I'm getting stuck at parsing the reply. I receive the reply in a char* array and I want to convert it into some meaningful structure, from which I can parse the answer.
I went through the RFC and I read about the packet structure, but implementing it in C is giving me problems.
Can anyone give me any examples, in C or maybe in any other language, that explain how this is done? A reference to a book is also fine.
Additional Details:
So, the following are the structures that I'm using.
    struct res_ip_cname {
        char *lst;
        int sec;
        char *auth_flag;
    };
    struct res_error {
        char *info;
    };
    struct res_mx_ns {
        char *name;
        unsigned short pref;
        int sec;
        char *auth_flag;
    };
    struct result {
        int type;
        struct res_ip_cname ip_cname;
        struct res_error error;
        struct res_mx_ns mx_ns;
    };
I have a char* buffer[], where I'm storing the response that I receive from the server, and I need to extract information from this buffer and populate the structure result.
Thanks,
Chander
Your structures don't look like anything I recognise from the RFCs (yes, I've written lots of DNS packet decoding software).
Look at RFC 1035 in particular - most of the structures you need can be mapped directly from the field layouts shown therein.
For example, you need a header (see s4.1.1):
    struct dns_header {
        uint16_t query_id;
        uint16_t flags;
        uint16_t qdcount;
        uint16_t ancount;
        uint16_t nscount;
        uint16_t arcount;
    };
Don't forget to use ntohs() to convert the wire format of these fields into your machine's native byte order. The network order is big-endian, and most machines these days are little-endian.
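For instance, filling that header from a received buffer might look roughly like this. It is a sketch: parse_dns_header is a name made up here, and the buffer is assumed to start at the first byte of the DNS message.

```c
#include <arpa/inet.h>   /* ntohs (POSIX) */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct dns_header {
    uint16_t query_id;
    uint16_t flags;
    uint16_t qdcount;
    uint16_t ancount;
    uint16_t nscount;
    uint16_t arcount;
};

/* Six 16-bit fields, so the struct should match the 12-byte wire header. */
_Static_assert(sizeof(struct dns_header) == 12, "unexpected struct padding");

/* Copy the fixed header out of the reply and convert it to host byte order. */
int parse_dns_header(const uint8_t *buf, size_t len, struct dns_header *h)
{
    assert(buf != NULL && h != NULL);
    if (len < sizeof *h)
        return -1;                   /* reply too short to hold a header */
    memcpy(h, buf, sizeof *h);       /* avoids casting a possibly unaligned pointer */
    h->query_id = ntohs(h->query_id);
    h->flags    = ntohs(h->flags);
    h->qdcount  = ntohs(h->qdcount);
    h->ancount  = ntohs(h->ancount);
    h->nscount  = ntohs(h->nscount);
    h->arcount  = ntohs(h->arcount);
    return 0;
}
```

After the call, the top bit of flags (0x8000) is the QR bit and the low four bits (0x000F) are the RCODE, per section 4.1.1.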
You'll need a "question" structure (see s4.1.2), and a generic "resource record" structure too (see s4.1.3).
Note however that the wire format of both of these starts with a variable length "label", which can also include compression pointers (see s4.1.4). This means that you can't in these cases trivially map the whole wire block onto a C structure.
Hope this helps...
If I were you I'd be using wireshark (in combination with the RFC) to inspect the packet structure. Wireshark captures and displays the network packets flowing through your computer. It lets you see both the raw data you will be receiving, and the decoded packet structure.
For example, in the screenshot below you can see the IP address of chat.meta.stackoverflow.com returned in a DNS Response packet, rendered in three different ways. Firstly, you can see a human readable version, in the middle pane of the screen. Secondly, the highlighted text in the lower left pane shows the raw DNS packet as a series of hexadecimal bytes. Thirdly, in the highlighted text in the lower right pane, you can see the packet rendered as ASCII text (in this case, mostly but not entirely, gobbledygook).
The request format and response format are quite similar - both contain variable length fields, which I guess is what you're stuck on - but if you've managed to form a request properly, you shouldn't have too much trouble parsing the response. If you can post some more details, like where exactly you're stuck, we could help better.
My advice is don't make a meal of it. Extract QDCOUNT and ANCOUNT from the header, then skip over the header, skip QDCOUNT questions, and start parsing answers. Skipping a label is easy (just look for the first byte that's 0 or has the high bit set), but decoding one is a little bit more work (you need to follow and validate "pointers" and make sure you don't get stuck in a loop). If you're only looking up addresses (and not PTR records) then you really never need to decode labels at all.
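Here is a sketch of that skipping strategy. It is a slightly more careful variant that hops from length byte to length byte instead of scanning every byte, and it treats a two-byte compression pointer (top two bits set) as the end of a name. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DNS_HEADER_LEN 12

/* Return the offset just past the name at `off`, or 0 on malformed input. */
size_t skip_name(const uint8_t *msg, size_t len, size_t off)
{
    assert(msg != NULL);
    while (off < len) {
        uint8_t b = msg[off];
        if (b == 0)
            return off + 1;                       /* root label ends the name */
        if ((b & 0xC0) == 0xC0)
            return off + 2 <= len ? off + 2 : 0;  /* 2-byte pointer ends the name */
        if (b & 0xC0)
            return 0;                             /* 0x40 and 0x80 prefixes are reserved */
        off += 1u + b;                            /* length byte plus label data */
    }
    return 0;
}

/* Skip the header and all questions; return the offset of the first answer. */
size_t first_answer_offset(const uint8_t *msg, size_t len)
{
    assert(msg != NULL);
    if (len < DNS_HEADER_LEN)
        return 0;
    unsigned qdcount = (unsigned)((msg[4] << 8) | msg[5]);  /* big-endian QDCOUNT */
    size_t off = DNS_HEADER_LEN;
    for (unsigned i = 0; i < qdcount; i++) {
        off = skip_name(msg, len, off);
        if (off == 0 || off + 4 > len)
            return 0;
        off += 4;                                 /* QTYPE + QCLASS */
    }
    return off;
}
```

From the returned offset, each answer is a name (skipped the same way), then TYPE, CLASS, TTL and RDLENGTH, followed by RDLENGTH bytes of data.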
I need to MPI_Gatherv() a number of int/string pairs. Let's say each pair looks like this:
struct Pair {
int x;
unsigned s_len;
char s[1]; // variable-length string of s_len chars
};
How to define an appropriate MPI datatype for Pair?
In short, it's theoretically impossible to send one message of variable size and receive it into a buffer of the perfect size. You'll either have to send a first message with the sizes of each string and then a second message with the strings themselves, or encode that metainfo into the payload and use a static receiving buffer.
If you must send only one message, then I'd forgo defining a datatype for Pair: instead, I'd create a datatype for the entire payload and dump all the data into one contiguous, untyped package. Then at the receiving end you could iterate over it, allocating the exact amount of space necessary for each string and filling it up. Let me whip up an ASCII diagram to illustrate. This would be your payload:
|..x1..|..s_len1..|....string1....|..x2..|..s_len2..|.string2.|..x3..|..s_len3..|.......string3.......|...
You send the whole thing as one unit (e.g. an array of MPI_BYTE), then the receiver would unpack it something like this:
    while (buffer is not empty)
    {
        read x;
        read s_len;
        allocate s_len characters;
        move s_len characters from buffer to allocated space;
    }
Note however that this solution only works if the data representation of integers and chars is the same on the sending and receiving systems.
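The packing and unpacking loop can be sketched in plain C as below. The MPI calls themselves are left out so the snippet runs without an MPI installation; in a real program the packed buffer would be what you hand over as an array of MPI_BYTE. The helper names and the pair_out struct are made up for this example, and like the layout above it assumes sender and receiver share the same int and unsigned representation.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical unpacked form of one int/string pair. */
struct pair_out {
    int      x;
    unsigned s_len;
    char    *s;      /* malloc'd, NUL-terminated copy; caller frees */
};

/* Append one pair at `pos`; returns the new write position.
 * The caller must make sure the buffer is large enough. */
size_t pack_pair(uint8_t *buf, size_t pos, int x, const char *s, unsigned s_len)
{
    assert(buf != NULL && s != NULL);
    memcpy(buf + pos, &x, sizeof x);          pos += sizeof x;
    memcpy(buf + pos, &s_len, sizeof s_len);  pos += sizeof s_len;
    memcpy(buf + pos, s, s_len);              pos += s_len;
    return pos;
}

/* Read one pair at `*pos`; returns 0 on success, -1 on truncation or OOM. */
int unpack_pair(const uint8_t *buf, size_t len, size_t *pos, struct pair_out *out)
{
    assert(buf != NULL && pos != NULL && out != NULL);
    size_t p = *pos;
    if (p > len || len - p < sizeof out->x + sizeof out->s_len)
        return -1;
    memcpy(&out->x, buf + p, sizeof out->x);          p += sizeof out->x;
    memcpy(&out->s_len, buf + p, sizeof out->s_len);  p += sizeof out->s_len;
    if (len - p < out->s_len)
        return -1;                        /* string would run past the buffer */
    out->s = malloc((size_t)out->s_len + 1);
    if (out->s == NULL)
        return -1;
    memcpy(out->s, buf + p, out->s_len);
    out->s[out->s_len] = '\0';
    *pos = p + out->s_len;
    return 0;
}
```

The receiver calls unpack_pair in a loop until the position reaches the number of bytes received, which corresponds to the "while (buffer is not empty)" pseudocode.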
I don't think you can do quite what you want with MPI. I'm a Fortran programmer, so bear with me if my understanding of C is a little shaky. You want, it seems, to pass a data structure consisting of 1 int and 1 string (which you pass by passing the location of the first character in the string) from one process to another? I think that what you are going to have to do is pass a fixed length string, which would therefore have to be as long as the longest string you really want to pass. The reception area for the gathering of these strings will have to be large enough to receive all the strings together with their lengths.
You'll probably want to declare a new MPI datatype for your structs; you can then gather these and, since the gathered data includes the length of the string, recover the useful parts of the string at the receiver.
I'm not certain about this, but I've never come across truly variable message lengths as you seem to want to use, and it does sort of feel un-MPI-like. But it may be something implemented in the latest version of MPI that I've just never stumbled across, though looking at the documentation online it doesn't seem so.
MPI implementations do not inspect or interpret the actual contents of a message. Provided that you know the size of the data structure, you can represent that size in some number of chars or ints. The MPI implementation will not know or care about the actual internal details of the data.
There are a few caveats: both the sender and receiver need to agree on the interpretation of the message contents, and the buffer that you provide on the sending and receiving side needs to fit into some definable number of chars or ints.