iov and msg_control in sendmsg and recvmsg - c

What is the difference between iov.iov_base and msg.msg_control?
I'm looking at some code examples (the open-source ping from iputils).
When sending data using sendmsg, the packet is set in iov.iov_base.
When reading data using recvmsg, the packet is read from msg->msg_control directly.
What is the relationship between struct iovec and struct msghdr? Is there a difference between reading and sending data?
Sorry for the silly question. I didn't find an answer so far and I'm confused.
thanks !

Ancillary data or control messages (.msg_controllen bytes at .msg_control) is data provided or verified by the kernel, whereas the normal payload (in iovecs) is just data received from the other endpoint, unverified and unchecked by the kernel (except for checksum, if the protocol has one).
For IP sockets (see man 7 ip), there are several socket options that cause the kernel to provide ancillary data on received messages. For example:
IP_RECVORIGDSTADDR socket option tells the kernel to provide an IP_ORIGDSTADDR type ancillary message (with a struct sockaddr_in as data), identifying the original destination address of the datagram received
IP_RECVOPTS socket option tells the kernel to provide a IP_OPTIONS type ancillary message containing all IP option headers (up to 40 bytes for IPv4) for incoming datagrams
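As a minimal sketch of the first option above, assuming a Linux system and an IPv4 datagram socket: the payload arrives through the iovec while the original destination address arrives as an ancillary message. The helper names enable_origdstaddr(), find_origdstaddr() and recv_with_origdstaddr() are hypothetical, chosen only for this illustration.

```c
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Turn on delivery of IP_ORIGDSTADDR ancillary messages for fd. */
int enable_origdstaddr(int fd)
{
    int on = 1;
    return setsockopt(fd, IPPROTO_IP, IP_RECVORIGDSTADDR, &on, sizeof on);
}

/* Search the control messages of an already received msg for
   IP_ORIGDSTADDR. Returns 0 and fills *dst if found, -1 otherwise. */
int find_origdstaddr(struct msghdr *msg, struct sockaddr_in *dst)
{
    for (struct cmsghdr *c = CMSG_FIRSTHDR(msg); c != NULL;
         c = CMSG_NXTHDR(msg, c)) {
        if (c->cmsg_level == IPPROTO_IP && c->cmsg_type == IP_ORIGDSTADDR &&
            c->cmsg_len >= CMSG_LEN(sizeof *dst)) {
            memcpy(dst, CMSG_DATA(c), sizeof *dst);
            return 0;
        }
    }
    return -1;
}

/* Receive one datagram: the payload goes to buf via the iovec, the
   original destination address comes back through ancillary data.
   Returns the payload length, or -1 if recvmsg() fails. */
ssize_t recv_with_origdstaddr(int fd, void *buf, size_t len,
                              struct sockaddr_in *dst)
{
    union {
        char raw[CMSG_SPACE(sizeof(struct sockaddr_in))];
        struct cmsghdr align;          /* forces suitable alignment */
    } control;
    struct iovec iov = { .iov_base = buf, .iov_len = len };
    struct msghdr msg;

    memset(&msg, 0, sizeof msg);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = control.raw;
    msg.msg_controllen = sizeof control.raw;

    ssize_t n = recvmsg(fd, &msg, 0);
    if (n < 0)
        return -1;
    memset(dst, 0, sizeof *dst);       /* stays zeroed if nothing was attached */
    (void)find_origdstaddr(&msg, dst);
    return n;
}
```

Call enable_origdstaddr() once after creating the socket; every later recv_with_origdstaddr() call should then see the extra control message.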
Ping and traceroute use ICMP messages over IP; see man 7 icmp (and man 7 raw) for details.
Because most ICMP responses do not contain useful data filled in by the sender, the iovecs don't usually contain anything interesting. Instead, the interesting data is in the IP message headers and options.
For example, an ICMP Echo reply header contains just 8 bytes (64 bits): 8-bit type (0), 8-bit code (0), 16-bit checksum, 16-bit id, and 16-bit sequence number. To get the IP headers with the interesting fields, you need the kernel to provide them as ancillary data control messages.
The background:
As described in the sendmsg() and related man pages, we have
ssize_t sendmsg(int sockfd, const struct msghdr *msg, int flags);
struct msghdr {
void *msg_name; /* Optional address */
socklen_t msg_namelen; /* Size of address */
struct iovec *msg_iov; /* Scatter/gather array */
size_t msg_iovlen; /* # elements in msg_iov */
void *msg_control; /* Ancillary data */
size_t msg_controllen; /* Ancillary data buffer len */
int msg_flags; /* Flags (unused) */
};
struct iovec {
void *iov_base; /* Starting address */
size_t iov_len; /* Number of bytes to transfer */
};
with man 3 cmsg describing how to construct and access such ancillary data,
struct cmsghdr {
size_t cmsg_len; /* Data byte count, including header
(type is socklen_t in POSIX) */
int cmsg_level; /* Originating protocol */
int cmsg_type; /* Protocol-specific type */
unsigned char cmsg_data[]; /* Data itself */
};
struct cmsghdr *CMSG_FIRSTHDR(struct msghdr *msgh);
struct cmsghdr *CMSG_NXTHDR(struct msghdr *msgh, struct cmsghdr *cmsg);
size_t CMSG_ALIGN(size_t length);
size_t CMSG_SPACE(size_t length);
size_t CMSG_LEN(size_t length);
unsigned char *CMSG_DATA(struct cmsghdr *cmsg);
These ancillary data messages are always sufficiently aligned for the current architecture (so that the data items can be directly accessed), so to construct a proper ancillary message (SCM_CREDENTIALS to pass user, group, and process ID information over a Unix domain socket, or SCM_RIGHTS to pass file descriptors), these macros have to be used. The man 3 cmsg man page contains example code for these.
Suffice it to say, that to loop over each ancillary data part in a given message (struct msghdr msg), you use something that boils down to
char *const end = (char *)msg.msg_control + msg.msg_controllen;
for (char *ptr = (char *)msg.msg_control; ptr < end;
     ptr += CMSG_ALIGN(((struct cmsghdr *)ptr)->cmsg_len)) {
    struct cmsghdr *const cmsg = (struct cmsghdr *)ptr;
    /* level is cmsg->cmsg_level and type is cmsg->cmsg_type, and
       CMSG_DATA(cmsg) is sufficiently aligned for the level and type,
       so you can use ((datatype *)CMSG_DATA(cmsg)) to obtain a pointer
       to the type corresponding to this level and type ancillary payload.
       The exact size of the payload is
           (cmsg->cmsg_len - CMSG_LEN(0))
       so e.g. an SCM_RIGHTS ancillary message, with
           cmsg->cmsg_level == SOL_SOCKET && cmsg->cmsg_type == SCM_RIGHTS
       has exactly
           (cmsg->cmsg_len - CMSG_LEN(0)) / sizeof (int)
       new file descriptors as a payload.
    */
}
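For comparison, here is a small sketch that builds and parses an SCM_RIGHTS message with the cmsg(3) macros rather than by hand-rolled pointer arithmetic. It assumes a Unix domain socket; send_fd() and recv_fd() are hypothetical helper names, not functions from any library.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Pass one open file descriptor over a Unix domain socket.
   Returns 0 on success, -1 on failure. */
int send_fd(int sock, int fd_to_send)
{
    char byte = 'F';   /* unix(7) advises sending at least one real data byte */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {
        char raw[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;          /* forces suitable alignment */
    } control;
    struct msghdr msg;

    memset(&control, 0, sizeof control);
    memset(&msg, 0, sizeof msg);
    msg.msg_iov = &iov;                /* normal payload */
    msg.msg_iovlen = 1;
    msg.msg_control = control.raw;     /* ancillary data */
    msg.msg_controllen = sizeof control.raw;

    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd_to_send, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one file descriptor sent by send_fd().
   Returns the new descriptor, or -1 if none arrived. */
int recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {
        char raw[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } control;
    struct msghdr msg;

    memset(&msg, 0, sizeof msg);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = control.raw;
    msg.msg_controllen = sizeof control.raw;

    if (recvmsg(sock, &msg, 0) <= 0)
        return -1;

    for (struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg); cmsg != NULL;
         cmsg = CMSG_NXTHDR(&msg, cmsg)) {
        if (cmsg->cmsg_level == SOL_SOCKET && cmsg->cmsg_type == SCM_RIGHTS &&
            cmsg->cmsg_len >= CMSG_LEN(sizeof(int))) {
            int fd;
            memcpy(&fd, CMSG_DATA(cmsg), sizeof fd);
            return fd;
        }
    }
    return -1;
}
```

The one-byte iovec is the "normal" data and the descriptor travels only in the control buffer, which mirrors the iov versus msg_control split the question asks about.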

Related

When implementing traceroute in C, I couldn't find the IP address of the router that sent back the timeout error message

In order to receive the timeout ICMP message, I set the IP_RECVERR option on the receive socket recvfd:
int val=1;
setsockopt(recvfd,IPPROTO_IP,IP_RECVERR,&val,sizeof(int));
Then I try to receive the timeout ICMP error message with recvmsg(recvfd,msg,MSG_ERRQUEUE);
The error message is stored in msg, which is a pointer to a struct msghdr.
I looked up the manual pages ip(7) and recvmsg(2).
The struct msghdr is as follows,
struct iovec { /* Scatter/gather array items */
void *iov_base; /* Starting address */
size_t iov_len; /* Number of bytes to transfer */
};
struct msghdr {
void *msg_name; /* optional address */
socklen_t msg_namelen; /* size of address */
struct iovec *msg_iov; /* scatter/gather array */
size_t msg_iovlen; /* # elements in msg_iov */
void *msg_control; /* ancillary data, see below */
size_t msg_controllen; /* ancillary data buffer len */
int msg_flags; /* flags on received message */
};
The field msg_control points to struct cmsghdr, as follows,
struct cmsghdr {
socklen_t cmsg_len; /* data byte count, including hdr */
int cmsg_level; /* originating protocol */
int cmsg_type; /* protocol-specific type */
/* followed by
unsigned char cmsg_data[]; */
};
and the field cmsg_data points to a struct sock_extended_err, as follows
#define SO_EE_ORIGIN_NONE 0
#define SO_EE_ORIGIN_LOCAL 1
#define SO_EE_ORIGIN_ICMP 2
#define SO_EE_ORIGIN_ICMP6 3
struct sock_extended_err {
uint32_t ee_errno; /* error number */
uint8_t ee_origin; /* where the error originated */
uint8_t ee_type; /* type */
uint8_t ee_code; /* code */
uint8_t ee_pad;
uint32_t ee_info; /* additional information */
uint32_t ee_data; /* other data */
/* More data may follow */
};
struct sockaddr *SO_EE_OFFENDER(struct sock_extended_err *);
In the manual, it says
the macro SO_EE_OFFENDER returns a pointer to the address of the network object where the error originated from given a pointer to the ancillary message.
I tried to convert the struct sock_extended_err into an IP address and print it out, but the result is always 2.0.0.0. I couldn't find anywhere else in the above three structs that stores the IP address of the router that sent back the timeout ICMP error message.
The code that receives the error message is as follows,
//construct the msghdr struct
struct msghdr* msg=(struct msghdr*)malloc(sizeof(struct msghdr));
char recvBuffer[CMSG_SPACE(64)];
msg->msg_control=recvBuffer;
msg->msg_controllen=sizeof(recvBuffer);
//receive the message
recvmsg(recvfd,msg,MSG_ERRQUEUE);
//visit all the nodes in msg and get the IP address
struct cmsghdr* cmsg=CMSG_FIRSTHDR(msg);
for(;cmsg!=NULL;cmsg=CMSG_NXTHDR(msg,cmsg)){
struct sock_extended_err* exterr=(struct sock_extended_err*)(CMSG_DATA(cmsg));
struct sockaddr* DestAddr=SO_EE_OFFENDER(exterr);
struct sockaddr_in* ad=(struct sockaddr_in*)DestAddr;
char destination[20];
inet_ntop(AF_INET,ad,destination,sizeof(destination));
printf("IP: %20s\n",destination);
}
Could anyone help me? Thanks.

Passing multiple buffers with iovec in C Linux sockets

I'm writing Linux C client and server programs that communicate with each other over Unix domain sockets and pass a couple of buffers each time.
I'm using I/O vectors, but for some reason the server program only receives the first I/O vector.
Any ideas?
I attached the relevant code snippets.
Client code:
struct iovec iov[2];
struct msghdr mh;
int rc;
char str1[] = "abc";
char str2[] = "1234";
iov[0].iov_base = (caddr_t)str1;
iov[0].iov_len = sizeof(str1);
iov[1].iov_base = (caddr_t)str2;
iov[1].iov_len = sizeof(str2);
memset(&mh, 0, sizeof(mh));
mh.msg_iov = iov;
mh.msg_iovlen = 2;
rc = sendmsg(sockfd, &mh, 0); /* no flags used*/
if (rc > 0) {
printf("Sendmsg successfully executed\n");
}
}
Server code:
{
struct sockaddr_un *client_sockaddr = (sockaddr_un *)opq;
struct msghdr msg;
struct iovec io[2];
char buf[16];
char buf2[16];
io[0].iov_base = buf;
io[0].iov_len = sizeof(buf);
io[1].iov_base = buf2;
io[1].iov_len = sizeof(buf2);
msg.msg_iov = io;
msg.msg_iovlen = 2;
int len = recvmsg(sock, &msg, 0);
if (len > 0) {
printf("recv: %s %d %s %d\n", msg.msg_iov[0].iov_base, msg.msg_iov[0].iov_len, msg.msg_iov[1].iov_base, msg.msg_iov[1].iov_len);
}
return 0;
}
The output I'm getting from the server:
recv: abc 16 16
sendmsg(), writev(), pwritev(), and pwritev2() do not operate on multiple buffers, but one discontiguous buffer. They operate exactly as if you'd allocate a large enough temporary buffer, gather the data there, and then do the corresponding syscall on the single temporary buffer.
Their counterparts recvmsg(), readv(), preadv(), and preadv2() similarly do not operate on multiple buffers, only on one discontiguous buffer. They operate exactly as if you'd allocate a large enough temporary buffer, receive data into that buffer, then scatter the data from that buffer to the discontiguous buffer parts.
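The following sketch illustrates that behaviour under the assumption of a Unix domain datagram socket pair; send_two() and recv_two() are hypothetical helpers. The received bytes fill the first buffer completely before the second one is touched, and recvmsg() never updates iov_len: only its return value says how many bytes arrived.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Gather two NUL-terminated strings (terminators included) into ONE
   message. Returns the number of bytes sent, or -1 on error. */
ssize_t send_two(int fd, const char *s1, const char *s2)
{
    struct iovec iov[2] = {
        { .iov_base = (void *)s1, .iov_len = strlen(s1) + 1 },
        { .iov_base = (void *)s2, .iov_len = strlen(s2) + 1 },
    };
    struct msghdr msg;

    memset(&msg, 0, sizeof msg);
    msg.msg_iov = iov;
    msg.msg_iovlen = 2;
    return sendmsg(fd, &msg, 0);
}

/* Scatter one incoming message across two buffers. Buffer b is only
   written once buffer a is completely full. Returns the total number
   of bytes received, or -1 on error. */
ssize_t recv_two(int fd, void *a, size_t alen, void *b, size_t blen)
{
    struct iovec iov[2] = {
        { .iov_base = a, .iov_len = alen },
        { .iov_base = b, .iov_len = blen },
    };
    struct msghdr msg;

    memset(&msg, 0, sizeof msg);
    msg.msg_iov = iov;
    msg.msg_iovlen = 2;
    return recvmsg(fd, &msg, 0);
}
```

With two 16-byte receive buffers, the 9 bytes sent by send_two(fd, "abc", "1234") all land in the first buffer, which matches the output in the question: %s stops at the first NUL and both lengths still read 16. Sizing the first receive buffer to exactly 4 bytes would put "abc" in the first buffer and "1234" in the second.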
Unix domain datagram (SOCK_DGRAM) and seqpacket (SOCK_SEQPACKET) sockets preserve message boundaries, but stream sockets (SOCK_STREAM) do not. That is, using a datagram or seqpacket socket you receive each message as it was sent. With a stream socket, message boundaries are lost: two consecutively sent messages can be received as a single message, and you can (at least in theory) receive a partial message now and the rest later.
You can use the Linux-specific sendmmsg() function to send several messages in one call (using the same socket). If you use a Unix domain datagram or seqpacket socket, these will then retain their message boundaries.
Each message is described using a struct mmsghdr. It contains struct msghdr msg_hdr; and unsigned int msg_len;. msg_hdr is the same as you use when sending a single message using e.g. sendmsg(); you can use more than one iovec for each message, but the recipient will receive them concatenated into a single buffer (but can scatter that buffer using e.g. recvmsg()). msg_len will be filled in by the sendmmsg() call: the number of bytes sent for that particular message, similar to the return value of e.g. sendmsg() call when no errors occur.
The return value from the sendmmsg() call is the number of messages sent successfully (which may be fewer than requested!), or -1 if an error occurs (with errno indicating the error as usual). Thus, you'll want to write a helper function or a loop around sendmmsg() to make sure you send all the messages. For portability, I recommend a helper function, because you can then provide another based on a loop around sendmsg() for use when sendmmsg() is not available.
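One possible shape for such a helper, as a sketch only: it assumes a blocking socket on Linux with glibc (hence _GNU_SOURCE), and the name send_all_mmsg() is made up for this example.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/socket.h>

/* Send all 'count' messages in msgs, calling sendmmsg() again for
   whatever an earlier call left unsent. Returns 0 once everything has
   been handed to the kernel, or -1 on error with errno set. */
int send_all_mmsg(int fd, struct mmsghdr *msgs, unsigned int count)
{
    unsigned int done = 0;

    while (done < count) {
        int n = sendmmsg(fd, msgs + done, count - done, 0);
        if (n < 0) {
            if (errno == EINTR)
                continue;              /* interrupted by a signal: retry */
            return -1;
        }
        if (n == 0) {                  /* defensive: avoid spinning forever */
            errno = EIO;
            return -1;
        }
        done += (unsigned int)n;       /* msgs[i].msg_len is set for each sent message */
    }
    return 0;
}
```

A portable fallback with the same signature could loop over the array and call sendmsg() on each msgs[i].msg_hdr, storing the return value in msgs[i].msg_len.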
The only real benefit of sendmmsg() is that you need fewer syscalls to send a large number of messages: it boosts efficiency in certain situations, that's all.

Confused about the msg_name field in the msghdr structure

In user space, I encapsulated an L3 packet using SOCK_RAW (including the IP header) and sent it to kernel space using sock_sendmsg() with the msghdr structure
struct msghdr {
void *msg_name; /* optional address */
struct iovec *msg_iov; /* scatter/gather array */
...
};
I cannot clearly understand the role of msg_name. I already specified the source IP and dest IP in the L3 header. Why do I need msg_name?
The msg_name and msg_namelen fields of struct msghdr have the same function as the dest_addr and addrlen arguments to sendto: they specify the destination address. They are intended to be used with normal unconnected datagram sockets. For instance, when sending UDP packets with sendmsg on an AF_INET/SOCK_DGRAM socket, you supply only the payload, not the headers, in the iovec, and the destination address goes in msg_name + msg_namelen.
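To make the sendto() correspondence concrete, here is a minimal sketch for an unconnected datagram socket. The function name send_datagram_to() is hypothetical; it takes a generic struct sockaddr so the same code serves AF_INET, AF_INET6 or AF_UNIX destinations.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Send one datagram to dest on an unconnected datagram socket. The
   iovec carries only the payload; the destination goes in msg_name.
   Returns the number of bytes sent, or -1 on error. */
ssize_t send_datagram_to(int fd, const struct sockaddr *dest, socklen_t destlen,
                         const void *payload, size_t len)
{
    struct iovec iov = { .iov_base = (void *)payload, .iov_len = len };
    struct msghdr msg;

    memset(&msg, 0, sizeof msg);
    msg.msg_name = (void *)dest;       /* same role as sendto()'s dest_addr */
    msg.msg_namelen = destlen;         /* same role as sendto()'s addrlen   */
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;

    return sendmsg(fd, &msg, 0);
}
```

For UDP, dest would point at a filled-in struct sockaddr_in cast to struct sockaddr *, with destlen set to sizeof(struct sockaddr_in).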
raw(7), the manpage describing SOCK_RAW sockets, indicates that you are allowed to put the header into the iovec when using raw sockets (note in particular the discussion of IP_HDRINCL) but does not make clear what you should set msg_name and msg_namelen to in that case. I would recommend you try setting both of them to 0 and see if that works.

How to make a raw packet with special header structure and send over raw unix socket

I need to create a message and send it over a unix socket.
I have a socket defined as such: socket(AF_UNIX, SOCK_RAW, 0)
The header structure of the message/packet I want to send is as follows:
struct map_msghdr {
uint8_t map_msglen; /* to skip over non-understood messages */
uint8_t map_version; /* future binary compatibility */
uint16_t map_type; /* message type */
uint32_t map_flags; /* flags, incl. kern & message, e.g. DONE */
uint16_t map_addrs; /* bitmask identifying sockaddrs in msg */
uint16_t map_versioning;/* Mapping Version Number */
int map_rloc_count;/* Number of rlocs appended to the msg */
pid_t map_pid; /* identify sender */
int map_seq; /* for sender to identify action */
int map_errno; /* why failed */
};
I need to build a buffer containing the map_msghdr{} structure followed by a socket address structure. The socket address structure will have an IP address. How do I do this? Can you please show me an example? Thank you.
Allocate (statically or dynamically) sizeof(struct map_msghdr) + sizeof(struct sockaddr_storage) bytes. Copy the map_msghdr to the beginning of the allocated memory and copy the socket address structure to the buffer after the header structure (i.e. buffer + sizeof(struct map_msghdr)). Send the buffer.
Simple pseudo-ish code:
struct map_msghdr hdr;
struct sockaddr_storage addr;
fill_in_header(&hdr); // You need to write this
fill_in_sockaddr(&addr); // You need to write this
// Create a buffer to send the header and address
int8_t buffer[sizeof hdr + sizeof addr] = { 0 };
memcpy(buffer, &hdr, sizeof hdr); // Copy header to beginning of buffer
memcpy(buffer + sizeof(hdr), &addr, sizeof addr); // Copy address after header
write(your_socket, buffer, sizeof buffer); // Write buffer to socket

Understanding the msghdr structure from sys/socket.h

I'm trying to understand the following members of the msghdr structure of the sys/socket.h lib.
struct iovec *msg_iov scatter/gather array
void *msg_control ancillary data, see below
It states below:
Ancillary data consists of a sequence of pairs, each consisting of a cmsghdr structure followed by a data array. The data array contains the ancillary data message, and the cmsghdr structure contains descriptive information that allows an application to correctly parse the data.
I'm assuming the msghdr struct contains the protocol-header information? If so, is *msg_iov the input/output "vector" of parameters in the request/response, and does *msg_control contain the response messages?
msg_iov is an array of input/output buffers with length msg_iovlen. Each member of this array contains a pointer to a data buffer and the size of the buffer. This is where the data to read/write lives. It allows you to read/write to an array of buffers which are not necessarily in contiguous memory regions.
msg_control points to a buffer of size msg_controllen that contains additional information about the packet. To read this field, you first need to declare a struct cmsghdr * (let's call it cmhdr). You populate this by calling CMSG_FIRSTHDR() the first time, passing it the address of the msghdr struct, and CMSG_NXTHDR() each subsequent time, passing it the address of the msghdr struct and the current value of cmhdr.
From msg_control, you can find interesting things like the destination IP of the packet (useful for multicast) and the contents of the TOS/DSCP byte in the IP header (useful for custom congestion control protocols), among others. In most cases, you'll need to make a setsockopt call to enable receiving this data. In the examples given, the IP_PKTINFO and IP_RECVTOS socket options need to be enabled; the TOS byte then arrives as an ancillary message of type IP_TOS.
See the cmsg(3) manpage for more details.
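A short sketch of that setup step, assuming Linux and an IPv4 socket: one helper enables both options, and a second one pulls the destination address out of an IP_PKTINFO message. Both function names are hypothetical.

```c
#define _GNU_SOURCE
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Ask the kernel to attach IP_PKTINFO and IP_TOS ancillary messages to
   every datagram received on fd. Returns 0 on success, -1 on error. */
int enable_pktinfo_and_tos(int fd)
{
    int on = 1;

    if (setsockopt(fd, IPPROTO_IP, IP_PKTINFO, &on, sizeof on) == -1)
        return -1;
    if (setsockopt(fd, IPPROTO_IP, IP_RECVTOS, &on, sizeof on) == -1)
        return -1;
    return 0;
}

/* Copy the packet's destination address out of an IP_PKTINFO control
   message, if the received msghdr carries one.
   Returns 0 if found, -1 otherwise. */
int get_destination(struct msghdr *msg, struct in_addr *dst)
{
    for (struct cmsghdr *c = CMSG_FIRSTHDR(msg); c != NULL;
         c = CMSG_NXTHDR(msg, c)) {
        if (c->cmsg_level == IPPROTO_IP && c->cmsg_type == IP_PKTINFO &&
            c->cmsg_len >= CMSG_LEN(sizeof(struct in_pktinfo))) {
            struct in_pktinfo info;
            memcpy(&info, CMSG_DATA(c), sizeof info);
            *dst = info.ipi_addr;      /* destination address from the IP header */
            return 0;
        }
    }
    return -1;
}
```

Call enable_pktinfo_and_tos() once after socket(), before the first recvmsg(); without it the control buffer simply comes back empty.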
The source IP and port are not in msg_control, but in msg_name, which expects a pointer to a struct sockaddr with length msg_namelen.
Here's an example of how to use this:
struct msghdr mhdr;
struct iovec iov[1];
struct cmsghdr *cmhdr;
char control[1000];
struct sockaddr_in sin;
char databuf[1500];
unsigned char tos = 0; /* stays 0 if no IP_TOS message arrives */
ssize_t len;
memset(&mhdr, 0, sizeof(mhdr));
mhdr.msg_name = &sin;
mhdr.msg_namelen = sizeof(sin);
mhdr.msg_iov = iov;
mhdr.msg_iovlen = 1;
mhdr.msg_control = control;
mhdr.msg_controllen = sizeof(control);
iov[0].iov_base = databuf;
iov[0].iov_len = sizeof(databuf) - 1; /* leave room for a terminating NUL */
memset(databuf, 0, sizeof(databuf));
if ((len = recvmsg(sock, &mhdr, 0)) == -1) {
    perror("error on recvmsg");
    exit(1);
} else {
    cmhdr = CMSG_FIRSTHDR(&mhdr);
    while (cmhdr) {
        if (cmhdr->cmsg_level == IPPROTO_IP && cmhdr->cmsg_type == IP_TOS) {
            // read the TOS byte in the IP header
            tos = ((unsigned char *)CMSG_DATA(cmhdr))[0];
        }
        cmhdr = CMSG_NXTHDR(&mhdr, cmhdr);
    }
    printf("data read: %s, tos byte = %02X\n", databuf, tos);
}

Resources