I have a raw socket set up bound to a device that is already in promiscuous mode:
int sock = socket(PF_INET, SOCK_RAW, IPPROTO_TCP);
if (sock == -1)
{
    return -1;
}
struct ifreq ifr;
memset(&ifr, 0, sizeof(ifr));
strncpy(ifr.ifr_ifrn.ifrn_name, "eth0", IFNAMSIZ);
if (setsockopt(sock, SOL_SOCKET, SO_BINDTODEVICE, &ifr, sizeof(ifr)) < 0)
{
    close(sock);
    return -2;
}
while (1) {
    packet_size = recvfrom(sock, buffer, 65536, 0, NULL, NULL);
    // packet processing...
}
And my issue is that I am only receiving packets on my socket with IP destination matching the IP of the device (eth0) I am bound to. How can I receive all the TCP packets that the device is receiving? I can see all the TCP packets on the device in Wireshark, but the only packets that I see in my raw socket are those addressed to the device IP.
You only receive packets addressed to your device's IP because you are using a PF_INET raw socket. With a PF_INET raw socket, each skb has to pass several sanity checks on its way up the stack (see below).
For example, in ip_rcv():
int ip_rcv(struct sk_buff *skb, struct net_device *dev, struct packet_type *pt, struct net_device *orig_dev)
{
    /*...*/
    /* When the interface is in promisc. mode, drop all the crap
     * that it receives, do not try to analyse it.
     */
    if (skb->pkt_type == PACKET_OTHERHOST)
        goto drop;
So the call trace is something like: __netif_receive_skb_core()->ip_rcv()->...->ip_local_deliver()->...->raw_local_deliver()->raw_rcv()->...->tcp_v4_rcv() (you can check the trace with trace-cmd).
But tcpdump/Wireshark obtain packets around __netif_receive_skb_core(), i.e. before some of these sanity checks. That is the discrepancy that confused you.
Therefore, if you want skbs to bypass a large part of the Linux kernel network stack, you should use PF_PACKET raw sockets.
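A minimal sketch of that suggestion, assuming Linux and the "eth0" interface from the question. The helper names open_packet_socket() and frame_is_tcp() are made up for illustration, the process needs CAP_NET_RAW (typically root), and VLAN-tagged frames are not handled:

```c
#include <arpa/inet.h>        /* htons, ntohs */
#include <assert.h>
#include <linux/if_ether.h>   /* ETH_P_ALL, ETH_P_IP, ETH_HLEN, struct ethhdr */
#include <linux/if_packet.h>  /* struct sockaddr_ll */
#include <net/if.h>           /* if_nametoindex */
#include <netinet/in.h>       /* IPPROTO_TCP */
#include <netinet/ip.h>       /* struct iphdr */
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: opens a packet socket that sees every frame on
 * ifname, including frames addressed to other hosts when the interface
 * is in promiscuous mode. Returns the descriptor or -1. */
int open_packet_socket(const char *ifname)
{
    struct sockaddr_ll sll;
    unsigned int ifindex = if_nametoindex(ifname);
    int fd;

    if (ifindex == 0)
        return -1;
    fd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
    if (fd < 0)
        return -1;
    memset(&sll, 0, sizeof(sll));
    sll.sll_family = AF_PACKET;
    sll.sll_protocol = htons(ETH_P_ALL);
    sll.sll_ifindex = (int)ifindex;
    if (bind(fd, (struct sockaddr *)&sll, sizeof(sll)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: returns 1 if the buffer holds an untagged
 * Ethernet frame that carries IPv4/TCP, else 0. */
int frame_is_tcp(const unsigned char *frame, size_t len)
{
    struct ethhdr eth;
    struct iphdr ip;

    assert(frame != NULL);
    if (len < ETH_HLEN + sizeof(ip))
        return 0;
    memcpy(&eth, frame, sizeof(eth));        /* memcpy avoids unaligned reads */
    if (ntohs(eth.h_proto) != ETH_P_IP)
        return 0;
    memcpy(&ip, frame + ETH_HLEN, sizeof(ip));
    return ip.version == 4 && ip.protocol == IPPROTO_TCP;
}
```

In the receive loop from the question, recvfrom() on this descriptor returns whole frames starting at the Ethernet header, so a check like frame_is_tcp(buffer, packet_size) replaces the filtering that IPPROTO_TCP did on the PF_INET socket.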
I am trying to send raw bits over an Ethernet cable without using any protocol, even without an Ethernet frame. I realize that this data won't really go anywhere as it will not have a receiving MAC address, but this is purely educational.
I know I can create a socket but it always encapsulates my data in an Ethernet frame. Does this mean I would have to write raw data somehow to the port itself?
This is a pseudo example of how I send data by creating a socket.
int main()
{
    char *request = "GET / HTTP/1.1";
    socket = socket(AF_INET, SOCK_STREAM, 0);
    bind(server_fd, (struct sockaddr *)&address, sizeof(address));
    write(new_socket, request, strlen(request));
}
I have a forking HTTP proxy implemented on my Ubuntu 14.04 x86_64 with the following scheme (I'm reporting the essential code and pseudocode just to show the concept):
1. socketClient = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
2. bind(socketClient, (struct sockaddr*)&addr, sizeof(addr));
3. listen(socketClient, 50);
4. newSocket = accept(socketClient, (struct sockaddr*)&cliAddr, &cliAddrLen); /* cliAddrLen: a socklen_t set to sizeof(cliAddr) */
5. get the request from the client and parse it to resolve the requested hostname to an IP address;
6. fork(), open a connection to the remote server and handle the request;
7. child process: if it is a GET request, send the original request to the server and, while the server is sending data, forward that data to the client;
8. child process: else if it is a CONNECT request, send the string 200 ok to the client and poll both the client socket descriptor and the server socket descriptor with select(); if I read data from the server socket, send it to the client; else if I read data from the client socket, send it to the server.
The good thing is that this proxy works, the bad thing is that now I must collect statistics; this is bad because I'm working on a level where I can't get the data I'm interested in. I don't care about the payload, I just need to check in IP and TCP headers the flags I care about.
For example, I'm interested in:
connection tracking;
number of packets sent and received.
As for the first, I would check in the TCP header the SYN flag, SYN/ACK and then a last ACK; as for the second, I would just do +1 to a counter of mine every time a char buffer[1500] is filled with data when I send() or recv() a full packet.
I realized that this is not correct: SOCK_STREAM doesn't have the concept of a packet, it is just a continuous stream of bytes! The char buffer[1500] I use at points 7 and 8 holds no useful statistic; I could set its capacity to 4096 bytes and I still couldn't keep track of the TCP packets sent or received, because TCP has segments, not packets.
I couldn't parse the char buffer[] looking for the SYN flag in the TCP header either, because the IP and TCP headers are stripped from the data (because of the level I'm working on, specified with the IPPROTO_TCP flag) and, if I understood well, the char buffer[] contains only the payload, which is useless to me.
So, if I'm working on too high a level, I should go lower: once I saw a simple raw socket sniffer where an unsigned char buffer[65535] was cast to struct ethhdr, iphdr, tcphdr and it could see all the flags of all the headers, all the stats I'm interested in!
After the joy, the disappointment: since raw sockets work on a low level they don't have some concepts vital to my proxy; raw sockets can't bind, listen and accept; my proxy is listening on a fixed port, but raw sockets don't know what a port is, it belongs to the TCP level and they bind to a specified interface with setsockopt.
So, if I'd socket(PF_INET, SOCK_RAW, ntohs(ETH_P_ALL)) I should be able to parse the buffer where I recv() and send() at points 7 and 8, but I should use recvfrom() and sendto()... but all this sounds quite messy, and it involves a substantial refactoring of my code.
How can I keep intact the structure of my proxy (bind, listen, accept to a fixed port and interface) and increase my line of vision for IP and TCP headers?
My suggestion is to open a raw socket in, for example, another thread of your application. Sniff all traffic and filter out the relevant packets by addresses and port numbers. Basically you want to implement your own packet sniffer:
#include <netinet/in.h>   /* IPPROTO_TCP */
#include <stdio.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

int sniff(void)
{
    int sockfd;
    ssize_t len;
    socklen_t saddr_size;
    struct sockaddr saddr;
    unsigned char buffer[65536];

    /* Raw IPv4 socket: delivers TCP packets starting at the IP header */
    sockfd = socket(AF_INET, SOCK_RAW, IPPROTO_TCP);
    if (sockfd < 0) {
        perror("socket");
        return -1;
    }
    while (1) {
        saddr_size = sizeof(saddr);   /* must be reset before every call */
        len = recvfrom(sockfd, buffer, sizeof(buffer), 0, &saddr, &saddr_size);
        if (len < 0) {
            perror("recvfrom");
            close(sockfd);
            return -1;
        }
        // ... do the things you want to do with the packet received here ...
    }
}
You can also bind that raw socket to a specific interface if you know which interface is going to be used for the proxy's traffic. For example, to bind to "eth0":
setsockopt(sockfd, SOL_SOCKET, SO_BINDTODEVICE, "eth0", 4);
Use getpeername() and getsockname() function calls to find the local and remote addresses and port numbers of your TCP connections. You'll want to filter the packets by those.
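A rough sketch of that filtering step, assuming IPv4 and the AF_INET raw socket above (so each buffer starts at the IP header). The struct flow type and both function names are hypothetical, and only the peer-to-local direction is matched:

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical 4-tuple of one proxy connection, all in network byte order */
struct flow {
    uint32_t local_ip, peer_ip;
    uint16_t local_port, peer_port;
};

/* Fills *f from a connected TCP socket (for example the one returned by accept()) */
int flow_from_socket(int connfd, struct flow *f)
{
    struct sockaddr_in local, peer;
    socklen_t len = sizeof(local);

    if (getsockname(connfd, (struct sockaddr *)&local, &len) < 0)
        return -1;
    len = sizeof(peer);
    if (getpeername(connfd, (struct sockaddr *)&peer, &len) < 0)
        return -1;
    f->local_ip = local.sin_addr.s_addr;
    f->local_port = local.sin_port;
    f->peer_ip = peer.sin_addr.s_addr;
    f->peer_port = peer.sin_port;
    return 0;
}

/* Returns 1 if pkt (starting at the IPv4 header) is a TCP segment sent
 * by the peer to the local end of the flow, else 0. */
int packet_in_flow(const unsigned char *pkt, size_t len, const struct flow *f)
{
    uint32_t saddr, daddr;
    uint16_t sport, dport;
    size_t ihl;

    assert(pkt != NULL && f != NULL);
    if (len < 20)
        return 0;
    ihl = (size_t)(pkt[0] & 0x0f) * 4;          /* IP header length in bytes */
    if ((pkt[0] >> 4) != 4 || pkt[9] != IPPROTO_TCP || ihl < 20 || len < ihl + 20)
        return 0;
    memcpy(&saddr, pkt + 12, 4);                /* source address */
    memcpy(&daddr, pkt + 16, 4);                /* destination address */
    memcpy(&sport, pkt + ihl, 2);               /* TCP source port */
    memcpy(&dport, pkt + ihl + 2, 2);           /* TCP destination port */
    return saddr == f->peer_ip && daddr == f->local_ip &&
           sport == f->peer_port && dport == f->local_port;
}
```

Once a packet matches, the TCP flags byte sits at pkt[ihl + 13] (SYN is bit 0x02, ACK is bit 0x10), which is what connection tracking would inspect.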
I am trying to send and receive packets of type SOCK_RAW over PF_PACKET sockets using my own custom protocol ID on the same machine. Here is my sender and receiver sample code:
sender.c
#include<sys/socket.h>
#include<linux/if_packet.h>
#include<linux/if_ether.h>
#include<linux/if_arp.h>
#include<stdio.h>
#include<stdlib.h>
#include<string.h>
#define CUSTOM_PROTO 0xB588
int main ()
{
    int sockfd = -1;
    struct sockaddr_ll dest_addr = {0}, src_addr={0};
    char *buffer = NULL;
    struct ethhdr *eh;
    sockfd = socket(PF_PACKET, SOCK_RAW, htons(CUSTOM_PROTO) );
    if ( sockfd == -1 )
    {
        perror("socket");
        return -1;
    }
    buffer = malloc(1518);
    eh = (struct ethhdr *)buffer;
    dest_addr.sll_ifindex = if_nametoindex("eth0");
    dest_addr.sll_addr[0] = 0x0;
    dest_addr.sll_addr[1] = 0xc;
    dest_addr.sll_addr[2] = 0x29;
    dest_addr.sll_addr[3] = 0x49;
    dest_addr.sll_addr[4] = 0x3f;
    dest_addr.sll_addr[5] = 0x5b;
    dest_addr.sll_addr[6] = 0x0;
    dest_addr.sll_addr[7] = 0x0;
    //other host MAC address
    unsigned char dest_mac[6] = {0x0, 0xc, 0x29, 0x49, 0x3f, 0x5b};
    /*set the frame header*/
    memcpy((void*)buffer, (void*)dest_mac, ETH_ALEN);
    memcpy((void*)(buffer+ETH_ALEN), (void*)dest_mac, ETH_ALEN);
    eh->h_proto = htons(CUSTOM_PROTO);
    memcpy((void*)(buffer+ETH_ALEN+ETH_ALEN + 2), "Pavan", 6 );
    int send = sendto(sockfd, buffer, 1514, 0, (struct sockaddr*)&dest_addr,
                      sizeof(dest_addr) );
    if ( send == -1 )
    {
        perror("sendto");
        return -1;
    }
    return 0;
}
receiver.c
#include<sys/socket.h>
#include<linux/if_packet.h>
#include<linux/if_ether.h>
#include<linux/if_arp.h>
#include<stdio.h>
#include<stdlib.h>
#include<string.h>
#define CUSTOM_PROTO 0xB588
int main ()
{
    int sockfd = -1;
    struct sockaddr_ll dest_addr = {0}, src_addr={0};
    char *recvbuf = malloc(1514);
    sockfd = socket(PF_PACKET, SOCK_RAW, htons(CUSTOM_PROTO) );
    if ( sockfd == -1 )
    {
        perror("socket");
        return -1;
    }
    int len = recvfrom(sockfd, recvbuf, 1514, 0, NULL, NULL);
    printf("I received: \n");
    return 0;
}
Both sender and receiver are running on Ubuntu Virtualbox. The problem is the receiver hangs in recvfrom. But in receiver.c, if I change htons(CUSTOM_PROTO) to htons(ETH_P_ALL), the receiver works just fine.
Why is the kernel not delivering the packet with my custom protocol ID to my custom protocol ID socket?
I verified in GDB that the ethernet header is formed correctly when I receive packet with htons(ETH_P_ALL)
Update: Instead of interface eth0 and its corresponding MAC, if I choose local loopback lo and a MAC address of 00:00:00:00:00:00, CUSTOM_PROTO works just fine!
Update 2 CUSTOM_PROTO works fine if the sender and receiver are on different machines. This finding and prev update made me suspect that packets being sent out on eth0 are not being received by the same machine. But the fact that ETH_P_ALL works on the same machine, refutes my suspicion.
ETH_P_ALL vs any other protocol
The protocol ETH_P_ALL has the special role of capturing outgoing packets.
Receiver socket with any protocol that is not equal to ETH_P_ALL receives packets of that protocol that come from the device driver.
Socket with protocol ETH_P_ALL receives all packets before sending outgoing packets to the device driver and all incoming packets that are received from the device driver.
Loopback device vs Ethernet device
Packets sent to the loopback device go out from that device and then the same packets are received from the device as incoming.
So, when CUSTOM_PROTO is used with loopback the socket captures packets with custom protocol as incoming.
Note that if ETH_P_ALL is used with the loopback device each packet is received twice. Once it is captured as outgoing and the second time as incoming.
In case of eth0 the packet is transmitted from the device. So, such packets go to the device driver and then they can be seen on the other side of the physical Ethernet port. For example, with VirtualBox "Host-only" network adapter those packets can be captured by some sniffer in the host system.
However, packets transmitted to the physical port (or its emulation) are not redirected back to that port. So, they are not received as incoming from the device. That is why such packets can be captured only by ETH_P_ALL in outgoing direction and they cannot be seen by CUSTOM_PROTO in incoming direction.
Technically it should be possible to prepare a special setup that does external packet loopback (packets leaving the device port are sent back to that port). In that case the behavior should be similar to the loopback device.
Kernel implementation
See the kernel file net/core/dev.c. There are two different lists:
struct list_head ptype_base[PTYPE_HASH_SIZE] __read_mostly;
struct list_head ptype_all __read_mostly; /* Taps */
The list ptype_all is for socket handlers with protocol ETH_P_ALL. The list ptype_base is for handlers with normal protocols.
There is a hook for outgoing packets in xmit_one() called from dev_hard_start_xmit():
if (!list_empty(&ptype_all))
dev_queue_xmit_nit(skb, dev);
For outgoing packets the function dev_queue_xmit_nit() is called for ETH_P_ALL processing of each item of ptype_all. Finally, the sockets of family AF_PACKET with protocol ETH_P_ALL capture that outgoing packet.
So, the observed behavior is not related to any custom protocol. The same behavior can be observed with ETH_P_IP. In that case the receiver is able to capture all incoming IP packets, however it cannot capture IP packets from sender.c that sends from "eth0" to MAC address of "eth0" device.
The same can be seen with tcpdump. The packets sent by the sender are not captured if tcpdump is run with an option to capture only incoming packets (different versions of tcpdump use different command line arguments to enable such filtering).
The original task, distinguishing packets by protocol ID when sender and receiver run on the same machine, can be solved using ETH_P_ALL. The receiver should capture all packets and check the protocol itself, for example:
while (1) {
    int len = recvfrom(sockfd, recvbuf, 1514, 0, NULL, NULL);
    if (ntohs(*(uint16_t*)(recvbuf + ETH_ALEN + ETH_ALEN)) == CUSTOM_PROTO) {
        printf("I received: \n");
        break;
    }
}
Useful reference "kernel_flow" with a nice diagram http://www.linuxfoundation.org/images/1/1c/Network_data_flow_through_kernel.png
It is based on the 2.6.20 kernel, however in the modern kernels ETH_P_ALL is treated in the same way.
When packets with the same source and destination MAC address are transmitted from a real network device ethX and physically looped back, the result depends on the protocol given to socket().
If protocol ETH_P_ALL is specified, the packet is captured twice:
first with socket_address.sll_pkttype set to PACKET_OUTGOING,
and then with socket_address.sll_pkttype set to PACKET_HOST.
If a specific protocol such as CUSTOM_PROTO is specified, the packet is captured once:
for a normal packet, socket_address.sll_pkttype is PACKET_HOST;
for a VLAN packet, socket_address.sll_pkttype is PACKET_OTHERHOST.
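To make the sll_pkttype values above concrete, here is a small sketch, assuming Linux, of how a receiver on an ETH_P_ALL socket could read the packet type from recvfrom() and skip its own outgoing copies. The function names are illustrative, not from the answers:

```c
#include <assert.h>
#include <linux/if_packet.h>  /* struct sockaddr_ll, PACKET_* constants */
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Maps an sll_pkttype value to a printable label */
const char *pkttype_name(unsigned char pkttype)
{
    switch (pkttype) {
    case PACKET_HOST:      return "host";
    case PACKET_BROADCAST: return "broadcast";
    case PACKET_MULTICAST: return "multicast";
    case PACKET_OTHERHOST: return "otherhost";
    case PACKET_OUTGOING:  return "outgoing";
    default:               return "unknown";
    }
}

/* Returns 1 unless the frame was captured on its way out of this host */
int is_incoming(const struct sockaddr_ll *from)
{
    assert(from != NULL);
    return from->sll_pkttype != PACKET_OUTGOING;
}

/* Blocks until an incoming frame arrives; outgoing copies are discarded.
 * Returns the frame length, or -1 on error. */
ssize_t recv_incoming(int fd, void *buf, size_t len, struct sockaddr_ll *from)
{
    for (;;) {
        socklen_t fromlen = sizeof(*from);
        ssize_t n = recvfrom(fd, buf, len, 0, (struct sockaddr *)from, &fromlen);

        if (n < 0)
            return -1;
        if (is_incoming(from))
            return n;
    }
}
```

Printing pkttype_name(from.sll_pkttype) for every frame is a quick way to reproduce the once-versus-twice behavior described above.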
I am building server/client software using PF_PACKET and SOCK_RAW with a custom protocol passed to socket().
In the client software I create the socket the same way, just call recvfrom() on that socket, and I get the data.
My question is whether I have to fill out the sockaddr_ll struct the same way I do for the server, because when I reply from the client the source MAC address I get is a weird one, something like
11:11:00:00:00:00, and of course this is not my client's MAC.
Does anyone know why this happens?
Open the socket
if ( (sckfd=socket(PF_PACKET, SOCK_RAW, htons(proto)))<0)
{
    myError("socket");
}
this is how I receive the data
n = recvfrom(sckfd, buffer, 2048, 0, NULL, NULL);
printf("%d bytes read\n",n);
So this is how I basically receive the data in the client without filling the struct sockaddr_ll
For the server Program I do have to fill the struct
struct sockaddr_ll saddrll;
memset((void*)&saddrll, 0, sizeof(saddrll));
saddrll.sll_family = PF_PACKET;
saddrll.sll_ifindex = ifindex;
saddrll.sll_halen = ETH_ALEN;
memcpy((void*)(saddrll.sll_addr), (void*)dest, ETH_ALEN);
My question is: I receive as shown and send as shown, and when I reply to the server I call the same function the server uses for sending. Why, then, do I get 11:11:00:00:00:00 as the source when receiving the client's replies?
You should probably use
socket(AF_PACKET, SOCK_DGRAM, htons(proto))
instead of a SOCK_RAW socket. With SOCK_RAW, you are sending/receiving the entire Ethernet frame, including the source and destination MAC addresses. With SOCK_DGRAM, the kernel fills in the Ethernet header.
You probably want to send the reply to the same address the request came from; recvfrom() can fill in the source address:
struct sockaddr_ll src_addr;
socklen_t addr_len = sizeof src_addr;
n = recvfrom(sckfd, buffer, 2048, 0,
(struct sockaddr*)&src_addr, &addr_len);
Now you've learned the source address, so send the packet back to it:
...
sendto(sckfd, data, data_len, 0, (struct sockaddr*)&src_addr, addr_len);
And if you would rather use SOCK_RAW, you will receive the Ethernet header too, so just copy the MAC addresses out of the received data and swap them around when you are constructing the reply frame.
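The swap for the SOCK_RAW case can be sketched as follows; swap_mac_addresses() is a hypothetical helper that works in place on a received frame so the buffer can be reused for the reply:

```c
#include <assert.h>
#include <linux/if_ether.h>  /* ETH_ALEN, ETH_HLEN */
#include <stddef.h>
#include <string.h>

/* Swaps the destination and source MAC addresses at the start of an
 * Ethernet frame. The EtherType and payload are left untouched.
 * Returns 0 on success, -1 if the buffer is too short for a header. */
int swap_mac_addresses(unsigned char *frame, size_t len)
{
    unsigned char tmp[ETH_ALEN];

    assert(frame != NULL);
    if (len < ETH_HLEN)
        return -1;
    memcpy(tmp, frame, ETH_ALEN);                  /* old destination */
    memcpy(frame, frame + ETH_ALEN, ETH_ALEN);     /* destination <- old source */
    memcpy(frame + ETH_ALEN, tmp, ETH_ALEN);       /* source <- old destination */
    return 0;
}
```

Note that a plain swap only makes sense for unicast requests: if the request was sent to a broadcast or multicast address, the reply's source field should be set to the interface's own MAC instead.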
For a SOCK_RAW socket, you craft the entire Ethernet frame yourself, so you don't need to fill in the Ethernet address in the sockaddr_ll; the following is not needed:
memcpy((void*)(saddrll.sll_addr), (void*)dest, ETH_ALEN);
According to the connect(2) man pages
If the socket sockfd is of type SOCK_DGRAM then serv_addr is the address to which datagrams are sent by default, and the only address from which datagrams are received. If the socket is of type SOCK_STREAM or SOCK_SEQPACKET, this call attempts to make a connection to the socket that is bound to the address specified by serv_addr.
I am trying to filter packets from two different multicast groups that are sent on the same port, and I thought connect() would do the job, but I can't make it work. In fact, when I add it to my program I don't receive any packets. More info in this thread.
This is how I set the connect parameters:
memset(&mc_addr, 0, sizeof(mc_addr));
mc_addr.sin_family = AF_INET;
mc_addr.sin_addr.s_addr = inet_addr(multicast_addr);
mc_addr.sin_port = htons(multicast_port);
printf("Connecting...\n");
if( connect(sd, (struct sockaddr*)&mc_addr, sizeof(mc_addr)) < 0 ) {
    perror("connect");
    return -1;
}
printf("Receiving...\n");
while( (len = recv(sd, msg_buf, sizeof(msg_buf), 0)) > 0 )
    printf("Received %d bytes\n", len);
Your program (probably) has the following problems:
you should be using bind() instead of connect(), and
you're missing setsockopt(..., IP_ADD_MEMBERSHIP, ...).
Here's an example program that receives multicasts. It uses recvfrom(), not recv(), but it's the same except you also get the source address for each received packet.
To receive from multiple multicast groups, you have three options.
First option: Use a separate socket for each multicast group, and bind() each socket to a multicast address. This is the simplest option.
Second option: Use a separate socket for each multicast group, bind() each socket to INADDR_ANY, and use a socket filter to filter out all but a single multicast group.
Because you've bound to INADDR_ANY, you may still get packets for other multicast groups. It is possible to filter them out using the kernel's socket filters:
#include <stdint.h>
#include <arpa/inet.h>
#include <sys/socket.h>
#include <linux/filter.h>
/**
* Adds a Linux socket filter to a socket so that only IP
* packets with the given destination IP address will pass.
* dst_addr is in network byte order.
*/
int add_ip_dst_filter (int fd, uint32_t dst_addr)
{
    uint16_t hi = ntohl(dst_addr) >> 16;
    uint16_t lo = ntohl(dst_addr) & 0xFFFF;
    struct sock_filter filter[] = {
        BPF_STMT(BPF_LD + BPF_H + BPF_ABS, SKF_NET_OFF + 16), // A <- IP dst high
        BPF_JUMP(BPF_JMP + BPF_JEQ + BPF_K, hi, 0, 3),        // if A != hi, goto ignore
        BPF_STMT(BPF_LD + BPF_H + BPF_ABS, SKF_NET_OFF + 18), // A <- IP dst low
        BPF_JUMP(BPF_JMP + BPF_JEQ + BPF_K, lo, 0, 1),        // if A != lo, goto ignore
        BPF_STMT(BPF_RET + BPF_K, 65535),                     // accept
        BPF_STMT(BPF_RET + BPF_K, 0)                          // ignore
    };
    struct sock_fprog fprog = {
        .len = sizeof(filter) / sizeof(filter[0]),
        .filter = filter
    };
    return setsockopt(fd, SOL_SOCKET, SO_ATTACH_FILTER, &fprog, sizeof(fprog));
}
Third option: use a single socket to receive multicasts for all multicast groups.
In that case, you should do an IP_ADD_MEMBERSHIP for each of the groups. This way you get all packets on a single socket.
However, you need extra code to determine which multicast group a received packet was addressed to. To do that, you have to:
receive packets with recvmsg() and read the IP_PKTINFO or equivalent ancillary data message. However, to make recvmsg() give you this message, you first have to
enable reception of IP_PKTINFO ancillary data messages with setsockopt().
The exact thing you need to do depends on IP protocol version and OS. Here's how I did it (IPv6 code not tested): enabling PKTINFO and reading the option.
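For the third option, a Linux/IPv4-only sketch of the two steps (enable IP_PKTINFO, then read it from the ancillary data of recvmsg()) might look like this. All three function names are made up for illustration; other systems use different option names:

```c
#define _GNU_SOURCE            /* struct in_pktinfo on glibc */
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Step 1: ask the kernel to attach an IP_PKTINFO message to each datagram */
int enable_pktinfo(int fd)
{
    int on = 1;
    return setsockopt(fd, IPPROTO_IP, IP_PKTINFO, &on, sizeof(on));
}

/* Step 2a: scan the ancillary data for IP_PKTINFO and copy out the
 * destination address from the IP header (network byte order).
 * Returns 0 if found, -1 otherwise. */
int dst_from_cmsgs(struct msghdr *msg, struct in_addr *dst)
{
    struct cmsghdr *c;

    assert(msg != NULL && dst != NULL);
    for (c = CMSG_FIRSTHDR(msg); c != NULL; c = CMSG_NXTHDR(msg, c)) {
        if (c->cmsg_level == IPPROTO_IP && c->cmsg_type == IP_PKTINFO) {
            struct in_pktinfo info;
            memcpy(&info, CMSG_DATA(c), sizeof(info));
            *dst = info.ipi_addr;
            return 0;
        }
    }
    return -1;
}

/* Step 2b: receive one datagram and report which address it was sent to.
 * *dst is set to 0.0.0.0 when no IP_PKTINFO message was attached. */
ssize_t recv_with_dst(int fd, void *buf, size_t len, struct in_addr *dst)
{
    union {
        char buf[CMSG_SPACE(sizeof(struct in_pktinfo))];
        struct cmsghdr align;              /* forces correct alignment */
    } ctrl;
    struct iovec iov = { .iov_base = buf, .iov_len = len };
    struct msghdr msg;
    ssize_t n;

    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);
    n = recvmsg(fd, &msg, 0);
    if (n < 0)
        return -1;
    if (dst_from_cmsgs(&msg, dst) < 0)
        dst->s_addr = htonl(INADDR_ANY);
    return n;
}
```

The receiver would call enable_pktinfo() once after creating the socket and then compare the address returned by recv_with_dst() with each joined group address.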
Here's a simple program that receives multicasts, which demonstrates the first option (bind to multicast address).
#include <arpa/inet.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#define MAXBUFSIZE 65536
int main (int argc, char **argv)
{
    if (argc != 4) {
        printf("Usage: %s <group address> <port> <interface address>\n", argv[0]);
        return 1;
    }
    int sock, status;
    socklen_t socklen;
    char buffer[MAXBUFSIZE+1];
    struct sockaddr_in saddr;
    struct ip_mreq imreq;
    // set content of struct saddr and imreq to zero
    memset(&saddr, 0, sizeof(struct sockaddr_in));
    memset(&imreq, 0, sizeof(struct ip_mreq));
    // open a UDP socket
    sock = socket(PF_INET, SOCK_DGRAM, 0);
    if (sock < 0) {
        perror("socket failed!");
        return 1;
    }
    // join group
    imreq.imr_multiaddr.s_addr = inet_addr(argv[1]);
    imreq.imr_interface.s_addr = inet_addr(argv[3]);
    status = setsockopt(sock, IPPROTO_IP, IP_ADD_MEMBERSHIP,
                        (const void *)&imreq, sizeof(struct ip_mreq));
    if (status < 0) {
        perror("setsockopt failed!");
        return 1;
    }
    // bind to the multicast address so only this group's packets are delivered
    saddr.sin_family = AF_INET;
    saddr.sin_port = htons(atoi(argv[2]));
    saddr.sin_addr.s_addr = inet_addr(argv[1]);
    status = bind(sock, (struct sockaddr *)&saddr, sizeof(struct sockaddr_in));
    if (status < 0) {
        perror("bind failed!");
        return 1;
    }
    // receive packets from socket
    while (1) {
        socklen = sizeof(saddr);
        status = recvfrom(sock, buffer, MAXBUFSIZE, 0, (struct sockaddr *)&saddr, &socklen);
        if (status < 0) {
            printf("recvfrom failed!\n");
            return 1;
        }
        buffer[status] = '\0';
        printf("Received: '%s'\n", buffer);
    }
}
The first thing to note is that multicast packets are sent to a multicast address, not from a multicast address.
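connect() on a datagram socket filters by the sender's address: only packets received from the nominated (source) address are delivered. It therefore cannot select packets by multicast destination.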
To configure your socket to receive multicast packets you need to use one of two socket options:
IP_ADD_MEMBERSHIP, or
IP_ADD_SOURCE_MEMBERSHIP
The former allows you to specify a multicast address, the latter allows you to specify a multicast address and source address of the sender.
This can be done using something like the following:
struct ip_mreq groupJoinStruct;
in_addr_t groupAddr = inet_addr("239.255.0.1");
groupJoinStruct.imr_multiaddr.s_addr = groupAddr;
groupJoinStruct.imr_interface.s_addr = htonl(INADDR_ANY); // or the address of a specific network interface
setsockopt( yourSocket, IPPROTO_IP, IP_ADD_MEMBERSHIP, &groupJoinStruct, sizeof(groupJoinStruct) );
(error handling omitted for brevity)
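For the source-specific variant, a comparable sketch using IP_ADD_SOURCE_MEMBERSHIP could look like the following. It assumes Linux with glibc-style headers; the helper names and the sender address 192.0.2.10 in the usage note are placeholders:

```c
#define _DEFAULT_SOURCE        /* struct ip_mreq_source on glibc */
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>

/* Fills a source-specific join request from dotted-quad strings.
 * iface may be NULL to let the kernel pick the interface.
 * Returns 0 on success, -1 on a malformed or non-multicast address. */
int fill_source_join(struct ip_mreq_source *req, const char *group,
                     const char *source, const char *iface)
{
    assert(req != NULL && group != NULL && source != NULL);
    memset(req, 0, sizeof(*req));
    if (inet_pton(AF_INET, group, &req->imr_multiaddr) != 1)
        return -1;
    if (inet_pton(AF_INET, source, &req->imr_sourceaddr) != 1)
        return -1;
    if (iface == NULL)
        req->imr_interface.s_addr = htonl(INADDR_ANY);
    else if (inet_pton(AF_INET, iface, &req->imr_interface) != 1)
        return -1;
    if (!IN_MULTICAST(ntohl(req->imr_multiaddr.s_addr)))
        return -1;
    return 0;
}

/* Joins the group, accepting only datagrams sent by the given source */
int join_source_group(int fd, const struct ip_mreq_source *req)
{
    return setsockopt(fd, IPPROTO_IP, IP_ADD_SOURCE_MEMBERSHIP, req, sizeof(*req));
}
```

A caller would run fill_source_join(&req, "239.255.0.1", "192.0.2.10", NULL) and then join_source_group(yourSocket, &req); passing the same structure with IP_DROP_SOURCE_MEMBERSHIP leaves the group again.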
To stop receiving multicast packets for this group address, use the socket options:
IP_DROP_MEMBERSHIP, or
IP_DROP_SOURCE_MEMBERSHIP
Note that a socket can have multiple multicast memberships. But, as the multicast address is the destination address of the packet, you need to be able to grab the destination address of the packet to be able to distinguish between packets for different multicast addresses.
To grab the destination address of the packet you'll need to use recvmsg() instead of recv() or recvfrom(). The destination address arrives as an ancillary (control) message at level IPPROTO_IP; on Linux its type is IP_PKTINFO, on BSD-derived systems it is IP_RECVDSTADDR (the DSTADDR_SOCKOPT name seen in some example code is a macro for one of these).
As @Ambroz Bizjak has stated, you'll need to set the IP_PKTINFO socket option to be able to read this information.
Other things to check are:
Is multicast supported in your kernel? Check for the existence of /proc/net/igmp to ensure it's been enabled.
Has multicast been enabled on your network interface? Check that "MULTICAST" is listed when you run ifconfig on your interface.
Does your network interface support multicast? Historically not all have. If not you may be able to get around this by setting your interface to promiscuous mode. e.g. ifconfig eth0 promisc
This should work as long as all the SENDING sockets are bound to the multicast address in question with bind. The address you specify in connect is matched against the SOURCE address of received packets, so you want to ensure that all packets have the same (multicast) SOURCE AND DESTINATION.
bind(2) each socket to the address of the respective multicast group and port instead of INADDR_ANY. That would do the filtering for you.