I am currently working on a programming assignment. The assignment is to implement a client,network emulator, and server. The client passes packets to a network emulator, and the network emulator passes to the server. Vice-versa applies as well. The prerequisite for the assignment is that I may only use raw sockets. So I will create my own IP and UDP headers. I have tested my packets with wireshark. They are all correct and in the proper format(it reads them properly).
Another requirement is that the emulator, client and server all have specific ports they must be bound to. Now, I do not understand how to bind a raw socket to a specific port. All my raw sockets receive all traffic on the host address they are bound to. According to man pages, and everywhere else on the internet, including "Unix Network Programming" by Richard Stevens, this is how they are supposed to work. My teacher has not responded to any of my emails and I probably will not be able to ask him until Tuesday.I see two options in front of me. First I can use libpcap to filter from a specific device and then output to my raw socket. I feel this is way out of scope for our assignment though. Or I can filter them after I receive them from the socket. This apparently has a lot of overhead because all the packets are being copied/moved through the kernel. At least, that is my understanding(please feel free to correct me if i'm wrong).
So my question is:
Is their an option or something I can set for this? Where the raw socket will bind to a port? Have I missed something obvious?
Thank you for your time.
--
The man page for raw(7) says:
A raw socket can be bound to a specific local address using the bind(2) call. If it isn't bound all packets with the specified IP protocol are received. In addition a RAW socket can be bound to a specific network device using SO_BINDTODEVICE; see socket(7).
Edit: You cannot bind a raw socket to a specific port because "port" is a concept in TCP and UDP, not IP. Look at the header diagrams for those three protocols and it should become obvious: you are working at a lower level, where the concept of port is not known.
I would think you're expected to filter the packets in your software. It sounds like the exercise is to learn what the different components of the IP stack do by recreating a simplified piece of it in user space. Normally in the kernel, the IP code would process all packets, verify the IP headers, reassemble fragments, and check the protocol field. If the protocol field is 17 (udp), then it passes it to the UDP code (every UDP packet). It's up to the UDP code to then validate the UDP header and determine if any applications are interested in them based on the destination port.
I imagine your project is expected to more or less mimic this process. Obviously none of it will be as efficient as doing it in the kernel, but since the assignment is to write (part of) an IP stack in user-space, I'd guess efficiency isn't the point of the exercise.
Related
I need to write a proxy server in C language on Linux (Ubuntu 20.04). The purpose of this proxy server is as follows. There're illogical governmental barriers in accessing the free internet. Some are:
Name resolution: I ping telegram.org and many other sites which the government doesn't want me to access. I ask 8.8.8.8 to resolve the name, but they response of behalf of the server that the IP may be resolved to 10.10.34.35!
Let's concentrate on this one, because when this is solved many other problems will be solved too. For this, I need to setup such a configuration:
A server outside of my country is required. I prepared it. It's a VPS. Let's call it RS (Remote Server).
A local proxy server is required. Let's call it PS. PS runs on the local machine (client) and knows RS's IP. I need it to gather all requests going to be sent through the only NIC available on client, process them, scramble them, and send them to RS in a way to be hidden from the government.
The server-side program should be running on RS on a specific port to get the packet, unscramble it, and send it to the internet on behalf of the client. After receiving the response from the internet, it should send it back to the client via the PS.
PS will deliver the response to the client application which originates the request. Of course this happens after it will unscramble and will find the original response from the internet.
This is the design and some parts is remained gloomy for me. Since I'm not an expert in network programming context, I'm going to ask my questions in the parts I'm getting into trouble or are not clear for me.
Now, I'm in part 2. See whether I'm right. There're two types of sockets, a RAW socket and a stream socket. A RAW socket is opened this way:
socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
And a stream socket is opened this way:
socket(AF_INET, SOCK_STREAM, 0);
For RAW sockets, we use sockaddr_ll and for stream sockets we use sockaddr_in. May I use stream sockets between client applications and PS? I think not, because I need the whole RAW packet. I should know the protocol and maybe some other info of the packet, because the whole packet should be retrieved transparently in RS. For example, I should know whether it has been a ping packet (ICMP) or a web request (TCP). For this, I need to have packet header in PS. So I can't use a stream socket, because it doesn't contain the packet header. But until now, I've used RAW sockets for interfaces and have not written a proxy server to receive RAW packets. Is it possible? In another words, I've the following questions to go to next step:
Can a RAW socket be bound to localhost:port instead of an interface so that it may receive all low-level packets containing packet headers (RAW packets)?
I may define a proxy server for browser. But can I put the whole system behind the proxy server so that packets of other apps like PING may route automatically via it?
Do I really need RAW sockets in PS? Can't I change the design to suffice the data I got from the packets payload?
Maybe I'm wrong in some of the concepts and will appreciate your guidance.
Thank you
Can a RAW socket be bound to localhost:port instead of an interface so that it may receive all low-level packets containing packet headers (RAW packets)?
No, it doesn't make sense. Raw packets don't have port numbers so how would it know which socket to go to?
It looks like you are trying to write a VPN. You can do this on Linux by creating a fake network interface called a "tun interface". You create a tun interface, and whenever Linux tries to send a packet through the interface, instead of going to a network cable, it goes to your program! Then you can do whatever you like with the packet. Of course, it works both ways - you can send packets from your program back to Linux through the tun interface, and Linux will act like they just arrived on a network cable.
Then, you can set up your routing table so that all traffic goes to the tun interface, except for traffic to the VPN server ("RS"), which goes to your real ethernet/wifi interface. Otherwise you'd have an endless loop where your VPN program PS tried to send packets to RS but they just went back to PS.
When using a PF_PACKET type of socket with protocol type ETH_P_IP, the man packet documentation talks about a socket option for multicast. The socket option is PACKET_ADD_MEMBERSHIP.
Assuming you use PACKET_ADD_MEMBERSHIP socket option on a PF_PACKET socket correctly, what features and benefits and use cases is this socket option for?
Right now I receive all incoming IP packets so I look at each packet to see if it has the correct IP dst-address and UDP dst-port and I skip over all the other packets. Would using PACKET_ADD_MEMBERSHIP socket option mean I don't need to do my own filter because the kernel or driver would filter for me?
I dug into the linux-kernel source and traced down the code a little bit. I found that the ethernet-mac-address you pass in via setsockopt() is added to a list of ethernet-mac-addresses. And then the list is sent to the network-device hardware to do something... but I can't find any authoritative documentation telling me what happens next.
My educated guess is that the ethernet-mac-address list is used by the hardware to filter at the layer-2 ethernet protocol (i.e. the hardware only accepts packets that have a destination ethernet-address that matches one on the list). If there is some good documentation I would welcome that.
(I'm more familiar with TCP/UDP sockets and so this looks very similar to AF_INET type of socket's IP_ADD_MEMBERSHIP socket option... so I was expecting IGMP reports to be generated which would start multicast traffic from the router... but I found out experimentally that no IGMP reports are generated when you use this socket option.)
Your guess is correct. PACKET_ADD_MEMBERSHIP should add addresses to the NIC's hardware filter. As you've surmised, it's intended to allow you to receive multicasts for a number of different addresses without incurring the load(*) of full promiscuous mode.
(* With modern full duplex ethernet, there's generally not a lot of traffic coming to the NIC that it wouldn't want to receive anyway, unless it's in a virtualized environment.)
Note that there is also a separate PACKET_MR_UNICAST which does not appear in the packet(7) man page but works analogously. I would use the appropriate one (unicast vs multicast) for the type of address you're filtering on, as it's conceivable (though unlikely) that a driver would refuse to put a unicast address into the multicast filtering table.
All that being said, you'll still need to keep your software filtering as backup. There are some older drivers that don't implement MAC filtering at all (particularly for multiple unicast addresses). The core kernel or the driver handles this by simply turning on promiscuous mode if the feature isn't available.
As for the relationship with IP_ADD_MEMBERSHIP, the IP_ADD_MEMBERSHIP code will automatically construct the appropriate multicast MAC address and add it to the interface. See ip_mc_filter_add.
May be the question is a bit stupid, but I'll ask it. I read a lot about raw sockets in network, have seen several examples. So, basically with raw sockets it's possible to build own stack of headers, like stack = IP + TCP/UDP + OWN_HEADER. My question is, is it possible to get some kind of ready frame of first two(IP + TCP/UDP) from the linux kernel and then just append own header to them? The operating system in question is linux and the language is C.
I cannot find any function which can do such a thing, but may be I'm digging in a wrong direction.
I would rely on existing protocols (i.e., UDP for best-effort and TCP for reliable) and encapsulate my own header inside them. In practice, this means embed your own protocol over a standard UDP/TCP connection. This way, you can use existing kernel and user-level tools.
If you want to do the other way around (i.e., UDP/TCP encapsulated in your own header) you have to design/implement your own protocol in the linux kernel, which is quite complicated.
No, that is not possible for raw sockets.
You either sit on top of TCP/UDP, in which case the IP stack takes care of the headers and the operation of the protocol(e.g. in the case of TCP how to slice up your data into segments),
So, if you want to add stuff on top of TCP or UDP, that's what a normal TCP or UDP socket is for.
Or you sit on top of IP, in which case it is your responsibility to craft whatever you want on top of IP - That's what a raw socket is for. Though in this case you can construct the IP header as well, or opt in to have the IP stack generate the IP header too.
I don't think you understand what RAW sockets are for. With RAW sockets (s = socket (AF_INET, SOCK_RAW, IPPROTO_RAW)) you have to build everything above the physical layer yourself.
That means you build the IP header (if you want to build your protocol on top of IP of course), you could do your own thing here with AF_PACKET. And if you want TCP, you build the TCP header yourself, and add it to the IP header. In most cases when you use RAW sockets, this is where you would start building your own protocol in place of TCP or UDP, otherwise why use RAW Sockets in the first place? IF you wanted to build your own ICMP or SCTP implementation for example, you would use RAW Sockets.
If you really want to understand how this works I suggest building your own version of "ping" (ICMP echo request implementation, in other words). Its easy to do and easy to test and will force you to get your hands a little dirty.
The man pages cover this entire topic quite well if you ask me. Start at socket.
I am unable to understand or grasp rather; what it means to program at a lower layer in socket programming. I am used to working with tcp/udp/file system sockets. These are all wrapped around their own protocol specifications ... which as i understand would make it working at the application layer in the stack.
In the project i am on , i have seen some files which are "named" LinkLayer, TransportLayer... but i don't see any more calls other than standard socket calls....send /recv/ seletct...
Does the fact you are setting a socket options mean you are programming at a lower level ? Is it just restricted to it? Or are there other API's which grant you access at the representation in kernel ?
Typically this refers to using SOCK_RAW sockets, which requires you to assemble your own packet headers, calculate checksums, etc. You still use send/recv/etc. but now you are responsible for making sure every bit is in the right place.
You can use SOCK_RAW sockets to implement protocols other than TCP or UDP, or to do things with the Internet protocols that higher-level interfaces don't accommodate (like tweaking the TTL of your packets to implement something like traceroute).
This usually means working on a lower OSI-Layer, for example, not directly sending TCP-streams or UDP-packets, but crafting own IP or even Ethernet packets or other low-layer protocols which would - in normal case - be handled by the operating system.
This can be done done via specific socket options which enable you to receive or send data on any layer, even layer 2 (Data Link).
I have an application bound to eth0, sending UDP packets on port A to 255.255.255.255. At the same time, I have a UDP server bound to eth0, 0.0.0.0 and port A.
What I want to do is to make sure that the server won't receive messages generated by the application (handled purely in software by the kernel) but it will receive messages generated by other hosts in the network.
I can't change the payload of UDP packets nor add any headers to it.
I've already implemented a solution using RTNETLINK to fetch all IP addresses of the machine I'm sitting on (and filter basing on address from recvfrom()), but I'm wondering if there might be a simpler and cleaner solution.
EDIT: I thought about something like tagging the skb - the tag would disappear after leaving a physical interface, but wouldn't if it's just routed in the software.
Any ideas?
If you can patch your Linux kernel, you could use a setsockopt() option for choosing if you want to loopback the broadcast packets you're sending or not.
This patch reuse the IP_MULTICAST_LOOP option exactly for this purpose.
Also, instead of "messing" with the IP_MULTICAST_LOOP option, you could easily add your own setsockopt() option, maybe called IP_BROADCAST_NO_LOOP. This would guarantee that you're not changing the behavior for any other application.
You can compute a checksum or CRC (better) over the payload and filter against this.
You can do this at the firewall level by dropping packets to broadcast address port A with source address of the eth0.