Unix domain sockets (C) - Client "deletes" socket on connect()?

This may be a bit difficult to explain succinctly, but I will do my best given my novice understanding of the domain and the problem.
I have two processes: a stream server which first unlinks the socket path, then creates a socket descriptor, binds, listens, and accepts on a local Unix socket. The server's job is to accept a connection and to send and receive arbitrary data. The client's job is the same except for the initial setup: it creates a socket descriptor and connects to the Unix socket.
Upon launching the server, I can verify that the Unix socket file is created. Upon launching the client, connect() fails with an error saying the file or directory doesn't exist. And indeed, when I look for the Unix socket file again, it no longer exists...
Does anyone know why this happens, or where the bug causing this behavior may lie?
If code snippets would be helpful to clarify, I can certainly post those as well.
struct addrinfo *server;   /* populated by the modified getaddrinfo() */
int sockfd;

sockfd = socket(server->ai_family, server->ai_socktype, server->ai_protocol);

if (connect(sockfd, server->ai_addr, server->ai_addrlen) == 0)
    return sockfd;
else
    perror("connect()");
It's probably also worth noting that I'm using a modified version of getaddrinfo to populate the addrinfo struct for the unix domain specifically.

Following the server startup, check that the socket file exists on the client system, i.e. make sure that the path you are going to put in the sun_path field of the struct sockaddr_un passed to connect() in the client actually exists. It must match the path that the server created and passed to bind(). Also make sure that you populate the sun_family field with AF_UNIX in both the client and the server.
In the client, do not create or delete the socket file; i.e. there should not be an unlink() anywhere in the client code related to the location of the server socket.
These are the general steps I would follow to ensure that the code is doing the right thing. There is a sample server/client in the old but still reliable Beej's Guide to Unix IPC, which is probably the simplest example to compare against.
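For reference, a minimal client along those lines might look like the sketch below; the path argument is a placeholder for whatever path the server actually binds to, and the helper name is my own:

#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Minimal UNIX-domain stream client sketch. */
int connect_unix_client(const char *path)
{
    struct sockaddr_un addr;
    int sockfd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (sockfd == -1) {
        perror("socket()");
        return -1;
    }

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;                       /* must be AF_UNIX on both ends */
    strncpy(addr.sun_path, path, sizeof(addr.sun_path) - 1);

    /* Note: no unlink() here -- the client must not touch the socket file. */
    if (connect(sockfd, (struct sockaddr *)&addr, sizeof(addr)) == -1) {
        perror("connect()");
        close(sockfd);
        return -1;
    }
    return sockfd;
}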
Edit: Based on the discussion in the comments, it turns out that the custom getaddrinfo call is the culprit in the deletion of the Unix socket file. There is server-side logic in that code which checks whether hints->ai_flags & AI_PASSIVE is set. If it is, the code unlinks the socket file, as it expects the caller to be about to perform a bind() (i.e. to be a server); the unlink is there because bind() would fail if the socket file already existed. The behavior of the AI_PASSIVE flag is codified in the RFC:
If the AI_PASSIVE flag is specified, the returned address information
shall be suitable for use in binding a socket for accepting incoming
connections for the specified service (i.e., a call to bind()).
However, the last sentence of that paragraph states:
This flag is ignored if the nodename argument is not null
So it seems the logic is slightly incorrect for this case of the call getaddrinfo("/local", "/tmp/socket", hints, &server), as the nodename parameter is not null.
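For illustration only (the modified getaddrinfo is not shown in the question, so the parameter names below are assumptions), the guard around the unlink would need to look roughly like this to honor that rule:

/* Hypothetical sketch: only treat the request as server-side (and unlink
   the socket file) when AI_PASSIVE is set AND the nodename argument is NULL,
   as the quoted wording describes. */
if (nodename == NULL && (hints->ai_flags & AI_PASSIVE))
    unlink(servname);   /* assumption: servname carries the socket path */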

What is the purpose of "\0hidden" in an AF_UNIX socket path?

I followed a tutorial on how to make two processes on Linux communicate using the Linux Sockets API, and that's the code it showed to make it happen:
Connecting code:
char* socket_path = "\0hidden";
int fd = socket(AF_UNIX, SOCK_STREAM, 0);
struct sockaddr_un addr;
memset(&addr, 0x0, sizeof(addr));
addr.sun_family = AF_UNIX;
*addr.sun_path = '\0';
strncpy(addr.sun_path+1, socket_path+1, sizeof(addr.sun_path)-2);
connect(fd, (struct sockaddr*)&addr, sizeof(addr));
Listening code:
char* socket_path = "\0hidden";
struct sockaddr_un addr;
int fd = socket(AF_UNIX, SOCK_STREAM, 0);
memset(&addr, 0x0, sizeof(addr));
addr.sun_family = AF_UNIX;
*addr.sun_path = '\0';
strncpy(addr.sun_path+1, socket_path+1, sizeof(addr.sun_path)-2);
bind(fd, (struct sockaddr*)&addr, sizeof(addr));
listen(fd, 5);
Basically, I have written a web server for a website in C, and a database management system in C++, and I am making them communicate using this mechanism (after a user's browser sends an HTTP request to my web server, which listens for it on an AF_INET socket, but that's not important here, just some context). The database system listens with its socket, and the web server connects to it using its own socket. It's been working perfectly fine.
However, I never understood what the purpose of a null byte at the beginning of the socket path is. Like, what the heck does "\0hidden" mean, or what does it do? I read the manpage on sockets, it says something about virtual sockets, but it's too technical for me to get what's going on. I also don't have a clear understanding of the concept of representing sockets as files with file descriptors. I don't understand the role of the strncpy() either. I don't even understand how the web server finds the database system with this code block, is it because their processes were both started from executables in the same directory, or is it because the database system is the only process on the entire system listening on an AF_UNIX socket, or what?
If someone could explain this piece of the Linux Sockets API that has been mystifying me for so long, I'd be really grateful. I've googled and looked at multiple places, and everyone simply seems to be using "\0hidden" without ever explaining it, as if it's some basic thing that everyone should know. Like, am I missing some piece of theory here or what? Massive thanks to anybody explaining in advance!
This is specific to the Linux kernel implementation of the AF_UNIX local sockets. If the character array which gives a socket name is an empty string, then the name doesn't refer to anything in the filesystem namespace; the remaining bytes of the character array are treated as an internal name sitting in the kernel's memory. Note that this name is not null-terminated; all bytes in the character array are significant, regardless of their value. (Therefore it is a good thing that your example program is doing a memset of the structure to zero bytes before copying in the name.)
This allows applications to have named socket rendezvous points that do not occupy nodes in the filesystem and are therefore more similar to TCP or UDP port numbers (which also don't sit in the file system). These rendezvous points disappear automatically when all sockets referencing them are closed.
Nodes in the file system have some disadvantages. Creating and accessing them requires a storage device. To prevent that, they can be created in a temporary filesystem that exists in RAM like tmpfs in Linux; but tmpfs entries are almost certainly slower to access and take more RAM than a specialized entry in the AF_UNIX implementation. Sockets that are needed temporarily (e.g. while an application is running) may stay around if the application crashes, needing external intervention to clean them up.
hidden is probably not a good name for a socket; programs should take advantage of the space and use something quasi-guaranteed not to clash with anyone else. The name allows over 100 characters, so it's probably a good idea to use some sort of UUID string.
The Linux Programmer's Manual man page calls this kind of address "abstract". It is distinct and different from "unnamed".
Any standard AF_UNIX implementation provides "unnamed" sockets, which can be created in two ways: any AF_UNIX socket that has been created with socket but not given an address with bind is unnamed, and the pair of sockets created by socketpair are unnamed.
For more information, see
man 7 unix
in some GNU/Linux distro that has the Linux Man Pages installed.
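Since every byte of the abstract name is significant, a common refinement is to pass the exact address length to bind() and connect() rather than sizeof(struct sockaddr_un). A minimal sketch of that, with the helper name and error handling being my own:

#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Bind a Linux abstract-namespace socket using the exact address length,
   so no trailing zero bytes become part of the name. */
int bind_abstract(int fd, const char *name)
{
    struct sockaddr_un addr;
    size_t namelen = strlen(name);

    if (namelen > sizeof(addr.sun_path) - 1)
        return -1;                                 /* name too long */

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    addr.sun_path[0] = '\0';                       /* abstract namespace marker */
    memcpy(addr.sun_path + 1, name, namelen);

    socklen_t len = offsetof(struct sockaddr_un, sun_path) + 1 + namelen;
    return bind(fd, (struct sockaddr *)&addr, len);
}

The code in the question instead passes sizeof(addr) everywhere, which also works as long as both ends do exactly the same thing, because the trailing NUL padding then simply becomes part of the name.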
\0 just puts a NUL character into the string. As a NUL character is used to terminate a string, socket_path looks like an empty string to all C string functions; in fact it is not empty, but they stop processing it after the first character.
So in memory socket_path actually looks like this:
char socket_path[] = { '\0', 'h', 'i', 'd', 'd', 'e', 'n', '\0' };
as all string literals automatically get a terminating NUL attached.
The line
strncpy(addr.sun_path+1, socket_path+1, sizeof(addr.sun_path)-2);
copies the bytes of socket_path into the socket address structure addr, but skips the first (NUL) byte of the source as well as the last byte of the destination (also NUL). Thus what gets copied is effectively just the word "hidden".
But as the first byte of addr.sun_path is also left untouched, and that byte was already initialized to NUL by the memset, the actual address is still \0hidden.
So why would anyone do that? Probably to hide the socket: normally systems show UNIX sockets in the file system as actual path entries, but no file system I'm aware of can handle a \0 character in a name. So if the name starts with a \0 character, it won't appear in the file system. Such a character is only honored as the very first character; otherwise the system would still try to create that path entry and fail, and thus socket creation would fail. Only when it is the first character will the system not even try to create a path entry, which means you cannot see that socket just by running ls in a terminal, and whoever wants to connect to it needs to know the name.
Note that this is not POSIX conformant, as POSIX expects UNIX sockets to always appear in the file system, and thus only characters that are legal for the file system in use are allowed in a socket name. This will only work on Linux.

How to filter a multicast receiving socket by interface?

I need to create two sockets listening on the same IP:port but on different interfaces:
socket0 receives UDP traffic sent to 224.2.2.2:5000 on interface eth0
socket1 receives UDP traffic sent to 224.2.2.2:5000 on interface eth1
It seemed pretty straightforward until I realized that Linux merges all of that into the same traffic. For example, say there's only traffic on eth1 and there's no activity on eth0. When I first create socket0 it won't be receiving any data, but as soon as I create socket1 (and join the multicast group) then socket0 will also start receiving the same data. I found this link that explains this.
Now this actually makes sense to me because the only moment when I specify the network interface is when joining the multicast group with setsockopt(socket, IPPROTO_IP, IP_ADD_MEMBERSHIP, ...) and ip_mreq.imr_interface.s_addr. I believe this specifies which interface joins the group but has nothing to do with which interface your socket will receive from.
What I have tried so far is binding the sockets to the multicast address and port, which behaves as mentioned above. I've tried binding to the interface address, but that doesn't work on Linux (it seems to on Windows, though): you don't receive any traffic on the socket. And finally, I've tried binding to INADDR_ANY, but this isn't what I want since I will receive any other data sent to the port regardless of the destination IP (say unicast, for example), and it still won't stop multicast data from other interfaces.
I cannot use SO_BINDTODEVICE since it requires root privileges.
So what I want to know is if this is possible at all. If it can't be done then that's fine, I'll take that as an answer and move on, I just haven't been able to find any way around it. Oh, and I've tagged the question as C because that's what we're using, but I'm thinking it really might not be specific to the language.
I haven't included the code for this because I believe it's more of a theoretical question rather than a problem with the source code. We've been working with sockets (multicast or otherwise) for a while now without any problems, it's just this is the first time we've had to deal with multiple interfaces. But if you think it might help I can write some minimal working example.
Edit about the possible duplicate:
I think the use case I'm trying to achieve here is different. The socket is supposed to receive data from the same multicast group and port (224.2.2.2:5000 in the example above) but only from one specific interface. To put it another way, both interfaces are receiving data from the same multicast group (but different networks, so the data is different) and I need each socket to only listen on one interface.
I think that question is about multiple groups on same port, rather than same group from different interfaces. Unless there's something I'm not seeing there that might actually help me with this.
Yes, you can do what you want on Linux, without root privileges:
Bind to INADDR_ANY and set the IP_PKTINFO socket option. You then have to use recvmsg() to receive your multicast UDP packets and to scan for the IP_PKTINFO control message. This gives you some side band information of the received UDP packet:
struct in_pktinfo {
    unsigned int   ipi_ifindex;  /* Interface index */
    struct in_addr ipi_spec_dst; /* Local address */
    struct in_addr ipi_addr;     /* Header destination address */
};
The ipi_ifindex is the interface index the packet was received on. (You can turn this into an interface name using if_indextoname(), or go the other way round with if_nametoindex().)
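A rough sketch of that approach, assuming the socket is already bound to INADDR_ANY:5000, has joined the group on both interfaces, and has had the option enabled with something like setsockopt(fd, IPPROTO_IP, IP_PKTINFO, &one, sizeof(one)); the function name is my own:

#define _GNU_SOURCE            /* for struct in_pktinfo on some libcs */
#include <net/if.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Receive one datagram and report which interface it arrived on;
   a caller could drop datagrams whose ipi_ifindex is not the wanted one. */
void recv_with_ifindex(int fd)
{
    char data[2048];
    char cbuf[CMSG_SPACE(sizeof(struct in_pktinfo))];
    struct iovec iov = { .iov_base = data, .iov_len = sizeof(data) };
    struct msghdr msg = {
        .msg_iov = &iov, .msg_iovlen = 1,
        .msg_control = cbuf, .msg_controllen = sizeof(cbuf),
    };

    ssize_t n = recvmsg(fd, &msg, 0);
    if (n < 0)
        return;

    for (struct cmsghdr *c = CMSG_FIRSTHDR(&msg); c; c = CMSG_NXTHDR(&msg, c)) {
        if (c->cmsg_level == IPPROTO_IP && c->cmsg_type == IP_PKTINFO) {
            struct in_pktinfo pi;
            char name[IF_NAMESIZE];
            memcpy(&pi, CMSG_DATA(c), sizeof(pi));
            printf("%zd bytes received via %s\n", n,
                   if_indextoname(pi.ipi_ifindex, name) ? name : "?");
        }
    }
}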
As you said, on Windows the same network functions have different semantics, especially for UDP and even more so for multicast.
The Linux bind() semantics for the IP address for UDP sockets are mostly useless. It is essentially just a destination address filter. You will almost always want to bind to INADDR_ANY for UDP sockets since you either do not care to which address a packet was sent or you want to receive packets for multiple addresses (e.g. receiving unicast and multicast).

Purpose of sendto address for C raw socket?

I'm sending some ping packets via a raw socket in C, on my linux machine.
int sock_fd = socket(AF_INET, SOCK_RAW, IPPROTO_RAW);
This means that I specify the IP packet header when I write to the socket (IP_HDRINCL is implied).
Writing to the socket with send fails, telling me I need to specify an address.
If I use sendto then it works. For sendto I must specify a sockaddr_in struct to use, which includes the fields sin_family, sin_port and sin_addr.
However, I have noticed a few things:
The sin_family is AF_INET - which was already specified when the socket was created.
The sin_port is naturally unused (ports are not a concept for IP).
It doesn't matter what address I use, so long as it is an external address (the IP packet specifies 8.8.8.8 and the sin_addr specifies 1.1.1.1).
It seems none of the extra fields in sendto are actually used to great extent. So, is there a technical reason why I have to use sendto instead of send or is it just an oversight in the API?
Writing to the socket with send fails, telling me I need to specify an address.
It fails because the send() function can only be used on connected sockets (as stated here). Usually you would use send() for TCP communication (connection-oriented), while sendto() is used to send UDP datagrams (connectionless).
Since you want to send "ping" packets, or more correctly ICMP datagrams, which are clearly connectionless, you have to use the sendto() function.
It seems none of the extra fields in sendto are actually used to great
extent. So, is there a technical reason why I have to use sendto
instead of send or is it just an oversight in the API?
Short answer:
When you are not allowed to use send(), then there is only one option left, called sendto().
Long answer:
It is not just an oversight in the API. If you want to send a UDP datagram by using an ordinary socket (e.g. SOCK_DGRAM), sendto() needs the information about the destination address and port, which you provided in the struct sockaddr_in, right? The kernel will insert that information into the resulting IP header, since the struct sockaddr_in is the only place where you specified who the receiver will be. Or in other words: in this case the kernel has to take the destination info from your struct as you don't provide an additional IP header.
Because sendto() is not only used for UDP but also raw sockets, it has to be a more or less "generic" function which can cover all the different use cases, even when some parameters like the port number are not relevant/used in the end.
For instance, by using IPPROTO_RAW (which automatically implies IP_HDRINCL), you show your intention that you want to create the IP header on your own. Thus the last two arguments of sendto() are actually redundant information, because they're already included in the data buffer you pass to sendto() as the second argument. Note that, even when you use IP_HDRINCL with your raw socket, the kernel will fill in the source address and checksum of your IP datagram if you set the corresponding fields to 0.
If you want to write your own ping program, you could also change the last argument in your socket() call from IPPROTO_RAW to IPPROTO_ICMP and let the kernel create the IP header for you, so you have one thing less to worry about. Now you can easily see how the two sendto() parameters dest_addr and addrlen become significant again, because that is the only place where you provide a destination address.
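A rough sketch of that variant follows; the helper names, identifier values, and error handling are mine, and creating the raw socket still requires root or CAP_NET_RAW:

#include <arpa/inet.h>
#include <netinet/in.h>
#include <netinet/ip_icmp.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Standard ones'-complement Internet checksum over the ICMP message. */
static unsigned short icmp_cksum(const void *buf, size_t len)
{
    const unsigned short *p = buf;
    unsigned long sum = 0;
    while (len > 1) { sum += *p++; len -= 2; }
    if (len) sum += *(const unsigned char *)p;
    sum = (sum >> 16) + (sum & 0xffff);
    sum += (sum >> 16);
    return (unsigned short)~sum;
}

/* Send one ICMP echo request; the kernel builds the IP header for us. */
int send_ping(const char *dest_ip)
{
    int fd = socket(AF_INET, SOCK_RAW, IPPROTO_ICMP);   /* needs CAP_NET_RAW */
    if (fd == -1)
        return -1;

    struct icmphdr icmp;
    memset(&icmp, 0, sizeof(icmp));
    icmp.type = ICMP_ECHO;
    icmp.un.echo.id = htons(1234);        /* arbitrary identifier */
    icmp.un.echo.sequence = htons(1);
    icmp.checksum = icmp_cksum(&icmp, sizeof(icmp));

    struct sockaddr_in dst;
    memset(&dst, 0, sizeof(dst));
    dst.sin_family = AF_INET;             /* sin_port is ignored for raw sockets */
    inet_pton(AF_INET, dest_ip, &dst.sin_addr);

    ssize_t n = sendto(fd, &icmp, sizeof(icmp), 0,
                       (struct sockaddr *)&dst, sizeof(dst));
    close(fd);
    return n == (ssize_t)sizeof(icmp) ? 0 : -1;
}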
The language and APIs are very old and have grown over time. Some APIs can look weird from today's perspective, but you can't change the old interfaces without breaking a huge amount of existing code. Sometimes you just have to get used to things that were defined/designed many years or decades ago.
Hope that answers your question.
The send() call is used when the socket is in a connected state, e.g. a TCP SOCK_STREAM socket.
From the man page:
the send() call may be used only when the socket is in a connected
state (so that the intended recipient is known).
Since your application obviously does not connect with any other socket, we cannot expect send() to work.
In addition to InvertedHeli's answer, the dest_addr passed to sendto() will be used by the kernel to determine which network interface to use.
For example, if dest_addr has the IP 127.0.0.1 and the raw packet has destination address 8.8.8.8, your packet will still be routed to the lo interface.

Using C sockets: Address already in use

So the basic premise of my program is that I'm supposed to create a TCP session, direct traffic through it, and detect any connection losses. If the connection does break, I need to close the sockets and reopen them (using the same ports) in such a way that it will seem like the connection (almost) never died. It should also be noted that the two programs will be treated as proxies (data gets sent to them; if the connection breaks, it gets stored until the connection is restored, then the data is sent off).
I've done some research and gone ahead and used setsockopt() with the SO_REUSEADDR option to set the socket options so that I can reuse the address.
Here's the basic algorithm I use to detect a connection break, using signals:
1. After the initial setup of the sockets, begin sending data
2. After x seconds, set a flag to false, which will prevent all other data from being sent
3. Send a single piece of data to let the other program know the connection is still open, and reset the timer to x seconds
4. If I receive the same piece of data from the other program, set the flag to true to continue sending
5. If I don't receive the data after x seconds, close the socket and attempt to reconnect
(step 5 is where I'm getting the error).
Essentially one program is a client (on one VM) and one program is a server (on another VM), each sending and receiving data to/from each other and to/from another program on each VM.
My question is: given that I'm still getting this error after setting the socket options, why am I not allowed to re-bind the address when a connection break has been detected?
The server is the one complaining when a disconnect is detected (I close the socket, open a new one, set the option, and attempt to bind the port with the same information).
One other thing of note is the way I'm receiving the data from the sockets. If I have a socket open, I'm basically reading it by doing the following:
while ((x = recv(socket, buff, 1, 0)) >= 0) {
    // add to buffer
    // send out to other program if connection is alive
    // note: recv() returns 0 on an orderly shutdown, so >= 0 keeps looping at EOF
}
Since I'm using the timer to close/reopen the socket, and this is in a different thread, will this prevent the socket from closing?
SO_REUSEADDR only allows limited reuse of ports. Specifically, it does not allow reuse of a port that some other socket is currently actively listening for incoming connections on.
There seems to be an epidemic here of people calling bind() and then setsockopt() and wondering why the setsockopt() doesn't fix an error that had already happened on bind().
You have to call setsockopt() first.
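A minimal sketch of that ordering (the function name, address, and backlog value here are placeholders, not taken from the question's code):

#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Create a listening socket, setting SO_REUSEADDR before bind(). */
int make_listener(unsigned short port)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd == -1)
        return -1;

    int yes = 1;
    if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof(yes)) == -1)
        return -1;                          /* must happen before bind() */

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) == -1)
        return -1;
    return listen(fd, 5) == -1 ? -1 : fd;
}

The point is simply the ordering: setting the option after bind() has already failed does nothing.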
But I don't understand your problem. Why do you think you need to use the same ports? Why are you setting a flag preventing you from sending data? You don't need any of this. Just handle the errors on send() when and if they arise, creating a new connection when necessary. Don't try to out-think TCP. Many have tried, few if any have succeeded.

Data transfer over TCP-IPv6 connection

I am working on a client-server application in C on the Linux platform. What I am trying to achieve is to switch the socket ID used for a TCP connection, on both client and server, without data loss, where the client sends data from a file to the server in the main thread. The application is multithreaded; the other threads change the socket ID based on some global flags that get set.
Problem: The application has two TCP socket connections established, over both IPv4 and IPv6 paths. I am transferring a file over the TCP-IPv4 connection first, in the main thread. The other thread checks some global flags and has access to/shares the socket IDs created for each protocol in the main thread. The send and recv calls use a pointer variable to select the socket ID to be used for the data transfer. The data is transferred initially over TCP-IPv4. Once the global flags are set and a few other checks are made, the other thread changes the socket ID used in the send call to point to the IPv6 socket. This thread also takes care of communicating the change between the two hosts. I am getting all the data sent over IPv4 completely before switching. I am also getting data over IPv6 right after the socket ID is switched. But further into the transfer there is loss of data over the IPv6 connection. (I am using a pointer variable in the send function on the server side, send(*p_dataSocket.socket_id, sentence, p_size, 0);, to change the pointer to the IPv6 socket ID on the fly.)
The error after the recv and send calls on both sides, respectively, is ESPIPE: Illegal seek, but this error exists even before switching, so I am pretty sure it has nothing to do with the data loss.
I am using pselect() to check for available data on each socket. I can somewhat understand data loss while switching (if not properly handled), but I am not able to figure out why the data loss occurs later in the transfer, after switching. I hope I am clear on what the issue is. I have also checked sending the data individually over each protocol without switching, and there is no data loss. If I initially transfer the data over IPv6 and then switch to IPv4, there is no data loss either. I would also really appreciate knowing how to investigate this issue apart from using errno or netstat.
When you are using TCP to send data, you just can't lose a part of the information in between. You either receive the byte stream the way it was sent or receive nothing at all, provided that you are using the socket-related functions correctly.
There are several points you may want to investigate.
First of all you must make sure that you are really sending the data which is lost. Add some logging to the server-side application: dump anything that you transmit with send() into some file. Include some extra info as well, like:
Data packet no.==1234, *p_dataSocket.socket_id==11, Data=="data_contents_here", 22 bytes total; send() return==22
The important thing here is to watch the contents of *p_dataSocket.socket_id. Make sure that you are using a mutex or something like that, because you have a thread which regularly reads the socket_id contents and another thread which occasionally changes it. You are not guaranteed to read a correct value from that address unless your threads have exclusive access to it while reading/writing. This is important both for normal program operation and for generating the debugging information.
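For illustration, a sketch of that kind of protection; the names here are illustrative and not taken from the question's code:

#include <pthread.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Guard the shared socket descriptor with a mutex so the sending thread
   and the thread that switches between IPv4 and IPv6 never race on it. */
static pthread_mutex_t sock_lock = PTHREAD_MUTEX_INITIALIZER;
static int active_socket;                  /* currently selected IPv4 or IPv6 fd */

ssize_t send_locked(const void *buf, size_t len)
{
    pthread_mutex_lock(&sock_lock);
    ssize_t n = send(active_socket, buf, len, 0);    /* fd read under the lock */
    pthread_mutex_unlock(&sock_lock);
    return n;
}

void switch_socket(int new_fd)
{
    pthread_mutex_lock(&sock_lock);
    active_socket = new_fd;                           /* fd written under the lock */
    pthread_mutex_unlock(&sock_lock);
}

Holding the lock across the send() is the simplest correct form; a finer-grained scheme could copy the descriptor under the lock and send outside it, at the cost of possibly using the old socket for one last message.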
Another possible problem here is the logic which selects the sentence to send. Corruption of this variable may be hard to track in a multithreaded program. The logging of transmitted information will help you here too.
Use any TCP sniffer to check what the TCP stack really transmits. Are there packets with lost data? If those packets are missing, try to find out which send() call was responsible for sending that data. If those packets exist, check the receiving side for bugs.
The errno value should not be used alone. Its value has meaning only when you get an erroneous return from a function. Try to find out when exactly errno becomes ESPIPE. That may happen when one of the API functions returns something like -1 (depending on the function). When you find out where it happens, you should find out what is wrong in that particular piece of code (a debugger is your friend). Keep in mind that errno behavior in a multithreaded environment depends on your system implementation. Make sure that you use the -pthread option (gcc), or at least compile with -D_REENTRANT, to minimize the risks.
Check this question for some info about the possible cause of your situation with errno == ESPIPE. Try some debugging techniques, as suggested there. An errno value of ESPIPE gives a hint that you are using file descriptors incorrectly somewhere in your program. Maybe somewhere you are using a socket fd as a regular file or something like that. This may be caused by some race condition (simultaneous access to one object from several threads).
