I am new to C. My server creates two threads that listen on two different ports; each of them calls bind() -> listen() -> accept(). Two clients then connect, one to each port, so each thread's accept() returns a file descriptor. What I am curious about is this: can the two returned file descriptors be the same integer?
A file descriptor is an opaque handle that you are not expected to interpret, so strictly speaking it is "not your business" to know this ;)
Within a process, the file descriptors that are open at the same time are guaranteed to be unique. So the two threads will receive two different integers (actually, multi-threading does not affect this question at all: the result is the same as if both sockets had been opened in the main thread).
They can be the same integer if the first thread closes its accepted socket before the accept() on the second socket returns: file descriptor numbers are recycled once they are closed.
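Both answers can be checked with a small experiment. This is only a sketch: the function name is made up for illustration, it uses plain socket() instead of accept() to keep it short, and it assumes a single-threaded process, so that nothing else grabs the freed number in between.

```c
#include <assert.h>
#include <sys/socket.h>
#include <unistd.h>

/* Returns 1 if a descriptor number was handed out again after close(). */
static int fd_reused_after_close(void)
{
    int a = socket(AF_INET, SOCK_STREAM, 0);
    int b = socket(AF_INET, SOCK_STREAM, 0);
    assert(a >= 0 && b >= 0);
    assert(a != b);             /* both open at the same time: always distinct */

    close(a);                   /* a's number is free again */
    int c = socket(AF_INET, SOCK_STREAM, 0);
    int reused = (c == a);      /* the lowest free number is handed out */

    close(b);
    close(c);
    return reused;
}
```

So two descriptors that are open at the same time never collide, but a number can come back once its previous owner has been closed.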
From my understanding, C pipes are like a special kind of file, where the kernel internally keeps track of the opens and closes from each process in a table.
So in that sense:
Is it possible for 1 single pipe to be connected by multiple processes?
If it is possible, can multiple processes read the same data?
If 2 is possible, will they be reading the same data, or does reading the data "empty" the data?
For example: process 1 writes into pipe, can process 2,3,4 read the data that process 1 wrote?
Yes, multiple processes can read from (or write to) a pipe.
But data isn't duplicated for the processes. Once data has been read from the pipe by one process, it's lost and available only to the process that actually read it.
Conversely, if you have multiple processes writing to a single pipe, there's no way to tell which process a given piece of data came from.
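The "read once, then gone" behaviour is easy to see for yourself. A minimal sketch, assuming a second descriptor made with dup() is an acceptable stand-in for another process's read end (the function name is hypothetical):

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Returns 1 if the second reader finds the pipe empty after the first read. */
static int pipe_data_is_consumed_once(void)
{
    int p[2];
    char buf[8];

    if (pipe(p) != 0) return 0;
    /* Non-blocking, so an empty pipe reports EAGAIN instead of hanging. */
    if (fcntl(p[0], F_SETFL, O_NONBLOCK) != 0) return 0;
    int second_reader = dup(p[0]);   /* stands in for another process */
    assert(second_reader >= 0);

    if (write(p[1], "hello", 5) != 5) return 0;
    ssize_t first = read(p[0], buf, sizeof buf);            /* takes all 5 bytes */
    ssize_t second = read(second_reader, buf, sizeof buf);  /* nothing left */
    int empty = (second == -1 && errno == EAGAIN);

    close(p[0]);
    close(p[1]);
    close(second_reader);
    return first == 5 && empty;
}
```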
1. Is it possible for 1 single pipe to be connected by multiple processes?
Yes.
2. If it is possible, can multiple processes read the same data?
No!
Unix FIFOs (pipes) cannot be used in a "single producer, multiple consumer" (spmc) manner; this also holds for Unix domain sockets (UDS) (for most implementations UDS and FIFOs are implemented by the very same code, with just a few configuration bits differing on creation). Each byte written into a pipe / SOCK_STREAM UDS (or each datagram written into a SOCK_DGRAM Unix domain socket) can be read from only one single reading end.
However, a "multiple producer, single consumer" FIFO or UDS is perfectly possible: the consumer keeps one reading end open (and also keeps the writing end open, but does not use it¹), and multiple producers send data to that single consumer. For stream-oriented pipes the bytes from different producers can interleave (POSIX only guarantees that a single write of at most PIPE_BUF bytes is not split up), so the data may get mixed up. For SOCK_DGRAM UDS socketpairs, message boundaries are preserved.
¹: There's a particular pitfall: the reader sees end-of-file (read() returns 0) once the last open writing end has been closed. If the consuming process does not keep its own instance of the writing end open, the stream ends as soon as no producer has the pipe open any more, even if other producers are about to connect.
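What keeping a writing end open on the consumer side changes can be tried out with a FIFO. The following is a sketch under a few assumptions: /tmp is writable, the enum and function names are made up for illustration, and a single producer that writes once and closes stands in for the real clients.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

enum { SAW_EOF = 0, SAW_EAGAIN = 1, SAW_ERROR = 2 };

/* Reports what the reader sees once the only producer has closed and the
 * pending data has been drained. keep_writer != 0 makes the consumer hold
 * its own (unused) writing end open. */
static int read_after_producer_closes(int keep_writer)
{
    char dir[] = "/tmp/fifo-demo-XXXXXX";
    char path[64], buf[16];
    int result = SAW_ERROR;

    if (mkdtemp(dir) == NULL) return SAW_ERROR;
    snprintf(path, sizeof path, "%s/fifo", dir);
    if (mkfifo(path, 0600) != 0) { rmdir(dir); return SAW_ERROR; }

    /* A non-blocking read-only open succeeds even without a writer. */
    int rfd = open(path, O_RDONLY | O_NONBLOCK);
    int keeper = keep_writer ? open(path, O_WRONLY) : -1;
    int producer = open(path, O_WRONLY);
    assert(rfd >= 0 && producer >= 0);

    ssize_t w = write(producer, "msg", 3);
    close(producer);                       /* the producer goes away */

    if (w == 3 && read(rfd, buf, sizeof buf) == 3) {
        ssize_t n = read(rfd, buf, sizeof buf);
        if (n == 0) result = SAW_EOF;      /* no writer left */
        else if (n == -1 && errno == EAGAIN) result = SAW_EAGAIN;
    }

    if (keeper >= 0) close(keeper);
    close(rfd);
    unlink(path);
    rmdir(dir);
    return result;
}
```

Without the extra writing end the reader gets end-of-file (read() returns 0); with it, the reader is only told that there is no data yet and can keep waiting for the next producer.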
I am currently measuring the performance of named pipes to compare them with another library.
I need to simulate a clients (n) / server (1) situation where the server reads messages and performs a simple action for every message written. So the clients are the writers.
My code works now, but if I add a second writer, the reader (server) never sees its data and never receives anything again. At the end, the file still contains the unread data and read() returns 0.
Is it OK for a single named pipe to be written to by multiple processes? Do I need to open it with a special flag for that?
I am not sure I can/should use multiple writers on a single pipe. But I am also not sure that creating one pipe for each client would be a good design.
Would it be a more standard design to use one named pipe per client connection?
I know about Unix domain sockets and they will be used later. For now I need to make the named pipes work.
I have two file descriptors created with socket() and both are connected to separate hosts. I want anything received on the first socket to be immediately sent on the second and vice versa.
I know I can achieve this manually with a combination of select(), send() and recv(), but is there a more direct way to tell the kernel to simply pipe the output from one into the other?
No. You can use a tool like netcat which does this for you (so you don't have to write the code) but even netcat contains a loop that copies the data.
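For reference, the copy loop itself is short. Here is a rough sketch; it uses poll() rather than select(), stops as soon as either side reaches end-of-file, and all function names are made up for illustration. relay_demo() only exercises it with two local socketpairs instead of real remote hosts.

```c
#include <assert.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Write all len bytes; returns 0 on success, -1 on error. */
static int write_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n <= 0) return -1;
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Copy bytes in both directions between a and b until either side reaches
 * end-of-file. Returns the number of bytes forwarded, or -1 on error. */
static long relay(int a, int b)
{
    struct pollfd fds[2] = { { .fd = a, .events = POLLIN },
                             { .fd = b, .events = POLLIN } };
    char buf[4096];
    long total = 0;

    assert(a >= 0 && b >= 0);
    for (;;) {
        if (poll(fds, 2, -1) < 0) return -1;
        for (int i = 0; i < 2; i++) {
            if (!(fds[i].revents & (POLLIN | POLLHUP | POLLERR))) continue;
            ssize_t n = read(fds[i].fd, buf, sizeof buf);
            if (n < 0) return -1;
            if (n == 0) return total;          /* peer closed its side */
            if (write_all(fds[1 - i].fd, buf, (size_t)n) != 0) return -1;
            total += n;
        }
    }
}

/* Local self-check: returns 1 if "ping" travels from one pair to the other. */
static int relay_demo(void)
{
    int left[2], right[2];
    char buf[8] = {0};

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, left) != 0 ||
        socketpair(AF_UNIX, SOCK_STREAM, 0, right) != 0) return 0;
    if (write(left[0], "ping", 4) != 4) return 0;
    shutdown(left[0], SHUT_WR);                /* relay sees EOF after "ping" */

    long forwarded = relay(left[1], right[1]);
    ssize_t got = read(right[0], buf, sizeof buf);

    close(left[0]); close(left[1]); close(right[0]); close(right[1]);
    return forwarded == 4 && got == 4 && memcmp(buf, "ping", 4) == 0;
}
```

A production version would also want to handle EINTR and half-closed connections, which this sketch leaves out.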
I am doing some network testing, and I am connecting between Linux boxes with two small C programs that just use connect().
After the connection is made, some small calculations are performed and recorded to a local file. I then instruct one of the programs to close the connection and run a netcat listener on the same port. The first program then retries the connection and connects to netcat.
I wondered if someone could advise if it is possible to maintain the initial connection whilst freeing the port and pass the connection to netcat on that port (so that the initial connection is not closed).
Each TCP connection is defined by the four-tuple (target IP address, target port, source IP address, source port), so there is no need to "free up" the port on either machine.
It is very common for a server process to fork() immediately after accept()ing a new connection. The parent process closes its copy of the connection descriptor (returned by accept()), and waits for a new connection. The child process closes the original socket descriptor, and executes the desired program or script that should handle the actual connection. In many cases the child moves the connection descriptor to standard input and standard output (using dup2()), so that the executed script or program does not even need to know it is connected to a remote client: everything it writes to standard output is sent to the remote client, and everything the remote client sends is readable from standard input.
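In code, that pattern looks roughly like the sketch below. The helper names are made up, `cat` is used as a stand-in for "the desired program" (it simply echoes everything back), and fork_handler_demo() is just a loopback self-check that assumes 127.0.0.1 is usable.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Accept one connection and hand it to a child that runs `cat` with the
 * connection as stdin/stdout. Returns the child's pid, or -1 on error. */
static pid_t accept_and_spawn(int listen_fd)
{
    assert(listen_fd >= 0);
    int conn = accept(listen_fd, NULL, NULL);
    if (conn < 0) return -1;

    pid_t pid = fork();
    if (pid == 0) {                       /* child */
        close(listen_fd);                 /* the child does not accept */
        dup2(conn, STDIN_FILENO);
        dup2(conn, STDOUT_FILENO);
        if (conn > STDOUT_FILENO) close(conn);
        execlp("cat", "cat", (char *)NULL);
        _exit(127);                       /* only reached if exec failed */
    }
    close(conn);                          /* parent keeps only the listener */
    return pid;
}

/* Returns 1 if the child echoed our bytes back over the connection. */
static int fork_handler_demo(void)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    char buf[16] = {0};
    size_t got = 0;
    ssize_t n;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);   /* port 0: kernel picks */

    int lfd = socket(AF_INET, SOCK_STREAM, 0);
    if (lfd < 0 || bind(lfd, (struct sockaddr *)&addr, sizeof addr) != 0 ||
        listen(lfd, 1) != 0 ||
        getsockname(lfd, (struct sockaddr *)&addr, &len) != 0) return 0;

    int cfd = socket(AF_INET, SOCK_STREAM, 0);
    if (cfd < 0 || connect(cfd, (struct sockaddr *)&addr, sizeof addr) != 0)
        return 0;

    pid_t pid = accept_and_spawn(lfd);
    if (pid < 0) return 0;

    if (write(cfd, "hello", 5) != 5) return 0;
    shutdown(cfd, SHUT_WR);               /* lets cat see end-of-file */
    while (got < sizeof buf - 1 &&
           (n = read(cfd, buf + got, sizeof buf - 1 - got)) > 0)
        got += (size_t)n;

    waitpid(pid, NULL, 0);
    close(cfd);
    close(lfd);
    return got == 5 && memcmp(buf, "hello", 5) == 0;
}
```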
If there is an existing process that should handle the connection, and there is a Unix domain socket connection (stream, datagram or seqpacket socket; makes no difference) between the two processes, it is possible to transfer the connection descriptor as an SCM_RIGHTS ancillary message. See man 2 sendmsg, man 2 recvmsg, man 3 cmsg, and man 7 unix for details. This only works on the same machine over a Unix domain socket, because the kernel actually duplicates the descriptor from one process to the other: the receiver gets a new descriptor number that refers to the same open connection.
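A minimal sketch of the SCM_RIGHTS transfer follows. The helper names send_fd() and recv_fd() are made up, and to keep it self-contained the demo passes the read end of a pipe across a socketpair within one process instead of passing a TCP connection between two processes; the mechanics are the same.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Send descriptor fd over the Unix domain socket sock. 0 on success. */
static int send_fd(int sock, int fd)
{
    char byte = 0;                        /* at least one data byte is sent */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union { char buf[CMSG_SPACE(sizeof(int))]; struct cmsghdr align; } ctrl;
    struct msghdr msg;

    assert(sock >= 0 && fd >= 0);
    memset(&msg, 0, sizeof msg);
    memset(&ctrl, 0, sizeof ctrl);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof ctrl.buf;

    struct cmsghdr *c = CMSG_FIRSTHDR(&msg);
    c->cmsg_level = SOL_SOCKET;
    c->cmsg_type = SCM_RIGHTS;
    c->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(c), &fd, sizeof fd);
    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive a descriptor from sock. Returns the new descriptor, or -1. */
static int recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union { char buf[CMSG_SPACE(sizeof(int))]; struct cmsghdr align; } ctrl;
    struct msghdr msg;
    int fd = -1;

    memset(&msg, 0, sizeof msg);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof ctrl.buf;
    if (recvmsg(sock, &msg, 0) != 1) return -1;

    struct cmsghdr *c = CMSG_FIRSTHDR(&msg);
    if (c == NULL || c->cmsg_level != SOL_SOCKET || c->cmsg_type != SCM_RIGHTS)
        return -1;
    memcpy(&fd, CMSG_DATA(c), sizeof fd);
    return fd;
}

/* Returns 1 if data written to a pipe is readable via the passed descriptor. */
static int pass_fd_demo(void)
{
    int chan[2], p[2];
    char buf[8] = {0};

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, chan) != 0 || pipe(p) != 0) return 0;
    if (send_fd(chan[0], p[0]) != 0) return 0;
    int received = recv_fd(chan[1]);
    if (received < 0) return 0;

    if (write(p[1], "via", 3) != 3) return 0;
    ssize_t n = read(received, buf, sizeof buf);
    /* New number, same underlying pipe. */
    int ok = (received != p[0] && n == 3 && memcmp(buf, "via", 3) == 0);

    close(chan[0]); close(chan[1]); close(p[0]); close(p[1]); close(received);
    return ok;
}
```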
If your server-side logic is something like
For each incoming connection:
Do some calculations
Store calculations into a file
Store incoming data from the connection into a file (or standard output)
then I recommend using pthreads. Just create the desired number of threads, have all of them wait for an incoming connection by calling accept() on the listening socket, and have each thread handle its connection by itself. You can even use stdio.h I/O for the file I/O. For more complex output (multiple statements per chunk), you'll need a pthread_mutex_t per output stream, and remember to fflush() it before releasing the mutex. I suspect a single multithreaded program that does all that, and exits nicely if interrupted (SIGINT aka CTRL+C), should not exceed three hundred lines of C.
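A stripped-down sketch of that layout is shown below. It is not the full three-hundred-line program: there is no signal handling, each worker handles exactly one connection, the "calculation" is just a byte count, and every name (worker_ctx, worker, thread_pool_demo, NWORKERS) is made up for illustration. The demo connects its own loopback clients and logs to a temporary file. On older toolchains you may need to link with -pthread.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

enum { NWORKERS = 2 };

struct worker_ctx {
    int listen_fd;              /* shared listening socket */
    FILE *out;                  /* shared output stream */
    pthread_mutex_t *out_lock;  /* protects out */
};

/* Accept one connection, read it to EOF, then log a record that is built
 * from two statements while holding the stream's mutex. */
static void *worker(void *arg)
{
    struct worker_ctx *ctx = arg;
    char buf[256];
    size_t total = 0;
    ssize_t n;

    assert(ctx->out != NULL);
    int conn = accept(ctx->listen_fd, NULL, NULL);
    if (conn < 0) return NULL;
    while ((n = read(conn, buf, sizeof buf)) > 0) total += (size_t)n;
    close(conn);

    pthread_mutex_lock(ctx->out_lock);
    fprintf(ctx->out, "connection done: ");
    fprintf(ctx->out, "%zu bytes\n", total);
    fflush(ctx->out);           /* flush before releasing the mutex */
    pthread_mutex_unlock(ctx->out_lock);
    return NULL;
}

/* Returns the number of complete log lines written, or -1 on setup error. */
static int thread_pool_demo(void)
{
    pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
    pthread_t tids[NWORKERS];
    struct worker_ctx ctx;
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    int lines = 0, ch, i;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);   /* port 0: kernel picks */

    int lfd = socket(AF_INET, SOCK_STREAM, 0);
    FILE *out = tmpfile();
    if (lfd < 0 || out == NULL) return -1;
    if (bind(lfd, (struct sockaddr *)&addr, sizeof addr) != 0 ||
        listen(lfd, NWORKERS) != 0 ||
        getsockname(lfd, (struct sockaddr *)&addr, &len) != 0) return -1;

    /* Clients connect first; the backlog holds them until a worker accepts. */
    for (i = 0; i < NWORKERS; i++) {
        int c = socket(AF_INET, SOCK_STREAM, 0);
        if (c < 0 || connect(c, (struct sockaddr *)&addr, sizeof addr) != 0)
            return -1;
        if (write(c, "data", 4) != 4) return -1;
        close(c);
    }

    ctx.listen_fd = lfd;
    ctx.out = out;
    ctx.out_lock = &lock;
    for (i = 0; i < NWORKERS; i++)
        if (pthread_create(&tids[i], NULL, worker, &ctx) != 0) return -1;
    for (i = 0; i < NWORKERS; i++)
        pthread_join(tids[i], NULL);

    rewind(out);
    while ((ch = fgetc(out)) != EOF)
        if (ch == '\n') lines++;
    fclose(out);
    close(lfd);
    return lines;
}
```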
If you only need to output data from a stream socket, or if you only need to write input to it, you can treat it as a file handle.
So in a POSIX program you can use dup2() to duplicate the socket descriptor onto descriptor 1, which is standard output, and then close the original descriptor. Then use exec() to replace your program with "cat", which copies whatever arrives on its standard input (descriptor 0) to standard output (descriptor 1), which is now your socket.
After looking at Unix named sockets, I thought they were named pipes. I looked at named pipes and didn't see much of a difference. I saw they are initialized differently, but that's the only thing I noticed. Both use the C write/read functions and work alike AFAIK.
What's the difference between Unix domain sockets and named pipes? When would I pick one over the other? Which should I use by default (like how I use vector by default in C++, and only use deque, list or whatever else if I have specific needs)?
UNIX-domain sockets are generally more flexible than named pipes. Some of their advantages are:
You can use them for more than two processes communicating (eg. a server process with potentially multiple client processes connecting);
They are bidirectional;
They support passing kernel-verified UID / GID credentials between processes;
They support passing file descriptors between processes;
They support packet and sequenced packet modes.
To use many of these features, you need to use the send() / recv() family of system calls rather than write() / read().
One difference is that named pipes are one-way, so you'll need to use two of them in order to do two-way communication. Sockets of course are two way. It seems slightly more complicated to use two variables instead of one (that is, two pipes instead of one socket).
Also, the wikipedia article is pretty clear on the following point: "Unix domain sockets may be created as byte streams or as datagram sequences, while pipes are byte streams only."
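The byte stream versus datagram difference shows up directly in what a single read returns. A small sketch (function names are made up): two 2-byte messages are written first, then one read is issued.

```c
#include <assert.h>
#include <sys/socket.h>
#include <unistd.h>

/* Two 2-byte writes into a pipe, then one read: returns bytes delivered. */
static ssize_t first_read_from_pipe(void)
{
    int p[2];
    char buf[16];

    if (pipe(p) != 0) return -1;
    ssize_t w1 = write(p[1], "ab", 2);
    ssize_t w2 = write(p[1], "cd", 2);
    assert(w1 == 2 && w2 == 2);
    ssize_t n = read(p[0], buf, sizeof buf);   /* stream: both writes merged */
    close(p[0]);
    close(p[1]);
    return n;
}

/* Same two messages over a SOCK_DGRAM Unix domain socketpair. */
static ssize_t first_read_from_dgram(void)
{
    int s[2];
    char buf[16];

    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, s) != 0) return -1;
    ssize_t w1 = send(s[0], "ab", 2, 0);
    ssize_t w2 = send(s[0], "cd", 2, 0);
    assert(w1 == 2 && w2 == 2);
    ssize_t n = recv(s[1], buf, sizeof buf, 0); /* datagram: first message only */
    close(s[0]);
    close(s[1]);
    return n;
}
```

The pipe hands back all 4 bytes in one read, while the datagram socket returns only the first 2-byte message and keeps the second one for the next recv().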
Named pipes are, in fact, bi-directional but half-duplex. This means that communication may go either from end A to end B, or B to A, but never both at the same time.