Using windows sockets for a new process I/O - c

Are there any restrictions to the types of I/O kernel objects that one can use for redirection when calling CreateProcess()?
As can be seen in the documentation, one of the parameters the function receives is the STARTUPINFO structure. Within this structure, one can specify handles for both input, output and errors. However, it's not mentioned what specific kinds of I/O kernel objects can be use.
I have tested the use of windows sockets as I/O devices, and found that it works. However, I assume that the nature of such a device is rather different than a file object, which makes me wonder if it was actually intended that sockets can be used for this purpose.

When using STARTF_USESTDHANDLES to specify STDIN/OUT/ERR handles for CreateProcess(), they are supposed to be valid kernel objects that are compatible with CloseHandle() (since that is stated as much in the STARTUPINFO documentation). Sockets are not kernel objects and do not meet the CloseHandle() requirement (although sockets can be used with the ReadFile() and WriteFile() kernel functions).
When a caller is launching a redirected process on behalf of a socket connection, the typical scenario is for the caller to use an anonymous pipe via CreatePipe() for the redirection and then proxy data between the socket connection and the pipe as needed.
That being said, one of the answers to another question states that is possible to use a socket for redirection, but only if the socket was created using WSASocket() with the WSA_FLAG_OVERLAPPED flag omitted, since redirection does not allow overlapped handles to be used. I have not personally attempted that solution myself, so I do not know I it works or not. It is better to be safe and use a pipe instead, especially if you have no control over (or knowledge of) how the socket was created.

Well, not sure if I'm answering your question, but in terms of inheritance, as of MSDN documentation states, one can modify the inheritance policy of a socket by means of SetHandleInformation(). This way you can prevent child processes from inherit a listening socket of the parent process.

I don't believe this is documented.
However, common practice says that the standard I/O handles may be any such stream devices as TCP/IP sockets, COM or LPT ports, files, pipes, and so on. I believe there is sample code on MSDN for several of these scenarios.
(The default standard I/O handles are to the console, not to a file, so we can safely assume at least that they do not have to be file handles!)

Related

Message Passing between Processes without using dedicated functions

I'm writing a C program in which I need to pass messages between child processes and the main process. But the thing is, I need to do it without using the functions like msgget() and msgsnd()
How can I implement? What kind of techniques can I use?
There are multiple ways to communicate with children processes, it depends on your application.
Very much depends on the level of abstraction of your application.
-- If the level of abstraction is low:
If you need very fast communication, you could use shared memory (e.g. shm_open()). But that would be complicated to synchronize correctly.
The most used method, and the method I'd use if I were in your shoes is: pipes.
It's simple, fast, and since pipes file descriptors are supported by epoll() and those kind of asynchronous I/O APIs, you can take advantage from this fact.
Another plus is that, if your application grows, and you need to communicate with remote processes (processes that are not in your local machine), adapting pipes to sockets is very easy, basically it's still the same reading/writing from/to a file descriptor.
Also, Unix-domain sockets (which in other platforms are called "named pipes") let you to have a server process that creates a listening socket with a very well known name (e.g. an entry in the filesystem, such as /tmp/my_socket) and all clients in the local machine can connect to that.
Pipes, networking sockets, or unix-domain sockets are very interchangeable solutions, because - as said before - all involve reading/writing data from/to a file descriptor, so you can reuse the code.
The disadvantage with a file descriptor is that you're writing data to a stream of bytes, so you need to implement the "message streaming protocol" of your messages by yourself, to "unstream" your messages (marshalling/unmarshalling), but that's not so complicated in the most of the cases, and that also depends on the kind of messages you're sending.
I'd pass on other solutions such as memory mapped files and so on.
-- If the level of abstraction is higher:
You could use a 3rd party message passing system, such as RabbitMQ, ZMQ, and so on.

setting up IPC between unrelated processes

I would like to inject a shared library into a process (I'm using ptrace() to do that part) and then be able to get output from the shared library back into the debugger I'm writing using some form of IPC. My instinct is to use a pipe, but the only real requirements are:
I don't want to store anything on the filesystem to facilitate the communication as it will only last as long as the debugger is running.
I want a portable Unix solution (so Unix-standard syscalls would be ideal).
The problem I'm running into is that as far as I can see, if I call pipe() in the debugger, there is no way to pass the "sending" end of the pipe to the target process, and vice versa with the receiving end. I could set up shared memory, but I think that would require creating a file somewhere so I could reference the memory segment from both processes. How do other debuggers capture output when they attach to a process after it has already begun running?
I assume that you are in need of a debugging system for your business logic code (I mean application). From my experience, this kind of problem is tackled with below explained system design. (My experience is in C++, I think the same must hold good for the C based system also.)
Have a logger system (a separate process). This will contain - logger manager and the logging code - which will take the responsibility of dumping the log into hard disk.
Each application instance (process running in Unix) will communicate to this process with sockets. So you can have your own messaging protocol and communicate with the logger system with socket based communication.
Later, for each of this application - have a switch which can switch off/on the log.So that you can have a tool - to send signal to this process to switch on/off the message logging.
At a high level, this is the most generic way to develop a logging system. In case you need any information - Do comment it. I will try to answer.
Using better search terms showed me this question is a dup of these guys:
Can I share a file descriptor to another process on linux or are they local to the process?
Can I open a socket and pass it to another process in Linux
How to use sendmsg() to send a file-descriptor via sockets between 2 processes?
The top answers were what I was looking for. You can use a Unix-domain socket to hand a file descriptor off to a different process. This could work either from debugger to library or vice versa, but is probably easier to do from debugger to library because the debugger can write the socket's address into the target process while it injects the library.
However, once I pass the socket's address into the target process, I might as well just use the socket itself instead of using a pipe in addition.

Is there a Windows equivalent for eventfd?

I am writing a cross-platform library which emulates sockets behaviour, having additional functionality in the between (App->mylib->sockets).
I want it to be the most transparent possible for the programmer, so primitives like select and poll must work accordingly with this lib.
The problem is when data becomes available (for instance) in the real socket, it will have to go through a lot of processing, so if select points to the real socket fd, app will be blocked a lot of time. I want the select/poll to unblock only when data is ready to be consumed (after my lib has done all the processing).
So I came across this eventfd which allows me to do exactly what I want, i.e. to manipule select/poll behaviour on a given fd.
Since I am much more familiarized with Linux environment, I don't know what is the windows equivalent of eventfd. Tried to search but got no luck.
Note:
Other approach would be to use another socket connected with the interface, but that seems to be so much overhead. To make a system call with all data just because windows doesn't have (appears so) this functionality.
Or I could just implement my own select, reinventing the wheel. =/
There is none. eventfd is a Linux-specific feature -- it's not even available on other UNIXy operating systems, such as BSD and Mac OS X.
Yes, but it's ridiculous. You can make a Layered Service Provider (globally installed...) that fiddles with the system's network stack. You get to implement all the WinSock2 functions yourself, and forward most of them to the underlying TCP. This is often used by firewalls or antivirus programs to insert themselves into the stack and see what's going on.
In your case, you'd want to use an ioctl to turn on "special" behaviour for your application. Whenever the app tries to create a socket, it gets forwarded to your function, which in turn opens a real TCP socket (say). Instead of returning that HANDLE though, you use a WinSock function to create ask for a dummy handle from the kernel, and give that to the application instead. You do your stuff in a thread. Then, when the app calls WinSock functions on the dummy handle, they end up in your implementation of read, select, etc. You can decouple select notifications on the dummy handle from those on the actual handle. This lets you do things like, for example, transparently give an app a socket that wraps data each way in encryption, indistinguishably from the original socket. (Almost indistinguishably! You can call some LSP APIs on a handle to find out if there's actually and underlying handle you weren't given.)
Pretty heavy-weight, and monstrous in some ways. But, it's there... Hope that's a useful overview.

Inter-program communication for an arbitrary number of programs

I am attempting to have a bunch of independent programs intelligently allocate shared resources among themselves. However, I could have only one program running, or could have a whole bunch of them.
My thought was to mmap a virtual file in each program, but the concurrency is killing me. Mutexes are obviously ineffective because each program could have a lock on the file and be completely oblivious of the others. However, my attempts to write a semaphore have all failed, since the semaphore would be internal to the file, and I can't rely on only one thing writing to it at a time, etc.
I've seen quite a bit about named pipes but it doesn't seem to be to be a practical solution for what I'm doing since I don't know how many other programs there will be, if any, nor any way of identifying which program is participating in my resource-sharing operation.
You could use a UNIX-domain socket (AF_UNIX) - see man 7 unix.
When a process starts up, it tries to bind() a well-known path. If the bind() succeeds then it knows that it is the first to start up, and becomes the "resource allocator". If the bind() fails with EADDRINUSE then another process is already running, and it can connect() to it instead.
You could also use a dedicated resource allocator process that always listens on the path, and arbitrates resource requests.
Not entirely clear what you're trying to do, but personally my first thought would be to use dbus (more detail). Should be easy enough within that framework for your processes/programs to register/announce themselves and enumerate/signal other registered processes, and/or to create a central resource arbiter and communicate with it. Readily available on any system with gnome or KDE installed too.

What are named pipes?

What are they and how do they work?
Context happens to be SQL Server
Both on Windows and POSIX systems, named-pipes provide a way for inter-process communication to occur among processes running on the same machine. What named pipes give you is a way to send your data without having the performance penalty of involving the network stack.
Just like you have a server listening to a IP address/port for incoming requests, a server can also set up a named pipe which can listen for requests. In either cases, the client process (or the DB access library) must know the specific address (or pipe name) to send the request. Often, a commonly used standard default exists (much like port 80 for HTTP, SQL server uses port 1433 in TCP/IP; \\.\pipe\sql\query for a named pipe).
By setting up additional named pipes, you can have multiple DB servers running, each with its own request listeners.
The advantage of named pipes is that it is usually much faster, and frees up network stack resources.
--
BTW, in the Windows world, you can also have named pipes to remote machines -- but in that case, the named pipe is transported over TCP/IP, so you will lose performance. Use named pipes for local machine communication.
Unix and Windows both have things called "Named pipes", but they behave differently. On Unix, a named pipe is a one-way street which typically has just one reader and one writer - the writer writes, and the reader reads, you get it?
On Windows, the thing called a "Named pipe" is an IPC object more like a TCP socket - things can flow both ways and there is some metadata (You can obtain the credentials of the thing on the other end etc).
Unix named pipes appear as a special file in the filesystem and can be accessed with normal file IO commands including the shell. Windows ones don't, and need to be opened with a special system call (after which they behave mostly like a normal win32 handle).
Even more confusing, Unix has something called a "Unix socket" or AF_UNIX socket, which works more like (but not completely like) a win32 "named pipe", being bidirectional.
Linux Pipes
First In First Out (FIFO) interproccess communication mechanism.
Unnamed Pipes
On the command line, represented by a "|" between two commands.
Named Pipes
A FIFO special file. Once created, you can use the pipe just like a normal file(open, close, write, read, etc).
To create a named pipe, called "myPipe", from the command line (man page):
mkfifo myPipe
To create a named pipe from c, where "pathname" is the name you would like the pipe to have and "mode" contains the permissions you want the pipe to have (man page):
#include <sys/types.h>
#include <sys/stat.h>
int mkfifo(const char *pathname, mode_t mode);
According to Wikipedia:
[...] A traditional pipe is "unnamed" because it exists anonymously and persists only for as long as the process is running. A named pipe is system-persistent and exists beyond the life of the process and must be "unlinked" or deleted once it is no longer being used. Processes generally attach to the named pipe (usually appearing as a file) to perform IPC (inter-process communication).
Compare
echo "test" | wc
to
mkdnod apipe p
wc apipe
wc will block until
echo "test" > apipe
executes
This is an exeprt from Technet (so not sure why the marked answer says named pipes are faster??):
Named Pipes vs. TCP/IP Sockets
In a fast local area network (LAN) environment, Transmission Control Protocol/Internet Protocol (TCP/IP) Sockets and Named Pipes clients are comparable with regard to performance. However, the performance difference between the TCP/IP Sockets and Named Pipes clients becomes apparent with slower networks, such as across wide area networks (WANs) or dial-up networks. This is because of the different ways the interprocess communication (IPC) mechanisms communicate between peers.
For named pipes, network communications are typically more interactive. A peer does not send data until another peer asks for it using a read command. A network read typically involves a series of peek named pipes messages before it starts to read the data. These can be very costly in a slow network and cause excessive network traffic, which in turn affects other network clients.
It is also important to clarify if you are talking about local pipes or network pipes. If the server application is running locally on the computer that is running an instance of SQL Server, the local Named Pipes protocol is an option. Local named pipes runs in kernel mode and is very fast.
For TCP/IP Sockets, data transmissions are more streamlined and have less overhead. Data transmissions can also take advantage of TCP/IP Sockets performance enhancement mechanisms such as windowing, delayed acknowledgements, and so on. This can be very helpful in a slow network. Depending on the type of applications, such performance differences can be significant.
TCP/IP Sockets also support a backlog queue. This can provide a limited smoothing effect compared to named pipes that could lead to pipe-busy errors when you are trying to connect to SQL Server.
Generally, TCP/IP is preferred in a slow LAN, WAN, or dial-up network, whereas named pipes can be a better choice when network speed is not the issue, as it offers more functionality, ease of use, and configuration options.
Pipes are a way of streaming data between applications. Under Linux I use this all the time to stream the output of one process into another. This is anonymous because the destination app has no idea where that input-stream comes from. It doesn't need to.
A named pipe is just a way of actively hooking onto an existing pipe and hoovering-up its data. It's for situations where the provider doesn't know what clients will be eating the data.
Inter-process communication (mostly) for Windows Applications. Similar to using sockets to communicate between applications in Unix.
MSDN
Named pipes in a unix/linux context can be used to make two different shells to communicate since a shell just can't share anything with another.
Furthermore, one script instantiated twice in the same shell can't share anything through the two instances. I found a use for named pipes when coding a daemon that contains the start() and stop() function, and I wanted to use the same script to perform the two actions.
Without named pipes (or any kind of semaphore) starting the script in the background is not a problem. The thing is when it finishes you just can't access the instance in background.
So when you want to send him the stop command you just can't: running the same script without named pipes and calling the stop() function won't do anything since you are actually running another instance.
The solution was to implement two pipes, one READ and the other WRITE when you start the daemon. Then make him, among its other tasks, listen to the READ pipe. Then the Stop() function contains a command that will write a message in the pipe, that will be handled by the background running script that will perform an exit 0. This way our second instance of the same script has only on task to do: tell the first instance to stop.
This way one and only one script can start and stop itself.
Of course you have different ways to do it by triggering the stop via a touch for example. But this one is nice and interesting to code.
Named pipes is a windows system for inter-process communication. In the case of SQL server, if the server is on the same machine as the client, then it is possible to use named pipes to tranfer the data, as opposed to TCP/IP.

Resources