How to update the internal state of an nginx module at runtime? - c

Let's suppose I want to write an nginx module that blocks clients by IP.
To do so, at the initialization stage I read a file of IP addresses
that I have to block (a blacklist) and store it in the module's context.
Now I want to update the blacklist without restarting nginx.
One possible solution is to add a handler on a specific location,
e.g. if the URI "/block/1.2.3.4" is requested, my handler adds IP address 1.2.3.4 to the blacklist.
However, nginx runs several workers as separate processes, so only the one worker that handled the request will be updated.
What is a common pattern for coping with such problems?

But nginx does not require a restart (nor any downtime) in order to change the configuration!
See:
http://nginx.org/en/docs/control.html#reconfiguration
In order for nginx to re-read the configuration file, a HUP signal should be sent to the master process. The master process first checks the syntax validity, then tries to apply new configuration, that is, to open log files and new listen sockets. If this fails, it rolls back changes and continues to work with old configuration. If this succeeds, it starts new worker processes, and sends messages to old worker processes requesting them to shut down gracefully. Old worker processes close listen sockets and continue to service old clients. After all clients are serviced, old worker processes are shut down.
As an administrator, it would be my expectation that all modules would, in fact, be controlled in this way, too.
(Of course, if you require a lot of changes to the configuration very often, a different solution might be more appropriate.)
You give an explicit example of blocking access by IP. Are you sure you require a new module in order to accomplish the task? It would seem that a combination of the following standard directives might suffice already:
http://nginx.org/r/deny && http://nginx.org/r/allow
http://nginx.org/r/geo
http://nginx.org/r/map
http://nginx.org/r/if && http://nginx.org/r/return

If you're able to move the blacklist outside of the module's context, perhaps to a system file, a KV store, or SHM, each process could consult one central blacklist. I believe shmat() and a futex will do the job, and the overhead will be negligible.
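The shared-memory idea can be sketched outside of nginx as follows. This is a minimal illustration, not nginx's own API: it uses an anonymous MAP_SHARED mapping created before fork() instead of shmat(), and a C11 atomic spin lock instead of a futex. The names blacklist_t, blacklist_create, blacklist_add and blacklist_contains are hypothetical, and the fixed capacity and linear scan are simplifying assumptions. nginx also has its own shared memory zones for modules, which would be the natural fit inside a real module.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* exposes MAP_ANONYMOUS under strict -std= modes */
#endif
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

#define BL_CAPACITY 1024    /* assumption: a small, fixed-size table */

typedef struct {
    atomic_flag lock;                 /* spin lock serialising writers */
    atomic_uint count;                /* number of published entries */
    uint32_t    addrs[BL_CAPACITY];   /* IPv4 addresses, host byte order */
} blacklist_t;

/* Map the table MAP_SHARED before fork() so that every worker process
 * inherits the same physical pages. Returns NULL on failure. */
blacklist_t *blacklist_create(void)
{
    void *mem = mmap(NULL, sizeof(blacklist_t), PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return NULL;
    blacklist_t *bl = mem;
    atomic_flag_clear(&bl->lock);
    atomic_init(&bl->count, 0);
    return bl;
}

/* Append an address. Any worker may call this; the entry is written
 * first and only then published by bumping the counter.
 * Returns 0 on success, -1 if the table is full. */
int blacklist_add(blacklist_t *bl, uint32_t addr)
{
    int rc = -1;
    assert(bl != NULL);
    while (atomic_flag_test_and_set_explicit(&bl->lock, memory_order_acquire))
        ;   /* spin until the other writer is done */
    unsigned n = atomic_load_explicit(&bl->count, memory_order_relaxed);
    if (n < BL_CAPACITY) {
        bl->addrs[n] = addr;
        atomic_store_explicit(&bl->count, n + 1, memory_order_release);
        rc = 0;
    }
    atomic_flag_clear_explicit(&bl->lock, memory_order_release);
    return rc;
}

/* Lock-free lookup: readers only look at entries already published. */
int blacklist_contains(blacklist_t *bl, uint32_t addr)
{
    unsigned n = atomic_load_explicit(&bl->count, memory_order_acquire);
    for (unsigned i = 0; i < n; i++)
        if (bl->addrs[i] == addr)
            return 1;
    return 0;
}
```

The master would call blacklist_create() once before forking the workers; after that, an address added by whichever worker handled the "/block/..." request is visible to all of them on their next lookup.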

Related

Simplest way to awake multiple processes with a single broadcast?

Context: this is a web/sqlite application. One process receives new data over TCP and feeds it to a SQLite database. Other processes (the number is variable) are launched as required when clients connect and request updates over HTML5's server-sent events interface (this might change to websocket in the future).
The idea is to force the client apps to block, and to find a way for the server to create a notification that will wakeup all awaiting clients.
Note that the clients aren't fork'ed from the server.
I'm hoping for a solution that:
doesn't require clients to register themselves to the server
allows the server to broadcast even if no client is listening - and doesn't create a huge pile of unprocessed notifications
allows clients to detect that server isn't present
allows clients to define a custom timeout (maximum wait time for an event)
Solutions checked:
sqlite3_update_hook() - only works within a single process (damned, that would have been sleek)
signals: I still have nightmares about the last time I used signals. Maybe signalfd would be better (server creates a folder, clients create unique files, and server notifies all files in that folder)
inotify - didn't read enough on this one
semaphores / locks / shared memory - can't think of a non-hacked way to use these. The server could update a shared memory area with the row ID of the line just inserted in the DB, but then what?
I'm sure I'm missing something obvious - but what? At this time, polling seems to be the best option!
Thanks.
Just a suggestion: can you try message queues? Multiple clients can connect to the same queue and receive one broadcast message, and each client can have its own message queue if it requires communication with the server.
Message queues are implemented by the Linux kernel and they are very reliable. I personally use message queues to pass messages from several clients to a central routing daemon, with the clients being responsible for processing and returning the altered data.

How to restrict write access to a Linux directory by process attributes?

We've got a situation where it would be advantageous to limit write access to a logging directory to a specific subset of user processes. These particular processes (say, for example, telnet and the like) have been modified by us to generate a logging record whenever a significant user action takes place (like a remote connection, etc). What we do not want is for the user to manually create these records by copying and editing existing logging records.
syslog comes close but still allows the user to generate spurious records. SELinux seems plausible but has a terrible reputation for being an unmanageable beast.
Any insight is appreciated.
Run a local logging daemon as root. Have it listen on a Unix domain socket (typically /var/run/my-logger.socket or similar).
Write a simple logging library, where event messages are sent to the locally running daemon via the Unix domain socket. With each event, also send the process credentials via an ancillary message. See man 7 unix for details.
When the local logging daemon receives a message, it checks for the ancillary message, and if none, discards the message. The uid and gid of the credentials tell exactly who is running the process that has sent the logging request; these are verified by the kernel itself, so they cannot be spoofed (unless you have root privileges).
Here comes the clever bit: the daemon also checks the PID in the credentials, and based on its value, /proc/PID/exe. It is a symlink to the actual binary being executed by the process that sent the message, something the user cannot fake. To be able to fake a message, they'd have to overwrite the actual binaries with their own, and that should require root privileges.
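The daemon side of the two checks above might look roughly like this on Linux. It is a sketch under assumptions: the helper names enable_passcred, recv_with_creds and exe_path_of are hypothetical, a datagram socket is assumed, and error handling is reduced to return codes. The calls themselves (SO_PASSCRED, SCM_CREDENTIALS, struct ucred, readlink on /proc/PID/exe) are the Linux mechanisms described in man 7 unix.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE         /* struct ucred and SCM_CREDENTIALS are Linux-specific */
#endif
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Ask the kernel to attach the sender's credentials to every message
 * received on fd. Returns 0 on success, -1 on error. */
int enable_passcred(int fd)
{
    int on = 1;
    return setsockopt(fd, SOL_SOCKET, SO_PASSCRED, &on, sizeof on);
}

/* Receive one datagram together with its credentials.
 * Returns the payload length, or -1 on error or if no credentials
 * were attached (such messages should be discarded). */
ssize_t recv_with_creds(int fd, void *buf, size_t len, struct ucred *out)
{
    union {
        struct cmsghdr hdr;   /* forces correct alignment */
        char space[CMSG_SPACE(sizeof(struct ucred))];
    } ctl;
    struct iovec iov = { .iov_base = buf, .iov_len = len };
    struct msghdr msg;

    memset(&msg, 0, sizeof msg);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.space;
    msg.msg_controllen = sizeof ctl.space;

    ssize_t n = recvmsg(fd, &msg, 0);
    if (n < 0)
        return -1;

    for (struct cmsghdr *c = CMSG_FIRSTHDR(&msg); c != NULL;
         c = CMSG_NXTHDR(&msg, c)) {
        if (c->cmsg_level == SOL_SOCKET && c->cmsg_type == SCM_CREDENTIALS) {
            memcpy(out, CMSG_DATA(c), sizeof *out);
            return n;
        }
    }
    return -1;
}

/* Resolve /proc/PID/exe into path (NUL-terminated).
 * Returns 0 on success, -1 on failure. */
int exe_path_of(pid_t pid, char *path, size_t size)
{
    char link[64];
    snprintf(link, sizeof link, "/proc/%ld/exe", (long)pid);
    ssize_t n = readlink(link, path, size - 1);
    if (n < 0)
        return -1;
    path[n] = '\0';
    return 0;
}
```

The daemon would call enable_passcred() once on its socket, then for each message compare the path returned by exe_path_of(cred.pid, ...) against its list of allowed binaries before accepting the record.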
(There is a possible race condition: a user may craft a special program that sends the message and then immediately exec()s a binary they know to be allowed. To avoid that race, you may need to have the daemon respond after checking the credentials, and have the logging client send another message (with credentials), so the daemon can verify that the credentials are still the same and the /proc/PID/exe symlink has not changed. I would personally use this to check the message veracity: the logger asks for confirmation of the event with a random cookie, and the requester responds with both the event checksum and the cookie to confirm that the checksum is correct. Including the random cookie should make it impossible to stuff the confirmation into the socket queue before exec().)
With the PID you can also do further checks. For example, you can trace the process parentage to see how the human user has connected, by following parents until you detect a login via ssh or console. It's a bit tedious, since you'll need to parse the /proc/PID/stat or /proc/PID/status files, and it is nonportable. OSX and the BSDs have a sysctl call you can use to find out the parent process ID, so you can make it portable by writing a platform-specific parent_process_of(pid_t pid) function.
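A Linux-only version of the parent_process_of() function mentioned above could be sketched like this. It assumes the documented /proc/PID/stat layout (pid, command name in parentheses, state, parent PID) and returns -1 on any failure. The buffer sizes are arbitrary choices.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Return the parent PID of pid by parsing /proc/PID/stat, or -1 on error. */
pid_t parent_process_of(pid_t pid)
{
    char path[64], line[1024];

    snprintf(path, sizeof path, "/proc/%ld/stat", (long)pid);
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;
    char *ok = fgets(line, sizeof line, f);
    fclose(f);
    if (ok == NULL)
        return -1;

    /* The command name is wrapped in parentheses and may itself contain
     * spaces or ')' characters, so parsing resumes after the LAST ')'. */
    char *p = strrchr(line, ')');
    char state;
    long ppid;
    if (p == NULL || sscanf(p + 1, " %c %ld", &state, &ppid) != 2)
        return -1;
    return (pid_t)ppid;
}
```

Calling it repeatedly, starting from the PID in the credentials, walks up the process tree until the login process (or PID 1) is reached.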
This approach will make sure your logging daemon knows exactly 1) which executable the logging request came from, and 2) which user (and how connected, if you do the process tracing) ran the command.
As the local logging daemon is running as root, it can log the events to file(s) in a root-only directory, and/or forward the messages to a remote machine.
Obviously, this is not exactly lightweight, but assuming you have fewer than a dozen events per second, the logging overhead should be completely negligible.
Generally there are two ways of doing this. One is to run these processes as root and write-protect the directory (mentioned mainly for historical purposes); then no one but root can write there. The second, and more secure, is to run them as another user (not root) and give that user, but no one else, write access to the log directory.
The approach we went with was to use a setuid binary to allow write access to the logging directory. The binary was executable by all users but would only allow a log record to be written if the parent process path, as given by /proc/$PPID/exe, matched one of the modified binary paths we had placed on the system.

Is a server an infinite loop running as a background process?

Is a server essentially a background process running an infinite loop listening on a port? For example:
while (1) {
    command = read(127.0.0.1:xxxx);
    if (command) {
        execute(command);
    }
}
When I say server, I obviously am not referring to a physical server (computer). I am referring to a MySQL server, or Apache, etc.
Full disclosure - I haven't had time to poke through any source code. Actual code examples would be great!
That's more or less what server software generally does.
Usually it gets more complicated because the infinite loop "only" accepts the connection and each connection can often handle multiple "commands" (or whatever they are called in the used protocol), but the basic idea is roughly this.
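As a concrete version of that idea, here is a minimal blocking TCP server in C. It is only a sketch: the function names make_listener, serve_one and serve_forever are made up for this example, echoing the bytes back stands in for execute(command), and it handles one client at a time.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Create a TCP socket listening on 127.0.0.1:port. Passing port 0 lets the
 * kernel pick a free port. The port actually bound is stored in *bound.
 * Returns the listening descriptor, or -1 on error. */
int make_listener(uint16_t port, uint16_t *bound)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    int on = 1;
    setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof on);

    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);

    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(fd, 16) < 0 ||
        getsockname(fd, (struct sockaddr *)&addr, &len) < 0) {
        close(fd);
        return -1;
    }
    *bound = ntohs(addr.sin_port);
    return fd;
}

/* Accept ONE connection, then handle every "command" sent on it until the
 * client closes its side. Returns the number of bytes handled, or -1. */
long serve_one(int listen_fd)
{
    int conn = accept(listen_fd, NULL, NULL);   /* blocks until a client connects */
    if (conn < 0)
        return -1;

    char buf[512];
    long total = 0;
    ssize_t n;
    while ((n = read(conn, buf, sizeof buf)) > 0) {
        if (write(conn, buf, (size_t)n) != n)   /* stand-in for execute(command) */
            break;
        total += n;
    }
    close(conn);
    return total;
}

/* The "infinite loop" from the question: accept, serve, repeat. */
void serve_forever(int listen_fd)
{
    for (;;)
        serve_one(listen_fd);
}
```

A program would call make_listener() once and then serve_forever(); the outer loop only accepts connections, while the inner loop in serve_one() processes the commands arriving on each one.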
There are three kinds of 'servers' - forking, threading and single threaded (non-blocking). All of them generally loop the way you show, the difference is what happens when there is something to be serviced.
A forking service is just that. For every request, fork() is invoked creating a new child process that handles the request, then exits (or remains alive, to handle subsequent requests, depending on the design).
A threading service is like a forking service, but instead of a whole new process, a new thread is created to serve the request. As with forks, threads sometimes stay around to handle subsequent requests. The difference in performance and footprint is simply the difference between threads and forks. Depending on how much of the process's memory is unrelated to servicing a client (and how likely it is to change), it's usually better not to clone the entire address space. The only added complexity here is synchronization.
A single process (aka single threaded) server will fork only once to daemonize. It will not spawn new threads, it will not spawn child processes. It will continue to poll() the socket to find out when the file descriptor is ready to receive data, or has data available to be processed. Data for each connection is kept in its own structure, identified by various states (writing, waiting for ACK, reading, closing, etc). This can be an extremely efficient design, if done properly. Instead of having multiple children or threads blocking while waiting to do work, you have a single process and event loop servicing requests as they are ready.
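A single-process event loop of this kind might be sketched as below. The names event_loop_t, loop_init, loop_add and loop_step are invented for this example; the per-connection "state" is reduced to a slot in the pollfd table, the work done per request is a plain echo, and a short or failed write simply drops the connection. Those are simplifying assumptions, not how a production server would behave.

```c
#include <assert.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

#define MAX_FDS 64              /* assumption: small fixed-size table */

typedef struct {
    struct pollfd fds[MAX_FDS]; /* fds[0] is the listening socket */
    nfds_t        count;
} event_loop_t;

/* Start watching fd for readability. Returns 0, or -1 if the table is full. */
int loop_add(event_loop_t *lp, int fd)
{
    if (lp->count >= MAX_FDS)
        return -1;
    lp->fds[lp->count].fd = fd;
    lp->fds[lp->count].events = POLLIN;
    lp->fds[lp->count].revents = 0;
    lp->count++;
    return 0;
}

void loop_init(event_loop_t *lp, int listen_fd)
{
    lp->count = 0;
    loop_add(lp, listen_fd);
}

/* One iteration of the loop: wait up to timeout_ms for activity, then
 * service every descriptor that is ready. Returns poll()'s result. */
int loop_step(event_loop_t *lp, int timeout_ms)
{
    int ready = poll(lp->fds, lp->count, timeout_ms);
    if (ready <= 0)
        return ready;

    /* A readable listening socket means a new connection is waiting. */
    if (lp->fds[0].revents & POLLIN) {
        int conn = accept(lp->fds[0].fd, NULL, NULL);
        if (conn >= 0 && loop_add(lp, conn) < 0)
            close(conn);        /* table full: refuse the client */
    }

    /* Service existing connections without ever blocking on one of them. */
    for (nfds_t i = 1; i < lp->count; ) {
        struct pollfd *p = &lp->fds[i];
        if (!(p->revents & (POLLIN | POLLHUP | POLLERR))) {
            i++;
            continue;
        }
        char buf[512];
        ssize_t n = read(p->fd, buf, sizeof buf);
        if (n > 0 && write(p->fd, buf, (size_t)n) == n) {
            p->revents = 0;
            i++;
            continue;
        }
        /* EOF or error: close and swap-remove, then re-check slot i. */
        close(p->fd);
        *p = lp->fds[--lp->count];
    }
    return ready;
}
```

The server's main loop is then just loop_init() followed by for (;;) loop_step(&lp, -1); no thread or child process ever blocks on behalf of a single client.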
There are instances where single-threaded services spawn multiple threads; however, the additional threads aren't servicing incoming requests. One might (for instance) set up a local socket in a thread that allows an administrator to obtain the status of all connections.
A little googling for non blocking http server will yield some interesting hand rolled web servers written as code golf challenges.
In short, the difference is what happens once the endless loop is entered, not just the endless loop :)
In a manner of speaking, yes. A server is simply something that "loops forever" and serves. However, you'll typically find that "daemons" also do things like redirect STDOUT and STDERR to files or /dev/null, double fork, and so on. Your code is a very simplistic "server" in a sense.

Managing resources allocated by client processes

As part of an experiment, I want to write an OpenGL-based UI server for applications, similar to X11 or Quartz in architecture: a core process renders objects into a single viewport, but all graphical objects are controlled by remote processes.
The idea is that the view's stability depends only on the core process. If a client process segfaults, its allocated resources would be safely freed. A requirement for that feature is being able to reliably find out whether a client process has crashed.
What is the best practice here?
I think this should be detected as an event on the connection to the client, just as with any other client/server architecture.
If you use sockets, the socket will eventually register that one side has closed the socket (as the process crashes, its end of the socket will be closed), and you can detect that, look up the owning client in the server's records, and clean out all resources.
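A small sketch of that detection, assuming stream sockets on a POSIX system: when the client process dies, the kernel closes its end, and the server sees end-of-file (or a reset) on its own end. The names client_t, peer_closed and reap_if_gone are hypothetical, and the resources counter is a stand-in for whatever the server allocated for the client.

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

typedef struct {
    int fd;          /* connection to the client, -1 once released */
    int resources;   /* stand-in for objects allocated for this client */
} client_t;

/* Non-destructively check whether the peer has gone away.
 * Returns 1 if the connection is closed or broken, 0 otherwise. */
int peer_closed(int fd)
{
    char byte;
    ssize_t n = recv(fd, &byte, 1, MSG_PEEK | MSG_DONTWAIT);
    if (n == 0)
        return 1;    /* end-of-file: the other end was closed */
    if (n < 0 && errno != EAGAIN && errno != EWOULDBLOCK && errno != EINTR)
        return 1;    /* hard error such as ECONNRESET */
    return 0;        /* data pending, or simply nothing to read yet */
}

/* Free everything owned by a client whose connection is gone.
 * Returns 1 if the client was reaped, 0 if it is still alive. */
int reap_if_gone(client_t *c)
{
    if (c->fd < 0 || !peer_closed(c->fd))
        return 0;
    close(c->fd);
    c->fd = -1;
    c->resources = 0;
    return 1;
}
```

In an event-driven server the same condition normally shows up as a readable descriptor whose read returns 0, so an explicit check like peer_closed() is only needed when the server is not already polling the connection.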
It would be very weird for the server to directly (through process IDs or whatever) look for the clients, and that would also needlessly limit your architecture to only run locally, and not across a network.

Cleanest way to stop a process on Win32?

While implementing an application server and its client-side libraries in C++, I am having trouble finding a clean and reliable way to stop client processes on server shutdown on Windows.
Assuming the server and its clients run under the same user, the requirements are:
the solution should work in the following cases:
clients may each feature either a console or a gui.
user may be unprivileged.
clients may be or become unresponsive (infinite loop, deadlock).
clients may or may not be children of the server (direct or indirect).
unless prevented by a client-side defect, clients shall be allowed the opportunity to exit cleanly (free their resources, sync some data to disk...) and some reasonable time to do so.
all client return codes shall be made available (if possible) to the server during the shutdown procedure.
server shall wait until all clients are gone.
As of this edit, the majority of the answers below advocate the use of a shared memory (or another IPC mechanism) between the server and its clients to convey shutdown orders and client status. These solutions would work, but require that clients successfully initialize the library.
What I did not say, is that the server is also used to start the clients and in some cases other programs/scripts which don't use the client library at all. A solution that did not rely on a graceful communication between server and clients would be nicer (if possible).
Some time ago, I stumbled upon a C snippet (in the MSDN, I believe) that did the following:
started a thread via CreateRemoteThread in the process to shut down.
had that thread directly call ExitProcess.
Unfortunately, now that I'm looking for it, I'm unable to find it, and the search results seem to imply that this trick does not work anymore on Vista. Any expert input on this?
If you use threads, a simple solution is a named system event: a thread in each client sleeps on the event, waiting for it to be signaled, and the control application signals the event when it wants the client applications to quit.
In a UI application, the thread can post a message to the main window (WM_CLOSE or WM_QUIT, I forget which). In a console application, it can issue a CTRL-C, or, if the main console code loops, the loop can check some exit condition set by the thread.
Either way, rather than finding the client applications and telling them to quit, use the OS to signal that they should quit. The sleeping thread will use virtually no CPU, provided it sleeps in WaitForSingleObject.
You want some sort of IPC between clients and servers. If all clients were children, I think pipes would have been easiest; since they're not, I guess a server-operated shared-memory segment can be used to register clients, issue the shutdown command, and collect return codes posted there by clients successfully shutting down.
In this shared-memory area, clients put their process IDs, so that the server can forcefully kill any unresponsive clients (modulo server privileges), using TerminateProcess().
If you are willing to go the IPC route, make the normal communication between client and server bi-directional to let the server ask the clients to shut down. Or, failing that, have the clients poll. Or, as a last resort, the clients can be instructed to exit when they next make a request to the server. You can let the library user register an exit callback, but the best way I know of is to simply call "exit" in the client library when the client is told to shut down. If the client gets stuck in shutdown code, the server needs to be able to work around it by ignoring that client's data structures and connection.
Use PostMessage or a named event.
Re: PostMessage -- applications other than GUIs, as well as threads other than the GUI thread, can have message loops and it's very useful for stuff like this. (In fact COM uses message loops under the hood.) I've done it before with ATL but am a little rusty with that.
If you want to be robust to malicious attacks from "bad" processes, include a private key shared by client/server as one of the parameters in the message.
The named event approach is probably simpler; use CreateEvent with a name that is a secret shared by the client/server, and have the appropriate app check the status of the event (e.g. WaitForSingleObject with a timeout of 0) within its main loop to determine whether to shut down.
That's a very general question, and there are some inconsistencies.
While it is not a hard-and-fast rule, most console applications run to completion, whereas GUI applications run until the user terminates them (and services run until stopped via the SCM). Hence, it's easier to request a GUI to close: you send it the equivalent of Alt-F4. For a console program, you have to send the equivalent of Ctrl-C and hope it handles it. In both cases, you simply wait. If the process sticks around, you then shoot it down (TerminateProcess) and pray that the damage is limited. But your HDD can fill up with temporary files.
GUI applications in general do not have exit codes - where would they go? And a console process that is forcefully terminated by definition does not exit, so it has no exit code. So, in a server shutdown scenario, don't expect exit codes.
If you've got a debugger attached, you generally can't shut down the process from another application. That would make it impossible for debuggers to debug exit code!
