Is it possible to shut down a dask.distributed cluster given a Client instance? - distributed

If I have a distributed.Client instance, can I use it to shut down the remote cluster, i.e. kill all the workers and also shut down the scheduler?
If that can't be done using the Client instance, is there another way, other than manually killing each remote process?

There is no client function specifically for this.
The scheduler has a close() method which you could call using run_on_scheduler, like so:

    import sys  # sys is referenced inside the lambda, so it must be imported here

    c.run_on_scheduler(lambda dask_scheduler=None:
        dask_scheduler.close() & sys.exit(0))
which will tell the workers to disconnect and shut down, and will close all connections before terminating the process. Note that this raises an error in the client, since the connection is broken without a reply. There are probably more elegant ways.
Note that the right way to do this is probably to interact with one of the deployment cluster managers. For example, LocalCluster has a user-facing close() method that you can call directly.
--EDIT--
client.shutdown() is now available.

Related

upgrade server executable without losing user's connections

I need to develop a mechanism to upgrade a running daemon in a production environment to a new version without losing clients' (TCP) connections. Something similar to what nginx does when you upgrade it to a new version. I need this for bug fixes or to release minor version changes, which may happen once a day. The daemon is developed in C for the Linux platform.
The process for the upgrade would be like this:
1. The new_daemon would be run from the command line, specifying the process id of the old_daemon.
2. The new_daemon would connect via a socket to the old_daemon to send/receive data and messages.
3. The new_daemon would send the old_daemon a message to stop listening on the PORT used to receive clients' connections. After confirming that the old_daemon has stopped listening, the new_daemon would start listening on PORT.
4. The new_daemon would ask the old_daemon to send over the file descriptors of the currently open client connections. Using the sendmsg() system call, the old_daemon would pass the new_daemon all the resources it has allocated with the kernel, not only the connections but also all open files.
5. The new_daemon would ask the old_daemon to pass all global memory variables, and the old_daemon would send them over the socket connection between the two processes.
This process is very complex, so I would like to ask whether someone can suggest a better approach, or whether there is an established methodology that makes this easier. The goal is to have the least possible downtime during the upgrade.
TIA
Another alternative is to force the old_daemon to fork()/exec() the new_daemon and immediately stop accepting. The new_daemon would inherit the listening socket, existing connections, and open files (unless they are fcntl'd to FD_CLOEXEC) automagically.
That said, I don't think there is a clean way to hand over incomplete jobs (which, as I understand it, is what steps 4 and 5 try to accomplish). If possible, let the old_daemon complete them.
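A minimal sketch of that fork()/exec() handover, assuming the listening socket was created without the close-on-exec flag; the binary path and the --listen-fd argument are made-up names:

    /* old_daemon side: spawn new_daemon and hand it the listening socket. */
    #include <stdio.h>
    #include <sys/types.h>
    #include <unistd.h>

    static void hand_over(int listen_fd)
    {
        pid_t pid = fork();
        if (pid < 0) {
            perror("fork");
            return;
        }
        if (pid == 0) {                         /* child: becomes new_daemon */
            char fd_arg[16];
            snprintf(fd_arg, sizeof fd_arg, "%d", listen_fd);
            execl("./new_daemon", "new_daemon", "--listen-fd", fd_arg, (char *)NULL);
            perror("execl");                    /* reached only if exec failed */
            _exit(1);
        }
        /* parent (old_daemon): stop accepting new connections, finish the
         * requests already in flight, then exit. The child keeps serving
         * on the inherited socket. */
        close(listen_fd);
    }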
One alternative is to write most of your daemon as a shared library and use dlopen to link the new functions into the running process. This means some parts can't be changed and you might have concurrency issues, but it removes the need for IPC.
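A rough sketch of what that can look like; the shared-object path and the handle_request symbol are invented for illustration:

    /* Reload the request handler from a new shared object without
     * restarting the daemon. */
    #include <dlfcn.h>
    #include <stdio.h>

    typedef int (*handler_fn)(int client_fd);

    static handler_fn load_handler(const char *so_path)
    {
        void *lib = dlopen(so_path, RTLD_NOW);
        if (lib == NULL) {
            fprintf(stderr, "dlopen: %s\n", dlerror());
            return NULL;
        }
        /* The previous handle is leaked on purpose in this sketch; a real
         * daemon would dlclose() it once no call is still running inside
         * the old code. */
        return (handler_fn)dlsym(lib, "handle_request");
    }

    /* usage: handler_fn handler = load_handler("./handler_v2.so"); */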

Simplest way to awake multiple processes with a single broadcast?

Context: this is a web/SQLite application. One process receives new data over TCP and feeds it into a SQLite database. Other processes (the number varies) are launched as required when clients connect and request updates over HTML5's server-sent events interface (this might change to WebSocket in the future).
The idea is to have the client apps block, and to find a way for the server to create a notification that will wake up all waiting clients.
Note that the clients aren't fork'ed from the server.
I'm hoping for a solution that:
doesn't require clients to register themselves to the server
allows the server to broadcast even if no client is listening - and doesn't create a huge pile of unprocessed notifications
allows clients to detect that server isn't present
allows clients to define a custom timeout (maximum wait time for an event)
Solutions checked:
sqlite3_update_hook() - only works within a single process (damned, that would have been sleek)
signals: I still have nightmares about the last time I used signals. Maybe signalfd would be better (server creates a folder, clients create unique files, and the server notifies all files in that folder)
inotify - didn't read enough on this one
semaphores / locks / shared memory - can't think of a non-hacked way to use these. The server could update a shared memory area with the row ID of the line just inserted in the DB, but then what?
I'm sure I'm missing something obvious - but what? At this time, polling seems to be the best option!
Thanks.
Just as a suggestion, can you try message queues? Multiple clients can connect to the same queue and receive a broadcast message, and each client can have its own message queue if it requires communication with the server.
Message queues are implemented by the Linux kernel and they are very reliable. I personally use message queues to pass messages from several clients to a central routing daemon, with the clients being responsible for processing and returning the altered data.
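A bare-bones sketch of the per-client POSIX message queue variant (queue name, message size and payload format are invented; link with -lrt on Linux). Note that a given message is consumed by a single reader, so "broadcast" here means the server writes to one queue per listening client:

    #include <fcntl.h>
    #include <mqueue.h>
    #include <stdio.h>
    #include <string.h>

    #define QUEUE_NAME "/updates.client42"     /* hypothetical per-client name */
    #define MSG_SIZE   64

    /* client side: block until the server posts the id of a new row
     * (mq_timedreceive() could be used instead to honour a timeout) */
    static void wait_for_update(void)
    {
        struct mq_attr attr = { .mq_maxmsg = 10, .mq_msgsize = MSG_SIZE };
        mqd_t q = mq_open(QUEUE_NAME, O_CREAT | O_RDONLY, 0600, &attr);
        if (q == (mqd_t)-1)
            return;
        char buf[MSG_SIZE];
        if (mq_receive(q, buf, sizeof buf, NULL) >= 0)
            printf("new row: %s\n", buf);
        mq_close(q);
    }

    /* server side: notify one client after inserting a row */
    static void notify_client(long row_id)
    {
        mqd_t q = mq_open(QUEUE_NAME, O_WRONLY);
        if (q == (mqd_t)-1)
            return;                            /* that client is not listening */
        char buf[MSG_SIZE];
        snprintf(buf, sizeof buf, "%ld", row_id);
        mq_send(q, buf, strlen(buf) + 1, 0);
        mq_close(q);
    }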

No threads and blocking sockets - is it possible to handle several connections?

I have a program that needs to:
Handle 20 connections. My program will act as the client in every connection, each one going to a different server.
Once connected, my client should send a request to the server every second and wait for a response. If no request is sent within 9 seconds, the server will time out the client.
It is unacceptable for one connection to cause problems for the rest of the connections.
I do not have access to threads and I do not have access to non-blocking sockets. I have a single-threaded program with blocking sockets.
Edit: The reason I cannot use threads and non-blocking sockets is that I am on a non-standard system. I have a single RTOS (Real-Time Operating System) task available.
To solve this, using select seems necessary, but I am not sure it is sufficient.
Initially I connect to all the servers. But select can only be used to see whether a read or write will block, not whether a connect will.
So say I have connected to 2 servers and they are both waiting to be served; if the 3rd connect does not work, the call will block and cause the first 2 connections to time out as well.
Can this be solved?
I think the connection issue can be solved by setting a timeout for the connect operation, so that it fails fast enough. Of course that will limit you in the case where the network really is working but the path to some of the server(s) is very long (slow). That's bad design, but your requirements are pretty harsh.
See this answer for details on connection-timeouts.
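For what it's worth, one classic way to bound a blocking connect() on a POSIX-like system is to interrupt it with SIGALRM; whether signals are available on your RTOS is an open question. A sketch:

    #include <errno.h>
    #include <signal.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <unistd.h>

    static void on_alarm(int sig) { (void)sig; /* only purpose: interrupt connect() */ }

    /* returns 0 on success, -1 on error; errno is ETIMEDOUT on timeout */
    static int connect_with_timeout(int fd, const struct sockaddr *addr,
                                    socklen_t len, unsigned seconds)
    {
        struct sigaction sa;
        memset(&sa, 0, sizeof sa);
        sa.sa_handler = on_alarm;          /* note: no SA_RESTART, so connect()
                                              is interrupted rather than restarted */
        sigemptyset(&sa.sa_mask);
        sigaction(SIGALRM, &sa, NULL);

        alarm(seconds);
        int rc = connect(fd, addr, len);
        int saved_errno = errno;
        alarm(0);                          /* cancel any pending alarm */

        if (rc < 0 && saved_errno == EINTR)
            errno = ETIMEDOUT;
        return rc;
    }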
It seems you need to isolate the connections. Well, if you cannot use threads you can always resort to good-old-processes.
Spawn each client by forking your server process and use traditional IPC mechanisms if communication between them is required.
If you cannot use a multi-process approach either, I'm afraid you'll have a hard time doing this.

Reconnecting with hiredis

I'm trying to reconnect to the Redis server on disconnect.
I'm using redisAsyncConnect and I've set up a callback on disconnect. In the callback I try to reconnect with the same command I use at the very start of the program to establish the connection, but it's not working. I can't seem to reconnect.
Can anyone help me out with an example?
Managing Redis (re)connections asynchronously is a bit tricky when an event loop is used.
Here is an example implementing a small zset polling daemon connecting to a list of Redis instances, which is resilient to disconnection events. The ae event loop is used (it is the one used by Redis itself).
http://gist.github.com/4149768
Check the following functions:
connectCallback
disconnectCallback
checkConnections
reconnectIfNeeded
The main daemon loop does its activity only when the connection is available. Once per second, a timer-initiated callback checks whether some connections have to be reestablished. We have found this mechanism quite reliable.
Note: error management is crude in this example for brevity's sake. Real production code should manage errors in a more graceful way.
One tricky point when dealing with multiple asynchronous connections is the fact that there is no user-defined contextual data passed as a parameter to the corresponding callbacks. Cleaning up the data associated with a connection after a disconnection event can be a bit difficult.
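For reference, a stripped-down version of that pattern (the gist does it properly against the ae loop); the host, port and reconnect flag below are illustrative:

    #include <stdio.h>
    #include <hiredis/async.h>

    static redisAsyncContext *ctx = NULL;
    static int need_reconnect = 1;             /* start in the "disconnected" state */

    static void connectCallback(const redisAsyncContext *c, int status)
    {
        if (status != REDIS_OK) {              /* hiredis frees the context itself */
            fprintf(stderr, "connect error: %s\n", c->errstr);
            ctx = NULL;
            need_reconnect = 1;
        }
    }

    static void disconnectCallback(const redisAsyncContext *c, int status)
    {
        (void)status;
        fprintf(stderr, "disconnected: %s\n", c->err ? c->errstr : "(requested)");
        ctx = NULL;
        need_reconnect = 1;                    /* picked up by the periodic check */
    }

    /* call this from a once-per-second timer in the event loop */
    static void reconnectIfNeeded(void)
    {
        if (!need_reconnect)
            return;
        ctx = redisAsyncConnect("127.0.0.1", 6379);
        if (ctx == NULL || ctx->err) {
            if (ctx != NULL)
                redisAsyncFree(ctx);
            ctx = NULL;                        /* keep the flag set, retry next tick */
            return;
        }
        redisAsyncSetConnectCallback(ctx, connectCallback);
        redisAsyncSetDisconnectCallback(ctx, disconnectCallback);
        /* attach ctx to the event loop here, e.g. redisAeAttach(loop, ctx) */
        need_reconnect = 0;
    }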

Cleanest way to stop a process on Win32?

While implementing an application server and its client-side libraries in C++, I am having trouble finding a clean and reliable way to stop client processes on server shutdown on Windows.
Assuming the server and its clients run under the same user, the requirements are:
the solution should work in the following cases:
clients may each have either a console or a GUI.
user may be unprivileged.
clients may be or become unresponsive (infinite loop, deadlock).
clients may or may not be children of the server (direct or indirect).
unless prevented by a client-side defect, clients shall be given the opportunity to exit cleanly (free their resources, sync some data to disk...) and some reasonable time to do so.
all client return codes shall be made available (if possible) to the server during the shutdown procedure.
server shall wait until all clients are gone.
As of this edit, the majority of the answers below advocate the use of shared memory (or another IPC mechanism) between the server and its clients to convey shutdown orders and client status. These solutions would work, but they require that clients successfully initialize the library.
What I did not say, is that the server is also used to start the clients and in some cases other programs/scripts which don't use the client library at all. A solution that did not rely on a graceful communication between server and clients would be nicer (if possible).
Some time ago, I stumbled upon a C snippet (in the MSDN I believe) that did the following:
start a thread via CreateRemoteThread in the process to shut down.
have that thread directly call ExitProcess.
Unfortunately, now that I'm looking for it, I'm unable to find it, and the search results seem to imply that this trick no longer works on Vista. Any expert input on this?
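For reference, the trick being asked about looks roughly like this; this is not a recommendation, since the target gets no chance to clean up, it needs fairly broad process rights, and as noted it may no longer work on recent Windows versions:

    #include <windows.h>

    /* Force-exit another process by making it call ExitProcess itself.
     * Relies on kernel32.dll being mapped at the same address in both
     * processes. */
    static BOOL remote_exit(DWORD pid, UINT exit_code)
    {
        HANDLE proc = OpenProcess(PROCESS_CREATE_THREAD | PROCESS_QUERY_INFORMATION |
                                  PROCESS_VM_OPERATION | PROCESS_VM_WRITE | PROCESS_VM_READ,
                                  FALSE, pid);
        if (proc == NULL)
            return FALSE;

        LPTHREAD_START_ROUTINE exit_fn = (LPTHREAD_START_ROUTINE)
            GetProcAddress(GetModuleHandleW(L"kernel32.dll"), "ExitProcess");
        if (exit_fn == NULL) {
            CloseHandle(proc);
            return FALSE;
        }

        HANDLE thread = CreateRemoteThread(proc, NULL, 0, exit_fn,
                                           (LPVOID)(UINT_PTR)exit_code, 0, NULL);
        if (thread != NULL)
            CloseHandle(thread);
        CloseHandle(proc);
        return thread != NULL;
    }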
If you use a thread, a simple solution is to use a named system event: the thread sleeps on the event waiting for it to be signaled, and the controlling application signals the event when it wants the client applications to quit.
For a UI application, the thread can post a message to the main window (WM_CLOSE or WM_QUIT, I forget which); in a console application it can issue a Ctrl-C, or if the main console code loops it can check some exit condition set by the thread.
Either way, rather than finding the client applications and telling them to quit, use the OS to signal that they should quit. The sleeping thread will have virtually no CPU footprint provided it uses WaitForSingleObject to sleep on the event.
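A sketch of that arrangement, with invented names for the event and the main-window handle:

    #include <windows.h>

    #define QUIT_EVENT_NAME L"Local\\MyServerQuitEvent"   /* shared "secret" name */

    static HWND g_hMainWnd;        /* hypothetical: set to the main window once created */

    /* client side: start this once at startup, e.g. with
     * CreateThread(NULL, 0, shutdown_watcher, NULL, 0, NULL) */
    static DWORD WINAPI shutdown_watcher(LPVOID param)
    {
        (void)param;
        /* manual-reset event: once signaled, every waiting client wakes up */
        HANDLE quit = CreateEventW(NULL, TRUE, FALSE, QUIT_EVENT_NAME);
        if (quit == NULL)
            return 1;
        WaitForSingleObject(quit, INFINITE);       /* sleeps; essentially no CPU cost */
        PostMessageW(g_hMainWnd, WM_CLOSE, 0, 0);  /* GUI client: ask the main window
                                                      to close (a console client would
                                                      set an exit flag instead) */
        CloseHandle(quit);
        return 0;
    }

    /* server side: ask every client to quit */
    void request_clients_shutdown(void)
    {
        HANDLE quit = CreateEventW(NULL, TRUE, FALSE, QUIT_EVENT_NAME);
        if (quit != NULL) {
            SetEvent(quit);
            CloseHandle(quit);
        }
    }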
You want some sort of IPC between clients and servers. If all clients were children, I think pipes would have been easiest; since they're not, I guess a server-operated shared-memory segment can be used to register clients, issue the shutdown command, and collect return codes posted there by clients successfully shutting down.
In this shared-memory area, clients put their process IDs, so that the server can forcefully kill any unresponsive clients (modulo server privileges), using TerminateProcess().
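A very small sketch of such a registry; the mapping name, slot count and layout are invented, and synchronization between the server and the clients is omitted:

    #include <windows.h>

    #define MAX_CLIENTS 64

    struct client_slot {
        DWORD pid;                  /* 0 = slot free; cleared by the client on clean exit */
        DWORD exit_code;            /* posted by the client before it clears pid */
        BOOL  shutdown_requested;   /* set by the server to ask the client to exit */
    };

    /* both sides map the same named section */
    static struct client_slot *open_registry(void)
    {
        HANDLE map = CreateFileMappingW(INVALID_HANDLE_VALUE, NULL, PAGE_READWRITE,
                                        0, sizeof(struct client_slot) * MAX_CLIENTS,
                                        L"Local\\MyServerClientRegistry");
        if (map == NULL)
            return NULL;
        return (struct client_slot *)MapViewOfFile(map, FILE_MAP_ALL_ACCESS, 0, 0, 0);
    }

    /* server side: after the grace period, kill anything still registered */
    static void kill_stragglers(struct client_slot *reg)
    {
        for (int i = 0; i < MAX_CLIENTS; i++) {
            if (reg[i].pid != 0) {
                HANDLE p = OpenProcess(PROCESS_TERMINATE, FALSE, reg[i].pid);
                if (p != NULL) {
                    TerminateProcess(p, 1);
                    CloseHandle(p);
                }
            }
        }
    }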
If you are willing to go the IPC route, make the normal communication between client and server bi-directional to let the server ask the clients to shut down. Or, failing that, have the clients poll. Or, as a last resort, the clients can be instructed to exit when they make a request to the server. You can let the library user register an exit callback, but the best way I know of is to simply call "exit" in the client library when the client is told to shut down. If the client gets stuck in shutdown code, the server needs to be able to work around it by ignoring that client's data structures and connection.
Use PostMessage or a named event.
Re: PostMessage -- applications other than GUIs, as well as threads other than the GUI thread, can have message loops and it's very useful for stuff like this. (In fact COM uses message loops under the hood.) I've done it before with ATL but am a little rusty with that.
If you want to be robust to malicious attacks from "bad" processes, include a private key shared by client/server as one of the parameters in the message.
The named event approach is probably simpler; use CreateEvent with a name that is a secret shared by the client/server, and have the appropriate app check the status of the event (e.g. WaitForSingleObject with a timeout of 0) within its main loop to determine whether to shut down.
That's a very general question, and there are some inconsistencies.
While it is not a 100% rule, most console applications run to completion, whereas GUI applications run until the user terminates them (and services run until stopped via the SCM). Hence, it's easier to request a GUI application to close: you send it the equivalent of Alt-F4. For a console program, you have to send it the equivalent of Ctrl-C and hope it handles it. In both cases, you simply wait. If the process sticks around, you then shoot it down (TerminateProcess) and pray that the damage is limited. But your HDD can fill up with temporary files.
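A rough outline of that sequence, assuming the server already knows the client's window handle, process handle and process-group id (for the console case, the client must have been started with CREATE_NEW_PROCESS_GROUP and share a console with the caller):

    #include <windows.h>

    /* Ask nicely, wait a bit, then shoot.  The 5-second grace period is arbitrary. */
    void stop_client(HWND hwnd, HANDLE process, DWORD process_group, BOOL is_console)
    {
        if (is_console)
            GenerateConsoleCtrlEvent(CTRL_BREAK_EVENT, process_group);
        else
            PostMessageW(hwnd, WM_CLOSE, 0, 0);     /* the Alt-F4 equivalent */

        if (WaitForSingleObject(process, 5000) == WAIT_TIMEOUT)
            TerminateProcess(process, 1);           /* last resort */

        DWORD code = 0;
        GetExitCodeProcess(process, &code);         /* only meaningful if the client
                                                       exited on its own */
    }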
GUI applications in general do not have exit codes - where would they go? And a console process that is forcefully terminated by definition does not exit, so it has no exit code. So, in a server shutdown scenario, don't expect exit codes.
If you've got a debugger attached, you generally can't shut down the process from another application. That would make it impossible for debuggers to debug exit code!
