Reconnecting with hiredis - c

I'm trying to reconnect to the Redis server on disconnect.
I'm using redisAsyncConnect and I've set up a callback on disconnect. In the callback I try to reconnect with the same command I use at the very start of the program to establish the connection, but it's not working. I can't seem to reconnect.
Can anyone help me out with an example?

Managing Redis (re)connections asynchronously is a bit tricky when an event loop is used.
Here is an example implementing a small zset polling daemon connecting to a list of Redis instances, which is resilient to disconnection events. The ae event loop is used (it is the one used by Redis itself).
http://gist.github.com/4149768
Check the following functions:
connectCallback
disconnectCallback
checkConnections
reconnectIfNeeded
The main daemon loop does its work only when the connection is available. Once per second, a second, timer-initiated callback checks whether some connections have to be reestablished. We have found this mechanism quite reliable.
Note: error management is crude in this example for brevity's sake. Real production code should manage errors in a more graceful way.
One tricky point when dealing with multiple asynchronous connections is the fact that there is no user-defined contextual data passed as a parameter of the corresponding callbacks. Cleaning up the data associated with a connection after a disconnection event can be a bit difficult.
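The bookkeeping behind that periodic check can be sketched independently of hiredis. This is only a minimal sketch and not the code from the gist: the names conn_slot, on_connect, on_disconnect, check_connections and always_ok_connect are hypothetical, and the connect_fn hook stands in for whatever wraps redisAsyncConnect and attaches the new context to the event loop. The idea is that the disconnect callback only records the state change, and the timer callback is the one place where a new connection attempt is started.

```c
#include <stddef.h>

/* Connection state tracked by the application itself. */
typedef enum { CONN_DOWN, CONN_CONNECTING, CONN_UP } conn_state;

typedef struct {
    const char *host;
    int port;
    conn_state state;
    int attempts;          /* connection attempts started so far */
} conn_slot;

/* Hypothetical hook: starts an asynchronous connect for the slot and
 * returns 0 if the attempt could be started. Real code would wrap
 * redisAsyncConnect() and the event loop attach call here. */
typedef int (*connect_fn)(conn_slot *slot);

/* Call from the connect callback; ok is non-zero on success. */
void on_connect(conn_slot *slot, int ok) {
    slot->state = ok ? CONN_UP : CONN_DOWN;
}

/* Call from the disconnect callback: only record the event,
 * do not try to reconnect from here. */
void on_disconnect(conn_slot *slot) {
    slot->state = CONN_DOWN;
}

/* Call from a periodic timer (for instance once per second).
 * Starts a connection attempt for every slot that is down and
 * returns the number of attempts started. */
int check_connections(conn_slot *slots, size_t n, connect_fn start_connect) {
    int started = 0;
    for (size_t i = 0; i < n; i++) {
        if (slots[i].state != CONN_DOWN)
            continue;                  /* up, or an attempt is in flight */
        slots[i].attempts++;
        if (start_connect(&slots[i]) == 0) {
            slots[i].state = CONN_CONNECTING;
            started++;
        }
    }
    return started;
}

/* Stand-in connect hook for experiments: pretends every attempt starts. */
int always_ok_connect(conn_slot *slot) {
    (void)slot;
    return 0;
}
```

Keeping one conn_slot per Redis instance also gives the callbacks a place to find the per-connection data that has to be cleaned up after a disconnection.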

Related

Apache Flink Stateful Functions forwarding the same message to N functions

I'm trying to send incoming messages to multiple stateful functions, but I don't fully understand how to do it. For the sake of clarity, let's say one of my stateful functions receives some integers and sends them to a couple of remote functions. These functions add the integers to their state values and save the result as the new state.
When one of these 2 remote functions fails, the other should continue to work the same way.
When the failed function recovers, it should process the messages that it could not process during the failure.
I thought about sending them one after another as below, but I don't think it will work:
context.send(RemoteFuncType1,someID,someInteger);
context.send(RemoteFuncType2,someID,someInteger);
...
How can I do this in a fault-tolerant way?
If possible, how does it work in the background?
The way you are suggesting to do it is the correct way!
StateFun would deliver the messages to the remote functions in a consistent manner. If one of the functions is experiencing a short downtime, StateFun would retry sending the message until:
It would successfully deliver it (with back off)
A maximum timeout for retries would be reached. When the timeout is reached, the whole StateFun job would be rewound to a previously consistent checkpoint.
Since StateFun is managing message delivery and the state of the functions (remote included) it would make sure that a consistent state and message would be delivered to each function.
In your example: the second remote function would receive someInteger with whatever state it had before, once recovered.
To get a deeper understanding of how checkpointing works in Flink and how it enables exactly once processing I’d recommend the following:
https://ci.apache.org/projects/flink/flink-docs-stable/internals/stream_checkpointing.html

Simplest way to awake multiple processes with a single broadcast?

Context: this is a web/SQLite application. One process receives new data over TCP and feeds it to a SQLite database. Other processes (the number is variable) are launched as required as clients connect and request updates over HTML5's server-sent events interface (this might change to WebSocket in the future).
The idea is to force the client apps to block, and to find a way for the server to create a notification that will wake up all waiting clients.
Note that the clients aren't fork'ed from the server.
I'm hoping for a solution that:
doesn't require clients to register themselves to the server
allows the server to broadcast even if no client is listening - and doesn't create a huge pile of unprocessed notifications
allows clients to detect that server isn't present
allows clients to define a custom timeout (maximum wait time for an event)
Solutions checked:
sqlite3_update_hook() - only works within a single process (damned, that would have been sleek)
signals: I still have nightmares about the last time I used signals. Maybe signalfd would be better (server creates a folder, clients create unique files, and server notifies all files in that folder)
inotify - didn't read enough on this one
semaphores / locks / shared memory - can't think of a non-hacked way to use these. The server could update a shared memory area with the row ID of the line just inserted in the DB, but then what?
I'm sure I'm missing something obvious - but what? At this time, polling seems to be the best option!
Thanks.
Just as a suggestion, can you try message queues? Multiple clients can connect to the same queue and receive one broadcast message, and each client can have its own message queue if it requires communication with the server.
Message queues are implemented by the Linux kernel and they are very reliable. I personally use message queues to pass messages from several clients to a central routing daemon, the clients being responsible for processing and returning the altered data.

How can I subscribe to a channel and then do something else without blocking?

I am using redis pub/sub to do some real-time processing.
On the subscriber end, I want to subscribe to a specified channel, then do some other computations. I am under the impression that if I send a subscribe command to the server, it will block the code.
So how can I do something else, and when the subscribe message arrives, I process that via a callback handler?
You need two different connections to do that. This was a design choice because when you SUBSCRIBE / PSUBSCRIBE, the connection semantics actually change from request-response to push-style, so the connection is not suitable for running commands without implementing more complex semantics like those of, for example, the IMAP protocol.
The first point is to dedicate a Redis connection to the subscriptions. Once SUBSCRIBE or PSUBSCRIBE has been applied on a connection, only subscription-related commands can be issued on it. So in a C program, you need at least one connection for your subscription(s), and one connection to do anything else with Redis.
Then you need to also find a way to handle those two connections from the C program. Several solutions are possible. For instance:
use multi-threading, and dedicate a thread to the connection responsible for subscriptions. On reception of a new event, the dedicated thread should post a message to your main application thread, which will activate a callback.
use the non-blocking, asynchronous API. Hiredis comes with event loop adapters. You need an event loop to handle the connections to Redis (including the one dedicated to subscription). Upon reception of a publication event, the associated callback will be directly triggered by the event loop. Here is an example of subscription with hiredis and libevent.

Problem supporting keep-alive sockets on a home-grown http server

I am currently experimenting with building an http server. The server is multi-threaded, with one listening thread using select(...) and four worker threads managed by a thread pool. I'm currently managing around 14k-16k requests per second with a document length of 70 bytes and a response time of 6-10ms, on a Core i3 330M. But this is without keep-alive, and any sockets I serve I immediately close when the work is done.
EDIT: The worker threads process 'jobs' that have been dispatched when activity on a socket is detected, i.e. service requests. After a 'job' is completed, if there are no more 'jobs', we sleep until more 'jobs' get dispatched, or if there already are some available, we start processing one of these.
My problems started when I began to try to implement keep-alive support. With keep-alive activated I only manage 1.5k-2.2k requests per second with 100 open sockets. This number grows to around 12k with 1000 open sockets. In both cases the response time is somewhere around 60-90ms. I feel that this is quite odd, since my current assumptions say that requests per second should go up, not down, and response time should hopefully go down, but definitely not up.
I've tried several different strategies for fixing the low performance:
1. Call select(...)/pselect(...) with a timeout value so that we can rebuild our FD_SET structure and listen to any additional sockets that arrived after we blocked, and service any detected socket activity.
(aside from the low performance, there's also the problem of sockets being closed while we're blocking, resulting in select(...)/pselect(...) reporting bad file descriptor.)
2. Have one listening thread that only accept new connections and one keep-alive thread that is notified via a pipe of any new sockets that arrived after we blocked and any new socket activity, and rebuild the FD_SET.
(same additional problem here as in '1.').
3. select(...)/pselect(...) with a timeout, when new work is to be done, detach the linked-list entry for the socket that has activity, and add it back when the request has been serviced. Rebuilding the FD_SET will hopefully be faster. This way we also avoid trying to listen to any bad file descriptors.
4. Combined (2.) and (3.).
-. Probably a few more, but they escape me atm.
The keep-alive sockets are stored in a simple linked list whose add/remove methods are surrounded by a pthread_mutex lock; the function responsible for rebuilding the FD_SET also takes this lock.
I suspect that it's the constant locking/unlocking of the mutex that is the main culprit here. I've tried to profile the problem, but neither gprof nor google-perftools have been very cooperative, either introducing extreme instability or plain refusing to gather any data at all (this could be me not knowing how to use the tools properly, though). But removing the locks risks putting the linked list in a non-sane state and would probably crash the program or put it into an infinite loop.
I've also suspected the select(...)/pselect(...) timeout when I've used it, but I'm pretty confident that this was not the problem since the low performance is maintained even without it.
I'm at a loss as to how I should handle keep-alive sockets, and I'm therefore wondering if you people out there have any suggestions on how to fix the low performance, or suggestions on any alternate methods I can use to support keep-alive sockets.
If you need any more information to be able to answer my question properly, don't hesitate to ask for it and I shall try my best to provide you with the necessary information and update the question with this new information.
Try to get rid of select completely. You can find some kind of event notification on every popular platform: kqueue/kevent on FreeBSD, epoll on Linux, etc. This way you do not need to rebuild the FD_SET and can add/remove watched fds at any time.
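As a rough illustration of the epoll variant (Linux only), the sketch below registers and unregisters descriptors one at a time instead of rebuilding a set before every wait. The helper names watch_fd, unwatch_fd and wait_one_ready are made up for this example; only epoll_ctl and epoll_wait are the real API, and the interest list lives in the epoll instance created with epoll_create1.

```c
#include <stddef.h>
#include <sys/epoll.h>

/* Start watching fd for readability on the epoll instance epfd.
 * Returns 0 on success, -1 on error (errno is set by epoll_ctl). */
int watch_fd(int epfd, int fd) {
    struct epoll_event ev = {0};
    ev.events = EPOLLIN;        /* level-triggered read readiness */
    ev.data.fd = fd;            /* handed back to us by epoll_wait */
    return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

/* Stop watching fd, for instance while a worker services the request
 * or just before the socket is closed. */
int unwatch_fd(int epfd, int fd) {
    return epoll_ctl(epfd, EPOLL_CTL_DEL, fd, NULL);
}

/* Wait up to timeout_ms milliseconds for one ready descriptor.
 * Returns the ready fd, or -1 on timeout or error. */
int wait_one_ready(int epfd, int timeout_ms) {
    struct epoll_event ev;
    int n = epoll_wait(epfd, &ev, 1, timeout_ms);
    return n == 1 ? ev.data.fd : -1;
}
```

Because the kernel keeps the interest list, another thread can add a keep-alive socket while the listening thread is blocked in epoll_wait, which removes the need for a timeout just to pick up new sockets.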
The time increase will be more visible when the client uses your socket for more than one request. If you are merely opening and closing yet still telling the client to keep alive, then you have the same scenario as you did without keepalive. But now you have the overhead of the sockets sticking around.
If however you are using the sockets multiple times from the same client for multiple requests then you will lose the TCP connection overhead and gain performance that way.
Make sure your client is using keepalive properly. You will likely also want a better way to get notified of socket state and data, perhaps a poll device or queuing the requests.
http://www.techrepublic.com/article/using-the-select-and-poll-methods/1044098
This page has a patch for Linux to handle a poll device. Perhaps with some understanding of how it works you can use the same technique in your application rather than rely on a device that may not be installed.
There are many alternatives:
Use processes instead of threads, and pass file descriptors via Unix sockets.
Maintain per-thread lists of sockets. You could even accept() directly on the worker threads.
etc...
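For the first alternative, descriptors are usually handed over as SCM_RIGHTS ancillary data on a Unix domain socket. The following is a minimal sketch under that assumption; the names fd_control, send_fd and recv_fd are invented for the example, and error handling is reduced to returning -1. In a real server the two socket ends would live in different processes, for instance an acceptor and a worker.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Control buffer big enough for one descriptor, aligned for cmsghdr. */
union fd_control {
    char buf[CMSG_SPACE(sizeof(int))];
    struct cmsghdr align;
};

/* Send descriptor fd over the connected Unix socket sock.
 * Returns 0 on success, -1 on error. */
int send_fd(int sock, int fd) {
    char marker = 'F';          /* one byte of ordinary data goes along */
    struct iovec iov = { .iov_base = &marker, .iov_len = 1 };
    union fd_control ctl;
    struct msghdr msg;

    memset(&ctl, 0, sizeof ctl);
    memset(&msg, 0, sizeof msg);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof ctl.buf;

    struct cmsghdr *cm = CMSG_FIRSTHDR(&msg);
    cm->cmsg_level = SOL_SOCKET;
    cm->cmsg_type = SCM_RIGHTS;             /* payload is a descriptor */
    cm->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cm), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one descriptor from sock.
 * Returns the new descriptor, or -1 on error. */
int recv_fd(int sock) {
    char marker;
    struct iovec iov = { .iov_base = &marker, .iov_len = 1 };
    union fd_control ctl;
    struct msghdr msg;
    int fd;

    memset(&ctl, 0, sizeof ctl);
    memset(&msg, 0, sizeof msg);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof ctl.buf;

    if (recvmsg(sock, &msg, 0) != 1)
        return -1;

    struct cmsghdr *cm = CMSG_FIRSTHDR(&msg);
    if (cm == NULL || cm->cmsg_level != SOL_SOCKET ||
        cm->cmsg_type != SCM_RIGHTS)
        return -1;                          /* no descriptor attached */
    memcpy(&fd, CMSG_DATA(cm), sizeof fd);
    return fd;
}
```

The descriptor number returned by recv_fd generally differs from the one that was sent, but it refers to the same open socket, so the receiving process can read and write on it directly.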
Are your test clients reusing the sockets? Are they correctly handling keep alive?
I could see the case where you make the minimum possible change to your benchmarking code by just passing the keep-alive header, without changing the code that closes the socket at the client end once the payload is received.
This would incur all the costs of keep-alive with none of the benefits.
What you are trying to do has been done before. Consider reading about the Leader-Follower network server pattern, http://www.kircher-schwanninger.de/michael/publications/lf.pdf

Cleanest way to stop a process on Win32?

While implementing an applicative server and its client-side libraries in C++, I am having trouble finding a clean and reliable way to stop client processes on server shutdown on Windows.
Assuming the server and its clients run under the same user, the requirements are:
the solution should work in the following cases:
clients may each feature either a console or a gui.
user may be unprivileged.
clients may be or become unresponsive (infinite loop, deadlock).
clients may or may not be children of the server (direct or indirect).
unless prevented by a client-side defect, clients shall be allowed the opportunity to exit cleanly (free their resources, sync some data to disk...) and some reasonable time to do so.
all client return codes shall be made available (if possible) to the server during the shutdown procedure.
server shall wait until all clients are gone.
As of this edit, the majority of the answers below advocate the use of a shared memory (or another IPC mechanism) between the server and its clients to convey shutdown orders and client status. These solutions would work, but require that clients successfully initialize the library.
What I did not say, is that the server is also used to start the clients and in some cases other programs/scripts which don't use the client library at all. A solution that did not rely on a graceful communication between server and clients would be nicer (if possible).
Some time ago, I stumbled upon a C snippet (in the MSDN I believe) that did the following:
start a thread via CreateRemoteThread in the process to shutdown.
had that thread directly call ExitProcess.
Unfortunately, now that I'm looking for it, I'm unable to find it, and the search results seem to imply that this trick does not work anymore on Vista. Any expert input on this?
If you use threads, a simple solution is to use a named system event: the thread sleeps on the event, waiting for it to be signaled, and the control application can signal the event when it wants the client applications to quit.
For a UI application, it (the thread) can post a message to the main window (WM_CLOSE or WM_QUIT, I forget which). In a console application it can issue a CTRL-C, or, if the main console code loops, that loop can check some exit condition set by the thread.
Either way, rather than finding the client applications and telling them to quit, use the OS to signal that they should quit. The sleeping thread will use virtually no CPU, provided it uses WaitForSingleObject to sleep.
You want some sort of IPC between clients and servers. If all clients were children, I think pipes would have been easiest; since they're not, I guess a server-operated shared-memory segment can be used to register clients, issue the shutdown command, and collect return codes posted there by clients successfully shutting down.
In this shared-memory area, clients put their process IDs, so that the server can forcefully kill any unresponsive clients (modulo server privileges), using TerminateProcess().
If you are willing to go the IPC route, make the normal communication between client and server bi-directional to let the server ask the clients to shut down. Or, failing that, have the clients poll. Or, as the last resort, the clients should be instructed to exit when they make a request to the server. You can let the library user register an exit callback, but the best way I know of is to simply call "exit" in the client library when the client is told to shut down. If the client gets stuck in shutdown code, the server needs to be able to work around it by ignoring that client's data structures and connection.
Use PostMessage or a named event.
Re: PostMessage -- applications other than GUIs, as well as threads other than the GUI thread, can have message loops and it's very useful for stuff like this. (In fact COM uses message loops under the hood.) I've done it before with ATL but am a little rusty with that.
If you want to be robust to malicious attacks from "bad" processes, include a private key shared by client/server as one of the parameters in the message.
The named event approach is probably simpler; use CreateEvent with a name that is a secret shared by the client/server, and have the appropriate app check the status of the event (e.g. WaitForSingleObject with a timeout of 0) within its main loop to determine whether to shut down.
That's a very general question, and there are some inconsistencies.
While it is not a 100% rule, most console applications run to completion, whereas GUI applications run until the user terminates them (and services run until stopped via the SCM). Hence, it's easier to request a GUI to close: you send it the equivalent of Alt-F4. But for a console program, you have to send it the equivalent of Ctrl-C and hope it handles it. In both cases, you simply wait. If the process sticks around, you then shoot it down (TerminateProcess) and pray that the damage is limited. But your HDD can fill up with temporary files.
GUI applications in general do not have exit codes - where would they go? And a console process that is forcefully terminated by definition does not exit, so it has no exit code. So, in a server shutdown scenario, don't expect exit codes.
If you've got a debugger attached, you generally can't shut down the process from another application. That would make it impossible for debuggers to debug exit code!

Resources