I am struggling to work out how to use zmq to implement the architecture I need. I have a classic publish/subscribe situation, except that once client x has subscribed to a topic, I need the topic data destined for it to be cached if the client dies and resent on reconnect. The data order is important and I can't miss messages should the client be offline for a while.
The PUB/SUB pattern doesn't seem to know about individual clients and will just stop sending to client x if it dies. Plus I can't find out that this has happened so I can cache the messages, or know when it reconnects.
To try to get around this I used the REQ/REP pattern so the clients can announce themselves and have some persistence but this is not ideal for a couple of reasons:
1) The clients must constantly ask "got any data for me?" which offends my sensibilities
2) What happens if there's no data to send to client x but there is for client y? Without zmq I'd have had a thread per client and simply blocked the one with no data, but I can't block client x without also blocking client y in a single thread.
Am I trying to shove a round peg in a square hole, here? Is there some way I can get feedback from PUB saying 'failed to send to client x', so I can cache the messages instead? Or is there some other pattern I should be using?
Otherwise it's back to low level tcp for me...
Many thanks,
Jeremy
This is an area of active research.
I'm currently working on something similar. Our solution is to have a TCP "back channel" on which to receive missed data and have the subscribers know what the last successfully received publication was so that when they reconnect, they can ask for publications since that one.
In some sense you are trying to shove a round peg in a square hole. You have chosen the tool - PUB/SUB - and are trying to solve a problem it is not designed to solve, at least not without some additional design.
The PUB/SUB is an unreliable broadcast. The client can miss messages for several reasons:
Subscribers join late, so they miss messages the server already sent.
Subscribers can fetch messages too slowly, so queues build up and then overflow.
Subscribers can drop off and lose messages while they are away.
Subscribers can crash and restart, and lose whatever data they already received.
etc...
For REQ/REP the client does not have to constantly ask "got any data for me?"; instead the client should acknowledge each message, so that the server knows which data to send next. If the server has nothing to send, it simply stays quiet.
e.g.
client server
Hello ---------------->
(wait until something exist to send)
<-------------------- Msg 1
Ack 1 ---------------->
(wait ...)
<-------------------- Msg 2
...
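A minimal pyzmq sketch of that exchange, assuming an illustrative endpoint (tcp://localhost:5555) and a two-frame [msg_id, payload] message - neither of which comes from the original post:

```python
# Client: announce, then loop receiving a message and acknowledging it.
import zmq

ctx = zmq.Context()
sock = ctx.socket(zmq.REQ)
sock.connect("tcp://localhost:5555")

sock.send_string("HELLO")                      # announce ourselves
while True:
    msg_id, payload = sock.recv_multipart()    # blocks until the server replies with data
    print("got", msg_id, payload)              # process the message here
    sock.send_multipart([b"ACK", msg_id])      # the ack doubles as the next request
```

```python
# Server: reply to each HELLO/ACK with the next queued message, blocking while
# there is nothing to send.
import queue
import zmq

ctx = zmq.Context()
sock = ctx.socket(zmq.REP)
sock.bind("tcp://*:5555")

outbox = queue.Queue()             # filled elsewhere with (msg_id, payload) byte pairs

while True:
    request = sock.recv_multipart()    # [b"HELLO"] or [b"ACK", msg_id]
    # on an ACK you could mark msg_id as delivered in persistent storage here
    msg_id, payload = outbox.get()     # block until there is something to send
    sock.send_multipart([msg_id, payload])
```

Note that a plain REP socket serves one request at a time, so blocking for client x would also stall client y; with several clients you would normally use a ROUTER socket and a per-client queue, which is exactly the kind of pattern the guide mentioned below walks through.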
There are several good ways to do what you want with zmq. First of all you should try to design your protocol. What should happen when I connect? Should I get any old messages then? If so, how old? If I miss a message while I am connected, should I be able to get it? If the client restarts, should I then get any old messages?
I strongly recommend the very good zmq guide http://zguide.zeromq.org/page:all which has a lot of very good information regarding different ways to get reliability into a protocol. Read the complete guide, including chapters 4 and 5, which discuss different techniques for getting a reliable transport. Based on your problem description, Chapter 5 seems like a good start. Try out some of the examples. Then design your protocol.
What about adding an Archiver process? Part of a client's subscription process would be to also notify the Archiver to start archiving the same subscription(s). The Archiver would keep all the messages received in an ordered list.
The Clients would record the time or id of the last published message they received. When they started after a crash, they would first contact the Archiver and say "Give me all messages since X". And they would resubscribe with the Publisher. When a client receives the same message from both the Publisher and the Archiver, it tells the Archiver to stop replaying.
The Archiver could purge messages older than the maximum expected downtime for an offline client. Alternatively, clients could periodically check in to say "I am up to date with message Y", allowing purging of all older items.
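A rough pyzmq sketch of the Archiver side, assuming illustrative ports (publisher on 5556, replay channel on 5557), a two-frame [msg_id, payload] message, and a "SINCE <id>" replay request - all of which are my own placeholders, not part of the answer above:

```python
# Archiver sketch: subscribe alongside the real clients, keep an ordered list,
# and replay everything newer than the id a recovering client reports.
import zmq

ctx = zmq.Context()

sub = ctx.socket(zmq.SUB)
sub.connect("tcp://localhost:5556")            # the normal publisher
sub.setsockopt_string(zmq.SUBSCRIBE, "")       # archive whatever it is asked to follow

rep = ctx.socket(zmq.REP)
rep.bind("tcp://*:5557")                       # replay channel for recovering clients

archive = []                                   # ordered list of (msg_id, payload)

poller = zmq.Poller()
poller.register(sub, zmq.POLLIN)
poller.register(rep, zmq.POLLIN)

while True:
    events = dict(poller.poll())
    if sub in events:
        msg_id, payload = sub.recv_multipart()
        archive.append((int(msg_id), payload))
    if rep in events:
        last_seen = int(rep.recv_string().split()[1])      # e.g. "SINCE 42"
        missed = [p for i, p in archive if i > last_seen]
        rep.send_pyobj(missed)    # client resubscribes first, then drains this list
```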
I have written code (mosquitto_publish()) using Mosquitto to publish data to AWS.
My problem is the sequence in which data arrives at the MQTT broker. In the Paho client I see waitForCompletion(), but nothing similar in Mosquitto. Would anyone please help me deal with this problem?
Based on the mosquitto_publish documentation, the function returns when sending has been "successful". MQTT does not guarantee the order in which messages arrive, so you should arguably watch for the arrival rather than the sending, and avoid having two messages race each other to the broker. With QoS 0, the client never knows if a message arrived; that requires QoS 1 or 2, for which additional communications are exchanged. Raise the quality of service, and you can use mosquitto_max_inflight_messages_set(mosq, 1) so that the client queues any additional messages until it receives confirmation from the server. This may be even more efficient than "waiting" for completion, since non-MQTT operations can continue. The queue might pile up if you send bursts of many messages.
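The question is about the C client, but the same idea can be sketched with the Paho Python client (broker host and topic are placeholders, and the 1.x-style constructor is assumed): raise the QoS so every publish is acknowledged, and cap the in-flight window at one so queued messages go out strictly in order.

```python
# QoS 1 gives a per-message acknowledgement from the broker; an in-flight window
# of 1 makes the client send queued messages one at a time, preserving order.
import paho.mqtt.client as mqtt

client = mqtt.Client()                 # paho-mqtt 1.x style constructor
client.max_inflight_messages_set(1)    # analogue of mosquitto_max_inflight_messages_set(mosq, 1)
client.connect("broker.example.com", 1883)
client.loop_start()                    # network loop runs in a background thread

for i in range(100):
    info = client.publish("sensors/data", payload=f"reading {i}", qos=1)
    # publish() only queues the message; the loop thread sends it once the
    # previous one has been acknowledged by the broker.

info.wait_for_publish()                # with a window of 1, the last ack implies all earlier ones
client.loop_stop()
client.disconnect()
```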
The more complex alternative is to send messages unrestricted, but include an index with each, so that the subscriber can sort them upon receipt (for which it would need its own queue and delay). Not recommended if this burden is going to fall on multiple subscribers.
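If you do go that route, the subscriber-side reordering is only a few lines; a library-agnostic sketch, assuming the publisher attaches an index that starts at 0 and increases by 1 per message:

```python
# Buffer out-of-order messages and deliver them strictly by index.
import heapq

next_expected = 0
pending = []                           # min-heap of (index, payload)

def on_indexed_message(index, payload):
    """Call this from the MQTT on-message callback with the parsed index."""
    global next_expected
    heapq.heappush(pending, (index, payload))
    while pending and pending[0][0] == next_expected:
        _, p = heapq.heappop(pending)
        deliver(p)                     # hand the message to the application in order
        next_expected += 1

def deliver(payload):
    print("in-order:", payload)        # stand-in for the real handler
```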
We're sending messages to Apache Camel using RabbitMQ.
We have a "sender" and a Camel route that processes a RabbitMQ message sent by the sender.
We're having deployment issues regarding which end of the system comes up first.
Our system is low-volume. I am sending perhaps 100 messages at a time. The point of the message is to reduce 'temporal cohesion' between a thing happening in our primary database, and logging of same to a different database. We don't want our front-end to have to wait.
The "sender" will create an exchange if it does not exist.
This startup-order dependency is what is causing our deployment problems.
Here's what I see:
If I down the sender, down Camel, delete the exchange (clean slate), start the sender, then start Camel, and send 100 messages, the system works. (I think because the sender has to be run manually for testing, the Exchange is being created by the Camel Route...)
If I clean slate, and send a message, and then up Camel afterwards, I can see the messages land in RabbitMQ (using the web tool). No queues are bound. Once I start Camel, I can see its bound queue attached to the Exchange. But the messages have been lost to time and fate; they have apparently been dropped.
If, from the current state, I send more messages, they flow properly.
I think that if the messages that got dropped were persisted, I'd be ok. What am I missing?
For me it's hard to say what exactly is wrong, but I'll try and provide some pointers.
You should set up all exchanges and queues to be durable, and the messages persistent. You should never delete any of these entities (unless they are empty and you no longer use them); think of them as tables in a database. They are your infrastructure of sorts, and as with a database, you wouldn't want the first DB client that connects to create the tables it needs (this of course applies to your use case, at least that's how it seems to me).
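With pika (the Python RabbitMQ client) that setup looks roughly like the sketch below; the exchange, queue and routing-key names are placeholders I made up. Declaring the queue and binding up front, as part of your "infrastructure", also means messages published before the consumer starts have somewhere to be routed, instead of being dropped by an exchange with no bindings.

```python
# Durable exchange + durable queue + persistent messages. Declare the same
# entities, with the same flags, on both sender and consumer.
import pika

conn = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
ch = conn.channel()

ch.exchange_declare(exchange="logging", exchange_type="direct", durable=True)
ch.queue_declare(queue="logging.audit", durable=True)
ch.queue_bind(queue="logging.audit", exchange="logging", routing_key="audit")

ch.basic_publish(
    exchange="logging",
    routing_key="audit",
    body=b"something happened",
    properties=pika.BasicProperties(delivery_mode=2),   # 2 = persistent message
)
conn.close()
```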
In the comments I mentioned flow state of the queue, but with 100 messages this will probably never happen.
Regarding message delivery - persistent or not, the broker (server) keeps messages until they are consumed with an acknowledgment sent back by the consumer (in lots of APIs this is done automatically, but it's actually one of the most important concepts).
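The consumer side of that, again sketched with pika and the placeholder queue from the earlier snippet:

```python
# Explicit acknowledgements: the broker only forgets a message once the consumer
# confirms it has been processed.
import pika

conn = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
ch = conn.channel()

def handle(channel, method, properties, body):
    print("processing:", body)                             # do the real work here
    channel.basic_ack(delivery_tag=method.delivery_tag)    # until this, the broker keeps the message

ch.basic_consume(queue="logging.audit", on_message_callback=handle, auto_ack=False)
ch.start_consuming()
```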
If the exchange to which the messages were published is deleted, they are gone. If the server gets killed or restarted and the messages are not persisted - again, they're gone. There may well be more scenarios in which messages get dropped (if I think of some I'll edit the answer).
If you don't have control over creating (usually "declaring" in the APIs) exchanges and queues, then (aside from the fact that it's not the best thing, IMHO) it can be tricky, since declaring those entities is only idempotent when the parameters match what already exists: you can't declare a durable queue q1 if a non-durable queue with the same name already exists. This could also be a problem in your case, since you mention the question of which part of the system comes up first - maybe something is not declared with the same parameters on both sides.
I am implementing a callback in Java to store messages in a database. I have a client subscribing to '#'. The problem is that when this '#' client disconnects and reconnects, it adds duplicate entries of retained messages to the database. If I check for previous entries first, bigger tables will be expensive in computing power. So should I allot a separate table per sensor, or per broker? I would really appreciate it if you could suggest better designs.
Subscribing to wildcard with a single client is definitely an anti-pattern. The reasons for that are:
Wildcard subscribers get all messages of the MQTT broker. Most client libraries can't handle that load, especially not when transforming / persisting messages.
If your wildcard subscriber dies, you will lose messages (unless the broker queues endlessly for you, which also doesn't work)
You essentially have a single point of failure in your system. Use MQTT brokers which are hardened for production use; these are much more robust single points of failure than your hand-written clients. (You can overcome the SPOF through clustering and load balancing, though.)
So to solve the problem, I suggest the following:
Use a broker which can handle shared subscriptions (like HiveMQ or MessageSight), so you can balance all messages between many clients
Use a custom plugin for doing the persistence at the broker instead of the client.
You can also read more about that topic here: http://www.hivemq.com/blog/mqtt-sql-database
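To illustrate the shared-subscription suggestion: with a broker that supports it (it is standardised in MQTT 5), several persisting clients subscribe under the same group name and the broker load-balances messages between them. A rough sketch with the Paho Python client; the broker host, group name and persistence stub are my own placeholders:

```python
# Run several copies of this client; the broker splits '#' traffic across every
# member of the 'db-writers' shared group instead of flooding one subscriber.
import paho.mqtt.client as mqtt

def on_connect(client, userdata, flags, rc, properties=None):
    client.subscribe("$share/db-writers/#", qos=1)     # 'db-writers' is the group name

def on_message(client, userdata, msg):
    store_in_database(msg.topic, msg.payload)

def store_in_database(topic, payload):
    print("persist:", topic, payload)                  # stand-in for the real DB insert

client = mqtt.Client(protocol=mqtt.MQTTv5)             # paho-mqtt 1.x style constructor
client.on_connect = on_connect
client.on_message = on_message
client.connect("broker.example.com", 1883)
client.loop_forever()
```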
Also consider using QoS 2 for all messages to make sure each one is delivered exactly once. You may also consider time-stamping each message to avoid inserting duplicate messages if the QoS requirement is not met.
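On the duplicate-entry side, a cheap safeguard is to let the database enforce uniqueness on whatever identifies a message (topic plus the embedded timestamp, say) and ignore conflicts. A small sketch with sqlite3, assuming the timestamp-in-payload convention from the answer above:

```python
# A UNIQUE constraint on (topic, ts) makes re-inserting a retained message after a
# reconnect a no-op instead of a duplicate row.
import sqlite3

db = sqlite3.connect("messages.db")
db.execute("""CREATE TABLE IF NOT EXISTS messages (
                  topic TEXT, ts TEXT, payload BLOB,
                  UNIQUE(topic, ts))""")

def store(topic, ts, payload):
    # INSERT OR IGNORE silently skips rows that violate the UNIQUE constraint.
    db.execute("INSERT OR IGNORE INTO messages VALUES (?, ?, ?)", (topic, ts, payload))
    db.commit()
```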
I'm building a network application in RedHat/C with a protocol called SMPP that is being used in telecom to send SMS.
I'm at a point where I send messages (~70 SMS/second) to the server and have to wait a few seconds for a successful response and then delete the messages, but if a message times out I have to resend it to the server.
The question is how to design something to retry the expired messages?
There is an id in the SMPP spec called sequence_number - this should be monotonically incrementing for every request you make and the response coming back from the server will have the sequence_number of the request it is responding to.
If you wait for a bit (maybe 10 seconds, maybe longer) and you don't get your response back you can re-send the request with the same sequence_number and the server should spot it as a duplicate if it did receive it first time; if it did not receive it first time then it will treat it as a new request.
The server may also make requests to your client, e.g. here is a delivery receipt, or here is a mobile-originated message - it will have its own sequence_number counter, and you should acknowledge its requests with responses carrying the same sequence number. You should track the sequence numbers you have seen so you can tell when you hit a duplicate request.
This property is called http://en.wikipedia.org/wiki/Idempotence and is something you should become familiar with if you are implementing telecoms protocols.
In order to get your 70 msgs/sec you will likely need to build on top of idempotence using a sliding window (http://en.wikipedia.org/wiki/Flow_control_(data)#Sliding_Window), so you can have a maximum of N (maybe 10) requests outstanding for which you are still waiting on response acknowledgements - unless you are very close to the SMPP server with very low latency.
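A bare-bones sketch of that bookkeeping - no SMPP library, just the window and retry logic, with the window size, timeout and wire_send() stand-in being illustrative:

```python
# At most WINDOW requests outstanding, each keyed by its sequence_number; anything
# unacknowledged after TIMEOUT is re-sent with the SAME sequence_number so the
# server can recognise it as a duplicate.
import time

WINDOW = 10           # max outstanding submit_sm requests
TIMEOUT = 10.0        # seconds to wait before re-sending

pending = {}          # sequence_number -> (message, time_sent)
next_seq = 1

def send(message):
    """Queue a message, waiting for a free window slot if necessary."""
    global next_seq
    while len(pending) >= WINDOW:
        check_timeouts()
        time.sleep(0.05)
    wire_send(next_seq, message)
    pending[next_seq] = (message, time.time())
    next_seq += 1

def on_response(sequence_number):
    """Call when a response with this sequence_number arrives."""
    pending.pop(sequence_number, None)

def check_timeouts():
    now = time.time()
    for seq, (message, sent_at) in list(pending.items()):
        if now - sent_at > TIMEOUT:
            wire_send(seq, message)            # same sequence_number: idempotent retry
            pending[seq] = (message, now)

def wire_send(seq, message):
    print(f"submit_sm seq={seq}: {message}")   # replace with the actual PDU write
```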
Doing SMPP right is not trivial. I would recommend you read the SMPP v3.4 spec front to back before you get too far into an implementation.
It is not very clear what you are asking for, so the answer will probably also not be very precise.
I would suggest looking at how this is implemented in some existing solutions. I have worked a bit with Kannel and Mbuni (the latter is rather for MMS), and I suggest taking a look at Kannel especially.
Kannel is basically an open source SMS gateway and has working SMPP support.
Take a look also at this Stack Overflow thread, which may help in understanding some of the ideas.
I use PushSharp to send notifications for a few Apps.
PushSharp is great it really simplifies the work with push services, and I wonder what is the right way to work with it?
I haven't found examples/ explanations about that.
Now, when I have a message to send, I...
create a PushSharp object
do a PushService.QueueNotification() for all devices
do a PushService.StopAllServices to send all queued messages
exit the method (and kill the PushService object).
Should I work this way, or keep this PushService object alive and call its methods when needed?
How should I use a PushService object to get the unregistered device ids? with a dedicated instance?
Any suggestion would be appreciated.
This is a question which frequently comes up.
The answer isn't necessarily one way or the other, but it depends on your situation. In most cases it would be absolutely fine to just create a PushBroker instance whenever you need it, since most platforms use HTTP based protocols for sending notifications. In the case of Apple, they state in their documentation that you should keep your connection to APNS open in order to minimize overhead of opening and closing secure connections.
However, in practice I think this means that they don't want you connecting and disconnecting VERY frequently (eg: they don't want you creating a new connection for every message you send). In reality, if you're sending batches of notifications every so often (let's say every 15 minutes or every hour) they probably won't have a problem with you opening a new connection for each batch and then closing it when done.
I've never heard of anyone being blocked from Apple's APNS servers for doing this. In fact in the very early days of working with push notifications, I had a bug that caused a new apns connection to be created for each notification. I sent thousands of notifications a day like this and never heard anything about it from Apple (eventually I identified it as a bug and fixed it of course).
As for collecting feedback, by default the ApplePushService will poll the feedback servers after 10 seconds of starting, and then every 10 minutes thereafter. If you want to disable this from happening you can simply set the ApplePushChannelSettings.FeedbackIntervalMinutes to <= 0. You can then use the FeedbackService class to poll for feedback whenever you need to, manually.