What APIs to use for a program working across multiple computers (distributed)?

I want to try programming something that can do things across multiple endpoints, so that when things occur on one computer, events can occur on others. Obviously the problem here is in sending commands to the other endpoints.
I'm just not sure what program I would use to do this. I'm guessing it would have to use an API based on some kind of client-server model. I expect there are things that people use to do this, but I don't know what they are called.
How would I go about doing this? Are there common APIs which allow people to do this?

There are (at least) two types to distinguish between: RPC APIs and Message Queues (MQ).
An RPC-style API can be imagined as a remotely callable interface; it typically gives you one response per request. Apache Thrift 1) is one of the frameworks designed for this purpose: an easy-to-use, cross-platform, cross-language RPC framework. (And yes, it also supports Erlang, just in case ...). There are a few others around, like Google's Protocol Buffers, Apache Avro, and a few more.
Message queuing systems are more suitable in cases where looser coupling is desired or acceptable. In contrast to an RPC-style framework and API, a message queue decouples request and response a bit more. For example, an MQ system is more suitable for distributing work to multiple handlers, or for distributing one event to multiple recipients via producer/consumer or publish/subscribe patterns. Typical candidates are MSMQ, Apache ActiveMQ, or RabbitMQ.
Although this can be achieved with RPC as well, it is much more complicated and involves more work, as you are operating at a somewhat lower abstraction level. RPC shines when you need the request/response style and value performance higher than the comfort of an MQ.
On top of MQ systems there are more sophisticated service bus systems, for example NServiceBus. A service bus operates at an even higher level of abstraction. These also have their pros and cons, but can be helpful too. In the end, it depends on your use case.
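To make the publish/subscribe contrast concrete, here is a minimal sketch in TypeScript using amqplib, the usual Node client for RabbitMQ. It assumes a broker running on localhost; the exchange name and message are made up for illustration:

    import * as amqp from 'amqplib';

    async function main(): Promise<void> {
      const conn = await amqp.connect('amqp://localhost');
      const ch = await conn.createChannel();

      // A fanout exchange copies every message to all bound queues:
      // one event on this computer, delivered to every subscribed one.
      await ch.assertExchange('events', 'fanout', { durable: false });

      // Subscriber side: a private queue bound to the exchange.
      const { queue } = await ch.assertQueue('', { exclusive: true });
      await ch.bindQueue(queue, 'events', '');
      await ch.consume(queue, msg => {
        if (msg) console.log('received:', msg.content.toString());
      }, { noAck: true });

      // Publisher side: fire-and-forget, no response expected (unlike RPC).
      ch.publish('events', '', Buffer.from('something happened'));
    }

    main().catch(console.error);

Note how the publisher never waits for an answer; with an RPC framework like Thrift, the call itself would return the response.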
1) Disclaimer: I am actively involved in that project.

Without more information, I can only suggest that you look at Erlang. It is probably the easiest language in which to learn distributed systems, since message passing is built into the language, and it makes no difference to the language or to the send itself whether the message goes to a process on the same PC, across the LAN, or over the Internet to a different machine.

Consensus algorithm for Node.js

I'm trying to implement a collaborative canvas on which many people can draw freehand or with specific shape tools.
The server has been developed in Node.js and the client with AngularJS (and I am pretty new to both).
I must use a consensus algorithm so that it always shows the same thing to all users.
I'm seriously struggling with this, since I cannot find a proper tutorial on its use. I have been studying Paxos implementations, but it seems like Raft is much more used in practice.
Any suggestions? I would really appreciate it.
Writing a distributed system is not an easy task [1], so I'd recommend using an existing strongly consistent store instead of implementing one from scratch. The usual suspects are ZooKeeper, Consul, etcd, and Atomix/Copycat. Some of them offer Node.js clients:
https://github.com/alexguan/node-zookeeper-client
https://www.npmjs.com/package/consul
https://github.com/stianeikeland/node-etcd
I've personally never used any of them with Node.js, though, so I won't comment on the maturity of the clients.
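For a feel of what that looks like, here is a rough TypeScript sketch of keeping shared canvas state in etcd through the node-etcd client above. I'm going by its README (etcd v2 API); the key name and helper are invented, and I haven't run this against a live cluster, so treat it as illustrative:

    // Rough sketch only, based on the node-etcd README.
    const Etcd = require('node-etcd');
    const etcd = new Etcd(['127.0.0.1:2379']);

    // Hypothetical helper pushing the current state out to the browsers.
    function broadcastToClients(shapes: unknown): void { /* e.g. socket.io */ }

    // Persist the whole canvas under one key; etcd's Raft log guarantees
    // every server observes the same sequence of writes.
    etcd.set('canvas/shapes', JSON.stringify([{ type: 'rect', x: 0, y: 0 }]));

    // React when another server changes the canvas.
    const watcher = etcd.watcher('canvas/shapes');
    watcher.on('change', (res: { node: { value: string } }) => {
      broadcastToClients(JSON.parse(res.node.value));
    });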
If you insist on implementing consensus on your own, then Raft should be easier to understand; the paper is surprisingly accessible: https://raft.github.io/raft.pdf. There are also some Node.js implementations, but again, I haven't used them, so it is hard to recommend any particular one. The Gaggle README contains an example, and Skiff has an integration test which documents its usage.
Taking a step back, I'm not sure that distributed consensus is what you need here. It seems like you have multiple clients and a single server, so you can probably use a centralized data store. The problem domain is not really that distributed either: shapes can be overlaid one on top of the other in the order the server receives them, FIFO (imagine multiple people writing on the same whiteboard; the last one wins). The real challenge is concurrent modification of existing shapes, but maybe you can fall back to last-change-wins or first-change-wins there, as in the sketch below.
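If you go the centralized route, the core is small. A minimal sketch using the ws WebSocket library (all names are mine, purely for illustration): the server stamps each drawing operation with a sequence number, so arrival order at the server becomes the single agreed-upon order for every client.

    import { WebSocketServer, WebSocket } from 'ws';

    const wss = new WebSocketServer({ port: 8080 });
    let seq = 0;                   // server-assigned total order
    const history: string[] = [];  // replay log for late joiners

    wss.on('connection', (socket: WebSocket) => {
      history.forEach(op => socket.send(op));  // bring new clients up to date
      socket.on('message', data => {
        // Arrival order at the server *is* the order; the seq stamp
        // just makes it explicit for every client.
        const op = JSON.stringify({ seq: seq++, op: JSON.parse(data.toString()) });
        history.push(op);
        wss.clients.forEach(c => c.send(op));  // fan out, sender included
      });
    });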
Another interesting avenue to explore here would be Conflict-free Replicated Data Types (CRDTs). Folks at GitHub used them to implement collaborative "pair" programming in Atom. See the Atom Teletype blog post; their implementation may also be useful, as collaborative editing seems to be exactly the problem you are trying to solve.
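To give the flavour: the simplest CRDT is a last-writer-wins register. This toy sketch is my own illustration; real collaborative editors like Teletype use far richer structures, but the idea of a deterministic merge function is the same.

    // A last-writer-wins register: the simplest CRDT.
    interface LWWRegister<T> {
      value: T;
      timestamp: number;  // Lamport-style logical clock
      nodeId: string;     // tie-breaker for concurrent writes
    }

    function merge<T>(a: LWWRegister<T>, b: LWWRegister<T>): LWWRegister<T> {
      if (a.timestamp !== b.timestamp) return a.timestamp > b.timestamp ? a : b;
      return a.nodeId > b.nodeId ? a : b;  // equal clocks: deterministic tie-break
    }

Because merge is commutative, associative, and idempotent, replicas can exchange states in any order, any number of times, and still converge to the same value without central coordination.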
Hope this helps.
[1] Take a look at the Jepsen series, https://jepsen.io/analyses, where Kyle Kingsbury tests various failure conditions of distributed data stores.
Try reading Understanding Paxos. It's geared towards software developers rather than an academic audience. For this particular application you may also be interested in the Multi-Paxos Example Application referenced by the article. It's intended to help illustrate the concepts behind the consensus algorithm, and it sounds like it's almost exactly what you need for this application. Raft and most Multi-Paxos designs tend to get bogged down with an overabundance of accumulated history, which generates a new set of problems to deal with beyond simple consistency. An initial prototype could easily handle sending the full state of the drawing on each update and ignore the history issue entirely, which is what the example application does. Later optimizations could be made to reduce network overhead.

In what language should the API be written?

We want to implement an API. We have a database located on a central server, and a network of many computers.
On these computers, several local programs will be developed in the future using different programming languages: some in Java, some in Perl, C++, ... etc.
These local programs should be able to access the API functions and interact with the database.
So in what language should the API be written, so that it has bindings to the other languages? Is there any specific architecture that should be implemented?
Is there any link that would provide useful information about this?
If the API is pure database access, then a REST web service is a reasonable choice. It allows a (reasonably) easy interface from almost any language, and allows you to choose whatever language you feel is best for writing the actual web service. However, in doing it this way, you're paying the cost of an extra network call per API call. If you put the web service on the same host (or local network) as the database, you can minimize the cost of the network call from the web service to the database, which mitigates the cost of the extra call to the API.
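As a sketch of that idea (Express shown here, but any HTTP stack will do; query() stands in for whatever database driver you use, and the table and route are invented):

    import express from 'express';

    // Hypothetical helper wrapping the actual database driver.
    declare function query(sql: string, params: unknown[]): Promise<any[]>;

    const app = express();
    app.use(express.json());

    // Any language with an HTTP client can now reach the database.
    app.get('/api/customers/:id', async (req, res) => {
      const rows = await query('SELECT * FROM customers WHERE id = ?', [req.params.id]);
      rows.length ? res.json(rows[0]) : res.sendStatus(404);
    });

    // Run next to the database so the extra network hop stays cheap.
    app.listen(3000);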
If the API has business logic in it, there are two viable approaches...
You can write the API as a library that can be used from multiple languages. C is a good choice for this, because most languages can link C libraries, but the languages you expect to use it from can have a large impact too. For example, if you know it's always going to be used by languages hosted on the JVM, then any JVM language is probably a reasonably good choice.
Another choice is to use a hybrid of the two: a REST API for database access, plus a business-layer library written in multiple languages. The idea is that you have business logic on the application end, but it's simple enough that you can write a "client library" in multiple languages that knows how to call out to the REST API and then apply the business logic to the results it gets back. Assuming the business logic isn't too complex (i.e., limited to ways to merge/view the database data), this isn't a bad solution.
The benefit is that it should be relatively easy to supply one "default" library that can be used by many languages, plus language-specific versions of the library where you have time available to implement them. For cases where figuring out what calls need to be made to the database, and how to combine the results, can be complicated, I find this to be a reasonably good solution.
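One such per-language wrapper might look like this in TypeScript (the endpoint and business rule are made up; fetch() is built into modern Node):

    // Thin per-language wrapper: call the shared REST API,
    // then apply the business rule locally.
    export async function activeCustomers(baseUrl: string): Promise<any[]> {
      const res = await fetch(`${baseUrl}/api/customers`);
      if (!res.ok) throw new Error(`API call failed: ${res.status}`);
      const customers: any[] = await res.json();
      // The "business logic" -- simple enough to duplicate across languages.
      return customers.filter(c => c.status === 'active');
    }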
I would resort to web services. It doesn't matter what language you use; as long as you have a framework to interact with web services, you are good. Depending on your needs, you could expose a simple REST API or go all the way with SOAP/WSDL and the like.

Messaging Middleware vs. RPC and Distributed Databases

I would like to know your opinions on the advantages and disadvantages of using messaging middleware vs. RPC and distributed databases in a distributed application.
These three are completely different things:
Message Oriented Middleware (MOM): a subsystem providing (arbitrary) message delivery services between interested systems, usually providing the ability to change messages' content, route them, log them, guarantee delivery, etc.
Remote Procedure Call (RPC): a rather generic term denoting a method of invoking a procedure / method / service residing in a remote process.
Distributed database: seems quite self-explanatory to me; refer to Wikipedia.
Hence it's hard to name specific (dis)advantages without knowing the actual distributed application better. You could be comparing RPC and MOM; in that case, MOM is usually a complete message delivery solution, while RPC is just a technical means of inter-process communication.

At what level should I implement communication between nodes in a distributed system?

I'm building a web application that from day one will be at the limits of what a single server can handle, so I'm considering adopting a distributed architecture with several identical nodes. The goal is to provide scalability (add servers to accommodate more users) and fault tolerance. The nodes need to share some state, so some communication between them is required. I believe I have the following alternatives for implementing this communication in Java:
Implement it using sockets and a custom protocol.
Use RMI
Use web services (each node can send and receive/parse HTTP requests).
Use JMS
Use another high-level framework like Terracotta or Hazelcast
I would like to know how these technologies compare to each other:
When the number of nodes increases
When the amount of communication between the nodes increases (1000s of messages per second and/or messages up to 100KB etc)
On a practical level (e.g. ease of implementation, available documentation, licensing issues, etc.)
I'm also interested to know what technologies people are using in real production projects (as opposed to experimental or academic ones).
Don't forget Jini.
It gives you automatic service discovery, service leasing, and downloadable proxies so that the actual client/server communication protocol is up to you and not enforced by the framework (e.g. you can choose HTTP/RMI/whatever).
The framework is built around acknowledgement of the 8 Fallacies of Distributed Computing and recovery-oriented computing. i.e. you will have network problems, and the architecture is built to help you recover and maintain a service.
If you also use JavaSpaces, it's trivial to implement workflows and producer/consumer architectures. Producers write into the JavaSpace, and one or more consumers take that work from the space (under a transaction) and work with it. So you scale simply by providing more consumers.

Exchange data between two apps across PCs on a LAN

I have a need of implementing two apps that will exchange data with each other. Both apps will be running on separate PCs which are part of a LAN.
How can we do this in Delphi?
Is there any free component which will make it easy to exchange data between apps across PCs?
If I'm writing it myself, I (almost) always use sockets to exchange data between apps.
It's lightweight, it works well on the same machine, across the local network or the Internet with no changes, and it lets you communicate between apps with different permissions, like services (Windows messages cause problems here).
It might not be a requirement for you, but I'm also a fan of platform-independent transports, like TCP/IP.
There are lots of free choices for Delphi. Here are a few that I know of. If you like blocking libraries, look at Indy or Synapse. If you prefer non-blocking, check out ICS.
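Whatever library you pick, the shape of a socket exchange is the same in every language. This isn't Delphi, but as a neutral illustration, a minimal TCP request/reply sketch in TypeScript on Node's net module (the port and host name are invented):

    import * as net from 'net';

    // Server: the same socket API regardless of whether the peer is on
    // this machine, the LAN, or the Internet.
    const server = net.createServer(socket => {
      socket.on('data', chunk => socket.write(`ack:${chunk}`));
    });
    server.listen(5000);

    // Client on another PC: only the host name changes.
    const client = net.connect(5000, 'other-pc', () => client.write('hello'));
    client.on('data', reply => { console.log(reply.toString()); client.end(); });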
Before you choose a technique, you should characterize the communication according to its throughput, granularity, latency, and criticality.
Throughput -- how much data per unit time will you need to move? The range of possible values is so wide that the lowest-rate and highest-rate applications have almost nothing in common.
Granularity -- how big are the messages? How much data does the receiving application need before it can use the message?
Latency -- when one application sends a message, how soon must the other application see it? How quickly do you want the receiving application to react to the sending application?
Criticality -- how long can a received message be left unattended before it is overrun by a later message? (This is usually not important unless the throughput is high and the message storage is limited.)
Once you have these questions answered, you can begin to ask about the best technology for your particular situation.
-Al.
I used to use mailslots if I needed to communicate with more than one PC at a time ("broadcast") over a network, although there is the caveat that mailslot delivery is not guaranteed.
For 1-to-1, named pipes are a Windows way of doing this sort of thing; you basically open a communication channel between two PCs and then write messages into the pipe. Not straightforward to start with, but very reliable, and the recommended approach for things like Windows services.
MS offers named pipes as an alternative way of communicating with an SQL Server (other than TCP/IP).
But as Bruce said, TCP/IP is standard and platform independent, and very reliable.
DCOM used to be a good method of interprocess communication, and this was also one of Delphi's strong points. Today I would strongly advise against using it.
Depending on the nature of your project I'd choose either
using a SQL server
socket communication
Look at solutions that use "Remote Procedure Call" type interfaces. I use RemObjects SDK for this sort of thing, but there are open source versions of RealThinClient which would do just as well.
Both of these allow you to create a connection that is "transparent" to most of your code: you just call an interface, which sends the data over the wire and gets results back. You can then program the way you usually do, and forget the details of sockets etc.
This is one of those cases where there really isn't a "best" answer, as just about any of the technologies already discussed can be used to communicate between two applications. The choice of method really comes down to how critical your communication is, as well as how much data must be transferred from one workstation to another.
If your communication is not time-sensitive or critical, then a simple poll of a database or file at regular intervals might be sufficient. If your communication is critical and time-sensitive, then placing a TCP/IP server in each client might be worth pursuing. If just time-sensitive, mailslots make a good choice; if critical but not time-sensitive, named pipes.
I've used the Indy library's Multicast components (IdIPMCastClient/Server) for this type of thing many times. The apps just send XML to each other. Quick and easy with minimal connection requirements.
Probably the easiest way is to read and write a file (or possibly one file per direction). It also has the advantage that it is easy to simulate and trace. It's not the fastest option, though (and it definitely sounds lame ;-) ).
A possibility could be to "share" objects across the network.
It is possible with a Client-Server ORM like our little mORMot.
This open source library works from Delphi 6 up to XE2 and uses JSON for transmission. Some security features are included (involving a RESTful authentication mechanism), and it can use any database, or no database at all.
See in particular the first four samples provided, and the associated documentation.
For Delphi application integration, a message oriented middleware might be an option. Message brokers offer guaranteed delivery, load balancing, and different communication models, and they work cross-platform and cross-language. Open source message brokers include:
Apache ActiveMQ and ActiveMQ Apollo
Open Message Queue (OpenMQ)
HornetQ
RabbitMQ
(Disclaimer - I am the author of Delphi / Free Pascal client libraries for these servers)
