Is this system an optimal solution to sync an app with a server in real time?

Problem
I have an Android and iOS app that looks like a classic social network, and I need to update its UI in real time. Currently I use a classic system: each client polls a PHP script over HTTP every second. The PHP script queries the database every second for every client and responds, most of the time, that there is no new update. If there is a new update, the PHP script processes it and sends it back to the client app.
There are 3 problems with this approach: (1) a slow user experience (a 1-second delay each time) plus high battery and data usage, (2) the Apache machines are hit every second by incoming HTTP requests, (3) the database machine is hit every second by the Apache machines (asking whether there are new stored updates in the main database).
I feel that this system could be substantially improved. For problem (1), I know a TCP connection can be "piped" to the app, but problem (3) remains, because the thread behind the socket still polls the database every second to know whether there are new stored updates for its member ID.
Solution ?
I thought of a system that gets rid of any activity (client, Apache and database) when there are no new updates. There would be N Apache servers on N machines, behind a load balancer exposed to the Internet. Behind these Apache servers, connected only to the local network, one "central" database and one "update" database dedicated to the update system. The "update" database would store 2 tables:
One table for the mapping between user tokens (and their member ID) and the thread ID and name of the Apache machine currently holding the thread. One user ID may have several connection tokens, but one connection token is associated with only one unique pair (PID, machine name). Each time a user connects to the app, a TCP connection would be created and held by one thread (on one Apache machine), and the (thread ID, machine name) pair would be stored in that table.
One table to store the updates themselves. Each update contains all the information needed to get up-to-date data (either in raw primitive form, like a string or an int, or in "reference" form, telling the recipient TCP threads that some parameters must be computed "at sending time", for more complex data structures).
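To make the two tables concrete, here is a rough sketch of what a row in each could look like, written as TypeScript types purely for illustration; all field names are hypothetical and nothing here is prescribed by the design above.

```typescript
// Hypothetical row shapes for the two tables in the "update" database.

// Maps one device's connection token to the thread currently holding its TCP connection.
interface ConnectionRow {
  connectionToken: string; // one token per connected device
  memberId: number;        // a user may appear several times (several devices)
  threadId: number;        // PID / thread ID of the Apache worker holding the socket
  machineName: string;     // which Apache machine that worker runs on
}

// One pending update for a given user, to be pushed by the thread(s) holding their connection(s).
interface UpdateRow {
  updateId: number;
  recipientMemberId: number;
  kind: string;                      // e.g. "user_message"
  rawParams: Record<string, string>; // primitive values sent as-is
  referenceParams: string[];         // params to be resolved "at sending time"
  createdAt: string;                 // ISO timestamp
}
```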
The system would be the following :
(1) A user wants to send a message to another user. The sender's app client sends an HTTP request to the app API endpoint; the load balancer forwards the request to one of the Apache machines.
(2) The Apache server asks the main database to insert the "user message" row.
(3) The Apache server queries the "update" database to know whether the recipient has any currently connected device.
(4) If there is at least one connected device, it inserts an "update" row in the "update" database with all the information needed, and wakes up all threads associated with the recipient user ID (maybe using C signals?).
(5) All the threads associated with the recipient user ID wake up, look in the "update" database for new updates associated with their user ID, process the parameters (especially if there are "reference" params to be computed), and send the result back to the recipient devices via TCP.
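To make steps (3)-(5) concrete, here is a minimal single-process sketch in TypeScript (Node) of the "only do work when someone is connected" idea; the connection registry and the in-memory update list are hypothetical stand-ins for the "update" database, and in the real multi-machine setup the wake-up would have to cross machines (for example via a message queue or pub/sub channel rather than C signals).

```typescript
import { WebSocket } from 'ws'; // assuming connected devices are held as WebSocket objects

// Hypothetical in-process registry: recipient member ID -> sockets of their connected devices.
// In the proposed design this mapping lives in the "update" database instead.
const connections = new Map<number, Set<WebSocket>>();

// Stand-in for the "update" table; a real implementation would INSERT into the update database.
const pendingUpdates: { recipientMemberId: number; kind: string; params: Record<string, unknown> }[] = [];

// Steps (3)-(5): called right after the "user message" row has been written to the main database.
function notifyRecipient(recipientMemberId: number, kind: string, params: Record<string, unknown>): void {
  const devices = connections.get(recipientMemberId);

  // Steps (3)/(4): only do any work if at least one device is connected.
  if (!devices || devices.size === 0) return;

  // Step (4): persist the update so the thread(s) holding the connection(s) can read it.
  pendingUpdates.push({ recipientMemberId, kind, params });

  // Step (5): "wake up" the holders and push. Here the holder is the same process, so we can
  // push directly; across machines this would be a pub/sub message instead of a C signal.
  const payload = JSON.stringify({ kind, params });
  for (const socket of devices) {
    if (socket.readyState === WebSocket.OPEN) socket.send(payload);
  }
}
```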
So my final question is: is such a system feasible and reliable, and if so, do you think it can be optimal in terms of database and Apache machine performance?
I'm more of a front-end programmer and I'm not used to implementing complex server architectures, so I wanted some opinions before diving into the code, especially if I missed something in my approach (is storing PIDs reliable? Is it possible for one machine to wake up a thread on another machine through the local network? ...)
PS: I already tried Firebase Cloud Messaging, but the problem is that it only allows a flat, one-dimensional set of params to be sent with an update. When dealing with a complex data structure (like a "user message"), when I receive a signal from FCM in my client app I still need to make an extra HTTP call to my server to retrieve the new "user message" JSON payload. So, good for my Apache and database machines (they are not bothered when there are no new updates), bad for the client app that has to send additional HTTP requests. Once again, tell me if I missed something here :)
Thanks for reading

Related

Webapp server data storage: Memory vs database

We are making a web application in Go with a MySQL database. Our users are allowed to have only one active client at a time, much like Spotify allows you to listen to music on only one device at a time. To do this I made a map with the user ids as keys and a reference to their active websocket connection as values. Based on the websocket id that the client has to send in the header of the request, we can identify whether the request comes from their active session.
My question is whether it's good practice to store data (in this case the map of user ids and websockets) in a global space, or whether it is better to store it in the database.
We don't expect to reach over 10000 simultaneously active clients. The average is probably going to be around 1000.
If you only run one instance of the websocket server, storing it in memory should be sufficient, because if it goes down or restarts for some reason, all the connections will be lost and all the clients will have to create them again (and hence the list of connections will once again be populated by all the clients who want to use the service).
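For illustration, here is a minimal sketch of that in-memory approach. The question is about Go, but the sketch below uses TypeScript with the `ws` package purely for illustration; the Go version is the same idea, a mutex-protected map from user id to connection. How the user id is derived from the request is a placeholder.

```typescript
import { createServer } from 'http';
import { randomUUID } from 'crypto';
import { WebSocketServer, WebSocket } from 'ws';

// userId -> the single active connection and its id (the "websocket id" the client echoes back).
const active = new Map<string, { socketId: string; socket: WebSocket }>();

const server = createServer();
const wss = new WebSocketServer({ server });

wss.on('connection', (socket, req) => {
  // Placeholder: in a real app the user id would come from an authenticated session or token.
  const userId = new URL(req.url ?? '/', 'http://localhost').searchParams.get('user') ?? 'anonymous';

  // Only one active client per user: drop any previous connection, Spotify-style.
  active.get(userId)?.socket.close();

  const socketId = randomUUID();
  active.set(userId, { socketId, socket });
  socket.send(JSON.stringify({ socketId })); // the client sends this id back in later HTTP requests

  socket.on('close', () => {
    if (active.get(userId)?.socketId === socketId) active.delete(userId);
  });
});

// An ordinary HTTP handler can then compare a request's websocket id header with active.get(userId).
server.listen(8080);
```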
However, if you plan on scaling it horizontally so you have multiple websocket services behind a load balancer, then the connections may need to be stored in a database of some sort. And not because it necessarily needs to be more persistent, but because you need to be able to check the request against all the services' connections.
It is also possible to have a separate service which handles the incoming request and asks all the websocket services whether any of them holds the connection specified in the request. This could be done by adding a pub/sub queue: every websocket service subscribes to channels for all its websocket ids, the service that receives the request publishes the websocket id, and the websocket services send back replies on a separate channel if they have that connection. You must decide how to handle the case where no one responds (no websocket service has the websocket id): either the channel does not exist, or you expect the answer within a specific time. Or you could publish the question on a general topic and expect all the websocket services to reply (yes or no).
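As a sketch of that last variant, assuming Redis pub/sub via the ioredis package (the topic and channel names are made up), each websocket service could answer a "who has this id?" question like this:

```typescript
import Redis from 'ioredis';

// A Redis connection in subscriber mode cannot publish, hence one client of each per service.
const sub = new Redis();
const pub = new Redis();

// The websocket ids this particular websocket service currently holds.
const localSocketIds = new Set<string>();

// Every websocket service listens on a general "who-has" topic ...
sub.subscribe('who-has');
sub.on('message', (_channel, socketId) => {
  // ... and replies on a per-question channel only if it owns that connection.
  if (localSocketIds.has(socketId)) {
    pub.publish(`who-has:reply:${socketId}`, 'yes');
  }
});

// The service that received the HTTP request asks around and treats silence as "nobody has it".
async function someServiceHasConnection(socketId: string, timeoutMs = 500): Promise<boolean> {
  const replySub = new Redis();
  await replySub.subscribe(`who-has:reply:${socketId}`);
  return new Promise((resolve) => {
    const timer = setTimeout(() => { replySub.quit(); resolve(false); }, timeoutMs);
    replySub.on('message', () => { clearTimeout(timer); replySub.quit(); resolve(true); });
    pub.publish('who-has', socketId);
  });
}
```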
Whether you need to scale it, I guess, depends mostly on the underlying server you're running the service on. If I understand it correctly, the websocket service will basically not do anything except keep track of its connections (you should add some ping/pong to discover lost connections). Then your limitation should mainly be how many file descriptors your system can handle at once. If that limit is much larger than your expected maximum number of users, then running only one server and storing everything in memory might be an OK solution!
Finally, if you're in the business of having a websocket open for all users, why not do all the "other" communication over that websocket connection instead of having them send HTTP requests with their websocket id? Perhaps HTTP fits your use case better, but it could be something to think about :)

How to handle long requests on the frontend?

My application allows a user to enter a URL of an article he/she wishes to analyze. It goes through our API gateway to reach the correct services engaged in this process. The analysis takes between 5 and 30 seconds depending on the article's word count.
For now, my reactjs client sends the request to the API and waits 5 to 30 seconds to receive the response. Is there a better way to handle this, such as enqueuing the job and letting the API ping the client (reactjs frontend) once it is done?
Server-Sent Events (SSE) allow your server to push new information to your browser, and hence look ideal to me for this purpose. They work over HTTP and there is good support in all browsers except IE.
So the new process could look as follows:
The client sends a request to the server, which initiates the lookup and potentially responds with the topic the browser needs to subscribe to (in case that's unique per lookup).
The server does its thing and sends updates as it processes new content. The beauty of this is that you could inform your client about partial updates.
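As an illustration, a minimal Express endpoint streaming SSE updates might look like the sketch below; the route, the query parameter and the `analyzeArticle` generator are made up, and the real analysis pipeline would sit behind the generator.

```typescript
import express from 'express';

const app = express();

// Placeholder for the real analysis: yields partial results as the article is processed.
async function* analyzeArticle(url: string) {
  yield { progress: 0.5, note: `started analysing ${url}` };
  await new Promise((resolve) => setTimeout(resolve, 1000)); // simulate the 5-30 s of work
  yield { progress: 1, note: 'done' };
}

// The browser opens this with: new EventSource('/analyze?url=...')
app.get('/analyze', async (req, res) => {
  res.setHeader('Content-Type', 'text/event-stream');
  res.setHeader('Cache-Control', 'no-cache');
  res.setHeader('Connection', 'keep-alive');
  res.flushHeaders();

  for await (const update of analyzeArticle(String(req.query.url))) {
    // Each partial result is pushed to the browser as soon as it is available.
    res.write(`data: ${JSON.stringify(update)}\n\n`);
  }
  res.end();
});

app.listen(3000);
```

On the React side, `new EventSource('/analyze?url=...')` with an `onmessage` handler is all that is needed to receive these updates.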
If SSE is not an option for you, you could leverage good old WebSockets for bi-directional communication, but for such a simple endeavor it might be too much technology to solve the problem.
A third alternative, especially if you are talking between services (no web or mobile clients on the other side), is to use web-hooks, so that the interested party exposes and listens on a specific endpoint that the publisher (the server that does the processing) writes updates to.
Hope this is useful.

HTTP response following long process

The current project is in Node.js with the Expressjs framework. We have an application with client/prospect information; authenticated users are allowed to modify the database and initiate long-running processes on the server. As an example, printing a 30-page document could be one process.
The application user needs two main things:
A response when the process begins.
A response (notification) when the process ends.
We currently handle need #1 in standard express fashion by ensuring the process starts, followed by res.json({msg: 'Process Started'}); back to the Angular front end. Need #2 is currently handled with an email to the user that initiated the process, containing a process status report.
I would like to improve how we handle need #2. Specifically, I would like to send a JSON string to the user to display in the interface.
Questions:
Is it possible to do this over HTTP?
Does this functionality exist within Express or a well-known middleware?
Assuming 1 & 2 are false, my solution is to run a TCP socket server to maintain a socket with the required users. When a process ends, a message is sent to the user with a status update. Can anyone comment on the issues my solution presents?
Yes to both 1 and 2. Essentially what you seek to achieve here is to push from the server to the client. The need to do this is pretty ubiquitous in web applications and there have been various solutions for it over the years with various fancy names. You might like to read up on Ajax, Comet, Long-polling, Websockets.
For your node application, take a look at socket.io. In a nutshell, this framework abstracts the complexities of Ajax, WebSockets, etc. into a single API. Put another way, socket.io gives you bi-directional communication between your node application and the front end.
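A minimal sketch of how that could look for the "notify when the long process ends" need, using Express and socket.io; the route, the event name and the way the user is identified are assumptions, not part of the original question.

```typescript
import express from 'express';
import { createServer } from 'http';
import { Server } from 'socket.io';

const app = express();
const httpServer = createServer(app);
const io = new Server(httpServer);

io.on('connection', (socket) => {
  // Assumption: the Angular client identifies itself when connecting,
  // e.g. io('http://localhost:3000', { auth: { userId: '42' } }).
  const userId = String(socket.handshake.auth.userId);
  socket.join(userId); // one room per user, so the server can address them later
});

// Need #1: acknowledge immediately that the process has started.
app.post('/print/:docId', (req, res) => {
  res.json({ msg: 'Process Started' });

  // Need #2: when the long-running job finishes, push a JSON status to the initiating user.
  runPrintJob(req.params.docId).then((report) => {
    io.to(String(req.query.userId)).emit('process:done', report);
  });
});

// Placeholder for the actual long-running process (e.g. printing the 30-page document).
async function runPrintJob(docId: string) {
  return { docId, status: 'finished' };
}

httpServer.listen(3000);
```

On the Angular side, a socket.io client listening for the 'process:done' event can then display the JSON status in the interface.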

Pushing data across App Engine instances

Let's say we have several clients connected to App Engine using the Channel API. Each client sends messages, which should be propagated to other connected clients according to some rules. The tricky part is that the clients may not be connected to the same App Engine instance.
Is there any way to push data from one instance to the others?
(Yes, I know about Memcache, but this would require some kind of polling.)
You're asking two questions here.
a. Can you push data from one instance to another without polling? The answer is generally no.
b. Can one client send messages to the server that can be propagated to other clients? Yes, and this does not require propagating messages to other server-side instances.
Consider the Channel API as a service. Clients are connected to the Channel API service; they are not connected to any particular instance. Therefore any instance can send messages to any client.
1. You'll need to store the Channel tokens of your clients in the datastore, in some way that's queryable to match your rules.
2. Your client makes an HTTP request to send a message to your server.
3. The handler on the server queries for the channel tokens it needs to propagate the message to (either from memcache or the datastore).
4. The handler on the server sends messages to all the clients.
If the list of destination clients is extremely large, you might want to do steps 3/4 in a task queue where the operation can run longer.
It does not matter what instance a client is connected to, that's hidden from you by the API.
Clients can only "reply" to messages via standard HTTP commands; they don't actually have any way to respond via the Channel API directly.
So client A on server A1 wants to send a message to client B on server B1.
Client A posts to a handler. That might be instance A1 or B1; it does not matter which, as the server now passes the message on to client B via the Channel API, whatever server client B is connected to.
The real point is that no App Engine instance has any data at all, in general. So it does not matter which instance you connect to, it might be the 99th instance or the very first to start up. So you have to design your application so that it's irrelevant what instance is in use.
Client sends message to server via HTTP.
Server sends message to N clients via the channel API.
The Channel API does not make a fixed frontend-instance-to-client connection. Any frontend instance can push a message to a channel if it knows the channel ID.
What you need to do is pass messages cross-channel.
User one sends a message normally to the server (e.g. via GET).
The server looks up the channel ID of the second user and pushes the message.
Repeat the procedure in the other direction: second user to first user.

Structure to handle inter-device messaging

What is the best way to handle messages through a server to multiple devices?
Scenario
It will be an app capable of running on multiple mobile platforms, as well as online in a web browser: a type of instant messenger. Messages will be directed through a server to another mobile device.
The back-end structure/concept must be basically the same as WhatsApp: sending messages to one another like that.
What I think
Have the device send it to the web-server.
Server saves it in a queue table in a database.
When the receiver device checks for a new message (every second), it finds it in the queue.
Remove it from the queue and put the message in a history table.
Final
What would be an efficient way to structure/handle such an app to get results similar to WhatsApp?
You may want to push messages instead of pulling them every second. This has two big advantages:
Less bandwidth usage.
You can skip the database part if the sender and the receiver are both connected when the message is sent. Only queue the messages in the database if the receiver isn't connected.
So it's a huge performance boost if you use push.
If you have a web app using JavaScript, you can use a JSON stream or, for newer browsers, JavaScript WebSockets.
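For illustration, a compact "push if connected, otherwise queue" sketch over WebSockets (TypeScript with the `ws` package); the in-memory `queued` map stands in for the database queue table described in the question, and the way users identify themselves is a placeholder.

```typescript
import { WebSocketServer, WebSocket } from 'ws';

type ChatMessage = { from: string; to: string; text: string };

const online = new Map<string, WebSocket>();     // userId -> live connection
const queued = new Map<string, ChatMessage[]>(); // stand-in for the database queue table

const wss = new WebSocketServer({ port: 8080 });

wss.on('connection', (socket, req) => {
  // Placeholder: a real app derives the user id from an authenticated handshake.
  const userId = new URL(req.url ?? '/', 'ws://local').searchParams.get('user') ?? 'anonymous';
  online.set(userId, socket);

  // Deliver anything that was queued while this user was offline.
  for (const msg of queued.get(userId) ?? []) socket.send(JSON.stringify(msg));
  queued.delete(userId);

  socket.on('message', (raw) => {
    const msg: ChatMessage = { ...JSON.parse(raw.toString()), from: userId };
    const recipient = online.get(msg.to);
    if (recipient && recipient.readyState === WebSocket.OPEN) {
      recipient.send(JSON.stringify(msg)); // pushed immediately, no polling involved
    } else {
      const inbox = queued.get(msg.to) ?? [];
      inbox.push(msg);
      queued.set(msg.to, inbox); // would be an INSERT into the queue table in practice
    }
  });

  socket.on('close', () => {
    if (online.get(userId) === socket) online.delete(userId);
  });
});
```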
