Programming a load balancer with sockets in C

I'm trying to make a load balancer for 3 HTTP servers (hosts "web1", "web2", "web3"; load balancer ports 8081, 8082, 8083).
The load balancer forwards each HTTP request to a randomly chosen server and then returns the result of the request to the sender.
I'm beginning with sockets, so could anyone tell me what the program would look like?
If it's not clear, I'm ready to give more details.

You need to figure out whether the requests are stateful, i.e. whether they belong to a valid session. If they do, such requests should be consistently routed to the same server to avoid failures and inconsistencies. Fresh requests can be routed to any of the servers based on a load balancing algorithm, e.g. round robin or least-loaded server.
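The question asks about C, but the shape of the program is the same in any socket API: accept a client, connect to a chosen backend, and copy bytes in both directions. Below is a minimal sketch in Go of the random-routing variant; in C the same structure maps onto socket(), bind(), listen(), accept(), connect(), and read()/write(). The backend addresses and the single listening port are assumptions taken from the question; a real balancer would listen on all three ports and handle errors properly.

```go
// Minimal sketch of a random-backend TCP proxy. Hostnames and the
// listening port are hypothetical, taken from the question.
package main

import (
	"io"
	"log"
	"math/rand"
	"net"
)

var backends = []string{"web1:80", "web2:80", "web3:80"}

func handle(client net.Conn) {
	defer client.Close()

	// Pick a backend at random, as the question describes.
	backend, err := net.Dial("tcp", backends[rand.Intn(len(backends))])
	if err != nil {
		log.Println("backend dial failed:", err)
		return
	}
	defer backend.Close()

	// Shuttle bytes both ways; plain proxying needs no HTTP parsing.
	go io.Copy(backend, client) // client -> backend
	io.Copy(client, backend)    // backend -> client
}

func main() {
	ln, err := net.Listen("tcp", ":8081")
	if err != nil {
		log.Fatal(err)
	}
	for {
		conn, err := ln.Accept()
		if err != nil {
			continue
		}
		go handle(conn)
	}
}
```

Note this routes per connection, not per request; with HTTP keep-alive, all requests on one connection go to the same backend, which is often what you want for the sticky-session case described above.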

Related

Webapp server data storage: Memory vs database

We are making a web application in Go with a MySQL database. Our users are allowed to have only one active client at a time, much like Spotify allows you to listen to music on only one device at a time. To do this I made a map with the user ids as keys and a reference to their active websocket connection as the value. Based on the websocket id that the client has to send in the header of the request, we can identify whether the request comes from their active session.
My question is whether it's good practice to store data (in this case the map of user ids to websockets) in a global space, or whether it's better to store it in the database.
We don't expect to reach over 10000 simultaneously active clients. Average is probably gonna be around 1000.
If you only run one instance of the websocket server, storing it in memory should be sufficient. If it for some reason goes down or restarts, all the connections will be lost and all the clients will have to create them again (and hence the list of connections will once again be populated by all the clients who want to use the service).
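For concreteness, here is a minimal sketch of that in-memory registry, assuming gorilla/websocket for the connection type (the question doesn't name a library). The mutex matters because HTTP handlers run concurrently:

```go
// Sketch of a global, mutex-guarded map from user id to the one
// active connection. Types and names are illustrative.
package registry

import (
	"sync"

	"github.com/gorilla/websocket"
)

type Registry struct {
	mu    sync.RWMutex
	conns map[string]*websocket.Conn // user id -> active connection
}

func New() *Registry {
	return &Registry{conns: make(map[string]*websocket.Conn)}
}

// Set replaces any previous connection for the user, enforcing the
// "one active client at a time" rule.
func (r *Registry) Set(userID string, c *websocket.Conn) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if old, ok := r.conns[userID]; ok {
		old.Close() // kick the previous session
	}
	r.conns[userID] = c
}

func (r *Registry) Get(userID string) (*websocket.Conn, bool) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	c, ok := r.conns[userID]
	return c, ok
}
```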
However, if you plan on scaling it horizontally, so you have multiple websocket services behind a load balancer, then the connections may need to be stored in a database of some sort. Not because it necessarily needs to be more persistent, but because you need to be able to check the request against all the services' connections.
It is also possible to have a separate service which handles the incoming request and asks all the websocket services whether any of them has the connection specified in the request. This could be done with a pub/sub queue: every websocket service subscribes to channels for its websocket ids, the service that receives the request publishes the websocket id, and the websocket services send back replies on a separate channel if they have that connection (see the sketch below). You must decide how to handle the case where no one responds (no websocket service has the websocket id): either the channel does not exist, or you expect the answer within a specific time. Alternatively, you could publish the question on a general topic and expect all the websocket services to reply (yes or no).
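A hedged sketch of the asking side of one variant of that lookup, using Redis pub/sub via github.com/redis/go-redis/v9 (my choice of broker, channel names, and timeout, not the poster's). Each websocket service would subscribe to the "lookup" channel and publish its own address on the per-id reply channel for ids it holds:

```go
package lookup

import (
	"context"
	"fmt"
	"time"

	"github.com/redis/go-redis/v9"
)

// askWhoHas publishes a lookup for wsID and waits briefly for whichever
// websocket service holds that connection to answer on a per-id reply
// channel. Channel names and the 1s deadline are illustrative.
func askWhoHas(ctx context.Context, rdb *redis.Client, wsID string) (string, error) {
	sub := rdb.Subscribe(ctx, "reply:"+wsID)
	defer sub.Close()

	// Ask everyone; the owning service (subscribed to "lookup") replies.
	if err := rdb.Publish(ctx, "lookup", wsID).Err(); err != nil {
		return "", err
	}

	ctx, cancel := context.WithTimeout(ctx, time.Second)
	defer cancel()
	msg, err := sub.ReceiveMessage(ctx) // times out if no service owns wsID
	if err != nil {
		return "", fmt.Errorf("no service claimed %s: %w", wsID, err)
	}
	return msg.Payload, nil // e.g. the owning service's address
}
```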
And regarding whether you need to scale it, I guess that depends mostly on the underlying server you're running the service on. If I understand it correctly, the websocket service will basically not do anything except keep track of its connections (you should add some ping/pong to discover lost connections). Then your limitation should mainly be how many file descriptors your system can handle at once. If that limit is much larger than your expected maximum number of users, then running only one server and storing everything in memory might be an OK solution!
Finally, if you're in the business of having a websocket open for all users, why not do all the "other" communication over that websocket connection instead of having them send HTTP requests with their websocket id? Perhaps HTTP fits better for your use case but could be something to think about :)

How to ignore idle timeout from AWS ELB in the browser

I have an application where a user can upload a PDF using angular-file-upload.js
This library does not support file chunking: https://github.com/nervgh/angular-file-upload/issues/41
My elastic load balancer is configured to have an idle timeout of 10 seconds and other parts of the application depend on keeping this parameter.
The issue is that if the file upload takes longer than 10 seconds, the user receives a 504 Gateway Timeout and an error message in the browser. However, the file still reaches the server after some time.
How can I ignore or not show the user this 504 Gateway Timeout that comes from the ELB? Is there another way around this issue?
The issue you have is that an ELB is always going to close the connection unless it gets some traffic back from your server. See below from AWS docs. It's the same behaviour for an ALB or a Classic load balancer.
"By default, Elastic Load Balancing sets the idle timeout to 60 seconds for both connections. Therefore, if the instance doesn't send some data at least every 60 seconds while the request is in flight, the load balancer can close the connection. To ensure that lengthy operations such as file uploads have time to complete, send at least 1 byte of data before each idle timeout period elapses, and increase the length of the idle timeout period as needed."
So to get around this, you have two options:
Change the server processing to start sending some data back as soon as the connection is established, on an interval of less than 10 seconds (sketched below).
Use another library for doing your uploads, or use vanilla javascript. There are plenty of examples out there, e.g. this one.
Edit: a third option
Thanks to @colde for making the valid point that you can simply work around your load balancer altogether. This has the added benefit of freeing up your server resources, which otherwise get tied up with lengthy uploads. In our implementation we used pre-signed URLs to achieve this securely.
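The question doesn't say what the backend is written in, so here is the first option sketched in Go under that assumption: trickle a byte to the client more often than the 10-second idle timeout while the real work runs. The route and the 5-second interval are illustrative.

```go
// Hedged sketch: keep the ELB's idle timer from firing by flushing a
// byte every few seconds while a slow upload is being processed.
package main

import (
	"io"
	"net/http"
	"time"
)

func uploadHandler(w http.ResponseWriter, r *http.Request) {
	flusher, ok := w.(http.Flusher)
	if !ok {
		http.Error(w, "streaming unsupported", http.StatusInternalServerError)
		return
	}

	done := make(chan struct{})
	go func() {
		// Stand-in for the real work: draining the uploaded file.
		io.Copy(io.Discard, r.Body)
		close(done)
	}()

	ticker := time.NewTicker(5 * time.Second) // < the 10s idle timeout
	defer ticker.Stop()
	for {
		select {
		case <-ticker.C:
			w.Write([]byte(" ")) // 1 byte of traffic resets the idle timer
			flusher.Flush()
		case <-done:
			w.Write([]byte("done\n"))
			flusher.Flush()
			return
		}
	}
}

func main() {
	http.HandleFunc("/upload", uploadHandler)
	http.ListenAndServe(":8080", nil)
}
```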

How to handle long requests on the frontend?

My application allows a user to enter a URL of an article he/she wishes to analyze. It goes through our API gateway to reach the correct services engaged in this process. The analysis takes between 5 and 30 seconds depending on the article's word count.
For now, my reactjs client sends the request to the API and waits 5 to 30 seconds for the response. Is there a better way to handle this, such as enqueuing the job and letting the API ping the client (reactjs frontend) once it's done?
Server-Sent Events (SSE) allow your server to push new information to your browser, which makes them look ideal to me for this purpose. They work over HTTP, and there is good support in all browsers except IE.
So the new process could look as follows:
The client sends a request to the server, which initiates the lookup and potentially responds with the topic the browser needs to subscribe to (in case that's unique per lookup).
The server does its thing and sends updates as it processes new content. The beauty of this is that you can inform your client about partial updates (a server sketch follows).
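A minimal sketch of such an SSE endpoint in Go (the question's backend stack isn't specified; the route, event names, and fake progress loop are all illustrative):

```go
// Streams progress events over text/event-stream until the analysis
// finishes. The loop stands in for the real 5-30s analysis.
package main

import (
	"fmt"
	"net/http"
	"time"
)

func analyze(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "text/event-stream")
	w.Header().Set("Cache-Control", "no-cache")
	flusher, ok := w.(http.Flusher)
	if !ok {
		http.Error(w, "streaming unsupported", http.StatusInternalServerError)
		return
	}

	for pct := 20; pct <= 100; pct += 20 {
		time.Sleep(time.Second)
		fmt.Fprintf(w, "event: progress\ndata: %d\n\n", pct) // partial update
		flusher.Flush()
	}
	fmt.Fprintf(w, "event: done\ndata: analysis complete\n\n")
	flusher.Flush()
}

func main() {
	http.HandleFunc("/analyze", analyze)
	http.ListenAndServe(":8080", nil)
}
```

On the React side, the browser's built-in EventSource API is enough to consume this: create one with new EventSource("/analyze") and attach listeners for the progress and done events.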
If SSE is not an option for you, you could leverage good old WebSockets for bi-directional communication, but for such a simple endeavor it might be too much technology for the problem.
A third alternative, especially if you are talking amongst services (no web or mobile clients on the other side), is to use webhooks: the interested party exposes and listens on a specific endpoint, and the publisher (the server that does the processing) writes updates to it.
Hope this is useful.

Higher than expected outgoing bandwidth

We have an application that is experiencing much higher outgoing bandwidth than we would expect given the load. We see over 10 GB/day of outgoing bandwidth with essentially 0 visitors/day on the front end and a bunch of back-end processing (using backend servers and the task queue). We also use memcache.
Google says they bill as follows:
Outgoing Bandwidth (billable)
The amount of data sent by the application in response to requests.
This includes:
data served in response to both secure requests and non-secure requests by application servers, static file servers, or the Blobstore
data sent in email messages
data sent over XMPP or the Channel API
data in outgoing HTTP requests sent by the URL fetch service.
We are not serving static files (the app only has a REST API), don't use the blob store, don't send emails, and don't use XMPP. We do use the URL fetch service, but only with GET requests. I find it hard to believe that 6000 GET requests would amount to 10 GB of data - that would be roughly 1.7 MB per request.
Does anyone know how I can track down the details of what goes into our outgoing bandwidth usage?
To get an idea of when this bandwidth is being consumed, on the App Engine dashboard you can change the chart's context to Traffic (Bytes/Second).
Also, within the dashboard I would open the Quota Details page, give it a quick once-over, and see if you can isolate which service is consuming the bandwidth.
On a side note, have you reviewed tasks in flight to see if perhaps something is stuck in a queue?

Asynchronous web server calls in Silverlight and maximum HTTP connections

I've read that Silverlight 2.0 imposes by design an asynchronous model when communicating with the web server. I haven't had a chance to experiment with Silverlight, but I assume that it uses a thread-pool to manage threads like in the .NET Framework.
Now, since some browsers, most notably Internet Explorer, have a hard-coded limit of at most two concurrent HTTP connections to a given web server, what happens if I make a bunch of asynchronous requests from Silverlight?
Does Silverlight bypass this limitation in the web browser and open as many HTTP connections as there are threads available, or do the asynchronous requests queue up and wait for one of the two connections to become available?
In IE (haven't tested others) Silverlight is restricted to 2 connections at a time.
The behavior in Silverlight is to simply not make the request. So if you make 5 async web service requests right in a row, the first 2 will happen and the other three won't. No exception is thrown that I've seen...
Fiddler is a big help here :)
Create a messaging manager interface for your client. Any outgoing requests are posted to a queue that this manager processes. It processes queued messages serially: when the callback of the last message sent to the server is invoked, it can safely proceed to the next queued message.
You can consume the other connection resource by keeping a Comet connection open to the server. The server pushes any return messages to the client via this Comet connection. You'll need to stamp outgoing messages with a unique number that can be embedded as a property on incoming messages, so that results can be correlated to requests. The messaging manager then dispatches each result message to the appropriate handler.
Essentially you end up using two connection resources to establish bi-directional messaging. But there is no artificial limit on the number of requesters on the client (though requests get serially transmitted to the server). The act of sending is always fast, because you don't wait for any result to be computed - you just need to deliver the message reliably to the server and return. Results come back asynchronously on the other Comet connection.
We do something along these lines with our Flex client apps in conjunction with Adobe BlazeDS running in our Tomcat web server:
A Flex-based asynchronous stack
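The pattern itself is language-agnostic. For concreteness, here is a minimal sketch of the serializing messaging manager in Go (the answer's context is a Silverlight/.NET client, so treat the names and the fake in-process "send" as purely illustrative):

```go
// One worker goroutine serializes outgoing requests; replies are
// matched back to callers via a per-request channel standing in for
// the correlation id described above.
package main

import "fmt"

type request struct {
	id    int
	body  string
	reply chan string
}

type manager struct {
	queue chan request
}

func newManager() *manager {
	m := &manager{queue: make(chan request, 64)}
	go func() {
		for req := range m.queue {
			// One in-flight request at a time: "send", wait, proceed.
			req.reply <- fmt.Sprintf("result for #%d (%s)", req.id, req.body)
		}
	}()
	return m
}

func (m *manager) send(id int, body string) string {
	reply := make(chan string, 1)
	m.queue <- request{id: id, body: body, reply: reply}
	return <-reply // caller blocks; other callers queue behind
}

func main() {
	m := newManager()
	for i := 1; i <= 3; i++ {
		fmt.Println(m.send(i, "payload"))
	}
}
```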
Firefox is also limited to two connections, in addition to IE as stated already.
Note that the limit is per hostname.
If you add entries to your hosts file, or use dns aliases you can get more connections. For example in testing, add lines like '127.0.0.1 test1' to your hosts file, and then you can open two connections to http://localhost and two more to http://test1
I guess that, being a .NET application, Silverlight 2 has a connection limit independent of the browser's.
I would assume it is the maxconnection attribute in Machine.config, as mentioned in http://support.microsoft.com/kb/828219
Firstly, the Machine.config file would not be used, as the Silverlight control is sandboxed with its own version of the CoreCLR.
I believe that the Silverlight control actually makes use of the underlying browser to make the asynchronous HTTP requests. This is most likely the case considering that the Silverlight control can't gain access to SOAP fault information: the SOAP specification requires that the server return an HTTP 500 response code, and the Silverlight control doesn't get that from the browser hosting the control.
This post here serves to confirm this.
As to the limit of concurrent HTTP connections, I believe IE5 and later limit the number of connections to the same site based on the HTTP protocol version: for HTTP/1.0 it's limited to 4 connections, and for HTTP/1.1 to 2 connections. Most of the time the web server will limit the number of connections to 2 per client, queueing or discarding the remainder.
