About Ringpop, application-layer sharding used at Uber - database

https://ringpop.readthedocs.org/en/latest/
To my understanding, the sharding can be implemented in library routines, and the application programs are just linked against the library. If the library is an RPC client, the sharding can be queried from the server side in real time, so even if there is a new partition it is transparent to the applications.
Ringpop is an application-layer sharding strategy based on the SWIM membership protocol. I wonder what the major advantage of sharding at the application layer is.
What does the other side look like, say, sharding at the system layer?
Thanks!

Maybe a bit late for this reply, but maybe someone still needs this information.
Ringpop introduces the idea of 'sharding' the application rather than the data. It works more or less like an application-level middleware, but with the advantage that it offers an easy way to build scalable and fault-tolerant applications.
What Ringpop shards are the requests coming from clients to a specific service. This is one of its major advantages (there are more, keep reading).
In a traditional SOA architecture, all requests for a specific service go to a single system that dispatches them among the workers for load balancing. These workers do not know each other; they are independent entities and cannot communicate with each other. They do their job and send back a reply.
Ringpop is the opposite: the workers know each other, can discover new ones, regularly talk among themselves to check each other's health, and spread this information to the other workers.
How does Ringpop shard requests?
It uses the concept of keyspaces. A keyspace is just a range of numbers; you are free to choose whatever range you like, but the obvious choice is to hash the IDs of the objects in the application and use the hash function's codomain as the range.
A keyspace can be imagined as a hash "ring", but in practice it is just a 4- or 8-byte integer range.
A worker, i.e. a node that can serve requests for a specific service, is 'virtually' placed on this ring: it owns a contiguous portion of the ring, i.e. it is assigned a sub-range. A worker is in charge of handling all the requests belonging to its sub-range. Handling a request means one of two things:
- process the request and provide a response, or
- forward the request to another service that actually knows how to serve it
Every application is built with this behaviour embedded: there is logic either to handle a request or to forward it to another worker that can handle it. The forwarding mechanism is nothing more than a remote procedure call, which is actually made using TChannel, Uber's high-performance networking protocol for general RPC.
If you think about this, you can see that Ringpop offers a very nice property that traditional SOA architectures do not have: clients don't need to know or care about the correct instance that can serve their request. They can send a request to any node in the Ringpop ring, and the receiving worker will either serve it or forward it to the right owner.
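To make the ring and the handle-or-forward step concrete, here is a minimal sketch. This is not Ringpop's actual API; the hashing scheme, the addresses, and the process/forward helpers are all assumptions for illustration:

```python
import bisect
import hashlib

ME = "10.0.0.1:3000"   # this worker's address (hypothetical)

def hash32(key):
    """Map a key onto the 32-bit keyspace (the 'ring')."""
    return int(hashlib.md5(key.encode()).hexdigest()[:8], 16)

class Ring(object):
    """Each worker owns the slice of the keyspace up to its own position."""
    def __init__(self, workers):
        self.points = sorted((hash32(w), w) for w in workers)

    def owner(self, key):
        idx = bisect.bisect(self.points, (hash32(key),))
        return self.points[idx % len(self.points)][1]   # wrap around the ring

ring = Ring([ME, "10.0.0.2:3000", "10.0.0.3:3000"])

def process_locally(payload):
    return {"handled_by": ME, "payload": payload}

def forward(owner, object_id, payload):
    # Placeholder: in Ringpop this would be a TChannel RPC to `owner`.
    return {"forwarded_to": owner, "object_id": object_id}

def handle_request(object_id, payload):
    owner = ring.owner(object_id)
    if owner == ME:
        return process_locally(payload)   # the key falls in our sub-range
    return forward(owner, object_id, payload)
```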
Ringpop has another interesting feature: new workers can dynamically join the ring and old workers can leave it (e.g. because of a crash or just a shutdown) without any service interruption.
Ringpop implements a membership protocol based on SWIM.
It enables workers to discover one another and to exclude a broken worker from the ring, using a TCP-based gossip protocol. When a new worker is discovered by another worker, a new connection is established between them. Every worker tracks the status of the other workers by sending a ping request at regular intervals, and spreads status information to the other workers when a ping does not get a reply (membership updates are piggybacked on pings, i.e. gossip-based dissemination).
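A toy sketch of that failure-detection loop is below. The real SWIM protocol also uses indirect ping-req probes and a suspicion timeout before declaring a member faulty; the transport here is just a stub:

```python
import random
import time

# What this node currently believes about its peers (address -> status).
members = {"10.0.0.2:3000": "alive", "10.0.0.3:3000": "alive"}
recent_updates = []   # status changes to piggyback on the next pings

def send_ping(peer, piggybacked_updates):
    # Placeholder transport: a real implementation sends the ping plus the
    # piggybacked updates over TCP and returns False if no ack arrives in time.
    return True

def protocol_period():
    peer = random.choice(list(members))
    if send_ping(peer, recent_updates[-5:]):
        return
    # No ack: mark the peer as suspect and gossip that fact to everyone else.
    members[peer] = "suspect"
    recent_updates.append((peer, "suspect"))

if __name__ == "__main__":
    while True:
        protocol_period()
        time.sleep(1.0)   # one protocol period per second
```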
These three elements (consistent hashing, request forwarding, and a membership protocol) make Ringpop an interesting solution for promoting scalability and fault tolerance at the application layer while keeping complexity and operational overhead to a minimum.

Related

10,000 HTTP requests per minute performance

I'm fairly experienced with web crawlers; however, this question is about performance and scale. I need to request and crawl 150,000 URLs on an interval (most URLs are crawled every 15 minutes, which works out to about 10,000 requests per minute). These pages have a decent amount of data (around 200 KB per page). Each of the 150,000 URLs exists in our database (MSSQL) with a timestamp of the last crawl date and an interval so we know when to crawl again.
This is where we get an extra layer of complexity. They do have an API which allows up to 10 items per call. The information we need exists partially only in the API and partially only on the web page. The owner is allowing us to make web calls and their servers can handle it; however, they cannot update their API or provide direct data access.
So the flow should be something like: get 10 records from the database whose intervals have passed and which need to be crawled, then hit the API. Then each item in the batch of 10 needs its own separate web request. Once the request returns the HTML, we parse it and update the records in our database.
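For illustration only, one worker iteration of that flow might look roughly like this; the table, column names, and vendor endpoint are assumptions:

```python
import pyodbc
import requests

conn = pyodbc.connect("DSN=crawler")   # hypothetical DSN

def parse(html, api_data):
    # Placeholder: pull the fields you need out of the HTML + API payload.
    return html[:200]

def crawl_one_batch():
    cur = conn.cursor()
    # 1. Ten records whose interval has elapsed (claiming them safely across
    #    servers is sketched further below under the business requirements).
    rows = cur.execute(
        "SELECT TOP 10 Id, Url FROM CrawlTargets "
        "WHERE DATEADD(minute, IntervalMinutes, LastCrawledUtc) <= GETUTCDATE()"
    ).fetchall()

    # 2. One API call covers the whole batch of 10 ids.
    api_data = requests.get(
        "https://vendor.example/api/items",            # hypothetical endpoint
        params={"ids": ",".join(str(r.Id) for r in rows)},
        timeout=30,
    ).json()

    # 3. One web request + parse per record, then write the results back.
    for row in rows:
        html = requests.get(row.Url, timeout=30).text
        cur.execute(
            "UPDATE CrawlTargets SET LastCrawledUtc = GETUTCDATE(), Data = ? "
            "WHERE Id = ?",
            parse(html, api_data), row.Id,
        )
    conn.commit()
```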
I am interested in getting some advice on the correct way to handle the infrastructure. Assuming a multi-server environment, some business requirements:
Once a URL record is ready to be crawled, we want to ensure it is only grabbed and run by a single server. If two servers check it out simultaneously and run, it can corrupt our data (see the claim-query sketch after this list).
The workload can vary; currently it is 150,000 URL records, but that can go much lower or much higher. While I don't expect more than a 10% change per day, having some sort of auto-scaling would be nice.
After each request returns the HTML, we need to parse it and update records in our database with the individual data pieces. Some host providers allow free incoming data but charge for outgoing, so ideally the code base that requests the web page and then parses the data also has direct SQL access (as opposed to a micro-service approach).
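On the first requirement, a common pattern (whatever queue you end up with) is to let SQL Server itself hand out work atomically, so two servers can never claim the same row. A hedged sketch with assumed table and column names:

```python
import pyodbc

# UPDATE ... OUTPUT claims rows and returns them in one atomic statement;
# READPAST lets other servers skip rows that are already locked instead of
# blocking on them, so two servers never grab the same record.
CLAIM_SQL = """
UPDATE TOP (10) CrawlTargets WITH (ROWLOCK, READPAST, UPDLOCK)
SET    ClaimedBy = ?, ClaimedAtUtc = GETUTCDATE()
OUTPUT inserted.Id, inserted.Url
WHERE  ClaimedBy IS NULL
  AND  DATEADD(minute, IntervalMinutes, LastCrawledUtc) <= GETUTCDATE();
"""

def claim_batch(conn, server_name):
    cur = conn.cursor()
    rows = cur.execute(CLAIM_SQL, server_name).fetchall()
    conn.commit()
    return rows
```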
Something like a multi-server blocking collection (Azure queue?), auto-scaling VMs that poll the queue, and a single database host server which is also queried by an MVC app that displays data to users.
Any advice or critique is greatly appreciated.
Messaging
I echo Evandro's comment and would explore Service Bus Message Queues or Event Hubs for loading a queue to be processed by your compute nodes. Message Queues support record locking, which, based on your write-up, might be attractive.
Compute Options
I also agree that Azure Functions would provide a good platform for scaling your compute/processing operations (calling the API and scraping the HTML). In addition, Azure Functions can be triggered by Message Queues, Event Hubs, or Event Grid. [Note: Event Grid allows you to connect various Azure services (pub/sub) with durable messaging, so it might play a helpful middle-man role in your scenario.]
Another option for compute could be Azure Container Instances (ACI), as you could spin up containers on demand to process your records. ACI does not have the same auto-scaling capability that Functions does, though, and also does not support the direct binding operations.
Data Processing Concern (Ingress/Egress)
Indeed, Azure does not charge for data ingress, but any data leaving Azure will incur an egress charge after the initial 5 GB each month. [https://azure.microsoft.com/en-us/pricing/details/bandwidth/]
You should be able to have Azure Functions handle calling the API, scraping the HTML, and writing to the database. You might have to break those up into separate Functions, but you can chain Functions together easily, either directly or with Logic Apps.
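As a rough illustration only, a queue-triggered Function skeleton in the Python programming model could look like this; the binding and names are assumptions, and the real processing steps are elided:

```python
import json
import azure.functions as func

# Queue-triggered Function (v1 Python model); the queue binding for `msg`
# is declared in this function's function.json, not shown here.
def main(msg: func.QueueMessage) -> None:
    record = json.loads(msg.get_body().decode("utf-8"))
    # ... call the vendor API for this record's batch,
    #     fetch and parse the HTML,
    #     then write the parsed fields back to SQL ...
```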

Reliable network all-to-all communication: Message Bus or Publish/Subscribe or Multicast/PGM?

I have a particle simulation program and want to split it up onto multiple machines in a LAN.
Each server node calculates a number of particles' positions and needs to send the updated positions (i.e. the same data) to all other server nodes. So each server node needs to be connected to all the others.
This reliable all-to-all communication should work without a central server/message broker and should have low latency. Also, sending the same data multiple times over the network should be avoided if possible (as with multicast).
I looked around, and ZeroMQ (which supports Bus, Pub/Sub, and PGM) as well as Nanomsg (which supports Bus and Pub/Sub, but not PGM) look like great libraries for this problem.
Which networking technique would be suited best in this case, a Message Bus, Publish/Subscribe or Multicast/PGM?
Pub/Sub can be implemented over multicast: the use of PGM as a transport is transparent to a ZeroMQ client. ZeroMQ works hard to hide implementation details from the client, which is one of the reasons why it works so well. But don't let the simplicity fool you: ZeroMQ is an expertly engineered messaging solution that is very powerful and flexible. It is fast, efficient, and can handle really huge numbers of messages with ease.
ZeroMQ is used in some impressively large deployments; at my workplace we use ZeroMQ to handle over 200k messages per second. We have found ZeroMQ to scale effortlessly; the library is well engineered and optimised (no memory leaks, good performance) and has proven itself to work very well regardless of what we throw at it.
In ZeroMQ, publish/subscribe is done on user-defined topics, which dictate what data is sent to which connected clients. So if I have 10 clients connected to my publish socket, 9 subscribe to a topic called "A", 1 client subscribes to topic "B", and I send a message with the topic "B", then only the client subscribed to topic "B" will receive the message. ZeroMQ performs the filtering of pub/sub messages both at the point of transmission (to avoid wasting bandwidth) and at the point of receipt (to avoid possible race conditions when a topic is unsubscribed). It is also possible to subscribe to more than one topic.
To implement the mesh messaging system you describe, I recommend creating two sockets on each node in the cluster: one for receiving messages from all the other nodes, and one for sending messages to all the other nodes. If you do not need topics, then subscribing with an empty ("") topic prefix will let a node receive all messages.
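A minimal pyzmq sketch of that layout, with one PUB socket for broadcasting this node's updates and one SUB socket connected to every peer (the addresses are placeholders; a PGM transport would use epgm:// endpoints instead of tcp://):

```python
import zmq

MY_PORT = 5555
PEERS = ["tcp://10.0.0.2:5555", "tcp://10.0.0.3:5555"]   # every other node

ctx = zmq.Context()

# Socket 1: broadcast this node's particle updates to all peers.
pub = ctx.socket(zmq.PUB)
pub.bind("tcp://*:%d" % MY_PORT)

# Socket 2: receive updates from every other node.
sub = ctx.socket(zmq.SUB)
for peer in PEERS:
    sub.connect(peer)
sub.setsockopt_string(zmq.SUBSCRIBE, "")   # empty prefix = receive everything

def publish_positions(payload: bytes):
    pub.send_multipart([b"positions", payload])

def drain_updates():
    """Yield every update that has arrived since the last call."""
    while sub.poll(timeout=0):
        topic, payload = sub.recv_multipart()
        yield payload
```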

Scaling WebSockets on Google Compute Engine

I would like to implement a chat system as part of a game I am developing on App Engine. To implement this, I would like to use WebSockets and have clients connect to each other through a hub, in this case an instance on GCE. Assuming this game needed to scale to multiple instances on GCE, how would this work? If the load balancer directed client 1's request to instance A, and another client (2) came in and was directed to instance B, but those clients wanted to chat with each other, they would each be connected to different hubs and unable to reach each other. How would this be set up to work at scale? Would I implement it using queues, where each instance listens on that queue, and if so, how would I do that?
Google Play Game Services offers exactly the functionality that you want, but with regard to Android and iOS clients, so this option may not be compatible with your game's tech design.
In general your reasoning is correct. Messages from clients who want to talk to each other will most of the time hit different server instances. What you want to do is make the instances handle the communication between users. Pub/sub (the publish-subscribe pattern) is a very suitable pattern in this scenario. Roughly:
whenever there's a message directed to client X, a message is published on channel X,
whenever client X creates a session, the instance handling it subscribes to channel X.
You can use one of many existing solutions to start with. It's very easy to set this up using Redis. If you need something more low-level and flexible, check out ZeroMQ.
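For example, a bare-bones redis-py version of the two rules above might look like this; the channel naming and the websocket object are assumptions:

```python
import redis

r = redis.Redis(host="localhost", port=6379)

# On the instance that holds client X's WebSocket connection:
pubsub = r.pubsub()
pubsub.subscribe("client:X")            # one channel per connected client

def forward_pending(websocket):
    """Push any messages published for client X down its WebSocket."""
    msg = pubsub.get_message(timeout=0.1)
    if msg and msg["type"] == "message":
        websocket.send(msg["data"])     # `websocket` is whatever your server uses

# On whichever instance received a chat message addressed to client X:
r.publish("client:X", "hello from client 2")
```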
You can expect a single instance of either solution to be able to handle thousands of QPS.
Unfortunately, I don't have any experience with scaling either of these solutions, so I can't offer you any practical advice as to the limits of their scalability.
PS. There are also other topics you may want to explore, such as message persistence and failure recovery, which I didn't address here at all.
I haven't tried to implement this yet, but I'll probably have to soon; I think it should be fairly simple to handle it yourself.
You have server 1 with a list of clients and server 2 with another list of clients,
so if a client wants to send data to another client which might be on server 2, you have to:
Look up whether the receiver is on the current server - if it is, you just send it (standard)
Otherwise you send the same data to all the other servers, so they can check their lists for the particular client (or clients) and send the data to them.

distributed system: sql server broker implementation

I have a distributed application consisting of 8 servers, all running .NET Windows services. Each service polls the database for work packages that are available.
The polling mechanism is important for other reasons (too boring for right now).
I'm thinking that this polling mechanism would be best implemented as a queue, as the .NET services will all be polling the database regularly, and under load I don't want deadlocks.
I'm thinking I would want each .NET service to put a message into an input queue. The database server would pop each message off the input queue one at a time, process it, and put a reply message on another queue.
The issue I am having is that most examples of SQL Server Service Broker (SSB) are between database services and not initiated from a .NET client. I'm wondering if Service Broker is just the wrong tool for this job. I see that the broker T-SQL DML is available from .NET, but the way I think this should work doesn't seem to fit with SSB.
I think that I would need a single SSB service with 2 queues (in and out) and a single activation stored procedure.
This doesn't seem to be the way SSB works; am I missing something?
You got the picture pretty much right, but there are some missing puzzle pieces in that picture:
SSB is primarily a communication technology, designed to deliver messages across the network with exactly-once-in-order (EOIO) semantics, in a fully transactional fashion. It handles network connectivity (authentication, traffic confidentiality and integrity) as well as acknowledgement and retry logic for transmission.
Internal Activation is a unique technology in that it eliminates the requirement for a resident service to poll the queue. Polling can never achieve the dynamic balance needed for low latency and low resource consumption under light load: it forces either high latency (infrequent polling to save resources) or high resource utilization (frequent polling to provide low latency). Internal activation also has a self-tuning capability to ramp up more processors in response to spikes in load (via max_queue_readers), while still being able to scale processing back down under low load by deactivating processors. One of the often overlooked advantages of the Internal Activation mechanism is that it is fully contained within the database, i.e. it fails over with a cluster or database mirroring failover, and it travels with the database in backups and copy-attach operations. There is also an External Activation mechanism, but in general I much prefer the internal one for anything that fits an internal context (e.g. not an HTTP request, which must be handled outside the engine process).
Conversation Group Locking is again unique and is a means to provide exclusive access to correlated message processing. An application can take advantage of this by using the conversation_group_id as a business logic key, and this pretty much completely eliminates deadlocks, even under heavy multithreading.
There is also one thing you got wrong about Service Broker: the need to put a response into a separate queue. Unlike most queueing products you may be familiar with, SSB's primitive is not the message but the 'conversation'. A conversation is a fully duplex, bidirectional communication channel; you can think of it much like a TCP socket. An SSB service does not need to 'put responses in a queue'; instead it can simply send the response on the conversation handle of the received message (much like a TCP socket server responds by issuing a send on the same socket it got the request from, not by opening a new socket to send the response). Furthermore, SSB takes care of the inherent message correlation, so the sender will know exactly which request each response belongs to, since the response comes back on the same conversation handle the request was sent on (much like a TCP client receives the response from the server on the same socket it sent the request on). This conversation handle matters again for the correlated locking of conversation groups (see the Conversation Group Locking point above).
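To make the 'reply on the same conversation' point concrete, here is a hedged sketch of what the receiving side might execute, shown through a generic ODBC connection; the queue name and message type names are made up, and the same batch could equally live inside an internal activation procedure:

```python
import pyodbc

conn = pyodbc.connect("DSN=appdb", autocommit=False)   # hypothetical DSN

# Receive one request from the service queue and answer it on the SAME
# conversation handle; no separate "reply" queue is needed.
conn.cursor().execute("""
DECLARE @handle UNIQUEIDENTIFIER, @msg_type SYSNAME, @body VARBINARY(MAX);

WAITFOR (
    RECEIVE TOP (1)
        @handle   = conversation_handle,
        @msg_type = message_type_name,
        @body     = message_body
    FROM dbo.WorkQueue                     -- hypothetical queue name
), TIMEOUT 5000;

IF @msg_type = N'//app/Request'            -- hypothetical message type
    SEND ON CONVERSATION @handle
        MESSAGE TYPE [//app/Reply] (@body);
""")
conn.commit()
```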
You can embed .NET logic into Service Broker processing via SQLCLR. Almost all SSB applications have a non-SSB service at at least one end, directly or indirectly (e.g. via a trigger); a distributed application entirely contained in the database is of little use.

working with new channel creation limits

Google App Engine seems to have recently made a huge decrease in the free quota for channel creation, from 8640 to 100 per day. I would appreciate some suggestions for optimizing channel creation, for a hobby project where I am unwilling to use the paid plans.
It is specifically mentioned in the docs that there can be only one client per channel ID. It would help if there were a way around this, even if it were only for multiple clients on one computer (such as multiple tabs).
It occurred to me I might be able to simulate channel functionality by repeatedly sending XHR requests to the server to check for new messages, thereby bypassing the limits. However, I fear this method might be too slow. Are there any existing libraries that work on this principle?
One Client per Channel
There's not an easy way around the one client per channel ID limitation, unfortunately. We actually allow two, but this is to handle the case where a user refreshes his page, not for actual fan-out.
That said, you could certainly implement your own workaround for this. One trick I've seen is to use cookies to communicate between browser tabs. Then you can elect one tab the "owner" of the channel and fan out data via cookies. See this question for info on how to implement the inter-tab communication: Javascript communication between browser tabs/windows
Polling vs. Channel
You could poll instead of using the Channel API if you're willing to accept some performance trade-offs. Channel API delivery speed is on the order of 100-200 ms; if you can accept a 500 ms average, then you could poll every second. Depending on the type of data you're sending, and how much of it you can fit in memcache, this might be a workable solution. My guess is your biggest problem is going to be instance-hours.
For example, if you have, say, 100 clients you'll be looking at 100 QPS. You should experiment and see if you can serve 100 requests in a second for the data you need to serve without spinning up a second instance. If not, keep increasing your latency (i.e., decreasing your polling frequency) until you get down to one instance able to serve your requests.
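For what it's worth, a minimal polling endpoint on the old Python runtime might look like the sketch below; the memcache key scheme and the route are assumptions:

```python
import json
import webapp2
from google.appengine.api import memcache

class PollHandler(webapp2.RequestHandler):
    """Clients hit /poll?user=<id> about once a second instead of holding a channel."""
    def get(self):
        user = self.request.get("user")
        # Messages for this user were stashed in memcache by whoever sent them.
        pending = memcache.get("msgs:%s" % user) or []
        if pending:
            memcache.delete("msgs:%s" % user)   # not race-free; fine for a sketch
        self.response.headers["Content-Type"] = "application/json"
        self.response.write(json.dumps(pending))

app = webapp2.WSGIApplication([("/poll", PollHandler)])
```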
Hope that helps.
