MuleSoft RTF architecture and understanding of cores compared to CloudHub

Hi, we are planning to migrate from Mule 3 to Mule 4, and I have a few questions about sizing cores on CloudHub versus RTF.
Currently we have Mule runtimes installed on AWS (counted as on-premise): two VMs of 2 cores each, so a 4-core subscription. They are clustered as a server group, with all 40 applications deployed on both.
Question 1) My understanding is that we are using 2 cores to run the 40 applications and the other 2 cores for high availability. Let me know if this is correct. And if the same 40 apps have to be moved to CloudHub with HA, do I need 8 cores?
Coming to RTF, I guess we need 3 controller and 3 worker nodes. Suppose I take AWS VMs of 3-core capacity each: that is 3 x 3 = 9 cores in use, and I can deploy the same 40 applications (possibly more) on those 3 worker VMs. This is with high availability.
When it comes to CloudHub, if I need to deploy 40 apps with high availability (each app deployed on two 0.1-vCore workers), it would take 8 cores, and I could not deploy a single application beyond those 40.
Question 2) On RTF, even with a 4-core VM I can deploy 50 or 60 apps, but on CloudHub a 4-core subscription caps me at 40 apps. Is this correct?

Yes, you're right. Currently (Dec 2021), the minimum vCore allocation when deploying applications to CloudHub is 0.1 vCore, so to your first question: yes, correct, you would strictly need 8 vCores, assuming 2 workers per application for "somewhat" high availability. True end-to-end high availability would more likely need 3 workers, so that if one dies, you still have HA across the other two.
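To make the arithmetic concrete, here is a minimal sketch of the sizing calculation above (the 0.1-vCore floor and worker counts come from this answer; the rest is just multiplication):

```python
# Worker sizing math from the answer above (Dec 2021 CloudHub floor).
apps = 40
vcore_per_worker = 0.1   # smallest CloudHub worker size
workers_per_app = 2      # "somewhat" HA; 3 for true end-to-end HA

print(apps * vcore_per_worker * workers_per_app)  # 8.0 vCores
print(apps * vcore_per_worker * 3)                # 12.0 vCores with 3 workers
```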
To the second question: when you deploy to RTF, or even run the Mule runtime directly on, say, a VM or a container, you have more flexibility in how much of a vCore you allocate to each application. Your MuleSoft account manager would be able to work through with you what that means in your case.
Last but not least, you could also consider different deployment models and cost-saving approaches. Depending on your scenario, that could mean using, say, a service mesh to drastically reduce the number of vCores you use, or adopting a strategy of grouping the endpoints/resources of different applications into a single one. For example, if you have two applications that both deal with customer data, or otherwise belong to the same domain, you could group them together.
Ed.


Which M.. tier do I need for my app (MongoDB)?

I’m new to MongoDB,
I have an Ionic app for a local restaurant, with some products that you can order. The app also has a registration flow to create users. There is also an Angular web app where you can add products, look up users, etc.
Both apps are connected to MongoDB. Unfortunately, I have no idea which data plan is necessary for deploying these two apps.
Is it maybe better to switch to Firebase?
Can anybody help me please?
Best regards
Basti
Selecting a tier in MongoDB Atlas depends on various factors like data size, IOPS, and price. Since you say this is for a local restaurant, I would assume fairly low traffic to the app; in that case you can go with M10, because that is where MongoDB Atlas really starts to provide valuable features for a production database. For a development environment, you can try an M5 cluster. Some features you get with M10 or above are:
Dedicated cluster: each mongod process is deployed to its own instance, whereas M0, M2, and M5 run in a shared environment, where Atlas automatically upgrades the cluster to the latest version as soon as one is available. That is not ideal for production apps, since an upgrade could break some functionality or package.
Queryable backups: you can query a specific continuous-backup snapshot, which is really helpful for restoring part of the data instead of the entire dataset backed up a day ago.
Network peering: since most projects nowadays deploy apps on cloud platforms, it helps that clusters >= M10 support network peering.
Metrics & Performance Advisor: one of the most valuable benefits of clusters >= M10. Alerts tell you which kinds of queries are taking a long time and how many connections are open at a given time, and let you monitor CPU thresholds; additionally, MongoDB can suggest indexes to create for queries that run against collections without a usable index (see the sketch below).
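For illustration, a minimal pymongo sketch of acting on that kind of index suggestion; the connection string, database, collection, and field names are all placeholders, not from the question:

```python
from pymongo import MongoClient, ASCENDING

# Placeholder URI; Atlas shows you the real connection string.
client = MongoClient("mongodb+srv://<user>:<pass>@cluster0.example.mongodb.net")
orders = client["restaurant"]["orders"]

# Create the kind of compound index the Performance Advisor might
# suggest for a query that filters on user_id and sorts by created_at.
orders.create_index([("user_id", ASCENDING), ("created_at", ASCENDING)])

# This query can now be served by the index instead of a full scan.
recent = orders.find({"user_id": 42}).sort("created_at", -1).limit(10)
```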
At the end of the day, most other features are roughly the same across tiers. In my experience, you usually estimate and pre-pay a certain amount on your MongoDB Atlas account for around three years, and you get nothing back if you have not used all of it. You can also upgrade and downgrade clusters at any time, either manually or automatically, scaled up or down based on incoming traffic.
Ref: cluster-tier

Google Cloud Spanner - technical implication of 3 node recommendation in production

I'm wondering whether there is any technical reason behind that recommendation.
It can't be about split-brain, because even when you configure 1 node there are 3 instances in total, running in different data centers within the configured region, each holding all the data.
So when you configure 3 nodes, Google will run 9 instances in total, sharding the data as well.
I would be happy if anybody could provide information on what's behind that recommendation.
Thanks,
Christian
"when configuring 1 node there are 3 instances"
Quickly correcting terminology to make sure we're on the same page: 1 node still has 3 replicas. Or, more fully: "When configuring your instance to have only 1 node, there are still 3 replicas." When you have 3 nodes, you still have 3 replicas; however, each replica will have more physical machines assigned to it.
The reason 3 nodes is the recommended minimum for a 'production grade' instance comes down to the number of machines assigned. When working with highly available systems, we need to consider the probability of a replica being unavailable (zone maintenance, unforeseen network issues, etc.) combined with the probability of other simultaneous issues, such as physical machine failures.
It's true that even with 1 node, Cloud Spanner can continue to operate if a replica becomes unavailable. However, to provide Cloud Spanner's uptime guarantees, 3 nodes reduce the probability that physical machine failures become catastrophic to the overall system.
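As a rough illustration of that probability argument, here is a minimal sketch; the failure probabilities are made-up numbers and real replicas are not fully independent, so this only shows why lowering per-replica unavailability matters:

```python
from math import comb

def quorum_loss(p, replicas=3):
    """Probability that a majority of replicas are down at once,
    assuming each replica fails independently with probability p."""
    majority = replicas // 2 + 1
    return sum(comb(replicas, k) * p**k * (1 - p)**(replicas - k)
               for k in range(majority, replicas + 1))

# With 1 node, a single machine failure can take a whole replica down;
# with 3 nodes each replica spans more machines, so its effective
# unavailability p is lower, and quorum loss drops roughly with p^2.
print(quorum_loss(0.01))   # ~3.0e-04
print(quorum_loss(0.001))  # ~3.0e-06
```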

Is CakePHP on Amazon Web Services (Free Tier), a good fit?

I have a 2 GB database and a front end that will likely handle 10-15 hits during the day. Is the AWS free tier (with MySQL on RDS) a good place to start?
Will CakePHP apps encounter timeouts or other resource issues due to the sizing of the micro instance?
Micro Instance (from Amazon): Micro instances (t1.micro) provide a small amount of consistent CPU resources and allow you to increase CPU capacity in short bursts when additional cycles are available. They are well suited for lower throughput applications and web sites that require additional compute cycles periodically. You can learn more about how you can use Micro instances and appropriate applications in the Amazon EC2 documentation.
Micro Instance: 613 MiB of memory, up to 2 ECUs (for short periodic bursts), EBS storage only, 32-bit or 64-bit platform.
If you're only getting a very small amount of hits, you can probably run your application and mysql database on a micro instance.
The micro will be free, but you will have to pay for the RDS.
You should not notice any issues - we do most of our testing on micros, and our database is larger than yours.
It will work perfectly for your scenario.
I have myself deployed applications for other clients with at least twice your requirements, and they worked fine.
If your application saves and retrieves files on disk, I would suggest giving Amazon S3 a try.
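A minimal sketch of what that looks like with boto3, AWS's current Python SDK (which postdates this question); the bucket name and paths are placeholders, and credentials are assumed to be configured in the environment:

```python
import boto3

s3 = boto3.client("s3")

# Store an uploaded file in S3 instead of on the instance's disk...
s3.upload_file("/tmp/menu.pdf", "my-app-uploads", "menus/menu.pdf")

# ...and fetch it back later, from any instance.
s3.download_file("my-app-uploads", "menus/menu.pdf", "/tmp/menu.pdf")
```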

Is Google App Engine-MapReduce my best bet for a massively parallel solution in a cloud?

Is Google App Engine-MapReduce my best bet for a massively parallel solution in a cloud? My problem takes hours multi-threaded on a 4-core PC; I'd say 600 minutes might do it. I would prefer 1000 servers getting it done in 36 seconds. Switching from 4-core threading to 1000-server processing is eminently doable in my app. In fact, I can already send 1000 small jobs to 4 cores, but that's not going to finish sooner than 4 big jobs on 4 cores, given that I still have only 4 cores. (My dataset is small, so MapReduce, which was designed for large datasets, might have a different sweet spot than my type of compute-bound problem.)
I think I could get this done if I had 1000 simultaneous URL fetches, but as you may know, Google limits you to 10 requests. It seems Google is actively discouraging outsiders from putting massively parallel solutions on its infrastructure.
I started looking into Google App Engine because at deployment there will be very few users, and App Engine appeared to have fine-grained costs, a feature I really like. My impression was that Amazon EC2 would be more work, and also that its costs were more likely to be chunky. Given that I'm a home-based business, I don't want to pay more than a nominal amount in the early months, when I don't expect a lot of visitors to my website. Maybe they will never visit.
In general, where do people turn to for massively parallel (compute-bound) problems that ought to be served by a cloud?
For compute-bound tasks, EC2 is often better than App Engine. App Engine is focused on serving web requests, not pure number crunching. It is not designed to go from 0 requests this minute to 1000 requests the next and back to 0 the minute after. In fact, one of its features is that you generally don't need explicit control over how many instances are running at once. Also, long-running jobs are not possible, though for many tasks you can use Task Queues to chain jobs together. I think the current limit on background tasks is 10 minutes.
EC2 does have a super low tier of service that you can get for free. EC2 lets you explicitly bring servers up and down, but I think the smallest increment you can pay for is 1 hour.
Of course, if you want to literally run your job on 1000 servers, neither App Engine nor EC2 will likely let you do that for free. Both are very elastic/adaptive, but bringing 1000 servers up for 30 seconds of work is not very economical for them. On App Engine you will likely run up against an hourly or daily quota before you have 1000 simultaneous instances running. On EC2, you generally pay by the server instance, so you would be paying for 1000 hours of instance time. Of course, one of Amazon's High-CPU instances might be much more powerful than your PC, so maybe you'd only need 100 or so. Or maybe you could compromise and have only 20 instances running at a time, meaning it takes a few minutes to finish your computation, but you don't go broke.
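A back-of-the-envelope version of that trade-off; the hourly rate is a made-up placeholder, and it assumes EC2's then-current billing of whole instance-hours:

```python
import math

hourly_rate = 0.10       # assumed USD per instance-hour (placeholder)
total_work_hours = 40.0  # ~600 minutes on 4 cores is ~40 core-hours

for n in (1000, 100, 20):
    wall_hours = total_work_hours / n
    billed = n * math.ceil(wall_hours)  # each instance billed >= 1 hour
    print(f"{n:4d} instances: ~{wall_hours * 60:5.1f} min wall time, "
          f"${billed * hourly_rate:.2f}")
# 1000 instances finish fastest but are billed 1000 instance-hours;
# 20 instances take longer but cost a fraction as much.
```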
Have you checked Amazon's Elastic MapReduce? http://aws.amazon.com/elasticmapreduce/
With App Engine you should also investigate the Task Queues. If you already know how to split the big problem into many small ones, you could create one task that takes in the big problem and then creates 1000 (or 10,000) subtasks to tackle the smaller problems, and after that collect the results in one task, if needed.
Individual tasks can run up to 10 minutes before they are terminated, which makes them a little bit easier to use for computing tasks than regular requests.
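A minimal fan-out sketch against the (Python 2-era) App Engine Task Queue API; the handler paths and parameters are illustrative, not from the answer:

```python
from google.appengine.api import taskqueue

def fan_out(job_id, num_chunks=1000):
    # One task per sub-problem; each runs as its own request, subject
    # to the ~10-minute task deadline mentioned above.
    for i in range(num_chunks):
        taskqueue.add(url='/work', params={'job': job_id, 'chunk': i})
    # A final task can collect the partial results once the chunks
    # have reported back.
    taskqueue.add(url='/collect', params={'job': job_id})
```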

Are there any "gotchas" in deploying a Cassandra cluster to a set of Linode VPS instances?

I am learning about the Apache Cassandra database [sic].
Does anyone have any good/bad experiences with deploying Cassandra to less than dedicated hardware like the offerings of Linode or Slicehost?
I think Cassandra would be a great way to scale a web service easily to meet read/write/request load... just add another Linode running a Cassandra node to the existing cluster. Yes, this implies running the public web service and a Cassandra node on the same VPS (which many may take exception to).
Pros of Linode-like deployment for Cassandra:
Private VLAN; the Cassandra nodes could communicate privately
An API to provision a new Linode (and perhaps configure it with a "StackScript" that installs Cassandra and its dependencies, etc.; see the sketch after this question)
The price is right
Cons:
Each host is a VPS and is not dedicated of course
The RAM/cost ratio is not that great once you decide you want 4GB RAM (cf. dedicated at say SoftLayer)
Only one disk, where one would prefer two, I suppose (one for the commit log and another for the data files themselves). Probably moot since this is shared hardware anyway.
EDIT: found this which helps a bit: http://wiki.apache.org/cassandra/CassandraHardware
I see that 1 GB is the minimum, but is that a recommendation? Could I deploy with a Linode 720, for instance (say, 500 MB usable for Cassandra)? See http://www.linode.com/
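To illustrate the provisioning point from the pros list, here is a hypothetical sketch against Linode's current REST API (v4, which postdates this question); the token, region, instance type, and stackscript_id are placeholders, and the StackScript is assumed to install Cassandra and join it to the cluster:

```python
import requests

resp = requests.post(
    "https://api.linode.com/v4/linode/instances",
    headers={"Authorization": "Bearer <api-token>"},
    json={
        "label": "cassandra-node-4",
        "region": "us-east",
        "type": "g6-standard-2",
        "image": "linode/ubuntu22.04",
        "root_pass": "<strong-password>",
        "stackscript_id": 12345,  # placeholder Cassandra StackScript
    },
)
resp.raise_for_status()
print(resp.json()["ipv4"])  # address of the freshly provisioned node
```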
How much RAM you need really depends on your workload: if you are write-mostly you can get away with less; otherwise you will want RAM for the read cache.
You do get more RAM for your money at my employer, Rackspace Cloud: http://www.rackspacecloud.com/cloud_hosting_products/servers/pricing. (Our machines also have RAID disks, so people typically see better I/O performance vs. EC2. Dunno about Linode.)
Since with most VPSes you pay roughly 2x for the next-size instance, i.e., about the same as adding a second small instance, I would recommend going with fewer, larger instances rather than more, smaller ones, since at small cluster sizes the network overhead is not negligible.
I do know someone using Cassandra on 256 MB VMs, but you're definitely in the minority if you go that small.
