I currently have a web application that, on each page request, fetches the currently logged-in user's data from a database.
This web application could have approximately 30,000 concurrent users.
My question is: would it be best to cache this data, for example in C# using System.Web.HttpRuntime.Cache.Add, or would that cripple the server's memory by storing up to 30,000 user objects in memory?
Would it be better not to cache, and just get the required data from the database on each request?
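For reference, here is a minimal sketch of the kind of per-user caching the question describes, using HttpRuntime.Cache with a sliding expiration. The UserProfile type and the database-loading delegate are placeholders, not part of the original application; the point is that a sliding expiration bounds memory by the users active within the window rather than by all 30,000 accounts.

```csharp
using System;
using System.Web;
using System.Web.Caching;

public class UserProfile          // stand-in for the application's user data
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class UserCache
{
    // Sliding expiration keeps frequently active users cached and lets idle
    // entries fall out, so memory holds active users rather than every account.
    private static readonly TimeSpan SlidingExpiration = TimeSpan.FromMinutes(10);

    public static UserProfile GetUser(int userId, Func<int, UserProfile> loadFromDatabase)
    {
        string key = "user:" + userId;
        var cached = HttpRuntime.Cache[key] as UserProfile;
        if (cached != null)
            return cached;

        UserProfile user = loadFromDatabase(userId);   // fall back to the database on a miss
        HttpRuntime.Cache.Add(
            key,
            user,
            null,                            // no cache dependency
            Cache.NoAbsoluteExpiration,      // no fixed expiry time
            SlidingExpiration,               // evict after 10 minutes of inactivity
            CacheItemPriority.Normal,
            null);                           // no removal callback
        return user;
    }
}
```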
At that scale you need an explicit caching and scaling strategy; hacking a cache together is not the same as planning one, and a hacked-together cache will fail.
Caching is highly dependent upon the data. Does the data change frequently? What ratio of reads-to-writes are you going to have? How are you going to scale your database? What happens if the servers in your web farm have different values for the data? Is cache consistency critical?
You'll probably end up with several different types of caching:
IIS Static Caching
ASP.Net Caching
An LRU cache in your app.
An in-memory distributed cache such as MemCacheD.
HTTP caching in the browser.
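As a rough illustration of the ASP.Net and browser-caching items above, here is a minimal ASP.NET MVC sketch; the controller, action names, durations, and file path are made up for the example.

```csharp
using System;
using System.Web;
using System.Web.Mvc;

public class ProfileController : Controller
{
    // ASP.NET output caching: the rendered result is kept on the server
    // for 60 seconds per user ID, so repeated hits skip the database.
    [OutputCache(Duration = 60, VaryByParam = "id")]
    public ActionResult Details(int id)
    {
        // ... load the user and build the view model here ...
        return View();
    }

    // HTTP caching in the browser: Cache-Control headers let the client
    // reuse a response without contacting the server at all.
    public ActionResult Avatar(int id)
    {
        Response.Cache.SetCacheability(HttpCacheability.Public);
        Response.Cache.SetMaxAge(TimeSpan.FromMinutes(5));
        return File("~/Content/default-avatar.png", "image/png");  // illustrative path
    }
}
```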
Also, if you're serving static data (images, CSS, JavaScript, etc.) you'll want to integrate with a CDN for delivery. This is easy to do with AWS S3 or Azure Storage.
You'll also want to make sure you plan how to scale out from the get-go. You'll probably want to deploy to a cloud provider such as AWS with Elastic Beanstalk or Azure's Websites infrastructure.
How about having a cache whose size is limited according to the available memory, with an LRU (Least Recently Used) eviction algorithm on top of it, as sketched below? It may improve performance, especially for the most frequently active users.
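A minimal sketch of such a size-bounded LRU cache; the generic types and capacity are arbitrary, and it is not thread-safe, so a web application would need locking or a concurrent variant.

```csharp
using System.Collections.Generic;

// Least-Recently-Used cache: when capacity is reached, the entry that has
// not been read or written for the longest time is evicted.
public class LruCache<TKey, TValue>
{
    private readonly int _capacity;
    private readonly Dictionary<TKey, LinkedListNode<KeyValuePair<TKey, TValue>>> _map =
        new Dictionary<TKey, LinkedListNode<KeyValuePair<TKey, TValue>>>();
    private readonly LinkedList<KeyValuePair<TKey, TValue>> _order =
        new LinkedList<KeyValuePair<TKey, TValue>>();   // most recently used entries first

    public LruCache(int capacity) { _capacity = capacity; }

    public bool TryGet(TKey key, out TValue value)
    {
        LinkedListNode<KeyValuePair<TKey, TValue>> node;
        if (_map.TryGetValue(key, out node))
        {
            _order.Remove(node);              // mark as most recently used
            _order.AddFirst(node);
            value = node.Value.Value;
            return true;
        }
        value = default(TValue);
        return false;
    }

    public void Put(TKey key, TValue value)
    {
        LinkedListNode<KeyValuePair<TKey, TValue>> existing;
        if (_map.TryGetValue(key, out existing))
        {
            _order.Remove(existing);          // replace an existing entry
        }
        else if (_map.Count >= _capacity)
        {
            var oldest = _order.Last;         // evict the least recently used entry
            _order.RemoveLast();
            _map.Remove(oldest.Value.Key);
        }
        var node = new LinkedListNode<KeyValuePair<TKey, TValue>>(
            new KeyValuePair<TKey, TValue>(key, value));
        _order.AddFirst(node);
        _map[key] = node;
    }
}
```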
I have conducted performance testing on an e-commerce website hosted on Azure, and I am checking the Azure logs for the test duration to find scaling issues. In the logs I saw a lot of "InProc" dependency failures, as well as many "Technical exception" entries with the message "Cart not recalculated for remove shipping methods". I would like to know whether this indicates a scaling issue, and what else I should check for scaling issues, for example slow database queries. I am very new to performance testing and Azure, so any help will be much appreciated. Thanks!
Performance can be improved using the Cache-Aside pattern: caching can improve application performance.
Data from a data store can be loaded into a cache on demand. This can help improve performance while also maintaining consistency between the cache and the underlying data store.
Read-through and write-through/write-behind operations are available in many commercial caching systems. In these systems an application retrieves data by referring to the cache; if the data isn't already in the cache, it's fetched from the data store and added to the cache. Any changes to the data in the cache are also automatically pushed back to the data store.
If the cache does not provide this feature, it is the responsibility of the applications that use the cache to maintain the data.
The cache-aside technique allows an application to emulate the capabilities of read-through caching. This technique stores data in the cache only when it is needed.
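As a rough illustration of the pattern described above, here is a minimal cache-aside sketch in C#. The in-memory dictionary and the load/write delegates are placeholders, not a specific product's API: data is looked up in the cache first, loaded from the store on a miss, and the cached entry is invalidated when the item is written.

```csharp
using System;
using System.Collections.Concurrent;

// Cache-aside: the application, not the cache, is responsible for loading
// data on a miss and for invalidating entries when the store is updated.
public class CacheAside<TKey, TValue>
{
    private readonly ConcurrentDictionary<TKey, TValue> _cache =
        new ConcurrentDictionary<TKey, TValue>();
    private readonly Func<TKey, TValue> _loadFromStore;   // e.g. a database query
    private readonly Action<TKey, TValue> _writeToStore;  // e.g. a database update

    public CacheAside(Func<TKey, TValue> loadFromStore, Action<TKey, TValue> writeToStore)
    {
        _loadFromStore = loadFromStore;
        _writeToStore = writeToStore;
    }

    public TValue Get(TKey key)
    {
        TValue value;
        if (_cache.TryGetValue(key, out value))
            return value;                       // cache hit

        value = _loadFromStore(key);            // cache miss: read the data store
        _cache[key] = value;                    // populate the cache on demand
        return value;
    }

    public void Update(TKey key, TValue value)
    {
        _writeToStore(key, value);              // write the data store first
        TValue removed;
        _cache.TryRemove(key, out removed);     // then invalidate the stale entry
    }
}
```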
I have a server cluster; each server receives real-time authentication events as requests and returns a risk score for the incoming event, based on AI models that sit in S3.
This cluster serves multiple customers. Each customer has its own AI model in S3.
Each AI model file in S3 is ~50 MB in size.
The problem:
Let's say this cluster consists of 10 servers, and it serves 20 customers. Accordingly, there are 20 AI models in S3.
In a naive solution, each server in the cluster might end up loading all 20 models from S3 into its memory.
20 (models) * 50 MB (model size in S3) = 1 GB of model data per server.
It takes a long time to download a model and load it into memory, and the amount of memory available is limited by the capacity of each server.
And of course - these problems get bigger with scale.
So what are my options?
I know that there are out of the box products for model life cycle management, such as: MlFlow, KubeFlow, ...
Do these products have a solution to the problem I raised?
Maybe use Redis as a cache layer?
Maybe use Redis as a cache layer in combination with MlFlow and KubeFlow?
Any other solution?
Limitation:
I can't have sticky sessions between the servers in that cluster, so I can't ensure that all requests from the same customer will end up on the same server.
As far as I understand your problem, I'd use a separate serving server for each model. As a result, you'll have 20 model-serving servers, each of which loads only its own 50 MB of model data and serves just that one model. You'll also need one server that stores the model metadata and is responsible for sending each incoming request to the related model-serving server. This metadata holds the mapping of customer to model-serving server endpoint.
Essentially, Kubeflow offers the above solution as a package, and it's highly scalable since it uses Kubernetes for orchestration. For example, if someday you want to add a new customer, you can trigger a Kubeflow pipeline which trains your model, saves it to S3, deploys a separate model server within the Kubeflow cluster, and updates the metadata. Kubeflow offers both automation, through its pipeline approach, and scalability, through Kubernetes.
The cons of Kubeflow at the moment, in my opinion, are that the community is not yet large and the product is still maturing.
I haven't used MlFlow before, so I cannot give details about it.
As far as I understand your problem, this cannot be solved by a model-serving library/framework alone. The server instance that receives the request for the risk score must load the corresponding model.
To solve this you should route requests to a specific server instance depending on the tenant.
The "Deployment stamps" pattern could help you in this case. See https://learn.microsoft.com/en-us/azure/architecture/patterns/deployment-stamp for more details.
As a front door (see the pattern) an NGINX or Spring Cloud Gateway instance could be a good solution. Just look at the request header (the authorization header) to get the tenant/user and determine the appropriate server instance.
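For consistency with the rest of this page, here is a rough C# sketch of that routing idea; the server list, tenant extraction, and hashing are illustrative assumptions, and a real front door would be NGINX, Spring Cloud Gateway, or similar. The tenant is read from the authorization header and hashed onto a fixed server, so every request from a customer lands where that customer's model is already loaded.

```csharp
using System;
using System.Collections.Generic;

public static class TenantRouter
{
    // Hypothetical list of model-serving instances behind the front door.
    private static readonly List<string> Servers = new List<string>
    {
        "http://model-server-0:8080",
        "http://model-server-1:8080",
        "http://model-server-2:8080"
    };

    // Map a tenant ID (extracted from the authorization header) to one fixed
    // server, so that server only ever has to keep that tenant's model in memory.
    public static string RouteFor(string tenantId)
    {
        int index = (StableHash(tenantId) & 0x7FFFFFFF) % Servers.Count;
        return Servers[index];
    }

    // Simple deterministic hash; string.GetHashCode() can differ between
    // processes, which would break routing across gateway instances.
    private static int StableHash(string s)
    {
        unchecked
        {
            int h = 23;
            foreach (char c in s)
                h = h * 31 + c;
            return h;
        }
    }
}
```

Note that plain modulo hashing remaps tenants when servers are added or removed; a consistent-hashing scheme would reduce that churn.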
Currently clouds are mushrooming like crazy and people are starting to deploy everything to the cloud, including CMS systems, but so far I have not seen anyone succeed in deploying a popular CMS system to a load-balanced cluster in the cloud. Some performance hurdles seem to prevent standard open-source CMS systems from being deployed to the cloud like this.
CLOUD: A cloud, or better, a load-balanced cluster, has at least one frontend server, one network-connected(!) database server and one cloud-storage server. This fits Amazon Beanstalk and Google App Engine well. (This specifically excludes a CMS on a single computer or a Linux server with MySQL on the same "CPU".)
To deploy a standard CMS in such a load balanced cluster needs a cloud-ready CMS with the following characteristics:
The CMS must deal with the latency of queries to still be responsive and render pages in less than a second to be cached (or use a precaching strategy)
The filesystem probably must be connected to a remote storage (Amazon S3, Google cloudstorage, etc.)
Currently I know of python/django and Wordpress having middleware modules or plugins that can connect to cloud storages instead of a filesystem, but there might be other cloud-ready CMS implementations (Java, PHP, ?) and systems.
I myself have failed to deploy django-CMS to the cloud, ultimately due to the query latency of the remote DB. So here is my question:
Did you deploy an open-source CMS that still performs well in rendering pages and backend admin? Please post your average page rendering access stats in microseconds for uncached pages.
IMPORTANT: Please describe your configuration, the problems you have encountered, which modules had to be optimized in the CMS to make it work, don't post simple "this works", contribute your experience and knowledge.
Such a CMS probably has to make fewer than 10 queries per page (if more, the queries must be made in parallel) and must cope with filesystem access times of about 100 ms for a stat and query delays of about 40 ms.
Related:
Slow MySQL Remote Connection
Have you tried Umbraco?
It relies on a database, but it keeps layers of cache so you aren't doing selects on every request.
http://umbraco.com/azure
It works great on Azure too!
I have found an excellent performance test of WordPress on App Engine. It appears that Google has spent some time optimizing this system for load-balanced cluster and remote-DB deployment:
http://www.syseleven.de/blog/4118/google-app-engine-php/
Scaling test from the report.
parallel hits    GAE     1&1     Sys11
1                1,5     2,6     8,5
10               9,8     8,5     69,4
100              14,9    -       146,1
Conclusion from the report: the system is slower than on traditional hosting but scales much better.
http://developers.google.com/appengine/articles/wordpress
We have managed to deploy python django-CMS (www.django-cms.org) on Google App Engine with CloudSQL as the DB and Cloud Storage as the filesystem. Cloud Storage was attached by forking and fixing a django.storage module by Christos Kopanos: http://github.com/locandy/django-google-cloud-storage
After that, a second set of problems came up when we discovered access times of up to 17s for a single page. We investigated this and found that easy-thumbnails 1.4 accessed the normal file system for mod_time requests while writing its results to the store, so all thumbnail images were re-rendered on every request. We switched to the development version, where that was already fixed.
Then we worked with SmileyChris to fix the unnecessary mod_time access (a stat of the file) on every request for every image, by tracing the problem and posting issues to http://github.com/SmileyChris/easy-thumbnails
This reduced access times from 12-17s to 4-6s per public page on the CMS, basically eliminating all storage/"file"-system access. Once that was fixed, easy-thumbnails replaced (by design) file-system accesses with queries to the DB to check on every request whether a thumbnail's source image has changed.
One thing for the web designer: if she uses an image.width statement in the template, this forces an ugly, slow read on the "filesystem", because image widths are not cached.
Further investigation led to the conclusion that DB accesses are very costly too, taking about 40 ms per round trip.
Up to now the deployment has been unsuccessful, mostly due to DB access times in the cloud leading to 4-5 s delays when rendering a page before it is cached.
I'm currently looking for a cloud PaaS that will allow me to scale an application to handle anything between 1 user and 10 million+ users. I've never worked on anything this big, and the big question I can't seem to get a clear answer to is this: if you develop, say, a standard application with a relational database and SOAP web services, will this application scale automatically when deployed on a PaaS solution, or do you still need to build the application with failover, redundancy and all those things in mind?
Let's say I deploy a Spring/Hibernate application to Amazon EC2 and I create a single Ubuntu Server instance with Tomcat installed. Will this application just scale indefinitely, or do I need more Ubuntu instances? If more than one Ubuntu instance is needed, does Amazon take care of running the application across both instances, or is this the developer's responsibility? What about database storage: can I install a database on EC2 that will scale as the database grows, or do I need to use one of their APIs instead if I want it to scale indefinitely?
CloudFoundry allows you to build locally and just deploy straight to their PaaS, but since it's in beta there's a limit on the amount of resources you can use, and databases are limited to 128 MB if I remember correctly, so this is a no-go for now. Some have suggested installing CloudFoundry on Amazon EC2; how does it scale then, and how is the database layer handled?
GAE (Google App Engine): will this allow me to just deploy an app and not have to worry about how it scales and implements redundancy? There appear to be some limitations on what you can and can't run on GAE, and their recent price increase upset quite a large number of developers. Is it really that expensive compared to other providers?
So basically, will it scale and what needs to be done to make it scale?
That's a lot of questions for one post. Anyway:
Amazon EC2 does not scale automatically with load. EC2 is basically just a virtual machine. You can achieve scaling of EC2 instances with Auto Scaling and Elastic Load Balancing.
SQL databases scale poorly. That's why people started using NoSQL databases in the first place. It's best to see which database your cloud provider offers as a managed service: Datastore on GAE and DynamoDB on Amazon.
Installing your own database on EC2 instances is very impractical, as EC2 instances have ephemeral storage (they lose all data on "disk" when they reboot).
GAE Datastore is actually one big database for all applications running on it, so it's pretty scalable: your millions of users should not be a problem for it.
http://highscalability.com/blog/2011/1/11/google-megastore-3-billion-writes-and-20-billion-read-transa.html
Yes, App Engine scales automatically, both frontend instances and the database. There is nothing special you need to do to make it scale; just use their API.
There are limitations on what you can do with App Engine:
A. No local storage (filesystem) - you need to use Datastore or Blobstore.
B. Comet is only supported via their proprietary Channels API
C. Datastore is a NoSQL database: no JOINs, limited queries, limited transactions.
The cost of GAE is not bad. We do 1M requests a day for about 5 dollars a day. The biggest saving comes from the fact that you do not need a system admin on GAE (but you do need one for EC2). Compared to the cost of manpower, GAE is incredibly cheap.
Some hints to save money on (and speed up) GAE:
A. Use get instead of query in Datastore (this requires carefully crafting natural keys).
B. Use memcache to cache data you got from the datastore. This can be done automatically with objectify and its @Cached annotation.
C. Denormalize data, meaning you write data redundantly in various places in order to get to it in as few operations as possible.
D. If you have a lot of REST requests from devices, where you do not use cookies, then switch off session support (or roll your own, as we did). Sessions use the datastore under the hood, and for every request they do a get and a put.
E. Read about adjusting app settings. Try different settings (depending on how tolerant your app is to request delay and on your traffic patterns/spikes). We were able to cut down frontend instances by 70%.
We provide a critical application for a customer. It's a ClickOnce WinForms application which consumes several WCF services that communicate with an Oracle database.
The service is hosted with Oracle Application Server, with two Web Cache servers in front for load balancing. The database is on a separate machine.
Thing is, the application now has poor performance and we need to speed it up. We have tried many techniques: optimizing queries by adding indexes after analyzing explain plans, reducing service calls from the client, and profiling the client application for pitfalls.
But I would really like to set up a caching layer over the database or the WCF services. The data is critical and changes quite often, so it's necessary to get the latest data on every request.
So when data changes in the database, the cache should immediately be expired. The queries are complex, with up to 14-15 joins...
What is the right way to do this, and which tools/frameworks should I use? I have heard of memcached... is it good?
Because your code sees all updates to the data, you can have a very effective caching layer, as the cache can be updated at the same time as the database.
With your requirement for absolute cache coherency, you need to make sure all servers see the same cache. There are two approaches you could take:
Have a cache server which uses something like the ASP.NET cache, which the application servers talk to in order to get and update the data
Use a caching product to maintain the cache
If you use a caching product, there are a number on the market: memcached, GemFire, Coherence, Windows Server AppFabric Caching and more.
The nice thing about AppFabric Caching (the project formerly known as Velocity) is that it is free with Windows Server and is very .NET friendly (although it is newer than some of the others, so you might say less proven).
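As a rough sketch of the "update the cache at the same time as the database" idea, in this thread's .NET context and using the in-process ASP.NET cache for brevity; the Order type, repository interface, and cache keys are placeholders. In a web farm the same pattern would target a shared cache (memcached, AppFabric, etc.) instead of HttpRuntime.Cache, so that every server sees the invalidation.

```csharp
using System.Web;

public class Order { public int Id { get; set; } }

public interface IOrderRepository                     // placeholder for the real Oracle data layer
{
    Order Load(int orderId);
    void Save(Order order);
}

public class OrderService
{
    private readonly IOrderRepository _repository;

    public OrderService(IOrderRepository repository) { _repository = repository; }

    public Order GetOrder(int orderId)
    {
        string key = "order:" + orderId;
        var cached = HttpRuntime.Cache[key] as Order;
        if (cached != null)
            return cached;                             // served from cache

        Order order = _repository.Load(orderId);       // the expensive multi-join query
        HttpRuntime.Cache.Insert(key, order);
        return order;
    }

    public void SaveOrder(Order order)
    {
        _repository.Save(order);                       // update the database...
        HttpRuntime.Cache.Remove("order:" + order.Id); // ...and expire the cache immediately
    }
}
```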
Before adding a new tool you should make sure you're correctly using all of the Oracle caching that is available to you.
There's the buffer cache, the PL/SQL function result cache, the client query result cache, the SQL query result cache and materialized views; bind variables will also help cache query plans.