Properly analyze Azure logs - database

I have conducted a performance test on an e-commerce website hosted on Azure, and I am checking the Azure logs for the test duration to find scaling issues. In the logs I saw a lot of "InProc" dependency failures, and also a lot of "Technical exception" entries with the message "Cart not recalculated for remove shipping methods". I would like to know whether this indicates a scaling issue, and what else I should check for scaling issues, for example slow database queries. I am very new to performance testing and Azure, so any help will be much appreciated. Thanks!!

Performance can be improved using the Cache-Aside pattern - Caching Can Improve Application Performance
Data from a data store can be loaded into a cache on demand. This can help increase performance while also maintaining consistency between the data in the cache and the data in the underlying data store.
Read-through and write-through/write-behind operations are available in
many commercial caching systems. An application in these systems
retrieves data by referring to the cache. If the data isn't already in
the cache, it's fetched from the data store and added to the cache. Any
changes to the data in the cache are also automatically written back to
the data store.
If the cache does not provide this feature, it is the responsibility of
the applications that use the cache to maintain the data.
The cache-aside technique allows an application to mimic the
capabilities of read-through caching. This technique stores data in
the cache only when it is needed. The following diagram shows how to
use the Cache.
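The read and write paths described above can be sketched roughly as follows. This is a minimal in-process illustration, not the Azure implementation: the `CacheAside` class, the `load`/`save` callbacks standing in for the data store, and the 300-second default lifetime are all assumptions made for the example.

```python
import time
from typing import Any, Callable, Dict, Tuple


class CacheAside:
    """Cache-aside sketch: the application, not the cache, talks to the store."""

    def __init__(self, load: Callable[[str], Any],
                 save: Callable[[str, Any], None],
                 ttl_seconds: float = 300.0) -> None:
        self._load = load            # hypothetical data-store read
        self._save = save            # hypothetical data-store write
        self._ttl = ttl_seconds      # assumed lifetime of a cached entry
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Any:
        entry = self._entries.get(key)
        if entry is not None and entry[0] > time.monotonic():
            return entry[1]                       # cache hit
        value = self._load(key)                   # cache miss: read the store
        self._entries[key] = (time.monotonic() + self._ttl, value)
        return value

    def update(self, key: str, value: Any) -> None:
        self._save(key, value)                    # write to the store first
        self._entries.pop(key, None)              # then invalidate the cached copy
```

Invalidating on update, rather than rewriting the cached value, means the next `get` reloads from the store, so data is cached only when it is actually requested.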

Related

When is SQL Server as a distributed caching mechanism worthwhile?

I have 2 web servers, and I'm running into an issue where I need to prematurely expire (remove) a cached item. Since I'm currently using IMemoryCache, a Remove(key) call only removes the cached item from one server. I don't have the ability to leverage Redis, NCache, etc., but the app is already using SQL Server. I can easily set up distributed caching with a cache table, but it seems counter-intuitive because what I'm caching is user data that I don't want to hit the database for on every call (e.g., I cache 50 items of user data every 5 minutes, which has cut down on 500 trips to the database). Is there something I'm missing which would make using SQL Server as my distributed cache backend actually beneficial?
Sounds like you are having the typical problem of cache invalidation and expiry. You can use a grid cache for distributed caching (e.g. Redis, Hazelcast), but it doesn't solve the invalidation problem. You may want to consider vendors like ScaleArc or Heimdall Data. They provide the caching logic: you choose the storage (in-memory, Redis, etc.) and the product handles query caching and invalidation. There is a SQL Server blog post on it: https://www.itprotoday.com/industry-perspectives/reduce-sql-server-costs-heimdall-data-caching

Caching Data on a Heavy Load Web Server

I currently have a web application which, on each page request, gets user data out of a database for the currently logged-in user.
This web application could have approximately 30 thousand concurrent users.
My question is whether it would be best to cache this, for example in C# using System.Web.HttpRuntime.Cache.Add,
or would storing up to 30 thousand user objects in memory cripple the server's memory?
Would it be better to not cache and just get the required data from the database on each request?
At that scale you need an explicit caching and scaling strategy. Hacking together a cache is different from planning an explicit strategy. Hacking together a cache will fail.
Caching is highly dependent upon the data. Does the data change frequently? What ratio of reads-to-writes are you going to have? How are you going to scale your database? What happens if the servers in your web farm have different values for the data? Is cache consistency critical?
You'll probably end up with several different types of caching:
IIS Static Caching
ASP.Net Caching
An LRU cache in your app.
An in-memory distributed cache such as memcached.
HTTP caching in the browser.
Also, if you're serving static data (images, CSS, JavaScript, etc.) you'll want to integrate with a CDN for delivery. This is easy to do with AWS S3 or Azure Storage.
You'll also want to make sure you plan how to scale out from the get-go. You'll probably want to deploy to a cloud provider such as AWS with Elastic Beanstalk or Azure's Websites infrastructure.
How about keeping a limited cache sized to the available memory, with an LRU (Least Recently Used) eviction policy on top of it? It may improve performance, especially for users who visit frequently.
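A bounded LRU cache of this kind could look roughly like the sketch below. It is shown in Python rather than C# purely for illustration; the `LRUCache` name and the capacity parameter are assumptions, and in practice the capacity would be derived from how much memory each user object takes.

```python
from collections import OrderedDict
from typing import Any, Hashable


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used entry."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._capacity = capacity
        self._items: "OrderedDict[Hashable, Any]" = OrderedDict()

    def get(self, key: Hashable, default: Any = None) -> Any:
        if key not in self._items:
            return default
        self._items.move_to_end(key)          # mark as most recently used
        return self._items[key]

    def put(self, key: Hashable, value: Any) -> None:
        self._items[key] = value
        self._items.move_to_end(key)          # newest entry goes to the end
        if len(self._items) > self._capacity:
            self._items.popitem(last=False)   # drop the least recently used entry

    def __len__(self) -> int:
        return len(self._items)
```

With a capacity well below 30 thousand, users who are active stay cached while idle users fall out and are reloaded from the database on their next request.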

Caching database table on a high-performance application

I have a high-performance application I'm considering making distributed (using RabbitMQ as the MQ). The application uses a database (currently SQL Server, but I can still switch to something else) and caches most of it in RAM to increase performance.
This causes a problem because when one of the applications writes to the database, the others' cached database becomes out-of-date.
I figured it is something that happens a lot in the High-Availability community, however I couldn't find anything useful. I guess I'm not searching for the right thing.
Is there an out-of-the-box solution?
PS: I'm sorry if this belongs on serverfault. Since this is a development issue I figured it belongs here.
EDIT:
The application reads and writes to the database. Since I'm changing the application to be distributed, more than one application now reads and writes to the database. The caching is done in each of the distributed applications, which are not aware of DB changes made by another application.
I mean: how can an application know that the DB was updated if it wasn't the one that updated it?
So you have one database and many applications on various servers. Each application has its own cache and all the applications are reading and writing to the database.
Look at a distributed cache instead of caching locally. Check out memcached or AppFabric. I've had success using AppFabric to cache things in a Microsoft stack. You can simply add new nodes to AppFabric and it will automatically distribute the objects for high availability.
If you move to a shared cache, then you can put expiration times on objects in the cache. Try to resist the temptation to proactively evict items when things change. It becomes a very difficult problem.
I would recommend isolating your critical items and only cache them once. As an example, when working on an auction site, we cached very aggressively. We only cached an auction listing's price once. That way when someone else bid on it, we only had to do one eviction. We didn't have to go through the entire cache and ask "Where does the price appear? Change it!"
For 95% of your data, the reads will expire on their own and writes won't affect them immediately. 5% of your data needs to be evicted when a new write comes in. This is what I called your "critical items". Things that always need to be up to date.
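The split between data that simply expires and the few critical items that are evicted on write can be sketched as below. This is an in-process stand-in for a shared cache, not AppFabric or memcached code; the `ExpiringCache` class, the key names, and the injectable `clock` parameter are assumptions made for the example.

```python
import time
from typing import Any, Callable, Dict, Hashable, Optional, Tuple


class ExpiringCache:
    """Entries expire on their own; only critical keys are evicted explicitly."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}

    def set(self, key: Hashable, value: Any, ttl_seconds: float) -> None:
        self._entries[key] = (self._clock() + ttl_seconds, value)

    def get(self, key: Hashable) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]            # expired: treat as a miss
            return None
        return value

    def evict(self, key: Hashable) -> None:
        self._entries.pop(key, None)          # used only for critical items
```

In the auction example, a listing's description would be stored with `set(..., ttl_seconds=...)` and left alone, while the price lives under a single key such as `"auction:7:price"` and a new bid calls `evict` on that one key.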
Hope that gives you ideas!

Improving speed in winform application and WCF with Caching

We provide a critical application for a customer. It's a ClickOnce WinForms application which consumes several WCF services which communicate with an Oracle database.
The service is hosted with Oracle Application Server with two Web Cache Servers in front for load balancing. The Database is on another separate machine.
The thing is, the application now performs poorly and we need to speed it up. We have tried many techniques: optimizing queries by adding indexes after analyzing explain plans, reducing service calls from the client, and profiling the client application for pitfalls.
But I would really like to set up a caching layer over the database or the WCF services. The data is critical and changes quite often, so it's necessary to get the latest data on every request.
So when data changes in the database, the cache should be expired immediately. The queries are complex, with up to 14-15 joins...
What is the right way to do this, and which tools/frameworks should I use? I have heard of memcached. Is it a good fit?
Because your code sees all updates to the data you can have a very effective caching layer as the cache can be updated at the same time as the database.
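Updating the cache at the same time as the database can be sketched as a write-through wrapper like the one below. It is a simplified illustration in Python, not WCF or Oracle code: the `WriteThroughCache` name is hypothetical and a plain dict-like `store` stands in for the database.

```python
from typing import Any, Dict, Hashable, MutableMapping


class WriteThroughCache:
    """Every write goes to the store and the cache in one code path."""

    def __init__(self, store: MutableMapping) -> None:
        self._store = store                   # hypothetical stand-in for the database
        self._cache: Dict[Hashable, Any] = {}

    def read(self, key: Hashable) -> Any:
        if key not in self._cache:
            self._cache[key] = self._store[key]   # first read populates the cache
        return self._cache[key]

    def write(self, key: Hashable, value: Any) -> None:
        self._store[key] = value   # store first, so a failed write leaves the cache untouched
        self._cache[key] = value   # then refresh the cached copy with the same value
```

This only stays coherent if every update to the data goes through `write`; anything that changes the database directly would leave the cached copy stale.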
With your requirement for absolute cache coherency you need to make sure all servers see the same cache. There are two approaches you could take:
Have a cache server which uses something like the ASP.NET cache which the application servers talk to to get and update the data
Use a caching product to maintain the cache
If you use a caching product, there are a number on the market: memcached, GemFire, Coherence, Windows Server AppFabric Caching and more.
The nice thing about AppFabric Caching (a project formerly known as Velocity) is that it is free with Windows Server and is very .NET friendly (although it is newer than some of the others, so you might say it is less proven).
Before adding a new tool you should make sure you're correctly using all of the Oracle caching that is available to you.
There's the buffer cache, the PL/SQL function result cache, the client query result cache, the SQL query result cache, and materialized views. Bind variables will also help cache query plans.

Caching in Google App Engine/Cloud Based Hosting

I am curious as to how caching works in Google App Engine or any cloud-based application. Since there is no guarantee that requests are sent to the same server, does that mean that if data is cached on the 1st request on Server A, then the 2nd request, which is processed by Server B, will not be able to access the cache?
If that's the case (the cache is only local to a server), won't it be unlikely (depending on the number of users) that a request uses the cache? E.g. Google probably has thousands of servers.
With App Engine you cache using memcached. This means that a cache server will hold the data in memory (rather than each application server). The application servers (for a given application) all talk to the same cache server (conceptually; there could be sharding or replication going on under the hood).
In-memory caching on the application server itself will potentially not be very effective, because there is more than one of those (although for your given application there are only a few instances active, it is not spread out over all of Google's servers), and also because Google is free to shut them down all the time (which is a real problem for Java apps that take some time to boot up again, so now you can pay to keep idle instances alive).
In addition to these performance/effectiveness issues, in-memory caching on the application server could lead to consistency problems (every refresh shows different data when the caches are not in sync).
Depends on the type of caching you want to achieve.
Caching on the application server itself can be interesting if you have a complex in-memory object structure that takes time to rebuild from data loaded from the database. In that specific case, you may want to cache the result of the computation. If the structure is large, it will be faster to load it from a local cache than from a shared memcache.
If having a consistent value between memory and the database is paramount, you can check a checksum/timestamp against a value stored in the datastore every time you use the cached value. Storing the checksum/timestamp on a small object or in a global cache will speed up the process.
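The checksum/timestamp check can be sketched as follows. This is an illustration only, not App Engine API code: `VersionedLocalCache`, `read_version` (a cheap lookup of the current checksum or timestamp) and `load_value` (the expensive rebuild, returning the version together with the value) are hypothetical names.

```python
from typing import Any, Callable, Dict, Hashable, Tuple


class VersionedLocalCache:
    """Local cache that revalidates each entry against a cheap version lookup."""

    def __init__(self,
                 read_version: Callable[[Hashable], Any],
                 load_value: Callable[[Hashable], Tuple[Any, Any]]) -> None:
        self._read_version = read_version   # small, fast read (timestamp or checksum)
        self._load_value = load_value       # slow rebuild, returns (version, value)
        self._entries: Dict[Hashable, Tuple[Any, Any]] = {}

    def get(self, key: Hashable) -> Any:
        current = self._read_version(key)
        cached = self._entries.get(key)
        if cached is not None and cached[0] == current:
            return cached[1]                 # local copy is still up to date
        version, value = self._load_value(key)
        self._entries[key] = (version, value)
        return value
```

Each `get` still costs one small read, but the large structure is rebuilt only when its stored version has actually changed.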
One big issue with using a global memcache is ensuring proper synchronization when "refilling" it, when a value is not yet present or has been flushed. If multiple servers do the check at the exact same time, you may end up with several distinct servers doing the refill at the same time. If the operation is idempotent, this is not a problem; if not, it is a potential and very hard to trace bug.
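One common way to limit concurrent refills is to use an atomic add-if-absent operation (memcached's `add` command has this behaviour) as a short-lived lock. The sketch below simulates the idea in a single process, with threads standing in for servers; `FakeSharedCache`, `get_or_refill`, the `"lock:"` key prefix and the polling parameters are all assumptions made for the example.

```python
import threading
import time
from typing import Any, Callable, Dict


class FakeSharedCache:
    """In-process stand-in for a shared cache with an atomic add-if-absent."""

    def __init__(self) -> None:
        self._data: Dict[str, Any] = {}
        self._lock = threading.Lock()

    def get(self, key: str) -> Any:
        with self._lock:
            return self._data.get(key)

    def set(self, key: str, value: Any) -> None:
        with self._lock:
            self._data[key] = value

    def add(self, key: str, value: Any) -> bool:
        """Store only if the key is absent; return True when stored."""
        with self._lock:
            if key in self._data:
                return False
            self._data[key] = value
            return True

    def delete(self, key: str) -> None:
        with self._lock:
            self._data.pop(key, None)


def get_or_refill(cache: FakeSharedCache, key: str, rebuild: Callable[[], Any],
                  wait_seconds: float = 0.01, max_waits: int = 500) -> Any:
    """Return the cached value, letting only one caller run rebuild()."""
    value = cache.get(key)
    if value is not None:
        return value
    lock_key = "lock:" + key
    if cache.add(lock_key, 1):               # only one caller wins the lock
        try:
            value = cache.get(key)           # re-check: it may have been refilled meanwhile
            if value is None:
                value = rebuild()
                cache.set(key, value)
            return value
        finally:
            cache.delete(lock_key)
    for _ in range(max_waits):               # everyone else polls for the result
        time.sleep(wait_seconds)
        value = cache.get(key)
        if value is not None:
            return value
    raise TimeoutError("value for %r was not refilled in time" % key)
```

The sketch treats `None` as "missing", so cached values must not be `None`. Against a real shared cache, the lock key would also need an expiry so that a server that crashes mid-refill does not block the others indefinitely.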
