What are the costs associated with Snowflake caching?

Does Snowflake charge storage costs for the result cache, warehouse cache, and metadata cache?

Until I find the official answer, these are the principles behind each:
Result cache: available across warehouses, hence it is persisted in cloud storage and should count toward your storage costs.
Warehouse cache: available within an active warehouse, hence its cost is already included in the warehouse cost.
Metadata cache: Naturally available throughout the cloud services layer. Free unless "Snowflake credits are used to pay for the usage of the cloud services that exceeds 10% of the daily usage of the compute resources".
https://docs.snowflake.com/en/user-guide/credits.html
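As a quick way to see the result cache in action, here is a minimal sketch using the Snowflake Python connector; the connection parameters and table name are placeholders, and USE_CACHED_RESULT is the documented session parameter that controls result-cache reuse:

```python
import snowflake.connector

# Placeholder credentials -- replace with your own account details.
conn = snowflake.connector.connect(
    user='YOUR_USER', password='YOUR_PASSWORD', account='YOUR_ACCOUNT')
cur = conn.cursor()

# Run the same query twice: the second run can be served from the
# result cache, with no warehouse compute billed for it.
cur.execute("SELECT COUNT(*) FROM my_table")  # hypothetical table
cur.execute("SELECT COUNT(*) FROM my_table")  # result-cache hit

# Disable result-cache reuse for the session to force recomputation,
# e.g. when benchmarking warehouse performance.
cur.execute("ALTER SESSION SET USE_CACHED_RESULT = FALSE")
```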

Related

Most cost-effective GAE app settings?

A GAE Python webapp I have splits its cost about evenly between 1) frontend instances and 2) data reads. The only way I can think of to reduce the cost of data reads is to store more items in memcache, but I don't know how to reduce the cost of the frontend instances. I'm using the F1 setting; how do I know whether other settings would increase or decrease the cost? And what happens if I enable the PageSpeed service?
About PageSpeed Service cost:
At this time, the service is being offered to a limited set of webmasters free of charge.
Have a look here for more information. On the other hand, there is an article in the docs (Enabling PageSpeed Optimization Service) that says this:
There is a small fee for using PageSpeed ($0.39 per gigabyte of bandwidth in addition to regular bandwidth charges)...
To lower frontend instance costs, you could keep a smaller number of idle instances, as that reduces costs.
Definitely use memcache extensively, and take care to keep the cache in sync with the Datastore.
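A minimal sketch of that pattern, assuming a hypothetical ndb Item model; memcache.get/set/delete are the standard App Engine memcache calls:

```python
from google.appengine.api import memcache
from google.appengine.ext import ndb

def get_item(item_id):
    # Read-through cache: try memcache first, fall back to a
    # (billed) Datastore read only on a cache miss.
    cache_key = 'item:%s' % item_id
    item = memcache.get(cache_key)
    if item is None:
        item = ndb.Key('Item', item_id).get()
        if item is not None:
            memcache.set(cache_key, item, time=3600)  # cache for 1 hour
    return item

def update_item(item):
    # Keep cache and Datastore in sync: invalidate on every write.
    item.put()
    memcache.delete('item:%s' % item.key.id())
```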
Slightly increasing the Pending Latency makes AppEngine hold off spawning additional server instances for a short while. If your site throughput is low to medium, the impact on performance should be negligible. On the other hand, the cost savings might also be negligible. It will not hurt to give it a try.

Master/Slave Datastore vs High Replication Datastore

Before starting with the GAE datastore, I think it would be good to know the difference between the Master/Slave Datastore and the High Replication Datastore.
And what made the GAE team migrate from Master/Slave to HRD?
The difference between the two (as well as the reason for the switch) is increased fault-tolerance and data consistency.
The Master/Slave Datastore implements a primary-backup protocol. Each app is served by a master (i.e. a single data center) and its data is replicated asynchronously to the slave (i.e. some other data center). The problem with this scheme is that it doesn't protect your application from local failures and is more likely to lead to data inconsistencies.
The High Replication Datastore implements the Paxos consensus algorithm to ensure that a majority of data centers maintain a consistent view of your application's data. Because your data is no longer reliant on the health of a single data center, the datastore is able to function properly even in the presence of local/global failures. Google's engineers also benefit from this implementation, as it allows them to perform data center maintenance without having to enforce scheduled read-only periods for AppEngine applications.
The downside of using the High Replication Datastore is slower writes (about 2x as slow, since Paxos is inherently 2-phase). This isn't that big of a deal though, especially when compared to the increased fault tolerance and data consistency that the High Replication Datastore has to offer.
For the first three years of App Engine, when only Master/Slave was available, the health of the datastore was tied to the health of a single data center. Users had low latency and strong consistency, but also transient data unavailability and planned read-only periods.
The High Replication Datastore trades small amounts of latency and consistency for significantly higher availability.
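To make the trade-off concrete, here is a small sketch using the Python ndb API (the Message/Board model is hypothetical): under HRD, global queries are only eventually consistent, while ancestor queries remain strongly consistent at the cost of the per-entity-group write rate limit.

```python
from google.appengine.ext import ndb

class Message(ndb.Model):
    text = ndb.StringProperty()

board = ndb.Key('Board', 'general')
Message(parent=board, text='hello').put()

# Global query: eventually consistent under HRD -- the entity just
# written may not show up in the results yet.
recent = Message.query().fetch(10)

# Ancestor query: strongly consistent, but writes to a single entity
# group are limited to roughly one per second.
consistent = Message.query(ancestor=board).fetch(10)
```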
The Master/Slave datastore is deprecated, and it's advised that you do not use it: https://developers.google.com/appengine/docs/python/datastore/usingmasterslave

How to estimate hosting services cost on GAE?

I'm building a system which I plan to deploy on Google App Engine. Current pricing is described here:
Google App Engine - Pricing and Features
I need an estimate of the cost per client managed by the webapp. The estimate won't be very accurate until I have completed development; GAE uses such fine-grained pricing, down to individual READ and WRITE operations, that estimating the operation cost per user becomes a daunting task.
I have an agile dev process, which leaves me even more clueless in determining my cost. I've been using my user stories to create a cost baseline per story, then roughly estimating how the user will execute each story's workflow to finally compute a simplistic estimate.
As I see it, computing estimates for the Datastore API is overly complex for a startup project. The other costs are a bit easier to grasp. Unfortunately, I need to give an approximate cost to my manager!
Has anyone undergone such a task? Any pointers would be great, regarding tools, examples, or any other related information.
Thank you.
Yes, it is possible to do a cost-estimate analysis for App Engine applications. Based on my experience, the three major areas of cost I encountered while doing my analysis were instance hours, datastore read/write operations, and datastore stored data.
YMMV based on the type of app you are developing, of course. If it is an intense OLTP application that handles simple-but-frequent CRUD on your data records, most of the cost will be in the datastore read/write operations, so I would suggest starting your estimate with this resource.
For datastore reads and writes, writing is generally much more expensive than reading. This is because the write cost takes into account not only the cost to write the entity, but also the cost to write all the indexes associated with it. I would suggest reading Google's article about the life of a datastore write, especially the part about the Apply phase, to understand how to calculate the number of writes per entity based on your data model.
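As a back-of-the-envelope illustration, here is the kind of arithmetic that article leads to. The per-put formula below reflects my understanding of the old App Engine billing model (2 writes for the entity, plus 2 per indexed property value, plus 1 per composite index entry), and the counts are hypothetical, so verify it against the current pricing page:

```python
# Hypothetical entity: 5 indexed property values, touching 2
# composite index entries on every put.
indexed_property_values = 5
composite_index_entries = 2

# Old billing model (verify!): 2 base writes + 2 per indexed
# property value + 1 per composite index entry.
writes_per_put = 2 + 2 * indexed_property_values + composite_index_entries
print(writes_per_put)  # 14 write ops for a single new-entity put
```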
To estimate the instance hours you would need, the simplest approach (though not always feasible) is to deploy a simple app and test how long a particular request takes. If this approach is undesirable, you might instead base your estimate on the Google App Engine System Status page (e.g., the latency of a datastore write for a particular entity size) to get a (very) rough picture of how long it would take to process your request.
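If you do deploy a test app, the measurement itself can be as simple as this sketch (Python 2, to match the classic runtime; the URL is a placeholder):

```python
import time
import urllib2

start = time.time()
urllib2.urlopen('https://your-app-id.appspot.com/some-handler').read()
elapsed_ms = (time.time() - start) * 1000

# Wall-clock latency is only a proxy for billed instance time, but it
# gives a first-order number to multiply by expected request volume.
print('request took %.0f ms' % elapsed_ms)
```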
The third major area of cost, in my opinion, is datastore stored data. This varies based on your data model, of course, but any estimate you make also needs to account for the storage taken by the entity indexes. Taking a quick glance at the datastore statistics page, I think indexes can increase the storage size by anywhere from 40% to 400%, depending on how many indexes you have for a particular entity.
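Worked through with hypothetical numbers, that storage estimate looks like this (the 40%–400% overhead range is from the observation above):

```python
raw_entity_bytes = 100 * 1024 * 1024   # hypothetical 100 MB of entities

# Index overhead observed to range from ~40% to ~400% of entity size.
low = raw_entity_bytes * (1 + 0.4)
high = raw_entity_bytes * (1 + 4.0)

gb = 1024.0 ** 3
print('billed storage: %.2f GB to %.2f GB' % (low / gb, high / gb))
```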
Remember that any such figure is only an estimation of the real cost. The definitive source of truth is here: https://cloud.google.com/pricing/.
A good tool to estimate your cost for Appengine is this awesome Chrome Extension: "App Engine Offline Statistics Estimator".
You can also check out the AppStats package (to infer costs from within the app via API).
Recap:
Official Appengine Pricing
AppStats for Python
AppStats for Java
Online Estimator (OSE) Chrome Extension
You can use the pricing calculator
https://cloud.google.com/products/calculator/

GAE: About the Usage of High Replication Data

I'm using Google App Engine and High Replication Datastore.
I checked the Dashboard of one of my GAE apps today, and found that High Replication Data had reached 52%, 0.26 of 0.50 GBytes, in the Billing Status.
I don't use that much data in the app, so I also checked Datastore Statistics: the total number of entities is about 60,000, and the size of all entities is only 42 MBytes, which is far from 0.26 GBytes.
What is the difference between the usage in the Dashboard and in Datastore Statistics? And how can I reduce the former?
Thank you.
Because the datastore creates automatic indexes for your entities. In addition, if you have custom indexes, they also need storage.
You can reduce this by removing unused indexes and by not indexing properties that are not needed for queries (setting indexed=False); see the sketch below.
In general, however, you need to get used to the idea that the storage for your entities is not the same as the total storage needed by the datastore ;)
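A minimal ndb sketch of that advice, with a hypothetical LogEntry model: properties that never appear in query filters or sort orders can skip their automatic index entries entirely.

```python
from google.appengine.ext import ndb

class LogEntry(ndb.Model):
    # Queried by time, so this stays indexed.
    created = ndb.DateTimeProperty(auto_now_add=True)
    # Never filtered or sorted on: indexed=False avoids writing (and
    # storing) the automatic index entries for this property.
    source_ip = ndb.StringProperty(indexed=False)
    # TextProperty is never indexed, so large payloads are cheap here.
    payload = ndb.TextProperty()
```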

How much memory of Memcache is available to a Google App Engine account?

Google App Engine has some information about Memcache limits:
http://code.google.com/appengine/docs/quotas.html#Memcache
http://code.google.com/appengine/docs/python/memcache/overview.html#Quotas_and_Limits
However, the total allowed size of the RAM/memory store for a single application is not specified. It's known that no object above 1 MB is allowed. Does anyone have more information?
The amount of memcache capacity your app has isn't fixed, and may vary depending on the traffic to your app and how it uses memcache.
As of November 2013, App Engine offers dedicated memcache which guarantees a specific amount of memcache RAM. You can purchase between 1 and 20 GB of dedicated cache. More than 20 GB is available upon request.
The shared memcache RAM available to an application still varies based on multiple factors.
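Since the shared capacity varies, the practical way to see what your app is actually getting is to ask the service itself; a small sketch using the standard memcache stats call:

```python
from google.appengine.api import memcache

# get_stats() returns a dict of cache metrics, or None on error.
stats = memcache.get_stats()
if stats:
    # Current footprint and effectiveness of the cache for this app.
    print('items: %(items)d, bytes: %(bytes)d' % stats)
    print('hits: %(hits)d, misses: %(misses)d' % stats)
    print('oldest item age: %(oldest_item_age)d seconds' % stats)
```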
