I need a NoSQL database to write continuous log data, approximately 100 writes per second. A single record contains 3 columns and is less than 1 KB. I only need to read the data once a day, after which I can delete all of that day's data. But I can't decide which is the cheapest solution: Google App Engine with the Datastore, or Heroku with MongoLab?
I can give you costs for GAE:
Taking the billing docs and assuming about 258M operations per month (86,400 seconds per day * 100 requests/s * 30 days), this would cost you:
Writing: 258M records * ($0.2 / 100k ops) = $516 for writing unindexed data
Reading: 258M records * ($0.07 / 100k ops) = $180 for reading each record once
Deleting: 258M records * ($0.2 / 100k ops) = $516 for deleting unindexed data
Storage: 8.6M entities at 1 KB per day = 8.6 GB per day ≈ 260 GB accumulated over a month, or ~130 GB on average
Storage cost: 130 GB * $0.12/GB ≈ $16 / month
So your total cost on GAE would be roughly $1,230 per month. Note that using a structured database for writing unstructured data is not optimal, and that is reflected in the price.
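As a back-of-the-envelope sketch of the same arithmetic (the per-operation and per-GB rates are the ones quoted above and may have changed since):

    # Rough GAE Datastore cost estimate for ~100 writes/s of ~1 KB records.
    ops_per_month = 258e6            # ~86,400 s/day * 100 writes/s * 30 days
    write_rate = 0.20 / 100000       # $ per unindexed write or delete op
    read_rate = 0.07 / 100000        # $ per read op

    writes = ops_per_month * write_rate    # ~$516
    reads = ops_per_month * read_rate      # ~$181
    deletes = ops_per_month * write_rate   # ~$516
    storage = 130 * 0.12                   # ~130 GB averaged * $0.12/GB ~= $16

    print(writes + reads + deletes + storage)   # ~$1,230 per month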
With App Engine, it is recommended that you use memcache for operations like this, and memcache does not incur database charges. Using Python 2.7 and ndb, memcache is used automatically and you will end up with at most 1 database write per second.
At current billing:
6 cents per day for reads/writes.
Less than $1 per day for storage
I have read the Snowflake documentation extensively. Snowflake incurs storage costs when data is updated.
"tables-storage-considerations.html" mentions:
As an extreme example, consider a table with rows associated with
every micro-partition within the table (consisting of 200 GB of
physical storage). If every row is updated 20 times a day, the table
would consume the following storage:
Active 200 GB | Time Travel 4 TB | Fail-safe 28 TB | Total Storage 32.2 TB
The first question is: if a periodic task runs 20 times a day and each run updates exactly one row in every micro-partition, does the table still consume 32.2 TB of total storage?
"data-time-travel.html" mentioned that:
Once the defined period of time has elapsed, the data is moved into
Snowflake Fail-safe and these actions can no longer be performed.
So my second question is: why is the Fail-safe cost 28 TB rather than 24 TB (i.e., reduced by the 4 TB already counted for Time Travel)?
https://docs.snowflake.com/en/user-guide/data-cdp-storage-costs.html
https://docs.snowflake.com/en/user-guide/tables-storage-considerations.html
https://docs.snowflake.com/en/user-guide/data-time-travel.html
First question: yes. What matters is that the micro-partition changes, not how many rows within it change.
Second question: Fail-safe holds 7 days of data. 4 TB per day × 7 = 28 TB.
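To spell out the arithmetic from the docs example (assuming all 200 GB of micro-partitions is rewritten on each of the 20 daily updates, with 1 day of Time Travel and 7 days of Fail-safe retention):

    # Storage blow-up when every micro-partition is rewritten 20 times a day.
    active_gb = 200.0
    rewrites_per_day = 20
    time_travel_days = 1
    fail_safe_days = 7

    churn_gb_per_day = active_gb * rewrites_per_day        # 4,000 GB = 4 TB/day
    time_travel_gb = churn_gb_per_day * time_travel_days   # 4 TB
    fail_safe_gb = churn_gb_per_day * fail_safe_days       # 28 TB
    total_tb = (active_gb + time_travel_gb + fail_safe_gb) / 1000.0
    print(total_tb)                                         # ~32.2 TB

Because retention is tracked per micro-partition, updating a single row in each partition rewrites the whole partition, so the same numbers apply to the first question.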
I'm deciding between GSS (Google Site Search) and CSE (Custom Search Engine) with the JSON API, but I'm a little confused about JSON API billing.
My approved starting budget is $100 per year, which buys 20,000 queries/year with GSS. How many queries will I get with the JSON API, and how must I set the quota so I don't exceed the budget?
Here is my understanding of how Google bills:
The price of 1 query is $0.005 ($5 / 1,000 queries): https://developers.google.com/custom-search/json-api/v1/overview#pricing
Google sums the daily queries above the 100 free per day and then bills monthly. So my quota has to be set to 154 (100 free + 54):
54 queries per day × 31 days × 12 months = 20,088 queries × $0.005 = $100.44, which is the maximum I will pay (possibly less).
Am I right, or does Google bill in a different way?
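In other words, my assumed worst-case calculation as a small script (the $0.005 rate and the 100 free queries per day are taken from the pricing page above; whether billing really works this way is exactly what I'm asking):

    # Maximum yearly spend if the daily quota is capped at 154 queries.
    free_per_day = 100
    quota_per_day = 154
    price_per_query = 0.005

    billable_per_day = max(0, quota_per_day - free_per_day)   # 54
    billable_per_year = billable_per_day * 31 * 12             # 20,088 worst case
    print(billable_per_year * price_per_query)                  # ~$100.44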
My GAE app will request weekly data from Google Analytics, such as:
number of visitors during last week
number of visitors of particular page during last week
etc.
Then I would like to show this data on my GAE web-page with Google Charts. The data will be shown for last X weeks (let's say, 10 weeks).
What is the best approach to store this data (number of metrics multiplied by number of weeks)? Old data could be deleted.
I don't think I should use the datastore like this:
class Visitors(ndb.Model):
    week1 = ndb.IntegerProperty(default=0)  # should also store week start and end dates
    week2 = ndb.IntegerProperty(default=0)
    ...
Probably, it would be better to store data like:
class Analytics(ndb.Model):
    visitors = ndb.StringProperty(default='')  # comma-separated values like '1000,1001,1002'; last value is the previous week
    page_visitors = ndb.IntegerProperty(repeated=True)  # e.g. [1000, 1001, 1002]; a repeated property can't take a default
    ...
What are you trying to optimize?
With this amount of data, you will pay pennies, or less, for data storage. You are well within the free quota on datastore reads and writes. Performance-wise, the difference is negligible.
I would recommend going with the most straightforward solution: each week is a new entity, each data point is in its own property.
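A minimal sketch of that layout (the class and property names here are illustrative, not from the answer):

    from google.appengine.ext import ndb

    class WeeklyStats(ndb.Model):
        week_start = ndb.DateProperty(required=True)     # start of the week the numbers cover
        visitors = ndb.IntegerProperty(default=0)        # total visitors that week
        page_visitors = ndb.IntegerProperty(default=0)   # visitors of the particular page

    # Fetch the last 10 weeks, newest first, for the Google Charts page.
    recent = WeeklyStats.query().order(-WeeklyStats.week_start).fetch(10)

Entities older than the window you want to chart can simply be deleted by key.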
I am currently running a test to estimate how much my Google App Engine app can do without going over quota.
This is my test:
I have an entity in the datastore that, according to my local
dashboard, needs 18 write operations. I have 5 entities of this kind
in a table.
Every 30 seconds, I fetch those 5 entities mentioned above. I DO
NOT USE MEMCACHE FOR THESE !!!
That means 5 × 18 = 90 read operations per fetch, right?
In 1 minute that means 180, and in 1 hour 10,800 read operations, which is ~20% of the daily quota...
However, after the test had been running for 1 hour, I noticed on my online dashboard that only 2% of the read operations had been used. Why is that? Where is the flaw in my calculations?
Also, where in the online dashboard can I see how many read/write operations an entity needs?
Thanks
A write on your entity may need 18 writes, but a get on your entity will cost you only 1 read.
So if you get 5 entities every 30 seconds for one hour, you'll have about 5 reads × 120 = 600 reads.
That is the case when you do a get on your 5 entities (fetching each entity by its id).
If you run a query for each entity instead, the cost is "1 read for the query + 1 read per entity retrieved", which means 2 reads per entity, so around 1,200 reads in one hour.
For more detailed information, see the documentation on estimating costs.
You can't see on the dashboard how many read/write operations an entity needs, but I invite you to check Appstats for that.
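A small sketch of the two access patterns being compared (the model name is illustrative only; the read counts in the comments follow the reasoning above):

    from google.appengine.ext import ndb

    class Entry(ndb.Model):
        data = ndb.StringProperty()

    ids = [1, 2, 3, 4, 5]

    # Get by key: 1 read per entity, i.e. 5 reads for the 5 entities.
    by_key = ndb.get_multi([ndb.Key(Entry, i) for i in ids])

    # Query: 1 read for the query itself plus 1 read per entity returned.
    by_query = Entry.query().fetch(5)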
Assume that I upload 1 GB (one gigabyte) of data to my GAE Blobstore every day. How can I calculate my storage cost at the end of the first year?
At the end of the year you will have 365 GB, which at today's price of $0.13 per GB per month means you will be paying $47.45 per month (~$1.50 per day) for the blob storage alone.
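If you want the cumulative bill for the whole first year rather than just the final month, a rough sketch (assuming 1 GB added per day, the $0.13/GB-month rate above, and billing on the average amount stored each month) looks like this:

    PRICE_PER_GB_MONTH = 0.13
    GB_PER_DAY = 1.0
    DAYS_PER_MONTH = 30

    total_cost = 0.0
    stored_gb = 0.0
    for month in range(12):
        start_gb = stored_gb
        stored_gb += GB_PER_DAY * DAYS_PER_MONTH
        avg_gb = (start_gb + stored_gb) / 2          # average stored this month
        total_cost += avg_gb * PRICE_PER_GB_MONTH

    print("Stored at year end: %.0f GB" % stored_gb)                            # ~360 GB
    print("Cost in the last month: $%.2f" % (stored_gb * PRICE_PER_GB_MONTH))   # ~$47
    print("Total for the first year: $%.2f" % total_cost)                       # ~$280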