Queuing Emails on App Engine - google-app-engine

I need to send out emails at a rate exceeding App Engine's free email quota (8 emails/minute). I'm planning to use a TaskQueue to queue the emails, but I wondered: is there already a library or Python module I could use to automate this? It seems like the kind of problem someone might have run into before.

If it's an option, why not just enable billing? It'll jump the max rate from 8 recipients/minute to 5,100 recipients/minute.
The first 2,000 recipients each day are free, so as long as you stay within the daily free quotas it shouldn't cost you anything (and if you need to email more than 2,000 people per day, you're going to have to enable billing anyway).

The deferred library is designed for exactly this sort of thing. Simply use deferred.defer(message.send), and make sure the queue you're using has the appropriate execution rate.
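For instance, a minimal sketch, assuming a push queue named 'outgoing-mail' that you define yourself in queue.yaml with a suitably low rate (the queue name and sender address below are placeholders):

    from google.appengine.api import mail
    from google.appengine.ext import deferred

    def queue_welcome_email(user_address):
        # Build the message as usual...
        message = mail.EmailMessage(
            sender='noreply@your-app-id.appspotmail.com',  # assumed sender address
            to=user_address,
            subject='Welcome!',
            body='Thanks for signing up.')
        # ...but defer the actual send onto a queue whose rate you control.
        # 'outgoing-mail' is a hypothetical queue.yaml entry with e.g. rate: 8/m.
        deferred.defer(message.send, _queue='outgoing-mail')

Each call becomes one task, and App Engine drains the queue at the configured rate, so the sending is throttled for you without any bookkeeping on your side.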

It's cheaper to just pay for it for a year than to engineer a workaround.

The easiest way, in my opinion, would be to use an external queue such as Amazon SQS and pull 8 records per minute from a cron job that runs every minute.
Counting each message being pushed into the queue and then taken out, the math works out to an extremely cheap service.
See below: $0.000002 is the rate for 2 requests (add and view).
At 8 requests per minute, 60 minutes per hour, 24 hours per day, and 30 days in an average month, you're still under $1:
$0.000002 * 8 * 60 * 24 * 30 = $0.6912
This might not be exactly what you were looking for, but it should be a pretty simple solution.
EDIT:
See here for a Python SQS & S3 library (SQS is all you should need):
http://pypi.python.org/pypi/Python-Amazon/0.5
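As a rough sketch of the consumer side, here's what pulling up to 8 messages per cron run could look like. This uses boto3 rather than the library linked above, and the queue URL, the sender address, and the assumption that each message body is just a recipient address are all placeholders:

    import boto3
    from google.appengine.api import mail

    QUEUE_URL = 'https://sqs.us-east-1.amazonaws.com/123456789012/outgoing-mail'  # placeholder

    sqs = boto3.client('sqs')

    def process_email_batch():
        # Cron handler, run once a minute: pull at most 8 queued emails.
        response = sqs.receive_message(QueueUrl=QUEUE_URL, MaxNumberOfMessages=8)
        for msg in response.get('Messages', []):
            recipient = msg['Body']  # assumes the producer enqueued just the address
            mail.send_mail(sender='noreply@your-app-id.appspotmail.com',  # assumed sender
                           to=recipient,
                           subject='Your notification',
                           body='Hello from the queue-drained mailer.')
            # Delete only after the send succeeds, so failures stay queued and get retried.
            sqs.delete_message(QueueUrL=QUEUE_URL, ReceiptHandle=msg['ReceiptHandle']) if False else \
                sqs.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=msg['ReceiptHandle'])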

I'm not familiar with any canned solutions to this problem, but it should be something very easy to solve. Write the emails to a datastore model with a date field that uses auto_now_add to record the order in which they arrived. A cron job that runs every minute pulls the eight oldest records, mails them, and deletes them.
Certainly, if you can solve this in a reasonably generic manner, you can be the person who solves this problem for everyone with a nice open source module.
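A rough sketch of that approach with the old db API; the model, field names, and sender address are just illustrative:

    from google.appengine.api import mail
    from google.appengine.ext import db

    class QueuedEmail(db.Model):
        recipient = db.StringProperty(required=True)
        subject = db.StringProperty()
        body = db.TextProperty()
        created = db.DateTimeProperty(auto_now_add=True)  # records insertion order

    def send_pending_emails():
        # Cron handler, run once a minute: take the eight oldest queued emails.
        for email in QueuedEmail.all().order('created').fetch(8):
            mail.send_mail(sender='noreply@your-app-id.appspotmail.com',  # assumed sender
                           to=email.recipient,
                           subject=email.subject,
                           body=email.body)
            # Delete only after a successful send so failures are retried next run.
            email.delete()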

Related

Mass Booking Scheduler

I am trying to create a scheduler for my organization that allows an admin to schedule 100+ people for a single time slot. Each scheduled person must have their own message and receive a reminder 24 hours before the event. The event repeats every M-F at the exact same time. We also need to be able to check for duplicates, mark attendance, and roll no-shows over to the next day. We used to use a Google Sheet for this, but that is no longer an option. I looked at using Google Calendar, which with its scheduling would be perfect, but it is limited to only 4 slots per hour. I was curious if anyone had any ideas about what would work best here. I have really racked my brain, and short of writing a custom app, which I do not have the time to do, I do not know where to go.
Thank you

Interpreting cost data in Google Cloud Platform

I host a basic web app on Google Cloud Platform, and I've noticed my costs creeping up over the last couple of months. It's really accelerated over the last 30 days (fortunately, on a tiny base - I'm still ticking along at under $2 a day). I haven't added any new functionality or clients in months so this was a bit surprising.
My first instinct was an increase in traffic. I couldn't see anything like that in the App Engine dashboard, but I put in a heap of optimizations and dramatically decreased QPS just in case. No change.
The number of instances hasn't moved around much either - this looks like the most likely culprit but it's still just flat, not growing.
My next guess was that data was accumulating in Datastore (even though the cost chart is filtered to App Engine only, I figured a fuller datastore -> a slower datastore -> more instance time in GAE). There's no chart for this, annoyingly, but I determined the data store size was more or less flat (I have a blunt instrument TTL job that runs daily) and culled it by dropping my retention threshold by 20% just to be safe.
These optimizations were on the 17th, but my cost hasn't moved at all. I considered forex fluctuations (I'm billed in Aussie dollars, all my charges are for frontend instances in Japan) but they haven't been anywhere near big enough to explain this.
Any ideas what's going on? I've clicked through all the graphs and reports in billing but can't reconcile the ~100% growth in cost with a flat or dropping qps, instance count and database size.
Yes! I've seen the same thing on a simple App Engine website running Python 3.7! I've had a ticket open since April 29th and they're not helpful. I saw a step change in frontend instance hours on March 24th with no corresponding increase in traffic. I have screenshots that are really telling but I can't upload them since I don't have 10 reputation points.
There's no corresponding increase in traffic, either in the cloud console or in Google analytics.
What's worse, each day the daily estimate shows I'll be under the 28-hour quota. For example, I took a screenshot showing that after 15 hours I was on pace for 24.352 frontend instance hours for the day (I didn't take one at the end of the quota day since it resets at 3 AM).
When I woke up the next morning the billing report showed I was charged $0.00 for frontend instance hours for the previous day, but 3 hours later it shot up to $0.48, which means I used 38.6 frontend instance hours worth.
Somehow, the estimated cost calculation was off by 14 hours. Why have the estimate at all if it has an error that large? When I looked at the minute-by-minute billed instance hours for the hours after taking the screenshot through the end of the quota-day, there's nothing that indicates I would have used 23 additional hours from the time I took the screenshot to the time of the quota reset.
This behavior has been happening every day since March 24th for me with no explanation from Google besides "it looks like you exceeded your instances..." I wish I could share the screenshots so you can compare what you're seeing.

how do you deploy a cron script in production?

I would like to write a script that schedules various things throughout the day. Unfortunately it will do more than 100 different tasks a day, closer to 500, and could be up to 10,000 in the future.
All the tasks are independent: think of my script as a service for end users who sign up and want me to schedule a task for them. So if 5 people sign up and person A wants me to send them an email at 9 AM, that will be different from person B, who might want me to query an API at 10:30 PM, etc.
Conceptually, I plan to have a database that tells me what each person's task is, what time they asked to schedule it, and the frequency. Once a day I will pull this data from my database so I have an up-to-date record of all the tasks that need to be executed that day.
Running them through a loop, I can create channels that execute timers or tickers for each task.
The question I have is: how does this get deployed in production to, for example, Google App Engine? Since those platforms are for web servers I'm not sure how this would work... Or am I supposed to use Google Compute Engine and have it run the computation 24 hours a day? Can Google Compute Engine even make HTTP calls?
Also, if I have to keep, say, 500 Go channels open 24 hours a day, does that count as 500 containers in Google App Engine? I imagine that would get very costly quickly for what is essentially a very low-cost product.
So again the question comes back to: how does a cron script get deployed in production?
Any help or guidance would be greatly appreciated. I have done a lot of googling and unfortunately everything leads back to the Google App Engine cron scheduler, which has a limit of 100 tasks...
Details about cron operation on GAE can be found here.
The tricky part from your perspective is that updating the cron configuration is done from outside the application, so it's at least difficult (if not impossible) to customize the cron jobs based on your app users' actions.
It is, however, possible to run a generic cron job (once a minute, for example) and have that job's handler read the users' custom job configs and generate tasks accordingly to handle them. Running ~10K tasks per day is usually not an issue; they might even fit inside the free app quotas (depending on what the tasks are actually doing).
The same technique can be applied on a regular Linux OS (including on a GCE VM). I haven't used GCE yet, so I can't say exactly if or how a dynamically updated cron would be possible there.
You only need one cron job for your requirements. This cron job can run every 30 minutes - or once per day. It will see what has to be done over the next period of time, create tasks to do it, and add these tasks to the queue.
It can all be done by a single App Engine instance. The number of instances you need to execute your tasks depends, of course, on how long each task runs. You have a lot of control over running the task queue.
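A minimal sketch of the pattern both answers describe, assuming a db model for the per-user schedules and a single static cron entry that hits this handler (the model, field names, and the 30-minute window are made up for illustration):

    import datetime
    from google.appengine.api import taskqueue
    from google.appengine.ext import db

    class UserJob(db.Model):
        """One scheduled action a user signed up for (illustrative model)."""
        url = db.StringProperty(required=True)         # task handler that does the actual work
        next_run = db.DateTimeProperty(required=True)  # stored in UTC

    def enqueue_due_jobs():
        # Cron handler, e.g. run every 30 minutes from one static cron.yaml entry:
        # find the jobs due in the next window and turn each into a task with an ETA.
        window_end = datetime.datetime.utcnow() + datetime.timedelta(minutes=30)
        for job in UserJob.all().filter('next_run <=', window_end).fetch(1000):
            taskqueue.add(url=job.url, eta=job.next_run)
            # A real app would also advance job.next_run here for recurring jobs.

Because the per-user schedules live in the datastore rather than in cron.yaml, one static cron entry can drive an arbitrary number of user tasks, which sidesteps the 100-entry cron limit mentioned in the question.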

how is Billing for Channel API done?

I've chosen google-app-engine because of its scalability, and now I'm trying to understand how much I will have to pay once I release the product.
I've looked back and forth through the Google App Engine documentation to find an answer to this question and couldn't find one. I found a few details on the "Quotas" page: how much I can get for free and what the default billing quota is.
On the Billing page there are numbers for CPU, etc., with resource, unit, and cost. But nowhere could I find how much it will cost me per channel created, per call, etc.
I can't even try to make calculations from what's in the Admin Console, because the current numbers there are 0 (since the only 2 users are the programmers).
How can I be ready for the releasing of the product that (hopefully) will have a huge number of channels created daily?
Is there a page I missed, or is there a tool for calculating?
Thanks!
EDIT:
Moishe, thanks for the quick and readable answer. So here are some more questions:
1. Do you think - if needed - I will be able to get even more quota for the number of channels? I saw there's a special form for requesting more quota, but I'm not sure it covers the Channel API...
2. Are there any posts you've written on how to use the Channel API efficiently? I saw some stuff about reusing tokens per user. Is there more?
Thanks again.
Creating a channel costs about 2.7 CPU-seconds. A CPU-hour costs $0.10. So, each channel created costs
(2.7 / 3600) × $0.10 = $7.5 × 10^-5
So creating 1000 channels will cost $0.075, or 7.5 cents.
You'll also get charged the normal outgoing bandwidth costs for any data sent over a channel.
The CPU cost probably isn't the biggest concern; you're more likely to run into quota caps than to run out of money. Paid apps are limited to 86400 channel creations/day (1/second).
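For rough capacity planning you can script the same arithmetic; the constants below are just the figures quoted in this answer, not current pricing:

    # Back-of-the-envelope Channel API cost estimate using the numbers above.
    CPU_SECONDS_PER_CHANNEL = 2.7   # approximate CPU time to create one channel
    USD_PER_CPU_HOUR = 0.10         # CPU-hour price quoted in this answer
    DAILY_CHANNEL_CAP = 86400       # paid-app cap: one channel creation per second

    def daily_channel_cost(channels_per_day):
        if channels_per_day > DAILY_CHANNEL_CAP:
            raise ValueError('exceeds the daily channel creation quota')
        return channels_per_day * (CPU_SECONDS_PER_CHANNEL / 3600.0) * USD_PER_CPU_HOUR

    print(daily_channel_cost(1000))  # -> 0.075, i.e. 7.5 cents, matching the figure above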

GAE: How many tasks per second is enough?

I'm using GAE and the task queue. In my queue.yaml file I keep the default setting: 5/s. A month ago I thought that was enough, but now there are about 40-50 tasks sitting in one queue, so my application runs too slowly.
I want to know: how many tasks per second is enough? Can I change it to 100/s?
Thank you :)
Update:
My application gets data from some social networks, does some calculations, and saves the results to the datastore. To stay under GAE's 30-second request limit, I split this operation into tasks. I want to know the GAE task queue limits before I change the setting and redeploy :)
http://code.google.com/appengine/docs/python/taskqueue/overview.html#Quotas_and_Limits
or
http://code.google.com/appengine/docs/java/taskqueue/overview.html#Quotas_and_Limits
I'd highly recommend increasing the setting in steps to find your performance sweet spot. The rate you need is obviously higher than 5/s, but you won't know the right number until you've run at it for a while, and heading straight to the maximum doesn't sound like a good idea.
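For reference, raising the rate is just an edit to the queue definition in queue.yaml; the numbers below are only an assumed example of a moderate step up, not a recommendation:

    queue:
    - name: default
      rate: 20/s                    # sustained dispatch rate; step this up gradually
      bucket_size: 40               # allows short bursts above the sustained rate
      max_concurrent_requests: 10   # optional cap so tasks don't crowd out user-facing requests

The quota pages linked above list the hard per-queue limits; in practice the ceiling is usually how fast your handlers and the datastore can keep up, rather than the queue configuration itself.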
