Introducing randomness of time using cron and task queue - google-app-engine

I'm looking for some engineering creativity to solve a problem on Google App Engine.
I have a small number of jobs that run periodically, but I'd like the jobs to be executed at random times. So instead of running a job every Tuesday at 2:00pm, I'd like it to run every Tuesday "between 2:00pm and 5:00pm".
Currently, I'm using the following algorithm...
1. A cron job runs every Tuesday at 2:00pm.
2. The cron handler finds a list of specific jobs to run and creates a task queue event for each discrete task.
3. The respective task queue handler decides if it should actually run by picking a random number between one and N. If the random number is X, the job gets executed. Otherwise, it creates a new task queue event to try again. Each task has a maximum number of queue attempts to guarantee that the job actually completes at some point.
I've realized that another solution would be to create a task queue that has a very slow rate, and when the cron job fills the queue, it randomly re-orders the list of tasks before doing so.
Any ideas from App Engine users?

Have a cron job at 2 pm that queues a task with a random countdown between 0 and 3 hours?
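A minimal sketch of that idea, assuming the Python runtime. The `enqueue` callable and the job URLs are placeholders I made up; in a real handler `enqueue` could wrap `taskqueue.add(url=..., countdown=...)`.

```python
import random

WINDOW_SECONDS = 3 * 60 * 60  # the 2:00pm to 5:00pm window, in seconds


def random_countdown(window_seconds=WINDOW_SECONDS, rng=random):
    """Return a delay in whole seconds, uniform over [0, window_seconds]."""
    return rng.randint(0, window_seconds)


def enqueue_jobs(job_urls, enqueue, window_seconds=WINDOW_SECONDS, rng=random):
    """Call enqueue(url, countdown) once per job, each with its own random delay.

    `enqueue` is a hypothetical stand-in for the real queue call, e.g.
    lambda url, countdown: taskqueue.add(url=url, countdown=countdown)
    """
    delays = {}
    for url in job_urls:
        delays[url] = random_countdown(window_seconds, rng)
        enqueue(url, delays[url])
    return delays
```

The cron handler that fires at 2:00pm would call `enqueue_jobs` once and return; each job then runs at its own random point in the window, with no retry loop needed.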

Related

Continuously running service in Google Cloud Engine

I am trying to figure out how to run a service (1) when it does not receive any calls.
I want to use a microservices architecture.
Basically, I want service (1) to run while the other service (2) is receiving calls and all the data.
Since service (1) does not receive requests, it should not have to spawn new instances; I want only service (2) to scale.
I have noticed that jobs can be scheduled with cron.yaml, but the number of calls is limited.
I need service (1) to be active every minute while service (2) is active.
It's hard to give a good answer without knowing more about what service (1) has to do when it is 'active'. It sounds like you want cron to launch a task every minute.
You can use cron in conjunction with push queues: https://cloud.google.com/appengine/docs/standard/go/taskqueue/push/
When creating a push queue task, you can set the property delay before adding it to the queue: https://cloud.google.com/appengine/docs/standard/go/taskqueue/reference#Task
(For me in Python they called it countdown https://cloud.google.com/appengine/docs/standard/python/refdocs/google.appengine.api.taskqueue.taskqueue#google.appengine.api.taskqueue.taskqueue.add)
You could have a cron job that fires every 24 hrs. That cron job would load up your push queue with tasks whose delays are staggered. The delay of the first one is 1 min, the delay of the second one is 2 min, etc.
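Roughly, the daily cron handler could compute the delays like this. This is only a sketch: `enqueue` and the `/tasks/tick` URL are placeholders, and in Python `enqueue` would wrap `taskqueue.add(url=..., countdown=...)`.

```python
def staggered_countdowns(period_seconds=24 * 60 * 60, step_seconds=60):
    """Return one countdown per step: 60, 120, ... up to period_seconds."""
    return list(range(step_seconds, period_seconds + 1, step_seconds))


def fill_queue(enqueue, url="/tasks/tick", **kwargs):
    """Call enqueue(url, countdown) for every staggered countdown.

    `enqueue` and the default URL are hypothetical; returns the number
    of tasks that were enqueued.
    """
    countdowns = staggered_countdowns(**kwargs)
    for countdown in countdowns:
        enqueue(url, countdown)
    return len(countdowns)
```

With the defaults that is 1440 tasks per day, one per minute.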

App Engine Cron.yaml run multiple instances of a script

How can one run multiple instances of a script using Google App Engine's cron system?
By default, it will run, then wait the specified interval before running again, which means that only one instance runs. What I am looking for is a way to have a script that takes 2+ minutes to run start a new instance every 30-60 seconds, regardless of whether it is already running. This assumes the script does not interfere with itself when multiple instances are running. It would effectively allow the script to process several times more information in the same period of time.
Edit: completely reworded the question.
You only get resolution to the minute. To get finer-grained scheduling, you'll need instances that know whether they should handle the request from cron immediately, or if they'll have to sleep 30 seconds first. A 30 second sleep uses up half of the 60 second request deadline. Depending on the workload you expect to handle, this might require that you use Modules.
By the way, I'm not aware of any guarantee that a job scheduled for 01:00 will fire at exactly 01:00:00 (and not at, say, 01:00:03).
Since the cron service doesn't allow intervals below 1 min, you'd need to stagger the script launches in a different manner.
One possibility would be to have a cron entry handler running every 2 mins which internally sleeps for 30 seconds (or as low as your "few seconds of each-other" requirements are) between triggering the respective script instance launches.
Note: the sleeps would probably burn into your Instance Hours usage. You might be able to incorporate the staggered triggering logic into some other long-living task you may have instead of simply sleeping.
To decouple the actual script execution from the cron handler (or the other long-living task) execution you could use dedicated task queues for each script instance, with queue handlers sharing the actual script code if needed. The actual triggering would be done by enqueueing tasks in the respective script instance queue. As a bonus you may further control each script instance executions by customizing the respective queue configuration.
Note: if your script execution time exceeds the 2 minute cron period, you may need to take extra precautions in the queue configurations, as there can be extra delays (due to queueing) which could push launching of the respective script instance closer to the next instance launch.
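As a rough sketch of the staggered triggering described above: the handler launches one script instance, sleeps, launches the next, and so on until the cron period is used up. `launch` is a hypothetical callback that would enqueue a task on the queue dedicated to script instance `i`; the 120 and 30 second values come from the example numbers in this answer.

```python
import time


def trigger_staggered(launch, period_seconds=120, gap_seconds=30, sleep=time.sleep):
    """Call launch(i) every gap_seconds until the cron period is covered.

    `launch` is a placeholder for enqueueing a task in script instance i's
    queue. `sleep` is injectable so the logic can be tested without waiting.
    Returns the number of launches performed.
    """
    launches = period_seconds // gap_seconds
    for i in range(launches):
        launch(i)
        # No sleep after the last launch: the next cron run takes over.
        if i < launches - 1:
            sleep(gap_seconds)
    return launches
```

With the defaults, the handler stays alive for about 90 seconds per run, which is where the Instance Hours cost mentioned above comes from.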
Working off Dave W. Smith's answer, the line would be
every 1 minute from 00:00 to 23:59
Which means that it would create a new instance every minute, even if the script takes longer than a minute to run. It does seem that specifying seconds is not possible.

Scheduling cron jobs

I want to develop an app in which a user can register for (multiple) alerts, so that whenever a fare drops below some threshold, he gets a notification. Fares are fetched from a third-party website. I want to do this on Google App Engine.
From what I understand, I need a process running 24/7 that checks the fares at intervals of, say, 30 minutes and sends out a notification whenever a fare drops below the threshold. Can App Engine's cron service be used for this task? At most 100 cron jobs can be scheduled, so what would be a better way to do this? Also, having a process for each user would be a waste of resources; what scheduling approach would be more efficient?
You want to schedule a single cron that runs every 30 minutes and throws an item onto a task queue. That single item on the task queue would then be able to go through all your users, and generate tasks to fetch whatever you need in the background again. Two important things:
1. You want the initial cron call to return as quickly as possible, as URLs have a 60 second deadline.
2. Split up any work into separate task queues to achieve the above, and also to iterate through data sources and/or users.
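To illustrate the fan-out, here is a sketch of what the single task kicked off by cron might do: split the users into batches and enqueue one background task per batch. The `/tasks/check_fares` URL, the batch size and the `enqueue` callable are all made up for the example; `enqueue` would wrap the real push queue call.

```python
def batches(items, size):
    """Yield consecutive slices of `items`, each with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def fan_out(user_ids, enqueue, batch_size=100):
    """Enqueue one task per batch of users and return the number of tasks.

    `enqueue(url, params)` is a hypothetical wrapper around the queue API.
    """
    count = 0
    for batch in batches(user_ids, batch_size):
        enqueue("/tasks/check_fares", {"user_ids": batch})
        count += 1
    return count
```

This keeps the cron request itself short, and the number of cron entries stays at one no matter how many users register.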
Based on what you're explaining, you can use push task queues: https://cloud.google.com/appengine/docs/python/taskqueue/overview-push

[google-app-engine]Cron Error - Launching tasks every few seconds instead of the specified frequency

I run an application that usually triggers a heavy update every 2 hours by queuing heavy tasks, using the cron mechanism. This has been working well for months.
However, on December 16th, 2012, this URL was called (by user-agent AppEngine-Google) every few seconds between 15:17 and 15:51, launching hundreds of heavy tasks. This blew through my quotas and forced me to switch to the paid version of the application in order to keep my website alive.
Is anybody having the same issue? Any idea what happened and how I could avoid this problem in the future?
I had the same issue.
I don't have an answer, but I think this is a task queue problem.
I have 4 cron jobs and some task queue tasks piled up. Everything was normal until 14:05Z (16th 6:05 PST?).
At 14:06Z and 14:07Z, two of my cron jobs were called (at their scheduled times) and finished with 200. After that, AppEngine-Google started to call the same jobs a few times every minute. The disorder vanished after 14:50Z, and there are no issues right now.
During the period, one of my task queue tasks was called at 14:11Z and finished with 503 (this was an expected failure). The task was scheduled to retry some hours later, but was called hundreds of times in an hour. The task's retry count was not incremented.
My guess is that something went wrong in task queues ("__cron" and, for me, "default") and the tasks were not removed until 14:50Z.
My app's App ID is vidssage.

How Google App Engine Java Task Queues can be used for mass scheduling for users?

I am focusing on GAE-J for developing a Java web application.
I have a scenario where a user creates a schedule for a set of reminders, and I have to send emails at that particular date and time.
I cannot create threads on GAE, so the solution I have in mind is Task Queues.
Can I achieve this functionality with Task Queues? The user will create tasks, and App Engine will execute them at a specific date and time.
Thanks
Although using the task queue directly, as Chris suggests, will work, for longer reminder periods (eg, 30+ days) and in cases where the reminder might be modified, a more indirect approach is probably wise.
What I would recommend is storing reminders in the datastore, and then taking one of a few approaches, depending on your requirements:
1. Run a regular cron job (say, hourly) that fetches a list of reminders coming up in the next interval, and schedules task queue tasks for each.
2. Have a single task that you schedule to be run at the time the next reminder (system-wide) is due, which sends out the reminder(s) and then enqueues a new task for the next reminder that's due.
3. Run a backend, as Chris suggests, which regularly scans the datastore for upcoming reminders.
In all the above cases, you'll probably need some special case code for when a user sets a reminder in less than the minimum polling interval you've set - probably enqueuing a task directly. You'll also want to consider batching up the sending of reminders, to minimize tasks and wallclock time consumed.
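For the first approach (the hourly cron job), the core step is picking out the reminders that fall in the next interval and working out a delay for each. A sketch in Python for brevity, although the question is about Java; it assumes `reminders` maps a reminder id to its due time and contains only unsent reminders, standing in for a datastore query.

```python
from datetime import datetime, timedelta


def due_countdowns(reminders, now, interval=timedelta(hours=1)):
    """Map reminder id -> seconds to wait, for reminders due before now + interval.

    `reminders` is a hypothetical {id: due_datetime} dict of unsent reminders.
    Reminders that are already overdue get a countdown of 0 so they go out
    immediately.
    """
    result = {}
    for reminder_id, due_at in reminders.items():
        if due_at < now + interval:
            seconds = int((due_at - now).total_seconds())
            result[reminder_id] = max(0, seconds)
    return result
```

Each entry of the result would become one task queue task with that delay; reminders further out are left for a later cron run.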
You can do this with Task Queues - basically when you receive the request 'remind me at date/time X by sending an email', you create a new task with the following basic structure:
if current time is close to or past the given date/time X:
    send the email
else:
    fail this task
If the reminder time is far in the future, the first few times the task is scheduled, it will fail and be scheduled for later. The downside of this approach is that it doesn't guarantee that the task will run exactly when the reminder is supposed to be sent - it may be a little while before or afterwards. You could slim down this window by taking into account that your task can run for 10 minutes, so if you're within 10 minutes of the reminder time, sleep until the right time and then send the e-mail.
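The task body above could look something like this. It is a sketch in Python rather than Java, and `send_email` and the one minute tolerance are my own placeholders. It relies on push queues treating any response outside the 200-299 range as a failure and retrying the task.

```python
from datetime import datetime, timedelta


def handle_reminder_task(due_at, now, send_email, tolerance=timedelta(minutes=1)):
    """Return the HTTP status the task handler should respond with.

    200 means the email was sent and the task is done; 503 marks the task
    as failed so the queue schedules a retry. `send_email` and `tolerance`
    are hypothetical names used only for this example.
    """
    if now >= due_at - tolerance:
        send_email()
        return 200
    return 503
```

How close to the reminder time the email actually goes out then depends on the queue's retry settings.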
If the reminders have to be sent out as close in time as possible then just use a Backend - keep an instance running forever and dispatch all reminders to it, and it can continuously look at all reminders it has to send out and send them out at exactly the right time.
