I have a lot of cron jobs with the same config, and I want to use variables to reuse some of it.
Here is my attempt.
cron.yaml:
cron:
- description: 'a'
  url: /cron/events/a/b
  schedule: &schedule every 1 hours
  target: &target reuse-cron-config
- description: 'b'
  url: /cron/events/a/c
  schedule: *schedule
  target: *target
But when I ran gcloud app deploy ./cron.yaml, it threw an error:
ERROR: (gcloud.app.deploy) An error occurred while parsing file: [/Users/ldu020/workspace/nodejs-gcp/src/app-engine/standard-environment/reuse-cron-config/cron.yaml]
Anchors not supported in this handler
in "/Users/ldu020/workspace/nodejs-gcp/src/app-engine/standard-environment/reuse-cron-config/cron.yaml", line 4, column 15
All of my cron jobs have the same target and schedule. How can I solve this? Thanks.
Update
I have a route like this to get the params for each cron URL:
app.get('/cron/events/:topic/:retryTopic', (req, res) => {
  console.log(req.params); // { topic: 'a', retryTopic: 'b' }
});
You could wrap all of these cron entries into a single entry called 'hourly tasks' or 'daily tasks', and the request handler could then launch each of the individual tasks via the task queue.
This would also help you stay well under the cap imposed on the total number of cron jobs you're allowed to have:
https://cloud.google.com/appengine/docs/standard/python/config/cronref#limits
Free applications can have up to 20 scheduled tasks. Paid applications can have up to 250 scheduled tasks.
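A consolidated cron.yaml might look like this (the description and URL are illustrative, not from the original config; the schedule and target are the shared values from the question):
cron:
- description: 'hourly tasks'
  url: /cron/hourly-tasks
  schedule: every 1 hours
  target: reuse-cron-config
The /cron/hourly-tasks handler would then enqueue one task-queue task per topic, so the shared schedule and target only appear once.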
Related
We are facing this bug in our production environment. I have been looking for a solution for a while and cannot seem to find one. Any help would be appreciated.
We are using SageMaker Batch Transform to perform inference on our machine learning models. Each job is supposed to create one instance using a Docker image from our ECR repository. The job then consumes a payload and starts processing it using a PyTorch script. When the job is done, the script calls an API to store the results.
The issue is that when we check the CloudWatch logs for a SINGLE job, we see that it is repeated. After the job repeats multiple times, the individual instances of the same job may or may not finish, and the whole operation returns an error.
Basically, we see the following pattern in our CloudWatch logs and cannot figure out what is causing it:
2022-04-24 19:41:47,865 [INFO ] W-model-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Starting to Process Task: 12345678-abcd-1234-efgh-123456ab12c3
...
[The job is running and printing logs]
...
[There is no error but the job doesn't seem to run anymore, the same job seems to roll back again]
...
2022-04-24 19:52:09,522 [INFO ] W-model-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Starting to Process Task: 12345678-abcd-1234-efgh-123456ab12c3
...
[The job is running and printing logs]
...
[There is no error but the job doesn't seem to run anymore, the same job seems to roll back again]
...
2022-04-24 20:12:11,834 [INFO ] W-model-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Starting to Process Task: 12345678-abcd-1234-efgh-123456ab12c3
...
[The job is running and printing logs]
...
[There are no errors but the cloud watch logs stop here. Sagemaker returns an error to the client.]
The following sample code is what we are using to run the jobs:
import datetime
import time

# (method of a class; self.cnf holds the SageMaker configuration)
def inference_batch(self):
    batch_input = f"s3://{self.cnf.SAGEMAKER_BUCKET}/batch-input/batch.csv"
    batch_output = f"s3://{self.cnf.SAGEMAKER_BUCKET}/batch-output/"
    # note: %M is minutes; the original used %m (month) in the time part
    job_name = f"{self.cnf.SAGEMAKER_MODEL}-{datetime.datetime.now().strftime('%Y-%m-%d-%H-%M-%S')}"

    transform_input = {
        'DataSource': {
            'S3DataSource': {
                'S3DataType': 'S3Prefix',
                'S3Uri': batch_input
            }
        },
        'ContentType': 'text/csv',
        'SplitType': 'Line',
    }
    transform_output = {
        'S3OutputPath': batch_output
    }
    transform_resources = {
        'InstanceType': self.cnf.SAGEMAKER_BATCH_INSTANCE,
        'InstanceCount': 1
    }

    # self.sm_boto_client is an instance of boto3.Session(region_name="some-region").client("sagemaker")
    self.sm_boto_client.create_transform_job(
        TransformJobName=job_name,
        ModelName=self.cnf.SAGEMAKER_MODEL,
        TransformInput=transform_input,
        TransformOutput=transform_output,
        TransformResources=transform_resources,
    )

    status = self.sm_boto_client.describe_transform_job(TransformJobName=job_name)
    print(f'Executing transform job {job_name}...')

    # poll every 5 seconds until the job leaves the InProgress state
    while status['TransformJobStatus'] == 'InProgress':
        time.sleep(5)
        status = self.sm_boto_client.describe_transform_job(TransformJobName=job_name)

    if status['TransformJobStatus'] == 'Completed':
        print(f'Batch transform job {job_name} successfully completed.')
    else:
        raise Exception(f'Batch transform job {job_name} failed.')
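Side note: the hand-rolled polling loop above could also be written with boto3's built-in waiter, which polls describe_transform_job until the job reaches a terminal state. A minimal sketch, assuming your boto3 version ships this waiter:
# let boto3 poll DescribeTransformJob until Completed, Stopped, or Failed
waiter = self.sm_boto_client.get_waiter('transform_job_completed_or_stopped')
waiter.wait(
    TransformJobName=job_name,
    WaiterConfig={'Delay': 30, 'MaxAttempts': 120},  # poll every 30 s, up to 1 hour
)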
I have a Twitter bot which tweets out content a few times a day. Here is my cron.yaml file:
cron:
- description: "twitter instagram scraper"
  url: /scrape/twitter_intra
  schedule: every 24 hours
  target: scraper
- description: "USD to LKR scraper"
  url: /scrape/exRates
  schedule: every 1 hours
  target: scraper
- description: "last 24 hour weather"
  url: /scrape/weather_last24hours
  schedule: every day 12:00
  target: scraper
- description: "tweet out last 24 hour weather"
  url: /tweet/weather_last24hours
  schedule: every day 13:00
  target: twitter
- description: "tweet out exchange Rate USD to LKR"
  url: /tweet/exRates
  schedule: every day 7:00
  target: twitter
Here is an example of one request handler:
app.get(`/tweet/weather_last24hours`, async (req, res, next) => {
  console.log(`Tweet!! last 24 hours`);
  try {
    // await tweetText('This is a test');
    const report = await getWeatherLast24Hours();
    const content = makeTweetLast24HourWeather(report);
    console.log(content);
    await tweetText(content);
    res.status(200)
      .set('Content-Type', 'text/plain')
      .send(`Completed Successfully...!`)
      .end();
  } catch (error) {
    next(error);
  }
});
Right now it's working fine, except it costs me $2.20 a day because I have to keep an instance up all day by doing the following:
const PORT = process.env.PORT || 8080;
app.listen(PORT, () => {
  console.log(`App listening on port ${PORT}`);
  console.log('Press Ctrl+C to quit.');
});
I tried removing this part of the code, assuming that a GCP cron job can spin up an instance on its own and run the task. Even though the cron job itself was able to spin up the instance, the task failed with a 500 error code and the following message:
This request caused a new process to be started for your application,
and thus caused your application code to be loaded for the first time.
This request may thus take longer and use more CPU than a typical
request for your application.
This error message appears in the logs. I tried a few times and got the same result.
Is there a solution to this, such as starting an instance just before the cron job executes?
Can the newly introduced Cloud Scheduler help?
Am I doing something wrong?
Thank you in advance.
Your 500 is a warning telling you something you already know: boot-up time will be longer because a new instance needs to be warmed up. This is not necessarily a problem in itself, unless you notice that your tasks are not running properly.
To use Cloud Scheduler, consider decomposing your application into Cloud Functions. You can build a separate function for each of your 5 endpoints, plus a sixth that contains the shared logic (e.g. getWeatherLast24Hours()). You can then invoke them on a schedule using Cloud Scheduler.
The cost of running with Cloud Functions + Scheduler will be near-zero, so you'll want to do your own ROI evaluation to determine whether the development effort is worth the savings.
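For reference, one Scheduler job per endpoint might look roughly like this (the runtime, region, project, and function names are placeholders, not from the original setup):
# deploy one endpoint as an HTTP-triggered function (names are hypothetical)
gcloud functions deploy tweetWeatherLast24Hours --runtime nodejs10 --trigger-http

# trigger it at the same time of day the cron.yaml entry used (every day 13:00)
gcloud scheduler jobs create http tweet-weather-daily \
  --schedule="0 13 * * *" \
  --http-method=GET \
  --uri="https://REGION-PROJECT_ID.cloudfunctions.net/tweetWeatherLast24Hours"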
I have an RDS database instance on AWS and have turned it off for now. However, every few days it starts up on its own. I don't have any other services running right now.
There is this event in my RDS log:
"DB instance is being started due to it exceeding the maximum allowed time being stopped."
Why is there a limit to how long my RDS instance can be stopped? I just want to put my project on hold for a few weeks, but AWS won't let me turn off my DB? It costs $12.50/mo to have it sit idle, so I don't want to pay for this, and I certainly don't want AWS starting an instance for me that does not get used.
Please help!
That's a limitation of this new feature.
You can stop an instance for up to 7 days at a time. After 7 days, it will be automatically started. For more details on stopping and starting a database instance, please refer to Stopping and Starting a DB Instance in the Amazon RDS User Guide.
You can set up a cron job to stop the instance again after 7 days. You can also change to a smaller instance size to save money.
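The stop itself is a single CLI call, so any scheduler that can run the AWS CLI will do (the instance identifier below is a placeholder):
aws rds stop-db-instance --db-instance-identifier my-db-instance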
Another option is the upcoming Aurora Serverless, which stops and starts for you automatically. It might be more expensive than a dedicated instance when running 24/7.
Finally, there is always Heroku which gives you a free database instance that starts and stops itself with some limitations.
You can also try saving the following CloudFormation template as KeepDbStopped.yml and then deploy with this command:
aws cloudformation deploy --template-file KeepDbStopped.yml --stack-name stop-db --capabilities CAPABILITY_IAM --parameter-overrides DB=arn:aws:rds:us-east-1:XXX:db:XXX
Make sure to change arn:aws:rds:us-east-1:XXX:db:XXX to your RDS ARN.
Description: Automatically stop RDS instance every time it turns on due to exceeding the maximum allowed time being stopped
Parameters:
  DB:
    Description: ARN of database that needs to be stopped
    Type: String
    AllowedPattern: arn:aws:rds:[a-z0-9\-]+:[0-9]+:db:[^:]*
Resources:
  DatabaseStopperFunction:
    Type: AWS::Lambda::Function
    Properties:
      Role: !GetAtt DatabaseStopperRole.Arn
      Runtime: python3.6
      Handler: index.handler
      Timeout: 20
      Code:
        ZipFile:
          Fn::Sub: |
            import boto3
            import time

            def handler(event, context):
                print("got", event)
                db = event["detail"]["SourceArn"]
                id = event["detail"]["SourceIdentifier"]
                message = event["detail"]["Message"]
                region = event["region"]
                rds = boto3.client("rds", region_name=region)
                if message == "DB instance is being started due to it exceeding the maximum allowed time being stopped.":
                    print("database turned on automatically, setting last seen tag...")
                    last_seen = int(time.time())
                    rds.add_tags_to_resource(ResourceName=db, Tags=[{"Key": "DbStopperLastSeen", "Value": str(last_seen)}])
                elif message == "DB instance started":
                    print("database started (and sort of available?)")
                    last_seen = 0
                    for t in rds.list_tags_for_resource(ResourceName=db)["TagList"]:
                        if t["Key"] == "DbStopperLastSeen":
                            last_seen = int(t["Value"])
                    if time.time() < last_seen + (60 * 20):
                        print("database was automatically started in the last 20 minutes, turning off...")
                        time.sleep(10)  # even waiting for the "started" event is not enough, so add some wait
                        rds.stop_db_instance(DBInstanceIdentifier=id)
                        print("success! removing auto-start tag...")
                        rds.add_tags_to_resource(ResourceName=db, Tags=[{"Key": "DbStopperLastSeen", "Value": "0"}])
                    else:
                        print("ignoring manual database start")
                else:
                    print("error: unknown database event!")
  DatabaseStopperRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Action:
              - sts:AssumeRole
            Effect: Allow
            Principal:
              Service:
                - lambda.amazonaws.com
      ManagedPolicyArns:
        - arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole
      Policies:
        - PolicyName: Notify
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
              - Action:
                  - rds:StopDBInstance
                Effect: Allow
                Resource: !Ref DB
              - Action:
                  - rds:AddTagsToResource
                  - rds:ListTagsForResource
                  - rds:RemoveTagsFromResource
                Effect: Allow
                Resource: !Ref DB
                Condition:
                  ForAllValues:StringEquals:
                    aws:TagKeys:
                      - DbStopperLastSeen
  DatabaseStopperPermission:
    Type: AWS::Lambda::Permission
    Properties:
      Action: lambda:InvokeFunction
      FunctionName: !GetAtt DatabaseStopperFunction.Arn
      Principal: events.amazonaws.com
      SourceArn: !GetAtt DatabaseStopperRule.Arn
  DatabaseStopperRule:
    Type: AWS::Events::Rule
    Properties:
      EventPattern:
        source:
          - aws.rds
        detail-type:
          - "RDS DB Instance Event"
        resources:
          - !Ref DB
        detail:
          Message:
            - "DB instance is being started due to it exceeding the maximum allowed time being stopped."
            - "DB instance started"
      Targets:
        - Arn: !GetAtt DatabaseStopperFunction.Arn
          Id: DatabaseStopperLambda
It has worked for at least one person. If you have issues, please report them here.
I am trying to build my Solr index for Django on Ubuntu for the first time with ./manage.py rebuild_index, and I get the following error:
Removing all documents from your index because you said so.
Failed to clear Solr index: Connection to server 'http://localhost:8983/solr/update/?commit=true' timed out: HTTPConnectionPool(host='localhost', port=8983): Request timed out. (timeout=10)
All documents removed.
Indexing 4 dishess
Failed to add documents to Solr: Connection to server 'http://localhost:8983/solr/update/?commit=true' timed out: HTTPConnectionPool(host='localhost', port=8983): Request timed out. (timeout=10)
I can access localhost:8983/solr/ and localhost:8983/solr/admin via my web browser.
You can bump up the TIMEOUT in settings.py.
For example:
HAYSTACK_CONNECTIONS = {
    'default': {
        'ENGINE': 'haystack.backends.solr_backend.SolrEngine',
        'URL': 'http://127.0.0.1:8080/solr/default',
        'INCLUDE_SPELLING': True,
        'TIMEOUT': 60 * 5,
    },
}
The important thing here is that you shouldn't just increase the default timeout, because that could block all of your workers, since Haystack works synchronously.
The best way to avoid this is to define multiple connections for reads and writes, each with its own timeout:
http://django-haystack.readthedocs.org/en/latest/settings.html#haystack-connections
Then use routers to separate reads from writes: http://django-haystack.readthedocs.org/en/v2.4.0/multiple_index.html#automatic-routing
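A minimal sketch of that setup, assuming an app named myapp and the same Solr core as above (the aliases and timeout values are illustrative):
# settings.py: separate connections for reads (short timeout) and writes (long)
HAYSTACK_CONNECTIONS = {
    'default': {  # reads
        'ENGINE': 'haystack.backends.solr_backend.SolrEngine',
        'URL': 'http://127.0.0.1:8080/solr/default',
        'TIMEOUT': 10,
    },
    'write': {  # (re)indexing, where a long wait is acceptable
        'ENGINE': 'haystack.backends.solr_backend.SolrEngine',
        'URL': 'http://127.0.0.1:8080/solr/default',
        'TIMEOUT': 60 * 5,
    },
}
HAYSTACK_ROUTERS = ['myapp.routers.ReadWriteRouter']

# myapp/routers.py: route reads and writes to the matching connection
from haystack.routers import BaseRouter

class ReadWriteRouter(BaseRouter):
    def for_read(self, **hints):
        return 'default'

    def for_write(self, **hints):
        return 'write'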
I have the following problem. I have defined a cron job in Google App Engine, but my get method is not called (or, to be precise, it is called every other time: if I run it manually it does nothing the first time, but on the second run it works flawlessly). This is the logging output for the call made by cron:
2011-07-04 11:39:08.500 /suggestions/ 200 489ms 70cpu_ms 0kb AppEngine-Google; (+http://code.google.com/appengine)
0.1.0.1 - - [04/Jul/2011:11:39:08 -0700] "GET /suggestions/ HTTP/1.1" 200 0 - "AppEngine-Google; (+http://code.google.com/appengine)" "bazinga-match.appspot.com" ms=489 cpu_ms=70 api_cpu_ms=0 cpm_usd=0.001975 queue_name=__cron task_name=a449e27ff383de24ff8fc5d5f05f2aae
As you can see, it makes a GET request to /suggestions/, but nothing happens, including my log messages (they are printed when I run it a second time manually). Do you have any idea why this might be happening?
My handler:
class SuggestionsHandler(RequestHandler):
    def get(self):
        logging.debug('Creating suggestions')
        for key in db.Query(User, keys_only=True).order('last_suggestion'):
            make_suggestion(key)
        logging.debug('Done creating suggestions')
        print
        print('Done creating suggestions')
This is my cron.yaml:
cron:
- description: daily suggestion creation
  url: /suggestions/
  schedule: every 6 hours
and the relevant section of my app.yaml:
- url: /suggestions/
  script: cron.py
  login: admin
You're missing this declaration from the bottom of your handler file, after main:
if __name__ == '__main__':
    main()
The first time a handler script is run, App Engine simply imports it, and this snippet runs your main() in that situation. On subsequent requests, App Engine runs the main() function directly, if you defined one.
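For reference, a complete handler file under that old CGI-style Python runtime would end roughly like this (the webapp framework and the URL mapping are assumptions based on the era of the question):
from google.appengine.ext import webapp
from google.appengine.ext.webapp.util import run_wsgi_app

# ... SuggestionsHandler defined as in the question ...

application = webapp.WSGIApplication([('/suggestions/', SuggestionsHandler)])

def main():
    run_wsgi_app(application)

if __name__ == '__main__':
    main()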