Google Calendar API - rate limit exceeded (even though it's not)

I have an App Engine app that works with various Google APIs. I started a sync task that syncs roughly 3,000 events to various users' calendars. It worked for a while, but now I am getting the following error:
PHP Fatal error: Uncaught exception 'Google_Service_Exception' with message '{
  "error": {
    "errors": [
      {
        "domain": "usageLimits",
        "reason": "rateLimitExceeded",
        "message": "Rate Limit Exceeded"
      }
    ],
    "code": 403,
    "message": "Rate Limit Exceeded"
  }
}'
If I look in the API Dashboard, the limits are really high:
Queries per day: 1,000,000
Queries per 100 seconds per user: 50,000,000
How can I get past this error? I want this task to finish so my users see the events in their calendars.

As stated in the documentation, the user rate limit is flood protection: an application can only make a certain number of requests per second.
403: Rate Limit Exceeded
The per-user limit from the Developer Console has been reached.
{
  "error": {
    "errors": [
      {
        "domain": "usageLimits",
        "reason": "rateLimitExceeded",
        "message": "Rate Limit Exceeded"
      }
    ],
    "code": 403,
    "message": "Rate Limit Exceeded"
  }
}
Suggested actions:
Use exponential backoff.
You can also try adding a quotaUser parameter; this sometimes helps.
quotaUser An arbitrary string that uniquely identifies a user.
Lets you enforce per-user quotas from a server-side application even in cases when the user's IP address is unknown. This can occur, for example, with applications that run cron jobs on App Engine on a user's behalf.
You can choose any arbitrary string that uniquely identifies a user, but it is limited to 40 characters.
If you are getting a quota error, then the quota has been exceeded, even if you don't think it has. Application-level quotas cannot be increased; the only thing you can do is slow down. A minimal backoff sketch is below.
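Here is a sketch of both suggestions combined, using the same PHP client that throws Google_Service_Exception above. The helper name insertWithBackoff and the $userId variable are assumptions for illustration; quotaUser is passed as an optional query parameter so the per-user quota is attributed to each end user instead of your server's IP.

// Sketch: retry a Calendar insert with exponential backoff + jitter.
// Assumes an authorized Google_Service_Calendar instance in $service.
function insertWithBackoff($service, $calendarId, $event, $userId, $maxRetries = 5) {
    for ($attempt = 0; $attempt <= $maxRetries; $attempt++) {
        try {
            // quotaUser attributes this request to one end user for quota purposes.
            return $service->events->insert($calendarId, $event, array('quotaUser' => $userId));
        } catch (Google_Service_Exception $e) {
            $errors = $e->getErrors();
            $reason = isset($errors[0]['reason']) ? $errors[0]['reason'] : '';
            if ($e->getCode() == 403 && $reason == 'rateLimitExceeded' && $attempt < $maxRetries) {
                sleep(1 << $attempt);        // 1, 2, 4, 8, 16 seconds
                usleep(mt_rand(0, 1000000)); // plus up to 1 s of random jitter
                continue;
            }
            throw $e; // not a rate-limit error, or retries exhausted
        }
    }
}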

I could not find an answer to this question either. The default is 500 requests per 100 seconds, and even after I increased it, after some time I can still only make about 5 requests/second, which matches the 500-per-100-seconds limit (500 / 100 = 5). That suggests the old default is still being applied.

Related

"Error querying agent backend. State: URL_TIMEOUT, reason: TIMEOUT_DNSLOOKUP"

For the past 1.5-2 years my Google smart home action was working absolutely fine with device sync, state query, and all relevant actions.
For the last 2 months I have been getting the following error, although I haven't changed anything:
0: {
  action: {
    actionType: "STATE_QUERY"
  }
  device: {
    deviceType: "LIGHT"
  }
  status: {
    externalDebugString: "Error querying agent backend. State: URL_TIMEOUT, reason: TIMEOUT_DNSLOOKUP"
    isSuccess: false
    statusType: "EXECUTION_BACKEND_FAILURE"
  }
}
]
executionType: "PARTNER_CLOUD"
latencyMsec: "2834"
requestId: "5786688694498341746"
}
]
}
locale: "en-US"
}
Now the smart home devices do not report state or respond to control, and the Google Home app shows "Not Responding". The strange thing is, sometimes it does work (2 out of 10 times, I would say).
Another piece of info: I have the server hosted at my own data centre, and absolutely no changes have been made in terms of network, DNS, etc.
Can anyone please advise what could be the reason for this, and how it could be resolved? Help is highly appreciated.
The error returned has the status type EXECUTION_BACKEND_FAILURE which means the Google smart home execution service tried to locate and reach out to your fulfillment endpoint but could not receive a valid response, potentially due to one of many different reasons.
The error log indicates that Google is trying to do a DNS lookup but is failing by timing out. You should check your server settings, and make sure that the domain name & IP matching is happening correctly with your DNS records (you can use a tool like nslookup to verify how it resolves).
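For example, you could verify that the fulfillment domain resolves both from a public resolver and from your own (the hostname below is a placeholder for your actual endpoint):

nslookup fulfillment.example.com 8.8.8.8
nslookup fulfillment.example.com

If the first lookup times out or returns nothing, Google's resolvers likely cannot reach your DNS records either.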

Stackdriver is alerting even when the policy is met

I have a cron job set up on Google App Engine that runs every day at 11 am.
Given that the execution of the cron is crucial to my system, I decided to set up an alert for any failures, configured as follows:
Create a metric with
Resource Type: GAE Application
Metric: Response Count
And filters:
Response Code >= 200
Response Code <= 299
Module ID: my-application
With an empty Group By and Aggregator set to NONE
Condition triggers if: any time series violates "is absent" for 1 day
So my intended behavior was "If this app doesn't respond 200 at the cron for 1 day (enough for it to be hit), alert me"
But I'm getting an email with the alert almost daily.
Is this a bug? Or is there another way to set this up? I'm receiving an alert, yet when I check the cron page in App Engine, the job was run at the correct time and returned 200 (success).

Rate Limit Exceeded Google Calendar

We are using Google Cloud instances (App Engine) to synchronize data for our users with their Google Calendars (through the Calendar API). Basically, we provide a task management solution, and the tasks should be synchronized (one-way) with the calendars the users give us access to.
How it all works:
1. We ask the users to grant access to their Google Account.
2. We ask them to select the desired calendar or offer the possibility of creating a new one under their account.
3. We push inserts/updates/deletes through the API (roughly as in the sketch below).
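For reference, step 3 is roughly the following (a simplified sketch; $service is an authorized Google_Service_Calendar, and the $task fields are from our own model):

// Build an event from one of our tasks and push it to the user's calendar.
$event = new Google_Service_Calendar_Event(array(
    'summary' => $task->title,
    'start'   => array('dateTime' => $task->startsAt), // RFC 3339 timestamp
    'end'     => array('dateTime' => $task->endsAt),
));
$created = $service->events->insert($calendarId, $event);          // insert
$service->events->update($calendarId, $created->getId(), $event); // update
$service->events->delete($calendarId, $created->getId());         // delete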
The specific error we don't understand is 403 "Rate Limit Exceeded", which we received 190 times in the last 30 days from a total of 84,773 requests.
"error": {
"errors": [
{
"domain": "usageLimits",
"reason": "rateLimitExceeded",
"message": "Rate Limit Exceeded"
}
],
"code": 403,
"message": "Rate Limit Exceeded"
}
}
The reason we don't understand this is that the maximum number of queries/day we have made is around 8K, while the daily limit in the Google Cloud API settings is 1 million.
Are there any other limits we need to be aware of? If not, what could be causing the issue? Did anyone face a similar scenario?
Thanks!
The rate limit error is not the same as the daily usage limit error. The rate limit is a safety limit to ensure we are not bombarded with requests over a short period of time.
You can use an exponential backoff retry algorithm to ensure the rate limit doesn't stop your app dead in the water; instead, it just slows it down.
We had the same problem, without any logical reason, and we solved it by using batch mode (see the sketch below).
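A rough sketch of batch mode with the PHP client (class names per the underscore-style client; depending on your client version, $service->createBatch() may be the preferred way to obtain the batch object):

$client->setUseBatch(true);
$batch = new Google_Http_Batch($client);
// In batch mode, service calls return the request object instead of executing it.
$batch->add($service->events->insert($calendarId, $eventA), 'insert-a');
$batch->add($service->events->insert($calendarId, $eventB), 'insert-b');
$results = $batch->execute(); // one HTTP round trip carrying both calls
$client->setUseBatch(false);

Keep in mind that Google documents a batch of n calls as counting as n requests against the quota, so batching cuts HTTP overhead but does not replace backoff.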

Errors in vm.syslog and Memory Usage constantly increasing on NodeJS AppEngine

I am having a problem on some of my App Engine projects: for the past few days I have been seeing a lot of errors (which I noticed may happen when a health check arrives) in my vm.syslog logs from Stackdriver Logging.
Specifically, these are:
write_gcm: Server response (CollectdTimeseriesRequest) contains errors:
{
  "payloadErrors": [
    {
      "index": 71,
      "error": {
        "code": 3,
        "message": "Expected 4 labels. Found 0. Mismatched labels for payload [values {\n data_source_name: \"value\"\n data_source_type: GAUGE\n value {\n double_value: 694411264\n }\n}\nstart_time {\n seconds: 1513266364\n nanos: 618061284\n}\nend_time {\n seconds: 1513266364\n nanos: 618061284\n}\nplugin: \"processes\"\nplugin_instance: \"all\"\ntype: \"ps_rss\"\n] on resource [type: \"gce_instance\"\nlabels {\n key: \"instance_id\"\n value: \"xxx\"\n}\nlabels {\n key: \"zone\"\n value: \"europe-west2-a\"\n}\n] for project xxx"
      }
    }
  ]
}
write_gcm: Unsuccessful HTTP request 400:
{
  "error": {
    "code": 400,
    "message": "Field timeSeries[11].metric.labels[1] had an invalid value of \"health_check_type\": Unrecognized metric label.",
    "status": "INVALID_ARGUMENT"
  }
}
write_gcm: Error talking to the endpoint.
write_gcm: wg_transmit_unique_segment failed.
write_gcm: wg_transmit_unique_segments failed. Flushing.
At the same time, I noticed that the memory usage in the App Engine dashboard for the very same projects keeps increasing over time, to the point where it reaches the maximum available and the instance restarts, throwing a 502 error when visiting the website the app is serving.
None of this happens on a couple of projects that have not been updated for at least 2 weeks (neither the errors above nor the memory increase), but it does happen on a newly created instance deployed with the same codebase as one of the healthy projects. In addition, I don't see any memory increase when running the project locally.
Can someone kindly tell me if they have experienced something similar, or whether they think the errors and the memory increase are related? I haven't changed my yaml deployment file recently, and I haven't specified any custom configuration for the health checks (which run in legacy mode at the default rate).
Thank you for your help,
Nicola
Similar question here: App Engine Deferred: Tracking Down Memory Leaks
I'm going through the same thing on a single Compute Engine VM. I've tried increasing memory, but the problem persists; it seems to be tied to a Stackdriver method call. Not sure what to do, but it causes my machines to stop after about 24 hours. In my case I'm fetching information every 3 seconds from a set of APIs, yet the error comes up every minute on serial port 1 (the console), which makes me suspect it is some kind of failure outside of my code. More from Google here: https://cloud.google.com/monitoring/api/ref_v3/rest/v3/projects.collectdTimeSeries/create
I'm not sure about all of the errors, but for the "write_gcm: Server response (CollectdTimeseriesRequest)" one, I had the same issue and contacted Google Cloud Support. They told me that the Stackdriver service had recently been updated to accept more detailed information on ps_rss metrics, but that the change caused metrics from older agents to not be sent at all.
You should be able to fix this issue by upgrading your Stackdriver agent to the latest version. On Compute Engine (which I was running) you have control over this; I'm not sure how you'd do it on App Engine, maybe trigger a new deploy? The commands below show the Compute Engine case.
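On Compute Engine, upgrading the (legacy) Stackdriver monitoring agent was done by re-running Google's install script, roughly as below; check the current monitoring docs before running this, since the agent has since been superseded:

curl -sSO https://dl.google.com/cloudagents/install-monitoring-agent.sh
sudo bash install-monitoring-agent.sh
sudo service stackdriver-agent restart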

Unsure how to solve a "termsOfServiceNotAccepted" error

Background:
So I'm a novice to the whole App Engine thing: I have made an app on Google App Engine whose main page accepts user input and then sends the information to a handler, which uses the BigQuery API to run a synchronous query against some tables I have uploaded to BigQuery. The handler then sends back the results of the query as JSON.
Problem:
In deployment it mostly works, except that a user will often run into this error while trying to make the synchronous query:
Error in runSyncQuery:
{
  "error": {
    "errors": [
      {
        "domain": "global",
        "reason": "termsOfServiceNotAccepted",
        "message": "BigQuery Terms of Service have not been accepted"
      }
    ],
    "code": 403,
    "message": "BigQuery Terms of Service have not been accepted"
  }
}
After doing some searching I came across this:
https://groups.google.com/forum/#!msg/bigquery-announce/l0kwVNpQX5A/ct0MglMmYMwJ
If you make API calls that are authenticated as an end user, then API calls will soon return errors when the end user has not accepted the updated Terms of Service. Apps built against the BigQuery API should ideally look for those errors and direct the user to the Google APIs Console to accept the new terms.
Except I don't really understand how to do this.
Also, all the user accounts I have tested my app with have access to a specific project that has the BigQuery API enabled, and they can use the API, so why does this pop up?
There are also times when a specific account does not run into this problem. For instance, if I keep refreshing and retrying, eventually the app will work without the error, but the next time the error resurfaces.
I don't understand how a user can have accepted the terms of service at one point in time and then not have at some later point.
Yes, any end users who authorize access to the BigQuery API must accept the Terms of Service (ToS) provided by the Google Developers Console at https://developers.google.com/console
It is possible for the Terms of Service to change, and some of your project members may not yet have accepted the updated BigQuery ToS. If one of your users receives this message when authorizing access to the BigQuery API, you should redirect them to https://developers.google.com/console to accept the terms of service (a sketch of catching the error follows below).
Re: "specific account does not run into this problem" - can you confirm this is happening consistently with a specific account on a specific project?
