App Engine deployments are extraordinarily slow today? - google-app-engine

We have a small Java project that we need to deploy. It includes 9000+ files.
Command: mvn gcloud:deploy
But I get this log:
...
[INFO] INFO: Uploading [/home/steven/work/idigisign/target/appengine-staging/__static__/node_modules/rx/src/core/linq/observable/when.js] to [7dfb30ad32893c5042dba03601f006a40419fab0]
[INFO] DEBUG: Uploading [/home/steven/work/idigisign/target/appengine-staging/assets/global/plugins/bootstrap-switch/js/bootstrap-switch.min.js] to [7e0725897d7b99c3c33b56915d202e2dde552ea9]
[INFO] INFO: Uploading [/home/steven/work/idigisign/target/appengine-staging/assets/global/plugins/bootstrap-switch/js/bootstrap-switch.min.js] to [7e0725897d7b99c3c33b56915d202e2dde552ea9]
[INFO] DEBUG: Uploading [/home/steven/work/idigisign/target/appengine-staging/node_modules/is-redirect/index.js] to [7e0afe4775bf7f8558665760171c01948c22f771]
[INFO] INFO: Uploading [/home/steven/work/idigisign/target/appengine-staging/node_modules/is-redirect/index.js] to [7e0afe4775bf7f8558665760171c01948c22f771]
[INFO] DEBUG: Uploading [/home/steven/work/idigisign/target/appengine-staging/node_modules/rxjs/src/util/Map.ts] to [7e11722f4cd9ce91ec99b97710fbc4e7f40be09d]
...
It uploads about 50 files per minute, so the whole deployment will take about 180 minutes.
That is extraordinarily slow.
Can anybody help me?

Set the environment variable CLOUDSDK_APP_USE_GSUTIL=1 and try again; this uses a less-reliable but faster codepath for file upload (there are plans to speed up the default codepath).

We had the same issue: deployment was very slow. We think we have solved it.
First, we traced the gcloud logs and found that many files were being uploaded again even though they had not been modified. So we traced the gcloud source code and found that the issue is caused by how it calls the Google Cloud Storage JSON API.
When gcloud queried the list of objects in the bucket, the API returned 1000 items, but we have 1325, so the items beyond the first 1000 looked missing and were uploaded again.
Next, we looked at the API reference and found the maxResults parameter. We tried setting it in the source code (cloud_storage.py), but it has no effect when its value is over 1000.
Finally, we found another field, nextPageToken. We now query the list repeatedly until nextPageToken is None. That retrieves all items from Google Cloud Storage, and files that already exist are not uploaded again.
def ListBucket(bucket_ref, client):
    request = STORAGE_MESSAGES.StorageObjectsListRequest(bucket=bucket_ref.bucket)
    items = set()
    try:
        response = client.objects.List(request)
        for item in response.items:
            items.add(item.name)
        while response.nextPageToken:
            request = STORAGE_MESSAGES.StorageObjectsListRequest(bucket=bucket_ref.bucket, pageToken=response.nextPageToken)
            response = client.objects.List(request)
            for item in response.items:
                items.add(item.name)
    except api_exceptions.HttpError as e:
        raise UploadError('Error uploading files: {e}'.format(e=e))
    return items
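The paging loop above can be tried without a real bucket. The following is a minimal, self-contained sketch: FakeObjectsClient, list_all_names, and the page size of 1000 are stand-ins invented for illustration and are not part of gcloud or any Cloud Storage client. Only the nextPageToken / pageToken pattern comes from the answer.

```python
from types import SimpleNamespace

PAGE_SIZE = 1000  # assumed cap on results per page, as observed in the answer


class FakeObjectsClient:
    """Hypothetical stand-in for a paged object-listing API."""

    def __init__(self, names):
        self._names = list(names)

    def List(self, page_token=None):
        start = int(page_token) if page_token else 0
        end = start + PAGE_SIZE
        page = [SimpleNamespace(name=n) for n in self._names[start:end]]
        # nextPageToken is None once the last page has been returned.
        next_token = str(end) if end < len(self._names) else None
        return SimpleNamespace(items=page, nextPageToken=next_token)


def list_all_names(client):
    """Collect object names from every page, not just the first one."""
    names = set()
    token = None
    while True:
        response = client.List(page_token=token)
        names.update(item.name for item in response.items)
        token = response.nextPageToken
        if not token:
            return names


client = FakeObjectsClient("file-%d" % i for i in range(1325))
first_page = client.List()
all_names = list_all_names(client)
print(len(first_page.items), len(all_names))
```

With 1325 fake objects, a single call sees only the first 1000 names, while following the token until it is None collects all 1325.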

[flink]Task manager initialization failed

I am new to Flink. I am trying to run the Flink example on my local PC (Windows).
However, after I run start-cluster.bat and log in to the dashboard, it shows 0 task managers.
I checked the log, and it seems the task manager fails to initialize:
2020-02-21 23:03:14,202 ERROR org.apache.flink.runtime.taskexecutor.TaskManagerRunner - TaskManager initialization failed.
org.apache.flink.configuration.IllegalConfigurationException: Failed to create TaskExecutorResourceSpec
at org.apache.flink.runtime.taskexecutor.TaskExecutorResourceUtils.resourceSpecFromConfig(TaskExecutorResourceUtils.java:72)
at org.apache.flink.runtime.taskexecutor.TaskManagerRunner.startTaskManager(TaskManagerRunner.java:356)
at org.apache.flink.runtime.taskexecutor.TaskManagerRunner.<init>(TaskManagerRunner.java:152)
at org.apache.flink.runtime.taskexecutor.TaskManagerRunner.runTaskManager(TaskManagerRunner.java:308)
at org.apache.flink.runtime.taskexecutor.TaskManagerRunner.lambda$runTaskManagerSecurely$2(TaskManagerRunner.java:322)
at org.apache.flink.runtime.security.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:30)
at org.apache.flink.runtime.taskexecutor.TaskManagerRunner.runTaskManagerSecurely(TaskManagerRunner.java:321)
at org.apache.flink.runtime.taskexecutor.TaskManagerRunner.main(TaskManagerRunner.java:287)
Caused by: org.apache.flink.configuration.IllegalConfigurationException: The required configuration option Key: 'taskmanager.cpu.cores' , default: null (fallback keys: []) is not set
at org.apache.flink.runtime.taskexecutor.TaskExecutorResourceUtils.checkConfigOptionIsSet(TaskExecutorResourceUtils.java:90)
at org.apache.flink.runtime.taskexecutor.TaskExecutorResourceUtils.lambda$checkTaskExecutorResourceConfigSet$0(TaskExecutorResourceUtils.java:84)
at java.util.Arrays$ArrayList.forEach(Arrays.java:3880)
at org.apache.flink.runtime.taskexecutor.TaskExecutorResourceUtils.checkTaskExecutorResourceConfigSet(TaskExecutorResourceUtils.java:84)
at org.apache.flink.runtime.taskexecutor.TaskExecutorResourceUtils.resourceSpecFromConfig(TaskExecutorResourceUtils.java:70)
... 7 more
2020-02-21 23:03:14,217 INFO org.apache.flink.runtime.blob.TransientBlobCache - Shutting down BLOB cache
Basically, it looks like the required option 'taskmanager.cpu.cores' is not set. However, I can't find this property in flink-conf.yaml or in the documentation (https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/config.html).
I am using Flink 1.10.0. Any help would be highly appreciated!
That configuration option is intended for internal use only -- it shouldn't be user configured, which is why it isn't documented.
The windows start-cluster.bat is failing because of a bug introduced in Flink 1.10. See https://jira.apache.org/jira/browse/FLINK-15925.
One workaround is to use the bash script, start-cluster.sh, instead.
See also this mailing list thread: https://lists.apache.org/thread.html/r7693d0c06ac5ced9a34597c662bcf37b34ef8e799c32cc0edee373b2%40%3Cdev.flink.apache.org%3E

Akeneo 2.2.5: No JobInstance found with code "add_to_existing_product_model"

Since the forum at akeneo.com is locked down, I posted my question here.
When I try to add products to a product model via mass edit, I get the following error message:
No JobInstance found with code "add_to_existing_product_model"
[2018-06-19 19:39:31] request.INFO: Matched route "pim_enrich_mass_edit_rest_launch". {"route":"pim_enrich_mass_edit_rest_launch","route_parameters":{"_controller":"pim_enrich.controller.rest.mass_edit:launchAction","_route":"pim_enrich_mass_edit_rest_launch"},"request_uri":"http://pim.eu-trading.eu/rest/mass_edit/","method":"POST"} []
[2018-06-19 19:39:32] request.CRITICAL: Uncaught PHP Exception Symfony\Component\Translation\Exception\NotFoundResourceException: "No JobInstance found with code "add_to_existing_product_model"" at ./vendor/akeneo/pim-community-dev/src/Pim/Bundle/EnrichBundle/MassEditAction/OperationJobLauncher.php line 59 {"exception":"[object] (Symfony\\Component\\Translation\\Exception\\NotFoundResourceException(code: 0): No JobInstance found with code \"add_to_existing_product_model\" at ./vendor/akeneo/pim-community-dev/src/Pim/Bundle/EnrichBundle/MassEditAction/OperationJobLauncher.php:59)"} []
I get this error with the latest version of Akeneo 2 (v2.2.5). The product model was created manually; the products to be associated with the model came through the API.
This error looks like a missing job in the database. Did you run all the doctrine migrations?
To do so you need to launch this command:
bin/console doctrine:migrations:migrate --env=prod
If you already launched the migrations and they failed, you can install a clean 2.2.5 PIM elsewhere and dump the job instance table to be able to add the missing jobs. Here is the list of the jobs to add or update in 2.2:
- add_association
- move_to_category
- add_to_category
- remove_from_category
- add_to_existing_product_model
- compute_family_variant_structure_changes
- compute_completeness_of_products_family
- add_attribute_value
- delete_products_and_product_models

Non-interactive auto-refresh stale OAuth Token with Googlesheets package

I'm trying to automatically run an R script to download a private Google Sheet every hour. It always works fine when I'm using R interactively. It also works fine during the first hour after I automate the script with launchd.
It stops working an hour after I start automating it with launchd. I think the problem is that after one hour the access token changes, and the non-interactive version isn’t waiting for the auto refreshing of the OAuth token. Here is the error that I get from the error report:
Auto-refreshing stale OAuth token.
Error in gzfile(file, mode) : cannot open the connection
Calls: gs_auth ... -> -> cache_token -> saveRDS -> gzfile
In addition: Warning message:
In gzfile(file, mode) :
cannot open compressed file '.httr-oauth', probable reason 'Permission denied'
Execution halted
I'm using Jenny Bryan's googlesheets package. Here is the code that I initially use to register the sheet and then save the OAuth token:
gToken <- gs_auth() # Run this the first time to get the oAuth information
saveRDS(gToken, "/Users/…/gToken.rds") # Save the oAuth information for non-interactive use
I then use the following script in the file that I automate with launchd:
gs_auth(token = "/Users/…/gToken.rds")
How can I avoid this error when running the script automatically with launchd?
I don't know about launchd, but I had the same problem when I wanted to run an R script automatically from the Windows Task Scheduler. Changing the 'cache' argument to FALSE did the trick for me (screenshot: https://i.stack.imgur.com/pprlC.png).
You can find the solution here: https://github.com/jennybc/googlesheets/issues/262
To authenticate once in the browser in order to get a token file, I did this:
token_file <- gs_auth(new_user = TRUE, cache = FALSE)
saveRDS(token_file, "googlesheets_token.rds")
Automatic login afterwards via:
gs_auth(token = paste0(path_scripts, "googlesheets_token.rds"),
        verbose = TRUE, cache = FALSE)

Set default settings to 'no-cache' on Google Cloud Storage

Is there a way to set all public links to have 'no-cache' in Google Cloud Storage?
I've seen solutions to use gsutil to set the "Cache-Control" upon file-upload, but I'm looking for a more permanent solution.
There was a conversation about providing a cache invalidation feature but I didn't quite follow the reasoning. Any explanations would be greatly appreciated!
It would be difficult to provide a cache invalidation feature because, once an object has been served with a non-zero cache TTL, any cache on the Internet (not just those under Google's control) is allowed, per the HTTP spec, to cache the data.
Thanks!
For a more permanent one-time-effort solution, with the current offerings on GCP, you can do this with Cloud Functions.
Create a new Function and set the Event type to "On (finalizing/creating) file in the selected bucket" (google.storage.object.finalize). Make sure to select the bucket you want this on. In the body of the function, set the cacheControl / Cache-Control attribute for the blob. The attribute name depends on the language. Here's my version in Python, using cache_control:
main.py:
# match the function name below to the Entry point
from google.cloud import storage

def set_file_uncached(event, context):
    file = event  # auto-generated
    print(f"Processing file: {file=}")  # logging, if you want it
    storage_client = storage.Client()
    # we expect just one with that name
    blob = storage_client.bucket(file["bucket"]).get_blob(file["name"])
    if not blob:
        # in case the blob is deleted before this executes
        print(f"blob not found")
        return None
    blob.cache_control = "public, max-age=0"  # or whatever you need
    blob.patch()
requirements.txt
google-cloud-storage
From the logs: Function execution took 1712 ms, finished with status: 'ok'. This could have been faster, but I've set the minimum to 0 instances, so it needs to spin up for each upload. Depending on your usage and cost constraints, you can set it to 1 or something higher.
Other settings:
Retry on failure: No/False
Region: [wherever your bucket is]
Memory allocated: 128 MB (smallest available currently)
Timeout: 5 seconds (smallest available currently, function shouldn't take longer)
Minimum instances: 0
Maximum instances: 1
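The function above only touches objects uploaded after it is deployed. For objects already in the bucket, a one-off pass along the following lines might work. This is a sketch: set_cache_control and FakeBlob are hypothetical names. It assumes each blob exposes a cache_control attribute and a patch() method, as the blob in the function above does, and the commented usage assumes the list_blobs method of storage.Client with a placeholder bucket name.

```python
def set_cache_control(blobs, value="no-cache"):
    """Set Cache-Control on each blob that doesn't already have `value`.

    Returns the number of blobs that were patched.
    """
    patched = 0
    for blob in blobs:
        if blob.cache_control == value:
            continue  # already set; skip the extra API call
        blob.cache_control = value
        blob.patch()  # persist the metadata change
        patched += 1
    return patched


class FakeBlob:
    """Hypothetical stand-in so the sketch runs without GCP access."""

    def __init__(self, name, cache_control=None):
        self.name = name
        self.cache_control = cache_control
        self.patch_calls = 0

    def patch(self):
        self.patch_calls += 1


blobs = [
    FakeBlob("a.html"),
    FakeBlob("b.css", "no-cache"),
    FakeBlob("c.js", "public, max-age=3600"),
]
patched_count = set_cache_control(blobs)
print(patched_count)

# Against a real bucket (assumption: google-cloud-storage is installed,
# credentials are configured, and "my-bucket" is a placeholder):
#   from google.cloud import storage
#   set_cache_control(storage.Client().list_blobs("my-bucket"))
```

Skipping blobs that already carry the desired value keeps the pass cheap to re-run.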

How to catch BigQuery loading errors from an AppEngine pipeline

I have built a pipeline on AppEngine that loads data from Cloud Storage to BigQuery. This works fine until there is an error. How can I catch loading exceptions raised by BigQuery from my AppEngine code?
The code in the pipeline looks like this:
#Run the job
credentials = AppAssertionCredentials(scope=SCOPE)
http = credentials.authorize(httplib2.Http())
bigquery_service = build("bigquery", "v2", http=http)
jobCollection = bigquery_service.jobs()
result = jobCollection.insert(projectId=PROJECT_ID,
                              body=build_job_data(table_name, cloud_storage_files))
#Get the status
while (not allDone and not runtime.is_shutting_down()):
    try:
        job = jobCollection.get(projectId=PROJECT_ID,
                                jobId=insertResponse).execute()
        #Do something with job.get('status')
    except:
        exc_type, exc_value, exc_traceback = sys.exc_info()
        logging.error(traceback.format_exception(exc_type, exc_value, exc_traceback))
        time.sleep(30)
This gives me status errors or major connectivity errors, but what I am looking for is functional errors from BigQuery, such as field format conversion errors, schema structure issues, or other issues BigQuery may have while trying to insert rows into tables.
If any "functional" error happens on BigQuery's side, this code will run successfully and complete normally, but no table will be written to BigQuery. That is not easy to debug when it happens.
You can use the HTTP error code from the exception. BigQuery is a REST API, so the response codes that are returned match the description of HTTP error codes here.
Here is some code that handles retryable errors (connection, rate limit, etc), but re-raises when it is an error type that it doesn't expect.
except HttpError as err:
    # If the error is a rate limit or connection error, wait and
    # try again.
    # 403: Forbidden: Both access denied and rate limits.
    # 408: Timeout
    # 500: Internal Service Error
    # 503: Service Unavailable
    if err.resp.status in [403, 408, 500, 503]:
        print('%s: Retryable error %s, waiting' % (
            self.thread_id, err.resp.status))
        time.sleep(5)
    else:
        raise
If you want even better error handling, check out the BigqueryError class in the bq command-line client. It used to be available on code.google.com, but since the switch to gcloud it no longer is. If you have gcloud installed, the bq.py and bigquery_client.py files should be in the installation.
The key here is this part of the pasted code:
    except:
        exc_type, exc_value, exc_traceback = sys.exc_info()
        logging.error(traceback.format_exception(exc_type, exc_value, exc_traceback))
        time.sleep(30)
This "except" is catching every exception, logging it, and letting the process continue without any consideration for re-trying.
The question is, what would you like to do instead? At least the intention is there with the "#Do something" comment.
As a suggestion, consider App Engine's task queues to check the status, instead of a loop with a 30 second wait. When tasks get an exception, they are automatically retried - and you can tune that behavior.
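One way to fill in the "#Do something" step is to inspect the job resource that the status call returns. The sketch below assumes that resource is a dict whose "status" entry carries "state" and, for failed jobs, "errorResult" and "errors". LoadJobError and check_load_job are hypothetical names, and the sample dicts are made up for illustration rather than taken from a real job.

```python
class LoadJobError(Exception):
    """Raised when a finished job reports an error (hypothetical name)."""


def check_load_job(job):
    """Return True if the job finished cleanly, False if still running.

    Raises LoadJobError if the job is done but reported an error.
    """
    status = job.get("status", {})
    if status.get("state") != "DONE":
        return False  # PENDING or RUNNING: poll again later
    error = status.get("errorResult")
    if error:
        details = "; ".join(
            e.get("message", "") for e in status.get("errors", []))
        raise LoadJobError("%s: %s (%s)" % (
            error.get("reason"), error.get("message"), details))
    return True


# Made-up job resources for illustration.
running = {"status": {"state": "RUNNING"}}
succeeded = {"status": {"state": "DONE"}}
failed = {"status": {
    "state": "DONE",
    "errorResult": {"reason": "invalid", "message": "Too many errors"},
    "errors": [{"reason": "invalid",
                "message": "Could not parse 'abc' as int"}],
}}

print(check_load_job(running), check_load_job(succeeded))
try:
    check_load_job(failed)
except LoadJobError as exc:
    print("load failed:", exc)
```

Raising a dedicated exception lets the polling loop, or a task queue retry policy, tell a failed load apart from a connectivity problem.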
