GAE log search is not reliable

I'm running several Python (2.7) applications and constantly hit one problem: log search (from the dashboard, admin console) is not reliable. It works fine when I search for recent log entries (they are normally found), but after some period (one day, for instance) it is no longer possible to find the same record with the same search query. I just get "no results". The admin console shows that I have 1 GB of logs spanning 10-12 days, so the old records should still be there; retention and log size limits are not the reason.
Specifically, I have a "cron" request that writes stats to the log once a day (that's enough for me), and searching for this request always returns only the last entry, not one entry per day of the retained period as expected.
Is this expected behaviour (I do not see any clear statement about log storage behaviour in the docs, for example), or is there something to tune? For example, would it help to log less per request? Or maybe there is a more advanced use of the query language?
Please advise.

This is a known issue that has already been reported on the googleappengine issue tracker.
As an alternative, you can consider reading your application logs programmatically using the Log Service API to ingest them into BigQuery, or building your own search index.
Google App Engine Developer Relations delivered a codelab at Google I/O 2012 about ingesting App Engine logs into BigQuery.
And Streak released a tool called Mache and a Chrome extension to automate this use case.
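
As a rough illustration of the programmatic route, here is a minimal sketch that groups the daily cron entries by calendar day instead of relying on console search. It assumes the records come from something shaped like `logservice.fetch` in the Python 2.7 runtime (`google.appengine.api.logservice`), which is passed in as the `fetch` argument so the function can also be tried locally with stand-in records. The `/cron/stats` path is a hypothetical example, not taken from the question.

```python
import datetime
from collections import defaultdict

EPOCH_DAY = datetime.date(1970, 1, 1)
SECONDS_PER_DAY = 86400


def cron_entries_per_day(fetch, start_time, end_time, path="/cron/stats"):
    """Group the app log lines of one URL path by calendar day (UTC).

    `fetch` is any callable that accepts start_time, end_time and
    include_app_logs keyword arguments and yields records exposing
    `resource` (request path plus query string), `start_time` (seconds
    since the epoch) and `app_logs` (objects with a `message` attribute).
    Assumption: on App Engine this would be logservice.fetch.
    `path` is a hypothetical cron URL used only for illustration.
    """
    per_day = defaultdict(list)
    for request_log in fetch(start_time=start_time, end_time=end_time,
                             include_app_logs=True):
        # Skip requests that are not the cron handler we care about.
        if not request_log.resource.startswith(path):
            continue
        # Convert epoch seconds to a UTC calendar date.
        days = int(request_log.start_time // SECONDS_PER_DAY)
        day = EPOCH_DAY + datetime.timedelta(days=days)
        per_day[day].extend(line.message for line in request_log.app_logs)
    return dict(per_day)
```

The resulting dictionary has one key per day on which the cron request ran, so a missing day is visible directly; the same per-day rows could then be written to BigQuery or a custom index.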

Related

GCP App Engine flex (GAE): Error when deploying

When deploying using gcloud app deploy I get the following error:
Timed out waiting for the app infrastructure to become healthy gcp
I contacted GCP Support and they told me the same thing I had read in other threads:
the error you are referring to may be related to the Compute Engine “In-Use IP Addresses” Quota limit. You can view your current quota limit information by accessing from your GCP menu “IAM & Admin > Quotas”.
I checked the "In-Use IP Addresses" and it doesn't seem like I have a problem with quotas:
Looking into the error, I found that the Activity tab shows an error during deployment. Apparently, when App Engine tries to delete a VM, the process gets stuck in a loop trying to delete it. You can see the error:
(I intentionally covered the project ID)
Edit: It seems like the problem is only with southamerica-east1. I created a new project in southamerica-east1 but kept getting the same error, so I then created a new project with App Engine in us-west2 and it worked like a charm (I used the same application and app.yaml). I wonder whether the problem is GCP's southamerica-east1 or some unknown bad configuration on my side.

This is probably related to this issue: https://issuetracker.google.com/u/2/issues/73583699. It does mention the "In-Use IP Addresses" quota, but many people have posted in recent days (Nov 2018) indicating that they are seeing the error and have verified that they have not hit their quota.
Unfortunately, no solution has been posted and there hasn't been any recent comment from the devs.

First, our apologies that you’ve experienced this issue. Be assured that we are aware of the situation and the team works hard to resolve it.
Our goal is to make sure that there are available resources in all zones. This type of issue is rare. When a situation like this occurs, or is about to occur, our team is notified immediately and the issue is investigated.
We recommend deploying and balancing your workload across multiple zones or regions to reduce the likelihood of an outage. Please review our documentation which outlines how to build resilient and scalable architectures on Google Cloud Platform.
For the time being, you can try relaxing your requirements (e.g. requesting a smaller instance or one with fewer resources) or removing the external IP requirement.
If that proves not to be enough, you can try deploying your application to another region.
Again, we want to offer our sincerest apologies.
Thanks for understanding.

In the end we didn't find a real solution, so we moved all our services from Brazil to US-2. I'm not sure whether the region is the problem, but in US-2 everything works like a charm.

Flex Instance Core Hours Sao Paulo

In our development environment, we have been charged about USD 100 every month for an instance we didn't know existed (and that we are, of course, not using), and we can't find it anywhere in AppEngine or Console Engine.
Also, the usage report shows no activity for the whole month, but we are still getting the charges.
The instance is: Flex Instance Core Hours Sao Paulo
I found similar posts on Stack Overflow, so here are my questions:
- Is this some bad strategy from Google?
- Where can I see this instance so I can stop it or delete it?
- Where can I see who started this instance and when?
Of course, I called Google support and received no answer.
Many thanks!

Google Cloud Platform Support here! I found your ticket and see that you were already provided an answer there. In addition to what Dan described in his answer, if your app currently has the "Serving" status, it will keep running with the corresponding instances regardless of whether any requests are coming in. As long as the version is serving, you will continue to be billed for the hours that you are using. Also, if you are using automatic scaling with a minimum number of instances:
that specified number of instances run as resident instances while any
additional instances are dynamic
(Instance scaling description in GCP docs)
You can use basic or manual scaling if this is not what you're interested in.

Check the App Engine Versions pages for all your projects; you should find at least one with the Flexible environment. The Deployed column should indicate who deployed it and when.
Based on that information you can decide to keep or delete the respective version(s). Simply stopping the instance may not be sufficient: depending on the scaling configuration for that service version, GAE may automatically start one or more new instances.
You should also check the App Engine Instances pages for your projects and cross-reference them with the versions info to make sure no undesired instances are accidentally left behind (at least in the standard environment they are normally stopped when the respective versions are deleted; I'm not entirely certain the same is true for the flex environment).
The running flexible environment instances are billed by the hour, even if they receive no requests, which could explain why you're seeing charges without any activity.

Apparently, the source of this instance was a Firebase setting we made to run some tests, which automatically created the instance. I shut off the billing account for this space, and instantly received an email from Firebase saying it had detected changes that make some functions unavailable.

Google Cloud Platform Tensorboard - No dashboards are currently active

I was working with the TensorFlow Object Detection API. I managed to train it locally on my computer and got decent results. However, when I tried to replicate the same on GCP, I ran into several errors. Basically, I followed the official TensorFlow "running on cloud" documentation.
So this is how the bucket is laid out:
Bucket
weeddetectin-data
Train-packages
This is how I ran the training and evaluation job:
Running a multiworker training job
Running an evaluation job on cloud
I then used the following command to monitor with TensorBoard:
tensorboard --logdir=gs://weeddetection --port=8080
I opened the dashboard using the preview feature in the console. But it says no dashboards are active for the current data set.
No Dashboards are active
So, I checked my activity page to see whether the training and evaluation jobs were really submitted:
Training Job
Evaluation Job

It seems as if no event files are being written to your bucket.
The root cause could be that the guide you are using refers to an old version of the TensorFlow models.
Please try changing
--train_dir=gs:...
to
--model_dir=gs://${YOUR_BUCKET_NAME}/model
Then resubmit the job. Once the job is running, check the model_dir in the bucket to see whether the files are being written there.
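
To make the "check the model_dir" step concrete, here is a small sketch that filters a list of bucket object names for TensorBoard event files, which are named `events.out.tfevents.<timestamp>.<hostname>`. It assumes you already have the object names as strings (for example, copied from the output of `gsutil ls -r`); the bucket and file names in the usage note are hypothetical.

```python
def find_event_files(object_names, model_dir):
    """Return the object names under model_dir that look like event files.

    `object_names` is a list of full object paths (strings).
    `model_dir` is the directory passed to the job via --model_dir.
    """
    prefix = model_dir.rstrip("/") + "/"
    matches = []
    for name in object_names:
        if not name.startswith(prefix):
            continue
        # Only the final path component is checked against the
        # TensorBoard event file naming pattern.
        basename = name.rsplit("/", 1)[-1]
        if "events.out.tfevents" in basename:
            matches.append(name)
    return matches
```

For example, calling `find_event_files(names, "gs://my-bucket/model")` on a listing that contains only checkpoint files returns an empty list, which would match the "no dashboards are active" symptom.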
See the gcloud ml-engine jobs documentation for further reading.
Hope it helps!

Solr reindex is stopping prematurely when running Collective Solr for Plone

My team is working on a search application for our websites. We are using Collective Solr in Plone to index our intranet and documentation sites. We recently set up shared blob storage on our test instance of the intranet site because Solr was not indexing our PDF files. This appears to be working, however, each time I run the reindexing script (##solr-maintenance/reindex) it stops after about an hour and a half. I know that it is not indexing our entire site as there are numerous pages, files, etc. missing when I run a query in the Solr dashboard.
The warning below is the last thing I see in the Solr log before the script stops. I am very new to Solr so I'm not sure what it indicates. When I run the same script on our documentation site, it completes without error.
2017-04-14 18:05:37.259 WARN (qtp1989972246-970) [ ] o.a.s.h.a.LukeRequestHandler Error getting file length for [segments_284]
java.nio.file.NoSuchFileException: /var/solr/data/uvahealthPlone/data/index/segments_284
I'm hoping someone out there might have more experience with Collective Solr for Plone and could recommend some good resources for debugging this issue. I've done a lot of searching lately but haven't found much useful info.

This was a bug fixed some time ago with https://github.com/collective/collective.solr/pull/122

Google App Engine log download takes too long

I wrote an application running on Google App Engine that generates logs at a very high rate. Now I need to download and process them. The Admin Console says the logs are around 300 MB. Ideally I would like to process only small parts of the logs. Could anybody give me some pointers on how to:
Download only log entries in a specific time range (i.e. the timestamp of the log records fall into this range).
OR
If I have about 300 MB of logs, what is the quickest way to download them? I've been running the appcfg.sh command for almost an hour now and it's still running (around 250000 log records so far). Is there a way to break the logs down into small chunks and download them in parallel?
Many thanks,
Anh.
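
One way to think about the "small chunks" idea is to split the overall time range into consecutive windows and fetch each window separately. The sketch below only does the splitting; it assumes timestamps are expressed in seconds since the epoch and that whatever fetches the logs (for instance the Log Service API mentioned in the first thread) accepts a start and end time. Whether the download tool itself supports a time range is not covered here.

```python
def split_time_range(start, end, chunks):
    """Split [start, end) into `chunks` consecutive, equal-length windows.

    `start` and `end` are timestamps in seconds since the epoch.
    Returns a list of (window_start, window_end) tuples that cover the
    whole range without gaps or overlaps.
    """
    if chunks < 1:
        raise ValueError("chunks must be at least 1")
    if end <= start:
        raise ValueError("end must be later than start")
    step = float(end - start) / chunks
    windows = []
    for i in range(chunks):
        window_start = start + step * i
        # Pin the last window to `end` so rounding never leaves a gap.
        window_end = end if i == chunks - 1 else start + step * (i + 1)
        windows.append((window_start, window_end))
    return windows
```

Each window can then be handed to a separate worker, and a single window on its own answers the "only a specific time range" case.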
