Problem with nltk during app deployment on google cloud

I tried to deploy my application on Google Cloud App Engine. When the deployment finished and I tried to browse the URL, I got a 502 server error. The log shows that there is a problem with the nltk package:
>>> import nltk
>>> nltk.download('punkt')
Searched in:
- '/root/nltk_data'
- '/usr/share/nltk_data'
- '/usr/local/share/nltk_data'
- '/usr/lib/nltk_data'
- '/usr/local/lib/nltk_data'
- '/env/nltk_data'
- '/env/lib/nltk_data'
- ''
I have put the necessary hardware requirements in my app.yaml file:
service: vapi
runtime: python
env: flex
health_check:
  enable_health_check: True
  check_interval_sec: 5
  timeout_sec: 4
  unhealthy_threshold: 2
  healthy_threshold: 2
entrypoint: gunicorn -b :$PORT wsgi:app
runtime_config:
  python_version: 3.5
resources:
  cpu: 2
  memory_gb: 8
  disk_size_gb: 20
I have tried installing the nltk packages into one of the search paths shown in the log above.
I have also created an App Engine configuration file:
# appengine_config.py
import os
from google.appengine.ext import vendor
# Add any libraries installed in the "lib" folder.
vendor.add(os.path.join(os.path.dirname(os.path.realpath(__file__)), 'lib'))
Any suggestions?

You're mixing up the documentation for the standard environment with that for the flexible environment.
Installing dependencies into the lib directory and using an appengine_config.py file is specific to the 1st generation standard environment.
For the flexible environment you specify your Python dependencies in the requirements.txt file; see Using Python Libraries:
The Python runtime will automatically install all dependencies
declared in your requirements.txt during deployment.
For non-Python dependencies, or those which aren't pip-installable, you can use a custom runtime; see Up-to-date pip with AppEngine Python flex env?
Maybe of interest: How to tell if a Google App Engine documentation page applies to the 1st/2nd generation standard or the flexible environment
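For the error in the question specifically: listing nltk in requirements.txt installs the library, but the 'punkt' data is a separate download and is not installed by pip. One way to cover that is to fetch it when the app starts. The following is only a minimal sketch under that assumption; the function name ensure_punkt, the injectable download argument, and the /tmp/nltk_data directory in the usage comment are hypothetical and not part of NLTK or App Engine.

```python
import os


def ensure_punkt(data_dir, download=None):
    """Fetch the 'punkt' tokenizer data into data_dir unless it is already there.

    Returns True if a download was triggered, False otherwise.
    `download` is a callable taking (package, download_dir). It defaults to
    nltk.download and can be swapped out, for example in tests.
    """
    target = os.path.join(data_dir, "tokenizers", "punkt")
    if os.path.isdir(target):
        return False  # data already unpacked, nothing to do

    os.makedirs(data_dir, exist_ok=True)

    if download is None:
        import nltk  # lazy import; nltk itself comes from requirements.txt

        def download(package, download_dir):
            nltk.download(package, download_dir=download_dir)

    download("punkt", data_dir)
    return True


# Hypothetical usage at app start-up, before any tokenizer is used:
# ensure_punkt("/tmp/nltk_data")
```

NLTK only searches the directories listed in the log plus whatever the NLTK_DATA environment variable names, so the chosen directory would also have to be set as NLTK_DATA (for example under env_variables in app.yaml). Downloading at build time in a custom runtime is an alternative that avoids the start-up delay.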

Related

"gcloud app deploy" hangs on "Building and pushing image for service"

I suddenly can't deploy using gcloud app deploy.
It hangs on "Building and pushing image for service [default]". At that time, the Python process takes 99% CPU, and continues until the deploy times out. I've tried upgrading Python to no avail.
It occurs regardless of the App Engine project. I have tried installing different versions of the gcloud CLI, to no avail.
My teammates can deploy successfully using the same commands. Any ideas?
EDIT:
The app.yaml file:
runtime: nodejs
env: flex
manual_scaling:
  instances: 1
resources:
  cpu: 1
  memory_gb: 6
  disk_size_gb: 10
env_variables:
  ...
Output of my gcloud app deploy --verbosity=debug: https://gist.github.com/bbarton/05b8ae514849a5731101afb5ae639357

gcloud app deploy trying to push old code

I have an App Engine flexible environment app written in Python. There was a syntax error in one of the Python files, so when I ran the deploy command, it failed with the error "Application startup error! Code: APP_CONTAINER_CRASHED" and showed me the syntax error. I fixed the error, but when I tried to re-deploy, it still showed me the old error with the old code in the stack trace. The command I am using to deploy is this:
gcloud app deploy --version=6
The code being pushed seems to be cached somewhere and I am unable to figure out how to clear the cache. I am using a .gcloudignore file and its contents are as follows:
.gcloudignore
.git
.gitignore
__pycache__/
/setup.cfg
EDIT: My app.yaml file is as follows:
runtime: python
env: flex
entrypoint: gunicorn -b :$PORT main:app --timeout 3600
runtime_config:
  python_version: 3
beta_settings:
  cloud_sql_instances: ai4sg-armman:asia-south1:mmitra-armaan-10gb-cloudsql,ai4sg-armman:asia-south1:mmitra-predictions
# This sample incurs costs to run on the App Engine flexible environment.
# The settings below are to reduce costs during testing and are not appropriate
# for production use. For more information, see:
# https://cloud.google.com/appengine/docs/flexible/python/configuring-your-app-with-app-yaml
manual_scaling:
  instances: 1
resources:
  cpu: 8
  memory_gb: 45
  disk_size_gb: 1000
The full terminal logs can be found here: https://justpaste.it/4y1px

How can I lower my billing in gcp app engine deployment?

I have hosted multiple applications on GCP App Engine. We are currently in a development and testing phase, and user requests are almost nil. My understanding was that billing should be low when traffic is low, but the bill for the past two months is far higher than we had initially expected. Our target is to host over a hundred applications in the near future, and if the current billing trend continues, the situation after we scale is scary.
Until October we hosted 5 applications and the bill was around 250 USD per month. Since November we've added two more applications, practically the same size and with the same requirements as the previous five, and the bill has crossed 700 USD per month.
Is it possible that we are doing something wrong? Or would it be better to move to Kubernetes or VM instances?

App Engine is billed per instance-hour. Compared to the market its prices are more than fair, but you have to consider the whole picture when forecasting the costs of your applications, including the price of other services, traffic, etc.
First off, I suggest you review the App Engine pricing. Which instance types are you using? Can you use a cheaper instance type?
Check how many instances your application spawns; you can do this on the GCP App Engine info page or with Stackdriver Monitoring. Is the behaviour what you expect? Are you spawning too many instances at some point because of cron jobs, etc.? Would it be possible to limit the maximum number of instances your application runs at a time in order to contain costs?
If you are also using other services, carefully review the costs for each project on its billing page. What is costing more than expected?
Review your total costs using the GCP pricing calculator, understand what you didn't expect, and adjust your application to cover spikes in costs.
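As a quick sanity check before digging into the instance pages, the figures from the question can be turned into per-application averages. This is only arithmetic on the rounded numbers the asker gave ("around 250" and "crossed 700"), so treat the results as rough; the function name cost_per_app is made up for the illustration.

```python
def cost_per_app(monthly_bill_usd, app_count):
    """Average monthly cost per deployed application, in USD."""
    return monthly_bill_usd / app_count


before = cost_per_app(250, 5)       # until October: 5 applications
after = cost_per_app(700, 7)        # since November: 7 applications
marginal = (700 - 250) / (7 - 5)    # extra cost per newly added application

print(before, after, marginal)
```

The average per application roughly doubles and each new application appears to cost several times what the earlier ones did, which points at a change in instance type, instance count, or another billed service rather than at the two extra applications alone.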

The bill suggests something is not truly idle.
Could you provide more details, such as a screenshot of the App Engine -> Instances summary page, or the billing summary? (I can't add comments yet, hence my answer is a question for now.)

#standard app.yaml
# service name or project name
service: default

# python runtime version
runtime: python37

entrypoint: #django

# type of app engine standard or flex
env: standard

# environment variables required for the project
env_variables:

# GCP cloud database envs
# Bucket storage envs

handlers:
- url: /static
  static_dir: static
#flex app.yaml
service: #service name

runtime: custom

entrypoint: #django
env: flex

env_variables:

#GCP cloud sql
# bucket link

handlers:
- url: /static
  static_dir: static/

runtime_config:
  python_version: 3.6 # enter your Python BASE version only here. Enter 2 for 2.7.9 or 3 for 3.6.4
#flex docker file
FROM gcr.io/google-appengine/python
LABEL python_version=python3.6
RUN virtualenv --no-download /env -p python3.6
# Set virtualenv environment variables. This is equivalent to running
# source /env/bin/activate
ENV VIRTUAL_ENV /env
ENV PATH /env/bin:$PATH
ADD requirements.txt /app/
RUN apt-get update
RUN apt-get install -y software-properties-common
RUN apt-add-repository ppa:ricotz/testing
RUN apt-get update
RUN apt-get install -y libcairo2-dev
RUN apt-get install -y build-essential python3-dev python3-pip python3-setuptools python3-wheel python3-cffi libcairo2 libpango-1.0-0 libpangocairo-1.0-0 libgdk-pixbuf2.0-0 libffi-dev shared-mime-info
RUN apt install -y pkg-config
RUN pip install -r requirements.txt
ADD . /app/
CMD exec gunicorn -b :$PORT name.wsgi

How to pass -ldflags to GAE build?

I have an HTTP service written in Go. Inside main.go I have a global version string.
package main
var version string
Locally, I build using -ldflags "-X main.version=$VERSION" where $VERSION is determined by the shell environment, like so:
VERSION=v0.16.0; go build -ldflags "-X main.version=$VERSION" ./cmd/app
I've recently decided to trial Google App Engine and started with a basic YAML file:
runtime: go111
handlers:
- url: /.*
  script: auto
What can I set in the YAML file in order to instruct GAE to build with the equivalent ldflags to bake in my version string?
I should also mention I use go modules with GO111MODULE=on locally when building.

You can't do it with your app.yaml file.
However, you can use Cloud Build to build and deploy your app to App Engine.
In your cloudbuild.yaml you can pass the flags in the args of the build step, for example:
args: ['build', '-a', '-installsuffix', 'cgo', '-ldflags', '''-w''', '-o', 'main', './main.go']

Is scrapy supported on google app engine?

It has the following dependencies:
- Twisted 2.5.0, 8.0 or above
- lxml or libxml2 (if using libxml2, version 2.6.28 or above is highly recommended)
- simplejson
- pyopenssl

You cannot use C extensions on App Engine, which rules out lxml and (I believe) libxml2 and pyopenssl.
I doubt most of what Twisted does is possible in the App Engine sandbox either; you can't directly open sockets or spawn threads.
EDIT (January 2013): The Python 2.7 runtime does include some C extensions, including lxml. However, it's still not possible to use C extensions that aren't provided by Google with the runtime; most likely scrapy is still unusable at this time.

No, but you could try AWS (http://dev.scrapy.org/wiki/AmazonEC2)

Update for 2019:
Scrapy does work on GAE. I can confirm that Scrapy can be deployed on the GAE Python 3 standard environment using ScrapyRT.
Your scrapy.cfg file must be in the same directory as app.yaml to be picked up, and a minimal setup would look like this:
runtime: python37
instance_class: F2
env_variables:
  PORT: 8080
entrypoint: scrapyrt -i 0.0.0.0 -p $PORT -s LOG_DIR=/tmp
Note that LOG_DIR is set to /tmp, which is most likely not what anyone would want in a production environment. I might extend this answer once I have figured out how to approach this properly.
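Once deployed this way, spiders are triggered over HTTP rather than from the command line. ScrapyRT exposes a crawl.json endpoint that takes spider_name and url query parameters. A minimal sketch of building such a request URL follows; the host example.appspot.com, the spider name "quotes", and the helper crawl_url are placeholders for illustration, not values from the setup above.

```python
from urllib.parse import urlencode


def crawl_url(base_url, spider_name, target_url):
    """Build a ScrapyRT GET URL that asks one spider to crawl one page."""
    # urlencode percent-escapes the target URL so it survives as a parameter.
    query = urlencode({"spider_name": spider_name, "url": target_url})
    return f"{base_url.rstrip('/')}/crawl.json?{query}"


# Placeholder host and spider name; substitute your own deployment.
request_url = crawl_url(
    "https://example.appspot.com", "quotes", "https://example.com/page"
)
print(request_url)
```

The resulting URL can be fetched with any HTTP client; the response is JSON containing the scraped items.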