Spacy on GAE/standard/second/Python exceeds memory of largest instance

I've been using GAE for a while without any issues. The only recent change is that I added Spacy along with a model I trained.
When I run locally with the dev_appserver, the app consumes about 153 MB. After deployment, I get memory exceeded errors. Even with the F4_1G instance, I exceed the memory:
Exceeded hard memory limit of 1228 MB with 1280 MB after servicing 0 requests total. Consider setting a larger instance class in app.yaml.
The deployment works if I import Spacy but don't load my model (the instance uses about 200 MB), so Spacy itself isn't the problem; the memory exceeds the limit only when I load my model with spacy.load(). Note that this happens before I even use the model, so just loading it is enough to cause the problem.
My Spacy model is a tagger and parser that takes up 27 MB on disk. I can't understand why its memory requirements would be so much larger on App Engine than on my Mac.
It looks like others have been able to run Spacy on App Engine. Any idea what I could be doing wrong?

I was able to find a solution. I was loading my model into a module-level variable, so the model was loaded whenever the module was imported.
When you deploy a second-gen GAE app, a number of worker threads get started (8 in my case). I don't understand the details of the worker threads, but I suspect that several of them import the module and that all of them contribute to memory usage.
I changed my code so that the model gets loaded on first use instead of at module import. With this change, memory usage is 428 MB.
Here is an example of what not to do:
import spacy

# The model is loaded as a side effect of importing the module,
# so every worker that imports it pays the full memory cost.
nlp = spacy.load('my_model')

def process_text(text):
    return nlp(text)
Instead do this:
import spacy

nlp = None

def process_text(text):
    global nlp
    if nlp is None:
        # Load lazily, on first use, rather than at import time.
        nlp = spacy.load('my_model')
    return nlp(text)
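One caveat with this pattern: if several request threads call process_text before the model is loaded, each can trigger its own spacy.load(). A minimal sketch of a lock-guarded variant (not from the original answer; same hypothetical 'my_model'):

import threading
import spacy

nlp = None
_nlp_lock = threading.Lock()

def process_text(text):
    global nlp
    if nlp is None:
        with _nlp_lock:
            if nlp is None:  # re-check: another thread may have loaded it first
                nlp = spacy.load('my_model')
    return nlp(text)

Separately, on second-generation runtimes the number of gunicorn workers can be pinned via the entrypoint in app.yaml (e.g. entrypoint: gunicorn -b :$PORT --workers 2 main:app), so fewer workers each hold a copy of the model.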

Related

How to reduce React app build time and understand webpack's behaviour when bundling

Recently I have been trying to optimize the performance of a React web app. It is fairly heavy, as it includes code editors, Firebase, SQL, the AWS SDK, etc. So I integrated react-loadable to lazy-load components, and after that I got this JavaScript heap out-of-memory error:
FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
After some research (from a friend), I came to know that if we keep too many lazy-loaded routes, webpack will try to bundle them in parallel, which might be the cause of the heap memory issue. To confirm this, I removed all lazy-loaded routes in my app and built it: the build succeeded. Later, as suggested by the community, I increased the Node heap size and got the insights below.
First I increased it to 8 GB (8192): the build succeeded, taking around 72 minutes the first time and around 20 minutes on subsequent runs. Then I decreased the heap size to 4 GB (4096): the build still succeeded, taking around 15-20 minutes. The build machine was an AWS EC2 r5a.large instance (2 vCPUs, 16 GB RAM).
Next, I ran the build on another system (MacBook Pro, i5, 4 cores, 8 GB RAM): it took 30 minutes the first time and 20 minutes the second.
From these data points, I have a few questions:
Do we need to keep increasing the heap size whenever we add some code? If yes, what is a typical heap size in the community?
What is the usual build-system configuration for heavy apps like this? Right now I am not sure whether to increase the number of cores, the RAM, or the heap size, or whether it is altogether something to fix in our app code.
Does webpack provide any way to avoid the heap memory issue, such as limiting parallel processes, or any plugins?
If it is related to our app code, is there any standard process to debug where the memory is going and to optimize based on that?
PS: Some people suggested setting GENERATE_SOURCEMAP=false, and it worked, but we need source maps as they are helpful for debugging production issues.
Finally, I was able to resolve the heap out-of-memory issue without increasing the heap size.
As mentioned in the question, the build succeeded if I removed all lazy routes; otherwise I had to allocate a 4 GB heap to get it to succeed, with a very long build time. When the build succeeded with the 4 GB heap, I observed that roughly 8-10 chunk files were each close to 1 MB, so I analyzed those chunks with source-map-explorer. Almost the same library code was included in every chunk (in my case the heavy ones were Firebase, a video player, etc.).
So my assumption is that when webpack bundles these chunks, it has to build those libraries' dependency graphs in every chunk, which in turn causes the heap memory issue. I therefore used Loadable Components to lazy-load those libraries.
After lazy-loading those libraries, every chunk shrank to roughly half its size, the build succeeds without any extra heap space, and the build time dropped as well.
After this optimization, a build on a 6 vCPU i7 system takes around 3-4 minutes, and I observed that build time drops as the number of available cores increases; on a 2 vCPU system it sometimes takes around 20-25 minutes.
Vanilla webpack was designed for monolithic builds. Its main purpose is to take many modules and bundle them into ONE (not many). If you want to keep things modular, you want to use webpack Module Federation (WMF):
WMF allows you to develop independent packages that can easily depend on (and lazy load) each other.
These packages will automatically share dependencies between each other.
Without WMF, webpack allows none of the above.
Short Example
A library package app2 provides a component Button
An application package app1 consumes it.
When the time comes, app1 requests a component using dynamic import.
You can wrap the load using React.lazy, like so:
const RemoteButton = React.lazy(() => import("app2/Button"));
E.g., you can do this in a useEffect, or a Route.render callback etc.
app1 can use that component once it's loaded. While it loads, you might want to show a loading screen (e.g. using Suspense):
<React.Suspense fallback={<LoadingScreen />}>
  <RemoteButton />
</React.Suspense>
Alternatively, instead of using lazy and Suspense, just take the promise returned from the import(...) statement and handle the asynchronous loading any way you prefer. Of course, WMF is not at all restricted to react and can load any module dynamically.
On the flip side, WMF dynamic loading must use dynamic import (i.e. import(...)), because:
non-dynamic imports always resolve at load time (making them non-dynamic dependencies), and
a "dynamic require" cannot be bundled by webpack, since browsers have no concept of CommonJS (unless you use some hacks, in which case you lose the relevant "loading promise").
Documentation
Even though, in my experience, WMF is mature, easy to use, and probably production-ready, its official documentation is currently only a not-too-polished collection of conceptual notes. That is why I would recommend this as a "Getting Started" guide.

NDB query().iter() of 1000<n<1500 entities is wigging out

I have a script that, using the Remote API, iterates through all entities of a few models. Say two models: FooModel with about 200 entities, and BarModel with about 1200 entities. Each has 15 StringPropertys.
for model in [FooModel, BarModel]:
    print 'Downloading {}'.format(model.__name__)
    new_items_iter = model.query().iter()
    new_items = [i.to_dict() for i in new_items_iter]
    print new_items
When I run this in my console, it hangs for a while after printing 'Downloading BarModel'. It hangs until I hit ctrl+C, at which point it prints the downloaded list of items.
When this is run in a Jenkins job, there's no one to press ctrl+C, so it just runs continuously (last night it ran for 6 hours before something, presumably Jenkins, killed it). Datastore activity logs reveal that the datastore was taking 5.5 API calls per second for the entire 6 hours, racking up a few dollars in GAE usage charges in the meantime.
Why is this happening? What's with the weird behavior of ctrl+C? Why is the iterator not finishing?
This is a known issue currently being tracked on the Google App Engine public issue tracker under Issue 12908. The issue was forwarded to the engineering team and progress on this issue will be discussed on said thread. Should this be affecting you, please star the issue to receive updates.
In short, the issue appears to be with the remote_api script: when querying entities of a given kind, it hangs after fetching 1001 + batch_size entities if a batch_size is specified. This does not happen in production outside of the remote_api.
Possible workarounds
Using the remote_api
One could limit the number of entities fetched per script execution using the limit argument for queries. This may be somewhat tedious, but the script could simply be executed repeatedly (for example, from another script) to achieve the same effect.
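For example, a minimal sketch of that idea using query cursors (assuming ndb models), so that each round trip stays well below the problematic threshold:

from google.appengine.ext import ndb

def download_all(model, page_size=500):
    # Page through the kind with cursors instead of one long iterator.
    items = []
    results, cursor, more = model.query().fetch_page(page_size)
    while results:
        items.extend(entity.to_dict() for entity in results)
        if not more or cursor is None:
            break
        results, cursor, more = model.query().fetch_page(
            page_size, start_cursor=cursor)
    return items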
Using admin URLs
For repeated operations, it may be worthwhile to build a web UI accessible only to admins. This can be done with the help of the users module as shown here. This is not really practical for a one-time task but far more robust for regular maintenance tasks. As this does not use the remote_api at all, one would not encounter this bug.
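A rough sketch of such an admin-only handler (hypothetical names; webapp2 assumed, with BarModel from the question):

import json
import webapp2
from google.appengine.api import users

class ExportHandler(webapp2.RequestHandler):
    def get(self):
        # Only signed-in application administrators may run the export.
        if not users.is_current_user_admin():
            self.abort(403)
        items = [e.to_dict() for e in BarModel.query()]
        self.response.headers['Content-Type'] = 'application/json'
        self.response.write(json.dumps(items))

app = webapp2.WSGIApplication([('/admin/export', ExportHandler)])

Marking the route with login: admin in app.yaml adds a second layer of protection in front of the handler.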

GAE Soft private memory limit error on post requests

I am working on an application that uses the paid services of Google App Engine. The application parses a large XML file and extracts data into the datastore, but while performing this task GAE throws the error below.
I also tried changing the performance settings by increasing the frontend instance class from F1 to F2.
ERROR:
Exceeded soft private memory limit of 128 MB with 133 MB after servicing 14 requests total.
After handling this request, the process that handled this request was found to be using too much memory and was terminated. This is likely to cause a new process to be used for the next request to your application. If you see this message frequently, you may have a memory leak in your application.
Thank you in advance.
When you face the "Exceeded soft private memory limit" error, you have two alternatives:
Upgrade your instance to a more powerful one, which gives you more memory.
Reduce the chunk of data you process in each request: you could split the XML file into smaller pieces and keep the smaller instance doing the work.
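For the first option, the instance class is a one-line change in app.yaml (F2 has twice the memory of the default F1):

instance_class: F2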
I agree with Mario's answer. Your options are indeed to either upgrade to an Instance class with more memory such as F2 or F3 or process these XML files in smaller chunks.
To help you decide what would be the best path for this task, you would need to know if these XML files to be processed will grow in size. If the XML file(s) will remain approximately this size, you can likely just upgrade the instance class for a quick fix.
If the files can grow in size, then augmenting the instance memory may only buy you more time before encountering this limit again. In this case, your ideal option would be to use a stream to parse the XML file(s) in smaller units, consuming less memory. In Python, xml.sax can be used to accomplish just that as the parse method can accept streams. You would need to implement your own ContentHandler methods.
In your case, the file is coming from the POST request but if the file were coming from Cloud Storage, you should be able to use the client library to stream the content through to the parser.
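A minimal sketch of that streaming approach (the element name, save_record, and xml_stream are hypothetical stand-ins for your schema, datastore write, and file-like source):

import xml.sax

class RecordHandler(xml.sax.ContentHandler):
    # Streams through the document, keeping only one record in memory.
    def __init__(self):
        xml.sax.ContentHandler.__init__(self)
        self.record = None
        self.text = []

    def startElement(self, name, attrs):
        if name == 'record':        # hypothetical element name
            self.record = {}
        self.text = []

    def characters(self, content):
        self.text.append(content)

    def endElement(self, name):
        if self.record is None:
            return
        if name == 'record':
            save_record(self.record)   # hypothetical datastore write
            self.record = None
        else:
            self.record[name] = ''.join(self.text)

xml.sax.parse(xml_stream, RecordHandler())   # xml_stream: any file-like object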
I had a similar problem. I'm almost sure it was my usage of the /tmp directory that caused it: that directory is mounted in memory. So, if you are writing any files into /tmp, don't forget to remove them!
Another option is that you actually have a memory leak. The message says "after servicing 14 requests", which means that getting a more powerful instance will only delay the error. I would recommend cleaning up memory; I don't know what your code looks like, but I do the following in mine:
import gc

# ...

@app.route('/fetch_data')
def fetch_data():
    data_object = fetch_data_from_db()
    uploader = AnotherHeavyObject()
    # ...
    response = extract_data(data_object)
    # Drop references to the large objects and force a collection
    # before returning the response.
    del data_object
    del uploader
    gc.collect()
    return response
After trying the things above, it now seems the issue was with FuturesSession - related to this: https://github.com/ross/requests-futures/issues/20. So perhaps it's another library you're using - but be warned that some of those libraries leak memory, and App Engine preserves state between requests, so whatever is not cleaned up stays in memory and affects subsequent requests on the same instance.

Zipped packages and in-memory storage strategies

I have a multitenant app with a zipped package for each tenant/client, containing the templates and handlers for that tenant's public site. Right now I have under 50 tenants, and it's fine to keep the imported apps in memory after the first request to a specific client's domain.
This approach works well, but I have to redeploy the app with the new zipped package every time I make changes and/or a new client gets added.
Now I'm working on making it possible to upload those packages via web upload and store them in the Blobstore.
My concerns now are:
Getting the packages from the Blobstore is of course slower than importing a zipped package from the filesystem, but this is not the biggest issue.
How do I load/import a module that is not in the filesystem and has no path?
If every client's package is around 1 MB, this is not a problem as long as the client base is small, but what if it grows to 1k clients or more? Obviously I wouldn't have enough memory to store a few GB of data in memory.
What is the best way to deal with this?
If I use instance memory to store the previously loaded tenant packages, how would I invalidate the data in memory when a new package is uploaded?
I would appreciate some thoughts about how to deal with this kind of situation.
I agree with Nick: there should be no Python code in the tenant-specific zip. To solve the memory issue, I would cache most of the pages in the datastore (a sketch of this follows below); to serve them, you don't need to have all tenants loaded in your instances. You might also want to look into pre-generating HTML views on save rather than on request.
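A sketch of that caching idea (entity and helper names are hypothetical; memcache in front of the datastore keeps hot tenants cheap):

from google.appengine.api import memcache
from google.appengine.ext import ndb

class RenderedPage(ndb.Model):
    # Keyed by '<tenant>:<path>'; ideally written when the page is saved/uploaded.
    html = ndb.TextProperty()

def serve_page(tenant, path, render):
    key = '%s:%s' % (tenant, path)
    html = memcache.get(key)
    if html is None:
        page = RenderedPage.get_by_id(key)
        if page is None:
            # Fallback: render on request; pre-generating on save avoids this.
            page = RenderedPage(id=key, html=render())
            page.put()
        html = page.html
        memcache.set(key, html)
    return html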

Google App Engine Large File Upload

I am trying to upload data to Google App Engine (using GWT). I am using the FileUploader widget, and the servlet uses an InputStream to read the data and insert it directly into the datastore. Running it locally, I can upload large files successfully, but when I deploy to GAE, I am limited by the 30-second request time. Is there any way around this? Or is there any way to split the file into smaller chunks and send those?
By using the Blobstore you get a 1 GB size limit and a special handler, called (unsurprisingly) BlobstoreUploadHandler, that shouldn't give you timeout problems on upload.
Also check out http://demofileuploadgae.appspot.com/ (sourcecode, source answer) which does exactly what you are asking.
Also, check out the rest of GWT-Examples.
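For reference, in Python the pair of handlers the first answer describes looks roughly like this (a sketch only; the GWT/Java API is analogous):

import webapp2
from google.appengine.ext import blobstore
from google.appengine.ext.webapp import blobstore_handlers

class UploadFormHandler(webapp2.RequestHandler):
    def get(self):
        # The form must post to a one-time URL minted by the Blobstore.
        url = blobstore.create_upload_url('/upload')
        self.response.write('<form action="%s" method="POST" '
                            'enctype="multipart/form-data">'
                            '<input type="file" name="file">'
                            '<input type="submit"></form>' % url)

class UploadHandler(blobstore_handlers.BlobstoreUploadHandler):
    def post(self):
        # By the time this runs, the file body is already in the Blobstore,
        # so the request deadline only covers this code.
        blob_info = self.get_uploads('file')[0]
        self.redirect('/serve/%s' % blob_info.key())

class ServeHandler(blobstore_handlers.BlobstoreDownloadHandler):
    def get(self, blob_key):
        self.send_blob(blob_key)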
Currently, GAE imposes a 10 MB limit on file uploads (and on response size), as well as 1 MB limits on many other things; so even if you had a network connection fast enough to pump more than 10 MB within a 30-second window, it would be to no avail. Google has said (I heard Guido van Rossum mention it yesterday here at Pycon Italia Tre) that it plans to overcome these limitations in the future (at least for GAE users who pay per-use to exceed quotas; I'm not sure whether the plans extend to non-paying users, who generally have to accept smaller quotas in their free use of GAE).
You would need to do the upload to another server; I believe the 30-second timeout cannot be worked around. If there is a way, please correct me! I'd love to know how!
If your request is running out of request time, there is little you can do. Maybe your files are too big and you will need to chunk them on the client (with something like Flash or Java, or an upload framework like Plupload).
Once you get the file to the application, there is another issue: the datastore limitations. Here you have two options:
You can use the Blobstore service, which has quite a nice API for handling uploads up to 50 megabytes.
You can use something like bigblobae, which can store virtually unlimited-size blobs in the regular App Engine datastore.
The 30-second response time limit only applies to code execution; the upload of the actual file as part of the request body is excluded from it. The timer only starts once the request has been fully sent to the server by the client and your code starts handling the submitted request, so it doesn't matter how slow your client's connection is.
See also: Uploading file on Google App Engine using Datastore and 30 sec response time limitation
The closest you could get would be to split the file into chunks as you store it in GAE, and then, when you download it, piece it back together by issuing separate AJAX requests.
I would agree with chunking the data into smaller blobs and keeping two tables: one containing the metadata (filename, size, number of downloads, etc.) and another containing the chunks, associated with the metadata table by a foreign key. I think it is doable; see the sketch after this answer.
Alternatively, once you have uploaded all the chunks, you can simply put them together into one blob in a single table.
But the problem is that you would need a thick client to do the chunking, like a Java applet, which needs to be signed and trusted by your clients so it can access the local filesystem.
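A sketch of the two-table layout from the answer above, in Python's old db API (names are hypothetical):

from google.appengine.ext import db

class FileMeta(db.Model):
    filename = db.StringProperty()
    size = db.IntegerProperty()
    downloads = db.IntegerProperty(default=0)

class FileChunk(db.Model):
    meta = db.ReferenceProperty(FileMeta)   # the "foreign key"
    index = db.IntegerProperty()
    data = db.BlobProperty()                # keep each chunk well under 1 MB

def reassemble(meta):
    # Stitch the chunks back together in order.
    chunks = FileChunk.all().filter('meta =', meta).order('index')
    return ''.join(chunk.data for chunk in chunks)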
