I am aware that App Engine has a 32 MB request upload limit. I am wondering whether that limit can be increased.
A lot of other research suggests that I need to use the Blobstore API directly; however, my application has a special requirement that prevents me from using it.
Other issues suggest that you can modify the nginx configuration in a custom flex environment. However, when I SSH'd into the instance I did not see any nginx. I have reason to believe it is the GAE load balancer blocking the request before it even reaches the application.
Here is my setup.
GAE Flex Environment
Custom Runtime, Java using Docker
Objective: I want to increase the client_max_body_size to 100 MB.
As you can see, this limit is stated in the official documentation. There is no way to increase it, as it is tied to the runtime environment itself. You could use the Go environment, which has a limit of 64 MB.
This issue is discussed on other forums, but for now you need to handle these kinds of requests programmatically: check whether they are bigger than 32 MB and, if they are, split them into smaller requests and aggregate the results.
As a workaround you can also store the data in Google Cloud Storage and use it as a temporary staging area for your workflow.
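One possible shape for that workaround, assuming a reasonably recent google-cloud-storage Java client and a bucket name I made up, is to hand the client a short-lived signed URL so the 100 MB payload goes straight to Cloud Storage and never touches the 32 MB request limit:

```java
// Sketch: let the client upload the large file straight to Cloud Storage via a
// signed URL, so the request never passes through App Engine's 32 MB limit.
// The bucket name "my-staging-bucket" and the 15-minute expiry are hypothetical.
import com.google.cloud.storage.BlobInfo;
import com.google.cloud.storage.HttpMethod;
import com.google.cloud.storage.Storage;
import com.google.cloud.storage.StorageOptions;

import java.net.URL;
import java.util.concurrent.TimeUnit;

public class UploadUrlFactory {

    private final Storage storage = StorageOptions.getDefaultInstance().getService();

    /** Returns a short-lived URL the client can PUT the file to directly. */
    public URL createSignedUploadUrl(String objectName) {
        BlobInfo blobInfo = BlobInfo.newBuilder("my-staging-bucket", objectName).build();
        return storage.signUrl(
                blobInfo,
                15, TimeUnit.MINUTES,
                Storage.SignUrlOption.httpMethod(HttpMethod.PUT),
                Storage.SignUrlOption.withV4Signature());
    }
}
```

Once the client reports the upload as finished, the app can read the object from the bucket at its own pace.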
Related
I am trying to deploy a web app I have written, but I am stuck on one element. The bulk of it is just an Angular application that interacts with a MongoDB database; that's all fine. Where I am stuck is that I need local read access to around 10 GB of files (GeoTIFF digital elevation models). These don't change and are broken down into 500 or so files. Each time my app needs geographic elevations, it needs to find the right file, read the right part of it, and return the data, the quicker the better. To reiterate, I am not serving these files, just reading data from them.
In development these files are on my machine and I have no problems, but the files seem to be too large to bundle in the Angular app (it runs out of memory), and too large to include in any backend assets folder. I've looked at two serverless cloud hosting platforms (GCP and Heroku), both of which limit the size of the deployed files to around 1 GB (if I remember right). I have considered using cloud storage for the files, but I'm worried about poor performance, since each time I need a file it would have to be downloaded from the cloud to the application. The only solution I can think of is to use a VM-based service like Google Compute Engine and use an API service to receive requests from the app and deliver back the required data, but I had hoped it could be more co-located (not least because that solution costs more $$)...
I'm new to deployment so any advice welcome.
Load your data into a GIS database such as PostGIS, then have your app query that database instead of the local raster files.
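To make that concrete, here is a minimal Java/JDBC sketch of what the elevation lookup could look like once the GeoTIFFs have been loaded with raster2pgsql; the table, column, and connection details are all hypothetical, and the same query works from any backend language:

```java
// Sketch: query the elevation at a lon/lat point from a PostGIS raster table.
// Assumes the GeoTIFFs were loaded with raster2pgsql into a table named
// "elevation" with a raster column "rast" (names are hypothetical), and that
// the PostgreSQL JDBC driver is on the classpath.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class ElevationDao {

    private static final String SQL =
            "SELECT ST_Value(rast, ST_SetSRID(ST_MakePoint(?, ?), 4326)) AS elev "
          + "FROM elevation "
          + "WHERE ST_Intersects(rast, ST_SetSRID(ST_MakePoint(?, ?), 4326))";

    public Double elevationAt(double lon, double lat) throws SQLException {
        try (Connection conn = DriverManager.getConnection(
                 "jdbc:postgresql://localhost:5432/gis", "gis", "secret");
             PreparedStatement ps = conn.prepareStatement(SQL)) {
            ps.setDouble(1, lon);
            ps.setDouble(2, lat);
            ps.setDouble(3, lon);
            ps.setDouble(4, lat);
            try (ResultSet rs = ps.executeQuery()) {
                // Returns null if no raster tile covers the point.
                return rs.next() ? rs.getDouble("elev") : null;
            }
        }
    }
}
```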
I have started to try to use Google Cloud Datalab. While I understand it is a beta product, I find the docs very frustrating, to say the least.
The questions here and lack of responses as well as lack of new revisions or docs over the several months the project has been available make me wonder if there is any commitment to the product?
A good start would be a notebook that shows data ingestion from external sources into both the Datastore system and the BigQuery system. That is a common use case. I'd like to use my own data, and it would be great to have a notebook showing how to ingest it. It seems that should be doable without huge effort, and it would get me (and others) out of this mess of trying to link the various terse docs from various products and workspaces together and get them working.
In addition, a better explanation of the GitHub connection process would help (see my prior question).
For BigQuery, see here: https://github.com/GoogleCloudPlatform/datalab/blob/master/content/datalab/tutorials/BigQuery/Importing%20and%20Exporting%20Data.ipynb
For GCS, see here: https://github.com/GoogleCloudPlatform/datalab/blob/master/content/datalab/tutorials/Storage/Storage%20Commands.ipynb
Those are the only two storage options currently supported in Datalab (which should not be used in any event for large scale data transfers; these are for small scale transfers that can fit in memory in the Datalab VM).
For Git support, see https://github.com/GoogleCloudPlatform/datalab/blob/master/content/datalab/intro/Using%20Datalab%20-%20Managing%20Notebooks%20with%20Git.ipynb. Note that this has nothing to do with Github, however.
As for the low level of activity recently, that is because we have been heads down getting ready for GCP Next (which happens this coming week). Once that is over we should be able to migrate a number of new features over to Datalab and get a new public release out soon.
Datalab isn't running on your local machine. Just the presentation part is in your browser. So if you mean the browser client machine, that wouldn't be a good solution - you'd be moving data from the local machine to a VM which is running the Datalab Python code (and this VM has limited storage space), and then moving it again to the real destination. Instead, you should use the cloud console or (preferably) gcloud command line on your local machine for this.
I have this application that will be run on a local network where a number of devices should interact with a database. I could use XAMPP and go for CherryPy or any other Python framework (Python is usually my choice), but that is the sum of a lot of different things: Python, Apache, MySQL... With GAE, which I have previously used successfully in a number of applications, I feel everything is neatly packed in a single box. That may not be true, but using the Google App Engine Launcher to create a local working copy of an app couldn't be easier.
But is it reliable? Should it be used like that? I know it's intended for development, so I'm unsure about using it as a local server in production. A few versions ago there even was this nasty bug that flushed the local datastore from time to time. But it seems that they fixed it and now data persists.
Would you recommend GAE for an application running in a local network or should I stick to LAMP (P for Python)?
Another alternative is http://code.google.com/p/appscale/.
Maybe you can check out the TyphoonAE project. I think it is exactly what you need.
The TyphoonAE project aims at providing a full-featured and productive serving environment to run Google App Engine (Python) applications. It delivers the parts for building your own scalable App Engine while staying compatible with Google's API.
I have created a GWT application and now want to deploy it outside GAE. The reason I wish to deploy outside GAE is the sandbox security feature of GAE, which disallows writing files to the file system. I store my data in the form of an ontology (an .owl file) under '/war/WEB-INF', and I want the end user to be able to modify (write to / save) this file through the server.
I understand that GAE does not let me do this, but is there a paid Google service (e.g. Google Apps) that would allow hosting a GWT application and let it write files to the system? For instance, as an add-on to GAE?
If not, what solution would you recommend to host a GWT application (that would let me write a file to the WEB-INF folder) on the web?
EDIT: I solved this by deploying the GWT project as a .war file and hosting it in Tomcat.
I'm very new to GAE, but in case you haven't looked at their experimental blobstore write/read services, you can check them out here. They have a similar API for Python, I believe. The data is of course stored in the GAE Blobstore and not under the /war/WEB-INF/ directory, but it does offer a possible solution to what you're looking for.
Also, if you're looking to run your own server (possibly on EC2, for example), then you might want to look into AppScale. But I, personally, would stay away from that as a solution because I highly doubt that AppScale performs as well as Google's GAE web servers, and furthermore it lacks the same degree of support and development.
Have you ruled out something like creating an Owl entity to hold your ontologies, and arranging for *.owl requests to be handled by using the requested file name as a key name to find and serve the corresponding Owl entity? That's really simple code.
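In case it helps, here is a rough sketch of that idea using the low-level datastore API; the entity kind, property name, and URL layout are my own inventions, and keep in mind a single entity is capped at 1 MB, so very large ontologies would still need to be split:

```java
// Sketch: store each ontology as an "Owl" datastore entity keyed by file name,
// and serve/update it through a servlet mapped to "*.owl".
import com.google.appengine.api.datastore.Blob;
import com.google.appengine.api.datastore.DatastoreService;
import com.google.appengine.api.datastore.DatastoreServiceFactory;
import com.google.appengine.api.datastore.Entity;
import com.google.appengine.api.datastore.EntityNotFoundException;
import com.google.appengine.api.datastore.Key;
import com.google.appengine.api.datastore.KeyFactory;

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class OwlServlet extends HttpServlet {

    private final DatastoreService datastore = DatastoreServiceFactory.getDatastoreService();

    private Key keyFor(HttpServletRequest req) {
        // e.g. "/ontologies/pizza.owl" -> key name "pizza.owl"
        String uri = req.getRequestURI();
        return KeyFactory.createKey("Owl", uri.substring(uri.lastIndexOf('/') + 1));
    }

    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws IOException {
        try {
            Entity owl = datastore.get(keyFor(req));
            resp.setContentType("application/rdf+xml");
            resp.getOutputStream().write(((Blob) owl.getProperty("content")).getBytes());
        } catch (EntityNotFoundException e) {
            resp.sendError(HttpServletResponse.SC_NOT_FOUND);
        }
    }

    @Override
    protected void doPut(HttpServletRequest req, HttpServletResponse resp) throws IOException {
        // Read the uploaded ontology body and overwrite the stored entity.
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = req.getInputStream().read(chunk)) != -1) {
            buf.write(chunk, 0, n);
        }
        Entity owl = new Entity(keyFor(req));
        owl.setProperty("content", new Blob(buf.toByteArray()));
        datastore.put(owl);
        resp.setStatus(HttpServletResponse.SC_NO_CONTENT);
    }
}
```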
GWT is primarily a client side technology. GAE is a server side technology. You seem to be getting GWT and GAE engine mixed up with each other. GAE can work with almost any client side technology, and GWT can connect to many different back end platforms.
Are you trying to move your back end code directly to a new platform? Are you planning on rewriting the back end for a new platform, but keep the GWT code? What is your goal for this application? To be used by you and a few friends, or by thousands of people? For free or paying customers?
If you want to move off of App Engine, you can switch to pretty much any Java hosting service you want, anything from a tiny shared VPS up to an Amazon EC2 mini cloud of your own. I don't think Google offers generic Java hosting. I don't know how you have built your application's back end, but you probably used servlets, which you should be able to get working pretty much anywhere.
If you want to stay on AppEngine, you should think about whether or not you can break your owl file into smaller sections that can be stored as entities in the database.
Whichever platform you choose, if you are planning on serving more than a few people, you will need some way to prevent one giant owl file from becoming a huge bottleneck.
I have an appengine app and I need to receive files from Third Parties.
The best option for me is to receive the files via FTP, but I have read that this was not possible, at least as of a year ago.
Is it still not possible? In which other way could I receive the files?
This is very important to my project, in fact it is indispensable.
Thx a lot!!!!
You need to use the Blobstore.
Edit: To post to the Blobstore in Java, the code fragment in this SO question should work (that one was for Android; elsewhere, use e.g. Apache HttpClient). The URL to post to must have been created with createUploadUrl. The simplest way to communicate it to the source server might be a GET URL, e.g. "/makeupload", which is text/plain and contains only the URL to POST to. To prevent unauthorized uploads, you can require a password either in the POST, or already in the GET (e.g. as a query parameter).
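A minimal sketch of that setup, with hypothetical servlet paths and a made-up shared-secret check standing in for real authorization:

```java
// Sketch: one servlet hands out a fresh Blobstore upload URL as text/plain,
// and a second servlet receives the completed upload.
import com.google.appengine.api.blobstore.BlobKey;
import com.google.appengine.api.blobstore.BlobstoreService;
import com.google.appengine.api.blobstore.BlobstoreServiceFactory;

import java.io.IOException;
import java.util.List;
import java.util.Map;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class MakeUploadServlet extends HttpServlet {

    private final BlobstoreService blobstore = BlobstoreServiceFactory.getBlobstoreService();

    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws IOException {
        // Hypothetical shared secret so only the third party can request upload URLs.
        if (!"s3cret".equals(req.getParameter("token"))) {
            resp.sendError(HttpServletResponse.SC_FORBIDDEN);
            return;
        }
        resp.setContentType("text/plain");
        // The third party must POST its file (multipart/form-data) to this URL.
        resp.getWriter().print(blobstore.createUploadUrl("/uploadcomplete"));
    }
}

/** Blobstore redirects here once the file has been stored. */
class UploadCompleteServlet extends HttpServlet {

    private final BlobstoreService blobstore = BlobstoreServiceFactory.getBlobstoreService();

    @Override
    protected void doPost(HttpServletRequest req, HttpServletResponse resp) throws IOException {
        Map<String, List<BlobKey>> uploads = blobstore.getUploads(req);
        BlobKey key = uploads.get("file").get(0); // "file" = form field name used by the sender
        // Persist the BlobKey somewhere so the app can read the file later.
        resp.setStatus(HttpServletResponse.SC_OK);
    }
}
```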
The answer depends a lot on the size range of your imports. For small files, the URL Fetch API will be sufficient.
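For example, something along these lines (a Java sketch; the source URL is made up, and URL Fetch responses are themselves capped at 32 MB):

```java
// Sketch: pull a small file from a third-party URL with the URL Fetch service.
import com.google.appengine.api.urlfetch.HTTPResponse;
import com.google.appengine.api.urlfetch.URLFetchService;
import com.google.appengine.api.urlfetch.URLFetchServiceFactory;

import java.io.IOException;
import java.net.URL;

public class SmallFileFetcher {

    public byte[] fetch() throws IOException {
        URLFetchService fetcher = URLFetchServiceFactory.getURLFetchService();
        HTTPResponse response = fetcher.fetch(new URL("https://example.com/export/data.csv"));
        return response.getContent(); // hand these bytes to your import code
    }
}
```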
I myself tend to import large CSV files ranging from 70 to 800 MB, in which case the legacy Blobstore and HTTP POST don't cut it: GAE cannot handle HTTP requests larger than 32 MB directly, nor can you upload static files larger than 32 MB for manual import.
Traditionally, I've used a *nix relay for downloading the data files, splitting them into well-formed JSON segments, and then submitting maybe 10-30 K HTTP POST requests back to GAE. This used to be the only viable workaround, and for files over 1 GB it might still be the preferred method because of how it scales (a complex import procedure is easily distributed across hundreds of F1 instances).
Luckily, as of April 9 this year (SDK 1.7.7) importing large files directly to GAE isn't much of a problem any longer. Outbound sockets are generally available to all billing-enabled apps, and consequently you'd easily solve the "large files" issue by opening up an FTP connection and downloading.
Sockets API Overview (Python): https://developers.google.com/appengine/docs/python/sockets/
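The Sockets API is available for Java as well, so a hedged sketch of the FTP approach could look like the following; Apache Commons Net is an assumed dependency, all connection details are invented, and whether passive-mode FTP to your particular provider works from behind App Engine's egress is something to verify:

```java
// Sketch: download a large file over FTP from a billing-enabled app using the
// outbound Sockets API, streaming it into some destination (e.g. Cloud Storage).
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import org.apache.commons.net.ftp.FTP;
import org.apache.commons.net.ftp.FTPClient;

public class FtpImporter {

    public void download(OutputStream destination) throws IOException {
        FTPClient ftp = new FTPClient();
        try {
            ftp.connect("ftp.example.com");
            ftp.login("importer", "s3cret");
            ftp.enterLocalPassiveMode();          // client opens the data connection
            ftp.setFileType(FTP.BINARY_FILE_TYPE);
            try (InputStream in = ftp.retrieveFileStream("/exports/big-file.csv")) {
                byte[] buf = new byte[64 * 1024];
                int n;
                while ((n = in.read(buf)) != -1) {
                    destination.write(buf, 0, n); // stream out instead of buffering in memory
                }
            }
            ftp.completePendingCommand();
            ftp.logout();
        } finally {
            ftp.disconnect();
        }
    }
}
```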