appengine - ftp alternatives

I have an App Engine app and I need to receive files from third parties.
The best option for me would be to receive the files via FTP, but I have read that this was not possible, at least as of a year ago.
Is it still not possible? How else could I receive the files?
This is very important to my project; in fact, it is indispensable.
Thanks a lot!

You need to use the Blobstore.
Edit: To post to the Blobstore in Java, the code fragment in this SO question should work (it was written for Android; elsewhere, use e.g. Apache HttpClient). The URL to post to must have been created with createUploadUrl. The simplest way to communicate it to the source server might be a GET URL, e.g. "/makeupload", which is text/plain and contains only the URL to POST to. To prevent unauthorized uploads, you can require a password either in the POST or already in the GET (e.g. as a query parameter).
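As a rough sketch of the App Engine side, assuming a servlet mapped to "/makeupload" and an upload handler mapped to "/upload" (both paths, and the shared-secret check, are placeholders of my own):

    import java.io.IOException;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;
    import com.google.appengine.api.blobstore.BlobstoreService;
    import com.google.appengine.api.blobstore.BlobstoreServiceFactory;

    public class MakeUploadServlet extends HttpServlet {
        private final BlobstoreService blobstore = BlobstoreServiceFactory.getBlobstoreService();

        @Override
        public void doGet(HttpServletRequest req, HttpServletResponse resp) throws IOException {
            // Hypothetical shared secret so strangers can't request upload URLs.
            if (!"change-me".equals(req.getParameter("key"))) {
                resp.sendError(HttpServletResponse.SC_FORBIDDEN);
                return;
            }
            // The third party POSTs its file to this URL; App Engine then calls
            // /upload with the blob key once the data is in the Blobstore.
            String uploadUrl = blobstore.createUploadUrl("/upload");
            resp.setContentType("text/plain");
            resp.getWriter().print(uploadUrl);
        }
    }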

The answer depends a lot on the size range of your imports. For small files the URL Fetch API will be sufficient.
I myself tend to import large CSV files ranging from 70–800 MB, in which case the legacy Blobstore and HTTP POST don't cut it. GAE cannot handle HTTP requests larger than 32 MB directly, nor can you upload static files larger than 32 MB for manual import.
Traditionally, I've used a *nix relay for downloading the data files, splitting them into well-formed JSON segments and then submitting maybe 10–30 thousand HTTP POST requests back to GAE. This used to be the only viable workaround, and for files larger than 1 GB it might still be the preferred method because it scales well (a complex import procedure is easily distributed across hundreds of F1 instances).
Luckily, as of April 9 this year (SDK 1.7.7), importing large files directly into GAE isn't much of a problem any longer. Outbound sockets are generally available to all billing-enabled apps, so you can easily solve the "large files" issue by opening an FTP connection and downloading.
Sockets API Overview (Python): https://developers.google.com/appengine/docs/python/sockets/
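The linked overview covers the Python runtime, but the same idea applies to outbound sockets on the Java runtime. A minimal sketch, assuming a billing-enabled app and Apache Commons Net on the classpath (host, credentials, and the remote path are placeholders):

    import java.io.InputStream;
    import org.apache.commons.net.ftp.FTP;
    import org.apache.commons.net.ftp.FTPClient;

    public class FtpImport {
        public static void importFile() throws Exception {
            FTPClient ftp = new FTPClient();
            try {
                ftp.connect("ftp.example.com");        // placeholder host
                ftp.login("user", "password");         // placeholder credentials
                ftp.enterLocalPassiveMode();           // passive mode: the app only opens outbound connections
                ftp.setFileType(FTP.BINARY_FILE_TYPE);
                try (InputStream in = ftp.retrieveFileStream("/exports/data.csv")) {
                    // Process the stream as it arrives (parse rows, write entities,
                    // push to Cloud Storage, ...) -- there is no writable local disk
                    // on App Engine to save it to first.
                }
                ftp.completePendingCommand();          // required after retrieveFileStream()
                ftp.logout();
            } finally {
                if (ftp.isConnected()) {
                    ftp.disconnect();
                }
            }
        }
    }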

Related

AWS S3 Upload speed?

I'm uploading my files from a project to an S3 bucket in AWS. This is my first time uploading a project with AWS, so I'm not sure if it usually takes this long, but it's saying it will take over 1 day.
I have also turned on Transfer Acceleration and turned off everything running in the background, which helped, but it still seems like a long wait.
Any advice would be really appreciated!
You are uploading a large number of files via a web browser. This would involve overhead for each file and is likely single-threaded.
I would recommend using the AWS Command-Line Interface (CLI) to upload the files. It can upload multiple files simultaneously to take advantage of your bandwidth.
Also, it has the aws s3 sync command, which can recover from failures by only copying files that have not yet been uploaded. (That is, you can run it multiple times.)
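For example (the local folder and bucket names here are only placeholders), a resumable bulk upload of a project folder would look something like:

    aws s3 sync ./my-project s3://my-bucket/my-project

Re-running the same command after an interruption only uploads the files that are missing or have changed.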

Request Entity Too Large (App Engine + Docker + Java)

I am aware that App Engine has a 32 MB request upload limit. I am wondering if that could be increased.
A lot of other research suggests that I need to use the Blobstore API directly; however, my application has a special requirement that prevents me from using it.
Other issues suggest that you can modify the nginx config in your custom flex environment. However, when I SSH'd into the instance, I did not see any nginx. I have reason to believe that it's the GAE load balancer blocking the request before it even reaches the application.
Here is my setup.
GAE Flex Environment
Custom Runtime, Java using Docker
Objective: I want to increase client_max_body_size to 100 MB.
As you can see here, this limit is stated in the official documentation. There is no way to increase it, as it is tied to the runtime environment itself. You could use the Go environment instead, which has a 64 MB limit.
This issue is discussed in other forums as well, but for now you just need to handle this kind of request programmatically: check whether it is bigger than 32 MB and, if it is, split it somehow and aggregate the results.
As a workaround, you can also store the data in Google Cloud Storage and use it as a temporary staging path for your workflow.
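One hedged sketch of that workaround, assuming the google-cloud-storage Java client and credentials that are allowed to sign URLs (e.g. a service account): the app hands the caller a short-lived signed URL, the caller PUTs the large file straight to the bucket (bypassing the 32 MB request limit), and the app reads it from GCS afterwards. The bucket name and expiry are placeholders.

    import java.net.URL;
    import java.util.concurrent.TimeUnit;
    import com.google.cloud.storage.BlobId;
    import com.google.cloud.storage.BlobInfo;
    import com.google.cloud.storage.HttpMethod;
    import com.google.cloud.storage.Storage;
    import com.google.cloud.storage.StorageOptions;

    public class SignedUploads {
        // Returns a URL the client can PUT the large payload to directly;
        // only this small response, not the 100 MB body, goes through the app.
        public static URL createUploadUrl(String objectName) {
            Storage storage = StorageOptions.getDefaultInstance().getService();
            BlobInfo blob = BlobInfo.newBuilder(BlobId.of("my-staging-bucket", objectName)).build();
            return storage.signUrl(blob, 15, TimeUnit.MINUTES,
                    Storage.SignUrlOption.httpMethod(HttpMethod.PUT),
                    Storage.SignUrlOption.withV4Signature());
        }
    }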

Azure Logic Apps large file support

A video update from the Azure Logic Apps team suggested that support for large files (exceeding the 100 MB limit) was due for release back in August 2017.
https://youtu.be/DSPNHLOVu_A?t=1514
But I haven't seen it mentioned in the documentation for the connectors.
How do I know which connectors support large files? And how do I make use of the feature? I'd guess it's different from the normal approach of putting the payload in the body of a message (as that's limited to 100 MB).
EDIT:
I struggled to find the release notes at first, but saw that they are actually in the Azure portal now rather than on a blog (which is quite cool).
Here's a deep link: https://ema.hosting.portal.azure.net/ema/1.30101.1.594429775.180105-1338/Html/iframereleasenotes.html?locale=en&trustedAuthority=https://portal.azure.com
I couldn't see it mentioned.
Also, this UserVoice ticket hasn't been closed yet:
https://feedback.azure.com/forums/287593-logic-apps/suggestions/17229566-increase-ftp-connector-limit-above-50mb
Either it hasn't been documented as released (or is difficult to find), or perhaps, although it was mentioned in the video, it didn't get released in the end?
Judging by this video:
https://youtu.be/qBD_RswoaPg?t=631
it sounds like the feature has shipped for the Blob Storage -> FTP connector and combinations thereof.
But the video didn't mention the HTTP connector (which I'd need in order to copy the file down first).
It's not mentioned in the release notes as far as I could tell.
Interestingly, however, the settings of the HTTP connector do include a chunking option.
But the address it links to didn't lead to a page discussing chunking, and enabling it doesn't allow me to exceed the 100 MB limit.
EDIT
From discussion in the MSDN forums, there's a suggestion that the HTTP connector not working for files above 100 MB could be a bug: https://social.msdn.microsoft.com/Forums/en-US/741529c7-a5ad-44e0-8839-497fe8548dee/chunked-transfer-for-http-action-not-working?forum=azurelogicapps

Common file system API for files in the cloud?

Our app is a sort of self-service website builder for a particular industry. We need to be able to store the HTML and image files for each customer's site so that users can easily access and edit them. I'd really like to be able to store the files on S3, but potentially other places like Box.net, Google Docs, Dropbox, and Rackspace Cloud Files.
It would be easiest if there were some common file system API that I could use over these repositories, but unfortunately everything is proprietary, so I've got to implement something. FTP or SFTP is the obvious choice, but it's a lot of work. WebDAV would also be a pain.
Our server-side code is Java.
Please someone give me a magic solution which is fast, easy, standards-based, and will solve all my problems perfectly without any effort on my part. Please?
Not sure if this is exactly what you're looking for, but we built http://mover.io to address this kind of thing. We currently support 13 different endpoints, and we have a GUI and an API for interfacing with all these cloud storage providers.

What is the best way to Optimise my Apache2/PHP5/MySQL Server for HTTP File Sharing?

I was wondering what optimisations I could make to my server to improve its performance at handling file uploads and downloads.
At the moment I am thinking Apache2 may not be the best HTTP server for this?
Any suggestions or optimisations I could make on my server?
My current setup is an Apache2 HTTP server with PHP handling the file uploads, which are currently stored in a folder outside the web root and randomly assigned a name that is stored in a MySQL database (along with other file/user information).
When a user wants to download a file, I use the header() function to force the download and readfile() to output the file contents.
You are correct that this is inefficient, but it's not Apache's fault: serving the files with PHP is going to be your bottleneck. You should look into X-Sendfile, which allows you to tell Apache (via a header inserted by PHP) which file to send (even if it's outside the document root).
The increase in speed will be more pronounced with larger files and heavier loads. Of course an even better way to increase speed is by using a CDN, but that's overkill for most of us.
Using X-Sendfile with Apache/PHP
http://www.jasny.net/articles/how-i-php-x-sendfile/
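A minimal sketch of what the download script might look like, assuming mod_xsendfile is installed and the Apache config has XSendFile On plus an XSendFilePath covering the storage folder (the paths and names below are placeholders):

    <?php
    // Look up the real path and original filename from MySQL as before, then let
    // Apache do the actual sending via mod_xsendfile instead of readfile().
    $path = '/var/files/4f2a9c0b';          // placeholder: random name outside the web root
    $originalName = 'report.pdf';           // placeholder: original name from the database

    header('Content-Type: application/octet-stream');
    header('Content-Disposition: attachment; filename="' . $originalName . '"');
    header('X-Sendfile: ' . $path);         // Apache streams the file; PHP is done here
    exit;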
As for increasing performance with uploads, I have no particular knowledge. In general, however, I believe each file upload will "block" one of your Apache workers for a long time, meaning Apache has to spawn more worker processes for other requests. With enough workers spawned, a server can slow down noticeably. You may want to look into Nginx, which is an event-based, rather than process-based, server. This may increase your throughput, but I admit I have never experimented with uploads under Nginx.
Note: Nginx uses X-Accel-Redirect instead of X-Sendfile.
http://wiki.nginx.org/XSendfile