What is the best way to Optimise my Apache2/PHP5/MySQL Server for HTTP File Sharing?

I was wondering what optimisations I could make to my server to improve its performance at handling file uploads and downloads.
At the moment I am thinking Apache2 may not be the best HTTP server for this?
Any suggestions or optimisations I could make on my server?
My current setup is an Apache2 HTTP server with PHP handling the file uploads, which are stored in a folder outside the web root and randomly assigned a name that is stored in a MySQL database (along with other file/user information).
When a user wants to download a file, I use the header() function to force the download and readfile() to output the file contents.
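For reference, a minimal sketch of that download path might look like this (variable names are illustrative):

    <?php
    // Hypothetical download script: look up the randomly assigned name in MySQL,
    // then force a download by sending headers and streaming the file with readfile().
    $path = '/var/uploads/' . $randomName;   // folder outside the web root
    header('Content-Type: application/octet-stream');
    header('Content-Disposition: attachment; filename="' . $originalName . '"');
    header('Content-Length: ' . filesize($path));
    readfile($path);                          // PHP reads the whole file and echoes it out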

You are correct that this is inefficient, but it's not Apache's fault. Serving the files with PHP is going to be your bottleneck. You should look into X-Sendfile, which allows you to tell Apache (via a header inserted by PHP) what file to send (even if it's outside the DocRoot).
The increase in speed will be more pronounced with larger files and heavier loads. Of course an even better way to increase speed is by using a CDN, but that's overkill for most of us.
Using X-Sendfile with Apache/PHP
http://www.jasny.net/articles/how-i-php-x-sendfile/
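A rough sketch of what that looks like with the mod_xsendfile Apache module (paths and variable names are placeholders):

    # Apache config: enable mod_xsendfile and whitelist the storage directory
    XSendFile on
    XSendFilePath /var/uploads

    <?php
    // PHP only authenticates and sets headers; Apache streams the file itself.
    header('Content-Type: application/octet-stream');
    header('Content-Disposition: attachment; filename="' . $originalName . '"');
    header('X-Sendfile: /var/uploads/' . $randomName);
    exit; // no readfile() call needed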
As for increasing performance with uploads, I have no particular knowledge. In general however, I believe each file upload would "block" one of your Apache workers for a long time, meaning Apache has to spawn more worker processes for other requests. With enough workers spawned, a server can slow noticeably. You may look into Nginx, which is an event-based, rather than process-based, server. This may increase your throughput, but I admit I have never experimented with uploads under Nginx.
Note: Nginx uses the X-Accel-Redirect header instead of X-Sendfile.
http://wiki.nginx.org/XSendfile
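For comparison, a minimal X-Accel-Redirect setup might look like this (location name and paths are made up):

    # nginx: internal location that only X-Accel-Redirect responses can reach
    location /protected/ {
        internal;
        alias /var/uploads/;
    }

    <?php
    // PHP hands nginx a URI (not a filesystem path) under the internal location.
    header('Content-Disposition: attachment; filename="' . $originalName . '"');
    header('X-Accel-Redirect: /protected/' . $randomName);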

Related

AWS S3 Upload speed?

I'm uploading files from a project to an S3 bucket in AWS. This is my first time uploading a project to AWS, so I'm not sure if it usually takes this long, but it's saying it will take over 1 day.
I have also turned on transfer acceleration and turned off everything running in the background, which helped, but it still seems like a long wait.
Any advice would be really appreciated!
You are uploading a large number of files via a web browser. This would involve overhead for each file and is likely single-threaded.
I would recommend using the AWS Command-Line Interface (CLI) to upload the files. It can upload multiple files simultaneously to take advantage of your bandwidth.
Also, it has the aws s3 sync command, which can recover from failures by only copying files that have not yet been uploaded. (That is, you can run it multiple times.)
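For example (bucket name and path are placeholders):

    # First full upload; re-run the same command to pick up anything that failed or changed
    aws s3 sync ./my-project s3://my-bucket/my-project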

GAE: What's faster, loading an include config file from GCS or from Cloud SQL?

Based on the subdomain that is accessing my application, I need to include a different configuration file that sets some variables used throughout the application (the file is included on every page). I'm in two minds about how to do this:
1) Include the file from GCS
2) Store the information in a table on Google Cloud SQL and query the database on every page through an included file.
Or am I better off using one of these options and then adding memcache on top?
I've been looking everywhere for which is the faster option (loading from GCS or selecting from Cloud SQL), but haven't been able to find anything.
NB: I don't want to have the files as normal PHP includes, as I don't want to have to redeploy the app every time I set up a new subdomain (different users get different subdomains); I would rather just update the database or upload a new config file to Cloud Storage, leaving the app alone.
I would say the sanest solution would be to store the configuration in Cloud SQL, as you can easily make changes to it even from within the app, and to use memcache, since it was built exactly for this kind of thing.
The problem with GCS is that you cannot simply edit a file; you have to delete it and upload a new version every time, which is not optimal in the long run.
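A rough sketch of that approach in PHP, assuming a Cloud SQL table keyed by subdomain (table, column, connection details and cache lifetime below are all assumptions):

    <?php
    // Hypothetical per-subdomain config loader: try memcache first, fall back to Cloud SQL.
    $subdomain = explode('.', $_SERVER['HTTP_HOST'])[0];
    $memcache  = new Memcache();                       // App Engine memcache via the standard API
    $config    = $memcache->get('config:' . $subdomain);

    if ($config === false) {
        $db = new mysqli(null, 'user', 'password', 'appdb', null, '/cloudsql/my-project:my-instance');
        $stmt = $db->prepare('SELECT settings_json FROM site_config WHERE subdomain = ?');
        $stmt->bind_param('s', $subdomain);
        $stmt->execute();
        $row = $stmt->get_result()->fetch_assoc();
        $config = json_decode($row['settings_json'], true);
        $memcache->set('config:' . $subdomain, $config, 0, 300);  // cache for 5 minutes
    }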
GCS is cheaper, although for small text files it does not matter much. Otherwise, I don't see much of a difference.

Pass file to and from JBossAS to client

I've got a client-server application, with JBoss AS 7 and a client which uses a remote EJB provided by the server. I have to pass a file from the client to the server, where it will be further processed via an InputStream. I also have to pass a file from the server to the client, where on the server side I get an OutputStream. File size is not limited; it might be as large as 5 GB. How can I implement this? Passing a byte[] array does not seem to be a good solution, since RMI limits the size from what I've read. RMIIO is GPL (I need a solution that is free for commercial use). Is HTTP transfer the only reasonable way to do this?
edit: it seems that RMIIO was always LGPL!
You might consider setting up a Netty server running on JBoss AS, as shown in this Netty tutorial, and passing the data over bare sockets.
Another option is HTTP, by means of a simple HTTP transfer using a Servlet, for example.
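For example, the HTTP route could be a plain servlet that streams the request body straight to disk, so a 5 GB file never has to fit in memory (class name and URL below are made up):

    import java.io.IOException;
    import java.io.InputStream;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.StandardCopyOption;
    import javax.servlet.annotation.WebServlet;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    // Hypothetical upload endpoint: the client POSTs the raw file, and the servlet
    // copies the stream to disk without buffering the whole file in memory.
    @WebServlet("/files/upload")
    public class FileUploadServlet extends HttpServlet {
        @Override
        protected void doPost(HttpServletRequest req, HttpServletResponse resp) throws IOException {
            Path target = Files.createTempFile("upload-", ".dat");  // hand off for further processing
            try (InputStream in = req.getInputStream()) {
                Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
            }
            resp.setStatus(HttpServletResponse.SC_OK);
        }
    }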
I'd exclude EJBs, since they are transactional components; even assuming you managed to pass this data through RMI-IIOP, you would still have to set up a huge transaction timeout.
Hope it helps.
RMIIO is LGPL (different from just GPL), which is free for commercial use and is not viral (assuming you have not modified the rmiio library).

Sencha too slow

I introduced a Sencha grid in one of my JSPs. Locally, Sencha is quite fast, but on an external server it is too slow.
I followed the deployment instructions here:
http://docs.sencha.com/ext-js/4-0/#!/guide/getting_started
using ext-debug.js and my app.js.
Then, in my JSP, I imported app-all.js (670KB) and ext.js
Where am I going wrong?
Thanks
app-all.js is 670 KB, which is a very big file. You should refactor, optimize and minify the code so it becomes smaller. You could even separate it into multiple files per class or implementation and load them dynamically (but that would take more time). A good target would be something as small as ext.js.
Also, if you have access to your web server (e.g. Apache/Tomcat), you could turn on gzip compression to compress files before sending them to browsers. Look out for other web server optimizations as well.
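For example, with Apache's mod_deflate (adjust the MIME types to whatever you actually serve):

    <IfModule mod_deflate.c>
        # Compress text-based assets such as the Ext JS and app bundles before sending them
        AddOutputFilterByType DEFLATE text/html text/css application/javascript application/json
    </IfModule>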
(BTW, your question sounds more like a web server issue than a Sencha-related issue.)
Another way to improve the load time of your application is making sure ext.js and app-all.js are cached by the browser. This way the first time your application loads it will be slow, but the following loads will be faster.
Look into Cache-Control, Expires and the other HTTP cache-controlling headers (this appears to be a nice explanation). Your server should send these headers with the files you want to be cached.
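For example, with Apache's mod_expires (the one-week lifetime is just an illustration):

    <IfModule mod_expires.c>
        ExpiresActive On
        # Let browsers reuse the big JS bundles instead of re-downloading them on every visit
        ExpiresByType application/javascript "access plus 1 week"
    </IfModule>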
The real problem, as it appears from the timeline, is the slow connection to the server (10 seconds loading 206/665 KB is slow for most connections), so you should see if there are no other server problems causing the slowness.

appengine - ftp alternatives

I have an App Engine app and I need to receive files from third parties.
The best option for me would be to receive the files via FTP, but I have read that this was not possible, at least as of a year ago.
Is it still not possible? How else could I receive the files?
This is very important to my project, in fact it is indispensable.
Thx a lot!!!!
You need to use the Blobstore.
Edit: To post to the Blobstore in Java, the code fragment in this SO question should work (it was written for Android; elsewhere, use e.g. Apache HttpClient). The URL to post to must have been created with createUploadUrl. The simplest way to communicate it to the source server might be a GET URL, e.g. "/makeupload", which is text/plain and contains only the URL to POST to. To prevent unauthorized uploads, you can require a password either in the POST, or already in the GET (e.g. as a query parameter).
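A sketch of that "/makeupload" handler in Java (the servlet mapping, success path and password check are placeholders):

    import java.io.IOException;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;
    import com.google.appengine.api.blobstore.BlobstoreService;
    import com.google.appengine.api.blobstore.BlobstoreServiceFactory;

    // Hypothetical handler (mapped to /makeupload in web.xml): hands out a one-time
    // Blobstore upload URL as plain text, so the third-party server knows where to POST.
    public class MakeUploadServlet extends HttpServlet {
        private final BlobstoreService blobstore = BlobstoreServiceFactory.getBlobstoreService();

        @Override
        protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws IOException {
            if (!"s3cret".equals(req.getParameter("key"))) {   // crude shared-password check
                resp.sendError(HttpServletResponse.SC_FORBIDDEN);
                return;
            }
            String uploadUrl = blobstore.createUploadUrl("/uploadcomplete");
            resp.setContentType("text/plain");
            resp.getWriter().print(uploadUrl);
        }
    }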
The answer depends a lot on the size range of your imports. For small files, the URL Fetch API will be sufficient.
I myself tend to import large CSV files ranging from 70–800 MB, in which case the legacy Blobstore and HTTP POST don't cut it: GAE cannot handle HTTP requests larger than 32 MB directly, nor can you upload static files larger than 32 MB for manual import.
Traditionally, I've used a *nix relay for downloading the data files, splitting them into well-formed JSON segments and then submitting maybe 10-30 thousand HTTP POST requests back to GAE. This used to be the only viable workaround, and for files over 1 GB it might still be the preferred method due to scaling performance (complex import procedures are easily distributed across hundreds of F1 instances).
Luckily, as of April 9 this year (SDK 1.7.7), importing large files directly into GAE isn't much of a problem any longer. Outbound sockets are generally available to all billing-enabled apps, so you can easily solve the "large files" issue by opening an FTP connection and downloading.
Sockets API Overview (Python): https://developers.google.com/appengine/docs/python/sockets/
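Under those assumptions, a sketch of the FTP pull using the standard ftplib module (host, credentials and path are placeholders):

    import ftplib
    import io

    def fetch_via_ftp(host, user, password, remote_path):
        # Download the remote file into memory; for really large files the callback
        # could instead stream chunks into Cloud Storage or a processing pipeline.
        buf = io.BytesIO()
        ftp = ftplib.FTP(host)
        ftp.login(user, password)
        ftp.retrbinary('RETR ' + remote_path, buf.write)
        ftp.quit()
        return buf.getvalue()

    data = fetch_via_ftp('ftp.example.com', 'user', 'secret', '/exports/data.csv')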
