I have the following requirement:
A large text file of around 10 MB to 25 MB (with 50,000 to 100,000 lines of data) is uploaded to the web application. I have to validate the file line by line, write the output to another location, and then display a message to the user.
The app server is WebLogic, and it is accessed through a web server via the Apache Bridge. The Apache Bridge times out pretty quickly during the upload + processing activity. Is there any way to solve this issue without changing the Apache Bridge timeout?
What is the best possible solution? Below are my current thoughts.
Solution 1: Upload the file and return to the page immediately. Then trigger an Ajax request to run the validation in a separate thread and check its status through further Ajax requests (see the sketch after these two options).
Solution 2: Use the SC_PARTIAL_CONTENT (206) HTTP status code to keep the connection alive.
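Roughly, Solution 1 would look like the client-side sketch below (the endpoint names, job-id scheme, and response shape are placeholders of mine, not an existing API):

    // Minimal sketch of Solution 1: the upload returns quickly, validation runs
    // server-side in a separate thread, and the browser polls for its status.
    async function uploadAndValidate(file: File): Promise<void> {
      const form = new FormData();
      form.append("file", file);

      // 1. Upload only; the server stores the file and returns a job id.
      const uploadRes = await fetch("/upload", { method: "POST", body: form });
      const { jobId } = await uploadRes.json();

      // 2. Kick off validation asynchronously (server spawns a worker thread).
      await fetch(`/validate/start/${jobId}`, { method: "POST" });

      // 3. Poll the status endpoint until the job finishes.
      const poll = setInterval(async () => {
        const statusRes = await fetch(`/validate/status/${jobId}`);
        const { state, message } = await statusRes.json();
        if (state === "DONE" || state === "FAILED") {
          clearInterval(poll);
          alert(message); // display the result to the user
        }
      }, 2000);
    }

Because each individual request finishes quickly, none of them should run into the Apache Bridge timeout.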
Related
Currently I am doing a synchronous call to MuleSoft, which returns a raw image (no encoding is done), and then storing the image in a document. So whenever we get bigger images, more than 6 MB, it hits the governor limit for maximum size. I wanted to know whether there is a way to get a reduced or compressed image.
I have no idea if Mule has anything to preprocess or compress images...
In Apex you could try to make the operation asynchronous to benefit from the 22 MB limit. But there will be no UI element for it anymore; your component / user would have to periodically check if the file got saved, or something along those lines.
You could always change the direction: make Mule push to Salesforce over the standard API instead of Apex code pulling from Mule. From what I remember, the standard files API is good for up to 2 GB.
Maybe send some notification to Mule that you want file XYZ attached to account 123; Mule would insert the ContentVersion and ContentDocumentLink (see the sketch below), and Apex would periodically check.
And when the file is no longer needed, a nightly job could delete files created by "Mr Mule" more than a week ago?
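The "push from Mule" idea boils down to a few calls against the standard Salesforce REST API. A rough sketch (shown in TypeScript only for illustration; the instance URL, API version, token handling, and record ids are placeholders):

    // Sketch: insert a file as a ContentVersion and link it to an account.
    async function pushFileToSalesforce(
      instanceUrl: string,
      accessToken: string,
      accountId: string,
      fileName: string,
      fileBase64: string
    ): Promise<void> {
      const headers = {
        Authorization: `Bearer ${accessToken}`,
        "Content-Type": "application/json",
      };

      // 1. Insert the file body as a ContentVersion.
      const cvRes = await fetch(
        `${instanceUrl}/services/data/v52.0/sobjects/ContentVersion`,
        {
          method: "POST",
          headers,
          body: JSON.stringify({
            Title: fileName,
            PathOnClient: fileName,
            VersionData: fileBase64, // base64-encoded file content
          }),
        }
      );
      const { id: contentVersionId } = await cvRes.json();

      // 2. Look up the ContentDocumentId the new version belongs to.
      const soql = encodeURIComponent(
        `SELECT ContentDocumentId FROM ContentVersion WHERE Id = '${contentVersionId}'`
      );
      const queryRes = await fetch(
        `${instanceUrl}/services/data/v52.0/query?q=${soql}`,
        { headers }
      );
      const contentDocumentId = (await queryRes.json()).records[0].ContentDocumentId;

      // 3. Link the document to the account so it shows up on the record.
      await fetch(
        `${instanceUrl}/services/data/v52.0/sobjects/ContentDocumentLink`,
        {
          method: "POST",
          headers,
          body: JSON.stringify({
            ContentDocumentId: contentDocumentId,
            LinkedEntityId: accountId,
            ShareType: "V",
          }),
        }
      );
    }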
I am trying to create a site that e-learning courses (zips of HTML/CSS/JS/media) can be uploaded to.
I am using Go on Google App Engine with Google Cloud Storage to store the zips and the extracted courses.
I will explain the development dead ends I have encountered.
My first thought was to use the resumable upload functionality of Cloud Storage to send the zip file, then read it using Go on App Engine, unzip the files and write them back to Cloud Storage.
It took a while to read and understand the documentation, but this worked perfectly for my 2 MB test zip. It failed when I tried it with a modest 67 MB zip: I had encountered a hidden limitation when accessing Cloud Storage from App Engine. No matter which client I used, there was a 10 MB/32 MB limit.
I tried both the old and new client libraries, as well as the Blobstore.
I also looked into creating a custom OAuth2-supporting client library using sockets, but hit too many dead ends.
Giving up on that approach, I thought that even though it would mean more uploading, perhaps just extracting on the client (browser) side and then uploading each file with its own resumable upload would make the most sense. After exploring a few libraries I had in-browser extraction working, ready to upload.
I wrote my handler, which created the Datastore entry for the upload, selected a location for the upload, and created all the upload URLs.
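(For reference, each of those upload URLs is a resumable upload session: a single authorized POST against the Cloud Storage JSON API that returns the session URI in the Location header. Sketched below in TypeScript rather than in my Go handler; the bucket and token handling are placeholders.)

    // Sketch: initiate one resumable upload session per file (GCS JSON API).
    async function createResumableSessionUrl(
      bucket: string,
      objectName: string,
      contentType: string,
      accessToken: string
    ): Promise<string> {
      const res = await fetch(
        `https://storage.googleapis.com/upload/storage/v1/b/${bucket}/o` +
          `?uploadType=resumable&name=${encodeURIComponent(objectName)}`,
        {
          method: "POST",
          headers: {
            Authorization: `Bearer ${accessToken}`,
            "X-Upload-Content-Type": contentType,
          },
        }
      );
      if (!res.ok) throw new Error(`Session init failed: ${res.status}`);
      // The client later PUTs the file bytes (optionally in chunks) to this URI.
      return res.headers.get("Location") as string;
    }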
When testing this I found that it would take a while to work through generating the long lists of files (anything over 100). I decided that, since I was using Go, it would make sense to make the requests concurrently. I spent a day or two getting that working. After dealing with some CORS issues that weirdly had not shown up earlier, I had everything working.
Then I started getting errors when stress testing my approach with a large (500 MB) zip/course. The uploads would fail, and I discovered that when trying to send 300+ files to generate upload URLs, I was getting the following error:
Post http://localhost:62394: dial tcp [::1]:62394: connectex: No connection could be made because the target machine actively refused it.
Now I have no idea how to diagnose this. I don't know if I am hitting a rate limit, and if I am, I don't know how to avoid it.
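(One mitigation I could try, sketched below, is capping how many URL-generation requests are in flight at once instead of firing all 300+ simultaneously; the endpoint name and payload shape are placeholders.)

    // Sketch: request upload URLs in small batches so the target machine is
    // not flooded with simultaneous connections.
    async function requestUploadUrls(files: string[], batchSize = 10): Promise<string[]> {
      const urls: string[] = [];
      for (let i = 0; i < files.length; i += batchSize) {
        const batch = files.slice(i, i + batchSize);
        // Only `batchSize` requests are in flight at any time.
        const results = await Promise.all(
          batch.map(async (name) => {
            const res = await fetch("/generate-upload-url", {
              method: "POST",
              headers: { "Content-Type": "application/json" },
              body: JSON.stringify({ name }),
            });
            return (await res.json()).uploadUrl as string;
          })
        );
        urls.push(...results);
      }
      return urls;
    }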
It seems like creating this should be simple, but it is anything but.
I have a few options I can pursue:
Try to create the resumable uploads with a batch operation (https://cloud.google.com/storage/docs/json_api/v1/how-tos/batch).
However, batch operations to /upload are not supported.
Maybe make requesting each URL a separate, one-by-one API call.
Make requesting the URLs happen over a channel (https://cloud.google.com/appengine/docs/go/channel/reference).
Spend the next week or more adding layers of retries and fallback error handling.
Try another solution.
This should be simple. How should this be done?
I'm writing a single-page web app (AngularJS) and a server back end (Node.js). The communication between them is done via REST.
Currently I'm trying to implement the following scenario:
Upload big files from the browser to a public S3 bucket.
Copy the uploaded file to a private S3 bucket.
Transcode the uploaded file to an HTML5-compatible format (AWS Elastic Transcoder).
Store a meta-object about the file in the DB so it can be accessed later.
I'm racking my brains trying to come up with a good design for the communication/data workflow between server and client, but I always get stuck on the following questions:
Should the file meta-object be stored at the end or at the beginning of the process? If at the beginning, do I have to store and handle some state information?
Who should start copying uploaded files to the private bucket, the server or the client? If it is the server, how can the client be informed that the job succeeded?
Who starts the transcoding process? If it is the server, how can the client be informed that the job succeeded?
How would you do this?
There is a pretty good tutorial that describes the use case you are planning to implement: http://www.bitcodin.com/blog/2015/02/create-mpeg-dash-hls-content-for-amazon-s3-and-cloudfront/
If your transcoding system has a RESTful API (like bitcodin, which is used in this tutorial, or any other service), you can also build your application client-side and use the API calls to get the state of your transcodings, etc. However, you can do the same server-side using the API, whichever fits better for you.
I personally would store the metadata at the beginning of the process, as this is the point in time where you generate the "asset" in your database/CMS/etc.
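A rough server-side sketch of that flow (Node.js with the aws-sdk package; the bucket names, pipeline/preset ids, and the db helper are placeholders, not code from the tutorial):

    import * as AWS from "aws-sdk";

    const s3 = new AWS.S3();
    const transcoder = new AWS.ElasticTranscoder();

    // Placeholder persistence layer; swap in your real database access code.
    declare const db: {
      assets: {
        insert(doc: object): Promise<string>;
        update(id: string, patch: object): Promise<void>;
      };
    };

    async function handleUploadedFile(key: string): Promise<string> {
      // 1. Create the meta-object up front, with an explicit state field.
      const assetId = await db.assets.insert({ key, state: "UPLOADED" });

      // 2. Copy from the public upload bucket into the private bucket.
      await s3
        .copyObject({
          Bucket: "my-private-bucket",
          CopySource: `my-public-bucket/${key}`,
          Key: key,
        })
        .promise();
      await db.assets.update(assetId, { state: "COPIED" });

      // 3. Kick off the Elastic Transcoder job.
      const job = await transcoder
        .createJob({
          PipelineId: "my-pipeline-id",
          Input: { Key: key },
          Outputs: [{ Key: `${key}.mp4`, PresetId: "my-web-preset-id" }],
        })
        .promise();
      await db.assets.update(assetId, { state: "TRANSCODING", jobId: job.Job?.Id });

      // The client polls GET /assets/:id and reads `state`; a transcoder
      // completion notification (e.g. via SNS) would later flip it to "READY".
      return assetId;
    }

This way the client never has to orchestrate the copy or the transcoding; it only uploads and then polls (or is pushed) the asset state.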
I am working on a web application whose frontend is developed in AngularJS and Spring MVC and which consumes data from a RESTful web service.
There is a scenario wherein the REST web service executes a tail command on a log file.
Now this output should be streamed to the UI. Any pointers on this would be helpful.
Solution 1
Maybe you want to take a look at WebSockets. The idea is that you open a continuous connection between server and client for information exchange. This could be used for receiving log file updates from the server.
The scenario could be something like this:
The user enters the log view page and thereby subscribes to log file updates.
In the server-side code, where the tail command is executed, an update is sent to all subscribers.
The user receives the new log content.
-> Spring Websockets <-
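A browser-side sketch of the subscription step, assuming the backend exposes a STOMP-over-SockJS endpoint at /ws and broadcasts tail output to /topic/logs (both names are placeholders):

    import SockJS from "sockjs-client";
    import { Client } from "@stomp/stompjs";

    // Open the continuous connection and subscribe once; every time the
    // server-side tail produces new lines, they arrive here.
    const client = new Client({
      webSocketFactory: () => new SockJS("/ws"),
      onConnect: () => {
        client.subscribe("/topic/logs", (message) => {
          appendToLogView(message.body);
        });
      },
    });
    client.activate();

    function appendToLogView(lines: string): void {
      const pre = document.getElementById("log");
      if (pre) pre.textContent += lines + "\n";
    }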
Solution 2
Another solution, plain polling, would be to use a JavaScript timer function to repeatedly request your log file. Something like this:
setInterval(function(){ queryLogFile() }, 1000);
However, this will result in a high number of requests, so maybe you should use some kind of caching mechanism for your log file.
Will the response timer on Google App Engine start upon submitting the web page's form?
If I'm going to upload a file that is greater than 1 MB, I could split the file into 1 MB chunks to fit within the Google App Engine Datastore limit. Now, my concern is that if the client's internet connection is slow, it would eat up the 30-second timer, right? If this is the case, is it impossible to upload large files over a slow connection?
The 30-second response time limit only applies to code execution, so the uploading of the actual file as part of the request body is excluded from that. The timer only starts once the request has been fully sent to the server by the client and your code starts handling the submitted request. Hence it doesn't matter how slow your client's connection is.
As a side note, instead of splitting your file into multiple parts, try using the Blobstore. I am using it for images and it raises the storage limit to 50 MB. (Remember to enable billing to get access to the Blobstore.)