Google cloud storage using stream instead of bytebuffer - java - google-app-engine

I'm using the following code:
GcsService gcsService = GcsServiceFactory.createGcsService();
GcsFilename filename = new GcsFilename(BUCKETNAME, fileName);
GcsFileOptions options = new GcsFileOptions.Builder()
    .mimeType(contentType)
    .acl("public-read")
    .addUserMetadata("myfield1", "my field value")
    .build();
@SuppressWarnings("resource")
GcsOutputChannel outputChannel = gcsService.createOrReplace(filename, options);
outputChannel.write(ByteBuffer.wrap(byteArray));
outputChannel.close();
The problem is that when I store video files this way, I have to hold the whole file in byteArray, which could cause memory issues. But I cannot find any interface that does the same with a stream.
Questions:
Should I worry about memory issues on the App Engine server, or can an instance keep a one-minute video in memory?
Is it possible to use a stream instead of a byte array? How?
I'm reading the bytes as byte[] byteArray = IOUtils.toByteArray(stream); should I use the byte array as a real buffer, read it in chunks, and upload them to GCS? How do I do that?

The amount of memory available depends on the App Engine instance type you've configured. Streaming this data seems like a good idea if you can.
I'm not sure about the GcsService API, but it looks like you can do this using the gcloud Storage API:
https://github.com/GoogleCloudPlatform/gcloud-java/blob/master/gcloud-java-storage/src/main/java/com/google/cloud/storage/Storage.java
This code might work (untested)...
final BlobInfo info = BlobInfo.builder(bucket.getBucketName(), "name").contentType("image/png").build();
final ReadableByteChannel src = Channels.newChannel(stream);
final WriteChannel dst = gcsStorage.writer(info);
fastChannelCopy(src, dst);

private void fastChannelCopy(final ReadableByteChannel src, final WritableByteChannel dest) throws IOException {
    final ByteBuffer buffer = ByteBuffer.allocateDirect(16 * 1024);
    while (src.read(buffer) != -1) {
        buffer.flip();      // prepare the buffer to be drained
        dest.write(buffer); // write to the channel, may block
        // If partial transfer, shift remainder down
        // If buffer is empty, same as doing clear()
        buffer.compact();
    }
    // EOF will leave buffer in fill state
    buffer.flip();
    // make sure the buffer is fully drained
    while (buffer.hasRemaining()) {
        dest.write(buffer);
    }
}
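As an aside, the GcsService API from the question can also be fed incrementally: GcsOutputChannel is a WritableByteChannel, so it can be wrapped in an OutputStream and filled chunk by chunk. An untested sketch, reusing the filename, options and stream variables from the question:

void streamToGcs(GcsService gcsService, GcsFilename filename, GcsFileOptions options, InputStream stream) throws IOException {
    GcsOutputChannel outputChannel = gcsService.createOrReplace(filename, options);
    // Copy the InputStream to GCS in small chunks instead of one big byte array.
    try (OutputStream out = Channels.newOutputStream(outputChannel)) {
        byte[] chunk = new byte[16 * 1024];
        int read;
        while ((read = stream.read(chunk)) != -1) {
            out.write(chunk, 0, read);
        }
    }
    // Closing the stream closes the channel, which finalizes the object in GCS.
}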

Related

Codename One API to append / merge files

To merge Storage files in Codename One I came up with this solution:
/**
 * Merges the given list of Storage files into the output Storage file.
 * @param toBeMerged
 * @param output
 * @throws IOException
 */
public static synchronized void mergeStorageFiles(List<String> toBeMerged, String output) throws IOException {
    if (toBeMerged.contains(output)) {
        throw new IllegalArgumentException("The output file cannot be contained in the toBeMerged list of input files.");
    }
    // Note: the temporary file used for merging is placed in the FileSystemStorage because it offers the method
    // openOutputStream(String file, int offset) that allows appending to a stream. Storage doesn't have such a method.
    long writtenBytes = 0;
    String tempFile = FileSystemStorage.getInstance().getAppHomePath() + "/tempFileUsedInMerge";
    for (String partialFile : toBeMerged) {
        InputStream in = Storage.getInstance().createInputStream(partialFile);
        OutputStream out = FileSystemStorage.getInstance().openOutputStream(tempFile, (int) writtenBytes);
        Util.copy(in, out);
        writtenBytes = FileSystemStorage.getInstance().getLength(tempFile);
    }
    Util.copy(FileSystemStorage.getInstance().openInputStream(tempFile), Storage.getInstance().createOutputStream(output));
    FileSystemStorage.getInstance().delete(tempFile);
}
This solution is based on the API FileSystemStorage.openOutputStream(String file, int offset), which is the only API I found that allows appending the content of one file to another.
Are there other APIs that can be used to append or merge files?
Thank you
Since you end up copying everything to a Storage entry, I don't see the value of using FileSystemStorage as an intermediate merging tool.
The only reason I can think of is integrity of the output file (e.g. if a failure happens while writing), but that can happen here too. You can guarantee integrity by setting a flag, e.g. creating a file called "writeLock" and deleting it when the write has finished successfully.
To be clear, I would copy like this, which is simpler/faster:
try (OutputStream out = Storage.getInstance().createOutputStream(output)) {
    for (String partialFile : toBeMerged) {
        try (InputStream in = Storage.getInstance().createInputStream(partialFile)) {
            Util.copyNoClose(in, out, 8192);
        }
    }
}
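A rough illustration of the "writeLock" flag idea mentioned above (just a sketch; it assumes Storage's writeObject, exists and deleteStorageFile methods, and the lock file name is arbitrary):

// Create a marker before writing; if the app restarts and the marker is still
// present, the output file should be treated as incomplete and discarded.
Storage.getInstance().writeObject("writeLock", Boolean.TRUE);
try (OutputStream out = Storage.getInstance().createOutputStream(output)) {
    for (String partialFile : toBeMerged) {
        try (InputStream in = Storage.getInstance().createInputStream(partialFile)) {
            Util.copyNoClose(in, out, 8192);
        }
    }
}
// Only remove the marker once the merge completed successfully.
Storage.getInstance().deleteStorageFile("writeLock");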

read cloud storage content with "gzip" encoding for "application/octet-stream" type content

We're using the "Google Cloud Storage Client Library" for App Engine, simply setting GcsFileOptions.Builder.contentEncoding("gzip") at file creation time, and we get the following problem when reading the file:
com.google.appengine.tools.cloudstorage.NonRetriableException: java.lang.RuntimeException: com.google.appengine.tools.cloudstorage.SimpleGcsInputChannelImpl$1@1c07d21: Unexpected cause of ExecutionException
    at com.google.appengine.tools.cloudstorage.RetryHelper.doRetry(RetryHelper.java:87)
    at com.google.appengine.tools.cloudstorage.RetryHelper.runWithRetries(RetryHelper.java:129)
    at com.google.appengine.tools.cloudstorage.RetryHelper.runWithRetries(RetryHelper.java:123)
    at com.google.appengine.tools.cloudstorage.SimpleGcsInputChannelImpl.read(SimpleGcsInputChannelImpl.java:81)
    ...
Caused by: java.lang.RuntimeException: com.google.appengine.tools.cloudstorage.SimpleGcsInputChannelImpl$1@1c07d21: Unexpected cause of ExecutionException
    at com.google.appengine.tools.cloudstorage.SimpleGcsInputChannelImpl$1.call(SimpleGcsInputChannelImpl.java:101)
    at com.google.appengine.tools.cloudstorage.SimpleGcsInputChannelImpl$1.call(SimpleGcsInputChannelImpl.java:81)
    at com.google.appengine.tools.cloudstorage.RetryHelper.doRetry(RetryHelper.java:75)
    ... 56 more
Caused by: java.lang.IllegalStateException: com.google.appengine.tools.cloudstorage.oauth.OauthRawGcsService$2@1d8c25d: got 46483 > wanted 19823
    at com.google.common.base.Preconditions.checkState(Preconditions.java:177)
    at com.google.appengine.tools.cloudstorage.oauth.OauthRawGcsService$2.wrap(OauthRawGcsService.java:418)
    at com.google.appengine.tools.cloudstorage.oauth.OauthRawGcsService$2.wrap(OauthRawGcsService.java:398)
    at com.google.appengine.api.utils.FutureWrapper.wrapAndCache(FutureWrapper.java:53)
    at com.google.appengine.api.utils.FutureWrapper.get(FutureWrapper.java:90)
    at com.google.appengine.tools.cloudstorage.SimpleGcsInputChannelImpl$1.call(SimpleGcsInputChannelImpl.java:86)
    ... 58 more
What else needs to be done to read files with "gzip" content encoding in App Engine? (Fetching the Cloud Storage URL with curl from the client side works fine for both compressed and uncompressed files.)
This is the code that works for an uncompressed object:
byte[] blobContent = new byte[0];
try
{
    GcsFileMetadata metaData = gcsService.getMetadata(fileName);
    int fileSize = (int) metaData.getLength();
    final int chunkSize = BlobstoreService.MAX_BLOB_FETCH_SIZE;
    LOG.info("content encoding: " + metaData.getOptions().getContentEncoding()); // "gzip" here
    LOG.info("input size " + fileSize); // the size is obviously the compressed size!
    for (long offset = 0; offset < fileSize;)
    {
        if (offset != 0)
        {
            LOG.info("Handling extra size for " + filePath + " at " + offset);
        }
        final int size = Math.min(chunkSize, fileSize);
        ByteBuffer result = ByteBuffer.allocate(size);
        GcsInputChannel readChannel = gcsService.openReadChannel(fileName, offset);
        try
        {
            readChannel.read(result); // <<<< here the exception was thrown
        }
        finally
        {
            ......
It is now compressed by:
GcsFilename filename = new GcsFilename(bucketName, filePath);
GcsFileOptions.Builder builder = new GcsFileOptions.Builder().mimeType(image_type);
builder = builder.contentEncoding("gzip");
GcsOutputChannel writeChannel = gcsService.createOrReplace(filename, builder.build());
ByteArrayOutputStream byteStream = new ByteArrayOutputStream(blob_content.length);
try
{
    GZIPOutputStream zipStream = new GZIPOutputStream(byteStream);
    try
    {
        zipStream.write(blob_content);
    }
    finally
    {
        zipStream.close();
    }
}
finally
{
    byteStream.close();
}
byte[] compressedData = byteStream.toByteArray();
writeChannel.write(ByteBuffer.wrap(compressedData));
The blob_content is compressed from 46483 bytes to 19823 bytes.
I think it is a bug in the Google code:
https://code.google.com/p/appengine-gcs-client/source/browse/trunk/java/src/main/java/com/google/appengine/tools/cloudstorage/oauth/OauthRawGcsService.java, L418:
Preconditions.checkState(content.length <= want, "%s: got %s > wanted %s", this, content.length, want);
The HTTPResponse has already decoded the blob, so the Precondition is wrong here.
If I understand correctly, you have to set the mimeType:
GcsFileOptions options = new GcsFileOptions.Builder().mimeType("text/html").build();
Google Cloud Storage does not compress or decompress objects:
https://developers.google.com/storage/docs/reference-headers?csw=1#contentencoding
I hope that's what you want to do.
Looking at your code, it seems like there is a mismatch between what is stored and what is read. The documentation specifies that compression is not done for you (https://developers.google.com/storage/docs/reference-headers?csw=1#contentencoding). You will need to do the actual compression manually.
Also, if you look at the implementation of the class that throws the exception (https://code.google.com/p/appengine-gcs-client/source/browse/trunk/java/src/main/java/com/google/appengine/tools/cloudstorage/oauth/OauthRawGcsService.java?r=81&spec=svn134) you will notice that you get the original contents back, but you're actually expecting compressed content. Check the method readObjectAsync in the above-mentioned class.
It looks like the content persisted might not be gzipped, or the content-length is not set properly. What you should do is verify the length of the compressed stream just before writing it into the channel. You should also verify that the content length is set correctly when doing the HTTP request. It would be useful to see the actual HTTP request headers and make sure that the content-length header matches the actual content length in the HTTP response.
Also, it looks like contentEncoding could be set incorrectly. Try using .contentEncoding("Content-Encoding: gzip") as used in this TCK test. Still, the best thing to do is inspect the HTTP request and response; you can use Wireshark to do that easily.
Also, you need to make sure that the GcsOutputChannel is closed, as that's when the file is finalized.
Hope this puts you on the right track. To gzip your contents you can use Java's GZIPOutputStream, and GZIPInputStream to decompress.
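For reference, a minimal sketch of decompressing gzip-compressed bytes once you have them back, using plain java.util.zip (not tied to the GCS client; gzippedBytes is assumed to hold the compressed content):

byte[] decompress(byte[] gzippedBytes) throws IOException {
    // Wrap the compressed bytes in a GZIPInputStream and drain it into a buffer.
    try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(gzippedBytes));
         ByteArrayOutputStream out = new ByteArrayOutputStream()) {
        byte[] buffer = new byte[8192];
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
        }
        return out.toByteArray();
    }
}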
I'm seeing the same issue, easily reproducible by uploading a file with "gsutil cp -Z" and then trying to open it with the following:
ByteArrayOutputStream output = new ByteArrayOutputStream();
try (GcsInputChannel readChannel = svc.openReadChannel(filename, 0)) {
    try (InputStream input = Channels.newInputStream(readChannel)) {
        IOUtils.copy(input, output);
    }
}
This causes an exception like this:
java.lang.IllegalStateException:
    ....oauth.OauthRawGcsService$2@1883798: got 64303 > wanted 4096
    at ....Preconditions.checkState(Preconditions.java:199)
    at ....oauth.OauthRawGcsService$2.wrap(OauthRawGcsService.java:519)
    at ....oauth.OauthRawGcsService$2.wrap(OauthRawGcsService.java:499)
The only workaround I've found is to read the entire file into memory using readChannel.read:
int fileSize = 64303;
ByteBuffer result = ByteBuffer.allocate(fileSize);
try (GcsInputChannel readChannel = gcs.openReadChannel(new GcsFilename("mybucket", "mygzippedfile.xml"), 0)) {
    readChannel.read(result);
}
Unfortunately, this only works if the size of the ByteBuffer is greater than or equal to the uncompressed size of the file, which is not possible to get via the API.
I've also posted my comment to an issue registered with google: https://code.google.com/p/googleappengine/issues/detail?id=10445
This is my function for reading compressed gzip files
public byte[] getUpdate(String fileName) throws IOException
{
    GcsFilename fileNameObj = new GcsFilename(defaultBucketName, fileName);
    try (GcsInputChannel readChannel = gcsService.openReadChannel(fileNameObj, 0))
    {
        maxSizeBuffer.clear();
        readChannel.read(maxSizeBuffer);
    }
    byte[] result = maxSizeBuffer.array();
    return result;
}
The core issue is that you cannot rely on the size of the saved file, because Google Storage gives the content back at its original size; the client then compares the size you expected against the real size, and they differ:
Preconditions.checkState(content.length <= want, "%s: got %s > wanted %s", this, content.length, want);
So I solved it by allocating the largest amount possible for these files, using BlobstoreService.MAX_BLOB_FETCH_SIZE. maxSizeBuffer is actually allocated only once, outside the function:
ByteBuffer maxSizeBuffer = ByteBuffer.allocate(BlobstoreService.MAX_BLOB_FETCH_SIZE);
With maxSizeBuffer.clear() the buffer is reset before each read.
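A small refinement of that function (my own suggestion, not part of the original answer): after the read, the buffer's position tells you how many bytes were actually written into it, so you can trim the result instead of returning the full MAX_BLOB_FETCH_SIZE array:

public byte[] getUpdate(String fileName) throws IOException
{
    GcsFilename fileNameObj = new GcsFilename(defaultBucketName, fileName);
    ByteBuffer buffer = ByteBuffer.allocate(BlobstoreService.MAX_BLOB_FETCH_SIZE);
    try (GcsInputChannel readChannel = gcsService.openReadChannel(fileNameObj, 0))
    {
        readChannel.read(buffer);
    }
    // position() is the number of bytes the channel actually put into the buffer.
    return Arrays.copyOf(buffer.array(), buffer.position());
}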

How to use Java nio to write an uploaded image from ServletInputStream?

I've done the upload using ByteArrayOutputStream, and now I want to use NIO to write an image to a file on the hard disk from a ServletInputStream. I've tried a couple of ways with no luck so far; this is what I have now:
@Override
public void doPost(final HttpServletRequest request, final HttpServletResponse response)
        throws IOException, ServletException {
    final String fileName = "img_" + UUID.randomUUID().toString() + ".jpg";
    final String filePathName = "E:\\tmp\\" + fileName;
    final FileChannel outChannel = new FileOutputStream(filePathName).getChannel();
    final ReadableByteChannel inChannel = Channels.newChannel(request.getInputStream());
    outChannel.transferFrom(inChannel, 0, request.getContentLength());
    inChannel.close();
    outChannel.close();
}
The specified file is generated with the same size as the original, but it cannot be opened. What have I done wrong here, and what is the proper way?
Thanks.
I don't see why the '--' is being put in the file, unless it is being sent to you, but you need to call transferFrom() in a loop. You can't assume the entire file is transferred in one call. It returns the number of bytes it transferred each call, so you can track the total number transferred: if it's complete, break, otherwise add that to the offset, subtract it from the length, and repeat.
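A minimal sketch of that loop (assuming the request body really is the raw image bytes and not a multipart form, in which case the multipart boundaries would have to be parsed out first):

long expected = request.getContentLength();
long position = 0;
try (FileChannel outChannel = new FileOutputStream(filePathName).getChannel();
     ReadableByteChannel inChannel = Channels.newChannel(request.getInputStream())) {
    while (position < expected) {
        // transferFrom may move fewer bytes than requested; keep going until done.
        long transferred = outChannel.transferFrom(inChannel, position, expected - position);
        if (transferred <= 0) {
            break; // source exhausted
        }
        position += transferred;
    }
}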

How to read byte by byte from appengine datastore Entity Object

In a nutshell, since GAE cannot write to the filesystem, I have decided to persist my data in the datastore (using JDO). Now I would like to retrieve the data byte by byte and pass it to the client as an input stream. There's code from the gwtupload library (http://code.google.com/p/gwtupload/) (see below) which breaks on GAE because it writes to the filesystem. I'd like to provide a GAE-ported solution.
public static void copyFromInputStreamToOutputStream(InputStream in, OutputStream out) throws IOException {
    byte[] buffer = new byte[100000];
    while (true) {
        synchronized (buffer) {
            int amountRead = in.read(buffer);
            if (amountRead == -1) {
                break;
            }
            out.write(buffer, 0, amountRead);
        }
    }
    in.close();
    out.flush();
    out.close();
}
One workaround I have tried (it didn't work) is to retrieve the data from the datastore as a resource, like this:
InputStream resourceAsStream = null;
PersistenceManager pm = PMF.get().getPersistenceManager();
try {
    Query q = pm.newQuery(ImageFile.class);
    lf = q.execute();
    resourceAsStream = getServletContext().getResourceAsStream((String) pm.getObjectById(lf));
} finally {
    pm.close();
}
if (lf != null) {
    response.setContentType(receivedContentTypes.get(fieldName));
    copyFromInputStreamToOutputStream(resourceAsStream, response.getOutputStream());
}
I welcome your suggestions.
Regards
Store data in a byte array, and use a ByteArrayInputStream or ByteArrayOutputStream to pass it to libraries that expect streams.
If by 'client' you mean 'HTTP client' or browser, though, there's no reason to do this - just deal with regular byte arrays on your end and send them to/from the user as you would any other data. The only reason to mess around with streams like this is if you have some library that expects them.
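A minimal sketch of that idea, assuming a hypothetical JDO entity (here called ImageFile) that stores the bytes in a com.google.appengine.api.datastore.Blob field exposed via getData(), plus a hypothetical getContentType() accessor:

PersistenceManager pm = PMF.get().getPersistenceManager();
try {
    // 'key' identifies the uploaded file's entity; ImageFile and its accessors are hypothetical.
    ImageFile imageFile = pm.getObjectById(ImageFile.class, key);
    byte[] bytes = imageFile.getData().getBytes();        // Blob.getBytes() returns the raw byte[]
    response.setContentType(imageFile.getContentType());  // hypothetical accessor
    copyFromInputStreamToOutputStream(new ByteArrayInputStream(bytes), response.getOutputStream());
} finally {
    pm.close();
}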

Split/break a File in to pieces in J2ME

I need to split an audio or large image file in J2ME before uploading it. How can I split/break a file into pieces in Java ME?
What API do you use for file loading?
1) If you use the FileConnection API, you can load the data in blocks; there is no problem in this case.
2) If you use Class.getResourceAsStream(String pathInsideJar), you will have problems: most KVMs load the resource fully before returning control to your code. So I see one way - split the big file into several small files before creating the jar.
// dis is obtained from a FileConnection, e.g.:
// FileConnection fc = (FileConnection) Connector.open(fileUrl, Connector.READ);
DataInputStream dis = fc.openDataInputStream();
byte[] buffer = new byte[2048];
int count;
int total = 0;
Vector v = new Vector();
ByteArrayOutputStream baos = new ByteArrayOutputStream();
while ((count = dis.read(buffer)) >= 0)
{
    total += count;
    baos.write(buffer, 0, count);
    if (total > 100000)
    {
        baos.close();
        byte[] data = baos.toByteArray();
        v.addElement(data);
        baos = new ByteArrayOutputStream();
        total = 0; // start counting the next piece
    }
}
// add the last (partial) piece as well
if (baos.size() > 0)
{
    baos.close();
    v.addElement(baos.toByteArray());
}
So you will have a Vector of several byte arrays, and you can send them one by one.
Or read all the file data into one byte array and post the data in parts, shifting the start position of the data being sent, as in the sketch below.
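A rough sketch of that second option, using plain CLDC APIs (the sendChunk method is a placeholder for whatever upload call you use):

// Split one big byte[] into pieces of at most chunkSize bytes and send them one by one.
void sendInChunks(byte[] data, int chunkSize) throws IOException
{
    int offset = 0;
    while (offset < data.length)
    {
        int length = Math.min(chunkSize, data.length - offset);
        byte[] piece = new byte[length];
        System.arraycopy(data, offset, piece, 0, length);
        sendChunk(piece); // placeholder: your HTTP/socket upload of one piece
        offset += length;
    }
}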
