I want to create a java.io.File so that I can use it to generate a multipart-form POST request. I have the file in the form of a com.google.api.services.drive.model.File, so I'm wondering: is there a way to convert this Google File to a Java File? This is a web app that uses the Google App Engine SDK, which has blocked every approach I've tried.
No, it doesn't seem like you can convert a com.google.api.services.drive.model.File to a java.io.File. But it should still be possible to generate a multipart-form POST request using your data in Drive.
The com.google.api.services.drive.model.File class stores metadata about the file; it does not store the file contents.
If you want to read the contents of your file into memory, this code snippet from the Drive documentation shows how to do it. Once the file is in memory, you can do whatever you want with it.
/**
 * Download the content of the given file.
 *
 * @param service Drive service to use for downloading.
 * @param file File metadata object whose content to download.
 * @return String representation of file content. String is returned here
 *         because this app is set up for text/plain files.
 * @throws IOException Thrown if the request fails for whatever reason.
 */
private String downloadFileContent(Drive service, File file)
        throws IOException {
    GenericUrl url = new GenericUrl(file.getDownloadUrl());
    HttpResponse response = service.getRequestFactory().buildGetRequest(url)
            .execute();
    try {
        return new Scanner(response.getContent()).useDelimiter("\\A").next();
    } catch (java.util.NoSuchElementException e) {
        return "";
    }
}
https://developers.google.com/drive/examples/java
This post might be helpful for making your multipart POST request from Google App Engine.
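As a rough illustration, here is a minimal sketch of assembling a multipart/form-data POST by hand over HttpURLConnection (which App Engine routes through URLFetch). The target URL, field name, and file name are hypothetical placeholders, not anything from the Drive API:

import java.io.DataOutputStream;
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

public class MultipartPostSketch {

    // Posts fileBytes to targetUrl as a multipart/form-data upload.
    // targetUrl, fieldName, and fileName are placeholders you supply.
    public static int postFile(String targetUrl, String fieldName,
            String fileName, byte[] fileBytes) throws IOException {
        String boundary = "----boundary" + System.currentTimeMillis();
        HttpURLConnection conn = (HttpURLConnection) new URL(targetUrl).openConnection();
        conn.setDoOutput(true);
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type", "multipart/form-data; boundary=" + boundary);
        DataOutputStream out = new DataOutputStream(conn.getOutputStream());
        // One form part containing the file bytes.
        out.writeBytes("--" + boundary + "\r\n");
        out.writeBytes("Content-Disposition: form-data; name=\"" + fieldName
                + "\"; filename=\"" + fileName + "\"\r\n");
        out.writeBytes("Content-Type: application/octet-stream\r\n\r\n");
        out.write(fileBytes);
        // The closing boundary ends the request body.
        out.writeBytes("\r\n--" + boundary + "--\r\n");
        out.flush();
        out.close();
        return conn.getResponseCode();
    }
}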
In the Google Drive API v3 you can download the file content into an OutputStream. For that you need the file ID, which you can get from your com.google.api.services.drive.model.File:
String fileId = "yourFileId";
OutputStream outputStream = new ByteArrayOutputStream();
driveService.files().get(fileId).executeMediaAndDownloadTo(outputStream);
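If you then need the raw bytes, for example to build a multipart request body, you can pull them back out of the stream:

byte[] fileBytes = ((ByteArrayOutputStream) outputStream).toByteArray();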
Since version 1.15 of Apache Flink you can use the compaction feature to merge several files into one.
https://nightlies.apache.org/flink/flink-docs-master/docs/connectors/datastream/filesystem/#compaction
How can we use compaction with bulk Parquet format?
The existing implementations of RecordWiseFileCompactor.Reader (DecoderBasedReader and InputFormatBasedReader) do not seem suitable for Parquet.
Furthermore, we cannot find any example of compacting Parquet or other bulk formats.
There are two types of file compactor mentioned in Flink's documentation.
OutputStreamBasedFileCompactor : The users can write the compacted results into an output stream. This is useful when the users don’t want to or can’t read records from the input files.
RecordWiseFileCompactor : The compactor can read records one-by-one from the input files and write into the result file similar to the FileWriter.
If I remember correctly, Parquet saves its meta information at the end of the file, so we need to use RecordWiseFileCompactor: we have to read the whole Parquet file to get the meta information at its end, and can then use that meta information (number of row groups, schema) to parse the file.
From the Java API, to construct a RecordWiseFileCompactor we need an instance of RecordWiseFileCompactor.Reader.Factory.
There are two implementations of the interface RecordWiseFileCompactor.Reader.Factory: DecoderBasedReader.Factory and InputFormatBasedReader.Factory.
DecoderBasedReader.Factory creates a DecoderBasedReader instance, which reads the whole file content from an InputStream. We would have to load the bytes into a buffer and parse the file from the byte buffer ourselves, which is obviously painful, so we don't use this implementation.
InputFormatBasedReader.Factory creates an InputFormatBasedReader, which reads the whole file content using the FileInputFormat supplier we pass to the InputFormatBasedReader.Factory constructor.
The InputFormatBasedReader instance uses the FileInputFormat to read record by record and passes the records to the writer we passed to the forBulkFormat call, until the end of the file.
The writer receives all the records and compacts them into one file.
So the question becomes: what is FileInputFormat, and how do we implement it?
Though the class FileInputFormat has many methods and fields, we can see from the InputFormatBasedReader source code mentioned above that only four of its methods are called:
open(FileInputSplit fileSplit), which opens the file
reachedEnd(), which checks whether we have hit the end of the file
nextRecord(), which reads the next record from the opened file
close(), which cleans up
Luckily, there's an AvroParquetReader in package org.apache.parquet.avro that we can utilize. It already implements open/read/close, so we can wrap the reader inside a FileInputFormat and let AvroParquetReader do all the dirty work.
Here's an example code snippet:
import org.apache.avro.generic.GenericRecord;
import org.apache.flink.api.common.io.FileInputFormat;
import org.apache.flink.core.fs.FileInputSplit;
import org.apache.hadoop.conf.Configuration;
import org.apache.parquet.avro.AvroParquetReader;
import org.apache.parquet.hadoop.ParquetReader;
import org.apache.parquet.hadoop.util.HadoopInputFile;
import org.apache.parquet.io.InputFile;
import java.io.IOException;

public class ExampleFileInputFormat extends FileInputFormat<GenericRecord> {

    private ParquetReader<GenericRecord> parquetReader;
    private GenericRecord readRecord;

    @Override
    public void open(FileInputSplit split) throws IOException {
        Configuration config = new Configuration();
        // Set Hadoop config here.
        // For example, if you are using GCS, set fs.gs.impl here.
        // I haven't tried core-site.xml, but I believe that is feasible too.
        InputFile inputFile = HadoopInputFile.fromPath(
                new org.apache.hadoop.fs.Path(split.getPath().toUri()), config);
        parquetReader = AvroParquetReader.<GenericRecord>builder(inputFile).build();
        readRecord = parquetReader.read();
    }

    @Override
    public void close() throws IOException {
        parquetReader.close();
    }

    @Override
    public boolean reachedEnd() throws IOException {
        return readRecord == null;
    }

    @Override
    public GenericRecord nextRecord(GenericRecord genericRecord) throws IOException {
        GenericRecord r = readRecord;
        readRecord = parquetReader.read();
        return r;
    }
}
Then you can use the ExampleFileInputFormat as shown below:
FileSink<GenericRecord> sink = FileSink.forBulkFormat(
        new Path(path),
        AvroParquetWriters.forGenericRecord(schema))
    .withRollingPolicy(OnCheckpointRollingPolicy.build())
    .enableCompact(
        FileCompactStrategy.Builder.newBuilder()
            .enableCompactionOnCheckpoint(10)
            .build(),
        new RecordWiseFileCompactor<>(
            new InputFormatBasedReader.Factory<>(
                new SerializableSupplierWithException<FileInputFormat<GenericRecord>, IOException>() {
                    @Override
                    public FileInputFormat<GenericRecord> get() throws IOException {
                        return new ExampleFileInputFormat();
                    }
                })))
    .build();
I have successfully deployed this to Flink on k8s and compacted files on GCS. Here are some notes on deployment.
You need to download the Flink shaded Hadoop jar from https://flink.apache.org/downloads.html (search for "Pre-bundled Hadoop" on the page) and put the jar into $FLINK_HOME/lib/.
If you are writing files to some object storage, for example GCS, you need to follow the plugin instructions. Remember to put the plugin jar into the plugins folder, not the lib folder.
If you are writing files to some object storage, you also need to download the connector jar from your cloud service provider. For example, I'm using GCS and downloaded the gcs-connector jar following the GCP instructions. Put the jar into some folder other than $FLINK_HOME/lib or $FLINK_HOME/plugins; I put the connector jar into a newly created folder, $FLINK_HOME/hadoop-lib.
Set the environment variable HADOOP_CLASSPATH=$FLINK_HOME/lib/YOUR_SHADED_HADOOP_JAR:$FLINK_HOME/hadoop-lib/YOUR_CONNECTOR_JAR
After all these steps, you can start your job and you're good to go.
I need to display on screen a file that a web service returns in the form of a base64 array. Ideally it would be displayed in a new browser window. I built a Java agent to consume the web service.
I think the way to output it would be an XAgent, but I have never implemented one, so I don't know if that's the right approach or how I would do it. Would a button on an XPage call the XPage that executes the XAgent? How would the web service be consumed: by the XPage calling the XAgent, or by the agent? Should I import the Java agent code into the XAgent? I've been researching how to do this all day, but so far I have not had much success.
Thanks a lot,
Marcus
The following XSnippet (of an XAgent) gives you a bit of an idea of what to do.
https://openntf.org/XSnippets.nsf/snippet.xsp?id=download-all-attachments
In the XSnippet above, all the attachments in a Document are being zipped and then sent to the browser as a zip file.
The URL has some parameters, e.g. documentID, which the XAgent uses to determine which document to zip.
The XAgent gets a handle to the HttpServletResponse, and configures it so that, instead of sending back an XPage, it specifies it is sending an 'application/zip' file.
It then finds the document using documentId, zips it up, writes the contents to the response, and then tells the facesContext that the response is complete (don't do any more rendering).
In your case, you would put a parameter in the URL which identifies the file that you want to download, and you could link to this XAgent using a URL, e.g.
Download.xsp?fileId=somefileid
Your XAgent would then set up the response similar to the above, but the content type might not be 'application/zip'. If you don't know the file type you can use 'application/octet-stream', but if you know it is a PDF or something you can use the appropriate MIME type.
Retrieve the file using whatever code you have written to access your web service, decode it, and write it to the response's output stream.
Example: implementing in Java as a managed bean
The following example outputs some plain text that was originally in a base64 byte array. It is decoded and then written to the response.
All you would do is change the content type to 'application/octet-stream'
Create the managed bean in the Java Design Element.
package com.example;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.Serializable;
import javax.faces.context.ExternalContext;
import javax.faces.context.FacesContext;
import javax.servlet.http.HttpServletResponse;
import com.sun.faces.util.Base64;
public class DownloadBean implements Serializable {

    private static final long serialVersionUID = 1L;

    public DownloadBean() {
    }

    public void downloadFile() throws IOException {
        FacesContext fc = FacesContext.getCurrentInstance();
        ExternalContext context = fc.getExternalContext();
        String myFileName = "SomeFile.txt";
        HttpServletResponse resp = (HttpServletResponse) context.getResponse();
        resp.setHeader("Cache-Control", "no-cache");
        resp.setDateHeader("Expires", -1);
        // This example is just plain text, but you would
        // change this to 'application/octet-stream'
        resp.setContentType("text/plain");
        // Tell the browser it is an attachment with a filename of <myFileName>
        resp.setHeader("Content-Disposition", "attachment; filename=" + myFileName);
        OutputStream os = resp.getOutputStream();
        // Somehow you need to get your byte[];
        // how you do that is up to you.
        // This example just uses the base64 encoding of 'Hello Marcus'.
        byte[] base64bytes = "SGVsbG8gTWFyY3Vz".getBytes();
        // Option 1: use sun.misc.BASE64Decoder to decode with streams
        ByteArrayInputStream bais = new ByteArrayInputStream(base64bytes);
        sun.misc.BASE64Decoder dec = new sun.misc.BASE64Decoder();
        dec.decodeBuffer(bais, os);
        // Option 2: use com.sun.faces.util.Base64 to decode to a normal byte[]
        //byte[] normalBytes = Base64.decode(base64bytes);
        //os.write(normalBytes);
        os.flush();
        os.close();
        fc.responseComplete();
    }
}
Register it in faces-config.xml; request scope should be enough. For example, if you called it 'downloadBean' and it was of class com.example.DownloadBean, put this entry in faces-config.xml:
<?xml version="1.0" encoding="UTF-8"?>
<faces-config>
  <managed-bean>
    <managed-bean-name>downloadBean</managed-bean-name>
    <managed-bean-class>com.example.DownloadBean</managed-bean-class>
    <managed-bean-scope>request</managed-bean-scope>
  </managed-bean>
</faces-config>
How to use
You can then call this from a button:
<?xml version="1.0" encoding="UTF-8"?>
<xp:view xmlns:xp="http://www.ibm.com/xsp/core">
<xp:button value="Download" id="buttonDownload">
<xp:eventHandler event="onclick" submit="true"
refreshMode="complete" action="#{downloadBean.downloadFile}">
</xp:eventHandler>
</xp:button>
</xp:view>
Or you can create an XPage, like an XAgent, that just downloads the file, and link to that XPage in a new window.
<?xml version="1.0" encoding="UTF-8"?>
<xp:view xmlns:xp="http://www.ibm.com/xsp/core"
rendered="false" beforePageLoad="#{downloadBean.downloadFile}">
</xp:view>
You could access URL parameters if needed using something like
https://openntf.org/XSnippets.nsf/snippet.xsp?id=get-url-parameter-using-java
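For instance, inside downloadFile() you could read a fileId parameter through the ExternalContext the bean already holds (a minimal sketch; the parameter name matches the hypothetical Download.xsp?fileId=somefileid URL above):

// Read the fileId query parameter from the current request.
// XPages is based on JSF 1.1, so the parameter map is untyped and needs a cast.
String fileId = (String) context.getRequestParameterMap().get("fileId");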
I'm using the Gmail API in the browser and want to allow the user to download email attachments. I see https://developers.google.com/gmail/api/v1/reference/users/messages/attachments/get but it returns JSON and base64 data. I don't think I can get that data in memory and then trigger a "download" to save the file locally. Even if I could, I don't think it would be efficient: it would probably hold the whole file in memory rather than streaming it to a file. I think I need a direct link to a file that returns the correct file name and raw binary data (not base64). Is there a way to do this? Right now the only way I see is to proxy the requests.
You can decode the data from the base64 and save it to a file locally.
If you are getting the attachment in Java, you can use the FileOutputStream class (or f.write() in Python) to write the bytes to a file at a given path.
You can try the following sample code from the Google Developers page:
import com.google.api.services.gmail.Gmail;
import com.google.api.services.gmail.model.Message;
import com.google.api.services.gmail.model.MessagePart;
import com.google.api.services.gmail.model.MessagePartBody;
import org.apache.commons.codec.binary.Base64;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.List;

public static void getAttachments(Gmail service, String userId, String messageId)
        throws IOException {
    Message message = service.users().messages().get(userId, messageId).execute();
    List<MessagePart> parts = message.getPayload().getParts();
    for (MessagePart part : parts) {
        if (part.getFilename() != null && part.getFilename().length() > 0) {
            String filename = part.getFilename();
            String attId = part.getBody().getAttachmentId();
            MessagePartBody attachPart = service.users().messages().attachments()
                    .get(userId, messageId, attId).execute();
            // The data is base64url encoded; commons-codec handles both variants.
            byte[] fileByteArray = Base64.decodeBase64(attachPart.getData());
            // Note: include a trailing path separator in the directory name.
            FileOutputStream fileOutFile =
                    new FileOutputStream("directory_to_store_attachments" + filename);
            fileOutFile.write(fileByteArray);
            fileOutFile.close();
        }
    }
}
Using wildcards in the file name, I am trying to read files from a GCS bucket.
On the gsutil command line, wildcards work when specifying file names, but in the Java client API
GcsFilename filename = new GcsFilename(BUCKETNAME, "big*");
searches for a file named "big*" instead of files starting with "big".
Please help me understand how I can use wildcards in GcsFilename.
Thanks in advance.
Wildcard characters are a feature of gsutil, but they're not an inherent part of the Google Cloud Storage API. You can, however, handle this the same way that gsutil does.
If you want to find the name of every object that begins with a certain prefix, Google Cloud Storage's APIs provide a list method with a "prefix" argument. Only objects matching the prefix will be returned. This doesn't work for arbitrary regular expressions, but it will work for your example.
The documentation for the list method goes into more detail.
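For example, to emulate big* you could list with the prefix "big". A minimal sketch using the Java API client for the Cloud Storage JSON API (construction of the Storage instance is omitted):

import com.google.api.services.storage.Storage;
import com.google.api.services.storage.model.Objects;
import com.google.api.services.storage.model.StorageObject;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Lists every object in the bucket whose name starts with the prefix,
// following pagination tokens until the listing is exhausted.
public static List<String> listByPrefix(Storage storage, String bucket, String prefix)
        throws IOException {
    List<String> names = new ArrayList<String>();
    Storage.Objects.List request = storage.objects().list(bucket).setPrefix(prefix);
    Objects response;
    do {
        response = request.execute();
        if (response.getItems() != null) {
            for (StorageObject o : response.getItems()) {
                names.add(o.getName());
            }
        }
        request.setPageToken(response.getNextPageToken());
    } while (response.getNextPageToken() != null);
    return names;
}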
As Brandon Yarbrough mentioned, GcsFilename represents the name of a single GCS object, which can include any valid UTF-8 character (excluding a few such as \r and \n, but including '*', though that is not recommended). See https://developers.google.com/storage/docs/bucketnaming#objectnames for more info.
The GAE GCS client does not support listing yet (though that is planned to be added), so for now you can use the GCS XML or JSON API directly (using URLFetch) or use the Java API client for Cloud Storage, https://developers.google.com/api-client-library/java/apis/storage/v1
See an example of the latter option:
import com.google.api.client.extensions.appengine.http.UrlFetchTransport;
import com.google.api.client.googleapis.extensions.appengine.auth.oauth2.AppIdentityCredential;
import com.google.api.client.json.jackson2.JacksonFactory;
import com.google.api.services.storage.Storage;
import com.google.api.services.storage.Storage.Objects;
import com.google.api.services.storage.model.StorageObject;
import com.google.appengine.api.utils.SystemProperty;
import com.google.common.collect.ImmutableList;
import java.io.IOException;
import java.util.List;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class ListServlet extends HttpServlet {

    public static final List<String> OAUTH_SCOPES =
            ImmutableList.of("https://www.googleapis.com/auth/devstorage.read_write");

    @Override
    protected void doPost(HttpServletRequest req, HttpServletResponse resp)
            throws ServletException, IOException {
        try {
            String bucket = req.getParameter("bucket");
            AppIdentityCredential cred = new AppIdentityCredential(OAUTH_SCOPES);
            Storage storage = new Storage.Builder(new UrlFetchTransport(), new JacksonFactory(), cred)
                    .setApplicationName(SystemProperty.applicationId.get()).build();
            Objects.List list = storage.objects().list(bucket);
            for (StorageObject o : list.execute().getItems()) {
                resp.getWriter().println(o.getName() + " -> " + o);
            }
        } catch (Exception ex) {
            throw new ServletException(ex);
        }
    }
}
I am using JDeveloper version 11.1.1.5.0. In my use case I have created a mail client send-mail program, where I used the ADF InputFile component to attach a file to the mail.
But the problem is that the InputFile component only returns the name of the file, while the DataSource class in my mail program needs the full path to access the file.
UploadedFile uploadfile = (UploadedFile) actionEvent.getNewValue();
String fname = uploadfile.getFilename(); // this line only gets the file name
So how can I get the full path using the ADF InputFile component, or is there any other way to fulfill my requirement?
You could save the uploaded file to a path on the server. Just take care about naming that file: because of concurrent users you should follow a naming policy, for example adding the time in milliseconds to the name of the file. Like this...
private String writeToFile(UploadedFile file) {
    ServletContext servletCtx =
            (ServletContext) FacesContext.getCurrentInstance().getExternalContext().getContext();
    String fileDirPath = servletCtx.getRealPath("/files/tmp");
    String fileName = getTimeInMilis() + file.getFilename();
    try {
        InputStream is = file.getInputStream();
        OutputStream os = new FileOutputStream(fileDirPath + "/" + fileName);
        int readData;
        while ((readData = is.read()) != -1) {
            os.write(readData);
        }
        is.close();
        os.close();
    } catch (IOException ex) {
        ex.printStackTrace();
    }
    return fileName;
}
This method also returns the new name of the uploaded file. You can replace getTimeInMilis() with any naming policy you like.
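The getTimeInMilis() helper isn't shown in the answer; a minimal version (keeping the answer's name) could be:

// Returns the current time in milliseconds as a string; prefixing file
// names with it makes collisions between concurrent uploads unlikely.
private String getTimeInMilis() {
    return String.valueOf(System.currentTimeMillis());
}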
It would be a security issue if a web app were able to see anything other than the data stream for an uploaded file; the directory structure of the client is not exposed to the web app. As such, unless you plan to upload the file from the same host as the server, you will not have access to the file path on the client.
Note: Using answer instead of comment due to reputation threshold