Automating a file download process with Selenium - file

The URLs are listed in a file; each one has to be copied from that file and opened in the browser. This has to be done for all the URLs, one by one. Each file is less than 1 MB at most. The next download starts only after the previous one completes, so at any time there is only one active download.
I am currently doing this manually for 125 files, and the number may increase in the coming days, so I am planning to automate the process. Can I achieve this with Selenium?
I know Selenium WebDriver and am able to write simple scripts.
If it is not possible with Selenium, please suggest alternative ways.

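Use the code below to read from the Excel sheet:
import java.io.FileInputStream;
import org.apache.poi.ss.usermodel.*;

public class Excel {
    public static String getExcelData(String sheetName, int rowNum, int colNum) {
        String url = null;
        try (FileInputStream fis = new FileInputStream("filepath.xls")) {
            Workbook wb = WorkbookFactory.create(fis);
            Sheet s = wb.getSheet(sheetName);
            Row r = s.getRow(rowNum);
            Cell c = r.getCell(colNum);
            url = c.getStringCellValue(); // assumed: the assignment is missing from the original snippet
        } catch (Exception e) {
            e.printStackTrace();
        }
        return url;
    }
}
The code returns the URL.
Use this URL to download the file:
WebDriver driver; // initialise with the browser driver you use
driver.get(url); // assumed: this call is missing from the original snippet
This will download the file.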

First, read the Excel file from your Selenium/Java code. Refer to the link below for more information:
Read the URLs from the Excel sheet one after the other:
driver.get("url captured from the excel")
I assume that the file gets downloaded automatically when you hit the URL. If a specific element is displayed once the download is done, you can wait until that element is visible (waitUntilElementVisible); if the download has to be confirmed manually, Sikuli can be used.
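As an alternative to driving a browser, and only if each URL returns the file directly (no login page or JavaScript in between), the same one-at-a-time loop can be sketched in plain Java. The class and method names below (SequentialDownloader, readUrls, fileNameFor, downloadAll) are made up for illustration, and the sketch assumes a text file with one URL per line rather than an Excel sheet.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;

class SequentialDownloader {

    /** Reads one URL per line, skipping blank lines. */
    static List<String> readUrls(Path urlFile) {
        try {
            List<String> urls = new ArrayList<>();
            for (String line : Files.readAllLines(urlFile)) {
                if (!line.trim().isEmpty()) {
                    urls.add(line.trim());
                }
            }
            return urls;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Derives a local file name from the last path segment of the URL. */
    static String fileNameFor(String url, int index) {
        String path = URI.create(url).getPath();
        String name = path == null ? "" : path.substring(path.lastIndexOf('/') + 1);
        return name.isEmpty() ? "download-" + index : name;
    }

    /** Downloads every URL in order; each copy finishes before the next one starts. */
    static List<Path> downloadAll(List<String> urls, Path targetDir) {
        List<Path> saved = new ArrayList<>();
        for (int i = 0; i < urls.size(); i++) {
            Path target = targetDir.resolve(fileNameFor(urls.get(i), i));
            try (InputStream in = URI.create(urls.get(i)).toURL().openStream()) {
                Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            saved.add(target);
        }
        return saved;
    }
}
```

Because the loop is synchronous, there is never more than one active download, which matches the manual process described in the question. If the site needs a logged-in session, Selenium (or passing cookies explicitly) is still required.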


Use of compaction for Parquet bulk format

Since version 1.15 of Apache Flink you can use the compaction feature to merge several files into one.
How can we use compaction with bulk Parquet format?
The existing implementations of RecordWiseFileCompactor.Reader (DecoderBasedReader and InputFormatBasedReader) do not seem suitable for Parquet.
Furthermore we can not find any example for compacting Parquet or other bulk formats.
There are two types of file compactor mentioned in Flink's documentation.
OutputStreamBasedFileCompactor : The users can write the compacted results into an output stream. This is useful when the users don’t want to or can’t read records from the input files.
RecordWiseFileCompactor : The compactor can read records one-by-one from the input files and write into the result file similar to the FileWriter.
If I remember correctly, Parquet stores its metadata at the end of the file, so we need to use RecordWiseFileCompactor: simply concatenating the bytes of several Parquet files would not produce a valid file. We have to read each Parquet file as a whole to reach the metadata at its end, and then use that metadata (number of row groups, schema) to parse the file.
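To make the point about the metadata concrete: a Parquet file ends with a 4-byte little-endian footer length followed by the 4-byte magic "PAR1", and the metadata sits just before those 8 bytes. The following is only a small illustrative sketch in plain Java (the class name ParquetFooterPeek is made up); real code would use the Parquet library rather than parse the tail by hand.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

class ParquetFooterPeek {

    /** Returns the footer (metadata) length stored in the last 8 bytes of a Parquet file. */
    static int footerLength(String fileName) {
        try (RandomAccessFile file = new RandomAccessFile(fileName, "r")) {
            if (file.length() < 12) {
                throw new IllegalArgumentException("too short to be a Parquet file");
            }
            byte[] tail = new byte[8];
            file.seek(file.length() - 8);
            file.readFully(tail);
            // last 4 bytes: magic; the 4 bytes before that: footer length
            String magic = new String(tail, 4, 4, StandardCharsets.US_ASCII);
            if (!"PAR1".equals(magic)) {
                throw new IllegalArgumentException("missing PAR1 magic at end of file");
            }
            return ByteBuffer.wrap(tail, 0, 4).order(ByteOrder.LITTLE_ENDIAN).getInt();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Since the reader has to seek to the end before it can interpret anything else, a compactor that only appends raw bytes to an output stream cannot merge Parquet files correctly.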
From the Java API, to construct a RecordWiseFileCompactor we need an instance of RecordWiseFileCompactor.Reader.Factory.
There are two implementations of the interface RecordWiseFileCompactor.Reader.Factory: DecoderBasedReader.Factory and InputFormatBasedReader.Factory.
DecoderBasedReader.Factory creates a DecoderBasedReader instance, which reads the whole file content from an InputStream. We could load the bytes into a buffer and parse the file from that byte buffer, but that is obviously painful, so we don't use this implementation.
InputFormatBasedReader.Factory creates an InputFormatBasedReader, which reads the whole file content using the FileInputFormat supplier we pass to the InputFormatBasedReader.Factory constructor.
The InputFormatBasedReader instance uses the FileInputFormat to read record by record and passes each record to the writer we passed to the forBulkFormat call, until the end of the file.
The writer receives all the records and compacts them into one file.
So the question becomes what FileInputFormat is and how to implement it.
Although the FileInputFormat class has many methods and fields, the InputFormatBasedReader source code mentioned above shows that only four of its methods are called:
open(FileInputSplit fileSplit), which opens the file
reachedEnd(), which checks if we hit end of file
nextRecord(), which reads next record from the opened file
close(), which cleans up
Luckily, there is an AvroParquetReader in the package org.apache.parquet.avro that we can use. It already implements open/read/close, so we can wrap the reader inside a FileInputFormat and let the AvroParquetReader do all the dirty work.
Here's an example code snippet:
import java.io.IOException;
import org.apache.avro.generic.GenericRecord;
import org.apache.flink.api.common.io.FileInputFormat;
import org.apache.flink.core.fs.FileInputSplit;
import org.apache.hadoop.conf.Configuration;
import org.apache.parquet.avro.AvroParquetReader;
import org.apache.parquet.hadoop.ParquetReader;
import org.apache.parquet.hadoop.util.HadoopInputFile;
import org.apache.parquet.io.InputFile;

public class ExampleFileInputFormat extends FileInputFormat<GenericRecord> {
    private ParquetReader<GenericRecord> parquetReader;
    private GenericRecord readRecord; // record read ahead; null once the end of the file is reached

    @Override
    public void open(FileInputSplit split) throws IOException {
        Configuration config = new Configuration();
        // set hadoop config here
        // for example, if you are using gcs, set here
        // i haven't tried to use core-site.xml but i believe this is feasible
        InputFile inputFile = HadoopInputFile.fromPath(new org.apache.hadoop.fs.Path(split.getPath().toUri()), config);
        parquetReader = AvroParquetReader.<GenericRecord>builder(inputFile).build();
        readRecord = parquetReader.read(); // read() returns null at end of file
    }

    @Override
    public void close() throws IOException {
        parquetReader.close();
    }

    @Override
    public boolean reachedEnd() throws IOException {
        return readRecord == null;
    }

    @Override
    public GenericRecord nextRecord(GenericRecord genericRecord) throws IOException {
        GenericRecord r = readRecord;
        readRecord = parquetReader.read();
        return r;
    }
}
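The read-ahead trick used above (fetch one record in open(), hand it out in nextRecord() while fetching the next, and report reachedEnd() once the fetched record is null) can be looked at in isolation. This is a library-free sketch: LookaheadReader is a made-up name, and a plain Iterator stands in for the ParquetReader.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class LookaheadReader<T> {
    private final Iterator<T> source; // stands in for ParquetReader.read()
    private T next;                   // record read ahead; null once the source is exhausted

    LookaheadReader(Iterator<T> source) {
        this.source = source;
        this.next = readOrNull();     // same role as the first read in open()
    }

    private T readOrNull() {
        return source.hasNext() ? source.next() : null;
    }

    boolean reachedEnd() {
        return next == null;
    }

    T nextRecord() {
        T current = next;
        next = readOrNull();
        return current;
    }

    /** Drains a reader the way a caller of reachedEnd()/nextRecord() would. */
    static <T> List<T> drain(LookaheadReader<T> reader) {
        List<T> out = new ArrayList<>();
        while (!reader.reachedEnd()) {
            out.add(reader.nextRecord());
        }
        return out;
    }
}
```

The read-ahead is needed because the underlying reader only signals the end of the file by returning null from a read, while the caller asks reachedEnd() before requesting each record.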
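Then you can use the ExampleFileInputFormat as shown below:
FileSink<GenericRecord> sink = FileSink.forBulkFormat(
        new Path(path),
        writerFactory) // placeholder: the bulk writer factory argument is missing from the original snippet
    .enableCompact(
        compactStrategy, // placeholder: a FileCompactStrategy, also missing from the original snippet
        new RecordWiseFileCompactor<>(
            new InputFormatBasedReader.Factory<>(new SerializableSupplierWithException<FileInputFormat<GenericRecord>, IOException>() {
                @Override
                public FileInputFormat<GenericRecord> get() throws IOException {
                    FileInputFormat<GenericRecord> format = new ExampleFileInputFormat();
                    return format;
                }
            })))
    .build();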
I have successfully deployed this to Flink on k8s and compacted files on GCS. Here are some notes on deploying:
You need to download the Flink shaded Hadoop jar from (search for "Pre-bundled Hadoop" on the web page) and put the jar into $FLINK_HOME/lib/
If you are writing files to an object store, for example GCS, you need to follow the plugin instructions. Remember to put the plugin jar into the plugins folder, not the lib folder.
If you are writing files to an object store, you also need to download the connector jar from the cloud provider. For example, I'm using GCS and downloaded the gcs-connector jar following the GCP instructions. Put the jar into a folder other than $FLINK_HOME/lib or $FLINK_HOME/plugins. I put the connector jar into a newly created folder, $FLINK_HOME/hadoop-lib
After all these steps, you can start your job and you are good to go.

ADF: How to get path of file when using InputFile Component

I am using JDeveloper. In my use case I have created a mail client "Send Mail" program in which I use the ADF InputFile component to attach a file to the mail.
The problem is that the InputFile component only returns the file name, not the full path of the file, while the DataSource class in my mail program needs the full path to access the file.
UploadedFile uploadfile=(UploadedFile) actionEvent.getNewValue();
String fname= uploadfile.getFilename();//this line only get file name.
So how can I get the full path using the ADF InputFile component, or is there any other way to fulfil my requirement?
You could save the uploaded file to a path on the server. Just take care when naming the file: because several users may upload concurrently, you should follow a naming policy, for example adding the time in milliseconds to the file name. Like this:
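private String writeToFile(UploadedFile file) {
    // assumed: the right-hand side is missing from the original; this is the usual JSF way to obtain it
    ServletContext servletCtx =
        (ServletContext) FacesContext.getCurrentInstance().getExternalContext().getContext();
    String fileDirPath = servletCtx.getRealPath("/files/tmp");
    String fileName = getTimeInMilis() + file.getFilename();
    // try-with-resources closes both streams even if the copy fails
    try (InputStream is = file.getInputStream();
         OutputStream os = new FileOutputStream(fileDirPath + "/" + fileName)) {
        int readData;
        while ((readData = is.read()) != -1) {
            os.write(readData);
        }
    } catch (IOException ex) {
        ex.printStackTrace();
    }
    return fileName;
}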
This method also returns the new name of the uploaded file. You can replace getTimeInMilis() with any naming policy you like.
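The same idea can be sketched without the ADF and servlet classes, which makes the naming policy easier to see. This is only an illustration under assumptions: UploadStore and save are made-up names, the uploaded content is represented by a plain InputStream, and a random UUID is added to the timestamp so that two uploads in the same millisecond still get different names.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.UUID;

class UploadStore {

    /** Copies the uploaded stream into dir under a unique name and returns the full server-side path. */
    static Path save(InputStream upload, String originalName, Path dir) {
        // Keep only the last path segment in case the browser sent a full client path.
        String baseName = originalName.substring(
                Math.max(originalName.lastIndexOf('/'), originalName.lastIndexOf('\\')) + 1);
        String uniqueName = System.currentTimeMillis() + "-" + UUID.randomUUID() + "-" + baseName;
        try {
            Files.createDirectories(dir);
            Path target = dir.resolve(uniqueName);
            Files.copy(upload, target);
            return target.toAbsolutePath();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The returned path is a full path on the server, which is what a path-based DataSource in the mail code would need; the client-side path is never involved.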
It would be a security issue if a web app is able to see anything other than the data stream for an uploaded file. The directory structure of the client would not be exposed to the webapp. As such, unless you plan to upload the file from the same host as the server, you will not have access to the file path on the client.

Run SSIS only when complete file is present in the folder

I have developed an ETL package that consumes flat files. The size of the flat files varies from 250 MB to 300 MB.
It works absolutely fine when the file is present in the folder, but it fails when the file is still being generated.
Example: the ETL package runs from 8 AM to 10 AM and checks whether the file is present in the folder. If at some point (say 9 AM) the file has just started being generated and is only 10 MB so far, the ETL starts processing it, hangs, and fails after 4-5 minutes (it hangs at the script task that checks whether the file is present in the folder).
What is the best way to trigger the SSIS package only when the file generation is completely done?
Note: I have no control over the file generation.
Add a For Loop Container with a Boolean variable bFileAccessible:
The Init expression is @bFileAccessible = FALSE
The Eval expression is @bFileAccessible == FALSE
Inside the For Loop Container add a Script Task with a ReadWriteVariable User::bFileAccessible and the following C# script (showing only the Main() method):
public void Main() {
    try {
        using (Stream stream = new FileStream(@"Path\to\your\file", FileMode.Open)) {
            Dts.Variables["bFileAccessible"].Value = true; // file could be opened, so the writer is done
        }
    } catch (IOException) {
        Dts.Variables["bFileAccessible"].Value = false; // still locked by the generating process
    }
    Dts.TaskResult = (int)ScriptResults.Success;
}
You should also use a variable for the filename and maybe a little wait interval. For more information about the script see here.
Check the file's modified time on every run and compare it with the previous value.
It is not perfect logic, but it is a reasonable approach if there is no better alternative.
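The modified-time comparison can be sketched as a small polling loop. The sketch below is in Java rather than an SSIS script task, uses made-up names (FileStabilityCheck, waitUntilStable), and also compares the file size as an extra signal, which is an addition to what the answer describes. It treats the file as complete when neither value changes during one quiet interval.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

class FileStabilityCheck {

    /** Size and last-modified time, the two attributes that change while a file is being written. */
    static long[] snapshot(Path file) {
        try {
            return new long[] { Files.size(file), Files.getLastModifiedTime(file).toMillis() };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Sleeps for the given time; returns false if the thread was interrupted. */
    private static boolean pause(long millis) {
        try {
            Thread.sleep(millis);
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    /** Polls until the file exists and stays unchanged for quietMillis, or maxAttempts is used up. */
    static boolean waitUntilStable(Path file, long quietMillis, int maxAttempts) {
        for (int i = 0; i < maxAttempts; i++) {
            if (!Files.exists(file)) {
                if (!pause(quietMillis)) return false;
                continue;
            }
            long[] before = snapshot(file);
            if (!pause(quietMillis)) return false;
            long[] after = snapshot(file);
            if (before[0] == after[0] && before[1] == after[1]) {
                return true;
            }
        }
        return false;
    }
}
```

The quiet interval has to be longer than the longest pause the generating process makes between writes, otherwise a half-written file can look stable; this is why the approach is a heuristic rather than a guarantee.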

Silverlight: Business Application Needs Access To Files To Print and Move

I have the following requirement for a business application:
(All of this could be on local or server)
Allow user to select folder location
Show contents of folder
Print selected items from folder (*.pdf)
Display which files have been printed
Potentially move printed files to new location (sub-folder of printed)
How can I make this happen in Silverlight?
Kind regards,
First of all, all but the last item can be done (in the way you expect). Due to security restrictions, Silverlight cannot access the user's drive and manipulate it. The closest you can get is accessing Silverlight's application storage, which will be of no help to you whatsoever in this case. I will highlight how to do the first 4 items.
Allow user to select folder location & Show contents of folder
public void OnSelectPDF(object sender) {
    //create the open file dialog
    OpenFileDialog ofg = new OpenFileDialog();
    //filter to show only pdf files
    ofg.Filter = "PDF Files|*.pdf";
    ofg.ShowDialog(); // assumed: the call that shows the dialog is missing from the original snippet
    byte[] _import_file = new byte[0];
    //once a file is selected proceed
    if (!object.ReferenceEquals(ofg.File, null)) {
        FileStream fs = null;
        try {
            fs = ofg.File.OpenRead();
            _import_file = new byte[fs.Length];
            fs.Read(_import_file, 0, (int)fs.Length);
        }
        catch (Exception ex) { System.Diagnostics.Debug.WriteLine(ex.ToString()); }
        finally {
            if (!object.ReferenceEquals(fs, null))
                fs.Close();
        }
        //do stuff with file - such as upload the file to the server
    }
}
As you may have noticed in my example, once the file is retrieved I suggest uploading it to a web server or somewhere with temporary public access. I would recommend doing this via a web service, e.g.:
//configure the system file (custom class)
TSystemFile objFile = new TNetworkFile().Initialize();
//get the file description from the Open File Dialog (ofg)
objFile.Description = ofg.File.Extension.Contains(".") ? ofg.File.Extension : "." + ofg.File.Extension;
objFile.FileData = _import_file;
objFile.FileName = ofg.File.Name;
//upload the file
Once the file is uploaded, on the async result (most likely returning the temporary file name and upload location), I would forward the call to a JavaScript method in the browser so that it can use the generic "download.aspx?fileName=givenFileName" technique to force a download on the user's system. That takes care of both saving to a new location and printing, which is what you are seeking.
Example of the javascript technique (remember to include System.Windows.Browser):
public void OnInvokeDownload(string _destination) {
    try {
        //call the browser method/jquery method
        //(I use constants to centralize the names of the respective browser methods)
        HtmlWindow window = HtmlPage.Window;
        //where BM_INVOKE_DOWNLOAD is something like "invokeDownload"
        window.Invoke(Constants.TBrowserMethods.BM_INVOKE_DOWNLOAD, new object[] { _destination});
    }
    catch (Exception ex) { System.Diagnostics.Debug.WriteLine(ex.ToString()); }
}
Ensure you have the javascript method existing either in an included javaScript file or in the same hosting page as your silverlight app. E.g:
function invokeDownload(_destination) {
    //some fancy jquery or just the traditional document.location change here
    //open a popup window to fileName=_destination
}
The code for download.aspx is outside the scope of my answer, as it varies per need and would lengthen this post a lot. What I've given will "work" for what you're looking for, though maybe not in exactly the way you expected. Remember that this is primarily due to Silverlight restrictions. Rather than forcing you to use a plugin to view PDF files in your app, this approach lets the user's computer play its part by using the existing Adobe PDF reader. In Silverlight, most printing, at least to my knowledge, is done using what is called an "ImageVisual", which is a UIElement. To print a PDF directly from Silverlight, you need to either be viewing that PDF in a Silverlight control, or ask a web service to render the PDF as an image and then place that image in a control. Only then could you print directly. I presented this approach as a much cleaner and more direct one.
One note: with the temp directory, I would recommend cleaning up files older than some timespan on the server side every time a file is added. That saves you the work of running a periodic task to check the folder and remove old files. ;)
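The clean-up-on-add idea can be sketched as a single method that is called right after each upload is stored. The answer's server side would be .NET; the sketch below is in Java purely for illustration, and TempFolderCleaner and purgeOlderThan are made-up names.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;
import java.time.Instant;

class TempFolderCleaner {

    /** Deletes regular files in dir that were last modified more than maxAge ago; returns how many were removed. */
    static int purgeOlderThan(Path dir, Duration maxAge) {
        Instant cutoff = Instant.now().minus(maxAge);
        int removed = 0;
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dir)) {
            for (Path file : files) {
                if (Files.isRegularFile(file)
                        && Files.getLastModifiedTime(file).toInstant().isBefore(cutoff)) {
                    Files.delete(file);
                    removed++;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return removed;
    }
}
```

Calling something like purgeOlderThan(tempDir, Duration.ofHours(1)) from the upload handler keeps the folder bounded without a scheduled job; the one-hour value is only an example and should be longer than the time a user needs to fetch the file.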

Provide a database packaged with the .APK file or host it separately on a website?

Here is some background about my app:
I am developing an Android app that will display a random quote or verse to the user. For this I am using an SQLite database. The size of the DB would be approximately 5K to 10K records, possibly increasing to up to 1M in later versions as new quotes and verses are added. Thus the user would need to update the DB as and when newer versions of the app or DB are released.
After reading through some forums online, there seem to be two feasible ways I could provide the DB:
1. Bundle it along with the .APK file of the app, or
2. Upload it to my app's website from where users will have to download it
I want to know which method would be better (if there is yet another approach other than these, please do let me know).
After pondering this problem for some time, I have these thoughts regarding the above approaches:
Approach 1:
Users will obtain the DB along with the app, and won't have to download it separately. Installation would thereby be easier. But, users will have to reinstall the app every time there is a new version of the DB. Also, if the DB is large, it will make the installable too cumbersome.
Approach 2:
Users will have to download the full DB from the website (although I can provide a small, sample version of the DB via Approach 1). But, the installer will be simpler and smaller in size. Also, I would be able to provide future versions of the DB easily for those who might not want newer versions of the app.
Could you please tell me from a technical and an administrative standpoint which approach would be the better one and why?
If there is a third or fourth approach better than either of these, please let me know.
Thank you!
I built a similar app for Android which gets periodic updates with data from a government agency. It's fairly easy to build an Android compatible db off the device using perl or similar and download it to the phone from a website; and this works rather well, plus the user gets current data whenever they download the app. It's also supposed to be possible to throw the data onto the sdcard if you want to avoid using primary data storage space, which is a bigger concern for my app which has a ~6Mb database.
In order to make Android happy with the DB, I believe you have to do the following (I build my DB using perl).
$st = $db->prepare( "CREATE TABLE \"android_metadata\" (\"locale\" TEXT DEFAULT 'en_US')");
$st = $db->prepare( "INSERT INTO \"android_metadata\" VALUES ('en_US')");
I have an update activity which checks whether updates are available and, if so, presents an "update now" screen. The download process looks like this and lives in a DatabaseHelperClass.
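public void downloadUpdate(final Handler handler, final UpdateActivity updateActivity) {
    try {
        final File f = new File(getDatabasePath());
        if (f.exists()) {
            // the body of this branch is missing from the original post
        }
        final URL url = new URL("" + currentDbVersion + ".sqlite");
        final URLConnection urlconn = url.openConnection();
        final int contentLength = urlconn.getContentLength();
        Log.i(TAG, String.format("Download size %d", contentLength));
        // assumed: the wrapper around each Runnable was lost in the original post; a background
        // thread for the transfer and handler.post(...) for UI updates fit the method signature
        new Thread(new Runnable() {
            public void run() {
                try {
                    InputStream is = urlconn.getInputStream();
                    // Open the empty db as the output stream
                    OutputStream os = new FileOutputStream(f);
                    // transfer bytes from the inputfile to the outputfile
                    byte[] buffer = new byte[1024 * 1000];
                    int written = 0;
                    int length = 0;
                    while (written < contentLength) {
                        length = is.read(buffer);
                        if (length == -1) {
                            break; // stream ended early; avoids write() with a negative length
                        }
                        os.write(buffer, 0, length);
                        written += length;
                        final int currentprogress = written;
                        handler.post(new Runnable() {
                            public void run() {
                                Log.i(TAG, String.format("progress %d", currentprogress));
                            }
                        });
                    }
                    // Close the streams
                    os.flush();
                    os.close();
                    is.close();
                    Log.i(TAG, "Download complete");
                } catch (Exception e) {
                    Log.e(TAG, "bad things", e);
                }
                handler.post(new Runnable() {
                    public void run() {
                        // body missing from the original post
                    }
                });
            }
        }).start();
    } catch (Exception e) {
        Log.e(TAG, "bad things", e);
    }
}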
Also note that I keep a version number in the filename of the db files, and a pointer to the current one in a text file on the server.
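The core of the download, copying a stream while reporting progress, can be tried outside Android with plain Java streams. This is a sketch under assumptions: ProgressCopy, versionedName and copy are made-up names, the progress callback stands in for the Handler, and the file-name scheme is only a guess at "version number in the filename".

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.function.IntConsumer;

class ProgressCopy {

    /** File name for a given database version, with the version number embedded in the name. */
    static String versionedName(String baseName, int version) {
        return baseName + version + ".sqlite";
    }

    /** Copies in to out, reporting percent complete; returns the number of bytes copied. */
    static long copy(InputStream in, OutputStream out, long expectedLength, IntConsumer onProgress) {
        byte[] buffer = new byte[8192];
        long written = 0;
        try {
            int length;
            // read() returns -1 at end of stream, so the loop also ends if the
            // server sends fewer bytes than the announced content length
            while ((length = in.read(buffer)) != -1) {
                out.write(buffer, 0, length);
                written += length;
                if (expectedLength > 0) {
                    onProgress.accept((int) Math.min(100, written * 100 / expectedLength));
                }
            }
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return written;
    }
}
```

Looping until read() returns -1, rather than until a byte count is reached, keeps the copy correct when the content length is unknown (reported as -1) or wrong.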
It sounds like your app and your db are tightly bound (that is, the app is useless without the database and the database is useless without the app), so I'd say go ahead and put them both in the same .apk.
That being said, if you expect the db to change very slowly over time, but the app to change quicker, and you don't want your users to have to download the db with each new app revision, then you might want to unbundle them. To make this work, you can do one of two things:
Install them as separate applications, but make sure they share the same userID using the sharedUserId tag in the AndroidManifest.xml file.
Install them as separate applications, and create a ContentProvider for the database. This way other apps could make use of your database as well (if that is useful).
If you are going to store the DB on your website, then I would recommend that you just make RPC calls to your web server and get the data that way, so the device never has to deal with a local database. Using a cache manager to avoid repeated lookups will help as well, so pages do not have to look up data each time they reload. Also, if you need to update the data, you do not have to send out a new app every time. Using HttpClient is pretty straightforward; if you need any examples please let me know.
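One possible shape for such a cache manager is a small least-recently-used cache in front of the remote call. This is a sketch, not part of any Android API: LookupCache is a made-up name, and the remoteLookup function stands in for whatever RPC or HTTP call fetches a quote from the server.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

class LookupCache<K, V> {
    private final Function<K, V> remoteLookup; // e.g. the call to the web server
    private final Map<K, V> entries;

    LookupCache(final int maxEntries, Function<K, V> remoteLookup) {
        this.remoteLookup = remoteLookup;
        // access-ordered LinkedHashMap that evicts the least recently used entry
        this.entries = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries;
            }
        };
    }

    /** Returns the cached value, calling the remote lookup only on a miss. */
    synchronized V get(K key) {
        V value = entries.get(key);
        if (value == null) {
            value = remoteLookup.apply(key);
            entries.put(key, value);
        }
        return value;
    }
}
```

Bounding the cache size matters on a phone: with up to a million records on the server, caching everything that was ever looked up would defeat the purpose of not storing the database locally.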
