Pause directory until after unzipping is complete - apache-camel

I have a ZIP file containing multiple files of about 1 GB each. Unzipping is done by one route (Route1) that polls the download directory and saves files to the extract directory. Another route (Route2) polls the extract directory to process the files.
These files are supposed to be processed in a certain sequence (Route2 uses sortBy when getting the files). Route2 immediately picks up files that are still being unzipped, before all the files are available.
How can I pause Route2 from processing until after Route1 is done unzipping the files?

Can you write a done-file in your first route once the unzip process has finished and then use the "doneFileName" URI option of the File2 component in your second route?
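For illustration, a minimal sketch of that idea in the Java DSL; the directory names, the marker file name, and the UnzipProcessor are placeholders for your actual setup:
// Route1: unzip each archive, then drop an empty "done" marker into the extract directory
from("file:download?include=.*\\.zip")
    .process(new UnzipProcessor())            // placeholder for your unzip logic
    .setBody(constant(""))
    .to("file:extract?fileName=done");
// Route2: the doneFileName option makes the consumer wait until the marker file exists
from("file:extract?doneFileName=done&sortBy=file:name")
    .to("direct:processFile");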

Take a look at the readLock and readLockCheckInterval parameters of the File2 component.
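For example (a sketch; the directory name and timing values are placeholders), readLock=changed makes the consumer wait until a file's length has stopped changing before picking it up, re-checking every readLockCheckInterval milliseconds:
from("file:extract?readLock=changed&readLockCheckInterval=1000&readLockTimeout=60000&sortBy=file:name")
    .to("direct:processFile");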

You can pick up one file at a time in Route1 with maxMessagesPerPoll=1, then use the Control Bus component to stop the route, and then start it again from the other route.
http://camel.apache.org/controlbus.html
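A rough sketch of that approach (routeId values, directory names, and the unzip processor are placeholders):
// Route1: handle one archive per poll, then stop itself via the control bus
from("file:download?include=.*\\.zip&maxMessagesPerPoll=1").routeId("route1")
    .process(new UnzipProcessor())            // placeholder for your unzip logic
    .to("controlbus:route?routeId=route1&action=stop&async=true");  // async, because the route stops itself
// Route2: process the extracted files, then start Route1 again for the next archive
from("file:extract?sortBy=file:name").routeId("route2")
    .to("direct:processFile")
    .to("controlbus:route?routeId=route1&action=start&async=true");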

Related

How to download full directories off apache server via just a URL

I have searched all around and the only results I get are using wget or curl.
I would like to be able to download full directories being served on my Apache server using just a URL. For example, if I want to download the directory Contents, I would like to be able to do this: http://127.0.0.1/Contents. Instead, when I do this, I just get a page listing all the files inside it, rather than actually downloading the directory.
Is this not possible, or do I just need to configure it in apache2.conf?
Worked it out: in order to download a folder/directory from a URL, compress the folder into a .zip file (or a similar archive format), then request it by URL: http://127.0.0.1/yourfolder.zip

Is there a file size limit to move a file over SFTP once processed by camel?

A file size of 9 MB works as expected, but a file of 15 MB does not get moved.
I am using:
sftp://{{sftp.user}}@{{sftp.server}}/?username={{user}}&password={{sftp.pass}}&binary=true&passiveMode=true&move=processed/
You could try to use the localWorkDirectory option of the Camel FTP component. This downloads the files to the configured directory first.
sftp://{{sftp.user}}@{{sftp.server}}/?username={{user}}&password={{sftp.pass}}&binary=true&passiveMode=true&move=processed/&localWorkDirectory=/tmp

How to upload folders to Google Colab?

I want to run a notebook that uses many header files defined in a directory, so basically I want to upload the entire directory to Google Colab to run the notebook. But I cannot find any such option: I am only able to upload files, not complete folders. Can someone tell me how to upload an entire directory to Google Colab?
I suggest not uploading them directly to Colab, since you will lose them whenever the runtime restarts (you just need to re-upload them, but that can be an issue on slow connections).
Instead, use the google.colab package to manage files and folders in Colab. Upload everything you need to your Google Drive, then mount it:
from google.colab import drive
drive.mount('/content/gdrive')
This way you just need to log in to your Google account through the Google authentication API, and you can use files/folders as if they had been uploaded to Colab.
EDIT May 2022:
As pointed out in the comments, using Google Drive as storage for a large number of files to train a model is painfully slow, as described here: Google Colab is very slow compared to my PC. The better solution in this case is to zip the files, upload them to Colab, and then unzip them using
!unzip file.zip
More unzip options here: https://linux.die.net/man/1/unzip
You can zip them, upload the archive, then unzip it:
!unzip file.zip
The easiest way to do this, if the folder/file is on your local drive:
1. Compress the folder into a ZIP file.
2. Upload the zipped file to Colab using the upload button in the Files section (yes, there is a Files section; see the left side of the Colab screen).
3. Use this code to extract the file. Note: the file path comes from Colab's Files section.
from zipfile import ZipFile

file_name = '/content/your_folder.zip'  # path copied from Colab's Files section
with ZipFile(file_name, 'r') as zip:
    zip.extractall()
    print('Done')
4. Click Refresh in the Colab Files section.
5. Access the files in your folder through their file paths.
Downside: the files will be deleted when the runtime ends.
You can reuse part of these steps if your file is on Google Drive: just upload the zipped file to Colab from Google Drive instead.
You can create a Git repository, push the files and folders to it, and then clone the repository in Colab with the command
!git clone https://github.com/{username}/{projectname}.git
I feel this method is faster.
But if a file is larger than 100 MB, you will have to zip it or use Git Large File Storage to push it to GitHub.
For more information, refer to the link below:
https://help.github.com/en/github/managing-large-files/configuring-git-large-file-storage
The best way to approach this problem is simple, yet sometimes tricky.
First compress the folder into a zip file and upload the archive to your Google Drive.
While doing so, make sure the archive is in the root directory of the drive and not in a subfolder; if it is in a subfolder, you can easily move it to the root. An archive sitting in a subfolder often messes with the unzipping step, when you have to specify the file location.
Once you have done that, enter the following commands in Colab to mount your drive:
from google.colab import drive
drive.mount('/content/gdrive')
This will ask for an access token, which can be generated by clicking the URL displayed in the output of the same cell.
!ls gdrive/MyDrive
Check the contents of the drive by executing the above command and make sure your folder/data appears in the output. Then unzip:
!unzip gdrive/MyDrive/<File_name_without_space>.zip
e.g.:
!unzip gdrive/MyDrive/data_folder.zip
Executing this will unzip your folder into the runtime's local storage.
Congrats! You have successfully uploaded your folder/data to Colab.
Zip your files with zip -r file.zip your_folder and then:
from google.colab import files
from zipfile import ZipFile

# files.upload() saves the uploaded files to the current directory
# and returns a dict of {filename: file contents}
uploaded = files.upload()
file_name = next(iter(uploaded))
with ZipFile(file_name, 'r') as zip:
    zip.extractall()
    print('Done')
So here's what you can do:
- Upload the dataset (the desired folder) to your Drive.
- In Colab, mount the drive; the snippet
from google.colab import drive
drive.mount('/content/gdrive')
shows up automatically, and you just need to run it.
- Then check for your folder in the Files section on the left-hand side (if the folder is not visible, try refreshing; there should also be a drop-down arrow next to it where you can check all the files under the folder).
- Click the folder to get a Copy path option.
- Paste the copied path wherever you need it in your Colab notebook.

How to use Camel consumer template to consume all files in a directory

I need to read all files from a directory and process each file using a ConsumerTemplate, something like below:
consumerTemplate.receiveBody(directory, File.class)
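One way this might look (a sketch, not a verified answer): receiveBody has an overload that takes a timeout, so you can poll in a loop until the directory is drained. The endpoint URI and the process() call are placeholders:
String uri = "file:/path/to/directory?noop=true";
while (true) {
    // returns null if no file arrives within the timeout, i.e. the directory is drained
    File file = consumerTemplate.receiveBody(uri, 1000, File.class);
    if (file == null) {
        break;
    }
    process(file);    // placeholder for your per-file processing
}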

encfs does not mount folder

I have installed encfs to encrypt my directories and files, and everything worked well when I installed and configured it. I created two folders, /XDM/game and /Private; the way encfs works, I copy my files into the /Private folder and the encrypted versions of those files are synced to /XDM/game. But when I restart my PC I have to remount the /Private folder, for which I am using the following command:
encfs ~/XDM/game ~/Private
But it returns this error:
encfs ~/XDM/game ~/Private
EncFS Password:
fuse: mountpoint is not empty
fuse: if you are sure this is safe, use the 'nonempty' mount option
fuse failed. Common problems:
- fuse kernel module not installed (modprobe fuse)
- invalid options -- see usage message
Please help me with this error. Thank you in advance!
The short answer is: in order to mount the directory, you need to remove the files from the /Private folder and try mounting again with
encfs ~/XDM/game ~/Private
The long answer: after you have configured encfs, when you restart the computer the /Private folder appears empty, and to show the files you have to mount it. But if you copy any file into the /Private folder and then try to mount, it will give you this error. So first delete the files you pasted/created after restarting the computer, and then mount. It should work!
I have written a detailed article to help you understand better.
http://www.linuxandubuntu.com/home/how-to-encrypt-cloud-storage-files-folders-in-linux
I was stuck in a similar scenario once when I accidentally tried to write to the encrypted directory. I solved it by doing
encfs -o nonempty <encryptedDirPath> <decryptedDirMountPoint>
There are some repercussions to using the 'nonempty' option though. More here.
If you are not able to unmount the decrypted directory after doing the above, you might need to delete the entry for your stash from Gnome Encfs Manager, in case you are using it.
I was having the same problem, where it says that the mountpoint is busy or not empty, so I created another directory. In your case you can do mkdir ~/Private2 and then simply treat this as your new mountpoint:
encfs ~/XDM/game ~/Private2
