Untarring a 100GB dataset file with !tar -xzvf

I'm trying to untar a 100GB dataset using !tar -xzvf. I ran the command yesterday on Colab Pro+ (I'm using the dataset for ML), and the file is still untarring. Is there a faster way to untar the file?
I then need to run a Python script that reads the JSON files in the extracted folders.
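If the end goal is only to read the JSON files, one possible alternative (an assumption, not something confirmed for this dataset) is to stream them straight out of the archive with Python's standard tarfile module instead of extracting everything to disk first. The sketch below assumes the archive is gzip-compressed and that the files of interest end in .json; the function name iter_json_from_tar is made up for illustration. Whether this is actually faster on Colab depends on the archive and the storage behind it, so treat it as something to try rather than a guaranteed fix.

```python
import json
import tarfile
from typing import Any, Iterator, Tuple


def iter_json_from_tar(archive_path: str) -> Iterator[Tuple[str, Any]]:
    """Yield (member_name, parsed_json) for every .json file in a .tar.gz.

    Mode "r|gz" reads the archive as a forward-only stream, so members are
    decompressed one at a time and nothing is written to disk.
    """
    with tarfile.open(archive_path, mode="r|gz") as archive:
        for member in archive:
            # Skip directories, links and non-JSON files.
            if not (member.isfile() and member.name.endswith(".json")):
                continue
            handle = archive.extractfile(member)
            if handle is None:
                continue
            with handle:
                # The member must be read before the loop advances,
                # because a stream cannot seek backwards.
                yield member.name, json.load(handle)
```

A loop such as `for name, record in iter_json_from_tar("dataset.tar.gz"): ...` would then process each JSON file as it is decompressed; the archive name here is a placeholder.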

Related

'mongo' is still not working on PowerShell after doing all recommended things

I installed MongoDB and tried to run it in the terminal. It just shows: 'mongo' is not recognized as an internal or external command, operable program or batch file.
I have added the bin folder to the Path environment variable too. One thing I noticed is that the bin folder might be missing a file, namely mongo: it contains only mongod and mongos. I tried uninstalling and reinstalling the program, and it still did not work.
I have no idea what I'm missing. Please help.
I finally found the solution:
The mongo shell no longer ships with the server binaries. It can be downloaded separately from the MongoDB Shell Download page.
Then extract the contents of the bin folder in the downloaded zip file into the bin folder of the MongoDB installation, and run mongosh instead of mongo in the terminal.
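To confirm which shell, if any, is reachable after changing the Path, a small check like the following may help. It is only a sketch using Python's standard shutil.which; the function name find_mongo_shell and the injectable which parameter are illustrative choices, not part of MongoDB.

```python
import shutil
from typing import Callable, Optional


def find_mongo_shell(
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return the path of the first MongoDB shell found on PATH, or None.

    mongosh is checked first because, per the answer above, newer server
    packages no longer include the legacy mongo shell.
    """
    for name in ("mongosh", "mongo"):
        path = which(name)
        if path:
            return path
    return None
```

Calling `find_mongo_shell()` in a new terminal session returns None when neither executable is on the Path, which suggests the environment variable change has not taken effect or the shell has not been installed.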

'tar' is not recognized as an internal or external command

I am trying to extract a compressed file (.tar.gz) in Windows 10 using a batch script.
It's a simple command:
tar zxf "logstash-5.4.0.tar.gz"
ECHO "installed"
But I am getting the following error:
'tar' is not recognized as an internal or external command
I have read that I need to install tar, but how can I do that?
EDIT: Is tar preinstalled on Windows, or do we have to add it separately? Either way, how can I extract the archive without using a third-party tool?
You can download the TarTool application and place the executable in
C:\Windows\system32\
For example: C:\Windows\system32\tartool.exe
It then works like a built-in command. To extract a file, you can simply run
C:\>TarTool.exe D:\sample.tar.gz ./
For more commands, see the documentation for that tool.
Starting with Windows 10 build 17063, tar is a built-in tool and does not need to be installed separately. (MSDN link)
For example, to extract a file named XYZ.zip you can execute the following in Command Prompt.
tar -xvf XYZ.zip
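On machines where neither option is available but Python is installed (an assumption, since the question only mentions batch), the standard tarfile module can do the same extraction. This is a minimal sketch; the function name extract_tar_gz is hypothetical, and it should only be used on archives you trust, because extractall writes whatever paths the archive contains.

```python
import os
import tarfile
from typing import List


def extract_tar_gz(archive_path: str, dest_dir: str) -> List[str]:
    """Extract a gzip-compressed tar archive and return the member names."""
    os.makedirs(dest_dir, exist_ok=True)
    with tarfile.open(archive_path, mode="r:gz") as archive:
        names = archive.getnames()
        # Only extract archives from a trusted source.
        archive.extractall(dest_dir)
    return names
```

For the file in the question this would be called as `extract_tar_gz("logstash-5.4.0.tar.gz", "logstash")`, where the destination folder name is a placeholder.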

BIRT runtime 4.6.0 batch file not running

I'm trying to run a batch file on the latest version of BIRT, after upgrading from 3.7.1 to 4.6.0. The .bat file is exactly the same (other than the changed BIRT_HOME system variable).
The steps I took were as follows:
Downloaded 4.6.0 from an official mirror
Copied 2 jar files across into the BIRT_HOME/ReportEngine/lib folder. These jar files are jtds.jar and ojdbc6.jar so I can connect to an external database
Copied over my .bat file, report design file and report properties file
Edited the .bat file to give the correct location to BIRT_HOME
Executed the .bat file from command line
The error I get is:
Could not find or load main class org.eclipse.birt.report.engine.api.ReportRunner
The contents of my .bat file are:
@echo off
set BIRT_HOME=C:\birt-460\ReportEngine\
call %BIRT_HOME%genReport.bat -m runrender -o "output.PDF" -f PDF -F "reportproperties.properties" "reportproperties.rptDesign"
I can confirm that the following JAR file is present in my /lib folders: org.eclipse.birt.runtime_4.6.0-20160607.jar
The part I'm struggling with is that these steps work in 3.7.1 and 4.2.2, but not 4.6.0
Anyone got any ideas?
This is a bug in the 4.6.0 BIRT release.
As a workaround, simply remove the ECLIPSE_.RSA and ECLIPSE_.SF files from the META-INF/ folder in org.eclipse.birt.runtime_4.6.0-20160607.jar, which is in $BIRT_HOME/ReportEngine/lib/.
Refs: https://www.eclipse.org/forums/index.php/t/1086829/
This is fixed in the BIRT 4.9 runtime.
https://projects.eclipse.org/projects/technology.birt
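The workaround of removing the two signature files can be scripted, since a jar is a zip archive. The following is a sketch using Python's standard zipfile module, not an official BIRT tool; the function name strip_signature_files and the choice to keep a .bak copy next to the jar are assumptions made for illustration.

```python
import shutil
import zipfile
from typing import List

# Entry names taken from the workaround described above.
SIGNATURE_ENTRIES = {"META-INF/ECLIPSE_.RSA", "META-INF/ECLIPSE_.SF"}


def strip_signature_files(jar_path: str) -> List[str]:
    """Rewrite the jar without the signature entries; return removed names.

    A backup copy (<jar_path>.bak) is kept next to the original.
    """
    backup_path = jar_path + ".bak"
    shutil.copy2(jar_path, backup_path)
    removed = []
    with zipfile.ZipFile(backup_path, "r") as source, \
            zipfile.ZipFile(jar_path, "w", zipfile.ZIP_DEFLATED) as target:
        for entry in source.infolist():
            if entry.filename in SIGNATURE_ENTRIES:
                removed.append(entry.filename)
                continue
            # Copy every other entry unchanged, keeping its metadata.
            target.writestr(entry, source.read(entry.filename))
    return removed
```

It would be pointed at org.eclipse.birt.runtime_4.6.0-20160607.jar inside the ReportEngine/lib folder. If the returned list is empty, the jar did not contain those entries and the backup can simply be restored.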

Having issues unzipping a .tgz file

I'm trying to open a compressed file with a .tgz extension on the desktop. I tried opening it with Archive Utility and was told 'unable to expand into desktop' (error 1 - operation not permitted). When I try to unzip the file in Terminal, it says the file cannot be found.
No idea what's going on. Any clues?
The .tgz extension implies that the archive is a gzip-compressed tarball, so two steps were applied to the file: the contents were bundled with tar and the result was compressed with gzip. Is it possible that you are using a Windows utility that can unzip but not untar? If not, and you are on Linux, try opening the console and running the command: tar -zxvf <yourarchive>.tgz

Jenkins delete files from Workspace

I have a Jenkins job that copies a tar file from a Linux user folder, then copies a compiled binary file from another job and makes a new tar file. A Jenkins user can then copy that new tar file from the Jenkins workspace.
The job doesn't build anything or take files from SCM. After a while, the tar file is suddenly deleted from the workspace and I have to run the job again. How can I prevent that?
You really shouldn't rely on your workspace existing after a job has completed, as the workspace can be overwritten when another build starts, or lost when someone deletes a build, when a slave goes offline, etc.
Since you want to save the file for later use, you should use the "Archive the artifacts" option in your job's post-build configuration. If you enter **/*.tar, for example, Jenkins would save all TAR files at the end of the build.
Then you can use Jenkins' permalinks to access the artifacts, e.g.:
http://JENKINS/job/JOB_NAME/lastSuccessfulBuild/artifact/bin/my-app.tar
As the URL suggests, this would give you the file from the last successful build.
As a sidenote, if you then want to copy archived files to another build, the best way to do this is with the Copy Artifact plugin.
That way Jenkins handles the file copying for you, even across multiple Jenkins slaves, and you don't have to do anything nasty like hard-coding paths to other Jenkins workspaces.
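The permalink pattern shown above can be assembled programmatically when a script needs to fetch the archived file. This is a small sketch that follows the URL layout from the example; the function name is hypothetical, and it assumes a top-level job (jobs inside folders use a longer path that is not handled here).

```python
from urllib.parse import quote


def last_successful_artifact_url(
    jenkins_base: str, job_name: str, artifact_path: str
) -> str:
    """Build the lastSuccessfulBuild permalink for an archived artifact."""
    base = jenkins_base.rstrip("/")
    return (
        f"{base}/job/{quote(job_name, safe='')}"
        f"/lastSuccessfulBuild/artifact/{quote(artifact_path)}"
    )
```

With the values from the example, `last_successful_artifact_url("http://JENKINS", "JOB_NAME", "bin/my-app.tar")` produces the same URL as the one shown in the answer.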
