assign two bash arrays with a single command - arrays

youtube-dl can take some time parsing remote sites when called multiple times.
EDIT0 : I want to fetch multiple properties (here fileNames and remoteFileSizes) output by youtube-dl without having to run it multiple times.
I use those 2 properties to compare the local file size and ${remoteFileSizes[$i]} to tell if the file is finished downloading.
$ youtube-dl --restrict-filenames -o "%(title)s__%(format_id)s__%(id)s.%(ext)s" -f m4a,18,webm,251 -s -j https://www.youtube.com/watch?v=UnZbjvyzteo 2>errors_youtube-dl.log | jq -r ._filename,.filesize | paste - - > input_data.txt
$ cat input_data.txt
Alan_Jackson_-_I_Want_To_Stroll_Over_Heaven_With_You_Live__18__UnZbjvyzteo__youtube_com.mp4 8419513
Alan_Jackson_-_I_Want_To_Stroll_Over_Heaven_With_You_Live__250__UnZbjvyzteo__youtube_com.webm 1528955
Alan_Jackson_-_I_Want_To_Stroll_Over_Heaven_With_You_Live__140__UnZbjvyzteo__youtube_com.m4a 2797366
Alan_Jackson_-_I_Want_To_Stroll_Over_Heaven_With_You_Live__244__UnZbjvyzteo__youtube_com.webm 8171725
I want the first column in the fileNames array and the second column in the remoteFileSizes.
For the time being, I use a while read loop, but when this loop is finished my two arrays are lost :
$ fileNames=()
$ remoteFileSizes=()
$ cat input_data.txt | while read fileName remoteFileSize; do \
fileNames+=($fileName); \
remoteFileSizes+=($remoteFileSize); \
done
$ for fileNames in "${fileNames[#]}"; do \
echo PROCESSING....; \
done
$ echo "=> fileNames[0] = ${fileNames[0]}"
=> fileNames[0] =
$ echo "=> remoteFileSizes[0] = ${remoteFileSizes[0]}"
=> remoteFileSizes[0] =
$
Is it possible to assign two bash arrays with a single command ?

You assign variables in a subshell, so they are not visible in the parent shell. Read https://mywiki.wooledge.org/BashFAQ/024 . Remove the cat and do a redirection to solve your problem.
while IFS=$'\t' read -r fileName remoteFileSize; do
fileNames+=("$fileName")
remoteFileSizes+=("$remoteFileSize")
done < input_data.txt
You might also interest yourself in https://mywiki.wooledge.org/BashFAQ/001.

For what it's worth, if you're looking for specific/bespoke functionality from youtube-dl, I recommend creating your own python scripts using the 'embedded' approach: https://github.com/ytdl-org/youtube-dl/blob/master/README.md#embedding-youtube-dl
You can set your own signal for when a download is finished (text/chime/mail/whatever) and track downloads without having to compare file sizes.

Related

Using array from an output file in bash shell

am in a need of using an array to set the variable value for further manipulations from the output file.
scenario:
> 1. fetch the list from database
> 2. trim the column using sed to a file named x.txt (got specific value as that is required)
> 3. this file x.txt has the below output as
10000
20000
30000
> 4. I need to set a variable and assign the above values to it.
A=10000
B=20000
C=30000
> 5. I can invoke this variable A,B,C for further manipulations.
Please let me know how to define an array assigning to its variable from the output file.
Thanks.
In bash (starting from version 4.x) and you can use the mapfile command:
mapfile -t myArray < file.txt
see https://stackoverflow.com/a/30988704/10622916
or another answer for older bash versions: https://stackoverflow.com/a/46225812/10622916
I am not a big proponent of using arrays in bash (if your code is complex enough to need an array, it's complex enough to need a more robust language), but you can do:
$ unset a
$ unset i
$ declare -a a
$ while read line; do a[$((i++))]="$line"; done < x.txt
(I've left the interactive prompt in place. Remove the leading $
if you want to put this in a script.)

Trouble with AWK'd command output and bash array

I am attempting to get a list of running VirtualBox VMs (the UUIDs) and put them into an array. The command below produces the output below:
$ VBoxManage list runningvms | awk -F '[{}]' '{print $(NF-1)}'
f93c17ca-ab1b-4ba2-95e5-a1b0c8d70d2a
46b285c3-cabd-4fbb-92fe-c7940e0c6a3f
83f4789a-b55b-4a50-a52f-dbd929bdfe12
4d1589ba-9153-489a-947a-df3cf4f81c69
I would like to take those UUIDs and put them into an array (possibly even an associative array for later use, but a simple array for now is sufficient)
If I do the following:
array1="( $(VBoxManage list runningvms | awk -F '[{}]' '{print $(NF-1)}') )"
The commands
array1_len=${#array1[#]}
echo $array1_len
Outputs "1" as in there's only 1 element. If I print out the elements:
echo ${array1[*]}
I get a single line of all the UUIDs
( f93c17ca-ab1b-4ba2-95e5-a1b0c8d70d2a 46b285c3-cabd-4fbb-92fe-c7940e0c6a3f 83f4789a-b55b-4a50-a52f-dbd929bdfe12 4d1589ba-9153-489a-947a-df3cf4f81c69 )
I did some research (Bash Guide/Arrays on how to tackle this and found this with command substitution and redirection, but it produces an empty array
while read -r -d '\0'; do
array2+=("$REPLY")
done < <(VBoxManage list runningvms | awk -F '[{}]' '{print $(NF-1)}')
I'm obviously missing something. I've looked at several simiar questions on this site such as:
Reading output of command into array in Bash
AWK output to bash Array
Creating an Array in Bash with Quoted Entries from Command Output
Unfortunately, none have helped. I would apprecaite any assistance in figuring out how to take the output and assign it to an array.
I am running this on macOS 10.11.6 (El Captain) and BASH version 3.2.57
Since you're on a Mac:
brew install bash
Then with this bash as your shell, pipe the output to:
readarray -t array1
Of the -t option, the man page says:
-t Remove a trailing delim (default newline) from each line read.
If the bash4 solution is admissible, then the advice given
e.g. by gniourf_gniourf at reading-output-of-command-into-array-in-bash
is still sound.

How do I store the output from a find command in an array? + bash

I have the following find command with the following output:
$ find -name '*.jpg'
./public_html/github/screencasts-gh-pages/reactiveDataVis/presentation/images/telescope.jpg
./public_html/github/screencasts-gh-pages/introToBackbone/presentation/images/telescope.jpg
./public_html/github/StarCraft-master/img/Maps/(6)Thin Ice.jpg
./public_html/github/StarCraft-master/img/Maps/Snapshot.jpg
./public_html/github/StarCraft-master/img/Maps/Map_Grass.jpg
./public_html/github/StarCraft-master/img/Maps/(8)TheHunters.jpg
./public_html/github/StarCraft-master/img/Maps/(2)Volcanis.jpg
./public_html/github/StarCraft-master/img/Maps/(3)Trench wars.jpg
./public_html/github/StarCraft-master/img/Maps/(8)BigGameHunters.jpg
./public_html/github/StarCraft-master/img/Maps/(8)Turbo.jpg
./public_html/github/StarCraft-master/img/Maps/(4)Blood Bath.jpg
./public_html/github/StarCraft-master/img/Maps/(2)Switchback.jpg
./public_html/github/StarCraft-master/img/Maps/Original/(6)Thin Ice.jpg
./public_html/github/StarCraft-master/img/Maps/Original/Map_Grass.jpg
./public_html/github/StarCraft-master/img/Maps/Original/(8)TheHunters.jpg
./public_html/github/StarCraft-master/img/Maps/Original/(2)Volcanis.jpg
./public_html/github/StarCraft-master/img/Maps/Original/(3)Trench wars.jpg
./public_html/github/StarCraft-master/img/Maps/Original/(8)BigGameHunters.jpg
./public_html/github/StarCraft-master/img/Maps/Original/(8)Turbo.jpg
./public_html/github/StarCraft-master/img/Maps/Original/(4)Blood Bath.jpg
./public_html/github/StarCraft-master/img/Maps/Original/(2)Switchback.jpg
./public_html/github/StarCraft-master/img/Maps/Original/(4)Orbital Relay.jpg
./public_html/github/StarCraft-master/img/Maps/(4)Orbital Relay.jpg
./public_html/github/StarCraft-master/img/Bg/GameLose.jpg
./public_html/github/StarCraft-master/img/Bg/GameWin.jpg
./public_html/github/StarCraft-master/img/Bg/GameStart.jpg
./public_html/github/StarCraft-master/img/Bg/GamePlay.jpg
./public_html/github/StarCraft-master/img/Demo/Demo.jpg
./public_html/github/flot/examples/image/hs-2004-27-a-large-web.jpg
./public_html/github/minicourse-ajax-project/other/GameLose.jpg
How do I store this output in an array? I want it to handle filenames with spaces
I have tried this arrayname=($(find -name '*.jpg')) but this just stores the first element. # I am doing the following which seems to be just the first element?
$ arrayname=($(find -name '*.jpg'))
$ echo "$arrayname"
./public_html/github/screencasts-gh-pages/reactiveDataVis/presentation/images/telescope.jpg
$
I have tried here but again this just stores the 1st element
Other similar Qs
How do I capture the output from the ls or find command to store all file names in an array?
How do i store the output of a bash command in a variable?
If you know with certainty that your filenames will not contain newlines, then
mapfile -t arrayname < <(find ...)
If you want to be able to handle any file
arrayname=()
while IFS= read -d '' -r filename; do
arrayname+=("$filename")
done < <(find ... -print0)
echo "$arrayname" will only show the first element of the array. It is equivalent to echo "${arrayname[0]}". To dump an array:
printf "%s\n" "${arrayname[#]}"
# ............^^^^^^^^^^^^^^^^^ must use exactly this form, with the quotes.
arrayname=($(find ...)) is still wrong. It will store the file ./file with spaces.txt as 3 separate elements in the array.
If you have a sufficiently recent version of bash, you can save yourself a lot of trouble by just using a ** glob.
shopt -s globstar
files=(**/*.jpg)
The first line enables the feature. Once enabled, ** in a glob pattern will match any number (including 0) of directories in the path.
Using the glob in the array definition makes sure that whitespace is handled correctly.
To view an array in a form which could be used to define the array, use the -p (print) option to the declare builtin:
declare -p files

How can I read the files from a tar command into an array using BASH?

Using a BASH script, how can I execute a tar command and read the output (the list of files) into an array?
I've tried a number of things, including:
fileNames=`tar xvfz ${compressedFile} | readarray`
tar xvfz ${compressedFile} | readarray fileNames
readarray fileNames < tar xvfz ${compressedFile}
files=`tar xvfz ${compressedFile}'
readarray fileNames < ${files}
readarray fileNames < `echo ${files}`
I've tried with and without the backquote (grave accent) and I've tried using t as an option on the tar.gz file. I was trying to accomplish this without sending the output to a file and then reading the file, but I guess that's my fall-back plan (though I haven't tried it yet).
The closest I've come is:
echo ${fileNames} | readarray fileNames
but this creates only 1 item in the array and it contains all of the file names. Argh. :)
Anyone know how to accomplish this (seemingly) easy task?
V
This should create array with contents of tar file:
IFS=$'\n' declare -a fileNames=($(tar tzf "$compressedFile"))
This is assuming your file names inside tar file don't have newlines.
There is nothing wrong with using readarray (now called mapfile), but you need to execute the command in the current shell, not a subshell. So
tar tzf "$compressedFile" | readarray -t fileNames
won't work, because readarray is being executed in a subshell, as a result of the pipe. Instead you need to redirect stdin from process substitution:
readarray -t filenames < <(tar tzf "$compressedFile")
Note the -t flag to readarray. Without it, the values in the array would contain trailing newlines.

Need bash to separate cat'ed string to separate variables and do a for loop

I need to get a list of files added to a master folder and copy only the new files to the respective backup folders; The paths to each folder have multiple folders, all named by numbers and only 1 level deep.
ie /tester/a/100
/tester/a/101 ...
diff -r returns typically "Only in /testing/a/101: 2093_thumb.png" per line in the diff.txt file generated.
NOTE: there is a space after the colon
I need to get the 101 from the path and filename into separate variables and copy them to the backup folders.
I need to get the lesserfolder var to get 101 without the colon
and mainfile var to get 2093_thumb.png from each line of the diff.txt and do the for loop but I can't seem to get the $file to behave. Each time I try testing to echo the variables I get all the wrong results.
#!/bin/bash
diff_file=/tester/diff.txt
mainfolder=/testing/a
bacfolder= /testing/b
diff -r $mainfolder $bacfolder > $diff_file
LIST=`cat $diff_file`
for file in $LIST
do
maindir=$file[3]
lesserfolder=
mainfile=$file[4]
# cp $mainfolder/$lesserFolder/$mainfile $bacfolder/$lesserFolder/$mainfile
echo $maindir $mainfile $lesserfolder
done
If I could just get the echo statement working the cp would work then too.
I believe this is what you want:
#!/bin/bash
diff_file=/tester/diff.txt
mainfolder=/testing/a
bacfolder= /testing/b
diff -r -q $mainfolder $bacfolder | egrep "^Only in ${mainfolder}" | awk '{print $3,$4}' > $diff_file
cat ${diff_file} | while read foldercolon mainfile ; do
folderpath=${foldercolon%:}
lesserFolder=${folderpath#${mainfolder}/}
cp $mainfolder/$lesserFolder/$mainfile $bacfolder/$lesserFolder/$mainfile
done
But it is much more reliable (and much easier!) to use rsync for this kind of backup. For example:
rsync -a /testing/a/* /testing/b/
You could try a while read loop
diff -r $mainfolder $bacfolder | while read dummy dummy dir file; do
echo $dir $file
done

Resources