Sorting an array of pathnames (strings) [Bash] - arrays

I have seen way too many duplicates of this, but none of the answer codes or tips ever helped me, so I'm left confused.
input=/foo/bar/*;
#Contains something along the lines of
#/foo/bar/file1 /foo/bar/file2 /foo/bar/file3
#And I simply need
#/foo/bar/file3 /foo/bar/file2 /foo/bar/file1
output=($(for l in ${input[@]}; do echo $l; done | sort));
#Doesn't work, returns only the last entry from input
output=$(sort -nr ${input});
#Works, returns everything correctly reversed, but outputs the file contents and not the pathnames;
output=($(sort -nr ${input}));
#Outputs only the last entry and also its contents and not the pathname;
I tried many more options, but I'm not gonna fill this whole page with them, you get the gist.
Duplicates: (None of them helpful to me)
How can I sort the string array in linux bash shell?
How to sort an array in BASH
custom sort bash array
Sorting bash arguments alphabetically

You're confused about what an array is in bash: this does not declare an array:
input=/foo/bar/*
$input is just the string "/foo/bar/*" -- the list of files does not get expanded until you do something like for i in ${input[@]} where the "array" expansion is unquoted.
You want this:
input=( /foo/bar/* )
mapfile -t output < <(printf "%s\n" "${input[@]}" | sort -nr)
Here input=( /foo/bar/* ) expands the glob into a real array, one pathname per element. printf "%s\n" "${input[@]}" prints each pathname on its own line, sort -nr sorts those lines in descending order, and mapfile -t output reads them back into the output array (-t drops the trailing newlines).
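As a quick check, a minimal sketch of the same idea (the /foo/bar file names are hypothetical; plain sort -r is enough here since the goal is reverse name order):
input=( /foo/bar/* )
mapfile -t output < <(printf "%s\n" "${input[@]}" | sort -r)
declare -p output    # prints the array definition, one element per index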

You can use sort -r with printf, where input contains a glob string matching your filenames:
sort -r <(printf "%s\n" $input)
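If you need the result back in a bash array, one way is mapfile (a sketch; assumes bash 4+ and file names without embedded whitespace or newlines):
mapfile -t output < <(printf "%s\n" $input | sort -r)   # $input deliberately unquoted so the glob expands
printf "%s\n" "${output[@]}"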

This works:
input=/foo/bar/*
output=`for l in $input ; do echo $l ; done | sort -r`

Related

Creating an array of Strings from Grep Command

I'm pretty new to Linux and I've been trying some learning recently. One thing I'm struggling with: within a log file, I would like to grep for all the unique IDs that exist and store them in an array.
The format of the ids is like so: id=12345678,
I'm struggling though to get these into an array. So far I've tried a range of things, most recently the below,
a=($(grep -HR1 'id=^[0-9]' logfile))
echo ${#a[@]}
but the echo count is always returned as 0. So it is clear the populating of the array is not working. Have explored other pages online, but nothing seems to have a clear explanation of what I am looking for exactly.
a=($(grep -Eow 'id=[0-9]+' logfile))
a=("${a[@]#id=}")
printf '%s\n' "${a[@]}"
It's safe to split an unquoted command substitution here, as we aren't printing pathname expansion characters (*?[]), or whitespace (other than the new lines which delimit the list).
If this were not the case, mapfile -t a < <(grep ...) is a good alternative.
-E is extended regex (for +)
-o prints only matching text
-w matches a whole word only
${a[@]#id=} strips the id= prefix from each array element
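For instance, with a hypothetical two-line logfile the pipeline behaves like this (the file contents are made up):
printf 'request id=12345678, status=ok\nrequest id=87654321, status=ok\n' > logfile
a=($(grep -Eow 'id=[0-9]+' logfile))   # a=(id=12345678 id=87654321)
a=("${a[@]#id=}")                      # a=(12345678 87654321)
printf '%s\n' "${a[@]}"                # 12345678, then 87654321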
Here is an example
my_array=()
while IFS= read -r line; do
my_array+=( "$line" )
done < <( ls )
echo ${#my_array[@]}
printf '%s\n' "${my_array[@]}"
It prints out 14 and then the names of the 14 files in the same folder. Just substitute your command for ls and you're started.
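Applied to the question, the same loop might look like this (a sketch reusing the grep -Eow pattern shown earlier rather than the original regex):
my_array=()
while IFS= read -r line; do
my_array+=( "${line#id=}" )   # strip the id= prefix as each match is read
done < <( grep -Eow 'id=[0-9]+' logfile )
echo ${#my_array[@]}
printf '%s\n' "${my_array[@]}"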
Suggesting the readarray command to make sure the array reads full lines.
readarray -t my_array < <(grep -Eow 'id=[0-9]+' logfile)
printf "%s\n" "${my_array[@]}"

Make a list of all files in two folders then iterate through the combined list randomly

I have two directories with photos that I want to manipulate to output a random order of the files each time a script is run. How would I create such a list?
d1=/home/Photos/*.jpg
d2=/mnt/JillsPC/home/Photos/*.jpg
# somehow make a combined list, files = d1 + d2
# somehow randomise the file order
# during execution of the for;do;done loop, no file should be repeated
for f in $files; do
echo $f # full path to each file
done
I wouldn't use variables if you don't have to. It's more natural if you chain a couple of commands together with pipes or process substitution. That way everything operates on streams of data without loading the entire list of names into memory all at once.
You can use shuf to randomly permute input lines, and find to list files one per line. Or, to be maximally safe, let's use \0 separators. Finally, a while loop with process substitution reads line by line into a variable.
while IFS= read -d $'\0' -r file; do
echo "$file"
done < <(find /home/Photos/ /mnt/JillsPC/home/Photos/ -name '*.jpg' -print0 | shuf -z)
That said, if you do want to use some variables then you should use arrays. Arrays handle file names with whitespace and other special characters correctly, whereas regular string variables muck them all up.
d1=(/home/Photos/*.jpg)
d2=(/mnt/JillsPC/home/Photos/*.jpg)
files=("${d1[@]}" "${d2[@]}")
Iterating in order would be easy:
for file in "${files[@]}"; do
echo "$file"
done
Shuffling is tricky though. shuf is still the best tool but it works best on a stream of data. We can use printf to print each file name with the trailing \0 we need to make shuf -z happy.
d1=(/home/Photos/*.jpg)
d2=(/mnt/JillsPC/home/Photos/*.jpg)
files=("${d1[@]}" "${d2[@]}")
while IFS= read -d $'\0' -r file; do
echo "$file"
done < <(printf '%s\0' "${files[@]}" | shuf -z)
Further reading:
How can I read a file (data stream, variable) line-by-line (and/or field-by-field)?
How can I find and safely handle file names containing newlines, spaces or both?
I set variables in a loop that's in a pipeline. Why do they disappear after the loop terminates? Or, why can't I pipe data to read?
How can I randomize (shuffle) the order of lines in a file? Or select a random line from a file, or select a random file from a directory?
I came up with this solution after some more reading:
files=(/home/roy/Photos/*.jpg /mnt/JillsPC/home/jill/Photos/*.jpg)
printf '%s\n' "${files[@]}" | sort -R
Edit: updated with John's improvements from comments.
You can add any number of directories into an array declaration (though see caveat with complex names in comments).
sort -R seems to use shuf internally, from looking at its man page.
This was the original, which works, but is not as robust as the above:
files=(/home/roy/Photos/*.jpg /mnt/JillsPC/home/jill/Photos/*.jpg)
(IFS=$'\n'; echo "${files[*]}") | sort -R
With IFS=$'\n', echoing the array will display it line by line ($'somestring' is the syntax for string literals with escape sequences, so unlike '\n', $'\n' is the correct way to set IFS to a line break). Setting IFS is not needed when using the printf method above.
echo "${files[*]}" will print out all array elements at once, using the IFS defined in the subshell as the separator.
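To see the difference between the two quoting styles, a small sketch:
printf '%s' '\n'  | od -c   # two characters: a backslash and an n
printf '%s' $'\n' | od -c   # a single newline character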

How do I store the output from a find command in an array? + bash

I have the following find command with the following output:
$ find -name '*.jpg'
./public_html/github/screencasts-gh-pages/reactiveDataVis/presentation/images/telescope.jpg
./public_html/github/screencasts-gh-pages/introToBackbone/presentation/images/telescope.jpg
./public_html/github/StarCraft-master/img/Maps/(6)Thin Ice.jpg
./public_html/github/StarCraft-master/img/Maps/Snapshot.jpg
./public_html/github/StarCraft-master/img/Maps/Map_Grass.jpg
./public_html/github/StarCraft-master/img/Maps/(8)TheHunters.jpg
./public_html/github/StarCraft-master/img/Maps/(2)Volcanis.jpg
./public_html/github/StarCraft-master/img/Maps/(3)Trench wars.jpg
./public_html/github/StarCraft-master/img/Maps/(8)BigGameHunters.jpg
./public_html/github/StarCraft-master/img/Maps/(8)Turbo.jpg
./public_html/github/StarCraft-master/img/Maps/(4)Blood Bath.jpg
./public_html/github/StarCraft-master/img/Maps/(2)Switchback.jpg
./public_html/github/StarCraft-master/img/Maps/Original/(6)Thin Ice.jpg
./public_html/github/StarCraft-master/img/Maps/Original/Map_Grass.jpg
./public_html/github/StarCraft-master/img/Maps/Original/(8)TheHunters.jpg
./public_html/github/StarCraft-master/img/Maps/Original/(2)Volcanis.jpg
./public_html/github/StarCraft-master/img/Maps/Original/(3)Trench wars.jpg
./public_html/github/StarCraft-master/img/Maps/Original/(8)BigGameHunters.jpg
./public_html/github/StarCraft-master/img/Maps/Original/(8)Turbo.jpg
./public_html/github/StarCraft-master/img/Maps/Original/(4)Blood Bath.jpg
./public_html/github/StarCraft-master/img/Maps/Original/(2)Switchback.jpg
./public_html/github/StarCraft-master/img/Maps/Original/(4)Orbital Relay.jpg
./public_html/github/StarCraft-master/img/Maps/(4)Orbital Relay.jpg
./public_html/github/StarCraft-master/img/Bg/GameLose.jpg
./public_html/github/StarCraft-master/img/Bg/GameWin.jpg
./public_html/github/StarCraft-master/img/Bg/GameStart.jpg
./public_html/github/StarCraft-master/img/Bg/GamePlay.jpg
./public_html/github/StarCraft-master/img/Demo/Demo.jpg
./public_html/github/flot/examples/image/hs-2004-27-a-large-web.jpg
./public_html/github/minicourse-ajax-project/other/GameLose.jpg
How do I store this output in an array? I want it to handle filenames with spaces.
I have tried arrayname=($(find -name '*.jpg')), but this just stores the first element. I am doing the following, which seems to return just the first element:
$ arrayname=($(find -name '*.jpg'))
$ echo "$arrayname"
./public_html/github/screencasts-gh-pages/reactiveDataVis/presentation/images/telescope.jpg
$
I have tried here but again this just stores the 1st element
Other similar Qs
How do I capture the output from the ls or find command to store all file names in an array?
How do i store the output of a bash command in a variable?
If you know with certainty that your filenames will not contain newlines, then
mapfile -t arrayname < <(find ...)
If you want to be able to handle any file
arrayname=()
while IFS= read -d '' -r filename; do
arrayname+=("$filename")
done < <(find ... -print0)
echo "$arrayname" will only show the first element of the array. It is equivalent to echo "${arrayname[0]}". To dump an array:
printf "%s\n" "${arrayname[@]}"
# ............^^^^^^^^^^^^^^^^^ must use exactly this form, with the quotes.
arrayname=($(find ...)) is still wrong. It will store the file ./file with spaces.txt as 3 separate elements in the array.
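A quick demonstration of that pitfall, run in an otherwise empty directory (the file name is made up):
touch './file with spaces.txt'
arrayname=($(find . -name '*.txt'))
echo "${#arrayname[@]}"    # 3 -- one element per whitespace-separated word, not one per file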
If you have a sufficiently recent version of bash, you can save yourself a lot of trouble by just using a ** glob.
shopt -s globstar
files=(**/*.jpg)
The first line enables the feature. Once enabled, ** in a glob pattern will match any number (including 0) of directories in the path.
Using the glob in the array definition makes sure that whitespace is handled correctly.
To view an array in a form which could be used to define the array, use the -p (print) option to the declare builtin:
declare -p files
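Putting those two pieces together, a minimal sketch (run from the directory you want to search):
shopt -s globstar
files=(**/*.jpg)
declare -p files              # prints the full array definition, whitespace intact
for f in "${files[@]}"; do    # one element per file, regardless of spaces in names
echo "$f"
done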

bash4 read file into associative array

I am able to read file into a regular array with a single statement:
local -a ary
readarray -t ary < $fileName
What is not happening is reading a file into an associative array.
I have control over file creation and so would like to do this as simply as possible, w/o loops if possible at all.
So file content can be following to be read in as:
keyname=valueInfo
But I am willing to replace = with another string if it cuts down on code, especially in a single line of code as above.
And ...
So would it be possible to read such a file into an assoc array using something like an until or from - i.e. read into an assoc array until it hits a word - or would I have to do this as part of a loop?
This will allow me to keep a lot of similar values in same file, but read into separate arrays.
I looked at mapfile as well, but it does the same as readarray.
Finally ...
I am creating an options list - to select from - as below:
local -a arr=("${!1}")
select option in ${arr[*]}; do
echo ${option}
break
done
Works fine - however the list shown is not sorted. I would like to have it sorted if possible at all.
Hope it is ok to put all 3 questions into 1 as the questions are similar - all on arrays.
Thank you.
First thing, associative arrays are declared with -A not -a:
local -A ary
And if you want to declare a variable on global scope, use declare outside of a function:
declare -A ary
Or use -g if BASH_VERSION >= 4.2.
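For example, a sketch of declaring an associative array with global scope from inside a function (bash >= 4.2; the key and value are hypothetical):
load_config() {
declare -gA ary                   # -g makes the array visible outside the function
ary[example_key]=example_value
}
load_config
declare -p ary                    # declare -A ary=([example_key]="example_value" )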
If your lines do have keyname=valueInfo, with readarray, you can process it like this:
readarray -t lines < "$fileName"
for line in "${lines[@]}"; do
key=${line%%=*}
value=${line#*=}
ary[$key]=$value ## Or simply ary[${line%%=*}]=${line#*=}
done
Using a while read loop can also be an option:
while IFS= read -r line; do
ary[${line%%=*}]=${line#*=}
done < "$fileName"
Or
while IFS='=' read -r key value; do
ary[$key]=$value
done < "$fileName"
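The sorted select list from the last part of the question isn't covered above. One possible sketch, mirroring the question's snippet inside the same function (assumes bash 4+ for mapfile and option strings without embedded newlines):
local -a arr=("${!1}")
mapfile -t sorted < <(printf '%s\n' "${arr[@]}" | sort)
select option in "${sorted[@]}"; do
echo "${option}"
break
done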

KSH scripting: how to split on ',' when values have escaped commas?

I'm trying to write a KSH script for processing a file consisting of name-value pairs, several of them on each line.
Format is:
NAME1 VALUE1,NAME2 VALUE2,NAME3 VALUE3, etc
Suppose I write:
read l
IFS=","
set -A nvls $l
echo "${nvls[2]}"
This will give me second name-value pair, nice and easy. Now, suppose that the task is extended so that values could include commas. They should be escaped, like this:
NAME1 VALUE1,NAME2 VALUE2_1\,VALUE2_2,NAME3 VALUE3, etc
Obviously, my code no longer works, since "read" strips all quoting and the second element of the array will be just "NAME2 VALUE2_1".
I'm stuck with an older ksh that does not have "read -A array". I tried various tricks with "read -r" and "eval set -A ....", to no avail. I can't use "read nvl1 nvl2 nvl3" to do unescaping and splitting inside read, since I don't know beforehand how many name-value pairs are in each line.
Does anyone have a useful trick up their sleeve for me?
PS
I know that I could do this in no time in Perl, Python, even in awk. However, I have to do it in ksh (... or die trying ;)
As it often happens, I devised an answer minutes after asking the question in a public forum :(
I worked around the quoting/unquoting issue by piping the input file through the following sed script:
sed -e 's/\([^\]\),/\1\
/g;s/$/\
/'
It converted the input into:
NAME1.1 VALUE1.1
NAME1.2 VALUE1.2_1\,VALUE1.2_2
NAME1.3 VALUE1.3
<empty line>
NAME2.1 VALUE2.1
<second record continues>
Now, I can parse this input like this:
while read name value ; do
echo "$name => $value"
done
Value will have its commas unquoted by "read", and I can stuff "name" and "value" in some associative array, if I like.
PS
Since I can't accept my own answer, should I delete the question, or ...?
You can also change the \, pattern to something else that is known not to appear in any of your strings, and then change it back after you've split the input into an array. You can use the ksh builtin pattern-substitution syntax to do this, you don't need to use sed or awk or anything.
read l
l=${l//\\,/!!}
IFS=","
set -A nvls $l
unset IFS
echo ${nvls[2]//!!/,}
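For example, with the sample line from the question (a sketch; it assumes !! never occurs in the real data):
l='NAME1 VALUE1,NAME2 VALUE2_1\,VALUE2_2,NAME3 VALUE3'
l=${l//\\,/!!}              # hide the escaped commas
IFS=","
set -A nvls $l              # split on the remaining, real commas
unset IFS
echo "${nvls[1]//!!/,}"     # NAME2 VALUE2_1,VALUE2_2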
