bash array count always returns 1 - arrays

I searched all over for this, but the terms are apparently too general. I'm writing a script to search a group of folders for .mp3 files. Some folders don't have mp3's so they have to be excluded.
I created an array to hold the uniq'd folder names. This find command will get the folders I need.
Folders=$(sudo find /my/music/ -type f -name "*.mp3" | cut -d'/' -f7 | sort -u)
When I try to count the number of folders in the array, I always get 1
echo ${#Folders[@]}
echo ${Folders[@]} prints them out on separate lines, so I thought they were separate array elements. Can anyone explain what is going on? You might have to jiggle the field number in the cut command to reproduce locally.

Folders is not an array but a variable.
You need:
Folders=( $(sudo find /my/music/ -type f -name "*.mp3" | cut -d'/' -f7 | sort -u) )
i.e. enclose the command substitution in (). Now ${#Folders[@]} gives you the number of elements in the array Folders.

Or do:
sudo find /my/music/ -type f -name "*.mp3" | cut -d'/' -f7 | sort -u | wc -l
Note:
wc -l prints the number of lines, which in this case is the number of unique folder names.
To make things a bit more explicit, use the -printf "%p\n" option with find, where the %p specifier prints the file name with its full path.
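The difference between a plain variable and an array can be checked in isolation. The following is a minimal sketch; list_folders is a hypothetical stand-in for the find | cut | sort pipeline, and mapfile assumes bash 4 or later.

```shell
#!/usr/bin/env bash
# list_folders is a hypothetical stand-in for the find | cut | sort pipeline.
list_folders() { printf '%s\n' rock jazz blues; }

scalar=$(list_folders)               # plain variable: one string with embedded newlines
array=( $(list_folders) )            # array: the output is word-split into elements
mapfile -t lines < <(list_folders)   # array: one element per line, no word splitting

scalar_count=${#scalar[@]}
array_count=${#array[@]}
lines_count=${#lines[@]}
echo "scalar=$scalar_count array=$array_count mapfile=$lines_count"
```

This prints scalar=1 array=3 mapfile=3. The mapfile form is probably the safer of the two array forms when folder names may contain spaces, since it splits on newlines only.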

Assuming bash 4 or later, don't use find here; use the globstar operator.
shopt -s globstar
folders=( /my/music/**/*.mp3 )
Also assuming that cut -d/ -f7 is supposed to extract the filename alone, follow this up with
folders=( "${folders[@]##*/}" )
Other methods for populating the array must take more care to accommodate files containing whitespace or characters like ?, *, or [. File names containing newlines (rare, but not illegal) are much more difficult to handle correctly. Pathname expansion is done inside the shell, so you don't need to worry about any such special characters.
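If the goal is the set of folders that contain at least one mp3, the globstar approach can be combined with an associative array. This is only a sketch under assumptions: the directory layout is created in a temporary directory for demonstration, and the names root and seen are made up here.

```shell
#!/usr/bin/env bash
shopt -s globstar nullglob

# Hypothetical layout, created only for this demonstration.
root=$(mktemp -d)
mkdir -p "$root/album one" "$root/album two" "$root/empty"
touch "$root/album one/a.mp3" "$root/album one/b.mp3" "$root/album two/c.mp3"

declare -A seen=()
for f in "$root"/**/*.mp3; do
    dir=${f%/*}             # strip the file name
    seen["${dir##*/}"]=1    # keep the last path component as a set key
done

folder_count=${#seen[@]}
echo "folders with mp3 files: $folder_count"
rm -rf "$root"
```

Here the count is 2, because the folder named empty never shows up in the glob results. No external process is involved, so spaces in folder names need no special handling.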

Related

Bash Array Script Exclude Duplicates

So I have written a bash script (named music.sh) for a Raspberry Pi to perform the following functions:
When executed, look into one single directory (Music folder) and select a random folder to look into. (Note: none of these folders here have subdirectories)
Once a folder within "Music" has been selected, then play all mp3 files IN ORDER until the last mp3 file has been reached
At this point, the script would go back to the folders in the "Music" directory and select another random folder
Then it would again play all mp3 files in that folder in order
Loop indefinitely until input from user
I have this code which does all of the above EXCEPT for the following items:
I would like to NOT play any other "album" that has been played before
Once all albums played once, then shutdown the system
Here is my code so far that is working (WITH duplicates allowed):
#!/bin/bash
folderarray=($(ls -d /home/alphekka/Music/*/))
for i in "${folderarray[@]}";
do
    folderitems=(${folderarray[RANDOM % ${#folderarray[@]}]})
    for j in "${folderitems[@]}";
    do
        echo `ls $j`
        cvlc --play-and-exit "${j[@]}"
    done
done
exit 0
Please note that there isn't a single folder or file that has a space in the name. If there is a space, then I face some issues with this code working.
Anyways, I'm getting close, but I'm not quite there with the entire functionality I'm looking for. Any help would be greatly appreciated! Thank you kindly! :)
Use an associative array as a set. Note that this will work for all valid folder and file names.
#!/bin/bash
declare -A folderarray
# Each folder name is a key mapped to an empty string
for d in /home/alphekka/Music/*/; do
    folderarray["$d"]=
done
while [[ "${!folderarray[*]}" ]]; do
    # Get a list of the remaining folder names
    foldernames=( "${!folderarray[@]}" )
    # Pick a folder at random
    folder=${foldernames[RANDOM%${#foldernames[@]}]}
    # Remove the folder from the set
    # Must use single quotes; see below
    unset folderarray['$folder']
    for j in "$folder"/*; do
        cvlc --play-and-exit "$j"
    done
done
Dealing with keys that contain spaces (and possibly other special characters) is tricky. The quotes shown in the call to unset above are not syntactic quotes in the usual sense. They do not prevent $folder from being expanded, but they do appear to be used by unset itself to quote the resulting string.
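The pick-and-remove logic can be tried without a player or real folders. The sketch below assumes bash 4 or later; the album names are made up, and the array names remaining and played are chosen here for illustration only.

```shell
#!/usr/bin/env bash
# Hypothetical album names; one contains a space on purpose.
declare -A remaining=( ["Abbey Road"]=1 [Thriller]=1 [Nevermind]=1 )
played=()

while (( ${#remaining[@]} > 0 )); do
    names=( "${!remaining[@]}" )               # keys still in the set
    pick=${names[RANDOM % ${#names[@]}]}       # choose one at random
    unset 'remaining[$pick]'                   # quotes leave the expansion to unset
    played+=( "$pick" )
done

printf 'played: %s\n' "${played[@]}"
```

Each album is printed exactly once, in a random order, and the loop ends when the set is empty. That is the point where a shutdown command could go.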
Here's another solution: randomize the list of directories first, save the result in an array and then play (my script just prints) the files from each element of the array
MUSIC=/home/alphekka/Music
OLDIFS=$IFS
IFS=$'\n'
folderarray=($(ls -d $MUSIC/*/|while read line; do echo $RANDOM $line; done| sort -n | cut -f2- -d' '))
for folder in ${folderarray[*]};
do
    printf "Folder: %s\n" $folder
    fileArray=($(find $folder -type f))
    for j in ${fileArray[@]};
    do
        printf "play %s\n" $j
    done
done
IFS=$OLDIFS   # restore the original field separator
For the random shuffling I used this answer.
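The shuffle-first idea can also be done without ls or sort. This is a rough sketch that swaps in a Fisher-Yates shuffle in pure bash, which is not what the answer above uses; the function name shuffle_items and the sample album names are assumptions for illustration.

```shell
#!/usr/bin/env bash
# Fisher-Yates shuffle of the global array "items" (hypothetical helper).
shuffle_items() {
    local i j tmp
    for (( i = ${#items[@]} - 1; i > 0; i-- )); do
        j=$(( RANDOM % (i + 1) ))      # random index in 0..i
        tmp=${items[i]}
        items[i]=${items[j]}
        items[j]=$tmp
    done
}

items=( "album a" "album b" "album c" "album d" )
shuffle_items
printf 'Folder: %s\n' "${items[@]}"

# Sorting the shuffled array again shows that nothing was lost or duplicated.
sorted=$(printf '%s\n' "${items[@]}" | sort | tr '\n' ',')
```

Because the array elements are never re-split, names with spaces survive the shuffle intact.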
One liner solution with mpv, rl (randomlines), xargs, find:
find /home/alphekka/Music/ -maxdepth 1 -type d -print0 | rl -d \0 | xargs -0 -l1 mpv

How to remove numbers from extensions from files

I have many files in a directory having extension like
.text(2) and .text(1).
I want to remove the numbers from extension and output should be like
.text and .text .
can anyone please help me with the shell script for that?
I am using centOs.
A pretty portable way of doing it would be this:
for i in *.text*; do mv "$i" "$(echo "$i" | sed 's/([0-9]\{1,\})$//')"; done
Loop through all files which end in .text followed by anything. Use sed to remove any parentheses containing one or more digits from the end of each filename.
If all of the numbers within the parentheses are single digits and you're using bash, you could also use built-in parameter expansion:
for i in *.text*; do mv "$i" "${i%([0-9])}"; done
The expansion removes any parentheses containing a single digit from the end of each filename.
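One way to lift the single-digit restriction while staying inside bash is a regular-expression match instead of a glob pattern. This is a sketch only: the files are created in a temporary directory for demonstration, and the variable names are made up.

```shell
#!/usr/bin/env bash
# Hypothetical test files in a scratch directory.
workdir=$(mktemp -d)
touch "$workdir/xxxx.text(1)" "$workdir/yyyy.text(12)" "$workdir/zzzz.text"

re='^(.*)\([0-9]+\)$'    # capture everything before a trailing "(digits)"
for f in "$workdir"/*.text*; do
    if [[ $f =~ $re ]]; then
        mv -n -- "$f" "${BASH_REMATCH[1]}"   # -n: never overwrite an existing file
    fi
done

renamed=$(cd "$workdir" && echo *)
echo "$renamed"
rm -rf "$workdir"
```

This prints xxxx.text yyyy.text zzzz.text. Note that a.text(1) and a.text(2) would both map to a.text; with -n the second rename is skipped rather than clobbering the first.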
Another way without loops, but also with sed (and all the regexp's inside) is piping to sh:
ls *text* | sed 's/\(.*\)\..*/mv \1* \1.text/' | sh
Example:
[...]$ ls
xxxx.text(1) yyyy.text(2)
[...]$ ls *text* | sed 's/\(.*\)\..*/mv \1* \1.text/' | sh
[...]$ ls
xxxx.text yyyy.text
Explanation:
Everything between \( and \) is stored and can be pasted again by \1 (or \2, \3, ... a consecutive number for each pair of parentheses used). Therefore, the code above stores all the characters before the first dot \. and after that, compounds a sequence like this:
mv xxxx* xxxx.text
mv yyyy* yyyy.text
That is piped to sh
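The simplest way, if the files are in the same folder, is the Perl-based rename (note that the rename shipped by default on CentOS is the util-linux version, which does not accept Perl expressions):
rename 's/text\([0-9]+\)/text/' *.text*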

Bash arrays: appending and prepending to each element in array

I'm trying to build a long command involving find. I have an array of directories that I want to ignore, and I want to format this directory into the command.
Basically, I want to transform this array:
declare -a ignore=(archive crl cfg)
into this:
-o -path "$dir/archive" -prune -o -path "$dir/crl" -prune -o -path "$dir/cfg" -prune
This way, I can simply add directories to the array, and the find command will adjust accordingly.
So far, I figured out how to prepend or append using
${ignore[@]/#/-o -path \"\$dir/}
${ignore[@]/%/\" -prune}
But I don't know how to combine these and simultaneously prepend and append to each element of an array.
You cannot do it simultaneously easily. Fortunately, you do not need to:
ignore=( archive crl cfg )
ignore=( "${ignore[@]/%/\" -prune}" )
ignore=( "${ignore[@]/#/-o -path \"\$dir/}" )
echo ${ignore[@]}
Note the parentheses and double quotes - they make sure the array contains three elements after each substitution, even if there are spaces involved.
Have a look at printf, which does the job as well:
printf -- '-o -path "$dir/%s" -prune ' ${ignore[@]}
In general, you should strive to always treat each variable in the quoted form (e.g. "${ignore[@]}") instead of trying to insert quotation marks yourself (just as you should use parameterized statements instead of escaping the input in SQL) because it's hard to be perfect by manual escaping; for example, suppose a variable contains a quotation mark.
In this regard, I would aim at crafting an array where each argument word for find becomes an element: ("-o" "-path" "$dir/archive" "-prune" "-o" "-path" "$dir/crl" "-prune" "-o" "-path" "$dir/cfg" "-prune") (a 12-element array).
Unfortunately, Bash doesn't seem to support a form of parameter expansion where each element expands to multiple words. (p{1,2,3}q expands to p1q p2q p3q, but with a=(1 2 3), p"${a[@]}"q expands to p1 2 3q.) So you need to resort to a loop:
declare -a args=()
for i in "${ignore[@]}"
do
    args+=(-o -path "$dir/$i" -prune) # I'm not sure if you want to have
                                      # $dir expanded at this point;
                                      # otherwise, just use "\$dir/$i".
done
find ... "${args[@]}" ...
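To see the loop-built array drive an actual find call, here is a small end-to-end sketch. The directory tree is created in a temporary location purely for demonstration, and the final -type f -print test is an assumption about what the rest of the command might look like.

```shell
#!/usr/bin/env bash
# Hypothetical tree: three ignored directories and one that should be searched.
dir=$(mktemp -d)
mkdir -p "$dir/archive" "$dir/crl" "$dir/cfg" "$dir/keep"
touch "$dir/archive/old.txt" "$dir/crl/list.txt" "$dir/cfg/app.conf" "$dir/keep/data.txt"

ignore=( archive crl cfg )
prune=()
for name in "${ignore[@]}"; do
    prune+=( -o -path "$dir/$name" -prune )   # four words per ignored directory
done

# "${prune[@]:1}" drops the leading -o so the group starts with -path.
found=$(find "$dir" \( "${prune[@]:1}" \) -o -type f -print)
echo "$found"
rm -rf "$dir"
```

Only keep/data.txt is printed. The parenthesized group would be empty, and find would reject it, if ignore had no elements, so a real script would probably want to guard that case.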
If I understand right,
declare -a ignore=(archive crl cfg)
a=$(echo ${ignore[@]} | xargs -n1 -I% echo -o -path '"$dir/%"' -prune)
echo $a
prints
-o -path "$dir/archive" -prune -o -path "$dir/crl" -prune -o -path "$dir/cfg" -prune
This works only with an xargs that supports the following switches:
-I replstr
Execute utility for each input line, replacing one or more occurrences of replstr in up to replacements
(or 5 if no -R flag is specified) arguments to utility with the entire line of input. The resulting
arguments, after replacement is done, will not be allowed to grow beyond 255 bytes; this is implemented
by concatenating as much of the argument containing replstr as possible, to the constructed arguments to
utility, up to 255 bytes. The 255 byte limit does not apply to arguments to utility which do not contain
replstr, and furthermore, no replacement will be done on utility itself. Implies -x.
-J replstr
If this option is specified, xargs will use the data read from standard input to replace the first occur-
rence of replstr instead of appending that data after all other arguments. This option will not affect
how many arguments will be read from input (-n), or the size of the command(s) xargs will generate (-s).
The option just moves where those arguments will be placed in the command(s) that are executed. The
replstr must show up as a distinct argument to xargs. It will not be recognized if, for instance, it is
in the middle of a quoted string. Furthermore, only the first occurrence of the replstr will be
replaced. For example, the following command will copy the list of files and directories which start
with an uppercase letter in the current directory to destdir:
/bin/ls -1d [A-Z]* | xargs -J % cp -rp % destdir

script for getting extensions of a file

I need to get all the file extension types in a folder. For instance, if the directory's ls gives the following:
a.t
b.t.pg
c.bin
d.bin
e.old
f.txt
g.txt
I should get this by running the script
.t
.t.pg
.bin
.old
.txt
I have a bash shell.
Thanks a lot!
See the BashFAQ entry on ParsingLS for a description of why many of these answers are evil.
The following approach avoids this pitfall (and, by the way, completely ignores files with no extension):
shopt -s nullglob
for f in *.*; do
printf '%s\n' ".${f#*.}"
done | sort -u
Among the advantages:
Correctness: ls behaves inconsistently and can result in inappropriate results. See the link at the top.
Efficiency: Minimizes the number of subprocesses invoked (only one, sort -u, and that could be removed as well if we wanted to use Bash 4's associative arrays to store results)
Things that still could be improved:
Correctness: this will correctly discard newlines in filenames before the first . (which some other answers won't) -- but filenames with newlines after the first . will be treated as separate entries by sort. This could be fixed by using nulls as the delimiter, or by the aforementioned bash 4 associative-array storage approach.
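The associative-array variant mentioned above could look roughly like this. It is a sketch assuming bash 4 or later; the sample files mirror the question's listing (plus one file without an extension) and are created in a temporary directory, and the names workdir and exts are made up.

```shell
#!/usr/bin/env bash
shopt -s nullglob

# Hypothetical sample files, created only for this demonstration.
workdir=$(mktemp -d)
touch "$workdir"/{a.t,b.t.pg,c.bin,d.bin,e.old,f.txt,g.txt,README}

declare -A exts=()
for f in "$workdir"/*.*; do
    name=${f##*/}              # drop the directory part first
    exts[".${name#*.}"]=1      # everything from the first dot onward, used as a set key
done

ext_count=${#exts[@]}
printf '%s\n' "${!exts[@]}"    # key order is unspecified
rm -rf "$workdir"
```

Five distinct extensions are collected and no subprocess is needed for the deduplication. The keys come back in no particular order, so a sort would still be needed if sorted output matters.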
try this:
ls -1 | sed 's/^[^.]*\(\..*\)$/\1/' | sort -u
ls lists files in your folder, one file per line
sed magic extracts extensions
sort -u sorts extensions and removes duplicates
sed magic reads as:
s/ / /: substitutes whatever is between first and second / by whatever is between second and third /
^: match beginning of line
[^.]: match any character that is not a dot
*: match it as many times as possible
\( and \): remember whatever is matched between these two parentheses
\.: match a dot
.: match any character
*: match it as many times as possible
$: match end of line
\1: this is what has been matched between parentheses
People are really over-complicating this - particularly the regex:
ls | grep -o "\..*" | uniq
ls - get all the files
grep -o "\..*" - -o only show the match; "\..*" match at the first "." & everything after it
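uniq - drop duplicates while keeping the original order; note that uniq only removes adjacent duplicate lines, which happens to be enough for this example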
you can also sort if you like, but sorting doesn't match the example
This is what happens when you run it:
> ls -1
a.t
a.t.pg
c.bin
d.bin
e.old
f.txt
g.txt
> ls | grep -o "\..*" | uniq
.t
.t.pg
.bin
.old
.txt
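Since uniq compares only neighbouring lines, a listing where the same extension appears in non-adjacent positions would still produce duplicates. A small sketch of the difference, using made-up input and the common awk idiom for order-preserving deduplication:

```shell
#!/usr/bin/env bash
# Hypothetical extension list in which duplicates are not adjacent.
input=$'.txt\n.bin\n.txt\n.bin'

uniq_out=$(printf '%s\n' "$input" | uniq)               # nothing is removed
deduped=$(printf '%s\n' "$input" | awk '!seen[$0]++')   # keeps the first occurrence only

echo "uniq:"; echo "$uniq_out"
echo "awk:";  echo "$deduped"
```

The uniq output still has four lines, while the awk version yields .txt and .bin once each, in first-seen order.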

Sorting a bash array

I'm trying to sort the output of this code by size of the file. Currently I have:
IFS=!
FILEARRAY=(`find * -printf %f!`)
to get all of the file names out of the directory. I've tried piping it all sorts of ways and nothing works. Is it even possible to do like this or do I need to go about getting the file names in my array a different way?
Thanks
Try something like this instead:
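mapfile -t FILEARRAY < <(find * -printf '%s~%f\n' | sort -n | awk -F"~" '{print $2}')
This should give you an array of file names sorted by size (mapfile needs bash 4 or later; a plain FILEARRAY=$(...) would produce a single string, not an array).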
I'm not sure what you are trying to achieve here, but to extract the size of the files you might want to use sed. To pass it to sort or some other sorting utility, check out xargs, which gives you some extra features when piping and might be of some use.
Edit:
If you are trying to sort all of the files in the current directory by size, something like this:
find ./ -name "*" | xargs ls -s | sort -n
should work.
Does not use bash arrays. Also does not parse ls
find . -type f -printf '%s:%f\n' | sort -t: -n -k1 | cut -d: -f2-
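If the sorted names are wanted in a bash array after all, the same pipeline can feed mapfile. This is a sketch assuming GNU find (for -printf) and bash 4 or later; the sample files and the array name by_size are made up, and a tab is used as the separator so that colons in file names do no harm.

```shell
#!/usr/bin/env bash
# Hypothetical files of 0, 3 and 10 bytes in a scratch directory.
workdir=$(mktemp -d)
: > "$workdir/empty.txt"
printf 'aaa' > "$workdir/mid.txt"
printf 'aaaaaaaaaa' > "$workdir/big file.txt"

# size<TAB>name, sorted numerically by size, then the size column is cut away
mapfile -t by_size < <(find "$workdir" -type f -printf '%s\t%f\n' | sort -n | cut -f2-)

printf '%s\n' "${by_size[@]}"
rm -rf "$workdir"
```

The array ends up as empty.txt, mid.txt, big file.txt, smallest first. File names containing newlines would still break this, as with any line-based approach.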
