Why are loop-generated Bash array values concatenated together?

I'm writing a short script to automate output filenames. The testing folder has the following files:
test_file_1.fa
test_file_2.fa
test_file_3.fa
So far, I have the following:
#!/bin/bash
filenames=$(ls *.fa*)
output_filenames=$()
output_suffix=".output.faa"
for name in $filenames
do
output_filenames+=$name$output_suffix
done
for name in $output_filenames
do
echo $name
done
The output for this is:
test_file_1.fa.output.faatest_file_2.fa.output.faatest_file_3.fa.output.faa
Why does this loop 'stick' all of the filenames together as one array variable?

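Shell arrays require particular syntax. In your script, output_filenames=$() creates an empty string, not an array, so each += appends text to that one string.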
output_filenames=() # not $()
output_suffix=".output.faa"
for name in *.fa* # don't parse `ls`
do
output_filenames+=("$name$output_suffix") # parentheses required
done
for name in "${output_filenames[@]}" # braces and index and quotes required
do
echo "$name"
done
https://tldp.org/LDP/abs/html/arrays.html has more examples of using arrays.
"Don't parse ls" => https://mywiki.wooledge.org/ParsingLs
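To see the difference between the two forms of += side by side, here is a minimal sketch. The names as_string and as_array are made up for illustration, and the three file names are hard-coded stand-ins for the *.fa files in the question, so it runs without touching the file system.

```shell
#!/bin/bash
# Hypothetical stand-ins for the files that *.fa* would match.
names=(test_file_1.fa test_file_2.fa test_file_3.fa)
output_suffix=".output.faa"

as_string=""   # a plain string
as_array=()    # an empty array

for name in "${names[@]}"; do
    as_string+="$name$output_suffix"     # no parentheses: text is appended to the string
    as_array+=("$name$output_suffix")    # parentheses: a new element is appended
done

echo "string: $as_string"
echo "array has ${#as_array[@]} elements:"
printf '  %s\n' "${as_array[@]}"
```

The string ends up as the single run-together value shown in the question, while the array holds three separate elements.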

Related

Why does "echo $array" print all members of the array in this specific case instead of only the first member like in any other case?

I have encountered a very curious problem, while trying to learn bash.
Usually, echoing an array by its bare variable name, like this, only outputs the first member, Hello.
#!/bin/bash
declare -a test
test[0]="Hello"
test[1]="World"
echo $test # Only prints "Hello"
BUT, for some reason this piece of code prints out ALL members of the given array.
#!/bin/bash
declare -a files
counter=0
for file in "./*"
do
files[$counter]=$file
let $((counter++))
done
echo $files # prints "./file1 ./file2 ./file3" and so on
And I can't seem to wrap my head around it on why it outputs the whole array instead of only the first member. I think it has something to do with my usage of the foreach-loop, but I was unable to find any concrete answer. It's driving me crazy!
Please send help!
When you quoted the pattern, you only created a single entry in your array:
$ declare -p files
declare -a files=([0]="./*")
If you had quoted the parameter expansion, you would see
$ echo "$files"
./*
Without the quotes, the expansion is subject to pathname generation, so echo receives multiple arguments, each of which is printed.
To build the array you expected, drop the quotes around the pattern. The results of pathname generation are not subject to further word-splitting (or recursive pathname generation), so no quotes would be needed.
for file in ./*
do
...
done
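The effect of quoting the pattern can be checked with a small sketch like the one below. It assumes a throwaway directory created with mktemp -d and three hypothetical files; the array names quoted and unquoted are only for illustration.

```shell
#!/bin/bash
# Scratch directory with three hypothetical files.
tmpdir=$(mktemp -d)
touch "$tmpdir/file1" "$tmpdir/file2" "$tmpdir/file3"

# Quoted pattern: the loop runs once, with the literal string ending in "/*".
quoted=()
for file in "$tmpdir/*"; do
    quoted+=("$file")
done

# Unquoted pattern: the shell expands it, and the loop runs once per file.
unquoted=()
for file in "$tmpdir"/*; do
    unquoted+=("$file")
done

echo "quoted pattern:   ${#quoted[@]} element(s)"
echo "unquoted pattern: ${#unquoted[@]} element(s)"

rm -rf "$tmpdir"
```

Running declare -p on each array shows the same thing in more detail.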

Echoing an array containing elements with spaces as an argument to another command

I am writing a little script that outputs a list of duplicate files in the directory, ie. pairs of XXX.jpg and XXX (1).jpg. I want to use the output of this script as an argument to a command, namely ql (quicklook) so I can look through all such images (to verify they are indeed duplicate images, or just filenames). For instance, I can do `ql (' which will allow me to look through all the files 'XXX (1).jpg'; but I want to include in that list also the original 'XXX.jpg' files.
Here is my script so far:
dups=()
for file in *\(*; do
dups+=( "${file}" )
breakdown=( $file )
dupfile="${breakdown[0]}.jpg"
if [ -e "$dupfile" ]; then
dups+=( "$dupfile" )
fi
done
echo ${dups[@]}
As far as building an array of the required filenames goes, it works. But when it comes to invoking something like ql $(./printdups.sh), the command gets confused by the filenames with spaces. It will attempt to open 'XXX' as a file, and then '(1).jpg' as another file. So the question is, how can I echo this array such that filenames with spaces are recognised as such by the command I pass it to?
I have tried changing line 3 to:
dups+=( "'$file'" )
And:
dups+=( "${file/ /\ }" )
Both to no avail.
You can't pass arrays from one process to another. All you are doing is writing a space-separated sequence of file names to standard output, and the unquoted command substitution in ql $(./printdups.sh) fails for the same reason you need an array in the first place: word-splitting does not distinguish between spaces in file names and spaces between file names.
I would recommend defining a function, rather than a script, and have that function populate a global array that you can access directly after the function has been called.
get_dups () {
dups=()
for file in *\(*; do
dups+=( "$file" )
read -a breakdown <<< "$file" # safer way to split the name into parts
dupfile="${breakdown[0]}.jpg"
if [ -e "$dupfile" ]; then
dups+=( "$dupfile" )
fi
done
}
get_dups
ql "${dups[@]}"
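The word-splitting problem is easy to reproduce without ql. The sketch below uses a hypothetical helper, count_args, that just reports how many arguments it received, and two made-up file names.

```shell
#!/bin/bash
# Hypothetical helper: prints the number of arguments it was given.
count_args() { echo "$#"; }

dups=("XXX (1).jpg" "XXX.jpg")

# Unquoted: each element is word-split, so "XXX (1).jpg" becomes two arguments.
unquoted_count=$(count_args ${dups[@]})

# Quoted: every element is passed as exactly one argument.
quoted_count=$(count_args "${dups[@]}")

echo "unquoted: $unquoted_count arguments"
echo "quoted:   $quoted_count arguments"
```

The same splitting happens to the output of an unquoted $(./printdups.sh), which is why keeping the array in the current shell and expanding it with quotes avoids the problem.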

Loop thru a filename list and iterate thru a variable/array removing all strings from filenames with bash

I have a list of strings in a variable and would like to remove those strings from a list of filenames. I'm pulling the strings from a file that I can add to and modify over time. Some of the strings in the variable may include part of the item needed to be removed while the other may be another line in the list. That's why I need to loop thru the entire variable list.
I'm familiar using a while loop to loop thru a list but not sure how I can loop thru each line to remove all strings from that filename.
Here's an example:
getstringstoremove=$(cat /text/from/some/file.txt)
echo "$getstringstoremove"
# Or the above can be an array
getstringstoremove=$(cat /text/from/some/file.txt)
declare -a arr=($getstringstoremove)
the above 2 should return the following lines
-SOMe.fil
(Ena)M-3_1
.So[Me].filEna)M-3_2
SOMe.fil(Ena)M-3_3
Here's the loop I was running to grab all filenames from a directory and remove anything other than the filenames
ls -l "/files/in/a/folder/" | awk -v N=9 '{sep=""; for (i=N; i<=NF; i++) {printf("%s%s",sep,$i); sep=OFS}; printf("\n")}' | while read line; do
echo "$line"
returns the following result after each loop
# 1st loop
ilikecoffee1-SOMe.fil(Ena)M-3_1.jpg
# iterate thru $getstringstoremove to remove all strings from the above file.
# 2nd loop
ilikecoffee2.So[Me].filEna)M-3_2.jpg
# iterate thru $getstringstoremove again
# 3rd loop
ilikecoffee3SOMe.fil(Ena)M-3_3.jpg
# iterate thru $getstringstoremove and again
done
the final desired output would be the following
ilikecoffee1.jpg
ilikecoffee2.jpg
ilikecoffee3.jpg
I'm running this in bash on Mac.
I hope this makes sense as I'm stuck and can use some help.
If someone has a better way of doing this by all means it doesn't have to be the way I have it listed above.
You can get the new filenames with this awk one-liner:
$ awk 'NR==FNR{a[$0];next} {for(i in a){n=index($0,i);if(n){$0=substr($0,1,n-1)substr($0,n+length(i))}}} 1' rem.txt files.lst
This assumes your exclusion strings are in rem.txt and there's a files list in files.lst.
Spaced out for easier commenting:
NR==FNR { # suck the first file into the indices of an array,
a[$0]
next
}
{
for (i in a) { # for each file we step through the array,
n=index($0,i) # search for an occurrence of this string,
if (n) { # and if found,
$0=substr($0,1,n-1)substr($0,n+length(i))
# rewrite the line with the string missing,
}
}
}
1 # and finally, print the line.
If you stow the above script in a file, say foo.awk, you could run it as:
$ awk -f foo.awk rem.txt files.lst
to see the resultant files.
Note that this just shows you how to build new filenames. If what you want is to do this for each file in a directory, it's best to avoid running your renames directly from awk, and use shell constructs designed for handling files, like a for loop:
for f in path/to/*.jpg; do
mv -v "$f" "$(awk -f foo.awk rem.txt - <<<"$f")"
done
This should be pretty obvious except perhaps for the awk options, which are:
-f foo.awk, use the awk script from this filename,
rem.txt, your list of removal strings,
-, a hyphen indicating that standard input should be used IN ADDITION to rem.txt, and
<<<"$f", a "here-string" to provide that input to awk.
Note that this awk script will work with both gawk and the non-GNU awk that is included in macOS.
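If you would rather stay in the shell, the same idea can be sketched with Bash parameter expansion instead of awk. This is a swapped-in technique, not the awk method above: quoting the pattern in ${name/"$s"/} makes Bash treat it as a literal string rather than a glob. The function name strip_strings and the array name strings are hypothetical, and the removal list is inlined as a here-document instead of being read from rem.txt so the example is self-contained.

```shell
#!/bin/bash
# Load the removal strings, one per line (inlined here; a real script
# would redirect from its own list file instead).
strings=()
while IFS= read -r s; do
    if [ -n "$s" ]; then
        strings+=("$s")
    fi
done <<'EOF'
-SOMe.fil
(Ena)M-3_1
.So[Me].filEna)M-3_2
SOMe.fil(Ena)M-3_3
EOF

# Hypothetical helper: removes the first occurrence of each string from a name.
strip_strings() {
    local name=$1 s
    for s in "${strings[@]}"; do
        name=${name/"$s"/}    # quoted pattern is matched literally
    done
    printf '%s\n' "$name"
}

strip_strings 'ilikecoffee1-SOMe.fil(Ena)M-3_1.jpg'
strip_strings 'ilikecoffee2.So[Me].filEna)M-3_2.jpg'
strip_strings 'ilikecoffee3SOMe.fil(Ena)M-3_3.jpg'
```

Unlike awk's for (i in a), whose order is unspecified, this loop applies the strings in list order, which matters when one string overlaps another.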
I think I have understood what you mean, and I would do it with Perl which comes built-in to the standard macOS - so nothing to install.
I assume you have a file called remove.txt with your list of stuff to remove, and that you want to run the script on all files in your current directory. If so, the script would be:
#!/usr/local/bin/perl -w
use strict;
# Load the strings to remove into array "strings"
my @strings = `cat remove.txt`;
for(my $i=0;$i<=$#strings;$i++){
# Strip carriage returns and quote metacharacters - e.g. *()[]
chomp($strings[$i]);
$strings[$i] = quotemeta($strings[$i]);
}
# Iterate over all filenames
my @files = glob('*');
foreach my $file (@files){
my $new = $file;
# Iterate over replacements
foreach my $string (@strings){
$new =~ s/$string//;
}
# Check if name would change
if($new ne $file){
if( -f $new){
printf("Cowardly refusing to rename %s as %s since it involves overwriting\n",$file,$new);
} else {
printf("Rename %s as %s\n",$file,$new);
# rename $file,$new;
}
}
}
Then save that in your HOME directory as renamer. Make it executable - only necessary once - with this command in Terminal:
chmod +x $HOME/renamer
Then you can go to any directory where your madly named files are and run the script like this:
cd path/to/mad/files
$HOME/renamer
As with all things you download off the Internet, make a backup first and just run on a small, copied, subset of your files till you get the idea of how it works.
If you use homebrew as your package manager, you could install rename using:
brew install rename
You could then take all the Perl from my other answer, condense it down to a couple of lines, and embed it in a rename command, which would give you the added benefit of being able to do dry runs etc. The code below does exactly the same as my other answer but is somewhat harder to read for non-Perl folk.
Your command would simply be:
rename --dry-run '
my @strings = map { s/\r|\n//g; $_=quotemeta($_) } `cat remove.txt`;
foreach my $string (@strings){ s/$string//; } ' *
Sample Output
'ilikecoffee(Ena)M-3_1' would be renamed to 'ilikecoffee'
'ilikecoffee-SOMe.fil' would be renamed to 'ilikecoffee'
'ilikecoffee.So[Me].filEna)M-3_2' would be renamed to 'ilikecoffee'
To try and understand it, remember:
the rename part applies the following Perl to each file because of the asterisk at the end
the @strings part reads all the strings from the file remove.txt and removes any carriage returns and linefeeds from them and quotes any metacharacters
the foreach applies each of the deletions to the current filename which rename stores in $_ for you
Note that this method trades simplicity for performance somewhat. If you have millions of files to do, the other method will be quicker because here I read the remove.txt file for each and every file whose name is checked, but if you only have a few hundred/thousand files, I doubt you'll notice it.
This should be much the same, just shorter:
rename --dry-run '
my @strings = `cat remove.txt`; chomp @strings;
foreach my $string (@strings){ s/\Q$string\E//; } ' *

Extract information from ini file and add to associative array (Bash)

I'm stuck on a bash script.
I have a config.ini file like this:
#Username
username=user
#Userpassword
userpassword=password
I'm looking for a way to extract this information in a bash script and put it in an associative array. My script looks like this:
declare -A array
OIFS=$IFS
IFS='='
grep -vE '^(\s*$|#)' file | while read -r var1 var2
do
array+=([$var1]=$var2)
done
echo ${array[@]}
But the array seems to be empty, because the command echo ${array[@]} gives no output.
Any idea why my script doesn't work? Thanks for your help, and sorry for my bad English.
Common error: with "grep | while", the pipeline runs the while loop in a subshell, so variables set inside the loop are lost when the loop ends. Use a here-string instead:
while read -r var1 var2
do
array+=([$var1]=$var2)
done <<< "$(grep -vE '^(\s*$|#)' file)"
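Process substitution is another way to keep the loop in the current shell. The sketch below is an alternative to the here-string, not the method above. It assumes Bash 4 or later (for declare -A), writes a throwaway copy of the config to a temporary file so it is self-contained, and uses the made-up names settings, key and value. IFS is set only for the read command, so the global IFS is left alone.

```shell
#!/bin/bash
# Requires Bash 4+ for associative arrays.

# Throwaway copy of the config from the question.
config=$(mktemp)
printf '%s\n' '#Username' 'username=user' '' '#Userpassword' 'userpassword=password' > "$config"

declare -A settings
# < <(...) feeds grep's output to the loop without a pipeline,
# so the loop runs in the current shell and the array survives.
while IFS='=' read -r key value; do
    settings[$key]=$value
done < <(grep -vE '^[[:space:]]*($|#)' "$config")

rm -f "$config"

for key in "${!settings[@]}"; do
    echo "$key -> ${settings[$key]}"
done
```

The order in which the keys are printed is not guaranteed, since associative arrays are unordered.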
Assuming the file can be trusted (i.e. its content is regulated and known), the simplest method is to source the ini file and then use the variable names directly within the script:
. config.ini
You can use either the period (.) as above or the source builtin command.

transform file content into array or grep for values

I have a file containing config information and a shell script that reads that file. I want to hand over values to a bash script.
file.txt
varNumber=1.1.1
varName=testThis
varFile=~/myDir/mySubDir/output.zip
myShellScript.sh
FILENAME="~/myDir/mySubDir/output.zip" <- this is what I expect from grep/awk
startNextScript.sh -f $FILENAME
I would like to extract the variables either as an associated array or - if easier - grep for them,
but as I'm not used to writing commands like this in bash I am asking for help!
Using an associative array in Bash:
#!/bin/bash
declare -A vars
while read -r line ; do
var=${line%%=*} # Remove the first = and everything after it.
value=${line#*=} # Remove everything up to and including the first =.
vars[$var]=$value
done < file.txt
echo "Number: ${vars[varNumber]}"
echo "Name: ${vars[varName]}"
echo "File: ${vars[varFile]}"
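If only one or two values are needed, the grep route mentioned in the question may be enough. A minimal sketch, assuming every line has the form key=value and that the key contains no regular-expression metacharacters; get_value is a hypothetical helper, and the config is written to a temporary file so the example is self-contained.

```shell
#!/bin/bash
# Throwaway copy of file.txt from the question.
config=$(mktemp)
printf '%s\n' 'varNumber=1.1.1' 'varName=testThis' 'varFile=~/myDir/mySubDir/output.zip' > "$config"

# Hypothetical helper: print the value for key $1 in file $2.
get_value() {
    # -m1 stops at the first match; cut keeps everything after the first =.
    grep -m1 "^$1=" "$2" | cut -d= -f2-
}

FILENAME=$(get_value varFile "$config")
echo "$FILENAME"

rm -f "$config"
```

Note that the value is returned as literal text, so the leading ~ is not expanded to the home directory at this point.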

Resources