$@ vs. "$@" when an argument is enclosed with single quotes [duplicate] - arrays

The $@ variable seems to maintain quoting around its arguments so that, for example:
$ function foo { for i in "$@"; do echo $i; done }
$ foo herp "hello world" derp
herp
hello world
derp
I am also aware that bash arrays work the same way:
$ a=(herp "hello world" derp)
$ for i in "${a[@]}"; do echo $i; done
herp
hello world
derp
What is actually going on with variables like this? Particularly when I add something to the quote like "duck ${a[@]} goose". If it's not space separated, what is it?

Usually, double quotation marks in Bash mean "make everything between the quotation marks one word, even if it has separators in it." But as you've noticed, $@ behaves differently when it's within double quotes. This is actually a parsing hack that dates back to Bash's predecessor, the Bourne shell, and this special behavior applies only to this particular variable.
Without this hack (I use the term because it seems inconsistent from a language perspective, although it's very useful), it would be difficult for a shell script to pass along its array of arguments to some other command that wants the same arguments. Some of those arguments might have spaces in them, but how would it pass them to another command without the shell either lumping them together as one big word or reparsing the list and splitting the arguments that have whitespace?
Well, you could pass an array of arguments, and the Bourne shell really only has one array, represented by $* or $@, whose number of elements is $# and whose elements are $1, $2, etc., the so-called positional parameters.
An example. Suppose you have three files in the current directory, named aaa, bbb, and cc c (the third file has a space in the name). You can initialize the array (that is, you can set the positional parameters) to be the names of the files in the current directory like this:
set -- *
Now the array of positional parameters holds the names of the files. $#, the number of elements, is three:
$ echo $#
3
And we can iterate over the positional parameters in a few different ways.
1) We can use $*:
$ for file in $*; do
> echo "$file"
> done
but that re-separates the arguments on whitespace and calls echo four times:
aaa
bbb
cc
c
2) Or we could put quotation marks around $*:
$ for file in "$*"; do
> echo "$file"
> done
but that groups the whole array into one argument and calls echo just once:
aaa bbb cc c
3) Or we could use $@ which represents the same array but behaves differently in double quotes:
$ for file in "$@"; do
> echo "$file"
> done
will produce
aaa
bbb
cc c
because $1 = "aaa", $2 = "bbb", and $3 = "cc c" and "$@" leaves the elements intact. If you leave off the quotation marks around $@, the shell will flatten and re-parse the array, echo will be called four times, and you'll get the same thing you got with a bare $*.
This is especially useful in a shell script, where the positional parameters are the arguments that were passed to your script. To pass those same arguments to some other command -- without the shell resplitting them on whitespace -- use "$@".
# Truncate the files specified by the args
rm "$@"
touch "$@"
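As a rough illustration of the forwarding idiom, here is a minimal sketch. The function names show_args, forward_quoted, and forward_unquoted are hypothetical helpers made up for this example; they are not part of the original answer.

```bash
#!/usr/bin/env bash

# show_args (hypothetical helper): print each argument on its own line,
# wrapped in brackets so that embedded spaces are visible.
show_args() {
    local arg
    for arg in "$@"; do
        printf '[%s]\n' "$arg"
    done
}

# Passes its arguments along unchanged.
forward_quoted() { show_args "$@"; }

# Lets the shell re-split the arguments on whitespace before passing them on.
forward_unquoted() { show_args $@; }

forward_quoted aaa bbb "cc c"      # three lines: [aaa] [bbb] [cc c]
forward_unquoted aaa bbb "cc c"    # four lines:  [aaa] [bbb] [cc] [c]
```

The quoted form is the one you almost always want in a wrapper script.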
In Bourne, this behavior only applies to the positional parameters because it's really the only array supported by the language. But you can create other arrays in Bash, and you can even apply the old parsing hack to those arrays using the special "${ARRAYNAME[@]}" syntax, whose at-sign feels almost like a wink to Mr. Bourne:
$ declare -a myarray
$ myarray[0]=alpha
$ myarray[1]=bravo
$ myarray[2]="char lie"
$ for file in "${myarray[@]}"; do echo "$file"; done
alpha
bravo
char lie
Oh, and about your last example, what should the shell do with "pre $@ post" where you have $@ within double quotes but you have other stuff in there, too? Recent versions of Bash preserve the array, prepend the text before the $@ to the first array element, and append the text after the $@ to the last element:
pre aaa
bbb
cc c post
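To try this yourself, here is a small sketch. The function name demo_embedded is hypothetical; it only exists so that the positional parameters can be set by calling it with arguments.

```bash
#!/usr/bin/env bash

# demo_embedded (hypothetical helper): expand "$@" in the middle of a
# double-quoted word and print each resulting word on its own line.
demo_embedded() {
    local word
    for word in "pre $@ post"; do
        printf '%s\n' "$word"
    done
}

demo_embedded aaa bbb "cc c"
# pre aaa
# bbb
# cc c post
```

The same joining rule applies to "duck ${a[@]} goose" with a Bash array in place of the positional parameters.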

Related

GNU Make: How to set an array from a space-separated string?

I'm writing a Terminal Match-Anything Pattern Rule, i.e. %::, that, as expected, will run only if no other target is matched. In its recipe I want to iterate over makefile's explicit targets and check if the found pattern ($*) is the beginning of any other target
So far I'm successfully getting all desired targets in a space-separated string and storing it in a variable TARGETS; however, I couldn't turn it into an array to be able to iterate over each word in the string.
For instance
%::
	$(eval TARGETS ::= $(shell grep -Ph "^[^\t].*::.*##" ./Makefile | cut -d : -f 1 | sort))
	echo $(TARGETS)
gives me just what I was expecting:
build clean compile deploy execute init run serve
The Question
How could I iterate over each word of the $(TARGETS) string inside a GNU Make 4.2.1 loop?
I found a bunch of BASH solutions, but none of them worked in my tests:
Reading a delimited string into an array in Bash
How to split one string into multiple strings separated by at least one space in bash shell?
It's generally a really bad idea to use eval and shell inside a recipe. A recipe is already a shell script so you should just use shell scripting.
It's not really clear exactly what you want to do. If you want to do this in a recipe, you can use a shell loop:
%::
	TARGETS=$$(grep -Ph "^[^\t].*::.*##" ./Makefile | cut -d : -f 1 | sort); \
	for t in $$TARGETS; do \
		echo $$t; \
	done
If you want to do it outside of a recipe you can use the GNU make foreach function.

Bash: quoted array expansion leads to strange results

While experimenting with bash arrays, I stumbled upon behaviour I find hard to explain.
> arr=("a" "b")
> bash -c "echo ${arr[*]}"
a b
> bash -c "echo ${arr[@]}"
a
The relevant part of the bash manual states:
${!name[@]}, ${!name[*]} : If name is an array variable, expands to the list of array indices (keys) assigned in name. [...] When ‘@’ is used and the expansion appears within double quotes, each key expands to a separate word.
As I understand it, I would expect the latter example to expand from bash -c "echo ${arr[@]}" to bash -c "echo \"a\" \"b\"" (or even bash -c "echo a b") and output a b in the subshell.
So, which is the correct behaviour? The observed behaviour? The behaviour I expect? Or something entirely different?
You can run the code under set -xv to see how bash expands the variables:
choroba@triangle:~ $ (set -xv ; arr=("a" "b") ; bash -c "echo ${arr[@]}")
+ arr=("a" "b")
+ bash -c 'echo a' b
a
"echo ${arr[@]}" is expanded to two words, echo a and b. The first one is used as the command to run; the second one is interpreted as the value for $0, the name of the shell. Any following arguments would be used to set the positional parameters.
"echo ${arr[@]}" expands to two words, echo a and b. And the manual also says
If the -c option is present, then commands are read from the first non-option argument command_string. If there are arguments after the command_string, the first argument is assigned to $0 and any remaining arguments are assigned to the positional parameters. The assignment to $0 sets the name of the shell, which is used in warning and error messages.
So, you're assigning b to $0 there.
Proof of concept:
$ arr=(a b)
$ bash -c "echo \$0; echo ${arr[@]}"
b
a
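If the goal is to hand every array element to the inner shell, one common approach is to keep the command string in single quotes and pass the elements as arguments after a placeholder for $0. The following is a sketch under that assumption; the placeholder name _ is only a convention, not something the manual requires.

```bash
#!/usr/bin/env bash

arr=(a "b c")

# The command string is fixed; "_" fills $0, and the array elements
# become $1, $2, ... in the inner shell, where "$@" picks them up intact.
bash -c 'printf "<%s>\n" "$@"' _ "${arr[@]}"
# <a>
# <b c>
```

This avoids building the inner command line by string interpolation, so elements containing spaces survive unchanged.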

What is the difference between bash arrays with the notation ${array[*]} and ${array[@]} [duplicate]

I'm taking a stab at writing a bash completion for the first time, and I'm a bit confused about the two ways of dereferencing bash arrays (${array[@]} and ${array[*]}).
Here's the relevant chunk of code (it works, but I would like to understand it better):
_switch()
{
local cur perls
local ROOT=${PERLBREW_ROOT:-$HOME/perl5/perlbrew}
COMPREPLY=()
cur=${COMP_WORDS[COMP_CWORD]}
perls=($ROOT/perls/perl-*)
# remove all but the final part of the name
perls=(${perls[*]##*/})
COMPREPLY=( $( compgen -W "${perls[*]} /usr/bin/perl" -- ${cur} ) )
}
bash's documentation says:
Any element of an array may be referenced using ${name[subscript]}. The braces are required to avoid conflicts with the shell's filename expansion operators. If the subscript is ‘@’ or ‘*’, the word expands to all members of the array name. These subscripts differ only when the word appears within double quotes. If the word is double-quoted, ${name[*]} expands to a single word with the value of each array member separated by the first character of the IFS variable, and ${name[@]} expands each element of name to a separate word.
Now I think I understand that compgen -W expects a string containing a wordlist of possible alternatives, but in this context I don't understand what "${name[@]} expands each element of name to a separate word" means.
Long story short: ${array[*]} works; ${array[@]} doesn't. I would like to know why, and I would like to understand better what exactly ${array[@]} expands into.
(This is an expansion of my comment on Kaleb Pederson's answer -- see that answer for a more general treatment of [@] vs [*].)
When bash (or any similar shell) parses a command line, it splits it into a series of "words" (which I will call "shell-words" to avoid confusion later). Generally, shell-words are separated by spaces (or other whitespace), but spaces can be included in a shell-word by escaping or quoting them. The difference between [@] and [*]-expanded arrays in double-quotes is that "${myarray[@]}" leads to each element of the array being treated as a separate shell-word, while "${myarray[*]}" results in a single shell-word with all of the elements of the array separated by spaces (or whatever the first character of IFS is).
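The role of the first character of IFS in the "[*]" form can be seen with a small sketch. The function name join_with is a hypothetical helper invented for this example.

```bash
#!/usr/bin/env bash

# join_with SEP ARGS... (hypothetical helper): join ARGS into one string,
# using the single character SEP as the separator.
join_with() {
    local IFS="$1"   # "$*" joins on the first character of IFS
    shift
    printf '%s\n' "$*"
}

myarray=(one two "three four")
join_with ,   "${myarray[@]}"   # one,two,three four
join_with ' ' "${myarray[@]}"   # one two three four
```

Because IFS is declared local, the change does not leak out of the function.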
Usually, the [@] behavior is what you want. Suppose we have perls=(perl-one perl-two) and use ls "${perls[*]}" -- that's equivalent to ls "perl-one perl-two", which will look for a single file named perl-one perl-two, which is probably not what you wanted. ls "${perls[@]}" is equivalent to ls "perl-one" "perl-two", which is much more likely to do something useful.
Providing a list of completion words (which I will call comp-words to avoid confusion with shell-words) to compgen is different; the -W option takes a list of comp-words, but it must be in the form of a single shell-word with the comp-words separated by spaces. Note that command options that take arguments always (at least as far as I know) take a single shell-word -- otherwise there'd be no way to tell when the arguments to the option end, and the regular command arguments (/other option flags) begin.
In more detail:
perls=(perl-one perl-two)
compgen -W "${perls[*]} /usr/bin/perl" -- ${cur}
is equivalent to:
compgen -W "perl-one perl-two /usr/bin/perl" -- ${cur}
...which does what you want. On the other hand,
perls=(perl-one perl-two)
compgen -W "${perls[@]} /usr/bin/perl" -- ${cur}
is equivalent to:
compgen -W "perl-one" "perl-two /usr/bin/perl" -- ${cur}
...which is complete nonsense: "perl-one" is the only comp-word attached to the -W flag, and the first real argument -- which compgen will take as the string to be completed -- is "perl-two /usr/bin/perl". I'd expect compgen to complain that it's been given extra arguments ("--" and whatever's in $cur), but apparently it just ignores them.
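One way to check how many shell-words each form produces, without involving compgen at all, is to pass the same arguments to a function that just counts them. This is only a sketch; count_args is a hypothetical helper, and the default IFS is assumed.

```bash
#!/usr/bin/env bash

# count_args (hypothetical helper): report how many shell-words it received.
count_args() { printf '%s\n' "$#"; }

perls=(perl-one perl-two)

# [*] keeps the word list as a single shell-word after -W:
count_args -W "${perls[*]} /usr/bin/perl"   # 2

# [@] splits it, so the option's argument ends after "perl-one":
count_args -W "${perls[@]} /usr/bin/perl"   # 3
```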
Your title asks about ${array[@]} versus ${array[*]} (both within {}) but then you ask about $array[*] versus $array[@] (both without {}) which is a bit confusing. I'll answer both (within {}):
When you quote an array variable and use @ as the subscript, each element of the array expands to its full content as a separate word, regardless of any whitespace (actually, any $IFS character) within that content. With the asterisk (*) as the subscript, the quoted form joins all elements into a single word, and the unquoted form (like unquoted @) is split again at $IFS, so an element's content may be broken into new words.
Here's the example script:
#!/bin/bash
# Arrays are a bash feature, so this needs bash rather than plain sh.
myarray[0]="one"
myarray[1]="two"
myarray[3]="three four"
echo "with quotes around myarray[*]"
for x in "${myarray[*]}"; do
    echo "ARG[*]: '$x'"
done
echo "with quotes around myarray[@]"
for x in "${myarray[@]}"; do
    echo "ARG[@]: '$x'"
done
echo "without quotes around myarray[*]"
for x in ${myarray[*]}; do
    echo "ARG[*]: '$x'"
done
echo "without quotes around myarray[@]"
for x in ${myarray[@]}; do
    echo "ARG[@]: '$x'"
done
And here's its output:
with quotes around myarray[*]
ARG[*]: 'one two three four'
with quotes around myarray[@]
ARG[@]: 'one'
ARG[@]: 'two'
ARG[@]: 'three four'
without quotes around myarray[*]
ARG[*]: 'one'
ARG[*]: 'two'
ARG[*]: 'three'
ARG[*]: 'four'
without quotes around myarray[@]
ARG[@]: 'one'
ARG[@]: 'two'
ARG[@]: 'three'
ARG[@]: 'four'
I personally usually want "${myarray[@]}". Now, to answer the second part of your question, ${array[@]} versus $array[@].
Quoting the bash docs, which you quoted:
The braces are required to avoid conflicts with the shell's filename expansion operators.
$ myarray=
$ myarray[0]="one"
$ myarray[1]="two"
$ echo ${myarray[@]}
one two
But, when you do $myarray[@], the dollar sign is tightly bound to myarray so it is evaluated before the [@]. For example:
$ ls $myarray[@]
ls: cannot access one[@]: No such file or directory
But, as noted in the documentation, the brackets are for filename expansion, so let's try this:
$ touch one@
$ ls $myarray[@]
one@
Now we can see that the filename expansion happened after the $myarray expansion.
And one more note, $myarray without a subscript expands to the first value of the array:
$ myarray[0]="one four"
$ echo $myarray[5]
one four[5]
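The difference between the braced and unbraced forms can be checked side by side. This is a small sketch with an illustrative array; the expansions are double-quoted so that no filename expansion gets in the way.

```bash
#!/usr/bin/env bash

myarray=("one four" two)

printf '%s\n' "$myarray"          # one four     (same as element 0)
printf '%s\n' "${myarray[0]}"     # one four
printf '%s\n' "${myarray[1]}"     # two
printf '%s\n' "$myarray[1]"       # one four[1]  (element 0 followed by literal "[1]")
```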

Printing array in sh

In bash, if you have an array arr and you want to print all its values, the command
echo ${arr[@]}
will do the trick. In sh however, this command gives a bad substitution error. What is an alternative command(s) for this task in sh?
There is no such thing as a general-purpose array in the POSIX sh specification. The closest thing you have available for an arbitrary variable is a string separated by some delimiter; usually whitespace separated, but can be separated by other characters if the elements themselves can contain spaces.
$@ can be treated as an array in POSIX sh, but it's a bit limited due to the fact that there's only one such variable. You can change the value of $@ with set, so you can do the following:
$ set -- one "two three" four
$ echo "$#"
3
$ echo "$1"
one
$ echo "$2"
two three
$ echo "$3"
four
$ printf '"%s" "%s" "%s"\n' "$@"
"one" "two three" "four"
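Printing every element, whatever the count, and appending an element can both be done with the positional parameters alone. The sketch below uses only POSIX constructs (it is shown as bash for consistency, but avoids bash-only features); list_demo is a hypothetical function name, used so the example does not disturb the caller's own arguments.

```bash
# list_demo (hypothetical helper): treat the positional parameters as a list.
list_demo() {
    set -- one "two three" four   # initialize the "array"
    set -- "$@" five              # append an element
    for item in "$@"; do          # iterate without re-splitting on whitespace
        printf '<%s>\n' "$item"
    done
}

list_demo
# <one>
# <two three>
# <four>
# <five>
```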
Couple questions:
- Can you provide any further details on the script and how the array is being initialized?
- Are you sure that you're actually using sh? On some systems, /bin/sh is a symlink to something else, such as bash.
ls -l /bin/sh
lrwxrwxrwx 1 root root 4 2013-06-04 19:52 /bin/sh -> bash
I would recommend http://www.tutorialspoint.com/unix/unix-using-arrays.htm as a starting point.

Bash arrays: appending and prepending to each element in array

I'm trying to build a long command involving find. I have an array of directories that I want to ignore, and I want to format this directory into the command.
Basically, I want to transform this array:
declare -a ignore=(archive crl cfg)
into this:
-o -path "$dir/archive" -prune -o -path "$dir/crl" -prune -o -path "$dir/cfg" -prune
This way, I can simply add directories to the array, and the find command will adjust accordingly.
So far, I figured out how to prepend or append using
${ignore[@]/#/-o -path \"\$dir/}
${ignore[@]/%/\" -prune}
But I don't know how to combine these and simultaneously prepend and append to each element of an array.
You cannot do it simultaneously easily. Fortunately, you do not need to:
ignore=( archive crl cfg )
ignore=( "${ignore[@]/%/\" -prune}" )
ignore=( "${ignore[@]/#/-o -path \"\$dir/}" )
echo ${ignore[@]}
Note the parentheses and double quotes - they make sure the array contains three elements after each substitution, even if there are spaces involved.
Have a look at printf, which does the job as well:
printf -- '-o -path "$dir/%s" -prune ' "${ignore[@]}"
In general, you should strive to always treat each variable in the quoted form (e.g. "${ignore[@]}") instead of trying to insert quotation marks yourself (just as you should use parameterized statements instead of escaping the input in SQL) because it's hard to be perfect by manual escaping; for example, suppose a variable contains a quotation mark.
In this regard, I would aim at crafting an array where each argument word for find becomes an element: ("-o" "-path" "$dir/archive" "-prune" "-o" "-path" "$dir/crl" "-prune" "-o" "-path" "$dir/cfg" "-prune") (a 12-element array).
Unfortunately, Bash doesn't seem to support a form of parameter expansion where each element expands to multiple words. (p{1,2,3}q expands to p1q p2q p3q, but with a=(1 2 3), p"${a[@]}"q expands to p1 2 3q.) So you need to resort to a loop:
declare -a args=()
for i in "${ignore[@]}"
do
    # I'm not sure if you want to have $dir expanded at this point;
    # otherwise, just use "\$dir/$i".
    args+=(-o -path "$dir/$i" -prune)
done
find ... "${args[@]}" ...
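To confirm that the loop really yields one array element per find argument, even when a directory name contains a space, you can inspect the result before running find. This is a sketch: the value of dir and the name "old logs" are made up for illustration.

```bash
#!/usr/bin/env bash

dir=/srv/data                       # hypothetical base directory
ignore=(archive "old logs" cfg)     # note the space in the second name

args=()
for name in "${ignore[@]}"; do
    args+=(-o -path "$dir/$name" -prune)
done

printf '%s\n' "${#args[@]}"         # 12 (four words per ignored directory)
printf '<%s>\n' "${args[@]}"        # one line per element; <${dir}/old logs> stays whole
```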
If I understand right,
declare -a ignore=(archive crl cfg)
a=$(echo ${ignore[@]} | xargs -n1 -I% echo -o -path '"$dir/%"' -prune)
echo $a
prints
-o -path "$dir/archive" -prune -o -path "$dir/crl" -prune -o -path "$dir/cfg" -prune
This works only with an xargs that has the following switches:
-I replstr
Execute utility for each input line, replacing one or more occurrences of replstr in up to replacements
(or 5 if no -R flag is specified) arguments to utility with the entire line of input. The resulting
arguments, after replacement is done, will not be allowed to grow beyond 255 bytes; this is implemented
by concatenating as much of the argument containing replstr as possible, to the constructed arguments to
utility, up to 255 bytes. The 255 byte limit does not apply to arguments to utility which do not contain
replstr, and furthermore, no replacement will be done on utility itself. Implies -x.
-J replstr
If this option is specified, xargs will use the data read from standard input to replace the first
occurrence of replstr instead of appending that data after all other arguments. This option will not affect
how many arguments will be read from input (-n), or the size of the command(s) xargs will generate (-s).
The option just moves where those arguments will be placed in the command(s) that are executed. The
replstr must show up as a distinct argument to xargs. It will not be recognized if, for instance, it is
in the middle of a quoted string. Furthermore, only the first occurrence of the replstr will be
replaced. For example, the following command will copy the list of files and directories which start
with an uppercase letter in the current directory to destdir:
/bin/ls -1d [A-Z]* | xargs -J % cp -rp % destdir
