Bash: quoted array expansion leads to strange results - arrays

While experimenting with bash arrays, I stumbled upon behaviour I find hard to explain.
> arr=("a" "b")
> bash -c "echo ${arr[*]}"
a b
> bash -c "echo ${arr[@]}"
a
The relevant part of the bash manual states:
${!name[@]}, ${!name[*]} : If name is an array variable, expands to the list of array indices (keys) assigned in name. [...] When ‘@’ is used and the expansion appears within double quotes, each key expands to a separate word.
As I understand it, I would expect the latter example to expand from bash -c "echo ${arr[@]}" to bash -c "echo \"a\" \"b\"" (or even bash -c "echo a b") and output a b in the subshell.
So, which is the correct behaviour? The observed behaviour? The behaviour I expect? Or something entirely different?

You can run the code under set -xv to see how bash expands the variables:
choroba@triangle:~ $ (set -xv ; arr=("a" "b") ; bash -c "echo ${arr[@]}")
+ arr=("a" "b")
+ bash -c 'echo a' b
a
"echo ${arr[@]}" is expanded to two words, echo a and b. The first is used as the command to run; the second is interpreted as the value of $0, the name of the shell. Any following arguments would be used to set the positional parameters.

"echo ${arr[@]}" expands to two words, echo a and b. And the manual also says
If the -c option is present, then commands are read from the first non-option argument command_string. If there are arguments after the command_string, the first argument is assigned to $0 and any remaining arguments are assigned to the positional parameters. The assignment to $0 sets the name of the shell, which is used in warning and error messages.
So, you're assigning b to $0 there.
Proof of concept:
$ arr=(a b)
$ bash -c "echo \$0; echo ${arr[@]}"
b
a
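To see the word boundaries directly, one option is to hand the expansion to printf, which repeats its format once per argument. This is only a minimal sketch; show_words is a hypothetical helper name, not something from the posts above.

```bash
#!/usr/bin/env bash
# show_words is a hypothetical helper: it prints every argument it
# receives on its own line, wrapped in angle brackets.
show_words() {
    printf '<%s>\n' "$@"
}

arr=(a b)

# [*] inside double quotes joins the elements into one word.
show_words "echo ${arr[*]}"
# prints: <echo a b>

# [@] inside double quotes yields one word per element; the text in
# front of the expansion is glued onto the first element.
show_words "echo ${arr[@]}"
# prints two lines: <echo a> and <b>
```

With bash -c in place of show_words, the first word becomes the command string and the second becomes $0, which matches the trace shown above.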

Related

How to unset array in bash?

In bash shell for variables:
#!/bin/bash
set -o nounset
my_var=aaa
unset var
echo "$var"
Because set command is defined to return error if variable is not set, last line returns error:
line 6: var: unbound variable
OK, that is what I want.
Now the same thing with arrays:
#!/bin/bash
set -o nounset
my_array=(aaa bbb)
unset my_array
echo "${my_array[@]}"
But to my surprise last line does not return error. I would like bash script to return error when array is not defined.
${my_array[@]} is similar to $@, which is documented to be ignored by nounset:
-u Treat unset variables and parameters other than the special parameters "@" and "*" as an error when performing parameter expansion. If expansion is attempted on an unset variable or parameter, the shell prints an error message, and, if not interactive, exits with a non-zero status.
Returning the array size is not ignored, though. Prepend the following line to make sure the array is not unset:
: ${#my_array[@]}
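The guard can be tried out without killing the calling script by running it in a subshell. The sketch below assumes a Bash version that behaves as the answer describes; check_guard is a hypothetical helper name introduced only for this illustration.

```bash
#!/usr/bin/env bash
# check_guard is a hypothetical helper: it evaluates the size guard in a
# subshell, so a nounset failure ends only the subshell, not this script.
check_guard() {
    ( set -o nounset; : "${#my_array[@]}" ) 2>/dev/null \
        && echo "guard passed" \
        || echo "guard aborted"
}

my_array=(aaa bbb)
check_guard        # the array is set here

unset my_array
check_guard        # the array is unset here
```

In a real script the guard line would sit at top level, right before the first use of the array, so that the script itself stops with the unbound variable error.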

How to write bash function to print and run command when the command has arguments with spaces or things to be expanded

In Bash scripts, I frequently find this pattern useful, where I first print the command I'm about to execute, then I execute the command:
echo 'Running this cmd: ls -1 "$HOME/temp/some folder with spaces"'
ls -1 "$HOME/temp/some folder with spaces"
echo 'Running this cmd: df -h'
df -h
# etc.
Notice the single quotes in the echo command to prevent variable expansion there! The idea is that I want to print the cmd I'm running, exactly as I will type and run the command, then run it!
How do I wrap this up into a function?
Wrapping the command up into a standard bash array, and then printing and calling it, like this, sort-of works:
# Print and run the passed-in command
# USAGE:
# cmd_array=(ls -a -l -F /)
# print_and_run_cmd cmd_array
# See:
# 1. My answer on how to pass regular "indexed" and associative arrays by reference:
# https://stackoverflow.com/a/71060036/4561887 and
# 1. My answer on how to pass associative arrays: https://stackoverflow.com/a/71060913/4561887
print_and_run_cmd() {
local -n array_reference="$1"
echo "Running cmd: ${cmd_array[@]}"
# run the command by calling all elements of the command array at once
${cmd_array[@]}
}
For simple commands like this it works fine:
Usage:
cmd_array=(ls -a -l -F /)
print_and_run_cmd cmd_array
Output:
Running cmd: ls -a -l -F /
(all output of that cmd is here)
But for more-complicated commands it is broken!:
Usage:
cmd_array=(ls -1 "$HOME/temp/some folder with spaces")
print_and_run_cmd cmd_array
Desired output:
Running cmd: ls -1 "$HOME/temp/some folder with spaces"
(all output of that command should be here)
Actual Output:
Running cmd: ls -1 /home/gabriel/temp/some folder with spaces
ls: cannot access '/home/gabriel/temp/some': No such file or directory
ls: cannot access 'folder': No such file or directory
ls: cannot access 'with': No such file or directory
ls: cannot access 'spaces': No such file or directory
The first problem, as you can see, is that $HOME got expanded in the Running cmd: line when it shouldn't have, and the double quotes around that path argument were removed. The second problem is that the command doesn't actually run.
How do I fix these 2 problems?
References:
my bash demo program where I have this print_and_run_cmd function: https://github.com/ElectricRCAircraftGuy/eRCaGuy_hello_world/blob/master/bash/argument_parsing__3_advanced__gen_prog_template.sh
where I first documented how to pass bash arrays by reference, as I do in that function:
Passing arrays as parameters in bash
How to pass an associative array as argument to a function in Bash?
Follow-up question:
Bash: how to print and run a cmd array which has the pipe operator, |, in it
If you've got Bash version 4.4 or later, this function may do what you want:
function print_and_run_cmd
{
local PS4='Running cmd: '
local -
set -o xtrace
"$@"
}
For example, running
print_and_run_cmd echo 'Hello World!'
outputs
Running cmd: echo 'Hello World!'
Hello World!
local PS4='Running cmd: ' sets a prefix for commands printed by the shell when the xtrace option is on. The default is + . Localizing it means that the previous value of PS4 is automatically restored when the function returns.
local - causes any changes to shell options to be reverted automatically when the function returns. In particular, it causes the set -o xtrace on the next line to be automatically undone when the function returns. Support for local - was added in Bash 4.4.
From man bash, under the local [option] [name[=value] ... | - ] section (emphasis added):
If name is -, the set of shell options is made local to the function in which local is invoked: shell options changed using the set builtin inside the function are restored to their original values when the function returns.
set -o xtrace (which is equivalent to set -x) causes the shell to print commands, preceded by the expanded value of PS4, before running them.
See help set.
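As a quick check that local - really undoes the xtrace setting, the function can be followed by a look at $-, which lists the single-letter options currently in effect. This is a sketch assuming Bash 4.4 or later; xtrace_state is a hypothetical helper added for the demonstration.

```bash
#!/usr/bin/env bash
# Requires Bash 4.4 or later because of `local -`.
print_and_run_cmd() {
    local PS4='Running cmd: '   # prefix for the trace line, restored on return
    local -                     # make shell option changes local to the function
    set -o xtrace
    "$@"
}

# xtrace_state is a hypothetical helper: reports whether xtrace is
# enabled in the current shell by looking for "x" in $-.
xtrace_state() {
    case $- in
        *x*) echo "xtrace on" ;;
        *)   echo "xtrace off" ;;
    esac
}

print_and_run_cmd echo 'Hello World!'   # the trace line goes to stderr
xtrace_state                            # prints: xtrace off
```

Because the trace is written to stderr, the command's own stdout can still be captured or piped separately.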
Check your scripts with shellcheck:
Line 2:
local -n array_reference="$1"
^-- SC2034 (warning): array_reference appears unused. Verify use (or export if used externally).
Line 3:
echo "Running cmd: ${cmd_array[@]}"
^-- SC2145 (error): Argument mixes string and array. Use * or separate argument.
^-- SC2154 (warning): cmd_array is referenced but not assigned.
Line 5:
${cmd_array[@]}
^-- SC2068 (error): Double quote array expansions to avoid re-splitting elements.
You might want to research https://github.com/koalaman/shellcheck/wiki/SC2068 . We fix all errors and we get:
print_and_run_cmd() {
local -n array_reference="$1"
echo "Running cmd: ${array_reference[*]}"
# run the command by calling all elements of the command array at once
"${array_reference[@]}"
}
For me it's odd to pass an array by reference in this case. I would pass the actual values. I often do:
prun() {
# in the style of set -x
# print to stderr, so output can be captured
echo "+ $*" >&2
# or echo "+ ${*@Q}" >&2
# or echo "+$(printf " %q" "$@")" >&2
# or echo "+$(/bin/printf " %q" "$@")" >&2
"$@"
}
prun "${cmd_array[@]}"
How do I fix these 2 problems?
Incorporate into your workflow linters, formatters and static analysis tools, like shellcheck, and check the problems they point out.
And quote variable expansion. It's "${array[@]}".
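The printf %q alternative mentioned in the prun comments can be sketched as below. The point is that "+ $*" prints words with spaces ambiguously, while %q prints each word in a form that could be pasted back into the shell. prun_q is a hypothetical name for this variant.

```bash
#!/usr/bin/env bash
# prun_q is a hypothetical variant of prun: it prints each word with
# printf %q, so arguments containing spaces stay recognisable as one word.
prun_q() {
    {
        printf '+'
        printf ' %q' "$@"    # the format is reused once per argument
        printf '\n'
    } >&2                    # trace goes to stderr, like set -x
    "$@"
}

prun_q echo 'hello world'
# stderr: + echo hello\ world
# stdout: hello world
```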
You can achieve what you want with DEBUG trap :
#!/bin/bash
set -T
trap 'test "$FUNCNAME" = print_and_run_cmd || trap_saved_command="${BASH_COMMAND}"' DEBUG
print_and_run_cmd(){
echo "Running this cmd: ${trap_saved_command#* }"
"$@"
}
outer(){
print_and_run_cmd ls -1 "$HOME/temp/some folder with spaces"
}
outer
# output ->
# Running this cmd: ls -1 "$HOME/temp/some folder with spaces"
# ...
I really like @pjh's answer, so I've marked it as correct. It doesn't fully answer my original question though, so if another answer comes along that does, I may have to change that. Anyway, see @pjh's answer for a full explanation of how the code below works and what all those lines mean. I've helped edit that answer with some of the sources from man bash and help set.
I'd like to change the formatting and provide some more examples, however, to show that variable expansion does take place within the command. I'd also like to provide one version which passes by reference, and one which does not, so you can choose the call style which you like best.
Here are my examples, showing both call styles (print_and_run1 cmd_array and print_and_run2 "${cmd_array[@]}"):
#!/usr/bin/env bash
# Print and run the passed-in command, which is passed in as an
# array **by reference**.
# See here for a full explanation: https://stackoverflow.com/a/71151669/4561887
# USAGE:
# cmd_array=(ls -a -l -F /)
# print_and_run1 cmd_array
print_and_run1() {
local -n array_reference="$1"
local PS4='Running cmd: '
local -
set -o xtrace
# Call the cmd
"${array_reference[@]}"
}
# Print and run the passed-in command, which is passed in as members
# of an array **by value**.
# See here for a full explanation: https://stackoverflow.com/a/71151669/4561887
# USAGE:
# cmd_array=(ls -a -l -F /)
# print_and_run2 "${cmd_array[@]}"
print_and_run2() {
local PS4='Running cmd: '
local -
set -o xtrace
# Call the cmd
"$@"
}
cmd_array=(ls -1 "$HOME/temp/some folder with spaces")
print_and_run1 cmd_array
echo ""
print_and_run2 "${cmd_array[@]}"
echo ""
Sample run and output:
eRCaGuy_hello_world/bash$ ./print_and_run.sh
Running cmd: ls -1 '/home/gabriel/temp/some folder with spaces'
file1.txt
file2.txt
Running cmd: ls -1 '/home/gabriel/temp/some folder with spaces'
file1.txt
file2.txt
This seems to work too:
print_and_run_cmd() {
echo "Running cmd: $1"
eval "$1"
}
cmd='ls -1 "$HOME/temp/some folder with spaces"'
print_and_run_cmd "$cmd"
Output:
Running cmd: ls -1 "$HOME/temp/some folder with spaces"
(result of running the cmd is here)
But now the problem is, if I want to print an expanded version of the cmd too, to verify that part worked properly, I can't, or at least, don't know how.

$@ vs. "$@" when an argument is enclosed with single quotes [duplicate]

The $@ variable seems to maintain quoting around its arguments so that, for example:
$ function foo { for i in "$@"; do echo $i; done }
$ foo herp "hello world" derp
herp
hello world
derp
I am also aware that bash arrays work the same way:
$ a=(herp "hello world" derp)
$ for i in "${a[@]}"; do echo $i; done
herp
hello world
derp
What is actually going on with variables like this? Particularly when I add something to the quote like "duck ${a[@]} goose". If it's not space separated, what is it?
Usually, double quotation marks in Bash mean "make everything between the quotation marks one word, even if it has separators in it." But as you've noticed, $@ behaves differently when it's within double quotes. This is actually a parsing hack that dates back to Bash's predecessor, the Bourne shell, and this special behavior applies only to this particular variable.
Without this hack (I use the term because it seems inconsistent from a language perspective, although it's very useful), it would be difficult for a shell script to pass along its array of arguments to some other command that wants the same arguments. Some of those arguments might have spaces in them, but how would it pass them to another command without the shell either lumping them together as one big word or reparsing the list and splitting the arguments that have whitespace?
Well, you could pass an array of arguments, and the Bourne shell really only has one array, represented by $* or $@, whose number of elements is $# and whose elements are $1, $2, etc, the so-called positional parameters.
An example. Suppose you have three files in the current directory, named aaa, bbb, and cc c (the third file has a space in the name). You can initialize the array (that is, you can set the positional parameters) to be the names of the files in the current directory like this:
set -- *
Now the array of positional parameters holds the names of the files. $#, the number of elements, is three:
$ echo $#
3
And we can iterate over the positional parameters in a few different ways.
1) We can use $*:
$ for file in $*; do
> echo "$file"
> done
but that re-separates the arguments on whitespace and calls echo four times:
aaa
bbb
cc
c
2) Or we could put quotation marks around $*:
$ for file in "$*"; do
> echo "$file"
> done
but that groups the whole array into one argument and calls echo just once:
aaa bbb cc c
3) Or we could use $@ which represents the same array but behaves differently in double quotes:
$ for file in "$@"; do
> echo "$file"
> done
will produce
aaa
bbb
cc c
because $1 = "aaa", $2 = "bbb", and $3 = "cc c" and "$@" leaves the elements intact. If you leave off the quotation marks around $@, the shell will flatten and re-parse the array, echo will be called four times, and you'll get the same thing you got with a bare $*.
This is especially useful in a shell script, where the positional parameters are the arguments that were passed to your script. To pass those same arguments to some other command -- without the shell resplitting them on whitespace -- use "$@".
# Truncate the files specified by the args
rm "$@"
touch "$@"
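The three ways of forwarding the arguments can be compared by counting what arrives on the other side. A minimal sketch, assuming the default IFS; count_args and the forward_* functions are hypothetical names used only for this illustration.

```bash
#!/usr/bin/env bash
# count_args is a hypothetical helper: prints how many arguments it received.
count_args() { echo "$#"; }

# Three hypothetical wrappers that forward their arguments differently.
forward_quoted_at()   { count_args "$@"; }   # elements kept intact
forward_bare_at()     { count_args $@; }     # re-split on whitespace (deliberately unquoted)
forward_quoted_star() { count_args "$*"; }   # everything joined into one word

forward_quoted_at   aaa bbb "cc c"   # prints 3
forward_bare_at     aaa bbb "cc c"   # prints 4
forward_quoted_star aaa bbb "cc c"   # prints 1
```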
In Bourne, this behavior only applies to the positional parameters because it's really the only array supported by the language. But you can create other arrays in Bash, and you can even apply the old parsing hack to those arrays using the special "${ARRAYNAME[@]}" syntax, whose at-sign feels almost like a wink to Mr. Bourne:
$ declare -a myarray
$ myarray[0]=alpha
$ myarray[1]=bravo
$ myarray[2]="char lie"
$ for file in "${myarray[@]}"; do echo "$file"; done
alpha
bravo
char lie
Oh, and about your last example, what should the shell do with "pre $@ post" where you have $@ within double quotes but you have other stuff in there, too? Recent versions of Bash preserve the array, prepend the text before the $@ to the first array element, and append the text after the $@ to the last element:
pre aaa
bbb
cc c post
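The output above can be reproduced with a small sketch that makes each resulting word visible. show_words and demo are hypothetical names, not part of the original answer.

```bash
#!/usr/bin/env bash
# show_words is a hypothetical helper: prints each argument on its own
# line, wrapped in angle brackets so the word boundaries are visible.
show_words() { printf '<%s>\n' "$@"; }

# demo is a hypothetical wrapper that embeds "$@" between other text.
demo() { show_words "pre $@ post"; }

demo aaa bbb "cc c"
# prints three lines: <pre aaa>, <bbb>, <cc c post>

demo only
# prints one line: <pre only post>
```

With a single argument the prefix and suffix both attach to that one element, so the result is a single word.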

Program tester, bash

I'm trying to test a program (tp3) with several input files and print the output to another file. So I've designed the following bash script, named runner, to do everything at the same time:
#!/bin/bash
rm $2
clear
FILES=(`ls ${1}`)
cmd='./tp3'
for f in ${FILES[*]}
do
echo "$f"
echo "--------------<$f>--------------" >> $2
$cmd < $1$f 2>> $2 >> $2
done
Every time I run this script I get the following error:
./runner: line 10: $2: ambiguous redirect
./runner: line 11: testtest: No such file or directory
To run the bash script I do:
./runner test
What is wrong in the script?
Modifications to make it work:
First of all I've quoted the variables, then I've replaced the second argument "$2" with a file named "TEST", and now everything is working just fine.
New code:
#!/bin/bash
rm TEST
clear
FILES=(`ls *.in`)
cmd='./tp3'
for f in ${FILES[*]}
do
echo "$f"
echo "--------------<"$f">--------------" >> "TEST"
"$cmd" < "$1$f" >> "TEST" 2>> "TEST"
done
Thanks everyone for your help.
You are running ./runner test, in which test is $1 and $2 is empty. Your redirection is therefore invalid. Also, try to couple stdout and stderr when they point to the same output. This can be done as follows: command arguments > output 2>&1. This sends stderr output to wherever the stdout output is sent.
Also, as Wintermute pointed out: quote variables. Spaces in a variable's value make the shell interpret it as separate arguments. For example, command $1 supplies two arguments to command if $1 equals some string.
This translates into the following: if $f contains a space, an unquoted $f is split, and everything after the space is treated as extra arguments rather than as part of one single argument.
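Putting the two suggestions together, a version of the runner might look like the sketch below. It is only an illustration: run_all is a hypothetical function name, cat stands in for ./tp3, and the demo files are throwaway examples created under mktemp.

```bash
#!/usr/bin/env bash
# run_all is a hypothetical helper; `cat` below stands in for ./tp3.
# Usage: run_all COMMAND INPUT_DIR OUTPUT_FILE
run_all() {
    local cmd=$1 dir=$2 out=$3 f
    : > "$out"                        # start with an empty output file
    for f in "$dir"/*; do             # a glob keeps names with spaces intact
        [ -f "$f" ] || continue       # skip directories and an unmatched glob
        echo "--------------<${f##*/}>--------------" >> "$out"
        "$cmd" < "$f" >> "$out" 2>&1  # stderr follows stdout into the file
    done
}

# Demo with throwaway input files.
workdir=$(mktemp -d)
mkdir "$workdir/in"
echo "first"  > "$workdir/in/a 1.in"
echo "second" > "$workdir/in/b.in"
run_all cat "$workdir/in" "$workdir/out.txt"
cat "$workdir/out.txt"
rm -r "$workdir"
```

Passing the output file as an explicit, quoted argument avoids the ambiguous redirect that an empty $2 causes.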

Printing array in sh

In bash, if you have an array arr and you want to print all its values, the command
echo ${arr[@]}
will do the trick. In sh however, this command gives a bad substitution error. What is an alternative command(s) for this task in sh?
There is no such thing as a general-purpose array in the POSIX sh specification. The closest thing you have available for an arbitrary variable is a string separated by some delimiter; usually whitespace separated, but can be separated by other characters if the elements themselves can contain spaces.
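The delimiter-separated string approach can be sketched in plain sh as follows. The colon delimiter and the print_items name are assumptions chosen for the illustration; any character that cannot occur inside an element would do.

```sh
#!/bin/sh
# print_items is a hypothetical helper: it splits a colon-separated
# string and prints each element in brackets. The function body is a
# subshell, so the IFS and globbing changes do not leak to the caller.
print_items() (
    IFS=:
    set -f                    # no pathname expansion on the unquoted $1
    for item in $1; do
        printf '[%s]\n' "$item"
    done
)

print_items 'one:two three:four'
# prints three lines: [one], [two three], [four]
```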
$@ can be treated as an array in POSIX sh, but it's a bit limited due to the fact that there's only one such variable. You can change the value of $@ with set, so you can do the following:
$ set -- one "two three" four
$ echo "$#"
3
$ echo "$1"
one
$ echo "$2"
two three
$ echo "$3"
four
$ printf '"%s" "%s" "%s"\n' "$@"
"one" "two three" "four"
Couple questions:
- Can you provide any further details on the script and how the array is being initialized?
- Are you sure that you're actually using sh? On some systems /bin/sh is a symlink to something else, like bash.
ls -l /bin/sh
lrwxrwxrwx 1 root root 4 2013-06-04 19:52 /bin/sh -> bash
I would recommend http://www.tutorialspoint.com/unix/unix-using-arrays.htm as a starting point.

Resources