Remove some arguments from argument string in zsh - arrays

I'm trying to remove part of an argument string using zsh parameter expansion (no external tools like sed, please). Here's what it's for:
The RUBYOPT environment variable contains arguments which are applied whenever the ruby interpreter is used, just as if they were given along with the ruby command. One argument controls the warning verbosity; possible settings are, for instance, -W0 or -W:no-deprecated. My goal is to remove all -W... arguments from RUBYOPT, say:
-W0 -X -> -X
-W:no-deprecated -X -W1 -> -X
My current approach is to split the string to an array and then make a substitution on every member of the array. This works on two lines of code, but I can't make it work on a single line of code:
% RUBYOPT="-W:no-deprecated -X -W1"
% parts=(${(@s: :)RUBYOPT})
% echo ${parts/-W*}
-X
% echo ${(${(@s: :)RUBYOPT})/-W*}
zsh: error in flags
What am I doing wrong here... or is there a different, more elegant way to achieve this?
Thanks for your hints!

${(... introduces parameter expansion flags (for example: ${(s: :)...}).
In ${(${(@s: :)RUBYOPT})/-W*}, zsh therefore tries to read the inner (${(@s: :)... part as parameter expansion flags, fails, and reports "zsh: error in flags". Nesting the expansion directly, without the extra parentheses, works:
% RUBYOPT="-W:no-deprecated -X -W1"
% print -- ${${(s: :)RUBYOPT}/-W*}
# -X
Update from rowboat's comments: the unanchored pattern also truncates flags that merely contain -W, such as -abc-Whoops or -foo-Whoo:
% RUBYOPT="-W:no-deprecated -X -W1 -foo-Whoo"
% parts=(${(s: :)RUBYOPT})
% print -- ${parts/-W*}
# -X -foo
# Note: -foo would be unexpected
% print -- ${${(s: :)RUBYOPT}/-W*}
# -X -foo
# Note: -foo would be unexpected
The (#s) globbing flag (which requires the shell option EXTENDED_GLOB) anchors the pattern at the start of each element and fixes this:
% RUBYOPT="-W:no-deprecated -X -W1 -foo-Whoo"
% parts=(${(s: :)RUBYOPT})
% setopt extendedglob
# To use `(#s)` flag which is like regex's `^`
% print -- ${parts/(#s)-W*}
# -X -foo-Whoo
% print -- ${${(s: :)RUBYOPT}/(#s)-W*}
# -X -foo-Whoo
Globbing Flags
There are various flags which affect any text to their right up to the end of the enclosing group or to the end of the pattern; they require the EXTENDED_GLOB option. All take the form (#X) where X may have one of the following forms:
...
s, e
Unlike the other flags, these have only a local effect, and each must appear on its own: (#s) and (#e) are the only valid forms. The (#s) flag succeeds only at the start of the test string, and the (#e) flag succeeds only at the end of the test string; they correspond to ^ and $ in standard regular expressions.
...
--- zshexpn(1), Expansion, Globbing Flags
Alternatively, the ${name:#pattern} syntax described below works, too.
end update from rowboat's comments
Another option is the typeset -T feature, which ties the scalar to an array so that it can be manipulated with array operators.
RUBYOPT="-W:no-deprecated -X -W1"
typeset -xT RUBYOPT rubyopt ' '
rubyopt=(${rubyopt:#-W*})
print -l -- "$RUBYOPT"
# -X
typeset
...
-T [ SCALAR[=VALUE] ARRAY[=(VALUE ...)] [ SEP ] ]
...
the -T option requires zero, two, or three arguments to be present. With no arguments, the list of parameters created in this fashion is shown. With two or three arguments, the first two are the name of a scalar and of an array parameter (in that order) that will be tied together in the manner of $PATH and $path. The optional third argument is a single-character separator which will be used to join the elements of the array to form the scalar; if absent, a colon is used, as with $PATH. Only the first character of the separator is significant; any remaining characters are ignored. Multibyte characters are not yet supported.
...
Both the scalar and the array may be manipulated as normal. If one is unset, the other will automatically be unset too.
...
--- zshbuiltin(1), Shell Builtin Commands, typeset
And rubyopt=(${rubyopt:#-W*}) filters the array elements:
${name:#pattern}
If the pattern matches the value of name, then substitute the empty string; otherwise, just substitute the value of name. If name is an array the matching array elements are removed (use the (M) flag to remove the non-matched elements).
--- zshexpn(1), Parameter Expansion, ${name:#pattern}
Note: The "@" flag can be omitted here because empty values do not need to be preserved in this case.
RUBYOPT="-W:no-deprecated -X -W1"
parts=(${(s: :)RUBYOPT})
print -- ${parts/-W*}
# -X
print -- ${${(s: :)RUBYOPT}/-W*}
# -X
Parameter Expansion Flags
...
@
In double quotes, array elements are put into separate words. E.g., "${(@)foo}" is equivalent to "${foo[@]}" and "${(@)foo[1,2]}" is the same as "$foo[1]" "$foo[2]". This is distinct from field splitting by the f, s or z flags, which still applies within each array element.
--- zshexpn(1), Parameter Expansion Flags, @
If the empty values must be kept, the ${name:#pattern} syntax works:
RUBYOPT="-W:no-deprecated  -X -W1"
parts=("${(@s: :)RUBYOPT}")
# parts=("-W:no-deprecated" "" "-X" "-W1")
# Note: the empty value is retained
print -rC1 -- "${(@qqq)parts:#-W*}"
# ""
# "-X"
print -rC1 -- "${(@qqq)${(@s: :)RUBYOPT}:#-W*}"
# ""
# "-X"

Related

Pass a combination of arguments and array from a linux shell script to another shell script

How can I pass the two variables and the array below (its dimensions will vary each time) from one Linux shell script to another shell script?
logfolder=mylogfolder
processname=myname
script=(\
[1]="1 MyScript1" \
[2]="2 MyScript2 MyScript3 MyScript4 MyScript5 MyScript6 MyScript7 MyScript8 MyScript9 MyScript10 MyScript11" \
[3]="2 MyScript12 MyScript13" \
[4]="2 MyScript14 MyScript15" \
[5]="1 MyScript16" \
[6]="1 MyScript17")
Do you need to pass just the array's elements, or the specific index->element mappings? The reason I ask is that the example array you give has indexes 1-6, but by default a bash array would have indexes 0-5. I'm going to assume the indexes aren't important (other than order); if they need to be passed, things get a bit more complicated.
There are two fairly standardish ways to do this; you can either pass the two fixed arguments as the first two positional parameters ($1 and $2), and the array as the third on, like this:
#!/bin/bash
# Usage example:
# ./scriptname "mylogfolder" "myname" "1 MyScript1" \
# "2 MyScript2 MyScript3 MyScript4 MyScript5 MyScript6 MyScript7 MyScript8 MyScript9 MyScript10 MyScript11" \
# "2 MyScript12 MyScript13" ...
logfolder=$1
processname=$2
script=("${@:3}")
Or, if the two values are optional, you could make them options that take parameters, like this:
#!/bin/bash
# Usage example:
# ./scriptname -l "mylogfolder" -p "myname" "1 MyScript1" \
# "2 MyScript2 MyScript3 MyScript4 MyScript5 MyScript6 MyScript7 MyScript8 MyScript9 MyScript10 MyScript11" \
# "2 MyScript12 MyScript13" ...
while getopts "l:p:" o; do
    case "${o}" in
        l)
            logfolder=${OPTARG} ;;
        p)
            processname=${OPTARG} ;;
        *)
            echo "Usage: $0 [-l logfolder] [-p processname] script [script...]" >&2
            exit 1
            ;;
    esac
done
shift $((OPTIND-1))
script=("${@}")
In either case, to pass an array, you'd use "${arrayname[@]}" to pass each element as a separate argument to the script.
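To see why the quotes around "${arrayname[@]}" matter, here is a minimal bash sketch. The receiver function is a hypothetical stand-in for the called script (a function gets its own positional parameters just as a script would), and the array contents are shortened from the question.

```shell
#!/bin/bash
# Hypothetical stand-in for the receiving script: it reads two fixed
# arguments and collects everything from the third argument on.
receiver() {
    local logfolder=$1 processname=$2
    local -a script=("${@:3}")
    printf '%s|%s|%d\n' "$logfolder" "$processname" "${#script[@]}"
}

logfolder=mylogfolder
processname=myname
script=("1 MyScript1" "2 MyScript12 MyScript13" "1 MyScript16")

# Quoted [@] expansion: one argument per element, embedded spaces preserved.
quoted=$(receiver "$logfolder" "$processname" "${script[@]}")
# Unquoted expansion: every element is split again on whitespace.
unquoted=$(receiver "$logfolder" "$processname" ${script[@]})

echo "quoted:   $quoted"
echo "unquoted: $unquoted"
```

The quoted call delivers three array elements to the receiver; the unquoted call delivers seven separate words, so the grouping is lost.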
[EDIT] Passing the array indexes with the elements is more complicated, since the arg list is basically just a list of strings. You need to somehow encode the indexes along with the elements, and then parse them out in the script. Here's one way to do it, using index=element for each thing in the array:
#!/bin/bash
# Usage example:
# ./scriptname "mylogfolder" "myname" "1=1 MyScript1" \
# "2=2 MyScript2 MyScript3 MyScript4 MyScript5 MyScript6 MyScript7 MyScript8 MyScript9 MyScript10 MyScript11" \
# "3=2 MyScript12 MyScript13" ...
logfolder=$1
processname=$2
# Parse arguments 3 on as index=element pairs:
declare -a script=()
for scriptarg in "${@:3}"; do
index=${scriptarg%%=*}
if [[ "$index" = "$scriptarg" || -z "$index" || "$index" = *[^0-9]* ]]; then
echo "$0: invalid script array index in '$scriptarg'" >&2
exit 1
fi
script[$index]=${scriptarg#*=}
done
And to call it, you'd need to package up the script array in the appropriate format something like this:
# Convert the normal script array to a version that has "index=" prefixes
scriptargs=()
for index in "${!script[@]}"; do # iterate over the array _indexes_
    scriptargs+=("${index}=${script[$index]}")
done
./script.sh "$logfolder" "$processname" "${scriptargs[@]}"

Adding a prefix to all array elements in Bash

I am storing command line parameters in an array variable (this is necessary for me).
I want to prefix every array value with a string held in a variable.
PREFIX="rajiv"
services=$( echo $* | tr -d '/' )
echo "${services[@]/#/$PREFIX-}"
I am getting this output.
> ./script.sh webserver wistudio
rajiv-webserver wistudio
But I am expecting this output.
rajiv-webserver rajiv-wistudio
Your array initialization is wrong. Change it to this:
services=($(echo $* | tr -d '/'))
Without the outer (), services is a plain string, so the parameter expansion "${services[@]/#/$PREFIX-}" adds $PREFIX- only once, at the start of that string.
In situations like this, declare -p can be used to examine the contents of your variable. In this case, declare -p services should show you:
declare -a services=([0]="webserver" [1]="wistudio") # it is an array!
and not
declare -- services="webserver wistudio" # it is a plain string
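As a possible variation, the echo and tr pipeline can be avoided altogether: bash applies a pattern substitution on "$@" to each positional parameter in turn, so the arguments never pass through word splitting. The sketch below simulates the command line with set -- and deliberately includes an argument containing a space; both sample arguments are assumptions for illustration.

```shell
#!/bin/bash
# Simulated command line; the second argument contains a space on purpose.
set -- "webserver/" "wi studio"

PREFIX="rajiv"

# Delete every "/" from each positional parameter, one array element per argument.
services=("${@//\//}")

# An empty match anchored at the start of each element (#) is replaced by the prefix.
prefixed=("${services[@]/#/$PREFIX-}")

printf '%s\n' "${prefixed[@]}"
```

Running declare -p prefixed afterwards should show two elements, the second one still containing its space.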

call program with arguments from an array containing items from another array wrapped in double quotes

(This is a more specific version of the problem discussed in bash - expand arguments from array containing double quotes.)
I want bash to call cmake with arguments from an array whose items contain double quotes, which in turn wrap the items of another array. Here is an example for clarification:
cxx_flags=()
cxx_flags+=(-fdiagnostics-color)
cxx_flags+=(-O3)
cmake_arguments=()
cmake_arguments+=(-DCMAKE_BUILD_TYPE=Release)
cmake_arguments+=("-DCMAKE_CXX_FLAGS=\"${cxx_flags[@]}\"")
The arguments should be pretty-printed like this:
$ echo "CMake arguments: ${cmake_arguments[@]}"
CMake arguments: -DCMAKE_BUILD_TYPE=Release -DCMAKE_CXX_FLAGS="-fdiagnostics-color -O3"
Problem
And finally cmake should be called (this does not work!):
cmake .. "${cmake_arguments[@]}"
It expands to (as set -x produces):
cmake .. -DCMAKE_BUILD_TYPE=Release '-DCMAKE_CXX_FLAGS="-fdiagnostics-color' '-O3"'
Workaround
echo "cmake .. ${cmake_arguments[@]}" | source /dev/stdin
Expands to:
cmake .. -DCMAKE_BUILD_TYPE=Release '-DCMAKE_CXX_FLAGS=-fdiagnostics-color -O3'
That's okay but it seems like a hack. Is there a better solution?
Update
If you want to iterate over the array you should use one more variable (as randomir and Jeff Breadner suggested):
cxx_flags=()
cxx_flags+=(-fdiagnostics-color)
cxx_flags+=(-O3)
cxx_flags_string="${cxx_flags[@]}"
cmake_arguments=()
cmake_arguments+=(-DCMAKE_BUILD_TYPE=Release)
cmake_arguments+=("-DCMAKE_CXX_FLAGS=\"$cxx_flags_string\"")
The core problem remains (and the workaround still works) but you could iterate over cmake_arguments and see two items (as intended) instead of three (-DCMAKE_BUILD_TYPE=Release, -DCMAKE_CXX_FLAGS="-fdiagnostics-color and -O3"):
echo "cmake .. \\"
size=${#cmake_arguments[@]}
for ((i = 0; i < $size; ++i)); do
if [[ $(($i + 1)) -eq $size ]]; then
echo " ${cmake_arguments[$i]}"
else
echo " ${cmake_arguments[$i]} \\"
fi
done
Prints:
cmake .. \
-DCMAKE_BUILD_TYPE=Release \
-DCMAKE_CXX_FLAGS="-fdiagnostics-color -O3"
It seems that there's another layer of parsing that has to happen before cmake is happy; the | source /dev/stdin handles this, but you could also just move your CXX flags through an additional variable:
#!/bin/bash -x
cxx_flags=()
cxx_flags+=(-fdiagnostics-color)
cxx_flags+=(-O3)
CXX_FLAGS="${cxx_flags[@]}"
cmake_arguments=()
cmake_arguments+=(-DCMAKE_BUILD_TYPE=Release)
cmake_arguments+=("'-DCMAKE_CXX_FLAGS=${CXX_FLAGS}'")
CMAKE_ARGUMENTS="${cmake_arguments[@]}"
echo "CMake arguments: ${CMAKE_ARGUMENTS}"
returns:
+ cxx_flags=()
+ cxx_flags+=(-fdiagnostics-color)
+ cxx_flags+=(-O3)
+ CXX_FLAGS='-fdiagnostics-color -O3'
+ cmake_arguments=()
+ cmake_arguments+=(-DCMAKE_BUILD_TYPE=Release)
+ cmake_arguments+=("'-DCMAKE_CXX_FLAGS=${CXX_FLAGS}'")
+ CMAKE_ARGUMENTS='-DCMAKE_BUILD_TYPE=Release '\''-DCMAKE_CXX_FLAGS=-fdiagnostics-color -O3'\'''
+ echo 'CMake arguments: -DCMAKE_BUILD_TYPE=Release '\''-DCMAKE_CXX_FLAGS=-fdiagnostics-color -O3'\'''
CMake arguments: -DCMAKE_BUILD_TYPE=Release '-DCMAKE_CXX_FLAGS=-fdiagnostics-color -O3'
There is probably a cleaner solution still, but this is better than the | source /dev/stdin thing, I think.
You basically want cxx_flags array expanded into a single word.
This:
cxx_flags=()
cxx_flags+=(-fdiagnostics-color)
cxx_flags+=(-O3)
flags="${cxx_flags[@]}"
cmake_arguments=()
cmake_arguments+=(-DCMAKE_BUILD_TYPE=Release)
cmake_arguments+=(-DCMAKE_CXX_FLAGS="$flags")
will produce the output you want:
$ set -x
$ echo "${cmake_arguments[@]}"
+ echo -DCMAKE_BUILD_TYPE=Release '-DCMAKE_CXX_FLAGS=-fdiagnostics-color -O3'
So, to summarize, running:
cmake .. "${cmake_arguments[@]}"
with array expansion quoted, ensures each array element (cmake argument) is expanded as only one word (if it contains spaces, the shell won't print quotes around it, but the command executed will receive the whole string as a single argument). You can verify that with set -x.
If you need to print the complete command with arguments in a way that can be reused by copy/pasting, you can consider using printf with %q format specifier, which will quote the argument in a way that can be reused as shell input:
$ printf "cmake .. "; printf "%q " "${cmake_arguments[@]}"; printf "\n"
cmake .. -DCMAKE_BUILD_TYPE=Release -DCMAKE_CXX_FLAGS=-fdiagnostics-color\ -O3
Note the backslash which escapes the space.
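A compact way to get the "single word" behaviour without a helper variable is the [*] form of array expansion, which joins the elements with the first character of IFS. The sketch below assumes IFS still has its default value; count_args is a hypothetical stand-in for cmake that only reports how many arguments it received.

```shell
#!/bin/bash
cxx_flags=(-fdiagnostics-color -O3)

# "${cxx_flags[*]}" joins all elements into ONE word, separated by the first
# character of IFS (a space by default).
cmake_arguments=(
    -DCMAKE_BUILD_TYPE=Release
    "-DCMAKE_CXX_FLAGS=${cxx_flags[*]}"
)

# Hypothetical stand-in for cmake: prints the number of arguments it received.
count_args() { echo "$#"; }

received=$(count_args .. "${cmake_arguments[@]}")
echo "arguments received: $received"

# Copy/paste-safe rendering of the command line.
printf '%q ' cmake .. "${cmake_arguments[@]}"; printf '\n'
```

The stand-in receives three arguments ("..", the build type, and the combined flags), which is what cmake itself would see.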

Arrays in a POSIX compliant shell

According to this reference sheet on hyperpolyglot.org, the following syntax can be used to set an array.
i=(1 2 3)
But I get an error with dash which is the default for /bin/sh on Ubuntu and should be POSIX compliant.
# Trying the syntax with dash in my terminal
> dash -i
$ i=(1 2 3)
dash: 1: Syntax error: "(" unexpected
$ exit
# Working fine with bash
> bash -i
$ i=(1 2 3)
$ echo ${i[@]}
1 2 3
$ exit
Is the reference sheet misleading or erroneous?
If yes, what would be the correct way to define an array or a list and be POSIX compliant?
POSIX does not specify arrays, so if you are restricted to POSIX shell features, you cannot use arrays.
I'm afraid your reference is mistaken. Sadly, not everything you find on the internet is correct.
As said by rici, dash doesn't have array support. However, there are workarounds if what you're looking to do is write a loop.
A for loop has no array to iterate over, but you can do the splitting using a while loop and the read builtin. Since the dash read builtin also doesn't support delimiters, you would have to work around that too.
Here's a sample script:
myArray="a b c d"
echo "$myArray" | tr ' ' '\n' | while read item; do
# use '$item'
echo $item
done
Some deeper explanation on that:
The tr ' ' '\n' will let you do a single-character replace where
you remove the spaces & add newlines - which are the default delim
for the read builtin.
read will exit with a failing exit code when it detects that stdin
has been closed - which would be when your input has been fully
processed.
Since echo prints an extra newline after its input, that will let you
process the last "element" in your array.
This would be equivalent to the bash code:
myArray=(a b c d)
for item in ${myArray[@]}; do
echo $item
done
If you want to retrieve the n-th element (say, the 2nd for the purpose of the example):
myArray="a b c d"
echo $myArray | cut -d' ' -f2 # change -f2 to -fn
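For a space-separated string like this, a sketch that stays inside the shell (no tr, cut, or pipeline) is to let the shell's own field splitting load the items into the positional parameters. This assumes the items contain no whitespace themselves and that the script's original positional parameters are no longer needed, since set -- overwrites them.

```shell
#!/bin/sh
myArray="a b c d"

set -f            # turn off globbing so an item such as * stays literal
set -- $myArray   # unquoted on purpose: split on IFS into $1, $2, ...
set +f            # turn globbing back on

count=$#
second=$2

for item in "$@"; do
    printf 'item: %s\n' "$item"
done
printf 'count=%s second=%s\n' "$count" "$second"
```

Because no pipeline is involved, variables assigned inside the loop remain visible after it ends.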
It is true that the POSIX sh shell does not have named arrays in the same sense that bash and other shells have, but there is a list that sh shells (as well as bash and others) could use, and that's the list of positional parameters.
This list usually contains the arguments passed to the current script or shell function, but you can set its values with the set built-in command:
#!/bin/sh
set -- this is "a list" of "several strings"
In the above script, the positional parameters $1, $2, ..., are set to the five strings shown. The -- is used to make sure that you don't unexpectedly set a shell option (which the set command is also able to do). This is only ever an issue if the first argument starts with a - though.
To e.g. loop over these strings, you can use
for string in "$@"; do
printf 'Got the string "%s"\n' "$string"
done
or the shorter
for string do
printf 'Got the string "%s"\n' "$string"
done
or just
printf 'Got the string "%s"\n' "$@"
set is also useful for expanding globs into lists of pathnames:
#!/bin/sh
set -- "$HOME"/*/
# "visible directory" below really means "visible directory, or visible
# symbolic link to a directory".
if [ ! -d "$1" ]; then
echo 'You do not have any visible directories in your home directory'
else
printf 'There are %d visible directories in your home directory\n' "$#"
echo 'These are:'
printf '\t%s\n' "$@"
fi
The shift built-in command can be used to shift off the first positional parameter from the list.
#!/bin/sh
# pathnames
set -- path/name/1 path/name/2 some/other/pathname
# insert "--exclude=" in front of each
for pathname do
shift
set -- "$@" --exclude="$pathname"
done
# call some command with our list of command line options
some_command "$@"
You can use the argument list $@ as an array in POSIX shells
It's trivial to initialize, shift, unshift, and push:
# initialize $@ containing a string, a variable's value, and a glob's matches
set -- "item 1" "$variable" *.wav
# shift (remove first item, accepts a numeric argument to remove more)
shift
# unshift (prepend new first item)
set -- "new item" "$@"
# push (append new last item)
set -- "$@" "new item"
Here's a pop implementation:
# pop (remove last item, store it in $last)
i=0 len=$#                                 # remember the original length
for last in "$@"; do
    if [ $((i+=1)) = 1 ]; then set --; fi  # increment $i. first run: empty $@
    if [ $i = $len ]; then break; fi       # stop before re-adding the last item
    set -- "$@" "$last"                    # add $last back to $@
done
echo "$last has been removed from ($*)"
($* puts the contents of $@ into a single space-delimited string, ideal for quoting within another string.)
Iterate through the $@ array and modify some of its contents:
i=0
for a in "$@"; do
    if [ $((i+=1)) = 1 ]; then set --; fi  # increment $i. first run: empty $@
    a="${a%.*}.mp3"                        # example tweak to $a: change extension to .mp3
    set -- "$@" "$a"                       # add $a back to $@
done
Refer to items in the $@ array:
echo "$1 is the first item"
echo "$# is the length of the array"
echo "all items in the array (properly quoted): $@"
echo "all items in the array (in a string): $*"
[ "$n" -ge 0 ] && eval "echo \"the ${n}th item in the array is \${$n}\""
(eval is dangerous, so I've ensured $n is a number before running it)
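If you would rather avoid eval entirely, a small function can fetch the n-th item by shifting. This is only a sketch: nth is a hypothetical helper name, and because POSIX sh has no local variables, it overwrites a global variable n.

```shell
#!/bin/sh
# nth N ITEM... : print the N-th item (1-based) of the list that follows N.
# Returns 1 if N is empty, not a number, or out of range.
nth() {
    n=$1
    shift
    case $n in
        ''|*[!0-9]*) return 1 ;;   # reject empty or non-numeric N
    esac
    [ "$n" -ge 1 ] && [ "$n" -le "$#" ] || return 1
    shift "$((n - 1))"             # drop the N-1 items in front of the wanted one
    printf '%s\n' "$1"
}

set -- alpha "beta gamma" delta
second=$(nth 2 "$@")
echo "$second"
```

The function works on its own copy of the list, so the caller's positional parameters are left untouched.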
There are two ways to set $last to the final item of a list without popping it:
with an eval (safe since $# is always a number):
eval last="\$$#"
... or with a loop:
for last in "$@"; do true; done
⚠️ Warning: Functions have their own $@ arrays. You'll have to pass it to the function, like my_function "$@" if read-only or else set -- $(my_function "$@") if you want to manipulate $@ and don't expect spaces in item values.
If you need to handle spaces in item values, it becomes much more cumbersome:
# ensure my_function() returns each list item on its own line
i=1
my_function "$@" | while IFS= read -r line; do
    if [ "$i" = 1 ]; then unset i; set --; fi
    set -- "$@" "$line"
done
This still won't work with newlines in your items. You'd have to escape them to another character (but not null) and then escape them back later.
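One more caveat with the piped loop: in shells that run each pipeline stage in a subshell (bash by default, and dash), the rebuilt list exists only inside the loop and is gone once it finishes. A possible workaround, sketched below, is to capture the output first and feed it to the loop through a here-document so that the loop runs in the current shell. The my_function body here is a hypothetical example that upper-cases each item; an empty result would still yield one empty item.

```shell
#!/bin/sh
# Hypothetical list-producing function: upper-cases each item, one per line.
my_function() {
    printf '%s\n' "$@" | tr '[:lower:]' '[:upper:]'
}

set -- "item one" "item two"

output=$(my_function "$@")   # capture first; trailing newlines are stripped
set --                       # empty the list in the CURRENT shell
while IFS= read -r line; do
    set -- "$@" "$line"
done <<EOF
$output
EOF

count=$#
first=$1
printf 'count=%s first=%s\n' "$count" "$first"
```

Items containing newlines remain unsupported, exactly as noted above.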

Shell script array from command line

I'm trying to write a shell script that can accept multiple elements on the command line to be treated as a single array. The command line argument format is:
exec trial.sh 1 2 {element1 element2} 4
I know that the first two arguments can be accessed with $1 and $2, but how can I access the array, that is, the arguments surrounded by the {} symbols?
Thanks!
This Tcl script uses regex parsing to extract pieces of the command line, transforming your third argument into a list.
Splitting is done on whitespaces - depending on where you want to use this may or may not be sufficient.
#!/usr/bin/env tclsh
#
# Sample arguments: 1 2 {element1 element2} 4
# Split the commandline arguments:
# - tcl will represent the curly brackets as \{ which makes the regex a bit ugly as we have to escape this
# - we use '->' to catch the full regex match as we are not interested in the value and it looks good
# - we are splitting on white spaces here
# - the content between the curly braces is extracted
regexp {(.*?)\s(.*?)\s\\\{(.*?)\\\}\s(.*?)$} $::argv -> first second third fourth
puts "Argument extraction:"
puts "argv: $::argv"
puts "arg1: $first"
puts "arg2: $second"
puts "arg3: $third"
puts "arg4: $fourth"
# Third argument is to be treated as an array, again split on white space
set theArguments [regexp -all -inline {\S+} $third]
puts "\nArguments for parameter 3"
foreach arg $theArguments {
puts "arg: $arg"
}
You should always place variable-length arguments at the end. But if you can guarantee that the last argument is always provided, then something like this will suffice:
#!/bin/bash
arg1=$1 ; shift
arg2=$1 ; shift
# Get the array passed in.
arrArgs=()
while (( $# > 1 )) ; do
arrArgs=( "${arrArgs[@]}" "$1" )
shift
done
lastArg=$1 ; shift
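The same split can also be written without a loop by slicing the positional parameters. This is a bash-only sketch; it simulates the command line with set -- and assumes the caller drops the braces, since the shell would otherwise pass {element1 and element2} along as ordinary words with the braces attached.

```shell
#!/bin/bash
# Simulated command line: two fixed arguments, a variable-length middle, one last argument.
set -- 1 2 element1 element2 4

if (( $# < 3 )); then
    echo "usage: trial.sh arg1 arg2 [element...] last" >&2
    exit 1
fi

arg1=$1
arg2=$2
arrArgs=("${@:3:$#-3}")   # from the third argument up to, but not including, the last
lastArg=${!#}             # indirect expansion: the parameter whose number is $#

printf 'arg1=%s arg2=%s last=%s\n' "$arg1" "$arg2" "$lastArg"
printf 'element: %s\n' "${arrArgs[@]}"
```

Unlike the shift loop, this leaves the positional parameters intact for later use.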

Resources