How can zsh array elements be transformed in a single expansion? - arrays

Say you have a zsh array like:
a=("x y" "v w")
I want to take the first word of every element, say:
b=()
for e in $a; {
b=($b $e[(w)0])
}
So now I have what I need in b:
$ print ${(qq)b}
'x' 'v'
Is there a way to do this in a single expansion expression? (i.e. not needing a for loop for processing each array element and accumulating the result in a new array).

It could be possible to take the word by removing from the first occurrence of a white space to the end in each element of the array like this:
$ print ${(qq)a%% *}
'x' 'v'
It coulde be noted that the%% expression (and some others) could be used for array elements:
In the following expressions, when name is an array and the substitution is not quoted, or if the ‘(#)’ flag or the name[#] syntax is used, matching and replacement is performed on each array element separately.
...
${name%pattern}
${name%%pattern}
If the pattern matches the end of the value of name, then substitute the value of name with the matched portion deleted; otherwise, just substitute the value of name. In the first form, the smallest matching pattern is preferred; in the second form, the largest matching pattern is preferred.
-- zshexpn(1): Expansion, Parameter Expansion

Related

sliced arrays flatten to len=1?

I am pulling my hair out manipulating arrays in bash. I have an array of strings, which contain spaces. I would like an array containing all but the first element of my input array.
input=("first string" "second string" "third string")
echo ${#input[#]}
# len(input)=3
# get slice of all except for first element of input
slice=${input[#]:1}
echo ${#slice[#]}
# expect 2, but get 1
echo $slice
# second string third string
# slice should contain ("second string" "third string"), but instead is "second string third string"
Slicing the array clearly works to eliminate the first element, but the result appears to be a concatenation of all remaining strings, rather than an array. Is there a way to slice an array in bash and get an array as a result?
(sorry, I'm not new to bash, but I've never used it for much before, and I can't find any documentation showing why my slice is flattened)
First off, you should always quote variable expansions. Be very wary of any solution that relies on unquoted expansions. ShellCheck.net is a great tool for catching bugs related to quoting (among many other issues).
To your specific issue, slice=${input[#]:1} does not do what you want. It defines a single scalar variable slice rather than an array, meaning the array expansion (denoted by the [#]) will first be munged into a single string using the current IFS. Here's a demo:
$ arr=(1 2 '3 4')
$ IFS=,
$ var="${arr[#]:1}"
$ echo "$var"
2,3 4
To instead declare and populate an array use the =() notation, like so:
$ var=("${arr[#]:1}")
$ printf '%s\n' "${var[#]}"
2
3 4
Indexes are reset, element 1 is now element 0:
slice=("${input[#]:1}")
Element and index are removed, the first element is now index 1, not index 0:
unset input[0]
${#slice[#]} or ${#input[#]} will now be 1 less than the previous value of ${#input[#]}. Starting out with three elements in slice, the values of "${!slice[#]}" and "${!input[#]}", will be 0 1 and 1 2 respectively (for either the first or second approach)
If you don't quote slice=("${input[#]:1}"), each array element is split on whitespace, creating many more elements.

How to slice a variable into array indexes?

There is this typical problem: given a list of values, check if they are present in an array.
In awk, the trick val in array does work pretty well. Hence, the typical idea is to store all the data in an array and then keep doing the check. For example, this will print all lines in which the first column value is present in the array:
awk 'BEGIN {<<initialize the array>>} $1 in array_var' file
However, it is initializing the array takes some time because val in array checks if the index val is in array, and what we normally have stored in array is a set of values.
This becomes more relevant when providing values from command line, where those are the elements that we want to include as indexes of an array. For example, in this basic example (based on a recent answer of mine, which triggered my curiosity):
$ cat file
hello 23
bye 45
adieu 99
$ awk -v values="hello adieu" 'BEGIN {split(values,v); for (i in v) names[v[i]]} $1 in names' file
hello 23
adieu 99
split(values,v) slices the variable values into an array v[1]="hello"; v[2]="adieu"
for (i in v) names[v[i]] initializes another array names[] with names["hello"] and names["adieu"] with empty value. This way, we are ready for
$1 in names that checks if the first column is any of the indexes in names[].
As you see, we slice into a temp variable v to later on initialize the final and useful variable names[].
Is there any faster way to initialize the indexes of an array instead of setting one up and then using its values as indexes of the definitive?
No, that is the fastest (due to hash lookup) and most robust (due to string comparison) way to do what you want.
This:
BEGIN{split(values,v); for (i in v) names[v[i]]}
happens once on startup and will take close to no time while this:
$1 in array_var
which happens once for every line of input (and so is the place that needs to have optimal performance) is a hash lookup and so the fastest way to compare a string value to a set of strings.
not an array solution but one trick is to use pattern matching. To eliminate partial matches wrap the search and array values with the delimiter. For your example,
$ awk -v values="hello adieu" 'FS values FS ~ FS $1 FS' file
hello 23
adieu 99

Inline expansion of arrays in bash

I'm trying to expand an array in bash:
FILES=(2009q{1..4})
echo ${FILES[#]}
echo ${FILES[#]}.zip
Output is:
2009q1 2009q2 2009q3 2009q4
2009q1 2009q2 2009q3 2009q4.zip
But how can I expand the last line as in echo 2009q{1..4}.zip expansion, so that the last line looked like:
2009q1.zip 2009q2.zip 2009q3.zip 2009q4.zip
... but using array FILES?
FILES=(2009q{1..4})
echo ${FILES[#]/%/.zip}
Output:
2009q1.zip 2009q2.zip 2009q3.zip 2009q4.zip
From Bash's Parameter Expansion:
${parameter/pattern/string}: The pattern is expanded to produce a pattern just as in filename expansion. Parameter is expanded and the longest match of pattern against its value is replaced with string. If pattern begins with '/', all matches of pattern are replaced with string. Normally only the first match is replaced. If pattern begins with '#', it must match at the beginning of the expanded value of parameter. If pattern begins with '%', it must match at the end of the expanded value of parameter. If string is null, matches of pattern are deleted and the / following pattern may be omitted. If parameter is '#' or '', the substitution operation is applied to each positional parameter in turn, and the expansion is the resultant list. If parameter is an array variable subscripted with '#' or '', the substitution operation is applied to each member of the array in turn, and the expansion is the resultant list.
You can use printf:
printf "%s.zip " "${FILES[#]}"
2009q1.zip 2009q2.zip 2009q3.zip 2009q4.zip

HP-UX Unix add existing array string elements to total string with for loop starting from defined element

In shell script (!/bin/sh) I have array elements:
u[1]="someString"
u[2]="anotherString"
u[3]="aThirdString"
u[4]="aFourhString"
I want to build a new string, which will display the string values of u array elements that I want and separated by semi-column (;). I want to do the following:
totalString=""
for k in {2..4}; do
totalString+="${u[k]};"
done
But I cannot make it to work. First of all, for some reason k is interpreted like a {2..4} at the above, instead of evaluating the value of u[k] every time to finally display the desired:
anotherString;aThirdString;aFourhString
Note that the example array elements are not always the same as above example. I could have more than 4, or less. I am counting and define them with other part of code and I want to display some of them only at the total new string.

For loop to take the value of the whole array each time

Suppose I have 3 arrays, A, B and C
I want to do the following:
A=("1" "2")
B=("3" "4")
C=("5" "6")
for i in $A $B $C; do
echo ${i[0]} ${i[1]}
#process data etc
done
So, basically i takes the value of the whole array each time and I am able to access the specific data stored in each array.
On the 1st loop, i should take the value of the 1st array, A, on the 2nd loop the value of array B etc.
The above code just iterates with i taking the value of the first element of each array, which clearly isn't what I want to achieve.
So the code only outputs 1, 3 and 5.
You can do this in a fully safe and supportable way, but only in bash 4.3 (which adds namevar support), a feature ported over from ksh:
for array_name in A B C; do
declare -n current_array=$array_name
echo "${current_array[0]}" "${current_array[1]}"
done
That said, there's hackery available elsewhere. For instance, you can use eval (allowing a malicious variable name to execute arbitrary code, but otherwise safe):
for array_name in A B C; do
eval 'current_array=( "${'"$array_name"'[#]}"'
echo "${current_array[0]}" "${current_array[1]}"
done
If the elements of the arrays don't contain spaces or wildcard characters, as in your question, you can do:
for i in "${A[*]}" "${B[*]}" "${C[*]}"
do
iarray=($i)
echo ${iarray[0]} ${iarray[1]}
# process data etc
done
"${A[*]}" expands to a single string containing all the elements of ${A[*]}. Then iarray=($i) splits this on whitespace, turning the string back into an array.

Resources