Unable to add element to array in bash - arrays

I have the following problem. Let´s assume that $# contains only valid files. Variable file contains the name of the current file (the file I'm currently "on"). Then variable element contains data in the format file:function.
Now, when variable element is not empty, it should be put into the array. And that's the problem. If I echo element, it contains exactly what I want, although it is not stored in array, so for cycle doesn't print out anything.
I have written two ways I try to insert element into array, but neither works. Can you tell me, What am I doing wrong, please?
I'm using Linux Mint 16.
#!/bin/bash
nm $# | while read line
do
pattern="`echo \"$line\" | sed -n \"s/^\(.*\):$/\1/p\"`"
if [ -n "$pattern" ]; then
file="$pattern"
fi
element="`echo \"$line\" | sed -n \"s/^U \([0-9a-zA-Z_]*\).*/$file:\1/p\"`"
if [ -n "$element" ]; then
array+=("$element")
#array[$[${#array[#]}+1]]="$element"
echo element - "$element"
fi
done
for j in "${array[#]}"
do
echo "$j"
done

Your problem is that the while loop runs in a subshell because it is the second command in a pipeline, so any changes made in that loop are not available after the loop exits.
You have a few options. I often use { and } for command grouping:
nm "$#" |
{
while read line
do
…
done
for j in "${array[#]}"
do
echo "$j"
done
}
In bash, you can also use process substitution:
while read line
do
…
done < <(nm "$#")
Also, it is better to use $(…) in place of back-quotes `…` (and not just because it is hard work getting back quotes into markdown text!).
Your line:
element="`echo \"$line\" | sed -n \"s/^U \([0-9a-zA-Z_]*\).*/$file:\1/p\"`"
could be written:
element="$(echo "$line" | sed -n "s/^U \([0-9a-zA-Z_]*\).*/$file:\1/p")"
or even:
element=$(echo "$line" | sed -n "s/^U \([0-9a-zA-Z_]*\).*/$file:\1/p")
It really helps when you need them nested. For example, to list the lib directory adjacent to where gcc is found:
ls -l $(dirname $(dirname $(which gcc)))/lib
vs
ls -l `dirname \`dirname \\\`which gcc\\\`\``/lib
I know which I find easier!

Related

Bash: how to print and run a cmd array which has the pipe operator, |, in it

This is a follow-up to my question here: How to write bash function to print and run command when the command has arguments with spaces or things to be expanded
Suppose I have this function to print and run a command stored in an array:
# Print and run the cmd stored in the passed-in array
print_and_run() {
echo "Running cmd: $*"
# run the command by calling all elements of the command array at once
"$#"
}
This works fine:
cmd_array=(ls -a /)
print_and_run "${cmd_array[#]}"
But this does NOT work:
cmd_array=(ls -a / | grep "home")
print_and_run "${cmd_array[#]}"
Error: syntax error near unexpected token `|':
eRCaGuy_hello_world/bash$ ./print_and_run.sh
./print_and_run.sh: line 55: syntax error near unexpected token `|'
./print_and_run.sh: line 55: `cmd_array=(ls -a / | grep "home")'
How can I get this concept to work with the pipe operator (|) in the command?
If you want to treat an array element containing only | as an instruction to generate a pipeline, you can do that. I don't recommend it -- it means you have security risk if you don't verify that variables into your string can't consist only of a single pipe character -- but it's possible.
Below, we create a random single-use "$pipe" sigil to make that attack harder. If you're unwilling to do that, change [[ $arg = "$pipe" ]] to [[ $arg = "|" ]].
# generate something random to make an attacker's job harder
pipe=$(uuidgen)
# use that randomly-generated sigil in place of | in our array
cmd_array=(
ls -a /
"$pipe" grep "home"
)
exec_array_pipe() {
local arg cmd_q
local -a cmd=( )
while (( $# )); do
arg=$1; shift
if [[ $arg = "$pipe" ]]; then
# log an eval-safe copy of what we're about to run
printf -v cmd_q '%q ' "${cmd[#]}"
echo "Starting pipeline component: $cmd_q" >&2
# Recurse into a new copy of ourselves as a child process
"${cmd[#]}" | exec_array_pipe "$#"
return
fi
cmd+=( "$arg" )
done
printf -v cmd_q '%q ' "${cmd[#]}"
echo "Starting pipeline component: $cmd_q" >&2
"${cmd[#]}"
}
exec_array_pipe "${cmd_array[#]}"
See this running in an online sandbox at https://ideone.com/IWOTfO
Do this instead. It works.
print_and_run() {
echo "Running cmd: $1"
eval "$1"
}
Example usage:
cmd='ls -a / | grep -C 9999 --color=always "home"'
print_and_run "$cmd"
Output:
Running cmd: ls -a / | grep -C 9999 --color=always "home"
(rest of output here, with the word "home" highlighted in red)
The general direction is that you don't. You do not store the whole command line to be printed later, and this is not the direction you should take.
The "bad" solution is to use eval.
The "good" solution is to store the literal '|' character inside the array (or some better representation of it) and parse the array, extract the pipe parts and execute them. This is presented by Charles in the other amazing answer. It is just rewriting the parser that already exists in the shell. It requires significant work, and expanding it will require significant work.
The end result is, is that you are reimplementing parts of shell inside shell. Basically writing a shell interpreter in shell. At this point, you can just consider taking Bash sources and implementing a new shopt -o print_the_command_before_executing option in the sources, which might just be simpler.
However, I believe the end goal is to give users a way to see what is being executed. I would propose to approach it like .gitlab-ci.yml does with script: statements. If you want to invent your own language with "debug" support, do just that instead of half-measures. Consider the following YAML file:
- ls -a / | grep "home"
- echo other commands
- for i in "stuff"; do
echo "$i";
done
- |
for i in "stuff"; do
echo "$i"
done
Then the following "runner":
import yaml
import shlex
import os
import sys
script = []
input = yaml.safe_load(open(sys.argv[1], "r"))
for line in input:
script += [
"echo + " + shlex.quote(line).replace("\n", "<newline>"), # some unicode like ␤ would look nice
line,
]
os.execvp("bash", ["bash", "-c", "\n".join(script)])
Executing the runner results in:
+ ls -a / | grep "home"
home
+ echo other commands
other commands
+ for i in "stuff"; do echo "$i"; done
stuff
+ for i in "stuff"; do<newline> echo "$i"<newline>done<newline>
stuff
This offers greater flexibility and is rather simple, supports any shell construct with ease. You can try gitlab-ci/cd on their repository and read the docs.
The YAML format is only an example of the input format. Using special comments like # --- cut --- between parts and extracting each part with the parser will allow running shellcheck over the script. Instead of generating a script with echo statements, you could run Bash interactively, print the part to be executed and then "feed" the part to be executed to interactive Bash. This will alow to preserve $?.
Either way - with a "good" solution, you end up with a custom parser.
Instead of passing an array, you can pass the whole function and use the output of declare -f with some custom parsing:
print_and_run() {
echo "+ $(
declare -f "$1" |
# Remove `f() {` and `}`. Remove indentation.
sed '1d;2d;$d;s/^ *//' |
# Replace newlines with <newline>.
sed -z 's/\n*$//;s/\n/<newline>/'
)"
"$#"
}
cmd() { ls -a / | grep "home"; }
print_and_run cmd
Results in:
+ ls --color -F -a / | grep "home"
home/
It will allow for supporting any shell construct and still allow you to check it with shellcheck and doesn't require that much work.

Trouble with AWK'd command output and bash array

I am attempting to get a list of running VirtualBox VMs (the UUIDs) and put them into an array. The command below produces the output below:
$ VBoxManage list runningvms | awk -F '[{}]' '{print $(NF-1)}'
f93c17ca-ab1b-4ba2-95e5-a1b0c8d70d2a
46b285c3-cabd-4fbb-92fe-c7940e0c6a3f
83f4789a-b55b-4a50-a52f-dbd929bdfe12
4d1589ba-9153-489a-947a-df3cf4f81c69
I would like to take those UUIDs and put them into an array (possibly even an associative array for later use, but a simple array for now is sufficient)
If I do the following:
array1="( $(VBoxManage list runningvms | awk -F '[{}]' '{print $(NF-1)}') )"
The commands
array1_len=${#array1[#]}
echo $array1_len
Outputs "1" as in there's only 1 element. If I print out the elements:
echo ${array1[*]}
I get a single line of all the UUIDs
( f93c17ca-ab1b-4ba2-95e5-a1b0c8d70d2a 46b285c3-cabd-4fbb-92fe-c7940e0c6a3f 83f4789a-b55b-4a50-a52f-dbd929bdfe12 4d1589ba-9153-489a-947a-df3cf4f81c69 )
I did some research (Bash Guide/Arrays on how to tackle this and found this with command substitution and redirection, but it produces an empty array
while read -r -d '\0'; do
array2+=("$REPLY")
done < <(VBoxManage list runningvms | awk -F '[{}]' '{print $(NF-1)}')
I'm obviously missing something. I've looked at several simiar questions on this site such as:
Reading output of command into array in Bash
AWK output to bash Array
Creating an Array in Bash with Quoted Entries from Command Output
Unfortunately, none have helped. I would apprecaite any assistance in figuring out how to take the output and assign it to an array.
I am running this on macOS 10.11.6 (El Captain) and BASH version 3.2.57
Since you're on a Mac:
brew install bash
Then with this bash as your shell, pipe the output to:
readarray -t array1
Of the -t option, the man page says:
-t Remove a trailing delim (default newline) from each line read.
If the bash4 solution is admissible, then the advice given
e.g. by gniourf_gniourf at reading-output-of-command-into-array-in-bash
is still sound.

Content of array in bash is OK when called directly, but lost when called from function

I am trying to use xmllint to search an xml file and store the values I need into an array. Here is what I am doing:
#!/bin/sh
function getProfilePaths {
unset profilePaths
unset profilePathsArr
profilePaths=$(echo 'cat //profiles/profile/#path' | xmllint --shell file.xml | grep '=' | grep -v ">" | cut -f 2 -d "=" | tr -d \")
profilePathsArr+=( $(echo $profilePaths))
return 0
}
In another function I have:
function useProfilePaths {
getProfilePaths
for i in ${profilePathsArr[#]}; do
echo $i
done
return 0
}
useProfilePaths
The behavior of the function changes whether I do the commands manually on the command line VS calling them from different function as part of a wrapper script. When I can my function from a wrapper script, the items in the array are 1, compared to when I do it from the command line, it's 2:
$ echo ${#profilePathsArr[#]}
2
The content of profilePaths looks like this when echoed:
$ echo ${profilePaths}
/Profile/Path/1 /Profile/Path/2
I am not sure what the separator is for an xmllint call.
When I call my function from my wrapper script, the content of the first iteration of the for loop looks like this:
for i in ${profilePathsArr[#]}; do
echo $i
done
the first echo looks like:
/Profile/Path/1
/Profile/Path/2
... and the second echo is empty.
Can anyone help me debug this issue? If I could find out what is the separator used by xmllint, maybe I could parse the items correctly in the array.
FYI, I have already tried the following approach, with the same result:
profilePaths=($(echo 'cat //profiles/profile/#path' | xmllint --shell file.xml | grep '=' | grep -v ">" | cut -f 2 -d "=" | tr -d \"))
Instead of using the --shell switch and many pipes, you should use the proper --xpath switch.
But as far of I know, when you have multiple values, there's no simple way to split the different nodes.
So a solution is to iterate like this :
profilePaths=(
$(
for i in {1..100}; do
xmllint --xpath "//profiles[$i]/profile/#path" file.xml || break
done
)
)
or use xmlstarlet:
profilePaths=( $(xmlstarlet sel -t -v "//profiles/profile/#path" file.xml) )
it display output with newlines by default
The problem you're having is related to data encapsulation; specifically, variables defined in a function are local, so you can't access them outside that function unless you define them otherwise.
Depending on the implementation of sh you're using, you may be able get around this by using eval on your variable definition or with a modifier like global for mksh and declare -g for zsh and bash. I know that mksh's implementation definitely works.
Thank you for providing feedback on how I can resolve this problem. After investigating more, I was able to make this work by changing the way I was iterating the content of my 'profilePaths' variable to insert its values into the 'profilePathsArr' array:
# Retrieve the profile paths from file.xml and assign to 'profilePaths'
profilePaths=$(echo 'cat //profiles/profile/#path' | xmllint --shell file.xml | grep '=' | grep -v ">" | cut -f 2 -d "=" | tr -d \")
# Insert them into the array 'profilePathsArr'
IFS=$'\n' read -rd '' -a profilePathsArr <<<"$profilePaths"
For some reason, with all the different function calls from my master script and calls to other scripts, it seemed like the separators were lost along the way. I am unable to find the root cause, but I know that by using "\n" as the IFS and a while loop, it worked like a charm.
If anybody wishes to add more comments on this, you are more than welcome.

How to store elements with whitespace in an array?

Just wondering, assuming I am storing my data in a file called BookDB.txt in the following format :
C++ for dummies:Jared:10.52:5:6
Java for dummies:David:10.65:4:6
whereby each field is seperated by the delimeter ":".
How would I preserve whitespace in the first field and have an array with the following contents : ('C++ for dummies' 'Java for dummies')?
Any help is very much appreciated!
Ploutox's solution is almost correct, but without setting IFS, you will not get the array that you seek, with two elements in this case.
Note: He corrected his solution after this post.
IFS=$'\n': arr=( $(awk -F':' '{print $1 }' Input.txt ) )
echo ${#arr[#]}
echo ${arr[0]}
echo ${arr[1]}
Output:
2
C++ for dummies
Java for dummies
Just use a while loop:
#!/bin/bash
# create and populate the array
a=()
while IFS=':' read -r field _
do
a+=("$field")
done < file
# print the array contents
printf "%s\n" "${a[#]}"
I totally misunderstood your question on my 1st attempt to answer. awk seems more suited for your need though. You can get what you want with simple scripting :
IFS=$'\n' : MYARRAY=($(awk -F ":" '{print $1}' myfile))
the -F flag forces : as the field separator.
echo ${MYARRAY[0]} will print :
C++ for dummies
$ yes sed -i "s/:/\'\'/" BookDB.txt | head -n100 | bash
this command while work. this is a linux command, run it on shell in same path with BookDB.txt

Using a variable to pass grep pattern in bash

I am struggling with passing several grep patterns that are contained within a variable. This is the code I have:
#!/bin/bash
GREP="$(which grep)"
GREP_MY_OPTIONS="-c"
for i in {-2..2}
do
GREP_MY_OPTIONS+=" -e "$(date --date="$i day" +'%Y-%m-%d')
done
echo $GREP_MY_OPTIONS
IFS=$'\n'
MYARRAY=( $(${GREP} ${GREP_MY_OPTIONS} "/home/user/this path has spaces in it/"*"/abc.xyz" | ${GREP} -v :0$ ) )
This is what I wanted it to do:
determine/define where grep is
assign a variable (GREP_MY_OPTIONS) holding parameters I will pass to grep
assign several patterns to GREP_MY_OPTIONS
using grep and the patterns I have stored in $GREP_MY_OPTIONS search several files within a path that contains spaces and hold them in an array
When I use "echo $GREP_MY_OPTIONS" it is generating what I expected but when I run the script it fails with an error of:
/bin/grep: invalid option -- ' '
What am I doing wrong? If the path does not have spaces in it everything seems to work fine so I think it is something to do with the IFS but I'm not sure.
If you want to grep some content in a set of paths, you can do the following:
find <directory> -type f -print0 |
grep "/home/user/this path has spaces in it/\"*\"/abc.xyz" |
xargs -I {} grep <your_options> -f <patterns> {}
So that <patterns> is a file containing the patterns you want to search for in each file from directory.
Considering your answer, this shall do what you want:
find "/path\ with\ spaces/" -type f | xargs -I {} grep -H -c -e 2013-01-17 {}
From man grep:
-H, --with-filename
Print the file name for each match. This is the default when
there is more than one file to search.
Since you want to insert the elements into an array, you can do the following:
IFS=$'\n'; array=( $(find "/path\ with\ spaces/" -type f -print0 |
xargs -I {} grep -H -c -e 2013-01-17 "{}") )
And then use the values as:
echo ${array[0]}
echo ${array[1]}
echo ${array[...]}
When using variables to pass the parameters, use eval to evaluate the entire line. Do the following:
parameters="-H -c"
eval "grep ${parameters} file"
If you build the GREP_MY_OPTIONS as an array instead of as a simple string, you can get the original outline script to work sensibly:
#!/bin/bash
path="/home/user/this path has spaces in it"
GREP="$(which grep)"
GREP_MY_OPTIONS=("-c")
j=1
for i in {-2..2}
do
GREP_MY_OPTIONS[$((j++))]="-e"
GREP_MY_OPTIONS[$((j++))]=$(date --date="$i day" +'%Y-%m-%d')
done
IFS=$'\n'
MYARRAY=( $(${GREP} "${GREP_MY_OPTIONS[#]}" "$path/"*"/abc.xyz" | ${GREP} -v :0$ ) )
I'm not clear why you use GREP="$(which grep)" since you will execute the same grep as if you wrote grep directly — unless, I suppose, you have some alias for grep (which is then the problem; don't alias grep).
You can do one thing without making things complex:
First do a change directory in your script like following:
cd /home/user/this\ path\ has\ spaces\ in\ it/
$ pwd
/home/user/this path has spaces in it
or
$ cd "/home/user/this path has spaces in it/"
$ pwd
/home/user/this path has spaces in it
Then do what ever your want in your script.
$(${GREP} ${GREP_MY_OPTIONS} */abc.xyz)
EDIT :
[sgeorge#sgeorge-ld stack1]$ ls -l
total 4
drwxr-xr-x 2 sgeorge eng 4096 Jan 19 06:05 test tesd
[sgeorge#sgeorge-ld stack1]$ cat test\ tesd/file
SUKU
[sgeorge#sgeorge-ld stack1]$ grep SUKU */file
SUKU
EDIT :
[sgeorge#sgeorge-ld stack1]$ find */* -print | xargs -I {} grep SUKU {}
SUKU

Resources