How to split a file on an "empty line" pattern into an array of variables? (using only native bash)

I'm using a bash script to walk a deep folder structure and extract information (folder sizes, text extracted from config files, etc.), which I then push into a database to summarize.
"NO NEW PROCESS" is my rule for this script, since every folder leads to about 300 conf files and I have about 10,000 folders, so native commands only, please.
Here's part of one input file that I'm trying to work with:
include_ldap_query
attrs mail
ssl_ciphers ALL
filter (mail=john.doe*)
name MyRequestName1
host myldaphost:30002
use_ssl no
passwd MyPassword
timeout 60
suffix ou=collaborators,ou=My Company,ou=people,dc=MyLdapContent,dc=MyCompany,dc=fr
user uid=MyUserID,ou=accounts,dc=MyLdapContent,dc=MyCompany,dc=fr
ssl_version sslv2
scope sub
select all

include_ldap_query
attrs mail
ssl_ciphers ALL
filter (mail=janedoe*)
name MyRequestName2
host myldaphost:30002
use_ssl no
passwd MyPassword
timeout 60
suffix ou=collaborators,ou=My Company,ou=people,dc=MyLdapContent,dc=MyCompany,dc=fr
user uid=MyUserID,ou=accounts,dc=MyLdapContent,dc=MyCompany,dc=fr
ssl_version sslv3
scope sub
select first

include_ldap_query
attrs mail
ssl_ciphers ALL
filter (mail=jimmy.page*)
name MyRequestName3
host myldaphost:30002
use_ssl no
passwd MyPassword
timeout 60
suffix ou=collaborators,ou=My Company,ou=people,dc=MyLdapContent,dc=MyCompany,dc=fr
user uid=MyUserID,ou=accounts,dc=MyLdapContent,dc=MyCompany,dc=fr
ssl_version sslv3
scope sub
I'd like to put those queries into an array, so that I can work with each one separately.
How can I split on the empty line pattern?

Not exactly pure bash, but awk's paragraph mode (an empty RS) splits the input into records on empty lines:
awk '{print NR ":", $0}' RS= file

Pure bash
You can use each blank line to trigger an array index increment.
This example assumes the input is piped to stdin.
The layout in the array will be like:
ARRAY[0]="include_ldap_query|attrs mail|ssl_ciphers ALL|filter (mail=john.doe*)|..."
ARRAY[1]="include_ldap_query|attrs mail|ssl_ciphers ALL|filter (mail=janedoe*)|..."
This should do the job:
#!/bin/bash
INDEX=0
while IFS= read -r LINE
do
    if [ -z "$LINE" ]; then
        (( ++INDEX ))   # blank line: start the next record
    else
        # append the line; "|" is only added between fields
        ARRAY[INDEX]="${ARRAY[INDEX]:+${ARRAY[INDEX]}|}$LINE"
    fi
done
# down here comes the rest of your code
# everything is in the array by now
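To make the idea above easier to try out, here is a minimal self-contained sketch. The sample input is a shortened, made-up version of the file from the question, and the names records and fields are only illustrative. It assumes "|" never occurs in the data; LDAP filters can contain "|", so a different separator may be needed for real files.

```shell
#!/bin/bash
# Hypothetical sample input: two records separated by one blank line.
input='name MyRequestName1
host myldaphost:30002

name MyRequestName2
host myldaphost:30002'

records=()
index=0
while IFS= read -r line; do
    if [[ -z $line ]]; then
        # Blank line: advance only if the current record already has content,
        # so several consecutive blank lines do not create empty records.
        [[ -n ${records[index]} ]] && (( ++index ))
    else
        records[index]="${records[index]:+${records[index]}|}$line"
    fi
done <<< "$input"

# Split one record back into its lines, still using builtins only.
IFS='|' read -r -a fields <<< "${records[1]}"
printf '%s\n' "${#records[@]} records" "${fields[0]}"
```

This prints "2 records" followed by "name MyRequestName2". Both the here-string and read are handled by the shell itself, so no external command is started.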

Related

Split comma separated and quoted string into an array in Bash

I need to split a comma separated, but quoted list of strings into an indexed bash array in a script.
I know there are a lot of posts on the web in general and also on SO that show how to create an indexed array from a given line / string, but I could not find any example that does the array elements the way I need. I apologise, if I have missed any obvious examples from SO itself.
I am reading a file that I receive from someone, and cannot change it.
The file is formatted like this
"Grant ACL","grantacls.sh"
"Revoke ACL","revokeacls.sh"
"Get ACls for Topic","topicacls.sh"
"Get Topics for User with ACLs","useracls.sh"
I need to create an array for each line above where the separator is comma - and each of the quoted string will be an array element. I have tried various options. The latest attempt was using a construct like this - copied from some example on the web
parseScriptMapLine=${scriptName[$IN_OPTION]}
mapfile -td ',' script1 < <(echo -n "${parseScriptMapLine//, /,}")
declare -p script1
echo "script1 $script1"
where scriptName is an associative array created from the original file, with 1, 2, etc. as the keys and the part after the '=' sign as the values.
The above snippet prints
script1
And the value part I need to split into an indexed array, so that I can pass the second element as a parameter. When creating indexed array from the value string, if I have to lose the quotes, that is fine or if it creates the elements with the quotes, that is fine too.
1="Grant ACL","grantacls.sh"
2="Revoke ACL","revokeacls.sh"
3="Get ACls for Topic","topicacls.sh"
4="Get Topics for User with ACLs","useracls.sh"
I have looked at a lot of examples, but haven't been able to get this particular requirement working.
Thank you
With apologies, I could not understand what you wanted - this sounds like an X/Y Problem. Can you clarify?
Maybe this?
$: while IFS=',"' read -r _ a _ _ d _ && [[ -n "$d" ]]; do echo "a=[$a] d=[$d]"; done < file
a=[Grant ACL] d=[grantacls.sh]
a=[Revoke ACL] d=[revokeacls.sh]
a=[Get ACls for Topic] d=[topicacls.sh]
a=[Get Topics for User with ACLs] d=[useracls.sh]
That will let you do whatever you wanted with the fields, which I named a and d.
If you just want to load the lines of the file into an array -
$: mapfile -t script1 < file
$: for i in "${!script1[@]}"; do echo "$i=${script1[i]}"; done
0="Grant ACL","grantacls.sh"
1="Revoke ACL","revokeacls.sh"
2="Get ACls for Topic","topicacls.sh"
3="Get Topics for User with ACLs","useracls.sh"
If you want a two-dimensional array, then sorry, you're going to have to use something besides bash, or get more creative.
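One way to "get more creative" is to emulate two dimensions with an associative array keyed by "row,col". This is only a sketch: the names cell and rows are made up, it needs bash 4 or later, and the IFS trick it borrows from the loop above assumes that no field contains a comma or an embedded quote.

```shell
#!/bin/bash
# Requires bash 4+ for associative arrays.
declare -A cell      # hypothetical name: cell["row,col"] holds one field
rows=0
while IFS=',"' read -r _ label _ _ script _; do
    [[ -n $script ]] || continue      # skip blank or malformed lines
    cell["$rows,0"]=$label
    cell["$rows,1"]=$script
    (( ++rows ))
done <<'EOF'
"Grant ACL","grantacls.sh"
"Revoke ACL","revokeacls.sh"
EOF

# Second field of the second line
echo "${cell[1,1]}"
```

This prints revokeacls.sh. The second field of any line can then be passed as a parameter by looking up "${cell[$n,1]}".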

Adding value to an associative array named after a variable

I need your help with a bash >= 4 script I'm writing.
I am retrieving some files from remote hosts to back them up.
I have a for loop that iterates through the hosts and, for each one, tests the connection and then starts a function that retrieves the various files.
My problem is that I need to know whether something went wrong (and what), so I am trying to store OK or KO values in an array and parse it later.
This is the code:
...
for remote_host in $hosts ; do
short_host=$(echo "$remote_host" | grep -o '^[^.]\+')
declare -A cluster
printf "INFO: Testing connectivity to %s... " "$remote_host"
if ssh -q "$remote_host" exit ; then
printf "OK!\n"
cluster[$short_host]="Reacheable"
mkdir "$short_host"
echo "INFO: Collecting files ..."
declare -A ${short_host}
objects1="/etc/krb5.conf /etc/passwd /etc/group /etc/fstab /etc/sudoers /etc/shadow"
for obj in ${objects1} ; do
if file_retrieve "$user" "$remote_host" "$obj" ; then
-> ${short_host}=["$obj"]=OK
else
${short_host}=["$obj"]=KO
fi
done
...
So I'm using an array named cluster to record whether the nodes were reachable, and another array, named after the short name of the node, to record OK or KO for individual files.
On execution, I got the following error (line 130 is the line I marked with the arrow above):
./test.sh: line 130: ubuntu01=[/etc/krb5.conf]=OK: command not found
I think this is a syntax error for sure, but I can't fix it. I tried a bunch of combinations without success.
Thanks for your help.
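Since the array name is held in the variable short_host, bash does not recognize the line as an assignment; after expansion it is treated as a command name, hence "command not found". The line also has a stray = before the subscript:
${short_host}=["$obj"]=OK
Use eval and drop the extra =:
eval "${short_host}[\$obj]=OK"
The same change applies to the KO branch.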
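If the script runs on bash 4.3 or later, a nameref (declare -n) can avoid eval altogether. The sketch below is an assumption about how that could look, not the original poster's code: the alias name results is made up, and the if condition is a stand-in for the real file_retrieve call.

```shell
#!/bin/bash
# Requires bash 4.3+ for namerefs (declare -n).
short_host=ubuntu01                 # sample value taken from the error message
declare -A "$short_host"            # create the per-host associative array

declare -n results="$short_host"    # results is now an alias for ubuntu01
for obj in /etc/krb5.conf /etc/passwd; do
    # Stand-in for file_retrieve: pretend only krb5.conf is retrieved.
    if [[ $obj == /etc/krb5.conf ]]; then
        results["$obj"]=OK
    else
        results["$obj"]=KO
    fi
done
unset -n results                    # drop the alias, keep ubuntu01 intact

declare -p ubuntu01
```

Either way, the host's short name has to be a valid variable name, so a host such as web-01 would need its hyphen replaced first. A single associative array keyed by something like "host:file" would sidestep that restriction.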

How can I split bash CLI arguments into two separate arrays for later usage?

New to StackOverflow and new to bash scripting. I have a shell script that is attempting to do the following:
cd into a directory on a remote machine. Assume I have already established a successful SSH connection.
Save the email addresses from the command line input (these could range from 1 to X number of email addresses entered) into an array called 'emails'
Save the brand IDs (integers) from the command line input (these could range from 1 to X number of brand IDs entered) into an array called 'brands'
Use nested for loops to iterate over the 'emails' and 'brands' arrays and add each email address to each brand via add.py
I am running into trouble splitting up and saving data into each array, because I do not know where the command line indices of the emails will stop, and where the indices of the brands will begin. Is there any way I can accomplish this?
command line input I expect to look as follows:
me@some-remote-machine:~$ bash script.sh person1@gmail.com person2@gmail.com person3@gmail.com ... personX@gmail.com brand1 brand2 brand3 ... brandX
The contents of script.sh look like this:
#!/bin/bash
cd some/directory
emails= ???
brands= ???
for i in $emails
do
for a in $brands
do
python test.py add --email=$i --brand_id=$a --grant=manage
done
done
Thank you in advance, and please let me know if I can clarify or provide more information.
Use a sentinel argument that cannot possibly be a valid e-mail address. For example:
$ bash script.sh person1@gmail.com person2@gmail.com '***' brand1 brand2 brand3
Then in a loop, you can read arguments until you reach the non-email; everything after that is a brand.
#!/bin/bash
cd some/directory
while (( $# )) && [[ $1 != '***' ]]; do # stop at the sentinel, or when the arguments run out
emails+=("$1")
shift
done
shift # Ignore the sentinel
brands=( "$@" ) # What's left
for i in "${emails[@]}"
do
for a in "${brands[@]}"
do
python test.py add --email="$i" --brand_id="$a" --grant=manage
done
done
If you can't modify the arguments that will be passed to script.sh, then perhaps you can distinguish between an address and a brand by the presence or absence of an @:
while [[ $1 = *@* ]]; do
emails+=("$1")
shift
done
brands=("$@")
I'm assuming that the numbers of addresses and brands are independent. Otherwise, you can simply look at the total number of arguments $#. Say there are N of each. Then
emails=( "${@:1:$#/2}" ) # First half
brands=( "${@:$#/2+1}" ) # Second half
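The "split on the presence of @" idea can be tried without touching any real data. In this sketch, split_args is a hypothetical helper that fills the global arrays emails and brands, and the echo stands in for the python call from the question.

```shell
#!/bin/bash
# split_args is a hypothetical helper; it fills the global arrays
# emails and brands from its arguments.
split_args() {
    emails=() brands=()
    # Leading arguments that contain "@" are treated as e-mail addresses.
    while (( $# )) && [[ $1 == *@* ]]; do
        emails+=("$1")
        shift
    done
    brands=("$@")   # everything that is left
}

split_args person1@gmail.com person2@gmail.com brand1 brand2 brand3
for email in "${emails[@]}"; do
    for brand in "${brands[@]}"; do
        echo "email=$email brand_id=$brand"
    done
done
```

The (( $# )) guard keeps the loop from spinning forever when every argument happens to contain an @.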

Bash - Concatenating backslash while joining an array

I've been trying to figure out a bash script to determine the server directory path, such as D:\xampp\htdocs, and the project folder's name, such as "my_project", while Grunt is running my postinstall script. So far I can grab the project folder's name, and I can get an array of the remaining indices that comprise the server root path on my system, but I can't seem to join the array with an escaped backslash. This is probably not the best solution (definitely not the most elegant), so if you have any tips or suggestions along the way, I'm amenable.
# Determine project folder name and server root directory path
bashFilePath=$0 # get path to post_install.sh
IFS='\' bashFilePathArray=($bashFilePath) # split path on \
len=${#bashFilePathArray[@]} # get array length
# Name of project folder in server root directory
projName=${bashFilePathArray[len-3]} # returns my_project
ndx=0
serverPath=""
while [ $ndx -le `expr $len - 4` ]
do
serverPath+="${bashFilePathArray[$ndx]}\\" # tried in and out of double quotes, also in separate concat below
(( ndx++ ))
done
echo $serverPath # returns D: xampp htdocs, works if you sub out \\ for anything else, such as / will produce D:/xampp/htdocs, just not \\
You can only prefix command invocations, not variable assignments, with IFS, so your line
IFS='\' bashFilePathArray=($bashFilePath)
is just a pair of assignments; the expansion of $bashFilePath is unaffected by the assignment to IFS. Instead, use the read builtin.
IFS='\' read -ra bashFilePathArray <<< "$bashFilePath"
Later, you can use a subshell to easily join the first few elements of the array into a single string.
serverPath=$(IFS='\'; echo "${bashFilePathArray[*]:0:len-3}")
The semicolon is required, since the argument to echo is expanded before echo actually runs, meaning IFS needs to be modified "globally" rather than just for the echo command. Also, [*] is required in place of the more commonly recommended [@], because here we rely on the property that a quoted [*] expansion produces a single word, joined on the first character of IFS, rather than a sequence of words.
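Putting the two commands together on a made-up path gives a quick way to check the result. The sample path and the array name parts are assumptions for illustration; printf is used instead of echo inside the subshell only so that backslashes are never interpreted, whatever the echo settings are.

```shell
#!/bin/bash
# Hypothetical sample path, shaped like the one in the question.
bashFilePath='D:\xampp\htdocs\my_project\scripts\post_install.sh'

# Split on backslash; -r keeps read from treating "\" as an escape character.
IFS='\' read -ra parts <<< "$bashFilePath"
len=${#parts[@]}

projName=${parts[len-3]}
# Join the leading elements with "\" inside a subshell, so the IFS change
# does not leak into the rest of the script.
serverPath=$(IFS='\'; printf '%s' "${parts[*]:0:len-3}")

printf '%s\n' "$projName" "$serverPath"
```

This prints my_project and then D:\xampp\htdocs.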

is it possible to use bash to access more than one array in a for loop

I'm trying to write a bash script that will let me download multiple web pages using curl. For each webpage, I want to be able to pass curl the page and the referer link. I want to be able to supply multiple webpages at once.
In other words, I want to be able to loop through the webpages I supply the script, and for each page, pass the associated webpage and referer link to curl.
I thought I'd use an array to store the webpage and referer link in a single variable, thinking that I could then extract the individual elements of the array when running curl.
My problem is that I can't figure out how to get multiple arrays to work properly in a for loop. Here is an idea of what I want to do. This code does not work, since "$i" (in the for loop) doesn't become an array.
#every array has the information for a separate webpage
array=( "webpage" "referer" )
array2=( "another webpage" "another referer" )
for i in "${array[@]}" "${array2[@]}" #line up multiple web pages
do
#use curl to download the page, giving the referer ("-e")
curl -O -e "${i[1]}" "${i[0]}"
done
If I was only working with one array, I could easily do it like this:
array=( "webpage" "referer" )
REFERER="${array[1]}"
PAGE="${array[0]}"
#use curl to download the page, giving the referer ("-e")
curl -O -e "$REFERER" "$PAGE"
It's once I have more than one webpage that I want to process at once that I can't figure out how to do it correctly.
If there is another way to handle multiple webpages, without having to use arrays and a for loop, please let me know.
Using arrays is fine, at least it's much better than using space-separated lists or similar hacks. Simply loop over the indices:
array=('webpage' 'another webpage')
array2=('referrer' 'another referrer')
# note the different layout!
for i in "${!array[@]}"
do
webpage="${array[$i]}"
referrer="${array2[$i]}"
done
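You need a trick here. Spaces are not allowed in URLs, so you can store each URL and its referrer as one space-separated string and let word splitting separate them (note that $i must be left unquoted in set --):
webpages=("url referrer" "url2 ref2" ...)
for i in "${webpages[@]}" ; do
set -- $i
url="$1"
ref="$2"
curl -O -e "${ref}" "${url}"
done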
[EDIT] Maybe a better solution will be to put all the URLs into a file and then use this code:
while read -r url ref ; do
curl -O -e "${ref}" "${url}"
done < file
or if you prefer here documents:
while read url ref ; do
echo "url=$url ref=$ref"
done <<EOF
url1 ref1
url2 ref2
... xxx
EOF
Just as a general aside: inside a function, declare the IFS variable (declare makes it local there) to limit its scope to that function. There is no need to save and restore IFS via OLD_IFS!
help declare
IFS=$' \t\n'
printf "%q\n" "$IFS"
function ifs_test () {
declare IFS
IFS=$'\n'
printf "%q\n" "$IFS"
return 0
}
ifs_test
printf "%q\n" "$IFS"
Thanks to everyone for their responses. Both ideas had merit, but I found some code in the Advanced Bash Guide that does exactly what I want to do.
I can't say I fully understand it, but by using an indirect reference to the array, I can use multiple arrays in the for loop. I'm not sure what the local command does, but it is the key (I think it runs a sort of eval and assigns the string to the variable).
The advantage of this is that I can group each webpage and referer into their own array. I can then easily add a new website, by creating a new array and adding it to the for loop. Also, should I need to add more variables to the curl command (such as a cookie), I can easily expand the array.
function get_page () {
OLD_IFS="$IFS"
IFS=$'\n' # If the element has spaces, when using
# local to assign variables
local ${!1}
# Print variable
echo First Variable: "\"$a\""
echo Second Variable: "\"$b\""
echo ---------------
echo curl -O -e "\"$a\"" "\"$b\""
echo
IFS="$OLD_IFS"
}
#notice the addition of "a=" and "b="
#this is not an associative array, that would be [a]= and [b]=
array=( a="webpage" b="referer" )
array2=( a="another webpage" b="another referer" )
#This is just a regular string in the for loop, it doesn't mean anything
#until the indirect referencing later
for i in "array[*]" "array2[*]" #line up multiple web pages
do
#must use a function so that the local command works
#but I'm sure there's a way to do the same thing without using local
get_page "$i"
done
This results in:
First Variable: "webpage"
Second Variable: "referer"
---------------
curl -O -e "webpage" "referer"
First Variable: "another webpage"
Second Variable: "another referer"
---------------
curl -O -e "another webpage" "another referer"
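For anyone puzzled by the indirect reference: ${!var} expands to the value of the variable (or array element) whose name is stored in var. The following sketch uses that directly on plain two-element arrays, without local or a=/b= prefixes. The variable names are made up, and curl is only echoed, with the referer passed to -e.

```shell
#!/bin/bash
# Each array holds the information for one web page: (page referer).
array=( "webpage" "referer" )
array2=( "another webpage" "another referer" )

for name in array array2; do
    page_ref="${name}[0]"        # a string naming the element, e.g. "array[0]"
    referer_ref="${name}[1]"
    page=${!page_ref}            # indirect expansion: the value of array[0]
    referer=${!referer_ref}
    echo "curl -O -e \"$referer\" \"$page\""
done
```

This prints curl -O -e "referer" "webpage" and then curl -O -e "another referer" "another webpage". Adding a new website means adding one array and one more name to the for loop.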
