Read paired arrays from a file in bash - arrays

I have a bash script that walks through a bash array in pairs, so that I can match on either element of each pair:
declare -a arr=(
"apple" "fruit"
"cabbage" "vegetables"
)
for ((i=0; i<${#arr[@]}; i+=2)); do
echo "${arr[i]} ${arr[i+1]}"
done
So when you run this script, it prints the array two elements at a time, like this:
# bash script
apple fruit
cabbage vegetables
and I can also choose any element I want with ${arr[i+#]}.
Now I'm trying to read this array from a separate text file, instead of inside the script since I'll be manipulating this array in the future.
I've tried this method so far, which looked pretty promising at first but didn't work at all;
filename='stuff.log'
filelines=`cat $filename`
for line in $filelines ; do
props=($line)
echo "${props[0]} ${props[1]}"
done
which should have printed the content below to the console (the same output as the first script, where the array is inside the script), but instead it returned nothing.
# bash script
apple fruit
cabbage vegetables
And the inside of stuff.log is;
"apple" "fruit"
"cabbage" "vegetables"
How can I basically read the array from a separate file for the first script and also be able to manipulate the content of array file in the future?

I think, if you trust your input, you can do:
eval "props=($(<stuff.log))"
eval is evil; it is there so that the shell removes the leading and trailing " and properly parses elements with spaces in them. We can be a little safer by reading the file into an array and then removing the leading and trailing ":
IFS=$' \n' props=($(<stuff.log))
IFS=$'\n' props=($(printf "%s\n" "${props[@]}" | sed 's/^"//;s/"$//'))
Anyway, I would hesitate to use such a method in production code. It would be better to write a proper parser that takes " into account and reads the input character by character.
If you want to read a file into an array, use mapfile or readarray commands (they are exactly the same command).
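As a rough sketch of the mapfile route (not the answerer's own code), the file can be read line by line and each line split on the double quotes, which rebuilds the flat array used by the first script. It assumes bash 4 or newer and exactly two double-quoted fields per line with no embedded quotes; the temporary file only stands in for stuff.log.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for stuff.log so the sketch is self-contained.
file=$(mktemp)
printf '%s\n' '"apple" "fruit"' '"cabbage" "vegetables"' > "$file"

# One array element per line of the file (bash 4+).
mapfile -t lines < "$file"

arr=()
for line in "${lines[@]}"; do
    # Splitting on " gives: empty, first field, separator, second field, empty.
    IFS='"' read -r _ first _ second _ <<< "$line"
    arr+=("$first" "$second")
done

# Same pairwise loop as in the question.
for ((i = 0; i < ${#arr[@]}; i += 2)); do
    echo "${arr[i]} ${arr[i+1]}"
done

rm -f "$file"
```

Because the data lives in a plain text file, it can be edited later without touching the script.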

Related

Split comma separated and quoted string into an array in Bash

I need to split a comma separated, but quoted list of strings into an indexed bash array in a script.
I know there are a lot of posts on the web in general and also on SO that show how to create an indexed array from a given line / string, but I could not find any example that does the array elements the way I need. I apologise, if I have missed any obvious examples from SO itself.
I am reading a file that I receive from someone, and cannot change it.
The file is formatted like this
"Grant ACL","grantacls.sh"
"Revoke ACL","revokeacls.sh"
"Get ACls for Topic","topicacls.sh"
"Get Topics for User with ACLs","useracls.sh"
I need to create an array for each line above where the separator is comma - and each of the quoted string will be an array element. I have tried various options. The latest attempt was using a construct like this - copied from some example on the web
parseScriptMapLine=${scriptName[$IN_OPTION]}
mapfile -td ',' script1 < <(echo -n "${parseScriptMapLine//, /,}")
declare -p script1
echo "script1 $script1"
where scriptName is an associative array created from the original file, which uses 1, 2, etc. as keys and the part after the '=' sign as the value.
The above snippet prints
script1
And the value part I need to split into an indexed array, so that I can pass the second element as a parameter. When creating indexed array from the value string, if I have to lose the quotes, that is fine or if it creates the elements with the quotes, that is fine too.
1="Grant ACL","grantacls.sh"
2="Revoke ACL","revokeacls.sh"
3="Get ACls for Topic","topicacls.sh"
4="Get Topics for User with ACLs","useracls.sh"
I have looked at a lot of examples, but haven't been able to get this particular requirement working.
Thank you
With apologies, I could not understand what you wanted - this sounds like an X/Y Problem. Can you clarify?
Maybe this?
$: while IFS=',"' read -r _ a _ _ d _ && [[ -n "$d" ]]; do echo "a=[$a] d=[$d]"; done < file
a=[Grant ACL] d=[grantacls.sh]
a=[Revoke ACL] d=[revokeacls.sh]
a=[Get ACls for Topic] d=[topicacls.sh]
a=[Get Topics for User with ACLs] d=[useracls.sh]
That will let you do whatever you wanted with the fields, which I named a and d.
If you just want to load the lines of the file into an array -
$: mapfile -t script1 < file
$: for i in "${!script1[@]}"; do echo "$i=${script1[i]}"; done
0="Grant ACL","grantacls.sh"
1="Revoke ACL","revokeacls.sh"
2="Get ACls for Topic","topicacls.sh"
3="Get Topics for User with ACLs","useracls.sh"
If you want a two-dimensional array, then sorry, you're going to have to use something besides bash, or get more creative.
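If the goal is one indexed array per line, a minimal sketch could split on the comma with read -a and then strip the surrounding quotes. This assumes that no field contains a comma or an embedded double quote; the variable names line and fields are only illustrative.

```shell
#!/usr/bin/env bash
# One line in the format shown in the question.
line='"Get ACls for Topic","topicacls.sh"'

# Split on commas into an indexed array (quotes are still attached).
IFS=, read -r -a fields <<< "$line"

# Strip one leading and one trailing double quote from every element.
fields=("${fields[@]#\"}")
fields=("${fields[@]%\"}")

declare -p fields
echo "script to run: ${fields[1]}"
```

The second element can then be passed on as a parameter, for example as "${fields[1]}".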

Make a list of all files in two folders then iterate through the combined list randomly

I have two directories with photos that I want to manipulate to output a random order of the files each time a script is run. How would I create such a list?
d1=/home/Photos/*.jpg
d2=/mnt/JillsPC/home/Photos/*.jpg
# somehow make a combined list, files = d1 + d2
# somehow randomise the file order
# during execution of the for;do;done loop, no file should be repeated
for f in $files; do
echo $f # full path to each file
done
I wouldn't use variables if you don't have to. It's more natural if you chain a couple of commands together with pipes or process substitution. That way everything operates on streams of data without loading the entire list of names into memory all at once.
You can use shuf to randomly permute input lines, and find to list files one per line. Or, to be maximally safe, let's use \0 separators. Finally, a while loop with process substitution reads line by line into a variable.
while IFS= read -d $'\0' -r file; do
echo "$file"
done < <(find /home/Photos/ /mnt/JillsPC/home/Photos/ -name '*.jpg' -print0 | shuf -z)
That said, if you do want to use some variables then you should use arrays. Arrays handle file names with whitespace and other special characters correctly, whereas regular string variables muck them all up.
d1=(/home/Photos/*.jpg)
d2=(/mnt/JillsPC/home/Photos/*.jpg)
files=("${d1[@]}" "${d2[@]}")
Iterating in order would be easy:
for file in "${files[@]}"; do
echo "$file"
done
Shuffling is tricky though. shuf is still the best tool but it works best on a stream of data. We can use printf to print each file name with the trailing \0 we need to make shuf -z happy.
d1=(/home/Photos/*.jpg)
d2=(/mnt/JillsPC/home/Photos/*.jpg)
files=("${d1[@]}" "${d2[@]}")
while IFS= read -d $'\0' -r file; do
echo "$file"
done < <(printf '%s\0' "${files[@]}" | shuf -z)
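If shuf is not available, the array can also be shuffled in place. The following is a sketch of a Fisher-Yates shuffle in pure bash, not part of the answer above; it assumes fewer than 32768 files, since $RANDOM only yields values from 0 to 32767, and the file names are made up.

```shell
#!/usr/bin/env bash
# Hypothetical file names; in practice this would be the globbed array.
files=("a 1.jpg" "b.jpg" "c.jpg" "d.jpg")

# Fisher-Yates: walk from the end, swapping each element with a
# randomly chosen element at or before its position.
for ((i = ${#files[@]} - 1; i > 0; i--)); do
    j=$((RANDOM % (i + 1)))
    tmp=${files[i]}
    files[i]=${files[j]}
    files[j]=$tmp
done

# Every file appears exactly once, in random order.
for file in "${files[@]}"; do
    echo "$file"
done
```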
Further reading:
How can I read a file (data stream, variable) line-by-line (and/or field-by-field)?
How can I find and safely handle file names containing newlines, spaces or both?
I set variables in a loop that's in a pipeline. Why do they disappear after the loop terminates? Or, why can't I pipe data to read?
How can I randomize (shuffle) the order of lines in a file? Or select a random line from a file, or select a random file from a directory?
I came up with this solution after some more reading:
files=(/home/roy/Photos/*.jpg /mnt/JillsPC/home/jill/Photos/*.jpg)
printf '%s\n' "${files[@]}" | sort -R
Edit: updated with John's improvements from comments.
You can add any number of directories into an array declaration (though see caveat with complex names in comments).
sort -R shuffles the lines but groups identical keys; its man page points to shuf(1).
This was the original, which works, but is not as robust as the above:
files=(/home/roy/Photos/*.jpg /mnt/JillsPC/home/jill/Photos/*.jpg)
(IFS=$'\n'; echo "${files[*]}") | sort -R
With IFS=$'\n', echoing the array will display it line by line. ($'somestring' is the syntax for string literals with escape sequences, so unlike '\n', $'\n' is the correct way to set IFS to a line break.) IFS is not needed when using the printf method above.
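The difference between '\n' and $'\n' can be checked directly. This small sketch uses illustrative variable names (plain, ansi, items, joined) that are not from the answer.

```shell
#!/usr/bin/env bash
plain='\n'    # two characters: a backslash and the letter n
ansi=$'\n'    # one character: a real newline

echo "${#plain}"   # prints 2
echo "${#ansi}"    # prints 1

# Joining with "${items[*]}" uses the first character of IFS,
# here a real newline, set only inside the subshell.
items=("one" "two" "three")
joined=$(IFS=$'\n'; echo "${items[*]}")
echo "$joined"
```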
echo ${files[*]} will print out all array elements at once, using the IFS defined in

Splitting string separated by comma into array values in shell script?

My data set(data.txt) looks like this [imageID,sessionID,height1,height2,x,y,crop]:
1,0c66824bfbba50ee715658c4e1aeacf6fda7e7ff,1296,4234,194,1536,0
2,0c66824bfbba50ee715658c4e1aeacf6fda7e7ff,1296,4234,194,1536,0
3,0c66824bfbba50ee715658c4e1aeacf6fda7e7ff,1296,4234,194,1536,0
4,0c66824bfbba50ee715658c4e1aeacf6fda7e7ff,1296,4234,194,1536,950
These are a set of values which I wish to use. I'm new to shell scripting :) I read the file line by line like this:
cat $FILENAME | while read LINE
do
string=($LINE)
# PROCESSING THE STRING
done
Now, in the code above, after getting the string, I wish to do the following :
1. Split the string into comma separated values.
2. Store these variables into arrays like imageID[],sessionID[].
I need to access these values for doing image processing using imagemagick.
However, I'm not able to perform the above steps correctly
set -A doesn't work for me (it is ksh syntax, and bash does not support it).
Posting an alternate solution using read -a in case someone needs it:
# init all your individual arrays here
imageId=(); sessionId=();
while IFS=, read -ra arr; do
imageId+=("${arr[0]}")
sessionId+=("${arr[1]}")
done < input.csv
# Print your arrays
echo "${imageId[@]}"
echo "${sessionId[@]}"
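A variation on the same idea, sketched here with a here-document standing in for data.txt, is to let read assign each of the seven columns to a named variable. The array names follow the column list in the question; the loop variable names are illustrative.

```shell
#!/usr/bin/env bash
imageID=(); sessionID=(); crop=()

# read splits each line on commas into one variable per column.
while IFS=, read -r id session h1 h2 x y c; do
    imageID+=("$id")
    sessionID+=("$session")
    crop+=("$c")
done <<'EOF'
1,0c66824bfbba50ee715658c4e1aeacf6fda7e7ff,1296,4234,194,1536,0
4,0c66824bfbba50ee715658c4e1aeacf6fda7e7ff,1296,4234,194,1536,950
EOF

# The arrays are parallel: index i refers to the same input row in each.
for i in "${!imageID[@]}"; do
    echo "image ${imageID[i]} session ${sessionID[i]} crop ${crop[i]}"
done
```

Redirecting the file into the loop, rather than piping cat into it, keeps the arrays in the current shell so that they are still set after the loop ends.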
oIFS="$IFS"; IFS=','
set -A str $string
IFS="$oIFS"
echo "${str[0]}";
echo "${str[1]}";
echo "${str[2]}";
You can split and store the values as shown above (note that set -A is ksh syntax).
Have a look here for more on Unix arrays.

How to read lines from a file into an array?

I'm trying to read in a file as an array of lines and then iterate over it with zsh. The code I've got works most of the time, except if the input file contains certain characters (such as brackets). Here's a snippet of it:
#!/bin/zsh
LIST=$(cat /path/to/some/file.txt)
SIZE=${${(f)LIST}[(I)${${(f)LIST}[-1]}]}
POS=${${(f)LIST}[(I)${${(f)LIST}[-1]}]}
while [[ $POS -le $SIZE ]] ; do
ITEM=${${(f)LIST}[$POS]}
# Do stuff
((POS=POS+1))
done
What would I need to change to make it work properly?
I know it's been a lot of time since the question was answered but I think it's worth posting a simpler answer (which doesn't require the zsh/mapfile external module):
#!/bin/zsh
for line in "${(@f)"$(</path/to/some/file.txt)"}"
{
# do something with each $line
}
#!/bin/zsh
zmodload zsh/mapfile
FNAME=/path/to/some/file.txt
FLINES=( "${(f)mapfile[$FNAME]}" )
LIST="${mapfile[$FNAME]}" # Not required unless stuff uses it
integer POS=1 # Not required unless stuff uses it
integer SIZE=$#FLINES # Number of lines, not required unless stuff uses it
for ITEM in $FLINES; do
# Do stuff
(( POS++ ))
done
You have some strange things in your code:
1. Why are you splitting LIST each time instead of making it an array variable? It is just a waste of CPU time.
2. Why don't you use for ITEM in ${(f)LIST}?
3. You can ask zsh for the array length directly: $#ARRAY. There is no need to determine the index of the last occurrence of the last element.
4. POS gets the same value as SIZE in your code, so the loop will iterate only once.
5. The brackets are likely a problem because of 3: (I) matches against a pattern. Do read the documentation.
Let's say, for the purpose of example, that file.txt contains the following text:
one
<empty line>
two
<empty line>
three
The solution depends on whether or not you'd like to elide the empty lines in file.txt:
Creating an array lines from file file.txt, eliding empty lines:
typeset -a lines=("${(f)"$(<file.txt)"}")
print ${#lines}
Expected output:
3
Creating an array lines from file file.txt, without eliding empty lines:
typeset -a lines=("${(@f)"$(<file.txt)"}")
print ${#lines}
Expected output:
5
In the end, the difference in the resulting array comes down to whether or not the parameter expansion flag (@) is provided during parameter expansion.
while read -r line;
do ARRAY+=("$line");
done < file.txt

KSH scripting: how to split on ',' when values have escaped commas?

I am trying to write a KSH script for processing a file consisting of name-value pairs, several of them on each line.
Format is:
NAME1 VALUE1,NAME2 VALUE2,NAME3 VALUE3, etc
Suppose I write:
read l
IFS=","
set -A nvls $l
echo "${nvls[1]}"
This will give me second name-value pair, nice and easy. Now, suppose that the task is extended so that values could include commas. They should be escaped, like this:
NAME1 VALUE1,NAME2 VALUE2_1\,VALUE2_2,NAME3 VALUE3, etc
Obviously, my code no longer works, since "read" strips all quoting and second element of array will be just "NAME2 VALUE2_1".
I'm stuck with an older ksh that does not have "read -A array". I tried various tricks with "read -r" and "eval set -A ....", to no avail. I can't use "read nvl1 nvl2 nvl3" to do unescaping and splitting inside read, since I don't know beforehand how many name-value pairs are in each line.
Does anyone have a useful trick up their sleeve for me?
PS
I know that I could do this in no time in Perl, Python, even in awk. However, I have to do it in ksh (... or die trying ;)
As often happens, I devised an answer minutes after asking the question in a public forum :(
I worked around the quoting/unquoting issue by piping the input file through the following sed script:
sed -e 's/\([^\]\),/\1\
/g;s/$/\
/'
It converted the input into:
NAME1.1 VALUE1.1
NAME1.2 VALUE1.2_1\,VALUE1.2_2
NAME1.3 VALUE1.3
<empty line>
NAME2.1 VALUE2.1
<second record continues>
Now, I can parse this input like this:
while read name value ; do
echo "$name => $value"
done
Value will have its commas unquoted by "read", and I can stuff "name" and "value" in some associative array, if I like.
PS
Since I can't accept my own answer, should I delete the question, or ...?
You can also change the \, pattern to something else that is known not to appear in any of your strings, and then change it back after you've split the input into an array. You can use the ksh builtin pattern-substitution syntax to do this, you don't need to use sed or awk or anything.
read l
l=${l//\\,/!!}
IFS=","
set -A nvls $l
unset IFS
echo "${nvls[1]//!!/,}"
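For comparison, here is the same placeholder idea sketched in bash rather than ksh (bash being the shell used elsewhere on this page). It assumes the control character \x01 never appears in the data; the variable names placeholder and masked are illustrative.

```shell
#!/usr/bin/env bash
# Sample line in the format from the question, with one escaped comma.
line='NAME1 VALUE1,NAME2 VALUE2_1\,VALUE2_2,NAME3 VALUE3'

# Assumption: this byte does not occur anywhere in the input.
placeholder=$'\x01'

# 1. Hide every escaped comma (backslash + comma) behind the placeholder.
masked=${line//\\,/$placeholder}

# 2. Split on the remaining, unescaped commas.
IFS=, read -r -a nvls <<< "$masked"

# 3. Turn the placeholder back into a plain comma in every element.
nvls=("${nvls[@]//$placeholder/,}")

printf '%s\n' "${nvls[@]}"
```

Setting IFS only for the read command avoids having to save and restore it afterwards.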

Resources