I'd like to either process one row of a csv file or the whole file.
The variables are set by the header row, which may be in any order.
There may be up to 12 columns, but only 3 or 4 variables are needed.
The source files might be in either format, and all I want from both is lastname and country. I know of many different ways and tools to do it if the columns were fixed and always in the same order. But they're not.
examplesource.csv:
firstname,lastname,country
Linus,Torvalds,Finland
Linus,van Pelt,USA
examplesource2.csv:
lastname,age,country
Torvalds,66,Finland
van Pelt,7,USA
I have cobbled together something from various Stackoverflow postings which looks a bit voodoo but seems fairly robust. I say "voodoo" because shellcheck complains that, for example, "firstname is referenced but not assigned". And yet it prints it.
#!/bin/bash
#set the field separator to newline
IFS=$'\n'
#split/transpose the first-line column titles to rows
COLUMNAMES=$(head -n1 examplesource.csv | tr ',' '\n')
#set an array and read the columns into it
columns=()
for line in $COLUMNAMES; do
columns+=("$line")
done
#reset the field separator
IFS=","
#using -p here to debug in output
declare -ap columns
#read from line 2 onwards
sed 1d examplesource.csv | while read "${columns[@]}"; do
echo "${firstname} ${lastname} is from ${country}"
done
In the case of looping through everything, it works perfectly for my needs and I can process within the "while read" loop. But to make it cleaner, I'd rather pass the current element(?) to an external function to process (not just echo).
And if I only wanted the array (current row) belonging to "Torvalds", I cannot find how to access that or even get its current index, eg: "if $wantedname && $lastname == $wantedname then call function with currentrow only otherwise loop all rows and call function".
I know there aren't multidimensional associative arrays in bash from reading
Multidimensional associative arrays in Bash and I've tried to understand arrays from
https://opensource.com/article/18/5/you-dont-know-bash-intro-bash-arrays
Is it clear what I'm trying to achieve in a bash-only manner and does the question make sense?
Many thanks.
Let's shorten your script. Don't read the source twice (first with head, then with sed); you can do that once. Also, the whole array reading can be shortened to just IFS=',' COLUMNAMES=($(head -n1 source.csv)). Here's a shorter version:
#!/bin/bash
cat examplesource.csv |
{
IFS=',' read -r -a columnnames
while IFS=',' read -r "${columnnames[@]}"; do
echo "${firstname} ${lastname} is from ${country}"
done
}
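To hand each row to a separate function instead of echoing inline, and to optionally restrict processing to one last name, something along these lines might work. This is only a sketch: process_row, each_row and the inline sample data are hypothetical names made up for illustration, and it assumes the header contains firstname, lastname and country (a missing column simply expands to an empty string).

```shell
#!/bin/bash
# Hypothetical row handler: receives the values of one row as arguments.
process_row() {
    local first=$1 last=$2 country=$3
    printf '%s %s is from %s\n' "$first" "$last" "$country"
}

# Hypothetical driver. Usage: each_row [wanted_lastname] < file.csv
each_row() {
    local wanted="${1-}"
    local -a columnnames
    # The first line names the variables, in whatever order the file uses.
    IFS=',' read -r -a columnnames
    while IFS=',' read -r "${columnnames[@]}"; do
        # When a filter was given, skip rows whose lastname differs.
        [[ -n $wanted && $lastname != "$wanted" ]] && continue
        process_row "$firstname" "$lastname" "$country"
    done
}

csv=$'firstname,lastname,country\nLinus,Torvalds,Finland\nLinus,van Pelt,USA'
each_row <<< "$csv"            # every row
each_row Torvalds <<< "$csv"   # only the Torvalds row
```

The variables named by the header row are set globally by read, so column names that collide with other variables in the script would overwrite them.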
If you want to parse both files at the same time, i.e. join them, nothing simpler ;). First, number the lines in the first file using nl -w1 -s,. Then use join to join the files on the last name. Remember that join input needs to be sorted on the join fields. Then sort the output with sort using the line number from the first file. After that we can read all the data just like before:
# join the files, using `,` as the separator
# on the 3rd field from the first file and the first field from the second file
# the output should be first the fields from the first file, then the second file
# the country (field 1.4) is duplicated in 2.3, so just omitting it.
join -t, -13 -21 -o 1.1,1.2,1.3,2.2,2.3 <(
# number the lines in the first file
<examplesource.csv nl -w1 -s, |
# there is one field more, sort using the 3rd field
sort -t, -k3
) <(
# sort the second file using the first field
<examplesource2.csv sort -t, -k1
) |
# sort the output using the numbers from the first file
sort -t, -k1 -n |
# well, remove the numbers
cut -d, -f2- |
# just a normal read follows
{
# read the headers
IFS=, read -r -a names
while IFS=, read -r "${names[@]}"; do
# finally our output!
echo "${firstname} ${lastname} is from ${country} and is ${age} years old!"
done
}
Tested on tutorialspoint.
GNU Awk has multidimensional arrays. It also has array sorting mechanisms, which I have not used here. Please comment if you are interested in pursuing this solution further. The following depends on consistent key names and line numbers across input files, but can handle an arbitrary number of fields and input files.
$ gawk -V |gawk NR==1
GNU Awk 4.1.4, API: 1.1 (GNU MPFR 3.1.5, GNU MP 6.1.2)
$ gawk -F, '
FNR == 1 {for(f=1;f<=NF;f++) Key[f]=$f}
FNR != 1 {for(f=1;f<=NF;f++) People[FNR][Key[f]]=$f}
END {
for(Person in People) {
for(attribute in People[Person])
output = output FS People[Person][attribute]
print substr(output,2)
output=""
}
}
' file*
66,Finland,Linus,Torvalds
7,USA,Linus,van Pelt
A bash solution takes a bit more work than an awk solution, but if this is an exercise over what bash provides, it provides all you need to handle determining the column holding the last name from the first line of input and then outputting the lastname from the remaining lines.
An easy approach is simply to read each line into a normal array and then loop over the elements of the first line to locate the column "lastname" appears in saving the column in a variable. You can then read each of the remaining lines the same way and output the lastname field by outputting the element at the saved column.
A short example would be:
#!/bin/bash
col=0 ## column count for lastname
cnt=0 ## line count
while IFS=',' read -a arr; do ## read each line into array
if [ "$cnt" -eq '0' ]; then ## test if line-count is zero
for ((i = 0; i < "${#arr[@]}"; i++)); do ## loop for lastname
[ "${arr[i]}" = 'lastname' ] && ## test for lastname
{ col=i; break; } ## if found, save the column index and break
done
fi
[ "$cnt" -gt '0' ] && ## if not header row
echo "line $cnt lastname: ${arr[col]}" ## output lastname variable
((cnt++)) ## increment linecount
done < "$1"
Example Use/Output
Using your two files data files, the output would be:
$ bash readcsv.sh ex1.csv
line 1 lastname: Torvalds
line 2 lastname: van Pelt
$ bash readcsv.sh ex2.csv
line 1 lastname: Torvalds
line 2 lastname: van Pelt
A similar implementation using awk would be:
awk -F, '
NR == 1 {
for (i = 1; i <= NF; i++) {
if ($i == "lastname") { col = i; break }
}
next
}
{
print "lastname:", $col
}
' ex1.csv
Example Use/Output
$ awk -F, 'NR == 1 { for (i = 1; i <= NF; i++) { if ($i == "lastname") { col = i; break } } next } { print "lastname:", $col }' ex1.csv
lastname: Torvalds
lastname: van Pelt
(output is the same for either file)
Thank you all. I've taken a couple of bits from two answers
I used the answer from David to find the number of the row, then I used the elegantly simple solution from Kamil to loop through what I need.
The result is exactly what I wanted. Thank you all.
$ readexample.sh examplesource.csv "Torvalds"
Everyone
Linus Torvalds is from Finland
Linus van Pelt is from USA
now just Torvalds
Linus Torvalds is from Finland
And this is the code - now that you know what I want it to do, if anyone can see any dangers or improvements, please let me know as I'm always learning. Thanks.
#!/bin/bash
FILENAME="$1"
WANTED="$2"
printDetails() {
SINGLEROW="$1"
[[ -n "$SINGLEROW" ]] && opt=("--expression" "1p" "--expression" "${SINGLEROW}p") || opt=("--expression" "1p" "--expression" '2,$p')
sed -n "${opt[@]}" "$FILENAME" |
{
IFS=',' read -r -a columnnames
while IFS=',' read -r "${columnnames[@]}"; do
echo "${firstname} ${lastname} is from ${country}"
done
}
}
findRow() {
col=0 ## column count for lastname
cnt=0 ## line count
while IFS=',' read -a arr; do ## read each line into array
if [ "$cnt" -eq '0' ]; then ## test if line-count is zero
for ((i = 0; i < "${#arr[@]}"; i++)); do ## loop for lastname
[ "${arr[i]}" = 'lastname' ] && ## test for lastname
{
col=i
break
} ## if found, save the column index and break
done
fi
[ "$cnt" -gt '0' ] && ## if not header row
if [ "${arr[col]}" == "$1" ]; then
echo "$cnt" ## output the matching row number
fi
((cnt++)) ## increment linecount
done <"$FILENAME"
}
echo "Everyone"
printDetails
if [ ! -z "${WANTED}" ]; then
echo -e "\nnow just ${WANTED}"
row=$(findRow "${WANTED}")
printDetails "$((row + 1))"
fi
I need to remove an element from an array in bash shell.
Generally I'd simply do:
array=("${(@)array:#<element to remove>}")
Unfortunately the element I want to remove is a variable so I can't use the previous command.
Down here an example:
array+=(pluto)
array+=(pippo)
delete=(pluto)
array( ${array[@]/$delete} ) -> but clearly doesn't work because of {}
Any idea?
The following works as you would like in bash and zsh:
$ array=(pluto pippo)
$ delete=pluto
$ echo ${array[@]/$delete}
pippo
$ array=( "${array[@]/$delete}" ) #Quotes when working with strings
If you need to delete more than one element:
...
$ delete=(pluto pippo)
for del in "${delete[@]}"
do
array=("${array[@]/$del}") #Quotes when working with strings
done
done
Caveat
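This technique actually removes the first substring matching $delete from each element, not necessarily whole elements. When the expansion is quoted, an element that matches completely is left behind as an empty string rather than being removed from the array.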
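A small sketch of that caveat, using made-up element values (only pluto and pippo come from the question):

```shell
#!/bin/bash
# Illustrative values: two of them merely contain the string to delete.
array=(pluto plutonium "big pluto" pippo)
delete=pluto

# The substitution is applied to every element separately.
result=("${array[@]/$delete}")

# result still has four elements: an empty string, "nium", "big " and "pippo".
declare -p result
```

The element count does not change, which is why a later loop over the quoted array still sees the emptied slot.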
Update
To really remove an exact item, you need to walk through the array, comparing the target to each element, and using unset to delete an exact match.
array=(pluto pippo bob)
delete=(pippo)
for target in "${delete[@]}"; do
for i in "${!array[@]}"; do
if [[ ${array[i]} = "$target" ]]; then
unset 'array[i]'
fi
done
done
Note that if you do this, and one or more elements is removed, the indices will no longer be a continuous sequence of integers.
$ declare -p array
declare -a array=([0]="pluto" [2]="bob")
The simple fact is, arrays were not designed for use as mutable data structures. They are primarily used for storing lists of items in a single variable without needing to waste a character as a delimiter (e.g., to store a list of strings which can contain whitespace).
If gaps are a problem, then you need to rebuild the array to fill the gaps:
for i in "${!array[@]}"; do
new_array+=( "${array[i]}" )
done
array=("${new_array[@]}")
unset new_array
You could build up a new array without the undesired element, then assign it back to the old array. This works in bash:
array=(pluto pippo)
new_array=()
for value in "${array[@]}"
do
[[ $value != pluto ]] && new_array+=("$value")
done
array=("${new_array[@]}")
unset new_array
This yields:
echo "${array[@]}"
pippo
This is the most direct way to unset a value if you know its position.
$ array=(one two three)
$ echo ${#array[@]}
3
$ unset 'array[1]'
$ echo ${array[@]}
one three
$ echo ${#array[@]}
2
This answer is specific to the case of deleting multiple values from large arrays, where performance is important.
The most-voted solutions are (1) pattern substitution on the array, or (2) iterating over the array elements. The first is fast, but only works when the elements have distinct prefixes; the second costs O(n*k), where n is the array size and k the number of elements to remove. Associative arrays are a relatively new feature, and might not have been common when the question was originally posted.
For the exact-match case, with large n and k, it is possible to improve performance from O(n*k) to O(n + k*log(k)). In practice this is O(n), assuming k is much lower than n. Most of the speedup comes from using an associative array to identify the items to be removed.
Performance (N = array size, K = number of values to delete), measured in seconds of user time:
N      K   New (seconds)  Current (seconds)  Speedup
1000   10  0.005          0.033              6X
10000  10  0.070          0.348              5X
10000  20  0.070          0.656              9X
10000  1   0.043          0.050              -7%
As expected, the current solution is linear in N*K, and the fast solution is practically linear in N, with a much lower constant. The fast solution is slightly slower than the current solution when K=1, due to the additional setup.
The 'Fast' solution: array=list of input, delete=list of values to remove.
declare -A delk
for del in "${delete[@]}" ; do delk[$del]=1 ; done
# Tag items to remove, based on
for k in "${!array[@]}" ; do
[ "${delk[${array[$k]}]-}" ] && unset 'array[k]'
done
# Compaction
array=("${array[@]}")
Benchmarked against current solution, from the most-voted answer.
for target in "${delete[@]}"; do
for i in "${!array[@]}"; do
if [[ ${array[i]} = "$target" ]]; then
unset 'array[i]'
fi
done
done
array=("${array[@]}")
Here's a one-line solution with mapfile:
$ mapfile -d $'\0' -t arr < <(printf '%s\0' "${arr[@]}" | grep -Pzv "<regexp>")
Example:
$ arr=("Adam" "Bob" "Claire"$'\n'"Smith" "David" "Eve" "Fred")
$ echo "Size: ${#arr[*]} Contents: ${arr[*]}"
Size: 6 Contents: Adam Bob Claire
Smith David Eve Fred
$ mapfile -d $'\0' -t arr < <(printf '%s\0' "${arr[@]}" | grep -Pzv "^Claire\nSmith$")
$ echo "Size: ${#arr[*]} Contents: ${arr[*]}"
Size: 5 Contents: Adam Bob David Eve Fred
This method allows for great flexibility by modifying/exchanging the grep command and doesn't leave any empty strings in the array.
Partial answer only
To delete the first item in the array
unset 'array[0]'
To delete the last item in the array
unset 'array[-1]'
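A short sketch combining both, with made-up values. As far as I know, negative subscripts in unset need bash 4.3 or newer; the quotes keep the shell from treating array[0] as a glob pattern.

```shell
#!/bin/bash
array=(one two three four)

unset 'array[0]'      # drop the first element; remaining indices are 1 2 3
unset 'array[-1]'     # drop the last element, counted from the end

# Re-pack so that the indices start at 0 again.
array=("${array[@]}")
declare -p array
```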
To expand on the above answers, the following can be used to remove multiple elements from an array, without partial matching:
ARRAY=(one two onetwo three four threefour "one six")
TO_REMOVE=(one four)
TEMP_ARRAY=()
for pkg in "${ARRAY[@]}"; do
KEEP=true
for remove in "${TO_REMOVE[@]}"; do
if [[ ${pkg} == "${remove}" ]]; then
KEEP=false
break
fi
done
if ${KEEP}; then
TEMP_ARRAY+=("${pkg}")
fi
done
ARRAY=("${TEMP_ARRAY[@]}")
unset TEMP_ARRAY
This will result in an array containing:
(two onetwo three threefour "one six")
Here's a (probably very bash-specific) little function involving bash variable indirection and unset; it's a general solution that does not involve text substitution or discarding empty elements and has no problems with quoting/whitespace etc.
delete_ary_elmt() {
local word=$1 # the element to search for & delete
local aryref="$2[@]" # a necessary step since '${!$2[@]}' is a syntax error
local arycopy=("${!aryref}") # create a copy of the input array
local status=1
for (( i = ${#arycopy[@]} - 1; i >= 0; i-- )); do # iterate over indices backwards
elmt=${arycopy[$i]}
[[ $elmt == $word ]] && unset "$2[$i]" && status=0 # unset matching elmts in orig. ary
done
return $status # return 0 if something was deleted; 1 if not
}
array=(a 0 0 b 0 0 0 c 0 d e 0 0 0)
delete_ary_elmt 0 array
for e in "${array[@]}"; do
echo "$e"
done
# prints "a" "b" "c" "d" "e" in lines
Use it like delete_ary_elmt ELEMENT ARRAYNAME without any $ sigil. Switch the == $word for == $word* for prefix matches; use ${elmt,,} == ${word,,} for case-insensitive matches; etc., whatever bash [[ supports.
It works by determining the indices of the input array and iterating over them backwards (so deleting elements doesn't screw up iteration order). To get the indices you need to access the input array by name, which can be done via bash variable indirection x=1; varname=x; echo ${!varname} # prints "1".
You can't access arrays by name like aryname=a; echo "${$aryname[@]}", this gives you an error. You can't do aryname=a; echo "${!aryname[@]}", this gives you the indices of the variable aryname (although it is not an array). What DOES work is aryref="a[@]"; echo "${!aryref}", which will print the elements of the array a, preserving shell-word quoting and whitespace exactly like echo "${a[@]}". But this only works for printing the elements of an array, not for printing its length or indices (aryref="!a[@]" or aryref="#a[@]" or "${!!aryref}" or "${#!aryref}", they all fail).
So I copy the original array by its name via bash indirection and get the indices from the copy. To iterate over the indices in reverse I use a C-style for loop. I could also do it by accessing the indices via ${!arycopy[@]} and reversing them with tac, which is a cat that turns around the input line order.
A function solution without variable indirection would probably have to involve eval, which may or may not be safe to use in that situation (I can't tell).
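One possible alternative that needs neither the copy nor eval is a nameref. The following is only a sketch under assumptions: delete_by_value and _ary are names invented here, and local -n requires, as far as I know, bash 4.3 or newer.

```shell
#!/bin/bash
# Hypothetical helper. Usage: delete_by_value ELEMENT ARRAYNAME
delete_by_value() {
    local word=$1
    local -n _ary=$2      # nameref: _ary now refers to the caller's array
    local i status=1
    # The index list is expanded once, before anything is unset.
    for i in "${!_ary[@]}"; do
        if [[ ${_ary[i]} == "$word" ]]; then
            unset "_ary[$i]"
            status=0
        fi
    done
    return "$status"      # 0 if something was deleted, 1 if not
}

array=(a 0 0 b 0 c)
delete_by_value 0 array
declare -p array          # a, b and c remain, at their original indices
```

The caller's array must not itself be called _ary, since a nameref cannot point at its own name.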
Using unset
To remove an element at a particular index, we can use unset and then copy the remaining elements to another array. unset alone is not enough when contiguous indices are needed, because it removes the element but does not renumber the rest: the array is left with a gap at that index, which expands to an empty string when read.
declare -a arr=('aa' 'bb' 'cc' 'dd' 'ee')
unset 'arr[1]'
declare -a arr2=()
i=0
for element in "${arr[@]}"
do
arr2[$i]=$element
((++i))
done
echo "${arr[@]}"
echo "1st val is ${arr[1]}, 2nd val is ${arr[2]}"
echo "${arr2[@]}"
echo "1st val is ${arr2[1]}, 2nd val is ${arr2[2]}"
Output is
aa cc dd ee
1st val is , 2nd val is cc
aa cc dd ee
1st val is cc, 2nd val is dd
Using :<idx>
We can remove some set of elements using :<idx> also. For example if we want to remove 1st element we can use :1 as mentioned below.
declare -a arr=('aa' 'bb' 'cc' 'dd' 'ee')
arr2=("${arr[@]:1}")
echo "${arr2[@]}"
echo "1st val is ${arr2[1]}, 2nd val is ${arr2[2]}"
Output is
bb cc dd ee
1st val is cc, 2nd val is dd
http://wiki.bash-hackers.org/syntax/pe#substring_removal
${PARAMETER#PATTERN} # remove from beginning
${PARAMETER##PATTERN} # remove from the beginning, greedy match
${PARAMETER%PATTERN} # remove from the end
${PARAMETER%%PATTERN} # remove from the end, greedy match
In order to do a full remove element, you have to do an unset command with an if statement. If you don't care about removing prefixes from other variables or about supporting whitespace in the array, then you can just drop the quotes and forget about for loops.
See example below for a few different ways to clean up an array.
options=("foo" "bar" "foo" "foobar" "foo bar" "bars" "bar")
# remove bar from the start of each element
options=("${options[@]/#"bar"}")
# options=("foo" "" "foo" "foobar" "foo bar" "s" "")
# remove the complete string "foo" in a for loop
count=${#options[@]}
for ((i = 0; i < count; i++)); do
if [ "${options[i]}" = "foo" ] ; then
unset 'options[i]'
fi
done
# options=( "" "foobar" "foo bar" "s" "")
# remove empty options
# note the count variable can't be recalculated easily on a sparse array
for ((i = 0; i < count; i++)); do
# echo "Element $i: '${options[i]}'"
if [ -z "${options[i]}" ] ; then
unset 'options[i]'
fi
done
# options=("foobar" "foo bar" "s")
# list them with select
echo "Choose an option:"
PS3='Option? '
select i in "${options[@]}" Quit
do
case $i in
Quit) break ;;
*) echo "You selected \"$i\"" ;;
esac
done
Output
Choose an option:
1) foobar
2) foo bar
3) s
4) Quit
Option?
Hope that helps.
There is also this syntax, e.g. if you want to delete the 2nd element :
array=("${array[@]:0:1}" "${array[@]:2}")
which is in fact the concatenation of two arrays: the first from index 0 to index 1 (exclusive), and the second from index 2 to the end.
POSIX shell script does not have arrays.
So most probably you are using a specific dialect such as bash, korn shells or zsh.
Therefore, your question as of now cannot be answered.
Maybe this works for you:
unset array[$delete]
What I do is:
array="$(echo $array | tr ' ' '\n' | sed "/itemtodelete/d")"
BAM, that item is removed.
This is a quick-and-dirty solution that will work in simple cases but will break if (a) there are regex special characters in $delete, or (b) there are any spaces at all in any items. Starting with:
array+=(pluto)
array+=(pippo)
delete=(pluto)
Delete all entries exactly matching $delete:
array=(`echo $array | fmt -1 | grep -v "^${delete}$" | fmt -999999`)
resulting in
echo $array -> pippo, and making sure it's an array:
echo $array[1] -> pippo
fmt is a little obscure: fmt -1 wraps at the first column (to put each item on its own line. That's where the problem arises with items in spaces.) fmt -999999 unwraps it back to one line, putting back the spaces between items. There are other ways to do that, such as xargs.
Addendum: If you want to delete just the first match, use sed, as described here:
array=(`echo $array | fmt -1 | sed "0,/^${delete}$/{//d;}" | fmt -999999`)
Actually, I just noticed that the shell syntax somewhat has a behavior built-in that allows for easy reconstruction of the array when, as posed in the question, an item should be removed.
# let's set up an array of items to consume:
x=()
for (( i=0; i<10; i++ )); do
x+=("$i")
done
# here, we consume that array:
while (( ${#x[@]} )); do
i=$(( $RANDOM % ${#x[@]} ))
echo "${x[i]} / ${x[@]}"
x=("${x[@]:0:i}" "${x[@]:i+1}")
done
Notice how we constructed the array using bash's x+=() syntax?
You could actually add more than one item with that, the content of a whole other array at once.
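A minimal sketch of appending a whole array at once (the values are made up):

```shell
#!/bin/bash
x=(0 1 2)
y=("three 3" 4)

# Quoting "${y[@]}" keeps each element of y intact, spaces included.
x+=("${y[@]}")

declare -p x    # x now has five elements; x[3] is "three 3"
```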
In ZSH this is dead easy (note this uses more bash compatible syntax than necessary where possible for ease of understanding):
# I always include an edge case to make sure each element
# is not being word split.
start=(one two three 'four 4' five)
work=(${(@)start})
idx=2
val=${work[idx]}
# How to remove a single element easily.
# Also works for associative arrays (at least in zsh)
work[$idx]=()
echo "Array size went down by one: "
[[ $#work -eq $(($#start - 1)) ]] && echo "OK"
echo "Array item "$val" is now gone: "
[[ -z ${work[(r)$val]} ]] && echo OK
echo "Array contents are as expected: "
wanted=("${start[@]:0:1}" "${start[@]:2}")
[[ "${(j.:.)wanted[@]}" == "${(j.:.)work[@]}" ]] && echo "OK"
echo "-- array contents: start --"
print -l -r -- "-- $#start elements" ${(@)start}
echo "-- array contents: work --"
print -l -r -- "-- $#work elements" "${work[@]}"
Results:
Array size went down by one:
OK
Array item two is now gone:
OK
Array contents are as expected:
OK
-- array contents: start --
-- 5 elements
one
two
three
four 4
five
-- array contents: work --
-- 4 elements
one
three
four 4
five
To avoid gaps in the array indices after using unset - see https://stackoverflow.com/a/49626928/3223785 and https://stackoverflow.com/a/47798640/3223785 for more information - reassign the array to itself: ARRAY_VAR=("${ARRAY_VAR[@]}").
#!/bin/bash
ARRAY_VAR=(0 1 2 3 4 5 6 7 8 9)
unset 'ARRAY_VAR[5]'
unset 'ARRAY_VAR[4]'
ARRAY_VAR=("${ARRAY_VAR[@]}")
echo "${ARRAY_VAR[@]}"
A_LENGTH=${#ARRAY_VAR[*]}
for (( i=0; i<=$(( $A_LENGTH -1 )); i++ )) ; do
echo ""
echo "INDEX - $i"
echo "VALUE - ${ARRAY_VAR[$i]}"
done
exit 0
[Ref.: https://tecadmin.net/working-with-array-bash-script/ ]
How about something like:
array=(one two three)
array_t=" ${array[@]} "
delete=one
array=(${array_t// $delete / })
unset array_t
#!/bin/bash
echo "# define array with six elements"
arr=(zero one two three 'four 4' five)
echo "# unset by index: 0"
unset -v 'arr[0]'
for i in ${!arr[*]}; do echo "arr[$i]=${arr[$i]}"; done
arr_delete_by_content() { # value to delete
for i in ${!arr[*]}; do
[ "${arr[$i]}" = "$1" ] && unset -v 'arr[$i]'
done
}
echo "# unset in global variable where value: three"
arr_delete_by_content three
for i in ${!arr[*]}; do echo "arr[$i]=${arr[$i]}"; done
echo "# rearrange indices"
arr=( "${arr[@]}" )
for i in ${!arr[*]}; do echo "arr[$i]=${arr[$i]}"; done
delete_value() { # value arrayelements..., returns array decl.
local e val=$1; new=(); shift
for e in "$@"; do [ "$val" != "$e" ] && new+=("$e"); done
declare -p new|sed 's,^[^=]*=,,'
}
echo "# new array without value: two"
declare -a arr="$(delete_value two "${arr[@]}")"
for i in ${!arr[*]}; do echo "arr[$i]=${arr[$i]}"; done
delete_values() { # arraydecl values..., returns array decl. (keeps indices)
declare -a arr="$1"; local i v; shift
for v in "$@"; do
for i in ${!arr[*]}; do
[ "$v" = "${arr[$i]}" ] && unset -v 'arr[$i]'
done
done
declare -p arr|sed 's,^[^=]*=,,'
}
echo "# new array without values: one five (keep indices)"
declare -a arr="$(delete_values "$(declare -p arr|sed 's,^[^=]*=,,')" one five)"
for i in ${!arr[*]}; do echo "arr[$i]=${arr[$i]}"; done
# new array without multiple values and rearranged indices is left to the reader
I know that reading a .csv file can be done simply in bash with this loop:
#!/bin/bash
INPUT=data.csv
OLDIFS=$IFS
IFS=,
[ ! -f $INPUT ] && { echo "$INPUT file not found"; exit 99; }
while read flname dob ssn tel status
do
echo "Name : $flname"
echo "DOB : $dob"
echo "SSN : $ssn"
echo "Telephone : $tel"
echo "Status : $status"
done < $INPUT
IFS=$OLDIFS
But I want to slightly modify this- I want to make the columns be defined by the programmer in the bash file.
For example:
declare -a columns=("Name", "Surname", "ID", "Gender")
while read columns
do
//now echo everything that has been read
done < $INPUT
So I want to specify the list of variables that should be used as the container to the read CSV data with an array and then access this array inside the while body.
Is there a way to do it?
The key to this solution is the comment before the while statement below. read is a built-in, but it is still a command, and command arguments are expanded by the shell before executing the command. After expansion of ${columns[@]}, the command becomes
read Name Surname ID Gender
Example:
# Don't use commas in between array values (since they become part of the value)
# Values not quoted because valid names don't need quotes, and these
# value must be valid names
declare -a columns=(Name Surname ID Gender)
Then, we can try:
# Read is a command. Arguments are expanded.
# The quotes are unnecessary but it's hard to break habits :)
while read "${columns[@]}"; do
echo Name is "$Name"
# etc
done <<< "John Doe 27 M"
Output:
Name is John
This same approach would work even in a shell without arrays; the column names can just be a space separated list. (Example run in dash, a Posix shell)
$ columns="Name Surname ID Gender"
$ # Here it is vital that $columns not be quoted; we rely on word-splitting
$ while read $columns; do
> echo Name is $Name
> done
John Doe 27 M
Name is John
...
Read the line into an array, then loop through that array and create an associative array that uses the column names.
while read -r line
do
vals=($line)
declare -A colmap
i=0
for col in "${columns[@]}"
do
colmap[$col]=${vals[$i]}
let i=i+1
done
# do stuff with colmap here
# ...
unset colmap # Clear colmap before next iteration
done < $INPUT
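Applied to comma-separated input whose first line is the header, the same idea could be sketched as follows. The variable names (row, vals, csv) and the inline sample data are assumptions for illustration, and associative arrays need bash 4 or newer.

```shell
#!/bin/bash
# Sample data inlined for the sketch; a real script would read a file instead.
csv=$'lastname,age,country\nTorvalds,66,Finland\nvan Pelt,7,USA'

declare -A row
{
    # The header line supplies the keys, in whatever order they appear.
    IFS=, read -r -a columns
    while IFS=, read -r -a vals; do
        row=()                              # clear the previous row
        for i in "${!columns[@]}"; do
            row[${columns[i]}]=${vals[i]}   # key = column name, value = field
        done
        printf '%s is from %s\n' "${row[lastname]}" "${row[country]}"
    done
} <<< "$csv"
```

Because fields are looked up by name, the column order in the file no longer matters.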
Don't know where to ask for help so trying here.
I'm creating a bash menu script with some operations and reading lots of bash tutorials but think my brain is starting to melt with all different syntax and ways to do it, can't fully wrap my head around bash/sh. End script will run on OSX for an art team.
What this is
A script to upload/download files with rsync.
The script will grab the latest 'config/menu' from a remote server. That file, menu.txt, is used to create a menu that lists projects; when one is selected you get the option to download / upload.
Issue
Where I'm stuck is how to turn arrays into menus. I tried 2d arrays with no luck, so now it's split into 3 arrays to hold the values I need. However, when trying to display the menus I can't get it to work correctly. Look at the bottom for how to test, what it shows and what it should show.
more info: create_menus function
This function parses menu.txt to build an array that is used when showing a menu:
Project Title5, source1dir, destination1dir
Project Title6, source2dir, destination2dir
Project Title7, source3dir, destination3dir
Instead of getting selection 1 'Project Title5', a menu like this is displayed:
1) Project
2) Title5Project
3) Title6Project
4) Title7Quit
Script:
function create_menus() {
#operations for project
MENU_OPERATIONS=(
"Get latest from remote"
"Show changes from remote"
"Send latest from me to remote"
"Show changes from me to remote"
"Return to main menu"
)
#projects to choose from, load from textfile
declare -a t; declare -a s; declare -a d;
while IFS= read -r line; do
IFS=',' read -ra obj <<< "$line"
#TODO 2d array nicer than 3 arrays!
eval "t+=\"${obj[0]}\""
eval "s+=\"${obj[1]}\""
eval "d+=\"${obj[2]}\""
done <$FILE_MENU
t+="Quit" #add quit
MENU_MAIN=($t)
PROJECT_SOURCE=($s)
PROJECT_TARGET=($d)
}
then to show the main menu main_menu "${MENU_MAIN[@]}"
function main_menu
{
#clear
#header
PS3="Select project: "
select option; do # in "$@" is the default
if [ "$REPLY" -eq "$#" ];
then
echo "Exiting..."
break;
elif [ 1 -le "$REPLY" ] && [ "$REPLY" -le $(($#-1)) ];
then
# $REPLY = index
# $option = text
echo "You selected $option which is option $REPLY"
SELETED_PROJECT_TITLE=${MENU_MAIN[$REPLY]}
SELETED_PROJECT_SOURCE=${PROJECT_SOURCE[$REPLY]}
SELETED_PROJECT_TARGET=${PROJECT_TARGET[$REPLY]}
echo "Sel title $SELETED_PROJECT_TITLE"
echo "Sel source $SELETED_PROJECT_SOURCE"
echo "Sel target $SELETED_PROJECT_TARGET"
project_menu "${MENU_OPERATIONS[@]}" "$SELETED_PROJECT_TITLE" "$SELETED_PROJECT_SOURCE" "$SELETED_PROJECT_TARGET"
break;
else
echo "Incorrect Input: Select a number 1-$#"
fi
done
}
Here's full code
https://github.com/fbacker/BigFileProjectsSync/blob/master/app.sh
ADDED MORE DESCRIPTION
To test:
git clone https://github.com/fbacker/BigFileProjectsSync.git
cd BigFileProjectsSync/
./app.sh
What happens:
Shows a menu with options:
1) Project
2) Title5Project
3) Title6Project
4) Title7Quit
Should happen:
Shows a menu with options:
1) Project Title5
2) Project Title6
3) Project Title7
4) Quit
app.sh > function create_menus() > This should create a menu based on the menu.txt file.
menu.txt one line is a project: first value is project name, second value is source directory and third is target directory.
Here's a fixed version of your create_menus() function, which should do the trick:
function create_menus() {
#operations for project
# MENU_OPERATIONS=( # ... OMITTED FOR BREVITY
#projects to choose from
local -a titles sources destinations
local title source destination
while IFS='|' read -r title source destination; do
titles+=( "$title" )
sources+=( "$source" )
destinations+=( "$destination" )
done < <(sed 's/, /|/g' "$FILE_MENU")
# Copy to global arrays
MENU_MAIN+=( "${titles[@]}" )
PROJECT_SOURCE+=( "${sources[@]}" )
PROJECT_TARGET+=( "${destinations[@]}" )
MENU_MAIN+=( "Quit" ) #add quit
}
There were 2 crucial problems with your approach (I'll assume an array variable $arr below):
In order to append a new element to an array, the new element must itself be specified as an array too; i.e., it must be enclosed in (...):
arr+=( "$newElement" ) - OK: value is appended as new element
arr+=$newElement - BROKEN: String-appends the value of $newElement to $arr's first element(!), without adding a new one.
arr=( 1 2 ); arr+=3; declare -p arr -> declare -a arr='([0]="13" [1]="2")'
You can't copy a whole array with arrCopy=( $arr ) - all that does is to create a single-item array containing only $arr's first element. To refer to an array as a whole, you must use "${arr[@]}" (enclosing in "..." ensures that no word-splitting is applied):
arrCopy=( "${arr[@]}" ) - OK
arrCopy=( $arr ) - BROKEN - only copies 1st element
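A small sketch of the copy problem, with made-up titles. Note that the unquoted $arr is also subject to word-splitting, so a first element that contains spaces ends up as several elements, which matches the "1) Project 2) Title5..." menu in the question:

```shell
#!/bin/bash
arr=("Project Title5" "Project Title6" "Quit")

broken=( $arr )          # only the first element, and word-split on the space
copy=( "${arr[@]}" )     # all three elements, each kept whole

printf '%d\n' "${#broken[@]}" "${#copy[@]}"   # prints 2, then 3
```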
Also note that it's better not to use all-uppercase shell-variable names in order to avoid conflicts with environment variables and special shell variables.
I have an array list in a bash script, and a variable var. I know that $var appears in ${list[@]}, but have no easy way of determining its index. I'd like to remove it from list.
This answer achieves something very close to what I need, except that list retains an empty element where $var once was. Note, e.g.:
$ list=(one two three)
$ var="two"
$ list=( "${list[@]/$var}" )
$ echo ${list[@]}
one three
$ echo ${#list[@]}
3
The same thing happens if I use delete=( "$var" ) and replace $var with $delete in the third line. Also, doing list=( "${list[@]/$var/}" ) makes no difference either.
(I'll note that, experimenting with the comment to that answer, I managed to match only whole words using list=( "${list[@]/%$var}" ), omitting the #.)
I also saw this answer proposing a nice trick to keep track of index and use unset, but that is unfeasible in my case. Finally, the same issue also appeared here, except that OP was satisfied with the result and probably didn't run into the problem empty elements create for me later on in my script, when I iterate through list. I tried to negate that problem by using expansion as follows, without any apparent effect:
for item in "${list[@]}"; do
if [ -n ${item:+'x'} ];then
...
fi
done
It's the same when I do [ ${#item} > 0 ], and I'm running out of ideas. Suggestions?
EDIT:
I have no understanding of why this happens, but @l0b0's comment made me notice something. Using the above preamble, I get:
$ for item in "${list[@]}"; do echo "Here!"; done
Here!
Here!
Here!
but:
$ for item in ${list[@]}; do echo "Here!"; done
Here!
Here!
I'm not sure I can omit the quotes in my script, though, as items are considerably more complicated there (file names and paths, both containing spaces and odd characters).
You can delete an element from existing array though the whole process isn't very straightforward and may appear like a hack.
#!/bin/bash
list=( "one" "two" "three" "four" "five" )
var1="two"
var2="four"
printf "%s\n" "Before:"
for (( i=0; i<${#list[@]}; i++ )); do
printf "%s = %s\n" "$i" "${list[i]}";
done
for (( i=0; i<${#list[@]}; i++ )); do
if [[ ${list[i]} == $var1 || ${list[i]} == $var2 ]]; then
list=( "${list[@]:0:$i}" "${list[@]:$((i + 1))}" )
i=$((i - 1))
fi
done
printf "\n%s\n" "After:"
for (( i=0; i<${#list[@]}; i++ )); do
printf "%s = %s\n" "$i" "${list[i]}";
done
This script outputs:
Before:
0 = one
1 = two
2 = three
3 = four
4 = five
After:
0 = one
1 = three
2 = five
Key part of the script is:
list=( "${list[@]:0:$i}" "${list[@]:$((i + 1))}" )
Here we re-construct your existing array by specifying the index and length to remove the element from array completely and re-order the indices.
If you want to delete the array element & shift the indices, you can use answer by l0b0 or JS웃.
However, if you don't want to shift the indices, you can use below script-let:
(Particularly useful for associative arrays)
$ list=(one two three)
$ delete_me=two
$ for i in "${!list[@]}";do
if [ "${list[$i]}" == "$delete_me" ]; then
unset "list[$i]"
fi
done
$ for i in "${!list[@]}";do echo "$i = ${list[$i]}"; done
0 = one
2 = three
If you want to shift the indices to make them continuous, re-construct the array like this:
$ list=("${list[@]}")
$ for i in "${!list[@]}";do echo "$i = ${list[$i]}"; done
0 = one
1 = three
If you want to remove by value and shift the indexes I think you have to create a new array:
list=(one two three)
new_list=() # Not strictly necessary, but added for clarity
var="two"
for item in "${list[@]}"
do
if [ "$item" != "$var" ]
then
new_list+=("$item")
fi
done
list=("${new_list[@]}")
unset new_list
Test:
$ echo "${list[@]}"
one three
$ echo "${#list[@]}"
2