STATUS QUO
I have an external properties file where a couple of variables are stored. One of them holds a list of values (at least 1 value, at most X values).
Declared directly within the shell script, this single list would look like this:
NODES=(
"node"
"node-2"
"node-3"
)
I read the values from the properties file like this:
# checks, if file exists and is readable
file="./properties.credo"
if [ ! -r "$file" ]
then
echo "fatal: properties file $file not found / readable."
exit 2
fi
# loads properties into variables
. $file
[[ -z "$VALUEA" ]] && echo "fatal: VALUEA not specified in .credo" && exit 2
...
PROBLEM
When defining the NODES values in the properties file like this:
NODES=node,node-2,node-3
... and reading them with this:
...
[[ -z "$NODES" ]] && echo "fatal: NODES not specified in .credo" && exit 2
...
... the value is read from the file as the single string node,node-2,node-3, not as a list or one-dimensional array.
Beware! Sourcing the file like this is dangerous, as the config file can contain something like PATH=/some/path and the script will then execute commands over which you have no control.
You can use read -a to populate an array. Set IFS to the separator and send the values read from the file to read's standard input.
#! /bin/bash
file=$1
# split the line on '=' into the variable name and the comma-separated values
IFS== read var values < "$file"
# split the values on ',' into an array named after $var (here: NODES)
IFS=, read -a $var <<< "$values"
echo "${NODES[@]}"
For a multiline config, I tried the following:
nodes=node1,node2,node3
servers=server1,server2
and modified the script to loop over the input:
#! /bin/bash
file=$1
while IFS== read var values ; do
IFS=, read -a $var <<< "$values"
done < "$file"
echo "${nodes[#]}"
echo "${servers[#]}"
You might need to skip over lines that don't follow the var=val1,val2,... pattern.
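A minimal sketch of such a check, assuming only lines shaped like name=val1,val2,... should be processed (the exact pattern is an assumption about what else the config may contain):
#! /bin/bash
file=$1
while IFS= read -r line ; do
    # skip blank lines, comments and anything not shaped like var=val1,val2,...
    [[ $line =~ ^[A-Za-z_][A-Za-z_0-9]*=[^=]*$ ]] || continue
    IFS== read var values <<< "$line"
    IFS=, read -a "$var" <<< "$values"
done < "$file"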
I have a file like this below:
-bash-4.2$ cat a1.txt
0 10.95.187.87 5444 up 0.333333 primary 0 false 0
1 10.95.187.88 5444 up 0.333333 standby 1 true 0
2 10.95.187.89 5444 up 0.333333 standby 0 false 0
I want to fetch the data from the above file into a 2D array.
Can you please suggest a suitable way to put this into an array?
Once that's done, I need a condition that checks whether the value in the 4th column is up or down. If it's up, nothing needs to happen; if it's down, the command below needs to be executed.
-bash-4.2$ pcp_attach_node -w -U pcpuser -h localhost -p 9898 0
(The value at the end is fetched from the 1st column.)
You could try something like this:
while read -r line; do
declare -a array=( $line ) # use IFS
echo "${array[0]}"
echo "${array[1]}" # and so on
if [[ "$array[3]" ]]; then
echo execute command...
fi
done < a1.txt
Or:
while read -r -a array; do
if [[ "$array[3]" ]]; then
echo execute command...
fi
done < a1.txt
This only works if the fields are space-separated (any kind of whitespace).
You could probably mix that with regexp if you need more precise control of the format.
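For instance, a rough sketch (the pattern and the echoed command are assumptions based on the question's a1.txt) that only acts on lines matching the node-id / IP / port / status layout:
re='^([0-9]+)[[:space:]]+([0-9.]+)[[:space:]]+([0-9]+)[[:space:]]+(up|down)'
while read -r line; do
    # capture the node id (column 1) and the status (column 4)
    if [[ $line =~ $re ]]; then
        id=${BASH_REMATCH[1]}
        status=${BASH_REMATCH[4]}
        if [[ $status == down ]]; then
            echo "would run: pcp_attach_node -w -U pcpuser -h localhost -p 9898 $id"
        fi
    fi
done < a1.txt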
Firstly, bash does not have true 2D arrays, but you can store lines in a 1-D array.
Here is a script, parse1a.sh, that demonstrates emulating 2D arrays for the type of data you included:
#!/bin/bash
function get_element () {
line=${ARRAY[$1]}
echo $line | awk "{print \$$(($2+1))}" #+1 since awk is one-based
}
function set_element () {
line=${ARRAY[$1]}
declare -a SUBARRAY=($line)
SUBARRAY[$(($2))]=$3
ARRAY[$1]="${SUBARRAY[#]}"
}
ARRAY=()
while IFS='' read -r line || [[ -n "$line" ]]; do
#echo $line
ARRAY+=("$line")
done < "$1"
echo "Full array contents printout:"
printf "%s\n" "${ARRAY[#]}" # Full array contents printout.
for line in "${ARRAY[#]}"; do
#echo $line
if [ "$(echo $line | awk '{print $4}')" == "down" ]; then
echo "Replace this with what to do for down"
else
echo "...and any action for up - if required"
fi
done
echo "Element access of [2,3]:"
echo "get_element 2 3 : "
get_element 2 3
echo "set_element 2 3 left: "
set_element 2 3 left
echo "get_element 2 3 : "
get_element 2 3
echo "Full array contents printout:"
printf "%s\n" "${ARRAY[#]}" # Full array contents printout.
It can be executed by:
./parse1a.sh a1.txt
Hope this is close to what you are looking for. Note that this code will lose all indenting spaces during manipulation, but a formatted update of the lines could solve that.
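If keeping the columns aligned matters, one possible variant of set_element (the field widths below are assumptions based on a1.txt) rebuilds the line with printf instead of plain word joining:
function set_element_formatted () {
    line=${ARRAY[$1]}
    declare -a SUBARRAY=($line)
    SUBARRAY[$(($2))]=$3
    # re-join with fixed-width fields so the columns stay lined up
    ARRAY[$1]=$(printf "%-2s %-14s %-5s %-4s %-9s %-8s %-2s %-6s %s" "${SUBARRAY[@]}")
}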
I'd like to either process one row of a csv file or the whole file.
The variables are set by the header row, which may be in any order.
There may be up to 12 columns, but only 3 or 4 variables are needed.
The source files might be in either format, and all I want from both is lastname and country. I know of many different ways and tools to do it if the columns were fixed and always in the same order. But they're not.
examplesource.csv:
firstname,lastname,country
Linus,Torvalds,Finland
Linus,van Pelt,USA
examplesource2.csv:
lastname,age,country
Torvalds,66,Finland
van Pelt,7,USA
I have cobbled together something from various Stackoverflow postings which looks a bit voodoo but seems fairly robust. I say "voodoo" because shellcheck complains that, for example, "firstname is referenced but not assigned". And yet it prints it.
#!/bin/bash
#set the field separator to newline
IFS=$'\n'
#split/transpose the first-line column titles to rows
COLUMNAMES=$(head -n1 examplesource.csv | tr ',' '\n')
#set an array and read the columns into it
columns=()
for line in $COLUMNAMES; do
columns+=("$line")
done
#reset the field separator
IFS=","
#using -p here to debug in output
declare -ap columns
#read from line 2 onwards
sed 1d examplesource.csv | while read "${columns[#]}"; do
echo "${firstname} ${lastname} is from ${country}"
done
In the case of looping through everything, it works perfectly for my needs and I can process within the "while read" loop. But to make it cleaner, I'd rather pass the current element(?) to an external function to process (not just echo).
And if I only wanted the array (current row) belonging to "Torvalds", I cannot find how to access that or even get its current index, eg: "if $wantedname && $lastname == $wantedname then call function with currentrow only otherwise loop all rows and call function".
I know there aren't multidimensional associative arrays in bash from reading
Multidimensional associative arrays in Bash and I've tried to understand arrays from
https://opensource.com/article/18/5/you-dont-know-bash-intro-bash-arrays
Is it clear what I'm trying to achieve in a bash-only manner and does the question make sense?
Many thanks.
Let's shorten your function. Don't read the source twice (first with head, then with sed); you can do it in one pass. Also, the whole array-reading loop can be shortened to just IFS=',' COLUMNAMES=($(head -n1 source.csv)). Here's a shorter version:
#!/bin/bash
cat examplesource.csv |
{
IFS=',' read -r -a columnnames
while IFS=',' read -r "${columnnames[#]}"; do
echo "${firstname} ${lastname} is from ${country}"
done
}
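With examplesource.csv from the question, this shorter version should print:
Linus Torvalds is from Finland
Linus van Pelt is from USA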
If you want to parse both files at the same time, i.e. join them, nothing simpler ;). First, let's number the lines in the first file using nl -w1 -s,. Then we use join to join the files on the people's last names. Remember that join's inputs need to be sorted on the join fields. Then we sort the output with sort using the line number from the first file. After that we can read all the data just like this:
# join the files, using `,` as the separator
# on the 3rd field from the first file and the first field from the second file
# the output should be first the fields from the first file, then the second file
# the country (field 1.4) is duplicated in 2.3, so just omitting it.
join -t, -13 -21 -o 1.1,1.2,1.3,2.2,2.3 <(
# number the lines in the first file
<examplesource.csv nl -w1 -s, |
# there is one field more, sort using the 3rd field
sort -t, -k3
) <(
# sort the second file using the first field
<examplesource2.csv sort -t, -k1
) |
# sort the output using the numbers from the first file
sort -t, -k1 -n |
# well, remove the numbers
cut -d, -f2- |
# just a normal read follows
{
# read the headers
IFS=, read -r -a names
while IFS=, read -r "${names[#]}"; do
# finally our output!
echo "${firstname} ${lastname} is from ${country} and is so many ${age} years old!"
done
}
Tested on tutorialspoint.
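With examplesource.csv and examplesource2.csv from the question, the whole pipeline should end up printing:
Linus Torvalds is from Finland and is so many 66 years old!
Linus van Pelt is from USA and is so many 7 years old!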
GNU Awk has multidimensional arrays. It also has array sorting mechanisms, which I have not used here. Please comment if you are interested in pursuing this solution further. The following depends on consistent key names and line numbers across input files, but can handle an arbitrary number of fields and input files.
$ gawk -V |gawk NR==1
GNU Awk 4.1.4, API: 1.1 (GNU MPFR 3.1.5, GNU MP 6.1.2)
$ gawk -F, '
FNR == 1 {for(f=1;f<=NF;f++) Key[f]=$f}
FNR != 1 {for(f=1;f<=NF;f++) People[FNR][Key[f]]=$f}
END {
for(Person in People) {
for(attribute in People[Person])
output = output FS People[Person][attribute]
print substr(output,2)
output=""
}
}
' file*
66,Finland,Linus,Torvalds
7,USA,Linus,van Pelt
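The array sorting mentioned above is not used in that command. As a small gawk-only sketch (not part of the original answer), setting PROCINFO["sorted_in"] pins down the otherwise unspecified for-in traversal order, e.g. replacing the END block so the attributes come out in alphabetical order:
END {
    PROCINFO["sorted_in"] = "@ind_str_asc"  # gawk: visit array indices in string order
    for(Person in People) {
        for(attribute in People[Person])
            output = output FS People[Person][attribute]
        print substr(output,2)
        output=""
    }
}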
A bash solution takes a bit more work than an awk solution, but if this is an exercise in what bash provides, it has everything you need: determine the column holding the last name from the first line of input, then output that field from the remaining lines.
An easy approach is simply to read each line into a normal array and then loop over the elements of the first line to locate the column "lastname" appears in, saving that column index in a variable. You can then read each of the remaining lines the same way and output the lastname field by printing the element at the saved index.
A short example would be:
#!/bin/bash
col=0 ## column count for lastname
cnt=0 ## line count
while IFS=',' read -a arr; do ## read each line into array
if [ "$cnt" -eq '0' ]; then ## test if line-count is zero
for ((i = 0; i < "${#arr[@]}"; i++)); do ## loop for lastname
[ "${arr[i]}" = 'lastname' ] && ## test for lastname
{ col=$i; break; } ## if found, save the column index and break
done
fi
[ "$cnt" -gt '0' ] && ## if not header row
echo "line $cnt lastname: ${arr[col]}" ## output lastname variable
((cnt++)) ## increment linecount
done < "$1"
Example Use/Output
Using your two data files, the output would be:
$ bash readcsv.sh ex1.csv
line 1 lastname: Torvalds
line 2 lastname: van Pelt
$ bash readcsv.sh ex2.csv
line 1 lastname: Torvalds
line 2 lastname: van Pelt
A similar implementation using awk would be:
awk -F, '
NR == 1 {
for (i = 1; i <= NF; i++)
if ($i == "lastname") { col = i; break }
next
}
NR > 1 {
print "lastname: ", $col
}
' ex1.csv
Example Use/Output
$ awk -F, 'NR == 1 { for (i = 1; i <= NF; i++) if ($i == "lastname") { col = i; break }; next } NR > 1 { print "lastname: ", $col }' ex1.csv
lastname: Torvalds
lastname: van Pelt
(output is the same for either file)
Thank you all. I've taken a couple of bits from two answers
I used the answer from David to find the number of the row, then I used the elegantly simple solution from Kamil to loop through what I need.
The result is exactly what I wanted. Thank you all.
$ readexample.sh examplesource.csv "Torvalds"
Everyone
Linus Torvalds is from Finland
Linus van Pelt is from USA
now just Torvalds
Linus Torvalds is from Finland
And this is the code - now that you know what I want it to do, if anyone can see any dangers or improvements, please let me know as I'm always learning. Thanks.
#!/bin/bash
FILENAME="$1"
WANTED="$2"
printDetails() {
SINGLEROW="$1"
[[ ! -z "$SINGLEROW" ]] && opt=("--expression" "1p" "--expression" "${SINGLEROW}p") || opt=("--expression" "1p" "--expression" "2,199p")
sed -n "${opt[#]}" "$FILENAME" |
{
IFS=',' read -r -a columnnames
while IFS=',' read -r "${columnnames[#]}"; do
echo "${firstname} ${lastname} is from ${country}"
done
}
}
findRow() {
col=0 ## column count for lastname
cnt=0 ## line count
while IFS=',' read -a arr; do ## read each line into array
if [ "$cnt" -eq '0' ]; then ## test if line-count is zero
for ((i = 0; i < "${#arr[@]}"; i++)); do ## loop for lastname
[ "${arr[i]}" = 'lastname' ] && ## test for lastname
{
col=$i
break
} ## if found, save the column index and break
done
fi
[ "$cnt" -gt '0' ] && ## if not header row
if [ "${arr[col]}" == "$1" ]; then
echo "$cnt" ## output lastname variable
fi
((cnt++)) ## increment linecount
done <"$FILENAME"
}
echo "Everyone"
printDetails
if [ ! -z "${WANTED}" ]; then
echo -e "\nnow just ${WANTED}"
row=$(findRow "${WANTED}")
printDetails "$((row + 1))"
fi
Before you get concerned, this (hopefully) isn't another of the many threads about having variable access problems from a while loop in BASH. I looked at those and implemented the redirect method, but I'm having some issues.
Code:
#!/bin/bash
rm /tmp/sched.now
rm /tmp/sched.later
NOWURL="schedNOWurl"
LATERURL="schedLATERurl"
curl -s $NOWURL --output /tmp/sched.now
curl -s $LATERURL --output /tmp/sched.later
re='^[0-9]+$'
if grep -q 'no shifts' /tmp/sched.now ; then
nowpeople[0]="No Shifts Assigned For This Time"
nowtime[0]="Now"
else
while read line
do
if [[ ${line:0:1} =~ $re ]] ; then
nowtime[$nowcount]=$line
((nowcount++))
elif [[ ${line:0:1} == '-' ]] ; then
nowpeople[$nowcount]=$line
((nowcount++))
fi
done < /tmp/sched.now
fi
declare -a latertime
declare -a laterpeople
if grep -q 'no shifts' /tmp/sched.later ; then
laterpeople[0]="No Shifts Assigned For This Time"
latertime[0]="Until 23:59"
else
while read line
do
if [[ ${line:0:1} =~ $re ]] ; then
latertime[$latercount]=$line
((latercount++))
elif [[ ${line:0:1} == '-' ]] ; then
laterpeople[$latercount]=$line
((latercount++))
fi
done < /tmp/sched.later
fi
echo ${nowpeople[@]}
echo ${nowtime[@]}
echo ${laterpeople[@]}
echo ${latertime[@]}
The URLs that I'm getting are HTML files. The text I'm attempting to pull in the loops looks something like this
11a-1:30p
- John Doe
with some HTML thrown above and below it. Note that these are all on their own lines.
The issue I'm facing is that the loops aren't saving all of the matches to the array. In the second loop, it should pull three entries into each array. Right now it pulls one with malformed text as follows:
- John Doe
1:30p-4pp
Notice the missing m in pm, and the m replaced with a p in pp.
The strange thing is that if I echo $line before the array statement, the line is pulled from the file fine, so it sounds like it's something in the way I'm saving to the array.
Suggestions? Please let me know if you need more details.
I've posted my code below. I'm wondering if I can search one array for a match... or if there's a way I can search a Unix file inside of an argument.
#!/bin/bash
# store words in file
cat $1 | ispell -l > file
# move words in file into array
array=($(< file))
# remove temp file
rm file
# move already checked words into array
checked=($(< .spelled))
# print out words & ask for corrections
for ((i=0; i<${#array[@]}; i++ ))
do
if [[ ! ${array[i]} = ${checked[@]} ]]; then
read -p "' ${array[i]} ' is mispelled. Press "Enter" to keep
this spelling, or type a correction here: " input
if [[ ! $input = "" ]]; then
correction[i]=$input
else
echo ${array[i]} >> .spelled
fi
fi
done
echo "MISPELLED: CORRECTIONS:"
for ((i=0; i<${#correction[@]}; i++ ))
do
echo ${array[i]} ${correction[i]}
done
Otherwise, I would need to write a for loop to check each array index, and then somehow make a decision about whether to go through the loop and print/take input.
The usual shell incantation to do this is:
cat $1 | ispell -l | while read -r ln
do
    # read the correction from the terminal, not from the pipe feeding the loop
    read -p "$ln is misspelled. Enter correction: " corrected < /dev/tty
    if [ -n "$corrected" ] ; then
        ln=$corrected
    fi
    echo "$ln"
done > correctedwords.txt
The while;do;done is kind of like a function and you can pipe data into and out of it.
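A tiny illustration of that point (the strings are arbitrary): the loop reads from the pipe on its left, and its collected output can be redirected as a single unit:
printf '%s\n' one two three | while read -r w; do echo "got $w"; done > out.txt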
P.S. I didn't test the above code so there may be syntax errors