How to read in csv file to array in bash script

I have written the following code to read in my csv file (which has a fixed number of columns but not a fixed number of rows) into my script as an array. I need it to be a shell script.
usernames x1 x2 x3 x4
username1, 5 5 4 2
username2, 6 3 2 0
username3, 8 4 9 3
My code
#!/bin/bash
set oldIFS = $IFS
set IFS=,
read -a line < something.csv
Another option I have tried is
#!/bin/bash
while IFS=$'\t' read -r -a line
do
echo $line
done < something.csv
For both, I tried some test code to see what the size of the array line would be. With the first one I seem to get a size of 10, but the array only outputs username. With the second one I seem to get a size of 0, but the array outputs the whole csv.
Help is much appreciated!

You may consider using awk with a regular expression in the FS variable, like this:
awk 'BEGIN { FS=",?[ \t]*"; } { print $1,"|",$2,"|",$3,"|",$4,"|",$5; }'
or this
awk 'BEGIN { FS=",?[ \t]*"; OFS="|"; } { $1=$1; print $0; }'
($1=$1 is required to rebuild $0 with new OFS)
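If the data has to end up in a bash array rather than in awk output, a plain read loop can also do the splitting. The following is a minimal sketch, assuming bash 4 or later and the question's layout (a comma after the username, spaces between the numbers). The temporary file and the names fields and rows are hypothetical, introduced only for illustration.

```bash
#!/bin/bash
# Hypothetical sample data mirroring the layout shown in the question.
csv=$(mktemp)
cat > "$csv" <<'EOF'
usernames x1 x2 x3 x4
username1, 5 5 4 2
username2, 6 3 2 0
username3, 8 4 9 3
EOF

rows=()
# IFS=', ' applies to this read only: fields are split on commas and spaces.
while IFS=', ' read -r -a fields; do
    # Skip empty lines and the header line.
    [[ ${#fields[@]} -eq 0 || ${fields[0]} == usernames ]] && continue
    echo "user=${fields[0]} columns=${#fields[@]} x1=${fields[1]}"
    # Keep each row as one space-joined string.
    rows+=("${fields[*]}")
done < "$csv"
rm -f "$csv"

echo "rows read: ${#rows[@]}"
```

Note that in bash, set IFS=, assigns positional parameters rather than IFS, a single read consumes only one line, and echo $line prints only element 0 of the array (use "${line[@]}" for all elements and ${#line[@]} for the count).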

Related

How do I split a text file into an array by blank lines?

I have a bash command that outputs text in the following format:
Header 1
- Point 1
- Point 2
Header 2
- Point 1
- Point 2
Header 3
-Point 1
- Point 2
...
I want to parse this text into an array, separating on the empty line so that array[0] for example contains:
Header 1
- Point 1
- Point 2
And then I want to edit some of the data in the array if it satisfies certain conditions.
I was looking at something like "Separate by blank lines in bash", but I'm completely new to bash, so I don't understand how to save the output from awk RS=null to an array instead of printing it out. Could someone please point me in the right direction?
You can use the readarray command to populate a bash array from the output of GNU awk. An empty RS makes awk split records on empty lines, and setting ORS to \0 (the NUL byte) lets readarray -d '' separate the records again (the -d option requires bash 4.4 or later):
IFS= readarray -d '' arr < <(awk -v RS= -v ORS='\0' '1' file)
Check output:
echo "${arr[0]}"
Header 1
- Point 1
- Point 2
echo "${arr[1]}"
Header 2
- Point 1
- Point 2
echo "${arr[2]}"
Header 3
-Point 1
- Point 2
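For the second half of the question (editing elements that satisfy a condition), the array can be modified in place by looping over its indexes. The sketch below is only an illustration: it swaps in a pure-bash read loop instead of awk so that it does not depend on NUL-byte support, and the condition (block starts with "Header 2") and the replacement text are hypothetical.

```bash
#!/bin/bash
nl=$'\n'
blocks=()
current=""
# Collect lines into $current until a blank line closes the block.
while IFS= read -r line || [[ -n $line ]]; do
    if [[ -z $line ]]; then
        [[ -n $current ]] && blocks+=("$current")
        current=""
    else
        current+="${current:+$nl}$line"
    fi
done <<'EOF'
Header 1
- Point 1
- Point 2

Header 2
- Point 1
- Point 2
EOF
# Store the last block if the input did not end with a blank line.
[[ -n $current ]] && blocks+=("$current")

# Hypothetical edit: change the first "Point 1" in the block headed "Header 2".
for i in "${!blocks[@]}"; do
    if [[ ${blocks[i]} == "Header 2"* ]]; then
        blocks[i]=${blocks[i]/Point 1/Point 1 (edited)}
    fi
done

printf '%s\n---\n' "${blocks[@]}"
```

The same index loop works on an array filled by readarray; only the way the array is populated differs.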

Bash: awk output to array

I'm trying to put the output of an awk command into a bash array, but I'm having a bit of trouble.
>>test.sh
f_checkuser() {
_l="/etc/login.defs"
_p="/etc/passwd"
## get min UID limit ##
l=$(grep "^UID_MIN" $_l)
## get max UID limit ##
l1=$(grep "^UID_MAX" $_l)
awk -F':' -v "min=${l##UID_MIN}" -v "max=${l1##UID_MAX}" '{ if ( $3 >= min && $3 <= max && $7 != "/sbin/nologin" ) print $0 }' "$_p"
}
...
Used files:
Sample File: /etc/login.defs
>>/etc/login.defs
### Min/max values for automatic uid selection in useradd
UID_MIN 1000
UID_MAX 60000
Sample File: /etc/passwd
>>/etc/passwd
root:x:0:0:root:/root:/usr/bin/zsh
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
admin:x:1000:1000:Administrator,,,:/home/admin:/bin/bash
daniel:x:1001:1001:Daniel,,,:/home/daniel:/bin/bash
The output looks like:
admin:x:1000:1000:Administrator,,,:/home/admin:/bin/bash
daniel:x:1001:1001:Daniel,,,:/home/daniel:/bin/bash
or, when printing only the first field (awk ... print $1 }' "$_p"):
admin
daniel
Now my problem is how to save the awk output in an array so that I can use it as a variable.
>>test.sh
...
f_checkuser
echo "Array items and indexes:"
for index in ${!LOKAL_USERS[*]}
do
printf "%4d: %s\n" $index ${array[$index]}
done
It could/should look like this example.
Array items and indexes:
0: admin
1: daniel
Specifically, I would like to get all users of a system (not root, bin, sys, ssh, ...), excluding blocked users, in an array.
Perhaps someone has another idea for solving my problem?
Are you trying to store the output of a script in an array? Bash has a way of doing this. For example,
a=( $(seq 1 10) ); echo ${a[1]}
will populate the array a with the elements 1 to 10 and will print 2, the second value generated by seq (array indexes start at zero). Simply replace the contents of $(...) with your script. Note that the output is split on any whitespace, not only on newlines.
For those coming to this years later ...
bash 4 introduced readarray (aka mapfile) exactly for this purpose.
See also Bash capturing output of awk into array
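As a rough illustration of the readarray/mapfile approach applied to this question, here is a minimal sketch, assuming bash 4 or later. It uses a hypothetical temporary file as a stand-in for /etc/passwd and hard-codes min and max to the values from the sample /etc/login.defs; the array name users is also an assumption.

```bash
#!/bin/bash
# Hypothetical stand-in for /etc/passwd so the sketch runs anywhere.
passwd_file=$(mktemp)
cat > "$passwd_file" <<'EOF'
root:x:0:0:root:/root:/usr/bin/zsh
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
admin:x:1000:1000:Administrator,,,:/home/admin:/bin/bash
daniel:x:1001:1001:Daniel,,,:/home/daniel:/bin/bash
EOF
# Assumed limits, taken from the sample /etc/login.defs in the question.
min=1000
max=60000

# mapfile -t stores each output line as one element, without the trailing newline.
mapfile -t users < <(
    awk -F: -v min="$min" -v max="$max" \
        '$3 >= min && $3 <= max && $7 != "/sbin/nologin" { print $1 }' "$passwd_file"
)
rm -f "$passwd_file"

echo "Array items and indexes:"
for index in "${!users[@]}"; do
    printf '%4d: %s\n' "$index" "${users[index]}"
done
```

Because the process substitution feeds mapfile directly, no temporary file is needed for the awk output itself, and the array is filled in the current shell rather than in a subshell.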
One solution that works:
array=()
f_checkuser(){
...
...
tempfile="${HOME}/localuser.tmp"
touch "${tempfile}"
awk -F':'...'{... print $1 }' "$_p" > "${tempfile}"
getArrayfromFile "${tempfile}"
}
getArrayfromFile() {
i=0
while read line # Read a line
do
array[i]=$line # Put it into the array
i=$(($i + 1))
done < $1
}
f_checkuser
echo "Array items and indexes:"
for index in ${!array[*]}
do
printf "%4d: %s\n" $index ${array[$index]}
done
Output:
Array items and indexes:
0: daniel
1: admin
But I would prefer to do this without a new temp file.
So, does anyone have another idea that avoids the temp file?

Access a bash array in awk loop

I have a bash array like
myarray=(1 2 3 4 5 ... n)
Also I am reading a file with an input of only one line for example:
1 2 3 4 5 ... n
I am reading it line by line into an array and printing it with:
awk 'BEGIN{FS=OFS="\t"}
NR>=1{for (i=1;i<=NF;i++) a[i]+=$i}
END{for (i=1;i<NF;i++) print OFS a[i]}' myfile.txt
myarray has the same size as a. Now myarray starts with the index 0 and a with index 1. My main problem though is how I can pass the bash array to my awk expression so that I can use it inside the print loop with the corresponding elements. So what I tried was this:
awk -v array="${myarray[*]}"
'BEGIN{FS=OFS="\t"}
NR>=1{for (i=1;i<=NF;i++) a[i]+=$i}
END{for (i=1;i<NF;i++) print OFS a[i] OFS array[i-1]}' myfile.txt
This doesn't work, though. I don't get any output for myarray. My desired output in this example would be:
1 1
2 2
3 3
4 4
5 5
...
n n
To my understanding, you just need to feed awk the bash array in the correct way: pass it as a single string and turn it into an awk array with split():
awk -v bash_array="${myarray[*]}" \
'BEGIN{split(bash_array,array); FS=OFS="\t"}
NR>=1{for (i=1;i<=NF;i++) a[i]+=$i}
END{for (i=1;i<=NF;i++) print a[i], array[i]}' file
Since split() numbers the elements of array[] from 1, array[i] lines up with a[i], and you don't have to worry about the bash indices starting from 0.
Note also that print a,b is the same (and cleaner) as print a OFS b, since you already defined OFS in the BEGIN block.
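The split() approach can be tried end to end with a small sketch like the one below. The values in myarray, the temporary input file, and the variable cols are hypothetical, chosen only so that the result is easy to check.

```bash
#!/bin/bash
# Hypothetical bash array and a one-line, tab-separated input file.
myarray=(10 20 30)
input=$(mktemp)
printf '1\t2\t3\n' > "$input"

# "${myarray[*]}" joins the elements with spaces; split() turns that string
# back into an awk array whose indexes start at 1.
result=$(awk -v bash_array="${myarray[*]}" '
    BEGIN { split(bash_array, array, " "); FS = OFS = "\t" }
    { for (i = 1; i <= NF; i++) a[i] += $i; cols = NF }
    END { for (i = 1; i <= cols; i++) print a[i], array[i] }
' "$input")
rm -f "$input"

printf '%s\n' "$result"
```

Saving NF in cols is a defensive choice for awk implementations that do not keep NF from the last record in the END block. This sketch also assumes the array elements contain no whitespace, since they are joined and split on spaces.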

Bash function with array won't work

I am trying to write a function in bash but it won't work. The function is as follows, it gets a file in the format of:
1 2 first 3
4 5 second 6
...
I'm trying to access only the strings in the 3rd word in every line and to fill the array "arr" with them, without repeating identical strings.
When I activated the "echo" command right after the for loop, it printed only the first string in every iteration (in the above case "first").
Thank you!
function storeDevNames {
n=0
b=0
while read line; do
line=$line
tempArr=( $line )
name=${tempArr[2]}
for i in $arr ; do
#echo ${arr[i]}
if [ "${arr[i]}" == "$name" ]; then
b=1
break
fi
done
if [ "$b" -eq 0 ]; then
arr[n]=$name
n=$(($n+1))
fi
b=0
done < $1
}
The following line seems suspicious
for i in $arr ; do
I changed it as follows and it works for me:
#! /bin/bash
function storeDevNames {
n=0
b=0
while read line; do
# line=$line # ?!
tempArr=( $line )
name=${tempArr[2]}
for i in "${arr[@]}" ; do
if [ "$i" == "$name" ]; then
b=1
break
fi
done
if [ "$b" -eq 0 ]; then
arr[n]=$name
(( n++ ))
fi
b=0
done
}
storeDevNames < <(cat <<EOF
1 2 first 3
4 5 second 6
7 8 first 9
10 11 third 12
13 14 second 15
EOF
)
echo "${arr[@]}"
You can replace all of your read block with:
arr=( $(awk '{print $3}' <"$1" | sort | uniq) )
This will fill arr with only unique names from the 3rd word such as first, second, ... This will reduce the entire function to:
function storeDevNames {
arr=( $(awk '{print $3}' <"$1" | sort | uniq) )
}
Note: this will provide a list of all unique device names in sorted order, so removing duplicates this way also destroys the original order. If you need to preserve the original order, apart from the removed duplicates, see 4ae1e1's alternative.
You're using the wrong tool. awk is designed for this kind of job.
awk '{ if (!seen[$3]++) print $3 }' <"$1"
This one-liner prints the third column of each line, removing duplicates along the way while preserving the order of lines (only the first occurrence of each unique string is printed). sort | uniq, on the other hand, breaks the original order of lines. The one-liner is also faster than sort | uniq on large files (which doesn't seem to apply in OP's case), since it scans the file once in linear time, while sorting costs O(n log n).
As an example, for an input file with contents
1 2 first 3
4 5 second 6
7 8 third 9
10 11 second 12
13 14 fourth 15
the above awk one-liner gives you
first
second
third
fourth
To put the results in an array:
arr=( $(awk '{ if (!seen[$3]++) print $3 }' <"$1") )
Then echo ${arr[@]} will give you first second third fourth.
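A variant that avoids the word splitting and glob expansion of arr=( $(...) ) is to read the awk output with mapfile. This is a minimal sketch, assuming bash 4 or later; the temporary sample file is hypothetical and reuses the example input above.

```bash
#!/bin/bash
storeDevNames() {
    # $1: path to the input file; fills the global array arr with the
    # unique third-column values, in order of first appearance.
    mapfile -t arr < <(awk '!seen[$3]++ { print $3 }' "$1")
}

# Hypothetical sample file with the same contents as the example input.
sample=$(mktemp)
cat > "$sample" <<'EOF'
1 2 first 3
4 5 second 6
7 8 third 9
10 11 second 12
13 14 fourth 15
EOF

storeDevNames "$sample"
rm -f "$sample"

echo "${arr[@]}"
```

With mapfile -t, each output line becomes exactly one element, so a name containing characters such as * would not be expanded by the shell.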

KSH Error : '$' unexpected

The KSH script below results in the error "Syntax error at line 4: '$' unexpected"
!#/bin/ksh
for i in `cat pins.list`
do
set -A array_${i} `grep -i "$i " pins.txt | awk '{print $2}'`
echo "Elements of array_${i} are ${array_${i}[@]}"
done
#=================================
I am creating multiple arrays (array_$i) for each iteration of i, after parsing the file pins.txt.
I can see the arrays array_block , array_group, array_range created and the elements of pins.txt stored in these arrays correctly, but I am unable to print the values of each of these arrays due to this error. Printing the contents of these 3 arrays outside the loop has no issues. But I need to access these arrays inside the loop for further processing in my script. Is there a way to resolve this?
Contents of pins.list and pins.txt are as follows:
pins.list (Arrays)
==================
block
group
range
pins.txt
===========
range 444
group 46
range 32
block 96
group 99
range 123
block 56
range 22
Thanks
You cannot create a dynamic variable name this way; you need eval. For example:
while read i
do
eval "set -A array_${i} \$(grep -i $i pins.txt | awk '{print \$2}')"
eval "echo \"Elements of array_${i} are \${array_${i}[@]}\" "
done < pins.list
I have changed the for loop to a while loop; this is an alternative way of reading a file without using cat. Also, check your #! line: it should be #!/bin/ksh, not !#/bin/ksh.
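For readers who can use bash instead of ksh, a different technique from the eval answer above is available: mapfile accepts a computed array name, and a nameref (declare -n, bash 4.3 or later) can read it back. The sketch below is an assumption-laden illustration, not a ksh solution. The temporary file stands in for pins.txt, the names from pins.list are inlined, ref is a hypothetical helper name, and the match is an exact comparison on the first field rather than the case-insensitive grep of the original.

```bash
#!/bin/bash
# Hypothetical stand-in for pins.txt.
pins_txt=$(mktemp)
cat > "$pins_txt" <<'EOF'
range 444
group 46
range 32
block 96
group 99
range 123
block 56
range 22
EOF

# The names from pins.list, inlined for the sketch.
for name in block group range; do
    # mapfile takes the array name as an ordinary string, so it can be computed.
    mapfile -t "array_${name}" < <(awk -v key="$name" '$1 == key { print $2 }' "$pins_txt")
    # ref becomes an alias for array_block, array_group, or array_range.
    declare -n ref="array_${name}"
    echo "Elements of array_${name} are ${ref[*]}"
    unset -n ref
done
rm -f "$pins_txt"
```

Avoiding eval here means the file contents are never re-parsed as shell code, which is the main reason to prefer this form where bash is available.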
