How to parse only selected column values using awk - arrays

I have a sample flat file which contains the following block
test my array which array is better array huh got it?
INDIA USA SA NZ AUS ARG ARM ARZ GER BRA SPN
I also have an array(ksh_arr2) which was defined like this
ksh_arr2=$(awk '{if(NR==1){for(i=1;i<=NF;i++){if($i~/^arr/){print i}}}}' testUnix.txt)
and contains the following integers
3 5 8
Now I want to print only the column values at those positions, i.e. the third, fifth and eighth.
I also want the output from the 2nd line onwards.
So I tried the following
awk '{for(i=1;i<=NF;i++){if(NR >=1 && i=${ksh_arr2[i]}) do print$i ; done}}' testUnix.txt
but it is apparently not printing the desired outputs.
What am I missing ? Please help.

How I would approach it
awk -vA="${ksh_arr2[*]}" 'BEGIN{split(A,B," ")}{for(i in B)print $B[i]}' file
Explanation
-vA="${ksh_arr2[*]}" - Sets the awk variable A to the expanded ksh array
'BEGIN{split(A,B," ")} - Splits the expanded array on spaces
(effectively recreating it in awk)
{for(i in B)print $B[i]} - For each index in the new array, prints the field whose number is
stored at that index
Edit
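If you want to preserve the order of the fields when printing, this is better, because for(i in B) does not guarantee any particular order. The counter has to restart on every input line, otherwise only the first line produces output:
awk -vA="${ksh_arr2[*]}" 'BEGIN{n=split(A,B," ")}{for(i=1;i<=n;i++)print $B[i]}' file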

Since no sample output is shown, I don't know if this output is what you want. It is the output one gets from the code provided with the minimal changes required to get it to run:
$ awk -v k='3 5 8' 'BEGIN{split(k,a," ");} {for(i=1;i<=length(a);i++){print $a[i]}}' testUnix.txt
array
array
array
SA
AUS
ARZ
The above code prints out the selected columns in the same order supplied by the variable k.
Notes
The awk code never defined ksh_arr2. I presume that the value of this array was to be passed in from the shell. It is done here using the -v option to set the variable k to the value of ksh_arr2.
It is not possible to pass into awk an array directly. It is possible to pass in a string, as above, and then convert it to an array using the split function. Above the string k is converted to the awk array a.
awk syntax is different from shell syntax. For instance, awk does not use do or done.
Details
-v k='3 5 8'
This defines an awk variable k. To do this programmatically, replace 3 5 8 with a string or array from the shell.
BEGIN{split(k,a," ");}
This converts the space-separated values in variable k into an array named a.
for(i=1;i<=length(a);i++){print $a[i]}
This prints out each column in array a in order.
Alternate Output
If you want to keep the output from each line on a single line:
$ awk -v k='3 5 8' 'BEGIN{split(k,a," ");} {for(i=1;i<length(a);i++) printf "%s ",$a[i]; print $a[length(a)]}' testUnix.txt
array array array
SA AUS ARZ

awk 'NR>=2 { print $3 " " $5 " " $8 }' testUnix.txt
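To tie the answers above to the original request (column numbers taken from a shell array, output starting at the 2nd line), here is a minimal sketch. The function name select_columns and the temporary sample file are assumptions made for illustration; they are not from the question.

```shell
#!/bin/bash
# Hypothetical helper: print the fields listed in "$1" (space-separated
# column numbers) for every line after the first line of file "$2".
select_columns() {
    awk -v cols="$1" '
        BEGIN { n = split(cols, idx, " ") }   # n = number of requested columns
        NR >= 2 {                             # skip the first (header) line
            for (i = 1; i < n; i++) printf "%s ", $(idx[i])
            print $(idx[n])                   # last column ends the output line
        }' "$2"
}

# Sample data from the question, written to a temporary file.
sample=$(mktemp)
printf '%s\n' 'test my array which array is better array huh got it?' \
              'INDIA USA SA NZ AUS ARG ARM ARZ GER BRA SPN' > "$sample"

ksh_arr2=(3 5 8)                              # a real array, not a plain string
select_columns "${ksh_arr2[*]}" "$sample"     # prints: SA AUS ARZ
rm -f "$sample"
```

Expanding the array with [*] joins its elements with spaces (assuming the default IFS), which is the form split expects.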

Related

bash shell pull all values from array in random order. Pull each value only once

I need to pull the values from an array in a random order. It shouldn't pull the same value twice.
R=$(($RANDOM%5))
mixarray=("I" "like" "to" "play" "games")
echo ${mixarray[$R]}
I'm not sure what to do after the code above. I thought of putting the first pulled value into another array, and then nesting it all in a loop that checks that second array so it doesn't pull the same value twice from the first array. After many attempts, I just can't get the syntax right.
The output should be something like:
to
like
I
play
games
Thanks,
Would you please try the following:
#!/bin/bash
mixarray=("I" "like" "to" "play" "games")
mapfile -t result < <(for (( i = 0; i < ${#mixarray[@]}; i++ )); do
echo -e "${RANDOM}\t${i}"
done | sort -nk1,1 | cut -f2 | while IFS= read -r n; do
echo "${mixarray[$n]}"
done)
echo "${result[*]}"
First, the loop prints a random number and an index (starting at 0) side by side,
once for each element of mixarray.
The output will look like:
13959 0
6416 1
6038 2
492 3
19893 4
Then the table is sorted with the 1st field:
492 3
6038 2
6416 1
13959 0
19893 4
Now the indices in the 2nd field are in random order. That field is extracted with
the cut command.
Next, the while loop prints the elements of mixarray in the order of the randomized indices.
Finally, mapfile stores that output in the array result, which is then printed.
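As an alternative to the sort-based pipeline, the same "each value exactly once" result can be sketched with a Fisher-Yates shuffle done entirely in bash. This is a different technique from the answer above, not a restatement of it; the function name shuffle_args is made up for illustration, and RANDOM % n carries a slight modulo bias that is usually acceptable for small arrays.

```shell
#!/bin/bash
# Hypothetical helper: print its arguments in random order, one per line.
shuffle_args() {
    local items=("$@") i j tmp
    # Fisher-Yates: walk from the last element down, swapping each
    # element with a randomly chosen element at or before it.
    for (( i = ${#items[@]} - 1; i > 0; i-- )); do
        j=$(( RANDOM % (i + 1) ))     # random position in 0..i
        tmp=${items[i]}
        items[i]=${items[j]}
        items[j]=$tmp
    done
    printf '%s\n' "${items[@]}"
}

mixarray=("I" "like" "to" "play" "games")
mapfile -t result < <(shuffle_args "${mixarray[@]}")
printf '%s\n' "${result[@]}"
```

Because elements are only swapped, never copied or dropped, no value can appear twice.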

Bash: how to extract longest directory paths from an array?

I put the output of find command into array like this:
pathList=($(find /foo/bar/ -type d))
How can I extract the longest paths from the array if it contains several longest paths of equal length? The array contains:
echo ${pathList[@]}
/foo/bar/raw/
/foo/bar/raw/2020/
/foo/bar/raw/2020/02/
/foo/bar/logs/
/foo/bar/logs/2020/
/foo/bar/logs/2020/02/
After extraction, I would like to assign /foo/bar/raw/2020/02/ and /foo/bar/logs/2020/02/ to another array.
Thank you
Could you please try the following? It prints the longest paths (there may be several with the same maximum length), which you can then assign to another array. Each path must reach awk on its own line, hence printf rather than echo.
printf '%s\n' "${pathList[@]}" |
awk -F'/' '{max=max>NF?max:NF;a[NF]=(a[NF]?a[NF] ORS:"")$0} END{print a[max]}'
I created a test array with the values you provided and tested it as follows:
arr1=($(printf '%s\n' "${pathList[@]}" |
awk -F'/' '{max=max>NF?max:NF;a[NF]=(a[NF]?a[NF] ORS:"")$0} END{print a[max]}'))
The new array's contents are as follows:
printf '%s\n' "${arr1[@]}"
/foo/bar/raw/2020/02/
/foo/bar/logs/2020/02/
Explanation of the awk code:
awk -F'/' ' ##Start the awk program with / as the field separator for all lines.
{
max=max>NF?max:NF ##Keep max as the largest NF (number of fields) seen so far.
a[NF]=(a[NF]?a[NF] ORS:"")$0 ##Append the current line to a[NF], separating entries with a newline.
}
END{ ##The END block runs after all input has been read.
print a[max] ##Print the lines stored under the largest field count.
}'
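The same selection can also be sketched in pure bash, which avoids the word splitting of the unquoted $(...) assignment. Like the awk version, this treats "longest" as "most / separated components" rather than most characters. The function name deepest_paths is an assumption for illustration.

```shell
#!/bin/bash
# Hypothetical helper: print every argument that contains the maximum
# number of "/" characters, keeping the original order.
deepest_paths() {
    local p stripped depth max=0
    local -a deepest=()
    for p in "$@"; do
        stripped=${p//\//}                    # the path with all slashes removed
        depth=$(( ${#p} - ${#stripped} ))     # number of slashes in the path
        if (( depth > max )); then
            max=$depth
            deepest=("$p")                    # new maximum: start a fresh list
        elif (( depth == max )); then
            deepest+=("$p")                   # ties are kept
        fi
    done
    printf '%s\n' "${deepest[@]}"
}

pathList=(/foo/bar/raw/ /foo/bar/raw/2020/ /foo/bar/raw/2020/02/
          /foo/bar/logs/ /foo/bar/logs/2020/ /foo/bar/logs/2020/02/)
mapfile -t arr1 < <(deepest_paths "${pathList[@]}")
printf '%s\n' "${arr1[@]}"
```

Reading the result with mapfile keeps paths that contain spaces intact, as long as they do not contain newlines.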

How to get user input as number and echo the stored array value of that number in bash scripting

I have written a script that lists the running node processes along with the cwd of each process. I store the values in an array using a for loop and echo that array.
How can I let the user enter an index into that array and then show the value stored at that index?
Example Myscript
array=$(netstat -nlp | grep node)
for i in ${array[*]}
do
echo $i
done
The output looks something like this:
1056
2064
3024
I want something more advanced. I want to take input from the user like
Enter the regarding index from above list = 1
And let's suppose the user enters 1
Then the next output should be
Your selected value is 2064
Is it possible in bash?
First, you're not actually using an array; you are storing a plain string in the variable "array". The string contains words separated by whitespace, so when you supply the variable in the for statement, the unquoted value is subject to Word Splitting.
You need to use the array syntax for setting the array:
array=( $(netstat -nlp | grep node) )
However, the unquoted command substitution still exposes you to Filename Expansion. The best way to store the lines of a command into an array is to use the mapfile command with a process substitution:
mapfile -t array < <(netstat -nlp | grep node)
And in the for loop, make sure you quote all the variables and use index @:
for i in "${array[@]}"; do
echo "$i"
done
Notes:
arrays created with mapfile will start at index 0, so be careful of off-by-one errors
I don't know how variables are implemented in bash, but there is this oddity:
if you refer to the array without an index, you'll get the first element:
array=( "hello" "world" )
echo "$array" # ==> hello
If you refer to a plain variable with array syntax and index zero, you'll get the value:
var=1234
echo "${var[0]}" # ==> 1234
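For the part of the question about reading an index from the user, a minimal sketch could look like the following. The function name pick_by_index and the hard-coded sample values are assumptions for illustration; in the real script the array would come from the mapfile command shown above.

```shell
#!/bin/bash
# Hypothetical helper: "$1" is the index typed by the user, the remaining
# arguments are the array elements. Prints the selected element, or
# returns 1 if the index is not a valid position.
pick_by_index() {
    local idx=$1
    shift
    local -a items=("$@")
    if ! [[ $idx =~ ^[0-9]+$ ]]; then          # digits only, no sign or spaces
        printf 'Invalid index: %s\n' "$idx" >&2
        return 1
    fi
    idx=$(( 10#$idx ))                         # force base 10 so "08" is not read as octal
    if (( idx >= ${#items[@]} )); then         # indices run from 0 to length-1
        printf 'Index out of range: %s\n' "$idx" >&2
        return 1
    fi
    printf 'Your selected value is %s\n' "${items[idx]}"
}

array=(1056 2064 3024)
# In an interactive script the index would be read from the user, e.g.:
#   read -r -p 'Enter the index from the list above = ' choice
choice=1
pick_by_index "$choice" "${array[@]}"          # prints: Your selected value is 2064
```

Validating the input before using it as a subscript matters because bash evaluates array subscripts as arithmetic expressions.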

Using 'awk' to store specific line numbers in an array

I have a destination.properties file:
Port:22
10.52.16.156
10.52.16.157
10.52.16.158
10.52.16.159
10.52.16.160
10.52.16.161
10.52.16.162
10.52.16.163
10.52.16.164
10.52.16.165
10.52.16.166
10.52.16.167
10.52.16.168
10.52.16.169
Port:61900-61999
10.52.16.156
10.52.16.157
10.52.16.158
10.52.16.159
10.52.16.160
10.52.16.161
10.52.16.162
10.52.16.163
10.52.16.164
10.52.16.165
10.52.16.166
10.52.16.167
10.52.16.168
10.52.16.169
I want to use an awk command to store all of the line numbers of lines that contain the word 'Port:' in an array.
I have the following command which stores all of the line numbers in the 1st array value ie array[0]:
array=$( (awk '/Port:/ {print NR}' destinations.prop) )
To get them in a shell array, you can do:
array=( $(awk '/Port:/ {print NR}' destinations.prop) )
The parentheses assign the words within to successive array members. As usual, IFS controls the splitting of the command output, and file name globbing also happens if the output contains wildcard characters. That is probably not an issue in this case.
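If the splitting and globbing caveats are a concern, mapfile (bash 4 or later) reads one array element per output line instead. The following is a sketch against a shortened stand-in for the properties file; the temporary file and the variable name port_lines are assumptions for illustration.

```shell
#!/bin/bash
# Shortened stand-in for the properties file from the question.
props=$(mktemp)
printf '%s\n' 'Port:22' '10.52.16.156' '10.52.16.157' \
              'Port:61900-61999' '10.52.16.156' > "$props"

# One array element per line of awk output; no word splitting or globbing.
mapfile -t port_lines < <(awk '/Port:/ {print NR}' "$props")

printf 'found %d matching lines\n' "${#port_lines[@]}"
printf '%s\n' "${port_lines[@]}"               # prints 1 and 4 for this sample
rm -f "$props"
```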

Bash Arrays - Couldn't extract the assigned variable from the array(2D)

I used the array assignment below in order to simulate a two dimensional array:
for((i=0;i<2;i++))
do
for((j=0;j<3;j++))
do
read TWOD$i[$j]
done
done < hi.txt
The file hi.txt contains these lines:
1
2
3
4
5
6
If I use echo ${TWOD0[2]}, I can print the value 3, but if I use a variable for the first index, bash throws a "bad substitution" syntax error:
for((i=0;i<2;i++))
do
printf "%s\n" "${TWOD$i[2]}"
done
Is there any way to extract the elements from the array using a variable for the first index?
You can use indirect expansion:
row="TWOD$i[2]"
printf "%s\n" "${!row}"
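Another way to simulate two dimensions, assuming bash 4 or later, is a single associative array keyed by "row,column". This is a different technique from indirect expansion, sketched here with the values of hi.txt held in a plain array; the names grid and values are made up for illustration.

```shell
#!/bin/bash
declare -A grid                       # associative array, keys look like "0,2"
values=(1 2 3 4 5 6)                  # stands in for the six lines of hi.txt
k=0
for (( i = 0; i < 2; i++ )); do
    for (( j = 0; j < 3; j++ )); do
        grid[$i,$j]=${values[k]}      # store under the key "i,j"
        k=$(( k + 1 ))
    done
done

# A variable can now be used for either index without indirection.
for (( i = 0; i < 2; i++ )); do
    printf '%s\n' "${grid[$i,2]}"     # prints 3, then 6
done
```

The key is an ordinary string, so "$i,$j" is only a naming convention and not a real second dimension.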
