How to prevent awk from sorting the array?

I'm trying to extract these lines from the text file data:
first
second
third
fourth
fifth
sixth
based on their line numbers, which are saved in the text file nums:
2
6
3
I ended up with the following awk solution, but it prints the lines in the order they appear in data (that is, sorted by line number) rather than in the order given in the nums file.
$ awk 'NR==FNR{lines[$0];next } FNR in lines' nums data
second
third
sixth
But the output I want is:
second
sixth
third
So, is there any option for awk to prevent/disable it from sorting the array?

Handle the files the other way around:
awk 'NR == FNR { line[NR] = $0; next } { print line[$1] }' data nums
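For what it's worth, awk is not sorting the array in the original command: FNR in lines is tested once per line of data, so matches come out in data order whatever the order in nums. A minimal, self-contained sketch of the swapped approach, recreating the question's two files in a temporary directory (the temp paths and the result variable are only there for illustration):

```shell
# Recreate the sample files from the question in a scratch directory.
tmp=$(mktemp -d)
printf '%s\n' first second third fourth fifth sixth > "$tmp/data"
printf '%s\n' 2 6 3 > "$tmp/nums"

# First file (data): store every line under its line number.
# Second file (nums): print the stored line for each requested number,
# so the output follows the order of nums.
result=$(awk 'NR == FNR { line[NR] = $0; next } { print line[$1] }' "$tmp/data" "$tmp/nums")
printf '%s\n' "$result"

rm -rf "$tmp"
```

Since the whole of data is held in memory, this is best suited to files that fit comfortably in RAM.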

Related

Bash: how to extract longest directory paths from an array?

I put the output of find command into array like this:
pathList=($(find /foo/bar/ -type d))
How can I extract the longest paths found in the array when it contains several longest paths of equal length?
echo ${pathList[@]}
/foo/bar/raw/
/foo/bar/raw/2020/
/foo/bar/raw/2020/02/
/foo/bar/logs/
/foo/bar/logs/2020/
/foo/bar/logs/2020/02/
After extraction, I would like to assign /foo/bar/raw/2020/02/ and /foo/bar/logs/2020/02/ to another array.
Thank you
Could you please try the following? It prints the longest paths, measured by the number of /-separated components (there may be several with the same maximum), one per line, so the output can be assigned to another array afterwards.
printf '%s\n' "${pathList[@]}" |
awk -F'/' '{max=max>NF?max:NF;a[NF]=(a[NF]?a[NF] ORS:"")$0} END{print a[max]}'
I created a test array with the values you provided and tested it as follows:
arr1=($(printf '%s\n' "${pathList[@]}" |\
awk -F'/' '{max=max>NF?max:NF;a[NF]=(a[NF]?a[NF] ORS:"")$0} END{print a[max]}'))
The new array's contents are as follows:
printf '%s\n' "${arr1[@]}"
/foo/bar/raw/2020/02/
/foo/bar/logs/2020/02/
Explanation of the awk code:
awk -F'/' ' ##Start the awk program and set the field separator to / for all lines.
{
max=max>NF?max:NF ##Keep max if it is already greater than NF, otherwise set it to the current NF.
a[NF]=(a[NF]?a[NF] ORS:"")$0 ##Append the current line to a[NF], separated from earlier entries by a newline.
}
END{ ##Start the END section of the program.
print a[max] ##Print the element of array a whose index is max.
}'

Finding an element in a bash array and printing the row out

I have this as a csv file:
Column1,Column2,Column3,Column4
wow,this,is,awesomeee
we,are,going,to
what,is,the,name
This is my script:
names=($(awk < example_list.csv -F, '{print $2}'))
lines=($(awk < example_list.csv -F, '{print $0 $1 $2 $3 }'))
for i in "${names[@]}"
do
if [ ${names[i]}=="is" ]
then
echo ${lines[i]}
fi
done
My goal is to find a match in the third column (index 2) and print the whole row out based on that matching index.
In this case:
I save the whole third column to an array (names)
I save all columns and rows to a second array (lines)
I then iterate through array names, looking for a matching word "is"
If it is found, I print out the whole row from the array lines based on that matching index.
Unfortunately, this does not work. All this prints out is:
Column1,Column2,Column3,Column4Column1Column2Column3
Column1,Column2,Column3,Column4Column1Column2Column3
Column1,Column2,Column3,Column4Column1Column2Column3
Column1,Column2,Column3,Column4Column1Column2Column3
You're making it too complicated. You can just use this awk command to print the full record when the 3rd field is "is":
awk -F, '$3=="is"' file
wow,this,is,awesomeee
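A pattern with no action prints the whole record, which is why the one-liner needs no loop and no arrays. A small sketch on the question's sample data, assuming the column number and the word are passed in with -v (the names col, word and matches are made up for this example):

```shell
# Recreate the sample CSV from the question in a temporary file.
csv=$(mktemp)
printf '%s\n' \
  'Column1,Column2,Column3,Column4' \
  'wow,this,is,awesomeee' \
  'we,are,going,to' \
  'what,is,the,name' > "$csv"

# $col is the field whose number is stored in col; NR > 1 skips the header row.
matches=$(awk -F, -v col=3 -v word=is 'NR > 1 && $col == word' "$csv")
printf '%s\n' "$matches"

rm -f "$csv"
```

Only the second line matches, because the last row has "is" in its 2nd field, not its 3rd.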

How to parse only selected column values using awk

I have a sample flat file which contains the following block
test my array which array is better array huh got it?
INDIA USA SA NZ AUS ARG ARM ARZ GER BRA SPN
I also have an array(ksh_arr2) which was defined like this
ksh_arr2=$(awk '{if(NR==1){for(i=1;i<=NF;i++){if($i~/^arr/){print i}}}}' testUnix.txt)
and contains the following integers
3 5 8
Now I want to parse only the column values at those numbered positions, i.e. the third, fifth and eighth.
I also want the output to start from the 2nd line onwards.
So I tried the following
awk '{for(i=1;i<=NF;i++){if(NR >=1 && i=${ksh_arr2[i]}) do print$i ; done}}' testUnix.txt
but it is apparently not printing the desired outputs.
What am I missing ? Please help.
How I would approach it:
awk -vA="${ksh_arr2[*]}" 'BEGIN{split(A,B," ")}{for(i in B)print $B[i]}' file
Explanation
-vA="${ksh_arr2[*]}" - Sets the awk variable A to the expanded ksh array
BEGIN{split(A,B," ")} - Splits the expanded array on spaces
(effectively recreating it in awk)
{for(i in B)print $B[i]} - For each index in the new array, prints the field whose number
is stored at that index
Edit
If you want to preserve the order of the fields when printing, this would be better (for(i in B) does not guarantee any order, and the counter has to restart at 1 for every line):
awk -vA="${ksh_arr2[*]}" 'BEGIN{n=split(A,B," ")}{for(i=1;i<=n;i++)print $B[i]}' file
Since no sample output is shown, I don't know if this output is what you want. It is the output one gets from the code provided with the minimal changes required to get it to run:
$ awk -v k='3 5 8' 'BEGIN{split(k,a," ");} {for(i=1;i<=length(a);i++){print $a[i]}}' testUnix.txt
array
array
array
SA
AUS
ARZ
The above code prints out the selected columns in the same order supplied by the variable k.
Notes
The awk code never defined ksh_arr2. I presume that the value of this array was to be passed in from the shell. It is done here using the -v option to set the variable k to the value of ksh_arr2.
It is not possible to pass an array into awk directly. It is possible to pass in a string, as above, and then convert it to an array using the split function. Above, the string k is converted to the awk array a.
awk syntax is different from shell syntax. For instance, awk does not use do or done.
Details
-v k='3 5 8'
This defines an awk variable k. To do this programmatically, replace 3 5 8 with a string or array from the shell.
BEGIN{split(k,a," ");}
This converts the space-separated values in variable k into an array named a.
for(i=1;i<=length(a);i++){print $a[i]}
This prints out each column in array a in order.
Alternate Output
If you want to keep the output from each line on a single line:
$ awk -v k='3 5 8' 'BEGIN{split(k,a," ");} {for(i=1;i<length(a);i++) printf "%s ",$a[i]; print $a[length(a)]}' testUnix.txt
array array array
SA AUS ARZ
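The commands above also print the selected fields of the first line, although the question asks for output from the 2nd line onwards. A possible single-pass sketch, assuming the column numbers can be worked out from the header inside awk instead of being passed in from ksh (cols, n and picked are names chosen for this example):

```shell
# Recreate the sample file from the question in a temporary file.
f=$(mktemp)
printf '%s\n' \
  'test my array which array is better array huh got it?' \
  'INDIA USA SA NZ AUS ARG ARM ARZ GER BRA SPN' > "$f"

picked=$(awk '
# Line 1: remember, in order, the numbers of the fields that start with "arr".
NR == 1 { for (i = 1; i <= NF; i++) if ($i ~ /^arr/) cols[++n] = i; next }
# Later lines: print the remembered fields on one output line.
{ for (j = 1; j <= n; j++) printf "%s%s", $cols[j], (j < n ? OFS : ORS) }
' "$f")
printf '%s\n' "$picked"

rm -f "$f"
```

The next on the first rule is what keeps the header line out of the output.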
awk 'NR>=1 { print $3 " " $5 " " $8 }' testUnix.txt

How to get array dimension in 1 direction in awk multidimension array

Is there any way to get the length of only one dimension of an awk array, like in PHP?
Look at this simple example:
awk 'BEGIN{
a[1,1]=1;
a[1,2]=2;
a[2,1]=3;
a[2,3]=2;
print length(a)
}'
Here the length of the array is 4, because every element counts as a separate entry. What I want is the number of rows in the array. In my real code I fill the array from any number of fields, like this:
for(i=1;i<=NF;i++)A[FNR,i]=$i
The problem is that the number of fields is not fixed in my file; it varies from row to row, so I cannot even calculate it as length(array)/NF.
Is there any solution?
Use GNU awk, since it has true multi-dimensional arrays:
awk 'BEGIN{
a[1][1]=1;
a[1][2]=2;
a[1][3]=3;
a[2][1]=4;
a[2][2]=5;
print length(a)
print length(a[1])
print length(a[2])
}'
2
3
2
This can be achieved by counting the unique values of one index component. Try something like this:
awk '
# Count the distinct values found in position fnumber of the keys of Arr.
function _get_rowlength(Arr,fnumber, i,t,c,sep){
    for(i in Arr){
        split(i,sep,SUBSEP)
        if(!(sep[fnumber] in t))
        {
            c++
            t[sep[fnumber]]
        }
    }
    return c+0;
}
BEGIN{
    a[1,1]=1;
    a[1,2]=2;
    a[2,1]=3;
    a[2,3]=2;
    print _get_rowlength(a,1)
}'
Resulting
$ ./tester
2
If you want to try this on a Solaris/SunOS system, change awk to /usr/xpg4/bin/awk , /usr/xpg6/bin/awk , or nawk
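If GNU awk is not available, another option is to record the width of each row while the array is being filled, so that neither the row count nor the per-row length has to be recovered from the combined keys afterwards. A rough sketch, assuming the input arrives on stdin and using a helper array called width that is not part of the original code:

```shell
report=$(printf '%s\n' 'a b c' 'd e' 'f g h i' | awk '
{
    for (i = 1; i <= NF; i++) A[FNR, i] = $i   # same fill loop as in the question
    width[FNR] = NF                            # remember how many fields this row has
}
END {
    print "rows:", NR
    for (r = 1; r <= NR; r++) print "row", r, "has", width[r], "fields"
}')
printf '%s\n' "$report"
```

With a single input file the row count is simply NR; the width array covers the case where the number of fields varies from row to row.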

Picking out elements not in a 1-D array?

I have a 1-D array
x1 = [1, 2, 3, …, 10]
which is stored in the file x1.dat as one record (all on one line), separated by commas. x1.dat reads
1,2,3,4,5,..., 10
And there are two arrays
array1 = [1,3], and array2= [4,7]
(elements in array1 and array2 are some elements of the array x1).
I want to select all the elements that are in neither array1 nor array2.
The desired output will read
2,5,6,8,9,10
I tried with awk:
$awk 'BEGIN{array1 = (1,3); array2 = (4,7)} {for (i=1; i<=NF;i++) if ((!($i in a1)) && (!($i in a2))) {print $i }}' x1.dat
This does not work. Could you please help me to correct it or give a better way to do this selection?
You didn't give the text format of your data file. I assume it is one element per line.
Your code has a couple of problems:
Variable assignment: you cannot assign an awk array like that.
The in operator checks the array's keys (an awk array is really a hash table), not its values.
It would be easier to put array1 and array2 in a file or an input string rather than in the code, but I am keeping them there to show how to solve the problem exactly as you described.
A more readable version:
awk -v arr1="<yourArray1Str>" -v arr2="<yourArray2Str>" \
'BEGIN{
split(arr1,a,",");
split(arr2,b,",");
for(x in a)k[a[x]]=1;
for(x in b)k[b[x]]=1}
!k[$0]' file
with your example:
kent$ cat f
1
2
3
4
5
kent$ awk -v arr1="2,4,3" -v arr2="1,3,4" 'BEGIN{split(arr1,a,",");split(arr2,b,",");for(x in a)k[a[x]]=1;for(x in b)k[b[x]]=1}!k[$0]' f
5
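The answer above assumes one element per line, whereas x1.dat in the question is a single comma-separated record. A sketch for that layout, under the assumption that there are no spaces around the commas (skip, v and kept are illustrative names, and the input line is typed out here in place of x1.dat):

```shell
kept=$(printf '%s\n' '1,2,3,4,5,6,7,8,9,10' |
awk -F, -v arr1='1,3' -v arr2='4,7' '
BEGIN {
    # Merge both lists and turn their values into keys of skip.
    n = split(arr1 "," arr2, v, ",")
    for (i = 1; i <= n; i++) skip[v[i]]    # referencing an element creates the key
}
{
    sep = ""
    for (i = 1; i <= NF; i++)
        if (!($i in skip)) {               # keep fields that are not in either list
            printf "%s%s", sep, $i
            sep = ","
        }
    print ""
}')
printf '%s\n' "$kept"
```

Storing the values as keys is what makes the in test work, since in looks at keys only.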
