Bash: how to extract longest directory paths from an array? - arrays

I put the output of find command into array like this:
pathList=($(find /foo/bar/ -type d))
How to extract the longest paths found in the array if the array contains several equal-length longest paths?:
echo ${pathList[#]}
/foo/bar/raw/
/foo/bar/raw/2020/
/foo/bar/raw/2020/02/
/foo/bar/logs/
/foo/bar/logs/2020/
/foo/bar/logs/2020/02/
After extraction, I would like to assign /foo/bar/raw/2020/02/ and /foo/bar/logs/2020/02/ to another array.
Thank you

Could you please try following. This should print the longest array(could be multiple in numbers same maximum length ones), you could assign it to later an array to.
echo "${pathList[#]}" |
awk -F'/' '{max=max>NF?max:NF;a[NF]=(a[NF]?a[NF] ORS:"")$0} END{print a[max]}'
I just created a test array with values provided by you and tested it as follows:
arr1=($(printf '%s\n' "${pathList[#]}" |\
awk -F'/' '{max=max>NF?max:NF;a[NF]=(a[NF]?a[NF] ORS:"")$0} END{print a[max]}'))
When I see new array's contents they are as follows:
echo "${arr1[#]}"
/foo/bar/raw/2020/02/
/foo/bar/logs/2020/02/
Explanation of awk code: Adding detailed explanation for awk code.
awk -F'/' ' ##Starting awk program from here and setting field separator as / for all lines.
{
max=max>NF?max:NF ##Creating variable max and its checking condition if max is greater than NF then let it be same else set its value to current NF value.
a[NF]=(a[NF]?a[NF] ORS:"")$0 ##Creating an array a with index of value of NF and keep appending its value with new line to it.
}
END{ ##Starting END section of this program.
print a[max] ##Printing value of array a with index of variable max.
}'

Related

How does bash array slicing work when start index is not provided?

I'm looking at a script, and I'm having trouble determining what is going on.
Here is an example:
# Command to get the last 4 occurrences of a pattern in a file
lsCommand="ls /my/directory | grep -i my_pattern | tail -4"
# Load the results of that command into an array
dirArray=($(echo $(eval $lsCommand) | tr ' ' '\n'))
# What is this doing?
yesterdaysFileArray=($(echo ${x[#]::$((${#x[#]} / 2))} | tr ' ' '\n'))
There is a lot going on here. I understand how arrays work, but I don't know how $x is getting referenced if it was never declared.
I see that the $((${#x[#]} / 2}} is taking the number of elements and dividing it in half, and the tr is used to create the array. But what else is going on?
I think the last line is an array slice pattern in bash of form ${array[#]:1:2}, where array[#] returns the contents of the array, :1:2 takes a slice of length 2, starting at index 1.
So for your case though you are taking the start index empty because you haven't specified any and length as half the count of array.
But there is a lot better way to do this in bash as below. Don't use eval and use the built-in globbing support from the shell itself
cd /my/directory
fileArray=()
for file in *my_pattern*; do
[[ -f "$file" ]] || { printf '%s\n' 'no file found'; return 1; }
fileArray+=( "$file" )
done
and do
printf '%s\n' "${fileArray[#]::${#fileArray[#]}/2}"

Using 'awk' to store specific line numbers in an array

I have a destination.properties file:
Port:22
10.52.16.156
10.52.16.157
10.52.16.158
10.52.16.159
10.52.16.160
10.52.16.161
10.52.16.162
10.52.16.163
10.52.16.164
10.52.16.165
10.52.16.166
10.52.16.167
10.52.16.168
10.52.16.169
Port:61900-61999
10.52.16.156
10.52.16.157
10.52.16.158
10.52.16.159
10.52.16.160
10.52.16.161
10.52.16.162
10.52.16.163
10.52.16.164
10.52.16.165
10.52.16.166
10.52.16.167
10.52.16.168
10.52.16.169
I want to use an awk command to store all of the line numbers of lines that contain the word 'Port:' in an array.
I have the following command which stores all of the line numbers in the 1st array value ie array[0]:
array=$( (awk '/Port:/ {print NR}' destinations.prop) )
To get them in a shell array, you can do:
array=( $(awk '/Port:/ {print NR}' destinations.prop) )
The parenthesis assign the words within to successive array members. As usual, IFS controls the splitting of that command output, and file name globbing also happens if you happen to output wildcard characters. Probably not an issue in this case.

Bash: awk output to array

Im trying to put the contents of a awk command in to a bash array however im having a bit of trouble.
>>test.sh
f_checkuser() {
_l="/etc/login.defs"
_p="/etc/passwd"
## get mini UID limit ##
l=$(grep "^UID_MIN" $_l)
## get max UID limit ##
l1=$(grep "^UID_MAX" $_l)
awk -F':' -v "min=${l##UID_MIN}" -v "max=${l1##UID_MAX}" '{ if ( $3 >= min && $3 <= max && $7 != "/sbin/nologin" ) print $0 }' "$_p"
}
...
Used files:
Sample File: /etc/login.defs
>>/etc/login.defs
### Min/max values for automatic uid selection in useradd
UID_MIN 1000
UID_MAX 60000
Sample File: /etc/passwd
>>/etc/passwd
root:x:0:0:root:/root:/usr/bin/zsh
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
admin:x:1000:1000:Administrator,,,:/home/admin:/bin/bash
daniel:x:1001:1001:Daniel,,,:/home/daniel:/bin/bash
The output looks like:
admin:x:1000:1000:Administrator,,,:/home/admin:/bin/bash
daniel:x:1001:1001:User,,,:/home/user:/bin/bash
respectively (awk ... print $1 }' "$_p")
admin
daniel
Now my problem is to save the awk output in an Array to use it as variable.
>>test.sh
...
f_checkuser
echo "Array items and indexes:"
for index in ${!LOKAL_USERS[*]}
do
printf "%4d: %s\n" $index ${array[$index]}
done
It could/should look like this example.
Array items and indexes:
0: admin
1: daniel
Specially i would become all Users of a System (not root,bin,sys,ssh,...) without blocked users in an array.
Perhaps someone has another idea to solve my Problem?
Are you trying to set the output of one script to an array? There is a bash has a way of doing this. For example,
a=( $(seq 1 10) ); echo ${a[1]}
will populate the array a with elements 1 to 10 and will print 2, the second line generated by seq (array index starts at zero). Simply replace the contents of $(...) with your script.
For those coming to this years later ...
bash 4 introduced readarray (aka mapfile) exactly for this purpose.
See also Bash capturing output of awk into array
One solution that works:
array=()
f_checkuser(){
...
...
tempfile="localuser.tmp"
touch ${tempfile}
awk -F':'...'{... print $1 }' "$_p" > ${HOME}/${tempfile}
getArrayfromFile "${tempfile}"
}
getArrayfromFile() {
i=0
while read line # Read a line
do
array[i]=$line # Put it into the array
i=$(($i + 1))
done < $1
}
f_checkuser
echo "Array items and indexes:"
for index in ${!array[*]}
do
printf "%4d: %s\n" $index ${array[$index]}
done
Output:
Array items and indexes:
0: daniel
1: admin
But I like more to observe without a new temp-file.
So, have someone any another idea without a temp-file?

Finding an element in a bash array and printing the row out

I have this as a csv file:
Column1,Column2,Column3,Column4
wow,this,is,awesomeee
we,are,going,to
what,is,the,name
This is by script:
names=($(awk < example_list.csv -F, '{print $2}'))
lines=($(awk < example_list.csv -F, '{print $0 $1 $2 $3 }'))
for i in "${names[#]}"
do
if [ ${names[i]}=="is" ]
then
echo ${lines[i]}
fi
done
My goal is to find a match in the third column (index 2) and print the whole row out based on that matching index.
In this case:
I save the whole third column to an array (names)
I save all columns and rows to a second array (lines)
I then iterate through array names, looking for a matching word "is"
If it is found, I print out the whole row from array list based on that matching index.
Unfortunately, this does not work. All this prints out is:
Column1,Column2,Column3,Column4Column1Column2Column3
Column1,Column2,Column3,Column4Column1Column2Column3
Column1,Column2,Column3,Column4Column1Column2Column3
Column1,Column2,Column3,Column4Column1Column2Column3
You're making it too complicated, you can just use this awk to print full record when 3rd field is "is":
awk -F, '$3=="is"' file
wow,this,is,awesomeee

How to parse only selected column values using awk

I have a sample flat file which contains the following block
test my array which array is better array huh got it?
INDIA USA SA NZ AUS ARG ARM ARZ GER BRA SPN
I also have an array(ksh_arr2) which was defined like this
ksh_arr2=$(awk '{if(NR==1){for(i=1;i<=NF;i++){if($i~/^arr/){print i}}}}' testUnix.txt)
and contains the following integers
3 5 8
Now I want to parse only those column values which are at the respective numbered positions i.e. third fifth and eighth.
I also want the outputs from the 2nd line on wards.
So I tried the following
awk '{for(i=1;i<=NF;i++){if(NR >=1 && i=${ksh_arr2[i]}) do print$i ; done}}' testUnix.txt
but it is apparently not printing the desired outputs.
What am I missing ? Please help.
How i would approach it
awk -vA="${ksh_arr2[*]}" 'BEGIN{split(A,B," ")}{for(i in B)print $B[i]}' file
Explanation
-vA="${ksh_arr2[*]}" - Set variable A to expanded ksh array
'BEGIN{split(A,B," ") - Splits the expanded array on spaces
(effictively recreating it in awk)
{for(i in B)print $B[i]} - Index in the new array print the field that is the number
contained in that index
Edit
If you want to preserve the order of the fields when printing then this would be better
awk -vA="${ksh_arr2[*]}" 'BEGIN{split(A,B," ")}{while(++i<=length(B))print $B[i]}' file
Since no sample output is shown, I don't know if this output is what you want. It is the output one gets from the code provided with the minimal changes required to get it to run:
$ awk -v k='3 5 8' 'BEGIN{split(k,a," ");} {for(i=1;i<=length(a);i++){print $a[i]}}' testUnix.txt
array
array
array
SA
AUS
ARZ
The above code prints out the selected columns in the same order supplied by the variable k.
Notes
The awk code never defined ksh_arr2. I presume that the value of this array was to be passed in from the shell. It is done here using the -v option to set the variable k to the value of ksh_arr2.
It is not possible to pass into awk an array directly. It is possible to pass in a string, as above, and then convert it to an array using the split function. Above the string k is converted to the awk array a.
awk syntax is different from shell syntax. For instance, awk does not use do or done.
Details
-v k='3 5 8'
This defines an awk variable k. To do this programmatically, replace 3 5 8 with a string or array from the shell.
BEGIN{split(k,a," ");}
This converts the space-separated values in variable k into an array named a.
for(i=1;i<=length(a);i++){print $a[i]}
This prints out each column in array a in order.
Alternate Output
If you want to keep the output from each line on a single line:
$ awk -v k='3 5 8' 'BEGIN{split(k,a," ");} {for(i=1;i<length(a);i++) printf "%s ",$a[i]; print $a[length(a)]}' testUnix.txt
array array array
SA AUS ARZ
awk 'NR>=1 { print $3 " " $5 " " $8 }' testUnix.txt

Resources