How to check if a variable is an array?

I was playing with PROCINFO and its sorted_in index to control the array traversal order.
Then I wondered what the contents of PROCINFO are, so I decided to loop over it and print its values:
$ awk 'BEGIN {for (i in PROCINFO) print i, PROCINFO[i]}'
ppid 7571
pgrpid 14581
api_major 1
api_minor 1
group1 545
gid 545
group2 1000
egid 545
group3 10004
awk: cmd. line:1: fatal: attempt to use array `PROCINFO["identifiers"]' in a scalar context
As you can see, it breaks because at least one item is itself an array.
The fast workaround is to skip this one:
awk 'BEGIN {for (i in PROCINFO) {if (i!="identifiers") {print i, PROCINFO[i]}}}'
However, that looks a bit hacky, and I would like to have something like
awk 'BEGIN {for (i in PROCINFO) {if (!(a[i] is array)) {print i, PROCINFO[i]}}}'
^^^^^^^^^^^^^^^^
Since there is not a thing like a type() function to determine if a variable is an array or a scalar, I wonder: is there any way to check if an element is an array?
I was thinking of something like going through it with a for loop and catching the possible error, but I don't know how.
$ awk 'BEGIN{a[1]=1; for (i in a) print i}'
1
$ awk 'BEGIN{a=1; for (i in a) print i}'
awk: cmd. line:1: fatal: attempt to use scalar `a' as an array
$ awk 'BEGIN{a[1]=1; print a}'
awk: cmd. line:1: fatal: attempt to use array `a' in a scalar context

In GNU Awk, there's an answer, but the recommended approach depends on what version you are running.
From GNU Awk 4.2, released in October 2017, there is a new function typeof() to check this, as indicated in the release notes from the beta release:
The new typeof() function can be used to indicate if a variable or array element is an array, regexp, string or number. The isarray() function is deprecated in favor of typeof().
So now you can say:
$ awk 'BEGIN { a[1] = "a"; print typeof(a) }'
array
And perform the check as follows:
$ awk 'BEGIN { a = "a"; if (typeof(a) == "array") print "yes" }'
$ awk 'BEGIN { a[1] = "a"; if (typeof(a) == "array") print "yes" }'
yes
In older versions, you can use isarray():
$ awk 'BEGIN { a = "a"; if (isarray(a)) print "yes" }'
$ awk 'BEGIN { a[1] = "a"; if (isarray(a)) print "yes" }'
yes
From the man page:
isarray(x)
Return true if x is an array, false otherwise.
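Applied to the original problem, the check can guard each element before it is printed. A minimal sketch, assuming gawk 4.0 or later (for isarray() and arrays of arrays); the data array and its keys are made up for illustration so that the output is predictable:

```shell
# Hypothetical array mixing scalar elements with one nested array.
out=$(gawk 'BEGIN {
    data["name"] = "demo"
    data["size"] = 3
    data["tags"]["x"] = 1            # makes data["tags"] a subarray
    for (k in data) {
        if (isarray(data[k]))
            print k, "(array)"       # never use the subarray as a scalar
        else
            print k, data[k]
    }
}' | sort)
printf '%s\n' "$out"
```

Replacing data with PROCINFO gives the loop from the question. With gawk 4.2 or later, the condition could presumably be written as typeof(data[k]) == "array" instead.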

Related

Iterating over lines (w/ numbers) read from a file to an array in bash

I'm trying to write a small script that takes the 4th column of a file, stores it in an array, and then does a small comparison. If an element in the array is greater than 0 and less than 500, I have to increment the counter. However, when I run the script, the counter always shows 0. Here's my script:
#!/bin/bash
mapfile -t my_array < <(cat file1.txt | awk '{ print $4 }' > test.txt)
COUNTER=0
for i in ${my_array[@]}; do
if [["${my_array[$i]}" -gt 0 -a "${my_array[$i]}" -lt 500 ]]
then
COUNTER=$((COUNTER + 1))
fi
printf "%s\t%s\n" "%i" "${my_array[$i]}"//just to test if the mapfile command is working
done
echo $COUNTER
output:
./script1.bash
0
A corrected version of the script, followed by an explanation of each change:
#!/bin/bash
mapfile -t my_array < <(awk '{ print $4 }' file1.txt | tee test.txt)
COUNTER=0
for idx in "${!my_array[@]}"; do
value=${my_array[$idx]}
if (( value > 0 )) && (( value < 500 )); then
COUNTER=$((COUNTER + 1))
fi
printf "%s\t%s\n" "$idx" "$value"
done
echo "$COUNTER"
The use of cat here is needless: It added nothing but inefficiency (requiring an extra process to be started, and forcing awk to read from a pipe rather than direct from a file).
mapfile had nothing to read because the output of awk was redirected to test.txt. If you want it to go to both a file and stdout, then you need to use tee.
-a is not valid in [[ ]]; use && instead there. However, since you're doing only arithmetic, (( )) is more appropriate. Incidentally, -a is officially marked obsolescent even for [ ] and test; see the current POSIX standard.
${my_array[@]} iterates over values. If you want to iterate over indexes, you need ${!my_array[@]} instead.
Whitespace is mandatory in separating command names. [["$foo" is a different command from [[, unless $foo is empty or starts with a character in $IFS.
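The difference between iterating over values and over indexes can be seen in a small sketch. The array contents here are made up for illustration, and bash is assumed:

```shell
#!/usr/bin/env bash
# Sample data made up for illustration.
my_array=(120 700 45)

values=""
for v in "${my_array[@]}"; do      # expands to the values: 120 700 45
    values+="$v "
done

indexes=""
for i in "${!my_array[@]}"; do     # expands to the indexes: 0 1 2
    indexes+="$i "
done

# Count the values between 0 and 500 using arithmetic evaluation.
count=0
for v in "${my_array[@]}"; do
    if (( v > 0 && v < 500 )); then
        count=$((count + 1))
    fi
done

echo "values:  $values"
echo "indexes: $indexes"
echo "count:   $count"
```

When only the values are needed, as in the counting loop, iterating over "${my_array[@]}" directly avoids the extra lookup by index.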
If you redirect the output to a file: > test.txt then there is no output in "standard output" because it is consumed by the file. So, first, you need to remove that redirection. You may use:
mapfile -t my_array < <(cat file1.txt | awk '{ print $4 }' )
But since awk could perfectly well read a file, this is better:
mapfile -t my_array < <(awk '{ print $4 }' file1.txt)
And since you are using awk, it could do the comparison to 0 and 500 and output the whole count.
counter=$(awk '{if($4>0 && $4<500){c++}}END{print c}' file1.txt)
echo "$counter"
Simpler, faster.
That will also avoid some simple mistakes in your script, like a missing space in the [[ … ]] construct:
if [[ "${my … # NOT "if [["${my …"
And some missing quotes:
for i in "${my_array[@]}" # NOT for i in ${my_array[@]}
In general, it is a good idea to check your script with ShellCheck.net to remove some simple mistakes.
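One caveat with the awk one-liner above: if no line matches, c is never set and print c emits an empty line. A hedged variant, using made-up sample data in place of file1.txt, forces a numeric result with c + 0:

```shell
# Sample input standing in for file1.txt (made up for illustration).
sample=$(printf 'a b c 10\na b c 600\na b c 499\n')

# Count lines whose 4th field is strictly between 0 and 500.
counter=$(printf '%s\n' "$sample" |
    awk '$4 > 0 && $4 < 500 { c++ } END { print c + 0 }')

# No matching line: c + 0 yields 0 instead of an empty string.
none=$(printf 'a b c 900\n' |
    awk '$4 > 0 && $4 < 500 { c++ } END { print c + 0 }')

echo "$counter"
echo "$none"
```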

Split a string directly into array

Suppose I want to pass a string to awk so that once I split it (on a pattern) the substrings become the indexes (not the values) of an associative array.
Like so:
$ awk -v s="A:B:F:G" 'BEGIN{ # easy, but can these steps be combined?
split(s,temp,":") # temp[1]="A",temp[2]="B"...
for (e in temp) arr[temp[e]] #arr["A"], arr["B"]...
for (e in arr) print e
}'
A
B
F
G
Is there an awkism or gawkism that would allow the string s to be split directly into its components, with those components becoming the index entries in arr?
The reason (bigger picture) is that I want something like this (pseudo awk):
awk -v s="1,4,55" 'BEGIN{[arr to arr["1"],arr["4"],arr["55"]} $3 in arr {action}'
No, there is no better way to map separated substrings to array indices than:
split(str,tmp); for (i in tmp) arr[tmp[i]]
FWIW if you don't like that approach for doing what your final pseudo-code does:
awk -v s="1,4,55" 'BEGIN{split(s,tmp,/,/); for (i in tmp) arr[tmp[i]]} $3 in arr{action}'
then another way to get the same behavior is
awk -v s=",1,4,55," 'index(s,","$3","){action}'
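The two forms above can be compared on sample input. This is only a sketch: the input lines are made up, and a plain pattern with the default print action stands in for {action}:

```shell
# Made-up input: the third field is the one being filtered on.
input=$(printf 'x y 1\nx y 2\nx y 55\nx y 5\n')

# split() into a temporary array, then reuse the values as indexes.
via_split=$(printf '%s\n' "$input" | awk -v s="1,4,55" '
    BEGIN { n = split(s, tmp, ","); for (i = 1; i <= n; i++) arr[tmp[i]] }
    $3 in arr')

# index() on a delimiter-wrapped list; the surrounding commas keep
# "5" from matching inside "55".
via_index=$(printf '%s\n' "$input" |
    awk -v s=",1,4,55," 'index(s, "," $3 ",")')

printf '%s\n' "$via_split"
printf '%s\n' "$via_index"
```

Both commands should select the same lines. The index() form avoids the temporary array but relies on the delimiter never appearing inside a value.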
Probably useless and unnecessarily complex but I'll open the game with while, match and substr:
$ awk -v s="A:B:F:G" '
BEGIN {
while(match(s,/[^:]+/)) {
a[substr(s,RSTART,RLENGTH)]
s=substr(s,RSTART+RLENGTH)
}
for(i in a)
print i
}'
A
B
F
G
I'm eager to see (if there are) some useful solutions. I tried playing around with asorts and such.
Another way, a kind of awkism:
cat file
1 hi
2 hello
3 bonjour
4 hola
5 konichiwa
Run it,
awk 'NR==FNR{d[$1]; next}$1 in d' RS="," <(echo "1,2,4") RS="\n" file
you get,
1 hi
2 hello
4 hola

Can I pass an array to awk using -v?

I would like to be able to pass an array variable to awk. I don't mean a shell array but a native awk one. I know I can pass scalar variables like this:
awk -vfoo="1" 'NR==foo' file
Can I use the same mechanism to define an awk array? Something like:
$ awk -v"foo[0]=1" 'NR==foo' file
awk: fatal: `foo[0]' is not a legal variable name
I've tried a few variations of the above but none of them work on GNU awk 4.1.1 on my Debian. So, is there any version of awk (gawk,mawk or anything else) that can accept an array from the -v switch?
I know I can work around this and can easily think of ways to do so, I am just wondering if any awk implementation supports this kind of functionality natively.
You can use the split() function inside mawk or gawk to split the input of the "-v" value (here is the gawk man page):
split(s, a [, r [, seps] ])
Split the string s into the array a and the separators array seps on
the regular expression r, and return the number of fields.
Here is an example in which I pass the value "ARRAYVAR", a comma-separated list of values that serves as my array, with "-v" to the awk program, then split it into the internal array "arrayval" using the split() function, and then print the 3rd value of the array:
echo 0 | gawk -v ARRAYVAR="a,b,c,d,e,f" '{ split(ARRAYVAR,arrayval,","); print(arrayval[3]) }'
c
Seems to work :)
It looks like it is impossible by definition.
From man awk we have that:
-v var=val
--assign var=val
Assign the value val to the variable var, before execution of the
program begins. Such variable values are available to the BEGIN rule
of an AWK program.
Then we read in Using Variables in a Program that:
The name of a variable must be a sequence of letters, digits, or
underscores, and it may not begin with a digit.
Variables in awk can be assigned either numeric or string values.
So the way the -v implementation is defined makes it impossible to provide an array as a variable, since any kind of usage of the characters = or [ is not allowed as part of the -v variable passing. And both are required, since arrays in awk are only associative.
If you don't insist on using -v, you could use -i (include, a gawk extension) instead to read an awk file that contains the variable settings.
Like this:
if F=$(mktemp inputXXXXXX); then
cat >$F << 'END'
BEGIN {
foo[0]=1
}
END
cat $F
awk -i $F 'BEGIN { print foo[0] }' </dev/null
rm $F
fi
Sample trace (using gawk-4.2.1):
bash -x /tmp/test.sh
++ mktemp inputXXXXXX
+ F=inputrpMsan
+ cat
+ cat inputrpMsan
BEGIN {
foo[0]=1
}
+ awk -i inputrpMsan 'BEGIN { print foo[0] }'
1
+ rm inputrpMsan
Unfortunately, this is not possible. However, you can convert a bash array to an awk array using a few clever methods.
I wanted to do this recently by passing a bash array to awk to use it for filtering, so here is what I did:
$ arr=( hello world this is bash array )
$ echo -e 'this\nmight\nnot\nshow\nup' | awk 'BEGIN {
for (i = 1; i < ARGC; i++) {
my_filter[ARGV[i]]=1
ARGV[i]="" # unset ARGV[i] otherwise awk might try to read it as a file
}
} !my_filter[$0]' "${arr[@]}"
Output:
might
not
show
up
For associative arrays, you could pass it as a string of key-value pairs, and then reformat it in the BEGIN section.
$ echo | awk -v m="a,b;c,d" '
BEGIN {
split(m,M,";")
for (i in M) {
split(M[i],MM,",")
MA[MM[1]]=MM[2]
}
}
{
for (a in MA) {
printf("MA[%s]=%s\n",a, MA[a])
}
}'
Output:
MA[a]=b
MA[c]=d

Bash: awk output to array

I'm trying to put the output of an awk command into a bash array, but I'm having a bit of trouble.
>>test.sh
f_checkuser() {
_l="/etc/login.defs"
_p="/etc/passwd"
## get mini UID limit ##
l=$(grep "^UID_MIN" $_l)
## get max UID limit ##
l1=$(grep "^UID_MAX" $_l)
awk -F':' -v "min=${l##UID_MIN}" -v "max=${l1##UID_MAX}" '{ if ( $3 >= min && $3 <= max && $7 != "/sbin/nologin" ) print $0 }' "$_p"
}
...
Used files:
Sample File: /etc/login.defs
>>/etc/login.defs
### Min/max values for automatic uid selection in useradd
UID_MIN 1000
UID_MAX 60000
Sample File: /etc/passwd
>>/etc/passwd
root:x:0:0:root:/root:/usr/bin/zsh
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
admin:x:1000:1000:Administrator,,,:/home/admin:/bin/bash
daniel:x:1001:1001:Daniel,,,:/home/daniel:/bin/bash
The output looks like:
admin:x:1000:1000:Administrator,,,:/home/admin:/bin/bash
daniel:x:1001:1001:Daniel,,,:/home/daniel:/bin/bash
or, when printing only the first field (awk ... print $1 }' "$_p"):
admin
daniel
Now my problem is to save the awk output in an Array to use it as variable.
>>test.sh
...
f_checkuser
echo "Array items and indexes:"
for index in ${!LOKAL_USERS[*]}
do
printf "%4d: %s\n" $index ${array[$index]}
done
It could/should look like this example.
Array items and indexes:
0: admin
1: daniel
Specifically, I would like to get all regular users of a system (not root, bin, sys, ssh, ...), excluding blocked users, in an array.
Perhaps someone has another idea to solve my Problem?
Are you trying to store the output of a script in an array? Bash has a way of doing this. For example,
a=( $(seq 1 10) ); echo ${a[1]}
will populate the array a with elements 1 to 10 and will print 2, the second line generated by seq (array index starts at zero). Simply replace the contents of $(...) with your script.
For those coming to this years later ...
bash 4 introduced readarray (aka mapfile) exactly for this purpose.
See also Bash capturing output of awk into array
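A minimal sketch of the mapfile approach, assuming bash 4 or later. The passwd-style sample data and the hard-coded min and max values are made up so that the example does not depend on the real /etc/passwd or /etc/login.defs:

```shell
#!/usr/bin/env bash
# Made-up passwd-style data for illustration.
passwd_sample='root:x:0:0:root:/root:/usr/bin/zsh
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
admin:x:1000:1000:Administrator,,,:/home/admin:/bin/bash
daniel:x:1001:1001:Daniel,,,:/home/daniel:/bin/bash'

# Read each line of awk's output into one array element; no temp file.
mapfile -t users < <(
    printf '%s\n' "$passwd_sample" |
    awk -F: -v min=1000 -v max=60000 \
        '$3 >= min && $3 <= max && $7 != "/sbin/nologin" { print $1 }'
)

echo "Array items and indexes:"
for index in "${!users[@]}"; do
    printf '%4d: %s\n' "$index" "${users[$index]}"
done
```

Because mapfile reads from a process substitution rather than a pipeline, the array is filled in the current shell and is still available after the command finishes.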
One solution that works:
array=()
f_checkuser(){
...
...
tempfile="localuser.tmp"
touch "${tempfile}"
awk -F':'...'{... print $1 }' "$_p" > "${tempfile}"
getArrayfromFile "${tempfile}"
}
getArrayfromFile() {
i=0
while IFS= read -r line # Read a line
do
array[i]=$line # Put it into the array
i=$((i + 1))
done < "$1"
}
f_checkuser
echo "Array items and indexes:"
for index in ${!array[*]}
do
printf "%4d: %s\n" $index ${array[$index]}
done
Output:
Array items and indexes:
0: daniel
1: admin
But I would prefer a solution that does not need a new temp file.
So, does anyone have another idea that works without a temp file?

Search array for string and return position in unix

I have a 2 dimensional array made up of letters in the first dimension and numbers in the second dimension. eg
a,1
b,3
c,9
d,8
What I would like to do is search the array for a character and return its corresponding number, e.g. if $var='c' then the return value would be 9.
Being unfamiliar with Unix arrays, I was wondering if anyone knew how to do this simply?
Thanks :)
Here is what I came up with
arr1=(a b c d)
arr2=(1 3 9 8)
# Assumes this runs inside a function, since it uses return.
for ((index=0; index<${#arr1[@]}; index++)); do
if [ "${arr1[$index]}" = "$myCharacter" ]; then
echo "${arr2[$index]}"
return
fi
done
echo 'Character not found'
Not sure if there was a shorter way to do this but works okay....
Assuming you have a file called array.txt with input like you show in the question (the comparison uses == so that the key must match the first field exactly rather than as a regular expression),
$ var=c
$ awk -v key="$var" -F, '$1 == key {print $2; found=1} END { if (! found) { print "Key "key" not found";}}' array.txt
9
$ var=z
$ awk -v key="$var" -F, '$1 == key {print $2; found=1} END { if (! found) { print "Key "key" not found";}}' array.txt
Key z not found
You can use bash to prepare an associative array and lookup the value using the character:
declare -A ARR
ARR=( [a]=1 [b]=3 [c]=9 [d]=8 )
echo ${ARR[c]}
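The associative array can also report a missing key. A hedged sketch, assuming bash 4 or later; lookup is a hypothetical helper name, not something from the answers above:

```shell
#!/usr/bin/env bash
declare -A ARR=( [a]=1 [b]=3 [c]=9 [d]=8 )

# lookup (hypothetical helper): print the value stored for a key,
# or return non-zero if the key is absent.
lookup() {
    local key=$1
    if [[ -n ${ARR[$key]+set} ]]; then   # expands to "set" only if the key exists
        printf '%s\n' "${ARR[$key]}"
    else
        return 1
    fi
}

var=c
lookup "$var" || echo "Key $var not found"

found=$(lookup c)
missing=$(lookup z || echo "not found")
```

The ${ARR[$key]+set} test distinguishes an absent key from a key whose value is an empty string.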
