Read directory name and size into associative array - arrays

I want to get the name and size of all directories [at the top level] of a specified directory into an associative array, such that the name is used as the key and the size as the value.
I know that I can use mapfile to read the output of a command (this extracts the directory size) into an indexed array:
mapfile -t inter_arry < <( du -d0 "$completePath"* | sed -E 's/^([0-9]*).*$/\1/' );
(I would then loop through this array and use it to populate the associative array.)
I know that I could create a matching array and populate it with the directory name (though there's no way of knowing if there's been a change in the contents between commands), but how can I extract both the size and the name by modifying my code snippet?
Is there any way to skip the intermediate indexed array?

If there are many items, it's faster, though not prettier, to avoid a loop. Here I use du | awk to create the array initialization string:
declare -A ARR=$(
echo '( '$(
du -d0 "$completePath"* |
awk -F$'\t' '{printf "["$2"]="$1" "}'
)')'
)
If there are few items (e.g., thousands or fewer), use a loop as @Inian suggests:
declare -A ARR
while IFS=$'\t' read -r size name; do
ARR["$name"]=$size
done < <(du -d0 "$completePath"*)
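If directory names may contain spaces, a variant of the loop can glob only directories and strip the parent path from each key. This is a sketch, not the answer's method: the function name load_dir_sizes, the global array DIR_SIZES, and the throwaway directories created with mktemp are assumptions for illustration, and it relies on du -s printing the size, a tab, then the path.

```shell
#!/usr/bin/env bash
# Sketch: map each top-level directory name to its du size (bash 4.2+ for declare -g).
load_dir_sizes() {
    local parent=$1 size path
    declare -gA DIR_SIZES=()          # hypothetical global result array
    # "$parent"/*/ matches directories only; du -s prints "<size><TAB><path>"
    while IFS=$'\t' read -r size path; do
        path=${path%/}                # drop the trailing slash left by the glob
        DIR_SIZES["${path##*/}"]=$size
    done < <(du -s "$parent"/*/)
}

demo_root=$(mktemp -d)                # made-up sample tree
mkdir "$demo_root/alpha" "$demo_root/with space"
load_dir_sizes "$demo_root"
for name in "${!DIR_SIZES[@]}"; do
    printf '%s -> %s\n' "$name" "${DIR_SIZES[$name]}"
done
rm -rf "$demo_root"
```

Because the sizes and names come from a single du run, there is no window between two commands in which the directory contents could change.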

Related

How to get user input as number and echo the stored array value of that number in bash scripting

I have written a script that prints the running node processes together with the cwd of each process. I store the values in an array and echo them in a for loop.
How can I let the user enter an index into that array and then show the element stored at that index?
Example Myscript
array=$(netstat -nlp | grep node)
for i in ${array[*]}
do
echo $i
done
The output is something like this:
1056
2064
3024
I want something more advanced. I want to take input from the user, like:
Enter the regarding index from above list = 1
And let's suppose the user enters 1.
Then the next output should be:
Your selected value is 2064
Is this possible in bash?
First, you're not actually using an array, you are storing a plain string in the variable "array". The string contains words separated by whitespace, so when you supply the variable in the for statement, the unquoted value is subject to Word Splitting
You need to use the array syntax for setting the array:
array=( $(netstat -nlp | grep node) )
However, the unquoted command substitution still exposes you to Filename Expansion. The best way to store the lines of a command into an array is to use the mapfile command with a process substitution:
mapfile -t array < <(netstat -nlp | grep node)
And in the for loop, make sure you quote all the variables and use index @
for i in "${array[@]}"; do
echo "$i"
done
Notes:
arrays created with mapfile will start at index 0, so be careful of off-by-one errors
I don't know how variables are implemented in bash, but there is this oddity:
if you refer to the array without an index, you'll get the first element:
array=( "hello" "world" )
echo "$array" # ==> hello
If you refer to a plain variable with array syntax and index zero, you'll get the value:
var=1234
echo "${var[0]}" # ==> 1234
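To answer the original question, the index entered by the user can be validated and then used as the subscript. The following is a minimal sketch: the helper name pick_by_index and the hard-coded sample values are assumptions, and the interactive read call is shown only as a comment so the example runs without a terminal.

```shell
#!/usr/bin/env bash
# Sketch: print the element at a user-supplied index, rejecting bad input.
pick_by_index() {                     # hypothetical helper: index first, then the elements
    local idx=$1; shift
    local -a items=("$@")
    # accept only plain digits, then range-check (10# avoids octal surprises like 08)
    if [[ $idx =~ ^[0-9]+$ ]] && (( 10#$idx < ${#items[@]} )); then
        printf 'Your selected value is %s\n' "${items[10#$idx]}"
    else
        printf 'Invalid index: %s\n' "$idx" >&2
        return 1
    fi
}

array=(1056 2064 3024)                # stand-in for the mapfile result
# Interactive use would look like:
#   read -r -p 'Enter the index from the above list = ' choice
#   pick_by_index "$choice" "${array[@]}"
pick_by_index 1 "${array[@]}"         # prints: Your selected value is 2064
```

Since indices start at 0, entering 1 selects the second element, which matches the expected output in the question.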

Reading several files into an associative array in bash (>4.0) [duplicate]

I am new to associative arrays in bash, so please forgive me if I sound silly somewhere. Let's say I am reading through a large file and using a bash (version = 4.2.46) associative array to store FDR values for genes. For one file, I am simply doing:
declare -A array
while read ID GeneID geneSymbol chr strand exonStart_0base exonEnd upstreamES upstreamEE downstreamES downstreamEE ID IJC_SAMPLE_1 SJC_SAMPLE_1 IJC_SAMPLE_2 SJC_SAMPLE_2 IncFormLen SkipFormLen PValue FDR IncLevel1 IncLevel2 IncLevelDifference; do
array[$geneSymbol]="${array[$geneSymbol]}${array[$geneSymbol]:+,}$FDR" ;
done < input.txt
Which will store the FDR values that I can print by doing
for key in "${!array[@]}"; do echo "$key->${array[$key]}"; done
# Prints out
"ABHD14B"->0.285807588279,0.898327660004,0.820468496328
"DHFR"->0.464931314555,0.449582575347
...
I naively tried to read several files into my array by doing
declare -A array
find ./aligned.filtered/rMAT*/MATS_output/SE.MATS.JunctionCountOnly.txt -type f -exec cat {} + |
while read ID GeneID geneSymbol chr strand exonStart_0base exonEnd upstreamES upstreamEE downstreamES downstreamEE ID IJC_SAMPLE_1 SJC_SAMPLE_1 IJC_SAMPLE_2 SJC_SAMPLE_2 IncFormLen SkipFormLen PValue FDR IncLevel1 IncLevel2 IncLevelDifference;
do array[$geneSymbol]="${array[$geneSymbol]}${array[$geneSymbol]:+,}$FDR" ;
done
But in this case my array ends up being empty. I can of course cat all the files I need and save them into a single file that I can use as above, but it would be nice to know how to make an associative array to store data from several distinct files.
Thank you very much!
You probably shouldn't be doing this in bash in the first place, but your main problem is that the while loop runs in a subshell induced by the pipeline. Use process substitution to invert the relationship.
(Also, don't give names to all the fields you don't actually use; just split the line into an indexed array and pick out the two fields you actually want.)
while read -a fields; do
geneSymbol=${fields[2]} # third column; array indices start at 0
FDR=${fields[...]} # some number; i'm not counting
array[$geneSymbol]="${array[$geneSymbol]}${array[$geneSymbol]:+,}$FDR"
done < <(find ./aligned.filtered/rMAT*/MATS_output/SE.MATS.JunctionCountOnly.txt -type f -exec cat {} +)
find probably isn't necessary; just put your while loop inside a for loop:
for f in ./aligned.filtered/rMAT*/MATS_output/SE.MATS.JunctionCountOnly.txt; do
while read -a fields; do
...
done < "$f"
done
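The subshell effect is easy to reproduce on a small scale. The sketch below feeds the same loop once through a pipeline and once through process substitution; the two sample files, the four-column layout, and the array names piped and redirected are made up for the demonstration, and it assumes the lastpipe shell option is off (the default).

```shell
#!/usr/bin/env bash
# Sketch: the same loop fed by a pipeline vs. by process substitution.
demo_dir=$(mktemp -d)                 # made-up sample data
printf 'id1 g1 ABHD14B 0.28\nid2 g2 DHFR 0.46\n' > "$demo_dir/a.txt"
printf 'id3 g1 ABHD14B 0.89\n' > "$demo_dir/b.txt"

declare -A piped=() redirected=()

# Pipeline: the while loop runs in a subshell, so its assignments are lost.
cat "$demo_dir"/*.txt | while read -r -a fields; do
    key=${fields[2]}
    piped[$key]+="${piped[$key]:+,}${fields[3]}"
done

# Process substitution: the loop runs in the current shell.
while read -r -a fields; do
    key=${fields[2]}
    redirected[$key]+="${redirected[$key]:+,}${fields[3]}"
done < <(cat "$demo_dir"/*.txt)

echo "piped: ${#piped[@]} keys"            # piped: 0 keys
echo "redirected: ${#redirected[@]} keys"  # redirected: 2 keys
echo "ABHD14B -> ${redirected[ABHD14B]}"   # ABHD14B -> 0.28,0.89
rm -rf "$demo_dir"
```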

Convert an array into an associative array (Bash)

My code converted my string into an array, but I ended up noticing that I need it to be associative, with $9 as the key and the rest of the string as the value:
stringy=$(ls -l | awk '{print$3,$6,$7,$8,$9}')
declare -a myarray=()
Is it possible to do this using something similar to the following?
readarray -t myarray <<< "$stringy"
(Yes, parsing ls is not exactly wise ;) )
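One possible approach, sketched under assumptions: readarray only fills indexed arrays, so a read loop can split each line and use the last field as the key. The sample text below stands in for the ls | awk output, and the variable names owner, month, day and time_or_year are illustrative guesses at what fields $3, $6, $7 and $8 hold.

```shell
#!/usr/bin/env bash
# Sketch: last field becomes the key, the other four fields become the value.
# Sample text standing in for the ls | awk output (an assumption).
stringy=$'alice Jan 5 10:30 notes.txt\nbob Feb 12 2021 report.pdf'

declare -A myarray=()
while read -r owner month day time_or_year name; do
    [[ -n $name ]] || continue        # skip short lines such as the "total" line of ls -l
    myarray["$name"]="$owner $month $day $time_or_year"
done <<< "$stringy"

for key in "${!myarray[@]}"; do
    printf '%s => %s\n' "$key" "${myarray[$key]}"
done
```

The here-string keeps the loop in the current shell, so myarray is still populated after the loop ends.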

File Name based on Array

I have created an array:
declare -A months=( ["JAN"]="AP01" ["FEB"]="AP02" ["MAR"]="AP03" ["APR"]="AP04" ["MAY"]="AP05" ["JUN"]="AP06" ["JUL"]="AP07" ["AUG"]="AP08" ["SEP"]="AP09" ["OCT"]="AP10" ["NOV"]="AP11" ["DEC"]="AP12")
Now I want to read the replaced value of the month as it splits the file and creates the new file name:
awk -F, '{print "a~ST_SAP_FILE~Actual~",echo ${months["${"$3":0:3}"]}","~RM.txt"}' ExtractOriginal.txt
The field where the variable substitution occurs is column 3. In there I have MAR-2016, what I am expecting is a file named: a~ST_SAP_FILE~Actual~MAR~RM.txt. However, I get an error:
awk: syntax error near line 1
awk: illegal statement near line 1
awk: syntax error near line 1
awk: bailing out near line 1
What is the right syntax to take column 3, pass it to my array, return the Substitution variable and use it as the file name?
There are a few ways you could go about solving your problem. Which you choose is mostly contingent on how tied to awk you want to be.
Declare the array in awk:
Is there any reason for you not to declare the variable in awk?
awk -F, 'BEGIN{months["JAN"]="AP01"; months["FEB"]="AP02"; months["MAR"]="AP03"; months["APR"]="AP04"; months["MAY"]="AP05"; months["JUN"]="AP06"; months["JUL"]="AP07"; months["AUG"]="AP08"; months["SEP"]="AP09"; months["OCT"]="AP10"; months["NOV"]="AP11"; months["DEC"]="AP12"}{print "a~ST_SAP_FILE~Actual~"months[substr($3,1,3)]"~RM.txt"}' ExtractOriginal.txt
(also note that I removed the commas from print, since those will add spaces that your question seems to indicate you do not want in the result)
As @Ed Morton pointed out, due to the nature of your array, we can simplify its creation with split/sprintf, giving you this:
awk -F, 'BEGIN{split("JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC",t," "); for (i in t) months[t[i]]=sprintf("AP%02d",i)}{print "a~ST_SAP_FILE~Actual~"months[substr($3,1,3)]"~RM.txt"}' ExtractOriginal.txt
Pass the variable into awk:
This seems closest to what you were trying to do in your attempt. This keeps the array available in bash, but handles getting the filename you want with awk. Since there is no native way to handle a bash array in awk, you have to construct the latter from the former (which is made more difficult by this being an associative array).
I did this by first changing the bash array into a more easily parsed string which I then passed into awk as a variable.
# Declare the array
declare -A months=( ["JAN"]="AP01" ["FEB"]="AP02" ["MAR"]="AP03" ["APR"]="AP04" ["MAY"]="AP05" ["JUN"]="AP06" ["JUL"]="AP07" ["AUG"]="AP08" ["SEP"]="AP09" ["OCT"]="AP10" ["NOV"]="AP11" ["DEC"]="AP12")
# Change the array into a string more easily parsed with awk
# Each element in this array is of the format MON=APON
mon=$(for key in "${!months[@]}"; do echo "${key}=${months[${key}]}"; done)
# See below explanation
awk -F, -v mon="$mon" 'BEGIN {split(mon,tmp," "); for(m in tmp){i = index(tmp[m], "="); months[substr(tmp[m], 1, i-1)] = substr(tmp[m], i+1)}} {print "a~ST_SAP_FILE~Actual~"months[substr($3,1,3)]"~RM.txt"}' ExtractOriginal.txt
Below is a more readable version of the awk script. Note that -v mon="$mon" passes the bash variable mon into awk as a variable also named mon:
BEGIN {
split(mon,tmp," "); # split the string mon into an array named tmp
for(m in tmp) { # for element in tmp
i = index(tmp[m], "="); # get the index of the '='
months[substr(tmp[m], 1, i-1)] = substr(tmp[m], i+1)
# split the elements of tmp at the '='
# and add them into an associative array called months
# the value is the part which follows the '='
}
}
{
print "a~ST_SAP_FILE~Actual~"months[substr($3,1,3)]"~RM.txt"
}
Skip awk entirely:
Another option is to simply not use awk at all, which removes the burden of getting the array into a workable state. It's not clear from your question whether this is a potential solution for you, but personally I found this bash version much simpler to write, read and understand.
#!/usr/bin/env bash
filename="ExtractOriginal.txt"
declare -A months=( ["JAN"]="AP01" ["FEB"]="AP02" ["MAR"]="AP03" ["APR"]="AP04" ["MAY"]="AP05" ["JUN"]="AP06" ["JUL"]="AP07" ["AUG"]="AP08" ["SEP"]="AP09" ["OCT"]="AP10" ["NOV"]="AP11" ["DEC"]="AP12")
while IFS= read -r line; do # for each line in the file
    month_yr=$(echo "$line" | cut -d',' -f3) # get the third column
    month=${months[${month_yr:0:3}]} # look up the first 3 characters
    echo "a~ST_SAP_FILE~Actual~${month}~RM.txt"
done <"$filename"
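The per-line echo and cut can also be avoided by letting read split on commas. This is only a sketch of that variation: the helper name print_names and the sample rows are assumptions, and it expects the month-year value in column 3 as described in the question.

```shell
#!/usr/bin/env bash
# Sketch: split each CSV line with read instead of echo | cut.
declare -A months=( [JAN]=AP01 [FEB]=AP02 [MAR]=AP03 [APR]=AP04 [MAY]=AP05 [JUN]=AP06
                    [JUL]=AP07 [AUG]=AP08 [SEP]=AP09 [OCT]=AP10 [NOV]=AP11 [DEC]=AP12 )

print_names() {                        # hypothetical helper; reads CSV on stdin
    local month_yr key
    while IFS=, read -r _ _ month_yr _; do
        key=${month_yr:0:3}            # first three characters, e.g. MAR
        [[ -n $key ]] || continue      # an empty subscript would be an error
        printf 'a~ST_SAP_FILE~Actual~%s~RM.txt\n' "${months[$key]}"
    done
}

# Made-up sample rows; column 3 holds values like MAR-2016.
printf '%s\n' 'x,y,MAR-2016,1' 'x,y,DEC-2015,2' | print_names
```

With the real file, the call would be print_names < "$filename".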

How to copy an array in Bash?

I have an array of applications, initialized like this:
depends=$(cat ~/Depends.txt)
When I try to parse the list and copy it to a new array using,
for i in "${depends[@]}"; do
if [ $i #isn't installed ]; then
newDepends+=("$i")
fi
done
What happens is that only the first element of depends winds up on newDepends.
for i in "${newDepends[@]}"; do
echo $i
done
^^ This would output just one thing. So I'm trying to figure out why my for loop is only moving the first element. The whole list is originally in depends, so it's not that, but I'm all out of ideas.
a=(foo bar "foo 1" "bar two") #create an array
b=("${a[@]}") #copy the array in another one
for value in "${b[@]}" ; do #print the new array
echo "$value"
done
The simplest way to copy a non-associative array in bash is to:
arrayClone=("${oldArray[@]}")
or to add elements to a preexistent array:
someArray+=("${oldArray[@]}")
Newlines/spaces/IFS in the elements will be preserved.
For copying associative arrays, Isaac's solutions work great.
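Note that depends=$(cat ~/Depends.txt) in the question stores one string rather than an array, so "${depends[@]}" expands to a single element. A minimal sketch of loading the list as a real array and copying only selected elements follows; the temporary file, the sample names, and the is_installed test are assumptions standing in for the asker's file and check.

```shell
#!/usr/bin/env bash
# Sketch: load a list into a real array, then copy only selected elements.
list_file=$(mktemp)                    # stand-in for ~/Depends.txt
printf '%s\n' sh no-such-tool-xyz-123 > "$list_file"

is_installed() { command -v "$1" >/dev/null 2>&1; }   # hypothetical test

mapfile -t depends < "$list_file"      # one array element per line
newDepends=()
for pkg in "${depends[@]}"; do
    if ! is_installed "$pkg"; then
        newDepends+=("$pkg")
    fi
done

printf 'missing: %s\n' "${newDepends[@]}"   # missing: no-such-tool-xyz-123
rm -f "$list_file"
```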
The solutions given in the other answers won't work for associative arrays, or for arrays with non-contiguous indices. Here is a more general solution:
declare -A arr=([this]=hello [\'that\']=world [theother]='and "goodbye"!')
temp=$(declare -p arr)
eval "${temp/arr=/newarr=}"
diff <(echo "$temp") <(declare -p newarr | sed 's/newarr=/arr=/')
# no output
And another:
declare -A arr=([this]=hello [\'that\']=world [theother]='and "goodbye"!')
declare -A newarr
for idx in "${!arr[@]}"; do
newarr[$idx]=${arr[$idx]}
done
diff <(echo "$temp") <(declare -p newarr | sed 's/newarr=/arr=/')
# no output
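The point about non-contiguous indices can be seen with a small sparse array. In this sketch the array names sparse, packed and exact are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: copying a sparse indexed array with and without its indices.
sparse=([2]=two [7]=seven)

packed=("${sparse[@]}")                # values only; indices restart at 0

declare -a exact=()
for i in "${!sparse[@]}"; do           # iterate over the indices themselves
    exact[i]=${sparse[i]}
done

echo "sparse: ${!sparse[*]}"           # sparse: 2 7
echo "packed: ${!packed[*]}"           # packed: 0 1
echo "exact:  ${!exact[*]}"            # exact:  2 7
```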
Try this: arrayClone=("${oldArray[@]}")
This works easily.
array_copy() {
set -- "$(declare -p $1)" "$2"
eval "$2=${1#*=}"
}
# Usage examples:
these=(apple banana catalog dormant eagle fruit goose hat icicle)
array_copy these those
declare -p those
declare -A src dest
src=(["It's a 15\" spike"]="and it's 1\" thick" [foo]=bar [baz]=qux)
array_copy src dest
declare -p dest
Note: when copying associative arrays, the destination must already exist as an associative array. If not, array_copy() will create it as a standard array and try to interpret the key names from the associative source as arithmetic variable names, with ugly results.
Isaac Schwabacher's solution is more robust in this regard, but it can't be tidily wrapped up in a function because its eval step evaluates an entire declare statement and bash treats those as equivalent to local when they're inside a function. This could be worked around by wedging the -g option into the evaluated declare but that might give the destination array more scope than it's supposed to have. Better, I think, to have array_copy() perform only the actual copy into an explicitly scoped destination.
You can copy an array by inserting the elements of the first array into the copy by specifying the index:
#!/bin/bash
array=( One Two Three Go! )
array_copy=()
j=0
for (( i=0; i<${#array[@]}; i++ ))
do
    if [[ $i -ne 1 ]]; then # change the test here to your 'isn't installed' test
        array_copy[j]="${array[i]}"
        (( j += 1 )) # advance the destination index only when an element is copied
    fi
done
for k in "${array_copy[@]}"; do
    echo "$k"
done
The output of this would be:
One
Three
Go!
A useful document on bash arrays is on TLDP.
The problem is copying an array inside a function so that the copy is visible to the calling code. This solution works for indexed arrays and, if the destination is declared beforehand with declare -A ARRAY, also for associative arrays.
function array_copy
# $1 original array name
# $2 new array name with the same content
{
local INDEX
eval "
for INDEX in \"\${!$1[@]}\"
do
$2[\"\$INDEX\"]=\"\${$1[\$INDEX]}\"
done
"
}
Starting with Bash 4.3, you can do this
$ alpha=(bravo charlie 'delta 3' '' foxtrot)
$ declare -n golf=alpha
$ echo "${golf[2]}"
delta 3
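Keep in mind that a nameref is a second name for the same array rather than an independent copy. A short sketch of the difference, reusing the names from the example above and adding a hypothetical array called copy:

```shell
#!/usr/bin/env bash
# Sketch: a nameref (bash 4.3+) is a second name for the same array, not a copy.
alpha=(bravo charlie 'delta 3' '' foxtrot)
declare -n golf=alpha
golf[0]=changed                        # writes through to alpha
echo "${alpha[0]}"                     # changed

copy=("${alpha[@]}")                   # an independent copy
copy[1]=independent
echo "${alpha[1]}"                     # charlie
```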
Managed to copy an array into another.
firstArray=()
secondArray=()
firstArray+=("Element1")
firstArray+=("Element2")
secondArray+=("${firstArray[@]}")
for element in "${secondArray[@]}"; do
echo "${element}"
done
I've found that this works for me (mostly :)) ...
eval $(declare -p base | sed "s,base,target,")
extending the sed command to edit any switches as necessary e.g. if the new structure has to be writeable, to edit out read-only (-r).
I've discovered what was wrong. My "isn't installed" test is two for loops that remove excess characters from file names and print them if they exist on a certain web server. What it wasn't doing was removing a trailing hyphen. So when the names were tested online for availability, they were filtered out, because "file" exists but "file-" doesn't.

Resources