Split two numbers into two arrays

I need to split two numbers per line, in this form (they come from a text file):
Num1:Num2
Num3:Num4
and store num1 in array X, num2 in array Y, num3 in array X, and num4 in array Y.

With bash:
mapfile -t X < <(cut -d : -f 1 file) # read only first column
mapfile -t Y < <(cut -d : -f 2 file) # read only second column
declare -p X Y
Output:
declare -a X='([0]="num1" [1]="num3")'
declare -a Y='([0]="num2" [1]="num4")'
Disadvantage: The file is read twice.

You could perform the following steps:
Create destination arrays empty
Read file line by line, with a classic while read ... < file loop
Split each line on :, again using read
Append values to arrays
For example:
arr_x=()
arr_y=()
while IFS= read -r line || [ -n "$line" ]; do
IFS=: read -r x y <<< "$line"
arr_x+=("$x")
arr_y+=("$y")
done < data.txt
echo "content of arr_x:"
for v in "${arr_x[#]}"; do
echo "$v"
done
echo "content of arr_y:"
for v in "${arr_y[#]}"; do
echo "$v"
done

Here is a quick bash solution:
c=0
while IFS=: read -r a b; do
x[$c]="$a"
y[$c]="$b"
c=$((c+1))
done < input.txt
We feed input.txt to a while loop, set the input field separator (IFS) to :, and read the first number of each line into $a and the second into $b. Then we add them to the arrays as you specified, using the counter $c to track the position in the arrays.
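To inspect the result you could append a declare -p; a quick check, assuming input.txt contains the two sample lines (the exact formatting of the declare -p output varies between bash versions):
declare -p x y
# expected, with num1..num4 standing in for the actual numbers:
# declare -a x='([0]="num1" [1]="num3")'
# declare -a y='([0]="num2" [1]="num4")'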

Using the =~ operator to store the pair of numbers in the BASH_REMATCH array:
$ cat file
123:456
789:012
$ while read -r line
do
[[ $line =~ ([^:]*):(.*) ]] && echo ${BASH_REMATCH[1]} ${BASH_REMATCH[2]}
# do something else with numbers as they will be replaced on the next iteration
done < file
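If you need to keep the numbers rather than just print them, you could append the capture groups to arrays inside the same loop; a minimal sketch based on the =~ match above:
x=()
y=()
while read -r line
do
  if [[ $line =~ ([^:]*):(.*) ]]; then
    x+=("${BASH_REMATCH[1]}")   # first field of the line
    y+=("${BASH_REMATCH[2]}")   # second field of the line
  fi
done < file
declare -p x y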

How to create multiple arrays from a text file and loop through the values of each array

I have a text file with the following:
Paige
Buckley
Govan
Mayer
King

Harrison
Atkins
Reinhardt
Wilson

Vaughan
Sergovia
Tarrega
My goal is to create an array for each set of names, then iterate through the first array of values, move on to the second array of values, and lastly the third array. Each set is separated by a blank line in the text file. Help with code or logic is much appreciated!
So far I have the following. I am unsure of the logic moving forward when I reach a blank line. My research also suggests that I can use readarray -d.
#!/bin/bash
my_array=()
while IFS= read -r line || [[ "$line" ]]; do
if [[ $line -eq "" ]];
.
.
.
arr+=("$line") # i know this adds the value to the array
done < "$1"
printf '%s\n' "${my_array[@]}"
desired output:
array1 = (Paige Buckley Govan Mayer King)
array2 = (Harrison Atkins Reinhardt Wilson)
array3 = (Vaughan Sergovia Tarrega)
# then loop through each array, one after the other.
Bash has no arrays of arrays, so you have to represent them in another way.
You could leave the newlines and have an array of newline separated elements:
array=()
elem=""
while IFS= read -r line; do
if [[ "$line" != "" ]]; then
elem+="${elem:+$'\n'}$line" # accumulate lines in elem
else
array+=("$elem") # flush elem as array element
elem=""
fi
done < file
if [[ -n "$elem" ]]; then
array+=("$elem") # flush the last elem
fi
# iterate over array
for ((i=0;i<${#array[@]};++i)); do
# each array element is newline separated items
readarray -t elem <<<"${array[i]}"
printf 'array%d = (%s)\n' "$i" "${elem[*]}"
done
You could simplify the loop with some unique separator character and sed, for example:
readarray -d '#' -t array < <(sed -z 's/\n\n/#/g' file)
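Each array element then holds one newline-separated group, which you can split back into a per-group array when needed; for example (a sketch reusing readarray; group1 is just an illustrative name):
readarray -t group1 <<< "${array[0]}"   # first group as its own array
printf '%s\n' "${group1[@]}"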
But overall, this awk script generates the same output:
awk -v RS= -v FS='\n' '{
printf "array%d = (", NR;
for (i=1;i<=NF;++i) printf "%s%s", $i, i==NF?"":" ";
printf ")\n"
}' file
Using nameref :
#!/usr/bin/env bash
declare -a array1 array2 array3
declare -n array=array$((n=1))
while IFS= read -r line; do
test "$line" = "" && declare -n array=array$((n=n+1)) || array+=("$line")
done < "$1"
declare -p array1 array2 array3
Called with :
bash test.sh data
# result
declare -a array1=([0]="Paige" [1]="Buckley" [2]="Govan" [3]="Mayer" [4]="King")
declare -a array2=([0]="Harrison" [1]="Atkins" [2]="Reinhardt" [3]="Wilson")
declare -a array3=([0]="Vaughan" [1]="Sergovia" [2]="Tarrega")
Assumptions:
blank lines are truly blank (i.e., no need to worry about any whitespace on said lines)
could have consecutive blank lines
names could have embedded white space
the number of groups could vary and won't always be 3 (as with the sample data provided in the question)
OP is open to using a (simulated) 2-dimensional array as opposed to a (variable) number of 1-dimensional arrays
My data file:
$ cat names.dat
<<< leading blank lines
Paige
Buckley
Govan
Mayer
King Kong
<<< consecutive blank lines
Harrison
Atkins
Reinhardt
Wilson
Larry
Moe
Curly
Shemp
Vaughan
Sergovia
Tarrega
<<< trailing blank lines
One idea that uses a couple arrays:
array #1: associative array - the previously mentioned (simulated) 2-dimensional array with the index - [x,y] - where x is a unique identifier for a group of names and y is a unique identifier for a name within a group
array #2: 1-dimensional array to keep track of max(y) for each group x
Loading the arrays:
unset names max_y # make sure array names are not already in use
declare -A names # declare associative array
x=1 # init group counter
y=0 # init name counter
max_y=() # initialize the max(y) array
inc= # clear increment flag
while read -r name
do
if [[ "${name}" = '' ]] # if we found a blank line ...
then
[[ "${y}" -eq 0 ]] && # if this is a leading blank line then ...
continue # ignore and skip to the next line
inc=y # set flag to increment 'x'
else
[[ "${inc}" = 'y' ]] && # if increment flag is set ...
max_y[${x}]="${y}" && # make note of max(y) for this 'x'
((x++)) && # increment 'x' (group counter)
y=0 && # reset 'y'
inc= # clear increment flag
((y++)) # increment 'y' (name counter)
names[${x},${y}]="${name}" # save the name
fi
done < names.dat
max_y[${x}]="${y}" # make note of the last max(y) value
Contents of the array:
$ typeset -p names
declare -A names=([1,5]="King Kong" [1,4]="Mayer" [1,1]="Paige" [1,3]="Govan" [1,2]="Buckley" [3,4]="Shemp" [3,3]="Curly" [3,2]="Moe" [3,1]="Larry" [2,4]="Wilson" [2,2]="Atkins" [2,3]="Reinhardt" [2,1]="Harrison" [4,1]="Vaughan" [4,2]="Sergovia" [4,3]="Tarrega" )
$ for (( i=1; i<=${x}; i++ ))
do
for (( j=1; j<=${max_y[${i}]}; j++ ))
do
echo "names[${i},${j}] : ${names[${i},${j}]}"
done
echo ""
done
names[1,1] : Paige
names[1,2] : Buckley
names[1,3] : Govan
names[1,4] : Mayer
names[1,5] : King Kong
names[2,1] : Harrison
names[2,2] : Atkins
names[2,3] : Reinhardt
names[2,4] : Wilson
names[3,1] : Larry
names[3,2] : Moe
names[3,3] : Curly
names[3,4] : Shemp
names[4,1] : Vaughan
names[4,2] : Sergovia
names[4,3] : Tarrega

Read lines from text file and store it in array

So I need to read all the lines from a text file (given as an argument when I call the script) which contains numbers in this form (one newline between lines, not two):
num1:num2
num3:num4 etc
I use this command block:
while IFS= read line
do
IFS=':' read -r -a X <<< "$line"
done < "$1"
to read the lines and numbers and store them into array X, but the array only ever fills positions 0 and 1; when it moves to the next line it just writes the new number (e.g. num3) over the old one (e.g. num1 in position 0).
Any solution to this?
With bash: replace every : with a newline and use mapfile to fill array x.
mapfile -t x < <(tr ':' '\n' < file)
declare -p x
Output:
declare -a x='([0]="num1" [1]="num2" [2]="num3" [3]="num4")'
See: help mapfile
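Alternatively, keep your per-line loop and append to the array instead of overwriting it on every line; a minimal sketch of that fix, assuming the same num1:num2 layout and the file name in $1 (pair is just a helper name here):
X=()
while IFS= read -r line; do
  IFS=':' read -r -a pair <<< "$line"   # pair holds the two numbers of this line
  X+=("${pair[@]}")                     # append both fields instead of replacing positions 0 and 1
done < "$1"
declare -p X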

Creating an array from file and name it based on a line in bash

I have text file like this:
src_dir=source1
src_dir=source2
dst_dir=dest1
whatever_thing=thing1
whatever_thing=thing2
I want a script that will create arrays named after the left part of each line and fill them with elements from the right part. So basically it should do:
src_dir=(source1 source2)
dst_dir=(dest1)
whatever_thing=(thing1 thing2)
So far I've tried:
while read -r -a line
do
IFS='= ' read -r -a array <<< "$line"
"${array[0]}"+="${array[1]}"
done < file.txt
If your bash version is 4.3 or newer, declare has a -n option
to define a nameref to a variable name, which works like a reference in C++.
Then please try the following:
while IFS== read -r key val; do
declare -n name=$key
name+=("$val")
done < file.txt
# test
echo "${src_dir[#]}"
echo "${dst_dir[#]}"
echo "${whatever_thing[#]}"
try this:
#!/bin/bash
mapfile -t arr < YourDataFile.txt
declare -A dict
for line in "${arr[@]}"; do
key="${line%%=*}"
value="${line#*=}"
[ ${dict["$key"]+X} ] && dict["$key"]+=" $value" || dict["$key"]="$value"
done
for key in "${!dict[@]}"; do
printf "%s=(%s)\n" "$key" "${dict["$key"]}"
done
explanation
# read file into array
mapfile -t arr < YourDataFile.txt
# declare associative array
declare -A dict
# loop over the data array
for line in "${arr[@]}"; do
# extract key
key="${line%%=*}"
# extract value
value="${line#*=}"
# write into associative array
# - if key exists ==> append value
# - else initialize entry
[ ${dict["$key"]+X} ] && dict["$key"]+=" $value" || dict["$key"]="$value"
done
# loop over associative array
for key in "${!dict[@]}"; do
# print key=value pairs
printf "%s=(%s)\n" "$key" "${dict["$key"]}"
done
output
dst_dir=(dest1)
src_dir=(source1 source2)
whatever_thing=(thing1 thing2)
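Note that each dict entry is stored as a single space-separated string; if you later need the values of one key as a real array, you can split that string again, for example (a sketch, assuming the values themselves contain no spaces; src_values is just an illustrative name):
read -r -a src_values <<< "${dict[src_dir]}"   # src_values=(source1 source2)
declare -p src_values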

Picking valid IPs from an array of strings

In my use case I'm filtering certain IPv4 addresses from a list and putting them into an array for further tasks:
readarray -t firstarray < <(grep -ni '^ser*' IPbook.file | cut -f 2 -d "-")
As a result the output is:
10.8.61.10
10.0.10.15
172.0.20.30
678.0.0.10
As you can see, the last row is not a valid IP, so I need to add some regex check on FIRSTARRAY.
I do not want to save intermediate files to work with, so I'm looking for some "on-the-fly" option to apply a regex to firstarray. I tried the following:
for X in "${FIRSTARRAY[#]}"; do
readarray -t SECONDARRAY < <(grep -E '\b((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)(\.|$)){4}\b' "$X")
done
But in the output I see that the system thinks $X is a file/directory and doesn't process the value, even though it clearly sees it:
line ABC: 172.0.20.30: No such file or directory
line ABC: 678.0.0.10: No such file or directory
What am I doing wrong and what would be the best approach to proceed?
You are passing "$X" as an argument to grep and hence it is being treated as a file name. Use a here-string (<<<) instead:
for X in "${firstarray[#]}"; do
readarray -t secondarray < <(grep -E '\b((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)(\.|$)){4}\b' <<< "$X")
done
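Note that readarray resets secondarray on every pass through the loop, so after the loop only the result of the last element remains; if you want to collect every address that matches, you could append instead, for example (same regex, just accumulating):
secondarray=()
for X in "${firstarray[@]}"; do
  grep -qE '\b((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)(\.|$)){4}\b' <<< "$X" && secondarray+=("$X")
done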
You are better off writing a function to validate the IP instead of relying on just a regex match:
#!/bin/bash
validate_ip() {
local arr element
IFS=. read -r -a arr <<< "$1" # convert ip string to array
[[ ${#arr[@]} != 4 ]] && return 1 # doesn't have four parts
for element in "${arr[@]}"; do
[[ $element =~ ^[0-9]+$ ]] || return 1 # non-numeric characters found
[[ $element =~ ^0[0-9]+$ ]] && return 1 # leading 0 not allowed when followed by other digits, to prevent it from being interpreted as an octal number
((element < 0 || element > 255)) && return 1 # number out of range
done
return 0
}
And then loop through your array:
for x in "${firstarray[@]}"; do
validate_ip "$x" && secondarray+=("$x") # add to second array if element is a valid IP
done
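With the four sample values from the question, only the first three should pass the check, so you would end up with something like:
$ declare -p secondarray
declare -a secondarray=([0]="10.8.61.10" [1]="10.0.10.15" [2]="172.0.20.30")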
The problem is that you are passing the value to grep as an argument, while grep expects to read it from a file or from standard input.
You can use your regex to filter the IP addresses right in the first command:
readarray -t firstarray < <(grep -ni '^ser*' IPbook.file | cut -f 2 -d "-" | grep -E '\b((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)(\.|$)){4}\b' )
Then you have only IP addresses in firstarray.
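With the sample values above, firstarray should then hold only the three well-formed addresses:
$ printf '%s\n' "${firstarray[@]}"
10.8.61.10
10.0.10.15
172.0.20.30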

Storing data in multiple arrays (bash)

I am trying to store the contents of a .txt file in two sets of arrays in bash. The file is a list of characteristics for given data files, delimited by vertical bars (|). So far, I have written code that reads the file and prints each line of data separately, each followed by the given sections of the line.
#prints line of text and then separated version
while IFS='' read -r line || [[ -n "$line" ]]
do
echo "Text read from file: $line"
words=$(echo $line | tr "|" "\n")
for tests in $words
do
echo "> $tests"
done
done < "$1"
Example output:
Text read from file: this|is|data|in|a|file
> this
> is
> data
> in
> a
> file
Text read from file: another|example|of|data
> another
> example
> of
> data
Is there a way for me to store each individual line of data in one array, and then the broken-up parts of it in another? I was thinking this might be possible using a loop, but I am confused by arrays in bash (newbie).
OK -- I just read in the lines like you have done, and append them to the lines array. Then, use tr as you have done, and append to the words array. Just use the parentheses to mark them as array elements in the assignments:
$ cat data.txt
this|is|data|in|a|file
another|example|of|data
$ cat read_data.sh
#!/bin/bash
declare -a lines
declare -a words
while IFS='' read -r line || [[ -n "$line" ]]
do
echo "Text read from file: $line"
lines+=( "$line" )
words+=( $(echo $line | tr "|" " ") )
done < "$1"
for (( ii=0; ii<${#lines[@]}; ii++ )); do
echo "Line $ii ${lines[ii]}"
done
for (( ii=0; ii<${#words[@]}; ii++ )); do
echo "Word $ii ${words[ii]}"
done
$ ./read_data.sh data.txt
Text read from file: this|is|data|in|a|file
Text read from file: another|example|of|data
Line 0 this|is|data|in|a|file
Line 1 another|example|of|data
Word 0 this
Word 1 is
Word 2 data
Word 3 in
Word 4 a
Word 5 file
Word 6 another
Word 7 example
Word 8 of
Word 9 data
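Since the flat words array no longer records which line each word came from, you can also split a stored line back into its fields on demand; a short sketch using the same lines array (fields is just an illustrative name):
for (( ii=0; ii<${#lines[@]}; ii++ )); do
  IFS='|' read -r -a fields <<< "${lines[ii]}"   # split the stored line on '|'
  echo "Line $ii has ${#fields[@]} fields: ${fields[*]}"
done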
