Using array inside awk in shell script - arrays

I am very new to Unix shell scripting and am trying to learn it. Please check my requirement and my approach.
I have an input file containing this data:
ABC = A:3 E:3 PS:6
PQR = B:5 S:5 AS:2 N:2
I am trying to parse the data and get the result as
ABC
A=3
E=3
PS=6
PQR
B=5
S=5
AS=2
N=2
The values can be added horizontally and vertically so I am trying to use an array. I am trying something like this:
myarr=(main.conf | awk -F"=" 'NR!=1 {print $1}'))
echo ${myarr[1]}
# Or loop through every element in the array
for i in "${myarr[@]}"
do
:
echo $i
done
or
awk -F"=" 'NR!=1 {
print $1"\n"
STR=$2
IFS=':' read -r -a array <<< "$STR"
for i in "${!array[@]}"
do
echo "$i=>${array[i]}"
done
}' main.conf
But when I add this code to a .sh file and try to run it, I get syntax errors as
$ awk -F"=" 'NR!=1 {
> print $1"\n"
> STR=$2
> FS= read -r -a array <<< "$STR"
> for i in "${!array[@]}"
> do
> echo "$i=>${array[i]}"
> done
>
> }' main.conf
awk: cmd. line:4: FS= read -r -a array <<< "$STR"
awk: cmd. line:4: ^ syntax error
awk: cmd. line:5: for i in "${!array[@]}"
awk: cmd. line:5: ^ syntax error
awk: cmd. line:8: done
awk: cmd. line:8: ^ syntax error
How can I complete the above expectations?

This is the awk code to do what you want:
$ cat tst.awk
BEGIN { FS="[ =:]+"; OFS="=" }
{
print $1
for (i=2;i<NF;i+=2) {
print $i, $(i+1)
}
print ""
}
and this is the shell script (yes, all a shell script does to manipulate text is call awk):
$ awk -f tst.awk file
ABC
A=3
E=3
PS=6
PQR
B=5
S=5
AS=2
N=2
A UNIX shell is an environment from which to call UNIX tools (find, sort, sed, grep, awk, tr, cut, etc.). It has its own language for manipulating (e.g. creating/destroying) files and processes and sequencing calls to tools but it is NOT intended to be used to manipulate text. The guys who invented shell also invented awk for shell to call to manipulate text.
Read https://unix.stackexchange.com/questions/169716/why-is-using-a-shell-loop-to-process-text-considered-bad-practice and the book Effective Awk Programming, 4th Edition, by Arnold Robbins.
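For what it's worth, the attempt in the question failed because everything between the single quotes is parsed by awk, not by Bash, so IFS, read and "for ... in ...; do" are syntax errors there. If an array inside awk is really wanted, awk's own split() function fills one. A minimal sketch, assuming the input format shown in the question (the array names pairs and kv are my own, and the sample data is fed through a here-document instead of main.conf):

```shell
#!/bin/bash
# Sample input copied from the question; a here-document stands in for main.conf.
result=$(awk -F' = ' '{
    print $1                          # block name, e.g. ABC
    n = split($2, pairs, " ")         # pairs[1..n] hold "A:3", "E:3", ...
    for (i = 1; i <= n; i++) {
        split(pairs[i], kv, ":")      # kv[1] is the key, kv[2] the value
        print kv[1] "=" kv[2]
    }
}' <<'EOF'
ABC = A:3 E:3 PS:6
PQR = B:5 S:5 AS:2 N:2
EOF
)
printf '%s\n' "$result"
```

Once the values sit in kv[2], summing them per line or per key is an ordinary awk loop; that part is left out here because the question does not say which totals are needed.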

First off, a command that does what you want:
$ sed 's/ = /\n/;y/: /=\n/' main.conf
ABC
A=3
E=3
PS=6
PQR
B=5
S=5
AS=2
N=2
This replaces, on each line, the first (and only) occurrence of = with a newline (the s command), then turns all : into = and all spaces into newlines (the y command). Notice that
this works only because there is a space at the end of the first line (otherwise it would be a bit more involved to get the empty line between the blocks) and
this works only with GNU sed because it substitutes newlines; see this fantastic answer for all the details and how to get it to work with BSD sed.
As for what you tried, there is almost too much wrong with it to try and fix it piece by piece: from the wild mixing of awk and Bash to syntax errors all over the place. I recommend you read good tutorials for both, for example:
The BashGuide
Effective AWK Programming
A Bash solution
Here is a way to solve the same in Bash; I didn't use any arrays.
#!/bin/bash
# Read line by line into the 'line' variable. Setting 'IFS' to the empty string
# preserves leading and trailing whitespace; '-r' prevents interpretation of
# backslash escapes
while IFS= read -r line; do
# Three parameter expansions:
# Replace ' = ' by newline (escape backslash)
line="${line/ = /\\n}"
# Replace ':' by '='
line="${line//:/=}"
# Replace spaces by newlines (escape backslash)
line="${line// /\\n}"
# Print the modified input line; '%b' expands backslash escapes
printf "%b" "$line"
done < "$1"
Output:
$ ./SO.sh main.conf
ABC
A=3
E=3
PS=6
PQR
B=5
S=5
AS=2
N=2
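If you do want arrays in Bash, the read -r -a idea from the question works once it is moved out of the awk program. A rough sketch, assuming the same two input lines (the variable names out, pairs, name and rest are only for illustration):

```shell
#!/bin/bash
out=()                                   # collects every output line
while IFS= read -r line; do
    name=${line%% = *}                   # text before " = ", e.g. ABC
    rest=${line#* = }                    # text after " = ", e.g. "A:3 E:3 PS:6"
    out+=("$name")
    read -r -a pairs <<< "$rest"         # split on whitespace into an array
    for pair in "${pairs[@]}"; do
        out+=("${pair%%:*}=${pair#*:}")  # turn key:value into key=value
    done
done <<'EOF'
ABC = A:3 E:3 PS:6
PQR = B:5 S:5 AS:2 N:2
EOF
printf '%s\n' "${out[@]}"
```

This is fine for a couple of lines; for large files the awk or sed versions will be much faster.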

Related

Bash Add elements to an array does not work [duplicate]

Why isn't this bash array populating? I believe I've done them like this in the past. Echoing ${#XECOMMAND[@]} shows no data.
DIR=$1
TEMPFILE=/tmp/dir.tmp
ls -l $DIR | tail -n +2 | sed 's/\s\+/ /g' | cut -d" " -f5,9 > $TEMPFILE
i=0
cat $TEMPFILE | while read line ;do
if [[ $(echo $line | cut -d" " -f1) == 0 ]]; then
XECOMMAND[$i]="$(echo "$line" | cut -d" " -f2)"
(( i++ ))
fi
done
When you run the while loop like
somecommand | while read ...
then the while loop is executed in a subshell, i.e. a different process than the main script. Thus, all variable assignments that happen in the loop will not be reflected in the main process. The workaround is to use input redirection and/or process substitution, so that the loop executes in the current process. For example, if you want to read from a file you do
while read ....
do
# do stuff
done < "$filename"
or if you want the output of a process you can do
while read ....
do
# do stuff
done < <(some command)
Finally, in bash 4.2 and above, you can set shopt -s lastpipe, which causes the last command in the pipeline to be executed in the current shell. This only takes effect when job control is not active, which is the normal case in a script.
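The difference is easy to see with a counter. A small sketch, assuming default shell options (lastpipe off); the variable names are made up for the demonstration:

```shell
#!/bin/bash
# Loop on the right-hand side of a pipe: runs in a subshell.
count_pipe=0
printf 'a\nb\nc\n' | while read -r line; do
    (( count_pipe += 1 ))        # updates the subshell's copy only
done

# Same loop fed by process substitution: runs in the current shell.
count_redirect=0
while read -r line; do
    (( count_redirect += 1 ))
done < <(printf 'a\nb\nc\n')

echo "pipe: $count_pipe, process substitution: $count_redirect"
```

The first counter is still 0 after the loop, the second one is 3.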
I think you're trying to construct an array consisting of the names of all zero-length files and directories in $DIR. If so, you can do it like this:
mapfile -t ZERO_LENGTH < <(find "$DIR" -maxdepth 1 -size 0)
(Add -type f to the find command if you're only interested in regular files.)
This sort of solution is almost always better than trying to parse ls output.
The use of process substitution (< <(...)) rather than piping (... |) is important, because it means that the shell variable will be set in the current shell, not in an ephemeral subshell.

Problem with Splitting Up a String and Putting it Into an Array in BASH on a Mac

I have been trying to split up a string and putting it into an Array in Bash on my Mac without success.
Here is my sample code:
#!/bin/bash
declare -a allDisks
allDisksString="`ls /dev/disk* | grep -e 'disk[0-9]s.*' | awk '{ print $NF }'`"
#allDisksString="/dev/disk0s1 /dev/disk1s1"
echo allDisksString is $allDisksString
IFS=' ' read -ra allDisks <<< "$allDisksString"
echo allDIsks is "$allDisks"
echo The second item in allDisks is "${allDisks[1]}"
for disk in "${allDisks[@]}"
do
printf "Loop $disk\n"
done
And below is the output:
allDisksString is /dev/disk0s1 /dev/disk0s2 /dev/disk0s3 /dev/disk0s4 /dev/disk1s1
allDIsks is /dev/disk0s1
The second item in allDisks is
Loop /dev/disk0s1
Interestingly, if I execute the following in the Mac Terminal:
ls /dev/disk* | grep -e 'disk[0-9]s.*' | awk '{ print $NF }'
I get the following output
/dev/disk0s1
/dev/disk0s2
/dev/disk0s3
/dev/disk0s4
/dev/disk1s1
So I have also tried setting IFS to IFS=$'\n' without any success.
So no luck in getting a list of my drives into an array.
Any ideas on what I am doing wrong?
You're making this much more complicated than it needs to be. You don't need to use ls, you can just use a wildcard to match the device names you want, and put that in an array assignment.
#!/bin/bash
declare -a allDisks
allDisks=(/dev/disk[0-9]s*)
echo allDIsks is "$allDisks"
echo The second item in allDisks is "${allDisks[1]}"
for disk in "${allDisks[@]}"
do
printf "Loop $disk\n"
done
read only reads one line.
Use an assignment instead. When assigning to an array, you need to use parentheses after the = sign:
#!/bin/bash
disks=( $(ls /dev/disk* | grep -e 'disk[0-9]s.*' | awk '{ print $NF }') )
echo ${disks[1]}
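To make the "read only reads one line" point concrete: the string in the question really contains newlines, and they only looked like spaces because $allDisksString was echoed unquoted. A minimal sketch with made-up device names, assuming the stock bash 3.2 on a Mac (so no mapfile); read -d '' makes read consume the whole input instead of stopping at the first newline:

```shell
#!/bin/bash
# Hypothetical multi-line string standing in for the ls | grep | awk output.
multi_line=$'/dev/disk0s1\n/dev/disk0s2\n/dev/disk1s1'

# The attempt from the question: read stops at the first newline.
IFS=' ' read -ra first_try <<< "$multi_line"

# Split on newlines and read up to end of input. read returns non-zero
# when it hits EOF before a delimiter, hence the "|| true".
IFS=$'\n' read -r -d '' -a disks <<< "$multi_line" || true

echo "first try: ${#first_try[@]} element(s), second try: ${#disks[@]} element(s)"
echo "The second item is ${disks[1]}"
```

Note also that "$allDisks" without an index expands to element 0 only; use "${allDisks[@]}" to get every element.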

How do i echo specific rows and columns from csv's in a variable?

The below script:
#!/bin/bash
otscurrent="
AAA,33854,4528,38382,12
BBB,83917,12296,96213,13
CCC,20399,5396,25795,21
DDD,27198,4884,32082,15
EEE,2472,981,3453,28
FFF,3207,851,4058,21
GGG,30621,4595,35216,13
HHH,8450,1504,9954,15
III,4963,2157,7120,30
JJJ,51,59,110,54
KKK,87,123,210,59
LLL,573,144,717,20
MMM,617,1841,2458,75
NNN,234,76,310,25
OOO,12433,1908,14341,13
PPP,10627,1428,12055,12
QQQ,510,514,1024,50
RRR,1361,687,2048,34
SSS,1,24,25,96
TTT,0,5,5,100
UUU,294,1606,1900,85
"
IFS="," array1=(${otscurrent})
echo ${array1[4]}
Prints:
$ ./test.sh
12
BBB
I'm trying to get it to just print 12... And I am not even sure how to make it just print row 5 column 4
The variable is an output of a sqlquery that has been parsed with several sed commands to change the formatting to csv.
otscurrent="$(sqlplus64 user/password@dbserverip/db as sysdba @query.sql |
sed '1,11d; /^-/d; s/[[:space:]]\{1,\}/,/g; $d' |
sed '$d'|sed '$d'|sed '$d' | sed '$d' |
sed 's/Used,MB/Used MB/g' |
sed 's/Free,MB/Free MB/g' |
sed 's/Total,MB/Total MB/g' |
sed 's/Pct.,Free/Pct. Free/g' |
sed '1b;/^Name/d' |
sed '/^$/d'
)"
Ultimately I would like to be able to call on a row and column and run statements on the values.
Initially i was piping that into :
awk -F "," 'NR>1{ if($5 < 10) { printf "%-30s%-10s%-10s%-10s%-10s\n", $1,$2,$3,$4,$5"%"; } else { echo "Nothing to do" } }')"
Which works, but I couldn't run commands from the if/else, or at least I didn't know how.
If you have bash 4.0 or newer, an associative array is an appropriate way to store data in this kind of form.
otscurrent=${otscurrent#$'\n'} # strip leading newline present in your sample data
declare -A data=( )
row=0
while IFS=, read -r -a line; do
for idx in "${!line[@]}"; do
data["$row,$idx"]=${line[$idx]}
done
(( row += 1 ))
done <<<"$otscurrent"
This lets you access each individual item:
echo "${data[0,0]}" # first field of first line
echo "${data[9,0]}" # first field of tenth line
echo "${data[9,1]}" # second field of tenth line
"I'm trying to get it to just print 12..."
The issue is that IFS="," splits on commas and there is no comma between 12 and BBB. If you want those to be separate elements, add a newline to IFS. Thus, replace:
IFS="," array1=(${otscurrent})
With:
IFS=$',\n' array1=(${otscurrent})
Output:
$ bash test.sh
12
All you need to print the value of the 4th column on the 5th row is:
$ awk -F, 'NR==5{print $4}' <<< "$otscurrent"
3453
and just remember that in awk row (record) and column (field) numbers start at 1, not 0. Some more examples:
$ awk -F, 'NR==1{print $5}' <<< "$otscurrent"
12
$ awk -F, 'NR==2{print $1}' <<< "$otscurrent"
BBB
$ awk -F, '$5 > 50' <<< "$otscurrent"
JJJ,51,59,110,54
KKK,87,123,210,59
MMM,617,1841,2458,75
SSS,1,24,25,96
TTT,0,5,5,100
UUU,294,1606,1900,85
If you'd like to avoid all of the complexity and simply parse your SQL output to produce what you want without 20 sed commands in between, post a new question showing the raw sqlplus output as the input and what you want finally output and someone will post a brief, clear, simple, efficient awk script to do it all at one time, or maybe 2 commands if you still want an intermediate CSV for some reason.
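Since part of the question was how to run commands from an if/else per row, here is a hedged sketch of doing that in the shell itself with a while read loop. The column names (name, used, free, total, pct), the three sample rows and the threshold of 20 are assumptions for illustration only; the real column order depends on the query:

```shell
#!/bin/bash
# Three rows taken from the sample data in the question.
otscurrent='AAA,33854,4528,38382,12
JJJ,51,59,110,54
TTT,0,5,5,100'

low=()                                    # names of rows below the threshold
while IFS=, read -r name used free total pct; do
    if (( pct < 20 )); then
        low+=("$name")
        printf '%-10s%s%%\n' "$name" "$pct"   # any shell command could go here
    else
        echo "Nothing to do for $name"
    fi
done <<< "$otscurrent"
```

Because the loop is fed by a here-string and not by a pipe, the low array is still populated after the loop ends.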

Comment out items that do not match pattern in array

I have a log file I am trying to comment out lines that do not match my array. I did successfully learn how to create an array and I can echo out the array items but I am having trouble taking anything that doesn't match my array and adding something in front of it. Here is my code, if you have suggestions on another path or ways I can make it better:
for itsSaturday in $(find "$LOCATION" -mindepth 1 -maxdepth 1 -name "*.log" ); do
TEMPFILE="$itsSaturday.$$"
declare -a someArray=( "breakfast" "scrambled eggs" "Bloody Mary" )
theCall='some_additional_text_'
commentOn="## You_need_"
for arrayItem in "${someArray[@]}"; do
merged="$theCall$arrayItem"
if ! grep -q "$merged" "$itsSaturday"; then
sed -e '/$merged/! s:$commentOn$theCall::g' "$itsSaturday" > $TEMPFILE && mv $TEMPFILE "$itsSaturday"
fi
done
done
file:
some_additional_text_breakfast
some_additional_text_bacon
some_additional_text_scrambled eggs
some_additional_text_Bloody Mary
some_additional_text_orange juice
some_additional_text_breakfast
file into:
some_additional_text_breakfast
## You_need_some_additional_text_bacon
some_additional_text_scrambled eggs
some_additional_text_Bloody Mary
## You_need_some_additional_text_orange juice
some_additional_text_breakfast
How can I add a variable before items that do not match my array?
I don't like doing this using bash and sed, but I think the following might be enough:
#! /bin/bash
declare -a someArray=( "breakfast" "scrambled eggs" "Bloody Mary" )
theCall='some_additional_text_'
commentOn="## You_need_"
OIFS="$IFS"
IFS='|' mergedLines="${someArray[*]/#/$theCall}"
IFS="$OIFS"
for i in *.txt
do
TEMPFILE="$i.$$"
sed -r "/$mergedLines/!s/^/$commentOn/" "$i" >> "$TEMPFILE"
done
1. I moved the array and the other constants out of the loop.
2. "${someArray[*]/#/$theCall}" uses bash pattern substitution to prepend the contents of $theCall to every element in the array.
3. IFS='|' mergedLines="${someArray[*]}" is a convenient trick to combine the elements of an array into a pipe-separated string.
Combined, (2) and (3) get me
some_additional_text_breakfast|some_additional_text_scrambled eggs|some_additional_text_Bloody Mary
in mergedLines.
4. Then it's just a matter of using extended regular expressions in sed (for |) and prefixing the non-matching lines.
5. Your sed pattern used single quotes, so the variables within were not expanded.
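To see the pieces work together without touching real log files, here is a small sketch that runs the same idea against the sample file from the question, fed through a here-document. The variable names pattern and result are mine; the join is done in a command substitution so that IFS is not changed in the main shell, and sed -E is used in place of -r because both GNU and BSD sed accept it:

```shell
#!/bin/bash
someArray=( "breakfast" "scrambled eggs" "Bloody Mary" )
theCall='some_additional_text_'
commentOn='## You_need_'

# Prefix every element with $theCall and join the results with '|'.
pattern=$(IFS='|'; echo "${someArray[*]/#/$theCall}")

# Prefix every line that does NOT match the alternation.
result=$(sed -E "/$pattern/!s/^/$commentOn/" <<'EOF'
some_additional_text_breakfast
some_additional_text_bacon
some_additional_text_scrambled eggs
some_additional_text_Bloody Mary
some_additional_text_orange juice
some_additional_text_breakfast
EOF
)
printf '%s\n' "$result"
```

This assumes the array elements contain no characters that are special in a regular expression or in the sed s command; anything like / or . would need escaping first.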
Try replacing the inner for-loop with:
PROG=$(printf '%s\n' "${COMMENT[@]}" | while read comment ; do
/bin/echo -n '$0 !~ /'"$comment"'$/ && '
done
echo '1 { printf commentOn } ; { print }')
awk -v commentOn="$commentOn" "$PROG" $itsSaturday > $TEMPFILE && mv $TEMPFILE $itsSaturday
On each file, this creates an awk program that does the work.

Store the output of find command in an array [duplicate]

How do I put the result of find $1 into an array?
In for loop:
for /f "delims=/" %%G in ('find $1') do %%G | cut -d\/ -f6-
I want to cry.
In bash:
file_list=()
while IFS= read -d $'\0' -r file ; do
file_list=("${file_list[@]}" "$file")
done < <(find "$1" -print0)
echo "${file_list[@]}"
file_list is now an array containing the results of find "$1".
What's special about "field 6"? It's not clear what you were attempting to do with your cut command.
Do you want to cut each file after the 6th directory?
for file in "${file_list[@]}" ; do
echo "$file" | cut -d/ -f6-
done
But why "field 6"? Can I presume that you actually want to return just the last element of the path?
for file in "${file_list[@]}" ; do
echo "${file##*/}"
done
Or even
echo "${file_list[@]##*/}"
Which will give you the last path element for each path in the array. You could even do something with the result
for file in "${file_list[@]##*/}" ; do
echo "$file"
done
Explanation of the bash program elements:
(One should probably use the builtin readarray instead)
find "$1" -print0
Find stuff and 'print the full file name on the standard output, followed by a null character'. This is important as we will split that output by the null character later.
<(find "$1" -print0)
"Process Substitution" : The output of the find subprocess is read in via a FIFO (i.e. the output of the find subprocess behaves like a file here)
while ...
done < <(find "$1" -print0)
The output of the find subprocess is read by the while command via <
IFS= read -d $'\0' -r file
This is the while condition:
read
Read one line of input (from the find command). The return value of read is 0 unless EOF is encountered, at which point the while loop exits.
-d $'\0'
...taking as delimiter the null character (see QUOTING in bash manpage). Which is done because we used the null character using -print0 earlier.
-r
backslash is not considered an escape character as it may be part of the filename
file
Result (first word actually, which is unique here) is put into variable file
IFS=
The command is run with IFS, the special variable which contains the characters on which read splits input into words, set to the empty string, because we don't want to split.
And inside the loop:
file_list=("${file_list[@]}" "$file")
Inside the loop, the file_list array is just grown by $file, suitably quoted.
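As for the remark about readarray: on a recent enough Bash the whole loop can be replaced by one call. A sketch, assuming bash 4.4 or newer (that is where mapfile/readarray gained the -d option) and using a throwaway temporary directory with two invented file names in place of "$1":

```shell
#!/bin/bash
# Throwaway directory so the example does not depend on any real path.
dir=$(mktemp -d)
touch "$dir/plain.txt" "$dir/with space.txt"

# -d '' splits the input on NUL bytes, -t drops the delimiter from each entry.
mapfile -d '' -t file_list < <(find "$dir" -type f -print0)

echo "found ${#file_list[@]} files"
printf '%s\n' "${file_list[@]##*/}"      # last path element of each entry

rm -rf "$dir"
```

On older versions, the while loop above still works, and file_list+=("$file") is a shorter way to append inside it.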
arrayname=( $(find $1) )
I don't understand your loop question? If you look how to work with that array then in bash you can loop through all array elements like this:
for element in $(seq 0 $((${#arrayname[@]} - 1)))
do
echo "${arrayname[$element]}"
done
This is probably not 100% foolproof, but it will probably work 99% of the time (I used the GNU utilities; the BSD utilities won't work without modifications; also, this was done using an ext4 filesystem):
declare -a BASH_ARRAY_VARIABLE=$(find <path> <other options> -print0 | sed -e 's/\x0$//' | awk -F'\0' 'BEGIN { printf "("; } { for (i = 1; i <= NF; i++) { printf "%c"gensub(/"/, "\\\\\"", "g", $i)"%c ", 34, 34; } } END { printf ")"; }')
Then you would iterate over it like so:
for FIND_PATH in "${BASH_ARRAY_VARIABLE[@]}"; do echo "$FIND_PATH"; done
Make sure to enclose $FIND_PATH inside double-quotes when working with the path.
Here's a simpler pipeless version, based on the version of user2618594
declare -a names=$(echo "("; find <path> <other options> -printf '"%p" '; echo ")")
for nm in "${names[@]}"
do
echo "$nm"
done
To loop through a find, you can simply use find:
for file in "`find "$1"`"; do
echo "$file" | cut -d/ -f6-
done
It was what I got from your question.
