I have two files,
A
john 1 2 3 4 5 6 7
Ely 10 9 9 9 9 9 9
Maria 3 5 7 9 2 1 4
Rox 10 10 10 10 10 10 10
B
john 7.5
Ely 4.5
Maria 3,7
Rox 8.5
What I want to do is create another file containing only the persons whose average in file A is greater than or equal to 8.5 and whose mark in file B is also greater than or equal to 8.5. In my example, file C would contain only Rox, because only she fulfils both criteria.
This is what I have so far:
#shell program
echo "Fiserul are numele $1"
filename=$1
filename2=$2
echo "">temp.txt
touch results
compara="8.5"
cat $filename | while read -r line
do
nota=0
media=0
echo " $line"
rem=$( echo "$line"| cut -f 2- -d ' ')
for word in $rem
do
echo "$word"
nota=$(($nota+$word))
echo "Nota=$nota"
done
media=$(($nota / 7))
if [ "$(echo $media '>=' $compara | bc -l)" -eq 1 ];
then
nume=$( echo "$line"| cut -f 1 -d ' ')
echo "$nume $media" >> temp.txt
fi
echo "Media : $media"
done
cat $filename2 | while read -r line
do
So temp.txt now holds the persons who fulfil the criterion for file A, but my question is: how can I compare them with the persons from filename2 and create "results" from them?
I've tried two while loops, but I get an error. Could someone please help?
Thanks!
If you really want to read two files at the same time (which doesn't appear to be your actual question -- join is indeed the right tool for what you're doing), you can open them on different FDs:
while IFS= read -r -u 4 line1 && IFS= read -r -u 5 line2; do
echo "Line from first file: $line1"
echo "Line from second file: $line2"
done 4<file1 5<file2
Use the join command to combine A and B into a single file C:
$ join A.txt B.txt
john 1 2 3 4 5 6 7 7.5
Ely 10 9 9 9 9 9 9 4.5
Maria 3 5 7 9 2 1 4 3,7
Rox 10 10 10 10 10 10 10 8.5
It should be simple to modify your current script to process the data in this form.
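As a rough illustration of processing the joined data, here is a minimal sketch. It assumes whitespace-separated files like A and B above, with the mark from B ending up as the last field after the join. The helper name passing_both and the temporary file are made up for the example; 8.5 is the threshold from the question. Note that join expects both inputs sorted on the join field, so the sketch sorts them first.

```shell
# Sketch only: passing_both is a hypothetical helper, not part of the answers above.
# Usage: passing_both A B > results
passing_both() {
    tmp=$(mktemp) || return 1
    # join needs both inputs sorted on the join field, using the same collation.
    LC_ALL=C sort "$2" > "$tmp"
    LC_ALL=C sort "$1" | LC_ALL=C join - "$tmp" |
    awk -v min=8.5 '{
        sum = 0
        for (i = 2; i < NF; i++) sum += $i     # grades that came from file A
        # The last field is the mark that came from file B.
        if (NF > 2 && sum / (NF - 2) >= min && $NF + 0 >= min) print $1
    }'
    rm -f "$tmp"
}
```

awk does the floating-point average itself, so neither bc nor the shell's integer-only $(( )) arithmetic is needed.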
Let's say we have a shell variable $x containing a space separated list of numbers from 1 to 30:
$ x=$(for i in {1..30}; do echo -n "$i "; done)
$ echo $x
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
We can print the first three input record fields with AWK like this:
$ echo $x | awk '{print $1 " " $2 " " $3}'
1 2 3
How can we print all the fields starting from the Nth field with AWK? E.g.
4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
EDIT: I can use cut, sed etc. to do the same but in this case I'd like to know how to do this with AWK.
Converting my comment to answer so that solution is easy to find for future visitors.
You may use this awk:
awk '{for (i=3; i<=NF; ++i) printf "%s", $i (i<NF?OFS:ORS)}' file
or pass start position as argument:
awk -v n=3 '{for (i=n; i<=NF; ++i) printf "%s", $i (i<NF?OFS:ORS)}' file
Version 4: Shortest is probably using sub to cut off the first three fields and their separators:
$ echo $x | awk 'sub(/^ *([^ ]+ +){3}/,"")'
Output:
4 5 6 7 8 9 ...
This will, however, preserve all space after $4:
$ echo "1 2 3 4   5" | awk 'sub(/^ *([^ ]+ +){3}/,"")'
4   5
so if you wanted the space squeezed, you'd need to, for example:
$ echo "1 2 3 4   5" | awk 'sub(/^ *([^ ]+ +){3}/,"") && $1=$1'
4 5
with the exception that if there are only 4 fields and the 4th field happens to be a 0:
$ echo "1 2 3 0" | awk 'sub(/^ *([^ ]+ +){3}/,"")&&$1=$1'
$ [no output]
in which case you'd need to:
$ echo "1 2 3 0" | awk 'sub(/^ *([^ ]+ +){3}/,"") && ($1=$1) || 1'
0
Version 1: cut is better suited for the job:
$ cut -d' ' -f 4- <<<$x
Version 2: Using awk you could:
$ echo -n $x | awk -v RS=' ' -v ORS=' ' 'NR>=4;END{printf "\n"}'
Version 3: If you want to preserve those varying amounts of space, using GNU awk you could use split's fourth parameter seps:
$ echo "1 2 3 4 5 6 7" |
gawk '{
n=split($0,a,FS,seps) # actual separators goes to seps
for(i=4;i<=n;i++) # loop from 4th
printf "%s%s",a[i],(i==n?RS:seps[i]) # get fields from arrays
}'
Here is one more approach: append each field to a variable and print the variable once all the fields have been read. Set n= to the number of the first field you want in the output.
echo "$x" |
awk -v n=3 '{val="";for(i=n; i<=NF; i++){val=(val?val OFS:"")$i};print val}'
With GNU awk, you can use the join function which has been a built-in include since gawk 4.1:
x=$(seq 30 | tr '\n' ' ')
echo "$x" | gawk '@include "join"
{split($0, arr)
print join(arr, 4, length(arr), "|")}
'
4|5|6|7|8|9|10|11|12|13|14|15|16|17|18|19|20|21|22|23|24|25|26|27|28|29|30
(Shown here with a '|' instead of a ' ' for clarity...)
Alternative way of including join:
echo "$x" | gawk -i join '{split($0, arr); print join(arr, 4, length(arr), "|")}'
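Using GNU awk and gensub:
echo $x | gawk '{ print gensub(/^([[:digit:]]+[[:space:]]){3}(.*$)/, "\\2", 1) }'
gensub matches the first three fields as one group and the rest of the line as a second group, then returns only the second group. The third argument says which match to replace (1 here, since the regular expression is anchored at the start of the line).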
I have a text file I am trying to read into an Array using Bash. Here are the contents of text file:
Vol12
Vol0
Vol2
Vol21
I want to extract the number from each line and present a menu in which the user selects an entry by entering that number, like this:
12 - Vol12
0 - Vol0
2 - Vol2
21 - Vol21
The user would enter 12 to select Vol12 or 2 to select Vol2, and the selection would then be used for further action.
I have been searching for how to do this; here is what I have so far:
Vol="/Users/alex/Downloads/file.txt"
options=($(tail -n+1 $Vol | awk '{print $1}' | sort | uniq) All Quit)
for (( i = 0; i < ${#options[@]}; i++ )); do
echo "$i - ${options[i]}"
done
echo -e "Enter number corresponding to the Volume snapshot you want to restore: \n"
read vol
}
This is the output I get with the above code:
OPTIONS MENU
0 - Vol12
1 - Vol0
2 - Vol2
3 - Vol21
4 - All
5 - Quit
Enter number corresponding to the Volume snapshot you want to restore:
How can I get the output to look like the following, and be able to select 12 or 0?
12 - Vol12
0 - Vol0
2 - Vol2
21 - Vol21
Please help
You can use an indexed array, with the number extracted from each line as the index (bash indexed arrays may be sparse):
#!/bin/bash
Vol="/Users/alex/Downloads/file.txt"
declare -a arr
#Loop reads each line of the file
while IFS= read -r line; do
n=${line##*[!0-9]} #Gets the number at the end of this line
arr[$n]=$line #Uses it as the key to the array, the content being the whole line
echo "$n - $line"
done < "$Vol"
read -p "Select one from above. " vol
echo "You selected ${arr[vol]}."
For example (I saved it as sh.sh):
$ ./sh.sh
12 - Vol12
0 - Vol0
2 - Vol2
21 - Vol21
Select one from above. 2
You selected Vol2.
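If the user types a number that is not in the list, the lookup above silently prints an empty selection. The following is a minimal sketch of one way to validate the input, assuming bash and the same file layout. The function name choose_volume and its two arguments are hypothetical, introduced only for this illustration.

```shell
#!/usr/bin/env bash
# Sketch only: choose_volume is a hypothetical helper, not part of the answer above.
# Usage: choose_volume FILE SELECTION
choose_volume() {
    local file=$1 vol=$2 line n key
    local -a arr=()
    while IFS= read -r line; do
        n=${line##*[!0-9]}              # trailing digits of the line
        [[ -n $n ]] || continue         # skip lines without a trailing number
        arr[$((10#$n))]=$line           # 10# forces base 10, so "08" is not read as octal
    done < "$file"
    # Reject anything that is not a plain non-negative integer.
    if ! [[ $vol =~ ^[0-9]+$ ]]; then
        echo "Invalid selection: $vol" >&2
        return 1
    fi
    key=$((10#$vol))
    # ${arr[key]+set} expands to "set" only if that index exists.
    if [[ -n ${arr[key]+set} ]]; then
        echo "You selected ${arr[key]}."
    else
        echo "Invalid selection: $vol" >&2
        return 1
    fi
}
```

For example, `choose_volume /Users/alex/Downloads/file.txt 2` would report Vol2 for the file shown in the question, while an entry such as 3 returns a non-zero status.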
I would like to use the "sed" command to delete two consecutive lines from a file.
I can delete a single line using the following syntax, where
the variable "index" holds the line number:
sed -i "${index}d" "$PWD$DEBUG_DIR$DEBUG_MENU"
Since I want to delete consecutive lines, I tested this
syntax, which supposedly decrements / increments the variable, but I am getting a "bad substitution" error.
Addendum
I am not sure where to post this, but while troubleshooting
this issue I discovered that neither of these works as expected:
(( index++)) nor (( index-- ))
Running the single-line sed deletion with "index" twice in a row works and resolves a few issues.
#delete command line
sed -i "${index}d" "$PWD$DEBUG_DIR$DEBUG_MENU"
#delete description line
sed -i "${index}d" "$PWD$DEBUG_DIR$DEBUG_MENU"
sed is the right tool for doing s/old/new, and that is all. What you're trying to do isn't that, so why are you trying to coerce sed into doing it when far more appropriate tools can do the job far more easily?
The right way to do what your script:
#retrieve first matching line number
index=$(sed -n "/$choice/=" "$PWD$DEBUG_DIR$DEBUG_MENU")
#delete matching line plus next line from file
sed -i "/$index[1], (( $index[1]++))/"
"$PWD$DEBUG_DIR$DEBUG_MENU"
seems to be trying to do is this:
awk -v choice="$choice" '$0~choice{skip=2} !(skip&&skip--)' "$PWD$DEBUG_DIR$DEBUG_MENU"
For example:
$ seq 20 | awk -v choice="4" '$0~choice{skip=2} !(skip&&skip--)'
1
2
3
6
7
8
9
10
11
12
13
16
17
18
19
20
if you only want to delete the first match:
$ seq 20 | awk -v choice="4" '$0~choice{skip=2;cnt++} !(cnt==1 && skip && skip--)'
1
2
3
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
or the 2nd:
$ seq 20 | awk -v choice="4" '$0~choice{skip=2;cnt++} !(cnt==2 && skip && skip--)'
1
2
3
4
5
6
7
8
9
10
11
12
13
16
17
18
19
20
and to skip 5 lines instead of 2:
$ seq 20 | awk -v choice="4" '$0~choice{skip=5} !(skip&&skip--)'
1
2
3
9
10
11
12
13
19
20
Just use the right tool for the right job, don't go digging holes to plant trees with a teaspoon.
If you just want the first one, then quit when you see it:
sed -n "/$choice/ {=;q}" file
But you look like you're processing this file multiple times. There must be a simpler way to do it, if you can describe your over-arching goal.
For example, if you just want to remove the matched line and the next line, but only the first time, you can use awk: here we see "4" and "5" are gone, but "14" and "15" remain:
$ seq 20 | awk '/4/ && !seen {getline; seen++; next} 1'
1
2
3
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
With GNU sed, you can use +1 as the second address in an address range:
index=4
seq 10 | sed "$index,+1 d"
Or, if you want to use bash/ksh arithmetic expansion, use the post-increment operator:
seq 10 | sed "$((index++)),$index d"
Note that, because of the pipeline, the sed command (including the arithmetic expansion on its command line) runs in a subshell: index is incremented to 5 there, but after the pipeline ends it still has the value 4 in the current shell.
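If GNU sed is not available, or the side effect of index++ is unwanted, the end of the range can be computed with plain arithmetic expansion. This is a minimal sketch; the function name delete_pair is made up for the example, and it filters stdin to stdout rather than editing a file in place.

```shell
# Sketch only: delete_pair is a hypothetical helper. Reads stdin, writes stdout.
# Usage: delete_pair LINE_NUMBER
delete_pair() {
    index=$1
    # $((index + 1)) is evaluated by the shell and leaves index unchanged.
    sed "${index},$((index + 1))d"
}
```

With index=4, `seq 10 | delete_pair "$index"` drops lines 4 and 5, just like the "$index,+1 d" form.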
I'm trying to convert a continuous stream of (random) data into comma-separated values split across lines. I'm converting the continuous data into CSV, and then after some number of columns (let's say 80) I need to insert a newline and repeat the process.
Here's what I did for csv:
gawk '$1=$1' FIELDWIDTHS='4 5 7 1 9 5 10 6 8 3 2 2 8 4 8 8 4 6 9 1' OFS=, tmp
'tmp' is the file with following data:
"ZaOAkHEnOsBmD5yZk8cNLC26rIFGSLpzuGHtZgb4VUP4x1Pd21bukeK6wUYNueQQMglvExbnjEaHuoxU0b7Dcne5Y4JP332RzgiI3ZDgHOzm0gjDLVat8au7uckM3t60nqFX0Cy93jXZ5T0IaQ4fw2JfdNF1PbqxDxXv7UGiyysFJ8z16TmYQ9zfBRCZvZirIyRboHNEGgMUFZ18y8XXCGrbpeL0WLstzpSuXetmo47G2xPkDLDcFA6cdM4WAFNpoC2ztspY7YyVsoMZdU7D3u3Lm6dDcKuJKdTV6600GkbLuvAamKGyzMtoqW3liI3ybdTNR9KLz2l7KTjUiGgc3Eci5wnhIosAUMkcSQVxFrZdJ9MVyj6duXAk0CJoRvHYuyfdAr7vjlwjkLkYPtFvAZp6wK3dfetoh3ZmhJhUxqzuxOLDQ9FYcvz64iuIUbgXVZoRnpRoNGw7j3fCwyaqCi..."
I'm generating the continuous sequence from /dev/urandom. I can't figure out how to repeat the gawk step, adding a newline character after the last column each time.
I got it actually. A simple for loop did that.
Here's my whole code:
for i in $(seq 10)
do
tr -dc A-Za-z0-9 < /dev/urandom | head -c 100 > tmp
gawk '$1=$1' FIELDWIDTHS='4 5 7 1 9 5 10 6 8 3 2 2 8 4 8 8 4 6 9 1' OFS=, tmp >> tmp1
done
Any optimizations would be appreciated.
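One possible optimization, offered as a sketch rather than a definitive answer: read /dev/urandom once, cut the stream into fixed-width rows with fold, and run gawk a single time, which avoids both the loop and the temporary file. The function name random_csv is made up for the example. The sketch also assumes 110 characters per row, because the 20 widths in FIELDWIDTHS add up to 110, so the 100 characters taken by head -c 100 above do not fill the last two fields.

```shell
# Sketch only: random_csv is a hypothetical helper; it needs gawk for FIELDWIDTHS.
# Usage: random_csv ROWS
random_csv() {
    rows=$1
    # LC_ALL=C keeps tr from choking on bytes that are invalid in the current locale.
    LC_ALL=C tr -dc 'A-Za-z0-9' < /dev/urandom |
        head -c "$((rows * 110))" |
        fold -w 110 |
        gawk -v FIELDWIDTHS='4 5 7 1 9 5 10 6 8 3 2 2 8 4 8 8 4 6 9 1' \
             -v OFS=, '{ $1 = $1; print }'
}
```

The action form `{ $1 = $1; print }` is used instead of the pattern `$1=$1` because the pattern form skips a row whenever the first field happens to evaluate to zero, for example "0000".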
I have the following script which uses awk to match fields with user input
NB=$#
FILE=myfile
#GET INPUT
if [ $NB -eq 1 ]
then
A=`awk -F "\t" -v town="$1" 'tolower($3) ~ tolower(town) {print NR}' $FILE`
fi
If I print the output, it reads :
7188 24369 77205 101441
Which is what I expected. Then if I do the following:
IFS=' '
array=($A)
echo ${#array[@]}
I actually get a length of 1 (?). Furthermore, if I try:
for x in $array
do
echo $x
done
It actually prints out :
7188
24369
77205
101441
How can I get it to return a length of 4? I don't understand how the for...in loop works if there's only 1 element.
EDIT :
echo $A | od -c before I create the array is:
0000000 7 1 8 8 2 4 3 6 9 7 7 2 0 5
0000020 1 0 1 4 4 1 \n
0000030
echo $A | od -c after I create the array is:
0000000 7 1 8 8 \n 2 4 3 6 9 \n 7 7 2 0 5
0000020 \n 1 0 1 4 4 1 \n
0000030
This happens because the output returned by awk is newline (\n) delimited rather than space delimited. If you set IFS like this instead:
IFS=$'\n' # newline between quotes
then the array length will be 4, as you expect.
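As an alternative that does not depend on IFS at all, bash 4 and later can read newline-delimited output straight into an array with mapfile. The sketch below uses a printf stand-in for the awk command from the question. It also shows the loop written over "${array[@]}": in the question, $array means only ${array[0]}, which is why the loop there printed the single multi-line element.

```shell
#!/usr/bin/env bash
# Stand-in for the awk output in the question: one line number per line.
A=$(printf '%s\n' 7188 24369 77205 101441)

# mapfile (bash 4+) stores one input line per array element, regardless of IFS.
mapfile -t array <<< "$A"

echo "${#array[@]}"           # number of elements
for x in "${array[@]}"; do    # iterate over every element, not just element 0
    echo "$x"
done
```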