Sample input:
a 54 65 43
b 45 12 98
c 99 0 12
d 3 23 0
Sample output:
c,d
Basically I want to check whether each line contains a value of zero; if it does, print the index (a, b, c, d).
My code:
for(i=1;i<=NF;i++)if(i==0){print$1}
I got a syntax error.
Thanks.
Another approach:
$ awk '/\y0\y/{print $1}' file
c
d
\y is the word-boundary operator; it might be gawk-only.
The code needs a set of braces.
awk '{ for(i=1;i<=NF;i++)if($i==0) print $1}' filename
(The print doesn't need braces so I took those out.)
If the first field doesn't ever contain a number, maybe start the loop from 2.
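A minimal sketch of that variant, with the sample data from the question written inline; the break stops scanning a line after the first 0 so each index prints at most once:

```shell
# Sample data from the question, written to a temp file
printf 'a 54 65 43\nb 45 12 98\nc 99 0 12\nd 3 23 0\n' > file
# Start at i=2 so the label in field 1 is never compared against 0
awk '{ for (i=2; i<=NF; i++) if ($i == 0) { print $1; break } }' file
```

This prints c and d, one per line, matching the earlier answers.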
The general form of an Awk script is a sequence of
condition { action }
pairs, where the latter needs braces around it. In the absence of a condition, an action is taken on each line, unconditionally.
To make your code work, you need to change it to:
$ awk '{for(i=1;i<=NF;i++)if($i==0) print $1}' file
c
d
You need to put the code inside a block ({} pair).
You have to use $i instead of i in the if condition, $i means the ith column.
Although it's not needed here, it's better to add a space between the command and its parameter (print $1).
And it's better to improve it a little bit:
awk '{for(i=1;i<=NF;i++)if($i==0) {print $1;next}}' file
Add next to avoid printing $1 multiple times when there is more than one 0 in the line.
Given the columns are space separated, you can do it this way too:
awk '/( |^)0( |$)/{print $1}' file
This one does not require GNU awk.
/( |^)0( |$)/ is a RegEx, and in the command it's short for $0 ~ /( |^)0( |$)/.
^ means line beginnings, $ line endings here.
Related
I have a file with n rows and 4 columns, and I want to read the content of the 2nd and 3rd columns, row by row. I made this
awk 'NR == 2 {print $2" "$3}' coords.txt
which works for the second row, for example. However, I'd like to include that code inside a loop, so I can go row by row of coords.txt, instead of NR == 2 I'd like to use something like NR == i while going over different values of i.
I'll try to be clearer. I don't want to extract the 2nd and 3rd columns of coords.txt. I want to use every element independently. For example, I'd like to be able to implement the following code
for (i=1; i<=20; i+=1)
awk 'NR == i {print $2" "$3}' coords.txt > auxfile
func(auxfile)
end
where func represents anything I want to do with the value of the 2nd and 3rd columns of each row.
I'm using SPP, which is a mix between FORTRAN and C.
How could I do this? Thank you
It is of course inefficient to invoke awk 20 times. You'd want to push the logic into awk so you only need to parse the file once.
However, one method to pass a shell variable to awk is with the -v option:
for ((i=1; i<20; i+=2)) # for example
do
awk -v line="$i" 'NR == line {print $2, $3}' file
done
Here i is the shell variable, and line is the awk variable.
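As noted above, the faster route is to let awk select the rows itself in a single pass. A minimal sketch with stand-in data (coords.txt and the 20-row limit are assumptions taken from the question):

```shell
# Stand-in for coords.txt: n rows, 4 columns
printf '1 10 11 12\n2 20 21 22\n3 30 31 32\n' > coords.txt
# One awk pass prints columns 2 and 3 of the first 20 rows
awk 'NR <= 20 { print $2, $3 }' coords.txt
```

This reads the file once instead of once per row.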
Something like this should work; no shell loop is needed.
awk 'BEGIN {f="aux.aux"}
NR<21 {close(f); print $2,$3 > f; system("./mycmd2 "f)}' file
will call the command with the temp file name for each of the first 20 lines; the file is overwritten on each call. Of course, if your function takes arguments or reads stdin instead of a file name, there are easier solutions.
Here ./mycmd2 is an executable which takes a filename as an argument. Not sure how you call your function, but this is generic enough...
Note also that there is no error handling for the external calls.
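One way to add that missing error handling is to test the return value of system(); in this hedged sketch, cat stands in for ./mycmd2 so the example is self-contained:

```shell
printf 'r1 10 11\nr2 20 21\n' > file            # stand-in input
awk 'BEGIN { f = "aux.aux" }
     NR <= 20 {
         print $2, $3 > f; close(f)             # rewrite the temp file
         if (system("cat " f) != 0) {           # "cat" stands in for ./mycmd2
             print "command failed on line " NR > "/dev/stderr"
             exit 1
         }
     }' file
```

system() returns the command's exit status, so a nonzero value can abort the whole run.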
The hideous system()-only way in awk would be something like
system("printf \047%s\\n\047 \047" $2 "\047 \047" $3 "\047 | func \047/dev/stdin\047; ");
If the func() OP mentioned can be called directly by GNU parallel or xargs, and can take the values of $2 and $3 as its $1 and $2, then OP can even make it all multi-threaded, like
{mawk/mawk2/gawk} 'BEGIN { OFS=ORS="\0"; } { print $2, $3; } (NR==20) { exit }' file \
| { parallel -0 -N 2 -j 3 func | or | xargs -0 -n 2 -P 3 func }
I want to multiply all entries in an array by numbers like 3.17 * 10^-7, but Bash can't do that. I tried awk and bc, but it doesn't work. I would be obliged if someone could help me.
Input data example (overall 4000 datafile):
TecN210500-0100.plt
TecN210500-0200.plt
TecN210500-0300.plt
TecN210500-0400.plt
......
Here is my code:
#!/bin/bash
ZS=($(find . -name "*.plt"))
i=1
Variable=$(awk "BEGIN{print 10 ** -7}")
Solutiontime=$(awk "BEGIN{print 3.17 * $Variable}")
for Dataname in "${ZS[@]}"
do
Cut=${Dataname:13}
Timesteps=${Cut:0:${#Cut}-4}
Array[i]=$Timesteps
i=$((i++))
p=$((i++))
done
Amount=$p
for ((i=1;i<10;i++))
do
Array[i]=${i}00
done
for (($i=1;i<$Amount+1;i++))
do
Array[i]=$(awk "BEGIN{print ${Array[i]} * $Solutiontime}")
done
Array[0]=Solutiontime
First loop:
Extract, e.g., the "0100".
Second loop:
"Delete" the leading zero -> e.g. "100"
Last loop:
Multiply by the time step -> e.g. "100 * 3.17*10^-7"
Do a little parameter expansion trimming on the filename, and then let awk do the math for you.
#!/bin/bash
for f in *.plt; do
num=${f##*-} # remove the stuff before the final -
num=${num%.*} # remove the stuff after the last .
num=${num#0} # remove the leading zero
awk "BEGIN {print $num * 3.17 * 10**-7}"
done
Or, done entirely with awk:
#!/bin/bash
for f in *.plt; do
awk -v f="$f" 'BEGIN {gsub(/^TecN[[:digit:]]+-0?|\.plt$/, "", f); print f * 3.17 * 10**-7}'
done
awk to the rescue!
awk 'BEGIN{print 3.17 * 10^-7 }'
3.17e-07
iteration 1
awk -F'[-.]' '{printf "%s %e\n",substr($1,5),$2*3.17*10^-7}' file
210500 3.170000e-05
210500 6.340000e-05
210500 9.510000e-05
210500 1.268000e-04
for the posted file names used as input.
iteration 2
If you need just the computed numbers, simply drop the first field
awk -F'[-.]' '{printf "%e\n",$2*3.17*10^-7}' file
3.170000e-05
6.340000e-05
9.510000e-05
1.268000e-04
This will be the output of the script. I strongly suggest moving whatever logic you have into the awk script rather than working at the shell level with the array.
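For instance, the whole 4000-file computation can run in one awk pass over the file names (this sketch assumes the TecN...-NNNN.plt layout shown above):

```shell
# Feed the file names straight to awk; -F'[-.]' splits out the 0100 part as $2,
# and awk's numeric conversion drops the leading zero automatically
printf '%s\n' TecN210500-0100.plt TecN210500-0200.plt |
awk -F'[-.]' '{ printf "%e\n", $2 * 3.17 * 10^-7 }'
```

In the real script the printf list would be replaced by the find output.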
I need to delete leading 0s only from a string. I found that there is no in-built function like LTRIM as in C.
I'm thinking of the below AWK script to do that:
awk -F"," 'BEGIN { a[$1] }
for (v in a) {
{if ($v == 0) {delete a[$v]; print a;} else exit;}
}'
But I guess I'm not declaring the array correctly, and it throws an error. Sorry, I'm new to AWK programming. Can you please help me put it together?
Using awk, as requested:
#!/usr/bin/awk -f
/^0$/ { print; next; }
/^0*[^0-9]/ { print; next; }
/^0/ { sub("^0+", "", $0); print; next; }
{ print $0; }
This provides for not trimming a plain "0" to an empty string, as well as avoiding the (probably) unwanted trimming of non-numeric fields. If the latter is actually desired behavior, the second pattern/action can be commented out. In either case, substitution is the way to go, since adding a number to a non-numeric field will generate an error.
Input:
0
0x
0000x
00012
Output:
0
0x
0000x
12
Output trimming non-numeric fields:
0
x
x
12
Here is a somewhat generic ltrim function that can be called as ltrim(s) or ltrim(s,c), where c is the character to be trimmed (assuming it is not a special regex character) and where c defaults to " ":
function ltrim(s,c) {if (c==""){c=" "} sub("^" c "*","",s); return s}
This can be called with 0, e.g. ltrim($0,0)
NOTE:
This will work for some special characters (e.g. "*"), but if you want to trim special characters, it would probably be simplest to call the appropriate sub() function directly.
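For example, a leading run of a regex-special character such as '*' is easiest to trim with an explicit, escaped sub() call:

```shell
# ltrim("**text", "*") would misbehave, since "*" is special in a regex;
# escaping it in a direct sub() call works:
echo '**text' | awk '{ sub(/^\*+/, "") } 1'
```

This prints just "text".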
Based on other recent questions you posted, you appear to be struggling with the basics of the awk language.
I will not attempt to answer your original question, but instead try to get you on the way in your investigation of the awk language.
It is true that the syntax of awk expressions is similar to c. However there are some important differences.
I would recommend that you spend some time reading a primer on awk and find some exercises. Try for instance the Gnu Awk Getting Started.
That said, there are two major differences with C that I will highlight here:
Types
Awk only uses strings and numbers -- it decides based on context whether it needs to treat input as text or as a number. In some cases
you may need to force conversion to string or to a number.
Structure
An Awk program always follows the same structure of a series of patterns, each followed by an action, enclosed in curly braces: pattern { action }:
pattern { action }
pattern { action }
.
.
.
pattern { action }
Patterns can be regular expressions or comparisons of strings or numbers.
If a pattern evaluates as true, the associated action is executed.
A missing pattern matches every line, so its action runs unconditionally. The { action } part is also optional; when omitted, it defaults to { print }, so a bare pattern prints every line that matches it.
Some patterns like BEGIN and END get special treatment. Before reading stdin or opening any files, awk will first collect all BEGIN statements in the program and execute their associated actions in order.
It will then start processing stdin or any files given and subject each line to all other pattern/action pairs in order.
Once all input is exhausted, all files are closed, and awk will process the actions belonging to all END patterns, again in order of appearance.
You can use BEGIN action to initialize variables. END actions are typically used to report summaries.
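A minimal sketch of that structure, summing the first column of its input:

```shell
# BEGIN initializes, the middle rule runs once per line, END reports the summary
printf '1\n2\n3\n' |
awk 'BEGIN { sum = 0 } { sum += $1 } END { print "total:", sum }'
```

This prints "total: 6".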
A warning: quite often we see people trying to pass data from the shell by partially unquoting the awk script, or by using double quotes. Don't do this; instead, use awk's -v option to pass parameters into the program:
a="two"
b="strings"
awk -v a="$a" \
-v b="$b" \
'BEGIN {
print a, b
}'
two strings
You can force awk to convert the field to a number, and leading zeros will be eliminated by default.
e.g.
$ echo 0001 | awk '{print $1+0}'
1
If I understand correctly, and you just want to trim the leading '0's from a value in bash, you can use sed to provide precise regex control, or a simple loop works well -- and eliminates spawning a subshell with the external utility call. For example:
var=00104
Using sed:
$ echo "$var" | sed 's/^0*//'
104
or use a here-string to eliminate the pipe and the additional subshell (bash only)
$ sed 's/^0*//' <<<$var
104
Using a simple loop with string indexes:
while [ "${var:0:1}" = '0' ]; do
var="${var:1}"
done
var will contain 104 after 2 iterations of the loop.
var1=$(echo $getDate | awk '{print $1} {print $2}')
var2=$(echo $getDate | awk '{print $3} {print $4}')
var3=$(echo $getDate | awk '{print $5} {print $6}')
Instead of repeating like the code above, I need to:
loop the same command
increment the values ({print $1} {print $2})
store the value in an array
I was doing something like below but I am stuck maybe someone can help me please:
COMMAND=`find $locationA -type f | wc -l`
getDate=$(find $locationA -type f | xargs ls -lrt | awk '{print $6} {print $7}')
a=1
b=2
for i in $COMMAND
do
i=$(echo $getDate | awk '{print $a} {print $b}')
myarray+=('$i')
a=$((a+1))
b=$((b+1))
done
PS - using ksh
Problem: $COMMAND stores the number of files found in $locationA. I need to loop through the amount of files found and store their dates in an array.
I don't get the meaning of your example code (what is the 'for' loop supposed to do? what is the content of the variable COMMAND?), but in your question you ask to store something in an array, while in the code you wish to simplify you don't use an array but simple variables (var1, var2, ...).
If I understand your requirement correctly, your variable getDate contains a string of several words, which are separated by spaces, and you want to assign the first two words to var1, the following two words to var2, and so on. Is this correct?
Now the edited code is at least a bit clearer, though I still don't understand why you use i as a loop variable and then overwrite it in the first statement inside the loop.
However, a few comments:
If you push '$i' into your array, you will get a literal '$' sign followed by the letter 'i'. To add the variable i containing two numbers, you need double quotes ("$i").
I don't understand why you want to loop over the content of the variable COMMAND. This variable will always hold a single number, which means the loop will execute exactly once.
You could use a counting loop, incrementing the loop variable by 2 on each iteration. You would have to precalculate the number of iterations beforehand.
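A sketch of that counting-loop idea, using the positional parameters so it works in ksh as well as bash (the getDate value here is stand-in data):

```shell
getDate="06 Jan 07 Feb 08 Mar"      # stand-in for the real date words
set -- $getDate                      # split into positional parameters
i=1
while [ $# -ge 2 ]; do               # consume two words per iteration
    myarray[i]="$1 $2"
    shift 2
    i=$((i + 1))
done
echo "${myarray[1]}"                 # first pair: 06 Jan
```

This avoids re-running awk inside the loop entirely.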
Perhaps an easier alternative, which works in bash or zsh (I did not try other shells), is to first turn your variable into an array,
tmparr=($(echo $getDate|fmt -w 1))
and then use a loop to collect pairs of its elements:
myarray=()
for ((i=0; i<${#tmparr[*]}; i+=2))
do
myarray+=("${tmparr[$i]} ${tmparr[$((i+1))]}")
done
${myarray[0]} will hold a string consisting of the first two words from getDate, etc.
This one should work on zsh, at least with newer versions:
myarray=()
echo $g|fmt -w 1|paste -s -d " \n"|while read s; do myarray+=("$s"); done
This leaves the first pair in ${myarray[1]}, etc.
It doesn't work with bash (and old zsh versions), because these shells would execute the body of the loop in a subshell.
ADDED:
On a second thought, in zsh this one would be simpler:
myarray=("${(f)$(echo $g|fmt -w 1|paste -s -d ' \n')}")
I have a file consisting of digits. Usually each line contains a single number. I would like to count the number of lines in the file that begin with the digit '0'; if there are any, I would like to do some post-processing.
Although I'm able to retrieve correctly the corresponding line numbers, the total number of retrieved lines is not correct. Below, I'm posting the code that I'm using.
linesToRemove=$(awk '/^0/ { print NR; }' ${inputFile});
# linesToRemove=$(grep -n "^0" ${inputFile} | cut -d":" -f1);
linesNr=${#linesToRemove} # <- here, the error
# linesNr=${#linesToRemove[@]} # <- here, the error
if [ "${linesNr}" -gt "0" ]; then
# do something here, e.g. remove corresponding lines.
awk -v n=$linesToRemove 'NR == n {next} {print}' ${anotherFile} > ${outputFile}
fi
Also, as for the awk-based command, how could I use a shell variable? I tried the command below, but it doesn't work correctly, since 'myIndex' is interpreted as text and not as a variable.
linesToRemove=$(awk -v myIndex="$myIndex" '/^myIndex/ { print NR;}' ${inputFile});
Given the line numbers starting with 0 found in ${inputFile}, I would like to remove the corresponding lines numbers from ${anotherFile}. An example for both ${inputFile} and ${anotherFile} is given below:
// ${inputFile}
0
1
3
0
// ${anotherFile}
2.617300e+01 5.886700e+01 -1.894697e-01 1.251225e+02
5.707397e+01 2.214040e+02 8.607959e-02 1.229114e+02
1.725900e+01 1.734360e+02 -1.298053e-01 1.250318e+02
2.177940e+01 1.249531e+02 1.538853e-01 1.527150e+02
// ${outputFile}
5.707397e+01 2.214040e+02 8.607959e-02 1.229114e+02
1.725900e+01 1.734360e+02 -1.298053e-01 1.250318e+02
In the example above, I need to delete lines 0 and 3 from ${anotherFile}, given that those lines correspond to the lines starting with 0 in ${inputFile}.
If you want to count the number of lines in the file that begin with 0, then this line is wrong.
linesToRemove=$(awk '/^0/ { print NR; }' ${inputFile});
The above says to print the line number when a line starts with 0, so your linesToRemove variable will contain all the matching line numbers, not the total number of lines. Use an END{} block to capture the total, e.g.
linesToRemove=$(awk '/^0/ {c++}END{print c}' ${inputFile});
As for your 2nd question on using a variable inside awk, use the regex match operator ~, and set your myIndex variable to include the ^ anchor
linesToRemove=$(awk -v myIndex="^$myIndex" '$0 ~ myIndex{ print NR;}' ${inputFile});
Finally, if you just want to remove the lines that start with 0, simply remove them:
awk '/^0/{next}{print $0>FILENAME}' file
If you want to remove lines from another file using what is captured in input file, here's one way
paste -d"|" inputfile anotherfile | awk '!/^0/{gsub(/^.*\|/,"");print}'
Or just one awk command
awk 'FNR==NR && /^0/{a[FNR]} NR>FNR && (!(FNR in a))' inputfile anotherfile
Crude explanation: FNR==NR && /^0/ means: while processing the first file, if the line starts with 0, put its line number into array a. NR>FNR means: while processing the next file, print the line if its line number is not in the array. See the gawk documentation for what FNR, NR, etc. mean.
I think you have to do the following to assign an array:
linesToRemove=( $(awk '/^0/ { print NR; }' ${inputFile}) )
And to get the number of elements do (as you have in a commented line):
linesNr=${#linesToRemove[@]}
To remove the lines from from the file you could do something like:
sedCmd=""
for lineNr in "${linesToRemove[@]}"; do
sedCmd="$sedCmd;${lineNr}d"
done
sed "$sedCmd" ${anotherFile} > ${outputFile}
In general if you do this:
linesToRemove=$(awk '/^0/ { print NR; }' ${inputFile});
instead of this:
linesToRemove=$(awk '/^0/ { print NR; }' ${inputFile});
linesNr=${#linesToRemove}
use this:
linesToRemove=$(awk '/^0/ { print NR; }' ${inputFile});
linesNr=$(echo $linesToRemove|awk '{print NF}')
POC :
cat temp.sh
#!/usr/bin/ksh
lines=$(awk '/^d/{print NR}' script.sh)
nooflines=$(echo $lines|awk '{print NF}')
echo $nooflines
torinoco!DBL:/oo_dgfqausr/test/dfqwrk12/vijay> temp.sh
8
torinoco!DBL:/oo_dgfqausr/test/dfqwrk12/vijay>
It greatly depends on the post-processing you are doing, but do you really need the actual count? Why not do something like this:
if grep ^0 $inputfile > /dev/null; then
# There is at least one line with a leading 0
:
fi
grep -v ^0 $inputfile | process-lines-without-leading-zero
grep ^0 $inputfile | process-lines-with-leading-zero
Or, even just:
if grep ^0 $inputfile | process-lines-with-leading-zero; then
# some post processing
:
fi
--EDIT--
Based on what you've said in your comment, I would recommend a different approach. If I understand you correctly, you want to read file a, looking for lines of the form ^0[0-9]*,
and then remove those line numbers from file b. Doing it one line at a time is pretty slow if the files get big. Just do:
cmd=$( grep -n '^0[0-9]*$' a | sed 's/:.*/d;/g' )
sed "$cmd" b
The assignment to cmd forms a sed command to delete the lines. Invoking sed on b will omit those lines. You'll need to redirect the sed output appropriately (perhaps to a temp file and then back to b, or just use 'sed -i' if you're using GNU sed).
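A self-contained sketch of that round trip, using the sample files from the question (note the -n so grep emits line numbers rather than the matched text):

```shell
printf '0\n1\n3\n0\n' > a                          # inputFile
printf 'line1\nline2\nline3\nline4\n' > b          # anotherFile
cmd=$( grep -n '^0[0-9]*$' a | sed 's/:.*/d;/g' )  # builds "1d;" and "4d;"
sed "$cmd" b > b.tmp && mv b.tmp b                 # temp file, then back to b
cat b                                              # line2 and line3 remain
```

The temp-file-and-rename step is the portable stand-in for sed -i.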
Given the large number of edits to this question, it seems easiest to start a new answer. Your problem can be solved with a simple one-liner:
$ sed "$( grep -n ^0 $inputFile | sed 's/:.*/d;/g' )" $anotherFile > $outputFile