I use a command to find strings and numbers in a file:
awk -F'[=,: ]' '{print /uid=/?$4:(/^telephoneN/)?$2:$3}' 1.txt
the output is something like
a
b
c
d
e
f
g
t
I would like to write this output to a file 2.xml:
<xml>
<name>aaaa</name>
<surname>bbbb</surname>
...
</xml>
<xml>
<name>eeee</name>
<surname>ffff</surname>
...
</xml>
I don't know how to manage the result from awk.
Could you help me please?
Thanks in advance
It would be nice to see what your real data looks like, but given that your output comes in sets of 4 values and your input shows 4 fields, here is the basic idea.
awk 'BEGIN {
RS="" # make blank line between sets of data the RecordSep
FS="\n" # make each line as a field in the rec (like $1, $2 ...)
}
{ # this is the main loop; each record set is processed here
printf("<xml>\n\t<name>%s</name>\n\t<surname>%s</surname>\n\t<Addr1>%s</Addr1>\n\t<Addr2>%s</Addr2>\n</xml>",
$1, $2, $3, $4 )
} ' 1.txt > 1.xml
Note: there should be only 1 blank line between your record sets.
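If your awk output has no blank lines between the record sets (as in the sample above), a sketch that instead buffers every four consecutive lines (assuming exactly four values per record) would be:
awk '{ a[(NR-1)%4 + 1] = $0 }    # buffer the current group of 4 lines
(NR%4)==0 { printf("<xml>\n\t<name>%s</name>\n\t<surname>%s</surname>\n\t<Addr1>%s</Addr1>\n\t<Addr2>%s</Addr2>\n</xml>\n",
a[1], a[2], a[3], a[4]) }' 1.txt > 2.xml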
I hope this helps.
I have a file with n rows and 4 columns, and I want to read the content of the 2nd and 3rd columns, row by row. I made this
awk 'NR == 2 {print $2" "$3}' coords.txt
which works for the second row, for example. However, I'd like to include that code inside a loop, so I can go row by row of coords.txt, instead of NR == 2 I'd like to use something like NR == i while going over different values of i.
I'll try to be clearer. I don't want to extract the 2nd and 3rd columns of coords.txt. I want to use every element independently. For example, I'd like to be able to implement the following code
for (i=1; i<=20; i+=1)
awk 'NR == i {print $2" "$3}' coords.txt > auxfile
func(auxfile)
end
where func represents anything I want to do with the value of the 2nd and 3rd columns of each row.
I'm using SPP, which is a mix between FORTRAN and C.
How could I do this? Thank you
It is of course inefficient to invoke awk 20 times. You'd want to push the logic into awk so you only need to parse the file once.
However, one method to pass a shell variable to awk is with the -v option:
for ((i=1; i<20; i+=2)) # for example
do
awk -v line="$i" 'NR == line {print $2, $3}' file
done
Here i is the shell variable, and line is the awk variable.
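To actually push the logic into awk as suggested above, a sketch (assuming your processing can be driven by a hypothetical command ./func that takes the two values as arguments):
awk 'NR <= 20 { print $2, $3 }' coords.txt |
while read -r x y
do
    ./func "$x" "$y"    # hypothetical command; replace with your processing
done
This parses coords.txt only once and streams each pair of values into the loop.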
Something like this should work; no shell loop is needed.
awk 'BEGIN {f="aux.aux"}
NR<21 {print $2,$3 > f; close(f); system("./mycmd2 "f)}' file
will call the command with the temp filename for each of the first 20 lines; the close(f) flushes the file so the command sees its contents, and the file is overwritten at each call. Of course, if your function takes arguments or reads stdin instead of a file name, there are easier solutions.
Here ./mycmd2 is an executable which takes a filename as an argument. Not sure how you call your function but this is generic enough...
Note also that there is no error handling for the external calls.
The hideous system()-only way in awk would be something like
system("printf \047%s\\n\047 \047" $2 "\047 \047" $3 "\047 | func \047/dev/stdin\047; ");
If the func() the OP mentioned can be called directly by GNU parallel or xargs, and can take the values of $2 and $3 as its $1 and $2, then the OP can even make it all multi-threaded, like
{mawk/mawk2/gawk} 'BEGIN { OFS = ORS = "\0"; } { print $2, $3; } (NR==20) { exit }' file \
    | parallel -0 -N 2 -j 3 func    # or: | xargs -0 -n 2 -P 3 func
I have an issue where a user gives the file, column, value, and id of the line, and I am trying to change the value on that line.
The format of the file is:
F1|F2|F3|F4|F5|F6|F7|F8
My idea is to read the file and put the values of each field in an array. Then I will find the line I want to change using if, and I will use awk:
while IFS=$'|t' read -r -a myArray
do
if [ $4 == ${myArray[0]} ]; then
echo "${myArray[1]} ${myArray[2]} ${myArray[4]}"
awk -v column="$5" -v value="$6"-F '{ ${myArray[column]} = value }'
echo "${myArray[1]} ${myArray[2]} ${myArray[4]}"
echo "${column} ${value}"
fi
done < $2
However, when I do that, nothing changes; the column and value arguments don't print anything.
Any ideas?
You didn't give much information. Assuming you want to change a specific column where field 2 is "F2", you can do it as below:
$2=="F2" is checking field 2 is matching your specific string.
$2="Hello" is assigning "Hello" to field 2
$1=$1 reassign the whole record(line)
print print out the whole record
awk -F"|" 'BEGIN{OFS="|"} ($2=="F2"){$2="Hello";$1=$1; print}' sample.csv
See my example:
$cat sample.csv
F1|F2|F3|F4|F5|F6|F7|F8
$awk -F"|" 'BEGIN{OFS="|"} ($2=="F2"){$2="Hello";$1=$1; print}' sample.csv
F1|Hello|F3|F4|F5|F6|F7|F8
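To wire in the question's arguments, a sketch (assuming, as in the original script, that $2 is the file, $4 the id to look for in field 1, $5 the column number, and $6 the new value):
awk -F'|' -v OFS='|' -v id="$4" -v col="$5" -v val="$6" \
    '$1 == id { $col = val } { print }' "$2"
In awk, $col with a numeric variable col selects that field, so no shell array is needed.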
I've got a .dat file:
#id|firstName|lastName|gender|birthday|creationDate|locationIP|browserUsed
933|Mahinda|Perera|male|1989-12-03|2010-03-17T13:32:10.447+0000|192.248.2.123|Firefox
1129|Carmen|Lepland|female|1984-02-18|2010-02-28T04:39:58.781+0000|81.25.252.111|Internet Explorer
4194|Hồ Chí|Do|male|1988-10-14|2010-03-17T22:46:17.657+0000|103.10.89.118|Internet Explorer
8333|Chen|Wang|female|1980-02-02|2010-03-15T10:21:43.365+0000|1.4.16.148|Internet Explorer
8698|Chen|Liu|female|1982-05-29|2010-02-21T08:44:41.479+0000|14.103.81.196|Firefox
8853|Albin|Monteno|male|1986-04-09|2010-03-19T21:52:36.860+0000|178.209.14.40|Internet Explorer
10027|Ning|Chen|female|1982-12-08|2010-02-22T17:59:59.221+0000|1.2.9.86|Firefox
I want to take firstName, lastName and birthday from a specific line by id.
Example: if the input is 933, I want to extract (separated by space):
Mahinda Perera 1989-12-03
This should do it:
#!/bin/sh
id="$1"
awk -F '|' -v ID="$id" '($1==ID){print $2, $3, $5}' infile
Use as:
$ script.sh 933
Mahinda Perera 1989-12-03
awk -F'|' '$1 ~ /933/{print $2, $3, $5}' file
Mahinda Perera 1989-12-03
If field one matches the regex 933, print fields 2, 3 and 5. Note that ~ is a regex match, so it would also match ids such as 19330; use $1 == 933 for an exact match.
With GNU sed:
$ id=933
$ sed -n "/^$id"'|/{s/^[^|]*|\([^|]*\)|\([^|]*\)|[^|]*|\([^|]*\)|.*$/\1 \2 \3/p;q}' infile
Mahinda Perera 1989-12-03
-n prevents printing; the rest does this:
"/^$id"'|/ { # If a line starts with the ID... (notice quoting for parameter expansion)
# Capture second, third and fifth field, discard rest, print
s/^[^|]*|\([^|]*\)|\([^|]*\)|[^|]*|\([^|]*\)|.*$/\1 \2 \3/p
q # Quit to avoid processing the rest of the file for nothing
}'
To get this to run under BSD sed, there has to be another semicolon between the q and the closing brace.
Goes to show that awk is much better suited for the problem.
Use:
awk -F'|' '$1 == 933 {print $2, $3, $5}' infile.dat
I'm trying to put the output of an awk command into a bash array, but I'm having a bit of trouble.
>>test.sh
f_checkuser() {
_l="/etc/login.defs"
_p="/etc/passwd"
## get mini UID limit ##
l=$(grep "^UID_MIN" $_l)
## get max UID limit ##
l1=$(grep "^UID_MAX" $_l)
awk -F':' -v "min=${l##UID_MIN}" -v "max=${l1##UID_MAX}" '{ if ( $3 >= min && $3 <= max && $7 != "/sbin/nologin" ) print $0 }' "$_p"
}
...
Used files:
Sample File: /etc/login.defs
>>/etc/login.defs
### Min/max values for automatic uid selection in useradd
UID_MIN 1000
UID_MAX 60000
Sample File: /etc/passwd
>>/etc/passwd
root:x:0:0:root:/root:/usr/bin/zsh
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
admin:x:1000:1000:Administrator,,,:/home/admin:/bin/bash
daniel:x:1001:1001:Daniel,,,:/home/daniel:/bin/bash
The output looks like:
admin:x:1000:1000:Administrator,,,:/home/admin:/bin/bash
daniel:x:1001:1001:Daniel,,,:/home/daniel:/bin/bash
respectively, with print $1 instead of print $0 (awk ... print $1 }' "$_p"):
admin
daniel
Now my problem is how to save the awk output in an array so I can use it as a variable.
>>test.sh
...
f_checkuser
echo "Array items and indexes:"
for index in ${!LOKAL_USERS[*]}
do
printf "%4d: %s\n" $index ${LOKAL_USERS[$index]}
done
It could/should look like this example.
Array items and indexes:
0: admin
1: daniel
Specifically, I want to get all regular users of the system (not root, bin, sys, ssh, ...), excluding blocked users, into an array.
Perhaps someone has an idea how to solve my problem?
Are you trying to assign the output of a script to an array? Bash has a way of doing this. For example,
a=( $(seq 1 10) ); echo ${a[1]}
will populate the array a with elements 1 to 10 and will print 2, the second value generated by seq (array indices start at zero). Simply replace the contents of $(...) with your script.
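For example, a sketch with the question's function (piping through awk to keep only the user name; the unquoted $(...) relies on word splitting, which is fine here since user names contain no spaces):
LOKAL_USERS=( $(f_checkuser | awk -F':' '{ print $1 }') )
echo "${LOKAL_USERS[0]}"    # e.g. admin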
For those coming to this years later ...
bash 4 introduced readarray (aka mapfile) exactly for this purpose.
See also Bash capturing output of awk into array
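A minimal sketch with the question's function:
readarray -t LOKAL_USERS < <(f_checkuser | awk -F':' '{ print $1 }')
echo "${#LOKAL_USERS[@]} users found"    # -t strips the trailing newlines
readarray is also safe for lines containing spaces, unlike the word-splitting approach above.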
One solution that works:
array=()
f_checkuser(){
...
...
tempfile="localuser.tmp"
touch ${tempfile}
awk -F':'...'{... print $1 }' "$_p" > ${HOME}/${tempfile}
getArrayfromFile "${tempfile}"
}
getArrayfromFile() {
i=0
while read line # Read a line
do
array[i]=$line # Put it into the array
i=$(($i + 1))
done < $1
}
f_checkuser
echo "Array items and indexes:"
for index in ${!array[*]}
do
printf "%4d: %s\n" $index ${array[$index]}
done
Output:
Array items and indexes:
0: admin
1: daniel
But I would prefer a solution without a new temp file.
So, does anyone have an idea that avoids the temp file?
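One temp-file-free sketch, reading the function's output through process substitution (assuming f_checkuser prints one user name per line):
array=()
while read -r line
do
    array+=("$line")    # append each line to the array
done < <(f_checkuser)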
I have a directory on my computer which contains an entire database I found online for my research. This database contains thousands of files, so to do what I need I've been looking into file i/o stuff. A programmer friend suggested using bash/awk. I've written my code:
#!/usr/bin/env awk
ls -l|awk'
BEGIN {print "Now running"}
{if(NR == 17 / $1 >= 0.4 / $1 <= 2.5)
{print $1 > wavelengths.txt;
print $2 > reflectance.txt;
print $3 > standardDev.txt;}}END{print "done"}'
When I put this into my console, I'm already in the directory of the files I need to access. The data I need begins on line 17 of EVERY file. The data looks like this:
some number some number some number
some number some number some number
. . .
. . .
. . .
I want to access the data when the first column has a value of 0.4 (or approximately) and get the information up until the first column has a value of approximately 2.5. The first column represents wavelengths. I want to verify they are all the same for each file later, so I copy them into a file. The second column represents reflectance and I want this to be a separate file because later I'll take this information and build a data matrix from it. And the third column is the standard deviation of the reflectance.
The problem I am having now is that when I run this code, I get the following error: No such file or directory
Please, if anyone can tell me why I might be getting this error, or can guide me as to how to write the code for what I am trying to do... I will be so grateful.
The main problem is that you need to quote the output file names, as they are strings, not variables. Use:
print $1 > "wavelengths.txt"
instead of:
print $1 > wavelengths.txt
Excellent attempt, but you should never parse the output of ls. Also, you were probably looking for ls -1 (one entry per line), not ls -l (the long listing). awk can accept a glob of files instead. For example, in the desired directory, you can run:
awk -f /path/to/script.awk *
Contents of script.awk:
BEGIN {
print "Now running"
}
FNR >= 17 && $1 >= 0.4 && $1 <= 2.5 {    # FNR resets per file; data begins on line 17
print $1 > "wavelengths.txt"
print $2 > "reflectance.txt"
print $3 > "standardDev.txt"
}
END {
print "Done"
}