As a follow-up to "Gnuplot: Plotting several datasets with titles from one file", I have a test.dat file:
"p = 0.1"
1 1
3 3
4 1
"p = 0.2"
1 3
2 2
5 2
and I can plot it with no issues from within gnuplot using:
> plot for [IDX=0:1] 'test.dat' i IDX u 1:2 w lines title columnheader(1)
However, I cannot pipe the data in.
Here is a single-line example:
$ cat test.dat | gnuplot --persist -e "plot for [IDX=0:1] '-' i IDX u 1:2 w lines title columnheader(1)"
line 10: warning: Skipping data file with no valid points
I get the warning message and only the first set is plotted. I tried adding an e at the end of the data file, but no luck. This should be trivial; am I making a silly mistake?
I've been messing around a bit more. These work:
gnuplot --persist -e "plot for [IDX=0:1] 'test.dat' i IDX u 1:2 w lines title columnheader(1)"
gnuplot --persist -e "plot for [IDX=0:1] '< cat test.dat' i IDX u 1:2 w lines title columnheader(1)"
These don't:
cat test.dat | gnuplot --persist -e "plot for [IDX=0:1] '-' i IDX u 1:2 w lines title columnheader(1)"
cat test.dat | gnuplot --persist -e "plot for [IDX=0:1] '< cat' i IDX u 1:2 w lines title columnheader(1)"
It looks like a bug to me. I tried a few gnuplot versions (4.6.6, 5.0.0, 5.0.3), but they all show the same behaviour.
OK, I finally found it by browsing the documentation. When piping, the whole data has to be repeated for each index selection:
plot '-' index 0, '-' index 1
2
4
6
10
12
14
e
2
4
6
10
12
14
e
or, as a much simpler alternative, one can just do:
plot '-', '-'
2
4
6
e
10
12
14
e
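The repetition described above can be scripted rather than typed. The following is a minimal sketch, not taken from the original post: repeat_inline is a hypothetical helper name, and it assumes the data file ends with a newline and that one copy of the data is wanted per index selected in the plot command. The gnuplot call is left as a comment because it has not been checked against every gnuplot version.

```shell
# Hypothetical helper: print a data file N times, each copy followed by the
# "e" line that terminates one block of inline ('-') data.
repeat_inline() {
    file=$1
    copies=$2
    i=0
    while [ "$i" -lt "$copies" ]; do
        cat "$file"
        echo e
        i=$((i + 1))
    done
}

# Assumed usage, one copy of the data per index in the loop:
# repeat_inline test.dat 2 |
#   gnuplot --persist -e "plot for [IDX=0:1] '-' i IDX u 1:2 w lines title columnheader(1)"
```

Running repeat_inline test.dat 2 on its own prints the stream that gnuplot would receive, which makes it easy to check that each copy ends with its own e line.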
I want to loop through all elements in an array in awk and print. The values are sourced from the file below:
Ala A Alanine
Arg R Arginine
Asn N Asparagine
Asp D Aspartic acid
Cys C Cysteine
Gln Q Glutamine
Glu E Glutamic acid
Gly G Glycine
His H Histidine
Ile I Isoleucine
Leu L Leucine
Lys K Lysine
Met M Methionine
Phe F Phenylalanine
Pro P Proline
Pyl O Pyrrolysine
Ser S Serine
Sec U Selenocysteine
Thr T Threonine
Trp W Tryptophan
Tyr Y Tyrosine
Val V Valine
Asx B Aspartic acid or Asparagine
Glx Z Glutamic acid or Glutamine
Xaa X Any amino acid
Xle J Leucine or Isoleucine
TERM TERM termination codon
I have tried this:
awk 'BEGIN{FS="\t";OFS="\t"}{if (FNR==NR) {codes[$1]=$2;} else{next}}END{for (key in codes);{print key,codes[key],length(codes)}}' $input1 $input2
And the output is always Cys C 27 and when I replace codes[$1]=$2 for codes[$2]=$1 I get M Met 27.
How can I make my code print out all the values sequentially? I don't understand why my code selectively prints out just one element when I can tell the array length is 27 as expected. (To keep my code minimal I have excluded code within else{next} - Otherwise I just want to print all elements from array codes while retaining the else{***} command)
According to "How to view all the content in an awk array?", the syntax above should work. I tried it here: echo -e "1 2\n3 4\n5 6" | awk '{my_dict[$1] = $2};END {for(key in my_dict) print key " : " my_dict[key],": "length(my_dict)}' and that worked well.
With your shown samples and attempts please try following, written and tested in GNU awk.
awk '
BEGIN{
FS=OFS="\t"
}
{
codes[$1]=$2
}
END{
for(key in codes){
print key,codes[key],length(codes)
}
}' Input_file
Explanation: the same program again, with a comment on each line.
awk ' ##Starting awk program from here.
BEGIN{ ##Starting BEGIN section from here.
FS=OFS="\t" ##Setting FS and OFS as TAB here.
}
{
codes[$1]=$2 ##Creating array codes with index of 1st field and value of 2nd field
}
END{ ##Starting END block of this program from here.
for(key in codes){ ##Traversing through codes array here.
print key,codes[key],length(codes) ##Printing index and value of current item along with total length of codes.
}
}' Input_file ##Mentioning Input_file name here.
I'm a bit confused about what you are after, but to print the codes in file order along with a sequence number (ignoring the full name), you can do:
awk '{seq[++n]=$2; codes[$2]=$1}
END{for (i=1;i<=n;i++) printf "%s\t%s\t%d\n", codes[seq[i]], seq[i], i}' file
This uses two arrays: seq maps the sequence number to the single-letter code, and codes maps the single-letter code to the three-letter code.
Example Use/Output
$ awk '{seq[++n]=$2; codes[$2]=$1}
END{for (i=1;i<=n;i++) printf "%s\t%s\t%d\n", codes[seq[i]], seq[i], i}' file
Ala A 1
Arg R 2
Asn N 3
Asp D 4
Cys C 5
Gln Q 6
Glu E 7
Gly G 8
His H 9
Ile I 10
Leu L 11
Lys K 12
Met M 13
Phe F 14
Pro P 15
Pyl O 16
Ser S 17
Sec U 18
Thr T 19
Trp W 20
Tyr Y 21
Val V 22
Asx B 23
Glx Z 24
Xaa X 25
Xle J 26
TERM TERM 27
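Resolved: The error was caused by the stray ; after the for header here: END{for (key in codes);{print key,codes[key],length(codes)}}. The ; gives the loop an empty body, so the {print ...} block that follows runs only once, after the loop has finished, with key left at whichever index the loop visited last.
Solution: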
awk 'BEGIN{FS="\t";OFS="\t"}{if (FNR==NR) {codes[$1]=$2;} else{next}}END{for (key in codes){print key,codes[key],length(codes)}}' $input1 $input2
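The effect of the stray ; can be reproduced on a much smaller input. This is a minimal sketch: the three tab-separated records are made up for the illustration, and length(codes) is left out so that the snippet also runs on awk implementations that do not accept an array argument to length.

```shell
# Three tab-separated sample records (made up for this illustration).
sample=$(printf 'Ala\tA\nArg\tR\nAsn\tN')

# Stray ";" after the for header: the loop body is empty, so the braces that
# follow form a separate block that runs exactly once, after the loop ends.
buggy=$(printf '%s\n' "$sample" |
    awk -F '\t' '{codes[$1]=$2} END{for (key in codes);{print key, codes[key]}}')

# No ";": the braces are the loop body, so print runs once per element.
fixed=$(printf '%s\n' "$sample" |
    awk -F '\t' '{codes[$1]=$2} END{for (key in codes){print key, codes[key]}}')

printf 'with ";": %s line(s)\n' "$(printf '%s\n' "$buggy" | wc -l | tr -d ' ')"
printf 'without:  %s line(s)\n' "$(printf '%s\n' "$fixed" | wc -l | tr -d ' ')"
```

The first count is 1 and the second is 3. Which element the buggy version prints depends on the traversal order of for (key in codes), which awk does not define.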
To read a list from a file, I use the code below:
IFS=$'\n' read -d '' -r -a data < ./somefolder/mytext.txt
for i in {0..9} #i know that i have 10 items, thats why i use 0..9
do
echo "${data[$i]}"
done
Let's say the text file contains the numbers 1 to 10, one per line, so it should print:
1
2
3
4
5
6
7
8
9
10
Questions:
Is there a simpler way to read/write the text list than this?
How do I save/update/overwrite the data in mytext.txt? Say, change 4 to 88, for example.
Full example:
#!bin/bash
IFS=$'\n' read -d '' -r -a data < ./somefolder/mytext.txt
for i in {0..9} #i know that i have 10 items, thats why i use 0..9
do
echo "${data[$i]}"
done
echo "change 4 to anything"
read any
update(){
for n in {0..9}
do
if [[ n == 3 ]]; then
echo any
else
echo "${data[$n]}"
fi
done
}
update > ./somefolder/mytext.txt
#i dont know what i should do, it throws some errors saying syntax error
echo "saved"
exit 0
This is the code and output of the code, it is not the same as you describe in the comments.
printf '%s\n' {a..z} > file.txt
cat file.txt
a
b
c
d
e
f
g
h
i
j
k
l
m
n
o
p
q
r
s
t
u
v
w
x
y
z
A quick way of showing line numbers by using grep
grep -n . file.txt
A function to loop through an array.
func() {
n=1
for f; do
if (( n == 3 )); then
printf '%d %s\n' "$n" foo
else
printf '%d %s\n' "$n" "$f"
fi
((n++))
done
}
mapfile -t array < file.txt
func "${array[@]}"
Output
1 a
2 b
3 foo
4 d
5 e
6 f
7 g
8 h
9 i
10 j
11 k
12 l
13 m
14 n
15 o
16 p
17 q
18 r
19 s
20 t
21 u
22 v
23 w
24 x
25 y
26 z
On the other hand, if you just want to replace whatever is on a certain line with something else, and ed is acceptable/available:
#!/usr/bin/env bash
printf '%s\n' ,n | ed -s file.txt
read -rp 'Change 4 to anything: ' input
printf '%s\n' "4c" "$input" . ,n w | ed -s file.txt
A more flexible version of the previous script.
#!/usr/bin/env bash
total=$(printf '%s\n' '$=' | ed -s file.txt)
printf '%s\n' ,n | ed -s file.txt
read -rp 'Enter the line number you want to change: ' int
if [[ $int == *[!0-9]* ]]; then
printf >&2 '%s is not an int\n' "$int"
exit 1
elif (( int > total )); then
printf >&2 '%s is out of range!\n' "$int"
exit 1
fi
read -rp "Enter the replacement at line $int: " input
printf '%s\n' "${int}c" "$input" . ,n w | ed -s file.txt
Caveat: the file.txt name and path are still hard-coded in the script; just add an additional read for the file.
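Where ed is not available, the same single-line replacement can be sketched with awk and a temporary file. This is only an illustration under stated assumptions: replace_line is a hypothetical helper name, mktemp is assumed to be installed, and the line number is assumed to be a valid positive integer (the range checks from the script above are not repeated).

```shell
# Hypothetical helper: replace_line FILE LINE_NUMBER NEW_TEXT
replace_line() {
    tmp=$(mktemp) || return 1
    # The new text travels through the environment so that awk does not
    # interpret backslashes in it, which a -v assignment would do.
    REPLACEMENT=$3 awk -v n="$2" '
        NR == n { print ENVIRON["REPLACEMENT"]; next }  # swap the chosen line
        { print }                                        # copy every other line
    ' "$1" > "$tmp" && cat "$tmp" > "$1"   # write back in place, keeping permissions
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Assumed usage, changing line 4 to 88:
# replace_line ./somefolder/mytext.txt 4 88
```

Writing to a temporary file first avoids truncating the input while awk is still reading it, which is what a direct redirection onto the same file would do.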
I am trying to plot a bash array using gnuplot without dumping the array to a temporary file.
Let's say:
myarray=$(seq 1 5)
I tried the following:
myarray=$(seq 1 5)
gnuplot -p <<< "plot $myarray"
I got the following error:
line 0: warning: Cannot find or open file "1"
line 0: No data in plot
gnuplot> 2
^
line 0: invalid command
gnuplot> 3
^
line 0: invalid command
gnuplot> 4
^
line 0: invalid command
gnuplot> 5''
^
line 0: invalid command
Why doesn't it interpret the array as a data block?
Any help is appreciated.
bash array
myarray=$(seq 1 5)
Here myarray is not a bash array; it is a normal variable holding a single string.
The easiest way is to send the data to gnuplot's stdin and plot "<cat".
seq 5 | gnuplot -p -e 'plot "<cat" w l'
Or with your variable and with using a here-string:
<<<"$myarray" gnuplot -p -e 'plot "<cat" w l'
Or with your variable with redirection with echo or printf:
printf "%s\n" "$myarray" | gnuplot -p -e 'plot "<cat" w l'
And if you want to plot an actual array, just print it on separate lines and then pipe it to gnuplot:
array=($(seq 5))
printf "%s\n" "${array[@]}" | gnuplot -p -e 'plot "<cat" w l'
Plot STDIN
gnuplot -p -e 'plot "/dev/stdin"'
Sample:
( seq 5 10; seq 7 12 ) | gnuplot -p -e 'plot "/dev/stdin"'
or
gnuplot -p -e 'plot "/dev/stdin" with steps' < <( seq 5 10; seq 7 12 )
A more finely tuned plot
gnuplot -p -e "set terminal wxt 0 enhanced;set grid;
set label \"Test demo with random values\" at 0.5,0 center;
set yrange [ \"-1\" : \"80\" ] ; set timefmt \"%s\";
plot \"/dev/stdin\" using 1:2 title \"RND%30+40\" with impulse;" < <(
paste <(
seq 2300 2400
) <(
for ((i=101;i--;)){ echo $((RANDOM%30+40));}
)
)
Please note that this is still a single command, so you can copy and paste it into any terminal console.
Can anybody offer some help getting this AWK script to search correctly?
I need to search inside the "sample.txt" file for all 6 entries in the "combinations" file. However, I need the search to restart at every single character rather than behave like the search box of an ordinary text editor, which resumes after the end of each match. In other words, I need overlapping matches counted so that every occurrence is reported. For example, I need the kind of search that finds the combination "AAA" 3 times inside the string "AAAAA", not 1 time. See my previous post about this: BASH: Search a string and exactly display the exact number of times a substring happens inside it
The sample.txt file is:
AAAAAHHHAAHH
The combinations file is:
AA
HH
AAA
HHH
AAH
HHA
How do I get the script
#!/bin/bash
awk 'NR==FNR {data=$0; next} {printf "%s %d \n",$1,gsub($1,$1,data)}' 'sample.txt' combinations > searchoutput
to output the desired output:
AA 5
HH 3
AAA 3
HHH 1
AAH 2
HHA 1
instead of what it is currently outputting:
AA 3
HH 2
AAA 1
HHH 1
AAH 2
HHA 1
?
As we can see, the script only finds the combinations the way a text editor would. I need it to look for each combination starting at every character, so that the desired output is produced.
How do I get AWK to produce the desired output instead? Can't thank you enough.
There may be a faster way (finding the first match and carrying on from that index), but this might be simpler:
$ awk 'NR==1{content=$0;next}
{c=0; len1=length($1);
for(i=1;i<=length(content)-len1+1;i++)
c+=substr(content,i,len1)==$1;
print $1,c}' file combs
AA 5
HH 3
AAA 3
HHH 1
AAH 2
HHA 1
You might try this:
$ awk '{x="AAAAAHHHAAHH"; n=0}{
while(t=index(x,$0)){n++; x=substr(x,t+1) }
print $0,n
}' combinations.txt
AA 5
HH 3
AAA 3
HHH 1
AAH 2
HHA 1
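The difference between the two kinds of count can be seen side by side. The sketch below is an illustration only: count_matches is a hypothetical helper name, and the pattern is assumed to be non-empty and made of plain letters, since gsub treats it as a regular expression. It prints the pattern, then the non-overlapping count that gsub reports, then the overlapping count obtained by restarting the search one character after each match.

```shell
# Hypothetical helper: count_matches TEXT PATTERN
# Prints: PATTERN <non-overlapping count> <overlapping count>
count_matches() {
    awk -v text="$1" -v pat="$2" 'BEGIN {
        # gsub resumes after the end of each match, like a text editor search.
        copy = text
        nonoverlap = gsub(pat, "&", copy)

        # Restarting one character after each match start counts overlaps too.
        overlap = 0
        rest = text
        while ((pos = index(rest, pat)) > 0) {
            overlap++
            rest = substr(rest, pos + 1)
        }
        print pat, nonoverlap, overlap
    }'
}

count_matches AAAAAHHHAAHH AA
count_matches AAAAAHHHAAHH AAA
```

For the sample string this prints AA 3 5 and AAA 1 3, matching the current and desired outputs listed in the question.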
I am trying to eliminate a set of duplicate rows based on a separate field.
cat file.txt
1 345 a blue
1 345 b blue
3 452 c blue
3 342 d green
3 342 e green
1 345 f green
I would like to remove duplicates rows based on field 1 and 2, but separately for each colour. Desired output:
1 345 a blue
3 452 c blue
3 342 d green
1 345 f green
I can achieve this output using a for loop that iterates over the colours:
for i in $(awk '{ print $4 }' file.txt | sort -u); do
grep -w "${i}" file.txt |
awk '!x[$1,$2]++' >> output.txt
done
But this is slow. Is there any way to get this output without use of a loop?
Thank you.
At least for the example, it is as simple as:
$ awk 'arr[$1,$2,$4]++{next} 1' file
1 345 a blue
3 452 c blue
3 342 d green
1 345 f green
Or, you can negate that:
$ awk '!arr[$1,$2,$4]++' file
You can also use GNU sort for this, which may be faster, although the output then comes out sorted by those keys rather than in the original order:
$ sort -k4,4 -k2,2 -k1,1 -u file
Could you please try this too:
awk '!a[$1,$2,$4]++' Input_file
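The compact !a[$1,$2,$4]++ idiom used in the answers can be spelled out step by step. This is a minimal sketch of the same logic, with dedupe_per_colour and seen as hypothetical names chosen for the illustration; it assumes whitespace-separated fields with the colour in field 4, as in the sample.

```shell
# Hypothetical helper: keep the first row for each (field 1, field 2, colour)
# combination and drop later rows with the same combination.
dedupe_per_colour() {
    awk '{
        # Same composite key that awk builds for a subscript like [$1,$2,$4].
        key = $1 SUBSEP $2 SUBSEP $4
        if (!(key in seen)) {   # first time this combination appears
            seen[key] = 1
            print               # keep the row, in input order
        }
    }' "$@"
}

# Assumed usage:
# dedupe_per_colour file.txt
```

Because the colour is part of the key, a single pass over the file handles every colour at once, so no outer shell loop is needed.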