{
k = 0
x = 0
fracon = (10/2)+1
{
for (j = 1; j <= 1100 ; j++)
{
if (j <= fracon)
scal[j]= j-x
else
k= k + 1
scal[j]= j - (2*k)
{
if (scal[j] == 1)
fracon= fracon+11
{
if (j % 11 == 0)
x=x+11
k=k+0.5
}
}
}
}
}
That's all. I used the above code to generate the following array. It works in Matlab, but it does not work in awk.
array= [1 2 3 4 5 6 5 4 3 2 1 1 2 3 4 5 6]
Here is another way of generating the same sequence:
$ awk 'BEGIN{for(i=0;i<=20;i++) {k=i%11+1; printf "%s ", (k<7?k:12-k)}; print ""}'
1 2 3 4 5 6 5 4 3 2 1 1 2 3 4 5 6 5 4 3 2
I'm not sure whether what you want simply repeats on an 11-element cycle; it is difficult to say from such a limited sample.
Or without awk:
$ yes $({ seq 6; seq 5 -1 1; } | paste -sd' ') | head -100 | paste -sd' '
1 2 3 4 5 6 5 4 3 2 1 1 2 3 4 5 6 5 4 3 2 1 ...
With square brackets:
$ awk 'BEGIN{printf "[";
for(i=0;i<=1100;i++) {k=i%11+1; printf "%s ", (k<7?k:12-k)};
printf "]\n"}'
[1 2 3 4 5 6 5 4 3 2 1 1 2 3 4 5 6 ... 5 4 3 2 1 ]
Stuffing these values into a large array is not optimal; you can easily write a function that returns the value for a given index:
$ awk 'function k(i,_i) {_i=i%11+1; return _i<7?_i:12-_i}
BEGIN{for(i=0;i<=25;i++) print k(i)}'
In the real code you'll use k(i) instead of printing. Note that the index starts from 0.
N.B. _i is a local variable in the awk function; you don't need to pass it when calling the function.
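For comparison, the same index-to-value mapping can be sketched outside awk. The Python below is only an illustration, assuming the 11-element cycle guessed at above; the function name tri is made up for this example.

```python
def tri(i):
    """Return the value at 0-based index i of the cycle 1 2 3 4 5 6 5 4 3 2 1."""
    k = i % 11 + 1                   # position within the 11-element cycle, 1..11
    return k if k < 7 else 12 - k    # rise from 1 to 6, then fall back to 1

# Print the first 17 values on one line, as in the sample array.
print(" ".join(str(tri(i)) for i in range(17)))
```

As with the awk function, nothing is stored: each value is computed from its index on demand.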
The main goal is to find a periodic sequence in an array with bash, for example:
{ 2, 5, 7, 8, 2, 6, 5, 3, 5, 4, 2, 5, 7, 8, 2, 6, 5, 3, 5, 4, 2, 5, 7, 8, 2, 6, 5, 3, 5, 4 }
or { 2, 5, 6, 3, 4, 2, 5, 6, 3, 4, 2, 5, 6, 3, 4, 2, 5, 6, 3, 4 }
which must return the identified sequence for each of the two examples:
{ 2, 5, 7, 8, 2, 6, 5, 3, 5, 4 } and { 2, 5, 6, 3, 4 }
I tried with a list and a sub-list made of two arrays, but with no success.
I must be missing something in my loops. I thought of the "tortoise and hare" algorithm as an alternative, but I lack some knowledge of bash commands to implement it.
I'd rather post my second attempt, with tortoise and hare, as the first one seems to be useless:
#!/bin/bash
declare -A array=( 1, 2, 3, 1, 2, 3, 1, 2, 3 )
declare -A found=()
loop="notfound"
tortoise=`echo ${array[0]}`
hare=`echo ${array[0]}`
found[0]=`echo ${array[0]}`
while ( $loop == "notfound" )
do
for ((i=1;i=`echo ${#array[@]}`;i++))
do
if (( `echo ${array[$#]}` == $hare ))
then
echo "no loop found"
exit 0
fi
hare=`echo ${array[$i]}`
if (( `echo ${array[$#]}` == $hare ))
then
echo "no loop found"
exit 0
fi
hare=`echo ${array[$(($i+1))]}`
tortoise=`echo ${array[$i]}`
found[$i]=`echo ${array[$i]}`
if (( $hare == $tortoise ))
then
loop="found"
printf "$found[@]}"
fi
done
done
I got errors about the associative array needing an index.
Given an array a of single decimal digits
a=(2 5 7 8 2 6 5 3 5 4 2 5 7 8 2 6 5 3 5 4 2 5 7 8 2 6 5 3 5 4)
then you can use a regular-expression backreference, for example in perl:
printf '%d' "${a[@]}" | perl -lne 'print $1 if /^(\d+)\1+/'
2578265354
Testing with an incomplete sequence
a=(1 2 3 1 2 3 1 2)
printf '%d' "${a[@]}" | perl -lne 'print $1 if /^(\d+)\1+/'
123
If you only want complete repeats, add a $ line anchor to the RE, /^(\d+)\1+$/
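The same backreference idea can be sketched in Python, whose re module also supports \1. This is an illustration only: the function name leading_repeat and its complete flag are invented here, and it keeps the answer's assumption that every element is a single decimal digit.

```python
import re

def leading_repeat(digits, complete=False):
    """Return the leading block that is immediately repeated, or None.

    With complete=True the whole input must consist of repeats,
    which corresponds to adding the $ anchor to the expression.
    """
    s = "".join(str(d) for d in digits)   # single digits only, as assumed above
    pattern = r"(\d+)\1+"
    m = re.fullmatch(pattern, s) if complete else re.match(pattern, s)
    return m.group(1) if m else None

print(leading_repeat([1, 2, 3, 1, 2, 3, 1, 2]))
print(leading_repeat([1, 2, 3, 1, 2, 3, 1, 2], complete=True))
```

Because (\d+) is greedy, the longest repeated block wins: for 2 5 6 3 4 repeated four times this returns the doubled unit 2563425634, not 25634.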
Now, if you want to identify the longest subsequence that is "most nearly" repeated, that's a little trickier. For example, in the case of your 250-digit sequence, there is a 117-digit subsequence, repeated 2 times (with 16 digits left over), whereas your expected output is a 13-digit subsequence (repeated 19 times, with 3 digits left over). So you want an algorithm that is "greedy but not too greedy".
One (hopefully not too inefficient) way to do that would be to successively remove trailing digits until an anchored match is obtained i.e. the entire remaining sequence s* may be represented as n x t for some subsequence t. In perl, we can write that as a simple loop
perl -lne 'while (! s/^(\d+)\1+$/$1/) {chop $_}; print'
Testing with your 250-digit sequence:
a=( 1 1 0 2 1 2 0 0 2 0 2 2 2 1 1 0 2 1 2 0 0 2 0 2 2 2 1 1 0 2 1 2 0 0 2 0 2 2 2 1 1 0 2 1 2 0 0 2 0 2 2 2 1 1 0 2 1 2 0 0 2 0 2 2 2 1 1 0 2 1 2 0 0 2 0 2 2 2 1 1 0 2 1 2 0 0 2 0 2 2 2 1 1 0 2 1 2 0 0 2 0 2 2 2 1 1 0 2 1 2 0 0 2 0 2 2 2 1 1 0 2 1 2 0 0 2 0 2 2 2 1 1 0 2 1 2 0 0 2 0 2 2 2 1 1 0 2 1 2 0 0 2 0 2 2 2 1 1 0 2 1 2 0 0 2 0 2 2 2 1 1 0 2 1 2 0 0 2 0 2 2 2 1 1 0 2 1 2 0 0 2 0 2 2 2 1 1 0 2 1 2 0 0 2 0 2 2 2 1 1 0 2 1 2 0 0 2 0 2 2 2 1 1 0 2 1 2 0 0 2 0 2 2 2 1 1 0 2 1 2 0 0 2 0 2 2 2 1 1 0 )
Then
printf '%d' "${a[@]}" | perl -lne 'while (! s/^(\d+)\1+$/$1/) {chop $_}; print'
1102120020222
NOTE: this will fail to terminate if the string is exhausted before a match is found; if that's a possibility, you will need to test for that and break out of the while loop.
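A guarded version of that loop might look like the following Python sketch. It is not the perl one-liner itself: the name trimmed_repeat is hypothetical, and the early exit on an empty string is the termination test that the note above asks for.

```python
import re

def trimmed_repeat(s):
    """Drop trailing digits until the rest is some block repeated 2+ times.

    Returns that block, or None if the string runs out first.
    """
    while s:                                  # stop once every digit is gone
        m = re.fullmatch(r"(\d+)\1+", s)      # anchored at both ends
        if m:
            return m.group(1)
        s = s[:-1]                            # same effect as perl's chop
    return None

unit = "1102120020222"
print(trimmed_repeat(unit * 19 + "110"))      # 13-digit unit, 19 repeats, 3 left over
print(trimmed_repeat("12345"))                # no repeat at all
```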
I tested this only with the inputs you provided.
Assumption: the pattern to match always starts at the beginning of the array and repeats thereafter.
#!/bin/bash
#arr=(2 5 7 8 2 6 5 3 5 4 2 5 7 8 2 6 5 3 5 4 2 5 7 8 2 6 5 3 5 4)
arr=(2 5 6 3 4 2 5 6 3 4 2 5 6 3 4 2 5 6 3 4)
echo ${arr[@]}
n=${#arr[*]}
match=0
in_pattern=false
print_array()
{
local first=$1
local last=$2
local i
for ((i=first; i<=last; i++));do
printf "%d " ${arr[i]}
done
printf "\n"
}
i=0
start=0
end=0
j=$((i+1))
while (( j < n )); do
#echo "arr[$i] ${arr[i]} arr[$j] ${arr[j]}"
if [[ ${arr[i]} -ne ${arr[j]} ]];then
if [[ $match -ge 1 ]];then
echo "arr[$i] != arr[$j]"
echo "pattern doesnt repeat after match # $match"
exit 1
fi
((j++))
i=0
in_pattern=false
continue
fi
if $in_pattern ; then
if [[ $i -eq $end ]];then
((match++))
end_match=$j
echo "match # $match matched from $start -> $end and $start_match -> $end_match"
print_array $start $end
print_array $start_match $end_match
((j++))
i=0
in_pattern=false
continue
fi
else
if [[ $match -eq 0 ]];then
end=$((j-1))
fi
start_match=$j
in_pattern=true
#echo "trying to match from start $start end $end to start_match $start_match"
fi
((i++))
((j++))
done
Output with the first array:
./sequence.sh
2 5 7 8 2 6 5 3 5 4 2 5 7 8 2 6 5 3 5 4 2 5 7 8 2 6 5 3 5 4
match # 1 matched from 0 -> 9 and 10 -> 19
2 5 7 8 2 6 5 3 5 4
2 5 7 8 2 6 5 3 5 4
match # 2 matched from 0 -> 9 and 20 -> 29
2 5 7 8 2 6 5 3 5 4
Output with the second array:
./sequence.sh
2 5 6 3 4 2 5 6 3 4 2 5 6 3 4 2 5 6 3 4
match # 1 matched from 0 -> 4 and 5 -> 9
2 5 6 3 4
2 5 6 3 4
match # 2 matched from 0 -> 4 and 10 -> 14
2 5 6 3 4
2 5 6 3 4
match # 3 matched from 0 -> 4 and 15 -> 19
2 5 6 3 4
2 5 6 3 4
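Under the same assumption (the pattern starts at index 0 and repeats from there), the search can also be written as a scan over candidate period lengths. The Python below is a sketch of that idea, not a translation of the script above; find_period is a made-up name, and it accepts a final repetition that is cut short.

```python
def find_period(seq):
    """Return the shortest leading block whose repetition reproduces seq.

    The block must occur at least twice (the last copy may be partial);
    returns None when no such block exists.
    """
    n = len(seq)
    for p in range(1, n // 2 + 1):                        # candidate period length
        if all(seq[i] == seq[i - p] for i in range(p, n)):
            return seq[:p]                                # first hit is the shortest
    return None

print(find_period([2, 5, 6, 3, 4] * 4))
print(find_period([2, 5, 7, 8, 2, 6, 5, 3, 5, 4] * 3))
```

Since it compares list elements directly, this variant is not limited to single-digit values.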
The title may not be so descriptive. Let me explain:
I have a file (say File 1) containing some numbers, delimited by a space:
1 2 3 4 5
1 2 8 4 5 6 7
1 9 3 4 5 6 7 8
..... n lines (length of each line varies).
I have another file (say File 2) containing some numbers, delimited by a tab:
1 1 1 1 1 1 0 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 0 1 1 1 1 1
1 1 1 1 1 1 0 1 1 1 1 1
..... m lines (length of each line fixed).
I want the sum of the values at positions 1 2 3 4 5 (File 1, line 1) of File 2, line 1.
I want the sum of the values at positions 1 2 3 4 5 6 7 (File 1, line 2) of File 2, line 1, and so on.
That is, for every line of File 2 I want one sum per line of File 1, taken over the positions listed on that File 1 line.
The result will look like:
5 6 6 …n columns (File 1)
1 8 3
9 8 4
… m rows (File 2)
I did this by the following code:
open( FH1, "File1.txt" );
@index = <FH1>;
open( FH2, "File2.txt" );
@matrix = <FH2>;
open( OUTPUT, ">sum.txt" );
foreach $xx (@matrix) {
@k1 = split( /\t/, "$xx" );
foreach $yy (@index) {
@k2 = split( / /, "$yy" );
$ssum = 0;
foreach $zz (#k2) {
$zz1 = $zz - 1;
if ( $k1[$zz1] == 1 ) {
$ssum++;
}
}
printf OUTPUT"$ssum\t";
$ssum = 0;
}
print OUTPUT"\n";
}
close FH1;
close FH2;
close OUTPUT;
It works absolutely fine, except that the time requirement is enormous for large files (e.g. 1000 lines in File 1 x 25000 lines in File 2 takes 8 minutes).
My data may be more than 4 times the size of this example, and that is unacceptable for my users.
How can I accomplish this in much less time, or with a different approach?
Always include use strict; and use warnings; in every Perl script.
You can simplify your script by not processing the first file multiple times. Also, your coding style is very outdated. You could use some lessons from the Modern Perl book by chromatic.
The following is your script simplified to take advantage of more modern style and techniques. Note that it currently loads the file data from inside the script instead of from external files:
use strict;
use warnings;
use autodie;
use List::Util qw(sum);
my @indexes = do {
#open my $fh, '<', "File1.txt";
open my $fh, '<', \ "1 2 3 4 5\n1 2 8 4 5 6 7\n1 9 3 4 5 6 7 8\n";
map { [map {$_ - 1} split ' '] } <$fh>
};
#open my $infh, '<', "File2.txt";
my $infh = \*DATA;
#open my $outfh, '>', "sum.txt";
my $outfh = \*STDOUT;
while (<$infh>) {
my @vals = split ' ';
print $outfh join(' ', map {sum(@vals[@$_])} @indexes), "\n";
}
__DATA__
1 1 1 1 1 1 0 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 0 1 1 1 1 1
1 1 1 1 1 1 0 1 1 1 1 1
Outputs:
5 6 7
5 7 8
5 6 7
5 6 7
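The structure of that solution (parse the position lists once, then make a single pass over the matrix lines) can be sketched in Python as well. This is an illustration with hypothetical names (position_sums, index_lines, matrix_lines); like the Perl above it sums the selected values, which matches counting ones only if File 2 holds nothing but 0 and 1.

```python
def position_sums(index_lines, matrix_lines):
    """For each matrix line, sum the values at the 1-based positions
    listed on each index line. Returns one list of sums per matrix line."""
    # Parse File 1 once, converting 1-based positions to 0-based indices.
    indexes = [[int(tok) - 1 for tok in line.split()] for line in index_lines]
    result = []
    for line in matrix_lines:                 # single pass over File 2
        vals = [int(tok) for tok in line.split()]
        result.append([sum(vals[i] for i in idx) for idx in indexes])
    return result

file1 = ["1 2 3 4 5", "1 2 8 4 5 6 7", "1 9 3 4 5 6 7 8"]
file2 = ["1 1 1 1 1 1 0 1 1 1 1 1", "1 1 1 1 1 1 1 1 1 1 1 1"]
for row in position_sums(file1, file2):
    print(" ".join(str(v) for v in row))
```

split() with no argument splits on any whitespace, so the same code handles the space-delimited and the tab-delimited file.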
Can anyone help me vectorize this loop?
I have a large matrix and I want to replace every pixel value that occurs fewer times than some threshold. For simplicity, let's say:
a = randi([1 5],10,10);
for i = 1:length(a)
someMat=a(a==i);
if length(someMat)<20
a(a==i)=0;
end
end
but it's killing me.
Example:
a = randi([1 5],10,10)
a =
5 2 1 5 5 5 2 2 3 2
3 3 5 4 4 4 3 1 1 5
5 1 3 5 3 3 4 1 3 1
3 1 5 3 2 5 1 1 5 1
1 1 4 3 4 3 4 4 5 1
1 4 3 5 1 1 2 2 2 1
3 3 5 2 4 1 1 3 2 4
4 1 5 3 4 5 3 4 3 3
5 3 5 5 4 3 1 3 4 1
4 1 1 3 5 5 1 3 3 5
Result for threshold 20:
5 0 1 5 5 5 0 0 3 0
3 3 5 0 0 0 3 1 1 5
5 1 3 5 3 3 0 1 3 1
3 1 5 3 0 5 1 1 5 1
1 1 0 3 0 3 0 0 5 1
1 0 3 5 1 1 0 0 0 1
3 3 5 0 0 1 1 3 0 0
0 1 5 3 0 5 3 0 3 3
5 3 5 5 0 3 1 3 0 1
0 1 1 3 5 5 1 3 3 5
The count of pixel value 4 was 17.
The count of pixel value 2 was 10.
I tried something like:
[nVal Index] = histc(a(:),unique(a)); %
nVal(nVal>20) = 1; % just some threshold value and assigning by some Number may be zero as well
But I don't know how to replace the values at the corresponding pixel indices and then apply reshape to get the matrix back in its original form. I'm not even sure I would get the same matrix back with reshape. Please help.
Thanks.
I think this does what you want:
threshold_length = 20;
replace_value = 0;
u = unique(a); %// values of a
h = histc(a(:), u); %// count for each value
r = u(h<threshold_length); %// values to be removed
a(ismember(a,r)) = replace_value; %// remove those values
I see @LuisMendo arrived at mostly the same solution quicker than I did, but an alternative to using ismember is to use more of what unique gives you:
threshold = 20;
[vals, ~, ix] = unique(a); % capture the values and their indices
counts = histc(a(:), vals); % count the occurrences of each value
vals(counts<threshold) = 0; % zero the values that aren't common enough
a(:) = vals(ix); % recreate the matrix with updated values
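Both answers follow the same two steps: count how often each value occurs, then replace the values whose count is below the threshold. As a language-neutral illustration, here is a plain-Python sketch of those steps on a list of rows. It is not vectorized in the MATLAB sense, and zero_rare_values is a name invented for this example.

```python
from collections import Counter

def zero_rare_values(matrix, threshold, replace_value=0):
    """Return a copy of matrix (a list of rows) in which every value
    that occurs fewer than threshold times is replaced."""
    counts = Counter(v for row in matrix for v in row)            # count per value
    rare = {v for v, c in counts.items() if c < threshold}        # values to remove
    return [[replace_value if v in rare else v for v in row] for row in matrix]

a = [[1, 2, 1],
     [3, 1, 2]]
print(zero_rare_values(a, 2))
```

The row structure is preserved by construction, so there is no reshape step to worry about.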
I have a data file with lines like this:
A1 2 3 4 5
B 1 2 4
B 7 8 9
A6 7 8 9
B 1 2 3
B 5 6 7
A3 6 9 7
B 2 3 3
B 5 6 6
Using Perl, how do I split the file into a set of arrays (or any other data structure) when the parser hits a /^A/ please?
so I end up with
array1:
A1 2 3 4 5
B 1 2 4
B 7 8 9
array2:
A6 7 8 9
B 1 2 3
B 5 6 7
etc.
Many thanks.
I had to rewrite the answer (after the question was rewritten):
@arrays = ();
while (<>) {
push(@arrays, []) if /^A/;
push(@{$arrays[-1]}, $_)
}
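The grouping logic is small enough to restate in another language. The Python sketch below mirrors it with one labeled difference: if the input does not begin with an A line, it opens a group anyway so the append cannot fail. The name group_records is hypothetical.

```python
def group_records(lines):
    """Split lines into groups, starting a new group at every line
    that begins with 'A'."""
    groups = []
    for line in lines:
        # Assumption for this sketch: lines before the first 'A' line
        # get a group of their own.
        if line.startswith("A") or not groups:
            groups.append([])
        groups[-1].append(line)
    return groups

data = ["A1 2 3 4 5", "B 1 2 4", "B 7 8 9",
        "A6 7 8 9", "B 1 2 3", "B 5 6 7",
        "A3 6 9 7", "B 2 3 3", "B 5 6 6"]
for number, group in enumerate(group_records(data), start=1):
    print(f"array{number}:", group)
```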
It is at times like this when I wish $/ could be more than just a string. Nevertheless, there are workarounds.
One could slurp the file in and process with a lookahead assertion. The example below simply prints each string delimited with << >>, but the basic idea is the same regardless of what you want to do with the data:
$ perl -0777 -wE 'say "<<$_>>" for split /(?=^A)/m, <>' file.txt
<<A1 2 3 4 5
B 1 2 4
B 7 8 9
>>
<<A6 7 8 9
B 1 2 3
B 5 6 7
>>
<<A3 6 9 7
B 2 3 3
B 5 6 6
>>