I want to calculate the percent change between periods of each element in the "features" array (simply using the array as a grouping of financial time series data to report on). However the way the script is working now, it seems that it wants to calculate the percent change between each element in the array and not FOR each element in the array.
I don't think I've done anything wrong here in how I reference the array elements but I get the feeling there's some sort of 'under the hood' concept about how variables are processed by TV that is causing this issue.
//@version=4
study("My Script")
pct_change(source, period) =>
    now = source
    then = source[period]
    missing_now = na(now)
    missing_then = na(then)
    if not missing_now and not missing_then
        (now - then) / abs(then)
    else
        missing_now ? 0 : 1
evaluate(sources) =>
    s = array.size(sources)
    bar_changes = array.new_float()
    for i = 0 to 99999
        if i < s
            source = array.get(sources, i)
            array.push(bar_changes, pct_change(source, 1))
            continue
        else
            break
    bar_changes
features = array.new_float()
array.push(features, open)
array.push(features, high)
array.push(features, close)
bar_changes = evaluate(features)
plot(pct_change(open, 1))
plot(array.get(bar_changes, 0))
plot(pct_change(high, 1), color=color.aqua)
plot(array.get(bar_changes, 1), color=color.aqua)
plot(pct_change(close, 1), color=color.red)
plot(array.get(bar_changes, 2), color=color.red)
I think you have come across the same problem I'm facing. It relates to using the history-referencing operator [] in connection with setting array element values.
I've boiled it down to a very simple script illustrating the problem here.
In essence, your code passes an array element to the pct_change() function, which uses the [] operator, and then uses the returned result in array.push() to set an array element value.
I got weird results when I started experimenting with arrays in my scripts as soon as they were introduced, so I dug in to find the root of the problem, and it came down to the script referenced in the link above. So far I believe that Pine Script still has some bugs when it comes to arrays, so we just have to wait until they are fixed.
When I try to compile my code using -fcheck=all I get a runtime error since it seems I step out of bounds of my array dimension size. It comes from the part of my code shown below. I think it is because my loops over i,j only run from -ny to ny, -nx to nx but I try to use points at i+1,j+1,i-1,j-1 which takes me out of bounds in my arrays. When the loop over j starts at -ny, it needs j-1, so it immediately takes me out of bounds since I'm trying to access -ny-1. Similarly when j=ny, i=-nx,nx.
My question is, how can I fix this problem efficiently using minimal code?
I need the array grad(1,i,j) correctly defined on the boundary, and it needs to be defined exactly as on the right hand side of the equality below, I just don't know an efficient way of doing this. I can explicitly define grad(1,nx,j), grad(1,-nx,j), etc, separately and only loop over i=-nx+1,nx-1,j=-ny+1,ny-1 but this causes lots of duplicated code and I have many of these arrays so I don't think this is the logical/efficient approach. If I do this, I just end up with hundreds of lines of duplicated code that makes it very hard to debug. Thanks.
integer :: i,j
integer, parameter :: nx = 50, ny = 50
complex, dimension (3,-nx:nx,-ny:ny) :: grad,psi
real, parameter :: h = 0.1
do j = -ny,ny
   do i = -nx,nx
      psi(1,i,j) = sin(i*h)+sin(j*h)
      psi(2,i,j) = sin(i*h)+sin(j*h)
      psi(3,i,j) = sin(i*h)+sin(j*h)
   end do
end do
do j = -ny,ny
   do i = -nx,nx
      grad(1,i,j) = (psi(1,i+1,j)+psi(1,i-1,j)+psi(1,i,j+1)+psi(1,i,j-1)-4*psi(1,i,j))/h**2 &
                  - (psi(2,i+1,j)-psi(2,i,j))*psi(1,i,j)/h &
                  - (psi(3,i,j+1)-psi(3,i,j))*psi(1,i,j)/h &
                  - psi(2,i,j)*(psi(1,i+1,j)-psi(1,i,j))/h &
                  - psi(3,i,j)*(psi(1,i,j+1)-psi(1,i,j))/h
   end do
end do
If I was to do this directly for grad(1,nx,j), grad(1,-nx,j), it would be given by
do j = -ny+1,ny-1
   grad(1,nx,j) = (psi(1,nx,j)+psi(1,nx-2,j)+psi(1,nx,j+1)+psi(1,nx,j-1)-2*psi(1,nx-1,j)-2*psi(1,nx,j))/h**2 &
                - (psi(2,nx,j)-psi(2,nx-1,j))*psi(1,nx,j)/h &
                - (psi(3,nx,j+1)-psi(3,nx,j))*psi(1,nx,j)/h &
                - psi(2,nx,j)*(psi(1,nx,j)-psi(1,nx-1,j))/h &
                - psi(3,nx,j)*(psi(1,nx,j+1)-psi(1,nx,j))/h
   grad(1,-nx,j) = (psi(1,-nx+2,j)+psi(1,-nx,j)+psi(1,-nx,j+1)+psi(1,-nx,j-1)-2*psi(1,-nx+1,j)-2*psi(1,-nx,j))/h**2 &
                 - (psi(2,-nx+1,j)-psi(2,-nx,j))*psi(1,-nx,j)/h &
                 - (psi(3,-nx,j+1)-psi(3,-nx,j))*psi(1,-nx,j)/h &
                 - psi(2,-nx,j)*(psi(1,-nx+1,j)-psi(1,-nx,j))/h &
                 - psi(3,-nx,j)*(psi(1,-nx,j+1)-psi(1,-nx,j))/h
end do
One possible way is to use an additional index variable for the boundaries, clamped from the original index so that it never goes out of bounds. I mean something like this:
do j = -ny,ny
   jj = max(min(j, ny-1), -ny+1)
   do i = -nx,nx
      ii = max(min(i, nx-1), -nx+1)
      grad(1,i,j) = (psi(1,ii+1,j)+psi(1,ii-1,j)+psi(1,i,jj+1)+psi(1,i,jj-1)-4*psi(1,i,j))/h**2 &
                  - (psi(2,ii+1,j)-psi(2,ii,j))*psi(1,i,j)/h &
                  - (psi(3,i,jj+1)-psi(3,i,jj))*psi(1,i,j)/h &
                  - psi(2,i,j)*(psi(1,ii+1,j)-psi(1,ii,j))/h &
                  - psi(3,i,j)*(psi(1,i,jj+1)-psi(1,i,jj))/h
   end do
end do
It's hard for me to write proper code because it seems you trimmed part of the original expression in the code you presented in the question, but I hope you understand the idea and can apply it correctly to your logic.
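The clamping idea can be checked in isolation on a one-dimensional second difference. The following is only a minimal Python sketch, not the Fortran from the question: the function name `second_difference_clamped` and the sample data are made up for illustration, and unlike the code above it clamps the center term of the stencil as well, which corresponds to the one-sided boundary formula written out in the question.

```python
def second_difference_clamped(values, h):
    """Second difference of `values` (needs at least 3 points).

    The stencil center is clamped to the interior, so the two boundary
    points reuse the stencil of their nearest interior neighbor.
    """
    n = len(values)
    result = []
    for i in range(n):
        # Clamp the center so that c - 1 and c + 1 are always valid indices.
        c = max(min(i, n - 2), 1)
        result.append((values[c + 1] + values[c - 1] - 2 * values[c]) / h**2)
    return result


h = 0.5
# Hypothetical test data: f(x) = x**2 sampled at x = -2.0, -1.5, ..., 2.0.
samples = [(i * h) ** 2 for i in range(-4, 5)]
result = second_difference_clamped(samples, h)
print(result)
```

For a quadratic, the second difference is the same at every point, so the boundary entries can be compared directly against the interior ones.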
Opinions:
Even though this is what you are asking for (as far as I understand), I would not recommend doing this before profiling and checking whether assigning the boundary conditions manually after a whole-array operation would be more efficient. The extra index calculations on each iteration could hurt performance (arguably less than if conditionals or function calls would). Using "ghost cells", as suggested by @evets, could be even faster. You should profile and compare.
I'd recommend declaring your arrays as dimension(-nx:nx,-ny:ny,3) instead. Fortran stores arrays in column-major order, so the leftmost index varies fastest in memory. With the component index leftmost, the neighbors in "x" and "y" for a fixed component are not contiguous in memory, and that could mean fewer cache hits.
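To make the memory-layout argument concrete, here is a small Python sketch that computes column-major offsets by hand. It is an illustration only: the helper `column_major_offset` is hypothetical, indices are zero-based rather than running from -nx to nx, and the extents assume nx = ny = 50 as in the question.

```python
def column_major_offset(index, shape):
    """Linear offset of a zero-based `index` in a column-major array of `shape`."""
    offset, stride = 0, 1
    for idx, extent in zip(index, shape):
        offset += idx * stride
        stride *= extent  # each later dimension steps over all earlier ones
    return offset


nx_len = ny_len = 101  # number of points in -nx:nx and -ny:ny for nx = ny = 50

# Layout (3, x, y): component index leftmost, as in the question.
shape_a = (3, nx_len, ny_len)
step_x_a = column_major_offset((0, 11, 10), shape_a) - column_major_offset((0, 10, 10), shape_a)

# Layout (x, y, 3): component index rightmost, as recommended above.
shape_b = (nx_len, ny_len, 3)
step_x_b = column_major_offset((11, 10, 0), shape_b) - column_major_offset((10, 10, 0), shape_b)

print(step_x_a, step_x_b)  # 3 1
```

With the component index leftmost, stepping to the next x neighbor skips over the other two components; with it rightmost, x neighbors are adjacent elements.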
In somewhat pseudo-code, you can do
do j = -ny, ny
   if (j == -ny) then
      p1jm1 = XXXXX ! Some boundary condition
   else
      p1jm1 = psi(1,i,j-1)
   end if
   if (j == ny) then
      p1jp1 = YYYYY ! Some other boundary condition
   else
      p1jp1 = psi(1,i,j+1)
   end if
   do i = -nx, nx
      grad(1,i,j) = ... term involving p1jm1 ... term involving p1jp1 ...
      ...
   end do
end do
The j-loop isn't bad in that you are adding 2*2*ny conditionals. The inner i-loop adds 2*2*nx conditionals for each j iteration (or 2*2*ny * 2*2*nx conditionals in total). Note that you need a temporary for each psi element whose index triplet is unique, i.e., psi(1,i,j+1), psi(1,i,j-1), and psi(3,i,j+1).
I have recently switched from Stata to R, and am stuck on some mechanisms of looping (I know looping is considered bad form here compared with writing a function, but my feeling is that if I can figure out how to get a loop to do this, I can apply that to a function).
I regularly want to execute a call on nested variable name fragments. I can't get R to let me insert variable name fragments, either within a loop or a function; help! While the variable name fragment here is a number, often it's a word or word fragment (a list of non-numeric, non-sequential characters), so I would like to be sure that whatever help I get isn't specific to inserting numbers!
Here is an example. I would like to create an indicator variable for a patient having received a specific intervention in my data frame called xdata. I have created several (8) intervention variables and, for each, a list of the specific patients (identified through the variable named id) who should be flagged as having received that intervention.
xdata <- data.frame(intervention1 = 0,
                    intervention2 = 0,
                    intervention3 = 0,
                    intervention4 = 0,
                    intervention5 = 0,
                    intervention6 = 0,
                    intervention7 = 0,
                    intervention8 = 0,
                    id = 1:2000)
ids <- list(dx1 = c(86, 1486, 1451),
            dx2 = c(328, 1277, 1458),
            dx3 = c(535, 1569, 689),
            dx4 = c(488, 1335),
            dx5 = c(1210, 1425, 932, 1451, 30, 270, 347, 418, 709, 801, 1278),
            dx6 = c(282, 721, 749, 1134),
            dx7 = c(932, 43, 148, 158, 1441),
            dx8 = c(932, 801, 1258))
for (i in 1:8) {
  for (j in 1:length(ids[[i]])) {
    parse(paste("xdata$intervention",i,"[xdata$id==",ids[[i]][j],"]<- 1"))
  }
}
What I am hoping for the loop to do is to carry out in sequence this line of code "xdata$intervention",i,"[xdata$id==",ids[[i]][j],"]<- 1" for each iteration of a particular intervention and list of patients, as applicable. I want the loop to do this:
xdata$intervention1[xdata$id==86]<- 1
xdata$intervention1[xdata$id==1486]<- 1
xdata$intervention1[xdata$id==1451]<- 1
xdata$intervention2[xdata$id==328]<- 1
xdata$intervention2[xdata$id==1277]<- 1
xdata$intervention2[xdata$id==1458]<- 1
...
xdata$intervention8[xdata$id==932]<- 1
xdata$intervention8[xdata$id==801]<- 1
xdata$intervention8[xdata$id==1258]<- 1
Disclaimer/apology-- I have looked and looked for help with this in S.O., my guess is that the answer is out there, but one problem with being a newbie is that I have a hard time even knowing how to frame my question so the right answers will come up! If this is a duplicate I'm sorry; it was not for lack of trying that I didn't find it!
This code
foreach my $ti (@forward){
    my $new_bs = %blast_values->{ $ti }->{"bitscore"};
    if($new_bs > $fbs){
        $fti = $ti;
        $fbs = $new_bs;
    }
}
my $fqstart = %blast_values->{ $fti }->{"qstart"};
my $fqend = %blast_values->{ $fti }->{"qend"};
my $fsstart = %blast_values->{ $fti }->{"sstart"};
my $fsend = %blast_values->{ $fti }->{"send"};
was originally done with a subroutine call:
my ($fti, $fqstart, $fqend, $fsstart, $fsend, $fbs) = best_one(\@forward,\%blast_values);
where inside the subroutine it did:
my @forward = @{$_[0]};
my %blast_values = %{$_[1]};
However, the subroutine version ran about 40X slower than the code shown at the top of this post. The subroutine version was the same code, just moved into the subroutine, which then returned the scalar values indicated. The subroutine would have been called about 30K times if I had let it run to completion, which I never did, because it was going to take about 1800 seconds. When I placed a debug output line before the "foreach" in the subroutine, there was a noticeable delay between output lines during a run, on the order of 1 second, whereas for the version in the main part of the Perl script there is no measurable delay (so < 0.1 seconds or so between output lines).
The array was generally very small, with 1 or 2 (99% of the time) entries and rarely up to 12. The hash, on the other hand, was very, very large. It has something like 1.5M keys and each key has 6 values accessed by subkeys. Both are passed by reference, so the size of their contents really should not have mattered.
What might account for this delay? I do not recall there being this much call overhead on Perl subroutine invocations, and the input parameters are passed by reference, so it isn't like it had to copy the huge hash. (Although the execution speed suggests maybe it was doing so.)
Perl 5.8.8 on Centos 5.
It's slow because when you do this
my @forward = @{$_[0]};
my %blast_values = %{$_[1]};
You are dereferencing the references you passed in and copying the referenced structures into new variables. If %blast_values is very big, that's a lot of work.
Instead, just use the references without copying. (That's what they're for.)
my $forward = shift;
my $blast_values = shift;
my $fqstart = $blast_values->{ $fti }->{"qstart"};
# etc
Also, I assume you're aware that %blast_values->{ $fti }->{"qstart"} doesn't make sense. The fact that it works at all is due to a bug in Perl. Using such a construct has issued a warning ("Using a hash as a reference is deprecated") for years. You should have been using $blast_values{ $fti }->{"qstart"}.
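The same distinction between copying a structure and working through a reference to it can be sketched outside Perl. The following is a rough Python analogue, not a translation of the original subroutine: both function names and the sample data are made up for illustration, and picking the entry with the highest bitscore via `max` is a simplification of the loop in the question.

```python
def best_one_copying(forward, blast_values):
    """Analogue of the slow version: copies both containers on every call."""
    forward = list(forward)            # copies the (small) list
    blast_values = dict(blast_values)  # copies every top-level key of the (huge) dict
    return max(forward, key=lambda ti: blast_values[ti]["bitscore"])


def best_one_by_reference(forward, blast_values):
    """Analogue of the fast version: uses the passed-in objects directly."""
    return max(forward, key=lambda ti: blast_values[ti]["bitscore"])


# Hypothetical data: many keys, but only two of them are looked up per call.
blast_values = {ti: {"bitscore": ti % 97, "qstart": 1, "qend": 2} for ti in range(100_000)}
forward = [12, 150]

best_copy = best_one_copying(forward, blast_values)
best_ref = best_one_by_reference(forward, blast_values)
print(best_copy, best_ref)
```

Both versions return the same answer; the copying one does work proportional to the size of the whole dictionary on every call, even though it only reads a couple of entries.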
I have always been interested in algorithms, sort, crypto, binary trees, data compression, memory operations, etc.
I read Mark Nelson's article about permutations in C++ with the STL function next_permutation(), which was very interesting and useful. After that I wrote a class method to get the next permutation in Delphi, since that is the tool I presently use most. This function works in lexicographic order; I got the idea for the algorithm from an answer in another topic here on Stack Overflow. But now I have a big problem: I'm working with permutations of a vector with repeated elements, and there are a lot of permutations that I don't need. For example, this is the first permutation of 7 elements in lexicographic order:
6667778 (6 = 3 times consecutively, 7 = 3 times consecutively)
For my work, I consider a permutation valid only if no element is repeated more than 2 times consecutively, like this:
6676778 (6 = 2 times consecutively, 7 = 2 times consecutively)
In short, I need a function that returns only permutations that have at most N consecutive repetitions, according to the parameter received.
Does anyone know if there is some algorithm that already does this?
Sorry for any mistakes in the text, I still don't speak English very well.
Thank you so much,
Carlos
My approach is a recursive generator that doesn't follow branches that contain illegal sequences.
Here's the Python 3 code:
def perm_maxlen(elements, prefix = "", maxlen = 2):
    if not elements:
        yield prefix + elements
        return
    used = set()
    for i in range(len(elements)):
        element = elements[i]
        if element in used:
            #already searched this path
            continue
        used.add(element)
        suffix = prefix[-maxlen:] + element
        if len(suffix) > maxlen and len(set(suffix)) == 1:
            #would exceed maximum run length
            continue
        sub_elements = elements[:i] + elements[i+1:]
        for perm in perm_maxlen(sub_elements, prefix + element, maxlen):
            yield perm
for perm in perm_maxlen("6667778"):
    print(perm)
The implementation is written for readability, not speed, but the algorithm should be much faster than naively filtering all permutations.
print(len(list(perm_maxlen("a"*100 + "b"*100, "", 1))))
For example, it runs this in milliseconds, where the naive filtering solution would take millennia or something.
So, in the homework-assistance kind of way, I can think of two approaches.
Work out all permutations that contain 3 or more consecutive repetitions (which you can do by treating the three-in-a-row as just one pseudo-digit and feeding it to a normal permutation generation algorithm). Make a lookup table of all of these. Now generate all permutations of your original string, and add each one to the result only if it is not in the lookup table.
Use a recursive permutation-generating algorithm (select each possibility for the first digit in turn, recurse to generate permutations of the remaining digits), but in each recursion pass along the last two digits generated so far. Then in the recursively called function, if the two values passed in are the same, don't allow the first digit to be the same as those.
Why not just make a wrapper around the normal permutation function that skips values with more than N consecutive repetitions? Something like:
(pseudocode)
function custom_perm(int max_rep)
    do
        p := next_perm()
    while count_max_reps(p) > max_rep
    return p
Krusty, I'm already doing that at the end of the function, but it doesn't solve the problem, because it still needs to generate all permutations and check each one.
consecutive := 1;
IsValid := True;
for n := 0 to len - 2 do
begin
  if anyVector[n] = anyVector[n + 1] then
    consecutive := consecutive + 1
  else
    consecutive := 1;
  if consecutive > MaxConsecutiveRepeats then
  begin
    IsValid := False;
    Break;
  end;
end;
Since I start from the first permutation in lexicographic order, this approach ends up generating a lot of unnecessary permutations.
This is easy to make, but rather hard to make efficient.
If you need to build a single piece of code that only considers valid outputs, and thus doesn't bother walking over the entire combination space, then you're going to have some thinking to do.
On the other hand, if you can live with the code internally producing all combinations, valid or not, then it should be simple.
Make a new enumerator, one which you can call that next_perm method on, and have this internally use the other enumerator, the one that produces every combination.
Then simply make the outer enumerator run in a while loop asking the inner one for more permutations until you find one that is valid, then produce that.
Pseudo-code for this:
generator1:
    when called, yield the next combination
generator2:
    internally keep a generator1 object
    when called, keep asking generator1 for a new combination
    check the combination
    if valid, then yield it
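The two-generator pseudo-code above could look roughly like this in Python. This is a sketch under assumptions, not the asker's Delphi method: the names `all_perms`, `valid_perms` and `longest_run` are made up, the elements are assumed to be single characters of a string, and `itertools.permutations` stands in for the inner generator. As the answer says, it still walks the whole permutation space internally and only filters the output.

```python
from itertools import groupby, permutations


def longest_run(seq):
    """Length of the longest run of equal adjacent items in `seq`."""
    return max((len(list(group)) for _, group in groupby(seq)), default=0)


def all_perms(elements):
    """generator1: yields every distinct permutation, valid or not."""
    seen = set()
    for p in permutations(sorted(elements)):
        if p not in seen:  # repeated elements produce duplicate tuples
            seen.add(p)
            yield p


def valid_perms(elements, max_run=2):
    """generator2: keeps asking generator1 and yields only valid permutations."""
    for p in all_perms(elements):
        if longest_run(p) <= max_run:
            yield "".join(p)


results = list(valid_perms("6667778"))
print(results[:3])
```

The outer generator has the same calling convention as the inner one, so code that consumes permutations does not need to know that filtering happens in between.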