Finding lengths between the elements of an event - arrays

I have a matrix which has 1's,-1's and zeros.. say
state=[1; 1; -1; 1; -1; 0; 0; 0; 0; 0; 0; 0; 0; 1; 0; -1; 1; 0; -1.........];
Say -1 is the start of an event and 1 is the end of it; a 0 contributes to the event's length when it lies between a -1 and the following 1. But a 0 that comes after a 1 doesn't count, because the event has already ended.
So I need the number of such events in the entire matrix and also their lengths, so my output would be
result=[2 10 2........]
and I also need the number of such events.
In the above case I would exclude my first two indices, which are 1's that don't contribute to anything.
It sounds simple, but it's been a while since I last used MATLAB. This is what I tried, but it fails because it also counts the zeros between a 1 and the next -1, which should be excluded:
result=[find(state==-1)-find(state==1)];
which is wrong.
Your help is appreciated.

Follow these steps:
1. Find all starts and all ends.
2. For each start, find the end immediately after it.
3. Subtract each start from its corresponding end found in step 2, and don't forget to add 1.
4. The number of events is immediate from that.
The interesting part is step 2. bsxfun tests, for each combination of start and end, if the start is less than the end. Then the second output of max gives the index of the first true value for each start, if any; and its first output tells you if there really was some true value, (that is, if the found index is valid).
starts = find(state(:)==-1); % // step 1
ends = find(state(:)==1); % // step 1
[valid, next_end] = max(bsxfun(@lt, starts.', ends)); %'// step 2
result = ends(next_end(valid)) - starts(valid) + 1; % // step 3
number = numel(result); % // step 4
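For readers without MATLAB, the same four steps can be sketched in Python. This is only an illustration: the function name `event_lengths` is made up here, indices are 0-based, and a binary search (`bisect`) stands in for the `bsxfun`/`max` trick to find the first end after each start.

```python
from bisect import bisect_right

def event_lengths(state):
    """Return the length of every event that starts at -1 and ends at the next 1."""
    starts = [i for i, s in enumerate(state) if s == -1]  # step 1
    ends = [i for i, s in enumerate(state) if s == 1]     # step 1 (already sorted)
    lengths = []
    for start in starts:
        k = bisect_right(ends, start)   # step 2: first end after this start
        if k < len(ends):               # a start with no later end is skipped
            lengths.append(ends[k] - start + 1)  # step 3
    return lengths

# The visible part of the question's example vector
state = [1, 1, -1, 1, -1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, -1, 1, 0, -1]
result = event_lengths(state)
number = len(result)  # step 4
print(result, number)  # [2, 10, 2] 3
```

The trailing -1 has no matching 1 in the visible data, so it produces no event, just as the `valid` mask drops it in the MATLAB version.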

Applying easing to a loop delay

Simply put, I am trying to figure out how to apply easing to a loop delay.
for i := 0; i < 114; i++ {
    // do a task
    time.Sleep(/* delay before next job */)
}
As you can see, this is very basic. Let's say I want to complete the whole loop in 3s (job completion time is negligible, on the order of microseconds or less). What is the proper way to use Penner's Equations to calculate the eased delay for each iteration?
So, with this function, to simulate an acceleration from zero-velocity, how should I use the t parameter each iteration of the loop to create the proper delay to sleep for ?
func easeInQuad(t float64) float64 {
    return math.Pow(t, 2)
}
I would be very thankful if you could help me with this. The equations themselves have not been a problem so far; what I can't work out is how to use them in my use case.
My question may look like this one at first: Applying easing to setTimeout delays, within a loop
but that one does not take the total time of the loop into account.
However, I think it may be better to use equations rewritten to use only one parameter, in range [0,1] : https://gist.github.com/rezoner/713615dabedb59a15470
From my understanding, I have to calculate the abstract "percentage time elapsed", and somehow interpolate this value with an easing function.
This Node project seems to do just that : https://github.com/CharlotteGore/animation-timer, but again I can't figure out how to reproduce it.
Penner's Equations fundamentally require two parameters: the current progress, and the total possible progress. Alternatively, however, you can instead provide it with percentage-of-total-progress as a single parameter (as a value between 0 and 1), since that's what it uses the current and total to calculate anyway.
Using your original code, if you want 114 iterations to take place within 3 seconds, the easiest way is to use your iteration index as the current progress, and 114 as the total, then multiply the calculated delay factor by your total duration 3.0s.
Note, Penner's Equations calculate total displacement from start position, NOT relative displacement for each step, so you actually need to calculate that difference yourself. Thus the delay for this iteration is the total displacement (delay) for this iteration minus the total displacement for the last iteration:
func DoStuffWithEasing() {
    iterations := 114
    runTime := 3 * time.Second
    for i := 1; i <= iterations; i++ {
        // do a task
        time.Sleep(easeInQuadDelay(i, iterations, runTime))
    }
}

func easeInQuadDelay(c, t int, dur time.Duration) time.Duration {
    if c <= 0 || t == 0 { // invalid cases
        return 0
    }
    // This return can be a single-liner, but I split it up for clarity
    // Note that time.Durations are fundamentally int64s,
    // so we can easily type convert them to float64s and back
    this := math.Pow(float64(c)/float64(t), 2)
    last := math.Pow(float64(c-1)/float64(t), 2)
    return time.Duration((this - last) * float64(dur))
}
https://play.golang.org/p/TTgZUYUvxW
Graph of the easing
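The telescoping idea is easy to check numerically. The sketch below is in Python rather than Go purely for brevity; `eased_delays` is a hypothetical helper, and it uses the single-parameter form of easeInQuad (progress in [0, 1]). Because each delay is the difference of two consecutive eased positions, the delays should add up to the total run time.

```python
def ease_in_quad(p):
    """Single-parameter easeInQuad: p is progress in [0, 1]."""
    return p * p

def eased_delays(iterations, total_seconds, ease=ease_in_quad):
    """Per-iteration delays whose sum equals total_seconds."""
    delays = []
    for i in range(1, iterations + 1):
        this = ease(i / iterations)        # eased position after step i
        last = ease((i - 1) / iterations)  # eased position after step i-1
        delays.append((this - last) * total_seconds)
    return delays

delays = eased_delays(114, 3.0)
print(len(delays))             # 114
print(round(sum(delays), 9))   # 3.0
print(delays[0] < delays[-1])  # True: the sleeps get longer each iteration
```

Passing a different single-parameter easing function as `ease` changes the shape of the delays without changing their total.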
I think you can use time.Ticker.
numLoops := 114
timePerIteration := time.Duration(3) * time.Second / time.Duration(numLoops)
ticker := time.NewTicker(timePerIteration)
for i := 0; i < numLoops; i++ {
    // your code
    <-ticker.C
}
ticker.Stop()

How do I create a "twirly" in a C program task?

Hey guys I have created a program in C that tests all numbers between 1 and 10000 to check if they are perfect using a function that determines whether a number is perfect. Once it finds these it prints them to the user, they are 6, 28, 496 and 8128. After this the program then prints out all the factors of each perfect number to the user. This is all fine. Here is my problem.
The final part of my task asks me to:
"Use a "twirly" to indicate that your program is happily working away. A "twirly" is the following characters printed over the top of each other in the following order: '|' '/' '-' '\'. This has the effect of producing a spinning wheel - ie a "twirly". Hint: to do this you can use \r (instead of \n) in printf to give a carriage return only (instead of a carriage return linefeed). (Note: this may not work on some systems - you do not have to do it this way.)"
I have no idea what a twirly is or how to implement one. My tutor said it has something to do with the sleep and delay functions which I also don't know how to use. Can anyone help me with this last stage, it sucks that all my coding is complete but I can't get this "twirly" thing to work.
If you want to perform both tasks simultaneously, that is,
1. testing the numbers and
2. displaying the twirly on screen
while the process goes on, then you had better look into using threads. Using POSIX threads, you can run the test on one thread while another thread displays the twirly to the user on the terminal.
#include<stdlib.h>
#include<pthread.h>
int Test();
void Display();
int main(){
// create threads each for both tasks test and Display
//call threads
//wait for Test thread to finish
//terminate display thread after Test thread completes
//exit code
}
For threads, refer to chapter 12 of the Beginning Linux Programming ebook.
Given the program upon which the user is "waiting", I believe the problem as stated and the solutions using sleep() or threads are misguided.
To produce all the perfect numbers below 10,000 using C on a modern personal computer takes about 1/10 of a second. So any device to show the computer is "happily working away" would either never be seen or would significantly interfere with the time it takes to get the job done.
But let's make a working twirly for perfect number search anyway. I've left off printing the factors to keep this simple. Since 10,000 is too low to see the twirly in action, I've upped the limit to 100,000:
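#include <stdio.h>
#include <string.h>

int main()
{
    const char *twirly = "|/-\\";
    for (unsigned x = 1; x <= 100000; x++)
    {
        unsigned sum = 0;
        for (unsigned i = 1; i <= x / 2; i++)
        {
            if (x % i == 0)
            {
                sum += i;
            }
        }
        if (sum == x)
        {
            printf("%u\n", x); /* %u matches the unsigned argument */
        }
        /* advance the twirly one frame every 2500 numbers */
        printf("%c\r", twirly[x / 2500 % strlen(twirly)]);
        fflush(stdout); /* stdout may be line-buffered, so push the frame out */
    }
    return 0;
}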
No need for sleep() or threads, just key it into the complexity of the problem itself and have it update at reasonable intervals.
Now here's the catch, although the above works, the user will never see a fifth perfect number pop out with a 100,000 limit and even with a 100,000,000 limit, which should produce one more, they'll likely give up as this is a bad (slow) algorithm for finding them. But they'll have a twirly to watch.
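The same progress-driven idea, sketched in Python for comparison. The function name `find_perfect`, the `every` parameter and the `out` stream argument are assumptions made for this illustration; the twirly advances once per `every` candidates tested instead of being driven by a timer or a thread.

```python
import sys

TWIRLY = "|/-\\"

def find_perfect(limit, out=sys.stdout, every=250):
    """Return the perfect numbers up to limit, drawing a twirly on `out` meanwhile."""
    found = []
    for x in range(1, limit + 1):
        # Sum of the proper divisors of x
        total = sum(i for i in range(1, x // 2 + 1) if x % i == 0)
        if total == x:
            found.append(x)
            out.write(f"{x}\n")
        if x % every == 0:
            # '\r' returns to the start of the line so the next frame overwrites this one
            out.write(TWIRLY[x // every % len(TWIRLY)] + "\r")
            out.flush()
    return found

perfect = find_perfect(1000)  # a small limit keeps the demo quick
```

Raising the limit to 10000 also finds 8128, at the cost of a noticeably longer run in pure Python.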
str as character pointer
set str = "|/-\\"
set len = 4
set p = 0
i as integer
loop i: 1 to 10000
    sum as integer
    set sum = 0
    loop j: 1 to i/2
        if i%j == 0
            sum += j
    if sum == i
        print i as a perfect number
    if i%100 == 0
        print str[p] using "%c\r" as format specifier
        increment p and assign its modulo by len to p

Arrays and for loops trouble

I am doing an array exercise and I have almost finished it, but I have trouble finishing the last part. I create two arrays that store coursework points and exam points, and then, using a third array, I calculate the module result (it is determined by both exam and coursework points). I got this part working, and assuming I have 5 modules, the output is 5 numbers. However, I also want to calculate my stage mark: if I have 5 modules, I take their marks, add them together and then divide the sum by 5. Here is my problem: I am using a for loop because that is the only way it will work (as far as I know), so given that I already have my module results, I use this for loop to calculate the stage result:
for(int i = 0; i < module_result.length; i++)
{
sum = sum + module_result[i];
System.out.println(sum/5);
}
I saw a similar question on this site and used the code from its answers. I can use an enhanced for loop as well.
So given that coursework array={45,70,60,55,80} and exam array={83,72,45,25,89}, my module results are 64,71,60,87. By using the above for loop I get this output:
10
22
32
37
52
So I get my result. It is 52. But I don't want the rest of the numbers.
My question is: how can I get just that number (52)? I guess it is not possible with a for loop because it will inevitably loop 5 times, not once. I thought about using a while loop, but I don't see how I would get a much different outcome.
I'm not sure if I totally understand the question, but I think this is what you're going for:
for(int i = 0; i < module_result.length; i++)
{
sum = sum + module_result[i];
}
System.out.println(sum/5);
All you have to do is simply move the println statement outside of the loop (if I understand the question correctly).
If you just want to print out the last number, just do a condition in the for loop that would print out at the index just before the length like this:
for (int i = 0; i < module_result.length; i++) {
if (i == module_result.length - 1) {
// print results
}
else {
// Do calculations
}
}
OK here is my code.
public int[] computeResult(int []courseWork,int[] examMarks ){
int[] module_result = new int[6];
for(int i=0;i<module_result.length;i++){//CALCULATE EACH MODULE
module_result[i]= ((courseWork[i] * cw[i]) + (examMarks[i] * (100 - cw[i]))) / 100;//THIS LINE IS SIMPLY A CONDITION HOW TO CALCULATE A MODULE YOU DON'T NEED TO KNOW WHAT IS HAPPENING INSIDE
}
int sum = 0;
for(int i = 0; i < module_result.length; i++)//USE THIS FOR LOOP TO ADD THE MODULES TOGETHER.
{
sum = sum + module_result[i];
// Add this extra line
// This allows you to only print out the value when you reach the end
if (i == module_result.length - 1) {
System.out.println(sum/6);
}
}
return module_result;
}
However, here is what happens: the first for loop calculates the module results. Say they are as follows in the output console:
64
71
60
31
87
33
Next, the second for loop adds them together: first is 64, then the loop adds 71 to 64 and gets 135, then it adds the next module result, 60, to 135, and so on until I get the total sum of all 6 modules, which in this example is 346, and then divides it by 6 to get my stage result. So I need just 346/6 in my output console. Nothing else: no zeros, nothing.
What my current code does is this: the second loop starts running; it already knows my module results (they have been calculated), and so it starts. The first one is 64; dividing it by 6 gives 10. Then the loop adds 71 to 64 to get 135 and divides it by 6, and so on until it reaches 346 and divides it by 6. So I get this output:
10
22
32
37
52
57
I don't need 10, 22, 32, 37 and 52; they hold no meaning. I just need 57. What your solution will give me is this outcome:
0
0
0
0
0
57
It still gives unnecessary numbers.
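To make the difference concrete, here is a small Python sketch (not the original Java; `stage_mark` is a hypothetical name) that only accumulates inside the loop and divides once after it, using the six module results listed above. Integer division mirrors Java's `int` arithmetic.

```python
def stage_mark(module_results):
    """Average the module results with integer division, like Java's int `/`."""
    total = 0
    for mark in module_results:
        total += mark                    # only accumulate inside the loop
    return total // len(module_results)  # divide once, after the loop

module_result = [64, 71, 60, 31, 87, 33]
print(stage_mark(module_result))  # 57
```

Nothing is printed per iteration, so the partial values 10, 22, 32, 37 and 52 never appear.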

My OpenCL code changes the output based on a seemingly noop

I'm running the same OpenCL kernel code on an Intel CPU and on a NVIDIA GPU and the results are wrong on the first but right on the latter; the strange thing is that if I do some seemingly irrelevant changes the output works as expected in both cases.
The goal of the function is to calculate the matrix multiplication between A (triangular) and B (regular), where the position of A in the operation is determined by the value of the variable left. The bug only appears when left is true and when the for loop iterates at least twice.
Here is a fragment of the code; for the sake of clarity it omits some bits that shouldn't affect the result.
__kernel void blas_strmm(int left, int upper, int nota, int unit, int row, int dim, int m, int n,
float alpha, __global const float *a, __global const float *b, __global float *c) {
/* [...] */
int ty = get_local_id(1);
int y = ty + BLOCK_SIZE * get_group_id(1);
int by = y;
__local float Bs[BLOCK_SIZE][BLOCK_SIZE];
/* [...] */
for(int i=start; i<end; i+=BLOCK_SIZE) {
if(left) {
ay = i+ty;
bx = i+tx;
}
else {
ax = i+tx;
by = i+ty;
}
barrier(CLK_LOCAL_MEM_FENCE);
/* [...] (Load As) */
if(bx >= m || by >= n)
Bs[tx][ty] = 0;
else
Bs[tx][ty] = b[bx*n+by];
barrier(CLK_LOCAL_MEM_FENCE);
/* [...] (Calculate Csub) */
}
if(y < n && x < (left ? row : m)) // In bounds
c[x*n+y] = alpha*Csub;
}
Now it gets weird.
As you can see, by always equals y if left is true. I checked (with some printfs, mind you) and left is always true, and the code on the else branch inside the loop is never executed. Nevertheless, if I remove or comment out the by = i+ty line there, the code works. Why? I don't know yet, but I thought it might be something related to by not having the expected value assigned.
My train of thought took me to check if there was ever a discrepancy between by and y, as they should always have the same value; I added a line that checked if by != y, but that comparison always returned false, as expected. So I went on and replaced one occurrence of by with y, so the line
if(bx >= m || by >= n)
transformed into
if(bx >= m || y >= n)
and it worked again, even though I'm still using the variable by properly three lines below.
With an open mind I tried some other things and I got to the point that the code works if I add the following line inside the loop, as long as it is situated at any point after the initial if/else and before the if condition that I mentioned just before.
if(y >= n) left = 1;
The code inside (left = 1) can be replaced by anything (a printf, another useless assignment, etc.), but the condition is a bit more restrictive. Here are some examples that make the code output the correct values:
if(y >= n) left = 1;
if(y < n) left = 1;
if(y+1 < n+1) left = 1;
if(n > y) left = 1;
And some that don't work, note that m = n in the particular example that I'm testing:
if(y >= n+1) left = 1;
if(y > n) left = 1;
if(y >= m) left = 1;
/* etc. */
That's the point where I am now. I have added a line that shouldn't affect the program at all but it makes it work. This magic solution is not satisfactory to me and I would like to know what's happening inside my CPU and why.
Just to be sure I'm not forgetting anything, here is the full function code and a gist with example inputs and outputs.
Thank you very much.
Solution
Both users DarkZeros and sharpneli were right about their assumptions: the barriers inside the for loop weren't being hit the same number of times by every work-item. In particular, there was a bug involving the very first element of each local group that made it run one iteration fewer than the rest, causing undefined behaviour. It was painfully obvious in hindsight.
Thank you all for your answers and time.
Have you checked that the get_local_size always returns the correct value?
You said "In short, the full length of the matrix is divided in local blocks of BLOCK_SIZE and run in parallel; ". Remember that OpenCL allows any concurrency only within a workgroup. So if you call enqueueNDrange with global size of [32,32] and local size of [16,16] it is possible that the first thread block runs from start to finish, then the second one, then third etc. You cannot synchronize between workgroups.
What are your EnqueueNDRange call(s)? Example of the calls required to get your example output would be heavily appreciated (mostly interested in the global and local size arguments).
(I'd ask this in a comment but I am a new user).
Edit (had an answer, upon verification did not have it, still need more info):
http://multicore.doc.ic.ac.uk/tools/GPUVerify/
By using that I got a complaint that a barrier could be reached by a nonuniform control flow.
It all depends on what values dim, nota and upper get. Could you provide some examples?
I did some testing. Assuming left = 1. nota != upper and dim = 32, row as 16 or 32 or whatnot, still worked and got the following result:
...
gid0: 2 gid1: 0 lid0: 14 lid1: 13 start: 0 end: 32
gid0: 2 gid1: 0 lid0: 14 lid1: 14 start: 0 end: 32
gid0: 2 gid1: 0 lid0: 14 lid1: 15 start: 0 end: 32
gid0: 2 gid1: 0 lid0: 15 lid1: 0 start: 0 end: 48
gid0: 2 gid1: 0 lid0: 15 lid1: 1 start: 0 end: 48
gid0: 2 gid1: 0 lid0: 15 lid1: 2 start: 0 end: 48
...
So if my assumptions about the variable values are even close to correct, you have a barrier divergence issue there. Some threads encounter a barrier which other threads never will. I'm surprised it did not deadlock.
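The divergence can be read straight off that log. As a rough sketch in Python (not OpenCL), assuming `BLOCK_SIZE` is 16 (a guess based on the local IDs running from 0 to 15) and that the loop is `for (i = start; i < end; i += BLOCK_SIZE)` as in the question, one can count how many times each work-item would run the loop body, and therefore each pair of barriers. The helper name `trip_count` is made up for this illustration.

```python
BLOCK_SIZE = 16  # assumption: not stated in the question

def trip_count(start, end, step=BLOCK_SIZE):
    """Iterations of `for (i = start; i < end; i += step)`."""
    return max(0, -(-(end - start) // step))  # ceiling division

# (lid0, lid1, start, end) taken from the log above; all in the same work-group
work_items = [
    (14, 13, 0, 32), (14, 14, 0, 32), (14, 15, 0, 32),
    (15, 0, 0, 48), (15, 1, 0, 48), (15, 2, 0, 48),
]

counts = {(l0, l1): trip_count(s, e) for l0, l1, s, e in work_items}
print(sorted(set(counts.values())))    # [2, 3]
print(len(set(counts.values())) == 1)  # False: the barriers are not hit uniformly
```

Work-items with lid0 = 15 would hit the barriers three times while their neighbours hit them twice, which is exactly the situation the OpenCL specification leaves undefined.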
The first way I can see this failing terribly is that you are using barriers inside a for loop.
If all the threads do not enter the for loop the same number of times, the results are completely undefined. And you clearly state that the problem only occurs if the for loop runs more than once.
Do you ensure this condition?

Dijkstra's Algorithm: Why is it needed to find minimum-distance element in the queue

I wrote this implementation of Dijkstra's Algorithm, which, at each iteration of the while loop (while Q is not empty), takes the head of the queue instead of finding the minimum element of the queue.
Here is the code I wrote:
#include <stdio.h>
#include <limits.h>
#define INF INT_MAX
int N;
int Dist[500];
int Q[500];
int Visited[500];
int Graph[500][500];
void Dijkstra(int b){
int H = 0;
int T = -1;
int j,k;
Dist[b] = 0;
Q[T+1] = b;
T = T+1;
while(T>=H){
j = Q[H];
Visited[j] = 1;
for (k = 0;k < N; k++){
if(!Visited[k] && Dist[k] > Graph[j][k] + Dist[j] && Graph[j][k] != -1){
Dist[k] = Dist[j]+Graph[j][k];
Q[T+1] = k;
T = T+1;
}
}
H = H+1;
}
}
int main(){
int src,target,m;
int a,w,b,i,j;
scanf("%d%d%d%d",&N,&m,&src,&target);
for(i = 0;i < N;i ++){
for(j = 0;j < N;j++){
Graph[i][j] = -1;
}
}
for(i = 0; i< N; i++){
Dist[i] = INF;
Visited[i] = 0;
}
for(i = 0;i < m; i++){
scanf("%d%d%d",&a,&b,&w);
a--;
b--;
Graph[a][b] = w;
Graph[b][a] = w;
}
Dijkstra(src-1);
if(Dist[target-1] == INF){
printf("NO");
}else {
printf("YES\n%d",Dist[target-1]);
}
return 0;
}
I ran this on all the test cases I could find and it gave a correct answer.
My question is: why do we need to find the min at all? Can anyone explain this to me in plain English? I also need a test case which proves my code wrong.
Take a look at this sample:
1 -(6)-> 2 -(7)-> 3
 \               /
 (7)           (2)
   \           /
    \----4----/
I.e. you have an edge with length 6 from 1 to 2, an edge with length 7 from 2 to 3, an edge with length 7 from 1 to 4 and an edge with length 2 from 4 to 3. I believe your algorithm will think the shortest path from 1 to 3 has length 13 through 2, while the best solution actually has length 9 through 4.
Hope this makes it clear.
EDIT: sorry, this example did not break the code. Have a look at this one:
8 9 1 3
1 5 6
5 3 2
1 2 7
2 3 2
1 4 7
4 3 1
1 7 3
7 8 2
8 3 2
Your output is YES 8, while the path 1->7->8->3 takes only 7. Here is a link on ideone
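That input can be replayed without a C compiler. The Python sketch below is a re-creation under assumptions, not the asker's program: `fifo_relax` mimics the question's loop (take the head of the queue, mark it visited, relax unvisited neighbours in index order), while `dijkstra_scan` picks the unvisited vertex with the smallest distance by a linear scan. Both function names are invented here.

```python
INF = float("inf")

def build(n, edge_list):
    """Undirected adjacency matrix with None for 'no edge' (vertices 1..n)."""
    g = [[None] * (n + 1) for _ in range(n + 1)]
    for a, b, w in edge_list:
        g[a][b] = g[b][a] = w
    return g

def fifo_relax(g, n, src):
    """Question's variant: process the queue head instead of the minimum."""
    dist = [INF] * (n + 1)
    visited = [False] * (n + 1)
    dist[src] = 0
    queue, head = [src], 0
    while head < len(queue):
        j = queue[head]
        head += 1
        visited[j] = True
        for k in range(1, n + 1):
            w = g[j][k]
            if w is not None and not visited[k] and dist[k] > dist[j] + w:
                dist[k] = dist[j] + w
                queue.append(k)
    return dist

def dijkstra_scan(g, n, src):
    """Dijkstra with a linear scan for the closest unvisited vertex."""
    dist = [INF] * (n + 1)
    visited = [False] * (n + 1)
    dist[src] = 0
    for _ in range(n):
        j = min((v for v in range(1, n + 1) if not visited[v]),
                key=lambda v: dist[v])
        visited[j] = True
        for k in range(1, n + 1):
            w = g[j][k]
            if w is not None and dist[k] > dist[j] + w:
                dist[k] = dist[j] + w
    return dist

edges = [(1, 5, 6), (5, 3, 2), (1, 2, 7), (2, 3, 2), (1, 4, 7),
         (4, 3, 1), (1, 7, 3), (7, 8, 2), (8, 3, 2)]
g = build(8, edges)
print(fifo_relax(g, 8, 1)[3])     # 8
print(dijkstra_scan(g, 8, 1)[3])  # 7
```

In the FIFO run, vertex 3 is marked visited while its distance is still 8, so the later, cheaper route through 7 and 8 can no longer improve it.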
I think your code has the wrong time complexity. Your code compares (almost) all pairs of nodes, which is of quadratic time complexity.
Try adding 10000 nodes with 10000 edges and see if the code can execute within 1 second.
It is always mandatory to pick the unvisited vertex with the minimum distance, otherwise you will get at least one of the edges wrong. For example, consider the following case:
4 4
1 2 8
2 4 5
1 3 2
3 2 1
    (8)     (5)
1-------2-------4
 \     /
  \   /
(2)\ /(1)
    3
and we start with vertex 1
distance[1]=0
when you have visited vertex 1 you have relaxed vertex 2 and vertex 3
so now
distance[2]=8 and distance[3]=2
after this, if we don't select the minimum and choose vertex 2 instead, we get
distance[4]=13
and then select vertex 3 which will give
distance[2]=3
and hence we end up with distance[4]=13 which should have been
distance[4]=8
hence we should choose minimum from unvisited at each stage of Dijkstra which can be efficiently done using priority_queue.
If you run the algorithm for the following graph it depends on the order of the children. Let's say we are looking for shortest path from 1 to 4.
If you start from the queue with 1,
dist[1] = 0
dist[2] = 21
dist[3] = 0
and seen = {1}, while 2 and 3 are pushed onto the queue. Now if we consume 2 from the queue, it will make dist[4] = 51, seen = {1,2}, q = [1,2,3,4], and the next time, when 3 is consumed from the queue, 2 won't be added to the queue again since it is already in seen. Hence the algorithm will later update the distance to 12+31=43 via the path 1->3->5->4; however, the shortest path is 32 and it goes through 1->3->2->4.
Let me discuss some other aspects with code examples. Let's say we have a connection list of (u,v,w) where node u has a weighted and directed edge to v with weight w. And let's prepare the graph and edges as below:
graph, edges = {i: set() for i in range(1, N+1)}, dict()
for u, v, w in connection_list:
    graph[u].add(v)
    edges[(u, v)] = w
ALGORITHM1 - Pick any child to add if not visited
from collections import deque

q = deque([start])
seen = set()
dist = {i: float('inf') for i in range(1, N+1)}
dist[start] = 0
while q:
    top = q.pop()
    seen.add(top)
    for v in graph[top]:
        dist[v] = min(dist[v], dist[top] + edges[(top, v)])
        if v not in seen:
            q.appendleft(v)
This one is already discussed above and it will give us the incorrect result 43 instead of 32 for the shortest path between 1 and 4.
The problem was that 2 was not re-added to the queue, so let's get rid of the seen check and add the children again.
ALGORITHM2 - Add all children to the queue again
while q:
    top = q.pop()
    seen.add(top)
    for v in graph[top]:
        dist[v] = min(dist[v], dist[top] + edges[(top, v)])
        q.appendleft(v)
This will work in that case, but it works only for this example. There are two issues with this algorithm:
1. We are adding the same nodes again, so for a bigger example the complexity will depend on the number of edges E instead of the number of nodes V, and for a dense graph we can assume O(E) = O(N^2).
2. If we add cycles to the graph, it will run forever since there is no check to stop it. So this algorithm is not a fit for cyclic graphs.
That's why we have to spend extra time picking the minimum child. If we do it with a linear search, we end up with the same complexity as above, but if we use a priority queue we can reduce the min search to O(lgN) instead of O(N). Here is the linear-search update to the code.
ALGORITHM3 - Dirty Dijkstra's Algorithm with linear minimum search
q = [start]
seen = set()
dist = {i: float('inf') for i in range(1, N+1)}
dist[start] = 0
while q:
    # linear search for the queued node with the smallest distance
    min_dist, top = min((dist[i], i) for i in q)
    q.remove(top)
    seen.add(top)
    for v in graph[top]:
        dist[v] = min(dist[v], dist[top] + edges[(top, v)])
        if v not in seen:
            q.append(v)
Now that we know the thought process, we can remember to use a heap to get the optimal Dijkstra's algorithm next time.
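For completeness, a heap-based version might look like the sketch below. It is one common way to write it with Python's `heapq`, not necessarily what the author had in mind; it reuses the `graph`/`edges` layout from above, and the four-vertex input is borrowed from the earlier undirected example (1-2: 8, 2-4: 5, 1-3: 2, 3-2: 1), entered in both directions. Stale heap entries are skipped rather than removed.

```python
import heapq

def dijkstra(graph, edges, start):
    """Shortest distances from start using a binary heap as the priority queue."""
    dist = {v: float("inf") for v in graph}
    dist[start] = 0
    heap = [(0, start)]
    seen = set()
    while heap:
        d, top = heapq.heappop(heap)   # unvisited vertex with the minimum distance
        if top in seen:
            continue                   # stale entry: top was already finalised
        seen.add(top)
        for v in graph[top]:
            nd = d + edges[(top, v)]
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

connection_list = [(1, 2, 8), (2, 4, 5), (1, 3, 2), (3, 2, 1)]
N = 4
graph, edges = {i: set() for i in range(1, N + 1)}, dict()
for u, v, w in connection_list:
    for a, b in ((u, v), (v, u)):      # store each undirected edge both ways
        graph[a].add(b)
        edges[(a, b)] = w

print(dijkstra(graph, edges, 1)[4])  # 8
```

Each pop costs O(lg N), and because a vertex is only finalised when it is the closest one left, vertex 2 is settled at distance 3 before it is used to reach vertex 4.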
