vf_codecview.c printing motion vectors in 'future' only

I am trying to understand the way ffmpeg draws motion vectors.
I went through the vf_codecview.c file and saw that draw_arrow is called with mv->source > 0, which I took to mean that only vectors from the future are drawn.
Does anyone know why this is? And what is the use of computing both past and future vectors if ffmpeg only draws the future ones in this file?

That doesn't appear to be the case.
source can take either a positive or a negative value:
/**
* Where the current macroblock comes from; negative value when it comes
* from the past, positive value when it comes from the future.
* XXX: set exact relative ref frame reference instead of a +/- 1 "direction".
*/
int32_t source;
In the call you linked to, we have
if ((direction == 0 && (s->mv & MV_P_FOR) && frame->pict_type == AV_PICTURE_TYPE_P) ||
(direction == 0 && (s->mv & MV_B_FOR) && frame->pict_type == AV_PICTURE_TYPE_B) ||
(direction == 1 && (s->mv & MV_B_BACK) && frame->pict_type == AV_PICTURE_TYPE_B))
draw_arrow(frame->data[0], mv->dst_x, mv->dst_y, mv->src_x, mv->src_y,
frame->width, frame->height, frame->linesize[0],
100, 0, mv->source > 0);
If the macroblock is predicted from a past frame, the last expression evaluates to 0 and is passed as such; otherwise it is 1.
The conditional clearly allows for both past- and future-predicted macroblocks.
P.S. You are looking at the source for version 2.5, which is very old at this point. The current version is at https://ffmpeg.org/doxygen/trunk/vf__codecview_8c_source.html#l00256


Manage values from encoder to understand the direction of the encoder itself (counterclockwise or clockwise)

I'm sampling an encoder whose value lies in the interval [30, 230]; I have to use this value to drive two output counter variables (one increasing counter and one decreasing counter).
The problem is that sometimes, when there is a rollover (the encoder passes from 230 to 30 or vice versa), the sampling is too slow and I lose the direction of the movement (counterclockwise or clockwise), which results in wrong behaviour.
Example:
If the encoder is at 220 and I move it really fast in the clockwise direction, my next value is, for example, 100, which means that the value passed through 30 (rollover): the direction should be clockwise. But the software thinks that I moved the encoder from 220 down to 100 and reports a counterclockwise movement.
Keep in mind that I cannot increase the sampling rate; it is fixed.
It's in a real-time environment.
If you cannot guarantee that the encoder will not move more than half the range in one polling period, then the problem cannot be solved. If you assume that it will not move that far, then it is solvable: you simply assume that the movement between two polling events was the shorter of the two possible directions.
You don't explain why your encoder range starts from a non-zero value, but the arithmetic is easier to follow (and code) if you remove that offset and work with a range of 0 to 200 by subtracting the offset.
Given an encoder read function uint8_t ReadEnc() for example:
#define ENCODER_RANGE 200
#define ENCODER_OFFSET 30 // remove offset for range 0 to 200
static unsigned ccw_counter = 0 ;
static unsigned cw_counter = 0 ;
// Note: in C a static cannot be initialised by a function call;
// take this first reading once at start-up instead.
static uint8_t previous_enc = ReadEnc() - ENCODER_OFFSET ;
uint8_t enc = ReadEnc() - ENCODER_OFFSET ;
signed enc_diff = enc - previous_enc ;
previous_enc = enc ;
// If absolute difference is greater than half the range
// assume that it rotated the opposite way.
if( enc_diff > ENCODER_RANGE / 2)
{
enc_diff = -(ENCODER_RANGE - enc_diff) ;
}
else if( enc_diff < -(ENCODER_RANGE / 2) )
{
enc_diff = (ENCODER_RANGE + enc_diff) ;
}
// Update counters
if( enc_diff < 0 )
{
// Increment CCW counter
ccw_counter -= enc_diff ;
}
else
{
// Increment CW counter
cw_counter += enc_diff ;
}
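The wrap-around correction above can also be packaged as a small function that works directly on two raw readings. This is only a sketch under the same assumptions as the answer (a range of 200 counts, and never more than half a revolution between two samples); the name encoder_delta is made up for illustration. Because the offset of 30 cancels when two readings are subtracted, it does not appear in the code.

```c
#include <stdint.h>

#define ENCODER_RANGE 200  /* counts per revolution, as assumed in the answer */

/* Hypothetical helper: signed shortest-path movement between two raw readings.
 * A positive result means clockwise, a negative one counterclockwise. */
int encoder_delta(uint8_t prev_raw, uint8_t curr_raw)
{
    int diff = (int)curr_raw - (int)prev_raw;

    if (diff > ENCODER_RANGE / 2)
        diff -= ENCODER_RANGE;      /* really a shorter move the other way */
    else if (diff < -(ENCODER_RANGE / 2))
        diff += ENCODER_RANGE;

    return diff;
}
```

For the example in the question, encoder_delta(220, 100) gives 80 (clockwise through the rollover), while encoder_delta(100, 220) gives -80.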

Detecting collision with sprites made of multiple pixel widths and heights

Context: Developing a small game on a microprocessor displayed on an LCD screen.
I'm trying to fix this collision detection function. It detects a collision between a wall sprite (1 x 25 pixels) and a player sprite (3 x 3 pixels) and returns 1 or 0; if it returns 1, the player sprite's dx/dy is changed so that it stops moving. Essentially, the wall sprite is treated as a real wall.
int wall_collision(Sprite *w_sprite)
{
if (((w_sprite->x >= wall_sprite.x) && ((w_sprite->x - wall_sprite.x) < 3)) && ((w_sprite->y >= wall_sprite.y) && ((w_sprite->y - wall_sprite.y) < 3)))
return 1;
if (((w_sprite->x <= wall_sprite.x) && ((w_sprite->x - wall_sprite.x) > -3)) && ((w_sprite->y >= wall_sprite.y) && ((w_sprite->y - wall_sprite.y) < 3)))
return 1;
if (((w_sprite->x >= wall_sprite.x) && ((w_sprite->x - wall_sprite.x) < 3)) && ((w_sprite->y <= wall_sprite.y) && ((w_sprite->y - wall_sprite.y) > -3)))
return 1;
if (((w_sprite->x <= wall_sprite.x) && ((w_sprite->x - wall_sprite.x) > -3)) && ((w_sprite->y <= wall_sprite.y) && ((w_sprite->y - wall_sprite.y) > -3)))
return 1;
return 0;
}
My main issue is choosing the exact thresholds the sprite position should be compared against; as you can see in the example, they are 3 and -3. When I take the numbers out, the function returns 1 and stops the sprite regardless of where it is, because the sprite is technically still on the same x or y axis as the wall even though it is not touching the wall. What are the correct size parameters for this?
Case problem: my sprite should only stop when it is directly touching the wall. Currently it either passes through the wall or stops when it is not even close to the wall.
First of all, your code seems a little complex. Below is the canonical (or so I believe) method of detecting collisions. With this function, instead of having to check each collision manually, we can detect any collision between colliders A and B. Keep in mind that the collider struct used here would have to contain the top, bottom, left and right coordinates of each collider. You can then store all colliders in an array and index through them to check for collisions. The function:
int collisionFunction(collider * A, collider * B){
//Check to see if the colliders are "lined up" on the X-axis
if( (A->right > B->left ) && (B->right > A->left) ){
//Check to see if the colliders are also "lined up" on the Y-axis
if( (A->top < B->bottom) && (B->top < A->bottom) ){
return 1; // COLLISION DETECTED
}
}
return 0;//NO COLLISION DETECTED
}
Explanation of the function/algorithm: First of all, we check to see if A and B are "lined up" on the x axis. By "lined up", I mean to say that two colliders could be colliding based off their position on the X-axis. Then, we check to see if each collider is lined up on the Y-axis. If both conditions are met, then the colliders are colliding. It can be a little hard to grasp this at first so I suggest you trace this by drawing out shapes on paper (some colliding, others not) and see whether the algorithm says they're colliding or not. This algorithm will work for coordinate systems where the origin (i.e. (0,0) ) is in the top left corner of the screen, which is the convention for 2D graphics.
Keep in mind that your player would go through the wall partially when using this algorithm - this is very common in 2D games. But, given the number of pixels you're using, this could obviously be a problem. Therefore, you should take that into account when implementing this algorithm.
How about this?
int wall_collision(Sprite *w_sprite)
{
if(w_sprite->left >= wall_sprite->right) return 0;
if(w_sprite->right <= wall_sprite->left) return 0;
if(w_sprite->top >= wall_sprite->bottom) return 0;
if(w_sprite->bottom <= wall_sprite->top) return 0;
return 1;
}
Left/Right/Top/Bottom could be values or functions, or just replace them with the actual values. The "left" and "top" would be the same as the x/y value of the sprite or wall. The "right" and "bottom" would be the x/y + the width/height in pixels of the sprite or wall, respectively.
Take a look at this link for a more in-depth tutorial on simple collision detection: http://lazyfoo.net/SDL_tutorials/lesson17/index.php
EDIT: The example code assumes a coordinate system where x increases as you go right, and y increases as you go down.
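To connect both answers to the sprite sizes in the question, here is a minimal sketch that derives right and bottom from a position plus a width and height. The Rect type and the rects_overlap name are hypothetical and not part of the original code; the coordinate system is the one assumed above (x grows to the right, y grows downward).

```c
#include <stdbool.h>

/* Hypothetical rectangle: (x, y) is the top-left pixel, w and h are in pixels. */
typedef struct {
    int x, y, w, h;
} Rect;

/* Returns true when the two rectangles share at least one pixel. */
bool rects_overlap(const Rect *a, const Rect *b)
{
    return a->x < b->x + b->w &&   /* a's left is before b's right  */
           b->x < a->x + a->w &&   /* b's left is before a's right  */
           a->y < b->y + b->h &&   /* a's top is above b's bottom   */
           b->y < a->y + a->h;     /* b's top is above a's bottom   */
}
```

With a wall of {10, 0, 1, 25} and a 3 x 3 player, a player at {8, 5, 3, 3} overlaps the wall, while one at {7, 5, 3, 3} or {11, 5, 3, 3} does not. One possible way to avoid the partial pass-through mentioned above is to test the position the player is about to move to (x + dx, y + dy) rather than the current one.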

My OpenCL code changes the output based on a seemingly noop

I'm running the same OpenCL kernel code on an Intel CPU and on an NVIDIA GPU; the results are wrong on the former but right on the latter. The strange thing is that if I make some seemingly irrelevant changes, the output is as expected in both cases.
The goal of the function is to calculate the matrix multiplication between A (triangular) and B (regular), where the position of A in the operation is determined by the value of the variable left. The bug only appears when left is true and when the for loop iterates at least twice.
Here is a fragment of the code, omitting some bits that shouldn't matter, for the sake of clarity.
__kernel void blas_strmm(int left, int upper, int nota, int unit, int row, int dim, int m, int n,
float alpha, __global const float *a, __global const float *b, __global float *c) {
/* [...] */
int ty = get_local_id(1);
int y = ty + BLOCK_SIZE * get_group_id(1);
int by = y;
__local float Bs[BLOCK_SIZE][BLOCK_SIZE];
/* [...] */
for(int i=start; i<end; i+=BLOCK_SIZE) {
if(left) {
ay = i+ty;
bx = i+tx;
}
else {
ax = i+tx;
by = i+ty;
}
barrier(CLK_LOCAL_MEM_FENCE);
/* [...] (Load As) */
if(bx >= m || by >= n)
Bs[tx][ty] = 0;
else
Bs[tx][ty] = b[bx*n+by];
barrier(CLK_LOCAL_MEM_FENCE);
/* [...] (Calculate Csub) */
}
if(y < n && x < (left ? row : m)) // In bounds
c[x*n+y] = alpha*Csub;
}
Now it gets weird.
As you can see, by always equals y if left is true. I checked (with some printfs, mind you) and left is always true, and the code on the else branch inside the loop is never executed. Nevertheless, if I remove or comment out the by = i+ty line there, the code works. Why? I don't know yet, but I thought it might be something related to by not having the expected value assigned.
My train of thought took me to check whether there was ever a discrepancy between by and y, as they should always have the same value; I added a line that checked if by != y, but that comparison always returned false, as expected. So I went on and replaced one occurrence of by with y, so the line
if(bx >= m || by >= n)
transformed into
if(bx >= m || y >= n)
and it worked again, even though I'm still using the variable by properly three lines below.
With an open mind I tried some other things and I got to the point that the code works if I add the following line inside the loop, as long as it is situated at any point after the initial if/else and before the if condition that I mentioned just before.
if(y >= n) left = 1;
The statement inside (left = 1) can be replaced with anything (a printf, another useless assignment, etc.), but the condition is a bit more restrictive. Here are some examples that make the code output the correct values:
if(y >= n) left = 1;
if(y < n) left = 1;
if(y+1 < n+1) left = 1;
if(n > y) left = 1;
And some that don't work, note that m = n in the particular example that I'm testing:
if(y >= n+1) left = 1;
if(y > n) left = 1;
if(y >= m) left = 1;
/* etc. */
That's the point where I am now. I have added a line that shouldn't affect the program at all but it makes it work. This magic solution is not satisfactory to me and I would like to know what's happening inside my CPU and why.
Just to be sure I'm not forgetting anything, here is the full function code and a gist with example inputs and outputs.
Thank you very much.
Solution
Both users DarkZeros and sharpneli were right in their assumptions: the barriers inside the for loop weren't being hit the same number of times by every work-item. In particular, there was a bug involving the very first element of each local group that made it run one iteration fewer than the rest, causing undefined behaviour. It was painfully obvious in hindsight.
Thank you all for your answers and time.
Have you checked that the get_local_size always returns the correct value?
You said "In short, the full length of the matrix is divided in local blocks of BLOCK_SIZE and run in parallel; ". Remember that OpenCL allows any concurrency only within a workgroup. So if you call enqueueNDrange with global size of [32,32] and local size of [16,16] it is possible that the first thread block runs from start to finish, then the second one, then third etc. You cannot synchronize between workgroups.
What are your EnqueueNDRange call(s)? Example of the calls required to get your example output would be heavily appreciated (mostly interested in the global and local size arguments).
(I'd ask this in a comment but I am a new user).
Edit (I had an answer, but upon verification it did not hold up; I still need more info):
http://multicore.doc.ic.ac.uk/tools/GPUVerify/
By using that I got a complaint that a barrier could be reached by a nonuniform control flow.
It all depends on what values dim, nota and upper get. Could you provide some examples?
I did some testing. Assuming left = 1. nota != upper and dim = 32, row as 16 or 32 or whatnot, still worked and got the following result:
...
gid0: 2 gid1: 0 lid0: 14 lid1: 13 start: 0 end: 32
gid0: 2 gid1: 0 lid0: 14 lid1: 14 start: 0 end: 32
gid0: 2 gid1: 0 lid0: 14 lid1: 15 start: 0 end: 32
gid0: 2 gid1: 0 lid0: 15 lid1: 0 start: 0 end: 48
gid0: 2 gid1: 0 lid0: 15 lid1: 1 start: 0 end: 48
gid0: 2 gid1: 0 lid0: 15 lid1: 2 start: 0 end: 48
...
So if my assumptions about the variable values are even close to correct, you have a barrier divergence issue there. Some threads encounter a barrier that other threads never will. I'm surprised it did not deadlock.
The first thing I can see that could fail badly is that you are using barriers inside a for loop.
If the threads do not all run the for loop the same number of times, the results are completely undefined. And you clearly state that the problem only occurs if the for loop runs more than once.
Do you ensure this condition?
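The condition asked about here can be checked on the host in plain C before touching the kernel. The sketch below is not OpenCL code and not part of the original kernel: trip_count and uniform_trip_count are hypothetical helpers that only count how often a loop of the form for (i = start; i < end; i += step) would run for each work-item of a group. The step of 16 used in the note afterwards is an assumption, inferred from the local ids 0 to 15 in the log above.

```c
#include <stdbool.h>

/* Number of iterations of: for (i = start; i < end; i += step), with step > 0. */
int trip_count(int start, int end, int step)
{
    if (end <= start)
        return 0;
    return (end - start + step - 1) / step;
}

/* Returns true when every work-item of a group would run the loop the same
 * number of times, given each item's start and end bound. */
bool uniform_trip_count(const int *start, const int *end, int n_items, int step)
{
    int first = trip_count(start[0], end[0], step);

    for (int k = 1; k < n_items; ++k) {
        if (trip_count(start[k], end[k], step) != first)
            return false;   /* this item would hit the barriers a different number of times */
    }
    return true;
}
```

With the bounds from the log (start 0 and end 32 for most items, but end 48 for the items with lid0 = 15), the trip counts are 2 and 3, so the check reports a non-uniform loop.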

Understanding Matlab code

I've got some code, and I've been trying to make some minor tweaks to it. It used to use fgets to load in a single character from a line, and use it to colour points in a 3D plot. So it would read
a
p
p
n
c
and then use other data files to assign what x, y, z points to give these. The result is a really pretty 3D plot.
I've edited the input file so it reads
0
1
1
0
2
2
0
and I want it to colour numbers the same colour.
This is where I've gotten so far with the code:
function PlotCluster(mcStep)
clear all
filename = input('Please enter filename: ', 's');
disp('Loading hopping site coordinates ...')
load x.dat
load y.dat
load z.dat
temp = z;
z = x;
x = temp;
n_sites = length(x);
disp('Loading hopping site types ...')
fp = fopen([filename]);
data = load(filename); %# Load the data
% Plot the devices
% ----------------
disp('Plotting the sample surface ...')
figure
disp('Hello world!')
ia = data == 0;
in = data == 1;
ip = data == 2;
disp('Hello Again')
plot3(x(ia),y(ia),z(ia),'b.') %,'MarkerSize',4)
hold on
plot3(x(ic),y(ic),z(ic),'b.') %,'MarkerSize',4)
plot3(x(in),y(in),z(in),'g.') %,'MarkerSize',4)
plot3(x(ip),y(ip),z(ip),'r.') %,'MarkerSize',4)
daspect([1 1 1])
set(gca,'Projection','Perspective')
set(gca,'FontSize',16)
axis tight
xlabel('z (nm)','FontSize',18)
ylabel('y (nm)','FontSize',18)
zlabel('x (nm)','FontSize',18)
%title(['Metropolis Monte Carlo step ' num2str(mcStep)])
view([126.5 23])
My issue is I'm getting this error
Index exceeds matrix dimensions.
Error in PlotCluster (line 34)
plot3(x(ia),y(ia),z(ia),'b.') %,'MarkerSize',4)
And I don't see why ia would go out of bounds of the x array. Is it to do with changing the fgets to a load statement? It was the only way to get it to read the correct numbers in (not 49s and 50s, which was very odd).
The main bits that are confusing me are these lines (where the number used to correspond to 'a', 'n', 'p', etc.):
ia = data == 0;
in = data == 1;
ip = data == 2;
They look like implied if statements with assignment from data to ia etc. where ia becomes an array. But I'm not sure.
Any help understanding this would be greatly appreciated.
I've fixed the issue: I hadn't updated my input correctly. To clear this up for anyone who comes to this question: ia = data == 0 means 'make an array the same size as data, and fill it with 1 or 0 depending on whether the test (data == 0) is true or false for each element'. Using ia as an index then selects the elements of x where ia is 1.
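For readers more used to loops, the same selection can be spelled out in C. This is only an illustrative sketch, not a translation of the MATLAB internals; select_where_equal is a made-up name. It also shows, as far as I understand the error, why a data file with more entries than x can fail: a selected position then points past the end of x.

```c
#include <stddef.h>

/* Copies x[i] into out for every i where data[i] == wanted and returns how
 * many values were copied. Returns -1 if a selected index lies outside x. */
long select_where_equal(const double *x, size_t n_x,
                        const int *data, size_t n_data,
                        int wanted, double *out)
{
    long count = 0;

    for (size_t i = 0; i < n_data; ++i) {
        if (data[i] == wanted) {    /* the mask element would be 1 here */
            if (i >= n_x)
                return -1;          /* selected index exceeds the size of x */
            out[count++] = x[i];
        }
    }
    return count;
}
```

With the seven type values from the question (0 1 1 0 2 2 0) and seven coordinates, selecting type 0 picks positions 1, 4 and 7 in MATLAB's numbering; with only five coordinates the last selection falls outside x.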

How does linux handle overflow in jiffies?

Suppose we have the following code:
if (timeout > jiffies)
{
/* we did not time out, good ... */
}
else
{
/* we timed out, error ... */
}
This code works fine when the jiffies value does not overflow.
However, when jiffies overflows and wraps around to zero, this code doesn't work properly.
Linux apparently provides macros for dealing with this overflow problem
#define time_before(unknown, known) ((long)(unknown) - (long)(known) < 0)
and the code above is supposed to be safe against overflow when replaced with this macro:
// SAFE AGAINST OVERFLOW
if (time_before(jiffies, timeout))
{
/* we did not time out, good ... */
}
else
{
/* we timed out, error ... */
}
But what is the rationale behind time_before (and the other time_ macros)?
time_before(jiffies, timeout) will be expanded to
((long)(jiffies) - (long)(timeout) < 0)
How does this code prevent overflow problems?
Let's actually give it a try:
#define time_before(unknown, known) ((long)(unknown) - (long)(known) < 0)
I'll simplify things down a lot by saying that a long is only two bytes, so in hex it can have a value in the range [0, 0xFFFF].
Now, it's signed, so the range [0, 0xFFFF] can be broken into two separate ranges [0, 0x7FFF], [0x8000, 0xFFFF]. Those correspond to the values [0, 32767], [ -32768, -1]. Here's a diagram:
[0x0 - - - 0xFFFF]
[0x0 0x7FFF][0x8000 0xFFFF]
[0 32,767][-32,768 -1]
Say timeout is 32,000. We want to check if we're inside our timeout, but in truth we overflowed, so jiffies is -31,000. So if we naively tried to evaluate jiffies < timeout we'd get True. But, plugging in the values:
time_before(jiffies, timeout)
== ((long)(jiffies) - (long)(timeout) < 0)
== (-31000 - 32000 < 0) // does not fit in 16 bits, so clearly NOT -63000
== (-31000 - 1768 - 1 - 30231 < 0) // simply expanded 32000
== (-32768 - 1 - 30231 < 0) // this -1 wraps around to 32767
== (32767 - 30231 < 0)
== (2536 < 0)
== False
jiffies are 4 bytes, not 2, but the same principle applies. Does that help at all?
See for example here: http://fixunix.com/kernel/266713-%5Bpatch-1-4%5D-fs-autofs-use-time_before-time_before_eq-etc.html
Code with checking overflow against some fixed small constant was converted to use time_before. Why?
I'm just summarizing the comment that goes with the definition of the
time_after etc functions:
include/linux/jiffies.h:93
93 /*
94 * These inlines deal with timer wrapping correctly. You are
95 * strongly encouraged to use them
96 * 1. Because people otherwise forget
97 * 2. Because if the timer wrap changes in future you won't have to
98 * alter your driver code.
99 *
100 * time_after(a,b) returns true if the time a is after time b.
101 *
So time_before and time_after are the recommended way of handling overflow.
Your test case is more likely to be timeout < jiffies (without overflow) than timeout > jiffies (with overflow):
unsigned long jiffies = 2147483658;
unsigned long timeout = 10;
And if you change timeout to
unsigned long timeout = -2146483000;
what will the answer be?
Or you can change the check from
printf("%d",time_before(jiffies,timeout));
to
printf("%d",time_before(jiffies,old_jiffies+timeout));
where old_jiffies is the value of jiffies saved when the timer was started.
So I think time_before can be used like this:
old_jiffies=jiffies;
timeout=10; // or even 10*HZ for ten seconds
do_a_long_work_or_wait();
// check whether the timeout has been reached
if(time_before(jiffies,old_jiffies+timeout) ) {
do_another_long_work_or_wait();
} else {
printk("ERROR: the timeout has been reached; there is a problem");
panic();
}
Given that jiffies is an unsigned value, a simple comparison is safe across one wraparound point (where signed values would jump from positive to negative) but not safe across the other point (where signed values would jump from negative to positive, and where unsigned values jump from high to low). It's protection against this second point that the macro is intended to solve.
There is a fundamental assumption that timeout was initially calculated as jiffies + some_offset at some prior recent point in time -- specifically, less than half the range of the variables. If you're trying to measure times longer than this then things break down and you'll get the wrong answer.
If we pretend that jiffies is 16-bit wide for convenience in the explanation (similar to the other answers):
timeout > jiffies
This is an unsigned comparison that is intended to return true if we have not yet reached the timeout. Some examples:
timeout == 0x0300, jiffies == 0x0100: result is true, as expected.
timeout == 0x8100, jiffies == 0x7F00: result is true, as expected.
timeout == 0x0100, jiffies == 0xFF00: oops, result is false, but we haven't really reached the timeout, it just wrapped the counter.
timeout == 0x0100, jiffies == 0x0300: result is false, as expected.
timeout == 0x7F00, jiffies == 0x8100: result is false, as expected.
timeout == 0xFF00, jiffies == 0x0100: oops, result is true, but we did pass the timeout.
time_before(jiffies, timeout)
This does a signed comparison on the difference of the values rather than the values themselves, and again is expected to return true if the timeout has not yet been reached. Provided that the assumption above is upheld, the same examples:
timeout == 0x0300, jiffies == 0x0100: result is true, as expected.
timeout == 0x8100, jiffies == 0x7F00: result is true, as expected.
timeout == 0x0100, jiffies == 0xFF00: result is true, as expected.
timeout == 0x0100, jiffies == 0x0300: result is false, as expected.
timeout == 0x7F00, jiffies == 0x8100: result is false, as expected.
timeout == 0xFF00, jiffies == 0x0100: result is false, as expected.
If the offset you used when calculating timeout is too large or you allow too much time to pass after calculating timeout, then the result can still be wrong. eg. if you calculate timeout once but then just keep testing it repeatedly, then time_before will initially be true, then change to false after the offset time has passed -- and then change back to true again after 0x8000 time has passed (however long that is; it depends on the tick rate). This is why when you reach the timeout, you're supposed to remember this and stop checking the time (or recalculate a new timeout).
In the real kernel, jiffies is longer than 16 bits so it will take longer for it to wrap, but it's still possible if the machine is run for long enough. (And typically it's set to wrap shortly after boot, to catch these bugs more quickly.)
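The two tables above can be reproduced with a small user-space sketch that uses 16-bit values, as in the explanation. These are not the kernel macros: naive_before and wrap_safe_before are made-up names, and the sketch subtracts as unsigned before reinterpreting the difference as signed, which relies on the usual two's-complement conversion to int16_t.

```c
#include <stdbool.h>
#include <stdint.h>

/* Naive check: plain unsigned comparison, equivalent to timeout > now. */
bool naive_before(uint16_t now, uint16_t timeout)
{
    return now < timeout;
}

/* Wrap-safe check: take the difference modulo 2^16, then look at its sign.
 * Only valid while now and timeout are less than half the range apart. */
bool wrap_safe_before(uint16_t now, uint16_t timeout)
{
    return (int16_t)(uint16_t)(now - timeout) < 0;
}
```

For timeout == 0x0100 and jiffies == 0xFF00, naive_before returns false while wrap_safe_before returns true; for timeout == 0xFF00 and jiffies == 0x0100 it is the other way round, matching the two "oops" rows.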
I couldn't easily understand the above answers, so hoping to help with my own:
#define time_after(a,b) ((long)((b) - (a)) < 0)
Here the cast to long makes the difference signed.
Example overflow:
For convenience, imagine 8-bit integers, jiffy1 is changing and timeout is fixed and greater than jiffy1
like : jiffy1 = 252, timeout = 254 and jiffy2 becomes 0 or 1 after overflow
When we use unsigned:
jiffy1 < timeout and
jiffy2 < timeout (mistakenly due to overflow which we need to fix via MACRO)
When we use signed:
jiffy1 < timeout (more-negative < less-negative)
and
jiffy2 > timeout (positive > negative)
(because it will consider MSB as the sign bit, hence timeout will appear negative while our jiffy2 has become positive due to the overflow)
Do correct me if there is something wrong
