Having trouble drawing the hands of an analog clock


I have an assignment to make an analog clock in C on a PIC18 starter kit.
I need to draw all the hands of the clock: seconds, minutes and hours.
I have initialized all 60 points on the clock's circumference in a 2D array of x and y values, and I have the coordinates of the clock center.
To draw the hands, I've been provided with a drawLine function for this assignment, which looks like this:
void drawLine( BYTE x0, BYTE y0, BYTE x1, BYTE y1, LineWidth lw )
x0, y0 is where the line starts, and x1, y1 is where the line ends.
The drawLine function works like an XOR, so if I call it again with the same values the line disappears from the screen.
Screen coordinates start at x=0, y=0 in the top-left corner of the screen.
I have built a function to draw the hour hand of the clock.
I have an index i which increments by 1 for each coordinate of the clock, and the calculation goes like this:
center[0][0]+(cord[i][0]-center[0][0])/2
But for some reason it only works in the 4th quadrant of the clock (when i is between 15 and 30); otherwise the lines it draws don't resemble a clock hand.
Below is the full code for reference, but I would like to know what is wrong with my function, and what I need to do to make it draw correctly in the rest of the quadrants.
BYTE cord[60][2] = {
{67,0},{71,1},{74,2},{79,3},{81,4},
{82,5},{84,6},{88,9},{91,12},{92,14},
{93,16},{95,18},{96,21},{97,25},{98,29},
{98,32},{98,35},{97,39},{96,43},{95,45},
{93,47},{92,50},{89,53},{85,56},{84,58},
{82,58},{79,60},{76,61},{73,62},{70,63},
{67,63},{64,63},{60,62},{57,61},{54,60},
{51,58},{47,56},{45,54},{43,52},{41,50},
{40,47},{38,44},{37,41},{36,38},{35,35},
{35,32},{35,29},{36,26},{37,22},{38,18},
{40,16},{41,13},{44,10},{47,7},{49,6},
{51,5},{52,4},{54,3},{57,2},{61,1}};
void main(void)
{
BYTE xtemp, ytemp ;
BYTE i = 0;
BYTE center[1][2] = {{67,32}};
InitializeSystem();
while(1)
{
xtemp = center[0][0]+(cord[i][0]-center[0][0])/2;
ytemp = center[0][1]+(cord[i][1]-center[0][1])/2;
drawLine( center[0][0], center[0][1], xtemp, ytemp, thick ) ;
DelayMs(50);
drawLine( center[0][0], center[0][1], xtemp, ytemp, thick );
i++;
if(i>60)
i = 0;
}
}

There is no standard datatype BYTE in C. I guess you have a typedef for an unsigned type.
When you do a calculation that can go negative, like:
xtemp = center[0][0]+(cord[i][0]-center[0][0])/2;
you need a signed type like int.
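A minimal sketch of that fix, assuming BYTE is a typedef for unsigned char and that the compiler may evaluate BYTE expressions in 8 bits (some PIC18 compilers do this unless integer promotion is enabled). The helper name half_way is made up for illustration. The subtraction is forced into int, so a rim point to the left of or above the center produces a negative offset instead of wrapping around.

```c
typedef unsigned char BYTE;  /* assumption: BYTE is an unsigned 8-bit type */

/* Point halfway between a center coordinate and a rim coordinate.
   The arithmetic is done in int so the offset may be negative. */
BYTE half_way(BYTE center, BYTE rim)
{
    int offset = (int)rim - (int)center;      /* negative left of / above the center */
    return (BYTE)((int)center + offset / 2);  /* result is back in the 0..255 range */
}
```

With this, the two assignments in the loop would become xtemp = half_way(center[0][0], cord[i][0]); and ytemp = half_way(center[0][1], cord[i][1]);. Separately, cord has 60 entries, so the index probably needs to wrap with if (i >= 60) rather than if (i > 60).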


Phase angle from FFT using atan2 - weird behaviour. Phase shift offset? Unwrapping?

I'm testing and performing simple FFTs and I'm interested in phase shift.
I generate a simple array of 256 samples containing a sinusoid with 10 cycles.
I perform an FFT of those samples and receive complex data (2x128).
Then I calculate the magnitude of that data and the FFT looks as expected:
Then I want to calculate the phase shift from the FFT complex output. I'm using atan2.
Combined output fft_magnitude (blue) + fft_phase (red) looks like this:
This is pretty much what I expect, with a "small" problem. I know this is wrapping, but if I imagine unwrapping it, the phase shift at the magnitude peak reads 36 degrees, and I think it should be 0 because my input sinusoid was not shifted at all.
If I shift this by -36 deg (blue is in-phase, red is shifted, blue is printed only for reference) the sinusoid looks like this:
And then if I perform an FFT of this red data, the magnitude + phase output looks like this:
So it is easy to imagine that the unwrapped phase will be close to 0 at the magnitude peak.
So there is a 36 deg offset. But what happens if I prepare data with 20 cycles per 256 samples and 0 phase shift?
If I then perform an FFT, this is the output (magnitude + phase):
And I can tell you it will cross the peak point at 72 degrees. So there is now a 72 degree offset.
Can anyone give me a hint why this is happening?
Is it right that the atan2() phase output is frequency dependent with an offset of 2pi/cycles (360 deg/cycles)?
How can I unwrap it and get correct results? (I couldn't find a working C library to unwrap.)
This is running on an ARM Cortex-M7 processor (embedded).
#define phaseShift 0
#define cycles 20
#include <arm_math.h>
#include <arm_const_structs.h>
float32_t phi = phaseShift * PI / 180; //phase shift in radians
float32_t data[256]; //input data for fft
float32_t output_buffer[256]; //output buffer from fft
float32_t phase_data[128]; //will contain atan2 values of output from fft (output values are complex)
float32_t magnitude[128]; //will contain absolute values of output from fft (output values are complex)
float32_t incrFactorRadians = cycles * 2 * PI / 255;
arm_rfft_fast_instance_f32 RealFFT_Instance;
void setup()
{
Serial.begin(115200);
delay(2000);
arm_rfft_fast_init_f32(&RealFFT_Instance, 256); //initializing fft to be ready for 256 samples
for (int i = 0; i < 256; i++) //print sinusoids
{
data[i] = arm_sin_f32(incrFactorRadians * i + phi);
Serial.print(arm_sin_f32(incrFactorRadians * i), 8); Serial.print(","); Serial.print(data[i], 8); Serial.print("\n"); //print reference in-phase sinusoid and shifted sinusoid (data for fft)
}
Serial.print("\n\n");
delay(10000);
arm_rfft_fast_f32(&RealFFT_Instance, data, output_buffer, 0); //perform fft
for (int i = 0; i < 128; i++) //calculate absolute values of an fft output (fft output is complex), and phase shift
{
magnitude[i] = output_buffer[i * 2] * output_buffer[i * 2] + output_buffer[(i * 2) + 1] * output_buffer[(i * 2) + 1];
__ASM("VSQRT.F32 %0,%1" : "=t"(magnitude[i]) : "t"(magnitude[i])); //fast square root ARM DSP function
phase_data[i] = atan2(output_buffer[i * 2], output_buffer[i * 2 +1]) * 180 / PI;
}
}
void loop() //print magnitude of fft and phase output every 10 seconds
{
for (int i = 0; i < 128; i++)
{
Serial.print(magnitude[i], 8); Serial.print(","); Serial.print(phase_data[i], 8); Serial.print("\n");
}
Serial.print("\n\n");
delay(10000);
}

To break down the excellent answer by hotpaw2. (Their answers are always so loaded with golden nuggets of information that I spend days learning enough to comprehend the brilliance.)
When an engineer says "integer periodic" they mean that the samples you feed into the FFT (the aperture) need to span a whole number of full waves of the sine wave at the frequency of interest.
Think of the sine wave starting at zero, cresting at one, falling through zero into the trough at negative one, and then coming back up to zero.
This is one "full cycle". Now if your wave has a frequency of 10 cycles per second and you sample at 100 samples per second, you will have 10 samples per wave.
So now you put 13 samples into an FFT and your phase is off. Why?
Well, the phase calculation expects the wave to continue smoothly forever. You started at zero for the first sample and stopped partway through a cycle, at a nonzero value, on the 13th sample. Now the phase calculation tries to connect the two ends and finds a jump in the wave. This causes the phase to come out wrong.
What you need to do is select a number of samples to feed into your FFT that you know will contain full waves only.
(NOTE) You are only concerned with the phase of one frequency at a time.
AND your sample aperture must not start and end at the same point of the sine wave.
If you start at zero and end at zero, the calculation pasting the two ends together in a forever circle will get two zeros at the transition. So you have to stop one sample short of the repeat point.
Code demonstrating this can be found at: Scipy FFT - how to get phase angle

A bare FFT plus an atan2() only correctly measures the starting phase of an input sinusoid if that sinusoid is exactly integer periodic in the FFT's aperture width.
If the signal is not exactly integer periodic (some other frequency), then you have to recenter the data by doing an FFTshift (rotate the data by N/2) before the FFT. The FFT will then correctly measure the phase at the center of the original data, and away from the circular discontinuity produced by the FFT's finite length rectangular window on non-periodic-in-aperture signals.
If you want the phase at some point in the data other than the center, you can use the estimate of the frequency and phase at the center to recalculate the phase at other positions.
There are other window functions (Blackman-Nuttall, et al.) that might produce a better phase estimate than a rectangular window, but usually not as good an estimate as using an FFTshift.
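The "rotate the data by N/2" step could look roughly like the sketch below. The function name fft_shift is an assumption for illustration, not a CMSIS-DSP call, and the sketch assumes an even buffer length.

```c
#include <stddef.h>

/* Rotate an even-length buffer by n/2 samples in place, so the sample
   that was at the center of the data ends up at index 0. */
void fft_shift(float *data, size_t n)
{
    size_t half = n / 2;
    size_t i;
    for (i = 0; i < half; i++) {
        float tmp = data[i];          /* swap the two halves element by element */
        data[i] = data[i + half];
        data[i + half] = tmp;
    }
}
```

In the question's setup this would be called on the 256-sample input buffer just before the FFT.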

What causes the stack overflow? And how can I resolve it?

I was doing homework for computer graphics.
We need to use flood fill to paint an area, but no matter how I changed the reserved stack size in Visual Studio, it always ended in a stack overflow.
void Polygon_FloodFill(HDC hdc, int x0, int y0, int fillColor, int borderColor) {
    int interiorColor;
    interiorColor = GetPixel(hdc, x0, y0);
    if ((interiorColor != borderColor) && (interiorColor != fillColor)) {
        SetPixel(hdc, x0, y0, fillColor);
        Polygon_FloodFill(hdc, x0 + 1, y0, fillColor, borderColor);
        Polygon_FloodFill(hdc, x0, y0 + 1, fillColor, borderColor);
        Polygon_FloodFill(hdc, x0 - 1, y0, fillColor, borderColor);
        Polygon_FloodFill(hdc, x0, y0 - 1, fillColor, borderColor);
    }
}

You may have too large an area to fill, which causes recursive calls to consume all of the execution stack in your program.
Your options:
grow the execution stack even further, if you can
reduce the area (how about just 100x100 or 20x20?)
stop using the execution stack and use a data structure that works similarly but can contain more elements (by being more efficient and/or being able to grow/be larger)
use a different algorithm (e.g. consider going from individual pixels to horizontal spans of pixels, there will be many fewer of the latter than the former)
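The third option (replace the execution stack with your own data structure) might be sketched as follows. This is an illustration on a plain int grid, not the GDI GetPixel/SetPixel API; the grid size and the name flood_fill are assumptions. Pending pixels are kept in a heap-allocated array used as a stack, and a pixel is coloured at the moment it is pushed, so it is pushed at most once.

```c
#include <stdlib.h>

#define W 8   /* hypothetical image size for the sketch */
#define H 8

/* Fill the region containing (x0, y0) without recursion.
   Returns 0 on success, -1 if the allocation fails. */
int flood_fill(int img[H][W], int x0, int y0, int fill, int border)
{
    static const int dx[4] = {1, 0, -1, 0};
    static const int dy[4] = {0, 1, 0, -1};
    int *stack = malloc(sizeof(int) * 2 * W * H);  /* room for every pixel once */
    size_t top = 0;
    int d;

    if (stack == NULL) return -1;
    if (x0 >= 0 && x0 < W && y0 >= 0 && y0 < H &&
        img[y0][x0] != border && img[y0][x0] != fill) {
        img[y0][x0] = fill;
        stack[top++] = x0;
        stack[top++] = y0;
    }
    while (top > 0) {
        int y = stack[--top];
        int x = stack[--top];
        for (d = 0; d < 4; d++) {
            int nx = x + dx[d], ny = y + dy[d];
            if (nx < 0 || nx >= W || ny < 0 || ny >= H) continue;
            if (img[ny][nx] == border || img[ny][nx] == fill) continue;
            img[ny][nx] = fill;      /* colour on push so it is never pushed twice */
            stack[top++] = nx;
            stack[top++] = ny;
        }
    }
    free(stack);
    return 0;
}
```

Unlike the recursive version, this one also checks the image bounds, so it cannot walk off the edge when the border is not closed.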

What causes the stack overflow?
What is the range of x0? +/- 2,000,000,000? That is your stack depth potential.
Code does not obviously prevent going out of range unless GetPixel(out-of-range) returns a no-match value.
And how can I resolve it?
Code needs to be more selective on recursive calls.
When a row of pixels can be set, do so without recursion.
Then examine that row's neighbors and only recurse when the neighbors were not continuously in need of setting.
A promising approach would handle the middle and then look at the 4 cardinal directions.
// Pseudo code
Polygon_FloodFill(x,y,c)
  if (pixel(x,y) needs filling) {
    set pixel(x,y,c);
    for each of the 4 directions {
      // example: east
      i = 1;
      // fill the east line first
      while (pixel(x+i,y) needs filling) {
        set pixel(x+i,y,c);
        i++;
      }
      // now examine the line above the "east" line
      recursed = false;
      for (j=1; j<i; j++) {
        if (pixel(x+j, y-1) needs filling) {
          if (!recursed) {
            recursed = true;
            Polygon_FloodFill(x+j,y-1,c)
          } else {
            // no need to call Polygon_FloodFill as will be caught with previous call
          }
        } else {
          recursed = false;
        }
      }
      // Same for line below the "east" line
      // do same for south, west, north.
    }
  }
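A compact runnable variation on the same idea, sketched on a plain int grid rather than an HDC (the grid size and the names needs_fill and span_fill are assumptions). It fills a whole horizontal span without recursion and then recurses only into the rows directly above and below. Once one call has filled a run of pixels in a neighbouring row, the remaining pixels of that run no longer need filling, so each run triggers only one recursive call.

```c
#define COLS 8   /* hypothetical image size for the sketch */
#define ROWS 8

static int needs_fill(int img[ROWS][COLS], int x, int y, int fill, int border)
{
    return x >= 0 && x < COLS && y >= 0 && y < ROWS &&
           img[y][x] != border && img[y][x] != fill;
}

void span_fill(int img[ROWS][COLS], int x, int y, int fill, int border)
{
    int left = x, right = x, i, row;

    if (!needs_fill(img, x, y, fill, border)) return;

    /* Grow the span to the left and right without recursion. */
    while (needs_fill(img, left - 1, y, fill, border)) left--;
    while (needs_fill(img, right + 1, y, fill, border)) right++;
    for (i = left; i <= right; i++) img[y][i] = fill;

    /* Visit the row above (y - 1) and the row below (y + 1). */
    for (row = y - 1; row <= y + 1; row += 2) {
        for (i = left; i <= right; i++) {
            if (needs_fill(img, i, row, fill, border))
                span_fill(img, i, row, fill, border);
        }
    }
}
```

Recursion depth here grows roughly with the number of rows visited rather than the number of pixels, which is the saving the answer is after.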

How many pixels are there to fill? Each pixel is one level of recursion, and each level stores the function's operands, its local variables and the return address on the stack. So for each pixel you store this:
void Polygon_FloodFill(HDC hdc, int x0, int y0, int fillColor, int borderColor) {
int interiorColor;
In a 32-bit environment I estimate this in bytes:
4 Polygon_FloodFill return address
4 HDC hdc ?
4 int x0
4 int y0
4 int fillColor
4 int borderColor
4 int interiorColor
-------------------
~ 7*4 = 28 Bytes
There might be even more depending on the C compiler and calling convention.
Now if your filled area has, for example, 256x256 pixels then you need:
7*4*256*256 = 1.75 MByte
of memory on the stack. How much memory you have depends on the settings you compile/link with, so go to the project options and look for the stack/heap memory limits...
How to deal with this?
lower the stack/heap usage
Simply do not use operands for your flood fill; instead move them to global variables:
HDC floodfill_hdc;
int floodfill_x0,floodfill_y0,floodfill_fillColor,floodfill_borderColor;
void _Polygon_FloodFill()
{
// here your original filling code
int interiorColor;
...
}
void PolygonFloodFill(HDC hdc, int x0, int y0, int fillColor, int borderColor) // this is what you call when want to fill something
{
floodfill_hdc=hdc;
floodfill_x0=x0;
floodfill_y0=y0;
floodfill_fillColor=fillColor;
floodfill_borderColor=borderColor;
_Polygon_FloodFill();
}
This will allow filling an area several times bigger, since each level of recursion then stores little more than the return address.
limit recursion depth
This is also sometimes called a priority queue... You just add one global counter that counts the current depth of recursion, and if it hits the limit value you do not allow further recursion. Instead, add the pixel position to a list that will be processed after the current recursion stops.
change filling from pixels to lines
This simply eliminates a lot of recursive calls, by a wildly rough estimate down to sqrt(n) recursions from n... You simply fill a whole line from a start point in a predetermined direction until you hit the border... So you would have just one recursive call per line instead of per pixel. Here is an example (see [edit2]):
Paint algorithm leaving white pixels at the edges when I color
However, the function name Polygon_FloodFill implies you have the border polygon in vector form. If that is the case, then filling it will be much faster using polygon rasterization techniques like:
how to rasterize rotated rectangle (in 2d by setpixel)
but for that the polygon must be convex, so if that is not the case you need to triangulate it or break it down into convex polygons first (for example with ear clipping).
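The "limit recursion depth" idea from the list above might be sketched like this, again on a plain int grid instead of an HDC. The grid size, the MAX_DEPTH value and all names are assumptions for illustration. When the depth counter reaches the limit, the pixel is stored in a pending list instead of being recursed into, and the list is drained once the recursion has unwound.

```c
#define GW 16          /* hypothetical grid size for the sketch */
#define GH 16
#define MAX_DEPTH 32   /* hypothetical recursion limit */

/* A pixel can be deferred at most once per neighbour, hence 4 slots each. */
static int pending[4 * GW * GH][2];
static int pending_count;
static int depth;      /* global counter of the current recursion depth */

static void fill_rec(int img[GH][GW], int x, int y, int fill, int border)
{
    if (x < 0 || x >= GW || y < 0 || y >= GH) return;
    if (img[y][x] == border || img[y][x] == fill) return;
    if (depth >= MAX_DEPTH) {          /* too deep: defer instead of recursing */
        pending[pending_count][0] = x;
        pending[pending_count][1] = y;
        pending_count++;
        return;
    }
    img[y][x] = fill;
    depth++;
    fill_rec(img, x + 1, y, fill, border);
    fill_rec(img, x, y + 1, fill, border);
    fill_rec(img, x - 1, y, fill, border);
    fill_rec(img, x, y - 1, fill, border);
    depth--;
}

void fill_limited(int img[GH][GW], int x, int y, int fill, int border)
{
    pending_count = 0;
    depth = 0;
    fill_rec(img, x, y, fill, border);
    while (pending_count > 0) {        /* restart from each deferred pixel */
        pending_count--;
        fill_rec(img, pending[pending_count][0], pending[pending_count][1],
                 fill, border);
    }
}
```

The stack never holds more than MAX_DEPTH frames of fill_rec, at the cost of a statically sized pending list.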

Manage values from encoder to understand the direction of the encoder itself (counterclockwise or clockwise)

I'm sampling an encoder, and each sample has a value in the interval [30, 230]; I have to use this value to drive two output counter variables (one increasing counter and one decreasing counter).
The problem is that sometimes when there is a rollover, which means that the encoder passes from 230 to 30 or vice versa, the sampling is too slow and I lose the direction of the movement (counterclockwise or clockwise), which results in wrong behaviour.
Example:
If the encoder is at the value 220 and I move it really fast in the clockwise direction, my next value is for example 100, which means that the value passed through 30 (rollover): the direction should be clockwise. But the software thinks that I moved the encoder from 230 to 100 and it gives me a counterclockwise movement.
Keep in mind that I cannot increase the sampling speed; it is fixed.
It's in a real-time environment.

If you cannot guarantee that the encoder will not move more than half the range in one polling period, then the problem cannot be solved. If you assume that it will not move that far, then it is solvable: you simply assume that the movement between two polling events was the shorter of the two possible directions.
You don't explain why your encoder range starts from a non-zero value, but the arithmetic is easier to follow (and code) if you remove that offset and work with a range 0 to 200 by subtracting the offset.
Given an encoder read function uint8_t ReadEnc() for example:
#define ENCODER_RANGE 200
#define ENCODER_OFFSET 30 // remove offset for range 0 to 200
static unsigned ccw_counter = 0 ;
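static unsigned cw_counter = 0 ;
// Set once at start-up; C does not allow a function call in a static initializer.
static uint8_t previous_enc ; // previous_enc = ReadEnc() - ENCODER_OFFSET ;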
uint8_t enc = ReadEnc() - ENCODER_OFFSET ;
signed enc_diff = enc - previous_enc ;
previous_enc = enc ;
// If absolute difference is greater than half the range
// assume that it rotated the opposite way.
if( enc_diff > ENCODER_RANGE / 2)
{
enc_diff = -(ENCODER_RANGE - enc_diff) ;
}
else if( enc_diff < -(ENCODER_RANGE / 2) )
{
enc_diff = (ENCODER_RANGE + enc_diff) ;
}
// Update counters
if( enc_diff < 0 )
{
// Increment CCW counter
ccw_counter -= enc_diff ;
}
else
{
// Increment CW counter
cw_counter += enc_diff ;
}
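The wrap-around correction can be isolated into a small helper, which makes it easy to check against the example from the question. This is a sketch under the same assumptions as the code above (offset already removed, 200 counts per revolution, positive meaning clockwise); the name encoder_delta is made up for illustration.

```c
#define ENCODER_RANGE 200  /* counts per revolution, as assumed in the answer */

/* Signed shortest-path movement from prev to cur, both with the offset
   already removed. Positive is clockwise, negative is counterclockwise. */
int encoder_delta(int prev, int cur)
{
    int diff = cur - prev;
    if (diff > ENCODER_RANGE / 2)
        diff -= ENCODER_RANGE;        /* really a shorter counterclockwise move */
    else if (diff < -(ENCODER_RANGE / 2))
        diff += ENCODER_RANGE;        /* really a shorter clockwise move */
    return diff;
}
```

For the question's example, raw readings 220 then 100 become 190 and 70 after subtracting the offset, and encoder_delta(190, 70) gives +80: ten counts up to the rollover plus seventy past it, clockwise.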

Suggestions on optimizing a Z-buffer implementation?

I'm writing a 3D graphics library as part of a project of mine, and I'm at the point where everything works, but not well enough.
In particular, my main headache is that my pixel fill-rate is horribly slow -- I can't even manage 30 FPS when drawing a triangle that spans half of an 800x600 window on my target machine (which is admittedly an older computer, but it should be able to manage this . . .)
I ran gprof on my executable, and I end up with the following interesting lines:
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
43.51      9.50     9.50                             vSwap
34.86     17.11     7.61   179944     0.04     0.04  grInterpolateHLine
13.99     20.17     3.06                             grClearDepthBuffer
<snip>
 0.76     21.78     0.17      624     0.27    12.46  grScanlineFill
The function vSwap is my double-buffer swapping function, and it also performs vsyncing, so it makes sense to me that the test program will spend much of its time waiting in there. grScanlineFill is my triangle-drawing function, which creates an edge list and then calls grInterpolateHLine to actually fill in the triangle.
My engine is currently using a Z-buffer to perform hidden surface removal. If we discount the (presumed) vsync overhead, then it turns out that the test program is spending something like 85% of its execution time either clearing the depth buffer, or writing pixels according to the values in the depth buffer. My depth buffer clearing function is simplicity itself: copy the maximum value of a float into each element. The function grInterpolateHLine is:
void grInterpolateHLine(int x1, int x2, int y, float z, float zstep, int colour) {
for(; x1 <= x2; x1 ++, z += zstep) {
if(z < grDepthBuffer[x1 + y*VIDEO_WIDTH]) {
vSetPixel(x1, y, colour);
grDepthBuffer[x1 + y*VIDEO_WIDTH] = z;
}
}
}
I really don't see how I can improve that, especially considering that vSetPixel is a macro.
My entire stock of ideas for optimization has been whittled down to precisely one:
Use an integer/fixed-point depth buffer.
The problem that I have with integer/fixed-point depth buffers is that interpolation can be very annoying, and I don't actually have a fixed-point number library yet. Any further thoughts out there? Any advice would be most appreciated.

You should have a look at the source code to something like Quake - considering what it could achieve on a Pentium, 15 years ago. Its z-buffer implementation used spans rather than per-pixel (or fragment) depth. Otherwise, you could look at the rasterization code in Mesa.

Hard to really tell what higher order optimizations can be done without seeing the rest of the code. I have a couple of minor observations, though.
There's no need to calculate x1 + y * VIDEO_WIDTH more than once in grInterpolateHLine. i.e.:
void grInterpolateHLine(int x1, int x2, int y, float z, float zstep, int colour) {
int offset = x1 + (y * VIDEO_WIDTH);
for(; x1 <= x2; x1 ++, z += zstep, offset++) {
if(z < grDepthBuffer[offset]) {
vSetPixel(x1, y, colour);
grDepthBuffer[offset] = z;
}
}
}
Likewise, I'm guessing that your vSetPixel does a similar calculation, so you should be able to use the same offset there as well, and then you only need to increment offset and not x1 in each loop iteration. Chances are this can be extended back to the function that calls grInterpolateHLine, and you would then only need to do the multiplication once per triangle.
There are some other things you could do with the depth buffer. Most of the time if the first pixel of the line either fails or passes the depth test, then the rest of the line will have the same result. So after the first test you can write a more efficient assembly block to test the entire line in one shot, then if it passes you can use a more efficient block memory setter to block-set the pixel and depth values instead of doing them one at a time. You would only need to test/set per pixel if the line is only partially occluded.
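A plain C sketch of that whole-line idea, without assembly. The buffers and the name hline_fill are hypothetical and are not the asker's vSetPixel/grDepthBuffer API. The first pass only tests; if every pixel passes, the second pass writes depth and colour with no per-pixel branch, otherwise it falls back to the per-pixel test. Whether this is faster on a given machine would need measuring, since the depth values are read twice.

```c
#include <stddef.h>

/* Fill pixels [0, count) of one scanline. depth and colour point at the
   first pixel of the span. */
void hline_fill(float *depth, int *colour, size_t count,
                float z, float zstep, int c)
{
    size_t i;
    int all_pass = 1;
    float zi = z;

    /* Pass 1: does every pixel of the span pass the depth test? */
    for (i = 0; i < count; i++, zi += zstep) {
        if (!(zi < depth[i])) { all_pass = 0; break; }
    }

    zi = z;
    if (all_pass) {
        /* Fully visible: write without per-pixel branching. */
        for (i = 0; i < count; i++, zi += zstep) {
            depth[i] = zi;
            colour[i] = c;
        }
    } else {
        /* Partially occluded: fall back to the per-pixel test. */
        for (i = 0; i < count; i++, zi += zstep) {
            if (zi < depth[i]) {
                depth[i] = zi;
                colour[i] = c;
            }
        }
    }
}
```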
Also, not sure what you mean by older computer, but if your target computer is multi-core then you can break it up among multiple cores. You can do this for the buffer clearing function as well. It can help quite a bit.

I ended up solving this by replacing the Z-buffer with the Painter's Algorithm. I used SSE to write a Z-buffer implementation that created a bitmask with the pixels to paint (plus the range optimization suggested by Gerald), and it still ran far too slowly.
Thank you, everyone, for your input.

Looking for a fast outlined line rendering algorithm

I'm looking for a fast algorithm to draw an outlined line. For this application, the outline only needs to be 1 pixel wide. It should be possible, whether by default or through an option, to make two lines connect together seamlessly, if they share a common point.
Excuse the ASCII art but this is probably the best way to demonstrate it.
Normal line:
##
##
##
##
##
##
"Outlined" line:
**
*##**
**##**
**##**
**##**
**##**
**##*
**
I'm working on a dsPIC33FJ128GP802. It's a small microcontroller/digital signal processor, capable of 40 MIPS (million instructions per second.) It is only capable of integer math (add, subtract and multiply: it can do division, but it takes ~19 cycles.) It's being used to process an OSD layer at the same time and only 3-4 MIPS of the processing time is available for calculations, so speed is critical. The pixels occupy three states: black, white and transparent; and the video field is 192x128 pixels. This is for Super OSD, an open source project: http://code.google.com/p/super-osd/
The first solution I thought of was to draw 3x3 rectangles with outlined pixels on the first pass and normal pixels on the second pass, but this could be slow, as for every pixel at least 3 pixels are overwritten and the time spent drawing them is wasted. So I'm looking for a faster way. Each pixel costs around 30 cycles. The target is <50,000 cycles to draw a line 100 pixels long.

I suggest this (C/pseudocode mix) :
void draw_outline(int x1, int y1, int x2, int y2)
{
int x, y;
double slope;
if (abs(x2-x1) >= abs(y2-y1)) {
// line closer to horizontal than vertical
if (x2 < x1) swap_points(1, 2);
// now x1 <= x2
slope = 1.0*(y2-y1)/(x2-x1);
draw_pixel(x1-1, y1, '*');
for (x = x1; x <= x2; x++) {
y = y1 + round(slope*(x-x1));
draw_pixel(x, y-1, '*');
draw_pixel(x, y+1, '*');
// here draw_line() does draw_pixel(x, y, '#');
}
draw_pixel(x2+1, y2, '*');
}
else {
// same as above, but swap x and y
}
}
Edit: If you want to have successive lines connect seamlessly, I
think you really have to draw all the outlines in the first pass, and
then the lines. I edited the code above to draw only the outlines. The
draw_line() function would be exactly the same but with one single
draw_pixel(x, y, '#'); instead of four draw_pixel(..., ..., '*');.
And then you just:
void draw_polyline(point p[], int n)
{
int i;
for (i = 0; i < n-1; i++)
draw_outline(p[i].x, p[i].y, p[i+1].x, p[i+1].y);
for (i = 0; i < n-1; i++)
draw_line(p[i].x, p[i].y, p[i+1].x, p[i+1].y);
}

My approach would be to use Bresenham's algorithm to draw multiple lines. Looking at your ASCII art, you'll note that the outline lines are just the same as the Bresenham line, just shifted 1 pixel up and down -- plus a single pixel to the left of the first point and to the right of the last.
For a generic version, you'll need to determine whether your line is flat or steep -- i.e., whether abs(y1 - y0) <= abs(x1 - x0). For steep lines, the outlines are shifted by 1 pixel to the left and right, and the closing pixels are above the starting and below the ending point.
It could be worth optimizing this by drawing the line and two outline pixels in one go for each line pixel. However, if you need seamless outlines, the simplest solution would be to first draw all outlines, then the lines themselves -- which wouldn't work with the "three-pixel-Bresenham" optimization.
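A rough integer-only sketch of the two-pass approach described above, using the common error-accumulating form of Bresenham's algorithm (additions, comparisons and one doubling per step, no division or floating point). The frame buffer, its size and the function names are assumptions for illustration and are not taken from Super OSD. One function walks the line; the outline flag selects whether it plots the side pixels and end caps or the line itself.

```c
#include <stdlib.h>

#define FW 16   /* hypothetical frame size for the sketch */
#define FH 16
static char frame[FH][FW];

static void put(int x, int y, char c)
{
    if (x >= 0 && x < FW && y >= 0 && y < FH) frame[y][x] = c;
}

/* outline != 0: plot the pixels beside the line plus the two end caps.
   outline == 0: plot the line itself. */
void bresenham(int x0, int y0, int x1, int y1, int outline)
{
    int dx = abs(x1 - x0), dy = abs(y1 - y0);
    int sx = x0 < x1 ? 1 : -1, sy = y0 < y1 ? 1 : -1;
    int steep = dy > dx;   /* steep lines get left/right outline pixels */
    int err = dx - dy;
    int e2;

    if (outline) {         /* closing pixels before the start and after the end */
        if (steep) { put(x0, y0 - sy, '*'); put(x1, y1 + sy, '*'); }
        else       { put(x0 - sx, y0, '*'); put(x1 + sx, y1, '*'); }
    }
    for (;;) {
        if (!outline)     { put(x0, y0, '#'); }
        else if (steep)   { put(x0 - 1, y0, '*'); put(x0 + 1, y0, '*'); }
        else              { put(x0, y0 - 1, '*'); put(x0, y0 + 1, '*'); }
        if (x0 == x1 && y0 == y1) break;
        e2 = 2 * err;
        if (e2 > -dy) { err -= dy; x0 += sx; }
        if (e2 < dx)  { err += dx; y0 += sy; }
    }
}
```

For seamless joins, the outline pass (outline set to 1) would be run for every segment first and the line pass (outline set to 0) for every segment afterwards, so that line pixels always overwrite outline pixels where segments meet.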