Pruning short line segments from edge detector output? - c

I am looking for an algorithm to prune short line segments from the output of an edge detector. As can be seen in the image (and link) below, there are several small edges detected that aren't "long" lines. Ideally I'd like just the 4 sides of the quadrangle to show up after processing, but if there are a couple of stray lines, it won't be a big deal... Any suggestions?
Image Link

Before finding the edges pre-process the image with an open or close operation (or both), that is, erode followed by dilate, or dilate followed by erode. this should remove the smaller objects but leave the larger ones roughly the same.
I've looked for online examples, and the best I could find was on page 41 of this PDF.

I doubt that this can be done with a simple local operation. Look at the rectangle you want to keep - there are several gaps, hence performing a local operation to remove short line segments would probably heavily reduce the quality of the desired output.
In consequence I would try to detect the rectangle as important content by closing the gaps, fitting a polygon, or something like that, and then in a second step discard the remaining unimportant content. May be the Hough transform could help.
UPDATE
I just used this sample application using a Kernel Hough Transform with your sample image and got four nice lines fitting your rectangle.

In case somebody steps on this thread, OpenCV 2.x brings an example named squares.cpp that basically nails this task.
I made a slight modification to the application to improve the detection of the quadrangle
Code:
#include "highgui.h"
#include "cv.h"
#include <iostream>
#include <math.h>
#include <string.h>
using namespace cv;
using namespace std;
void help()
{
cout <<
"\nA program using pyramid scaling, Canny, contours, contour simpification and\n"
"memory storage (it's got it all folks) to find\n"
"squares in a list of images pic1-6.png\n"
"Returns sequence of squares detected on the image.\n"
"the sequence is stored in the specified memory storage\n"
"Call:\n"
"./squares\n"
"Using OpenCV version %s\n" << CV_VERSION << "\n" << endl;
}
int thresh = 70, N = 2;
const char* wndname = "Square Detection Demonized";
// helper function:
// finds a cosine of angle between vectors
// from pt0->pt1 and from pt0->pt2
double angle( Point pt1, Point pt2, Point pt0 )
{
double dx1 = pt1.x - pt0.x;
double dy1 = pt1.y - pt0.y;
double dx2 = pt2.x - pt0.x;
double dy2 = pt2.y - pt0.y;
return (dx1*dx2 + dy1*dy2)/sqrt((dx1*dx1 + dy1*dy1)*(dx2*dx2 + dy2*dy2) + 1e-10);
}
// returns sequence of squares detected on the image.
// the sequence is stored in the specified memory storage
void findSquares( const Mat& image, vector<vector<Point> >& squares )
{
squares.clear();
Mat pyr, timg, gray0(image.size(), CV_8U), gray;
// karlphillip: dilate the image so this technique can detect the white square,
Mat out(image);
dilate(out, out, Mat(), Point(-1,-1));
// then blur it so that the ocean/sea become one big segment to avoid detecting them as 2 big squares.
medianBlur(out, out, 3);
// down-scale and upscale the image to filter out the noise
pyrDown(out, pyr, Size(out.cols/2, out.rows/2));
pyrUp(pyr, timg, out.size());
vector<vector<Point> > contours;
// find squares only in the first color plane
for( int c = 0; c < 1; c++ ) // was: c < 3
{
int ch[] = {c, 0};
mixChannels(&timg, 1, &gray0, 1, ch, 1);
// try several threshold levels
for( int l = 0; l < N; l++ )
{
// hack: use Canny instead of zero threshold level.
// Canny helps to catch squares with gradient shading
if( l == 0 )
{
// apply Canny. Take the upper threshold from slider
// and set the lower to 0 (which forces edges merging)
Canny(gray0, gray, 0, thresh, 5);
// dilate canny output to remove potential
// holes between edge segments
dilate(gray, gray, Mat(), Point(-1,-1));
}
else
{
// apply threshold if l!=0:
// tgray(x,y) = gray(x,y) < (l+1)*255/N ? 255 : 0
gray = gray0 >= (l+1)*255/N;
}
// find contours and store them all as a list
findContours(gray, contours, CV_RETR_LIST, CV_CHAIN_APPROX_SIMPLE);
vector<Point> approx;
// test each contour
for( size_t i = 0; i < contours.size(); i++ )
{
// approximate contour with accuracy proportional
// to the contour perimeter
approxPolyDP(Mat(contours[i]), approx, arcLength(Mat(contours[i]), true)*0.02, true);
// square contours should have 4 vertices after approximation
// relatively large area (to filter out noisy contours)
// and be convex.
// Note: absolute value of an area is used because
// area may be positive or negative - in accordance with the
// contour orientation
if( approx.size() == 4 &&
fabs(contourArea(Mat(approx))) > 1000 &&
isContourConvex(Mat(approx)) )
{
double maxCosine = 0;
for( int j = 2; j < 5; j++ )
{
// find the maximum cosine of the angle between joint edges
double cosine = fabs(angle(approx[j%4], approx[j-2], approx[j-1]));
maxCosine = MAX(maxCosine, cosine);
}
// if cosines of all angles are small
// (all angles are ~90 degree) then write quandrange
// vertices to resultant sequence
if( maxCosine < 0.3 )
squares.push_back(approx);
}
}
}
}
}
// the function draws all the squares in the image
void drawSquares( Mat& image, const vector<vector<Point> >& squares )
{
for( size_t i = 1; i < squares.size(); i++ )
{
const Point* p = &squares[i][0];
int n = (int)squares[i].size();
polylines(image, &p, &n, 1, true, Scalar(0,255,0), 3, CV_AA);
}
imshow(wndname, image);
}
int main(int argc, char** argv)
{
if (argc < 2)
{
cout << "Usage: ./program <file>" << endl;
return -1;
}
static const char* names[] = { argv[1], 0 };
help();
namedWindow( wndname, 1 );
vector<vector<Point> > squares;
for( int i = 0; names[i] != 0; i++ )
{
Mat image = imread(names[i], 1);
if( image.empty() )
{
cout << "Couldn't load " << names[i] << endl;
continue;
}
findSquares(image, squares);
drawSquares(image, squares);
imwrite("out.jpg", image);
int c = waitKey();
if( (char)c == 27 )
break;
}
return 0;
}

The Hough Transform can be a very expensive operation.
An alternative that may work well in your case is the following:
run 2 mathematical morphology operations called an image close (http://homepages.inf.ed.ac.uk/rbf/HIPR2/close.htm) with a horizontal and vertical line (of a given length determined from testing) structuring element respectively. The point of this is to close all gaps in the large rectangle.
run connected component analysis. If you have done the morphology effectively, the large rectangle will come out as one connected component. It then only remains iterating through all the connected components and picking out the most likely candidate that should be the large rectangle.

Perhaps finding the connected components, then removing components with less than X pixels (empirically determined), followed by dilation along horizontal/vertical lines to reconnect the gaps within the rectangle

It's possible to follow two main techniques:
Vector based operation: map your pixel islands into clusters (blob, voronoi zones, whatever). Then apply some heuristics to rectify the segments, like Teh-Chin chain approximation algorithm, and make your pruning upon vectorial elements (start, endpoint, length, orientation and so on).
Set based operation: cluster your data (as above). For every cluster, compute principal components and detect lines from circles or any other shape by looking for clusters showing only 1 significative eigenvalue (or 2 if you look for "fat" segments, that could resemble to ellipses). Check eigenvectors associated with eigenvalues to have information about orientation of the blobs, and make your choice.
Both ways could be easily explored with OpenCV (the former, indeed, falls under "Contour analysis" category of algos).

Here is a simple morphological filtering solution following the lines of #Tom10:
Solution in matlab:
se1 = strel('line',5,180); % linear horizontal structuring element
se2 = strel('line',5,90); % linear vertical structuring element
I = rgb2gray(imread('test.jpg'))>80; % threshold (since i had a grayscale version of the image)
Idil = imdilate(imdilate(I,se1),se2); % dilate contours so that they connect
Idil_area = bwareaopen(Idil,1200); % area filter them to remove the small components
The idea is to basically connect the horizontal contours to make a large component and filter by an area opening filter later on to obtain the rectangle.
Results:

Related

Intesection problem with Möller-Trumbore algorithm on 1 dimension of the triangle

I am currently working on a raytracer project and I just found out a issue with the triangle intersections.
Sometimes, and I don't understand when and why, some of the pixels of the triangle don't appear on the screen. Instead I can see the object right behind it. It only occurs on one dimension of the triangle and it depends on the camera and the triangle postions (e.g. picture below).
Triangle with pixels missing
I am using Möller-Trumbore algorithm to compute every intersection. Here's my implementation :
t_solve s;
t_vec v1;
t_vec v2;
t_vec tvec;
t_vec pvec;
v1 = vec_sub(triangle->point2, triangle->point1);
v2 = vec_sub(triangle->point3, triangle->point1);
pvec = vec_cross(dir, v2);
s.delta = vec_dot(v1, pvec);
if (fabs(s.delta) < 0.00001)
return ;
s.c = 1.0 / s.delta;
tvec = vec_sub(ori, triangle->point1);
s.a = vec_dot(tvec, pvec) * s.c;
if (s.a < 0 || s.a > 1)
return ;
tvec = vec_cross(tvec, v1);
s.b = vec_dot(dir, tvec) * s.c;
if (s.b < 0 || s.a + s.b > 1)
return ;
s.t1 = vec_dot(v2, tvec) * s.c;
if (s.t1 < 0)
return ;
if (s.t1 < rt->t)
{
rt->t = s.t1;
rt->last_obj = triangle;
rt->flag = 0;
}
The only clue at the moment is that by using a different method of calculating my ray (called dir in the code), the result is that I have less pixels missing.
Moreover, when I turn the camera and look behind, I see that the bug occurs on the opposite side of the triangle. All of this make me think that the issue is mainly linked with the ray..
Take a look at Watertight Ray/Triangle Intersection. I would much appropriate if you could provide a minimal example where a ray should hit the triangle, but misses it. I had this a long time ago with the Cornel Box - inside the box there were some "black" pixels because on edges none of the triangles has been hit. It's a common problem stemming from floating-point imprecision.

What causes the stackoverflow? And how can I resolve it?

I was doing the homework for computer graphics.
We need to use floodfill to paint an area, but no matter how I changed the reserve stack of Visual Studio, it would always jump out stackoverflow.
void Polygon_FloodFill(HDC hdc, int x0, int y0, int fillColor, int borderColor) {
int interiorColor;
interiorColor = GetPixel(hdc, x0, y0);
if ((interiorColor != borderColor) && (interiorColor != fillColor)) {
SetPixel(hdc, x0, y0, fillColor);
Polygon_FloodFill(hdc, x0 + 1, y0, fillColor, borderColor);
Polygon_FloodFill(hdc, x0, y0 + 1, fillColor, borderColor);
Polygon_FloodFill(hdc, x0 - 1 ,y0, fillColor, borderColor);
Polygon_FloodFill(hdc, x0, y0 - 1, fillColor, borderColor);
}
You may have too large an area to fill, which causes recursive calls to consume all of the execution stack in your program.
Your options:
grow the execution stack even further, if you can
reduce the area (how about just 100x100 or 20x20?)
stop using the execution stack and use a data structure that works similarly but can contain more elements (by being more efficient and/or being able to grow/be larger)
use a different algorithm (e.g. consider going from individual pixels to horizontal spans of pixels, there will be many fewer of the latter than the former)
What causes the stackoverflow?
What is the range of x0? +/- 2,000,000,000? That is your stack depth potential.
Code does not obviously prevent going out of range unless GetPixel(out-of-range) returns a no-match value.
And how can I resolve it?
Code needs to be more selective on recursive calls.
When a row of pixels can be set, do so without recursion.
Then examine that row's neighbors and only recurse when the neighbors were not continuously in need of setting.
A promising approach would handle the middle and then look at the 4 cardinal directions.
// Pseudo code
Polygon_FloodFill(x,y,c)
if (pixel(x,y) needs filling) {
set pixel(x,y,c);
for each of the 4 directions
// example: east
i = 1;
// fill the east line first
while (pixel(x+i,y) needs filling) {
i++;
set pixel(x,y,c);
}
// now examine the line above the "east" line
recursed = false;
for (j=1; j<i; j++) {
if (pixel(x+j, y+j) needs filling) {
if (!recursed) {
recursed = true;
Polygon_FloodFill(x+j,y+j,c)
} else {
// no need to call Polygon_FloodFill as will be caught with previous call
}
} else {
recursed = false;
}
}
// Same for line below the "east" line
// do same for south, west, north.
}
how many pixels to fill? each pixel is one level deep of recursion and you got a lot of variables all local ones and operands of the recursive function + return value and address so for reach pixel you store this:
void Polygon_FloodFill(HDC hdc, int x0, int y0, int fillColor, int borderColor) {
int interiorColor;
in 32 bit environment I estimate this in [Bytes]:
4 Polygon_FloodFill return address
4 HDC hdc ?
4 int x0
4 int y0
4 int fillColor
4 int borderColor
4 int interiorColor
-------------------
~ 7*4 = 28 Bytes
There might be even more depending on the C engine and calling sequence.
Now if your filled area has for example 256x256 pixel then you need:
7*4*256*256 = 1.75 MByte
of memory on the stack/heap. How much memory you got depends on the settings you compile/link with so go to project option and look for memory stack/heap limits...
How to deal with this?
lower the stack/heap trashing
simply do not use operands for your flood_fill instead move them to global variables:
HDC floodfill_hdc;
int floodfill_x0,floodfill_y0,floodfill_fillColor,floodfill_borderColor;
void _Polygon_FloodFill()
{
// here your original filling code
int interiorColor;
...
}
void PolygonFloodFill(HDC hdc, int x0, int y0, int fillColor, int borderColor) // this is what you call when want to fill something
{
floodfill_hdc=hdc;
floodfill_x0=x0;
floodfill_y0=y0;
floodfill_fillColor=fillColor;
floodfill_borderColor=borderColor;
_Polygon_FloodFill();
}
this will allow to fill ~14 times bigger area.
limit recursion depth
This is also sometimes called priority que ... You just add one gobal counter that is counting actual depth of recursion and if hit limit value then do not allow recursion. Instead add pixel position to some list that will be processed after actual recursion stops.
change filling from pixels to lines
this simply eliminates a lot of recursive calls in wildly rough estimate to sqrt(n) recursions from n... You simply fill whole line from a start point to predetermined direction until you hit the border ... So you would have just recursion call per each line instead of per pixel. Here example (see [edit2]):
Paint algorithm leaving white pixels at the edges when I color
However the function name Polygon_FloodFill implies you got the border polygon in vector form. If the case than filling it will be much faster using polygon rasterization techniques like:
how to rasterize rotated rectangle (in 2d by setpixel)
but for that the polygon must be convex one so if not the case you need to triangulate or break down to convex polygons first (for example with Ear clipping).

implementing object detection like openCV

I'm trying to implement the Viola-Jones algorithm for object detection using Haar cascades (like openCV's implementation) in C, to detect faces. I writing the C code in a Vivado HLS compatible way, so I can port the the implementation to an FPGA. My main goal is to learn as much as possible, rather than just getting it to work. I would also appreciate any help with improving my question.
I basically started reading G. Bradski's Learning openCV, watched some online tutorials and got started writing the code. Sure enough its not detecting faces and I don't know why. At this point I care more about understanding my mistakes rather than beeing able to detect faces.
My Implementation Steps
I'm not sure how much detail is appropriate, but to keep it short:
Extracting Haar cascade data from haarcascade_frontalface_default.xml to C readable structures (huge arrays)
Writing a function to create an integral image of any given 8bit greyscale image of size 24x24 (same size as listed in the cascade)
Applying knowledge from this great post to make the necessary calculations
My Testing Scheme
Implementing a python script to detect faces using the openCV library with the same Haar cascade as mentioned above to create golden data, a detected face is cut out (ensuring 24x24 size) from the image and stored.
Stored images are converted to one dimensional C arrays, containing pixel values row-wise: img = {row0col0, row0col1, row1col0, row1col1, ... }
integral image is calculated and face detection applied
Result
Faces pass only 6 from 25 stages of the Haar cascade and are therefore not detected by my implementation, where I know they should have been detected since the python script with openCV and the same Haar cascade did indeed detect them.
My Code
/*
* This is detectFace.c
*/
#include <stdio.h>
#include "detectFace.h"
// define constants based on Haar cascade in use
// Each feature is made of max 3 rects
//#define FEAT_NO 1 // max no. of features (= 2912 for face_default.xml)
#define RECTS_IN_FEAT 3 // max no. of rect's per feature
//#define INTS_IN_RECT 5 // no. of int's needed to describe a rect
// each node has one feature (bijective relation) and three doubles
#define STAGE_NO 25 // no. of stages
#define NODE_NO 211 // no of nodes per stage, corresponds to FEAT_NO since each Node has always one feature in haarcascade_frontalface_default.xml
//#define ELMNT_IN_NODE 3 // no. of doubles needed to describe a node
// constants for frame size
#define WIN_WIDTH 24 // width = height =24
//int detectFace(int features[FEAT_NO][RECTS_IN_FEAT][INTS_IN_RECT], double stages[STAGE_NO][NODE_NO][ELMNT_IN_NODE], double stageThresh[STAGE_NO], int ii[24][24]){
int detectFace(
int ii[576],
int stageNum,
int stageOrga[25],
float stageThresholds[25],
float nodes[8739],
int featOrga[2913],
int rectangles[6383][5])
{
int passedStages = 0; // number of stages passed in this run
int faceDetected = 0; // turns to 1 if face is detected and to 0 if its not detected
// Debug:
int nodesUsed = 0; // number of floats out of nodes[] processed, use to skip to the unprocessed floats
int rectsUsed = 0; // number of rects processed
int droppedInStage0 = 0;
// loop through all stages
int i;
detectFace_label1:
for (i = 0; i < STAGE_NO; i++)
{
double tmp = 0.0; //variable to accumulate node-values, to then compare to stage threshold
int nodeNum = stageOrga[i]; // get number of nodes for this stage from stageOrga using stage index i
// loop through nodes inside each stage
// NOTE: it is assumed that each node maps to one corresponding feature. Ex: node[0] has feat[0) and node[1] has feat[1]
// because this is how it is written in the haarcascade_frontalface_default.xml
int j;
detectFace_label0:
for (j = 0; j < NODE_NO; j++)
{
// a node is defined by 3 values:
double nodeThresh = nodes[nodesUsed]; // the first value is the node threshold
double lValue = nodes[nodesUsed + 1]; // the second value is the left value
double rValue = nodes[nodesUsed + 2]; // the third value is the right value
int sum = 0; // contains the weighted value of rectangles in one Haar feature
// loop through rect's in a feature, some have 2 and some have 3 rect's.
// Each node always refers to one feature in a way that node0 maps to feature0 and node1 to feature1 (The XML file is build like that)
//int rectNum = featOrga[j]; // get number of rects for current feature using current node index j
int k;
detectFace_label2:
for (k = 0; k < RECTS_IN_FEAT; k++)
{
int x = 0, y = 0, width = 0, height = 0, weight = 0, coordUpL = 0, coordUpR = 0, coordDownL = 0, coordDownR = 0;
// a rect is defined by 5 values:
x = rectangles[rectsUsed][0]; // the first value is the x coordinate of the top left corner pixel
y = rectangles[rectsUsed][1]; // the second value is the y coordinate of the top left corner pixel
width = rectangles[rectsUsed][2]; // the third value is the width of the current rectangle
height = rectangles[rectsUsed][3]; // the fourth value is the height of this rectangle
weight = rectangles[rectsUsed][4]; // the fifth value is the weight of this rectangle
// calculating 1-Dim index for points of interest. Formula: index = width * row + column, assuming values are stored in row order
coordUpL = ((WIN_WIDTH * y) - WIN_WIDTH) + (x - 1);
coordUpR = coordUpL + width;
coordDownL = coordUpL + (height * WIN_WIDTH);
coordDownR = coordDownL + width;
// calculate the area sum according to Viola-Jones
//sum += (ii[x][y] + ii[x+width][y+height] - ii[x][y+height] - ii[x+width][y]) * weight;
sum += (ii[coordUpL] + ii[coordDownR] - ii[coordUpR] - ii[coordDownL]) * weight;
// Debug: counting the number of actual rectangles used
rectsUsed++; //
}
// decide whether the result of the feature calculation reaches the node threshold
if (sum < nodeThresh)
{
tmp += lValue; // add left value to tmp if node threshold was not reached
}
else
{
tmp += rValue; // // add right value to tmp if node threshold was reached
}
nodesUsed = nodesUsed + 3; // one node is processed, increase nodesUsed by number of floats needed to represent a node (3)¬
}
//######## at this point we went through each node in the current stage #######
// check if threshold of current stage was reached
if (tmp < stageThresholds[i])
{
faceDetected = 0; // if any stage threshold is not reached the operation is done and no face is present
// Debug: show in which stage the frame was dropped
printf("Face detection failed in stage %d \n", i);
//i = stageNum; // breaks out this loop, because i is supposed to stay smaller than STAGE_NO
}
else
{
passedStages++; // stage threshold is reached, therefore passedStages will count up
}
}
//######## at this point we went through all stages ###############################
//----------------------------------------------------------------------------------
// if the number of passed stages reaches the total number of stages, a face is detected
if (passedStages == stageNum)
{
faceDetected = 1; // one symbolizes that the input is a face
}
else
{
faceDetected = 0; // zero symbolizes that the input is not a face
};
return faceDetected;
}

Calculating the Power spectral density

I am trying to get the PSD of a real data set by making use of fftw3 library
To test I wrote a small program as shown below ,that generates the a signal which follows sinusoidal function
#include <stdio.h>
#include <math.h>
#define PI 3.14
int main (){
double value= 0.0;
float frequency = 5;
int i = 0 ;
double time = 0.0;
FILE* outputFile = NULL;
outputFile = fopen("sinvalues","wb+");
if(outputFile==NULL){
printf(" couldn't open the file \n");
return -1;
}
for (i = 0; i<=5000;i++){
value = sin(2*PI*frequency*zeit);
fwrite(&value,sizeof(double),1,outputFile);
zeit += (1.0/frequency);
}
fclose(outputFile);
return 0;
}
Now I'm reading the output file of above program and trying to calculate its PSD like as shown below
#include <stdio.h>
#include <fftw3.h>
#include <complex.h>
#include <stdlib.h>
#include <math.h>
#define PI 3.14
int main (){
FILE* inp = NULL;
FILE* oup = NULL;
double* value;// = 0.0;
double* result;
double spectr = 0.0 ;
int windowsSize =512;
double power_spectrum = 0.0;
fftw_plan plan;
int index=0,i ,k;
double multiplier =0.0;
inp = fopen("1","rb");
oup = fopen("psd","wb+");
value=(double*)malloc(sizeof(double)*windowsSize);
result = (double*)malloc(sizeof(double)*(windowsSize)); // what is the length that I have to choose here ?
plan =fftw_plan_r2r_1d(windowsSize,value,result,FFTW_R2HC,FFTW_ESTIMATE);
while(!feof(inp)){
index =fread(value,sizeof(double),windowsSize,inp);
// zero padding
if( index != windowsSize){
for(i=index;i<windowsSize;i++){
value[i] = 0.0;
}
}
// windowing Hann
for (i=0; i<windowsSize; i++){
multiplier = 0.5*(1-cos(2*PI*i/(windowsSize-1)));
value[i] *= multiplier;
}
fftw_execute(plan);
for(i = 0;i<(windowsSize/2 +1) ;i++){ //why only tell the half size of the window
power_spectrum = result[i]*result[i] +result[windowsSize/2 +1 -i]*result[windowsSize/2 +1 -i];
printf("%lf \t\t\t %d \n",power_spectrum,i);
fprintf(oup," %lf \n ",power_spectrum);
}
}
fclose(oup);
fclose(inp);
return 0;
}
Iam not sure about the correctness of the way I am doing this, but below are the results i have obtained:
Can any one help me in tracing the errors of the above approach
Thanks in advance
*UPDATE
after hartmut answer I'vve edited the code but still got the same result :
and the input data look like :
UPDATE
after increasing the sample frequencyand a windows size of 2048 here is what I've got :
UPDATE
after using the ADD-ON here how the result looks like using the window :
You combine the wrong output values to power spectrum lines. There are windowsSize / 2 + 1 real values at the beginning of result and windowsSize / 2 - 1 imaginary values at the end in reverse order. This is because the imaginary components of the first (0Hz) and last (Nyquist frequency) spectral lines are 0.
int spectrum_lines = windowsSize / 2 + 1;
power_spectrum = (double *)malloc( sizeof(double) * spectrum_lines );
power_spectrum[0] = result[0] * result[0];
for ( i = 1 ; i < windowsSize / 2 ; i++ )
power_spectrum[i] = result[i]*result[i] + result[windowsSize-i]*result[windowsSize-i];
power_spectrum[i] = result[i] * result[i];
And there is a minor mistake: You should apply the window function only to the input signal and not to the zero-padding part.
ADD-ON:
Your test program generates 5001 samples of a sinusoid signal and then you read and analyse the first 512 samples of this signal. The result of this is that you analyse only a fraction of a period. Due to the hard cut-off of the signal it contains a wide spectrum of energy with almost unpredictable energy levels, because you not even use PI but only 3.41 which is not precise enough to do any predictable calculation.
You need to guarantee that an integer number of periods is exactly fitting into your analysis window of 512 samples. Therefore, you should change this in your test signal creation program to have exactly numberOfPeriods periods in your test signal (e.g. numberOfPeriods=1 means that one period of the sinoid has a period of exactly 512 samples, 2 => 256, 3 => 512/3, 4 => 128, ...). This way, you are able to generate energy at a specific spectral line. Keep in mind that windowSize must have the same value in both programs because different sizes make this effort useless.
#define PI 3.141592653589793 // This has to be absolutely exact!
int windowSize = 512; // Total number of created samples in the test signal
int numberOfPeriods = 64; // Total number of sinoid periods in the test signal
for ( n = 0 ; n < windowSize ; ++n ) {
value = sin( (2 * PI * numberOfPeriods * n) / windowSize );
fwrite( &value, sizeof(double), 1, outputFile );
}
Some remarks to your expected output function.
Your input is a function with pure real values.
The result of a DFT has complex values.
So you have to declare the variable out not as double but as fftw_complex *out.
In general the number of dft input values is the same as the number of output values.
However, the output spectrum of a dft contains the complex amplitudes for positive
frequencies as well as for negative frequencies.
In the special case for pure real input, the amplitudes of the positive frequencies are
conjugated complex values of the amplitudes of the negative frequencies.
For that, only the frequencies of the positive spectrum are calculated,
which means that the number of the complex output values is the half of
the number of real input values.
If your input is a simple sinewave, the spectrum contains only a single frequency component.
This is true for 10, 100, 1000 or even more input samples.
All other values are zero. So it doesn't make any sense to work with a huge number of input values.
If the input data set contains a single period, the complex output value is
contained in out[1].
If the If the input data set contains M complete periods, in your case 5,
so the result is stored in out[5]
I did some modifications on your code. To make some facts more clear.
#include <iostream>
#include <stdio.h>
#include <math.h>
#include <complex.h>
#include "fftw3.h"
int performDFT(int nbrOfInputSamples, char *fileName)
{
int nbrOfOutputSamples;
double *in;
fftw_complex *out;
fftw_plan p;
// In the case of pure real input data,
// the output values of the positive frequencies and the negative frequencies
// are conjugated complex values.
// This means, that there no need for calculating both.
// If you have the complex values for the positive frequencies,
// you can calculate the values of the negative frequencies just by
// changing the sign of the value's imaginary part
// So the number of complex output values ( amplitudes of frequency components)
// are the half of the number of the real input values ( amplitutes in time domain):
nbrOfOutputSamples = ceil(nbrOfInputSamples/2.0);
// Create a plan for a 1D DFT with real input and complex output
in = (double*) fftw_malloc(sizeof(double) * nbrOfInputSamples);
out = (fftw_complex*) fftw_malloc(sizeof(fftw_complex) * nbrOfOutputSamples);
p = fftw_plan_dft_r2c_1d(nbrOfInputSamples, in, out, FFTW_ESTIMATE);
// Read data from input file to input array
FILE* inputFile = NULL;
inputFile = fopen(fileName,"r");
if(inputFile==NULL){
fprintf(stdout,"couldn't open the file %s\n", fileName);
return -1;
}
double value;
int idx = 0;
while(!feof(inputFile)){
fscanf(inputFile, "%lf", &value);
in[idx++] = value;
}
fclose(inputFile);
// Perform the dft
fftw_execute(p);
// Print output results
char outputFileName[] = "dftvalues.txt";
FILE* outputFile = NULL;
outputFile = fopen(outputFileName,"w+");
if(outputFile==NULL){
fprintf(stdout,"couldn't open the file %s\n", outputFileName);
return -1;
}
double realVal;
double imagVal;
double powVal;
double absVal;
fprintf(stdout, " Frequency Real Imag Abs Power\n");
for (idx=0; idx<nbrOfOutputSamples; idx++) {
realVal = out[idx][0]/nbrOfInputSamples; // Ideed nbrOfInputSamples is correct!
imagVal = out[idx][1]/nbrOfInputSamples; // Ideed nbrOfInputSamples is correct!
powVal = 2*(realVal*realVal + imagVal*imagVal);
absVal = sqrt(powVal/2);
if (idx == 0) {
powVal /=2;
}
fprintf(outputFile, "%10i %10.4lf %10.4lf %10.4lf %10.4lf\n", idx, realVal, imagVal, absVal, powVal);
fprintf(stdout, "%10i %10.4lf %10.4lf %10.4lf %10.4lf\n", idx, realVal, imagVal, absVal, powVal);
// The total signal power of a frequency is the sum of the power of the posive and the negative frequency line.
// Because only the positive spectrum is calculated, the power is multiplied by two.
// However, there is only one single line in the prectrum for DC.
// This means, the DC value must not be doubled.
}
fclose(outputFile);
// Clean up
fftw_destroy_plan(p);
fftw_free(in); fftw_free(out);
return 0;
}
int main(int argc, const char * argv[]) {
// Set basic parameters
float timeIntervall = 1.0; // in seconds
int nbrOfSamples = 50; // number of Samples per time intervall, so the unit is S/s
double timeStep = timeIntervall/nbrOfSamples; // in seconds
float frequency = 5; // frequency in Hz
// The period time of the signal is 1/5Hz = 0.2s
// The number of samples per period is: nbrOfSamples/frequency = (50S/s)/5Hz = 10S
// The number of periods per time intervall is: frequency*timeIntervall = 5Hz*1.0s = (5/s)*1.0s = 5
// Open file for writing signal values
char fileName[] = "sinvalues.txt";
FILE* outputFile = NULL;
outputFile = fopen(fileName,"w+");
if(outputFile==NULL){
fprintf(stdout,"couldn't open the file %s\n", fileName);
return -1;
}
// Calculate signal values and write them to file
double time;
double value;
double dcValue = 0.2;
int idx = 0;
fprintf(stdout, " SampleNbr Signal value\n");
for (time = 0; time<=timeIntervall; time += timeStep){
value = sin(2*M_PI*frequency*time) + dcValue;
fprintf(outputFile, "%lf\n",value);
fprintf(stdout, "%10i %15.5f\n",idx++, value);
}
fclose(outputFile);
performDFT(nbrOfSamples, fileName);
return 0;
}
If the input of a dft is pure real, the output is complex in any case.
So you have to use the plan r2c (RealToComplex).
If the signal is sin(2*pi*f*t), starting at t=0, the spectrum contains a single frequency line
at f, which is pure imaginary.
If the sign has an offset in phase, like sin(2*pi*f*t+phi) the single line's value is complex.
If your sampling frequency is fs, the range of the output spectrum is -fs/2 ... +fs/2.
The real parts of the positive and negative frequencies are the same.
The imaginary parts of the positive and negative frequencies have opposite signs.
This is called conjugated complex.
If you have the complex values of the positive spectrum you can calculate the values of the
negative spectrum by changing the sign of the imaginary parts.
For this reason there is no need to compute both, the positive and the negative sprectrum.
One sideband holds all information.
Therefore the number of output samples in the plan r2c is the half+1 of the number
of input samples.
To get the power of a frequency, you have to consider the positive frequency as well
as the negative frequency. However, the plan r2c delivers only the right positive half
of the spectrum. So you have to double the power of the positive side to get the total power.
By the way, the documentation of the fftw3 package describes the usage of plans quite well.
You should invest the time to go over the manual.
I'm not sure what your question is. Your results seem reasonable, with the information provided.
As you must know, the PSD is the Fourier transform of the autocorrelation function. With sine wave inputs, your AC function will be periodic, therefore the PSD will have tones, like you've plotted.
My 'answer' is really some thought starters on debugging. It would be easier for all involved if we could post equations. You probably know that there's a signal processing section on SE these days.
First, you should give us a plot of your AC function. The inverse FT of the PSD you've shown will be a linear combination of periodic tones.
Second, try removing the window, just make it a box or skip the step if you can.
Third, try replacing the DFT with the FFT (I only skimmed the fftw3 library docs, maybe this is an option).
Lastly, trying inputting white noise. You can use a Bernoulli dist, or just a Gaussian dist. The AC will be a delta function, although the sample AC will not. This should give you a (sample) white PSD distribution.
I hope these suggestions help.

Looking for a fast outlined line rendering algorithm

I'm looking for a fast algorithm to draw an outlined line. For this application, the outline only needs to be 1 pixel wide. It should be possible, whether by default or through an option, to make two lines connect together seamlessly, if they share a common point.
Excuse the ASCII art but this is probably the best way to demonstrate it.
Normal line:
##
##
##
##
##
##
"Outlined" line:
**
*##**
**##**
**##**
**##**
**##**
**##*
**
I'm working on a dsPIC33FJ128GP802. It's a small microcontroller/digital signal processor, capable of 40 MIPS (million instructions per second.) It is only capable of integer math (add, subtract and multiply: it can do division, but it takes ~19 cycles.) It's being used to process an OSD layer at the same time and only 3-4 MIPS of the processing time is available for calculations, so speed is critical. The pixels occupy three states: black, white and transparent; and the video field is 192x128 pixels. This is for Super OSD, an open source project: http://code.google.com/p/super-osd/
The first solution I thought of was to draw 3x3 rectangles with outlined pixels on the first pass and normal pixels on the second pass, but this could be slow, as for every pixel at least 3 pixels are overwritten and the time spent drawing them is wasted. So I'm looking for a faster way. Each pixel costs around 30 cycles. The target is <50, 000 cycles to draw a line of 100 pixels length.
I suggest this (C/pseudocode mix) :
void draw_outline(int x1, int y1, int x2, int y2)
{
int x, y;
double slope;
if (abs(x2-x1) >= abs(y2-y1)) {
// line closer to horizontal than vertical
if (x2 < x1) swap_points(1, 2);
// now x1 <= x2
slope = 1.0*(y2-y1)/(x2-x1);
draw_pixel(x1-1, y1, '*');
for (x = x1; x <= x2; x++) {
y = y1 + round(slope*(x-x1));
draw_pixel(x, y-1, '*');
draw_pixel(x, y+1, '*');
// here draw_line() does draw_pixel(x, y, '#');
}
draw_pixel(x2+1, y2, '*');
}
else {
// same as above, but swap x and y
}
}
Edit: If you want to have successive lines connect seamlessly, I
think you really have to draw all the outlines in the first pass, and
then the lines. I edited the code above to draw only the outlines. The
draw_line() function would be exactly the same but with one single
draw_pixel(x, y, '#'); instead of four draw_pixel(..., ..., '*');.
And then you just:
void draw_polyline(point p[], int n)
{
int i;
for (i = 0; i < n-1; i++)
draw_outline(p[i].x, p[i].y, p[i+1].x, p[i+1].y);
for (i = 0; i < n-1; i++)
draw_line(p[i].x, p[i].y, p[i+1].x, p[i+1].y);
}
My approach would be to use the Bresenham to draw multiple lines. Looking at your ASCII art, you'll note that the outline lines are just the same as the Bresenham line, just shifted 1 pixel up and down -- plus a single pixel to the left of the first point and to the right of the last.
For a generic version, you'll need to determine whether your line is flat or steep -- i.e., whether abs(y1 - y0) <= abs(x1 - x0). For steep lines, the outlines are shifted by 1 pixel to the left and right, and the closing pixels are above the starting and below the ending point.
It could be worth optimizing this by drawing the line and two outline pixels in one go for each line pixel. However, if you need seamless outlines, the simplest solution would be to first draw all outlines, then the lines themselves -- which wouldn't work with the "three-pixel-Bresenham" optimization.

Resources