I'm trying to implement the Viola-Jones algorithm for object detection using Haar cascades (like OpenCV's implementation) in C, to detect faces. I am writing the C code in a Vivado HLS compatible way, so I can port the implementation to an FPGA. My main goal is to learn as much as possible, rather than just getting it to work. I would also appreciate any help with improving my question.
I basically started reading G. Bradski's Learning OpenCV, watched some online tutorials and got started writing the code. Sure enough, it's not detecting faces and I don't know why. At this point I care more about understanding my mistakes than about being able to detect faces.
My Implementation Steps
I'm not sure how much detail is appropriate, but to keep it short:
Extracting Haar cascade data from haarcascade_frontalface_default.xml to C readable structures (huge arrays)
Writing a function to create an integral image of any given 8bit greyscale image of size 24x24 (same size as listed in the cascade)
Applying knowledge from this great post to make the necessary calculations
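For reference, the integral image step listed above could be sketched as follows. This is only an illustration, not the code used in the question: the function names `integral_image` and `rect_sum` are hypothetical, and the sketch assumes a padded (24+1)x(24+1) table so that rectangles touching the top or left border need no special case.

```c
#include <stdint.h>

#define WIN 24  /* window side, matching the 24x24 cascade window */

/* Padded integral image (an assumption of this sketch): ii[r][c] holds the
 * sum of all pixels strictly above and to the left of (r, c), so row 0 and
 * column 0 are all zeros. */
void integral_image(const uint8_t img[WIN * WIN], uint32_t ii[WIN + 1][WIN + 1])
{
    for (int c = 0; c <= WIN; c++)
        ii[0][c] = 0;
    for (int r = 1; r <= WIN; r++) {
        uint32_t row_sum = 0;           /* running sum of the current image row */
        ii[r][0] = 0;
        for (int c = 1; c <= WIN; c++) {
            row_sum += img[(r - 1) * WIN + (c - 1)];
            ii[r][c] = ii[r - 1][c] + row_sum;
        }
    }
}

/* Sum of the pixels inside the w x h rectangle whose top-left pixel is (x, y). */
uint32_t rect_sum(uint32_t ii[WIN + 1][WIN + 1], int x, int y, int w, int h)
{
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x];
}
```

With the padding, a rectangle starting at x = 0 or y = 0 reads the zero row or column instead of indexing before the start of the array.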
My Testing Scheme
Implementing a Python script that detects faces using the OpenCV library with the same Haar cascade as mentioned above, to create golden data. Each detected face is cut out of the image (ensuring 24x24 size) and stored.
Stored images are converted to one-dimensional C arrays containing pixel values row-wise: img = {row0col0, row0col1, row1col0, row1col1, ... }
The integral image is calculated and face detection is applied
Result
Faces pass only 6 of the 25 stages of the Haar cascade and are therefore not detected by my implementation, although I know they should have been detected, since the Python script using OpenCV and the same Haar cascade did detect them.
My Code
/*
* This is detectFace.c
*/
#include <stdio.h>
#include "detectFace.h"
// define constants based on Haar cascade in use
// Each feature is made of max 3 rects
//#define FEAT_NO 1 // max no. of features (= 2912 for face_default.xml)
#define RECTS_IN_FEAT 3 // max no. of rect's per feature
//#define INTS_IN_RECT 5 // no. of int's needed to describe a rect
// each node has one feature (bijective relation) and three doubles
#define STAGE_NO 25 // no. of stages
#define NODE_NO 211 // no of nodes per stage, corresponds to FEAT_NO since each Node has always one feature in haarcascade_frontalface_default.xml
//#define ELMNT_IN_NODE 3 // no. of doubles needed to describe a node
// constants for frame size
#define WIN_WIDTH 24 // width = height =24
//int detectFace(int features[FEAT_NO][RECTS_IN_FEAT][INTS_IN_RECT], double stages[STAGE_NO][NODE_NO][ELMNT_IN_NODE], double stageThresh[STAGE_NO], int ii[24][24]){
int detectFace(
int ii[576],
int stageNum,
int stageOrga[25],
float stageThresholds[25],
float nodes[8739],
int featOrga[2913],
int rectangles[6383][5])
{
int passedStages = 0; // number of stages passed in this run
int faceDetected = 0; // turns to 1 if face is detected and to 0 if its not detected
// Debug:
int nodesUsed = 0; // number of floats out of nodes[] processed, use to skip to the unprocessed floats
int rectsUsed = 0; // number of rects processed
int droppedInStage0 = 0;
// loop through all stages
int i;
detectFace_label1:
for (i = 0; i < STAGE_NO; i++)
{
double tmp = 0.0; //variable to accumulate node-values, to then compare to stage threshold
int nodeNum = stageOrga[i]; // get number of nodes for this stage from stageOrga using stage index i
// loop through nodes inside each stage
// NOTE: it is assumed that each node maps to one corresponding feature. Ex: node[0] has feat[0] and node[1] has feat[1]
// because this is how it is written in the haarcascade_frontalface_default.xml
int j;
detectFace_label0:
for (j = 0; j < NODE_NO; j++)
{
// a node is defined by 3 values:
double nodeThresh = nodes[nodesUsed]; // the first value is the node threshold
double lValue = nodes[nodesUsed + 1]; // the second value is the left value
double rValue = nodes[nodesUsed + 2]; // the third value is the right value
int sum = 0; // contains the weighted value of rectangles in one Haar feature
// loop through rect's in a feature, some have 2 and some have 3 rect's.
// Each node always refers to one feature in a way that node0 maps to feature0 and node1 to feature1 (The XML file is build like that)
//int rectNum = featOrga[j]; // get number of rects for current feature using current node index j
int k;
detectFace_label2:
for (k = 0; k < RECTS_IN_FEAT; k++)
{
int x = 0, y = 0, width = 0, height = 0, weight = 0, coordUpL = 0, coordUpR = 0, coordDownL = 0, coordDownR = 0;
// a rect is defined by 5 values:
x = rectangles[rectsUsed][0]; // the first value is the x coordinate of the top left corner pixel
y = rectangles[rectsUsed][1]; // the second value is the y coordinate of the top left corner pixel
width = rectangles[rectsUsed][2]; // the third value is the width of the current rectangle
height = rectangles[rectsUsed][3]; // the fourth value is the height of this rectangle
weight = rectangles[rectsUsed][4]; // the fifth value is the weight of this rectangle
// calculating 1-Dim index for points of interest. Formula: index = width * row + column, assuming values are stored in row order
coordUpL = ((WIN_WIDTH * y) - WIN_WIDTH) + (x - 1);
coordUpR = coordUpL + width;
coordDownL = coordUpL + (height * WIN_WIDTH);
coordDownR = coordDownL + width;
// calculate the area sum according to Viola-Jones
//sum += (ii[x][y] + ii[x+width][y+height] - ii[x][y+height] - ii[x+width][y]) * weight;
sum += (ii[coordUpL] + ii[coordDownR] - ii[coordUpR] - ii[coordDownL]) * weight;
// Debug: counting the number of actual rectangles used
rectsUsed++; //
}
// decide whether the result of the feature calculation reaches the node threshold
if (sum < nodeThresh)
{
tmp += lValue; // add left value to tmp if node threshold was not reached
}
else
{
tmp += rValue; // add right value to tmp if node threshold was reached
}
nodesUsed = nodesUsed + 3; // one node is processed, increase nodesUsed by number of floats needed to represent a node (3)
}
//######## at this point we went through each node in the current stage #######
// check if threshold of current stage was reached
if (tmp < stageThresholds[i])
{
faceDetected = 0; // if any stage threshold is not reached the operation is done and no face is present
// Debug: show in which stage the frame was dropped
printf("Face detection failed in stage %d \n", i);
//i = stageNum; // breaks out this loop, because i is supposed to stay smaller than STAGE_NO
}
else
{
passedStages++; // stage threshold is reached, therefore passedStages will count up
}
}
//######## at this point we went through all stages ###############################
//----------------------------------------------------------------------------------
// if the number of passed stages reaches the total number of stages, a face is detected
if (passedStages == stageNum)
{
faceDetected = 1; // one symbolizes that the input is a face
}
else
{
faceDetected = 0; // zero symbolizes that the input is not a face
}
return faceDetected;
}
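As a point of comparison, the stage loop described in the comments above can be sketched in a compact form. This is a minimal sketch under stated assumptions, not the OpenCV implementation: `run_cascade` and `featSums` are hypothetical names, the weighted rectangle sum of each feature is assumed to be precomputed, and the inner loop runs over the node count of the current stage taken from `stageOrga`.

```c
#include <stddef.h>

/* Hypothetical flat layout (an assumption of this sketch):
 *   nodes[]    holds {threshold, left value, right value} for every node,
 *   featSums[] holds one precomputed weighted rectangle sum per node.
 * Returns the index of the first failing stage, or stageCount if all pass. */
int run_cascade(int stageCount, const int stageOrga[], const float stageThresholds[],
                const float nodes[], const float featSums[])
{
    size_t n = 0;   /* index of the next unprocessed node across all stages */
    for (int i = 0; i < stageCount; i++) {
        float stageSum = 0.0f;
        /* only the nodes that belong to stage i are visited */
        for (int j = 0; j < stageOrga[i]; j++, n++) {
            float thresh = nodes[3 * n];
            stageSum += (featSums[n] < thresh) ? nodes[3 * n + 1]
                                               : nodes[3 * n + 2];
        }
        if (stageSum < stageThresholds[i])
            return i;           /* rejected: stop at the first failing stage */
    }
    return stageCount;          /* every stage threshold was reached */
}
```

Returning the failing stage index keeps the debug information that the printf in the code above provides, while leaving the loop as soon as one stage rejects the window.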
I'm trying to create double doors that slide opposite ways, but I want to use only a single object. Basically I'm using this script:
I'm wondering whether it's possible to make it separate in the middle and retract from there, rather than retracting from one end. Supermarket doors would be my best example.
//When touched the prim is retracted towards one end and when touched again stretched back out.
//
//Prim moves/changes size along the local coordinate specified in the offset vector below.
//
//To change the overall size, edit the prim when stretched out and reset the script when done.
//
//The script works both in unlinked and linked prims.
//
vector offset = <0,1,0>; //Prim moves/changes size along this local coordinate
float hi_end_fixed = TRUE; //Which end of the prim should remain in place when size changes?
//The one with the higher (local) coordinate?
float min = 0.2; //The minimum size of the prim relative to its maximum size
integer ns = 10; //Number of distinct steps for move/size change
default {
state_entry() {
offset *= ((1.0 - min) / ns) * (offset * llGetScale());
hi_end_fixed -= 0.5;
}
touch_start(integer detected) {
integer i;
do llSetPrimitiveParams([PRIM_SIZE, llGetScale() - offset,
PRIM_POSITION, llGetLocalPos() + ((hi_end_fixed * offset) * llGetLocalRot())]);
while ((++i) < ns);
offset = - offset; }
}
I'm currently tracking the analog value of a photodetector coming into my system. The signal itself is cleaned, filtered (low pass and high pass), and amplified in hardware before coming into my system. The signal has a small amount of DC walk to it, which is giving me some trouble. I've attempted to just move the min up by 1% every 50 reads of the ADC, but it adds more noise than I'd like to my signal. Here's a snapshot of what I'm pulling in below (blue = signal, max/min average = green, red = min). The spikes in the red signal can be ignored; that's something I'm doing to mark when a certain condition is met.
Right now my function for tracking min is this:
//Determine if value is outside max or min
if(data > max) max = data;
if(data < min) min = data;
//Reset function to bring the bounds in every 50 cycles
if(rstCntr>=50){
rstCntr=0;
max = max/1.01;
min = min*1.01;
if(min <= 1200) min = 1200;
if(max >= 1900) max = 1900;
}
That works fine except when I do that 1% correction to make sure we are still tracking the signal it throws other functions off which rely on the average value and the min value. My objective is to determine:
On the negative slope of the signal
Data coming in is less than the average
Data coming in is 5% above the minimum
It is really #3 that is driving everything else. There is enough slack in the other two that they aren't that affected.
Any suggestions for a better way to track the max and min in real-time than what I'm doing?
EDIT: Per comment by ryyker: here is additional information and reproducible example code
Need more clearly described: I'm reading an analog signal approximately once every 2ms and determining whether that signal has crossed a threshold just above the minimum value of the analog signal. The signal has some DC walk in it which doesn't allow me to simply set the lowest value seen since power-on as the minimum value.
The question: On a reading-by-reading basis, how can I track the min of a signal that doesn't have a consistent minimum value?
int main(void) {
while (1)
{
//******************************************************************************
//** Process analog sensor data, calculate HR, and trigger solenoids
//** At some point this should probably be moved to a function call in System.c,
//** but I don't want to mess with it right now since it works (Adam 11/23/2022)
//******************************************************************************
//Read Analog Data for Sensor
data = ADC1_ReadChannel(7);
//Buffer the sensor data for peak/valley detection
for(int buf=3;buf>0;buf--){
dataBuffer[buf] = dataBuffer[buf-1];
}
dataBuffer[0] = data;
//Look for a valley
//Considered a valley if the 3 most recent data points are increasing
//This helps avoid noise in the signal
uint8_t count = 0;
for(int buf=0;buf<3;buf++) {
if(dataBuffer[buf]>dataBuffer[buf+1]) count++;
}
if(count >= 3) currentSlope = true; //if the last 3 points are increasing, we just passed a valley
else currentSlope = false; //not a valley
// Track the data stream max and min to calculate a signal average
// The signal average is used to determine when we are on the bottom end of the waveform.
if(data > max) max = data;
if(data < min) min = data;
if(rstCntr>=50){ //Make sure we are tracking the signal by moving min and max in every 50 samples
rstCntr=0;
max = max/1.01;
min = min*1.01;
if(min <= 1200) min = 1200; //average*.5; //Probably finger was removed from sensor, move back up
if(max >= 1900) max = 1900; //Need to see if this really works consistently
}
rstCntr++;
average = ((uint16_t)min+(uint16_t)max)/2;
trigger = min; //Variable is only used for debug output, resetting each time around
if(data < average &&
currentSlope == false && //falling edge of signal
data <= (((average-min)*.03)+min)) //Threshold above the min
{
FireSolenoids();
}
}
return 1;
}
EDIT2:
Here is what I'm seeing using the code posted by ryyker below. The green line is what I'm using as my threshold, which works fairly well, but you can see max and min don't track the signal.
EDIT3:
Update with edited min/max code. Not seeing it ever reach the max. Might be the window size is too small (set to 40 in this image).
EDIT4:
Just for extra clarity, I'm restating my objectives once again, hopefully to make things as clear as possible. It might be helpful to provide a bit more context around what the information is used for, so I'm doing that also.
Description:
I have an analog sensor which measures a periodic signal in the range of 0.6Hz to 2Hz. The signal's periodicity is not consistent from pulsewave to pulsewave. It varies +/- 20%. The periodic signal is used to determine the timing of when a valve is opened and closed.
Objective:
The valve needs to be opened a constant number of ms after the signal peak is reached, but the time it physically takes the valve to move is much longer than this constant number. In other words, opening the valve when the peak is detected means the valve opens too late.
Similar to 1, using the valley of the signal is also not enough time for the valve to physically open.
The periodicity of the signal varies enough that it isn't possible to use the peak-to-peak time from the previous two pulsewaves to determine when to open the valve.
I need to consistently determine a point on the negative sloped portion of the pulsewave to use as the trigger for opening the valve.
Approach:
My approach is to measure the minimum and maximum of the signal and then set a threshold above the minimum which I can use to determine the time to open the valve.
My thought is that setting some constant percentage above the minimum will get me to a consistent location on the negatively sloped portion, which can be used to open the valve.
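One way to picture this approach is a sliding window that forgets old extremes on its own, so no periodic 1% correction is needed. The sketch below is only an illustration with hypothetical names (`minmax_win`, `win_push`, `win_threshold`); the window length is kept tiny here, whereas in practice it would presumably need to cover at least one full period of the slowest signal (roughly 830 readings at about 2 ms per reading and 0.6 Hz).

```c
#include <stdint.h>

#define WIN_LEN 8   /* hypothetical window length in samples, tiny for illustration */

typedef struct {
    uint16_t buf[WIN_LEN];
    int count;      /* number of valid samples, saturates at WIN_LEN */
    int head;       /* next slot to overwrite */
} minmax_win;

/* Store a new sample, overwriting the oldest one once the window is full. */
void win_push(minmax_win *w, uint16_t sample)
{
    w->buf[w->head] = sample;
    w->head = (w->head + 1) % WIN_LEN;
    if (w->count < WIN_LEN)
        w->count++;
}

/* Scan the window for its min and max. At least one sample must be pushed first. */
void win_minmax(const minmax_win *w, uint16_t *min, uint16_t *max)
{
    *min = *max = w->buf[0];
    for (int i = 1; i < w->count; i++) {
        if (w->buf[i] < *min) *min = w->buf[i];
        if (w->buf[i] > *max) *max = w->buf[i];
    }
}

/* Threshold located a given percentage of the min-to-max span above the min. */
uint16_t win_threshold(const minmax_win *w, unsigned percent)
{
    uint16_t lo, hi;
    win_minmax(w, &lo, &hi);
    return (uint16_t)(lo + ((uint32_t)(hi - lo) * percent) / 100u);
}
```

Because an old minimum simply drops out of the buffer after WIN_LEN readings, the threshold follows slow DC walk without the step changes caused by scaling min and max.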
"On a reading-by-reading basis, how can I track the min of a signal that doesn't have a consistent minimum value?"
By putting each discrete signal sample through a moving window filter, and performing statistical operations on the window as it moves, standard deviation can be extracted (following mean and variance) which can then be combined with mean to determine the minimum allowed value for each point of a particular waveform. This assumes noise contribution is known and consistent.
The following implementation is one way to consider.
in header file or top of .c
//support for stats() function
#include <math.h>    //pow(), sqrt()
#include <stdbool.h> //bool, used in main()
#define WND_SZ 10
int wnd_sz = WND_SZ;
typedef struct stat_s{
double arr[WND_SZ];
double min; //mean - std_dev
double max; //mean + std_dev
double mean; //running
double variance;//running
double std_dev; //running
} stat_s;
void stats(double in, stat_s *out);
in .c (edit to change max and min)
// void stats(double in, stat_s *out)
// Used to monitor a continuous stream of sensor values.
// Accepts series of measurement values from a sensor,
// Each new input value is stored in array element [i%wnd_sz]
// where wnd_sz is the width of the sample array.
// instantaneous values for max and min as well as
// moving values of mean, variance, and standard deviation
// are derived once per input
void stats(double in, stat_s *out)
{
double sum = 0, sum1 = 0;
int j = 0;
static int i = 0;
out->arr[i%wnd_sz] = in;//array index values cycle within window size
//sum all elements of moving window array
for(j = 0; j < wnd_sz; j++)
sum += out->arr[j];
//compute mean
out->mean = sum / (double)wnd_sz;
//sum squares of diff between each element and mean
for (j = 0; j < wnd_sz; j++)
sum1 += pow((out->arr[j] - out->mean), 2);
//compute variance
out->variance = sum1 / (double)wnd_sz;
//compute standard deviation
out->std_dev = sqrt(out->variance);
//EDIT here:
//mean +/- std_dev
out->max = out->mean + out->std_dev;
out->min = out->mean - out->std_dev;
//END EDIT
//advance and wrap the index so it stays inside the window
//(this also prevents overflow for long running sessions)
i = (i + 1) % wnd_sz;
}
int main(void)
{
stat_s s = {0};
bool running = true;
double val = 0.0;
while(running)
{
//read one sample from some sensor
val = someSensor();
stats(val, &s);
// collect instantaneous and running data from s
// into variables here
if(some exit condition) break
}
return 0;
}
Using this code with 1000 bounded pseudo-random values, the mean is surrounded by traces depicting mean + std_dev and mean - std_dev. As std_dev becomes smaller over time, the traces converge toward the mean signal:
Note: I used the following in my test code to produce data arrays of a signal with constant amplitude added to injected noise that diminishes in amplitude over time.
void gen_data(int samples)
{
srand(clock());
int i = 0;
int plotHandle[6] = {0};
stat_s s = {0};
double arr[5][samples];
memset(arr, 0, sizeof arr);
for(i=0; i < samples; i++)//simulate ongoing sampling of sensor
{
s.arr[i%wnd_sz] = 50 + rand()%100;
if(i<.20*samples) s.arr[i%wnd_sz] = 50 + rand()%100;
else if(i<.40*samples) s.arr[i%wnd_sz] = 50 + rand()%50;
else if(i<.60*samples) s.arr[i%wnd_sz] = 50 + rand()%25;
else if(i<.80*samples) s.arr[i%wnd_sz] = 50 + rand()%12;
else s.arr[i%wnd_sz] = 50 + rand()%6;
stats(s.arr[i%wnd_sz], &s);
arr[0][i] = s.mean;
arr[1][i] = s.variance;
arr[2][i] = s.std_dev;
arr[3][i] = s.min;
arr[4][i] = s.max;
}
// Plotting algorithms deleted for brevity.
}
At the moment I am attempting to implement a program for finding 3 frequencies (xs = 30.1 kHz, ys = 28.3 kHz and zs = 25.9 kHz) through the use of the CMSIS pack on the STM32F411RE board. I cannot get the Complex FFT (CFFT) and complex magnitude working correctly.
In accordance with the frequency bins, I generate an array containing these frequencies, so that I can manually look up which bin indexes the signals xs, ys and zs are on. I then use these indexes to look at the 3 FFT outcomes (Xfft, Yfft, Zfft) to find the outcomes for these signals, but they don't match up.
I use the following order of functions:
DMA ADC Buffer: HAL_ADC_ConvHalfCpltCallback(ADC_HandleTypeDef* hadc)
Frequency bins in binfreqs
Change ADC input to float Xfft
CFFT_F32: arm_cfft_f32(&arm_cfft_sR_f32_len1024, Xfft, 0, 0);
Complex Mag: arm_cmplx_mag_f32(Xfft, Xdsp, fftLen);
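The bin lookup and the buffer layout behind these steps can be illustrated in plain C, without CMSIS. The helper names `freq_to_bin` and `pack_real_as_complex` are hypothetical, and so is the `uint16_t` sample type. The sketch relies on the documented layout in which `arm_cfft_f32` works in place on an interleaved buffer of 2 * fftLen floats (real, imaginary, real, imaginary, ...) and `arm_cmplx_mag_f32` reduces each pair to one magnitude.

```c
#include <stddef.h>
#include <stdint.h>

/* Nearest FFT bin for a frequency. Bin k is centred on k * sampleRate / fftLen. */
size_t freq_to_bin(float freqHz, float sampleRateHz, size_t fftLen)
{
    return (size_t)(freqHz * (float)fftLen / sampleRateHz + 0.5f);
}

/* Pack n real samples into an interleaved complex buffer of 2 * n floats:
 * out[2*i] is the real part, out[2*i + 1] the imaginary part (zero here). */
void pack_real_as_complex(const uint16_t *in, float *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        out[2 * i]     = (float)in[i];
        out[2 * i + 1] = 0.0f;
    }
}
```

As an example under an assumed sample rate of 102.4 kHz and a length of 1024 (100 Hz per bin), `freq_to_bin(30100.0f, 102400.0f, 1024)` gives bin 301. In the interleaved buffer that bin occupies elements 602 and 603, and after the magnitude step it sits at output index 301.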
// ADC Stuff done via DMA, working correctly
int main(void)
{
HAL_Init();
SystemClock_Config();
MX_GPIO_Init();
MX_DMA_Init();
MX_ADC1_Init();
MX_USART2_UART_Init();
HAL_ADC_Start_DMA(&hadc1, adc_buffer, bufferLen); // adc_buffer needs to be an uint32_t
while (1)
{
/**
* Generate the frequencies
*/
for (int binfreqs = 0; binfreqs < fftLen; binfreqs++) // Generates the frequency bins to relate the amplitude to an actual value, rather than a bin frequency value
{
fftFreq[binfreqs] = binfreqs * binSize;
}
/*
* Find the amplitudes associated with the 3 emitter frequencies and store in an array for each axis. By default these arrays are generated with signal strength 0
* and with frequency index at 0: because of system limits these will indicate invalid values, as system range is from 10 - 60 kHz.
*/
volatile int32_t X_mag[3][4] = // x axis values: [index][frequency][signal_strength][phase]
{
{0, Xfreq, 0, 0}, // For x-freq index [0][0], frequency [0][1] associated with 1st biggest amplitude [0][2], phase [0][3]
{0, Yfreq, 0, 0}, // Ditto for y-freq
{0, Zfreq, 0, 0} // Ditto for z-freq
};
/*
* Finds the index in fftFreq corresponding to respectively X, Y and Z emitter frequencies
*/
for(int binSearch = 0; binSearch < fftLen; binSearch++)
{
if(fftFreq[binSearch] == Xfreq) // Find index for X emitter frequency
{
X_mag[0][0] = binSearch;
}
if(fftFreq[binSearch] == Yfreq) // Find index for Y emitter frequency
{
X_mag[1][0] = binSearch;
}
if(fftFreq[binSearch] == Zfreq) // Find index for Z emitter frequency
{
X_mag[2][0] = binSearch;
}
}
Signal processing
/* Signal processing algorithms --------------------------------------------------
*
* Only to be run once with fresh data from the buffer, [do not run continuous] or position / orientation data will be repeated.
* So only run once when conversionPaused
*/
if(conversionPaused)
{
/*
* Convert signal to voltage (12-bit, 4096)
*/
for (int floatVals = 0; floatVals < fftLen; floatVals++)
{
Xfft[floatVals] = (float) Xin[floatVals]; // * 3.6 / 4096
}
/*
* Fourier transform
*/
arm_cfft_f32(&arm_cfft_sR_f32_len1024, Xfft, 0, 0); // Calculate complex fourier transform of x time signal, processing occurs in place
for (int fix_fft = 0 ; fix_fft < half_fftLen ; fix_fft++)
{
Xfft[fix_fft] = 2 * Xfft[fix_fft] / fftLen;
Xfft[fix_fft + half_fftLen] = 0;
}
/*
* Amplitude calculation
*/
arm_cmplx_mag_f32(Xfft, Xdsp, fftLen); // Calculate the magnitude of the fourier transform for x axis
/*
* Finds all signal strengths for allocated frequency indexes
*/
for(int strength_index = 0; strength_index < 3; strength_index++) // Loops through xyz frequencies for all 3 magnetometer axis
{
int x_temp_index = X_mag[strength_index][0]; // temp int necessary to store the strength, otherwise infinite loop?
X_mag[strength_index][2] = Xfft[x_temp_index]; // = Xfft[2*x_temp_index];
}
conversionPaused = 0;
}
} // While() end
} // Main() end
I do not know how I am to calculate the frequency bins for this combination of CFFT and complex magnitude, as I would expect the even indexes of the array to hold the real values and the odd indexes of the array to hold the imaginary values. I referenced several examples but could not work out what I am doing wrong with my code.
However, as the images show, when applying an input signal of 30.1 kHz, neither the 301 bin index nor the 602 bin index holds the expected output.
301 bin index
602 bin index
EDIT:
I have since tried to implement the arm_cfft_f32 example given here. This latter example is completely broken as the external 10 kHz dataset is no longer included by default and trying to include it is not possible, as the program behaves poorly and keeps erroring about a return data type that is not even present in the first place. Thus I cannot use the example program given for this: it also appears to be 4 years out of date, so that is not surprising.
The arm_max_f32() function also proved not fruitful as it keeps homing in on the noise generated at bin 0 via using an analog generated signal. Manually setting this bin 0 to equal 0 then upsets the algorithm which starts pointing to random values that are not even the largest value present in the system.
Even when going manually through the CFFT data and magnitude it appears as if they are not working correctly. There are random noise values all over the spectrum parts, whilst the oscilloscope confirms that large outcomes should only be present at 0 Hz and the selected signal generator frequency (thus corresponding to a frequency bin).
Using CMSIS is extremely frustrating for me because of the little documentation and examples available, which is then further reduced by most of it simply not working (without major modification).
I have an application where I need to find the position of peaks in a given set of data. The resolution must be much higher than the spacing between the datapoints (i.e. it is not sufficient to find the highest datapoint, instead a "virtual" peak position has to be estimated given the shape of the peak). A peak is made of about 4 or 5 datapoints. A dataset is acquired every few ms and the peak detection has to be performed in real time.
I compared several methods in LabVIEW and I found the best result (in terms of resolution and speed) is given by the LabVIEW PeakDetector.vi, which scans the dataset with a moving window (>= 3 points width) and for each position performs a quadratic fit. The resulting quadratic function (a parabola) has a local maximum, which is in turn compared to nearby points.
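The core idea of a quadratic fit with a sub-sample maximum can be illustrated with the smallest possible window. The sketch below is not the LabVIEW algorithm: it is an exact parabola through three equally spaced samples rather than a least-squares fit over a wider window, and `parabolic_peak` is a hypothetical name.

```c
#include <stdint.h>

/* Fit a parabola through y[i-1], y[i], y[i+1] (unit spacing).
 * Returns 0 and writes the vertex position and height, or -1 if the three
 * points have no maximum inside the window. The caller must ensure that
 * i - 1 and i + 1 are valid indexes. */
int parabolic_peak(const uint16_t *y, int i, double *x_max, double *y_max)
{
    double ym = y[i - 1], y0 = y[i], yp = y[i + 1];
    double denom = ym - 2.0 * y0 + yp;       /* twice the quadratic coefficient */
    if (denom >= 0.0)
        return -1;                            /* straight line or opens upwards */
    double offset = 0.5 * (ym - yp) / denom;  /* vertex relative to sample i */
    if (offset < -1.0 || offset > 1.0)
        return -1;                            /* vertex outside the window */
    *x_max = i + offset;
    *y_max = y0 - 0.25 * (ym - yp) * offset;
    return 0;
}
```

For the samples 34, 49, 44 around index 2, this returns a peak at x = 2.25 with height 49.625, which is the vertex of the parabola passing through all three points.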
Now I want to implement the same method in C. The polynomial fit is implemented as follows (using Gaussian matrix):
// Fits *y from x_start to (x_start + window) with a parabola and returns x_max and y_max
int polymax(uint16_t * y_data, int x_start, int window, double *x_max, double *y_max)
{
float sum[10],mat[3][4],temp=0,temp1=0,a1,a2,a3;
int i,j;
float x[window];
for(i = 0; i < window; i++)
x[i] = (float)i;
float y[window];
for(i = 0; i < window; i++)
y[i] = (float)(y_data[x_start + i] - y_data[x_start]);
for(i = 0; i < window; i++)
{
temp=temp+x[i];
temp1=temp1+y[i];
}
sum[0]=temp;
sum[1]=temp1;
sum[2]=sum[3]=sum[4]=sum[5]=sum[6]=0;
for(i = 0;i < window;i++)
{
sum[2]=sum[2]+(x[i]*x[i]);
sum[3]=sum[3]+(x[i]*x[i]*x[i]);
sum[4]=sum[4]+(x[i]*x[i]*x[i]*x[i]);
sum[5]=sum[5]+(x[i]*y[i]);
sum[6]=sum[6]+(x[i]*x[i]*y[i]);
}
mat[0][0]=window;
mat[0][1]=mat[1][0]=sum[0];
mat[0][2]=mat[1][1]=mat[2][0]=sum[2];
mat[1][2]=mat[2][1]=sum[3];
mat[2][2]=sum[4];
mat[0][3]=sum[1];
mat[1][3]=sum[5];
mat[2][3]=sum[6];
temp=mat[1][0]/mat[0][0];
temp1=mat[2][0]/mat[0][0];
for(i = 0, j = 0; j < 3 + 1; j++)
{
mat[i+1][j]=mat[i+1][j]-(mat[i][j]*temp);
mat[i+2][j]=mat[i+2][j]-(mat[i][j]*temp1);
}
temp=mat[2][1]/mat[1][1];
temp1=mat[0][1]/mat[1][1];
for(i = 1,j = 0; j < 3 + 1; j++)
{
mat[i+1][j]=mat[i+1][j]-(mat[i][j]*temp);
mat[i-1][j]=mat[i-1][j]-(mat[i][j]*temp1);
}
temp=mat[0][2]/mat[2][2];
temp1=mat[1][2]/mat[2][2];
for(i = 0, j = 0; j < 3 + 1; j++)
{
mat[i][j]=mat[i][j]-(mat[i+2][j]*temp);
mat[i+1][j]=mat[i+1][j]-(mat[i+2][j]*temp1);
}
a3 = mat[2][3]/mat[2][2];
a2 = mat[1][3]/mat[1][1];
a1 = mat[0][3]/mat[0][0];
// zX^2 + yX + x
if (a3 < 0)
{
temp = - a2 / (2*a3);
*x_max = temp + x_start;
*y_max = (a3*temp*temp + a2*temp + a1) + y_data[x_start];
return 0;
}
else
return -1;
}
The scan is performed in an outer function, which calls the above function repeatedly and chooses then the highest local y_max.
The above works and peaks are found. Only the noise is much worse than with the LabVIEW counterpart (i.e. I get a strongly oscillating peak position, given the same input dataset and the same parameters). As the algorithm works, the above code should be conceptually correct, so I think it might be a numerical problem, as I simply use "floats" without further effort to improve numerical accuracy. Is this a possible answer? Does anyone have a tip on where I should be looking?
Thanks.
PS: I have done my search and found this very good overview and also this question, similar to mine (unfortunately with not many answers). I will study these further.
EDIT: I have found my problems being elsewhere. Improving the algorithm by removing certain output values (a sort of post-validation in which a result is only accepted if the result is within the moving window) brought the solution to the issue. Now I am satisfied with the results, i.e. they are comparable to those from LabVIEW. Nevertheless, thanks a lot for your comments.
Sorry to be late to the party, but if you have C/C++ it is really easy to port it to C# code using VS2013 Express (free version) and just port that into LabVIEW using the .NET toolset.
I am looking for an algorithm to prune short line segments from the output of an edge detector. As can be seen in the image (and link) below, there are several small edges detected that aren't "long" lines. Ideally I'd like just the 4 sides of the quadrangle to show up after processing, but if there are a couple of stray lines, it won't be a big deal... Any suggestions?
Image Link
Before finding the edges, pre-process the image with an open or close operation (or both), that is, erode followed by dilate, or dilate followed by erode. This should remove the smaller objects but leave the larger ones roughly the same.
I've looked for online examples, and the best I could find was on page 41 of this PDF.
I doubt that this can be done with a simple local operation. Look at the rectangle you want to keep - there are several gaps, hence performing a local operation to remove short line segments would probably heavily reduce the quality of the desired output.
Consequently, I would try to detect the rectangle as important content by closing the gaps, fitting a polygon, or something like that, and then in a second step discard the remaining unimportant content. Maybe the Hough transform could help.
UPDATE
I just used this sample application using a Kernel Hough Transform with your sample image and got four nice lines fitting your rectangle.
In case somebody steps on this thread, OpenCV 2.x brings an example named squares.cpp that basically nails this task.
I made a slight modification to the application to improve the detection of the quadrangle
Code:
#include "highgui.h"
#include "cv.h"
#include <iostream>
#include <math.h>
#include <string.h>
using namespace cv;
using namespace std;
void help()
{
cout <<
"\nA program using pyramid scaling, Canny, contours, contour simpification and\n"
"memory storage (it's got it all folks) to find\n"
"squares in a list of images pic1-6.png\n"
"Returns sequence of squares detected on the image.\n"
"the sequence is stored in the specified memory storage\n"
"Call:\n"
"./squares\n"
"Using OpenCV version %s\n" << CV_VERSION << "\n" << endl;
}
int thresh = 70, N = 2;
const char* wndname = "Square Detection Demonized";
// helper function:
// finds a cosine of angle between vectors
// from pt0->pt1 and from pt0->pt2
double angle( Point pt1, Point pt2, Point pt0 )
{
double dx1 = pt1.x - pt0.x;
double dy1 = pt1.y - pt0.y;
double dx2 = pt2.x - pt0.x;
double dy2 = pt2.y - pt0.y;
return (dx1*dx2 + dy1*dy2)/sqrt((dx1*dx1 + dy1*dy1)*(dx2*dx2 + dy2*dy2) + 1e-10);
}
// returns sequence of squares detected on the image.
// the sequence is stored in the specified memory storage
void findSquares( const Mat& image, vector<vector<Point> >& squares )
{
squares.clear();
Mat pyr, timg, gray0(image.size(), CV_8U), gray;
// karlphillip: dilate the image so this technique can detect the white square,
Mat out(image);
dilate(out, out, Mat(), Point(-1,-1));
// then blur it so that the ocean/sea become one big segment to avoid detecting them as 2 big squares.
medianBlur(out, out, 3);
// down-scale and upscale the image to filter out the noise
pyrDown(out, pyr, Size(out.cols/2, out.rows/2));
pyrUp(pyr, timg, out.size());
vector<vector<Point> > contours;
// find squares only in the first color plane
for( int c = 0; c < 1; c++ ) // was: c < 3
{
int ch[] = {c, 0};
mixChannels(&timg, 1, &gray0, 1, ch, 1);
// try several threshold levels
for( int l = 0; l < N; l++ )
{
// hack: use Canny instead of zero threshold level.
// Canny helps to catch squares with gradient shading
if( l == 0 )
{
// apply Canny. Take the upper threshold from slider
// and set the lower to 0 (which forces edges merging)
Canny(gray0, gray, 0, thresh, 5);
// dilate canny output to remove potential
// holes between edge segments
dilate(gray, gray, Mat(), Point(-1,-1));
}
else
{
// apply threshold if l!=0:
// tgray(x,y) = gray(x,y) < (l+1)*255/N ? 255 : 0
gray = gray0 >= (l+1)*255/N;
}
// find contours and store them all as a list
findContours(gray, contours, CV_RETR_LIST, CV_CHAIN_APPROX_SIMPLE);
vector<Point> approx;
// test each contour
for( size_t i = 0; i < contours.size(); i++ )
{
// approximate contour with accuracy proportional
// to the contour perimeter
approxPolyDP(Mat(contours[i]), approx, arcLength(Mat(contours[i]), true)*0.02, true);
// square contours should have 4 vertices after approximation
// relatively large area (to filter out noisy contours)
// and be convex.
// Note: absolute value of an area is used because
// area may be positive or negative - in accordance with the
// contour orientation
if( approx.size() == 4 &&
fabs(contourArea(Mat(approx))) > 1000 &&
isContourConvex(Mat(approx)) )
{
double maxCosine = 0;
for( int j = 2; j < 5; j++ )
{
// find the maximum cosine of the angle between joint edges
double cosine = fabs(angle(approx[j%4], approx[j-2], approx[j-1]));
maxCosine = MAX(maxCosine, cosine);
}
// if cosines of all angles are small
// (all angles are ~90 degree) then write quandrange
// vertices to resultant sequence
if( maxCosine < 0.3 )
squares.push_back(approx);
}
}
}
}
}
// the function draws all the squares in the image
void drawSquares( Mat& image, const vector<vector<Point> >& squares )
{
for( size_t i = 1; i < squares.size(); i++ )
{
const Point* p = &squares[i][0];
int n = (int)squares[i].size();
polylines(image, &p, &n, 1, true, Scalar(0,255,0), 3, CV_AA);
}
imshow(wndname, image);
}
int main(int argc, char** argv)
{
if (argc < 2)
{
cout << "Usage: ./program <file>" << endl;
return -1;
}
static const char* names[] = { argv[1], 0 };
help();
namedWindow( wndname, 1 );
vector<vector<Point> > squares;
for( int i = 0; names[i] != 0; i++ )
{
Mat image = imread(names[i], 1);
if( image.empty() )
{
cout << "Couldn't load " << names[i] << endl;
continue;
}
findSquares(image, squares);
drawSquares(image, squares);
imwrite("out.jpg", image);
int c = waitKey();
if( (char)c == 27 )
break;
}
return 0;
}
The Hough Transform can be a very expensive operation.
An alternative that may work well in your case is the following:
run 2 mathematical morphology operations called an image close (http://homepages.inf.ed.ac.uk/rbf/HIPR2/close.htm) with a horizontal and vertical line (of a given length determined from testing) structuring element respectively. The point of this is to close all gaps in the large rectangle.
run connected component analysis. If you have done the morphology effectively, the large rectangle will come out as one connected component. It then only remains to iterate through all the connected components and pick out the most likely candidate for the large rectangle.
Perhaps finding the connected components, then removing components with less than X pixels (empirically determined), followed by dilation along horizontal/vertical lines to reconnect the gaps within the rectangle
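The component-size filter suggested here can be sketched in plain C, independent of any image library. This is only an illustration under stated assumptions: `remove_small_components` is a hypothetical name, the image is a row-major binary buffer where non-zero means edge pixel, and pixels are treated as 8-connected.

```c
#include <stdint.h>
#include <stdlib.h>

/* Remove 8-connected components with fewer than min_area pixels from a
 * binary image (non-zero = foreground), in place.
 * Returns 0 on success, -1 if memory allocation fails. */
int remove_small_components(uint8_t *img, int w, int h, int min_area)
{
    int n = w * h;
    int *queue = malloc((size_t)n * sizeof *queue);  /* pixels of current component */
    uint8_t *seen = calloc((size_t)n, 1);
    if (!queue || !seen) { free(queue); free(seen); return -1; }

    for (int start = 0; start < n; start++) {
        if (!img[start] || seen[start])
            continue;
        int count = 0, next = 0;
        queue[count++] = start;
        seen[start] = 1;
        /* breadth-first search; queue[] ends up listing the whole component */
        while (next < count) {
            int p = queue[next++], px = p % w, py = p / w;
            for (int dy = -1; dy <= 1; dy++)
                for (int dx = -1; dx <= 1; dx++) {
                    int x = px + dx, y = py + dy;
                    if (x < 0 || y < 0 || x >= w || y >= h)
                        continue;
                    int q = y * w + x;
                    if (img[q] && !seen[q]) { seen[q] = 1; queue[count++] = q; }
                }
        }
        if (count < min_area)               /* too small: erase the component */
            for (int k = 0; k < count; k++)
                img[queue[k]] = 0;
    }
    free(queue);
    free(seen);
    return 0;
}
```

The value of `min_area` corresponds to the empirically determined X mentioned above; reconnecting gaps by dilation would be a separate step.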
It's possible to follow two main techniques:
Vector based operation: map your pixel islands into clusters (blob, voronoi zones, whatever). Then apply some heuristics to rectify the segments, like Teh-Chin chain approximation algorithm, and make your pruning upon vectorial elements (start, endpoint, length, orientation and so on).
Set based operation: cluster your data (as above). For every cluster, compute principal components and distinguish lines from circles or any other shape by looking for clusters showing only 1 significant eigenvalue (or 2 if you look for "fat" segments, which could resemble ellipses). Check the eigenvectors associated with the eigenvalues to get information about the orientation of the blobs, and make your choice.
Both ways could be easily explored with OpenCV (the former, indeed, falls under "Contour analysis" category of algos).
Here is a simple morphological filtering solution following the lines of @Tom10:
Solution in matlab:
se1 = strel('line',5,180); % linear horizontal structuring element
se2 = strel('line',5,90); % linear vertical structuring element
I = rgb2gray(imread('test.jpg'))>80; % threshold (since i had a grayscale version of the image)
Idil = imdilate(imdilate(I,se1),se2); % dilate contours so that they connect
Idil_area = bwareaopen(Idil,1200); % area filter them to remove the small components
The idea is to basically connect the horizontal contours to make a large component and filter by an area opening filter later on to obtain the rectangle.
Results: