This is the function I have written for 2D Convolution in C:
typedef struct PGMImage {
    int w;
    int h;
    int* data;
} GrayImage;

GrayImage Convolution2D(GrayImage image, GrayImage kernel) {
    int aH, aW, bW, bH, r, c, x, y, xx, yy, X, Y;
    int temp = 0;
    GrayImage conv;
    CreateGrayImage(&conv, image.w, image.h);
    aH = image.h;
    aW = image.w;
    bH = kernel.h;
    bW = kernel.w;
    if (aW < bW || aH < bH) {
        fprintf(stderr, "Image cannot have smaller dimensions than the blur kernel");
    }
    for (r = aH - 1; r >= 0; r--) {
        for (c = aW - 1; c >= 0; c--) {
            temp = 0;
            for (y = bH - 1; y >= 0; y--) {
                yy = bH - y - 1;
                for (x = bW - 1; x >= 0; x--) {
                    xx = bW - x - 1;
                    X = c + (x - (bW / 2));
                    Y = r + (y - (bH / 2));
                    if (X >= 0 && X < aW && Y >= 0 && Y < aH) {
                        temp += ((kernel.data[(yy * bW) + xx]) * (image.data[(Y * aW) + X]));
                    }
                }
            }
            conv.data[(r * aW) + c] = temp;
        }
    }
    return conv;
}
I reproduced this function in MATLAB and found that it gives different values for certain pixels compared to MATLAB's built-in 2D convolution function (conv2). I can't figure out where I am going wrong with the logic. Please help.
EDIT:
Here's the stock image I am using (512*512):
https://drive.google.com/file/d/0B3qeTSY-DQRvdWxCZWw5RExiSjQ/view?usp=sharing
Here's the kernel (3*3):
https://drive.google.com/file/d/0B3qeTSY-DQRvdlQzamcyVmtLVW8/view?usp=sharing
On using the above function I get
46465 46456 46564
45891 46137 46158
45781 46149 46030
But MATLAB's conv2 gives me
46596 46618 46627
46073 46400 46149
45951 46226 46153
for the same pixels (rows 239-241, cols 316-318).
This is the MATLAB code I am using to compare the values:
pgm_img = imread('path\to\lena512.pgm');
kernel = imread('path\to\test_kernel.pgm');
sz_img = size(pgm_img);
sz_ker = size(kernel);
conv = conv2(double(pgm_img), double(kernel), 'same');
pgm_img = padarray(pgm_img, floor(0.5*sz_ker), 'both');
convolve = zeros(sz_img);
for i = floor(0.5*sz_ker(1))+1 : floor(0.5*sz_ker(1))+sz_img(1)
    for j = floor(0.5*sz_ker(2))+1 : floor(0.5*sz_ker(2))+sz_img(2)
        startX = j - floor(sz_ker(2)/2);
        startY = i - floor(sz_ker(1)/2);
        endX = j + floor(sz_ker(2)/2);
        endY = i + floor(sz_ker(1)/2);
        block = pgm_img(startY:endY, startX:endX);
        prod = double(block).*double(kernel);
        convolve(i-floor(0.5*sz_ker(1)), j-floor(0.5*sz_ker(2))) = sum(sum(prod));
    end
end
disp(conv(239:241,316:318));
disp(convolve(239:241,316:318));
One obvious difference is that your C code uses ints, while the MATLAB code uses doubles. Change your C code to use doubles and see if the results are still different.
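For example, a minimal sketch of that change (keeping the question's GrayImage struct and CreateGrayImage() helper, and rounding the double accumulator back to int at the end) could look like this:
#include <math.h>   /* for lround() */

GrayImage Convolution2D_double(GrayImage image, GrayImage kernel) {
    GrayImage conv;
    CreateGrayImage(&conv, image.w, image.h);
    for (int r = 0; r < image.h; r++) {
        for (int c = 0; c < image.w; c++) {
            double acc = 0.0;                          /* accumulate in double */
            for (int y = 0; y < kernel.h; y++) {
                int yy = kernel.h - y - 1;             /* flipped kernel row */
                for (int x = 0; x < kernel.w; x++) {
                    int xx = kernel.w - x - 1;         /* flipped kernel column */
                    int Y = r + (y - kernel.h / 2);
                    int X = c + (x - kernel.w / 2);
                    if (X >= 0 && X < image.w && Y >= 0 && Y < image.h) {
                        acc += (double)kernel.data[yy * kernel.w + xx]
                             * (double)image.data[Y * image.w + X];
                    }
                }
            }
            conv.data[r * image.w + c] = (int)lround(acc);
        }
    }
    return conv;
}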
I created an Image Convolution library for the simple case of an image represented as a plain 2D float array.
The function supports arbitrary kernels and is verified against MATLAB's implementation.
So all that's needed on your side is to call it with your generated kernel.
You can use its generated DLL inside MATLAB and verify that it yields the same results as MATLAB's image convolution functions.
Image Convolution - GitHub.
Related
I'm currently struggling to make a 3D Sobel edge detector in C (which I am quite new to). It's not exactly working as expected (highlighting non-edges within a solid 3D object) and I was hoping someone might see where I've gone wrong. (and sorry for the poor spacing in this post)
First of all, im is the input image which has been copied into tm with a 1 pixel border on each side.
I loop through the image:
for (z = im.zlo; z <= im.zhi; z++) {
    for (y = im.ylo; y <= im.yhi; y++) {
        for (x = im.xlo; x <= im.xhi; x++) {
I make an array which will house the change in the x, y, and z directions, and loop through a 3x3x3 cube:
            int dxdydz[3] = {0, 0, 0};
            for (a = -1; a < 2; a++) {
                for (b = -1; b < 2; b++) {
                    for (c = -1; c < 2; c++) {
Now here's the meat, where it gets a bit tricky. I'm weighting my Sobel operator such that if you imagine one 2D surface of the kernel, it would be {{1,2,1},{2,4,2},{1,2,1}}. In other words, the weight of a kernel pixel is related to its 4-connected nearness to the center pixel.
To accomplish this, I define e as 3 - (|a| + |b| + |c|), so that it is either 0, 1, or 2. The kernel will be weighted by 3^e at each pixel.
The sign of the kernel pixel will just be determined by the sign of a, b, or c.
                        int e = 3 - (abs(a) + abs(b) + abs(c));
Now I loop through a, b, and c by packaging them into an array and looping over indices 0, 1, 2. When a, for example, is 0, we don't want to add anything to x, so we exclude that case with an if statement (8 levels deep!).
                        int abc[3] = {a, b, c};
                        for (i = 0; i < 3; i++) {
                            if (abc[i] != 0) {
The value to add should just be the image value at that pixel multiplied by the kernel value at that pixel. abc[i] is just -1 or 1, and (int)pow(3, e) is the nearness-to-center weight.
                                dxdydz[i] += abc[i] * (int)pow(3, e) * tm.u[z+a][y+b][x+c];
                            }
                        }
                    }
                }
            }
Lastly, take the square root of the sum of the squared changes in x, y, and z.
            int mag2 = 0;
            for (i = 0; i < 3; i++) {
                mag2 += (int)pow(dxdydz[i], 2);
            }
            im.u[z][y][x] = (int)sqrt(mag2);
        }
    }
}
Of course I could just loop through the image and multiply 3x3x3 cubes by the 3D kernels:
int kx[3][3][3] = {{{-1,-2,-1}, { 0, 0, 0}, { 1, 2, 1}},
                   {{-2,-4,-2}, { 0, 0, 0}, { 2, 4, 2}},
                   {{-1,-2,-1}, { 0, 0, 0}, { 1, 2, 1}}};
int ky[3][3][3] = {{{-1,-2,-1}, {-2,-4,-2}, {-1,-2,-1}},
                   {{ 0, 0, 0}, { 0, 0, 0}, { 0, 0, 0}},
                   {{ 1, 2, 1}, { 2, 4, 2}, { 1, 2, 1}}};
int kz[3][3][3] = {{{-1, 0, 1}, {-2, 0, 2}, {-1, 0, 1}},
                   {{-2, 0, 2}, {-4, 0, 4}, {-2, 0, 2}},
                   {{-1, 0, 1}, {-1, 0, 1}, {-1, 0, 1}}};
But I think the loop approach is a lot sexier.
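For reference, a rough, untested sketch of that direct-kernel version (reusing the same im/tm structures and the kx/ky/kz arrays above; the kernel index order may need adjusting to match their layout) would be:
for (z = im.zlo; z <= im.zhi; z++) {
    for (y = im.ylo; y <= im.yhi; y++) {
        for (x = im.xlo; x <= im.xhi; x++) {
            int gx = 0, gy = 0, gz = 0;
            for (a = -1; a <= 1; a++) {
                for (b = -1; b <= 1; b++) {
                    for (c = -1; c <= 1; c++) {
                        int v = tm.u[z+a][y+b][x+c];
                        gx += kx[a+1][b+1][c+1] * v;
                        gy += ky[a+1][b+1][c+1] * v;
                        gz += kz[a+1][b+1][c+1] * v;
                    }
                }
            }
            im.u[z][y][x] = (int)sqrt((double)(gx*gx + gy*gy + gz*gz));
        }
    }
}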
I'm new to ray tracing and trying to program one in C. But my program keeps showing a dot (around 1-3 pixels) of the sphere in the wrong places, and now I'm confused. This feels like a very stupid question, but I'm confused about exactly how big a radius of 1 actually is. What I mean is: if the radius is 1, is the circle 2 pixels wide?
I know all the calculations and I triple-checked for errors in my code, but just in case, here is part of it:
Directions:
//size: 1024x768, view point (512 384 1), screen (0 0 0) to (1024 768 0)
ray[0] = x - start_x;
ray[1] = y - start_y;
ray[2] = 0 - start_z;
//normalize
double length;
length = (sqrt((ray[0]*ray[0]) + (ray[1]*ray[1]) + (ray[2]*ray[2])));
ray[0] = ray[0]/length;
ray[1] = ray[1]/length;
ray[2] = ray[2]/length;
Intersection:
temp = top; //my struct with sphere data: _x, _y, _z, _r, _red, _green, _blue
//x and y are the current pixel values
while (temp != NULL) {
    x_diff = start_x - temp->_x + 0.0;
    y_diff = start_y - temp->_y + 0.0;
    z_diff = start_z - temp->_z + 0.0;
    //a = 1 because my direction is normalized
    b = 2.0 * ((rayVector[0] * x_diff) + (rayVector[1] * y_diff) + (rayVector[2] * z_diff));
    c = (x_diff * x_diff * 1.0) + (y_diff * y_diff) + (z_diff * z_diff) - (temp->_r * temp->_r);
    check = (b * b) - (4.0 * c);
    if (check < 0) { //0 intersections
        pixels[width][height][0] = 0.0;
        pixels[width][height][1] = 0.0;
        pixels[width][height][2] = 0.0;
    }
    else if (check == 0) { //1 intersection
        r1 = (b * -1.0) / 2.0;
        if (r1 < nearest_z) {
            nearest_z = r1;
            pixels[width][height][0] = temp->_red;
            pixels[width][height][1] = temp->_green;
            pixels[width][height][2] = temp->_blue;
        }
    }
    else { //2 intersections
        r1 = ((b * -1.0) + sqrt(check)) / 2.0;
        r2 = ((b * -1.0) - sqrt(check)) / 2.0;
        if ((r1 < r2) && (r1 < nearest_z)) {
            nearest_z = r1;
            pixels[width][height][0] = 255.0;
            pixels[width][height][1] = 0;
            pixels[width][height][2] = 0;
        }
        else if ((r2 < r1) && (r2 < nearest_z)) {
            nearest_z = r2;
            pixels[width][height][0] = temp->_red;
            pixels[width][height][1] = temp->_green;
            pixels[width][height][2] = temp->_blue;
        }
    }
    temp = temp->next;
}
I haven't done any lighting yet since even the flat colouring doesn't work. I'm new to OpenGL, so expect me to be missing some common functions in the code. Thanks in advance.
Edit:
I only have one sphere currently, but my output looks like: img1
I was expecting a bigger circle. Also, I had a printf for each intersection (if there is one), and when I plotted them manually on paper they form a 4x5-pixel square. But there are 4 dots in the output.
Edit 2: I changed the sphere to x = 512, y = 384, z = -21, r = 30, and it gave me this:
img2
Again, I only have one sphere, yet there are 4 in the image. Also, there are holes between the lines?
If I change the z value to -20, the output is all white (the colour of the sphere).
I use glDrawPixels(1024,768,GL_RGB,GL_FLOAT,pixels); to draw
I wrote an RGB output file and everything seems to be in the right place, but when I draw it in the program, it is off.
I want to do 2D convolution of an image with a Gaussian kernel which is not centred at the origin, given by the equation:
h(x-x', y-y') = exp(-((x-x')^2 + (y-y')^2) / (2*sigma))
Let's say the centre of the kernel is (1,1) instead of (0,0). How should I change the following code for generating the kernel and for the convolution?
int krowhalf = krow/2, kcolhalf = kcol/2;
int sigma = 1;

// sum is for normalization
float sum = 0.0;

// generate kernel
for (int x = -krowhalf; x <= krowhalf; x++)
{
    for (int y = -kcolhalf; y <= kcolhalf; y++)
    {
        r = sqrtl((x-1)*(x-1) + (y-1)*(y-1));
        gKernel[x + krowhalf][y + kcolhalf] = exp(-(r*r)/(2*sigma));
        sum += gKernel[x + krowhalf][y + kcolhalf];
    }
}

// normalize the kernel
for (int i = 0; i < krow; ++i)
    for (int j = 0; j < kcol; ++j)
        gKernel[i][j] /= sum;
float **convolve2D(float **in, float **out, int h, int v, float **kernel, int kCols, int kRows)
{
    int kCenterX = kCols / 2;
    int kCenterY = kRows / 2;
    int i, j, m, mm, n, nn, ii, jj;

    for (i = 0; i < h; ++i)                  // rows
    {
        for (j = 0; j < v; ++j)              // columns
        {
            for (m = 0; m < kRows; ++m)      // kernel rows
            {
                mm = kRows - 1 - m;          // row index of flipped kernel
                for (n = 0; n < kCols; ++n)  // kernel columns
                {
                    nn = kCols - 1 - n;      // column index of flipped kernel

                    // index of input signal, used for checking boundary
                    ii = i + (m - kCenterY);
                    jj = j + (n - kCenterX);

                    // ignore input samples which are out of bounds
                    // (out[i][j] is assumed to start at zero, since it is accumulated with +=)
                    if (ii >= 0 && ii < h && jj >= 0 && jj < v)
                        out[i][j] += in[ii][jj] * kernel[mm][nn];
                }
            }
        }
    }
    return out;
}
Since you're using the convolution operator, you have 2 choices:
1. Use its shift-invariance property.
To do so, just calculate the image using a regular convolution filter (better done using either conv2 or imfilter) and then shift the result.
You should mind the boundary condition you'd like to employ (see imfilter's boundary options).
2. Calculate the shifted result directly.
You can do this with loops as you suggested, or, more easily, create a non-symmetric kernel and still use imfilter or conv2.
Sample Code (MATLAB)
clear();
mInputImage = imread('3.png');
mInputImage = double(mInputImage) / 255;
mConvolutionKernel = zeros(3, 3);
mConvolutionKernel(2, 2) = 1;
mOutputImage01 = conv2(mConvolutionKernel, mInputImage);
mConvolutionKernelShifted = [mConvolutionKernel, zeros(3, 150)];
mOutputImage02 = conv2(mConvolutionKernelShifted, mInputImage);
figure();
imshow(mOutputImage01);
figure();
imshow(mOutputImage02);
The tricky part is knowing how to crop the second image on the same axes as the first.
Then you'll have a shifted image.
You can use any kernel and any function which applies convolution.
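And if you'd rather stay in C, a rough drop-in sketch of option 2 for the kernel-generation part of your code (illustrative only; it reuses your krow, kcol, krowhalf, kcolhalf, sigma and gKernel names, and the convolution code itself does not change) would be:
double cy = 1.0, cx = 1.0;   /* desired centre offset from the geometric middle */
double sum = 0.0;

for (int x = -krowhalf; x <= krowhalf; x++) {
    for (int y = -kcolhalf; y <= kcolhalf; y++) {
        double dx = (double)x - cx;
        double dy = (double)y - cy;
        /* your code divides by 2*sigma; the textbook Gaussian uses 2*sigma*sigma */
        gKernel[x + krowhalf][y + kcolhalf] = exp(-(dx*dx + dy*dy) / (2.0 * sigma));
        sum += gKernel[x + krowhalf][y + kcolhalf];
    }
}

for (int i = 0; i < krow; ++i)
    for (int j = 0; j < kcol; ++j)
        gKernel[i][j] /= sum;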
Enjoy.
b_k = 1;
while (b_k <= iv0[1]) {
    h = vplus_data[0];
    u1 = vmax->data[(int)((1.0 + (double)k) + 1.0) - 1];
    if ((h <= u1) || rtIsNaN(u1)) {
        minval_data_idx_0 = h;
    } else {
        minval_data_idx_0 = u1;
    }
    b_k = 2;
}

b_k = 1;
while (b_k <= iv0[1]) {
    h = vmin->data[(int)((1.0 + (double)k) + 1.0) - 1];
    if ((h >= minval_data_idx_0) || rtIsNaN(minval_data_idx_0)) {
    } else {
        h = minval_data_idx_0;
    }
    vplus_data[0] = h;
    b_k = 2;
}
This generated code corresponds to the min function, taking the minimum of h and u1.
Can anyone tell me why MATLAB generates such syntax? Why the while loop, when I don't see anything change inside the while block?
The MATLAB code:
v(k+1) = max(vmin(k+1), min(vplus, vmax(k+1)));
Notice there are two loops, one for the min and one for the max.
I can't explain why the generated code ends up like that, but it must have something to do with how you wrote your MATLAB code. It looks strange, but if it works then it probably doesn't matter.
If you're curious about the generator, start from something very simple and watch how the generated code changes as your code gets more complex. Try variations like these:
z = min(x, y);
z = max(w, min(x, y));
for i = 1:length(v)
    z(i) = max(w, min(v(i), y));
end
Keep on modifying the test code a bit at a time to make it like the code that prompted this question and maybe you'll discover exactly what triggers the result you're seeing.
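For comparison, here is a straightforward hand-written C version of that MATLAB expression for scalar doubles; it only shows what the line computes, is not what MATLAB Coder emits, and its NaN handling may differ from the generated rtIsNaN() checks:
#include <math.h>   /* fmin(), fmax() */

/* Hand-written equivalent of v(k+1) = max(vmin(k+1), min(vplus, vmax(k+1)))
 * for scalar doubles -- for comparison only, not generator output. */
static double clamp_step(double vmin_k1, double vplus, double vmax_k1)
{
    return fmax(vmin_k1, fmin(vplus, vmax_k1));
}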
I've been using the FJCore library in a Silverlight project to help with some realtime image processing, and I'm trying to figure out how to get a tad more compression and performance out of the library. Now, as I understand it, the JPEG standard allows you to specify a chroma subsampling ratio (see http://en.wikipedia.org/wiki/Chroma_subsampling and http://en.wikipedia.org/wiki/Jpeg); and it appears that this is supposed to be implemented in the FJCore library using the HsampFactor and VsampFactor arrays:
public static readonly byte[] HsampFactor = { 1, 1, 1 };
public static readonly byte[] VsampFactor = { 1, 1, 1 };
However, I'm having a hard time figuring out how to use them. It looks to me like the current values are supposed to represent 4:4:4 subsampling (i.e., no subsampling at all), and that if I wanted to get 4:1:1 subsampling, the right values would be something like this:
public static readonly byte[] HsampFactor = { 2, 1, 1 };
public static readonly byte[] VsampFactor = { 2, 1, 1 };
At least, that's the way that other similar libraries use these values (for instance, see the example code here for libjpeg).
However, neither the above values of {2, 1, 1} nor any other set of values that I've tried besides {1, 1, 1} produce a legible image. Nor, in looking at the code, does it seem like that's the way it's written. But for the life of me, I can't figure out what the FJCore code is actually trying to do. It seems like it's just using the sample factors to repeat operations that it's already done -- i.e., if I didn't know better, I'd say that it was a bug. But this is a fairly established library, based on some fairly well established Java code, so I'd be surprised if that were the case.
Does anybody have any suggestions for how to use these values to get 4:2:2 or 4:1:1 chroma subsampling?
For what it's worth, here's the relevant code from the JpegEncoder class:
for (comp = 0; comp < _input.Image.ComponentCount; comp++)
{
    Width = _input.BlockWidth[comp];
    Height = _input.BlockHeight[comp];
    inputArray = _input.Image.Raster[comp];

    for (i = 0; i < _input.VsampFactor[comp]; i++)
    {
        for (j = 0; j < _input.HsampFactor[comp]; j++)
        {
            xblockoffset = j * 8;
            yblockoffset = i * 8;
            for (a = 0; a < 8; a++)
            {
                // set Y value, check bounds
                int y = ypos + yblockoffset + a;
                if (y >= _height) break;
                for (b = 0; b < 8; b++)
                {
                    int x = xpos + xblockoffset + b;
                    if (x >= _width) break;
                    dctArray1[a, b] = inputArray[x, y];
                }
            }
            dctArray2 = _dct.FastFDCT(dctArray1);
            dctArray3 = _dct.QuantizeBlock(dctArray2, FrameDefaults.QtableNumber[comp]);
            _huf.HuffmanBlockEncoder(buffer, dctArray3, lastDCvalue[comp], FrameDefaults.DCtableNumber[comp], FrameDefaults.ACtableNumber[comp]);
            lastDCvalue[comp] = dctArray3[0];
        }
    }
}
And notice that in the i & j loops, they're not controlling any kind of pixel skipping: if HsampFactor[0] is set to two, it's just grabbing two blocks instead of one.
I figured it out. I thought that by setting the sampling factors, you were telling the library to subsample the raster components itself. Turns out that when you set the sampling factors, you're actually telling the library the relative size of the raster components that you're providing. In other words, you need to do the chroma subsampling of the image yourself, before you ever submit it to the FJCore library for compression. Something like this is what it's looking for:
private byte[][,] GetSubsampledRaster()
{
    byte[][,] raster = new byte[3][,];
    raster[Y] = new byte[width / hSampleFactor[Y], height / vSampleFactor[Y]];
    raster[Cb] = new byte[width / hSampleFactor[Cb], height / vSampleFactor[Cb]];
    raster[Cr] = new byte[width / hSampleFactor[Cr], height / vSampleFactor[Cr]];

    int rgbaPos = 0;
    for (short y = 0; y < height; y++)
    {
        int Yy = y / vSampleFactor[Y];
        int Cby = y / vSampleFactor[Cb];
        int Cry = y / vSampleFactor[Cr];
        int Yx = 0, Cbx = 0, Crx = 0;
        for (short x = 0; x < width; x++)
        {
            // Convert to YCbCr colorspace.
            byte b = RgbaSample[rgbaPos++];
            byte g = RgbaSample[rgbaPos++];
            byte r = RgbaSample[rgbaPos++];
            YCbCr.fromRGB(ref r, ref g, ref b);

            // Only include the byte in question in the raster if it matches the appropriate sampling factor.
            if (IncludeInSample(Y, x, y))
            {
                raster[Y][Yx++, Yy] = r;
            }
            if (IncludeInSample(Cb, x, y))
            {
                raster[Cb][Cbx++, Cby] = g;
            }
            if (IncludeInSample(Cr, x, y))
            {
                raster[Cr][Crx++, Cry] = b;
            }

            // For YCbCr, we ignore the Alpha byte of the RGBA byte structure, so advance beyond it.
            rgbaPos++;
        }
    }
    return raster;
}

static private bool IncludeInSample(int slice, short x, short y)
{
    // Hopefully this gets inlined . . .
    return ((x % hSampleFactor[slice]) == 0) && ((y % vSampleFactor[slice]) == 0);
}
There might be additional ways to optimize this, but it's working for now.