Converting IplImage to a matrix or an array in Matlab

I am using OpenCV via Matlab to detect faces in a video and then do some processing using Matlab. At the moment I do face detection on the IplImage-structured frames (queried by cvQueryFrame) of the video. I save each of the queried frames as a jpg and then use the face coordinates to get the ROI for the required processing. See the portion of code outlining this below.
% After reading in frame from video..
for i=1:size
img = calllib('highgui210','cvQueryFrame',cvCapture);
calllib('cxcore210','cvFlip',img,img,1);
calllib('highgui210', 'cvSaveImage', 'ThisFrame.jpg', img, ptr);
% Rest of the processing comes here..
This being the case, I feel there should be an easier and less crude way to convert an IplImage to a matrix or array in Matlab. Is this possible? If yes, how is it done?
Some pointers in this direction would be much appreciated!

Try mexing the following code:
#include "mex.h"
#include <math.h>
#include <opencv/cxcore.h> /* IplImage, cvGet2D, cvReleaseImage (OpenCV 2.1-era C API) */
/*
* Usage:
* img = IplImage2mxArray( cvImgPtr, releaseFlag );
*/
void mexFunction( int nout, mxArray* pout[], int nin, const mxArray* pin[]) {
if ( nin != 2 )
mexErrMsgTxt("wrong number of inputs");
if ( nout != 1 )
mexErrMsgTxt("wrong number of outputs");
IplImage* cvImg = (IplImage*)mxGetData(pin[0]); // get the pointer
// allocate the output
mwSize imgDims[3] = {cvImg->height, cvImg->width, cvImg->nChannels};
pout[0] = mxCreateNumericArray( 3, imgDims, mxDOUBLE_CLASS, mxREAL );
if ( pout[0] == NULL )
mexErrMsgTxt("out of memeory");
double* imgP = mxGetPr(pout[0]);
double divVal = pow(2.0, cvImg->depth) - 1;
double releaseFlag = mxGetScalar( pin[1] );
for ( int x = 0 ; x < cvImg->width; x++ ) {
for ( int y = 0 ; y < cvImg->height; y++ ) {
CvScalar s = cvGet2D(cvImg, y, x);
for (int c = 0; c < cvImg->nChannels; c++) {
imgP[ y + x * cvImg->height + c * cvImg->height * cvImg->width ] = s.val[c] / divVal;
}
}
}
if ( releaseFlag != 0 ) {
cvReleaseImage( &cvImg );
}
}
You'll end up with a mex function, IplImage2mxArray; use it in Matlab:
>> cvImg = calllib('highgui210','cvQueryFrame',cvCapture);
>> img = IplImage2mxArray( cvImg, false ); % no cvReleaseImage after cvQueryFrame
Due to OpenCV's internal representation, the channels of img might be permuted (BGR instead of RGB). Also note that img might contain four channels: an additional alpha channel.
-Shai

Related

Are pointers within structures slowing down my code?

I am looking for some help or hints to speed up my code.
I have implemented a routine for computing the gravitational potential at a point (r,phi,lambda) in space from a set of spherical harmonic coefficients C_{n,m} and S_{n,m}. The equation is shown in the link below:
The computation includes the recursive evaluation of the latitude (phi) dependent associated Legendre polynomials P_{n,m}, starting with the first two values P_{0,0} and P_{1,1}.
At first, I had this implemented as a MATLAB C-MEX code, with only the core part of my code in C-language. I wanted to make a pure C-routine, but found that the code runs 3-5 times slower, which makes me wonder why. Could it be the way I define my structures and use pointers to pointers in the central code?
It seems to be the core computation part that takes the extra time, but that part did not change: before, I was passing pointers directly to variables, and now I am using pointers inside structures.
Any help is appreciated!
In the following, I will try to explain my code and show some extracts:
At the beginning of the program, I define three structures. One to hold the spherical harmonic coefficients, C_{n,m} and S_{n,m}, (ggm_struct), one to hold the computation coordinates (comp_struct) and one to hold the results (func_struct):
// Define constant variables
const double deg2rad = M_PI/180.0; // degrees to radians conversion factor
const double sfac = 1.0000E-280; // scaling factor
const double sqrt2 = 1.414213562373095; // sqrt(2)
const double sqrt3 = 1.732050807568877; // sqrt(3)
// Define structure to hold geopotential model
struct ggm_struct {
char product_type[100], modelname[100], errors[100], norm[100], tide_system[100];
double GM, R, *C, *S;
int max_degree, ncoef;
};
// Define structure to hold computation coordinates
struct comp_struct {
double *lat, *lon, *h;
double *r, *phi;
int nlat, nlon;
};
/* Define structure to hold results */
struct func_struct {
double *rval;
int npoints;
};
I then have a (sub-)function that starts by allocating space and then loads the coefficients from an ASCII file:
int read_gfc(char mfile[100], int *nmax, int *mmax, struct ggm_struct *ggm)
{
// Set file identifier
FILE *fid;
// Declare variables
char str[200], var[100];
int n, m, nid, l00 = 0, l10 = 0, l11 = 0;
double c, s;
// Determine number of coefficients
ggm->ncoef = (*nmax+2)*(*nmax+1)/2;
// Allocate memory for coefficients
ggm->C = (double*) malloc(ggm->ncoef*sizeof(double));
if (ggm->C == NULL){
printf("Error: Memory for C not allocated!");
return -ENOMEM;
}
ggm->S = (double*) malloc(ggm->ncoef*sizeof(double));
if (ggm->S == NULL){
printf("Error: Memory for S not allocated!");
return -ENOMEM;
}
// Open file
fid = fopen(mfile,"r");
// Check that file was opened correctly
if (fid == NULL){
printf("Error: opening file %s!",mfile);
return -ENOMEM;
}
// Read file header
while (fgets(str,200,fid) != NULL && strncmp(str,"end_of_head",11) != 0){
// Extract model parameters
if (strncmp(str,"product_type",12) == 0){ sscanf(str,"%s %s",var,ggm->product_type); }
if (strncmp(str,"modelname",9) == 0){ sscanf(str,"%s %s",var,ggm->modelname); }
if (strncmp(str,"earth_gravity_constant",22) == 0){ sscanf(str,"%s %lf",var,&ggm->GM); }
if (strncmp(str,"radius",6) == 0){ sscanf(str,"%s %lf",var,&ggm->R); }
if (strncmp(str,"max_degree",10) == 0){ sscanf(str,"%s %d",var,&ggm->max_degree); }
if (strncmp(str,"errors",6) == 0){ sscanf(str,"%s %s",var,ggm->errors); }
if (strncmp(str,"norm",4) == 0){ sscanf(str,"%s %s",var,ggm->norm); }
if (strncmp(str,"tide_system",11) == 0){ sscanf(str,"%s %s",var,ggm->tide_system); }
}
// Read coefficients
while (fgets(str,200,fid) != NULL){
// Extract parameters
sscanf(str,"%s %d %d %lf %lf",var,&n,&m,&c,&s);
// Store parameters
if (n <= *nmax && m <= *mmax) {
// Determine index
nid = (n+1)*n/2 + m;
// Store values
*(ggm->C+nid) = c;
*(ggm->S+nid) = s;
}
}
// Close file
fclose(fid);
// Return from function
return 0;
}
Afterwards, the computation grid is defined by an array of seven components. As an example, the array [-90 90 -180 180 1 1 0] defines a grid from -90 to 90 degrees latitude at 1-degree increments and from -180 to 180 degrees longitude at 1-degree increments. The height is zero. From this array, a computation grid is generated in a (sub-)function:
int make_grid(double *grid, struct comp_struct *inp)
{
// Declare variables
int n;
/* Echo routine */
printf("Creating grid of coordinates\n");
printf(" [lat1,lat2,dlat] = [%f,%f,%f]\n", *grid, *(grid+1), *(grid+4) );
printf(" [lon1,lon2,dlon] = [%f,%f,%f]\n", *(grid+2), *(grid+3), *(grid+5) );
printf(" h = %f\n", *(grid+6));
/* Latitude ------------------------------------------------------------- */
// Determine number of increments
inp->nlat = ceil( ( *(grid+1) - *grid + *(grid+4) ) / *(grid+4) );
// Allocate memory
inp->lat = (double*) malloc(inp->nlat*sizeof(double));
if (inp->lat== NULL){
printf("Error: Memory for LATITUDE (inp.lat) points not allocated!");
return -ENOMEM;
}
// Fill in values
*(inp->lat) = *(grid+1);
for (n = 1; n < inp->nlat-1; n++) {
*(inp->lat+n) = *(inp->lat+n-1) - *(grid+4);
}
*(inp->lat+inp->nlat-1) = *grid;
/* Longitude ------------------------------------------------------------ */
// Determine number of increments
inp->nlon = ceil( ( *(grid+3) - *(grid+2) + *(grid+5) ) / *(grid+5) );
// Allocate memory
inp->lon = (double*) malloc(inp->nlon*sizeof(double));
if (inp->lon== NULL){
printf("Error: Memory for LONGITUDE (inp.lon) points not allocated!");
return -ENOMEM;
}
// Fill in values
*(inp->lon) = *(grid+2);
for (n = 1; n < inp->nlon-1; n++) {
*(inp->lon+n) = *(inp->lon+n-1) + *(grid+5);
}
*(inp->lon+inp->nlon-1) = *(grid+3);
/* Height --------------------------------------------------------------- */
// Allocate memory
inp->h = (double*) malloc(inp->nlat*sizeof(double));
if (inp->h== NULL){
printf("Error: Memory for HEIGHT (inp.h) points not allocated!");
return -ENOMEM;
}
// Fill in values (the height is constant over the grid)
for (n = 0; n < inp->nlat; n++) {
*(inp->h+n) = *(grid+6);
}
// Return from function
return 0;
}
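As an illustration (driver code, not part of my actual program), the call for the example grid above would be:
// Illustrative usage: a global 1-degree grid at zero height,
// matching the example array [-90 90 -180 180 1 1 0] above.
double grid[7] = { -90.0, 90.0, -180.0, 180.0, 1.0, 1.0, 0.0 };
struct comp_struct inp;
if (make_grid(grid, &inp) != 0) {
    /* allocation failed; abort */
}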
These geographic coordinates are then transformed to spherical coordinates for the computation using another (sub-)routine
int geo2sph(struct comp_struct *inp, int *lgrid)
{
// Declare variables
double a = 6378137.0, e2 = 6.69437999014E-3; /* WGS84 parameters */
double x, y, z, sinlat, coslat, sinlon, coslon, R_E;
int i, j, nid;
/* Allocate space ------------------------------------------------------- */
// radius
inp->r = (double*) malloc(inp->nlat*sizeof(double));
if (inp->r== NULL){
printf("Error: Memory for SPHERICAL DISTANCE (inp.r) points not allocated!");
return -ENOMEM;
}
// phi
inp->phi = (double*) malloc(inp->nlat*sizeof(double));
if (inp->phi== NULL){
printf("Error: Memory for SPHERICAL LATITUDE (inp.phi) points not allocated!");
return -ENOMEM;
}
/* Loop over latitude =================================================== */
for (i = 0; i < inp->nlat; i++) {
// Compute sine and cosine of latitude (converted from degrees to radians)
sinlat = sin(deg2rad * *(inp->lat+i));
coslat = cos(deg2rad * *(inp->lat+i));
// Compute radius of curvature
R_E = a / sqrt( 1.0 - e2*sinlat*sinlat );
// Compute sine and cosine of longitude (converted from degrees to radians)
sinlon = sin(deg2rad * *(inp->lon));
coslon = cos(deg2rad * *(inp->lon));
// Compute rectangular coordinates
x = ( R_E + *(inp->h+i) ) * coslat * coslon;
y = ( R_E + *(inp->h+i) ) * coslat * sinlon;
z = ( R_E*(1.0-e2) + *(inp->h+i) ) * sinlat;
// Compute sqrt( x^2 + y^2 )
R_E = sqrt( x*x + y*y );
// Derive radial distance
*(inp->r+i) = sqrt( R_E * R_E + z*z );
// Derive spherical latitude
if (R_E < 1) {
if (z > 0) { *(inp->phi+i) = M_PI/2.0; }
else { *(inp->phi+i) = -M_PI/2.0; }
}
else {
*(inp->phi+i) = asin( z / *(inp->r+i) );
}
}
// Return from function
return 0;
}
Finally, the gravitational potential is computed within its own (sub-)function. This is the core part of the code, which is more or less the same as in the MATLAB C-MEX function. The only difference seems to be that before (in MATLAB MEX) everything was defined as (simple) double variables; now the variables are located inside a structure which contains pointers.
int gravpot(struct comp_struct *inp, struct ggm_struct *ggm, int *nmax,
int *mmax, int *lgrid, struct func_struct *out)
{
// Declare variables
double GMr, ar, t, u, u2, arn, gnm, hnm, P, Pp1, Pp2, msum;
double Pmm[*nmax+1], CPnm[*mmax+1], SPnm[*mmax+1];
int i, j, n, m, id;
// Allocate memory
out->rval = (double*) malloc(inp->nlat*inp->nlon*sizeof(double));
if (out->rval== NULL){
printf("Error: Memory for OUTPUT (out.rval) not allocated!");
return -ENOMEM;
}
/* Compute sectorial values of associated Legendre polynomials ========== */
// Define seed values ( divided by u^m )
Pmm[0] = sfac;
Pmm[1] = sqrt3 * sfac;
// Compute sectorial values, [1] eq. 13 and 28 ( divided by u^m )
for (m = 2; m <= *nmax; m++) {
Pmm[m] = sqrt( (2.0*m+1.0) / (2.0*m) ) * Pmm[m-1];
}
/* ====================================================================== */
/* Loop over latitude =================================================== */
for (i = 0; i < inp->nlat; i++) {
// Compute ratios to be used in summation
GMr = ggm->GM / *(inp->r+i);
ar = ggm->R / *(inp->r+i);
/* ---------------------------------------------------------------------
* Compute product of Legendre values and spherical harmonic coefficients.
* Products of similar degree are summed together, resulting in mmax
* values. The degree terms are latitude dependent, such that these mmax
* sums are valid for every point with the same latitude.
* The values of the associated Legendre polynomials, Pnm, are scaled by
* sfac = 10^(-280) and divided by u^m in order to prevent underflow and
* overflow of the coefficients.
* ------------------------------------------------------------------ */
// Form coefficients for Legendre recursive algorithm
t = sin(*(inp->phi+i));
u = cos(*(inp->phi+i));
u2 = u * u;
arn = ar;
/* Degree n = 0 terms ----------------------------------------------- */
// Compute order m = 0 term (S term is zero)
CPnm[0] = Pmm[0] * *(ggm->C);
/* Degree n = 1 terms ----------------------------------------------- */
// Compute (1,1) terms, [1] eq. 3
CPnm[1] = ar * Pmm[1] * *(ggm->C+2);
SPnm[1] = ar * Pmm[1] * *(ggm->S+2);
// Compute (1,0) Legendre value, [1] eq. 18 and 27
P = t * Pmm[1];
// Add (1,0) terms to sum (S term is zero), [1] eq. 3
CPnm[0] = CPnm[0] + ar * P * *(ggm->C+1);
/* Degree n = [2,n_max] --------------------------------------------- */
for (n = 2; n <= *nmax; n++) {
// Compute power term
arn = arn * ar;
/* Compute sectorial (m=n) terms ++++++++++++++++++++++++++++++++ */
// Extract associated Legendre value
Pp1 = Pmm[n];
// Compute product terms, [1] eq. 3
if (n <= *mmax) {
id = (n+1)*n/2 + n;
CPnm[n] = arn * Pp1 * *(ggm->C+id);
SPnm[n] = arn * Pp1 * *(ggm->S+id);
}
/* Compute first non-sectorial terms (m=n-1) ++++++++++++++++++++ */
// Compute associated Legendre value, [1] eq. 18 and 27
gnm = sqrt( 2.0*n );
P = gnm * t * Pp1;
// Add terms to summation, eq. 3 in [1]
if (n-1 <= *mmax) {
id = (n+1)*n/2 + n - 1;
CPnm[n-1] = CPnm[n-1] + arn * P * *(ggm->C+id);
SPnm[n-1] = SPnm[n-1] + arn * P * *(ggm->S+id);
}
/* Compute terms of order m = [n-2,1] +++++++++++++++++++++++++++ */
for (m = n-2; m > 0; m--) {
// Set previous values
Pp2 = Pp1;
Pp1 = P;
// Compute associated Legendre value, [1] eq. 18, 19 and 27
gnm = 2.0*(m+1.0) / sqrt( (n-m)*(n+m+1.0) );
hnm = sqrt( (n+m+2.0)*(n-m-1.0)/(n-m)/(n+m+1.0) );
P = gnm * t * Pp1 - hnm * u2 * Pp2;
// Add product terms to summation, eq. 3 in [1]
if (m <= *mmax) {
id = (n+1)*n/2 + m;
CPnm[m] = CPnm[m] + arn * P * *(ggm->C+id);
SPnm[m] = SPnm[m] + arn * P * *(ggm->S+id);
}
}
/* Compute zonal terms (m=0) ++++++++++++++++++++++++++++++++++++ */
// Compute associated Legendre value, [1] eq. 18, 19 and 27
gnm = 2.0 / sqrt( n*(n+1.0) );
hnm = sqrt( (n+2.0)*(n-1.0)/n/(n+1.0) );
P = ( gnm * t * P - hnm * u2 * Pp1 ) / sqrt2;
// Add terms to summation (S term is zero), [1] eq. 3
id = (n+1)*n/2;
CPnm[0] = CPnm[0] + arn * P * *(ggm->C+id);
} /* ---------------------------------------------------------------- */
/* Loop over longitude ============================================== */
for (j = 0; j < inp->nlon; j++) {
/* -----------------------------------------------------------------
* All associated Legendre polynomials (latitude dependent) are now
* computed and multiplied by the corresponding spherical harmonic
* coefficient. These products are scaled by u^m, meaning that
* Horner's scheme is used in the following summation.
* -------------------------------------------------------------- */
// Initialise order-dependent sum
msum = 0.0;
// Derive longitude id
id = j + i * *lgrid;
// Loop over order (m > 0)
for (m = *mmax; m > 0; m--) {
// Add to order-dependent sum using Horner's scheme, [1] eq. 2, 3 and 31
msum = ( msum + cos( m * deg2rad * *(inp->lon+id) ) * CPnm[m]
+ sin( m * deg2rad * *(inp->lon+id) ) * SPnm[m] ) * u;
}
// Add zero order term to sum
msum = msum + CPnm[0];
// Rescale value into gravitational potential, [1] eq. 1
*(out->rval+i+j*inp->nlat) = GMr * msum / sfac;
} /* ================================================================ */
} /* ==================================================================== */
// Return from function
return 0;
}
Again, any help is greatly appreciated, and additional information can be supplied if relevant, but this already became a long post. I have a hard time accepting that pure C code runs slower than the MATLAB C-MEX code.
To put it simply: yes, pointers can prevent some compiler optimizations, resulting in a potential slowdown. At least, this is clearly the case with ICC and, to a lesser extent, with GCC. The performance of the program is strongly impacted by pointer aliasing and vectorization.
Indeed, the compiler cannot easily know whether the provided pointers alias each other, or alias the address of some field of the provided data structures. As a result, compilers tend to be conservative and assume that the pointed-to values can change at any time, reloading them often. This can prevent optimizations like the splitting of some loops in gravpot (with GCC -- see line 119 of this modified code). Moreover, indirection and aliasing tend to prevent vectorization of the hot loops (i.e. the use of SIMD instructions provided by the target processor). Vectorization can strongly impact the performance of a code.
To give an example, here is the initial code of geo2sph and here is a slightly modified implementation. In the first case, ICC generates a slow scalar implementation, while in the second case, ICC generates a significantly faster vectorized implementation. The only difference between the two implementations is the use of the restrict keyword. This keyword tells the compiler that, for the lifetime of the pointer, only the pointer itself or a value directly derived from it (such as pointer+1) will be used to access the object to which it points (see here for more information). Note that the restrict keyword is dangerous and one should be very careful while using it, since the compiler may generate bad code if the restrict hint is wrong (very hard to debug). Alternatively, you can help the compiler generate vectorized code using the OpenMP SIMD directive #pragma omp simd (see here for the result). Note that you should be sure the target code can be safely vectorized (e.g. the loop iterations must be independent).
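As a rough illustration, a restrict-qualified version of the hot loop might look like this (a sketch under the assumption that the arrays never overlap, not the exact code from the links above):
#include <math.h>

/* Sketch: the restrict qualifiers promise the compiler that lat, h, r and
 * phi never alias, so pointed-to values need not be reloaded on every
 * iteration and the loop becomes a candidate for vectorization. */
void geo2sph_core(int nlat, const double * restrict lat,
                  const double * restrict h,
                  double * restrict r, double * restrict phi)
{
    const double a = 6378137.0, e2 = 6.69437999014E-3; /* WGS84, as above */
    for (int i = 0; i < nlat; i++) {      /* or annotate: #pragma omp simd */
        double sinlat = sin(lat[i]), coslat = cos(lat[i]); /* lat in radians */
        double R_E = a / sqrt(1.0 - e2 * sinlat * sinlat);
        double p = (R_E + h[i]) * coslat;  /* equals sqrt(x*x + y*y) */
        double z = (R_E * (1.0 - e2) + h[i]) * sinlat;
        r[i] = sqrt(p * p + z * z);
        phi[i] = (p < 1.0) ? copysign(M_PI / 2.0, z) : asin(z / r[i]);
    }
}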

Issues trying to scale up sine wave in c

Hopefully somebody can point out why this isn't working or where I may be going wrong. I'm producing a sine wave by way of for loops in C. The ultimate aim is to produce a .ppm file displaying this. I'm working on a 1-to-1 pixel ratio. My box is 128H*256W. The sine wave is displaying, but since the sine values are only ever between -1 and 1, the result is a very small, two-pixel-high "wave". This is my code. I tried simply multiplying by a greater number to increase the size of the y values in the hope it would plot correctly, but this does little or, worse, causes the application to stop running. Any ideas are very welcome.
for (x = 0; x < H; x++)
{
y =(int) H/2+ sin(x*(2*PI));
y = y * 50;
image[y][x][1] = 0;
image[y][x][2] = 255;
image[y][x][3] = 0;
}
EDIT: This is what is being produced in the .ppm file when opened via IrfanView. Also, I'm #defining PI as 3.141592653589793. Could this be part of the issue?
first sine wave .ppm
I conjecture that y is an int.
Your sin value will be truncated to an integer; 0 for most cases, but very occasionally -1 or +1.
The fix is simple: use a floating point for y, and cast once you want to use it as an array index.
As y is commented to be an int and H appears to be an int constant, perform calculations as double first, then convert to int.
Use round() to avoid the truncation effect of simply casting a double to int.
y = (int) round(50*(sin(x*(2*PI)) + H/2.0));
The original code also scaled H/2 by 50; I suspect the code only wants to scale the sin() and not the offset.
#define XOffset 0
#define YOffset (H/2.0)
#define XScale (2*PI)
#define YScale 50
y = (int) round(YScale*sin(x*XScale + XOffset) + YOffset);
Defensive programming tip: since y is calculated, ensure it is in the valid index range before using it as an index.
// Assuming image is a fixed-size array
#define Y_MAX (sizeof image/sizeof image[0] - 1)
if (y >= 0 && y <= Y_MAX) {
image[y][x][1] = 0;
image[y][x][2] = 255;
image[y][x][3] = 0;
}
y = y * 50, where y = H/2 (+ or - 1) gives you y around 25*H, which is out of bounds.
A closer approximation is this:
y = (int) ( H/2 + H/2 * sin(x*2*PI) )
which gives the extremes H/2 - H/2 = 0 and H/2 + H/2 = H, which is one too high. So, we scale not by H/2 but by (H-1)/2:
y = (int) ( H/2 + (H-1)/2 * sin(x*2*PI) )
which gives us an y-range 0 to H-1.
To have a bit more control over the period of the sine wave, let's write it like this:
sin( x/W * 2*PI )
Here, we divide x by W so that x/W itself will range from 0 to 1.
It is then scaled by 2*PI to produce a range from 0 to 2π. This will plot one period of the sine wave across the entire width. If we introduce a frequency factor f:
sin( f * x/W * 2*PI )
we can now say how many periods to draw, even fractions. For f=1 it will draw one period, f=2 two periods, and f=0.5 half a period.
Here's a small JS demo showing three values for f: 0.5 is red, 1 is green and 2 is white:
var c = document.getElementById('c'),
W = c.width,
H = c.height,
ctx = c.getContext('2d'),
image = ctx.getImageData(0,0,W,H);
for ( var i = 0; i < image.data.length; i +=4) {
image.data[i+0]=0;
image.data[i+1]=0;
image.data[i+2]=0;
image.data[i+3]=255;
}
function render(image,colidx,f) {
for ( var x = 0; x < W; x++ )
{
var y = H/2 - Math.round( H/2 * Math.sin(f*x/W*Math.PI*2) );
if ( y >=0 && y < H ) {
if ( colidx & 1 ) image.data[ 4*( W*y + x ) + 0] = 255;
if ( colidx & 2 ) image.data[ 4*( W*y + x ) + 1] = 255;
if ( colidx & 4 ) image.data[ 4*( W*y + x ) + 2] = 255;
}
}
}
render(image,1,0.5);
render(image,2,1);
render(image,7,2);
ctx.putImageData(image, 0,0);
canvas{ border: 1px solid red;}
<canvas id='c' width='256' height='128'></canvas>
The code then becomes:
float f = 1;
for (x = 0; x < W; x++)
{
y = (int) ( (H-1)/2 + (H-1)/2 * sin(f * x/W * 2*PI) );
image[y][x][0] = 0;
image[y][x][1] = 255;
image[y][x][2] = 0;
}

Issue displaying IDirect3DTexture8 after backporting from IDirect3DTexture9

I'm trying to backport someone's Direct3D 9 port of Quake 1 by id Software to Direct3D 8, so I can port it to the original Xbox (which only supports the D3D8 API).
After making the changes to use Direct3D 8, it displays some mashed-up pixels on the screen that appear to be in little squares :/ (see pictures).
Does anyone know what's gone wrong here? It works flawlessly with D3D9. Are there some extra arguments required for D3D8 that I'm missing, rect pitch maybe?
The data being passed in is a Quake 1 .lmp 2D image file. "It consists of two integers (width and height) followed by a string of width x height bytes, each of which is an index into the Quake palette."
It's passed to the D3D_ResampleTexture() function.
Any help would be much appreciated.
Image output using D3D8
Image output using D3D9
The code:
void D3D_ResampleTexture (image_t *src, image_t *dst)
{
int y, x , srcpos, srcbase, dstpos;
unsigned int *dstdata, *srcdata;
// take an unsigned pointer to the dest data that we'll actually fill
dstdata = (unsigned int *) dst->data;
// easier access to src data for 32 bit resampling
srcdata = (unsigned int *) src->data;
// nearest neighbour for now
for (y = 0, dstpos = 0; y < dst->height; y++)
{
srcbase = (y * src->height / dst->height) * src->width;
for (x = 0; x < dst->width; x++, dstpos++)
{
srcpos = srcbase + (x * src->width / dst->width);
if (src->flags & IMAGE_32BIT)
dstdata[dstpos] = srcdata[srcpos];
else if (src->palette)
dstdata[dstpos] = src->palette[src->data[srcpos]];
else Sys_Error ("D3D_ResampleTexture: !(flags & IMAGE_32BIT) without palette set");
}
}
}
void D3D_LoadTextureStage3 (LPDIRECT3DTEXTURE8/*9*/ *tex, image_t *image)
{
int i;
image_t scaled;
D3DLOCKED_RECT LockRect;
memset (&LockRect, 0, sizeof(D3DLOCKED_RECT));
// check scaling here first
for (scaled.width = 1; scaled.width < image->width; scaled.width *= 2);
for (scaled.height = 1; scaled.height < image->height; scaled.height *= 2);
// clamp to max texture size
if (scaled.width > /*d3d_DeviceCaps.MaxTextureWidth*/640) scaled.width = /*d3d_DeviceCaps.MaxTextureWidth*/640;
if (scaled.height > /*d3d_DeviceCaps.MaxTextureHeight*/480) scaled.height = /*d3d_DeviceCaps.MaxTextureHeight*/480;
IDirect3DDevice8/*9*/_CreateTexture(d3d_Device, scaled.width, scaled.height,
(image->flags & IMAGE_MIPMAP) ? 0 : 1,
/*(image->flags & IMAGE_MIPMAP) ? D3DUSAGE_AUTOGENMIPMAP :*/ 0,
(image->flags & IMAGE_ALPHA) ? D3DFMT_A8R8G8B8 : D3DFMT_X8R8G8B8,
D3DPOOL_MANAGED,
tex
);
// lock the texture rectangle
//(*tex)->LockRect (0, &LockRect, NULL, 0);
IDirect3DTexture8/*9*/_LockRect(*tex, 0, &LockRect, NULL, 0);
// fill it in - how we do it depends on the scaling
if (scaled.width == image->width && scaled.height == image->height)
{
// no scaling
for (i = 0; i < (scaled.width * scaled.height); i++)
{
unsigned int p;
// retrieve the correct texel - this will either be direct or a palette lookup
if (image->flags & IMAGE_32BIT)
p = ((unsigned *) image->data)[i];
else if (image->palette)
p = image->palette[image->data[i]];
else Sys_Error ("D3D_LoadTexture: !(flags & IMAGE_32BIT) without palette set");
// store it back
((unsigned *) LockRect.pBits)[i] = p;
}
}
else
{
// save out lockbits in scaled data pointer
scaled.data = (byte *) LockRect.pBits;
// resample data into the texture
D3D_ResampleTexture (image, &scaled);
}
// unlock it
//(*tex)->UnlockRect (0);
IDirect3DTexture8/*9*/_UnlockRect(*tex, 0);
// tell Direct 3D that we're going to be needing to use this managed resource shortly
//FIXME
//(*tex)->PreLoad ();
}
LPDIRECT3DTEXTURE8/*9*/ D3D_LoadTextureStage2 (image_t *image)
{
d3d_texture_t *tex;
// look for a match
// create a new one
tex = (d3d_texture_t *) malloc (sizeof (d3d_texture_t));
// link it in
tex->next = d3d_Textures;
d3d_Textures = tex;
// fill in the struct
tex->LastUsage = 0;
tex->d3d_Texture = NULL;
// copy the image
memcpy (&tex->TexImage, image, sizeof (image_t));
// upload through direct 3d
D3D_LoadTextureStage3 (&tex->d3d_Texture, image);
// return the texture we got
return tex->d3d_Texture;
}
LPDIRECT3DTEXTURE8/*9*/ D3D_LoadTexture (char *identifier, int width, int height, byte *data, /*bool*/qboolean mipmap, /*bool*/qboolean alpha)
{
image_t image;
image.data = data;
image.flags = 0;
image.height = height;
image.width = width;
image.palette = d_8to24table;
strcpy (image.identifier, identifier);
if (mipmap) image.flags |= IMAGE_MIPMAP;
if (alpha) image.flags |= IMAGE_ALPHA;
return D3D_LoadTextureStage2 (&image);
}
When you lock the texture, you have to observe the returned Pitch member of the D3DLOCKED_RECT structure. Your code assumes that all the data is contiguous, but the Pitch can be larger than the width of a scanline, to allow for locking a subregion and for buffer layouts where the pixels at the end of one scanline are not contiguous with the beginning of the next.
Look at Chapter 4 of my book "The Direct3D Graphics Pipeline" to see an example of accessing a surface and using the Pitch properly.
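For example, here is a minimal sketch of the question's no-scaling fill loop rewritten to honor the pitch (names follow the question's code; this is illustrative, not the exact fix):
/* Illustrative fix: write one scanline at a time, advancing the destination
 * by LockRect.Pitch (in bytes), which may exceed scaled.width * 4. */
unsigned char *dstRow = (unsigned char *) LockRect.pBits;
int x, y;
for (y = 0; y < scaled.height; y++)
{
    unsigned int *dst = (unsigned int *) dstRow;
    for (x = 0; x < scaled.width; x++)
    {
        int i = y * scaled.width + x;
        if (image->flags & IMAGE_32BIT)
            dst[x] = ((unsigned int *) image->data)[i];
        else
            dst[x] = image->palette[image->data[i]];
    }
    dstRow += LockRect.Pitch; /* step by pitch, not by width */
}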
For anyone else who comes across this issue: it was due to the way the image was being loaded into the Xbox's memory; it needed to be swizzled.

Inverse filtering on OpenCV - accessing DFT values and multiplying DFT matrices

I am trying to perform an inverse and a pseudo-inverse filtering in the frequency domain.
However, I am having trouble accessing the DFT coefficients and multiplying the DFT matrices afterwards, since I get complex numbers and, therefore, actually two matrices...
Basically the inverse filtering performs
F = G/H,
where F is the restored image, G is the blurred image and H is the kernel that blurred the image.
The pseudo-inverse needs to access the values in H, since values near 0 should be replaced in order to avoid problems in the restoration. For this we must change H so that:
H(u,v) = 1/H(u,v) if H(u,v) > threshold
H(u,v) = 0 otherwise
Since H is complex, 1/H(u,v) is evaluated as conj(H(u,v)) / |H(u,v)|^2, which is why the code below works with the squared real and imaginary parts.
I have a kernel1 (h_1), and the images imf (restored) and img (blurred). Here is the code:
// compute the DFTs of the kernel (dft_B) and the blurred image (dft_A)
cvDFT( dft_A, dft_A, CV_DXT_FORWARD, complexInput1->height );
cvDFT( dft_B, dft_B, CV_DXT_FORWARD, complexInput2->height );
// the first type is the inverse filtering
if (type == 1) {
printf("...performing inverse filtering\n");
// dividing the transforms
cvDiv(dft_A, dft_B, dft_C, 1);
}
// the second type is the pseudo-inverse filtering
else {
printf("...prepare kernel for pseudo-inverse filtering\n");
// will try to access the real values in order to see if value is above a threshold
cvSplit( dft_B, image_Re1, image_Im1, 0, 0 );
// pointers to access the data into the real and imaginary matrices
uchar * dRe1 = (uchar *)image_Re1->imageData;
uchar * dIm1 = (uchar *)image_Im1->imageData;
int width = image_Re1->width;
int height = image_Re1->height;
int step = image_Re1->widthStep;
image_Re2 = cvCreateImage(cvGetSize(image_Re1), IPL_DEPTH_32F, 1);
image_Im2 = cvCreateImage(cvGetSize(image_Im2), IPL_DEPTH_32F, 1);
// pointers to access the data into the real and imaginary matrices
// it will be the resulting pseudo-inverse filter
uchar * dRe2 = (uchar *)image_Re2->imageData;
uchar * dIm2 = (uchar *)image_Im2->imageData;
printf("...building kernel for pseudo-inverse filtering\n");
for ( i = 0; i < height; i++ ) {
for ( j = 0; j < width; j++ ) {
// generate the 1/H(i,j) value
if (dRe1[i * step + j] > threshold) {
float realsq = dRe1[i * step + j]*dRe1[i * step + j];
float imagsq = dIm1[i * step + j]*dIm1[i * step + j];
dRe2[i * step + j] = dRe1[i * step + j] / (realsq + imagsq);
dIm2[i * step + j] = -1 * (dIm1[i * step + j] / (realsq + imagsq));
}
else {
dRe2[i * step + j] = 0;
dIm2[i * step + j] = 0;
}
}
}
printf("...merging final kernel\n");
cvMerge(image_Re2, image_Im2, 0, 0, dft_B);
printf("...performing pseudo-inverse filtering\n");
cvMulSpectrums(dft_A, dft_B, dft_C, 1);
}
printf("...performing IDFT\n");
cvDFT(dft_C, dft_H, CV_DXT_INV_SCALE, 1);
printf("...getting size\n");
cvGetSubRect(dft_H, &tmp3, cvRect(0, 0, img->width, img->height));
printf("......(%d, %d) - (%d, %d)\n", tmp3.cols, tmp3.rows, restored->width, restored->height);
cvSplit( &tmp3, image_Re1, image_Im1, 0, 0 );
cvNamedWindow("re", 0);
cvShowImage("re", image_Re2);
cvWaitKey(0);
printf("...copying final image\n");
// error is in the line below
cvCopy(image_Re1, imf, NULL);
I have an error on the last line: --- OpenCV Error: Assertion failed (src.depth() == dst.depth() && src.size() == dst.size()) in cvCopy, file /build/buildd/opencv-2.1.0/src/cxcore/cxcopy.cpp, line 466
I know it has to do with the size or depth, but I don't know how to control them. Anyway, I tried to show image_Re1 and it is empty...
Can anyone shed some light on it?
Seems like you didn't initialize your imf picture!
cvCopy needs an initialized destination, so do:
IplImage* imf = cvCreateImage(cvGetSize(image_Re1), IPL_DEPTH_32F, 1);
first, and I think it'll work.
Also, you don't free the image space in this code (cvReleaseImage(&image)).

need to create a webm video from RGB frames

I have an app that generates a bunch of JPEGs that I need to turn into a WebM video. I'm trying to get my RGB data from the JPEGs into the vpxenc sample. I can see the basic shapes from the original JPEGs in the output video, but everything is tinted green (even pixels that should be black are about halfway green) and every other scanline has some garbage in it.
I'm trying to feed it VPX_IMG_FMT_YV12 data, which I'm assuming is structured like so:
for each frame
8-bit Y data
8-bit averages of each 2x2 V block
8-bit averages of each 2x2 U block
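If my assumption is right, the three planes of a w x h frame would sit in one contiguous buffer like this (an illustrative sketch; frame, w and h are hypothetical names, not verified against libvpx):
/* Assumed YV12 layout for one w x h frame: full-resolution Y, then
 * quarter-resolution V, then quarter-resolution U. */
unsigned char *y_plane = frame;                        /* w * h bytes       */
unsigned char *v_plane = frame + w * h;                /* (w/2)*(h/2) bytes */
unsigned char *u_plane = frame + w * h + (w/2)*(h/2);  /* (w/2)*(h/2) bytes */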
Here is a source image and a screenshot of the video that is coming out:
Images
It's entirely possible that I'm doing the RGB->YV12 conversion incorrectly, but even if I only encode the 8-bit Y data and set the U and V blocks to 0, the video looks about the same. I'm basically running my RGB data through this equation:
// (R, G, and B are 0-255)
float y = 0.299f*R + 0.587f*G + 0.114f*B;
float v = (R-y)*0.713f;
float u = (B-v)*0.565f;
... and then to produce the 2x2 filtered values for U and V that I write into vpxenc, I just do (a + b + c + d) / 4, where a, b, c, d are the U or V values of each 2x2 pixel block, as the sketch below shows.
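Spelled out, that averaging step looks like this (an illustrative sketch; uFull, uOut, w and h are hypothetical names, not from my actual code):
/* Average each 2x2 block of a w x h plane of U (or V) values (uFull)
 * into a (w/2) x (h/2) downsampled plane (uOut). */
for (int y = 0; y < h / 2; y++) {
    for (int x = 0; x < w / 2; x++) {
        int a = uFull[(2*y)   * w + 2*x];
        int b = uFull[(2*y)   * w + 2*x + 1];
        int c = uFull[(2*y+1) * w + 2*x];
        int d = uFull[(2*y+1) * w + 2*x + 1];
        uOut[y * (w/2) + x] = (unsigned char)((a + b + c + d) / 4);
    }
}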
So I'm wondering:
Is there an easier way (in code) to take RGB data and feed it to vpx_codec_encode to get a nice webm video?
Is my RGB->YV12 conversion wrong somewhere?
Any help would be greatly appreciated.
freefallr: Sure, here is the code. Note that it converts RGB->YUV in place, as well as putting the YV12 output into pFullYPlane/pDownsampledUPlane/pDownsampledVPlane. This code produced nice-looking WebM videos when I modified their vpxenc sample to use this data.
void RGB_To_YV12( unsigned char *pRGBData, int nFrameWidth, int nFrameHeight, void *pFullYPlane, void *pDownsampledUPlane, void *pDownsampledVPlane )
{
int nRGBBytes = nFrameWidth * nFrameHeight * 3;
// Convert RGB -> YV12. We do this in-place to avoid allocating any more memory.
unsigned char *pYPlaneOut = (unsigned char*)pFullYPlane;
int nYPlaneOut = 0;
for ( int i=0; i < nRGBBytes; i += 3 )
{
unsigned char B = pRGBData[i+0];
unsigned char G = pRGBData[i+1];
unsigned char R = pRGBData[i+2];
float y = (float)( R*66 + G*129 + B*25 + 128 ) / 256 + 16;
float u = (float)( R*-38 + G*-74 + B*112 + 128 ) / 256 + 128;
float v = (float)( R*112 + G*-94 + B*-18 + 128 ) / 256 + 128;
// NOTE: We're converting pRGBData to YUV in-place here as well as writing out YUV to pFullYPlane/pDownsampledUPlane/pDownsampledVPlane.
pRGBData[i+0] = (unsigned char)y;
pRGBData[i+1] = (unsigned char)u;
pRGBData[i+2] = (unsigned char)v;
// Write out the Y plane directly here rather than in another loop.
pYPlaneOut[nYPlaneOut++] = pRGBData[i+0];
}
// Downsample to U and V.
int halfHeight = nFrameHeight >> 1;
int halfWidth = nFrameWidth >> 1;
unsigned char *pVPlaneOut = (unsigned char*)pDownsampledVPlane;
unsigned char *pUPlaneOut = (unsigned char*)pDownsampledUPlane;
for ( int yPixel=0; yPixel < halfHeight; yPixel++ )
{
int iBaseSrc = ( (yPixel*2) * nFrameWidth * 3 );
for ( int xPixel=0; xPixel < halfWidth; xPixel++ )
{
pVPlaneOut[yPixel * halfWidth + xPixel] = pRGBData[iBaseSrc + 2];
pUPlaneOut[yPixel * halfWidth + xPixel] = pRGBData[iBaseSrc + 1];
iBaseSrc += 6;
}
}
}
Never mind. The scheme I was using was correct but I had a bug in the U/V downsampling code.
