Passing back arrays through mex - c

I've been at this for a couple days now, have tried every variation I can think of, and looked at countless examples. I just can't get it working.
I'm trying to make a mexFunction to call from MATLAB. This mexFunction calls into another C function I have, let's call it retrieveValues, which returns an array and the length of that array. I need to return both of those back to the MATLAB function, which, as I understand it, means I need to put them in the plhs array.
I call my mexFunction from matlab like this:
[foofooArray, foofooCount] = getFoo();
Which, to my understanding, means that nlhs = 2, plhs is an array of length 2, nrhs = 0, and prhs is just a pointer.
Here's my code for the mexFunction:
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray* prhs[])
{
foo* fooArray;
int fooCount;
plhs = mxCreateNumericMatrix(1, 2, mxUINT64_CLASS, mxREAL);
//feels like I shouldn't need this
retrieveValues(&fooArray, &fooCount);
plhs[0] = fooArray;
plhs[1] = fooCount;
}
Running the MATLAB program gets me "One or more output arguments not assigned during call".
I've tested and confirmed that the values are being returned from retrieveValues correctly.

You are correct that the plhs = mxCreateNumericMatrix(...) is not needed. Also, note that nlhs is the number of left-hand-sides you supply in MATLAB - so in your case, you're calling it with 2 left-hand-sides. Here's how to return trivial scalar values:
plhs[0] = mxCreateDoubleScalar(2);
plhs[1] = mxCreateDoubleScalar(3);
To handle your actual return values, you'll need to do something to copy the values out of foo and into a newly-created mxArray. For example, if your function returned doubles, you might do this:
double * values;
int numValues;
myFcn(&values, &numValues);
/* Build a 1 x numValues real double matrix for return to MATLAB */
plhs[0] = mxCreateDoubleMatrix(1, numValues, mxREAL);
/* Copy from 'values' into the data part of plhs[0] */
memcpy(mxGetPr(plhs[0]), values, numValues * sizeof(double));
EDIT Of course someone somewhere needs to de-allocate values in both my example and yours.
EDIT 2 Complete executable example code:
#include <stdlib.h>
#include <string.h>
#include "mex.h"
void doStuff(double ** data, int * numData) {
*numData = 7;
*data = (double *) malloc(*numData * sizeof(**data));
for (int idx = 0; idx < *numData; ++idx) {
(*data)[idx] = idx;
}
}
void mexFunction( int nlhs, mxArray * plhs[],
int nrhs, const mxArray * prhs[] ) {
double * data;
int numData;
doStuff(&data, &numData);
plhs[0] = mxCreateDoubleMatrix(1, numData, mxREAL);
memcpy(mxGetPr(plhs[0]), data, numData * sizeof(double));
free(data);
plhs[1] = mxCreateDoubleScalar(numData);
}

Here is an example:
testarr.cpp
#include "mex.h"
#include <stdlib.h>
#include <string.h>
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray* prhs[])
{
// validate number of arguments
if (nrhs != 0 || nlhs > 2) {
mexErrMsgTxt("Wrong number of arguments");
}
// create C-array (or you can receive this array from a function)
int len = 5;
double *arr = (double*) malloc(len*sizeof(double));
for(int i=0; i<len; i++) {
arr[i] = 10.0 * i;
}
// return outputs from MEX-function
plhs[0] = mxCreateDoubleMatrix(1, len, mxREAL);
memcpy(mxGetPr(plhs[0]), arr, len*sizeof(double));
if (nlhs > 1) {
plhs[1] = mxCreateDoubleScalar(len);
}
// deallocate heap space
free(arr);
}
MATLAB:
>> mex -largeArrayDims testarr.cpp
>> [a,n] = testarr
a =
0 10 20 30 40
n =
5

Related

Python C Extension

I am having issues returning a 2D array from a C extension back to Python. When I allocate memory using malloc the returned data is rubbish. When I just initialise an array like sol_matrix[nt][nvar] the returned data is as expected.
#include <Python.h>
#include <numpy/arrayobject.h>
#include <math.h>
#define NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION
// function to be solved by Euler solver
double func (double xt, double y){
double y_temp = pow(xt, 2);
y = y_temp;
return y;
}
static PyObject* C_Euler(double h, double xn)
{
double y_temp, dydx; //temps required for solver
double y_sav = 0; //temp required for solver
double xt = 0; //starting value for xt
int nvar = 2; //number of variables (including time)
int nt = xn/h; //timesteps
double y = 0; //y starting value
//double sol_matrix[nt][nvar]; //works fine
double **sol_matrix = malloc(nt * sizeof(double*)); //doesn't work
for (int i=0; i<nt; ++i){
sol_matrix[i] = malloc (nvar * sizeof(double));
}
int i=0;
//solution loop - Euler method.
while (i < nt){
sol_matrix[i][0]=xt;
sol_matrix[i][1]=y_sav;
dydx = func(xt, y);
y_temp = y_sav + h*dydx;
xt = xt+h;
y_sav=y_temp;
i=i+1;
}
npy_intp dims[2];
dims[0] = nt;
dims[1] = 2;
//Create Python object to copy solution array into, get pointer to
//beginning of array, memcpy the data from the C solution matrix
//to the Python object.
PyObject *newarray = PyArray_SimpleNew(2, dims, NPY_DOUBLE);
double *p = (double *) PyArray_DATA(newarray);
memcpy(p, sol_matrix, sizeof(double)*(nt*nvar));
// return array to Python
return newarray;
}
static PyObject* Euler(PyObject* self, PyObject* args)
{
double h, xn;
if (!PyArg_ParseTuple(args, "dd", &h, &xn)){
return NULL;
}
return Py_BuildValue("O", C_Euler(h,xn));
}
Could you provide any guidance on where I am going wrong?
Thank you.

The data in sol_matrix is not in contiguous memory, it's in nt separately allocated arrays. Therefore the line
memcpy(p, sol_matrix, sizeof(double)*(nt*nvar));
is not going to work.
I'm not a big fan of pointer-to-pointer arrays, so I believe your best option is to allocate sol_matrix as one big chunk:
double *sol_matrix = malloc(nt*nvar * sizeof(double));
This does mean you can't do 2D indexing, so you will need to do
// OLD: sol_matrix[i][0]=xt;
sol_matrix[i*nvar + 0] = xt;
In contrast
double sol_matrix[nt][nvar]; //works fine
is a single big chunk of memory so the copy works fine.

lapack dgels_ segmentation fault 11

I am trying to use LAPACK's dgels_ in C to solve a linear least squares problem. I have to read the matrix A (assumed to have full rank and m>=n) and a vector b from 2 text files. I can easily compile my code, but when I try to run it I get a "segmentation fault 11", and I can't really see why. It is my first time using LAPACK, so maybe I am using the dgels_ function incorrectly? As I understand it, the solution x overwrites the vector b:
lssolve.c:
#include <stdlib.h>
#include <stdio.h>
#include "linalg.h"
/* C prototype for LAPACK routine DGELS */
void dgels_(const char * trans, const int * m, const int * n, const int *
nrhs, double * A, const int * lda, double * B, const int * ldb, double * work,
int * lwork,int * info);
int main(int argc, char * argv[]) {
vector_t * b_t = NULL;
matrix_t * A_t = NULL;
char trans = 'N';
int m, n, nrhs, mb, lda, ldb, info, lwork;
double optwork;
double * work;
// we read the matrix A and the vector b:
b_t = read_vector("b.txt");
A_t = read_matrix("A.txt");
m = A_t-> m; //number of rows in A
n = A_t-> n; //number of columns in A
nrhs = 1; //number of columns in B (will always be 1, since we read b_t with read_vector)
mb = b_t -> n; //number of rows in B
if (mb != m ) { //end program if A and B don't have the same number of rows
free(A_t);
free(b_t);
fprintf(stderr, "Sorry, but the matrix A and the vector b have incompatible dimensions. Good Bye!\n");
exit(EXIT_FAILURE);
}
//We make A and B into the wanted input form for the dgels_-function:
double * B = b_t -> v;
double ** A = A_t ->A;
lda = m;
ldb = mb;
//we calculate the optimal size of the work array:
lwork = -1;
dgels_(&trans, &n, &m, &nrhs, *A, &lda, B, &ldb, &optwork, &lwork, &info);
lwork = (int)optwork;
//we allocate space for the work array:
work = (double*)malloc( lwork*sizeof(double));
//solving the least squares problem:
dgels_(&trans, &n, &m, &nrhs, *A, &lda, B, &ldb, work, &lwork, &info);
//Check whether the call exited successfully
//(LAPACK: info < 0 means an illegal argument, info > 0 means A is rank deficient):
if (info < 0){
fprintf(stderr, "Sorry, but illegal arguments were used, and therefore a least squares solution cannot be computed. Good Bye!\n");
exit(EXIT_FAILURE);
} else if(info > 0){
fprintf(stderr, "Sorry, but A doesn't have full rank, and therefore a least squares solution cannot be computed. Good Bye!\n");
exit(EXIT_FAILURE);
}
//Saving the least square problem as a vector_t:
vector_t * x = NULL;
x->n = mb;
x->v = B;
print_vector(x);
//Free memory
free_vector(b_t);
free_matrix(A_t);
free_vector(x);
return(EXIT_SUCCESS);
}
I am using the functions read_matrix, read_vector, print_vector, print_matrix and free_vector, which is why I use the struct vector_t and matrix_t:
typedef struct vector {
unsigned long n; /* length of vector */
double * v; /* pointer to array of length n */
} vector_t;
typedef struct matrix {
unsigned long m; /* number of rows */
unsigned long n; /* number of columns */
double ** A; /* pointer to two-dimensional array */
} matrix_t;
I don't think that anything is wrong with read_vector and read_matrix because I can easily do this and use print_vector or print_matrix before I do all of the other operations.

You dereference a NULL pointer here, causing the segfault:
//Saving the least square problem as a vector_t:
vector_t * x = NULL;
x->n = mb;
x->v = B;
Maybe you should use/create a new vector_t instead of just a pointer to a vector_t?

Mex File dot product

I'm trying to implement some elementary linear algebra routines in MEX files in C for practice, and I'm stuck with dot products. Here's what I have so far:
#define char16_t UINT16_T //shenanigans with the compiler
#include "mex.h"
void dotProd(double *a, double *b, double z, mwSize n)
{
mwSize i;
for(i=0;i<n;i++){
z+=a[i] * b[i];
}
}
/* The gateway function */
void mexFunction( int nlhs, mxArray *plhs[],
int nrhs, const mxArray *prhs[])
{
double z=0; //Output scalar
double *b, *a; //Input vectors
int n;
a = mxGetPr(prhs[0]); //pointer to a
b = mxGetPr(prhs[1]); //pointer to b
n = mxGetM(prhs[0]);
// Create output
plhs[0] = mxCreateDoubleScalar(z);
dotProd(a,b,z,(mwSize)n);
}
The problem is that when I test this code:
a=rand(2,1);
b=rand(2,1);
z=dotProd(a,b);
I get:
z=0
even though a and b are not orthogonal. I verified this with the MATLAB dot() function. I've picked over the code and can't quite seem to find where I'm going awry. Some suggestions would be appreciated.
Thank you.

That's because you're not returning the result of the dot product. z is passed to dotProd by value, so the function works on its own local copy. Any changes made to that copy are discarded when dotProd returns and are never seen by the caller. You need to update the function that computes the dot product so that it returns the result. In addition, you are setting the output of the MEX function before computing the dot product.
As such, do this:
// Change - Remove z as input
double dotProd(double *a, double *b, mwSize n)
{
mwSize i;
double z = 0.0; // Initialize z to 0.0
for(i=0;i<n;i++){
z+=a[i] * b[i];
}
return z; // Return z
}
Then simply do:
void mexFunction( int nlhs, mxArray *plhs[],
int nrhs, const mxArray *prhs[])
{
double z; //Output scalar - Change - don't need to initialize
double *b, *a; //Input vectors
int n;
a = mxGetPr(prhs[0]); //pointer to a
b = mxGetPr(prhs[1]); //pointer to b
n = mxGetM(prhs[0]);
// Create output
z = dotProd(a,b, (mwSize)n); // Change - returning output
plhs[0] = mxCreateDoubleScalar(z);
}
If you insist on changing z inside the function rather than having the function return anything, you'll need to pass a pointer to z and modify the value it points to. In other words, you would do this:
// Change - Make z a pointer to a double
void dotProd(double *a, double *b, double *z, mwSize n)
{
mwSize i;
for(i=0;i<n;i++){
*z+=a[i] * b[i]; // Change - Refer to pointer
}
}
Now, do:
void mexFunction( int nlhs, mxArray *plhs[],
int nrhs, const mxArray *prhs[])
{
double z = 0.0;
double *b, *a; //Input vectors
int n;
a = mxGetPr(prhs[0]); //pointer to a
b = mxGetPr(prhs[1]); //pointer to b
n = mxGetM(prhs[0]);
// Create output
dotProd(a,b, &z, (mwSize)n); // Change - Pass pointer of z to function
plhs[0] = mxCreateDoubleScalar(z);
}
BTW, in either version you need to call dotProd before you set the output. You kept getting 0 because plhs[0] was created while z was still 0, and dotProd was only called afterwards.

Cuda retrieving 3d array

I am having trouble trying to figure out how to retrieve a 3D array from the GPU.
I want to allocate the memory for the 3D array in the host code, call the kernel, where the array will be populated, and then retrieve the 3D array in the host code into a return variable in the mexFunction (host code).
I have made several attempts at it; here is my latest code. The results are all '0's, where they should be '7'. Can anyone tell me where I'm going wrong? It might have something to do with the 3D parameters; I don't think I fully understand that part.
simulate3DArrays.cpp
/* Device code */
__global__ void simulate3DArrays(cudaPitchedPtr devPitchedPtr,
int width,
int height,
int depth)
{
int threadId;
threadId = (blockIdx.x * blockDim.x) + threadIdx.x;
size_t pitch = devPitchedPtr.pitch;
for (int widthIndex = 0; widthIndex < width; widthIndex++) {
for (int heightIndex = 0; heightIndex < height; heightIndex++) {
*((double*)(((char*)devPitchedPtr.ptr + threadId * pitch * height) + heightIndex * pitch) + widthIndex) = 7.0;
}
}
}
mexFunction.cu
/* Host code */
#include <stdio.h>
#include "mex.h"
/* Kernel function */
#include "simulate3DArrays.cpp"
/* Define some constants. */
#define width 5
#define height 9
#define depth 6
void displayMemoryAvailability(mxArray **MatlabMemory);
void mexFunction(int nlhs,
mxArray *plhs[],
int nrhs,
mxArray *prhs[])
{
double *output;
mwSize ndim3 = 3;
mwSize dims3[] = {height, width, depth};
plhs[0] = mxCreateNumericArray(ndim3, dims3, mxDOUBLE_CLASS, mxREAL);
output = mxGetPr(plhs[0]);
cudaExtent extent = make_cudaExtent(width * sizeof(double), height, depth);
cudaPitchedPtr devicePointer;
cudaMalloc3D(&devicePointer, extent);
simulate3DArrays<<<1,depth>>>(devicePointer, width, height, depth);
cudaMemcpy3DParms deviceOuput = { 0 };
deviceOuput.srcPtr.ptr = devicePointer.ptr;
deviceOuput.srcPtr.pitch = devicePointer.pitch;
deviceOuput.srcPtr.xsize = width;
deviceOuput.srcPtr.ysize = height;
deviceOuput.dstPtr.ptr = output;
deviceOuput.dstPtr.pitch = devicePointer.pitch;
deviceOuput.dstPtr.xsize = width;
deviceOuput.dstPtr.ysize = height;
deviceOuput.kind = cudaMemcpyDeviceToHost;
/* copy 3d array back to 'output' */
cudaMemcpy3D(&deviceOuput);
return;
} /* End Mexfunction */

The basic problem appears to be that you are instructing the cudaMemcpy3D to copy zero bytes, because you have not included a non-zero extent which defines the size of the transfer to the API.
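Your transfer could probably be as simple as:
cudaMemcpy3DParms deviceOuput = { 0 };
deviceOuput.srcPtr = devicePointer;
/* The host buffer is densely packed, so its pitch is one row of width doubles */
deviceOuput.dstPtr = make_cudaPitchedPtr(output, width * sizeof(double), width, height);
deviceOuput.extent = extent;
deviceOuput.kind = cudaMemcpyDeviceToHost;
cudaMemcpy3D(&deviceOuput);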
I can't comment on whether the MEX interface you are using is correct, but the kernel looks superficially correct and I don't see anything else obviously wrong, without going to a compiler and trying to run your code with Matlab, which I cannot.

C array sorting using qsort?

I have been stuck on this for a while and nothing seems to work.
I have a data structure:
DATA
{
int size;
int id;
}
And I have an array of DATA structures:
myArray = (DATA *) malloc(10 * sizeof(DATA));
Then I assign some test values:
myArray[0].size = 5;
myArray[1].size = 9;
myArray[2].size = 1;
myArray[3].size = 3;
So my starting array should look like:
5,9,1,3,0,0,0,0,0,0
Then, I call qsort(myArray,10,sizeof(DATA),comp)
Where comp is:
int comp(const DATA * a, const DATA * b)
{
return a.size - b.size;
}
And trust me, I tried many things with the compare function, NOTHING seems to work. I just never get any sorting that makes any sense.

"So my starting array should look like 5, 9, 1, 3, 0, 0, 0, 0, 0, 0."
No, it really won't; at least, it's not guaranteed to.
If you want zeros in there, either use calloc() to zero everything out, or put them in yourself. What malloc() gives you is a block of the required size with indeterminate content. In other words, it may well contain whatever rubbish was in memory beforehand.
On top of that, a and b are pointers in your comp function, so you should be using -> rather than . to access their members. It's also good form to use the correct prototype and cast inside the function.
And a final note: please don't cast the return from malloc in C - you can get into problems if you accidentally forget to include the relevant header file and your integers aren't compatible with your pointers.
The malloc function returns a void * which will quite happily convert implicitly into any other pointer.
Here's a complete program with those fixes:
#include <stdio.h>
#include <stdlib.h>
typedef struct {int size; int id;} DATA;
int comp (const void *a, const void *b) {
return ((const DATA *)a)->size - ((const DATA *)b)->size;
}
int main (void) {
int i;
DATA *myArray = malloc(10 * sizeof(DATA));
myArray[0].size = 5;
myArray[1].size = 9;
myArray[2].size = 1;
myArray[3].size = 3;
for (i = 4; i < 10; i++)
myArray[i].size = 0;
qsort (myArray, 10, sizeof(DATA), comp);
for (i = 0; i < 10; i++)
printf ("%d ", myArray[i].size);
putchar ('\n');
free (myArray);
return 0;
}
The output:
0 0 0 0 0 0 1 3 5 9
