How to understand pointer usage in R's .C interface - c

I am a beginner at writing R extensions, and I have trouble understanding how the .C interface is used. Take this code as an example:
void cconv(int *l, double *x, int *n, double *s)
{
double *y = x + (*n - *l), *z = x + *l, *u = x;
while ( u < y)
*s += *u++ * *z++;
}
In this code, I think the arguments have to be declared as pointers (*l, *x and so on) because l and x are defined in the R environment and are passed to the C function through .C; please correct me if I am wrong. However, inside the function cconv, why is y declared as double *y rather than double y, and why are pointers and int values mixed in the expression that initializes y? Finally, I have seen in other code that when R calls .C(...) to get back a vector result s, the result is declared as a pointer argument (double *s) of the C function and filled in inside the function, like this:
void xfunc(int *l, double *x, double *s){
int i;
/* copy each element of x into the result vector s */
for (i = 0; i < *l; i++){
s[i] = x[i];
}
}
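For what it's worth, .C hands every argument to the C function as a pointer to the data of an R vector, so even scalars such as l and n arrive as int * and are read with *l and *n. Inside cconv, y, z and u are pointers because they are positions within the array x, not values; adding an int to a pointer just moves it that many elements forward. The sketch below puts the original function next to an index-based rewrite that should compute the same sum. cconv_indexed is a hypothetical name introduced here for illustration only.

```c
/* Original pointer-walking version from the question. */
void cconv(int *l, double *x, int *n, double *s)
{
    double *y = x + (*n - *l); /* one past the last left-hand element */
    double *z = x + *l;        /* x shifted forward by the lag *l */
    double *u = x;             /* start of x */
    while (u < y)
        *s += *u++ * *z++;     /* multiply, then advance both pointers */
}

/* Hypothetical equivalent using array indexing instead of pointers. */
void cconv_indexed(int *l, double *x, int *n, double *s)
{
    int count = *n - *l;       /* number of products to accumulate */
    for (int i = 0; i < count; i++)
        s[0] += x[i] * x[i + *l];
}
```

From R, a call would look something like .C("cconv", as.integer(1), as.double(x), as.integer(length(x)), s = as.double(0))$s, assuming the compiled code has been loaded; the result comes back in the s element because the C function wrote through the s pointer.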

Related

Function Pointer with void* return and void* parameters

I wrote a function that takes a function pointer whose parameters and return type are all void*, so that it can be used for any numeric type: int, float, or double.
But it works only for the int addition function.
For the float and double addition functions, it throws a compile-time error.
Why is that?
If you uncomment the last two printf lines, you get the error.
#include<stdio.h>
int int_add(int x, int y) {
return x + y;
}
float float_add(float x, float y) {
return x + y;
}
double double_add(double x, double y) {
return x + y;
}
void* do_operation(void* (*op)(void*, void*), void* x, void* y) {
return op(x, y);
}
void main(void) {
printf("Sum= %d\n",(int*) do_operation(int_add, 1, 2));
/*printf("Sum= %f\n",(float*) do_operation(float_add, 1.20, 2.50));*/
/*printf("Sum= %lf\n",(double*) do_operation(double_add, 1.20, 2.50));*/
}
void * is a pointer type. You're not passing pointers, you're passing values, so that's not going to compile. It accidentally "works" for int because pointers themselves are represented as integers by most C compilers.
If you pass pointers to int, float, and double instead of the int, float, and double themselves, you will avoid that compiler error. You'd also need to change int_add and friends to take pointers, and you'd have to make sure you dereferenced the pointers before using them. You'll also have to return pointers, which means you'll have to malloc some memory on the heap, because the stack memory assigned to your local variables will be invalid once your function exits. You'll then have to free it all later... in the end, this is going to result in something considerably more complicated than the problem it appears you are trying to solve.
I have to ask why you are trying to do this? C is really not the best language for this type of pattern. I'd suggest just calling the int_add, float_add, etc. functions directly instead of trying to abstract them in this way.
So, as per @charles-srstka's suggestion, I rewrote the code, and then it worked as I wanted:
#include<stdio.h>
#include<stdlib.h>
void* int_add(void *x, void *y) { /* signature matches the op parameter of do_operation */
int *c = (int *)malloc(sizeof(int));
*c = *(int*)x + *(int*)y;
return c;
}
void* float_add(void *x, void *y) { /* signature matches the op parameter of do_operation */
float *c = (float*)malloc(sizeof(float));
*c = *(float*)x + *(float*)y;
return c;
}
void* do_operation(void* (*op)(void*, void*), void* x, void* y) {
return op(x, y);
}
int main(void) {
int a = 1;
int b = 2;
int *c;
c = do_operation(int_add, &a, &b);
printf("%d\n",*c);
free(c);
float x = 1.1;
float y = 2.2;
float *z;
z = do_operation(float_add, &x, &y);
printf("%f\n",*z);
free(z);
}
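A variant that avoids malloc and free altogether is to let the caller supply the storage for the result. This is only a sketch under that assumption, not the approach from the answer above; the binary_op typedef, the out parameter and double_add are names introduced here for illustration.

```c
/* Each operation reads its operands through const void * and
   writes the result through out, which the caller owns. */
typedef void (*binary_op)(const void *x, const void *y, void *out);

void int_add(const void *x, const void *y, void *out)
{
    *(int *)out = *(const int *)x + *(const int *)y;
}

void double_add(const void *x, const void *y, void *out)
{
    *(double *)out = *(const double *)x + *(const double *)y;
}

/* Generic dispatcher: it never needs to know the operand type. */
void do_operation(binary_op op, const void *x, const void *y, void *out)
{
    op(x, y, out);
}
```

A call then reads do_operation(int_add, &a, &b, &c); with a, b and c all being int variables in the caller. The caller is still responsible for pairing each function with operands of the matching type, since nothing in the void * signature checks that.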

Avoid using global variables when using recursive functions in C

The code below uses a recursive function called interp, but I cannot find a way to avoid using global variables for iter and fxInterpolated. The full code listing (that performs N-dimensional linear interpolation) compiles straightforwardly with:
gcc NDimensionalInterpolation.c -o NDimensionalInterpolation -Wall -lm
The output for the example given is 2.05. The code works fine but I want to find alternatives for the global variables. Any help with this would be greatly appreciated. Thanks.
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
int linearInterpolation(double *, double **, double *, int);
double ** allocateDoubleMatrix(int, int);
double * allocateDoubleVector(int);
void interp(int, int, double *, double *, double *);
double mult(int, double, double *, double *);
/* The objectionable global
variables that I want to get rid of! */
int iter=0;
double fxInterpolated=0;
int main(int argc, char *argv[]){
double *fx, **a, *x;
int dims=2;
x=allocateDoubleVector(dims);
a=allocateDoubleMatrix(dims,2);
fx=allocateDoubleVector(dims*2);
x[0]=0.25;
x[1]=0.4;
a[0][0]=0;
a[0][1]=1;
a[1][0]=0;
a[1][1]=1;
fx[0]=1;
fx[1]=3;
fx[2]=2;
fx[3]=4;
linearInterpolation(fx, a, x, dims);
printf("%f\n",fxInterpolated);
return (EXIT_SUCCESS);
}
int linearInterpolation(double *fx, double **a, double *x, int dims){
double *b, *pos;
int i;
b=allocateDoubleVector(dims);
pos=allocateDoubleVector(dims);
for (i=0; i<dims;i++)
b[i] = (x[i] - a[i][0]) / (a[i][1] - a[i][0]);
interp(0,dims,pos,fx,b);
return (EXIT_SUCCESS);
}
void interp(int j, int dims, double *pos, double *fx, double *b) {
int i;
if (j == dims){
fxInterpolated+=mult(dims,fx[iter],pos,b);
iter++;
return;
}
for (i = 0; i < 2; i++){
pos[j]=(double)i;
interp(j+1,dims,pos,fx,b);
}
}
double mult(int dims, double fx, double *pos, double *b){
int i;
double val=1.0;
for (i = 0; i < dims; i++){
val *= fabs(1.0-pos[i]-b[i]);
}
val *= fx;
printf("mult val= %f fx=%f\n",val, fx);
return val;
}
double ** allocateDoubleMatrix(int i, int j){
int k;
double ** matrix;
matrix = (double **) calloc(i, sizeof(double *));
for (k=0; k< i; k++)matrix[k] = allocateDoubleVector(j);
return matrix;
}
double * allocateDoubleVector(int i){
double *vector;
vector = (double *) calloc(i,sizeof(double));
return vector;
}
Thanks for the comments so far. I want to avoid the use of static. I have removed the global variable and, as suggested, tried passing the iter variable as a pointer argument. But no joy. In addition, I am getting a compile warning, "value computed is not used", with reference to *iter++;. What am I doing wrong?
void interp(int j, int dims, double *pos, double *fx, double *b, int *iter) {
int i;
if (j == dims){
fxInterpolated+=mult(dims,fx[*iter],pos,b);
*iter++;
return;
}
for (i = 0; i < 2; i++){
pos[j]=(double)i;
interp(j+1,dims,pos,fx,b,iter);
}
}
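On the *iter++; warning: postfix ++ binds more tightly than unary *, so *iter++ is parsed as *(iter++). That advances the local copy of the pointer and reads (then discards) the old target, which is what the compiler is complaining about; the caller's counter never changes. Writing (*iter)++ increments the int itself. A minimal sketch of the difference, with both function names being hypothetical:

```c
/* Increments the int that counter points to. */
void bump(int *counter)
{
    (*counter)++;
}

/* What *counter++ really does: it returns the value at the old
   address and advances only the local pointer copy, so the
   caller's int is left unchanged. */
int read_then_advance(int *counter)
{
    return *counter++;
}
```

In the follow-up version of interp, replacing *iter++; with (*iter)++; should therefore both silence the warning and make the counter advance.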
There are two approaches I would consider when looking at this problem:
Keep the state in a parameter
You could use one or more variables that you pass to the function (as a pointer, if necessary) to keep the state across function calls.
For instance,
int global = 0;
int recursive(int argument) {
// ... recursive stuff
return recursive(new_argument);
}
could become
int recursive(int argument, int *global) {
// ... recursive stuff
return recursive(new_argument, global);
}
or sometimes even
int recursive(int argument, int global) {
// ... recursive stuff
return recursive(new_argument, global);
}
Use static variables
You can also declare a variable in a function to be preserved across function calls by using the static keyword:
int recursive(int argument) {
static int global = 0;
// ... recursive stuff
return recursive(argument);
}
Note that because of the static keyword, global = 0 is only set when the program starts, not every time the function is called, as it would be without the keyword. This means that if you alter the value of global, it would keep this value the next time the function is called.
This method can be used if you only use your recursive function once during your program; if you need to use it multiple times, I recommend that you use the alternative method above.
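Applied to the interp function from the question, the first approach (state kept in a parameter) could look roughly like the sketch below. The InterpState struct, corner_weight and the interpolate wrapper are hypothetical names introduced here; the weight is computed without fabs only so that the fragment does not need math.h.

```c
/* Hypothetical bundle replacing the two globals. */
typedef struct {
    int iter;    /* index of the next corner value in fx */
    double sum;  /* running interpolated value */
} InterpState;

/* Product of |1 - pos[i] - b[i]| over all dimensions. */
double corner_weight(int dims, const double *pos, const double *b)
{
    double w = 1.0;
    for (int i = 0; i < dims; i++) {
        double t = 1.0 - pos[i] - b[i];
        w *= (t < 0.0) ? -t : t;
    }
    return w;
}

/* Visits every corner of the unit hypercube, accumulating into st. */
void interp(int j, int dims, double *pos, const double *fx,
            const double *b, InterpState *st)
{
    if (j == dims) {
        st->sum += fx[st->iter] * corner_weight(dims, pos, b);
        st->iter++;
        return;
    }
    for (int i = 0; i < 2; i++) {
        pos[j] = (double)i;
        interp(j + 1, dims, pos, fx, b, st);
    }
}

/* Wrapper: fresh state on every call, so repeated calls are safe.
   pos is scratch space with room for dims doubles. */
double interpolate(int dims, const double *fx, const double *b, double *pos)
{
    InterpState st = {0, 0.0};
    interp(0, dims, pos, fx, b, &st);
    return st.sum;
}
```

Because the state lives in a local variable of the wrapper, nothing has to be reset between calls, which is the main drawback of the static approach.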
A solution is to use statics and then to reset the variables on the first call, via a flag that I call initialise. That way you can choose to have the variables reset or not.
double interp(int j, int dims, double *pos, double *fx, double *b, int initialise) {
static double fxInterpolated = 0.0;
static int iter = 0;
int i;
if (initialise){
fxInterpolated = 0.0;
iter = 0;
}
.....
......
}

Mex File dot product

I'm trying to implement some elementary linear algebra routines in MEX files in C for practice, and I'm stuck with dot products. Here's what I have so far:
#define char16_t UINT16_T //shenanigans with the compiler
#include "mex.h"
void dotProd(double *a, double *b, double z, mwSize n)
{
mwSize i;
for(i=0;i<n;i++){
z+=a[i] * b[i];
}
}
/* The gateway function */
void mexFunction( int nlhs, mxArray *plhs[],
int nrhs, const mxArray *prhs[])
{
double z=0; //Output scalar
double *b, *a; //Input vectors
int n;
a = mxGetPr(prhs[0]); //pointer to a
b = mxGetPr(prhs[1]); //pointer to b
n = mxGetM(prhs[0]);
// Create output
plhs[0] = mxCreateDoubleScalar(z);
dotProd(a,b,z,(mwSize)n);
}
The problem is that when I test this code:
a=rand(2,1);
b=rand(2,1);
z=dotProd(a,b);
I get:
z=0
even though a and b are not orthogonal. I verified this with the MATLAB dot() function. I've picked over the code and can't quite seem to find where I'm going awry. Some suggestions would be appreciated.
Thank you.
That's because you're not returning the result of the dot product. C passes arguments by value, so dotProd receives its own local copy of z. The modifications you make to that copy are not visible to the caller once dotProd returns. You need to update the function that computes the dot product so that it returns something. In addition, you are setting the output of the function before computing the dot product.
As such, do this:
// Change - Remove z as input
double dotProd(double *a, double *b, mwSize n)
{
mwSize i;
double z = 0.0; // Initialize z to 0.0
for(i=0;i<n;i++){
z+=a[i] * b[i];
}
return z; // Return z
}
Then simply do:
void mexFunction( int nlhs, mxArray *plhs[],
int nrhs, const mxArray *prhs[])
{
double z; //Output scalar - Change - don't need to initialize
double *b, *a; //Input vectors
int n;
a = mxGetPr(prhs[0]); //pointer to a
b = mxGetPr(prhs[1]); //pointer to b
n = mxGetM(prhs[0]);
// Create output
z = dotProd(a,b, (mwSize)n); // Change - returning output
plhs[0] = mxCreateDoubleScalar(z);
}
If you insist on changing z in the function and not letting the function return anything, you'll need to pass a pointer to z and change what z refers to. In other words, you would do this:
// Change - Make z point to a double
void dotProd(double *a, double *b, double *z, mwSize n)
{
mwSize i;
for(i=0;i<n;i++){
*z+=a[i] * b[i]; // Change - Refer to pointer
}
}
Now, do:
void mexFunction( int nlhs, mxArray *plhs[],
int nrhs, const mxArray *prhs[])
{
double z = 0.0;
double *b, *a; //Input vectors
int n;
a = mxGetPr(prhs[0]); //pointer to a
b = mxGetPr(prhs[1]); //pointer to b
n = mxGetM(prhs[0]);
// Create output
dotProd(a,b, &z, (mwSize)n); // Change - Pass pointer of z to function
plhs[0] = mxCreateDoubleScalar(z);
}
BTW, you still need to call dotProd before you set the output. That's why you kept getting 0: z was still 0 when you created the output, and you only called dotProd afterwards.

How to pass an array as a function argument in C?

How do I pass an array as an argument in C?
int a,b,c[10];
void Name1(int x, int y, int *z)
{
a = x;
b = y;
c = z;
}
I tried to pass it as an argument, but it does not build. How can I fix it?
Also, is the declaration void Name1(int x, int y, int *z); the same as void Name1(int x, int y, int z[]);? Will the compiler treat void Name1(int x, int y, int z[]) as void Name1(int x, int y, int *z);?
When you pass an array as an argument to a function, it decays to a pointer to its first element.
So,
void Name1(int x, int y, int *z)
will work.
But arrays are not assignable so:
c = z;
does not work; you will need to explicitly copy each array element from the source to the destination.
With c = z; you're assigning to an array, which is not a modifiable lvalue, so it is not allowed. It's like doing &c[0] = &z[0];.
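An explicit copy could be sketched as follows. The extra z_len parameter, the C_CAPACITY macro and the int return code are assumptions added for this example so that the function can refuse input that would not fit in c; they are not part of the original declaration.

```c
#include <stddef.h>
#include <string.h>

#define C_CAPACITY 10

int a, b, c[C_CAPACITY];

/* Copies z_len ints from z into the global array c.
   Returns 0 on success, -1 if z_len exceeds the capacity of c. */
int Name1(int x, int y, const int *z, size_t z_len)
{
    if (z_len > C_CAPACITY)
        return -1;
    a = x;
    b = y;
    memcpy(c, z, z_len * sizeof *z); /* element-wise copy of the array */
    return 0;
}
```

A caller with int src[3] = {7, 8, 9}; would write Name1(1, 2, src, 3);. Either int *z or int z[] can be used in the parameter list; in a function declaration the two forms mean the same thing.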
There are a couple of answers here, but I don't think any of them is completely sufficient, so...
Basically, you can either copy the array or use the pointer, but either way you will need to keep track of the length.
int a,b, *c, len;
void Name1(int x, int y, int *z, int z_len)
{
a = x;
b = y;
c = z;
len = z_len;
}
//usage:
int arr[5];
Name1(1, 2, arr /* or &arr[0] */, sizeof(arr) / sizeof(arr[0]));
If you never need to add items, this will be sufficient. If you do, it gets more complicated.
It is important to keep the length around so that you know how many elements you have, even if you are going to copy them.
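c = z; is invalid as long as c is declared as an array, because an array cannot be assigned to.
If you declare c as a pointer instead, the assignment is legal and c simply points at the caller's array (no copy is made, and no allocation is needed):
int a,b,*c;
void Name1(int x, int y, int *z)
{
a = x;
b = y;
c = z;
}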

In C, initializing an array using a variable led to stack overflow error or caused R to crash when code is called in R

Okay. My original question turned out to be caused by not initializing some arrays. The original issue had to do with code crashing R. When I was trying to debug it by commenting things out, I by mistake commented out the lines that initialized the arrays. So I thought my problem had to do with passing pointers.
The actual problem is this. As I said before, I want to use outer_pos to calculate outer differences and pass both the pointers of the results and the total number of positive differences back to a function that calls outer_pos
#include <R.h>
#include <Rmath.h>
#include <stdio.h>
#include <math.h>
#include <stdlib.h>
void outer_pos(double *x, double *y, int *n, double *output){
int i, j, l=0;
for(i=0; i<*n; i++){
for(j=0; j<*n; j++){
if((x[j]-x[i])>0){
output[l+1]=(y[j]-y[i])/(x[j]-x[i]);
output[0]=(double)(++l);
}
}
}
Rprintf("%d\n", (int)output[0]);
}
void foo1(double *x, double *y, int *nsamp){
int i, j, k, oper=2, l;
double* v1v2=malloc(sizeof(double)*((*nsamp)*(*nsamp-1)/2 + 1));
outer_pos(x, y, nsamp, &v1v2[0]);
double v1v2b[1999000]; // <--------------HERE
for(i=1; i<= (int)v1v2[0]; i++){
v1v2b[i-1]=1;
}
}
Suppose foo1 is the function that calls outer_pos. Here I specified the size of the array v1v2b using an actual number 1999000. This value corresponds to the number of positive differences. Calling foo1 from R causes no problem. It's all fine.
In the scenario above, I know the number of positive differences, so I can use the actual value to set the array size. But I would like to accommodate situations where I don't necessarily know the value; foo2 below is intended to do that. As you can see, v1v2b is declared with a size taken from the first element of the array v1v2. Recall that the first slot of the output of outer_pos stores the number of positive differences, so basically I use this value to set v1v2b's size. However, calling this function from R either produces a stack overflow error or crashes R.
void foo2(double *x, double *y, int *nsamp){
int i, j, k, oper=2, l;
double* v1v2=malloc(sizeof(double)*((*nsamp)*(*nsamp-1)/2 + 1));
outer_pos(x, y, nsamp, &v1v2[0]);
double v1v2b[(int)v1v2[0]]; //<--------HERE
for(i=1; i<= (int)v1v2[0]; i++){
v1v2b[i-1]=1;
}
}
So I thought maybe it has to do with indexing: maybe the actual size of v1v2b is too small, so the loop writes out of bounds. So I created foo2b, in which I commented out the loop and used Rprintf to print the first slot of v1v2 to check that the value stored in it is correct. The value v1v2[0] does seem to be correct, namely 1999000, so I don't know what is happening here.
Sorry for the confusion with my previous question!!
void foo2b(double *x, double *y, int *nsamp){
int i, j, k, oper=2, l;
double* v1v2=malloc(sizeof(double)*((*nsamp)*(*nsamp-1)/2 + 1));
outer_pos(x, y, nsamp, &v1v2[0]);
double v1v2b[(int)v1v2[0]]; //<----Array size declared by a variable
Rprintf("%d", (int)v1v2[0]);
//for(i=1; i<= (int)v1v2[0]; i++){
//v1v2b[i-1]=v1v2[i];
//}
}
R code to run the code above:
x=rnorm(2000)
y=rnorm(2000)
.C("foo1", x=as.double(x), y=as.double(y), nsamp=as.integer(2000))
.C("foo2", x=as.double(x), y=as.double(y), nsamp=as.integer(2000))
.C("foo2b", x=as.double(x), y=as.double(y), nsamp=as.integer(2000))
** FOLLOW UP **
I modified my code based on Martin's suggestion to check if the stack overflow issue can be resolved:
void foo2b(double *x, double *y, int *nsamp) {
int n = *nsamp, i;
double *v1v2, *v1v2b;
v1v2 = (double *) R_alloc(n * (n - 1) / 2 + 1, sizeof(double));
/* outer_pos(x, y, nsamp, v1v2); */
v1v2b = (double *) R_alloc((size_t) v1v2[0], sizeof(int));
for(i=0; i< (int)v1v2[0]; i++){
v1v2b[i]=1;
}
//qsort(v1v2b, (size_t) v1v2[0], sizeof(double), mycompare);
/* ... */
}
After compiling it, I ran the code:
x=rnorm(1000)
y=rnorm(1000)
.C("foo2b", x=as.double(x), y=as.double(y), nsamp=as.integer(length(x)))
And got an error message:
Error: cannot allocate memory block of size 34359738368.0 Gb
** FOLLOW UP 2 **
It seems that the error message shows up on every other run of the function. At least it did not crash R. So basically the function alternates between running with no problem and showing an error message.
(I included both headers in my script file).
As before, you're allocating on the stack, but should be allocating from the heap. Correct this using malloc / free as you did in your previous question (actually, I think the recommended approach is Calloc / Free or if your code returns to R simply R_alloc; R_alloc automatically recovers the memory when returning to R, even in the case of an error that R catches).
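To make the stack-versus-heap point concrete: with the usual 8-byte double, 1999000 elements need roughly 16 MB, which is more than a thread's stack can normally be relied on to provide, whether the array size is a constant or a variable. A minimal sketch of the heap version in plain C follows; make_ones is a hypothetical helper, and in code called from R, R_alloc would take the place of malloc as described above.

```c
#include <stdlib.h>

/* Returns a heap-allocated array of count doubles, each set to 1.0,
   or NULL if the allocation fails. The caller must free() it. */
double *make_ones(size_t count)
{
    double *v = malloc(count * sizeof *v);
    if (v == NULL)
        return NULL;
    for (size_t i = 0; i < count; i++)
        v[i] = 1.0;
    return v;
}
```

Unlike a variable-length array, a failed heap allocation can be detected through the NULL return and handled, instead of overflowing the stack.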
qsort is mentioned in a comment. It takes as its final argument a user-supplied function that defines how its first argument is to be sorted. The signature of qsort (from man qsort) is
void qsort(void *base, size_t nmemb, size_t size,
int(*compar)(const void *, const void *));
with the final argument being 'a pointer to a function that takes two constant void pointers and returns an int'. A function satisfying this signature and sorting pointers to two doubles according to the specification on the man page is
int mycompare(const void *p1, const void *p2)
{
const double d1 = *(const double *) p1,
d2 = *(const double *) p2;
return d1 < d2 ? -1 : (d1 > d2 ? 1 : 0);
}
So
#include <Rdefines.h>
#include <stdlib.h>
int mycompare(const void *p1, const void *p2)
{
const double d1 = *(const double *) p1,
d2 = *(const double *) p2;
return d1 < d2 ? -1 : (d1 > d2 ? 1 : 0);
}
void outer_pos(double *x, double *y, int *n, double *output){
int i, j, l = 0;
for (i = 0; i < *n; i++) {
for (j = 0; j < *n; j++) {
if ((x[j] - x[i]) > 0) {
output[l + 1] = (y[j] - y[i]) / (x[j] - x[i]);
output[0] = (double)(++l);
}
}
}
}
void foo2b(double *x, double *y, int *nsamp) {
int n = *nsamp;
double *v1v2, *v1v2b;
v1v2 = (double *) R_alloc(n * (n - 1) / 2 + 1, sizeof(double));
outer_pos(x, y, nsamp, v1v2);
v1v2b = (double *) R_alloc((size_t) v1v2[0], sizeof(double));
for (int i = 0; i < (int) v1v2[0]; i++) v1v2b[i] = v1v2[i + 1]; /* fill v1v2b before sorting it */
qsort(v1v2b, (size_t) v1v2[0], sizeof(double), mycompare);
/* ... */
}
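The comparator pattern can be checked outside R with a small standalone fragment. This is a sketch only; compare_doubles and sort_doubles are hypothetical names, and the comparator assumes the data contains no NaN values.

```c
#include <stdlib.h>

/* Returns a negative, zero, or positive value when the first
   double is less than, equal to, or greater than the second. */
int compare_doubles(const void *p1, const void *p2)
{
    const double d1 = *(const double *)p1;
    const double d2 = *(const double *)p2;
    return (d1 > d2) - (d1 < d2);
}

/* Sorts n doubles in ascending order in place. */
void sort_doubles(double *v, size_t n)
{
    qsort(v, n, sizeof *v, compare_doubles);
}
```

The comparator must be able to return all three outcomes (negative, zero, positive); one that never reports "greater than" gives qsort an inconsistent ordering and the result is not guaranteed to be sorted.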
When foo2b calls outer_pos, it is passing two allocated but uninitialized arrays as x and y. You can't depend on their contents, thus you have different results from different invocations.
Edit
You're dangerously close to your stack size with 1999000 doubles, which take just over 15.25MB, and that's because you're on Mac OS. On most other platforms, threads don't get anywhere near 16M of stack.
You don't start out with a clean (empty) stack when you call this function -- you're deep into R functions, each creating frames that take space on the stack.
Edit 2
Below, you are using an uninitialized value, v1v2[0], as an argument to R_alloc. That you get an error sometimes (and not always) is not a surprise.
v1v2 = (double *) R_alloc(n * (n - 1) / 2 + 1, sizeof(double));
/* outer_pos(x, y, nsamp, v1v2); */
v1v2b = (double *) R_alloc((size_t) v1v2[0], sizeof(int));
