R extension using C in mcmc - c

I am writing a C version of my R mcmc code. Part of the code that is throwing up errors is as follows:
#include <stdio.h>
#include <math.h>
#include <stdlib.h>
#include <stddef.h>
#include <limits.h>
#include <gsl/gsl_machine.h>
#include <gsl/gsl_rng.h>
#include <gsl/gsl_randist.h>
#include <gsl/gsl_cdf.h>
#include <R.h>
#include <Rmath.h>
#include <Rembedded.h>
#include <Rdefines.h>
#include <R_ext/BLAS.h>
#include <R_ext/Lapack.h>
#include <R_ext/Linpack.h>
//#include <vecLib/cblas.h> /* C BLAS in APPLE */
#include "blaio.h" /*Linear Algebra I/O using http://www.mitchr.me/SS/exampleCode/blas.html*/
double rTruncNorm (double, double*, double, double);
// C version of group lasso
void BinaryBayesianGroupLasso (double *X, int *dim, int *y, int *iterations, double *bprev, double *Lprev) {
// X is design matrix with intercept
//dim is dimensions of X
int iter = 1, i;
int n = dim[0];
int p = dim[1];
double sigma = 1, z[n], mu[n], u, a, b;
// random number generation set up
const double lowest_double = -GSL_DBL_MAX;
const double highest_double = GSL_DBL_MAX;
const gsl_rng *gBaseRand; /* global rand number generator */
unsigned long randSeed;
/* specifying to use Mersenne twister as the PRNG */
gBaseRand = gsl_rng_alloc(gsl_rng_mt19937);
srand(time(NULL)); /* initialization for rand() */
randSeed = rand(); /* returns a non-negative integer */
gsl_rng_set(gBaseRand, randSeed); /* seed the PRNG */
while(iter < iterations){
cblas_dgemv(CblasRowMajor, CblasNoTrans, n, p, 1, X, p, bprev, 1, 0, mu, 1);
printVector(n, mu, 8, 3, NULL, NULL, NULL, " mu"); // width=8, precision=3
for (i = 0; i < n; i++) {
u = gsl_ran_flat(gBaseRand, 0, 1);
if(y[i]==1){
a = 0.0 - mu[i];
b = highest_double;
} else {
a = lowest_double;
b = 0.0 - mu[i];
}
z[i] = mu[i] + rTruncNorm(sigma, &u, a, b);
}
printVector(n, z, 8, 3, NULL, NULL, NULL, " mu"); // width=8, precision=3
}
gsl_rng_free(gBaseRand);
}
// gaussian truncated normal between [a,b], params - mu=0, sigma, unifrnd = u
double rTruncNorm(double sigma, double *u, double a, double b){
double unifr = *u;
double Fa = gsl_cdf_gaussian_P(a, sigma);
double Fb = gsl_cdf_gaussian_P(b, sigma);
double v = Fa + (Fb-Fa)*unifr;
double x = gsl_cdf_gaussian_Pinv(v, sigma);
return(x);
}
The functions in blaio.h have already been built using the makefile in the website, I got them from.
I am using the following to compile:
R CMD SHLIB -lgsl -lgslcblas BinaryBayesianGroupLasso.c
It is throwing the following errors:
gcc -arch x86_64 -std=gnu99 -I/Library/Frameworks/R.framework/Resources/include -I/Library/Frameworks/R.framework/Resources/include/x86_64 -DNDEBUG -I/usr/local/include -fPIC -g -O2 -c BinaryBayesianGroupLasso.c -o BinaryBayesianGroupLasso.o
BinaryBayesianGroupLasso.c: In function ‘BinaryBayesianGroupLassoC’:
BinaryBayesianGroupLasso.c:36: error: nested functions are disabled, use -fnested-functions to re-enable
BinaryBayesianGroupLasso.c:36: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘gBaseRand’
BinaryBayesianGroupLasso.c:37: warning: implicit declaration of function ‘time’
BinaryBayesianGroupLasso.c:41: warning: comparison between pointer and integer
BinaryBayesianGroupLasso.c:50: warning: implicit declaration of function ‘cblas_dgemv’
BinaryBayesianGroupLasso.c:50: error: ‘CblasRowMajor’ undeclared (first use in this function)
BinaryBayesianGroupLasso.c:50: error: (Each undeclared identifier is reported only once
BinaryBayesianGroupLasso.c:50: error: for each function it appears in.)
BinaryBayesianGroupLasso.c:50: error: ‘CblasNoTrans’ undeclared (first use in this function)
BinaryBayesianGroupLasso.c:50: error: ‘mu’ undeclared (first use in this function)
BinaryBayesianGroupLasso.c:51: warning: implicit declaration of function ‘printVector’
BinaryBayesianGroupLasso.c:70: error: ‘i’ undeclared (first use in this function)
BinaryBayesianGroupLasso.c:71: error: ‘u’ undeclared (first use in this function)
BinaryBayesianGroupLasso.c:72: error: ‘y’ undeclared (first use in this function)
BinaryBayesianGroupLasso.c:73: error: ‘a’ undeclared (first use in this function)
BinaryBayesianGroupLasso.c:74: error: ‘b’ undeclared (first use in this function)
BinaryBayesianGroupLasso.c:79: warning: implicit declaration of function ‘rTruncNorm’
BinaryBayesianGroupLasso.c:91: warning: passing argument 1 of ‘gsl_rng_free’ discards qualifiers from pointer target type
BinaryBayesianGroupLasso.c:92: warning: ‘return’ with a value, in function returning void
BinaryBayesianGroupLasso.c: At top level:
BinaryBayesianGroupLasso.c:98: error: conflicting types for ‘rTruncNorm’
BinaryBayesianGroupLasso.c:79: error: previous implicit declaration of ‘rTruncNorm’ was here
make: *** [BinaryBayesianGroupLasso.o] Error 1
I have no nested function in C. Why is it throwing up such an error? If it is a linking problem, should it not throw up function not found kind of an error?
Edit: I was asked to post the solution. In my initial code, I was not compiling the above code (but an older version in another directory). (Yup, that is the stupidest error possible). I am reposting a working version of the code. I do not think it would be of much value, other than seeing an example of how mcmc sampling steps are coded up, and how its called from R. This code is however incomplete in terms of Grouped Bayesian Lasso. It also has codes (with sources attributed for printing matrices used in blas). I have started coding in C in just about a week, so take everything (except the print codes) with a pinch of salt. If you have much a prettier version in mind (like putting separate files for printing, and the compiling command), go ahead and make changes.
I have learned 3 things in this exercise:
calling C from R
using blas
using gsl
The code will evolve to address a 4th point later:
- calling R from C
I have to do a sampling from generalized inverse gaussian in this model (excluded in this code). However I do not see an easy piece of function that does that in C. I might have to use R's function then. So it becomes weird. R calls this C function, and within this C function a few other R functions will be called. Your comments will be appreciated, if there are any alternate approaches, and if this is the way, how to do it - RInside? anything easier?
Time is of prime importance to me. I have to run this model with about a million data points, with at least 10,000 mcmc iterations. I want to finish this in less than 5 minutes, preferably 2 minutes.
A third stage of this exercise is to parallelize the sampling step in z.
#include <stdio.h>
#include <math.h>
#include <stdlib.h>
#include <stddef.h>
#include <limits.h>
#include <time.h>
#include <ctype.h> /* Char classes ISOC */
#include <string.h> /* Strings ISOC */
#include <gsl/gsl_machine.h>
#include <gsl/gsl_rng.h>
#include <gsl/gsl_randist.h>
#include <gsl/gsl_cdf.h>
#include <gsl/gsl_cblas.h>
#include <R.h>
#include <Rmath.h>
#include <Rembedded.h>
#include <Rdefines.h>
//#include <R_ext/BLAS.h>
#include <R_ext/Lapack.h>
#include <R_ext/Linpack.h>
//#include <vecLib/cblas.h> /* C BLAS APPLE */
//#include "blaio.h" /* Basic Linear Algebra I/O */
// source for the printing codes: //source: http://www.mitchr.me/SS/exampleCode/blas.html
#define REALT double
#ifndef REALT
#define REALT double
#endif
void sgeprt(int m, int n, REALT, char *c);
void dgeprt(int m, int n, REALT, char *c);
/* ****************************** - external functions from blaio.h. */
void printVector(int n, REALT *v, /* Size and array */
int wide, int prec, /* Width and precesion for floats */
char *pad, /* Right pad string */
char *ldel, char *rdel, /* Left and right delimiter */
char *tag /* Tag for first line */
);
void printMatrix(const enum CBLAS_ORDER order,
int n, int m, REALT *a, /* Size and array */
int wide, int prec, /* Width and precesion for floats */
char *pad, /* Right pad string */
char *ldel, char *rdel, /* Left and right delimiter */
char *lidel, char *ridel, /* Left and right INNER delimiter */
char *tag /* Tag for first line */
);
int readMatrix(int *n, int *m, /* Will contain size of the array after the read */
REALT *a, /* Will point to the data */
int maxEle, /* Maximum number of elements to read */
char *fileName /* The file name to read the data from */
);
void printMatrixThr(const enum CBLAS_ORDER order,
int n, int m, REALT *a, /* Size and array */
char *inStr, char *outStr, /* "in" string, and "out" string */
REALT minIn, REALT maxIn, /* Min/Max values for "in" range. */
char *pad, /* Right pad string */
char *ldel, char *rdel, /* Left and right delimiter */
char *lidel, char *ridel, /* Left and right INNER delimiter */
char *tag /* Tag for first line */
);
double rTruncNorm (double, double*, double, double);
// C version of group lasso
void BinaryBayesianGroupLassoC (double *X, int *dim, int *y, int *iterations, double *bprev, double *Lprev) {
// X is design matrix with intercept
//dim is dimensions of X
int iter = 1, i;
int n = dim[0];
int p = dim[1];
double sigma = 1, z[n], mu[n], u, a, b, *Li;
// random number generation set up
const double lowest_double = -GSL_DBL_MAX;
const double highest_double = GSL_DBL_MAX;
const gsl_rng *gBaseRand; /* global rand number generator */
unsigned long randSeed;
/* specifying to use Mersenne twister as the uniform PRNG */
gBaseRand = gsl_rng_alloc(gsl_rng_mt19937);
srand(time(NULL)); /* initialization for rand() */
randSeed = rand(); /* returns a non-negative integer */
gsl_rng_set(gBaseRand, randSeed); /* seed the PRNG */
while(iter < *iterations){
cblas_dgemv(CblasRowMajor, CblasNoTrans, n, p, 1, X, p, bprev, 1, 0, mu, 1);
printVector(n, mu, 8, 3, NULL, NULL, NULL, " mu"); // width=8, precision=3
for (i = 0; i < n; i++) {
u = gsl_ran_flat(gBaseRand, 0, 1);
if(y[i]==1){
a = 0.0 - mu[i];
b = highest_double;
} else {
a = lowest_double;
b = 0.0 - mu[i];
}
z[i] = mu[i] + rTruncNorm(sigma, &u, a, b);
}
printVector(n, z, 8, 3, NULL, NULL, NULL, " z"); // width=8, precision=3
iter = iter + 1;
}
gsl_rng_free(gBaseRand);
}
// gaussian truncated normal between [a,b], params - mu=0, sigma, unifrnd = u
double rTruncNorm(double sigma, double *u, double a, double b){
double unifr = *u;
double Fa = gsl_cdf_gaussian_P(a, sigma);
double Fb = gsl_cdf_gaussian_P(b, sigma);
double v = Fa + (Fb-Fa)*unifr;
double x = gsl_cdf_gaussian_Pinv(v, sigma);
return(x);
}
/* function definitions from blaio.h *******/
void printMatrixUbr(const enum CBLAS_ORDER order, /* CBLAS row order */
int n, int m, REALT *a, /* Size and array */
char *inStr, char *outStr, /* "in" string, and "out" string */
REALT minIn, REALT maxIn, /* Min/Max values for "in" range. */
int wide, int prec, /* Width and precesion for floats */
int *rowPerm, int *colPerm,
char prtMode, /* b=bitmap, V=values, *=in/out) */
char *fileName,
int maskMode, // (L-below diag, U=above diag, D-Diagnal, M-Mask, 0=NONE, \0-NONE
char *mask,
char *pad, /* Right pad string */
char *ldel, char *rdel, /* Left and right delimiter */
char *lidel, char *ridel, /* Left and right INNER delimiter */
char *tag /* Tag for first line */
);
/* ************************************************************************** */
void printVector(int n, REALT *v, int wide, int prec, char *pad, char *ldel, char *rdel, char *tag) {
printMatrixUbr(CblasRowMajor, 1, n, v, NULL, NULL, 0.0, 0.0, wide, prec, NULL, NULL, 'V', NULL, '0', NULL, pad, ldel, rdel, "", "", tag);
} /* end func printVector */
/* ************************************************************************** */
void printMatrix(const enum CBLAS_ORDER order, int n, int m, REALT *a, int wide, int prec, char *pad, char *ldel, char *rdel, char *lidel, char *ridel, char *tag) {
printMatrixUbr(order, n, m, a, NULL, NULL, 0.0, 0.0, wide, prec, NULL, NULL, 'V', NULL, '0', NULL, pad, ldel, rdel, lidel, ridel, tag);
} /* end func printMatrix */
/* ************************************************************************** */
void printMatrixThr(const enum CBLAS_ORDER order, int n, int m, REALT *a, char *inStr, char *outStr, REALT minIn, REALT maxIn, char *pad, char *ldel, char *rdel, char *lidel, char *ridel, char *tag) {
printMatrixUbr(order, n, m, a, inStr, outStr, minIn, maxIn, 0, 0, NULL, NULL, '*', NULL, '0', NULL, pad, ldel, rdel, lidel, ridel, tag);
} /* end func printMatrixThr*/
/* ************************************************************************** */
void printMatrixUbr(const enum CBLAS_ORDER order, /* CBLAS row order */
int n, int m, REALT *a, /* Size and array */
char *inStr, char *outStr, /* "in" string, and "out" string */
REALT minIn, REALT maxIn, /* Min/Max values for "in" range. */
int wide, int prec, /* Width and precesion for floats */
int *rowPerm, int *colPerm, /* Permute rows i->xx[i] */
char prtMode, /* b=bitmap, V=values, *=in/out */
char *fileName, /* if NULL, stdout. */
int maskMode, /* L, U, D, M-Mask, 0=NONE */
char *mask, /* Mask (same size as a) ctrl print */
char *pad, /* Right pad string */
char *ldel, char *rdel, /* Left and right delimiter */
char *lidel, char *ridel, /* Left and right INNER delimiter */
char *tag /* Tag for first line */
) {
int i, j, iP, jP;
int k, ldelLen, tagLen, cIdx, prtPerMask;
REALT pVal;
if(inStr == NULL) inStr = "*";
if(outStr == NULL) outStr = " ";
if(wide < 0) wide = 5;
if(prec < 0) prec = 2;
if(ldel == NULL) ldel = "[";
if(ridel == NULL) ridel = "]";
if(lidel == NULL) lidel = "[";
if(rdel == NULL) rdel = "]";
if(pad == NULL) pad = " ";
if(tag == NULL) tag = "";
ldelLen = strlen(ldel);
tagLen = strlen(tag);
for(j=0; j<n; j++) {
if(j==0)
printf("%s%s%s%s", tag, ldel, lidel, pad);
else {
for(k=0;k<tagLen;k++) printf(" ");
for(k=0;k<ldelLen;k++) printf(" ");
printf("%s%s", lidel, pad);
} /* end if/else */
for(i=0; i<m; i++) {
if(colPerm != NULL)
iP = colPerm[i];
else
iP = i;
if(rowPerm != NULL)
jP = rowPerm[j];
else
jP = j;
if(order == CblasColMajor)
cIdx = n*iP+jP;
else
cIdx = m*jP+iP;
pVal = a[cIdx];
// Figure out what the mask has to do with printing..
if(maskMode == '0')
prtPerMask = 1;
else if(maskMode == 'L')
prtPerMask = (iP<jP); // Row order specific! Fix this.
else if(maskMode == 'D')
prtPerMask = (iP==jP);
else if(maskMode == 'U')
prtPerMask = (iP>jP); // Row order specific! Fix this.
else if(maskMode == 'M')
prtPerMask = mask[cIdx];
else
prtPerMask = 1;
if(prtMode == '*') {
if( prtPerMask && (pVal >= minIn) && (pVal <= maxIn) )
printf("%s%s", inStr, pad);
else
printf("%s%s", outStr, pad);
} else {
if(prtPerMask)
printf("%*.*f%s", wide, prec, pVal, pad);
else
printf("%*s%s", wide, outStr, pad);
} /* end if/else */
} /* end for */
if(j==n-1)
printf("%s%s\n", ridel, rdel);
else
printf("%s\n", ridel);
} /* end for */
} /* end func printMatrixUbr*/
/* ************************************************************************** */
int readMatrix(int *n, int *m, REALT *a, int maxEle, char *fileName) {
int i, j;
int inComment;
int numNumbers;
int ch;
int lengthOfNumber;
char numberBuffer[256];
FILE *FP;
FP = fopen(fileName, "r");
if(FP == NULL)
return 1;
i = j = 0;
inComment = 0;
lengthOfNumber = -1;
numNumbers = 0;
while(EOF != (ch = getc(FP))) {
if(ch == '#') inComment = 1; /* Enter comment upon # char. */
if(inComment && (ch < 20)) inComment = 0; /* Break out of comment upon ANY control char. */
if( !(inComment)) {
if(isdigit(ch) || (ch=='.') || (ch=='E') || (ch=='e') || (ch=='-') | (ch=='+')) {
lengthOfNumber++;
numberBuffer[lengthOfNumber] = ch;
} else {
if(lengthOfNumber>=0) {
numberBuffer[lengthOfNumber+1] = '\0';
lengthOfNumber = -1;
numNumbers++;
if(numNumbers==1) {
*n = atoi(numberBuffer);
} else if(numNumbers==2) {
*m = atoi(numberBuffer);
} else if (numNumbers<maxEle) {
a[numNumbers-3] = atof(numberBuffer);
} else {
return(-1);
} /* end if/else */
} /* end if */
} /* end if/else */
} /* end if */
} /* end while */
fclose(FP);
return(numNumbers-2);
} /* end func main */
Simulating random data in R and calling C function:
# simulate data
n = 1000000; p=30; iterations = 1000; burnin = 5;
X = as.data.frame(sapply(1:p, function(x) rnorm(n, 0, 1)))
Y = sample(c(0,1), n, replace=TRUE)
call_BinaryBayesianGroupLasso <- function(X,Y,iterations,burnin){
# X is design matrix without intercept
p = dim(X)[2]+1 # number of predictors including dummy vars and intercept
n = dim(X)[1]
z = rep(0,n)
h = 1000 # hyperparameter for shrinkage (controls number of parameters in model)
b = matrix(0, nrow=p, ncol=iterations) # store beta
L = matrix(0, nrow=p, ncol=iterations) # store lambda in
initmodel = glm(Y ~ ., data = X, family = binomial(link = "probit")) # full model initial estimates
# add intercept to design matrix
X = cbind(rep(1,n), X)
colnames(X) = names(initmodel$coefficients)
b[,1] = summary(initmodel)$coefficients[,1]
L[,1] = summary(initmodel)$coefficients[,2]
eye = diag(rep(1,n)) # used in mcmc step for b
bprev=b[,1]
Lprev=L[,1]
# call C function for group lasso
res = .C("BinaryBayesianGroupLassoC",
as.double(as.matrix(X)),
as.integer(dim(X)),
as.integer(Y),
as.integer(iterations),
as.double(bprev),
as.double(Lprev))
}

Related

How to sort dates from an array in c

I'm trying to sort dates from an array i've got the following code (without including the array and the file that i'm trying to read and the other with the sorted dates that i'm trying to write.
int aniomayor=tot[0].anio;
int diamayor=tot[0].dia;
int mesmayor=tot[0].mes;
while (i<nf) {
if (tot[i].anio > aniomayor) {
int aniomayor=tot[i].anio;
int diamayor=tot[i].dia;
int mesmayor=tot[i].mes;
}
else if (tot[i].anio == aniomayor && tot[i].mes > mesmayor) {
int aniomayor=tot[i].anio;
int diamayor=tot[i].dia;
int mesmayor=tot[i].mes;
}
else if (tot[i].anio == aniomayor && tot[i].mes == mesmayor && tot[i].dia > diamayor) {
int aniomayor=tot[i].anio;
int diamayor=tot[i].dia;
int mesmayor=tot[i].mes;
}
i++;
}
fprintf(f, "%s ", diamayor);
fprintf(f, "%s ", mesmayor);
fprintf(f, "%s \n", aniomayor);
I think it would work but in the 2,3,4.. line it will print always the same date and i don't know how to do for it to ignore the dates that already had been sorted. Thanks in advance.
The original int declaration establishes variables. The subsequent ones create "shadow" variables that have the same name but are not the same variable.
Here's a demonstration:
#include <stdio.h>
int main() {
int x = 1;
if (x == 1) {
int x = 2;
printf("x=%d\n", x);
}
printf("x=%d\n", x);
return 0;
}
This prints:
x=2
x=1
The top-level x never gets modified, so it appears to revert to the original value.
You should remove the int prefix from those, just assign to the existing variable.
When you say int x = y; in C you are declaring a variable and assigning a value. To assign to an existing variable x = y; is sufficient.
The int prefix is only necessary on the first instance of the variable so the compiler knows what type to use for that and all subsequent references inside the same scope.
Now normally the compiler would complain about creating another variable with the same name if it's done in the same scope. In your case because you're doing it inside an if, technically that's a different scope so you can have duplicates.
As has been mentioned in the comments, it is preferable to use qsort, (if one doesn't care about stability.) One needs a function pointer, which is compare_dates in the code below.
#include <stdlib.h> /* EXIT*, rand, qsort */
#include <stdio.h> /* *printf */
#include <time.h> /* clock */
#include <assert.h> /* assert */
struct Date { int anio, mes, dia; };
/** Random [i, j]. https://stackoverflow.com/a/6852396/2472827
This is just used for test purposes. */
static int rand_range(const int i, const int j) {
const unsigned long max = (unsigned long)j - i,
num_bins = max + 1l,
num_rand = (unsigned long)RAND_MAX + 1,
bin_size = num_rand / num_bins,
defect = num_rand % num_bins;
unsigned long x;
assert(i <= j && num_bins <= RAND_MAX);
do { x = 1l * rand(); } while (num_rand - defect <= x);
return i + x / bin_size;
}
/** Initiaises the date with random. */
static void init_date(struct Date *const date) {
assert(date);
date->anio = rand_range(1950, 2050);
date->mes = rand_range(1, 12);
date->dia = rand_range(1, 30); /* Approximately. */
}
/** Prints the date in a static string.
Assumes the date is sanitised, or else this presents a risk of overflow. */
static const char *print_date(const struct Date *const date) {
static char print[128]; /* Should be 11 if -999 <= year < 9999. */
assert(date);
sprintf(print, "%4.4d-%2.2d-%2.2d", date->anio, date->mes, date->dia);
return print;
}
/** The arguments must be const struct Date *.
#return -, =, + */
static int compare_dates(const void *p, const void *q) {
const struct Date *x = (const struct Date *)p, *y = (const struct Date *)q;
assert(p && q);
if(x->anio > y->anio) return 1;
if(x->anio < y->anio) return -1;
if(x->mes > y->mes) return 1;
if(x->mes < y->mes) return -1;
if(x->dia > y->dia) return 1;
if(x->dia < y->dia) return -1;
return 0;
}
int main(void) {
struct Date dates[64];
const size_t dates_size = sizeof dates / sizeof *dates;
size_t i;
/* Generate dates. */
srand((unsigned)clock());
for(i = 0; i < dates_size; i++) init_date(dates + i);
/* Sort it using compare_dates. */
qsort(dates, dates_size, sizeof *dates, &compare_dates);
/* Print. */
for(i = 0; i < dates_size; i++) printf("%s.\n", print_date(dates + i));
return EXIT_SUCCESS;
}
See How to generate a random integer number from within a range.

lapack dgels_ segmentation fault 11

I am trying to use LAPACK's dgels_ in C to solve a linear least squares problem. I have to read the matrix A (assumed to have full rank and m>=n) and a vector b from 2 text files. I can easily compile my code, but when i try to run it I get a "segmentation fault 11", but I can't really see why. It is my first time using LAPACK so I don't know if maybe I am using the dgels_ function wrong?? The way I get it the solution x will be overwritten in the vector b? :
lssolve.c:
#include <stdlib.h>
#include <stdio.h>
#include "linalg.h"
/* C prototype for LAPACK routine DGELS */
void dgels_(const char * trans, const int * m, const int * n, const int *
nrhs, double * A, const int * lda, double * B, const int * ldb, double * work,
int * lwork,int * info);
int main(int argc, char * argv[]) {
vector_t * b_t = NULL;
matrix_t * A_t = NULL;
char trans = 'N';
int m, n, nrhs, mb, lda, ldb, info, lwork;
double optwork;
double * work;
// we read the matrix A and the vector b:
b_t = read_vector("b.txt");
A_t = read_matrix("A.txt");
m = A_t-> m; //number of rows in A
n = A_t-> n; //number of columns in A
nrhs = 1; //number of columns in B (will always be 1, since we read b_t with read_vector)
mb = b_t -> n; //number of rows in B
if (mb != m ) { //end program if A and B doesn't have the same number of rows
free(A_t);
free(b_t);
fprintf(stderr, "Sorry, but the matrix A and the vector b have incompatible dimensions. Good Bye!\n");
exit(EXIT_FAILURE);
}
//We make A and B into the wanted input form for the dgels_-function:
double * B = b_t -> v;
double ** A = A_t ->A;
lda = m;
ldb = mb;
//we calculate the optimal size of the work array:
lwork = -1;
dgels_(&trans, &n, &m, &nrhs, *A, &lda, B, &ldb, &optwork, &lwork, &info);
lwork = (int)optwork;
//we allocate space for the work array:
work = (double*)malloc( lwork*sizeof(double));
//solving the least squares problem:
dgels_(&trans, &n, &m, &nrhs, *A, &lda, B, &ldb, work, &lwork, &info);
//Check whether there was an successful exit:
if (info > 0){
fprintf(stderr, "Sorry, but illegal arguments were used, and therefore a least square solution cannot be computes. Good Bye!\n");
exit(EXIT_FAILURE);
} else if(info < 0){
fprintf(stderr, "Sorry, but A doesn't have full rank, and therefore a least square solution cannot be computed. Good Bye!\n");
exit(EXIT_FAILURE);
}
//Saving the least square problem as a vector_t:
vector_t * x = NULL;
x->n = mb;
x->v = B;
print_vector(x);
//Free memory
free_vector(b_t);
free_matrix(A_t);
free_vector(x);
return(EXIT_SUCCESS);
}
I am using the functions read_matrix, read_vector, print_vector, print_matrix and free_vector, which is why I use the struct vector_t and matrix_t:
typedef struct vector {
unsigned long n; /* length of vector */
double * v; /* pointer to array of length n */
} vector_t;
typedef struct matrix {
unsigned long m; /* number of rows */
unsigned long n; /* number of columns */
double ** A; /* pointer to two-dimensional array */
} matrix_t;
I don't think that anything is wrong with read_vector and read_matrix because I can easily do this and use print_vector or print_matrix before I do all of the other operations.
You dereference a NULL pointer here, causing the segfault:
//Saving the least square problem as a vector_t:
vector_t * x = NULL;
x->n = mb;
x->v = B;
Maybe you should use/create a new vector_t instead of just a pointer to a vector_t?

How to call R's order function from within C (via R_orderVector())?

In a C function which is called inside an R package, I need to sort some
numbers. In order to be consistent with what R does, I would like to call the
sorting algorithm/function which R uses, so R_orderVector(). I get a
segmentation fault exactly at the call of R_orderVector().
Below is a minimal working example (files of a 'minimal working package')
which reproduces the segmentation fault. What am I doing wrong?
### DESCRIPTION ################################################################
Package: foo
Version: 0.0-1
Encoding: UTF-8
Title: Sorting from C via R's R_orderVector()
Description: See title
Authors#R: c(person(given = "Foo", family = "Bar", role = c("aut", "cre"), email = "foo#bar.com"))
Author: Foo Bar [aut, cre]
Maintainer: Foo Bar <foo#bar.com>
Depends: R (>= 3.0.0)
Imports:
Suggests:
Enhances:
License: GPL-2 | GPL-3
NeedsCompilation: yes
Repository: CRAN
Date/Publication: 2014-03-25 15:26:50
### NAMESPACE ##################################################################
useDynLib(foo, .registration = TRUE)
export("myRsort")
### ./R/mySort.R ###############################################################
myRsort <- function(x) {
stopifnot(is.numeric(x), length(x) <= 64)
myRsort_ <- NULL # to avoid "myRsort_: no visible binding for global variable 'myRsort_'"
.Call("myRsort_", x)
}
### ./man/myRsort.Rd ###########################################################
\name{myRsort}
\alias{myRsort}
\title{Using R's Sorting Algorithm from C}
\description{
R's sorting algorithm is called from C.
}
\usage{
myRsort(x)
}
\arguments{
\item{x}{vector}
}
\value{
vector
}
\author{Marius Hofert}
\examples{
set.seed(271)
x <- runif(10)
myRsort(x)
}
\keyword{utilities}
### ./src/init.c ###############################################################
#include <R.h>
#include <Rinternals.h>
#include <R_ext/Rdynload.h>
#include "myRsort.h"
static const R_CallMethodDef callMethods[] = {
{"myRsort_", (DL_FUNC) &myRsort_, 1},
{NULL, NULL, 0}
};
void R_init_foo(DllInfo *dll)
{
R_useDynamicSymbols(dll, FALSE);
R_registerRoutines(dll, NULL, callMethods, NULL, NULL); /* s. WRE (2015, Section 5.4) */
}
### ./src/myRsort.c ############################################################
#include "myRsort.h"
void myRsort_aux(double *x, int n, double *res)
{
int *ind; /* pointer to vector of indices as required for order (permutation of 0:(n-1)) */
ind = (int *) R_alloc(n, sizeof(int));
SEXP xsexp = PROTECT(allocVector(REALSXP, n)); /* turn x into SEXP */
double *xsexp_ = REAL(xsexp); /* pointer */
R_orderVector(ind, n, xsexp, TRUE, TRUE); /* nalast (use same default as order()); decreasing=TRUE */
/* the last line generates a seg-fault */
int i;
for(i=0; i<n; i++) res[i] = x[ind[i]];
}
SEXP myRsort_(SEXP x)
{
double *x_ = REAL(x); /* pointer to n-vector */
int n = LENGTH(x); /* length n */
int maxlen = 64; /* vector can be at most of length 64 here */
SEXP res = PROTECT(allocVector(REALSXP, maxlen)); /* result */
double *res_ = REAL(res); /* pointer to the (sorted) result */
myRsort_aux(x_, n, res_);
UNPROTECT(1);
return res;
}
### ./src/myRsort.h ############################################################
#ifndef myRsort_H
#define myRsort_H
#include <R.h>
#include <Rinternals.h>
#include <Rmath.h>
void myRsort_aux(double *x, int n, double *res);
SEXP myRsort_(SEXP x);
#endif
It seems as if you:
forgot an UNPROTECT.
did not copy the values in myRsort_aux.
did not use the R_orderVector function correctly.
Here is a working example which can be Rcpp::sourceCpped:
#include <Rcpp.h>
void myRsort_aux(double *x, int n, double *res)
{
int *ind = (int *) R_alloc(n, sizeof(int));
SEXP xsexp = PROTECT(Rf_allocVector(REALSXP, n));
// you forgot to copy the values?
for(int i = 0; i < n; ++i)
REAL(xsexp)[i] = x[i];
// a call as in https://github.com/wch/r-source/blob/7dcdfc2d2d0ce3ce6fe84aa1cf0d27b5cbc833fc/src/main/sort.c#L1096
R_orderVector(ind, n, Rf_lang1(xsexp), TRUE, TRUE);
for(int i = 0; i < n; i++)
res[i] = x[ind[i]];
UNPROTECT(1); // seems like you forgot this
}
// [[Rcpp::export(rng = false)]]
SEXP myRsort_(SEXP x)
{
double *x_ = REAL(x);
int n = LENGTH(x);
int maxlen = 64;
SEXP res = PROTECT(Rf_allocVector(REALSXP, maxlen));
double *res_ = REAL(res);
myRsort_aux(x_, n, res_);
UNPROTECT(1);
return res;
}
/*** R
set.seed(1)
to_be_sorted <- rnorm(10)
head(myRsort_(to_be_sorted), 10)
#R> [1] 1.5952808 0.7383247 0.5757814 0.4874291 0.3295078 0.1836433 -0.3053884 -0.6264538 -0.8204684 -0.8356286
*/
You can likely just use the R_rsort in this case.

Negative array indexing in shared memory based 1d stencil CUDA implementation

I'm currently working with CUDA programming and I'm trying to learn off of slides from a workshop I found online, which can be found here. The problem I am having is on slide 48. The following code can be found there:
__global__ void stencil_1d(int *in, int *out) {
__shared__ int temp[BLOCK_SIZE + 2 * RADIUS];
int gindex = threadIdx.x + blockIdx.x * blockDim.x;
int lindex = threadIdx.x + RADIUS;
// Read input elements into shared memory
temp[lindex] = in[gindex];
if (threadIdx.x < RADIUS) {
temp[lindex - RADIUS] = in[gindex - RADIUS];
temp[lindex + BLOCK_SIZE] = in[gindex + BLOCK_SIZE];
}
....
To add a bit of context. We have an array called in which as length say N. We then have another array out which has length N+(2*RADIUS), where RADIUS has a value of 3 for this particular example. The idea is to copy array in, into array out but to place the array in in position 3 from the beginning of array out i.e out = [RADIUS][in][RADIUS], see slide for graphical representation.
The confusion comes in on the following line:
temp[lindex - RADIUS] = in[gindex - RADIUS];
If gindex is 0 then we have in[-3]. How can we read from a negative index in an array? Any help would really be appreciated.
The answer by pQB is correct. You are supposed to offset the input array pointer by RADIUS.
To show this, I'm providing below a full worked example. Hope it would be beneficial to other users.
(I would say you would need a __syncthreads() after the shared memory loads. I have added it in the below example).
#include <thrust/device_vector.h>
#define RADIUS 3
#define BLOCKSIZE 32
/*******************/
/* iDivUp FUNCTION */
/*******************/
int iDivUp(int a, int b){ return ((a % b) != 0) ? (a / b + 1) : (a / b); }
/********************/
/* CUDA ERROR CHECK */
/********************/
#define gpuErrchk(ans) { gpuAssert((ans), __FILE__, __LINE__); }
inline void gpuAssert(cudaError_t code, const char *file, int line, bool abort=true)
{
if (code != cudaSuccess)
{
fprintf(stderr,"GPUassert: %s %s %d\n", cudaGetErrorString(code), file, line);
if (abort) exit(code);
}
}
/**********/
/* KERNEL */
/**********/
__global__ void moving_average(unsigned int *in, unsigned int *out, unsigned int N) {
__shared__ unsigned int temp[BLOCKSIZE + 2 * RADIUS];
unsigned int gindexx = threadIdx.x + blockIdx.x * blockDim.x;
unsigned int lindexx = threadIdx.x + RADIUS;
// --- Read input elements into shared memory
temp[lindexx] = (gindexx < N)? in[gindexx] : 0;
if (threadIdx.x < RADIUS) {
temp[threadIdx.x] = (((gindexx - RADIUS) >= 0)&&(gindexx <= N)) ? in[gindexx - RADIUS] : 0;
temp[threadIdx.x + (RADIUS + BLOCKSIZE)] = ((gindexx + BLOCKSIZE) < N)? in[gindexx + BLOCKSIZE] : 0;
}
__syncthreads();
// --- Apply the stencil
unsigned int result = 0;
for (int offset = -RADIUS ; offset <= RADIUS ; offset++) {
result += temp[lindexx + offset];
}
// --- Store the result
out[gindexx] = result;
}
/********/
/* MAIN */
/********/
int main() {
const unsigned int N = 55 + 2 * RADIUS;
const unsigned int constant = 4;
thrust::device_vector<unsigned int> d_in(N, constant);
thrust::device_vector<unsigned int> d_out(N);
moving_average<<<iDivUp(N, BLOCKSIZE), BLOCKSIZE>>>(thrust::raw_pointer_cast(d_in.data()), thrust::raw_pointer_cast(d_out.data()), N);
gpuErrchk(cudaPeekAtLastError());
gpuErrchk(cudaDeviceSynchronize());
thrust::host_vector<unsigned int> h_out = d_out;
for (int i=0; i<N; i++)
printf("Element i = %i; h_out = %i\n", i, h_out[i]);
return 0;
}
You are assuming that in array points to the first position of the memory that has been allocated for this array. However, if you see slide 47, the in array has a halo (orange boxes) of three elements before and after of the data (represented as green cubes).
My assumption is (I have not done the workshop) that the input array is first initialized with an halo and then the pointer is moved in the kernel call. Something like:
stencil_1d<<<dimGrid, dimBlock>>>(in + RADIUS, out);
So, in the kernel, it's safe to do in[-3] because the pointer is not at the beginning of the array.
There are already good answers, but to focus on the actual point that caused the confusion:
In C (not only in CUDA, but in C in general), when you access an "array" by using the [ brackets ], you are actually doing pointer arithmetic.
For example, consider a pointer like this:
int* data= ... // Points to some memory
When you then write a statement like
data[3] = 42;
you are just accessing a memory location that is "three entries behind the original data pointer". So you could also have written
int* data= ... // Points to some memory
int* dataWithOffset = data+3;
dataWithOffset[0] = 42; // This will write into data[3]
and consequently,
dataWithOffset[-3] = 123; // This will write into data[0]
In fact, you can say that data[i] is the same as *(data+i), which is the same as *(i+data), which in turn is the same as i[data], but you should not use this in real programs...)
I can compile #JackOLantern's code, but there is an warning: "pointless comparison of unsigned integer with zero":
And when run, it will abort like:
I have modified the code to the following and the warning disappeared and it can get right result:
#include <thrust/device_vector.h>
#define RADIUS 3
#define BLOCKSIZE 32
/*******************/
/* iDivUp FUNCTION */
/*******************/
int iDivUp(int a, int b){ return ((a % b) != 0) ? (a / b + 1) : (a / b); }
/********************/
/* CUDA ERROR CHECK */
/********************/
#define gpuErrchk(ans) { gpuAssert((ans), __FILE__, __LINE__); }
inline void gpuAssert(cudaError_t code, const char *file, int line, bool abort=true)
{
if (code != cudaSuccess)
{
fprintf(stderr,"GPUassert: %s %s %d\n", cudaGetErrorString(code), file, line);
if (abort) exit(code);
}
}
/**********/
/* KERNEL */
/**********/
__global__ void moving_average(unsigned int *in, unsigned int *out, int N) {
__shared__ unsigned int temp[BLOCKSIZE + 2 * RADIUS];
int gindexx = threadIdx.x + blockIdx.x * blockDim.x;
int lindexx = threadIdx.x + RADIUS;
// --- Read input elements into shared memory
temp[lindexx] = (gindexx < N)? in[gindexx] : 0;
if (threadIdx.x < RADIUS) {
temp[threadIdx.x] = (((gindexx - RADIUS) >= 0)&&(gindexx <= N)) ? in[gindexx - RADIUS] : 0;
temp[threadIdx.x + (RADIUS + BLOCKSIZE)] = ((gindexx + BLOCKSIZE) < N)? in[gindexx + BLOCKSIZE] : 0;
}
__syncthreads();
// --- Apply the stencil
unsigned int result = 0;
for (int offset = -RADIUS ; offset <= RADIUS ; offset++) {
result += temp[lindexx + offset];
}
// --- Store the result
out[gindexx] = result;
}
/********/
/* MAIN */
/********/
int main() {
const int N = 55 + 2 * RADIUS;
const unsigned int constant = 4;
thrust::device_vector<unsigned int> d_in(N, constant);
thrust::device_vector<unsigned int> d_out(N);
moving_average<<<iDivUp(N, BLOCKSIZE), BLOCKSIZE>>>(thrust::raw_pointer_cast(d_in.data()), thrust::raw_pointer_cast(d_out.data()), N);
gpuErrchk(cudaPeekAtLastError());
gpuErrchk(cudaDeviceSynchronize());
thrust::host_vector<unsigned int> h_out = d_out;
for (int i=0; i<N; i++)
printf("Element i = %i; h_out = %i\n", i, h_out[i]);
return 0;
}
The result is like this:

error C2102: '&' requires l-value

The code line: gsl_blas_daxpy(-a,&gsl_matrix_column(D, q).vector,y);
cause the error
error C2102: '&' requires l-value
, now the problem is that I have no control of the GSL functions so I don't know how to figure this out (removing the "&" didn't work)
afterwards i get
error C2198: 'gsl_blas_daxpy' : too few arguments for call
I'm using Visual studio 2010.
GSL_EXPORT int gsl_blas_daxpy (double alpha,
const gsl_vector * X,
gsl_vector * Y);
#include <stdio.h>
#include <math.h>
#include <time.h>
#include <gsl/gsl_vector.h>
#include <gsl/gsl_matrix.h>
#include <gsl/gsl_blas.h>
#define M (10) // Number of columns in dictionary */
#define N ((int)(M/2)) // Number of rows in dictionary */
int K = 0.07*M; //Number of non-zero elements in signal - the sparsity
int P=1; //number of signals
double epsilon = 1.0e-7; // Residual error
int numOfIterations = N; /* Max num of iterations - same as num of elements in signal */
double sign(double x){return (x>=0) - (x<0);} // Sign function
int main(int argc, char** argv)
{
int n, m, k, iter, q;
double normi, normf, tmp , norm=sqrt(N), htime;
gsl_matrix *D; // A random dictionary used for encoding the sparse signal NxM
gsl_vector *x; // Sparse info signal (encoder input) MxP
gsl_vector *z; // Evaluated Sparse info signal (decoder output) MxP
gsl_vector *r; // Residual error vector MxP
gsl_vector *y; // Sparse representation of signal (encoder output) NxP
gsl_vector_view v;
clock_t start; //for measuring performance
printf("\nDictionary is:NxM=%dx%d,and the signal sparsity is K=%d", N, M, K);
srand(time(NULL)); //Initialize srand
start =clock(); //Initialize clock
/* Initiallize D as a Bernoulli random dictionary */
D = gsl_matrix_alloc (N, M);
for(m=0; m<M; m++)
{
for(n=0; n<N; n++)
{
tmp=sign(2.0*rand()/(double)RAND_MAX-1.0)/norm;
gsl_matrix_set (D, n, m, tmp); //D[n,m]=tmp
}
}
/* Create a random K-sparse info signal */
x = gsl_vector_alloc(M);
for(k=0; k<K; k++)
{
gsl_vector_set(x, rand()%M, 2.0*rand()/(float)RAND_MAX - 1.0); //put random values at k random positions
}
/* Allocate memory for solution (evaluated signal) */
z = gsl_vector_calloc(M);
/* Allocate memory for residual vector */
r = gsl_vector_calloc(M);
/* Allocate memory for the encoded signal vector (its representation) */
y = gsl_vector_alloc(N);
htime=((double)clock()-start)/CLOCKS_PER_SEC;
printf("\nTime data allocation: %f", htime);
/* Encoding the signal (x to y) */
start = clock();
gsl_blas_dgemv(CblasNoTrans, 1, D, x, 0, y); // y = Dx
htime=((double)clock()-start)/CLOCKS_PER_SEC;
printf("\nTime for encoding: %f", htime);
/* Decoding the signal */
start = clock();
normi = gsl_blas_dnrm2(y); // ||y|| (L2 norm)
epsilon = sqrt(epsilon * normi);
normf = normi;
iter = 0;
/*iterate till the computational error is small enough*/
while(normf > epsilon && iter < numOfIterations)
{
gsl_blas_dgemv(CblasTrans, 1, D, y, 0, r); // r=D'*y
q = gsl_blas_idamax(r); //index of max element in residual vector
tmp = gsl_vector_get(r, q); //the max element in r
gsl_vector_set(z, q, gsl_vector_get(z, q)+tmp); // z[q]=z[q]+ tmp
v=gsl_matrix_column(D, q); // choose the dictrionary's atom (coloum) with the index of largest element in r
gsl_blas_daxpy(-tmp,&v.vector,y); // y = y-tmp*v
normf = gsl_blas_dnrm2(y); // ||y|| (L2 norm)
iter++;
}
htime = ((double)clock()-start)/CLOCKS_PER_SEC;
printf("\nTime for decoding: %f", htime);
tmp = 100.0*(normf*normf)/(normi*normi); // the error at end of algorithm
printf("\nComputation residual error: %f",tmp);
/* Check the solution (evaluated signal) against the original signal */
printf("\nSolution (first column),Reference (second column):");
getchar(); // wait for pressing a key
for(m=0; m<M; m++)
{
printf("\n%.3f\t%.3f", gsl_vector_get(x, m),gsl_vector_get(z, m));
}
normi = gsl_blas_dnrm2(x);
gsl_blas_daxpy(-1.0, x, z); // z = z-x
normf = gsl_blas_dnrm2(z); // ||z|| (L2 norm)
tmp = 100.0*(normf*normf)/(normi*normi); //final error
printf("\nSolution residual error: %f\n",tmp);
/* Memory clean up and shutdown*/
gsl_vector_free(y); gsl_vector_free(r);
gsl_vector_free(z); gsl_vector_free(x);
gsl_matrix_free(D);
getchar();
return EXIT_SUCCESS;
}
gsl_matrix_column(D, q).vector is an R-value. You can't take its address. You need an L-value, so assign it to a named variable first, then pass the address of that variable to the function.
If you make a more permanent home for the return value of gsl_matrix_column, (this particular) problem will go away.
Here is some simplified code that illustrates how one might capture a return value in an addressable slot:
struct _foo {
int i;
};
struct _foo bar () {
struct _foo result = { 5 };
return result;
}
/* won't compile; 'lvalue required as unary & operand */
void qux () {
int *j = &bar().i;
}
/* compiles OK */
void qal () {
struct _foo result = bar();
int* j = &result.i;
}
gsl_vector_view c=gsl_matrix_column(D, q);
gsl_blas_daxpy(-a,&c.vector,y);
I think, introducing a temporal variable led you pass a pointer to it to the function.
EDIT: Well, trying to understand the problem, I wanted to know what the function expect:
int gsl_blas_daxpy (double alpha, const gsl_vector * x, gsl_vector * y)
and
gsl_vector_view gsl_matrix_column (gsl_matrix * m, size_t j)
witj some explanation:
A vector view can be passed to any subroutine which takes a vector
argument just as a directly allocated vector would be, using
&view.vector.
and an example:
for (j = 0; j < 10; j++)
{
gsl_vector_view column = gsl_matrix_column (m, j);
double d;
d = gsl_blas_dnrm2 (&column.vector);
printf ("matrix column %d, norm = %g\n", j, d);
}
Now we have another problem:
Here another answer:
Are you aware that int K= 0.7 is K=0 ??
#define M (10) // Number of columns in dictionary */
int K = 0.07*M; //Number of non-zero elements in signal - the sparsity
alloc do not initialice the vector x. x will contain garbage values, not 0. Did you meant x = gsl_vector_calloc(M); with c? It will set x to 0.
/* Create a random K-sparse info signal */
x = gsl_vector_alloc(M);
for(k=0; k<K; k++) // K=0, for get skiped and x not modified.
{
gsl_vector_set(x, rand()%M, 2.0*rand()/(float)RAND_MAX - 1.0); //put random values at k random positions
}
(And here you will have at most K random values, but possible lest)

Resources