How to call C function from R?

How to call C function from R? - c

How can you use some function written in C from R level using R data.
eg. to use function like:
double* addOneToVector(int n, const double* vector) {
double* ans = malloc(sizeof(double)*n);
for (int i = 0; i < n; ++i)
ans[i] = vector[i] + 1
return ans;
}
in the context:
x = 1:3
x = addOneToVector(x)
x # 2, 3, 4

I've searched stackoverflow first but I noticed there is no answer for that in here.
The general idea is (commands for linux, but same idea under other OS):
Create function that will only take pointers to basic types and do everything by side-effects (returns void). eg in a file called foo.c:
void addOneToVector(int* n, double* vector) {
for (int i = 0; i < *n; ++i)
vector[i] += 1.0;
}
Compile file C source as dynamic library, you can use R shortcut to do this:
$ R CMD SHLIB foo.c
This will then create a file called foo.so on Mac or foo.dll on Windows.
Load dynamic library from R
on Mac:
dyn.load("foo.so")
or on Windows:
dyn.load("foo.dll")
Call C functions using .C R function, IE:
x = 1:3
ret_val = .C("addOneToVector", n=length(x), vector=as.double(x))
It returns list from which you can get value of inputs after calling functions eg.
ret_val$x # 2, 3, 4
You can now wrap it to be able to use it from R easier.
There is a nice page describing whole process with more details here (also covering Fortran):
http://users.stat.umn.edu/~geyer/rc/

I just did the same thing in a very simple way using the Rcpp package. It allows you to write C++ functions directly in R.
library("Rcpp")
cppFunction("
NumericVector addOneToVector(NumericVector vector) {
int n = vector.size();
for (int i = 0; i < n; ++i)
vector[i] = vector[i] + 1.0;
return vector;
}")
Find more details here http://adv-r.had.co.nz/Rcpp.html. C++ functions can be done very fast with these instructions.

First off, I wanted to thank both #m0nhawk and #Jan for their immensely useful contributions to this problem.
I tried both methods on my MacBook: first the one showed m0nhawk which requires creating a function in C (without the main method) and then compiling using R CMD SHLIB <prog.c> and then invoking the function from R using the .C command
Here's a small C code I wrote (not a pro in C - just learning in bits and pieces)
Step 1: Write the C Program
#include <stdio.h>
int func_test() {
for(int i = 0; i < 5; i++) {
printf("The value of i is: %d\n", i);
}
return 0;
}
Step 2: Compile the program using
R CMD SHLIB func_test.c
This will produce a func_test.so file
Step 3: Now write the R Code that invokes this C function from within R Studio
dyn.load("/users/my_home_dir/xxx/ccode/ac.so")
.C("func_test")
Step 4: Output:
.C("func_test") The value of i is: 0 The value of i is: 1 The value of i is: 2 The value of i is: 3 The value of i is: 4 list()
Then I tried the direct method suggested by Jan - using the RCpp package
library("Rcpp")
cppFunction("
NumericVector addOneToVector(NumericVector vector) {
int n = vector.size();
for (int i = 0; i < n; ++i)
vector[i] = vector[i] + 1.0;
return vector;
}")
# Test code to test the function
addOneToVector(c(1,2,3))
Both methods worked superbly. I can now start writing functions in C or C++ and use them in R
Thank you once again!

Related

passing in R array to C function with "NA"

I work in R using C libraries. I need to pass to a C function an array with numbers between 1 and 10 but that could also be "NA". Then in C, depending on the value I need to set the output.
Here's a simplified code
heredyn.load("ranking.so")
fun <- function(ranking) {
nrak <- length(ranking)
out <- .C("ranking", as.integer(nrak), as.character(ranking), rr = as.integer(vector("integer",nrak)))
out$rr
}
ranking <- sample(c(NA,seq(1,10)),10,replace=TRUE)
rr <- fun(ranking)
The C function could simply be such as
#include <R.h>
void ranking(int *nrak, char *ranking, int *rr) {
int i ;
for (i=0;i<*nrak;i++) {
if (ranking[i] == 'NA')
rr[i] = 1 ;
else
rr[i] = (int) strtol(&ranking[i],(char **)NULL,10) ;
}
}
Due to the "NA" value I set ranking as character but maybe there's another way to do that, using integer and without replacing "NA" to 0 before calling the function?
(The code like this, gives me always an array of zeros...)

Test for whether the value is an NA using R_NaInt, like
#include <R.h>
void ranking_c(int *nrak, int *ranking, int *rr) {
for (int i=0; i < *nrak; i++)
rr[i] = R_NaInt == ranking[i] ? -1 : ranking[i];
}
Invoke from R by explicitly allowing NAs
> x = c(1:2, NA_integer_)
> .C("ranking_c", length(x), as.integer(x), integer(length(x)), NAOK=TRUE)[[3]]
[1] 1 2 -1
Alternatively, use R's .Call() interface. Each R object is represented as an S-expression. There are C-level functions to manipulate S-expressions, e.g., length Rf_length(), data access INTEGER(), and allocation Rf_allocVector() of different types of S-expressions such as INTSXP for integer vectors.
R memory management uses a garbage collector that can run on any call that allocates memory. It is therefore best practice to PROTECT() any R allocation while in scope.
Your function will accept 0 or more S-expressions as input, and return a single S-expression; it might be implemented as
#include <Rinternals.h>
#include <R_ext/Arith.h>
SEXP ranking_call(SEXP ranking)
{
/* allocate space for result, PROTECTing from garbage collection */
SEXP result = PROTECT(Rf_allocVector(INTSXP, Rf_length(ranking)));
/* assign result */
for (int i = 0; i < Rf_length(ranking); ++i)
INTEGER(result)[i] =
R_NaInt == INTEGER(ranking)[i] ? -1 : INTEGER(ranking)[i];
UNPROTECT(1); /* no more need to protect */
return result;
}
And invoked from R with .Call("ranking_call", as.integer(ranking)).
Using .Call is more efficient than .C in terms of speed and memory allocation (.C may copy atomic vectors on the way in), but the primary reason to use it is for the flexibility it offers in terms of working directly with R's data structures. This is especially important when the return values are more complicated than atomic vectors.

You are attempting to address a couple of delicate and non-trivial points, least of all how to compile code with R, and to test for non-finite values.
You asked for help with C. I would like to suggest C++ -- which you do not need to use in a complicated way. Consider this short file with contains a function to process a vector along the lines you suggest (I just test for NA and then assign 42 as a marker for simplicit) or else square the value:
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
NumericVector foo(NumericVector x) {
unsigned int n = x.size();
for (unsigned int i=0; i<n; i++)
if (NumericVector::is_na(x[i]))
x[i] = 42.0;
else
x[i] = pow(x[i], 2);
return x;
}
/*** R
foo( c(1, 3, NA, NaN, 6) )
*/
If I save this on my box as /tmp/foo.cpp, in order compile, link, load and even run the embedded R use example, I only need one line to call sourceCpp():
R> Rcpp::sourceCpp("/tmp/foo.cpp")
R> foo( c(1, 3, NA, NaN, 6))
[1] 1 9 42 42 36
R>
We can do the same with integers:
// [[Rcpp::export]]
IntegerVector bar(IntegerVector x) {
unsigned int n = x.size();
for (unsigned int i=0; i<n; i++)
if (IntegerVector::is_na(x[i]))
x[i] = 42;
else
x[i] = pow(x[i], 2);
return x;
}

Problems using GNU Scientific Library in Visual Studio 2012

I'm having some issues trying to install GSL for my programming course at uni (the course is taught in C). I'm using Visual Studio express 2012.
I've downloaded and run a precompiled GSL thing (GnuWin32) and changed my Linker and Includes options in VS, and the code is actually running properly. I've been using the following code as an example:
#include <math.h>
#include <stdio.h>
#include <gsl/gsl_matrix.h>
#include <gsl/gsl_blas.h>
int main()
{
size_t i,j;
gsl_matrix *m;
m = gsl_matrix_alloc (10, 10);
for (i = 0; i < 10; i++)
{
for (j = 0; j < 10; j++)
{
gsl_matrix_set (m, i, j, sin (i) + cos (j));
}
}
for (j = 0; j < 10; j++)
{
gsl_vector_view column = gsl_matrix_column (m, j);
double d;
d = gsl_blas_dnrm2 (&column.vector);
printf ("matrix column %d, norm = %g\n", j, d);
}
gsl_matrix_free(m);
}
This is running properly and giving me the correct output. However, when VS gets to the final curly bracket, it freaks out and gives me this error message:
"Run-Time Check Failure #0 - The value of ESP was not properly saved across a function call. This is usually a result of calling a function declared with one calling convention with a function pointer declared with a different calling convention."
Does anyone have an idea how to stop this happening? I've scoured the internet, and haven't found anything useful that I can understand.

'ESP was not saved properly' sounds like the standard asm code to save ESP into EBP failed or was not generated. There is a compiler option 'Omit frame pointers'. Check how it is set in your build for both the library and your main code. They both must be built the same way. Also, if you want a debugger to work, then uncheck this option.

Weird bug in c implementing schoolbook multiplication using gmp

I have a weird problem.
I am trying to implement the schoolbook multiplication. I am aware that the function mpz_mul does that for me but it is my task to implement it myself as a homework.
So here is my code:
void mpz_school_mul(mpz_t c, mpz_t a, mpz_t b)
{
size_t i;
mp_limb_t b_i;
mpz_t c_part;
mpz_init(c_part);
/* Backup a for the special case a := a * b. */
mpz_t a_backup;
mpz_init(a_backup);
mpz_set(a_backup, a);
/* Clear the result */
mpz_set_ui(c,0);
gmp_printf("i = %zx, size(b) = %zx, a = %Zx, b = %Zx\n", i, mpz_size(b), a, b);
for(i = 0; i < mpz_size(b); i++)
{
printf("test\n");
b_i = mpz_getlimbn(b,i);
/* c = a*b_i*B^i + ... + a*b_0*B^0 */
/* Calculate a*b_i for every round. */
mpz_mul_limb(c_part,a_backup,b_i);
/* Shift it to the right position (*B^i). */
mpz_mul_base(c_part,c_part,i);
/* Sum all a*b_i*B^i */
mpz_school_add(c, c, c_part);
}
mpz_clear(a_backup);
mpz_clear(c_part);
}
This code works well for me and i can test it with several parameters. The result is correct so I don't think I need to change to much in the calculation part. ;)
As example: This parameters work as intended.
mpz_set_str(a, "ffffffff00000000abcdabcd", 16);
mpz_set_str(b, "cceaffcc00000000abcdabcd", 16);
mpz_school_mul(c,a,b);
Now to the bug:
When i run the program with a parameter b with a zero limb (I'm using a 32 bit VM) at the end the program crashes:
mpz_set_str(a, "ffffffff00000000abcdabcd", 16);
mpz_set_str(b, "cceaffcc00000000", 16);
mpz_school_mul(c,a,b);
The output with this parameter b_0 = 0 is:
i = 0, size(b) = 2, a = ffffffff00000000abcdabcd, b = cceaffcc00000000
I think the for-loop stucks because the printf("test\n"); does not show up in this run.
Thanks for your help ;)

The bug in the problem is fixed now.
Here is the solution:
I tested out to use fprintf(stderr, "test\n"); rather than printf("test\n"); and tested the code. It magically showed up the "test" in my console.
It may have to do with the wrong inclusion order of the and file.
As I had this problem with the print I didn't check my other functions.
Since I figured out, that the for-loop wasn't the problem I tested several prints after each command. I was able to detect the error in the void mpz_mul_base(mpz_t c, mpz_t a, mp_size_t i) function where I didn't check the case with c_part = 0. With this parameter the following code of the mpz_mul_base(c_part,c_part,i); function ran into an endless loop:
if(mpz_size(c) >= n)
for(i = mpz_size(c); i >= n; i--)
{
mpz_setlimbn(c, clear, i);
}
I replaced the >= with > and everything works fine now.

Visual Studios 2010: 'File location'.exe is not recognized as internal or external command... (C program)

My Thermo professor assigned our class a computational project in which we have to calculate some thermodynamic functions. He provided us with some code to work off of which is a program that essentially finds the area under a curve between two points for the function x^2. The code is said to be correct and it looks correct to me. However, I've been having FREQUENT problems with all of my programs giving me the error "'File location'.exe is not recognized as internal or external command, operable programs or batch files." upon initial running of a project or [mostly] reopening projects.
I've been researching the problem for many hours. I tried adjusting the environmental variables like so many other sites suggested, but I'm either not doing it right or it's not working. All I keep reading about is people explaining the purpose of an .exe file and that I have to locate that file and open that. The problem is that I cannot find ANY .exe file. There is the project I created with the source.c file I created and wrote the program in. Everything else has lengthy extensions that I've never seen before.
I'm growing increasingly impatient with Visual Studios' inconsistent behavior lately. I've just made the switch from MATLAB, which although is an inferior programming language, is far more user friendly and easier to program with. For those of you interested in the code I'm running, it is below:
#include <stdio.h>
#include <iostream>
#include <math.h>
using namespace std;
double integration();
double integration()
{
int num_of_intervals = 4, i;
double final_sum = 0, lower_limit = 2, upper_limit = 3, var, y = 1, x;
x = (upper_limit - lower_limit) / num_of_intervals; // Calculating delta x value
if(num_of_intervals % 2 != 0) //Simpson's rule can be performed only on even number of intervals
{
printf("Cannot perform integration. Number of intervals should be even");
return 0;
}
for(i = 0 ; i < num_of_intervals ; i++)
{
if(i != 0) //Coefficients for even and odd places. Even places, it is 2 and for odd it is 4.
{
if(i % 2 == 0)
y = 2;
else
y = 4;
}
var = lower_limit + (i * x);// Calculating the function variable value
final_sum = final_sum + (pow(var, 2) * y); //Calculating the sum
}
final_sum = (final_sum + pow(upper_limit , 2)) * x / 3; //Final sum
return final_sum;
}
int main()
{
printf("The integral value of x2 between limits 2 and 3 is %lf \n" , integration());
system("PAUSE");
return 0;
}
Thanks in advance,
Dom

Using gcc to compile a C program

I am following an example from CUNY and I have never done anything with C before so I probably don't know what I am doing.
Consider the program below.
Do I need a shebang line for C code written in emacs?
When I go to compile using the line gcc -g -o forwardadding forwardadding.c
I am hit with the message:
forwardadding.c:9:17: error: expected expression before ‘<’ token
Once I get the code compiles, I can use gdb to debug and run the code corrrect?
The code:
#include <stdio.h>
#include <math.h>
main()
{
float sum, term;
int i;
sum = 0.0;
for( i = 1; < 10000000; i++)
{
term = (float) i;
term = term * term;
term = 1 / term;
sum += term;
}
printf("The sum is %.12f\n", sum);
}

No shebang is needed. You could add an Emacs mode line comment.
The for loop should be:
for (i = 1; i < 10000000; i++)
Your code is missing the second i.
Yes, you can use GDB once you've got the code compiling.
You'd get a better answer to the mathematics if you counted down from 10,000,000 than by counting up to 10,000,000. After about i = 10000, the extra values add nothing to the result.
Please get into the habit of writing C99 code. That means you should write:
int main(void)
with the return type of int being required and the void being recommended.

You need to put a variable in the for loop for a complete expression (which is probably line 9...)
for( i = 1; < 10000000; i++)
change to this
for( i = 1; i < 10000000; i++)

You are missing an an i. Just correct that as Jonathan Leffler has suggested and save your file. Open your terminal and just use this to compile your code gcc your_file_name.c and your code compiles next to run the code that just compiled type ./a.out and your program runs and shows you the output.