I am trying to write C code to use in R, but found that the .C function call does not update the output variable. The real C code is complicated, but here is a simple example that shows the behavior:
void doubleMe(const int *input, int *output) {
output[0] = input[0] * 2;
}
Save the above C function in a file named doubleMe.c. Under Linux, compile it to create doubleMe.so:
R CMD SHLIB doubleMe.c
In R, if I do the following:
dyn.load("doubleMe.so") # load it
input = 2
output = 0
.C("doubleMe", as.integer(input), as.integer(output)) # expect output=4
[[1]]
[1] 2
[[2]]
[1] 4
The screen output indicates the input is doubled, but the output in R is still 0:
output
[1] 0
If I do the following:
output = .C("doubleMe", as.integer(input), as.integer(output))[[2]]
the output is 4. That works for this example.
But my real input and output are matrices, and I have to reshape the output to the correct dimensions. Is there a way to let the .C call update output directly?
With the .Call() interface using SEXP data types where P stands for pointer, this is automagic:
R> Rcpp::cppFunction("void doubleMe(NumericVector x) { x = 2*x; } ")
R> x <- 1 # set to one
R> doubleMe(x) # call function we just wrote
R> x # check ...
[1] 2 # and it has doubled as a side-effect
R>
I use Rcpp here because it lets me do this in one line; you could do the same in a few lines of C code if you wanted to.
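On the matrix part of the question: R stores a matrix as one flat vector in column-major order, so a routine called through .C sees element (i, j) at index i + nrow * j. Below is a minimal sketch of such a routine in plain C. The name doubleMatrix and the extra nrow/ncol arguments are assumptions made for illustration, not part of the original code, and the R-side call is not shown.

```c
#include <stddef.h>

/* Hypothetical .C-style routine (name and signature are illustrative):
   doubles every element of an nrow x ncol matrix that arrives as a flat,
   column-major int array, and writes the result to output. */
void doubleMatrix(const int *input, const int *nrow, const int *ncol,
                  int *output)
{
    for (int j = 0; j < *ncol; ++j) {
        for (int i = 0; i < *nrow; ++i) {
            /* column-major index of element (i, j) */
            size_t k = (size_t)i + (size_t)(*nrow) * (size_t)j;
            output[k] = input[k] * 2;
        }
    }
}
```

Because the layout inside C matches R's own layout, no reordering of the values is needed in C; only the dimensions have to be known on both sides.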
Related to "When to use .Call or .C in R": with vector arguments, my current approach is to compute some attributes, such as the length and the maximum value, in R and then pass those attributes as arguments to the C functions.
From the R extensions documentation, at least a function named length is available. Are there similar C interfaces to R vector functions such as max, min, and rep?
Rcpp has basic functions like min, max and rep. Consider the following example (suppose it's called example.cpp):
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
NumericVector exampleMinMax(NumericVector x) {
NumericVector out(2);
out[0] = min(x);
out[1] = max(x);
return out;
}
// [[Rcpp::export]]
NumericVector exampleRep(NumericVector x, int n) {
NumericVector out = rep_each(x, n);
return out;
}
Then in R you can do:
library(Rcpp)
sourceCpp("example.cpp")
exampleMinMax(1:10)
[1] 1 10
exampleRep(1:10, 2)
[1] 1 1 2 2 3 3 4 4 5 5 6 6 7 7 8 8 9 9 10 10
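If Rcpp is not an option, helpers like these are short enough to write by hand in plain C. The following is only a sketch under stated assumptions: vecRange is a hypothetical name, the arguments follow the pointer-only convention of .C, the vector has at least one element, and NA/NaN values are not handled.

```c
/* Hypothetical .C-style helper (illustrative name): writes the minimum of
   x[0..n-1] to out[0] and the maximum to out[1].
   Assumes *n >= 1 and that x contains no NA/NaN values. */
void vecRange(const double *x, const int *n, double *out)
{
    out[0] = x[0];
    out[1] = x[0];
    for (int i = 1; i < *n; ++i) {
        if (x[i] < out[0]) out[0] = x[i];   /* new minimum */
        if (x[i] > out[1]) out[1] = x[i];   /* new maximum */
    }
}
```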
Requirement:
Given a C program I have to identify whether the functions accessing global variables are reading them or writing them.
Example code:
#include <stdio.h>
/* global variable declaration */
int g = 20;
int main()
{
/* writing the global variable */
g = 10;
/* reading the global variable */
printf ("value of g = %d\n", g);
return 0;
}
Executing the above code, I want to generate a log file in the format below:
1- Global variable g written in function main() "TIME_STAMP"
2- Global variable g read in function main() "TIME_STAMP"
Research:
I am certainly able to achieve this with a static analysis of the source code, using the logic below:
1. Go through the C code and identify the statements where the global variable is used.
2. Analyze each such statement to identify whether it is a read or a write (check whether the ++ or -- operator is used with the global variable, or whether any assignment is made to it).
3. Add a log statement above the identified statement, so that it executes along with that statement.
This is not a proper implementation.
Some studies:
I have looked into how debuggers are able to capture this information.
Some links on the internet:
How to catch a memory write and call function with address of write
Not completely answering your question, but to just log access you could do:
#include <stdio.h>
int g = 0;
#define g (*(fprintf(stderr, "accessing g from %s. g = %d\n", __FUNCTION__, g), &g))
void foo(void)
{
g = 2;
printf("g=%d\n", g);
}
void bar(void)
{
g = 3;
printf("g=%d\n", g);
}
int main(void)
{
printf("g=%d\n", g);
g = 1;
foo();
bar();
printf("g=%d\n", g);
}
Which would print:
accessing g from main. g = 0
g=0
accessing g from main. g = 0
accessing g from foo. g = 1
accessing g from foo. g = 2
g=2
accessing g from bar. g = 2
accessing g from bar. g = 3
g=3
accessing g from main. g = 3
g=3
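The macro above logs every access but cannot tell a read from a write. One way to separate the two is to route accesses through explicit accessor functions. This is a sketch, not the answerer's method: g_read, g_write, G_READ, G_WRITE, demo and the two counters are hypothetical names, and the timestamp is simply the value returned by time().

```c
#include <stdio.h>
#include <time.h>

int g = 20;          /* the global variable being tracked */
int g_reads = 0;     /* counters added only for illustration */
int g_writes = 0;

/* Logs a read of g, then returns its value. */
int g_read(const char *func)
{
    ++g_reads;
    fprintf(stderr, "Global variable g read in function %s() %ld\n",
            func, (long)time(NULL));
    return g;
}

/* Logs a write to g, then stores the new value. */
void g_write(const char *func, int value)
{
    ++g_writes;
    fprintf(stderr, "Global variable g written in function %s() %ld\n",
            func, (long)time(NULL));
    g = value;
}

/* __func__ expands to the name of the calling function (standard since C99) */
#define G_READ()   g_read(__func__)
#define G_WRITE(v) g_write(__func__, (v))

/* Example caller: one write followed by one read. */
int demo(void)
{
    G_WRITE(10);
    return G_READ();
}
```

The cost of this approach is that every use of g in the source has to be rewritten as G_READ() or G_WRITE(v), which is the part an instrumentation tool would automate.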
Below is how I solved this problem.
I created a utility (in Java) that takes the C source file as input and works as follows:
- It parses the file line by line, identifying the variables and functions.
- It stores the global variables in a separate container and looks for lines that use them.
- For every line that accesses a global variable, it determines whether the access is a read or a write (for example, =, +=, -= etc. are write operations).
- For every such operation, it instruments the code as suggested by @alk (https://stackoverflow.com/a/41158928/6160431), which in turn generates the log file when I execute the modified source file.
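The read/write classification step can be sketched as a small operator lookup. The utility described here is written in Java; this sketch is in C only for consistency with the rest of the thread, and is_write_op is a hypothetical name. It assumes the operator next to the global variable has already been extracted as a token.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the operator token modifies its operand,
   0 otherwise. Comparison operators such as == only read the variable.
   Note that ++, -- and compound assignments both read and write. */
int is_write_op(const char *op)
{
    static const char *const write_ops[] = {
        "=", "+=", "-=", "*=", "/=", "%=", "&=", "|=", "^=", "<<=", ">>=",
        "++", "--"
    };
    const size_t count = sizeof write_ops / sizeof write_ops[0];

    for (size_t i = 0; i < count; ++i) {
        if (strcmp(op, write_ops[i]) == 0)
            return 1;
    }
    return 0;
}
```

A token-based check like this does not catch writes made through pointers or aliases, which is one reason the dynamic instrumentation tools mentioned in this thread exist.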
I am able to achieve what I want, but I am still looking for a better implementation if anyone has one.
If anybody wants to discuss this further, we can have a chat.
I referred to the source code and algorithms of the tools below:
http://www.dyninst.org/
https://software.intel.com/en-us/articles/pin-a-dynamic-binary-instrumentation-tool
I work in R using C libraries. I need to pass to a C function an array of numbers between 1 and 10 that may also contain NA. Then in C, depending on the value, I need to set the output.
Here is a simplified version of the code:
dyn.load("ranking.so")
fun <- function(ranking) {
nrak <- length(ranking)
out <- .C("ranking", as.integer(nrak), as.character(ranking), rr = as.integer(vector("integer",nrak)))
out$rr
}
ranking <- sample(c(NA,seq(1,10)),10,replace=TRUE)
rr <- fun(ranking)
The C function could simply be such as
#include <R.h>
void ranking(int *nrak, char *ranking, int *rr) {
int i ;
for (i=0;i<*nrak;i++) {
if (ranking[i] == 'NA')
rr[i] = 1 ;
else
rr[i] = (int) strtol(&ranking[i],(char **)NULL,10) ;
}
}
Because of the NA values I pass ranking as character, but maybe there is another way to do this, using integers and without replacing NA with 0 before calling the function?
(As written, the code always gives me an array of zeros...)
Test for whether the value is an NA using R_NaInt, like
#include <R.h>
void ranking_c(int *nrak, int *ranking, int *rr) {
for (int i=0; i < *nrak; i++)
rr[i] = R_NaInt == ranking[i] ? -1 : ranking[i];
}
Invoke from R by explicitly allowing NAs
> x = c(1:2, NA_integer_)
> .C("ranking_c", length(x), as.integer(x), integer(length(x)), NAOK=TRUE)[[3]]
[1] 1 2 -1
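The comparison works because R represents NA_integer_ as the smallest int value (INT_MIN), and R_NaInt holds that value. The same logic can be tried without the R headers in a standalone sketch; NA_INT_STANDIN and ranking_standalone are hypothetical names used only here, and real code should keep using R_NaInt.

```c
#include <limits.h>

/* Stand-in for R_NaInt so the sketch compiles without R's headers
   (assumption for illustration only; use R_NaInt in real code). */
#define NA_INT_STANDIN INT_MIN

/* Same logic as ranking_c above: copy each value, mapping NA to -1. */
void ranking_standalone(const int *nrak, const int *ranking, int *rr)
{
    for (int i = 0; i < *nrak; ++i)
        rr[i] = (ranking[i] == NA_INT_STANDIN) ? -1 : ranking[i];
}
```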
Alternatively, use R's .Call() interface. Each R object is represented as an S-expression. There are C-level functions to manipulate S-expressions, e.g., length Rf_length(), data access INTEGER(), and allocation Rf_allocVector() of different types of S-expressions such as INTSXP for integer vectors.
R memory management uses a garbage collector that can run on any call that allocates memory. It is therefore best practice to PROTECT() any R allocation while in scope.
Your function will accept 0 or more S-expressions as input, and return a single S-expression; it might be implemented as
#include <Rinternals.h>
#include <R_ext/Arith.h>
SEXP ranking_call(SEXP ranking)
{
/* allocate space for result, PROTECTing from garbage collection */
SEXP result = PROTECT(Rf_allocVector(INTSXP, Rf_length(ranking)));
/* assign result */
for (int i = 0; i < Rf_length(ranking); ++i)
INTEGER(result)[i] =
R_NaInt == INTEGER(ranking)[i] ? -1 : INTEGER(ranking)[i];
UNPROTECT(1); /* no more need to protect */
return result;
}
And invoked from R with .Call("ranking_call", as.integer(ranking)).
Using .Call is more efficient than .C in terms of speed and memory allocation (.C may copy atomic vectors on the way in), but the primary reason to use it is for the flexibility it offers in terms of working directly with R's data structures. This is especially important when the return values are more complicated than atomic vectors.
You are attempting to address a couple of delicate and non-trivial points, not least how to compile code with R and how to test for non-finite values.
You asked for help with C. I would like to suggest C++, which you do not need to use in a complicated way. Consider this short file, which contains a function to process a vector along the lines you suggest (for simplicity, I just test for NA and assign 42 as a marker, or else square the value):
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
NumericVector foo(NumericVector x) {
unsigned int n = x.size();
for (unsigned int i=0; i<n; i++)
if (NumericVector::is_na(x[i]))
x[i] = 42.0;
else
x[i] = pow(x[i], 2);
return x;
}
/*** R
foo( c(1, 3, NA, NaN, 6) )
*/
If I save this on my box as /tmp/foo.cpp, then to compile, link, load and even run the embedded R usage example, I only need one line calling sourceCpp():
R> Rcpp::sourceCpp("/tmp/foo.cpp")
R> foo( c(1, 3, NA, NaN, 6))
[1] 1 9 42 42 36
R>
We can do the same with integers:
// [[Rcpp::export]]
IntegerVector bar(IntegerVector x) {
unsigned int n = x.size();
for (unsigned int i=0; i<n; i++)
if (IntegerVector::is_na(x[i]))
x[i] = 42;
else
x[i] = pow(x[i], 2);
return x;
}
How can you use a function written in C from R, operating on R data?
E.g., to use a function like:
#include <stdlib.h>   /* malloc */
double* addOneToVector(int n, const double* vector) {
    double* ans = malloc(sizeof(double) * n);
    for (int i = 0; i < n; ++i)
        ans[i] = vector[i] + 1;
    return ans;
}
in the context:
x = 1:3
x = addOneToVector(x)
x # 2, 3, 4
I searched Stack Overflow first but found no answer to this here.
The general idea is (the commands are for Linux, but the idea is the same under other OSes):
Create a function that takes only pointers to basic types and does everything by side effects (returns void), e.g. in a file called foo.c:
void addOneToVector(int* n, double* vector) {
for (int i = 0; i < *n; ++i)
vector[i] += 1.0;
}
Compile the C source file as a dynamic library. You can use the R shortcut to do this:
$ R CMD SHLIB foo.c
This creates a file called foo.so on Linux or Mac, or foo.dll on Windows.
Load the dynamic library from R.
On Linux or Mac:
dyn.load("foo.so")
or on Windows:
dyn.load("foo.dll")
Call the C function using R's .C function, i.e.:
x = 1:3
ret_val = .C("addOneToVector", n=length(x), vector=as.double(x))
It returns a list from which you can get the values of the arguments after the call, e.g.:
ret_val$vector # 2, 3, 4
You can now wrap it in an R function to make it easier to use.
There is a nice page describing the whole process in more detail here (also covering Fortran):
http://users.stat.umn.edu/~geyer/rc/
I just did the same thing in a very simple way using the Rcpp package. It allows you to write C++ functions directly in R.
library("Rcpp")
cppFunction("
NumericVector addOneToVector(NumericVector vector) {
int n = vector.size();
for (int i = 0; i < n; ++i)
vector[i] = vector[i] + 1.0;
return vector;
}")
Find more details here: http://adv-r.had.co.nz/Rcpp.html. With these instructions, C++ functions can be set up very quickly.
First off, I want to thank both @m0nhawk and @Jan for their immensely useful contributions to this problem.
I tried both methods on my MacBook: first the one shown by m0nhawk, which requires creating a function in C (without a main function), compiling it using R CMD SHLIB <prog.c>, and then invoking the function from R using the .C command.
Here's a small piece of C code I wrote (I'm not a pro in C, just learning in bits and pieces):
Step 1: Write the C Program
#include <stdio.h>
/* functions called through .C should return void */
void func_test(void) {
    for (int i = 0; i < 5; i++) {
        printf("The value of i is: %d\n", i);
    }
}
Step 2: Compile the program using
R CMD SHLIB func_test.c
This will produce a func_test.so file
Step 3: Now write the R code that invokes this C function from within RStudio
dyn.load("/users/my_home_dir/xxx/ccode/ac.so")
.C("func_test")
Step 4: Output:
.C("func_test")
The value of i is: 0
The value of i is: 1
The value of i is: 2
The value of i is: 3
The value of i is: 4
list()
Then I tried the direct method suggested by Jan, using the Rcpp package:
library("Rcpp")
cppFunction("
NumericVector addOneToVector(NumericVector vector) {
int n = vector.size();
for (int i = 0; i < n; ++i)
vector[i] = vector[i] + 1.0;
return vector;
}")
# Test code to test the function
addOneToVector(c(1,2,3))
Both methods worked superbly. I can now start writing functions in C or C++ and use them in R
Thank you once again!
So I am trying to make a program that finds the factorial using def.
Changing this:
print("Please enter a number greater than or equal to 0: ")
x = int(input())
f = 1
for n in range(2, x + 1):
    f = f * n
print(x, ' factorial is ', f)
to something that uses def.
Maybe
def intro()
blah blah
def main()
blah
main()
Not entirely sure what you are asking. As I understand your question, you want to refactor your script so that the calculation of the factorial is a function. If so, just try this:
def factorial(x):  # define factorial as a function
    f = 1
    for n in range(2, x + 1):
        f = f * n
    return f

def main():  # define another function for user input
    x = int(input("Please enter a number greater than or equal to 0: "))
    f = factorial(x)  # call your factorial function
    print(x, 'factorial is', f)

if __name__ == "__main__":  # not executed when imported in another script
    main()  # call your main function
This will define a factorial function and a main function. The if block at the bottom will execute the main function, but only if the script is interpreted directly:
~> python3 test.py
Please enter a number greater than or equal to 0: 4
4 factorial is 24
Alternatively, you can import your script into another script or an interactive session. This way it will not execute the main function, but you can call both functions as you like.
~> python3
>>> import test
>>> test.factorial(4)
24
def factorial(n):  # Define a function that takes one parameter
    fact = 1  # Declare a variable fact and set its initial value to 1
    for i in range(1, n + 1, 1):  # Loop over 1..n
        fact = fact * i
    print(fact)  # Print the value of fact (you can also use "return")

factorial(5)  # Call the function with an argument; prints 120
You can pass any non-negative integer as n to get its factorial.