C function of R libraries, acf function [duplicate] - c

I want to look at the source code for a function to see how it works. I know I can print a function by typing its name at the prompt:
> t
function (x)
UseMethod("t")
<bytecode: 0x2332948>
<environment: namespace:base>
In this case, what does UseMethod("t") mean? How do I find the source code that's actually being used by, for example: t(1:10)?
Is there a difference between when I see UseMethod and when I see standardGeneric and showMethods, as with with?
> with
standardGeneric for "with" defined from package "base"
function (data, expr, ...)
standardGeneric("with")
<bytecode: 0x102fb3fc0>
<environment: 0x102fab988>
Methods may be defined for arguments: data
Use showMethods("with") for currently available ones.
In other cases, I can see that R functions are being called, but I can't find the source code for those functions.
> ts.union
function (..., dframe = FALSE)
.cbind.ts(list(...), .makeNamesTs(...), dframe = dframe, union = TRUE)
<bytecode: 0x36fbf88>
<environment: namespace:stats>
> .cbindts
Error: object '.cbindts' not found
> .makeNamesTs
Error: object '.makeNamesTs' not found
How do I find functions like .cbindts and .makeNamesTs?
In still other cases, there's a bit of R code, but most of work seems to be done somewhere else.
> matrix
function (data = NA, nrow = 1, ncol = 1, byrow = FALSE, dimnames = NULL)
{
if (is.object(data) || !is.atomic(data))
data <- as.vector(data)
.Internal(matrix(data, nrow, ncol, byrow, dimnames, missing(nrow),
missing(ncol)))
}
<bytecode: 0x134bd10>
<environment: namespace:base>
> .Internal
function (call) .Primitive(".Internal")
> .Primitive
function (name) .Primitive(".Primitive")
How do I find out what the .Primitive function does? Similarly, some functions call .C, .Call, .Fortran, .External, or .Internal. How can I find the source code for those?

UseMethod("t") is telling you that t() is a (S3) generic function that has methods for different object classes.
The S3 method dispatch system
For S3 classes, you can use the methods function to list the methods for a particular generic function or class.
> methods(t)
[1] t.data.frame t.default t.ts*
Non-visible functions are asterisked
> methods(class="ts")
[1] aggregate.ts as.data.frame.ts cbind.ts* cycle.ts*
[5] diffinv.ts* diff.ts kernapply.ts* lines.ts
[9] monthplot.ts* na.omit.ts* Ops.ts* plot.ts
[13] print.ts time.ts* [<-.ts* [.ts*
[17] t.ts* window<-.ts* window.ts*
Non-visible functions are asterisked
"Non-visible functions are asterisked" means the function is not exported from its package's namespace. You can still view its source code via the ::: function (i.e. stats:::t.ts), or by using getAnywhere(). getAnywhere() is useful because you don't have to know which package the function came from.
> getAnywhere(t.ts)
A single object matching ‘t.ts’ was found
It was found in the following places
registered S3 method for t from namespace stats
namespace:stats
with value
function (x)
{
cl <- oldClass(x)
other <- !(cl %in% c("ts", "mts"))
class(x) <- if (any(other))
cl[other]
attr(x, "tsp") <- NULL
t(x)
}
<bytecode: 0x294e410>
<environment: namespace:stats>
The S4 method dispatch system
The S4 system is a newer method dispatch system and is an alternative to the S3 system. Here is an example of an S4 function:
> library(Matrix)
Loading required package: lattice
> chol2inv
standardGeneric for "chol2inv" defined from package "base"
function (x, ...)
standardGeneric("chol2inv")
<bytecode: 0x000000000eafd790>
<environment: 0x000000000eb06f10>
Methods may be defined for arguments: x
Use showMethods("chol2inv") for currently available ones.
The output already offers a lot of information. standardGeneric is an indicator of an S4 function. The method to see defined S4 methods is offered helpfully:
> showMethods(chol2inv)
Function: chol2inv (package base)
x="ANY"
x="CHMfactor"
x="denseMatrix"
x="diagonalMatrix"
x="dtrMatrix"
x="sparseMatrix"
getMethod can be used to see the source code of one of the methods:
> getMethod("chol2inv", "diagonalMatrix")
Method Definition:
function (x, ...)
{
chk.s(...)
tcrossprod(solve(x))
}
<bytecode: 0x000000000ea2cc70>
<environment: namespace:Matrix>
Signatures:
x
target "diagonalMatrix"
defined "diagonalMatrix"
There are also methods with more complex signatures for each method, for example
require(raster)
showMethods(extract)
Function: extract (package raster)
x="Raster", y="data.frame"
x="Raster", y="Extent"
x="Raster", y="matrix"
x="Raster", y="SpatialLines"
x="Raster", y="SpatialPoints"
x="Raster", y="SpatialPolygons"
x="Raster", y="vector"
To see the source code for one of these methods the entire signature must be supplied, e.g.
getMethod("extract" , signature = c( x = "Raster" , y = "SpatialPolygons") )
It will not suffice to supply the partial signature
getMethod("extract",signature="SpatialPolygons")
#Error in getMethod("extract", signature = "SpatialPolygons") :
# No method found for function "extract" and signature SpatialPolygons
Functions that call unexported functions
In the case of ts.union, .cbindts and .makeNamesTs are unexported functions from the stats namespace. You can view the source code of unexported functions by using the ::: operator or getAnywhere.
> stats:::.makeNamesTs
function (...)
{
l <- as.list(substitute(list(...)))[-1L]
nm <- names(l)
fixup <- if (is.null(nm))
seq_along(l)
else nm == ""
dep <- sapply(l[fixup], function(x) deparse(x)[1L])
if (is.null(nm))
return(dep)
if (any(fixup))
nm[fixup] <- dep
nm
}
<bytecode: 0x38140d0>
<environment: namespace:stats>
Functions that call compiled code
Note that "compiled" does not refer to byte-compiled R code as created by the compiler package. The <bytecode: 0x294e410> line in the above output indicates that the function is byte-compiled, and you can still view the source from the R command line.
Functions that call .C, .Call, .Fortran, .External, .Internal, or .Primitive are calling entry points in compiled code, so you will have to look at sources of the compiled code if you want to fully understand the function. This GitHub mirror of the R source code is a decent place to start. The function pryr::show_c_source can be a useful tool as it will take you directly to a GitHub page for .Internal and .Primitive calls. Packages may use .C, .Call, .Fortran, and .External; but not .Internal or .Primitive, because these are used to call functions built into the R interpreter.
Calls to some of the above functions may use an object instead of a character string to reference the compiled function. In those cases, the object is of class "NativeSymbolInfo", "RegisteredNativeSymbol", or "NativeSymbol"; and printing the object yields useful information. For example, optim calls .External2(C_optimhess, res$par, fn1, gr1, con) (note that's C_optimhess, not "C_optimhess"). optim is in the stats package, so you can type stats:::C_optimhess to see information about the compiled function being called.
Compiled code in a package
If you want to view compiled code in a package, you will need to download/unpack the package source. The installed binaries are not sufficient. A package's source code is available from the same CRAN (or CRAN compatible) repository that the package was originally installed from. The download.packages() function can get the package source for you.
download.packages(pkgs = "Matrix",
destdir = ".",
type = "source")
This will download the source version of the Matrix package and save the corresponding .tar.gz file in the current directory. Source code for compiled functions can be found in the src directory of the uncompressed and untared file. The uncompressing and untaring step can be done outside of R, or from within R using the untar() function. It is possible to combine the download and expansion step into a single call (note that only one package at a time can be downloaded and unpacked in this way):
untar(download.packages(pkgs = "Matrix",
destdir = ".",
type = "source")[,2])
Alternatively, if the package development is hosted publicly (e.g. via GitHub, R-Forge, or RForge.net), you can probably browse the source code online.
Compiled code in a base package
Certain packages are considered "base" packages. These packages ship with R and their version is locked to the version of R. Examples include base, compiler, stats, and utils. As such, they are not available as separate downloadable packages on CRAN as described above. Rather, they are part of the R source tree in individual package directories under /src/library/. How to access the R source is described in the next section.
Compiled code built into the R interpreter
If you want to view the code built-in to the R interpreter, you will need to download/unpack the R sources; or you can view the sources online via the R Subversion repository or Winston Chang's github mirror.
Uwe Ligges's R news article (PDF) (p. 43) is a good general reference of how to view the source code for .Internal and .Primitive functions. The basic steps are to first look for the function name in src/main/names.c and then search for the "C-entry" name in the files in src/main/*.

In addition to the other answers on this question and its duplicates, here's a good way to get source code for a package function without needing to know which package it's in.
e.g. say if we want the source for randomForest::rfcv():
To view/edit it in a pop-up window:
edit(getAnywhere('rfcv'), file='source_rfcv.r')
View(getAnywhere('rfcv'), file='source_rfcv.r')
Note that edit() opens a text editor (of user's choice), whereas
View() invokes a spreadsheet-style data viewer.
View() is great for browsing (multi-columnar) data, but usually terrible for code of anything other than toy length.
so when only want to view code, edit() is IMO actually far better than View(), since with edit() you can collapse/hide/dummy out all the arg-parsing/checking/default/error-message logic which can take up to 70% of an R function, and just get to the part where the function actually operationally does something(!), what type(s) of objects its return type is, whether and how it recurses, etc.
To redirect to a separate file (so you can bring up the code in your favorite IDE/editor/process it with grep/etc.):
capture.output(getAnywhere('rfcv'), file='source_rfcv.r')

For non-primitive functions, R base includes a function called body() that returns the body of function. For instance the source of the print.Date() function can be viewed:
body(print.Date)
will produce this:
{
if (is.null(max))
max <- getOption("max.print", 9999L)
if (max < length(x)) {
print(format(x[seq_len(max)]), max = max, ...)
cat(" [ reached getOption(\"max.print\") -- omitted",
length(x) - max, "entries ]\n")
}
else print(format(x), max = max, ...)
invisible(x)
}
If you are working in a script and want the function code as a character vector, you can get it.
capture.output(print(body(print.Date)))
will get you:
[1] "{"
[2] " if (is.null(max)) "
[3] " max <- getOption(\"max.print\", 9999L)"
[4] " if (max < length(x)) {"
[5] " print(format(x[seq_len(max)]), max = max, ...)"
[6] " cat(\" [ reached getOption(\\\"max.print\\\") -- omitted\", "
[7] " length(x) - max, \"entries ]\\n\")"
[8] " }"
[9] " else print(format(x), max = max, ...)"
[10] " invisible(x)"
[11] "}"
Why would I want to do such a thing? I was creating a custom S3 object (x, where class(x) = "foo") based on a list. One of the list members (named "fun") was a function and I wanted print.foo() to display the function source code, indented. So I ended up with the following snippet in print.foo():
sourceVector = capture.output(print(body(x[["fun"]])))
cat(paste0(" ", sourceVector, "\n"))
which indents and displays the code associated with x[["fun"]].
Edit 2020-12-31
A less circuitous way to get the same character vector of source code is:
sourceVector = deparse(body(x$fun))

It gets revealed when you debug using the debug() function.
Suppose you want to see the underlying code in t() transpose function. Just typing 't', doesn't reveal much.
>t
function (x)
UseMethod("t")
<bytecode: 0x000000003085c010>
<environment: namespace:base>
But, Using the 'debug(functionName)', it reveals the underlying code, sans the internals.
> debug(t)
> t(co2)
debugging in: t(co2)
debug: UseMethod("t")
Browse[2]>
debugging in: t.ts(co2)
debug: {
cl <- oldClass(x)
other <- !(cl %in% c("ts", "mts"))
class(x) <- if (any(other))
cl[other]
attr(x, "tsp") <- NULL
t(x)
}
Browse[3]>
debug: cl <- oldClass(x)
Browse[3]>
debug: other <- !(cl %in% c("ts", "mts"))
Browse[3]>
debug: class(x) <- if (any(other)) cl[other]
Browse[3]>
debug: attr(x, "tsp") <- NULL
Browse[3]>
debug: t(x)
EDIT:
debugonce() accomplishes the same without having to use undebug()

Didn't see how this fit into the flow of the main answer but it stumped me for a while so I'm adding it here:
Infix Operators
To see the source code of some base infix operators (e.g., %%, %*%, %in%), use getAnywhere, e.g.:
getAnywhere("%%")
# A single object matching ‘%%’ was found
# It was found in the following places
# package:base
# namespace:base
# with value
#
# function (e1, e2) .Primitive("%%")
The main answer covers how to then use mirrors to dig deeper.

In RStudio, there are (at least) 3 ways:
Press the F2 key while cursor is on any function.
Click on the function name while holding
Ctrl or Command
View(function_name) (as stated above)
A new pane will open with the source code. If you reach .Primitive or .C you'll need another method, sorry.

There is a very handy function in R edit
new_optim <- edit(optim)
It will open the source code of optim using the editor specified in R's options, and then you can edit it and assign the modified function to new_optim. I like this function very much to view code or to debug the code, e.g, print some messages or variables or even assign them to a global variables for further investigation (of course you can use debug).
If you just want to view the source code and don't want the annoying long source code printed on your console, you can use
invisible(edit(optim))
Clearly, this cannot be used to view C/C++ or Fortran source code.
BTW, edit can open other objects like list, matrix, etc, which then shows the data structure with attributes as well. Function de can be used to open an excel like editor (if GUI supports it) to modify matrix or data frame and return the new one. This is handy sometimes, but should be avoided in usual case, especially when you matrix is big.

As long as the function is written in pure R not C/C++/Fortran, one may use the following. Otherwise the best way is debugging and using "jump into":
> functionBody(functionName)

View(function_name) - eg. View(mean) Make sure to use uppercase [V]. The read-only code will open in the editor.

You can also try to use print.function(), which is S3 generic, to get the function write in the console.

To see the source code of the function use print()
f <- function(x){
x * 2
}
print(f)
function(x){
x * 2
}

First, try running the function without ()
Example: let's get the source code for the cat() function:
cat
if (is.character(file))
if (file == "")
file <- stdout()
else if (startsWith(file, "|")) {
file <- pipe(substring(file, 2L), "w")
on.exit(close(file))
}
else {
file <- file(file, ifelse(append, "a", "w"))
on.exit(close(file))
}
.Internal(cat(list(...), file, sep, fill, labels, append))
But sometimes that will return "UseMethod" instead of source code
If we try to get the source code for read_xml():
library(xml2)
read_xml
# UseMethod("read_xml")
That's not much use to us! In this case, take a look at the methods:
methods("read_xml")
# read_xml.character* read_xml.connection* read_xml.raw* read_xml.response*
And use getAnywhere on the value(s) above to see the source code:
getAnywhere("read_xml.character")
Another example
Let's try to see the source code for qqnorm():
qqnorm
# UseMethod("qqnorm") # Not very useful
methods("qqnorm")
# [1] qqnorm.default* # Making progress...
getAnywhere("qqnorm.default") # Shows source code!

A quick solution for the R-plugin in PyCharm (for RStudio, see the answer by #Arthur Yip).
Type, if necessary, and select the function name in the Editor or R Console. Then "Go to Declaration" or use shortcuts CTRL-B or Command-B.
Note: this is not useful for .Primitive (ie, internally implemented) functions.

Related

Package-qualified names. Differences (if any) between Package::<&var> vs &Package::var?

Reading through https://docs.perl6.org/language/packages#Package-qualified_names it outlines qualifying package variables with this syntax:
Foo::Bar::<$quux>; #..as an alternative to Foo::Bar::quux;
For reference the package structure used as the example in the document is:
class Foo {
sub zape () { say "zipi" }
class Bar {
method baz () { return 'Þor is mighty' }
our &zape = { "zipi" }; #this is the variable I want to resolve
our $quux = 42;
}
}
The same page states this style of qualification doesn't work to access &zape in the Foo::Bar package listed above:
(This does not work with the &zape variable)
Yet, if I try:
Foo::Bar::<&zape>; # instead of &Foo::Bar::zape;
it is resolves just fine.
Have I misinterpreted the document or completely missed the point being made? What would be the logic behind it 'not working' with code reference variables vs a scalar for example?
I'm not aware of differences, but Foo::Bar::<&zape> can also be modified to use {} instead of <>, which then can be used with something other than literals, like this:
my $name = '&zape';
Foo::Bar::{$name}()
or
my $name = 'zape';
&Foo::Bar::{$name}()
JJ and Moritz have provided useful answers.
This nanswer is a whole nother ball of wax. I've written and discarded several nanswers to your question over the last few days. None have been very useful. I'm not sure this is either but I've decided I've finally got a first version of something worth publishing, regardless of its current usefulness.
In this first installment my nanswer is just a series of observations and questions. I also hope to add an explanation of my observations based on what I glean from spelunking the compiler's code to understand what we see. (For now I've just written up the start of that process as the second half of this nanswer.)
Differences (if any) between Package::<&var> vs &Package::var?
They're fundamentally different syntax. They're not fully interchangeable in where you can write them. They result in different evaluations. Their result can be different things.
Let's step thru lots of variations drawing out the differences.
say Package::<&var>; # compile-time error: Undeclared name: Package
So, forget the ::<...> bit for a moment. P6 is looking at that Package bit and demanding that it be an already declared name. That seems simple enough.
say &Package::var; # (Any)
Quite a difference! For some reason, for this second syntax, P6 has no problem with those two arbitrary names (Package and var) not having been declared. Who knows what it's doing with the &. And why is it (Any) and not (Callable) or Nil?
Let's try declaring these things. First:
my Package::<&var> = { 42 } # compile-time error: Type 'Package' is not declared
OK. But if we declare Package things don't really improve:
package Package {}
my Package::<&var> = { 42 } # compile-time error: Malformed my
OK, start with a clean slate again, without the package declaration. What about the other syntax?:
my &Package::var = { 42 }
Yay. P6 accepts this code. Now, for the next few lines we'll assume the declaration above. What about:
say &Package::var(); # 42
\o/ So can we use the other syntax?:
say Package::<&var>(); # compile-time error: Undeclared name: Package
Nope. It seems like the my didn't declare a Package with a &var in it. Maybe it declared a &Package::var, where the :: just happens to be part of the name but isn't about packages? P6 supports a bunch of "pseudo" packages. One of them is LEXICAL:
say LEXICAL::; # PseudoStash.new(... &Package::var => (Callable) ...
Bingo. Or is it?
say LEXICAL::<&Package::var>(); # Cannot invoke this object
# (REPR: Uninstantiable; Callable)
What happened to our { 42 }?
Hmm. Let's start from a clean slate and create &Package::var in a completely different way:
package Package { our sub var { 99 } }
say &Package::var(); # 99
say Package::<&var>(); # 99
Wow. Now, assuming those lines above and trying to add more:
my Package::<&var> = { 42 } # Compile-time error: Malformed my
That was to be expected given our previous attempt above. What about:
my &Package::var = { 42 } # Cannot modify an immutable Sub (&var)
Is it all making sense now? ;)
Spelunking the compiler code, checking the grammar
1 I spent a long time trying to work out what the deal really is before looking at the source code of the Rakudo compiler. This is a footnote covering my initial compiler spelunking. I hope to continue it tomorrow and turn this nanswer into an answer this weekend.
The good news is it's just P6 code -- most of Rakudo is written in P6.
The bad news is knowing where to look. You might see the doc directory and then the compiler overview. But then you'll notice the overview doc has barely been touched since 2010! Don't bother. Perhaps Andrew Shitov's "internals" posts will help orient you? Moving on...
In this case what I am interested in is understanding the precise nature of the Package::<&var> and &Package::var forms of syntax. When I type "syntax" into GH's repo search field the second file listed is the Perl 6 Grammar. Bingo.
Now comes the ugly news. The Perl 6 Grammar file is 6K LOC and looks super intimidating. But I find it all makes sense when I keep my cool.
Next, I'm wondering what to search for on the page. :: nets 600+ matches. Hmm. ::< is just 1, but it is in an error message. But in what? In token morename. Looking at that I can see it's likely not relevant. But the '::' near the start of the token is just the ticket. Searching the page for '::' yields 10 matches. The first 4 (from the start of the file) are more error messages. The next two are in the above morename token. 4 matches left.
The next one appears a quarter way thru token term:sym<name>. A "name". .oO ( Undeclared name: Package So maybe this is relevant? )
Next, token typename. A "typename". .oO ( Type 'Package' is not declared So maybe this is relevant too? )
token methodop. Definitely not relevant.
Finally token infix:sym<?? !!>. Nope.
There are no differences between Package::<&var> and &Package::var.
package Foo { our $var = "Bar" };
say $Foo::var === Foo::<$var>; # OUTPUT: «True␤»
Ditto for subs (of course):
package Foo { our &zape = { "Bar" } };
say &Foo::zape === Foo::<&zape>;# OUTPUT: «True␤»
What the documentation (somewhat confusingly) is trying to say is that package-scope variables can only be accessed if declared using our. There are two zapes, one of them has got lexical scope (subs get lexical scope by default), so you can't access that one. I have raised this issue in the doc repo and will try to fix it as soon as possible.

Julia Concatenating dataframes in loop

I have files in folder that I want to read all at once, bind and make new dataframe.
Code:
using CSV, DataFrames
path="somepath"
files=readdir(path)
d=DataFrame()
for file=files
x=CSV.read(file)
d=vcat(d,x)
end
Produces:
Error: UndefVarError: d not defined
You can use append! in such cases (change allowing this approach is on master but is not released yet):
d=DataFrame()
for file=files
append!(d, CSV.read(file))
end
or, if you want to use vcat (this option will use a bit more memory):
reduce(vcat, [CSV.read(file) for file in files])
The original code should be rewritten as:
d=DataFrame()
for file=files
x=CSV.read(file)
global d=vcat(d,x)
end
(note global in front of d) but this is not a recommended way to perform this operation.

Finding C source code for cor() function in R stats package [duplicate]

I want to look at the source code for a function to see how it works. I know I can print a function by typing its name at the prompt:
> t
function (x)
UseMethod("t")
<bytecode: 0x2332948>
<environment: namespace:base>
In this case, what does UseMethod("t") mean? How do I find the source code that's actually being used by, for example: t(1:10)?
Is there a difference between when I see UseMethod and when I see standardGeneric and showMethods, as with with?
> with
standardGeneric for "with" defined from package "base"
function (data, expr, ...)
standardGeneric("with")
<bytecode: 0x102fb3fc0>
<environment: 0x102fab988>
Methods may be defined for arguments: data
Use showMethods("with") for currently available ones.
In other cases, I can see that R functions are being called, but I can't find the source code for those functions.
> ts.union
function (..., dframe = FALSE)
.cbind.ts(list(...), .makeNamesTs(...), dframe = dframe, union = TRUE)
<bytecode: 0x36fbf88>
<environment: namespace:stats>
> .cbindts
Error: object '.cbindts' not found
> .makeNamesTs
Error: object '.makeNamesTs' not found
How do I find functions like .cbindts and .makeNamesTs?
In still other cases, there's a bit of R code, but most of work seems to be done somewhere else.
> matrix
function (data = NA, nrow = 1, ncol = 1, byrow = FALSE, dimnames = NULL)
{
if (is.object(data) || !is.atomic(data))
data <- as.vector(data)
.Internal(matrix(data, nrow, ncol, byrow, dimnames, missing(nrow),
missing(ncol)))
}
<bytecode: 0x134bd10>
<environment: namespace:base>
> .Internal
function (call) .Primitive(".Internal")
> .Primitive
function (name) .Primitive(".Primitive")
How do I find out what the .Primitive function does? Similarly, some functions call .C, .Call, .Fortran, .External, or .Internal. How can I find the source code for those?
UseMethod("t") is telling you that t() is a (S3) generic function that has methods for different object classes.
The S3 method dispatch system
For S3 classes, you can use the methods function to list the methods for a particular generic function or class.
> methods(t)
[1] t.data.frame t.default t.ts*
Non-visible functions are asterisked
> methods(class="ts")
[1] aggregate.ts as.data.frame.ts cbind.ts* cycle.ts*
[5] diffinv.ts* diff.ts kernapply.ts* lines.ts
[9] monthplot.ts* na.omit.ts* Ops.ts* plot.ts
[13] print.ts time.ts* [<-.ts* [.ts*
[17] t.ts* window<-.ts* window.ts*
Non-visible functions are asterisked
"Non-visible functions are asterisked" means the function is not exported from its package's namespace. You can still view its source code via the ::: function (i.e. stats:::t.ts), or by using getAnywhere(). getAnywhere() is useful because you don't have to know which package the function came from.
> getAnywhere(t.ts)
A single object matching ‘t.ts’ was found
It was found in the following places
registered S3 method for t from namespace stats
namespace:stats
with value
function (x)
{
cl <- oldClass(x)
other <- !(cl %in% c("ts", "mts"))
class(x) <- if (any(other))
cl[other]
attr(x, "tsp") <- NULL
t(x)
}
<bytecode: 0x294e410>
<environment: namespace:stats>
The S4 method dispatch system
The S4 system is a newer method dispatch system and is an alternative to the S3 system. Here is an example of an S4 function:
> library(Matrix)
Loading required package: lattice
> chol2inv
standardGeneric for "chol2inv" defined from package "base"
function (x, ...)
standardGeneric("chol2inv")
<bytecode: 0x000000000eafd790>
<environment: 0x000000000eb06f10>
Methods may be defined for arguments: x
Use showMethods("chol2inv") for currently available ones.
The output already offers a lot of information. standardGeneric is an indicator of an S4 function. The method to see defined S4 methods is offered helpfully:
> showMethods(chol2inv)
Function: chol2inv (package base)
x="ANY"
x="CHMfactor"
x="denseMatrix"
x="diagonalMatrix"
x="dtrMatrix"
x="sparseMatrix"
getMethod can be used to see the source code of one of the methods:
> getMethod("chol2inv", "diagonalMatrix")
Method Definition:
function (x, ...)
{
chk.s(...)
tcrossprod(solve(x))
}
<bytecode: 0x000000000ea2cc70>
<environment: namespace:Matrix>
Signatures:
x
target "diagonalMatrix"
defined "diagonalMatrix"
There are also methods with more complex signatures for each method, for example
require(raster)
showMethods(extract)
Function: extract (package raster)
x="Raster", y="data.frame"
x="Raster", y="Extent"
x="Raster", y="matrix"
x="Raster", y="SpatialLines"
x="Raster", y="SpatialPoints"
x="Raster", y="SpatialPolygons"
x="Raster", y="vector"
To see the source code for one of these methods the entire signature must be supplied, e.g.
getMethod("extract" , signature = c( x = "Raster" , y = "SpatialPolygons") )
It will not suffice to supply the partial signature
getMethod("extract",signature="SpatialPolygons")
#Error in getMethod("extract", signature = "SpatialPolygons") :
# No method found for function "extract" and signature SpatialPolygons
Functions that call unexported functions
In the case of ts.union, .cbindts and .makeNamesTs are unexported functions from the stats namespace. You can view the source code of unexported functions by using the ::: operator or getAnywhere.
> stats:::.makeNamesTs
function (...)
{
l <- as.list(substitute(list(...)))[-1L]
nm <- names(l)
fixup <- if (is.null(nm))
seq_along(l)
else nm == ""
dep <- sapply(l[fixup], function(x) deparse(x)[1L])
if (is.null(nm))
return(dep)
if (any(fixup))
nm[fixup] <- dep
nm
}
<bytecode: 0x38140d0>
<environment: namespace:stats>
Functions that call compiled code
Note that "compiled" does not refer to byte-compiled R code as created by the compiler package. The <bytecode: 0x294e410> line in the above output indicates that the function is byte-compiled, and you can still view the source from the R command line.
Functions that call .C, .Call, .Fortran, .External, .Internal, or .Primitive are calling entry points in compiled code, so you will have to look at sources of the compiled code if you want to fully understand the function. This GitHub mirror of the R source code is a decent place to start. The function pryr::show_c_source can be a useful tool as it will take you directly to a GitHub page for .Internal and .Primitive calls. Packages may use .C, .Call, .Fortran, and .External; but not .Internal or .Primitive, because these are used to call functions built into the R interpreter.
Calls to some of the above functions may use an object instead of a character string to reference the compiled function. In those cases, the object is of class "NativeSymbolInfo", "RegisteredNativeSymbol", or "NativeSymbol"; and printing the object yields useful information. For example, optim calls .External2(C_optimhess, res$par, fn1, gr1, con) (note that's C_optimhess, not "C_optimhess"). optim is in the stats package, so you can type stats:::C_optimhess to see information about the compiled function being called.
Compiled code in a package
If you want to view compiled code in a package, you will need to download/unpack the package source. The installed binaries are not sufficient. A package's source code is available from the same CRAN (or CRAN compatible) repository that the package was originally installed from. The download.packages() function can get the package source for you.
download.packages(pkgs = "Matrix",
destdir = ".",
type = "source")
This will download the source version of the Matrix package and save the corresponding .tar.gz file in the current directory. Source code for compiled functions can be found in the src directory of the uncompressed and untared file. The uncompressing and untaring step can be done outside of R, or from within R using the untar() function. It is possible to combine the download and expansion step into a single call (note that only one package at a time can be downloaded and unpacked in this way):
untar(download.packages(pkgs = "Matrix",
destdir = ".",
type = "source")[,2])
Alternatively, if the package development is hosted publicly (e.g. via GitHub, R-Forge, or RForge.net), you can probably browse the source code online.
Compiled code in a base package
Certain packages are considered "base" packages. These packages ship with R and their version is locked to the version of R. Examples include base, compiler, stats, and utils. As such, they are not available as separate downloadable packages on CRAN as described above. Rather, they are part of the R source tree in individual package directories under /src/library/. How to access the R source is described in the next section.
Compiled code built into the R interpreter
If you want to view the code built-in to the R interpreter, you will need to download/unpack the R sources; or you can view the sources online via the R Subversion repository or Winston Chang's github mirror.
Uwe Ligges's R news article (PDF) (p. 43) is a good general reference of how to view the source code for .Internal and .Primitive functions. The basic steps are to first look for the function name in src/main/names.c and then search for the "C-entry" name in the files in src/main/*.
In addition to the other answers on this question and its duplicates, here's a good way to get source code for a package function without needing to know which package it's in.
e.g. say if we want the source for randomForest::rfcv():
To view/edit it in a pop-up window:
edit(getAnywhere('rfcv'), file='source_rfcv.r')
View(getAnywhere('rfcv'), file='source_rfcv.r')
Note that edit() opens a text editor (of user's choice), whereas
View() invokes a spreadsheet-style data viewer.
View() is great for browsing (multi-columnar) data, but usually terrible for code of anything other than toy length.
so when only want to view code, edit() is IMO actually far better than View(), since with edit() you can collapse/hide/dummy out all the arg-parsing/checking/default/error-message logic which can take up to 70% of an R function, and just get to the part where the function actually operationally does something(!), what type(s) of objects its return type is, whether and how it recurses, etc.
To redirect to a separate file (so you can bring up the code in your favorite IDE/editor/process it with grep/etc.):
capture.output(getAnywhere('rfcv'), file='source_rfcv.r')
For non-primitive functions, R base includes a function called body() that returns the body of function. For instance the source of the print.Date() function can be viewed:
body(print.Date)
will produce this:
{
if (is.null(max))
max <- getOption("max.print", 9999L)
if (max < length(x)) {
print(format(x[seq_len(max)]), max = max, ...)
cat(" [ reached getOption(\"max.print\") -- omitted",
length(x) - max, "entries ]\n")
}
else print(format(x), max = max, ...)
invisible(x)
}
If you are working in a script and want the function code as a character vector, you can get it.
capture.output(print(body(print.Date)))
will get you:
[1] "{"
[2] " if (is.null(max)) "
[3] " max <- getOption(\"max.print\", 9999L)"
[4] " if (max < length(x)) {"
[5] " print(format(x[seq_len(max)]), max = max, ...)"
[6] " cat(\" [ reached getOption(\\\"max.print\\\") -- omitted\", "
[7] " length(x) - max, \"entries ]\\n\")"
[8] " }"
[9] " else print(format(x), max = max, ...)"
[10] " invisible(x)"
[11] "}"
Why would I want to do such a thing? I was creating a custom S3 object (x, where class(x) = "foo") based on a list. One of the list members (named "fun") was a function and I wanted print.foo() to display the function source code, indented. So I ended up with the following snippet in print.foo():
sourceVector = capture.output(print(body(x[["fun"]])))
cat(paste0(" ", sourceVector, "\n"))
which indents and displays the code associated with x[["fun"]].
Edit 2020-12-31
A less circuitous way to get the same character vector of source code is:
sourceVector = deparse(body(x$fun))
It gets revealed when you debug using the debug() function.
Suppose you want to see the underlying code in t() transpose function. Just typing 't', doesn't reveal much.
>t
function (x)
UseMethod("t")
<bytecode: 0x000000003085c010>
<environment: namespace:base>
But, Using the 'debug(functionName)', it reveals the underlying code, sans the internals.
> debug(t)
> t(co2)
debugging in: t(co2)
debug: UseMethod("t")
Browse[2]>
debugging in: t.ts(co2)
debug: {
cl <- oldClass(x)
other <- !(cl %in% c("ts", "mts"))
class(x) <- if (any(other))
cl[other]
attr(x, "tsp") <- NULL
t(x)
}
Browse[3]>
debug: cl <- oldClass(x)
Browse[3]>
debug: other <- !(cl %in% c("ts", "mts"))
Browse[3]>
debug: class(x) <- if (any(other)) cl[other]
Browse[3]>
debug: attr(x, "tsp") <- NULL
Browse[3]>
debug: t(x)
EDIT:
debugonce() accomplishes the same without having to use undebug()
Didn't see how this fit into the flow of the main answer but it stumped me for a while so I'm adding it here:
Infix Operators
To see the source code of some base infix operators (e.g., %%, %*%, %in%), use getAnywhere, e.g.:
getAnywhere("%%")
# A single object matching ‘%%’ was found
# It was found in the following places
# package:base
# namespace:base
# with value
#
# function (e1, e2) .Primitive("%%")
The main answer covers how to then use mirrors to dig deeper.
In RStudio, there are (at least) 3 ways:
Press the F2 key while cursor is on any function.
Click on the function name while holding
Ctrl or Command
View(function_name) (as stated above)
A new pane will open with the source code. If you reach .Primitive or .C you'll need another method, sorry.
There is a very handy function in R edit
new_optim <- edit(optim)
It will open the source code of optim using the editor specified in R's options, and then you can edit it and assign the modified function to new_optim. I like this function very much to view code or to debug the code, e.g, print some messages or variables or even assign them to a global variables for further investigation (of course you can use debug).
If you just want to view the source code and don't want the annoying long source code printed on your console, you can use
invisible(edit(optim))
Clearly, this cannot be used to view C/C++ or Fortran source code.
BTW, edit can open other objects like list, matrix, etc, which then shows the data structure with attributes as well. Function de can be used to open an excel like editor (if GUI supports it) to modify matrix or data frame and return the new one. This is handy sometimes, but should be avoided in usual case, especially when you matrix is big.
As long as the function is written in pure R not C/C++/Fortran, one may use the following. Otherwise the best way is debugging and using "jump into":
> functionBody(functionName)
View(function_name) - eg. View(mean) Make sure to use uppercase [V]. The read-only code will open in the editor.
You can also try to use print.function(), which is S3 generic, to get the function write in the console.
To see the source code of the function use print()
f <- function(x){
x * 2
}
print(f)
function(x){
x * 2
}
First, try running the function without ()
Example: let's get the source code for the cat() function:
cat
if (is.character(file))
if (file == "")
file <- stdout()
else if (startsWith(file, "|")) {
file <- pipe(substring(file, 2L), "w")
on.exit(close(file))
}
else {
file <- file(file, ifelse(append, "a", "w"))
on.exit(close(file))
}
.Internal(cat(list(...), file, sep, fill, labels, append))
But sometimes that will return "UseMethod" instead of source code
If we try to get the source code for read_xml():
library(xml2)
read_xml
# UseMethod("read_xml")
That's not much use to us! In this case, take a look at the methods:
methods("read_xml")
# read_xml.character* read_xml.connection* read_xml.raw* read_xml.response*
And use getAnywhere on the value(s) above to see the source code:
getAnywhere("read_xml.character")
Another example
Let's try to see the source code for qqnorm():
qqnorm
# UseMethod("qqnorm") # Not very useful
methods("qqnorm")
# [1] qqnorm.default* # Making progress...
getAnywhere("qqnorm.default") # Shows source code!
A quick solution for the R-plugin in PyCharm (for RStudio, see the answer by #Arthur Yip).
Type, if necessary, and select the function name in the Editor or R Console. Then "Go to Declaration" or use shortcuts CTRL-B or Command-B.
Note: this is not useful for .Primitive (ie, internally implemented) functions.

R - C interface: extracting an object from an environment

Say that you pass an environment R object to an internal C routine through the .Call interface. Said enviromnent has (by design) a someObject object which I want to extract and manipulate from the C side. How to do it?
To simplify my question, I just want to write a C function that returns someObject. Like this:
en <- new.env()
en$someObject <- someValue
.Call("extractObject",en)
#the above should return en$someObject
Guess the C code should then look something like
SEXP extractObject(SEXP env) {
return SOMEMACROORFUNCTION(env, "someObject");
}
Unfortunately, I was not able to find the real SOMEMACROORFUNCTION.
After a little bit of googling and searching, I've found the solution:
findVar(install("someObject"),env)
in the C code is basically the equivalent of get("someObject",env) in R.

SWIG R wrapper not setting class properly/Create R object from C memory pointer

I have a SWIG generated R wrapper which contains the following setClass operations:
setClass('_p_f_p_struct_parameters_p_struct_chromosome_p_struct_dataSet__double',
prototype = list(parameterTypes = c('_p_parameters', '_p_chromosome', '_p_dataSet'),
returnType = '_p_f_p_struct_parameters_p_struct_chromosome_p_struct_dataSet__double'),
contains = 'CRoutinePointer')
setClass('_p_f_p_struct_parameters_p_p_struct_chromosome_p_p_struct_chromosome_int_int__void',
prototype = list(parameterTypes = c('_p_parameters', '_p_p_chromosome', '_p_p_chromosome', '_int', '_int'),
returnType = '_p_f_p_struct_parameters_p_p_struct_chromosome_p_p_struct_chromosome_int_int__void'),
contains = 'CRoutinePointer')
These operation do not appear to be behaving as expected. When I call a function with the output being the creation of a _p_parameters object (defined above), I get the following error:
Error in getClass(Class, where = topenv(parent.frame())) :
“_p_parameters” is not a defined class
The setClass therefore seems to be not doing it's thing.
I tried to manually set the _p_parameters class via:
p_parameters<-setClass(Class="_p_parameters", representation = representation(ref = "externalptr"))
But this does not seem to work as when I try and modify other parameters (or even print parameters via an inbuilt function) the terminal crashes and hangs.
For reference, the final lines in initialiseParameters (the function which initially own _p_parameters) are calling the native C function via .Call then assigning the external pointer to a new object of class _p_paramters as follows:
;ans = .Call('R_swig_initialiseParameters', numInputs, numNodes, numOutputs, arity, as.logical(.copy), PACKAGE='cgp');
ans <- new("_p_parameters", ref=ans) ;
I've read various R doc on new(), setClass, S3/S4 classes but nothing seems to clarify what I'm meant to be doing here.
Any suggestions on where to start or tutorials that would give a good heads up would be most welcome.
Please keep in mind the C code is not mine (but is freely available under GNU), I am not a C programmer and am only weakly-moderately proficient in R. So please be gentle :)
Cheers.
PS: If I call the function in R terminal via .Call it works as expected (so it doesn't seem to e a C function error)
I thought I should post the solution I have to this. The least I could do for a 'tumbleweed' medal haha.
The only solution I could find to this is to comment out the following line:
ans <- new("_p_parameters", ref=ans) ;
in all function that try to create an R object.
The resulting memory pointer is then assigned to an R object at function call.
It's dirty, but I couldn't work out how to create an object from a memory pointer (at least from within the functions themselves).
It seems to work so I guess it will do.

Resources