How do I print values from C extensions?

Every Ruby object is of type VALUE in C. How do I print it in a readable way?
Any other tips concerning debugging of Ruby C extensions are welcome.

You can call p on Ruby objects with the C function rb_p. For example:
VALUE empty_array = rb_ary_new();
rb_p(empty_array); // prints out "[]"

Here's what I came up with:
static void d(VALUE v) {
ID sym_puts = rb_intern("puts");
ID sym_inspect = rb_intern("inspect");
rb_funcall(rb_mKernel, sym_puts, 1,
rb_funcall(v, sym_inspect, 0));
}
With that in a C file, you can output VALUEs like so:
VALUE v = rb_str_new_cstr("hello");
d(v); // prints "hello" (with the quotes, because inspect is used)
I've borrowed the idea from this article.

I've found an interesting way using Natvis files in Visual Studio.
I have created C++ wrapper objects over the Ruby C API - this gives me a little bit more type safety and the syntax becomes more similar to writing actual Ruby.
I won't be posting the whole code - too long for that, I plan on open sourcing it eventually.
But the gist of it is:
class Object
{
public:
Object(VALUE value) : value_(value)
{
assert(NIL_P(value_) || kind_of(rb_cObject));
}
operator VALUE() const
{
return value_;
}
// [More code] ...
};
Then let's take the String class as an example:
class String : public Object
{
public:
String() : Object(GetVALUE("")) {}
String(VALUE value) : Object(value)
{
CheckTypeOfOrNil(value_, String::klass());
}
String(std::string value) : Object( GetVALUE(value.c_str()) ) {}
String(const char* value) : Object( GetVALUE(value) ) {}
operator std::string() const
{
// StringValueCStr needs a modifiable lvalue, so copy the VALUE first.
VALUE v = value_;
return StringValueCStr(v);
}
static VALUE klass()
{
return rb_cString;
}
// String.empty?
bool empty() const
{
return length() == 0;
}
size_t length() const
{
return static_cast<size_t>(RSTRING_LEN(value_));
}
size_t size() const
{
return length();
}
};
So my wrappers check that the VALUE they wrap is of the expected type or nil.
I then wrote some Natvis files for Visual Studio which provide real-time debug information for my wrapper objects as I step through the code:
<?xml version="1.0" encoding="utf-8"?>
<AutoVisualizer xmlns="http://schemas.microsoft.com/vstudio/debugger/natvis/2010">
<Type Name="SUbD::ruby::String">
<DisplayString Condition="value_ == RUBY_Qnil">Ruby String: Nil</DisplayString>
<DisplayString Condition="value_ != RUBY_Qnil">Ruby String: {((struct RString*)value_)->as.heap.ptr,s}</DisplayString>
<StringView Condition="value_ != RUBY_Qnil">((struct RString*)value_)->as.heap.ptr,s</StringView>
<Expand>
<Item Name="[VALUE]">value_</Item>
<Item Name="[size]" Condition="value_ != RUBY_Qnil">((struct RString*)value_)->as.heap.len</Item>
<Item Name="[string]" Condition="value_ != RUBY_Qnil">((struct RString*)value_)->as.heap.ptr</Item>
<Item Name="[capacity]" Condition="value_ != RUBY_Qnil">((struct RString*)value_)->as.heap.aux.capa</Item>
</Expand>
</Type>
</AutoVisualizer>
Note that this is all hard-coded to the exact internal structure of Ruby 2.0. It will not work with Ruby 1.8 or 1.9, and I haven't tried it with 2.1 or 2.2 yet. Also, there are variations in how a String can be stored which I haven't added yet. (Short strings are embedded directly in the RString struct instead of being stored on the heap.)
(In fact - the natvis posted above only works for 32bit - not 64bit at the moment.)
But once that is set up, I can step through code and inspect the Ruby strings almost as if they were std::string.
Getting it all to work isn't trivial. You may have noticed the RUBY_Qnil references in my natvis file. They would not work unless I added this piece of debug code to my project:
// Required in order to make them available to natvis files in Visual Studio.
#ifdef _DEBUG
const auto DEBUG_RUBY_Qnil = RUBY_Qnil;
const auto DEBUG_RUBY_FIXNUM_FLAG = RUBY_FIXNUM_FLAG;
const auto DEBUG_RUBY_T_MASK = RUBY_T_MASK;
const auto DEBUG_RUBY_T_FLOAT = RUBY_T_FLOAT;
const auto DEBUG_RARRAY_EMBED_FLAG = RARRAY_EMBED_FLAG;
const auto DEBUG_RARRAY_EMBED_LEN_SHIFT = RARRAY_EMBED_LEN_SHIFT;
const auto DEBUG_RARRAY_EMBED_LEN_MASK = RARRAY_EMBED_LEN_MASK;
#endif
Unfortunately, you cannot use macros in natvis definitions, which is why I had to expand many of them manually into the natvis file by inspecting the Ruby source itself. (The Ruby Cross Reference is of great help here: http://rxr.whitequark.org/mri/ident?v=2.0.0-p247)
It's still a work in progress, but it has already saved me a ton of headaches. Eventually I want to publish the debug setup on GitHub: https://github.com/thomthom (Keep an eye on that account if you are interested.)

Related

Kotlin/Native to C dylib: How to access class members of an instance object returned by a method?

I am trying to learn Kotlin/Native C interop.
I exported some Kotlin classes as a C dynamic library and succeeded in calling methods with primitive return types.
But when I try to access class members of an instance object returned by a method, the object contains only something named pinned.
Code sample:
@Serializable
data class Persons (
val results: Array<Result>,
val info: Info
)
class RandomUserApiJS {
fun getPersonsDirect() : Persons {
return runBlocking {
RandomUserApi().getPersons()
}
}
}
Now, when using them from C (in Code::Blocks), the persons object shows only a field named pinned, and no other members or functions can be found.
I don't know much C/C++, so I can't investigate further.
How can I access instance members of a Kotlin class in the exported C library?
Header file for ref:
https://gist.github.com/RageshAntony/a0b9007376084fa8b213b022b58f9886
For your gist (https://gist.github.com/RageshAntony/a0b9007376084fa8b213b022b58f9886), I modified the following:
// I commented out this annotation
// @Serializable
data class Persons(
val results: List<Result>,
val info: Info,
/**
* Result has too many properties,
* so this example uses a simple data class instead.
* It shows how to get a C array from Persons (this also works for any iterable).
*/
val testList: List<Simple>,
) {
public fun toJson() = Json.encodeToString(this)
companion object {
public fun fromJson(json: String) = Json.decodeFromString<Persons>(json)
}
val arena = Arena()
fun getTestListForC(size: CPointer<IntVar>): CPointer<COpaquePointerVar> {
size.pointed.value = testList.size
return arena.allocArray<COpaquePointerVar>(testList.size) {
this.value = StableRef.create(testList[it]).asCPointer()
}
}
fun free() {
arena.clear()
}
}
/**
* The Kotlin <-> C bridge maps primitive types,
* e.g. int <-> Int
* and char* <-> String,
* so the Simple class has two primitive properties.
*/
data class Simple(
val name: String,
val age: Int,
)
#include <stdio.h>
#include "libnative_api.h"
int main(int argc, char **argv) {
libnative_ExportedSymbols* lib = libnative_symbols();
libnative_kref_MathNative mn = lib->kotlin.root.MathNative.MathNative();
const char *a = lib->kotlin.root.MathNative.mul(mn,5,6); // working
printf ("Math Resullt %s\n",a);
libnative_kref_RandomUserApiJS pr = lib->kotlin.root.RandomUserApiJS.RandomUserApiJS();
libnative_kref_Persons persons = lib->kotlin.root.RandomUserApiJS.getPersonsDirect(pr);
// when inspecting the persons object above, only a field 'pinned' is available, nothing else
int size;
libnative_kref_Simple* list = (libnative_kref_Simple *)lib->kotlin.root.Persons.getTestListForC(persons, &size);
printf("size = %d\n", size);
for (int i = 0; i < size; ++i) {
const char *name = lib->kotlin.root.Simple.get_name(list[i]);
int age = lib->kotlin.root.Simple.get_age(list[i]);
printf("%s\t%d\n", name, age);
}
lib->kotlin.root.Persons.free(persons);
return 0;
}
Output:
Math Resullt The answer is 30
size = 3
name1 1
name2 2
name3 3
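As for why only pinned is visible: the generated header declares each exported Kotlin class as a small struct wrapping a single opaque pointer, and members are reached through function pointers in the symbols table, as the C code above does with get_name and get_age. Below is a self-contained mock of that pattern. All demo_ names are hypothetical and only imitate the shape of the generated header; this is not Kotlin/Native output.

```c
#include <assert.h>
#include <string.h>

/* Opaque handle: the caller sees only one pointer, like the 'pinned' field. */
typedef struct { void *pinned; } demo_kref_Simple;

/* The real object layout stays private to the "library" side. */
struct demo_simple_impl { const char *name; int age; };

const char *demo_get_name(demo_kref_Simple ref)
{
    assert(ref.pinned != NULL);
    return ((struct demo_simple_impl *)ref.pinned)->name;
}

int demo_get_age(demo_kref_Simple ref)
{
    assert(ref.pinned != NULL);
    return ((struct demo_simple_impl *)ref.pinned)->age;
}

/* Table of accessors, imitating lib->kotlin.root.Simple.get_name and friends. */
typedef struct {
    const char *(*get_name)(demo_kref_Simple);
    int (*get_age)(demo_kref_Simple);
} demo_simple_api;

static struct demo_simple_impl demo_object = { "name1", 1 };
const demo_simple_api demo_api = { demo_get_name, demo_get_age };

/* Hands out a handle to the single demo object. */
demo_kref_Simple demo_make(void)
{
    demo_kref_Simple ref = { &demo_object };
    return ref;
}
```

A caller would obtain a handle with demo_make() and read it only through demo_api.get_name(handle) and demo_api.get_age(handle), never by dereferencing pinned directly.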
However, I don't think calling a Kotlin library from C is a good approach. Kotlin/Native is not currently focused on performance, and in my opinion anything that can be implemented with Kotlin/Native can also be implemented in pure C, so I am more interested in how to access a C library from Kotlin. Of course, this is a workable solution if you absolutely need to access a Kotlin library from C, but I am still not very satisfied with it, and I may create a GitHub template that handles Kotlin interop from C better. That is not the point of this answer, though.

How to get the class of a VALUE in Ruby C API

I created some classes with Ruby's C API. I want to create a function whose behavior will change depending on the class of the Ruby object.
I tried to use is_a? from Ruby, but I don't think that is the right way to do this. I checked "Creating Extension Libraries for Ruby" without success. The only direct way I found to check classes is with the built-in types.
I have my class "Klass" already created:
VALUE rb_cKlass = rb_define_class("Klass", rb_cObject);
And this is how I wanted to check whether the class is the right one:
VALUE my_function(VALUE self, VALUE my_argument) {
if(rb_check_class(my_argument, rb_cKlass)) {
// do something if my_argument is an instance of Klass
} else {
return Qnil;
}
}
Is there a way to do this?
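I came across this recently and used the RBASIC_CLASS macro, but was getting segfaults in certain scenarios. A likely cause: RBASIC_CLASS dereferences the VALUE as a pointer, which is not valid for special constants such as Fixnums, Symbols, nil, true and false.
After scanning through ruby.h, I found the CLASS_OF macro, which returns the class of a given object as a VALUE and also handles those special constants.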
VALUE obj = INT2NUM(10);
VALUE klass = CLASS_OF(obj); // rb_cInteger
Using Ruby 2.5
Every heap-allocated Ruby object starts with an RBasic struct. A plain object is represented internally by the RObject struct (I will copy the source here for the sake of future readers):
struct RObject {
struct RBasic basic;
union {
struct {
uint32_t numiv;
VALUE *ivptr;
void *iv_index_tbl; /* shortcut for RCLASS_IV_INDEX_TBL(rb_obj_class(obj)) */
} heap;
VALUE ary[ROBJECT_EMBED_LEN_MAX];
} as;
};
The very first member, RBasic, defines the class:
struct RBasic {
VALUE flags;
const VALUE klass;
};
To access the RBasic metadata of a heap object, use the RBASIC macro:
RBASIC(my_argument)
To get the class directly, use the RBASIC_CLASS macro:
RBASIC_CLASS(my_argument)
If you want to stay close to the is_a? Ruby fashion (i.e. check if any of the ancestors is the expected class), you could directly use the C implementation of is_a?, rb_obj_is_kind_of:
rb_obj_is_kind_of(my_argument, rb_cKlass) // Qtrue OR Qfalse
And since Qfalse == 0, you can just use that method as a condition:
VALUE my_function(VALUE self, VALUE my_argument) {
    if (rb_obj_is_kind_of(my_argument, rb_cKlass)) {
        // do something if my_argument is an instance of Klass
        return Qtrue; // placeholder: a method implemented in C must always return a VALUE
    } else {
        return Qnil;
    }
}
To find this function, check the Object#is_a? documentation and click to toggle the source. You'll see the C implementation if the method is implemented in C (so this works for most of the standard library).

Proper way to parse a file and build output

I'm trying to learn D, and after doing the hello world stuff I thought I could try something I had wanted to do in Java before, where it was a big pain because of the way the Regex API worked: a little template engine.
So, I started with some simple code to read through a file, character by character:
import std.stdio, std.file, std.uni, std.array;

void main(string[] args) {
    File f = File("src/res/test.dtl", "r");
    bool escape = false;
    char[] result;
    // Named "output" so the variable does not shadow std.array.appender.
    auto output = appender(result);
    foreach (c; f.rawRead(new char[cast(size_t) f.size])) {
        if (c == '\\') {
            escape = true;
            continue;
        }
        if (escape) {
            escape = false;
            // do something special
        }
        if (c == '#') {
            // start of scope
        }
        output.put(c);
    }
    writeln(output.data);
}
The contents of my file could be something like this:
<h1>#{hello}</h1>
The goal is to replace the #{hello} part with some value passed to the engine.
So, I actually have two questions:
1. Is that a good way to process characters from a file in D? I hacked this together after searching through all the imported modules and picking what sounded like it might do the job.
2. Sometimes I would want to access more than one character (to improve checking for escape sequences, find a whole scope, etc.). Should I slice the array for that? Or are D's regex functions up to that challenge? So far I have only found the matchFirst and matchAll methods, but I would like to match, replace, and return to that position. How could that be done?
The D standard library does not provide what you require. What you need is called "string interpolation", and here is a very nice implementation in D that you can use the way you describe: https://github.com/Abscissa/scriptlike/blob/4350eb745531720764861c82e0c4e689861bb17e/src/scriptlike/core.d#L139
Here is a blog post about this library: https://p0nce.github.io/d-idioms/#String-interpolation-as-a-library
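For a single placeholder syntax like #{hello}, a small hand-written scanner is also enough. The following is a minimal sketch in C rather than D, under a few assumptions: only one key is substituted, a backslash escapes the next character, and the output buffer is supplied by the caller. The name render_template and its signature are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copies tmpl into out, replacing every "#{key}" with value.
 * A backslash copies the next character literally, so "\#{key}" is not replaced.
 * Returns the number of bytes written (excluding the NUL terminator),
 * or -1 if out is too small. */
long render_template(const char *tmpl, const char *key, const char *value,
                     char *out, size_t cap)
{
    assert(tmpl != NULL && key != NULL && value != NULL && out != NULL);
    size_t key_len = strlen(key);
    size_t value_len = strlen(value);
    size_t n = 0;

    for (const char *p = tmpl; *p != '\0'; ) {
        if (*p == '\\' && p[1] != '\0') {
            /* escape: emit the following character verbatim */
            if (n + 1 >= cap) return -1;
            out[n++] = p[1];
            p += 2;
        } else if (p[0] == '#' && p[1] == '{' &&
                   strncmp(p + 2, key, key_len) == 0 &&
                   p[2 + key_len] == '}') {
            /* placeholder: emit the value and skip past "#{key}" */
            if (n + value_len >= cap) return -1;
            memcpy(out + n, value, value_len);
            n += value_len;
            p += key_len + 3;
        } else {
            /* ordinary character */
            if (n + 1 >= cap) return -1;
            out[n++] = *p++;
        }
    }
    if (n >= cap) return -1;   /* no room left for the terminator */
    out[n] = '\0';
    return (long)n;
}
```

Looking ahead with p[1] and strncmp plays the role of slicing in the question: the scanner peeks at several characters, decides, and then advances past the whole match, so no regex is needed for this simple case.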

The Dart VM itself implements `eval` in `dart:mirrors`, and its developers use it. Is there a plan to make this method public?

Here is code in the Dart platform that uses this eval method.
This is done via reflection.
runtime/lib/mirrors_impl.dart
_getFieldSlow(unwrapped) {
// ..... Skipped
var atPosition = unwrapped.indexOf('@');
if (atPosition == -1) {
// Public symbol.
f = _eval('(x) => x.$unwrapped', null);
} else {
// Private symbol.
var withoutKey = unwrapped.substring(0, atPosition);
var privateKey = unwrapped.substring(atPosition);
f = _eval('(x) => x.$withoutKey', privateKey);
}
// ..... Skipped
}
static _eval(expression, privateKey)
native "Mirrors_evalInLibraryWithPrivateKey";
runtime/lib/mirrors.cc
DEFINE_NATIVE_ENTRY(Mirrors_evalInLibraryWithPrivateKey, 2) {
GET_NON_NULL_NATIVE_ARGUMENT(String, expression, arguments->NativeArgAt(0));
GET_NATIVE_ARGUMENT(String, private_key, arguments->NativeArgAt(1));
const GrowableObjectArray& libraries =
GrowableObjectArray::Handle(isolate->object_store()->libraries());
const int num_libraries = libraries.Length();
Library& each_library = Library::Handle();
Library& ctxt_library = Library::Handle();
String& library_key = String::Handle();
if (library_key.IsNull()) {
ctxt_library = Library::CoreLibrary();
} else {
for (int i = 0; i < num_libraries; i++) {
each_library ^= libraries.At(i);
library_key = each_library.private_key();
if (library_key.Equals(private_key)) {
ctxt_library = each_library.raw();
break;
}
}
}
ASSERT(!ctxt_library.IsNull());
return ctxt_library.Evaluate(expression);
}
runtime/vm/bootstrap_natives.h
V(Mirrors_evalInLibraryWithPrivateKey, 2) \
P.S.
I am asking this question here because I cannot ask it on the Dart mailing lists.
P.S.
As we can see, it is a static private method in mirrors_impl.dart:
static _eval(expression, privateKey) native "Mirrors_evalInLibraryWithPrivateKey";
Does anyone want this method to be public? (This is not a question, just thinking aloud.)
According to the Dart FAQ, a pure string eval like that is not likely to make it into the language, even though other dynamic features will likely be added:
So, for example, Dart isn’t likely to support evaluating a string as
code in the current context, but it may support loading that code
dynamically into a new isolate. Dart isn’t likely to support adding
fields to a value, but it may (through a mirror system) support adding
fields to a class, and you can effectively add methods using
noSuchMethod(). Using these features will have a runtime cost; it’s
important to us to minimize the cost for programs that don’t use them.
This area is still under development, so we welcome your thoughts on
what you need from runtime dynamism.

Token return values in ANTLR 3 C

I'm new to ANTLR, and I'm attempting to write a simple parser using the C language target (antlr3c). The grammar is simple enough that I'd like to have each rule return a value, e.g.:
number returns [long value]
:
( INT {$value = $INT.ivalue;}
| HEX {$value = $HEX.hvalue;}
)
;
HEX returns [long hvalue]
: '0' 'x' ('0'..'9'|'a'..'f'|'A'..'F')+ {$hvalue = strtol((char*)$text->chars,NULL,16);}
;
INT returns [long ivalue]
: '0'..'9'+ {$ivalue = strtol((char*)$text->chars,NULL,10);}
;
Each rule collects the return values of its child rules until the topmost rule returns a nice struct full of my data.
As far as I can tell, ANTLR allows lexer rules (tokens, e.g. 'INT' and 'HEX') to return values just like parser rules (e.g. 'number'). However, the generated C code will not compile:
error C2228: left of '.ivalue' must have class/struct/union
error C2228: left of '.hvalue' must have class/struct/union
I did some poking around, and the errors make sense - the tokens end up as generic ANTLR3_COMMON_TOKEN_struct, which doesn't allow for a return value. So maybe the C target just doesn't support this feature. But like I said, I'm new to this, and before I go haring off to find another approach I want to confirm that I can't do it this way.
So the question is this: does antlr3c support return values for lexer rules, and if so, what is the proper way to use them?
Not really any new information, just some details on what @bemace already mentioned.
No, lexer rules cannot have return values. See 4.3 Rules from The Definitive ANTLR Reference:
Rule Arguments and Return Values
Just like function calls, ANTLR parser and tree parser rules can have
arguments and return values. ANTLR lexer rules cannot have return
values [...]
There are two options:
Option 1
You can do the transforming to a long in the parser rule number:
number returns [long value]
: INT {$value = Long.parseLong($INT.text);}
| HEX {$value = Long.parseLong($HEX.text.substring(2), 16);}
;
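Since the question targets the C runtime, the same conversion can be sketched in plain C. This is a minimal sketch, assuming the token text is already available as a NUL-terminated string (the question reads it through $text->chars). token_text_to_long is a hypothetical helper name, not part of the ANTLR runtime.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of the ANTLR runtime: converts the text of
 * an INT or HEX token to a long. A leading "0x" or "0X" selects base 16. */
long token_text_to_long(const char *text)
{
    assert(text != NULL);
    int is_hex = strncmp(text, "0x", 2) == 0 || strncmp(text, "0X", 2) == 0;
    /* With base 16, strtol itself skips an optional "0x" prefix. */
    return strtol(text, NULL, is_hex ? 16 : 10);
}
```

In a parser action this would be called on the token's text in place of the Long.parseLong calls above. For the input 0xCafE it yields 51966.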
Option 2
Or create your own token class that has, say, a toLong() method returning a long:
import org.antlr.runtime.*;
public class YourToken extends CommonToken {
public YourToken(CharStream input, int type, int channel, int start, int stop) {
super(input, type, channel, start, stop);
}
// your custom method
public long toLong() {
String text = super.getText();
int radix = text.startsWith("0x") ? 16 : 10;
if(radix == 16) text = text.substring(2);
return Long.parseLong(text, radix);
}
}
Then set TokenLabelType in the options {...} section of your grammar to use this token, and override the emit() method in your lexer class:
grammar Foo;
options{
TokenLabelType=YourToken;
}
@lexer::members {
public Token emit() {
YourToken t = new YourToken(input, state.type, state.channel,
state.tokenStartCharIndex, getCharIndex()-1);
t.setLine(state.tokenStartLine);
t.setText(state.text);
t.setCharPositionInLine(state.tokenStartCharPositionInLine);
emit(t);
return t;
}
}
parse
: number {System.out.println("parsed: "+$number.value);} EOF
;
number returns [long value]
: INT {$value = $INT.toLong();}
| HEX {$value = $HEX.toLong();}
;
HEX
: '0' 'x' ('0'..'9'|'a'..'f'|'A'..'F')+
;
INT
: '0'..'9'+
;
When you generate a parser and lexer, and run this test class:
import org.antlr.runtime.*;
import java.io.*;
public class Main {
public static void main(String[] args) throws Exception {
ANTLRStringStream in = new ANTLRStringStream("0xCafE");
FooLexer lexer = new FooLexer(in);
CommonTokenStream tokens = new CommonTokenStream(lexer);
FooParser parser = new FooParser(tokens);
parser.parse();
}
}
it will produce the following output:
parsed: 51966
The first option seems the more practical one in your case.
Note that the examples given are in Java. I have no idea whether option 2 is supported in the C target/runtime. I decided to post it anyway so it can serve as a future reference here on SO.
Lexer rules must return Token objects, because that's what the Parser expects to work with. There may be a way to customize the type of token object used, but it's easier just to convert tokens to values in the lowest-level parser rules.
social_title returns [Name.Title title]
: SIR { title = Name.Title.SIR; }
| 'Dame' { title = Name.Title.DAME; }
| MR { title = Name.Title.MR; }
| MS { title = Name.Title.MS; }
| 'Miss' { title = Name.Title.MISS; }
| MRS { title = Name.Title.MRS; };
There is a third option: you can pass an object as an argument to the lexer rule. This object has a member that represents the lexer rule's return value. Inside the lexer rule you set the member; outside it, at the point where you call the rule, you read the member and do whatever you want with this 'return value'.
This way of passing parameters corresponds to 'var' parameters in Pascal, reference parameters in C++, or 'out' parameters in C#.
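In C, the same idea is usually written with a pointer to a caller-owned struct. The sketch below only illustrates the parameter-passing pattern and is independent of ANTLR; rule_result and match_int are hypothetical names standing in for the holder object and the rule body.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical holder for the value a rule wants to hand back. */
typedef struct {
    long value;
} rule_result;

/* Stand-in for a rule body: it does not return the value directly but
 * writes it into the object the caller passed in. Returns 0 on success
 * and -1 if text is not a complete decimal number. */
int match_int(const char *text, rule_result *out)
{
    assert(text != NULL && out != NULL);
    char *end = NULL;
    long parsed = strtol(text, &end, 10);
    if (end == text || *end != '\0') {
        return -1;          /* no digits at all, or trailing garbage */
    }
    out->value = parsed;    /* the 'return value' travels through out */
    return 0;
}
```

The caller declares a rule_result, passes its address, and reads the value member afterwards. On failure the holder is left untouched.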

Resources