When passing a Rust pointer to C should I get 0x1?

I'm trying to implement a basic library in Rust that creates an object and returns its pointer to C. The pointer I get doesn't look like it is on the heap — when I print it I get 0x1:
use std::fmt;
pub struct SndbDB {}
impl SndbDB {
fn new() -> SndbDB {
SndbDB {}
}
}
impl fmt::Display for SndbDB {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
write!(f, "(sndb_db)")
}
}
// Implement a destructor just so we can see when the object is destroyed.
impl Drop for SndbDB {
fn drop(&mut self) {
println!("[rust] dropping {}", self);
}
}
#[no_mangle]
pub extern "C" fn sndb_db_create() -> *mut SndbDB {
let _db = Box::into_raw(Box::new(SndbDB::new()));
println!("[rust] creating DB {:?}", _db);
_db
}
#[no_mangle]
pub unsafe extern "C" fn sndb_db_destroy(ptr: *mut SndbDB) {
println!("[rust] destroying DB {:?}", ptr);
drop(Box::from_raw(ptr)); // Rebuild the Box so that Rust drops the object for us.
}
The C calling code is trivial too:
typedef struct sndb_db sndb_db;
sndb_db * sndb_db_create(void);
void sndb_db_destroy(sndb_db* db);
void test_baseapi__create_and_destroy_db(void)
{
sndb_db * db = sndb_db_create();
printf("[C] Got db=%p\n",db);
sndb_db_destroy(db);
printf("[C] db should be dead by now...\n");
}
And all the output except the pointer location is as I expect:
[rust] creating DB 0x1
[C] Got db=0x1
[rust] destroying DB 0x1
[rust] dropping (sndb_db)
[C] db should be dead by now...
I know that memory allocated in Rust needs to be deallocated by Rust - but I'm still surprised it is using a location of 0x1 - am I doing something wrong, is something odd happening, or is this all OK?

It looks like this is an optimisation by Rust: the SndbDB struct has no state, so it has zero size and there is nothing to allocate.
After adding an i: u32 field and plumbing it through the constructor functions and the C code, I get:
[rust] creating DB 0x7fff2fe00000
[C] Got db=0x7fff2fe00000
[rust] destroying DB 0x7fff2fe00000
[rust] dropping (sndb_db i=123)
[C] db should be dead by now...
However I'd still love to find an official source to back up this guess.
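The standard library documentation for Box::new notes that it does not actually allocate if T is zero-sized, which matches the guess above. In that case the pointer is a non-null, suitably aligned placeholder rather than a heap address, and 0x1 is consistent with that for a type whose alignment is 1. A minimal sketch of the difference follows; Empty, WithField, is_zero_sized and boxed_raw are hypothetical names used only for illustration.

```rust
use std::mem::size_of;

// Hypothetical stand-ins for the two versions of SndbDB described above.
pub struct Empty {}
pub struct WithField {
    pub i: u32,
}

/// True when T occupies no memory at all.
pub fn is_zero_sized<T>() -> bool {
    size_of::<T>() == 0
}

/// Boxes a value and gives up ownership as a raw pointer,
/// the same way sndb_db_create does.
pub fn boxed_raw<T>(value: T) -> *mut T {
    Box::into_raw(Box::new(value))
}
```

Passing the placeholder pointer back to Box::from_raw is still the right way to run the destructor, which is why the drop message appears in the output above even though nothing was allocated.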

Related

Pass an initialized Rust struct to C and call a rust function from C with the struct as input

I have built a library in Rust and my goal is to run it parallel with a Sat Solver (built in C) and make these two exchange information. The database (where information is received, stored and sent) is in Rust.
My problem is that I need to pass the Receiver from crossbeam channel from Rust to C and from the C program call the function receive (which takes as input the Receiver struct). The function receive is then executed in my Rust code.
I have programmed the structs and function in Rust:
use crossbeam_channel::Receiver;
use std::boxed::Box;
#[repr(C)]
pub struct CReceiver(Receiver<Vec<i32>>);
impl CReceiver {
fn new(receiver: Receiver<Vec<i32>>) -> CReceiver {
CReceiver(receiver)
}
}
#[no_mangle]
pub extern "C" fn data_new(creceiver: CReceiver) -> *mut CReceiver {
Box::into_raw(Box::new(creceiver))
}
#[no_mangle]
pub extern "C" fn data_free(ptr: *mut CReceiver) {
if ptr.is_null() {
return;
}
unsafe {
Box::from_raw(ptr);
}
}
#[no_mangle]
pub extern "C" fn receive(ptr: *mut CReceiver) -> Vec<i32> {
unsafe {
let creceiver = Box::from_raw(ptr);
match creceiver.0.try_recv() {
Ok(received) => received,
Err(_) => Vec::new() // glucose can handle empty clauses
}
}
}
I am trying to wrap the Receiver in a CReceiver struct to be able to pass it over to C. When I try to generate the header with cbindgen I get:
WARN: Cannot find a mangling for generic path GenericPath { path: Path { name: "Receiver" }, export_name: "Receiver", generics: [Type(Path(GenericPath { path: Path { name: "Vec" }, export_name: "Vec", generics: [Type(Primitive(Integer { zeroable: true, signed: true, kind: B32 }))], ctype: None }))], ctype: None }. This usually means that a type referenced by this generic was incompatible or not found.
WARN: Can't find Vec. This usually means that this type was incompatible or not found.
WARN: Can't find Receiver. This usually means that this type was incompatible or not found.
IMPORTANT: I need to initialise the receiver in Rust and then pass it to C as the whole channel (together with the sender) is initialised and needed.
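One possible direction, offered only as a sketch: keep the receiver opaque to C, create it on the Rust side, and have receive borrow it instead of taking ownership with Box::from_raw (which would drop the receiver after the first call). The sketch below uses std::sync::mpsc::Receiver as a stand-in for the crossbeam receiver so that it is self-contained. The helper receiver_into_raw and the out/cap buffer parameters are assumptions, not part of the original code. Add #[no_mangle] to the extern functions when exporting them from the library.

```rust
use std::sync::mpsc::Receiver;

// Opaque to C: C only ever holds a pointer, so no #[repr(C)] is needed.
pub struct CReceiver(Receiver<Vec<i32>>);

/// Hypothetical helper: called from Rust after the channel is created;
/// the returned pointer is what gets handed to the C solver.
pub fn receiver_into_raw(receiver: Receiver<Vec<i32>>) -> *mut CReceiver {
    Box::into_raw(Box::new(CReceiver(receiver)))
}

/// Copies the next pending clause into the caller's buffer `out` (capacity
/// `cap`) and returns the number of literals written; 0 means nothing pending.
/// A clause longer than `cap` is truncated in this sketch.
pub unsafe extern "C" fn receive(ptr: *const CReceiver, out: *mut i32, cap: usize) -> usize {
    if out.is_null() {
        return 0;
    }
    // Borrow the receiver; ownership stays with the Box until data_free.
    let creceiver = match unsafe { ptr.as_ref() } {
        Some(r) => r,
        None => return 0,
    };
    match creceiver.0.try_recv() {
        Ok(clause) => {
            let n = clause.len().min(cap);
            unsafe { std::ptr::copy_nonoverlapping(clause.as_ptr(), out, n) };
            n
        }
        Err(_) => 0,
    }
}

/// Releases the receiver exactly once, when C is finished with it.
pub unsafe extern "C" fn data_free(ptr: *mut CReceiver) {
    if !ptr.is_null() {
        drop(unsafe { Box::from_raw(ptr) });
    }
}
```

Returning a length and filling a caller-provided buffer avoids passing Vec<i32> across the boundary, since Vec has no C-compatible layout.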

Rust static function fn pointer, which value to initialize?

I'm trying to create a Rust library callable from C:
use std::os::raw::{c_int};
type OnDataCallback = unsafe extern "C" fn(data: *mut u8, len: usize) -> c_int;
static mut onDataCallback_: OnDataCallback = std::ptr::null();
#[no_mangle]
pub extern "C" fn registerOnDataCallback(
data: *const u8, len: usize,
cb: Option<OnDataCallback>) -> c_int
{
onDataCallback_ = cb.unwrap();
return 0;
}
#[no_mangle]
pub extern "C" fn doSomething()
{
unsafe{onDataCallback_(mut "hello world" , 100)};
}
But I'm getting:
--> interface.rs:5:46
|
5 | static mut onDataCallback_: OnDataCallback = std::ptr::null();
| ^^^^^^^^^^^^^^^^ expected fn pointer, found *-ptr
|
= note: expected fn pointer `unsafe extern "C" fn(*mut u8, usize) -> i32`
found raw pointer `*const _`
I have no idea what to put there as the initial value. I can't leave it without one, and I can't use null. What should I put?
PS: if what I'm doing is bad practice, please show me a better one. I'm new to Rust.
You should wrap it in an Option and initialize it with None.
static mut onDataCallback_: Option<OnDataCallback> = None;
You can initialize it with a function that panics:
static mut ON_DATA_CALLBACK: OnDataCallback = init;
extern "C" fn init(_: *mut u8, _: usize) -> c_int {
panic!("Function pointer not initialized");
}
Be aware that static mut is wildly unsafe, and its use is generally discouraged because it is so difficult to use correctly. A safe alternative is to put a Mutex in a static (a RefCell will not work there, because it is not Sync). Since Mutex has interior mutability, the static doesn't need to be mutable.
A better solution would be to use once_cell or lazy_static to initialize the function the first time doSomething() is called.
By the way, Rust string literals are always immutable. You can get a mutable string by allocating a String.
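A minimal sketch of the Mutex approach combined with the Option suggestion, assuming a toolchain where Mutex::new can be used in a static initializer (Rust 1.63 or later). The snake_case function names and sample_callback are hypothetical, and the unused data/len parameters of the original register function are omitted. Add #[no_mangle] to the extern functions when exporting them to C.

```rust
use std::os::raw::c_int;
use std::sync::Mutex;

type OnDataCallback = unsafe extern "C" fn(data: *mut u8, len: usize) -> c_int;

// No `static mut`: the Mutex provides the interior mutability.
static ON_DATA_CALLBACK: Mutex<Option<OnDataCallback>> = Mutex::new(None);

/// Stores the callback; returns 0 on success and -1 if C passed a null pointer.
pub extern "C" fn register_on_data_callback(cb: Option<OnDataCallback>) -> c_int {
    match cb {
        Some(f) => {
            *ON_DATA_CALLBACK.lock().unwrap() = Some(f);
            0
        }
        None => -1,
    }
}

/// Invokes the callback with a mutable byte buffer; returns -1 if none is registered.
pub extern "C" fn do_something() -> c_int {
    // Copy the fn pointer out so the lock is released before calling into C.
    let cb = *ON_DATA_CALLBACK.lock().unwrap();
    match cb {
        Some(f) => {
            // A mutable copy of the bytes, since string literals are immutable.
            let mut buf = *b"hello world";
            unsafe { f(buf.as_mut_ptr(), buf.len()) }
        }
        None => -1,
    }
}

// Hypothetical callback standing in for the one C would supply.
pub unsafe extern "C" fn sample_callback(_data: *mut u8, len: usize) -> c_int {
    len as c_int
}
```

Because Option<OnDataCallback> has the same size as the bare function pointer, a null pointer passed from C arrives as None instead of causing undefined behavior.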

Creating a rust shared library that returns a struct of function pointers to C main program

I'm trying to make a Rust binding to nbdkit without much luck. I need to make a .so file, which is easy. The .so file must have a public function called plugin_init, also easy. However this function must return a pointer to a C-compatible struct containing a mix of strings and function pointers (that the C main program will later call).
The API is: https://github.com/libguestfs/nbdkit/blob/409ce4c9238a84ede6688423b20d5f706067834b/include/nbdkit-plugin.h#L53
I came up with:
#[repr(C)]
pub struct NBDKitPlugin {
_struct_size: uint64_t,
_api_version: c_int,
_thread_model: c_int,
name: *const c_char,
longname: Option<*const c_char>,
version: Option<*const c_char>,
description: Option<*const c_char>,
load: Option<extern fn ()>,
unload: Option<extern fn ()>,
config: Option<extern fn ()>, // XXX
config_complete: Option<extern fn () -> c_int>,
config_help: Option<*const c_char>,
open: extern fn (c_int) -> *mut c_void,
close: Option<extern fn (*mut c_void)>,
}
and a plugin_init function:
extern fn hello_load () {
println! ("hello this is the load method");
}
struct MyHandle {
}
extern fn hello_open (readonly: c_int) -> *mut c_void {
println! ("hello, this is the open method");
let mut h = MyHandle {};
let vp: *mut c_void = &mut h as *mut _ as *mut c_void;
return vp;
}
#[no_mangle]
pub extern fn plugin_init () -> *const NBDKitPlugin {
println! ("hello from the plugin");
let plugin = Box::new (NBDKitPlugin {
_struct_size: mem::size_of::<NBDKitPlugin>() as uint64_t,
_api_version: 2,
_thread_model: 3,
name: CString::new("hello").unwrap().into_raw(),
longname: None,
version: None,
description: None,
load: Some (hello_load),
unload: None,
config: None,
config_complete: None,
config_help: Some (CString::new("my config_help here").unwrap().into_raw()),
open: hello_open,
close: None,
});
return Box::into_raw(plugin);
}
Apart from leaking memory, this partially works. The integers and strings are seen from C OK. However the function pointers don't work at all. They are completely bogus and seem to occupy more space than a raw pointer so I suppose that I'm exposing a "fat" Rust pointer.
There seems to be very little documentation on this topic that I can find. Help.
You've probably learned at some point that a reference (&T) wrapped in an Option (Option<&T>) is optimized such that None is encoded as all zeroes, which is not valid for a reference, and Option<&T> has the same size as &T.
However, all zeroes is a valid bit pattern for a raw pointer (*const T or *mut T): it represents the null pointer. As such, wrapping those in an Option is no different from wrapping, say, an i32 in an Option: the Option type is larger so that it can store a discriminant.
To fix your struct definition, you must not use Option to define longname, version, description and config_help. Declare them as plain *const c_char and use a null pointer for "absent".
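The size difference can be checked directly, as in the rough sketch below. It also hints at why the function pointers looked bogus: each oversized Option<*const c_char> field shifts every later field away from the offset that C expects. PluginSketch is a hypothetical, heavily trimmed stand-in for the real nbdkit struct, and pointer_sizes and make_plugin are illustrative helpers.

```rust
use std::mem::size_of;
use std::os::raw::{c_char, c_int};
use std::ptr;

// Trimmed, hypothetical version of the plugin struct with the fix applied.
#[repr(C)]
pub struct PluginSketch {
    pub name: *const c_char,
    pub longname: *const c_char, // null when absent, no Option
    pub load: Option<extern "C" fn()>, // Option around a fn pointer is fine
    pub config_complete: Option<extern "C" fn() -> c_int>,
}

extern "C" fn hello_load() {}

pub fn make_plugin() -> PluginSketch {
    PluginSketch {
        name: b"hello\0".as_ptr() as *const c_char,
        longname: ptr::null(),
        load: Some(hello_load),
        config_complete: None,
    }
}

/// Sizes of a raw pointer, an Option around a raw pointer,
/// and an Option around a function pointer, in that order.
pub fn pointer_sizes() -> (usize, usize, usize) {
    (
        size_of::<*const c_char>(),
        size_of::<Option<*const c_char>>(),
        size_of::<Option<extern "C" fn()>>(),
    )
}
```

On a typical 64-bit target the three sizes come out as 8, 16 and 8 bytes: only the Option around the raw pointer needs room for a discriminant.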

Borrow checker error in a loop inside a recursive function with lifetime bounds

Why does the borrow checker complain about this code?
fn foo<'a>(v: &mut Vec<&'a str>, buf: &'a mut String) {
loop {
foo(v, buf);
}
}
error[E0499]: cannot borrow `*buf` as mutable more than once at a time
--> src/main.rs:3:16
|
3 | foo(v, buf);
| ^^^ mutable borrow starts here in previous iteration of loop
4 | }
5 | }
| - mutable borrow ends here
If I remove the lifetime bound, the code compiles fine.
fn foo(v: &mut Vec<&str>, buf: &mut String) {
loop {
foo(v, buf);
}
}
This isn't a duplicate of Mutable borrow in a loop, because there is no return value in my case.
I'm pretty sure that my final goal isn't achievable in safe Rust, but right now I want to better understand how the borrow checker works, and I cannot understand why adding a lifetime bound between parameters extends the lifetime of the borrow in this code.
The version with the explicit lifetime 'a ties the lifetime of the Vec to the lifetime of buf. This causes trouble when the Vec and the String are reborrowed. Reborrowing occurs when the arguments are passed to foo in the loop:
fn foo<'a>(v: &mut Vec<&'a str>, buf: &'a mut String) {
loop {
foo(&mut *v, &mut *buf);
}
}
This is done implicitly by the compiler to prevent the arguments from being consumed when foo is called in the loop. If the arguments were actually moved, they could not be used anymore (e.g. for successive calls to foo) after the first recursive call to foo.
Forcing buf to be moved around resolves the error:
fn foo<'a>(v: &mut Vec<&'a str>, buf: &'a mut String) {
foo_recursive(v, buf);
}
fn foo_recursive<'a>(v: &mut Vec<&'a str>, buf: &'a mut String) -> &'a mut String {
let mut buf_temp = buf;
loop {
let buf_loop = buf_temp;
buf_temp = foo_recursive(v, buf_loop);
// some break condition
}
buf_temp
}
However, things will break again as soon as you try to actually use buf. Here is a distilled version of your example demonstrating why the compiler forbids successive mutable borrows of buf:
fn foo<'a>(v: &mut Vec<&'a str>, buf: &'a mut String) {
bar(v, buf);
bar(v, buf);
}
fn bar<'a>(v: &mut Vec<&'a str>, buf: &'a mut String) {
if v.is_empty() {
// first call: push slice referencing "A" into 'v'
v.push(&buf[0..1]);
} else {
// second call: remove "A" while 'v' is still holding a reference to it - not allowed
buf.clear();
}
}
fn main() {
foo(&mut vec![], &mut String::from("A"));
}
The calls to bar are the equivalents to the recursive calls to foo in your example. Again the compiler complains that *buf cannot be borrowed as mutable more than once at a time. The provided implementation of bar shows that the lifetime specification on bar would allow this function to be implemented in such a way that v enters an invalid state. The compiler understands by looking at the signature of bar alone that data from buf could potentially flow into v and rejects the code as potentially unsafe regardless of the actual implementation of bar.
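For contrast, here is a sketch of the same shape with a shared borrow of buf. It assumes the callee only needs to read from buf, which is a change from the original signature. Shared references are Copy, so the same &'a String can be passed to both calls, and nothing can clear the string while v holds slices of it.

```rust
/// Pushes a slice of `buf` into `v`; `buf` is only read, never mutated.
fn bar<'a>(v: &mut Vec<&'a str>, buf: &'a String) {
    v.push(&buf[0..1]);
}

/// Two successive calls are accepted: a shared reference can be handed
/// out any number of times for the whole lifetime 'a.
fn foo<'a>(v: &mut Vec<&'a str>, buf: &'a String) {
    bar(v, buf);
    bar(v, buf);
}
```

The compiler accepts this version because data flowing from buf into v can no longer be invalidated through buf.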

Allocating an object for C / FFI library calls

I have a C library that contains a GPIO implementation. It defines a gpio_type that is target specific: each MCU has a different definition of gpio_type. One of the functions in the library is:
void gpio_init(gpio_type *object, int32_t pin);
I want to write an abstraction of a Gpio object in Rust using the C library functions, so I need something like an opaque pointer type (in C++ I would just create a member variable of type gpio_type). I figured I would create an empty enum (or struct), allocate the space needed for the object, and transmute it to match the type in the C layer.
pub enum gpio_type {}
#[link(name = "gpio_lib", kind = "static")]
extern {
pub fn gpio_init(obj: *mut gpio_type, value: i32);
}
pub struct Gpio {
gpio : *mut gpio_type,
}
impl Gpio {
pub fn new(pin: u32) -> Gpio {
unsafe {
let mut gpio_ptr : &'static [u8; 4] = init(); // size of gpio in C is 4 bytes for one target, will be changed later to obtain it dynamically
let gpio_out = Gpio { gpio: transmute(gpio_ptr)};
gpio_init(gpio_out.gpio, pin);
gpio_out
}
}
}
This targets embedded devices, so there is no std and no libc. I don't want to redefine gpio_type for each target in Rust (copying the C declaration for each target); I'm looking for a way to just allocate memory for the object, which C will then handle.
According to the disassembly, this code produces a pointer to address 0. Disassembly of the Gpio new method:
45c: b580 push {r7, lr}
45e: 466f mov r7, sp
460: 4601 mov r1, r0
462: 2000 movs r0, #0
464: f000 fae6 bl a34 <gpio_init>
468: 2000 movs r0, #0
46a: bd80 pop {r7, pc}
Any ideas why the value loaded at 462 is 0?
"looking for something to just allocate memory for the object which C will handle"
What about something like this? Give the struct an actual size (in this case by giving it a fixed-size array of byte-sized items), allocate that space on the heap, then treat that as a raw pointer.
#[allow(missing_copy_implementations)]
pub struct Gpio([u8; 4]);
impl Gpio {
fn new() -> Gpio { Gpio([0,0,0,0]) }
}
fn main() {
// Allocate some bytes and get a raw pointer (Box::into_raw gives up ownership)
let a: *mut Gpio = Box::into_raw(Box::new(Gpio::new()));
// Use it here, e.g. pass `a` to the C functions
// When done... back to a box
let b: Box<Gpio> = unsafe { Box::from_raw(a) };
// Now it will be dropped automatically (and free the allocated memory)
// Or you can be explicit
drop(b);
}
However, I'd suggest doing something like this; it's a lot more obvious and doesn't need a heap allocation:
#[allow(missing_copy_implementations)]
pub struct Gpio([u8; 4]);
impl Gpio {
fn new() -> Gpio { Gpio([0,0,0,0]) }
fn as_mut_ptr(&mut self) -> *mut u8 {
self.0.as_mut_ptr()
}
}
fn main() {
let mut g = Gpio::new();
let b = g.as_mut_ptr();
}
As a bonus, you get a nice place to hang some methods on. Potentially as_mut_ptr wouldn't need to be public, and could be hidden behind public methods on the Gpio struct.
(You might also be able to leave the bytes uninitialized, e.g. with MaybeUninit in current Rust, instead of using [0,0,0,0].)
An expanded example of the second suggestion:
// This depends on your library, check the FFI guide for details
extern {
fn gpio_init(gpio: *mut u8, pin: u8);
fn gpio_pin_on(gpio: *mut u8);
fn gpio_pin_off(gpio: *mut u8);
}
#[allow(missing_copy_implementations)]
pub struct Gpio([u8; 4]);
impl Gpio {
fn new(pin: u8) -> Gpio {
let mut g = Gpio([0,0,0,0]);
g.init(pin);
g
}
fn as_mut_ptr(&mut self) -> *mut u8 {
self.0.as_mut_ptr()
}
fn init(&mut self, pin: u8) { unsafe { gpio_init(self.as_mut_ptr(), pin) } }
pub fn on(&mut self) { unsafe { gpio_pin_on(self.as_mut_ptr()) } }
pub fn off(&mut self) { unsafe { gpio_pin_off(self.as_mut_ptr()) } }
}
static BLUE_LED_PIN: u8 = 0x4;
fn main() {
let mut g = Gpio::new(BLUE_LED_PIN);
g.on();
g.off();
}
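One detail the byte-array approach leaves open is alignment: [u8; 4] only guarantees 1-byte alignment, while the C gpio_type may need more. Below is a rough sketch of a storage type that reserves both size and alignment. The 4-byte size and 4-byte alignment are assumptions about one target, and gpio_init_stub is a hypothetical Rust stand-in for the C gpio_init so that the example is self-contained.

```rust
// Assumption: on this target the C gpio_type is 4 bytes with 4-byte alignment.
#[repr(C, align(4))]
pub struct GpioStorage([u8; 4]);

// Hypothetical stand-in for the C function `gpio_init`; it just records the pin
// in the storage so the effect is observable.
extern "C" fn gpio_init_stub(obj: *mut GpioStorage, pin: i32) {
    unsafe { (*obj).0 = pin.to_le_bytes() };
}

pub struct Gpio {
    storage: GpioStorage,
}

impl Gpio {
    /// Reserves correctly sized and aligned storage, then lets "C" initialize it.
    pub fn new(pin: i32) -> Gpio {
        let mut g = Gpio { storage: GpioStorage([0; 4]) };
        gpio_init_stub(&mut g.storage, pin);
        g
    }

    /// Pointer to hand to the C functions.
    pub fn as_mut_ptr(&mut self) -> *mut GpioStorage {
        &mut self.storage
    }
}
```

Note that Gpio::new returns the value by move, so this only works if the C side does not keep the address it was given during initialization.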

Resources