How to implement Rust struct and methods as Vec from C without leaking memory?

I'm writing Rust bindings for a C library which uses an embedded constructor and destructor. The raw Rust translation of the C header:
// Opaque C structures are represented as empty enums
pub enum UdbEntity_ {}
pub enum UdbReference_ {}
...
pub type UdbEntity = *mut UdbEntity_;
pub type UdbReference = *mut UdbReference_;
...
// Return a non-allocated, permanent list of all entities. This list may be
// used in places where an allocated entity list is required and may be
// safely passed to udbListEntityFree().
pub fn udbListEntity(list: *mut *mut UdbEntity, items: *mut c_int);
// Free an allocated list of entities.
pub fn udbListEntityFree(list: *mut UdbEntity);
...
// Return an allocated list of all references for entity.
// Free the list with udbListReferenceFree().
pub fn udbListReference(entity : UdbEntity,
refs : *mut *mut UdbReference,
items : *mut c_int);
// Free the allocated references list.
pub fn udbListReferenceFree(refs: *mut UdbReference);
This is my implementation of the safe Rust wrapper, modeled on git2-rs:
/// Structure of Entity.
pub struct Entity<'ents> {
raw: UdbEntity,
_marker: PhantomData<&'ents UdbEntity>,
}
/// Opaque structure of list of entities.
pub struct ListEntity<'db> {
raw: *mut UdbEntity,
len: usize,
_marker: PhantomData<&'db Db>,
}
/// An iterator over the Entity in list of entities.
pub struct EntityIter<'ents> {
range: Range<usize>,
ents: &'ents ListEntity<'ents>,
}
impl<'db> Drop for ListEntity<'db> {
fn drop(&mut self) {
unsafe { udbListEntityFree(self.raw) };
}
}
The same goes for ListReference and Reference.
I need to work with ListEntity as if it were a Vec<Entity> (iterators, slices for sorting, etc.), but I can't implement that. In my attempts I can't create slices: from_raw_parts returns slices over UdbEntity, not Entity.
When I keep a Vec<Entity> inside ListEntity and later move that Vec<Entity> out to edit it, ListEntity is dropped and frees the allocated *mut UdbEntity list. I also need correct lifetimes.
I reimplemented some simple structs (Kind and ListKind, for example) in pure Rust, but I think a more idiomatic approach exists.

The problem is that you have two rather disjoint structures in your code. On the one hand, you have a ListEntity which owns a raw array of UdbEntity and frees it when required, on the other hand you have an Entity which wraps the UdbEntity, but is not in any way referenced in ListEntity.
You have two options here:
1. Transmute the array of UdbEntity into an array of Entity, in which case you will be able to create slices of it. To do that, the two types need to have the same in-memory representation.
2. Create a vector of Entity separately from the UdbEntity array and return its elements instead.
Assuming the first approach is safe, I would go with that. If not, the second one can work. In both cases the array of Entity should be owned by ListEntity so that the memory is properly managed. I would probably ditch the PhantomData in Entity and simply return references to them.
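A minimal sketch of the first option, under stated assumptions: Entity is marked #[repr(transparent)] so that it is layout-compatible with UdbEntity, and the two fake_* functions are hypothetical stand-ins for udbListEntity / udbListEntityFree so that the example runs without the C library. The lifetime parameters from the question are left out to keep it short.

```rust
use std::ptr;
use std::slice;

// Opaque C type and handle, as declared in the question.
pub enum UdbEntity_ {}
pub type UdbEntity = *mut UdbEntity_;

// Hypothetical stand-in for udbListEntity: allocates `len` fake handles in
// descending address order. The handles are never dereferenced.
fn fake_list_entity(len: usize) -> *mut UdbEntity {
    let handles: Vec<UdbEntity> = (1..=len).rev().map(|i| i as UdbEntity).collect();
    Box::into_raw(handles.into_boxed_slice()) as *mut UdbEntity
}

// Hypothetical stand-in for udbListEntityFree.
unsafe fn fake_list_entity_free(list: *mut UdbEntity, len: usize) {
    unsafe { drop(Box::from_raw(ptr::slice_from_raw_parts_mut(list, len))) };
}

/// One entity handle with the same layout as a bare `UdbEntity`.
#[repr(transparent)]
pub struct Entity(UdbEntity);

impl Entity {
    /// Address of the underlying handle, used here only as a sort key.
    pub fn addr(&self) -> usize {
        self.0 as usize
    }
}

/// Owns the C array and frees it exactly once.
pub struct ListEntity {
    raw: *mut UdbEntity,
    len: usize,
}

impl ListEntity {
    pub fn new(len: usize) -> ListEntity {
        ListEntity { raw: fake_list_entity(len), len }
    }

    /// Borrows the C array as a slice of `Entity`.
    pub fn as_mut_slice(&mut self) -> &mut [Entity] {
        // Relies on `Entity` being `repr(transparent)` over `UdbEntity`.
        unsafe { slice::from_raw_parts_mut(self.raw as *mut Entity, self.len) }
    }
}

impl Drop for ListEntity {
    fn drop(&mut self) {
        unsafe { fake_list_entity_free(self.raw, self.len) };
    }
}

// Sorts the entities in place and returns their addresses.
fn sorted_addrs(len: usize) -> Vec<usize> {
    let mut list = ListEntity::new(len);
    let ents = list.as_mut_slice();
    ents.sort_by_key(|e| e.addr());
    ents.iter().map(Entity::addr).collect()
}

fn main() {
    println!("{:?}", sorted_addrs(3));
}
```

Because the slice borrows from ListEntity, the borrow checker keeps the list alive for as long as the slice is used. Note that sorting this way reorders the C array itself; if the library does not allow that (the permanent list, for instance), copying the handles into a separate Vec is the safer choice.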

Related

Do I need an Rc type for storing a pointer to a Rust struct inside of its member C struct?

I would like to store the pointer to a Rust struct inside of its member C struct. Is it required that the struct be enclosed in an Rc rather than a Box?
The reason I'm asking is because although there is shared ownership here, the pointer is only ever accessed from within unsafe member functions of the Rust struct and the C struct's lifetime is tied to that of the enclosing Rust struct.
Here's an example ->
// C struct with constructor/destructor
struct c_foo {
void* internal; // pointer to rust `struct`
/* ... */
};
struct c_foo* c_foo_new();
void c_foo_free(struct c_foo* foo);
// FFI struct generated by bindgen
#[repr(C)]
#[derive(Debug, Copy, Clone)]
pub struct Foo {
pub internal: *mut libc::c_void, // pointer to rust `struct`
/* ... */
}
// Rust struct that wraps the FFI struct
struct Bar {
ptr: *mut Foo, // private
/* ... */
}
impl Bar {
fn new() -> Box<Bar> {
unsafe {
let mut bar = Box::new(Bar { ptr: c_foo_new() });
// Take the address of the heap-allocated Bar, not of the local Box variable.
let bar_ptr = &mut *bar as *mut Bar as *mut libc::c_void;
(*bar.ptr).internal = bar_ptr;
bar
}
}
}
impl Drop for Bar {
fn drop(&mut self) {
unsafe {
c_foo_free(self.ptr);
}
}
}
So there's a C struct c_foo with a void * that stores a reference to the Rust struct Bar. Foo is just the bindgen generated Rust wrapper for c_foo.
Do I need a Box or an Rc in the Bar::new() function?
To clarify, there is no shared ownership on the Rust side. Ownership is shared only between the Rust and C sides, so I guess there is no benefit in using an Rc type.
A comment by the user "E_net4 is renamed all the time" answers my question:
Use Rc only if you need shared ownership in Rust code. Since C does not retain the semantics of this pointer type, C code needs to handle boundaries manually regardless.
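A minimal sketch of the Box variant, assuming the C side only stores the pointer and calls back synchronously. foo_new and foo_free are hypothetical Rust stand-ins for c_foo_new and c_foo_free, and internal_ptr is a helper added only for illustration. The point it tries to show is that the stored pointer targets the heap allocation behind the Box, so it stays the same when the Box itself is moved.

```rust
use std::ffi::c_void;
use std::ptr;

// Stand-in for the bindgen-generated struct; in real code C allocates it.
#[repr(C)]
pub struct Foo {
    pub internal: *mut c_void,
}

// Hypothetical replacement for c_foo_new.
fn foo_new() -> *mut Foo {
    Box::into_raw(Box::new(Foo { internal: ptr::null_mut() }))
}

// Hypothetical replacement for c_foo_free.
unsafe fn foo_free(foo: *mut Foo) {
    unsafe { drop(Box::from_raw(foo)) };
}

pub struct Bar {
    ptr: *mut Foo,
}

impl Bar {
    pub fn new() -> Box<Bar> {
        let mut bar = Box::new(Bar { ptr: foo_new() });
        // Point at the heap-allocated `Bar`, not at the local `Box` variable.
        let bar_ptr: *mut Bar = &mut *bar;
        unsafe { (*bar.ptr).internal = bar_ptr as *mut c_void };
        bar
    }

    /// Returns the back-pointer stored on the "C" side (illustration only).
    pub fn internal_ptr(&self) -> *const Bar {
        unsafe { (*self.ptr).internal as *const Bar }
    }
}

impl Drop for Bar {
    fn drop(&mut self) {
        // Free the C struct itself, not the back-pointer it holds.
        unsafe { foo_free(self.ptr) };
    }
}

fn main() {
    let bar = Bar::new();
    let moved = bar; // moving the Box does not move the Bar on the heap
    println!("{}", moved.internal_ptr() == &*moved as *const Bar);
}
```

A plain Bar returned by value would not work here, since moving it would leave the stored pointer dangling. Box gives the stable address; Rc would add reference counting that nothing on the C side participates in.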

Rust Destructors and ownership

I almost asked the same question the other day, but in the context of C++.
I try to replicate destructors and constructors in my C programming. That means that for every object or struct there is an initialization function and a destruct function which frees all of the object's resources, like so:
struct MyObject {
struct string a;
struct string b;
struct string c;
};
void ConstructMyObject(struct MyObject *obj) {
ConstructString(&obj->a);
ConstructString(&obj->b);
ConstructString(&obj->c);
}
void DestructMyObject(struct MyObject *obj) {
DestructString(&obj->a);
DestructString(&obj->b);
DestructString(&obj->c);
}
The destruct function is called at the end of every function scope, just like in Rust, except that I put the call there manually instead of the compiler doing the job for me. In DestructMyObject I call the destructor of every struct string member, because struct string has its own destruct function, just like struct MyObject does. So everything that struct MyObject has allocated will be freed.
Example with my problem:
int main(void) {
struct MyObject Object1;
ConstructMyObject(&Object1);
...
...
...
TransferOwnershipFunction(Object1.b); /*takes a struct string object as argument*/
...
...
...
DestructMyObject(&Object1);
return 0;
}
I transferred ownership of a member (struct string b) of Object1 to another function. But struct string b will be freed by the main function, because I have the rule that when an object goes out of scope I call its destruct function. I don't want the main function to free this resource: TransferOwnershipFunction(...) is now responsible for freeing this member of Object1. How does the Rust compiler deal with such situations? In Rust, would I have to make a clone of string b?
The Rust compiler is smart enough to see when only a single field of a struct is consumed. Only that specific field has its ownership transferred and the remaining fields are dropped at the end of the scope (or otherwise consumed). This can be seen in the following example.
struct MyObject {
a: String,
b: String,
c: String,
}
fn consume_string(_string: String) {}
fn main() {
let object = MyObject {
a: "".to_string(),
b: "".to_string(),
c: "".to_string(),
};
consume_string(object.b);
// We can still access object.a and object.c
println!("{}", object.a);
println!("{}", object.c);
// but not object.b
// println!("{}", object.b);
}
However, if the struct has a non-trivial destructor, i.e., implements the Drop trait, then this can't happen. Trying to move a single field of the struct will result in a compiler error, as seen below.
struct MyObject {
a: String,
b: String,
c: String,
}
// This is new
impl Drop for MyObject {
fn drop(&mut self) {
println!("dropping MyObject");
}
}
fn consume_string(_string: String) {}
fn main() {
let object = MyObject {
a: "".to_string(),
b: "".to_string(),
c: "".to_string(),
};
consume_string(object.b);
}
Attempting to compile this gives the error
error[E0509]: cannot move out of type `MyObject`, which implements the `Drop` trait
  --> src/main.rs:22:20
   |
22 |     consume_string(object.b);
   |                    ^^^^^^^^
   |                    |
   |                    cannot move out of here
   |                    move occurs because `object.b` has type `std::string::String`, which does not implement the `Copy` trait
error: aborting due to previous error
For more information about this error, try `rustc --explain E0509`.
error: Could not compile `playground`.
I think you already understand the reasoning for this, but it's worth repeating. If Drop is implemented for a struct, the destructor might have non-trivial interactions between the fields; they might not just be dropped independently. So that means that the struct has to stay as one coherent piece until it's dropped.
As an example, the Drop implementation for Rc<T> checks if there are any (strong) references to the data left and if there aren't, drops the underlying data. If the fields of Rc<T> (a pointer, a strong reference count and a weak reference count) were dropped separately, there would be no way to check how many strong references were left when dropping the pointer. There'd be no way to keep the underlying data around if there are still strong references.
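The Rc behavior described above can be observed with a small sketch. Payload and observe are hypothetical names used only for illustration; the Cell flag records when the payload's destructor actually runs.

```rust
use std::cell::Cell;
use std::rc::Rc;

// Sets a flag when it is dropped, so the moment of destruction is visible.
struct Payload<'a> {
    dropped: &'a Cell<bool>,
}

impl Drop for Payload<'_> {
    fn drop(&mut self) {
        self.dropped.set(true);
    }
}

// Returns (strong count with two handles, dropped after the first handle
// goes away, dropped after the second handle goes away).
fn observe() -> (usize, bool, bool) {
    let dropped = Cell::new(false);
    let first = Rc::new(Payload { dropped: &dropped });
    let second = Rc::clone(&first);
    let strong = Rc::strong_count(&first);

    drop(first); // one strong reference is still alive
    let after_first = dropped.get();

    drop(second); // last strong reference: the payload is dropped now
    let after_second = dropped.get();

    (strong, after_first, after_second)
}

fn main() {
    println!("{:?}", observe());
}
```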
As you guessed, in the case where Drop is implemented, you'd have to clone the field if you still wanted to consume it.
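As a sketch of what that looks like in practice: cloning works, and when the struct does not need the field afterwards, std::mem::take is one alternative that swaps in an empty String and hands out the original without copying it. The demo function is a hypothetical helper for illustration, and consume_string returns the length here only so the result can be checked.

```rust
struct MyObject {
    a: String,
    b: String,
    c: String,
}

impl Drop for MyObject {
    fn drop(&mut self) {
        // All three fields are still present here, possibly emptied.
        println!("dropping MyObject: {:?} {:?} {:?}", self.a, self.b, self.c);
    }
}

fn consume_string(string: String) -> usize {
    string.len()
}

// Returns (length seen via clone, length seen via take, whether `b` is now empty).
fn demo() -> (usize, usize, bool) {
    let mut object = MyObject {
        a: "a".to_string(),
        b: "bb".to_string(),
        c: "ccc".to_string(),
    };

    // Option 1: clone the field; `object.b` keeps its contents.
    let cloned_len = consume_string(object.b.clone());

    // Option 2: take the field, leaving `String::default()` (empty) behind.
    let taken_len = consume_string(std::mem::take(&mut object.b));

    (cloned_len, taken_len, object.b.is_empty())
}

fn main() {
    println!("{:?}", demo());
}
```

Wrapping the field in Option<String> and calling take() on it follows the same idea when an empty string is not a sensible placeholder.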

What is the correct way to declare a structure, or just a variable, in Rust that can be passed to C code expecting a pointer?

I'm having a hard time wrapping my head around declaring mutable (or pointer) variables and interacting with C code through FFI. I've been playing with this for most of the day and have found conflicting examples due to how quickly Rust is developing. :)
The situation is like this: I have a C function which takes in a pointer to a struct, this struct has fields that are ints and char *s. My understanding is that I need to declare a similar struct in Rust to pass to the extern C function.
Here are my example files I've written while trying to figure this out:
main.rs
extern crate libc;
struct testStruct {
an_int: libc::c_int,
a_string: *mut libc::c_char
}
extern {
fn start_test(test: *mut testStruct) -> libc::c_int;
}
fn main() {
// println!("Hello, world!");
let test_struct = testStruct { an_int: 1, a_string: "hello" };
start_test(&mut test_struct);
}
--
test_file.c
#include <stdio.h>
#include "test_file.h"
struct test_struct {
int an_int;
char *a_string;
};
int start_client(struct test_struct *test) {
printf("Test function!\n");
return 0;
}
Obviously the actual code is more complex, I'm just trying to get a basic example working to understand how mutability/pointers work in Rust with FFI.
What is the correct way to declare a structure, or just a variable, in Rust that can be passed to C code expecting a pointer?
The memory layout of a struct is undefined (the compiler is allowed to reorder fields, for instance) unless you add the #[repr(C)] attribute to the struct. This attribute gives the struct a layout compatible with C.
#[repr(C)]
struct TestStruct {
an_int: libc::c_int,
a_string: *mut libc::c_char
}
Using a raw pointer in the struct works fine, but we can do better. There are two other important types in Rust that are only composed of a pointer: borrowed pointers (&'a T or &'a mut T) and Box<T>. You can use these types instead of *const T or *mut T to make it clear that the pointer borrows an existing value (which enables the compiler to validate that the pointer doesn't outlive its referent) or points to an object on the heap that should be dropped when the pointer (or the struct containing it) goes out of scope. However, be careful with Box<T>, since you could accidentally free a value while the C code still has a pointer to the value.
#[repr(C)]
struct TestStruct<'a> {
an_int: libc::c_int,
a_string: &'a mut libc::c_char
}
Another thing to watch out for is the use of fat pointers. In Rust, some pointer types are fat, i.e. they carry additional data along with the pointer. For example, slices (*const [T], *mut [T], &'a [T], &'a mut [T]) can be thought of as a struct or tuple containing a pointer to the first item and the number of items in the slice (a usize); trait objects (*const T, *mut T, &'a T, &'a mut T where T is the name of a trait) are composed of a pointer to the object and a pointer to the virtual method table for the trait implementation. You should avoid using these types when defining a Rust struct matching a C struct.
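The difference between thin and fat pointers can be checked with size_of. This is a small sketch; pointer_sizes is a hypothetical helper, and current Rust spells trait objects with the dyn keyword (&dyn Trait).

```rust
use std::fmt::Debug;
use std::mem::size_of;

// Sizes of two thin pointers followed by two fat pointers, in bytes.
fn pointer_sizes() -> [usize; 4] {
    [
        size_of::<&u8>(),        // thin: just an address
        size_of::<Box<u8>>(),    // thin: just an address
        size_of::<&[u8]>(),      // fat: address + length
        size_of::<&dyn Debug>(), // fat: address + vtable pointer
    ]
}

fn main() {
    println!("{:?}", pointer_sizes());
}
```

Only the thin ones match the size of a C pointer, which is why slices and trait objects should not appear as fields of a struct that mirrors a C struct.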
You can find more information on using Rust's FFI in the FFI section of the Rust book.
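Putting the pieces together, here is a hedged sketch of how the struct from the question could be filled in and passed by pointer. It uses std::os::raw instead of the libc crate so that it needs no dependencies, and start_test is a stand-in written in Rust (an assumption, so the example links on its own); with the real library it would be declared in an extern block instead. The char * field is produced with CString, since a Rust string literal is not NUL-terminated.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::{c_char, c_int};

#[repr(C)]
pub struct TestStruct {
    an_int: c_int,
    a_string: *mut c_char,
}

// Stand-in for the C function: returns an_int plus the string length.
extern "C" fn start_test(test: *mut TestStruct) -> c_int {
    let test = unsafe { &*test };
    let len = unsafe { CStr::from_ptr(test.a_string) }.to_bytes().len();
    test.an_int + len as c_int
}

fn run() -> c_int {
    // CString appends the trailing NUL byte that C expects.
    let text = CString::new("hello").expect("no interior NUL bytes");
    let mut test_struct = TestStruct {
        an_int: 1,
        a_string: text.into_raw(), // hand the buffer over as a raw pointer
    };

    // `&mut test_struct` coerces to `*mut TestStruct` at the call site.
    let result = start_test(&mut test_struct);

    // Take the buffer back so that Rust frees it.
    drop(unsafe { CString::from_raw(test_struct.a_string) });
    result
}

fn main() {
    println!("{}", run());
}
```

The variable has to be declared with let mut for &mut test_struct to be allowed, and calling a function from a real extern block additionally requires an unsafe block.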

Correct idiom for freeing repr(C) structs using Drop trait

This code works fine but gives a compiler warning on Rust nightly (1.2)
#[repr(C)]
struct DbaxCell {
cell: *const c_void
}
#[link(name="CDbax", kind="dylib")]
extern {
fn new_dCell(d: c_double) -> *const c_void;
fn deleteCell(c: *const c_void);
}
impl DbaxCell {
fn new(x: f64) -> DbaxCell {
unsafe {
DbaxCell { cell: new_dCell(x) }
}
}
}
impl Drop for DbaxCell {
fn drop(&mut self) {
unsafe {
deleteCell(self.cell);
}
}
}
It links to a C library and creates/deletes cell objects correctly. However it gives a warning
src\lib.rs:27:1: 33:2 warning: implementing Drop adds hidden state to types, possibly conflicting with `#[repr(C)]`, #[warn(drop_with_repr_extern)] on by default
\src\lib.rs:27 impl Drop for DbaxCell {
\src\lib.rs:28 fn drop(&mut self) {
\src\lib.rs:29 unsafe {
\src\lib.rs:30 deleteCell(self.cell);
\src\lib.rs:31 }
\src\lib.rs:32 }
What is the right way to do this to ensure that these DbaxCells are cleaned up correctly and no warning is given?
I think you are conflating two concepts. A struct should be repr(C) if you wish for the layout of the struct to directly correspond to the layout of the struct as a C compiler would lay it out. That is, it has the same in-memory representation.
However, you don't need that if you are just holding a raw pointer, and are not going to pass the holding structure back to C. The short solution in this case is "remove repr(C)".
To explain a bit more about the error...
implementing Drop adds hidden state to types, possibly conflicting with #[repr(C)]
This was discussed in issue 24585. When an object is dropped, a hidden flag (the "state") is set that indicates that the object has been dropped, preventing multiple drops from occurring. However, hidden bits mean that what you see in Rust does not correspond to what the bytes of the struct would look like in C, negating the purpose of the repr(C).
As cribbed from @bluss:
Low level programmers, don't worry: future Rust will remove this drop flag entirely.
And
Use repr(C) to pass structs in FFI, and use Drop on "regular Rust" structs if you need to. If you need both, embed the repr(C) struct inside the regular struct.
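The removal of the drop flag mentioned in the first quote has since happened. On current compilers a struct that implements Drop carries no hidden field, which a quick size check suggests. This is a sketch: the struct mirrors DbaxCell from the question without the C calls, and sizes is a hypothetical helper.

```rust
use std::ffi::c_void;
use std::mem::size_of;

struct DbaxCell {
    cell: *const c_void,
}

impl Drop for DbaxCell {
    fn drop(&mut self) {
        // deleteCell(self.cell) would be called here in the real binding.
        let _ = self.cell;
    }
}

// Size of the wrapper next to the size of the bare pointer it holds.
fn sizes() -> (usize, usize) {
    (size_of::<DbaxCell>(), size_of::<*const c_void>())
}

fn main() {
    println!("{:?}", sizes());
}
```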
Imagine we had a library that exposes a C struct with two 8-bit numbers, and methods that take and return that struct:
typedef struct {
char a;
char b;
} tuple_t;
tuple_t tuple_increment(tuple_t position);
In this case, you would definitely want to mimic that struct and match the C representation in Rust:
#[repr(C)]
struct Tuple {
a: libc::c_char,
b: libc::c_char,
}
However, if the library returned pointers to the struct, and you never need to poke into it (the structure is opaque) then you don't need to worry about repr(C):
void tuple_increment(tuple_t *position);
Then you can just use that pointer and implement Drop:
struct TuplePointer(*mut libc::c_void);
impl Drop for TuplePointer {
fn drop(&mut self) {
// Call the appropriate free function from the library
}
}
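The advice to embed the repr(C) struct inside a regular struct could look roughly like the following sketch. OwnedTuple, tuple_release and the RELEASED counter are hypothetical names; tuple_release stands in for whatever cleanup function the C library provides, and the counter exists only so the effect of Drop can be observed.

```rust
use std::os::raw::c_char;
use std::sync::atomic::{AtomicUsize, Ordering};

// Mirrors the C layout; this is the type that crosses the FFI boundary.
#[repr(C)]
#[derive(Clone, Copy)]
pub struct Tuple {
    pub a: c_char,
    pub b: c_char,
}

// Counts how often the stand-in cleanup function has run.
static RELEASED: AtomicUsize = AtomicUsize::new(0);

// Hypothetical stand-in for a C cleanup function.
extern "C" fn tuple_release(_tuple: Tuple) {
    RELEASED.fetch_add(1, Ordering::SeqCst);
}

// Regular Rust struct: owns the FFI value and carries the Drop impl.
pub struct OwnedTuple {
    inner: Tuple,
}

impl OwnedTuple {
    pub fn new(a: c_char, b: c_char) -> OwnedTuple {
        OwnedTuple { inner: Tuple { a, b } }
    }

    /// The plain `repr(C)` value to hand to C functions.
    pub fn as_raw(&self) -> Tuple {
        self.inner
    }
}

impl Drop for OwnedTuple {
    fn drop(&mut self) {
        tuple_release(self.inner);
    }
}

// Creates and drops one wrapper, returning how many releases that caused.
fn release_count_after_one_drop() -> usize {
    let before = RELEASED.load(Ordering::SeqCst);
    let owned = OwnedTuple::new(1, 2);
    let _raw = owned.as_raw();
    drop(owned);
    RELEASED.load(Ordering::SeqCst) - before
}

fn main() {
    println!("{}", release_count_after_one_drop());
}
```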

Working with c_void in an FFI

I am struggling with passing a struct through an FFI that accepts a void pointer and reading it back on the other end.
The library in question is libtsm, a terminal state machine. It allows you to feed input and then find out in which state a terminal would be after the input.
It declares its draw function as:
pub fn tsm_screen_draw(con: *tsm_screen, draw_cb: tsm_screen_draw_cb, data: *mut c_void) -> tsm_age_t;
where tsm_screen_draw_cb is a callback to be implemented by the library user, with the signature:
pub type tsm_screen_draw_cb = extern "C" fn(
con: *tsm_screen,
id: u32,
ch: *const uint32_t,
len: size_t,
width: uint,
posx: uint,
posy: uint,
attr: *tsm_screen_attr,
age: tsm_age_t,
data: *mut c_void
);
The important part here is the data parameter. It allows the user to pass through a pointer to a self-implemented state, to manipulate it and use it after drawing. Given a simple struct:
struct State {
state: int
}
how would I do that properly? I am unsure how to properly cast the pointer to the struct to void and back.
You can't cast a struct to c_void, but you can cast a reference to the struct to *mut c_void and back using some pointer casts:
extern "C" fn my_callback(con: *tsm_screen, ..., data: *mut c_void) {
// unsafe is needed because we dereference a raw pointer here
let data: &mut State = unsafe { &mut *(data as *mut State) };
println!("state: {}", data.state);
data.state = 10;
}
// ...
let mut state = State { state: 20 };
let state_ptr: *mut c_void = &mut state as *mut _ as *mut c_void;
tsm_screen_draw(con, my_callback, state_ptr);
It is also possible to use the std::mem::transmute() function to cast between pointers, but it is a much more powerful tool than is really needed here and should be avoided when possible.
Note that you have to be extra careful when casting a raw pointer back to a reference. If tsm_screen_draw calls its callback on another thread, or stores it in a global variable so that another function calls it later, then the local variable state may already have gone out of scope by the time the callback runs. The callback would then dereference a dangling pointer, which is undefined behavior and will most likely crash your program.
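The round trip can be tried without libtsm using the sketch below. It is written in current Rust syntax (i32 instead of int), and fake_draw and DrawCb are hypothetical stand-ins for tsm_screen_draw and its callback type, reduced to the data parameter. The stand-in invokes the callback synchronously, which is the case where pointing at a local variable is fine.

```rust
use std::ffi::c_void;

struct State {
    state: i32,
}

// Reduced callback type: only the user-data pointer is kept.
type DrawCb = extern "C" fn(data: *mut c_void);

// Hypothetical stand-in for the C function; calls the callback immediately.
fn fake_draw(draw_cb: DrawCb, data: *mut c_void) {
    draw_cb(data);
}

extern "C" fn my_callback(data: *mut c_void) {
    // Cast the void pointer back to the type it was created from.
    let state: &mut State = unsafe { &mut *(data as *mut State) };
    state.state += 1;
}

fn run() -> i32 {
    let mut state = State { state: 20 };
    // &mut State -> *mut State -> *mut c_void
    let state_ptr = &mut state as *mut State as *mut c_void;
    fake_draw(my_callback, state_ptr);
    state.state
}

fn main() {
    println!("{}", run());
}
```

If the callback could run after the calling function returns, one option is to allocate the state with Box::new, pass Box::into_raw(..) as the data pointer, and reclaim it later with Box::from_raw once the library no longer uses it.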
