I almost asked the same question the other day, but in the context of C++.
I try to replicate destructors and constructors in my C programming. That means for every object or struct there is an initialization function and a destruct function which frees all of the object's resources, like so:
struct MyObject {
struct string a;
struct string b;
struct string c;
};
void ConstructMyObject(struct MyObject *obj) {
ConstructString(&obj->a);
ConstructString(&obj->b);
ConstructString(&obj->c);
}
void DestructMyObject(struct MyObject *obj) {
DestructString(&obj->a);
DestructString(&obj->b);
DestructString(&obj->c);
}
The destruct function is called at the end of every function scope, just like in Rust, except that I put it there manually instead of the compiler doing the job for me. In DestructMyObject I call the destructor of every struct string member, because struct string has its own destruct function, written just like the one for struct MyObject. So everything that struct MyObject has allocated will be freed.
Example with my problem:
int main(void) {
struct MyObject Object1;
ConstructMyObject(&Object1);
...
...
...
TransferOwnershipFunction(Object1.b); /*takes a struct string object as argument*/
...
...
...
DestructMyObject(&Object1);
return 0;
}
I transferred ownership of a member (struct string b) of Object1 to another function. But struct string b will be freed by the main function, because I have the rule that when an object goes out of scope I call its destruct function. I don't want the main function to free this resource: TransferOwnershipFunction(...) is now responsible for freeing this member of Object1. How does the Rust compiler deal with such situations? In Rust, would I have to make a clone of string b?
The Rust compiler is smart enough to see when only a single field of a struct is consumed. Only that specific field has its ownership transferred and the remaining fields are dropped at the end of the scope (or otherwise consumed). This can be seen in the following example.
struct MyObject {
a: String,
b: String,
c: String,
}
fn consume_string(_string: String) {}
fn main() {
let object = MyObject {
a: "".to_string(),
b: "".to_string(),
c: "".to_string(),
};
consume_string(object.b);
// We can still access object.a and object.c
println!("{}", object.a);
println!("{}", object.c);
// but not object.b
// println!("{}", object.b);
}
(playground)
However, if the struct has a non-trivial destructor, i.e., implements the Drop trait, then this can't happen. Trying to move a single field of the struct will result in a compiler error, as seen below.
struct MyObject {
a: String,
b: String,
c: String,
}
// This is new
impl Drop for MyObject {
fn drop(&mut self) {
println!("dropping MyObject");
}
}
fn consume_string(_string: String) {}
fn main() {
let object = MyObject {
a: "".to_string(),
b: "".to_string(),
c: "".to_string(),
};
consume_string(object.b);
}
Attempting to compile this gives the error
error[E0509]: cannot move out of type `MyObject`, which implements the `Drop` trait
--> src/main.rs:22:20
|
22 | consume_string(object.b);
| ^^^^^^^^
| |
| cannot move out of here
| move occurs because `object.b` has type `std::string::String`, which does not implement the `Copy` trait
error: aborting due to previous error
For more information about this error, try `rustc --explain E0509`.
error: Could not compile `playground`.
(playground) ([E0509])
I think you already understand the reasoning for this, but it's worth repeating. If Drop is implemented for a struct, the destructor might have non-trivial interactions between the fields; they might not just be dropped independently. So that means that the struct has to stay as one coherent piece until it's dropped.
As an example, the Drop implementation for Rc<T> checks if there are any (strong) references to the data left and if there aren't, drops the underlying data. If the fields of Rc<T> (a pointer, a strong reference count and a weak reference count) were dropped separately, there would be no way to check how many strong references were left when dropping the pointer. There'd be no way to keep the underlying data around if there are still strong references.
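To make the Rc<T> point concrete, here is a small sketch (not part of the original answer; the Noisy type and the drops counter are made up for illustration). It counts how often the shared value is dropped and shows that releasing one handle leaves the data alive, while releasing the last one drops it. That decision needs the pointer and the count together.

```rust
use std::cell::Cell;
use std::rc::Rc;

// Hypothetical helper type: bumps a counter each time a value is dropped.
struct Noisy<'a>(&'a Cell<u32>);

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

// Returns the drop count after releasing the first handle
// and after releasing the second (last) handle.
fn drops_after_each_release() -> (u32, u32) {
    let drops = Cell::new(0);
    let first = Rc::new(Noisy(&drops));
    let second = Rc::clone(&first);
    assert_eq!(Rc::strong_count(&first), 2);

    drop(first); // one strong reference remains, so the data stays alive
    let after_first = drops.get();

    drop(second); // last strong reference is gone, so the data is dropped
    let after_second = drops.get();

    (after_first, after_second)
}

fn main() {
    let (after_first, after_second) = drops_after_each_release();
    println!("drops after first release: {}", after_first);
    println!("drops after second release: {}", after_second);
}
```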
As you guessed, in the case where Drop is implemented, you'd have to clone the field if you still wanted to consume it.
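Cloning is not the only option, though. As a sketch that goes beyond the answer above (take_b is a hypothetical helper, not something from the question), std::mem::take can move the field out through a mutable reference and leave an empty String in its place. The struct stays whole, so its Drop implementation still runs on valid data.

```rust
struct MyObject {
    a: String,
    b: String,
    c: String,
}

impl Drop for MyObject {
    fn drop(&mut self) {
        // All three fields are still valid here; `b` may simply be empty.
        println!("dropping MyObject (a={:?}, b={:?}, c={:?})", self.a, self.b, self.c);
    }
}

fn consume_string(string: String) -> usize {
    string.len()
}

// Hypothetical helper: moves `b` out and leaves `String::default()` behind.
fn take_b(object: &mut MyObject) -> String {
    std::mem::take(&mut object.b)
}

fn main() {
    let mut object = MyObject {
        a: "a".to_string(),
        b: "hello".to_string(),
        c: "c".to_string(),
    };
    let len = consume_string(take_b(&mut object));
    println!("consumed {} bytes", len);
}
```

Wrapping the field in an Option<String> and calling take() on it works along the same lines when an empty value would be ambiguous.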
I would like to store the pointer to a Rust struct inside its member C struct. Is it required that the struct be enclosed in an Rc rather than a Box?
The reason I'm asking is because although there is shared ownership here, the pointer is only ever accessed from within unsafe member functions of the Rust struct and the C struct's lifetime is tied to that of the enclosing Rust struct.
Here's an example ->
// C struct with constructor/destructor
struct c_foo {
void* internal; // pointer to rust `struct`
/* ... */
};
struct c_foo* c_foo_new();
void c_foo_free(struct c_foo* foo);
// FFI struct generated by bindgen
#[repr(C)]
#[derive(Debug, Copy, Clone)]
pub struct Foo {
pub internal: *mut libc::c_void, // pointer to rust `struct`
/* ... */
}
// Rust struct that wraps the FFI struct
struct Bar {
ptr: *mut Foo, // private
/* ... */
}
impl Bar {
fn new() -> Box<Bar> {
unsafe {
let mut bar = Box::new(Bar { ptr: c_foo_new() });
// Take the address of the heap-allocated Bar (`*bar`), not of the local Box variable.
let bar_ptr: *mut ffi::c_void = &mut *bar as *mut Bar as *mut ffi::c_void;
(*bar.ptr).internal = bar_ptr;
bar
}
}
}
impl Drop for Bar {
fn drop(&mut self) {
unsafe {
c_foo_free(self.ptr);
}
}
}
So there's a C struct c_foo with a void * that stores a reference to the Rust struct Bar. Foo is just the bindgen generated Rust wrapper for c_foo.
Do I need a Box or Rc in the Bar::new() function?
To clarify, there is no shared ownership on the Rust side. There is shared ownership between the Rust and C sides, so I guess there is no benefit in using an Rc type.
This comment from user "E_net4 is renamed all the time" answers my question:
use Rc only if you need shared ownership in Rust code. Since C does not retain the semantics of this pointer type, C code needs to handle boundaries manually regardless.
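A minimal sketch of the Box-only variant follows. To keep it runnable without linking C code, foo_new and foo_free are Rust stand-ins that I made up for c_foo_new and c_foo_free, and Foo is written by hand rather than generated by bindgen. The point it illustrates is that the heap address of the boxed Bar stays the same when the Box itself is moved, so a plain Box is enough for the back-pointer.

```rust
use std::ffi::c_void;

// Hand-written stand-in for the bindgen-generated struct.
#[repr(C)]
struct Foo {
    internal: *mut c_void, // back-pointer to the owning Rust struct
}

// Hypothetical stand-ins for the C functions c_foo_new / c_foo_free.
fn foo_new() -> *mut Foo {
    Box::into_raw(Box::new(Foo { internal: std::ptr::null_mut() }))
}

unsafe fn foo_free(foo: *mut Foo) {
    unsafe { drop(Box::from_raw(foo)) }
}

struct Bar {
    ptr: *mut Foo,
}

impl Bar {
    fn new() -> Box<Bar> {
        let mut bar = Box::new(Bar { ptr: foo_new() });
        // Address of the heap allocation, which does not change when the Box moves.
        let bar_ptr: *mut Bar = &mut *bar;
        unsafe { (*bar.ptr).internal = bar_ptr as *mut c_void };
        bar
    }

    // Reads the stored back-pointer (without dereferencing it).
    fn back_pointer(&self) -> *const Bar {
        unsafe { (*self.ptr).internal as *const Bar }
    }
}

impl Drop for Bar {
    fn drop(&mut self) {
        unsafe { foo_free(self.ptr) }
    }
}

fn main() {
    let bar = Bar::new();
    let moved = bar; // moving the Box does not move the Bar it points to
    assert!(std::ptr::eq(moved.back_pointer(), &*moved));
    println!("back-pointer still matches after the move");
}
```

The sketch only compares addresses. Code that actually dereferences the back-pointer has to respect Rust's aliasing rules while the Box is in use elsewhere.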
#define LENGTH 6
typedef char data_t[LENGTH];
struct foo {
const data_t data;
...
};
...
void bar(data_t data) {
printf("%.6s\n", data);
struct foo myfoo = {*data};
printf("%.6s\n", myfoo.data);
}
I'm trying to have this struct which holds directly the data I'm interested in, sizeof(foo) == 6+the rest, not sizeof(foo) == sizeof(void*)+the rest. However I can't find a way to initialize a struct of type foo with a data_t. I think maybe I could remove the const modifier from the field and use memcpy but I like the extra safety and clarity.
I don't get any compile errors but when I run the code I get
123456
1??
so the copy didn't work properly I think.
This is for an arduino (or similar device) so I'm trying to keep it to very portable code.
Is it just not possible?
EDIT: removing the const modifier on the data_t field doesn't seem to help.
It is possible to do this, for some cost >=0.
typedef struct
{
char c[LENGTH];
} data_t; // this struct is freely copyable
struct foo
{
const data_t data; // but this data member is not
int what;
};
void foo (char* x) {
data_t d; // declare freely copyable struct instance
memcpy(d.c, x, sizeof(d.c)); // memcpy it
struct foo foo = { d, 42 }; // initialise struct instance with const member
...
}
Some compilers (e.g. clang) are even able to optimise away the redundant copying (from x to d.c and then from d to foo.data ⇒ from x straight to foo.data). Others (gcc I'm looking at you) don't seem to be able to achieve this.
If you pass around pointers to data_t rather than straight char pointers, you won't need this additional memcpy step. OTOH in order to access the char array inside foo you need another level of member access (.data.c instead of just .data; this has no runtime cost though).
It's impossible to do it in a standard compliant way.
Due to its being const, const char data[6]; must be initialized to be usable, and it may only be initialized statically (static objects with no initializer get automatically zeroed), with a string literal, or with a brace-enclosed initializer list. You cannot initialize it with a pointer or another array.
If I were you, I would get rid of the const, document that .data shouldn't be changed post-initialization, and then use memcpy to initialize it.
(const on struct members doesn't work very well in my opinion. It effectively prevents you from being able to have initializer functions, and while C++ gets around the problem a little bit by having special language support for its constructor functions, the problem still remains if the const members are arrays).
I'm having a hard time wrapping my head around declaring mutable (or pointer) variables and interacting with C code through FFI. I've been playing with this for most of the day and have found conflicting examples due to how quickly Rust is developing. :)
The situation is like this: I have a C function which takes in a pointer to a struct, this struct has fields that are ints and char *s. My understanding is that I need to declare a similar struct in Rust to pass to the extern C function.
Here are my example files I've written while trying to figure this out:
main.rs
extern crate libc;
struct testStruct {
an_int: libc::c_int,
a_string: *mut libc::c_char
}
extern {
fn start_test(test: *mut testStruct) -> libc::c_int;
}
fn main() {
// println!("Hello, world!");
let test_struct = testStruct { an_int: 1, a_string: "hello" };
start_test(&mut test_struct);
}
--
test_file.c
#include <stdio.h>
#include "test_file.h"
struct test_struct {
int an_int;
char *a_string;
};
int start_test(struct test_struct *test) {
printf("Test function!\n");
return 0;
}
Obviously the actual code is more complex, I'm just trying to get a basic example working to understand how mutability/pointers work in Rust with FFI.
What is the correct way to declare a structure, or just a variable, in Rust that can be passed to C code expecting a pointer?
The memory layout of a struct is unspecified (the compiler is allowed to reorder fields, for instance) unless you add the #[repr(C)] attribute to the struct. This attribute gives the struct a layout compatible with C.
#[repr(C)]
struct TestStruct {
an_int: libc::c_int,
a_string: *mut libc::c_char
}
Using a raw pointer in the struct works fine, but we can do better. There are two other important types in Rust that are only composed of a pointer: borrowed pointers (&'a T or &'a mut T) and Box<T>. You can use these types instead of *const T or *mut T to make it clear that the pointer borrows an existing value (which also lets the compiler check that the pointer doesn't outlive its referent) or points to an object on the heap that should be dropped when the pointer (or the struct containing it) goes out of scope. However, be careful with Box<T>, since you could accidentally free a value while the C code still has a pointer to it.
#[repr(C)]
struct TestStruct<'a> {
an_int: libc::c_int,
a_string: &'a mut libc::c_char
}
Another thing to watch out for is the use of fat pointers. In Rust, some pointer types are fat, i.e. they carry additional data along with the pointer. For example, slices (*const [T], *mut [T], &'a [T], &'a mut [T]) can be thought of as a struct or tuple containing a pointer to the first item and the number of items in the slice (a usize); trait objects (*const T, *mut T, &'a T, &'a mut T where T is the name of a trait, written dyn T in current Rust) are composed of a pointer to the object and a pointer to the virtual method table for the trait implementation. You should avoid using these types when defining a Rust struct matching a C struct.
You can find more information on using Rust's FFI in the FFI section of the Rust book.
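As a rough sketch of how the pieces fit together, the example below fills the char * field from a CString and passes the struct by mutable reference. To keep it runnable without a C toolchain, start_test is a Rust stand-in with a C ABI that I made up (it returns an_int plus the string length); call_start_test is also a hypothetical helper. In real code, start_test would be declared in an extern block and linked from C.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::{c_char, c_int};

#[repr(C)]
struct TestStruct {
    an_int: c_int,
    a_string: *mut c_char,
}

// Stand-in for the C function, so the sketch runs without linking C code.
unsafe extern "C" fn start_test(test: *mut TestStruct) -> c_int {
    unsafe {
        let len = CStr::from_ptr((*test).a_string).to_bytes().len();
        (*test).an_int + len as c_int
    }
}

// Hypothetical helper: builds the struct, calls the function, frees the string.
fn call_start_test(an_int: c_int, text: &str) -> c_int {
    // CString appends the NUL terminator that C expects; into_raw hands out the pointer.
    let raw = CString::new(text)
        .expect("text must not contain NUL bytes")
        .into_raw();
    let mut test_struct = TestStruct { an_int, a_string: raw };

    // `&mut TestStruct` coerces to `*mut TestStruct` at the call site.
    let result = unsafe { start_test(&mut test_struct) };

    // Take ownership back so the string is freed by Rust, not leaked.
    drop(unsafe { CString::from_raw(raw) });
    result
}

fn main() {
    println!("start_test returned {}", call_start_test(1, "hello"));
}
```

A plain "hello" literal cannot be used for the field directly: it is a &str, which is a fat pointer and is not NUL-terminated.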
I have a project that must be in C (just to avoid the use C++ arguments).
The project depends on virtual tables and pointers to implement polymorphism.
I'm stuck, however, on implementing super constructors for multi-level inheritance.
An example structure is:
Base Object
/\ /\
Constructed Object Simple Object
/\
SpecificConstructed
All objects have a name and a class.
Constructed objects may have a list of sub objects for example.
A simple object may only add a single value.
Base Object is just defined as:
struct _object {
struct _class *class;
char *name;
};
Class is where the virtual table is:
struct _class {
struct _class *super;
char *name;
size_t size;
void* (*init)(void *_this, char *name);
...
};
A constructed object is:
struct _constructed_object {
struct _object base;
void* components; //a list of sub objects for example
};
A sample simple object is:
struct _simple_object {
struct _object base;
unsigned char value; //a simple value for this specific type
};
So every object has a class, and classes can have supers, especially for SpecificConstructed -> Constructed.
The definitions I have:
struct _class base = {0, "Base", sizeof(struct _object), base_init};
struct _class constructed = {&base, "Constructed", sizeof(struct _constructed_object), constructed_init};
struct _class specific = {&constructed, "SpecificConstructed", sizeof(struct _constructed_object), specific_init};
struct _class simple = {&base, "SimpleOBject", sizeof(struct _simple_object), simple_init};
This definition allows me to create objects of specific classes using a function:
void *new(struct _class *a_class) {
...
struct _object *o = calloc(1, a_class->size);
o->class = a_class;
o = a_class->init(o);
return o;
}
The idea is if i do:
new(SpecificConstructed)
New would create the appropriate space (sizeof(struct _constructed_object)), it would call "specific_init", which in turn would call "constructed_init" (its super), which finally would call "base_init" (its super). However, the actual flow is specific_init, constructed_init, specific_init, constructed_init ...
The function I have for calling the super's initializer:
void* super_init(void* _this, char *name){
struct _object *o = (struct _object*)_this;
const struct _class *c = o->class;
const struct _class *s = c->super;
return (s && s->init) ? s->init(_this, name) : _this;
}
The simple -> base super call works, since I can just call the super's init.
But for the specific constructed object, calling super takes me to the constructed object's init, which is the correct step; then, instead of constructed_init sending me up to base_init, it sends me back to the specific_init call. I understand that this happens because I'm passing the same _this object, whose class is still specific, but I'm not sure how to fix it, or whether it's actually possible to fix.
I've read the Object Oriented C book, but it deals with one-level inheritance (Circle->Point), and the Metaclasses chapter just flew over my head. I've also looked at the Objective-C runtime to see how that handles it, but it also has metaclasses, which I can't comprehend at the moment.
super_init can't work like that: it needs to be told the class whose superclass it should invoke, otherwise (as you discovered) the immediate superclass's constructor ends up calling itself over and over. Since each class knows its parent, it can call the superclass's init directly. For example, specific_init will call constructed_init, which will in turn call base_init.
If you insist on a function to do that for you, you will have to give it the class so it can (trivially) invoke the constructor of the superclass. super in Python 2 is an example of such a design. Python 3 introduces a simpler-to-use super, but it required support from the compiler to figure out the correct class to pass to the underlying function.
This is awesome stuff, I did a bit of that kind of stuff in C in the early nineties, before moving on to C++.
Unfortunately, despite the fact that your question is quite long, it is still a bit vague because it is not showing us certain things, like what is a "constructed object", (why are you calling it like that,) what is the difference between "constructed object" and "simple object", and what is "the simple->base method call". Also, why the size? Also, I think that some sample code showing the actual problem with the invocation of the constructors is necessary.
The one thing that I can tell you right now about this design is that it strikes me as odd that you are storing a pointer to the constructor in the Virtual Method Table. In all object oriented languages that I know, (C++, Java, C#) constructors are never virtual; they look a lot more like "static" methods, which in C parlance are just plain link-by-name methods. This works fine, because every class has built-in, absolutely certain, unalterable knowledge of who its base class is.
Anyhow, chained constructor invocation is supposed to happen like this:
void basemost_init( struct basemost* this, char* name )
{
this->class = &basemost_class;
this->name = name;
...
}
void intermediate_init( struct intermediate* this, char* name )
{
basemost_init( &this->base, name );
this->class = &intermediate_class;
...
}
void descendant_init( struct descendant* this, char* name )
{
intermediate_init( &this->base, name );
this->class = &descendant_class;
...
}
Edit (after some clarifications)
If you want it to look cool at the allocation end, perhaps try something like this: (I am not sure how well I remember my C syntax, so please excuse any minor inaccuracies.)
struct descendant* new_descendant( char* name )
{
struct descendant* this = malloc( sizeof(struct descendant) );
descendant_init( this, name );
return this;
}
This way, you don't need a size anymore. Also, note that you can pass as many constructor arguments as you want, without being restricted to a fixed, predetermined number of arguments, (which I consider to be extremely important,) and without having to use variadic constructors.
You may also be able to achieve the same thing with a #define macro for all classes, if you promise to use consistent naming, so that the name of each constructor can always be computed as structname##_init, but I am not sure how to pass arbitrary constructor parameters after the this pointer through a macro in C. Anyhow, I prefer to avoid macros unless they are absolutely necessary, and in this case they are not.
This code works fine but gives a compiler warning on Rust nightly (1.2)
#[repr(C)]
struct DbaxCell {
cell: *const c_void
}
#[link(name="CDbax", kind="dylib")]
extern {
fn new_dCell(d: c_double) -> *const c_void;
fn deleteCell(c: *const c_void);
}
impl DbaxCell {
fn new(x: f64) -> DbaxCell {
unsafe {
DbaxCell { cell: new_dCell(x) }
}
}
}
impl Drop for DbaxCell {
fn drop(&mut self) {
unsafe {
deleteCell(self.cell);
}
}
}
It links to a C library and creates/deletes cell objects correctly. However it gives a warning
src\lib.rs:27:1: 33:2 warning: implementing Drop adds hidden state to types, possibly conflicting with `#[repr(C)]`, #[warn(drop_with_repr_extern)] on by default
\src\lib.rs:27 impl Drop for DbaxCell {
\src\lib.rs:28 fn drop(&mut self) {
\src\lib.rs:29 unsafe {
\src\lib.rs:30 deleteCell(self.cell);
\src\lib.rs:31 }
\src\lib.rs:32 }
What is the right way to do this to ensure that these DbaxCells are cleaned up correctly and no warning is given?
I think you are conflating two concepts. A struct should be repr(C) if you wish for the layout of the struct to directly correspond to the layout of the struct as a C compiler would lay it out. That is, it has the same in-memory representation.
However, you don't need that if you are just holding a raw pointer, and are not going to pass the holding structure back to C. The short solution in this case is "remove repr(C)".
To explain a bit more about the error...
implementing Drop adds hidden state to types, possibly conflicting with #[repr(C)]
This was discussed in issue 24585. When an object is dropped, a hidden flag (the "state") is set that indicates that the object has been dropped, preventing multiple drops from occurring. However, hidden bits mean that what you see in Rust does not correspond to what the bytes of the struct would look like in C, negating the purpose of the repr(C).
As cribbed from @bluss:
Low level programmers, don't worry: future Rust will remove this drop flag entirely.
And
Use repr(C) to pass structs in FFI, and use Drop on "regular Rust" structs if you need to. If you need both, embed the repr(C) struct inside the regular struct.
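The "embed" advice could look roughly like the sketch below. All names in it (RawCell, ManagedCell, DROPS) are made up for illustration: the inner struct carries the C-compatible layout and no destructor, while the outer, ordinary Rust struct owns it and implements Drop. The counter only stands in for whatever cleanup call the real library needs.

```rust
use std::mem::size_of;
use std::sync::atomic::{AtomicUsize, Ordering};

// Counts how many wrappers were dropped (stand-in for a real cleanup call).
static DROPS: AtomicUsize = AtomicUsize::new(0);

// C-compatible layout, safe to pass across FFI, no Drop implementation.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq)]
struct RawCell {
    value: f64,
}

// Regular Rust struct: owns the raw part and carries the destructor.
struct ManagedCell {
    raw: RawCell,
}

impl ManagedCell {
    fn new(value: f64) -> Self {
        ManagedCell { raw: RawCell { value } }
    }

    // Hands out a copy of the C-layout part, e.g. to pass to a C function.
    fn raw(&self) -> RawCell {
        self.raw
    }
}

impl Drop for ManagedCell {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

fn main() {
    {
        let cell = ManagedCell::new(1.5);
        println!("raw part: {:?} ({} bytes)", cell.raw(), size_of::<RawCell>());
    } // `cell` goes out of scope here and its Drop implementation runs
    println!("wrappers dropped: {}", DROPS.load(Ordering::SeqCst));
}
```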
Imagine we had a library that exposes a C struct with two 8-bit numbers, and methods that take and return that struct:
typedef struct {
char a;
char b;
} tuple_t;
tuple_t tuple_increment(tuple_t position);
In this case, you would definitely want to mimic that struct and match the C representation in Rust:
#[repr(C)]
struct Tuple {
a: libc::c_char,
b: libc::c_char,
}
However, if the library returned pointers to the struct, and you never need to poke into it (the structure is opaque) then you don't need to worry about repr(C):
void tuple_increment(tuple_t *position);
Then you can just use that pointer and implement Drop:
struct TuplePointer(*mut libc::c_void);
impl Drop for TuplePointer {
fn drop(&mut self) {
// Call the appropriate free function from the library
}
}