I'm trying to combine my Rust program with a library written in C in a more complex scenario.
The library provides this interface:
use std::os::raw::{c_char, c_void};
extern "C" {
pub fn register_function(
name: *const c_char, signature: *const c_char,
func_ptr: *mut c_void, attachment: *mut c_void,
);
}
The signature can be a string describing the arguments and return type of the function as 32 or 64 bit ints and floats (representations: b'i' = i32, b'I' = i64, b'f' = f32, b'F' = f64). The registered function gets called with an array of u64 (uint64_t) values which correspond to the arguments from the signature.
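To make the signature format concrete, here is a minimal sketch of how such a string could be parsed. The `(args)ret` layout is inferred from the `"(ii)i"` example used further down; `ValueType` and `parse_signature` are hypothetical names, and treating an empty return part as "no return value" is an assumption.

```rust
/// Value kinds named in the signature string (b'i', b'I', b'f', b'F').
#[derive(Debug, Clone, Copy, PartialEq)]
enum ValueType {
    I32,
    I64,
    F32,
    F64,
}

impl ValueType {
    fn from_byte(b: u8) -> Option<Self> {
        match b {
            b'i' => Some(ValueType::I32),
            b'I' => Some(ValueType::I64),
            b'f' => Some(ValueType::F32),
            b'F' => Some(ValueType::F64),
            _ => None,
        }
    }
}

/// Parses "(args)ret" into the argument types and an optional return type.
/// Assumption: an empty return part, as in "(ii)", means no return value.
fn parse_signature(sig: &str) -> Option<(Vec<ValueType>, Option<ValueType>)> {
    let rest = sig.strip_prefix('(')?;
    let (args, ret) = rest.split_once(')')?;
    // Any unknown byte in the argument list makes the whole parse fail.
    let args = args
        .bytes()
        .map(ValueType::from_byte)
        .collect::<Option<Vec<_>>>()?;
    let ret = match ret.as_bytes() {
        [] => None,
        [b] => Some(ValueType::from_byte(*b)?),
        _ => return None,
    };
    Some((args, ret))
}
```

With this, `argc` and the return type would no longer have to be filled in by hand when building a `MyAttachment`.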
I would like to abstract this registration and callback process, so that I can switch to another library in the future which provides a similar but different interface. My idea was to create a proxy function that is registered instead of the actual function. This would then also provide a custom context struct.
My own functions could look like this:
use std::boxed::Box;
use std::pin::Pin;
fn return_void(context: Pin<Box<MyAttachment>>) {
// ...
}
fn return_32(context: Pin<Box<MyAttachment>>, a: u32, b: u32) -> u32 {
context.important_stuff();
// ...
}
// floating point values would be nice, but are optional
fn return_64(a: i32, b: i64, c: f64) -> f64 {
// ...
}
MyAttachment is supposed to be the context and provide the proxy function, which receives an arbitrary number of arguments as an array:
use std::cmp;
use std::ffi::CString;
use std::slice;
#[derive(PartialEq)]
enum ReturnType {
VOID,
BITS32,
BITS64,
}
struct MyAttachment {
real_func_ptr: *mut c_void,
signature: String,
argc: u32,
pass_attachment: bool,
return_type: ReturnType,
}
impl MyAttachment {
pub fn important_stuff(&self) {
// ...
}
unsafe extern "C" fn function_proxy(attachment: *mut c_void, argv: *mut u64) {
// Given: attachment is the pointer to MyAttachment and argv is the array of arguments.
let this = attachment.cast::<Self>();
if this.is_null() || argv.is_null() {
// error handling
return;
}
let this = Pin::new_unchecked(Box::from_raw(this)); // restore
let args = slice::from_raw_parts_mut(
argv,
// There is at least one element in argv if the function is supposed to return a value,
// because we need to write our result there.
cmp::max(
match this.return_type {
ReturnType::VOID => 0,
ReturnType::BITS32 | ReturnType::BITS64 => 1,
},
this.argc as usize,
),
);
let func_ptr = this.real_func_ptr;
// I can get the argument types from the signature.
// TODO cast to correct pointer type. For example:
// case return_void: Fn(Pin<Box<MyAttachment>>)
// case return_32: Fn(Pin<Box<MyAttachment>>, u64, u64) -> u32
// case return_64: Fn(u64, u64, f64) -> f64
// TODO call it:
if this.return_type != ReturnType::VOID {
//args[0] = func_ptr(...args);
// or
//args[0] = func_ptr(this, ...args);
} else {
//func_ptr(...args);
}
Box::into_raw(Pin::into_inner_unchecked(this)); // delay dropping of this
}
}
fn main() {
// defining functions like:
let func1 = Box::pin(MyAttachment {
real_func_ptr: return_32 as *mut _,
signature: String::from("(ii)i"),
argc: 2, // inferred from signature
pass_attachment: true,
return_type: ReturnType::BITS32, // inferred from signature
});
let name = CString::new("return_32").unwrap();
let signature = CString::new(func1.signature.as_str()).unwrap();
// leak the raw pointer
let func1_ptr = Box::into_raw(unsafe { Pin::into_inner_unchecked(func1) });
let _func1 = unsafe { Pin::new_unchecked(Box::from_raw(func1_ptr)) }; // just for housekeeping
unsafe {
register_function(
name.as_ptr(),
signature.as_ptr(),
MyAttachment::function_proxy as *mut _,
func1_ptr as *mut _,
)
};
// ...
// somewhere here is my proxy called from C
// ...
// automatic cleanup of MyAttachment structs, because the Boxes are dropped
}
How do I fill these TODOs with code?
I have seen this in C code somewhere by using a generic function pointer and defining a fixed number of calls:
void (*func_ptr)();
if (argc == 0)
func_ptr();
else if (argc == 1)
func_ptr(argv[0]);
else if (argc == 2)
func_ptr(argv[0], argv[1]);
// ... and so on
But is there a solution to do this in Rust? (This only needs to work for x86_64/amd64)
Thanks in advance for reading all this and trying to help.
(I added the reflection tag, because this would be done via reflection if Rust had any)
==== edit
I have seen these related questions, but I don't think they apply here:
Call a raw address from Rust -> My type is not given at compile time
How do I pass each element of a slice as a separate argument to a variadic C function? -> My arguments are somewhat of fixed size and don't use a va_list
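For what it's worth, the C idea quoted above (one call per supported arity) can be mirrored in Rust by transmuting the untyped pointer to a concrete `extern "C"` function pointer type in each branch. This is only a sketch under stated assumptions: `call_dynamic`, `add2` and `answer` are hypothetical names, and it covers only functions whose arguments and return value are all 64-bit integers. Floating-point arguments are passed in different registers on x86_64, so each int/float combination would need its own function pointer type.

```rust
use std::ffi::c_void;
use std::mem::transmute;

/// Calls `func` with the first `argc` values of `args`.
///
/// # Safety
/// `func` must point to an `extern "C"` function taking exactly `argc`
/// 64-bit integer arguments and returning a 64-bit integer.
unsafe fn call_dynamic(func: *const c_void, args: &[u64], argc: usize) -> Option<u64> {
    if func.is_null() || args.len() < argc {
        return None;
    }
    // One arm per supported arity, like the if/else chain in the C version.
    let result = unsafe {
        match argc {
            0 => {
                let f: extern "C" fn() -> u64 = transmute(func);
                f()
            }
            1 => {
                let f: extern "C" fn(u64) -> u64 = transmute(func);
                f(args[0])
            }
            2 => {
                let f: extern "C" fn(u64, u64) -> u64 = transmute(func);
                f(args[0], args[1])
            }
            3 => {
                let f: extern "C" fn(u64, u64, u64) -> u64 = transmute(func);
                f(args[0], args[1], args[2])
            }
            _ => return None, // arity not covered by this sketch
        }
    };
    Some(result)
}

// Hypothetical target functions, used only to exercise the dispatcher.
extern "C" fn add2(a: u64, b: u64) -> u64 {
    a + b
}

extern "C" fn answer() -> u64 {
    42
}
```

Inside `function_proxy`, the result could then be written back to `args[0]`. A macro could generate the arms if many arities are needed.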
I have built a library in Rust and my goal is to run it in parallel with a SAT solver (built in C) and make the two exchange information. The database (where information is received, stored and sent) is in Rust.
My problem is that I need to pass the Receiver from crossbeam channel from Rust to C and from the C program call the function receive (which takes as input the Receiver struct). The function receive is then executed in my Rust code.
I have programmed the structs and function in Rust:
use crossbeam_channel::Receiver;
use std::boxed::Box;
#[repr(C)]
pub struct CReceiver(Receiver<Vec<i32>>);
impl CReceiver {
fn new(receiver: Receiver<Vec<i32>>) -> CReceiver {
CReceiver(receiver)
}
}
#[no_mangle]
pub extern "C" fn data_new(creceiver: CReceiver) -> *mut CReceiver {
Box::into_raw(Box::new(creceiver))
}
#[no_mangle]
pub extern "C" fn data_free(ptr: *mut CReceiver) {
if ptr.is_null() {
return;
}
unsafe {
Box::from_raw(ptr);
}
}
#[no_mangle]
pub extern "C" fn receive(ptr: *mut CReceiver) -> Vec<i32> {
unsafe {
let creceiver = Box::from_raw(ptr);
match creceiver.0.try_recv() {
Ok(received) => received,
Err(_) => Vec::new() // glucose can handle empty clauses
}
}
}
I am trying to wrap the Receiver in a CReceiver struct to be able to pass it over to C. When I try to generate the header with cbindgen I get:
WARN: Cannot find a mangling for generic path GenericPath { path: Path { name: "Receiver" }, export_name: "Receiver", generics: [Type(Path(GenericPath { path: Path { name: "Vec" }, export_name: "Vec", generics: [Type(Primitive(Integer { zeroable: true, signed: true, kind: B32 }))], ctype: None }))], ctype: None }. This usually means that a type referenced by this generic was incompatible or not found.
WARN: Can't find Vec. This usually means that this type was incompatible or not found.
WARN: Can't find Receiver. This usually means that this type was incompatible or not found.
IMPORTANT: I need to initialise the receiver in Rust and then pass it to C as the whole channel (together with the sender) is initialised and needed.
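One direction that may help is the opaque-pointer pattern: C only ever holds a pointer to the wrapper and never sees its layout, so neither Receiver nor Vec has to be representable in C. The sketch below makes several assumptions: it uses std::sync::mpsc instead of crossbeam_channel so that it is self-contained, `creceiver_into_raw` is a hypothetical helper, and `receive` is changed to copy into a caller-provided buffer because a Vec cannot be returned across the C boundary. Exported functions would additionally need the no_mangle attribute.

```rust
use std::sync::mpsc::Receiver;

/// Opaque handle: C only ever holds a pointer to it.
pub struct CReceiver(Receiver<Vec<i32>>);

/// Moves the receiver to the heap and returns a pointer that can be given to C.
pub fn creceiver_into_raw(receiver: Receiver<Vec<i32>>) -> *mut CReceiver {
    Box::into_raw(Box::new(CReceiver(receiver)))
}

/// Copies the next pending clause into `out` (capacity `cap` elements).
/// Returns the clause length, 0 if nothing is pending, or -1 on invalid
/// arguments or if the clause does not fit (the clause is dropped then).
pub unsafe extern "C" fn receive(ptr: *const CReceiver, out: *mut i32, cap: usize) -> isize {
    if ptr.is_null() || out.is_null() {
        return -1;
    }
    // Borrow the handle instead of calling Box::from_raw, so the receiver
    // is not freed at the end of every call.
    let creceiver = unsafe { &*ptr };
    match creceiver.0.try_recv() {
        Ok(clause) if clause.len() <= cap => {
            unsafe { std::ptr::copy_nonoverlapping(clause.as_ptr(), out, clause.len()) };
            clause.len() as isize
        }
        Ok(_) => -1,
        Err(_) => 0,
    }
}

/// Frees a handle created by `creceiver_into_raw`.
pub unsafe extern "C" fn data_free(ptr: *mut CReceiver) {
    if !ptr.is_null() {
        drop(unsafe { Box::from_raw(ptr) });
    }
}
```

The channel is still created in Rust; only the raw handle travels to the C side, which passes it back on every `receive` call.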
I'm writing an MQTT5 library. To send a packet, I need to know the size of the payload before writing the payload. My solution for determining the size has the following constraints, ordered by importance:
be easy to maintain
should not create copies of the data
should be fairly performant (avoid double calculations)
To determine the size I can do any of the following solutions:
do the calculations by hand, which is fairly annoying
hold a copy of the data to send in memory, which I want to avoid
build a std::iter::ExactSizeIterator for the payload, which itself consists of std::iter::Chains; this leads to ugly type signatures quickly unless you create wrapper types
I decided to go with option 3.
The example below shows my attempt at writing an MQTT String iterator. An MQTT String consists of two bytes holding the length of the string, followed by the data as UTF-8.
use std::iter::*;
use std::slice::Iter;
pub struct MQTTString<'a> {
chain: Chain<Iter<'a, u8>, Iter<'a, u8>>,
}
impl<'a> MQTTString<'a> {
pub fn new(s: &'a str) -> Self {
let u16_len = s.len() as u16;
let len_bytes = u16_len.to_be_bytes();
let len_iter = len_bytes.iter(); // len_bytes is borrowed here
let s_bytes = s.as_bytes();
let s_iter = s_bytes.iter();
let chain = len_iter.chain(s_iter);
MQTTString { chain }
}
}
impl<'a> Iterator for MQTTString<'a> {
type Item = &'a u8;
fn next(&mut self) -> Option<&'a u8> {
self.chain.next()
}
}
impl<'a> ExactSizeIterator for MQTTString<'a> {}
pub struct MQTTStringPait<'a> {
chain: Chain<std::slice::Iter<'a, u8>, std::slice::Iter<'a, u8>>,
}
This implementation doesn't compile because I borrow len_bytes instead of moving it, so it'd get dropped before the Chain can consume it:
error[E0515]: cannot return value referencing local variable `len_bytes`
--> src/lib.rs:19:9
|
12 | let len_iter = len_bytes.iter(); // len_bytes is borrowed here
| --------- `len_bytes` is borrowed here
...
19 | MQTTString { chain }
| ^^^^^^^^^^^^^^^^^^^^ returns a value referencing data owned by the current function
Is there a nice way to do this? Adding len_bytes to the MQTTString struct doesn't help. Is there a better fourth option of solving the problem?
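The root problem is that iter borrows the array. You can use array::IntoIter instead, which owns the array. It was nightly-only when this answer was written; by-value array iteration has since been stabilized, so the feature attribute below applies only to old nightly toolchains. It does require that you change your iterator to return u8 instead of &u8: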
#![feature(array_value_iter)]
use std::array::IntoIter;
use std::iter::*;
use std::slice::Iter;
pub struct MQTTString<'a> {
chain: Chain<IntoIter<u8, 2_usize>, Copied<Iter<'a, u8>>>,
}
impl<'a> MQTTString<'a> {
pub fn new(s: &'a str) -> Self {
let u16_len = s.len() as u16;
let len_bytes = u16_len.to_be_bytes();
let len_iter = std::array::IntoIter::new(len_bytes);
let s_bytes = s.as_bytes();
let s_iter = s_bytes.iter().copied();
let chain = len_iter.chain(s_iter);
MQTTString { chain }
}
}
impl<'a> Iterator for MQTTString<'a> {
type Item = u8;
fn next(&mut self) -> Option<u8> {
self.chain.next()
}
}
impl<'a> ExactSizeIterator for MQTTString<'a> {}
You could do the same thing in stable Rust by using a Vec, but that'd be a bit of overkill. Instead, since you know the exact size of the array, you could get the values and chain more:
use std::iter::{self, *};
use std::slice;
pub struct MQTTString<'a> {
chain: Chain<Chain<Once<u8>, Once<u8>>, Copied<slice::Iter<'a, u8>>>,
}
impl<'a> MQTTString<'a> {
pub fn new(s: &'a str) -> Self {
let u16_len = s.len() as u16;
let [a, b] = u16_len.to_be_bytes();
let s_bytes = s.as_bytes();
let s_iter = s_bytes.iter().copied();
let chain = iter::once(a).chain(iter::once(b)).chain(s_iter);
MQTTString { chain }
}
}
impl<'a> Iterator for MQTTString<'a> {
type Item = u8;
fn next(&mut self) -> Option<u8> {
self.chain.next()
}
}
impl<'a> ExactSizeIterator for MQTTString<'a> {}
See also:
How to implement Iterator and IntoIterator for a simple struct?
An iterator of &u8 is not a good idea from the point of view of pure efficiency. On a 64-bit system, &u8 takes up 64 bits, as opposed to the 8 bits that the u8 itself would take. Additionally, dealing with this data on a byte-by-byte basis will likely impede common optimizations around copying memory around.
Instead, I'd recommend creating something that can write itself to something implementing Write. One possible implementation:
use std::{
convert::TryFrom,
io::{self, Write},
};
pub struct MQTTString<'a>(&'a str);
impl MQTTString<'_> {
pub fn write_to(&self, mut w: impl Write) -> io::Result<()> {
let len = u16::try_from(self.0.len()).expect("length exceeded 16-bit");
let len = len.to_be_bytes();
w.write_all(&len)?;
w.write_all(self.0.as_bytes())?;
Ok(())
}
}
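Since the original question is about knowing the size before writing, the same wrapper can also report its encoded length without copying anything. A minimal sketch, assuming a hypothetical `encoded_len` method and returning an error instead of panicking when the string is too long:

```rust
use std::convert::TryFrom;
use std::io::{self, Write};

pub struct MQTTString<'a>(pub &'a str);

impl MQTTString<'_> {
    /// Number of bytes `write_to` emits: two length bytes plus the UTF-8 data.
    pub fn encoded_len(&self) -> usize {
        2 + self.0.len()
    }

    pub fn write_to(&self, mut w: impl Write) -> io::Result<()> {
        // The length prefix is a big-endian u16, so longer strings are rejected.
        let len = u16::try_from(self.0.len()).map_err(|_| {
            io::Error::new(io::ErrorKind::InvalidInput, "string longer than 65535 bytes")
        })?;
        w.write_all(&len.to_be_bytes())?;
        w.write_all(self.0.as_bytes())
    }
}
```

A packet made of several such parts can sum their `encoded_len()` values for the header and then call `write_to` on each part in order.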
See also:
How do I convert between numeric types safely and idiomatically?
Converting number primitives (i32, f64, etc) to byte representations
I'm trying to create a Rust library callable from C:
use std::os::raw::{c_int};
type OnDataCallback = unsafe extern "C" fn(data: *mut u8, len: usize) -> c_int;
static mut onDataCallback_: OnDataCallback = std::ptr::null();
#[no_mangle]
pub extern "C" fn registerOnDataCallback(
data: *const u8, len: usize,
cb: Option<OnDataCallback>) -> c_int
{
onDataCallback_ = cb.unwrap();
return 0;
}
#[no_mangle]
pub extern "C" fn doSomething()
{
unsafe{onDataCallback_(mut "hello world" , 100)};
}
But I'm getting:
--> interface.rs:5:46
|
5 | static mut onDataCallback_: OnDataCallback = std::ptr::null();
| ^^^^^^^^^^^^^^^^ expected fn pointer, found *-ptr
|
= note: expected fn pointer `unsafe extern "C" fn(*mut u8, usize) -> i32`
found raw pointer `*const _`
I have no idea what to put there as the initial value. I can't leave it without one, and I can't use null. What should I put?
PS: if what I'm doing is bad practice, please show me a good one. I'm new to Rust.
You should wrap it in an Option and initialize it with None.
static mut onDataCallback_: Option<OnDataCallback> = None;
You can initialize it with a function that panics:
static mut ON_DATA_CALLBACK: OnDataCallback = init;
extern "C" fn init(_: *mut u8, _: usize) -> c_int {
panic!("Function pointer not initialized");
}
Be aware that static mut is wildly unsafe, and its use is generally discouraged, because it's so difficult to use correctly. A safe alternative is to use a Mutex in a static. Since Mutex has interior mutability, the static doesn't need to be mutable.
A better solution would be to use once_cell or lazy_static to initialize the function the first time doSomething() is called.
By the way, Rust string literals are always immutable. You can get a mutable string by allocating a String. See this playground.
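The Mutex suggestion could look roughly like the following. This is a sketch, not a drop-in replacement: the snake_case function names are my own, the unused data/len parameters of the registration function are dropped, and it relies on `Mutex::new` being usable in a static initializer (Rust 1.63 or later). Functions exported to C would additionally need the no_mangle attribute.

```rust
use std::os::raw::c_int;
use std::sync::Mutex;

type OnDataCallback = unsafe extern "C" fn(data: *mut u8, len: usize) -> c_int;

// No `static mut` needed: the Mutex provides the interior mutability.
static ON_DATA_CALLBACK: Mutex<Option<OnDataCallback>> = Mutex::new(None);

/// Stores the callback; returns 0 on success and -1 if C passed NULL.
pub extern "C" fn register_on_data_callback(cb: Option<OnDataCallback>) -> c_int {
    match cb {
        Some(cb) => {
            *ON_DATA_CALLBACK.lock().unwrap() = Some(cb);
            0
        }
        None => -1,
    }
}

/// Invokes the callback with a mutable buffer; returns -1 if none is registered.
pub extern "C" fn do_something() -> c_int {
    // Copy the function pointer out so the lock is not held during the call.
    let cb = *ON_DATA_CALLBACK.lock().unwrap();
    match cb {
        Some(cb) => {
            // A byte-string literal copied into a local array can be mutated,
            // unlike the literal itself.
            let mut data = *b"hello world";
            unsafe { cb(data.as_mut_ptr(), data.len()) }
        }
        None => -1,
    }
}
```

Passing the real buffer length (`data.len()`) rather than a hard-coded 100 keeps the callback from reading past the end of the data.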
Let's say I have a struct that has a collection, such as a Vec as one of its data members:
struct MyCollection {
data: Vec<i32>
}
I want the user of MyCollection to be able to iterate over its data without direct access to the Vec itself, like so:
let x = MyCollection{data:vec![1, 2, 3, 4, 5]};
for i in &x {
//...
}
However, I'm struggling with implementing the necessary Trait IntoIterator for the non-consuming version with &x. I have successfully implemented the consuming version:
impl std::iter::IntoIterator for MyCollection {
type Item = i32;
type IntoIter = std::vec::IntoIter<Self::Item>;
fn into_iter(self) -> Self::IntoIter {
return self.data.into_iter();
}
}
However, this is only usable as follows:
for i in x {
println!("{:?}", i);
}
which consumes x. Cloning the data is possible, but quite expensive, so I'd like to avoid that.
Here is what I have so far for the non-consuming version, which I based on the source implementation of std::vec::Vec:
impl<'a> std::iter::IntoIterator for &'a MyCollection {
type Item = &'a i32;
type IntoIter = std::vec::IntoIter<Self::Item>;
fn into_iter(self) -> Self::IntoIter {
return self.data.into_iter();
}
}
which produces the following compile error:
error: mismatched types
error: expected &i32, found i32
note: expected type `std::vec::IntoIter<&i32>`
found type `std::vec::IntoIter<i32>`
error: expected `std::vec::IntoIter<&i32>` because of return type
I have also tried removing the &'a of the type Item since in my case, the elements of data are Copyable, but this yields the following:
error: cannot move out of `self.data` which is behind a shared reference
error: move occurs because `self.data` has type `std::vec::Vec<i32>`, which does not implement the `Copy` trait
I understand the function wants an IntoIter of a vector to references, but I'm unsure how to give it one efficiently. I'm new to Rust, so I'd much appreciate some clarity on the concern. Bonus points if you can also tell me how to create a mutable iterator for write access in the same fashion.
First, you should use the slice type; your users shouldn't have to know that your inner type is a Vec. Second, your problem is that you must not use the IntoIter type here, but the Iter type directly.
Simple example:
struct MyCollection {
data: Vec<i32>,
}
impl<'a> std::iter::IntoIterator for &'a MyCollection {
type Item = <std::slice::Iter<'a, i32> as Iterator>::Item;
type IntoIter = std::slice::Iter<'a, i32>;
fn into_iter(self) -> Self::IntoIter {
self.data.as_slice().into_iter()
}
}
fn main() {
let x = MyCollection {
data: vec![1, 2, 3, 4, 5],
};
for i in &x {
println!("{}", i);
}
}
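For the bonus question, the mutable version follows the same shape, using std::slice::IterMut and implementing the trait for `&mut MyCollection`. A minimal sketch:

```rust
struct MyCollection {
    data: Vec<i32>,
}

impl<'a> IntoIterator for &'a mut MyCollection {
    type Item = &'a mut i32;
    type IntoIter = std::slice::IterMut<'a, i32>;

    fn into_iter(self) -> Self::IntoIter {
        // iter_mut borrows the Vec mutably and yields &mut i32.
        self.data.iter_mut()
    }
}
```

Callers can then write `for i in &mut x { *i *= 2; }` without ever touching the Vec directly.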
I'm trying to make a Rust binding to nbdkit without much luck. I need to make a .so file, which is easy. The .so file must have a public function called plugin_init, also easy. However this function must return a pointer to a C-compatible struct containing a mix of strings and function pointers (that the C main program will later call).
The API is: https://github.com/libguestfs/nbdkit/blob/409ce4c9238a84ede6688423b20d5f706067834b/include/nbdkit-plugin.h#L53
I came up with:
#[repr(C)]
pub struct NBDKitPlugin {
_struct_size: uint64_t,
_api_version: c_int,
_thread_model: c_int,
name: *const c_char,
longname: Option<*const c_char>,
version: Option<*const c_char>,
description: Option<*const c_char>,
load: Option<extern fn ()>,
unload: Option<extern fn ()>,
config: Option<extern fn ()>, // XXX
config_complete: Option<extern fn () -> c_int>,
config_help: Option<*const c_char>,
open: extern fn (c_int) -> *mut c_void,
close: Option<extern fn (*mut c_void)>,
}
and a plugin_init function:
extern fn hello_load () {
println! ("hello this is the load method");
}
struct MyHandle {
}
extern fn hello_open (readonly: c_int) -> *mut c_void {
println! ("hello, this is the open method");
let mut h = MyHandle {};
let vp: *mut c_void = &mut h as *mut _ as *mut c_void;
return vp;
}
#[no_mangle]
pub extern fn plugin_init () -> *const NBDKitPlugin {
println! ("hello from the plugin");
let plugin = Box::new (NBDKitPlugin {
_struct_size: mem::size_of::<NBDKitPlugin>() as uint64_t,
_api_version: 2,
_thread_model: 3,
name: CString::new("hello").unwrap().into_raw(),
longname: None,
version: None,
description: None,
load: Some (hello_load),
unload: None,
config: None,
config_complete: None,
config_help: Some (CString::new("my config_help here").unwrap().into_raw()),
open: hello_open,
close: None,
});
return Box::into_raw(plugin);
}
Apart from leaking memory, this partially works. The integers and strings are seen from C OK. However the function pointers don't work at all. They are completely bogus and seem to occupy more space than a raw pointer so I suppose that I'm exposing a "fat" Rust pointer.
There seems to be very little documentation on this topic that I can find. Help.
You've probably learned at some point that a reference (&T) wrapped in an Option (Option<&T>) is optimized such that None is encoded as all zeroes, which is not valid for a reference, and Option<&T> has the same size as &T.
However, all zeroes is a valid bit pattern for a raw pointer (*const T or *mut T): it represents the null pointer. As such, wrapping those in an Option is no different from wrapping, say, an i32 in an Option: the Option type is larger so that it can store a discriminant.
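To fix your struct definition, you must not use Option to define longname, version, description and config_help. Declare them as plain *const c_char and use a null pointer for absent values. The extra discriminant in those fields shifts every field that follows, which is why the function pointers look bogus from C. Option around the extern fn fields is fine: function pointers can never be null, so Option of a function pointer has the same size as the pointer itself.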
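The size difference is easy to check. The sketch below uses a hypothetical `PluginFragment` with only a few of the fields (it is not the full nbdkit struct) and a hypothetical helper `layout_is_c_compatible` that compares the sizes:

```rust
use std::mem::size_of;
use std::os::raw::{c_char, c_int, c_void};

/// Hypothetical fragment of the plugin struct with C-compatible field types.
#[repr(C)]
pub struct PluginFragment {
    name: *const c_char,
    longname: *const c_char, // null when absent, no Option
    load: Option<extern "C" fn()>,
    open: Option<extern "C" fn(c_int) -> *mut c_void>,
}

/// Returns true if the sizes match what a C struct of four pointers expects.
fn layout_is_c_compatible() -> bool {
    // Option of a function pointer uses the null value for None.
    let fn_ok = size_of::<Option<extern "C" fn()>>() == size_of::<*const c_void>();
    // Option of a raw pointer needs room for a separate discriminant.
    let raw_is_bigger = size_of::<Option<*const c_char>>() > size_of::<*const c_char>();
    let struct_ok = size_of::<PluginFragment>() == 4 * size_of::<*const c_void>();
    fn_ok && raw_is_bigger && struct_ok
}
```

An absent string is then written as `std::ptr::null()`, which C sees as NULL.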