Rust static function fn pointer, which value to initialize? - c

I'm trying to create a Rust library callable from C:
use std::os::raw::{c_int};
type OnDataCallback = unsafe extern "C" fn(data: *mut u8, len: usize) -> c_int;
static mut onDataCallback_: OnDataCallback = std::ptr::null();
#[no_mangle]
pub extern "C" fn registerOnDataCallback(
data: *const u8, len: usize,
cb: Option<OnDataCallback>) -> c_int
{
onDataCallback_ = cb.unwrap();
return 0;
}
#[no_mangle]
pub extern "C" fn doSomething()
{
unsafe{onDataCallback_(mut "hello world" , 100)};
}
But I'm getting:
--> interface.rs:5:46
|
5 | static mut onDataCallback_: OnDataCallback = std::ptr::null();
| ^^^^^^^^^^^^^^^^ expected fn pointer, found *-ptr
|
= note: expected fn pointer `unsafe extern "C" fn(*mut u8, usize) -> i32`
found raw pointer `*const _`
I have no idea what to put there for the initial value. I can't leave it without one, and I can't use null. What should I put?
PS: if what I'm doing is bad practice, please show me a good one. I'm new to Rust.

You should wrap it in an Option and initialize it with None.
static mut onDataCallback_: Option<OnDataCallback> = None;

You can initialize it with a function that panics:
static mut ON_DATA_CALLBACK: OnDataCallback = init;
extern "C" fn init(_: *mut u8, _: usize) -> c_int {
panic!("Function pointer not initialized");
}
Be aware that static mut is wildly unsafe, and its use is generally discouraged, because it's so difficult to use correctly. A safe alternative is to use a Mutex in a static. Since Mutex has interior mutability, the static doesn't need to be mutable.
A better solution would be to use once_cell or lazy_static to initialize the function the first time doSomething() is called.
By the way, Rust string literals are always immutable. You can get a mutable string by allocating a String. See this playground.
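As a rough sketch of the Mutex-in-a-static alternative, assuming Rust 1.63 or later (where Mutex::new can be used in a static initializer): the names ON_DATA_CALLBACK, register_on_data_callback and do_something are illustrative, the unused data/len parameters from the question are dropped, and the #[no_mangle] attributes a real C export would need are left out for brevity. It also passes a mutable byte buffer rather than a string literal.

```rust
use std::os::raw::c_int;
use std::sync::Mutex;

type OnDataCallback = unsafe extern "C" fn(data: *mut u8, len: usize) -> c_int;

// No `static mut`: the Mutex provides the interior mutability.
static ON_DATA_CALLBACK: Mutex<Option<OnDataCallback>> = Mutex::new(None);

/// Stores the callback. Returns 0 on success, -1 if C passed a null pointer.
pub extern "C" fn register_on_data_callback(cb: Option<OnDataCallback>) -> c_int {
    match cb {
        Some(f) => {
            *ON_DATA_CALLBACK.lock().unwrap() = Some(f);
            0
        }
        None => -1,
    }
}

/// Calls the registered callback. Returns -1 if none has been registered yet.
pub extern "C" fn do_something() -> c_int {
    // Copy the function pointer out so the lock is released before the call.
    let cb = *ON_DATA_CALLBACK.lock().unwrap();
    match cb {
        Some(f) => {
            // A mutable, owned copy of the bytes (string literals are immutable).
            let mut buf = *b"hello world";
            // SAFETY (assumption): the C side registered a function with this signature.
            unsafe { f(buf.as_mut_ptr(), buf.len()) }
        }
        None => -1,
    }
}
```

With this shape, calling do_something before any registration reports an error code instead of panicking or calling through an invalid pointer.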

Related

Rust How do you define a default method for similar types?

A very common pattern I have to deal with is, I am given some raw byte data. This data can represent an array of floats, 2D vectors, Matrices...
I know the data is compact and properly aligned. In C usually you would just do:
vec3 * ptr = (vec3*)data;
And start reading from it.
I am trying to create a view to this kind of data in rust to be able to read and write to the buffer as follows:
pub trait AccessView<T>
{
fn access_view<'a>(
offset : usize,
length : usize,
buffer : &'a Vec<u8>) -> &'a mut [T]
{
let bytes = &buffer[offset..(offset + length)];
let ptr = bytes.as_ptr() as *mut T;
return unsafe { std::slice::from_raw_parts_mut(ptr, length / size_of::<T>()) };
}
}
And then calling it:
let data: &[f32] =
AccessView::<f32>::access_view(0, 32, &buffers[0]);
The idea is, I should be able to replace f32 with vec3 or mat4 and get a slice view into the underlying data.
This is crashing with:
--> src/main.rs:341:9
|
341 | AccessView::<f32>::access_view(&accessors[0], &buffer_views, &buffers);
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ cannot infer type
|
= note: cannot satisfy `_: AccessView<f32>`
How could I use Rust to achieve my goal, i.e. have a generic "template" for turning a set of raw bytes into a range-checked slice view cast to some type?
There are two important problems I can identify:
1. You are using a trait incorrectly. You have to connect a trait to an actual type. If you want to call it the way you do, it needs to be a struct instead.
2. Soundness. You are creating a mutable reference from an immutable one through unsafe code. This is unsound and dangerous. By using unsafe, you tell the compiler that you manually verified that your code is sound, and the borrow checker should blindly believe you. Your code, however, is not sound.
To part 1, @BlackBeans gave you a good answer already. I would still do it a little differently, though. I would directly implement the trait for [u8], so you can write data.access_view::<T>(offset, length).
To part 2, you at least need to make the input data &mut. Further, make sure they have the same lifetime, otherwise the compiler might not realize that they are actually connected.
Also, don't use &Vec<u8> as an argument; in general, use slices (&[u8]) instead.
Be aware that, even with all of that, there is still the problem of endianness: the behavior you get will not be consistent between platforms. If you need consistency, use other means of conversion instead. Do not put this code in a generic library; at most, use it in your own personal project.
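One such "other means of conversion" is to copy values in and out with an explicit byte order. The following is only a sketch under the assumption that the buffer stores little-endian f32 values; read_f32_le and write_f32_le are hypothetical helper names. Because the bytes are copied, neither alignment nor host endianness matters.

```rust
/// Reads a little-endian f32 starting at `offset`, or None if out of range.
pub fn read_f32_le(bytes: &[u8], offset: usize) -> Option<f32> {
    let end = offset.checked_add(4)?;
    let src = bytes.get(offset..end)?;
    let mut chunk = [0u8; 4];
    chunk.copy_from_slice(src);
    Some(f32::from_le_bytes(chunk))
}

/// Writes `value` as little-endian bytes at `offset`, or returns None if out of range.
pub fn write_f32_le(bytes: &mut [u8], offset: usize, value: f32) -> Option<()> {
    let end = offset.checked_add(4)?;
    let dst = bytes.get_mut(offset..end)?;
    dst.copy_from_slice(&value.to_le_bytes());
    Some(())
}
```

The trade-off is that this yields values rather than a &mut [f32] view, so every access goes through a helper call.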
That all said, here is what I came up with:
pub trait AccessView {
fn access_view<'a, T>(&'a mut self, offset: usize, length: usize) -> &'a mut [T];
}
impl AccessView for [u8] {
fn access_view<T>(&mut self, offset: usize, length: usize) -> &mut [T] {
let bytes = &mut self[offset..(offset + length)];
// Take the pointer from the mutable borrow. Note: the alignment of T is not checked here.
let ptr = bytes.as_mut_ptr() as *mut T;
return unsafe { std::slice::from_raw_parts_mut(ptr, length / ::std::mem::size_of::<T>()) };
}
}
impl AccessView for Vec<u8> {
fn access_view<T>(&mut self, offset: usize, length: usize) -> &mut [T] {
self.as_mut_slice().access_view(offset, length)
}
}
fn main() {
let mut data: Vec<u8> = vec![1, 2, 3, 4, 5, 6, 7, 8];
println!("{:?}", data);
let float_view: &mut [f32] = data.access_view(2, 4);
float_view[0] = 42.0;
println!("{:?}", float_view);
println!("{:?}", data);
// println!("{:?}", float_view); // Adding this would cause a compiler error, which shows that we implemented lifetimes correctly
}
[1, 2, 3, 4, 5, 6, 7, 8]
[42.0]
[1, 2, 0, 0, 40, 66, 7, 8]
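Note that a byte offset such as 2 is generally not a valid address for an f32, and creating a misaligned slice is undefined behavior even if it appears to work. A more defensive variant could check range, length and alignment before building the view. This is a sketch for f32 only (every bit pattern is a valid f32, which is not true of arbitrary T); view_f32_mut is a hypothetical name.

```rust
use std::mem::{align_of, size_of};

/// Returns a mutable f32 view of `bytes[offset..offset + length]`, or None if the
/// range is out of bounds, not a multiple of 4 bytes long, or misaligned for f32.
pub fn view_f32_mut(bytes: &mut [u8], offset: usize, length: usize) -> Option<&mut [f32]> {
    let end = offset.checked_add(length)?;
    let window = bytes.get_mut(offset..end)?;
    if length % size_of::<f32>() != 0 {
        return None;
    }
    let ptr = window.as_mut_ptr();
    if (ptr as usize) % align_of::<f32>() != 0 {
        return None;
    }
    // SAFETY: the pointer comes from a unique borrow of `length` in-bounds bytes,
    // it is aligned for f32, and every bit pattern is a valid f32.
    Some(unsafe { std::slice::from_raw_parts_mut(ptr.cast::<f32>(), length / size_of::<f32>()) })
}
```

Returning Option pushes the decision about misaligned or truncated data back to the caller instead of panicking inside the helper.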
I think you haven't fully understood what traits are. Traits represent a characteristic of a type. For instance, since the size of u32 (32 bits) is known at compile time, u32 implements the marker trait Sized, written u32: Sized. A more feature-complete trait is Default: if there is a "default" way of building a value of type T, we can implement Default for T, so that there is a standard way of building it.
In your example, you are using a trait as a namespace for functions, i.e. you could simply have
fn access_view<'a, T>(
offset: usize,
length: usize,
buffer: &'a mut [u8] // mutable input, so that handing out &mut [T] is not unsound
) -> &'a mut [T]
{
let bytes = &mut buffer[offset..offset+length];
let ptr = bytes.as_mut_ptr() as *mut T;
unsafe {
std::slice::from_raw_parts_mut(ptr, length / std::mem::size_of::<T>())
}
}
Or, if you want to put it as a trait:
trait Viewable: Sized {
fn access_view<'a>(
offset: usize,
length: usize,
buffer: &'a mut [u8],
) -> &'a mut [Self]
{
let bytes = &mut buffer[offset..offset+length];
let ptr = bytes.as_mut_ptr() as *mut Self;
unsafe {
std::slice::from_raw_parts_mut(ptr, length / std::mem::size_of::<Self>())
}
}
}
Then implement it:
impl<T> Viewable for T {}
Or, again, differently
trait Viewable: Sized {
fn access_view<'a>(
offset: usize,
length: usize,
buffer: &'a mut [u8],
) -> &'a mut [Self];
}
impl<T> Viewable for T {
fn access_view<'a>(
offset: usize,
length: usize,
buffer: &'a mut [u8],
) -> &'a mut [Self]
{
let bytes = &mut buffer[offset..offset+length];
let ptr = bytes.as_mut_ptr() as *mut T;
unsafe {
std::slice::from_raw_parts_mut(ptr, length / std::mem::size_of::<T>())
}
}
}
Although all of these ways of structuring the code produce the same result, they are not equivalent. Maybe you should learn a little more about traits before using them.
Also, your code, as is, really seems unsound, in the sense that you call an unsafe function without any checking (i.e. what if I call it with random nonsense in buffer?). That doesn't mean it is (we don't have access to the rest of your code), but you should be careful about that: Rust is not C.
Finally, your error simply comes from the fact that it's impossible for Rust to find out which type T you are calling the associated method access_view of.
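To make the last point concrete: AccessView::<f32>::access_view(...) fixes the trait's type parameter but never says which type implements the trait. When the element type is Self, the call site has to name it, which removes the ambiguity. The sketch below assumes a Viewable trait like the one above, implemented only for two plain numeric types; view_bytes is a hypothetical shared helper, and it panics on out-of-range or misaligned input.

```rust
use std::mem::{align_of, size_of};

trait Viewable: Sized {
    fn access_view(offset: usize, length: usize, buffer: &mut [u8]) -> &mut [Self];
}

/// Hypothetical helper shared by the impls below.
///
/// # Safety
/// Every bit pattern must be a valid value of `T`.
unsafe fn view_bytes<T>(offset: usize, length: usize, buffer: &mut [u8]) -> &mut [T] {
    let bytes = &mut buffer[offset..offset + length]; // panics if out of range
    let ptr = bytes.as_mut_ptr();
    assert_eq!(ptr as usize % align_of::<T>(), 0, "misaligned view");
    unsafe { std::slice::from_raw_parts_mut(ptr.cast::<T>(), length / size_of::<T>()) }
}

impl Viewable for f32 {
    fn access_view(offset: usize, length: usize, buffer: &mut [u8]) -> &mut [Self] {
        // SAFETY: every bit pattern is a valid f32.
        unsafe { view_bytes(offset, length, buffer) }
    }
}

impl Viewable for u16 {
    fn access_view(offset: usize, length: usize, buffer: &mut [u8]) -> &mut [Self] {
        // SAFETY: every bit pattern is a valid u16.
        unsafe { view_bytes(offset, length, buffer) }
    }
}
```

A call then reads f32::access_view(0, 8, bytes) or, fully spelled out, <f32 as Viewable>::access_view(0, 8, bytes); in both forms the compiler knows which implementation is meant.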

Can I call a raw pointer with arbitrary arguments in Rust?

I'm trying to combine my Rust program with a library written in C in a more complex scenario.
The library provides this interface:
use std::os::raw::{c_char, c_void};
extern "C" {
pub fn register_function(
name: *const c_char, signature: *const c_char,
func_ptr: *mut c_void, attachment: *mut c_void,
);
}
The signature can be a string describing the arguments and return type of the function as 32 or 64 bit ints and floats (representations: b'i' = i32, b'I' = i64, b'f' = f32, b'F' = f64). The registered function gets called with an array of u64 (uint64_t) values which correspond to the arguments from the signature.
I would like to abstract this registration and callback process, so that I can switch to another library in the future which provides a similar but different interface. My idea was to create a proxy function that is registered instead of the actual function. This would then also provide a custom context struct.
My own functions could look like this:
use std::boxed::Box;
use std::pin::Pin;
fn return_void(context: Pin<Box<MyAttachment>>) {
// ...
}
fn return_32(context: Pin<Box<MyAttachment>>, a: u32, b: u32) -> u32 {
context.important_stuff();
// ...
}
// floating point values would be nice, but are optional
fn return_64(a: i32, b: i64, c: f64) -> f64 {
// ...
}
MyAttachment is supposed to be the context and provide the proxy function that gets an arbitrary number of arguments as array:
use std::cmp;
use std::ffi::CString;
use std::slice;
#[derive(PartialEq)]
enum ReturnType {
VOID,
BITS32,
BITS64,
}
struct MyAttachment {
real_func_ptr: *mut c_void,
signature: String,
argc: u32,
pass_attachment: bool,
return_type: ReturnType,
}
impl MyAttachment {
pub fn important_stuff(&self) {
// ...
}
unsafe extern "C" fn function_proxy(attachment: *mut c_void, argv: *mut u64) {
// Given: attachment is the pointer to MyAttachment and argv is the array of arguments.
let this = attachment.cast::<Self>();
if this.is_null() || argv.is_null() {
// error handling
return;
}
let this = Pin::new_unchecked(Box::from_raw(this)); // restore
let args = slice::from_raw_parts_mut(
argv,
// There is at least one element in argv if the function is supposed to return a value,
// because we need to write our result there.
cmp::max(
match this.return_type {
ReturnType::VOID => 0,
ReturnType::BITS32 | ReturnType::BITS64 => 1,
},
this.argc as usize,
),
);
let func_ptr = this.real_func_ptr;
// I can get the argument types from the signature.
// TODO cast to correct pointer type. For example:
// case return_void: Fn(Pin<Box<MyAttachment>>)
// case return_32: Fn(Pin<Box<MyAttachment>>, u64, u64) -> u32
// case return_64: Fn(u64, u64, f64) -> f64
// TODO call it:
if this.return_type != ReturnType::VOID {
//args[0] = func_ptr(...args);
// or
//args[0] = func_ptr(this, ...args);
} else {
//func_ptr(...args);
}
Box::into_raw(Pin::into_inner_unchecked(this)); // delay dropping of this
}
}
fn main() {
// defining functions like:
let func1 = Box::pin(MyAttachment {
real_func_ptr: return_32 as *mut _,
signature: String::from("(ii)i"),
argc: 2, // inferred from signature
pass_attachment: true,
return_type: ReturnType::BITS32, // inferred from signature
});
let name = CString::new("return_32").unwrap();
let signature = CString::new(func1.signature.as_str()).unwrap();
// leak the raw pointer
let func1_ptr = Box::into_raw(unsafe { Pin::into_inner_unchecked(func1) });
let _func1 = unsafe { Pin::new_unchecked(Box::from_raw(func1_ptr)) }; // just for housekeeping
unsafe {
register_function(
name.as_ptr(),
signature.as_ptr(),
MyAttachment::function_proxy as *mut _,
func1_ptr as *mut _,
)
};
// ...
// somewhere here is my proxy called from C
// ...
// automatic cleanup of MyAttachment structs, because the Boxes are dropped
}
How do I fill these TODOs with code?
I have seen this in C code somewhere by using a generic function pointer and defining a fixed number of calls:
void (*func_ptr)();
if (argc == 0)
func_ptr();
else if (argc == 1)
func_ptr(argv[0]);
else if (argc == 2)
func_ptr(argv[0], argv[1]);
// ... and so on
But is there a solution to do this in Rust? (This only needs to work for x86_64/amd64)
Thanks in advance for reading all this and trying to help.
(I added the reflection tag, because this would be done via reflection if Rust had any)
==== edit
I have seen these related questions, but I don't think they apply here:
Call a raw address from Rust -> My type is not given at compile time
How do I pass each element of a slice as a separate argument to a variadic C function? -> My arguments are somewhat of fixed size and don't use a valist
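One possible direction, mirroring the C snippet in the question, is to transmute the raw pointer to a fixed-arity function pointer type and dispatch on the argument count. This is only a sketch: call_with_args is a hypothetical helper, it covers up to three arguments, and it assumes every registered function is declared extern "C", takes only u64 parameters and returns u64. Functions with the default Rust ABI, or with f32/f64 parameters (which are passed in different registers on x86_64), would not be valid targets for these signatures.

```rust
use std::ffi::c_void;
use std::mem::transmute;

/// Hypothetical helper: calls `func_ptr` with the values in `args`.
/// Returns None if the argument count is not supported.
///
/// # Safety
/// `func_ptr` must point to an `extern "C"` function that takes exactly
/// `args.len()` parameters of type `u64` and returns `u64`.
pub unsafe fn call_with_args(func_ptr: *const c_void, args: &[u64]) -> Option<u64> {
    match args {
        [] => {
            let f: extern "C" fn() -> u64 = unsafe { transmute(func_ptr) };
            Some(f())
        }
        [a] => {
            let f: extern "C" fn(u64) -> u64 = unsafe { transmute(func_ptr) };
            Some(f(*a))
        }
        [a, b] => {
            let f: extern "C" fn(u64, u64) -> u64 = unsafe { transmute(func_ptr) };
            Some(f(*a, *b))
        }
        [a, b, c] => {
            let f: extern "C" fn(u64, u64, u64) -> u64 = unsafe { transmute(func_ptr) };
            Some(f(*a, *b, *c))
        }
        // More arities would need more arms; there is no variadic call here.
        _ => None,
    }
}
```

The signature string from the registration could be used to pick between several such tables (for example one per return type), but that part is left out here.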

Creating a rust shared library that returns a struct of function pointers to C main program

I'm trying to make a Rust binding to nbdkit without much luck. I need to make a .so file, which is easy. The .so file must have a public function called plugin_init, also easy. However this function must return a pointer to a C-compatible struct containing a mix of strings and function pointers (that the C main program will later call).
The API is: https://github.com/libguestfs/nbdkit/blob/409ce4c9238a84ede6688423b20d5f706067834b/include/nbdkit-plugin.h#L53
I came up with:
#[repr(C)]
pub struct NBDKitPlugin {
_struct_size: uint64_t,
_api_version: c_int,
_thread_model: c_int,
name: *const c_char,
longname: Option<*const c_char>,
version: Option<*const c_char>,
description: Option<*const c_char>,
load: Option<extern fn ()>,
unload: Option<extern fn ()>,
config: Option<extern fn ()>, // XXX
config_complete: Option<extern fn () -> c_int>,
config_help: Option<*const c_char>,
open: extern fn (c_int) -> *mut c_void,
close: Option<extern fn (*mut c_void)>,
}
and a plugin_init function:
extern fn hello_load () {
println! ("hello this is the load method");
}
struct MyHandle {
}
extern fn hello_open (readonly: c_int) -> *mut c_void {
println! ("hello, this is the open method");
let mut h = MyHandle {};
let vp: *mut c_void = &mut h as *mut _ as *mut c_void;
return vp;
}
#[no_mangle]
pub extern fn plugin_init () -> *const NBDKitPlugin {
println! ("hello from the plugin");
let plugin = Box::new (NBDKitPlugin {
_struct_size: mem::size_of::<NBDKitPlugin>() as uint64_t,
_api_version: 2,
_thread_model: 3,
name: CString::new("hello").unwrap().into_raw(),
longname: None,
version: None,
description: None,
load: Some (hello_load),
unload: None,
config: None,
config_complete: None,
config_help: Some (CString::new("my config_help here").unwrap().into_raw()),
open: hello_open,
close: None,
});
return Box::into_raw(plugin);
}
Apart from leaking memory, this partially works. The integers and strings are seen from C OK. However the function pointers don't work at all. They are completely bogus and seem to occupy more space than a raw pointer so I suppose that I'm exposing a "fat" Rust pointer.
There seems very little documentation on this topic that I can find. Help.
You've probably learned at some point that a reference (&T) wrapped in an Option (Option<&T>) is optimized such that None is encoded as all zeroes, which is not valid for a reference, and Option<&T> has the same size as &T.
However, all zeroes is a valid bit pattern for a raw pointer (*const T or *mut T): it represents the null pointer. As such, wrapping those in an Option is no different from wrapping, say, an i32 in an Option: the Option type is larger so that it can store a discriminant.
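To fix your struct definition, you must not use Option for the raw-pointer fields longname, version, description and config_help. Declare them as plain *const c_char and use a null pointer to mean "absent". The Option around the extern fn fields can stay: function pointers can never be null, so Option<extern fn()> gets the same optimization as Option<&T> and has the size of a single pointer.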
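The size difference can be checked directly with size_of. The sketch below uses a cut-down, hypothetical PluginHeader struct rather than the real nbdkit layout, with a null pointer standing in for a missing string.

```rust
use std::mem::size_of;
use std::os::raw::c_char;
use std::ptr;

/// Hypothetical, shortened struct: raw pointers for strings, Option only for fn pointers.
#[repr(C)]
pub struct PluginHeader {
    pub name: *const c_char,
    pub longname: *const c_char, // null means "not provided"
    pub load: Option<extern "C" fn()>,
}

extern "C" fn hello_load() {}

pub fn make_header() -> PluginHeader {
    PluginHeader {
        name: b"hello\0".as_ptr() as *const c_char,
        longname: ptr::null(),
        load: Some(hello_load),
    }
}

/// (size of a raw pointer, size of the same pointer wrapped in Option)
pub fn raw_ptr_sizes() -> (usize, usize) {
    (size_of::<*const c_char>(), size_of::<Option<*const c_char>>())
}

/// (size of a fn pointer, size of the same fn pointer wrapped in Option)
pub fn fn_ptr_sizes() -> (usize, usize) {
    (size_of::<extern "C" fn()>(), size_of::<Option<extern "C" fn()>>())
}
```

On a typical 64-bit target the first pair differs (the Option needs room for a discriminant) while the second pair is identical, which is why only the string fields shift the layout that C sees.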

Borrow checker error in a loop inside a recursive function with lifetime bounds

Why does the borrow checker complain about this code?
fn foo<'a>(v: &mut Vec<&'a str>, buf: &'a mut String) {
loop {
foo(v, buf);
}
}
error[E0499]: cannot borrow `*buf` as mutable more than once at a time
--> src/main.rs:3:16
|
3 | foo(v, buf);
| ^^^ mutable borrow starts here in previous iteration of loop
4 | }
5 | }
| - mutable borrow ends here
If I remove the lifetime bound, the code compiles fine.
fn foo(v: &mut Vec<&str>, buf: &mut String) {
loop {
foo(v, buf);
}
}
This isn't a duplicate of Mutable borrow in a loop, because there is no return value in my case.
I'm pretty sure that my final goal isn't achievable in safe Rust, but right now I want to better understand how the borrow checker works and I can not understand why adding a lifetime bound between parameters extends the lifetime of the borrow in this code.
The version with the explicit lifetime 'a ties the lifetime of the Vec to the lifetime of buf. This causes trouble when the Vec and the String are reborrowed. Reborrowing occurs when the arguments are passed to foo in the loop:
fn foo<'a>(v: &mut Vec<&'a str>, buf: &'a mut String) {
loop {
foo(&mut *v, &mut *buf);
}
}
This is done implicitly by the compiler to prevent the arguments from being consumed when foo is called in the loop. If the arguments were actually moved, they could not be used anymore (e.g. for successive calls to foo) after the first recursive call to foo.
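A small stand-alone illustration of implicit reborrowing versus moving a mutable reference; takes and demo are made-up names and are unrelated to foo in the question.

```rust
fn takes(r: &mut String) {
    r.push('x');
}

pub fn demo() -> String {
    let mut s = String::new();
    let r = &mut s;
    takes(r); // implicit reborrow: the compiler treats this as takes(&mut *r)
    takes(&mut *r); // the same reborrow written out; `r` is still usable afterwards
    let moved = r; // an actual move: `r` cannot be used after this line
    takes(moved);
    s
}
```

Each reborrow only lasts for the duration of the call, which is why r can be passed more than once; in the version with the explicit 'a, the signature forces the reborrow to last for 'a instead.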
Forcing buf to be moved around resolves the error:
fn foo<'a>(v: &mut Vec<&'a str>, buf: &'a mut String) {
foo_recursive(v, buf);
}
fn foo_recursive<'a>(v: &mut Vec<&'a str>, buf: &'a mut String) -> &'a mut String{
let mut buf_temp = buf;
loop {
let buf_loop = buf_temp;
buf_temp = foo_recursive(v, buf_loop);
// some break condition
}
buf_temp
}
However, things will break again as soon as you try to actually use buf. Here is a distilled version of your example demonstrating why the compiler forbids successive mutable borrows of buf:
fn foo<'a>(v: &mut Vec<&'a str>, buf: &'a mut String) {
bar(v, buf);
bar(v, buf);
}
fn bar<'a>(v: &mut Vec<&'a str>, buf: &'a mut String) {
if v.is_empty() {
// first call: push slice referencing "A" into 'v'
v.push(&buf[0..1]);
} else {
// second call: remove "A" while 'v' is still holding a reference to it - not allowed
buf.clear();
}
}
fn main() {
foo(&mut vec![], &mut String::from("A"));
}
The calls to bar are equivalent to the recursive calls to foo in your example. Again the compiler complains that *buf cannot be borrowed as mutable more than once at a time. The provided implementation of bar shows that the lifetime specification on bar would allow this function to be implemented in such a way that v enters an invalid state. By looking at the signature of bar alone, the compiler understands that data from buf could potentially flow into v, and it rejects the code as potentially unsafe regardless of the actual implementation of bar.

When passing a Rust pointer to C should I get 0x1?

I'm trying to implement a basic library in Rust that creates an object and returns its pointer to C. The pointer I get doesn't look like it is on the heap — when I print it I get 0x1:
use std::fmt;
pub struct SndbDB {}
impl SndbDB {
fn new() -> SndbDB {
SndbDB {}
}
}
impl fmt::Display for SndbDB {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
write!(f, "(sndb_db)")
}
}
// Implement a destructor just so we can see when the object is destroyed.
impl Drop for SndbDB {
fn drop(&mut self) {
println!("[rust] dropping {}", self);
}
}
#[no_mangle]
pub extern "C" fn sndb_db_create() -> *mut SndbDB {
let _db = Box::into_raw(Box::new(SndbDB::new()));
println!("[rust] creating DB {:?}", _db);
_db
}
#[no_mangle]
pub unsafe extern "C" fn sndb_db_destroy(ptr: *mut SndbDB) {
println!("[rust] destroying DB {:?}", ptr);
Box::from_raw(ptr); // Rust drops this for us.
}
The C calling code is trivial too:
typedef struct sndb_db sndb_db;
sndb_db * sndb_db_create(void);
void sndb_db_destroy(sndb_db* db);
void test_baseapi__create_and_destroy_db(void)
{
sndb_db * db = sndb_db_create();
printf("[C] Got db=%p\n",db);
sndb_db_destroy(db);
printf("[C] db should be dead by now...\n");
}
And all the output except the pointer location are as I expect:
[rust] creating DB 0x1
[C] Got db=0x1
[rust] destroying DB 0x1
[rust] dropping (sndb_db)
[C] db should be dead by now...
I know that memory allocated in Rust needs to be deallocated by Rust - but I'm still surprised it is using a location of 0x1 - am I doing something wrong, is something odd happening, or is this all OK?
It looks like this is an optimisation by Rust since the SndbDB struct has no state.
Adding an i: u32 field to it and plumbing that through to the constructor functions and the C code I then get:
[rust] creating DB 0x7fff2fe00000
[C] Got db=0x7fff2fe00000
[rust] destroying DB 0x7fff2fe00000
[rust] dropping (sndb_db i=123)
[C] db should be dead by now...
However I'd still love to find an official source to back up this guess.
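The standard library documentation for Box::new notes that it does not actually allocate if T is zero-sized, which is consistent with the guess above: for a zero-sized type the Box only needs a non-null, suitably aligned address. The sketch below (Empty, WithField and boxed_addr are illustrative names) lets you compare the two cases; the exact address printed for the zero-sized case is an implementation detail and is not relied on here.

```rust
use std::mem::size_of;

pub struct Empty; // zero-sized, like the SndbDB struct in the question

pub struct WithField {
    pub i: u32,
}

/// Boxes `value`, records the address of the raw pointer, then frees the Box again.
pub fn boxed_addr<T>(value: T) -> usize {
    let raw = Box::into_raw(Box::new(value));
    let addr = raw as usize;
    // SAFETY: `raw` came from Box::into_raw just above and is released exactly once.
    drop(unsafe { Box::from_raw(raw) });
    addr
}

/// (size of the empty struct, size of the struct with a u32 field)
pub fn struct_sizes() -> (usize, usize) {
    (size_of::<Empty>(), size_of::<WithField>())
}
```

Passing such a pointer to C and back is fine as long as C treats it as an opaque handle and never dereferences it.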
