A very common pattern I have to deal with: I am given some raw byte data that can represent an array of floats, 2D vectors, matrices, and so on.
I know the data is compact and properly aligned. In C you would usually just do:
vec3 * ptr = (vec3*)data;
And start reading from it.
I am trying to create a view onto this kind of data in Rust, so that I can read from and write to the buffer, as follows:
pub trait AccessView<T>
{
fn access_view<'a>(
offset : usize,
length : usize,
buffer : &'a Vec<u8>) -> &'a mut [T]
{
let bytes = &buffer[offset..(offset + length)];
let ptr = bytes.as_ptr() as *mut T;
return unsafe { std::slice::from_raw_parts_mut(ptr, length / size_of::<T>()) };
}
}
And then calling it:
let data: &[f32] =
AccessView::<f32>::access_view(0, 32, &buffers[0]);
The idea is, I should be able to replace f32 with vec3 or mat4 and get a slice view into the underlying data.
This fails to compile with:
--> src/main.rs:341:9
|
341 | AccessView::<f32>::access_view(&accessors[0], &buffer_views, &buffers);
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ cannot infer type
|
= note: cannot satisfy `_: AccessView<f32>`
How can I achieve this in Rust, i.e. have a generic "template" for turning a set of raw bytes into a range-checked slice view cast to some type?
There are two important problems I can identify:
1. You are using a trait incorrectly. You have to connect a trait to an actual type. If you want to call it the way you do, it needs to be a struct instead.
2. Soundness. You are creating a mutable reference from an immutable one through unsafe code. This is unsound and dangerous. By using unsafe, you tell the compiler that you manually verified that your code is sound, and the borrow checker should blindly believe you. Your code, however, is not sound.
To part 1, @BlackBeans gave you a good answer already. I would still do it a little differently, though. I would directly implement the trait for [u8], so you can write data.access_view::<T>().
To part 2, you at least need to make the input data &mut. Further, make sure they have the same lifetime, otherwise the compiler might not realize that they are actually connected.
Also, don't use &Vec<u8> as an argument; in general, use slices (&[u8]) instead.
Be aware that with all that said, there is still the problem of endianness. The behavior you get will not be consistent between platforms. Use other means of conversion if you require consistency. Do not put this code in a generic library; at most, use it for your own personal project.
That all said, here is what I came up with:
pub trait AccessView {
fn access_view<'a, T>(&'a mut self, offset: usize, length: usize) -> &'a mut [T];
}
impl AccessView for [u8] {
fn access_view<T>(&mut self, offset: usize, length: usize) -> &mut [T] {
let bytes = &mut self[offset..(offset + length)];
let ptr = bytes.as_mut_ptr() as *mut T; // NOTE: the caller must ensure this pointer is suitably aligned for T
return unsafe { std::slice::from_raw_parts_mut(ptr, length / ::std::mem::size_of::<T>()) };
}
}
impl AccessView for Vec<u8> {
fn access_view<T>(&mut self, offset: usize, length: usize) -> &mut [T] {
self.as_mut_slice().access_view(offset, length)
}
}
fn main() {
let mut data: Vec<u8> = vec![1, 2, 3, 4, 5, 6, 7, 8];
println!("{:?}", data);
let float_view: &mut [f32] = data.access_view(2, 4);
float_view[0] = 42.0;
println!("{:?}", float_view);
println!("{:?}", data);
// println!("{:?}", float_view); // Adding this would cause a compiler error, which shows that we implemented lifetimes correctly
}
[1, 2, 3, 4, 5, 6, 7, 8]
[42.0]
[1, 2, 0, 0, 40, 66, 7, 8]
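The view above trusts the caller on alignment and length. As a minimal sketch of what a checked variant could look like, restricted to f32 (for which every bit pattern is a valid value), the function below refuses to build the view unless the pointer is aligned and the length is a whole number of elements. The names view_f32_mut and Aligned are made up for this illustration and are not part of the answer above.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical helper: reinterpret `bytes` as a mutable `f32` slice,
/// or return `None` if the pointer is misaligned or the length is not
/// a multiple of `size_of::<f32>()`.
fn view_f32_mut(bytes: &mut [u8]) -> Option<&mut [f32]> {
    let ptr = bytes.as_mut_ptr();
    if ptr as usize % align_of::<f32>() != 0 || bytes.len() % size_of::<f32>() != 0 {
        return None;
    }
    let len = bytes.len() / size_of::<f32>();
    // SAFETY: the pointer is non-null and aligned for f32, it covers exactly
    // `len * 4` bytes that we hold an exclusive borrow on, and every bit
    // pattern is a valid f32.
    Some(unsafe { std::slice::from_raw_parts_mut(ptr.cast::<f32>(), len) })
}

/// Hypothetical wrapper that forces 4-byte alignment of a byte buffer.
#[repr(align(4))]
struct Aligned([u8; 8]);
```

A Vec<u8> only guarantees an alignment of 1, so with a plain byte vector this function may legitimately return None; the values read are still in the platform's native byte order.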
I think you haven't quite understood what traits are. A trait represents a characteristic of a type. For instance, since the size of u32 is known at compile time (32 bits), u32 implements the marker trait Sized, written u32: Sized. A more feature-complete trait is Default: if there is a "default" way of building a value of type T, we can implement Default for it, so that there is now a standard way of building one.
In your example, you are using a trait as a namespace for functions, i.e. you could simply have
fn access_view<'a, T>(
offset: usize,
length: usize,
buffer: &'a mut [u8], // take the buffer mutably, since a mutable view is handed out
) -> &'a mut [T]
{
let bytes = &mut buffer[offset..offset + length];
let ptr = bytes.as_mut_ptr() as *mut T;
unsafe {
std::slice::from_raw_parts_mut(ptr, length / std::mem::size_of::<T>())
}
}
Or, if you want to put it as a trait:
trait Viewable: Sized { // Sized is required because the method returns [Self]
fn access_view<'a>(
offset: usize,
length: usize,
buffer: &'a mut [u8],
) -> &'a mut [Self]
{
let bytes = &mut buffer[offset..offset + length];
let ptr = bytes.as_mut_ptr() as *mut Self;
unsafe {
std::slice::from_raw_parts_mut(ptr, length / std::mem::size_of::<Self>())
}
}
}
Then implement it:
impl<T> Viewable for T {}
Or, again, differently
trait Viewable: Sized { // Sized is required because the method returns [Self]
fn access_view<'a>(
offset: usize,
length: usize,
buffer: &'a mut [u8],
) -> &'a mut [Self];
}
impl<T> Viewable for T {
fn access_view<'a>(
offset: usize,
length: usize,
buffer: &'a mut [u8],
) -> &'a mut [Self]
{
let bytes = &mut buffer[offset..offset + length];
let ptr = bytes.as_mut_ptr() as *mut T;
unsafe {
std::slice::from_raw_parts_mut(ptr, length / std::mem::size_of::<T>())
}
}
}
Although all these ways of structuring the code produce the same result, that doesn't mean they are equivalent. Maybe you should learn a little more about traits before using them.
Also, your code, as is, really seems unsound, in the sense that you make a call to an unsafe function without any checking (i.e. what if I call it with random nonsense in buffer?). It doesn't mean it is (we don't have access to the rest of your code), but you should be careful about that: Rust is not C.
Finally, your error simply comes from the fact that it's impossible for Rust to find out which type T you are calling the associated method access_view of.
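To illustrate what "connecting a trait to an actual type" can look like for this problem, here is a minimal safe sketch that copies values out of the buffer instead of viewing them in place. The trait FromLeBytes, its SIZE constant, and the function decode_range are hypothetical names invented for this example, and little-endian input is an assumption.

```rust
/// Hypothetical trait: types that can be decoded from little-endian bytes.
trait FromLeBytes: Sized {
    /// Number of bytes one value occupies.
    const SIZE: usize;
    /// Decode one value from exactly `SIZE` bytes.
    fn from_le(bytes: &[u8]) -> Self;
}

impl FromLeBytes for f32 {
    const SIZE: usize = 4;
    fn from_le(b: &[u8]) -> Self {
        f32::from_le_bytes([b[0], b[1], b[2], b[3]])
    }
}

impl FromLeBytes for u16 {
    const SIZE: usize = 2;
    fn from_le(b: &[u8]) -> Self {
        u16::from_le_bytes([b[0], b[1]])
    }
}

/// Range-checked decode of `buffer[offset..offset + length]` into values of `T`.
/// Returns `None` if the range is out of bounds or not a whole number of values.
fn decode_range<T: FromLeBytes>(buffer: &[u8], offset: usize, length: usize) -> Option<Vec<T>> {
    let end = offset.checked_add(length)?;
    let bytes = buffer.get(offset..end)?;
    if bytes.len() % T::SIZE != 0 {
        return None;
    }
    Some(bytes.chunks_exact(T::SIZE).map(T::from_le).collect())
}
```

The type is chosen at the call site, for example decode_range::<f32>(&data, 2, 4), which gives the compiler the information it could not infer in the question. The price is a copy, but alignment and endianness stop being a concern.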
I am working on a project that involves reading different information from a file at different offsets.
Currently, I am using the following code:
// ------------------------ SECTORS PER CLUSTER ------------------------
// starts at 13
opened_file.seek(SeekFrom::Start(13)).unwrap();
let aux: &mut [u8] = &mut [0; 1];
let _buf = opened_file.read_exact(aux);
// ------------------------ RESERVED SECTORS ------------------------
// starts at 14
opened_file.seek(SeekFrom::Start(14)).unwrap();
let aux: &mut [u8] = &mut [0; 2];
let _buf = opened_file.read_exact(aux);
But as you can see, I need to create a new buffer of the size I want to read every time. I can't specify it directly as a parameter of the function.
I created a struct but I could not make a struct of all the different pieces of data I wanted. For example:
struct FileStruct {
a1: &mut [u8] &mut [0; 1],
a2: &mut [u8] &mut [0; 2],
}
Which are the types that are required for the read_exact method to work?
Is there a more effective way to read information from different offsets of a file without having to repeatedly copy-paste these lines of code for every piece of information I want to read from the file? Some sort of function, Cursor, or Vector to easily move around the offset? And a way to write this info into struct fields?
The easiest way is to have a struct of owned arrays, then seek and read into the struct.
use std::io::{self, prelude::*, SeekFrom};
#[derive(Debug, Clone, Default)]
struct FileStruct {
a1: [u8; 1],
a2: [u8; 2],
}
fn main() -> io::Result<()> {
let mut file_struct: FileStruct = Default::default();
let mut opened_file: std::fs::File = unimplemented!(); // open file somehow
opened_file.seek(SeekFrom::Start(13))?;
opened_file.read_exact(&mut file_struct.a1)?;
opened_file.seek(SeekFrom::Start(14))?;
opened_file.read_exact(&mut file_struct.a2)?;
println!("{:?}", file_struct);
Ok(())
}
This is still decently repetitive, so you can make a seek_read function to reduce the repetition:
use std::io::{self, prelude::*, SeekFrom};
#[derive(Debug, Clone, Default)]
struct FileStruct {
a1: [u8; 1],
a2: [u8; 2],
}
fn seek_read(mut reader: impl Read + Seek, offset: u64, buf: &mut [u8]) -> io::Result<()> {
reader.seek(SeekFrom::Start(offset))?;
reader.read_exact(buf)?;
Ok(())
}
fn main() -> io::Result<()> {
let mut file_struct: FileStruct = Default::default();
let mut opened_file: std::fs::File = unimplemented!(); // open file somehow
seek_read(&mut opened_file, 13, &mut file_struct.a1)?;
seek_read(&mut opened_file, 14, &mut file_struct.a2)?;
println!("{:?}", file_struct);
Ok(())
}
The repetition can be lowered even more by using a macro:
use std::io::{self, prelude::*, SeekFrom};
#[derive(Debug, Clone, Default)]
struct FileStruct {
a1: [u8; 1],
a2: [u8; 2],
}
macro_rules! read_offsets {
($file: ident, $file_struct: ident, []) => {};
($file: ident, $file_struct: ident, [$offset: expr => $field: ident $(, $offsets: expr => $fields: ident)*]) => {
$file.seek(SeekFrom::Start($offset))?;
$file.read_exact(&mut $file_struct.$field)?;
read_offsets!($file, $file_struct, [$($offsets => $fields),*]);
}
}
fn main() -> io::Result<()> {
let mut file_struct: FileStruct = Default::default();
let mut opened_file: std::fs::File = unimplemented!(); // open file somehow
read_offsets!(opened_file, file_struct, [13 => a1, 14 => a2]);
println!("{:?}", file_struct);
Ok(())
}
This is a complementary answer to Aplet123's: it's not clear that you must store the bytes as-is in a structure, so you can also allocate one buffer (as a fixed-size array) and reuse it with a correctly sized slice, e.g.
let mut buf = [0u8;16];
opened_file.read_exact(&mut buf[..4])?; // will read 4 bytes
// do thing with the first 4 bytes
opened_file.read_exact(&mut buf[..8])?; // will read 8 bytes this time
// etc...
You could also use the byteorder crate, which lets you directly read numbers or sequences of numbers. It basically just does the underlying "create stack buffer of the right size; read; decode" for you.
That's especially useful because it looks a lot like "SECTORS PER CLUSTER" should be a u8 and "RESERVED SECTORS" should be a u16. With byteorder you can directly call read_u8() or read_u16::<LittleEndian>() (or BigEndian, depending on the file format).
Also building on Aplet123's answer, the following seek_read function doesn't need to know at compile time how many bytes to read, since it returns a Vec instead of filling a byte slice:
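The same "stack buffer, read, decode" idea can be sketched with the standard library alone. The helpers read_u8_at and read_u16_le_at are hypothetical names, and little-endian byte order is an assumption about the file format that the question does not state.

```rust
use std::io::{self, Read, Seek, SeekFrom};

/// Hypothetical helper: read a single byte located at `offset`.
fn read_u8_at<R: Read + Seek>(reader: &mut R, offset: u64) -> io::Result<u8> {
    let mut buf = [0u8; 1];
    reader.seek(SeekFrom::Start(offset))?;
    reader.read_exact(&mut buf)?;
    Ok(buf[0])
}

/// Hypothetical helper: read a little-endian u16 located at `offset`.
/// Little-endian is an assumption about the file format.
fn read_u16_le_at<R: Read + Seek>(reader: &mut R, offset: u64) -> io::Result<u16> {
    let mut buf = [0u8; 2];
    reader.seek(SeekFrom::Start(offset))?;
    reader.read_exact(&mut buf)?;
    Ok(u16::from_le_bytes(buf))
}
```

With these, the two fields from the question could be read as read_u8_at(&mut opened_file, 13)? and read_u16_le_at(&mut opened_file, 14)?, yielding numbers rather than byte arrays.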
use std::io::{Read, Result, Seek, SeekFrom};
// Starting at `offset`, reads the `amount_to_read` from `reader`.
// Returns the bytes as a vector.
fn seek_read(
reader: &mut (impl Read + Seek),
offset: u64,
amount_to_read: usize,
) -> Result<Vec<u8>> {
// A buffer filled with as many zeros as we'll read with read_exact
let mut buf = vec![0; amount_to_read];
reader.seek(SeekFrom::Start(offset))?;
reader.read_exact(&mut buf)?;
Ok(buf)
}
Here are some tests to demonstrate how seek_read behaves:
use std::io::Cursor;
#[test]
fn seek_read_works() {
let bytes = b"Hello world!";
let mut reader = Cursor::new(bytes);
assert_eq!(seek_read(&mut reader, 0, 2).unwrap(), b"He");
assert_eq!(seek_read(&mut reader, 1, 4).unwrap(), b"ello");
assert_eq!(seek_read(&mut reader, 6, 5).unwrap(), b"world");
assert_eq!(seek_read(&mut reader, 2, 0).unwrap(), b"");
}
#[test]
#[should_panic(expected = "failed to fill whole buffer")]
fn seek_read_beyond_buffer_fails() {
let mut reader = Cursor::new(b"Hello world!");
seek_read(&mut reader, 6, 99).unwrap();
}
#[test]
#[should_panic(expected = "failed to fill whole buffer")]
fn start_seek_reading_beyond_buffer_fails() {
let mut reader = Cursor::new(b"Hello world!");
seek_read(&mut reader, 99, 1).unwrap();
}
I have an &[u8] and would like to turn it into an &[u8; 3] without copying. It should reference the original array. How can I do this?
As of Rust 1.34, you can use TryFrom / TryInto:
use std::convert::TryFrom;
fn example(slice: &[u8]) {
let array = <&[u8; 3]>::try_from(slice);
println!("{:?}", array);
}
fn example_mut(slice: &mut [u8]) {
let array = <&mut [u8; 3]>::try_from(slice);
println!("{:?}", array);
}
The arrayref crate implements this.
Here's an example, you can of course use it in different ways:
#[macro_use]
extern crate arrayref;
/// Get the first 3 elements of `bytes` as a reference to an array
/// **Panics** if `bytes` is too short.
fn first3(bytes: &[u8]) -> &[u8; 3] {
array_ref![bytes, 0, 3]
}
EDIT: TryFrom/TryInto has been stabilized as of Rust 1.34. Please see @shepmaster's answer for an updated method.
Just to re-emphasize, before TryFrom this could not be done without unsafe code, because you don't know until runtime that the slice has three elements in it.
fn slice_to_arr3<T>(slice: &[T]) -> Option<&[T; 3]> {
if slice.len() == 3 {
Some(unsafe { &*(slice as *const [T] as *const [T; 3]) })
} else {
None
}
}
This could not be made generic over the length of the array before const generics, which were stabilized in Rust 1.51.
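With const generics and the standard TryFrom implementation for array references, a length-generic version needs no unsafe code. A minimal sketch, where slice_to_array is a hypothetical name:

```rust
use std::convert::TryInto;

/// Hypothetical helper: view a slice as a reference to an array of length `N`.
/// Returns `None` unless the slice has exactly `N` elements. Nothing is copied.
fn slice_to_array<T, const N: usize>(slice: &[T]) -> Option<&[T; N]> {
    slice.try_into().ok()
}
```

The length can be given explicitly, as in slice_to_array::<u8, 3>(&bytes[..3]), or inferred from the type the result is assigned to.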
Why does the borrow checker complain about this code?
fn foo<'a>(v: &mut Vec<&'a str>, buf: &'a mut String) {
loop {
foo(v, buf);
}
}
error[E0499]: cannot borrow `*buf` as mutable more than once at a time
--> src/main.rs:3:16
|
3 | foo(v, buf);
| ^^^ mutable borrow starts here in previous iteration of loop
4 | }
5 | }
| - mutable borrow ends here
If I remove the lifetime bound, the code compiles fine.
fn foo(v: &mut Vec<&str>, buf: &mut String) {
loop {
foo(v, buf);
}
}
This isn't a duplicate of Mutable borrow in a loop, because there is no return value in my case.
I'm pretty sure that my final goal isn't achievable in safe Rust, but right now I want to better understand how the borrow checker works and I can not understand why adding a lifetime bound between parameters extends the lifetime of the borrow in this code.
The version with the explicit lifetime 'a ties the lifetime of the Vec to the lifetime of buf. This causes trouble when the Vec and the String are reborrowed. Reborrowing occurs when the arguments are passed to foo in the loop:
fn foo<'a>(v: &mut Vec<&'a str>, buf: &'a mut String) {
loop {
foo(&mut *v, &mut *buf);
}
}
This is done implicitly by the compiler to prevent the arguments from being consumed when foo is called in the loop. If the arguments were actually moved, they could not be used anymore (e.g. for successive calls to foo) after the first recursive call to foo.
Forcing buf to be moved around resolves the error:
fn foo<'a>(v: &mut Vec<&'a str>, buf: &'a mut String) {
foo_recursive(v, buf);
}
fn foo_recursive<'a>(v: &mut Vec<&'a str>, buf: &'a mut String) -> &'a mut String{
let mut buf_temp = buf;
loop {
let buf_loop = buf_temp;
buf_temp = foo_recursive(v, buf_loop);
// some break condition
}
buf_temp
}
However, things will break again as soon as you try to actually use buf. Here is a distilled version of your example demonstrating why the compiler forbids successive mutable borrows of buf:
fn foo<'a>(v: &mut Vec<&'a str>, buf: &'a mut String) {
bar(v, buf);
bar(v, buf);
}
fn bar<'a>(v: &mut Vec<&'a str>, buf: &'a mut String) {
if v.is_empty() {
// first call: push slice referencing "A" into 'v'
v.push(&buf[0..1]);
} else {
// second call: remove "A" while 'v' is still holding a reference to it - not allowed
buf.clear();
}
}
fn main() {
foo(&mut vec![], &mut String::from("A"));
}
The calls to bar are the equivalents to the recursive calls to foo in your example. Again the compiler complains that *buf cannot be borrowed as mutable more than once at a time. The provided implementation of bar shows that the lifetime specification on bar would allow this function to be implemented in such a way that v enters an invalid state. The compiler understands by looking at the signature of bar alone that data from buf could potentially flow into v and rejects the code as potentially unsafe regardless of the actual implementation of bar.
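If the buffer does not have to be mutated while the vector is being filled (an assumption, since the question does not say), a shared borrow sidesteps the conflict: &'a str is Copy, so it can be handed to several calls for the full lifetime 'a. A minimal sketch with made-up function names:

```rust
/// Hypothetical stand-in for `bar`: stores slices of `buf` in `v`.
fn collect_words<'a>(v: &mut Vec<&'a str>, buf: &'a str) {
    for word in buf.split_whitespace() {
        v.push(word);
    }
}

/// Hypothetical stand-in for `foo`: calling twice is fine, because a
/// shared reference can be copied and nothing can invalidate the
/// slices already stored in `v`.
fn collect_twice<'a>(v: &mut Vec<&'a str>, buf: &'a str) {
    collect_words(v, buf);
    collect_words(v, buf);
}
```

Here data can still flow from buf into v, but since no call is able to modify buf, the stored slices cannot be left dangling.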
fn change(a: &mut i32, b: &mut i32) {
let c = *a;
*a = *b;
*b = c;
}
fn main() {
let mut v = vec![1, 2, 3];
change(&mut v[0], &mut v[1]);
}
When I compile the code above, it has the error:
error[E0499]: cannot borrow `v` as mutable more than once at a time
--> src/main.rs:9:32
|
9 | change(&mut v[0], &mut v[1]);
| - ^ - first borrow ends here
| | |
| | second mutable borrow occurs here
| first mutable borrow occurs here
Why does the compiler prohibit it? v[0] and v[1] occupy different memory positions, so it's not dangerous to use these together. And what should I do if I come across this problem?
You can solve this with split_at_mut():
let mut v = vec![1, 2, 3];
let (a, b) = v.split_at_mut(1); // Returns (&mut [1], &mut [2, 3])
change(&mut a[0], &mut b[0]);
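The same idea can be wrapped in a small helper that works for any two indices, in either order, using only safe code. This is a sketch; two_mut is a hypothetical name.

```rust
/// Hypothetical helper: mutable references to two distinct elements.
/// Returns `None` if the indices are equal or out of bounds.
fn two_mut<T>(slice: &mut [T], i: usize, j: usize) -> Option<(&mut T, &mut T)> {
    if i == j || i >= slice.len() || j >= slice.len() {
        return None;
    }
    if i < j {
        // `left` holds everything before `j`, so it contains index `i`;
        // `right` starts at `j`.
        let (left, right) = slice.split_at_mut(j);
        Some((&mut left[i], &mut right[0]))
    } else {
        // Mirror case: `left` contains index `j`, `right` starts at `i`.
        let (left, right) = slice.split_at_mut(i);
        Some((&mut right[0], &mut left[j]))
    }
}
```

The returned pair is always in (i, j) order, so the result can be passed straight to a function like change.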
There are uncountably many safe things to do that the compiler unfortunately does not recognize yet. split_at_mut() is just like that, a safe abstraction implemented with an unsafe block internally.
We can do that too, for this problem. The following is something I use in code where I need to separate all three cases anyway (I: Index out of bounds, II: Indices equal, III: Separate indices).
enum Pair<T> {
Both(T, T),
One(T),
None,
}
fn index_twice<T>(slc: &mut [T], a: usize, b: usize) -> Pair<&mut T> {
if a == b {
slc.get_mut(a).map_or(Pair::None, Pair::One)
} else {
if a >= slc.len() || b >= slc.len() {
Pair::None
} else {
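// SAFETY: a and b are in bounds and distinct, so the two references never alias.
// Both are derived from one raw pointer, so creating the second cannot invalidate the first.
let ptr = slc.as_mut_ptr();
unsafe {
let ar = &mut *ptr.add(a);
let br = &mut *ptr.add(b);
Pair::Both(ar, br)
}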
}
}
}
Since Rust 1.26, pattern matching can be done on slices (the .. rest pattern used below requires Rust 1.42 or later). You can use that as long as you don't have huge indices and your indices are known at compile time.
fn change(a: &mut i32, b: &mut i32) {
let c = *a;
*a = *b;
*b = c;
}
fn main() {
let mut arr = [5, 6, 7, 8];
{
let [ref mut a, _, ref mut b, ..] = arr;
change(a, b);
}
assert_eq!(arr, [7, 6, 5, 8]);
}
The borrow rules of Rust need to be checked at compile time, which is why mutably borrowing just a part of a Vec is a very hard problem to solve (if not impossible), and why plain indexing does not allow it.
Thus, when you do something like &mut v[i], it will mutably borrow the entire vector.
Imagine I did something like
let guard = something(&mut v[i]);
do_something_else(&mut v[j]);
guard.do_job();
Here, I create an object guard that internally stores a mutable reference to v[i], and will do something with it when I call do_job().
In the meantime, I did something that changed v[j]. guard holds a mutable reference that is supposed to guarantee nothing else can modify v[i]. In this case, all is good, as long as i is different from j; if the two values are equal it is a huge violation of the borrow rules.
As the compiler cannot guarantee that i != j, it is thus forbidden.
This was a simple example, but similar cases are legion, and they are why such an access mutably borrows the whole container. On top of that, the compiler does not know enough about the internals of Vec to ensure that this operation is safe even if i != j.
In your precise case, you can have a look at the swap(..) method available on Vec that does the swap you are manually implementing.
In a more generic case, you'll probably need another container. Possibilities are wrapping all the values of your Vec into a type with interior mutability, such as Cell or RefCell, or even using a completely different container, as @llogiq suggested in his answer with par-vec.
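As a minimal sketch of the interior mutability option, the standard library can view a mutable slice as a slice of cells, after which any number of elements can be changed through shared references. The function name swap_via_cells is made up for this example; for a plain swap, v.swap(i, j) remains the simplest choice.

```rust
use std::cell::Cell;

/// Hypothetical helper: swap two elements by going through `Cell`s.
/// `i == j` is harmless here, because no `&mut` to an element is ever created.
fn swap_via_cells(v: &mut [i32], i: usize, j: usize) {
    // View the exclusive slice borrow as a shared slice of cells.
    let cells: &[Cell<i32>] = Cell::from_mut(v).as_slice_of_cells();
    let tmp = cells[i].get();
    cells[i].set(cells[j].get());
    cells[j].set(tmp);
}
```

This works without copying the container because the function still holds the exclusive borrow of the slice for the whole time the cells are in use.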
The method [T]::iter_mut() returns an iterator that can yield a mutable reference for each element in the slice. Other collections have an iter_mut method too. These methods often encapsulate unsafe code, but their interface is totally safe.
Here's a general purpose extension trait that adds a method on slices that returns mutable references to two distinct items by index:
use std::cmp::Ordering;
pub trait SliceExt {
type Item;
fn get_two_mut(&mut self, index0: usize, index1: usize) -> (&mut Self::Item, &mut Self::Item);
}
impl<T> SliceExt for [T] {
type Item = T;
fn get_two_mut(&mut self, index0: usize, index1: usize) -> (&mut Self::Item, &mut Self::Item) {
match index0.cmp(&index1) {
Ordering::Less => {
let mut iter = self.iter_mut();
let item0 = iter.nth(index0).unwrap();
let item1 = iter.nth(index1 - index0 - 1).unwrap();
(item0, item1)
}
Ordering::Equal => panic!("[T]::get_two_mut(): received same index twice ({})", index0),
Ordering::Greater => {
let mut iter = self.iter_mut();
let item1 = iter.nth(index1).unwrap();
let item0 = iter.nth(index0 - index1 - 1).unwrap();
(item0, item1)
}
}
}
}
On older nightlies this was available as get_many_mut() (it has since been stabilized in Rust 1.86 under the name get_disjoint_mut()):
#![feature(get_many_mut)]
fn main() {
let mut v = vec![1, 2, 3];
let [a, b] = v
.get_many_mut([0, 1])
.expect("out of bounds or overlapping indices");
change(a, b);
}
Building on the answer by @bluss, you can use split_at_mut() to create a function that turns a mutable borrow of a vector into a vector of mutable borrows of its elements:
fn borrow_mut_elementwise<'a, T>(v:&'a mut Vec<T>) -> Vec<&'a mut T> {
let mut result:Vec<&mut T> = Vec::new();
let mut current: &mut [T];
let mut rest = &mut v[..];
while rest.len() > 0 {
(current, rest) = rest.split_at_mut(1);
result.push(&mut current[0]);
}
result
}
Then you can use it to get a binding that lets you mutate many items of original Vec at once, even while you are iterating over them (if you access them by index in your loop, not through any iterator):
let mut items = vec![1,2,3];
let mut items_mut = borrow_mut_elementwise(&mut items);
for i in 1..items_mut.len() {
*items_mut[i-1] = *items_mut[i];
}
println!("{:?}", items); // [2, 3, 3]
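The same vector of element borrows can also be obtained from iter_mut(), which already yields one mutable reference per element. A minimal sketch, with elementwise_mut as a hypothetical name:

```rust
/// Hypothetical helper: collect one mutable reference per element.
/// `iter_mut()` guarantees the references are disjoint.
fn elementwise_mut<T>(v: &mut [T]) -> Vec<&mut T> {
    v.iter_mut().collect()
}
```

Taking a slice instead of &mut Vec<T> lets the helper accept arrays and sub-slices as well.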
The problem is that &mut v[…] first mutably borrows v and then gives the mutable reference to the element to the change-function.
This reddit comment has a solution to your problem.
Edit: Thanks for the heads-up, Shepmaster. par-vec is a library that allows mutably borrowing disjoint partitions of a vec.
I published my everyday utilities for this on crates.io as the arref crate.
You may use it like
use arref::array_mut_ref;
let mut arr = vec![1, 2, 3, 4];
let (a, b) = array_mut_ref!(&mut arr, [1, 2]);
assert_eq!(*a, 2);
assert_eq!(*b, 3);
let (a, b, c) = array_mut_ref!(&mut arr, [1, 2, 0]);
assert_eq!(*c, 1);
// ⚠️ The following code will panic. Because we borrow the same element twice.
// let (a, b) = array_mut_ref!(&mut arr, [1, 1]);
It's a simple wrapper around the following code, which is sound. But it requires that the two indexes are different at runtime.
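pub fn array_mut_ref<T>(arr: &mut [T], a0: usize, a1: usize) -> (&mut T, &mut T) {
assert!(a0 != a1);
// Check bounds up front, as indexing would have done.
assert!(a0 < arr.len() && a1 < arr.len());
let ptr = arr.as_mut_ptr();
// SAFETY: both indices are in bounds and a0 != a1, so the references never alias.
// Both are derived from one raw pointer, so creating the second cannot invalidate the first.
unsafe { (&mut *ptr.add(a0), &mut *ptr.add(a1)) }
}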
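Alternatively, you may use mut_twice, which does not panic when the two indices are equal:
#[inline]
pub fn mut_twice<T>(arr: &mut [T], a0: usize, a1: usize) -> Result<(&mut T, &mut T), &mut T> {
if a0 == a1 {
Err(&mut arr[a0])
} else {
// Check bounds up front, as indexing would have done.
assert!(a0 < arr.len() && a1 < arr.len());
let ptr = arr.as_mut_ptr();
// SAFETY: both indices are in bounds and distinct, so the references never alias.
unsafe { Ok((&mut *ptr.add(a0), &mut *ptr.add(a1))) }
}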
}