Rust Strings and C variadic functions

I need to pass a vector of Rust strings to a C variadic function, but I can't figure out which format it expects (CString, [u8], ...).
References:
API reference: isc_event_block
My extern C declaration: ibase
How I'm calling: que_events
C example: api16
API implementation: isc_event_block
My api16 example code version:
strcpy (ids[0], "evento");
length = (short) isc_event_block((char **) &event_buffer, (char **) &result_buffer, 1, ids[0], 0);
printf("event_buffer: '%s' %d\n", event_buffer, sizeof(event_buffer));
printf("result_buffer: '%s' %d\n", result_buffer, sizeof(result_buffer));
Result of version of api16:
event_buffer: 'evento' 8
result_buffer: '' 8
My Rust code:
let mut event_buffer: Vec<u8> = Vec::with_capacity(256);
let mut result_buffer: Vec<u8> = Vec::with_capacity(256);
let mut len = 0;
let names = "evento".to_string();
unsafe {
len = self.ibase.isc_event_block()(
event_buffer.as_mut_ptr() as *mut _,
result_buffer.as_mut_ptr() as *mut _,
names.len() as u16,
names.as_ptr()
);
event_buffer.set_len(len as usize);
result_buffer.set_len(len as usize);
}
println!("{:?} {:?}", len, names);
println!("{:x?} {:x?}", event_buffer, result_buffer);
println!("{:?} {:?}", String::from_utf8_lossy(&event_buffer.clone()), String::from_utf8_lossy(&result_buffer.clone()));
Result of my Rust code:
12 ["evento"]
[e0, 4f, 51, 28, a8, 7f, 0, 0, 0, 0, 0, 0] [0, 50, 51, 28, a8, 7f, 0, 0, 0, 0, 0, 0]
"�OQ(�\u{7f}\0\0\0\0\0\0" "\0PQ(�\u{7f}\0\0\0\0\0\0"
I already tried use CString or CStr, like here.
What am I doing wrong?

You're doing several things wrong in the Rust version. For the first two arguments, you're meant to pass a pointer to a location that can hold a single pointer to a byte buffer. Instead, you're passing a pointer to a byte buffer, so a pointer gets written into that buffer, which isn't what you want.
Secondly, the id_count parameter is the number of strings you pass as variadic arguments, not the length of a single string, so your C code reads a bunch of uninitialized memory, which definitely isn't what you want. Additionally, each string must be null-terminated, and yours isn't, so you do need CString. What you really want is something like this:
use std::ffi::{c_char, c_long, c_ushort, CStr, CString};
use std::ptr;
use std::slice;
use std::str;
fn main() {
let mut event_buffer = ptr::null_mut();
let mut result_buffer = ptr::null_mut();
let names = CString::new("evento").unwrap();
let len = unsafe {
isc_event_block(
&mut event_buffer,
&mut result_buffer,
1,
names.as_ptr() as *mut c_char,
)
};
debug_assert!(!event_buffer.is_null() && !result_buffer.is_null());
let event_slice = unsafe { slice::from_raw_parts(event_buffer.cast(), len as usize) };
let result_slice = unsafe { slice::from_raw_parts(result_buffer.cast(), len as usize) };
let event_str = str::from_utf8(event_slice).unwrap();
let result_str = str::from_utf8(result_slice).unwrap();
println!("event: {event_str}");
println!("result: {result_str}");
}
Adding a simple stub to print the string passed in, and write a couple of strings into the buffers, I got this:
"evento"
event: some events
result: some events
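For the vector-of-strings case from the question, one possible way to prepare the arguments is sketched below. It only covers the Rust side: converting each name to a CString and collecting the pointers. The helper name to_c_strings is hypothetical, and the actual isc_event_block call is left out because it needs the Firebird client library.

```rust
use std::ffi::{c_char, CString};

// Hypothetical helper: convert Rust strings into owned, null-terminated C strings.
// Returns None if any name contains an interior NUL byte.
fn to_c_strings(names: &[&str]) -> Option<Vec<CString>> {
    names.iter().map(|n| CString::new(*n).ok()).collect()
}

fn main() {
    let names = ["evento", "other"];
    let c_names = to_c_strings(&names).expect("names must not contain NUL bytes");

    // The pointers are valid only while `c_names` is alive.
    let ptrs: Vec<*const c_char> = c_names.iter().map(|s| s.as_ptr()).collect();

    // This is the value for the id_count parameter: the number of strings,
    // not the byte length of any single one.
    let id_count = ptrs.len() as u16;

    println!("id_count = {id_count}");
    // Unlike String::as_bytes, this includes the trailing 0 byte C expects.
    println!("{:?}", c_names[0].as_bytes_with_nul());
}
```

Because a C variadic call needs its argument list spelled out at the call site, each pointer still has to be passed as a separate argument; the Vec only keeps the CStrings alive and counted.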

Related

Rust How do you define a default method for similar types?

A very common pattern I have to deal with is, I am given some raw byte data. This data can represent an array of floats, 2D vectors, Matrices...
I know the data is compact and properly aligned. In C usually you would just do:
vec3 * ptr = (vec3*)data;
And start reading from it.
I am trying to create a view to this kind of data in rust to be able to read and write to the buffer as follows:
pub trait AccessView<T>
{
fn access_view<'a>(
offset : usize,
length : usize,
buffer : &'a Vec<u8>) -> &'a mut [T]
{
let bytes = &buffer[offset..(offset + length)];
let ptr = bytes.as_ptr() as *mut T;
return unsafe { std::slice::from_raw_parts_mut(ptr, length / size_of::<T>()) };
}
}
And then calling it:
let data: &[f32] =
AccessView::<f32>::access_view(0, 32, &buffers[0]);
The idea is, I should be able to replace f32 with vec3 or mat4 and get a slice view into the underlying data.
This fails to compile with:
--> src/main.rs:341:9
|
341 | AccessView::<f32>::access_view(&accessors[0], &buffer_views, &buffers);
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ cannot infer type
|
= note: cannot satisfy `_: AccessView<f32>`
How could I use rust to achieve my goal? i.e. have a generic "template" for turning a set of raw bytes into a range checked slice view casted to some type.
There are two important problems I can identify:
1. You are using a trait incorrectly. A trait has to be implemented for an actual type. If you want to call it the way you do, it needs to be a struct instead.
2. Soundness. You are creating a mutable reference from an immutable one through unsafe code. This is unsound and dangerous. By using unsafe, you tell the compiler that you manually verified that your code is sound, and the borrow checker should blindly believe you. Your code, however, is not sound.
Regarding part 1, @BlackBeans gave you a good answer already. I would still do it a little differently, though: I would implement the trait directly for [u8], so you can write data.access_view::<T>(offset, length).
Regarding part 2, you at least need to make the input data &mut. Further, make sure the input and the output have the same lifetime, otherwise the compiler might not realize that they are connected.
Also, don't use &Vec<u8> as an argument; in general, use slices (&[u8]) instead.
Be aware that, with all that said, there is still the problem of endianness. The behavior will not be consistent between platforms. Use other means of conversion if you require that. Do not put this code in a generic library; at most, use it for your own personal project.
That all said, here is what I came up with:
pub trait AccessView {
fn access_view<'a, T>(&'a mut self, offset: usize, length: usize) -> &'a mut [T];
}
impl AccessView for [u8] {
fn access_view<T>(&mut self, offset: usize, length: usize) -> &mut [T] {
let bytes = &mut self[offset..(offset + length)];
let ptr = bytes.as_mut_ptr() as *mut T;
return unsafe { std::slice::from_raw_parts_mut(ptr, length / ::std::mem::size_of::<T>()) };
}
}
impl AccessView for Vec<u8> {
fn access_view<T>(&mut self, offset: usize, length: usize) -> &mut [T] {
self.as_mut_slice().access_view(offset, length)
}
}
fn main() {
let mut data: Vec<u8> = vec![1, 2, 3, 4, 5, 6, 7, 8];
println!("{:?}", data);
let float_view: &mut [f32] = data.access_view(2, 4);
float_view[0] = 42.0;
println!("{:?}", float_view);
println!("{:?}", data);
// println!("{:?}", float_view); // Adding this would cause a compiler error, which shows that we implemented lifetimes correctly
}
[1, 2, 3, 4, 5, 6, 7, 8]
[42.0]
[1, 2, 0, 0, 40, 66, 7, 8]
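If a copy is acceptable, safe code avoids the endianness problem mentioned above, and also any alignment requirement on the buffer. The following is a minimal sketch, assuming the buffer stores little-endian f32 values; the function name read_f32s_le is made up for illustration.

```rust
// Decode `length` bytes starting at `offset` as little-endian f32 values.
// Returns None if the range is out of bounds or not a multiple of 4 bytes.
fn read_f32s_le(buffer: &[u8], offset: usize, length: usize) -> Option<Vec<f32>> {
    let end = offset.checked_add(length)?;
    let bytes = buffer.get(offset..end)?;
    if length % 4 != 0 {
        return None;
    }
    Some(
        bytes
            .chunks_exact(4)
            .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
            .collect(),
    )
}

fn main() {
    // Same bytes as the final state of `data` in the example above.
    let data: Vec<u8> = vec![1, 2, 0, 0, 40, 66, 7, 8];
    println!("{:?}", read_f32s_le(&data, 2, 4)); // prints Some([42.0])
}
```

The trade-off is that this returns owned values rather than a view, so writes have to go back through f32::to_le_bytes.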
I think you haven't quite understood what traits are. Traits represent a characteristic of a type. For instance, since the size of u32 is known at compile time (32 bits), u32 implements the marker trait Sized, written u32: Sized. A more feature-complete trait is Default: if there is a "default" way of building a value of type T, then we can implement Default for T, so that there is a standard way of building it.
In your example, you are using a trait as a namespace for functions, i.e. you could simply have
fn access_view<'a, T>(
offset: usize,
length: usize,
buffer: &'a [u8]
) -> &'a mut [T]
{
let bytes = &buffer[offset..offset+length];
let ptr = bytes.as_ptr() as *mut T;
unsafe {
std::slice::from_raw_parts_mut(ptr, length / std::mem::size_of::<T>())
}
}
Or, if you want to put it as a trait:
trait Viewable: Sized {
fn access_view<'a>(
offset: usize,
length: usize,
buffer: &'a [u8],
) -> &'a mut [Self]
{
let bytes = &buffer[offset..offset+length];
let ptr = bytes.as_ptr() as *mut Self;
unsafe {
std::slice::from_raw_parts_mut(ptr, length / std::mem::size_of::<Self>())
}
}
}
Then implement it:
impl<T> Viewable for T {}
Or, again, differently
trait Viewable: Sized {
fn access_view<'a>(
offset: usize,
length: usize,
buffer: &'a [u8],
) -> &'a mut [Self];
}
impl<T> Viewable for T {
fn access_view<'a>(
offset: usize,
length: usize,
buffer: &'a [u8],
) -> &'a mut [Self]
{
let bytes = &buffer[offset..offset+length];
let ptr = bytes.as_ptr() as *mut T;
unsafe {
std::slice::from_raw_parts_mut(ptr, length / std::mem::size_of::<T>())
}
}
}
Although all these ways of structuring the code produce the same result, that doesn't mean they're equivalent. Maybe you should learn a little more about traits before using them.
Also, your code, as is, really seems unsound, in the sense that you call an unsafe function without any checking (i.e. what if I call it with random nonsense in buffer?). That doesn't mean it is unsound (we don't have access to the rest of your code), but you should be careful about that: Rust is not C.
Finally, your error simply comes from the fact that Rust cannot work out for which type you are calling the associated function access_view: the call names the trait but not the implementing type.
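To make the "without any checking" point concrete, here is a rough sketch of a view function that validates bounds, element count and alignment before touching unsafe code. The names checked_view and Plain are hypothetical, and the marker trait is only one assumed way to restrict T to types for which any bit pattern is valid.

```rust
use std::mem::{align_of, size_of};

// Hypothetical marker trait: implement it only for types where every bit
// pattern is a valid value (plain numbers, not bool, enums or references).
unsafe trait Plain: Copy {}
unsafe impl Plain for f32 {}
unsafe impl Plain for u32 {}

fn checked_view<T: Plain>(buffer: &[u8], offset: usize, length: usize) -> Option<&[T]> {
    let end = offset.checked_add(length)?;
    let bytes = buffer.get(offset..end)?; // range check
    if size_of::<T>() == 0 || length % size_of::<T>() != 0 {
        return None; // not a whole number of elements
    }
    if (bytes.as_ptr() as usize) % align_of::<T>() != 0 {
        return None; // start address is misaligned for T
    }
    // SAFETY: the range is in bounds, aligned for T, and holds a whole number
    // of elements; `T: Plain` promises that any bit pattern is a valid T.
    // The returned lifetime is tied to `buffer` by elision.
    Some(unsafe {
        std::slice::from_raw_parts(bytes.as_ptr() as *const T, length / size_of::<T>())
    })
}

fn main() {
    // Back the bytes with an f32 array so the start is known to be aligned.
    let floats = [1.0f32, 2.0];
    let bytes = unsafe { std::slice::from_raw_parts(floats.as_ptr() as *const u8, 8) };

    // T is named explicitly with the turbofish, so nothing is left to infer.
    println!("{:?}", checked_view::<f32>(bytes, 0, 8));
    println!("{:?}", checked_view::<f32>(bytes, 1, 4));
}
```

This version hands out a shared slice only; a mutable variant would need to take &mut [u8] for the reasons given in the other answer.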

Reading from file at different offsets using Rust

I am working on a project that involves reading different information from a file at different offsets.
Currently, I am using the following code:
// ------------------------ SECTORS PER CLUSTER ------------------------
// starts at 13
opened_file.seek(SeekFrom::Start(13)).unwrap();
let aux: &mut [u8] = &mut [0; 1];
let _buf = opened_file.read_exact(aux);
// ------------------------ RESERVED SECTORS ------------------------
// starts at 14
opened_file.seek(SeekFrom::Start(14)).unwrap();
let aux: &mut [u8] = &mut [0; 2];
let _buf = opened_file.read_exact(aux);
But as you can see, I need to create a new buffer of the size I want to read every time. I can't specify it directly as a parameter of the function.
I created a struct but I could not make a struct of all the different pieces of data I wanted. For example:
struct FileStruct {
a1: &mut [u8] &mut [0; 1],
a2: &mut [u8] &mut [0; 2],
}
Which are the types that are required for the read_exact method to work?
Is there a more effective way to read information from different offsets of a file without having to repeatedly copy-paste these lines of code for every piece of information I want to read from the file? Some sort of function, Cursor, or Vector to easily move around the offset? And a way to write this info into struct fields?
The easiest way is to have a struct of owned arrays, then seek and read into the struct.
use std::io::{self, prelude::*, SeekFrom};
#[derive(Debug, Clone, Default)]
struct FileStruct {
a1: [u8; 1],
a2: [u8; 2],
}
fn main() -> io::Result<()> {
let mut file_struct: FileStruct = Default::default();
let mut opened_file = unimplemented!(); // open file somehow
opened_file.seek(SeekFrom::Start(13))?;
opened_file.read_exact(&mut file_struct.a1)?;
opened_file.seek(SeekFrom::Start(14))?;
opened_file.read_exact(&mut file_struct.a2)?;
println!("{:?}", file_struct);
Ok(())
}
This is still decently repetitive, so you can make a seek_read function to reduce the repetition:
use std::io::{self, prelude::*, SeekFrom};
#[derive(Debug, Clone, Default)]
struct FileStruct {
a1: [u8; 1],
a2: [u8; 2],
}
fn seek_read(mut reader: impl Read + Seek, offset: u64, buf: &mut [u8]) -> io::Result<()> {
reader.seek(SeekFrom::Start(offset))?;
reader.read_exact(buf)?;
Ok(())
}
fn main() -> io::Result<()> {
let mut file_struct: FileStruct = Default::default();
let mut opened_file = unimplemented!(); // open file somehow
seek_read(&mut opened_file, 13, &mut file_struct.a1)?;
seek_read(&mut opened_file, 14, &mut file_struct.a2)?;
println!("{:?}", file_struct);
Ok(())
}
The repetition can be lowered even more by using a macro:
use std::io::{self, prelude::*, SeekFrom};
#[derive(Debug, Clone, Default)]
struct FileStruct {
a1: [u8; 1],
a2: [u8; 2],
}
macro_rules! read_offsets {
($file: ident, $file_struct: ident, []) => {};
($file: ident, $file_struct: ident, [$offset: expr => $field: ident $(, $offsets: expr => $fields: ident)*]) => {
$file.seek(SeekFrom::Start($offset))?;
$file.read_exact(&mut $file_struct.$field)?;
read_offsets!($file, $file_struct, [$($offsets => $fields),*]);
}
}
fn main() -> io::Result<()> {
let mut file_struct: FileStruct = Default::default();
let mut opened_file = unimplemented!(); // open file somehow
read_offsets!(opened_file, file_struct, [13 => a1, 14 => a2]);
println!("{:?}", file_struct);
Ok(())
}
This is a complementary answer to Aplet123's: it's not quite clear that you must store the bytes as is into a structure, so you can also allocate one buffer (as a fixed-size array) and reuse it with the correctly sized slice e.g.
let mut buf = [0u8;16];
opened_file.read_exact(&mut buf[..4])?; // will read 4 bytes
// do thing with the first 4 bytes
opened_file.read_exact(&mut buf[..8])?; // will read 8 bytes this time
// etc...
You could also use the byteorder crate, which lets you directly read numbers or sequences of numbers. It basically just does the underlying "create stack buffer of the right size; read; decode" for you.
That's especially useful because it looks a lot like "SECTORS PER CLUSTER" should be a u8 and "RESERVED SECTORS" should be a u16. With byteorder you can call read_u8() or read_u16::<E>() directly, where E is the byte order of the file format (for example LittleEndian).
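The same "stack buffer; read; decode" idea can be sketched with the standard library alone. The helper names below are hypothetical, and little-endian storage of the two-byte field is an assumption about the file format, not something stated in the question.

```rust
use std::io::{self, Cursor, Read, Seek, SeekFrom};

// Read a single byte at `offset`.
fn read_u8_at(reader: &mut (impl Read + Seek), offset: u64) -> io::Result<u8> {
    let mut buf = [0u8; 1];
    reader.seek(SeekFrom::Start(offset))?;
    reader.read_exact(&mut buf)?;
    Ok(buf[0])
}

// Read two bytes at `offset` and decode them as a little-endian u16.
fn read_u16_le_at(reader: &mut (impl Read + Seek), offset: u64) -> io::Result<u16> {
    let mut buf = [0u8; 2];
    reader.seek(SeekFrom::Start(offset))?;
    reader.read_exact(&mut buf)?;
    Ok(u16::from_le_bytes(buf))
}

fn main() -> io::Result<()> {
    // In-memory stand-in for the opened file; the byte values are made up.
    let mut image = vec![0u8; 16];
    image[13] = 8;
    image[14] = 0x20;
    image[15] = 0x00;
    let mut reader = Cursor::new(image);

    let sectors_per_cluster = read_u8_at(&mut reader, 13)?;
    let reserved_sectors = read_u16_le_at(&mut reader, 14)?;
    println!("{sectors_per_cluster} {reserved_sectors}"); // prints "8 32"
    Ok(())
}
```

Returning typed values instead of raw byte arrays also means the struct fields can be declared as u8 and u16 directly.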
Also building on Aplet123's answer, the following seek_read function doesn't need to know at compile time how many bytes to read, since it returns a Vec instead of filling a byte slice:
use std::io::{Read, Result, Seek, SeekFrom};
// Starting at `offset`, reads `amount_to_read` bytes from `reader`.
// Returns the bytes as a vector.
fn seek_read(
reader: &mut (impl Read + Seek),
offset: u64,
amount_to_read: usize,
) -> Result<Vec<u8>> {
// A buffer filled with as many zeros as we'll read with read_exact
let mut buf = vec![0; amount_to_read];
reader.seek(SeekFrom::Start(offset))?;
reader.read_exact(&mut buf)?;
Ok(buf)
}
Here are some tests to demonstrate how seek_read behaves:
use std::io::Cursor;
#[test]
fn seek_read_works() {
let bytes = b"Hello world!";
let mut reader = Cursor::new(bytes);
assert_eq!(seek_read(&mut reader, 0, 2).unwrap(), b"He");
assert_eq!(seek_read(&mut reader, 1, 4).unwrap(), b"ello");
assert_eq!(seek_read(&mut reader, 6, 5).unwrap(), b"world");
assert_eq!(seek_read(&mut reader, 2, 0).unwrap(), b"");
}
#[test]
#[should_panic(expected = "failed to fill whole buffer")]
fn seek_read_beyond_buffer_fails() {
let mut reader = Cursor::new(b"Hello world!");
seek_read(&mut reader, 6, 99).unwrap();
}
#[test]
#[should_panic(expected = "failed to fill whole buffer")]
fn start_seek_reading_beyond_buffer_fails() {
let mut reader = Cursor::new(b"Hello world!");
seek_read(&mut reader, 99, 1).unwrap();
}

Remove extra length from string converted from array in Rust

I am trying to learn async in Rust with tokio. I am trying to take input from the terminal using tokio::io::AsyncReadExt::read, which needs an array as a buffer. But when I convert that buffer into a string, I can't compare it with other strings because, I think, it has extra length.
Here is the minimal code:
use std::process::Command;
use tokio::prelude::*;
use tokio::time;
async fn get_input(prompt: &str) -> String {
println!("{}", prompt);
let mut f = io::stdin();
let mut buffer = [0; 10];
// read up to 10 bytes
f.read(&mut buffer).await;
String::from_utf8((&buffer).to_vec()).unwrap()
}
async fn lol() {
for i in 1..5 {
let mut input = get_input("lol asks ").await;
input.shrink_to_fit();
print!("lol {} input = '{}' len = {}", i, input, input.len());
if input.eq("sl\n") {
let mut konsole = Command::new("/usr/bin/konsole");
konsole.arg("-e").arg("sl");
konsole.output().expect("some error happend");
}
}
}
#[tokio::main]
async fn main() {
let h = lol();
futures::join!(h);
}
If I execute this code, I get this:
lol asks
sl
lol 1 input = 'sl
' len = 10lol asks
which means the string has a length of 10.
I solved it with help from people on Discord.
I needed to use the number returned by f.read(…).await.unwrap(); otherwise the string contains trailing zero bytes that were never written by the read.
let i = f.read(&mut buffer).await.unwrap() ;
String::from_utf8((&buffer[..i]).to_vec()).unwrap()
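The same rule applies to the blocking std::io::Read trait, which makes it easy to try out without a runtime. The sketch below is an analogue under that assumption, not the tokio code itself; read_chunk is a made-up name and a byte slice stands in for stdin.

```rust
use std::io::{self, Read};

// Read at most 10 bytes and keep only the part that was actually filled.
fn read_chunk(mut source: impl Read) -> io::Result<String> {
    let mut buffer = [0u8; 10];
    let n = source.read(&mut buffer)?;
    Ok(String::from_utf8_lossy(&buffer[..n]).into_owned())
}

fn main() -> io::Result<()> {
    // A byte slice implements Read, so it can stand in for terminal input.
    let input = read_chunk(&b"sl\n"[..])?;
    println!("len = {}", input.len()); // prints "len = 3"

    // Trimming the line ending makes the comparison independent of "\n" vs "\r\n".
    if input.trim_end() == "sl" {
        println!("matched");
    }
    Ok(())
}
```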

Why does calling Vec::resize before calling Vec::set_len cause the Vec to have data?

I have a problem that I do not understand:
fn cipher_with(key: &[u8], data: &[u8]) -> Vec<u8> {
let data_len = 16;
let mut data = data.to_vec();
data.resize(data_len, 2);
let mut output = Vec::<u8>::with_capacity(data_len);
unsafe { output.set_len(data_len) }
output
}
fn main() {
let key = "blabla".as_bytes();
let data = "lorem ipsum.".as_bytes();
println!("{:?}", cipher_with(&key, &data));
}
This prints:
[108, 111, 114, 101, 109, 32, 105, 112, 115, 117, 109, 46, 0, 0, 0, 0]
But how is it done? I never gave this value to output.
To add some details to Peter's answer, check out this annotated version:
fn cipher_with(key: &[u8], data: &[u8]) -> Vec<u8> {
let data_len = 16;
let mut data = data.to_vec();
println!("{:?}", data.as_ptr());
data.resize(data_len, 2);
println!("{:?}", data.as_ptr());
let mut output = Vec::<u8>::with_capacity(data_len);
println!("{:?}", output.as_ptr());
unsafe { output.set_len(data_len) }
output
}
0x7fa6dba27000
0x7fa6dba1e0c0
0x7fa6dba27000
When the first vector is created, it has a length of 12. When it's resized to 16, a new allocation is made and the data copied. This is likely due to the implementation of the allocator, which usually chunks allocations into buckets. 16 would be a reasonable bucket size.
When the second vector is created, the allocator hands back the same pointer that the first vector just gave up. Since nothing else has changed this memory in the meantime, it still contains whatever data was in data.
You are using unsafe Rust, which can give you unpredictable results.
In this particular case, you are extending the size of the Vec into uninitialized memory. The values are whatever happens to be there already.
So let's look at some of the code:
let mut data = data.to_vec();
This copies the data "lorem ipsum." onto the heap in the form of a vector.
data.resize(data_len, 2); // data_len = 16
This increases the length of the Vec from 12 to 16 bytes, filling the new elements with 2. Based on what we are seeing, it looks like the original allocation had no spare capacity, so the implementation abandoned the first allocated memory range and copied the data to new memory instead of growing in place.
let mut output = Vec::<u8>::with_capacity(data_len);
unsafe { output.set_len(data_len) }
This creates a new vector and unsafely gives it a length. But you didn't initialise it, so the data will be what was there previously.
It looks like data.resize() moved the contents to a new allocation instead of growing the vector in place. When output was allocated, it was given the block of memory that data had just released, which is why it contains "lorem ipsum.".
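For comparison, here is a small sketch of the same steps without unsafe, assuming a zero-filled output buffer is what was wanted. The function name padded_and_output is hypothetical; it returns both vectors so their contents can be inspected.

```rust
// Pads `data` to `data_len` bytes with the value 2 and creates an
// initialized output buffer of the same length.
fn padded_and_output(data: &[u8], data_len: usize) -> (Vec<u8>, Vec<u8>) {
    let mut data = data.to_vec();
    data.resize(data_len, 2);
    // vec! allocates AND initializes: every element is a well-defined 0,
    // no matter what the allocator's memory contained before.
    let output = vec![0u8; data_len];
    (data, output)
}

fn main() {
    let (data, output) = padded_and_output("lorem ipsum.".as_bytes(), 16);
    println!("{:?}", data);
    println!("{:?}", output);
}
```

With this version the output no longer depends on which block the allocator happens to reuse.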

Create array with same length

I'm writing something that's writing images with lodepng-rust, and I'd like to work with pixels instead of u8s. I've created a struct
struct Pixel {r: u8, g: u8, b: u8}
and am building an array of u8s out of the elements of the array. Since I'm making the array by hand right now, I'm in the strange situation of needing to make the other array by hand as well. Is there a way I can create an array that's three times the length of the pixel array at compile time? Something like
let data: [u8; other_array.len()*3];
Which doesn't work because .len() isn't a compile time constant. The runtime cost doesn't really matter to me in this case, but if I could have the sizes related it would feel cleaner (and I might need the performance in the future).
Edit:
The solution I'm using is based on Levans's answer. If you are not initializing your array by hand, just follow Levans's answer. I initialize my first array by hand, but give it a type that uses the length specified in pixel_count, so that a wrong pixel_count is caught at compile time. I create the second array with that constant, and then assert that the lengths have the right ratio. My minimal example looks like this:
struct Pixel {r: u8, g: u8, b: u8}
const pixel_count: usize = 4;
fn main() {
let pixel_data: [Pixel; pixel_count] = [
Pixel {r: 255, g: 0, b: 0},
Pixel {r: 0, g: 255, b: 0},
Pixel {r: 0, g: 0, b: 255},
Pixel {r: 0, g: 99, b: 99},
];
let mut data = [0u8; pixel_count*3];
assert_eq!(pixel_data.len()*3, data.len());
}
The easiest way for you to create size-related arrays would be to store the initial size as a const item, since const items are compile-time constants:
const pixel_count: usize = 42;
#[derive(Clone, Copy)]
struct Pixel { r: u8, g: u8, b: u8 }
fn main() {
let pixels = [Pixel { r: 0, g: 0, b: 0 }; pixel_count];
let raw_data = [0u8; pixel_count * 3];
assert!(pixels.len() * 3 == raw_data.len());
}
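Building on the const approach, the two lengths can also be tied together in a function signature, so a mismatch becomes a type error rather than a failed assertion. This is a sketch in current stable Rust; the names PIXEL_COUNT and flatten are illustrative and not from the question.

```rust
#[derive(Clone, Copy)]
struct Pixel { r: u8, g: u8, b: u8 }

const PIXEL_COUNT: usize = 4;

// The output length is derived from the same constant as the input length.
fn flatten(pixels: &[Pixel; PIXEL_COUNT]) -> [u8; PIXEL_COUNT * 3] {
    let mut data = [0u8; PIXEL_COUNT * 3];
    // Each 3-byte chunk of the output receives one pixel's r, g, b values.
    for (chunk, p) in data.chunks_exact_mut(3).zip(pixels.iter()) {
        chunk.copy_from_slice(&[p.r, p.g, p.b]);
    }
    data
}

fn main() {
    let pixel_data: [Pixel; PIXEL_COUNT] = [
        Pixel { r: 255, g: 0, b: 0 },
        Pixel { r: 0, g: 255, b: 0 },
        Pixel { r: 0, g: 0, b: 255 },
        Pixel { r: 0, g: 99, b: 99 },
    ];
    println!("{:?}", flatten(&pixel_data));
}
```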
