Why does a const disappear during linking when static doesn't? - c

I have a function like this in my static library crate:
use super::*;
static BLANK_VEC: [u8; 16] = [0_u8; 16];
pub fn pad(name: &'static str) -> String {
let mut res = String::from(name);
res.push_str(&String::from_utf8_lossy(&BLANK_VEC[name.len()..]));
res
}
When I link this to C code it works as expected, but if I link the code below instead (the only difference being const instead of static), the label BLANK_VEC doesn't appear in the ELF file. It compiles and runs until it hits a HardFault.
use super::*;
const BLANK_VEC: [u8; 16] = [0_u8; 16];
pub fn pad(name: &'static str) -> String {
let mut res = String::from(name);
res.push_str(&String::from_utf8_lossy(&BLANK_VEC[name.len()..]));
res
}
Is this a bug on the Rust side? I think so because the const variable goes out of scope somehow. I can reference it and it compiles. Where is the ensured memory safety? Why didn't I have to use unsafe block to do that?
If this is something that depends on my linker: I use arm-gcc-none-eabi.
Edit: I understand why this happens but shouldn't Rust ensure that the user uses a variable that won't disappear?

It's not a bug in Rust: const defines a constant value that is inlined at every use site, so it has no single address and no symbol of its own at runtime.
static defines a global variable (which may or may not be mutable) and is therefore present in the final program as one actual location in memory.
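To make the difference concrete, here is a minimal sketch. The names BLANK_STATIC, BLANK_CONST, static_addr and const_tail are made up for illustration and are not from the question. It only relies on the static having one stable address, and deliberately makes no claim about where (or whether) the const is stored.

```rust
// A static is one object in the final program with a fixed address.
static BLANK_STATIC: [u8; 16] = [0_u8; 16];

// A const has no guaranteed storage of its own: the value is substituted
// at each use site, much like writing the literal there.
const BLANK_CONST: [u8; 16] = [0_u8; 16];

/// Address of the static; every call refers to the same memory location.
pub fn static_addr() -> usize {
    BLANK_STATIC.as_ptr() as usize
}

/// Uses the const by value; the compiler is free to materialise a
/// temporary here or to fold the whole expression away.
pub fn const_tail(skip: usize) -> Vec<u8> {
    BLANK_CONST[skip..].to_vec()
}
```

Because the const is only a value, looking for a BLANK_VEC label in the ELF file is not a reliable check in the const case.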


Load file contents into a static array of bytes

I have a static array initialized with some constant value:
static PROG_ROM: [u8; 850] = [0x12, 0x1d, ...];
I would like to instead load at compile-time the contents of a file into it. Sounds like a job for std::include_bytes!, however, I have two problems with it:
The type of include_bytes!("foo.dat") is &[u8; 850], i.e. it is a reference. I need this to be a bona fide static array.
Even if there was an include_bytes_static! macro with type [u8;850], I would have to use it like this:
static PROG_ROM: [u8; 850] = include_bytes_static!("foo.dat");
I.e. I would have to hardcode the length of the file. Instead, I would like to take the length from the length of the file contents.
So the ideal replacement for my code would be a macro to replace the whole definition, i.e. look something like this:
define_included_bytes!(PROG_ROM, "foo.dat")
and it would expand to
static PROG_ROM: [u8; 850] = [0x12, 0x1d, ...];
So how do I do this?
Dereference the result, *include_bytes!(..), to get a [u8; _] instead of a &[u8; _] (this works because arrays of Copy elements are themselves Copy), and use include_bytes!(..).len() (len is a const fn) to specify the length of the array in the type:
static PROG_ROM: [u8; include_bytes!("foo.dat").len()] = *include_bytes!("foo.dat");
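The same pattern can be tried without a data file. In the sketch below a byte-string literal stands in for include_bytes!("foo.dat"), since both have type &'static [u8; N]. The names RAW and prog_rom and the four bytes are illustrative assumptions only.

```rust
// Stand-in for `include_bytes!("foo.dat")`: also a `&'static [u8; N]`.
const RAW: &[u8; 4] = b"\x12\x1d\x00\xff";

// The length comes from the data itself, and `*RAW` copies the
// referenced array into the static at compile time.
static PROG_ROM: [u8; RAW.len()] = *RAW;

/// Borrow the static as a slice.
pub fn prog_rom() -> &'static [u8] {
    &PROG_ROM
}
```

With a real file, RAW would be replaced by the include_bytes! call, as in the one-liner above.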
As Chayim Friedman pointed out you can easily define that proc macro yourself:
#[proc_macro]
pub fn define_included_bytes(token_stream: TokenStream) -> TokenStream {
let [ident, _comma, file] = &token_stream.into_iter().collect::<Vec<_>>()[..] else {
panic!("expected invocation: `define_included_bytes!(IDENTIFIER, \"file_name\");`");
};
let file = file.to_string().trim_matches('\"').to_string();
let data: Vec<u8> = std::fs::read(&file).expect(&format!("File {:?} could not be read", file));
// Emit a `static`, matching the expansion asked for in the question.
format!("static {ident}: [u8; {}] = {:?};", data.len(), data).parse().unwrap()
}
Obviously this is just a hacked-together proof of concept, and you should thoroughly check the tokens instead of just assuming they're correct.
Based on cafce25's answer, I ended up writing the following version using syn and quote:
extern crate proc_macro;
extern crate syn;
extern crate quote;
use proc_macro::TokenStream;
use syn::parse::{Parse, ParseStream, Result};
use syn::{parse_macro_input, Ident, Token, LitStr};
use quote::quote;
struct StaticInclude {
name: Ident,
filepath: String,
}
impl Parse for StaticInclude {
fn parse(input: ParseStream) -> Result<Self> {
let name: Ident = input.parse()?;
input.parse::<Token![=]>()?;
let filepath: String = input.parse::<LitStr>()?.value();
Ok(StaticInclude{ name, filepath })
}
}
#[proc_macro]
pub fn progmem_include_bytes(tokens: TokenStream) -> TokenStream {
let StaticInclude{ name, filepath } = parse_macro_input!(tokens as StaticInclude);
let data: Vec<u8> = std::fs::read(&filepath).expect(&format!("File {:?} could not be read", filepath));
let len = data.len();
TokenStream::from(quote! {
#[link_section = ".progmem.data"]
static #name: [u8; #len] = [#(#data),*];
})
}
(never mind the link_section attribute, that's an AVR-ism).
This works well, except Cargo is not tracking the dependency on the external file, so if its content changes, the program using progmem_include_bytes! is not recompiled by cargo build.

Creating relative address references - Rust

In C code we can use the following snippet to create a pointer to a relative address in memory.
int *value = (int*)0x0061FF0C;
Since I haven't found a C-like way, I used inline assembly to get the value, but it's just the value, not the reference.
unsafe fn read_from_adress(adress:u32) -> i32 {
let mut data = 0;
asm!{
"mov {}, [{}]",
out(reg) data,
in(reg) adress,
};
return data;
}
let var: i32 = read_from_adress(0x0061FF0C);
In Rust I can't find an equivalent that is as simple as in C.
I tried this snippet, to no avail.
let value = &mut 0x0061FF0C;
Is there an equivalent to the C way?
Note: the relative address values are obtained via DLL injection, in both C and Rust.
The Rust equivalent of your C code is:
let value = unsafe { *(0x0061FF0C as *const i32) };
However, as noted in the comments, the address is not relative but absolute.
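As a hedged sketch of the same idea wrapped in a function: the cast alone yields the pointer (the counterpart of the C cast), and only the read needs unsafe. The helper name read_i32_at is hypothetical, and the caller is assumed to guarantee that the address is valid, aligned and readable.

```rust
/// Reads an `i32` from an absolute address.
///
/// # Safety
/// `address` must be non-null, aligned for `i32`, and point to readable memory.
pub unsafe fn read_i32_at(address: usize) -> i32 {
    // Creating the pointer is safe; this mirrors `(int*)address` in C.
    let ptr = address as *const i32;
    // Dereferencing is the unsafe part. A volatile read keeps the compiler
    // from caching or eliding the access to memory it knows nothing about.
    unsafe { ptr.read_volatile() }
}
```

For a quick check that does not depend on a particular process layout, the address of a local variable can be passed in, e.g. &local as *const i32 as usize.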

What are the solutions for wrapping a C-style `static const` global constant in Rust using Bindgen?

I am creating Rust bindings for a C library that defines lists of standard constant default values:
// C
typedef struct RMW_PUBLIC_TYPE rmw_qos_profile_t
{
size_t depth;
enum rmw_qos_reliability_policy_t reliability;
// ...
} rmw_qos_profile_t;
enum RMW_PUBLIC_TYPE rmw_qos_reliability_policy_t
{
RMW_QOS_POLICY_RELIABILITY_RELIABLE,
RMW_QOS_POLICY_RELIABILITY_BEST_EFFORT,
// ...
};
// Global value that needs wrapping
static const rmw_qos_profile_t rmw_qos_profile_sensor_data =
{
5,
RMW_QOS_POLICY_RELIABILITY_BEST_EFFORT,
// ...
};
Using Bindgen, static Rust variables are generated:
// Rust
extern "C" {
#[link_name = "\u{1}rmw_qos_profile_sensor_data"]
pub static rmw_qos_profile_sensor_data: rmw_qos_profile_t;
}
However, extern static variables are highly inconvenient to work with in Rust, because every access has to be wrapped in an unsafe {} block, even when you do not need mutability.
I already wrapped the struct and enums in Rust:
// Rust
pub enum QoSReliabilityPolicy {
Reliable = 0,
BestEffort = 1,
}
impl From<rmw_qos_reliability_policy_t> for QoSReliabilityPolicy {
fn from(raw: rmw_qos_reliability_policy_t) -> Self {
match raw {
rmw_qos_reliability_policy_t::RMW_QOS_POLICY_RELIABILITY_RELIABLE => QoSReliabilityPolicy::Reliable,
rmw_qos_reliability_policy_t::RMW_QOS_POLICY_RELIABILITY_BEST_EFFORT => QoSReliabilityPolicy::BestEffort,
}
}
}
pub struct QoSProfile {
pub depth: usize,
pub reliability: QoSReliabilityPolicy,
// ...
}
impl From<rmw_qos_profile_t> for QoSProfile {
fn from(qos_profile: rmw_qos_profile_t) -> Self {
QoSProfile {
depth: qos_profile.depth,
reliability: qos_profile.reliability.into(),
// ...
}
}
}
Now, I am looking for a solution to expose the same pre-defined profiles, such as rmw_qos_profile_sensor_data, to my Rust users without having to duplicate the C values manually in Rust.
Currently I am duplicating the C code in Rust:
// Rust
// Works but unsatisfying
pub const QOS_PROFILE_SENSOR_DATA: QoSProfile = QoSProfile {
depth: 5,
reliability: QoSReliabilityPolicy::BestEffort,
// ...
};
But this is not satisfying. When the upstream C library updates these values, users will experience inconsistent behaviour and bugs.
What are the possible solutions for conveniently wrapping these global constants?
The ideal solution would:
Automatically update the values when the upstream C library changed
Expose global consts so that these values can be inlined by the compiler
If not possible, expose global immutable variables
If still not possible, at least not require unsafe
The problem I have been facing is that, since static const C structures are stored in memory, they can't be translated into a const so easily, and this is probably why Bindgen translates them using the static keyword.
So, the possibilities that I can imagine, but don't know how to execute, are:
Have smarter parsing of the C code to generate Rust code?
Use some form of macro?
Initialize from the C lib's static memory in the prelude?
Initialize from the C lib's static memory explicitly?
Other solutions?

Extracting an archive with progress bar - mutable borrow error

I am trying to extract a .tar.bz file (or .tar.whatever actually) and also be able to have a xx% progress report. So far I have this:
pub fn extract_file_with_progress<P: AsRef<Path>>(&self, path: P) -> Result<()> {
let path = path.as_ref();
let size = fs::metadata(path)?;
let mut f = File::open(path)?;
let decoder = BzDecoder::new(&f);
let mut archive = Archive::new(decoder);
for entry in archive.entries()? {
entry?.unpack_in(".")?;
let pos = f.seek(SeekFrom::Current(0))?;
}
Ok(())
}
The idea is to use pos/size to get the percentage, but compiling the above function gets me the error cannot borrow f as mutable because it is also borrowed as immutable.
I understand what the error means, but I don't really use f as mutable; I only use the seek function to get the current position.
Is there a way to work-around this, either by forcing the compiler to ignore the mutable borrow or by getting the position in some immutable way?
Files are a bit special. The usual read() and seek() and write() methods (defined on the Read, Seek and Write traits) take self by mutable reference:
fn read(&mut self, buf: &mut [u8]) -> Result<usize>
fn seek(&mut self, pos: SeekFrom) -> Result<u64>
fn write(&mut self, buf: &[u8]) -> Result<usize>
However, all mentioned traits are also implemented for &File, i.e. for immutable references to a file:
impl<'a> Read for &'a File
impl<'a> Seek for &'a File
impl<'a> Write for &'a File
So you can modify a file even if you only have a read-only reference to the file. For these implementations, the Self type is &File, so accepting self by mutable reference in fact means accepting a &mut &File, a mutable reference to a reference to a file.
Your code passes &f to BzDecoder::new(), creating an immutable borrow. Later you call f.seek(SeekFrom::Current(0)), which passes f to seek by mutable reference. However, this is not allowed, since you already have an immutable borrow of the file. The solution is to use the Seek implementation on &File instead:
(&mut &f).seek(SeekFrom::Current(0))
or slightly simpler
(&f).seek(SeekFrom::Current(0))
This only creates a second immutable borrow, which is allowed by Rust's rules for references.
I created a playground example demonstrating that this works. If you replace (&f) with f you get the error you originally got.
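A self-contained sketch of the same trick, without the archive crates: a plain shared borrow stands in for the one held by BzDecoder::new(&f). The function name and the file contents are illustrative assumptions.

```rust
use std::fs::File;
use std::io::{self, Read, Seek, SeekFrom};
use std::path::Path;

/// Writes a small file, reads 5 bytes through a shared borrow, then asks
/// for the current position through a second shared borrow.
pub fn position_via_shared_borrow(path: &Path) -> io::Result<u64> {
    std::fs::write(path, b"hello world")?;
    let f = File::open(path)?; // note: `f` is not declared `mut`

    // Shared borrow that stays alive, like the one inside the decoder.
    let mut reader = &f;
    let mut buf = [0u8; 5];
    reader.read_exact(&mut buf)?;

    // `Seek` is implemented for `&File`, so this only needs `&f`.
    let pos = (&f).seek(SeekFrom::Current(0))?;

    // `reader` is still usable here, so both borrows coexist.
    reader.read_exact(&mut buf)?;
    Ok(pos)
}
```

Dividing the returned position by the file length from fs::metadata(path)?.len() would then give the progress fraction the question is after.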

thread '<main>' has overflowed its stack when creating a large array

Accessing the static variable A_INTERSECTS_A in the following code triggers the error in the title.
This piece of code should return a big 1356x1356 2D array of bool.
use lazy_static::lazy_static; // 1.2.0
#[derive(Debug, Copy, Clone, Default)]
pub struct A {
pub field_a: [B; 2],
pub ordinal: i32,
}
#[derive(Debug, Copy, Clone, Default)]
pub struct B {
pub ordinal: i32,
}
pub const A_COUNT: i32 = 1356;
lazy_static! {
pub static ref A_VALUES: [A; A_COUNT as usize] = { [A::default(); A_COUNT as usize] };
pub static ref A_INTERSECTS_A: [[bool; A_COUNT as usize]; A_COUNT as usize] = {
let mut result = [[false; A_COUNT as usize]; A_COUNT as usize];
for item_one in A_VALUES.iter() {
for item_two in A_VALUES.iter() {
if item_one.field_a[0].ordinal == item_two.field_a[0].ordinal
|| item_one.field_a[0].ordinal == item_two.field_a[1].ordinal
|| item_one.field_a[1].ordinal == item_two.field_a[0].ordinal
|| item_one.field_a[1].ordinal == item_two.field_a[1].ordinal
{
result[item_one.ordinal as usize][item_two.ordinal as usize] = true;
}
}
}
result
};
}
fn main() {
A_INTERSECTS_A[1][1];
}
I've seen people deal with this by implementing Drop for structs in a large list, but there aren't any structs in my list and you can't implement it for bool.
If I change A_INTERSECTS_A: [[bool; A_COUNT as usize]; A_COUNT as usize] to A_INTERSECTS_A: Box<Vec<Vec<bool>>> the code works fine, but I really would like to use an array here.
The problem here is almost certainly the huge result array that is placed on the stack when the initialisation code of A_INTERSECTS_A runs. It is 1356² ≈ 1.8 MB, which is of a similar order of magnitude to the size of the stack. In fact, it is larger than Windows' default size of 1 MB (and I suspect you are on Windows, given you've got that error message).
The solution here is to reduce stack usage by moving the array to the heap, for instance by using Vec instead (as you indicate works) or by using a Box. This has the added benefit that the initialisation code doesn't have to do a 2 MB copy from the stack to A_INTERSECTS_A's memory (it only needs to copy some pointers around).
A direct translation to using a Box:
pub static ref A_INTERSECTS_A: Box<[[bool; A_COUNT as usize]; A_COUNT as usize]> = {
let mut result = Box::new([[false; A_COUNT as usize]; A_COUNT as usize]);
// ...
}
unfortunately doesn't work: Box::new is a normal function call, and hence its argument is placed directly onto the stack.
However, if you're using a nightly compiler and are willing to use unstable features, you can use "placement box", which is literally designed for this purpose: it allocates space on the heap and constructs the value straight into that memory, avoiding intermediate copies, and avoiding the need to have the data on the stack. This simply requires replacing Box::new with box:
let mut result = box [[false; A_COUNT as usize]; A_COUNT as usize];
If you (very sensibly) prefer to stick to stable releases, an alternative until that stabilises is to just replace the outer layer of the arrays with a Vec: this retains all the data locality benefits of the arrays (everything is laid out contiguously in memory), although it is slightly weaker in terms of static knowledge (the compiler can't be sure that the length is 1356). Since [_; A_COUNT] doesn't implement Clone, this cannot use the vec! macro and hence (unfortunately) looks like:
pub static ref A_INTERSECTS_A: Vec<[bool; A_COUNT as usize]> = {
let mut result =
(0..A_COUNT as usize)
.map(|_| [false; A_COUNT as usize])
.collect::<Vec<_>>();
// ...
}
If you absolutely need all the arrays, one could do some unsafe magic to extract this down to the original Box<[[bool; ...]; ...]> from the Vec. It requires two steps (via into_boxed_slice), because a Box<T> needs to have an allocation sized perfectly for T, while a Vec may overallocate in order to achieve its O(1) amortization. This version would look like:
pub static ref A_INTERSECTS_A: Box<[[bool; A_COUNT as usize]; A_COUNT as usize]> = {
let mut result =
(0..A_COUNT as usize)
.map(|_| [false; A_COUNT as usize])
.collect::<Vec<_>>();
// ...
// ensure the allocation is correctly sized
let mut slice: Box<[[bool; A_COUNT as usize]]> = result.into_boxed_slice();
// pointer to the start of the slices in memory
let ptr: *mut [bool; A_COUNT as usize] = slice.as_mut_ptr();
// stop `slice`'s destructor deallocating the memory
mem::forget(slice);
// `ptr` is actually a pointer to exactly A_COUNT of the arrays!
let new_ptr = ptr as *mut [[bool; A_COUNT as usize]; A_COUNT as usize];
unsafe {
// let this `Box` manage that memory
Box::from_raw(new_ptr)
}
}
I've added in some explicit types so that what's going on is a little more clear. This works because Vec<T> exposes into_boxed_slice, and hence we can munge that Box<[T]> (i.e. dynamic length) into a Box<[T; len]> given we know the exact length at compile time.
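On reasonably recent stable compilers the last step can likely be done without unsafe, because the standard library implements TryFrom<Box<[T]>> for Box<[T; N]>. The sketch below assumes such a compiler. N and heap_grid are illustrative names, and the rows are built with the same map/collect approach as above, so the large array never lives on the stack.

```rust
use std::convert::TryInto; // already in the prelude from the 2021 edition on

const N: usize = 1356;

/// Builds an N x N grid of `false` directly on the heap.
pub fn heap_grid() -> Box<[[bool; N]; N]> {
    // Each row is only N bytes, so the temporaries stay small.
    let rows: Vec<[bool; N]> = (0..N).map(|_| [false; N]).collect();

    // Shrink to an exactly sized allocation, as in the unsafe version.
    let boxed: Box<[[bool; N]]> = rows.into_boxed_slice();

    // The conversion checks the length at run time and reuses the allocation.
    match boxed.try_into() {
        Ok(grid) => grid,
        Err(_) => unreachable!("the slice has exactly N rows"),
    }
}
```

The conversion fails only if the slice length differs from N, which cannot happen here since exactly N rows are collected.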
