I have an odd case where I want to initialize some segments of an array as copies of an existing array, and call a function to initialize the other elements. Naively, I'd like to do something like this:
fn build_array(input: [char; 8]) -> [char; 25] {
let mut out: [char; 25];
out[6..10].copy_from_slice(input[0..4]);
out[16..20].copy_from_slice(input[4..8]);
for i in 0..25 {
if (6..10).contains(i) || (16..20).contains(i) {
continue;
}
out[i] = some_func();
}
}
Obviously I could just initialize the array, but that would be inefficient. I was surprised to find that wrapping the copy_from_slice() calls in an unsafe block does not make this compile. Creating multiple array segments and concatenating them doesn't seem to simplify things based on this question.
Does anyone know of an idiomatic and efficient way to accomplish what I want to do here?
Edit: some_func() here is meant as a placeholder, the elements that aren't provided in input don't all have the same value.
First, do not worry about the cost of initializing the elements, especially since the optimizer may eliminate it.
If you really need that, e.g. for a very big array, the Rust way is to use MaybeUninit:
use std::mem::{self, MaybeUninit};
use std::ptr;
fn build_array(input: [char; 8]) -> [char; 25] {
// SAFETY: `MaybeUninit` is always considered initialized (replace with
// `MaybeUninit::uninit_array()` once stabilized).
let mut out: [MaybeUninit<char>; 25] = unsafe { MaybeUninit::uninit().assume_init() };
// SAFETY: source and destination derived from references, slices are of
// the correct length (replace with `MaybeUninit::write_slice()` once stabilized).
unsafe {
ptr::copy_nonoverlapping(
input[0..4].as_ptr(),
out[6..10].as_mut_ptr().cast::<char>(),
4,
);
ptr::copy_nonoverlapping(
input[4..8].as_ptr(),
out[16..20].as_mut_ptr().cast::<char>(),
4,
);
}
for i in 0..25 {
if (6..10).contains(&i) || (16..20).contains(&i) {
continue;
}
out[i].write(some_func());
}
// SAFETY: `MaybeUninit<T>` has the same layout as `T`, initialized above
// (replace with `MaybeUninit::array_assume_init()` once stabilized).
unsafe { mem::transmute(out) }
}
As you can see, this involves non-trivial unsafe code, so I highly recommend not doing this unless it is really necessary and you know exactly what you're doing.
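For comparison, here is a minimal sketch of the safe variant suggested at the start of this answer: fill the array with a placeholder first, then overwrite it. The index-taking `some_func` is a hypothetical stand-in for the question's placeholder function, and `'\0'` is an arbitrary initial value.

```rust
// Hypothetical stand-in for the question's `some_func`; it takes the index
// so that the generated elements differ from each other.
fn some_func(i: usize) -> char {
    (b'a' + (i % 26) as u8) as char
}

fn build_array(input: [char; 8]) -> [char; 25] {
    // Cheap placeholder initialization; the optimizer may remove it entirely.
    let mut out = ['\0'; 25];
    out[6..10].copy_from_slice(&input[0..4]);
    out[16..20].copy_from_slice(&input[4..8]);
    for i in 0..25 {
        // Skip the positions that were copied from `input`.
        if (6..10).contains(&i) || (16..20).contains(&i) {
            continue;
        }
        out[i] = some_func(i);
    }
    out
}
```

Unlike the question's version, the slices passed to copy_from_slice are borrowed (`&input[0..4]`) and contains takes a reference (`&i`).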
I’m trying to initialize a fixed-size array of some nullable, non-copyable type, like an Option<Box<Thing>> for some kind of Thing. I’d like to pack two of them into a struct without any extra indirection. I’d like to write something like this:
let array: [Option<Box<Thing>>; SIZE] = [None; SIZE];
But it doesn’t work because the [e; n] syntax requires that e implements Copy. Of course, I could expand it into SIZE Nones, but that can be unwieldy when SIZE is large. I don’t believe this can be done with a macro without an unnatural encoding of SIZE. Is there a good way to do it?
Yes, this is easy with unsafe; is there a way to do it without unsafe?
As of Rust 1.38 (released in September 2019), a cleaner alternative to previously posted answers is possible using an intermediate const initializer. This approach works for arrays of any size:
const SIZE: usize = 100;
const INIT: Option<Box<Thing>> = None;
let array: [Option<Box<Thing>>; SIZE] = [INIT; SIZE];
(It works with or without the Box; the example uses Box because it was used in the question.)
One limitation is that the array item must have a default representation that can be evaluated at compile time - a constant, enum variant, or a primitive container composed of those. None or a tuple of numbers will work, but a non-empty Vec or String won't.
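As a small illustration of that limitation, the sketch below uses an empty Vec, which can be built at compile time because Vec::new is a const fn. The names `EMPTY` and `make_buckets` are made up for this example.

```rust
const SIZE: usize = 4;

// An empty Vec is a valid constant; a non-empty one would need a heap
// allocation and therefore could not be written here.
const EMPTY: Vec<u32> = Vec::new();

fn make_buckets() -> [Vec<u32>; SIZE] {
    // Each element is a fresh evaluation of the constant, so no `Copy`
    // or `Clone` bound is involved.
    [EMPTY; SIZE]
}
```

Each slot is an independent Vec, so pushing into one bucket leaves the others empty.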
You could use the Default trait to initialize the array with default values:
let array: [Option<Box<Thing>>; SIZE] = Default::default();
See this playground for a working example.
Note that this will only work for arrays with up to 32 elements, because Default::default is only implemented for up to [T; 32]. See https://doc.rust-lang.org/std/default/trait.Default.html#impl-Default-for-%5BT%3B%2032%5D.
As of Rust 1.55.0 (which stabilized the array map() method, [T; N]::map()), the following will work:
const SIZE: usize = 100;
#[derive(Debug)]
struct Thing { data: i64 }
let array = [(); SIZE].map(|_| Option::<Thing>::default());
for x in array {
println!("x: {:?}", x);
}
I'm copying the answer by chris-morgan and adapting it to match the question better, to follow the recommendation by dbaupp downthread, and to match recent syntax changes:
use std::mem;
use std::ptr;
#[derive(Debug)]
struct Thing {
number: usize,
}
macro_rules! make_array {
($n:expr, $constructor:expr) => {{
let mut items: [_; $n] = mem::uninitialized();
for (i, place) in items.iter_mut().enumerate() {
ptr::write(place, $constructor(i));
}
items
}}
}
const SIZE: usize = 50;
fn main() {
let items = unsafe { make_array!(SIZE, |i| Box::new(Some(Thing { number: i }))) };
println!("{:?}", &items[..]);
}
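Note the need to use unsafe here: if the constructor function panics partway through, the elements that are still uninitialized get dropped, which is undefined behavior. Also note that mem::uninitialized has been deprecated since Rust 1.39 in favor of MaybeUninit.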
Go through the heap
If you can create a Vec of your type, you can convert it into an array:
use std::convert::TryInto;
#[derive(Clone)]
struct Thing;
const SIZE: usize = 100;
fn main() {
let v: Vec<Option<Thing>> = vec![None; SIZE];
let v: Box<[Option<Thing>; SIZE]> = match v.into_boxed_slice().try_into() {
Ok(v) => v,
Err(_) => unreachable!(),
};
let v: [Option<Thing>; SIZE] = *v;
}
In many cases, you actually want to leave it as a Vec<T>, Box<[T]>, or Box<[T; N]> as these types all put the data in the heap. Large arrays tend to be... large... and you don't want all that data on the stack.
See also:
What is the use of into_boxed_slice() methods?
How to get a slice as an array in Rust?
How do I get an owned value out of a `Box`?
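If the data is going to stay on the heap anyway, a boxed slice can also be collected directly from an iterator. This is only a sketch; `boxed_nones` is a hypothetical helper name, and repeat_with is used so that the element type needs no Clone implementation.

```rust
struct Thing;

// Builds `size` empty slots directly on the heap.
fn boxed_nones(size: usize) -> Box<[Option<Box<Thing>>]> {
    // `repeat_with` calls the closure once per element, so `Thing` does not
    // have to implement `Clone` (unlike `vec![None; size]`).
    std::iter::repeat_with(|| None).take(size).collect()
}
```

The length is a runtime value here, so this also covers sizes that are not known at compile time.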
Keep it simple
Type out all the values:
struct Thing;
const SIZE: usize = 5;
fn main() {
let array: [Option<Box<Thing>>; SIZE] = [None, None, None, None, None];
}
You could use a build script to generate this code for you. For an example of this, see:
How to create a static string at compile time
An alternative approach using the arrayvec crate that generalizes easily to situations other than initializing everything with a fixed value:
use arrayvec::ArrayVec;
let array = std::iter::repeat_with(|| None) // unlike repeat(None), this needs no Clone bound
.take(SIZE)
.collect::<ArrayVec<Option<Box<Thing>>, SIZE>>()
.into_inner()
.unwrap();
With inline const (nightly-only when this answer was written, stable since Rust 1.79), you can write a variant of the answer by @user4815162342 that doesn't require you to declare a separate constant and repeat the type:
#![feature(inline_const)] // only needed on nightly toolchains older than Rust 1.79
let array: [Option<Box<Thing>>; SIZE] = [const { None }; SIZE];
On toolchains without inline const, you can also use the inline-const crate, but this does require you to repeat the type.
let stackoverflow: [Option<&mut ()>;0xDEADBEEF] = std::array::from_fn(|_| None);
dbg!(stackoverflow);
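The snippet above relies on std::array::from_fn (stable since Rust 1.63), which calls the closure once per index and needs neither Copy nor Default. A sketch with a realistic length might look like this; `Thing`, `SIZE` and `make_things` are illustrative names only.

```rust
struct Thing {
    number: usize,
}

const SIZE: usize = 50;

fn make_things() -> [Option<Box<Thing>>; SIZE] {
    // The closure receives the index of the element being built.
    std::array::from_fn(|i| Some(Box::new(Thing { number: i })))
}
```

Because the closure gets the index, this also covers cases where the elements differ from each other.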
I'm trying to implement a method to send an array of u32 (eventually an array of arrays of usize, if possible), since you can't just declare a public array field on a wasm_bindgen type. However, using the example outlined in the wasm_bindgen PR 1749, I can't seem to convert arrays or slices to a js_sys::Array; it only works for Vecs. My question is: why? See below:
pub fn test() -> js_sys::Array {
let arr: [u32; 5] = [0,1,2,3,4];
let slice = &arr[0..2];
let vec: Vec<u32> = vec![0,1,2];
arr.into_iter().map(JsValue::from).collect() // This doesn't work
slice.into_iter().map(JsValue::from).collect() // Also doesn't work
vec.into_iter().map(JsValue::from).collect() // Works as expected!
}
The specific error is: the trait 'wasm_bindgen::cast::JsCast' is not implemented for 'u32'
The array and slice examples don't seem to work for any number type, ints or floats. My only thought is that the implementation in PR 1749 seems to expect a reference, and since arrays are allocated on the stack, FromIterator might not be valid for items in an array?
Is there some other way to achieve what I'm trying to do with the array (passing across the boundary to JS through wasm_bindgen), or if not, why? I'd be very interested to know.
Although Rust arrays and slices have an into_iter method, calling it on a slice (or, before the 2021 edition, on an array) returns the same Iterator as the iter method does, which iterates over references to the values instead of the values themselves. Yes, this is confusing. Since JsValue::from is implemented for u32 but not for &u32, you can take your Iterator<Item = &u32> and convert it to an Iterator<Item = u32> using the copied method. Fixed working examples:
use wasm_bindgen::JsValue;
use js_sys::Array;
fn array_to_js_array(array: [u32; 5]) -> Array {
array.iter().copied().map(JsValue::from).collect()
}
fn slice_to_js_array(slice: &[u32]) -> Array {
slice.iter().copied().map(JsValue::from).collect()
}
fn vec_to_js_array(vec: Vec<u32>) -> Array {
vec.into_iter().map(JsValue::from).collect()
}
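The same reference-versus-value distinction can be reproduced without wasm_bindgen. In the sketch below, u64::from plays the role of JsValue::from (it accepts u32 but not &u32); the function names are made up for the illustration.

```rust
fn widen_slice(slice: &[u32]) -> Vec<u64> {
    // `iter()` yields `&u32`; `copied()` turns each item into a plain `u32`.
    slice.iter().copied().map(u64::from).collect()
}

fn widen_array(array: [u32; 3]) -> Vec<u64> {
    // Calling `IntoIterator::into_iter` explicitly on an array taken by value
    // yields `u32` items in every edition (available since Rust 1.53).
    IntoIterator::into_iter(array).map(u64::from).collect()
}
```

Dropping the `copied()` call in `widen_slice` gives a trait-bound error of the same kind as the one in the question, because there is no `From<&u32>` implementation for u64.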
I am writing a simple tokenizer that will take an input from its new function (which will hopefully be removed in favor of a REPL) and spit back out the tokens associated with the CSS syntax.
Here is an example:
#[derive(Debug)]
pub enum Token {
Selector,
LBrace,
RBrace,
Property,
Value,
}
#[derive(Debug)]
pub struct Tokenizer {
source: String,
tokens: Vec<(Token, String)>,
}
impl Tokenizer {
pub fn new(source: &str) -> Self {
let source = source.to_string();
for (i, c) in source.chars().enumerate() {
if c == '.' {
// This is where I am stuck.
}
}
Self {
source,
tokens: vec![],
}
}
}
fn main() {
let tokens = Tokenizer::new(".example{}");
println!("{:#?}", tokens);
}
Now my issue is that I want to iterate over the next characters until I meet one of '#', a space, or '.', but I have no idea how Rust allows me to keep iterating until a condition is met. Is there a way to get the next character in the sequence?
Also, if you find anything wrong with this code in terms of the direction I am taking, please let me know. Thank you.
Here is a link to the rust playground.
Since this is a very broad question, I would refer to servo's css-parser.
Some functions that may come in handy:
skip_while can skip elements in the beginning satisfying a certain predicate
peekable creates an iterator able to look ahead
Instead of working on iterators, you could also work on slices (&str). This may be easier because a (plain old) iterator basically gives you access to one element at a time, whereas a slice allows you to go back and forth as you wish (within reasonable bounds, of course).
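As a rough sketch of the peekable approach, the helper below consumes characters only while a predicate holds and leaves the terminating character in the iterator. The function names are hypothetical, and treating '{' as a terminator (in addition to the '#', '.' and space from the question) is an assumption made so that the question's sample input works.

```rust
use std::iter::Peekable;
use std::str::Chars;

/// Collects characters while `keep` returns true. The first rejected
/// character is only peeked at, so it stays available to the caller.
fn take_while_peek(chars: &mut Peekable<Chars<'_>>, keep: impl Fn(char) -> bool) -> String {
    let mut out = String::new();
    while let Some(&c) = chars.peek() {
        if !keep(c) {
            break;
        }
        out.push(c);
        chars.next();
    }
    out
}

/// Reads the name of a class selector such as ".example" from the start of
/// `source`; returns None if the input does not start with '.'.
fn read_class_selector(source: &str) -> Option<String> {
    let mut chars = source.chars().peekable();
    if chars.next()? != '.' {
        return None;
    }
    Some(take_while_peek(&mut chars, |c| !matches!(c, '#' | '.' | ' ' | '{')))
}
```

In a full tokenizer the same Peekable would be kept across calls, so the character that ended the selector (for example '{') can be turned into the next token.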
I would like to use a static or const array, but initialize it using something other than the [T; N] syntax. I need to define specific elements but all other values can default to 0 or some other value.
In C, you can do the following:
byte ARRAY[256] = {
[0x1F] = (1 << 4),
// Or even simply just this
[0x46] '\n'
};
I've tried something along the lines of:
static ARRAY: [u8; 256] = {
// x is some arbitrary number of elements
let mut array = [0, x];
array[i] = 'b',
array[j] = 'a',
array[k] = 'd',
array
};
This was merely trial and error based on syntax I know to work for local array declarations. This throws a compiler error saying that blocks in const and static are limited to items and tail expressions. I know that if I enclose an array definition in brackets, then the last line or last expression must be the implicit return.
Additionally, I don't have access to the std library, but I don't think a complex structure would be necessary for something this simple - to index and obtain a value.
I've looked at the Rust macro rules and think that could be a solution, but all the examples I have seen are iterative and incremental.
There is no Rust equivalent to your C snippet. The documentation shows only 3 simple patterns are allowed:
empty
value, value, value, etc...
value; size
So, currently with array syntax, you can't do it.
Const evaluation has since been extended (through the RFCs on const functions), so this is now allowed:
static ARRAY: [u8; 256] = {
let mut array = [0; 256];
array[0] = b'b';
array[1] = b'a';
array[2] = b'd';
array
};
Now, let's take a look at the declarative macro solution. There is no way to "count" in a declarative macro; there are some tricks, but they will not get you very far. A proc macro could work.
You could also generate the file with other tools before compiling. For example, you could use a Cargo build script to generate the file before compiling.
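The block-initializer approach shown above can also be moved into a const fn, which allows loops as well (while loops in const fn have been stable since Rust 1.46). This is only a sketch: `build_table` and the digit-marking scheme are invented for the example, and nothing here needs the std library.

```rust
// Builds the lookup table at compile time; entries default to 0.
const fn build_table() -> [u8; 256] {
    let mut table = [0u8; 256];
    // Designated entries, as in the C snippet from the question.
    table[0x1F] = 1 << 4;
    table[0x46] = b'\n';
    // `for` is not available in const fn, but `while` is.
    let mut c = b'0';
    while c <= b'9' {
        // Mark ASCII digits with the high bit plus their numeric value.
        table[c as usize] = 0x80 | (c - b'0');
        c += 1;
    }
    table
}

static ARRAY: [u8; 256] = build_table();
```

Because the function is evaluated while compiling the static, the finished table ends up in the binary and no initialization code runs at startup.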
I want to create an array. I don't need the array to be mutable, and at the time of creation, I have all the information I need to calculate the i-th member of the array. However, I can't figure out how to create an immutable array in Rust.
Here's what I have now:
let mut my_array: [f32; 4] = [0.0; 4];
for i in 0..4 {
// some calculation, doesn't matter what exactly
my_array[i] = some_function(i);
}
And here's what I want:
let my_array: [f32; 4] = array_factory!(4, some_function);
How can I achieve that in Rust?
Here's the macro definition with sample usage:
macro_rules! array_factory(
($size: expr, $factory: expr) => ({
unsafe fn get_item_ptr<T>(slice: *mut [T], index: usize) -> *mut T {
(slice as *mut T).offset(index as isize)
}
let mut arr = ::std::mem::MaybeUninit::<[_; $size]>::uninit();
unsafe {
for i in 0..$size {
::std::ptr::write(get_item_ptr(arr.as_mut_ptr(), i), $factory(i));
}
arr.assume_init()
}
});
);
fn some_function(i: usize) -> f32 {
i as f32 * 3.125
}
fn main() {
let my_array: [f32; 4] = array_factory!(4, some_function);
println!("{} {} {} {}", my_array[0], my_array[1], my_array[2], my_array[3]);
}
The macro's body is essentially your first version, but with a few changes:
The type annotation on the array variable is omitted, because it can be inferred.
The array is created uninitialized, because we're going to overwrite all values immediately anyway. Messing with uninitialized memory is unsafe, so we must operate on it from within an unsafe block. Here, we're using MaybeUninit, which was introduced in Rust 1.36 to replace mem::uninitialized (see footnote 1).
Items are assigned using std::ptr::write() due to the fact that the array is uninitialized. Assignment would try to drop an uninitialized value in the array; the effects depend on the array item type (for types that implement Copy, like f32, it has no effect; for other types, it could crash).
The macro body is a block expression (i.e. it's wrapped in braces), and that block ends with an expression that is not followed by a semicolon, arr.assume_init(). The result of that block expression is therefore arr.assume_init().
Instead of using unsafe features, we can make a safe version of this macro; however, it requires that the array item type implements the Default trait (and, because of the [value; N] repeat expression, Copy as well). Note that we must use normal assignment here, to ensure that the default values in the array are properly dropped.
macro_rules! array_factory(
    ($size: expr, $factory: expr) => ({
        // Start from default values, then overwrite each element.
        let mut arr = [::std::default::Default::default(); $size];
        for i in 0..$size {
            arr[i] = $factory(i);
        }
        arr
    });
);
1 And for a good reason. The previous version of this answer, which used mem::uninitialized, was not memory-safe: if a panic occurred while initializing the array (because the factory function panicked), and the array's item type had a destructor, the compiler would insert code to call the destructor on every item in the array; even the items that were not initialized yet! MaybeUninit avoids this problem because it wraps the value being initialized in ManuallyDrop, which is a magic type in Rust that prevents the destructor from running automatically.
Now, there is a (pretty popular) crate to do that exact thing: array_init
use array_init::array_init;
let my_array: [f32; 4] = array_init(some_function);
PS:
There is a lot of discussion and evolution around creating abstractions for arrays within the Rust team.
For example, the map function for arrays has been stable since Rust 1.55.
If you wanted to, you could implement your function with map:
// No feature gate is needed since Rust 1.55, which stabilized map on arrays.
let mut i = 0usize;
let result = [(); 4].map(|_| { let v = some_function(i); i += 1; v });
And there are even discussions around your particular problem, you can look here
Try to make your macro expand to this:
let my_array = {
let mut tmp: [f32; 4] = [0.0; 4];
for i in 0..4 {
tmp[i] = some_function(i);
}
tmp
};
What I don't know is whether this is properly optimized to avoid moving tmp to my_array. But for 4 f32 values (128 bits) it probably does not make a significant difference.