How to copy or reference a slice of bytes? [duplicate] - arrays

This question already has answers here:
How to get a slice as an array in Rust?
(7 answers)
How to convert a slice into an array reference?
(3 answers)
How can I copy, or reference a slice of bytes from a larger array?
I only need to read them, but I want the size to be specified to catch errors at compile-time.
let foo = rand::thread_rng().gen::<[u8; 32]>();
let bar: [u8; 16] = foo[0..16];
let baz: &[u8; 16] = &foo[16..32];
The errors are:
error[E0308]: mismatched types
--> src/main.rs:64:22
|
64 | let bar: [u8; 16] = foo[0..16];
| -------- ^^^^^^^^^^ expected array `[u8; 16]`, found slice `[u8]`
| |
| expected due to this
error[E0308]: mismatched types
--> src/main.rs:65:23
|
65 | let baz: &[u8; 16] = &foo[16..32];
| --------- ^^^^^^^^^^^^ expected array `[u8; 16]`, found slice `[u8]`
| |
| expected due to this
|
= note: expected reference `&[u8; 16]`
found reference `&[u8]`
I can see that foo[0..16] is exactly 16 bytes, not a slice of unknown length [u8]. How do I help the compiler see this?

Your problem isn't that you can't reference a slice of bytes; it's that a slice is not an array.
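You probably want the arrayref crate or the TryInto trait. Calling try_into() on a slice such as &foo[16..32] returns a Result that holds a &[u8; 16] when the length matches, so the length is checked at run time. There's also some discussion on doing this automatically in this GitHub issue.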

Related

How to reorder the elements of an array in-place? [duplicate]

This question already has answers here:
How to swap the elements of an array, slice, or Vec?
(1 answer)
How to get mutable references to two array elements at the same time?
(8 answers)
Temporarily move out of borrowed content
(3 answers)
I would like to write a function fn f<A>(xs: &mut [A; 9]) that reorders an array in-place from:
[a, b, c,
d, e, f,
g, h, i]
to:
[g, d, a,
h, e, b,
i, f, c]
I can't reassign the array due to moving elements out of array:
fn f1<A>(xs: &mut [A; 9]) {
*xs = [xs[6], xs[3], xs[0], xs[7], xs[4], xs[1], xs[8], xs[5], xs[2]];
}
error[E0508]: cannot move out of type `[A; 9]`, a non-copy array
--> src/lib.rs:2:12
|
2 | *xs = [xs[6], xs[3], xs[0], xs[7], xs[4], xs[1], xs[8], xs[5], xs[2]];
| ^^^^^
| |
| cannot move out of here
| move occurs because `xs[_]` has type `A`, which does not implement the `Copy` trait
I cannot do multiple mutable borrows:
fn f2<A>(xs: &mut [A; 9]) {
std::mem::swap(&mut xs[0], &mut xs[6]);
}
error[E0499]: cannot borrow `xs[_]` as mutable more than once at a time
--> src/lib.rs:2:32
|
2 | std::mem::swap(&mut xs[0], &mut xs[6]);
| -------------- ---------- ^^^^^^^^^^ second mutable borrow occurs here
| | |
| | first mutable borrow occurs here
| first borrow later used by call
There is no built-in function that does this transformation for me.
How to implement this?

Why do I need to use type** to point to type*?

I've been reading Learn C The Hard Way for a few days, but here's something I want to really understand. Zed, the author, wrote that char ** is for a "pointer to (a pointer to char)", and saying that this is needed because I'm trying to point to something 2-dimensional.
Here is what's exactly written in the webpage
A char * is already a "pointer to char", so that's just a string. You however need 2 levels, since names is 2-dimensional, that means you need char ** for a "pointer to (a pointer to char)" type.
Does this mean that I have to use a variable that can point to something 2-dimensional, which is why I need two **?
Just a little follow-up, does this also apply for n dimension?
Here's the relevant code
char *names[] = { "Alan", "Frank", "Mary", "John", "Lisa" };
char **cur_name = names;
No, that tutorial is of questionable quality. I wouldn't recommend continuing with it.
A char** is a pointer-to-pointer. It is not a 2D array.
It is not a pointer to an array.
It is not a pointer to a 2D array.
The author of the tutorial is likely confused because there is a widespread but incorrect practice which says that you should allocate dynamic 2D arrays like this:
// BAD! Do not do like this!
int** heap_fiasco;
heap_fiasco = malloc(X * sizeof(int*));
for(int x=0; x<X; x++)
{
    heap_fiasco[x] = malloc(Y * sizeof(int));
}
This is however not a 2D array, it is a slow, fragmented lookup table allocated all over the heap. The syntax of accessing one item in the lookup table, heap_fiasco[x][y], looks just like array indexing syntax, so therefore a lot of people for some reason believe this is how you allocate 2D arrays.
The correct way to allocate a 2D array dynamically is:
// correct
int (*array2d)[Y] = malloc(sizeof(int[X][Y]));
You can tell that the first is not an array because if you do memcpy(heap_fiasco, heap_fiasco2, sizeof(int[X][Y])) the code will crash and burn. The items are not allocated in adjacent memory.
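Similarly, memcpy(heap_fiasco, heap_fiasco2, sizeof(*heap_fiasco)) will not do what you want either, but for a different reason: sizeof(*heap_fiasco) is the size of a single pointer, not of an array, so only one row pointer is copied and none of the data.
By contrast, memcpy(array2d, array2d_2, sizeof(*array2d)) copies one complete row of Y ints, and memcpy(array2d, array2d_2, sizeof(int[X][Y])) copies the whole array, because it really is a 2D array stored in adjacent memory.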
Pointers took me a while to understand. I strongly recommend drawing diagrams.
Please have a read and understand this part of the C++ tutorial (at least with respect to pointers the diagrams really helped me).
Telling you that you need a pointer to a pointer to char for a two dimensional array is a lie. You don't need it but it is one way of doing it.
Memory is sequential. If you want to put 5 chars (letters) in a row like in the word hello you could define 5 variables and always remember in which order to use them, but what happens when you want to save a word with 6 letters? Do you define more variables? Wouldn't it be easier if you just stored them in memory in a sequence?
So what you do is you ask the operating system for 5 chars (and each char just happens to be one byte) and the system returns to you a memory address where your sequence of 5 chars begins. You take this address and store it in a variable which we call a pointer, because it points to your memory.
The problem with pointers is that they are just addresses. How do you know what is stored at that address? Is it 5 chars or is it a big binary number that is 8 bytes? Or is it a part of a file that you loaded? How do you know?
This is where the programming language like C tries to help by giving you types. A type tells you what the variable is storing and pointers too have types but their types tell you what the pointer is pointing to. Hence, char * is a pointer to a memory location that holds either a single char or a sequence of chars. Sadly, the part about how many chars are there you will need to remember yourself. Usually you store that information in a variable that you keep around to remind you how many chars are there.
So when you want to have a 2 dimensional data structure how do you represent that?
This is best explained with an example. Let's make a matrix:
1 2 3 4
5 6 7 8
9 10 11 12
It has 4 columns and 3 rows. How do we store that?
Well, we can make 3 sequences of 4 numbers each. The first sequence is 1 2 3 4, the second is 5 6 7 8 and the third and last sequence is 9 10 11 12. So if we want to store 4 numbers we will ask the system to reserve 4 numbers for us and give us a pointer to them. This will be a pointer to numbers. However, since we need 3 such sequences, we also ask the system for room to store 3 of these pointers, and what we get back for that is a pointer to pointers to numbers.
And that's how you end up with the proposed solution...
The other way to do it would be to realize that you need 4 times 3 numbers and just ask the system for 12 numbers to be stored in a sequence. But then how do you access the number in row 2 and column 3? This is where maths comes in but let's try it on our example:
1 2 3 4
5 6 7 8
9 10 11 12
If we store them next to each other they would look like this:
offset from start: 0 1 2 3 4 5 6 7 8 9 10 11
numbers in memory: [1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]
So our mapping is like this:
row | column | offset | value
1 | 1 | 0 | 1
1 | 2 | 1 | 2
1 | 3 | 2 | 3
1 | 4 | 3 | 4
2 | 1 | 4 | 5
2 | 2 | 5 | 6
2 | 3 | 6 | 7
2 | 4 | 7 | 8
3 | 1 | 8 | 9
3 | 2 | 9 | 10
3 | 3 | 10 | 11
3 | 4 | 11 | 12
And we now need to work out a nice and easy formula for converting a row and column to an offset.
To find the offset of each of the numbers from a row and column you can use the following formula:
offset = (row - 1) * 4 + (column - 1)
Notice the two -1's here. We have to subtract them only because our row and column numbering starts at 1, and this is one reason why computer scientists prefer zero-based offsets. With a true multi-dimensional array in C, the language itself applies this formula for you (with zero-based rows and columns) whenever you index the array. And hence this is the other way of doing it.
From your question, what I understand is that you are asking why you need char ** for the variable which is declared as *names[]. When you write names[], that is array syntax. An array is not itself a pointer, but in most expressions it is converted to a pointer to its first element.
So char *names[] declares an array whose elements are pointers to char. When names is used as an initializer it becomes a pointer to its first element, which means you have a pointer to a pointer to char, and that's why the compiler will not complain if you write
char **cur_name = names;
In the above line you are declaring a pointer to a character pointer and then initializing it with the address of the first element of the array.

In C, how does the length in an array definition map to addressing?

In C, when I define an array like int someArray[10], does that mean that the accessible range of that array is someArray[0] to someArray[9]?
Yes, indexing in C is zero-based, so for an array of n elements, valid indices are 0 through n-1.
Yes, because C's memory addressing is easily computed by an offset
myArray[5] = 3
roughly translates to
store in the address myArray + 5 * sizeof(myArray's base type)
the number 3.
Which means that if we permitted
myArray[1]
to be the first element, we would have to compute
store in the address myArray + (5 - 1) * sizeof(myArray's base type)
the number 3
which would require an extra computation to subtract the 1 from the 5 and would slow the program down a little bit (as this would require an extra trip through the ALU).
Modern CPUs could be architected around such issues, and modern compilers could compile these differences out; however, when C was crafted they didn't consider it a must-have nicety.
Think of an array like this:
* 0 1 2 3 4 5 6 7 8 9
+---+---+---+---+---+---+---+---+---+----+
| 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
+---+---+---+---+---+---+---+---+---+----+
DATA
* = array indices
So the range of access would be [0,9] (inclusive).

Convert Julia array to dataframe

I have an array X that I'd like to convert to a dataframe. Upon recommendation from the web, I tried converting to a dataframe and get the following error.
julia> y=convert(DataFrame,x)
ERROR: `convert` has no method matching convert(::Type{DataFrame}, ::Array{Float64,2})
in convert at base.jl:13
When I try DataFrame(x), the conversion works but I get a complaint that the conversion is deprecated.
julia> DataFrame(x)
WARNING: DataFrame(::Matrix, ::Vector)) is deprecated, use convert(DataFrame, Matrix) instead in DataFrame at /Users/Matthew/.julia/v0.3/DataFrames/src/deprecated.jl:54 (repeats 2 times)
Is there another method I should be aware of to keep my code consistent?
EDIT:
Julia 0.3.2,
DataFrames 0.5.10
OSX 10.9.5
julia> x=rand(4,4)
4x4 Array{Float64,2}:
0.467882 0.466358 0.28144 0.0151388
0.22354 0.358616 0.669564 0.828768
0.475064 0.187992 0.584741 0.0543435
0.0592643 0.345138 0.704496 0.844822
julia> convert(DataFrame,x)
ERROR: `convert` has no method matching convert(::Type{DataFrame}, ::Array{Float64,2}) in convert at base.jl:13
This works for me:
julia> using DataFrames
julia> x = rand(4, 4)
4x4 Array{Float64,2}:
0.790912 0.0367989 0.425089 0.670121
0.243605 0.62487 0.582498 0.302063
0.785159 0.0083891 0.881153 0.353925
0.618127 0.827093 0.577815 0.488565
julia> convert(DataFrame, x)
4x4 DataFrame
| Row | x1 | x2 | x3 | x4 |
|-----|----------|-----------|----------|----------|
| 1 | 0.790912 | 0.0367989 | 0.425089 | 0.670121 |
| 2 | 0.243605 | 0.62487 | 0.582498 | 0.302063 |
| 3 | 0.785159 | 0.0083891 | 0.881153 | 0.353925 |
| 4 | 0.618127 | 0.827093 | 0.577815 | 0.488565 |
Are you trying something different?
If that doesn't work, try posting a bit more code so we can help you better.
Since this is the first thing that comes up when you google, for more recent versions of DataFrames.jl, you can just use the DataFrame() function now:
julia> x = rand(4,4)
4×4 Matrix{Float64}:
0.920406 0.738911 0.994401 0.9954
0.18791 0.845132 0.277577 0.231483
0.361269 0.918367 0.793115 0.988914
0.725052 0.962762 0.413111 0.328261
julia> DataFrame(x, :auto)
4×4 DataFrame
Row │ x1 x2 x3 x4
│ Float64 Float64 Float64 Float64
─────┼────────────────────────────────────────
1 │ 0.920406 0.738911 0.994401 0.9954
2 │ 0.18791 0.845132 0.277577 0.231483
3 │ 0.361269 0.918367 0.793115 0.988914
4 │ 0.725052 0.962762 0.413111 0.328261
I've been confounded by the same issue a number of times, and eventually realized the issue is often related to the format of the array, and is easily resolved by simply transposing the array prior to conversion.
In short, I recommend:
julia> convert(DataFrame, x')
# convert a Matrix{Any} with a header row of col name strings to a DataFrame
# e.g. mat2df(["a" "b" "c"; 1 2 3; 4 5 6])
mat2df(mat) = convert(DataFrame,Dict(mat[1,:],
[mat[2:end,i] for i in 1:size(mat,2)]))
# convert a Matrix{Any} (mat) and a list of col name strings (headerstrings)
# to a DataFrame, e.g. matnms2df([1 2 3;4 5 6], ["a","b","c"])
matnms2df(mat, headerstrs) = convert(DataFrame,
Dict(zip(headerstrs,[mat[:,i] for i in 1:size(mat,2)])))
A little late, but with the update to the DataFrame() function, I created a custom function that would take a matrix (e.g. an XLSX imported dataset) and convert it into a DataFrame using the first row as column headers. Saves me a ton of time and, hopefully, it helps you too.
function MatrixToDataFrame(mat)
    DF_mat = DataFrame(
        mat[2:end, 1:end],
        string.(mat[1, 1:end])
    )
    return DF_mat
end
So I found this online and honestly felt dumb.
using DataFrames
WhatIWant = DataFrame(WhatIHave)
this was adapted from an R guide, but it works so heck
DataFrame([1 2 3 4; 5 6 7 8; 9 10 11 12], :auto)
This works, as per the built-in help (?DataFrame).

For "int demo[4][2]", why are all these the same in magnitude: &demo[1], demo[1], demo+1, *(demo+1)? What about type?

Just when I had relaxed, thinking I had a fair understanding of pointers in the context of arrays, I have fallen face down again over the following program. I had understood how, for an array arr, arr and &arr are the same in magnitude but different in type, but I fail to get a solid grip on the following program's output. I try to visualize it but succeed only partially. I would appreciate it if you could give a rigorous and detailed explanation so that people like me can be done with this confusion for good.
In the following program, I have used a "2D" array demo[][2]. I know that demo[] will be an array of arrays of size 2. I also know that demo used alone will be of type (*)[2]. Still, I am at a loss about the following:
1) Why is &demo[1] the same as demo[1]? Isn't demo[1] supposed to be the address of the second array? What on earth is &demo[1] then, and why is it the same as the address of the second array?
2) I know that the second printf() and the fourth are the same, as demo[1] is nothing else but *(demo+1). I've used both only to illustrate this point. How can it be equal to the third printf(), i.e., how can demo+1 be equal to *(demo+1)? demo[1] being the same as *(demo+1) is well known, but how can demo+1 be equal to *(demo+1)? How can "something" be equal to the value at that "something"?
3) And since it has just been proved that I am not very smart, I should stop my guessing game and ask you for a conclusive answer about the types of the following:
&demo[1]
demo[1]
demo+1
#include <stdio.h>
int main(void)
{
    int demo[4][2] = {{13,45},{83,34},{4,8},{234,934}};
    printf("%p\n", (void *)&demo[1]);
    printf("%p\n", (void *)demo[1]);   /* %p expects a void *, hence the casts */
    printf("%p\n", (void *)(demo+1));
    printf("%p\n", (void *)*(demo+1));
    return 0;
}
OUTPUT:
0023FF28
0023FF28
0023FF28
0023FF28
demo[1] is the second member of the array demo, and is an array itself. Just like any other array, when it's not the subject of the & or sizeof operator, it evaluates to a pointer to its first element - that is, demo[1] evaluates to the same thing as &demo[1][0], the address of the first int in the array demo[1].
&demo[1] is the address of the array demo[1], and because the address of an array and the address of the first member of that array are necessarily the same location, &demo[1] is equal to &demo[1][0], which is equal to a bare demo[1]. This is the key insight: the first element of an array is located at the same place in memory as the array itself, just like the first member of a struct is located at the same place in memory as the struct itself. When you print &demo[1] and demo[1], you're not printing a pointer to the array and an array; you're printing a pointer to the array and a pointer to the first member of that array.
demo+1 is the address of the second member of demo. *(demo+1) is that member itself (it's the array demo[1]), but because that member is an array, it evaluates to a pointer to its first member. As above, its first member is necessarily collocated with the array itself. It's not that "something" is equal to the value at "something" - because when you use an array in an expression like that, it doesn't evaluate to the array itself.
&demo[1] is a pointer to demo[1], which is an array of 2 int. So its type is int (*)[2].
demo[1] is an array of 2 int. Its type is int [2]. However, when used in an expression where it is not the operand of the & or sizeof operators, it will evaluate to a pointer to its first member, which is a value with type int *.
demo+1 is a pointer to demo[1], and its type is int (*)[2].
Think about how the array is laid out in memory:
+-----+-----+-----+-----+-----+-----+-----+-----+
| 13 | 45 | 83 | 34 | 4 | 8 | 234 | 934 |
+-----+-----+-----+-----+-----+-----+-----+-----+
^ ^ ^ ^
| | | |
demo[0] demo[1] demo[2] demo[3]
Then also remember that demo, when used in an expression, naturally "points" to the first entry in the array. From this it follows that demo + 0 should of course point to the first entry as well, and another way of getting the address of an entry in the array is with the address-of operator &, which would be &demo[0]. So demo is equal to demo + 0, which is equal to &demo[0].
Also remember that, for your example, each entry in demo is another array, and an array decays to a pointer to its first element in most expressions. From this it follows that demo[0] can be used as a pointer as well.
Now replace the 0 index above with 1, and you get exactly what you are observing.
+-------+------+
| | |
| 13 | 45 |
101 | | | 105
+--------------+
| | |
| 83 | 34 |
109 | | | 113
+--------------+
| | |
| 04 | 08 |
117 | | | 121
+--------------+
| | |
| 234 | 934 |
125 | | | 129
+--------------+
Note: Assuming sizeof(int) = 4
Assuming the 2D layout (it's not that way in memory though; the elements are all in one line).
demo[i] is the ith row of the 2D array, and each row is itself a 1D array.
demo[1] is row 1, counting from 0 (I mean the one with address 109).
&demo[1]
is the address of demo[1] and is the same as the base address of that row, just like for 1D arrays, where the array name gives the address of the first location. Here the 1D array name is demo[1].
demo[1]
Since an array name also gives the base address of the array, it has the same value as &demo[1].
demo+1
demo decays to a pointer with value 101. It points to demo[0], a whole row of two elements, so its type is int (*)[2]. demo+1 therefore advances by one row and points to the next row, which is the same address as demo[1].
*(demo+1)
demo+1 points to row 1, and *(demo+1) means the value at that location. That value is itself an array, so it gives the address of its first element, since array names give addresses.
