How can Rust transfer a struct back to C?

I am building a server-client app where the server, written in Rust, obtains stat information for a given path and transfers it to a C client. I want the C client to be able to use the bytes directly and cast them to a struct. The stat I am talking about is C's struct stat. Here is my Rust representation of the stat:
#[repr(C)]
pub struct Stat {
    st_dev: u64,
    st_ino: u64,
    st_nlink: u64,
    st_mode: u32,
    st_uid: u32,
    st_gid: u32,
    st_rdev: u64,
    st_size: u64,
    st_blksize: u64,
    st_blocks: u64,
    st_atime: i64,
    st_atime_nsec: i64,
    st_mtime: i64,
    st_mtime_nsec: i64,
    st_ctime: i64,
    st_ctime_nsec: i64,
}
impl Stat {
    pub fn encode(self) -> Vec<u8> {
        unsafe {
            std::slice::from_raw_parts(
                (&self as *const Stat) as *const u8,
                std::mem::size_of::<Stat>(),
            ).to_owned()
        }
    }
}
However, the values do not match once I receive the data on the C side. Below is a comparison of the values for each field, following the order in the struct:
# C:
16777220
8613988721
0
0
5
0
16832
6879832142633762816
0
1327895242430480384
20
687194767360
17592186044416
0
6879832142633762816
0
# Rust:
16777220
8613988721
5
16832
501
20
0
160
4096
0
1601835746
0
1601835746
0
1601835746
309174704
Does anyone know what caused this problem? And how can I solve it?

Use nix. See https://docs.rs/nix/newest/nix/sys/stat/fn.stat.html
nix uses the struct stat from the libc package, which has a separate manually generated struct definition for every supported platform. I don't precisely understand why you want to encode stat structures, but you need to keep in mind that they will most likely be mutually incompatible between different architectures, platforms, and OS versions. That is, you can only reliably bytewise encode and decode them when the encoder and decoder are running on the same version of the same platform.

Related

Memory efficient computation of md5sum of a file in vlang

The following code reads a file into a byte array and computes the md5sum of that array. It works, but I would like to find a solution in V that needs less RAM.
Thanks for your comments!
import os
import crypto.md5
b := os.read_bytes("file.txt") or {panic(err)}
s := md5.sum(b).hex()
println(s)
I also tried the following, without success:
import os
import crypto.md5
import io
mut f := os.open_file("file.txt", "r")?
mut h := md5.new()
io.cp(mut f, mut h)?
s := h.sum().hex()
println(s) // does not return the correct md5sum
Alrighty. This is what you're looking for. It produces the same result as md5sum and is only slightly slower. block_size controls the trade-off between memory use and speed: decreasing block_size lowers the memory footprint but makes the checksum take longer to compute, and increasing it has the opposite effect. I tested on a 2 GB Manjaro disc image and can confirm the memory usage is very low.
Note: It seems this does perform noticeably slower without the -prod flag. The V compiler makes special optimizations in order to run faster for the production build.
import crypto.md5
import io
import os
fn main() {
    println(hash_file('manjaro.img')?)
}

const block_size = 64 * 65535

fn hash_file(path string) ?string {
    mut file := os.open(path)?
    defer {
        file.close()
    }
    mut buf := []u8{len: block_size}
    mut r := io.new_buffered_reader(reader: file)
    mut digest := md5.new()
    for {
        x := r.read(mut buf) or { break }
        digest.write(buf[..x])?
    }
    return digest.checksum().hex()
}
To conclude what I've learned from the comments:
V is a programming language with typed arguments
md5.sum takes a byte array argument, and not something that is a sequence of bytes, e.g. read from a file as-you-go.
There's no alternative to md5.sum
So, you will have to implement MD5 yourself. Maybe the standard library is open source and you can build upon that! Or you can just bind any of the existing (e.g. C) implementations of MD5 and feed in bytes as you read them, in chunks of 512 bits = 2⁶ bytes.
EDIT: I don't know V, so it's hard for me to judge, but it looks like Digest.write is a method to consecutively push data through the MD5 calculation. Maybe that, together with a loop reading bytes from the file, is the solution?

perlbench results in segfault outside the SPEC 2006 harness

This might be overly specific, but I'm posting here as it might help someone else who's trying to compile/run the SPEC 2006 benchmarks outside the default SPEC benchmark harness. (Our reason for doing this is comparing compilation strategies and code coverage, while the SPEC harness is focused only on the performance of the resulting code.)
When performing a ref run of perlbench the benchmark crashes with a segmentation fault:
Program received signal SIGSEGV, Segmentation fault.
0x00000000004f6868 in S_regmatch (prog=0x832144)
at <path-to-spec>/CPU2006/400.perlbench/src/regexec.c:3024
3024 PL_reg_start_tmp[n] = locinput;
(gdb) bt
#0 0x00000000004f6868 in S_regmatch (prog=0x832144)
at <path-to-spec>/CPU2006/400.perlbench/src/regexec.c:3024
#1 0x00000000004f22cf in S_regtry (prog=0x8320c0, startpos=0x831e70 "o")
at <path-to-spec>/CPU2006/400.perlbench/src/regexec.c:2196
#2 0x00000000004eba71 in Perl_regexec_flags (prog=0x8320c0, stringarg=0x831e70 "o", strend=0x831e71 "",
strbeg=0x831e70 "o", minend=0, sv=0x7e2528, data=0x0, flags=3)
at <path-to-spec>/CPU2006/400.perlbench/src/regexec.c:1910
#3 0x00000000004b33bb in Perl_pp_match ()
at <path-to-spec>/CPU2006/400.perlbench/src/pp_hot.c:1340
#4 0x00000000004fcde4 in Perl_runops_standard ()
at <path-to-spec>/CPU2006/400.perlbench/src/run.c:37
#5 0x000000000046bf57 in S_run_body (oldscope=1)
at <path-to-spec>/CPU2006/400.perlbench/src/perl.c:2017
#6 0x000000000046b9f6 in perl_run (my_perl=0x7bf010)
at <path-to-spec>/CPU2006/400.perlbench/src/perl.c:1934
#7 0x000000000047add2 in main (argc=4, argv=0x7fffffffe178, env=0x7fffffffe1a0)
at <path-to-spec>/CPU2006/400.perlbench/src/perlmain.c:98
The execution environment is 64-bit Linux and the behaviour is observed with both the latest gcc and clang.
What causes this crash?
The segfault is caused by a garbage value of the variable n on the pointed out line. Inspecting the code shows that the value comes from the field arg1 of an object of type:
struct regnode_1 {
    U8 flags;
    U8 type;
    U16 next_off;
    U32 arg1;
};
Inspecting the memory location of the object shows that it is not packed, i.e. there is 32-bit padding between next_off and arg1:
(gdb) x/16xb scan
0x7f4978: 0xde 0x2d 0x02 0x00 0x00 0x00 0x00 0x00
0x7f4980: 0x00 0x11 0x0d 0x00 0x00 0x00 0x00 0x00
(gdb) print/x n
$1 = 0xd1100
This is suspicious. There's pointer and type conversion going on in perlbench, so perhaps type size assumptions fail somewhere. Compiling with multilib (i.e. as a 32-bit binary) yields a working benchmark, and examining the memory verifies that there is no padding.
Forcing the structure into a bitfield fixes the crash when performing a 64-bit compile:
struct regnode_1 {
    U8 flags : 8;
    U8 type : 8;
    U16 next_off : 16;
    U32 arg1 : 32;
};
This is how our little investigation progressed:
At first we thought it was some padding issue, but as Peter pointed out on Godbolt, no such thing occurs. So, the packing or not of the structure did not change anything.
Then, I got suspicious of the (clearly twisted) way that Perl handles pointers. The majority of the casts violate strict aliasing as defined by the standard, and the segmentation fault happened on a pointer cast, namely from:
struct regnode {
    U8 flags;
    U8 type;
    U16 next_off;
};
to
struct regnode_1 {
    U8 flags;
    U8 type;
    U16 next_off;
    U32 arg1;
};
However, toggling the -fstrict-aliasing flag didn't change anything. And although it qualifies as undefined behaviour, there is no overlap in memory, since the elements/nodes of the regular expression currently being parsed are laid out separately in memory.
Going deeper and checking the LLVM IR for the switch block in question, I got this in regexec.ll
; truncated
%876 = load %struct.regnode*, %struct.regnode** %scan, align 8, !dbg !8005
%877 = bitcast %struct.regnode* %876 to %struct.regnode_1*, !dbg !8005
%arg11715 = getelementptr inbounds %struct.regnode_1, %struct.regnode_1* %877, i32 0, i32 3, !dbg !8005
%878 = load i64, i64* %arg11715, align 8, !dbg !8005
store i64 %878, i64* %n, align 8, !dbg !8006
; truncated
The load/store instructions use a 64-bit integer, which means the pointer on the C side is interpreted as pointing to an 8-byte integer (instead of 4). Thus, 2 bytes outside the current regex node struct's bounds are gathered when calculating the value of arg1. This value is in turn used as an array index, which ultimately causes the segfault when it is out of the array's bounds.
Back to tracing where U32 is interpreted as a 64-bit unsigned integer. Looking into file spec_config.h, the conditional compilation leads (at least in my machine) to a preprocessor block that starts with
#elif !defined(SPEC_CPU_GOOFY_DATAMODEL)
which, according to a code comment in the surrounding area, is supposed to correspond to a ILP32 data model (see also this). However, U32TYPE is defined as an unsigned long, which on my machine is 64 bits.
So, the fix is to change the definition to
#define U32TYPE uint32_t
which, as stated in this, is guaranteed to be exactly 32 bits (if supported).
I'd like to complement the other answers by saying that it was enough for us to add -DSPEC_CPU_LP64 to work around the segfault (-DSPEC_LP64 in CPU2017). Would be nice if the SPEC group would add this to their FAQ. This also seems to apply to gcc, cactusADM, povray and wrf.
We have a python script generating the config files for us, I'll talk to people and see if I can share what we have so far to get it running for our compiler.
Edit: It seems to be accessible from the outside anyway, so here you go: spec.py

Portable way to determine sector size in Linux

I want to write a small program in C which can determine the sector size of a hard disk. I wanted to read the file located in /sys/block/sd[X]/queue/hw_sector_size, and it worked in CentOS 6/7.
However when I tested in CentOS 5.11, the file hw_sector_size is missing, and I have only found max_hw_sectors_kb and max_sectors_kb.
Thus, I'd like to know how I can determine the sector size (i.e., which APIs to use) in CentOS 5, or whether there is a better way to do so. Thanks.
The fdisk utility displays this information (and runs successfully on kernels even older than the 2.6.x vintage on CentOS 5), so that seems a likely place to look for an answer. Fortunately, we're living in the wonderful world of open source, so all it requires is a little investigation.
The fdisk program is provided by the util-linux package, so we need that first.
The sector size is displayed in the output of fdisk like this:
Disk /dev/sda: 477 GiB, 512110190592 bytes, 1000215216 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
If we look for Sector size in the util-linux code, we find it in disk-utils/fdisk-list.c:
fdisk_info(cxt, _("Sector size (logical/physical): %lu bytes / %lu bytes"),
fdisk_get_sector_size(cxt),
fdisk_get_physector_size(cxt));
So, it looks like we need to find fdisk_get_sector_size, which is defined in libfdisk/src/context.c:
unsigned long fdisk_get_sector_size(struct fdisk_context *cxt)
{
    assert(cxt);
    return cxt->sector_size;
}
Well, that wasn't super helpful. We need to find out where cxt->sector_size is set:
$ grep -lri 'cxt->sector_size.*=' | grep -v tests
libfdisk/src/alignment.c
libfdisk/src/context.c
libfdisk/src/dos.c
libfdisk/src/gpt.c
libfdisk/src/utils.c
I'm going to start with alignment.c, since that filename sounds promising. Looking through that file for the same regex I used to list the files, we find this:
cxt->sector_size = get_sector_size(cxt->dev_fd);
Which leads me to:
static unsigned long get_sector_size(int fd)
{
    int sect_sz;

    if (!blkdev_get_sector_size(fd, &sect_sz))
        return (unsigned long) sect_sz;
    return DEFAULT_SECTOR_SIZE;
}
Which in turn leads me to the definition of blkdev_get_sector_size in lib/blkdev.c:
#ifdef BLKSSZGET
int blkdev_get_sector_size(int fd, int *sector_size)
{
    if (ioctl(fd, BLKSSZGET, sector_size) >= 0)
        return 0;
    return -1;
}
#else
int blkdev_get_sector_size(int fd __attribute__((__unused__)), int *sector_size)
{
    *sector_size = DEFAULT_SECTOR_SIZE;
    return 0;
}
#endif
And there we go. There is a BLKSSZGET ioctl that seems useful. A search for BLKSSZGET leads us to this stackoverflow question, which includes the following information in a comment:
For the record: BLKSSZGET = logical block size, BLKBSZGET = physical
block size, BLKGETSIZE64 = device size in bytes, BLKGETSIZE = device
size/512. At least if the comments in fs.h and my experiments can be
trusted. – Edward Falk Jul 10 '12 at 19:33

Raw pointer turns null passing from Rust to C

I'm attempting to retrieve a raw pointer from one C function in Rust, and use that same raw pointer as an argument to another C function from another library. When I pass the raw pointer, I end up with a NULL pointer on the C side.
I have tried to make a simplified version of my issue, but when I do, it works as I would expect it to -
C Code -
#include <stdio.h>
#include <stdlib.h>

struct MyStruct {
    int value;
};

struct MyStruct *get_struct() {
    struct MyStruct *priv_struct = (struct MyStruct *) malloc(sizeof(struct MyStruct));
    priv_struct->value = 0;
    return priv_struct;
}

void put_struct(struct MyStruct *priv_struct) {
    printf("Value - %d\n", priv_struct->value);
}
Rust Code -
use std::os::raw::c_int;

#[repr(C)]
struct MyStruct {
    value: c_int,
}

extern {
    fn get_struct() -> *mut MyStruct;
}

extern {
    fn put_struct(priv_struct: *mut MyStruct) -> ();
}

fn rust_get_struct() -> *mut MyStruct {
    let ret = unsafe { get_struct() };
    ret
}

fn rust_put_struct(priv_struct: *mut MyStruct) {
    unsafe { put_struct(priv_struct) };
}

fn main() {
    let main_struct = rust_get_struct();
    rust_put_struct(main_struct);
}
When I run this I get the output of Value - 0
~/Dev/rust_test$ sudo ./target/debug/rust_test
Value - 0
~/Dev/rust_test$
However, when trying to do this against a DPDK library, I retrieve and pass a raw pointer in the same way but get a segfault. If I use gdb to debug, I can see that I'm passing a pointer on the Rust side, but it is NULL on the C side -
(gdb) frame 0
#0 rte_eth_rx_queue_setup (port_id=0 '\000', rx_queue_id=<optimized out>, nb_rx_desc=<optimized out>, socket_id=0, rx_conf=0x0, mp=0x0)
at /home/kenton/Dev/dpdk-16.07/lib/librte_ether/rte_ethdev.c:1216
1216 if (mp->private_data_size < sizeof(struct rte_pktmbuf_pool_private)) {
(gdb) frame 1
#1 0x000055555568953b in dpdk::ethdev::dpdk_rte_eth_rx_queue_setup (port_id=0 '\000', rx_queue_id=0, nb_tx_desc=128, socket_id=0, rx_conf=None,
mb=0x7fff3fe47640) at /home/kenton/Dev/dpdk_ffi/src/ethdev/mod.rs:32
32 let retc: c_int = unsafe {ffi::rte_eth_rx_queue_setup(port_id as uint8_t,
In frame 1, mb has an address and is being passed. In frame 0, the receiving function in the library shows the corresponding parameter mp as 0x0.
My code to receive the pointer -
let mb = dpdk_rte_pktmbuf_pool_create(CString::new("MBUF_POOL").unwrap().as_ptr(),
                                      (8191 * nb_ports) as u32, 250, 0, 2176, dpdk_rte_socket_id());
This calls into an ffi library -
pub fn dpdk_rte_pktmbuf_pool_create(name: *const c_char,
                                    n: u32,
                                    cache_size: u32,
                                    priv_size: u16,
                                    data_room_size: u16,
                                    socket_id: i32) -> *mut rte_mempool::ffi::RteMempool {
    let ret: *mut rte_mempool::ffi::RteMempool = unsafe {
        ffi::shim_rte_pktmbuf_pool_create(name,
                                          n as c_uint,
                                          cache_size as c_uint,
                                          priv_size as uint16_t,
                                          data_room_size as uint16_t,
                                          socket_id as c_int)
    };
    ret
}
ffi -
extern {
    pub fn shim_rte_pktmbuf_pool_create(name: *const c_char,
                                        n: c_uint,
                                        cache_size: c_uint,
                                        priv_size: uint16_t,
                                        data_room_size: uint16_t,
                                        socket_id: c_int) -> *mut rte_mempool::ffi::RteMempool;
}
C function -
struct rte_mempool *
rte_pktmbuf_pool_create(const char *name, unsigned n,
unsigned cache_size, uint16_t priv_size, uint16_t data_room_size,
int socket_id);
When I pass the pointer, it looks much the same as my simplified version up above. My variable mb contains a raw pointer that I pass to another function -
ret = dpdk_rte_eth_rx_queue_setup(port,q,128,0,None,mb);
ffi library -
pub fn dpdk_rte_eth_rx_queue_setup(port_id: u8,
                                   rx_queue_id: u16,
                                   nb_tx_desc: u16,
                                   socket_id: u32,
                                   rx_conf: Option<*const ffi::RteEthRxConf>,
                                   mb_pool: *mut rte_mempool::ffi::RteMempool) -> i32 {
    let retc: c_int = unsafe {
        ffi::rte_eth_rx_queue_setup(port_id as uint8_t,
                                    rx_queue_id as uint16_t,
                                    nb_tx_desc as uint16_t,
                                    socket_id as c_uint,
                                    rx_conf,
                                    mb_pool)
    };
    let ret: i32 = retc as i32;
    ret
}
ffi -
extern {
    pub fn rte_eth_rx_queue_setup(port_id: uint8_t,
                                  rx_queue_id: uint16_t,
                                  nb_tx_desc: uint16_t,
                                  socket_id: c_uint,
                                  rx_conf: Option<*const RteEthRxConf>,
                                  mb: *mut rte_mempool::ffi::RteMempool) -> c_int;
}
C function -
int
rte_eth_rx_queue_setup(uint8_t port_id, uint16_t rx_queue_id,
                       uint16_t nb_rx_desc, unsigned int socket_id,
                       const struct rte_eth_rxconf *rx_conf,
                       struct rte_mempool *mp);
I apologize for the length, but I feel like I'm missing something simple and haven't been able to figure it out. I've checked struct alignment for each field that is being passed, and I even see the values I'd expect for the pointer being passed -
(gdb) frame 1
#1 0x000055555568dcf4 in dpdk::ethdev::dpdk_rte_eth_rx_queue_setup (port_id=0 '\000', rx_queue_id=0, nb_tx_desc=128, socket_id=0, rx_conf=None,
mb=0x7fff3fe47640) at /home/kenton/Dev/dpdk_ffi/src/ethdev/mod.rs:32
32 let retc: c_int = unsafe {ffi::rte_eth_rx_queue_setup(port_id as uint8_t,
(gdb) print *mb
$1 = RteMempool = {name = "MBUF_POOL", '\000' <repeats 22 times>, pool_union = PoolUnionStruct = {data = 140734245862912}, pool_config = 0x0,
mz = 0x7ffff7fa4c68, flags = 16, socket_id = 0, size = 8191, cache_size = 250, elt_size = 2304, header_size = 64, trailer_size = 0,
private_data_size = 64, ops_index = 0, local_cache = 0x7fff3fe47700, populated_size = 8191, elt_list = RteMempoolObjhdrList = {
stqh_first = 0x7fff3ebc7f68, stqh_last = 0x7fff3fe46ce8}, nb_mem_chunks = 1, mem_list = RteMempoolMemhdrList = {stqh_first = 0x7fff3ebb7d80,
stqh_last = 0x7fff3ebb7d80}, __align = 0x7fff3fe47700}
Any ideas on why the pointer is turning to NULL on the C side?
CString::new("…").unwrap().as_ptr() does not work. The CString is temporary, so the as_ptr() call returns the inner pointer of that temporary, which will likely be dangling by the time you use it. This is “safe” per Rust's definition of safety as long as you don't use the pointer, but you eventually do so in an unsafe block. You should bind the string to a variable and call as_ptr on that variable instead.
This is such a common problem, there is even a proposal to fix the CStr{,ing} API to avoid it.
Additionally, raw pointers are already nullable by themselves, so the Rust FFI equivalent of const struct rte_eth_rxconf * would be *const ffi::RteEthRxConf, not Option<*const ffi::RteEthRxConf>.

Initializing struct member with type DWORD64

I use Microsoft Visual Studio 2010, write in C++, and compile binaries for the x64 platform. In my project I have a useful structure for memory blocks, with a data pointer and a size:
typedef struct _MEMORY_BLOCK
{
    DWORD64 Length;
    LPBYTE lpData;
}
MEMORY_BLOCK, *LPMEMORY_BLOCK;
In another file I have a key definition:
BYTE PublicKey[] = { 0x01, 0x02, 0x03, ... };
DWORD64 PublicKeyLength = (DWORD64)sizeof(PublicKey);
MEMORY_BLOCK ServerKey = { PublicKeyLength, PublicKey };
On the x86 platform the structure initialization of ServerKey works fine, but on the x64 platform it results in a MEMORY_BLOCK struct filled with zeroes. If I change the member order in the structure (e.g. lpData first and Length second), lpData initializes properly, but Length is still equal to zero.
Now I have a workaround with a function that sets the ServerKey values at runtime, but I need to know why I could not initialize the DWORD64 struct member in the ServerKey definition.
