I want to use a Go library from C, with some adjustments.
I wrote a Go function, GoAdder, that takes three arguments: two ints, x and y, and a function f of type Ftest.
GoAdder then calls f.
adder.go
package main
import "fmt"
import "C"
//export Ftest
type Ftest func(C.int);
//export GoAdder
func GoAdder(x, y int, f Ftest) int {
fmt.Printf("Go says: adding %v and %v\n", x, y)
f(10);
return x + y
}
func main() {} // Required but ignored
I built the Go package above as a static library named libadder.a like this:
go build -buildmode=c-archive -o libadder.a adder.go
Then I wrote the C code below.
main.c
#include <stdio.h>
#include "adder/libadder.h"
void a( GoInt a ){
printf("Hello %d", a);
}
int main() {
printf("C says: about to call Go...\n");
int total = GoAdder(1, 7, &a);
printf("C says: Go calculated our total as %i\n", total);
return 0;
}
I compiled the source like this:
gcc -pthread -o static_go_lib main.c adder/libadder.a
Running the resulting program produces this error:
unexpected fault address 0x0
fatal error: fault
[signal SIGSEGV: segmentation violation code=0x80 addr=0x0 pc=0x563c99b74244]
goroutine 17 [running, locked to thread]:
...
How can I get the correct address of the C function a inside the Go function GoAdder?
I used https://github.com/draffensperger/go-interlang/tree/master/c_to_go/static_go_lib as a reference.
A C function pointer is just a jump address, whereas a Go function value is a more complicated structure, so you cannot convert one into the other.
There's only one (safe) way to call a C function pointer:
1) Declare this somewhere:
//go:linkname cgocall runtime.cgocall
//go:nosplit
func cgocall(fn, arg unsafe.Pointer /* may be uintptr */) int32
2) Also, be type safe:
func GoAdder(x, y C.int, f unsafe.Pointer /* not sure this is available; maybe C.uintptr_t */) C.int
3) The C function should take a pointer (to whatever) as its argument:
void a(GoInt *a)
(I'd use native types.)
4) Call it:
ten := 10
cgocall(f, unsafe.Pointer(&ten))
(Pass a pointer to a struct if you want to pass several arguments.)
I have been trying to implement a small simulation to understand memory allocation of malloc(). I created a shared library called mem.c. I am linking the library to the main but cannot pass the correct address of the simulated "heap". Heap is created by a malloc() call in the shared library.
Address in the shared library: 0x55ddaff662a0
Address in the main: 0xffffffffaff662a0
Only the last 4 bytes seem to be correct; the rest are set to 0xf.
However, when I #include "mem.c" in the main file it works correctly. How can I achieve the same result without including mem.c? I am trying to solve this without including mem.c or mem.h. I create the shared library like this:
gcc -c -fpic mem.c
gcc -shared -o libmem.so mem.o
gcc main.c -lmem -L. -o main
From your comments
I am trying to implement without using #include mem.h or mem.c.
Then you must provide a prototype for the function you're calling by other means. Without an explicit declaration, following the tradition of K&R and early ANSI C, an undeclared function is assumed to return int and to take an unspecified number of arguments.
EDIT: Essentially you need to write what you'd normally find in a header somewhere before the first use of the function. Or, if it's a function pointer, you need an appropriate variable to store the function pointer.
For example, to declare a function that returns an untyped pointer and takes an arbitrary, unspecified number of arguments, you'd write
void *getAddr();
Note that using the extern keyword here is not required, since extern linkage is always implied for non-static function declarations.
In case you want to dynamically link at runtime (using dlopen / LoadLibrary → dlsym / GetProcAddress), you'd define a function pointer variable
void* (*getAddr_fptr)();
You can set it using dlsym with
*(void**)(&getAddr_fptr) = dlsym(…)
This awkward way of writing it is needed because function pointers are allowed to have a different size and alignment than data pointers (see the dlsym manpage for details).
These days int is a 4-byte type on the majority of platforms, while pointers on 64-bit platforms are 8 bytes. On x86_64 an integer or pointer return value comes back in the RAX register, whose low 4 bytes can also be accessed as EAX. Because the compiler believes getAddr returns int, it reads only EAX, which is why only the low 4 bytes survive. Converting that int to a pointer then sign-extends it to 8 bytes; since the top bit of 0xaff662a0 is set, the upper 4 bytes are filled with 1 bits, giving 0xffffffffaff662a0.
From the comments:
Do you have a declaration for getAddr in your main code?
No I don't have but I am trying to implement without a declaration, is it possible?
Then that's your problem. Without a declaration, the compiler falls back to a default declaration of int getAddr(). This is incompatible with the actual definition which returns a void *, and calling a function through an incompatible declaration triggers undefined behavior.
What probably happened is that when the return value of the function was actually returned you only got back the 4 low-order bytes. Assuming your system is little-endian, and int is 4 bytes, and a void * is 8 bytes, this would explain the low bits being the same.
You must include a valid declaration before the function is called. It doesn't necessarily have to reside in a header file, but it has to be visible at the point the call happens.
I'm assuming you're trying to accomplish something like this? For mem.c
#include <stdlib.h>
#include <stdio.h>
void* getAddr() {
char *heap = (char *)malloc(10);
printf("%p\n", (void*)heap);
return heap;
}
And then without including any headers for the mem.c functions, you'd probably create a library out of mem.c as you've already mentioned in the question and have something as follows in main.c
#include <stdio.h>
#include <stdlib.h>
#include <dlfcn.h>
typedef void* (*getAddr)(); //prototype for getAddr() in mem.c
int main() {
void* handle = dlopen("./libmem.so", RTLD_LAZY);
if(handle) {
void* fn = dlsym(handle, "getAddr");
if(fn) {
void* addr = ((getAddr)(fn))();
printf("%p\n", addr);
free(addr);
addr = NULL;
} else {
printf("Failed to dlsym %s\n", dlerror());
}
} else {
printf("Failed to dlopen %s\n", dlerror());
}
}
EDIT: For OP's purpose, as @Zilog80 mentioned, since the library is being linked with the main executable, the dlopen() part can be dropped and main.c can be simplified to
#include <stdio.h>
#include <stdlib.h>
extern void* getAddr(); //prototype for getAddr() in mem.c
int main() {
void* addr = getAddr();
printf("%p\n", addr);
free(addr);
addr = NULL;
}
I used compilation commands similar to OP's, i.e.
gcc -shared -o libmem.so -fpic mem.c
gcc main.c -lmem -L . -o main
and ran the program with
LD_LIBRARY_PATH=. ./main
I was reviewing some code and I came across something similar to this.
File foo.c:
int bar(int param1)
{
return param1*param1;
}
File main.c:
#include <stdio.h>
int bar(int param1, int unusedParam);
int main (void)
{
int param = 2, unused = 0;
printf("%d\n", bar(param, unused));
}
Running gcc main.c foo.c -Wall --pedantic -O0 it compiles, links and works properly without throwing a single warning in the process. Why is that?
Thanks!
This really depends on the calling convention and architecture. For example, with cdecl on x86, where arguments are pushed right to left and the caller restores the stack, the presence of an additional parameter is transparent to the function bar:
push 11
push 10
call _bar
add esp, 8
bar will only "see" the 10, and will function as expected with that parameter, returning 100. The stack is restored afterwards so there is no misalignment in main either; if you had just passed the 10 it would have added 4 to esp instead.
This is also true of the x64 calling conventions for both MSVC on Windows and the System V ABI, where the first few [1] integral arguments are passed in registers; the second argument will be populated in its designated register by the call in main, but not even looked at by bar.
If, however, you tried to use an alternate calling convention where the callee is responsible for cleaning up the stack, you would run into trouble either at the build stage or (worse) at runtime. stdcall, for example, decorates the function name with the number of bytes used by the argument list, so I'm not even able to link the final executable by changing bar to use stdcall instead:
error LNK2019: unresolved external symbol _bar@8 referenced in function _main
This is because bar now has the decorated name _bar@4 in its object file, as it should.
This gets interesting if you use the obsolete calling convention pascal, where parameters are pushed left-to-right:
push 10
push 11
call _bar
Now bar returns 121, not 100, like you expected. That is, if the function successfully returns, which it won't, since the callee was supposed to clean up the stack but failed due to the extra parameter, trashing the return address.
[1]: 4 for MSVC on Windows; 6 on the System V ABI
Normally you'd have this file structure:
foo.c
#include "foo.h"
int bar(int param1)
{
return param1*param1;
}
foo.h
int bar(int param1);
main.c
#include <stdio.h>
#include "foo.h"
int main (void)
{
int param = 2, unused = 0;
printf("%d\n", bar(param, unused));
}
Now you'll get a compilation error as soon as you call bar with mismatched parameters.
I'm trying to build a minimal program in C that calls Rust functions, preferably compiled with #![no_std], in Windows, using GCC 6.1.0 and rustc 1.11.0-nightly (bb4a79b08 2016-06-15) x86_64-pc-windows-gnu. Here's what I tried first:
main.c
#include <stdio.h>
int sum(int, int);
int main()
{
printf("Sum is %d.\n", sum(2, 3));
return 0;
}
sum.rs
#![no_std]
#![feature(libc)]
extern crate libc;
#[no_mangle]
pub extern "C" fn sum(x: libc::c_int, y: libc::c_int) -> libc::c_int
{
x + y
}
Then I tried running:
rustc --crate-type=staticlib --emit=obj sum.rs
But got:
error: language item required, but not found: `panic_fmt`
error: language item required, but not found: `eh_personality`
error: language item required, but not found: `eh_unwind_resume`
error: aborting due to 3 previous errors
OK, so some of those errors are related to panic unwinding. I found out about a Rust compiler setting to remove unwinding support, -C panic=abort. Using that, the errors about eh_personality and eh_unwind_resume disappeared, but Rust still required the panic_fmt function. So I found its signature at the Rust docs, then I added that to the file:
sum.rs
#![no_std]
#![feature(lang_items, libc)]
extern crate libc;
#[lang = "panic_fmt"]
pub fn panic_fmt(_fmt: core::fmt::Arguments, _file_line: &(&'static str, u32)) -> !
{ loop { } }
#[no_mangle]
pub extern "C" fn sum(x: libc::c_int, y: libc::c_int) -> libc::c_int
{
x + y
}
Then, I tried building the whole program again:
rustc --crate-type=staticlib --emit=obj -C panic=abort sum.rs
gcc -c main.c
gcc sum.o main.o -o program.exe
But got:
sum.o:(.text+0x3e): undefined reference to `core::panicking::panic::h907815f47e914305'
collect2.exe: error: ld returned 1 exit status
The panic function reference is probably from an overflow check in the addition at sum(). That's all fine and desirable. According to this page, I need to define my own panic function to work with libcore. But I can't find instructions on how to do so: the function for which I am supposed to provide a definition is called panic_impl in the docs, however the linker is complaining about panic::h907815f47e914305, whatever that's supposed to be.
Using objdump, I was able to find the missing function's name, and hacked that into C:
main.c
#include <stdio.h>
#include <stdlib.h>
int sum(int, int);
void _ZN4core9panicking5panic17h907815f47e914305E()
{
printf("Panic!\n");
abort();
}
int main()
{
printf("Sum is %d.\n", sum(2, 3));
return 0;
}
Now, the whole program compiles and links successfully, and even works correctly.
If I then try using arrays in Rust, another kind of panic function (for bounds checks) is generated, so I need to provide a definition for that too. Whenever I try something more complex in Rust, new errors arise. And, by the way, panic_fmt seems to never be called, even when a panic does happen.
Anyway, this all seems very unreliable, and contradicts all the information I could find via Google on the matter. There's this, but I tried to follow the instructions to no avail.
It seems such a simple and fundamental thing, but I can't get it to work the right way. Perhaps it's a Rust nightly bug? But I need libc and lang_items. How can I generate a Rust object file/static library without unwinding or panic support? It should probably just execute an illegal processor instruction when it wants to panic, or call a panic function I can safely define in C.
You shouldn't use --emit=obj; just rustc --crate-type=staticlib -C panic=abort sum.rs should do the right thing. (This fixes the _ZN4core9panicking5panic17h907815f47e914305E link error.)
To fix another link error, you need to write panic_fmt correctly (note the use of extern):
#[lang="panic_fmt"]
extern fn panic_fmt(_: ::core::fmt::Arguments, _: &'static str, _: u32) -> ! {
loop {}
}
With those changes, everything appears to work the way it's supposed to.
You need panic_fmt so you can decide what to do when a panic happens: if you use #![no_std], rustc assumes there is no standard library/libc/kernel, so it can't just call abort() or expect an illegal instruction to do anything useful. It's something which should be exposed in stable Rust somehow, but I don't know if anyone is working on stabilizing it.
You don't need to use #![feature(libc)] to get libc; you should use the version posted on crates.io instead (or you can declare the functions you need by hand).
So, the solution, from the accepted answer, was:
main.c
#include <stdio.h>
#include <stdlib.h>
int sum(int, int);
void panic(const char* filename_unterminated, int filename_size, int line_num)
{
printf("Panic! At line %d, file ", line_num);
for (int i = 0; i < filename_size; i++)
printf("%c", filename_unterminated[i]);
abort();
}
int main()
{
// Sum as u8 will overflow to test panicking.
printf("Sum is %d.\n", sum(0xff, 3));
return 0;
}
sum.rs
#![no_std]
#![feature(lang_items, libc)]
extern crate libc;
extern "C"
{
fn panic(
filename_unterminated: *const libc::c_char,
filename_size: libc::c_int,
line_num: libc::c_int) -> !;
}
#[lang="panic_fmt"]
extern fn panic_fmt(_: ::core::fmt::Arguments, filename: &'static str, line_num: u32) -> !
{
unsafe { panic(filename.as_ptr() as _, filename.len() as _, line_num as _); }
}
#[no_mangle]
pub extern "C" fn sum(x: libc::c_int, y: libc::c_int) -> libc::c_int
{
// Convert to u8 to test overflow panicking.
((x as u8) + (y as u8)) as _
}
And compiling with:
rustc --crate-type=staticlib -C panic=abort sum.rs
gcc -c main.c
gcc main.o -L . -l sum -o program.exe
Now everything works, and I have a panic handler in C that shows where the error occurred!
Is it possible to compile .c files with dmc and .d files with dmd and then link them together? Will I be able to call D functions from C code, share globals, etc.? Thanks.
Yes, it is possible. In fact this is one of the main features of dmd. To call a D function from C, just make that function extern(C), e.g.
// .d
import std.c.stdio;
extern (C) {
shared int x; // Globals without 'shared' are thread-local in D2.
// You don't need shared in D1.
void increaseX() {
++ x;
printf("Called in D code\n"); // for some reason, writeln crashes on Mac OS X.
}
}
// .c
#include <stdio.h>
extern int x;
void increaseX(void);
int main (void) {
printf("x = %d (should be 0)\n", x);
increaseX();
printf("x = %d (should be 1)\n", x);
return 0;
}
See Interfacing to C for more info.
The above answer is wrong as far as I know,
because the D main routine has to be called before you use any D functions.
This is necessary to initialize the D runtime, e.g. its garbage collector.
To solve that, you can either make a D main routine the program's entry point or call the D main routine from C somehow (I don't know exactly how the latter works).
When I use gdb to debug a program written in C, the disassemble command shows the instructions and their addresses in the code segment. Is it possible to know those memory addresses at runtime? I am using Ubuntu. Thank you.
[edit] To be more specific, I will demonstrate it with following example.
#include <stdio.h>
int main(int argc,char *argv[]){
myfunction();
exit(0);
}
Now I would like to get the address of myfunction() in the code segment when I run my program.
The above answer is vastly overcomplicated. If the function reference is static, as it is above, the address is simply the value of the symbol name in pointer context:
void* myfunction_address = myfunction;
If you are grabbing the function dynamically out of a shared library, then the value returned from dlsym() (POSIX) or GetProcAddress() (Windows) is likewise the address of the function.
Note that the above code is likely to generate a warning with some compilers, as ISO C technically forbids assignment between code and data pointers (some architectures put them in physically distinct address spaces).
And some pedants will point out that the address returned isn't really guaranteed to be the memory address of the function, it's just a unique value that can be compared for equality with other function pointers and acts, when called, to transfer control to the function whose pointer it holds. Obviously all known compilers implement this with a branch target address.
And finally, note that the "address" of a function is a little ambiguous. If the function was loaded dynamically or is an extern reference to an exported symbol, what you really get is generally a pointer to some fixup code in the "PLT" (a Unix/ELF term, though the PE/COFF mechanism on Windows is similar) that then jumps to the function.
If you know the function name before program runs, simply use
void * addr = myfunction;
If the function name is only known at run time, you can look up the symbol address dynamically with the bfd library; I once wrote a function for this. Here is the x86_64 code; in this example you would get the address via find_symbol("a.out", "myfunction").
#include <bfd.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
/* Returns the value of symname in filename, or -1 on failure. */
long find_symbol(char *filename, char *symname)
{
    bfd *ibfd;
    asymbol **symtab;
    long nsize, nsyms, i, result = -1;
    symbol_info syminfo;
    char **matching;
    bfd_init();
    ibfd = bfd_openr(filename, NULL);
    if (ibfd == NULL) {
        printf("bfd_openr error\n");
        return -1;
    }
    if (!bfd_check_format_matches(ibfd, bfd_object, &matching)) {
        printf("format_matches\n");
        bfd_close(ibfd);
        return -1;
    }
    /* Size in bytes needed to hold the symbol table pointers. */
    nsize = bfd_get_symtab_upper_bound(ibfd);
    if (nsize <= 0 || (symtab = malloc(nsize)) == NULL) {
        bfd_close(ibfd);
        return -1;
    }
    nsyms = bfd_canonicalize_symtab(ibfd, symtab);
    for (i = 0; i < nsyms; i++) {
        if (strcmp(symtab[i]->name, symname) == 0) {
            bfd_symbol_info(symtab[i], &syminfo);
            result = (long) syminfo.value;
            break;
        }
    }
    if (result == -1) {
        printf("cannot find symbol\n");
    }
    /* Release resources on every path, not just on failure. */
    free(symtab);
    bfd_close(ibfd);
    return result;
}
To get a backtrace, use execinfo.h as documented in the GNU libc manual.
For example:
#include <execinfo.h>
#include <stdio.h>
#include <unistd.h>
void trace_pom()
{
const int sz = 15;
void *buf[sz];
// get at most sz entries
int n = backtrace(buf, sz);
// output them right to stderr
backtrace_symbols_fd(buf, n, fileno(stderr));
// but if you want to output the strings yourself
// you may use char ** backtrace_symbols (void *const *buffer, int size)
write(fileno(stderr), "\n", 1);
}
void TransferFunds(int n);
void DepositMoney(int n)
{
if (n <= 0)
trace_pom();
else TransferFunds(n-1);
}
void TransferFunds(int n)
{
DepositMoney(n);
}
int main()
{
DepositMoney(3);
return 0;
}
Compiled with:
gcc a.c -o a -g -Wall -Werror -rdynamic
According to the mentioned website:
Currently, the function name and offset can only be obtained on systems that use the ELF
binary format for programs and libraries. On other systems, only the hexadecimal return
address will be present. Also, you may need to pass additional flags to the linker to
make the function names available to the program. (For example, on systems using GNU
ld, you must pass -rdynamic.)
Output
./a(trace_pom+0xc9)[0x80487fd]
./a(DepositMoney+0x11)[0x8048862]
./a(TransferFunds+0x11)[0x8048885]
./a(DepositMoney+0x21)[0x8048872]
./a(TransferFunds+0x11)[0x8048885]
./a(DepositMoney+0x21)[0x8048872]
./a(TransferFunds+0x11)[0x8048885]
./a(DepositMoney+0x21)[0x8048872]
./a(main+0x1d)[0x80488a4]
/lib/i686/cmov/libc.so.6(__libc_start_main+0xe5)[0xb7e16775]
./a[0x80486a1]
Regarding a comment on another answer (getting the address of an instruction): you can use this very ugly trick:
#include <setjmp.h>
void function() {
printf("in function\n");
printf("%d\n",__LINE__);
printf("exiting function\n");
}
int main() {
jmp_buf env;
int i;
printf("in main\n");
printf("%d\n",__LINE__);
printf("calling function\n");
setjmp(env);
for (i=0; i < 18; ++i) {
printf("%p\n",env[i]);
}
function();
printf("in main again\n");
printf("%d\n",__LINE__);
}
It should be env[12] (the eip), but be careful: the jmp_buf layout is machine-dependent, so triple-check this on your platform. This is the output:
in main
13
calling function
0xbfff037f
0x0
0x1f80
0x1dcb
0x4
0x8fe2f50c
0x0
0x0
0xbffff2a8
0xbffff240
0x1f
0x292
0x1e09
0x17
0x8fe0001f
0x1f
0x0
0x37
in function
4
exiting function
in main again
37
have fun!