Related
I'm trying to use libelf to edit some things in ELF binaries, but so far I'm unable to even write the binary out without corrupting it. This sample:
#include <libelf.h>
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <fcntl.h>
#include <assert.h>
#include <gelf.h>
int main() {
elf_version(EV_CURRENT);
int fd = open("elftest", O_RDWR, 0);
assert(fd >= 0);
Elf *elf = elf_begin(fd, ELF_C_RDWR, NULL);
assert(elf != NULL);
assert(elf_kind(elf) == ELF_K_ELF);
assert(elf_update(elf, ELF_C_WRITE) != -1);
elf_end(elf);
close(fd);
}
which should just read in elftest and write it back out unchanged instead converts a working C hello world into a program that segfaults immediately (according to gdb, even before main is called).
The first discrepancy I noticed with readelf -h was that it moved the start of section headers back somewhat, and also reports that
readelf: Warning: the .dynamic section is not contained within the dynamic segment
What is causing libelf to alter the executable even when nothing is actually changed?
which should just read in elftest and write it back out unchanged instead converts a working C hello world into a program that segfaults immediately
This sure looks like a bug in libelf.
Adding this line immediately after elf_open() fixes your program:
Elf *elf = elf_begin(fd, ELF_C_RDWR, NULL);
elf_flagelf(elf, ELF_C_SET, ELF_F_LAYOUT); // add this
but I don't think it should be necessary.
P.S. I know this is just an example, but putting functional parts of your program inside assert() is a really bad idea. Doing this:
assert(elf_update(elf, ELF_C_WRITE) != -1);
will make you sorry sooner or later.
I am trying to understand direct I/O. To that end I have written this little toy code, which is merely supposed to open a file and write a text string to it:
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <errno.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
int main(int argc, char **argv) {
char thefile[64];
int fd;
char message[64]="jsfreowivanlsaskajght";
sprintf(thefile, "diotestfile.dat");
if ((fd = open(thefile,O_DIRECT | O_RDWR | O_CREAT, S_IRWXU)) == -1) {
printf("error opening file\n");
exit(1);
}
write(fd, message, 64);
close(fd);
}
My compile command for Cray and GNU is
cc -D'_GNU_SOURCE' diotest.c
and for Intel it is
cc -D'_GNU_SOURCE' -xAVX diotest.c
Under all three compilers, the file diotestfile.dat is created with correct permissions, but no data is ever written to it. When the executable finishes, the output file is blank. The O_DIRECT is the culprit (or, more precisely I guess, my mishandling of O_DIRECT). If I take it out, the code works just fine. I have seen this same problem in a much more complex code that I am trying to work with. What is it that I need to do differently?
Going on Ian Abbot's comment, I discovered that the problem can be solved by adding an alignment attribute to the "message" array:
#define BLOCK_SIZE 4096
int bytes_to_write, block_size=BLOCK_SIZE;
bytes_to_write = ((MSG_SIZE + block_size - 1)/block_size)*block_size;
char message[bytes_to_write] __attribute__ ((aligned(BLOCK_SIZE)));
(System I/O block size is 4096.)
So that solved it. Still can't claim to understand everything that is happening. Feel free to enlighten me if you want. Thanks to everyone for the comments.
Well, you need to rethink the question, because your program runs perfectly on my system, and I cannot guess from it's listing where the error can be.
Have you tested it before posting?
if the program doesn't write to the file, probably a good idea is to see about the return code of write(2). Have you done this? I cannot check because on my system (intel 64bit/FreeBSD) the program runs as you expected.
Your program runs, giving no output and a file named diotestfile.dat appeared in the . directory with contents jsfreowivanlsaskajght.
lcu#europa:~$ ll diotestfile.dat
-rwx------ 1 lcu lcu 64 1 feb. 18:14 diotestfile.dat*
lcu#europa:~$ cat diotestfile.dat
jsfreowivanlsaskajghtlcu#europa:~$ _
Is there a platform-agnostic and filesystem-agnostic method to obtain the full path of the directory from where a program is running using C/C++? Not to be confused with the current working directory. (Please don't suggest libraries unless they're standard ones like clib or STL.)
(If there's no platform/filesystem-agnostic method, suggestions that work in Windows and Linux for specific filesystems are welcome too.)
Here's code to get the full path to the executing app:
Variable declarations:
char pBuf[256];
size_t len = sizeof(pBuf);
Windows:
int bytes = GetModuleFileName(NULL, pBuf, len);
return bytes ? bytes : -1;
Linux:
int bytes = MIN(readlink("/proc/self/exe", pBuf, len), len - 1);
if(bytes >= 0)
pBuf[bytes] = '\0';
return bytes;
If you fetch the current directory when your program first starts, then you effectively have the directory your program was started from. Store the value in a variable and refer to it later in your program. This is distinct from the directory that holds the current executable program file. It isn't necessarily the same directory; if someone runs the program from a command prompt, then the program is being run from the command prompt's current working directory even though the program file lives elsewhere.
getcwd is a POSIX function and supported out of the box by all POSIX compliant platforms. You would not have to do anything special (apart from incliding the right headers unistd.h on Unix and direct.h on windows).
Since you are creating a C program it will link with the default c run time library which is linked to by ALL processes in the system (specially crafted exceptions avoided) and it will include this function by default. The CRT is never considered an external library because that provides the basic standard compliant interface to the OS.
On windows getcwd function has been deprecated in favour of _getcwd. I think you could use it in this fashion.
#include <stdio.h> /* defines FILENAME_MAX */
#ifdef WINDOWS
#include <direct.h>
#define GetCurrentDir _getcwd
#else
#include <unistd.h>
#define GetCurrentDir getcwd
#endif
char cCurrentPath[FILENAME_MAX];
if (!GetCurrentDir(cCurrentPath, sizeof(cCurrentPath)))
{
return errno;
}
cCurrentPath[sizeof(cCurrentPath) - 1] = '\0'; /* not really required */
printf ("The current working directory is %s", cCurrentPath);
This is from the cplusplus forum
On windows:
#include <string>
#include <windows.h>
std::string getexepath()
{
char result[ MAX_PATH ];
return std::string( result, GetModuleFileName( NULL, result, MAX_PATH ) );
}
On Linux:
#include <string>
#include <limits.h>
#include <unistd.h>
std::string getexepath()
{
char result[ PATH_MAX ];
ssize_t count = readlink( "/proc/self/exe", result, PATH_MAX );
return std::string( result, (count > 0) ? count : 0 );
}
On HP-UX:
#include <string>
#include <limits.h>
#define _PSTAT64
#include <sys/pstat.h>
#include <sys/types.h>
#include <unistd.h>
std::string getexepath()
{
char result[ PATH_MAX ];
struct pst_status ps;
if (pstat_getproc( &ps, sizeof( ps ), 0, getpid() ) < 0)
return std::string();
if (pstat_getpathname( result, PATH_MAX, &ps.pst_fid_text ) < 0)
return std::string();
return std::string( result );
}
If you want a standard way without libraries: No. The whole concept of a directory is not included in the standard.
If you agree that some (portable) dependency on a near-standard lib is okay: Use Boost's filesystem library and ask for the initial_path().
IMHO that's as close as you can get, with good karma (Boost is a well-established high quality set of libraries)
I know it is very late at the day to throw an answer at this one but I found that none of the answers were as useful to me as my own solution. A very simple way to get the path from your CWD to your bin folder is like this:
int main(int argc, char* argv[])
{
std::string argv_str(argv[0]);
std::string base = argv_str.substr(0, argv_str.find_last_of("/"));
}
You can now just use this as a base for your relative path. So for example I have this directory structure:
main
----> test
----> src
----> bin
and I want to compile my source code to bin and write a log to test I can just add this line to my code.
std::string pathToWrite = base + "/../test/test.log";
I have tried this approach on Linux using full path, alias etc. and it works just fine.
NOTE:
If you are on windows you should use a '\' as the file separator not '/'. You will have to escape this too for example:
std::string base = argv[0].substr(0, argv[0].find_last_of("\\"));
I think this should work but haven't tested, so comment would be appreciated if it works or a fix if not.
Filesystem TS is now a standard ( and supported by gcc 5.3+ and clang 3.9+ ), so you can use current_path() function from it:
std::string path = std::experimental::filesystem::current_path();
In gcc (5.3+) to include Filesystem you need to use:
#include <experimental/filesystem>
and link your code with -lstdc++fs flag.
If you want to use Filesystem with Microsoft Visual Studio, then read this.
No, there's no standard way. I believe that the C/C++ standards don't even consider the existence of directories (or other file system organizations).
On Windows the GetModuleFileName() will return the full path to the executable file of the current process when the hModule parameter is set to NULL. I can't help with Linux.
Also you should clarify whether you want the current directory or the directory that the program image/executable resides. As it stands your question is a little ambiguous on this point.
On Windows the simplest way is to use the _get_pgmptr function in stdlib.h to get a pointer to a string which represents the absolute path to the executable, including the executables name.
char* path;
_get_pgmptr(&path);
printf(path); // Example output: C:/Projects/Hello/World.exe
Maybe concatenate the current working directory with argv[0]? I'm not sure if that would work in Windows but it works in linux.
For example:
#include <stdio.h>
#include <unistd.h>
#include <string.h>
int main(int argc, char **argv) {
char the_path[256];
getcwd(the_path, 255);
strcat(the_path, "/");
strcat(the_path, argv[0]);
printf("%s\n", the_path);
return 0;
}
When run, it outputs:
jeremy#jeremy-desktop:~/Desktop$ ./test
/home/jeremy/Desktop/./test
For Win32 GetCurrentDirectory should do the trick.
You can not use argv[0] for that purpose, usually it does contain full path to the executable, but not nessesarily - process could be created with arbitrary value in the field.
Also mind you, the current directory and the directory with the executable are two different things, so getcwd() won't help you either.
On Windows use GetModuleFileName(), on Linux read /dev/proc/procID/.. files.
Just my two cents, but doesn't the following code portably work in C++17?
#include <iostream>
#include <filesystem>
namespace fs = std::filesystem;
int main(int argc, char* argv[])
{
std::cout << "Path is " << fs::path(argv[0]).parent_path() << '\n';
}
Seems to work for me on Linux at least.
Based on the previous idea, I now have:
std::filesystem::path prepend_exe_path(const std::string& filename, const std::string& exe_path = "");
With implementation:
fs::path prepend_exe_path(const std::string& filename, const std::string& exe_path)
{
static auto exe_parent_path = fs::path(exe_path).parent_path();
return exe_parent_path / filename;
}
And initialization trick in main():
(void) prepend_exe_path("", argv[0]);
Thanks #Sam Redway for the argv[0] idea. And of course, I understand that C++17 was not around for many years when the OP asked the question.
Just to belatedly pile on here,...
there is no standard solution, because the languages are agnostic of underlying file systems, so as others have said, the concept of a directory based file system is outside the scope of the c / c++ languages.
on top of that, you want not the current working directory, but the directory the program is running in, which must take into account how the program got to where it is - ie was it spawned as a new process via a fork, etc. To get the directory a program is running in, as the solutions have demonstrated, requires that you get that information from the process control structures of the operating system in question, which is the only authority on this question. Thus, by definition, its an OS specific solution.
#include <windows.h>
using namespace std;
// The directory path returned by native GetCurrentDirectory() no end backslash
string getCurrentDirectoryOnWindows()
{
const unsigned long maxDir = 260;
char currentDir[maxDir];
GetCurrentDirectory(maxDir, currentDir);
return string(currentDir);
}
For Windows system at console you can use system(dir) command. And console gives you information about directory and etc. Read about the dir command at cmd. But for Unix-like systems, I don't know... If this command is run, read bash command. ls does not display directory...
Example:
int main()
{
system("dir");
system("pause"); //this wait for Enter-key-press;
return 0;
}
Works with starting from C++11, using experimental filesystem, and C++14-C++17 as well using official filesystem.
application.h:
#pragma once
//
// https://en.cppreference.com/w/User:D41D8CD98F/feature_testing_macros
//
#ifdef __cpp_lib_filesystem
#include <filesystem>
#else
#include <experimental/filesystem>
namespace std {
namespace filesystem = experimental::filesystem;
}
#endif
std::filesystem::path getexepath();
application.cpp:
#include "application.h"
#ifdef _WIN32
#include <windows.h> //GetModuleFileNameW
#else
#include <limits.h>
#include <unistd.h> //readlink
#endif
std::filesystem::path getexepath()
{
#ifdef _WIN32
wchar_t path[MAX_PATH] = { 0 };
GetModuleFileNameW(NULL, path, MAX_PATH);
return path;
#else
char result[PATH_MAX];
ssize_t count = readlink("/proc/self/exe", result, PATH_MAX);
return std::string(result, (count > 0) ? count : 0);
#endif
}
For relative paths, here's what I did. I am aware of the age of this question, I simply want to contribute a simpler answer that works in the majority of cases:
Say you have a path like this:
"path/to/file/folder"
For some reason, Linux-built executables made in eclipse work fine with this. However, windows gets very confused if given a path like this to work with!
As stated above there are several ways to get the current path to the executable, but the easiest way I find works a charm in the majority of cases is appending this to the FRONT of your path:
"./path/to/file/folder"
Just adding "./" should get you sorted! :) Then you can start loading from whatever directory you wish, so long as it is with the executable itself.
EDIT: This won't work if you try to launch the executable from code::blocks if that's the development environment being used, as for some reason, code::blocks doesn't load stuff right... :D
EDIT2: Some new things I have found is that if you specify a static path like this one in your code (Assuming Example.data is something you need to load):
"resources/Example.data"
If you then launch your app from the actual directory (or in Windows, you make a shortcut, and set the working dir to your app dir) then it will work like that.
Keep this in mind when debugging issues related to missing resource/file paths. (Especially in IDEs that set the wrong working dir when launching a build exe from the IDE)
A library solution (although I know this was not asked for).
If you happen to use Qt:
QCoreApplication::applicationDirPath()
Path to the current .exe
#include <Windows.h>
std::wstring getexepathW()
{
wchar_t result[MAX_PATH];
return std::wstring(result, GetModuleFileNameW(NULL, result, MAX_PATH));
}
std::wcout << getexepathW() << std::endl;
// -------- OR --------
std::string getexepathA()
{
char result[MAX_PATH];
return std::string(result, GetModuleFileNameA(NULL, result, MAX_PATH));
}
std::cout << getexepathA() << std::endl;
On POSIX platforms, you can use getcwd().
On Windows, you may use _getcwd(), as use of getcwd() has been deprecated.
For standard libraries, if Boost were standard enough for you, I would have suggested Boost::filesystem, but they seem to have removed path normalization from the proposal. You may have to wait until TR2 becomes readily available for a fully standard solution.
Boost Filesystem's initial_path() behaves like POSIX's getcwd(), and neither does what you want by itself, but appending argv[0] to either of them should do it.
You may note that the result is not always pretty--you may get things like /foo/bar/../../baz/a.out or /foo/bar//baz/a.out, but I believe that it always results in a valid path which names the executable (note that consecutive slashes in a path are collapsed to one).
I previously wrote a solution using envp (the third argument to main() which worked on Linux but didn't seem workable on Windows, so I'm essentially recommending the same solution as someone else did previously, but with the additional explanation of why it is actually correct even if the results are not pretty.
As Minok mentioned, there is no such functionality specified ini C standard or C++ standard. This is considered to be purely OS-specific feature and it is specified in POSIX standard, for example.
Thorsten79 has given good suggestion, it is Boost.Filesystem library. However, it may be inconvenient in case you don't want to have any link-time dependencies in binary form for your program.
A good alternative I would recommend is collection of 100% headers-only STLSoft C++ Libraries Matthew Wilson (author of must-read books about C++). There is portable facade PlatformSTL gives access to system-specific API: WinSTL for Windows and UnixSTL on Unix, so it is portable solution. All the system-specific elements are specified with use of traits and policies, so it is extensible framework. There is filesystem library provided, of course.
The linux bash command
which progname will report a path to program.
Even if one could issue the which command from within your program and direct the output to a tmp file and the program
subsequently reads that tmp file, it will not tell you if that program is the one executing. It only tells you where a program having that name is located.
What is required is to obtain your process id number, and to parse out the path to the name
In my program I want to know if the program was
executed from the user's bin directory or from another in the path
or from /usr/bin. /usr/bin would contain the supported version.
My feeling is that in Linux there is the one solution that is portable.
Use realpath() in stdlib.h like this:
char *working_dir_path = realpath(".", NULL);
The following worked well for me on macOS 10.15.7
brew install boost
main.cpp
#include <iostream>
#include <boost/filesystem.hpp>
int main(int argc, char* argv[]){
boost::filesystem::path p{argv[0]};
p = absolute(p).parent_path();
std::cout << p << std::endl;
return 0;
}
Compiling
g++ -Wall -std=c++11 -l boost_filesystem main.cpp
This question was asked 15 years ago, so the existing answers are now incorrect. If you're using C++17 or greater, the solution is very straightforward today:
#include <filesystem>
std::cout << std::filesystem::current_path();
See cppreference.com for more information.
I have to check Linux system information. I can execute system commands in C, but doing so I create a new process for every one, which is pretty expensive. I was wondering if there is a way to obtain system information without being forced to execute a shell command. I've been looking around for a while and I found nothing. Actually, I'm not even sure if it's more convenient to execute commands via Bash calling them from my C program or find a way to accomplish the tasks using only C.
Linux exposes a lot of information under /proc. You can read the data from there. For example, fopen the file at /proc/cpuinfo and read its contents.
A presumably less known (and more complicated) way to do that, is that you can also use the api interface to sysctl. To use it under Linux, you need to #include <unistd.h>, #include <linux/sysctl.h>. A code example of that is available in the man page:
#define _GNU_SOURCE
#include <unistd.h>
#include <sys/syscall.h>
#include <string.h>
#include <stdio.h>
#include <stdlib.h>
#include <linux/sysctl.h>
int _sysctl(struct __sysctl_args *args );
#define OSNAMESZ 100
int
main(void)
{
struct __sysctl_args args;
char osname[OSNAMESZ];
size_t osnamelth;
int name[] = { CTL_KERN, KERN_OSTYPE };
memset(&args, 0, sizeof(struct __sysctl_args));
args.name = name;
args.nlen = sizeof(name)/sizeof(name[0]);
args.oldval = osname;
args.oldlenp = &osnamelth;
osnamelth = sizeof(osname);
if (syscall(SYS__sysctl, &args) == -1) {
perror("_sysctl");
exit(EXIT_FAILURE);
}
printf("This machine is running %*s\n", osnamelth, osname);
exit(EXIT_SUCCESS);
}
However, the man page linked also notes:
Glibc does not provide a wrapper for this system call; call it using
syscall(2). Or rather... don't call it: use of this system call has
long been discouraged, and it is so unloved that it is likely to
disappear in a future kernel version. Since Linux 2.6.24, uses of this
system call result in warnings in the kernel log. Remove it from your
programs now; use the /proc/sys interface instead.
This system call is available only if the kernel was configured with
the CONFIG_SYSCTL_SYSCALL option.
Please keep in mind that anything you can do with sysctl(), you can also just read() from /proc/sys. Also note that I do understand that the usefulness of that syscall is questionable, I just put it here for reference.
You can also use the sys/utsname.h header file to get the kernel version, hostname, operating system, machine hardware name, etc. More about sys/utsname.h is here. This is an example of getting the current kernel release.
#include <stdio.h> // I/O
#include <sys/utsname.h>
int main(int argc, char const *argv[])
{
struct utsname buff;
printf("Kernel Release = %s\n", buff.release); // kernel release
return 0;
}
This is the same as using the uname command. You can also use the -a option which stands for all information.
uname -r # -r stands for kernel release
I'm trying to figure out how to execute machine code stored in memory.
I have the following code:
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char* argv[])
{
FILE* f = fopen(argv[1], "rb");
fseek(f, 0, SEEK_END);
unsigned int len = ftell(f);
fseek(f, 0, SEEK_SET);
char* bin = (char*)malloc(len);
fread(bin, 1, len, f);
fclose(f);
return ((int (*)(int, char *)) bin)(argc-1, argv[1]);
}
The code above compiles fine in GCC, but when I try and execute the program from the command line like this:
./my_prog /bin/echo hello
The program segfaults. I've figured out the problem is on the last line, as commenting it out stops the segfault.
I don't think I'm doing it quite right, as I'm still getting my head around function pointers.
Is the problem a faulty cast, or something else?
You need a page with write execute permissions. See mmap(2) and mprotect(2) if you are under unix. You shouldn't do it using malloc.
Also, read what the others said, you can only run raw machine code using your loader. If you try to run an ELF header it will probably segfault all the same.
Regarding the content of replies and downmods:
1- OP said he was trying to run machine code, so I replied on that rather than executing an executable file.
2- See why you don't mix malloc and mman functions:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <sys/mman.h>
int main()
{
char *a=malloc(10);
char *b=malloc(10);
char *c=malloc(10);
memset (a,'a',4095);
memset (b,'b',4095);
memset (c,'c',4095);
puts (a);
memset (c,0xc3,10); /* return */
/* c is not alligned to page boundary so this is NOOP.
Many implementations include a header to malloc'ed data so it's always NOOP. */
mprotect(c,10,PROT_READ|PROT_EXEC);
b[0]='H'; /* oops it is still writeable. If you provided an alligned
address it would segfault */
char *d=mmap(0,4096,PROT_READ|PROT_WRITE|PROT_EXEC,MAP_PRIVATE|MAP_ANON,-1,0);
memset (d,0xc3,4096);
((void(*)(void))d)();
((void(*)(void))c)(); /* oops it isn't executable */
return 0;
}
It displays exactly this behavior on Linux x86_64 other ugly behavior sure to arise on other implementations.
Using malloc works fine.
OK this is my final answer, please note I used the orignal poster's code.
I'm loading from disk, the compiled version of this code to a heap allocated area "bin", just as the orignal code did (the name is fixed not using argv, and the value 0x674 is from;
objdump -F -D foo|grep -i hoho
08048674 <hohoho> (File Offset: 0x674):
This can be looked up at run time with the BFD (Binary File Descriptor library) or something else, you can call other binaries (not just yourself) so long as they are statically linked to the same set of lib's.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
unsigned char *charp;
unsigned char *bin;
void hohoho()
{
printf("merry mas\n");
fflush(stdout);
}
int main(int argc, char **argv)
{
int what;
charp = malloc(10101);
memset(charp, 0xc3, 10101);
mprotect(charp, 10101, PROT_EXEC | PROT_READ | PROT_WRITE);
__asm__("leal charp, %eax");
__asm__("call (%eax)" );
printf("am I alive?\n");
char *more = strdup("more heap operations");
printf("%s\n", more);
FILE* f = fopen("foo", "rb");
fseek(f, 0, SEEK_END);
unsigned int len = ftell(f);
fseek(f, 0, SEEK_SET);
bin = (char*)malloc(len);
printf("read in %d\n", fread(bin, 1, len, f));
printf("%p\n", bin);
fclose(f);
mprotect(&bin, 10101, PROT_EXEC | PROT_READ | PROT_WRITE);
asm volatile ("movl %0, %%eax"::"g"(bin));
__asm__("addl $0x674, %eax");
__asm__("call %eax" );
fflush(stdout);
return 0;
}
running...
co tmp # ./foo
am I alive?
more heap operations
read in 30180
0x804d910
merry mas
You can use UPX to manage the load/modify/exec of a file.
P.S. sorry for the previous broken link :|
It seems to me you're loading an ELF image and then trying to jump straight into the ELF header? http://en.wikipedia.org/wiki/Executable_and_Linkable_Format
If you're trying to execute another binary, why don't you use the process creation functions for whichever platform you're using?
An typical executable file has:
a header
entry code that is called before main(int, char **)
The first means that you can't generally expect byte 0 of the file to be executable; intead, the information in the header describes how to load the rest of the file in memory and where to start executing it.
The second means that when you have found the entry point, you can't expect to treat it like a C function taking arguments (int, char **). It may, perhaps, be usable as a function taking no paramters (and hence requiring nothing to be pushed prior to calling it). But you do need to populate the environment that will in turn be used by the entry code to construct the command line strings passed to main.
Doing this by hand under a given OS would go into some depth which is beyond me; but I'm sure there is a much nicer way of doing what you're trying to do. Are you trying to execute an external file as a on-off operation, or load an external binary and treat its functions as part of your program? Both are catered for by the C libraries in Unix.
It is more likely that that it is the code that is jumped to by the call through function-pointer that is causing the segfault rather than the call itself. There is no way from the code you have posted to determine that that code loaded into bin is valid. Your best bet is to use a debugger, switch to assembler view, break on the return statement and step into the function call to determine that the code you expect to run is indeed running, and that it is valid.
Note also that in order to run at all the code will need to be position independent and fully resolved.
Moreover if your processor/OS enables data execution prevention, then the attempt is probably doomed. It is at best ill-advised in any case, loading code is what the OS is for.
What you are trying to do is something akin to what interpreters do. Except that an interpreter reads a program written in an interpreted language like Python, compiles that code on the fly, puts executable code in memory and then executes it.
You may want to read more about just-in-time compilation too:
Just in time compilation
Java HotSpot JIT runtime
There are libraries available for JIT code generation such as the GNU lightning and libJIT, if you are interested. You'd have to do a lot more than just reading from file and trying to execute code, though. An example usage scenario will be:
Read a program written in a scripting-language (maybe
your own).
Parse and compile the source into an
intermediate language understood by
the JIT library.
Use the JIT library to generate code
for this intermediate
representation, for your target platform's CPU.
Execute the JIT generated code.
And for executing the code you'd have to use techniques such as using mmap() to map the executable code into the process's address space, marking that page executable and jumping to that piece of memory. It's more complicated than this, but its a good start in order to understand what's going on beneath all those interpreters of scripting languages such as Python, Ruby etc.
The online version of the book "Linkers and Loaders" will give you more information about object file formats, what goes on behind the scenes when you execute a program, the roles of the linkers and loaders and so on. It's a very good read.
You can dlopen() a file, look up the symbol "main" and call it with 0, 1, 2 or 3 arguments (all of type char*) via a cast to pointer-to-function-returning-int-taking-0,1,2,or3-char*
Use the operating system for loading and executing programs.
On unix, the exec calls can do this.
Your snippet in the question could be rewritten:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
int main(int argc, char* argv[])
{
return execv(argv[1],argv+2);
}
Executable files contain much more than just code. Header, code, data, more data, this stuff is separated and loaded into different areas of memory by the OS and its libraries. You can't load a program file into a single chunk of memory and expect to jump to it's first byte.
If you are trying to execute your own arbitrary code, you need to look into dynamic libraries because that is exactly what they're for.