Is it possible to modify a C program which is running? - c

I was wondering if it is possible to modify a piece of a C program (or other binary) while it is running?
I wrote this small C program :
#include <stdio.h>
#include <stdint.h>

static uint32_t gcui32_val_A = 0xAABBCCDD;

int main(int argc, char *argv[]) {
    uint32_t ui32_val_B = 0;
    uint32_t ui32_cpt = 0;

    printf("\n\n Program SHOW\n\n");
    while (1) {
        /* Print only when A differs from the last value seen. */
        if (gcui32_val_A != ui32_val_B) {
            printf("Value[%u] of A : %x\n", (unsigned)ui32_cpt, (unsigned)gcui32_val_A);
            ui32_val_B = gcui32_val_A;
            ui32_cpt++;
        }
    }
    return 0;
}
With a hex editor I'm able to find "0xAABBCCDD" and modify it when the program is stopped. The modification works when I relaunch the program. Cool!
I would like to do this while the program is running. Is it possible?
Here is a simple example to understand the phenomena and play a little with it but my true project is bigger.
I have an old DOS game called Dangerous Dave.
I'm able to modify the tiles by simply editing the binary (thanks to http://www.shikadi.net/moddingwiki/Dangerous_Dave)
I developed a small editor that does this pretty well and had fun with it.
I launch the DOS game by using DOSBOX, it works !
I would like to do this dynamically when the game is running. Is it possible ?
PS : I work under Debian 64bit
regards

I was wondering if it is possible to modify a piece of C program (or other binary) while it is running ?
Not in standard (and portable) C11. Read the n1570 specification to check. Notice that most of the time in practice, it is not the C source program (made of several translation units) which is running, but an executable result of some compiler & linker.
However, on Linux (e.g. Debian/Sid/x86-64) you could use some of the following tricks (often with function pointers):
use plugins, so design your program to accept them and define conventions about your plugins. A plugin is a shared object ELF file (some *.so) containing position-independent code (so it should be compiled with specific options). You'll use dlopen(3) & dlsym(3) to do the dynamic loading of the plugin.
use some JIT-compiling library, like GCCJIT or LLVM or libjit or asmjit.
alter your virtual address space (not recommended) manually, using mprotect(2) and mmap(2); then you could overwrite something in a code segment (you really should not do that). This might be tricky (e.g. because of ASLR) and brittle.
perhaps use debug related facilities, either with ptrace(2) or by scripting or extending the gdb debugger.
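To make the mprotect(2) & mmap(2) item a bit more concrete, here is a minimal sketch, assuming Linux with glibc. The helper names make_readonly_word and patch_word are hypothetical, invented for this illustration. It patches a word in a read-only data page, not in the real text segment; overwriting actual code uses the same mprotect call but needs much more care.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: map one anonymous page, store `value` at its start,
   then make the whole page read-only. Returns NULL on failure. */
uint32_t *make_readonly_word(uint32_t value)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    void *p = mmap(NULL, page, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    memcpy(p, &value, sizeof value);
    if (mprotect(p, page, PROT_READ) != 0) {
        munmap(p, page);
        return NULL;
    }
    return p;
}

/* Hypothetical helper: temporarily make the page containing `addr`
   writable, overwrite the word, then restore read-only protection.
   Returns 0 on success, -1 on failure. */
int patch_word(uint32_t *addr, uint32_t value)
{
    assert(addr != NULL);
    uintptr_t page = (uintptr_t)sysconf(_SC_PAGESIZE);
    /* mprotect needs a page-aligned start address. */
    void *start = (void *)((uintptr_t)addr & ~(page - 1));
    if (mprotect(start, (size_t)page, PROT_READ | PROT_WRITE) != 0)
        return -1;
    *addr = value;
    return mprotect(start, (size_t)page, PROT_READ);
}
```

A caller would create the word with make_readonly_word(0xAABBCCDD) and later call patch_word on the returned pointer; writing through the pointer without going through patch_word would crash with SIGSEGV, which is the protection that has to be lifted before any in-memory patching.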
I suggest to play a bit with /proc/ (see proc(5)) and try at least to run in some terminal the following commands
cat /proc/self/maps
cat /proc/$$/maps
ls /proc/$$/fd/
(and read enough things to understand their outputs) to understand a bit more what a process "is".
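The same information can be read from inside a C program. The following is only a sketch, assuming Linux and the line layout described in proc(5) (start-end, then a four-character permission field); the function name mapping_perms is hypothetical. It looks up which mapping contains a given address and reports its permissions.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: find the mapping of the current process that
   contains `addr` and copy its permission field (e.g. "r-xp") into
   `perms`. Returns 0 if a mapping was found, -1 otherwise. */
int mapping_perms(const void *addr, char perms[5])
{
    assert(perms != NULL);
    FILE *f = fopen("/proc/self/maps", "r");
    if (f == NULL)
        return -1;

    char line[512];
    int found = -1;
    while (fgets(line, sizeof line, f) != NULL) {
        uintptr_t lo, hi;
        char p[5];
        /* Each line starts with "start-end perms", addresses in hex. */
        if (sscanf(line, "%" SCNxPTR "-%" SCNxPTR " %4s", &lo, &hi, p) == 3
            && (uintptr_t)addr >= lo && (uintptr_t)addr < hi) {
            memcpy(perms, p, sizeof p);
            found = 0;
            break;
        }
    }
    fclose(f);
    return found;
}
```

Calling it with the address of a local variable typically reports a writable mapping without execute permission, while the address of a function typically reports an executable mapping without write permission. That difference is why code cannot simply be overwritten in place.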
So overwriting your text segment (if you really need to do that) is possible, but perhaps more tricky than what you believe !
(do you mind working for several weeks or months simply to improve some old gaming experience?)
Read also about homoiconic programming languages (try Common Lisp with SBCL), about dynamic software updating, about persistence, about application checkpointing, and about operating systems (I recommend: Operating Systems: Three Easy Pieces & OsDev wiki)
I work under Debian 64bit
I suppose you have programming skills and do know C. Then you should read ALP or some newer Linux programming book (and of course look into intro(2) & syscalls(2) & intro(3) and other man pages etc...)
BTW, in your particular case, perhaps the "OS" is DOSBOX (acting as some virtual machine). You might use strace(1) on DOSBOX (or on other commands or processes), or study its source code.
You mention games in your question. If you want to code some, consider libraries like SDL, SFML, Qt, GTK+, ....

Yes, you can modify a piece of code while it is running in C. You have to have a pointer to your program's memory area, and compiled pieces of code that you want to swap in. Naturally this is considered a dangerous practice, with a lot of restrictions and many possibilities for error. However, it was common practice in the old days, when memory was precious.

Related

Why does compiling a C program produce such a long binary?

I have heard that when a compiler compiles code, what it does is create a file that contains instructions that a machine can execute. According to this video, a simple program like int main(){ int i; i = 3; } should, when compiled, produce a file that's only several bytes long. So why does clang compile this into a file that's several kilobytes long?
This is likely due to some #include statements that statically bind libraries with your executable, or a compiler and a linker including debugging information. Of course an executable also contains a lot of OS specific data/information which add up to the size, see this question for more detailed answers. If you're after a small size executable there's plenty of suggestions in the answers to this question.
EDIT: Reading more about it, the size comes down to C being a high-level language in the sense that it does not communicate directly with hardware, but rather talks with an operating system. Basically, main is not the entry point of your program and there's a lot that goes on before it is even called. I strongly recommend reading through this blog post and its follow-up and, foremost, watching Matt Godbolt's insightful talk on the topic. These are all concerned mostly with gcc and GNU/Linux, but I think it's fair to assume that similar reasons apply to executable sizes on other operating systems as well.

Is there a reason even my tiniest .c files always compile to at least 128-kilobyte executables?

I am using Dev-C++, which compiles using GCC, on Windows 8.1, 64-bit.
I noticed that all my .c files always compiled to at least 128-kilobyte .exe files, no matter how small the source is. Even a simple "Hello, world!" was 128kb. Source files with more lines of code increased the size of the executable as I would expect, but all the files started off at at least 128kb, as if that's some sort of minimum size.
I know .exe's don't actually have a minimum size like that; .kkrieger is a full first-person shooter with 3d graphics and sound that all fit inside a single 96kb executable.
Trying to get to the bottom of this, I opened up my hello_world.exe in Notepad++. Perhaps my compiler adds a lengthy header that happens to be 128kb, I thought.
Unfortunately, I don't know enough about executables to be able to make sense of it, though I did find strings like "Address %p has no image-section VirtualQuery failed for %d bytes at address %p" buried among the usual garble of characters in an .exe.
Of course, this isn't a serious problem, but I'd like to know why it's happening.
Why is this 128kb minimum happening? Does it have something to do with my 64-bit OS, or perhaps with a quirk of my compiler?
Short answer: it depends.
Long answer: it depends on what operating system you have and how it handles executables.
Compilers do translate your code into raw x86/ARM/other architecture's machine code, but that is not all that ends up in the file. After they pack your source code into a .o (object) file, they bring the .o and its libraries together and "link" it all in such a way that it forms a standard executable format. These "executable formats" are system-specific file formats that wrap the machine code in headers, tables and other metadata, which the OS loader reads in order to map the program into memory and start the CPU executing it.
For example, I'll talk about the two most commonly used executable formats for Linux devices: ELF and ELF64 (I'll let you figure out what the namesake differences are yourself). ELF stands for Executable and Linkable Format. In every ELF-compiled program, the file starts off with a 4-byte "magic number", which is simply a hexadecimal 0x7F followed by the string "ELF" in ASCII. The next byte is set to either 1 or 2, which signifies that the program is for 32-bit or 64-bit architectures, respectively. And after that, another byte to signify the program's endianness. After that, there's a few more bytes that tell what the architecture is, and so on, until you reach a total of up to 64 bytes for the 64-bit header.
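The first bytes described above can be checked with a few lines of C. This is a minimal sketch; the function name elf_class is hypothetical, and it only inspects the magic number and the class byte, nothing else in the header.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns 1 for a 32-bit ELF file, 2 for a 64-bit
   one, and -1 if the file cannot be read or is not an ELF file. */
int elf_class(const char *path)
{
    assert(path != NULL);
    unsigned char ident[6];
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;
    size_t n = fread(ident, 1, sizeof ident, f);
    fclose(f);

    /* Magic number: 0x7F followed by "ELF". The literal is split so that
       'E' is not parsed as part of the hex escape. */
    if (n != sizeof ident || memcmp(ident, "\x7f" "ELF", 4) != 0)
        return -1;

    /* ident[4] is the class byte (1 = 32-bit, 2 = 64-bit);
       ident[5] is the endianness byte, which is not checked here. */
    if (ident[4] != 1 && ident[4] != 2)
        return -1;
    return ident[4];
}
```

On Linux, elf_class("/proc/self/exe") inspects the running program's own executable, which is a convenient way to try it out.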
However, 64 bytes is not even close to the 128K that you have stated. That's because (aside from the fact that the windows .exe format is usually much more complex), there is the C++ standard library at fault here. For instance, let's have a look at a common use of the C++ iostream library:
#include <iostream>

int main()
{
    std::cout << "Hello, World!" << std::endl;
    return 0;
}
This program may compile to an extremely large executable on a windows system, because the moment you add iostream to your program, it adds the entire C++ standard library into it, increasing your executable's size immensely.
So, how do we rectify this problem? Simple:
Use the C standard library implementation for C++!
#include <cstdio>

int main()
{
    std::printf("Hello, World!\n");
    return 0;
}
Simply using the original C standard library can decrease your size from a couple hundred KBytes to a handful at most. The reason that this happens is simply because GCC/G++ really likes linking programs with the entire standard C++ library for some odd reason.
However, sometimes you absolutely need to use the C++-specific libraries. In that case, many toolchains have command-line options that keep unused code out of the final executable. With GCC and GNU ld, compiling with -ffunction-sections -fdata-sections and linking with -Wl,--gc-sections discards sections that nothing references. (The GCC option -nodefaultlibs is more drastic: it stops the standard libraries from being linked by default at all.) I'm not entirely sure what the equivalent is on Windows, though. Of course, leaving out the default libraries can very quickly break a TON of calls in programs that make a lot of standard C++ calls.
So, in the end, I would worry more about simply re-writing your program to use the regular C functions instead of the new-fangled C++ functions, as amazing as they are. That is, if you're worried about size.

hidden routines linked in c program

Hullo,
When one disassembles a win32 exe program compiled by a C compiler, it shows that some compilers link some 'hidden' routines into it, I think even if the C program is an empty one and has 5 bytes or so. I understand that those 5 bytes are wrapped in the PE .exe format, but why put extra routines there? It seems unnecessary to me and even somewhat annoys me. What are they? Can they be omitted? As I understand it, a C program (not speaking about C++ right now, which I know has some initial routines) should not need such complementary hidden functions.
Many thanks for an answer, maybe even a link to some extended info, because this topic interests me a lot.
//edit
OK, here is some disassembly I did a while back. (Digital Mars and the old Borland command-line compiler, which I also tested, both generate much more code, and I'm especially interested in bcc32, but they do not include readable names/symbols in the disassembly, so I will not post them here.) These are somewhat readable, but I am not experienced in understanding what it is ;-)
https://dl.dropbox.com/u/42887985/prog_devcpp.htm
https://dl.dropbox.com/u/42887985/prog_lcc.htm
https://dl.dropbox.com/u/42887985/prog_mingw.htm
https://dl.dropbox.com/u/42887985/prog_pelles.htm
Any explanatory comments on what is in here? (I am afraid there may be some C++ sh*t in here. I am interested in pure C additions, not C++, but I am too tired now to make sure it was compiled in C mode. The extension of the compiled empty-main program was .c, so I assumed the output would be C, not C++.)
Thanks for longer explanations of what it is.
Since your win32 exe file is a dynamically linked object file, it will contain the necessary data needed by the dynamic linker to do its job, such as names of libraries to link to, and symbols that need resolving.
Even a program with an empty main() will link with the c-runtime and kernel32.dll libraries (and probably others? - a while since I last did Win32 dev).
You should also be aware that main() is only the entry point of your program - quite a bit has already gone on before this point, such as retrieving and tokenizing the command line, setting up the locale, creating stderr, stdin, and stdout, and setting up the other mechanisms required by the C runtime library, such as atexit(). Similarly, when your main() returns, the runtime does some clean-up - and at the very least needs to call the kernel to tell it that you're done.
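One way to see that code runs before main() is a constructor function. This is only a sketch and relies on the GCC/Clang extension `__attribute__((constructor))`, which is not standard C; the names early_init and startup_code_ran are hypothetical. With those compilers, the runtime's startup code calls such functions before it calls main().

```c
/* Flag set by code that runs before main(). */
static int ran_before_main = 0;

/* GCC/Clang extension (not standard C): functions marked as constructors
   are invoked by the startup code before main() is called. */
__attribute__((constructor))
static void early_init(void)
{
    ran_before_main = 1;
}

/* Hypothetical accessor: returns 1 if early_init() has already run. */
int startup_code_ran(void)
{
    return ran_before_main;
}
```

Calling startup_code_ran() as the very first statement of main() already returns 1 when built with GCC or Clang, even though main() never called early_init() itself.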
As to whether it's necessary? Yes, unless you fancy writing your own program prologue and epilogue each time. There probably are ways of writing minimal, statically linked applications if you're sufficiently masochistic.
As for storage overhead, why are you getting so worked up? It's not enough to worry about.
There are several initialization functions that load whenever you run a program on Windows. These functions, among other things, call the main() function that you write - which is why you need either a main() or WinMain() function for your program to run. I'm not aware of other included functions though. Do you have some disassembly to show?
You don't have much detail to go on but I think most of what you're seeing is probably the routines of the specific C runtime library that your compiler works with.
For instance, there will be code that runs from the executable's real entry point (the one recorded in the Portable Executable format) and then calls the main(int argc, char **argv) that you wrote in your C program.

How do i compile a c program without all the bloat?

I'm trying to learn x86. I thought this would be quite easy to start with - i'll just compile a very small program basically containing nothing and see what the compiler gives me. The problem is that it gives me a ton of bloat. (This program cannot be run in dos-mode and so on) 25KB file containing an empty main() calling one empty function.
How do I compile my code without all this bloat? (and why is it there in the first place?)
Executable formats contain a bit more than just the raw machine code for the CPU to execute. If you want only that, then the only option is (I think) a DOS .com file, which essentially is just a bunch of code loaded into a page and then jumped into. Some software (e.g. Volkov Commander) made clever use of that format to deliver quite a lot in very little executable code.
Anyway, the PE format which Windows uses contains a few things that are specially laid out:
A DOS stub saying "This program cannot be run in DOS mode" which is what you stumbled over
several sections containing things like program code, global variables, etc. that are each handled differently by the executable loader in the operating system
some other things, like import tables
You may not need some of those, but a compiler usually doesn't know you're trying to create a tiny executable. Usually nowadays the overhead is negligible.
There is an article out there that strives to create the tiniest possible PE file, though.
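The DOS stub and the PE headers can be located with a few lines of C. This is a minimal sketch that works on a file already loaded into a buffer; the function name pe_signature_offset is hypothetical. It relies on two well-known facts about the layout: the file starts with the bytes "MZ", and the 32-bit little-endian field at offset 0x3C gives the position of the "PE\0\0" signature.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns the offset of the "PE\0\0" signature in
   an executable image held in `buf`, or -1 if it is not a PE file. */
long pe_signature_offset(const unsigned char *buf, size_t len)
{
    assert(buf != NULL);
    /* The DOS header is at least 0x40 bytes and starts with "MZ". */
    if (len < 0x40 || buf[0] != 'M' || buf[1] != 'Z')
        return -1;

    /* Field at 0x3C: little-endian offset of the PE signature. */
    uint32_t off = (uint32_t)buf[0x3C]
                 | (uint32_t)buf[0x3D] << 8
                 | (uint32_t)buf[0x3E] << 16
                 | (uint32_t)buf[0x3F] << 24;

    /* The 4-byte signature must lie entirely inside the buffer. */
    if (off > len || len - off < 4)
        return -1;
    if (memcmp(buf + off, "PE\0\0", 4) != 0)
        return -1;
    return (long)off;
}
```

Everything between the end of the 0x40-byte DOS header and the returned offset is the DOS stub, including the "This program cannot be run in DOS mode" message.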
You might get better result by digging up older compilers. If you want binaries that are very bare to the bone COM files are really that, so if you get hold of an old compiler that has support for generating COM binaries instead of EXE you should be set. There is a long list of free compilers at http://www.thefreecountry.com/compilers/cpp.shtml, I assume that Borland's Turbo C would be a good starting point.
The bloated module could be the loader (operating system required interface) attached by linker. Try adding a module with only something like:
void foo(){}
and see the disassembly (I assume that's the format the compiler 'gives you'). Of course the details vary much from operating systems and compilers. There are so many!

Getting Started in C

I know there are many tutorials out there for getting started in C. However, it's hard for me to apply the knowledge. The way I've always started out in languages is by writing scripts. Of course C is not a scripting language.
My question isn't so much about learning C as it is about how to get started applying C. Great, I can write a temperature converter or a text-based RPG. Maybe it's because in Python I just write up the code in somefile.py and run chmod +x somefile.py && ./somefile.py. I do not really have an equivalent process for C. Every time I read about C, it's a different compiling process with different flags. Can someone just give me some definite direction on the best ways to apply C when you already work with higher-level dynamic scripting languages?
Btw. .. I'm asking about C and not C++.
I usually am on either OpenSuse 11 or Ubuntu 9.04 . "What compiler do i use" is part of the problem. In python there is no choice its just "python somefile.py" same with php or ruby. I didn't know there were choices.
write w.c
#include <stdio.h>

int main(int argc, char *argv[]) {
    int i;
    for (i = 0; i < argc; ++i) {
        printf("Param %d is '%s'\n", i, argv[i]);
    }
    return 0;
}
and compile with
gcc -Wall -o w w.c
run
./w
As rogeriopvl wrote in a comment, the compilation process is really simple. Just write up the code in somefile.c and
gcc -o somefile somefile.c && ./somefile
(if you're using GCC, and if not, your compiler of choice can probably be invoked similarly) Unless/until you start getting into more complicated projects, it's barely any more complicated than a scripting language. (Well... okay, you may need to link some libraries, once you get beyond the basics. But still, not a huge deal.)
In fact, I did write myself a little shell script that allows me to use C as a scripting language. But the process for setting it up is a little more complicated than what you may want to get into at this stage - it's simpler to just run the compiler each time. Still, if you're interested, I can look up the directions (for Linux) and put them here.
C code needs to be compiled before the program can be run. The exact process is different depending on which platform and compiler you are working on.
For the most part, using an IDE (such as Visual Studio, Eclipse, MonoDevelop, and a bunch of others) will do the nasty work for you so that you just have to press a button or click an icon. Download one of these.
I asked myself this question when I was learning C. The problem here, if I can call it a problem, is that C can be used in a broad range of applications and in a broad range of environments, each with its own IDEs, compilers and libraries. Some examples where you can use C for real stuff:
Embedded software. In this case you will probably use some lib.
Network programming (take a look at this book).
Device driver development.
Libraries (both for Linux/Windows and other OSs)
Well this list is endless.
I don't know if this helps you with your question. If you give more details about what you are interested in, that could be helpful.
Good luck
The best advice I can give here is find a topic you're interested in, see if you can make a program to do what you want/assist in doing what you want/adding functionality to the interest of choice, and start coding.
This gives the bonus of doing something you're interested in, and at the same time making something that directly influences it. It should give the motivation to keep steaming onward with the learning process.
I'm working with C a lot at the moment with Linux Kernel modules and am relatively new to C. I've found this rewarding which I think is what's important for this sort of hobby 'temperature converter or a text-based rpg' type programming.
I also struggle finding an application of programming skills. Balance of challenge and reward is important I think.
