How to limit the C instruction set from gcc [duplicate]

This question already has answers here:
How to create a lightweight C code sandbox?
(13 answers)
Closed 9 years ago.
I'm developing a platform similar to hackerrank.com where someone can submit C code, and then that code will be compiled, and run on my server, but I want to limit the C instruction set that a person will be able to execute on my server.
For example: limit the instruction set to I/O only.
My first approach was to parse the code and look for malicious code, but that is pretty naive because it can easily be circumvented (shellcode, obfuscation, etc.).
My second approach (the one I think could work) is to remove all the "unnecessary" headers and leave only stdio.h, math.h, stdlib.h, and a few others.
But then I thought it might be possible to restrict the C instruction set from gcc itself. After reading the man entry for gcc I couldn't find anything close to what I need, so I wonder whether that's even possible.
If that's not possible, what could be a safe way to solve this problem? Other than getting rid of unnecessary libraries.
Thanks!

You could limit system calls using systrace, which is available on OpenBSD. Linux has an equivalent in seccomp, and other operating systems have similar mechanisms. This would allow you to restrict the submitted program to file I/O syscalls and block things like sockets and forking.
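As a rough illustration of the syscall-limiting idea on Linux, here is a minimal sketch using seccomp in filter mode. It is not systrace and not a complete sandbox: the function names (install_io_only_filter, run_restricted, submission_prints, submission_opens_socket) are made up for this example, the allow-list is deliberately tiny, and a real setup would also check the architecture field, add resource limits, and run the code as an unprivileged user.

```c
#include <assert.h>
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <signal.h>
#include <stddef.h>
#include <sys/prctl.h>
#include <sys/socket.h>
#include <sys/syscall.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: allow only read, write, exit and exit_group.
 * Any other syscall kills the caller with SIGSYS.
 * Simplification: a production filter should also check seccomp_data.arch. */
static int install_io_only_filter(void)
{
    struct sock_filter filter[] = {
        /* Load the syscall number. */
        BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct seccomp_data, nr)),
        /* On a match, jump forward to the ALLOW instruction at the end. */
        BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, SYS_read, 4, 0),
        BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, SYS_write, 3, 0),
        BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, SYS_exit, 2, 0),
        BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, SYS_exit_group, 1, 0),
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_KILL),
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
    };
    struct sock_fprog prog = {
        .len = (unsigned short)(sizeof filter / sizeof filter[0]),
        .filter = filter,
    };

    /* Needed so an unprivileged process may install a filter. */
    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0)
        return -1;
    return prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog);
}

/* Run `submission` in a child process under the filter and return the
 * wait status of that child (or -1 if fork/waitpid failed). */
int run_restricted(void (*submission)(void))
{
    assert(submission != NULL);

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        if (install_io_only_filter() != 0)
            _exit(2);            /* filter could not be installed */
        submission();
        _exit(0);
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return status;
}

/* Two stand-ins for submitted code. */
void submission_prints(void)
{
    ssize_t n = write(STDOUT_FILENO, "ok\n", 3);   /* allowed */
    (void)n;
}

void submission_opens_socket(void)
{
    (void)socket(AF_INET, SOCK_STREAM, 0);         /* not on the allow-list */
}
```

With this sketch, run_restricted(submission_prints) should report a normal exit with status 0, while run_restricted(submission_opens_socket) should report that the child was terminated by SIGSYS. Real submissions compiled against the C library need a larger allow-list, since start-up and stdio use more syscalls than the four listed here.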

Related

Adding a system call with a kernel module(LKM) [duplicate]

This question already has an answer here:
Implementing Linux System Call using LKM
(1 answer)
Closed 6 years ago.
So I have seen a bunch of questions about adding system calls but I can't find any examples of one using an LKM that works. I have found resources like this: http://tldp.org/LDP/lkmpg/2.6/html/
This works, in theory, but doesn't compile. Can anyone point me towards a simple example for adding a hello world system call or something? Something like this: https://tssurya.wordpress.com/2014/08/19/adding-a-hello-world-system-call-to-linux-kernel-3-16-0/ but that doesn't require me to recompile my kernel?
Generally, it's strongly recommended to not implement a whole new system call.
Rather, only implement a new ioctl and likely some new block or character devices.
For how to do that, it looks like there is another question/answer already: How do I use ioctl() to manipulate my kernel module?
I don't think you can do that with a module. The definitions of the syscall go into two places which cannot really be changed at runtime (as far as I know): syscall table (which assigns numbers per architecture) and syscalls include file (installed with kernel itself, not modules). (Or at least not without messing with code rewriting at runtime.)
You'll always need to recompile the kernel in that case. But if you want to have a quick update/try cycle, you could implement a syscall that's just a stub, passing a message to the right module if it's loaded. It would allow you to change the implementation, but not the signature.
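The stub idea can be modelled in plain user-space C. The following is only a sketch of the dispatch pattern, not kernel code: the names (stub_handler_t, stub_register, stub_unregister, stub_call, hello_v1, hello_v2) are invented for illustration, and a real kernel implementation would additionally need locking or RCU around the handler pointer and module reference counting.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* The fixed signature of the stub. This is the part that cannot change
 * without rebuilding the kernel. Hypothetical, chosen for illustration. */
typedef long (*stub_handler_t)(const char *msg);

static stub_handler_t current_handler = NULL;

/* What a module would call when it is loaded. */
int stub_register(stub_handler_t handler)
{
    assert(handler != NULL);
    if (current_handler != NULL)
        return -EBUSY;           /* another implementation is active */
    current_handler = handler;
    return 0;
}

/* What a module would call when it is unloaded. */
void stub_unregister(void)
{
    current_handler = NULL;
}

/* The stub itself: forwards to the module if one is loaded. */
long stub_call(const char *msg)
{
    if (current_handler == NULL)
        return -ENOSYS;          /* no implementation loaded */
    return current_handler(msg);
}

/* Two interchangeable implementations with the same signature. */
long hello_v1(const char *msg) { return msg != NULL ? 1 : -EINVAL; }
long hello_v2(const char *msg) { return msg != NULL ? 2 : -EINVAL; }
```

Swapping hello_v1 for hello_v2 through stub_unregister and stub_register changes the behaviour of stub_call without touching the stub, which mirrors the "change the implementation, but not the signature" trade-off described above.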

display filename, line number and function in C without modifying the source code [duplicate]

This question already has answers here:
how to trace function call in C?
(10 answers)
Closed 8 years ago.
I am new in a company, working with C source code which almost lacks any kind of tracing mechanism.
I would like to know whether or not the application passes through a certain file and where (which function).
I could do this using breakpoints, but the concerned file contains a huge lot of functions.
Therefore I'm looking for some kind of tool, that I can attach to the application, and that gives an output of following kind:
-- Main.c (main_function())
---- submain.c (submain_function())
...
From that, I then could deduce where (which filename, which function) the application is passing.
Does anybody know whether or not such a tool exists?
Thanks
If you're on Linux, gdb might come in handy.
You can compile the code using -g or -g3 option with gcc, then run the binary using gdb ./<executable_name>, set a breakpoint on desired function in any of the source files and check the call.
While stepping through the application, it will show the filename and line number of the executing instruction.
Note: Please check this and this for a detailed understanding.
I assume you develop on Linux. Then you could also customize the GCC compiler, in particular using MELT (a Lisp-like domain-specific language to extend GCC), to have the compiler add some logging at many places. For that you'll need to insert a new GCC "optimization" pass doing such a job, and most importantly, you'll need to understand some details about GCC's internal representations (Gimple, Trees, basic blocks, ...).
However, that would probably require more than a week of work on your part. Unless your code base is really big (at least half a million lines), it might not be worth the effort.
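A lighter-weight way to have the compiler add logging, different from the MELT approach above, is GCC's -finstrument-functions option, which makes every instrumented function call two hooks on entry and exit. The sketch below is a minimal version of those hooks; the file name trace.c and the trace_depth counter are assumptions made for this example, and the counter is not thread-safe.

```c
/* trace.c (hypothetical file name): hooks called by code that was
 * compiled with -finstrument-functions. */
#include <stdio.h>

static int trace_depth = 0;   /* current call nesting, used for indentation */

/* The hooks themselves must not be instrumented, or they would recurse. */
__attribute__((no_instrument_function))
void __cyg_profile_func_enter(void *this_fn, void *call_site)
{
    (void)call_site;
    fprintf(stderr, "%*s-> %p\n", 2 * trace_depth, "", this_fn);
    trace_depth++;
}

__attribute__((no_instrument_function))
void __cyg_profile_func_exit(void *this_fn, void *call_site)
{
    (void)call_site;
    if (trace_depth > 0)
        trace_depth--;
    fprintf(stderr, "%*s<- %p\n", 2 * trace_depth, "", this_fn);
}
```

Assuming the application lives in app.c, something like gcc -g -finstrument-functions app.c trace.c -o app builds a traced binary without touching the application source. The hooks print raw function addresses; addr2line -f -e app <address> can turn them into function names and file/line pairs. On systems that build position-independent executables by default, the printed addresses are randomized at run time, so linking with -no-pie is probably needed for addr2line to resolve them directly.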

hidden routines linked in c program

Hello,
When one disassembles a win32 exe program compiled by a C compiler, it
shows that some compilers link some 'hidden' routines into it,
I think even if the C program is an empty one of 5 bytes or so.
I understand that those 5 bytes are wrapped in the PE .exe format, but
why add extra routines? They seem unnecessary to me and even
somewhat annoy me. What are they? Can they be omitted? As I understand it,
a C program (leaving aside C++ for now, which I know has some
initial routines) should not need such additional hidden functions.
Many thanks for any answer, maybe even a link to more detailed info, because
this topic interests me a lot.
//edit
OK, here is some disassembly I did a while back.
(I have also tested Digital Mars and the old Borland command-line compiler.
Both generate much more code, and I'm especially interested in bcc32,
but they do not include readable names/symbols in the disassembly,
so I will not post them here.)
These are somewhat readable, but I am not experienced enough to understand
what they are ;-)
https://dl.dropbox.com/u/42887985/prog_devcpp.htm
https://dl.dropbox.com/u/42887985/prog_lcc.htm
https://dl.dropbox.com/u/42887985/prog_mingw.htm
https://dl.dropbox.com/u/42887985/prog_pelles.htm
Could someone add some explanatory comments on what is in there?
(I am afraid there may be some C++ stuff in there. I am
interested in pure C add-ons, not C++,
but I'm too tired now to verify that it was compiled in C
mode. The extension of the compiled empty-main program was .c,
so I assumed the output would be C, not C++.)
Thanks for any longer explanation of what it is.
Since your win32 exe file is a dynamically linked object file, it will contain the necessary data needed by the dynamic linker to do its job, such as names of libraries to link to, and symbols that need resolving.
Even a program with an empty main() will link with the C runtime and kernel32.dll libraries (and probably others; it has been a while since I last did Win32 development).
You should also be aware that main() is only the entry point of your own code. Quite a bit has already happened before that point, such as retrieving and tokenizing the command line, setting up the locale, creating stderr, stdin, and stdout, and setting up the other mechanisms required by the C runtime library, such as atexit(). Similarly, when your main() returns, the runtime does some clean-up, and at the very least it needs to call the kernel to tell it that you're done.
As to whether it's necessary? Yes, unless you fancy writing your own program prologue and epilogue each time. There are probably ways of writing minimal, statically linked applications if you're sufficiently masochistic.
As for storage overhead, why are you getting so worked up? It's not enough to worry about.
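The claim that code runs before main() and after it returns can be observed from C itself. The following is a small sketch under some assumptions: it relies on the GCC/Clang constructor attribute (a compiler extension, not standard C), and the names startup_ran, before_main, after_main, register_cleanup, and startup_already_ran are invented for this example.

```c
#include <stdio.h>
#include <stdlib.h>

static int startup_ran = 0;

/* GCC/Clang extension: the runtime's start-up code calls this function
 * before it calls main(). */
__attribute__((constructor))
static void before_main(void)
{
    startup_ran = 1;
}

/* Called by the runtime's exit processing, after main() has returned. */
static void after_main(void)
{
    fputs("cleanup ran after main() returned\n", stderr);
}

/* Registers after_main with atexit(); returns 0 on success. */
int register_cleanup(void)
{
    return atexit(after_main);
}

/* Reports whether the constructor has run; meant to be called from main(). */
int startup_already_ran(void)
{
    return startup_ran;
}
```

If main() calls startup_already_ran() as its first statement, it should already see 1, and the message from after_main appears only once main() has returned. Both effects come from the start-up and shutdown routines that the linker adds around an otherwise empty program.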
There are several initialization functions that load whenever you run a program on Windows. These functions, among other things, call the main() function that you write - which is why you need either a main() or WinMain() function for your program to run. I'm not aware of other included functions though. Do you have some disassembly to show?
You haven't given much detail to go on, but I think most of what you're seeing is probably the routines of the specific C runtime library that your compiler works with.
For instance, there will be code that runs from the entry point recorded in the Portable Executable header and then calls the main() that you wrote in your C program.

Math.h library functions in assembly x86? [duplicate]

This question already has answers here:
How does C compute sin() and other math functions?
(22 answers)
Closed 3 years ago.
I tried to convert C code written under Linux (Fedora 9) to x86 assembly code, but I have a problem with the math.h functions. The functions in this library, such as ceil, floor, log, log10, and pow, are undefined in x86 assembly. Can you please help me solve this problem?
Thanks.
Most library functions won't be defined in assembly language, at least not in the sense of the addition operator directly mapping to the ADD instruction. If you want to re-write the library in assembly, you'll have to implement the function using whatever capabilities that your processor has available. Most library functions will require a separate assembly language subroutine, not just a single operation. The easiest way to approach this is to get the individual library subroutines working in isolation, then incorporate them into the larger program.
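To make the "separate subroutine" point concrete, here is a deliberately simple sketch of floor and ceil written as ordinary C functions, which could then be compiled and inspected or hand-translated. This is not how glibc implements them; the names my_floor and my_ceil are made up, and the approach assumes the input is finite and fits in a long long (it also does not preserve the sign of negative zero).

```c
/* Simplified floor: truncate toward zero, then step down if truncation
 * moved the value up (which happens for negative non-integers).
 * Assumption: x is finite and within the range of long long. */
double my_floor(double x)
{
    double t = (double)(long long)x;   /* conversion truncates toward zero */
    return (t > x) ? t - 1.0 : t;
}

/* Simplified ceil: truncate toward zero, then step up if truncation
 * moved the value down (which happens for positive non-integers). */
double my_ceil(double x)
{
    double t = (double)(long long)x;
    return (t < x) ? t + 1.0 : t;
}
```

Each function is only a conversion, a comparison, and an addition or subtraction, so the corresponding assembly subroutine stays short. Functions such as log or pow need considerably more work (range reduction and polynomial approximation are common techniques), which is why reading an existing implementation is a sensible starting point.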
You can compile the C code and examine the disassembled output, but beware of compiler optimizations that can make the output hard for a human to follow.
May I ask what the purpose is behind this task? Since a compiler is essentially a C to assembly-language translator, there's rarely a need to do this by hand. Is this homework?
The best way to find out what these functions do is to take a look at their implementation in glibc's source. It should give you clear enough insight. Another way would be to take a look at the disassembly of libm.so (the library that -lm links against), found in /usr/lib/.

How do I use C libraries in assembler?

I want to know how to write a text editor in assembler. But modern operating systems require C libraries, particularly for their windowing systems. I found this page, which has helped me a lot.
But I wonder if there are details I should know. I know enough assembler to write programs that will use windows in Linux using GTK+, but I want to be able to understand what I have to send to a function for it to be a valid input, so that it will be easier to make use of all C libraries. For interfacing between C and x86 assembler, I know what can be learned from this page, and little else.
One of the most instructive ways to learn how to call C from assembler is to:
Write a C program that calls the C function of interest
Compile it, and look at the assembly listing (gcc -S)
This approach makes it easy to experiment by starting with something that is already known to work. You can change the C source and see how the generated code changes, and you can start with the generated code and modify it yourself.
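A starting point for that experiment might look like the sketch below. The function names (add3, call_add3, call_puts) and the file name calls.c are arbitrary choices for this example; the interesting part is the assembly that gcc -S produces for the two calling functions.

```c
/* calls.c (hypothetical file name)
 * Compile with:  gcc -S -O0 calls.c      (add -m32 for 32-bit x86 output,
 * if a 32-bit toolchain is installed). Read the resulting calls.s. */
#include <stdio.h>

/* A function with three integer parameters, so the argument passing
 * is easy to spot in the listing. */
int add3(int a, int b, int c)
{
    return a + b + c;
}

/* Under the 32-bit cdecl convention the arguments are pushed right to
 * left and the caller cleans up the stack afterwards. Under the x86-64
 * System V convention they are passed in edi, esi and edx instead. */
int call_add3(void)
{
    return add3(1, 2, 3);
}

/* A call into the C library: one pointer argument, int return value. */
int call_puts(void)
{
    return puts("hello from C");
}
```

Compiling without optimization (-O0) keeps the calls visible; at higher optimization levels the compiler may inline add3 or fold the result into a constant, which hides the very thing being studied.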
With the stack-based x86 calling conventions, a call to a C function comes down to:
push the parameters on the stack
call the function
clean up the stack
The links you have in your question show all these steps.
The OS may define the calling standard (it pretty much must define the standard for invoking system calls), in which case you need only find where that is documented and read it closely.
