Pro*C based batch, Out of Memory?

When trying to compile a Pro*C based batch file, the "proc" process gets stuck at 100% of one CPU core and its memory keeps growing until the system has to OOM-kill it (the machine has 16 GB of memory and the process grew to 9 GB).
Has anyone seen this behavior before?
Some additional information:
- The mk file is the one from the installation of the main package
- The .pc files are the original files (I've tried to compile several, such as dtesys.pc)
- The libs are correctly compiled
- The environment variables are correctly set

Yes, the culprit is limits.h: it includes itself recursively at line 123:
/* Get the compiler's limits.h, which defines almost all the ISO constants.
   We put this #include_next outside the double inclusion check because
   it should be possible to include this file more than once and still get
   the definitions from gcc's header. */
#if defined __GNUC__ && !defined _GCC_LIMITS_H_
/* `_GCC_LIMITS_H_' is what GCC's file defines. */
# include_next <limits.h>
#endif
So, the solution is to pass the parse=none option to the Pro*C precompiler:
proc parse=none iname=filename.pc oname=filename.c
Or, as a second option, you may first run your source through the C preprocessor to get the .pc file:
cpp -P -E yourfile.someextension -o yourfile.pc
Then you will get limits.h parsed without recursion.
The -P option is needed because Pro*C is a program that can be confused by linemarkers.
The -E option is needed because Pro*C is a program that can be confused by non-traditional preprocessor output.
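For context, here is a minimal sketch (with a hypothetical header name) of why a plain guard would be harmless, and why it is the #include_next placed outside the guard that triggers the loop:
#ifndef MYDEFS_H /* conventional double-inclusion guard: safe to re-include */
#define MYDEFS_H
#define MY_INT_MAX 2147483647
#endif
/* glibc's limits.h, by contrast, performs its #include_next BEFORE the
   guard. #include_next is a GNU cpp extension meaning "keep searching the
   include path for the next header of this name"; a precompiler that does
   not implement the extension sees an ordinary #include of the very file
   it is currently reading, and recurses until memory runs out. */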

Related

autoconf configure results in C std lib header related compile errors

I am attempting to build a project that comes with an automake/autoconf build system. This is a well-used project, so I'm skeptical that the problem lies with the configure scripts, makefiles, or code as I received them. It is more likely some kind of environment, path, or flag problem - something on my end with simply running the right commands with the right parameters.
The configuration step seems to complete in a satisfactory way. When I run make, I'm shown a set of errors primarily of these types:
error: ‘TRUE’ undeclared here (not in a function)
error: ‘struct work’ has no member named ‘version’
error: expected ‘)’ before ‘PRIu64’
Let's focus on the last one, which I have spent time researching - and I suspect all the errors are related to missing definitions. Apparently the print-friendly extended definitions from the C standard library header inttypes.h are not being found. However, the configure step claims everything is in order:
configure:4930: checking for inttypes.h
configure:4930: /usr/bin/x86_64-linux-gnu-gcc -c -g -O2 conftest.c >&5
configure:4930: $? = 0
configure:4930: result: yes
All the INTTYPES flags are set correctly if I look in confdefs.h, config.h, config.log Output Variables, etc:
HAVE_INTTYPES_H='1'
#define HAVE_INTTYPES_H 1
The problem is the same whether doing a native build, or cross-compiling (for arm-linux-gnueabihf, aka armhf).
The source .c file in question does have config.h included as you'd expect, which by my understanding of the m4 macro mechanics should be adding an
#include <inttypes.h>
line. Yes, as you may be inclined to ask: if I enter this line myself into the .c file, it appears to work and the PRIu64 errors go away.
I'm left wondering how to debug this type of problem - essentially, everything I am aware of tells me I've done the configure properly, but I'm left with a bogus make process. Aside from trying every ./configure tweak and trick I can find, I've started looking at the auto-generated Makefile.in itself, but nothing so far. I'm also looking into how I can get the C preprocessor to tell me which header files it's actually inserting.
EDIT: I've confirmed that the -DHAVE_CONFIG_H mechanic looks good through configure, config.log, Makefile, etc.
autoconf does not automatically produce #include directives. You need to do that on your own based on the HAVE_* macros. So you'll have to add something like this:
#ifdef HAVE_INTTYPES_H
# include <inttypes.h>
#endif
If these lines show up in confdefs.h (a temporary header file used by configure scripts), that does not excuse your application from performing these #includes itself: configure writes them to confdefs.h solely for the benefit of other configure tests, not for application use.
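Putting it together, each affected source file would start something like this (a sketch reusing the macro names from the question):
#ifdef HAVE_CONFIG_H
# include "config.h" /* must come first so the HAVE_* macros are visible */
#endif
#ifdef HAVE_INTTYPES_H
# include <inttypes.h> /* provides PRIu64 and friends */
#endif
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint64_t n = 42;
    printf("%" PRIu64 "\n", n); /* compiles only if inttypes.h was pulled in */
    return 0;
}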
First, run make -n for the target that failed. This is probably some .o file; you may need some tweaking to get its path right.
Now you have the command used to compile your file. If you don't find the problem by meditating on this command, try to run it, adding -E to get the preprocessed text instead of invoking the compiler proper.
Note that the .o file will then contain text, and you must rebuild it without -E later.
You may find some preprocessor flags useful to get more details: -dM or -dD, or others.
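For instance (a sketch; the object path and flags are made up):
$ make -n src/foo.o                                       # show the compile command without running it
$ gcc -DHAVE_CONFIG_H -I. -E src/foo.c | grep -n inttypes # is the header actually pulled in?
$ gcc -DHAVE_CONFIG_H -I. -dM -E src/foo.c | grep PRIu64  # is the macro visible after preprocessing?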

extract library version from binary with CMake

I am writing a FindXXX.cmake script for an external C library. I would like my script to provide information about the library version. However, the library only provides this information in the form of a function that returns the version number as a string.
I thought I could extract the version number by having FindXXX.cmake compile the following C program on the fly:
#include <stdio.h>
#include "library.h"
int main() {
    char version[256];
    get_version(version);
    puts(version);
    return 0;
}
In order for this to work, CMake should compile and run the program above at configure time, and use the information it prints as the version identifier. I know how to do the latter (execute_process), and I almost know how to do the former: CheckCSourceRuns comes to mind, but I do not know how to capture the stdout of the generated executable.
TL;DR: is there a way to compile a program, run it and capture its stdout from CMake at generation time?
You may use try_run for that purpose (it is assumed that your source file is named foo_get_version.c):
try_run(foo_run_result foo_compile_result
        foo_try_run ${CMAKE_CURRENT_LIST_DIR}/foo_get_version.c
        RUN_OUTPUT_VARIABLE foo_run_output)
if(NOT foo_compile_result)
    # ... Failed to compile
endif()
if(NOT foo_run_result EQUAL "0")
    # ... Failed to run
endif()
# Now the 'foo_run_output' variable contains the output of your program.
Note that try_run isn't actually executed when cross-compiling. Instead, CMake expects the user to set the cache variables foo_run_result and foo_run_result__TRYRUN_OUTPUT.
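In the cross-compiling case the configure invocation might then look like this (the version value is made up for illustration):
cmake -Dfoo_run_result=0 "-Dfoo_run_result__TRYRUN_OUTPUT=1.2.3" <path-to-source>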

gdb get preprocessor macro info from file in different directory

I'm trying to debug some additions I made to a fairly large C program using gdb. The program I'm trying to debug makes extensive use of #define statements to set different values that are used throughout the code. I need to be able to see what these values are in order to help my debugging (as they include some very important information).
After some digging around I found that the info macro FOO and macro expand FOO commands should be able to print these values if the -g3 option (I also tried the -gdwarf-2 and -ggdb3 flags) is passed to the compiler (as discussed here). However, whenever I try using this I get
The symbol `FOO' has no definition as a C/C++ preprocessor macro
at <user-defined>:-1
Now, I'm sure that the macro is defined, otherwise the previous line of code would not have been able to run. In addition, I'm certain that I have passed the -g3 flag to the compiler. I have one idea as to where the issue might be, and that is the location where the macro is defined. Currently the macro is defined in a header file that is not in the same directory as the rest of the files (i.e., if the source files are in /foo/bar/blam/.. then the macro is defined in /def/mac/here/). Given this, I thought maybe the problem was that gdb didn't know to look in this directory, so I tried issuing the directory command in gdb and gave it the path to the directory containing the header file (based on this). This still did not solve the problem.
Does anyone know how I can get the values of these macros? If it is pertinent, I'm running gdb version 7.11 and compiling the program with cc and gcc, both from Apple LLVM version 7.0.2 (clang-700.1.81). Also, gdb was installed/built using Homebrew.
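For comparison, here is a minimal session in which macro debugging is known to work (hypothetical file test.c, built with GNU gcc rather than a clang-based cc, since clang of that era does not emit macro debug information):
#define FOO 42
#include <stdio.h>
int main(void) { printf("%d\n", FOO); return 0; }
$ gcc -g3 -O0 test.c -o test
$ gdb ./test
(gdb) start
(gdb) info macro FOO
If FOO is reported here but not in the real project, the difference likely lies in which compiler actually produced the debug information.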

How can I execute a decrypted file residing in memory? [duplicate]

Is it possible to compile a C++ (or similar) program without generating an executable file, instead writing it and executing it directly from memory?
For example with GCC and clang, something that has a similar effect to:
c++ hello.cpp -o hello.x && ./hello.x $# && rm -f hello.x
In the command line.
But without the burden of writing an executable to disk to immediately load/rerun it.
(If possible, the procedure may not use disk space or at least not space in the current directory which might be read-only).
Possible? Not the way you seem to wish. The task has two parts:
1) How to get the binary into memory
When we specify /dev/stdout as the output file on Linux, we can pipe the compiler's output into our program x0, which reads an executable from stdin and executes it:
gcc -pipe YourFiles1.cpp YourFile2.cpp -o/dev/stdout -Wall | ./x0
In x0 we can just read from stdin until reaching the end of the file:
#include <stdlib.h> /* realloc */
#include <unistd.h> /* read, STDIN_FILENO */

int memexec(void *exe, size_t exe_size, char *const argv[]); /* defined below */

int main(int argc, char **argv)
{
    size_t ntotal = 0;
    char *buf = NULL;
    while (1)
    {
        /* grow the buffer dynamically since we do not know how many bytes to read */
        buf = realloc(buf, ntotal + 4096);
        ssize_t nread = read(STDIN_FILENO, buf + ntotal, 4096);
        if (nread <= 0) break; /* 0 means end-of-file, negative means error */
        ntotal += nread;
    }
    memexec(buf, ntotal, argv);
}
It would also be possible for x0 to directly execute the compiler and read its output. This question has been answered here: Redirecting exec output to a buffer or file
Caveat: I just figured out that for some strange reason this does not work when I use a pipe |, but works when I use redirection: ./x0 < foo.
Note: If you are willing to modify your compiler, or if you do JIT like LLVM, clang and other frameworks, you could directly generate executable code. However, for the rest of this discussion I assume you want to use an existing compiler.
Note: Execution via temporary file
Other programs such as UPX achieve a similar behavior by executing a temporary file; this is easier and more portable than the approach outlined below. On systems where /tmp is mapped to a RAM disk (typical for servers), the temporary file will be memory-based anyway.
#include <fcntl.h>    /* open, O_RDONLY */
#include <stdio.h>    /* perror */
#include <stdlib.h>   /* mkostemp */
#include <sys/stat.h> /* chmod, S_IRUSR, S_IXUSR */
#include <unistd.h>   /* write, close, unlink, fexecve */
int memexec(void *exe, size_t exe_size, char *const argv[])
{
    /* random temporary file name in /tmp */
    char name[15] = "/tmp/fooXXXXXX";
    /* creates the temporary file, returns a read-write file descriptor */
    int fd_wr = mkostemp(name, 0);
    /* makes the file executable and read-only */
    chmod(name, S_IRUSR | S_IXUSR);
    /* creates a read-only file descriptor before deleting the file */
    int fd_ro = open(name, O_RDONLY);
    /* removes the file from the file system; the kernel keeps the content in memory until all fds are closed */
    unlink(name);
    /* writes the executable into the file */
    write(fd_wr, exe, exe_size);
    /* fexecve will not work as long as there is an open writeable file descriptor */
    close(fd_wr);
    char *const newenviron[] = { NULL };
    fexecve(fd_ro, argv, newenviron);
    perror("failed");
    return -1;
}
Caveat: Error handling is left out for clarity's sake. Includes are trimmed for the sake of brevity.
Note: By combining main() and memexec() into a single function and using splice(2) to copy directly between stdin and fd_wr, the program could be significantly optimized; a sketch follows.
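Assuming stdin is a pipe (which splice(2) requires on at least one side), the copy loop could look like this:
#define _GNU_SOURCE
#include <fcntl.h>  /* splice, SPLICE_F_MORE */
#include <unistd.h> /* STDIN_FILENO */

/* drains stdin (a pipe) straight into fd_wr without a userspace buffer;
   returns 0 on end-of-file, -1 on error */
static int drain_stdin_to(int fd_wr)
{
    ssize_t n;
    while ((n = splice(STDIN_FILENO, NULL, fd_wr, NULL, 65536, SPLICE_F_MORE)) > 0)
        ; /* the kernel moves the data page by page */
    return n == 0 ? 0 : -1;
}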
2) Execution directly from memory
One does not simply load and execute an ELF binary from memory. Some preparation, mostly related to dynamic linking, has to happen. There is a lot of material explaining the various steps of the ELF linking process, and studying it makes me believe it is theoretically possible. See for example this closely related question on SO; however, there seems not to exist a working solution.
Update: UserModeExec seems to come very close.
Writing a working implementation would be very time consuming, and would surely raise some interesting questions in its own right. I like to believe this is by design: for most applications it is strongly undesirable to (accidentally) execute input data, because that allows code injection.
What happens exactly when an ELF is executed? Normally the kernel receives a file name and then creates a process, loads and maps the different sections of the executable into memory, performs a lot of sanity checks and marks it as executable before passing control and a file name to the run-time linker ld-linux.so (part of libc), which takes care of relocating functions, handling additional libraries, setting up global objects and jumping to the executable's entry point. As I understand it, this heavy lifting is done by dl_main() (implemented in libc/elf/rtld.c).
Even fexecve is implemented using a file in /proc, and it is this need for a file name that leads us to reimplement parts of this linking process.
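On Linux 3.17 and later, memfd_create(2) offers a middle ground: an anonymous in-memory file descriptor that fexecve can execute, avoiding /tmp entirely. A possible sketch of a memexec() variant built on it:
#define _GNU_SOURCE
#include <stdio.h>    /* perror */
#include <sys/mman.h> /* memfd_create */
#include <unistd.h>   /* write, fexecve */

/* like memexec() above, but the backing file lives only in memory */
int memexec_memfd(const void *exe, size_t exe_size, char *const argv[])
{
    int fd = memfd_create("exe", 0); /* anonymous file, no filesystem entry */
    if (fd < 0 || write(fd, exe, exe_size) != (ssize_t)exe_size)
        return -1;
    char *const newenviron[] = { NULL };
    fexecve(fd, argv, newenviron); /* returns only on failure */
    perror("fexecve");
    return -1;
}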
Libraries
UserModeExec
libelf -- read, modify, create ELF files
eresi -- play with ELFs
OSKit (seems like a dead project though)
Reading
http://www.linuxjournal.com/article/1060?page=0,0 -- introduction
http://wiki.osdev.org/ELF -- good overview
http://s.eresi-project.org/inc/articles/elf-rtld.txt -- more detailed Linux-specific explanation
http://www.codeproject.com/Articles/33340/Code-Injection-into-Running-Linux-Application -- how to get to hello world
http://www.acsu.buffalo.edu/~charngda/elf.html -- nice reference of ELF structure
Loaders and Linkers by John Levine -- deeper explanation of linking
Related Questions at SO
Linux user-space ELF loader
ELF Dynamic loader symbol lookup ordering
load-time ELF relocation
How do global variables get initialized by the elf loader
So it seems possible; you decide whether it is also practical.
Yes, though doing it properly requires designing significant parts of the compiler with this in mind. The LLVM guys have done this, first with a kinda-separate JIT, and later with the MC subproject. I don't think there's a ready-made tool doing it. But in principle, it's just a matter of linking to clang and llvm, passing the source to clang, and passing the IR it creates to MCJIT. Maybe a demo does this (I vaguely recall a basic C interpreter that worked like this, though I think it was based on the legacy JIT).
Edit: Found the demo I recalled. Also, there's cling, which seems to do basically what I described, but better.
Linux can create virtual file systems in RAM using tmpfs. For example, I have my tmp directory set up in my file system table like so:
tmpfs /tmp tmpfs nodev,nosuid 0 0
Using this, any files I put in /tmp are stored in my RAM.
Windows doesn't seem to have any "official" way of doing this, but has many third-party options.
Without this "RAM disk" concept, you would likely have to heavily modify a compiler and linker to operate completely in memory.
If you are not specifically tied to C++, you may also consider other JIT based solutions:
in Common Lisp SBCL is able to generate machine code on the fly
you could use TinyCC and its libtcc.a, which quickly emits poor (i.e. unoptimized) machine code from C code in memory.
consider also any JITing library, e.g. libjit, GNU Lightning, LLVM, GCCJIT, asmjit
of course emitting C++ code on some tmpfs and compiling it...
But if you want good machine code, you'll need it to be optimized, and that is not fast (so the time to write to a filesystem is negligible).
If you are tied to C++ generated code, you need a good optimizing C++ compiler (e.g. g++ or clang++); they take significant time to compile C++ code to an optimized binary. So you should generate to some file foo.cc (perhaps in a RAM file system like tmpfs, but that would be a minor gain, since most of the time is spent inside the g++ or clang++ optimization passes, not reading from disk), then compile that foo.cc to foo.so (using perhaps make, or at least forking g++ -Wall -shared -O2 foo.cc -o foo.so, perhaps with additional libraries). At last, have your main program dlopen that generated foo.so. FWIW, MELT was doing exactly that, and on a Linux workstation the manydl.c program shows that a process can generate and then dlopen(3) many hundreds of thousands of temporary plugins, each one obtained by generating a temporary C file and compiling it. For C++, read the C++ dlopen mini HOWTO.
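A minimal sketch of that generate/compile/dlopen cycle, in plain C to sidestep C++ name mangling (the file path and symbol name are made up):
#include <dlfcn.h> /* dlopen, dlsym, dlclose */
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    /* 1. emit the generated code */
    FILE *f = fopen("/tmp/gen.c", "w");
    fprintf(f, "int answer(void) { return 42; }\n");
    fclose(f);
    /* 2. compile it into a shared object */
    if (system("gcc -shared -fPIC -O2 /tmp/gen.c -o /tmp/gen.so") != 0)
        return 1;
    /* 3. load the plugin and call the generated function */
    void *h = dlopen("/tmp/gen.so", RTLD_NOW);
    int (*answer)(void) = (int (*)(void))dlsym(h, "answer");
    printf("%d\n", answer());
    dlclose(h);
    return 0;
}
Build it with gcc main.c -ldl (on glibc 2.34 and later the -ldl is no longer needed).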
Alternatively, generate a self-contained source program foobar.cc, compile it to an executable foobarbin, e.g. with g++ -O2 foobar.cc -o foobarbin, and execute that foobarbin binary with execve.
When generating C++ code, you may want to avoid generating tiny C++ source files (e.g. a dozen lines only); if possible, generate C++ files of a few hundred lines at least, unless lots of template expansion happens through extensive use of existing C++ containers, in which case generating a small C++ function combining them makes sense. For instance, try if possible to put several generated C++ functions in the same generated C++ file (but avoid very big generated C++ functions, e.g. 10 KLOC in a single function: they take a long time to be compiled by GCC). You could consider, if relevant, having only one single #include in that generated C++ file, and pre-compiling that commonly included header.
Jacques Pitrat's book Artificial Beings: the conscience of a conscious machine (ISBN 9781848211018) explains in detail why generating code at runtime is useful (in symbolic artificial intelligence systems like his CAIA system). The RefPerSys project is trying to follow that idea and generate some C++ code (and hopefully, more and more of it) at runtime. Partial evaluation is a relevant concept here.
Your software is likely to spend more CPU time in generating C++ code than GCC in compiling it.
The tcc compiler's "-run" option allows for exactly this: compile into memory, run from there, and finally discard the compiled code. No filesystem space is needed. "tcc -run" can be used in a shebang line to allow for C scripts; from the tcc man page:
#!/usr/local/bin/tcc -run
#include <stdio.h>
int main()
{
    printf("Hello World\n");
    return 0;
}
C scripts allow for mixed bash/C scripts, with "tcc -run" not needing any temporary space:
#!/bin/bash
echo "foo"
sed -n "/^\/\*\*$/,\$p" $0 | tcc -run -
exit
/**
*/
#include <stdio.h>
int main()
{
    printf("bar\n");
    return 0;
}
Execution output:
$ ./shtcc2
foo
bar
$
C scripts with gcc are possible as well, but, as others mentioned, they need temporary space to store the executable. This script produces the same output as the previous one:
#!/bin/bash
exc=/tmp/`basename $0`
if [ $0 -nt $exc ]; then sed -n "/^\/\*\*$/,\$p" $0 | gcc -x c - -o $exc; fi
echo "foo"
$exc
exit
/**
*/
#include <stdio.h>
int main()
{
    printf("bar\n");
    return 0;
}
C scripts with the suffix ".c" are nice; headtail.c was my first ".c" file that needed to be executable:
$ echo -e "1\n2\n3\n4\n5\n6\n7" | ./headtail.c
1
2
3
6
7
$
I like C scripts because you have just one file, you can easily move it around, and changes to the bash or C part require no further action; they just work on the next execution.
P.S.:
The "tcc -run" C script shown above has a problem: the C script's stdin is not available to the executed C code. The reason is that I passed the extracted C code to "tcc -run" via a pipe. The new gist run_from_memory_stdin.c does it correctly:
...
echo "foo"
tcc -run <(sed -n "/^\/\*\*$/,\$p" $0) 42
...
"foo" is printed by bash part, "bar 42" from C part (42 is passed argv[⁠1]), and piped script input gets printed from C code then:
$ route -n | ./run_from_memory_stdin.c
foo
bar 42
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 172.29.58.98 0.0.0.0 UG 306 0 0 wlan1
10.0.0.0 0.0.0.0 255.255.255.0 U 0 0 0 wlan0
169.254.0.0 0.0.0.0 255.255.0.0 U 303 0 0 wlan0
172.29.58.96 0.0.0.0 255.255.255.252 U 306 0 0 wlan1
$
One can easily modify the compiler itself. It sounds hard at first, but thinking about it, it seems obvious. Modifying the compiler sources to expose a library and turn it into a shared library should not take that much effort (depending on the actual implementation).
Just replace every file access with a memory-mapped-file solution.
It is something I am about to do with compiling something transparently in the background to opcodes and executing those from within Java.
But thinking about your original question, it seems you want to speed up compilation and your edit-and-run cycle. First of all, get an SSD (a PCIe version gives you almost memory speed), and let's say it's C we are talking about. C's linking step involves very complex operations that are likely to take more time than reading from and writing to disk. So just put everything on the SSD and live with the lag.
Finally, the answer to the OP's question is yes!
I found the memrun repo from guitmz, which demonstrated running an (x86_64) ELF from memory, using Go and assembler. I forked it and provided a C version of memrun, which runs ELF binaries (verified on x86_64 and armv7l) either from standard input or via process substitution as the first argument. The repo contains demos and documentation (memrun.c is only 47 lines of code):
https://github.com/Hermann-SW/memrun/tree/master/C#memrun
Here is the simplest example: with "-o /dev/fd/1" the gcc-compiled ELF gets sent to stdout and piped into memrun, which executes it:
pi@raspberrypi400:~/memrun/C $ gcc info.c -o /dev/fd/1 | ./memrun
My process ID : 20043
argv[0] : ./memrun
no argv[1]
evecve --> /usr/bin/ls -l /proc/20043/fd
total 0
lr-x------ 1 pi pi 64 Sep 18 22:27 0 -> 'pipe:[1601148]'
lrwx------ 1 pi pi 64 Sep 18 22:27 1 -> /dev/pts/4
lrwx------ 1 pi pi 64 Sep 18 22:27 2 -> /dev/pts/4
lr-x------ 1 pi pi 64 Sep 18 22:27 3 -> /proc/20043/fd
pi@raspberrypi400:~/memrun/C $
The reason I was interested in this topic was its use in "C scripts". run_from_memory_stdin.c demonstrates it all together:
pi@raspberrypi400:~/memrun/C $ wc memrun.c | ./run_from_memory_stdin.c
foo
bar 42
47 141 1005 memrun.c
pi@raspberrypi400:~/memrun/C $
The C script producing the shown output is this small ...
#!/bin/bash
echo "foo"
./memrun <(gcc -o /dev/fd/1 -x c <(sed -n "/^\/\*\*$/,\$p" $0)) 42
exit
/**
*/
#include <stdio.h>
int main(int argc, char *argv[])
{
    printf("bar %s\n", argc > 1 ? argv[1] : "(undef)");
    for (int c = getchar(); EOF != c; c = getchar()) { putchar(c); }
    return 0;
}
P.S.:
I added tcc's "-run" option to gcc and g++; for details see:
https://github.com/Hermann-SW/memrun/tree/master/C#adding-tcc--run-option-to-gcc-and-g
Just nice, and nothing gets stored in the filesystem:
pi@raspberrypi400:~/memrun/C $ uname -a | g++ -O3 -Wall -run demo.cpp 42
bar 42
Linux raspberrypi400 5.10.60-v7l+ #1449 SMP Wed Aug 25 15:00:44 BST 2021 armv7l GNU/Linux
pi@raspberrypi400:~/memrun/C $

Running a C program in Linux

Can someone explain to me why, in particular, we are using ./a.out to run a program?
Is there any meaning behind this?
Can someone please provide an explanation?
The name stands for "assembler output", and was (and still is) the default name for the executable generated by the compiler. The reason you need ./ in front of it is that the current directory (.) is not in $PATH, so the path to the executable must be given explicitly.
If you mean the ./ part, it's for safety. Windows by default appends the current directory to PATH, which is bad (there's a risk of DLL injection, and so on).
If you mean the a.out part, it's just a name (which came from the name of the a.out format), which you can change with gcc's -o parameter.
When running an executable from a shell like bash, the executable must be in your PATH environment variable for bash to locate and run the program.
The ./ prefix is a shorthand way of specifying the full path to the executable, so that bash does not need to consult the PATH variable (which usually does not contain the current directory) to run it.
[As for a.out (short for "assembler output"), it is the default executable output name for a compiler like gcc if no output filename is specified.]
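To illustrate (a hypothetical session):
$ gcc hello.c    # no -o given, so the executable is named a.out
$ a.out
bash: a.out: command not found
$ ./a.out        # explicit path, no PATH lookup needed
Hello, world!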
It'd be worth you looking a bit more into C and the way that C programs are compiled.
Essentially, your source code is first sent to the preprocessor, where directives like #define and #include are expanded. So any headers you want to use are loaded, e.g.
#include <math.h>
will basically 'paste' the contents of math.h into your source code at the point where it is included.
Once all this has been expanded, the compiler turns your source code into object code (your source in binary form) and links it into an executable. a.out is the default name for the output if you do not specify a name:
gcc -o mynewprogram mynewprogram.c
a.out is simply the compiler's default output name when none is given with -o; AFAIK the name itself is a holdover from the old a.out ("assembler output") object format.
