How to create a real compiler with Flex/Bison?

I'm learning Flex/Bison right now, thinking I could write a compiler with them, but the more I read, the more I get the impression that they are only syntactic analyzers and cannot generate standalone Windows executables from programs written in our language. To explain: when the parser generated by Bison is executed, it only interprets our language's code, in C.
Is it possible to create a compiler that generates executables which run on any Windows machine that does not have my compiler installed?

Yes, it's possible (and perfectly common) to write compilers using flex+bison, but these tools only help you with the lexical and syntactic analysis. You'll have to do the rest yourself or with additional tools like LLVM.
For example, to create a simple single-pass compiler, you could write assembly instructions to a file from within your bison actions. At the end you would run that file through an assembler and a linker to get an executable.
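To make the single-pass idea concrete, here is a minimal sketch of helper functions that bison actions could call while rules are being reduced. The names emit_number and emit_binary, the stack-machine strategy, and the x86-64 AT&T syntax are all assumptions chosen for illustration; none of this comes from bison itself.

```c
#include <stdio.h>

/* Hypothetical helpers meant to be called from bison actions.
 * They implement a naive stack machine in x86-64 AT&T syntax. */

/* Emit code that pushes an integer literal onto the machine stack. */
void emit_number(FILE *out, int value) {
    fprintf(out, "    pushq $%d\n", value);
}

/* Emit code that pops two operands, applies `op`, and pushes the result.
 * Returns 0 on success, -1 if the operator is not supported. */
int emit_binary(FILE *out, char op) {
    const char *insn;
    switch (op) {
    case '+': insn = "addq";  break;
    case '-': insn = "subq";  break;
    case '*': insn = "imulq"; break;
    default:  return -1;
    }
    fprintf(out, "    popq %%rcx\n");            /* right operand */
    fprintf(out, "    popq %%rax\n");            /* left operand */
    fprintf(out, "    %s %%rcx, %%rax\n", insn); /* rax = rax op rcx */
    fprintf(out, "    pushq %%rax\n");           /* push the result */
    return 0;
}
```

In a grammar, an action like { emit_binary(out, '+'); } on the addition rule and { emit_number(out, $1); } on the number rule (with out being a hypothetical FILE * opened before yyparse() is called) would produce the instruction stream in evaluation order. The output still needs a prologue, an epilogue, and an assembler and linker run before it is an executable.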
For a more complicated compiler, you might instead build an abstract syntax tree inside your bison actions and then walk that tree in later phases, performing transformations and analyses on it until you finally generate assembly.
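The tree-based approach could look roughly like the sketch below: the constructors are what bison actions would call, and eval_const stands in for one later phase that walks the tree (here, evaluating a constant expression). The Node layout and all function names are hypothetical, picked only for this example.

```c
#include <stdio.h>
#include <stdlib.h>

typedef enum { NODE_NUM, NODE_BINOP } NodeKind;

typedef struct Node {
    NodeKind kind;
    int value;          /* used when kind == NODE_NUM */
    char op;            /* used when kind == NODE_BINOP */
    struct Node *lhs;
    struct Node *rhs;
} Node;

/* Allocate a zeroed node; give up on allocation failure. */
static Node *new_node(NodeKind kind) {
    Node *n = calloc(1, sizeof *n);
    if (n == NULL) {
        perror("calloc");
        exit(EXIT_FAILURE);
    }
    n->kind = kind;
    return n;
}

/* Constructors that bison actions would call, e.g. $$ = new_binop('+', $1, $3); */
Node *new_num(int value) {
    Node *n = new_node(NODE_NUM);
    n->value = value;
    return n;
}

Node *new_binop(char op, Node *lhs, Node *rhs) {
    Node *n = new_node(NODE_BINOP);
    n->op = op;
    n->lhs = lhs;
    n->rhs = rhs;
    return n;
}

/* One possible later phase: a recursive walk that evaluates the tree. */
int eval_const(const Node *n) {
    if (n->kind == NODE_NUM)
        return n->value;
    switch (n->op) {
    case '+': return eval_const(n->lhs) + eval_const(n->rhs);
    case '-': return eval_const(n->lhs) - eval_const(n->rhs);
    case '*': return eval_const(n->lhs) * eval_const(n->rhs);
    default:  return 0; /* unknown operator: treated as 0 in this sketch */
    }
}

/* Release a whole tree. */
void free_tree(Node *n) {
    if (n == NULL)
        return;
    free_tree(n->lhs);
    free_tree(n->rhs);
    free(n);
}
```

A code generator would be another walk of the same shape that emits instructions instead of returning values; keeping the tree around is what lets several such phases run one after another.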
Either way, bison only helps you with the parsing; you'll need to perform the other steps yourself.

Related

Compilation map

Let's assume a complex project (in C/C++). Is there a way to know which source files are responsible for (or used in) the creation of a specific binary, without compiling the project itself?
I know I could just read the Makefile and try to follow the dependency chain, but that is not very scalable, and it could be hard if multiple Makefiles and/or implicit rules are used.
Thanks a lot for your help
PS: To clarify after the first comments: I'm looking for a method which does not need a valid build environment (so compiling, even as a dry run, is not an option).
"Is there a solution to know which source files are responsible/used for the creation of a specific binary without compiling the project itself?"
If you compile with GCC (or perhaps Clang), you could use preprocessor options like -M to write the dependencies to a text file, in a format accepted by build automation tools such as GNU make or ninja. This works well on Linux distributions like Debian.
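The dependency output is an ordinary make rule, for example "main.o: main.c util.h" with long lists continued by a backslash at the end of the line. As a rough sketch of how you might load such a rule into your own tooling, the function below splits one rule into its prerequisites. The name parse_dep_rule and the fixed-size buffers are assumptions for this example, and it deliberately ignores harder cases such as escaped spaces in file names.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define DEP_MAX_LEN 256

/* True if p points at a backslash that ends a line (make-style continuation). */
static int is_continuation(const char *p) {
    return p[0] == '\\' && (p[1] == '\n' || p[1] == '\r');
}

/* Parse one make-style rule ("target: dep1 dep2 \<newline> dep3") and copy
 * up to max_deps prerequisite names into deps.
 * Returns the number of names stored, or -1 if the rule has no ':'. */
int parse_dep_rule(const char *rule, char deps[][DEP_MAX_LEN], int max_deps) {
    const char *p = strchr(rule, ':');
    int count = 0;

    if (p == NULL)
        return -1;
    p++; /* skip the ':' */

    while (count < max_deps) {
        size_t len = 0;
        size_t copy;

        /* Skip whitespace and line continuations between names. */
        while (isspace((unsigned char)*p) || is_continuation(p))
            p++;
        if (*p == '\0')
            break;

        /* A name runs until whitespace, a continuation, or the end. */
        while (p[len] != '\0' && !isspace((unsigned char)p[len]) &&
               !is_continuation(p + len))
            len++;

        /* Truncate over-long names instead of overflowing the buffer. */
        copy = len < DEP_MAX_LEN ? len : DEP_MAX_LEN - 1;
        memcpy(deps[count], p, copy);
        deps[count][copy] = '\0';
        count++;
        p += len;
    }
    return count;
}
```

Feeding it the text of each generated rule gives you, per object file, the list of sources and headers that went into it, which you can then store however you like.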
You might also be interested in other builders, including omake, and in package managers like opam, urpmi, etc.
You could also get in touch with the Software Heritage team.
If you use GCC, you could write your own GCC plugin to maintain these dependencies in your database.
Finally, be aware of Rice's theorem, and think about crazy examples (in C++) like
#if __TIME__[0]=='1'
int something=0;
#else
constexpr int something=1;
#endif
So my current intuition is that your wish is impossible. I could have misunderstood it.
Refer to some C standard like n1570, or to some C++ standard like n3337.
Study the behavior of tools like GNU autoconf.
Think of programs generating C or C++ code like GNU bison, my manydl.c, bismon, SWIG, RefPerSys, ANTLR .... Notice that GCC has many C++ code generators (notably gengtype) and is definitely "a complex project coded in C++".
See also linuxfromscratch.

How can I view the C code generated from my PyPy program?

Many language environments allow one to "disassemble" a given function. Since PyPy compiles to C code (if I understand things correctly), it seems natural to be able to see a C code dump of an expression, or of a whole Python file.
Can I do this? How?
PyPy does not compile Python bytecode to C code. The tracing JIT replaces a frequently used piece of code with generated machine code (assembly). You can turn on logging to save the JIT traces to a file and view them with vmprof or jitviewer.

How to delete dead or unused code based on the configure file/Makefile

When we compile a C/C++ project, some files and code in the project source are not needed for the compilation: for example the test folder (testing scripts), the examples folder, and dead code. How can I recognize the source files that are not compiled into the binaries? I would like to avoid having to compile, because I need to process many projects automatically, and it is really hard to compile all of them without manual steps.
I know the compiler can remove dead code automatically, but in my situation I cannot compile the whole project, and the source contains a lot of other code that is not part of the final build, such as the code in the test folder, the tools folder, and so on. That is the code I hope to detect. As for dead code, I know it is hard to detect by static analysis, so disregard it; I only care about whole files and whole folders that are not compiled.
Why do I want to do this?
I want to extract some features (strings, function call graph, integer constants, ...) to represent the project, and compare them with the same features extracted from the binary files to see what the differences are. If I extract features from code in the test folder and that code is not compiled into the final binaries, the comparison will show a large error.
Dead code is often, but not always, eliminated by the compiler when you ask it to optimize (removing all dead code automatically is impossible, since the problem is undecidable: it is equivalent to the halting problem). Be aware of the as-if rule, which permits the compiler to do such optimizations. So in practice you don't need to remove the corresponding source code.
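A small illustration of why this matters for comparing source features with binary features: in the sketch below (function name and message are made up for the example), the debug branch can never run, so an optimizing compiler is allowed under the as-if rule to drop both the branch and, possibly, its string literal. Whether it actually does so depends on the compiler and the optimization level.

```c
#include <stdio.h>

/* Doubles its argument. The debug branch is dead code. */
int scale(int x) {
    const int debug = 0;
    if (debug) {
        /* Unreachable: removing this branch does not change observable
         * behavior, so the optimizer may delete it from the binary. */
        puts("debug: scale called");
        return -1;
    }
    return x * 2;
}
```

A tool that extracts strings from the source will still see "debug: scale called", while the optimized binary may not contain it, so some mismatch between the two feature sets is to be expected even for files that are compiled.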
Some industries have a coding rule (e.g. in DO-178C) that forbids dead source code. Detecting it is extremely difficult and in general impossible (see e.g. Rice's theorem), so it requires a lot of sophisticated static program analysis and external code review, and it is very expensive (it can increase the cost of software development by more than 30x).
Your build automation system (e.g. cmake, a Makefile, etc.) might be (and usually is) Turing-complete, so even removing entirely useless C++ source files is an impossible task in general. Even the POSIX shell (used in the commands that build your project) is difficult to analyze (see the excellent Parsing Posix [S]hell talk by Yann Regis-Gianas at FOSDEM 2018).

Read math functions from file and calculate

I'm creating a program which will read a model, described by math functions, from a file into memory. I need to make these functions invokable. Is there any way to achieve this other than implementing RPN? Performance is the most important factor.
Maybe something like creating and compiling the functions at runtime, after reading the model from the file?
CUDA currently only has JIT compilation for device code written in PTX assembly code. So your only "native" JIT option would be to have your code translate the functions into PTX code and compile them.
Realistically, your best option would be to write your front end in Python and use PyCUDA, which includes some very powerful metaprogramming and JIT compilation features, or to use OpenCL, which has native C99 JIT compilation, at the expense of an uglier and more verbose host API and a lack of C++ language support.
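For comparison with the JIT options above, the RPN interpreter that the question would like to avoid can be sketched on the host side in a few lines of plain C. This is only a baseline under assumptions of my own: the function name rpn_eval, the single variable x, the space-separated token format, and the fixed stack depth are all invented for the example.

```c
#include <stdlib.h>
#include <string.h>

#define RPN_STACK_MAX 64

/* Evaluate a space-separated RPN expression such as "x 2 * 1 +",
 * where x is the only variable. Stores the value in *result and
 * returns 0 on success, or -1 if the expression is malformed. */
int rpn_eval(const char *expr, double x, double *result) {
    double stack[RPN_STACK_MAX];
    int top = 0;
    const char *p = expr;

    while (*p != '\0') {
        if (*p == ' ') {
            p++;
        } else if (strchr("+-*/", *p) != NULL && (p[1] == ' ' || p[1] == '\0')) {
            /* Binary operator: needs two operands on the stack. */
            double rhs, lhs;
            if (top < 2)
                return -1;
            rhs = stack[--top];
            lhs = stack[--top];
            switch (*p) {
            case '+': stack[top++] = lhs + rhs; break;
            case '-': stack[top++] = lhs - rhs; break;
            case '*': stack[top++] = lhs * rhs; break;
            default:  stack[top++] = lhs / rhs; break;
            }
            p++;
        } else {
            /* Operand: either the variable x or a numeric literal. */
            double v;
            if (*p == 'x') {
                v = x;
                p++;
            } else {
                char *end;
                v = strtod(p, &end);
                if (end == p)
                    return -1; /* not a number */
                p = end;
            }
            if (top >= RPN_STACK_MAX)
                return -1;
            stack[top++] = v;
        }
    }
    if (top != 1)
        return -1;
    *result = stack[0];
    return 0;
}
```

Every call re-parses the text, which is exactly the overhead that compiling the functions once (to PTX, or through OpenCL) is meant to remove.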

Writing C code in Visual C++ on VS2010

I appreciate the differences are negligible, but I'm doing some number crunching, so I want to use C. I've just created a project in VS2010, chosen a C++ project, and written some C. It all executes fine, but:
Is this being compiled and run by the fast(er) C compiler, or by the C++ compiler because it's a C++ project?
How can I specify that the code I wish to write is actually C, to be compiled and run as C?
The Visual Studio C++ compiler will treat all .c files as C language files and compile them as such.
Additional reference:
"By default, the Visual C++ compiler treats all files that end in .c as C source code, and all files that end in .cpp as C++ source code. To force the compiler to treat all files as C regardless of file name extension, use the /TC compiler option."
http://msdn.microsoft.com/en-us/library/bb384838.aspx
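If you want to check which language a given file is really being compiled as, one simple trick is to test the __cplusplus macro, which only C++ compilers define. A minimal sketch (the function name compiled_as is just an example):

```c
/* Returns "C++" when this file is built by a C++ compiler and "C"
 * otherwise. __cplusplus is predefined only in C++ mode. */
const char *compiled_as(void) {
#ifdef __cplusplus
    return "C++";
#else
    return "C";
#endif
}
```

Printing the result from main, for instance with printf("%s\n", compiled_as());, should show C for a .c file in the project and C++ for a .cpp file.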
You are just being silly now. C is not guaranteed to be faster than C++ in any way; it's all compiled to native machine instructions in the end. If you want a true performance leap, you should use another compiler (Intel's, for example), use the GPU, or something like that.
What will actually give you more speed is to use Intel's compiler, which is available as a plugin. The real-world differences are significant, especially for number crunching. The difference between C and C++ is dubious.
Here's a good place to start: link text
Since you're number crunching, you should consider using SIMD extensions, if possible. Using SIMD on Intel's compiler, vs. straight MS C compiled code, will give you some serious gain.
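As a rough idea of what using SIMD looks like in C, here is a sketch that adds two float arrays four elements at a time with SSE intrinsics, and falls back to a plain loop where SSE is not available. The function name add_arrays and the macro USE_SSE_SKETCH are made up for this example; the intrinsics themselves (_mm_loadu_ps, _mm_add_ps, _mm_storeu_ps) come from xmmintrin.h.

```c
#include <stddef.h>

/* Enable the SSE path on compilers/targets that advertise SSE support. */
#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define USE_SSE_SKETCH 1
#else
#define USE_SSE_SKETCH 0
#endif

/* Computes out[i] = a[i] + b[i] for every i in [0, n). */
void add_arrays(const float *a, const float *b, float *out, size_t n) {
    size_t i = 0;
#if USE_SSE_SKETCH
    /* Process four floats per iteration; unaligned loads keep it simple. */
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_add_ps(va, vb));
    }
#endif
    /* Scalar loop for the remaining elements (or for everything without SSE). */
    for (; i < n; i++)
        out[i] = a[i] + b[i];
}
```

How much this gains over what the compiler already auto-vectorizes depends on the compiler, the flags, and the data layout, so measure before and after.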

Resources