Using TinyCC (tcc) to generate a C wrapper for V

I am trying to find a basis for generating wrappers/bindings for C libraries to be used from Vlang. While doing so, I remembered that V initially uses TCC for its bootstrap compilation.
Since TCC is a very capable C compiler, I wondered whether I could use its built-in lexer/parser to produce a symbol table of structs, functions, enums and the like, and then iterate over that table to generate V code.
Judging from tcc.h, the API described here looks usable, but I wouldn't be surprised if it were declared internal and thus not fully documented. Where can I find more information about how to use TCC as a plain parser?

I'm sure you've already found some information regarding this, but for posterity, here are some places with information about TCC and using it as a dynamic code generator:
The TCC Git Repo
The Gnu project page
The TCC Development Archive
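The second half of the question's idea (iterating over a symbol table and emitting V code) can be sketched independently of TCC. The example below is only an illustration under stated assumptions: FuncSym is a hypothetical table entry, not one of TCC's internal structures, the types are assumed to be already mapped to V spellings, and the output follows V's `fn C.name(...)` declaration convention as I understand it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_PARAMS 4

/* Hypothetical symbol-table entry; TCC's real internal structures differ. */
typedef struct {
    const char *name;                 /* C function name */
    const char *v_return;             /* return type in V spelling, "" for none */
    const char *v_params[MAX_PARAMS]; /* parameter types in V spelling, NULL-terminated */
} FuncSym;

/* Append text to out if it fits (including the terminator); -1 otherwise. */
static int append(char *out, size_t cap, size_t *len, const char *text)
{
    size_t add = strlen(text);
    if (*len + add + 1 > cap)
        return -1;
    memcpy(out + *len, text, add + 1);
    *len += add;
    return 0;
}

/* Write a declaration such as "fn C.puts(&char) int" into out.
   Returns 0 on success, -1 if the buffer is too small. */
int emit_v_decl(const FuncSym *f, char *out, size_t cap)
{
    size_t len = 0;
    assert(f != NULL && out != NULL && cap > 0);
    out[0] = '\0';
    if (append(out, cap, &len, "fn C.") ||
        append(out, cap, &len, f->name) ||
        append(out, cap, &len, "("))
        return -1;
    for (int i = 0; i < MAX_PARAMS && f->v_params[i] != NULL; i++) {
        if (i > 0 && append(out, cap, &len, ", "))
            return -1;
        if (append(out, cap, &len, f->v_params[i]))
            return -1;
    }
    if (append(out, cap, &len, ")"))
        return -1;
    if (f->v_return[0] != '\0') {
        if (append(out, cap, &len, " ") ||
            append(out, cap, &len, f->v_return))
            return -1;
    }
    return 0;
}
```

A generator would call emit_v_decl once per table entry and write the resulting lines to a .v file. Filling the table from a real parser is the hard part and is not shown here.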

Related

Compilation map

Let's assume a complex project (in C/C++). Is there a way to find out which source files are used to create a specific binary without compiling the project itself?
I know I could just read the Makefile and try to follow the dependency chain like this, but that does not scale well and it can be hard if multiple Makefiles and/or implicit rules are used.
Thanks a lot for your help.
PS: To clarify in response to the first comments, I'm looking for a method that does not need a valid build environment (so compiling, even as a dry run, is not an option).
is there a solution to know which sources files are responsible/used for the creation of a specific binary without compiling the project itself
If you compile with GCC (or perhaps Clang), you could use appropriate preprocessor options like -M to generate the dependencies and keep them in a text file, in a format accepted by build automation tools such as GNU make or ninja. This works well on Linux distributions like Debian.
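Once such a dependency file exists, reading it back is straightforward. The sketch below assumes the usual make-rule shape (`target: prerequisite ...`, with backslash-newline continuations); parse_prereqs and PATH_CAP are names invented for this example, and it deliberately ignores escaped spaces in file names and targets that themselves contain a colon.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define PATH_CAP 256

/* True for whitespace and for the backslash of a backslash-newline pair. */
static int is_gap(const char *p)
{
    if (isspace((unsigned char)*p))
        return 1;
    return *p == '\\' && (p[1] == '\n' || p[1] == '\r');
}

/* Copy the prerequisites of one make-style rule into names[].
   Returns the number of names stored (at most max_names).
   Names longer than PATH_CAP - 1 characters are truncated. */
size_t parse_prereqs(const char *rule, char names[][PATH_CAP], size_t max_names)
{
    assert(rule != NULL && names != NULL);
    const char *p = strchr(rule, ':');   /* first colon ends the target */
    size_t count = 0;
    if (p == NULL)
        return 0;
    p++;
    while (count < max_names) {
        while (*p != '\0' && is_gap(p))  /* skip separators and continuations */
            p++;
        if (*p == '\0')
            break;
        size_t len = 0;
        while (*p != '\0' && !is_gap(p)) {
            if (len + 1 < PATH_CAP)
                names[count][len++] = *p;
            p++;
        }
        names[count][len] = '\0';
        count++;
    }
    return count;
}
```

Given the rule text `main.o: main.c util.h \` continued on the next line with `config.h`, the function stores main.c, util.h and config.h. Note that producing the rule in the first place still requires running the preprocessor with the project's headers available.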
You could also be interested in other builders, including omake, and package managers like opam, urpmi, etc.
You could also be in touch with SoftwareHeritage team.
If you use GCC, you could write your own GCC plugin to maintain these dependencies in your database.
Finally, be aware of Rice's theorem, and think about crazy examples (in C++) like
#if __TIME__[0]=='1'
int something=0;
#else
constexpr int something=1;
#endif
So my current intuition is that your wish is impossible. I could have misunderstood it.
Refer to some C standard like n1570, or to some C++ standard like n3337.
Study the behavior of tools like GNU autoconf.
Think of programs generating C or C++ code like GNU bison, my manydl.c, bismon, SWIG, RefPerSys, ANTLR .... Notice that GCC has many C++ code generators (notably gengtype) and is definitely "a complex project coded in C++".
See also linuxfromscratch.

Libraries that parse code written in C and provide an API

I am implementing a proof of concept application for source-to-source transformation and need a C-parser with an API for manipulating/traversing the C-syntax tree (AST).
I have tried to use clang but I ran into various problems, like not being able to compile the tutorials using libclang, wrong architecture etc. Since this is a proof of concept application, I will defer clang to a different date.
Question
What software or libraries (implemented in any language) can parse C code and provide an API that I can build applications on top of? I looked around, but I could not locate any free parsers.
The platforms I can use are anything on Windows or Mac or Linux, and any parsers written in C/C++/Java/Perl/Python/PHP will work.
You could try one of the available grammars for ANTLR. ANTLR has support for creating tree walkers and you can walk/manipulate the AST manually if necessary. ANTLR V3 has several grammars available including a C preprocessor, ANSI C and GNU C.

Hiding a library within a library

Here's the situation. I have an old legacy library that is broken in many places, but has a lot of important code built in (we do not have the source, just the lib + headers). The functions exposed by this library have to be handled in a "special" way, some post and pre-processing or things go bad. What I'm thinking is to create another library that uses this old library, and exposes a new set of functions that are "safe".
I quickly tried creating this new library, and linked that into the main program. However, it still links to the symbols in the old library that are exposed through the new library.
One thing would obviously be to ask people not to use these functions, but if I could hide them through some way, only exposing the safe functions, that would be even better.
Is it possible? Alternatives?
(It's running on an ARM microcontroller. The file format is ELF, and the OS is an RTOS from Keil, using their compiler.)
[update]
Here's what I ended up doing: I created dummy functions within the new library that use the same prototypes as the ones in the old. I linked the new library into the main program, and if the other developers try to use the "bad" functions from the old library it will break the build with a "Symbol abcd multiply defined (by old_lib.o and new_lib.o)." Good enough for government work...
[update2]
I actually found out that I can manually hide components of a library when linking them in through the IDE =P, which is a much better solution. Sorry for taking up space here.
If you're using the GNU binutils, objcopy can prefix all symbols with a string of your choice. Just use objcopy --prefix-symbols=brokenlib_ old.so new.so (be careful: omitting new.so will cause old.so to be overwritten!)
Now you use brokenlib_foo() to call the original version of foo().
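With the legacy symbols renamed, the "safe" layer becomes a set of thin wrappers that do the required pre- and post-processing. The following is a minimal sketch under assumptions: read_sensor, its signature, the channel range 0..7 and the clamping rule are all made up for illustration, and a stand-in definition of brokenlib_read_sensor is included only so the example links on its own.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the prefixed legacy symbol. In a real project this
   definition would come from the renamed library, not from this file. */
int brokenlib_read_sensor(int channel, int *value)
{
    assert(value != NULL);   /* the imagined legacy code does not tolerate NULL */
    *value = channel * 10;
    return 0;
}

enum { SAFE_OK = 0, SAFE_BAD_ARG = -1, SAFE_LEGACY_ERROR = -2 };

/* Safe wrapper: validate arguments first, call the legacy routine,
   then sanitise the result before handing it to the caller. */
int safe_read_sensor(int channel, int *value)
{
    int raw = 0;

    /* Pre-processing: reject inputs the legacy code cannot handle. */
    if (value == NULL || channel < 0 || channel > 7)
        return SAFE_BAD_ARG;

    if (brokenlib_read_sensor(channel, &raw) != 0)
        return SAFE_LEGACY_ERROR;

    /* Post-processing: clamp readings to a non-negative range. */
    if (raw < 0)
        raw = 0;

    *value = raw;            /* the output is written only on success */
    return SAFE_OK;
}
```

Only safe_read_sensor would appear in the public header. Callers that still try to use the old unprefixed name then fail at link time instead of silently reaching the legacy code.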
If you use libtool to compile and link the library instead of ld, you can provide -export-symbols to control the output symbols, but this will only work if your old library can be statically linked. If it is dynamically linked (.so, .dylib, or .dll), this will not be possible.

Why do we have to give additional arguments to the compiler on Linux when compiling and running C programs?

I implemented semaphores on Linux last year, and for that I had to use -lpthread.
Now, while using the log10() function in C, I searched the Internet and saw that I have to use -lm.
I want to know why these kinds of command-line arguments are necessary on Linux. Is this rule specific to the compiler?
(With the Turbo C compiler on Windows, I never used these kinds of arguments.)
You are instructing the compiler to look for certain libraries and use them when producing the final executable.
When you wrote your threading code, you used threading primitives. These primitives are implemented in a library called pthread; -lpthread tells the linker to use that library. Without this switch the linker cannot produce a valid executable, because the implementation of the threading code is missing.
On the file system the libraries can be found in /usr/lib and /lib (among others). When you look in these directories you will see files that start with the lib prefix, for example libpthreadxxxxxx. You will have to do your own research to figure out what the xxxx means.
The development cycle using Unix-style tools is very granular on the surface. When you use heavyweight IDEs (read: Visual Studio for C++), the IDE implicitly links against loads of standard libraries, so often you do not need to supply the names of the libraries you commonly use. However, when you start doing more advanced programming you will probably have to install external code libraries and configure your IDE to use them. If you were to use threading primitives in Visual Studio, you most likely would not have to tell the compiler where to look for them; Microsoft considers this a common library and every new project implicitly links against it.
A little discussion on GCC
GCC is a very diverse compiler producing code for many different usage scenarios. As such, its developers try to be neutral and do not make assumptions. For example, pthread is a particular implementation of threading primitives. Even though it is now the de facto standard, at least on Linux, it is not the only one; other Unix systems have had different implementations. When such choices exist, it is not fair for the compiler developers to implicitly link against a particular library. They do, however, implicitly link against standard libraries: for example, G++ is just a wrapper command around the internal compiler code. It is a C++ front end, so it implicitly links against an implementation of the C++ standard library. Similarly, the C front end links against the standard C library.
People sometimes do not want to use a certain standard library implementation and want to use another one instead. In such cases you have to explicitly tell the compiler to use the implementation that you provide. Such use cases are very granular and are surface-level issues with G++. In Visual Studio, you would generally have to tinker a lot to make such changes, since it is not an anticipated use case.
Wikipedia will provide you with more information.
The -l option tells gcc which libraries must be used for linking. -lpthread stands for "use the pthread library", and -lm stands for "use the m library", which is the math library. These options are specific to gcc, not to Linux.
By default, gcc only links the C library (libc), which contains the well-known functions printf, scanf, and many more.
log10 lives in a different library called libm, so you need to explicitly tell gcc to link that library with -lm. The same logic applies to -lpthread.
This is purely a backwards, harmful practice. Separating parts of the standard library into separate .so files does nothing but increase load time and memory usage. Good luck getting anyone to change it though... Just accept that you have to do it (and that POSIX specifically allows, but does not require, that an implementation require -lm for using the math functions and -lpthread for using threads, etc.) and move on to more important things.
Or, go pester Drepper about it on the glibc bug tracker/mailing list. He won't change his mind, but if you enjoy flamewars you can get some kicks...

How to use multiple development languages

I program in Delphi (D7 and D2006) on Windows XP (migrating in the near future to Windows 7). I need to use a mathematical library for some of the work I am doing and most of the math libraries (I am inclining towards Mathematica at present) I have looked at will produce compiled C code. Such code will provide specific functionality to my main programs.
I have a very basic question: given this development setup, how do I start using the compiled C code from Delphi? I really need baby steps to get me started on the process.
I've done quite a bit of this with my FE product OrcaFlex. You have two options to link to your C code from Delphi: static or dynamic. I link statically because it makes distribution and versioning much easier. But it's really quite a trick to get it to work statically and you have to rely on a number of undocumented aspects of Delphi.
I suspect that for your needs dynamic linking is best. Basically you need to compile and link your C code into a DLL. I recommend using the Borland C compiler to do this. You can use the free command line version BCC55 to do this. The advantage of using Borland C is that it makes the same assumptions about the 8087 floating point unit as Delphi does. If you build with MSVC then you will find that MS have elected not to raise floating point exceptions. Borland C does raise floating point exceptions. This is a bit of a corner case but it becomes relevant if you are trying to ship a product that you need to be robust.
You should know that the C code will, by default, use the C calling convention and I'd just stick with that. You bring it into Delphi by declaring the external routine as cdecl calling convention.
The other thing you need to take care over is defining a clear interface between the two modules. You need to make sure that exceptions don't cross the module boundary and that you don't pass any special types (e.g. Delphi strings) across the boundary. So for a string use a PChar (or, even better, PAnsiChar or PWideChar, to be sure that it won't change meaning when you upgrade to Delphi 2009 or later).
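To make the interface advice concrete, here is a minimal sketch of what the C side of such a DLL could look like. Everything named here (mathglue.dll, mg_mean, mg_version, the MATHGLUE_* macros) is hypothetical. The functions use the cdecl convention, report failures through return codes rather than exceptions, and exchange only plain C types and caller-owned buffers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#if defined(_WIN32)
#  define MATHGLUE_API  __declspec(dllexport)
#  define MATHGLUE_CALL __cdecl
#else                       /* lets the sketch compile outside Windows */
#  define MATHGLUE_API
#  define MATHGLUE_CALL
#endif

/* Assumed Delphi declaration (the exported name may need adjusting if the
 * C compiler decorates cdecl symbols):
 *   function mg_mean(values: PDouble; count: Integer;
 *                    res: PDouble): Integer; cdecl; external 'mathglue.dll';
 */
MATHGLUE_API int MATHGLUE_CALL mg_mean(const double *values, int count,
                                       double *result)
{
    double sum = 0.0;
    int i;

    /* Debug-build sanity check: Delphi's Integer is 32 bits wide. */
    assert(sizeof(int) == 4);

    if (values == NULL || result == NULL || count <= 0)
        return -1;          /* error code instead of an exception */
    for (i = 0; i < count; i++)
        sum += values[i];
    *result = sum / count;
    return 0;
}

/* Copies a NUL-terminated ANSI string into a caller-owned buffer
 * (PAnsiChar on the Delphi side). Returns the string length, or -1
 * if the buffer is missing or too small. */
MATHGLUE_API int MATHGLUE_CALL mg_version(char *buffer, int buffer_size)
{
    static const char text[] = "mathglue 0.1";

    if (buffer == NULL || buffer_size < (int)sizeof text)
        return -1;
    memcpy(buffer, text, sizeof text);
    return (int)(sizeof text - 1);
}
```

Because the caller allocates every buffer, neither side ever has to free memory that the other side's runtime allocated.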
I have been very happy with the SDL Library from Lohninger (http://www.lohninger.com/mathpack.html). It is written in Delphi and compiles right into your application, so there are no bundling or calling convention problems or floating point usage differences, as discussed by other responses in this thread.
Take a look at what he includes. If you're lucky, your needs will be met by his library and you'll be able to use it!
If you currently have Mathematica installed, go to the documentation centre and look up guide/CLanguageInterface; otherwise that guide is available on the web. Have a good read there.
My understanding is that Mathematica can generate C programs that link up with the Mathematica engine via MathLink if you need full functionality. If you only need lower-level features, it is capable of generating code that can be statically linked with compiled Mathematica libraries, so standalone code is possible.
See the Code Generator documentation.
If you can convert the C programs into DLLs, then accessing such external functions from Delphi is relatively simple with external declarations.
function MathematicaRoutine(const x : double) : double; external 'MyInterface.dll';
There are bound to be a great number of complexities in getting this to work if you need to achieve a static bind, for use where Mathematica is not installed, if indeed it is possible. I have never attempted it.
You can mix your project with Delphi and C++ (Builder) code using RAD Studio. Put the automatically created C code into a C++ Builder file (.cpp) and for the rest add Delphi files.

Resources