Partially pre-compile code (or maybe use a .so library) while leaving another part of the code open to edits - C

I'm trying to do something a bit unusual that, realistically, I'm not sure is even possible under my current constraints, but it's outside my area of knowledge, so it might be. I'll try to make everything clear in the question, but it will be a little broad in scope; the project is too big to get very detailed.
Anyway, I have a C codebase (we'll call it bar) that is rather large and takes a while to compile. Not a huge deal normally, but there is now a set of files that changes often, and the changes can only be confirmed good by running a compile. Because of how these files are changed, people could end up running multiple compiles in a day, which takes quite a lot of time.
What I want, broadly, is to only have to compile the set of files that might change (about 20, all in one directory, which we'll call foo) and have everything else (bar and everything under it except foo) ready beforehand. Initially I was looking at a .so library for the task, but I'm no longer sure that's the right approach. Either way, it still seemed reasonably possible until I realized that some of the files in foo are included by other files in bar. Mostly the files in foo only include other files and are an end point, not being included by anything, but with a few of them being included I'm not sure what can be done.
My two thoughts are to generate a .so library of everything outside foo that somehow still checks the needed included files at compile time, or to set up some kind of general pre-compilation. Neither of these seems like it would work at a glance, but I could very well be wrong.
A third option, less ideal but better than nothing, is to generate the .so library with everything, including any files in foo that are needed at that point, leaving out only the files that aren't included anywhere. It seems like this would work better, though even if it would, I'm still not sure how to go about it.
So basically, is there a way to do what I want to some extent, and if so what is the best method?
Sorry about the broadness of the question; the codebase is too large to provide much detail. I will try to edit in any information that people think is needed, though. Thanks for the help.
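For what it's worth, here is a minimal sketch of the shared-library idea, with made-up paths (bar_src/ standing in for "everything except foo"); it assumes a Unix-like gcc toolchain and is only meant to show the shape of the approach, not a working build for the real project.

# Built once, or only when something outside foo changes:
gcc -fPIC -c bar_src/*.c              # "bar_src" is a placeholder for all of bar except foo
gcc -shared -o libbar.so *.o

# Rebuilt on every edit: only the ~20 files in foo, linked against the prebuilt library
gcc foo/*.c -L. -lbar -Wl,-rpath,'$ORIGIN' -o app

The catch the question already identifies remains: the headers in foo that bar includes still have to exist (and be stable) when libbar.so is built, so the split only pays off for changes in foo that don't touch anything bar depends on.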

Related

What's the main reason for the make -t command?

After making the mistake of replying to a post from 2010 (the link to it: What does it mean to 'touch' a target in make?), I would like to understand something as a new student of the C programming language who recently learned about makefiles, the make command, and so on. I got a question in class about what the make -t command does, and why I should use it (or not, for that matter).
I understand that make -t only creates the files (the relevant ones, according to my makefile and build script). But why should I ever use it?
I mean, if it just creates the files without actually doing anything with them, why bother?
(I'm using Ubuntu 20.04.4 LTS, gcc for compiling, and writing my code in C language)
As mentioned in the comments, you should pretty much never use this.
It is "useful" in very limited situations such as: you know for a fact that your entire build is is correct and up to date, then something happens such that the timestamps on your files get all messed up. Maybe some tool went in and tweaked a comment in every file (maybe something changed the copyright year in every file), or maybe you copied the build tree somewhere but forgot to preserve the modification time, or whatever.
Then you can run make -t to "bring back" the relative timestamps of your files so that make understands everything is up to date, without actually building anything.
Back in the day when builds were a lot slower and there were more opportunities to mess up timestamps maybe this was more useful.
These days it's better to just run make without -t: yes, you'll have to rebuild a bunch of files, but that's much safer than assuring make that everything is up to date when you might be wrong.
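As a concrete (hypothetical) illustration of the scenario above:

# A tool just rewrote the copyright year in every source file, so the
# sources now look newer than the objects even though nothing real changed.
make -t      # touch the targets so they look newer than the sources again
make         # should now report that there is nothing to do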

Why use separate source files?

I'm learning C. Coming from a scripting-language background, I find it highly intriguing and rather confusing.
A brief story of how I got to this question:
At first I was confused about why I can't include a source (.c) file in another source file; then I found out that the function definitions end up repeated. Then I found out about header files (.h) and was confused about why I have to declare a function in one file and define it in another, so that if something changes I have to edit two files, which is why I started defining functions in header files. Then I found out that #ifndef doesn't protect against duplicates across separate source files, so here's the question I can't yet find an answer to:
Why do I even have to use separate source files? Why can't I just have 1 source file and put all of my other code/function definitions in header files, this way I'm going to have things defined once and included once in the final build?
Now don't get me wrong, I'm not thinking I'll start a revolution, I'm just looking for answers as to why this is not how it works.
If you think beyond small learning programs, there are several benefits to splitting code into multiple source files.
Code Organization
Large programming projects can have millions of lines of code. You don't want to have files that big! Editors will probably have trouble handling it. Human beings will have trouble understanding it. Multiple developers would have conflicts all touching the same file. If you separate the code by purpose, it will be much easier to handle.
Build Times
Many code changes are small, while compilation time can be expensive. Compilers typically work on a file at a time, not parts of files. So if you make a tiny change and then have to rebuild the entire project, that can be very time consuming. If your code is separated into multiple source files, making a change in one of them means you only have to recompile that file.
Reusability
Frequently, code can be reused for more than one program. If you have all your code in one source file, you'll have to go and copy that code into another file to reuse it. Of course, now if it has a bug you have two places to fix it. Not good.
Let's say, for example, you have code that uses a linked list. If you put the linked list code into its own source file, you can then simply link that into another program. If there's a bug, you can fix it in one place, recompile, and then re-link the programs that use it.
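A minimal sketch of that (file and function names invented for the example):

/* list.h - the interface every program shares */
#ifndef LIST_H
#define LIST_H
struct node { int value; struct node *next; };
struct node *list_push(struct node *head, int value);
#endif

/* list.c - the one place the implementation lives */
#include <stdlib.h>
#include "list.h"

struct node *list_push(struct node *head, int value)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return head;        /* on allocation failure, leave the list unchanged */
    n->value = value;
    n->next  = head;
    return n;
}

Compile list.c once (gcc -c list.c), and every program that needs a linked list just links list.o; a bug fix means recompiling list.c and re-linking, nothing more.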
You can use a single source file for some (small) projects.
For many projects though, it makes sense to divide the source in different source files according to their function.
Let's say you're making a game.
Have all the user interface code in its own source file.
Have all the computer move algorithms in their own source file.
...
Have the main() function, which ties it all together, in its own source file.
Then, to compile for PC you do gcc game.c algo.c ui-pc.c; to compile for Android you do gcc game.c algo.c ui-android.c ...; and to try a brand new algorithm you thought up and don't know if it's good, gcc game.c algo-test.c ui-pc.c.
Header files help keep everything in sync. And they're a good place for documentation.
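For instance (hypothetical names), a single ui.h can describe the interface that both ui-pc.c and ui-android.c must provide, so game.c compiles unchanged against either:

/* ui.h - shared interface; each ui-*.c file defines these for its platform */
#ifndef UI_H
#define UI_H
void ui_draw_board(void);
int  ui_get_move(void);
#endif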

Selective compilation of source code

I am working on a C project which is quite large and consists of multiple source files. I have written a script to find out all the functions in this code that are never used (Only defined once but never used elsewhere). Now I want to compile my code without including these functions. Is there any direct way to exclude certain functions from a compilation?
I understand that I can use #ifdef/#endif for each of these functions and leave them out, but inserting these at the right location using a script is turning out to be really challenging, hence the question.
PS: I have already used all the compiler/linker optimizations, and this exercise is supposed to go beyond those (no optimization has been successful in removing 100% of the dead code, and I don't expect it to). So I am not really looking for answers in that area.
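For reference, the #ifdef pattern mentioned above would look roughly like this around each unused function (macro and function names invented for the example):

#ifdef KEEP_UNUSED_CODE        /* hypothetical switch: compile with -DKEEP_UNUSED_CODE to keep it */
int rarely_used_helper(int x)
{
    return x * 2;
}
#endif

When the macro is not defined, the function is silently dropped from the build.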

Any good reason to #include source (*.c *.cpp) files?

I've been working for some time with an open-source library ("Fast Artificial Neural Network"). I'm using its source in my static library. When I compile it, however, I get hundreds of linker warnings, which are probably caused by the fact that the library includes its *.c files in other *.c files (I'm only including some headers I need, and I did not touch the code of the lib itself).
My question: Is there a good reason why the developers of the library used this approach, which is strongly discouraged? (Or at least I've been told all my life that this is bad, and from my own experience I believe it IS bad.) Or is it just bad design with no gain from this approach?
I'm aware of this related question but it does not answer my question. I'm looking for reasons that might justify this.
A bonus question: Is there a way to fix this without touching the library code too much? I have a lot of work of my own and don't want to create more ;)
As far as I see (grep '#include .*\.c'), they only do this in doublefann.c, fixedfann.c, and floatfann.c, and each time include the reason:
/* Easy way to allow for build of multiple binaries */
This exact use of the preprocessor for simple copy-pasting is indeed the only valid use of including implementation (*.c) files, and it is relatively rare. (If you want to include some code for another reason, just give it a different name, like *.h or *.inc.) An alternative is to specify the configuration in macros given to the compiler (e.g. -DFANN_DOUBLE, -DFANN_FIXED, or -DFANN_FLOAT), but they didn't use this method. (Each approach has drawbacks, so I'm not saying they're necessarily wrong; I'd have to look at that project in depth to determine that.)
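A generic sketch of that pattern (not the library's actual code; names invented here): a single implementation file is textually included by thin per-variant wrappers, and each wrapper is compiled into its own object or binary.

/* impl.c - shared implementation, never compiled on its own */
number_t add(number_t a, number_t b) { return a + b; }

/* doublevariant.c - compiled into the "double" build */
typedef double number_t;
#include "impl.c"

/* floatvariant.c - compiled into the "float" build */
typedef float number_t;
#include "impl.c"

Compiling both wrappers and linking them into the same binary would, of course, produce exactly the kind of conflicting definitions discussed next.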
They provide makefiles and MSVS projects which should already avoid linking doublefann.o (from doublefann.c) with fann.o (from fann.c), fixedfann.o (from fixedfann.c), and so on; if you are seeing these warnings, either those project files are screwed up or something similar has gone wrong.
Did you try to create a project from scratch (or use your existing project) and add all the files to it? If you did, what is happening is each implementation file is being compiled independently and the resulting object files contain conflicting definitions. This is the standard way to deal with implementation files and many tools assume it. The only possible solution is to fix the project settings to not link these together. (Okay, you could drastically change their source too, but that's not really a solution.)
While you're at it, if you continue without using their project settings, you can likely skip compiling fann.c et al.; possibly just removing those from the project is enough, in which case they won't be compiled and linked. You'll want to choose exactly one of double-/fixed-/floatfann to use, otherwise you'll get the same link errors. (I haven't looked at their instructions, but I would not be surprised to see this explained a bit more in depth there.)
Including C/C++ code leads to all the code being stuck together in one translation unit. With a good compiler, this can lead to a massive speed boost (as stuff can be inlined and function calls optimized away).
If actual code is going to be included like this, though, it should have static in most of its declarations, or it will cause the warnings you're seeing.
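A tiny illustration of the static point (made-up name): anything in a file meant for textual inclusion should be static so that two includers don't export clashing symbols.

/* clamp.inc - intended only for #include, so the helper stays file-local */
static int clamp(int v, int lo, int hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}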
If you ever declare a single global variable or function in that .c file, it cannot be included in two places which both compile to the same binary, or the two definitions will collide. If it is included in even one place, it cannot also be compiled on its own while still being linked into the same binary as its user.
If the file is only included in one place, why not just make it a discrete compilation unit (and use its globals via extern declarations)? Why bother having it included at all?
If your C files declare no global variables or functions, they are header files and should be named as such.
Therefore, by exhaustive search, I can say that the only time you would ever potentially want to include C files is if the same C code is used in building multiple different binaries. And even there, you're increasing your compile time for no real gain.
This is assuming that functions which should be inlined are marked inline and that you have a decent compiler and linker.
I don't know of a quick way to fix this.
I don't know that library, but as you describe it, it is either bad practice or your understanding of how to use it is not good enough.
A C project that wants to be included by others should always provide well-structured .h files for them and then the compiled library for linking. If it wants to include function definitions in header files, it should mark them either as static (old-fashioned) or as inline (possible since C99).
I haven't looked at the code, but it's possible that the .c or .cpp files being included actually contain code that works in a header. For example, a template or an inline function. If that is the case, then the warnings would be spurious.
I'm doing this at the moment at home because I'm a relative newcomer to C++ on Linux and don't want to get bogged down in difficulties with the linker. But I wouldn't recommend it for proper work.
(I also once had to include a header.dat into a C++ program, because Rational Rose didn't allow headers to be part of the issued software and we needed that particular source file on the running system (for arcane reasons).)

Find header file that defines a C function

Shouldn't be hard, right? Right?
I am currently trawling the OpenAFS codebase to find the header definition of pioctl. I've thrown everything I've got at it: checked ctags, grepped the source code for pioctl, etc. The closest I've got to a lead is the fact that there's a file pioctl_nt.h that contains the definition, except it's not actually what I want because none of the userspace code directly includes it, and it's Windows specific.
Now, I'm not expecting you to go and download the OpenAFS codebase and find the header file for me. I am curious, though: what are your techniques for finding the header file you need when everything else fails? What are the worst case scenarios that could cause a grep for pioctl in the codebase to not actually come up with anything that looks like a function definition?
I should also note that I have access to two independent userspace programs that have done it properly, so in theory I could do an O(n) search for the function. But none of the header files pop out to me, and n is large...
Edit: The immediate issue has been resolved: pioctl() is defined implicitly, as shown by this:
AFS.xs:2796: error: implicit declaration of function ‘pioctl’
If grep -r and ctags are failing, then it's probably being defined as the result of some nasty macro(s). You can try making the simplest possible file that calls pioctl() and compiles successfully, and then preprocessing it to see what happens:
gcc -E test.c -o test.i
grep pioctl -C10 test.i
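A sketch of what such a test.c might look like (the includes are deliberately left to you, since the whole point is to pull in exactly the headers the real calling code uses):

/* test.c - smallest thing that calls pioctl(); add the same #include
 * lines the real calling code uses, then preprocess and grep the output. */
int main(void)
{
    return pioctl((char *)0, 0, (void *)0, 0);
}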
There are compiler options to show the preprocessor output. Try those? In a horrible pinch, when my head was completely empty of any possible definition, the -E option (in most C compilers) does nothing but spew out the preprocessed code.
Per the requested information: normally I just capture a compile of the file in question as it is output on the screen, do a quick copy and paste, and put the -E right after the compiler invocation. The result will spew the preprocessor output to the screen, so redirect it to a file. Look through that file; all of the macros and silly things are already taken care of.
Worst case scenarios:
K&R style prototypes
Macros are hiding the definition
Implicit Declaration (per your answer)
Have you considered using cscope (available from SourceForge)?
I use it on some fairly significant code sets (25,000+ files, ranging up to about 20,000 lines in a file) with good success. It takes a while to derive the file list (5-10 minutes) and longer (20-30 minutes) to build the cross-reference on an ancient Sun E450, but I find the results useful.
On an almost equally ancient Mac (dual 1GHz PPC 32-bit processors), cscope run on the OpenAFS (1.5.59) source code comes up with quite a lot of places where the function is declared, sometimes inline in code, sometimes in headers. It took a few minutes to scan the 4949 files, generating a 58 MB cscope.out file.
openafs-1.5.59/src/sys/sys_prototypes.h
openafs-1.5.59/src/aklog/aklog_main.c (along with comment "Why doesn't AFS provide these prototypes?")
openafs-1.5.59/src/sys/pioctl_nt.h
openafs-1.5.59/src/auth/ktc.c includes a define for PIOCTL
openafs-1.5.59/src/sys/pioctl_nt.c provides an implementation of it
openafs-1.5.59/src/sys/rmtsysc.c provides an implementation of it (and sometimes afs_pioctl() instead)
The rest of the 184 instances found seem to be uses of the function, or documentation references, or release notes, change logs, and the like.
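For anyone wanting to reproduce this, the usual sequence is roughly the following (flags from memory; check the cscope man page):

cd openafs-1.5.59
find . -name '*.c' -o -name '*.h' > cscope.files   # derive the file list
cscope -b -q                                       # build the cross-reference (cscope.out)
cscope -d                                          # browse it; search for the C symbol pioctl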
The current working theory that we've decided on, after poking at the preprocessor and not finding anything either, is that OpenAFS is letting the compiler infer the prototype of the function, since it returns an integer and takes pointer, integer, pointer, integer as its parameters. I'll be dealing with this by merely defining it myself.
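In other words, something along these lines (parameter names and exact types are guesses based on that shape; verify against the OpenAFS source before trusting it):

/* hand-written declaration matching the inferred shape:
 * int return; pointer, integer, pointer, integer parameters */
extern int pioctl(char *path, int cmd, void *data, int follow);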
Edit: Excellent! I've found the smoking gun:
AFS.xs:2796: error: implicit declaration of function ‘pioctl’
While the original general question has been answered, if anyone arrives at this page wondering where to find a header file that defines pioctl:
In current releases of OpenAFS (1.6.7), a prototype for pioctl is defined in sys_prototypes.h. But at the time this question was originally asked, that file did not exist, and there was no prototype for pioctl visible from outside the OpenAFS code tree.
However, most users of pioctl probably want, or are at least okay with using, lpioctl ("local" pioctl), which always issues a syscall on the local machine. There is a prototype for this in afssyscalls.h (and these days, also sys_prototypes.h).
The easiest option these days, though, is just to use libkopenafs. For that, include kopenafs.h, use the function k_pioctl, and link against -lkopenafs. That tends to be a much more convenient interface than trying to link with OpenAFS libsys and other stuff.
Doesn't it usually say in the man page synopsis?
