My C project uses preprocessor directives to activate / deactivate some features. It's not unusual to find that some of the less common configurations no longer compile because of a change made a few days earlier inside an #ifdef.
We use a script to compile the most common configurations, but I'm looking for a tool to ensure everything still compiles (testing is not a problem in our case; we just want to detect as soon as possible that nothing has stopped compiling). Usually the ifdefs / ifndefs are independent, so normally each module has to be compiled just twice (all symbols defined, all undefined). But sometimes the ifdefs are nested, so those modules have to be compiled more times.
Do you know of any tool that searches all ifdefs / ifndefs (including nested ones) and reports how many times a module has to be compiled (with the set of preprocessor symbols to define in each pass) to ensure every single source line is analyzed by the compiler?
I am not aware of any tool for doing what you want to do. But looking at your problem, I think all you need is a script that will compile the source with all possible combinations of the preprocessor symbols.
Say, for example, you have:
#ifdef A
CallFnA();
#ifdef B
CallFnB();
#endif
CallFnC();
#endif
You will have to trigger the build with the following combinations:
A and B both defined
A not defined and B defined (makes no sense here, but is required to cover the entire module)
A defined and B not defined
A and B both not defined
Would love to see some script that will grep the source code and produce the combinations.
Something like
find ./ -name '*.cpp' -exec egrep -h '^#ifdef' {} \; | awk '{ print $2}' | sort | uniq
With *.cpp replaced with whatever files you want to search.
Here's a Perl script that does a hacky job of parsing #ifdef entries and assembles a list of the symbols used in a particular file. It then prints out the Cartesian Product of all the possible combinations of having that symbol on or off. This works for a C++ project of mine, and might require minor tweaking for your setup.
#!/usr/bin/perl
use strict;
use warnings;

use File::Find;

my $path = $ENV{PWD};
my $symbol_map = {};
find( make_ifdef_processor( $symbol_map ), $path );

foreach my $fn ( keys %$symbol_map ) {
    my @symbols = @{ $symbol_map->{$fn} };

    my @options;
    foreach my $symbol (@symbols) {
        push @options, [
            "-D$symbol=0",
            "-D$symbol=1"
        ];
    }

    my @combinations = @{ cartesian( @options ) };
    foreach my $combination (@combinations) {
        print "compile $fn with these symbols defined:\n";
        print "\t", join ' ', ( @$combination );
        print "\n";
    }
}

sub make_ifdef_processor {
    my $map_symbols = shift;

    return sub {
        my $fn = $_;
        if ( $fn =~ /svn-base/ ) {
            return;
        }

        open FILE, "<$fn" or die "Error opening file $fn ($!)";
        while ( my $line = <FILE> ) {
            if ( $line =~ /^\/\// ) {    # skip C++-style // comments
                next;
            }
            if ( $line =~ /#ifdef\s+(.*)$/ ) {
                print "matched line $line\n";
                my $symbol = $1;
                push @{ $map_symbols->{$fn} }, $symbol;
            }
        }
        close FILE;
    };
}

sub cartesian {
    my $first_set = shift @_;
    my @product = map { [ $_ ] } @$first_set;

    foreach my $set (@_) {
        my @new_product;
        foreach my $s (@$set) {
            foreach my $list (@product) {
                push @new_product, [ @$list, $s ];
            }
        }
        @product = @new_product;
    }

    return \@product;
}
This will definitely fail with C-style /* */ comments, as I didn't bother to parse those effectively. The other thing to think about is that it might not make sense for all of the symbol combinations to be tested, and you might build that into the script or your testing server. For example, you might have mutually exclusive symbols for specifying a platform:
-DMAC
-DLINUX
-DWINDOWS
Testing the combinations of having these on and off doesn't really make sense. One quick solution is just to compile all combinations and accept that some will fail. Your correctness check is then that compilation fails and succeeds for the same combinations every time.
The other thing to remember is that not all combinations are meaningful, because many of the symbols never appear nested within one another. I think compilation is relatively cheap, but the number of combinations can grow very quickly if you're not careful. You could make the script work out which symbols appear in the same control structure (nested #ifdefs, for example), but that's much harder to implement and I've not done that here.
You can use unifdef -s to get a list of all preprocessor symbols used in preprocessor conditionals. Based on the discussion around the other answers this is clearly not quite the information you need, but if you use the -d option as well, the debugging output includes the nesting level. It should be fairly simple to filter the output to produce the symbol combinations you want.
Another solution is to compile in all the features and have run-time configuration testing for the features. This is a cool "trick" since it allows Marketing to sell different configurations and saves Engineering time by simply setting values in a configuration file.
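As a rough illustration of that approach (the struct, field names, and config-file format here are made up), each compile-time #ifdef becomes an ordinary run-time flag:
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* All features are always compiled in; a config file decides what actually runs. */
struct config {
    bool feature_logging;
    bool feature_network;
};

/* Hypothetical loader: expects one "name=1" entry per line. */
static void load_config(struct config *cfg, const char *path)
{
    char line[128];
    FILE *f = fopen(path, "r");
    if (!f)
        return;                         /* keep defaults if there is no file */
    while (fgets(line, sizeof line, f)) {
        if (strncmp(line, "logging=1", 9) == 0)
            cfg->feature_logging = true;
        else if (strncmp(line, "network=1", 9) == 0)
            cfg->feature_network = true;
    }
    fclose(f);
}

int main(void)
{
    struct config cfg = {0};
    load_config(&cfg, "features.conf");
    if (cfg.feature_logging)            /* run-time check instead of #ifdef LOGGING */
        puts("logging enabled");
    if (cfg.feature_network)
        puts("network enabled");
    return 0;
}
The trade-off is that the code for every feature must always compile (which is exactly the property being asked for here) and that disabled features still ship in the binary.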
Otherwise, I suggest a scripting language for building all the configurations.
Sorry, I don't know any tools to help you, but if I had to do this I would go with a simple script that does this:
- Copy all source files to another place,
- Add a running number (in a comment, obviously) at the start of each line (first code line of first file = 1, do not reset between files),
- Preprocess the sources using each of the pre-defined configurations and record which lines were included,
- Check which lines were included at least once and which are still missing.
Shouldn't take more than a couple of days to get that running using e.g. Perl or Python. What is required is a file listing the configurations (the sets of symbols) to preprocess with. That should be quick enough to build up with the help of this script: just check which lines are not yet covered by the existing configurations and edit the file until every line is included. Then run the script occasionally to make sure no new uncovered lines have appeared.
Analyzing the source like you want would be a much more complex script and I'm not sure if it would be worth the trouble.
Hm, I initially thought that unifdef might be helpful, but looking further into what you're asking for, no, it wouldn't be of any immediate help.
You can use Hudson with a matrix project. (Note that Hudson is not just a Java testing tool, but a very flexible build server which can build just about anything, including C/C++ projects). If you set up a matrix project, you get the option to create one or more axes. For each axis you can specify one or more values. Hudson will then run your build using all possible combinations of the variable values. For example, if you specify
os=windows,linux,osx
wordsize=32,64
Hudson will build six combinations: 32- and 64-bit versions for each of windows, linux, and osx. When building a "free-style project" (i.e. launching an external build script), the configurations are passed as environment variables. In the example above, "os" and "wordsize" will be set as environment variables.
Hudson also has support for filtering combinations in order to avoid building certain invalid combinations (for example, windows 64-bit can be removed, but all others kept).
(Edited post to provide more details about matrix projects.)
Maybe you need grep?
Related
For a small test framework, I want to do automatic test discovery. Right now, my plan is that all tests just have a prefix, which could basically be implemented like this
#define TEST(name) void TEST_##name(void)
And be used like this (in different C files):
TEST(one_eq_one) { assert(1 == 1); }
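Expanded by the preprocessor, that example is just an ordinary function definition:
#include <assert.h>

/* TEST(one_eq_one) { assert(1 == 1); } expands to: */
void TEST_one_eq_one(void) { assert(1 == 1); }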
The ugly part is that you would need to list all test-names again in the main function.
Instead of doing that, I want to collect all tests in a library (say lib-my-unit-tests.so) and generate the main function automatically, and then just link the generated main function against the library. All of this internal action can be hidden nicely with cmake.
So, I need a script that does:
1. Write "int main(void) {"
2. For all functions $f starting with 'TEST_' in lib-my-unit-tests.so do
a) write "extern void $f(void);"
b) write "$f();"
3. Write "}"
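For the single example test above, the generated file would look roughly like this (just a sketch; the exact layout doesn't matter):
/* generated_source.c -- sketch of what the generator script would emit */
extern void TEST_one_eq_one(void);

int main(void)
{
    TEST_one_eq_one();
    return 0;
}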
Most parts of that script are easy, but I am unsure how to reliably get a list of all functions starting with the prefix.
On POSIX systems, I can try to parse the output of nm. But here, I am not sure if the names will always be the same (on my MacBook, all names start with an additional '_'). To me, it looks like the names generated for the binary might be OS/architecture-dependent. For Windows, I do not yet have an idea of how to do that.
So, my questions are:
Is there a better way to implement test-discovery in C? (maybe something like dlsym)
How do I reliably get a list of all function names starting with a certain prefix on macOS/Linux/Windows?
A partial solution for the problem is parsing nm with a regex:
for line in $(nm "$1") ; do
    # Finds all functions starting with "TEST_" or "_TEST_"
    if [[ $line =~ ^_?(TEST_.*)$ ]] ; then
        echo "${BASH_REMATCH[1]}"
    fi
done
And then a second script consumes this output to generate a C file that calls these functions. Then CMake calls the second script to create the test executable:
add_executable(test-executable generated_source.c)
target_link_libraries(test-executable PRIVATE library_with_test_functions)
add_custom_command(
    OUTPUT generated_source.c
    COMMAND second_script.sh library_with_test_functions.so > generated_source.c
    DEPENDS second_script.sh library_with_test_functions)
I think this works on POSIX systems, but I don't know how to solve it for Windows
You can write a shell script using the nm or objdump utilities to list the symbols, pipe through awk to select the appropriate name and output the desired source lines.
For example, I have two C binary executable files. How can I determine whether the two were generated from the same source code or not?
In general, this is completely impossible to do.
You can generate different binaries from the same source
Two identical binaries can be generated from different sources
It is possible to add version information in different ways. However, you can fool all of those methods quite easily if you want.
Here is a short script that might help you. Note that it might have flaws; it's just to show the idea. Don't just copy this and use it in production code.
#!/bin/bash
# Append the md5sum of the source file as an embedded string, then compile.
STR="asm(\".ascii \\\"$(md5sum "$1")\\\"\");"
NEWNAME="$1.aux.c"
cp "$1" "$NEWNAME"
echo "$STR" >> "$NEWNAME"
gcc "$NEWNAME"
What it does is basically to make sure that the md5sum of the source gets included as a string in the binary. It's gcc specific, and you can read more about the idea here: embed string via header that cannot be optimized away
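To make the effect concrete, the line the script appends to the copied source looks roughly like this (the hash and file name below are placeholders, not real values):
/* appended by the script; the hash and file name are placeholders */
asm(".ascii \"0123456789abcdef0123456789abcdef  prog.c\"");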
The h2ph utility generates a .ph "Perl header" file from a C header file, but what is the best way to use this file? Like, should it be require or use?:
require 'myconstants.ph';
# OR
use myconstants; # after mv myconstants.ph myconstants.pm
# OR, something else?
Right now, I am doing the use version shown above, because with that one I never need to type parentheses after the constant. I want to type MY_CONSTANT and not MY_CONSTANT(), and I have use strict and use warnings in effect in the Perl files where I need the constants.
It's a bit strange though to do a use with this file since it doesn't have a module name declared, and it doesn't seem to be particularly intended to be a module.
I have just one file I am running through h2ph, not a hundred or anything.
I've looked at perldoc h2ph, but it didn't mention the subject of the intended mechanism of import at all.
Example input and output: For further background, here's an example input file and what h2ph generates from it:
// File myconstants.h
#define MY_CONSTANT 42
...
# File myconstants.ph - generated via h2ph -d . myconstants.h
require '_h2ph_pre.ph';
no warnings qw(redefine misc);
eval 'sub MY_CONSTANT () {42;}' unless defined(&MY_CONSTANT);
1;
Problem example: Here's an example of "the problem," where I need to use parentheses to get the code to compile with use strict:
use strict;
use warnings;
require 'myconstants.ph';
sub main {
    print "Hello world " . MY_CONSTANT; # error until parentheses are added
}
main;
which produces the following error:
Bareword "MY_CONSTANT" not allowed while "strict subs" in use at main.pl line 7.
Execution of main.pl aborted due to compilation errors.
Conclusion: So is there a better or more typical way that this is used, as far as following best practices for importing a file like myconstants.ph? How would Larry Wall do it?
You should require your file. As you have discovered, use accepts only a bareword module name, and it is wrong to rename myconstants.ph to have a .pm suffix just so that use works.
The choice of use or require makes no difference to whether parentheses are needed when you use a constant in your code. The resulting .ph file defines constants in the same way as the constant module, and all you need in the huge majority of cases is the bare identifier. One exception to this is when you are using the constant as a hash key, when
my %hash = ( CONSTANT => 99 );
my $val  = $hash{CONSTANT};
doesn't work, as you are using the string CONSTANT as a key. Instead, you must write
my %hash = ( CONSTANT() => 99 );
my $val  = $hash{CONSTANT()};
You may also want to wrap your require inside a BEGIN block, like this
BEGIN {
    require 'myconstants.ph';
}
to make sure that the values are available to all other parts of your code, including anything in subsequent BEGIN blocks.
The problem lies partly in the require itself.
Since require is a statement that is evaluated at run-time, it cannot have any effect on the parsing of the later part of the script. So when perl parses the MY_CONSTANT in the print statement, it does not yet know of the existence of the subroutine, and parses it as a bareword.
It is the same for eval.
One solution, as mentioned by others, is to put it into a BEGIN block. Alternatively, you may forward-declare it yourself:
require 'some-file';
sub MY_CONSTANT;
print 'some text' . MY_CONSTANT;
Finally, from my perspective, I have never used any .ph files in my Perl programming.
I would like to check the syntax of my Perl module (including its imports), but I don't want it to load the dynamically loaded C libraries.
If I do:
perl -c path_to_module
I get:
Can't locate loadable object for module B::Hooks::OP::Check in @INC
because B::Hooks::OP::Check loads some dynamic C libraries and I don't want to check that...
You can't.
Modules can affect the scripts that use them in many ways, including how they are parsed.
For example, if a module exports
sub f() { }
Then
my $f = f+4;
means
my $f = f() + 4;
But if it were to export
sub f { }
the same code means
my $f = f(+4);
As such, modules must be loaded in order to parse the script that loads them. To load a module is simply to execute it, be it written in Perl or C.
That said, some folks put together PPI to address the needs of people like you. It's not perfect (it can't be perfect, for the reasons previously stated), but it will give useful results nonetheless.
By the way, the proper way to syntax check a module is
perl -e'use Module;'
Using -c can give errors where none exist, and vice versa.
The syntax checker loads the included libraries because they might be applying changes to the syntax. If you're certain that this is not happening, you could prevent the inclusion by manipulating the load path and providing a fake B::Hooks::OP::Check.
For testing purposes, I need to share some definitions between Tcl and C. Is it possible to include a C-style include file in Tcl scripts? Any alternative suggestion will be welcome, but I prefer not to write a parser for C header files.
SWIG supports Tcl so possibly you can make use of that. Also I remember seeing some code to parse C headers on the Tcl wiki - so you might try looking at the Parsing C page there. That should save you writing one from scratch.
If you're doing a full API of any complexity, you'd be best off using SWIG (or critcl†) to do the binding. SWIG can do a binding between a C API and Tcl with very little user input (often almost none). It should be noted though that the APIs it produces are not very natural from a Tcl perspective (because Tcl isn't C and has different idioms).
Yet if you are instead after some way of handling just the simplest parts of definitions — just the #defines of numeric constants — then the simplest way to deal with that is via a bit of regular expression parsing:
proc getDefsFromIncludeFile {filename} {
    set defs {}
    set f [open $filename]
    foreach line [split [read $f] "\n"] {
        # Doesn't handle all edge cases, but does do a decent job
        if {[regexp {^\s*#\s*define\s+(\w+)\s+([^\s\\]+)} $line -> def val]} {
            lappend defs $def [string trim $val "()"]
        }
    }
    close $f
    return $defs
}
It does a reasonably creditable job on Tcl's own headers. (Handling conditional definitions and nested #include statements is left as an exercise; I suggest you try to arrange your C headers so as to make that exercise unnecessary.) When I do that, the first few definitions extracted are:
TCL_ALPHA_RELEASE 0 TCL_BETA_RELEASE 1 TCL_FINAL_RELEASE 2 TCL_MAJOR_VERSION 8
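As a smaller illustration, given a header like this (the names and values are made up):
/* sample.h -- hypothetical input for getDefsFromIncludeFile */
#define BUF_SIZE  1024
#define MAX_USERS (64)
the call getDefsFromIncludeFile sample.h returns the list BUF_SIZE 1024 MAX_USERS 64; the string trim strips the parentheses around grouped values.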
† Critcl is a different way of doing Tcl/C bindings, and it works by embedding the C language inside Tcl. It can produce very natural-working Tcl interfaces to C code; I think it's great. However, I don't think it's likely to be useful for what you are trying to do.