I have C function prototypes (certain windows api header files) that look like:
int
foo
(
int
a
,
int
*
b
)
;
(they seem to have no coding convention)
which I am trying to programmatically turn into a one-line prototype of the form (or something close to it):
int foo(int a, int * b);
I have looked into programs like ctags (see "ctags multi-line C function prototypes") and into various settings in uncrustify (http://uncrustify.sourceforge.net/), but I haven't been able to make any headway with either. Any insight would be great; perhaps one of the 385 uncrustify options that I missed does what I want.
Programmatically, I am trying to look for unique markers that signify a function prototype so that I can write a script that will format the code to my liking.
Without using a lexer and a parser this seems like it could get very convoluted very quickly; any suggestions?
run them through indent -kr or astyle --style=kr
Solution using vim?
Put the cursor on int and type 11J, which joins the 11 lines of the prototype into one.
sed ':a;N;$!ba;s/\n/ /g' prototypes.file | sed 's/; */;\n/g'
The first command - before the pipe - will replace all new-lines to spaces, and the next will put a new-line back after every semicolon.
Of course, this will only work if there is nothing but these prototypes in the file. If there is other content that you want to keep as it is, you can use vim's visual selection and two substitution commands:
Select the region you want to join, then
:s/\n/ /
Select the joined line and
:s/; */;\r/g
Another solution using vi:
Do a regex search-and-replace that turns every newline into a space (simply deleting the newlines would fuse tokens such as int and foo). Then take the resulting mess and do another search-and-replace that turns each ; into ; \n\n. That should leave you with a list of prototypes with a line skipped between each one. Since we're marking the ends of the prototypes instead of the beginnings, and all prototypes end the same way, we don't have to worry about not recognizing special cases.
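The join-then-split idea from the answers above can also be sketched in C, in case a shell or vim is not available. This is only a minimal sketch: join_prototypes is a made-up helper name, and it assumes the input contains nothing but prototypes (no comments, string literals, or preprocessor lines).

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Collapses every run of whitespace (including newlines) in `in` to a single
   space and starts a new line after each ';'. Writes at most outsz - 1
   characters plus a terminating NUL to `out` and returns the length written.
   Hypothetical helper; assumes the input holds only prototypes. */
static size_t join_prototypes(const char *in, char *out, size_t outsz)
{
    assert(in != NULL && out != NULL && outsz > 0);
    size_t n = 0;
    int pending_space = 0;  /* whitespace seen since the last token */
    int at_line_start = 1;  /* nothing emitted yet on the current line */

    /* Each iteration writes at most two characters, hence n + 2 < outsz. */
    for (; *in != '\0' && n + 2 < outsz; in++) {
        unsigned char c = (unsigned char)*in;
        if (isspace(c)) {
            pending_space = 1;
            continue;
        }
        if (pending_space && !at_line_start && c != ';')
            out[n++] = ' ';
        pending_space = 0;
        out[n++] = (char)c;
        at_line_start = 0;
        if (c == ';') {
            out[n++] = '\n';
            at_line_start = 1;
        }
    }
    out[n] = '\0';
    return n;
}
```

Fed the example from the question, this yields int foo ( int a , int * b ); on one line, which is close to the requested form. Tightening the spacing around parentheses and commas would need real tokenizing, which is where indent or astyle are the better tools.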
Related
For a small test framework, I want to do automatic test discovery. Right now, my plan is that all tests just have a prefix, which could basically be implemented like this
#define TEST(name) void TEST_##name(void)
And be used like this (in different c files)
TEST(one_eq_one) { assert(1 == 1); }
The ugly part is that you would need to list all test-names again in the main function.
Instead of doing that, I want to collect all tests in a library (say lib-my-unit-tests.so) and generate the main function automatically, and then just link the generated main function against the library. All of this internal action can be hidden nicely with cmake.
So, I need a script that does:
1. Write "int main(void) {"
2. For all functions $f starting with 'TEST_' in lib-my-unit-tests.so do
a) write "extern void $f(void);"
b) write "$f();"
3. Write "}"
Most parts of that script are easy, but I am unsure how to reliably get a list of all functions starting with the prefix.
On POSIX systems, I can try to parse the output of nm. But here, I am not sure if the names will always be the same (on my MacBook, all names start with an additional '_'). To me, it looks like the names generated for the binary might be OS/architecture-dependent. For Windows, I do not yet have an idea of how to do that.
So, my questions are:
Is there a better way to implement test-discovery in C? (maybe something like dlsym)
How do I reliably get a list of all function names starting with a certain prefix on macOS/Linux/Windows?
A partial solution for the problem is parsing nm with a regex:
for line in $(nm $1) ; do
# Finds all functions starting with "TEST_" or "_TEST_"
if [[ $line =~ ^_?(TEST_.*)$ ]] ; then
echo "${BASH_REMATCH[1]}"
fi
done
And then a second script consumes this output to generate a c file that calls these functions. Then, cmake calls the second script to create the test executable
add_executable(test-executable generated_source.c)
target_link_libraries(test-executable PRIVATE library_with_test_functions)
add_custom_command(
OUTPUT generated_source.c
COMMAND second_script.sh library_with_test_functions.so > generated_source.c
DEPENDS second_script.sh library_with_test_functions)
I think this works on POSIX systems, but I don't know how to solve it for Windows
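The filtering and generation steps described above could also be written in C rather than bash, which sidesteps the shell dependency (though not the question of how to list symbols on Windows). A minimal sketch: test_symbol_name and emit_test_main are hypothetical names, and the symbol list is assumed to have been extracted already, one bare name per entry.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Mirrors the regex ^_?(TEST_.*)$ : skips one optional leading underscore
   and returns the name starting at "TEST_", or NULL when the symbol does
   not match. Hypothetical helper. */
static const char *test_symbol_name(const char *symbol)
{
    assert(symbol != NULL);
    if (symbol[0] == '_')
        symbol++;
    return strncmp(symbol, "TEST_", 5) == 0 ? symbol : NULL;
}

/* Writes a main function that declares and calls every matching symbol,
   following steps 1 to 3 of the question. Returns the number of tests
   emitted. Hypothetical helper. */
static int emit_test_main(FILE *out, const char *const *symbols, size_t count)
{
    int emitted = 0;
    fprintf(out, "int main(void) {\n");
    for (size_t i = 0; i < count; i++) {
        const char *name = test_symbol_name(symbols[i]);
        if (name == NULL)
            continue;
        fprintf(out, "    extern void %s(void);\n", name);
        fprintf(out, "    %s();\n", name);
        emitted++;
    }
    fprintf(out, "    return 0;\n}\n");
    return emitted;
}
```

Stripping at most one leading underscore handles the macOS naming while leaving names such as __TEST_x unmatched, the same as the regex does.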
You can write a shell script using the nm or objdump utilities to list the symbols, pipe through awk to select the appropriate name and output the desired source lines.
For example:
#include "pathtoheader1/header1.hh"
##include "pathtoheader2/header2.hh"
What is the difference between these two preprocessor directives?
Edit
From what I can tell, the ##include directive, in the context of the program I am working with, will prepend -I flags to the specified include path.
TRICK_CFLAGS += -Imodels
TRICK_CXXFLAGS += -Imodels
The compiler will now look for:
/models/pathtoheader1/header1.hh
instead of
/pathtoheader1/header1.hh
These flags are stored in a .mk file.
Additional Information
I am using NASA's Trick Simulation environment to build a simple 2-body simulation of the earth orbiting the sun. The specific tool I am using is called 'trick-CP', Trick's compilation tool.
https://github.com/nasa/trick
## is the token pasting operator in both the C and C++ preprocessors. It's used to concatenate two arguments.
Since it requires an argument either side, a line starting with it is not syntactically valid, unless it's a continuation of a previous line where that previous line has used the line continuation symbol \ or equivalent trigraph sequence.
Question is about NASA Trick. Trick extends C and C++ language with its own syntax.
From Trick documentation:
Headers files, that supply data-types for user-defined models should be included using ##include . Note the double hash (#).
The second one is a syntax error in C++, and I am pretty sure it is a syntax error in C too. The ## preprocessor operator is only valid inside a preprocessor macro (where it forces token pasting).
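For comparison, here is what ## looks like where it is legal in plain C, namely inside a macro definition. This is a small illustrative sketch; PASTE, answer_value, and tests_run are names made up for the example, and TEST follows the same pattern as the test-framework macro quoted earlier.

```c
#include <assert.h>

/* Pastes two tokens into a single identifier during macro expansion. */
#define PASTE(a, b) a##b

/* Builds the identifier TEST_<name> from the macro argument. */
#define TEST(name) static void TEST_##name(void)

/* Expands to: static int answer_value(void) */
static int PASTE(answer_, value)(void)
{
    return 42;
}

static int tests_run = 0;

/* Expands to: static void TEST_one_eq_one(void) */
TEST(one_eq_one)
{
    assert(1 == 1);
    tests_run++;
}
```

Outside a macro body, as in a line that begins with ##include, a standard C or C++ preprocessor has nothing to paste, which is why that spelling only works in tools that define their own meaning for it.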
Here is what the Trick Documentation says about include:
Include files
There are two types of includes in the S_define file.
Single pound "#" includes.
Include files with a single pound "#" are parsed as they are part of the S_define file. They are treated just as #include files in C or C++ files. These files usually include other sim objects or instantiations as part of the S_define file.
Double pound "##" includes.
Include files with a double pound "##" are not parsed as part of the S_define file. These files are the model header files. They include the model class and structure definitions as well as C prototypes for functions used in the S_define file. Double pound files are copied, minus one pound, to S_source.hh.
Also here is a link to where it talks about it in the Trick documentation: https://nasa.github.io/trick/documentation/building_a_simulation/Simulation-Definition-File
Several days ago, someone told me that their keyboard is faulty, leaving them unable to type ( on their keyboard. The first thing that came to my mind, when writing C, was to either copy the character from somewhere and just paste it or to try to use preprocessor directive #define . Once I tried to use #define, I realised that gcc doesn't let me write something like #define OB ( and I pretty much understand why. Is it possible to write something similar to this and let me replace a punctuator using #define?
If there was a trigraph for ( I would suggest that. Otherwise your solution with a macro might work, but unfortunately when written in C or a header file it can not be parsed as you want it. However using gcc you can specify that macro directly using the -D option.
Example:
int main OB ) {
printf OB "Hello, world\n");
}
I can compile it using the following command: cc -o test test.c -DOB='('. I needed to add ' otherwise bash wouldn't accept that.
You can also use make adding the extra parameter to CPPFLAGS: CPPFLAGS+="-DOB='('" make test - again you need more quotes than usual otherwise all the shells would not understand unquoted (. Of course you can just add CPPFLAGS+=-DOB='(' directly to the Makefile.
Edit:
In fact you can still have a macro in code. But you have to tell gcc that ( is not a part of macro identifier. The following worked for me:
#define OB \
(
int main OB ) {
printf OB "Hello, world\n");
}
When I copy code from another file, the formatting is messed up, like this:
fun()
{
for(...)
{
for(...)
{
if(...)
{
}
}
}
}
How can I autoformat this code in vim?
Try the following keystrokes:
gg=G
Explanation: gg goes to the top of the file, = is a command to fix the indentation and G tells it to perform the operation to the end of the file.
I like to use the program Artistic Style. According to their website:
Artistic Style is a source code indenter, formatter, and beautifier for the C, C++, C# and Java programming languages.
It runs on Windows, Linux and Mac. It will do things like indenting, replacing tabs with spaces or vice-versa, putting spaces around operations however you like (converting if(x<2) to if ( x<2 ) if that's how you like it), putting braces on the same line as function definitions, or moving them to the line below, etc. All the options are controlled by command line parameters.
In order to use it in vim, just set the formatprg option to it, and then use the gq command. So, for example, I have in my .vimrc:
autocmd BufNewFile,BufRead *.cpp set formatprg=astyle\ -T4pb
so that whenever I open a .cpp file, formatprg is set with the options I like. Then, I can type gg to go to the top of the file, and gqG to format the entire file according to my standards. If I only need to reformat a single function, I can go to the top of the function, then type gq][ and it will reformat just that function.
The options I have for astyle, -T4pb, are just my preferences. You can look through their docs, and change the options to have it format the code however you like.
Here's a demo. Before astyle:
int main(){if(x<2){x=3;}}
float test()
{
if(x<2)
x=3;
}
After astyle (gggqG):
int main()
{
    if (x < 2)
    {
        x = 3;
    }
}
float test()
{
    if (x < 2)
        x = 3;
}
The builtin command for properly indenting the code has already been mentioned (gg=G). If you want to beautify the code, you'll need to use an external application like indent. Since % denotes the current file in ex mode, you can use it like this:
:!indent %
I find that clang-format works well.
There are some example keybindings in the clang documentation
I prefer to use the equalprg binding in vim. This allows you to invoke clang-format with G=gg or other = indent options.
Just put the following in your .vimrc file:
autocmd FileType c,cpp setlocal equalprg=clang-format
The plugin vim-autoformat lets you format your buffer (or buffer selections) with a single command: https://github.com/vim-autoformat/vim-autoformat. It uses external format programs for that, with a fallback to vim's indentation functionality.
I like indent as mentioned above, but most often I want to format only a small section of the file that I'm working on. Since indent can take code from stdin, it's really simple:
Select the block of code you want to format with V or the like.
Format by typing :!indent.
astyle takes stdin too, so you can use the same trick there.
I wanted to add, that in order to prevent it from being messed up in the first place you can type :set paste before pasting. After pasting, you can type :set nopaste for things like js-beautify and indenting to work again.
Maybe you can try the following:
$indent -kr -i8 *.c
Hope it's useful for you!
There is a tool called indent. You can download it with apt-get install indent, then run indent my_program.c.
For a good overview and demo of many of the options mentioned here, Gavin Freeborn has a great video on YouTube:
https://www.youtube.com/watch?v=tM_uIwSucPU
It covers some Vim plugins as well as built-in capabilities such as =, gq, and formatprg.
Is there any way in C to remove (using remove()) multiple files using a * (wildcards)?
I have a set of files that all start with Index. For example: Index1.txt, Index-39.txt etc.
They all start with Index but I don't know what text follows. There are also other files in the same directory so deleting all files won't work.
I know you can read the directory, iterate over each file name, read the first 5 chars, compare, and delete the file if it matches, but is there an easier way? (This is what I currently do, by the way.)
This is standard C, since the code runs on Linux and Windows.
As you point out, you could use opendir, readdir, and closedir to access the directory contents, a function of your own (or transform the wildcards into a regex and use a regex library) to match, and unlink or remove to delete. Note that the directory functions are POSIX rather than standard C.
There isn't a standard way to do this easier. There are likely to be libraries, but they won't be more efficient than what you're doing. Typically a file finding function takes a callback where you provide the matching and action part of the code. All you'd be saving is the loop.
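For reference, the loop being discussed is short. Below is a minimal sketch assuming a POSIX system (opendir and readdir are not available in standard C or in the Windows C runtime, where something like FindFirstFile would be needed instead). The names has_prefix and remove_with_prefix and the 4096-byte path buffer are choices made for the example.

```c
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>

/* Returns nonzero when name begins with prefix (case-sensitive). */
static int has_prefix(const char *name, const char *prefix)
{
    return strncmp(name, prefix, strlen(prefix)) == 0;
}

/* Removes every entry in dir whose name starts with prefix.
   Returns the number of entries removed, or -1 if dir cannot be opened.
   Hypothetical helper; POSIX only. */
static int remove_with_prefix(const char *dir, const char *prefix)
{
    assert(dir != NULL && prefix != NULL);
    DIR *d = opendir(dir);
    if (d == NULL)
        return -1;

    int removed = 0;
    struct dirent *entry;
    while ((entry = readdir(d)) != NULL) {
        if (!has_prefix(entry->d_name, prefix))
            continue;
        char path[4096];  /* assumed large enough; longer paths are skipped */
        int n = snprintf(path, sizeof path, "%s/%s", dir, entry->d_name);
        if (n < 0 || (size_t)n >= sizeof path)
            continue;
        if (remove(path) == 0)
            removed++;
    }
    closedir(d);
    return removed;
}
```

Calling remove_with_prefix(".", "Index") would delete Index1.txt and Index-39.txt from the current directory while leaving other files alone.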
If you don't mind being platform-specific, you could use the system() call:
system("del Index*.txt"); // DOS
system("rm Index*.txt"); // unix (file names are case-sensitive here)
The system() call is part of the standard C library (declared in <stdlib.h>, or <cstdlib> in C++).
Is this all the program does? If so, let the command line do the wildcard expansion for you:
#include <stdio.h>

int main(int argc, char* argv[])
{
    /* Stop before argv[0], which is the program's own name. */
    while (--argc > 0)
        remove(argv[argc]);
    return 0;
}
On Windows, the shell does not expand wildcards, so you need to link against 'setargv.obj', included in the VC standard lib directory.