Use gcc to get all functions from *.c/*.h-file [duplicate] - c

I want to do this:
extract_prototypes file1.c file2.cpp file3.c
and have whatever script/program print a nice list of function prototypes for all functions defined in the given C / C++ files. It must handle multi-line declarations nicely.
Is there a program that can do this job? The simpler the better.
EDIT: after trying to compile two C programs, bonus points for something that uses {perl, python, ruby}.

I use ctags
# p = function declaration, f = function definition
ctags -x --c-kinds=fp /usr/include/hal/libhal.h
Also works with C++
ctags -x --c++-kinds=pf --language-force=c++ /usr/include/c++/4.4.1/bits/deque.tcc
Note, you may need to add include paths, do this using the -I /path/to/includes.

The tool cproto does what you want and lets you tune the output to your requirements.
Note: This tool also only works for C files.

I use ctags and jq
ctags --output-format=json --totals=no --extras=-F --fields=nP file1.c |
jq -sr 'sort_by(.line) | .[].pattern | ltrimstr("/^") | rtrimstr("$/") | . + ";"'

If you have universal-ctags (https://ctags.io), --_xformat option may be useful though you need sed and tr commands to get what you want.
$ cat input.c
struct object *new_object (struct
/* COMMENT */
param
/* IGNORE ME */
*p)
{
return NULL;
}
int main (void)
{
return 0;
}
$ ./ctags -o - --kinds-C=f --kinds-C++=f -x --_xformat='%{typeref} %{name} %{signature};' input.c | tr ':' ' ' | sed -e 's/^typename //'
struct object * new_object (struct param * p);
int main (void);
$
This is similar to the answer posted by Steve Ward, but it requires sed and tr instead of jq.

http://cfunctions.sourceforge.net
(This only does C and a limited subset of C++. Disclaimer: this is my program.)

I used to use doxygen to generate documentation for my C++ code. I am not an expert, but I think you can use doxygen to generate some sort of index file of the function prototypes.
Here is a thread of someone asking a similar question

gccxml is interesting, but it prints an XML tree. You need to extract the information about classes, functions, types, and even the specialized class and function templates. gccxml uses GCC's parser, so you don't have to do the worst part of the job, which is parsing C++ files, and you can be sure the output reflects what is probably the best compiler understands.

If you format your comments suitably, you could try Doxygen. In fact, if you've not tried it before, I'd recommend giving it a go anyway: it will produce inheritance graphs as well as full member function lists and descriptions (from your comments).

In more modern versions of GCC, you can also use -aux-info to get this information when writing C code. See here.
Here's a sample of what the output looks like:
/* src/main.c:30:NC */ static void usage (const char *);
/* src/main.c:32:NF */ extern int main (int argc, char **argv); /* (argc, argv) int argc; char **argv; */
/* src/main.c:57:NF */ static void usage (const char *prog_name); /* (prog_name) const char *prog_name; */

gcc-xml might help, although as it is, it only does half the job you want. You'll need some processing of the XML output

You can run the source file through this program:
/* cproto_parser.c */
#include <stdio.h>
int main (void)
{
int c;
int infundef = 0;
int nb = 0,
np = 0;
while((c=getc(stdin))!=EOF){
if(c=='{'){
if((np==0)&&(nb==0)){infundef=1;}
nb++;
}
if (infundef==0) {putc(c,stdout);}
if(c=='}'){
if((np==0)&&(nb==1)){infundef=0;}
nb--;
}
if(c=='('){np++;}
if(c==')'){np--;}
}
return 0;
}
Run the source through the preprocessor first to get rid of comments. If #ifdefs leave you with unmatched braces, set the defines and include files so that the braces balance.
e.g., cc cproto_parser.c -o cproto_parser; cc -E your_source_file.c|./cproto_parser

Related

Check if a system implements a function

I'm creating a cross-system application. It uses, for example, the function itoa, which is implemented on some systems but not all. If I simply provide my own itoa implementation:
header.h:115:13: error: conflicting types for 'itoa'
extern void itoa(int, char[]);
In file included from header.h:2:0,
from file.c:2:0,
c:\path\to\mingw\include\stdlib.h:631:40: note: previous declaration of 'itoa' was here
_CRTIMP __cdecl __MINGW_NOTHROW char* itoa (int, char*, int);
I know I can check if macros are predefined and define them if not:
#ifndef _SOME_MACRO
#define _SOME_MACRO 45
#endif
Is there a way to check if a C function is pre-implemented, and if not, implement it? Or to simply un-implement a function?
Given you have already written your own implementation of itoa(), I would recommend that you rename it and use it everywhere. At least you are sure you will get the same behavior on all platforms, and avoid the linking issue.
Don't forget to explain your choice in the comments of your code...
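A minimal sketch of that advice, assuming a hypothetical name app_itoa and a hypothetical extra buffer-size parameter (neither comes from the question). It is one possible shape for such a replacement, not the asker's implementation:

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical replacement for itoa(); the name and the size parameter
 * are assumptions made for this sketch.
 * Writes value in the given base (2..36) into buf and returns buf,
 * or NULL if the arguments are invalid or the buffer is too small.
 * Negative values get a '-' only in base 10; in other bases the bit
 * pattern is printed as an unsigned number. */
char *app_itoa(int value, char *buf, size_t size, int base)
{
    static const char digits[] = "0123456789abcdefghijklmnopqrstuvwxyz";
    char tmp[sizeof(int) * CHAR_BIT + 2];  /* base-2 digits plus a sign */
    size_t n = 0, i = 0;
    unsigned int u;
    int negative = (value < 0 && base == 10);

    if (buf == NULL || base < 2 || base > 36)
        return NULL;

    /* Work on the magnitude as unsigned so INT_MIN does not overflow. */
    u = negative ? 0u - (unsigned int)value : (unsigned int)value;

    do {                                   /* least significant digit first */
        tmp[n++] = digits[u % (unsigned int)base];
        u /= (unsigned int)base;
    } while (u != 0);
    if (negative)
        tmp[n++] = '-';

    if (n + 1 > size)                      /* digits plus the terminating NUL */
        return NULL;
    while (n > 0)                          /* reverse into the caller's buffer */
        buf[i++] = tmp[--n];
    buf[i] = '\0';
    return buf;
}
```

Because the name differs from itoa, the declaration cannot clash with whatever stdlib.h provides on a given system.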
I assume you are using GCC, as I can see MinGW in your path... there's one way the GNU linker can take care of this for you. So you don't know whether there is an itoa implementation or not. Try this:
Create a new file (without any headers) called my_itoa.c:
char *itoa (int, char *, int);
char *my_itoa (int a, char *b, int c)
{
return itoa(a, b, c);
}
Now create another file, impl_itoa.c. Here, write the implementation of itoa but add a weak alias:
char* __attribute__ ((weak)) itoa(int a, char *b, int c)
{
// implementation here
}
Compile all of the files, with impl_itoa.c at the end.
This way, if itoa is not available in the standard library, this one will be linked. You can be confident about it compiling whether or not it's available.
Ajay Brahmakshatriya's suggestion is a good one, but unfortunately MinGW doesn't support weak definition last I checked (see https://groups.google.com/forum/#!topic/mingwusers/44B4QMPo8lQ, for instance).
However, I believe weak references do work in MinGW. Take this minimal example:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
__attribute__ ((weak)) char* itoa (int, char*, int);
char* my_itoa (int a, char* b, int c)
{
if(itoa != NULL) {
return itoa(a, b, c);
} else {
// toy implementation for demo purposes
// replace with your own implementation
strcpy(b, "no itoa");
return b;
}
}
int main()
{
char *str = malloc((sizeof(int)*3+1));
my_itoa(10, str, 10);
printf("str: %s\n", str);
return 0;
}
If the system provides an itoa implementation, that should be used and the output would be
str: 10
Otherwise, you'll get
str: no itoa
There are two really important related points worth making here along the "don't do it like this" lines:
Don't use itoa because it's not safe.
Don't use itoa because it's not a standard function, and there are good standard functions (such as snprintf) which are available to do what you want.
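For illustration, the snprintf route could look like the sketch below. The helper name int_to_str and its return convention are assumptions made for this example; only snprintf itself is standard:

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: formats n as a decimal string into buf using the
 * standard snprintf(). Returns 0 on success, -1 if the result did not fit. */
int int_to_str(int n, char *buf, size_t size)
{
    int len = snprintf(buf, size, "%d", n);

    /* snprintf returns the length it wanted to write; a value >= size
     * means the output was truncated. */
    if (len < 0 || (size_t)len >= size)
        return -1;
    return 0;
}
```

Unlike a bare sprintf call, this never writes past the end of the buffer, and the caller can detect truncation.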
But, putting all this aside for one moment, I want to introduce you to autoconf, part of the GNU build system. autoconf is part of a very comprehensive, very portable set of tools which aim to make it easier to write code which can be built successfully on a wide range of target systems. Some would argue that autoconf is too complex a system to solve just the one problem you pose with just one library function, but as any program grows, it's likely to face more hurdles like this, and getting autoconf set up for your program now will put you in a much stronger position for the future.
Start with a file called Makefile.in which contains:
CFLAGS=--ansi --pedantic -Wall -W
program: program.o
program.o: program.c
clean:
rm -f program.o program
and a file called configure.ac which contains:
AC_PREREQ([2.69])
AC_INIT(program, 1.0)
AC_CONFIG_SRCDIR([program.c])
AC_CONFIG_HEADERS([config.h])
# Checks for programs.
AC_PROG_CC
# Checks for library functions.
AH_TEMPLATE([HAVE_ITOA], [Set to 1 if function itoa() is available.])
AC_CHECK_FUNC([itoa],
[AC_DEFINE([HAVE_ITOA], [1])]
)
AC_CONFIG_FILES([Makefile])
AC_OUTPUT
and a file called program.c which contains:
#include <stdio.h>
#include "config.h"
#ifndef HAVE_ITOA
/*
* WARNING: This code is for demonstration purposes only. Your
* implementation must have a way of ensuring that the size of the string
* produced does not overflow the buffer provided.
*/
void itoa(int n, char* p) {
sprintf(p, "%d", n);
}
#endif
int main(void) {
char buffer[100];
itoa(10, buffer);
printf("Result: %s\n", buffer);
return 0;
}
Now run the following commands in turn:
autoheader: This generates a new file called config.h.in which we'll need later.
autoconf: This generates a configuration script called configure
./configure: This runs some tests, including checking that you have a working C compiler and, because we've asked it to, whether an itoa function is available. It writes its results into the file config.h for later.
make: This compiles and links the program.
./program: This finally runs the program.
During the ./configure step, you'll see quite a lot of output, including something like:
checking for itoa... no
In this case, you'll see that the config.h file contains the following lines:
/* Set to 1 if function itoa() is available. */
/* #undef HAVE_ITOA */
Alternatively, if you do have itoa available, you'll see:
checking for itoa... yes
and this in config.h:
/* Set to 1 if function itoa() is available. */
#define HAVE_ITOA 1
You'll see that the program can now read the config.h header and choose to define itoa if it's not present.
Yes, it's a long way round to solve your problem, but you've now started using a very powerful tool which can help you in a great number of ways.
Good luck!

Linking .c and .h files

For my program I am linking 3 files in total: main.c, sortfile.c and my.h (a header file). In sortfile.c I am implementing an odd-even sort. I am unsure whether my algorithm is correct. I would also like to know what information usually goes in a header file. Is it only the other two C files, pulled in via #include?
#include <stdio.h>
void swap(int *, int *);
void Odd_Even_Sort(int *);
/* swaps the elements */
void swap(int * x, int * y)
{
int temp;
temp = *x;
*x = *y;
*y = temp;
}
/* sorts the array using oddeven algorithm */
void Odd_Even_Sort(int * x)
{
int sort = 0, i;
while (!sort)
{
sort = 1;
for (i = 1;i < MAX;i += 2)
{
if (x[i] > x[i+1])
{
swap(&x[i], &x[i+1]);
sort = 0;
}
}
for (i = 0;i < MAX - 1;i += 2)
{
if (x[i] > x[i + 1])
{
swap(&x[i], &x[i + 1]);
sort = 0;
}
}
}
}
I did not include a main in the sortfile.c because I intended to put main in the main.c file.
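On the algorithm itself: the code above never defines MAX, and if the array holds MAX elements, the first inner loop reads x[MAX] on its last iteration whenever MAX is even. A sketch of a bounds-safe variant follows. It assumes the length is passed in as a hypothetical parameter n instead of coming from MAX, and it uses a lower-case function name to mark it as a separate illustration:

```c
#include <stddef.h>

/* swaps the elements */
static void swap(int *x, int *y)
{
    int temp = *x;
    *x = *y;
    *y = temp;
}

/* Odd-even sort of n elements; n replaces the MAX macro (an assumption). */
void odd_even_sort(int *x, size_t n)
{
    int sorted = 0;
    size_t i;

    while (!sorted) {
        sorted = 1;
        /* odd pass: compare pairs (1,2), (3,4), ... */
        for (i = 1; i + 1 < n; i += 2) {
            if (x[i] > x[i + 1]) {
                swap(&x[i], &x[i + 1]);
                sorted = 0;
            }
        }
        /* even pass: compare pairs (0,1), (2,3), ... */
        for (i = 0; i + 1 < n; i += 2) {
            if (x[i] > x[i + 1]) {
                swap(&x[i], &x[i + 1]);
                sorted = 0;
            }
        }
    }
}
```

Both loops use the condition i + 1 < n, so x[i + 1] is always a valid element.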
You look confused. First read the wiki pages on linkers and on compilers. You don't link source files, only object files and libraries.
(I am guessing and supposing and hoping for you that you are using Linux)
You also compile translation units into object files.
Header files are for the preprocessor (the first "phase" of the compilation). The preprocessing is a textual operation. See this answer for some hint.
So you probably want to compile your main.c into main.o with
gcc -Wall -g -c main.c -o main.o
(-Wall asks for all warnings, so never forget it; -g asks for debugging information; -c asks to compile a source file into an object file; the order of program arguments to gcc matters a lot).
Likewise, you want to compile your sortfile.c into sortfile.o. I leave as an exercise to get the right command doing that.
Finally, you want to get an executable program myprogsort, by linking both object files. Do that with
gcc -g main.o sortfile.o -o myprogsort
But you really want to use some build automation tool. Learn about GNU make. Write your Makefile (beware, tabs are important in it). See this example. Don't forget to try make -p to understand (and take advantage of) all the built-in rules that make knows.
Also would like to know what information usually goes in a header file.
Conventionally you want only declarations in your common header file (which you would #include in every source file composing a translation unit). You can also add definitions of static inline functions. Read more about inline functions (you probably don't need them at first).
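As a rough illustration, a header for the three-file layout in the question might look like the sketch below. The include guard name, the value of MAX and the is_sorted helper are assumptions made for this example; only swap and Odd_Even_Sort come from the question:

```c
/* my.h: hypothetical contents for the layout described above */
#ifndef MY_H
#define MY_H

#define MAX 10                      /* assumed array length */

/* Declarations only: the definitions live in sortfile.c. */
void swap(int *x, int *y);
void Odd_Even_Sort(int *x);

/* A static inline function may be defined in the header itself.
 * is_sorted is a made-up helper, not part of the question. */
static inline int is_sorted(const int *x, int n)
{
    int i;

    for (i = 0; i + 1 < n; i++)
        if (x[i] > x[i + 1])
            return 0;
    return 1;
}

#endif /* MY_H */
```

Both main.c and sortfile.c would then start with #include "my.h", so each translation unit sees the same declarations.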
Don't forget to learn how to use the gdb debugger. You probably will run
gdb ./myprogsort
more than once. Don't forget to rebuild your thing after changes to source code.
Also look inside the source code of some medium-sized free software project written in C on GitHub. You'll learn a lot.

Where is R's C-level PROTECT macro defined? [duplicate]

I am reading the R sources and trying to learn about the heap structure. I'm looking for the definition of PROTECT(), but all I've found is:
$ grep -rn "#define PROTECT(" *
src/include/Rinternals.h:642:#define PROTECT(s) Rf_protect(s)
and then
$ grep -rn "Rf_protect(" *
src/include/Rinternals.h:803:SEXP Rf_protect(SEXP);
src/include/Rinternals.h:1267:SEXP Rf_protect(SEXP);
But I didn't find Rf_protect()'s definition.
Thanks.
The Rf_ prefix is a common idiom giving this plain C code the resemblance of a namespace. So you want to look for protect(...) instead:
/usr/share/R/include/Rinternals.h:#define protect Rf_protect
And given how 'core' this is, you may as well start in src/main, where a quick grep -c leads you to src/main/memory.c. Et voilà, on lines 3075 to 3081:
SEXP protect(SEXP s)
{
if (R_PPStackTop >= R_PPStackSize)
R_signal_protect_error();
R_PPStack[R_PPStackTop++] = CHK(s);
return s;
}
Now that said, you probably want to pay attention to most of the file and not just this function.
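The reason grepping for "Rf_protect(" finds only declarations is that the preprocessor rewrites the name before the compiler sees it. A toy reproduction of the idiom, with entirely hypothetical names (value_t, Toy_protect, call_it) standing in for R's own:

```c
/* Toy reproduction of the renaming idiom; all names are hypothetical. */
typedef int value_t;

#define protect Toy_protect         /* what the header does */

/* Written as "protect" in the source text, but after preprocessing this
 * defines a function whose symbol is Toy_protect. */
value_t protect(value_t s)
{
    return s;
}

#undef protect                      /* demo only: expose the real symbol name */

value_t call_it(value_t s)
{
    return Toy_protect(s);          /* the symbol that actually exists */
}
```

A search for "Toy_protect(" in this file matches only the call, never the definition, which mirrors the grep result in the question.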

Regex for greping C type definition

I have a big bunch of source files and I want to grep through them to find the definition of a specific user-defined type, dev_if_type_t. All I know about it so far is that some functions in the code I'm examining use it as a return type.
Right now I'm using the following:
typedef.*dev_if_type_t|(define|typedef|enum|struct)\s*dev_if_type_t
but it returns no results. Is there another method of C type definition I'm neglecting to mention?
The grep line itself, in the code base's top directory:
grep -rn "typedef.*dev_if_type_t\|\(define\|typedef\|enum\|struct\)\s*dev_if_type_t" *
There could be many more variants of the definition, like:
typedef struct {
/* some code */
} dev_if_type_t;
Some code could also look like this:
#define \
dev_if_type_t int
struct
dev_if_type_t
{
/* some code */
};
You'll never know.
I would suggest you simply grep for dev_if_type_t and use grep's context option -C <num> to find the definition yourself.
When using expressions that include |, don't forget to use egrep (deprecated) or the proper command grep -E ....
Note that with -E, \| and \( have a different meaning: they match the literal characters. Use | and ( for your purpose.
So the correct pattern should be:
grep -Ern "typedef.*dev_if_type_t|(define|typedef|enum|struct)\s*dev_if_type_t" *

Treat functions by name

Suppose you created a main() to deal with an exercise you set for your students.
Every student is supposed to write their own function, with the same API. A single file will then be created, with all the functions and the main calling them.
Let's say int studentname(int a, int b) is the function pattern.
One way I dealt with it was to use an array of function pointers, int (*func[MAX])(). But you need to fill the array one by one: func[0]=studentname;.
I wonder, is there a way a function can be called by its name somehow?
Something like: int student1(int a , int b), student2(), etc.
And in main somehow we could just call sscanf(funcname,"student%d",i); funcname();.
Do you have any other idea? Maybe
int studentname(int a, int b, char *fname)
{
strcpy(fname, "studentname");
Anything creative will do! :)
Thanks!
Beco
PS. I tried just a vector of functions, but C won't allow me! :)
int func[2]()={{;},{;}};
This way I could just give to each student a number, and voilá... But no way. It was funny though.
Edited: I'm using linux.
Edited 2: Thanks! I've accepted an answer that helped me, but I've also documented a complete example as an answer below.
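C has no arrays of functions, which is why int func[2]() is rejected, but an array of function pointers can be filled by an initializer instead of one assignment at a time. The sketch below pairs each pointer with a name so a function can be looked up from a string. The student1 and student2 bodies, the struct entry type and find_student are all hypothetical:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical student functions sharing one signature. */
static int student1(int a, int b) { return a + b; }
static int student2(int a, int b) { return a * b; }

typedef int (*student_fn)(int, int);

struct entry {
    const char *name;               /* name used for the lookup */
    student_fn fn;                  /* the function to call */
};

/* The table is filled by its initializer, not one element at a time. */
static const struct entry students[] = {
    { "student1", student1 },
    { "student2", student2 },
};

/* Looks a function up by name; returns NULL if the name is unknown. */
student_fn find_student(const char *name)
{
    size_t i;

    for (i = 0; i < sizeof students / sizeof students[0]; i++)
        if (strcmp(students[i].name, name) == 0)
            return students[i].fn;
    return NULL;
}
```

The table still has to list every function once, so this does not remove the registration step; it only keeps it in one place.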
Maybe a bit overcomplicating it, but spontaneous idea:
Compile all student source files into one shared library with the students' functions being exports.
Then enumerate all exposed functions, call and test them.
As an alternative:
Write a small tool that will compile all "student units" using a preprocessor define to replace a predefined function name with an unique name ("func1", "func2", etc.).
Then let the tool write a small unit calling all these functions while performing tests, etc.
And yet another idea:
Use C++ to write a special class template that's going to register derived classes in a object factory and just embed student's code using extern "C". Depending on the implementation this might look a bit confusing and overcomplicated though.
Then use the factory to create one instance of each and run the code.
Example for the approach with dlopen() and dlsym() (whether only one function per library or all - doesn't matter):
/* needs #include <dlfcn.h>; studentproc is the type of the exported function */
typedef void (*studentproc)(const char *);
void *pluginlib = dlopen("student1.so", RTLD_NOW); // RTLD_NOW resolves all symbols right away
if (!pluginlib)
; // failed to load
studentproc func = (studentproc)dlsym(pluginlib, "student1"); // this looks up the function called "student1"
if (!func)
; // failed to resolve
func("hello world!"); // call the lib
dlclose(pluginlib); // unloads the library (this will make all further calls invalid)
Similar to what @Jamey-Sharp proposed:
ask each student to provide .c file with entry function of a given name/signature
compile each .c into a shared library, named by the student name, or given whatever unique name. This step can be easily automated with make or simple script.
make a simple host application which enumerates all .so files in a given directory, and uses dlopen() and dlsym() to get to the entry point function.
now you can simply call each student's implementation.
BTW, that's how plug-ins are implemented usually, isn't it?
Edit: Here's a working proof of concept (and a proof, that each student can use the same name of the entry point function).
Here's student1.c:
#include <stdio.h>
void student_task()
{
printf("Hello, I'm Student #1\n");
}
Here's student2.c:
#include <stdio.h>
void student_task()
{
printf("Hello, I'm Student #2\n");
}
And here's the main program, tester.c:
#include <stdio.h>
#include <dlfcn.h>
/* NOTE: Error handling intentionally skipped for brevity!
* It's not a production code!
*/
/* Type of the entry point function implemented by students */
typedef void (*entry_point_t)(void);
/* For each student we have to store... */
typedef struct student_lib_tag {
/* .. pointer to the entry point function, */
entry_point_t entry;
/* and a library handle, so we can play nice and close it eventually */
void* library_handle;
} student_solution_t;
void load(const char* lib_name, student_solution_t* solution)
{
/* Again - all error handling skipped, I only want to show the idea! */
/* Open the library. RTLD_LOCAL is quite important, it keeps the libs separated */
solution->library_handle = dlopen(lib_name, RTLD_NOW | RTLD_LOCAL);
/* Now we ask for 'student_task' function. Every student uses the same name.
* strange void** is needed for C99, see dlsym() manual.
*/
*(void**) (&solution->entry) = dlsym(solution->library_handle, "student_task");
/* We have to keep the library open */
}
int main()
{
/* Two entries hardcoded - you need some code here that would scan
* the directory for .so files, allocate array dynamically and load
* them all.
*/
student_solution_t solutions[2];
/* Load both solutions */
load("./student1.so", &solutions[0]);
load("./student2.so", &solutions[1]);
/* Now we can call them both, despite the same name of the entry point function! */
(solutions[0].entry)();
(solutions[1].entry)();
/* Eventually it's safe to close the libs */
dlclose(solutions[0].library_handle);
dlclose(solutions[1].library_handle);
return 0;
}
Let's compile it all:
czajnik@czajnik:~/test$ gcc -shared -fPIC student1.c -o student1.so -Wall
czajnik@czajnik:~/test$ gcc -shared -fPIC student2.c -o student2.so -Wall
czajnik@czajnik:~/test$ gcc tester.c -g -O0 -o tester -ldl -Wall
And see it works:
czajnik@czajnik:~/test$ ./tester
Hello, I'm Student #1
Hello, I'm Student #2
I'd take a different approach:
Require every student to use the same function name, and place each student's code in a separate source file.
Write one more source file with a main that calls the standard name.
Produce a separate executable from linking main.c with student1.c, then main.c with student2.c, and so on. You might be able to use wildcards in a makefile or shell script to automate this.
That said, at least on Unix-like OSes, you can do what you asked for.
Call dlopen(NULL) to get a handle on the symbols in the main program.
Pass that handle and the function name you want to dlsym. Coerce the resulting pointer to a function pointer of the right type, and call it.
Here is an ugly preprocessor hack:
#Makefile
FILE_NAME=student
${FILE_NAME}: main.c
cc -Wall -DFILE_NAME=\"${FILE_NAME}.c\" -o $@ main.c -lm
Teacher's main.c:
#include <math.h>
#include <stdio.h>
#include FILE_NAME
char *my_name(void);
double my_sin(double val);
int main(void)
{
double dd;
dd = my_sin(3.1415923563);
printf("%s: %f\n", my_name(), dd);
return 0;
}
Student's .c File:
#include <math.h>
char * my_name(void);
double my_sin(double val);
char * my_name(void)
{
return "Wildplasser-1.0";
}
double my_sin(double val)
{
return sin (val);
}
The trick lies in the literal inclusion of the student's .c file.
To avoid this, you could also use a different make line, like:
cc -Wall -o $@ ${FILE_NAME}.c main.c -lm
(and remove the ugly #include FILE_NAME, of course)
Thank you all. I've accepted an answer that gave me the inspiration to solve the question. Here, just to document it, is my complete solution:
File shamain.c
/* Uses shared library shalib.so
* Compile with:
* gcc shamain.c -o shamain -ldl -Wall
*/
#include <stdio.h>
#include <stdlib.h>
#include <dlfcn.h>
int main(void)
{
void *libstud;
int (*student[2])(int, int);
char fname[32];
int i,r;
libstud = dlopen("./shalib.so", RTLD_NOW);
if (!libstud)
{
fprintf(stderr, "error: %s\n", dlerror());
exit(EXIT_FAILURE);
}
dlerror(); /* Clear any existing error */
for(i=0; i<2; i++)
{
sprintf(fname, "func%d", i);
*(void **) (&student[i]) = dlsym(libstud, fname); /* c99 crap */
//student[i] = (int (*)(int, int)) dlsym(libstud, fname); /* c89 format */
}
for(i=0; i<2; i++)
{
r=student[i](i, i);
printf("i=%d,r=%d\n", i, r);
}
return 0;
}
File shalib.c
/* Shared library.
* Compile with:
* gcc -shared -fPIC shalib.c -o shalib.so -Wall
*/
#include <stdio.h>
int func0(int one, int jadv)
{
printf("%d = Smith\n", one);
return 0;
}
int func1(int one, int jadv)
{
printf("%d = John\n", one);
return 0;
}
It is a while since I have used shared libraries, but I have a feeling you can extract named functions from a DLL/shlib. Could you create a DLL/shared library containing all of the implementations and then access them by name from the main?
Per @william-morris's suggestion, you might have luck using dlsym() to do a dynamic lookup of the functions. (dlsym() may or may not be the library call to use on your particular platform.)
