Get all the function names from C/C++ files

For example, a C file a.c contains three functions: funA(), funB() and funC().
I want to get all the function names from this file.
I also want to get the start and end line numbers of each function.
Is there a solution?
Can I use clang to implement it?

You can compile the file and run nm (http://en.wikipedia.org/wiki/Nm_(Unix)) on the generated binary, then parse the output of nm to get the function names.
If you want line numbers, you can use the function names to search the source file for them.
All of this can be accomplished with a short Perl script that makes system calls to gcc and nm.
This assumes you are using a *nix system, of course.
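As a rough illustration of that last step, here is a minimal sketch in Python (my choice of language, not the Perl suggested above). The helper name function_extents and the sample source are hypothetical. It assumes the function names are already known, for example from nm, and that braces inside strings and comments are balanced; it is a naive text scan, not a real C parser.

```python
import re

SAMPLE = """\
#include <stdio.h>

int funA(int x)
{
    if (x) { return 1; }
    return 0;
}

void funB(void) {
    funA(2);
}
"""

def function_extents(source, names):
    """Return {name: (start_line, end_line)} for each name defined in source.

    start_line is the line on which the function name appears (a return type
    written on an earlier line is not counted); end_line is the line of the
    closing brace that matches the body's opening brace.
    """
    extents = {}
    for name in names:
        # name, a parameter list without ';' or braces, then the opening '{'
        pattern = re.compile(r"\b" + re.escape(name) + r"\s*\([^;{}]*\)\s*\{")
        for m in pattern.finditer(source):
            before = source[:m.start()]
            if before.count("{") != before.count("}"):
                continue  # inside another body: a call, not a definition
            depth = 0
            for pos in range(m.end() - 1, len(source)):
                ch = source[pos]
                if ch == "{":
                    depth += 1
                elif ch == "}":
                    depth -= 1
                    if depth == 0:
                        start = before.count("\n") + 1
                        end = source.count("\n", 0, pos) + 1
                        extents[name] = (start, end)
                        break
            break  # only the first top-level match is considered
    return extents

print(function_extents(SAMPLE, ["funA", "funB"]))
```

Macros that expand to braces, or preprocessor conditionals that split a function body, will confuse this scan; a compiler-based tool is more reliable for such code.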

One solution that works well for the job is cproto. It will scan source files (in K&R or ANSI-C format) and output the function prototypes. You can process entire directories of source files with a find command similar to:
find "$dirname" -type f -name "*.c" \
    -exec /path/to/cproto -s \
    -I/path/to/extra/includes '{}' \; >> "$outputfile"
While the cproto project is no longer actively developed, the cproto application continues to work very, very well. It provides function output in a reasonable form that can be fairly easily parsed/formatted as you desire.
Note: this is just one option based on my use. There are many others available.

Related

using a shared library : custom myfopen(), myfrwrite()

Hello, I have created a shared library named logger.so. It provides my custom fopen() and fwrite(), which collect some data and write it to a file.
Now I am writing a bash script and I want to use this library there as well. When a C program calls fopen(), my custom fopen() produces this extra file, so I expected to see the same extra file when the script creates a text file.
My question is: which commands in bash use the fopen() and fwrite() functions?
I have already preloaded my shared library, but it doesn't work. Maybe these commands don't use fopen() and fwrite().
export LD_PRELOAD=./logger.so
read -p 'Enter the number of your files to create: [ENTER]: ' file_number
for ((i=1; i<=file_number; i++))
do
echo file_"$i" > "file_${i}"
done
This may require some trial and error. I see two ways to do this. One is to run candidate file-handling commands in bash under a tracer and see whether they call fopen:
strace bash -c "read a < /dev/null"
or
strace bash -c "read a < /dev/null" 2>&1 | fgrep fopen
This shows that read uses open, not fopen. Note that strace lists system calls only, so a library function such as fopen never appears in its output by name; ltrace traces library calls.
Another way is to grep through the source code of bash, as @oguz suggested. When I did this I found several places where fopen is called, but I did not investigate further:
curl https://mirrors.tripadvisor.com/gnu/bash/bash-5.1-rc3.tar.gz|tar -z -x --to-stdout --wildcards \*.c | fgrep fopen
You'll want to unarchive the whole package and search through the .c files one by one, e.g.:
curl https://mirrors.tripadvisor.com/gnu/bash/bash-5.1-rc3.tar.gz|tar -z -v -x --wildcards \*.c
You can also FTP the file or save it via a browser and do the tar standalone if you don't have curl (wget will also work).
Hopefully you can trace the relevant commands but it might not be easy.
Not every program uses the C standard library stdio functions fopen and fwrite.
But programs that open and write files ultimately go through system calls such as open (or openat) and write, and you can interpose / monkey-patch the libc wrappers for those instead.
Modern programs that use io_uring require a different method of interposing.

Using "$**" to direct file input within a Makefile

I am working with a large C project that has many source files. Here is a line from one of the makefiles:
!$(TOOLSDIRECTORY)unifdef $(UNIFDEF_ARGUMENTS) $** > $(TARGET)\$**
The Unifdef tool referenced by this line is open source and available here:
http://dotat.at/prog/unifdef/
In this case, the last argument to Unifdef is the group of files to process. It is my understanding that this code is using the symbol "$**" to say "every file in this folder", then piping all of the output to the TARGET directory.
My confusion is that I don't understand how Unifdef receives multiple files in one command. Does a makefile package all of the files into one file stream when it sees "$**"? I understand how Unifdef handles the input it receives, but how do multiple files turn into the single argument that Unifdef receives?
Other note: this makefile is being run on Windows in MSVS 2010.
I suspect that this doesn't actually do what you want.
$** is not a single construct; it is the special Makefile variable $* followed immediately by the shell-glob wildcard *. The line is going to be rewritten twice: first, by Make, to something like
!../path/to/tools/unifdef --opt1 --opt2 foo* > ../path/to/target/foo\*
and then, by /bin/sh, to something like
!../path/to/tools/unifdef --opt1 --opt2 foo.c foo1.c foo2.c fooquux.c \
> ../path/to/target/foo*
and only then executed.
You didn't quote any of the context, so I can't be any more specific than that. Here's why I think this can't be right, though:
Unless maybe you set .RECIPEPREFIX, which is a feature I had never heard of before now, the ! at the beginning of the command doesn't make any sense and should be causing the command to fail because there is no executable named literally !../path/to/tools/unifdef.
The backslash on the second occurrence of $** does not escape the $* (you would do that by writing $$*); it is preserved, and escapes the shell-glob star, so the output is being written to a file literally named ../path/to/target/foo*, which is sufficiently weird that I don't think it can be what was intended.
If the target-directory glob weren't being escaped, there would be two more problems:
Glob expansion happens independently in the source and target directories, and so would (at least potentially) match unrelated sets of files.
The output redirection (>) only applies to the very next thing on the command line; all the other things matched by the glob in the target directory would be provided as input to unifdef.
Based on a wild-assed guess about what you're trying to do, I think you probably want something more like this:
# Resist the temptation to use wildcards. It will be less grief in the long run
# to list each file explicitly.
GENERIC_SOURCES := foo.c foo1.c foo2.c fooquux.c barblurf.c barbaz.c
UNIFDEFED_SOURCES := $(patsubst %.c,$(TARGET)/%-u.c,$(GENERIC_SOURCES))
# The indented lines below must be indented using exactly one hard tab character.
$(UNIFDEFED_SOURCES): %-u.c: %.c
	$(TOOLSDIRECTORY)unifdef $(UNIFDEF_ARGUMENTS) $< > $@T
	mv -f $@T $@
This does not attempt to batch invocations of unifdef; Make is generally much happier if each rule creates only one output file.
$(patsubst ...) and the static pattern rule are features specifically of GNU make. I normally advocate portability, but in the case of Make, the GNU incarnation is so much more powerful than the portable feature set that it's worth carrying around as a dependency.
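This line is NMAKE syntax (NMAKE is the make tool that ships with Visual Studio), not GNU make. In NMAKE, $** stands for all dependents of the current target, and the ! prefix runs the command once for each dependent when the command uses $**. So the line expands into multiple individual calls to Unifdef, each one with one of the files as the argument, and the output of each call is redirected to a file of the same name in the target directory. Therefore, each call to Unifdef receives only one file name as its input.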
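The per-file expansion described above can be modeled in a few lines of Python. This is only a sketch of the resulting command lines, not of how the build tool works internally; the helper name expand_per_dependent, the -DFOO argument and the file names are hypothetical.

```python
def expand_per_dependent(template, dependents):
    """Return one command line per dependent, substituting each for '$**'."""
    return [template.replace("$**", dep) for dep in dependents]

# Hypothetical template modeled on the makefile line in the question.
commands = expand_per_dependent(r"unifdef -DFOO $** > out\$**", ["a.c", "b.c"])
for command in commands:
    print(command)
```

Each resulting command names exactly one input file and one output file, which is why Unifdef never has to handle several files in a single invocation.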

Bash script for extracting function calls from c files

I'm new to scripting, and I'm attempting to extract all function calls from the C files in a directory.
Here is my code so far, but it seems to be giving no output.
#!/bin/bash
awk '/[ \t]*[a-zA-Z_]*\(([a-zA-Z_]*[ \t]*,?)*\);/ {print $0}' *.c
I'm stumped.
Also, the C files all have at least one function call.
You should debug your regexp. Reduce it until you get some matches, then add the other parts back one at a time, checking after each step that you still get the expected results.
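One way to follow that advice is to apply the pattern in stages and count the matches at each stage. The sketch below uses Python's re module rather than awk, so the syntax differs slightly from the question's pattern; the sample source and the three stage patterns are my own assumptions, not the asker's regexp.

```python
import re

SAMPLE = """\
int main(void)
{
    init();
    total = add(a, b);
    printf("%d\\n", total);
    return 0;
}
"""

# Each stage adds one more piece to the previous pattern.
STAGES = [
    ("identifier followed by '('", r"[A-Za-z_]\w*\s*\("),
    ("plus arguments and ')'", r"[A-Za-z_]\w*\s*\([^()]*\)"),
    ("plus trailing ';'", r"[A-Za-z_]\w*\s*\([^()]*\)\s*;"),
]

def matches_per_stage(text):
    """Return (label, list_of_matches) for every stage, in order."""
    return [(label, re.findall(pattern, text)) for label, pattern in STAGES]

for label, found in matches_per_stage(SAMPLE):
    print(f"{len(found):2d}  {label}: {found}")
```

If the count drops to zero at some stage, the piece added at that stage is the one to inspect. Note that keywords such as if and while also look like calls to these patterns, and nested calls are not handled.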

Is there a file include mechanism for YACC files?

I have three programs that are currently using YACC files to do configuration file parsing. For simplicity, they all read the same configuration file, however, they each respond to keys/values uniquely (so the same .y file can't be used for more than 1 program). It would be nice not to have to repeat the %token declarations for each one - if I want to add one token, I have to change 3 files? What year is it??
These methods aren't working or are giving me issues:
The C preprocessor is obviously run AFTER we YACC the file, so #include for a #define or other macro will not work.
I've tried to script up something similar using sed:
REPLACE_DATA=$(cat <file>)
NEW_FILE=<file>.tmp
sed 's/$PLACEHOLDER/$REPLACE_DATA/g' <file> > $NEW_FILE
However, it seems to strip the newlines from REPLACE_DATA, and it does not replace instances of the contents of the variable PLACEHOLDER; it treats $PLACEHOLDER as literal text instead.
Is there a real include mechanism in YACC, or are there other solutions I'm missing? This is a maintenance nightmare and I'm hoping someone else has run into a similar situation. Thanks in advance.
Here's a sed version from http://www.grymoire.com/Unix/Sed.html#uh-37:
#!/bin/sh
# watch out for a '/' in the parameter
# use alternate search delimiter
sed -e '\_#INCLUDE <'"$1"'>_{
r '"$1"'
d
}'
But traditionally, we used the m4 preprocessor before yacc.
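The include step that the sed script performs can also be sketched in Python, run as a build step before yacc. This is an assumption-laden sketch: the #INCLUDE <name> directive follows the sed script above, while the helper name expand_includes and the file names tokens.inc and parser.y.in are hypothetical. Included files are resolved relative to the including file and are not expanded recursively.

```python
import os
import re
import tempfile

INCLUDE_RE = re.compile(r"^#INCLUDE <(.+)>\s*$")

def expand_includes(path):
    """Return the text of path with each '#INCLUDE <file>' line replaced
    by the contents of that file."""
    base = os.path.dirname(path)
    out = []
    with open(path) as src:
        for line in src:
            m = INCLUDE_RE.match(line)
            if m:
                with open(os.path.join(base, m.group(1))) as inc:
                    out.append(inc.read())
            else:
                out.append(line)
    return "".join(out)

# Demo with throwaway files: a shared token list and one grammar template.
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "tokens.inc"), "w") as f:
        f.write("%token KEY VALUE\n")
    with open(os.path.join(tmp, "parser.y.in"), "w") as f:
        f.write("#INCLUDE <tokens.inc>\n%%\nconfig: KEY VALUE ;\n")
    expanded = expand_includes(os.path.join(tmp, "parser.y.in"))

print(expanded)
```

Each of the three programs would keep its own grammar template and share the single token file, so adding a token means editing one file.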

strange cscope command line limitation

Cscope has eleven search input fields in interactive mode. But when I try to use it in line-oriented output mode and select the "Find all symbol assignments:" field with the -10 switch, it does not work. Any ideas?
Thanks.
I also see some strangeness.
In a terminal,
cscope -d
gives the following options
Find this C symbol:
Find this global definition:
Find functions called by this function:
Find functions calling this function:
Find this text string:
Change this text string:
Find this egrep pattern:
Find this file:
Find files #including this file:
But, using my cscope plugin in gvim,
:cs help
gives the following options
find : Query for a pattern (Usage: find c|d|e|f|g|i|s|t name)
c: Find functions calling this function
d: Find functions called by this function
e: Find this egrep pattern
f: Find this file
g: Find this definition
i: Find files #including this file
s: Find this C symbol
t: Find assignments to
The "Find assignments to" option is available only in the second.
So, for line-oriented output mode, the closest seems to be the "Find this text string:" option. That can be done as
cscope -d -L -4 <text>
The assignment option was added by a Red Hat patch; it is not part of the original cscope. It seems they patched only the ncurses interface without updating the corresponding command-line options.
