Bash script for extracting function calls from c files

Bash script for extracting function calls from c files - c

I'm new to scripting, and I'm attempting to extract all function calls from a c files, all present in a directory.
Here is my code so far, but it seems to be giving no output.
#!/bin/bash
awk '/[ \t]*[a-zA-Z_]*\(([a-zA-Z_]*[ \t]*,?)*\);/ {print $0}' *.c
I'm stumped.
Also the c files all have at least one function call.

You should debug your regexp. Reduce it until you get some matches, then add again the other parts, checking if you get the expected results.

Related

Run TreSpEx analysis

I am trying to run TreSpex analysis on a series of trees, which are saved in newick format as .fasta.txt files in a folder.
I have a list of Taxa names saved in a .txt file
I enter:
perl TreSpEx.v1.pl -fun e -ipt *fasta.txt -tf Taxa_List.txt
But it won't run. I tried writing a loop for each file within the folder but am not very good with them and my line of
for i in treefile/; do perl TreSpEx.v1.1.pl -fun e -ipt *.fasta.txt -tf Taxa_List.txt; done
won't work because -ipt apparently needs a name that starts with a letter or number

In your second example you are actually doing the same thing as in first (but posible several times).
I'm not familiar with TreSpEx or know Bash very well for that matter (which it seems you are using), but you might try something like below.
for i in treefile/*.fasta.txt ; do
perl TreSpEx.v1.1.pl -fun e -ipt $i -tf Taxa_List.txt;
done
Basically, you need to use a variable from the for loop (i) to pass name of each file to the command.

Get all the functions' names from c/cpp files

For example, there is a C file a.c, there are three functions in this file: funA(), funB() and funC().
I want to get all the function names from this file.
Additionally, I also want to get the start line number and end line number of each function.
Is there any solution?
Can I use clang to implement it?

You can compile the file and use nm http://en.wikipedia.org/wiki/Nm_(Unix) on the generated binary. You can then just parse the output of nm to get the function names.
If you want to get line numbers, you can use the function names to parse the source file for line numbers.
All the this can be accomplished with a short perl script that makes system calls to gcc and nm.
This is assuming you are using a *nix system of course...

One solution that works well for the job is cproto. It will scan source files (in K&R or ANSI-C format) and output the function prototypes. You can process entire directories of source files with a find command similar to:
find "$dirname" -type f -name "*.c" \
-exec /path/to/cproto -s \
-I/path/to/extra/includes '{}' >> "$outputfile" \;
While the cproto project is no longer actively developed, the cproto application continues to work very, very well. It provides function output in a reasonable form that can be fairly easily parsed/formatted as you desire.
Note: this is just one option based on my use. There are many others available.

Potential Dangers of Running Code in Parallel

I am working in OSX and using bash for my shell. I have a script which calls an executable hundreds of times, and each call is independent of the other. Therefore I am going to run this code in parallel. However, each call to the executable appends output to a community text file on a new line.
The ordering of the text file is not of importance (although it would be nice, but totally not worth over complicating since I can just use unix sort command), but what is, is that every call of the executable properly printed to the file. My concern is that if I run the script in parallel that the by some freak accident, two threads will check out the text file, print to it and then save different copies back to the original directory of the text file. Thus nullifying one of the writes to the file.
Does this actually happen, or is my understanding of printing to a file flawed? I don't fully know if this would also be a case by case bases so I will provide some mock code of what is being done in my program below.
Script:
#!/bin/sh
abs=$1
input=$(echo "$abs" | awk '{print 0.004 + 0.005*$1 }')
./program input
"./program":
~~Normal .c file stuff here~~
~~VALUE magically calculated here~~
~~run number is pulled out of input and assigned to index for sorting~~
FILE *fpp;
fpp = fopen("Doc.txt","a");
fprintf(fpp,"%d, %.3f\n", index, VALUE);
fclose(fpp);
~Closing events of program.c~~
Commands to run script in parallel in bash:
printf "%s\n" {0..199} | xargs -P 8 -n 1 ./program
Thanks for any help you guys can offer.

A write() call (like fwrite()) with the append flag set in open() (like during fopen()) is guaranteed to avoid the race condition you describe.
O_APPEND
If set, the file offset shall be set to the end of the file prior to each write.
From: POSIX specifications for open:
opengroup.org open

Race conditions are what you are thinking of.
Not 100% sure but if you simple append to the end of the file rather than opening it and editing it should be right

If you have the option, make your program write to standard output instead of directly to a file. Then you can let the shell merge the output of your programs:
printf "%s\n" {0..199} | parallel -P 8 -n 1 ./program > merged_output.txt

Yeah, that looks like a recipe for disaster. If those processes both hit opening the file at the roughly the same time, only one will "take".
I suggest either (easier) writing to separate files then catting them together when the processing is done, or (harder) sending all results to a consumer process that will write the file for everyone.

Is there a file include mechanism for YACC files?

I have three programs that are currently using YACC files to do configuration file parsing. For simplicity, they all read the same configuration file, however, they each respond to keys/values uniquely (so the same .y file can't be used for more than 1 program). It would be nice not to have to repeat the %token declarations for each one - if I want to add one token, I have to change 3 files? What year is it??
These methods aren't working or are giving me issues:
The C preprocessor is obviously run AFTER we YACC the file, so #include for a #define or other macro will not work.
I've tried to script up something similar using sed:
REPLACE_DATA=$(cat <file>)
NEW_FILE=<file>.tmp
sed 's/$PLACEHOLDER/$REPLACE_DATA/g' <file> > $NEW_FILE
However it seems that it's stripping my newlines in REPLACE_DATA and then not replacing instances of $PLACEHOLDER instead of replacing the contents of the variables PLACEHOLDER.
Is there a real include mechanism in YACC, or are there other solutions I'm missing? This is a maintenance nightmare and I'm hoping someone else has run into a similar situation. Thanks in advance.

here's a sed version from http://www.grymoire.com/Unix/Sed.html#uh-37
#!/bin/sh
# watch out for a '/' in the parameter
# use alternate search delimiter
sed -e '\_#INCLUDE <'"$1"'>_{
r '"$1"'
d
}'
But traditionally, we used the m4 preprocessor before yacc.

How to get the range (i.e., the line number) of all functions in a file in C?

I want to get both the beginning and ending line numbers of all functions in a file in C. Does any one know that whether there is a easy-to-use tool in Linux for this purpose?

$ ctags -x --c-kinds=f filename.c
This only gives the starting line of each function, but perhaps that is good enough.
If the code was written using fairly common conventions, the function should end with a single line containing } in the first column, so it is fairly easy to get the last line given the first:
awk 'NR > first && /^}$/ { print NR; exit }' first=$FIRST_LINE filename.c

Develop Reference

c reactjs sql-server angularjs arrays wpf database batch-file google-app-engine silverlight

Bash script for extracting function calls from c files - c

You should debug your regexp. Reduce it until you get some matches, then add again the other parts, checking if you get the expected results.

Related

Run TreSpEx analysis

Get all the functions' names from c/cpp files

Potential Dangers of Running Code in Parallel

Is there a file include mechanism for YACC files?

How to get the range (i.e., the line number) of all functions in a file in C?

Categories

Resources