using a shared library: custom myfopen(), myfwrite() - c

Hello, I have created a shared library named logger.so. This library contains my custom fopen() and fwrite(): it collects some data and produces a file containing that data.
Now I am writing a bash script and I want to use this library there as well. While producing a txt file, I would like to see the extra file that my custom fopen() produces; whenever fopen() is called from a C program, this extra file appears.
My question is: which commands in bash use the fopen() and fwrite() functions?
I have already preloaded my shared library, but it doesn't work. Maybe these commands don't use fopen()/fwrite():
export LD_PRELOAD=./logger.so
read -p 'Enter the number of your files to create: [ENTER]: ' file_number
for ((i=1; i<=file_number; i++))
do
echo file_"$i" > "file_${i}"
done
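(For reference, a preloadable fopen() wrapper of the kind described usually follows the pattern sketched below. This is a minimal sketch, not the actual logger.so; the log file name fopen_calls.log is an assumption. Build with: gcc -shared -fPIC -o logger.so logger_sketch.c -ldl)
/* logger_sketch.c -- minimal LD_PRELOAD-style fopen() interposer */
#define _GNU_SOURCE
#include <stdio.h>
#include <dlfcn.h>

FILE *fopen(const char *path, const char *mode)
{
    /* Look up the real fopen from the next object in the search order (libc). */
    FILE *(*real_fopen)(const char *, const char *) =
        (FILE *(*)(const char *, const char *))dlsym(RTLD_NEXT, "fopen");
    if (!real_fopen)
        return NULL;

    /* Write the "extra" log entry; using the real fopen avoids recursing into ourselves. */
    FILE *log = real_fopen("fopen_calls.log", "a");
    if (log) {
        fprintf(log, "fopen(\"%s\", \"%s\")\n", path, mode);
        fclose(log);
    }
    return real_fopen(path, mode);
}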

This may require some trial and error. I see two ways to do this. One is to run potential file-handling commands in bash under a tracer and see whether they call fopen:
strace bash -c "read a < /dev/null"
or
strace bash -c "read a < /dev/null" 2>&1 | fgrep fopen
This shows that read uses open, not fopen (strace traces system calls, so a libc call like fopen never appears by name; what you see is the underlying open).
Another way is to grep through the source code of bash, as @oguz suggested. When I did this I found several places where fopen is called, but I did not investigate further:
curl https://mirrors.tripadvisor.com/gnu/bash/bash-5.1-rc3.tar.gz | tar -z -x -f - --to-stdout --wildcards \*.c | fgrep fopen
You'll want to unarchive the whole package and search through the .c files one by one, e.g.:
curl https://mirrors.tripadvisor.com/gnu/bash/bash-5.1-rc3.tar.gz | tar -z -v -x -f - --wildcards \*.c
You can also FTP the file or save it via a browser and do the tar standalone if you don't have curl (wget will also work).
Hopefully you can trace the relevant commands but it might not be easy.

Not every program necessarily uses the C standard library stdio functions fopen and fwrite.
But every program that opens and writes files uses the open and write system calls, which you can interpose / monkey-patch.
Modern programs that use io_uring require a different method of interposing.
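If the commands go through open(2) rather than fopen(3) -- bash's own redirections do, which matches the strace output above -- the same preloading trick can be applied one level down, to the libc wrapper. A minimal sketch (this is not the OP's logger.so; the file name open_calls.log is an assumption; some programs call openat()/open64() instead, which would need their own wrappers, and io_uring-based I/O bypasses these wrappers entirely). Build with: gcc -shared -fPIC -o open_logger.so open_logger.c -ldl
/* open_logger.c -- sketch of interposing the libc open() wrapper */
#define _GNU_SOURCE
#include <stdarg.h>
#include <stdio.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <dlfcn.h>

int open(const char *path, int flags, ...)
{
    /* The third argument is only present when O_CREAT is given. */
    mode_t mode = 0;
    if (flags & O_CREAT) {
        va_list ap;
        va_start(ap, flags);
        mode = va_arg(ap, mode_t);
        va_end(ap);
    }

    int (*real_open)(const char *, int, ...) =
        (int (*)(const char *, int, ...))dlsym(RTLD_NEXT, "open");

    /* Log with write(2) rather than stdio, to avoid re-entering fopen/fwrite. */
    char buf[512];
    snprintf(buf, sizeof buf, "open(\"%s\", 0x%x)\n", path, (unsigned)flags);
    int log_fd = real_open("open_calls.log", O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (log_fd != -1) {
        write(log_fd, buf, strlen(buf));
        close(log_fd);
    }

    return real_open(path, flags, mode);
}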

Related

Trying to use the Yajl example code, don't know how to give the input to the program

I've compiled the example code provided by the YAJL library in C. I don't know why the compiled program isn't parsing my file, or whether my file format is wrong.
I'm quite new to parsing JSON files.
The C code example is located at https://lloyd.github.io/yajl/ (I don't know whether I need to paste the entire code here or whether the link is fine).
My file:
cat input_file.json
helllooooooooooo
When I ran the program as ./a.out json_reformat input_file.json, it didn't do anything. ./a.out -m json_reformat input_file.json didn't work either.
I tried the -u and -m options; nothing worked.
It prints the usage to stdout like this:
usage: json_reformat [options]
-m minimize json rather than beautify (default)
-u allow invalid UTF8 inside strings during parsing

Get all the function names from C/C++ files

For example, there is a C file a.c with three functions in it: funA(), funB() and funC().
I want to get all the function names from this file.
Additionally, I also want to get the start line number and end line number of each function.
Is there any solution?
Can I use clang to implement it?
You can compile the file and use nm (http://en.wikipedia.org/wiki/Nm_(Unix)) on the generated binary. You can then just parse the output of nm to get the function names.
If you want line numbers, you can use the function names to search the source file for them.
All of this can be accomplished with a short perl script that makes system calls to gcc and nm.
This assumes you are using a *nix system, of course...
One solution that works well for the job is cproto. It will scan source files (in K&R or ANSI-C format) and output the function prototypes. You can process entire directories of source files with a find command similar to:
find "$dirname" -type f -name "*.c" \
-exec /path/to/cproto -s \
-I/path/to/extra/includes '{}' >> "$outputfile" \;
While the cproto project is no longer actively developed, the cproto application continues to work very, very well. It provides function output in a reasonable form that can be fairly easily parsed/formatted as you desire.
Note: this is just one option based on my use. There are many others available.
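Since the question also asks whether clang can do it: libclang (clang's C API) can walk the AST and report each function definition together with its extent. A minimal sketch under that assumption (build flags and include paths may differ per system, e.g. gcc find_functions.c -lclang -o find_functions):
/* find_functions.c -- list function definitions with their start/end lines */
#include <clang-c/Index.h>
#include <stdio.h>

static enum CXChildVisitResult visit(CXCursor cursor, CXCursor parent, CXClientData data)
{
    (void)parent; (void)data;
    /* Only report function definitions that live in the file we parsed. */
    if (clang_getCursorKind(cursor) == CXCursor_FunctionDecl &&
        clang_isCursorDefinition(cursor) &&
        clang_Location_isFromMainFile(clang_getCursorLocation(cursor))) {
        CXString name = clang_getCursorSpelling(cursor);
        CXSourceRange extent = clang_getCursorExtent(cursor);
        unsigned start_line = 0, end_line = 0;
        clang_getSpellingLocation(clang_getRangeStart(extent), NULL, &start_line, NULL, NULL);
        clang_getSpellingLocation(clang_getRangeEnd(extent), NULL, &end_line, NULL, NULL);
        printf("%s: lines %u-%u\n", clang_getCString(name), start_line, end_line);
        clang_disposeString(name);
    }
    return CXChildVisit_Continue;
}

int main(int argc, char **argv)
{
    if (argc < 2) { fprintf(stderr, "usage: %s file.c\n", argv[0]); return 1; }
    CXIndex index = clang_createIndex(0, 0);
    CXTranslationUnit tu = clang_parseTranslationUnit(index, argv[1], NULL, 0, NULL, 0,
                                                      CXTranslationUnit_None);
    if (!tu) { fprintf(stderr, "failed to parse %s\n", argv[1]); return 1; }
    clang_visitChildren(clang_getTranslationUnitCursor(tu), visit, NULL);
    clang_disposeTranslationUnit(tu);
    clang_disposeIndex(index);
    return 0;
}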

Potential Dangers of Running Code in Parallel

I am working in OSX and using bash for my shell. I have a script which calls an executable hundreds of times, and each call is independent of the others. Therefore I am going to run this code in parallel. However, each call to the executable appends output to a community text file on a new line.
The ordering of the text file is not important (it would be nice, but it's not worth over-complicating things since I can just use the unix sort command); what is important is that every call of the executable properly prints to the file. My concern is that if I run the script in parallel, by some freak accident two threads will check out the text file, print to it, and then save different copies back to the original directory of the text file, thus nullifying one of the writes to the file.
Does this actually happen, or is my understanding of printing to a file flawed? I don't know whether this is a case-by-case thing, so I will provide some mock code of what is being done in my program below.
Script:
#!/bin/sh
abs=$1
input=$(echo "$abs" | awk '{print 0.004 + 0.005*$1 }')
./program "$input"
"./program":
~~Normal .c file stuff here~~
~~VALUE magically calculated here~~
~~run number is pulled out of input and assigned to index for sorting~~
FILE *fpp;
fpp = fopen("Doc.txt","a");
fprintf(fpp,"%d, %.3f\n", index, VALUE);
fclose(fpp);
~~Closing events of program.c~~
Commands to run script in parallel in bash:
printf "%s\n" {0..199} | xargs -P 8 -n 1 ./program
Thanks for any help you guys can offer.
A write() call (such as the one fwrite() ultimately makes) with the append flag set in open() (as fopen() with mode "a" does) is guaranteed to avoid the race condition you describe, provided each record is emitted in a single underlying write(); a short fprintf() flushed by fclose(), as in your program, normally is.
O_APPEND
If set, the file offset shall be set to the end of the file prior to each write.
From the POSIX specification for open() (opengroup.org).
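If you want to be certain each record reaches the file in one piece, you can bypass stdio buffering and emit the whole line with a single write(2) on an O_APPEND descriptor. A minimal sketch along the lines of the mock program above (file name and record format copied from it):
/* append_record.c -- one record, one write(2), safe with concurrent writers */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int append_record(int index, double value)
{
    char line[64];
    int len = snprintf(line, sizeof line, "%d, %.3f\n", index, value);
    if (len < 0 || (size_t)len >= sizeof line)
        return -1;

    int fd = open("Doc.txt", O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (fd == -1)
        return -1;

    ssize_t written = write(fd, line, (size_t)len);   /* single atomic append */
    close(fd);
    return written == len ? 0 : -1;
}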
Race conditions are what you are thinking of.
I'm not 100% sure, but if you simply append to the end of the file rather than opening it and editing it, you should be all right.
If you have the option, make your program write to standard output instead of directly to a file. Then you can let the shell merge the output of your programs:
printf "%s\n" {0..199} | parallel -P 8 -n 1 ./program > merged_output.txt
Yeah, that looks like a recipe for disaster. If those processes both hit opening the file at roughly the same time, only one will "take".
I suggest either (easier) writing to separate files then catting them together when the processing is done, or (harder) sending all results to a consumer process that will write the file for everyone.

atomic create file if not exists from bash script

In system call open(), if I open with O_CREAT | O_EXCL, the system call ensures that the file will only be created if it does not exist. The atomicity is guaranteed by the system call. Is there a similar way to create a file in an atomic fashion from a bash script?
UPDATE:
I found two different atomic ways:
Use set -o noclobber. Then you can use the > operator atomically.
Just use mkdir; mkdir is atomic.
A 100% pure bash solution:
set -o noclobber
{ > file ; } &> /dev/null
This command creates a file named file if no file with that name exists. If a file named file already exists, it does nothing (but returns a non-zero return code).
Pros of > over the touch command:
Doesn't update timestamp if file already existed
100% bash builtin
Return code as expected: fail if file already existed or if file couldn't be created; success if file didn't exist and was created.
Cons:
need to set the noclobber option (but it's okay in a script, if you're careful with redirections, or unset it afterwards).
I guess this solution is really the bash counterpart of the open system call with O_CREAT | O_EXCL.
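For comparison, a minimal C sketch of the open(O_CREAT | O_EXCL) call that the noclobber redirection emulates (file name "file" is just an example):
/* excl_create.c -- create "file" atomically, fail if it already exists */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    /* Succeeds only if "file" did not already exist; otherwise fails with EEXIST,
     * the same contract as noclobber's ">". */
    int fd = open("file", O_CREAT | O_EXCL | O_WRONLY, 0644);
    if (fd == -1) {
        perror("open");
        return 1;
    }
    close(fd);
    return 0;
}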
Here's a bash function using the mv -n trick:
function mkatomic() {
    f="$(mktemp)"
    mv -n "$f" "$1"
    if [ -e "$f" ]; then
        rm "$f"
        echo "ERROR: file exists:" "$1" >&2
        return 1
    fi
}
Examples:
$ mkatomic foo
$ wc -c foo
0 foo
$ mkatomic foo
ERROR: file exists: foo
You could create it under a randomly-generated name, then rename (mv -n random desired) it into place with the desired name. The rename will fail if the file already exists.
Like this:
#!/bin/bash
touch randomFileName
mv -n randomFileName lockFile
if [ -e randomFileName ] ; then
    echo "Failed to acquire lock"
else
    echo "Acquired lock"
fi
Just to be clear, ensuring the file will only be created if it doesn't exist is not the same thing as atomicity. The operation is atomic if and only if, when two or more separate threads attempt to do the same thing at the same time, exactly one will succeed and all others will fail.
The best way I know of to create a file atomically in a shell script follows this pattern (and it's not perfect):
create a file that has an extremely high chance of not existing (using a decent random number selection or something in the file name), and place some unique content in it (something that no other thread would have - again, a random number or something)
verify that the file exists and contains the contents you expect it to
create a hard link from that file to the desired file
verify that the desired file contains the expected contents
In particular, touch is not atomic, since it will create the file if it's not there, or simply update the timestamp if it is. You might be able to play games with different timestamps, but reading and parsing a timestamp to see whether you "won" the race is harder than the above. mkdir can be atomic, but you would have to check the return code, because otherwise you can only tell that "yes, the directory was created, but I don't know which thread won". If you're on a file system that doesn't support hard links, you might have to settle for a less ideal solution.
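The atomic step in the pattern above is the hard link itself: at the system-call level it boils down to link(2), which fails if the target already exists, so exactly one contender wins. A minimal C sketch of that step (the names locktmp.XXXXXX and lockfile are assumptions):
/* link_lock.c -- "unique file + hard link" locking sketch */
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    char tmpl[] = "./locktmp.XXXXXX";
    int fd = mkstemp(tmpl);                  /* unique scratch file */
    if (fd == -1) { perror("mkstemp"); return 1; }
    close(fd);

    int won = (link(tmpl, "lockfile") == 0); /* the atomic step */
    int saved_errno = errno;
    unlink(tmpl);                            /* scratch file no longer needed either way */

    if (!won) {
        fprintf(stderr, "lock already held (%s)\n", strerror(saved_errno));
        return 1;
    }
    puts("acquired lockfile");
    return 0;
}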
Another way to do this is to use umask to try to create the file and open it for writing, without creating it with write permissions, like this:
LOCK_FILE=only_one_at_a_time_please
UMASK=$(umask)
umask 777
echo "$$" > "$LOCK_FILE"
umask "$UMASK"
trap "rm '$LOCK_FILE'" EXIT
If the file is missing, the script will succeed at creating and opening it for writing, despite the file being created without writing permissions. If it already exists, the script won't be able to open the file for writing. It would be possible to use exec to open the file and keep the file descriptor around.
rm requires you to have write permissions to the directory itself, without regard to file permissions.
touch is the command you are looking for. It updates timestamps of the provided file if the file exists or creates it if it doesn't.

Redirect stdout to the file it is reading content from

When I want to modify files using Bash, I usually redirect stdout to another file and then remove the original. Is there a faster way?
For example:
cut -d':' -f2 FILE > FILE
This makes my source file empty (the redirection truncates the file before cut starts reading it).
All I can do is:
cut -d':' -f2 FILE > FILE2
rm -f FILE
mv FILE2 FILE
Is there a way to redirect the (modified) output to the original file in just one step?
If you install Colin Watson's sponge utility (the link on Colin's blog seems to be dead, but you can get it as part of Joey Hess' moreutils package), you can use it to "soak up" the output before dumping it back into the file:
cat FILE | cut -d':' -f2 | sponge FILE
I'm afraid I don't have an answer for you in bash, but it's very easy in zsh:
cut -d':' -f2 =(cat FILE) > FILE
The =(...) construct creates a temporary file containing the output of the program mentioned which is removed after the command containing it is finished.
I'm afraid there is no convenient general way of doing that in bash. The problem is that, before running any command, bash opens FILE for output and truncates it, so by the time the reading command opens the file, all it sees is the new, empty file. sed has a -i option to do the replacement in place (for which it actually uses a temporary file), and zsh has =(...), as mentioned in other answers, and sort manually checks whether the input and output files match when using -o (with a plain shell redirect, the file would be wiped before the program ever sees it), but cut and many other utilities don't. You have to use a temporary file for those.
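Under the hood, the safe way to do this is the same temp-file-and-rename pattern that sed -i uses: write the transformed contents to a scratch file, then atomically replace the original with rename(2). A rough C sketch under those assumptions (paths simplified, error handling minimal):
/* rewrite_file.c -- sketch of "write to temp, rename over original" */
#include <stdio.h>

int rewrite_file(const char *path)
{
    char tmp[4096];
    snprintf(tmp, sizeof tmp, "%s.tmp", path);

    FILE *in = fopen(path, "r");
    FILE *out = in ? fopen(tmp, "w") : NULL;
    if (!in || !out) {
        if (in) fclose(in);
        return -1;
    }

    char line[4096];
    while (fgets(line, sizeof line, in)) {
        /* ... transform the line here (e.g., keep only the second ':' field) ... */
        fputs(line, out);
    }

    fclose(in);
    if (fclose(out) != 0)
        return -1;
    return rename(tmp, path);   /* atomic replacement on the same filesystem */
}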
