Stdout to file from which is reading content - file

When I want to modify files using Bash, I usually redirect stdout to another file and then remove the original. Is there a faster way?
For example:
cut -d':' -f2 FILE > FILE
This makes my source file empty (it empty the file before writing and then starts reading)
All I can do is:
cut -d':' -f2 FILE > FILE2
rm -f FILE
mv FILE2 FILE
Is there a way to redirect the (modified) output to the original file in just one step?

If you install Colin Watson's sponge utility (the link on Colin's blog seems to be dead, but you can get it as part of Joey Hess' moreutils package), you can use it to "soak up" the output before dumping it back into the file:
cat FILE | cut -d':' -f2 | sponge FILE

I'm afraid I don't have an answer for you in bash, but it's very easy in zsh:
cut -d':' -f2 =(cat FILE) > FILE
The =(...) construct creates a temporary file containing the output of the program mentioned which is removed after the command containing it is finished.

I'm afraid there is no convenient general way of doing that in bash. The problem is that before running any commands, bash opens the FILE for output, truncating it, so when cat opens the file, all it sees is the new, empty file. sed has a -i option to do the replacement inline (for which it actually uses a temporary file), and zsh has =(...), as mentioned in other answers, and sort manually checks whether the input and output files match (when using -o, as I mentioned, using the shell redirect will wipe the file before the program ever sees it), but cut and many other utilities don't. You have to use a temporary file for those.

Related

using a shared library : custom myfopen(), myfrwrite()

Hello I have created a shared library named logger.so. This library is my custom fopen() and fwrite(). It collects some data and produce a file with these data.
Now I am writing a bash script and I want to use this library as well. Producing a txt file I would be able to see the extra file that produce my custom fopen(). So when I use the command fopen() in a c file this extra file is produced.
My Question is which commands are using fopen() and fwrite() functions in bash?
I have already preloaded my shared library but It doesn't work. Maybe these commands don't use fopen(),fwrite()
export LD_PRELOAD=./logger.so
read -p 'Enter the number of your files to create: [ENTER]: ' file_number
for ((i=1; i<=file_number; i++))
do
echo file_"$i" > "file_${i}"
done
This may require some trial and error. I see two ways to do this. One is to run potential commands that deal with files in bash and see if when traced they call fopen:
strace bash -c "read a < /dev/null"`
or
strace bash -c "read a < /dev/null"` 2&>1 | fgrep fopen
This shows that that read uses open, not fopen.
Another way is to grep through the source code of bash as #oguz suggested. When I did this I found several places where fopen is called, but I did not investigate further:
curl https://mirrors.tripadvisor.com/gnu/bash/bash-5.1-rc3.tar.gz|tar -z -x --to-stdout --wildcards \*.c | fgrep fopen
You'll want to unarchive the whole package and search through the .c files one by one, e.g.:
curl https://mirrors.tripadvisor.com/gnu/bash/bash-5.1-rc3.tar.gz|tar -z -v -x --wildcards \*.c
You can also FTP the file or save it via a browser and do the tar standalone if you don't have curl (wget will also work).
Hopefully you can trace the relevant commands but it might not be easy.
Not necessarily every program uses C standard library stdio functions fopen and fwrite.
But every program uses open and write syscalls to open and write files, which you can interpose / monkey-patch.
Modern programs that use io_uring require a different method of interposing.

Bash array behaving strangely from sed usage [duplicate]

I am making an NW.js app on macOS, and want to run the app in dev mode
by double-clicking on an icon.
In the first step, I'm trying to make my shell script work.
Using VS Code on Windows (I wanted to gain time), I have created a run-nw file at the root of my project, containing this:
#!/bin/bash
cd "src"
npm install
cd ..
./tools/nwjs-sdk-v0.17.3-osx-x64/nwjs.app/Contents/MacOS/nwjs "src" &
but I get this output:
$ sh ./run-nw
: command not found
: No such file or directory
: command not found
: No such file or directory
Usage: npm <command>
where <command> is one of: (snip commands list)
(snip npm help)
npm#3.10.3 /usr/local/lib/node_modules/npm
: command not found
: No such file or directory
: command not found
Some things I don't understand.
It seems that it takes empty lines as commands.
In my editor (VS Code) I have tried to replace \r\n with \n
(in case the \r creates problems) but it changes nothing.
It seems that it doesn't find the folders
(with or without the dirname instruction),
or maybe it doesn't know about the cd command ?
It seems that it doesn't understand the install argument to npm.
The part that really weirds me out, is that it still runs the app
(if I did an npm install manually)...
Not able to make it work properly, and suspecting something weird with
the file itself, I created a new one directly on the Mac, using vim this time.
I entered the exact same instructions, and... now it works without any
issues.
A diff on the two files reveals exactly zero difference.
What can be the difference? What can make the first script not work? How can I find out?
Update
Following the accepted answer's recommendations, after the wrong line
endings came back, I checked multiple things.
It turns out that since I copied my ~/.gitconfig from my Windows
machine, I had autocrlf=true, so every time I modified the bash
file under Windows, it re-set the line endings to \r\n.
So, in addition to running dos2unix (which you will have to
install using Homebrew on a Mac), if you're using Git, check your
.gitconfig file.
Yes. Bash scripts are sensitive to line-endings, both in the script itself and in data it processes. They should have Unix-style line-endings, i.e., each line is terminated with a Line Feed character (decimal 10, hex 0A in ASCII).
DOS/Windows line endings in the script
With Windows or DOS-style line endings , each line is terminated with a Carriage Return followed by a Line Feed character. You can see this otherwise invisible character in the output of cat -v yourfile:
$ cat -v yourfile
#!/bin/bash^M
^M
cd "src"^M
npm install^M
^M
cd ..^M
./tools/nwjs-sdk-v0.17.3-osx-x64/nwjs.app/Contents/MacOS/nwjs "src" &^M
In this case, the carriage return (^M in caret notation or \r in C escape notation) is not treated as whitespace. Bash interprets the first line after the shebang (consisting of a single carriage return character) as the name of a command/program to run.
Since there is no command named ^M, it prints : command not found
Since there is no directory named "src"^M (or src^M), it prints : No such file or directory
It passes install^M instead of install as an argument to npm which causes npm to complain.
DOS/Windows line endings in input data
Like above, if you have an input file with carriage returns:
hello^M
world^M
then it will look completely normal in editors and when writing it to screen, but tools may produce strange results. For example, grep will fail to find lines that are obviously there:
$ grep 'hello$' file.txt || grep -x "hello" file.txt
(no match because the line actually ends in ^M)
Appended text will instead overwrite the line because the carriage returns moves the cursor to the start of the line:
$ sed -e 's/$/!/' file.txt
!ello
!orld
String comparison will seem to fail, even though strings appear to be the same when writing to screen:
$ a="hello"; read b < file.txt
$ if [[ "$a" = "$b" ]]
then echo "Variables are equal."
else echo "Sorry, $a is not equal to $b"
fi
Sorry, hello is not equal to hello
Solutions
The solution is to convert the file to use Unix-style line endings. There are a number of ways this can be accomplished:
This can be done using the dos2unix program:
dos2unix filename
Open the file in a capable text editor (Sublime, Notepad++, not Notepad) and configure it to save files with Unix line endings, e.g., with Vim, run the following command before (re)saving:
:set fileformat=unix
If you have a version of the sed utility that supports the -i or --in-place option, e.g., GNU sed, you could run the following command to strip trailing carriage returns:
sed -i 's/\r$//' filename
With other versions of sed, you could use output redirection to write to a new file. Be sure to use a different filename for the redirection target (it can be renamed later).
sed 's/\r$//' filename > filename.unix
Similarly, the tr translation filter can be used to delete unwanted characters from its input:
tr -d '\r' <filename >filename.unix
Cygwin Bash
With the Bash port for Cygwin, there’s a custom igncr option that can be set to ignore the Carriage Return in line endings (presumably because many of its users use native Windows programs to edit their text files).
This can be enabled for the current shell by running set -o igncr.
Setting this option applies only to the current shell process so it can be useful when sourcing files with extraneous carriage returns. If you regularly encounter shell scripts with DOS line endings and want this option to be set permanently, you could set an environment variable called SHELLOPTS (all capital letters) to include igncr. This environment variable is used by Bash to set shell options when it starts (before reading any startup files).
Useful utilities
The file utility is useful for quickly seeing which line endings are used in a text file. Here’s what it prints for for each file type:
Unix line endings: Bourne-Again shell script, ASCII text executable
Mac line endings: Bourne-Again shell script, ASCII text executable, with CR line terminators
DOS line endings: Bourne-Again shell script, ASCII text executable, with CRLF line terminators
The GNU version of the cat utility has a -v, --show-nonprinting option that displays non-printing characters.
The dos2unix utility is specifically written for converting text files between Unix, Mac and DOS line endings.
Useful links
Wikipedia has an excellent article covering the many different ways of marking the end of a line of text, the history of such encodings and how newlines are treated in different operating systems, programming languages and Internet protocols (e.g., FTP).
Files with classic Mac OS line endings
With Classic Mac OS (pre-OS X), each line was terminated with a Carriage Return (decimal 13, hex 0D in ASCII). If a script file was saved with such line endings, Bash would only see one long line like so:
#!/bin/bash^M^Mcd "src"^Mnpm install^M^Mcd ..^M./tools/nwjs-sdk-v0.17.3-osx-x64/nwjs.app/Contents/MacOS/nwjs "src" &^M
Since this single long line begins with an octothorpe (#), Bash treats the line (and the whole file) as a single comment.
Note: In 2001, Apple launched Mac OS X which was based on the BSD-derived NeXTSTEP operating system. As a result, OS X also uses Unix-style LF-only line endings and since then, text files terminated with a CR have become extremely rare. Nevertheless, I think it’s worthwhile to show how Bash would attempt to interpret such files.
On JetBrains products (PyCharm, PHPStorm, IDEA, etc.), you'll need to click on CRLF/LF to toggle between the two types of line separators (\r\n and \n).
I was trying to startup my docker container from Windows and got this:
Bash script and /bin/bash^M: bad interpreter: No such file or directory
I was using git bash and the problem was about the git config, then I just did the steps below and it worked. It will configure Git to not convert line endings on checkout:
git config --global core.autocrlf input
delete your local repository
clone it again.
Many thanks to Jason Harmon in this link:
https://forums.docker.com/t/error-while-running-docker-code-in-powershell/34059/6
Before that, I tried this, that didn't works:
dos2unix scriptname.sh
sed -i -e 's/\r$//' scriptname.sh
sed -i -e 's/^M$//' scriptname.sh
If you're using the read command to read from a file (or pipe) that is (or might be) in DOS/Windows format, you can take advantage of the fact that read will trim whitespace from the beginning and ends of lines. If you tell it that carriage returns are whitespace (by adding them to the IFS variable), it'll trim them from the ends of lines.
In bash (or zsh or ksh), that means you'd replace this standard idiom:
IFS= read -r somevar # This will not trim CR
with this:
IFS=$'\r' read -r somevar # This *will* trim CR
(Note: the -r option isn't related to this, it's just usually a good idea to avoid mangling backslashes.)
If you're not using the IFS= prefix (e.g. because you want to split the data into fields), then you'd replace this:
read -r field1 field2 ... # This will not trim CR
with this:
IFS=$' \t\n\r' read -r field1 field2 ... # This *will* trim CR
If you're using a shell that doesn't support the $'...' quoting mode (e.g. dash, the default /bin/sh on some Linux distros), or your script even might be run with such a shell, then you need to get a little more complex:
cr="$(printf '\r')"
IFS="$cr" read -r somevar # Read trimming *only* CR
IFS="$IFS$cr" read -r field1 field2 ... # Read trimming CR and whitespace, and splitting fields
Note that normally, when you change IFS, you should put it back to normal as soon as possible to avoid weird side effects; but in all these cases, it's a prefix to the read command, so it only affects that one command and doesn't have to be reset afterward.
Coming from a duplicate, if the problem is that you have files whose names contain ^M at the end, you can rename them with
for f in *$'\r'; do
mv "$f" "${f%$'\r'}"
done
You properly want to fix whatever caused these files to have broken names in the first place (probably a script which created them should be dos2unixed and then rerun?) but sometimes this is not feasible.
The $'\r' syntax is Bash-specific; if you have a different shell, maybe you need to use some other notation. Perhaps see also Difference between sh and bash
Since VS Code is being used, we can see CRLF or LF in the bottom right depending on what's being used and if we click on it we can change between them (LF is being used in below example):
We can also use the "Change End of Line Sequence" command from the command pallet. Whatever's easier to remember since they're functionally the same.
For Notepad++ users, this can be solved by:
One more way to get rid of the unwanted CR ('\r') character is to run the tr command, for example:
$ tr -d '\r' < dosScript.py > nixScript.py
I ran into this issue when I use git with WSL.
git has a feature where it changes the line-ending of files according to the OS you are using, on Windows it make sure the line endings are \r\n which is not compatible with Linux which uses only \n.
You can resolve this problem by adding a file name .gitattributes to your git root directory and add lines as following:
config/* text eol=lf
run.sh text eol=lf
In this example all files inside config directory will have only line-feed line ending and run.sh file as well.
The simplest way on MAC / Linux - create a file using 'touch' command, open this file with VI or VIM editor, paste your code and save. This would automatically remove the windows characters.
If you are using a text editor like BBEdit you can do it at the status bar. There is a selection where you can switch.
For IntelliJ users, here is the solution for writing Linux script.
Use LF - Unix and masOS (\n)
Scripts may call each other.
An even better magic solution is to convert all scripts in the folder/subfolders:
find . -name "*.sh" -exec sed -i -e 's/\r$//' {} +
You can use dos2unix too but many servers do not have it installed by default.
For the sake of completeness, I'll point out another solution which can solve this problem permanently without the need to run dos2unix all the time:
sudo ln -s /bin/bash `printf 'bash\r'`

Is there a unix command to count the number of different file types in a directory?

I used the ls | wc -1 command to count the number of files in a directory. Is there a command to count the number of different file types ? Say the directory has 2 text files and one jpeg, the output should be 2 (text and jpeg are the different file types).
Any help is much appreciated. Thanks !
There is no single command (although you can certainly create one!) to do what you want, but it is quite simple to get your result. Decide exactly how you want to distinguish file type (filename extension, file content, name, etc.), then use common tools to count the result. If you are happy with the results printed by the file command, perhaps something as simple as:
file * | awk '{$1=""}1' | sort -u | wc -l
The awk filters out the first column of output (the filename) and the remaining processes in the pipeline count the results. This is fragile and will break if any of your filenames contain whitespace, so you might want to use : for the field separater in awk (in which case the solution is fragile and will fail if any filename contains a colon.)
Use file to find out the file types. Pipe that through grep to filter out things like images etc. and then do a wc -l.

atomic create file if not exists from bash script

In system call open(), if I open with O_CREAT | O_EXCL, the system call ensures that the file will only be created if it does not exist. The atomicity is guaranteed by the system call. Is there a similar way to create a file in an atomic fashion from a bash script?
UPDATE:
I found two different atomic ways
Use set -o noclobber. Then you can use > operator atomically.
Just use mkdir. Mkdir is atomic
A 100% pure bash solution:
set -o noclobber
{ > file ; } &> /dev/null
This command creates a file named file if there's no existent file named file. If there's a file named file, then do nothing (but return a non-zero return code).
Pros of > over the touch command:
Doesn't update timestamp if file already existed
100% bash builtin
Return code as expected: fail if file already existed or if file couldn't be created; success if file didn't exist and was created.
Cons:
need to set the noclobber option (but it's okay in a script, if you're careful with redirections, or unset it afterwards).
I guess this solution is really the bash counterpart of the open system call with O_CREAT | O_EXCL.
Here's a bash function using the mv -n trick:
function mkatomic() {
f="$(mktemp)"
mv -n "$f" "$1"
if [ -e "$f" ]; then
rm "$f"
echo "ERROR: file exists:" "$1" >&2
return 1
fi
}
Examples:
$ mkatomic foo
$ wc -c foo
0 foo
$ mkatomic foo
ERROR: file exists: foo
You could create it under a randomly-generated name, then rename (mv -n random desired) it into place with the desired name. The rename will fail if the file already exists.
Like this:
#!/bin/bash
touch randomFileName
mv -n randomFileName lockFile
if [ -e randomFileName ] ; then
echo "Failed to acquired lock"
else
echo "Acquired lock"
fi
Just to be clear, ensuring the file will only be created if it doesn't exist is not the same thing as atomicity. The operation is atomic if and only if, when two or more separate threads attempt to do the same thing at the same time, exactly one will succeed and all others will fail.
The best way I know of to create a file atomically in a shell script follows this pattern (and it's not perfect):
create a file that has an extremely high chance of not existing (using a decent random number selection or something in the file name), and place some unique content in it (something that no other thread would have - again, a random number or something)
verify that the file exists and contains the contents you expect it to
create a hard link from that file to the desired file
verify that the desired file contains the expected contents
In particular, touch is not atomic, since it will create the file if it's not there, or simply update the timestamp. You might be able to play games with different timestamps, but reading and parsing a timestamp to see if you "won" the race is harder than the above. mkdir can be atomic, but you would have to check the return code, because otherwise, you can only tell that "yes, the directory was created, but I don't know which thread won". If you're on a file system that doesn't support hard links, you might have to settle for a less ideal solution.
Another way to do this is to use umask to try to create the file and open it for writing, without creating it with write permissions, like this:
LOCK_FILE=only_one_at_a_time_please
UMASK=$(umask)
umask 777
echo "$$" > "$LOCK_FILE"
umask "$UMASK"
trap "rm '$LOCK_FILE'" EXIT
If the file is missing, the script will succeed at creating and opening it for writing, despite the file being created without writing permissions. If it already exists, the script won't be able to open the file for writing. It would be possible to use exec to open the file and keep the file descriptor around.
rm requires you to have write permissions to the directory itself, without regards to file permissions.
touch is the command you are looking for. It updates timestamps of the provided file if the file exists or creates it if it doesn't.

How to redirect output away from /dev/null

I have an application that runs the a command as below:
<command> <switches> >& /dev/null
I can configure <command>, but I have no control over <switches> . All the output generated by this command goes to /dev/null. I want the output to be visible on screen or redirected to a log file.
I tried to use freopen() and related functions to reopen /dev/null to another file, but could not get it working.
Do you have any other ideas? Is this possible at all?
Thanks for your time.
PS: I am working on Linux.
Terrible Hack:
use a text editor in binary mode open the app, find '/dev/null/' and replace it with a string of the same length
e.g '~/tmp/log'
make a backup first
be carefull
be very carefull
did I mention the backup?
Since you can modify the command you run you can use a simple shell script as a wrapper to redirect the output to a file.
#!/bin/bash
"$#" >> logfile
If you save this in your path as capture_output.sh then you can add capture_output.sh to the start of your command to append the output of your program to logfile.
Append # at the end of your command so it becomes <command> # >& /dev/null, thus commenting out the undesired part.
Your application is probably running a shell and passing it that command line.
You need to make it run a script written by you. That script will replace >/dev/null in the command line with >>/your/log and call the real shell with the modified command line.
The first step is to change the shell used by the application. Changing the environment variable SHELL should suffice, i.e., run your application as
SHELL=/home/user/bin/myshell theApp
If that doesn't work, try momentarily linking /bin/sh to your script.
myshell will call the original shell, but after pattern-replacing the parameters:
#!/bin/bash
sh ${1+"${#/\>\/dev\/null/>>\/your\/log}"}
Something along these lines should work.
You can do this with an already running process by using gdb. See the following page: http://etbe.coker.com.au/2008/02/27/redirecting-output-from-a-running-process/
Can you create an alias for that command? If so, alias it to another command that dumps output to a file.
The device file /dev/tty references your application's controlling terminal - if that hasn't changed, then this should work:
freopen("/dev/tty", "w", stdout);
freopen("/dev/tty", "w", stderr);
Alternatively, you can reopen them to point to a log file:
freopen("/var/log/myapp.log", "a", stdout);
freopen("/var/log/myapp.err", "a", stderr);
EDIT: This is NOT a good idea and certainly not worth trying unless you know what this can break. It works for me, may work for you as well.
Ok, This is a really bad hack and probably not worth doing. Assuming that none of the other commands works, and you simply do not have access to the binary/application (which contains the command with /dev/null) and you cannot re-direct the output to other file (by replacing /dev/null).
Then, you can delete /dev/null ($> rm /dev/null) and create your own file at its place (preferably with a soft link) where all the data can be directed. When you are done, you can create the /dev/null once again using following command:
$> mknod -m 666 /dev/null c 1 3
Just to be very clear, this is a bad hack and certainly requires root permissions to work. High chances that your re-directed file may contain data from many other applications/binaries which are running and use /dev/null as sink.
It may not exactly redirect, but it allows to get the output wherever it's being sent
strace -ewrite -p $PID
It's not that cleen (shows lines like: write(#,) ), but works! (and is single-line :D ) You might also dislike the fact, that arguments are abbreviated. To control that use -s parameter that sets the maxlength of strings displayed.
It catches all streams, so You might want to filter that somehow.
You can filter it:
strace -ewrite -p $PID 2>&1 | grep "write(1"
shows only descriptor 1 calls. 2>&1 is to redirect stderr to stdout, as strace writes to stderr by default.
In perl, if you just want to redirect STDOUT to something slightly more useful, you can just do something like:
open STDOUT, '>>', '/var/log/myscript.log';
open STDERR, '>>', '/var/log/myscript.err';
at the beginning of your script, and that'll redirect it for the rest of your script.
Along the lines of e-t172's answer, can you set the last switch to (or append to it):
; echo
If you can put something inline before passing things to /dev/null (not sure if you are dealing with a hardcoded command), you could use tee to redirect to something of your choice.
Example from Wikipedia which allows escalation of a command:
echo "Body of file..." | sudo tee root_owned_file > /dev/null
http://en.wikipedia.org/wiki/Tee_(command)

Resources