removing files from C code - c

I have to remove a few hundred files from within my C code. I use "remove" in a loop. Is there any faster way to do it than using "remove"? I ask because I can't pass wildcards to "remove".

No, there isn't a quicker way than using remove() - or unlink() on POSIX systems - in a loop.
The system rm command does that too - at least in the simple, non-recursive case where the names are given on the command line. The shell expands the metacharacters, and rm (in)famously goes along deleting what it was told to delete, unaware of the disastrous *.* notation that was used on the command line. (In the recursive case, it uses a function such as nftw() to traverse the directory structure in depth-first order and repeated calls to unlink() to remove the files and rmdir() to remove the (now-empty) directories.)
POSIX does provide functions (glob() and wordexp()) to generate lists of file names from metacharacters as used in the (POSIX) shell, plus fnmatch() to see whether a name matches a pattern.
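As a rough illustration of the glob() route, here is a minimal sketch that expands a pattern and unlinks each match. The name remove_matching is a hypothetical helper chosen for this example, not a library function, and the error handling is deliberately simple.

```c
#include <glob.h>
#include <stdio.h>
#include <unistd.h>

/* Remove every file matching a shell-style pattern, e.g. "/tmp/work/*.tmp".
 * Returns the number of files removed, or -1 if glob() itself failed.
 * remove_matching is a hypothetical helper name, not a library function. */
int remove_matching(const char *pattern)
{
    glob_t matches = {0};
    int removed = 0;

    int rc = glob(pattern, 0, NULL, &matches);
    if (rc != 0) {
        globfree(&matches);
        return rc == GLOB_NOMATCH ? 0 : -1;   /* no match is not an error here */
    }

    for (size_t i = 0; i < matches.gl_pathc; i++) {
        if (unlink(matches.gl_pathv[i]) == 0)
            removed++;
        else
            perror(matches.gl_pathv[i]);      /* report and keep going */
    }

    globfree(&matches);
    return removed;
}
```

A call such as remove_matching("/tmp/work/*.tmp") still issues one unlink() per file, so it is no faster than the loop in the question; it only takes over the wildcard expansion.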

You could use system to spawn a shell which would do the * expansion for you. This would probably not run any faster than just calling unlink() in a loop, though, because it would have to spawn a shell (start a new process). But it would be easier to code.

Related

How to force deletion of file in C?

How can I remove an open file in Linux?
In shell I can do this:
rm -rf /path/to/file_or_directory
But how can I do it in C?
I don't want to use the system() function.
I have seen the unlink and remove functions, but they don't have any flags to force deletion.
The unlink and remove functions force deletion. The rm command is doing extra checks before it calls one of those functions. But once you answer y, it just uses that function to do the real work.
Well, I hope this answers your question. This program looks for the file name relative to the current directory; adding support for a different directory shouldn't be too hard. I don't understand the last line of your question; can you elaborate? Flags aren't necessary for remove and unlink, though (they already force deletion).
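#include <stdio.h>
#include <string.h>

int main(void)
{
    int status;
    char file_name[25];

    printf("Enter the name of file you wish to delete\n");
    if (fgets(file_name, sizeof file_name, stdin) == NULL)
        return 1;
    /* fgets keeps the trailing newline; strip it or remove() will fail */
    file_name[strcspn(file_name, "\n")] = '\0';

    status = remove(file_name);
    if (status == 0)
        printf("%s file deleted successfully.\n", file_name);
    else
    {
        printf("Unable to delete the file\n");
        perror("Error");
    }
    return 0;
}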
To perform a recursive removal, you have to write a moderately complicated program which performs a file system walk. ISO C has no library features for this; it requires platform-specific functions for scanning the directory structure recursively.
On POSIX systems you can use opendir, readdir and closedir to walk individual directories, and use programming language recursion to handle subdirectories. The functions ftw and its newer variant nftw perform an encapsulated file system walk; you just supply a callback function to process the visited paths. nftw is better because it has a flags argument with which you can specify the FTW_DEPTH flag to do the search depth first: visit the contents of a directory before reporting the directory. That, of course, is what you want for recursive deletion.
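A minimal sketch of the nftw approach might look like the following. The names remove_entry and remove_tree are hypothetical, the limit of 16 open descriptors is an arbitrary choice, and FTW_PHYS is added on the assumption that symbolic links should be removed rather than followed.

```c
#define _XOPEN_SOURCE 700   /* must come first: exposes nftw() in <ftw.h> */
#include <ftw.h>
#include <stdio.h>

/* Callback invoked by nftw() for every entry it visits. */
static int remove_entry(const char *path, const struct stat *sb,
                        int typeflag, struct FTW *ftwbuf)
{
    (void)sb; (void)typeflag; (void)ftwbuf;   /* unused */
    if (remove(path) != 0) {
        perror(path);
        return -1;          /* a non-zero return stops the walk */
    }
    return 0;
}

/* remove_tree is a hypothetical helper, roughly "rm -r path". */
int remove_tree(const char *path)
{
    /* FTW_DEPTH: report a directory's contents before the directory itself,
     * so each directory is already empty when remove() reaches it.
     * FTW_PHYS: do not follow symbolic links; remove the links themselves. */
    return nftw(path, remove_entry, 16, FTW_DEPTH | FTW_PHYS);
}
```

remove_tree returns 0 on success and a non-zero value if the walk failed or any entry could not be removed.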
On MS Windows, there are FindFirstFile and FindNextFile, from which you can cobble together a recursive traversal.
About -f, that only suppresses certain checks done by the rm program above and beyond what the operating system requires. Without -f, you get prompted if you want to delete a read-only file, but actually, in a Unix-like system, only the directory write permission is relevant, not that of the file, for deletion. The remove library function doesn't have such a check.
By the way, remove is in ISO C, so it is platform-independent. On POSIX systems, it calls rmdir for directories and unlink for other objects. So remove is not only portable, but lets you not worry about what type of thing you're deleting. If a directory is being removed, it has to be empty though. (Not a requirement of the remove function itself, but of mainstream operating systems that support it).
remove or unlink is basically equivalent to rm -f already--that is, it removes the specified item without prompting for further input.
If you want something equivalent to rm -r, you'll need to code up walking through the directory structure and deleting items individually. Boost Filesystem (for one example) has code to let you do that fairly simply while keeping the code reasonably portable.

How to execvp ls *.txt in C

I'm having issues execvping the *.txt wildcard, and reading this thread - exec() any command in C - indicates that it's difficult because of "globbing" issues. Is there any easy way to get around this?
Here's what I'm trying to do:
char * array[] = {"ls", "*.txt", (char *) NULL };
execvp("ls", array);
You could use the system() function:
system("ls *.txt");
to let the shell do the globbing for you.
In order to answer this question you have to understand what is going on when you type ls *.txt in your terminal (emulator). When the ls *.txt command is typed, it is interpreted by the shell. The shell lists the directory and matches the file names in it against the *.txt pattern. Only after all of that is done does the shell prepare the file names as arguments and spawn a new process, passing those file names as the argv array to the execvp call.
In order to assemble something like that yourself, look at the following Q/A:
How to list files in a directory in a C program?
Use fnmatch() to match file name with a shell-like wildcard pattern.
Prepare argument list from matched file names and use vfork() and one of the exec(3) family of functions to run another program.
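The steps above can be sketched roughly as follows. This version swaps in glob() for the manual directory listing plus fnmatch() loop, and uses fork() rather than vfork(); run_with_glob is a hypothetical helper name. GLOB_NOCHECK is an assumption about the desired behaviour: it passes the pattern through unchanged when nothing matches, as POSIX shells do by default.

```c
#include <glob.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run `prog` with the expansion of `pattern` as its arguments, the way a
 * shell would run "prog pattern". Returns the child's exit status, or -1
 * on failure. run_with_glob is a hypothetical helper name. */
int run_with_glob(const char *prog, const char *pattern)
{
    glob_t g = {0};
    int status = -1;

    g.gl_offs = 1;                    /* reserve slot 0 for the program name */
    if (glob(pattern, GLOB_DOOFFS | GLOB_NOCHECK, NULL, &g) != 0) {
        globfree(&g);
        return -1;
    }
    g.gl_pathv[0] = (char *)prog;     /* gl_pathv is already NULL-terminated */

    pid_t pid = fork();
    if (pid == 0) {
        execvp(prog, g.gl_pathv);
        perror("execvp");             /* only reached if exec failed */
        _exit(127);
    } else if (pid > 0) {
        if (waitpid(pid, &status, 0) == pid && WIFEXITED(status))
            status = WEXITSTATUS(status);
        else
            status = -1;
    }

    globfree(&g);                     /* frees only the matched names */
    return status;
}
```

With this helper, run_with_glob("ls", "*.txt") behaves much like typing ls *.txt at a shell prompt, without starting a shell.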
Alternatively, you can use the system() function as @manu-fatto has suggested. But that function does something slightly different: it actually runs the shell program, which evaluates the ls *.txt statement and in turn performs steps similar to the ones I have described above. It is likely to be less efficient and it may introduce security holes (see the manual page for more details; security risks are described in the NOTES section, with a suggestion not to use the function in certain cases).
Hope it helps. Good Luck!

How do I add an operator to Bash in Linux?

I'd like to add an operator ( e.g. ^> ) to handle prepend instead append (>>). Do I need to modify Bash source or is there an easier way (plugin, etc)?
First of all, you'd need to modify bash sources and quite heavily. Because, above all, your ^> would be really hard to implement.
Note that bash redirection operators usually do very simple writes, and work on a single file (or program, in the case of pipes) only. Excluding very specific solutions, you usually can't write to the beginning of a file, for the very simple reason that you'd need to move all the remaining contents forward after each write. You could try doing that, but it would be hard, very inefficient (since every write would require rewriting the whole file) and very unsafe (since any error would leave you with a random mix of the old and new versions).
That said, you are indeed probably better off with a function or any other solution which would use a temporary file, like others suggested.
For completeness, my own implementation of that:
prepend() {
local tmp=$(mktemp)   # mktemp is more widely available than Debian's tempfile
if cat - "${1}" > "${tmp}"; then
mv "${tmp}" "${1}"
else
rm -f "${tmp}"
# some error reporting
fi
}
Note that, unlike what @jpa suggested, you should write the concatenated data to a temporary file, as that operation can fail and, if it does, you don't want to lose your original file. Afterwards, you just replace the old file with the new one, or delete the temporary file and handle the failure any way you like.
Usage is the same as with the other solution:
echo test | prepend file.txt
And a slightly modified version that retains permissions and plays safe with symlinks (if that is necessary), like >> does:
prepend() {
local tmp=$(mktemp)   # mktemp is more widely available than Debian's tempfile
if cat - "${1}" > "${tmp}"; then
cat "${tmp}" > "${1}"
rm -f "${tmp}"
else
rm -f "${tmp}"
# some error reporting
fi
}
Just note that this version is actually less safe: if something else writes to the disk and fills it up during the second cat, you'll end up with an incomplete file.
To be honest, I wouldn't personally use it; I would handle symlinks and resetting permissions externally, if necessary.
^ is a poor choice of character, as it is already used in history substitution.
To add a new redirection type to the shell grammar, start in parse.y. Declare it as a new %token so that it may be used, add it to STRING_INT_ALIST other_token_alist[] so that it may appear in output (such as error messages), update the redirection rule in the parser, and update the lexer to emit this token upon encountering the appropriate characters.
command.h contains enum r_instruction of redirection types, which will need to be extended. There's a giant switch statement in make_redirection in make_cmd.c processing redirection instructions, and the actual redirection is performed by functions throughout redir.c. Scattered throughout the rest of source code are various functions for printing, copying, and destroying pipelines, which may also need to be updated.
That's all! Bash isn't really that complex.
This doesn't discuss how to implement a prepending redirection, which will be difficult as the UNIX file API only provides for appending and overwriting. The only way to prepend to a file is to rewrite it entirely, which (as other answers mention) is significantly more complex than any existing shell redirections.
Might be quite difficult to add an operator, but perhaps a function could be enough?
function prepend { local tmp; tmp=$(mktemp) || return; cp "$1" "$tmp" && cat - "$tmp" > "$1"; rm -f "$tmp"; }
Example use:
echo foobar | prepend file.txt
prepends the text "foobar" to file.txt.
I think bash's plugin architecture (loading shared objects via the 'enable' built-in command) is limited to providing additional built-in commands. The redirection operators are part of the syntax for running simple commands, so I think you would need to modify the parser to recognize and handle your new ^> operator.
Most Linux filesystems do not support prepending. In fact, I don't know of any one that has a stable userspace interface for it. So, as stated by others already, you can only rely on overwriting, either just the initial parts, or the entire file, depending on your needs.
You can easily (partially) overwrite initial file contents in Bash, without truncating the file:
exec {fd}<>"$filename"
printf 'New initial contents' >&"$fd"
exec {fd}>&-
Above, $fd is the file descriptor automatically allocated by Bash, and $filename is the name of the target file. Bash opens a new read-write file descriptor to the target file on the first line; this does not truncate the file. The second line overwrites the initial part of the file. The position in the file advances, so you can use multiple commands to overwrite consecutive parts in the file. The third line closes the descriptor; since there is only a limited number available to each process, you want to close them after you no longer need them, or a long-running script might run out.
Please note that > does less than you might expect. The shell does the following:
Remove the > and the following word from the command line, remembering the redirection.
Once the command line has been processed and the command can be launched, call fork(2) (or clone(2)) to create a new process.
Modify the new process according to the command. That includes things like modified environment variables (SOMEVAR=foo yourcommand), but also changed file descriptors. At this point, a > yourfile on the command line has the effect that the file is open(2)'ed on the stdout file descriptor (that is, fd 1) in write-only mode, truncating the file to zero bytes. A >> yourfile has the effect that the file is opened on stdout in write-only mode and append mode.
Only now launch the program, with something like execv(yourprogram, yourargs).
The redirections could, for a simple example, be implemented like
open(yourfile, O_WRONLY|O_CREAT|O_TRUNC, 0666);
or
open(yourfile, O_WRONLY|O_CREAT|O_APPEND, 0666);
respectively.
The program then launched will have the correct environment set up, and can happily write to fd1. From here, the shell is not involved. The real work is not done by the shell, but by the operating system. As Unix doesn't have a prepend mode (and it would be impossible to integrate that feature correctly), everything you could try would end up in a very lousy hack.
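Put together, the fork, open, and exec sequence described above might look like this minimal sketch. run_redirected is a hypothetical helper name, the exit codes 126 and 127 follow common shell convention, and dup2() is used to place the opened file on descriptor 1.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run argv[0] with stdout redirected to `path`, like "cmd > path"
 * (append == 0) or "cmd >> path" (append != 0). Returns the child's exit
 * status, or -1 on failure. run_redirected is a hypothetical helper. */
int run_redirected(char *const argv[], const char *path, int append)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: set up the redirection before exec, as the shell does. */
        int flags = O_WRONLY | O_CREAT | (append ? O_APPEND : O_TRUNC);
        int fd = open(path, flags, 0666);
        if (fd < 0 || dup2(fd, STDOUT_FILENO) < 0) {
            perror(path);
            _exit(126);
        }
        if (fd != STDOUT_FILENO)
            close(fd);                /* fd 1 now refers to the file */
        execvp(argv[0], argv);
        perror(argv[0]);              /* only reached if exec failed */
        _exit(127);
    }

    /* Parent: the launched program writes to fd 1 on its own from here. */
    int status;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

The only difference between the two redirections is a single flag passed to open(); there is no comparable flag that would make writes land at the start of the file.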
Try to re-think your requirements, there's always a simpler way around.

Hooks on terminal. Can I call a method before a command is run in the terminal?

I want to make a terminal app that stores information about files/directories. I want a way to keep the information if the file is moved or renamed.
What I thought I could do is have a function execute before any command is run. I found this:
http://www.twistedmatrix.com/users/glyph/preexec.bash.txt
But I was wondering if this would be a good way to go about it. Or should I do something else?
I suppose I would like to call that function from a C program whenever mv is entered.
If what you're trying to do is attach some sort of metadata to files, there's a much better supported way to do that -- extended attributes.
Another solution might be to use the file's inode number as an index into a database you maintain yourself.
Can you alias the mv command in .profile or .bashrc?
alias mv=/usr/local/bin/mymv
where mymv is a compiled executable that runs your C code function and then calls /usr/bin/mv.
precmd and preexec add some overhead to every command that gets run, even if it never calls mv. The downside of the alias is that it requires new code in /usr/local, and if scripts or users invoke /usr/bin/mv instead of mv it will not do what you want. Generally, needing to do something like this often means there is a better way to handle the problem, with some kind of service (daemon) or driver. Plus, what happens if your C code cannot correctly handle interesting input like
mv somefille /dev/null
If you want to run a command each time after some command has been executed in the terminal, just put the following in ~/.bashrc:
PROMPT_COMMAND="your_command;$PROMPT_COMMAND"
If you want your command to be executed each time before mv is executed, put the following in ~/.bashrc:
alias mv="your_script"
Make sure that your script executes the real mv if needed.
You can use the inotify API to track filesystem changes. It's a good solution, but once the user removes a file, it's already gone.
You might be able to make use of the DEBUG trap in Bash.
From man bash:
If a sigspec is DEBUG, the command arg is executed before every
simple command, for command, case command, select command, every
arithmetic for command, and before the first command executes in
a shell function
I found this article when I was forced to work in tcsh and wanted to ensure a specific environment variable was present when the user ran a program from a certain folder (without setting that variable globally).
tcsh can do this.
tcsh has special aliases, one of which is precmd.
This can be used to run a script just before the shell prompt is printed.
e.g. I used alias precmd 'bash $HOME/.local/bin/on_cd.sh'
This might be one of the very few useful features in csh.
It is a shame, but I don't think the same or a similar feature exists in bash or other sh derivatives (ash, dash etc). Related answer.

implementing globbing in a shell prototype

I'm implementing a Linux shell for my weekend assignment and I am having some problems implementing wildcard matching as a feature in the shell. As we all know, shells are complete languages by themselves, e.g. bash, ksh, etc. I don't need to implement the complete set of features like control structures, jobs etc. But how do I implement the *?
A quick analysis gives you the following result:
echo *
lists all the files in the current directory. Is this the only logical manifestation of the shell? I mean, not considering the language-specific features of bash, is this what a shell does, internally? Replace a * with all the files in the current directory matching the pattern?
Also I have heard about Perl Compatible Regular Expressions, but it seems too complex to use a third-party library.
Any suggestions, links, etc.? I will try to look at the source code as well, for bash.
This is called "globbing" and the function performing this is named the same: glob(3)
Yes, that's what the shell does. It replaces a word containing '*' with the names of all files and directories in the cwd that match it. Glob patterns are in effect a very basic form of regular expression, supporting '?', '*' and bracket expressions such as [abc], matched against the file and directory names in the cwd.
Note that a backslashed \* and a '*' enclosed in single or double quotes (' or ") are not replaced (the backslash and quotes are removed before the arguments are passed to the executed command).
If you want more control than glob gives, the standard function fnmatch performs just the pattern matching, without scanning any directory.
Note that shells also perform word expansion (e.g. "~" → "/home/user"), which should be done before glob expansion, if you're doing filename matching manually. (Or use wordexp.)
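For a shell prototype that does the matching by hand, a minimal sketch combining readdir() with fnmatch() could look like this. list_matches is a hypothetical helper name; FNM_PERIOD is included on the assumption that, as in a real shell, a bare * should not match hidden files.

```c
#include <dirent.h>
#include <fnmatch.h>
#include <stdio.h>

/* Print every entry of `dir` whose name matches `pattern` and return how
 * many matched, or -1 if the directory cannot be opened.
 * list_matches is a hypothetical helper name. */
int list_matches(const char *dir, const char *pattern)
{
    DIR *d = opendir(dir);
    if (d == NULL)
        return -1;

    int count = 0;
    struct dirent *entry;
    while ((entry = readdir(d)) != NULL) {
        /* FNM_PERIOD: a leading '.' must be matched explicitly, so "*"
         * skips hidden files just as it does in a shell. */
        if (fnmatch(pattern, entry->d_name, FNM_PERIOD) == 0) {
            puts(entry->d_name);
            count++;
        }
    }
    closedir(d);
    return count;
}
```

Unlike glob(), readdir() returns names in no particular order, so a shell built this way would need to sort the matches itself before passing them to the command.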

Resources