Understanding functioning of read() and lseek() in C - c

When we use the system call open() and then perform I/O operations (especially read() and lseek()), does the kernel buffer get updated if the file changes while the program is still running? If not, how can I force the kernel buffer to synchronize with a file that is being updated live?
Here is an example:
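#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>   /* read(), lseek(), sleep(), close() */
int main(void)
{
    int fd = 0;
    char ch = '\0';
    fd = open("test.dat", O_RDONLY);
    if (fd == -1) { perror("open"); return 1; }
    while (1)
    {
        /* read() returns 0 at end of file and -1 on error, so test for > 0 */
        while (read(fd, &ch, 1) > 0)
        {
            printf("%c", ch);
        }
        printf("\n");
        lseek(fd, 0, SEEK_SET);   /* rewind to the start of the file */
        sleep(5);
    }
    close(fd);
    return 0;
}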
Now, I have some data in "test.dat" (say: '3 3 34'). I open this file, read it to the end, and then seek back to the start. Meanwhile, I open "test.dat" in an editor, update its contents, and save them. Since read() and lseek() are system calls, I would expect them to reflect the changes in the updated file, assuming the kernel/OS buffer regularly syncs with the file on the hard disk. But that is not the case: the changes are not reflected by read(), which keeps printing the initial content. For writing, we have solutions like sync(), fsync(), etc. Are there similar functions for reading?
(Note: one trivial solution to this problem is to close() the file descriptor and open() the file again, which works perfectly, but I want to know and understand an alternative that does not close the file descriptor.)

Some editors write a copy of the file being edited and finally move that copy onto the original filename (probably after renaming the original to something else), so the file you are reading can actually be a different file from the one the editor saved. Whether this happens depends on the editor you use; updating a file in place while respecting all of its links is a complex task, so don't expect the editor to do it. Make sure you are modifying the original file. A useful test is to keep the new contents in a separate file and, instead of editing the file, run
$ your_program your_file.txt &
$ cat modified_version.txt > your_file.txt
and see what happens.
You have not said which editor you use to edit the file.

Related

How to interact with an external text editor in C

I am developing a command line application in C (linux environment) to edit a particular file format. This file format is a plain XML file, which is compressed, then encrypted, then cryptographically signed.
I'd like to offer the user an easy way to edit this kind of file, without the hassle of manually extracting the file, editing it, and then compressing, encrypting and signing it.
Ideally, when called, my application should do the following:
Open the encrypted/compressed file and extract it to a temporary location (like /tmp)
Call an external text editor like nano or sublime-text or gedit depending on which is installed and maybe the user preferences. Wait until the user has edited the file and closed the text editor.
Read the modified temporary file and encrypt/compress it, replacing the old encrypted/compressed file
How can I achieve point no. 2?
I thought about calling nano with system() and waiting for it to return, or placing an inotify() on the temp file to know when it is modified by the graphical text editor.
Which solution is better?
How can I call the user's default text editor?
Anything that can be done in a better way?
First, consider not writing an actual application or wrapper yourself, which calls another editor, but rather writing some kind of plugin for some existing editor which is flexible enough to support additional formats and passing its input through decompression.
That's not the only solution, of course, but it might be easier for you.
With your particular approach, you could:
Use the EDITOR and/or VISUAL environment variables (as also pointed out by @KamilCuk) to determine which editor to use.
Run the editor as a child process so that you know when it ends execution, rather than having to otherwise communicate with it. Being notified of changes to the file, or even of its opening or closing, is not good enough, since the editor may make changes to multiple files, and some editors don't even keep the file open while you work on it in them.
Remember to handle the cases of the editor failing to come up; or hanging; or you getting some notification to stop waiting for the editor; etc.
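The points above could be sketched roughly as follows. The helper names pick_editor and run_editor and the fallback to vi are assumptions of this sketch, and it treats the variable's value as a single program name (a value such as "code -w" would need extra parsing).

```c
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Pick the user's editor: VISUAL first, then EDITOR, then a fallback.
 * The fallback "vi" is an assumption of this sketch. */
const char *pick_editor(void)
{
    const char *ed = getenv("VISUAL");
    if (ed == NULL || *ed == '\0')
        ed = getenv("EDITOR");
    if (ed == NULL || *ed == '\0')
        ed = "vi";
    return ed;
}

/* Run `editor path` as a child process and block until it exits.
 * Returns the editor's exit status (127 if it could not be executed),
 * or -1 if fork/wait failed or the editor was killed by a signal. */
int run_editor(const char *editor, const char *path)
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        /* Child: replace this process with the editor. */
        execlp(editor, editor, path, (char *)NULL);
        _exit(127);   /* only reached if exec failed */
    }

    /* Parent: wait for the editor, retrying if a signal interrupts us. */
    int status;
    while (waitpid(pid, &status, 0) == -1) {
        if (errno != EINTR)
            return -1;
    }
    if (!WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

A caller would run something like run_editor(pick_editor(), "/tmp/extracted.xml") and only re-encrypt the file when the result is 0. A hung editor is not handled here; that would need a timeout or a signal handler on top.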
Call an external text editor like nano or sublime-text or gedit depending on which is installed and maybe the user preferences. Wait until the user has edited the file and closed the text editor.
Interesting question. One way to open the XML file with the user's default editor is xdg-open, but it doesn't give you the PID of the application in which the user will edit the file.
You can use xdg-mime query default application/xml to find out the .desktop file of the default editor, but then you have to parse this file to figure out the executable path of the program - this is exactly how xdg-open actually works, in the search_desktop_file() function the line starting with Exec= entry is simply extracted from the *.desktop to call the editor executable and pass the target file as argument... What I am trying to say, is, after you find the editor executable, you can start it, and wait until it's closed, and then check if the file content has been changed. Well, this looks like a lot of unnecessary work...
Instead, you can try a fixed, well-known editor, such as gedit, to achieve the desired workflow. You can also give the user a way (e.g. a prompt or a config file) to set a default XML editor, e.g. /usr/bin/sublime_text, which your program can then use on the next run.
However, the key here is to open an editor that blocks the calling process until the user closes the editor. After the editor is closed, you can simply check whether the file has been changed and, if so, perform further operations.
To find out, if the file contents have been modified, you can use the stat system call to get the inode change time of the file, before you open the file, and then compare the timestamp value with the current one once it is closed.
i.e.:
stat -c %Z filename
Output: 1558650334
Wrapping up:
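#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
/* Run cmd, wait for it to finish, and copy the first line of its output
   into result (skipped when result is NULL). */
void execute_command(const char* cmd, char* result, size_t size) {
    FILE *fp = popen(cmd, "r");
    if (fp == NULL) {
        perror("popen");
        exit(EXIT_FAILURE);
    }
    if (result != NULL && size > 0 && fgets(result, (int)size, fp) == NULL)
        result[0] = '\0';
    pclose(fp); /* blocks until the command has exited */
}
long get_changetime(const char* filename) {
    char cmd[4096];
    char output[32] = ""; /* room for the timestamp digits and the terminator */
    snprintf(cmd, sizeof cmd, "stat -c %%Z %s", filename);
    execute_command(cmd, output, sizeof output);
    return atol(output);
}
int main(void) {
    char cmd[4096];
    const char* filename = "path/to/xml-file.xml";
    long ctime_before = get_changetime(filename);
    snprintf(cmd, sizeof cmd, "gedit %s", filename);
    execute_command(cmd, NULL, 0); /* returns once the editor is closed */
    if (ctime_before != get_changetime(filename)) {
        printf("file modified!\n");
        // do your work here...
    }
    return 0;
}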
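The change time can also be read without spawning the stat command, by calling stat(2) directly. This is a small sketch of that variant; the name get_change_time is made up here, and like stat -c %Z it only has one-second resolution.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <time.h>

/* Return the inode change time of `path` in seconds since the epoch,
 * or (time_t)-1 if the file cannot be examined.
 * Hypothetical helper name, used only for this sketch. */
time_t get_change_time(const char *path)
{
    struct stat st;

    if (stat(path, &st) == -1)
        return (time_t)-1;
    return st.st_ctime;
}
```

Calling it once before starting the editor and once after the editor exits, and comparing the two values, gives the same check as above without building a shell command from the file name.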

How to reserve a file descriptor?

I'm writing a curses-based program. In order to make it simpler for me to find errors in this program, I would like to produce debug output. Due to the program already displaying a user interface on the terminal, I cannot put debugging output there.
Instead, I plan to write debugging output to file descriptor 3 unconditionally. You can invoke the program as program 3>/dev/ttyX with /dev/ttyX being a different teletype to see the debugging output. When file descriptor 3 is not opened, write calls fail with EBADF, which I ignore like all errors when writing debugging output.
A problem occurs when I open another file and no debugging output has been requested (i.e. file descriptor 3 has not been opened). In this case, the newly opened file might receive file descriptor 3, causing debugging output to randomly corrupt a file I just opened. This is a bad thing. How can I avoid this? Is there a portable way to mark a file descriptor as “reserved” or such?
Here are a couple of ideas I had and their problems:
I could open /dev/null or a temporary file to file descriptor 3 (e.g. by means of dup2()) before opening any other file. This works but I'm not sure if I can assume this to always succeed as opening /dev/null may not succeed.
I could test if file descriptor 3 is open and not write debugging output if it isn't. This is problematic when I'm attempting to restart the program by calling exec, as a different file descriptor might have been opened (and not closed) prior to the exec call. I could intentionally close file descriptor 3 before calling exec when it has not been opened for debugging, but this feels really ugly.
Why use fd 3? Why not use fd 2 (stderr)? It already has a well-defined "I am logging of some sort" meaning, is always open (not strictly true, but sufficiently true...), and you can redirect it before starting your binary to get the logs where you want.
Another option would be to log messages to syslog, using the LOG_DEBUG level. This entails calling syslog() instead of a normal write function, but that's simply making the logging more explicit.
A simple way of checking if stderr has been redirected or is still pointing at the terminal is by using the isatty function (example code below):
#include <stdio.h>
#include <unistd.h>
int main(void) {
if (isatty(2)) {
printf("stderr is not redirected.\n");
} else {
printf("stderr seems to be redirected.\n");
}
}
In the very beginning of your program, open /dev/null and then assign it to file descriptor 3:
int fd = open ("/dev/null", O_WRONLY);
dup2(fd, 3);
This way, file descriptor 3 won't be taken.
Then, if needed, reuse dup2() to assign file descriptor 3 to your debugging output.
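A slightly more careful version of this idea might look like the sketch below: it leaves the descriptor alone if the user already opened it on the command line, and it closes the temporary descriptor afterwards. The function name reserve_fd is invented for this example.

```c
#include <fcntl.h>
#include <unistd.h>

/* Make sure descriptor `target` is occupied, so that a later open()
 * cannot return it. If it is already open (e.g. the program was started
 * as `program 3>/dev/ttyX`) it is left untouched.
 * Returns 0 on success, -1 on error. Hypothetical helper name. */
int reserve_fd(int target)
{
    if (fcntl(target, F_GETFD) != -1)
        return 0;                       /* already open: keep it */

    int fd = open("/dev/null", O_WRONLY);
    if (fd == -1)
        return -1;

    if (fd != target) {
        if (dup2(fd, target) == -1) {
            close(fd);
            return -1;
        }
        close(fd);                      /* only `target` needs to stay open */
    }
    return 0;
}
```

Calling reserve_fd(3) at the very start of main(), before any other file is opened, means writes to descriptor 3 either reach the debugging terminal or are discarded by /dev/null.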
You claim you can't guarantee you can open /dev/null successfully, which is a little strange, but let's run with it. You should be able to use socketpair() to get a pair of FDs. You can then set the write end of the pair non-blocking, and dup2 it. You claim you are already ignoring errors on writes to this FD, so the data going in the bit-bucket won't bother you. You can of course close the other end of the socketpair.
Don't focus on a specific file descriptor value - you can't control it in a portable manner anyway. If you can control it at all. But you can use an environment variable to control debug output to a file:
int debugFD = getDebugFD();
...
int getDebugFD()
{
const char *debugFile = getenv( "DEBUG_FILE" );
if ( NULL == debugFile )
{
return( -1 );
}
int fd = open( debugFile, O_CREAT | O_APPEND | O_WRONLY, 0644 );
// error checking can be here
return( fd );
}
Now you can write your debug output to debugFD. I assume you know enough to make sure debugFD is visible where you need it, and also how to make sure it's initialized before trying to use it.
If you don't set a DEBUG_FILE environment variable, you get an invalid file descriptor and your debug calls fail - presumably silently.

auto delete file on linux

I am trying to have a file deleted when a program ends. I remember that I used to be able to put the unlink() before the first close(), so that I didn't need to reopen the file.
What I expect: The file is erased after the program ends.
What is happening: The file is erased as soon as unlink() is called.
My sample program:
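#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
int main(int argc, char *argv[])
{
    if (argc < 2)
        return 1;
    /* O_CREAT needs an access mode and a permission argument */
    int fd = open(argv[1], O_CREAT | O_WRONLY, 0644);
    int x = 1;
    write(fd, "1234\n", 5);
    close(fd);
    fd = open(argv[1], O_RDONLY);
    unlink(argv[1]);   /* the name disappears here; the open fd stays valid */
    while (x <= 3)
    {
        int k;
        scanf(" %d", &k);
        x++;
    }
    close(fd);
    return 0;
}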
Is there a way to open() the file, interact with it, and have it deleted from the hard disk on close()? I'm using Fedora Linux 18.
I need to know the name of the file I open this way, because it will be used by another application.
Unlinking a file simply detaches the file name from the underlying inode, making it impossible to open the file using that file name afterwards.
If any process has the file still open, they can happily read and write it, as those operations operate on the inode and not the file name. Also, if there are hardlinks (other file names referring to the same inode) left, those other file names can be used to open the file just fine. See e.g. the Wikipedia article on inodes for further details.
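The behaviour described above can be observed with a few system calls. The following sketch (the function name unlinked_roundtrip is made up) unlinks a file immediately after creating it and then still writes to it and reads it back through the open descriptor.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Create `path`, unlink it right away, then write `msg` and read it back
 * through the still-open descriptor. Returns the number of bytes read
 * back into `buf`, or -1 on error. Hypothetical helper for illustration. */
ssize_t unlinked_roundtrip(const char *path, const char *msg,
                           char *buf, size_t bufsize)
{
    int fd = open(path, O_CREAT | O_RDWR | O_TRUNC, 0600);
    if (fd == -1)
        return -1;

    /* The name is removed here, but the inode lives on while fd is open. */
    if (unlink(path) == -1) {
        close(fd);
        return -1;
    }

    ssize_t n = -1;
    size_t len = strlen(msg);
    if (write(fd, msg, len) == (ssize_t)len && lseek(fd, 0, SEEK_SET) == 0)
        n = read(fd, buf, bufsize);

    close(fd);   /* last reference dropped: the kernel can now free the data */
    return n;
}
```

After the call the name no longer exists in the directory, yet the data written after the unlink() was read back successfully.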
Edited to add:
In Linux, you can leverage the /proc pseudofilesystem. If your application (with process ID PID) has file descriptor FD open, with the file name already unlinked, it can still let another application work on it by telling the other application to work on /proc/PID/fd/FD. It is a pseudo-file, meaning it looks like a (non-functioning!) symlink, but it is not -- it's just useful Linux kernel magic: as long as the other application just opens it normally (open()/fopen() etc., no lstat()/readlink() stuff), they will get access as if they were opening a normal file.
As a real-world example, open two terminals, and in one write
bash -c 'exec 3<>foobar ; echo $$ ; rm foobar ; echo "Initial contents" >&3 ; cat >&3'
The first line it outputs is the PID, and FD is 3 here. Anything you type (after pressing Enter) will be appended to a file that was briefly named foobar, but no longer exists. (You can easily verify that.)
In a second terminal, type
cat /proc/PID/fd/3
to see what that file contains.
It sounds like what you really want is tmpfile():
The tmpfile() function opens a unique temporary file in binary
read/write (w+b) mode. The file will be automatically deleted when it
is closed or the program terminates.
The file is unlinked, so it won't show up in ls, but it still exists: there is an inode, and you could actually re-link it. The file won't be removed from the disk until all file descriptors pointing to it are closed.
You can still read and write through the fd while it is open, even after the file has been unlinked.

File is not written on disk until program ends

I'm writing a file from C code on a Unix system. I open it, write a few lines and close it. Then I call a shell script, say code B, which uses this file, and then return to the main program. However, when code B tries to read the file, the file is empty.
I checked the file on the file system: its size is shown as 0 and it contains no data. However, after killing the running C process, the file has data in it.
Here is the piece of code -
void writefile(){
FILE *fp;
fp = fopen("ABC.txt","w");
fputs("Some lines...\n",fp);
fclose(fp);
system("code_B ABC.txt");
}
Please advise how I can read the file in the shell script without stopping the C process.
If there's some time between the fputs and fclose, add
fflush(fp);
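This pushes the data from the C library's buffer to the kernel, so the contents become visible in the file.
You should call fsync() before the fclose() (on fileno(fp), after an fflush()) to guarantee that the file is written to the disk; once the stream is closed there is no descriptor left to sync.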
Take a look at this question:
Does Linux guarantee the contents of a file is flushed to disc after close()?
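If durability on disk really matters, the order of the calls is what counts: fflush() first, then fsync() on the underlying descriptor, and only then fclose(). A minimal sketch, with a made-up helper name flush_sync_close:

```c
#include <stdio.h>
#include <unistd.h>

/* Flush a stdio stream all the way to stable storage, then close it.
 * Returns 0 on success, -1 if any step failed. The stream is closed
 * in either case. Hypothetical helper for illustration. */
int flush_sync_close(FILE *fp)
{
    int rc = 0;

    if (fflush(fp) == EOF)                      /* C library buffer -> kernel */
        rc = -1;
    if (rc == 0 && fsync(fileno(fp)) == -1)     /* kernel cache -> disk */
        rc = -1;
    if (fclose(fp) == EOF)
        rc = -1;
    return rc;
}
```

For the situation in the question this is more than is needed, since another process reading the file only requires the data to have reached the kernel.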
The kernel ensures that data which is written to a file can be read back afterwards from a different process, even if it is not physically written to the disc yet. So, in usual scenarios, there is no need to call fsync() - still, even with fsync(), the filesystem could decide to further delay physical writes.
One common problem is that the C library has not flushed its buffers yet, in which case you would need to call fflush() - however, you are calling fclose() before launching your sub process, and fclose() internally calls fflush().
Actually, since system() is using a shell to launch the command passed as parameter, you can use the following simple SSCCE to verify that it works:
#include <stdio.h>
#include <stdlib.h>   /* system() */
void writefile(){
FILE *fp;
fp = fopen("ABC.txt","w");
fputs("Some lines...\n",fp);
fclose(fp);
system("cat ABC.txt");
}
int main() {
writefile();
return 0;
}
Here, system() simply calls the cat command to print the file contents. The output is:
$ ./writefile
Some lines...

Reopen a file descriptor with another access?

Assume the OS is Linux. Suppose I opened a file for writing and got a file descriptor fdw. Is it possible to get another file descriptor fdr, with read-only access to the file, without calling open again? The reason I don't want to call open is that the underlying file may have been moved or even unlinked in the file system by other processes, so reusing the same file name is not reliable against such actions. So my question is: is there any way to open a file descriptor with different access rights if given only a file descriptor? dup or dup2 doesn't change the access rights, I think.
Yes! The trick is to access the deleted file via /proc/self/fd/n. It’s a linux-only trick, as far as I know.
Run this program:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
int main(void) {
    FILE* out_file;
    FILE* in_file;
    char dev_fd_path[64];
    char buffer[128];
    /* Write "hi!" to test.txt */
    out_file = fopen("test.txt", "w");
    if (!out_file) {
        perror("fopen test.txt");
        exit(1);
    }
    fputs("hi!\n", out_file);
    fflush(out_file);
    /* Delete the file */
    unlink("test.txt");
    /* Verify that the file is gone */
    system("ls test.txt");
    /* Reopen the file in read mode via /proc */
    snprintf(dev_fd_path, sizeof dev_fd_path, "/proc/self/fd/%d", fileno(out_file));
    in_file = fopen(dev_fd_path, "r");
    if (!in_file) {
        perror("in_file is NULL");
        exit(1);
    }
    /* fgets() returns NULL on failure, so check before printing */
    if (fgets(buffer, sizeof(buffer), in_file))
        printf("%s", buffer);
    return 0;
}
It writes some text to a file, deletes it, but keeps the file descriptor open and then reopens it via a different route. Files aren't actually deleted until the last process holding the last file descriptor closes it, and until then, you can get at the file contents via /proc.
Thanks to my old boss Anatoly for teaching me this trick when I deleted some important files that were fortunately still being appended to by another process!
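Since the question is about raw file descriptors rather than FILE streams, the same trick can be written with open() directly. This is a Linux-only sketch that relies on /proc being mounted; the name reopen_readonly is invented for the example.

```c
#include <fcntl.h>
#include <stdio.h>

/* Linux-specific: obtain a new, read-only descriptor for the file behind
 * `fd` by opening its /proc/self/fd entry. Returns the new descriptor,
 * or -1 on error. Hypothetical helper name. */
int reopen_readonly(int fd)
{
    char path[64];

    snprintf(path, sizeof path, "/proc/self/fd/%d", fd);
    return open(path, O_RDONLY);
}
```

Unlike a descriptor from dup(), the one returned here comes from a separate open(), so it has its own access mode and its own file offset, which starts at 0.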
No, the fcntl call will not let you set the read/write bits on an open file descriptor and the only way to get a new file descriptor from an existing one is by using the duplicate functionality. The calls to dup/dup2/dup3 (and fcntl) do not allow you to change the file access mode.
NOTE: this is true for Linux, but not true for other Unixes in general. In HP-UX, for example, [see (1) and (2)] you are able to change the read/write bits with fcntl using F_SETFL on an open file descriptor. Since file descriptors created by dup share the same status flags, however, changing the access mode for one will necessarily change it for the other.
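To see which access mode a descriptor actually has, and to confirm that dup() or F_SETFL did not change it, the flags can be queried with fcntl(F_GETFL). A small sketch follows; the helper name access_mode_name is an assumption of this example.

```c
#include <fcntl.h>

/* Return a printable name for the access mode of `fd`,
 * or "invalid" if `fd` is not an open descriptor.
 * Hypothetical helper for illustration. */
const char *access_mode_name(int fd)
{
    int flags = fcntl(fd, F_GETFL);

    if (flags == -1)
        return "invalid";

    /* O_ACCMODE masks out everything except the access mode bits. */
    switch (flags & O_ACCMODE) {
    case O_RDONLY: return "O_RDONLY";
    case O_WRONLY: return "O_WRONLY";
    case O_RDWR:   return "O_RDWR";
    default:       return "other";
    }
}
```

On Linux, calling fcntl(fd, F_SETFL, O_RDWR) on a write-only descriptor and then querying it again still reports O_WRONLY, because the access mode bits in the argument are ignored.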
