The intended behavior of the C program below is to copy its own executable file to a new randomly-named file, then execve that file, ad nauseam. This should create many, many copies of the executable. This is obviously a terrible idea, but it is nevertheless what I am trying to do.
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/stat.h>
int main(int argc, char* argv[]) {
/* Obtain name of current executable */
const char* binName = argv[0];
/* Print message */
printf("Hello from %s!\n", binName);
/* Create name of new executable */
char newBinName[] = "tmpXXXXXX";
mkstemp(newBinName);
/* Determine size of current executable */
struct stat st;
stat(binName, &st);
const int binSize = st.st_size;
/* Copy current executable to memory */
char* binData = (char*) malloc(sizeof(char) * binSize);
FILE* binFile = fopen(binName, "rb");
fread(binData, sizeof(char), binSize, binFile);
fclose(binFile);
/* Write executable in memory to new file */
binFile = fopen(newBinName, "wb");
fwrite(binData, sizeof(char), binSize, binFile);
fclose(binFile);
/* Make new file executable */
chmod(newBinName, S_IRUSR | S_IWUSR |S_IXUSR);
/* Run new file executable */
execve(
newBinName,
(char*[]) {
newBinName,
NULL
},
NULL);
/* If this code runs, then there has been an error. */
perror("execve");
return EXIT_FAILURE;
}
Instead, though, the following is output:
Hello from ./execTest
execve: Text file busy
I presume that the text file is "busy" because ./execTest is still accessing it... but I do close the file stream to that file. What is it that I'm doing wrong?
From the manpage of mkstemp:
On success, these functions return the file descriptor of the
temporary file.
You discard the file descriptor returned to you by mkstemp, effectively leaking it. mkstemp opens the file for reading and writing, so the leaked descriptor is writable, and execve fails with ETXTBSY when the executable is still open for writing. Can you try close() on the return value of mkstemp and see if that improves the behavior?
General point of feedback: When coding in C you should be in the habit of looking at return values. Failure to observe their meaning and error status is often indicative of a bug.
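As a minimal sketch of that advice, the copy step could keep the descriptor that mkstemp returns, write through it, and close it before execve is called. The helper name copy_to_temp and the 4096-byte buffer are assumptions for illustration, not part of the original program.

```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/stat.h>

/* Hypothetical helper: copy src into a new temp file created from tmpl
 * (a writable "...XXXXXX" template). Returns 0 on success, -1 on error. */
int copy_to_temp(const char *src, char *tmpl)
{
    char buf[4096];
    size_t n;
    int ok = 1;

    FILE *in = fopen(src, "rb");
    if (in == NULL)
        return -1;

    int fd = mkstemp(tmpl);            /* keep the descriptor this time */
    if (fd == -1) {
        fclose(in);
        return -1;
    }

    while (ok && (n = fread(buf, 1, sizeof buf, in)) > 0) {
        size_t done = 0;
        while (done < n) {             /* write() may be partial */
            ssize_t w = write(fd, buf + done, n - done);
            if (w < 0) { ok = 0; break; }
            done += (size_t)w;
        }
    }
    if (ferror(in))
        ok = 0;
    fclose(in);

    /* Mark the copy executable, then close the writable descriptor;
     * leaving it open is what makes execve fail with ETXTBSY. */
    if (fchmod(fd, S_IRUSR | S_IWUSR | S_IXUSR) != 0)
        ok = 0;
    if (close(fd) != 0)
        ok = 0;

    return ok ? 0 : -1;
}
```

After a successful call, tmpl holds the generated file name and no descriptor to that file remains open in the process, so it can be passed to execve.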
Related
I just discovered that a FILE* can not only refer to a regular file, but also to a directory. If the latter is the case, fread will fail with errno set to 21 (Is a directory).
Minimal repro can be tested here
#include <stdio.h>
#include <fcntl.h>
#include <assert.h>
#include <errno.h>
int main() {
char const* sz = ".";
int fd = open(sz, O_RDONLY | O_NOFOLLOW); // all cleanup omitted for brevity
FILE* file = fdopen(fd, "rb");
// I would like to test in this line if it is a directory
char buffer[21];
int const n = fread(buffer, 1, 20, file);
if (0 < n) {
buffer[n] = 0;
printf("%s", buffer); // never pass file data as the format string
} else {
printf("Error %d", errno); // 21 = Is a directory
}
}
What is the proper way to detect early that my FILE* is referring to directory without trying to read from it?
EDIT to repel the duplicate flags:
I want to test on the FILE*, not the filename. Testing on filename only and then opening it later is a race condition.
Assuming a POSIX-like environment, if you have just the file stream (FILE *fp), then you are probably reduced to using fileno() and fstat():
#include <sys/stat.h>
struct stat sb;
if (fstat(fileno(fp), &sb) != 0)
…oops…
if (S_ISDIR(sb.st_mode))
…it is a directory…
else
…it is not a directory…
Assuming you are on a POSIX-based system, use stat() (if you wish to use the filename in sz before the call to open()) or fstat() (if you wish to use the descriptor fd after calling open()) to get a file status structure from the OS. The member of the structure named st_mode can be used with the POSIX API S_ISDIR(st_mode) to see if the file is a directory.
For more information, see: http://man7.org/linux/man-pages/man2/stat.2.html
Checking The fcntl.h man page:
The <fcntl.h> header shall define the following symbolic constants as
file creation flags for use in the oflag value to open() and
openat(). The values shall be bitwise-distinct and shall be suitable
for use in #if preprocessing directives.
And the flag :
O_DIRECTORY Fail if not a directory.
I've found code on Google that was over 50 lines long, and that's completely unnecessary for what I'm trying to do.
I want to make a very simple cp implementation in C.
Just so I can play with the buffer sizes and see how it affects performance.
I want to use only Linux API calls like read() and write() but I'm having no luck.
I want a buffer that is defined as a certain size so data from file1 can be read into buffer and then written to file2 and that continues until file1 has reached EOF.
Here is what I tried but it doesn't do anything
#include <stdio.h>
#include <sys/types.h>
#define BUFSIZE 1024
int main(int argc, char* argv[]){
FILE fp1, fp2;
char buf[1024];
int pos;
fp1 = open(argv[1], "r");
fp2 = open(argv[2], "w");
while((pos=read(fp1, &buf, 1024)) != 0)
{
write(fp2, &buf, 1024);
}
return 0;
}
The way it would work is ./mycopy file1.txt file2.txt
This code has an important problem: you always write 1024 bytes regardless of how many you read.
Also:
You don't check the number of command line arguments.
You don't check if the source file exists (if it opens).
You don't check that the destination file opens (permission issues).
You pass the address of the array, which has a different type than a pointer to the first element of the array.
The type of fp1 is wrong, as well as that of fp2.
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
int main(int argc, char **argv)
{
char buffer[1024];
int files[2];
ssize_t count;
/* Check for insufficient parameters */
if (argc < 3)
return -1;
files[0] = open(argv[1], O_RDONLY);
if (files[0] == -1) /* Check if file opened */
return -1;
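/* The permission bits are the third argument, not part of the flags.
O_TRUNC discards old contents if the destination already exists. */
files[1] = open(argv[2], O_WRONLY | O_CREAT | O_TRUNC, S_IRUSR | S_IWUSR);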
if (files[1] == -1) /* Check if file opened (permissions problems ...) */
{
close(files[0]);
return -1;
}
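/* read() returns 0 at end of file and -1 on error; stop on either */
while ((count = read(files[0], buffer, sizeof(buffer))) > 0)
write(files[1], buffer, count);
/* Release both descriptors before exiting */
close(files[0]);
close(files[1]);
return 0;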
}
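Since the goal is to experiment with buffer sizes, a hedged variant of the same read()/write() loop could take the buffer size as a parameter and also cope with partial writes and interrupted calls. The name copy_fd and its signature are assumptions for illustration; bufsize is assumed to be greater than zero.

```c
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: copy everything from descriptor in to descriptor out
 * using a heap buffer of bufsize bytes. Returns 0 on success, -1 on error. */
int copy_fd(int in, int out, size_t bufsize)
{
    char *buf = malloc(bufsize);
    ssize_t got;
    int status = 0;

    if (buf == NULL)
        return -1;

    while ((got = read(in, buf, bufsize)) != 0) {
        if (got < 0) {
            if (errno == EINTR)
                continue;              /* interrupted: retry the read */
            status = -1;
            break;
        }
        /* write() may accept fewer bytes than asked; loop until all are out */
        ssize_t done = 0;
        while (done < got) {
            ssize_t w = write(out, buf + done, (size_t)(got - done));
            if (w < 0) {
                if (errno == EINTR)
                    continue;          /* interrupted: retry the write */
                status = -1;
                break;
            }
            done += w;
        }
        if (status != 0)
            break;
    }

    free(buf);
    return status;
}
```

Calling it with different bufsize values on the same pair of files, and timing each run with the time command, is one way to see where larger buffers stop helping.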
Go to section 8.3 of the K&R "The C Programming Language". There you will see an example of what you want to accomplish. Try using different buffer sizes and you will end up seeing a point where the performance tops.
#include <stdio.h>
int cpy(char *, char *);
int main(int argc, char *argv[])
{
char *fn1 = argv[1];
char *fn2 = argv[2];
if (cpy(fn2, fn1) == -1) {
perror("cpy");
return 1;
}
return 0;
}
int cpy(char *fnDest, char *fnSrc)
{
FILE *fpDest, *fpSrc;
int c;
if ((fpDest = fopen(fnDest, "w")) && (fpSrc = fopen(fnSrc, "r"))) {
while ((c = getc(fpSrc)) != EOF)
putc(c, fpDest);
fclose(fpDest);
fclose(fpSrc);
return 0;
}
return -1;
}
First, we get the two file names from the command line (argv[1] and argv[2]). The reason we don't start from *argv is that it contains the program name.
We then call our cpy function, which copies the contents of the first named file into the second named file.
Within cpy, we declare two file pointers: fpDest, the destination file pointer, and fpSrc, the source file pointer. We also declare c, the character that will be read. It is of type int, because getc returns an int so that EOF can be distinguished from every valid character value.
If we could open the files successfully (if fopen does not return NULL), we get characters from fpSrc and copy them onto fpDest, as long as the character we have read is not EOF. Once we have seen EOF, we close our file pointers, and return 0, the success indicator. If we could not open the files, -1 is returned. The caller can check the return value for -1, and if it is, print an error message.
Good question. Related to another good question:
How can I copy a file on Unix using C?
There are two approaches to the "simplest" implementation of cp. One approach uses a file copying system call function of some kind - the closest thing we get to a C function version of the Unix cp command. The other approach uses a buffer and read/write system call functions, either directly, or using a FILE wrapper.
It's likely the file copying system calls that take place solely in kernel-owned memory are faster than the system calls that take place in both kernel- and user-owned memory, especially in a network filesystem setting (copying between machines). But that would require testing (e.g. with Unix command time) and will be dependent on the hardware where the code is compiled and executed.
It's also likely that someone with an OS that doesn't have the standard Unix library will want to use your code. Then you'd want to use the buffer read/write version, since it only depends on <stdlib.h> and <stdio.h> (and friends).
<unistd.h>
Here's an example that uses the function copy_file_range, declared in <unistd.h> (a Linux extension that is not part of POSIX), to copy a source file to a (possibly non-existent) destination file. The copy takes place in kernel space.
/* copy.c
*
* Defines function copy:
*
* Copy source file to destination file on the same filesystem (possibly NFS).
* If the destination file does not exist, it is created. If the destination
* file does exist, the old data is truncated to zero and replaced by the
* source data. The copy takes place in the kernel space.
*
* Compile with:
*
* gcc copy.c -o copy -Wall -g
*/
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/syscall.h>
#include <unistd.h>
/* On versions of glibc < 2.27, need to use syscall.
*
* glibc reports its own version through the macros __GLIBC__ and
* __GLIBC_MINOR__, which are available once any glibc header has been
* included. (The __GNUC__ macros describe the compiler, not the C library.)
*
*/
#if defined(__GLIBC__) && (__GLIBC__ < 2 || (__GLIBC__ == 2 && __GLIBC_MINOR__ < 27))
static loff_t copy_file_range(int in, loff_t* off_in, int out,
loff_t* off_out, size_t s, unsigned int flags)
{
return syscall(__NR_copy_file_range, in, off_in, out, off_out, s,
flags);
}
#endif
/* The copy function.
*/
int copy(const char* src, const char* dst){
int in, out;
struct stat stat;
loff_t s, n;
if(0>(in = open(src, O_RDONLY))){
perror("open(src, ...)");
exit(EXIT_FAILURE);
}
if(fstat(in, &stat)){
perror("fstat(in, ...)");
exit(EXIT_FAILURE);
}
s = stat.st_size;
if(0>(out = open(dst, O_CREAT|O_WRONLY|O_TRUNC, 0644))){
perror("open(dst, ...)");
exit(EXIT_FAILURE);
}
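/* Loop only while bytes remain, so an empty source file is not an error. */
while(0<s){
if(0>(n = copy_file_range(in, NULL, out, NULL, s, 0))){
perror("copy_file_range(...)");
exit(EXIT_FAILURE);
}
if(0==n) break; /* end of file reached early: the source shrank */
s-=n;
}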
close(in);
close(out);
return EXIT_SUCCESS;
}
/* Test it out.
*
* BASH:
*
* gcc copy.c -o copy -Wall -g
* echo 'Hello, world!' > src.txt
* ./copy src.txt dst.txt
* [ -z "$(diff src.txt dst.txt)" ]
*
*/
int main(int argc, char* argv[argc]){
if(argc!=3){
printf("Usage: %s <SOURCE> <DESTINATION>\n", argv[0]);
exit(EXIT_FAILURE);
}
copy(argv[1], argv[2]);
return EXIT_SUCCESS;
}
It's based on the example in my Ubuntu 20.x Linux distribution's man page for copy_file_range. Check your man pages for it with:
> man copy_file_range
Then hit j or Enter until you get to the example section. Or search by typing /example.
<stdio.h>/<stdlib.h> only
Here's an example that only uses stdlib/stdio. The downside is it uses an intermediate buffer in user-space.
/* copy.c
*
* Compile with:
*
* gcc copy.c -o copy -Wall -g
*
* Defines function copy:
*
* Copy a source file to a destination file. If the destination file already
* exists, this clobbers it. If the destination file does not exist, it is
* created.
*
* Uses a buffer in user-space, so may not perform as well as
* copy_file_range, which copies in kernel-space.
*
*/
#include <stdlib.h>
#include <stdio.h>
#define BUF_SIZE 65536 //2^16
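int copy(const char* in_path, const char* out_path){
size_t n;
int status = EXIT_FAILURE;
FILE* in=NULL, * out=NULL;
char* buf = calloc(BUF_SIZE, 1);
if(buf && (in = fopen(in_path, "rb")) && (out = fopen(out_path, "wb"))){
while((n = fread(buf, 1, BUF_SIZE, in)) && fwrite(buf, 1, n, out) == n);
/* Succeed only if the loop stopped at end of input with no stream errors. */
if(!ferror(in) && !ferror(out)) status = EXIT_SUCCESS;
}
free(buf);
if(in) fclose(in);
if(out && fclose(out) != 0) status = EXIT_FAILURE;
return status;
}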
/* Test it out.
*
* BASH:
*
* gcc copy.c -o copy -Wall -g
* echo 'Hello, world!' > src.txt
* ./copy src.txt dst.txt
* [ -z "$(diff src.txt dst.txt)" ]
*
*/
int main(int argc, char* argv[argc]){
if(argc!=3){
printf("Usage: %s <SOURCE> <DESTINATION>\n", argv[0]);
exit(EXIT_FAILURE);
}
return copy(argv[1], argv[2]);
}
Another way to ensure portability in general while still working with a Unix-like C API is to develop with GNOME (e.g. GLib, GIO)
https://docs.gtk.org/glib/
https://docs.gtk.org/gio/
I have used the syscalls read() and write() in my program WITHOUT including the "unistd.h" header file, but the program still works and gives the expected results.
After running the program, I decided to read the man pages for read() and write().
In the man 2 pages for read() and write(), the SYNOPSIS section says that I need to include the unistd.h header file to use read() or write().
SYNOPSIS
#include <unistd.h>
ssize_t read(int fd, void *buf, size_t count);
SYNOPSIS
#include <unistd.h>
ssize_t write(int fd, const void *buf, size_t count);
So I am surprised: how did my program work even though I had not included unistd.h?
Below is my program. It's a program to copy contents of a source file to target file using read(), and write() syscalls.
#include<stdio.h>
#include<fcntl.h>
#include<sys/types.h>
#include<sys/stat.h>
#include<stdlib.h>
int main()
{
/* Declaring the buffer. */
/* Data read by read() will be stored in this buffer. */
/* Later when write() is used, write() will take the contents of this buffer and write to the file.*/
char buffer[512];
/* Declaring strings to store the source file and target file names. */
char source[128], target[128];
/* Declaring integer variables in which integer returned by open() will be stored. */
/* Note that this program will open a source file, and a target file. So, 2 integers will be needed. */
int inhandle, outhandle;
/* Declaring integer variable which will specify how many bytes to read or write.*/
int bytes;
/* Taking source filename from keyboard.*/
printf("\nSource File name: ");
scanf("%s",source);
/* Open the source file using open().*/
inhandle = open(source, O_RDONLY);
/* If there is error while opening source file.*/
if (inhandle == -1)
{
perror("Error opening source file.\n");
exit(1);
}
/* Taking target filename from keyboard.*/
printf("\nTarget File name: ");
scanf("%s",target);
/* Open the target file using open().*/
outhandle = open(target, O_CREAT | O_WRONLY, 0660);
/* If there is error while opening target file.*/
if (outhandle == -1)
{
perror("Error opening target file.\n");
close(inhandle);
exit(2);
}
/* Below code does following:
1. First reads (at most) 512 bytes from source file
2. Then copies them to buffer
3. If bytes read is greater than 0, write the content stored in buffer to target file.
*/
while((bytes = read(inhandle, buffer, 512)) > 0)
{
write(outhandle, buffer, bytes);
}
/* Close both source and target files. */
close(inhandle);
close(outhandle);
return 0;
}
Your program worked because of implicit function declarations. read() and write() both return ssize_t, but when a function is called without a declaration, the compiler (in C89 mode, or as an extension in later modes) assumes that it returns int, so the call may happen to work, as you have seen.
If you compile your program with warnings enabled, then the compiler would warn you about that, using gcc
gcc -Wall -Wextra -Werror
would stop compilation if it finds implicitly declared functions, i.e. functions without a prototype.
I am a C beginner trying to use dup(). I wrote a program to test this function, and the result is a little different from what I expected.
Code:
// unistd.h, dup() test
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
extern void dup_test();
int main() {
dup_test();
}
// dup()test
void dup_test() {
// open a file
FILE *f = fopen("/tmp/a.txt", "w+");
int fd = fileno(f);
printf("original file descriptor:\t%d\n",fd);
// duplicate file descriptor of an opened file,
int fd_dup = dup(fd);
printf("duplicated file descriptor:\t%d\n",fd_dup);
FILE *f_dup = fdopen(fd_dup, "w+");
// write to file, use the duplicated file descriptor,
fputs("hello\n", f_dup);
fflush(f_dup);
// close duplicated file descriptor,
fclose(f_dup);
close(fd_dup);
// allocate memory
int maxSize = 1024; // 1 kb
char *buf = malloc(maxSize);
// move to beginning of file,
rewind(f);
// read from file, use the original file descriptor,
fgets(buf, maxSize, f);
printf("%s", buf);
// close original file descriptor,
fclose(f);
// free memory
free(buf);
}
The program tries to write via the duplicated fd, then closes the duplicated fd, then tries to read via the original fd.
I expected that closing the duplicated fd would flush the I/O cache automatically, but it does not: if I remove the fflush() call from the code, the original fd cannot read the content written through the duplicated fd, even though that fd is already closed.
My question is:
Does this mean that closing the duplicated fd does not flush automatically?
#Edit:
Sorry, my mistake. I found the reason: my initial program had:
close(fd_dup);
but did not have:
fclose(f_dup);
After replacing close(fd_dup); with fclose(f_dup); it works.
So the duplicated stream is flushed automatically if it is closed in the proper way. write() and close() are a pair, fwrite() and fclose() are a pair, and they should not be mixed.
Actually, in the code I could have used the duplicated fd_dup directly with write() and close(), and there is no need to create a new FILE at all.
So, the code could simply be:
// unistd.h, dup() test
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#define BUF_SIZE 1024 // 1 kb
extern void dup_test();
int main() {
dup_test();
}
// dup()test
void dup_test() {
// open a file
FILE *f = fopen("/tmp/a.txt", "w+");
int fd = fileno(f);
printf("original file descriptor:\t%d\n",fd);
// duplicate file descriptor of an opened file,
int fd_dup = dup(fd);
printf("duplicated file descriptor:\t%d\n",fd_dup);
// write to file, use the duplicated file descriptor,
// the length is the 6 bytes of "hello\n", not the size of the read buffer
write(fd_dup, "hello\n", 6);
// close duplicated file descriptor,
close(fd_dup);
// allocate memory
char *buf = malloc(BUF_SIZE);
// move to beginning of file,
rewind(f);
// read from file, use the original file descriptor,
fgets(buf, BUF_SIZE, f);
printf("%s", buf);
// close original file descriptor,
fclose(f);
// free memory
free(buf);
}
From dup man pages:
After a successful return from one of these system calls, the old and new file descriptors may be used interchangeably. They refer to the same open file description (see open(2)) and thus share file offset and file status flags; for example, if the file offset is modified by using lseek(2) on one of the descriptors, the offset is also changed for the other.
It means the file offset changes when you write through the duplicated file descriptor, so a read through the first descriptor continues from that new offset unless you reposition it first (your rewind(f) does that).
fdopen does not undo the duplication: fd_dup still shares its offset with fd. What fdopen adds is a second FILE stream with its own user-space buffer. fputs places the data in that buffer, and it reaches the file only when the stream is flushed or closed with fclose. That's why you can read the data after flushing and closing the stream.
That buffering is also the likely reason you can't read anything if you don't flush: close(fd_dup) closes only the descriptor and knows nothing about the data still waiting in the FILE buffer.
After all, if you need an I/O buffer, you might be using the wrong mechanism; check named pipes and other buffering mechanisms provided by the OS.
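The shared offset described in the man page excerpt can be observed directly. The following is a small sketch, assuming a POSIX system; the function name offset_after_dup_write is hypothetical, and tmpfile() is used only to get a scratch file.

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical demo: write 6 bytes through a dup()ed descriptor and report
 * the file offset seen through the original descriptor.
 * Returns the offset, or -1 on error. */
long offset_after_dup_write(void)
{
    FILE *f = tmpfile();               /* anonymous scratch file */
    if (f == NULL)
        return -1;

    int fd = fileno(f);
    int fd_dup = dup(fd);
    if (fd_dup == -1) {
        fclose(f);
        return -1;
    }

    /* Unbuffered write: the bytes reach the kernel immediately */
    if (write(fd_dup, "hello\n", 6) != 6) {
        close(fd_dup);
        fclose(f);
        return -1;
    }
    close(fd_dup);

    /* Both descriptors share one open file description, so the
     * original descriptor's offset has moved as well. */
    long off = (long)lseek(fd, 0, SEEK_CUR);

    fclose(f);
    return off;
}
```

The function returns 6 here: the write through fd_dup advanced the offset that fd sees, and no flush was needed because write() bypasses stdio buffering.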
I cannot really understand your problem. I tested it under Microsoft VC2008 (had to replace unistd.h with io.h) and gcc 4.2.1.
I commented out fflush(f_dup) because it is no use before a close and close(fd_dup); because the file descriptor was already closed, so the piece of code now looks like :
// write to file, use the duplicated file descriptor,
fputs("hello\n", f_dup);
// fflush(f_dup);
// close duplicated file descriptor,
fclose(f_dup);
// close(fd_dup);
And it works correctly. I get on both systems :
original file descriptor: 3
duplicated file descriptor: 4
hello
I am trying to open a file in C using open() and I need to check that the file is a regular file (it can't be a directory or a block file). Every time I run open() my returned file descriptor is 3, even when I don't enter a valid filename!
Here's what I have
/*
* Checks to see if the given filename is
* a valid file
*/
int isValidFile(char *filename) {
// We assume argv[1] is a filename to open
int fd;
fd = open(filename,O_RDWR|O_CREAT,0644);
printf("fd = %d\n", fd);
/* fopen returns 0, the NULL pointer, on failure */
}
Can anyone tell me how to validate input files?
Thanks!
Try this:
int file_isreg(const char *path) {
struct stat st;
if (stat(path, &st) < 0)
return -1;
return S_ISREG(st.st_mode);
}
This code will return 1 if regular, 0 if not, -1 on error (with errno set).
If you want to check the file via its file descriptor returned by open(2), then try:
int fd_isreg(int fd) {
struct stat st;
if (fstat(fd, &st) < 0)
return -1;
return S_ISREG(st.st_mode);
}
You can find more examples here, (specifically in the path.c file).
You should also include the following headers in your code (as stated on stat(2) manual page):
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
For future reference, here is an excerpt of the stat(2) manpage regarding the POSIX macros available for st_mode field validations:
S_ISREG(m) is it a regular file?
S_ISDIR(m) directory?
S_ISCHR(m) character device?
S_ISBLK(m) block device?
S_ISFIFO(m) FIFO (named pipe)?
S_ISLNK(m) symbolic link? (Not in POSIX.1-1996.)
S_ISSOCK(m) socket? (Not in POSIX.1-1996.)
int isValidFile(char *filename) {
// We assume argv[1] is a filename to open
int fd;
fd = open(filename,O_RDWR|O_CREAT,0644); /* note the O_CREAT flag */
printf("fd = %d\n", fd);
/* fopen returns 0, the NULL pointer, on failure */
}
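You are using O_CREAT, which tells open() to create the file if it doesn't exist, so the call succeeds even for a name that did not refer to an existing file. The descriptor is 3 because that is the lowest free slot in the descriptor table: 0, 1 and 2 are already taken by standard input, standard output and standard error.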
Wrong: check if the file is OK, then if it is, go open it and use it.
Right: go open it. If you can't, report the problem and bail out. Otherwise, use it (checking and reporting errors after each operation).
Why: you have just checked that a file is OK. That's fine, but you cannot assume it will be OK in 0.000000017 seconds from now. Perhaps the disk will overheat and break down. Perhaps some other process will mass-delete your entire file collection. Perhaps your cat will trip over the network cable. So let's just check if it's OK again, and then go open it. Wow, what a great idea! No wait...
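One way to follow the "open first" advice and still insist on a regular file is to open the path and then examine the descriptor with fstat(). This is only a sketch: the name open_regular and the choice of EINVAL for non-regular, non-directory files are assumptions made for illustration.

```c
#include <fcntl.h>
#include <errno.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: open path read-only and return the descriptor only
 * if it refers to a regular file. Returns -1 with errno set otherwise. */
int open_regular(const char *path)
{
    struct stat st;
    int fd = open(path, O_RDONLY);     /* no O_CREAT: never create the file */
    if (fd == -1)
        return -1;

    /* Check the object we actually opened, not the name */
    if (fstat(fd, &st) != 0) {
        int saved = errno;
        close(fd);
        errno = saved;
        return -1;
    }
    if (!S_ISREG(st.st_mode)) {
        close(fd);
        errno = S_ISDIR(st.st_mode) ? EISDIR : EINVAL;
        return -1;
    }
    return fd;
}
```

Because the check is made on the open descriptor, whatever happens to the name afterwards does not affect the file the program goes on to read.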