Program Not Reading Entire File - c

I am writing a hex dump program in C. I know there are tons of hex dump programs out there, but I wanted to write one for the experience. I have written the program in CodeBlocks, on Windows, but I can't seem to get it to work.
I am reading in a test program which is roughly 137,000 bytes, but my program stops at 417 bytes. Now, when I compile the code on Linux (as it's only a console application using standard C libraries), it works perfectly and reports the correct number of bytes in the file. Does anyone have any idea why read() would not work on Windows but works fine on Linux?
Below is an example of how I am reading in the file.
int main(int argc, char **argv)
{
    if (argc != 2) { return 1; }

    int fd = open(argv[1], O_RDONLY);
    if (fd == -1) { return 1; }

    unsigned char buffer[8];
    unsigned int bytes = 0;
    unsigned int total_bytes = 0;

    while ((bytes = read(fd, buffer, sizeof(unsigned char) * 8)) > 0) {
        ...
        total_bytes += bytes;
    }
    printf("Total Bytes: %d\n", total_bytes);
    return 0;
}

I have found the answer in this post after all. They were having the issue with stdin, though. Apparently the substitute character (0x1A) is the same as Ctrl+Z on Windows, and when a file is read in text mode that character is treated as end-of-file, which is why my read stopped early.
C reading (from stdin) stops at 0x1a character
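In other words, the file was being opened in text mode. A minimal sketch of the usual fix on Windows toolchains (MinGW/MSVC), which is to open the file in binary mode with the O_BINARY flag; the #ifndef guard just makes the same line compile on Linux too:

#include <fcntl.h>

/* O_BINARY only exists on Windows toolchains; make the flag a no-op elsewhere */
#ifndef O_BINARY
#define O_BINARY 0
#endif

int fd = open(argv[1], O_RDONLY | O_BINARY);  /* binary mode: 0x1A is just another byte */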

Related

Can not read from a pipe, and another stdin issue

So, I asked here just a while ago, but half of that question was just me being dumb. And I still have issues. I hope that this will be clearer than the question before.
I'm writing a POSIX cat. I nearly have it working, but I have a couple of issues:
My cat cannot read from a pipe, and I really do not know why (redirecting with < works fine).
I cannot figure out how to make it continuously read stdin without problems. I had a version that worked "fine" but would cause a stack overflow. The other version wouldn't stop reading from stdin when there was only stdin, i.e. my-cat < file would read from stdin until it was terminated, which it shouldn't; it should only read from stdin and wait for termination when no files are supplied.
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <sys/stat.h>
#include <fcntl.h>
int main(int argc, char *argv[])
{
    char opt;
    while ((opt = getopt(argc, argv, "u")) != EOF) {
        switch (opt) {
        case 'u':
            /* Make the output un-buffered */
            setbuf(stdout, NULL);
            break;
        default:
            break;
        }
    }
    argc -= optind;
    argv += optind;

    int i = 0, fildes, fs = 0;
    do {
        /* Check for operands; if none, or the operand is "-", read from stdin */
        if (argc == 0 || !strcmp(argv[i], "-")) {
            fildes = STDIN_FILENO;
        } else {
            fildes = open(argv[i], O_RDONLY);
        }

        /* Check for directories */
        struct stat fb;
        if (!fstat(fildes, &fb) && S_ISDIR(fb.st_mode)) {
            fprintf(stderr, "pcat: %s: Is a directory\n", argv[i]);
            i++;
            continue;
        }

        /* Get file size */
        fs = fb.st_size;

        /* If bytes are read, write them to stdout */
        char *buf = malloc(fs * sizeof(char));
        while ((read(fildes, buf, fs)) > 0)
            write(STDOUT_FILENO, buf, fs);
        free(buf);

        /* Close file if it's not stdin */
        if (fildes != STDIN_FILENO)
            close(fildes);
        i++;
    } while (i < argc);
    return 0;
}
Pipes don't have a size, and neither do terminals. The contents of the st_size field are undefined for such files. (On my system it seems to always contain 0, but I don't think there is any cross-platform guarantee of that.)
So your plan of reading the entire file in one go and writing it all out again is not workable for non-regular files, and is risky even for regular ones (a single read is not guaranteed to return the full number of bytes requested). It's also an unnecessary memory hog if the file is large.
A better strategy is to read into a fixed-size buffer, and write out only the number of bytes you successfully read. You repeat this until end-of-file is reached, which is indicated by read() returning 0. This is how you solve your second problem.
On a similar note, write() is not guaranteed to write out the full number of bytes you asked it to, so you need to check its return value, and if it was short, try again to write out the remaining bytes.
Here's an example:
#define BUFSIZE 65536 // arbitrary choice, can be tuned for performance

ssize_t nread;
char buf[BUFSIZE]; // or char *buf = malloc(BUFSIZE);

while ((nread = read(filedes, buf, BUFSIZE)) > 0) {
    ssize_t written = 0;
    while (written < nread) {
        ssize_t ret = write(STDOUT_FILENO, buf + written, nread - written);
        if (ret <= 0) {
            // handle error (report it and stop copying)
        }
        written += ret;
    }
}
if (nread < 0) {
    // handle error (the read failed)
}
As a final comment, your program lacks error checking in general; e.g. if the file cannot be opened, it will proceed anyway with filedes == -1. It is important to check the return value of every system call you issue, and handle errors accordingly. This would be essential for a program to be used in real life, and even for toy programs created just as an exercise, it will be very helpful in debugging them. (Error checking would probably have given you some clues in figuring out what was wrong with this program, for instance.)
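As an illustration (not part of your original program), here is a minimal check on the open() call, printing a diagnostic in the style of cat itself and moving on to the next operand:

#include <errno.h>    /* errno */
#include <string.h>   /* strerror */

/* ... inside the loop over the operands ... */
fildes = open(argv[i], O_RDONLY);
if (fildes == -1) {
    fprintf(stderr, "pcat: %s: %s\n", argv[i], strerror(errno));
    i++;
    continue;   /* report the error, then keep going with the next operand */
}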
Your cat (you can call it my-cat, but I preferred to call it felix, just permit me the pun) should use stdio throughout, to get the benefit of the buffering done by the stdio package. Below is a simplified version of cat that uses the stdio package exclusively (almost exactly as it appears in K&R), and you'll see that it is perfectly efficient as shown. The structure is almost exactly like yours, but I simplify the data-copying loop (as in the K&R book) and the argument processing (yours is a bit messy):
felix.c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <getopt.h>

#define ERR(_code, _fmt, ...) do {              \
        fprintf(stderr, "%s: " _fmt, progname,  \
                ##__VA_ARGS__);                 \
        if (_code) exit(_code);                 \
    } while (0)

char *progname = "cat";

void process(FILE *f);

int main(int argc, char **argv)
{
    int opt;
    while ((opt = getopt(argc, argv, "u")) != EOF) {
        switch (opt) {
        case 'u': setbuf(stdout, NULL); break;
        }
    }

    /* in case the program has been renamed, calculate the basename
     * of argv[0] (progname is used in the macro ERR above) */
    progname = strrchr(argv[0], '/');
    progname = progname
             ? progname + 1
             : argv[0];

    /* shift the options away */
    argc -= optind;
    argv += optind;

    if (argc) {
        int i;
        for (i = 0; i < argc; i++) {
            FILE *f = fopen(argv[i], "r");
            if (!f) {
                ERR(EXIT_FAILURE,
                    "%s: %s (errno = %d)\n",
                    argv[i], strerror(errno), errno);
            }
            process(f);
            fclose(f);
        }
    } else {
        process(stdin);
    }
    exit(EXIT_SUCCESS);
}

/* There's no need to complicate things here: fgetc and putchar use the buffering
 * set up in main (no output buffering if you did setbuf(stdout, NULL), and input
 * buffering all the time).  The buffer size is best left for stdio to calculate,
 * as it queries the filesystem for the best input/output size and creates buffers
 * of that size, and the processing is a simple loop like the one below.  You'll
 * get no appreciable difference between this and any other way of doing the
 * input/output; you can believe me, I've tested it. */
void process(FILE *f)
{
    int c;
    while ((c = fgetc(f)) != EOF) {
        putchar(c);
    }
}
As you can see, nothing special has been done to support redirection, because redirection is not done inside a program but by the program that calls it (in this case, the shell). When your program starts, it receives three already-open file descriptors. These are the ones the shell is using, or the ones the shell puts in place of 0, 1, and 2 just before starting your program. So your program has nothing to do to cope with redirection; everything is done (in this case) by the shell, and this is why redirection works for your program even though you have done nothing to support it. You only have to set up redirection yourself if you are going to start another program with its input, output, or standard error redirected somewhere else (somewhere other than the standard input, output, or error you received from your parent process)... but that is not the case for my-cat.
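Purely as an illustration of that last point (my-cat itself needs none of this), this is roughly what a caller does to start a child with its standard output redirected; the file names here are made up:

#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    pid_t pid = fork();
    if (pid == 0) {                      /* child */
        int fd = open("out.txt", O_WRONLY | O_CREAT | O_TRUNC, 0666);
        if (fd == -1) { perror("open"); _exit(1); }
        dup2(fd, STDOUT_FILENO);         /* descriptor 1 now refers to out.txt */
        close(fd);
        execlp("my-cat", "my-cat", "file.txt", (char *)NULL);
        perror("execlp");                /* reached only if exec fails */
        _exit(127);
    }
    wait(NULL);                          /* parent waits for the child */
    return 0;
}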

Is it necessary to use a while loop when using the read and write system calls?

I am practicing the read and write system calls. The code below works fine with the while loop and also without it. Could you please tell me what the use of the while loop is here? Is it necessary when using the read and write system calls? I am a beginner. Thanks.
#include <unistd.h>
#define BUF_SIZE 256
int main(int argc, char *argv[])
{
    char buf[BUF_SIZE];
    ssize_t rlen;
    int i;
    char from;
    char to;

    from = 'e';
    to = 'a';

    while (1) {
        rlen = read(0, buf, sizeof(buf));
        if (rlen == 0)
            return 0;
        for (i = 0; i < rlen; i++) {
            if (buf[i] == from)
                buf[i] = to;
        }
        write(1, buf, rlen);
    }
    return 0;
}
You usually need to use while loops (or some kind of loop in general) with read and write, because, as you should know from the manual page (man 2 read):
RETURN VALUE
On success, the number of bytes read is returned (zero indicates end of file), and the file position is advanced by this number. It is not an error if this number is smaller than the number of bytes requested; this may happen for example because fewer bytes are actually available right now (maybe because we were close to end-of-file, or because we are reading from a pipe, or from a terminal), or because read() was interrupted by a signal. See also NOTES.
Therefore, if you ever want to read more than 1 byte, you need to do this in a loop, because read can always process less than the requested amount.
Similarly, write can also process less than the requested size (see man 2 write):
RETURN VALUE
On success, the number of bytes written is returned (zero indicates nothing was written). It is not an error if this number is smaller than the number of bytes requested; this may happen for example because the disk device was filled. See also NOTES.
On error, -1 is returned, and errno is set appropriately.
The only difference here is that when write returns 0 it's not an error or an end-of-file indicator; you should just retry writing.
Your code is almost correct, in that it uses a loop to keep reading until there are no more bytes left to read (when read returns 0), but there are two problems:
You should check for errors after read (rlen < 0).
When you use write you should also add a loop there too, because as I just said, even write could process less than the requested amount of bytes.
A correct version of your code would be:
#include <stdio.h>
#include <unistd.h>

#define BUF_SIZE 256

int main(int argc, char *argv[])
{
    char buf[BUF_SIZE];
    ssize_t rlen, wlen, written;
    char from, to;
    int i;

    from = 'e';
    to = 'a';

    while (1) {
        rlen = read(0, buf, sizeof(buf));
        if (rlen < 0) {
            perror("read failed");
            return 1;
        } else if (rlen == 0) {
            return 0;
        }
        for (i = 0; i < rlen; i++) {
            if (buf[i] == from)
                buf[i] = to;
        }
        for (written = 0; written < rlen; written += wlen) {
            wlen = write(1, buf + written, rlen - written);
            if (wlen < 0) {
                perror("write failed");
                return 1;
            }
        }
    }
    return 0;
}

Short read on named pipes (matlab->linux)

I've got some code I'm writing that expects messages from a Matlab program via a named pipe, e.g. "/tmp/named_pipe_0". I can get the pipes created with mkfifo and opened fine, but when the C program goes to read() from the pipe, instead of the expected 5004 bytes I get short values like 4096, 904, 5000, 4096, etc. I've already verified that Matlab is supposedly sending the correct 5004 bytes (at least, it's told to), so I'm wondering what the cause is. Has anyone run across something like this before?
Matt
This is expected; a read on a pipe/socket/named pipe gives you back the data as soon as something is available.
If you need to read 5004 bytes, you'd do it in a loop that appends to your own buffer until you get that many bytes (or an error or EOF occurs),
e.g.
#include <unistd.h>   /* read */

ssize_t readn(int fd, void *buf, ssize_t len)
{
    ssize_t tot = 0;
    unsigned char *p = buf;

    while (tot != len) {
        ssize_t r = read(fd, p + tot, len - tot);
        if (r == 0)        // premature end of input
            break;
        else if (r == -1)  // error
            return -1;
        tot += r;
    }
    return tot;
}
...
char buf[5004];
if (readn(pipe_fd, buf, sizeof buf) != sizeof buf) {
    // something went bad
} else {
    // got all the 5004 bytes
}

Write pid to fifo - C

I've got two C files, server.c and client.c. The server has to create a fifo file and constantly read in it, waiting for input. The client gets its PID and writes the PID in the fifo.
This is my server file which I launch first:
int main(){
    int fd;
    int fd1;
    int bytes_read;
    char *buffer = malloc(5);
    int nbytes = sizeof(buffer);

    if ((fd = mkfifo("serverfifo", 0666)) == -1) printf("create fifo error");
    else printf("create fifo ok");

    if ((fd1 = open("serverfifo", O_RDWR)) == -1) printf("open fifo error");
    else {
        printf("open fifo ok");
        while (1) {
            bytes_read = read(fd, buffer, nbytes);
            printf("%d", bytes_read);
        }
    }
    return(0);
}
And my client file :
int main(){
    int fd;
    int pid = 0;
    char *fifo;
    int bytes;

    if ((pid = getpid()) == 0) printf("pid error");

    char pid_s[sizeof(pid)];
    sprintf(pid_s, "%d", pid);

    if ((fd = open("serverfifo", O_RDWR)) == -1) printf("open fifo error");
    else {
        printf("open fifo ok");
        bytes = write(fd, pid_s, sizeof(pid_s));
        printf("bytes = %d", bytes);
    }
    close(fd);
    return(0);
}
The two main problems I'm having are: when I write the pid to the fifo, write returns the number of bytes I wrote, so it looks okay, but when I check the properties of the fifo file it says 0 bytes. The second problem is that the read doesn't work: a printf placed before the read shows up, but one placed after it doesn't, and the read never returns anything; it just freezes.
I realise there are a lot of similar posts on the site but I couldn't find anything that helped.
I'm using Ubuntu and GCC compiler with CodeBlocks.
There are many things wrong here
char pid_s[sizeof(pid)];
sprintf(pid_s,"%d",pid);
sizeof(pid) is the size of the pid value, not of its string representation; i.e. it is sizeof(int), which is typically 4. You then sprintf the decimal text of the pid into that buffer, and a 5-digit pid plus the terminating '\0' needs 6 bytes, so if this appears to work it works only by luck (you are overflowing the array). The correct way, if you choose to do it this way at all, is to allocate a suitably large buffer and use snprintf to make sure you don't overflow. PIDs normally fit in 5 digits, so something like this will do:
char pid_s[8];
snprintf(pid_s, sizeof(pid_s), "%d", pid);
Of course, you can skip this step altogether and send the raw bytes of the pid instead:
write(fd, (void *)&pid, sizeof(pid));
Now in the server you make similar mistakes:
char * buffer = malloc(5);
int nbytes = sizeof(buffer);
sizeof(buffer) is the size of the pointer (4 or 8, depending on your platform), not the 5 bytes you allocated. The correct way to do this, if you want to allocate on the heap (using malloc), is:
char* buffer = malloc(8);
int nbytes = 8;
alternatively you can allocate on the stack:
char buffer[8];
int nbytes = sizeof(buffer);
sizeof is sort of magical, in that if you pass it an array, it returns the size of the whole array (8 * 1 = 8 bytes in this case).
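A minimal demonstration of the difference (the printed pointer size depends on the platform):

#include <stdio.h>

int main(void)
{
    char array[8];
    char *pointer = array;

    printf("sizeof(array)   = %zu\n", sizeof(array));   /* 8: the whole array */
    printf("sizeof(pointer) = %zu\n", sizeof(pointer)); /* 4 or 8: just the pointer */
    return 0;
}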
When you are reading, you again ask for sizeof(buffer) bytes (the size of a pointer, not of the 5 bytes you allocated), so the number of bytes you try to read does not match what the client wrote. If you send the pid in its raw form as suggested above, read it back the same way:
int pid;
read(fd, (void*)&pid, sizeof(pid));
Also, if you were to actually read and write strings, you'd do something like this:
// client
char pid_s[8];
snprintf(pid_s, sizeof(pid_s), "%d", pid);
write(fd, pid_s, sizeof(pid_s));
// server
char pid_s[8];
read(fd, pid_s, sizeof(pid_s));
Note also that read may return less than what was written, and you need to call it again to keep reading...
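A rough sketch of such a loop for the 8-byte string used above (it needs <unistd.h> and <stdio.h>; the variable names are just illustrative):

char pid_s[8];
size_t got = 0;

/* keep calling read() until the whole fixed-size message has arrived */
while (got < sizeof(pid_s)) {
    ssize_t n = read(fd, pid_s + got, sizeof(pid_s) - got);
    if (n == -1) { perror("read"); break; }  /* error */
    if (n == 0)  break;                      /* writer closed the fifo early */
    got += n;
}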
Well, there are a lot of mistakes in this code...
First of all, sizeof does not work like that.
Why do you serialize the pid?
This is wrong:
char pid_s[sizeof(pid)];
A pid like 123456 is an int, and its text form doesn't fit into this array of size 4; only 3 characters plus the terminating '\0' can be stored...
And because you are serializing the pid, the reader doesn't know the expected size to read, unless you take the worst case and always write 10 digits + 1 for the '\0'...
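For example, a sketch of that worst-case, fixed-width scheme (11 bytes covers any 10-digit pid plus the '\0'; this is only an illustration, and a really robust reader would loop over read as noted in the other answer):

// client: always send exactly sizeof(pid_msg) bytes
char pid_msg[11];                       /* 10 digits + '\0' */
snprintf(pid_msg, sizeof(pid_msg), "%d", getpid());
write(fd, pid_msg, sizeof(pid_msg));

// server: always expect exactly that many bytes
char pid_msg[11];
if (read(fd, pid_msg, sizeof(pid_msg)) == sizeof(pid_msg))
    printf("client pid: %s\n", pid_msg);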

How to Search for New Lines while Reading from a File in C/C++

I am implementing my own version of the ("cat") command in Unix for practice. After I did that I became interested in implementing some flags like (-n) and (-b).
My question: I am looking for a way to locate the blank lines and newlines while reading from my file. I can't remember what library or function I should use.
Here is the source code I am working on:
#include <fcntl.h>
#include <unistd.h>
static int cat_fd(int fd)
{
    char buf[4096];
    ssize_t nread;

    while ((nread = read(fd, buf, sizeof buf)) > 0)
    {
        ssize_t ntotalwritten = 0;
        while (ntotalwritten < nread)
        {
            ssize_t nwritten = write(STDOUT_FILENO, buf + ntotalwritten, nread - ntotalwritten);
            if (nwritten < 1)
            {
                return -1;
            }
            ntotalwritten += nwritten;
        }
    }
    return (nread == 0) ? 0 : -1;
}

static int cat(const char *fname)
{
    int fd, success;

    if ((fd = open(fname, O_RDONLY)) == -1)
    {
        return -1;
    }
    success = cat_fd(fd);
    if (close(fd) != 0)
    {
        return -1;
    }
    return success;
}

int main(int argc, char **argv)
{
    int i;

    if (argc == 1)
    {
        if (cat_fd(STDIN_FILENO) != 0)
            goto error;
    }
    else
    {
        for (i = 1; i < argc; i++)
        {
            if (cat(argv[i]) != 0)
            {
                goto error;
            }
        }
    }
    return 0;

error:
    write(STDOUT_FILENO, "error\n", 6);
    return 1;
}
Any ideas or suggestions concerning my question are greatly appreciated.
I would be even more grateful if you could give me the complete function prototype that I should be using, as I am not an experienced programmer.
Thanks in advance for your help.
P.S.: I am implementing the (-n) and (-b) flags. Thus, I want to write the line number at the beginning of each line in the file that I am reading.
While there is a function that does line-based file input in C (it's called fgets), you can't really use it for cat, because:
There's no way to know the maximum length of the line beforehand;
You'll lose portions of the input if it contains null bytes.
You'll have to look for newline symbols in your buffer after you read it, and once you find any, print the prefix of the buffer, followed by newline, line number, and the rest of the buffer (with additional processing of remaining newlines, of course).
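A rough sketch of that buffer-scanning approach using memchr, for the -n case; the function name and the state variables are made up for illustration, and write() errors are not checked:

#include <stdio.h>     /* snprintf */
#include <string.h>    /* memchr */
#include <unistd.h>    /* write */

/* Write buf[0..nread) to stdout, inserting a line number at the start of
 * every line.  at_line_start and lineno carry state across buffers. */
static void write_numbered(const char *buf, ssize_t nread,
                           int *at_line_start, long *lineno)
{
    const char *p = buf;
    const char *end = buf + nread;

    while (p < end) {
        if (*at_line_start) {
            char num[32];
            int len = snprintf(num, sizeof(num), "%6ld\t", (*lineno)++);
            write(STDOUT_FILENO, num, len);
            *at_line_start = 0;
        }
        /* find the next newline in what is left of the buffer */
        const char *nl = memchr(p, '\n', (size_t)(end - p));
        size_t chunk = nl ? (size_t)(nl - p) + 1 : (size_t)(end - p);
        write(STDOUT_FILENO, p, chunk);
        if (nl)
            *at_line_start = 1;   /* the next character starts a new line */
        p += chunk;
    }
}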
An easier solution would be to switch to processing input one byte at a time; you can use FILE* and fgetc to get the buffering provided by the C runtime, so that you don't actually do a syscall for each read/write, or keep reading the file in blocks as you do now and do the byte processing inside the loop. Then it's a matter of writing a state machine: if the previously read character was a newline, output a line number, unless the current character is itself a newline and the -b option is used, etc.
This still results in a less efficient solution, so you may want to treat plain cat (no numbering options) specially, i.e. switch to byte-by-byte processing only when you need it. In fact, this is exactly what at least one actual cat implementation does.
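A sketch of that byte-at-a-time state machine; number_nonblank here is meant to correspond to -b, and error handling is omitted:

#include <stdio.h>

/* Copy f to stdout, numbering lines; number_nonblank skips empty lines
 * the way cat -b does.  A sketch only. */
static void process_numbered(FILE *f, int number_nonblank)
{
    long lineno = 1;
    int at_line_start = 1;
    int c;

    while ((c = fgetc(f)) != EOF) {
        if (at_line_start && !(number_nonblank && c == '\n'))
            printf("%6ld\t", lineno++);
        putchar(c);
        at_line_start = (c == '\n');
    }
}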
I recall reading that cat memory maps files for fast execution. Use mmap(2).
http://kernel.org/doc/man-pages/online/pages/man2/munmap.2.html
I found this example: http://ladweb.net/src/map-cat.c
I know this doesn't answer your question about newlines. I guess memchr() would do the trick.
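If you do try the mmap route, a minimal sketch looks roughly like this (regular files only; empty files are skipped because mapping zero bytes fails, and a real version needs more error handling):

#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

static int mmap_cat(const char *fname)
{
    int fd = open(fname, O_RDONLY);
    if (fd == -1) { perror(fname); return -1; }

    struct stat st;
    if (fstat(fd, &st) == -1) { perror("fstat"); close(fd); return -1; }
    if (st.st_size == 0) { close(fd); return 0; }   /* nothing to print */

    void *p = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (p == MAP_FAILED) { perror("mmap"); close(fd); return -1; }

    write(STDOUT_FILENO, p, st.st_size);   /* or scan the mapping with memchr() */
    munmap(p, st.st_size);
    close(fd);
    return 0;
}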

Resources