I have two segments of code.
#define BUFFSIZE 30
#define STDIN_FILENO 0
#define STDOUT_FILENO 1
#define STDERR_FILENO 2
#include <strings.h>
int main(void)
{
int n;
char buf[BUFFSIZE];
while ( (n = read(STDIN_FILENO, buf, BUFFSIZE)) > 0)
if (write(STDOUT_FILENO, buf, n) != n)
err_sys("write error");
if (n < 0)
err_sys("read error");
exit(0);
}
And another
#define STDIN_FILENO 0
#define STDOUT_FILENO 1
#define STDERR_FILENO 2
#include <stdio.h>
#include <strings.h>
int main(void) {
int c;
while ( (c = getc(stdin)) != EOF)
if (putc(c, stdout) == EOF)
err_sys("output error");
if (ferror(stdin))
err_sys("input error");
exit(0); }
For the first program, I thought that if I entered a string longer than BUFFSIZE, the characters beyond index BUFFSIZE would be discarded. But that turned out not to be the case. Why does this happen? And what is the major difference between these two I/O mechanisms? Many thanks.
To me, the basic difference between the two I/O levels is that the lower level (read/write) is not buffered by the standard library.
In your case, the first example reads and writes using your own buffer of size BUFFSIZE. Each read() call returns at most BUFFSIZE bytes, and the loop picks up whatever remains on the next call, which is why longer input is not truncated. In the second example, you read and write one character at a time, relying on the buffering done by the library. Otherwise, both examples do the same thing.
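As a rough illustration of why nothing is dropped, here is a minimal sketch that drains a descriptor with a 30-byte buffer and counts how many read() calls returned data. The helper name drain_fd is hypothetical and not part of either program in the question.

```c
#include <sys/types.h>
#include <unistd.h>

#define BUFFSIZE 30

/* Hypothetical helper: read fd until end-of-file using a BUFFSIZE buffer.
 * Returns the total number of bytes read (or -1 on a read error) and
 * stores in *calls how many read() calls returned data. */
ssize_t drain_fd(int fd, int *calls)
{
    char buf[BUFFSIZE];
    ssize_t n, total = 0;

    *calls = 0;
    while ((n = read(fd, buf, BUFFSIZE)) > 0) {
        total += n;     /* each call delivers at most BUFFSIZE bytes */
        (*calls)++;     /* the rest stays queued for the next call */
    }
    return n < 0 ? -1 : total;
}
```

Feeding this 100 bytes through a pipe should report all 100 bytes, spread over at least four calls, since no single call can return more than 30.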
Lower-level functions offer a few more options than the higher-level ones, such as non-blocking I/O. Programs using the higher-level functions may also be a bit slower: in your second example the data is copied, byte by byte, from an input buffer to an output buffer, which does not happen in the first example.
BTW, your first example does not handle short writes: write() may write fewer bytes than requested. The loop should be something like:
while ( (n = read(STDIN_FILENO, buf, BUFFSIZE)) > 0) {
int i, k = 0;
do {
i = write(STDOUT_FILENO, buf+k, n-k);
if (i < 0) {err_sys("write error"); break;}
k += i;
} while (k < n);
}
So, I asked here just a while ago, but half of that question was just me being dumb. And I still have issues. I hope that this will be clearer than the question before.
I'm writing a POSIX cat. I nearly have it working, but I have a couple of issues:
My cat cannot read from a pipe and I really do not know why (redirecting with < works fine).
I cannot figure out how to make it continuously read stdin without some issues. I had a version that worked "fine" but would cause a stack overflow. The other version wouldn't stop reading from stdin if there was only stdin, i.e. my-cat < file would read from stdin until it got terminated, which it shouldn't; but it does have to read from stdin and wait for termination if no files are supplied.
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <sys/stat.h>
#include <fcntl.h>
int main(int argc, char *argv[])
{
char opt;
while ((opt = getopt(argc, argv, "u")) != EOF) {
switch(opt) {
case 'u':
/* Make the output un-buffered */
setbuf(stdout, NULL);
break;
default:
break;
}
}
argc -= optind;
argv += optind;
int i = 0, fildes, fs = 0;
do {
/* Check for operands, if none or operand = "-". Read from stdin */
if (argc == 0 || !strcmp(argv[i], "-")) {
fildes = STDIN_FILENO;
} else {
fildes = open(argv[i], O_RDONLY);
}
/* Check for directories */
struct stat fb;
if (!fstat(fildes, &fb) && S_ISDIR(fb.st_mode)) {
fprintf(stderr, "pcat: %s: Is a directory\n", argv[i]);
i++;
continue;
}
/* Get file size */
fs = fb.st_size;
/* If bytes are read, write them to stdout */
char *buf = malloc(fs * sizeof(char));
while ((read(fildes, buf, fs)) > 0)
write(STDOUT_FILENO, buf, fs);
free(buf);
/* Close file if it's not stdin */
if (fildes != STDIN_FILENO)
close(fildes);
i++;
} while (i < argc);
return 0;
}
Pipes don't have a size, and neither do terminals. The content of the st_size field is not defined for such files. (On my system it seems to always contain 0, but I don't think there is any cross-platform guarantee of that.)
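One way to avoid trusting st_size blindly is to check the file type first. The following is only a sketch; the helper name regular_file_size is made up for illustration.

```c
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: report the size of fd only when it is a regular file.
 * Returns 1 and stores the size in *size for a regular file,
 * 0 for anything else (pipe, terminal, directory, ...),
 * and -1 if fstat() itself fails. */
int regular_file_size(int fd, off_t *size)
{
    struct stat sb;

    if (fstat(fd, &sb) < 0)
        return -1;
    if (!S_ISREG(sb.st_mode))
        return 0;       /* st_size is not meaningful here */
    *size = sb.st_size;
    return 1;
}
```

Even for regular files the size is best treated as a hint, since the file can grow or shrink while it is being read.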
So your plan of reading the entire file at one go and writing it all out again is not workable for non-regular files, and is risky even for them (the read is not guaranteed to return the full number of bytes requested). It's also an unnecessary memory hog if the file is large.
A better strategy is to read into a fixed-size buffer, and write out only the number of bytes you successfully read. You repeat this until end-of-file is reached, which is indicated by read() returning 0. This is how you solve your second problem.
On a similar note, write() is not guaranteed to write out the full number of bytes you asked it to, so you need to check its return value, and if it was short, try again to write out the remaining bytes.
Here's an example:
#define BUFSIZE 65536 // arbitrary choice, can be tuned for performance
ssize_t nread;
char buf[BUFSIZE]; // or char *buf = malloc(BUFSIZE);
while ((nread = read(fildes, buf, BUFSIZE)) > 0) {
    ssize_t written = 0;
    while (written < nread) {
        ssize_t ret = write(STDOUT_FILENO, buf + written, nread - written);
        if (ret <= 0) {
            // handle error, e.g.:
            perror("write");
            exit(EXIT_FAILURE);
        }
        written += ret;
    }
}
if (nread < 0) {
    // handle error, e.g.:
    perror("read");
    exit(EXIT_FAILURE);
}
As a final comment, your program lacks error checking in general; e.g. if the file cannot be opened, it will proceed anyway with fildes == -1. It is important to check the return value of every system call you issue and handle errors accordingly. This is essential for a program to be used in real life, and even for toy programs created just as an exercise it is very helpful in debugging. (Error checking would probably have given you some clues about what was wrong with this program, for instance.)
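As a sketch of what checking the open step could look like, the helper below maps "-" (or no name at all) to standard input and reports a failed open() instead of silently continuing with -1. The name open_input is hypothetical.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: return a readable descriptor for name.
 * NULL or "-" means standard input. On failure, print the reason
 * to stderr and return -1 so the caller can skip this operand. */
int open_input(const char *name)
{
    if (name == NULL || strcmp(name, "-") == 0)
        return STDIN_FILENO;

    int fd = open(name, O_RDONLY);
    if (fd < 0)
        perror(name);   /* e.g. "missing.txt: No such file or directory" */
    return fd;
}
```

The caller would then test the result before calling fstat() or read(), and move on to the next operand when it is negative.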
Your cat (you can call it my-cat, but I preferred to call it felix, if you'll permit me the pun) could use stdio throughout, to get the benefit of the buffering done by the stdio package. Below is a simplified version of cat that uses the stdio package exclusively (almost exactly as it appears in K&R), and you'll see that it is perfectly efficient as shown. Its structure is almost the same as yours, but I have simplified the data copying (as in the K&R book) and the argument processing (yours is a bit messy):
felix.c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <getopt.h>
#define ERR(_code, _fmt, ...) do { \
fprintf(stderr,"%s: " _fmt, progname, \
##__VA_ARGS__); \
if (_code) exit(_code); \
} while (0)
char *progname = "cat";
void process(FILE *f);
int main(int argc, char **argv)
{
int opt;
while ((opt = getopt(argc, argv, "u")) != EOF) {
switch (opt) {
case 'u': setbuf(stdout, NULL); break;
}
}
/* for the case it has been renamed, calculate the basename
* of argv[0] (progname is used in the macro ERR above) */
progname = strrchr(argv[0], '/');
progname = progname
? progname + 1
: argv[0];
/* shift options */
argc -= optind;
argv += optind;
if (argc) {
int i;
for (i = 0; i < argc; i++) {
FILE *f = fopen(argv[i], "r");
if (!f) {
ERR(EXIT_FAILURE,
"%s: %s (errno = %d)\n",
argv[i], strerror(errno), errno);
}
process(f);
fclose(f);
}
} else {
process(stdin);
}
exit(EXIT_SUCCESS);
}
/* No need to complicate things here: fgetc and putchar use the buffering
 * configured in main (no output buffering if setbuf(stdout, NULL) was called,
 * and input buffering always). The buffer size is best left to stdio, which
 * queries the filesystem for a suitable I/O size and creates buffers of that
 * size. The processing is then a simple loop like the one below. You'll see
 * no appreciable difference between this and any other input/output approach;
 * believe me, I've tested it. */
void process(FILE *f)
{
int c;
while ((c = fgetc(f)) != EOF) {
putchar(c);
}
}
As you can see, nothing special has been done to support redirection, because redirection is not done inside a program but by the program that calls it (in this case, the shell). When you start a program, you receive three already-open file descriptors. These are the ones the shell itself is using, or the ones the shell has put in place of 0, 1, and 2 before starting your program. So your program has nothing to do to cope with redirection; everything is done (in this case) in the shell, and this is why redirection works with your program even though you have done nothing to make it work. You only have to do redirection yourself if you are going to run another program with its standard input, output, or error redirected somewhere (and this somewhere is not the standard input, output, or error you received from your parent process), but this is not the case for my-cat.
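For completeness, here is a rough sketch of what the calling side does for something like `prog < file`: fork, open the file in the child, move it onto descriptor 0 with dup2(), then exec. The function name run_with_stdin_from and the exit code 127 for setup failures are choices made for this illustration, not something taken from the programs above.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] with its standard input taken from path.
 * Returns the child's exit status, or -1 if it could not be started
 * or did not exit normally. */
int run_with_stdin_from(const char *path, char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {                         /* child */
        int fd = open(path, O_RDONLY);
        if (fd < 0)
            _exit(127);
        if (dup2(fd, STDIN_FILENO) < 0)     /* fd now also answers as 0 */
            _exit(127);
        if (fd != STDIN_FILENO)
            close(fd);
        execvp(argv[0], argv);              /* the new program just reads fd 0 */
        _exit(127);                         /* only reached if exec failed */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The executed program never learns that a redirection happened; it simply finds the file already open on descriptor 0.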
I am practicing the read and write system calls. The code below works fine both with the while loop and without it. Could you please tell me what the while loop is for here, and whether it is necessary when using the read and write system calls? I am a beginner. Thanks.
#include <unistd.h>
#define BUF_SIZE 256
int main(int argc, char *argv[])
{
char buf[BUF_SIZE];
ssize_t rlen;
int i;
char from;
char to;
from = 'e';
to = 'a';
while (1) {
rlen = read(0, buf, sizeof(buf));
if (rlen == 0)
return 0;
for (i = 0; i < rlen; i++) {
if (buf[i] == from)
buf[i] = to;
}
write(1, buf, rlen);
}
return 0;
}
You usually need to use while loops (or some kind of loop in general) with read and write, because, as you should know from the manual page (man 2 read):
RETURN VALUE
On success, the number of bytes read is returned (zero indicates end
of file), and the file position is advanced by this number. It is
not an error if this number is smaller than the number of bytes
requested; this may happen for example because fewer bytes are
actually available right now (maybe because we were close to end-of-
file, or because we are reading from a pipe, or from a terminal), or
because read() was interrupted by a signal. See also NOTES.
Therefore, if you ever want to read more than 1 byte, you need to do this in a loop, because read can always process less than the requested amount.
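A common way to package that loop is a small helper that keeps calling read() until the requested count has arrived or end-of-file is reached. This is only a sketch; read_full is a hypothetical name, and retrying on EINTR is one reasonable policy among several.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read up to count bytes, looping over short reads.
 * Returns the number of bytes actually read (less than count only at
 * end-of-file), or -1 on error. */
ssize_t read_full(int fd, void *buf, size_t count)
{
    char *p = buf;
    size_t done = 0;

    while (done < count) {
        ssize_t n = read(fd, p + done, count - done);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: try again */
            return -1;
        }
        if (n == 0)
            break;              /* end-of-file */
        done += (size_t)n;
    }
    return (ssize_t)done;
}
```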
Similarly, write can also process less than the requested size (see man 2 write):
RETURN VALUE
On success, the number of bytes written is returned (zero indicates nothing was written). It is not an error if this
number is smaller than the number of bytes requested; this may happen for example because the disk device was filled.
See also NOTES.
On error, -1 is returned, and errno is set appropriately.
The only difference here is that when write returns 0, it is neither an error nor an end-of-file indicator; you should just retry writing.
Your code is almost correct, in that it uses a loop to keep reading until there are no more bytes left to read (when read returns 0), but there are two problems:
You should check for errors after read (rlen < 0).
When you use write you should add a loop there too because, as I just said, write can also process fewer bytes than requested.
A correct version of your code would be:
#include <stdio.h>
#include <unistd.h>
#define BUF_SIZE 256
int main(int argc, char *argv[])
{
char buf[BUF_SIZE];
ssize_t rlen, wlen, written;
char from, to;
int i;
from = 'e';
to = 'a';
while (1) {
rlen = read(0, buf, sizeof(buf));
if (rlen < 0) {
perror("read failed");
return 1;
} else if (rlen == 0) {
return 0;
}
for (i = 0; i < rlen; i++) {
if (buf[i] == from)
buf[i] = to;
}
for (written = 0; written < rlen; written += wlen) {
wlen = write(1, buf + written, rlen - written);
if (wlen < 0) {
perror("write failed");
return 1;
}
}
}
return 0;
}
I am using low-level I/O functions to fetch the size of a file in bytes and write it to stdout. I am on Windows 7 64-bit, using Visual Studio 2017 in x64 debug mode. The functions _filelength and _filelengthi64 are specific to the Windows operating system; however, when I use them they both return 0 for any file I open. Here is the full code, but the issue should only lie with _sopen_s() or _filelengthi64():
Header
#pragma once
// Headers
#include <io.h>
#include <string.h>
#include <sys\stat.h>
#include <share.h>
#include <fcntl.h>
#include <errno.h>
// Constants
#define stdout 1
#define stderr 2
// Macros
#define werror_exit { werror(); return 1; }
#define werr_exit(s) { _write(stderr, (s), (unsigned int)strlen((s))); return 1; }
// Declarations
extern void werror();
extern void wnum(__int64 num);
Source
#include "readbinaryfile.h"
int main(int argc, char **argv)
{
int fhandle;
__int64 fsize;
// open binary file as read only. deny sharing write permissions. allow write permissions if new file
if (_sopen_s(&fhandle, argv[1], _O_RDONLY | _O_BINARY, _SH_DENYWR, _S_IWRITE) == -1)
werror_exit
else if (fhandle == -1)
werr_exit("\nERROR: file does not exist...\n")
if (fsize = _filelengthi64(fhandle) == -1)
{
if (_close(fhandle) == -1)
werror_exit
werror_exit
}
if (_close(fhandle) == -1)
werror_exit
// write the file size to stdout
wnum(fsize);
return 0;
}
// fetch the string representation of the errno global variable and write it to stderr
void werror()
{
char bufstr[95];
size_t buflen = 95; // MSDN suggested number for errno string length
strerror_s(bufstr, buflen, errno);
_write(stderr, bufstr, (unsigned int)buflen);
_set_errno(0);
}
// recursively write the ascii value of each digit in a number to stdout
void wnum(__int64 num)
{
if (num / 10 == 0)
{
_write(stdout, &(num += 48), 1);
return;
}
wnum(num / 10);
_write(stdout, &((num %= 10) += 48), 1);
}
I have tried passing many different filepaths to argv[1] yet they all still show an fsize of 0. In all of those cases, fhandle was assigned a value of 3 after using _sopen_s() which indicates no errors when opening the files. I have verified the operation of wnum() and werror(). I appreciate the help!
_filelengthi64(fhandle) doesn't return 0. The expression _filelengthi64(fhandle) == -1, however, evaluates to 0 (assuming a successful call), and that result is what gets assigned to fsize. You are overlooking C operator precedence: == has higher precedence than =. You will have to use parentheses to change the grouping:
if ((fsize = _filelengthi64(fhandle)) == -1)
{
...
If you want to reduce the amount of mental energy required to write (and especially read) code, it is generally a good idea to isolate normal code logic from error handling, e.g.:
// Normal code flow
fsize = _filelengthi64(fhandle);
// Error handling code
if (fsize == -1)
{
...
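The effect of the missing parentheses can be reproduced without any Windows API. In this sketch the parameter length merely stands in for whatever _filelengthi64() would return; both function names are invented for the illustration.

```c
/* Without parentheses: == binds tighter than =, so fsize receives the
 * result of the comparison (0 or 1), not the length itself. */
long long assign_without_parens(long long length)
{
    long long fsize;
    fsize = length == -1;       /* parsed as fsize = (length == -1) */
    return fsize;
}

/* With parentheses: fsize receives the length, and the comparison is
 * then made against the assigned value. */
long long assign_with_parens(long long length)
{
    long long fsize;
    int failed = ((fsize = length) == -1);
    (void)failed;               /* a real program would branch on this */
    return fsize;
}
```

For a length of 1234 the first function yields 0, which matches the symptom described in the question, while the second yields 1234.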
I need to read two 1MB+ binary files byte by byte, compare them - If they're not equal, print out the next 16 bytes starting at the unequal byte. The requirement is that it all runs in just 5msecs. Currently, my program is taking 19msecs if the unequal bit is at the end of the two files. Are there any suggestions as to how I can optimize it?
#include <stdio.h> //printf
#include <unistd.h> //file open
#include <fcntl.h> //file read
#include <stdlib.h> //exit()
#include <time.h> //clock
#define SIZE 4096
void compare_binary(int fd1, int fd2)
{
int cmpflag = 0;
int errorbytes = 1;
char c1[SIZE], c2[SIZE];
int numberofbytesread = 1;
while(read(fd1, &c1, SIZE) == SIZE && read(fd2, &c2, SIZE) == SIZE && errorbytes < 17){
for (int i=0 ; i < SIZE ; i++) {
if (c1[i] != c2[i] && cmpflag == 0){
printf("Bytes not matching at offset %d\n",numberofbytesread);
cmpflag = 1;
}
if (cmpflag == 1){
printf("Byte Output %d: 0x%02x 0x%02x\n", errorbytes, c1[i], c2[i]);
errorbytes++;
}
if (errorbytes > 16){
break;
}
numberofbytesread++;
}
}
}
int main(int argc, char *argv[])
{
int fd[2];
if (argc < 3){
printf("Check the number of arguments passed.\n");
printf("Usage: ./compare_binary <binaryfile1> <binaryfile2>\n");
exit(0);
}
if (!((access(argv[1], F_OK) == 0) && (access(argv[2], F_OK) == 0))){
printf("Please check if the files passed in the argument exist.\n");
exit(0);
}
fd[0] = open(argv[1], O_RDONLY);
fd[1] = open(argv[2], O_RDONLY);
if (fd[0]< 0 && fd[1] < 0){
printf("Can't open file.\n");
exit(0);
}
clock_t t;
t = clock();
compare_binary(fd[0], fd[1]);
t = clock() - t;
double time_taken = ((double)t)/(CLOCKS_PER_SEC/1000);
printf("compare_binary took %f milliseconds to execute \n", time_taken);
}
Basically, I need an optimized way to read binary files over 1 MB so that the comparison completes in under 5 ms.
First, try reading larger blocks. There's no point in performing so many read calls when you can read everything at once. Using 2 MB of memory is not a big deal nowadays. Disk I/O calls are inherently expensive, and their per-call overhead is significant too, but it can be reduced.
Second, try comparing integers (or even 64-bit longs) instead of bytes in each iteration; that significantly reduces the number of loop iterations. Once you find a mismatch, you can still switch to the byte-by-byte implementation. (Of course, some extra trickery is required if the file length is not a multiple of 4 or 8.)
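The second suggestion might look roughly like the sketch below, which assumes both files have already been read into memory buffers. The name first_mismatch is hypothetical, and memcpy is used to load the 64-bit words so that the code does not depend on buffer alignment.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: return the index of the first differing byte
 * in a and b, or n if the first n bytes are identical. */
size_t first_mismatch(const unsigned char *a, const unsigned char *b, size_t n)
{
    size_t i = 0;

    /* Compare 8 bytes per iteration while a full word remains. */
    while (i + sizeof(uint64_t) <= n) {
        uint64_t wa, wb;
        memcpy(&wa, a + i, sizeof wa);
        memcpy(&wb, b + i, sizeof wb);
        if (wa != wb)
            break;              /* the mismatch is inside this word */
        i += sizeof(uint64_t);
    }

    /* Finish byte by byte: locates the exact offset and handles the tail. */
    while (i < n && a[i] == b[i])
        i++;
    return i;
}
```

Whether this beats a plain byte loop depends on the compiler and optimization level, so it is worth measuring rather than assuming.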
The first thing that caught my eye is this:
if (cmpflag == 1){
printf("Byte Output %d: 0x%02x 0x%02x\n", errorbytes, c1[i], c2[i]);
errorbytes++;
}
if (errorbytes > 16){
break;
}
Your separate cmpflag check looks unnecessary; folding it into the mismatch test may give a small optimization:
if (c1[i] != c2[i] && cmpflag == 0){
printf("Bytes not matching at offset %d\n",numberofbytesread);
printf("Byte Output %d: 0x%02x 0x%02x\n", errorbytes, c1[i], c2[i]);
errorbytes++;
if (errorbytes > 16){
break;
}
}
You can also use a built-in array comparison function (such as memcmp), or increase your buffer size.
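A sketch of the memcmp idea, assuming the data is already in memory: compare a chunk at a time and only drop to a byte loop inside the chunk that differs. CHUNK and first_diff_memcmp are names chosen for this illustration.

```c
#include <stddef.h>
#include <string.h>

#define CHUNK 4096  /* arbitrary chunk size for this sketch */

/* Hypothetical helper: return the offset of the first differing byte,
 * or n if the two buffers are identical over n bytes. */
size_t first_diff_memcmp(const unsigned char *a, const unsigned char *b, size_t n)
{
    size_t off = 0;

    while (off < n) {
        size_t len = n - off < CHUNK ? n - off : CHUNK;
        if (memcmp(a + off, b + off, len) != 0) {
            /* memcmp reported a difference within this chunk,
             * so this scan is guaranteed to stop inside it. */
            while (a[off] == b[off])
                off++;
            return off;
        }
        off += len;
    }
    return n;
}
```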
I am learning C and I have been trying to read a file and print what I just read. I open the file and need to call another function to read and return the sentence that was just read.
My function will return 1 if everything went fine or 0 otherwise.
I have been trying to make it work for a while, but I really don't get why I can't manage to give line its value. In main, it always prints (null).
The structure of the project has to stay the same, and I absolutely have to use open and read. Not fopen, or anything else...
If someone can explain it to me that would be awesome.
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <string.h>
#define BUFF_SIZE 50
int read_buff_size(int const fd, char **line)
{
char buf[BUFF_SIZE];
int a;
a = read(fd, buf, BUFF_SIZE);
buf[a] = '\0';
*line = strdup(buf);
return (1);
}
int main(int ac, char **av)
{
char *line;
int fd;
if (ac != 2)
{
printf("error");
return (0);
}
else
{
if((fd = open(av[1], O_RDONLY)) == -1)
{
printf("error");
return (0);
}
else
{
if (read_buff_size(fd, &line))
printf("%s\n", line);
}
close(fd);
}
}
Here:
char buf[BUFF_SIZE];
int a;
a = read(fd, buf, BUFF_SIZE);
buf[a] = '\0';
if at least BUFF_SIZE characters are available to be read, then you will fill your array entirely, and buf[a] will be past the end of your array. You should either increase the size of buf by one character:
char buf[BUFF_SIZE + 1];
or, more logically given your macro name, read one fewer characters:
a = read(fd, buf, BUFF_SIZE - 1);
You should also check the returns from strdup() and read() for errors, as they can both fail.
read(fd, buf, BUFF_SIZE); // UB if read() fills all BUFF_SIZE bytes
You need one extra byte to store the terminating 0, so either read BUFF_SIZE - 1 bytes or allocate BUFF_SIZE + 1 bytes for the array. You should also check all return values, and return 0 if anything fails.
Keep it simple and take a look at:
https://github.com/mantovani/apue/blob/c47b4b1539d098c153edde8ff6400b8272acb709/mycat/mycat.c
(Archive form straight from the source: http://www.kohala.com/start/apue.tar.Z)
#define BUFFSIZE 8192
int main(void){
int n;
char buf[BUFFSIZE];
while ( (n = read(STDIN_FILENO, buf, BUFFSIZE)) > 0)
if (write(STDOUT_FILENO, buf, n) != n)
err_sys("write error");
if (n < 0)
err_sys("read error");
exit(0);
}
No need to use the heap (strdup). Just write your buffer to STDOUT_FILENO (=1) for as long as read returns a value that's greater than 0. If you end with read returning 0, the whole file has been read.