On success, the number of bytes read is returned (zero indicates end of file), and the file position is advanced by this number.
This description is copied from man 2 read. My question is about the statement "the file position is advanced by this number".
What I want to do in my code is the following:
for (int i = 1; i < argc; i++) {
    if (read(pipe_fd[i-1][0], &contribution, sizeof(int)) == -1) {
        perror("reading from pipe from a child");
        exit(1);
    }
    if (i == argc){
        i = 1;
    }
}
I am trying to repeatedly read data from the pipes that connect each child process to my parent process. My question is: will read remember where it should continue reading the next time I call it?
For example, suppose I call read(pipe_fd[1][0], &contribution, sizeof(int)). As I understand it, read will read sizeof(int) bytes from the pipe and then somehow (with an fseek-style call or something like that) move the starting position forward by sizeof(int) bytes. But when I loop around and call read on pipe_fd[1][0] again, will read remember the position where the previous call left off? Or will it assume nothing happened and read from the initial starting position again?
Pipes don't have a file position, but read(2) will give you all the data written to the other end of the pipe, in the same order (first in, first out). When you call read() the next time on the same fd, you will get the data that follows what you got from the previous call on that fd. This is typically what you want, and you don't need to do anything special in between.
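To make the first-in, first-out behaviour concrete, here is a minimal sketch. The helper name read_three_ints and the values 10, 20, 30 are made up for illustration, and it uses a single process instead of a parent and children so that it stays short. Each read() on the same descriptor simply picks up where the previous one stopped:

```c
#include <unistd.h>

/* Hypothetical helper: puts three ints into a fresh pipe, then reads them
 * back with one read() call per int. Returns 0 on success, -1 on error. */
int read_three_ints(int out[3])
{
    int fds[2];
    const int values[3] = {10, 20, 30};

    if (pipe(fds) == -1)
        return -1;

    /* A few bytes fit easily in the pipe buffer, so this does not block. */
    if (write(fds[1], values, sizeof values) != (ssize_t)sizeof values) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    close(fds[1]);

    for (int i = 0; i < 3; i++) {
        /* Each call consumes the next sizeof(int) bytes; no seeking needed. */
        if (read(fds[0], &out[i], sizeof(int)) != (ssize_t)sizeof(int)) {
            close(fds[0]);
            return -1;
        }
    }
    close(fds[0]);
    return 0;
}
```

After a successful call, out holds the three values in the order they were written.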
I am trying to read input from stdin with fread(). However, I have a problem: the loop does not terminate and instead keeps reading.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(int argc, char *argv[])
{
if (argc != 2) {
fprintf(stderr, "argument err");
return -1;
}
FILE *in = fopen(argv[1], "w");
if (in == NULL) {
fprintf(stderr, "failed to open file");
return -1;
}
char buffer[20];
size_t ret;
while ((ret = fread(buffer, 1, 20, stdin)) > 0) {
if (fwrite(buffer, 1, ret, in) != ret) {
if (ferror(in) != 0) {
perror("write err:");
}
}
}
return 0;
}
How can i make this loop terminate when EOF is reached? i have tried using ctrl+D but that just seems like a strange way to stop taking input.
I guess what i want is to use fread() to read multiple arbitrary amounts of data in chunks of 20 bytes and then somehow stop.
How can i make this loop terminate when EOF is reached?
When do you think EOF is reached? Really. When you are providing input interactively, how is the system or the program supposed to know that you've entered all the data you want the program to consume?
i have tried using ctrl+D but that just seems like a strange way to stop taking input.
It is exactly the way to signal a soft EOF to a POSIX terminal. Since you want the loop to stop when EOF is encountered, it seems absolutely natural to me to use ctrl+D for the purpose when providing data interactively. That's not the only way you could signal the end of the input, but it has a lot going for it.
I guess what i want is to use fread() to read multiple arbitrary amounts of data in chunks of 20 bytes and then somehow stop.
Again: how is the program supposed to know when it has consumed all the "multiple arbitrary amounts" of data that you decide to provide on a given run? An EOF signal is an eminently reasonable choice for multiple reasons, and the way to deliver that from a POSIX terminal interface is ctrl+D.
As pointed out before, you are reading from an open-ended stream: when stdin is attached to a terminal, it has no natural end, so fread() keeps waiting for more input until the terminal signals end of file.
If you want your loop to terminate, you will have to add a termination condition, such as a particular character, word, or other sentinel value, and then use break (or, in some cases, return) when you see it. You could also check whether your terminal emulator supports signalling EOF on stdin, which is very common (but platform dependent).
ADD: On my system, a typical Linux, Ctrl+D signals EOF on stdin. It seems that you found this out yourself, and if you want your program to know where to stop, you will need to use this.
You can also send a signal to your program, usually with a shortcut such as Ctrl+C (SIGINT) or Ctrl+Z (SIGTSTP); Ctrl+D, by contrast, is not a signal but the terminal's end-of-file character. There are all sorts of signals, which can be sent by your system and/or your terminal emulator, and you just have to install the corresponding signal handler in your program.
How can i make this loop terminate when EOF is reached? i have tried using ctrl+D but that just seems like a strange way to stop taking input.
fread() and fwrite() are designed to read and write data records, so they (both) take the size of a record and the number of records to transfer, and they return the number of complete records transferred. If the remaining data doesn't fill a full record, that partial record is not counted in the return value (and its contents in the buffer are indeterminate).
All the calls in the stdio.h package are buffered: the stream's buffer holds data that has been read from the system but not yet consumed by the user. This makes me wonder why you are trying to use a buffer to read data that is already buffered.
End of file is detected when fread() tries to read and the system reports a true end of file (this normally takes two calls: the first returns the remaining data, the second gets no data, zero bytes, from the system). Note that fread() never returns EOF: its return type is size_t, which is unsigned, so a short count or 0 is all you get. You have to distinguish the two cases yourself after the call:
feof() is true if the stream reached the end of file.
ferror() is true if a read error occurred.
As I've said above, fread() and fwrite() count full records (this is useful when your data is a struct with a fixed length, but normally not when you can have extra data at the end). With a record size of 1, as in your call, every byte is a full record, so nothing is lost at the end.
The way to terminate the loop should be something like this:
while ((ret = fread(buffer, 1, 20, stdin)) > 0) {
    if (fwrite(buffer, 1, ret, in) != ret) {
        if (ferror(in) != 0) {
            perror("write err:");
        }
    }
}
/* here fread() has returned 0: check feof(stdin) and ferror(stdin)
 * to find out whether that was the end of the input or a read
 * error. */
So if the last chunk is shorter than 20 bytes, fread() returns that short count, and only the next fread() detects the end of file (by reading nothing), returns 0, and ends the loop. When the input is a terminal, that end of file is exactly what Ctrl+D delivers.
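For reference, here is a small sketch of a chunked copy loop that separates a clean end of input from a read error. The function name copy_stream and the long return type are my own choices for illustration, not anything from the question:

```c
#include <stdio.h>

/* Hypothetical helper: copies src to dst in chunks of up to 20 bytes.
 * Returns the number of bytes copied, or -1 on a read or write error. */
long copy_stream(FILE *src, FILE *dst)
{
    char buffer[20];
    size_t n;
    long total = 0;

    /* With an item size of 1, fread() returns the number of bytes read,
     * which may be fewer than 20 for the last chunk and is 0 at EOF. */
    while ((n = fread(buffer, 1, sizeof buffer, src)) > 0) {
        if (fwrite(buffer, 1, n, dst) != n)
            return -1;              /* short write: treat as an error */
        total += (long)n;
    }

    /* fread() returned 0: decide whether that was EOF or a failure. */
    if (ferror(src))
        return -1;
    return total;
}
```

Called as copy_stream(stdin, in) from the question's main(), this should return once the terminal delivers end of file (Ctrl+D on a POSIX terminal) or the input is redirected from a file that runs out.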
The cheapest and easiest way to solve this problem (copying a file from one descriptor to another) is described in K&R (in the first edition), and nobody has yet come up with better code to replace it:
int c;
while ((c = fgetc(in)) != EOF)
fputc(c, out);
While it seems to read the characters one by one, fgetc() actually makes a call to read(2) to fill a whole buffer of data and returns just one character; the following characters are taken from the buffer, saving calls to read(). The same happens with fputc(): it fills the buffer until it's full, then flushes it in a single call to write().
Many people have tried to beat the code above without any measurable gain in efficiency. So my advice is to keep it simple; the world is complicated enough without forcing yourself into complexity.
I just want to fork two child processes that will print their names one after the other. To do that, they first need to write a string to a pipe to check something. Here is the code:
char myname[] = "ALOAA";
int main ()
{
int fds[2];
pid_t pid;
pipe(fds);
pid = fork();
if(pid == 0)
{
strcpy(myname, "first");
}
else
{
pid = fork();
if(pid == 0)
{
strcpy(myname, "second");
}
}
if(strcmp(myname, "ALOAA") != 0)
{
char readbuffer[1025];
int i;
for (i = 0; i < 2 ; i++)
{
//printf("%s\n", myname);
close(fds[0]);
write(fds[1], myname, strlen(myname));
while(1)
{
close(fds[1]);
int n = read(fds[0], readbuffer, 1024);
readbuffer[n] = 0;
printf("%s-alihan\n", readbuffer);
if(readbuffer != myname)
break;
sleep(1);
}
//printf("%s\n", myname);
}
}
return 0;
}
So the first process writes its name to the pipe and then checks whether there is any new string in the pipe. The second does the same. However, I got an empty string from the read() function, so it prints this:
-alihan
-alihan
I couldn't get the problem.
However I got empty string from read() function [...] I couldn't get the problem.
@MikeCAT nailed this issue with his observation in comments that each child closes fds[0] before it ever attempts to read from it. No other file is assigned the same FD in between, so the read fails. You do not test for the failure.
Not testing for the read failure is a significant problem, because your program does not merely fail to recognize it -- it exhibits undefined behavior as a result. This arises for (at least) two reasons:
1. read() will have indicated failure by returning -1, and your program will respond by attempting an out-of-bounds write (to readbuffer[-1]).
2. If we ignore the UB resulting from (1), we still have the program thereafter reading from completely uninitialized array readbuffer (because neither the read() call nor the assignment will have set the value of any element of that array).
Overall, you need to learn the discipline of checking the return values of your library function calls for error conditions, at least everywhere that it matters whether an error occurred (which is for most calls). For example, your usage of pipe(), fork(), and write() exhibits this problem, too. Under some circumstances you want to check the return value of printf()-family functions, and you usually want to check the return value of input functions -- not just read(), but scanf(), fgets(), etc..
Tertiarily, your usage of read() and write() is incorrect. You make the common mistake of assuming that (on success) write() will reliably write all the bytes specified, and that read() will read all bytes that have been written, up to the specified buffer size. Although that ordinarily works in practice for exchanging short messages over a pipe, it is not guaranteed. In general, write() may perform only a partial write and read() may perform only a partial read, for unspecified, unpredictable reasons.
To write successfully one generally must be prepared to repeat write() calls in a loop, using the return value to determine where (or whether) to start the next write. To read complete messages successfully one generally must be prepared similarly to repeat read() calls in a loop until the requisite number of bytes have been read into the buffer, or until some other termination condition is satisfied, such as the end of the file being reached. I presume it will not be lost on you that many forms of this require advance knowledge of the number of bytes to read.
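As a rough sketch of what such loops can look like (the names write_all and read_full are illustrative, and retrying on EINTR is one reasonable policy among several):

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: keeps calling write() until all len bytes are out.
 * Returns 0 on success, -1 on error (errno is set by write()). */
int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted before writing: retry */
            return -1;
        }
        p += n;                     /* skip what was actually written */
        len -= (size_t)n;
    }
    return 0;
}

/* Hypothetical helper: keeps calling read() until len bytes have arrived
 * or end of file is reached. Returns the number of bytes read (less than
 * len only at end of file), or -1 on error. */
ssize_t read_full(int fd, void *buf, size_t len)
{
    char *p = buf;
    size_t got = 0;

    while (got < len) {
        ssize_t n = read(fd, p + got, len - got);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0)
            break;                  /* end of file: no more data will come */
        got += (size_t)n;
    }
    return (ssize_t)got;
}
```

Note that read_full() needs to know len in advance, which is the "advance knowledge of the number of bytes" mentioned above; it also returns a byte count rather than terminating the buffer, so the caller still has to add the 0 byte before treating the data as a string.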
I started today working with pipe() and fork() and exec() in C, and I now have a problem:
The main program creates two pipes and forks. The child process does an exec() of another program, which for now is only a test program that reads from stdin, communicates with its parent, and writes the results to stdout. The main program is supposed to receive data, communicate with a SQLite 3 database, and return data using pipes. This part of the problem is solved: the pipes are opened and closed properly and there's communication.
The problem is that in the child process (the one started with exec()), at a certain point, I have this:
printf("Strid sent: %s\n", strid);
write(4,strid,sizeof(strid));
printf("Str sent: %s\n", str);
write(4,str,sizeof(str));
And the parent should be reading properly with this part:
read(select[0],strid, sizeof(strid));
printf("Strid recived: %s\n",strid);
int id = atoi(strid);
printf("Id recived: %d\n",id);
read(select[0],buffer, sizeof(buffer));
printf("Buffer recived: %s\n",buffer);
But what I receive with those printf calls is:
Strid sent: 1
Str sent: start
Strid recived: 1
Id recived: 1
Buffer recived: 7� (and other strange characters)
As you can see, the problem is in receiving the second command (and that part is copied as is; there's no other code in it).
I should also say that the "buffer" variable, which receives str, is declared as char buffer[20] and has not been used before the read().
Thank you in advance!
After read, you need to add terminating 0 byte at the end, before printing, something like
int len = read(select[0], buffer, sizeof(buffer) - 1);
if (len < 0) {
    perror("read error");
} else {
    buffer[len] = 0;
    printf("Buffer recived: %s\n", buffer);
}
So, the important thing: read through the man page of read, or actually the man page of any function, before you use it.
Alternatively, use a stdio.h function that adds the string-terminating 0 itself, probably fgets if reading line by line is OK.
Pipes don't keep track of 'boundaries' between writes. So if you have multiple writes on the pipe, all the data from those writes might come back in response to a single read. So if you send a string (for example) followed by an integer, and you attempt to read the string into a buffer, it will get both the string and the following integer into the buffer, looking like garbage on the end of the string, and the size returned by read will be larger.
In addition, if the pipe is getting full, a write might not write all the data you asked it to: it might write only as much as can fit for now, and you'll have to write the rest later.
So ALWAYS check the return values of your read and write calls and be prepared to deal with getting less (or, for read, more) than you expect.
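One common way to put the boundaries back is to send a length before each message. The sketch below is only an illustration under my own assumptions: the names send_msg and recv_msg are made up, the length is a raw uint32_t in host byte order (fine between a parent and its child on the same machine), and errors are collapsed to -1:

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

/* Write exactly len bytes; 0 on success, -1 on error. */
static int write_exact(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR) continue;
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Read exactly len bytes; 0 on success, -1 on error or early end of file. */
static int read_exact(int fd, void *buf, size_t len)
{
    char *p = buf;
    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n < 0) {
            if (errno == EINTR) continue;
            return -1;
        }
        if (n == 0) return -1;      /* writer closed before a full message */
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Hypothetical: send one string as <uint32_t length><bytes>. */
int send_msg(int fd, const char *text)
{
    uint32_t len = (uint32_t)strlen(text);
    if (write_exact(fd, &len, sizeof len) < 0)
        return -1;
    return write_exact(fd, text, len);
}

/* Hypothetical: receive one string into out (capacity cap, including the
 * terminating 0). Returns the string length, or -1 on error. */
int recv_msg(int fd, char *out, size_t cap)
{
    uint32_t len;
    if (read_exact(fd, &len, sizeof len) < 0)
        return -1;
    if (len >= cap)
        return -1;                  /* message would not fit in out */
    if (read_exact(fd, out, len) < 0)
        return -1;
    out[len] = '\0';                /* read() never adds the terminator */
    return (int)len;
}
```

With this, the child would call send_msg(4, strid) and send_msg(4, str), and the parent would get the two strings back separately from two recv_msg() calls, no matter how the kernel happens to batch the bytes.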
I'm having a problem getting the correct file position at which I'm writing when simultaneously writing to different parts of the same file using multiple threads.
I have one global file descriptor to the file. In my writing function, I first lock a mutex, then do lseek(global_fd, 0, SEEK_CUR) to get the current file position. I next write 31 zero bytes (31 is my entry size) using write(), in effect to reserve space for later. I then unlock the mutex.
Later in the function, I declare a local fd variable to the same file, and open it. I now do an lseek on that local fd to get to the position I learned from earlier, where my space is reserved. Finally, I write() 31 data bytes there for the entry, and close the local fd.
The issue seems to be that rarely, an entry doesn't get written to the expected location (it's not mangled data - it seems that either it is swapped with a different entry, or two entries were written to the same location). There are multiple threads running that "writing function" I described.
I since learned that pwrite() can be used to write to a specific offset, which would be more efficient, and eliminate the lseek(). However, I first want to find out: what is wrong with my original algorithm? Is there any type of buffering that could be causing the discrepancy between the expected write location, and where the data actually ends up getting stored in the file?
The relevant code snippet is below. The reason this is an issue is that in a second data file, I record the location where the entry I'm writing will be stored. If that location, based on the lseek() before the write, is not accurate, my data doesn't match up properly -- which is what happens on occasion (it's hard to reproduce - it happens in maybe 1 in 100k writes). Thanks!
db_entry_add(...)
{
char dbrecord[DB_ENTRY_SIZE];
int retval;
pthread_mutex_lock(&db_mutex);
/* determine the EOF index, at which we will add the log entry */
off_t ndb_offset = lseek(cfg.curr_fd, 0, SEEK_CUR);
if (ndb_offset == -1)
{
fprintf(stderr, "Unable to determine ndb offset: %s\n", strerror_s(errno, ebuf, sizeof(ebuf)));
pthread_mutex_unlock(&db_mutex);
return 0;
}
/* reserve entry-size bytes at the location, at which we will
later add the log entry */
memset(dbrecord, 0, sizeof(dbrecord));
/* note: db_write() is a write() loop */
if (db_write(cfg.curr_fd, (char *) &dbrecord, DB_ENTRY_SIZE) < 0)
{
fprintf(stderr, "db_entry_add2db - db_write failed!");
close(curr_fd);
pthread_mutex_unlock(&db_mutex);
return 0;
}
pthread_mutex_unlock(&db_mutex);
/* in another data file, we now record that the entry we're going to write
will be at the specified location. if it's not (which is the problem,
on rare occasion), our data will be inconsistent */
advertise_entry_location(ndb_offset);
...
/* open the data file */
int write_fd = open(path, O_CREAT|O_LARGEFILE|O_WRONLY, 0644);
if (write_fd < 0)
{
fprintf(stderr, "%s: Unable to open file %s: %s\n", __func__, cfg.curr_silo_db_path, strerror_s(errno, ebuf, sizeof(ebuf)));
return 0;
}
pthread_mutex_lock(&db_mutex);
/* seek to our reserved write location */
if (lseek(write_fd, ndb_offset, SEEK_SET) == -1)
{
fprintf(stderr, "%s: lseek failed: %s\n", __func__, strerror_s(errno, ebuf, sizeof(ebuf)));
close(write_fd);
return 0;
}
pthread_mutex_unlock(&db_mutex);
/* write the entry */
/* note: db_write_with_mutex is a write() loop wrapped with db_mutex lock and unlock */
if (db_write_with_mutex(write_fd, (char *) &dbrecord, DB_ENTRY_SIZE) < 0)
{
fprintf(stderr, "db_entry_add2db - db_write failed!");
close(write_fd);
return 0;
}
/* close the data file */
close(write_fd);
return 1;
}
One more note, for completeness. I have a similar but simpler routine that could also be causing the problem. This one uses buffered output (FILE*, fopen, fwrite), but performs an fflush() at the end of each write. It writes to a different file than the earlier routine, but could cause the same symptom.
pthread_mutex_lock(&data_mutex);
/* determine the offset at which the data will be written. this has to be accurate,
otherwise it could be causing the problem */
offset = ftell(current_fp);
fwrite(data);
fflush(current_fp);
pthread_mutex_unlock(&data_mutex);
There seem to be several places where things could go wrong. I would make the following changes: (1) be consistent and use the same I/O library as per bdonlan's suggestion, (2) make the lseek() and the writes an atomic action guarded by a mutex so that only a single thread at a time can do those actions of adding to both files. SEEK_CUR does a seek based on the current location of the file offset pointer so would you not want SEEK_END to seek to the end of the file in order to append there? Then if you are modifying a particular section of the file you would use SEEK_SET to reposition to the location you want to write to. And you would want to do this in a mutex guarded section so as to allow only a single thread to do the file positioning and file update.
If you're using your 'simpler routine' at the same time, this could indeed be a problem. If these are separate file descriptors, there's nothing to ensure that they're both pointing at the end of the file at all times (unless you use append mode, however I'm not sure what the semantics around ftell for append mode are). If they're the same fd (ie, you have a raw fd and a FILE * pointing to the same place), you might have problems with the standard library getting confused about where you are in a file, when you use write() to bypass it.
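Since pwrite() came up in the question, here is a hedged sketch of how the reserve-then-fill idea could look without any lseek(). It is not the asker's code: reserve_slot, write_entry, and the in-memory next_offset counter are illustrative, and the counter is assumed to start at the file's current size (0 for a new file). Only handing out the slot needs the mutex, because pwrite() takes the offset as an argument and never touches the shared file position:

```c
#define _XOPEN_SOURCE 700           /* for pwrite() under strict -std modes */
#include <pthread.h>
#include <sys/types.h>
#include <unistd.h>

#define ENTRY_SIZE 31

static pthread_mutex_t offset_mutex = PTHREAD_MUTEX_INITIALIZER;
static off_t next_offset = 0;       /* assumption: the file starts out empty */

/* Hypothetical: hand out a distinct ENTRY_SIZE-byte slot to each caller. */
off_t reserve_slot(void)
{
    pthread_mutex_lock(&offset_mutex);
    off_t slot = next_offset;
    next_offset += ENTRY_SIZE;
    pthread_mutex_unlock(&offset_mutex);
    return slot;
}

/* Hypothetical: write one entry at its reserved offset. pwrite() may write
 * fewer bytes than requested, so loop. Returns 0 on success, -1 on error. */
int write_entry(int fd, const char entry[ENTRY_SIZE], off_t slot)
{
    size_t done = 0;

    while (done < ENTRY_SIZE) {
        ssize_t n = pwrite(fd, entry + done, ENTRY_SIZE - done,
                           slot + (off_t)done);
        if (n < 0)
            return -1;
        done += (size_t)n;
    }
    return 0;
}
```

A thread would call reserve_slot(), advertise the returned offset in the second data file, and call write_entry() whenever the entry is ready; entries can then be filled in any order without a second open() or a seek.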
I have a simple program that only uses one process (each time it's executed), creates a semaphore with a key that is the file's name (ftok() function), and then writes a line to a file. The thing is, the semaphores (in this case, 2) have to do two things: one has to guarantee that no more than two programs write at the same time, and the other has to verify that only 10 lines maximum have been written to the file. So if I execute the program and the file already has 10 lines of text, it won't write anything to it.
This is my code:
#include "semaphores.h"
int main() {
int semaphoreLines = create_semaphore(ftok("Ex5.c", 0), 10);
int semaphoreWrite = create_semaphore(ftok("Ex5.c", 1), 1);
FILE *file;
int ret_val = down(semaphoreLines, 1);
if(ret_val != 0) {
printf("No more lines can be written to the file!\n");
exit(-1);
}
down(semaphoreWrite, 1);
file = fopen("Ex5.txt", "a");
fprintf(file, "This is process %d\n", getpid());
fclose(file);
up(semaphoreWrite, 1);
return 0;
}
When I execute it the first time, semaphoreLines goes to 9 (as intended), locks the semaphoreWrite to 0 (so no other process can write to the file), then writes and frees up the latter back to 1. The process terminates. I manually tell it to run again in Terminal. However, semaphoreLines should be 9 so when I down() it, it goes to 8 and so forth. The issue is, it gets back up at 10 again. I don't want this.
Maybe it's because I'm fairly new to semaphore programming, but I thought semaphores were public if they don't get created with 0 key. With the ftok(), I wanted it to be public so that if I run the program again it decrements it if possible and writes, if not it displays the error code and terminates. I mean, the semaphore doesn't get removed, so the second time the program gets executed it should see the semaphore value is 9, right...?
I don't really want to fork 10 processes and have them write one by one to the file in the same program...or is that the only way to do it?
P.S. The create_semaphore() function is part of my semaphores.h header file, which contains 4 simple functions I wrote so it's easier to use semaphores instead of running all that semget, semop, and semctl stuff every time I want to work with them.
The issue is, it gets back up at 10 again. I don't want this.
If you don't want this, then don't do it. You yourself are setting the semaphore value to 10 in create_semaphore(). Instead, pass IPC_EXCL in addition to IPC_CREAT to semget(), and if that yields errno EEXIST, just return from create_semaphore() and skip the semctl(SETVAL).
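Since the body of create_semaphore() isn't shown in the question, the following is only a guess at its shape with that change applied. It assumes Linux, where the program has to define union semun itself (see semctl(2)), a set containing a single semaphore, and 0600 permissions:

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* On Linux the calling program must define this union itself. */
union semun {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/* Sketch: create the semaphore and set its initial value only if it does
 * not exist yet; otherwise attach to the existing one and leave its value
 * alone. Returns the semaphore id, or -1 on error. */
int create_semaphore(key_t key, int initial_value)
{
    int id = semget(key, 1, IPC_CREAT | IPC_EXCL | 0600);

    if (id >= 0) {
        /* We created it, so we are the one who initialises it. */
        union semun arg;
        arg.val = initial_value;
        if (semctl(id, 0, SETVAL, arg) == -1)
            return -1;
        return id;
    }
    if (errno == EEXIST)
        return semget(key, 1, 0);   /* already there: skip SETVAL */
    return -1;
}
```

One caveat: there is a short window between semget() creating the semaphore and SETVAL initialising it, during which a second process could attach and see the value 0. For programs started by hand one after another that probably doesn't matter, but it is worth knowing about.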