Read from "edge" of file written by another process (which used fallocate) - c

I have two processes: a "writer" and a "reader".
The writer uses fallocate(2) to first create a fixed-length file (filled with zeros) and then writes data to it.
I want the reader process to read the file up to the "edge" (the file offset where the writer last wrote data) and then wait for the writer to add more data.
Here are simple example programs for the writer and reader. Note that the reader reads just a little faster, so at some point it will hit the edge of the file. When it does, it should seek back one data element and try again until it reads a non-zero element.
It doesn't work: it seems the reader can seek and re-read successfully only once. After that it always reads zeros from the current file position (see the log below).
But... if I use another terminal window to copy the file somewhere else, all of the expected data is present. So it seems the data is being written but it is not read back reliably.
Any suggestions?
EDIT / SOLUTION
John Bollinger's suggestion solved the problem:
Adding this line after fopen and before starting to read:
setvbuf(fp, NULL, _IONBF, 0);
WRITER PROCESS
/*
   g++ fWriter.cpp -o fWriter -g -pg
*/
#include <stdio.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>

int
main(int argc, char **argv)
{
    printf("WRITER LAUNCHED\n");
    const int N = 100;

    FILE *fp = fopen("/tmp/someFile", "w");
    if (!fp)
    {
        perror("fopen");
        exit(1);
    }

    // Preallocate the full, zero-filled file up front.
    int itemLen = sizeof(uint32_t);
    off_t len = itemLen * N;
    int fd = fileno(fp);
    if (fallocate(fd, 0, 0, len) == -1)
    {
        fprintf(stderr, "fallocate failed: %s\n", strerror(errno));
        exit(1);
    }

    // Write one non-zero item per second and flush it immediately.
    for (uint32_t i = 1; i <= N; i++)
    {
        size_t numBytes = fwrite((void*)&i, 1, itemLen, fp);
        if (numBytes == (size_t)itemLen)
            printf("[%u] Wrote %u\n", i, i);
        else
            printf("Write error\n");
        fflush(fp);
        sleep(1);
    }

    fclose(fp);
    printf("WRITER COMPLETE\n");
}
READER PROCESS
/*
   g++ fReader.cpp -o fReader -g -pg
*/
#include <stdio.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>

int
main(int argc, char **argv)
{
    printf("READER LAUNCHED\n");

    FILE *fp = fopen("/tmp/someFile", "r");
    if (!fp)
    {
        perror("fopen");
        exit(1);
    }

    // New! Make the stream unbuffered so re-reads see the writer's updates.
    setvbuf(fp, NULL, _IONBF, 0);

    int i = 1;
    int itemLen = sizeof(uint32_t);
    while (!feof(fp))
    {
        uint32_t val;
        size_t numBytes = fread((void*)&val, 1, itemLen, fp);
        if (numBytes != (size_t)itemLen)
        {
            printf("Short read\n");
            continue;
        }
        printf("[%d] Read %u\n", i, val);

        // Hit the zero-filled "edge": step back one item and wait for data.
        if (val == 0)
        {
            off_t curPos = ftello(fp);
            off_t newPos = curPos - itemLen;
            printf("Seek and try again. newPos %ld\n", (long)newPos);
            fseeko(fp, newPos, SEEK_SET);
        }
        i++;
        //usleep(1100000);
        usleep(900000);
    }
    fclose(fp);
}
READER LOG
READER LAUNCHED
[1] Read 1
[2] Read 2
[3] Read 3
[4] Read 4
[5] Read 5
[6] Read 6
[7] Read 0
Seek and try again. newPos 24
[8] Read 7
[9] Read 8
[10] Read 9
[11] Read 10
[12] Read 11
[13] Read 0
Seek and try again. newPos 44
[14] Read 0
Seek and try again. newPos 44
[15] Read 0
Seek and try again. newPos 44
[16] Read 0
Seek and try again. newPos 44
[17] Read 0
Seek and try again. newPos 44
[18] Read 0
Seek and try again. newPos 44
[19] Read 0
Seek and try again. newPos 44
[20] Read 0
Seek and try again. newPos 44
[21] Read 0
Seek and try again. newPos 44
[22] Read 0
Seek and try again. newPos 44

Streams (anything represented by a stdio.h FILE object) can be fully buffered, line buffered, or unbuffered. Streams connected to a regular file are fully buffered by default. This is surely why your reader, having once read a bunch of zeroes at the current file position, doesn't notice that the writer has overwritten some of those. The reader will not consult the underlying file again for data from the region of the file that is already buffered.
Under the circumstances, the reader can and should use setvbuf() to change the buffering mode of the file to unbuffered.
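For comparison, here is a minimal sketch (not part of the original programs) of the same tail-following reader written with the raw read(2)/lseek(2) system calls, which bypass stdio buffering entirely. The path /tmp/someFile and the element size are taken from the example above; the pacing and error handling are illustrative only.
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>

int main(void)
{
    int fd = open("/tmp/someFile", O_RDONLY);
    if (fd == -1) { perror("open"); exit(1); }

    const size_t itemLen = sizeof(uint32_t);
    uint32_t val;
    int i = 1;

    for (;;)
    {
        ssize_t n = read(fd, &val, itemLen);
        if (n == -1) { perror("read"); break; }
        if (n == 0)
            break;                              /* true end of file */
        if (n != (ssize_t)itemLen || val == 0)
        {
            /* Hit the zero-filled edge (or a partial item): back up and retry. */
            lseek(fd, -(off_t)n, SEEK_CUR);
            usleep(900000);
            continue;
        }
        printf("[%d] Read %u\n", i++, val);
        usleep(900000);
    }
    close(fd);
    return 0;
}
Because read(2) always goes to the kernel, every retry sees whatever the writer has flushed so far, with no stale user-space buffer in between.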

Related

I made a program that copies data from one file to another using read()/write(), but I think it's taking too long

I need to copy a 1 GB file and I am testing this code with different buffer sizes (1 byte, 512 bytes, and 1024 bytes). With a 512-byte buffer it took about 22 seconds, but with a 1-byte buffer the copy doesn't finish even after 44 minutes. Is that time expected, or is something wrong with my code?
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

int main(int argc, char* argv[])
{
    char sourceName[20], destName[20];
    int f1, f2, fRead;
    int bufferSize = 0;
    char* buffer;

    /*printf("enter buffer size (in bytes): ");
    scanf("%d", &bufferSize);*/
    bufferSize = atoi(argv[3]);
    buffer = (char*)calloc(bufferSize, sizeof(char));

    /*printf("enter source name: ");
    scanf("%s", sourceName);*/
    strcpy(sourceName, argv[1]);
    f1 = open(sourceName, O_RDONLY);
    if (f1 == -1)
        printf("something's wrong with opening the source file!\n");
    else
        printf("file opened!\n");

    /*printf("enter destination name: ");
    scanf("%s", destName);*/
    strcpy(destName, argv[2]);
    f2 = open(destName, O_CREAT | O_WRONLY | O_TRUNC | O_APPEND, 0644);
    if (f2 == -1)
        printf("something's wrong with opening the destination file!\n");
    else
        printf("file2 opened!\n");

    fRead = read(f1, buffer, bufferSize);
    while (fRead > 0)
    {
        write(f2, buffer, fRead);   /* write only the bytes actually read */
        fRead = read(f1, buffer, bufferSize);
    }

    close(f1);
    close(f2);
    free(buffer);
    return 0;
}
Yes, this is expected, because system calls are expensive operations, so the time is roughly proportional to the number of times you call read() and write(). If it takes 22 seconds to copy with 512-byte buffers, you should expect it to take about 22 * 512 seconds with 1-byte buffers. That's 187 minutes, or over 3 hours.
This is why stdio implements buffered output by default.
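As an illustration of that buffering (not from the original question), here is a minimal sketch of a byte-at-a-time copy that goes through stdio instead of raw read()/write(); the command-line file names are placeholders. Even though the program hands stdio one byte at a time, the FILE buffer batches those bytes into much larger system calls, so it avoids the one-syscall-per-byte penalty described above.
#include <stdio.h>

/* Copy argv[1] to argv[2] one byte at a time through stdio.
 * Each fgetc/fputc call normally just moves a byte in or out of the
 * FILE buffer; the expensive read()/write() system calls happen only
 * once per buffer (typically several KB), so this stays fast even
 * though the program-level "buffer size" is 1 byte. */
int main(int argc, char* argv[])
{
    if (argc != 3)
    {
        fprintf(stderr, "usage: %s <source> <destination>\n", argv[0]);
        return 1;
    }
    FILE* in  = fopen(argv[1], "rb");
    FILE* out = fopen(argv[2], "wb");
    if (!in || !out)
    {
        perror("fopen");
        return 1;
    }

    int c;
    while ((c = fgetc(in)) != EOF)
        fputc(c, out);

    fclose(in);
    fclose(out);
    return 0;
}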

how to properly use posix_memalign

I'm struggling to find out how to properly use pread and pwrite.
In this case, I am trying to read only 256 bytes using pread.
However, whenever I try to read less than 512 bytes, pread does not return anything.
I believe this problem has to do with the SECTOR alignment argument that I am passing to posix_memalign...
Is there some obvious info that I have to be aware of?
#define _GNU_SOURCE            /* for O_DIRECT */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>

#define BUF_SIZE 256
#define SECTOR 512
#define FILE_SIZE 1024 * 1024 * 1024 //1G

int main( int argc, char **argv ){
    int fd, nr;
    char fl_nm[]={"/dev/nvme0n1p1"};
    char* aligned_buf_w = NULL;
    char* aligned_buf_r = NULL;
    void* ad = NULL;

    /* sector-aligned write buffer */
    if (posix_memalign(&ad, SECTOR, BUF_SIZE)) {
        perror("posix_memalign failed"); exit (EXIT_FAILURE);
    }
    aligned_buf_w = (char *)(ad);

    /* sector-aligned read buffer */
    ad = NULL;
    if (posix_memalign(&ad, SECTOR, BUF_SIZE)) {
        perror("posix_memalign failed"); exit (EXIT_FAILURE);
    }
    aligned_buf_r = (char *)(ad);

    memset(aligned_buf_w, '*', BUF_SIZE * sizeof(char));

    printf("BEFORE READ BEGIN\n");
    printf("\t aligned_buf_w::%ld\n",strlen(aligned_buf_w));
    printf("\t aligned_buf_r::%ld\n",strlen(aligned_buf_r));
    printf("BEFORE READ END\n");

    fd = open(fl_nm, O_RDWR | O_DIRECT);

    nr = pwrite(fd, aligned_buf_w, BUF_SIZE, 0);
    //write error checking
    if(nr == -1){
        perror("[error in write 2]\n");
    }

    nr = pread(fd, aligned_buf_r, BUF_SIZE, 0);
    //read error checking
    if(nr == -1){
        perror("[error in read 2]\n");
    }

    printf("AFTER READ BEGIN\n");
    printf("\taligned_buf_r::%ld \n",strlen(aligned_buf_r));
    printf("AFTER READ END\n");

    //error checking for close process
    if(close(fd) == -1){
        perror("[error in close]\n");
    }else{
        printf("[succeeded in close]\n");
    }
    return 0;
}
Here is the output when I read and write 512 bytes
BEFORE READ BEGIN
aligned_buf_w::512
aligned_buf_r::0
BEFORE READ END
AFTER READ BEGIN
aligned_buf_r::512
AFTER READ END
[succeeded in close]
and here is the result when I try to read 256 bytes
BEFORE READ BEGIN
aligned_buf_w::256
aligned_buf_r::0
BEFORE READ END
[error in read 2]
: Invalid argument
AFTER READ BEGIN
aligned_buf_r::0
AFTER READ END
[succeeded in close]
When using O_DIRECT, "the kernel will do DMA directly from/to the physical memory pointed by the userspace buffer passed as parameter" (https://www.ukuug.org/events/linux2001/papers/html/AArcangeli-o_direct.html), so you have to observe some restrictions (http://man7.org/linux/man-pages/man8/raw.8.html):
All I/Os must be correctly aligned in memory and on disk: they must start at a sector
offset on disk, they must be an exact number of sectors long, and the
data buffer in virtual memory must also be aligned to a multiple of
the sector size. The sector size is 512 bytes for most devices.
With buffered I/O you do not need to care about that. The following sample illustrates this while reading an HDD (/dev/sda9):
#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

#define SECTOR 512

int main( int argc, char **argv ){
    int fd, nr, BUF_SIZE;
    char fl_nm[]={"/dev/sda9"};
    char* buf = NULL;

    if (argc>1) {
        BUF_SIZE = atoi(argv[1]);

        // BUFFERED IO
        printf("Buffered IO -------\n");
        if ((buf = (char*)malloc(BUF_SIZE)) == NULL) perror("[malloc]");
        else {
            if ((fd = open(fl_nm, O_RDONLY)) == -1) perror("[open]");
            if((nr = pread(fd, buf, BUF_SIZE, 4096)) == -1) perror("[pread]");
            else
                printf("%i bytes read %.2x %.2x ...\n",nr,buf[0],buf[1]);
            free(buf);
            if(close(fd) == -1) perror("[close]");
        }

        // DIRECT IO
        printf("Direct IO ---------\n");
        if (posix_memalign((void *)&buf, SECTOR, BUF_SIZE)) {
            perror("posix_memalign failed");
        }
        else {
            if ((fd = open(fl_nm, O_RDONLY | O_DIRECT)) == -1) perror("[open]");
            /* buf size , buf alignment and offset has to observe hardware restrictions */
            if((nr = pread(fd, buf, BUF_SIZE, 4096)) == -1) perror("[pread]");
            else
                printf("%i bytes read %.2x %.2x ...\n",nr,buf[0],buf[1]);
            free(buf);
            if(close(fd) == -1) perror("[close]");
        }
    }
    return 0;
}
You can verify the following behaviour :
$ sudo ./testodirect 512
Buffered IO -------
512 bytes read 01 04 ...
Direct IO ---------
512 bytes read 01 04 ...
$ sudo ./testodirect 4
Buffered IO -------
4 bytes read 01 04 ...
Direct IO ---------
[pread]: Invalid argument
By the way, O_DIRECT is not to everybody's taste: https://yarchive.net/comp/linux/o_direct.html
512B is the smallest unit you can read from a storage device
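If you need to know what alignment a given device actually requires, here is a small Linux-specific sketch (not from the original answer; the /dev/sda default is just a placeholder) that asks the kernel for the logical and physical sector sizes via the BLKSSZGET and BLKPBSZGET ioctls:
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/fs.h>      /* BLKSSZGET, BLKPBSZGET */

int main(int argc, char **argv)
{
    const char *dev = (argc > 1) ? argv[1] : "/dev/sda";
    int fd = open(dev, O_RDONLY);
    if (fd == -1) { perror("open"); return 1; }

    int logical = 0;
    unsigned int physical = 0;
    if (ioctl(fd, BLKSSZGET, &logical) == -1) perror("BLKSSZGET");
    if (ioctl(fd, BLKPBSZGET, &physical) == -1) perror("BLKPBSZGET");

    /* O_DIRECT transfers generally have to be multiples of (and aligned to)
     * the logical sector size reported here. */
    printf("%s: logical sector %d bytes, physical sector %u bytes\n",
           dev, logical, physical);

    close(fd);
    return 0;
}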

C using open()read()write() to access text modify it and store in a new file

My program is supposed to take text from a file given on the command line, change it to uppercase, and store it in another file.
It works, except the output file has a whole bunch of garbage after the converted text. Thank you.
Edit: I changed my read to check for 0 bytes and used ret_in to write, per Pyjamas, but it still pulls in two or three garbage values. It's definitely read that is getting the garbage, because when I output the buffer before converting, it's already there.
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <ctype.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>

#define BUF_SIZE 500

int main(int argc, char* argv[]){
    char buffer[BUF_SIZE];
    int ret_in;
    char inputf[100],outputf[100];

    // Takes input and adjusts it to the correct file type.
    strcpy(inputf,argv[1]);
    strcpy(outputf,argv[1]);
    strcat(outputf,".up");
    printf("%s\n",outputf);
    strcat(inputf,".txt");
    printf("%s\n",inputf);

    int output, input, wrt;
    int total;

    //opens input file
    input=open(inputf, O_RDONLY);
    if (input == -1) {
        printf("Failed to open file\n");
        exit(1);
    }

    ret_in = read(input,buffer,BUF_SIZE);
    total = ret_in;
    // output to console
    while (ret_in!= 0) {
        // printf("%s\n", buffer);
        ret_in = read(input,buffer,BUF_SIZE);
        total += ret_in;
    }
    //ret_in= read(input,&buffer,BUF_SIZE);
    puts(buffer);
    close(input);

    int i = 0;
    while(buffer[i]) {
        buffer[i] = toupper(buffer[i]);
        i++;
    }
    // output buffer in console
    puts(buffer);
    //output filename in console
    printf("%s\n",outputf);

    // Opens or creates output file with permissions.
    output = open(outputf, O_CREAT | O_RDWR, S_IRUSR | S_IWUSR);
    if (output == -1) {
        printf("Failed to open or create the file\n");
        exit(1);
    }

    // write to output file
    wrt = write(output, buffer,total);
    close(output);
    return 0;
}
Because you read ret_in bytes from the file but write BUF_SIZE bytes to it; you should write ret_in bytes. You are not guaranteed to read a full BUF_SIZE bytes from the file every time; the last chunk is usually shorter.
write(output, buffer,BUF_SIZE);//wrong
write(output, buffer,ret_in); //right
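Putting that advice together, here is a hedged sketch of the program restructured to uppercase and write each chunk as it is read, so exactly ret_in bytes are written per iteration. It keeps the .txt/.up naming from the question; the O_TRUNC flag and explicit permissions are assumptions, not part of the original code.
#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/stat.h>

#define BUF_SIZE 500

int main(int argc, char* argv[]){
    char buffer[BUF_SIZE];
    char inputf[100], outputf[100];

    if (argc < 2) { fprintf(stderr, "usage: %s name\n", argv[0]); exit(1); }
    snprintf(inputf,  sizeof inputf,  "%s.txt", argv[1]);
    snprintf(outputf, sizeof outputf, "%s.up",  argv[1]);

    int input  = open(inputf, O_RDONLY);
    int output = open(outputf, O_CREAT | O_WRONLY | O_TRUNC, S_IRUSR | S_IWUSR);
    if (input == -1 || output == -1) { perror("open"); exit(1); }

    ssize_t ret_in;
    while ((ret_in = read(input, buffer, BUF_SIZE)) > 0) {
        for (ssize_t i = 0; i < ret_in; i++)            /* uppercase only the bytes read */
            buffer[i] = toupper((unsigned char)buffer[i]);
        write(output, buffer, ret_in);                  /* write exactly what was read */
    }

    close(input);
    close(output);
    return 0;
}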

Creating a file with a hole

I was trying to create a file with a hole using textbook code, modifying it slightly. However, something must have gone wrong, since I don't see any difference between the two files in terms of size or disk blocks.
Code to create a file with a hole (Advanced Programming in the Unix Environment)
#include "apue.h"
#include <fcntl.h>
char buf1[] = "abcdefghij";
char buf2[] = "ABCDEFGHIJ";
int
main(void)
{
int fd;
if ((fd = creat("file.hole", FILE_MODE)) < 0)
printf("creat error");
if (write(fd, buf1, 10) != 10)
printf("buf1 write error");
/* offset now = 10 */
if (lseek(fd, 16384, SEEK_SET) == -1)
printf("lseek error");
/* offset now = 16384 */
if (write(fd, buf2, 10) != 10)
printf("buf2 write error");
/* offset now = 16394 */
exit(0);
}
My code, creating a file basically full of abcdefghij's.
#include "apue.h"
#include <fcntl.h>
#include<unistd.h>
char buf1[] = "abcdefghij";
char buf2[] = "ABCDEFGHIJ";
int
main(void)
{
int fd;
if ((fd = creat("file.nohole", FILE_MODE)) < 0)
printf("creat error");
while(lseek(fd,0,SEEK_CUR)<16394)
{
if (write(fd, buf1, 10) != 10)
printf("buf1 write error");
}
exit(0);
}
Printing both files I get the expected output. However, their sizes are nearly identical and ls reports the same number of disk blocks for both.
{linux1:~/dir} ls -ls *hole
17 -rw-------+ 1 user se 16394 Sep 14 11:42 file.hole
17 -rw-------+ 1 user se 16400 Sep 14 11:33 file.nohole
You misunderstand what is meant by a "hole".
What it means is that you write some number of bytes, skip over some other number of bytes, then write more bytes. The bytes you don't explicitly write in between are set to 0.
When you read it back, the file itself doesn't have a gap in it, i.e. two separate sections; it just has bytes with 0 in them. (Whether the filesystem actually leaves the skipped range unallocated on disk is a separate, filesystem-dependent question.)
If you were to look at the contents of the first file, you'll see it has "abcdefghij" followed by 16374 (16384 - 10) bytes containing the value 0 (not the character "0"), followed by "ABCDEFGHIJ".
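To see whether the filesystem really left the skipped range unallocated, you can compare the apparent size with the allocated size reported by stat(2). A small sketch (not from the original answer):
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Print apparent size vs. allocated size for each file argument.
 * st_blocks counts 512-byte units; if st_blocks * 512 is much smaller
 * than st_size, the filesystem left part of the file unallocated
 * (a sparse file, i.e. a real on-disk "hole"). */
int main(int argc, char **argv)
{
    for (int i = 1; i < argc; i++)
    {
        struct stat sb;
        if (stat(argv[i], &sb) == -1)
        {
            perror(argv[i]);
            continue;
        }
        printf("%s: size %lld bytes, allocated %lld bytes\n",
               argv[i],
               (long long)sb.st_size,
               (long long)sb.st_blocks * 512LL);
    }
    return 0;
}
On a filesystem that supports sparse files, file.hole would report noticeably less allocated space than file.nohole; the ls -ls output above (17 blocks for both) suggests this particular filesystem did not leave the range unallocated.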

Read function from c returns 0 prematurely

I have this code to copy chunks of 1 KB from a source file to a destination file (essentially creating a copy of the file):
test.cpp
#include<stdio.h>
#include<unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include<string.h>
int main() {
int fd = open("file1.mp4", O_RDONLY);
int fd2 = open("file2.mp4", O_WRONLY | O_CREAT | O_APPEND);
int nr = 0;int n;
char buff[1024];
memset(buff, 0, 1024);
while((n = read(fd, buff, 1024)) != 0) {
write(fd2, buff, strlen(buff));
nr = strlen(buff);
memset(buff, 0, 1024);
}
printf("succes %d %d\n", nr,n);
close(fd);
close(fd2);
return 0;
}
I have tried to copy a .mp4 file, which has 250 MB, but the result has only 77.4 MB. The return value of the last read(), n, is 0, so there isn't supposed to be any error (but it should be, since it doesn't copy entire input file).
I think that the .mp4 file has an EOF byte, which does not actually mean the end of the file.
What should I do to be able to copy the entire .mp4 file? (I would like an answer that improves my code, not completely different code.)
Thanks for the help!
The problem is that you write strlen(buff) bytes instead of n bytes in your loop.
Whenever the buffer contains a \0 byte, strlen will take it to mean "end of string" and you end up not writing any more. (And when it doesn't contain a \0, you end up reading past the end of the buffer).
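A minimal corrected sketch of the loop, keeping the question's file names and 1 KB chunk size but writing exactly the n bytes that read() returned. The 0644 mode, O_TRUNC (so reruns overwrite rather than append), and the read-error check are additions, not part of the original code.
#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>

int main(void) {
    int fd  = open("file1.mp4", O_RDONLY);
    int fd2 = open("file2.mp4", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd == -1 || fd2 == -1) { perror("open"); return 1; }

    char buff[1024];
    ssize_t n;
    while ((n = read(fd, buff, sizeof buff)) > 0) {
        /* Write the number of bytes actually read, not strlen(buff):
         * binary data like .mp4 routinely contains 0 bytes. */
        write(fd2, buff, n);
    }
    if (n == -1)
        perror("read");

    close(fd);
    close(fd2);
    return 0;
}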
