Creating my own archive tool in C [closed]

I was just assigned a project to create an archiving tool for Unix. After creating the program I would do something like
"./bar -c test_archive.bar file.1"
and it would create a test_archive.bar with file.1 inside it. Then I could run some command to list the files inside, etc. But I'm having trouble understanding the concept of making a test_archive.bar. I realize in essence it's just a file, but if you were to open a .tgz with "vi file.tgz", it would show a list of the directories/files inside.
So, are there any good ways to go about creating an archive in which I can store some files, extract them, and list their names, etc.?
Note: I have looked at tar.c and all the files included in that, but every file is so abstracted it's very hard to follow.
Note: I know how to read the command-line flags, etc.

Using an old (but still valid) tar format is actually pretty easy to do. Wikipedia has a nice explanation of the format here. All you need to do is this:
For each file:
Fill out and emit a header to the tar file
Emit the file contents
Pad the file size to a multiple of 512 bytes
The most basic valid header for a tar file is (copied from Wikipedia, basically):
100 bytes: File name
8 bytes: File mode
8 bytes: Owner's numeric ID
8 bytes: Group's numeric ID
12 bytes: File's size
12 bytes: Timestamp of last modified time
8 bytes: Checksum
1 byte: File type
100 bytes: Name of linked file
The file type can be '0' (a normal file), '1' (a hard link) or '2' (a symlink), stored as an ASCII digit. The name of the linked file is the name of the file that a link points at. If I recall correctly, if you have a hard link or symbolic link, the file content should be empty.
To quote Wikipedia:
"Numeric values are encoded in octal numbers using ASCII digits, with leading zeroes. For historical reasons, a final NUL or space character should be used."
"The checksum is calculated by taking the sum of the unsigned byte values of the header record with the eight checksum bytes taken to be ascii spaces (decimal value 32). It is stored as a six digit octal number with leading zeroes followed by a NUL and then a space."
Here's a simple tarball generator. Creating an extractor, dealing with automatic file feeding, etc, is left as an exercise for the reader.
#include <stdio.h>
#include <string.h>
#include <time.h>

struct tar_header
{
    char name[100];
    char mode[8];
    char owner[8];
    char group[8];
    char size[12];
    char modified[12];
    char checksum[8];
    char type[1];
    char link[100];
    char padding[255];
};

void fexpand( FILE* f, size_t amount, int value )
{
    while( amount-- )
    {
        fputc( value, f );
    }
}

void tar_add( FILE* tar_file, const char* file, const char* internal_name )
{
    //Get current position; round to a multiple of 512 if we aren't there already
    size_t index = ftell( tar_file );
    size_t offset = index % 512;
    if( offset != 0 )
    {
        fexpand( tar_file, 512 - offset, 0 );
    }
    //Store the index for the header to return to later
    index = ftell( tar_file );
    //Write some space for our header
    fexpand( tar_file, sizeof(struct tar_header), 0 );
    //Write the input file to the tar file
    FILE* input = fopen( file, "rb" );
    if( input == NULL )
    {
        fprintf( stderr, "Failed to open %s for reading\n", file );
        return;
    }
    //Copy the file content to the tar file, checking the read count
    //rather than feof() so we never write garbage
    char buffer[2000];
    size_t count;
    while( ( count = fread( buffer, 1, sizeof buffer, input ) ) > 0 )
    {
        fwrite( buffer, 1, count, tar_file );
    }
    //Get the end to calculate the size of the file
    size_t end = ftell( tar_file );
    //Round the file size to a multiple of 512 bytes
    offset = end % 512;
    if( offset != 0 )
    {
        fexpand( tar_file, 512 - offset, 0 );
    }
    //Fill out a new tar header
    struct tar_header header;
    memset( &header, 0, sizeof( struct tar_header ) );
    snprintf( header.name, 100, "%s", internal_name );
    snprintf( header.mode, 8, "%06o ", 0777 ); //You should probably query the input file for this info
    snprintf( header.owner, 8, "%06o ", 0 );   //^
    snprintf( header.group, 8, "%06o ", 0 );   //^
    snprintf( header.size, 12, "%011o", (unsigned)( end - 512 - index ) );
    snprintf( header.modified, 12, "%011o", (unsigned)time( NULL ) ); //Again, get this from the filesystem
    memset( header.checksum, ' ', 8 );
    header.type[0] = '0';
    //Calculate the checksum with the checksum field itself read as spaces
    size_t checksum = 0;
    size_t i;
    const unsigned char* bytes = (const unsigned char*)&header;
    for( i = 0; i < sizeof( struct tar_header ); ++i )
    {
        checksum += bytes[i];
    }
    //Six octal digits, the NUL added by snprintf, then the space left
    //over from the memset above - just as the spec requires
    snprintf( header.checksum, 8, "%06o", (unsigned)checksum );
    //Save the new end to return to after writing the header
    end = ftell( tar_file );
    //Write the header
    fseek( tar_file, index, SEEK_SET );
    fwrite( bytes, 1, sizeof( struct tar_header ), tar_file );
    //Return to the end
    fseek( tar_file, end, SEEK_SET );
    fclose( input );
}

int main( int argc, char* argv[] )
{
    if( argc > 1 )
    {
        FILE* tar = fopen( argv[1], "wb" );
        if( !tar )
        {
            fprintf( stderr, "Failed to open %s for writing\n", argv[1] );
            return 1;
        }
        int i;
        for( i = 2; i < argc; ++i )
        {
            tar_add( tar, argv[i], argv[i] );
        }
        //Pad out the end of the tar file
        fexpand( tar, 1024, 0 );
        fclose( tar );
        return 0;
    }
    fprintf( stderr, "Please specify some file names!\n" );
    return 0;
}
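If you compile this as bar, running ./bar test_archive.bar file.1 should produce an archive that the system tar can list back with tar -tvf test_archive.bar; note the sketch takes the archive name as its first argument and omits the -c flag parsing from the question.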

So, are there any good ways to go about creating an archive in which I can store some files, extract them, and list their names, etc.?
There are basically two approaches:
Copy file contents one after another, each prefixed with a "header" block containing information about the file's name, size and (optionally) other attributes. Tar is an example of this.
Copy file contents one after another, and put an "index" somewhere (at the beginning or at the end) which contains the list of file names with their sizes and (optionally) other attributes. From the file sizes you can compute where individual files begin and end. A sketch of this second layout follows the example below.
Most real-world archivers use a combination of these, and add other features such as checksums, compression and encryption.
Example
Suppose we have two files: hello.txt containing Hello, World (12 bytes) and bar.txt containing foobar (6 bytes).
With the first method, the archive would look like this:
[hello.txt,12][Hello, World][bar.txt,6][foobar]
^- fixed size ^- 12 bytes   ^- fixed size ^- 6 bytes
The length of the header blocks would have to be either constant, or you would have to encode their length somewhere.
With the second:
[Hello, Worldfoobar][hello.txt,12,bar.txt,6]
^- 12+6 bytes       ^- index
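To make the second approach concrete, here is a minimal sketch of a writer for such a trailer-index layout. The record format, the fixed 64-entry cap, and the 8-byte footer holding the index offset are all inventions for illustration, not any standard format:

#include <stdint.h>
#include <stdio.h>

struct index_entry
{
    char name[100];
    uint64_t size;
};

// Write each file's raw bytes, then an index of (name, size) records,
// then the index's own offset so a reader can locate it from the end.
void archive_create( const char* out, const char* files[], int count )
{
    FILE* f = fopen( out, "wb" );
    struct index_entry entries[64]; // fixed cap, just for the sketch
    int i;
    for( i = 0; i < count; ++i )
    {
        FILE* in = fopen( files[i], "rb" );
        uint64_t written = 0;
        int c;
        while( ( c = fgetc( in ) ) != EOF )
        {
            fputc( c, f );
            ++written;
        }
        fclose( in );
        snprintf( entries[i].name, sizeof entries[i].name, "%s", files[i] );
        entries[i].size = written;
    }
    // Remember where the index starts, then append it
    uint64_t index_offset = (uint64_t)ftell( f );
    fwrite( entries, sizeof entries[0], count, f );
    // Footer: the index offset, so a reader can seek to it
    fwrite( &index_offset, sizeof index_offset, 1, f );
    fclose( f );
}

A reader would seek to the last 8 bytes, read index_offset, seek there, and read the records to recover each file's name and size; where each file's content starts follows from summing the sizes of the files before it.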

Related

pointer in binary files after using fread command

I'm learning at the moment about file pointers, and came across this code which was given as an example.
I tried replicating it in Visual Studio, but I keep getting run-time errors that are not relevant at this moment.
void main()
{
    FILE *cfPtr; /* credit.dat file pointer */
    /* create clientData with default information */
    struct clientData client = { 0, "", "", 0.0 };
    if ( ( cfPtr = fopen( "credit.dat", "rb" ) ) == NULL )
        printf( "File could not be opened.\n" );
    else {
        printf( "%-6s%-16s%-11s%10s\n", "Acct", "Last Name",
            "First Name", "Balance" );
        /* read all records from file (until eof) */
        fread( &client, sizeof( struct clientData ), 1, cfPtr );
        while ( !feof( cfPtr ) ) {
            /* display record */
            if ( client.acctNum != 0 )
                printf( "%-6d%-16s%-11s%10.2f\n",
                    client.acctNum, client.lastName,
                    client.firstName, client.balance );
            fread( &client, sizeof( struct clientData ), 1, cfPtr );
        } /* end while */
        fclose( cfPtr ); /* fclose closes the file */
    } /* end else */
} /* end main */
My question is this: if the file is empty, what does struct client contain?
Also, if the file only has one struct in it, when the code calls fread, wouldn't the pointer move to after the struct, meaning it will sit on EOF, and when the condition is checked it will be false, meaning the struct wasn't printed on screen?
7.21.8.1 The fread function
...
Synopsis
1 #include <stdio.h>
size_t fread(void * restrict ptr,
size_t size, size_t nmemb,
FILE * restrict stream);
Description
2 The fread function reads, into the array pointed to by ptr, up to nmemb elements
whose size is specified by size, from the stream pointed to by stream. For each
object, size calls are made to the fgetc function and the results stored, in the order
read, in an array of unsigned char exactly overlaying the object. The file position
indicator for the stream (if defined) is advanced by the number of characters successfully
read. If an error occurs, the resulting value of the file position indicator for the stream is
indeterminate. If a partial element is read, its value is indeterminate.
Returns
3 The fread function returns the number of elements successfully read, which may be
less than nmemb if a read error or end-of-file is encountered. If size or nmemb is zero,
fread returns zero and the contents of the array and the state of the stream remain
unchanged.
C 2011 Online Draft
Emphasis added; if fread returns something less than 1 for this code, then you should assume that client doesn't contain anything meaningful.
Never use feof as your loop condition - it won't return true until you try to read past the end of the file, so the loop will execute once too often. Instead, check the result of your input operation like so:
while ( fread( &client, sizeof client, 1, cfPtr ) == 1 )
{
// process client normally
}
if ( feof( cfPtr ) )
fputs( "End of file detected on input\n", stderr );
else
fputs( "Error on input\n", stderr );

ext4: Splitting and Concatenating File in situ

Say I have a large file on a large disk and this file fills the disk almost entirely, e.g. a 10TB disk holding an almost-10TB file with, say, 3GB free. Also, I do not have any other storage available to copy to.
I would like to split that file into N pieces, but splitting in half is OK for the simple case. As the desired solution is probably FS-specific: I'm on an ext4 filesystem.
I am aware of https://www.gnu.org/software/coreutils/manual/coreutils.html#split-invocation
Obviously, I do not have enough free space on the device to create the splits by copying.
Would it be possible to split file A (~10TB) into two files B and C in such a way that these (B and C) would simply be new "references" to the original data of file A?
I.e. B having the same start (A_start = B_start), but a smaller length and C, starting at B_start+B_length having C_length = A_length-B_length.
File A might or might not exist in the FS after the operation.
Also, I'd be fine if there was some constraint/restriction like this was only possible at some sector/block boundary (i.e. only 4096 byte raster).
Same question applies to the inverse situation:
Having two files of almost 5TB each on a 10TB hard disk: concatenating these to a resulting file of nearly 10TB size by merely adjusting the "inode references".
Sorry if the nomenclature is not that precise, I hope it's clear what I try to achieve.
First, there is currently no guaranteed portable way to do what you want - any solution is going to be platform-specific, because to do what you want requires that your underlying filesystem support sparse files.
Code like this will work to split a file in half if the underlying filesystem creates sparse files (error checking left out for clarity):
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

// 1MB chunks (use a power of two)
#define CHUNKSIZE ( 1024L * 1024L )

int main( int argc, char **argv )
{
    int origFD = open( argv[ 1 ], O_RDWR );
    int newFD = open( argv[ 2 ], O_WRONLY | O_CREAT | O_TRUNC, 0644 );
    // get the size of the input file
    struct stat sb;
    fstat( origFD, &sb );
    // get a CHUNKSIZE-aligned offset near the middle of the file
    off_t startOffset = ( sb.st_size / 2L ) & ~( CHUNKSIZE - 1L );
    // get the largest CHUNKSIZE-aligned offset in the file
    off_t readOffset = sb.st_size & ~( CHUNKSIZE - 1L );
    // too big to be safe on the stack - get it from the heap
    char *ioBuffer = malloc( CHUNKSIZE );
    while ( readOffset >= startOffset )
    {
        // write the data to the end of the file - the underlying
        // filesystem had better create a sparse file or this can
        // fill up the disk on the first pwrite() call
        ssize_t bytesRead = pread(
            origFD, ioBuffer, CHUNKSIZE, readOffset );
        pwrite( newFD, ioBuffer, bytesRead, readOffset - startOffset );
        // cut the end off the input file - this had better free up
        // disk space
        ftruncate( origFD, readOffset );
        readOffset -= CHUNKSIZE;
    }
    free( ioBuffer );
    close( origFD );
    close( newFD );
    return( 0 );
}
There are other approaches, too. On a Solaris system you can use fcntl() with the F_FREESP command, and on a Linux system that supports FALLOC_FL_PUNCH_HOLE you can use the fallocate() function to remove arbitrary blocks from the file after you've copied the data to another file. On such systems you wouldn't be limited to only being able to cut the end off the original file with ftruncate().
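As a sketch of that last approach (Linux-specific; the filesystem must support hole punching, which ext4 does), punch_region below is just an illustrative wrapper:

#define _GNU_SOURCE
#include <fcntl.h>

/* Deallocate an already-copied region without changing the file size.
   FALLOC_FL_PUNCH_HOLE must be ORed with FALLOC_FL_KEEP_SIZE. */
int punch_region( int fd, off_t offset, off_t length )
{
    return fallocate( fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                      offset, length );
}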

Improving IO performance for merging two files in C

I wrote a function which merges two large files (file1,file2) into a new file (outputFile).
Each file has a line-based format, and entries are separated by a \0 byte. Both files have the same number of null bytes.
One example file with two entries could look like this A\nB\n\0C\nZ\nB\n\0
Input:
file1: A\nB\n\0C\nZ\nB\n\0
file2: BBA\nAB\0T\nASDF\nQ\n\0
Output
outputFile: A\nB\nBBA\nAB\0C\nZ\nB\nT\nASDF\nQ\n\0
FILE * outputFile = fopen(...);
setvbuf( outputFile, NULL, _IOFBF, 1024*1024*1024 );
FILE * file1 = fopen(...);
FILE * file2 = fopen(...);
int c1, c2;
while( (c1 = fgetc(file1)) != EOF ) {
    if( c1 == '\0' ) {
        while( (c2 = fgetc(file2)) != EOF && c2 != '\0' ) {
            fwrite( &c2, sizeof(char), 1, outputFile );
        }
        char nullByte = '\0';
        fwrite( &nullByte, sizeof(char), 1, outputFile );
    } else {
        fwrite( &c1, sizeof(char), 1, outputFile );
    }
}
Is there a way to improve this IO performance of this function? I increased the buffer size of outputFile to 1 GB by using setvbuf. Would it help to use posix_fadvise on file1 and file2?
You're doing IO character-by-character. That is going to be needlessly and painfully S-L-O-W, even with buffered streams.
Take advantage of the fact that your data is stored in your files as NUL-terminated strings.
Assuming you're alternating nul-terminated strings from each file, and running on a POSIX platform so you can simply mmap() the input files:
#define _GNU_SOURCE   /* for O_DIRECT */
#include <fcntl.h>
#include <malloc.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

typedef struct mapdata
{
    const char *ptr;
    size_t bytes;
} mapdata_t;

mapdata_t mapFile( const char *filename )
{
    mapdata_t data;
    struct stat sb;
    int fd = open( filename, O_RDONLY );
    fstat( fd, &sb );
    data.bytes = sb.st_size;
    /* assumes we have a NUL byte after the file data
       If the size of the file is an exact multiple of the
       page size, we won't have the terminating NUL byte! */
    data.ptr = mmap( NULL, sb.st_size, PROT_READ, MAP_PRIVATE, fd, 0 );
    close( fd );
    return( data );
}

void unmapFile( mapdata_t data )
{
    munmap( (void *)data.ptr, data.bytes );
}

void mergeFiles( const char *file1, const char *file2, const char *output )
{
    char zeroByte = '\0';
    mapdata_t data1 = mapFile( file1 );
    mapdata_t data2 = mapFile( file2 );
    size_t strOffset1 = 0UL;
    size_t strOffset2 = 0UL;
    /* get a page-aligned buffer - a 64kB alignment should work */
    char *iobuffer = memalign( 64UL * 1024UL, 1024UL * 1024UL );
    /* memset the buffer to ensure the virtual mappings exist */
    memset( iobuffer, 0, 1024UL * 1024UL );
    /* use of direct IO should reduce memory pressure - the 1 MB
       buffer is already pretty large, and since we're not seeking
       the page cache is really only slowing things down */
    int fd = open( output, O_RDWR | O_TRUNC | O_CREAT | O_DIRECT, 0644 );
    FILE *outputfile = fdopen( fd, "wb" );
    setvbuf( outputfile, iobuffer, _IOFBF, 1024UL * 1024UL );
    /* loop until we reach the end of either mapped file */
    for ( ;; )
    {
        fputs( data1.ptr + strOffset1, outputfile );
        fwrite( &zeroByte, 1, 1, outputfile );
        fputs( data2.ptr + strOffset2, outputfile );
        fwrite( &zeroByte, 1, 1, outputfile );
        /* skip over the string, assuming there's one NUL
           byte in between strings */
        strOffset1 += 1 + strlen( data1.ptr + strOffset1 );
        strOffset2 += 1 + strlen( data2.ptr + strOffset2 );
        /* if either offset is too big, end the loop */
        if ( ( strOffset1 >= data1.bytes ) ||
             ( strOffset2 >= data2.bytes ) )
        {
            break;
        }
    }
    fclose( outputfile );
    unmapFile( data1 );
    unmapFile( data2 );
}
I've put in no error checking at all. You'll also need to add the proper header files.
Note also that the file data is assumed to NOT be an exact multiple of the system page size, thus ensuring that there's a NUL byte mapped after the file contents. If the size of the file is an exact multiple of the page size, you'll have to mmap() an additional page after the file contents to ensure that there's a NUL byte to terminate the last string.
Or you can rely on there being a NUL byte as the last byte of the file's contents. If that ever turns out to not be true, you'll likely get either a SEGV or corrupted data.
You are using two function calls per character (one for input, one for output). Function calls are slow (they pollute the instruction pipeline).
fgetc() and fputc() have getc()/putc() counterparts, which are (or can be) implemented as macros, enabling the compiler to inline the entire loop, except for the reading/writing of buffers once per 512 or 1024 or 4096 characters processed. (These will invoke system calls, but those are inevitable anyway.)
Using read/write instead of buffered I/O will probably not be worth the effort; the extra bookkeeping will make your loop fatter. (BTW: using fwrite() to write one character is certainly wasteful; the same goes for write().)
Maybe a larger output buffer could help, but I wouldn't count on it.
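For instance, the loop from the question rewritten with getc()/putc() (same stream variables as in the question) might look like:

int c1, c2;
while( (c1 = getc(file1)) != EOF ) {
    if( c1 == '\0' ) {
        /* getc/putc are usually macros, so there is no function-call
           overhead per character */
        while( (c2 = getc(file2)) != EOF && c2 != '\0' )
            putc(c2, outputFile);
        putc('\0', outputFile);
    } else {
        putc(c1, outputFile);
    }
}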
If you can use threads, make one for file1 and another for file2.
Make outputFile as big as you need, then have thread1 write file1 into outputFile, while thread2 seeks to offset length(file1)+1 in outputFile and writes file2.
Edit:
This is not a correct answer for this case, but to prevent confusion I'll leave it here.
More discussion I found about it: improve performance in file IO in C

how to print first 10 lines of a text file using Unix system calls?

I want to write my own version of the head Unix command, but my program is not working.
I am trying to to print the first 10 lines of a text file, but instead the program prints all the lines. I specify the file name and number of lines to print via command-line arguments. I am only required to use Unix system calls such as read(), open() and close().
Here is the code:
#include "stdlib.h"
#include "stdio.h"
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>
#define BUFFERSZ 256
#define LINES 10
void fileError( char*, char* );
int main( int ac, char* args[] )
{
char buffer[BUFFERSZ];
int linesToRead = LINES;
int in_fd, rd_chars;
// check for invalid argument count
if ( ac < 2 || ac > 3 )
{
printf( "usage: head FILE [n]\n" );
exit(1);
}
// check for n
if ( ac == 3 )
linesToRead = atoi( args[2] );
// attempt to open the file
if ( ( in_fd = open( args[1], O_RDONLY ) ) == -1 )
fileError( "Cannot open ", args[1] );
int lineCount = 0;
//count no. of lines inside file
while (read( in_fd, buffer, 1 ) == 1)
{
if ( *buffer == '\n' )
{
lineCount++;
}
}
lineCount = lineCount+1;
printf("Linecount: %i\n", lineCount);
int Starting = 0, xline = 0;
// xline = totallines - requiredlines
xline = lineCount - linesToRead;
printf("xline: %i \n\n",xline);
if ( xline < 0 )
xline = 0;
// count for no. of line to print
int printStop = lineCount - xline;
printf("printstop: %i \n\n",printStop);
if ( ( in_fd = open( args[1], O_RDONLY ) ) == -1 )
fileError( "Cannot open ", args[1] );
//read and print till required number
while (Starting != printStop) {
read( in_fd, buffer, BUFFERSZ );
Starting++; //increment starting
}
//read( in_fd, buffer, BUFFERSZ );
printf("%s \n", buffer);
if ( close( in_fd ) == -1 )
fileError( "Error closing files", "" );
return 0;
}
void fileError( char* s1, char* s2 )
{
fprintf( stderr, "Error: %s ", s1 );
perror( s2 );
exit( 1 );
}
What am I doing wrong?
It's very odd that you open the file and scan it to count the total number of lines before going on to echo the first lines. There is absolutely no need to know in advance how many lines there are altogether before you start echoing lines, and it does nothing useful for you. If you're going to do it anyway, however, then you ought to close() the file before you re-open it. For your simple program this is a matter of good form, not of correct function; the misbehavior you observe is unrelated to it.
There are several problems in the key portion of your program:
//read and print till required number
while (Starting != printStop) {
read( in_fd, buffer, BUFFERSZ );
Starting++; //increment starting
}
//read( in_fd, buffer, BUFFERSZ );
printf("%s \n", buffer);
You do not check the return value of your read() call in this section. You must check it, because it tells you not only whether there was an error / end-of-file, but also how many bytes were actually read. You are not guaranteed to fill the buffer on any call, and only in this way can you know which elements of the buffer afterward contain valid data. (Pre-counting lines does nothing for you in this regard.)
You are performing raw read()s, and apparently assuming that each one will read exactly one line. That assumption is invalid. read() does not give any special treatment to line terminators, so you are likely to have reads that span multiple lines, and reads that read only partial lines (and maybe both in the same read). You therefore cannot count lines by counting read() calls. Instead, you must scan the valid characters in the read buffer and count the newlines among them.
You do not actually print anything inside your read loop. Instead, you wait until you've done all your reading, then print everything in the buffer after the last read. That's not going to serve your purpose when you don't get all the lines you need in the first read, because each subsequent successful read will clobber the data from the preceding one.
You pass the buffer to printf() as if it were a null-terminated string, but you do nothing to ensure that it is, in fact, terminated. read() does not do that for you.
I have trouble believing your claim that your program always prints all the lines of the designated file, but I can believe that it prints all the lines of the specific file you're testing it on. It might do that if the file is short enough that the whole thing fits into your buffer. Your program then might read the whole thing into the buffer on the first read() call (though it is not guaranteed to do so), and then read nothing on each subsequent call, returning 0 and leaving the buffer unchanged. When you finally print the buffer, it still contains the whole contents of the file.
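For reference, here is a minimal sketch that addresses all of these points in a single pass; it checks the read() return value, counts newlines inside the buffer, and writes only the valid bytes:

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main( int argc, char* argv[] )
{
    if ( argc < 2 || argc > 3 )
    {
        fprintf( stderr, "usage: head FILE [n]\n" );
        return 1;
    }
    int linesToRead = ( argc == 3 ) ? atoi( argv[2] ) : 10;
    int in_fd = open( argv[1], O_RDONLY );
    if ( in_fd == -1 )
    {
        perror( argv[1] );
        return 1;
    }
    char buffer[256];
    ssize_t n;
    int seen = 0;
    while ( seen < linesToRead
            && ( n = read( in_fd, buffer, sizeof buffer ) ) > 0 )
    {
        ssize_t i;
        // scan only the n valid bytes, stopping after the last wanted line
        for ( i = 0; i < n && seen < linesToRead; i++ )
            if ( buffer[i] == '\n' )
                seen++;
        // write exactly the bytes scanned - no NUL terminator involved
        write( STDOUT_FILENO, buffer, i );
    }
    close( in_fd );
    return 0;
}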

C copy file contents from EOF to SOF

My program is working almost as it should. The intended purpose is to read the file from the end and copy its contents to a destination file. However, what confuses me is the lseek() call, more specifically how I should be setting the offset.
My src contents at the moment are:
Line 1
Line 2
Line 3
At the moment what I get in my destination file is:
Line 3
e 2
e 2...
From what I understand, calling int loc = lseek(src, -10, SEEK_END); will move the "cursor" in the source file to the end, then offset it from EOF towards SOF by 10 bytes, and the value of loc will be the size of the file after I have deducted the offset. However, after 7 hours of C I'm almost brain dead here.
int main(int argc, char* argv[])
{
    // Open source & destination file
    int src = open(argv[1], O_RDONLY, 0777);
    int dst = open(argv[2], O_CREAT|O_WRONLY, 0777);
    // Check if either reported an error
    if(src == -1 || dst == -1)
    {
        perror("There was a problem with one of the files.");
    }
    // Set buffer & block size
    char buffer[1];
    int block;
    // Set offset from EOF
    int offset = -1;
    // Set file pointer location to the end of file
    int loc = lseek(src, offset, SEEK_END);
    // Read from source from EOF to SOF
    while( loc > 0 )
    {
        // Read bytes
        block = read(src, buffer, 1);
        // Write to output file
        write(dst, buffer, block);
        // Move the pointer again
        loc = lseek(src, loc-1, SEEK_SET);
    }
}
lseek() doesn't change or return the file size. What it returns is the position where the 'cursor' is set to. So when you call
loc = lseek(src, offset, SEEK_END);
twice it will always set the cursor to the same position again. I guess you want to do something like this:
while( loc > 0 )
{
    // Read bytes (note: the buffer must be at least 5 bytes for this)
    block = read(src, buffer, 5);
    // Write to output file
    write(dst, buffer, block);
    // Move the pointer again, offset bytes before the last position
    loc = lseek(src, loc+offset, SEEK_SET);
}
If the line length is variable, you could do something like the following instead:
// define an offset that exceeds the maximum line length
int offset = 256;
char buffer[256];
// determine the file size
off_t size = lseek( src, 0, SEEK_END );
// 'end' marks the end of the not-yet-copied region
off_t end = size;
while( end > 0 ) {
    int chunk = offset;
    if( end < chunk )
        chunk = end; // shrink the block near the start of the file
    lseek( src, end - chunk, SEEK_SET );
    // add error checking here!!
    read( src, buffer, chunk );
    // we expect the last byte read to be a newline; the line we want
    // starts after the newline BEFORE it, so search backwards
    // (memrchr is a GNU extension - define _GNU_SOURCE)
    char *p = memrchr( buffer, '\n', chunk - 1 );
    int len;
    if( p != NULL ) {
        p++; // the beginning of the last line in the block
        len = chunk - (int)(p - buffer);
    } else {
        p = buffer; // no earlier newline: this is the first line
        len = chunk;
    }
    write( dst, p, len );
    end -= len; // repeat with the bytes before the last line
}
From some of your comments it looks like you want to reverse the order of the lines in a text file. Unfortunately you're not going to get that with such a simple program. There are several approaches you can take, depending on how complicated you want to get, how big the files are, how much memory is on hand, how fast you want it to be, etc.
Here are some different ideas off the top of my head:
Read your whole source file at once into a single memory block. Scan forwards through the block looking for line breaks, recording the pointer and length of each line. Push these records onto a stack (you could use a dynamic array, or an STL vector in C++), and then to write your output file you just pop a line's record off the stack (moving backwards through the array) and write it, until the stack is empty (you've reached the beginning of the array). See the sketch after this list.
Start at the end of your input file, but for each line, seek backwards character-by-character until you find the newline that starts the previous line. Seek forwards again past that newline and then read in the line. (You should now know its length.) Or, you could just build up the reversed characters in a buffer and then write them out backwards.
Pull in whole blocks (sectors perhaps) of the file at once, from end to beginning. Within each block, locate the newlines in a similar fashion to the method above except now you already have the characters in memory and so don't need to do any reversing or pulling them in redundantly. However, this solution will be much more complicated because lines can span across block boundaries.
There may be more elaborate/clever tricks, but those are the more obvious, straightforward approaches.
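As a sketch of the first idea, assuming the whole file fits in memory (names here are illustrative, and error handling is trimmed):

#include <stdio.h>
#include <stdlib.h>

int reverse_lines( const char *srcname, const char *dstname )
{
    FILE *src = fopen( srcname, "rb" );
    if ( !src )
        return -1;
    fseek( src, 0, SEEK_END );
    long size = ftell( src );
    rewind( src );
    char *data = malloc( size );
    fread( data, 1, size, src );
    fclose( src );
    // record the start offset of every line (one slot per byte is a
    // generous upper bound, fine for a sketch)
    long *starts = malloc( sizeof *starts * ( size + 1 ) );
    long nlines = 0;
    long i = 0;
    while ( i < size )
    {
        starts[ nlines++ ] = i;
        while ( i < size && data[ i ] != '\n' )
            i++;
        i++; // skip the newline
    }
    // pop the "stack" of line records to write the lines in reverse order
    FILE *dst = fopen( dstname, "wb" );
    long l;
    for ( l = nlines - 1; l >= 0; l-- )
    {
        long end = ( l + 1 < nlines ) ? starts[ l + 1 ] : size;
        fwrite( data + starts[ l ], 1, end - starts[ l ], dst );
    }
    fclose( dst );
    free( starts );
    free( data );
    return 0;
}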
I think you should be using SEEK_CUR instead of SEEK_SET in your final call to lseek(), and capturing its return value so the loop condition sees the new position:
// Set file pointer location to the end of file
int loc = lseek(src, offset, SEEK_END);
// Read from source from EOF to SOF
while( loc > 0 )
{
    // Read bytes
    block = read(src, buffer, 5);
    // Write to output file
    write(dst, buffer, block);
    // Move the pointer back past this chunk and the next one
    loc = lseek(src, -10, SEEK_CUR);
}
You could also do:
// Set file pointer location to the end of file
int loc = lseek(src, offset, SEEK_END);
// Read from source from EOF to SOF
while( loc > 0 )
{
    // Read bytes
    block = read(src, buffer, 5);
    // Write to output file
    write(dst, buffer, block);
    // Move the pointer again
    loc -= 5;
    lseek(src, loc, SEEK_SET);
}
