How to duplicate an image file? [duplicate] - c

I am designing an image decoder, and as a first step I tried to simply copy the file using C, i.e. open the file and write its contents to a new file. Below is the code that I used.
while((c=getc(fp))!=EOF)
fprintf(fp1,"%c",c);
where fp is the source file and fp1 is the destination file.
The program executes without any error, but the image file (".bmp") is not copied properly. I have observed that the copied file is smaller than the original and only 20% of the image is visible; everything else is black. When I tried with simple text files, the copy was complete.
Do you know what the problem is?

Make sure that the type of the variable c is int, not char. In other words, post more code.
This is because the value of the EOF constant is typically -1, and if you read characters as char-sized values, every byte that is 0xff will look like the EOF constant. With the extra bits of an int, there is room to separate the two.
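To make the point above concrete, here is a minimal sketch that models both declarations. It assumes EOF is -1 and that narrowing 0xFF to a signed char yields -1, which holds on common platforms but is not guaranteed by the standard. The function names are made up for this illustration.

```c
#include <stdio.h>

/* Models `char c = getc(fp)` on a platform where plain char is signed:
   the int returned by getc is narrowed before it is compared with EOF. */
int narrowed_equals_eof(int value_from_getc)
{
    signed char c = (signed char)value_from_getc;
    return c == EOF;
}

/* Models `int c = getc(fp)`: the value is compared exactly as returned. */
int int_equals_eof(int value_from_getc)
{
    return value_from_getc == EOF;
}
```

Under those assumptions, narrowed_equals_eof(0xFF) reports a match, so a copy loop with a char variable would stop at the first 0xFF byte of the image, while int_equals_eof(0xFF) does not.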

Did you open the files in binary mode? What are you passing to fopen?
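For reference, a sketch of the byte-by-byte copy with both points applied might look like the following. The function name copy_bytes and its return convention are assumptions made for this example, not something from the question.

```c
#include <stdio.h>

/* Copies src to dst one byte at a time. Returns the number of bytes
   copied, or -1 if a file cannot be opened or a write fails. */
long copy_bytes(const char *src, const char *dst)
{
    FILE *in = fopen(src, "rb");    /* "b": no newline or Ctrl-Z translation */
    if (in == NULL)
        return -1;
    FILE *out = fopen(dst, "wb");
    if (out == NULL) {
        fclose(in);
        return -1;
    }
    long copied = 0;
    int c;                          /* int, so that byte 0xFF differs from EOF */
    while ((c = getc(in)) != EOF) {
        if (putc(c, out) == EOF) {
            copied = -1;
            break;
        }
        copied++;
    }
    fclose(in);
    if (fclose(out) != 0)           /* buffered data may fail to flush here */
        copied = -1;
    return copied;
}
```

A call such as copy_bytes("source.bmp", "destination.bmp") should then return the size of the source file in bytes.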

It's one of the most "popular" C gotchas.

You should use fread and fwrite, copying a block at a time, with both files opened in binary mode:
FILE *fd1 = fopen("source.bmp", "rb");
FILE *fd2 = fopen("destination.bmp", "wb");
if (!fd1 || !fd2) {
    // handle open error
}
size_t l1;
unsigned char buffer[8192];
// Read a block, then write out exactly as many bytes as were read
while ((l1 = fread(buffer, 1, sizeof buffer, fd1)) > 0) {
    size_t l2 = fwrite(buffer, 1, l1, fd2);
    if (l2 < l1) {
        if (ferror(fd2)) {
            // handle error
        } else {
            // handle media full
        }
    }
}
fclose(fd1);
fclose(fd2);
It's substantially faster to read in bigger blocks. Note that fread/fwrite do no translation of their own; it is the "b" in the fopen mode that keeps \n from being transformed to \r\n in the output (on Windows and DOS) or to \r (on old Macs).

Related

fread is not reading whole file [duplicate]

What translation occurs when writing to a file that was opened in text mode that does not occur in binary mode? Specifically in MS Visual C.
unsigned char buffer[256];
for (int i = 0; i < 256; i++) buffer[i]=i;
int size = 1;
int count = 256;
Binary mode:
FILE *fp_binary = fopen(filename, "wb");
fwrite(buffer, size, count, fp_binary);
Versus text mode:
FILE *fp_text = fopen(filename, "wt");
fwrite(buffer, size, count, fp_text);
I believe that most platforms will ignore the "t" option or the "text-mode" option when dealing with streams. On Windows, however, this is not the case. If you take a look at the description of the fopen() function on MSDN, you will see that specifying the "t" option will have the following effect:
line feeds ('\n') will be translated to '\r\n' sequences on output
carriage return/line feed sequences will be translated to line feeds on input.
If the file is opened in append mode, the end of the file will be examined for a Ctrl-Z character (character 26) and that character removed, if possible. It will also interpret the presence of that character as being the end of file. This is an unfortunate holdover from the days of CP/M (something about the sins of the parents being visited upon their children up to the 3rd or 4th generation). Contrary to previously stated opinion, the Ctrl-Z character will not be appended.
In text mode, a newline "\n" may be converted to a carriage return + newline "\r\n"
Usually you'll want to open in binary mode. Reading binary data in text mode won't work reliably; the data will be corrupted. You can read text fine in binary mode though; it just won't do the automatic translation between "\n" and "\r\n".
See fopen
Additionally, when you fopen a file with "rt", the input is terminated at a Ctrl-Z character.
Another difference is when using fseek
If the stream is open in binary mode, the new position is exactly offset bytes measured from the beginning of the file if origin is SEEK_SET, from the current file position if origin is SEEK_CUR, and from the end of the file if origin is SEEK_END. Some binary streams may not support the SEEK_END.
If the stream is open in text mode, the only supported values for offset are zero (which works with any origin) and a value returned by an earlier call to std::ftell on a stream associated with the same file (which only works with origin of SEEK_SET).
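A minimal sketch of the only portable seek pattern on a text stream, as described above: remember a position with ftell and later hand exactly that value back to fseek with SEEK_SET. The function name and the 64-byte line buffers are assumptions made for this example.

```c
#include <stdio.h>
#include <string.h>

/* Reads the first line, remembers the position, reads the second line,
   seeks back to the remembered position and reads the second line again.
   Returns 1 if both reads match, 0 if they differ, -1 on any error. */
int second_line_rereads_identically(const char *path)
{
    char first[64], second[64], again[64];
    FILE *f = fopen(path, "r");               /* text stream */
    if (f == NULL)
        return -1;
    if (fgets(first, sizeof first, f) == NULL) {
        fclose(f);
        return -1;
    }
    long pos = ftell(f);                      /* opaque token on a text stream */
    if (pos == -1L || fgets(second, sizeof second, f) == NULL) {
        fclose(f);
        return -1;
    }
    if (fseek(f, pos, SEEK_SET) != 0 || fgets(again, sizeof again, f) == NULL) {
        fclose(f);
        return -1;
    }
    fclose(f);
    return strcmp(second, again) == 0;
}
```

The value in pos should not be used for arithmetic on a text stream; on Windows it need not equal the number of characters read so far.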
Even though this question was already answered and clearly explained, I think it would be interesting to show the main issue (translation between \n and \r\n) with a simple code example. Note that I'm not addressing the issue of the Ctrl-Z character at the end of the file.
#include <stdio.h>
#include <string.h>
int main(void) {
    FILE *f;
    char string[] = "A\nB";
    int len;
    len = (int)strlen(string);
    printf("As you'd expect string has %d characters... ", len); /* prints 3 */
    f = fopen("test.txt", "w"); /* Text mode */
    if (f == NULL)
        return 1;
    fwrite(string, 1, len, f); /* On Windows "A\r\nB" is written */
    printf("but %ld bytes were written to file", ftell(f)); /* prints 4 on Windows, 3 on Linux */
    fclose(f);
    return 0;
}
If you execute the program on Windows, you will see the following message printed:
As you'd expect string has 3 characters... but 4 bytes were written to file
Of course, you can also open the file in a text editor such as Notepad++ and see the characters for yourself.
The inverse conversion is performed on Windows when reading the file in text mode.
We had an interesting problem with opening files in text mode where the files had a mixture of line ending characters:
1\n\r
2\n\r
3\n
4\n\r
5\n\r
Our requirement was that we could store our current position in the file (we used fgetpos), close the file, and later reopen the file and seek to that position (we used fsetpos).
However, when a file had a mixture of line endings, this process failed to seek back to the same position. In our case (our tool parses C++), we were re-reading parts of the file we'd already seen.
Go with binary - then you can control exactly what is read and written from the file.
In 'w' mode, the file is opened for writing as a text stream.
In 'wb' mode, the file is opened for writing as a binary stream, so the bytes are written exactly as given. Neither mode by itself selects a character encoding such as UTF-8 or UTF-16LE in standard C.

How could I "manually" signify EOF when writing a file?

I have this function :
int cipher_file(char *file_path, uint8_t *key, int key_size){
FILE *file;
size_t read_char_count, wrote_char_count;
fpos_t *pos = malloc(sizeof(fpos_t));
char *block = malloc(16*sizeof(uint8_t));
if ( !(file = fopen(file_path, "rb+")) ) {
return EXIT_FAILURE;
}
while(!feof(file)){
while( ( read_char_count = fread(block, 1, 16*sizeof(uint8_t), file) ) > 0 ) {
block = cipher_block(block, key, key_size);
fseek(file, -read_char_count, SEEK_CUR);
wrote_char_count = fwrite(block , 1, 16*sizeof(uint8_t), file);
}
}
fclose(file);
return EXIT_SUCCESS;
}
(I know ECB mode is not safe btw)
It takes a file, breaks it down into blocks of 128 bits, encrypts them using AES and writes them back to the file, effectively replacing the plain text with cipher text.
I also wrote a function decipher_file() to decipher the file.
The issue is that if the file size is not a multiple of 128 bits, the final fread() only partially replaces the content of "block" (which is 16 bytes long) with the characters it managed to read, leaving a bunch of garbage from the previously ciphered block.
When deciphering, since decipher_file() normally has no way of knowing the size of the original file, it deciphers all the content, including the garbage characters, and writes it back to the file.
I also tried reinitializing "block" with zeros on each round but, unsurprisingly, the zeros were added to the file too, which can be very problematic.
So my question is, is there a way (like a function) to signify where the file ends, or tell fwrite() to stop writing?
You can't use a special character because the encrypted data might end up looking like it. It would not be an elegant solution anyway.
There are multiple solutions:
Prefix the file with the decrypted content length. That's very clean and easy to implement.
Use a cipher mode that retains length information. ECB does not. Use a padding scheme, or a mode that preserves the length, such as counter mode.
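As an illustration of the padding option, here is a sketch of PKCS#7-style padding for 16-byte blocks. The answer above does not name a specific scheme, so this choice, along with the function names and the BLOCK_SIZE macro, is an assumption. Every padding byte holds the number of padding bytes, which lets the decrypting side recover the original length from the last block.

```c
#include <stddef.h>
#include <string.h>

#define BLOCK_SIZE 16

/* Fills the unused tail of the final block before encryption.
   `used` is the number of data bytes in the block and must be less than
   BLOCK_SIZE; pass 0 to build a whole block of padding when the input
   length is an exact multiple of BLOCK_SIZE. */
void pkcs7_pad(unsigned char block[BLOCK_SIZE], size_t used)
{
    unsigned char pad = (unsigned char)(BLOCK_SIZE - used);
    memset(block + used, pad, pad);
}

/* Inspects the final block after decryption. Returns the number of
   real data bytes it holds, or -1 if the padding is malformed. */
int pkcs7_unpad_len(const unsigned char block[BLOCK_SIZE])
{
    unsigned char pad = block[BLOCK_SIZE - 1];
    if (pad == 0 || pad > BLOCK_SIZE)
        return -1;
    for (size_t i = BLOCK_SIZE - pad; i < BLOCK_SIZE; i++)
        if (block[i] != pad)
            return -1;
    return BLOCK_SIZE - pad;
}
```

With this scheme the encrypted file is always a multiple of 16 bytes, and decipher_file() would truncate the output to drop the padding reported by the last block.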

Open non text file without windows line ending

I took over a project that uses the following function to read files:
char *fetchFile(char *filename) {
    char *buffer;
    long len; /* ftell returns long */
    FILE *f = fopen(filename, "rb");
    if(f) {
        if(verbose) {
            fprintf(stdout, "Opened file %s successfully\n", filename);
        }
        fseek(f, 0, SEEK_END);
        len = ftell(f);
        fseek(f, 0, SEEK_SET);
        if(verbose) {
            fprintf(stdout, "Allocating memory for buffer for %s\n", filename);
        }
        buffer = malloc(len + 1);
        if(buffer) {
            fread (buffer, 1, len, f);
            buffer[len] = '\0'; /* only terminate if the allocation succeeded */
        }
        fclose (f);
    } else {
        fprintf(stderr, "Error reading file %s\n", filename);
        exit(1);
    }
    return buffer;
}
The rb mode is used because sometimes the file can be a spreadsheet and therefore I want the information as in a text file.
The program runs on a Linux machine, but the files to read come from both Linux and Windows.
I am not sure which approach is better for keeping Windows line endings from messing with my code.
I was thinking of using dos2unix at the start of this function.
I also thought of opening in r mode, but I believe that could potentially mess things up when opening non-text files.
I would like to understand better the differences between using:
dos2unix,
r vs rb mode,
or any other solution which would better fit the problem.
Note: I believe that I understand r vs rb modes, but if you could explain why it is a bad or good solution for this specific situation (I think it wouldn't be good because sometimes it opens spreadsheets but I am not sure of that).
If my understanding is correct the rb mode is used because sometimes the file can be a spreadsheet and therefore the programs just want the information as in a text file.
You seem uncertain, and though perhaps you do understand correctly, your explanation does not give me any confidence in that.
C knows about two distinct kinds of streams: binary streams and text streams. A binary stream is simply an ordered sequence of bytes, written and / or read as-is without any kind of transformation. On the other hand,
A text stream is an ordered sequence of characters composed into
lines, each line consisting of zero or more characters plus a
terminating new-line character. Whether the last line requires a
terminating new-line character is implementation-defined. Characters
may have to be added, altered, or deleted on input and output to
conform to differing conventions for representing text in the host
environment. Thus, there need not be a one-to-one correspondence
between the characters in a stream and those in the external
representation. [...]
(C2011 7.21.2/2)
For some implementations, such as POSIX-compliant ones, this is a distinction without a difference. For other implementations, such as those targeting Windows, the difference matters. In particular, on Windows, text streams convert on the fly between carriage-return / line-feed pairs in the external representation and newlines (only) in the internal representation.
The b in your fopen() mode specifies that the file should be opened as a binary stream -- that is, no translation will be performed on the bytes read from the file. Whether this is the right thing to do depends on your environment and the application's requirements. This is moot on Linux or another Unix, however, as there is no observable difference between text and binary streams on such systems.
dos2unix converts carriage-return / line-feed pairs in the input file to single line-feed (newline) characters. This will convert a Windows-style text file or one with mixed Windows / Unix line terminators to Unix text file convention. It is irreversible if there are both Windows-style and Unix-style line terminators in the file, and it is furthermore likely to corrupt your file if it is not a text file in the first place.
If your inputs are sometimes binary files then opening in binary mode is appropriate, and conversion via dos2unix probably is not. If that's the case and you also need translation for text-file line terminators, then you first and foremost need a way to distinguish which case applies for any particular file -- for example, by command-line argument or by pre-analyzing the file via libmagic. You then must provide different handling for text files; your main options are
Perform the line terminator conversion in your own code.
Provide separate versions of the fetchFile() function for text and binary files.
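The first option could be sketched as an in-place pass over the buffer returned by fetchFile(), applied only to files already known to be text. The function name crlf_to_lf is an assumption made for this example.

```c
#include <stddef.h>

/* Rewrites buf in place so that every CR LF pair becomes a single LF.
   A lone CR is left untouched. Returns the new length, which is never
   larger than len. The caller re-terminates the buffer if it is a string. */
size_t crlf_to_lf(char *buf, size_t len)
{
    size_t out = 0;
    for (size_t in = 0; in < len; in++) {
        if (buf[in] == '\r' && in + 1 < len && buf[in + 1] == '\n')
            continue;               /* drop the CR of a CR LF pair */
        buf[out++] = buf[in];
    }
    return out;
}
```

After the call, the caller would set buffer[new_len] = '\0' to keep the result usable as a C string.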
The code just copies the contents of a file to an allocated buffer. The UNIX way (YMMV) is to just memory map the file instead of reading it. Much faster.
// untested code
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>
void* mapfile(const char *name)
{
    int fd;
    struct stat st;
    if ((fd = open(name, O_RDONLY)) == -1)
        return NULL;
    if (fstat(fd, &st)) {
        close(fd);
        return NULL;
    }
    /* mmap takes the file descriptor first, then the offset */
    void *p = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);
    if (p == MAP_FAILED)
        p = NULL;
    return p;
}
Something along these lines will work. Adjust settings if you want to write to the file as well.

Is there any way to create dummy file descriptor in linux?

I have opened one file with following way:
fp = fopen("some.txt","r");
Now, the first few bytes of this file, let's say 40 bytes, are unnecessary junk data, so I want to remove them. But I cannot delete that data from the file, modify the file, or create a duplicate of the file without that data.
So I want to create another dummy FILE pointer which points to the file, such that when I pass this dummy pointer to another function that performs the following operation:
fseek ( dummy file pointer , 0 , SEEK_SET );
the file position is set to the 40th byte of my some.txt.
But the function accepts a file descriptor, so I need to pass a file descriptor that treats the file as if those first 40 bytes were never in it.
In short, that dummy descriptor should treat the file as if those 40 bytes were not in it, and all positioning operations should be relative to the 40th byte, counting it as the 1st byte.
Easy.
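#define CHAR_8_BIT (0)
#define CHAR_16_BIT (1)
#define BIT_WIDTH (CHAR_8_BIT)
#define OFFSET (40)
FILE* fp = fopen("some.txt","r");
FILE* dummy = NULL;
#if (BIT_WIDTH == CHAR_8_BIT)
long skip = OFFSET * (long)sizeof(char);
#else
long skip = OFFSET * (long)sizeof(wchar_t);
#endif
/* fseek returns 0 on success, not a FILE pointer */
if (fp != NULL && fseek(fp, skip, SEEK_SET) == 0)
    dummy = fp; /* same stream, now positioned past the junk */
The SEEK_SET macro indicates the beginning of the file, and depending on whether you are using 8-bit characters (ASCII) or 16-bit characters (e.g. Unicode) you step 40 characters forward from the beginning of the file. fseek only returns a status code, so dummy is simply set to the same stream once it has been repositioned; a later fseek(dummy, 0, SEEK_SET) will still go back to the real start of the file.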
Good luck!
These links will likely be helpful as well:
char vs wchar_t
http://www.cplusplus.com/reference/clibrary/cstdio/fseek/
If you want, you can just convert a file descriptor to a file pointer via the fdopen() call.
http://linux.die.net/man/3/fdopen
fseek ( dummy file pointer , 0 , SEEK_SET );
In short that dummy pointer should treat the file as there is no that 40 byte in that file and all position should be with respect to that 40th byte as counting as it is 1st byte.
You have conflicting requirements, you cannot do this with the C API.
SEEK_SET always refers to the absolute position in the file, which means if you want that command to work, you have to modify the file and remove the junk.
On Linux you could write a FUSE driver that would present the file as if it started at the 40th byte, but that's a lot of work. I only mention this because it makes it possible to solve the problem you've created, but it would be quite silly to actually do this.
The simplest thing of course would be just to abandon this emulating layer idea you're looking for, and write code that can handle that extra header junk.
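One way such header-aware code could look is a pair of thin wrappers that the rest of the program calls instead of fseek and ftell. The names payload_seek, payload_tell and HEADER_SIZE are assumptions made for this sketch, and it only helps for functions you are able to change.

```c
#include <stdio.h>

#define HEADER_SIZE 40L   /* junk bytes at the start of the file, per the question */

/* Like fseek, but SEEK_SET offsets are counted from the end of the header.
   SEEK_CUR and SEEK_END are passed through unchanged, so a large negative
   SEEK_CUR offset can still land inside the header. */
int payload_seek(FILE *fp, long offset, int whence)
{
    if (whence == SEEK_SET)
        offset += HEADER_SIZE;
    return fseek(fp, offset, whence);
}

/* Like ftell, but reports the position relative to the end of the header. */
long payload_tell(FILE *fp)
{
    long pos = ftell(fp);
    return pos < 0 ? pos : pos - HEADER_SIZE;
}
```

The stream should be opened in binary mode so that arbitrary byte offsets are valid on every platform.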
If you want to remove the first 40 bytes of a file on disk without creating another file, you can copy the content from the 41st byte onwards into a buffer, then write it back 40 bytes earlier, starting at the beginning of the file. Then use ftruncate (a POSIX function declared in unistd.h) to truncate the file to (filesize - 40) bytes.
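A rough sketch of that procedure using POSIX calls, copying in chunks rather than through one large buffer. The function name drop_prefix and the 4096-byte chunk size are assumptions made for this example, and the file is left in a damaged state if the process is interrupted halfway.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for pread, pwrite and ftruncate */
#endif
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Removes the first n bytes of the file in place.
   Returns 0 on success, -1 on error or if the file is shorter than n. */
int drop_prefix(const char *path, off_t n)
{
    char buf[4096];
    struct stat st;
    int fd = open(path, O_RDWR);
    if (fd == -1)
        return -1;
    if (fstat(fd, &st) == -1 || st.st_size < n) {
        close(fd);
        return -1;
    }
    off_t rd = n, wr = 0;          /* read ahead of the write position */
    ssize_t got;
    while ((got = pread(fd, buf, sizeof buf, rd)) > 0) {
        if (pwrite(fd, buf, (size_t)got, wr) != got) {
            close(fd);
            return -1;
        }
        rd += got;
        wr += got;
    }
    if (got < 0 || ftruncate(fd, st.st_size - n) == -1) {
        close(fd);
        return -1;
    }
    return close(fd);
}
```

Because the write position always trails the read position by n bytes, no data is overwritten before it has been read.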
I wrote a small piece of code based on what I understood from your question.
#include<stdio.h>
void readIt(FILE *afp)
{
char mystr[100];
while ( fgets (mystr , 100 , afp) != NULL )
puts (mystr);
}
int main()
{
FILE * dfp = NULL;
FILE * fp = fopen("h4.sql","r");
if(fp != NULL)
{
fseek(fp,10,SEEK_SET);
dfp = fp;
readIt(dfp);
fclose(fp);
}
}
The readIt() function reads the file starting from the 11th byte.
Is this what you are expecting or something else?
I haven't actually tried this, but I think you should be able to use mmap (with the MAP_SHARED option) to get your file mapped into your address space, and then fmemopen to get a FILE* that refers to all but the first 40 bytes of that buffer.
This gives you a FILE* (as you describe in the body of your question), but I believe not a file descriptor (as in the title and elsewhere in the question). The two are not the same, and AFAIK the FILE* created with fmemopen does not have an associated file descriptor.
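A sketch of the mmap plus fmemopen idea for read-only access follows. The struct and function names are assumptions made for this example. The whole file is mapped, because mmap offsets must be page-aligned, and the stream is then opened over the bytes that follow the prefix.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for fmemopen */
#endif
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

struct skipped_view {
    void *map;      /* start of the mapping, needed later for munmap */
    size_t len;     /* length of the mapping */
    FILE *stream;   /* read-only stream over the bytes after the prefix */
};

/* Opens path so that v->stream behaves as if the first `skip` bytes did
   not exist. Returns 0 on success, -1 on failure. */
int open_skipped(const char *path, size_t skip, struct skipped_view *v)
{
    struct stat st;
    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;
    if (fstat(fd, &st) == -1 || (size_t)st.st_size <= skip) {
        close(fd);
        return -1;
    }
    v->len = (size_t)st.st_size;
    v->map = mmap(NULL, v->len, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);                      /* the mapping stays valid after close */
    if (v->map == MAP_FAILED)
        return -1;
    v->stream = fmemopen((char *)v->map + skip, v->len - skip, "r");
    if (v->stream == NULL) {
        munmap(v->map, v->len);
        return -1;
    }
    return 0;
}

/* Closes the stream first, then releases the mapping it reads from. */
void close_skipped(struct skipped_view *v)
{
    fclose(v->stream);
    munmap(v->map, v->len);
}
```

With this, fseek(v.stream, 0, SEEK_SET) positions the stream at what is byte 40 of the underlying file, which matches the behaviour asked for, though only for code that accepts a FILE*.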
