I have a question about how to recover jpg images (it's an assignment for CS50).
My code works for the most part (I believe); however, I only get a bunch of thumbnails when I open the JPGs I found.
I've been trying to solve this exercise for quite some time now, but I can't figure out why it doesn't work. Could somebody give me a push in the right direction?
Here is my code (also available at http://pastebin.com/U2pwJd5e):
#include <stdio.h>
#include <stdlib.h>
#include "bmp.h"
int main(int argc, char* argv[])
{
//get input file
char* infile = "card.raw";
// open card file
FILE* inptr;
inptr = fopen("card.raw", "r");
// error checking (copied from copy.c)
if (inptr == NULL)
{
printf("Could not open %s.\n", infile);
return 2;
}
// initialize buffer
BYTE buffer[512];
//initialize jpg variables:
int increment = 0;
char outfilename[8];
// while the end of the file is not reached, continue process & write to buffer next block of 512 bytes
while (fread(buffer, 512, 1, inptr) != 0)
{
// if the inpointer is not empty
if(inptr != NULL)
{
// If the block of 512 bytes starts with markers
if(buffer[0] == 0xff && buffer[1] == 0xd8 && buffer[2] == 0xff && (buffer[3]== 0xe1 || buffer[3]== 0xe0))
{
// increase file number by 1
sprintf(outfilename,"%.3d.jpg", ++increment);
// open new file
FILE* outptr;
outptr = fopen(outfilename, "a");
// write first block of 512 bytes, then read next block
fwrite(buffer, 512, 1, outptr);
if(fread(buffer, 512, 1, inptr) == 0)
break;
// copy all information from inpointer to buffer to jpg
while((buffer[0] != 0xff && buffer[1] != 0xd8 && buffer[2] != 0xff && (buffer[3]!= 0xe1 || buffer[3]!= 0xe0) ))
{
// if next byte is NULL break
if(fread(buffer, 512, 1, inptr) == 0)
break;
fread(buffer, 512, 1, inptr);
//copies jpg file 1 byte at a time
fwrite(buffer, 512, 1, outptr);
}
// close file
fclose(outptr);
}
}
}
return 0;
}
There's a lot of missing information in your problem but I can see a few potential problems:
You only check for the JPEG signature at the start of each 512-byte block. Unless you are guaranteed that every JPEG starts on a block boundary, you should probably search the entire memory block for the start of a JPEG file.
You only check for JPEG files that start with FFD8FFE1 or FFD8FFE0. What if the marker that follows FFD8 in the JPEG is not FFE1/FFE0?
The following check in the condition of your inner while loop is not correct:
(buffer[3] != 0xe1 || buffer[3] != 0xe0)
This is always true as buffer[3] can't be both 0xE1 and 0xE0 at the same time. This should be:
(buffer[3] != 0xe1 && buffer[3] != 0xe0)
Your checking for the end of the JPEG image probably doesn't do what you wanted:
while ( buffer[0] != 0xff && buffer[1] != 0xd8 &&
buffer[2] != 0xff && buffer[3] != 0xe1 && buffer[3]!= 0xe0 )
This ends the JPEG as soon as any one of those values appears in its position at the start of the 512-byte block. For example, the bytes 01D80203 would end the JPEG since buffer[1] != 0xd8 is false, even though this is not a JPEG start marker.
I would think that finding the end of a JPEG file would require you to search the entire memory block for the FFD9 byte marker, which signifies the end of a JPEG file. If I understand the JPEG format correctly, an FFD9 byte combination can only occur at the end of a valid JPEG file. (An embedded thumbnail, if present, is a JPEG in its own right and carries its own FFD9.)
If you still run into issues I would create a test file composed of several known JPEG files and other data. You can then directly compare what is being output to what you know should be output to narrow down where/what is causing the problem.
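To make the point about the loop condition concrete, here is a minimal sketch. The helper names (starts_jpeg, keep_copying, keep_copying_as_posted) are made up for illustration and do not appear in the question. It assumes, as the question does, that a JPEG begins with FF D8 FF followed by E0 or E1 at the start of a block. The idea is to write the signature test once and negate it as a whole, instead of negating each comparison separately.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

// Hypothetical helper: true when a block begins with FF D8 FF followed by
// E0 or E1, the two cases the question tests for.
bool starts_jpeg(const unsigned char *block)
{
    assert(block != NULL);
    return block[0] == 0xff && block[1] == 0xd8 && block[2] == 0xff
        && (block[3] == 0xe0 || block[3] == 0xe1);
}

// "Keep copying" condition: the whole signature test is negated once.
bool keep_copying(const unsigned char *block)
{
    return !starts_jpeg(block);
}

// The condition as posted, for comparison: every comparison is negated
// separately and joined with &&, so a single matching byte stops the copy.
bool keep_copying_as_posted(const unsigned char *block)
{
    assert(block != NULL);
    return block[0] != 0xff && block[1] != 0xd8 && block[2] != 0xff
        && (block[3] != 0xe1 || block[3] != 0xe0);
}
```

With a block that starts 01 D8 02 03, keep_copying says to continue while keep_copying_as_posted says to stop, which is the premature end of file described above.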
I am working on problem set 4, "Memory", and trying to understand fread() and how calling fread() inside a while loop that is itself driven by fread() works. The outer while loop reads the file until end of file. When I find a JPEG signature, I would like to keep reading from the file until I find the next JPEG signature. My question is how the two separate calls to fread() interact. Once I find the signature and start reading with fread() inside the inner loop, and then return to the outer loop after leaving my if block, does the outer while(fread()) pick up where the inner fread() left off? Please see the code below:
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
// using keyword typedef to give the uint8_t a new name of BYTE (capitalized by convention)
typedef uint8_t BYTE;
int main(int argc, char *argv[])
{
// must be two arguments or program not run correctly
if(argc != 2)
{
printf("Usage: ./recover filename\n");
return 1;
}
// open file and store its location in a pointer called infile
FILE *infile = fopen(argv[1], "r");
if(infile == NULL)
{
printf("file cannot be opened or doesnt exist...\n");
return 1;
}
// read using fread(), each 512 byte block into a buffer
//need a buffer of size 512 BYTEs
BYTE buffer[512];
while(fread(&buffer, sizeof(BYTE), 512, infile))
{
// create buffer to store a filename with a formatted string of ###.jpg starting at 000.jpg
int number = 0;
char filename[8];
sprintf(filename, "%03i.jpg", number);
// this demarks a JPEG using bitwise logical & for last buffer bit of signature
if(buffer[0] == 0xff && buffer[1] == 0xd8 && buffer[2] == 0xff && (buffer[3] & 0xf0) == 0xe0)
{
// this will be the first signature read and start of a JPEG
if(number == 0)
{
// open new file
FILE *img = fopen(filename, "w");
// write what is currently in buffer into file
fwrite(&buffer, sizeof(BYTE), 512, img);
// continue reading from file where left off and write it to img file until a new file signature?
while(fread(&buffer, sizeof(BYTE), 512, infile))
{
fwrite(&buffer, sizeof(BYTE), 512, img);
if(buffer[0] == 0xff && buffer[1] == 0xd8 && buffer[2] == 0xff && (buffer[3] & 0xf0) == 0xe0)
{
break;
}
}
My question is about the second call to fread() while inside the while loop and how that call to fread() affects the call to fread() for the initial while loop. Hope this made sense.
I think you have figured out the solution in the comments.
You can also use functions and recursion here.
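To answer the question about the two fread() calls directly: both calls use the same FILE pointer, and a stream has a single position, so the outer call continues from wherever the inner call stopped. The sketch below demonstrates this on a temporary file. The function names and the idea of filling block k with the byte value k are assumptions made for the demonstration, not part of the assignment.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define BLOCK 512

// Fills a temporary file with `count` blocks; every byte of block k is k.
// Returns the stream rewound to the start, or NULL on failure.
FILE *make_numbered_blocks(int count)
{
    assert(count >= 0 && count < 256);
    FILE *f = tmpfile();
    if (f == NULL)
    {
        return NULL;
    }
    unsigned char block[BLOCK];
    for (int k = 0; k < count; k++)
    {
        memset(block, k, BLOCK);
        if (fwrite(block, 1, BLOCK, f) != BLOCK)
        {
            fclose(f);
            return NULL;
        }
    }
    rewind(f);
    return f;
}

// Mimics the nested loops: the outer loop reads one block, the inner loop
// reads `inner_reads` more blocks from the same stream. Returns the number
// of the block the outer loop sees on its next read, or -1 at end of file.
int next_outer_block(FILE *f, int inner_reads)
{
    assert(f != NULL && inner_reads >= 0);
    unsigned char block[BLOCK];
    if (fread(block, 1, BLOCK, f) != BLOCK)     // outer read: block 0
    {
        return -1;
    }
    for (int i = 0; i < inner_reads; i++)       // inner reads: blocks 1..inner_reads
    {
        if (fread(block, 1, BLOCK, f) != BLOCK)
        {
            return -1;
        }
    }
    if (fread(block, 1, BLOCK, f) != BLOCK)     // outer loop's next read
    {
        return -1;
    }
    return block[0];
}
```

One consequence for the code in the question, as far as I can tell: the inner loop writes each block before testing it, so the header block of the next JPEG ends up in the previous file, and the outer loop then reads the block after that header and never sees the signature.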
My program finds all 50 JPEGs. However, the images are 'incomplete'. Some are half recovered, some less than a quarter, and some are completely grey. When I inspect one of the JPEGs that appears grey in a hex viewer, it seems to contain mostly colored pixels, which makes no sense. I've attempted to find the bug from all angles.
Can anyone please help me understand why my images are 'corrupted'?
// 000.jpg
int j = 0;
// storing 000.jpg
char jpeg[8];
// buffer
unsigned char bf[512];
// FILE pointer
FILE *img = NULL;
// reading into memory card file
while(fread(bf, 512, 1, inptr) == 1)
{
if (bf[0] == 0xff && bf[1] == 0xd8 && bf[2] == 0xff && (bf[3] & 0xf0) == 0xe0)
{
sprintf(jpeg, "%03i.jpg", j);
img = fopen(jpeg, "w");
do
{
fwrite(bf, 512, 1, img);
fread(bf, 512, 1, inptr);
}
while(bf[0] != 0xff && bf[1] != 0xd8 && bf[2] != 0xff && (bf[3] & 0xf0) != 0xe0);
fclose(img);
j++;
// fread is going to read it again
fseek(inptr, -512, SEEK_CUR);
}
}
fclose(inptr);
return 0;
}
Make sure you're passing the 'b' flag to fopen. For example:
fopen(jpeg, "wb");
Otherwise the standard library may automatically convert line endings for you (e.g. \n is turned into \r\n on Windows), which corrupts binary files. On POSIX systems such as Linux the flag has no effect, so adding it there is harmless but will not change the output.
See the documentation for fopen:
With the mode specifiers above the file is open as a text file. In order to open a file as a binary file, a "b" character has to be included in the mode string.
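As a small illustration of the advice above, the sketch below writes a few bytes with "wb", reads them back with "rb", and checks that nothing changed. The function name and the file path passed to it are placeholders chosen for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

// Writes `len` bytes to `path` in binary mode and reads them back.
// Returns 1 if the file holds exactly the same bytes, 0 otherwise.
int binary_round_trip(const char *path, const unsigned char *data, size_t len)
{
    assert(path != NULL && data != NULL && len <= 512);

    FILE *out = fopen(path, "wb");          // "b": no newline translation
    if (out == NULL)
    {
        return 0;
    }
    size_t written = fwrite(data, 1, len, out);
    if (fclose(out) != 0 || written != len)
    {
        return 0;
    }

    FILE *in = fopen(path, "rb");
    if (in == NULL)
    {
        return 0;
    }
    unsigned char back[513];                // one spare byte to detect extra data
    size_t got = fread(back, 1, sizeof back, in);
    fclose(in);
    return got == len && memcmp(back, data, len) == 0;
}
```

Passing a buffer that contains 0x0a bytes is the interesting case, since those are the bytes a text-mode stream is allowed to translate.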
I’m a very novice programmer and I’ve encountered an issue whose nature I don’t understand while working on a problem set of the excellent CS50 course. I have implemented a program to recover JPEG pictures from an image of a memory card, and I break at end of file as follows:
if(file > 1)
{
if (fread(&buffer, 1, 512, in_pointer) != 512)
{
free(filename);
return 0;
}
else
fseek(in_pointer, -512, SEEK_CUR);
}
(The pictures fill the card in 512-byte blocks.) When I first implemented this it broke my first picture (it was recognizable but distorted), so I excluded it by means of the first if statement. Now, however, the middle files of the set are slightly off: they still open as JPEGs, but I can’t get their thumbnails to work. My hypothesis is that I am corrupting the JPEG file format header. The first and last images of the set work perfectly.
My questions are:
What is an elegant way to implement an EOF break, since my makeshift solution is causing trouble?
What is the likely nature of the problem I’ve created (in layman’s terms)?
Thank you very much,
Tikhon
ps here is the whole thing
#include <cs50.h>
#include <stdio.h>
#include <stdbool.h>
#include <stdint.h>
int main(int argc, char *argv[])
{
// ensure proper usage
if (argc != 2)
{
fprintf(stderr, "enter exactly two command line arguments: ./recover and destination of disc to scan\n");
return 1;
}
//name the file
char *infile = argv[1];
//open card file and ensure proper format
FILE *in_pointer = fopen(infile, "r");
if (in_pointer == NULL)
{
fprintf(stderr, "could not open %s\n", infile);
return 2;
}
typedef uint8_t BYTE;
BYTE buffer[512];
bool new_jpeg = false;
int block = 0;
int file = 0;
char *filename = malloc(3);
//sprintf(filename, "%03i.jpg",1);
do
{
//read a 512 block of a jpeg
fread(&buffer, 512, 1, in_pointer);
//check for new jpeg
if (buffer[0] == 0xff &&
buffer[1] == 0xd8 &&
buffer[2] == 0xff &&
(buffer[3] & 0xf0) == 0xe0) // took me a while to figure this out
{
new_jpeg = true;
//printf("jpeg found, block %i\n", block);
}
block++;
} while(new_jpeg == false);
do
{
//set name of file to write to
sprintf(filename, "%03i.jpg",file);
file++;
new_jpeg = false;
// open output file
FILE *img = fopen(filename, "w");
if (img == NULL)
{
fprintf(stderr, "Could not create %s.\n", filename);
return 3;
}
//add blocks to the file until we reach the next JPEG.
do
{
fwrite(&buffer, 1, 512, img);
//read the next block
fread(&buffer, 1, 512, in_pointer);
//There MUST be a better way... Anyhow, this checks for end of file but backtracks because the act of checking moved the file cursor forward...
if(file > 1)
{
if (fread(&buffer, 1, 512, in_pointer) != 512)
{
free(filename);
return 0;
}
else
fseek(in_pointer, -512, SEEK_CUR);
}
block++; //we are reading off the next block
if (buffer[0] == 0xff &&
buffer[1] == 0xd8 &&
buffer[2] == 0xff &&
(buffer[3] & 0xf0) == 0xe0) // took me a while to figure this out
{
new_jpeg = true;
//printf("jpeg %i found, block %i\n", file, block);
}
}while(new_jpeg == false);
}while(!feof(in_pointer));
free(filename);
//ran valgrind no probs detected.
}
OK, I fixed it.
Instead of having a whole separate section of code to check for EOF, I killed two birds with one stone and fread the file WHILE checking for EOF:
//read the next block
int k = fread(&buffer, 1, 512, in_pointer);
if(k != 512)
{
free(filename);
return 0;
}
I still have no idea why my previous method didn't work. I would be extremely grateful for suggestions...
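One possible explanation for the earlier behaviour, offered as a reading of the posted code and not a confirmed diagnosis: the extra fread used for the EOF check reads into the same buffer, so it overwrites the block that was read just before it. The fseek restores the file position but not the buffer contents, so the first time the check runs, one block is never written to the output. Separately, malloc(3) is too small for "000.jpg" plus its terminator, which needs 8 bytes. The sketch below reproduces the peek pattern in isolation, next to the single-read pattern from the fix. The function names are invented for this example.

```c
#include <assert.h>
#include <stdio.h>

#define BLOCK 512

// Reproduces the "peek, then seek back" pattern: reads one block, peeks at
// the following one into the same buffer, then seeks back. Stores the first
// byte left in the buffer in *in_buffer and the first byte of the block that
// the next fread returns in *next_read. Returns 1 on success, 0 otherwise.
int peek_and_seek_back(FILE *f, int *in_buffer, int *next_read)
{
    assert(f != NULL && in_buffer != NULL && next_read != NULL);
    unsigned char buffer[BLOCK];

    if (fread(buffer, 1, BLOCK, f) != BLOCK)    // the block we meant to keep
        return 0;
    if (fread(buffer, 1, BLOCK, f) != BLOCK)    // the peek overwrites it
        return 0;
    if (fseek(f, -BLOCK, SEEK_CUR) != 0)        // position restored, buffer not
        return 0;
    *in_buffer = buffer[0];

    if (fread(buffer, 1, BLOCK, f) != BLOCK)
        return 0;
    *next_read = buffer[0];
    return 1;
}

// The pattern from the fix: one fread per block, whose return value doubles
// as the end-of-file test. Returns the number of complete blocks read.
long count_full_blocks(FILE *f)
{
    assert(f != NULL);
    unsigned char buffer[BLOCK];
    long blocks = 0;
    while (fread(buffer, 1, BLOCK, f) == BLOCK)
    {
        blocks++;
    }
    return blocks;
}
```

On a file whose blocks are filled with 0, 1, 2, ..., peek_and_seek_back leaves block 1 in the buffer after reading block 0, and the next fread returns block 1 again. Block 0 is gone unless it was written before the peek.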
I am trying to read a .raw file, recover the JPG files in it, and create 50 of them. My code compiles, but the output images do not display, though I do have all 50 JPG files.
I have successfully created 50 JPG files with names from 000.jpg to 049.jpg. When trying to open them, I get this message:
Error interpreting JPEG image file (Improper call to JPEG library in state 201)
I believe I am correctly making sure that each file is closed before I open another one.
Here is my code:
#define JPEG1 0xff 0xd8 0xff 0xe0
#define JPEG2 0xff 0xd8 0xff 0xe1
#define BLOCK 512
int main(int argc, char* argv[])
{
// long enough to store the name of a jpeg file
char jpeg_name[4];
// where we are going to store our data
BYTE buffer[512];
// open the picture file
FILE* file = fopen("card.raw", "r");
// error checking
if (file == NULL)
{
printf("File could not be opened");
return 2;
}
// how many jpegs we have at any one time
int jpeg_num = 0;
// check if we're open
int open = 0;
// the outfile we will use for all jpeg files
FILE* jpeg = NULL;
// do this until we can't come up with a full 512; fread returns the number of elements it successfully read
// no need for the address operator on buffer because it's an array
while (fread(buffer, sizeof(BYTE), BLOCK, file) == BLOCK)
{
// this will help us count and name files
int i = 0;
// is this the beginning of a jpeg file?
if ((buffer[0] == 0xff && buffer[1] == 0xd8 && buffer[2] == 0xff) && (buffer[3] == 0xe0 || buffer[3] == 0xe1))
{
// is there a jpeg file already open?
//if (fopen("jpeg_name", "r") != NULL)
if(open == 1)
{
fclose(jpeg);
open = 0;
}
// name the jpegs we find
sprintf(jpeg_name, "%03d.jpg", jpeg_num+i);
// open the jpeg from sprintf
jpeg = fopen(jpeg_name, "w");
open = 1;
// error checking
if (jpeg == NULL)
{
printf("JPEG file could not be created");
return 1;
}
// write to our file
fwrite(buffer, sizeof(BYTE), BLOCK, jpeg);
// increment counter
i++;
jpeg_num += 1;
}
}
if(jpeg)
{
fclose(jpeg);
}
fclose(file);
return 0;
}
Let's see.
The size of the jpeg_name variable is 4 but you are writing 8 bytes into it: 001.jpg(null)
You are reading everything but only writing the first block of each JPEG style file (assuming your header is correct).
Are you quite sure the binary string 0xff 0xd8 0xff 0xe0 will not occur in the random binary inside a JPEG?
I have the same programming assignment. One thing I noticed in the code is that the fourth byte of a jpeg can range from 0xe0 to 0xef. In your code I only see
buffer[3] == 0xe0 || buffer[3] == 0xe1
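Putting the points from these answers together, a recovery loop could look roughly like the sketch below. It is not the official solution. It assumes that every JPEG starts on a 512-byte block boundary and runs until the next signature, and the prefix parameter is a hypothetical addition so the caller can choose where the files go. The name buffer is large enough for the formatted name, every block is written while a file is open, and the fourth byte is matched against the whole 0xe0 to 0xef range.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define BLOCK 512

// Recovers JPEGs from `card` into <prefix>000.jpg, <prefix>001.jpg, ...
// Returns the number of files created, or -1 on an I/O error.
int recover(FILE *card, const char *prefix)
{
    assert(card != NULL && prefix != NULL);

    uint8_t buffer[BLOCK];
    char name[256];
    FILE *jpeg = NULL;      // NULL means "no output file open yet"
    int count = 0;

    while (fread(buffer, 1, BLOCK, card) == BLOCK)
    {
        // FF D8 FF En, with n anywhere from 0 to f
        int is_start = buffer[0] == 0xff && buffer[1] == 0xd8 &&
                       buffer[2] == 0xff && (buffer[3] & 0xf0) == 0xe0;
        if (is_start)
        {
            if (jpeg != NULL)
            {
                fclose(jpeg);
            }
            int len = snprintf(name, sizeof name, "%s%03d.jpg", prefix, count);
            if (len < 0 || (size_t) len >= sizeof name)
            {
                return -1;  // name would not fit in the buffer
            }
            jpeg = fopen(name, "wb");
            if (jpeg == NULL)
            {
                return -1;
            }
            count++;
        }
        // Write every block while a file is open, not only the header block.
        if (jpeg != NULL && fwrite(buffer, 1, BLOCK, jpeg) != BLOCK)
        {
            fclose(jpeg);
            return -1;
        }
    }
    if (jpeg != NULL)
    {
        fclose(jpeg);
    }
    return count;
}
```

A call such as recover(file, "") would write 000.jpg, 001.jpg and so on into the current directory. Blocks that appear before the first signature are skipped because no file is open yet.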
I'm trying to copy 50 jpegs, one by one from a large .raw file, however currently I get a segmentation fault error. Here's my code:
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
typedef uint8_t BYTE;
//SOI - 0xFF 0xD8
//EOI - 0xFF 0xD9
//APPn - 0xFF 0xEn
int main(void)
{
//FAT - 512 bytes per block
BYTE block[512];
//open file containing pictures
FILE* card_file = fopen("card.raw", "rd");
FILE* jpeg_file;
//make sure the file opened without errors
if (card_file == NULL)
{
printf("something went wrong and file could not be opened");
return 1;
}
int i = 0;
while (fread(&block, sizeof(BYTE), 512, card_file) != 0)
{
//jpeg start signature
if(block[0] == 0xFF && block[1] == 0xD8)
{
i++;
if(jpeg_file != NULL)
fclose(jpeg_file);
//create a new jpeg file to copy bytes to
jpeg_file = fopen((char*)i, "w+");
}
//write 512 bytes to a jpeg file
if(jpeg_file != NULL)
fwrite(block, sizeof(block), 1, jpeg_file);
}
fclose(card_file);
return 0;
}
When I run it through GDB, my code gets all the way to if(block[0] == 0xFF && block[1] == 0xD8), then it skips the condition and a segmentation fault occurs. I don't see what might be causing this.
Code updated:
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <cs50.h>
typedef uint8_t BYTE;
/*struct jpg*/
/*{*/
/* BYTE soi[2] = { 0xFF, 0xD8 };*/
/* BYTE eoi[2] = { 0xFF, 0xD9 };*/
/*};*/
//SOI - 0xFF 0xD8
//EOI - 0xFF 0xD9
//APPn - 0xFF 0xEn
int main(void)
{
//FAT - 512 bytes per block
BYTE block[512];
//jpeg name
char name[6];
bool is_open = false;
//JPEG
//struct jpg image;
//open file containing pictures
FILE* card_file = fopen("card.raw", "r");
FILE* jpeg_file;
//make sure the file opened without errors
if (card_file == NULL)
{
printf("something went wrong and file could not be opened");
return 1;
}
int i = 0;
while (fread(block, sizeof(BYTE), 512, card_file) != 0)
{
//jpeg start signature
if ((block[0] == 0xFF) && (block[1] == 0xD8) && (block[2] == 0xFF) && ((block[3] == 0xe1) || (block[3] == 0xe0)))
{
//assign jpeg name
sprintf(name, "%d.jpg", i++);
if(is_open)
fclose(jpeg_file);
//create a new jpeg file to copy bytes to
jpeg_file = fopen(name, "a+");
is_open = true;
}
//write 512 bytes to a jpeg file
if(is_open)
fwrite(block, sizeof(block), 1, jpeg_file);
}
fclose(jpeg_file);
fclose(card_file);
return 0;
}
Now it doesn't crash, however only 9 out of 50 jpegs are properly recovered. cs50.h is there just so I have access to bool type. What's a better way to write 50 files? I seem to have a logical flaw with my booleans.
fopen((char*)i, "w+"); is completely invalid. You are casting an integer as a pointer, which is going to crash.
You need to format the number as a filename:
char path[PATH_MAX];   /* PATH_MAX is defined in <limits.h> */
snprintf(path, sizeof path, "%d", i);
jpeg_file = fopen(path, "w+");
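You are also not initializing jpeg_file. If the signature condition is false on the first block, jpeg_file is still an uninitialized (wild) pointer when it is compared against NULL and passed to fclose or fwrite, which can also crash. You should initialize jpeg_file to NULL.
In your fread call, pass the array itself rather than &block. Both expressions yield the same address, but block decays to a pointer to its first element, which is the conventional form. Hence, the statement should be fread(block, sizeof(BYTE), 512, card_file).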
Postscript:
Your code assumes that the size of the input file is an integral multiple of 512, which need not be the case for JPEG files. The last fread might return a number less than 512, which your implementation needs to handle. Hence, the number of elements to write should be determined by the return value of fread.
You would need to close the jpeg_file pointer after the loop terminates.
Last, because you are working with JPEG, you may want to handle a case for EXIF files with thumbnails. In this case, you would get 2 SOI (start of image) markers.
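A short sketch of the point about fread's return value, with one more observation about the updated code: name[6] has room for "9.jpg" and its terminator, but "10.jpg" needs 7 bytes, which may be part of why only the first few files come out intact. The helper names below are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>

#define BLOCK 512

// Copies `in` to `out`, writing exactly as many bytes as each fread
// returned, so a final partial block is neither padded nor dropped.
// Returns the total number of bytes copied, or -1 on a write error.
long copy_all(FILE *in, FILE *out)
{
    assert(in != NULL && out != NULL);
    unsigned char block[BLOCK];
    long total = 0;
    size_t n;
    while ((n = fread(block, 1, BLOCK, in)) > 0)
    {
        if (fwrite(block, 1, n, out) != n)
        {
            return -1;
        }
        total += (long) n;
    }
    return total;
}

// Formats "<i>.jpg" into `name`. Returns 1 if the whole name, terminator
// included, fit into `size` bytes, and 0 if it had to be truncated.
int format_name(char *name, size_t size, int i)
{
    assert(name != NULL && size > 0);
    int len = snprintf(name, size, "%d.jpg", i);
    return len >= 0 && (size_t) len < size;
}
```

Note that the element size passed to fread and fwrite here is 1 and the count is the number of bytes. With fwrite(block, sizeof(block), 1, ...) as in the question, a partial block cannot be expressed at all.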