I'm driving myself crazy trying to figure out what is happening with my code.
I'm currently on CS50's pset4, the Recover challenge.
For those who don't know what it is about:
We're given a file called card.raw that contains some deleted photos. Our task is to implement a program that does a bit of forensics (ideally) and recovers the lost photos.
Here is my code:
#include <stdio.h>
#include <stdint.h>
int main(int argc, char *argv[])
{
if (argc != 2)
{
fprintf(stderr, "Usage: ./recover file\n");
return 1;
}
//declaring pointer infile and giving the address of argv[1];
char *infile = argv[1];
//Opening file
FILE *raw_data;
raw_data = fopen(infile, "r");
//Checking for NULL error
if(raw_data == NULL)
{
fprintf(stderr, "Could not open file.\n");
return 2;
}
uint8_t buffer[512]; //Declaring an array of 512 unsigned bytes
int counter = 0; //Declaring counter for counting JPEG files
FILE *outfile; //Setting pointer named outfile for printing here
char filename[8]; //declaring 'filename' variable for storing the file's name
//While we can read blocks of 512 bytes from raw_data (the file opened from infile) into buffer:
while (fread(buffer, 512, 1, raw_data))
{
//Condition for tracking the first bytes that form a JPEG file
if(buffer[0] == 0xff &&
buffer[1] == 0xd8 &&
buffer[2] == 0xff &&
(buffer[3] & 0xf0) == 0xe0)
{
if(counter == 0) //If this is the 1st file, then name the file with
//counter value with 3 digits (%03d)
{
sprintf(filename, "%03d.jpg", counter); // And 3 digits (%i3)
outfile = fopen(filename, "w"); //Open file named outfile in write mode
counter++;
}
else //If this is not the first JPG opened, firstly close the
{ // current open file, and then open a new one with the
fclose(outfile); // current counter value and 3 digits for its name
sprintf(filename, "%03d.jpg", counter);
outfile = fopen(filename, "w"); //Open file named 'outfile' in write mode
counter++;
}
}
fwrite(buffer, 1, sizeof(buffer), outfile); /* Write function that takes buffer data (aka the
pointer to the array of elements to be written),
writes sizeof(buffer) (512) elements of 1 byte each,
and writes them to the output, aka 'outfile' */
}
fclose(outfile); //Remember to close the last file once we get out of the while-loop
}
Here's the tricky part:
I've successfully recovered all the problem images.
But if I run the code several times, say 5 times, I end up with a segmentation fault.
When I run check50, I get the following message (I will attach an image showing both the segmentation fault after some successful runs and the check50 verdict).
I just can't get it. I suppose there might be some trouble with memory, but I just don't know what it is.
Thank you very much for your time and your help, guys. Stack Overflow is always such a nice place to seek guidance.
EDIT
If I run echo $? right after the segmentation fault occurs, I get a value of 139.
EDIT
Just as @Thomas Dickey has pointed out, the program was writing to a file regardless of whether one had been opened yet.
I've cleaned up my code a bit and added an if condition to fix it.
Here's the solution:
#include <stdio.h>
#include <stdint.h>
int main(int argc, char *argv[])
{
if (argc != 2)
{
fprintf(stderr, "Usage: ./recover file\n");
return 1;
}
//declaring pointer infile and giving the address of argv[1];
char *infile = argv[1];
//Opening file
FILE *raw_data;
raw_data = fopen(infile, "r");
//Checking for NULL error
if(raw_data == NULL)
{
fprintf(stderr, "Could not open file.\n");
return 2;
}
uint8_t buffer[512]; //Declaring an array of 512 unsigned bytes
int counter = 0; //Declaring counter for counting JPEG files
FILE *outfile = NULL; //Output file pointer; NULL until the first JPEG is found
char filename[8]; //Declaring 'filename' for storing the file's name ("000.jpg" plus the null terminator)
//While we can read blocks of 512 bytes from raw_data (the file opened from infile) into buffer:
while (fread(buffer, 512, 1, raw_data))
{
//Condition for tracking the first bytes that form a JPEG file
if(buffer[0] == 0xff &&
buffer[1] == 0xd8 &&
buffer[2] == 0xff &&
(buffer[3] & 0xf0) == 0xe0)
{
if(counter != 0)
{
fclose(outfile); //If this is not the first JPG opened, close previous file
}
sprintf(filename, "%03d.jpg", counter); //print stream to 'filename' the value of 'counter' in 3 digits
outfile = fopen(filename, "w"); //Open file named outfile in write mode
counter++; //Add 1 to counter
}
if(counter != 0) //Don't start writing on a file until the first jpeg is found
{
fwrite(buffer, sizeof(buffer), 1, outfile); /* - Write function that takes buffer data
(aka the array of elements to be written) ,
- Write a block of 512 bytes of elements
(aka the size of buffer),
- 1 block of 512 bytes at a time,
- And it writes it to the output, aka 'outfile' */
}
}
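if(counter != 0) //Only close an output file if one was actually opened
{
fclose(outfile); //Remember to close the last file once we get out of the while-loop
}
fclose(raw_data); //Close the input file as well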
return 0;
}
The program only opens the output file if the header looks okay, but writes to the output regardless. If the input does not start with a JPEG header, fwrite is called on an uninitialized FILE pointer, which is undefined behavior and typically crashes.
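A minimal sketch of the guard this answer implies, assuming a hypothetical helper named write_block (it is not part of the original program): keep the output pointer at NULL until a file has actually been opened, and refuse to write while it is NULL.

```c
#include <stdint.h>
#include <stdio.h>

#define BLOCK_SIZE 512

/* Hypothetical helper: write one block only if an output file is open.
   Returns the number of blocks written (0 or 1). */
size_t write_block(FILE *out, const uint8_t block[BLOCK_SIZE])
{
    if (out == NULL)
    {
        return 0; /* no JPEG header seen yet, so there is nothing to write to */
    }
    return fwrite(block, BLOCK_SIZE, 1, out);
}
```

In the read loop, the output pointer would start as NULL, be assigned by fopen when a header is found, and be passed to write_block for every block.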
I solved pset4 Recover in CS50. Although I solved the problem, I am confused about the difference between "buffer" and "&buffer". Please pay attention to "LINE AAAAA" and "LINE BBBBB".
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
// New type to store a byte of data
typedef uint8_t BYTE;
// Number of "block size" 512
const int BLOCK_SIZE = 512;
int main(int argc, char *argv[])
{
// Check for 2 command-line arguments
if (argc != 2)
{
printf("Usage: ./recover IMAGE\n");
return 1;
}
// Open card.raw file
FILE *input = fopen(argv[1],"r");
// Check for fail to open
if (input == NULL)
{
printf("Couldn't open the file.\n");
return 1;
}
// Buffer that holds one 512-byte block
BYTE buffer[BLOCK_SIZE];
// Count image
int count_image = 0;
// Assign NULL to output_file
FILE *output = NULL;
// Declare filename
char filename[8];
// LINE AAAAA
while (fread(buffer, sizeof(BYTE), BLOCK_SIZE, input) == BLOCK_SIZE)
{
// Check first 4 bytes for JPEG file format
if (buffer[0] == 0xff && buffer[1] == 0xd8 && buffer[2] == 0xff && (buffer[3] & 0xf0) == 0xe0)
{
// Only for first image, generate new filename and write into it
if (count_image == 0)
{
// Generate a new file with sequence name
sprintf(filename, "%03i.jpg", count_image);
// Open new file
output = fopen(filename, "w");
// LINE BBBBB
fwrite (&buffer, sizeof(BYTE), BLOCK_SIZE, output);
// Add filename counter
count_image++;
}
// For subsequence new repeat JPG images
// Close output file, generate new file, and write into it
else if (count_image > 0)
{
// fclose current writing files
fclose (output);
// Generate a new file with sequence name
sprintf(filename, "%03i.jpg", count_image);
// Open new file
output = fopen(filename, "w");
// LINE BBBBB
fwrite (&buffer, sizeof(BYTE), BLOCK_SIZE, output);
// Add filename counter
count_image++;
}
}
// Not fulfill the 4 bytes JPG condition, keep writing to the same filename
else if (count_image > 0)
{
// LINE BBBBB
fwrite (&buffer, sizeof(BYTE), BLOCK_SIZE, output);
}
}
fclose(output);
fclose(input);
}
Question:
Why do we use "&buffer" in LINE BBBBB instead of "buffer"?
I know LINE AAAAA uses "buffer" because BYTE buffer[BLOCK_SIZE] is a pointer or an array, so "buffer" means the location it refers to.
When buffer is used in an expression, it usually converts to the address of its first element (type BYTE *). &buffer is the address of the whole array (type BYTE (*)[BLOCK_SIZE]).
Those 2 addresses will compare equal, yet have different types. When the type is important, use the matching one. Enable all warnings to help identify incorrect type usage.
Since fread() and fwrite() take void *, either will work.
Conceptually, use buffer in this case to match sizeof(BYTE) and BLOCK_SIZE,
or
fread(&buffer, sizeof buffer, 1, input) == 1
fwrite(buffer, ...) will work fine.
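To make the "same address, different type" point concrete, here is a small sketch. The function names are hypothetical and only serve the illustration: both expressions yield the same address, but adding 1 moves by one element in one case and by the whole array in the other.

```c
#include <stddef.h>
#include <stdint.h>

#define BLOCK_SIZE 512

typedef uint8_t BYTE;

/* Returns 1 when buffer (decayed) and &buffer refer to the same address. */
int same_address(void)
{
    BYTE buffer[BLOCK_SIZE];
    return (void *) buffer == (void *) &buffer;
}

/* Bytes covered by adding 1 to the decayed pointer (type BYTE *). */
size_t element_step(void)
{
    BYTE buffer[BLOCK_SIZE];
    return (size_t) ((char *) (buffer + 1) - (char *) buffer);
}

/* Bytes covered by adding 1 to &buffer (type BYTE (*)[BLOCK_SIZE]). */
size_t array_step(void)
{
    BYTE buffer[BLOCK_SIZE];
    return (size_t) ((char *) (&buffer + 1) - (char *) &buffer);
}
```

Here element_step() gives 1 and array_step() gives 512, which is why the type matters as soon as pointer arithmetic or a typed parameter is involved, even though fread() and fwrite() accept both.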
Output:
:( recovers 000.jpg correctly
failed to execute program due to segmentation fault
:( recovers middle images correctly
failed to execute program due to segmentation fault
:( recovers 049.jpg correctly
failed to execute program due to segmentation fault
Code:
#include <stdio.h>
#include <stdint.h>
int main(int argc, char *argv[])
{
if (argc != 2) //to make sure that accept exactly one command-line argument
{
printf("Usage: ./recover key\n");
return 1;
}
FILE *infile = fopen(argv[1], "r"); //open the file card.raw and creating a new file called f in read format
if (infile == NULL)//if file cannot open then print below if can open just continue
{
printf("Cannot open file\n");
return 2;
}
FILE *img; //img is the output
int jpeg_counter = 0; // to count the no. of jpeg files
uint8_t buffer[512]; //cos 512 bytes and the buffer is the temporary storage
char filename[8];
while (fread(buffer, sizeof(buffer), 1, infile) == 512)
//continue doing this loop if the while conditions are true. to repeat until end of card.like while the file you reading is true,
{
if (buffer[0] == 0xff && buffer[1] == 0xd8 && buffer[2] == 0xff && (buffer[3] & 0xf0) == 0xe0)
//removing last 4 bits of the 8 bits, only looking at they first 4 which is e. setting all to 0
//if start of new jpeg with above conditions
{
if (jpeg_counter != 0) // telling them that you previously found jpeg
{
fclose(img);//else if never find before, tell them now that you have found it by making it true
}
sprintf(filename, "%03i.jpg", jpeg_counter); //%03i means print an integer with 3 digits
jpeg_counter ++;
img = fopen(filename , "w"); //open the new file w for writting
if (img == NULL) //see if can remove this
return 3;
fwrite(buffer, sizeof(buffer), 1, img);// writing new output file
}
if (jpeg_counter != 0)
fwrite(buffer, sizeof(buffer), 1, img);
}
fclose(infile);
fclose(img);
return 0;
}
The segfault is caused by fclose(img) being called while img is an invalid FILE pointer. The problem is that your while loop condition is never true, so the loop body never runs and img is never assigned. fread returns the number of items read, so with a size of sizeof(buffer) and a count of 1 it returns 1 on a successful read, never 512. I have fixed the loop condition for you and added some printfs to print out more information so you have a better understanding of what happens. Here is a link to the fixed code in our cloud IDE; you can use it to debug segfaults in the future.
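The return-value rule can be checked with a small experiment. This is only a sketch: items_read is a hypothetical helper that writes 512 bytes to a temporary file and reports what a single fread call returns for a given size and count.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical demo: write 512 bytes to a temporary file, rewind, and
   return what one fread(buffer, size, count, f) call reports.
   Returns (size_t) -1 if the request does not fit the 512-byte buffer
   or the temporary file cannot be created. */
size_t items_read(size_t size, size_t count)
{
    uint8_t buffer[512] = {0};
    if (size == 0 || count > sizeof(buffer) / size)
    {
        return (size_t) -1;
    }
    FILE *f = tmpfile();
    if (f == NULL)
    {
        return (size_t) -1;
    }
    fwrite(buffer, sizeof(buffer), 1, f);
    rewind(f);
    size_t n = fread(buffer, size, count, f); /* number of items, not bytes */
    fclose(f);
    return n;
}
```

items_read(512, 1) reports 1 and items_read(1, 512) reports 512, so the value compared against in the loop condition has to match the count argument, not the number of bytes.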
I am writing code that reads information from a memory card (card.raw is the one we are provided, but the code uses user input) and extracts the JPEGs from it using the signature that JPEGs start with (0xff, 0xd8, 0xff, 0x00 - 0xff). I am getting a segmentation fault because I am using malloc, but I don't see where I went wrong. I am pasting my code here; any help would be appreciated.
#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>
#include <stdint.h>
typedef uint8_t BYTE;
int main(int argc, char *argv[])
{
//check terminal usage
if (argc != 2)
{
printf("Usage: ./recover image");
return 1;
}
//open inputted file and check for valid file
FILE *file = fopen(argv[1], "r");
if (!file)
{
printf("Invalid or missing file.");
return 1;
}
BYTE *buff = malloc(512 * sizeof(BYTE));
int counter = 0;
FILE *image = NULL;
char *name = malloc(8 * sizeof(char));
//loop till end of file reached and read a block of input
while(fread(buff, sizeof(BYTE), 512, file) == 1 && !feof(file))
{
bool foundJPEG = buff[0] == 0xff && buff[1] == 0xd8 && buff[2] == 0xff && ((buff[3] & 0xf0) == 0xe0);
//check if found jpeg, and open file for writing
if (foundJPEG)
{
sprintf(name, "%03i.jpg", counter);
image = fopen(name, "w");
}
//if image file open, write to it
if (image != NULL)
{
fwrite(buff, sizeof(BYTE), 512, image);
}
//if found a jpeg already, close it so new one can be written
if (foundJPEG && image != NULL)
{
fclose(image);
counter++;
}
}
free(name);
free(buff);
fclose(image);
fclose(file);
return 0;
}
There are three issues with the code above which are not mentioned in the comments:
The return value of fread is not 1 but 512 upon a successful read. You exchanged the parameters for the block size and the block count (see the fread definition). Therefore the while loop is never entered.
Don't try to save space by packing too much code into one statement. It would be better to separate the checks for the fread return value and for EOF and to use a do ... while() loop instead. Then you would have had the chance to see this issue in the debugger. This is exactly what I did and how I found it.
The second issue is that you close the image after rescuing the first 512 bytes, but you do not reset the file pointer image back to NULL along with the fclose statement.
As a consequence, the code would repeatedly write to a file that is already closed until a new block with a JPEG header is found.
The third issue is that you only rescue the first 512 bytes of the JPEG but not the whole JPEG. You need to scan the input stream for the JPEG end indicator FF D9 and copy bytes until it is found (see the JPEG format).
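For the second issue, one way to keep the pointer and the file state in sync is a tiny helper that closes the file and resets the pointer in one step. The name close_and_clear is hypothetical; this is a sketch, not the asker's code.

```c
#include <stdio.h>

/* Hypothetical helper: close *fp if it is open and reset it to NULL,
   so later "if (image != NULL)" checks never see a stale pointer. */
void close_and_clear(FILE **fp)
{
    if (*fp != NULL)
    {
        fclose(*fp);
        *fp = NULL;
    }
}
```

Calling close_and_clear(&image) right before opening the next image, and once more after the loop, also makes the final fclose safe when no JPEG was ever found.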
I am writing a program in C to recover images from a raw file for CS50 and I am having a strange problem. I have a variable int cnt that I was using for debug purposes and I got the program to work so I was removing leftover debug code. But when I remove the cnt declaration I start outputting corrupt files.
Before removing line 25 below, I was outputting .jpg files that I could open and view. Then I removed the line, recompiled, deleted the photos from the last run, and reran the program on the same .raw data, and the new files I got were unrecognized. So I put the declaration back in, recompiled, deleted the old photos, ran the program again, and got good files. Does anyone know why removing an unused declaration is messing with my results? The offending declaration (int cnt = 0;) is on line 25.
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
int main(int argc, char *argv[])
{
if (argc != 2)
{
printf("Usage: ./recover image\n");
return 1;
}
int filesFound = 0;
FILE *inFile = fopen(argv[1], "r");
FILE *outFile = NULL;
if (inFile == NULL)
{
printf("Image file could not be opened\n");
return 1;
}
uint8_t buffer[512];
int cnt = 0;
while (!feof(inFile))
{
fread(buffer, 512, 1, inFile);
// check for start of jpg file
if (buffer[0] == 0xff && buffer[1] == 0xd8 && buffer[2] == 0xff && (buffer[3] & 0xf0) == 0xe0)
{
// start of jpg was found
if (outFile != NULL)
{
// close the current file and then open a new file to write to
fclose(outFile);
outFile = NULL;
}
// open a file to write to
char fName[4];
sprintf(fName, "%03i.jpg", filesFound);
outFile = fopen(fName, "w");
filesFound++;
}
if (outFile != NULL){
// we have found data to write and opened a file
fwrite(buffer, 512, 1, outFile);
}
}
//Be sure to close my files
fclose(inFile);
if (outFile != NULL)
{
fclose(outFile);
}
return 0;
}
char fName[4] does not have sufficient room for the name generated by "%03i.jpg", so you are overrunning the buffer. Make it larger and use snprintf, not sprintf, and test the return value to detect errors:
int result = snprintf(fName, sizeof fName, "%03i.jpg", filesFound);
if (sizeof fName <= result)
{
fprintf(stderr, "Internal error, buffer is too small for file name.\n");
exit(EXIT_FAILURE);
}
Instead of printing an error, you could use the return value of snprintf, which indicates the length needed, to allocate memory for a larger buffer and then redo the snprintf with that buffer.
(Note that snprintf may return a negative result if an error occurs. Normally, this will become a large number upon conversion to size_t for the comparison, so it will trigger this error message. However, in a robust program, you might want to insert a separate test for result < 0.)
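The allocate-and-retry idea could look roughly like this. make_file_name is a hypothetical helper, and the sketch assumes the caller is responsible for freeing the result.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: build "NNN.jpg" in a heap buffer sized by snprintf.
   Returns NULL on error; the caller must free() the result. */
char *make_file_name(int index)
{
    /* First pass: a NULL buffer with size 0 only reports the length needed. */
    int needed = snprintf(NULL, 0, "%03i.jpg", index);
    if (needed < 0)
    {
        return NULL;
    }
    size_t size = (size_t) needed + 1; /* +1 for the terminating null byte */
    char *name = malloc(size);
    if (name == NULL)
    {
        return NULL;
    }
    /* Second pass: format into the correctly sized buffer. */
    snprintf(name, size, "%03i.jpg", index);
    return name;
}
```

For a program that only ever produces a few hundred names, a fixed array that is simply large enough is probably the simpler choice; the two-pass form is mainly useful when the length is not known in advance.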
I am learning how to code and I have no prior experience at all. I've successfully got to PSET4 and am stuck on Recover. I've read everything online about this problem, and I found out that many people have code similar to mine and it works. It does not work for me at all. Please have a look and give me a hint about what I did wrong and how to correct it.
Here is everything about pset4 Recover. I downloaded their card.raw from here: card.raw
/** recovering JPEG files from a memory card
*
*/
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
typedef uint8_t BYTE;
int main(int argc, char* argv[])
{
// ensure proper usage
if (argc != 2)
{
fprintf(stderr,
"Usage: ./recover infile (the name of a forensic image from which to recover JPEGs)\n");
return 1;
}
// open input file (forensic image)
FILE* inptr = fopen(argv[1], "r");
if (inptr == NULL)
{
fprintf(stderr, "Could not open %s.\n", argv[1]);
return 2;
}
FILE* outptr = NULL;
// create a pointer array of 512 elements to store 512 bytes from the memory card
BYTE* buffer = malloc(sizeof(BYTE) * 512);
if (buffer == NULL)
{
return 3;
}
// count amount of jpeg files found
int jpeg = 0;
// string for a file name using sprintf
char filename[8] = { 0 };
// read memory card untill the end of file
while (fread(buffer, sizeof(BYTE) * 512, 1, inptr) != 0)
{
// check if jpeg is found
if (buffer[0] == 0xff && buffer[1] == 0xd8 && buffer[2] == 0xff
&& (buffer[3] >= 0xe0 || buffer[3] <= 0xef))
{
if (jpeg > 0)
{
fclose(outptr);
}
sprintf(filename, "%03d.JPEG", jpeg);
outptr = fopen(filename, "w");
jpeg++;
}
if (jpeg > 0)
{
fwrite(buffer, sizeof(BYTE) * 512, 1, outptr);
}
}
// free memory
free(buffer);
// close filename
fclose(outptr);
// close input file (forensic image)
fclose(inptr);
return 0;
}
The main problem is that you invoke undefined behavior because filename is not big enough. With your format string, sprintf() needs between 9 and 17 bytes, but you only have 8, so you have a buffer overflow.
Just change:
char filename[8] = { 0 };
to
char filename[17] = { 0 };
Because you use an int, the range of values is implementation-defined, but on many systems an int has 32 bits. The possible values are then between -2^31 and 2^31 - 1, which gives a maximum of 11 characters (-2147483648). Add the 5 characters of ".JPEG" to get 16, plus the terminating null byte of a C string that you forgot, for a maximum of 17.
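The 9 and 17 figures can be reproduced by letting snprintf count. In this sketch, name_size_for is a hypothetical helper, and the 64-byte scratch buffer is an assumption chosen to be comfortably larger than any realistic int width.

```c
#include <limits.h>
#include <stdio.h>

/* Bytes needed for "%03d.JPEG" with the given value, including the
   terminating null byte. The 64-byte scratch buffer is an assumption
   that comfortably covers int widths up to 64 bits. */
int name_size_for(int value)
{
    char scratch[64];
    int length = snprintf(scratch, sizeof scratch, "%03d.JPEG", value);
    return length + 1;
}
```

name_size_for(0) gives 9 ("000.JPEG" plus the null byte), and on a system with a 32-bit int, name_size_for(INT_MIN) gives 17.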
Modern compilers warn you about this, for example gcc version 7.1.1 20170516 (GCC):
In function ‘main’:
warning: ‘sprintf’ writing a terminating nul past the end of the destination [-Wformat-overflow ]
sprintf(filename, "%03d.JPEG", jpeg++);
^
note: ‘sprintf’ output between 9 and 17 bytes into a destination of size 8
sprintf(filename, "%03d.JPEG", jpeg++);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Also, your typedef is unnecessary: a char is always one byte in C, but what you actually need here is an octet (exactly 8 bits), and uint8_t is always an octet. So you can use uint8_t directly without a typedef.
One more thing: you allocate your buffer dynamically, but that is unnecessary because the buffer has a constant size. Declaring an array is simpler.
#include <stdint.h>
#include <stdio.h>
int main(int argc, char *argv[]) {
if (argc != 2) {
fprintf(stderr, "Usage: ./recover infile (the name of a forensic image "
"from which to recover JPEGs)\n");
return 1;
}
FILE *inptr = fopen(argv[1], "r");
if (inptr == NULL) {
fprintf(stderr, "Could not open %s.\n", argv[1]);
return 2;
}
FILE *outptr = NULL;
uint8_t buffer[512];
size_t const buffer_size = sizeof buffer / sizeof *buffer;
size_t jpeg = 0;
while (fread(buffer, sizeof *buffer, buffer_size, inptr) == buffer_size) {
if (buffer[0] == 0xff && buffer[1] == 0xd8 && buffer[2] == 0xff &&
buffer[3] == 0xe0) {
if (outptr != NULL) {
fclose(outptr);
}
char filename[26];
sprintf(filename, "%03zu.JPEG", jpeg++);
outptr = fopen(filename, "w");
}
if (outptr != NULL) {
fwrite(buffer, sizeof *buffer, buffer_size, outptr);
}
}
if (outptr != NULL) {
fclose(outptr);
}
fclose(inptr);
}
Note: This example is clearly not perfect; it would be better to write a real parser for JPEG files to get better control flow. Here we assume that everything goes right.
How do you know that an instance of a JPEG image will always end with '\n'? Or better, how do you know that the size of a JPEG image will be an exact multiple of 512?
You don't know.
So the posted code needs to calculate the actual size OR use some method to make the last call to fread() for any specific JPEG instance stop reading at the end of that image.
Then the check for the ID bytes of the next JPEG image will find the next image.
Otherwise, the start of the next image is already written to the prior output file and the check for a new image will fail.
In general, this will result in the last created file containing more than one image.
This link, https://en.wikipedia.org/wiki/JPEG_File_Interchange_Format, is a web page that describes the format of a JPEG file.
On every digital camera that I have used, the SD card has a directory of all the files.
I suggest using that directory and the information in the linked web page to find each JPEG image and to determine when the end of that image has been encountered (i.e., the 0xFF 0xD9 marker).
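A naive version of the end-marker search might look like the sketch below. jpeg_length is a hypothetical name, and the approach is an assumption-laden simplification: a real JPEG can contain an embedded thumbnail with its own 0xFF 0xD9, so a robust tool would walk the segment structure instead of taking the first match.

```c
#include <stddef.h>
#include <stdint.h>

/* Naive scan for the JPEG end-of-image marker (0xFF 0xD9).
   Returns the number of bytes up to and including the marker,
   or 0 if the marker does not occur in data[0..len). */
size_t jpeg_length(const uint8_t *data, size_t len)
{
    for (size_t i = 0; i + 1 < len; i++)
    {
        if (data[i] == 0xff && data[i + 1] == 0xd9)
        {
            return i + 2;
        }
    }
    return 0;
}
```

Note that a marker split across two 512-byte blocks would be missed if each block were scanned separately, so the scan needs to run over a buffer that spans block boundaries.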