My recovered images don't match the originals in CS50 recover (C)

The problem is to recover some JPEGs from a .raw file.
When I run check50 I get "recovered image does not match":
:) recover.c exists.
:) recover.c compiles.
:) handles lack of forensic image
:( recovers 000.jpg correctly –
recovered image does not match
:( recovers middle images correctly –
recovered image does not match
:( recovers 015.jpg correctly –
015.jpg not found
I tried hard to identify the problem, but every time I fail to find where it is. I hope someone can and will give me a piece of advice.
#include <stdio.h>
#include <stdint.h>
int main(int argc, char *argv[]){
if(argc != 2){
fprintf(stderr, "Usage: ./recover image");
return 1;
}
//open file
FILE *inptr = fopen(argv[1], "r");
if (inptr == NULL){
fprintf(stderr, "Could not open %s.\n", argv[1]);
return 2;
}
int foundjpg = 0;
char filename[10];
int x=1;
//repeat until end of the card
while(x == 1){
//buffer
unsigned char buf[512];
x = fread(buf, 512, 1, inptr);
//read into buffer
fread(buf, 512, 1, inptr);
FILE *jpg = fopen(filename, "w");
//start of a new jpg?
if(buf[0]== 0xff && buf[1] == 0xd8 && buf[2] == 0xff && (buf[3] & 0xf0) == 0xe0 ){
if(jpg != NULL){// yes i found before
fclose(jpg);
sprintf(filename, "%03i.jpg" ,foundjpg );
foundjpg++;
jpg = fopen(filename, "w");
}
else{
sprintf(filename, "%03i.jpg" ,foundjpg );
jpg = fopen(filename , "w");
foundjpg++;
}
}
//already found a jpg?
if(jpg != NULL && foundjpg > 0){
fwrite(buf, 1, 512, jpg);
}
}
fclose(inptr);
// success
return 0;
}

The order in which you do things is quite confused and leads to errors. For example:
filename isn't initialised when you use it for the first time, in the fopen at the top of the loop.
You call fread twice per iteration, so the block from the first read is overwritten before it is ever checked or written, and every other 512-byte block is lost.
You reopen filename with "w" on every iteration, which truncates the image you are currently writing and leaks the previous file handle.
You should re-organise your code so that it does one thing after another in a natural way. The program might look like this:
Check command line arguments
Open the raw file
Main loop:
Read fixed-size block. If it can't be read, exit the loop
Check if first bytes identify a jpg and if so:
Create file name
Open jpg file for writing
Write block and close jpg file
Increment block counter
Close raw file
You must decide how you handle errors. Do you just skip erroneous blocks or do you abort the program?
It is also not clear whether all images are 512 bytes long, which seems improbable. Perhaps you must read the actual image size from the header and then copy the whole image.
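The outline above can be sketched as follows. This is a minimal sketch, not the reference solution: it assumes, as the CS50 problem statement describes, that each JPEG occupies consecutive 512-byte blocks until the next signature appears, so the output file stays open across blocks instead of being closed after each one. The names recover_jpegs and BLOCK_SIZE and the dir parameter are hypothetical, introduced only for illustration.

```c
#include <stdio.h>

#define BLOCK_SIZE 512

/* Hypothetical helper: copies every JPEG found in `raw` to dir/000.jpg,
 * dir/001.jpg, ... Returns how many images were started, or -1 on an
 * I/O error. */
int recover_jpegs(FILE *raw, const char *dir)
{
    unsigned char block[BLOCK_SIZE];
    char filename[512];
    FILE *jpg = NULL;   /* stays open across blocks of the same image */
    int found = 0;

    /* Read exactly once per iteration; a short read ends the loop. */
    while (fread(block, 1, BLOCK_SIZE, raw) == BLOCK_SIZE)
    {
        int is_start = block[0] == 0xff && block[1] == 0xd8 &&
                       block[2] == 0xff && (block[3] & 0xf0) == 0xe0;

        if (is_start)
        {
            if (jpg != NULL)
            {
                fclose(jpg);    /* the previous image is complete */
            }
            snprintf(filename, sizeof filename, "%s/%03i.jpg", dir, found);
            jpg = fopen(filename, "wb");
            if (jpg == NULL)
            {
                return -1;
            }
            found++;
        }

        /* Blocks before the first signature are skipped. */
        if (jpg != NULL && fwrite(block, 1, BLOCK_SIZE, jpg) != BLOCK_SIZE)
        {
            fclose(jpg);
            return -1;
        }
    }

    if (jpg != NULL)
    {
        fclose(jpg);
    }
    return found;
}
```

A caller would open the raw file with fopen(argv[1], "rb"), pass it to recover_jpegs together with "." as the output directory, and close it afterwards. Creating the file name before incrementing the counter keeps the first image at 000.jpg.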


fgets statement reads first line and not sure how to modify because I have to return a pointer [duplicate]

I need to copy the contents of a text file to a dynamically-allocated character array.
My problem is getting the size of the contents of the file; Google reveals that I need to use fseek and ftell, but for that the file apparently needs to be opened in binary mode, and that gives only garbage.
EDIT: I tried opening in text mode, but I get weird numbers. Here's the code (I've omitted simple error checking for clarity):
long f_size;
char* code;
size_t code_s, result;
FILE* fp = fopen(argv[0], "r");
fseek(fp, 0, SEEK_END);
f_size = ftell(fp); /* This returns 29696, but file is 85 bytes */
fseek(fp, 0, SEEK_SET);
code_s = sizeof(char) * f_size;
code = malloc(code_s);
result = fread(code, 1, f_size, fp); /* This returns 1045, it should be the same as f_size */
The root of the problem is here:
FILE* fp = fopen(argv[0], "r");
argv[0] is your executable program, NOT the parameter. It certainly won't be a text file. Try argv[1], and see what happens then.
You cannot determine the size of a file in characters without reading the data, unless you're using a fixed-width encoding.
For example, a file in UTF-8 which is 8 bytes long could be anything from 2 to 8 characters in length.
That's not a limitation of the file APIs, it's a natural limitation of there not being a direct mapping from "size of binary data" to "number of characters."
If you have a fixed-width encoding then you can just divide the size of the file in bytes by the number of bytes per character. ASCII is the most obvious example of this, but if your file is encoded in UTF-16 and you happen to be on a system which treats UTF-16 code units as the "native" internal character type (which includes Java, .NET and Windows) then you can predict the number of "characters" to allocate as if UTF-16 were fixed width. (UTF-16 is variable width because Unicode characters above U+FFFF are encoded as two code units, but a lot of the time developers ignore this.)
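To make the byte-versus-character gap concrete, here is a small sketch that counts code points in a UTF-8 buffer. The function name utf8_length is hypothetical, and the sketch assumes the input is already valid UTF-8; it does no validation.

```c
#include <stddef.h>

/* Counts code points in a UTF-8 buffer by skipping continuation bytes
 * (those of the form 10xxxxxx). Assumes the input is valid UTF-8. */
size_t utf8_length(const unsigned char *bytes, size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
    {
        if ((bytes[i] & 0xC0) != 0x80)
        {
            count++;
        }
    }
    return count;
}
```

Eight bytes of ASCII give a count of 8, while eight bytes holding two 4-byte sequences give a count of 2, which is why the character count cannot be known without reading the data.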
I'm pretty sure argv[0] won't be a text file.
Give this a try (haven't compiled this, but I've done this a bazillion times, so I'm pretty sure it's at least close):
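#include <stdio.h>
#include <stdlib.h>
char* readFile(const char* filename)
{
    FILE* file = fopen(filename, "rb"); /* binary mode, so ftell counts bytes */
    if(file == NULL)
    {
        return NULL;
    }
    fseek(file, 0, SEEK_END);
    long int size = ftell(file);
    rewind(file);
    /* one extra zeroed byte keeps the result NUL-terminated */
    char* content = (size < 0) ? NULL : calloc((size_t)size + 1, 1);
    if(content != NULL)
    {
        fread(content, 1, (size_t)size, file);
    }
    fclose(file); /* close the file before returning */
    return content; /* caller must free(); NULL on failure */
}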
If you're developing for Linux (or other Unix-like operating systems), you can retrieve the file-size with stat before opening the file:
#include <stdio.h>
#include <sys/stat.h>
int main() {
struct stat file_stat;
if(stat("main.c", &file_stat) != 0) {
perror("could not stat");
return (1);
}
printf("%lld\n", (long long) file_stat.st_size); /* st_size is an off_t and may not fit in an int */
return (0);
}
EDIT: Now that I see the code, I have to agree with the other posters:
The array that takes the arguments from the program-call is constructed this way:
[0] name of the program itself
[1] first argument given
[2] second argument given
[n] n-th argument given
You should also check argc before using any element of the argv array other than argv[0]:
if (argc < 2) {
printf ("Usage: %s arg1", argv[0]);
return (1);
}
argv[0] is the path to the executable and thus argv[1] will be the first user-submitted input. Try changing that and adding some simple error checking, such as checking whether fp == NULL, and we might be able to help you further.
You can open the file, move the cursor to the end of the file, store the offset, and go back to the top of the file; that offset is the size.
You can use fseek for text files as well.
fseek to end of file
ftell the offset
fseek back to the beginning
and you have the size of the file
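The steps above can be sketched as a small helper. The name file_size is hypothetical. The sketch assumes the stream was opened in binary mode ("rb"), because for a text stream the value returned by ftell is not guaranteed to be a byte count; it also relies on SEEK_END working for binary streams, which holds on POSIX systems and Windows although the C standard does not strictly require it.

```c
#include <stdio.h>

/* Returns the size in bytes of a stream opened in binary mode ("rb"),
 * or -1 on failure. The read position is reset to the start. */
long file_size(FILE *fp)
{
    if (fseek(fp, 0, SEEK_END) != 0)
    {
        return -1;
    }
    long size = ftell(fp);
    if (fseek(fp, 0, SEEK_SET) != 0)
    {
        return -1;
    }
    return size;
}
```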
Kind of hard with no sample code, but fstat (or stat) will tell you how big the file is. You allocate the memory required, and slurp the file in.
Another approach is to read the file a piece at a time and extend your dynamic buffer as needed:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define PAGESIZE 128
int main(int argc, char **argv)
{
char *buf = NULL, *tmp = NULL;
size_t bufSiz = 0;
char inputBuf[PAGESIZE];
FILE *in;
if (argc < 2)
{
printf("Usage: %s filename\n", argv[0]);
return 0;
}
in = fopen(argv[1], "r");
if (in)
{
/**
* Read a page at a time until reaching the end of the file
*/
while (fgets(inputBuf, sizeof inputBuf, in) != NULL)
{
/**
* Extend the dynamic buffer by the length of the string
* in the input buffer
*/
tmp = realloc(buf, bufSiz + strlen(inputBuf) + 1);
if (tmp)
{
/**
* Add to the contents of the dynamic buffer
*/
buf = tmp;
buf[bufSiz] = 0;
strcat(buf, inputBuf);
bufSiz += strlen(inputBuf); /* track the string length only; realloc above adds the byte for the terminator */
}
else
{
printf("Unable to extend dynamic buffer: releasing allocated memory\n");
free(buf);
buf = NULL;
break;
}
}
if (feof(in))
printf("Reached the end of input file %s\n", argv[1]);
else if (ferror(in))
printf("Error while reading input file %s\n", argv[1]);
if (buf)
{
printf("File contents:\n%s\n", buf);
printf("Read %lu characters from %s\n",
(unsigned long) strlen(buf), argv[1]);
}
free(buf);
fclose(in);
}
else
{
printf("Unable to open input file %s\n", argv[1]);
}
return 0;
}
There are drawbacks with this approach; for one thing, if there isn't enough memory to hold the file's contents, you won't know it immediately. Also, realloc() is relatively expensive to call, so you don't want to make your page sizes too small.
However, this avoids having to use fstat() or fseek()/ftell() to figure out how big the file is beforehand.
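One common way to reduce the realloc() cost mentioned above is to double the buffer capacity each time it fills, rather than growing it by one page per read. The sketch below is an assumption-laden variant, not the answer's code: it swaps fgets for fread so that it also handles files containing NUL bytes, and the name read_all is hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>

/* Reads the rest of `in` into a NUL-terminated heap buffer, doubling the
 * capacity whenever it fills up. Stores the byte count in *len.
 * Returns NULL on allocation or read failure; the caller frees the result. */
char *read_all(FILE *in, size_t *len)
{
    size_t cap = 128, used = 0;
    char *buf = malloc(cap);
    if (buf == NULL)
    {
        return NULL;
    }

    for (;;)
    {
        /* Leave one byte free for the terminating NUL. */
        size_t got = fread(buf + used, 1, cap - used - 1, in);
        used += got;
        if (used < cap - 1)
        {
            break;              /* short read: end of file or error */
        }

        char *tmp = realloc(buf, cap * 2);
        if (tmp == NULL)
        {
            free(buf);
            return NULL;
        }
        buf = tmp;
        cap *= 2;
    }

    if (ferror(in))
    {
        free(buf);
        return NULL;
    }
    buf[used] = '\0';
    *len = used;
    return buf;
}
```

With doubling, a file of n bytes needs roughly log2(n / 128) reallocations instead of one per chunk.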

Segmentation fault in CS50 (2020) recovery program

I am trying to write a program that will recover deleted images from a file and write each of those images to its own separate file. I've been stuck on this problem for a few days, and have tried my best to solve it on my own, but I now realize I need some guidance. My code always compiles well, but every time I run my program I get a segmentation fault. Using valgrind shows me that I don't have any memory leaks.
I think I have pinpointed the issue, though I'm not sure how to resolve it. When I run my program through the debugger, it always stops at the code inside my last 'else' condition (where the comment says "If already found JPEG") , and gives me an error message about the segmentation fault.
I have tried opening and initializing my file pointer jpegn atop this line of code, to prevent jpegn from being NULL when this condition is run, but that did not work to fix the fault.
I am very new to programming (and this site) so any advice or suggestions would be helpful.
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
typedef uint8_t BYTE;
int main(int argc, char *argv[])
{
if(argc!=2) // Checks if the user typed in exactly 1 command-line argument
{
printf("Usage: ./recover image\n");
return 1;
}
if(fopen(argv[1],"r") == NULL) // Checks if the image can be opened for reading
{
printf("This image cannot be opened for reading\n");
return 1;
}
FILE *forensic_image = fopen(argv[1],"r"); // Opens the image inputted and stores it in a new file
BYTE *buffer = malloc(512 * sizeof(BYTE)); // Dynamically creates an array capable of holding 512 bytes of data
if(malloc(512*sizeof(BYTE)) == NULL) // Checks if there is enough memory in the system
{
printf("System error\n");
return 1;
}
// Creates a counting variable, a string and two file pointers
int JPEG_num=0;
char *filename = NULL;
FILE *jpeg0 = NULL;
FILE *jpegn = NULL;
while(!feof(forensic_image)) // Repeat until end of image
{
fread(buffer, sizeof(BYTE), 512, forensic_image); // Read 512 bytes of data from the image into a buffer
// Check for the start of a new JPEG file
if(buffer[0] == 0xff & buffer[1] == 0xd8 & buffer[2] == 0xff & (buffer[3] & 0xf0) == 0xe0)
{
// If first JPEG
if(JPEG_num == 0)
{
sprintf(filename, "%03i.jpg", JPEG_num);
jpeg0 = fopen(filename, "w");
fwrite(buffer, sizeof(BYTE), 512, jpeg0);
}
else // If not first JPEG
{
fclose(jpeg0);
JPEG_num++;
sprintf(filename, "%03i.jpg", JPEG_num);
jpegn = fopen(filename, "w");
fwrite(buffer, sizeof(BYTE), 512, jpegn);
}
}
else // If already found JPEG
{
fwrite(buffer, sizeof(BYTE), 512, jpegn);
}
}
// Close remaining files and free dynamically allocated memory
fclose(jpegn);
free(buffer);
}
There are quite a few issues in your code. I am surprised that valgrind didn't point these out.
First is this:
if(fopen(argv[1],"r") == NULL) // Checks if the image can be opened for reading
{
printf("This image cannot be opened for reading\n");
return 1;
}
FILE *forensic_image = fopen(argv[1],"r"); // Opens the image inputted and stores it in a new file
This is not fatal, but you open the same file twice and discard the first file pointer. The similar pattern below, however, is definitely a memory leak:
BYTE *buffer = malloc(512 * sizeof(BYTE)); // Dynamically creates an array capable of holding 512 bytes of data
if(malloc(512*sizeof(BYTE)) == NULL) // Checks if there is enough memory in the system
{
printf("System error\n");
return 1;
}
Here you allocate 512 bytes twice and keep only the first allocation in the pointer buffer, while the second allocation is lost.
And then this:
char *filename = NULL;
// ...
sprintf(filename, "%03i.jpg", JPEG_num);
You are writing a string through a NULL pointer: no memory was ever allocated for filename!
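A minimal sketch of one way to avoid this: give the name a real array and let snprintf enforce its size. The helper jpeg_name and the constant NAME_LEN are hypothetical names used only for illustration.

```c
#include <stdio.h>

/* "000.jpg" needs 7 characters plus the terminating NUL. */
#define NAME_LEN 8

/* Hypothetical helper: writes the name of image `index` into `out`.
 * Returns 0 on success, -1 if the name would not fit. */
int jpeg_name(char out[NAME_LEN], int index)
{
    int n = snprintf(out, NAME_LEN, "%03i.jpg", index);
    return (n < 0 || n >= NAME_LEN) ? -1 : 0;
}
```

In the posted program this would mean declaring char filename[NAME_LEN]; instead of char *filename = NULL; before the loop.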
And also these lines:
else // If already found JPEG
{
fwrite(buffer, sizeof(BYTE), 512, jpegn);
}
How can you guarantee jpegn is a valid file pointer? Probably never because I see in your code that JPEG_num will always be 0. The part of else marked by // If not first JPEG is dead code.
When compiling, always enable the warnings, then fix those warnings.
gcc -ggdb3 -Wall -Wextra -Wconversion -pedantic -std=gnu11 -c "untitled1.c" -o "untitled1.o"
results in several warnings like:
untitled1.c:46:91: warning: suggest parentheses around comparison in operand of ‘&’ [-Wparentheses]
if(buffer[0] == 0xff & buffer[1] == 0xd8 & buffer[2] == 0xff & (buffer[3] & 0xf0) == 0xe0)
Note: a single & is a bit wise AND. You really want a logical AND && for all but the last one in this statement
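As a sketch, the signature test written with logical AND could look like the function below (is_jpeg_header is a hypothetical name). Because == binds more tightly than &, the original expression happens to produce the same truth value, but && states the intent, short-circuits, and silences the warning.

```c
#include <stdbool.h>

/* True when the block starts with a JPEG signature: 0xff 0xd8 0xff,
 * then a byte whose high nibble is 0xe. */
bool is_jpeg_header(const unsigned char *block)
{
    return block[0] == 0xff &&
           block[1] == 0xd8 &&
           block[2] == 0xff &&
           (block[3] & 0xf0) == 0xe0;   /* bitwise & is intended here */
}
```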
regarding;
FILE *forensic_image = fopen(argv[1],"r");
Always check (!=NULL) the returned value to assure the operation was successful. If not successful (==NULL) then call
perror( "fopen failed" );
to output to stderr both your error message and the text reason the system thinks the error occurred.
regarding:
while(!feof(forensic_image))
please read: why while( !feof() is always wrong
regarding:
FILE *forensic_image = fopen(argv[1],"r");
This is already done in the prior code block. There is absolutely no reason to do this again AND it will create problems in the code. Suggest: replacing:
if(fopen(argv[1],"r") == NULL)
{
printf("This image cannot be opened for reading\n");
return 1;
}
with:
if( (forensic_image = fopen(argv[1],"r") ) == NULL)
{
perror( "fopen for input file failed" );
exit( EXIT_FAILURE );
}
regarding:
BYTE *buffer = malloc( 512 * sizeof(BYTE) );
and later:
free( buffer );
This is a waste of code and resources. The project only needs one such instance. Suggest:
#define RECORD_LEN 512
and
unsigned char buffer[ RECORD_LEN ];
regarding;
fread(buffer, sizeof(BYTE), 512, forensic_image);
fread() returns a size_t. You should assign the returned value to a size_t variable and check it to make sure the operation was successful. In fact, that statement should be in the while() condition.
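A minimal sketch of a loop driven by the return value of fread() rather than by feof(). RECORD_LEN is the name suggested above; count_records is a hypothetical function used only to keep the example short.

```c
#include <stdio.h>

#define RECORD_LEN 512

/* Counts the complete RECORD_LEN-byte records in a stream. */
size_t count_records(FILE *in)
{
    unsigned char buffer[RECORD_LEN];
    size_t records = 0;

    /* fread reports how many items it read; anything short of a full
     * record (end of file or error) ends the loop. */
    while (fread(buffer, 1, RECORD_LEN, in) == RECORD_LEN)
    {
        records++;
    }
    return records;
}
```

Unlike while(!feof(...)), this loop never processes a buffer left over from a failed read, because the body only runs after a full record has arrived.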
regarding;
sprintf(filename, "%03i.jpg", JPEG_num);
This results in undefined behavior and can result in a seg fault event because the pointer filename is initialized to NULL. Suggest:
char filename[20];
to avoid that problem
regarding:
else // If not first JPEG
{
fclose(jpeg0);
If you're (for instance) working with the 3rd file, then jpeg0 is already closed, and closing it a second time is undefined behavior. Suggest removing the statement:
FILE *jpeg0;
and always using jpegn
regarding;
else // If already found JPEG
{
fwrite(buffer, sizeof(BYTE), 512, jpegn);
}
On the first output file, jpegn is not set, so this results in a crash. Again, ONLY use jpegn for all output file operations.
regarding:
fwrite(buffer, sizeof(BYTE), 512, jpegn);
fwrite returns the number of items actually written (each item being sizeof(BYTE) bytes, the second parameter), so this should be:
if( fwrite(buffer, sizeof(BYTE), 512, jpegn) != 512 ) { /* handle error */ }
The posted code contains some 'magic' numbers, like 512. 'Magic' numbers are bare numeric literals whose meaning the code does not explain. They make the code much more difficult to understand, debug, etc. Suggest using an enum or #define to give them meaningful names, then using those names throughout the code.

Can someone explain why fread() is not working?

After a couple of hours working on the recover exercise of CS50, I keep stumbling on a segmentation fault. After running the debugger I discovered that the cause seems to be fread(memory, 512, 1, file) malfunctioning: even after calling the function the memory[] array stays empty, hence the segmentation fault.
I've tried malloc(512) instead of an unsigned char array, but the error persists. Can someone explain why this is happening and how to solve it?
(PS. Sorry for my bad english)
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
int main(int argc, char *argv[])
{
// making sure the only user input is the name of the file
if (argc != 2)
{
printf("Usage: ./recover image\n");
return 1;
}
// open the file and check if it works
FILE *file = fopen("card.raw", "r");
if (file == NULL)
{
printf("Could not open card.raw.\n");
return 2;
}
int ending = 1000;
int count = 0;
char img = '\0';
FILE *picture = NULL;
unsigned char memory[512];
do
{
//creating buffer and reading the file into the buffer
fread(memory, 512, 1, file);
//checking if the block is a new jpg file
if (memory[0] == 0xff && memory[1] == 0xd8 && memory[2] == 0xff && (memory[3] & 0xf0) == 0xe0)
{
//if it's the first jpg file
if (count == 0)
{
sprintf(&img, "000.jpg");
picture = fopen(&img, "w");
fwrite(&memory, 512, 1, picture);
}
//closing previous jpg file and writing into a new one
else
{
fclose(picture);
img = '\0';
sprintf(&img, "%03i.jpg", count + 1);
picture = fopen(&img, "w");
fwrite(&memory, 512, 1, picture);
}
}
//continue writing into the file
else
{
picture = fopen(&img, "a");
fwrite(&memory, 512, 1, picture);
}
count++;
}
while(ending >= 512);
fclose(file);
fclose(picture);
return 0;
}
If you're using C or C++ then you should be fully aware of the memory model being used. For example, a local char variable holds exactly one byte; depending on memory alignment and architecture (16-bit? 32-bit? 64-bit?) it may occupy from 1 to 4 bytes of stack space, but only that one byte is yours to write. Guess what happens when sprintf writes "000.jpg" (8 bytes including the terminating NUL) into such a variable: it overruns into whatever occupies the memory after img. So you must prepare a buffer large enough to hold all the characters of the filename, for example char img[8].
In C, if you make a mistake, there are several possibilities:
sometimes you get a segmentation fault right after the mistake
sometimes you get no error at all, but data is silently corrupted
sometimes the error shows up long after the mistake was made
There are other problems with your code, which have been pointed out by Weather Vane and Jabberwocky in the comments above. I would like to add that reopening the 'img' file and discarding the previous file handle is not a good thing either (you already said "continue writing", so why reopen?). You might get a dangling file handle or needlessly create many file handles during the iteration. C will not help you check such things; it assumes you really know what you are doing. Even mixing types may cause neither a compile error nor an identifiable runtime error. It will just do one of the three things I listed above.
You might want to use other modern language with more memory safety features in order to learn programming, like C#, Java or Python.

C sockets receive file

I'm developing a very simple FTP client. I have created a data connection socket, but I can't transfer a file successfully:
FILE *f = fopen("got.png", "w");
int total = 0;
while (1){
memset(temp, 0, BUFFSIZE);
int got = recv(data, temp, sizeof(temp), 0);
fwrite(temp, 1, BUFFSIZE, f);
total += got;
if (total == 1568){
break;
}
}
fclose(f);
BUFFSIZE = 1568
I know that my file is 1568 bytes in size, so I try to download it just as a test. Everything is fine when I download .xml or .html files, but not when I download .png or .avi files: the original file is 1568 bytes, but got.png is 1573 bytes. I can't figure out what might cause that.
EDIT:
I have modified my code, so now it looks like (it can accept any file size):
FILE *f = fopen("got.png", "w");
while (1){
char* temp = (char*)malloc(BUFFSIZE);
int got = recv(data, temp, BUFFSIZE, 0);
fwrite(temp, 1, got, f);
if (got == 0){
break;
}
}
fclose(f);
The received file is still 2 bytes too long.
You are opening the file in text mode, so on platforms that distinguish text from binary streams (such as Windows) every LF byte is translated to a CRLF pair when written to the file. You need to open the file in binary mode instead:
FILE *f = fopen("got.png", "wb");
You are always writing a whole buffer even if you have received only a partial one. This is the same problem with ~50% of all TCP questions.
The memset is not necessary. And I hope that temp is an array so that sizeof(temp) does not evaluate to the native pointer size. Better use BUFFSIZE there as well.
Seeing your edit, after fixing the first problem there is another one: Open the file in binary mode.
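Putting these points together, a receive loop might be sketched like this. It assumes a POSIX sockets environment and a connected stream socket; receive_to_file is a hypothetical name, and BUFFSIZE is taken from the question. The output stream is assumed to have been opened with "wb".

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/socket.h>

#define BUFFSIZE 1568

/* Copies bytes from a connected socket to `out` until the peer closes the
 * connection. Returns the total byte count, or -1 on error.
 * `out` is assumed to have been opened in binary mode ("wb"). */
long receive_to_file(int sock, FILE *out)
{
    char temp[BUFFSIZE];
    long total = 0;

    for (;;)
    {
        ssize_t got = recv(sock, temp, sizeof temp, 0);
        if (got < 0)
        {
            return -1;          /* recv failed */
        }
        if (got == 0)
        {
            break;              /* orderly shutdown by the peer */
        }

        /* Write only the bytes actually received, not the whole buffer. */
        if (fwrite(temp, 1, (size_t) got, out) != (size_t) got)
        {
            return -1;
        }
        total += got;
    }
    return total;
}
```

Here temp is a true array, so sizeof temp is the buffer size, and no memset or per-iteration malloc is needed.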

Open image file as binary, store image as string of bytes, save the image - possible in plain C?

I would like to read an image, let's say picture.png, in C. I know I can open it in binary mode and then read it; that's pretty simple.
But I need something more: I would like to be able to read the image once, store it in my code, for example, in *.h file, as 'string of bytes', for example:
unsigned char image[] = "0x87 0x45 0x56 ... ";
and then, be able to just do:
delete physical file I read from disk,
save image into file - it will create my file once again,
EVEN if I removed image from disk (deleted physical file picture.png I read earlier) I will still be able to create an image on disk, simply by writing my image array into file using binary mode. Is that possible in pure C? If so, how can I do this?
There's even a special format for this task, called XPM, and a library to manipulate these files. But remember that due to its nature it's suitable only for relatively small images. But yes, it was used for years in the X Window System to provide icons. Well, in the good old days icons were 16x16 pixels and contained no more than 256 colors :)
Of course it's possible, but it's a bit unclear what you're after.
There are stand-alone programs that convert binary data to C source code, you don't need to implement that. But doing it that way of course means that the image becomes a static part of your program's executable.
If you want it to be more dynamic, like specifying the filename to your program when it's running, then the whole thing about converting to C source code becomes moot; your program is already compiled. C programs can't add to their own source at run-time.
UPDATE If all you want to do is load a file, hold it in memory and then write it back out, all in the same run of your program, that's pretty trivial.
You'd use fopen() to open the file, fseek() to go to the end, ftell() to read the size of the file. Then rewind() it to the start, malloc() a suitable buffer, fread() the file's contents into the buffer and fclose() the file. Later, fopen() a new output file, and fwrite() the buffer into that before using fclose() to close the file. Then you're done. You can do it again, as many times as you like. It can be an image, a program, a document or any other kind of file, it doesn't matter.
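The sequence just described can be sketched as one function. This is only an illustration under stated assumptions: copy_through_memory is a hypothetical name, both streams are assumed to be opened in binary mode ("rb" and "wb"), and the file is assumed to fit in memory.

```c
#include <stdio.h>
#include <stdlib.h>

/* Loads all of `in` into memory, then writes it to `out`.
 * Both streams are assumed to be opened in binary mode.
 * Returns 0 on success, -1 on failure. */
int copy_through_memory(FILE *in, FILE *out)
{
    if (fseek(in, 0, SEEK_END) != 0)
    {
        return -1;
    }
    long size = ftell(in);
    if (size < 0)
    {
        return -1;
    }
    rewind(in);

    /* Allocate at least one byte so an empty file is not an error. */
    unsigned char *data = malloc(size > 0 ? (size_t) size : 1);
    if (data == NULL)
    {
        return -1;
    }

    int ok = fread(data, 1, (size_t) size, in) == (size_t) size &&
             fwrite(data, 1, (size_t) size, out) == (size_t) size;
    free(data);
    return ok ? 0 : -1;
}
```

The caller remains responsible for fopen() and fclose() on both files, which keeps the function usable for any kind of file.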
pic2h.c :
#include <stdio.h>
int main(int argc, char *argv[]){
if(argc != 3){
fprintf(stderr, "Usage >pic2h image.png image.h\n");
return -1;
}
FILE *fi = fopen(argv[1], "rb");
FILE *fo = fopen(argv[2], "w");
int ch, count = 0;
fprintf(fo, "extern unsigned char image[];\n");
fprintf(fo, "unsigned char image[] =");
while(EOF!=(ch=fgetc(fi))){
if(count == 0)
fprintf(fo, "\n\"");
fprintf(fo, "\\x%02X", ch);
if(++count==24){
count = 0;
fprintf(fo, "\"");
}
}
if(count){
fprintf(fo, "\"");
}
fprintf(fo, ";\n");
fclose(fo);
fclose(fi);
return 0;
}
resave.c :
#include <stdio.h>
#include "image.h"
int main(int argc, char *argv[]){
if(argc != 2){
fprintf(stderr, "Usage >resave image.png\n");
return 0;
}
size_t size = sizeof(image)-1;
FILE *fo = fopen(argv[1], "wb");
fwrite(image, size, 1, fo);
fclose(fo);
return 0;
}
