I have done a lot of research on computing the MD5 of an .xls file, but my efforts seem to be in vain.
I tried the library and recommendations from this link: "https://stackoverflow.com/questions/27858288/calculate-md5-for-a-file-in-c-language",
but it still gives the wrong result.
Can you help me?
I was going to answer the question you linked, but it was closed.
The idea is as follows. First read the file into a buffer. You can do this using the following function:
unsigned char * readFile(const char *path)
{
FILE * pFile;
long lSize;
unsigned char * buffer;
size_t result;
pFile = fopen (path , "rb" );
if (pFile==NULL) {fputs ("File error",stderr); exit (1);}
// obtain file size:
fseek (pFile , 0 , SEEK_END);
lSize = ftell (pFile);
rewind (pFile);
// allocate memory to contain the whole file:
buffer = malloc (sizeof(char)*lSize);
if (buffer == NULL) {fputs ("Memory error",stderr); exit (2);}
// copy the file into the buffer:
result = fread (buffer,1,lSize,pFile);
if (result != lSize) {fputs ("Reading error",stderr); exit (3);}
// terminate
fclose (pFile);
return buffer;
}
Read the file:
unsigned char * data = readFile("c:\\file.xls");
Then you must apply MD5 to this buffer of data. You can use code similar
to the one in that question (though I am not sure which MD5 library or
implementation the author of that question used). For example:
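char hash[64] = {0};
md5_state_t state; // assuming an md5.h that defines md5_state_t, as the md5_byte_t type suggests
md5_byte_t digest[16] = {0};
md5_init(&state);
// filesize must be the byte count obtained with ftell(), not strlen()
md5_append(&state, (const md5_byte_t *)data, filesize);
md5_finish(&state, digest);
for (int i = 0; i < 16; i++)
{
    // each byte becomes two hex digits; bound the write by the space left in hash
    snprintf(hash + i*2, sizeof(hash) - i*2, "%02x", digest[i]);
}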
Now hash should hold the MD5 of the file, encoded as a hexadecimal string.
PS: The sample in that question does incorrectly use strlen on a binary file. That is why I suggested the readFile function above: it already contains code to get the file size, so you can reuse that code to obtain the size and pass it to md5_append.
PS: Also, don't forget to free data when you are done with it.
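As a minimal sketch of that suggestion, the variant below hands back the byte count together with the buffer, so the length never has to come from strlen. The name read_file_with_size and its out-parameter are assumptions made for illustration; they are not part of the original answer or of any MD5 library.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads a whole file in binary mode.
 * On success returns a malloc'd buffer and stores its length in *out_size;
 * on failure returns NULL. The caller must free() the buffer. */
unsigned char *read_file_with_size(const char *path, size_t *out_size)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return NULL;

    /* Determine the size by seeking to the end, as readFile does. */
    if (fseek(fp, 0, SEEK_END) != 0) {
        fclose(fp);
        return NULL;
    }
    long size = ftell(fp);
    if (size < 0) {
        fclose(fp);
        return NULL;
    }
    rewind(fp);

    /* Allocate at least one byte so an empty file does not request malloc(0). */
    unsigned char *buffer = malloc(size > 0 ? (size_t)size : 1);
    if (buffer == NULL) {
        fclose(fp);
        return NULL;
    }

    if (fread(buffer, 1, (size_t)size, fp) != (size_t)size) {
        free(buffer);
        fclose(fp);
        return NULL;
    }

    fclose(fp);
    *out_size = (size_t)size;
    return buffer;
}
```

The size stored in *out_size is the value that would be passed as the length argument of md5_append.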
The MD5 of an .xls file is computed exactly like the MD5 of any other kind of file, since MD5 operates on bytes. See, for example, the OpenSSL implementation in openssl/crypto/md5/md5.c and md5test.c (the code is at git://git.openssl.org/openssl.git).
The problem is that your example uses strlen to determine the file size. But the .xls format is binary, so strlen stops at the first zero byte and reports the wrong length.
Adapt the function to return the total number of bytes read from the file, and it should work.
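To see the strlen problem in isolation, here is a small sketch. The sample bytes are made up for illustration; any binary data containing a zero byte behaves the same way.

```c
#include <string.h>

/* Made-up sample data: binary content with zero bytes in the middle. */
static const unsigned char sample[] = {
    0xD0, 0xCF, 0x11, 0xE0, 0x00, 0x00, 0x42, 0x43
};

/* Length as strlen() sees it: counting stops at the first zero byte. */
size_t sample_strlen(void)
{
    return strlen((const char *)sample);
}

/* Actual number of bytes in the sample. */
size_t sample_size(void)
{
    return sizeof sample;
}
```

Here strlen reports 4 while the buffer holds 8 bytes, so a hash computed over the strlen length would cover only part of the data.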
Edit. Try something like this code:
void *addr;
struct stat s;
int ret, fd;
ret = stat(filename, &s);
if (ret) {
fprintf(stderr, "Error while stat()ing file: %m\n");
return -1;
}
fd = open(filename, O_RDONLY);
if (fd < 0) {
fprintf(stderr, "Error while opening file: %m\n");
return -1;
}
addr = mmap(NULL, s.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
if (addr == MAP_FAILED) {
fprintf(stderr, "Error while mapping file: %m\n");
close(fd);
return -1;
}
md5_init(&state);
md5_append(&state,addr, s.st_size);
md5_finish(&state,digest);
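The fragment above leaves out the cleanup (munmap and close). A self-contained sketch of the same map-then-process pattern follows, for POSIX systems only. It sums the bytes instead of hashing them so that no particular MD5 library has to be assumed; the name sum_mapped_file is hypothetical.

```c
#include <fcntl.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper (POSIX only): maps path read-only and stores the
 * sum of all its bytes in *out_sum. Returns 0 on success, -1 on failure. */
int sum_mapped_file(const char *path, uint64_t *out_sum)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) != 0) {
        close(fd);
        return -1;
    }

    uint64_t sum = 0;
    if (st.st_size > 0) {          /* mmap rejects a zero-length mapping */
        unsigned char *p = mmap(NULL, (size_t)st.st_size, PROT_READ,
                                MAP_PRIVATE, fd, 0);
        if (p == MAP_FAILED) {
            close(fd);
            return -1;
        }
        /* The mapped bytes can be processed like an ordinary array. */
        for (off_t i = 0; i < st.st_size; i++)
            sum += p[i];
        munmap(p, (size_t)st.st_size);
    }

    close(fd);
    *out_sum = sum;
    return 0;
}
```

Replacing the summing loop with the md5_init / md5_append / md5_finish calls would give the hashing variant.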
So I have this function, which works for writing back any file:
int write_file(FILE *f_write) {
size_t bytes_read; // number of bytes returned by each fread() call
FILE *img = fopen("test.pdf", "wb");
unsigned char buffer[255];
while ( (bytes_read = fread(buffer, 1, sizeof(buffer), f_write) ) > 0) {
fwrite(buffer, 1, bytes_read, img);
}
fclose(img);
return 1;
}
This works perfectly; I've tried it with PNG, PDF, JPG, etc.
But now I want to store what I've written in memory so I can use it later, instead of writing it right away,
for example in an array of uint8_t that holds all the bytes, which I can later send over a socket to my server to store the file.
I have no idea how to do it.
Or maybe I'm making it too complicated and I can just do
send(client_socket, FILE, sizeof(FILE), 0); ?
One way to do it would be to create a buffer that exactly fits the size of the file.
To do so, you can write a function that gets the size of an open file, like so:
size_t get_file_size(FILE *f)
{
size_t pos = ftell(f); // store the cursor position
size_t size;
// go to the end of the file and get the cursor position
fseek(f, 0L, SEEK_END);
size = ftell(f);
// go back to the old position
fseek(f, pos, SEEK_SET);
return size;
}
Then create and fill your buffer:
FILE *f = fopen("your_file", "rb"); // binary mode, so the bytes read match the computed size
size_t size = get_file_size(f);
char *buffer = malloc(size);
if (fread(buffer, 1, size, f) != size) { // bytes read != computed file size
// error handling
}
// use your buffer...
// don't forget to free and fclose
free(buffer);
fclose(f);
It is worth mentioning that you should check if the file was opened correctly, and to check if you have enough memory to store the buffer (the one created with malloc).
Edit:
As Andrew Henle said, fseek()/ftell() to get the size of a file is non-portable. Instead, to get the size of your file, you should use one of these techniques depending on your OS (assuming you are trying to open a 'normal' file):
On Linux / MacOS:
#include <sys/stat.h>
struct stat st;
size_t size;
if (stat("your_file", &st) != 0) {
// error handling...
}
size = st.st_size;
On Windows (as answered here) :
__int64 FileSize(const wchar_t* name)
{
HANDLE hFile = CreateFile(name, GENERIC_READ,
FILE_SHARE_READ | FILE_SHARE_WRITE, NULL, OPEN_EXISTING,
FILE_ATTRIBUTE_NORMAL, NULL);
if (hFile == INVALID_HANDLE_VALUE)
return -1; // error condition, could call GetLastError to find out more
LARGE_INTEGER size;
if (!GetFileSizeEx(hFile, &size)) {
CloseHandle(hFile);
return -1; // error condition, could call GetLastError to find out more
}
CloseHandle(hFile);
return size.QuadPart;
}
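If the size cannot be determined up front (for example when the stream is a pipe), another option is to grow the buffer while reading. This is a minimal sketch under that assumption; read_all and its starting capacity are hypothetical choices, not something from the answers above.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads everything remaining in f into a malloc'd
 * buffer and stores the byte count in *out_len.
 * Returns NULL on allocation or read error; the caller frees the buffer. */
unsigned char *read_all(FILE *f, size_t *out_len)
{
    size_t cap = 4096, len = 0;
    unsigned char *buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    for (;;) {
        /* Fill the free space at the end of the buffer. */
        size_t n = fread(buf + len, 1, cap - len, f);
        len += n;
        if (n == 0)
            break;                  /* end of file or read error */
        if (len == cap) {
            /* Buffer is full: double its capacity. */
            unsigned char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
    }

    if (ferror(f)) {
        free(buf);
        return NULL;
    }
    *out_len = len;
    return buf;
}
```

The buffer and length can then be kept in memory and passed to send() later. Passing a FILE * to send() would not work, because that would transmit the bytes of the FILE structure itself rather than the contents of the file.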
I am attempting to read a '.raw' file which stores the contents of an image that was taken on a camera using C. I would like to store these contents into a uint16_t *.
In the following code I attempt to store this data into a pointer, using fread(), and then write this data into a test file, using fwrite(), to check if my data was correct.
However, when I write the file back it is completely black when I check it.
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>
#define MAX_ROW 2560
#define MAX_COL 2160
int main()
{
char filename[32] = "image1.raw";
FILE * image_raw = fopen(filename, "rb");
fseek(image_raw, 0, 2);
long filesize = ftell(image_raw);
/*READ IMAGE DATA*/
uint16_t * image_data_ptr;
image_data_ptr = (uint16_t *)malloc(sizeof(uint16_t)*MAX_ROW*MAX_COL);
fread(image_data_ptr, sizeof(uint16_t), filesize, image_raw);
fclose(image_raw);
/*TEST WRITING THE SAME DATA BACK INTO TEST RAW FILE*/
FILE *fp;
fp = fopen("TEST.raw", "w");
fwrite(image_data_ptr, sizeof(uint16_t), filesize, fp);
fclose(fp);
return 0;
}
There are multiple issues with your code:
lack of error handling.
not seeking the input file back to offset 0 after seeking it to get its size. Consider using stat() or equivalent to get the file size without having to seek the file at all.
not dividing filesize by sizeof(uint16_t) when reading from the input file, or writing to the output file. filesize is expressed in bytes, but fread/fwrite are expressed in number of items of a given size instead, and your items are not 1 byte in size.
not opening the output file in binary mode.
leaking the buffer you allocate.
With that said, try something more like this instead:
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>
int main()
{
char filename[32] = "image1.raw";
FILE *image_raw = fopen(filename, "rb");
if (!image_raw) {
fprintf(stderr, "Can't open input file\n");
return -1;
}
if (fseek(image_raw, 0, SEEK_END) != 0) {
fprintf(stderr, "Can't seek input file\n");
fclose(image_raw);
return -1;
}
long filesize = ftell(image_raw);
if (filesize == -1L) {
fprintf(stderr, "Can't get input file size\n");
fclose(image_raw);
return -1;
}
rewind(image_raw);
long numSamples = filesize / sizeof(uint16_t);
/*READ IMAGE DATA*/
uint16_t *image_data_ptr = (uint16_t*) malloc(filesize);
if (!image_data_ptr) {
fprintf(stderr, "Can't allocate memory\n");
fclose(image_raw);
return -1;
}
size_t numRead = fread(image_data_ptr, sizeof(uint16_t), numSamples, image_raw);
if (numRead != numSamples) {
fprintf(stderr, "Can't read samples from file\n");
free(image_data_ptr);
fclose(image_raw);
return -1;
}
fclose(image_raw);
/*TEST WRITING THE SAME DATA BACK INTO TEST RAW FILE*/
FILE *fp = fopen("TEST.raw", "wb");
if (!fp) {
fprintf(stderr, "Can't open output file\n");
free(image_data_ptr);
return -1;
}
if (fwrite(image_data_ptr, sizeof(uint16_t), numSamples, fp) != numSamples) {
fprintf(stderr, "Can't write to output file\n");
fclose(fp);
free(image_data_ptr);
return -1;
}
fclose(fp);
free(image_data_ptr);
return 0;
}
You already have a great answer and useful comments.
Still, if you want to iterate over your file, loaded in memory as a whole, as an array of unsigned words, consider the following:
if the file size could be odd, decide what to do with the last byte/word
you can read the whole file in a single call once the file size has been determined
fstat() is the normal way to get the file size
getting the file name from the command line as an argument is much more flexible than recompiling the program or renaming the file in order to use the program
The code below does just that:
uses image.raw as a default for the file name, but allowing you to enter the file name on the command line
uses fstat() to get the file size
uses a single fread() call to read the entire file as a single record
A test using the original program file as input:
Mode LastWriteTime Length Name
---- ------------- ------ ----
-a---- 20/07/2021 17:40 1067 main.c
PS > gcc -Wall -o tst main.c
PS > ./tst main.c
File is "main.c". Size is 1067 bytes
File "main.c" loaded in memory.
PS > ./tst xys
File is "xys". Could not open: No such file or directory
The C example
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
int main(int argc, char**argv)
{
const char* default_file = "image.raw";
char f_name[256];
if (argc < 2)
strcpy(f_name, default_file);
else
strcpy(f_name, argv[1]);
FILE* F = fopen(f_name, "rb");
if (F == NULL)
{
printf("File is \"%s\". ", f_name);
perror("Could not open");
return -1;
}
struct stat info;
fstat(_fileno(F), &info); // _fileno() is the Windows (MSVC/MinGW) spelling; POSIX systems use fileno()
printf("File is \"%s\". Size is %ld bytes\n", f_name, (long)info.st_size);
uint16_t* image = malloc(info.st_size);
if (image == NULL)
{ perror("malloc() error");
return -2;
};
if (fread(image, info.st_size, 1, F) != 1)
{ perror("read error");
free(image);
return -3;
};
// use 'image'
printf("File \"%s\" loaded in memory.\n", f_name);
free(image);
fclose(F);
return 0;
}
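On the odd-size question raised in the list above, one option is to round the word count up and zero-pad the missing final byte. The sketch below shows that choice; bytes_to_words is a hypothetical helper, and the little-endian layout is an assumption, since the byte order of the raw format is not stated.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: packs n bytes into 16-bit words, low byte first
 * (little-endian is an assumption). If n is odd, the high byte of the
 * last word stays zero. Stores the word count in *out_words.
 * Returns NULL on allocation failure; the caller frees the result. */
uint16_t *bytes_to_words(const unsigned char *bytes, size_t n, size_t *out_words)
{
    size_t words = (n + 1) / 2;     /* round up for an odd byte count */
    uint16_t *out = calloc(words ? words : 1, sizeof *out);
    if (out == NULL)
        return NULL;

    for (size_t i = 0; i < n; i++) {
        if (i % 2 == 0)
            out[i / 2] |= bytes[i];                     /* low byte */
        else
            out[i / 2] |= (uint16_t)(bytes[i] << 8);    /* high byte */
    }
    *out_words = words;
    return out;
}
```

Rejecting odd-sized files outright would be an equally reasonable policy; which one fits depends on what the camera format guarantees.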
I wrote a program
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
FILE *fp;
int r;
char arr[] = "this is the string";
char str[20] = {'\0'};
fp = fopen("fwrite.txt", "w");
fwrite(arr, 1, sizeof(arr), fp);
fseek(fp, SEEK_SET, 0);
r = fread(str, 1, sizeof(arr), fp);
if(r == sizeof(arr))
printf("read successfully\n");
else
{
printf("read unsuccessfull\n");
exit(1);
}
printf("read = %d\n", r);
printf("%s\n", str);
fclose(fp);
return 0;
}
I am trying to read in this way but I am not able to do it. What is the problem here, is it that I should put &str[i] and run a loop for fread or will fread be able to put data in the str?
I am getting junk and I don't understand why?
The primary problem is that you have the arguments to fseek() backwards — you need the offset (0) before the whence (SEEK_SET). A secondary problem is that you attempt to read from a file open only for writing. A more minor issue in this context, but one that is generally very important, is that you don't error check the fopen() call. (It is relatively unlikely that this fopen() will fail, but funnier things have been known.) You should also check the fwrite() call (you already check the fread(), of course).
Fixing all these might lead to:
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
int rc = EXIT_SUCCESS;
int r;
const char file[] = "fwrite.txt";
char arr[] = "this is the string";
char str[20] = {'\0'};
FILE *fp = fopen(file, "w+b");
if (fp == 0)
{
fprintf(stderr, "Failed to open file %s for reading and writing\n", file);
rc = EXIT_FAILURE;
}
else
{
if (fwrite(arr, 1, sizeof(arr), fp) != sizeof(arr))
{
fprintf(stderr, "Failed to write to file %s\n", file);
rc = EXIT_FAILURE;
}
else
{
fseek(fp, 0, SEEK_SET);
r = fread(str, 1, sizeof(arr), fp);
if (r == sizeof(arr))
{
printf("read successful\n");
printf("read = %d bytes\n", r);
printf("read data [%s]\n", str);
}
else
{
printf("read unsuccessful\n");
rc = EXIT_FAILURE;
}
}
fclose(fp);
}
return rc;
}
Example run:
$ ./fi37
read successful
read = 19 bytes
read data [this is the string]
$
Note that this works in part because you write the null byte at the end of the output string to the file, and then read that back in. The file isn't really a text file if it contains null bytes. The b in "w+b" mode isn't really needed on Unix systems where there's no distinction between a binary and a text file. If you're writing null bytes to a file on Windows, you should use the b to indicate binary mode.
If you chose to, you could reduce the 'bushiness' (or depth of nesting) by not having a single return in the main() function. You could use return EXIT_FAILURE; and avoid an else and another set of braces. The code shown is careful to close the file if it was opened. In a general-purpose function, that's important. In main(), it is less critical since the exiting process will flush and close open files anyway.
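A sketch of that flatter style, moved into a function so that closing the file on every path matters. The name write_then_read and its parameters are hypothetical; the calls are the same fopen, fwrite, fseek and fread sequence used above.

```c
#include <stdio.h>

/* Hypothetical helper: writes len bytes of src to path, reads them back
 * into dst, and returns 0 on success or -1 on any failure.
 * Early returns keep the nesting shallow. */
int write_then_read(const char *path, const char *src, char *dst, size_t len)
{
    FILE *fp = fopen(path, "w+b");
    if (fp == NULL)
        return -1;

    if (fwrite(src, 1, len, fp) != len) {
        fclose(fp);
        return -1;
    }

    /* Offset first, then whence. */
    if (fseek(fp, 0, SEEK_SET) != 0) {
        fclose(fp);
        return -1;
    }

    if (fread(dst, 1, len, fp) != len) {
        fclose(fp);
        return -1;
    }

    if (fclose(fp) != 0)
        return -1;
    return 0;
}
```

Called as write_then_read("fwrite.txt", arr, str, sizeof(arr)), it performs the same round trip as the program above, including the terminating null byte.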
You can't read from a file opened with the "w" mode of fopen; use "w+" instead.
"r" - Opens a file for reading. The file must exist.
"w" - Creates an empty file for writing. If a file with the same name already
exists, its content is erased and the file is considered as a new empty file.
"a" - Appends to a file. Write operations append data at the end of the
file. The file is created if it does not exist.
"r+" - Opens a file to update both reading and writing. The file must exist.
"w+" - Creates an empty file for both reading and writing.
"a+" - Opens a file for reading and appending.
I am trying to "mmap" a binary file in order to encrypt it using AES then write the encrypted data to another file(outFile) using the following code. I tried to modify the flags for both functions mmap() and open() but I always get segmentation fault when I run the executable.
int main (void)
{
FILE *outFile; //The output file (encrypted)
/* A 256 bit key */
unsigned char *key = (unsigned char *)"01234567890123456789012345678901";
/* A 128 bit IV */
unsigned char *iv = (unsigned char *)"01234567890123456";
int fd;
struct stat sb;
void * memblock;
fd = open("result0.jpg",O_RDONLY);
outFile=fopen("result0enc.jpg","wb");
fstat(fd, &sb);
printf("Size: %lu\n", sb.st_size);
unsigned char decryptedtext[sb.st_size];
int decryptedtext_len, ciphertext_len;
/* Initialise the library */
ERR_load_crypto_strings();
OpenSSL_add_all_algorithms();
OPENSSL_config(NULL);
memblock = mmap(NULL, sb.st_size,PROT_READ, MAP_SHARED, fd, 0);
if (memblock == MAP_FAILED) {
close(fd);
perror("Error mmapping the file");
exit(EXIT_FAILURE);
}
ciphertext_len = encrypt((unsigned char *)memblock, sb.st_size,key,iv,ciphertext);
fwrite( ciphertext,1, sb.st_size,outFile);
if (munmap(memblock, sb.st_size) == -1) {
perror("Error un-mmapping the file");
/* Decide here whether to close(fd) and exit() or not. Depends... */
}
close(fd);
fclose(outFile);
EVP_cleanup();
ERR_free_strings();
return 0;
}
As yano mentioned in the comments, your error is here:
memcpy(outFile, ciphertext, sb.st_size);
You're trying to memcpy to a FILE *, which is completely wrong and does not do what you expect: it overwrites the private internals of the FILE structure that outFile points to.
You should instead operate on a buffer and use fwrite to write to the file.
I suggest you get familiar with basic file I/O operations using f... functions before digging into mmap and encryption.
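As a minimal sketch of the fwrite route, the helper below writes a byte buffer to a file and reports failure. The name write_buffer is hypothetical and nothing in it is specific to OpenSSL.

```c
#include <stdio.h>

/* Hypothetical helper: writes len bytes from buf to path in binary mode.
 * Returns 0 on success, -1 on failure. */
int write_buffer(const char *path, const unsigned char *buf, size_t len)
{
    FILE *out = fopen(path, "wb");
    if (out == NULL)
        return -1;

    /* fwrite copies the bytes into the stream; memcpy onto the FILE *
     * itself would only corrupt the FILE structure. */
    size_t written = fwrite(buf, 1, len, out);

    /* fclose flushes buffered data, so its result matters too. */
    if (fclose(out) != 0 || written != len)
        return -1;
    return 0;
}
```

In the question's code, the length to pass would presumably be ciphertext_len rather than sb.st_size, since a padded block cipher can produce more bytes than it consumes.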
I am trying to create a simple C program that strips the HTML from a webpage and keeps the text. So far I have come up with the code below. It uses cURL to get the contents of the webpage and write it to a file. How do I go through the memory buffer, remove all HTML tags, and output the text to either the terminal or a file?
#include <curl/curl.h>
#include <stdio.h>
#include <stdlib.h>
#define WEBPAGE_URL "http://homepages.paradise.net.nz/adrianfu/index.html"
#define DESTINATION_FILE "/home/acwest/data.txt"
size_t write_data( void *ptr, size_t size, size_t nmeb, void *stream)
{
return fwrite(ptr,size,nmeb,stream);
}
int main()
{
int in_tag = 0;
char * buffer;
char c;
long lSize;
size_t result;
FILE * file = fopen(DESTINATION_FILE,"w+");
if (file==NULL) {
fputs ("File error",stderr);
exit (1);
}
CURL *handle = curl_easy_init();
curl_easy_setopt(handle,CURLOPT_URL,WEBPAGE_URL); /*Using the http protocol*/
curl_easy_setopt(handle,CURLOPT_WRITEFUNCTION, write_data);
curl_easy_setopt(handle,CURLOPT_WRITEDATA, file);
curl_easy_perform(handle);
curl_easy_cleanup(handle);
// obtain file size:
fseek (file, 0, SEEK_END);
lSize = ftell (file);
rewind (file);
// allocate memory to contain the whole file:
buffer = (char*) malloc (sizeof(char)*lSize);
if (buffer == NULL) {
fputs ("Memory error",stderr);
exit (2);
}
// copy the file into the buffer:
result = fread (buffer,1,lSize,file);
if (result != lSize) {
fputs ("Reading error",stderr);
exit (3);
}
}
Curl will not help you with parsing HTML, which is a complicated task. You can read the language specification and write a parser. There's an open source C++ project at http://www.mbayer.de/html2text/ and a Python script at https://github.com/aaronsw/html2text. You can also install html2text and use it from the command line, or execute it from your C code.
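For simple pages, a naive state machine built around an in_tag flag (like the unused variable in the question) may be enough to get started. The sketch below is deliberately limited: it knows nothing about comments, script or style contents, character entities, or a '>' inside an attribute value, so it is not a substitute for a real parser. The name strip_tags is hypothetical.

```c
#include <stddef.h>

/* Naive tag stripper: copies src to dst, skipping everything between
 * '<' and '>'. dst must have room for strlen(src) + 1 bytes.
 * Returns the number of characters written, excluding the terminator. */
size_t strip_tags(const char *src, char *dst)
{
    int in_tag = 0;
    size_t out = 0;

    for (; *src != '\0'; src++) {
        if (*src == '<')
            in_tag = 1;                 /* a tag starts */
        else if (*src == '>' && in_tag)
            in_tag = 0;                 /* the tag ends */
        else if (!in_tag)
            dst[out++] = *src;          /* ordinary text: keep it */
    }
    dst[out] = '\0';
    return out;
}
```

Run over the buffer filled by fread in the question, it would leave only the characters outside angle brackets, which can then be printed or written to a file.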