I'm a fairly new computer engineering student making a program in C to learn more over summer.
I do not know or understand anything about encryption apart from a simple implementation of Diffie-Hellman.
My program is terminal-based and completely offline. It needs to read saved data from a file and write it back to the file when it's done. I'd like to encrypt that I/O.
It seems simple but Googling has me running in circles because I don't know enough to actually get anywhere. Are there any resources someone could point me to about encryption basics and making an offline program secure?
If you would like to learn about general techniques for working with encrypted input and output files, I would suggest implementing a very simple "encryption" algorithm such as XOR with a constant. For example, the following very simple function would work for both encryption and decryption:
void encrypt(char *data, size_t len)
{
    while (len--) {
        *data++ ^= 0xFF;
    }
}
To read an encrypted file, you read a block of data, decrypt it, and then work with it as you would normally. You would do the opposite for writing: Encrypt the data, then write it.
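When working with encrypted files, you won't be able to use text-oriented C stdio functions such as fgets() or fprintf() directly on the file, because you don't have a chance to encrypt/decrypt the data between those functions and the actual file I/O. Read and write raw blocks with fread() and fwrite() instead, and do any parsing or formatting on the decrypted data in memory.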
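A minimal sketch of that read, transform, write cycle, assuming the same XOR-with-a-constant "encryption" as above. The names xor_buffer and xor_file and the 4096-byte block size are made up for illustration; this is a learning aid, not real security.

```c
#include <stdio.h>

/* Same XOR transform as above: applying it twice restores the data. */
static void xor_buffer(unsigned char *data, size_t len)
{
    while (len--)
        *data++ ^= 0xFF;
}

/* Copies src to dst one block at a time, transforming each block in
 * memory between fread() and fwrite(). Works for both directions.
 * Returns 0 on success, -1 on any I/O error. */
int xor_file(const char *src, const char *dst)
{
    unsigned char buf[4096];
    size_t n;
    int rc = 0;

    FILE *in = fopen(src, "rb");   /* "b": no newline translation */
    if (in == NULL)
        return -1;
    FILE *out = fopen(dst, "wb");
    if (out == NULL) {
        fclose(in);
        return -1;
    }

    /* Always use the count returned by fread(), never strlen(). */
    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        xor_buffer(buf, n);
        if (fwrite(buf, 1, n, out) != n) {
            rc = -1;
            break;
        }
    }
    if (ferror(in))
        rc = -1;

    fclose(in);
    if (fclose(out) != 0)
        rc = -1;
    return rc;
}
```

Calling xor_file("save.dat", "save.enc") and then xor_file("save.enc", "save.out") should give back the original bytes; the file names here are placeholders.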
I'm currently interested in encrypting files, but I don't know much about the issues that may occur when I convert them byte by byte. I read about the end-of-file character, but according to Wikipedia, EOF is system-dependent.
What I want to know is which bytes (or groups of bytes) I should avoid when writing a method to encrypt a file on Windows or Linux. Thanks!
After writing a basic LFSR-based stream cipher encryption module in C, I tried it on ordinary text files, and then on a .exe file in Windows. However, after decrypting it, the file does not run and gives an error about being 16-bit. Evidently there is some error in decrypting. Or are files made so that they become corrupted if I tamper with their binary code?
I'm checking my program on text files in the hope of locating any error on my part. However, the question is: has anyone tried running their own encryption programs on an executable file? Is there any obvious answer to this?
There is nothing special about executables. They are obviously binary files and thus contain 00 bytes and bytes >127. As long as your algorithm is binary safe, it should work.
Compare the original file and the decrypted file in a hex editor to see how they differ.
The error you get means that you didn't decrypt the executable header correctly, so the decryption mistake must already affect the first few bytes of your file.
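If no hex editor is at hand, the same comparison can be sketched in a few lines of C. The function name first_difference and its return convention are invented for this illustration.

```c
#include <stdio.h>

/* Returns the offset of the first byte at which the two files differ,
 * -1 if they are identical, or -2 if either file cannot be opened.
 * If one file is a prefix of the other, the shorter file's length is
 * reported as the first difference. */
long first_difference(const char *path_a, const char *path_b)
{
    FILE *a = fopen(path_a, "rb");
    FILE *b = fopen(path_b, "rb");
    long offset = 0;
    long result = -1;

    if (a == NULL || b == NULL) {
        if (a != NULL) fclose(a);
        if (b != NULL) fclose(b);
        return -2;
    }

    for (;;) {
        int ca = fgetc(a);
        int cb = fgetc(b);
        if (ca != cb) {          /* includes EOF on only one side */
            result = offset;
            break;
        }
        if (ca == EOF)           /* both ended together: identical */
            break;
        offset++;
    }

    fclose(a);
    fclose(b);
    return result;
}
```

A result of 0 or some other very small offset would match the diagnosis above, namely that the executable header itself was not restored correctly.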
Evidently some error in decrypting. An exe is a bag o' bytes just like any other file; there's no magic. You are merely likely to run into byte values that you won't get in a text file, such as a zero.
A decryption process should be the inverse of its encryption. In other words, Decrypt(Encrypt(X)) == X for all inputs X, of all possible lengths, of all possible byte values.
I suggest you build yourself a test harness that will run some pairwise checks with randomised data so you can prove to yourself that the two transformations do indeed cancel each other out. I mean something like:
for length from 0 to 1000000:
    generate a string of that length with random contents
    encrypt it to a fresh memory buffer
    decrypt it to a fresh memory buffer
    compare the decrypted string with the original string
Do this first of all on in-memory strings so you can isolate the algorithm from your file-handling code.
Once you've proved the algorithm is properly inverting, you can then do the same for files; as others have said you might well be running into issues with handling binary files, that's a common gotcha.
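The in-memory harness described above could look roughly like this in C. The toy_encrypt and toy_decrypt routines are stand-ins (XOR with a constant) that you would replace with your own pair, and roundtrip_failures is a hypothetical name. Because every length is tested in turn, the total work grows quadratically, so a much smaller upper bound than 1000000 is a sensible starting point.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in cipher for illustration only; substitute your own routines. */
static void toy_encrypt(const unsigned char *in, unsigned char *out, size_t len)
{
    for (size_t i = 0; i < len; i++)
        out[i] = in[i] ^ 0xFF;
}

static void toy_decrypt(const unsigned char *in, unsigned char *out, size_t len)
{
    for (size_t i = 0; i < len; i++)
        out[i] = in[i] ^ 0xFF;
}

/* Runs Decrypt(Encrypt(X)) for one random X of every length in
 * [0, max_len] and returns how many lengths failed to round-trip. */
size_t roundtrip_failures(size_t max_len, unsigned seed)
{
    /* +1 so that a zero-byte allocation is never requested. */
    unsigned char *plain  = malloc(max_len + 1);
    unsigned char *cipher = malloc(max_len + 1);
    unsigned char *back   = malloc(max_len + 1);
    size_t failures = 0;

    assert(plain != NULL && cipher != NULL && back != NULL);
    srand(seed);

    for (size_t len = 0; len <= max_len; len++) {
        /* Random contents, including zero bytes and bytes above 127. */
        for (size_t i = 0; i < len; i++)
            plain[i] = (unsigned char)(rand() & 0xFF);

        toy_encrypt(plain, cipher, len);
        toy_decrypt(cipher, back, len);

        /* memcmp, not strcmp: the data may contain zero bytes. */
        if (memcmp(plain, back, len) != 0)
            failures++;
    }

    free(plain);
    free(cipher);
    free(back);
    return failures;
}
```

A call such as roundtrip_failures(2000, 1) returning 0 says the pair inverts correctly for those inputs; any other value points at the algorithm rather than the file handling.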
I'd like to encrypt a file with aes256 using OpenSSL with C.
I did find a pretty nice example here.
Should I first read the whole file into a memory buffer and then do the aes256, or should I do it in chunks with a ~16K buffer?
Any snippets or hints?
Loading the whole file in a buffer can get inefficient to impossible on larger files - do this only if all your files are below some size limit.
OpenSSL's EVP API (which is also used by the example you linked) has an EVP_EncryptUpdate function, which can be called multiple times, each time providing some more bytes to encrypt. Use this in a loop together with reading in the plaintext from a file into a buffer, and writing out the ciphertext to another file (or the same one). (Analogously for decryption.)
Of course, instead of inventing a new file format (which you are effectively doing here), think about implementing the OpenPGP Message Format (RFC 4880). There are fewer chances to make mistakes that might destroy your security, and as an added bonus, if your program somehow ceases to work, your users can always use the standard tools (PGP or GnuPG) to decrypt the file.
It's better to reuse a fixed buffer, unless you know you'll always process small files - but I don't think that fits your backup files definition.
I said better in a non-cryptographic way :-) There won't be any difference at the end (for the encrypted file) but your computer might not like (or even be able) to load several MB (or GB) into memory.
Crypto-wise, the operations are done in blocks; for AES a block is 128 bits (16 bytes). So, for simplicity, you had better use a multiple of 16 bytes for your buffer. Otherwise the choice is yours. I would suggest buffers between 4 KB and 16 KB but, to be honest, I would test several values.
How can I encrypt & decrypt binary files in C using OpenSSL?
I have a test program that encrypts and then decrypts the input it's given.
I executed my test program for text files, and the output is the same as the input, but when I execute my test program on a binary file the output is not the same as the input.
Just guessing: are you on Windows, and did you miss the O_BINARY flag (or the "b" mode character with fopen()) in your file operations?
Chances are you are using string functions like strlen() on the buffers you're reading. The OpenSSL functions work fine for binary files.
Without seeing your code we can only guess. But my first guess would be that your encryption or decryption routine is barfing on a \0 character or two within the binary file. The data must be treated as bytes, not as character strings. (Same as the strlen() problem mentioned elsewhere on this page.)
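A small sketch of why string functions go wrong on binary data. The function count_bytes_demo and the sample bytes are invented for illustration: it writes eight bytes containing zeros, reads them back, and reports both the count fread() returned and the length strlen() would claim.

```c
#include <stdio.h>
#include <string.h>

/* Writes 8 sample bytes (with embedded zeros) to `path`, reads them
 * back, and stores the true byte count from fread() in *read_count
 * and the misleading strlen() result in *strlen_count.
 * Returns 0 on success, -1 if the file cannot be opened. */
int count_bytes_demo(const char *path, size_t *read_count,
                     size_t *strlen_count)
{
    static const unsigned char sample[8] =
        { 'M', 'Z', 0x00, 0x90, 0xFF, 0x00, 'A', 'B' };
    unsigned char buf[16] = { 0 };   /* zeroed so strlen() stays in bounds */

    FILE *f = fopen(path, "wb");     /* binary mode for writing */
    if (f == NULL)
        return -1;
    fwrite(sample, 1, sizeof sample, f);
    fclose(f);

    f = fopen(path, "rb");           /* binary mode for reading */
    if (f == NULL)
        return -1;
    *read_count = fread(buf, 1, sizeof buf, f);
    fclose(f);

    /* strlen() stops at the first zero byte, so it undercounts. */
    *strlen_count = strlen((const char *)buf);
    return 0;
}
```

Here fread() reports 8 bytes while strlen() reports only 2, so any routine that sizes its work with strlen() would silently drop everything after the first zero byte.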
I'm not a C programmer(!) but the way I managed to get the encryption routines working within Delphi/Pascal was by downloading the OpenSSL source (in C) and stepping through the code for the openssl.exe application. Using the EVP_* functions became a whole lot easier once you work out how they do it themselves.
I'm writing a straightforward C program on Linux and wish to use an existing library's API which expects data from a file. I must feed it a file name as a const char*. But I have data, just like the content of a file, already sitting in a buffer allocated on the heap. There is plenty of RAM and we want high performance. Wanting to avoid writing a temporary file to disk, what is a good way to feed the data to this API in a way that looks like a file?
Here's a cheap pretend version of my code:
marvelouslibrary.h:
int marvelousfunction(const char *filename);
normal-persons-usage.cpp, for which library was originally designed:
#include "marvelouslibrary.h"
int somefunction(char *somefilename)
{
    return marvelousfunction(somefilename);
}
myprogram.cpp:
#include "marvelouslibrary.h"
int one_of_my_routines()
{
    byte* stuff = new byte[1000000];
    // fill stuff[] with...stuff!
    // stuff[] holds same bytes as might be found in a file
    /* magic goes here: make filename referring to stuff[] */
    return marvelousfunction( ??? );
}
To be clear, the marvelouslibrary does not offer any API functions that accept data by pointer; it can only read a file.
I thought of pipes and mkfifo(), but they seem meant for communicating between processes. I am no expert at these things. Does a named pipe work okay when read and written in the same process? Is this a wise approach?
Or skip being clever, go with plan "B" which is to shuddup and just write a temp file. However, I'd like to learn something new and find out what's possible in this situation, besides getting high performance.
Given that you likely have a function like:
char *read_data(const char *fileName)
I think you will need to "skip being clever, go with plan "B" which is to shuddup and just write a temp file."
If you can dig around and find out whether the call you are making calls another function that takes a FILE * or an int file descriptor, then you can do something better.
One thought that does come to mind: can you change your code to write to a memory-mapped file instead of to the heap? That way you would already have a file on disk and would avoid the copying (though it'll still be on disk), and you can still give the function call the file name.
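A rough sketch of the memory-mapped-file idea on Linux: build the data directly in a shared mapping of a named file, so the file name can later be handed to the library. The helper name map_new_file is hypothetical, len must be greater than zero, and marvelousfunction in the usage note is the asker's library call.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Creates (or truncates) `path`, sizes it to len bytes, and maps it
 * MAP_SHARED so that stores into the mapping become file contents.
 * Returns the mapping, or NULL on failure. len must be non-zero. */
void *map_new_file(const char *path, size_t len)
{
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return NULL;

    /* A fresh file is empty; grow it before mapping it. */
    if (ftruncate(fd, (off_t)len) != 0) {
        close(fd);
        unlink(path);
        return NULL;
    }

    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);   /* the mapping remains valid after the descriptor closes */
    if (p == MAP_FAILED) {
        unlink(path);
        return NULL;
    }
    return p;
}
```

The intended usage would be: fill the returned region instead of a heap buffer, call munmap() on it, pass the path to marvelousfunction(), then unlink() the file.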
I'm not sure what kind of input the library function wants ... does it need a path/file name, or open file pointer, or open file descriptor?
If you don't want to hack the library and the function wants a string (path to a file), try making the temporary file in /dev/shm.
Otherwise, mmap might be the best option. Please be sure to research posix_madvise() when using mmap() (or its counterpart posix_fadvise() if using a temporary file).
It looks like you're talking about very little data to begin with, so I don't think you'll see a performance impact whichever route you take.
Edit
Sorry, I just re-read your question; perhaps I read it too fast. There is no way you are going to feed a function like:
char * foo(const char *filepath)
... with mmap().
If you cannot modify the library to accept a file descriptor instead (or as an alternative to the path), just use /dev/shm and a temporary file; it will be quite cheap.
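A sketch of the temporary-file route, assuming a Linux system where /dev/shm is a RAM-backed tmpfs. The helper name buffer_to_temp_file and the "stuff-XXXXXX" template are made up; the directory is a parameter so that "/tmp" or another location can be used where /dev/shm is unavailable.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Writes len bytes from buf to a fresh temporary file under `dir` and
 * stores the file's path in path_out (capacity path_cap).
 * Returns 0 on success, -1 on failure. The caller unlinks the file. */
int buffer_to_temp_file(const char *dir, const void *buf, size_t len,
                        char *path_out, size_t path_cap)
{
    int n = snprintf(path_out, path_cap, "%s/stuff-XXXXXX", dir);
    if (n < 0 || (size_t)n >= path_cap)
        return -1;

    /* mkstemp() replaces the XXXXXX and opens the file exclusively. */
    int fd = mkstemp(path_out);
    if (fd < 0)
        return -1;

    const unsigned char *p = buf;
    while (len > 0) {
        ssize_t w = write(fd, p, len);   /* write() may be partial */
        if (w < 0) {
            close(fd);
            unlink(path_out);
            return -1;
        }
        p += w;
        len -= (size_t)w;
    }
    return close(fd) == 0 ? 0 : -1;
}
```

With the asker's names, the call site might be buffer_to_temp_file("/dev/shm", stuff, 1000000, path, sizeof path), followed by marvelousfunction(path) and then unlink(path).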
You're on Linux, can't you just grab the source of the library and hack in the function you need? If it's useful to others, you could even send a patch to the original author, so it will be in future versions for everyone.
Edit: Sorry. Just read the question. With my advice below, you fork a spare process, so the question of "does it work in a single process" does not come up. I also see no reason you couldn't spawn a separate thread to do the push...
Not in the least elegant, but you could:
Open a named pipe.
Fork a streamer that does nothing but try to write to the pipe.
Pass the name of the pipe to the library.
This should be pretty robust...
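The three steps above might be sketched as follows on Linux. The names feed_through_fifo and count_bytes_in_file are hypothetical; the latter is only a stand-in for a library function that takes a file name. This assumes the library reads the file sequentially from start to end, since a pipe cannot be seeked or reopened, and that it really does open the path, since otherwise the child would block forever.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Stand-in for a library call that only accepts a file name. */
int count_bytes_in_file(const char *filename)
{
    FILE *f = fopen(filename, "rb");
    if (f == NULL)
        return -1;
    int count = 0;
    while (fgetc(f) != EOF)
        count++;
    fclose(f);
    return count;
}

/* Creates a FIFO at fifo_path, forks a child that streams buf into it,
 * and calls consumer(fifo_path) in the parent.
 * Returns the consumer's result, or -1 if setup fails. */
int feed_through_fifo(const char *fifo_path, const void *buf, size_t len,
                      int (*consumer)(const char *filename))
{
    if (mkfifo(fifo_path, 0600) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        unlink(fifo_path);
        return -1;
    }

    if (pid == 0) {
        /* Child: open() blocks until the parent opens the read end. */
        int fd = open(fifo_path, O_WRONLY);
        if (fd < 0)
            _exit(1);
        const unsigned char *p = buf;
        while (len > 0) {
            ssize_t w = write(fd, p, len);
            if (w <= 0)
                _exit(1);
            p += w;
            len -= (size_t)w;
        }
        close(fd);   /* closing the write end delivers EOF to the reader */
        _exit(0);
    }

    /* Parent: the consumer sees what looks like an ordinary file name. */
    int result = consumer(fifo_path);
    int status;
    waitpid(pid, &status, 0);
    unlink(fifo_path);
    return result;
}
```

The data never touches the disk, but it is still copied through the kernel's pipe buffer, so this is not necessarily faster than a temporary file in /dev/shm.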
mmap(), perhaps?