What I am searching for is a decrypt function for crypt(3). The manual only refers me to login(1), passwd(1), encrypt(3), getpass(3), and passwd(5), but as far as I am aware, none of them can be used to decrypt the string.
I put together a small program to show my point; the function I am looking for is somefunctogetbackplaintext(...).
#define _XOPEN_SOURCE
#include <unistd.h>
#include <string.h>
#include <stdio.h>
int
main(int argc, char *argv[])
{
    char *cryptated = crypt(argv[1], "aa"); // Password and salt
    // somefunctogetbackplaintext is the hypothetical function I am looking for.
    // Note that strcmp returns 0 when the strings are equal.
    if (strcmp("somepassword", somefunctogetbackplaintext(argv[1], cryptated, "aa")) == 0) // Plain text, cryptated string, salt
    {
        printf("Success!\n");
    }
    else
    {
        printf("Not a success!\n");
    }
    return 0;
}
crypt does not encrypt passwords (so there is no way to decrypt them). Instead it hashes a given password, producing a string that is impossible to reverse to the original password (because the hash function loses information in the process). The most practical way to attack crypt and recover passwords from their hashes is probably some sort of dictionary attack.
However, none of that is necessary to check whether a given password is correct:
const char *password_and_salt = ...; // e.g. from getpwent or a database
const char *input = argv[1];
const char *hashed = crypt(input, password_and_salt); // crypt returns NULL on failure
if (hashed != NULL && strcmp(hashed, password_and_salt) == 0) {
    printf("your password is correct\n");
}
In other words, you pass the user input to crypt and check whether it matches the known result of an earlier crypt. If so, the passwords match.
Here is a summary excerpt from this article distinguishing between the concepts of encryption and Hashing:
Passwords remain the primary means for online authentication and must
be protected when stored on a server. Encryption is an option, but it
has an inherent weakness in this application because the server
authenticating the password must have the key to decrypt it. An
attacker who steals a file of encrypted passwords might also steal the
key.
Hashing is a better option, especially with the judicious use of salt,
according to mathematician Andrew Regenscheid and computer scientist
John Kelsey of the National Institute of Standards and Technology’s
Computer Security Division.
Encryption is a two-way function; what is encrypted can be decrypted
with the proper key. Hashing, however, is a one-way function that
scrambles plain text to produce a unique message digest. With a
properly designed algorithm, there is no way to reverse the hashing
process to reveal the original password. An attacker who steals a file
of hashed passwords must then guess the password.
(emphasis mine)
Also (from comments) this link plainly states: crypt is the library function which is used to compute a password hash...
As the Wikipedia article about crypt states:
Excerpt 1:
crypt is the library function which is used to compute a password hash that can be used to store user account passwords while keeping them relatively secure (a passwd file).
Excerpt 2:
This is technically not encryption since the data (all bits zero) is not being kept secret; it's widely known to all in advance. However, one of the properties of DES is that it's very resistant to key recovery even in the face of known plaintext situations. It is theoretically possible that two different passwords could result in exactly the same hash. Thus the password is never "decrypted": it is merely used to compute a result, and the matching results are presumed to be proof that the passwords were "the same."
So that is the answer to the question: the password is never "decrypted".
Related
(Not to be confused with the DES algorithm subkey generation)
(edit: more examples)
Explanation of problem:
I'm doing this as part of a school assignment where I'm required to recode parts of OpenSSL in C, specifically those pertaining to PKI cryptosystems. So far I've recoded from scratch the core DES algorithm with ecb, cbc, 3des-ecb, and 3des-cbc modes of operation. Other parts of the project include MD5 and SHA256. This portion of the project focuses on RSA key generation, manipulation and usage.
Part of RSA key manipulation includes encrypting a given key with a passphrase.
(not with the pure key + initial vector alone like I've done before with DES)
This requires converting the user-input passphrase to a DES key (and optional additional IV as needed), and then using that to encrypt a RSA key. I know the general term for the function I'm looking for is PBKDF, or Password-Based Key Derivation Function. However, I have not been able (through searching the man pages of OpenSSL or google) to find what exact function (or functions) are used in OpenSSL for key derivation.
Demonstration of DES key generation encrypting RSA keys:
Running the following command with no passphrase generates an unencrypted RSA key, id_rsa.
ssh-keygen -t rsa -f id_rsa
Then running the following commands will encrypt id_rsa with the DES cipher in ECB mode. Each command writes the encrypted version to a new file, so the original is not changed. Use the same passphrase for both commands (password, for example).
openssl rsa -DES-ECB -in id_rsa -out id_rsa_1
openssl rsa -DES-ECB -in id_rsa -out id_rsa_2
You can use head id_rsa and head id_rsa_1 to see how encrypting a key changes the header. If you compare the two new keys with
diff id_rsa_1 id_rsa_2
they will be identical in the header and formatting, but the key itself will be encrypted differently, even though the same passphrase is used. The difference is because the key generation (I believe) generates a new random salt every time it is run. I would assume the hashing algorithm and the number of iterations would be the same. Also, unlike /etc/shadow on Unix machines, the salt doesn't appear to be stored alongside the key (or at least I don't know how to read it).
Demonstration of DES key generation from password:
A more DES-specific example is:
openssl des -P
Running the above command any number of times with the same password will always result in a different key and iv, probably because the salt is different.
My findings and deduced assumptions:
Searching "how are rsa keys encrypted?" brings up a lot of results on using RSA keys to encrypt. (sometimes I expect too much from Google's NLP)
Searching "how are DES keys generated from passphrase?" brings up a lot of results on how to generate the 16 round DES subkeys.
I've skimmed the source of OpenSSL with no luck. I'll do an exhaustive search if absolutely necessary, but the code isn't the most readable or searchable.
php prototype
perl man page
A link I thought would be more helpful than it was
(Note: I don't have an account with OpenSSL but don't think it'd be required to view)
The most helpful findings led me to believe an example prototype of what I'm looking for would look something like this:
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <pwd.h>
// #include <something_else_maybe.h>
int main(void)
{
    int num_iterations = 1000;
    char *salt;
    char *passphrase;
    char *key;
    passphrase = getpass("Password: "); // points to a static buffer, so it must not be freed
    salt = get_some_random_bytes(8); // assumed arbitrary length
    // the function in question
    key = example_pbkdf(md5_function, num_iterations, salt, 8, passphrase, strlen(passphrase));
    printf("Key (in hexadecimal or otherwise) is: %s\n", key);
    free(key);
    free(salt);
    return (0);
}
Things I am specifically looking for:
(Knowing where to look for these answers would be more valuable than the answers themselves, but all help is appreciated. I do need the header/source/prototype/etc in C though.)
The function (if it exists) that operates like the one demonstrated above. It doesn't have to be a perfect match, I'm more concerned about what it does rather than the exact prototyping or usage.
Alternatively, (if it doesn't exist) the "recipe" or series of operations that could be summarized as "the algorithm" I'm looking for.
DES key generation. (though including multiple ciphers, say, AES, is awesome too)
How the salt is stored in an encrypted RSA key, if it is (and if it isn't, how to recover it). I know the IV is stored in the header of a key encrypted with a cipher in CBC mode.
I have currently started coding an electronic journal that is password protected. The user inputs one in (when they first use the program) and then the same password can be used to access records.
How can I do this without writing the password into a file? I want the program to work every time using the same password.
The solution that standard C provides for storing data for future use is files. That is why they exist, and there is no reasonable alternative. C does not keep any of your program’s internal data from run to run. You need to use files for that.
It is not necessary to store the password itself in a file. You can store a hash of the password. A hash is a function that converts some input, such as a string, to another value in a way such that it is hard to figure out an input that produces a specific value. Thus, if somebody learns the stored hash value, it is hard for them to figure out the password.
So, when the user sets a new password, your program would compute the hash of the password and store the hash in a file. When checking a password, your program would read a password entered by the user, compute the hash of it, and compare that hash to the one stored in the file.
Selecting, implementing, and using hash functions is a broad and complicated topic.
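The set-and-check flow described above can be sketched as follows. This is a minimal sketch, not a complete design: the names save_password_hash and check_password_hash are hypothetical, and the hash string is assumed to have been computed already by the caller (for example with crypt(3)), since choosing the hash function is the hard part this answer deliberately leaves open.

```c
#include <stdio.h>
#include <string.h>

/* Write an already-computed password hash to `path`.
 * Returns 0 on success, -1 on failure. */
int save_password_hash(const char *path, const char *hash)
{
    FILE *f = fopen(path, "w");
    if (f == NULL)
        return -1;
    int ok = fputs(hash, f) >= 0;
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}

/* Compare a candidate hash with the one stored in `path`.
 * Returns 1 on match, 0 on mismatch, -1 if the file cannot be read. */
int check_password_hash(const char *path, const char *candidate_hash)
{
    char stored[256];
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;
    if (fgets(stored, sizeof stored, f) == NULL) {
        fclose(f);
        return -1;
    }
    fclose(f);
    stored[strcspn(stored, "\n")] = '\0'; /* drop a trailing newline, if any */
    return strcmp(stored, candidate_hash) == 0;
}
```

On first run the program would hash the new password and call save_password_hash; on later runs it would hash the entered password the same way and call check_password_hash. The plain password never touches the disk.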
When software such as ecryptfs uses AES, it asks for a user password (such as "password123").
The AES algorithm by itself does not call for a user password. So where does the "password123" get thrown into the math?
I'm working to make a C function that encrypts some data using a password. I know the typical way of doing it with OpenSSL and an AES key, but I don't know how to get a user password integrated.
You need to use a key derivation function (KDF). Password-Based Key Derivation Function 2 (PBKDF2) is the current most common approach.
OpenSSL exposes PBKDF2 as PKCS5_PBKDF2_HMAC. It takes a password, a salt, an iteration count (modern systems should use something like 100000 or higher... crank up the number until it takes about 0.3 seconds), and an output length. It also takes a hash function; something in the SHA-2 family (SHA256, SHA384, SHA512) would be a good modern choice.
I wish to know how software uses hash functions to verify that downloaded files are not corrupt.
You should read http://en.wikipedia.org/wiki/File_verification:
"Hash-based verification ensures that a file has not been corrupted by comparing the file's hash value to a previously calculated value. If these values match, the file is presumed to be unmodified." That's how.
Consider the password hash verification process....
you sign up to "www.example.com" and they ask for your password
"your-secret-password" >> gets hashed and becomes gn234hs (for example)
You now have a "reference" hash
you come back a month later and as long as you provide the same password, the hash function will produce the same output gn234hs - which matches the original and verifies that what you entered is the same as what was entered last time.
No big insights there....
what if, instead of feeding in a password, someone feeds a binary representation of a file or a collection of text files into the hashing function?
[010101001010101... huge number] >> hash function
hash function produces 32j4h234j234k23j4h23k4h23kj423kj4h3
you now have a "reference hash" for that file.
Now you get a file off the internet
If you run the file through the same hashing function and you get
32j4h234j234k23j4h23k4h23kj423kj4h3 - the same as the reference hash, just like with a password - you know the file is a bit-for-bit copy of the original.
So the question is: I get how a hash can represent a password that's only a few characters long, but how can a hash represent an unbelievably huge binary sequence or text file, be "sensitive" enough to detect changes, and still be unique?
Basically, the output of a cryptographic hash function (as distinct from an ordinary hash) looks "random", and the number of possible hash values is so huge that, while it's possible for two different inputs to produce the same hash, the chance is small enough to be considered statistically insignificant.
It's a bit oversimplified, but hopefully that helps.
There (obviously) is tons of info on the subject if you google it, e.g. the wiki article linked to already.
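To put rough numbers on "statistically insignificant", assume (as an idealization) that the hash behaves like a uniformly random function with an n-bit output. Then:

```latex
\Pr[\text{two given distinct files collide}] = 2^{-n},
\qquad
\Pr[\text{some collision among } k \text{ files}] \approx \frac{k(k-1)}{2^{n+1}}
```

For a 256-bit hash and k = 2^40 files (about a trillion), the second expression is roughly 2^80 / 2^257 = 2^-177, which is why a matching hash is treated as proof that the files are the same.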
I'm working on a project that involves writing low-level C software for a hardware implementation. We want to implement a new feature for our devices that our users can unlock when they purchase an associated license key.
The desired implementation steps are simple. The user calls us up, requests the feature, and sends us a payment. Next, we email them a product key, which they input into their hardware to unlock the feature.
Our hardware is not connected to the internet. Therefore, an algorithm must be implemented in such a way that these keys can be generated from both the server and from within the device. Seeds for the keys can be derived from the hardware serial number, which is available in both locations.
I need a simple algorithm that can take sequential numbers and generate unique, non-sequential keys of 16-20 alphanumeric characters.
UPDATE
SHA-1 looks to be the best way to go. However, what I am seeing from sample output of SHA-1 keys is that they are pretty long (40 chars). Would I obtain sufficient results if I took the 40 char key and, say, truncated all but the last 16 characters?
You could just concatenate the serial number of the device, the feature name/code and some secret salt and hash the result with SHA1 (or another secure hashing algorithm). The device compares the given hash to the hash generated for each feature, and if it finds a match it enables the feature.
By the way, to keep the character count down I'd suggest using base64 as the encoding after the hashing pass.
SHA-1 looks to be the best way to go. However, what I am seeing from sample output of SHA-1 keys is that they are pretty long (40 chars). Would I obtain sufficient results if I took the 40 char result and, say, truncated all but the last 16 characters?
Generally it's not a good idea to truncate hashes, they are designed to exploit all the length of the output to provide good security and resistance to collisions. Still, you could cut down the character count using base64 instead of hexadecimal characters, it would go from 40 characters to 27.
Hex: a94a8fe5ccb19ba61c4c0873d391e987982fbbd3
Base64: qUqP5cyxm6YcTAhz05Hph5gvu9M
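As an illustration of where the 27 characters come from, here is a small sketch of a base64 encoder that omits the trailing '=' padding. The function name base64_nopad is hypothetical; a real project might use a library encoder instead.

```c
#include <stddef.h>

/* Encode `len` bytes as base64 without '=' padding.
 * `out` must have room for (len * 4 + 2) / 3 characters plus a NUL. */
void base64_nopad(const unsigned char *in, size_t len, char *out)
{
    static const char tbl[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    size_t o = 0;

    for (size_t i = 0; i < len; i += 3) {
        size_t n = len - i < 3 ? len - i : 3;   /* bytes in this group */
        unsigned long v = (unsigned long)in[i] << 16;
        if (n > 1) v |= (unsigned long)in[i + 1] << 8;
        if (n > 2) v |= in[i + 2];

        /* Each 6-bit slice of the 24-bit group becomes one character. */
        out[o++] = tbl[(v >> 18) & 63];
        out[o++] = tbl[(v >> 12) & 63];
        if (n > 1) out[o++] = tbl[(v >> 6) & 63];
        if (n > 2) out[o++] = tbl[v & 63];
    }
    out[o] = '\0';
}
```

A 20-byte SHA-1 digest is 160 bits, and 160 / 6 rounded up is 27, so the digest shown in hex above comes out as the 27-character string shown next to it.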
---edit---
Actually, @Nick Johnson argues convincingly that hashes can be truncated without big security implications (though each bit you drop doubles the chance of a collision).
You should also use an HMAC instead of naively prepending or appending the key to the message before hashing. Per Wikipedia:
The design of the HMAC specification was motivated by the existence of
attacks on more trivial mechanisms for combining a key with a hash
function. For example, one might assume the same security that HMAC
provides could be achieved with MAC = H(key ∥ message). However, this
method suffers from a serious flaw: with most hash functions, it is
easy to append data to the message without knowing the key and obtain
another valid MAC. The alternative, appending the key using MAC =
H(message ∥ key), suffers from the problem that an attacker who can
find a collision in the (unkeyed) hash function has a collision in the
MAC. Using MAC = H(key ∥ message ∥ key) is better, however various
security papers have suggested vulnerabilities with this approach,
even when two different keys are used.
For more details on the security implications of both this and length truncation, see sections 5 and 6 of RFC2104.
One option is to use a hash as Matteo describes.
Another is to use a block cipher (e.g. AES). Just pick a random nonce and invoke the cipher in counter mode using your serial numbers as the counter.
Of course, this will make the keys invertible, which may or may not be a desirable property.
You can use an Xorshift random number generator to generate a unique 64-bit key, and then encode that key using whatever scheme you want. If you use base-64, the key is 11 characters long. If you use hex encoding, the key would be 16 characters long.
The Xorshift RNG is basically just a bit mixer, and there are versions with a guaranteed period of 2^64 - 1 (every non-zero 64-bit state is visited exactly once). Each step is an invertible mapping, so distinct inputs always produce distinct outputs.
The other option is to use a linear feedback shift register, which also will generate a unique number for each different input.
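A minimal sketch of the Xorshift idea, assuming Marsaglia's 64-bit variant with the (13, 7, 17) shift triple and a single mixing round applied to the serial number. The function names are hypothetical. Note that nothing here is secret: anyone who knows the algorithm can compute the key for any serial, and one round mixes small inputs only weakly, so this shows the uniqueness property rather than a secure licensing scheme.

```c
#include <stdint.h>
#include <stdio.h>

/* One xorshift round. Each of the three steps is invertible, so two
 * different inputs can never produce the same output. Zero maps to zero. */
uint64_t xorshift64(uint64_t x)
{
    x ^= x << 13;
    x ^= x >> 7;
    x ^= x << 17;
    return x;
}

/* Format the mixed serial number as 16 hex characters plus a NUL. */
void serial_to_key(uint64_t serial, char key_out[17])
{
    snprintf(key_out, 17, "%016llx", (unsigned long long)xorshift64(serial));
}
```

Applying xorshift64 several times, or mixing in a secret constant first, would spread the bits further; whether that is enough depends on how much the keys need to resist guessing.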