Storing metadata about password hashes in the database? - database

I was talking to a friend of mine and he stated that to store user passwords in the database, he, and I quote:
1. Takes a user's password
2. Hashes it with a random salt
3. Then, he prefixes the result of #2 with the hash type and the salt, joining them into a pipe-delimited string (e.g. SHA1|RANDOMSALT|afde4343....)
4. Stores the result of #3 in the db
He claims that it's readable, as he "can instantly know" what type of hash digest is used.
I don't think I've ever seen this before, but I'm looking for reasons why it would be bad, aside from the fact that an unauthorized user can instantly know what type of hash is used to encrypt the passwords, and the extra space required for storing the field.
My gut reaction is that this approach to working with passwords is silly, as any indication to help an attacker break passwords is a weakness. Any other reasons I should use to convince him that this is a bad idea?
Thanks!

It is a good idea. Everybody does that, including in the /etc/passwd (/etc/shadow...) file on Unix systems.
Any decent security model will assume that the attacker knows what kind of hash function is used, if only because the hashing system is a piece of software, stored on a hard disk. Conversely, believing in the virtue of not announcing which hash function is used is akin to security through obscurity and that's bad.
So do not try to convince your friend that what he is doing is bad. Instead, let him convince you that what he is doing is good.
PS: this is not password encryption, this is password hashing. Hashing is not a kind of encryption.

It's also good practice to store the hash "cost" (number of times the hash has been stretched), even crypt() does that. Possibly, the worst issue is that you first have to query the database (to know the salt, cost, algorithm) and then check against your generated hash, this requires an extra trip to the DB.
Why is this a good practice? Well, if you have user-specific salts, they have to be stored somewhere... Also, you may (in the future) want to escalate the cost (number of times a hash is "re-hashed") of your hashes to keep up with Moore's Law; if you don't know the original cost you won't be able to do anything.
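As a minimal sketch of the idea in these answers, the record below keeps the algorithm name, the cost, and the salt next to the hash, so a later login can both verify the password and notice that the cost is out of date. The field order, the pbkdf2_sha256 label, and the function names are assumptions made for illustration, not something the original posts specify.

```python
import hashlib
import hmac
import os

def make_record(password: str, iterations: int = 100_000) -> str:
    """Return 'algorithm|iterations|salt_hex|hash_hex' for storage."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return "|".join(["pbkdf2_sha256", str(iterations), salt.hex(), digest.hex()])

def check_record(password: str, record: str, wanted_iterations: int):
    """Return (password_ok, needs_rehash) using only what the record says."""
    algorithm, iterations, salt_hex, hash_hex = record.split("|")
    if algorithm != "pbkdf2_sha256":
        raise ValueError(f"unknown scheme: {algorithm}")
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode(), bytes.fromhex(salt_hex), int(iterations)
    )
    ok = hmac.compare_digest(digest.hex(), hash_hex)
    # A correct login with an outdated cost is the moment to re-hash and store anew.
    return ok, ok and int(iterations) < wanted_iterations
```

Without the stored cost, the second return value could not be computed, which is the point made above about escalating the cost later.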
I strongly recommend that you read this blog post about (secure) password hashing.

Related

Should I encrypt username/email fields in my MongoDB?

So I am encrypting fields in the database, but I don't think I can encrypt the user's username or email because I use those fields to find the user. I could hash them instead, but since I don't think I can use a unique salt per username/email someone could just use a rainbow table to find the hidden username/email.
I guess it is OK to not encrypt them? I would like to make the website as secure as possible. Would hashing them make sense? I could find a user by their _id instead of username/email, but I wouldn't have their _id until I find the user.
What I am doing currently:
const user = await new db.userModel({
    email: email,
    username: username,
    stuff: cryptr.encrypt(stuff),
});
// ...
const user = await db.userModel.findOne({
    email: email
}).exec();
EDIT: I guess hashing would not make sense, since I cannot un-hash the username/email. Not sure what I was thinking.
This question should really be asked on infosec Stackexchange.
Encryption is usually pointless as a defense against hackers, because you need to access the key to use the encrypted database. That means that you need to save it next to the database. If your server becomes compromised, then the hacker will simply decrypt the database with the key. Of course it's better than doing nothing, because it is possible (maybe improbable) that the hacker will compromise only the database. Horatiu Jeflea however mentions other important reasons to encrypt the database, which you should especially consider if you are not working on the project alone.
Hashing usernames is only possible if you don't need to display them, but usernames are usually public anyway, so it improves security very little.
Hashing emails is an interesting problem. You asked for them for a reason. You presumably need to contact your users. If you hash them, you won't be able to do it, and if you don't need to do it, then you don't need to (and shouldn't) save them in the first place. If however they are only part of the authentication process, then it would be a possible solution.
Rainbow tables can't really break modern hashing algorithms, although the biggest mitigation is the aforementioned salt, which makes most attacks on the hash MUCH harder. You also have to make sure that you are not using vulnerable algorithms like md5, but safe ones like sha256 or bcrypt with a sufficient cost. You should hash the passwords with one of those. Also note that you could use the same salt for all the hashed fields, and the salt could even be one of the public fields (username?), although this would reduce the security a little bit. There are very few excuses for not using a salt.
In summary: you can't hash them, and encryption is probably not worth it unless you can sufficiently isolate the key, or need it because of something other than external hackers.
I guess it is OK to not encrypt them?
In most cases, yes. Passwords should be hashed (+ salt..) and sensitive data should be encrypted. But username or email in most cases should not be sensitive.
But let's assume they will be encrypted in DB.
You are storing encrypt(username) in your DB, so in order to search for that, instead of using username, use encrypt(username). Of course sorting may cause some headaches, but finding the user should be efficient.
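Searching by encrypt(username) only works if encrypting the same value twice gives the same ciphertext; a library that uses a random IV or salt per call will not (an assumption worth checking for cryptr). One alternative, swapped in here for illustration and not taken from the answer, is to store a keyed HMAC of the normalized value in a separate indexed column and look users up by that. The function name and the key below are hypothetical.

```python
import hashlib
import hmac

def lookup_token(email: str, key: bytes) -> str:
    """Deterministic keyed token for exact-match lookups (not reversible)."""
    normalized = email.strip().lower()
    return hmac.new(key, normalized.encode(), hashlib.sha256).hexdigest()

# Hypothetical key; in practice it would be loaded from outside the database.
LOOKUP_KEY = b"example-key-not-for-production"

# Store lookup_token(email, LOOKUP_KEY) in an indexed column next to the
# encrypted email, then query by the token instead of by the plaintext.
```

Because the token depends on a secret key, a precomputed table of plain hashes of common addresses does not help an attacker who only has the database.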
Think of encryption (in your case) not as protection against hackers, but against people who can read those records. For example, developers who are investigating a production issue, or a DBA: you want some (not all) fields to be hard for them to read. Storing the key on a different machine, ideally in a key management tool, will add an extra security layer.

The point of hashing, and problems with it?

So from what I have seen it is impossible to decrypt any hashing algorithm such as MD5 or SHA-1 without brute forcing it or using rainbow tables. This seemed to confuse me on a few aspects of using hashes. These confusing points are:
1. What would be the point of hashing in the first place if they can't be decrypted?
2. How would hashed passwords be able to be used in a database?
3. Also, since people say it is like the modulo operation, what, if anything, is preventing multiple inputs from equating to the same hash?
If somebody simply does SHA1 or MD5 on a password, then they get almost no protection.
That's why it's important to understand the right way to handle "password hashing". Please read Our password hashing has no clothes
To answer your questions:
1. You can verify the user without "decrypting the hash": you simply "hash" the user entered password (along with salt and other parameters) upon login and verify that it matches the expected result that is stored in the database.
2. See 1 and the Troy Hunt link
3. People who say it is like a modulo operation are making a bad analogy: they are non-experts on this subject. Anyway, the properties of the "hash" function make it hard to find collisions, and the salt prevents two users with the same password from having the same password "hashes" in the database.
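The two points about verification and salts can be sketched as follows. PBKDF2 from the standard library stands in for whatever password "hash" is actually used, and the iteration count and names are illustrative assumptions.

```python
import hashlib
import hmac
import os

def hash_password(password: str, salt: bytes) -> bytes:
    # Iteration count is an arbitrary illustrative value.
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 50_000)

# Two users choose the same password but get different random salts.
salt_a, salt_b = os.urandom(16), os.urandom(16)
stored_a = hash_password("correct horse", salt_a)
stored_b = hash_password("correct horse", salt_b)

def login(attempt: str, salt: bytes, stored: bytes) -> bool:
    # Nothing is decrypted: the attempt is hashed again and compared.
    return hmac.compare_digest(hash_password(attempt, salt), stored)
```

Here stored_a and stored_b differ even though the passwords are identical, and login() succeeds only when the re-computed value matches the stored one.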
Other resources:
Salted password hashing - Doing it right
Method to Protect Passwords in Databases for Web Applications -- advanced reading: solves other problems with current solutions to protecting passwords. If you wonder why I put "hash" in quotes above, this will explain it.
It's not like a modulo. With a good cryptographic hash, two different inputs are extremely unlikely to produce the same output, even though collisions must exist in principle because the output has a fixed size. If you enter your passwords in a database as hashes, then all you need to do is to hash the password entry and check it against what you have stored in the database. This way, you are not storing readable passwords in the database which are openly visible to others. Normally, you would have a private key, some salt and something unique like a timestamp included in your hashing algorithm to ensure that it cannot be easily spoofed.
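On the question of multiple inputs mapping to the same hash: nothing prevents it outright, since the digest has a fixed size, but at full length such pairs are infeasible to find. A small sketch, assuming SHA-256 and an artificially shortened digest, shows how quickly collisions appear once the output space is tiny:

```python
import hashlib

def first_collision(prefix_bytes: int):
    """Find two different inputs whose SHA-256 digests share a prefix."""
    seen = {}
    i = 0
    while True:
        message = str(i).encode()
        prefix = hashlib.sha256(message).digest()[:prefix_bytes]
        if prefix in seen:
            return seen[prefix], message
        seen[prefix] = message
        i += 1

a, b = first_collision(2)  # only 65,536 possible 2-byte prefixes
```

With a 2-byte prefix a collision must occur within 65,537 inputs; with the full 32 bytes the same search would not finish in any practical time.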
This may help you further:
http://searchsqlserver.techtarget.com/definition/hashing
Even if hashing is basically irreversible, the problem, as pointed out before, is that each hash is nearly unique. That means that by using websites like md5decrypt, which contain a lot of different words and their hashes, one may find the password he is looking for.
That is if the password isn't strong enough in the first place. Obviously one shouldn't use the password "password" for instance because it will probably be found in most of the websites like md5decrypt.
What you should do to protect passwords on your website is actually simple. First, don't use old hashes like md5 or sha1. Use at least sha256, and if you have enough SQL storage, sha384 or sha512. You should know that most online hash databases only cover the most commonly used hashes (let's say md5, sha1, sha256 in most cases). So you should find a hash type that isn't well represented in online databases.
Then you should (you actually have to) use a salt when hashing users' passwords, that is, add some word, letters, whatever, to the password before you hash it, then store that salt somewhere so you can still allow people to log in. You could also add a pepper to the salt to make the whole thing stronger.
While using the salt, try to find a way that hackers won't think about, for instance double the salt, or triple it, or try different ways to concat the salt and the actual password, etc. You could also make a double encryption with double salt, like sha512(sha384()), which would be almost impossible to find.
But, please, do not store unencrypted passwords!

Storing manipulated hashes instead of the correct ones

I had an idea about storing passwords in databases: since passwords can be cracked by simply looking up a hash in rainbow tables (etc. etc.), would it be much (or even a little) safer to store a manipulated hash instead of the real one? In my case, it's not a string hashed twice or something - I have a custom pattern of "scrambling" a hash (I'd prefer not to mention my approach to this), so I figured I'd ask if it's worth the trouble before I do something that's useless.
Passwords in the database are currently encrypted with Blowfish (salts are completely random) and SHA-1, is this otherwise safe enough (yeah, you can never be too safe - but should it suffice)? We really don't have many users either, as the site doesn't draw much attention.
I'm absolutely no expert in this kind of stuff, so go easy on me. The only thing I know is that people are getting better and better at cracking passwords (and the possibilities seem to be increasing).
I'd prefer not to mention my approach to this
Security through obscurity is not security.
If the users' passwords are long enough, you add a long enough salt, and you use a good hashing algorithm, you won't be able to find the hash in a rainbow table.
Take a look at for example: http://freerainbowtables.com which are distributed rainbow tables and see where they are.
You can, however, instead of scrambling your password yourself (or with some self-made function), use more iterations when hashing.

Why restrict the length of a password?

I've just signed up to a site to purchase some goods, and when I tried to enter my (reasonably secure) password I was informed it was too long, and that I should enter a password between 5 & 10 characters! What is the point in that? Who makes decisions like this? Surely the ideal password would be a really long and complicated one? Why do people insist on trying to restrict what types of passwords you can use?
Have you had to implement a login to a website? Was the login for secure purposes (e.g. purchasing goods). What (if any) restrictions did you place on the user's password? What were your reasons for the decision?
Restricting the size of a password is an attempt to save storage space. It pretty much indicates that your password is being stored plainly in their database, so they want to restrict its size. Otherwise it's just a restriction because the implementors don't know any better. Either way it's a bad sign.
You might want to contact the admins of the site and ask them about it. They should be storing hashes, not passwords, which are always the same size no matter how big the password is. There really should be no limit to the size of password you enter, nor the domain of characters you're permitted to input.
The most common reason for this is that the front-end integrates with some old legacy system that does not handle more than a given number of characters.
Seems especially stupid, given that any half decent website does not store plaintext passwords in their database, they store a one way hash of that password (which will always be a set length depending on the algorithm used, for example sha1 is a 160 bit digest) and then rehash that password on login to make sure that the newly hashed password matches the stored one.
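The fixed-length claim is easy to check. SHA-1 is used below only because the text mentions it, not as a recommendation for password storage:

```python
import hashlib

short_digest = hashlib.sha1(b"abc12").digest()
long_digest = hashlib.sha1(b"a much longer pass phrase " * 100).digest()

# Both digests are 20 bytes (160 bits), whatever the input length was.
digest_lengths = (len(short_digest), len(long_digest))
```

So the column that holds the hash does not need to grow with the password, and storage is not a reason to cap password length.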
Other than for frontend design aesthetics, I agree, it doesn't make any sense to enforce a maximum password length. Minimum length is entirely different though, for obvious reasons.
The length restriction is probably due to a storage space concern, but it might be a really bad anti-scripting measure. I'd be a lot more confident if my bank told me my password was too short, rather than too long. Whenever I'm told my password is too short, or "special" characters are not allowed I think, "Oh, they must not have found my password in their dictionary... facepalm."
Any characters should be allowed. Pass phrases should be encouraged, not discouraged. They're much easier to remember than cryptic passwords and much harder to crack since they won't be in a lookup table.
Some (poorly designed) websites have maximum password lengths for a simple reason: that's all the space they have in their database to store your password. There's a good chance they're not hashing it or processing it at all, meaning it's stored in plain text. For websites like that I use single-use, throwaway passwords every time. It's a poor design, and it's unfortunate that people still use it.
It could be that the algorithm they use for encryption doesn't work well with large passwords or that they only have limited storage to store it. Both are very poor reasons, I know, but it's possible.
If I were to make password rules, it would only be things to protect users, like forcing them to use at least one special character and number or mixing lower and upper case.
It could be because they are storing your password as plain text and are trying to save space, but it might also be to try and stop people making their passwords really long and then forgetting them, which means that the company has to send an email with your password, which is a bit of a hassle.
The only possible reason to limit a password in that manner would be to simplify the database table, and that's a bad reason. Long, complicated passwords should be allowed!
Furthermore, the site should not be storing the password at all, but rather storing a crypto hash. Since the hash is a fixed size, that makes the database very simple and storage requirements small.

Preferred Method of Storing Passwords In Database

What is your preferred method/datatype for storing passwords in a database (preferably SQL Server 2005)? The way I have been doing it in several of our applications is to first use the .NET encryption libraries and then store them in the database as binary(16). Is this the preferred method or should I be using a different datatype or allocating more space than 16?
I store the salted hash equivalent of the password in the database and never the password itself, then always compare the hash to the generated one of what the user passed in.
It's too dangerous to ever store the literal password data anywhere. This makes recovery impossible, but when someone forgets or loses a password you can run through some checks and create a new password.
THE preferred method: never store passwords in your DB. Only hashes thereof. Add salt to taste.
I do the same thing you've described, except it is stored as a String. I Base64 encode the encrypted binary value. The amount of space to allocate depends on the encryption algorithm/cipher strength.
I think you are doing it right (given that you use a Salt).
store the hash of the salted password, such as bcrypt(nonce+pwd). You may prefer bcrypt over SHA1 or MD5 because it can be tuned to be CPU-intensive, therefore making a brute force attack take far longer.
add a captcha to the login form after a few login errors (to avoid brute-force attacks)
if your application has a "forgot my password" link, make sure it does not send the new password by email, but instead it should send a link to a (secured) page allowing the user to define a new password (possibly only after confirmation of some personal information, such as the user's birth date, for example). Also, if your application allows the user to define a new password, make sure you require the user to confirm the current password.
and obviously, secure the login form (typically with HTTPS) and the servers themselves
With these measures, your users' passwords will be fairly well protected against:
=> offline dictionary attacks
=> live dictionary attacks
=> denial of service attacks
=> all sorts of attacks!
Since the result of a hash function is a series of bytes in the range 0 to 255 (or -128 to 127, depending on the signedness of your 8-bit data type), storing it as a raw binary field makes the most sense, as it is the most compact representation and requires no additional encoding and decoding steps.
Some databases or drivers don't have great support for binary data types, or sometimes developers just aren't familiar enough with them to feel comfortable. In that case, using a binary-to-text encoding like Base-64 or Base-85, and storing the resulting text in a character field is acceptable.
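To make the trade-off between a raw binary field and a text encoding concrete, here is a short sketch comparing the sizes of one 20-byte digest in each representation (SHA-1 is used only as an example digest size):

```python
import base64
import hashlib

digest = hashlib.sha1(b"example").digest()      # 20 raw bytes
as_base64 = base64.b64encode(digest).decode()   # 28 characters
as_base85 = base64.b85encode(digest).decode()   # 25 characters
as_hex = digest.hex()                           # 40 characters

# Each text form decodes back to the same raw bytes.
round_trip_ok = (
    base64.b64decode(as_base64) == digest
    and base64.b85decode(as_base85) == digest
    and bytes.fromhex(as_hex) == digest
)
```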
The size of the field necessary is determined by the hash function that you use. MD5 always outputs 16 bytes, SHA-1 always outputs 20 bytes. Once you select a hash function, you are usually stuck with it, as changing requires a reset of all existing passwords. So, using a variable-size field doesn't buy you anything.
Regarding the "best" way to perform the hashing, I've tried to provide many answers to other SO questions on that topic:
Encrypting passwords
Encrypting passwords
Encrypting passwords in .NET
Salt
Salt: Secret or public?
Hash iterations
I use the sha hash of the username, a guid in the web config, and the password, stored as a varchar(40). If they want to brute force / dictionary they'll need to hack the web server for the guid as well. The username breaks creating a rainbow table across the whole database if they do find the password. If a user wants to change their username, I just reset the password at the same time.
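// HashPasswordForStoringInConfigFile takes (password, passwordFormat),
// so the three parts are concatenated; SHA1 yields 40 hex characters.
System.Web.Security.FormsAuthentication.HashPasswordForStoringInConfigFile(
    username.ToLower().Trim()
        + ConfigurationManager.AppSettings["salt"]
        + password,
    "SHA1"
);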
A simple hash of the password, or even (salt + password) is not generally adequate.
see:
http://www.matasano.com/log/958/enough-with-the-rainbow-tables-what-you-need-to-know-about-secure-password-schemes/
and
http://gom-jabbar.org/articles/2008/12/03/why-you-should-use-bcrypt-to-store-your-passwords
Both recommend the bcrypt algorithms. Free implementations can be found online for most popular languages.
You can use multiple hashes in your database, it just requires a little bit of extra effort. It's well worth it though if you think there's the remotest chance you'll need to support additional formats in the future. I'll often use password entries like
{hashId}${salt}${hashed password}
where "hashId" is just some number I use internally to recognize that, e.g., I'm using SHA1 with a specific hash pattern; "salt" is a base64-encoded random salt; and "hashed password" is a base64-encoded hash. If you need to migrate hashes you can intercept people with an old password format and make them change their password the next time they log in.
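A rough sketch of how such an entry could be dispatched on its hashId follows. The two scheme ids, their meanings, and the function names are hypothetical; the answer does not say which algorithms its ids map to.

```python
import base64
import hashlib
import hmac

# Hypothetical scheme ids: 1 = single salted SHA-1 (legacy), 2 = PBKDF2-SHA256.
SCHEMES = {
    "1": lambda pw, salt: hashlib.sha1(salt + pw).digest(),
    "2": lambda pw, salt: hashlib.pbkdf2_hmac("sha256", pw, salt, 100_000),
}
CURRENT_SCHEME = "2"

def encode_entry(scheme_id: str, password: str, salt: bytes) -> str:
    digest = SCHEMES[scheme_id](password.encode(), salt)
    b64 = lambda raw: base64.b64encode(raw).decode()
    return f"{scheme_id}${b64(salt)}${b64(digest)}"

def verify_entry(password: str, entry: str):
    """Return (password_ok, must_migrate)."""
    scheme_id, salt_b64, hash_b64 = entry.split("$")
    salt = base64.b64decode(salt_b64)
    digest = SCHEMES[scheme_id](password.encode(), salt)
    ok = hmac.compare_digest(digest, base64.b64decode(hash_b64))
    return ok, ok and scheme_id != CURRENT_SCHEME
```

The standard Base64 alphabet never contains "$", so splitting on it is unambiguous. A successful login that returns must_migrate is the point at which the old format can be intercepted.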
As others have mentioned you want to be careful with your hashes since it's easy to do something that's not really secure, e.g., H(salt,password) is far weaker than H(password,salt), but at the same time you want to balance the effort put into this with the value of the site content. I'll often use H(H(password,salt),password).
Finally, the cost of using base64-encoded passwords is modest when compared to the benefits of being able to use various tools that expect text data. Yeah, they should be more flexible, but are you ready to tell your boss that he can't use his favorite third party tool because you want to save a few bytes per record? :-)
Edited to add one other comment: if I suggested deliberately using an algorithm that burned even a 1/10th of a second hashing each password I would be lucky to just be laughed out of my boss's office. (Not so lucky? He would jot something down to discuss at my next annual review.) Burning that time isn't a problem when you have dozens, or even hundreds, of users. If you're pushing 100k users you'll usually have multiple people logging in at the same time. You need something fast and strong, not slow and strong. The "but what about the credit card information?" is disingenuous at best since stored credit card information shouldn't be anywhere near your regular database, and would be encrypted by the application anyway, not individual users.
If you are working with ASP.Net you can use the built in membership API.
It supports many types of storage options, including: one way hash, two way encryption, md5 + salt. See http://www.asp.net/learn/security for more info.
If you don't need anything too fancy, this is great for websites.
If you are not using ASP.Net here is a good link to a few articles from 4guys and codeproject
https://web.archive.org/web/20210519000117/http://aspnet.4guysfromrolla.com/articles/081705-1.aspx
https://web.archive.org/web/20210510025422/http://aspnet.4guysfromrolla.com/articles/103002-1.aspx
http://www.codeproject.com/KB/security/SimpleEncryption.aspx
Since your question is about storage method & size I will address that.
Storage type can be either binary or text representation (base64 is the most common). Binary is smaller but I find working with text easier. If you are doing per user salting (different salt per password) then it is easier to store salt+hash as a single combined string.
The size is hash algorithm dependent. The output of MD5 is always 16 bytes, SHA1 is always 20 bytes. SHA-256 & SHA-512 are 32 & 64 bytes respectively. If you are using text encoding you will need slightly more storage depending on the encoding method. I tend to use Base64 because storage is relatively cheap. Base64 is going to require a roughly 33% larger field.
If you have per user salting you will need space for the salt also. Putting it all together 64bit salt + SHA1 hash (160 bit) base64 encoded takes 40 characters so I store it as char(40).
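The size arithmetic above can be checked directly. This is only a sizing sketch; the salt-then-hash byte layout is an assumption, and a single SHA-1 is not being suggested as the hashing method.

```python
import base64
import hashlib
import os

salt = os.urandom(8)                                   # 64-bit salt
digest = hashlib.sha1(salt + b"password").digest()     # 160-bit hash
combined = base64.b64encode(salt + digest).decode()
# 8 + 20 = 28 bytes, and Base64 emits 4 characters per 3 bytes (rounded up).

sizes = {name: hashlib.new(name).digest_size
         for name in ("md5", "sha1", "sha256", "sha512")}
```

len(combined) comes out at 40, matching the char(40) column, and sizes holds the digest lengths in bytes for the four algorithms mentioned.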
Lastly if you want to do it right you shouldn't be using a single hash but a key derivation function like PBKDF2. SHA1 and MD5 hashes are insanely fast. Even a single threaded application can hash about 30K to 50K passwords per second; that's up to 200K passwords per second on a quad core machine. GPUs can hash 100x to 1000x as many passwords per second. With speeds like that, brute force attacking becomes a viable intrusion method. PBKDF2 allows you to specify the number of iterations to fine tune how "slow" your hashing is. The point isn't to bring the system to its knees but to pick a number of iterations so that you cap the upper limit on hash throughput (say 500 hashes per second). A future proof method would be to include the number of iterations in the password field (iterations + salt + hash). This would allow increasing iterations in the future to keep pace with more powerful processors. To be even more flexible use varchar to allow potentially larger/alternative hashes in the future.
The .Net implementation is Rfc2898DeriveBytes
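One way to pick the iteration count is to measure it on the target hardware. The sketch below is an assumption-laden illustration, not a tuned procedure: it times a single probe run of PBKDF2-HMAC-SHA256 and scales linearly to a target of roughly 500 hashes per second (about 2 ms each).

```python
import hashlib
import time

def calibrate_iterations(target_seconds: float, probe: int = 20_000) -> int:
    """Estimate a PBKDF2 iteration count that takes about target_seconds."""
    start = time.perf_counter()
    hashlib.pbkdf2_hmac("sha256", b"probe-password", b"probe-salt-16byt", probe)
    elapsed = time.perf_counter() - start
    return max(1, int(probe * target_seconds / elapsed))

# 500 hashes per second per core corresponds to about 2 ms per hash.
iterations = calibrate_iterations(target_seconds=1 / 500)
```

The result varies by machine and by run, which is one more reason to store the iteration count alongside the salt and hash rather than hard-coding it.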
http://msdn.microsoft.com/en-us/library/system.security.cryptography.rfc2898derivebytes.aspx

Resources