Data Encryption - database

A database that stores a lot of credit card information is an inevitable part of the system we have just completed. What I want, though, is ultimate security of the card numbers, whereby we set up a mechanism to encrypt and decrypt but we ourselves cannot decrypt any given number.
What I am after is a way to secure this information even down at the database level so no one can go in and produce a file of card numbers. How have others overcome this issue? What is the 'Standard' approach to this?
As for usage of the data well the links are all private and secure and no transmission of the card number is performed except when a record is created and that is encrypted so I am not worried about the front end just the back end.
Well the database is ORACLE so I have PL/SQL and Java to play with.

There's no shortage of processors willing to store your CC info and exchange it for a token with which you can bill against the stored number. That takes card storage, and much of the PCI compliance burden, off your hands, but still allows on-demand billing. Depending on why you need to store the CC, that may be a better alternative.
Most companies refer to this as something like "Customer Profile Management", and are actually pretty reasonable on fees.
A few providers I know of (in no particular order):
Authorize.NET Customer Information Manager
TrustCommerce Citadel
BrainTree

Unless you are a payment processor you don't really need to store any kind of CC information.
Review your requirements; there really are not many cases where you need to store CC information.

Don't store the credit card numbers, store a hash instead. When you need to verify if a new number matches a stored number, take a hash of the new number and compare it to the stored hash. If they match, the number is (in theory) the same.
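A minimal sketch of the compare-by-hash idea, in Python for illustration. It swaps the plain hash for a keyed one (HMAC-SHA-256), because card numbers are short enough that a bare hash can be brute-forced; the key value and the function names are assumptions, not part of the original suggestion.

```python
import hashlib
import hmac

def card_fingerprint(card_number: str, secret_key: bytes) -> str:
    """Return a keyed SHA-256 fingerprint of a card number."""
    # Keep digits only, so spacing and dashes do not change the result.
    digits = "".join(ch for ch in card_number if ch.isdigit())
    return hmac.new(secret_key, digits.encode("ascii"), hashlib.sha256).hexdigest()

def same_card(candidate: str, stored_fingerprint: str, secret_key: bytes) -> bool:
    """Compare a newly entered number against a stored fingerprint."""
    fresh = card_fingerprint(candidate, secret_key)
    return hmac.compare_digest(fresh, stored_fingerprint)

# Hypothetical key; in practice it would live outside the database.
KEY = b"hypothetical-key-kept-outside-the-database"

stored = card_fingerprint("4111 1111 1111 1111", KEY)
print(same_card("4111-1111-1111-1111", stored, KEY))
print(same_card("4242 4242 4242 4242", stored, KEY))
```

Only the fingerprint is stored, so this supports matching but not billing against the stored card.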
Alternatively, you could encrypt the data by getting the user who enters the card number to enter a pass phrase; you'd use this as an encryption/decryption key.
However, anyone with access to your database and source code (i.e. you and your team) will find it trivial to decrypt that data (e.g. by modifying the live code so that it emails any decryption keys entered to a disposable Hotmail account).

If you are storing the credit card information because you don't want the user to have to re-enter it then hashing of any form isn't going to help.
When do you need to act on the credit card number?
You could store the credit card numbers in a more secure database, and in the main db just store enough information to show to the user, and a reference to the card. The backend system can be much more locked down and use the actual credit card info just for order processing. You could encrypt these numbers by some master password if you like, but the password would have to be known by the code that needs to get the numbers.
Yes, you have only moved the problem around somewhat, but a lot of security is more about reducing the attack footprint rather than eliminating it. If you want to eliminate it then don't store the credit card number anywhere!
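The split between a locked-down card store and the main database could look roughly like this. It is a sketch under assumptions: `CardVault` is a hypothetical in-memory stand-in for the separate backend, and the record layout is invented for illustration.

```python
import secrets

class CardVault:
    """Stand-in for the locked-down backend store (in-memory here)."""

    def __init__(self):
        self._cards = {}

    def store(self, card_number: str) -> str:
        # Hand back a random reference that reveals nothing about the card.
        token = secrets.token_hex(16)
        self._cards[token] = card_number
        return token

    def lookup_for_billing(self, token: str) -> str:
        # Only the order-processing backend would be allowed to call this.
        return self._cards[token]

def main_db_record(vault: CardVault, card_number: str) -> dict:
    """Build what the main database keeps: a reference plus display text."""
    digits = "".join(ch for ch in card_number if ch.isdigit())
    return {
        "card_ref": vault.store(digits),
        "display": "**** **** **** " + digits[-4:],
    }

vault = CardVault()
record = main_db_record(vault, "4111 1111 1111 1111")
print(record["display"])
```

The main database row never contains the full number, so dumping it yields only references and the last four digits.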

If you're using Oracle you might be interested in Transparent Data Encryption. Only available with an Enterprise license though.
Oracle also has utilities for encryption and decryption, for example the DBMS_OBFUSCATION_TOOLKIT package (superseded by DBMS_CRYPTO from Oracle 10g onward).
As for "Standards", the proper standard you are interested in is the PCI DSS standard which describes which measures need to be taken to protect sensitive credit card information.

For an e-commerce type use case (think Amazon 1-Click), you could encrypt the CC (or key) with the user's existing strong password. Assuming you only store a hash of the password, only the user (or a rainbow table - but, it'd have to be run on each user, and would not work if it didn't come up with the same password - not just 1 that hashed the same) can decrypt it.
You'd have to take some care to re-encrypt the data when a password changes, and the data would be worthless (and need to be reentered by the user) if they forgot their password - but, if the payments are user-initiated, then it'd work nicely.
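A rough sketch of the key handling this implies, assuming PBKDF2 from the Python standard library. One derived value is stored to verify the password; a second, derived with a different salt, is never stored and would serve as the card encryption key. The encryption step itself (e.g. AES from a vetted library) is left out, and all names and the iteration count are illustrative.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 100_000  # illustrative work factor, not a recommendation

def derive(password: str, salt: bytes) -> bytes:
    """Derive 32 bytes from a password with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)

def enroll(password: str) -> dict:
    """Create the stored record: two salts and a password verifier."""
    record = {
        "verify_salt": secrets.token_bytes(16),
        "key_salt": secrets.token_bytes(16),
    }
    record["verifier"] = derive(password, record["verify_salt"])
    return record

def card_key(password: str, record: dict):
    """Return the card encryption key, or None if the password is wrong."""
    attempt = derive(password, record["verify_salt"])
    if not hmac.compare_digest(attempt, record["verifier"]):
        return None
    # Recomputed on demand from the password; never written to the database.
    return derive(password, record["key_salt"])

record = enroll("correct horse battery staple")
key = card_key("correct horse battery staple", record)
print(key is not None, card_key("wrong password", record) is None)
```

On a password change, the card data would have to be decrypted with the old key and re-encrypted with the new one, as noted above.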

It would be helpful to know the DB server and language/platform types so we could get more specific, but I would be looking into SHA.

I'd symmetrically encrypt (AES) a secure salted hash (SHA-256 + salt). The salted hash would be enough with a big salt, but the encryption adds a bit extra in case the database and not the code leaks and there are rainbow tables for salted hashes by then or some other means. Store the key in the code, not in the database, of course.
It's worth noting that nothing protects you from crooked teammates; they can also store a copy of the data before hashing, for instance. You have to take good care of the code repository and do frequent code reviews for all code in the credit card handling path. Also try to minimize the time between receiving the data and having it encrypted/hashed, manually ensuring the variable where it was stored is cleared from memory.

Related

Should I encrypt username/email fields in my MongoDB?

So I am encrypting fields in the database, but I don't think I can encrypt the user's username or email because I use those fields to find the user. I could hash them instead, but since I don't think I can use a unique salt per username/email, someone could just use a rainbow table to find the hidden username/email.
I guess it is OK to not encrypt them? I would like to make the website as secure as possible. Would hashing them make sense? I could find a user by their _id instead of username/email, but I wouldn't have their _id until I find the user.
What I am doing currently:
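const user = await new db.userModel({
  email: email,
  username: username,
  stuff: cryptr.encrypt(stuff),
}).save(); // save() returns the promise to await; the bare constructor does not persist anything
// ... later, elsewhere in the code, when looking the user up:
const user = await db.userModel.findOne({
  email: email
}).exec();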
EDIT: I guess hashing would not make sense, since I cannot un-hash the username/email. Not sure what I was thinking.
This question should really be asked on infosec Stackexchange.
Encryption is usually pointless as a defense against hackers, because you need to access the key to use the encrypted database. That means that you need to save it next to the database. If your server becomes compromised, then the hacker will simply decrypt the database with the key. Of course it's better than doing nothing, because it is possible (maybe improbable) that the hacker will compromise only the database. Horatiu Jeflea, however, mentions other important reasons to encrypt the database, which you should especially consider if you are not working on the project alone.
Hashing usernames is only possible if you don't need to display them, but usernames are usually public anyway, so it improves security very little.
Hashing emails is an interesting problem. You asked for them for a reason. You presumably need to contact your users. If you hash them, you won't be able to do it, and if you don't need to do it, then you don't need to (and shouldn't) save them in the first place. If, however, they are only part of the authentication process, then it would be a possible solution.
Rainbow tables can't really break modern hashing algorithms, although the biggest mitigation is the aforementioned salt, which makes most attacks on the hash MUCH harder. You also have to make sure that you are not using vulnerable algorithms like MD5, but a safe one like SHA-256 or bcrypt with a sufficient work factor. You should hash the passwords with one of those. Also note that you could use the same salt for all the hashed fields, and the salt could even be (this would reduce the security a little bit) one of the public fields (username?). There are very few excuses for not using a salt.
In summary: you can't hash them, and encryption is probably not worth it unless you can sufficiently isolate the key, or need it for some reason other than external hackers.
I guess it is OK to not encrypt them?
In most cases, yes. Passwords should be hashed (+ salt..) and sensitive data should be encrypted. But username or email in most cases should not be sensitive.
But let's assume they will be encrypted in DB.
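You are storing encrypt(username) in your DB, so in order to search for that, instead of using username, use encrypt(username). This only works if the encryption is deterministic, i.e. the same input always produces the same ciphertext. Of course sorting may cause some headaches, but finding the user should be efficient.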
Think of encrypting (in your case) not about hackers, but about people who are reading those records. For example developers who are investigating a production issue or a DBA, you want some (not all) fields hard for them to read. Storing the key on a different machine, best in a Key Management Tool, will add an extra security layer.
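One common variation on searching by a transformed value, which is not literally what this answer proposes, is to store a keyed hash of the email next to the encrypted field and query on that. A Python sketch for illustration: `INDEX_KEY`, the dictionary standing in for the collection, and the opaque `email_enc` value are all assumptions.

```python
import hashlib
import hmac

# Hypothetical key, ideally fetched from a key management tool at startup.
INDEX_KEY = b"hypothetical-index-key-from-a-key-manager"

def lookup_index(value: str) -> str:
    """Deterministic keyed hash used only for equality lookups."""
    normalised = value.strip().lower()
    return hmac.new(INDEX_KEY, normalised.encode("utf-8"), hashlib.sha256).hexdigest()

users = {}  # stand-in for the collection, keyed by the lookup index

def add_user(email: str, encrypted_email: bytes) -> None:
    # encrypted_email is whatever the encryption layer produced.
    users[lookup_index(email)] = {"email_enc": encrypted_email}

def find_user(email: str):
    return users.get(lookup_index(email))

add_user("Alice@Example.com", b"<ciphertext>")
print(find_user("alice@example.com") is not None)
print(find_user("bob@example.com") is None)
```

Someone reading the records sees neither the email nor anything they can reverse without the key, while equality lookups still work.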

Encrypted Data only accessible for user as data owner and algorithm

Which method would you choose to make encrypted data accessible only to the user and to an algorithm that processes and evaluates the data? In this case the user would be one of n service users, who would add sensitive data (mostly answers to questions) about himself to the database. The company providing the database shouldn't have any access to the sensitive data, only to the results of the data processing. The results wouldn't allow any conclusions to be drawn about the sensitive data.
What you are looking for is Fully Homomorphic Encryption (FHE). FHE operates on encrypted data. This can be achieved by an encryption scheme that supports two operations on encrypted data. RSA and others only supported one operation until Gentry's work.
With FHE libraries like HElib (there are many now), you can upload your data to the server and give it a function (circuit) to evaluate. FHE schemes in general have semantic security (randomized encryption). The semi-honest server can only see encrypted data and can return the result to you.
Note: They are not practical, yet.
I think the best way to do that is to save only the result. But if you want to save the user's answers, you could use AES with the user's password as the key; the user will then have to enter his password every time to decrypt the data.

One way encrypting primary key

What is the best one way permutation function I could use to digest an e-mail so I can use it as a primary key without storing personal data?
I'm getting my first F2P game ready: a simple yet (hopefully) addictive 2D casual puzzler based on aiming mechanics. It's made with Unity and will be released on Android very soon.
In order for the player to keep the same data across different devices, I have an SQL table with the device e-mail as the primary key, then another string as the savegame data.
But I don't want to store the user e-mail for privacy reasons.
So I thought of digesting it with some function that would use the original e-mail to generate a new string that:
is unique (will never collide with another string generated from a different e-mail address)
is not decipherable (there should be no way to obtain the original e-mail from the digested string - or at least it should be hard enough)
This way I could still use the Android device e-mail to retrieve the savegame data, without storing personal data from the player.
As far as I've researched, the solution seems to be called a one-way permutation function. The problem is that I can't seem to find an appropriate function on the internet; instead, all answers seem to be plagued with solutions for password hashing, which is very interesting (salting, MD5, SHAXXX...) but doesn't meet my first requirement of no collisions.
Thank you in advance for any answer on this topic.
What you need is a cryptographic hash function such as SHA-256. Such functions are designed to be collision resistant; Git, for example, uses the older SHA-1. Most languages/systems support this; just Google "Android SHA-256" along with your language of choice.
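As a sketch of what that looks like (in Python for illustration; the game itself would use its own platform's SHA-256), the e-mail can be normalised and digested into a fixed-length hex string to use as the key. The normalisation step and the function name are assumptions.

```python
import hashlib

def email_key(email: str) -> str:
    """Digest an e-mail address into a 64-character hex key."""
    # Trim and lower-case so the same address always maps to the same key.
    normalised = email.strip().lower()
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()

key = email_key("Player@Example.com")
print(len(key), key == email_key(" player@example.com "))
```

Anyone who can guess an address can still compute its digest and check whether it is in the table, so this hides e-mails from casual reading rather than from a targeted lookup.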
One option is to append a creation timestamp.
Update: If SHA-256 does not provide sufficient collision resistance for your needs, consider a GUID. From RFC 4122: "A UUID is 128 bits long, and can guarantee uniqueness across space and time.". Of course you need to find a good implementation.

Should I change my License Key output from pure md5 output to a common "XXXX-YYYY-ZZZZ" type code?

I'm creating a simple license key system to "keep honest people honest". I don't care about especially stringent cryptography.
If they get too annoyed with the demo limitations, they go to my registration website, pay, and give me their email. I give them a license key.
I'm keeping things really simple, so:
license_key = md5(email + "Salt_String");
I have PHP and C# functions run that same algorithm and get the same key.
The problem is that the output of these functions is a 32-character string like:
A69761CF99316358D04771C5ECFCCDC5
Which is potentially hard to remember/type. Yes, I know about copy/paste, but I want to make it REALLY easy for all paying customers to unlock the software.
Should I somehow convert this long string into something shorter?
Let's say I use only the first 6 digits, so: A69761
There are obviously way more cryptographic collisions in that, but will it matter at all in practical use?
Any other ideas to make the thing more human readable/typeable?
Keeping just 6-10 characters will be enough: the user will not be able to guess the code anyway, and it will be easy to type in.
It would also be a good idea to register each license on your server, so that you can check that the user really is honest and didn't give the license key to another person.
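The truncate-and-group idea could be sketched like this, in Python for illustration (the question uses PHP and C#). It keeps the question's md5(email + "Salt_String") scheme; the 12-character length, the grouping, and the function name are assumptions.

```python
import hashlib

def license_key(email: str, salt: str = "Salt_String",
                groups: int = 3, group_len: int = 4) -> str:
    """Shorten md5(email + salt) to a grouped key such as XXXX-YYYY-ZZZZ."""
    digest = hashlib.md5((email + salt).encode("utf-8")).hexdigest().upper()
    short = digest[: groups * group_len]
    # Split the shortened digest into fixed-size chunks joined by dashes.
    chunks = [short[i:i + group_len] for i in range(0, len(short), group_len)]
    return "-".join(chunks)

print(license_key("customer@example.com"))
```

Twelve hex characters still leave 16^12 possible keys, so accidental collisions between customers are not a practical concern for this purpose.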
In my experience, asking the user to type or copy/paste a 30-character code indeed leads to frustrated customers. It's not that it's so difficult. It's simply a hurdle that people don't care for.
The solution I've used for my business is to have separate trial and purchased downloads. To get their licensed copy, the customer types in their email address and a short user ID on the download form. Entering only the email automatically resends the user ID. You didn't ask about this, but a system to automatically look up whatever code the customer needs is even more important than having a simple system. The download system looks up the user's details in the database and serves a SetupSomeProductCustomerName.exe that has the user's license embedded in it. This setup installs the customer's licensed copy without requiring any further identification or server connections.
This system has worked really well for us. The customer has only one file to back up and no serial numbers to lose to make sure they can reinstall the software in the future.
That said, if you prefer to use a system using a one-way hash, simply use an algorithm that generates a smaller hash. E.g. CRC-32 results in 8 hexadecimal digits.
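For illustration, a CRC-32 based key in Python; `zlib.crc32` is in the standard library, while the salt string and function name are simply carried over or invented for the example.

```python
import zlib

def crc32_hex(data: bytes) -> str:
    """Return the CRC-32 of data as 8 upper-case hexadecimal digits."""
    # Mask to 32 bits so the value is always non-negative.
    return format(zlib.crc32(data) & 0xFFFFFFFF, "08X")

key = crc32_hex(("customer@example.com" + "Salt_String").encode("utf-8"))
print(len(key))
```

CRC-32 is a checksum, not a cryptographic hash, which is fine here given the point below that the algorithm's strength is not what protects the scheme.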
There's no point in the hash being cryptographically secure. A cracker will simply walk through your code, copy the entire block of code that mutates the email address into the license key, and paste that into their keygen. Then they can generate license keys for any email address. They can do that regardless of how complex your hashing algorithm is.
If you want to prevent this, you need to use public key encryption, which results in keys that are far too long to type in. If you go that route, you'll either need to annoy your customers with long keys to paste in or separate key files, or use the personalized download system I described above.

Preferred Method of Storing Passwords In Database

What is your preferred method/datatype for storing passwords in a database (preferably SQL Server 2005)? The way I have been doing it in several of our applications is to first use the .NET encryption libraries and then store them in the database as binary(16). Is this the preferred method, or should I be using a different datatype or allocating more space than 16?
I store the salted hash equivalent of the password in the database and never the password itself, then always compare the hash to the generated one of what the user passed in.
It's too dangerous to ever store the literal password data anywhere. This makes recovery impossible, but when someone forgets or loses a password you can run through some checks and create a new password.
THE preferred method: never store passwords in your DB. Only hashes thereof. Add salt to taste.
I do the same thing you've described, except it is stored as a String. I Base64 encode the encrypted binary value. The amount of space to allocate depends on the encryption algorithm/cipher strength.
I think you are doing it right (given that you use a Salt).
Store the hash of the salted password, such as bcrypt(nonce+pwd). You may prefer bcrypt over SHA1 or MD5 because it can be tuned to be CPU-intensive, therefore making a brute-force attack take far longer.
Add a captcha to the login form after a few login errors (to avoid brute-force attacks).
If your application has a "forgot my password" link, make sure it does not send the new password by email; instead it should send a link to a (secured) page allowing the user to define a new password (possibly only after confirmation of some personal information, such as the user's birth date, for example). Also, if your application allows the user to define a new password, make sure you require the user to confirm the current password.
And obviously, secure the login form (typically with HTTPS) and the servers themselves.
With these measures, your users' passwords will be fairly well protected against:
=> offline dictionary attacks
=> live dictionary attacks
=> denial of service attacks
=> all sorts of attacks!
Since the result of a hash function is a series of bytes in the range 0 to 255 (or -128 to 127, depending on the signedness of your 8-bit data type), storing it as a raw binary field makes the most sense, as it is the most compact representation and requires no additional encoding and decoding steps.
Some databases or drivers don't have great support for binary data types, or sometimes developers just aren't familiar enough with them to feel comfortable. In that case, using a binary-to-text encoding like Base-64 or Base-85, and storing the resulting text in a character field is acceptable.
The size of the field necessary is determined by the hash function that you use. MD5 always outputs 16 bytes, SHA-1 always outputs 20 bytes. Once you select a hash function, you are usually stuck with it, as changing requires a reset of all existing passwords. So, using a variable-size field doesn't buy you anything.
Regarding the "best" way to perform the hashing, I've tried to provide many answers to other SO questions on that topic:
Encrypting passwords
Encrypting passwords
Encrypting passwords in .NET
Salt
Salt: Secret or public?
Hash iterations
I use the sha hash of the username, a guid in the web config, and the password, stored as a varchar(40). If they want to brute force / dictionary they'll need to hack the web server for the guid as well. The username breaks creating a rainbow table across the whole database if they do find the password. If a user wants to change their username, I just reset the password at the same time.
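// HashPasswordForStoringInConfigFile takes (password, passwordFormat),
// so the three parts are concatenated into one string first.
// "SHA1" yields 40 hex characters, matching the varchar(40) column.
System.Web.Security.FormsAuthentication.HashPasswordForStoringInConfigFile(
    username.ToLower().Trim()
        + ConfigurationManager.AppSettings["salt"]
        + password,
    "SHA1"
);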
A simple hash of the password, or even (salt + password) is not generally adequate.
see:
http://www.matasano.com/log/958/enough-with-the-rainbow-tables-what-you-need-to-know-about-secure-password-schemes/
and
http://gom-jabbar.org/articles/2008/12/03/why-you-should-use-bcrypt-to-store-your-passwords
Both recommend the bcrypt algorithm. Free implementations can be found online for most popular languages.
You can use multiple hashes in your database, it just requires a little bit of extra effort. It's well worth it though if you think there's the remotest chance you'll need to support additional formats in the future. I'll often use password entries like
{hashId}${salt}${hashed password}
where "hashId" is just some number I use internally to recognize that, e.g., I'm using SHA1 with a specific hash pattern; "salt" is a base64-encoded random salt; and "hashed password" is a base64-encoded hash. If you need to migrate hashes you can intercept people with an old password format and make them change their password the next time they log in.
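A sketch of that entry format in Python, for illustration only. The registry maps a hash id to a hashing function; id "1" being SHA-256 over salt + password is a hypothetical choice for the example, not the scheme this answer actually uses.

```python
import base64
import hashlib
import hmac
import secrets

# Hypothetical registry: hash id -> function(salt, password_bytes) -> digest.
HASHERS = {
    "1": lambda salt, pw: hashlib.sha256(salt + pw).digest(),
}

def make_entry(password: str, hash_id: str = "1") -> str:
    """Build a '{hashId}${salt}${hashed password}' entry."""
    salt = secrets.token_bytes(16)
    digest = HASHERS[hash_id](salt, password.encode("utf-8"))
    parts = [hash_id,
             base64.b64encode(salt).decode("ascii"),
             base64.b64encode(digest).decode("ascii")]
    return "$".join(parts)

def check_entry(password: str, entry: str) -> bool:
    """Re-hash with the scheme and salt recorded in the entry and compare."""
    hash_id, salt_b64, digest_b64 = entry.split("$")
    salt = base64.b64decode(salt_b64)
    expected = base64.b64decode(digest_b64)
    actual = HASHERS[hash_id](salt, password.encode("utf-8"))
    return hmac.compare_digest(actual, expected)

entry = make_entry("s3cret")
print(check_entry("s3cret", entry), check_entry("wrong", entry))
```

Because the id travels with each entry, a new scheme can be registered under "2" and old entries upgraded when their owners next log in.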
As others have mentioned you want to be careful with your hashes since it's easy to do something that's not really secure, e.g., H(salt,password) is far weaker than H(password,salt), but at the same time you want to balance the effort put into this with the value of the site content. I'll often use H(H(password,salt),password).
Finally, the cost of using base64-encoded passwords is modest when compared to the benefits of being able to use various tools that expect text data. Yeah, they should be more flexible, but are you ready to tell your boss that he can't use his favorite third party tool because you want to save a few bytes per record? :-)
Edited to add one other comment: if I suggested deliberately using an algorithm that burned even a 1/10th of a second hashing each password I would be lucky to just be laughed out of my boss's office. (Not so lucky? He would jot something down to discuss at my next annual review.) Burning that time isn't a problem when you have dozens, or even hundreds, of users. If you're pushing 100k users you'll usually have multiple people logging in at the same time. You need something fast and strong, not slow and strong. The "but what about the credit card information?" is disingenuous at best since stored credit card information shouldn't be anywhere near your regular database, and would be encrypted by the application anyway, not individual users.
If you are working with ASP.Net you can use the built in membership API.
It supports many types of storage options, including one-way hash, two-way encryption, and md5 + salt. See http://www.asp.net/learn/security for more info.
If you don't need anything too fancy, this is great for websites.
If you are not using ASP.Net here is a good link to a few articles from 4guys and codeproject
https://web.archive.org/web/20210519000117/http://aspnet.4guysfromrolla.com/articles/081705-1.aspx
https://web.archive.org/web/20210510025422/http://aspnet.4guysfromrolla.com/articles/103002-1.aspx
http://www.codeproject.com/KB/security/SimpleEncryption.aspx
Since your question is about storage method & size I will address that.
Storage type can be either binary or text representation (base64 is the most common). Binary is smaller but I find working with text easier. If you are doing per user salting (different salt per password) then it is easier to store salt+hash as a single combined string.
The size is hash algorithm dependent. The output of MD5 is always 16 bytes, SHA1 is always 20 bytes. SHA-256 & SHA-512 are 32 & 64 bytes respectively. If you are using text encoding you will need slightly more storage depending on the encoding method. I tend to use Base64 because storage is relatively cheap. Base64 is going to require roughly 33% larger field.
If you have per-user salting you will need space for the salt also. Putting it all together, a 64-bit salt + SHA1 hash (160 bits), base64 encoded, takes 40 characters, so I store it as char(40).
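The sizes quoted here are easy to check. A small Python sketch for illustration, using only the standard library; the password value is a placeholder.

```python
import base64
import hashlib
import secrets

# Raw digest sizes in bytes for the hash functions mentioned.
sizes = {
    "MD5": hashlib.md5().digest_size,
    "SHA-1": hashlib.sha1().digest_size,
    "SHA-256": hashlib.sha256().digest_size,
    "SHA-512": hashlib.sha512().digest_size,
}
print(sizes)

# 64-bit salt followed by a 160-bit SHA-1 digest, stored as one base64 string.
salt = secrets.token_bytes(8)
digest = hashlib.sha1(salt + b"placeholder-password").digest()
combined = base64.b64encode(salt + digest).decode("ascii")
print(len(salt + digest), len(combined))
```

The 28 raw bytes become 40 base64 characters, which is where the char(40) column width comes from.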
Lastly, if you want to do it right you shouldn't be using a single hash but a key derivation function like PBKDF2. SHA1 and MD5 hashes are insanely fast. Even a single-threaded application can hash about 30K to 50K passwords per second; that's up to 200K passwords per second on a quad-core machine. GPUs can hash 100x to 1000x as many passwords per second. With speeds like that, brute-force attacking becomes a viable intrusion method. PBKDF2 allows you to specify the number of iterations to fine-tune how "slow" your hashing is. The point isn't to bring the system to its knees but to pick a number of iterations so that you cap the upper limit on hash throughput (say 500 hashes per second). A future-proof method would be to include the number of iterations in the password field (iterations + salt + hash). This would allow increasing iterations in the future to keep pace with more powerful processors. To be even more flexible, use varchar to allow potentially larger/alternative hashes in the future.
The .NET implementation is Rfc2898DeriveBytes
http://msdn.microsoft.com/en-us/library/system.security.cryptography.rfc2898derivebytes.aspx
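The iterations + salt + hash field could be sketched as follows, in Python for illustration rather than .NET. `hashlib.pbkdf2_hmac` is the standard library's PBKDF2; the "$" separator, the iteration counts, and the function names are assumptions.

```python
import base64
import hashlib
import hmac
import secrets

def hash_password(password: str, iterations: int = 200_000) -> str:
    """Return an 'iterations$salt$hash' string for storage in a varchar field."""
    salt = secrets.token_bytes(16)
    derived = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return "$".join([str(iterations),
                     base64.b64encode(salt).decode("ascii"),
                     base64.b64encode(derived).decode("ascii")])

def verify_password(password: str, field: str) -> bool:
    """Recompute with the stored iteration count and salt, then compare."""
    iterations, salt_b64, hash_b64 = field.split("$")
    salt = base64.b64decode(salt_b64)
    expected = base64.b64decode(hash_b64)
    derived = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, int(iterations))
    return hmac.compare_digest(derived, expected)

def needs_rehash(field: str, current_iterations: int) -> bool:
    """True if the entry was created with fewer iterations than today's target."""
    return int(field.split("$")[0]) < current_iterations

field = hash_password("s3cret", iterations=50_000)
print(verify_password("s3cret", field), needs_rehash(field, 200_000))
```

When `needs_rehash` is true and the user has just logged in successfully, the password can be re-hashed with the higher count and the field overwritten.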

Resources