I have a table of user agent strings with the following structure:
UserAgentStringID INT
UserAgentStringValue VARBINARY(8000)
The [UserAgentStringValue] field is encrypted with a symmetric key. The previous version of the table structure was:
UserAgentStringID INT
UserAgentStringValue NVARCHAR(4000)
UserAgentStringHASH BINARY(32)
and I had an index on the [UserAgentStringHASH] column in order to optimize searches.
With the new format, such an index is not useful, because the encryption function uses an initialization vector to generate a different value each time it is called with the same input:
Initialization vectors are used to initialize the block algorithm. It
is not intended to be a secret, but must be unique for every call to
the encryption function in order to avoid revealing patterns.
So, I can create index on my encrypted field, but if I try to search by encrypted value, I will not be able to find anything.
I do not want to use a hash because a plain hash is not a secure technique. If someone has my table data and a table with all, or a huge number of, user agents, they will be able to join on the hash and reveal my data.
In SQL Server 2016 SP Standard edition we have Always Encrypted, which allows deterministic encryption of column values. This means equality comparisons work and indexes can be created.
I am looking for a way to optimize the search with another technique, or a way to implement deterministic encryption, using CLR for example.
Knowing there is no workaround is OK for me, too. I guess I will pay for data protection with performance.
I am posting a workaround for this. It's not the ideal solution, but it is a compromise between speed and security.
The details
a column must be encrypted (let's say an email address)
fast search must be implemented (let's say the email is used for login and we need to locate the record as fast as possible)
we are not able to use Always Encrypted deterministic encryption (for various reasons)
we don't want to use a hash function with a salt - if one has the salt for each user, ze might be able to reverse the hashes using a large sample database
The security hierarchy
There are various ways of implementing the security hierarchy. The following schema from the MSDN describes it very well.
In our environment we are using the Database Master Key -> Certificate -> Symmetric Key hierarchy. Only DBAs know the DMK password and have access to the certificate and symmetric keys. Some developers can encrypt/decrypt data (using plain T-SQL) and others cannot.
Note that with Always Encrypted you can have role separation: the people who work with the data do not have access to the keys, and the people who have access to the keys do not have access to the data. In our case, we want to protect our data from outsiders, and we have other techniques for granting and logging data access internally.
Developers with access to encrypted data
The developers who can access the protected data are able to encrypt and decrypt it. They do not have access to the symmetric key values. If one has access to the symmetric key values, ze is able to decrypt the data even without having the certificates used for protecting the symmetric keys. Basically, only sysadmins and db_owners have access to the symmetric key values.
How to hash
We need a hash to get fast searches, but we cannot use a salt which is not encrypted. And a hash without a salt is like plain text from a security perspective. So, we've decided to use the symmetric key value as the salt. It is retrieved like this:
SELECT @SymmetricKeyValue = CONVERT(VARCHAR(128), DECRYPTBYCERT(C.[certificate_id], KE.[crypt_property]), 1)
FROM [sys].[symmetric_keys] SK
INNER JOIN [sys].[key_encryptions] KE
    ON SK.[symmetric_key_id] = KE.[key_id]
INNER JOIN [sys].[certificates] C
    ON KE.[thumbprint] = C.[thumbprint]
WHERE SK.[name] = @SymmetricKeyName;
And the value is concatenated to your email address and then the hash is calculated. It is good for us, because we are binding the hash to the security hierarchy. And it is not a different salt for each record, it is the same - but if one knows the symmetric key value, ze is able to decrypt the data directly.
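As a rough illustration of the idea (outside SQL Server), the search hash can be sketched in Python. Everything here is an assumption made for the example: the `search_hash` and `find_user` names, the hard-coded `key_value`, the lowercase normalization, and the dictionary standing in for the indexed hash column. The concatenation mirrors the description above; Python's `hmac` module offers the standard keyed-hash construction if you prefer that over plain concatenation.

```python
import hashlib


def search_hash(email: str, key_value: bytes) -> bytes:
    """Hash the email concatenated with the secret key value."""
    # Normalizing case and whitespace is an assumption, so lookups are stable.
    normalized = email.strip().lower().encode("utf-8")
    return hashlib.sha256(normalized + key_value).digest()


# Hypothetical stand-in for the symmetric key value read from the key hierarchy.
key_value = bytes.fromhex("00112233445566778899aabbccddeeff" * 2)

# Stand-in for the indexed hash column: hash -> row id.
index = {
    search_hash("alice@example.com", key_value): 1,
    search_hash("bob@example.com", key_value): 2,
}


def find_user(email: str):
    """Locate a row by hash only; the encrypted email is never decrypted."""
    return index.get(search_hash(email, key_value))
```

Without the key value, an attacker cannot precompute hashes for a list of known email addresses and join on them, which is the point of binding the hash to the key.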
Considerations
The routines (stored procedures, triggers) that search by hash value or compute hashes must be created with the EXECUTE AS OWNER clause. Otherwise, developers will not be able to execute them, as only sysadmins and db_owners have access to the symmetric key value.
Related
Which method would you choose to make encrypted data accessible only to the user and to an algorithm that processes and evaluates the data? In this case the user would be one of n service users, who would add sensitive data (mostly answers to questions) about themselves to the database. The company providing the database shouldn't have any access to the sensitive data, only to the results of the data processing. The results would not allow any conclusions about the sensitive data.
What you are looking for is Fully Homomorphic Encryption (FHE). FHE operates on encrypted data. This can be achieved by an encryption scheme that supports two operations (addition and multiplication) on encrypted data. RSA and others supported only one operation until Gentry's work.
With FHE libraries like HElib (there are many now), you can upload your data to the server and give it a function (circuit) to evaluate. FHE schemes, in general, have semantic security (randomized encryption). A semi-honest server sees only encrypted data and returns the encrypted result to you.
Note: They are not practical, yet.
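To make "one operation on encrypted data" concrete, here is a small sketch of the multiplicative property of textbook RSA. The parameters (p = 61, q = 53) are tiny, unpadded, and insecure, chosen only for illustration, and the function names are mine. This is not FHE, only the single-operation case mentioned above.

```python
# Textbook RSA with toy parameters: N = 61 * 53, E * D = 1 (mod 3120).
N, E, D = 3233, 17, 2753


def encrypt(m: int) -> int:
    return pow(m, E, N)


def decrypt(c: int) -> int:
    return pow(c, D, N)


a, b = 7, 3

# Multiply the two ciphertexts without ever decrypting them...
product_ct = (encrypt(a) * encrypt(b)) % N

# ...and the result decrypts to the product of the plaintexts.
assert decrypt(product_ct) == a * b
```

A fully homomorphic scheme additionally supports addition on ciphertexts, which is what allows arbitrary circuits to be evaluated.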
I think the best way to do that is to save only the result. But if you want to save the user's answers, you could use AES with a key derived from the user's password. The user will then have to enter the password every time to decrypt the data.
I have a problem encrypting my plain text in C.
I am able to write and read the file in C.
The file contains:
ID Promo Points Password Name
1 NONE 0 awdawdawd daw
In this case the password is simply written in plain text. Is it possible to encrypt the data, by any method, so that it looks like this:
ID Promo Points Password Name
1 NONE 0 ENCRYPTEDDATA daw
Of course the password won't be "ENCRYPTEDDATA"; I just want to avoid plain text that a user can easily read in the file.
The reason I create the file is that I need to read from it to implement a login function.
The program itself must be able to decrypt the password when asked to check an ID and password.
Any method is fine, as long as only the program can encrypt and decrypt the data.
Any solutions?
If possible, I also need to limit the length of the encrypted text.
You should hash the password with a strong hash function like SHA-2 and store the hash in your file rather than storing the password in plain text.
Hashing might be better than encryption in this case, because with encryption you have to worry about storing the key somewhere securely.
When your login function needs to validate an incoming password, you can just hash the incoming password and match it against the hashed password from your file.
If you want to protect your login then you should use a password hash, also known as a Password Based Key Derivation Function. These functions are often, but not always, based on a secure hash. You should not use a plain cryptographic hash such as SHA-2 for this purpose.
Common password hashes are PBKDF2, bcrypt, scrypt and Argon2. Argon2 is the most advanced one, as the winner of the Password Hashing Competition. A password hash differs from a normal hash in two important aspects:
it uses key strengthening techniques to make it harder for adversaries to use a dictionary or brute force attack (in the form of an iteration count or work factor and possibly additional memory / threading related parameters);
it uses a salt - stored with the password hash - to avoid rainbow table attacks and to avoid duplicate hash values - which would show that an identical password is being used.
So although Pras is right about not using encryption, I would not recommend a plain secure hash unless you are sure that the password is long enough and unique. In general, those restrictions neither can nor should be enforced on password based authentication systems.
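The question is about C, but the two properties listed above (a work factor and a stored salt) are easy to show with Python's standard library, so here is a minimal sketch using `hashlib.scrypt`. The cost parameters and the function names are assumptions for the example, not recommendations, and `hashlib.scrypt` is only available when Python is built against OpenSSL 1.1 or later.

```python
import hashlib
import hmac
import os

# Assumed cost parameters; tune them for your own hardware.
COST_N, BLOCK_R, PARALLEL_P = 2**14, 8, 1


def hash_password(password, salt=None):
    """Return (salt, digest); a fresh random salt is generated if none is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.scrypt(
        password.encode("utf-8"),
        salt=salt, n=COST_N, r=BLOCK_R, p=PARALLEL_P, dklen=32,
    )
    return salt, digest


def verify_password(password, salt, expected):
    """Recompute the hash with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)


# The salt and digest are what would be written to the file, never the password.
salt, digest = hash_password("hunter2")
```

Because the salt is random, hashing the same password twice gives two different digests, so identical passwords do not show up as identical entries in the file.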
I have patient health information stored in a SQL Server 2012 database. Patient names are encrypted, so searching on a patient's name is very slow. How can I add an index on an encrypted column?
I am using Symmetric Key encryption (256-bit AES) on varbinary fields.
There are separate encrypted fields for Patient's first name, last name, address, phone number, DOB, SSN. All of these are searchable (partial also) except SSN.
To build on the answer that @PhillipH provided: if you are performing an exact search on (say) last name you can include a computed column defined as CHECKSUM(encrypt(last_name)) (with encrypt your encryption operation). This is secure in that it does not divulge any information -- a checksum on the encrypted value does not reveal anything about the plaintext.
Create an index on this computed column. To search on the name, instead of just doing WHERE encrypted_last_name = encrypt(last_name), add a search on the hash: WHERE encrypted_last_name = encrypt(last_name) AND CHECKSUM(encrypt(last_name)) = hashed_encrypted_last_name. This is much faster because SQL Server only has to search an index for a small integer value, then verify that the name in fact matches, reducing the amount of data to check considerably. Note that no data is decrypted in this scheme, with or without the CHECKSUM -- we search for the encrypted value only. The speedup does not come from reducing the amount of data that is encrypted/decrypted (only the data you pass in is encrypted) but the amount of data that needs to be indexed and compared for equality.
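The "narrow by checksum, then verify the full value" pattern can be sketched outside SQL Server as follows. This assumes an encrypt routine that returns the same ciphertext for the same input, which the equality lookup depends on. The `encrypt_stub` below is a hypothetical placeholder built on HMAC; it is not real encryption (it cannot be decrypted) and only supplies deterministic opaque bytes for the demonstration. `zlib.crc32` stands in for CHECKSUM.

```python
import hashlib
import hmac
import zlib
from collections import defaultdict

KEY = b"demo-key-not-for-production"  # hypothetical key for the stub


def encrypt_stub(plaintext: str) -> bytes:
    # Placeholder for a deterministic encryption routine. NOT real encryption.
    return hmac.new(KEY, plaintext.encode("utf-8"), hashlib.sha256).digest()


rows = {}                           # row id -> encrypted last name
checksum_index = defaultdict(list)  # small integer checksum -> row ids


def insert(row_id: int, last_name: str) -> None:
    ciphertext = encrypt_stub(last_name)
    rows[row_id] = ciphertext
    checksum_index[zlib.crc32(ciphertext)].append(row_id)


def find(last_name: str) -> list:
    ciphertext = encrypt_stub(last_name)
    # Step 1: cheap lookup on the small integer.
    candidates = checksum_index.get(zlib.crc32(ciphertext), [])
    # Step 2: confirm the full encrypted value, since checksums can collide.
    return [row_id for row_id in candidates if rows[row_id] == ciphertext]


insert(1, "Smith")
insert(2, "Jones")
insert(3, "Smith")
```

Nothing is decrypted during the search; only the search term is encrypted and compared.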
The only drawback is that this does not allow partial searches, or even case variation, and indeed, doing that securely is not trivial. Case is relatively simple (hash encrypted(TOUPPER(name)), making sure you use a different key to avoid correlation), but partial matches require specialized indexes. The simplest approach I can think of is to use a separate service like Lucene to do the indexing, but make it use secure storage for its files (i.e. Encrypting File System (EFS) in Windows). Of course, that does mean a separate system that needs to be certified -- but I can't think of any convenient solution that remains entirely in SQL Server and does not require additional code.
If you can still change the database design/storage, you may wish to consider Transparent Data Encryption (TDE) which has the huge advantage that it's, well, transparent and integrated in SQL Server at the engine level. Not only should partial matching be much faster since individual rows don't need decrypting (just whole pages), if it's not fast enough you can create a full-text index which will also be encrypted. I don't know if TDE works with your security requirements, though.
As a programmatic solution, if you don't need a partial match, you could store a hash in the clear in another field, use the same hashing algorithm on the client/app server, and match on the hash. This would have the possibility of a false positive match but would remove the need to decrypt the data.
If you are using SQL Server's built-in ENCRYPTBYKEY function, there is no benefit to indexing that column, because ENCRYPTBYKEY produces a different output every time for the same input, due to the random IV that SQL Server uses.
Assume the following example:
I have an online service where users can register and enter personal data. Now I want to encrypt this data. I have a private key Pr1 and a public key Pu1.
User logs in with password at my online service
Convert login password to fit a private key format = Pr2
Get public key Pu2 from Pr2
User enters data to store them online in the database
Encrypt user entered data with Pu1 and add --recipient Pu2 like Encryption with multiple different keys?
Now I can copy the encrypted data from the online database to my local machine and decrypt the data with my local Pr1
Users can decrypt their already entered data online using their normal password which is converted to their Pr2 every time when they log in (step 2a) but is valid the entire session
With that approach no data can be decrypted even if an attacker has access to my server with all files and the database, right? Sure, a brute force attack is possible but it should take some time as for every try a private key needs to be computed.
But no private key is stored online or needs to be exchanged. So this should be pretty safe.
Here the question: If this approach is secure and practicable, then there must be already something similar or better out there which has these functionalities and uses some nice security standards. What is it?
A bunch of seemingly random thoughts re: why this doesn't tend to be how people do this...
First, multi-user access. Systems I have seen that let two users access something with their own credentials, while protecting the thing only once, typically create a key to protect the content, then protect that key with each credential and store the protected key once per credential. That is to say, you store the key next to the item itself, but the key is stored N times, once for each accessor. If I grant you access, my credential is used to decrypt the key, then it is stored again protected with your materials.
Along the same multi-user lines, the "grant access" flow is problematic. The scheme you suggest above requires that in order for me to grant you access, the system needs to have my credential (to validate I am who I say I am & have the key in hand) plus your key (to give you access) at the same moment in time. This is pretty problematic in the real world.
This scheme does not afford the user a "forgot my password" experience. Lost password -> lost key.
This scheme assumes users pick good passwords.
This scheme means that two users with the same password have the same key.
You assert that theft of the DB isn't an issue as they would have to compute all of the passwords (which means downstream keys), but in practice this isn't too hard to do, nor too expensive. And I just need to compute Password123 once and can then scan the entire DB for it.
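The "one content key, stored N times" arrangement from the first point can be sketched like this. The names are hypothetical, and the XOR wrap is a toy stand-in for a real authenticated key-wrap algorithm (AES Key Wrap, for example); it is used here only because it needs nothing beyond the standard library. Each wrap gets its own random salt, so two users with the same password still end up with different wrapping keys.

```python
import hashlib
import os


def derive_kek(password: str, salt: bytes) -> bytes:
    """Derive a 32-byte key-encryption key from a credential."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000, dklen=32)


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def wrap_key(content_key: bytes, password: str):
    # Toy wrap: XOR with a freshly derived key. Not authenticated.
    salt = os.urandom(16)
    return salt, xor_bytes(content_key, derive_kek(password, salt))


def unwrap_key(wrapped, password: str) -> bytes:
    salt, blob = wrapped
    return xor_bytes(blob, derive_kek(password, salt))


# The content is protected once, with a single random content key.
content_key = os.urandom(32)

# The content key is stored once per accessor, wrapped by that accessor's credential.
wrapped_keys = {"alice": wrap_key(content_key, "alice-password")}

# Grant access: Alice's credential unwraps the key, then it is wrapped again for Bob.
recovered = unwrap_key(wrapped_keys["alice"], "alice-password")
wrapped_keys["bob"] = wrap_key(recovered, "bob-password")
```

Note that with this toy wrap a wrong password silently yields a wrong key instead of an error, which is one reason a real system would use an authenticated wrap.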
Hope this helps.
I am new to web development and databases and am trying to implement password authentication with reasonable security and speed. I have read about hashing the password and appending a salt unique to each user in order to deter people from generating rainbow tables.
My question deals with the time I have to search in order to verify a user. Since I don't know who is trying to connect at any given time, it seems to me that I would need to retrieve every field from the salt column then hash the submitted password + each unique salt and then finally compare each output to the hashed strings in the table?
So I have to submit a separate query for each combination of hash(password+salt)? That seems like it would be awfully slow. Am I missing a trick that would speed up the process? Or is it simply a matter of sucking it up and sacrificing speed for better security? Or am I mistaken and with the speed of today's computers it isn't an issue at all?
You cannot authenticate the user based only on a password. The password is verification that the user is who they say they are, so you need some sort of user identifier: a name or whatever. The table then looks like users(..., name, password, ...); you do SELECT password FROM users WHERE name = 'foo' and proceed with verification from there. A convenient form is to keep all parameters needed to generate the derived key inside the password field, e.g. like this:
algo$salt$password hash
For hashing itself, you don't want it to be fast — see key derivation functions like PBKDF2, bcrypt or scrypt. In general, tuning the parameters so that it takes about a second to derive one key is a nice way to make brute forcing infeasible.
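Putting the pieces together, a login check needs one lookup by user name and one key derivation, not one per stored salt. The sketch below assumes PBKDF2-SHA256 with a fixed, illustrative iteration count, base64 for the salt and hash fields, and a dictionary standing in for the users table; the function names are mine.

```python
import base64
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed work factor; tune for your hardware


def encode_password(password: str) -> str:
    """Build the stored field in the form algo$salt$hash."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return "$".join([
        "pbkdf2_sha256",
        base64.b64encode(salt).decode("ascii"),
        base64.b64encode(digest).decode("ascii"),
    ])


# Stand-in for the users table: name -> password field.
users = {"foo": encode_password("correct horse")}


def authenticate(name: str, password: str) -> bool:
    stored = users.get(name)  # SELECT password FROM users WHERE name = ?
    if stored is None:
        return False
    algo, salt_b64, hash_b64 = stored.split("$")
    if algo != "pbkdf2_sha256":
        return False
    salt = base64.b64decode(salt_b64)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    # Constant-time comparison against the stored hash.
    return hmac.compare_digest(digest, base64.b64decode(hash_b64))
```

Real-world formats usually store the iteration count in the field as well, so the work factor can be raised later without invalidating existing entries.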