With GDPR (General Data Protection Regulation) going into effect tomorrow, I am wondering whether Google encrypts data in their Datastore indexes on GAE. I know that they encrypt data stored in entities but it isn't clear that they encrypt the data in indexes. I can't imagine that this is even possible given that queries would never be able to run on encrypted data. If the indexed data is not encrypted, would this not be considered to make GAE non-compliant with GDPR?
I'm reasonably certain they don't actually. The HIPAA compliance guide specifically instructs you to encrypt PHI before using it as an index key. Here's the full text:
When creating or configuring indexes in Google Cloud Datastore, encrypt any PHI, security credentials, or other sensitive data, before using it as the entity key, indexed property key, or indexed property value for the index. See the Cloud Datastore documentation for information on creating and/or configuring indexes.
I'm assuming this means you need to do your own encryption here, otherwise I'm not sure why they'd mention it. And no, I don't know how a meaningful database index can be built from encrypted data.
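For what it's worth, one workaround I've seen for the "how do you index encrypted data" problem is to index a keyed hash of the value rather than the value itself. This is my own sketch, not something from Google's guide: `INDEX_KEY` and `blind_index` are hypothetical names, and a keyed hash is not encryption (it cannot be reversed), so you would still store the real value separately in an encrypted, unindexed property. It only supports equality filters, not range or prefix queries.

```python
import hashlib
import hmac

# Hypothetical secret held by the application, never stored in Datastore.
INDEX_KEY = b"replace-with-a-randomly-generated-32-byte-key"


def blind_index(value: str, key: bytes = INDEX_KEY) -> str:
    """Return a deterministic keyed hash that can be stored as an indexed property."""
    # Normalize first so that trivially different spellings map to the same token.
    normalized = value.strip().lower()
    return hmac.new(key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


# The same input always yields the same token, so an equality filter still works.
stored_token = blind_index("alice@example.com")
query_token = blind_index(" Alice@Example.com ")
print(stored_token == query_token)
```

You would store `stored_token` in the indexed property and filter on `query_token` at query time; anyone reading the index sees only the tokens.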
Google encrypts and authenticates all data in transit at one or more network layers when data moves outside physical boundaries not controlled by Google. It also applies encryption at several layers, up to the application layer (layer 7). More information about Encryption in Transit can be found here. There's also the Server-Side Encryption in Datastore.
All data in Cloud Datastore is encrypted at rest as documented at https://cloud.google.com/datastore/docs/concepts/encryption-at-rest .
I am wondering what the functional benefits are of applying Tri-Secret Managed Key security in Snowflake warehouses?
From what I understand:
the Tri-Secret method lets you define your own key
revoking that key will make it impossible for Snowflake to decrypt your data
But is the level of encryption in any way more secure?
The only difference I see is that a Snowflake key and your own key are combined.
Prohibiting access to that key will make it impossible for Snowflake to decrypt, rendering your data useless.
And then what? How do you recover from that?
Also see: https://www.snowflake.com/blog/customer-managed-keys/
This is from Snowflake themselves. What they essentially say, is:
you have more control, but you must supply Snowflake with that key otherwise they can't do their work
you can shut out Snowflake from accessing your data if you mistrust them
if there is a data leak, you can prevent further leakage (but why not shut down all access in that case, except for reverified trusted parties?)
I am stymied by this functionality.
The only response I get from local Snowflake representatives is that the Dutch financial laws/regulations prescribe or at least imply the need for this.
Regards, Richard.
Let me start by making the most important statement here: Don't use customer managed keys, unless you really need to use them.
Think of the worst-case scenario: if you don't have an incredibly resilient platform to securely store your own keys, then you risk losing all your data if you ever lose your keys.
Let me quote now some 3rd parties on when customer managed keys should be used:
Google says:
Customer-managed encryption keys are intended for organizations that have sensitive or regulated data that requires them to manage their own encryption key.
Meaning: Customer managed keys are intended for organizations that are required to use them. Don't follow this path unless you are required to.
Or why would some companies start using their own keys, or stop using them — from CIO.com:
If you’re considering whether bringing your own keys – which also means securing your own keys – is right for your business, the first question to ask is are you ready to become a bank, because you’ll have to run your key infrastructure with the same rigor, down to considering the travel plans of officers of the company. If you have three people authorized to use the smart card that gives access to your key, you don’t ever want to let all three of them on the same plane.
Same article, about Microsoft customers in the automotive industry:
“They all start by saying ‘I want to be in control,’ but as they see the responsibility and they understand to what extreme lengths Microsoft is taking this responsibility, they say ‘why don’t you just do it.’ They don't want to be the weaker link in a chain.”
And NY financial institutions:
Even some New York financial institutions, who initially wanted BYOK that ran against their own on-premises HSMs decided against that when they considered what could go wrong, says Paul Rich from Microsoft’s Office 365 team. “An HSM could have been powered down, taking out a vast swathe of user base. They quickly got the idea that this is potentially a great denial of service attack that malicious insider or attacker performs on the company. These are our most sophisticated customers who are highly sensitive that this is a big responsibility but also a threat of potential destruction, whether that’s accidental or malicious.”
So you should only use customer managed keys if you already understand why they would be beneficial for your use case - and only if you are ready to incur the cost of securely managing them. Otherwise, just let Snowflake handle all these complexities for you.
On the other hand, benefits of using tri-secrets from the Snowflake announcement post:
Customer-managed keys provide an extra level of security for customers with sensitive data. With this feature, the customer manages the encryption key themselves and makes it accessible to Snowflake. If the customer decides to disable access, data can no longer be decrypted. In addition, all running queries are aborted. This has the following benefits for customers: (a) it makes it technically impossible for Snowflake to comply with requests for access to customer data, (b) the customer can actively mitigate data breaches and limit data exfiltration, and (c) it gives the customer full control over data lifecycle.
On the same blog post you'll notice that Snowflake tri-secret customer keys are managed by AWS's KMS. This gives you the above benefits, without having to deploy your own infrastructure to manage your keys. You still need to carefully consider your responsibility for safeguarding your key.
AWS Key Management Service (KMS) makes it easy for you to create and manage cryptographic keys and control their use across a wide range of AWS services and in your applications. AWS KMS is a secure and resilient service that uses hardware security modules that have been validated under FIPS 140-2, or are in the process of being validated, to protect your keys. AWS KMS is integrated with AWS CloudTrail to provide you with logs of all key usage to help meet your regulatory and compliance needs.
The blog post hasn't been updated to reflect that GCP and Azure can also be used to manage the keys, as stated in the docs:
Tri-Secret Secure lets you control access to your data using a master encryption key that you maintain in the key management service for the cloud provider that hosts your Snowflake account: AWS: AWS Key Management Service (KMS); Google Cloud: Cloud Key Management Service (Cloud KMS); Microsoft Azure: Azure Key Vault.
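To make the "two keys are combined" idea a bit more concrete, here is a toy sketch. It is not Snowflake's actual key derivation (I don't know the details of that); `composite_key` is a hypothetical helper, and HMAC-SHA256 is just a convenient way to derive one key that depends on both inputs.

```python
import hashlib
import hmac
import secrets


def composite_key(provider_key: bytes, customer_key: bytes) -> bytes:
    """Derive one 32-byte key that depends on both inputs (illustrative only)."""
    return hmac.new(provider_key, customer_key, hashlib.sha256).digest()


provider_key = secrets.token_bytes(32)  # stands in for the key Snowflake maintains
customer_key = secrets.token_bytes(32)  # stands in for the key held in your KMS

master = composite_key(provider_key, customer_key)

# Without the customer key, the provider's own key is not enough to rebuild
# the same composite key; any substitute yields a different result.
substitute = composite_key(provider_key, secrets.token_bytes(32))
print(master == substitute)
```

The point is only that neither party's key is sufficient on its own: cut off access to the customer key and the composite key can no longer be produced.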
Synonyms for future research you might want to perform: BYOK, CMEK.
I am currently comparing different solutions for an immutable database, such as Blockchain or AWS QLDB.
AWS QLDB looks very interesting for me, but I have a question about how the data is stored at Amazon:
Can Amazon see the data I put in QLDB in plain text (so they could use it for other purposes), or is it encrypted so that only users with a private key can see the content?
The encryption Amazon talks about on their homepage seems to refer to the hashing of the journals to make them immutable, and not to the data itself?
Thank you in advance
With the general availability announcement on 9/10/19, this question is now answered in the FAQ of Amazon QLDB.
Q. How does encryption work in Amazon QLDB?
Yes. By default, all data in transit and at rest is encrypted. Today, Amazon QLDB does not support customer managed CMKs (Customer Master Keys). Amazon QLDB uses AWS-owned keys to encrypt customer data.
So no, Amazon will not yet allow you to use customer managed keys from KMS to encrypt the data in your ledger. I'm sure this is only a matter of time before this feature is available though.
It is worth noting that hashing is not a type of encryption. In QLDB, the hash of each journal entry (which also covers the hash of the previous entry) is stored alongside that entry's data, and that data is available in plaintext. If AWS managed keys are used, then Amazon has access to your data.
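A toy hash chain shows the difference between hashing and encryption. This is only a sketch of the general idea, not QLDB's actual journal format: the entry layout, the use of SHA-256, and the helper names `entry_hash`, `append` and `verify` are my assumptions.

```python
import hashlib
import json


def entry_hash(data: dict, previous_hash: str) -> str:
    """Hash the entry's data together with the previous entry's hash."""
    payload = json.dumps(data, sort_keys=True).encode("utf-8")
    return hashlib.sha256(previous_hash.encode("ascii") + payload).hexdigest()


def append(journal: list, data: dict) -> None:
    """Add an entry whose hash links back to the entry before it."""
    previous_hash = journal[-1]["hash"] if journal else "0" * 64
    journal.append({"data": data, "hash": entry_hash(data, previous_hash)})


def verify(journal: list) -> bool:
    """Recompute every hash in order; any edited entry breaks the chain."""
    previous_hash = "0" * 64
    for entry in journal:
        if entry["hash"] != entry_hash(entry["data"], previous_hash):
            return False
        previous_hash = entry["hash"]
    return True


journal = []
append(journal, {"account": "A", "balance": 100})
append(journal, {"account": "A", "balance": 80})
print(journal[0]["data"])  # the data sits next to its hash, fully readable
print(verify(journal))
journal[0]["data"]["balance"] = 999  # tamper with an old entry
print(verify(journal))
```

The hashes make tampering detectable, but nothing here hides the data from whoever can read the journal.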
I have a question about Cloudant and data encryption. I'm working on a project and I don't know whether I can encrypt the data in Cloudant. I did some research but didn't find anything specific. Does anyone here know something about this topic?
Cloudant's disks are encrypted at rest, and data in flight is encrypted over HTTPS. Data in the database itself during normal operation isn't encrypted, although it's perfectly feasible for you to do so in the client. Note, however, that if you encrypt the data client-side, Cloudant cannot index your data.
In addition to xpqz's answer, this blog post shows how to do client-side transformation of Cloudant data for additional encryption beyond "at-rest" and "in-motion":
https://blog.cloudant.com/2018/10/10/Client-Side-Transformation.html
How are you supposed to store users' passwords in a Cloudant DB? By users, I mean users of an application that uses Cloudant as a backend.
I've searched the docs, but I found nothing on that topic.
There's a _users database, in which you can create users and add a "password" field, but the password is a regular field that the DB admin (and possibly others) can read.
Is there a built-in way to hide it from view or encrypt it ?
EDIT
I've found a piece of the puzzle, the CouchDB security feature that encrypts user's passwords.
Since CouchDB 1.2.0, the password_sha and salt fields are
automatically created when a password field is present in the user
document. When the user document is written, CouchDB checks for the
existence of the password field and if it exists, it will generate a
salt, hash the value of the password field and hash the concatenation
of the password hash and the salt. It then writes the resulting
password into the password_sha field and the salt into the salt field.
The password field is removed.
This has the following implications: Clients no longer have to
calculate the password salt and hash manually. Yay.
Now what's missing is the link between that underlying DB feature and Cloudant (just setting the password field in the user document is not working).
EDIT 2
Found that other question, which is similar to this one. It's a broader problem, but specifically for web apps. There's an accepted answer from @JasonSmith that addresses my question:
Can I use CouchDB security features
Answer's "yes you can"
Cloudant does not yet have the newer CouchDB feature where the server
will automatically hash the password for you
But the CouchDB doc states that this feature was included in version 1.2.0 from 2013! How is that a "newer" feature?
From the doc, I gather that Cloudant uses CouchDB 1.6.1.
To recap:
the feature exists,
it's a CouchDB security feature existing in the CouchDB version that Cloudant uses,
Cloudant can be configured to use CouchDB security features
So... the missing link is really really small...
As you've discovered, Cloudant does not automatically hash passwords server side, as introduced in Couch 1.2. Also, it only supports the simple password scheme: salted SHA1 (which you may find insufficient). That's how passwords are supposed to be saved (not plain text).
It also misses a bunch of other security features, such as special access rules to the _users database (described here).
Hashing passwords "automatically" can be accomplished by an update function (special access rules could be implemented through show/list functions). I have done this myself:
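    function (doc, req) {
        // Parse the JSON body sent to the update function.
        var body = JSON.parse(req.body || '{}') || {};
        // Create a new user document if none exists yet.
        if (doc == null) doc = {
            _id: req.id,
            type: 'user'
        };
        doc.name = body.name;
        doc.roles = body.roles;
        // Use the server-generated UUID as the salt.
        doc.salt = req.uuid;
        doc.password_scheme = 'simple';
        // Salted SHA1, matching the 'simple' password scheme.
        doc.password_sha = hex_sha1(body.password + doc.salt);
        return [doc, { json: doc }];
    }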
Get hex_sha1 from here. Set the above as an update function in a design doc on the _users database. You can also use this as a validation function.
Then instead of PUTing a user into the database, you PUT the same JSON to the update function, and it generates the salted hash before committing it to the database.
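If you end up doing the hashing in an application layer instead, the same salted SHA1 computation is easy to reproduce. Here is a rough Python equivalent of the update function above. The helper names `simple_scheme_user_doc` and `check_password` are mine, the `org.couchdb.user:` id prefix follows the usual CouchDB convention for `_users` documents, and the salt comes from `secrets` rather than `req.uuid`.

```python
import hashlib
import hmac
import secrets


def simple_scheme_user_doc(name: str, password: str, roles=None) -> dict:
    """Build a _users document using the 'simple' scheme: SHA1(password + salt)."""
    salt = secrets.token_hex(16)
    password_sha = hashlib.sha1((password + salt).encode("utf-8")).hexdigest()
    return {
        "_id": "org.couchdb.user:" + name,
        "type": "user",
        "name": name,
        "roles": roles or [],
        "password_scheme": "simple",
        "salt": salt,
        "password_sha": password_sha,
    }


def check_password(doc: dict, candidate: str) -> bool:
    """Recompute the salted hash for a candidate password and compare."""
    digest = hashlib.sha1((candidate + doc["salt"]).encode("utf-8")).hexdigest()
    return hmac.compare_digest(digest, doc["password_sha"])


doc = simple_scheme_user_doc("bob", "correct horse")
print(check_password(doc, "correct horse"), check_password(doc, "wrong"))
```

The resulting dict never contains the plaintext password, so it can be PUT to `_users` directly.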
If salted SHA1 is not enough for your purposes you can't rely on _users on Cloudant, as is.
Not knowing more about your design, I can't really give much advice.
But I should warn you that, thanks to poor _users support, it's e.g. nearly impossible to effectively implement a 2-tier architecture on Cloudant. I'd be glad to be contradicted by someone who knows better, but after banging my head against this for months (and nagging support), this is the conclusion I've come to.
Eventually, you'll need an application layer to do user management, either through _users or API keys. Once you have such a layer, that's where you can hash passwords, and/or skip the _users database and do user management some other way. Every sample posted by Cloudant eventually does this, as soon as things get complicated enough (and none of the samples scale to tens of thousands of users, btw).
Finally, to @Anti-weakpasswords, who says you must go with PBKDF2 and huge iteration counts.
This is sound advice regarding saving passwords in general, but:
this doesn't work with Cloudant, at all;
it doesn't really work very well with CouchDB either.
First, as stated, salted SHA1 is all that Cloudant supports, period.
But even for CouchDB, it's bad advice. With basic HTTP auth, you're sending the password on every single request. Key stretching with huge iteration counts on every single request would put tremendous pressure on the server, so large iteration counts are not recommended (that's why the docs have a 10 in there). If you're going down that road, you need to make sure you always use _session and cookies, and avoid basic auth like the plague.
More likely, if you take security seriously, you need to get a layer between the client and the database that handles user management some other way, and have decoupled database users/roles with strong enough passwords not to need strong hashing at all.
Clusers just came out! It may be useful to you. Clusers is a user account creator meant for Cloudant and CouchDB. It uses the older "password_scheme: simple", which Cloudant and older CouchDB versions support.
First, you really, really do need to read Thomas Pornin's canonical answer to How to Securely Hash Passwords.
Read it right now.
Now read that CouchDB link, and see one of the recommended ways to produce password_sha for 1.3 (and if you're not on at least 1.3, get there).
    {
        "_id": "org.couchdb.user:username",
        "_rev": "1-227bbe6ddc1db6826fb6f8a250ef6264",
        "password_scheme": "pbkdf2",
        "iterations": 10,
        "name": "username",
        "roles": [
        ],
        "type": "user",
        "derived_key": "aa7dc3719f9c48f1ac72754b28b3f2b6974c2062",
        "salt": "77bac623e30d91809eecbc974aecf807"
    }
Make certain that password_scheme is pbkdf2!
See that "iterations": 10 in the sample? You need to bump that up by a huge amount. I'd say try a number in the low hundreds of thousands and see how it runs; SHA-1 is very cheap these days.
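If you want to compute those fields yourself, something along these lines should be close. It is a sketch under assumptions: PBKDF2 with HMAC-SHA1 and a 20-byte output (which matches the 40 hex characters of `derived_key` in the sample), with the salt's hex string used as-is as the salt bytes. Check the result against a document your own CouchDB version generates before relying on it; `pbkdf2_user_fields` and `verify_password` are hypothetical helper names.

```python
import hashlib
import hmac
import secrets


def pbkdf2_user_fields(password: str, iterations: int = 100_000) -> dict:
    """Produce the password-related fields of a pbkdf2-scheme user document."""
    salt = secrets.token_hex(16)
    derived = hashlib.pbkdf2_hmac(
        "sha1",
        password.encode("utf-8"),
        salt.encode("ascii"),  # assumption: the salt string itself is the salt
        iterations,
        dklen=20,
    )
    return {
        "password_scheme": "pbkdf2",
        "iterations": iterations,
        "salt": salt,
        "derived_key": derived.hex(),
    }


def verify_password(fields: dict, candidate: str) -> bool:
    """Re-derive the key with the stored salt and iteration count, then compare."""
    derived = hashlib.pbkdf2_hmac(
        "sha1",
        candidate.encode("utf-8"),
        fields["salt"].encode("ascii"),
        fields["iterations"],
        dklen=20,
    )
    return hmac.compare_digest(derived.hex(), fields["derived_key"])


fields = pbkdf2_user_fields("correct horse")
print(fields["iterations"], len(fields["derived_key"]))
print(verify_password(fields, "correct horse"))
```

Timing `pbkdf2_user_fields` with a few different `iterations` values on your own hardware is a quick way to pick a count you can live with.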
As far as Cloudant goes, here's a Github repository with some code to have Cloudant use the CouchDB _users.
I am creating a mobile app that uses Google App Engine (python) for the backend. Users sign in with Twitter on the app, and the auth token and secret are passed to the backend (over https) so that the server can authenticate with Twitter and also periodically sync friends and followers in a background task. Because they are used by the background thread, I want to store the information in the datastore so they can be retrieved and used later.
Right now, during development and testing, I just put these in the datastore in plain text. But I'd like to add a little more security by storing them encrypted and decrypting them when needed. Thank you for any help!
For general account passwords, I use
security.generate_password_hash(raw_password, length=12)
based on how webapp2_extras stores the passwords. But this approach wouldn't allow me to retrieve the data. Is there anything similar that allows for encryption and decryption?
Normally for password storage you would use a one-way technique (strictly speaking hashing rather than encryption) so that no one can work out what the password is; you then hash the user-supplied value and compare it to the stored value. This way you're never storing the actual password, and it's less likely to be stolen.
What you're looking for is a bidirectional encryption technique, whereby you provide the value and a key to create an encrypted value, and can apply the key to the encrypted value to get the original back.
You haven't stated which language you're using so I cannot provide a good example, however I suggest looking at techniques such as AES. Please keep in mind that if you choose an encryption technique with a short key it will be much easier to brute force. Any encryption that is bidirectional is at risk of easier brute force and once the key has been determined ALL passwords are at risk of being decrypted. Most languages have some form of support for AES and similar encryption techniques.
There are many techniques available, some much newer and more secure so do some research and see what you deem 'secure enough'.
Read more here: http://en.wikipedia.org/wiki/Advanced_Encryption_Standard