I have a web app that stores objects in a database and then sends emails based on changes to those objects. For debugging and tracking, I am thinking of including the Document Id in the email metadata. Is there a security risk here? I could encrypt it (AES-256).
In general, I realize that security through obscurity isn't good practice, but I am wondering if I should still be careful with Document Ids.
For clarity, I am using CouchDB, but I think this can apply to databases in general.
By default, CouchDB uses UUIDs with a UTC time prefix. The worst you can leak there is the time the document was created, and you will be able to correlate about 1k worth of IDs likely having been produced on the same machine.
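To make the "leaks the creation time" point concrete, here is a minimal sketch in Node.js. It assumes the layout CouchDB documents for its utc_random algorithm (14 hex characters of microseconds since the Unix epoch, followed by 18 random hex characters); the function names are made up for illustration and are not part of CouchDB.

```javascript
const crypto = require("crypto");

// Build an ID in the assumed layout: 14 hex chars of time + 18 random hex chars.
function makeUtcRandomId(epochMicros) {
  const timePart = BigInt(epochMicros).toString(16).padStart(14, "0");
  const randomPart = crypto.randomBytes(9).toString("hex"); // 9 bytes = 18 hex chars
  return timePart + randomPart;
}

// Recover the creation time from the ID alone, as any recipient of the ID could.
function creationTimeFromId(id) {
  const micros = BigInt("0x" + id.slice(0, 14));
  return new Date(Number(micros / 1000n)); // microseconds -> milliseconds
}

const id = makeUtcRandomId(1600000000000000n);
console.log(id, creationTimeFromId(id).toISOString());
```

Anyone who sees such an ID in an email header can run the second function; whether that matters depends on whether the creation time is sensitive for you.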
You can change this in the CouchDB configuration to use purely random 128-bit UUIDs by setting algorithm to random in the uuids section. For more information, see the CouchDB docs. Nothing can be gained from IDs generated that way.
Edit: If you choose your own document IDs, of course, you leak whatever you put in there :)
Compare Convenience and Security:
Convenience:
how useful is it for you having the document id in the mail?
can you quickly get useful information, or the document itself, from the ID?
does encrypting/hashing it make it harder to get at the actual database document? (The answer here is yes, unless you have a nice lookup form or similar that takes the hash directly; avoid manual steps.)
Security:
having a document ID what could I possibly do that's bad?
let's say you have a web application to look at documents: if the same ID appears in a URL, it can't be considered 'secret'
if I have the ID, can I access the 'document' or some other information I shouldn't be able to access? Hint: you should always properly check rights; if that's done, then you have no problem.
as long as an ID isn't considered 'secret', meaning there aren't any security checks purely on ID, you should have no problems.
do you care if someone finds out the time a document was created? ( from Jan Lehnardt's answer )
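If you do decide to encrypt the ID before putting it in email metadata, the trade-off above could look roughly like this. It is a sketch under assumptions: AES-256-GCM via Node's crypto module, a key generated on the spot (a real deployment would load a fixed 32-byte key from configuration), and hypothetical function names sealDocId/openDocId. The second function is the "lookup that takes the token directly" that keeps debugging convenient.

```javascript
const crypto = require("crypto");

// Hypothetical key for the sketch; load 32 secret bytes from configuration in practice.
const key = crypto.randomBytes(32);

// Encrypt a document ID into an opaque hex token: iv (12) + auth tag (16) + ciphertext.
function sealDocId(docId) {
  const iv = crypto.randomBytes(12); // fresh nonce for every token
  const cipher = crypto.createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([cipher.update(docId, "utf8"), cipher.final()]);
  const tag = cipher.getAuthTag();
  return Buffer.concat([iv, tag, ciphertext]).toString("hex");
}

// Reverse of sealDocId; throws if the token was modified or made with another key.
function openDocId(token) {
  const raw = Buffer.from(token, "hex");
  const iv = raw.subarray(0, 12);
  const tag = raw.subarray(12, 28);
  const ciphertext = raw.subarray(28);
  const decipher = crypto.createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString("utf8");
}

const token = sealDocId("doc-123");
console.log(token, openDocId(token));
```

Note that this only hides the ID from the email recipient; it does not replace the rights checks mentioned above.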
Related
Maybe this has been asked a lot, but I can't find a comprehensive post about it.
Q: What are the options when you don't want to pass the ids from database to the frontend? You don't want the user to be able to see how many records are in your database.
What I found/heard so far:
Encrypt and decrypt the Id on backend
Use a GUID instead of a numeric auto-incremented Id as PK
Use a GUID together with an auto-incremented Id as PK
Q: Do you know any other or do you have experience with any of these? What are the performance and technical issues? Please provide documentation and blog posts on this topic if you know any.
Two things:
The sheer existence of an id doesn't tell you anything about how many records are in a database. Even if the id is something like 10, that doesn't mean there's only 10 records; it's just likely the tenth that was created.
Exposing ids has nothing to do with security, one way or another. Ids only have a meaning in the context of the database table they reside in. Therefore, in order to discern anything based on an id, the user would have to have access directly to your database. If that's the case, you've got far more issues than whether or not you exposed an id.
If users shouldn't be able to access certain ids, such as perhaps an edit page, where an id is passed as part of the URL, then you control that via row-level access policies, not by obfuscating or attempting to hide the id. Security by obscurity is not security.
That said, if you're just totally against the idea of sequential ids, then use GUIDs. There is no performance impact to using GUIDs. It's still a clustered index, just as any other primary key. They take up more space than something like an int, obviously, but we're talking a difference of 12 bytes per id - hardly anything to worry about with today's storage.
How are you supposed to store users passwords in a Cloudant DB ? By users, I mean users of an application that uses Cloudant as a backend.
I've searched the docs, but I found nothing on that topic.
There's a _users database, in which you can create users and add a "password" field, but the password is a regular field that the DB admin (and possibly others) can read.
Is there a built-in way to hide it from view or encrypt it ?
EDIT
I've found a piece of the puzzle, the CouchDB security feature that encrypts user's passwords.
Since CouchDB 1.2.0, the password_sha and salt fields are
automatically created when a password field is present in the user
document. When the user document is written, CouchDB checks for the
existence of the password field and if it exists, it will generate a
salt, hash the value of the password field and hash the concatenation
of the password hash and the salt. It then writes the resulting
password into the password_sha field and the salt into the salt field.
The password field is removed.
This has the following implications: Clients no longer have to
calculate the password salt and hash manually. Yay.
Now what's missing is the link between that underlying DB feature and Cloudant (just setting the password field in the user document is not working).
EDIT 2
Found that other question, which is similar to this one - it's a broader problem, but specifically for web apps. There's an accepted answer from @JasonSmith that addresses my question:
Can I use CouchDB security features
Answer's "yes you can"
Cloudant does not yet have the newer CouchDB feature where the server
will automatically hash the password for you
But the CouchDB doc states that this feature is included in the 1.2.0 version from 2013! How is that a "newer" feature?
From the doc, I gather that Cloudant uses CouchDB 1.6.1.
To recap:
the feature exists,
it's a CouchDB security feature existing in the CouchDB version that Cloudant uses,
Cloudant can be configured to use CouchDB security features
So... the missing link is really really small...
As you've discovered, Cloudant does not automatically hash passwords server side, as introduced in Couch 1.2. Also, it only supports the simple password scheme: salted SHA1 (which you may find insufficient). That's how passwords are supposed to be saved (not plain text).
It also misses a bunch of other security features, such as special access rules to the _users database (described here).
Hashing passwords "automatically" can be accomplished by an update function (special access rules could be implemented through show/list functions). I have done this myself:
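function (doc, req) {
  // Parse the request body; fall back to an empty object if it is missing.
  var body = JSON.parse(req.body || '{}') || {};
  // No existing document: create a new user document under the requested id.
  if (doc == null) doc = {
    _id: req.id,
    type: 'user'
  };
  doc.name = body.name;
  doc.roles = body.roles;
  // req.uuid is a fresh UUID supplied by the server; use it as the salt.
  doc.salt = req.uuid;
  doc.password_scheme = 'simple';
  // Salted hash for the "simple" scheme: sha1(password + salt).
  doc.password_sha = hex_sha1(body.password + doc.salt);
  // Save the document and echo it back as the JSON response.
  return [doc, { json: doc }];
}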
Get hex_sha1 from here. Set the above as an update function in a design doc on the _users database. You can also use this as a validation function.
Then instead of PUTing a user into the database, you PUT the same JSON to the update function, and it generates the salted hash before committing it to the database.
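For testing outside the database, the same salted hash can be reproduced in plain Node.js. This is only a sketch: hexSha1 stands in for the hex_sha1 helper referenced above, and simpleSchemeFields/verifySimple are hypothetical names, not CouchDB or Cloudant APIs.

```javascript
const crypto = require("crypto");

// Stand-in for the hex_sha1 helper used by the update function.
function hexSha1(text) {
  return crypto.createHash("sha1").update(text, "utf8").digest("hex");
}

// Build the fields the "simple" scheme stores: a random salt and sha1(password + salt).
function simpleSchemeFields(password) {
  const salt = crypto.randomBytes(16).toString("hex");
  return { password_scheme: "simple", salt, password_sha: hexSha1(password + salt) };
}

// Check a login attempt against previously stored fields.
function verifySimple(password, stored) {
  const candidate = Buffer.from(hexSha1(password + stored.salt), "hex");
  const expected = Buffer.from(stored.password_sha, "hex");
  return candidate.length === expected.length && crypto.timingSafeEqual(candidate, expected);
}

const stored = simpleSchemeFields("s3cret");
console.log(stored, verifySimple("s3cret", stored));
```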
If salted SHA1 is not enough for your purposes you can't rely on _users on Cloudant, as is.
Not knowing more about your design, I can't really give much advice.
But I should warn you that, thanks to poor _users support, it's e.g. nearly impossible to effectively implement a 2-tier architecture on Cloudant. I'd be glad to be contradicted by someone who knows better, but after banging my head against this for months (and nagging support), this is the conclusion I've come to.
Eventually, you'll need an application layer to do user management, either through _users or API keys. Once you have such a layer, that's where you can hash passwords, and/or skip the _users database and do user management some other way. Every sample posted by Cloudant eventually does this, as soon as things get complicated enough (and none of the samples scale to tens of thousands of users, btw).
Finally, to #Anti-weakpasswords, who says you must go with PBKDF2 and huge iteration counts.
This is sound advice regarding saving passwords in general, but:
this doesn't work with Cloudant, at all;
it doesn't really work very well with CouchDB either.
First, as stated, salted SHA1 is all that Cloudant supports, period.
But even for CouchDB, it's bad advice. With basic HTTP auth, you're sending the password on every single request. Key stretching with huge iteration counts on every single request would put tremendous pressure on the server, so large iteration counts are not recommended (that's why the docs have a 10 in there). If you're going down that road, you need to make sure you always use _session and cookies, and avoid basic auth like the plague.
More likely, if you take security seriously, you need to get a layer between the client and the database that handles user management some other way, and have decoupled database users/roles with strong enough passwords not to need strong hashing at all.
Clusers just came out! It may be useful to you. Clusers is a user account creator meant for Cloudant and CouchDB. It uses the older "password_scheme: simple", which Cloudant and older CouchDB versions use.
First, you really, really do need to read Thomas Pornin's canonical answer to How to Securely Hash Passwords.
Read it right now.
Now read that CouchDB link, and see one of the recommended ways to produce password_sha for 1.3 (and if you're not on at least 1.3, get there).
{
  "_id": "org.couchdb.user:username",
  "_rev": "1-227bbe6ddc1db6826fb6f8a250ef6264",
  "password_scheme": "pbkdf2",
  "iterations": 10,
  "name": "username",
  "roles": [],
  "type": "user",
  "derived_key": "aa7dc3719f9c48f1ac72754b28b3f2b6974c2062",
  "salt": "77bac623e30d91809eecbc974aecf807"
}
Make certain that password_scheme is pbkdf2!
See that "iterations": 10 in the sample? You need to bump that up by a huge amount - I'd say try a number in the low hundreds of thousands and see how it runs; SHA-1 is very cheap these days.
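To get a feel for how the iteration count affects the cost, the derivation can be sketched in Node.js. The assumptions here are mine, not taken from the CouchDB docs quoted above: PBKDF2 with HMAC-SHA1, a 20-byte key (which matches the 40 hex characters of derived_key in the sample), and the salt passed in as the text of its hex string. deriveKey and buildUserDoc are hypothetical helper names.

```javascript
const crypto = require("crypto");

// PBKDF2-HMAC-SHA1 with a 20-byte output, hex-encoded (assumed parameters, see above).
function deriveKey(password, salt, iterations) {
  return crypto.pbkdf2Sync(password, salt, iterations, 20, "sha1").toString("hex");
}

// Assemble a user document shaped like the sample, with a caller-chosen iteration count.
function buildUserDoc(name, password, iterations) {
  const salt = crypto.randomBytes(16).toString("hex");
  return {
    _id: "org.couchdb.user:" + name,
    type: "user",
    name: name,
    roles: [],
    password_scheme: "pbkdf2",
    iterations: iterations,
    salt: salt,
    derived_key: deriveKey(password, salt, iterations),
  };
}

// Time one derivation; raise the count until the delay is as long as you can tolerate.
const started = Date.now();
const userDoc = buildUserDoc("username", "correct horse", 100000);
console.log(userDoc.derived_key, (Date.now() - started) + " ms");
```

Verify against a real server before relying on documents built this way, since the exact encoding of the salt is an assumption.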
As far as Cloudant goes, here's a Github repository with some code to have Cloudant use the CouchDB _users.
i have a database that already has a users table
COLUMNS:
userID - int
loginName - string
First - string
Last - string
I just installed the asp.net membership tables. Right now all of my tables are joined to my users table, foreign keyed on the "userId" field.
How do I integrate the asp.net_users table into my schema? Here are the ideas I thought of:
Add a membership_id field to my users table and, on new inserts, populate that new field. This seems like the cleanest way, as I don't need to break any existing relationships.
Break all existing relationships and move all of the fields in my users table into the asp.net_users table. This seems like a pain, but ultimately will lead to the most simple, normalized solution.
Any thoughts?
I regularly use all manner of provider stacks with great success.
I am going to proceed with the respectful observation that your experience with the SqlProvider stack is limited and that the path of least resistance seems to you to be to splice into aspnet_db.
The abstract provider stack provides cleanly separated feature sets that complement and interact with each other in an intuitive way... if you take the time to understand how it works.
And by extension, while not perfect, the SqlProviders provide a very robust backing store for the extensive personalization and security facilities that underlie the asp.net runtime.
The more effort you make to understand the workings of these facilities, focusing less on how to modify (read: break) the existing schema and more on how to envision how your existing data could fit into the existing schema the less effort you will ultimately expend in order to ultimately end up with a robust, easily understandable security and personalization system that you did not have to design, write, test and maintain.
Don't get me wrong, I am not saying not to customize the providers. That is the whole point of an abstract factory pattern. But before you take it upon yourself to splice into a database/schema/critical infrastructural system it would behoove you to better understand it.
And once you get to that point, you will start to see how simple life can be. The more you concentrate on learning how to make systems that have thousands of man-hours of dev time and countless users every minute of every day work for you, the more actual work you will get done on the things that really interest you and your stakeholders.
So - let me suggest that you import your users into the aspnet_db/sqlprovider stack and leverage the facilities provided.
The userId in aspnet_db is a guid and should remain that way for very many reasons. If you need to retain the original integral user identifier - stash it in the mobile pin field for reference.
Membership is where you want to place information that is relevant to security and identification. User name, password, etc.
Profiles is where you want to place volatile meta like names and site preferences.
Anyway - what I am trying to say is that you need to have a better understanding of the database and the providers before you hack it. Start off by understanding how to use it as provided and your experience will be more fruitful.
Good luck.
In my experience, the "ASP.NET membership provider" introduces more complexity than it solves. So I'd go for option 2: a custom user table.
P.S. If anyone has been using the "ASP.NET membership provider" with success, please comment!
I've heard that exposing database IDs (in URLs, for example) is a security risk, but I'm having trouble understanding why.
Any opinions or links on why it's a risk, or why it isn't?
EDIT: of course the access is scoped, e.g. if you can't see resource foo?id=123 you'll get an error page. Otherwise the URL itself should be secret.
EDIT: if the URL is secret, it will probably contain a generated token that has a limited lifetime, e.g. valid for 1 hour and can only be used once.
EDIT (months later): my current preferred practice for this is to use UUIDs for IDs and expose them. If I'm using sequential numbers as IDs (usually for performance on some DBs), I like generating a UUID token for each entry as an alternate key, and exposing that.
There are risks associated with exposing database identifiers. On the other hand, it would be extremely burdensome to design a web application without exposing them at all. Thus, it's important to understand the risks and take care to address them.
The first danger is what OWASP called "insecure direct object references." If someone discovers the id of an entity, and your application lacks sufficient authorization controls to prevent it, they can do things that you didn't intend.
Here are some good rules to follow:
Use role-based security to control access to an operation. How this is done depends on the platform and framework you've chosen, but many support a declarative security model that will automatically redirect browsers to an authentication step when an action requires some authority.
Use programmatic security to control access to an object. This is harder to do at a framework level. More often, it is something you have to write into your code and is therefore more error prone. This check goes beyond role-based checking by ensuring not only that the user has authority for the operation, but also has necessary rights on the specific object being modified. In a role-based system, it's easy to check that only managers can give raises, but beyond that, you need to make sure that the employee belongs to the particular manager's department.
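The two rules can be sketched with the raise example. Everything here is hypothetical (the in-memory employees map, the user object shape, the giveRaise function); in a real application the role check would usually come from the framework and the object check from your own code.

```javascript
// Hypothetical in-memory data standing in for the database.
const employees = new Map([
  ["e1", { id: "e1", departmentId: "sales", salary: 50000 }],
  ["e2", { id: "e2", departmentId: "ops", salary: 52000 }],
]);

function giveRaise(user, employeeId, amount) {
  // Role-based check: may this kind of user perform the operation at all?
  if (!user.roles.includes("manager")) {
    throw new Error("forbidden: manager role required");
  }
  const employee = employees.get(employeeId);
  // Object-level check: the employee must belong to this manager's department.
  // Missing and foreign records get the same error, so IDs cannot be probed.
  if (!employee || employee.departmentId !== user.departmentId) {
    throw new Error("not found");
  }
  employee.salary += amount;
  return employee.salary;
}

const salesManager = { roles: ["manager"], departmentId: "sales" };
console.log(giveRaise(salesManager, "e1", 1000));
```

With both checks in place, knowing the identifier "e2" gives the sales manager nothing.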
There are schemes to hide the real identifier from an end user (e.g., map between the real identifier and a temporary, user-specific identifier on the server), but I would argue that this is a form of security by obscurity. I want to focus on keeping real cryptographic secrets, not trying to conceal application data. In a web context, it also runs counter to widely used REST design, where identifiers commonly show up in URLs to address a resource, which is subject to access control.
Another challenge is prediction or discovery of the identifiers. The easiest way for an attacker to discover an unauthorized object is to guess it from a numbering sequence. The following guidelines can help mitigate that:
Expose only unpredictable identifiers. For the sake of performance, you might use sequence numbers in foreign key relationships inside the database, but any entity you want to reference from the web application should also have an unpredictable surrogate identifier. This is the only one that should ever be exposed to the client. Using random UUIDs for these is a practical solution for assigning these surrogate keys, even though they aren't cryptographically secure.
One place where cryptographically unpredictable identifiers are a necessity, however, is in session IDs or other authentication tokens, where the ID itself authenticates a request. These should be generated by a cryptographic RNG.
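A minimal sketch of both ideas in Node.js follows. The in-memory maps stand in for a table with a sequential primary key and a unique surrogate column; insertRow, findByPublicId and newSessionToken are made-up names. crypto.randomUUID requires a reasonably recent Node.js version.

```javascript
const crypto = require("crypto");

const rowsById = new Map();     // internal sequential id -> row (never exposed)
const idByPublicId = new Map(); // public surrogate id -> internal id
let nextId = 1;

// Insert a row and return only its unpredictable surrogate identifier.
function insertRow(data) {
  const id = nextId++;
  const publicId = crypto.randomUUID();
  rowsById.set(id, { id: id, publicId: publicId, ...data });
  idByPublicId.set(publicId, id);
  return publicId;
}

// Resolve a client-supplied identifier; sequential ids are not accepted here.
function findByPublicId(publicId) {
  const id = idByPublicId.get(publicId);
  return id === undefined ? null : rowsById.get(id);
}

// Session tokens authenticate by themselves, so take them from the cryptographic RNG.
function newSessionToken() {
  return crypto.randomBytes(32).toString("hex");
}

const publicId = insertRow({ title: "first entry" });
console.log(publicId, findByPublicId(publicId), newSessionToken());
```

The lookup still has to be followed by the authorization checks described earlier; the surrogate only removes the guessable sequence.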
While not a data security risk this is absolutely a business intelligence security risk as it exposes both data size and velocity. I've seen businesses get harmed by this and have written about this anti-pattern in depth. Unless you're just building an experiment and not a business I'd highly suggest keeping your private ids out of public eye. https://medium.com/lightrail/prevent-business-intelligence-leaks-by-using-uuids-instead-of-database-ids-on-urls-and-in-apis-17f15669fd2e
It depends on what the IDs stand for.
Consider a site that, for competitive reasons, doesn't want to make public how many members it has, but by using sequential IDs reveals it anyway in the URL: http://some.domain.name/user?id=3933
On the other hand, if they used the login name of the user instead, as in http://some.domain.name/user?id=some, they wouldn't have disclosed anything the user didn't already know.
The general thought goes along these lines: "Disclose as little information about the inner workings of your app to anyone."
Exposing the database ID counts as disclosing some information.
The reasons for this are that hackers can use any information about your app's inner workings to attack you, and that a user can change the URL to get at database records he/she isn't supposed to see.
We use GUIDs for database ids. Leaking them is a lot less dangerous.
If you are using integer IDs in your db, you may make it easy for users to see data they shouldn't by changing query string variables.
E.g. a user could easily change the id parameter in this query string and see/modify data they shouldn't: http://someurl?id=1
When you send database id's to your client you are forced to check security in both cases. If you keep the id's in your web session you can choose if you want/need to do it, meaning potentially less processing.
You are constantly trying to delegate things to your access control ;) This may be the case in your application but I have never seen such a consistent back-end system in my entire career. Most of them have security models that were designed for non-web usage and some have had additional roles added posthumously, and some of these have been bolted on outside of the core security model (because the role was added in a different operational context, say before the web).
So we use synthetic session local id's because it hides as much as we can get away with.
There is also the issue of non-integer key fields, which may be the case for enumerated values and similar. You can try to sanitize that data, but chances are you'll end up like little bobby drop tables.
My suggestion is to implement two stages of security.
"Security through obscurity": You can have an integer Id as the primary key and a GUID column, Gid, as a surrogate key in your tables. The integer Id column is used for relations and other back-end and internal database purposes (and even for select-list keys in web apps, to avoid unnecessary mapping between Gid and Id while loading and saving), while Gid is used in REST URLs, i.e. for GET, POST, PUT, DELETE, etc., so that one cannot guess another record's id. This gives a first level of protection against guess-based attacks (i.e. number-series guessing).
Access control on the server side: This is the most important part, and there are various ways to validate the request based on the roles and rights defined in the application. It's up to you to decide.
From the perspective of code design, a database ID should be considered a private implementation detail of the persistence technology to keep track of a row. If possible, you should be designing your application with absolutely no reference to this ID in any way. Instead, you should be thinking about how entities are identified in general. Is a person identified with their social security number? Is a person identified with their email? If so, your account model should only ever have a reference to those attributes. If there is no real way to identify a user with such a field, then you should be generating a UUID before hitting the DB.
Doing so has a lot of advantages, as it allows you to divorce your domain models from persistence technologies. That means you can substitute database technologies without worrying about primary key compatibility. Leaking your primary key into your data model is not necessarily a security issue if you write the appropriate authorization code, but it's indicative of less than optimal code design.
I've thought about this too much now with no obviously correct solution. It might be a real wood-for-the-trees situation, so I need stackoverflow's help.
I'm trying to enforce database filtering on a regional basis. My system has various users and each one is assigned to a regional office. I only want users to be able to see data that is associated with their regional office.
Put simply my application is: Java App -> JPA (hibernate) -> MySQL
The database contains object from all regions, but I only want the users to be able to manipulate objects from their own region. I've thought about the following ways of doing it:
1) modify all database queries so they read something like select * from tablex where region="myregion". This is nasty. It doesn't work too well with JPA, e.g. the entitymanager.find() method only accepts a primary key. Of course I can go native, but I only have to miss one select statement and my security is shot
2) use a mysql proxy to filter results. kind of funky, but then the mysql proxy just sees the raw calls and doesn't really know how it should be filtering them (i.e. which region the user that made this request belongs to). Ok, I could start a proxy for each region, but it starts getting a little messy..
3) use separate schemas for each region. yeah, simple, I'm using spring so I could use the RoutingDataSource to route the requests via the correct datasource (1 datasource per schema). Of course the problem now is that somewhere down the line I'm going to want to filter by region and some other category. oops.
4) ACL - not really sure about this. If I did a select * from tablex; would it quietly filter out objects I don't have access to, or would a load of access exceptions be thrown?
But am I thinking too much about this? This seems like a really common problem. There must be some easy solution I'm just too dumb to see. I'm sure it'll be something close to / or in the database as you want to filter as near to source as possible, but what?
Not looking to be spoonfed - any links, keywords, ideas, commercial/opensource product suggestions would be really appreciated!! thanks.
I've just been implementing something similar (REALbasic talking to MySQL) over the last couple of weeks for a hierarchical multi-company extension to an accounting package.
There's a large body of existing code which composes SQL statements so we had to live with that and just do a lot of auditing to ensure the restrictions were included in each table as appropriate. One gotcha was related lookups where lookup tables were normally only used in combination with a primary table but for some maintenance GUIs would load the lookup table itself, directly.
There's a danger of giving away implied information such as revealing that Acme Pornstars are a client of some division of the company ;-)
The only solution for that part was very careful construction of DB diagrams to show all implied relationships and lots of auditing and grepping source code, with careful commenting to indicate areas which had been OK'd as not needing additional restrictions.
The one pattern I've come up with to make this more generalised in future is, rather than explicit region=currentRegionVar type searches, using an arbitrary entityID which is supplied by a global CurrentEntityForRole("blah") function.
This abstraction allows for sharing of some data as well as implementing pseudo-entities which represent other restriction boundaries.
I don't know enough about Java and Spring to be able to tell but is there a way you could use views to provide a single-key lookup, where the views are restricted by the region filter?
The desire to provide aggregations and possible data sharing was why we didn't go down the separate database route.
Good Question.
Seems like #1 is the best since it's the most flexible.
Region happens to be what you're filtering on today, but it could be region + department + colour of hair tomorrow.
If you start carving up the data too much it seems like you'll be stuck working harder than necessary to glue them all back together for reporting.
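One way to keep option #1 from being "miss one select and security is shot" is to route every read through a single scoped accessor, so that the filter is composed in one place and can grow new criteria later. The sketch below is in JavaScript with an in-memory array purely for illustration (the question itself is about Java/JPA); scopeFor and scopedRepository are hypothetical names.

```javascript
// Hypothetical rows standing in for a table that holds data from all regions.
const rows = [
  { id: 1, region: "north", department: "sales" },
  { id: 2, region: "north", department: "ops" },
  { id: 3, region: "south", department: "sales" },
];

// Build one predicate from the user's attributes; new criteria are added only here.
function scopeFor(user) {
  const required = { region: user.region };
  if (user.department) required.department = user.department;
  return (row) => Object.entries(required).every(([key, value]) => row[key] === value);
}

// Every read goes through the scope, including lookups by primary key.
function scopedRepository(user) {
  const visible = scopeFor(user);
  return {
    findAll: () => rows.filter(visible),
    findById: (id) => rows.find((row) => row.id === id && visible(row)) || null,
  };
}

const repo = scopedRepository({ region: "north" });
console.log(repo.findAll().map((row) => row.id), repo.findById(3));
```

Filtering on region + department tomorrow then means changing scopeFor, not every query.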
I am having the same problem. It is hard to believe that such a common task (filtering a list of model entities based on the user profile) has no 'standard' way, pattern or best practice.
I've found pgacl, a PostgreSQL module. Basically, you do your query like you normally would, and then you tack on an acl_access() predicate to work as a filter.
Maybe there is something similar for MySQL.
I suggest you use ACL. It is more flexible than the other choices. Use Spring Security. You can use it without using Spring Framework.