What is the idiomatic way of handling "ephemeral" state in a database?

I know that "best practices" type of questions are frowned upon in the StackOverflow community, but I am not sure how else to word this.
My "big picture" question is this:
What is a good practice when it comes to handling "session" state in a stateless server (like one that provides a REST api)?
Quick details
Using nodeJS on backend, MongoDB for database.
Example 1: Login state
In version 1 of the admin panel, I had a simple login that asks for an email and password. If the credentials are correct, the user is returned a token, otherwise an error.
In version 2, I added two-factor authentication for users who activate it.
Deciding to keep things simple, I now have two endpoints. The flow is this:
/admin/verifyPassword:
Receive email and password;
if (credentials are correct) {
    if (admin requires 2fa) {
        return {nextStep: "2fa"};
    } else {
        return tokenCode;
    }
} else {
    return error;
}
/admin/verifyTotpToken:
Receive email and TOTP token;
Get admin with corresponding email;
if (admin has verified password) {
    return tokenCode;
} else {
    return error;
}
At the verifyTotpToken step, it needs to know if the admin has already verified password. To do that I decided to attach a 'temporary' field to the Admin document called hasVerifiedPassword which gets set to true in verifyPassword step.
Not only that, but I also set a passwordVerificationExpirationDate temporary field in the verifyPassword endpoint so that they have a short window within which they must complete the whole login process.
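To make the flow above concrete, here is a minimal Python sketch of the two steps with the temporary fields stored on the admin record. It is an illustration under assumptions, not the actual backend: the `Admin` class, the five-minute window, the `totp_is_valid` flag (standing in for a real TOTP check) and the in-memory record (standing in for the MongoDB document) are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

PASSWORD_WINDOW = timedelta(minutes=5)  # assumed length of the login window


@dataclass
class Admin:
    email: str
    password: str  # plain text only to keep the sketch short; store a hash in practice
    requires_2fa: bool
    # Temporary fields, mirroring hasVerifiedPassword and
    # passwordVerificationExpirationDate from the question.
    has_verified_password: bool = False
    password_verification_expires_at: Optional[datetime] = None


def _clear_temporary_state(admin: Admin) -> None:
    admin.has_verified_password = False
    admin.password_verification_expires_at = None


def verify_password(admin: Admin, password: str, now: datetime) -> dict:
    if password != admin.password:
        return {"error": "invalid credentials"}
    if not admin.requires_2fa:
        return {"token": "tokenCode"}
    # Remember that step one succeeded, and for how long that stays valid.
    admin.has_verified_password = True
    admin.password_verification_expires_at = now + PASSWORD_WINDOW
    return {"nextStep": "2fa"}


def verify_totp_token(admin: Admin, totp_is_valid: bool, now: datetime) -> dict:
    expires_at = admin.password_verification_expires_at
    in_window = (
        admin.has_verified_password
        and expires_at is not None
        and now <= expires_at
    )
    if not in_window:
        # Missing or expired password step: drop any stale state.
        _clear_temporary_state(admin)
        return {"error": "password step missing or expired"}
    if not totp_is_valid:
        return {"error": "invalid TOTP token"}
    _clear_temporary_state(admin)
    return {"token": "tokenCode"}
```

Every exit path has to remember to clear the two fields, which is where stale state can creep in.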
The problem with my approach is that:
It bloats the admin document with ephemeral, temporary state that has nothing to do with an admin itself. In my mind, resource and session are two separate things.
It gives way for stale data to stay alive and attached to the admin document, which at best is a slight nuisance when looking through the admin collection in a database explorer, and at worst can lead to hard to detect bugs because the garbage data is not properly cleaned.
Example 2: 2FA activation confirmation by email
When an admin decides to activate 2fa, for security purposes, I first send them an email to confirm that it is truly them (and not someone who hijacked their session) who wanted to activate 2fa. To do that I need to generate a hash of some kind and store it in the database.
My current approach is this:
1) I generate a hash on the server side and store it in their admin document along with an expiration date.
2) I generate a url containing the hash as a query parameter and send it in the email.
3) The admin clicks on the link in the email.
4) The frontend code picks up the hash from the query parameter and asks the server to verify it.
5) The server looks up the admin document and checks for a hash match. If it matches, great: return ok and clean up the data. If not, return an error. If expired, clean up the data.
Here also, I had to use some temporary state (the two fields hash and expirationDate). It is also fragile for the same reasons mentioned above.
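The five steps can be sketched roughly as follows in Python. This is a hedged illustration only: the admin document is modelled as a plain dict, the one-hour validity and the `https://example.com/confirm-2fa` URL are made up, and a random token from `secrets` stands in for whatever hash the real server generates.

```python
import hmac
import secrets
from datetime import datetime, timedelta

CONFIRMATION_TTL = timedelta(hours=1)  # assumed validity period


def start_confirmation(admin: dict, now: datetime) -> str:
    """Steps 1 and 2: store the hash and its expiry, build the emailed URL."""
    token = secrets.token_urlsafe(32)
    admin["hash"] = token
    admin["expirationDate"] = now + CONFIRMATION_TTL
    # Hypothetical URL; the real route is not given in the question.
    return "https://example.com/confirm-2fa?hash=" + token


def _clean_up(admin: dict) -> None:
    admin.pop("hash", None)
    admin.pop("expirationDate", None)


def confirm(admin: dict, submitted: str, now: datetime) -> bool:
    """Step 5: check the submitted hash, cleaning up on success or expiry."""
    stored = admin.get("hash")
    expires_at = admin.get("expirationDate")
    if stored is None or expires_at is None:
        return False
    if now > expires_at:
        _clean_up(admin)
        return False
    # Constant-time comparison, so timing does not leak how much matched.
    if not hmac.compare_digest(stored.encode(), submitted.encode()):
        return False
    _clean_up(admin)
    return True
```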
My main point
Through these two examples I tried to illustrate the problem I am facing. Although these solutions are working "fine", I am curious about what better programmers think of my approaches and if there is a better, more idiomatic way of doing this.
Please keep in mind that the purpose of my question is not to get a specific solution to my specific problem. I am looking for advice on the more general problem of storing session data in a clever, maintainable way that does not mix resource state and ephemeral state.

Related

How should we structure our models with microservices?

For example, if I have a microservice with this API:
service User {
rpc GetUser(GetUserRequest) returns (GetUserResponse) {}
}
message GetUserRequest {
int32 user_id = 1;
}
message GetUserResponse {
int32 user_id = 1;
string first_name = 2;
string last_name = 3;
}
I figured that for other services that require users, I'm going to have to store this user_id in all rows that have data associated with that user ID. For example, if I have a separate Posts service, I would store the user_id information for every post author. And then whenever I want that user information to return data in an endpoint, I would need to make a network call to the User service.
Would I always want to do that? Or are there certain times that I want to just copy over information from the User service into my current service (excluding saving into in-memory databases like Redis)?
Copying the complete data is generally never required. Most of the time, for purposes of scale or to make microservices more independent, people copy some of the information that is more or less static in nature.
For example, in the Post service I might copy basic author information, such as the name, into the post records, because when somebody asks the Post microservice for a list of posts based on some filter, I do not want to fetch the author's name for each post from the User service.
The side effect of copying data is that you have to maintain its consistency, so make sure your business really demands it.
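As a rough illustration of that trade-off, the Python sketch below keeps a copy of the author's name next to each post. The `Post` and `PostStore` names and the `on_user_renamed` handler are hypothetical; how the Post service learns about a rename (an event, a periodic sync, and so on) is left open.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Post:
    post_id: int
    title: str
    author_id: int    # reference to the user owned by the User service
    author_name: str  # local copy of mostly static data; may go stale


class PostStore:
    def __init__(self) -> None:
        self._posts: Dict[int, Post] = {}

    def add(self, post: Post) -> None:
        self._posts[post.post_id] = post

    def list_titles_with_authors(self) -> List[str]:
        # No call to the User service is needed to render the list.
        return [f"{p.title} by {p.author_name}" for p in self._posts.values()]

    def on_user_renamed(self, user_id: int, new_name: str) -> int:
        # The cost of copying: every copy must be updated to stay consistent.
        updated = 0
        for post in self._posts.values():
            if post.author_id == user_id:
                post.author_name = new_name
                updated += 1
        return updated
```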
You'll definitely want to avoid sharing database schema/tables. See this blog for an explanation. Use a purpose built interface for dependency between the services.
Any decision to "copy" data into your other service should be made by the service's team, but they had better have a really good reason for it to make sense. Most designs won't require duplicated data because the service boundary should be domain specific and non-overlapping. In the case of user ids, they can often be treated as contextual references without any attached logic about users.
One pattern observed is: If you have auth protected endpoints, you will need to make a call to your auth service anyway - for security - and that same call should allow you to acquire whatever user id information is necessary.
All the regular best practices for API dependencies apply, e.g. regarding stability, versioning, deprecating etc.

Minimizing database overhead by storing additional information to the users authentication-object (Spring Security)

Long story short: I want to minimize my database look-ups for things like user_id of an already logged in user but I don't know what a good way of doing this would look like.
I am using Spring Security in order to check if a logged in user is authenticated. However, after authentication, as some actual requests come in, I would like to minimize the number of database calls as much as possible.
Hence my question: From the SecurityContextHolder I can get my hands on the User object using getPrincipal() of the Authentication object:
@PreAuthorize("isAuthenticated()")
public List<StoreDTO> getAvailableStores() {
    Authentication auth = SecurityContextHolder.getContext().getAuthentication();
    User user = (User) auth.getPrincipal();
    String username = user.getUsername();
    List<Store> storeList = this.storeAdminRepository.getStores(username);
    return Convert.toStoreDtoList(storeList);
}
Would it be "dirty", instead of setting the simple User object as principal, to use a custom object that e.g. also stores the user id from inside the database?
The way I am doing it now would require me to look up the user id first according to his name and then get the stores where user.id = store.user_id or something like that.
Assuming that there are a lot of requests coming in, is this a way to minimize the overhead? Or, and this could also be true, are my concerns unfounded because the overhead is nowhere near as large as I am assuming?

User information in Nancy

I'm knocking together a demo app based upon Nancy.Demo.Authentication.Forms.
I'm implementing Claims and UserName in my UserIdentity:IUserIdentity class and, as per the demo, I've got a UserModel with UserName.
In the SecureModule class, I can see that the Context.CurrentUser can be used to see who it is that's logged on, but as per the interface, this only supplies the username and the claims. If I then need to get more data (say messages for the logged on user) for a view model, all I can see to use as a filter for a db query is the username, which feels, well, weird. I'd much rather be using the uniqueIdentifier of the user.
I think what I'm trying to get to the bottom of is whether it is better to add the extra fields to my IUserIdentity implementation or to the UserModel. And where should I populate them?
Not sure my question is that clear (It's not clear in my head!), but some general basic architecture advice would go down a treat.
Sorry for the delayed reply.. bit hectic at the moment :)
The IUserIdentity is the minimum interface required to use Nancy's built in authentication helpers, you can implement that and add as much additional information as you like to your class; it's similar to the standard .net IPrincipal. If you do add your own info you'll obviously have to cast to your implementation type to access the additional fields. We could add a CurrentUser method to stop you having to do that, but it seems a little redundant.
You can stop reading here if you like, or you can read on if you're interested in how forms auth works..
FormsAuth uses an implementation of IUsernameMapper (which is probably named wrong now) to convert between the Guid user identifier that's stored in the client cookie and the actual user (the IUserIdentity). It's worth noting that this GUID needs to be mapped to the user/id somewhere, but it's not intended to be your database primary key; it is merely a layer of indirection between your (probably predictable) user id/names and the "token" stored on the client. Although the cookies are encrypted and HMACd (depending on your configuration), if someone does manage to crack open and reconstruct the auth cookie they would have to guess someone else's GUID in order to impersonate them, rather than changing a username (to "admin" or something similar), or an id (to 1 for the first user).
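The indirection idea can be sketched in a few lines of Python. This is not Nancy's API: the `UserMapper` class and its method names are invented for illustration, and a dict stands in for wherever the GUID-to-user mapping is really stored.

```python
import uuid
from typing import Dict, Optional


class UserMapper:
    """Maps hard-to-guess client identifiers to predictable internal user ids."""

    def __init__(self) -> None:
        self._user_id_by_guid: Dict[uuid.UUID, int] = {}

    def issue_identifier(self, user_id: int) -> uuid.UUID:
        # uuid4 is random, so knowing one identifier does not reveal another.
        guid = uuid.uuid4()
        self._user_id_by_guid[guid] = user_id
        return guid

    def user_id_from_identifier(self, guid: uuid.UUID) -> Optional[int]:
        # Unknown identifiers map to no user at all.
        return self._user_id_by_guid.get(guid)
```

The cookie would carry only the value returned by `issue_identifier`, never the database id or the username.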
Hope that makes sense :)

Ways to avoid CakePHP $this->data security hole?

I just realized when doing basic CakePHP stuff that there is quite a bad security issue, which many don't necessarily notice. I'll just take this basic function that I think many users use while building CakePHP driven apps.
function edit() {
    if (!empty($this->data)) {
        if ($this->User->save($this->data)) {
        }
    }
}
Let's assume the user has privileges to use this action. This action could be editing user information, which may include things like city and number and, of course, username. Let's assume that we want a form that allows us to edit just the city and number but not the username. Well, what if someone just inserts that username field into the form, with Firebug for example, and then submits the form? The edit would just grab all the POST information, including the username field and its value, and save it straight away. So you can change your username in this case even though there wasn't a field for it.
This can go even further if someone uses saveAll(), which allows you to validate and save multiple models in one shot. If you could guess from the form fields which models are used, you could easily reach other models and tables as well and alter their information.
Now that you understand my concerns, my question is: what would be the best, or at least near the best, method to avoid this?
I know I could just copy the data I want from $this->data into another variable and then pass that to save or saveAll, but because there are many forms and ajax requests, that would be quite a lot of work. Is it the only way to go, or are there better ways?
Should I make, or is there already, a behavior which could stop this? Like checking which variables an action in a controller can receive from POST?
After a couple of days of research I found that this is not really a "security hole", but rather a beginner's mistake.
There are two ways of avoiding this type of form tampering. One is the Security component ( http://book.cakephp.org/view/1296/Security-Component ), which automatically adds CSRF and form tampering protection by creating one-time hashes for form fields.
The other way is to pass the third parameter to the save() function. save() actually takes three parameters: data, validate, and fieldlist. The fieldlist parameter acts as a whitelist of fields that are allowed to be saved.
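The whitelist idea is independent of CakePHP. As a language-neutral illustration, here is a small Python sketch; `pick_allowed` is a hypothetical helper, not part of any framework, and it only mimics the effect of a field list.

```python
from typing import Dict, Iterable


def pick_allowed(data: Dict[str, str], fieldlist: Iterable[str]) -> Dict[str, str]:
    """Keep only the submitted fields that the action explicitly allows."""
    allowed = set(fieldlist)
    return {name: value for name, value in data.items() if name in allowed}


# A tampered submission: the form only offered city and number.
posted = {"city": "Oulu", "number": "555-0100", "username": "admin"}
safe = pick_allowed(posted, ["city", "number"])
```

Here `safe` contains only `city` and `number`; the injected `username` never reaches the save.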
I first reported this problem as a bug to CakePHP, which it turned out not to be, but euromark replied that he had documented the actual problem and how to do secure saves, and I think it is well worth reading. So if you have the same problem, please see this page: http://www.dereuromark.de/2010/09/21/saving-model-data-and-security/

How can I prevent database being written to again when the browser does a reload/back?

I'm putting together a small web app that writes to a database (Perl CGI & MySQL). The CGI script takes some info from a form and writes it to a database. I notice, however, that if I hit 'Reload' or 'Back' on the web browser, it'll write the data to the database again. I don't want this.
What is the best way to protect against the data being re-written in this case?
Do not use GET requests to make modifications! Be RESTful; use POST (or PUT) instead, and the browser should warn the user not to reload the request. Redirecting (using HTTP redirection) to a receipt page using a normal GET request after a POST/PUT request will make it possible to refresh the page without getting warned about resubmitting.
EDIT:
I assume the user is logged in somehow, and therefore you already have some way of tracking the user, e.g. a session or similar.
You could generate a timestamp (or a random hash, etc.) when displaying the form and store it both in a hidden field (right beside the anti-cross-site-request token I'm sure you already have there) and in a session variable (which is stored safely on your server). When you receive the POST/PUT request for this form, you check that the timestamp is the same as the one in the session. If it is, you set the timestamp in the session to something variable and hard to guess (the timestamp concatenated with some secret string, for instance), and then you can save the form data. If someone repeats the request now, you won't find the same value in the session variable and you deny the request.
The problem with doing this is that the form becomes invalid if the user clicks back to change something, which might be a bit too harsh unless it's money you're updating. So if your problem is "stupid" users who refresh or click the back button and thus accidentally repost something, just using POST would remind them not to do that, and redirecting will make it less likely. If you have a problem with malicious users, you should use a timestamp too, although it will sometimes confuse users; and if users are deliberately posting the same message over and over, you probably need to find a way to ban them. Using POST, having a timestamp, and even doing a full comparison against the whole database to check for duplicate posts won't help at all if malicious users just write a script to load the form and submit random garbage automatically. (But cross-site-request protection makes that a lot harder.)
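A minimal Python sketch of the hidden-field-plus-session check described above, under a few assumptions: the session is modelled as a plain dict, a random value from `secrets` is used instead of a timestamp, and the session value is simply deleted after use rather than replaced with something hard to guess. The function names are hypothetical.

```python
import hmac
import secrets


def render_form_token(session: dict) -> str:
    """Create the value for the hidden field and remember it in the session."""
    token = secrets.token_hex(16)
    session["form_token"] = token
    return token


def accept_submission(session: dict, submitted_token: str) -> bool:
    """Accept a submission once; a repeat no longer matches the session."""
    expected = session.get("form_token")
    if expected is None:
        return False
    if not hmac.compare_digest(expected.encode(), submitted_token.encode()):
        return False
    # Invalidate the token so that reloading or going back cannot reuse it.
    del session["form_token"]
    return True
```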
Using a POST request will cause the browser to try to prevent the user from submitting the same request again, but I'd recommend using session-based transaction tracking of some kind so that if the user ignores the warnings from the browser and resubmits his query your application will prevent duplication of changes to the database. You could include a hidden input in the submission form with value set to a crypto hash and record that hash if the request is submitted and processed without error.
I find it handy to track the number of form submissions the user has performed in their session. Then when rendering the form I create a hidden field that contains that number. If the user then resubmits the form by pressing the back button, it'll submit the old number, and the server can tell that the user has already submitted the form by comparing what's in the session with what the form says.
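The counter variant might look like this in Python. Again only a sketch: the session is a plain dict, and `current_counter` and `submit_with_counter` are made-up names.

```python
def current_counter(session: dict) -> int:
    """Value to place in the hidden field when rendering the form."""
    return session.get("submissions", 0)


def submit_with_counter(session: dict, form_counter: int) -> bool:
    """Accept the form only if its counter matches the session's counter."""
    current = session.get("submissions", 0)
    if form_counter != current:
        # An old page (back button or reload) carries a stale number.
        return False
    session["submissions"] = current + 1
    return True
```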
Just my 2 cents.
If you aren't already using some sort of session management (which would let you note and track form submissions), a simple solution would be to include some sort of unique identifier in the form (as a hidden element) that is either part of the main DB transaction itself, or tracked in a separate DB table. Then, when a form is submitted to you, you check the unique ID to see if it has already been processed. And each time the form itself is rendered, you just have to make sure it gets a unique ID.
First of all, you can't trust the browser, so any talk about using POST rather than GET is mostly nerd flim-flam. Yes, the client might get a warning along the lines of "Did you mean to resubmit this data again?", but they're quite possibly going to say "Yes, now leave me alone, stupid computer".
And rightly so: if you don't want duplicate submissions, then it's your problem to solve, not the user's.
You presumably have some idea what it means to be a duplicate submission. Maybe it's the same IP within a few seconds, maybe it's the same title of a blog post or a URL that has been submitted recently. Maybe it's a combination of values - e.g. IP address, email address and subject heading of a contact form submission. Either way, if you've manually spotted some duplicates in your data, you should be able to find a way of programmatically identifying a duplicate at the time of submission, and either flagging it for manual approval (if you're not certain), or just telling the submitter "Have you double-clicked?" (If the information isn't amazingly confidential, you could present the existing record you have for them and say "Is this what you meant to send us? If so, you've already done it - hooray")
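As one hedged example of identifying a duplicate programmatically, the Python sketch below treats a repeat of the same IP address, email address and subject within a short window as a duplicate. The `DuplicateDetector` class, the choice of fields and the window length are all assumptions; a real application would pick whatever defines a duplicate for its own data.

```python
from datetime import datetime, timedelta
from typing import Dict, Tuple


class DuplicateDetector:
    def __init__(self, window: timedelta) -> None:
        self._window = window
        self._last_seen: Dict[Tuple[str, str, str], datetime] = {}

    def is_duplicate(self, ip: str, email: str, subject: str, now: datetime) -> bool:
        # Normalise so that trivial differences do not hide a repeat.
        key = (ip, email.strip().lower(), subject.strip().lower())
        previous = self._last_seen.get(key)
        self._last_seen[key] = now
        return previous is not None and now - previous <= self._window
```

A `True` result could be used either to flag the submission for manual approval or to ask the submitter whether they double-clicked.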
I'd not rely on POST warnings from the browser. Users just click OK to make messages go away.
Any time you have a request that must happen only once, e.g. 'make a payment', send a unique token down that gets submitted back with the request. Throw the token out after it comes back, so you can now tell when something is an invalid submission (anything with a token that isn't 'active'). Expire active tokens after X amount of time, e.g. when a user session ends.
(Alternatively, track the tokens that have come back; if you have received one before, it is invalid.)
Do a POST every time you alter data, but never return an HTML response from a post... instead return a redirect to a GET that retrieves the updated data as a confirmation page. That way, there is no worry about them refreshing the page. If they refresh, all that will happen is another retrieve, never a data-altering action.
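The POST-then-redirect pattern can be sketched as a tiny WSGI application in Python. The `/submit` and `/receipt` paths and the in-memory `SAVED` list (standing in for the database) are hypothetical; a 303 See Other response tells the browser to fetch the confirmation page with GET.

```python
from typing import Callable, Iterable, List

SAVED: List[bytes] = []  # stands in for the database


def app(environ: dict, start_response: Callable) -> Iterable[bytes]:
    method = environ["REQUEST_METHOD"]
    path = environ.get("PATH_INFO", "/")

    if method == "POST" and path == "/submit":
        length = int(environ.get("CONTENT_LENGTH") or 0)
        SAVED.append(environ["wsgi.input"].read(length))
        # Never answer the POST with a page; redirect to a GET instead.
        start_response("303 See Other", [("Location", "/receipt")])
        return [b""]

    if method == "GET" and path == "/receipt":
        # Refreshing this page only reads data; it never writes again.
        body = b"saved %d item(s)" % len(SAVED)
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [body]

    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"not found"]
```

It can be served locally with `wsgiref.simple_server.make_server("", 8000, app).serve_forever()`.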
