Should I avoid modifying user inputs before sending to backend if possible? - request

I'm validating and sanitizing user inputs from the server-side. I'm also validating it from the front-end. But I'm wondering if I should also modify the input values to match the server's requirement before sending a request.
For example, I have a form with a birthday text input in MM-dd format. But the server requires a month(MM) and a day of the month(dd) values separately. I can format the input to match the server's requirement(MM and dd), or I can just pass the value without modification and the server will do the rest. Which method is recommended?

This question is more related to UX practices then frontend itself. I believe that before server validation, frontend checks should be performed.
You shouldn't validate and you definitely shouldn't change any values during user completing the form. However the common practice is to validate fields on blur. This is when you can change fields values.
However I would be very careful with this, to avoid confusing the user. So stripping whitespaces etc. should not be a problem, but aggressive input changes should be avoided.
Also try input masking for operations like date formats.
Check for example this library
https://nosir.github.io/cleave.js/
EDIT:
I case of changing values before sending them to backend, it's perfectly fine. It's good practice to have some mapping layer, which will map between UI forms and DTOs required by backend. UI should be focused on user experience, so some extra work will be required almost every time in more complex scenarios

Related

How can I identify an untranslatable value exists in FormRecognizer analysis

I posted the following Feature Request to the azure-sdk, but not sure if that was the correct place for getting a response, so reposting here.
https://github.com/Azure/azure-sdk-for-net/issues/20764
When processing a document against a custom trained model, when a value is present but not able to be translated (such as a signature), would it be possible to include something in the response to identify it as having a value though it wasn't able to be processed?
The specific use case is that our client needs to know that a document was signed by the parties involved. Without this feature, someone will be required to manually review thousands of document images per week to verify that they have been signed. In testing we have found that very few signatures are being translated any way, so the string response is coming back as null.
Thank you,
Rich
For Form Recognizer when a value is not detected although it is present it will be extracted as Null as Form Recognizer is not aware that a value exists it did not detect it. In case of signature this is usually due to the signature being unreadable and just a scribble.

Validation stages

I am validating with Cake 3 but can't get it working probably.
As the docs says there are two stages of validating:
Before you save your data you will probably want to ensure the data is correct and consistent. In CakePHP we have two stages of validation:
Before request data is converted into entities, validation rules around data types and formatting can be applied.
Before data is saved, domain or application rules can be applied. These rules help ensure that your application’s data remains consistent.
So, if I understand this right, at first validation rules are used when I pass data via newEntity and patchEntity.
After that, the application rules are used when using save or delete.
However, when I am passing data (an array) via newEntity the application rules are never used (buildRules is never called). When using newEntity without passing data, application rules are used!
So, my first question, is it right that not both rules are runned, only one (OR validation rules, OR application rules?). I would expect that first validation rules would be called to check the input, and before saving, ALSO the application rules would be called to check if the entity is valid to the applicaton.
Second question, how should I validate with my API? The actions pass their data via the newEntity method, but I want to check if (for example) the category_id belongs to the same user. Thats typical an application rule I guess?
Thank you very much ;)
Quoting CakePHP documentation:
Validation objects are intended primarily for validating user input, i.e. forms and any other posted request data.
Basically, validation is done when you use newEntity or patchEntity to check that the incoming data is consistent:
You don't have a random string where you should have a number
The user email is of correct format
Standard and confirmation passwords are equals
etc.
Validation is not done when you set field manually:
$user->email = 'not a valid email' ; // no validation check
Basically, validation rules are meant to tell the user « Hey, you did something wrong! ».
Application rules on the other end are always checked when you call save or delete, these may be used for:
Checking uniqueness of a field
Checking that a foreign key exist - There is an Group that correspond to your group_id
etc.
Your first assumption is somehow false because in the following scenario, both validation and application rules are checked:
$article = $this->Articles->newEntity($this->request->data);
$this->Articles->save($article) ;
This part of the documentation explain the difference between the two layers of validation.
Concerning your second question, you should not check that a user has the right to do something in your model, this should be done by your controller, see CakePHP book for more details.

Should I only add validation criteria to user input in CakePHP?

When baking models in CakePHP, should I add validation criteria only to the data which are user input? Or to everything? Or to some specific data? The database consists mostly of things which will be added by admins. There's only 1 user-related table. I'm not sure about this. Thanks.
Validate everything!:
Validation to everything. There's no reason not to add validation to everything. If an admin knows what they're doing, and is inserting data per the requirements, they won't see any of the validation errors anyway. But - if they have a moment of insanity, or just don't know what's allowed/not, then having the validation is a great fallback.
We've all done it (or... NOT done it):
It's understandable for a small/simple project to not want to spend the time adding validation - we've probably all done it... but when asked "SHOULD you add validation to everything?", I think the answer has to be "yes!".
Validation - not just for user-generated content:
Validation is overall great - not just for user-entered data, but also for scraped, code-generated data, admin-entered data, and everything in between.
Can be slightly lax... if you must
If most of your data isn't user-generated, you could always think about making the validation slightly more lax than it would be otherwise, but - having it is still better than not.

Should "http://" be stored with a database record of a URL?

There are a number of fields a user can fill in where they'd enter a URL (their personal website, business site, favorite sites, etc etc).
It's the only thing they'd be entering in that particular field.
So should I always strip out "http://" to keep it consistent and to also reduce the possibility of broken links (ie. "http//")?
Just not sure what the best way to store URLs is.
If there's a reason to sanitize your users' input (security, size, speed, accuracy...) then do it.
But otherwise, don't.
There's actually a benefit a lot of times in taking your customer-input data as-is. They own their own typos or misspellings, broken links, etc. that way. As long as it doesn't cause a problem for you (i.e. you don't have a reason to sanitize it).
BTW -- consistency is a moot point, as it won't change the data type, and you can easily check for the "http://" and add or remove it as necessary in your presentation layers with a re-usable function.
As far as I know you actually can not call it an "URL", without having the protocol part:
http://www.w3.org/Addressing/URL/url-spec.txt
I wouldn't remove it.
However if you really need to keep the data consistent, it really depends how the URL is actually typed in your application. If it's a browser-like application, I'd bet it can be assumed to be http:// in front if there is none, for valid links.

How can I prevent database being written to again when the browser does a reload/back?

I'm putting together a small web app that writes to a database (Perl CGI & MySQL). The CGI script takes some info from a form and writes it to a database. I notice, however, that if I hit 'Reload' or 'Back' on the web browser, it'll write the data to the database again. I don't want this.
What is the best way to protect against the data being re-written in this case?
Do not use GET requests to make modifications! Be RESTful; use POST (or PUT) instead the browser should warn the user not to reload the request. Redirecting (using HTTP redirection) to a receipt page using a normal GET request after a POST/PUT request will make it possible to refresh the page without getting warned about resubmitting.
EDIT:
I assume the user is logged in somehow, and therefore you allready have some way of tracking the user, e.g. session or similar.
You could make a timestamp (or a random hash etc..) when displaying the form storing it both as a hidden field (just besides the anti Cross-Site Request token I'm sure you allready have there), and in a session variable (wich is stored safely on your server), when you recieve a the POST/PUT request for this form, you check that the timestamp is the same as the one in session. If it is, you set the timestamp in the session to something variable and hard to guess (timestamp concatenated with some secret string for instance) then you can save the form data. If someone repeats the request now you won't find the same value in the session variable and deny the request.
The problem with doing this is that the form is invalid if the user clicks back to change something, and it might be a bit to harsh, unless it's money you're updating. So if you have problems with "stupid" users who refresh and click the back-button thus accidentally reposting something, just using POST would remind them not to do that, and redirecting will make it less likely. If you have a problem with malicious users, you should use a timestampt too allthough it will confuse users sometimes, allthough if users is deliberately posting the same message over and over you probably need to find a way to ban them. Using POST, having a timestam, and even doing a full comparison of the whole database to check for duplicate posts, won't help at all if the malicious users just write a script to load the form and submit random garbage, automatically. (But cross-site-request protection makes that a lot harder)
Using a POST request will cause the browser to try to prevent the user from submitting the same request again, but I'd recommend using session-based transaction tracking of some kind so that if the user ignores the warnings from the browser and resubmits his query your application will prevent duplication of changes to the database. You could include a hidden input in the submission form with value set to a crypto hash and record that hash if the request is submitted and processed without error.
I find it handy to track the number of form submissions the user has performed in their session. Then when rendering the form I create a hidden field that contains that number. If the user then resubmits the form by pressing the back button it'll submit the old # and the server can tell that the user has already submitted the form by examining what's in the session to what the form is saying.
Just my 2 cents.
If you aren't already using some sort of session-management (which would let you note and track form submissions), a simple solution would be to include some sort of unique identifier in the form (as a hidden element) that is either part of the main DB transaction itself, or tracked in a separate DB table. Then, when you are submitted a form you check the unique ID to see if it has already been processed. And each time the form itself is rendered, you just have to make sure you have a unique ID.
First of all, you can't trust the browser, so any talk about using POST rather than GET is mostly nerd flim-flam. Yes, the client might get a warning along the lines of "Did you mean to resubmit this data again?", but they're quite possibly going to say "Yes, now leave me alone, stupid computer".
And rightly so: if you don't want duplicate submissions, then it's your problem to solve, not the user's.
You presumably have some idea what it means to be a duplicate submission. Maybe it's the same IP within a few seconds, maybe it's the same title of a blog post or a URL that has been submitted recently. Maybe it's a combination of values - e.g. IP address, email address and subject heading of a contact form submission. Either way, if you've manually spotted some duplicates in your data, you should be able to find a way of programmatically identifying a duplicate at the time of submission, and either flagging it for manual approval (if you're not certain), or just telling the submitter "Have you double-clicked?" (If the information isn't amazingly confidential, you could present the existing record you have for them and say "Is this what you meant to send us? If so, you've already done it - hooray")
I'd not rely on POST warnings from the browser. Users just click OK to make messages go away.
Anytime you'll have a request that needs to be one time only e.g 'make a payment', send a unique token down, that gets submitted back with the request. Throw the token out after it comes back, and so you can now tell when something is a valid submission (anything with a token that isn't 'active'). Expire active tokens after X amount of time, e.g. when a user session ends.
(alternately track the tokens that have come back, and if you have received it before then it is invalid.)
Do a POST every time you alter data, but never return an HTML response from a post... instead return a redirect to a GET that retrieves the updated data as a confirmation page. That way, there is no worry about them refreshing the page. If they refresh, all that will happen is another retrieve, never a data-altering action.

Resources