A good place validation for business logic - database

OK, so let start from the beginning. For instance, I'm developing some big project. It isn't a secret for all of us that such projects contains a lot of common logic. Say, for example, that some online shop order must have some items and sum of prices of all items must be exactly 1000. So, I implement this logic in MVC3 web application and do validation on order. If there are issues it will tell user about issues and ask him to repost the form.
But project has another important part. It's DB. Here I can also wrap this validation logic in some stored procedure and add orders only through this proc to be sure that there is no inconsistencies. (Or should I use checks on table to be sure?)
Here's the stuck I run. It is necessary to dublicate logic in two places to get data integrity. Of course, I may store validation logic only on DB side. But I doubt it will decrease application performance.
I'm sure more experienced guys has a complete and elegant solution for this question.
So, how can I implement logic only in one place?

Simple answer, not to everyone's liking
Your database will outlast your client code
You will have multiple client code bases for your database
It's usual to have validation in several places: would you trust user input from a browser and skip server side validation?
Define "validation": complex business rules or data integrity? If the latter, then it should be the RDBMS' responsbility
Edit, about the latter part
Not all logic needs to be implemented in the database: only logic that will be common to several client code bases. Or you can funnel these common requests through a separate tier/service that deals with it.
Note, some business logic is based on aggregations or cross-table or whole-table checks that simply are best done in the database

Related

Microservices unsuitable for business domain?

The business domain has five high-level bounded contexts
Customers
Applications
Documents
Decisions
Preforms
Further, these bounded contexts has sub-contexts like ordering and delivery of the documents. Despite the project of consisting ten of thousands of classes and dozens of EJB's, most of the business logic resides in relational database views and triggers for a reason: A lot of joins, unions and constraints involved in all business transactions. In other words, there is complex web of dependencies and constraints between the bounded contexts, which restricts the state transfers. In layman terms: the business rules are very complicated.
Now, if I were to split this monolith to database per service microservices architecture, bounded contexts being the suggested service boundaries, I will have to implement all the business logic with explicit API calls. I would end up with hundreds of API's implementing all these stupid little business rules. As the performance is main factor (we use a lot of effort to optimize the SQL as it is now), this is out of the question. Secondly, segregated API's would probably be nightmare to maintain in this web of ever evolving business rules, where as database triggers actually support the high cohesion and DRY mentality, enforcing the business rules transparently.
I came up with a conclusion microservice architecture being unsuitable for this type of document management system. Am I correct, or approaching the idea from wrong angle?
First of all, you don't have to have a Microservices architecture. I really mean it! If you were ordered by management/architect to do it, and it doesn't solve any real problems you are having, you are probably right for pushing back.
That being said, and with the disclaimer that I don't know the exact requirements of your application, having "things" as bounded context is a smell. So having "Customers", "Applications", "Documents", etc. as services is very likely the wrong approach.
Bounded contexts should not be CRUD operations on a specific entity. They should be completely independent (or as independent as possible) "vertical" parts of the whole application. Preferably with their own Database and GUI. They should also operate independently of each other, not requiring input from other services for own decisions.
It is the complete opposite of data-centric design, where tables/fields and relations are the core concepts. Here, functionality is the core concept. You would have to split your application along functionality to arrive at a good separation.
I could imagine a document management system having these idependent bounded contexts / services: Search, Workflow, Editing, etc.
Here is how you would think about it: Search does not require any (synchronous) input from any other service. It may receive regular, even near-time updates with new documents, but that does not impact it's main feature: searching already indexed documents. The GUI is also independent, something like one google-like page with a search box maybe. It can deliver results independently, and would link back to the Workflow or Editing apps when you click on a result.
The others would be similarly independent. Again, the point is to split the services in a way that makes them work independently. If you don't have that, you will only make things worse with Microservices.
First of all the above answer is correct in suggesting that you need to breaup your microservice in a better way.
Now If scalability is your concern(lots of api calls between microservice).
I strongly suggest you to validate that how many of the constraints are really required at the first level, and how many of them you could do in async way. With that what i mean is in distributed enviornment we actually do not need to validate all the things at the same time.
Sometimes these things are not directly visible , for eg: lets say there are two services order service and customer service and order service expose a api which say place a order for customer id. and business say you cannot place a order for a unknown customer
one implementation is from the order service you call the customer service in sync ---- in this case customer service down will impact your service, now lets question do we really need this.
Because a scenario could happen where customer just placed an order and somebody deleted that customer from customer service, now we have a order which dosen't belong to customer.Consistency cannot be guaranteed.
In the new sol. we are saying allow the order service to place the order without checking the customer id and do one of the following:
Using ProcessManager check the customer validity and update the status of the order as invalid and when customer get deleted using ProcessManager update the order status as invalid or perform business logic
Do not check at all , because placing a order dosen't count a thing, when this order will be in the process of dispatch that service will anyway check the customer status
In this way your API hits are reduced and better independent services are produced

Which comes first: database or application logic?

What is the best way or recommended best practice in the flow of database driven asp.net web application? I mean the database first or coding first or side by side?
Your data access code won't compile without an existing database - unless you stub (or Mock) it. So probably the database comes first.
But it is a bad idea to do whole chunks of the application in isolation. Ideally you should design and build slivers of the system - database and application - hand-in-hand. These slivers should be cohesive sub-sets of functionality, probably smaller than sub-systems. Inevitably, the act of coding screens and business rules will throw up problems in the data model. So it is good to have a data modeller or DBA who is happy to work incrementally alongside the developers.
edit
Stephanie makes an extremely pertinent point:
"the core tables which are persisting
your app's data really can't be
piecemealed. Most of the data is known
at project start. It has a form, you
need to find it."
I agree that the core entities are knowable at project start, and the physical data model can be derived from that logical data model. But I don't think it is ever possible to nail down completely the structure of any table, even a core table, at the start. This is because at the start of the design/build phase all we have to go on are the Requirements, and if there's one thing history tells us about the Requirements it is that they will change.
So, new tables will be needed and some existing tables will become obsolete. There will be columns which need to be added, columns which need to be modified, columns which need to be dropped. This is why Nature gave us the ALTER TABLE statement.
I am not suggesting that we don't design our tables, or assemble them piecemeal. I am merely suggesting that when we start designing the HR sub-system we need to worry about the EMPLOYEES table and the SALARIES table. We don't need to concern ourselves with INVENTORY or ORDERS until we commence work on Sales.
We personally start with the Domain and do things side-by-side. The important part is that we implement vertical slices of the application (fully working end-to-end features), not horizontal slices (e.g. first the whole database layer, then the data access, then the services, then the presentation): we build the application incrementally and demonstrate progress with working code after each iteration.
Applications are all about features.
You don't build apps to store data,
but to provide functionality. If we
can't agree on that, the discussion is
moot of course. Software should be
developed to satisfy the needs of its
users and not of its developers.
Well I have really no understanding of the second sentence. If you think my company pays me a good salary to write code that satisfies me and not my users you're crazy. So that argument is a strawman. Back to the first.
This is a common view point of application centric people (they), vs. database centric people (We). They see the entire point of the exercise to "provide features". Those are things the clients know they want and ask for them. To them, the database is just persistence required for these features. And when they are done, that's it, features delivered, database is sufficient for those features. Could be an entire Rube Goldberg inside the database with redundant data, severe violations of normal forms, constraints enforced by the application, what have you.
think overall usability alone outweighs database design
If the design of your database is affecting your usability than the design was bad. I have no doubt that one who strives for features will leave the database in such a state that it severely hampers usability.
Data Centric people, don't look at a system as a place to provide only what's been asked for, but a repository of Intellectual Capital that can be exploited by more than whatever the Application-du-jour is. I can't begin to describe the number of cases where one team has used the database of some other team's app to enhance their apps value. Just look at all the medical research that is nothing more that the meta-analysis of existing studies. None of that is possible if you believe that only the features of your app matter and subsequent uses of your apps data do not.
A good data model isn't inviolate. Sure you'll add to it, change it when requirements change. But if you don't completely understand your data, I don't know how anyone can begin to write code.
I guess you need first to define datamodel and only then going coding. You should plan everything carefully before actually writting the code.
First is a feature list.
Then, detailed spec.
Then test plan and design of all, including databases.
Then, it wouldn't matter which to implement first.
You'll probably end up doing it "side by side".
You need some data to be able to test the application, but you need the application to be able to verify that you're storing the correct data.
Do some modelling first and then build the minimum you can for one or two features. Then when these are working add the next feature and so on.
You'll need to write some database update procedures (both the code and the rules about what and when to update) as you will have to extend your tables, but you'll need those for the final system anyway as it will have to change as new requirements come along.
Having done it quite a few times, I find myself invariably doing it like so:
Define the problem I'm trying to solve.
Write out some use-cases.
Have my significant other or a friend tell me if this is even a problem.
Sketch out a few sample screens.
Write flow diagrams for the use cases.
Ask my Rubber-duck questions.
Use questions to refine 1-6.
Write out the 'nouns'. Those become my data Model.
Write out the actions. Those become application logic.
Code data Model.
Code Application Logic.
Realize I've gotten it a little wrong.
Repeat 10-12 as many times as needed.
Ask, "Have I solved the problem"?
If not, rinse, lather and repeat 1-15.
This is a trick question. IMO, they both come in parallel during your planning and design phase. They are so closely related that it make sense to do them together. Just keep in mind that your database design will be almost fully developed while your code is still in its infancy (though your application logic should be almost fully mapped out in you head or on paper)
The idea is that you're designing your solution in the context of the problem. When you're planning out your solution you will be (or should be) defining your application as a set of things and actions (nouns and verbs).
For example, a very basic helpdesk program has people and tickets. People need to create tickets, update tickets, and close tickets. The nouns that require persistent storage will comprise your database, and the nouns + actions will be contained in your application.
Sometimes your table mappings and the relationship between tables will be obvious (IE people create tickets, ticket.creatorID = people.personID) and other times the relationship doesn't really click in your head until you start working through use cases or until you start writing your code (IE different ppl have different access levels defining what they can do. At a glance this would seem like a simple field in a table, but in practice it is better as a separate table).

Should data validation be done at the database level?

I am writing some stored procedures to create tables and add data. One of the fields is a column that indicates percentage. The value there should be 0-100. I started thinking, "where should the data validation for this be done? Where should data validation be done in general? Is it a case by case situation?"
It occurs to me that although today I've decided that 0-100 is a valid value for percentage, tomorrow, I might decide that any positive value is valid. So this could be a business rule, couldn't it? Should a business rule be implemented at the database level?
Just looking for guidance, we don't have a dba here anymore.
Generally, I would do validations in multiple places:
Client side using validators on the aspx page
Server side validations in the code behind
I use database validations as a last resort because database trips are generally more expensive than the two validations discussed above.
I'm definitely not saying "don't put validations in the database", but I would say, don't let that be the only place you put validations.
If your data is consumed by multiple applications, then the most appropriate place would be the middle tier that is (should be) consumed by the multiple apps.
What you are asking in terms of business rules, takes on a completely different dimension when you start thinking of your entire application in terms of business rules. If the question of validations is small enough, do it in individual places rather than build a centralized business rules system. If it is a rather large system, them you can look into a business rules engine for this.
If you have a good data access tier, it almost doesn't matter which approach you take.
That said, a database constraint is a lot harder to bypass (intentionally or accidentally) than an application-layer constraint.
In my work, I keep the business logic and constraints as close to the database as I can, ensuring that there are fewer potential points of failure. Different constraints are enforced at different layers, depending on the nature of the constraint, but everything that can be in the database, is in the database.
In general, I would think that the closer the validation is to the data, the better.
This way, if you ever need to rewrite a top level application or you have a second application doing data access, you don't have two copies of the (potentially different) code operating on the same data.
In a perfect world the only thing talking (updating, deleting, inserting) to your database would be your business api. In the perfect world databae level constraints are a waste of time, your data would already have been validated and cross checked in your business api.
In the real world we get cowboys taking shortcuts and other people writing directly to the database. In this case some constraints on the database are well worth the effort. However if you have people not using your api to read/write you have to consider where you went wrong in your api design.
It would depend on how you are interacting with the database, IMO. For example, if the only way to the database is through your application, then just do the validation there.
If you are going to allow other applications to update the database, then you may want to put the validation in the database, so that no matter how the data gets in there it gets validated at the lowest level.
But, validation should go on at various levels, to give the user the quickest opportunity possible to know that there is a problem.
You didn't mention which version of SQL Server, but you can look at user defined datatypes and see if that would help you out, as you can just centralize the validation.
I worked for a government agency, and we had a -ton- of business rules. I was one of the DBA's, and we implemented a large number of the business rules in the database; however, we had to keep them pretty simple to avoid Oracle's dreaded 'mutating table' error. Things get complicated very quickly if you want to use triggers to implement business rules which span several tables.
Our decision was to implement business rules in the database where we could because data was coming in through the application -and- through data migration scripts. Keeping the business rules only in the application wouldn't do much good when data needed to be migrated in to the new database.
I'd suggest implementing business rules in the application for the most part, unless you have data being modified elsewhere than in the application. It can be easier to maintain and modify your business rules that way.
One can make a case for:
In the database implement enough to ensure overall data integrity (e.g. in SO this could be every question/answer has at least one revision).
In the boundary between presentation and business logic layer ensure the data makes sense for the business logic (e.g. in SO ensuring markup doesn't contain dangerous tags)
But one can easily make a case for different places in the application layers for every case. Overall philosophy of what the database is there for can affect this (e.g. is the database part of the application as a whole, or is it a shared data repository for many clients).
The only thing I try to avoid is using Triggers in the database, while they can solve legacy problems (if you cannot change the clients...) they are a case of the Action at a Distance anti-pattern.
I think basic data validation like you described makes sure that the data entered is correct. The applications should be validating data, but it doesn't hurt to have the data validated again on the database. Especially if there is more than one way to access the database.
You can reasonable restrict the database so that the data always makes sense. A database will support multiple applications using the same data so some restrictions make sense.
I think the only real cost in doing so would be time. I think such restrictions aren't a big deal unless you are doing something crazy. And, you can change the rules later if needed (although some changes are obviously harder than others)
First ideal: have a "gatekeeper" so that your data's consistency does not depend upon each developer applying the same rules. Simple validation such as range validation may reasonably be implemented in the DB. If it changes at least you have somewhere to put.
Trouble is the "business rules" tend to get much more complex. It can be useful to offload processing to the application tier where OO languages can be better for managing complex logic.
The trick then is to structure the app tier so that the gatekeeper is clear and unduplicated.
In a small organisation (no DBA ergo, small?) I would tend to put the business rules where you have strong development expertise.
This does not exclude doing initial validation in higher levels, for example you might validate all the way up in the UI to help the user get it right, but you don't depend upon that initial validation - you still have the gatekeeper.
If you percentage is always 'part divided by whole' (and you don't save part and whole values elsewhere), then checking its value against [0-100] is appropriate at db level. Additional constraints should be applied at other levels.
If your percentage means some kind of growth, then it may have any kind of values and should not be checked at db level.
It is case by case situation. Usually you should check at db level only constraints, which can never change or have natural limits (like first example).
Richard is right: the question is subjective the way it has been asked here.
Another take is: what are the schools of thought on this? Do they vary by sector or technology?
I've been doing Ruby on Rails for a bit now, and there, even relationships between records (one-to-many etc.) are NOT respected on the DB level, not to mention cascade deleting and all that stuff. Neither are any kind of limits aside from basic data types, which allow the DB to do its work. Your percentage thing is not handled on the DB level but rather at the Data Model level.
So I think that one of the trends that we're seeing lately is to give more power to the app level. You MUST check the data coming in to your server (so somewhere in the presentation level) and you MIGHT check it on the client and you MIGHT check again in the business layer of your app. Why would you want to check it again at the database level?
However: the darndest things do happen and sometimes the DB gets values that are "impossible" reading the business-layer's code. So if you're managing, say, financial data, I'd say to put in every single constraint possible at every level. What do people from different sectors do?

Preventing bad data input

Is it good practice to delegate data validation entirely to the database engine constraints?
Validating data from the application doesn't prevent invalid insertion from another software (possibly written in another language by another team). Using database constraints you reduce the points where you need to worry about invalid input data.
If you put validation both in database and application, maintenance becomes boring, because you have to update code for who knows how many applications, increasing the probability of human errors.
I just don't see this being done very much, looking at code from free software projects.
Validate at input time. Validate again before you put it in the database. And have database constraints to prevent bad input. And you can bet in spite of all that, bad data will still get into your database, so validate it again when you use it.
It seems like every day some web app gets hacked because they did all their validation in the form or worse, using Javascript, and people found a way to bypass it. You've got to guard against that.
Paranoid? Me? No, just experienced.
It's best to, where possible, have your validation rules specified in your database and use or write a framework that makes those rules bubble up into your front end. ASP.NET Dynamic Data helps with this and there are some commercial libraries out there that make it even easier.
This can be done both for simple input validation (like numbers or dates) and related data like that constrained by foreign keys.
In summary, the idea is to define the rules in one place (the database most the time) and have code in other layers that will enforce those rules.
The disadvantage to leaving the logic to the database is then you increase the load on that particular server. Web and application servers are comparatively easy to scale outward, but a database requires special techniques. As a general rule, it's a good idea to put as much of the computational logic into the application layer and keep the interaction with the database as simple as possible.
With that said, it is possible that your application may not need to worry about such heavy scalability issues. If you are certain that database server load will not be a problem for the foreseeable future, then go ahead and put the constraints on the database. You are quite correct that this improves the organization and simplicity of your system as a whole by keeping validation logic in a central location.
There are other concerns than just SQL injection with input. You should take the most defensive stance possible whenever accepting user input. For example, a user might be able to enter a link to an image into a textbox, which is actually a PHP script that runs something nasty.
If you design your application well, you should not have to laboriously check all input. For example, you could use a Forms API which takes care of most of the work for you, and a database layer which does much the same.
This is a good resource for basic checking of vulnerabilities:
http://ha.ckers.org/xss.html
It's far too late by the time the data gets to your database to provide meaningful validation for your users and applications. You don't want your database doing all the validation since that'll slow things down pretty good, and the database doesn't express the logic as clearly. Similarly, as you grow you'll be writing more application level transactions to complement your database transactions.
I would say it's potentially a bad practice, depending on what happens when the query fails. For example, if your database could throw an error that was intelligently handled by an application, then you might be ok.
On the other hand, if you don't put any validation in your app, you might not have any bad data, but you may have users thinking they entered stuff that doesn't get saved.
Implement as much data validation as you can at the database end without compromising other goals. For example, if speed is an issue, you may want to consider not using foreign keys, etc. Furthermore, some data validation can only be performed on the application side, e.g., ensuring that email addresses have valid domains.
Another disadvantage to doing data validation from the database is that often you dont validate the same way in every case. In fact, it often depends on application logic (user roles), and sometimes you might want to bypass validation altogether (cron jobs and maintenance scripts).
I've found that doing validation in the application, rather than in the database, works well. Of course then, all the interaction needs to go through your application. If you have other applications that work with your data, your application will need to support some sort of API (hopefully REST).
I don't think there is one right answer, it depends on your use.
If you are going to have a very heavily used system, with the potential that the database performance might become a bottleneck, then you might want to move the responsibility for validation to the front-end where it is easier to scale with multiple servers.
If you have multiple applications interacting with the database, then you might not want to replicate and maintain the validation rules across multiple applications, so then the database might be the better place.
You might want a slicker input screen that doesn't just hit the user with validation warnings when they try to save a record, maybe you want to validate a field after data has been entered and it losses focus; or even as the user types, changing the font colour as validation fails/passes.
Also related to constraints, is warnings of suspect data. In my application I have hard-constraints in the database (e.g. someone can't start a job before their date of birth), but then in the front-end have warnings for data that is possibly correct, but suspect (e.g. an eight year-old starting a job).

Database exception handling best practices [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 4 years ago.
Improve this question
How do you handle database exceptions in your application?
Are you trying to validate data prior passing it to DB or just relying on DB schema validation logic?
Do you try to recover from some kind of DB errors (e.g. timeouts)?
Here are some approaches:
Validate data prior passing it to DB
Left validation to DB and handle DB exceptions properly
Validate on both sides
Validate some obvious constraints in business logic and left complex validation to DB
What approach do you use? Why?
Updates:
I'm glad to see growing discussion.
Let’s try to sum up community answers.
Suggestions:
Validate on both sides
Check business logic constraints on
client side, let DB do integrity checks from hamishmcn
Check early to avoid bothering DB from ajmastrean
Check early to improve user experience from Will
Keep DB interacting code in place to
simplify development from hamishmcn
Object-relational mapping (NHibernate, Linq, etc.) can help you to deal with constrains from ajmastrean
Client side validation is necessary for security reasons from Seb Nilsson
Do you have anything else to say? This is converted to Validation specific question. We are missing the core, i.e. "Database related Error best practices" which ones to handle and Which ones to Bubble up?
#aku: DRY is nice, but its not always possible. Validation is one of those places, as you will have three completely different and unrelated places where validation is not only possible but absolutely needed: Within the UI, within the business logic, and within the database.
Think of a web application. You want to reduce trips to the server, so you include javascript validation of client data entry. But you can't trust what the user enters, so you must perform validation within your business logic before touching the database. And the database must have its own validation in order to prevent data corruption.
There's no clean way to unify these three different types of validation within a single component.
There are some attempts being made to unify cross-cutting responsibilities like validation within policy injectors like the P&P group's Policy Injection Application Block combined with their Validation Application Block, but these are still code based. If you have validation that's not in code, you still have to maintain parallel logic separately...
There is one killer-reason to validate on both the client-side and on the database-side, and that is security. Especially when you start using AJAX-stuff, hackable URLs and other things that make your site (in this case) more friendly to users and hackers.
Validate on the client to provide a smooth experience to early tell the user to correct their input. Also validate in database, (or in business logic, if this is considered a totally secure gateway to the database) for security for you database.
You want to reduce unnecessary trips to the DB, so performing validation within the application is a good practice. Also, it allows you to handle data errors where it is most easy to recover from: up near the UI (whether in the controller or within the UI layer for simpler apps) where the data is entered.
There are some data errors that you can't check for programatically, however. For instance, you can't validate data on the existance of related data without roundtripping to the db. Data errors like these should be validated by the database through the use of relationships, triggers, etc.
Where you deal with errors returned by database calls is an interesting one. You could deal with them at the data layer, the business logic layer, or the UI layer. The best practice in this instance is to let those errors bubble up to the last responsible moment before handling them.
For example, if you have an ASP.NET MVC web application, you have three layers (from bottom to top): Database, controller and UI (model, controller, and view). Any errors thrown by your data layer should be allowed to bubble up into your controller. At this level your application "knows" what the user is attempting to do, and can correctly inform the user about the error, suggesting different ways to handle it. Attempting to recover from these errors within the data layer makes it much harder to know what's going on within the controller. And, of course, placing business logic within the UI is not considered a best practice.
TL;DR: Validate everywhere, handle validation errors at the last responsible moment.
I try to validate on both sides. 1 rule I always follow is never trust input from the user. Following this to it's conclusion, I will usually have some front end validation on the form/web page which will not even allow submission with improperly formed data. This is a blunt tool - meaning you can check/parse the value to make sure a date field contains a date. From there, I usually let my business logic check as to whether the data entry makes sense in context with how it was submitted. For example, does the date submitted fall into the expected range? Does the currency value submitted fall into the expected range? Finally, on the server side, Foreign Key constraints and Indexes can catch any errors that slip through, which will bubble up a DB exception as a last resort, which can be handled by the app code. I use this method because it filters out as many errors as possible before the DB call is invoked.
An object-relational mapping (ORM) tool, like NHibernate (or better yet, ActiveRecord), can help you avoid a lot of validation by allowing the data model to be built right into your code as a proper C# class. You may avoid trips to the database as well, thanks to great caching and validation models built into the framework.
In general, I try to validate data as soon as possible after it has been entered. This is so that I can give helpful messages to the user earlier than after they have clicked "submit" or the equivalent.
By the time that it comes to making the db call I am hopefull that the data I am passing should be fairly good.
I try to keep db calls in the one file (or group of files) that share helper methods make it as easy as possible for the programmer (me or whoever else adds calls) to write to a log details about the exception, and what parameters were passed in etc
The sorts of apps that I was writing (I've since moved jobs) were in-house fat-client apps.
I would try to keep the business logic in the client, and do more mechanical validation on the db (ie validation that only related to the procedure's ability to run, as opposed to higher level validation).
In short, validate where you can, and try to keep related types of validation together.

Resources