Data integrity with DBMS or application stack

My question is simple, and it's more advice that I'm seeking. What is better when it comes to maintaining data integrity: the DBMS or application code?
Example: with the DBMS we can use things like triggers, transactions, procedures, etc. to do ALMOST all proper data management and make sure things fit into the right place. The same can be achieved with application code.
Which one would you prefer in particular: application code, or a combination of both?

Generally you would want to bake any data-at-rest integrity into the DBMS. Referential requirements, data limitations (length, etc.), things like that. The idea is that the DBMS should, in and of itself (disconnected from and not reliant on the application[s] which use[s] it), maintain the integrity and rules of the data it contains.
The key thing to note there is that multiple applications may use this database. Even if that's not the case for a particular database, or not even likely to be the case in the foreseeable future, it's still good design. Not all of the applications necessarily have the same business logic or the same integrity checks. The DBMS should never assume that what it's getting has been checked for integrity.
The application(s) should apply the business logic and maintain integrity for data-in-motion. An application should do its best to prevent even trying to persist invalid data to the database. But at any given point in the application it may very well not be reasonable to assume that it "knows" all of the other data in the database. The application can apply logic to the small piece of data it's currently holding, then try to interact with the database to persist it.
But the job of the database is to know and maintain all of the data, not just what's currently being used by the application. So where the application may believe that it has a perfectly good piece of data, the database may disagree based on the state of some other data which the application wasn't using. It's perfectly acceptable for the database to then return an error to the application to tell it that there's a problem with the data being sent. The application should be able to handle such errors.
To sum up...
From the point of view of the application, active data is a small subset of all data, and the application is responsible only for that active data. Also, that data is in motion, very fluid, and part of a richer logical model of business logic.
From the point of view of the DBMS, "active data" is all data and the DBMS is responsible for maintaining the integrity of all data. Generally, data is at rest and static and should at any given snapshot of the tables be "good data."
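To make that split concrete, here's a minimal sketch using Python and SQLite (the table and column names are invented for illustration): the schema enforces integrity on its own, the application validates only the piece of data it's currently holding, and it handles the database's refusal when some other data disagrees.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")   # SQLite enforces FKs only when asked

    # The database enforces integrity on its own, independent of any application.
    conn.execute("""
        CREATE TABLE customers (
            id    INTEGER PRIMARY KEY,
            email TEXT NOT NULL
        )""")
    conn.execute("""
        CREATE TABLE orders (
            id          INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customers(id),
            quantity    INTEGER NOT NULL CHECK (quantity > 0)
        )""")
    conn.execute("INSERT INTO customers (id, email) VALUES (1, 'a@example.com')")

    def place_order(customer_id, quantity):
        # The application validates the data it is currently holding...
        if quantity <= 0:
            raise ValueError("quantity must be positive")
        try:
            with conn:
                conn.execute(
                    "INSERT INTO orders (customer_id, quantity) VALUES (?, ?)",
                    (customer_id, quantity),
                )
        except sqlite3.IntegrityError as exc:
            # ...but the database may still disagree based on data the
            # application wasn't looking at (here, a missing customer).
            print(f"database rejected the order: {exc}")

    place_order(1, 5)    # passes both the app check and the DB constraints
    place_order(42, 5)   # app check passes, DB rejects the unknown customer_id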

Related

Use Event Sourcing to get rid of relational database?

I've read some articles about CQRS and Event Sourcing recently. While the former seemed to me like a highly complex and risky workaround for poorly performing business layers and poorly designed data access layers and data models, the latter seemed like a solution to many problems.
Problems to solve using Event Sourcing:
Get rid of the relational database and object-relational mappers, like NHibernate and Entity Framework. Hardly anybody on the programming side wants to pay attention to things like indexes, table/index fragmentation or normalization, how to design relational data, and how to code/configure the ORM (a science of its own).
Have Business Model and in-memory "database" united, an entity/aggregate service keeping all relevant items in memory, maintaining integrity by simply dumping the CUD events somewhere without much pain. Old items can be evicted from memory and dumped to a NoSQL (or whatever) store and used for aggregate calculations, reporting, search and, if necessary, be re-activated. If I understand right, in-memory databases like VoltDB use event dumping in a similar way, but are still relational databases, separated from the business logic.
This would also make concurrency easier: instead of locking (with possible complete system deadlocks) or optimistic locking with a general "success or fail" logic, depending on whether the data has changed meanwhile (or rather complex DB code), merge rules can be implemented in code.
History: no more pain with implementing auditing functions, cemetery tables or "deleted" marker columns, or possibly-deleted data still being required.
Data Duplication/Search/Reporting: use full-text indices instead of chasing missing relational indices, create proper viewing areas, preparing the data for the user in a required format, instead of using ugly copy routines in relational databases, with triggers, followup stored procedures or even program code copying data to half a dozen different tables.
Versioning: it's a pain to get many modules running with a number of different relational database versions, each having different tables and columns and needing appropriate ORM mappings. It could be much easier in a single-layer model, with the event dump accepting any object format (typically schema-less or loose-schema NoSQL documents, represented as JSON or XML). It might also be possible to upgrade old data through a "data schema change event" chain (instead of having to maintain migration scripts for relational DBs).
N-tier Business Model / Relational DB / ORM mess
An n-tier approach a decade or longer ago might have been a business layer and a data access layer. In order to keep the separation really strict, many relational features were omitted and implemented in the business layer instead: relational integrity, normalization, with the DB being what I call a "trash dump", looking like a kid had been playing around with SQL Server Management Studio or Access. Extremely un-normalized, polymorphic references ("foreign key" columns referencing different source tables, identified by a "ReferenceSource" marker), abuse of the same tables for different kinds of business objects, and duplication of data to numerous other tables (and from there again elsewhere), because performance wasn't good and this was supposed to improve queries. ORM usage was without object references too, reduced to single-object load and save operations. Loading an aggregate (a graph of entities/table rows) would iterate through the graph and send a query for every set of sub-entities.
When performance got worse and, possibly, orphaned references caused serious trouble, attempts to implement classic relational design might have been made, but it was impossible to adapt the grown system to a complete data redesign (nobody would pay for it), and hardly anybody would know how to map object relations or even optimize loading in the ORM. Such attempts were limited to a few places in the design, possibly making the data model and access even harder to maintain.
CQRS on top of n-tier?
To get acceptable performance, separate SQL queries were possibly built for certain modules, bypassing the business model with its single-object iterative access. This whole structure was suddenly called a de-facto CQRS, because of the separate query access (which could have been handled by a well-implemented relational data model and ORM usage, as long as it wasn't supposed to be a "big data" Google- or Stack Overflow-like workload) and the plenty of duplicated data in relational tables prepared for immediate application access.
Something better than the inappropriate table format?
OK, so I read into CQRS, and while I didn't like the use of "CQRS" as described before, the concept of an event store instead of a relational DB looked very useful. It is unlikely anyone could successfully enforce the introduction of the original, state-of-the-art relational DB design and OR mapping, and even then it would be extremely costly. In fact, ordinary object-oriented programming is much more "normalized" than most DB tables, due to the need to press everything into the table format or create tons of tables for object graphs/aggregates. And I agree: having to take care of search indices and defragmentation, schema management and data history tracking manually is like stone-age IT, like running Ford Model Ts and steam locomotives beside modern cars and electric high-speed trains.
Any good Experiences?
What are people's experiences with using event sourcing (not necessarily full CQRS)? Does it eliminate much of the pain of relational databases? I'm really looking forward to a kind of in-memory database with all business logic integrated, and possibly fast enough to make separate query modules dispensable!
There's a lot going on in this question and so a specific, actionable answer is not possible, but if you're looking for one then it is...
It depends on your domain.
CQRS/ES/DDD is not appropriate for solving every single problem - it is not a silver bullet. If the domain suggests that CRUD/NTier will be good enough, then that's what you should use. All of the concerns you list in your question are infrastructural or system traits and say nothing about the very thing that should inform your choice of tool or practice: what are you trying to build?
Although CQRS, ES, and DDD are very often used together they are separate concepts that are very powerful on their own.
CQRS (Command Query Responsibility Segregation): This is a very useful pattern for designing software in general. The idea is to keep the things that change state (commands) separate from the things that do not (queries). In many systems, queries modify the state of the database, and this makes it very difficult for developers to reason about what is going on.
Imagine doing a query to find out some information and realizing that the information changed because you queried it.
CQRS prohibits those kinds of behaviors. Commands (which cannot return information) change state, and queries (which return information) cannot modify state. That way, you have certainty about which parts of the code are idempotent (and can therefore be called as much as you want with no side effects) and which parts change state.
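A minimal sketch of that split in plain Python (the service and method names are invented for the example): commands mutate and return nothing, queries return information and never mutate.

    from dataclasses import dataclass, field

    @dataclass
    class AccountService:
        # Toy in-memory state standing in for whatever storage you use.
        balances: dict[str, int] = field(default_factory=dict)

        # Commands: change state, return nothing.
        def open_account(self, account_id: str) -> None:
            self.balances[account_id] = 0

        def deposit(self, account_id: str, amount: int) -> None:
            self.balances[account_id] += amount

        # Queries: return information, never modify state.
        def balance_of(self, account_id: str) -> int:
            return self.balances[account_id]

    service = AccountService()
    service.open_account("alice")
    service.deposit("alice", 100)
    print(service.balance_of("alice"))   # 100, and asking doesn't change anything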
DDD (Domain Driven Design): This is a Design methodology for the "Data Structure" of the code. It does not prescribe techniques for database access or many technical details. What it does is provide guidelines and concepts to structure data in an application in a way that makes it much more responsive to the actual user's needs. It also simplifies development (although it is more work than just slapping something together).
ES (Event Sourcing): Event sourcing is a data storage strategy which shifts data storage from state (the actual values of a piece of data at the current point in time) into transitions (the changes that have happened to a piece of data during its lifetime) which are called events.
There are several advantages of using ES.
First, it allows the business to store much more information regarding what happened before (a boon to Data Scientists). In traditional systems, a lot of information is lost to updates of the data, and unless those updates are explicitly logged, the information is gone forever. This does not happen in ES.
Second, storing all events makes debugging much simpler, because a developer can now follow the processing of the data from its beginning. An update to a piece of data that happened a long time ago (and would have been overwritten by another update and lost) but that corrupted processing can be identified and fixed. Furthermore, the effects of the fix can even span all calculations that happened between the wrong event and the last event. In a traditional system, this would be impossible, as we are only storing the latest state.
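A bare-bones sketch of the idea in Python, with invented event types: only the events are stored, and the current state (or any past state) is rebuilt by replaying them.

    from dataclasses import dataclass

    # Events: immutable facts about what happened, appended and never updated.
    @dataclass(frozen=True)
    class Deposited:
        amount: int

    @dataclass(frozen=True)
    class Withdrawn:
        amount: int

    event_log = [Deposited(100), Withdrawn(30), Deposited(50)]

    def current_balance(events) -> int:
        # The current state is just a left fold over the event history.
        balance = 0
        for event in events:
            if isinstance(event, Deposited):
                balance += event.amount
            elif isinstance(event, Withdrawn):
                balance -= event.amount
        return balance

    print(current_balance(event_log))        # 120
    print(current_balance(event_log[:2]))    # 70 -- any past state can be rebuilt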
While it is theoretically possible to write an Event Sourced system without CQRS or DDD, it is remarkably more difficult to do so.

business logic in database layer

I hate to ask the classic question of "business logic in the database vs. code" again, but I need some concrete reasons to convince an older team of developers that business logic in code is better, above all else because it's more maintainable. I used to have a lot of business logic in the DB, because I believed it was the single point of access. Maintenance was easy, as long as I was the only one changing it. In my experience, the problems came when the projects got larger and more complicated. Source control for DB stored procs is not as advanced as it is for newer IDEs, nor are the editors. Business logic in code can scale much better than in the DB, is what I've found in my recent experience.
So, just searching around stackoverflow, I found quite the opposite philosophy from its esteemed members:
https://stackoverflow.com/search?q=business+logic+in+database
I know there is no absolute for any situation, but for a given ASP.NET solution, which will use either SQL Server or Oracle, for a not particularly high-traffic site, why would I put the logic in the DB?
Depends on what you call business.
The database should do what is expected.
If the consumers and providers of data expect the database to make certain guarantees, then it needs to be done in the database.
Some people don't use referential integrity in their databases and expect the other parts of the system to manage that. Some people access tables in the database directly.
I feel that from a systems and component perspective, the database is like any other service or class/object. It needs to protect its perimeter, hide its implementation details and provide guarantees of integrity, from low-level integrity up to a certain level, which may be considered "business".
Good ways to do this are referential integrity, stored procedures, triggers (where necessary), views, hiding base tables, etc., etc.
The database does data things; why weigh down something that is already getting hit pretty hard just to give you data? It's a performance thing and a code thing. It's MUCH easier to maintain business logic in code than to store it all in the database. Sprocs, views and functions can only go so far until you have views of views of views with sprocs to fill that mess in. With business logic in code you separate your concerns. If you have a bug that's causing something to be calculated wrong, it's easier to check the business logic code than to go into the DB and see if someone messed up something in a stored procedure. This is highly opinionated, and in some cases it's OK to put some logic in the database, but my thought on this is that it's a database, not a logicbase; put things where they belong.
P.S.: I might be catching some heat for this post; it's highly opinionated, and other than performance numbers there's no real evidence either way, so it becomes a case of what you're working with.
EDIT: Something that Cade mentioned that I forgot: referential integrity. By all means please have correct data integrity in your DB: no orphaned records, ON DELETE CASCADEs, checks and whatnot.
I have dealt with database logic on one huge project. This was caused by the decision of the main manager, who was a DBA specialist. He said that the application should be lightweight, that it should know nothing about the database schema, joined tables, etc., and that anyway stored procs execute much faster than transaction scopes and queries from the client.
On the other side, we had too many bugs with database object mappings (a stored proc, or a view based on a view based on another view, etc.). It was impossible to understand what was happening with our data, because each button click called a huge stored proc with 70-90-120 parameters and updated several (10-15) tables. We had no ability to issue a simple select, so we had to create a view or stored proc, plus a class in code, just for one simple join :-( And of course, when a table or view definition changes, you have to recompile all the other DB objects based on the edited object, otherwise you get a runtime exception.
So I think that logic in the database is a horrible approach. Of course you can put some pieces of code in stored procs if required by performance or security concerns, but you should not develop everything in the database. The logic should be flexible, testable and maintainable, and you cannot reach those goals by using the database to store logic.

Should business rules be enforced in both the application tier and the database tier, or just one of the two?

I have been enforcing business rules in both my application tier (models) and my database tier (stored procedures who raise errors).
I've been duplicating my validations in both places for a few reasons:
If the conditions change between when they are checked in the application code and when they are checked in the database, the business rule checks in the database will save the day. The database also allows me to lock various records in a simpler manner than my application code does, so it seems natural to do it there.
If we have to do some batch data insertions/updates directly against the database, routing all of those operations through my stored procedures/functions, which do the business rule validations, means there's no chance of me putting in bad data, even though I lack the protections I would get doing single-record input through the application.
While enforcing these things ONLY in the database would have the same effect on the actual data, it seems improper to just throw data at the database without first making a good effort to validate that it conforms to constraints and business rules.
What's the right balance?
You need to enforce at the data tier to ensure data integrity. That's your last line of defense, and that's the DB's job: to help enforce its world view of the data.
That said, throwing junk data against the DB for validation is a coarse technique. Typically the errors are designed to be human-readable rather than machine-readable, so it's inefficient for the program to process the error from the DB and make heads or tails out of it.
Stored procedures are a different matter. Back in the day, stored procedures were The Way to handle business rules on the data tier.
But today, with modern application server environments, the app servers have in general become a better place to put this logic. They offer multiple ways to access and expose the data (the web, web services, remote protocols, APIs, etc.). Also, if your rules are CPU-heavy (arguably most aren't), it's easier to scale app servers than DB servers.
The large array of features within the app servers gives them a flexibility beyond what the DB servers can do, and thus much of what was once pushed down into the DBs is being pulled out, with the DB servers being relegated to "dumb persistence".
That said, there are certainly performance advantages to using stored procs and such, but now that's a tuning thing, where the question becomes "is it worth losing the app server capability for the gain we get by putting it into the DB server?"
And by app server, I'm not simply talking Java, but .NET and even PHP etc.
If the rule must be enforced at all times, no matter where the data came from or how it was updated, the database is where it needs to be. Remember that databases are affected by direct querying, used to make changes that affect many records or to do something the application would not normally do. These are things like fixing a group of records when a customer is bought out by another customer and they want to change all the historical data, applying new tax rates to orders not yet processed, or fixing some bad data inputs. Databases are also sometimes affected by other applications which do not use your data layer. They may also be affected by imports run through ETL programs which likewise cannot use your data layer. So if the rule must be followed in all cases, it must be in the database.
If the rule is only for special cases concerning how this particular input page works, then it needs to be in the application. So if a sales manager has only specific things he can do from his user interface, these things can be specified in the application.
Some things it is helpful to do in both places. For instance, it is silly to allow a user to put a non-date in an input box that will relate to a date field. The datatype in the database should still be a datetime datatype, but it is best to check some of this stuff before you send it.
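As a small sketch of that double check (assuming a Python front end; the function name is made up), the application can reject an obvious non-date before it ever travels to the datetime column:

    from datetime import date

    def parse_order_date(raw: str) -> date:
        # Reject non-dates in the application before they ever reach the
        # datetime column in the database.
        try:
            return date.fromisoformat(raw)
        except ValueError:
            raise ValueError(f"not a valid date: {raw!r}")

    parse_order_date("2024-03-01")   # fine
    # parse_order_date("banana")     # raises before any database call is made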
Your business logic can sit in either location, but should not be in both. The logic should NOT be duplicated because it's easy to make a mistake trying to keep both in sync. If you put it in the model you'll want all data access to go through your models, including batch updates.
There will be trade-offs to putting it in the database vs. the application models (here are a few off the top of my head):
Databases can be harder to maintain and update than applications
It's easier to distribute load if it's in the application tier
Multiple, disparate dbs may require splitting business rules (which may not be possible)

How much business logic should be in the database?

I'm developing a multi-user application which uses a (PostgreSQL) database to store its data. I wonder how much logic I should shift into the database.
e.g. When a user is going to save some data he just entered. Should the application just send the data to the database and the database decides if the data is valid? Or should the application be the smart part in the line and check if the data is OK?
In the last (commercial) project I worked on, the database was very dumb. No constraints, no views, etc.; everything was ruled by the application. I think that's very bad, because every time a certain table was accessed in the code, the same code to check whether the access was valid was repeated over and over again.
By shifting the logic into the database (with functions, triggers and constraints), I think we can save a lot of code in the application (and a lot of potential errors). But I'm afraid that putting too much of the business logic into the database will be a boomerang and someday it will be impossible to maintain.
Are there some real-life-approved guidelines to follow?
If you don't need massive distributed scalability (think companies with as much traffic as Amazon or Facebook etc.) then the relational database model is probably going to be sufficient for your performance needs. In which case, using a relational model with primary keys, foreign keys, constraints plus transactions makes it much easier to maintain data integrity, and reduces the amount of reconciliation that needs to be done (and trust me, as soon as you stop using any of these things, you will need reconciliation -- even with them you likely will due to bugs).
However, most validation code is much easier to write in languages like C#, Java, Python etc. than it is in languages like SQL because that's the type of thing they're designed for. This includes things like validating the formats of strings, dependencies between fields, etc. So I'd tend to do that in 'normal' code rather than the database.
Which means that the pragmatic solution (and certainly the one we use) is to write the code where it makes sense. Let the database handle data integrity because that's what it's good at, and let the 'normal' code handle data validity because that's what it's good at. You'll find a whole load of cases where this doesn't hold true, and where it makes sense to do things in different places, so just be pragmatic and weigh it up on a case by case basis.
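As a small illustration of the "normal code" half of that split (the field names and rules here are invented), string formats and dependencies between fields are easy to express in the application language, while keys and referential integrity stay in the database:

    import re

    def validate_signup(form: dict) -> list[str]:
        """Application-side validity checks: string formats and
        dependencies between fields. Referential integrity stays in the DB."""
        errors = []
        if not re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", form.get("email", "")):
            errors.append("email is not in a plausible format")
        if form.get("country") == "US" and not re.fullmatch(r"\d{5}", form.get("zip", "")):
            errors.append("US addresses need a 5-digit ZIP code")
        return errors

    print(validate_signup({"email": "a@example.com", "country": "US", "zip": "90210"}))  # []
    print(validate_signup({"email": "nope", "country": "US", "zip": "abc"}))             # two errors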
Two cents: if you choose the smart database, remember not to go into "too smart" territory. The database should not deal with inconsistencies that are inappropriate for its level of understanding of the data.
Example: suppose you want to insert a valid (checked with a confirmation mail) email address into a field. The database could check if the email actually conforms to a given regular expression, but asking the database to check if the email address is valid (e.g. checking if the domain exists, sending the email and handling the response) is a bit too much.
It's not meant to be a real-case example, just to illustrate that a smart database has limits to its smartness anyway: if a nonexistent email address gets into it, the data is still not valid, but to the database it looks fine. As in the OSI model, everything should handle data at its level of understanding. Ethernet does not care whether it's transporting ICMP or TCP, or whether they are valid.
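A small sketch of where that line falls, using SQLite (whose CHECK constraints support LIKE patterns but not full regular expressions out of the box): the database can insist the value is shaped like an email address, but whether anyone actually reads mail there is beyond its level of understanding.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE subscribers (
            email TEXT NOT NULL
                CHECK (email LIKE '%_@_%._%')   -- shape only: something@something.tld
        )""")

    conn.execute("INSERT INTO subscribers VALUES ('someone@example.com')")  # accepted
    try:
        conn.execute("INSERT INTO subscribers VALUES ('not-an-email')")
    except sqlite3.IntegrityError as exc:
        print("rejected by the CHECK constraint:", exc)

    # This one is also accepted: the database cannot know whether the
    # address actually works, only that it has a plausible shape.
    conn.execute("INSERT INTO subscribers VALUES ('nobody@no-such-domain.invalid')")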
I find that you need to validate in both the front end (either the GUI client, if you have one, or the server) and the database.
The database can easily assert for nulls, foreign key constraints etc. i.e. that the data is the right shape and linked up correctly. Transactions will enforce atomic writes of this. It's the database's responsibility to contain/return data in the right shape.
The server can perform more complex validations (e.g. does this look like an email, does this look like a postcode etc.) and then re-structure the input for insertion into the database (e.g. normalise it and create the appropriate entities for insertion into the tables).
Where you put the emphasis on validation depends to some degree on your application. E.g. it's useful to validate a (say) postcode in a GUI client and immediately provide feedback, but if your database is used by other applications (e.g. an application to bulk-load addresses) then the layer surrounding your database needs to validate as well. Sometimes you end up providing validation in two different implementations (e.g. in the above, perhaps a JavaScript front end and a Java DAO back end). I've never found a good strategic solution to this.
Using the common features of relational databases, like primary and foreign key constraints, datatype declarations, etc., is good sense. If you're not going to use them, then why bother with a relational DB?
That said, all data should be validated for both type and business rules before it hits the DB. Type validation is just defensive programming: assume users are out to hack you and you'll get fewer unpleasant surprises. Business rules are what your application is all about. If you make them part of the structure of your DB, they become much more tightly bound to how your app works. If you put them in the application layer, it's easier to change them when business requirements change.
As a secondary consideration: clients often have less choice about which DB they use (PostgreSQL, MySQL, Oracle, etc.) than which application language they have available. So if there is a good chance that your application will be installed on many different systems, your best bet is to make sure that your SQL is as standard as possible. This may well mean that constructing language-agnostic DB features like triggers, etc. will be more trouble than putting that same logic in your application layer.
It depends on the application :)
For some applications the dumb database is best. For example, Google's applications run on a big dumb database that can't even do joins, because they need amazing scalability to be able to serve millions of users.
On the other hand, for some internal enterprise app it can be beneficial to go with a very smart database, as those are often used by more than just one application and therefore you want a single point of control - think of an employees database.
That said, if your new application is similar to the previous one, I would go with the dumb database. In order to eliminate all the manual checks and database access code, I would suggest using an ORM library such as Hibernate for Java. It will essentially automate your data access layer but will leave all the logic to your application.
Regarding validation it must be done on all levels. See other answers for more details.
One other item of consideration is deployment. We have an application where the deployment of database changes is actually much easier for remote installations than the actual code base is. For this reason, we've put a lot of application code in stored procedures and database functions.
Deployment is not your #1 consideration, but it can play an important role in deciding between various choices.
This is as much a people question as it is a technology question. If your application is the only application that's ever going to manipulate the data (which is rarely the case, even if you think that's the plan), and you've only got application coders to hand, then by all means keep all the logic in the application.
On the other hand, if you've got DBAs who can handle it, or you know that more than one app will need to have its access validated, then managing data actually in the database makes a lot of sense.
Remember, though, that the best things for the database to be validating are a) the types of the data and b) relational constraints, which anything calling itself an RDBMS should have a handle on anyway.
If you've got any transactions in your application code, it's also worthwhile asking yourself whether they should be pushed to the database as a stored procedure so that it's impossible for them to be incorrectly reimplemented elsewhere.
I do know of shops where the only access allowed to the database is via stored procedures, so the DBAs have full responsibility for both the data storage semantics and access restrictions, and anyone else has to go through their gateways. There are obvious advantages to this, especially if more than one application has to have access to the data. Whether you go quite that far is up to you, but it's a perfectly valid approach.
While I believe that most data should be validated from the user interface (why send known bad stuff across the network, tying up resources?), I also believe it is irresponsible not to put constraints on the database, as the user interface is unlikely to be the only way that data ever gets into the database. Data also comes in from imports, other applications, quick script fixes for problems run at the query window, and mass updates (to update all prices by 10%, for example). I want all bad records rejected no matter what their source, and the database is the only place where you can be assured that will happen. To skip the database integrity checks because the user interface does them is to guarantee that you will most likely eventually have data integrity issues, and then all of your data becomes meaningless and useless.
e.g. When a user is going to save some data he just entered. Should the application just send the data to the database and the database decides if the data is valid? Or should the application be the smart part in the line and check if the data is OK?
It's better to have the validation on the front end as well as the server side. That way, if the data is invalid, the user will be notified immediately. Otherwise he will have to wait for the DB to respond after a postback.
Where security is concerned, it's better to validate at both ends: the front end as well as the DB. Or how can the DB trust all the data that is sent by the application? ;-)
Validation should be done on the client side and the server side, and once the data is valid it should be stored.
The only work the database should do is the querying logic. So updating rows, inserting rows, selects and everything else should be handled by the server-side logic, since that's where the real meat of the application lives.
Structuring your insert properly will handle any foreign key constraints. Getting your business logic to call a sproc will insert data in the correct format. I don't really consider this validation, but some people might.
My decision is: never use stored procedures in the database. Stored procedures are not portable.

Should data validation be done at the database level?

I am writing some stored procedures to create tables and add data. One of the fields is a column that indicates percentage. The value there should be 0-100. I started thinking, "where should the data validation for this be done? Where should data validation be done in general? Is it a case by case situation?"
It occurs to me that although today I've decided that 0-100 is a valid value for percentage, tomorrow, I might decide that any positive value is valid. So this could be a business rule, couldn't it? Should a business rule be implemented at the database level?
Just looking for guidance, we don't have a dba here anymore.
Generally, I would do validations in multiple places:
Client side using validators on the aspx page
Server side validations in the code behind
I use database validations as a last resort because database trips are generally more expensive than the two validations discussed above.
I'm definitely not saying "don't put validations in the database", but I would say, don't let that be the only place you put validations.
If your data is consumed by multiple applications, then the most appropriate place would be the middle tier that is (should be) consumed by the multiple apps.
What you are asking about in terms of business rules takes on a completely different dimension when you start thinking of your entire application in terms of business rules. If the question of validations is small enough, do it in the individual places rather than build a centralized business rules system. If it is a rather large system, then you can look into a business rules engine for this.
If you have a good data access tier, it almost doesn't matter which approach you take.
That said, a database constraint is a lot harder to bypass (intentionally or accidentally) than an application-layer constraint.
In my work, I keep the business logic and constraints as close to the database as I can, ensuring that there are fewer potential points of failure. Different constraints are enforced at different layers, depending on the nature of the constraint, but everything that can be in the database, is in the database.
In general, I would think that the closer the validation is to the data, the better.
This way, if you ever need to rewrite a top level application or you have a second application doing data access, you don't have two copies of the (potentially different) code operating on the same data.
In a perfect world, the only thing talking to your database (updating, deleting, inserting) would be your business API. In that perfect world, database-level constraints are a waste of time; your data would already have been validated and cross-checked in your business API.
In the real world we get cowboys taking shortcuts and other people writing directly to the database. In this case some constraints on the database are well worth the effort. However, if you have people not using your API to read/write, you have to consider where you went wrong in your API design.
It would depend on how you are interacting with the database, IMO. For example, if the only way to the database is through your application, then just do the validation there.
If you are going to allow other applications to update the database, then you may want to put the validation in the database, so that no matter how the data gets in there it gets validated at the lowest level.
But, validation should go on at various levels, to give the user the quickest opportunity possible to know that there is a problem.
You didn't mention which version of SQL Server you're on, but you can look at user-defined data types and see if they would help you out, as you can centralize the validation that way.
I worked for a government agency, and we had a -ton- of business rules. I was one of the DBAs, and we implemented a large number of the business rules in the database; however, we had to keep them pretty simple to avoid Oracle's dreaded "mutating table" error. Things get complicated very quickly if you want to use triggers to implement business rules which span several tables.
Our decision was to implement business rules in the database where we could because data was coming in through the application -and- through data migration scripts. Keeping the business rules only in the application wouldn't do much good when data needed to be migrated in to the new database.
I'd suggest implementing business rules in the application for the most part, unless you have data being modified elsewhere than in the application. It can be easier to maintain and modify your business rules that way.
One can make a case for:
In the database implement enough to ensure overall data integrity (e.g. in SO this could be every question/answer has at least one revision).
In the boundary between presentation and business logic layer ensure the data makes sense for the business logic (e.g. in SO ensuring markup doesn't contain dangerous tags)
But one can easily make a case for different places in the application layers for every case. Overall philosophy of what the database is there for can affect this (e.g. is the database part of the application as a whole, or is it a shared data repository for many clients).
The only thing I try to avoid is using Triggers in the database, while they can solve legacy problems (if you cannot change the clients...) they are a case of the Action at a Distance anti-pattern.
I think basic data validation like you described makes sure that the data entered is correct. The applications should be validating data, but it doesn't hurt to have the data validated again on the database. Especially if there is more than one way to access the database.
You can reasonably restrict the database so that the data always makes sense. A database will support multiple applications using the same data, so some restrictions make sense.
I think the only real cost in doing so would be time. Such restrictions aren't a big deal unless you are doing something crazy. And you can change the rules later if needed (although some changes are obviously harder than others).
The first ideal: have a "gatekeeper" so that your data's consistency does not depend upon each developer applying the same rules. Simple validation such as range validation may reasonably be implemented in the DB. If it changes, at least you have somewhere to put it.
The trouble is that "business rules" tend to get much more complex. It can be useful to offload that processing to the application tier, where OO languages can be better at managing complex logic.
The trick then is to structure the app tier so that the gatekeeper is clear and unduplicated.
In a small organisation (no DBA ergo, small?) I would tend to put the business rules where you have strong development expertise.
This does not exclude doing initial validation in higher levels, for example you might validate all the way up in the UI to help the user get it right, but you don't depend upon that initial validation - you still have the gatekeeper.
If your percentage is always "part divided by whole" (and you don't save the part and whole values elsewhere), then checking its value against [0-100] is appropriate at the DB level. Additional constraints should be applied at other levels.
If your percentage means some kind of growth, then it may have any kind of value and should not be checked at the DB level.
It is a case-by-case situation. Usually you should check at the DB level only those constraints which can never change or which have natural limits (like the first example).
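For the original percentage column, a natural-limit check like that is a one-liner at the database level. Sketched here with Python and SQLite; the table name is invented, and in databases that support named constraints the rule is easy to drop or replace later if "any positive value" becomes acceptable.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE metrics (
            name TEXT NOT NULL,
            pct  REAL NOT NULL CHECK (pct BETWEEN 0 AND 100)
        )""")

    conn.execute("INSERT INTO metrics VALUES ('completion', 87.5)")     # accepted
    try:
        conn.execute("INSERT INTO metrics VALUES ('completion', 150)")  # rejected
    except sqlite3.IntegrityError as exc:
        print("out of range:", exc)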
Richard is right: the question is subjective the way it has been asked here.
Another take is: what are the schools of thought on this? Do they vary by sector or technology?
I've been doing Ruby on Rails for a bit now, and there, even relationships between records (one-to-many, etc.) are NOT enforced at the DB level, not to mention cascade deletes and all that stuff. Neither is any kind of limit aside from basic data types, which allow the DB to do its work. Your percentage thing would not be handled at the DB level but rather at the data model level.
So I think that one of the trends we're seeing lately is to give more power to the app level. You MUST check the data coming into your server (so somewhere in the presentation level), you MIGHT check it on the client, and you MIGHT check it again in the business layer of your app. Why would you want to check it again at the database level?
However: the darndest things do happen, and sometimes the DB gets values that are "impossible" going by the business layer's code. So if you're managing, say, financial data, I'd say put in every single constraint possible at every level. What do people from different sectors do?

Resources