Should data validation be done at the database level?

I am writing some stored procedures to create tables and add data. One of the fields is a column that indicates percentage. The value there should be 0-100. I started thinking, "where should the data validation for this be done? Where should data validation be done in general? Is it a case by case situation?"
It occurs to me that although today I've decided that 0-100 is a valid value for percentage, tomorrow, I might decide that any positive value is valid. So this could be a business rule, couldn't it? Should a business rule be implemented at the database level?
Just looking for guidance; we don't have a DBA here anymore.
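Since the question is concrete, here is what the database-level version of the rule would look like as a CHECK constraint; a minimal sketch with invented table and column names:

CREATE TABLE task_progress (
    task_id      INT          NOT NULL PRIMARY KEY,
    description  VARCHAR(200) NOT NULL,
    -- The 0-100 rule lives in a named constraint so it can be
    -- dropped and recreated if the business rule later changes.
    pct_complete DECIMAL(5,2) NOT NULL
        CONSTRAINT ck_task_progress_pct
        CHECK (pct_complete BETWEEN 0 AND 100)
);

Naming the constraint matters for exactly the scenario described: if "0-100" later becomes "any positive value", it is a single ALTER TABLE ... DROP CONSTRAINT / ADD CONSTRAINT rather than a table rebuild.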

Generally, I would do validations in multiple places:
Client side using validators on the aspx page
Server side validations in the code behind
I use database validations as a last resort because database trips are generally more expensive than the two validations discussed above.
I'm definitely not saying "don't put validations in the database", but I would say, don't let that be the only place you put validations.
If your data is consumed by multiple applications, then the most appropriate place would be the middle tier that is (should be) consumed by the multiple apps.
What you are asking in terms of business rules takes on a completely different dimension when you start thinking of your entire application in terms of business rules. If the question of validations is small enough, do it in individual places rather than build a centralized business rules system. If it is a rather large system, then you can look into a business rules engine for this.

If you have a good data access tier, it almost doesn't matter which approach you take.
That said, a database constraint is a lot harder to bypass (intentionally or accidentally) than an application-layer constraint.
In my work, I keep the business logic and constraints as close to the database as I can, ensuring that there are fewer potential points of failure. Different constraints are enforced at different layers, depending on the nature of the constraint, but everything that can be in the database, is in the database.

In general, I would think that the closer the validation is to the data, the better.
This way, if you ever need to rewrite a top level application or you have a second application doing data access, you don't have two copies of the (potentially different) code operating on the same data.

In a perfect world the only thing talking (updating, deleting, inserting) to your database would be your business API. In that perfect world, database-level constraints are a waste of time; your data would already have been validated and cross-checked in your business API.
In the real world we get cowboys taking shortcuts and other people writing directly to the database. In this case some constraints on the database are well worth the effort. However, if you have people not using your API to read/write, you have to consider where you went wrong in your API design.

It would depend on how you are interacting with the database, IMO. For example, if the only way to the database is through your application, then just do the validation there.
If you are going to allow other applications to update the database, then you may want to put the validation in the database, so that no matter how the data gets in there it gets validated at the lowest level.
But, validation should go on at various levels, to give the user the quickest opportunity possible to know that there is a problem.
You didn't mention which version of SQL Server, but you can look at user-defined datatypes and see if they would help you out, as they let you centralize the validation.
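For what it's worth, the old SQL Server way to centralize this with a user-defined datatype is a bound rule (deprecated in recent versions in favor of CHECK constraints, but it shows the idea; names here are illustrative):

-- Alias type plus a rule bound to it; every column declared
-- with the type inherits the 0-100 check.
CREATE TYPE percent_value FROM DECIMAL(5,2) NOT NULL;
GO
CREATE RULE percent_range AS @value BETWEEN 0 AND 100;
GO
EXEC sp_bindrule 'percent_range', 'percent_value';
GO
CREATE TABLE project (name VARCHAR(50), completion percent_value);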

I worked for a government agency, and we had a -ton- of business rules. I was one of the DBAs, and we implemented a large number of the business rules in the database; however, we had to keep them pretty simple to avoid Oracle's dreaded 'mutating table' error. Things get complicated very quickly if you want to use triggers to implement business rules which span several tables.
Our decision was to implement business rules in the database where we could because data was coming in through the application -and- through data migration scripts. Keeping the business rules only in the application wouldn't do much good when data needed to be migrated in to the new database.
I'd suggest implementing business rules in the application for the most part, unless you have data being modified elsewhere than in the application. It can be easier to maintain and modify your business rules that way.
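To make the "keep them simple" point concrete, the kind of single-row rule that stays safely on this side of the mutating-table error looks something like this in PL/SQL (table and rule invented for illustration):

CREATE OR REPLACE TRIGGER trg_order_status
BEFORE UPDATE OF status ON orders
FOR EACH ROW
BEGIN
  -- Only :OLD/:NEW of the current row are read; the trigger never
  -- queries the table that fired it, so no ORA-04091 (mutating table).
  IF :OLD.status = 'SHIPPED' AND :NEW.status = 'PENDING' THEN
    RAISE_APPLICATION_ERROR(-20001,
      'A shipped order cannot revert to pending');
  END IF;
END;
/

ORA-04091 appears as soon as a row-level trigger tries to query or modify the very table that fired it, which is why rules spanning several tables get complicated so quickly.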

One can make a case for:
In the database implement enough to ensure overall data integrity (e.g. in SO this could be every question/answer has at least one revision).
In the boundary between presentation and business logic layer ensure the data makes sense for the business logic (e.g. in SO ensuring markup doesn't contain dangerous tags)
But one can easily make a case for different places in the application layers for every case. Overall philosophy of what the database is there for can affect this (e.g. is the database part of the application as a whole, or is it a shared data repository for many clients).
The only thing I try to avoid is using triggers in the database; while they can solve legacy problems (if you cannot change the clients...), they are a case of the Action at a Distance anti-pattern.

I think basic data validation like you described makes sure that the data entered is correct. The applications should be validating data, but it doesn't hurt to have the data validated again on the database. Especially if there is more than one way to access the database.

You can reasonably restrict the database so that the data always makes sense. A database will support multiple applications using the same data, so some restrictions make sense.
I think the only real cost in doing so would be time. Such restrictions aren't a big deal unless you are doing something crazy, and you can change the rules later if needed (although some changes are obviously harder than others).

First ideal: have a "gatekeeper" so that your data's consistency does not depend upon each developer applying the same rules. Simple validation such as range validation may reasonably be implemented in the DB. If it changes, at least you have somewhere to put it.
Trouble is the "business rules" tend to get much more complex. It can be useful to offload processing to the application tier where OO languages can be better for managing complex logic.
The trick then is to structure the app tier so that the gatekeeper is clear and unduplicated.
In a small organisation (no DBA; ergo, small?) I would tend to put the business rules where you have strong development expertise.
This does not exclude doing initial validation in higher levels, for example you might validate all the way up in the UI to help the user get it right, but you don't depend upon that initial validation - you still have the gatekeeper.

If your percentage is always 'part divided by whole' (and you don't save the part and whole values elsewhere), then checking its value against [0-100] is appropriate at the db level. Additional constraints should be applied at other levels.
If your percentage means some kind of growth, then it may take any value and should not be checked at the db level.
It is a case-by-case situation. Usually you should check at the db level only constraints that can never change or that have natural limits (like the first example).

Richard is right: the question is subjective the way it has been asked here.
Another take is: what are the schools of thought on this? Do they vary by sector or technology?
I've been doing Ruby on Rails for a bit now, and there, even relationships between records (one-to-many etc.) are NOT respected on the DB level, not to mention cascade deleting and all that stuff. Neither are any kind of limits aside from basic data types, which allow the DB to do its work. Your percentage thing is not handled on the DB level but rather at the Data Model level.
So I think that one of the trends that we're seeing lately is to give more power to the app level. You MUST check the data coming in to your server (so somewhere in the presentation level) and you MIGHT check it on the client and you MIGHT check again in the business layer of your app. Why would you want to check it again at the database level?
However: the darndest things do happen and sometimes the DB gets values that are "impossible" reading the business-layer's code. So if you're managing, say, financial data, I'd say to put in every single constraint possible at every level. What do people from different sectors do?


Why repeat database constraints in models?

In a CakePHP application, for unique constraints that are accounted for in the database, what is the benefit of having the same validation checks in the model?
I understand the benefit of having JS validation, but I believe this model validation makes an extra trip to the DB. I am 100% sure that certain validations are made in the DB so the model validation would simply be redundant.
The only benefit I see is the app recognizing the mistake and adjusting the view for the user accordingly (repopulating the fields and showing an error message on the appropriate field, bettering the UX), but this could be achieved if there were a constraint naming convention so the app could understand what the problem was with the save (is there an existing method to do this now?)
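For reference, the database-side half of this is just a constraint like the one below (MySQL-flavored, since that's a common CakePHP backend; names invented). Violating it raises MySQL's duplicate-key error, ER_DUP_ENTRY (1062), which is exactly what the app would have to translate into a friendly message:

ALTER TABLE users
    ADD CONSTRAINT uq_users_email UNIQUE (email);

-- A second INSERT with the same email now fails at the database:
-- INSERT INTO users (email) VALUES ('taken@example.com');
-- ERROR 1062 (23000): Duplicate entry 'taken@example.com' for key 'uq_users_email'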
Quicker response times, less database load. The further out to the client you can do validation, i.e. JavaScript, the quicker it is. The major con is having to implement the same rules in multiple layers.
If database constraints are coded by one person and the rest of the code is coded by another, they really shouldn't trust each other completely. Check things at boundaries, especially if they represent organizational-people boundaries, e.g. user to application, one developer's module to another, or one corporate department to another.
Don't forget the matter of portability. Enforcing validation in the model keeps your application database-agnostic. You can program an application against a SQLite database, and then deploy to MySQL.. oh wait, you don't have that.. PostgreSQL? No? Oh, Oracle, fine.
Also, in production, if a database error happens on a typical controller action that saves and then redirects, your user will be stuck staring at a blank white page (since errors are off, there was no view to output, and the redirect never happened). Basically, database errors are turned off in production mode as they can give insight into the DB schema, whereas model validation errors remain enabled as they are user-friendly.
You have a point though, is it possible to capture these database errors and do something useful with them? Currently no, but it would be nice if CakePHP could dynamically translate them into failed model validation rules, preventing us from repeating ourselves. Different databases throw different looking errors, so each DBO datasource would need updated to support this before it could happen.
Just about any benefit that you might gain would probably be canceled out by the hassle of maintaining the constraints in duplicate. Unless you happen to have an easy mechanism for specifying them in a single location and keeping them in sync, I would recommend sticking with one location or the other.
Validation in CakePHP happens before the save or update query is sent to the database. Therefore it reduces the database load. You are wrong in your belief that the model validation makes an extra trip to the database. By default, validation occurs before save.
Ideally the design of the model should come first (based on user stories, use cases, etc.) with the database schema deriving from the model. Then the database implementation can either be generated from the model (explicitly tying both to a single source), or the database constraints can be designed based on relational integrity requirements (which are conceptually different from, and generally have a different granularity and vocabulary than, the model, although in many cases there is a mapping of some kind).
I generally have in mind only relational integrity requirements for database constraints. There are too many cases where the granularity and universal applicability of business constraints are too incongruent, transitory, and finer-grained than the database designer knows about; and they change more frequently over time and across application modules.
# le dorfier
Matters of data integrity and matters of business rules are one and the same thing (modulo the kind of necessarily procedural "business rules" stuff such as "when a new customer is entered, a mail with such-and-so content must be sent to that customer's email address").
The relational algebra is generally accepted to be "expressively complete" (I've even been told there is formal proof that RA plus TC (transitive closure) is Turing complete). Therefore, RA (plus TC) can express "everything that is wrong" (wrong in the sense that it violates some/any arbitrary "business rule").
So enforcing a rule that "the set of things that are wrong must remain empty" boils down to the dbms enforcing each possible conceivable (data-related, i.e. state-related) business rule for you, without any programmer having to write the first byte of code to achieve such business rule enforcement.
You bring up "business rules change frequently" as an argument. If business rules change, then what would be the scenario that allows for the quickest adaptation of the system to such a business rule : only having to change the corresponding expression of the RA that enforces the constraint/business rule, or having to go find out where in the application code the rule is enforced and having to change all that ?
# bradley harris.
Correct. I'd vote you up if voting was available to me. But you forget to mention that since one can never be really certain that some database will never be needed by some other app, the only sensible place to do business rules enforcement is inside the DBMS.

Business Logic in Database versus Code? [closed]

As a software engineer, I have a strong bias towards writing business logic in the application layer, while typically relying on the database for little more than CRUD (Create, Retrieve, Update, and Delete) operations. On the other hand, I have run across applications (typically older ones) where a large amount of the business logic was written in stored procedures, so there are people out there who prefer to write business logic in the database layer.
For the people that have and/or enjoy written/writing business logic in a stored procedure, what were/are your reasons for using this method?
I try to seriously limit my business logic in the DB to only procs that have to do a lot of querying and updating to perform a single application operation. Some may argue that even that should be in the app, but I like to keep the IO down if I can.
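The kind of proc I mean bundles several statements behind one call, so the application makes a single round trip; a T-SQL sketch with an invented schema:

CREATE PROCEDURE usp_ship_order @order_id INT
AS
BEGIN
    SET NOCOUNT ON;
    BEGIN TRANSACTION;

    -- Decrement stock for every line on the order in one
    -- set-based statement rather than row-by-row calls.
    UPDATE p
    SET p.qty_on_hand = p.qty_on_hand - ol.qty
    FROM products p
    JOIN order_lines ol ON ol.product_id = p.product_id
    WHERE ol.order_id = @order_id;

    UPDATE orders SET status = 'SHIPPED' WHERE order_id = @order_id;

    COMMIT TRANSACTION;
END;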
Databases are great for CRUD but if they get bloated with logic:
It becomes confusing where the logic is,
Typically databases are a silo and do not scale horizontally nearly as well as the app servers,
T-SQL/PL/SQL is hard to read and procedural in nature,
You forfeit all of the benefits of OOAD.
To the maximum extent possible, keep your business logic in the environment that is the most testable and debuggable. There are some valid reasons for storing business logic in the database in other people's existing answers, but they are almost always far outweighed by this.
Limiting the business logic to the application layer is short-sighted at best. Experienced professional database designers rarely allow it on their systems. Databases need to have constraints and triggers and stored procs to help define how the data from any source will go into them.
If the database is to maintain its integrity and to ensure that all sources of new data or data changes follow the rules, the database is the place to put the required logic. Putting it in the application layer is a data nightmare waiting to happen. Databases do not get information just from one application. Business logic in the application is often unintentionally bypassed by imports (suppose you got a new customer who wanted their old historical data imported to your system, or a large number of target records; no one is going to enter a million possible targets through the interface, it will happen in an import). It is also bypassed by changes made through the query window to fix one-time issues (things like increasing the price of all products by 10%). If you have application-layer logic that should have been applied to the data change, it won't be. Now it's OK to put it in the application layer as well, no sense sending bad data to the database and wasting network bandwidth, but failing to put it in the database will sooner or later cause data problems.
Another reason to keep all of this in the database has to do with the possibility of users committing fraud. If you put all your logic in the application layer, then you must grant the users access directly to the tables. If you encapsulate all your logic in stored procs, they can be limited to doing only what the stored procs allow and not anything else. I would not consider allowing any kind of direct access by users to a database that stores financial records or personal information (such as health records), just as I would not allow anyone except a couple of DBAs to directly access the production records in any way, shape, or form. More fraud is committed than many developers realize, and almost none of them consider the possibility in their design.
If you need to import a large amount of data, going through a data access layer could slow the import to a crawl, because it doesn't take advantage of the set-based operations that databases are designed to handle.
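The price-change example above is the classic case; done in the database it is one set-based statement, and any constraints still apply to every row it touches (table name invented):

-- One statement over the whole set, instead of a million
-- single-row round trips through a data access layer.
UPDATE products
SET unit_price = unit_price * 1.10
WHERE category = 'hardware';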
Your usage of the term "business logic" is rather vague.
It can be interpreted to mean to include the enforcement of constraints on the data (aka 'business rules'). Enforcement of these unequivocally belongs in the dbms, period.
It can also be interpreted to mean to include things like "if a new customer arrives, then within a week we send him a welcome letter." Trying to push stuff like this in the data layer is probably a big mistake. In such cases, the driver for "create a new welcome letter" should probably be the application that also triggers the new customer row insertion. Imagine every new database row insertion triggering a new welcome letter, and then suddenly we take over another company and we must integrate that company's customers in our own database ... Ouch.
We do a lot of processing in the DB tier, where appropriate. There are a lot of operations where you wouldn't want to pull back large datasets to the app tier to do analysis on them. It's also an easier deployment for us -- a single point vs. updating applications at all install points. But a lot depends on your application and what it does; there's no single good answer here.
On a couple of occasions I have put 'logic' in sprocs because the CRUD might be happening in more than one place. By 'logic' I would have to say it is not really business logic but more 'integrity logic'. It might be the same - some cleanup might be necessary if something gets deleted or updated in a certain way, and if that delete or update could happen from more than one tool with different code-bases it made sense to put it in the proc they all used.
In addition, sometimes the 'business logic line' is pretty blurry. Take reports for example - they may rely on stored procedures or views that encapsulate 'smarts' about what the schema means to the business. How often have you seen CASE statements and the like that 'do things' based on column values or other criteria? Could be construed as business logic and yet it probably does belong in the DB where it can be optimized, etc.
I'd say if 'business-logic' means application flow, user control, timed operations and generally 'doing-business-stuff' then it should be in the application layer. But if it means making sure that no matter how you dig around in the data, it always makes sense and is a sensible, non-self-conflicting whole, then the checks to enforce those rules go in the DB, absolutely, no questions. There are always many ways to push data into the DB and manipulate it once it's there. Not all those ways have 'business-logic' built into them. You will find that a SQL session into a DB through a DOS window on a support call at 3am is very liberal in what it allows, for example! If the logic isn't in the DB to make sure that ALL data changes make sense, you can bet for sure that the data will get very, very screwed up over time. And since a system is only as valuable as the data it holds, that makes for a much lower return on investment.
Two good reasons for putting the business logic in the database are:
It secures your logic and data against additional applications that may access the database but don't implement similar logic.
Database designs usually outlive the application layer, and it reduces the work necessary when you move to new technologies on the client side.
You often find business logic at the database layer because it can often be faster to make a change and deploy. I think often the best intentions are not to put the logic there but because of the ease of deployment it ends up there.
The primary reason I would put BL in stored procs in the past is that transactions were easier in the database.
If deployments are difficult for your app and you don't have an app-server, changing the BL in stored procedures is the most effective way to deploy a change.
I work for a financial type company where certain rules are applied by states, and these rules and their calculations are subject to change almost daily if not surely weekly. That being the case, it made more sense to move parts of the logic dealing with calculations to the database, where a change can be tested and applied without having to recompile and redistribute an application, which is impossible to do daily without disrupting business. The stored proc is tested, approved, applied and the end user is none the wiser.
With the move to web based applications, the reliance on moving the logic to the database is less but still present. Even web apps (depending on the language) must be compiled and published to the site which could cause downtime.
Sometimes business logic is too slow to run in the app layer. This is especially true on older systems where client power and bandwidth were more limited.
The main reason for using the database to do the work is that you have a single point of control. Often, app developers re-use or rewrite code fragments in different parts of the application. Even assuming that these all work exactly the same way (which is doubtful), when the business logic changes, the app needs to be reviewed, recoded, recompiled. Unless the parameters change, this would not be necessary where the business logic is stored only in the database.
My preference is to keep any complicated business logic out of the database, simply for maintenance purposes. If I get a call at 2 o'clock in the morning I would rather debug my application code than try to step through database scripts.
I'm on a team that builds and maintains a rather large financial system, and I see no way to put the logic into the application layer for actions that affect, or are constrained by, tens of thousands of records.
Besides the performance issue, should errors happen, rectifying a stored procedure is much faster than debugging the application, fixing, recompiling, and redeploying the code, with longer downtime.
Especially for older applications like the ones I work on (banking), where the business logic is huge, it's next to impossible to perform all of that business logic in the application layer. It's also a big performance hit to put this logic in the application layer: the number of fetches from the database goes up, resource utilization rises (more Java objects if it's done in a Java layer), and network traffic grows.

How much business logic should be in the database?

I'm developing a multi-user application which uses a (PostgreSQL) database to store its data. I wonder how much logic I should shift into the database?
e.g. When a user is going to save some data he just entered. Should the application just send the data to the database and the database decides if the data is valid? Or should the application be the smart part in the line and check if the data is OK?
In the last (commercial) project I worked on, the database was very dumb. No constraints, no views, etc.; everything was ruled by the application. I think that's very bad, because every time a certain table was accessed in the code, the same code to check whether the access was valid was repeated over and over again.
By shifting the logic into the database (with functions, triggers and constraints), I think we can save a lot of code in the application (and a lot of potential errors). But I'm afraid that putting too much of the business logic into the database will be a boomerang and someday it will be impossible to maintain.
Are there some real-life-approved guidelines to follow?
If you don't need massive distributed scalability (think companies with as much traffic as Amazon or Facebook etc.) then the relational database model is probably going to be sufficient for your performance needs. In which case, using a relational model with primary keys, foreign keys, constraints plus transactions makes it much easier to maintain data integrity, and reduces the amount of reconciliation that needs to be done (and trust me, as soon as you stop using any of these things, you will need reconciliation -- even with them you likely will due to bugs).
However, most validation code is much easier to write in languages like C#, Java, Python etc. than it is in languages like SQL because that's the type of thing they're designed for. This includes things like validating the formats of strings, dependencies between fields, etc. So I'd tend to do that in 'normal' code rather than the database.
Which means that the pragmatic solution (and certainly the one we use) is to write the code where it makes sense. Let the database handle data integrity because that's what it's good at, and let the 'normal' code handle data validity because that's what it's good at. You'll find a whole load of cases where this doesn't hold true, and where it makes sense to do things in different places, so just be pragmatic and weigh it up on a case by case basis.
Two cents: if you choose smart, remember not to go into "too smart" territory. The database should not deal with inconsistencies that are inappropriate for its level of understanding of the data.
Example: suppose you want to insert a valid (checked with a confirmation mail) email address in a field. The database could check if the email actually conforms to a given regular expression, but asking the database to check if the email address is valid (e.g. checking if the domain exists, sending the email and handling the response) it's a bit too much.
It's not meant to be a real-case example, just to illustrate that a smart database has limits to its smartness anyway: if a nonexistent email address gets into it, the data is still not valid, but to the database it's fine. As in the OSI model, everything should handle data at its level of understanding. Ethernet does not care whether it's transporting ICMP or TCP, or whether they are valid or not.
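In PostgreSQL terms (the asker's database), the "right-sized smart" check is a format constraint; deliverability stays at a higher layer. A sketch with an invented table and a deliberately loose pattern:

ALTER TABLE accounts
    ADD CONSTRAINT ck_accounts_email_format
    -- '~' is PostgreSQL's regex-match operator; this only checks
    -- shape (something@something.tld), not whether the address exists.
    CHECK (email ~ '^[^@[:space:]]+@[^@[:space:]]+\.[^@[:space:]]+$');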
I find that you need to validate in both the front end (either the GUI client, if you have one, or the server) and the database.
The database can easily assert for nulls, foreign key constraints etc. i.e. that the data is the right shape and linked up correctly. Transactions will enforce atomic writes of this. It's the database's responsibility to contain/return data in the right shape.
The server can perform more complex validations (e.g. does this look like an email, does this look like a postcode etc.) and then re-structure the input for insertion into the database (e.g. normalise it and create the appropriate entities for insertion into the tables).
Where you put the emphasis on validation depends to some degree on your application. e.g. it's useful to validate a (say) postcode in a GUI client and immediately provide feedback, but if your database is used by other applications (e.g. an application to bulkload addresses) then your layer surrounding the database needs to validate as well. Sometimes you end up providing validation in two different implementations (e.g. in the above, perhaps a Javascript front-end and a Java DAO backend). I've never found a good strategic solution to this.
Using the common features of relational databases, like primary and foreign key constraints, datatype declarations, etc. is good sense. If you're not going to use them then why bother with a relational db?
That said, all data should be validated for both type and business rules before it hits the db. Type validation is just defensive programming- assume users are out to hack you and then you'll get fewer unpleasant surprises. Business rules are what your application is all about. If you make them part of the structure of your db they become much more tightly bound to how your app works. If you put them in the application layer, it's easier to change them if business requirements change.
As a secondary consideration: clients often have less choice about which db they use (PostgreSQL, MySQL, Oracle, etc.) than which application language they have available. So if there is a good chance that your application will be installed on many different systems, your best bet is to make sure that your SQL is as standard as possible. This may well mean that constructing language-agnostic db features like triggers, etc. will be more trouble than putting that same logic in your application layer.
It depends on the application :)
For some applications the dumb database is the best. For example, Google's applications run on a big dumb database that can't even do joins because they need amazing scalability to be able to serve millions of users.
On the other hand, for some internal enterprise apps it can be beneficial to go with a very smart database, as those are often used by more than just one application and therefore you want a single point of control - think of an employees database.
That said if your new application is similar to the previous one, I would go with dumb database. In order to eliminate all the manual checks and database access code I would suggest using an ORM library such as Hibernate for Java. It will essentially automate your data access layer but will leave all the logic to your application.
Regarding validation it must be done on all levels. See other answers for more details.
One other item of consideration is deployment. We have an application where the deployment of database changes is actually much easier for remote installations than the actual code base is. For this reason, we've put a lot of application code in stored procedures and database functions.
Deployment is not your #1 consideration but it can play an important role in deciding between various choices.
This is as much a people question as it is a technology question. If your application is the only application that's ever going to manipulate the data (which is rarely the case, even if you think that's the plan), and you've only got application coders to hand, then by all means keep all the logic in the application.
On the other hand, if you've got DBAs who can handle it, or you know that more than one app will need to have its access validated, then managing data actually in the database makes a lot of sense.
Remember, though, that the best things for the database to be validating are a) the types of the data and b) relational constraints, which anything calling itself an RDBMS should have a handle on anyway.
If you've got any transactions in your application code, it's also worthwhile asking yourself whether they should be pushed to the database as a stored procedure so that it's impossible for them to be incorrectly reimplemented elsewhere.
I do know of shops where the only access allowed to the database is via stored procedures, so the DBAs have full responsibility for both the data storage semantics and access restrictions, and anyone else has to go through their gateways. There are obvious advantages to this, especially if more than one application has to have access to the data. Whether you go quite that far is up to you, but it's a perfectly valid approach.
While I believe that most data should be validated from the user interface (why send known bad stuff across the network, tying up resources?), I also believe it is irresponsible not to put constraints on the database, as the user interface is unlikely to be the only way that data ever gets into the database. Data also comes in from imports, other applications, quick script fixes for problems run at the query window, and mass updates (to update all prices by 10%, for example). I want all bad records rejected no matter what their source, and the database is the only place where you can be assured that will happen. To skip the database integrity checks because the user interface does it is to guarantee that you will most likely eventually have data integrity issues, and then all of your data becomes meaningless and useless.
e.g. When a user is going to save some data he just entered. Should the application just send the data to the database and the database decides if the data is valid? Or should the application be the smart part in the line and check if the data is OK?
It's better to have the validation in the front end as well as on the server side. That way, if the data is invalid, the user will be notified immediately; otherwise he will have to wait for the DB to respond after a postback.
Where security is concerned, it's better to validate at both ends, front end as well as DB. Or how can the DB trust all the data that is sent by the application ;-)
Validation should be done on the client side and the server side, and only once it is valid should it be stored.
The only work that the database should do is the querying logic. So updating rows, inserting rows, selects, and everything else should be handled by the server-side logic, since that's where the real meat of the application lives.
Structuring your insert properly will handle any foreign key constraints. Getting your business logic to call a sproc will insert data in the correct format. I don't really consider this validation, but some people might.
My decision: never use stored procedures in the database. Stored procedures are not portable.

Business Logic: Database or Application Layer

The age old question. Where should you put your business logic, in the database as stored procedures ( or packages ), or in the application/middle tier? And more importantly, Why?
Assume database independence is not a goal.
Maintainability of your code is always a big concern when determining where business logic should go.
Integrated debugging tools and more powerful IDEs generally make maintaining middle tier code easier than the same code in a stored procedure. Unless there is a real reason otherwise, you should start with business logic in your middle tier/application and not in stored procedures.
However, when you come to reporting and data mining/searching, stored procedures can often be a better choice. This is thanks to the power of the database's aggregation/filtering capabilities and the fact that you are keeping processing very close to the source of the data. But this may not be what most consider classic business logic anyway.
Put enough of the business logic in the database to ensure that the data is consistent and correct.
But don't fear having to duplicate some of this logic at another level to enhance the user experience.
For very simple cases you can put your business logic in stored procedures. Usually even the simple cases tend to get complicated over time. Here are the reasons I don't put business logic in the database:
Putting the business logic in the database tightly couples it to the technical implementation of the database. Changing a table will cause you to change a lot of the stored procedures, again causing a lot of extra bugs and extra testing.
Usually the UI depends on business logic for things like validation. Putting these things in the database will cause tight coupling between the database and the UI or in different cases duplicates the validation logic between those two.
It will get hard to have multiple applications work on the same database. Changes for one application will cause others to break. This can quickly turn into a maintenance nightmare. So it doesn't really scale.
More practically, SQL isn't a good language for implementing business logic in an understandable way. SQL is great for set-based operations, but it misses constructs for "programming in the large"; it's hard to maintain big amounts of stored procedures. Modern OO languages are better suited and more flexible for this.
This doesn't mean you can't use stored procs and views. I think it sometimes is a good idea to put an extra layer of stored procedures and views between the tables and application(s) to decouple the two. That way you can change the layout of the database without changing external interface allowing you to refactor the database independently.
It's really up to you, as long as you're consistent.
One good reason to put it in your database layer: if you are fairly sure that your clients will never ever change their database back-end.
One good reason to put it in the application layer: if you are targeting multiple persistence technologies for your application.
You should also take into account core competencies. Are your developers mainly application layer developers, or are they primarily DBA-types?
While there is no one right answer - it depends on the project in question, I would recommend the approach advocated in "Domain Driven Design" by Eric Evans. In this approach the business logic is isolated in its own layer - the domain layer - which sits on top of the infrastructure layer(s) - which could include your database code, and below the application layer, which sends the requests into the domain layer for fulfilment and listens for confirmation of their completion, effectively driving the application.
This way, the business logic is captured in a model which can be discussed with those who understand the business aside from technical issues, and it should make it easier to isolate changes in the business rules themselves, the technical implementation issues, and the flow of the application which interacts with the business (domain) model.
I recommend reading the above book if you get the chance as it is quite good at explaining how this pure ideal can actually be approximated in the real world of real code and projects.
While there are certainly benefits to having the business logic in the application layer, I'd like to point out that the languages/frameworks seem to change more frequently than the databases.
Some of the systems that I support, went through the following UIs in the last 10-15 years: Oracle Forms/Visual Basic/Perl CGI/ ASP/Java Servlet. The one thing that didn't change - the relational database and stored procedures.
Database independence, which the questioner rules out as a consideration in this case, is the strongest argument for taking logic out of the database. The strongest argument for database independence is for the ability to sell software to companies with their own preference for a database backend.
Therefore, I'd consider the major argument for taking stored procedures out of the database to be a commercial one only, not a technical one. There may be technical reasons but there are also technical reasons for keeping it in there -- performance, integrity, and the ability to allow multiple applications to use the same API for example.
Whether or not to use SP's is also strongly influenced by the database that you are going to use. If you take database independence out of consideration then you're going to have very different experiences using T-SQL or using PL/SQL.
If you are using Oracle to develop an application then PL/SQL is an obvious choice as a language. It is very tightly coupled with the data, continually improved in every release, and any decent development tool is going to integrate PL/SQL development with CVS or Subversion or somesuch.
Oracle's web-based Application Express development environment is even built 100% with PL/SQL.
The only thing that goes in a database is data.
Stored procedures are a maintenance nightmare. They aren't data and they don't belong in the database. The endless coordination between developers and DBA's is little more than organizational friction.
It's hard to keep good version control over stored procedures. The code outside the database is really easy to install -- when you think you've got the wrong version you just do an SVN UP (maybe an install) and your application's back to a known state. You have environment variables, directory links, and lots of environment control over the application.
You can, with simple PATH manipulations, have variant software available for different situations (training, test, QA, production, customer-specific enhancements, etc., etc.)
The code inside the database, however, is much harder to manage. There's no proper environment -- no "PATH", directory links or other environment variables -- to provide any usable control over what software's being used; you have a permanent, globally bound set of application software stuck in the database, married to the data.
Triggers are even worse. They're both a maintenance and a debugging nightmare. I don't see what problem they solve; they seem to be a way of working around badly-designed applications where someone couldn't be bothered to use the available classes (or function libraries) correctly.
While some folks find the performance argument compelling, I still haven't seen enough benchmark data to convince me that stored procedures are all that fast. Everyone has an anecdote, but no one has side-by-side code where the algorithms are more-or-less the same.
[In the examples I've seen, the old application was a poorly designed mess; when the stored procedures were written, the application was re-architected. I think the design change had more impact than the platform change.]
Anything that affects data integrity must be put at the database level. Other things besides the user interface often put data into, update or delete data from the database including imports, mass updates to change a pricing scheme, hot fixes, etc. If you need to ensure the rules are always followed, put the logic in defaults and triggers.
This is not to say that it isn't a good idea to also have it in the user interface (why bother sending information that the database won't accept), but to ignore these things in the database is to court disaster.
If you need database independence, you'll probably want to put all your business logic in the application layer since the standards available in the application tier are far more prevalent than those available to the database tier.
However, if database independence isn't the #1 factor and the skill-set of your team includes strong database skills, then putting the business logic in the database may prove to be the best solution. You can have your application folks doing application-specific things and your database folks making sure all the queries fly.
Of course, there's a big difference between being able to throw a SQL statement together and having "strong database skills" - if your team is closer to the former than the latter then put the logic in the application using one of the Hibernates of this world (or change your team!).
In my experience, in an Enterprise environment you'll have a single target database and skills in this area - in this case put everything you can in the database. If you're in the business of selling software, the database license costs will make database independence the biggest factor and you'll be implementing everything you can in the application tier.
Hope that helps.
It is nowadays possible to submit your stored proc code to Subversion and to debug this code with good tool support.
If you use stored procs that combine SQL statements you can reduce the amount of data traffic between the application and the database, reduce the number of database calls, and gain big performance improvements.
Once we started building in C# we made the decision not to use stored procs but now we are moving more and more code to stored procs. Especially batch processing.
However don't use triggers, use stored procs or better packages. Triggers do decrease maintainability.
Putting the code in the application layer will result in a DB independent application.
Sometimes it is better to use stored procedures for performance reasons.
It (as usual) depends on the application requirements.
The business logic should be placed in the application/middle tier as a first choice. That way it can be expressed in the form of a domain model, be placed in source control, be split or combined with related code (refactored), etc. It also gives you some database vendor independence.
Object Oriented languages are also much more expressive than stored procedures, allowing you to better and more easily describe in code what should be happening.
The only good reasons to place code in stored procedures are: if doing so produces a significant and necessary performance benefit or if the same business code needs to be executed by multiple platforms (Java, C#, PHP). Even when using multiple platforms, there are alternatives such as web-services that might be better suited to sharing functionality.
The answer in my experience lies somewhere on a spectrum of values usually determined by where your organization's skills lie.
The DBMS is a very powerful beast, which means proper or improper treatment will bring great benefit or great danger. Sadly, in too many organizations, primary attention is paid to programming staff; dbms skills, especially query development skills (as opposed to administrative) are neglected. Which is exacerbated by the fact that the ability to evaluate dbms skills is also probably missing.
And there are few programmers who sufficiently understand what they don't understand about databases.
Hence the popularity of suboptimal concepts, such as Active Records and LINQ (to throw in some obvious bias). But they are probably the best answer for such organizations.
However, note that highly scaled organizations tend to pay a lot more attention to effective use of the datastore.
There is no standalone right answer to this question. It depends on the requirements of your app, the preferences and skills of your developers, and the phase of the moon.
Business logic should be put in the application tier and not in the database.
The reason is that a database stored procedure is always dependent on the database product you use. This breaks one of the advantages of the three-tier model: you cannot easily change to another database unless you provide an extra stored procedure for that database product.
On the other hand, sometimes it makes sense to put logic into a stored procedure for performance optimization.
What I want to say is: business logic should be put into the application tier, but there are exceptions (mainly performance reasons).
Business application 'layers' are:
1. User Interface
This implements the business-user's view of h(is/er) job. It uses terms that the user is familiar with.
2. Processing
This is where calculations and data manipulation happen. Any business logic that involves changing data are implemented here.
3. Database
This could be: a normalized sequential database (the standard SQL-based DBMS's); an OO-database, storing objects wrapping the business-data; etc.
What goes Where
In getting to the above layers you need to do the necessary analysis and design. This would indicate where business logic would best be implemented: data-integrity rules and concurrency/real-time issues regarding data updates would normally be implemented as close to the data as possible, as would calculated fields; this is a good pointer to stored procedures/triggers, where data integrity and transaction control are absolutely necessary.
The business-rules involving the meaning and use of the data would for the most part be implemented in the Processing layer, but would also appear in the User-Interface as the user's work-flow - linking the various process in some sequence that reflects the user's job.
IMHO there are two conflicting concerns when deciding where business logic goes in a relational database-driven app:
maintainability
reliability
Re. maintainability:  To allow for efficient future development, business logic belongs in the part of your application that's easiest to debug and version control.
Re. reliability:  When there's significant risk of inconsistency, business logic belongs in the database layer.  Relational databases can be designed to check for constraints on data, e.g. not allowing NULL values in specific columns, etc.  When a scenario arises in your application design where some data needs to be in a specific state which is too complex to express with these simple constraints, it can make sense to use a trigger or something similar in the database layer.
Triggers are a pain to keep up to date, especially when your app is supposed to run on client systems you don't even have access to.  But that doesn't mean it's impossible to keep track of them or update them.  S.Lott's arguments in his answer that it's a pain and a hassle are completely valid, I'll second that and have been there too.  But if you keep those limitations in mind when you first design your data layer and refrain from using triggers and functions for anything but the absolute necessities it's manageable.
In our application, most business logic is contained in the application's model layer, e.g. an invoice knows how to initialize itself from a given sales order.  When a bunch of different things are modified sequentially for a complex set of changes like this, we roll them up in a transaction to maintain consistency, instead of opting for a stored procedure.  Calculation of totals etc. are all done with methods in the model layer.  But when we need to denormalize something for performance or insert data into a 'changes' table used by all clients to figure out which objects they need to expire in their session cache, we use triggers/functions in the database layer to insert a new row and send out a notification (Postgres listen/notify stuff) from this trigger.
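A minimal sketch of that trigger-plus-notification pattern in PL/pgSQL (names invented; the real schema will differ):

-- Trigger function: record the change and wake up listeners.
CREATE OR REPLACE FUNCTION log_change() RETURNS trigger AS $$
BEGIN
    INSERT INTO changes (table_name, row_id, changed_at)
    VALUES (TG_TABLE_NAME, NEW.id, now());
    -- Clients that ran LISTEN cache_expiry receive this payload.
    PERFORM pg_notify('cache_expiry', TG_TABLE_NAME || ':' || NEW.id);
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER invoices_log_change
AFTER INSERT OR UPDATE ON invoices
FOR EACH ROW EXECUTE PROCEDURE log_change();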
After having our app in the field for about a year, used by hundreds of customers every day, the only thing I would change if we were to start from scratch would be to design our system for creating database functions (or stored procedures, however you want to call them) with versioning and updates to them in mind from the get-go.
Thankfully, we do have some system in place to keep track of schema versions, so we built something on top of that to take care of replacing database functions.  It would've saved us some time now if we'd considered the need to replace them from the beginning though.
Of course, everything changes when you step outside of the realm of RDBMS's into tuple-storage systems like Amazon SimpleDB and Google's BigTable.  But that's a different story :)
We put a lot of business logic in stored procedures - it's not ideal, but quite often it's a good balance between performance and reliability.
And we know where it is without having to search through acres of solutions and codebase!
Scalability is also a very important factor for pushing business logic into the middle or app layer rather than the database layer. It should be understood that the database layer is only for interacting with the database, not for manipulating what is returned to or from it.
I remember reading an article somewhere that pointed out that pretty well everything can be, at some level, part of the business logic, and so the question is meaningless.
I think the example given was the display of an invoice onscreen. The decision to mark an overdue one in red is a business decision...
It's a continuum. IMHO the biggest factor is speed: how can you get this sucker up and running as quickly as possible while still adhering to good tenets of programming such as maintainability, performance, scalability, security, reliability, etc.? Often SQL is the most concise way to express something, and it also happens to be the most performant much of the time, except for string operations and such, but that's where your CLR procs can help.
My belief is to liberally sprinkle business logic around wherever you feel it is best for the undertaking at hand. If you have a bunch of application developers who shit their pants when looking at SQL then let them use their app logic. If you really want to create a high-performance application with large datasets, put as much logic in the DB as you can. Fire your DBAs and give developers ultimate freedom over their dev databases. There is no one answer or best tool for the job. You have multiple tools, so become expert at all levels of the application, and you'll soon find that you're spending a lot more time writing nice concise expressive SQL where warranted and using the application layer other times.
To me, ultimately, reducing the number of lines of code is what leads to simplicity. We have just converted a SQL-rich application with a mere 2500 lines of app code and 1000 lines of SQL to a domain model which now has 15500 lines of app code and 2500 lines of SQL to achieve what the former SQL-rich app did. If you can justify a six-fold increase in code as "simplified" then go right ahead.
This is a great question! I found this after I had already asked a similar question, but this is more specific. It came up as a result of a design change decision that I wasn't involved in making.
Basically, what I was told was that if you have millions of rows of data in your database tables, then look at putting business logic into stored procedures and triggers. That is what we are doing right now, converting a Java app into stored procedures for maintainability, as the Java code had become convoluted.
I found this article: The Business Logic Wars. The author also made the million-rows-in-a-table argument, which I found interesting. He also put business logic in JavaScript, which is client-side and outside of the business logic tier. I hadn't thought about this before, even though I've used JavaScript for validation for years, along with server-side validation.
My opinion is that you want the business logic in the application/middle tier as a rule of thumb, but don't discount cases where it makes sense to put it into the database.
One last point, there is another group where I'm working presently that is doing massive database work for research and the amount of data they are dealing with is immense. Still, for them they don't have any business logic in the database itself, but keep it in the application/middle tier. For their design, the application/middle tier was the correct place for it, so I wouldn't use the size of tables as the only design consideration.
Business logic is usually embodied by objects, and the various language constructs of encapsulation, inheritance, and polymorphism. For example, if a banking application is passing around money, there may be a Money type that defines the business elements of what "money" is. This, opposed to using a primitive decimal to represent money. For this reason, well-designed OOP is where the "business logic" lives—not strictly in any layer.

Preventing bad data input

Is it good practice to delegate data validation entirely to the database engine constraints?
Validating data from the application doesn't prevent invalid insertion from another software (possibly written in another language by another team). Using database constraints you reduce the points where you need to worry about invalid input data.
If you put validation both in database and application, maintenance becomes boring, because you have to update code for who knows how many applications, increasing the probability of human errors.
I just don't see this being done very much, looking at code from free software projects.
Validate at input time. Validate again before you put it in the database. And have database constraints to prevent bad input. And you can bet in spite of all that, bad data will still get into your database, so validate it again when you use it.
It seems like every day some web app gets hacked because they did all their validation in the form or worse, using Javascript, and people found a way to bypass it. You've got to guard against that.
Paranoid? Me? No, just experienced.
It's best to, where possible, have your validation rules specified in your database and use or write a framework that makes those rules bubble up into your front end. ASP.NET Dynamic Data helps with this and there are some commercial libraries out there that make it even easier.
This can be done both for simple input validation (like numbers or dates) and related data like that constrained by foreign keys.
In summary, the idea is to define the rules in one place (the database, most of the time) and have code in other layers that will enforce those rules.
The disadvantage to leaving the logic to the database is then you increase the load on that particular server. Web and application servers are comparatively easy to scale outward, but a database requires special techniques. As a general rule, it's a good idea to put as much of the computational logic into the application layer and keep the interaction with the database as simple as possible.
With that said, it is possible that your application may not need to worry about such heavy scalability issues. If you are certain that database server load will not be a problem for the foreseeable future, then go ahead and put the constraints on the database. You are quite correct that this improves the organization and simplicity of your system as a whole by keeping validation logic in a central location.
There are other concerns than just SQL injection with input. You should take the most defensive stance possible whenever accepting user input. For example, a user might be able to enter a link to an image into a textbox, which is actually a PHP script that runs something nasty.
If you design your application well, you should not have to laboriously check all input. For example, you could use a Forms API which takes care of most of the work for you, and a database layer which does much the same.
This is a good resource for basic checking of vulnerabilities:
http://ha.ckers.org/xss.html
It's far too late by the time the data gets to your database to provide meaningful validation for your users and applications. You don't want your database doing all the validation since that'll slow things down pretty good, and the database doesn't express the logic as clearly. Similarly, as you grow you'll be writing more application level transactions to complement your database transactions.
I would say it's potentially a bad practice, depending on what happens when the query fails. For example, if your database could throw an error that was intelligently handled by an application, then you might be ok.
On the other hand, if you don't put any validation in your app, you might not have any bad data, but you may have users thinking they entered stuff that doesn't get saved.
Implement as much data validation as you can at the database end without compromising other goals. For example, if speed is an issue, you may want to consider not using foreign keys, etc. Furthermore, some data validation can only be performed on the application side, e.g., ensuring that email addresses have valid domains.
Another disadvantage of doing data validation in the database is that often you don't validate the same way in every case. In fact, it often depends on application logic (user roles), and sometimes you might want to bypass validation altogether (cron jobs and maintenance scripts).
I've found that doing validation in the application, rather than in the database, works well. Of course then, all the interaction needs to go through your application. If you have other applications that work with your data, your application will need to support some sort of API (hopefully REST).
I don't think there is one right answer, it depends on your use.
If you are going to have a very heavily used system, with the potential that the database performance might become a bottleneck, then you might want to move the responsibility for validation to the front-end where it is easier to scale with multiple servers.
If you have multiple applications interacting with the database, then you might not want to replicate and maintain the validation rules across multiple applications, so then the database might be the better place.
You might want a slicker input screen that doesn't just hit the user with validation warnings when they try to save a record; maybe you want to validate a field after data has been entered and it loses focus, or even as the user types, changing the font colour as validation fails/passes.
Also related to constraints, is warnings of suspect data. In my application I have hard-constraints in the database (e.g. someone can't start a job before their date of birth), but then in the front-end have warnings for data that is possibly correct, but suspect (e.g. an eight year-old starting a job).
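The hard-constraint half of that example is a plain CHECK, assuming both dates live on the same row (names invented):

ALTER TABLE employment
    ADD CONSTRAINT ck_employment_start_after_birth
    CHECK (job_start_date > date_of_birth);

The soft warning (an eight-year-old starting a job is legal to the constraint above but suspect) has to live in the front end, because the database can only give a hard yes or no.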
