What's the benefit of using a lot of complex stored procedures - database

For typical 3-tiered application, I have seen that in many cases they use a lot of complex stored procedures in the database. I cannot quite get the benefit of this approach. In my personal understanding, there are following disadvantages on this approach:
Transactions become coarse.
Business logic goes into database.
Lots of computation is done in the database server, rather than in the application server. Meanwhile, the database still needs to do its original work: maintain data. The database server may become a bottleneck.
I can guess there may be 2 benefits of it:
Change the business logic without compile. But the SPs are much more harder to maintain and test than Java/C# code.
Reduce the number of DB connection. However, in the common case, the bottleneck of database is hard disk io rather than network io.
Could anyone please tell me the benefits of using a lot of stored procedures rather than letting the work be done in business logic layer?

Basically, the benefit is #2 of your problem list - if you do a lot of processing in your database backend, then it's handled there and doesn't depend on the application accessing the database.
Sure - if your application does all the right things in its business logic layer, things will be fine. But as soon as a second and a third application need to connect to your database, suddenly they too have to make sure to respect all the business rules etc. - or they might not.
Putting your business rules and business logic in the database ensures that no matter how an app, a script, a manager with Excel accesses your database, your business rules will be enforced and your data integrity will be protected.
That's the main reason to have stored procs instead of code-based BLL.
Also, using Views for read and Stored Procs for update/insert, the DBA can remove any direct permissions on the underlying tables. Your users do no longer need to have all the rights on the tables, and thus, your data in your tables is better protected from unadvertent or malicious changes.
Using a stored proc approach also gives you the ability to monitor and audit database access through the stored procs - no one will be able to claim they didn't alter that data - you can easily prove it.
So all in all: the more business critical your data, the more protection layer you want to build around it. That's what using stored procs is for - and they don't need to be complex, either - and most of those stored procs can be generated based on table structure using code generation, so it's not a big typing effort, either.

Don't fear the DB.
Let's also not confuse business logic with data logic which has its rightful place at the DB.
Good systems designers will encompass flexible business logic through data logic, i.e. abstract business rule definitions which can be driven by the (non)existence or in attributes of data rows.
Just FYI, the most successful and scalable "enterprise/commercial" software implementations with which I have worked put all projection queries into views and all data management either into DB procedures or triggers on staged tables.

Network between appServer and sqlServer is the bottle neck very often.
Stored procedures are needed when you need to do complex query.
For example you want collect some data about employee by his surname. Especially imagine, that data in DB looks like some kind of tree - you have 3 records about this employee in table A. You have 10 records in table B for each record in table A. You have 100 records in table C for each record in table B. And you want to get only special 5 records from table C about that employee. Without stored procedures you will get a lot of queries traffic between appServer and sqlServer, and a lot of code in appServer. With stored procedure which accepts employee surname, fetches those 5 records and returns them to appServer you 1) decrease traffic by hundreds times, 2) greatly simplify appServer code.

The life time of our data exceeds that of our applications. Also data gets shared between applications. So many applications will insert data into the database, many applications will retrieve data from it. The database is responsible for the completeness, integrity and correctness of the data. Therefore it needs to have the authority to enforce the business rules relating to the data.
Taking you specific points:
Transactions are Units Of Work. I
fail to see why implementing
transactions in stored procedures
should change their granularity.
Business logic which applies to the
data belongs with the data: that
maximises cohesion.
It is hard to write good SQL and to
learn to think in sets. Therefore
it may appear that the database is
the bottleneck. In fact, if we are
undertaking lots of work which
relates to the data the database is
probably the most efficient place to
do.
As for maintenance: if we are familiar with PL/SQL, T-SQL, etc maintenance is easier than it might appear from the outside. But I concede that tool support for things like refactoring lags behind that of other languages.

You listed one of the main ones putting business logic in the Db often gives the impression of making it easier to maintain.
Generally complex SP logic in the db allows for cheaper implementation of the actual implementation code, which may be beneficial if its a transitional application (say being ported from legacy code), its code which needs to be implemented in several languages (for instance to market on different platforms or devices) or because the problem is simpler to solve in the db.
One other reason for this is often there is a general "best practice" to encapsulate all access to the db in sps for security or performance reasons. Depending on your platform and what you are doing with it this may or may not be marginally true.

I don't think there are any. You are very correct that moving the BL to the database is bad, but not for everything. Try taking a look at Domain Driven Design. This is the antidote to massive numbers of SPROCs. I think you should be using your database as somewhere to store you business objects, nothing more.
However, SPROCs can be much more efficient on certain, simple functions. For instance, you might want to increase the salary to every employee in your database by a fixed percentage. This is quicker to do via a SPROC than getting all the employees from the db, updating them and then saving them back.

I worked in a project where every thing is literally done in database level. We wrote lot of stored procedures and did lot of business validation / logic in the database. Most of the times it became a big overhead for us to debug.
The advantages I felt may be
Take advantage of full DB features.
Database intense activities like lot of insertion/updation can be better done in DB level. Call an SP and let it do all the work instead of hitting DB several times.
New DB servers can accommodate complex operations so they no longer see this as a bottleneck. Oh yeah, we used Oracle.
Looking at it now, I think few things could have been better done at application level and lesser at DB level.

It depends almost entirely on the context.
Doing work on the server rather than on the clients is generally a bad idea as it makes your server less scalable. However, you have to balance this against the expected workload (if you know you will only have 100 users in a closed enironment, you may not need a scalable server) and network traffic costs (if you have to read a lot of data to apply calculations/processes to, then it can be cheaper/faster overall to run those calculations on the server and only send the results over the net).
Also, if you have custom client applications (as opposed to web browsers etc) it makes it very easy to push updates out to your clients, because you don't need to recompile and deploy the client code, you simply upgrade the database stored procedures.
Of course, using stored procedures rather than executing dynamically compiled SQL statements can be more efficient (it's precompiled, and the code doesn't need to be uploaded to the server) and aids encapsulation to give the database better integrity/security. But by the sound of it, you're talking about masses of busines logic, not simple efficiency and security measures.
As with most things, a sensible compromise/balance is needed. Stored Procedures should be used enough to enhance efficiency and security, but you don't want your server to become unscalable.

"there are following disadvantages on this approach:
...
Business logic goes into database."
Insofar as by "busines logic" you mean "enforcement of business rules", the DBMS is EXACTLY where "business logic" belongs.

Related

business logic in database layer

I hate to ask the classic question of "business logic in database vs code" again, but I need some concrete reasons to convince an older team of developers that business logic in code is better, because it's more maintainable, above all else. I used to have a lot of business logic in the DB, because I believed it was the single point of access. Maintenance is easy, if I was the only one doing the changing it. In my experience, the problems came when the projects got larger and complicated. Source Control for DB Stored Procs are not so advanced as the ones for newer IDEs, nor are the editors. Business logic in code can scale much better than in the DB, is what I've found in my recent experience.
So, just searching around stackoverflow, I found quite the opposite philosophy from its esteemed members:
https://stackoverflow.com/search?q=business+logic+in+database
I know there is no absolute for any situation, but for a given asp.net solution, which will use either sql server or oracle, for a not a particularly high traffic site, why would I put the logic in the DB?
Depends on what you call business.
The database should do what is expected.
If the consumers and providers of data expect the database to make certain guarantees, then it needs to be done in the database.
Some people don't use referential integrity in their databases and expect the other parts of the system to manage that. Some people access tables in the database directly.
I feel that from a systems and component perspective, the database is like any other service or class/object. It needs to protect its perimeter, hide its implementation details and provide guarantees of integrity, from low-level integrity up to a certain level, which may be considered "business".
Good ways to do this are referential integrity, stored procedures, triggers (where necessary), views, hiding base tables, etc., etc.
Database does data things, why weigh down something that is already getting hit pretty hard to give you data. It's a performance thing and a code thing. It's MUCH easier to maintain business logic code than to store it all in the database. Sprocs, Views and Functions can only go so far until you have Views of Views of Views with sprocs to fill that mess in. With business logic you separate your worries. If you have a bug that's causing something to be calculated wrong it's easier to check the business logic code than go into the DB and see if someone messed up something in a Stored Procedure. This is highly opinionated and in some cases it's OK to put some logic in the database but my thoughts on this are it's a database not a logicbase, put things where they belong.
P.S: Might be catchin some heat for this post, it's highly opinionated and other than performance numbers there's no real evidence for either and it becomes a case of what you're working with.
EDIT: Something that Cade mentioned that I forgot. Refrential integrity. By all means please have correct data integrity in your DB, no orphaned records ON DELETE CASCADE's, checks and whatnot.
I have faced with database logic on one of huge projects. This was caused by the decision of main manager who was the DBA specialist. He said that the application should be leightweight, it should know nothing about database scheme, joined tables, etc, and anyway stored Procs executes much faster than the transaction scopes and queries from client.
At the other side, we had too much bugs with database object mappings (stored prod or view based on view based on other view etc). It was unreachable to understand what is happening with our data because of each button clicked called a huge stored proc with 70-90-120 parameters and updated several (10-15) tables. We had no ability to query simple select request so we had to compile a view or stored Proc and class in code for this just for one simple join :-( of course when the table or view definition changes you should recompile all other dB objects based on edited object elsewhere you will get runtime Exception.
So I think that logic in database is a horrible way. Of course you can store some pieces of code in stored procs if needed by performance or security issues, but you shoul not develop everything in the Database) the logic should be flexible, testable and maintenable, and you can not reach this points using database for storing logic)

Should business rules be enforced in both the application tier and the database tier, or just one of the two?

I have been enforcing business rules in both my application tier (models) and my database tier (stored procedures who raise errors).
I've been duplicating my validations in both places for a few reasons:
If the conditions change between
when they are checked in the
application code and when they are
checked in the database, the
business rule checks in the database
will save the day. The database
also allows me to lock various
records in a simpler manner than in
my application code, so it seems
natural to do so here.
If we have
to do some batch data
insertions/updates to the database directly, if I route
all these operations through my
stored procedures/functions which
are doing the business rule
validations, there's no chance of me
putting in bad data even though I lack the protections that I would get if I was doing single-input through the application.
While
enforcing these things ONLY in the
database would have the same effect
on the actual data, it seems
improper to just throw data at the
database before first making a good
effort to validate that it conforms
to constraints and business rules.
What's the right balance?
You need to enforce at the data tier to ensure data integrity. That's your last line of defense, and that's the DBs job, to help enforce its world view of the data.
That said, throwing junk data against the DB for validation is a coarse technique. Typically the errors are designed to be human readable rather than machine readable, so its inefficient for the program to process the error from the DB and make heads or tails out of it.
Stored Procedures are a different matter. Back in the day, Stored Procedures were The Way to handle business rules on the data tiers, etc.
But today, with the modern application server environments, they have become a, in general, better place to put this logic. They offer multiple ways to access and expose the data (the web, web services, remote protocols, APIs, etc). Also, if your rules are CPU heavy (arguably most aren't) it's easier to scale app servers than DB servers.
The large array of features within the app servers give them a flexibility beyond what the DB servers can do, and thus much of what was once pushed back in to the DBs is being pulled out with the DB servers being relegated to "dumb persistence".
That said, there are certainly performance advantages using Stored Procs and such, but now that's a tuning thing where the question becomes "is it worth losing the app server capability for the gain we get by putting it in to the DB server".
And by app server, I'm not simply talking Java, but .NET and even PHP etc.
If the rule must be enforced at all times no matter where the data came from or how it was updated, the database is where it needs to be. Remember databases are affected by direct querying to make changes that affect many records or to do something the application would not normally do. These are things like fixing a group of records when a customer is bought out by another customer and they want to change all the historical data, the application of new tax rates to orders not yet processed, the fixing of a some bad data inputs. They are also affected sometimes by other applications which do not use your data layer. They may also be affected by imports run through ETL programs which also cannot use your data layer. So if the rule must in all cases be followed, it must be in the database.
If the rule is only for special cases concerning how this particular input page works, then it needs to be in the application. So if a sales manager has only specific things he can do from his user interface, these things can be specified in the application.
Somethings it is helpful to do in both places. For instance, it is silly to allow a user to put a non-date in an input box that will relate to a date field. The datatype in the database should still be a datetime datatype, but it is best to check some of this stuff before you send.
Your business logic can sit in either location, but should not be in both. The logic should NOT be duplicated because it's easy to make a mistake trying to keep both in sync. If you put it in the model you'll want all data access to go through your models, including batch updates.
There will be trade-offs to putting it in the database vs the application models (here's a few of the top of my head):
Databases can be harder to maintain and update than applications
It's easier to distribute load if it's in the application tier
Multiple, disparate dbs may require splitting business rules (which may not be possible)

Should Databases be used just for persistence

A lot of web applications having a 3 tier architecture are doing all the processing in the app server and use the database for persistence just to have database independence. After paying a huge amount for a database, doing all the processing including batch at the app server and not using the power of the database seems to be a waste. I have a difficulty in convincing people that we need to use best of both worlds.
What "power" of the database are you not using in a 3-tier archiecture? Presumably we exploit SQL to the full, and all the data management, paging, caching, indexing, query optimisation and locking capabilities.
I'd guess that the argument is where what we might call "business logic" should be implemented. In the app server or in database stored procedure.
I see two reasons for putting it in the app server:
1). Scalability. It's comparatively hard to add more datbase engines if the DB gets too busy. Partitioning data across multiple databases is really tricky. So instead pull the business logic out to the app server tier. Now we can have many app server instances all doing business logic.
2). Maintainability. In principle, Stored Procedure code can be well-written, modularised and resuable. In practice it seems much easier to write maintainable code in an OO language such as C# or Java. For some reason re-use in Stored Procedures seems to happen by cut and paste, and so over time the business logic becomes hard to maintain. I would concede that with discipline this need not happen, but discipline seems to be in short supply right now.
We do need to be careful to truly exploit the database query capabilities to the full, for example avoiding pulling large amounts of data across to the app server tier.
It depends on your application. You should set things up so your database does things databases are good for. An eight-table join across tens of millions of records is not something you're going to want to handle in your application tier. Nor is performing aggregate operations on millions of rows to emit little pieces of summary information.
On the other hand, if you're just doing a lot of CRUD, you're not losing much by treating that large expensive database as a dumb repository. But simple data models that lend themselves to application-focused "processing" sometimes end up leading you down the road to creeping unforeseen inefficiencies. Design knots. You find yourself processing recordsets in the application tier. Looking things up in ways that begin to approximate SQL joins. Eventually you painfully refactor these things back to the database tier where they run orders of magnitude more efficiently...
So, it depends.
No. They should be used for business rules enforcement as well.
Alas the DBMS big dogs are either not competent enough or not willing to support this, making this ideal impossible, and keeping their customers hostage to their major cash cows.
I've seen one application designed (by a pretty smart guy) with tables of the form:
id | one or two other indexed columns | big_chunk_of_serialised_data
Access to that in the application is easy: there are methods that will load one (or a set) of objects, deserialising it as necessary. And there are methods that will serialise an object into the database.
But as expected (but only in hindsight, sadly), there are so many cases where we want to query the DB in some way outside that application! This is worked around is various ways: an ad-hoc query interface in the app (which adds several layers of indirection to getting the data); reuse of some parts of the app code; hand-written deserialisation code (sometimes in other languages); and simply having to do without any fields that are in the deserialised chunk.
I can readily imagine the same thing occurring for almost any app: it's just handy to be able to access your data. Consequently I think I'd be pretty averse to storing serialised data in a real DB -- with possible exceptions where the saving outweighs the increase in complexity (an example being storing an array of 32-bit ints).

Are stored procedures required for large data sets?

I've just started my first development job for a reasonably sized company that has to manage a lot of data. An average database is 6gb (from what I've seen so far). One of the jobs is reporting. How it's done currently is -
Data is replicated and transferred onto a data warehouse. From there, all the data required for a particular report is gathered (thousands of rows and lots of tables) and aggregated to a reports database in the warehouse. This is all done with stored procedures.
When a report is requested, a stored procedure is invoked which copies the data onto a reports database which PHP reads from to display the data.
I'm not a big fan of stored procs at all. But the people I've spoken to insist that stored procedures are the only option, as queries directly against the data via a programming language are incredibly slow (think 30 mins?). Security is also a concern.
So my question is - are stored procedures required when you have a very large data set? Do queries really take that long on such a large amount of data or is there a problem with either the DB servers or how the data is arranged (and indexed?). I've got a feeling that something is wrong.
The reasoning behind using a stored procedure is that the execution plan that is created in order to execute your procedure is cached by SQL Server in an area of memory known as the Plan Cache. When the procedure is then subsequently re-run at a later time, the execution plan has the possibility of being re-used.
A stored procedure will not run any faster than the same query, executed as a batch of T-SQL. It is the execution plans re-use that result in a performance improvement. The query cost will be the same for the actual T-SQL.
Offloading data to a reporting database is a typical pursuit however you may need to review your indexing strategy on the reporting database as it will likely need to be quite different from that of your OLTP platform for example.
You may also wish to consider using SQL Server Analysis Services in order to service your reporting requirements as it sounds like your reports contain lots of data aggregations. Storing and processing data for the purpose of fast counts and analytics is exactly what SSAS is all about. It sounds like it is time for your business to look as building a data warehouse.
I hope this helps but please feel free to request further details.
Cheers, John
In the context in which you are operating - large corporate database accessed in several places - it is virtually always best to place as much business logic inside the database as is possible.
In this case your immediate performance benefits are :
Firstly because if the the SP involves any processing beyond a simple select the processing of the data within the database can be orders of magnitude faster than sending rows across the network to your program for handling there.
You do acquire some benefits in that the SP is stored compiled. This is usually marginal compared to 1. if processing large volumes
However, and in my mind often more important than performance, is the fact that with corporate databases encapsulating the logic inside the database itself provides major management and maintenance benefits:-
Data structures can be abstracted away from program logic, allowing database structures to change without requiring changes to programs accessing the data. Anyone who has spent hours grep'ing a corporate codebase for SQL using [mytable] before making a simple database change will appreciate this.
SPs can provide a security layer, although this can be overused and overrelied on.
You say this is your first job for a company with a database of this type, so you can be forgiven for not appreciating how a database-centric approach to handling the data is really essential in such environments. You are not alone either - in a recent podcast Jeff Attwood said he wasn't a fan of putting code into databases. This is a fine and valid opinion where you are dealing with a database serving a single application, but is 100% wrong with a database used across a company by several applications, where the best policy is to screw down the data with a full complement of constraints and use SPs liberally for access and update.
The reason for this is if you don't such databases always lose data integrity and accumulate crud. Sometimes it's virtually impossible to imagine how they do, but in any large corporate database (tens of millions of records) without sufficient constraints there will be badly formed records - at best these force a periodic clean-up of data (a task I regularly used to get dumped with as a junior programmer), or worse will cause applications to crash due to invalid inputs, or even worse not cause them to crash but deliver incorrect business information to the end-users. And if your end user is your finance director then that's your job on the line :-)
It seems to me that there is an additional step in there that, based on your description, appears unneccessary. Here is what I am referring to -
When a report is requested, a stored
procedure is invoked which gathers the
data into a format required for a
report, and forwarded to another
stored procedure which transforms the
data into a view, and forwards THAT
off to a PHP framework for display.
A sproc transforms the data for a report, then another sproc transforms this data into another format for front-end presentation - is the data ever used in the format in which it is in after the first sproc? If not, that stage seems unneccessary to me.
I'm assuming that your reports database is a data warehouse and that data is ETL'ed and stored within in a format for the purposes of reporting. Where I currently work, this is common practice.
As for your question regarding stored procedures, they allow you to centralize logic within the database and "encapsulate" security, the first of which would appear to be of benefit within your organisation, given the other sprocs that you have for data transformation. Stored procedures also have a stored execution plan which, under some circumstances, can provide some improvement to performance.
I found that stored procedures help with large data sets because they eliminate a ton of network traffic, which can be a huge performance bottleneck depending on how large the data set actually is.
When processing large numbers of rows, where indexes are available and the SQL is relatively tuned, the database engine performing set-based operations directly on the data - through SQL, say - will almost always outperform row-by-row processing (even on the same server) in a client tool. The data is not crossing any physical or logical boudaries to leave the database server processes or to leave the database server and go out across the network. Even performing RBAR (row by agonizing row) on the server will be faster than performing it in a client tool, if only a limited amount of data really needs to ever leave the server, because...
When you start to pull more data across networks, then the process will slow down and limiting the number of rows at each stage becomes the next optimization.
All of this really has nothing to do with stored procedures. Stored procedures (in SQL Server) no longer provide much performance advantages over batch SQL. Stored procedures do provide a large number of other benefits like modularization, encapsulation, security management, design by contract, version management. Performance, however is no longer an advantage.
Generally speaking stored procedures have a number of advantages over direct queries. I can't comment on your complete end to end process, however, SPs will probably perform faster. For a start a direct query needs to be compiled and an execution plan worked out every time you do a direct query - SPs don't.
There are other reasons, why you would want to use stored procedure - centralisation of logic, security etc.
The end to end process does look a little complicated but there may be good reasons for it simply due to the data volume - it might well be that if you run the reports on the main database, the queries are slowing down the rest of the system so much that you'll cause problems for the rest of the users.
Regarding the stored procedures, their main advantage in a scenario like this is that they are pre-compiled and the database has already worked out what it considers to be the optimal query plan. Especially with the data volumes you are talking about, this might well result in a very noticeable performance improvement.
And yes, depending on the complexity of the report, a query like this can take half an hour or longer...
This reporting solution seems to have been designed by people that think the database is the centre of the world. This is a common and valid view – however I don’t always hold to it.
When moving data between tables/databases, it can be a lot quicker to use stored procs, as the data does not need to travel between the database and the application. However in most cases, I would rather not use stored proc as they make development more complex, I am in the ORM camp myself. You can sometimes get great speedups by loading lots into RAM and processing it there, however that is a totally different way of coding and will not allow the reuse of the logic that is already in the stored procs. Sorry I think you are stack with stored proc while in that job.
Giving the amount of data being moved about, if using SQL server I would look at using SSIS or DTS – oracle will have something along the same line. SSIS will do the data transformations on many threads while taking care of a lot of the details for you.
Remember the design of software has more to do with the history of the software and the people working it in, than it has to do with the “right way of doing it”. Come back in 100 years and we may know how to write software, at present it is mostly a case of the blind leading the blind. Just like when the first bridges were build and a lot of them fell down, no one could tell you in advance witch bridge would keep standing and why.
Unlike autogenerated code from an ORM product, stored procs can be performance tuned. This is critical in large production environment. There are many ways to tweak performance that are not available when using an ORM. Also there are many many tasks performed by a large database which have nothing to do with the user interface and thus should not be run from code produced from there.
Stored procs are also required if you want to control rights so that the users can only do the procedures specified in the proc and nothing else. Otherwise, users can much more easily make unauthorized changes to the databases and commit fraud. This is one reason why database people who work with large business critical systems, do not allow any access except through stored procs.
If you are moving large amounts of data to other servers though, I would consider using DTS (if using SQL Server 2000) or SSIS. This may speed up your processes still further, but it will depend greatly on what you are doing and how.
The fact that sps may be faster in this case doesn't preclude that indexing may be wrong or statistics out of date, but generally dbas who manage large sets of data tend to be pretty on top of this stuff.
It is true the process you describe seems a bit convoluted, but without seeing the structure of what is happening and understanding the database and environment, I can't say if maybe this is the best process.
I can tell you that new employees who come in and want to change working stuff to fit their own personal predjudices tend to be taken less than seriously and then you will have little credibility when you do need to suggest a valid change. This is particularly true when your past experience is not with databases of the same size or type of processing. If you were an expert in large systems, you might be taken more seriously from the start, but, face it, you are not and thus your opinion is not likely to sway anybody until you have been there awhile and they have a measure of your real capabilities. Plus if you learn the system as it is and work with it as it is, you will be in a better position in six months or so to suggest improvements rather than changes.
I could perhaps come up with more, but a few points.
Assuming a modern DB, stored procedures probably won't actually be noticeably faster than normal procedures due to caching and the like.
The security benefits of Stored procedures are somewhat overrated.
Change is evil. Consistency is king.
I'd say #3 trumps all other concerns unless stored procedures are causing a legitimate problem.
The faster way for reporting is to just read all data into memory (64 bit OS required) and just walk the objects. This is of course limited to ram size (affordable 32 GB) and reports where you hit a large part of the db. No need to make the effort for small reports.
In the old days I could run a report querying over 8 million objects in 1.5 seconds. That was in about a gigabyte of ram on a 3GHz pentium 4. 64 bit should be about twice as slow, but that is compensated by faster processors.

Business Logic: Database or Application Layer

The age old question. Where should you put your business logic, in the database as stored procedures ( or packages ), or in the application/middle tier? And more importantly, Why?
Assume database independence is not a goal.
Maintainability of your code is always a big concern when determining where business logic should go.
Integrated debugging tools and more powerful IDEs generally make maintaining middle tier code easier than the same code in a stored procedure. Unless there is a real reason otherwise, you should start with business logic in your middle tier/application and not in stored procedures.
However when you come to reporting and data mining/searching, stored procedures can often a better choice. This is thanks to the power of the databases aggregation/filtering capabilities and the fact you are keeping processing very close the the source of the data. But this may not be what most consider classic business logic anyway.
Put enough of the business logic in the database to ensure that the data is consistent and correct.
But don't fear having to duplicate some of this logic at another level to enhance the user experience.
For very simple cases you can put your business logic in stored procedures. Usually even the simple cases tend to get complicated over time. Here are the reasons I don't put business logic in the database:
Putting the business logic in the database tightly couples it to the technical implementation of the database. Changing a table will cause you to change a lot of the stored procedures again causing a lot of extra bugs and extra testing.
Usually the UI depends on business logic for things like validation. Putting these things in the database will cause tight coupling between the database and the UI or in different cases duplicates the validation logic between those two.
It will get hard to have multiple applications work on the same database. Changes for one aplication will cause others to break. This can quickly turn into a maintenance nightmare. So it doesn't really scale.
More practically SQL isn't a good language to implement business logic in an understandable way. SQL is great for set based operations but it misses constructs for "programming in the large" it's hard to maintain big amounts of stored procedures. Modern OO languages are better suited and more flexible for this.
This doesn't mean you can't use stored procs and views. I think it sometimes is a good idea to put an extra layer of stored procedures and views between the tables and application(s) to decouple the two. That way you can change the layout of the database without changing external interface allowing you to refactor the database independently.
It's really up to you, as long as you're consistent.
One good reason to put it in your database layer: if you are fairly sure that your clients will never ever change their database back-end.
One good reason to put it in the application layer: if you are targeting multiple persistence technologies for your application.
You should also take into account core competencies. Are your developers mainly application layer developers, or are they primarily DBA-types?
While there is no one right answer - it depends on the project in question, I would recommend the approach advocated in "Domain Driven Design" by Eric Evans. In this approach the business logic is isolated in its own layer - the domain layer - which sits on top of the infrastructure layer(s) - which could include your database code, and below the application layer, which sends the requests into the domain layer for fulfilment and listens for confirmation of their completion, effectively driving the application.
This way, the business logic is captured in a model which can be discussed with those who understand the business aside from technical issues, and it should make it easier to isolate changes in the business rules themselves, the technical implementation issues, and the flow of the application which interacts with the business (domain) model.
I recommend reading the above book if you get the chance as it is quite good at explaining how this pure ideal can actually be approximated in the real world of real code and projects.
While there are certainly benefits to have the business logic on the application layer, I'd like to point out that the languages/frameworks seem to change more frequently then the databases.
Some of the systems that I support, went through the following UIs in the last 10-15 years: Oracle Forms/Visual Basic/Perl CGI/ ASP/Java Servlet. The one thing that didn't change - the relational database and stored procedures.
Database independence, which the questioner rules out as a consideration in this case, is the strongest argument for taking logic out of the database. The strongest argument for database independence is for the ability to sell software to companies with their own preference for a database backend.
Therefore, I'd consider the major argument for taking stored procedures out of the database to be a commercial one only, not a technical one. There may be technical reasons but there are also technical reasons for keeping it in there -- performance, integrity, and the ability to allow multiple applications to use the same API for example.
Whether or not to use SP's is also strongly influenced by the database that you are going to use. If you take database independence out of consideration then you're going to have very different experiences using T-SQL or using PL/SQL.
If you are using Oracle to develop an application then PL/SQL is an obvious choice as a language. It's is very tightly coupled with the data, continually improved in every relase, and any decent development tool is going to integratePL/SQL development with CVS or Subversion or somesuch.
Oracle's web-based Application Express development environment is even built 100% with PL/SQL.
The only thing that goes in a database is data.
Stored procedures are a maintenance nightmare. They aren't data and they don't belong in the database. The endless coordination between developers and DBA's is little more than organizational friction.
It's hard to keep good version control over stored procedures. The code outside the database is really easy to install -- when you think you've got the wrong version you just do an SVN UP (maybe an install) and your application's back to a known state. You have environment variables, directory links, and lots of environment control over the application.
You can, with simple PATH manipulations, have variant software available for different situations (training, test, QA, production, customer-specific enhancements, etc., etc.)
The code inside the database, however, is much harder to manage. There's no proper environment -- no "PATH", directory links or other environment variables -- to provide any usable control over what software's being used; you have a permanent, globally bound set of application software stuck in the database, married to the data.
Triggers are even worse. They're both a maintenance and a debugging nightmare. I don't see what problem they solve; they seem to be a way of working around badly-designed applications where someone couldn't be bothered to use the available classes (or function libraries) correctly.
While some folks find the performance argument compelling, I still haven't seen enough benchmark data to convince me that stored procedures are all that fast. Everyone has an anecdote, but no one has side-by-side code where the algorithms are more-or-less the same.
[In the examples I've seen, the old application was a poorly designed mess; when the stored procedures were written, the application was re-architected. I think the design change had more impact than the platform change.]
Anything that affects data integrity must be put at the database level. Other things besides the user interface often put data into, update or delete data from the database including imports, mass updates to change a pricing scheme, hot fixes, etc. If you need to ensure the rules are always followed, put the logic in defaults and triggers.
This is not to say that it isn't a good idea to also have it in the user interface (why bother sending information that the database won't accept), but to ignore these things in the database is to court disaster.
If you need database independence, you'll probably want to put all your business logic in the application layer since the standards available in the application tier are far more prevalent than those available to the database tier.
However, if database independence isn't the #1 factor and the skill-set of your team includes strong database skills, then putting the business logic in the database may prove to be the best solution. You can have your application folks doing application-specific things and your database folks making sure all the queries fly.
Of course, there's a big difference between being able to throw a SQL statement together and having "strong database skills" - if your team is closer to the former than the latter then put the logic in the application using one of the Hibernates of this world (or change your team!).
In my experience, in an Enterprise environment you'll have a single target database and skills in this area - in this case put everything you can in the database. If you're in the business of selling software, the database license costs will make database independence the biggest factor and you'll be implementing everything you can in the application tier.
Hope that helps.
It is nowadays possible to submit to subversion your stored proc code and to debug this code with good tool support.
If you use stored procs that combine sql statements you can reduce the amount of data traffic between the application and the database and reduce the number of database calls and gain big performance gains.
Once we started building in C# we made the decision not to use stored procs but now we are moving more and more code to stored procs. Especially batch processing.
However don't use triggers, use stored procs or better packages. Triggers do decrease maintainability.
Putting the code in the application layer will result in a DB independent application.
Sometimes it is better to use stored procedures for performance reasons.
It (as usual) depends on the application requirements.
The business logic should be placed in the application/middle tier as a first choice. That way it can be expressed in the form of a domain model, be placed in source control, be split or combined with related code (refactored), etc. It also gives you some database vendor independence.
Object Oriented languages are also much more expressive than stored procedures, allowing you to better and more easily describe in code what should be happening.
The only good reasons to place code in stored procedures are: if doing so produces a significant and necessary performance benefit or if the same business code needs to be executed by multiple platforms (Java, C#, PHP). Even when using multiple platforms, there are alternatives such as web-services that might be better suited to sharing functionality.
The answer in my experience lies somewhere on a spectrum of values usually determined by where your organization's skills lie.
The DBMS is a very powerful beast, which means proper or improper treatment will bring great benefit or great danger. Sadly, in too many organizations, primary attention is paid to programming staff; dbms skills, especially query development skills (as opposed to administrative) are neglected. Which is exacerbated by the fact that the ability to evaluate dbms skills is also probably missing.
And there are few programmers who sufficiently understand what they don't understand about databases.
Hence the popularity of suboptimal concepts, such as Active Records and LINQ (to throw in some obvious bias). But they are probably the best answer for such organizations.
However, note that highly scaled organizations tend to pay a lot more attention to effective use of the datastore.
There is no standalone right answer to this question. It depends on the requirements of your app, the preferences and skills of your developers, and the phase of the moon.
Business logic is to be put in the application tier and not in the database.
The reason is that a database stored procedure is always dependen on the database product you use. This break one of the advantages of the three tier model. You cannot easily change to an other database unless you provide an extra stored procedure for this database product.
on the other hand sometimes, it makes sense to put logic into a stored procedure for performance optimization.
What I want to say is business logic is to be put into the application tier, but there are exceptions (mainly performance reasons)
Bussiness application 'layers' are:
1. User Interface
This implements the business-user's view of h(is/er) job. It uses terms that the user is familiar with.
2. Processing
This is where calculations and data manipulation happen. Any business logic that involves changing data are implemented here.
3. Database
This could be: a normalized sequential database (the standard SQL-based DBMS's); an OO-database, storing objects wrapping the business-data; etc.
What goes Where
In getting to the above layers you need to do the necessary analysis and design. This would indicate where business logic would best be implemented: data-integrity rules and concurrency/real-time issues regarding data-updates would normally be implemented as close to the data as possible, same as would calculated fields, and this is a good pointer to stored-procedures/triggers, where data-integrity and transaction-control is absolutely necessary.
The business-rules involving the meaning and use of the data would for the most part be implemented in the Processing layer, but would also appear in the User-Interface as the user's work-flow - linking the various process in some sequence that reflects the user's job.
Imho. there are two conflicting concerns with deciding where business logic goes in a relational database-driven app:
maintainability
reliability
Re. maintainability:  To allow for efficient future development, business logic belongs in the part of your application that's easiest to debug and version control.
Re. reliability:  When there's significant risk of inconsistency, business logic belongs in the database layer.  Relational databases can be designed to check for constraints on data, e.g. not allowing NULL values in specific columns, etc.  When a scenario arises in your application design where some data needs to be in a specific state which is too complex to express with these simple constraints, it can make sense to use a trigger or something similar in the database layer.
Triggers are a pain to keep up to date, especially when your app is supposed to run on client systems you don't even have access too.  But that doesn't mean it's impossible to keep track of them or update them.  S.Lott's arguments in his answer that it's a pain and a hassle are completely valid, I'll second that and have been there too.  But if you keep those limitations in mind when you first design your data layer and refrain from using triggers and functions for anything but the absolute necessities it's manageable.
In our application, most business logic is contained in the application's model layer, e.g. an invoice knows how to initialize itself from a given sales order.  When a bunch of different things are modified sequentially for a complex set of changes like this, we roll them up in a transaction to maintain consistency, instead of opting for a stored procedure.  Calculation of totals etc. are all done with methods in the model layer.  But when we need to denormalize something for performance or insert data into a 'changes' table used by all clients to figure out which objects they need to expire in their session cache, we use triggers/functions in the database layer to insert a new row and send out a notification (Postgres listen/notify stuff) from this trigger.
After having our app in the field for about a year, used by hundreds of customers every day, the only thing I would change if we were to start from scratch would be to design our system for creating database functions (or stored procedures, however you want to call them) with versioning and updates to them in mind from the get-go.
Thankfully, we do have some system in place to keep track of schema versions, so we built something on top of that to take care of replacing database functions.  It would've saved us some time now if we'd considered the need to replace them from the beginning though.
Of course, everything changes when you step outside of the realm of RDBMS's into tuple-storage systems like Amazon SimpleDB and Google's BigTable.  But that's a different story :)
We put a lot of business logic in stored procedures - it's not ideal, but quite often it's a good balance between performance and reliability.
And we know where it is without having to search through acres of solutions and codebase!
Scalability is also very important factor for pusing business logic in middle or app layer than to database layer.It should be understood that DatabaseLayer is only for interacting with Database not manipulating which is returned to or from database.
I remember reading an article somewhere that pointed out that pretty well everything can be, at some level, part of the business logic, and so the question is meaningless.
I think the example given was the display of an invoice onscreen. The decision to mark an overdue one in red is a business decision...
It's a continuum. IMHO the biggest factor is speed. How can u get this sucker up and running as quickly as possible while still adhering to good tenants of programming such as maintainability, performance, scalability, security, reliability etc. Often times SQL is the most concise way to express something and also happens to be the most performant many times, except for string operations etc, but that's where your CLR Procs can help. My belief is to liberally sprinkle business logic around whereever you feel it is best for the undertaking at hand. If you have a bunch of application developers who shit their pants when looking at SQL then let them use their app logic. If you really want to create a high performance application with large datasets, put as much logic in the DB as you can. Fire your DBA's and give developers ultimate freedom over their Dev databases. There is no one answer or best tool for the job. You have multiple tools so become expert at all levels of the application and you'll soon find that you're spending a lot more time writing nice consise expressive SQL where warranted and using the application layer other times. To me, ultimately, reducing the number of lines of code is what leads to simplicity. We have just converted a sql rich application with a mere 2500 lines of app code and 1000 lines of SQL to a domain model which now has 15500 lines of app code and 2500 lines of SQL to achieve what the former sql rich app did. If you can justify a 6 fold increase in code as "simplified" then go right ahead.
This is a great question! I found this after I had already asked a simliar question, but this is more specific. It came up as a result of a design change decision that I wasn't involved in making.
Basically, what I was told was that If you have millions of rows of data in your database tables, then look at putting business logic into stored procedures and triggers. That is what we are doing right now, converting a java app into stored procedures for maintainability as the java code had become convoluted.
I found this article on: The Business Logic Wars The author also made the million rows in a table argument, which I found interesting. He also added business logic in javascript, which is client side and outside of the business logic tier. I hadn't thought about this before even though I've used javascript for validation for years, to along with server side validation.
My opinion is that you want the business logic in the application/middle tier as a rule of thumb, but don't discount cases where it makes sense to put it into the database.
One last point, there is another group where I'm working presently that is doing massive database work for research and the amount of data they are dealing with is immense. Still, for them they don't have any business logic in the database itself, but keep it in the application/middle tier. For their design, the application/middle tier was the correct place for it, so I wouldn't use the size of tables as the only design consideration.
Business logic is usually embodied by objects, and the various language constructs of encapsulation, inheritance, and and polymorphism. For example, if a banking application is passing around money, there may be a Money type that defines the business elements of what "money" is. This, opposed to using a primitive decimal to represent money. For this reason, well-designed OOP is where the "business logic" lives—not strictly in any layer.

Resources