Is Microsoft Access a good stepping stone to learning real database management? [closed]

My sister is going to start taking classes to try to learn how to become a web developer. She's sent me the class lists for a couple of candidate schools for me to help guide her decision.
One of the schools mentions Microsoft Access as the primary tool used in the database classes including relational algebra, SQL, database management, etc.
I'm wondering - if you learn Microsoft Access will you be able to easily pick up another more socially-acceptable database technology later like MySQL, Postgres, etc? My experience with Access was not pleasant and I picked up a whole lot of bad practices when I played around with it during my schooling years.
Basically: Does Microsoft Access use standards-compliant SQL? Do you learn the necessary skills for other databases by knowing how Microsoft Access works?

Access, I would say, has a lot more peculiarities than 'actual' database software. That said, Access can easily be used as a front end for SQL databases; that's a built-in part of the program.
Let's assume the class is using databases built in Access. Then let's break it down into the parts of a database:
Tables
Access uses a simplified model for field types. Basically you have the typical number, text, date fields, etc. You can fix the number of decimal places, for instance, as you could in SQL. You won't see types written as varchar(x), though; you just pick a text field and set its Field Size property to "8", etc. However, like a real database, it will enforce the limits you've set. Access also supports OLE objects, but that quickly becomes a mess. An Access database is stored as a single file, and it can bloat and become incredibly large very quickly. So beyond storing address books, text-heavy tables, or linking to external sources via code, you have to be careful about how much information you store, simply because the file will get too big to use.
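For concreteness, here is a minimal SQL sketch (table and column names are invented for illustration) of what Access expresses through the table designer instead of DDL:

    -- Standard SQL DDL; in Access the same limit is set via the
    -- Field Size property rather than written as varchar(8).
    CREATE TABLE Contact (
        ContactID INTEGER PRIMARY KEY,
        PostCode  VARCHAR(8) NOT NULL  -- Access: Text field, Field Size = 8
    );

Both enforce the 8-character limit; the difference is only in how you declare it.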
Queries
Access implements a lot of things along the lines of SQL, but I'm not aware that it is fully SQL-compliant. I believe you can export your Access database into something a SQL server can use. In code, you interact with the database through DAO, ADO, or ADODB and the Jet or ACE engines (some of these are outdated, but they still work on older databases). Once you get to simply writing queries, though, many things are similar: the typical commands -- SELECT, FROM, WHERE, ORDER BY, GROUP BY, HAVING, etc. -- are all there and work as you'd see them work elsewhere. The peculiar things happen when you get into calculated expressions and complicated joins (Access does not implement some kinds of joins, though you will see arguably the most important ones, such as inner joins and unions). For instance, the behavior of DISTINCT in Access differs from other database engines, and you are limited in how you can use aggregate functions (SUM/MAX/MIN/AVG). In essence, Access works for a lot of tasks, but it is incredibly picky, and you will have to write queries just to work around problems that wouldn't exist in a real database.
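To make the "similar until it isn't" point concrete, here is a hedged sketch with invented table and column names. The first query is plain SQL and behaves the same way in Access and a server RDBMS; the comment shows one dialect difference you hit quickly:

    -- Portable aggregate query: works in Access, MySQL, Postgres, etc.
    SELECT CustomerID, COUNT(*) AS OrderCount
    FROM Orders
    GROUP BY CustomerID
    HAVING COUNT(*) > 5
    ORDER BY OrderCount DESC;

    -- Conditional expressions are one divergence point:
    --   Access:   SELECT IIf(Qty > 10, 'bulk', 'retail') FROM OrderLine;
    --   Standard: SELECT CASE WHEN Qty > 10 THEN 'bulk' ELSE 'retail' END FROM OrderLine;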
Forms/Reports
I think the key feature of Access is that it is much more approachable to users who are not computer experts. You can easily navigate the tables and drag and drop to create forms and reports. So even though it's officially not a database in my book, it can be very useful, particularly if few people will be using it and they strongly prefer ease of use and light setup over a more 'enterprise-level' solution. You don't need Crystal Reports or someone to write a lot of code to make an Access database produce results and let users add data as needed.
Why Access isn't a database
It's not meant to handle lots of concurrent connections. One person can hold the lock and there's no negotiating about it: if one person is editing certain parts of the database, it will lock all other users out, or at least limit them to read-only. And if you try to use Access with a lot of users, or send it many requests via code, it will break after roughly 10-20 concurrent connections. It's just not built for the kinds of things Oracle and MySQL are built for. It's meant for the 'everyman' computer user, if you will, but it has a lot of useful features programmers can exploit to make the user experience much better.
So will this be useful for you to learn more about?
I don't see how it would be a bad thing. It's an environment in which you can more easily see the relational algebra and understand how to organize your data appropriately. It's a similar argument to colleges teaching Java, C++, or Python: each has its merits. And since you can move straight from plain Access to Access as a front end for a SQL database (you load links to the tables), I'm sure you could teach a very good class with it.

MS Access is a good sandpit for building databases and learning the elementary design and structure of a database.
Access's SQL implementation is just about equivalent to SQL 1.x syntax. Again, Access is a great app for learning the interaction between queries, tables, and views.
Make sure she doesn't get used to the macros available in Access, as their structure doesn't translate to mainstream RDBMSs. The closest equivalent is stored procedures (sprocs) in a professional RDBMS, but sprocs have a thousandfold more utility and functionality than any Access macro could provide (a sketch follows below).
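For comparison, here is a minimal T-SQL sketch (procedure and table names invented) of the kind of multi-step, transactional logic a stored procedure can express and an Access macro can't:

    CREATE PROCEDURE dbo.ArchiveOldOrders
        @CutoffDate DATETIME
    AS
    BEGIN
        SET NOCOUNT ON;
        BEGIN TRANSACTION;
        -- copy old rows to the archive table...
        INSERT INTO dbo.OrderArchive (OrderID, CustomerID, OrderDate)
            SELECT OrderID, CustomerID, OrderDate
            FROM dbo.Orders
            WHERE OrderDate < @CutoffDate;
        -- ...then remove them from the live table, atomically
        DELETE FROM dbo.Orders WHERE OrderDate < @CutoffDate;
        COMMIT TRANSACTION;
    END;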
Have her play with MS Access to get a look and feel for a DBMS, but once she gets comfortable with database design, have her migrate to MS SQL Express or MySQL, or both. SQL Express is as close to the real thing as you can get without paying for MS SQL Standard; MySQL is good for LAMP web infrastructures.


Comparing Database Platforms [closed]

My employer has a database committee and we've been discussing different platforms. Thus far we've had people present on SQLite, MySQL, and PostgreSQL. We currently use the Microsoft stack, so we're all quite familiar with Microsoft SQL Server.
As a part of this comparison I thought it would be interesting to create a small reference application for each database platform to explore the details involved in working with it.
First: does that idea make sense, or does the comparison require going beyond the scope of a trivial sample application?
Second: I would imagine each reference application having a discrete but small set of requirements that cover many of the scenarios we run into on a regular basis. Here is what I have so far; what else can be added to the list while still keeping the application small enough to be built in a very limited timespan? (A minimal script exercising several of these appears after the list.)
Connectivity from the application layer
Tools for database administration
Process of creating a schema (small "s" schema: tables/views/functions and other objects)
Simple CRUD (Create, Retrieve, Update, Delete)
Transaction support
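Here is a minimal sketch (all names invented) of the sort of script I imagine each reference application exercising; note that transaction syntax varies slightly by engine (BEGIN TRANSACTION in SQL Server, START TRANSACTION in MySQL, BEGIN in PostgreSQL):

    -- Schema creation
    CREATE TABLE Customer (
        CustomerID INTEGER PRIMARY KEY,
        Name       VARCHAR(100) NOT NULL
    );

    -- Simple CRUD
    INSERT INTO Customer (CustomerID, Name) VALUES (1, 'Acme');     -- Create
    SELECT Name FROM Customer WHERE CustomerID = 1;                 -- Retrieve
    UPDATE Customer SET Name = 'Acme Ltd' WHERE CustomerID = 1;     -- Update

    -- Transaction support
    BEGIN TRANSACTION;
    DELETE FROM Customer WHERE CustomerID = 1;                      -- Delete
    ROLLBACK;                                                       -- undo it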
Third: has anyone gone through this process, what are your findings?
Does that idea make sense or does the comparison require going beyond the scope of a trivial sample application?
I don't think it's a good idea. Most of the things that will really affect you are long term database management issues, and how the database management system you choose can handle those things.
You could be tempted in the short term with things like "I found out in 3 seconds how to do this with XYZ database management system". Now, I'm not saying support is not important; quite the contrary. But finding an answer in google in 3 seconds means that you got an answer to a simple question. How quickly, if ever, can you find an answer to a challenging problem?
A short list (not exhaustive) of important things to consider are:
backup and recovery -- at both logical level and physical level
good support for functions (or stored procedures), triggers, various SQL query constructs
APIs that allow real extensibility -- these things can get you out of tough situations and allow you to solve problems in creative ways. You'd be surprised what can be accomplished with user-defined types and functions. How do the user-defined types interact with the indexing system?
SQL standard support -- doesn't trump everything else, but if support is lacking in a few areas, really consider why it is lacking, what the workarounds are, and what are the costs of those workarounds.
A powerful executor that offers a range of fundamental algorithms (e.g. hash join, merge join, etc.) and indexing structures (btree, hash, maybe a full-text option, etc.). If it's missing some algorithms or index structures, consider the types of questions that the database will be inefficient at answering; see the EXPLAIN sketch after this list. Note: I don't just mean "slow" here; the wrong algorithm can easily be worse by orders of magnitude.
Can the type system reasonably represent your business? If the set of types available is incredibly weak, you will have a mess. Representing everything as strings is kind of like assembly programming (untyped), and you will have a mess.
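On the executor point, a hedged PostgreSQL sketch (table names invented): EXPLAIN shows which join algorithm the planner chose, which is exactly the kind of thing worth probing when comparing engines:

    EXPLAIN
    SELECT o.order_id, c.name
    FROM orders o
    JOIN customers c ON c.customer_id = o.customer_id;
    -- The plan will mention "Hash Join", "Merge Join", or "Nested Loop".
    -- An engine missing one of these algorithms may answer some queries
    -- orders of magnitude more slowly, not just "a bit slower".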
A trivial application won't show you any of those things. Simple things are simple to solve. If you have a "database committee" then your company cares about its data, and you should take the responsibility seriously. You need to make sure that you can develop applications on it easily with the results you and your developers expect; and when you run into problems you need to have access to a powerful system and quality support that can get you through it.
Actually, learning the capabilities of each RDBMS is more crucial, because it depends on the application. If you need spatial data capabilities, PostGIS on PostgreSQL is better than MySQL. If you need easy replication and high-availability features, MySQL seems better. There are also license issues. A link for comparison is here. All have strengths and weaknesses. First get the requirements of your project or projects, then compare them against the features of the RDBMSs you've picked and decide which one to go with.
I don't think you need to test the simple CRUD stuff, it's hard to imagine a vendor that doesn't support the basics.
Firstly, you're going beyond the scope of a sample app, in my humble opinion.
Secondly, I'd pick the one most appropriate to the tool or application you wish to develop. For example, are schemas and transactions relevant for a database that stores a single-user app configuration?
Thirdly, I've worked with Access, SQL Server, SQLite, MySQL, PostgreSQL and Oracle, and they all have their place. If you're in the MS space, go with SQL Server (and don't forget Express). There are also ADO.NET ways to talk to the others in my list. It depends on what you want.
Frankly, I doubt an arbitrarily-defined simple application would be likely to really highlight the differences between database engines. I think you'd be better to read the advertising literature for the various engines to see what they claim as their strong points. Then consider which of these matter to you, and construct tests specifically designed to verify claims that you care about.
For example, here are pros and cons of database engines I've used the most that have mattered to me. I don't claim this is an exhaustive list, but it may give you an idea of things to think about:
MySQL: Note: MySQL has two main storage engines internally: MyISAM and InnoDB. I have never used InnoDB.
Pros: Fast. Free to cheap, depending on how you're using it. Very convenient and easy-to-use commands for managing the schema. Some very useful extensions to the SQL standard, like "insert ... on duplicate" (sketched below).
Cons: The MyISAM engine does not support transactions, i.e. there's no rollback. MyISAM engine does not manage foreign keys for you. (InnoDB does not have these drawbacks, but as I say, I've never used it, so I can't comment much further.) Many deviations from SQL standards.
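A minimal sketch (names invented) of the "insert ... on duplicate" extension mentioned above; it is MySQL-specific but very convenient for upserts:

    INSERT INTO page_hits (page_id, hits)
    VALUES (42, 1)
    ON DUPLICATE KEY UPDATE hits = hits + 1;  -- increment if the row exists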
Oracle: Pros: Fast. Generally good conformance to SQL standards. My brother works for Oracle so if you buy there you'll be helping support my family. (Okay, maybe that's not an important pro for you ...)
Cons: Difficult to install and manage. Expensive.
Postgres: Pros: Very high conformance to SQL standards. Free. Very good "explain" plans.
Cons: Relatively slow. Optimizer is easily confused on complex queries. Some awkwardness in modifying existing tables.
Access: Pros: Easy to install and manage. Very easy to use schema management. Built-in data entry tools and query builder for quick-and-dirty stuff. Cheap.
Cons: Slow. Unreliable with multiple users.
I think you can investigate Firebird too. This is an extract from Firebird-General on Yahoo Groups, and I find it quite objective:
Our natural audience is developers who want to package and sell proprietary applications. Firebird is easier to package and install than Postgres; more capable than SQLite; and doesn't charge a royalty like MySQL.

What are the use cases of Graph-based Databases (http://neo4j.org/)? [closed]

I have used Relational DB's a lot and decided to venture out on other types available.
This particular product looks good and promising: http://neo4j.org/
Has anyone used graph-based databases? What are the pros and cons from a usability perspective?
Have you used these in a production environment? What was the requirement that prompted you to use them?
I used a graph database in a previous job. We weren't using neo4j, it was an in-house thing built on top of Berkeley DB, but it was similar. It was used in production (it still is).
The reason we used a graph database was that the data being stored by the system and the operations the system was doing with the data were exactly the weak spot of relational databases and were exactly the strong spot of graph databases. The system needed to store collections of objects that lack a fixed schema and are linked together by relationships. To reason about the data, the system needed to do a lot of operations that would be a couple of traversals in a graph database, but that would be quite complex queries in SQL.
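To illustrate the kind of gap I mean, here is a hedged sketch with an invented schema: finding everything reachable from a starting object takes a recursive query in SQL (and recursive SQL support was patchy at the time), where a graph database does it with a plain traversal:

    -- All objects reachable from object 1, via an edge table
    WITH RECURSIVE reachable AS (
        SELECT target_id FROM edges WHERE source_id = 1
        UNION
        SELECT e.target_id
        FROM edges e
        JOIN reachable r ON e.source_id = r.target_id
    )
    SELECT target_id FROM reachable;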
The main advantages of the graph model were rapid development time and flexibility. We could quickly add new functionality without impacting existing deployments. If a potential customer wanted to import some of their own data and graft it on top of our model, it could usually be done on site by the sales rep. Flexibility also helped when we were designing a new feature, saving us from trying to squeeze new data into a rigid data model.
Having a weird database let us build a lot of our other weird technologies, giving us lots of secret-sauce to distinguish our product from those of our competitors.
The main disadvantage was that we weren't using the standard relational database technology, which can be a problem when your customers are enterprisey. Our customers would ask why we couldn't just host our data on their giant Oracle clusters (our customers usually had large datacenters). One of the team actually rewrote the database layer to use Oracle (or PostgreSQL, or MySQL), but it was slightly slower than the original. At least one large enterprise even had an Oracle-only policy, but luckily Oracle bought Berkeley DB. We also had to write a lot of extra tools - we couldn't just use Crystal Reports for example.
The other disadvantage of our graph database was that we built it ourselves, which meant when we hit a problem (usually with scalability) we had to solve it ourselves. If we'd used a relational database, the vendor would have already solved the problem ten years ago.
If you're building a product for enterprisey customers and your data fits into the relational model, use a relational database if you can. If your application doesn't fit the relational model but it does fit the graph model, use a graph database. If it only fits something else, use that.
If your application doesn't need to fit into the current blub architecture, use a graph database, or CouchDB, or BigTable, or whatever fits your app and you think is cool. It might give you an advantage, and its fun to try new things.
Whatever you chose, try not to build the database engine yourself unless you really like building database engines.
We've been working with the Neo team for over a year now and have been very happy. We model scholarly artifacts and their relationships, which is spot on for a graph db, and run recommendation algorithms over the network.
If you are already working in Java, I think that modeling using Neo4j is very straightforward, and it has the flattest/fastest performance for reads and writes of any of the solutions we tried.
To be honest, I have a hard time not thinking in terms of a Graph/Network because it's so much easier than designing convoluted table structures to hold object properties and relationships.
That being said, we do store some information in MySQL simply because it's easier for the Business side to run quick SQL queries against. To perform the same functions with Neo we would need to write code that we simply don't have the bandwidth for right now. As soon as we do though, I'm moving all that data to Neo!
Good luck.
Two points:
First, on the data I've been working with for the past 5 years in SQL Server, I've recently hit the scalability wall with SQL for the type of queries we need to run (nested relationships... you know... graphs). I've been playing around with neo4j, and my lookup times are several orders of magnitude faster when I need this kind of lookup.
Second, to the point that graph databases are outdated: um... no. Early on, as people were trying to figure out how to store and look up data efficiently, they created and played with graph- and network-style database models. These were designed so the physical model reflected the logical model, so their efficiency wasn't that great. This type of data structure was good for semi-structured data, but not as good for structured, dense data. So this IBM dude named Codd was researching efficient ways to arrange and store structured data and came up with the idea for the relational database model. And it was good, and people were happy.
What do we have here? Two tools for two different purposes. Graph database models are very good for representing semi-structured data and the relationships between entities (that may or may not exist). Relational databases are good for structured data that has a very static schema, and where join depths do not go very deep. One is good for one kind of data, the other is good for other kinds of data.
To coin a phrase, there is no Silver Bullet. It's very short-sighted to say that graph database models are out of date and that using one gives up 40 years of progress. That's like saying using C gives up all the technological progress we went through to get things like Java and C#. That's not true though. C is a tool that is needed for certain tasks, and Java is a tool for other tasks.
I've been using MySQL for years to manage engineering data, and it worked well, but one of the problems we had (but didn't realise we had) was that we always had to plan the schema up-front. Another problem we knew we had was mapping the data up to domain objects and back.
Now we've just started trying out neo4j and it looks like it is solving both problems for us. The ability to add different properties to each node (and relation) has allowed us to re-think our entire approach to data. It is like dynamic versus static languages (Ruby versus Java), but for databases. Building the data model in the database can be done in a much more agile and dynamic way, and that is dramatically simplifying our code.
And since the object model in code is generally a graph structure, mapping from the database is also simpler, with less code and consequently fewer bugs.
And as an additional bonus, our initial prototype code for loading our data into neo4j is actually performing faster than the previous MySQL version. I have no solid numbers on this (yet), but that was a nice additional benefit.
But at the end of the day, the choice probably should be based mostly on the nature of your domain model. Does it map better to tables or graphs? Decide by doing some prototypes, load the data and play with it. Use neoclipse to look at different views of the data. Once you've done that, hopefully you know if you're on to a good thing or not.
Here is a good article that talks about the needs that non relational databases fill: http://www.readwriteweb.com/enterprise/2009/02/is-the-relational-database-doomed.php
It does a good job of pointing out (aside from the name) that relational databases aren't flawed or wrong; it's just that these days people are processing more and more data in mainstream software and web sites, and relational databases just won't scale for those needs.
I am building an intranet at my company.
I am interested in understanding how to load data that was stored in tables (Oracle, MySQL, SQL Server, Excel, Access, various random lists) into Neo4j, or some other graph database. Specifically, what happens when common data overlaps existing data already in the system?
Yes, I know some data is best modeled in RDBMS, but I have this idea itching me, that when you need to superimpose several distinct tables, the graph model is better than the table structure.
For instance, I work in a manufacturing environment. There is a major project we are working on, and because of the complexity, each department has created a separate Excel spreadsheet with a BOM (Bill of Materials) hierarchy in a column on the left and then several columns of notes and checks made by the individuals who made these sheets.
So one of the problems is merging all these notes together into one "view" so that someone can see all the issues that need to be addressed in any particular part.
The second problem is that an Excel spreadsheet sucks at representing a hierarchical BOM when a common component is used in more than one subassembly. Meaning that if someone writes a note about the P34 relay in the ignition subassembly, the same comment should be associated with the P34 relays used in the motor driver subassembly. This won't happen in the Excel spreadsheet.
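For illustration, a hedged relational sketch (invented names) of how this would look if the BOM were a proper many-to-many structure instead of a spreadsheet; a note attached to a part is then visible from every assembly that uses it:

    CREATE TABLE part (
        part_id INTEGER PRIMARY KEY,
        name    VARCHAR(100) NOT NULL
    );
    CREATE TABLE bom (
        assembly_id  INTEGER NOT NULL REFERENCES part(part_id),
        component_id INTEGER NOT NULL REFERENCES part(part_id),
        quantity     INTEGER NOT NULL,
        PRIMARY KEY (assembly_id, component_id)
    );
    CREATE TABLE part_note (
        part_id INTEGER NOT NULL REFERENCES part(part_id),
        note    VARCHAR(2000) NOT NULL
    );

    -- One note on the P34 relay shows up under every subassembly using it:
    SELECT p.name, n.note
    FROM bom b
    JOIN part p      ON p.part_id = b.component_id
    JOIN part_note n ON n.part_id = p.part_id
    WHERE b.assembly_id = 1;  -- e.g. the ignition subassembly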
For the company intranet, I want to be able to search for anything easily. Such as data related to a part number, a BOM structure, a phone number, an email address, a company policy, or procedure. I want to even extend this to manage computer hardware assets, and installed software.
I envision that once the information network starts to get populated, you can start doing cool traversals such as "I want to write an email to everyone working on the XYZ project". People will have been associated with the project because they will be tagged as creating and modifying the data within the XYZ project. So by using the XYZ project as a search key, a huge set with everything related to the XYZ project will be created, including links to the people who built it. The people links will connect to their email addresses, so by their involvement in the XYZ project, they will be included in my email. This is in stark contrast to some secretary trying to maintain a list of people working on the project. We generate a lot of lists. We spend a lot of time maintaining lists and making sure they are up to date. And most of it doesn't add any value to our products.
Another cool traversal could report all the computers that have a certain piece of software installed, by version. That report could be used to generate tasks to remove extra copies of old software and to update people who need to have the latest copy. It would also be useful for license tracking.
Might be a bit late, but there is a growing number of projects using Neo4j, the better-known ones listed at the Neo4j site. Also, NeoTechnology, the company behind Neo4j, has some references on their customers page.
Note: I am part of the Neo4j team

Front-End for MS Access migration? [closed]

Background
I work for a large organization which has thousands of MS Access applications floating around. I didn't write any of these - in fact, most of the original authors have long since left the company - but from time to time another Access app lands on my desk for support. I would soooo love to replace access with a different solution.
Requirement
I know that there are several good alternatives for the database part of MS Access (the Jet database), such as SQLite, MySQL, VistaDB, etc.
What I would like to know is: Is there anything that will replace the front end part of MS Access?
I.e. Something which can be used to build forms, write simple scripts and queries, etc?
Why?
#BracC asked "why replace access?" - A fair question indeed.
I want to get rid of access because:
it hides logic, leading to hard-to-support applications. Logic can be in lots of different places, none of which provide or encourage any structure:
macros
modules
queries
forms
its very nature encourages users to create "little" applications which become "not so little" applications. Then the user leaves and I have to support a bunch of spaghetti. I know that Access isn't the only culprit, but it's the leader in my organisation, and I would love to get rid of it completely.
For extra credit
what I would really love to find is something which can read in an MDB file and output something like C# which replicates the functionality. (Or any language - not fussy).
I hope this is all clear. If not, please post a comment and I'll re-write/add detail.
Update
#GuinnessFan makes some points I find interesting. I have added my comments to discuss those points.
What we have done since I asked the question:
Got users to give us a definitive list of access applications they use and need. (The understanding is that any MDB files not on the list can be deleted - hooray!).
Analysed the MDBs on the list, coming to the following conclusions:
Most of the "applications" consist of a single hard-coded query or a single linked table.
Many are a small number of queries with, perhaps, a date parameter or similar.
Very few (if any) have any truly complex logic.
We are now working through the list, converting most of the apps to SSRS (SQL Server Reporting Services) packages.
Anything which can't be replicated using SSRS will become a hand-crafted web application. However, there aren't many of these.
May I say many thanks to everybody who has given me helpful answers.
I switched the back end of one application from MS Access to MS SQL Server a few years ago. I kept the front end, because it worked well and I didn't find anything as easy to use or modify.
I've never seen an MS Access -> C# translator. However, you might be able to find an MS Access to VB6 translator (their syntaxes are roughly similar), and from there there are VB6 -> VB.NET translators (and even VB.NET -> C# translators).
You could check out Oracle's Application Express. It's free and it's geared toward Access developers.
It also has a migration assistant that you run your Access database through: it processes the data and the forms, migrates everything to an Oracle database (this works with the free database, Oracle XE, which comes installed by default), and builds web forms for your Access database.
So in the end you'll have your Access databases on the web, your data in Oracle, and a somewhat nice web front end for extending them.
As far as Oracle goes, the tool isn't half bad. You can sign up for a free instance to play around with here.
Here's the document that explains how you migrate Access Databases.
So, other than personal distaste, why replace the Access front end? It may be easy to do for some (simple) databases, but most Access apps in the real world have a lot of complexity.
Lots of reasons for upgrading the back-end, of course (scalability, performance, db corruption, user-locking). Access even has a built-in "upgrade wizard" tool that allows you to split the forms and logic from the data, and upgrade the data to MS SQL server. If you want, use this wizard to upgrade the back-end to SQL Express, then manually migrate to another db platform.
Hope this is not too far off-topic, but sometimes all you need to do with Access is:
Upgrade the back-end (as we've already discussed)
Always make sure the front-ends are locked down (read-only)
If necessary, create different front-ends for different user roles (as a form of security).
If possible, have the front-ends copied locally on each workstation, for performance reasons. You may need to have a network script to check for new versions of the front-end.
I don't have any direct experience with it, but I did find an Access to ASP.Net converter tool called "Access Whiz" at http://www.microtools.us/
We used an internal app based on MS Access as a frontend to a MySQL database. We ran into lots of problems, and eventually rewrote the whole app in CodeGear Delphi 2007 for Win32. This has been a great success, although the migration did cost quite a lot of effort (training/hiring a couple of Delphi programmers, buying some third-party tools). I can wholeheartedly recommend Delphi, though. And AFAIK, integration with a MS Access back-end is certainly possible --- I once wrote a Delphi app that does just that, and it only cost me a couple of days to get a feature-complete version!
I realize this is a full programming solution, so you'd definitely lose some of the ease of use of MS Access for building front ends. Then again, you can put together a database application in Delphi in 10 minutes without writing much code -- no kidding! And since the 2009 release, the language is slowly becoming more mainstream again...
#BradC
I don't recommend MicroTools. I worked for a company a while back where we had the same problem. Unless MicroTools has made significant improvements to their product, it spits out garbage, last I checked.
What we found was that pretty much any upgrade path will require significant amounts of coding changes. All these tools are good for is to maintain a similar GUI from the original application. Their code had no object structure, just a bunch of utility functions that were dumped on each page to simulate the way Access provides record navigation. If you have a large number of forms, pulling out their solution and implementing your own takes some work and a ton of find-and-replace operations.
We were so disappointed with MicroTools performance that we started writing our own converter. We were pumping out better ASP.NET forms than they were after a week of coding.
You won't find a server-class engine that also has desktop interface design tools attached. The big server engines all expect you to use something like C++, C#, Java, or PHP to build your interface.
I, too, would love to see an upgrade tool for access that would spit out some basic C# forms and talks to an equivalent SQL Server database. It seems that would be a big money-maker for Microsoft because they could use it as a way to up-sell customers to a full SQL Server.
IIRC, there might be a way to tell an Access front end to talk to a SQL Server, or change the tables used by an Access front end to really be linked tables into a SQL Server, or something like that, but I've never had to use the feature myself.
I have a different perspective for you to consider. Your main issue is that Access hides logic, and that there are data and applications scattered throughout the organization.
Unfortunately I don't know of a RAD (rapid application development) tool that is as easy as Access to create functional forms.
However, I would recommend that you focus more on the possibility of centralizing your data and logic while still allowing Access as a front end. I support a database product called Advantage Database Server which supports RI (referential integrity) rules, stored procedures, triggers, etc., all of which can be managed on a central server, thus bringing all of the logic to you. The Access front ends could then link to the data back end using ODBC or OLE DB. If you switched to a solution like this, then later down the road you would have the flexibility to write other applications, such as .NET, PHP, or JDBC, that tie into the same data while phasing the Access front ends out.
A good start would be to stop new Access development unless it uses this sort of data back end.
Out of the thousands of Access files, how many have you been asked to support? I'm guessing fewer than 100. Why rebuild an application that (a) no one uses or (b) works fine just the way it is?
You need to establish a policy that custom applications in a large organization are developed in a robust, scalable, reliable, yadda yadda yadda environment. Identify the Access applications you feel are critical or are being outgrown, and just work on those.
Be prepared to handle the expectation of getting their quick-and-dirty little applications on a quick turnaround. You'll have to show them the benefits of your new apps.
I think you just need to be a resident expert and teach these users how to improve their application or get your input from the beginning to start them off right. The requirements to convert all of these files would otherwise be overwhelming.
Microtools offers Access Whiz, a set of Access conversion tools. It consists of Access to ASP .NET (VB/C#) converters, Access to VB6 converter, Access to WinForms (VB .NET/C#) converters and Access to Crystal Reports converter. More information and trial demos can be found at http://www.microtools.us.
You can also take a look at Firebird
Here is the way to migrate (you need Delphi)
I also found this: MDB2FDB
Is there anything that will replace the front end part of MS Access?
Maybe Kexi?

How much of an application's "smarts" should reside in the database? [closed]

I've noticed a trend lately that people are moving more and more processing out of databases and in to applications. Some people are taking this to what seems to me to be ridiculous extremes.
I've seen application designs that not only banned all use of stored procedures, but also banned any kind of constraints enforced at the database (this would include primary key, foreign key, unique, and check constraints). I have even seen applications that required the use of only one data type stored in the database, namely varchar(2000). DateTime and number types were not allowed. Transactions and concurrency were also handled outside the database.
Has anyone seen these kind of applications implemented successfully? Both of the implementations I've dealt with that were implemented this way had all kinds of data integrity and concurrency problems. Can anyone explain this trend to move stuff (logic, processing, constraints) out of the database? What is the motivation behind it? Is it something I'm imagining?
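For concreteness, a hedged sketch (invented schema) of the constraints those designs banned. With everything stored as varchar(2000) and no keys, nothing stops an order date of 'banana' or a negative quantity, while the database below rejects both:

    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,                                -- primary key
        customer_id INTEGER NOT NULL REFERENCES customers(customer_id), -- foreign key
        order_date  DATE    NOT NULL,                                   -- a real date type
        quantity    INTEGER NOT NULL CHECK (quantity > 0)               -- check constraint
    );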
Firstly, I really hope there is no trend towards databases without PKs and FKs and sensible datatypes. That would really be a tragedy.
But there is definitely a large core of developers who prefer putting logic in their apps than in stored procedures. I agree with Riho on the main reason for this: usually, DBAs manage databases, meaning that a developer has to go through a bunch of administrative overhead -- getting approvals from the DBA -- in order to create and update stored procs. Programmers by nature like to have control over their world, and to do things "their way."
There are also a couple of valid technical reasons:
Procedural extensions to SQL (e.g. T-SQL) used for developing stored procs have traditionally lacked user-defined datatypes, debuggability, portability, and interoperability with external systems -- all qualities helpful for developing reliable large-scale software. (And the clumsy syntax doesn't help.)
Software version control (e.g. svn) works well for managing even very large codebases, but managing DB schema changes is a harder problem and less well supported. Every serious shop uses version control for their application codebase, but many still don't have any rigorous system for managing their databases; hence stored procs can easily fall into an unversioned "black hole" that makes coders rightly nervous.
My personal view is that the closer the core business logic is to the data, the better, especially if more than one agent accesses the DB (or may do in the future). It's an unfortunate artefact of history that T-SQL and its ilk were weak languages, leading to the rise of the idea that "data and logic should be separated." My ideal world is one in which every business rule is encapsulated in a constraint enforced by the database, and all inconsistencies fail fast.
I like to keep logic out of the database. I tend to avoid stored procedures and triggers. I do, though, always use proper data types, keys, indices, and constraints. The way I see it is that the database is a database and the application is the application. The database should keep your data stored properly and efficiently, whereas the application should own the logic. Perhaps I have never been in a situation where a stored procedure or trigger was needed, and thus I have never been inclined to use them to solve a problem. But giving logic a home in the database seems "messy" to me; I would rather control everything from the application itself.
The trend results from the fact that the software technology industry is populated and driven largely by humans, and thus subject to trends and irrational behavior. To understand what's going on today requires a bit of perspective in the history of databases, and their parallel development with programming languages.
To be brief in this answer that will likely get downvoted: SQL is the IE6 of the database-language world. It breaks many of the rules of the relational model; in other words, it's a little like a calculator that performs multiplication incorrectly and doesn't have a minus operator. SQL is not complete enough to be a real solution. It was never developed beyond the prototype stage and was never meant to be used in industrial settings. But then it was naively used by Oracle, which turned out to be a "killer app"; SQL became the industry standard instead of its technically superior competitors, and the rest is history. SQL's syntax is based around a set of command-line tabular data processing tools and COBOL. Full of bugs, inconsistencies, and a mishmash of proprietary versions and features that have no grounding in math or logic, it results in a situation where it really is unclear what goes where.
I think the trend you must be talking about is recent proliferation of ORMs: misguided and ill thought out attempts to patch over the obvious deficiencies of SQL. Database triggers and procedures are another misfeature trying to patch over SQL's problems.
If history had played out in a logical and orderly way, the answer to your question would be simple: Just follow the rules of the relational model and everything will work itself out. Unfortunately, the rules of the relational model don't fit cleanly into the current crop of SQL based DBMS's, so some application level fiddling, or triggers, or whatever other stupid patch is unfortunately necessary, and it ends up being a matter of subjective opinion, rather than reasoned argument, which stupid hack you use.
So the real answer is to follow the relational model as closely as you can, and then fudge it the rest of the way. Put the logic in the application if you're the only one using the db and you need to keep all your source code in a version repository. If multiple applications are likely to use the database, make the DB as bulletproof and self-sufficient as it can be; the main goal here is to ensure that the data remains consistent.
Ultimately the database and how you connect to it is your "persistence API" -- how much is in the database and how much is in the application is application-specific. But the important aspect is that the API boundary is responsible for producing or consuming correct data.
Personally I prefer a thin access layer in the application and sprocs/PKs/FKs in the database to enforce transactional correctness and to enable API versioning. This also allows other applications to modify the database without upsetting the data model.
And if I see another moron using SELECT * FROM blah, I'm going to go nuts with an Uzi... :-)
"The database should keep your data stored properly and efficiently whereas the application should own the logic" - Nelson LaQeut in another answer.
This seems to be the crux of the issue: that all "logic" belongs to the application and not to the database. But what is meant by "logic"? There are various kinds of "logic", some of which belong in an application and some, I would say, better placed in the database.
I would think most developers would agree (surely?) that basic data integrity such as primary and foreign keys belongs in the database. There is less agreement on more sophisticated data integrity logic; even the humble but useful check constraint is woefully underused in general.
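A minimal sketch (invented names) of that humble check constraint; once declared, every application that writes to the table obeys the rule, whether or not its developers remembered it:

    ALTER TABLE employee
        ADD CONSTRAINT chk_salary_positive CHECK (salary > 0);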
The application camp see the database is "merely" a place to store the data that "belongs" to their application. The database camp (which is where I sit) see the application as "merely" one (perhaps currently the only) user of the data that "belongs" to the database - or rather that belongs to the business and is managed for the business by the database (DBMS = database management system).
If all your data logic is tied up in your application, what happens when the application needs to be rewritten in the latest trendy paradigm (or do you think J2EE for example is the last there will ever be)? As Tom Kyte often says, it's all about the data.
The database is an integral part of an application, but everyone interprets that differently. It's definitely a wise move to isolate them, but that shouldn't mean that you circumvent what they do in your programming. Correct data types and primary key references are important parts of good database design, on top of which a good application can be built.
Although I personally believe the database should have enough smarts to defend itself, some people who don't understand that databases aren't dumb services think, and not incorrectly mind you, that data and logic should be separated. Now, in many cases the separation of data and logic is a powerful tool. However, most databases already provide us with solid implementations of atomicity, redundancy, processing, checking, etc., and many times that's where those belong. But since the quality of these services and their APIs differs among vendors, many application programmers have felt it's worth trying to implement this sort of thing in the application layer, to avoid tying themselves to a specific database layer.
I can't say that I've seen a "trend" to create poor applications with terrible database designs. Programming is just like any other discipline in that there will be people who won't learn the tools or just want to cut corners. I've even talked to a person that just didn't "trust" databases. The applications that you described are just as you said, ridiculous nightmares. Don't follow those "trends".
I still prefer to use stored procedures and functions in SQL Server. It actually adds more flexibility to the application, and it has a performance benefit as well. Generally, I don't think it is a good idea to put everything in the application.
I think that those "developers" who created databases without indexes, or with a single VARCHAR(2000) column, are just art majors making their first attempt at entering the high-priced IT world. No one with even a little bit of IT education builds that kind of database structure.
I can understand the reason to keep logic out of a well-formed database. It is usually time-consuming to make changes there (you have to convince the database admins to make them, with all the red tape that comes with it). If the business logic is in your program, then it's up to you alone.
Use constraints in the database, but for any sophisticated logic I would place that in a data access layer or use one of the standard Object Relational Mapping (ORM) tools such as Hibernate/NHibernate.
There is a general belief that this will affect performance, based on the view that stored procedures are pre-compiled while 'raw' queries have to be compiled on every call. However, I work mostly with SQL Server 2005/2008, which is very efficient at handling 'raw' parameterised queries, caching the compiled query plan for future calls to the database. Under SQL Server, this means there is essentially no performance difference between stored procedures and parameterised SQL queries.
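A hedged SQL Server sketch (names invented): a parameterised query submitted through sp_executesql gets its plan cached and reused across calls, much like a stored procedure's:

    EXEC sp_executesql
        N'SELECT Name FROM dbo.Customer WHERE CustomerID = @id',
        N'@id INT',
        @id = 42;  -- later calls with other ids reuse the cached plan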
The only downside to losing stored procedures is if you are very granular with your database security permissions and wish to enforce security at the database login level.
I have a simple philosophy.
If it's needed to keep the database secure and in a consistent state, make sure it's done in the database.
I do try to keep a lot of other stuff there too, in my world it's easier to update a client's database than it is to update their application...
Essentially I try to treat the database as a pseudo object. A bunch of methods I can call, etc, but I don't want the app to care about the detail of the internal data storage.
In my experience, putting any application logic in the database always results in a WTF. It doesn't matter how smart the database programmer, how advanced the database, it always ends up being a mistake. The reverse question is "how often should my C# code manage relational data using its own flat-file structure and query language", to which the answer is (almost) always never.
I think the database should be used for data storage, which is what it's good at.

Where to put your code - Database vs. Application? [closed]

I have been developing web/desktop applications for about 6 years now. During the course of my career, I have come across applications that were heavily written in the database using stored procedures, whereas a lot of applications had only a few basic stored procedures (to read, insert, edit, and delete entity records) for each entity.
I have seen people argue that if you have paid for an enterprise database, you should use its features extensively, whereas a lot of "object-oriented architects" have told me it's an absolute crime to put anything more than necessary in the database, and that you should be able to drive the application using the methods on those classes.
Where do you think is the balance?
Thanks,
Krunal
I think it's a business logic vs. data logic thing. If there is logic that ensures the consistency of your data, put it in a stored procedure. Same for convenience functions for data retrieval/update.
Everything else should go into the code.
A friend of mine is developing a host of stored procedures for data analysis algorithms in bioinformatics. I think his approach is quite interesting, but not the right way in the long run. My main objections are maintainability and lack of adaptability.
I'm in the object oriented architects camp. It's not necessarily a crime to put code in the database, as long as you understand the caveats that go along with that. Here are some:
It's not debuggable
It's not subject to source control
Permissions on your two sets of code will be different
It will make it more difficult to track where an error in the data came from if you're accessing info in the database from both places
Anything that relates to Referential Integrity or Consistency should be in the database as a bare minimum. If it's in your application and someone wants to write an application against the database they are going to have to duplicate your code in their code to ensure that the data remains consistent.
PL/SQL for Oracle is a pretty good language for accessing the database, and it can also give performance improvements. Your application can also be much 'neater', as it can treat the database's stored procedures as a 'black box'.
The sprocs themselves can also be tuned and modified without you having to go near your compiled application; this is also useful if the supplier of your application has gone out of business or is unavailable.
I'm not advocating that 'everything' should be in the database, far from it. Treat each case separately and logically, and you will see which makes more sense: put it in the app, or put it in the database.
I'm coming from almost the same background and have heard the same arguments. I do understand that there are very valid reasons to put logic into the database. However, it depends on the type of application and the way it handles data which approach you should choose.
In my experience, a typical data entry app like some customer (or xyz) management will massively benefit from using an ORM layer as there are not so many different views at the data and you can reduce the boilerplate CRUD code to a minimum.
On the other hand, assume you have an application with a lot of concurrency and calculations that span a lot of tables and that has a fine-grained column-level security concept with locking and so on, you're probably better off doing stuff like that directly in the database.
As mentioned before, it also depends on the variety of views you anticipate for your data. If there are many different combinations of columns and tables that need to be presented to the user, you may also be better off just handing back different result sets rather than map your objects one-by-one to another representation.
After all, the database is good at dealing with sets, whereas OO code is good at dealing with single entities.
Reading these answers, I'm quite confused by the lack of understanding of database programming. I am an Oracle PL/SQL developer, and we source-control every bit of code that goes into the database. Many of the IDEs provide add-ins for most of the major source control products, from ClearCase to SourceSafe. The Oracle tools we use allow us to debug the code, so debugging isn't an issue. The issue is more one of logic and accessibility.
As a manager of support for about 5000 users, the fewer places I have to look for the logic, the better. If I want to make sure the logic is applied for ALL applications that use the data, even business logic, I put it in the DB. If the logic is different depending on the application, the application can be responsible for it.
#DannySmurf:
It's not debuggable
Depending on your server, yes, they are debuggable. This provides an example for SQL Server 2000. I'm guessing the newer ones also have this. However, the free MySQL server does not have this (as far as I know).
It's not subject to source control
Yes, it is. Kind of. Database backups should include stored procedures. Those backup files might or might not be in your version control repository. But either way, you have backups of your stored procedures.
My personal preference is to try and keep as much logic and configuration out of the database as possible. I am heavily dependent on Spring and Hibernate these days so that makes it a lot easier. I tend to use Hibernate named queries instead of stored procedures and the static configuration information in Spring application context XML files. Anything that needs to go into the database has to be loaded using a script and I keep those scripts in version control.
#Thomas Owens: (re source control) Yes, but that's not source control in the same sense that I can check in a .cs file (or .cpp file or whatever) and go and pick out any revision I want. To do that with database code requires a potentially-significant amount of effort to either retrieve the procedure from the database and transfer it to somewhere in the source tree, or to do a database backup every time a minor change is made. In either case (and regardless of the amount of effort), it's not intuitive; and for many shops, it's not a good enough solution either. There is also the potential here for developers who may not be as studious at that as others to forget to retrieve and check in a revision. It's technically possible to put ANYTHING in source control; the disconnect here is what I would take issue with.
(re debuggable) Fair enough, though that doesn't provide much integration with the rest of the application (where the majority of the code could live). That may or may not be important.
Well, if you care about the consistency of your data, there are reasons to implement code within the database. As others have said, placing code (and/or RI/constraints) inside the database acts to enforce business logic, close to the data itself. And, it provides a common, encapsulated interface, so that your new developer doesn't accidentally create orphan records or inconsistent data.
Well, this one is difficult. As a programmer, you'll want to avoid TSQL and such "Database languages" as much as possible, because they are horrendous, difficult to debug, not extensible and there's nothing you can do with them that you won't be able to do using code on your application.
The only reasons I see for writing stored procedures are:
Your database isn't great (think of how SQL Server doesn't implement LIMIT, and you have to work around that using a procedure; see the sketch after this list).
You want to be able to change a behaviour by changing code in just one place without re-deploying your client applications.
The client machines have big calculation-power constraints (think small embedded devices).
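As a sketch of the LIMIT workaround from the first point (table and column names invented): MySQL and PostgreSQL write SELECT name FROM product ORDER BY price LIMIT 10, while SQL Server of that era used TOP, or ROW_NUMBER() for paging:

    SELECT TOP 10 name FROM product ORDER BY price;

    -- Paging (rows 11-20) without LIMIT/OFFSET:
    SELECT name FROM (
        SELECT name, ROW_NUMBER() OVER (ORDER BY price) AS rn
        FROM product
    ) t
    WHERE rn BETWEEN 11 AND 20;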
For most applications though, you should try to keep your code in the application where you can debug it, keep it under version control and fix it using all the tools provided to you by your language.
