How to create a proper database layer? - database

Currently I use a lot of the ado.net classes (SqlConnection, SqlCommand, SqlDataAdapter etc..) to make calls to our sql server. Is that a bad idea? I mean it works. However I see many articles that use an ORM, nHibernate, subsonic etc.. to connect to SQL. Why are those better? I am just trying to understand why I would need to change this at all?
Update:
I did check the following tutorial on using nHibernate with stored-procedures.
http://ayende.com/Blog/archive/2006/09/18/UsingNHibernateWithStoredProcedures.aspx
However it looks to me that this is way to overkill. Why would I have to create a mapping file? Even if I create a mapping file and lets say my table changes, then my code wont work anymore. However if I use ado.net to return a simple datatable then my code will still work. I am missing something here?

There's nothing wrong with using the basic ADO.NET classes.
You might just have to do a lot more manual work than necessary. If you e.g. select your top 10 customers from a table with SqlCommand and SqlDataReader, it's up to you go iterate over the results, pull out each and every single item of data (like customer number, customer name, and so forth), and you're dealing very closely with the database structures, e.g. rows and columns. That's fine for some scenarios, but too much work in others.
What an ORM gives you is a lot of this "grunt work" being handled for you. You just tell it to get a list of your top 10 customers - as "Customer" objects. The ORM will go off and grab the data (most likely using SqlCommand, SqlDataReader) and then pulling out the bits and pieces, and assemble nice, easy to use "Customer" objects for you, that are a lot easier to use, since they are what your code is dealing with - Customer objects.
So there's definitely nothing wrong with using ADO.NET and it's a good thing if you know how it works - but an ORM can save you a lot of tedious, repetitive and boring grunt work and let you focus on your real business problems on the object level.
Marc

First of all, the ORMs are likely to do a much better job at producing the SQL queries than your normal non-SQL specialized Joe :)
Secondly, ORMs are a great way to somewhat "standardize" your DALs, increasing flexibility over different projects.
And lastly, with a good ORM, you're likely to have an easier time substituting your underlaying data-source, as a good ORM will have many different dialects. Of course, this is just a side-bonus :)

ORM's are great to avoid code repetition. You can often find that your object model and database model are extremely close to each other and whenever you add a field you'll be adding it to the database, your objects, your sql statements as well as everywhere else. If you use an ORM then you change your code in one place and it builds the rest of it for you.
As for performance, this can go either way. You will probably find that a lot of the simple sql that is written for you is often extremely tailored with various shortcuts that you would have been too lazy to write, such as only returning the absolutely required data. On the other hand, if you have some extremely complex queries and joins that an automated system could not possibly build then you're better of keeping these written yourself.
In summary though, they're fantastic for fast builds!

You don't need to change. If SqlConnection, SqlCommand, etc. work for you then that's great.
They work just peachy fine for the DB app I'm developing, and I have dozens of concurrent users with no problems.

There's nothing wrong with using straight up ADO.Net, but using an ORM will save you time, both in development and maintenance. Thats the biggest benefit.

One thing to consider: will a future "new developer" be more inclined to know or learn a well documented and widely adopted OR/M or your custom data access layer?
The number one thing for me though is the time. Minutes with my favorite OR/M, nHibernate vs. hours/days writing a custom data access layer using ADO.NET.
I also favor OR/Ms because maintaining declarative XML mappings is way easier than maintaining potentially thousands of lines of imperative code... or worse thousands of lines of C# data access code on top of thousands of lines of stored procedure code. In my current project I have 58 objects mapped in 58 XML mapping files, each with less than 50 lines. I cringe when I think about writing/maintaining CRUD code for 58 entities in ADO.NET.
I must warn you to read the documentation. Many, dare I say most, folks with whom I've worked will jump on a tool like mice on cheese, but they'll never read the documentation and learn the technology. I recommend reading the docs BEFORE moving to a new technology like nHibernate. A good cup o' jo and an hour or two of hard reading before-hand will pay dividends.

I haven't find pointing to an general idea of ORM in any answer. The general idea of ORM is to perform an Object-Relational mapping and provide your business classes with persistence. It means that you will think only about you business logic and will let ORM tool to save its state for you. Sure there are a lot of different scenarios. As was already said, it is nothing bad in using pure ADO.NET and may be your application (that is already written in this stile won't get any benefit), but using ORM tool in new projects is a very good idea. As for other - I totally agree with other answers.

Related

ORM vs traditional database query, which are their fields?

ORM seems to be a fast-growing model, with both pros and cons in their side. From Ultra-Fast ASP.NET of Richard Kiessig (http://www.amazon.com/Ultra-Fast-ASP-NET-Build-Ultra-Scalable-Server/dp/1430223839/ref=pd_bxgy_b_text_b):
"I love them because they allow me to develop small, proof-of-concept sites extremely quickly. I can side step much of the SQL and related complexity that I would otherwise need and focus on the objects, business logic and presentation. However, at the same time, I also don't care for them because, unfortunately, their performance and scalability is usually very poor, even when they're integrated with a comprehensive caching system (the reason for that becomes clear when you realize that when properly configured, SQL Server itself is really just a big data cache"
My questions are:
What is your comment about Richard's idea. Do you agree with him or not? If not, please tell why.
What is the best suitable fields for ORM and traditional database query? in other words, where you should use ORM and where you should use traditional database query :), which kind/size... of applications you should undoubtedly choose ORM/traditional database query
Thanks in advance
I can't agree to the common complain about ORMs that they perform bad. I've seen many plain-SQL applications until now. While it is theoretically possible to write optimized SQL, in reality, they ruin all the performance gain by writing not optimized business logic.
When using plain SQL, the business logic gets highly coupled to the db model and database operations and optimizations are up to the business logic. Because there is no oo model, you can't pass around whole object structures. I've seen many applications which pass around primary keys and retrieve the data from the database on each layer again and again. I've seen applications which access the database in loops. And so on. The problem is: because the business logic is already hardly maintainable, there is no space for any more optimizations. Often when you try to reuse at least some of your code, you accept that it is not optimized for each case. The performance gets bad by design.
An ORM usually doesn't require the business logic to care too much about data access. Some optimizations are implemented in the ORM. There are caches and the ability for batches. This automatic (and runtime-dynamic) optimizations are not perfect, but they decouple the business logic from it. For instance, if a piece of data is conditionally used, it loads it using lazy loading on request (exactly once). You don't need anything to do to make this happen.
On the other hand, ORM's have a steep learning curve. I wouldn't use an ORM for trivial applications, unless the ORM is already in use by the same team.
Another disadvantage of the ORM is (actually not of the ORM itself but of the fact that you'll work with a relational database an and object model), that the team needs to be strong in both worlds, the relational as well as the oo.
Conclusion:
ORMs are powerful for business-logic centric applications with data structures that are complex enough that having an OO model will advantageous.
ORMs have usually a (somehow) steep learning curve. For small applications, it could get too expensive.
Applications based on simple data structures, having not much logic to manage it, are most probably easier and straight forward to be written in plain sql.
Teams with a high level of database knowledge and not much experience in oo technologies will most probably be more efficient by using plain sql. (Of course, depending on the applications they write it could be recommendable for the team to switch the focus)
Teams with a high level of oo knowledge and only basic database experience are most probably more efficient by using an ORM. (same here, depending on the applications they write it could be recommendable for the team to switch the focus)
ORM is pretty old, at least in the Java world.
Major problems with ORM:
Object-Oriented model and Relational model are quite different.
SQL is a high level language to access data based on relational algebra, different from any OO language like C#, Java or Visual Basic.Net. Mixing those can you the worst of two worlds, instead of the best
For more information search the web on things like 'Object-relational impedance mismatch'
Either case, a good ORM framework saves you on quite some boiler-plate code. But you still need to have knowlegde of SQL, how to setup a good SQL databasemodel. Start with creating a good databasemodel using SQL, then base your OO model on that (not the other way around)
However, the above only holds if you really need to use a SQL database. I recommend looking into NoSQL movement as well. There's stuff like Cassandra, Couch-db. While google'ing for .net solutions I found this stackoverflow question: https://stackoverflow.com/questions/1777103/what-nosql-solutions-are-out-there-for-net
I'm the author of the book with the text quoted in the question.
Let me emphatically add that I am not arguing against using business objects or object oriented programming.
One issue I have with conventional ORM -- for example, LINQ to SQL or Entity Framework -- is that it often leads to developers making DB calls when they don't even realize that they're doing so. This, in turn, is a performance and scalability killer.
I review lots of websites for performance issues, and have found that DB chattiness is one of the most common causes of serious problems. Unfortunately, ORM tends to encourage chattiness, in spades.
The other complaints I have about ORM include:
No support for command batching
No support for multiple result sets
No support for table valued parameters
No support for native async calls (making them from a background thread doesn't count)
Support for SqlDependency and SqlCacheDependency is klunky if/when it works at all
I have no objection to using ORM tactically, to address specific business issues. But I do object to using it haphazardly, to the point where developers do things like make the exact same DB call dozens of time on the same page, or issue hugely expensive queries without considering caching and change notifications, or totally neglect async operations when scalability is a concern.
This site uses Linq-to-SQL I believe, and it's 'fairly' high traffic... I think that the time you save from writing the boiler plate code to access/insert/update simple items is invaluable, but there is always the option to drop down to calling a SPROC if you have something more complex, where you know you can write some screaming fast SQL directly.
I don't think that these things have to be mutually exclusive - use the advantages of both, and if there are sections of your application that start to slow down, then you can optimise as you need to.
ORM is far older than both Java and .NET. The first one I knew about was TopLink for Smalltalk. It's an idea as old as persistent objects.
Every "CRUD on the web" framework like Ruby on Rails, Grails, Django, etc. uses ORM for persistence because they all presume that you are starting with a clean sheet object model: no legacy schema to bother with. You start with the objects to model your problem and generate the persistence from it.
It often works the other way with legacy systems: the schema is long-lived, and you may or may not have objects.
It's astonishing how quickly you can get a prototype up and running with "CRUD on the web" frameworks, but I don't see them being used to develop enterprise apps in large corporations. Maybe that's a Fortune 500 prejudice.
Database admins that I know tell me they don't like the SQL that ORMs generate because it's often inefficient. They all wish for a way to hand-tune it.
I agree with most points already made here.
ORM's are not new in .NET, LLBLGen has been around for a long time, I've been using them for >5 years now in .NET.
I've seen very bad performing code written without ORMs (in-efficient SQL queries, bad indexes, nested database calls - ouch!) and bad code written with ORMs - I'm sure I've contributed to some of the bad code too :)
What I would add is that an ORM is generally a powerful and productivity-enhancing tool that allows you to stop worrying about plumbing db code for most of your application and concentrate on the application itself. When you start trying to write complex code (for example reporting pages or complex UI's) you need to understand what is happening underneath the hood - ignorance can be very costly. But, used properly, they are immensely powerful, and IMO won't have a detrimental effect on your apps performance. I for one wouldn't be happy on a project that didn't use an ORM.
Programming is about writing software for business use. The more we can focus on business logic and presentation and less with technicalities that only matter at certain points in time (when software goes down, when software needs upgrading, etc), the better.
Recently I read about talks of scalability from a Reddit founder, from here, and one line of him that caught my attention was this:
"Having to deal with the complexities
of relational databases (relations,
joins, constraints) is a thing of the
past."
From what I have watched, maintaining a complex database schema, when it comes to scalability, becomes a major pain as the site grows (you add a field, you reassign constraints, re-map foreign keys...etc). It was not entirely clear to me as to why is that. They're not using a NOSQL database though, they're in Postgres.
Add to that, here comes ORM, another layer of abstraction. It simplifies code writing, but almost often at a performance penalty. For me, a simple database abstraction library will do, much like lightweight AR libs out there together with database-specific "plain text" queries. I can't show you any benchmark but with the ORMs I have seen, most of them say that "ORM can often be slow".
Richard covers both sides of the coin, so I agree with him.
As for the fields, I really don't quite get the context of the "fields" you are asking about.
As others have said, you can write underperforming ORM code, and you can also write underperforming SQL.
Using ORM doesn't excuse you from knowing your SQL, and understanding how a query fits together. If you can optimize a SQL query, you can usually optimize an ORM query. For example, hibernate's criteria and HQL queries let you control which associations are joined to improve performance and avoid additional select statements. Knowing how to create an index to improve your most common query can make or break your application performance.
What ORM buys you is uniform, maintainable database access. They provide an extra layer of verification to ensure that your OO code matches up as closely as possible with your database access, and prevent you from making certain classes of stupid mistake, like writing code that's vulnerable to SQL injection. Of course, you can parameterize your own queries, but ORM buys you that advantage without having to think about it.
Never got anything but pain and frustration from ORM packages. If I'd write my SQL the way they autogen it - yeah I'd claim to be fast while my code would be slow :-) Have you ever seen SQL generated by an ORM ? Barely has PK-s, uses FK-s only for misguided interpretation of "inheritance" and if it wants to do paging it dumps the whole recordset on you and then discards 90% of it :-))) Then it locks everything in sight since it has to take in a load of records like it went back to 50 yr old IBM's batch processing.
For a while I thought that the biggest problem with ORM was splintering (not going to have a standard in 50 yrs - every year different API, pardon "model" :-) and ideologizing (everyone selling you a big philosophy - always better than everyone else's of course :-) Then I realized that it was really the total amateurism that's the root cause of the mess and everything else is just the consequence.
Then it all started to make sense. ORM was never meant to be performant or reliable - that wasn't even on the list :-) It was academic, "conceptual" toy from the day one, the consolation prize for professors pissed off that all their "relational" research papers in Prolog went down the drain when IBM and Oracle started selling that terrible SQL thing and making a buck :-)
The closest I came to trusting one was LINQ but only because it's possible and quite easy to kick out all "tracking" and use is just as deserialization layer for normal SQL code. Then I read how the object that's managing connection can develop spontaneous failures that sounded like premature GC while it still had some dangling stuff around. No way I was going to risk my neck with it after that - nope, not my head :-)
So, let me make a list:
Totally sloppy code - not going to suffer bugs and poor perf
Not going to take deadlocks from ORM's 10-100 times longer "transactions"
Drastic reduction of capabilities - SQL has huge expressive power these days
Tying you up into fringe and sloppy API (every ORM aims to hijack your codebase)
SQL queries are highly portable and SQL knowledge is totally portable
I still have to know SQL just to clean up ORM's mess anyway
For "proof-of-concept" I can just serialize to binary or XML files
not much slower, zero bug libraries and one XPath can select better anyway
I've actually done heavy traffic web sites all from XML files
if I actually need real graph then I have no use for DB - nothing real to query
I can serialize a blob and dump into SQL in like 3 lines of code
If someone claims that he does it all from DB to UI - keep your codebase locked :-)
and backup your payroll DB - you'll thank me latter :-)))
NoSQL bases are more honest than ORM - "we specialize in persistence"
and have better code quality - not surprised at all
That would be the short list :-) BTW, modern SQL engines these days do trees and spatial indexing, not to mention paging without a single record wasted. ORM-s are actually "solving" problems of 10yrs ago and promoting amateurism. To that extent NoSQL, also known as document

What's the meaning of ORM?

I ever developed several projects based on python framework Django. And it greatly improved my production. But when the project was released and there are more and more visitors the db becomes the bottleneck of the performance.
I try to address the issue, and find that it's ORM(django) to make it become so slow. Why? Because Django have to serve a uniform interface for the programmer no matter what db backend you are using. So it definitely sacrifice some db's performance(make one raw sql to several sqls and never use the db-specific operation).
I'm wondering the ORM is definitely useful and it can:
Offer a uniform OO interface for the progarammers
Make the db backend migration much easier (from mysql to sql server or others)
Improve the robust of the code(using ORM means less code, and less code means less error)
But if I don't have the requirement of migration, What's the meaning of the ORM to me?
ps. Recently my friend told me that what he is doing now is just rewriting the ORM code to the raw sql to get a better performance. what a pity!
So what's the real meaning of ORM except what I mentioned above?
(Please correct me if I made a mistake. Thanks.)
You have mostly answered your own question when you listed the benefits of an ORM. There are definitely some optimisation issues that you will encounter but the abstraction of the database interface probably over-rides these downsides.
You mention that the ORM sometimes uses many sql statements where it could use only one. You may want to look at "eager loading", if this is supported by your ORM. This tells the ORM to fetch the data from related models at the same time as it fetches data from another model. This should result in more performant sql.
I would suggest that you stick with your ORM and optimise the parts that need it, but, explore any methods within the ORM that allow you to increase performance before reverting to writing SQL to do the access.
A good ORM allows you to tune the data access if you discover that certain queries are a bottleneck.
But the fact that you might need to do this does not in any way remove the value of the ORM approach, because it rapidly gets you to the point where you can discover where the bottlenecks are. It is rarely the case that every line of code needs the same amount of careful hand-optimisation. Most of it won't. Only a few hotspots require attention.
If you write all the SQL by hand, you are "micro optimising" across the whole product, including the parts that don't need it. So you're mostly wasting effort.
here is the definition from Wikipedia
Object-relational mapping is a programming technique for converting data between incompatible type systems in relational databases and object-oriented programming languages. This creates, in effect, a "virtual object database" that can be used from within the programming language.
a good ORM (like Django's) makes it much faster to develop and evolve your application; it lets you assume you have available all related data without having to factor every use in your hand-tuned queries.
but a simple one (like Django's) doesn't relieve you from good old DB design. if you're seeing DB bottleneck with less than several hundred simultaneous users, you have serious problems. Either your DB isn't well tuned (typically you're missing some indexes), or it doesn't appropriately represents the data design (if you need many different queries for every page this is your problem).
So, i wouldn't ditch the ORM unless you're twitter or flickr. First do all the usual DB analysis: You see a lot of full-table scans? add appropriate indexes. Lots of queries per page? rethink your tables. Every user needs lots of statistics? precalculate them in a batch job and serve from there.
ORM separates you from having to write that pesky SQL.
It's also helpful for when you (never) port your software to another database engine.
On the downside: you lose performance, which you fix by writing a custom flavor of SQL - that it tried to insulate from having to write in the first place.
ORM generates sql queries for you and then return as object to you. that's why it slower than if you access to database directly. But i think it slow a little bit ... i recommend you to tune your database. may be you need to check about index of table etc.
Oracle for example, need to be tuned if you need to get faster ( i don't know why, but my db admin did that and it works faster with queries that involved with lots of data).
I have recommendation, if you need to do complex query (eg: reports) other than (Create Update Delete/CRUD) and if your application won't use another database, you should use direct sql (I think Django has it feature)

Data Dictionary or ORM?

I am in the processes of replacing the framework for a fairly complex business web application. Our application runs on a LAMP platform and the new framework will be an extension of CodeIgniter. In my research for framework design I decided to look into ORM, I have never done ORM before and I wanted to know if it would be valuable for our application. Then I stumbled on a very interesting blog post entitled "Why I Do Not Use ORM." This blog seemed to confirm many of my worries about using ORM and it also presented a solution similar to what I was already planning.
By "data dictionary" I plan to use this definition from "The Database Programmer" blog:
The term "data dictionary" is used by many, including myself, to denote a separate set of tables that describes the application tables. The Data Dictionary contains such information as column names, types, and sizes, but also descriptive information such as titles, captions, primary keys, foreign keys, and hints to the user interface about how to display the field.
So in choosing a "data dictionary" over ORM I may be exhibiting confirmation bias, regardless here are my reasons for being weary of ORM:
I have never used ORM before, I don't know much about it.
This framework needs to be built rather quickly, my boss has little time and I need to produce a working application that will allow for a smooth upgrade to a more modern framework.
My boss already thinks that I am over engineering this framework (trust me, I am no where close to that) and is paranoid about the framework preventing us from being able to do things that we need to, and creating bugs that we can't solve in the required amount of time. So far I have done a poor job of convincing him that change is good, I am not a very effective salesman and while the other developers can help me the boss still needs a lot of assurance.
Our old framework is procedural, our code is PHP, and our developers know SQL very well. ORM would be a big change.
Our database has dozens of tables, many with hundreds of thousands of entries running on a fairly old server. In the past we have been burned by code that repeated polls the database in a loop instead of doing one query to pull all of the needed data at once. Avoiding this problem with hand coded SQL is rather straight forward. Ensuring that this always happens where necessary with ORM is a huge unknown to me and appears to be risky.
Regardless, the solution of the data dictionary seems very promising to me as this blog post "Using A Data Dictionary" seems to provide a lot of useful features and some that are requirements of the new framework. Here are my reasons for preferring the data dictionary solution:
Implementing access control rules on the table rows themselves would be invaluable.
Auto-generating database changes, documentation, and schema checking would also be useful.
One requirement of the framework is a generic data history/auditing feature that can be applied to any sub-feature within our application. A data dictionary or an equivalent is essentially required to provide such a feature. The history must have detailed information about the structure and data types within the database.
Our systems hold a wide variety of data types that would more properly addressed if they treated as formal types within the application. For one, HTML fragments (of which we have many in our data, they are required) need to be encoded as entities in some cases, decoded as HTML in others, parsed for links and images in some cases, and always validated for correctness. Then there are dates, measurements, currency, and various other fields that could benefit from having a clear definition in the code of how this data should be manipulated.
The data dictionary idea that I would like to implement would be series of objects in separate PHP files, and there will be plenty of OOP, however it will be used as in a manner very similar to the data dictionary concept presented in "The Database Programmer" blog. It would be the single source definition of the complete database schema for the entire framework.
Now my question is, am I overlooking the value of ORM or is this a case where a data dictionary is the right tool for the job?
I think your question would be more interesting if you were making an initial architectural decision rather than refactoring an existing application. I don't see a single assertion in your question that suggests a problem that designing in an ORM would address; but several it would create. If two major stakeholder groups (owner and other developers are more comfortable with a more conventional design, it seems to me that an ORM would be swimming upstream.
I can imagine the (possibly undeserved) approbation that would be associated with the ORM as soon as a query is slow or transaction locking problems start emerging. Not to mention the impact on the development schedule. Why create an unquantified risk factor?
Do you have a framework which supports building applications using a "Data Dictionary"? If so, give it a try, it might solve your problems. If you haven't, then there are lots of good and working ORM frameworks out there which have large communities, which come with source (so you can fix bugs yourself even if the "vendor" refuses to help you).
If you want to get a quick glance at a nice web based ORM framework, I suggest Django or TurboGears. They are based on Python which will be a nice change after using PHP. I usually prefer TurboGears but Django seems to be more smooth at the moment. Both are easy to set up and you should be able to build a prototype in a day or two. That will give you an idea of the odds.
PS: I also don't think ORM tools are TEH SOLUT10N. I use Hibernate or SQL Alchemy when it makes sense but I often roll my own simple mappers.
I think that you have made a very good analysis for you situation. You know why you choose the Data Dictionary approach. So go for it.
Later on you might reconsider. If so, then there should be not a problem to use the Data Dictionary and a ORM for new developments in parallel. Both technologies are not mutual exclusive.
If you don't like the idea of mixing different technologies: Stick to a solid OOP design and separate concerns between domain logic and data access cleanly, then switching to an ORM shouldn't be that painful or at least possible.
Good luck!

What is the best practice for persistence right now?

I come from a java background.
But I would like a cross-platform perspective on what is considered best practice for persisting objects.
The way I see it, there are 3 camps:
ORM camp
direct query camp e.g. JDBC/DAO, iBatis
LINQ camp
Do people still handcode queries (bypassing ORM) ? Why, considering the options available via JPA, Django, Rails.
There is no one best practice for persistence (although the number of people screaming that ORM is best practice might lead you to believe otherwise). The only best practice is to use the method that is most appropriate for your team and your project.
We use ADO.NET and stored procedures for data access (though we do have some helpers that make it very fast to write such as SP class wrapper generators, an IDataRecord to object translator, and some higher order procedures encapsulating common patterns and error handling).
There are a bunch of reasons for this which I won't go into here, but suffice to say that they are decisions that work for our team and that our team agrees with. Which, at the end of the day, is what matters.
I am currently reading up on persisting objects in .net. As such I cannot offer a best practice, but maybe my insights can bring you some benefit. Up until a few months ago I have always used handcoded queries, a bad habit from my ASP.classic days.
Linq2SQL - Very lightweight and easy to get up to speed. I love the strongly typed querying possibilities and the fact that the SQL is not executed at once. Instead it is executed when your query is ready (all the filters applied) thus you can split the data access from the filtering of the data. Also Linq2SQL lets me use domain objects that are separate from the data objects which are dynamically generated. I have not tried Linq2SQL on a larger project but so far it seems promising. Oh it only supports MS SQL which is a shame.
Entity Framework - I played around with it a little bit and did not like it. It seems to want to do everything for me and it does not work well with stored procedures. EF supports Linq2Entities which again allows strongly typed queries. I think it is limited to MS SQL but I could be wrong.
SubSonic 3.0 (Alpha) - This is a newer version of SubSonic which supports Linq. The cool thing about SubSonic is that it is based on template files (T4 templates, written in C#) which you can easily modify. Thus if you want the auto-generated code to look different you just change it :). I have only tried a preview so far but will look at the Alpha today. Take a look here SubSonic 3 Alpha. Supports MS SQL but will support Oracle, MySql etc. soon.
So far my conclusion is to use Linq2SQL until SubSonic is ready and then switch to that since SubSonics templates allows much more customization.
There is at least another one: System Prevalence.
As far as I can tell, what is optimal for you depends a lot on your circumstances. I could see how for very simple systems, using direct queries still could be a good idea. Also, I have seen Hibernate fail to work well with complex, legacy database schemata, so using an ORM might not always be a valid option. System Prevalence is supposed to unbeatingly fast, if you have enough memory to fit all your objects into RAM. Don't know about LINQ, but I suppose it has its uses, too.
So, as so often, the answer is: know a variety of tools for the job, so that you are able to use the one that's most appropriate for your specific situation.
The best practice depends on your situation.
If you need database objects in table structures with some sort of meaningful structure (so one column per field, one row per entity and so on) you need some sort of translation layer inbetween objects and the database. These fall into two camps:
If there's no logic in the database (just storage) and tables map to objects well, then an ORM solution can provide a quick and reliable persistence system. Java systems like Toplink and Hibernate are mature technologies for this.
If there is database logic involved in persistence, or your database schema has drifted from your object model significantly, stored procedures wrapped by Data Access Objects (with further patterns as you like) is a little more involved than ORM but more flexible.
If you don't need structured storage (and you need to be really sure that you don't, as introducing it to existing data is not fun), you can store serialized object graphs directly in the database, bypassing a lot of complexity.
I prefer to write my own SQL, but I apply all my refactoring techniques and other "good stuff" when I do so.
I have written data access layers, ORM code generators, persistence layers, UnitOfWork transaction management, and LOTS of SQL. I've done that in systems of all shapes and sizes, including extremely high-performance data feeds (forty thousand files totaling forty million transactions per day, each loaded within two minutes of real-time).
The most important criteria is destiny, as in control thereof. Don't ever let your ORM tool be an obstacle to getting your work done, or an excuse for not doing it right. Ultimately, all good SQL is hand-written and hand-tuned, but some decent tools can help you get a good first draft quickly.
I treat this issue the same way that I do my UI design. I write all my UIs directly in code, but I might use a visual designer to prototype some essential elements that I have in mind, then I tear apart the code it generates in order to kickstart my own.
So, use an ORM tool in any of its manifestations as a way to get a decent example--look at how it solves many of the issues that arise (key generation, associations, navigation, etc.). Tear apart its output, make it your own, then reuse the heck out of it.

What is a good balance in an MVC model to have efficient data access?

I am working on a few PHP projects that use MVC frameworks, and while they all have different ways of retrieving objects from the database, it always seems that nothing beats writing your SQL queries by hand as far as speed and cutting down on the number of queries.
For example, one of my web projects (written by a junior developer) executes over 100 queries just to load the home page. The reason is that in one place, a method will load an object, but later on deeper in the code, it will load some other object(s) that are related to the first object.
This leads to the other part of the question which is what are people doing in situations where you have a table that in one part of the code only needs the values for a few columns, and another part needs something else? Right now (in the same project), there is one get() method for each object, and it does a "SELECT *" (or lists all the columns in the table explicitly) so that anytime you need the object for any reason, you get the whole thing.
So, in other words, you hear all the talk about how SELECT * is bad, but if you try to use a ORM class that comes with the framework, it wants to do just that usually. Are you stuck to choosing ORM with SELECT * vs writing the specific SQL queries by hand? It just seems to me that we're stuck between convenience and efficiency, and if I hand write the queries, if I add a column, I'm most likely going to have to add it to several places in the code.
Sorry for the long question, but I'm explaining the background to get some mindsets from other developers rather than maybe a specific solution. I know that we can always use something like Memcached, but I would rather optimize what we can before getting into that.
Thanks for any ideas.
First, assuming you are proficient at SQL and schema design, there are very few instances where any abstraction layer that removes you from the SQL statements will exceed the efficiency of writing the SQL by hand. More often than not, you will end up with suboptimal data access.
There's no excuse for 100 queries just to generate one web page.
Second, if you are using the Object Oriented features of PHP, you will have good abstractions for collections of objects, and the kinds of extended properties that map to SQL joins. But the important thing to keep in mind is to write the best abstracted objects you can, without regard to SQL strategies.
When I write PHP code this way, I always find that I'm able to map the data requirements for each web page to very few, very efficient SQL queries if my schema is proper and my classes are proper. And not only that, but my experience is that this is the simplest and fastest way to implement. Putting framework stuff in the middle between PHP classes and a good solid thin DAL (note: NOT embedded SQL or dbms calls) is the best example I can think of to illustrate the concept of "leaky abstractions".
I got a little lost with your question, but if you are looking for a way to do database access, you can do it couple of ways. Your MVC can use Zend framework that comes with database access abstractions, you can use that.
Also keep in mind that you should design your system well to ensure there is no contention in the database as your queries are all scattered across the php pages and may lock tables resulting in the overall web application deteriorating in performance and becoming slower over time.
That is why sometimes it is prefereable to use stored procedures as it is in one place and can be tuned when we need to, though other may argue that it is easier to debug if query statements are on the front-end.
No ORM framework will even get close to hand written SQL in terms of speed, although 100 queries seem unrealistic (and maybe you are exaggerating a bit) even if you have the creator of the ORM framework writing the code, it will always be far from the speed of good old SQL.
My advice is, look at the whole picture not only speed:
Does the framework improves code readability?
Is your team comfortable with writing SQL and mixing it with code?
Do you really understand how to optimize the framework queries? (I think a get() for each object is not the optimal way of retrieving them)
Do the queries (after optimization) of the framework present a bottleneck?
I've never developed anything with PHP, but I think that you could mix both approaches (ORM and plain SQL), maybe after a thorough profiling of the app you can determine the real bottlenecks and only then replace that ORM code for hand written SQL (Usually in ruby you use ActiveRecord, then you profile the application with something as new relic and finally if you have a complicated AR query you replace that for some SQL)
Regads
Trust your experience.
To not repeat yourself so much in the code you could write some simple model-functions with your own SQL. This is what I am doing all the time and I am happy with it.
Many of the "convenience" stuff was written for people who need magic because they cannot do it by hand or just don't have the experience.
And after all it's a question of style.
Don't hesitate to add your own layer or exchange or extend a given layer with your own stuff. Keep it clean and make a good design and some documentation so you feel home when you come back later.

Resources