I am using C# 3.0 / .NET 3.5 and planning to build an eCommerce website.
I've seen NHibernate, LLBLGEN, Genome, Linq to SQL, Entity Framework, SubSonic, etc.
I don't want to code everything by hand. If there is some specific bottleneck I'll manage to optimize the database/code.
Which ORM would be best? There is so much available these days that I don't even know where to start.
Which feature(s) should I be using?
Links, screencasts and documentation are welcome.
I've been using nHibernate, which is a very good free solution. The one downside is the lack of documentation, which makes for a fairly steep ramp-up. But once you get the basics down it really speeds up development.
I like Fluent nHibernate as a way to configure mappings without the XML files. The one thing I suggest, though, is to abstract your data access away from your application. That way, should you choose wrong, you don't have to worry about re-coding the app tiers.
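To make the "abstract out your data access" advice concrete, here is a minimal sketch. It is in Python with the standard sqlite3 module rather than C#, purely so it runs as-is; the shape carries over. All names (`ProductRepository`, `InMemoryProductRepository`, `SqliteProductRepository`, `price_with_tax`) are made up for illustration and are not part of any ORM mentioned in this thread.

```python
import sqlite3
from dataclasses import dataclass
from typing import Dict, Optional, Protocol


@dataclass
class Product:
    id: int
    name: str
    price: float


class ProductRepository(Protocol):
    """The only data-access surface the application tiers get to see."""

    def get(self, product_id: int) -> Optional[Product]: ...
    def add(self, product: Product) -> None: ...


class InMemoryProductRepository:
    """Stand-in implementation, e.g. for unit tests."""

    def __init__(self) -> None:
        self._items: Dict[int, Product] = {}

    def get(self, product_id: int) -> Optional[Product]:
        return self._items.get(product_id)

    def add(self, product: Product) -> None:
        self._items[product.id] = product


class SqliteProductRepository:
    """Same interface backed by a database; an ORM could sit here instead."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS product "
            "(id INTEGER PRIMARY KEY, name TEXT, price REAL)"
        )

    def get(self, product_id: int) -> Optional[Product]:
        row = self._conn.execute(
            "SELECT id, name, price FROM product WHERE id = ?", (product_id,)
        ).fetchone()
        return Product(*row) if row else None

    def add(self, product: Product) -> None:
        self._conn.execute(
            "INSERT INTO product (id, name, price) VALUES (?, ?, ?)",
            (product.id, product.name, product.price),
        )


def price_with_tax(repo: ProductRepository, product_id: int, rate: float) -> float:
    """Application code: depends on the interface, not on the storage choice."""
    product = repo.get(product_id)
    if product is None:
        raise KeyError(product_id)
    return round(product.price * (1 + rate), 2)
```

Swapping the ORM (or dropping it) then means writing one new repository class; `price_with_tax` and everything above it stay untouched.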
I can only really speak for LINQ-SQL and can say that it is:
Easy to use
Quick to get you up and running
Good for simple schemas and object models
but it starts to fall down if:
You're using a disconnected (tiered) architecture because its datacontexts require the same object instances to perform tracking and concurrency (though there are ways around this).
You have a complex object model / database
Plus it has some other niggles and strange behaviour
I'm looking to try EF next myself and MS seem to be quietly dropping LINQ-SQL in favour of EF, which isn't exactly a ringing recommendation of LINQ-SQL :)
That depends on the architecture of the data model. I can speak to the effectiveness of SubSonic, since I'm in the process of launching a web app that it backs.
I've run into problems with JOINs and DISTINCTs while using SubSonic. Both times, all I had to do is patch the source and rebuild the DLL. Now, I'm not at all averse to something like this, but you might be.
Other than those two problems, SubSonic is a joy to use. Selects are very easy and flowing. It maps fairly closely to SQL, much the same way LINQ does. Also, SubSonic comes with the scaffolding function that should be able to pre-build certain pages for you. I'm not sure how effective it is, since I like to do that stuff myself.
One more thing, selection of specific rows as opposed to * is slow, but only in debug mode. Once you compile for release, it's actually faster.
That's my two cents.
I started out using Linq to SQL as the whole Linq integration is awesome, but if you want to do Model First rather than Schema First and you want to have a rich domain model, then nHibernate\Fluent nHibernate is really the way to go. We switched to this and it is far simpler and better supported than L2S. However, for straight dragging of your schema into the DBML code generator, Linq to SQL is great.
I have also heard very good things about Mindscape Lightspeed but have not used it.
Currently I use a lot of the ADO.NET classes (SqlConnection, SqlCommand, SqlDataAdapter, etc.) to make calls to our SQL Server. Is that a bad idea? I mean, it works. However, I see many articles that use an ORM (nHibernate, SubSonic, etc.) to connect to SQL. Why are those better? I am just trying to understand why I would need to change this at all.
Update:
I did check the following tutorial on using nHibernate with stored-procedures.
http://ayende.com/Blog/archive/2006/09/18/UsingNHibernateWithStoredProcedures.aspx
However, it looks to me like this is way overkill. Why would I have to create a mapping file? Even if I create a mapping file, let's say my table changes: then my code won't work anymore. However, if I use ADO.NET to return a simple DataTable, my code will still work. Am I missing something here?
There's nothing wrong with using the basic ADO.NET classes.
You might just have to do a lot more manual work than necessary. If you, for example, select your top 10 customers from a table with SqlCommand and SqlDataReader, it's up to you to iterate over the results and pull out each and every single item of data (like customer number, customer name, and so forth), and you're dealing very closely with the database structures, i.e. rows and columns. That's fine for some scenarios, but too much work in others.
What an ORM gives you is a lot of this "grunt work" being handled for you. You just tell it to get a list of your top 10 customers, as "Customer" objects. The ORM will go off and grab the data (most likely using SqlCommand and SqlDataReader), pull out the bits and pieces, and assemble nice "Customer" objects for you. Those are a lot easier to use, since they are what your code is dealing with: Customer objects.
So there's definitely nothing wrong with using ADO.NET and it's a good thing if you know how it works - but an ORM can save you a lot of tedious, repetitive and boring grunt work and let you focus on your real business problems on the object level.
Marc
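The "grunt work" contrast above can be sketched in a few lines. This is an illustration only, written in Python against an in-memory SQLite database so it is runnable; the `customer` table, the `Customer` class and the toy `fetch_top` mapper are all assumptions made up for the example, and a real ORM does far more (change tracking, relations, caching).

```python
import sqlite3
from dataclasses import dataclass, fields
from typing import List, Type, TypeVar

T = TypeVar("T")


@dataclass
class Customer:
    customer_no: int
    name: str
    revenue: float


def make_demo_db() -> sqlite3.Connection:
    """Builds a throwaway in-memory database with 20 customers."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customer (customer_no INTEGER, name TEXT, revenue REAL)"
    )
    conn.executemany(
        "INSERT INTO customer VALUES (?, ?, ?)",
        [(i, f"Customer {i}", i * 100.0) for i in range(1, 21)],
    )
    return conn


def top_customers_by_hand(conn: sqlite3.Connection, limit: int) -> List[Customer]:
    """Manual style: iterate the rows and pull out each column yourself."""
    cursor = conn.execute(
        "SELECT customer_no, name, revenue FROM customer "
        "ORDER BY revenue DESC LIMIT ?",
        (limit,),
    )
    result = []
    for row in cursor:
        result.append(Customer(customer_no=row[0], name=row[1], revenue=row[2]))
    return result


def fetch_top(conn: sqlite3.Connection, cls: Type[T], table: str,
              order_by: str, limit: int) -> List[T]:
    """Toy mapper: derives the column list from the dataclass fields.

    `table` and `order_by` are interpolated into the SQL, so they must be
    trusted identifiers from your own code, never user input.
    """
    columns = [f.name for f in fields(cls)]
    sql = (
        f"SELECT {', '.join(columns)} FROM {table} "
        f"ORDER BY {order_by} DESC LIMIT ?"
    )
    return [cls(*row) for row in conn.execute(sql, (limit,))]
```

Both functions return the same list of `Customer` objects; the difference is that the second one needs no per-column code when a field is added to the class.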
First of all, the ORMs are likely to do a much better job at producing the SQL queries than your normal non-SQL specialized Joe :)
Secondly, ORMs are a great way to somewhat "standardize" your DALs, increasing flexibility over different projects.
And lastly, with a good ORM, you're likely to have an easier time substituting your underlying data source, as a good ORM will support many different dialects. Of course, this is just a side bonus :)
ORMs are great for avoiding code repetition. You can often find that your object model and database model are extremely close to each other, and whenever you add a field you'll be adding it to the database, your objects, your SQL statements and everywhere else. If you use an ORM, you change your code in one place and it builds the rest for you.
As for performance, this can go either way. You will probably find that a lot of the simple SQL that is written for you is tightly tailored, with various shortcuts that you would have been too lazy to write yourself, such as only returning the absolutely required data. On the other hand, if you have some extremely complex queries and joins that an automated system could not possibly build, then you're better off writing these yourself.
In summary though, they're fantastic for fast builds!
You don't need to change. If SqlConnection, SqlCommand, etc. work for you then that's great.
They work just peachy fine for the DB app I'm developing, and I have dozens of concurrent users with no problems.
There's nothing wrong with using straight-up ADO.NET, but using an ORM will save you time, both in development and maintenance. That's the biggest benefit.
One thing to consider: will a future "new developer" be more inclined to know or learn a well documented and widely adopted OR/M or your custom data access layer?
The number one thing for me though is the time. Minutes with my favorite OR/M, nHibernate vs. hours/days writing a custom data access layer using ADO.NET.
I also favor OR/Ms because maintaining declarative XML mappings is way easier than maintaining potentially thousands of lines of imperative code... or worse thousands of lines of C# data access code on top of thousands of lines of stored procedure code. In my current project I have 58 objects mapped in 58 XML mapping files, each with less than 50 lines. I cringe when I think about writing/maintaining CRUD code for 58 entities in ADO.NET.
I must warn you to read the documentation. Many, dare I say most, folks with whom I've worked will jump on a tool like mice on cheese, but they'll never read the documentation and learn the technology. I recommend reading the docs BEFORE moving to a new technology like nHibernate. A good cup o' jo and an hour or two of hard reading before-hand will pay dividends.
I haven't found the general idea of an ORM spelled out in any answer. The general idea is to perform object-relational mapping and give your business classes persistence. It means you think only about your business logic and let the ORM tool save its state for you. Sure, there are a lot of different scenarios. As was already said, there is nothing bad about using pure ADO.NET, and maybe your application (which is already written in this style) won't get any benefit, but using an ORM tool in new projects is a very good idea. As for the rest, I totally agree with the other answers.
I am developing an application which at the moment queries a (rather large) database via ADO.NET and hard-coded SQL statements. Admittedly this is ugly (i.e. no compile time errors thrown if a mistake is made in the SQL) and potentially dangerous (due to SQL injections, etc although this is unlikely to be a problem for this particular application) but this wasn't considered initially because this application is really only interested in a very small subset of tables in this database (at least for now...).
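As a side note on the SQL injection risk mentioned above: it comes from building statements by string concatenation, not from hand-written SQL as such, and parameterized queries avoid it with or without an ORM. A minimal sketch, using Python's sqlite3 module and a made-up `users` table purely for illustration (in ADO.NET the equivalent is adding parameters to the command instead of concatenating values into the command text):

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Builds a throwaway in-memory database with two users."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO users (id, name) VALUES (?, ?)",
        [(1, "alice"), (2, "bob")],
    )
    return conn


def find_user_unsafe(conn: sqlite3.Connection, name: str):
    """Vulnerable: the value becomes part of the SQL text."""
    sql = "SELECT id, name FROM users WHERE name = '" + name + "'"
    return conn.execute(sql).fetchall()


def find_user_safe(conn: sqlite3.Connection, name: str):
    """Parameterized: the value is passed separately and never parsed as SQL."""
    return conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (name,)
    ).fetchall()
```

Passing the string `x' OR '1'='1` to `find_user_unsafe` returns every row in the table, while `find_user_safe` treats it as an ordinary (non-matching) name.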
LinqToSQL seemed interesting but because this application is required to have the ability to connect to Oracle databases as well, that plan was a non-starter.
Is a project like mine suitable for integration with an ORM framework or would that be overkill?
I think an ORM should always at least be considered.
But it doesn't sound like you're even using business objects (Sometimes referred to as a Data Access Layer or DAL) which greatly undermines the usefulness of an object oriented language. I would address this first. If you find it's too time consuming to create all the CRUD for the business objects it's time for an ORM...
My personal favorite is nHibernate. Big learning curve but definitely worth it.
I would recommend a generated DAL instead of an ORM, or Linq.
Look into subsonic http://subsonicproject.com/. It is an open source DAL generator that is very easy to learn and use, and has a very low overhead.
I would definitely say that it is a candidate for an ORM framework. The overhead of setting up the ORM is quite small once you have familiarized yourself with a framework, and the benefits are many.
As you say, LinqToSQL is not appropriate if you might need Oracle support, but most other frameworks support Oracle.
If you only use a small subset of the tables, then you will only have to map a small subset of the tables and hence the setup cost will decrease even further.
Good luck!
Try using something that generates SQL (like Linq, only with Oracle) instead of an ORM.
Why? Jeff Atwood explains.
Quote:
"At first you're like "whee! objects!" and then you realize-- hey, this is a lot of tedious, error-prone mapping code I didn't have to write before... "
I know this is not a programming question per se, but I wanted to get as much input as possible from the SO community on a new project I hope to get started. The project is being started from scratch, so every decision about programming languages, databases, frameworks, platforms and whatnot is up in the air. I'm hoping to get your opinion on the matter: what you feel are the strengths and weaknesses of each option.
Database:
Currently I have the option of using MSSQL or MySQL. I am leaning towards MySQL because it is free and most probably has all the features I need. However, there is the possibility of having a lot of hierarchical data, and the new hierarchical data type in MSSQL is quite appealing. Does it really simplify matters that much? MSSQL also supports many more advanced SQL functions that may or may not be useful in the long run. For development I can get access to Server 2008, but with multiple licenses as the development team grows, plus production, are the costs justified?
Programming Languages:
The project will have a web based front end UI and a server based component that will do some heavy lifting.
For the web based UI, I was thinking of maybe doing Apache/IIS with PHP, or IIS with ASP.NET in C#. I'd like to use a good framework that applies good design patterns to structure the code and development of the app, and makes modifications easy to implement in the long run. I also want the GUI to look good and don't like the idea of buying .NET controls from component vendors. Instead I prefer the idea of using good CSS, open source libraries like YUI, and JavaScript to make the UI sleek.
For the server based component, I was thinking of using C#. I have no real development experience in C++ and I'd like good libraries and sufficient speed is good enough. However, while the web based UI and server based component is loosely coupled, there may be instances where the UI needs to communicate (call methods and what not) with the server based component and I want to pick languages/frameworks that will play nice with each other.
All suggestions on frameworks to incorporate are welcome.
Version Control:
I have had good experiences with SVN and a pretty bad experience with TFS. I've never worked with Git. Which do you think is better in terms of features as well as general developer familiarity? I want to pick something that other developers will know and not have trouble with.
I apologize if the questions are bit redundant or I'm not providing enough information or using bad terminology. I plan to edit and improve the question as I get feedback. Thanks!
EDIT:
Who: This would most probably be a startup formed of college students or junior developers. I want the project to utilize technologies that most people are familiar with or are easy to pick up.
What: I'd need hours and days to explain the solution. But in the end, when you break it down, it's a web based UI (think standard web app that just manages database data) that would be used by knowledgeable clients. The server based component would be very separate except for the fact that it should be able to communicate with the web app.
I can provide more information as required but I would appreciate an opportunity for users to answer and provide their ideas before you hastily close the question.
Obviously it depends a lot on specific requirements, but then again, even with those I probably wouldn't be able to tell for sure!
I've been working on a from-scratch project myself for a couple of months, and have generally found:
Choosing Microsoft for all the layers just goes down much easier (my subjective opinion). For example I would use C# for the UI, the back end, and use MSSQL for the database. Nothing at all wrong with non-Microsoft vendors, I'm no Microsoft fan-boy, I just struggle to get productive with unfamiliar tools. Depends where your experience lies though.
Database: In particular I've found that .NET and MSSQL go easily together. When I started the project I was using PostgreSQL (because it's free, fully featured and has open-source warm fuzzies). However, I abandoned it in favour of MSSQL simply because it was taking me too long to get database work done in an unfamiliar language with unfamiliar tools. Also, I'm not sure MSSQL is so expensive anymore; for a web application, for example, MSSQL 2008 Web Edition is pretty damn cheap per processor, I think (only on SPLA licensing though). If you're concerned about database features in a free implementation, though, personally I think PostgreSQL has a very full feature set, nicely standardised, and rapidly growing.
UI: I'm pretty inexperienced, but ASP.NET MVC looks far less painful to me than ASP.NET Web Forms. I like PHP too, but again I'd match the UI language with the back-end language, so would recommend .NET.
On frameworks, I'm immersed in DALs at the moment. I like Subsonic for lightweight data, NHibernate for heavy-weight.
I still have a long way to go with my project so perhaps I can only see the short-term benefits and drawbacks at the moment. But in general I would say: use the technologies that you're most comfortable using, as you'll be way more productive and the end result will probably be about the same anyway. If you want to learn new technologies though, and who doesn't? - go ahead, just expect it to take a lot longer.
Didn't want to answer 'cause it's so open ended. But a few points:
Money
First, check out BizSpark. That should take care of any money aspect for 3 years. For a service company, that means not only free VS Team Suite and Office and so on, but free Windows, SQL, etc. If your startup can't afford to spend a bit on MS tech in 3 years, it's probably a bad business. So that takes out licensing.
On a similar note, Sun has Startup Essentials. Could be interesting on the hardware side of things, but I haven't actually competitively priced them versus Dell/HP.
Software
It doesn't sound like you have hard enough requirements to say "oh, this slightly-less-popular software X is perfect for my domain Y and is gonna give me a very big boost". In fact, your project might not be like that at all. Maybe it, technically, is going to be a relatively plain application just pushing data around or whatever. You didn't specify.
For a small startup, personal productivity is probably going to trump any other argument. If your people are excellent in X, then that's one of your top arguments right there.
If you really don't have any particular system you're most comfortable with, be conservative. Stick with .NET or Java, as they'll give you the widest range of useful possibilities.
As far as things like OS and database go, I'm biased, but I think Microsoft will give you platforms that are easier to take advantage of than you'll find elsewhere. For instance, setting up load balancing, clustering, centralized authentication and server management (updates, events, etc.) is going to be easier to get going on Windows than it would be on another platform, assuming you're not an expert in either. Configuring SQL Server, even the advanced features, is a piece of cake. (Go time someone who knows neither: set up a DB mirror in MSSQL and in MySQL. Which is going to take more work?) Again, this is all predicated on you not having experts in a particular set of technology.
Don't mix -- whatever you do, stick with the platform. If you go .NET, MSSQL is going to work better with the data providers (or things like Linq-to-SQL). If you decide to do PHP, then use MySQL as everyone else uses it and you'll encounter less resistance. If you're not inventing stuff on the technical side, don't become an edge case.
You should pick the platform first, then the language that is best for that platform (if there is any choice).
One thing you should consider is the labor pool, and labor pool cost, for specific platforms and languages. Human Resources can often get cost metrics, if you don't have ideas already.
In my town, for example, .NET platform is much more expensive per Software Engineer than open source, because the .NET developers have a higher rate (40% roughly). C# is a little higher rate than VB.NET, but also tends to bring more well rounded candidates.
Just to throw in something totally different: How about weblocks as a web framework? It uses Hunchentoot as a server, which can run either standalone or with Apache. This is all done in Common Lisp. Weblocks can use cl-sql as a backend store, which can connect to many different RDBMs (MySQL, PostgreSQL, Oracle, ODBC, SQLite).
I come from a Java background.
But I would like a cross-platform perspective on what is considered best practice for persisting objects.
The way I see it, there are 3 camps:
ORM camp
direct query camp e.g. JDBC/DAO, iBatis
LINQ camp
Do people still hand-code queries (bypassing the ORM)? If so, why, considering the options available via JPA, Django and Rails?
There is no one best practice for persistence (although the number of people screaming that ORM is best practice might lead you to believe otherwise). The only best practice is to use the method that is most appropriate for your team and your project.
We use ADO.NET and stored procedures for data access (though we do have some helpers that make it very fast to write such as SP class wrapper generators, an IDataRecord to object translator, and some higher order procedures encapsulating common patterns and error handling).
There are a bunch of reasons for this which I won't go into here, but suffice to say that they are decisions that work for our team and that our team agrees with. Which, at the end of the day, is what matters.
I am currently reading up on persisting objects in .net. As such I cannot offer a best practice, but maybe my insights can bring you some benefit. Up until a few months ago I have always used handcoded queries, a bad habit from my ASP.classic days.
Linq2SQL - Very lightweight and easy to get up to speed. I love the strongly typed querying possibilities and the fact that the SQL is not executed at once. Instead it is executed when your query is ready (all the filters applied) thus you can split the data access from the filtering of the data. Also Linq2SQL lets me use domain objects that are separate from the data objects which are dynamically generated. I have not tried Linq2SQL on a larger project but so far it seems promising. Oh it only supports MS SQL which is a shame.
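The deferred-execution point (filters are collected first, SQL runs only when the results are actually needed) can be imitated in a few lines. This is a toy sketch in Python, not how Linq2SQL is implemented: the `Query` class and the `product` table are invented for illustration, and conditions are plain SQL fragments rather than typed expressions.

```python
import sqlite3
from typing import Any, Iterator, Sequence, Tuple


class Query:
    """Hypothetical lazy query: filters accumulate, SQL runs only on iteration."""

    def __init__(self, conn: sqlite3.Connection, table: str,
                 columns: Sequence[str],
                 where: Tuple[str, ...] = (), params: Tuple[Any, ...] = ()):
        self._conn = conn
        self._table = table
        self._columns = tuple(columns)
        self._where = where
        self._params = params

    def filter(self, condition: str, *params: Any) -> "Query":
        """Returns a new Query with one more condition; nothing is executed."""
        return Query(self._conn, self._table, self._columns,
                     self._where + (condition,), self._params + params)

    def to_sql(self) -> str:
        sql = f"SELECT {', '.join(self._columns)} FROM {self._table}"
        if self._where:
            sql += " WHERE " + " AND ".join(self._where)
        return sql

    def __iter__(self) -> Iterator[tuple]:
        # The database is only touched here, with all filters combined.
        return iter(self._conn.execute(self.to_sql(), self._params).fetchall())


def make_demo_db() -> sqlite3.Connection:
    """Builds a throwaway in-memory database with three products."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE product (id INTEGER, name TEXT, price REAL, in_stock INTEGER)"
    )
    conn.executemany(
        "INSERT INTO product VALUES (?, ?, ?, ?)",
        [(1, "pen", 2.0, 1), (2, "lamp", 35.0, 1), (3, "ink", 9.0, 0)],
    )
    conn.commit()
    return conn
```

The data access code can hand out `Query(conn, "product", ["id", "name"])`, and calling code can keep adding `.filter(...)` calls; one combined SELECT is sent when the result is finally iterated.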
Entity Framework - I played around with it a little bit and did not like it. It seems to want to do everything for me and it does not work well with stored procedures. EF supports Linq2Entities which again allows strongly typed queries. I think it is limited to MS SQL but I could be wrong.
SubSonic 3.0 (Alpha) - This is a newer version of SubSonic which supports Linq. The cool thing about SubSonic is that it is based on template files (T4 templates, written in C#) which you can easily modify. Thus if you want the auto-generated code to look different you just change it :). I have only tried a preview so far but will look at the Alpha today. Take a look here SubSonic 3 Alpha. Supports MS SQL but will support Oracle, MySql etc. soon.
So far my conclusion is to use Linq2SQL until SubSonic is ready and then switch to that, since SubSonic's templates allow much more customization.
There is at least another one: System Prevalence.
As far as I can tell, what is optimal for you depends a lot on your circumstances. I could see how, for very simple systems, using direct queries could still be a good idea. Also, I have seen Hibernate fail to work well with complex, legacy database schemata, so using an ORM might not always be a valid option. System Prevalence is supposed to be unbeatably fast, if you have enough memory to fit all your objects into RAM. I don't know about LINQ, but I suppose it has its uses, too.
So, as so often, the answer is: know a variety of tools for the job, so that you are able to use the one that's most appropriate for your specific situation.
The best practice depends on your situation.
If you need your objects stored in tables with a meaningful structure (one column per field, one row per entity and so on), you need some sort of translation layer in between the objects and the database. These fall into two camps:
If there's no logic in the database (just storage) and tables map to objects well, then an ORM solution can provide a quick and reliable persistence system. Java systems like Toplink and Hibernate are mature technologies for this.
If there is database logic involved in persistence, or your database schema has drifted from your object model significantly, stored procedures wrapped by Data Access Objects (with further patterns as you like) is a little more involved than ORM but more flexible.
If you don't need structured storage (and you need to be really sure that you don't, as introducing it to existing data is not fun), you can store serialized object graphs directly in the database, bypassing a lot of complexity.
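For the last option (no structured storage), a minimal sketch of what "store the serialized object graph directly" can look like. It uses JSON in a single text column of a SQLite table; the `object_store` table and the function names are assumptions for the example, and JSON is chosen here only because it is readable and safe to load, at the cost of handling only plain dicts, lists, strings and numbers.

```python
import json
import sqlite3
from typing import Any, Optional


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Opens (and if needed creates) a one-table key/value store."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS object_store "
        "(key TEXT PRIMARY KEY, body TEXT NOT NULL)"
    )
    return conn


def save_graph(conn: sqlite3.Connection, key: str, graph: Any) -> None:
    """Serializes the whole graph into one column; no per-field mapping."""
    conn.execute(
        "INSERT OR REPLACE INTO object_store (key, body) VALUES (?, ?)",
        (key, json.dumps(graph)),
    )


def load_graph(conn: sqlite3.Connection, key: str) -> Optional[Any]:
    """Returns the deserialized graph, or None if the key is unknown."""
    row = conn.execute(
        "SELECT body FROM object_store WHERE key = ?", (key,)
    ).fetchone()
    return json.loads(row[0]) if row else None
```

The trade-off mentioned above shows up immediately: the database cannot filter or join on anything inside `body` without extra work, which is why you need to be sure you don't need structured storage.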
I prefer to write my own SQL, but I apply all my refactoring techniques and other "good stuff" when I do so.
I have written data access layers, ORM code generators, persistence layers, UnitOfWork transaction management, and LOTS of SQL. I've done that in systems of all shapes and sizes, including extremely high-performance data feeds (forty thousand files totaling forty million transactions per day, each loaded within two minutes of real-time).
The most important criterion is destiny, as in control thereof. Don't ever let your ORM tool be an obstacle to getting your work done, or an excuse for not doing it right. Ultimately, all good SQL is hand-written and hand-tuned, but some decent tools can help you get a good first draft quickly.
I treat this issue the same way that I do my UI design. I write all my UIs directly in code, but I might use a visual designer to prototype some essential elements that I have in mind, then I tear apart the code it generates in order to kickstart my own.
So, use an ORM tool in any of its manifestations as a way to get a decent example--look at how it solves many of the issues that arise (key generation, associations, navigation, etc.). Tear apart its output, make it your own, then reuse the heck out of it.
I am working on a few PHP projects that use MVC frameworks, and while they all have different ways of retrieving objects from the database, it always seems that nothing beats writing your SQL queries by hand as far as speed and cutting down on the number of queries.
For example, one of my web projects (written by a junior developer) executes over 100 queries just to load the home page. The reason is that in one place, a method will load an object, but later on deeper in the code, it will load some other object(s) that are related to the first object.
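The pattern described above (load an object, then load its related objects one at a time deeper in the code) is usually called the N+1 query problem. A small sketch that counts the queries, using Python and an in-memory SQLite database instead of PHP so it is runnable as-is; the `author`/`post` schema is invented for the example.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Builds a throwaway database with 5 authors and 50 posts."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE post (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
        """
    )
    conn.executemany(
        "INSERT INTO author VALUES (?, ?)",
        [(i, f"author{i}") for i in range(1, 6)],
    )
    conn.executemany(
        "INSERT INTO post VALUES (?, ?, ?)",
        [(i, (i % 5) + 1, f"post{i}") for i in range(1, 51)],
    )
    conn.commit()
    return conn


def titles_n_plus_one(conn: sqlite3.Connection):
    """One query for the posts, then one more per post for its author."""
    queries = 0
    posts = conn.execute(
        "SELECT id, author_id, title FROM post ORDER BY id"
    ).fetchall()
    queries += 1
    result = []
    for _post_id, author_id, title in posts:
        name = conn.execute(
            "SELECT name FROM author WHERE id = ?", (author_id,)
        ).fetchone()[0]
        queries += 1
        result.append((title, name))
    return result, queries


def titles_joined(conn: sqlite3.Connection):
    """Same data in a single round trip using a JOIN."""
    rows = conn.execute(
        "SELECT p.title, a.name FROM post p "
        "JOIN author a ON a.id = p.author_id ORDER BY p.id"
    ).fetchall()
    return rows, 1
```

With 50 posts the first version issues 51 queries and the second issues 1, for identical results. Many ORMs offer some form of eager loading that produces the second shape, so the fix does not necessarily require dropping the ORM.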
This leads to the other part of the question which is what are people doing in situations where you have a table that in one part of the code only needs the values for a few columns, and another part needs something else? Right now (in the same project), there is one get() method for each object, and it does a "SELECT *" (or lists all the columns in the table explicitly) so that anytime you need the object for any reason, you get the whole thing.
So, in other words, you hear all the talk about how SELECT * is bad, but if you try to use an ORM class that comes with the framework, that is usually exactly what it wants to do. Are you stuck choosing between an ORM with SELECT * and writing the specific SQL queries by hand? It just seems to me that we're stuck between convenience and efficiency, and if I hand-write the queries and then add a column, I'm most likely going to have to add it in several places in the code.
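One possible middle ground between "SELECT * everywhere" and "a hand-written query per call site" is a single get function that knows the full column list in one place and lets callers ask for a subset. A rough sketch in Python with sqlite3; the `users` table, `USER_COLUMNS` and `get_user` are hypothetical names, and the same idea should translate to a PHP model class.

```python
import sqlite3
from typing import Optional, Sequence

# The one place to touch when a column is added to the table.
USER_COLUMNS = ("id", "name", "email", "bio")


def get_user(conn: sqlite3.Connection, user_id: int,
             columns: Optional[Sequence[str]] = None) -> Optional[dict]:
    """Fetches one user, selecting only the requested columns.

    Column names are checked against USER_COLUMNS before being put into
    the SQL text, so callers cannot inject arbitrary SQL through them.
    """
    cols = tuple(columns) if columns else USER_COLUMNS
    unknown = set(cols) - set(USER_COLUMNS)
    if unknown:
        raise ValueError(f"unknown columns: {sorted(unknown)}")
    sql = f"SELECT {', '.join(cols)} FROM users WHERE id = ?"
    row = conn.execute(sql, (user_id,)).fetchone()
    return dict(zip(cols, row)) if row else None
```

A listing page can call `get_user(conn, 1, ["id", "name"])` and skip the large `bio` column, while a profile page calls `get_user(conn, 1)` and gets everything.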
Sorry for the long question, but I'm explaining the background to get some mindsets from other developers rather than maybe a specific solution. I know that we can always use something like Memcached, but I would rather optimize what we can before getting into that.
Thanks for any ideas.
First, assuming you are proficient at SQL and schema design, there are very few instances where any abstraction layer that removes you from the SQL statements will exceed the efficiency of writing the SQL by hand. More often than not, you will end up with suboptimal data access.
There's no excuse for 100 queries just to generate one web page.
Second, if you are using the Object Oriented features of PHP, you will have good abstractions for collections of objects, and the kinds of extended properties that map to SQL joins. But the important thing to keep in mind is to write the best abstracted objects you can, without regard to SQL strategies.
When I write PHP code this way, I always find that I'm able to map the data requirements for each web page to very few, very efficient SQL queries if my schema is proper and my classes are proper. And not only that, but my experience is that this is the simplest and fastest way to implement. Putting framework stuff in the middle between PHP classes and a good solid thin DAL (note: NOT embedded SQL or dbms calls) is the best example I can think of to illustrate the concept of "leaky abstractions".
I got a little lost with your question, but if you are looking for a way to do database access, you can do it a couple of ways. For one, your MVC layer can use Zend Framework, which comes with database access abstractions.
Also keep in mind that you should design your system well to avoid contention in the database: if your queries are scattered across the PHP pages, they may lock tables, and the overall web application will get slower over time.
That is why it is sometimes preferable to use stored procedures: the SQL is in one place and can be tuned when needed, though others may argue that it is easier to debug if the query statements are on the front-end.
No ORM framework will even get close to hand-written SQL in terms of speed. 100 queries seems unrealistic (maybe you are exaggerating a bit), but even if the creator of the ORM framework were writing the code, it would still be far from the speed of good old SQL.
My advice is, look at the whole picture not only speed:
Does the framework improve code readability?
Is your team comfortable with writing SQL and mixing it with code?
Do you really understand how to optimize the framework queries? (I think a get() for each object is not the optimal way of retrieving them)
Do the queries (after optimization) of the framework present a bottleneck?
I've never developed anything with PHP, but I think that you could mix both approaches (ORM and plain SQL): after a thorough profiling of the app you can determine the real bottlenecks, and only then replace that ORM code with hand-written SQL. (Usually in Ruby you use ActiveRecord, then you profile the application with something like New Relic, and finally, if you have a complicated AR query, you replace it with some SQL.)
Regards
Trust your experience.
To avoid repeating yourself so much in the code, you could write some simple model functions with your own SQL. This is what I do all the time and I am happy with it.
Much of the "convenience" stuff was written for people who need magic because they cannot do it by hand or just don't have the experience.
And after all it's a question of style.
Don't hesitate to add your own layer, or to exchange or extend a given layer with your own stuff. Keep it clean, make a good design and write some documentation so you feel at home when you come back later.