I'm investigating a new project which will be a social-networking-style site. I'm reading up on RavenDB and I like the look of a lot of its features. I've not read up on NoSQL all that much, but I'm wondering whether there's a niche it fits best, with old-school SQL still being the best choice for other stuff.
I'm thinking that the permissions plug-in would be ideal for a social-networking-style site - but will it really perform in an environment where the database will be getting hammered? Or is it optimised for a more reporting-style system, where it's possible to keep throwing new data structures at the database and report on those structures?
I'm eager to use the right tool for the job - I'll be using MVC3 and Windsor, plus either NHibernate + SQL Server or RavenDB.
Should I stick with old-school SQL or go with the new kid on the block, RavenDB?
This question comes very close to being subjective (even though it's really not). You're talking about NoSQL as if it were just one thing, and that is not the case.
You have
graph databases (Neo4j etc),
map/reduce style document databases (Couch, Raven),
document databases which attempt to feel like ordinary databases (Mongo),
key/value stores (Cassandra etc),
and more besides.
Each of them attempts to solve a different problem via different means, and whether you'd use one of them over a traditional relational store is
A matter of suitability
A matter of personal preference
At the end of the day, for the primary data-storage for a single system, a document database or relational store is probably what you want, although for different parts of your system you may well end up utilising a graph database (For calculating neighbours etc), or a key/value store (like Facebook does/did for inbox messages).
The main benefit of choosing a document store over a relational one as your primary store is that you don't have to worry about mapping your objects into a collection of tables, and there is less configuration overhead involved.
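To make the "no mapping into tables" point concrete, here is a minimal sketch in Python. It is not RavenDB's API: the `documents` dict is only a stand-in for a document store, and the `Post` and `Comment` classes are made up for illustration. The point is that the whole object graph is saved and loaded as one document.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import Dict, List


@dataclass
class Comment:
    author: str
    text: str


@dataclass
class Post:
    id: str
    author: str
    body: str
    comments: List[Comment] = field(default_factory=list)


# Stand-in for a document store: one JSON document per key.
# (Hypothetical; a real store would persist this and index it.)
documents: Dict[str, str] = {}


def save(post: Post) -> None:
    # The nested comments travel with the post; no separate table or join.
    documents[post.id] = json.dumps(asdict(post))


def load(post_id: str) -> Post:
    raw = json.loads(documents[post_id])
    raw["comments"] = [Comment(**c) for c in raw["comments"]]
    return Post(**raw)
```

In a relational schema the same data would typically need a posts table, a comments table and a foreign key between them, plus the code or mapping that stitches them back together.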
The other downside/upside would be that you have to learn something new and make mistakes along the way.
So my answer if I am going to be direct?
RavenDB would be suitable
SQL would be suitable
Which do you prefer to use? These days I'd probably just go for Raven, knowing I can dump data into a relational store for reporting purposes and probably do likewise for other parts of my system. Getting free-text search and fastish writes/fast reads without going through the effort of defining separate read/write stores is an overall win.
But that's me, and I am biased.
Currently I use a lot of the ADO.NET classes (SqlConnection, SqlCommand, SqlDataAdapter etc.) to make calls to our SQL Server. Is that a bad idea? I mean, it works. However, I see many articles that use an ORM (NHibernate, SubSonic etc.) to connect to SQL. Why are those better? I am just trying to understand why I would need to change this at all.
Update:
I did check the following tutorial on using NHibernate with stored procedures.
http://ayende.com/Blog/archive/2006/09/18/UsingNHibernateWithStoredProcedures.aspx
However, it looks to me like this is way overkill. Why would I have to create a mapping file? Even if I create a mapping file, let's say my table changes: then my code won't work anymore. However, if I use ADO.NET to return a simple DataTable, then my code will still work. Am I missing something here?
There's nothing wrong with using the basic ADO.NET classes.
You might just have to do a lot more manual work than necessary. If you e.g. select your top 10 customers from a table with SqlCommand and SqlDataReader, it's up to you to iterate over the results and pull out each and every single item of data (like customer number, customer name, and so forth), and you're dealing very closely with the database structures, e.g. rows and columns. That's fine for some scenarios, but too much work in others.
What an ORM gives you is a lot of this "grunt work" being handled for you. You just tell it to get a list of your top 10 customers - as "Customer" objects. The ORM will go off and grab the data (most likely using SqlCommand and SqlDataReader), then pull out the bits and pieces and assemble nice "Customer" objects for you. Those are a lot easier to use, since they are what your code is dealing with - Customer objects.
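The difference can be sketched in a few lines. This is Python with the standard `sqlite3` module rather than ADO.NET, and the `customers` table, the `Customer` class and the `top_by` helper are all made up for illustration. The first function does the row-by-row work by hand; the second is a toy version of what an ORM automates.

```python
import sqlite3
from dataclasses import dataclass, fields


@dataclass
class Customer:
    number: int
    name: str
    revenue: float


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (number INTEGER, name TEXT, revenue REAL)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, "Acme", 500.0), (2, "Globex", 900.0), (3, "Initech", 100.0)],
)


def top_customers_by_hand(limit):
    # Manual approach: iterate the rows and pull each column out by position.
    cursor = conn.execute(
        "SELECT number, name, revenue FROM customers "
        "ORDER BY revenue DESC LIMIT ?",
        (limit,),
    )
    result = []
    for row in cursor:
        result.append(Customer(number=row[0], name=row[1], revenue=row[2]))
    return result


def top_by(cls, table, order_column, limit):
    # Toy "ORM": derive the column list from the class and build objects
    # generically. table and order_column must be trusted identifiers here,
    # never user input, because they are interpolated into the SQL.
    columns = [f.name for f in fields(cls)]
    sql = (
        f"SELECT {', '.join(columns)} FROM {table} "
        f"ORDER BY {order_column} DESC LIMIT ?"
    )
    return [cls(*row) for row in conn.execute(sql, (limit,))]
```

Both functions return the same `Customer` objects; the generic one just never mentions a column by name, which is where the saved grunt work comes from.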
So there's definitely nothing wrong with using ADO.NET and it's a good thing if you know how it works - but an ORM can save you a lot of tedious, repetitive and boring grunt work and let you focus on your real business problems on the object level.
Marc
First of all, the ORMs are likely to do a much better job at producing the SQL queries than your normal non-SQL specialized Joe :)
Secondly, ORMs are a great way to somewhat "standardize" your DALs, increasing flexibility over different projects.
And lastly, with a good ORM, you're likely to have an easier time substituting your underlying data source, as a good ORM will have many different dialects. Of course, this is just a side bonus :)
ORMs are great for avoiding code repetition. You often find that your object model and database model are extremely close to each other, and whenever you add a field you'll be adding it to the database, your objects, your SQL statements and everywhere else. If you use an ORM then you change your code in one place and it builds the rest for you.
As for performance, this can go either way. You will probably find that a lot of the simple SQL that is written for you is extremely tailored, with various shortcuts that you would have been too lazy to write, such as only returning the absolutely required data. On the other hand, if you have some extremely complex queries and joins that an automated system could not possibly build, then you're better off keeping these written yourself.
In summary though, they're fantastic for fast builds!
You don't need to change. If SqlConnection, SqlCommand, etc. work for you then that's great.
They work just peachy fine for the DB app I'm developing, and I have dozens of concurrent users with no problems.
There's nothing wrong with using straight-up ADO.NET, but using an ORM will save you time, both in development and maintenance. That's the biggest benefit.
One thing to consider: will a future "new developer" be more inclined to know or learn a well documented and widely adopted OR/M or your custom data access layer?
The number one thing for me though is the time. Minutes with my favorite OR/M, nHibernate vs. hours/days writing a custom data access layer using ADO.NET.
I also favor OR/Ms because maintaining declarative XML mappings is way easier than maintaining potentially thousands of lines of imperative code... or worse thousands of lines of C# data access code on top of thousands of lines of stored procedure code. In my current project I have 58 objects mapped in 58 XML mapping files, each with less than 50 lines. I cringe when I think about writing/maintaining CRUD code for 58 entities in ADO.NET.
I must warn you to read the documentation. Many, dare I say most, folks with whom I've worked will jump on a tool like mice on cheese, but they'll never read the documentation and learn the technology. I recommend reading the docs BEFORE moving to a new technology like NHibernate. A good cup o' joe and an hour or two of hard reading beforehand will pay dividends.
I haven't seen the general idea of ORM pointed out in any answer. The general idea of an ORM is to perform object-relational mapping and provide your business classes with persistence. It means that you think only about your business logic and let the ORM tool save its state for you. Of course there are a lot of different scenarios. As was already said, there is nothing wrong with using pure ADO.NET, and maybe your application (which is already written in this style) won't get any benefit, but using an ORM tool in new projects is a very good idea. As for the rest, I totally agree with the other answers.
I am in the process of replacing the framework for a fairly complex business web application. Our application runs on a LAMP platform and the new framework will be an extension of CodeIgniter. In my research for framework design I decided to look into ORM. I have never done ORM before and I wanted to know if it would be valuable for our application. Then I stumbled on a very interesting blog post entitled "Why I Do Not Use ORM." This post seemed to confirm many of my worries about using ORM and it also presented a solution similar to what I was already planning.
By "data dictionary" I plan to use this definition from "The Database Programmer" blog:
The term "data dictionary" is used by many, including myself, to denote a separate set of tables that describes the application tables. The Data Dictionary contains such information as column names, types, and sizes, but also descriptive information such as titles, captions, primary keys, foreign keys, and hints to the user interface about how to display the field.
So in choosing a "data dictionary" over ORM I may be exhibiting confirmation bias. Regardless, here are my reasons for being wary of ORM:
I have never used ORM before, I don't know much about it.
This framework needs to be built rather quickly, my boss has little time and I need to produce a working application that will allow for a smooth upgrade to a more modern framework.
My boss already thinks that I am over-engineering this framework (trust me, I am nowhere close to that) and is paranoid about the framework preventing us from doing things that we need to do, and about it creating bugs that we can't solve in the required amount of time. So far I have done a poor job of convincing him that change is good. I am not a very effective salesman, and while the other developers can help me, the boss still needs a lot of assurance.
Our old framework is procedural, our code is PHP, and our developers know SQL very well. ORM would be a big change.
Our database has dozens of tables, many with hundreds of thousands of entries, running on a fairly old server. In the past we have been burned by code that repeatedly polls the database in a loop instead of doing one query to pull all of the needed data at once. Avoiding this problem with hand-coded SQL is rather straightforward. Ensuring that this always happens where necessary with ORM is a huge unknown to me and appears to be risky.
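The "polls the database in a loop" problem from the last point is often called the N+1 query pattern. Here is a minimal sketch of it in Python with the standard `sqlite3` module; the `authors` and `posts` tables and the query-recording `run` helper are assumptions made up for the example, not anything from the application described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO authors VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO posts VALUES (1, 1, 'First'), (2, 2, 'Second'), (3, 1, 'Third');
""")

# Every statement sent to the database is recorded here so it can be counted.
queries = []


def run(sql, params=()):
    queries.append(sql)
    return conn.execute(sql, params).fetchall()


def titles_with_authors_looped():
    # One query for the posts, then one more per post: the "N+1" pattern.
    result = []
    for title, author_id in run("SELECT title, author_id FROM posts ORDER BY id"):
        (name,) = run("SELECT name FROM authors WHERE id = ?", (author_id,))[0]
        result.append((title, name))
    return result


def titles_with_authors_joined():
    # The same data pulled in a single round trip with a join.
    return run(
        "SELECT p.title, a.name FROM posts p "
        "JOIN authors a ON a.id = p.author_id ORDER BY p.id"
    )
```

With three posts the looped version issues four queries and the joined version one. An ORM that lazy-loads related objects can produce the first shape without any visible loop over SQL, which is why it is worth logging the statements it emits.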
Regardless, the solution of the data dictionary seems very promising to me as this blog post "Using A Data Dictionary" seems to provide a lot of useful features and some that are requirements of the new framework. Here are my reasons for preferring the data dictionary solution:
Implementing access control rules on the table rows themselves would be invaluable.
Auto-generating database changes, documentation, and schema checking would also be useful.
One requirement of the framework is a generic data history/auditing feature that can be applied to any sub-feature within our application. A data dictionary or an equivalent is essentially required to provide such a feature. The history must have detailed information about the structure and data types within the database.
Our systems hold a wide variety of data types that would be more properly addressed if they were treated as formal types within the application. For one, HTML fragments (of which we have many in our data; they are required) need to be encoded as entities in some cases, decoded as HTML in others, parsed for links and images in some cases, and always validated for correctness. Then there are dates, measurements, currency, and various other fields that could benefit from having a clear definition in the code of how this data should be manipulated.
The data dictionary that I would like to implement would be a series of objects in separate PHP files, and there will be plenty of OOP; however, it will be used in a manner very similar to the data dictionary concept presented in "The Database Programmer" blog. It would be the single source definition of the complete database schema for the entire framework.
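As a rough illustration of the idea, here is a minimal sketch of a code-level data dictionary that drives both table creation and schema checking. It is written in Python with `sqlite3` for brevity rather than PHP, and every name in it (`Column`, `DICTIONARY`, the `customers` table) is hypothetical; it is not taken from "The Database Programmer" blog.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    name: str
    sql_type: str
    caption: str  # descriptive text a UI or generated docs could use
    primary_key: bool = False


# Hypothetical dictionary entry: the single source of truth for one table.
DICTIONARY = {
    "customers": [
        Column("id", "INTEGER", "Customer number", primary_key=True),
        Column("name", "TEXT", "Customer name"),
        Column("notes_html", "TEXT", "Notes (HTML fragment)"),
    ],
}


def create_table_sql(table):
    # Generate the DDL from the dictionary instead of writing it by hand.
    parts = []
    for col in DICTIONARY[table]:
        suffix = " PRIMARY KEY" if col.primary_key else ""
        parts.append(f"{col.name} {col.sql_type}{suffix}")
    return f"CREATE TABLE {table} ({', '.join(parts)})"


def schema_mismatches(conn, table):
    # Compare the live schema with the dictionary and return the names of
    # columns that are missing, unexpected, or have a different type.
    live = {row[1]: row[2] for row in conn.execute(f"PRAGMA table_info({table})")}
    expected = {col.name: col.sql_type for col in DICTIONARY[table]}
    return sorted(
        name
        for name in live.keys() | expected.keys()
        if live.get(name) != expected.get(name)
    )
```

The same `Column` entries could carry the extra metadata mentioned above, such as access rules, audit flags or a formal type like "HTML fragment", without changing the shape of the approach.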
Now my question is, am I overlooking the value of ORM or is this a case where a data dictionary is the right tool for the job?
I think your question would be more interesting if you were making an initial architectural decision rather than refactoring an existing application. I don't see a single assertion in your question that suggests a problem that designing in an ORM would address, but I see several that it would create. If two major stakeholder groups (the owner and the other developers) are more comfortable with a more conventional design, it seems to me that an ORM would be swimming upstream.
I can imagine the (possibly undeserved) opprobrium that would be attached to the ORM as soon as a query is slow or transaction locking problems start emerging. Not to mention the impact on the development schedule. Why create an unquantified risk factor?
Do you have a framework which supports building applications using a "Data Dictionary"? If so, give it a try; it might solve your problems. If you don't, then there are lots of good, working ORM frameworks out there which have large communities and which come with source (so you can fix bugs yourself even if the "vendor" refuses to help you).
If you want to get a quick glance at a nice web-based ORM framework, I suggest Django or TurboGears. They are based on Python, which will be a nice change after using PHP. I usually prefer TurboGears, but Django seems to be smoother at the moment. Both are easy to set up and you should be able to build a prototype in a day or two. That will give you an idea of the odds.
PS: I also don't think ORM tools are TEH SOLUT10N. I use Hibernate or SQLAlchemy when it makes sense, but I often roll my own simple mappers.
I think that you have made a very good analysis of your situation. You know why you chose the Data Dictionary approach. So go for it.
Later on you might reconsider. If so, then it should not be a problem to use the Data Dictionary and an ORM in parallel for new developments. The two technologies are not mutually exclusive.
If you don't like the idea of mixing different technologies: Stick to a solid OOP design and separate concerns between domain logic and data access cleanly, then switching to an ORM shouldn't be that painful or at least possible.
Good luck!
Closed. This question is opinion-based. It is not currently accepting answers.
Closed 9 years ago.
If you had to justify the "pros" of an ORM, and why you would use one, to management or a client, what would those reasons be?
Try and keep one reason per answer so that we can see which one gets voted up as the best reason.
The most important reason to use an ORM is so that you can have a rich, object-oriented business model and still be able to store it and write effective queries quickly against a relational database. From my viewpoint, I don't see any real advantages that a good ORM gives you when compared with other generated DALs, other than the advanced types of queries you can write.
One type of query I am thinking of is a polymorphic query. A simple ORM query might select all shapes in your database. You get a collection of shapes back. But each instance is a square, circle or rectangle according to its discriminator.
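A polymorphic query of this kind boils down to choosing a class per row from the discriminator column. The sketch below shows that step by hand in Python with `sqlite3`; the `shapes` table, its `kind` discriminator and the two shape classes are assumptions made up for the example, and a real ORM would configure the same thing through its mappings.

```python
import math
import sqlite3
from dataclasses import dataclass


@dataclass
class Square:
    side: float

    def area(self):
        return self.side ** 2


@dataclass
class Circle:
    radius: float

    def area(self):
        return math.pi * self.radius ** 2


# Discriminator value stored in the table -> class to instantiate.
SHAPE_TYPES = {"square": Square, "circle": Circle}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE shapes (id INTEGER PRIMARY KEY, kind TEXT, size REAL)")
conn.executemany(
    "INSERT INTO shapes (kind, size) VALUES (?, ?)",
    [("square", 2.0), ("circle", 1.0)],
)


def all_shapes():
    # One query over the whole hierarchy; each row becomes an instance of
    # the class named by its discriminator.
    rows = conn.execute("SELECT kind, size FROM shapes ORDER BY id")
    return [SHAPE_TYPES[kind](size) for kind, size in rows]
```

The caller asks for "all shapes" and can call `area()` on each result without knowing which concrete class came back.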
Another type of query would be one that eagerly fetches an object and one or more related objects or collections in a single database call. e.g. Each shape object is returned with its vertex and side collections populated.
I'm sorry to disagree with so many others here, but I don't think that code generation is a good enough reason by itself to go with an ORM. You can write or find many good DAL templates for code generators that do not have the conceptual or performance overhead that ORMs do.
Or, if you think that you don't need to know how to write good SQL to use an ORM, again, I disagree. It might be true that from the perspective of writing single queries, relying on an ORM is easier. But with ORMs it is far too easy to create poorly performing routines when developers don't understand how their queries work with the ORM and the SQL they translate into.
Having a data layer that works against multiple databases can be a benefit. It's not one that I have had to rely on that often though.
In the end, I have to reiterate that in my experience, if you are not using the more advanced query features of your ORM, there are other options that solve the remaining problems with less learning and fewer CPU cycles.
Oh yeah, some developers do find working with ORM's to be fun so ORM's are also good from the keep-your-developers-happy perspective. =)
Speeding development. For example, eliminating repetitive code like mapping query result fields to object members and vice-versa.
Making data access more abstract and portable. ORM implementation classes know how to write vendor-specific SQL, so you don't have to.
Supporting OO encapsulation of business rules in your data access layer. You can write (and debug) business rules in your application language of preference, instead of clunky trigger and stored procedure languages.
Generating boilerplate code for basic CRUD operations. Some ORM frameworks can inspect database metadata directly, read metadata mapping files, or use declarative class properties.
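The "vendor-specific SQL" point can be illustrated with row limiting, which SQL Server spells `TOP` while SQLite, MySQL and PostgreSQL use `LIMIT`. The function below is a hypothetical sketch of the kind of dialect switch an ORM hides from you, not any particular library's API.

```python
def top_n_sql(dialect, table, columns, order_by, n):
    # table, columns and order_by are assumed to be trusted identifiers.
    column_list = ", ".join(columns)
    n = int(n)
    if dialect == "sqlserver":
        return f"SELECT TOP ({n}) {column_list} FROM {table} ORDER BY {order_by} DESC"
    if dialect in ("sqlite", "mysql", "postgresql"):
        return f"SELECT {column_list} FROM {table} ORDER BY {order_by} DESC LIMIT {n}"
    raise ValueError(f"unsupported dialect: {dialect}")
```

Application code asks for "the top 10 customers" once, and the dialect chosen in configuration decides which of the two statements is actually sent.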
You can move to different database software easily because you are developing to an abstraction.
Development happiness, IMO. ORM abstracts away a lot of the bare-metal stuff you have to do in SQL. It keeps your code base simple: fewer source files to manage and schema changes don't require hours of upkeep.
I'm currently using an ORM and it has sped up my development.
So that your object model and persistence model match.
To minimise duplication of simple SQL queries.
The reason I'm looking into it is to avoid the generated code from VS2005's DAL tools (schema mapping, TableAdapters).
The DAL/BLL I created over a year ago was working fine (for what I had built it for) until someone else started using it to take advantage of some of the generated functions (which I had no idea were there).
It looks like it will provide a much more intuitive and cleaner solution than the DAL/BLL solution from http://wwww.asp.net
I was thinking about creating my own SQL command C# DAL code generator, but the ORM looks like a more elegant solution.
Abstract the SQL away 95% of the time so not everyone on the team needs to know how to write super-efficient, database-specific queries.
I think there are a lot of good points here (portability, ease of development/maintenance, focus on OO business modeling etc), but when trying to convince your client or management, it all boils down to how much money you will save by using an ORM.
Do some estimations for typical tasks (or even larger projects that might be coming up) and you'll (hopefully!) get a few arguments for switching that are hard to ignore.
Compilation and testing of queries.
As the tooling for ORMs improves, it is easier to determine the correctness of your queries faster through compile-time errors and tests.
Compiling your queries helps developers find errors faster. Right? Right. This compilation is made possible because developers are now writing queries in code using their business objects or models instead of just strings of SQL or SQL-like statements.
If using the correct data access patterns in .NET, it is easy to unit test your query logic against in-memory collections. This speeds the execution of your tests because you don't need to access the database, set up data in the database or even spin up a full-blown data context.[EDIT]This isn't as true as I thought it was, as unit testing in memory can present difficult challenges to overcome. But I still find these integration tests easier to write than in previous years.[/EDIT]
This is definitely more relevant today than a few years ago when the question was asked, but that may only be the case for Visual Studio and Entity Framework, where my experience lies. Plug in your own environment if possible.
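As a loose analogue of that testing style, here is a sketch in Python rather than .NET: the query logic is written against any iterable of objects, so a test can feed it a plain list instead of a database. The `Order` class and `unshipped_over` function are made up for illustration. As the edit above warns, a query that behaves one way in memory may behave differently once it is translated to SQL, so this complements integration tests rather than replacing them.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Order:
    customer: str
    total: float
    shipped: bool


def unshipped_over(orders: Iterable[Order], minimum: float) -> List[Order]:
    # Query logic expressed on objects: filter, then sort by total descending.
    matching = (o for o in orders if not o.shipped and o.total >= minimum)
    return sorted(matching, key=lambda o: o.total, reverse=True)
```

In a test the `orders` argument is just a list built in the test itself; in the application it would come from whatever the data layer returns.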
.netTiers using CodeSmith templates
http://nettiers.com/default.aspx?AspxAutoDetectCookieSupport=1
Why code something that can be generated just as well?
Convince them of how much time and money you will save when changes come in and you don't have to rewrite your SQL, since the ORM tool will do that for you.
I think one con is that an ORM will require some changes to your POJOs, mainly related to schema, relations and queries. So consider a scenario where you are not supposed to change the model objects, perhaps because they are shared among more than one project or between client and server. In such cases you will need to split the model into two levels, which requires additional effort.
I am an Android developer and, as you know, mobile apps are usually not huge, so this additional effort to segregate a pure model from an ORM-affected model does not seem worthwhile.
I understand that the question is a generic one, but mobile apps also fall under that generic umbrella.
I am working on a few PHP projects that use MVC frameworks, and while they all have different ways of retrieving objects from the database, it always seems that nothing beats writing your SQL queries by hand as far as speed and cutting down on the number of queries.
For example, one of my web projects (written by a junior developer) executes over 100 queries just to load the home page. The reason is that in one place, a method will load an object, but later on deeper in the code, it will load some other object(s) that are related to the first object.
This leads to the other part of the question which is what are people doing in situations where you have a table that in one part of the code only needs the values for a few columns, and another part needs something else? Right now (in the same project), there is one get() method for each object, and it does a "SELECT *" (or lists all the columns in the table explicitly) so that anytime you need the object for any reason, you get the whole thing.
So, in other words, you hear all the talk about how SELECT * is bad, but if you try to use an ORM class that comes with the framework, it usually wants to do just that. Are you stuck choosing between an ORM with SELECT * and writing the specific SQL queries by hand? It just seems to me that we're stuck between convenience and efficiency, and if I hand-write the queries and later add a column, I'm most likely going to have to add it in several places in the code.
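One possible middle ground between the two extremes is a single `get()` that takes an optional column list, checked against one definition of the table's columns. The sketch below is in Python with `sqlite3` rather than PHP, and the `users` table, `USER_COLUMNS` and `get_user` are hypothetical names for illustration.

```python
import sqlite3

# The one place where the table's columns are listed; adding a column
# to the table means adding it here and nowhere else.
USER_COLUMNS = ("id", "name", "email", "bio")


def get_user(conn, user_id, columns=USER_COLUMNS):
    # Only whitelisted column names are ever interpolated into the SQL.
    unknown = set(columns) - set(USER_COLUMNS)
    if unknown:
        raise ValueError(f"unknown columns: {sorted(unknown)}")
    sql = f"SELECT {', '.join(columns)} FROM users WHERE id = ?"
    row = conn.execute(sql, (user_id,)).fetchone()
    return dict(zip(columns, row)) if row else None
```

A page that only needs the name calls `get_user(conn, 1, ("name",))`, while code that needs the whole record uses the default and still names every column explicitly instead of relying on `SELECT *`.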
Sorry for the long question, but I'm explaining the background to get some mindsets from other developers rather than maybe a specific solution. I know that we can always use something like Memcached, but I would rather optimize what we can before getting into that.
Thanks for any ideas.
First, assuming you are proficient at SQL and schema design, there are very few instances where any abstraction layer that removes you from the SQL statements will exceed the efficiency of writing the SQL by hand. More often than not, you will end up with suboptimal data access.
There's no excuse for 100 queries just to generate one web page.
Second, if you are using the Object Oriented features of PHP, you will have good abstractions for collections of objects, and the kinds of extended properties that map to SQL joins. But the important thing to keep in mind is to write the best abstracted objects you can, without regard to SQL strategies.
When I write PHP code this way, I always find that I'm able to map the data requirements for each web page to very few, very efficient SQL queries if my schema is proper and my classes are proper. And not only that, but my experience is that this is the simplest and fastest way to implement. Putting framework stuff in the middle between PHP classes and a good solid thin DAL (note: NOT embedded SQL or dbms calls) is the best example I can think of to illustrate the concept of "leaky abstractions".
I got a little lost with your question, but if you are looking for a way to do database access, you can do it a couple of ways. For example, your MVC code can use Zend Framework, which comes with database access abstractions.
Also keep in mind that you should design your system well to ensure there is no contention in the database. If your queries are scattered across the PHP pages, they may lock tables, and the overall web application's performance will deteriorate over time.
That is why it is sometimes preferable to use stored procedures: the SQL lives in one place and can be tuned when needed, though others may argue that it is easier to debug if the query statements are on the front end.
No ORM framework will even get close to hand-written SQL in terms of speed. 100 queries seems unrealistic (maybe you are exaggerating a bit), but even if the creator of the ORM framework wrote the code, it would still be far from the speed of good old SQL.
My advice is, look at the whole picture not only speed:
Does the framework improve code readability?
Is your team comfortable with writing SQL and mixing it with code?
Do you really understand how to optimize the framework queries? (I think a get() for each object is not the optimal way of retrieving them)
Do the queries (after optimization) of the framework present a bottleneck?
I've never developed anything with PHP, but I think that you could mix both approaches (ORM and plain SQL). After a thorough profiling of the app you can determine the real bottlenecks and only then replace that ORM code with hand-written SQL. (Usually in Ruby you use ActiveRecord, then you profile the application with something such as New Relic, and finally, if you have a complicated AR query, you replace it with some SQL.)
Regards
Trust your experience.
To avoid repeating yourself so much in the code, you could write some simple model functions with your own SQL. This is what I do all the time and I am happy with it.
Much of the "convenience" stuff was written for people who need magic because they cannot do it by hand or just don't have the experience.
And after all it's a question of style.
Don't hesitate to add your own layer, or to exchange or extend a given layer with your own stuff. Keep it clean, make a good design and write some documentation so you feel at home when you come back later.