I'm a research assistant for a university. We're retooling our Software Architecture subject, hoping to "modernize", and address some of the teaching and collaborative learning issues we've discovered in past semesters.
Students are asked to rapidly build a prototype of their architected system using Eclipse.
For persistence, we've guided students to HSQLDB.
Last semester we received significant feedback that writing the Data Access Layer and mapping to OO took a lot of time. The time spent on this plumbing work could be better spent on more relevant things, like scaling, end-to-end perf or satisfying more scenarios.
In a real-world production, I'd pick an ORM tech, like Hibernate, but the subject is already too complex to teach yet another technology (and Hibernate is a massive one to learn for students IMHO).
So, my questions to the SO community:
Should we consider giving students an object oriented database (if they still exist)? This would save time on ORM and plumbing.
Should we stick with RDBMS and tell students to roll their own ORM?
Should we point students to a lightweight, simple ORM?
Remember, this isn't real world, but we'd like to teach real world skills as much as possible. Teaching ORM is not as important as getting students to rapidly prototype a system that satisfies the scenarios.
I'm a C# dev at heart but the students are only familiar with Java when they come into the subject.
I have to wholeheartedly disagree with the use of ORM in an educational environment. You need to be able to walk before you run, and utilizing ORM eliminates a very important step in the learning process regarding using a relational database in an application. You should stick to a VERY lightweight data access framework--one that requires students to write their own SQL and (at best) doesn't allow this scope of the code to be tied to the UI or (at least) doesn't require it.
I am admittedly unfamiliar with the Java world as it relates to actual enterprise development, but I realize (somewhat begrudgingly) that it's THE environment in the educational system. While I believe that students should be exposed to .NET, that's an argument for another time ;) In any event, I'm quite certain that there's some sort of framework out there that satisfies this.
I'd be willing to wager that there is something out there that provides some code generation functionality. At my old job, we only made use of the .NET database libraries to the point of (relatively) low-level reading of the data from the database. We didn't use any of the repositories or change-tracking technologies, but instead rolled our own. SQL commands were written by hand, but the framework still provided type safety and rich designer support. My point here is that both are possible. I would suggest finding something similar to this in Java, requiring the students to hand-code one or two of these adapters to gain an understanding of what goes into them, then have the code generator do the other "gruntwork" plumbing based on their SQL statements.
Don't use any SQL generation. You MUST be able to write the SQL before you let the computer do it for you. The second you use ORM to do something that you don't know how to do in SQL is the second that you've lost control of your database model, and they need to understand that.
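To make the "hand-code one or two adapters" idea concrete, here is a minimal sketch of what such an adapter could look like. It is written in Python with the standard-library sqlite3 module purely as a stand-in for Java and HSQLDB; the Student class, the StudentGateway name and the table layout are all made up for illustration. The point is that every SQL statement is written by hand and nothing outside the gateway touches the database.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Student:
    """Plain domain object; it knows nothing about SQL (hypothetical example class)."""
    id: int
    name: str


class StudentGateway:
    """Hand-coded adapter: every SQL statement is spelled out explicitly."""

    def __init__(self, conn):
        self.conn = conn

    def create_table(self):
        self.conn.execute(
            "CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
        )

    def insert(self, name):
        cur = self.conn.execute("INSERT INTO student (name) VALUES (?)", (name,))
        return Student(cur.lastrowid, name)

    def find_by_id(self, student_id):
        row = self.conn.execute(
            "SELECT id, name FROM student WHERE id = ?", (student_id,)
        ).fetchone()
        return Student(*row) if row else None

    def find_all(self):
        rows = self.conn.execute("SELECT id, name FROM student ORDER BY id")
        return [Student(*r) for r in rows]


# In-memory database so the sketch runs without any setup.
conn = sqlite3.connect(":memory:")
gateway = StudentGateway(conn)
gateway.create_table()
ada = gateway.insert("Ada")
gateway.insert("Linus")
```

After writing one or two of these by hand, it becomes obvious which parts are repetitive enough to hand over to a code generator.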
As someone who embraced OO heartily, and RDBMS grudgingly... I would encourage, even beg, computer science departments to keep theory and practice of relational databases in the foreground. If that can be done even while you use an ORM for the OO classes, then go for it. But I'd prefer to see students coming out of college understanding that OO - relational mapping is hard, and why it's hard, and that this does not mean that the relational model is broken.
Not very broken, anyway.
If the primary goal is for the students to learn, then I think using an RDBMS would be the better approach here - they already have to grok the object model on the application side, so reconciling a relational structure into an overall architectural portfolio is an important skill.
Regarding providing an ORM such as Hibernate, I don't really agree that it is something massive for students to learn. One of the best things about Hibernate is that its difficulty level is fairly well correlated to how deep you dig into it. It has a very low barrier of entry (less than a day IMO) to get rolling with the basics, and often the basics are all you need - especially for something like a prototype, which is what you said is the goal of this activity. Certainly it does not require study to such a degree that it will stick with the students beyond the scope of the course. Basic Hibernate use can be a throwaway skill.
So to summarize, I recommend sticking with an RDBMS and providing an ORM like Hibernate.
Take a look at DataObjects.Net - it shares the benefits of an OR/M framework as well as of an object database (if built-in storage providers are used), allowing you to migrate transparently between supported storages.
It is quite advanced from the point of architecture and extensions: check out e.g. this post about its query optimization techniques.
I think it depends a lot on what the students know coming into the course. The reason I say this is that it's probably best to start with something they are all familiar with and move forward from there. In my experience, most students understand what objects are and how to use them, so presenting SQL tables as objects seems like a great place to start.
If you agree with me so far, then you might also agree that ORM is a great way to transition young programmers from their comfort zone of object-oriented programming into a new world of database programming.
I'm not really familiar with the tools available for Java to implement ORM, but I was able to pick it up in C# (using LINQ to MySQL) in just a few days (after failing miserably at trying to learn PHP for several weeks). If implementing the projects in C# is a possibility, the great thing about using LINQ is that it gives students a feel for how the queries might look in SQL, without leaving their comfort zone of OOP (assuming they are at least somewhat comfortable with developing in .NET). This allows you to teach the concepts of database programming, without spending too much time talking about the implementation of it. Then, once they've mastered the concepts, you can roll back and show them how to perform a similar implementation outside of OOP (using SQL, PHP, JSP, etc.)
Not to mention, it gives them a great preview of how they can use the latest .NET technologies to do some pretty advanced stuff without too much effort (which is probably more beneficial to them in the long term anyway).
Good luck!
ORM seems to be a fast-growing model, with both pros and cons. From Ultra-Fast ASP.NET by Richard Kiessig (http://www.amazon.com/Ultra-Fast-ASP-NET-Build-Ultra-Scalable-Server/dp/1430223839/ref=pd_bxgy_b_text_b):
"I love them because they allow me to develop small, proof-of-concept sites extremely quickly. I can side step much of the SQL and related complexity that I would otherwise need and focus on the objects, business logic and presentation. However, at the same time, I also don't care for them because, unfortunately, their performance and scalability is usually very poor, even when they're integrated with a comprehensive caching system (the reason for that becomes clear when you realize that when properly configured, SQL Server itself is really just a big data cache)."
My questions are:
What do you think of Richard's view? Do you agree with him or not? If not, please explain why.
Which fields are best suited to ORM and which to traditional database queries? In other words, where should you use ORM and where should you use traditional database queries, and for which kind or size of application should you undoubtedly choose one over the other?
Thanks in advance
I can't agree with the common complaint that ORMs perform badly. I've seen many plain-SQL applications by now. While it is theoretically possible to write optimized SQL, in reality the developers ruin all the performance gain by writing unoptimized business logic.
When using plain SQL, the business logic gets highly coupled to the db model, and database operations and optimizations are up to the business logic. Because there is no OO model, you can't pass around whole object structures. I've seen many applications which pass around primary keys and retrieve the data from the database on each layer again and again. I've seen applications which access the database in loops. And so on. The problem is: because the business logic is already hard to maintain, there is no room for any more optimizations. Often when you try to reuse at least some of your code, you accept that it is not optimized for each case. The performance gets bad by design.
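A rough illustration of the "database in loops" problem, again as a Python/sqlite3 sketch with invented table and function names (CountingDb, customer, orders are assumptions made for the example, not anything from a real application). Both functions compute the same totals; the first pays one round trip per customer, the second pays one in total.

```python
import sqlite3


class CountingDb:
    """Thin wrapper that counts how many statements are sent to the database."""

    def __init__(self):
        self.conn = sqlite3.connect(":memory:")
        self.count = 0

    def run(self, sql, params=()):
        self.count += 1
        return self.conn.execute(sql, params).fetchall()


db = CountingDb()
db.conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    INSERT INTO customer VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO orders VALUES (1, 1, 10.0), (2, 1, 5.0), (3, 2, 7.5);
""")


def totals_in_loop(db):
    """One query for the keys, then one more query per key."""
    result = {}
    for cid, name in db.run("SELECT id, name FROM customer ORDER BY id"):
        rows = db.run(
            "SELECT COALESCE(SUM(total), 0) FROM orders WHERE customer_id = ?",
            (cid,),
        )
        result[name] = rows[0][0]
    return result


def totals_in_one_query(db):
    """The same answer from a single join with grouping."""
    rows = db.run("""
        SELECT c.name, COALESCE(SUM(o.total), 0)
        FROM customer c LEFT JOIN orders o ON o.customer_id = c.id
        GROUP BY c.id, c.name
        ORDER BY c.id
    """)
    return dict(rows)
```

With three customers the difference is four statements versus one; with thousands of rows and a network between application and database, the loop version is where the time goes.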
An ORM usually doesn't require the business logic to care too much about data access. Some optimizations are implemented in the ORM. There are caches and the ability to batch. These automatic (and runtime-dynamic) optimizations are not perfect, but they decouple the business logic from data access. For instance, if a piece of data is conditionally used, the ORM loads it using lazy loading on request (exactly once). You don't need to do anything to make this happen.
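Lazy loading "exactly once" can be sketched without any ORM at all. The following is only an approximation of the idea in Python, using functools.cached_property; the Invoice class and the tables are hypothetical, and a real ORM does considerably more (proxies, sessions, invalidation).

```python
import sqlite3
from functools import cached_property

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE invoice (id INTEGER PRIMARY KEY, customer TEXT);
    CREATE TABLE invoice_line (invoice_id INTEGER, item TEXT);
    INSERT INTO invoice VALUES (1, 'Ann');
    INSERT INTO invoice_line VALUES (1, 'pen'), (1, 'ink');
""")


class Invoice:
    loads = 0  # counts how often the lines were fetched, for demonstration only

    def __init__(self, invoice_id, customer):
        self.id = invoice_id
        self.customer = customer

    @cached_property
    def lines(self):
        # Runs on first access only; the result is then stored on the instance.
        Invoice.loads += 1
        rows = conn.execute(
            "SELECT item FROM invoice_line WHERE invoice_id = ? ORDER BY rowid",
            (self.id,),
        )
        return [item for (item,) in rows]


invoice = Invoice(1, "Ann")
```

Code that never touches invoice.lines never triggers the query, and code that touches it repeatedly triggers it once, without the business logic having to track either case.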
On the other hand, ORMs have a steep learning curve. I wouldn't use an ORM for trivial applications, unless the ORM is already in use by the same team.
Another disadvantage of the ORM (actually not of the ORM itself but of the fact that you'll work with both a relational database and an object model) is that the team needs to be strong in both worlds, the relational as well as the OO.
Conclusion:
ORMs are powerful for business-logic centric applications with data structures that are complex enough that having an OO model is advantageous.
ORMs usually have a (somewhat) steep learning curve. For small applications, it could get too expensive.
Applications based on simple data structures, without much logic to manage them, are most probably easier and more straightforward to write in plain SQL.
Teams with a high level of database knowledge and not much experience in OO technologies will most probably be more efficient using plain SQL. (Of course, depending on the applications they write, it may be advisable for the team to switch the focus.)
Teams with a high level of OO knowledge and only basic database experience are most probably more efficient using an ORM. (Same here: depending on the applications they write, it may be advisable for the team to switch the focus.)
ORM is pretty old, at least in the Java world.
Major problems with ORM:
Object-Oriented model and Relational model are quite different.
SQL is a high-level language to access data based on relational algebra, different from any OO language like C#, Java or Visual Basic.Net. Mixing those can give you the worst of two worlds, instead of the best.
For more information search the web on things like 'Object-relational impedance mismatch'
Either way, a good ORM framework saves you quite some boilerplate code. But you still need knowledge of SQL and of how to set up a good SQL database model. Start with creating a good database model using SQL, then base your OO model on that (not the other way around).
However, the above only holds if you really need to use a SQL database. I recommend looking into the NoSQL movement as well. There's stuff like Cassandra and CouchDB. While googling for .NET solutions I found this Stack Overflow question: https://stackoverflow.com/questions/1777103/what-nosql-solutions-are-out-there-for-net
I'm the author of the book with the text quoted in the question.
Let me emphatically add that I am not arguing against using business objects or object oriented programming.
One issue I have with conventional ORM -- for example, LINQ to SQL or Entity Framework -- is that it often leads to developers making DB calls when they don't even realize that they're doing so. This, in turn, is a performance and scalability killer.
I review lots of websites for performance issues, and have found that DB chattiness is one of the most common causes of serious problems. Unfortunately, ORM tends to encourage chattiness, in spades.
The other complaints I have about ORM include:
No support for command batching
No support for multiple result sets
No support for table valued parameters
No support for native async calls (making them from a background thread doesn't count)
Support for SqlDependency and SqlCacheDependency is klunky if/when it works at all
I have no objection to using ORM tactically, to address specific business issues. But I do object to using it haphazardly, to the point where developers do things like make the exact same DB call dozens of times on the same page, or issue hugely expensive queries without considering caching and change notifications, or totally neglect async operations when scalability is a concern.
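As a small illustration of the "same DB call dozens of times on the same page" problem, here is a sketch of a per-request result cache. It is Python with sqlite3 rather than ADO.NET, and RequestScopedDb and the site_user table are invented names; it is not how any particular ORM or ASP.NET feature works. It also ignores invalidation entirely, which is exactly what change notifications are for.

```python
import sqlite3


class RequestScopedDb:
    """Remembers query results for the lifetime of one page request (hypothetical helper)."""

    def __init__(self, conn):
        self.conn = conn
        self.cache = {}
        self.round_trips = 0

    def query(self, sql, params=()):
        key = (sql, params)
        if key not in self.cache:
            # Only an unseen statement/parameter pair reaches the database.
            self.round_trips += 1
            self.cache[key] = self.conn.execute(sql, params).fetchall()
        return self.cache[key]


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE site_user (id INTEGER PRIMARY KEY, display_name TEXT);
    INSERT INTO site_user VALUES (1, 'Ann');
""")


def render_page(db):
    """Pretend a dozen page components each ask for the current user's name."""
    names = []
    for _ in range(12):
        rows = db.query("SELECT display_name FROM site_user WHERE id = ?", (1,))
        names.append(rows[0][0])
    return names


db = RequestScopedDb(conn)
names = render_page(db)
```

Twelve components, one round trip. Creating a fresh RequestScopedDb per request keeps stale data from leaking across pages, at the cost of not caching anything between requests.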
This site uses Linq-to-SQL I believe, and it's 'fairly' high traffic... I think that the time you save from writing the boiler plate code to access/insert/update simple items is invaluable, but there is always the option to drop down to calling a SPROC if you have something more complex, where you know you can write some screaming fast SQL directly.
I don't think that these things have to be mutually exclusive - use the advantages of both, and if there are sections of your application that start to slow down, then you can optimise as you need to.
ORM is far older than both Java and .NET. The first one I knew about was TopLink for Smalltalk. It's an idea as old as persistent objects.
Every "CRUD on the web" framework like Ruby on Rails, Grails, Django, etc. uses ORM for persistence because they all presume that you are starting with a clean sheet object model: no legacy schema to bother with. You start with the objects to model your problem and generate the persistence from it.
It often works the other way with legacy systems: the schema is long-lived, and you may or may not have objects.
It's astonishing how quickly you can get a prototype up and running with "CRUD on the web" frameworks, but I don't see them being used to develop enterprise apps in large corporations. Maybe that's a Fortune 500 prejudice.
Database admins that I know tell me they don't like the SQL that ORMs generate because it's often inefficient. They all wish for a way to hand-tune it.
I agree with most points already made here.
ORMs are not new in .NET: LLBLGen has been around for a long time, and I've been using ORMs for >5 years now in .NET.
I've seen very badly performing code written without ORMs (inefficient SQL queries, bad indexes, nested database calls - ouch!) and bad code written with ORMs - I'm sure I've contributed to some of the bad code too :)
What I would add is that an ORM is generally a powerful and productivity-enhancing tool that allows you to stop worrying about plumbing db code for most of your application and concentrate on the application itself. When you start trying to write complex code (for example reporting pages or complex UIs) you need to understand what is happening underneath the hood - ignorance can be very costly. But, used properly, they are immensely powerful, and IMO won't have a detrimental effect on your app's performance. I for one wouldn't be happy on a project that didn't use an ORM.
Programming is about writing software for business use. The more we can focus on business logic and presentation and less with technicalities that only matter at certain points in time (when software goes down, when software needs upgrading, etc), the better.
Recently I watched a talk about scalability by a Reddit founder (from here), and one line of his that caught my attention was this:
"Having to deal with the complexities of relational databases (relations, joins, constraints) is a thing of the past."
From what I have watched, maintaining a complex database schema becomes a major pain for scalability as the site grows (you add a field, you reassign constraints, re-map foreign keys, etc.). It was not entirely clear to me why that is. They're not using a NoSQL database though; they're on Postgres.
On top of that comes ORM, another layer of abstraction. It simplifies code writing, but often at a performance penalty. For me, a simple database abstraction library will do, much like the lightweight AR libs out there, together with database-specific "plain text" queries. I can't show you any benchmark, but of the ORMs I have seen, most say that "ORM can often be slow".
Richard covers both sides of the coin, so I agree with him.
As for the fields, I really don't quite get the context of the "fields" you are asking about.
As others have said, you can write underperforming ORM code, and you can also write underperforming SQL.
Using ORM doesn't excuse you from knowing your SQL, and understanding how a query fits together. If you can optimize a SQL query, you can usually optimize an ORM query. For example, Hibernate's criteria and HQL queries let you control which associations are joined to improve performance and avoid additional select statements. Knowing how to create an index to improve your most common query can make or break your application performance.
What ORM buys you is uniform, maintainable database access. They provide an extra layer of verification to ensure that your OO code matches up as closely as possible with your database access, and prevent you from making certain classes of stupid mistake, like writing code that's vulnerable to SQL injection. Of course, you can parameterize your own queries, but ORM buys you that advantage without having to think about it.
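For anyone who hasn't seen the difference between a concatenated and a parameterized query, here is a minimal sketch in Python with the standard-library sqlite3 module. The account table and both function names are made up for the example; the same principle applies to JDBC PreparedStatement or ADO.NET parameters.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE account (name TEXT, secret TEXT);
    INSERT INTO account VALUES ('alice', 'a-secret'), ('bob', 'b-secret');
""")


def find_unsafe(name):
    # Vulnerable: the user input becomes part of the SQL text itself.
    sql = "SELECT name FROM account WHERE name = '" + name + "'"
    return conn.execute(sql).fetchall()


def find_safe(name):
    # The value is bound separately, so it is never parsed as SQL.
    return conn.execute(
        "SELECT name FROM account WHERE name = ?", (name,)
    ).fetchall()


# Input crafted to turn the WHERE clause into a condition that is always true.
malicious = "nobody' OR '1'='1"
```

Passing the malicious string to find_unsafe matches every row in the table, while find_safe treats it as an odd-looking name and matches nothing. An ORM gives you the second behaviour by default; with hand-written SQL you have to choose it every time.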
Never got anything but pain and frustration from ORM packages. If I'd write my SQL the way they autogen it - yeah I'd claim to be fast while my code would be slow :-) Have you ever seen SQL generated by an ORM ? Barely has PK-s, uses FK-s only for misguided interpretation of "inheritance" and if it wants to do paging it dumps the whole recordset on you and then discards 90% of it :-))) Then it locks everything in sight since it has to take in a load of records like it went back to 50 yr old IBM's batch processing.
For a while I thought that the biggest problem with ORM was splintering (not going to have a standard in 50 yrs - every year different API, pardon "model" :-) and ideologizing (everyone selling you a big philosophy - always better than everyone else's of course :-) Then I realized that it was really the total amateurism that's the root cause of the mess and everything else is just the consequence.
Then it all started to make sense. ORM was never meant to be performant or reliable - that wasn't even on the list :-) It was an academic, "conceptual" toy from day one, the consolation prize for professors pissed off that all their "relational" research papers in Prolog went down the drain when IBM and Oracle started selling that terrible SQL thing and making a buck :-)
The closest I came to trusting one was LINQ, but only because it's possible and quite easy to kick out all "tracking" and use it just as a deserialization layer for normal SQL code. Then I read how the object that's managing the connection can develop spontaneous failures that sounded like premature GC while it still had some dangling stuff around. No way I was going to risk my neck with it after that - nope, not my head :-)
So, let me make a list:
Totally sloppy code - not going to suffer bugs and poor perf
Not going to take deadlocks from ORM's 10-100 times longer "transactions"
Drastic reduction of capabilities - SQL has huge expressive power these days
Tying you up into fringe and sloppy API (every ORM aims to hijack your codebase)
SQL queries are highly portable and SQL knowledge is totally portable
I still have to know SQL just to clean up ORM's mess anyway
For "proof-of-concept" I can just serialize to binary or XML files
not much slower, zero bug libraries and one XPath can select better anyway
I've actually done heavy traffic web sites all from XML files
if I actually need real graph then I have no use for DB - nothing real to query
I can serialize a blob and dump into SQL in like 3 lines of code
If someone claims that he does it all from DB to UI - keep your codebase locked :-)
and back up your payroll DB - you'll thank me later :-)))
NoSQL bases are more honest than ORM - "we specialize in persistence"
and have better code quality - not surprised at all
That would be the short list :-) BTW, modern SQL engines these days do trees and spatial indexing, not to mention paging without a single record wasted. ORM-s are actually "solving" problems of 10yrs ago and promoting amateurism. To that extent NoSQL, also known as document
I am developing an application which at the moment queries a (rather large) database via ADO.NET and hard-coded SQL statements. Admittedly this is ugly (i.e. no compile time errors thrown if a mistake is made in the SQL) and potentially dangerous (due to SQL injections, etc although this is unlikely to be a problem for this particular application) but this wasn't considered initially because this application is really only interested in a very small subset of tables in this database (at least for now...).
LinqToSQL seemed interesting but because this application is required to have the ability to connect to Oracle databases as well, that plan was a non-starter.
Is a project like mine suitable for integration with an ORM framework or would that be overkill?
I think an ORM should always at least be considered.
But it doesn't sound like you're even using business objects (Sometimes referred to as a Data Access Layer or DAL) which greatly undermines the usefulness of an object oriented language. I would address this first. If you find it's too time consuming to create all the CRUD for the business objects it's time for an ORM...
My personal favorite is nHibernate. Big learning curve but definitely worth it.
I would recommend a generated DAL instead of an ORM, or Linq.
Look into subsonic http://subsonicproject.com/. It is an open source DAL generator that is very easy to learn and use, and has a very low overhead.
I would definitely say that it is a candidate for an ORM framework. The overhead of setting up the ORM is quite small once you have familiarized yourself with a framework, and the benefits are many.
As you say, LinqToSQL is not appropriate if you might need Oracle support, but most other frameworks support Oracle.
If you only use a small subset of the tables, then you will only have to map a small subset of the tables and hence the setup cost will decrease even further.
Good luck!
Try using something that generates sql (Like Linq, only with Oracle), instead of an orm.
Why? Jeff Atwood explains.
Quote:
"At first you're like "whee! objects!" and then you realize-- hey, this is a lot of tedious, error-prone mapping code I didn't have to write before... "
I know this is not a programming question per se, but I wanted to get as much input as possible from the SO community on a new project I hope to get started. The project is being started from scratch, and thus every decision on programming languages, databases, frameworks, platforms and what not is up in the air. I'm hoping to get your opinion on the matter, and what you feel are the strengths and weaknesses of each option.
Database:
Currently I have the option of using MSSQL or MySQL. I am leaning towards using MySQL because it is free and most probably has all the features I need. However, there is the possibility of having a lot of hierarchical data, and the new hierarchical data type in MSSQL is quite appealing. Does it really simplify matters that much? Also, MSSQL supports many more advanced SQL functions that may or may not be useful in the long run. While I can get access to Server 2008 for development, are the costs of multiple licenses, as the development team grows and for production, justified?
Programming Languages:
The project will have a web based front end UI and a server based component that will do some heavy lifting.
For the web based UI, I was thinking of maybe doing Apache/IIS with PHP or IIS with ASP.Net in C#. I'd like to use a good framework that properly utilizes good design patterns to structure the code and development of the app, and that makes modifications easy to implement in the long run. I also want the GUI to look good and don't like the idea of buying .Net controls from component vendors. Instead I prefer the idea of using good CSS and open source libraries like YUI and JavaScript to make the UI sleek.
For the server based component, I was thinking of using C#. I have no real development experience in C++, and I'd like good libraries; sufficient speed is good enough. However, while the web based UI and server based component are loosely coupled, there may be instances where the UI needs to communicate (call methods and what not) with the server based component, and I want to pick languages/frameworks that will play nice with each other.
All suggestions on frameworks to incorporate are welcome.
Version Control:
I have had good experiences with SVN and pretty bad experiences with TFS. I've never worked with Git. Which do you think is better in terms of features as well as general developer familiarity? I want to pick something that other developers will know and not have trouble with.
I apologize if the questions are a bit redundant or I'm not providing enough information or using bad terminology. I plan to edit and improve the question as I get feedback. Thanks!
EDIT:
Who: This would most probably be a startup formed of college students or junior developers. I want the project to utilize technologies that most people are familiar with or are easy to pick up.
What: I'd need hours and days to explain the solution. But in the end, when you break it down, it's a web based UI (think standard web app to just manage database data) that would be used by knowledgeable clients. The server based component would be very separate except for the fact that it should be able to communicate with the web app.
I can provide more information as required but I would appreciate an opportunity for users to answer and provide their ideas before you hastily close the question.
Obviously it depends a lot on specific requirements, but then again, even with those I probably wouldn't be able to tell for sure!
I've been working on a from-scratch project myself for a couple of months, and have generally found:
Choosing Microsoft for all the layers just goes down much easier (my subjective opinion). For example I would use C# for the UI, the back end, and use MSSQL for the database. Nothing at all wrong with non-Microsoft vendors, I'm no Microsoft fan-boy, I just struggle to get productive with unfamiliar tools. Depends where your experience lies though.
Database: In particular I've found that .NET and MSSQL go easily together. When I started the project I was using PostgreSQL (because it's free, fully featured and has open-source warm fuzzies). However, I abandoned it in favour of MSSQL simply because it was taking me too long to get database work done in an unfamiliar language with unfamiliar tools. Also, I'm not sure MSSQL is so expensive anymore; for example, for a web application, MSSQL 2008 Web Edition is pretty damn cheap per-processor I think (only on SPLA licensing though). If you're concerned about database features in a free implementation though, personally I think PostgreSQL has a very full feature set, nicely standardised, and rapidly growing.
UI: I'm pretty inexperienced, but ASP.NET MVC looks far less painful to me than ASP.NET Web Forms. I like PHP too, but again I'd match the UI language with the back-end language, so would recommend .NET.
On frameworks, I'm immersed in DALs at the moment. I like Subsonic for lightweight data, NHibernate for heavy-weight.
I still have a long way to go with my project so perhaps I can only see the short-term benefits and drawbacks at the moment. But in general I would say: use the technologies that you're most comfortable using, as you'll be way more productive and the end result will probably be about the same anyway. If you want to learn new technologies though, and who doesn't? - go ahead, just expect it to take a lot longer.
Didn't want to answer 'cause it's so open ended. But a few points:
Money
First, check out BizSpark. That should take care of any money aspect for 3 years. For a service company, that means not only free VS Team Suite and Office and so on, but free Windows, SQL, etc. If your startup can't afford to spend a bit on MS tech in 3 years, it's probably a bad business. So that takes out licensing.
On a similar note, Sun has Startup Essentials. Could be interesting on the hardware side of things, but I haven't actually competitively priced them versus Dell/HP.
Software
It doesn't sound like you have hard enough requirements to say "oh, this slightly-less-popular software X is perfect for my domain Y and is gonna give me a very big boost". In fact, your project might not be like that at all. Maybe it, technically, is going to be a relatively plain application just pushing data around or whatever. You didn't specify.
For a small startup, personal productivity is probably going to trump any other argument. If your people are excellent in X, then that's one of your top arguments right there.
If you really don't have any particular system you're most comfortable with, be conservative. Stick with .NET or Java, as they'll give you the widest range of useful possibilities.
As far as things like OS and Database, I'm biased, but I think Microsoft will give you platforms that are easier to take advantage of than you'll find elsewhere. For instance, setting up load balancing, clustering, centralized authentication, managing servers (updates, events, etc.) is going to be easier to get going on Windows than it would be on another platform, assuming you're not an expert in either. Configuring SQL Server, even the advanced features, is a piece of cake. (Go time someone who knows neither: Set up a DB mirror in MSSQL and MySQL -- which is going to take more work?) Again, this is all predicated on you not having experts in a particular set of technology.
Don't mix -- whatever you do, stick with the platform. If you go .NET, MSSQL is going to work better with the data providers (or things like Linq-to-SQL). If you decide to do PHP, then use MySQL as everyone else uses it and you'll encounter less resistance. If you're not inventing stuff on the technical side, don't become an edge case.
You should pick the platform first, then the language that is best for that platform (if there is any choice).
One thing you should consider is the labor pool, and labor pool cost, for specific platforms and languages. Human Resources can often get cost metrics, if you don't have ideas already.
In my town, for example, .NET platform is much more expensive per Software Engineer than open source, because the .NET developers have a higher rate (40% roughly). C# is a little higher rate than VB.NET, but also tends to bring more well rounded candidates.
Just to throw in something totally different: How about weblocks as a web framework? It uses Hunchentoot as a server, which can run either standalone or with Apache. This is all done in Common Lisp. Weblocks can use cl-sql as a backend store, which can connect to many different RDBMSs (MySQL, PostgreSQL, Oracle, ODBC, SQLite).
In my experience, this has been a contentious issue between "backend" (database developer) and "frontend" guys (application developer, client and server side).
There have been many heated pub discussions on this subject.
I just want to know whether it is just that people have different mindsets, or are too lazy to learn more and feel comfortable with what they know, or something else.
I might re-phrase the question: why do (some) application developers think they can do "database stuff" without actually bothering to understand it properly? Whereas database developers do not (in general) assume they can write a good application without some training and experience!
It is about levels of abstraction. A database is the lowest level of abstraction in a typical business application (software-wise). It is much more likely that a developer working on an outer layer of the abstraction would have knowledge of an inner layer than a developer in an inner layer would know about the outer layer.
This is because inner layers of abstraction perform best when they are ignorant of the outer layers that depend on them.
So a designer in the presentation layer of a website may know a bit about the server-side code they depend on because they interact with it. But the developer working on the server does not need to know anything about design at all.
I would say it's on a need-to-know basis. Application developers often need to know how to connect to databases, add records, delete records, etc. This is taken further with newer technologies such as LINQ, where developers can write database queries within their actual code.
Database developers, on the other hand, only really need to know how to write database queries, as that is their job, and probably won't need to worry about the code at the application level.
Because programmers very often must understand and interact with databases to do their job, but DBAs very often don't need to do any programming (outside of the DBMS) to do their jobs.
I believe it stems from the fact that programming in SQL looks easy, and to get started you need only a small amount of knowledge (really, for a programmer, learning SELECT * FROM Table is pretty easy). Application programming is not the same way. It becomes very complex in a small amount of time, and that discourages a lot of people. Now, I am not saying that database people are any less intelligent; it is just that what they do looks easier than building applications.
If you develop applications, then chances are that sooner or later you'll have to connect an app to a back-end.
The opposite is not as true.
I think it stems from necessity. If you consider the roles of each person, a programmer needs to do database-related stuff far more than database workers need to do programming tasks.
From my experience, having developed both "databases" and "applications" (following your nomenclature...), I guess there's a big difference in state management.
Properly designed databases are always in a "clean" state, and every transaction keeps this consistency. So when developing a database, you have to very clearly specify your data abstractions into tables and which updates are legal and so on.
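To make the "clean state" point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The account table, the CHECK constraint, and the transfer helper are all made up for illustration; the point is only that a declared rule plus a transaction means a failed update leaves no half-applied change behind.

```python
import sqlite3

# Hypothetical schema: the CHECK constraint declares which states are legal.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " id INTEGER PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany(
    "INSERT INTO account (id, balance) VALUES (?, ?)",
    [(1, 100), (2, 50)],
)
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst; both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            # This statement violates the CHECK constraint on overdraft,
            # which also undoes the credit made just above.
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    """Return all balances ordered by account id."""
    return [row[0] for row in conn.execute("SELECT balance FROM account ORDER BY id")]
```

The credit is deliberately issued before the debit so that the rollback is visible: after a rejected transfer, the destination account does not keep the money.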
I've found that most application developers (myself included :)) do a very sloppy job in keeping this consistent state in the application. Any non-trivial interface has many more possible states to manage than a modest database, and it's not as easy to make sure it's always in a clean state. It's also harder to analyze every possible sequence of steps that users will perform.
From my experience, the application developers don't do all the database stuff. Consider all the administration that is related to the database: backups, replication, etc.
A typical DBA (at least on most of the projects I've been involved in) takes care of everything that is related to project databases: does all the administration, cooperates with application developers on performance tuning, gives advice about the SQL used by the app, does some of the stored procedure coding, creates (or at least reviews and consults on) physical DB designs, etc.
So, aren't the database guys "lazy", or "fine with what they already know" just from an application developer's perspective? I'm an app developer myself and there are a whole lot of things that I just don't know about the DBs we're using on our projects.
Part of my education ensured I got a decent understanding of how Databases work. I went into the field expecting to do database work, and a lot of it. I'm a web app guy; it comes with the territory I guess.
My two jobs as a developer have been at two shops that would best be described as tiny (2 people myself included, and then just me) and tiny (3 developers, briefly having a fourth). I have not observed an immediate business need for, nor worked anywhere that had the resources to employ a dedicated DB guy. I can envision some scenarios where that would change (including a new job :P).
As to the rest, I agree that abstraction is also a factor and as developers we're way up on top/outside looking in. I can't imagine doing web app development without DB skills, and I consider SQL/DB management to be both an important tool and an area I need to stay sharp in.
I'll add that I treat the database side as its own field. There are skills that translate between the two, but there's a lot of specialized knowledge I need to acquire to get better at it, and being a good programmer doesn't necessarily mean I'm doing a good job on the back end either (fortunately, I'm not a good programmer ;) ). Also, I'm pretty sure that's what she said.
2 reasons:
1. DB vendors facilitate bad SQL, and
2. SQL is hidden from view while application UI is front and center.
Most naive developers think SQL is a procedural language and write it as such because vendors ensure that the tools exist so that they can do so. DBAs know that good SQL is set-oriented and has optimization principles that are totally different from those involved in application programming.
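As a rough illustration of the procedural vs. set-oriented difference, here is a small sketch using Python's built-in sqlite3 module. The product table and both helper functions are invented for the example; the row-by-row loop stands in for the cursor-style code a vendor's tooling will happily let you write.

```python
import sqlite3


def make_db():
    """Build a small in-memory table (hypothetical schema for illustration)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE product (id INTEGER PRIMARY KEY, category TEXT, price INTEGER)"
    )
    conn.executemany(
        "INSERT INTO product (id, category, price) VALUES (?, ?, ?)",
        [(1, "book", 100), (2, "book", 200), (3, "toy", 300)],
    )
    conn.commit()
    return conn


def raise_prices_procedural(conn, category, delta):
    # Procedural habit: fetch the rows, loop in application code,
    # and issue one UPDATE per row.
    rows = conn.execute(
        "SELECT id, price FROM product WHERE category = ?", (category,)
    ).fetchall()
    for product_id, price in rows:
        conn.execute(
            "UPDATE product SET price = ? WHERE id = ?", (price + delta, product_id)
        )
    conn.commit()


def raise_prices_set_based(conn, category, delta):
    # Set-oriented: one statement describes the whole change and leaves
    # the "how" to the database engine.
    conn.execute(
        "UPDATE product SET price = price + ? WHERE category = ?", (delta, category)
    )
    conn.commit()


def prices(conn):
    """Return (id, price) pairs ordered by id."""
    return conn.execute("SELECT id, price FROM product ORDER BY id").fetchall()
```

Both functions produce the same final data here, which is exactly why the first style goes unnoticed: it works, it just does N statements where one would do.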
The visibility aspect makes it so the application developers can write bad SQL against a database and get it to perform in a marginal way, and no one ever sees quite how bad it is. When a DBA writes an application, there are immediate critiques on its appearance and behavior because it's directly visible to the end user.
Good question. Developers do database stuff because, where there are no dedicated database guys, the developers have to do it. But a company that has database guys also has development guys.
:) What is your idea?
Is LINQ a kind of Object-Relational Mapper?
LINQ in itself is a set of language extensions to aid querying, improve readability and reduce code. LINQ to SQL is a kind of OR Mapper, but it isn't particularly powerful. The Entity Framework is often referred to as an OR Mapper, but it does quite a lot more.
There are several other LINQ to X implementations around, including LINQ to NHibernate and LINQ to LLBLGenPro that offer OR Mapping and supporting frameworks in a broadly similar fashion to the Entity Framework.
If you are just learning LINQ though, I'd recommend you stick to LINQ to Objects to get a feel for it, rather than diving into one of the more complicated flavours :-)
LINQ is not an ORM at all. LINQ is a way of querying "stuff", and can be more or less seen as a SQL-like language extension for different things (IEnumerables).
There are various types of "stuff" that can be queried, among them SQL Server databases. This is called LINQ-to-SQL. The way it works is that it generates (implicit) classes based on the structure of the DB and your query. In this sense it works much more like a code generator.
LINQ-to-SQL is not an ORM because it doesn't try at all to solve the object-relational impedance mismatch. In an ORM you design the classes and then either map them manually to tables or let the ORM generate the database. If you then change the database for whatever reason (typically refactoring, renormalization, denormalization), many times you are able to keep the classes as they are by changing the mapping.
LINQ-to-SQL does nothing of the sort. Your LINQ queries will be tightly coupled to the database structure. If you change the DB, you will probably have to change the LINQ as well.
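To show what "keep the classes as they are by changing the mapping" means in practice, here is a toy sketch in Python with the built-in sqlite3 module. It is not how any real ORM is implemented; the Customer class, the load_all helper, and both schemas are made up, and the mapping is just a dict from attribute name to column name.

```python
import sqlite3
from dataclasses import dataclass, fields


@dataclass
class Customer:
    id: int
    name: str


def load_all(conn, cls, table, column_for):
    """Load every row of table into cls, translating attribute -> column names.

    table and column_for are trusted, developer-supplied identifiers
    (never user input), so building the SELECT list as a string is acceptable here.
    """
    attrs = [f.name for f in fields(cls)]
    select_list = ", ".join(column_for[a] for a in attrs)
    rows = conn.execute(f"SELECT {select_list} FROM {table} ORDER BY 1")
    return [cls(*row) for row in rows]


# Original schema: columns happen to match the attribute names.
old = sqlite3.connect(":memory:")
old.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
old.execute("INSERT INTO customer VALUES (1, 'Ada')")
OLD_MAPPING = {"id": "id", "name": "name"}

# Refactored schema: columns renamed. The Customer class and every caller
# of load_all stay untouched; only the mapping changes.
new = sqlite3.connect(":memory:")
new.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, full_name TEXT)")
new.execute("INSERT INTO customer VALUES (1, 'Ada')")
NEW_MAPPING = {"id": "customer_id", "name": "full_name"}
```

Code that embeds column names directly in its queries would need to be touched everywhere the renamed columns appear, which is the coupling described above.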
LINQ to SQL (part of Visual Studio 2008) is an OR Mapper.
LINQ is a new query language that can be used to query many different types of sources.
LINQ itself is not an ORM. LINQ is the set of language features and methods that let you query objects in a SQL-like way.
"LINQ to SQL" is a provider that allows us to use LINQ against SQL via strongly typed objects.
I think a good test to ascertain whether a platform or code block displays the characteristics of an O/R-M is simply:
With their solution hat on, does the developer (or their code generator) have any direct, unabstracted knowledge of what's inside the database?
With this criterion, the answer for differing LINQ implementations can be:
Yes, knowledge of the database schema is entirely contained within the roll-your-own, LINQ-utilizing O/R-M code layer
or
No, knowledge of the database schema is scattered throughout the application
Further, I'd extend this characterization to three simple levels of O/R-M.
1. Abandonment.
It's a small app with a couple of developers, and the object/data model isn't that complex and doesn't change very often. The small dev team can stay on top of it.
2. Roll your own in the data access layer.
With some manageable refactoring in a data access layer, the desired O/R-M functionality can be effected in an intermediate layer by the relatively small dev team. Enough to keep the entire team on the same page.
3. Enterprise-level, specification-defining, overhead-introducing O/R-M tools.
At some level of complexity, the need to keep all devs on the same page just swamps any overhead introduced by the formality. No need to reinvent the wheel at this level of complexity. NHibernate or the (rough) V1.0 Entity Framework are examples of this scale.
For a richer classification, from which I borrowed and simplified, see Ted Neward's classic post at
http://blogs.tedneward.com/2006/06/26/The+Vietnam+Of+Computer+Science.aspx
where he classifies O/R-M treatments (or abdications) as
1. Abandonment. Developers simply give up on objects entirely, and return to a programming model that doesn't create the object/relational impedance mismatch. While distasteful, in certain scenarios an object-oriented approach creates more overhead than it saves, and the ROI simply isn't there to justify the cost of creating a rich domain model. ([Fowler] talks about this to some depth.) This eliminates the problem quite neatly, because if there are no objects, there is no impedance mismatch.
2. Wholehearted acceptance. Developers simply give up on relational storage entirely, and use a storage model that fits the way their languages of choice look at the world. Object-storage systems, such as the db4o project, solve the problem neatly by storing objects directly to disk, eliminating many (but not all) of the aforementioned issues; there is no "second schema", for example, because the only schema used is that of the object definitions themselves. While many DBAs will faint dead away at the thought, in an increasingly service-oriented world, which eschews the idea of direct data access but instead requires all access go through the service gateway thus encapsulating the storage mechanism away from prying eyes, it becomes entirely feasible to imagine developers storing data in a form that's much easier for them to use, rather than DBAs.
3. Manual mapping. Developers simply accept that it's not such a hard problem to solve manually after all, and write straight relational-access code to return relations to the language, access the tuples, and populate objects as necessary. In many cases, this code might even be automatically generated by a tool examining database metadata, eliminating some of the principal criticism of this approach (that being, "It's too much code to write and maintain").
4. Acceptance of O/R-M limitations. Developers simply accept that there is no way to efficiently and easily close the loop on the O/R mismatch, and use an O/R-M to solve 80% (or 50% or 95%, or whatever percentage seems appropriate) of the problem and make use of SQL and relational-based access (such as "raw" JDBC or ADO.NET) to carry them past those areas where an O/R-M would create problems. Doing so carries its own fair share of risks, however, as developers using an O/R-M must be aware of any caching the O/R-M solution does within it, because the "raw" relational access will clearly not be able to take advantage of that caching layer.
5. Integration of relational concepts into the languages. Developers simply accept that this is a problem that should be solved by the language, not by a library or framework. For the last decade or more, the emphasis on solutions to the O/R problem have focused on trying to bring objects closer to the database, so that developers can focus exclusively on programming in a single paradigm (that paradigm being, of course, objects). Over the last several years, however, interest in "scripting" languages with far stronger set and list support, like Ruby, has sparked the idea that perhaps another solution is appropriate: bring relational concepts (which, at heart, are set-based) into mainstream programming languages, making it easier to bridge the gap between "sets" and "objects". Work in this space has thus far been limited, constrained mostly to research projects and/or "fringe" languages, but several interesting efforts are gaining visibility within the community, such as functional/object hybrid languages like Scala or F#, as well as direct integration into traditional O-O languages, such as the LINQ project from Microsoft for C# and Visual Basic. One such effort that failed, unfortunately, was the SQL/J strategy; even there, the approach was limited, not seeking to incorporate sets into Java, but simply allow for embedded SQL calls to be preprocessed and translated into JDBC code by a translator.
6. Integration of relational concepts into frameworks. Developers simply accept that this problem is solvable, but only with a change of perspective. Instead of relying on language or library designers to solve this problem, developers take a different view of "objects" that is more relational in nature, building domain frameworks that are more directly built around relational constructs. For example, instead of creating a Person class that holds its instance data directly in fields inside the object, developers create a Person class that holds its instance data in a RowSet (Java) or DataSet (C#) instance, which can be assembled with other RowSets/DataSets into an easy-to-ship block of data for update against the database, or unpacked from the database into the individual objects.
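For a feel of what option 3 in that list (manual mapping) looks like, here is a minimal sketch in Python with the built-in sqlite3 module. The Person class, the PersonGateway name, and the table layout are all hypothetical; it just shows the "write straight relational-access code, access the tuples, and populate objects" idea in a few lines.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass
class Person:
    id: Optional[int]  # None until the row has been inserted
    name: str
    age: int


class PersonGateway:
    """Hand-written mapping between Person objects and the person table."""

    def __init__(self, conn):
        self.conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS person ("
            " id INTEGER PRIMARY KEY,"
            " name TEXT NOT NULL,"
            " age INTEGER NOT NULL)"
        )

    def insert(self, person):
        # Object -> tuple: pick the fields out by hand.
        cur = self.conn.execute(
            "INSERT INTO person (name, age) VALUES (?, ?)",
            (person.name, person.age),
        )
        self.conn.commit()
        person.id = cur.lastrowid
        return person

    def find_by_id(self, person_id):
        # Tuple -> object: the column order in the SELECT matches Person's fields.
        row = self.conn.execute(
            "SELECT id, name, age FROM person WHERE id = ?", (person_id,)
        ).fetchone()
        return Person(*row) if row else None

    def older_than(self, age):
        rows = self.conn.execute(
            "SELECT id, name, age FROM person WHERE age > ? ORDER BY id", (age,)
        )
        return [Person(*row) for row in rows]
```

Every new query is another hand-written method, which is the "too much code to write and maintain" criticism quoted above, but nothing is hidden: all knowledge of the schema sits in this one class.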
LINQ to SQL using the DBML designer, yes; otherwise LINQ is just a set of extension methods for enumerables.