I am starting a new Winforms application, and I've always used traditional ADO.NET methods in my DAL (...yes, hardcoding!)
I have used Linq on many occasions previously, but more for ad hoc queries, so I have some experience with it and I loved how easy it was to write querying code. What I would like to perhaps do is replace my existing DAL with pure LINQ queries. I know that there may be areas of concern with this, which is why I need your help.
If I did things the way I always have in the past, I would structure my app like:
[AppName].ClientUI --> desktop client Presentation layer
[AppName].WebUI --> web Presentation layer
[AppName].DAL --> ADO.NET Data access layer
[AppName].BLL --> Business logic layer (validation, extra calcs, + manager classes)
[AppName].BE --> Business Entities containing business objects and collection classes
To be honest, I've used this always in web apps and had never done an n-layered Winforms app before.
If I want to replace my old DAL methods with LINQ ones, what challenges am I in for?
Finally, is it even recommended to use LINQ in a multi-layered app? Sometimes I think that using the old ADO.NET methods is still better...more work, but better. I mean - and correct me if I'm wrong - at the core of all these new technologies (that are meant to make our work as developers better), are they not all still making use of traditional ADO.NET?
Hope someone can shed some strong light on this! :)
For a straightforward WinForms client app that connects directly to the database, Linq is a good replacement for ADO.NET; there are very few gotchas that you will come up against. Use a tool like SQLMetal (or the designer that's built into VS2008) to generate your Linq data objects and database connection classes. If you want to use the generated objects as your "BE" objects, you can just copy this stuff and move it into whatever assembly you want (if you want to separate the assemblies). Alternatively, you can provide separate "Business entities" and a translation layer that copies data from the BEs to the Linq generated objects and back again.
The one gotcha that I have come across with Linq a few times is that it doesn't have very good support for disconnected architectures. For example, if you wanted to have your DAL on a server and have all your client apps connect to it, you will hit problems if you just let your linq objects be transferred across from the server.
If you do choose to have separate business entities (or have a disconnected architecture) you will find you have to carefully manage disconnecting the Linq objects from the data context, and then reattaching them when you are ready to save/update. It's worth doing some prototyping in this area first to make sure you understand how it works.
The other thing that often trips people up is that linq queries are not executed immediately against the database; they are only executed as the data is needed. Watch out for this, as it can catch you out if you aren't expecting it (and it's hard to spot when debugging, because when you look at your linq query in the debugger it will execute to get the data).
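To make the deferred execution point concrete, here is a minimal sketch of the same behaviour. It is written in Python with the standard sqlite3 module rather than LINQ, purely as an analogy; the `LazyQuery` class and the `customers` table are made up for the illustration and are not part of any real library.

```python
import sqlite3


class LazyQuery:
    """Hypothetical helper: builds a SELECT step by step and only
    touches the database when the query is iterated."""

    def __init__(self, conn, table, filters=(), params=()):
        self.conn = conn
        self.table = table          # assumed to be a trusted name, not user input
        self.filters = tuple(filters)
        self.params = tuple(params)

    def where(self, condition, *params):
        # Returns a new query object; nothing is sent to the database here.
        return LazyQuery(self.conn, self.table,
                         self.filters + (condition,), self.params + params)

    def __iter__(self):
        sql = f"SELECT * FROM {self.table}"
        if self.filters:
            sql += " WHERE " + " AND ".join(self.filters)
        # Execution happens here, and again on every new iteration.
        return iter(self.conn.execute(sql, self.params).fetchall())


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, city TEXT)")
conn.execute("INSERT INTO customers VALUES ('Ann', 'Oslo')")

query = LazyQuery(conn, "customers").where("city = ?", "Oslo")  # no SQL run yet

# A row added after the query was defined still shows up in the results,
# because the SQL only runs when the query is enumerated.
conn.execute("INSERT INTO customers VALUES ('Bob', 'Oslo')")
names = [row[0] for row in query]
```

In LINQ itself, the usual way to pin the results at a known point is to call ToList() on the query where you want it to run.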
It's also worth considering the Entity Framework as an alternative to linq2sql (you can still do linq2EF queries). The EF is a more complete ORM and has better support for mapping related tables to multiple objects, but still suffers from poor support for disconnected apps. The EF in .net 4.0 is supposed to have better support for disconnected architectures.
I've done it both ways.
You are right, rolling an ADO.NET DAL by hand is more work, so do you get a performance benefit for that additional work? Yes: Linq to SQL classes cost you about 7% to 10% in overhead compared with hand-written ADO.NET. However, for that small overhead, you get a lot of benefits:
Lazy Loading (which provides performance increases by deferring execution of queries until they are actually needed)
CRUD for free, with built-in concurrency handling
Linq, IQueryable, and all of the goodness that provides
Partial classes allow you to insert business logic and validation into your Linq to Sql classes, without worrying about the Linq to SQL code generator overwriting them.
Of course, if you don't need any of these things, then Linq to SQL might seem like a luxury. However, I find that Linq to SQL is easier to maintain, especially since the DAL classes can be regenerated if the database changes.
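For anyone wondering what the "concurrency handling" in that list buys you, here is a rough sketch of the optimistic check you would otherwise write by hand. It uses Python and sqlite3 as a stand-in, not Linq to SQL; the `products` table, its `version` column and the `update_price` function are all assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products (id INTEGER PRIMARY KEY, price REAL, version INTEGER)")
conn.execute("INSERT INTO products VALUES (1, 10.0, 1)")


def update_price(conn, product_id, new_price, version_read):
    """Update only if the row still has the version we originally read."""
    cur = conn.execute(
        "UPDATE products SET price = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_price, product_id, version_read),
    )
    # rowcount 0 means someone else changed the row first: a conflict.
    return cur.rowcount == 1


# Two users both read version 1, then both try to save.
first_save = update_price(conn, 1, 12.0, version_read=1)
second_save = update_price(conn, 1, 15.0, version_read=1)
final_price = conn.execute(
    "SELECT price FROM products WHERE id = 1").fetchone()[0]
```

The second save is rejected instead of silently overwriting the first one. A hand-rolled DAL has to repeat this pattern for every update.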
And yes, Linq to SQL uses ADO.NET under the hood.
Linq to SQL's multi-tier story is less clear, in large part due to the way the DataContext object needs to be handled. I suggest checking out this CodePlex project:
An Example of a Multi Tier Architecture for Linq to Sql
http://www.codeplex.com/MultiTierLinqToSql
I'm sorry if exactly same question is somewhere in the haystack of Stack Overflow questions but I've been searching answers for 2 days and at last I'm here.
Please feel free to mention if there are any duplicates but make sure they address the issue.
I have a very old application where the code is not object oriented, SQL queries are everywhere to get data, and each of them uses SqlCommand/SqlDataReader.
Now, I'm trying to move to WCF Web API framework and trying to separate business and data layers. Most of the sql queries return different set of columns.
I believe the sql queries must be executed in data layer.
I'm building those queries or placing existing queries in Business layer and passing those queries to Data Layer.
My plan is to return datasets from data layer. Manipulate and put into domain classes in Business layer and return into the controller.
However, I feel this is not a good approach and it's tightly coupled. I cannot find a way to make it loosely coupled.
If I were to use EF or another ORM, then I could use DbContext to get data in the Business Layer.
I cannot map each database table to POCO class in my data layer (because most queries are complex and return different set of columns).
Question
How should I deal with the queries to make this architecture a better one in terms of making it loosely coupled?
Fixing up a brownfield project like the one you are dealing with is tricky. It sounds like you have the right idea in terms of trying to get all sql queries into a data layer.
My suggestion is to find a thin slice. A small part of the application (e.g. if the system allows you to add a user) and try to refactor the code into the structure you want. Then find the next thin slice...
Another tip is that you should do one small refactor at a time. As soon as you let your refactorings get too big, you increase the risk of introducing bugs.
A tool like ReSharper is a great help when it comes to safely refactoring your code. It does have a bit of a learning curve, but it's worth the effort.
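As a rough idea of what one refactored thin slice could look like: the sketch below is in Python with sqlite3 rather than C#, and every name in it (`UserSummary`, `UserData`, `loyal_user_names`, the tables) is hypothetical. The point is only the shape: the SQL lives in the data layer, and the result class is shaped for that one query rather than for a whole table, so complex queries with odd column sets are not a problem.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class UserSummary:
    """Shaped for one query's columns, not for a whole table."""
    user_id: int
    name: str
    order_count: int


class UserData:
    """Data layer: the only place that knows the SQL for this slice."""

    def __init__(self, conn):
        self.conn = conn

    def get_user_summaries(self, min_orders):
        rows = self.conn.execute(
            "SELECT u.id, u.name, COUNT(o.id) "
            "FROM users u LEFT JOIN orders o ON o.user_id = u.id "
            "GROUP BY u.id, u.name HAVING COUNT(o.id) >= ? "
            "ORDER BY u.id",
            (min_orders,),
        ).fetchall()
        return [UserSummary(*row) for row in rows]


def loyal_user_names(data, min_orders=2):
    """Business layer: works with objects and never sees SQL or datasets."""
    return [u.name for u in data.get_user_summaries(min_orders)]


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER);
    INSERT INTO users VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO orders VALUES (1, 1), (2, 1), (3, 2);
""")
result = loyal_user_names(UserData(conn))
```

Because the business function only depends on `get_user_summaries`, it can be tested with a fake data object, which is one practical measure of loose coupling.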
Does it make sense to use an OR-mapper?
I am putting this question out there on Stack Overflow because this is the best place I know of to find smart developers willing to give their assistance and opinions.
My reasoning is as follows:
1.) Where does the SQL belong?
a.) In every professional project I have worked on, security of the data has been a key requirement. Stored Procedures provide a natural gateway for controlling access and auditing.
b.) Issues with Applications in production can often be resolved between the tables and stored procedures without putting out new builds.
2.) How do I control the SQL that is generated? I am trusting parse trees to generate efficient SQL.
I have quite a bit of experience optimizing SQL in SQL-Server and Oracle, but would not feel cheated if I never had to do it again. :)
3.) What is the point of using an OR-Mapper if I am getting my data from stored procedures?
I have used the repository pattern with a homegrown generic data access layer.
If a collection needed to be cached, I cache it. I also have experience using EF on a small CRUD application and experience helping tuning an NHibernate application that was experiencing performance issues. So I am a little biased, but willing to learn.
For the past several years we have all been hearing a lot of respectable developers advocating the use of specific OR-Mappers (Entity-Framework, NHibernate, etc...).
Can anyone tell me why someone should move to an ORM for mainstream development on a major project?
edit: http://www.codinghorror.com/blog/2006/06/object-relational-mapping-is-the-vietnam-of-computer-science.html seems to have a strong discussion on this topic but it is out of date.
Yet another edit:
Everyone seems to agree that Stored Procedures are to be used for heavy-duty enterprise applications, due to their performance advantage and their ability to add programming logic nearer to the data.
I am seeing that the strongest argument in favor of OR mappers is developer productivity.
I suspect a large motivator for the ORM movement is developer preference towards remaining persistence-agnostic (don’t care if the data is in memory [unless caching] or on the database).
ORMs seem to be outstanding time-savers for local and small web applications.
Maybe the best advice I am seeing is from client09: to use an ORM setup, but use Stored Procedures for the database intensive stuff (AKA when the ORM appears to be insufficient).
I was a pro SP for many, many years and thought it was the ONLY right way to do DB development, but the last 3-4 projects I have done I completed in EF4.0 w/out SP's and the improvements in my productivity have been truly awe-inspiring - I can do things in a few lines of code now that would have taken me a day before.
I still think SP's are important for some things, (there are times when you can significantly improve performance with a well chosen SP), but for the general CRUD operations, I can't imagine ever going back.
So the short answer for me is, developer productivity is the reason to use the ORM - once you get over the learning curve anyway.
A different approach... With the rise of the NoSQL movement, you might want to try an object / document database instead to store your data. That way, you basically avoid the hell that is OR mapping. Store the data as your application uses it, and do the transformation behind the scenes in a worker process to move it into a more relational / OLAP format for further analysis and reporting.
Stored procedures are great for encapsulating database logic in one place. I've worked on a project that used only Oracle stored procedures, and am currently on one that uses Hibernate. We found that it is very easy to develop redundant procedures, as our Java developers weren't versed in PL/SQL package dependencies.
As the DBA for the project I find that the Java developers prefer to keep everything in the Java code. You run into the occasional, "Why don't I just loop through all the Objects that just returned?" This caused a number of "Why isn't the index taking care of this?" issues.
With Hibernate your entities can contain not only their linked database properties, but can also contain any actions taken upon them.
For example, we have a Task Entity. One could Add or Modify a Task among other things. This can be modeled in the Hibernate Entity in Named Queries.
So I would say go with an ORM setup, but use procedures for the database intensive stuff.
A downside of keeping your SQL in Java is that you run the risk of developers using non-parameterized queries, leaving your app open to SQL injection.
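To illustrate the injection risk, here is a small sketch comparing a concatenated query with a parameterized one. It is Python with sqlite3 rather than Java, and the `tasks` table and both function names are invented for the example, but the difference carries over to any language.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tasks (id INTEGER PRIMARY KEY, owner TEXT, title TEXT);
    INSERT INTO tasks VALUES (1, 'ann', 'Write report'), (2, 'bob', 'Fix bug');
""")


def tasks_unsafe(conn, owner):
    # Vulnerable: the input becomes part of the SQL text itself.
    sql = "SELECT title FROM tasks WHERE owner = '" + owner + "'"
    return [row[0] for row in conn.execute(sql)]


def tasks_safe(conn, owner):
    # Parameterized: the input is sent as a value, never parsed as SQL.
    return [row[0] for row in conn.execute(
        "SELECT title FROM tasks WHERE owner = ?", (owner,))]


malicious = "nobody' OR '1'='1"
leaked = tasks_unsafe(conn, malicious)   # returns every owner's tasks
blocked = tasks_safe(conn, malicious)    # no owner has that literal name
```

The unsafe version hands back both rows to someone who owns neither, while the parameterized version returns nothing for the same input.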
The following is just my private opinion, so it's rather subjective.
1.) I think that one needs to differentiate between local applications and enterprise applications. For local and some web applications, direct access to the DB is okay. For enterprise applications, I feel that the better encapsulation and rights management makes stored procedures the better choice in the end.
2.) This is one of the big issues with ORMs. They are usually optimized for specific query patterns, and as long as you use those the generated SQL is typically of good quality. However, for complex operations which need to be performed close to the data to remain efficient, my feeling is that using manual SQL code is still the way to go, and in this case the code goes into SPs.
3.) Dealing with objects as data entities is also beneficial compared to direct access to "loose" datasets (even if those are typed). Deserializing a result set into an object graph is very useful, no matter whether the result set was returned by a SP or from a dynamic SQL query.
If you're using SQL Server, I invite you to have a look at my open-source bsn ModuleStore project, it's a framework for DB schema versioning and using SPs via some lightweight ORM concept (serialization and deserialization of objects when calling SPs).
I am doing something I consider to be pretty normal (although I personally haven't had to do it before), and I have assumed there'd be a 'no-brainer' way forward, but I'm yet to find it - which is really frustrating.
I will be creating a WPF application, which is a data-oriented business application. My data will come from a remote IIS server (that I control) that has a standard SQL Server 2008 database, so Web services/WCF seem to be the way forward. The remote service needs to be secure (reasonably) via a user (of the WPF client) username/password login.
I don't want to use 3rd party ORM products, but I expect the data layer (between the service and the database) to be able to cope with very simple ORM type functionality (I really don't want to hand-craft a data retrieval and persistence layer). I'm not worried about concurrency very much as this will be a fairly simple app.
My options seem to be one of the following:
ADO.NET Entity Framework over WCF
Linq2Sql over WCF
WCF Data Services
On further investigation, none of the above seem to be the 'no brainer' I'm after:
1) ADO.NET Entity Framework - I've had a play with this and am getting all sorts of issues serializing objects over WCF. Even when I try to generate POCO entities and use them, I'm having to decorate service contracts with custom attributes just to get it to not error all the time, and I seem to have to hand-crank anything more than a flat object graph. It seems to me that EF simply isn't designed to be exposed via a service.
2) Linq2Sql - This doesn't seem much better than EF. I seem to have to hand-crank too much stuff. I've tried the designer and SQLMetal but nothing seems to 'just work' - it all needs fiddling with.
3) WCF Data Services - this seems like a good option on the face of it, but essentially it seems like I'm just exposing my SQL database tables 'in the raw' over the service layer. I'm not an expert in this technology by any means but it seems like a potentially dangerous approach, and on top of that it doesn't seem to support any kind of access security as standard (you have to hack it to require authentication, it seems).
As I said, this scenario feels like it should have a no-brainer solution, but I'm still scratching my head. I've done lots of things with .NET technologies, but to be honest this area represents a bit of a hole in my understanding, so I apologize if any of my comments or assumptions are naive.
Of course, it may well be that the 'hacky' long-way-round on EF or Linq2SQL may be all I can do, in which case I can roll up my sleeves, and accept the fact that I haven't missed a more elegant solution.
Any help/advice will be much appreciated.
This is a tad subjective, but I'll offer my opinion.
First of all, forget L2SQL - it's basically obsolete and doesn't have the full POCO support of EF4 (it can be done, but needs XML tinkering, or SQLMetal generation), which means serializing your entities will be a left-to-right entity cloning nightmare.
I would go with ADO.NET Entity Framework over WCF, Entity Framework 4.0 specifically. You will have a wealth of flexibility in your model (including the ability to apply OO principles such as inheritance).
Use Self-Tracking-Entities. Yes, you have to decorate service contracts - this is by design, and there are many reasons for this.
You could always use DTO's, as opposed to serializing the actual EF entities.
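A quick sketch of the DTO idea, in Python rather than C# and with made-up class names (`OrderEntity`, `OrderDto`, `to_dto`), so treat it as an illustration of the pattern and not as EF or WCF code. The entity carries things that should not or cannot cross the wire; the DTO is a flat copy of just the fields the client needs.

```python
import json
from dataclasses import dataclass, asdict, field


@dataclass
class OrderEntity:
    """Stand-in for a persistence-layer entity (hypothetical)."""
    id: int
    customer_name: str
    internal_cost: float                 # should never leave the server
    change_tracker: object = field(default=None, repr=False)  # not serializable


@dataclass
class OrderDto:
    """Flat, serializable shape that the service actually exposes."""
    id: int
    customer_name: str


def to_dto(entity):
    # Copy only the fields the client is allowed to see.
    return OrderDto(id=entity.id, customer_name=entity.customer_name)


entity = OrderEntity(1, "Ann", internal_cost=42.5, change_tracker=object())
payload = json.dumps(asdict(to_dto(entity)))
```

The cost is the extra mapping code in both directions, which is the "left-to-right" copying mentioned above; the benefit is that the service contract no longer depends on how the entities are persisted.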
OData is really good as well in its flexibility and simplicity. But if you're only consuming your model via a single client application, a specialized service layer (WCF) is a better approach IMO.
3) WCF Data Services - this seems like a good option on the face of it, but essentially it seems like I'm just exposing my SQL database tables 'in the raw' over the service layer.
That might be a first impression - but it's fundamentally wrong. What you're exposing over the web is a model - and you have full control over what gets into that model, and how consumers of your WCF Data Services might be able to see and/or even update entities in that model.
That's where Entity Framework comes in and shines (and where Linq-to-SQL miserably fails): you can grab your database (or at least parts of it) into an Entity Data Model, and then modify it. You can tweak your entity names to be totally different from your table names, you can add computed attributes, you can remove certain attributes and much more.
If you're talking about a fairly simple app, that's definitely the way I'd go:
grab your database and turn it into an Entity Data Model using EF
expose that EDM over WCF Data Services and define what can be seen read-only, and what might even be updated over the wire
I come from a java background.
But I would like a cross-platform perspective on what is considered best practice for persisting objects.
The way I see it, there are 3 camps:
ORM camp
direct query camp e.g. JDBC/DAO, iBatis
LINQ camp
Do people still handcode queries (bypassing ORM)? Why, considering the options available via JPA, Django, Rails?
There is no one best practice for persistence (although the number of people screaming that ORM is best practice might lead you to believe otherwise). The only best practice is to use the method that is most appropriate for your team and your project.
We use ADO.NET and stored procedures for data access (though we do have some helpers that make it very fast to write such as SP class wrapper generators, an IDataRecord to object translator, and some higher order procedures encapsulating common patterns and error handling).
There are a bunch of reasons for this which I won't go into here, but suffice to say that they are decisions that work for our team and that our team agrees with. Which, at the end of the day, is what matters.
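For readers who have not seen a record-to-object translator like the one mentioned above, here is a minimal sketch of the idea. It is Python with sqlite3, not ADO.NET, and `rows_to_objects` and `Customer` are hypothetical names; it is not the helper this answer's team uses, just one way such a thing can work.

```python
import sqlite3
from dataclasses import dataclass, fields


@dataclass
class Customer:
    id: int
    name: str


def rows_to_objects(cursor, cls):
    """Map each result row to cls by matching column names to field names."""
    columns = [col[0] for col in cursor.description]
    wanted = {f.name for f in fields(cls)}
    for row in cursor:
        record = dict(zip(columns, row))
        # Ignore columns the class does not declare.
        yield cls(**{k: v for k, v in record.items() if k in wanted})


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, notes TEXT);
    INSERT INTO customers VALUES (1, 'Ann', 'vip'), (2, 'Bob', NULL);
""")
cursor = conn.execute("SELECT id, name, notes FROM customers ORDER BY id")
customers = list(rows_to_objects(cursor, Customer))
```

A helper like this removes most of the repetitive reader-to-property code while leaving the SQL (or stored procedure call) fully under your control.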
I am currently reading up on persisting objects in .net. As such I cannot offer a best practice, but maybe my insights can bring you some benefit. Up until a few months ago I have always used handcoded queries, a bad habit from my ASP.classic days.
Linq2SQL - Very lightweight and easy to get up to speed. I love the strongly typed querying possibilities and the fact that the SQL is not executed at once. Instead it is executed when your query is ready (all the filters applied) thus you can split the data access from the filtering of the data. Also Linq2SQL lets me use domain objects that are separate from the data objects which are dynamically generated. I have not tried Linq2SQL on a larger project but so far it seems promising. Oh it only supports MS SQL which is a shame.
Entity Framework - I played around with it a little bit and did not like it. It seems to want to do everything for me and it does not work well with stored procedures. EF supports Linq2Entities which again allows strongly typed queries. I think it is limited to MS SQL but I could be wrong.
SubSonic 3.0 (Alpha) - This is a newer version of SubSonic which supports Linq. The cool thing about SubSonic is that it is based on template files (T4 templates, written in C#) which you can easily modify. Thus if you want the auto-generated code to look different you just change it :). I have only tried a preview so far but will look at the Alpha today. Take a look here SubSonic 3 Alpha. Supports MS SQL but will support Oracle, MySql etc. soon.
So far my conclusion is to use Linq2SQL until SubSonic is ready and then switch to that, since SubSonic's templates allow much more customization.
There is at least another one: System Prevalence.
As far as I can tell, what is optimal for you depends a lot on your circumstances. I could see how for very simple systems, using direct queries still could be a good idea. Also, I have seen Hibernate fail to work well with complex, legacy database schemata, so using an ORM might not always be a valid option. System Prevalence is supposed to be unbeatably fast, if you have enough memory to fit all your objects into RAM. Don't know about LINQ, but I suppose it has its uses, too.
So, as so often, the answer is: know a variety of tools for the job, so that you are able to use the one that's most appropriate for your specific situation.
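Since System Prevalence is the least familiar option here, a toy sketch may help. It is written in Python, the `PrevalentSystem` class and its account example are invented for the illustration, and it leaves out the periodic snapshots that real prevalence layers typically add. The core idea is that all objects live in RAM, every change is written to a journal first, and the journal is replayed on startup.

```python
import json
import os
import tempfile


class PrevalentSystem:
    """All state lives in memory; every change is journaled before it is applied."""

    def __init__(self, journal_path):
        self.journal_path = journal_path
        self.accounts = {}                       # the in-memory "database"
        if os.path.exists(journal_path):
            with open(journal_path) as journal:  # rebuild state by replaying commands
                for line in journal:
                    self._apply(json.loads(line))

    def _apply(self, command):
        name, amount = command["name"], command["amount"]
        self.accounts[name] = self.accounts.get(name, 0) + amount

    def deposit(self, name, amount):
        command = {"name": name, "amount": amount}
        with open(self.journal_path, "a") as journal:
            journal.write(json.dumps(command) + "\n")  # persist first
        self._apply(command)                            # then mutate memory


path = os.path.join(tempfile.mkdtemp(), "journal.log")
system = PrevalentSystem(path)
system.deposit("ann", 100)
system.deposit("ann", -30)

# Simulate a restart: a fresh instance recovers its state from the journal.
restarted = PrevalentSystem(path)
```

Reads never touch the disk at all, which is where the speed comes from, and also why everything has to fit in memory.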
The best practice depends on your situation.
If you need database objects in table structures with some sort of meaningful structure (so one column per field, one row per entity and so on) you need some sort of translation layer in between objects and the database. These fall into two camps:
If there's no logic in the database (just storage) and tables map to objects well, then an ORM solution can provide a quick and reliable persistence system. Java systems like Toplink and Hibernate are mature technologies for this.
If there is database logic involved in persistence, or your database schema has drifted from your object model significantly, stored procedures wrapped by Data Access Objects (with further patterns as you like) are a little more involved than ORM but more flexible.
If you don't need structured storage (and you need to be really sure that you don't, as introducing it to existing data is not fun), you can store serialized object graphs directly in the database, bypassing a lot of complexity.
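As a small sketch of that last option, here is what storing a serialized object graph can look like. It uses Python with sqlite3 and JSON as the serialization format; the `documents` table and the `save`/`load` functions are assumptions made up for the example.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE documents (key TEXT PRIMARY KEY, body TEXT)")


def save(conn, key, obj):
    # The whole graph goes into one column; the database sees opaque text.
    conn.execute("INSERT OR REPLACE INTO documents VALUES (?, ?)",
                 (key, json.dumps(obj)))


def load(conn, key):
    row = conn.execute(
        "SELECT body FROM documents WHERE key = ?", (key,)).fetchone()
    return json.loads(row[0]) if row else None


order = {"id": 7, "customer": {"name": "Ann"},
         "lines": [{"sku": "A1", "qty": 2}, {"sku": "B4", "qty": 1}]}
save(conn, "order:7", order)
restored = load(conn, "order:7")
```

There is no mapping code at all, but the database can no longer answer questions like "which orders contain sku B4" without loading and deserializing every row, which is the structure you are giving up.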
I prefer to write my own SQL, but I apply all my refactoring techniques and other "good stuff" when I do so.
I have written data access layers, ORM code generators, persistence layers, UnitOfWork transaction management, and LOTS of SQL. I've done that in systems of all shapes and sizes, including extremely high-performance data feeds (forty thousand files totaling forty million transactions per day, each loaded within two minutes of real-time).
The most important criterion is destiny, as in control thereof. Don't ever let your ORM tool be an obstacle to getting your work done, or an excuse for not doing it right. Ultimately, all good SQL is hand-written and hand-tuned, but some decent tools can help you get a good first draft quickly.
I treat this issue the same way that I do my UI design. I write all my UIs directly in code, but I might use a visual designer to prototype some essential elements that I have in mind, then I tear apart the code it generates in order to kickstart my own.
So, use an ORM tool in any of its manifestations as a way to get a decent example--look at how it solves many of the issues that arise (key generation, associations, navigation, etc.). Tear apart its output, make it your own, then reuse the heck out of it.
Is LINQ a kind of Object-Relational Mapper?
LINQ in itself is a set of language extensions to aid querying, readability and reduce code. LINQ to SQL is a kind of OR Mapper, but it isn't particularly powerful. The Entity Framework is often referred to as an OR Mapper, but it does quite a lot more.
There are several other LINQ to X implementations around, including LINQ to NHibernate and LINQ to LLBLGenPro that offer OR Mapping and supporting frameworks in a broadly similar fashion to the Entity Framework.
If you are just learning LINQ though, I'd recommend you stick to LINQ to Objects to get a feel for it, rather than diving into one of the more complicated flavours :-)
LINQ is not an ORM at all. LINQ is a way of querying "stuff", and can be more or less seen as a SQL-like language extension for different things (IEnumerables).
There are various types of "stuff" that can be queried, among them SQL Server databases. This is called LINQ-to-SQL. The way it works is that it generates (implicit) classes based on the structure of the DB and your query. In this sense it works much more like a code generator.
LINQ-to-SQL is not an ORM because it doesn't try at all to solve the object-relational impedance mismatch. In an ORM you design the classes and then either map them manually to tables or let the ORM generate the database. If you then change the database for whatever reason (typically refactoring, renormalization, denormalization), many times you are able to keep the classes as they are by changing the mapping.
LINQ-to-SQL does nothing of the sort. Your LINQ queries will be tightly coupled to the database structure. If you change the DB, you will probably have to change the LINQ as well.
LINQ to SQL (part of Visual Studio 2008) is an OR Mapper.
LINQ is a new query language that can be used to query many different types of sources.
LINQ itself is not an ORM. LINQ is the set of language features and methods that allow you to query objects in a SQL-like way.
"LINQ to SQL" is a provider that allows us to use LINQ against SQL strongly-typed objects.
I think a good test to ascertain whether a platform or code block displays the characteristics of an O/R-M is simply:
With his solution hat on, does the developer(s) (or his/her code generator) have any direct, unabstracted knowledge of what's inside the database?
With this criterion, the answer for differing LINQ implementations can be
Yes, knowledge of the database schema is entirely contained within the roll-your-own, LINQ utilizing O/R-M code layer
or
No, knowledge of the database schema is scattered throughout the application
Further, I'd extend this characterization to three simple levels of O/R-M.
1. Abandonment.
It's a small app w/ a couple of developers and the object/data model isn't that complex and doesn't change very often. The small dev team can stay on top of it.
2. Roll your own in the data access layer.
With some manageable refactoring in a data access layer, the desired O/R-M functionality can be effected in an intermediate layer by the relatively small dev team. Enough to keep the entire team on the same page.
3. Enterprise-level O/R-M specification defining/overhead introducing tools.
At some level of complexity, the need to keep all devs on the same page just swamps any overhead introduced by the formality. No need to reinvent the wheel at this level of complexity. NHibernate or the (rough) V1.0 Entity Framework are examples of this scale.
For a richer classification, from which I borrowed and simplified, see Ted Neward's classic post at
http://blogs.tedneward.com/2006/06/26/The+Vietnam+Of+Computer+Science.aspx
where he classifies O/R-M treatments (or abdications) as
1. Abandonment. Developers simply give up on objects entirely, and return to a programming model that doesn't create the object/relational impedance mismatch. While distasteful, in certain scenarios an object-oriented approach creates more overhead than it saves, and the ROI simply isn't there to justify the cost of creating a rich domain model. ([Fowler] talks about this to some depth.) This eliminates the problem quite neatly, because if there are no objects, there is no impedance mismatch.
2. Wholehearted acceptance. Developers simply give up on relational storage entirely, and use a storage model that fits the way their languages of choice look at the world. Object-storage systems, such as the db4o project, solve the problem neatly by storing objects directly to disk, eliminating many (but not all) of the aforementioned issues; there is no "second schema", for example, because the only schema used is that of the object definitions themselves. While many DBAs will faint dead away at the thought, in an increasingly service-oriented world, which eschews the idea of direct data access but instead requires all access go through the service gateway thus encapsulating the storage mechanism away from prying eyes, it becomes entirely feasible to imagine developers storing data in a form that's much easier for them to use, rather than DBAs.
3. Manual mapping. Developers simply accept that it's not such a hard problem to solve manually after all, and write straight relational-access code to return relations to the language, access the tuples, and populate objects as necessary. In many cases, this code might even be automatically generated by a tool examining database metadata, eliminating some of the principal criticism of this approach (that being, "It's too much code to write and maintain").
4. Acceptance of O/R-M limitations. Developers simply accept that there is no way to efficiently and easily close the loop on the O/R mismatch, and use an O/R-M to solve 80% (or 50% or 95%, or whatever percentage seems appropriate) of the problem and make use of SQL and relational-based access (such as "raw" JDBC or ADO.NET) to carry them past those areas where an O/R-M would create problems. Doing so carries its own fair share of risks, however, as developers using an O/R-M must be aware of any caching the O/R-M solution does within it, because the "raw" relational access will clearly not be able to take advantage of that caching layer.
5. Integration of relational concepts into the languages. Developers simply accept that this is a problem that should be solved by the language, not by a library or framework. For the last decade or more, the emphasis on solutions to the O/R problem have focused on trying to bring objects closer to the database, so that developers can focus exclusively on programming in a single paradigm (that paradigm being, of course, objects). Over the last several years, however, interest in "scripting" languages with far stronger set and list support, like Ruby, has sparked the idea that perhaps another solution is appropriate: bring relational concepts (which, at heart, are set-based) into mainstream programming languages, making it easier to bridge the gap between "sets" and "objects". Work in this space has thus far been limited, constrained mostly to research projects and/or "fringe" languages, but several interesting efforts are gaining visibility within the community, such as functional/object hybrid languages like Scala or F#, as well as direct integration into traditional O-O languages, such as the LINQ project from Microsoft for C# and Visual Basic. One such effort that failed, unfortunately, was the SQL/J strategy; even there, the approach was limited, not seeking to incorporate sets into Java, but simply allow for embedded SQL calls to be preprocessed and translated into JDBC code by a translator.
6. Integration of relational concepts into frameworks. Developers simply accept that this problem is solvable, but only with a change of perspective. Instead of relying on language or library designers to solve this problem, developers take a different view of "objects" that is more relational in nature, building domain frameworks that are more directly built around relational constructs. For example, instead of creating a Person class that holds its instance data directly in fields inside the object, developers create a Person class that holds its instance data in a RowSet (Java) or DataSet (C#) instance, which can be assembled with other RowSets/DataSets into an easy-to-ship block of data for update against the database, or unpacked from the database into the individual objects.
Linq To SQL using the dbml designer, yes; otherwise Linq is just a set of extension methods for Enumerables.