Why would a Database developer use LINQ - sql-server

I am curious to know why a database person should learn LINQ. How can it be useful?

LINQ has one big advantage over most other data access methods: it centers around the 'query expression' metaphor, meaning that queries can be passed around like objects and be altered and modified, all before they are executed (iterated). In practice that means code can be modular and better isolated. The data access repository returns the 'orders' query, then an intermediate step in the request processing pipeline decorates this query with a filter, then it gets passed on to a display module that adds sorting, pagination etc. In the end, when it's iterated, the expression has been transformed into a very specific SELECT ... WHERE ... ORDER BY ... LIMIT ... (or another back-end-specific form of pagination, like ROW_NUMBER). For application developers this is priceless, and there simply isn't any viable alternative. This is why I believe LINQ is here to stay and won't vanish in 2 years. It is definitely more than just a fad. And I'm specifically referring to LINQ as a database access method.
The advantage of manipulating query expression objects alone is enough of a factor to make LINQ a winning bid. Add to this the multiple container types it can manipulate (XML, arrays and collections, objects, SQL) and the uniform interface it exposes over all these disparate technologies, consider the parallel processing changes coming in .NET 4.0 that I'm sure will be integrated transparently into LINQ processing of arrays and collections, and there is really no way LINQ will go away. Sure, today it sometimes produces unreadable, poorly performing and undebuggable SQL, and it is every dedicated DBA's nightmare. It will get better.
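To make the "pass the query around, execute it last" idea concrete, here is a minimal sketch in Python using the standard sqlite3 module. It is not LINQ and not any real library's API: the Query class, its where/order_by/take methods and the orders table are all made up for illustration, and it only mimics the composition idea, not expression trees.

```python
import sqlite3


class Query:
    """A tiny deferred query: every method returns a new Query,
    and nothing is sent to the database until the query is iterated."""

    def __init__(self, conn, table, wheres=(), params=(), order=None, limit=None):
        self.conn = conn
        self.table = table
        self.wheres = tuple(wheres)
        self.params = tuple(params)
        self.order = order
        self.limit = limit

    def where(self, condition, *params):
        return Query(self.conn, self.table, self.wheres + (condition,),
                     self.params + params, self.order, self.limit)

    def order_by(self, column):
        return Query(self.conn, self.table, self.wheres, self.params,
                     column, self.limit)

    def take(self, n):
        return Query(self.conn, self.table, self.wheres, self.params,
                     self.order, n)

    def to_sql(self):
        # Table, column and condition strings are assumed to come from
        # trusted code; only the values are bound as parameters.
        sql = f"SELECT * FROM {self.table}"
        if self.wheres:
            sql += " WHERE " + " AND ".join(self.wheres)
        if self.order:
            sql += f" ORDER BY {self.order}"
        if self.limit is not None:
            sql += f" LIMIT {int(self.limit)}"
        return sql

    def __iter__(self):
        # The database is only touched here, when the query is iterated.
        return iter(self.conn.execute(self.to_sql(), self.params).fetchall())


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)")
conn.executemany("INSERT INTO orders (customer, total) VALUES (?, ?)",
                 [("ann", 10.0), ("bob", 25.0), ("ann", 40.0), ("ann", 5.0)])

orders = Query(conn, "orders")                   # repository returns the base query
for_ann = orders.where("customer = ?", "ann")    # pipeline step adds a filter
page = for_ann.order_by("total DESC").take(2)    # display module sorts and paginates

print(page.to_sql())
print(list(page))
```

Each layer only knows about its own concern, and the single SELECT is assembled at the very end. Real LINQ providers do this by translating expression trees rather than by concatenating strings.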

LINQ provides "first class" access to your data, meaning that your queries are now part of the programming language.
LINQ provides a common way to access data objects of all kinds. For example, the same syntax that is used to query your database can also be used to query lists, arrays, and XML files.
Learning LINQ will give you a deeper understanding of the programming language. LINQ was the driver for all sorts of language innovations, such as extension methods, anonymous types, lambda expressions, expression trees, type inference, and object initializers.

I think a better question is why would one replace SQL with LINQ when querying a database?

"Know your enemy" ?
OTOH, out of interest: I'm learning more about XML, XSLT, XSDs etc. because I can see a use for them as a database developer.

Is this a real question? It can't be answered with a code snippet. Most of these discussion type questions get closed quickly.

I think one of the many reasons to consider LINQ is that it is a big productivity boost and a huge time saver.

This question also prompts me to ask, if LINQ becomes immensely popular, will server side developers know SQL as well as their predecessors did?

Related

ICursor implementation for ORM in MonoDroid

I'm trying to figure out how to implement the Android database Cursor so that it wraps an "ORMed" database layer.
To have an ORM in MonoDroid we can use the sqlite-net project (a very lightweight ORM) or ServiceStack.OrmLite.
My thought is to implement the ICursor interface and "wrap" the ORM.
For now I can't quite picture how it should work, or whether it can work at all.
Should it load a "framed" set of results, or fetch them one by one?
Which is better for performance, and how should column values be read: reflection or something else?
So the actual question is: is it possible at all?
Any thoughts will be appreciated.
Thanks.
I'm not sure what "problem" you're trying to solve with an ICursor implementation; perhaps you should be a little more specific about the task you're trying to do. The entire point of an ORM (and you missed this one that also supports SQLite on Android) is to abstract away the whole RDBMS paradigm from the code and give you an object-oriented paradigm instead.
An ICursor gives you back an updatable resultset from a SQL query - which means you have to know about rows, resultsets, queries and all of that. An ORM gives back an object, or a collection of objects. If you want to update one, you update the object and send it back to the ORM.
Now I fully admit that there are times when an ORM might not provide the cleanest mechanism for something that a SQL query does well. For example, suppose you logically wanted to "delete all parts built yesterday during second shift". A lightweight ORM might give you all parts, and then you have to use LINQ or similar to filter that down to the right day and shift and iterate the resulting collection to delete each one, whereas with a SQL query you just pass in a DELETE FROM Parts WHERE BornONDate BETWEEN #start AND #end. That's one of the trade-offs you face.
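A rough sketch of that trade-off, in Python with the standard sqlite3 module rather than any particular ORM. The parts table, the born_on column and the shift boundaries are invented for the example, and the "ORM style" function only imitates what a lightweight ORM would make you do.

```python
import sqlite3

# Hypothetical second-shift window; ISO-formatted text compares correctly as strings.
START, END = "2024-01-01 15:00", "2024-01-01 23:00"


def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE parts (id INTEGER PRIMARY KEY, born_on TEXT)")
    conn.executemany("INSERT INTO parts (born_on) VALUES (?)", [
        ("2024-01-01 09:00",), ("2024-01-01 15:30",),
        ("2024-01-01 18:45",), ("2024-01-02 16:00",),
    ])
    return conn


def delete_via_objects(conn):
    """ORM style: load every part, filter in memory, delete one at a time."""
    parts = conn.execute("SELECT id, born_on FROM parts").fetchall()
    doomed = [pid for pid, born_on in parts if START <= born_on <= END]
    for pid in doomed:
        conn.execute("DELETE FROM parts WHERE id = ?", (pid,))
    return 1 + len(doomed)  # number of statements sent to the database


def delete_via_sql(conn):
    """Set-based style: one statement, nothing loaded into the application."""
    conn.execute("DELETE FROM parts WHERE born_on BETWEEN ? AND ?", (START, END))
    return 1


def remaining(conn):
    return [row[0] for row in conn.execute("SELECT id FROM parts ORDER BY id")]


a, b = make_db(), make_db()
print(delete_via_objects(a), remaining(a))
print(delete_via_sql(b), remaining(b))
```

Both variants leave the same rows behind; the difference is how much data crosses the wire and how many statements are executed.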
In some cases the ORM might provide a facility to do what you want. For example, in the OpenNETCF ORM linked above, you can cast your DataStore (if it isn't already) to a SQLDataStore and then you have access to the ExecuteNonQuery method, allowing you to pass in a direct SQL statement. It still doesn't have a means to pass you back a record set because, as I said, returning database rows is really the antithesis of an ORM.
There's also some inherent risk in using something like ExecuteNonQuery. If you want to change your backing store from, say, an RDBMS like SQLite to something totally different like an object database, an XML file or whatever, then your code that builds and uses a SQL query breaks. Admittedly this might not be common, but if code portability and extensibility are on your radar, then it's at least something to keep in mind.

Is there a database like this?

Background: Okay, so I'm looking for what I guess is an object database. However, the (admittedly few) object databases that I've looked at have been simple persistence layers, and not full-blown DBMSs. I don't know if what I'm looking for is even considered an object database, so really any help in pointing me in the right direction would be very appreciated.
I don't want to give you two pages describing what I'm looking for so I'll use an example to illustrate my point. Let's say I have a "BlogPost" object that I need to store. Something like this, in pseudocode:
class BlogPost
    title:String
    body:String
    author:User
    tags:List<String>
    comments:List<Comment>
(Assume Comment is its own class.)
Now, in a relational database, author would be stored as a foreign key pointing to a User.id, and the tags and comments would be stored as one-to-many or many-to-many relationships using a separate table to store the relationships. What I'd like is a database engine that does the following:
Stores related objects (author, tags, etc.) with a direct reference instead of using foreign keys, which require an additional lookup; in other words, objects on top of each other should be natively supported by the database
Allows me to add a comment or a tag to the blog post without retrieving the entire object, updating it, and then putting it back into the database (like a document-oriented database -- CouchDB being an example)
I guess what I'm looking for is a navigational database, but I don't know. Is there anything even remotely similar to what I'm thinking of? If so, what is it called? (Or better yet, give me an actual working database.) Or am I being too picky?
Edit:
Just to clarify, I am NOT looking for an ORM or an abstraction layer or anything like that. I am looking for an actual database that does this internally. Sorry if I'm being difficult, but I've searched and I couldn't find anything.
Edit:
Also, something for the JVM would be excellent, but at this point I really don't care what platform it runs on.
I think what you are describing could easily be modeled in a graph database. Then you get the benefit of navigating to the nodes/edges where you want to make changes without any need to retrieve anything else. For the JVM there's the Neo4j open source graph database (where I'm part of the team). You can read about it over at High Scalability, as part of an overview at thinkvitamin or in this stackoverflow thread. As for the tags, I think storing them in a graph database can give you some extra advantages if you want to find related tags and similar stuff. Just drop a line on the mailing list, and I'm sure the community will help you out.
You could try out db4o which is available in C# and Java.
I think you're looking for this: http://www.odbms.org/. This site has some good info on object databases, including Objectivity, which is a pretty good object database.
Elephant does this: http://common-lisp.net/project/elephant/
Exactly what you've described can be done with (N)Hibernate running on an ordinary RDBMS.
The advantage of using such a persistence layer with an ordinary database is that you have a standard database system combined with convenient programming. You declare your classes in a very natural way, and (N)Hibernate provides a way to translate between references/lists and foreign key relationships.
Java tutorial: http://docs.jboss.org/hibernate/stable/core/reference/en/html/tutorial-firstapp.html
.NET tutorial: https://web.archive.org/web/20081212181310/http://blogs.hibernatingrhinos.com/nhibernate/archive/2008/04/01/your-first-nhibernate-based-application.aspx
If you insist that you don't want to use a well-supported standard RDBMS and would rather trust your data to something more exotic and less heavily tested, you're looking for an Object Relational Database.
However, such a product would probably be best implemented by making it be a layer over a standard RDBMS anyway. This is probably why ORMs like (N)Hibernate are the most popular solution - they allow standard RDBMS software (and widely available management/user skills) to be applied, and yet the programming experience is 99% object-based.
This is exactly what LINQ was designed for.
Microsoft LINQ defines a set of proprietary query operators that can be used to query, project and filter data in arrays, enumerable classes, XML (XLINQ), relational database, and third party data sources. While it allows any data source to be queried, it requires that the data be encapsulated as objects. So, if the data source does not natively store data as objects, the data must be mapped to the object domain. Queries written using the query operators are executed either by the LINQ query processing engine or, via an extension mechanism, handed over to LINQ providers which either implement a separate query processing engine or translate to a different format to be executed on a separate data store (such as on a database server as SQL queries (DLINQ)). The results of a query are returned as a collection of in-memory objects that can be enumerated using a standard iterator function such as C#'s foreach.
There's a variety of terms, all linked to Object-Relational Mapping, aka ORM, which is probably going to be the most useful one for you to look up. ORM libraries exist for many programming languages.
Oracle's nested tables provide some part of that functionality, though in updates, you cannot just add a row to the nested table - you have to replace the whole nested table.
I guess you're looking for an ORM with "EntityFirst" approach.
In the EntityFirst approach the developer is hardly, if at all, concerned with the database. You just have to build your entities or objects. The ORM then takes care of storing the entities in the database and retrieving them at will.
The only EntityFirst ORM that I know of is "Signum". It's a wonderful framework built on top of .NET. I recommend you go through some of the videos on the Signum Framework website; I'm sure you'll find it useful.
http://www.signumframework.com
Thanks.
ZODB perhaps?
You can find a good introduction here:
http://www.ibm.com/developerworks/aix/library/au-zodb/
You could try out STSdb, DB4O, Perst ... which are available in C# and Java.

Marrying up consumer-defined aggregates (e.g. SQL counts) with 'pure' model objects?

What is the best practice for introducing custom (typically volatile) data into entity model classes? This may sound like a bad practice at first, but it seems to be quite a common scenario. In our recent web application we have developed a proper model and in most cases we are fine with loading model entities. But there are cases where we cannot afford loading an entire hierarchy of entities; we need to load, say, results of a couple of SQL COUNTs or possibly some additional information alongside (or embedded inside) the model entities. So basically, the requirements and conditions are:
It’s a web application where 99.9999999999% of all operations are read operations.
They don’t need to process or do any complicated business logic. We just need to get data quickly to HTML.
In several performance critical cases, we need to load results of SQL aggregates which don’t fit any model properties.
We need an extensible way to introduce any new custom data if needed.
How do you usually solve this issue without working too much around your ORM (for instance raw data from db)? I’m sure this has been discussed many times, but I cannot figure out a good Google query to find anything useful.
Edit: Since I later realized the question was not very well formed, I decided to reformulate it and start a new one.
If you're just getting relational data to and from a browser, with little or no behavior in between, it sounds like you're trying to solve a relational problem with an OO paradigm.
I might be inclined to dispense with the Object Oriented approach altogether.
My team recently rewrote an application by asking "What is the simplest thing that can possibly work?" and "What is the closest language to the problem?". Our new app, replacing an OO one, ended up being 10 times smaller, faster, and cheaper.
We used SQL, stored procedures, XML libraries on the DB server, XSLT (to get the HTML), and javascript.
An OOP purist like myself would go for the Decorator pattern.
http://en.wikipedia.org/wiki/Decorator_pattern
But the thing is, some people may not need the flexibility it offers. Plus, creating new classes for each distinct operation may seem like overkill, but it provides good compile-time type checking.
The best practice in my view is that your application consumes data using the Domain Model pattern. The Domain Model can offer business-logic methods for doing the type of queries that make sense and are relevant to your application needs.
These can fetch "live" results that map directly to database rows and can therefore be edited and "saved."
But additionally, the Domain Model can provide methods that fetch read-only results that are too complex to be easily saved back to the database. This includes your example of grouped aggregate query results, and also includes joined query result sets, expressions as columns, etc.
The Domain Model pattern offers a way to decouple the OO design of an application from the design of the physical database.
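As a rough sketch of what that could look like (in Python with sqlite3, since the question doesn't name a language or ORM): the repository below hands out an editable Post entity that maps to one row, plus a frozen PostSummary that carries a COUNT and is never saved back. All class, table and column names here are assumptions made up for the example.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Post:
    """'Live' entity: maps directly to one row and can be saved back."""
    id: int
    title: str


@dataclass(frozen=True)
class PostSummary:
    """Read-only result: carries an aggregate that has no column to save to."""
    id: int
    title: str
    comment_count: int


class PostRepository:
    def __init__(self, conn):
        self.conn = conn

    def get(self, post_id):
        row = self.conn.execute(
            "SELECT id, title FROM posts WHERE id = ?", (post_id,)).fetchone()
        return Post(*row) if row else None

    def save(self, post):
        self.conn.execute(
            "UPDATE posts SET title = ? WHERE id = ?", (post.title, post.id))

    def summaries(self):
        # One query with a grouped aggregate instead of loading every comment.
        rows = self.conn.execute(
            "SELECT p.id, p.title, COUNT(c.id) FROM posts p "
            "LEFT JOIN comments c ON c.post_id = p.id "
            "GROUP BY p.id, p.title ORDER BY p.id")
        return [PostSummary(*r) for r in rows]


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE comments (id INTEGER PRIMARY KEY, post_id INTEGER, body TEXT);
    INSERT INTO posts (id, title) VALUES (1, 'Hello'), (2, 'World');
    INSERT INTO comments (post_id, body) VALUES (1, 'first'), (1, 'second');
""")

repo = PostRepository(conn)
summaries = repo.summaries()
print(summaries)

post = repo.get(1)
post.title = "Hello again"   # the live entity is editable
repo.save(post)
print(repo.get(1))
```

The consumer asks the model for "post summaries" and never sees the SQL, while the entity classes themselves stay free of volatile, query-specific fields.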

What is a good balance in an MVC model to have efficient data access?

I am working on a few PHP projects that use MVC frameworks, and while they all have different ways of retrieving objects from the database, it always seems that nothing beats writing your SQL queries by hand as far as speed and cutting down on the number of queries.
For example, one of my web projects (written by a junior developer) executes over 100 queries just to load the home page. The reason is that in one place, a method will load an object, but later on deeper in the code, it will load some other object(s) that are related to the first object.
This leads to the other part of the question which is what are people doing in situations where you have a table that in one part of the code only needs the values for a few columns, and another part needs something else? Right now (in the same project), there is one get() method for each object, and it does a "SELECT *" (or lists all the columns in the table explicitly) so that anytime you need the object for any reason, you get the whole thing.
So, in other words, you hear all the talk about how SELECT * is bad, but if you try to use an ORM class that comes with the framework, it usually wants to do just that. Are you stuck choosing between an ORM with SELECT * and writing the specific SQL queries by hand? It just seems to me that we're stuck between convenience and efficiency, and if I hand-write the queries and then add a column, I'm most likely going to have to add it in several places in the code.
Sorry for the long question, but I'm explaining the background to get some mindsets from other developers rather than maybe a specific solution. I know that we can always use something like Memcached, but I would rather optimize what we can before getting into that.
Thanks for any ideas.
First, assuming you are proficient at SQL and schema design, there are very few instances where any abstraction layer that removes you from the SQL statements will exceed the efficiency of writing the SQL by hand. More often than not, you will end up with suboptimal data access.
There's no excuse for 100 queries just to generate one web page.
Second, if you are using the Object Oriented features of PHP, you will have good abstractions for collections of objects, and the kinds of extended properties that map to SQL joins. But the important thing to keep in mind is to write the best abstracted objects you can, without regard to SQL strategies.
When I write PHP code this way, I always find that I'm able to map the data requirements for each web page to very few, very efficient SQL queries if my schema is proper and my classes are proper. And not only that, but my experience is that this is the simplest and fastest way to implement. Putting framework stuff in the middle between PHP classes and a good solid thin DAL (note: NOT embedded SQL or dbms calls) is the best example I can think of to illustrate the concept of "leaky abstractions".
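For what it's worth, the "100 queries per page" symptom from the question is usually the lazy-loading (N+1) pattern, and the fix is the kind of single purpose-built query described above. A small sketch in Python with sqlite3 (not PHP, and not any framework's API; the tables and the run helper are made up for the example) that counts the statements each approach issues:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT, author_id INTEGER);
    INSERT INTO authors (id, name) VALUES (1, 'ann'), (2, 'bob');
    INSERT INTO articles (title, author_id) VALUES ('a', 1), ('b', 2), ('c', 1);
""")

query_count = 0


def run(sql, params=()):
    """Execute a query and count it, so the two approaches can be compared."""
    global query_count
    query_count += 1
    return conn.execute(sql, params).fetchall()


def home_page_lazy():
    # One query for the list, then one more per article for its author.
    page = []
    for title, author_id in run("SELECT title, author_id FROM articles ORDER BY id"):
        (name,) = run("SELECT name FROM authors WHERE id = ?", (author_id,))[0]
        page.append((title, name))
    return page


def home_page_joined():
    # One query that selects only the columns this page needs.
    return run("SELECT a.title, u.name FROM articles a "
               "JOIN authors u ON u.id = a.author_id ORDER BY a.id")


query_count = 0
lazy = home_page_lazy()
lazy_queries = query_count

query_count = 0
joined = home_page_joined()
joined_queries = query_count

print(lazy_queries, joined_queries, lazy == joined)
```

With three articles the lazy version already needs four queries against one for the join, and the gap grows linearly with the number of rows on the page.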
I got a little lost with your question, but if you are looking for a way to do database access, you can do it a couple of ways. For example, your MVC application can use the Zend Framework, which comes with database access abstractions.
Also keep in mind that you should design your system to avoid contention in the database: if your queries are scattered across the PHP pages, they may lock tables, and the overall web application will get slower over time.
That is why it is sometimes preferable to use stored procedures: the query lives in one place and can be tuned when needed, though others may argue that it is easier to debug if the query statements are on the front end.
No ORM framework will even get close to hand-written SQL in terms of speed. 100 queries does seem unrealistic (maybe you are exaggerating a bit), but even if the creator of the ORM framework were writing the code, it would still be far from the speed of good old SQL.
My advice is, look at the whole picture not only speed:
Does the framework improve code readability?
Is your team comfortable with writing SQL and mixing it with code?
Do you really understand how to optimize the framework queries? (I think a get() for each object is not the optimal way of retrieving them)
Do the queries (after optimization) of the framework present a bottleneck?
I've never developed anything with PHP, but I think you could mix both approaches (ORM and plain SQL). After a thorough profiling of the app you can determine the real bottlenecks and only then replace that ORM code with hand-written SQL. (Usually in Ruby you use ActiveRecord, then you profile the application with something like New Relic, and finally, if you have a complicated AR query, you replace it with some SQL.)
Regards
Trust your experience.
To not repeat yourself so much in the code you could write some simple model-functions with your own SQL. This is what I am doing all the time and I am happy with it.
Much of the "convenience" stuff was written for people who need magic because they cannot do it by hand or just don't have the experience.
And after all it's a question of style.
Don't hesitate to add your own layer, or to exchange or extend a given layer with your own stuff. Keep it clean, make a good design, and write some documentation so you feel at home when you come back later.

Is LINQ an Object-Relational Mapper?

Is LINQ a kind of Object-Relational Mapper?
LINQ in itself is a set of language extensions to aid querying, readability and reduce code. LINQ to SQL is a kind of OR Mapper, but it isn't particularly powerful. The Entity Framework is often referred to as an OR Mapper, but it does quite a lot more.
There are several other LINQ to X implementations around, including LINQ to NHibernate and LINQ to LLBLGenPro that offer OR Mapping and supporting frameworks in a broadly similar fashion to the Entity Framework.
If you are just learning LINQ though, I'd recommend you stick to LINQ to Objects to get a feel for it, rather than diving into one of the more complicated flavours :-)
LINQ is not an ORM at all. LINQ is a way of querying "stuff", and can be more or less seen as a SQL-like language extension for different things (IEnumerables).
There are various types of "stuff" that can be queried, among them SQL Server databases. This is called LINQ-to-SQL. The way it works is that it generates (implicit) classes based on the structure of the DB and your query. In this sense it works much more like a code generator.
LINQ-to-SQL is not an ORM because it doesn't try at all to solve the object-relational impedance mismatch. In an ORM you design the classes and then either map them manually to tables or let the ORM generate the database. If you then change the database for whatever reason (typically refactoring, renormalization, denormalization), many times you are able to keep the classes as they are by changing the mapping.
LINQ-to-SQL does nothing of the sort. Your LINQ queries will be tightly coupled to the database structure. If you change the DB, you will probably have to change the LINQ as well.
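To illustrate the "keep the classes, change the mapping" point, here is a toy sketch in Python with sqlite3. It is not how any real ORM is implemented; the Customer class, the load_customers helper and both schemas are assumptions invented for the example.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Customer:
    name: str
    email: str


def load_customers(conn, table, mapping):
    """mapping: attribute name -> column name.
    Table and column names are assumed to come from trusted mapping
    configuration, never from user input."""
    cols = ", ".join(f"{col} AS {attr}" for attr, col in mapping.items())
    cur = conn.execute(f"SELECT {cols} FROM {table} ORDER BY 1")
    names = [d[0] for d in cur.description]  # the aliases, i.e. attribute names
    return [Customer(**dict(zip(names, row))) for row in cur]


# Original schema.
old_db = sqlite3.connect(":memory:")
old_db.execute("CREATE TABLE customers (name TEXT, email TEXT)")
old_db.execute("INSERT INTO customers VALUES ('ann', 'ann@example.com')")
OLD_MAPPING = {"name": "name", "email": "email"}

# Schema after a refactoring that renamed both columns.
new_db = sqlite3.connect(":memory:")
new_db.execute("CREATE TABLE customers (full_name TEXT, email_address TEXT)")
new_db.execute("INSERT INTO customers VALUES ('ann', 'ann@example.com')")
NEW_MAPPING = {"name": "full_name", "email": "email_address"}

old_result = load_customers(old_db, "customers", OLD_MAPPING)
new_result = load_customers(new_db, "customers", NEW_MAPPING)
print(old_result == new_result, new_result)
```

The Customer class and every caller stay untouched across the rename; only the mapping changed. Queries written directly against generated column names would have had to change along with the schema.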
LINQ to SQL (part of Visual Studio 2008) is an OR Mapper.
LINQ is a new query language that can be used to query many different types of sources.
LINQ itself is not an ORM. LINQ is the set of language features and methods that let you query objects in a SQL-like way.
"LINQ to SQL" is a provider that allows us to use LINQ against SQL strongly-typed objects.
I think a good test to ascertain whether a platform or code block displays the characteristics of an O/R-M is simply:
With their solution hat on, does the developer (or their code generator) have any direct, unabstracted knowledge of what's inside the database?
With this criterion, the answer for differing LINQ implementations can be
Yes, knowledge of the database schema is entirely contained within the roll-your-own, LINQ-utilizing O/R-M code layer
or
No, knowledge of the database schema is scattered throughout the application
Further, I'd extend this characterization to three simple levels of O/R-M.
1. Abandonment.
It's a small app w/ a couple of developers and the object/data model isn't that complex and doesn't change very often. The small dev team can stay on top of it.
2. Roll your own in the data access layer.
With some manageable refactoring in a data access layer, the desired O/R-M functionality can be effected in an intermediate layer by the relatively small dev team. Enough to keep the entire team on the same page.
3. Enterprise-level O/R-M specification defining/overhead introducing tools.
At some level of complexity, the need to keep all devs on the same page just swamps any overhead introduced by the formality. No need to reinvent the wheel at this level of complexity. NHibernate or the (rough) V1.0 Entity Framework are examples of this scale.
For a richer classification, from which I borrowed and simplified, see Ted Neward's classic post at
http://blogs.tedneward.com/2006/06/26/The+Vietnam+Of+Computer+Science.aspx
where he classifies O/R-M treatments (or abdications) as
1. Abandonment. Developers simply give up on objects entirely, and return to a programming model that doesn't create the object/relational impedance mismatch. While distasteful, in certain scenarios an object-oriented approach creates more overhead than it saves, and the ROI simply isn't there to justify the cost of creating a rich domain model. ([Fowler] talks about this to some depth.) This eliminates the problem quite neatly, because if there are no objects, there is no impedance mismatch.
2. Wholehearted acceptance. Developers simply give up on relational storage entirely, and use a storage model that fits the way their languages of choice look at the world. Object-storage systems, such as the db4o project, solve the problem neatly by storing objects directly to disk, eliminating many (but not all) of the aforementioned issues; there is no "second schema", for example, because the only schema used is that of the object definitions themselves. While many DBAs will faint dead away at the thought, in an increasingly service-oriented world, which eschews the idea of direct data access but instead requires all access go through the service gateway thus encapsulating the storage mechanism away from prying eyes, it becomes entirely feasible to imagine developers storing data in a form that's much easier for them to use, rather than DBAs.
3. Manual mapping. Developers simply accept that it's not such a hard problem to solve manually after all, and write straight relational-access code to return relations to the language, access the tuples, and populate objects as necessary. In many cases, this code might even be automatically generated by a tool examining database metadata, eliminating some of the principal criticism of this approach (that being, "It's too much code to write and maintain").
4. Acceptance of O/R-M limitations. Developers simply accept that there is no way to efficiently and easily close the loop on the O/R mismatch, and use an O/R-M to solve 80% (or 50% or 95%, or whatever percentage seems appropriate) of the problem and make use of SQL and relational-based access (such as "raw" JDBC or ADO.NET) to carry them past those areas where an O/R-M would create problems. Doing so carries its own fair share of risks, however, as developers using an O/R-M must be aware of any caching the O/R-M solution does within it, because the "raw" relational access will clearly not be able to take advantage of that caching layer.
5. Integration of relational concepts into the languages. Developers simply accept that this is a problem that should be solved by the language, not by a library or framework. For the last decade or more, the emphasis on solutions to the O/R problem have focused on trying to bring objects closer to the database, so that developers can focus exclusively on programming in a single paradigm (that paradigm being, of course, objects). Over the last several years, however, interest in "scripting" languages with far stronger set and list support, like Ruby, has sparked the idea that perhaps another solution is appropriate: bring relational concepts (which, at heart, are set-based) into mainstream programming languages, making it easier to bridge the gap between "sets" and "objects". Work in this space has thus far been limited, constrained mostly to research projects and/or "fringe" languages, but several interesting efforts are gaining visibility within the community, such as functional/object hybrid languages like Scala or F#, as well as direct integration into traditional O-O languages, such as the LINQ project from Microsoft for C# and Visual Basic. One such effort that failed, unfortunately, was the SQL/J strategy; even there, the approach was limited, not seeking to incorporate sets into Java, but simply allow for embedded SQL calls to be preprocessed and translated into JDBC code by a translator.
6. Integration of relational concepts into frameworks. Developers simply accept that this problem is solvable, but only with a change of perspective. Instead of relying on language or library designers to solve this problem, developers take a different view of "objects" that is more relational in nature, building domain frameworks that are more directly built around relational constructs. For example, instead of creating a Person class that holds its instance data directly in fields inside the object, developers create a Person class that holds its instance data in a RowSet (Java) or DataSet (C#) instance, which can be assembled with other RowSets/DataSets into an easy-to-ship block of data for update against the database, or unpacked from the database into the individual objects.
LINQ to SQL using the dbml designer: yes. Otherwise, LINQ is just a set of extension methods for enumerables.

Resources