Linq vs. database views

Here’s an interesting question. Suppose we have related tables in the database, for example, Instrument and Currency. The Instrument table has a currency_id field that maps to an entry in the Currency table. In Linq land, what’s the better way:
a) Create Instrument and Currency entities in the DataContext and then create an association, or simply use a join in Linq queries, or
b) Create a view in the database that joins Instrument and Currency (thus resolving currency_id to currency code) and use that as an entity in the Linq context?

Would you ever use them independently? If so, you will need entities for each one that will be used independently. I suspect that you will use Currency independently (say, for a dropdown that allows you to choose a currency when creating an instrument). That being the case, I think it would be easier to just keep them separate and have an association.

With the ORM now abstracting the data access logic, that particular function of views is no longer needed. It is best left to the ORM, since that's part of its job.
However, views may still be useful for simplifying stored procedure code, and even for creating useful indexes.

If you load in Instrument and later use the Currencies property to load in related Currencies, there will be two queries.
If you issue a LINQ query with a join, LINQ will convert that to SQL with a join and you get all the data at once.
If you set up a DataLoadOptions, you get all the data in one query and you do not have to write the join.
http://msdn.microsoft.com/en-us/library/system.data.linq.dataloadoptions.aspx
http://msdn.microsoft.com/en-us/library/system.data.linq.dataloadoptions.loadwith.aspx
// Tell the DataContext to eagerly load related Currencies
// in the same query as each Instrument.
DataLoadOptions dlo = new DataLoadOptions();
dlo.LoadWith<Instrument>(i => i.Currencies);
myDataContext.LoadOptions = dlo;

I find LINQ to be temperamental. I can run the same query 1 minute apart and get a different result. Note I am working off a local database, so I know the data hasn't changed. Using a view with a dataset is far more reliable in my opinion, especially with joins.

Related

Need a clever way to get orders from all stores while each store is in a different database

The setup
I have the following database setup:
CentralDB
    Table: Stores
    Table: Users
Store1DB
    Table: Orders
Store2DB
    Table: Orders
Store3DB
    Table: Orders
Store4DB
    Table: Orders
... etc
CentralDB contains the users, logging and a Stores table with the name of each store database and general information about each store such as address, name, description, image, etc...
All the StoreDBs use the same structure, just different data.
It is important to know that the list of stores will shrink and grow in the future.
The main client communicating with this setup is an API REST Service which gets passed a STOREID in the Header of each request telling it which database to connect to. This works flawlessly so far.
The reasoning
Whenever we need to do database maintenance on one store, we don't want all other stores to be down.
Backup management should be per store
Not having to write the WHERE storeID=x every time and for every table
Performance: each store could run on its own database server if the need arises
The goal
I need my REST API Service to somehow get all orders from all stores in one query.
Will you help me figure out a way to do this without hardcoding all storedb names? I was thinking about a stored procedure on the CentralDB but I was hoping there would be other solutions. In any case it has to be very efficient.
One option would be to have a list of databases stored in a "system" table in CentralDB.
Then you could create a stored procedure that reads the database names from that table, loops through them with a cursor, and generates dynamic SQL that UNIONs the results from all the databases. This way you get a single recordset of results.
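A minimal sketch of such a procedure, assuming SQL Server 2005 or later and that the Stores table in CentralDB carries the database name in a column called DatabaseName (the proc and column names here are illustrative):

CREATE PROCEDURE dbo.GetAllOrders
AS
BEGIN
    DECLARE @sql NVARCHAR(MAX), @db SYSNAME;
    SET @sql = N'';

    -- Walk the list of store databases registered in CentralDB.
    DECLARE store_cursor CURSOR FOR
        SELECT DatabaseName FROM dbo.Stores;
    OPEN store_cursor;
    FETCH NEXT FROM store_cursor INTO @db;
    WHILE @@FETCH_STATUS = 0
    BEGIN
        IF @sql <> N'' SET @sql = @sql + N' UNION ALL ';
        -- Tag each row with the store it came from.
        SET @sql = @sql + N'SELECT ''' + @db + N''' AS StoreDB, o.* FROM '
                 + QUOTENAME(@db) + N'.dbo.Orders o';
        FETCH NEXT FROM store_cursor INTO @db;
    END
    CLOSE store_cursor;
    DEALLOCATE store_cursor;

    EXEC sp_executesql @sql;
END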
However, this database design is IMHO flawed. There is no reason to use multiple databases to store data that belongs to the same "domain". All the reasons that you have mentioned can be addressed with a single, properly designed database. Having multiple databases will create multiple problems in the long term:
you will need to change structure of all the DBs when you modify your database model
you will need to create/drop new databases when new stores are added/removed from your system
you will need to have items and other entities that are "common" to all the stores duplicated in all the DBs
what about reporting requirements (e.g. get sales data for stores 1 and 2 together, etc.) - this will require creating complex union queries...
etc...
In the long term, managing and maintaining this model will be a big pain.
I'd maintain a set of views that UNION ALL all the data. Every time a store is added or deleted those views must be updated. This can be automated.
The views provide an illusion to the application that there is only one database.
What I would not do is have each SQL query or procedure query all the database names and create dynamic SQL. That would entail lots of code duplication and an unnecessary loss of performance. This approach is error prone. Better generate code once in a central place and have all other SQL code reference that generated code.
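A minimal sketch of such a generated view, using the store databases from the setup above:

CREATE VIEW dbo.AllOrders
AS
SELECT 'Store1' AS StoreDB, o.* FROM Store1DB.dbo.Orders o
UNION ALL
SELECT 'Store2' AS StoreDB, o.* FROM Store2DB.dbo.Orders o
UNION ALL
SELECT 'Store3' AS StoreDB, o.* FROM Store3DB.dbo.Orders o
UNION ALL
SELECT 'Store4' AS StoreDB, o.* FROM Store4DB.dbo.Orders o

The script that adds or removes a store would regenerate this view as its last step.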

ADO - Can I edit results of a complex query with multiple join statements?

I'm working on a data conversion utility which can push data from one master database out to a number of different databases. The utility itself will have no knowledge of how data is kept in the destination (table structure), but I would like to provide the ability to write a SQL statement to return data from the destination using a complex query with multiple join statements, as long as the data comes back in a standardized format (field names) that the utility can recognize in an ADO query.
What I would like to do is then modify the live data in this ADO query. However, since there are multiple join statements, I'm not sure if it's possible to do this. I know that BDE, at least (though I've never used it), was very strict: you had to return all fields (*) and such. ADO, I know, is more flexible, but I don't know quite how flexible in this case.
Is it supposed to be possible to modify data in a TADOQuery in this manner, when the results include fields from different tables? And even if so, suppose I want to append a new record to the end (TADOQuery.Append). Would it append to two different tables?
The actual primary table I'm selecting from has a complementary table which is joined by the same primary key field; one is a "Small" table (brief info) and the other is a "Detail" table (more info for each record in the Small table). So, a typical statement would include something like this:
select ts.record_uid, ts.SomeField, td.SomeOtherField from table_small ts
join table_detail td on td.record_uid = ts.record_uid
There are also a number of other joins to records in other tables, but I'm not worried about appending to those ones. I'm only worried about appending to the "Small" and "Detail" tables - at the same time.
Is such a thing possible in an ADO Query? I'm willing to tweak and modify the SQL statement in any way necessary to make this possible. I have a bad feeling though that it's not possible.
Compatibility:
SQL Server 2000 through 2008 R2
Delphi XE2
Editing fields which have no influence on the joins is usually no problem.
Appending is trickier. You can limit the Append to one of the tables with:
procedure TForm.ADSBeforePost(DataSet: TDataSet);
begin
inherited;
TCustomADODataSet(DataSet).Properties['Unique Table'].Value := 'table_small';
end;
but without a Requery you won't get much further.
A better way would be to set the values via a stored procedure (e.g. in BeforePost), then Requery and Abort.
If your view were persistent (an actual view in the database), you would be able to use INSTEAD OF triggers.
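A minimal sketch of the INSTEAD OF trigger approach, reusing table_small and table_detail from the question (the column lists are illustrative):

CREATE VIEW small_detail
AS
SELECT ts.record_uid, ts.SomeField, td.SomeOtherField
FROM table_small ts
JOIN table_detail td ON td.record_uid = ts.record_uid
GO
-- Route an insert against the view into both underlying tables.
CREATE TRIGGER tr_small_detail_insert ON small_detail
INSTEAD OF INSERT
AS
BEGIN
    INSERT INTO table_small (record_uid, SomeField)
    SELECT record_uid, SomeField FROM inserted;

    INSERT INTO table_detail (record_uid, SomeOtherField)
    SELECT record_uid, SomeOtherField FROM inserted;
END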
Jerry,
I encountered the same problem on Firebird, and from experience I can tell you that it can be done (up to a small complexity) by using CachedUpdates. A very good resource is this one - http://podgoretsky.com/ftp/Docs/Delphi/D5/dg/11_cache.html. This article has the answers to all your questions.
I have abandoned the original idea of live ADO query updates, as it became more complex than I could wrap my head around. The scope of the data push project has changed, and therefore this is no longer an issue for me, though it is still an interesting subject to know about.
The new structure of the application consists of attaching multiple "Field Links" on various fields from the original set of data. Each of these links references the original field name and a SQL Statement which is to be executed when that field is being imported. Multiple field links can be on one single field, therefore can execute multiple statements, placing the value in various tables, etc. The end goal was an app which I can easily and repeatedly export a common dataset from an original source to any outside source with different data structures, without having to recompile the app.
However, the concept of cached updates was not appealing to me, simply because of the fact pointed out in the link in RBA's answer: data can be changed in the database in the meantime. So I will instead integrate my own method of customizable data pushes.

Is it possible to write a database view that encompasses one-to-many relationships?

So I'm not necessarily saying this is even a good idea if it were possible, since the schema of the view would be extremely volatile, but is there any way to represent a has-many relationship in a single view?
For example, let's say I have a customer that can have any number of addresses in the database. Is there any way to list out each column of each address with perhaps a number as a part of the alias (e.g., columns like Customer Id, Name, Address_Street_1, Address_Street_2, etc)?
Thanks!
Not really - what you are describing is a dynamic pivot. It's possible to use OPENROWSET to get to a dynamically generated query, but whether that's advisable is hard to say without seeing more about the business case.
First make a stored proc which does the dynamic pivot like I did on the StackExchange Data Explorer.
Basically, you generate dynamic SQL which builds the column list. This can only really be done in a stored proc, which is fine for application calls.
But what about if you want to re-use that in a lot of different joins or ad hoc queries?
Then, have a look at this article: "Using SQL Servers OPENROWSET to break the rules"
You can now call your stored proc by looping back into the server and then getting the results into a rowset - this can be in a view!
The late Ken Henderson has some good examples of this in his excellent book: "The Guru's Guide to SQL Server Stored Procedures, XML, and HTML" (you got to love the little "Covers .NET!" on the cover which captures well the zeitgeist for 2002!).
He only covers the loopback part (with views and user-defined functions), the less verbose PIVOT syntax was not available until 2005, but PIVOTs can also be generated using a CASE statement as a characteristic function.
Obviously, this technique has caveats (I can't even do this on our production server).
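A hedged sketch of the loopback trick: the proc name dbo.GetCustomerAddressPivot and the connection string are placeholders, and 'Ad Hoc Distributed Queries' must be enabled on the server:

CREATE VIEW dbo.CustomerAddressPivot
AS
SELECT p.*
FROM OPENROWSET(
    'SQLNCLI',
    'Server=(local);Trusted_Connection=yes;',
    -- The proc builds the dynamic column list and runs the pivot.
    'EXEC dbo.GetCustomerAddressPivot') AS p;

The view now exposes the proc's dynamically pivoted result set to joins and ad hoc queries, which is exactly what a proc alone cannot do.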
Yes - use:
-- Assumes ADDRESS has some column (here address_id) that gives each
-- customer's addresses a stable order to number them by.
CREATE VIEW customer_addresses AS
WITH numbered AS (
    SELECT customer_id, street,
           ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY address_id) AS rn
    FROM ADDRESS
)
SELECT t.customer_id,
       t.customer_name,
       a1.street AS address_street_1,
       a2.street AS address_street_2
FROM CUSTOMER t
LEFT JOIN numbered a1 ON a1.customer_id = t.customer_id AND a1.rn = 1
LEFT JOIN numbered a2 ON a2.customer_id = t.customer_id AND a2.rn = 2
If you provided more info, it'd be easier to give you a better answer. It's possible you're looking to pivot data (turn rows into columns).
Simply put, no - not without dynamically recreating the view every time you want to use it, that is.
But, what you can do is predefine, say, 4 address columns in your view, then populate the first four results of your one-to-many relation into those columns. It's not quite the dynamic view you want, but it's also much more stable/usable in my opinion.

SSRS Multi value parameters - appropriate layer for implementation of the filter

When using multi-value parameters in SQL Reporting Services, is it more appropriate to implement the list filter using a filter on the dataset itself, on the data region control, or by changing the actual query that drives the dataset?
SSRS will support any scenario, so then I ask, is there a reason beyond the obvious of why this should be done at one level over another?
It makes sense to me that modifying the query itself and asking the RDBMS to handle the filtering would be most efficient but maybe I am missing something with respect to how the SSRS Data Processing Extension may handle this scenario?
You are correct. The way to go is to pass the parameters through to the database engine.
Reporting Services should ideally only be used to render content. The less data that you need to pass back to the client web browser, the faster the report will render.
You may find my answer to a similar post regarding using multi-value parameters to be of use.
Passing multiple values for a single parameter in Reporting Services
Hope this helps but please feel free to pose any further questions you may have.
Cheers,
John
Using a table-valued UDF is a good approach, but there is still one issue - if the function is called in many places in the query, or even inside an inner select, there can be a performance problem. You can resolve this by using a table variable (or a temp table):
DECLARE @Param TABLE (Value INT);

INSERT INTO @Param (Value)
SELECT Param FROM dbo.fn_MVParam(@sParameterString, ',');
...
WHERE someColumn IN (SELECT Value FROM @Param)
This way the function is called only once.
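For reference, a split function along these lines might look as follows; the real fn_MVParam is defined in the linked answer, so treat this as a plausible sketch rather than its actual definition:

CREATE FUNCTION dbo.fn_MVParam (@RepParam NVARCHAR(4000), @Delim CHAR(1))
RETURNS @Values TABLE (Param NVARCHAR(4000))
AS
BEGIN
    -- Peel off one delimited piece per pass.
    DECLARE @chrind INT, @Piece NVARCHAR(4000);
    SET @chrind = 1;
    WHILE @chrind > 0
    BEGIN
        SET @chrind = CHARINDEX(@Delim, @RepParam);
        IF @chrind > 0
            SET @Piece = LEFT(@RepParam, @chrind - 1);
        ELSE
            SET @Piece = @RepParam;
        INSERT @Values (Param) VALUES (LTRIM(RTRIM(@Piece)));
        SET @RepParam = RIGHT(@RepParam, LEN(@RepParam) - @chrind);
        IF LEN(@RepParam) = 0 BREAK;
    END
    RETURN
END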
Another thing: if you don't use a stored procedure but an embedded SQL query instead, you can put the multi-value parameter straight into the query:
...
WHERE someColumn IN (@Param)
...
Use the RDBMS to do the main filtering
SSRS provides filtering for the purposes of data-driven and/or dynamic display, which is especially useful for subreports etc.

Querying XML columns in SQL Server 2005

There is an XML type column in my company's "Contacts" table. The column holds misc data about a particular contact, e.g.:
<contact>
<refno>123456</refno>
<special>a piece of custom data</special>
</contact>
The tags below contact can be different for each contact, and I must query these fragments alongside the relational data columns in the same table.
I have used constructions like:
SELECT c.id AS ContactID, c.ContactName AS ForeName,
       c.xmlvaluesn.value('(contact/Ref)[1]', 'VARCHAR(40)') AS ref
FROM Contacts c
INNER JOIN ParticipantContactMap pcm ON c.id = pcm.contactid
    AND pcm.participantid = 2140
WHERE c.xmlvaluesn.exist('/contact[Ref = "118985"]') = 1
This method works OK, but it takes a while for the server to respond.
I have also investigated using the nodes() function to parse the XML nodes, and exist() to test whether a node holds the value I'm searching for.
Does anyone know a better way to query XML columns?
If you are doing one write and a lot of reads, take the parsing hit at write time, and get that data into some format that is more query-able. A first suggestion would be to parse them into a related but separate table, with name/value/contactID columns.
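As a sketch of that idea, reusing the Contacts table and xmlvaluesn column from the question (the new table and column names are illustrative):

-- One row per tag, populated once at write time.
CREATE TABLE dbo.ContactProperties (
    ContactID INT NOT NULL,
    Name      VARCHAR(100) NOT NULL,
    Value     VARCHAR(400) NULL
);

INSERT INTO dbo.ContactProperties (ContactID, Name, Value)
SELECT c.id,
       n.value('local-name(.)', 'VARCHAR(100)'),
       n.value('.', 'VARCHAR(400)')
FROM Contacts c
CROSS APPLY c.xmlvaluesn.nodes('/contact/*') AS t(n);

Reads then become ordinary (and indexable) relational queries instead of XML shredding.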
I've found the MSDN XML best practices helpful for working with XML blob columns; they might provide some inspiration...
http://msdn.microsoft.com/en-us/library/ms345115.aspx#sql25xmlbp_topic4
In addition to the page mentioned by @pauljette, this page has good performance optimization advice:
http://msdn.microsoft.com/en-us/library/ms345118.aspx
There's a lot you can do to speed up the performance of XML queries, but it will never be as good as properly indexed relational data. If you are selecting one document and then querying inside just that one, you can do pretty well, but when your query needs to scan through a bunch of similar documents looking for something, it's sort of like a key lookup in a relational query plan (that is, slow).
If you have an XSD for your XML, you can import it into your database and then build indexes for your XML data.
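For example, a primary XML index (plus an optional secondary one) can be created like this; it requires a clustered primary key on the table, and the index names here are illustrative:

CREATE PRIMARY XML INDEX IX_Contacts_xml
ON Contacts (xmlvaluesn);

-- A PATH secondary index helps path-based predicates like exist().
CREATE XML INDEX IX_Contacts_xml_path
ON Contacts (xmlvaluesn)
USING XML INDEX IX_Contacts_xml FOR PATH;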
Try this
SELECT * FROM conversionupdatelog WHERE
convert(XML, colName).value('(/leads/lead/@LeadID=''xyz@airproducts.com'')[1]', 'varchar(max)') = 'true'
