SQL Views vs. MS Access queries --- Updating data affects multiple base tables - sql-server

I'm interested in understanding more about using a SQL View vs. a local query in MS Access. I like the fact that a view is basically a query that is stored on the server, and local machines running Access "see" it as a table.
Due to performance reasons, I'll sometimes take a view over a query since it typically makes a form load a lot faster. However, I've run into issues where I can't update the view if I make changes in two different fields that are in different base tables. Even if the view is constructed correctly with the correct joins, etc.
Just wondering if there is a more efficient and proper way to construct a query that can be updated.

A user can never update more than one table at a time. That's a given. You need to construct your form (probably using subforms) to represent the data using either single table views, simple views that are updatable, or tables.
Subforms are basically left joins to the parent form. like
SELECT *
FROM ParentForm P
LEFT JOIN SubForm S
ON P.ParentID <~~Link Master Field
= S.ParentID <~~Link Child Field
So you can recreate your view using subforms.
If your view is too complicated to fit this mold it is probably not updatable and it probably means that the data you DO want to update are in a single table but all the rest of the info in your view are supporting information. i.e. displayed to support the user making a decision.
In this case you should make the Record Source of your form be the table/view (which is updatable) that you want to update. Then in comboboxes/listboxes/controls which support the data going into your updatable table/view you make the Row Source that of your complicated view.
No matter where the view is (in Access or on the server) if it is constructed in such a way that it is impossible to determine which record in which table should be changed, nothing else matters. YOu need to design the whole form differently.

Related

Modifying view data in T-SQL

My company uses Microsoft SQL Server Management Studio and T-SQL for its database functionalities. One the daily activities my team performs is updating the data in a view that contains around 4k-5k records.
Let's suppose the view has 10 fields. So our job is to change the value of field_5, considering the values of field_1, field_2 and field_7. We have always been doing it manually and it takes close to 2-3 hours to go through all 5k records and check all the fields' values and update the required field.
I thought of automating the task and build a script for it but I can't seem to find anything on MSDN. The closest I got was this - https://msdn.microsoft.com/en-us/library/ms178076.aspx. But it talks about altering the structure of the view, not updating the data inside the view while retaining structure. Further googling also didn't help much.
Is there any way to do this?
What's the definition of the view? Does it meet the criteria for an updatable view? If so you just need an update statement.
A trivial example would be
UPDATE YourView
SET Field5 = Field1+Field2+Field7
Otherwise if it does not meet the requirements you could write an instead of trigger implementing the desired logic and still use an update statement.
Ok, you use the word 'View' but I assume you mean a 'Table'? Views are dynamic and (apart from indexed views) don't actually store data, they store queries that run a certain set of code when you call the view.
Assuming you're talking about a table, you're looking for something like below;
UPDATE table
SET field_5 = (field_1 + field_2 + field_7)
That will change field_5 in every row to match your calculation.

What are materialized views?

Can someone explain to me what views or materialized views are in plain everyday English please? I've been reading about materialized views but I don't understand.
Sure.
A normal view is a query that defines a virtual table -- you don't actually have the data sitting in the table, you create it on the fly by executing.
A materialized view is a view where the query gets run and the data gets saved in an actual table.
The data in the materialized view gets refreshed when you tell it to.
A couple use cases:
We have multiple Oracle instances where we want to have the master data on one instance, and a reasonably current copy of the data on the other instances. We don't want to assume that the database links between them will always be up and operating. So we set up materialized views on the other instances, with queries like select a,b,c from mytable#master and tell them to refresh daily.
Materialized views are also useful in query rewrite. Let's say you have a fact table in a data warehouse with every book ever borrowed from a library, with dates and borrowers. And that staff regularly want to know how many times a book has been borrowed. Then build a materialized view as select book_id, book_name, count(*) as borrowings from book_trans group by book_id, book_name, set it for whatever update frequency you want -- usually the update frequency for the warehouse itself. Now if somebody runs a query like that for a particular book against the book_trans table, the query rewrite capability in Oracle will be smart enough to look at the materialized view rather than walking through the millions of rows in book_trans.
Usually, you're building materialized views for performance and stability reasons -- flaky networks, or doing long queries off hours.
A view is basically a "named" SQL statement. You can reference views in your queries much like a real table. When accessing the view, the query behind the view is executed.
For example:
create view my_counter_view(num_rows) as
select count(*)
from gazillion_row_table;
select num_rows from my_counter_view;
Views can be used for many purposes such as providing a simpler data model, implement security constraints, SQL query re-use, workaround for SQL short comings.
A materialized view is a view where the query has been executed and the results has been stored as a physical table. You can reference a materialized view in your code much like a real table. In fact, it is a real table that you can index, declare constraints etc.
When accessing a materialized view, you are accessing the pre-computed results. You are NOT executing the underlaying query. There are several strategies for how to keeping the materialized view up-to-date. You will find them all in the documentation.
Materialized views are rarely referenced directly in queries. The point is to let the optimizer use "Query Rewrite" mechanics to internally rewrite a query such as the COUNT(*) example above to a query on the precomputed table. This is extremely powerful as you don't need to change the original code.
There are many uses for materialied views, but they are mostly used for performance reasons. Other uses are: Replication, complicated constraint checking, workarounds for deficiencies in the optimizer.
Long version: -> Oracle documentation
A view is a query on one or more tables. A view can be used just like a table to select from or to join with other tables or views. A metrialized view is a view that has been fully evaluated and its rows have been stored in memory or on disk. Therefore each time you select from a materialized view, there is no need to perform the query that produces the view and the results are returned instantly.
For example, a view may be a query such as SELECT account, SUM(payment) FROM payments GROUP BY account with a large number of payments in the table but not many accounts. Each time this view is used the whole table must be read. With a materialized view, the result is returned instantly.
The non-trivial issue with materialized views is to update them when the underlying data is changed. In this example, each time a new row is added to the payments table, the row in the materialized view that represents the account needs to be updated. These updates may happen synchronously or periodically.
Yes. Materialized views are views with a base table underneath them. You define the view and Oracle creates the base table underneath it automatically.
By executing the view and placing the resulting data in the base table you gain performance.
They are useful for a variety of reasons. Some examples of why you would use a materialized view are:
1) A view that is complex may take a long time to execute when referenced
2) A view included in complex SQL may yield poor execution plans leading to performance issues
3) You might need to reference data across a slow DBLINK
A materialized view can be setup to refresh periodically.
You can specify a full or partial refresh.
Please see the Oracle documentation for complete information

Creating an index on a view with OpenQuery

SQL Server doesn't allow creating an view with schema binding where the view query uses OpenQuery as shown below.
Is there a way or a work-around to create an index on such a view?
The best you could do would be to schedule a periodic export of the AD data you are interested in to a table.
The table could of course then have all the indexes you like. If you ran the export every 10 minutes and the possibility of getting data that is 9 minutes and 59 seconds out of date is not a problem, then your queries will be lightning fast.
The only part of concern would be managing locking and concurrency during the export time. One strategy might be to export the data into a new table and then through renames swap it into place. Another might be to use SYNONYMs (SQL 2005 and up) to do something similar where you just point the SYNONYM to two alternating tables.
The data that supplies the query you're performing comes from a completely different system outside of SQL Server. There's no way that SQL Server can create an indexed view on data it does not own. For starters, how would it be notified when something had been changed so it could update its indexes? There would have to be some notification and update mechanism, which is implausible because SQL Server could not reasonably maintain ACID for such a distributed, slow, non-SQL server transaction to an outside system.
Thus my suggestion for mimicking such a thing through your own scheduled jobs that refresh the data every X minutes.
--Responding to your comment--
You can't tell whether a new user has been added without querying. If Active Directory supports some API that generates events, I've never heard of it.
But, each time you query, you could store the greatest creation time of all the users in a table, then through dynamic SQL, query only for new users with a creation date after that. This query should theoretically be very fast as it would pull very little data across the wire. You would just have to look into what the exact AD field would be for the creation date of the user and the syntax for conditions on that field.
If managing the dynamic SQL was too tough, a very simple vbscript, VB, or .Net application could also query active directory for you on a schedule and update the database.
Here are the basics for Indexed views and thier requirements. Note what you are trying to do would probably fall in the category of a Derived Table, therefore it is not possible to create an indexed view using "OpenQuery"
This list is from http://www.sqlteam.com/article/indexed-views-in-sql-server-2000
1.View definition must always return the same results from the same underlying data.
2.Views cannot use non-deterministic functions.
3.The first index on a View must be a clustered, UNIQUE index.
4.If you use Group By, you must include the new COUNT_BIG(*) in the select list.
5.View definition cannot contain the following
a.TOP
b.Text, ntext or image columns
c.DISTINCT
d.MIN, MAX, COUNT, STDEV, VARIANCE, AVG
e.SUM on a nullable expression
f.A derived table
g.Rowset function
h.Another view
i.UNION
j.Subqueries, outer joins, self joins
k.Full-text predicates like CONTAIN or FREETEXT
l.COMPUTE or COMPUTE BY
m.Cannot include order by in view definition
In this case, there is no way for SQL Server to know of any changes (data, schema, whatever) in the remote data source. For a local table, it can use SCHEMABINDING etc to ensure the underlying tables(s) stay the same and it can track datachanges.
If you need to query the view often, then I'd use a local table that is refreshed periodically. In fact, I'd use a table anyway. AD queries are't the quickest at the best of times...

Can SQL Server Replication include the source dbid in the replicated data?

Let's say I have DatabaseA with TableA, which has these fields: Id, Name.
In another database, DatabaseB, I have TableA which has these fields: DatabaseId, Id, Name.
Is it possible to setup a replication publication that will send:
DatabaseA.dbid, DatabaseA.TableA.Id, DatabaseA.TableA.Name
to DatabaseB.TableA?
Edit:
The reason I'm asking is that I need to combine multiple databases (with identical schemas) into a single database, with as little latency as possible. Replication seemed like a good place to start (need to replicate data from one place to another), but I'm just in the brainstorming phase. I would definitely be open to suggestions on how to accomplish this without using replication.
There might be an easier way to do it, but the first thing I thought of is wrapping TableA in an indexed view on the source database and then replicating the view as a table (i.e., type = "indexed view logbased"). I don't think this would work with merge replication, though.
So, that would roughly be like:
CREATE VIEW TableA_with_dbid WITH SCHEMABINDING AS
SELECT DatabaseA.dbid, Id, Name FROM TableA
CREATE UNIQUE CLUSTERED INDEX ON TableA_with_dbid (Id) -- or whatever your PK is
EXEC sp_addarticle ...,
#source_object = 'TableA_with_dbid',
#destination_table = 'TableA',
#type = 'indexed view logbased',
...
Big caveat: indexed views have a lot of requirements that may not be appropriate for your application. For example, certain options have to be set any time you update the base table.
(In response to the edit in your question...) This won't work for combining multiple sources into one table. AFAIK, an object in a subscribing database can only come from one published article. And you can't do an indexed view on the subscribing side since UNION is not allowed in an indexed view. (The docs don't explicitly state UNION ALL is disallowed, but it wouldn't surprise me. You might try it just in case.) But it still does answer your explicit question: the dbid would be in the replicated table.
Are you aggregating these events in one place from multiple sources? Replicating only comes from one source - it's one-to-one, so the source ID doesn't seem like it would make much sense.
If you're aggregating data from multiple sources, maybe linked servers and triggers is a better choice, and if that's the case, then you could absolutely include any information about the source that you want.
If you can clarify your question to describe the purpose, it would help us find the best solution.
UPDATED FROM NEW DETAIL IN QUESTION:
Does this solution sound like it might be what you need?
Set up AFTER triggers on the source databases that send any changed rows to the central repository database, in some kind of holding table. These rows can include additional columns, like "Source", "Change type" (for insert, delete, etc).
Some central process watches the table and processes new rows (or runs periodically - once/minute, maybe), incorporating them into the central database
You could adjust how frequently the check/merge process runs on the server based on your needs (even running it constantly to handle new rows as they appear, perhaps even with an AFTER trigger on that table as well).

Synchronize DataSet

What is the best approach to synchronizing a DataSet with data in a database? Here are the parameters:
We can't simply reload the data because it's bound to a UI control which a user may have configured (it's a tree grid that they may expand/collapse)
We can't use a changeflag (like a UpdatedTimeStamp) in the database because changes don't always flow through the application (e.g. a DBA could update a field with a SQL statement)
We cannot use an update trigger in the database because it's a multi-user system
We are using ADO.NET DataSets
Multiple fields can change of a given row
I've looked at the DataSet's Merge capability, but this doesn't seem to keep the notion of an "ID" column. I've looked at DiffGram capability but the issue here is those seem to be generated from changes within the same DataSet rather than changes that occured on some external data source.
I've been running from this solution for a while but the approach I know would work (with a lot of ineffeciency) is to build a separate DataSet and then iterate all rows applying changes, field by field, to the DataSet on which it is bound.
Has anyone had a similar scenario? What did you do to solve the problem? Even if you haven't run into a similar problem, any recommendation for a solution is appreciated.
Thanks
DataSet.Merge works well for this if you have a primary key defined for each DataTable; the DataSet will raise changed events to any databound GUI controls
if your table is small you can just re-read all of the rows and merge periodically, otherwise limiting the set to be read with a timestamp is a good idea - just tell the DBAs to follow the rules and update the timestamp ;-)
another option - which is a bit of work - is to keep a changed-row queue (timestamp, row ID) using a trigger or stored procedure, and base the refresh queries off of the timestamp in the queue; this will be more efficient if the base table has a lot of rows in it, allowing you (via an inner join on the queue record) to pull only the changed rows since the last poll time.
I think it would be easier to store a list of the nodes that the user has expanded (assuming you can uniquely identify each one), then re-load the data and re-bind it to the tree view, and then expand all the nodes previously expanded.

Resources