Creating an index on a view with OpenQuery

Creating an index on a view with OpenQuery - sql-server

SQL Server doesn't allow creating an view with schema binding where the view query uses OpenQuery as shown below.
Is there a way or a work-around to create an index on such a view?

The best you could do would be to schedule a periodic export of the AD data you are interested in to a table.
The table could of course then have all the indexes you like. If you ran the export every 10 minutes and the possibility of getting data that is 9 minutes and 59 seconds out of date is not a problem, then your queries will be lightning fast.
The only part of concern would be managing locking and concurrency during the export time. One strategy might be to export the data into a new table and then through renames swap it into place. Another might be to use SYNONYMs (SQL 2005 and up) to do something similar where you just point the SYNONYM to two alternating tables.
The data that supplies the query you're performing comes from a completely different system outside of SQL Server. There's no way that SQL Server can create an indexed view on data it does not own. For starters, how would it be notified when something had been changed so it could update its indexes? There would have to be some notification and update mechanism, which is implausible because SQL Server could not reasonably maintain ACID for such a distributed, slow, non-SQL server transaction to an outside system.
Thus my suggestion for mimicking such a thing through your own scheduled jobs that refresh the data every X minutes.
--Responding to your comment--
You can't tell whether a new user has been added without querying. If Active Directory supports some API that generates events, I've never heard of it.
But, each time you query, you could store the greatest creation time of all the users in a table, then through dynamic SQL, query only for new users with a creation date after that. This query should theoretically be very fast as it would pull very little data across the wire. You would just have to look into what the exact AD field would be for the creation date of the user and the syntax for conditions on that field.
If managing the dynamic SQL was too tough, a very simple vbscript, VB, or .Net application could also query active directory for you on a schedule and update the database.

Here are the basics for Indexed views and thier requirements. Note what you are trying to do would probably fall in the category of a Derived Table, therefore it is not possible to create an indexed view using "OpenQuery"
This list is from http://www.sqlteam.com/article/indexed-views-in-sql-server-2000
1.View definition must always return the same results from the same underlying data.
2.Views cannot use non-deterministic functions.
3.The first index on a View must be a clustered, UNIQUE index.
4.If you use Group By, you must include the new COUNT_BIG(*) in the select list.
5.View definition cannot contain the following
a.TOP
b.Text, ntext or image columns
c.DISTINCT
d.MIN, MAX, COUNT, STDEV, VARIANCE, AVG
e.SUM on a nullable expression
f.A derived table
g.Rowset function
h.Another view
i.UNION
j.Subqueries, outer joins, self joins
k.Full-text predicates like CONTAIN or FREETEXT
l.COMPUTE or COMPUTE BY
m.Cannot include order by in view definition

In this case, there is no way for SQL Server to know of any changes (data, schema, whatever) in the remote data source. For a local table, it can use SCHEMABINDING etc to ensure the underlying tables(s) stay the same and it can track datachanges.
If you need to query the view often, then I'd use a local table that is refreshed periodically. In fact, I'd use a table anyway. AD queries are't the quickest at the best of times...

Related

In this situation, best practice to create an ORACLE VIEW or TABLE?

In order to be able to report out oracle logons I have a query to find logons to the system. I want to be able to output the query results to a table or view in order to then report on this table/view. The underlying tables that the query is based on do not keep historical data hence my need for a new table/view.
Query will run 3 times a day to gather logon information at those times.
Is it best to create a new table and append the daily information updates or would a view be better practice? I'm unsure on the updating of a view as the underlying tables needs to be present, if that's correct.
Thanks

A view won't help at all. It is just a stored query and doesn't contain any data; it just reflects what you have in underlying table(s).
Therefore, you'll need a "history" table which will permanently hold data you're interested in.

Real time table alternative vs swapping table

I use SSMS 2016. I have a view that has a few millions of records. The view is not indexed and should not be as it's being updated (insert, delete, update) every 5 minutes by a job on the server to then display update data sets in to the client calling application in GUI.
The view does a very heavy volume of conversion INT values to VARCHAR appending to them some string values.
The view also does some CAST operations on the NULL assigning them column names aliases. And the worst performance hit is that the view uses FOR XML PATH('') function on 20 columns.
Also the view uses two CTEs as the source as well as Subsidiaries to define a single column value.
I made sure I created the right indexes (Clustered, nonclustered,composite and Covering) that are used in the view Select,JOIN,and WHERE clauses.
Database Tuning Advisor also have not suggested anything that could substantialy improve performance.
AS a workaround I decided to create two identical physical tables with clustered indexes on each and using the Merge statement (further converted into a SP and then Into as SQL Server Agent Job) maintain them updated. And to assure there is no long locking of the view. I will then swap(rename) the tables names immediately after each merge finishes. So in this case all the heavy workload falls onto a SQL Server Agent Job keep the tables updated.
The problem is that the merge will take roughly 15 minutes considering current size of the data, which may increase in the future. So, I need to have a real time design to assure that the view has the most up-to-date information.
Any ideas?

SQL Views vs. MS Access queries --- Updating data affects multiple base tables

I'm interested in understanding more about using a SQL View vs. a local query in MS Access. I like the fact that a view is basically a query that is stored on the server, and local machines running Access "see" it as a table.
Due to performance reasons, I'll sometimes take a view over a query since it typically makes a form load a lot faster. However, I've run into issues where I can't update the view if I make changes in two different fields that are in different base tables. Even if the view is constructed correctly with the correct joins, etc.
Just wondering if there is a more efficient and proper way to construct a query that can be updated.

A user can never update more than one table at a time. That's a given. You need to construct your form (probably using subforms) to represent the data using either single table views, simple views that are updatable, or tables.
Subforms are basically left joins to the parent form. like
SELECT *
FROM ParentForm P
LEFT JOIN SubForm S
ON P.ParentID <~~Link Master Field
= S.ParentID <~~Link Child Field
So you can recreate your view using subforms.
If your view is too complicated to fit this mold it is probably not updatable and it probably means that the data you DO want to update are in a single table but all the rest of the info in your view are supporting information. i.e. displayed to support the user making a decision.
In this case you should make the Record Source of your form be the table/view (which is updatable) that you want to update. Then in comboboxes/listboxes/controls which support the data going into your updatable table/view you make the Row Source that of your complicated view.
No matter where the view is (in Access or on the server) if it is constructed in such a way that it is impossible to determine which record in which table should be changed, nothing else matters. YOu need to design the whole form differently.

Optimization problems with View using Clustered Index Insert on tempdb on SQL Server 2008

I am creating a Java function that needs to use a SQL query with a lot of joins before doing a full scan of its result. Instead of hard-coding a lot of joins I decided to create a view with this complex query. Then the Java function just uses the following query to get this result:
SELECT * FROM VW_####
So the program is working fine but I want to make it faster since this SELECT command is taking a lot of time. After taking a look on its plan execution plan I created some indexes and made it +-30% faster but I want to make it faster.
The problem is that every operation in the execution plan have cost between 0% and 4% except one operation, a clustered-index insert that has +-50% of the execution cost. I think that the system is using a temporary table to store the view's data, but an index in this view isn't useful for me because I need all rows from it.
So what can I do to optimize that insert in the CWT_PrimaryKey? I think that I can't turn off that index because it seems to be part of the SQL Server's internals. I read somewhere that this operation could appear when you use cursors but I think that I am not using (or does the view use it?).
The command to create the view is something simple (no T-SQL, no OPTION, etc) like:
create view VW_#### as SELECTS AND JOINS HERE
And here is a picture of the problematic part from the execution plan: http://imgur.com/PO0ZnBU
EDIT: More details:
Well the query to create the problematic view is a big query that join a lot of tables. Based on a single parameter the Java-Client modifies the query string before creating it. This view represents a "data unit" from a legacy Database migrated to the SQLServer that didn't had any Foreign or Primary Key, so our team choose to follow this strategy. Because of that the view have more than 50 columns and it is made from the join of other seven views.
Main view's query (with a lot of Portuguese words): http://pastebin.com/Jh5vQxzA
The other views (from VW_Sintese1 until VW_Sintese7) are created like this one but without using extra views, they just use joins with the tables that contain the data requested by the main view.
Then the Java Client create a prepared Statement with the query "Select * from VW_Sintese####" and execute it using the function "ExecuteQuery", something like:
String query = "Select * from VW_Sintese####";
PreparedStatement ps = myConn.prepareStatement(query,ResultSet.TYPE_SCROLL_INSENSITIVE, ResultSet.CONCUR_READ_ONLY);
ResultSet rs = ps.executeQuery();
And then the program goes on until the end.
Thanks for the attention.

First: you should post the code of the view along with whatever is using the views because of the rest of this answer.
Second: the definition of a view in SQL Server is later used to substitute in querying. In other words, you created a view, but since (I'm assuming) it isn't an indexed view, it is the same as writing the original, long SELECT statement. SQL Server kind of just swaps it out in the DML statement.
From Microsoft's 'Querying Microsoft SQL Server 2012': T-SQL supports the following table expressions: derived tables, common table expressions (CTEs), views, inline table-valued functions.
And a direct quote:
It’s important to note that, from a performance standpoint, when SQL Server optimizes
queries involving table expressions, it first unnests the table expression’s logic, and therefore interacts with the underlying tables directly. It does not somehow persist the table expression’s result in an internal work table and then interact with that work table. This means that table expressions don’t have a performance side to them—neither good nor
bad—just no side.
This is a long way of reinforcing the first statement: please include the SQL code in the view and what you're actually using as the SELECT statement. Otherwise, we can't help much :) Cheers!
Edit: Okay, so you've created a view (no performance gain there) that does 4-5 LEFT JOIN on to the main view (again, you're not helping yourself out much here by eliminating rows, etc.). If there are search arguments you can use to filter down the resultset to fewer rows, you should have those in here. And lastly, you're ordering all of this at the top, so your query engine will have to get those views, join them up to a massive SELECT statement, figure out the correct order, and (I'm guessing here) the result count is HUGE and SQL's db engine is ordering it in some kind of temporary table.
The short answer: get less data (fewer columns and only the rows you need); don't order the results if the resultset is very large, just get the data to the client and then sort it there.
Again, if you want more help, you'll need to post table schemas and index strategies for all tables that are in the query (including the views that are joined) and you'll need to include all view definitions (including the views that are joined).

SQL Server: Persistant "Cache" Table Across Connections

I have a set of approx 1 million rows (approx rowsize: 1.5kb) that needs to be "cached" so that many different parts of our application can utilize it.
These rows are a derived/denormalized "view" of compiled data from other tables. Generating this data isn't terribly expensive (30-60sec) but is far too slow to generate "on the fly" as part of a view or table-valued function that the application can query directly. I want to update this data periodically, perhaps every few minutes.
My first thought is to have a scheduled job that updates a global temp table with this data every n minutes.
What's the best strategy, performance-wise? I'm not sure of the performance implications of storing it in a real table versus a global temp table (##tablename) versus other strategies I haven't thought of. I don't want to muck up the transaction logs with inserts to this table... it's all derived data and doesn't need to be persisted.
I'm using Microsoft SQL Server 2000. Upgrading during the timeframe of this project isn't an option, but if there's functionality in 2005/2008/2010 that would make this easier, I'd appreciate hearing about that.

I'd recommend using a materialized view (AKA indexed view).
Limitations:
View definition must always return the same results from the same underlying data.
Views cannot use non-deterministic functions.
The first index on a View must be a clustered, UNIQUE index.
If you use Group By, you must include the new COUNT_BIG(*) in the select list.
View definition cannot contain the following:
TOP
Text, ntext or image columns
DISTINCT
MIN, MAX, COUNT, STDEV, VARIANCE, AVG
SUM on a nullable expression
A derived table
Rowset function
Another view
UNION
Subqueries, outer joins, self joins
Full-text predicates like CONTAIN or FREETEXT
COMPUTE or COMPUTE BY
Cannot include order by in view definition

Develop Reference

c reactjs sql-server angularjs arrays wpf database batch-file google-app-engine silverlight