Benefits to views in stored procedures - sql-server

I've tried searching in different ways, but haven't found a clear answer to my question. This question almost answers my query, but not quite.
Besides the obvious readability differences, are there any benefits to using a view in a stored procedure:
SELECT *
FROM view1
WHERE view1.fdate > @start
  AND view1.fdate <= @end
...over using a linked table list?
SELECT *
FROM table1
INNER JOIN table2 ON table1.pid = table2.fid
INNER JOIN table3 ON table1.pid = table3.fid
WHERE table1.fdate > @start
  AND table1.fdate <= @end

It's not all about your app and you.
Think enterprise databases, with tens of different apps accessing the same data, and hundreds of individuals querying the data for business purposes. How do you explain to each one of those many individuals how to recompose your highly normalized data? Which lookup field maps to which table? How are they joined? And how do you grant read-only access to the data, making sure some sensitive fields are not accessible, without repeating yourself?
You, the DBA, create VIEWs. They denormalize the data into easy-to-process relations for the business people, for the many apps, and for the reports. You grant SELECT permission on the views without granting access to the underlying tables, to hide sensitive private fields. And sometimes you write views because you're tired of being called at midnight because the database is 'down', since Johnny from accounting is running a cartesian join.
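A sketch of that pattern (all object names here are invented for illustration):

```sql
-- Denormalize two base tables into one easy-to-query relation,
-- leaving out the sensitive Salary column.
CREATE VIEW dbo.vwEmployeeDirectory
AS
SELECT e.EmployeeID,
       e.FullName,
       d.DepartmentName
FROM dbo.Employee e
INNER JOIN dbo.Department d
    ON d.DepartmentID = e.DepartmentID;
GO

-- Readers get the view only; the base tables (and Salary) stay hidden.
GRANT SELECT ON dbo.vwEmployeeDirectory TO ReportingUsers;
```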

There is no difference: the query plans will be identical in both cases. The query optimizer can even use an indexed view when you don't reference it explicitly (in case 2).

Related

Key Differences Between Views and Created Tables Using SELECT INTO?

I understand that a view is only a saved SQL query that can be considered as a virtual table. However, it seems to me that there is not much difference between creating a new table using the SELECT INTO statement and creating a view. What are some of the major distinctions between the two methods? Under what situations would you use one or the other?
Let's start by acknowledging that, from what you say about views, you are not talking about indexed views, because those are actually implemented as real tables behind the scenes. So, we are talking about non-indexed views.
The two methods are very different - in fact, they have nothing in common. It is strange, because you yourself mention both:
"view = only saved sql query"
"into = creating new table"
Isn't this a contradiction? The result of SELECT INTO is actually a table; a view is not.
I'm not sure why you are asking about this or what you are trying to accomplish. In my experience, I use SELECT INTO to quickly create logically temporary tables that have the same columns as the original, without having to type out all the columns. This method of creating tables is generally inferior to the CREATE TABLE command followed by an INSERT, because no indexes or other supporting objects get created - hence its use in ad hoc queries or for temporary entities.
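To make the contrast concrete, here is a minimal sketch (the source table and columns are made up):

```sql
-- SELECT INTO: creates the table on the fly with the source's column
-- definitions, but with no indexes or constraints.
SELECT OrderID, CustomerID, OrderDate
INTO #RecentOrders
FROM dbo.Orders
WHERE OrderDate >= '20240101';

-- CREATE TABLE + INSERT: more typing, but you control keys and
-- indexes up front.
CREATE TABLE #RecentOrders2
(
    OrderID    int PRIMARY KEY,
    CustomerID nchar(5),
    OrderDate  datetime
);

INSERT INTO #RecentOrders2 (OrderID, CustomerID, OrderDate)
SELECT OrderID, CustomerID, OrderDate
FROM dbo.Orders
WHERE OrderDate >= '20240101';
```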

SQL Server Performance: LEFT JOIN vs NOT IN

The output of these two queries is the same. Just wondering if there's a performance difference between these 2 queries, if the requirement is to list down all the customers who have not purchased anything? The database used is the Northwind sample database. I'm using T-SQL.
select companyname
from Customers c
left join Orders o on c.customerid = o.customerid
where o.OrderID is null

select companyname
from Customers c
where Customerid not in (select customerid from Orders)
If you want to find out empirically, I'd try them on a database with one customer who placed 1,000,000 orders.
And even then you should definitely keep in mind that the results you'll be seeing are valid only for the particular optimiser you're using (comes with particular version of particular DBMS) and for the particular physical design you're using (different sets of indexes or different detailed properties of some index might yield different performance characteristics).
Potentially the second is faster if the tables are indexed. So if orders has an index on customer ID, then NOT IN will mean that you aren't bringing back the entire ORDERS table.
But as Erwin said, a lot depends on how things are set up. I'd tend to go for the second option as I don't like bringing in tables unless I need data from them.
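If you do want to measure this empirically on Northwind, one way is to compare logical reads and CPU time for the two variants in the same session (sketch only; run each statement and read the Messages tab):

```sql
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Variant 1: LEFT JOIN anti-join via IS NULL
SELECT companyname
FROM Customers c
LEFT JOIN Orders o ON c.customerid = o.customerid
WHERE o.OrderID IS NULL;

-- Variant 2: NOT IN subquery
SELECT companyname
FROM Customers c
WHERE customerid NOT IN (SELECT customerid FROM Orders);

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```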

DB Performance - Left outer join vs database function

This is a somewhat complex query with multiple joins that returns a lot of records with several data fields. Let's say it is basically used to retrieve manager details.
First set of tables (already implemented query):
Select m.name, d.name, d.address, m.salary , m.age,……
From manager m,department d,…..etc
JOINS …..
Assume a manager can have zero or more employees.
Let's say I need to list all employee names for each and every manager in the result of the first query, including managers who have no employees (which means I want to keep the manager list from the first set of tables as it is).
Then I have to access the “employee” table through the “party” table (a few more tables might be involved).
Second set of tables (to be newly connected):
That means there are one or more joins with “employee”, “party”, etc.
I have two approaches on this:
1. Make a left outer join from the first set of tables to the second set of tables.
2. Create a user-defined function (UDF) at the DB level for the second set of tables. I pass the manager ID into this UDF as a parameter and take all the employees (e1, e2, ...) as a formatted string by calling it in the select clause of the first query.
Can someone please suggest the best of these two options, DB-performance-wise?
Go for the JOIN, using appropriate WHERE clauses and indexes.
The database engine is far better at optimizing than you'll ever be. Let it do its job.
Your way sounds like (n+1) query death.
Write a sample query and ask your database to EXPLAIN PLAN to see what the cost is. If you spot a table scan, revisit your indexes.
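For what it's worth, option 1 might look roughly like this (the table, column, and linking-table names are guesses from the description; STRING_AGG requires SQL Server 2017+):

```sql
-- The LEFT JOINs keep managers with zero employees; their
-- employee_names column simply comes back NULL.
SELECT m.name                   AS manager_name,
       d.name                   AS department_name,
       STRING_AGG(e.name, ', ') AS employee_names
FROM manager m
INNER JOIN department d
    ON d.department_id = m.department_id
LEFT JOIN party p                -- hypothetical linking table
    ON p.manager_id = m.manager_id
LEFT JOIN employee e
    ON e.party_id = p.party_id
GROUP BY m.name, d.name;
```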

Joining against views in SQLServer with strange query optimizer behavior

I have a complex view that I use to pull a list of primary keys that indicate rows in a table that have been modified between two temporal points.
This view has to query 13 related tables and look at a changelog table to determine whether an entity is "dirty" or not.
Even with all of this going on, doing a simple query:
select * from vwDirtyEntities;
Takes only 2 seconds.
However, if I change it to
select e.Name
from Entities e
inner join vwDirtyEntities de on e.Entity_ID = de.Entity_ID
This takes 1.5 minutes.
However, if I do this:
declare @dirtyEntities table
(
    Entity_ID uniqueidentifier
)

insert into @dirtyEntities
select * from vwDirtyEntities;
select e.Name
from Entities e
inner join @dirtyEntities de on e.Entity_ID = de.Entity_ID
I get the same results in only 2 seconds.
This leads me to believe that SQL Server is evaluating the view per row when it is joined to Entities, instead of constructing a query plan that joins the single inner join above to the other joins inside the view.
Note that I want to join against the full result set from this view, as it filters out only the keys I want internally.
I know I could make it into a materialized view, but this would involve schema binding the view and its dependencies, and I don't like the overhead that maintaining the index would cause (this view is only queried for exports, while there are far more writes to the underlying tables).
So, aside from using a table variable to cache the view results, is there any way to tell SQL Server to cache the view while evaluating the join? I tried changing the join order (Select from the view and join against Entities), however that did not make any difference.
The view itself is also very efficient, and there is no room to optimize there.
There is nothing magical about a view. It's a macro that expands: the optimiser decides, when the view is JOINed, how to expand it into the main query.
I'll address other points in your post:
you have ruled out an indexed view. A view can only be a discrete entity when it is indexed
SQL Server will never do a RBAR query on its own. Only developers can write loops.
there is no concept of caching: every query uses the latest data unless you use temp tables
you insist on using the view, which you've decided is very efficient, but you have no idea how views are treated by the optimizer - and this one has 13 tables
SQL is declarative: join order usually does not matter
many serious DB developers don't use views because of limitations like this: they are not reusable, because they are macros
Edit: another possibility is predicate pushing on SQL Server 2005. That is, SQL Server cannot push the JOIN condition "deeper" into the view.
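For comparison, the indexed-view route ruled out above would look roughly like this (sketch only: an indexed view must be schema-bound, use two-part names, and avoid constructs such as outer joins and subqueries, so the real vwDirtyEntities would likely need simplifying first; the ChangeLog columns are assumptions):

```sql
-- Materialize just the dirty keys; SQL Server maintains this index
-- automatically on every write to dbo.ChangeLog.
CREATE VIEW dbo.vwDirtyEntityIDs
WITH SCHEMABINDING
AS
SELECT c.Entity_ID
FROM dbo.ChangeLog c
WHERE c.IsDirty = 1;
GO

CREATE UNIQUE CLUSTERED INDEX IX_vwDirtyEntityIDs
    ON dbo.vwDirtyEntityIDs (Entity_ID);
```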

Database Permission Structure

Many of my employer's applications share a similar internal permission structure for restricting data to a specific set of users or groups. Groups can also be nested.
The problem we're currently facing with this approach is that enumerating the permissions is incredibly slow. The current method uses a stored procedure with many cursors and temporary tables. This has worked fine for smaller applications, but we now have one particular system which is growing quickly, and it's starting to slow down.
The basic table structure is as follows;
tblUser { UserID, Username, WindowsLogonName }
tblGroup { GroupID, Name, Description, SystemFlag }
tblGroupGroup { GroupGroupID, Name, }
tblGroupUser { GroupUserID, Name, }
and to tie it all together;
tblPermission { PermissionID, SecurityObjectID, SecuredID, TableName, AllowFlag }
which contains rows like..
'5255-5152-1234-5678', '{ID of a Group}', '{ID for something in tblJob}', 'tblJob', 1
'4240-7678-5435-8774', '{ID of a User}', '{ID for something in tblJob}', 'tblJob', 1
'5434-2424-5244-5678', '{ID of a Group}', '{ID for something in tblTask}', 'tblTask', 0
Surely there must be a more efficient approach to enumerating all the groups and getting the IDs of the secured rows?
To complicate things further; if a user is explicitly denied access to a row then this overrules any group permissions. This is all in MSSQL.
I'm guessing it would be useful to break tblPermission apart into a couple of tables: one for groups and one for users. Having both groups and users in there seems to add complexity to the design (and maybe that's why you need the stored procedures).
If you break down the tblPermission table (into something like tblUserPermission and tblGroupPermission) but still want a representation that looks like tblPermission, you can make a view that unions the data from the two tables.
Hope this helps. Do you have examples of what the stored procedures do?
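That union view might look like this (a sketch, assuming the split tables keep the same columns as tblPermission):

```sql
-- Presents the two split tables as one tblPermission-shaped relation.
CREATE VIEW dbo.vwPermission
AS
SELECT PermissionID, UserID AS SecurityObjectID, SecuredID, TableName, AllowFlag
FROM dbo.tblUserPermission
UNION ALL
SELECT PermissionID, GroupID AS SecurityObjectID, SecuredID, TableName, AllowFlag
FROM dbo.tblGroupPermission;
```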
I think you could use a recursive common table expression (CTE) to do the hierarchical query. You can find many examples if you search for it. This is one of them.
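A sketch of that approach against the tables in the question (the column names on tblGroupUser and tblGroupGroup are assumptions, since the post elides them):

```sql
-- Expand nested group membership for one user, then fetch everything
-- those groups (or the user directly) are permitted to see.
DECLARE @UserID uniqueidentifier = '...';  -- the user being checked

WITH UserGroups AS
(
    -- Anchor: groups the user belongs to directly.
    SELECT gu.GroupID
    FROM tblGroupUser gu
    WHERE gu.UserID = @UserID

    UNION ALL

    -- Recursive step: parent groups containing those groups.
    SELECT gg.ParentGroupID
    FROM tblGroupGroup gg
    INNER JOIN UserGroups ug ON ug.GroupID = gg.ChildGroupID
)
SELECT p.SecuredID, p.TableName, p.AllowFlag
FROM tblPermission p
WHERE p.SecurityObjectID = @UserID
   OR p.SecurityObjectID IN (SELECT GroupID FROM UserGroups);
```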
Perhaps your design is OK, but the implementation/code is wrong.
Some thoughts:
Are all your ID columns GUIDs? Not recommended - see the Kimberly L. Tripp article
Indexes on all foreign keys, perhaps with other columns in the key or in an INCLUDE
Regular maintenance? e.g. fragmented indexes, stats out of date, etc.
Are all datatypes matching (assuming no FKs)? Datatype precedence and implicit conversion errors may creep in
Some more schema info and examples of poorly performing code may help
