Limitations of SQL views: ORDER BY clause not supported - sql-server

Why can't we use an ORDER BY clause when creating a view? Why does SQL Server accept ORDER BY in a view definition when the query also has a TOP clause, but reject it without TOP?

A view is nothing but a virtual table, and the order in which rows come back from a table is never guaranteed in any RDBMS unless the query itself asks for one.
What you need to do is order the rows when you select from the view:
SELECT <Column1>,<Column2>,....,<ColumnN>
FROM <MyView>
ORDER BY <MyColumn>

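Because T-SQL is based on the relational model: a view is a relation, and a relation has no order. ORDER BY is accepted together with TOP only because it decides which rows TOP picks; it still does not guarantee the order of the rows returned when you query the view.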

In SQL, a view is a virtual table based on the result-set of an SQL statement. A view contains rows and columns, just like a real table. The fields in a view are fields from one or more real tables in the database.
You can add SQL functions, WHERE, and JOIN statements to a view and present the data as if the data were coming from one single table.
To order the resulting data, query the view and apply an ORDER BY clause as your requirement dictates.
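As a rough illustration of the behaviour discussed in this thread, here is a minimal T-SQL sketch. The table dbo.Orders and the view names are hypothetical, invented only for this example, and GO is the batch separator used by SSMS and sqlcmd.

```sql
-- Hypothetical table, used only for illustration.
CREATE TABLE dbo.Orders (
    OrderID   int   NOT NULL PRIMARY KEY,
    OrderDate date  NOT NULL,
    Amount    money NOT NULL
);
GO

-- Rejected by SQL Server: ORDER BY without TOP, OFFSET or FOR XML
-- is not allowed in a view definition.
-- CREATE VIEW dbo.vOrders AS
-- SELECT OrderID, OrderDate, Amount FROM dbo.Orders ORDER BY OrderDate;

-- Accepted: here ORDER BY only decides WHICH ten rows TOP keeps.
CREATE VIEW dbo.vRecentOrders AS
SELECT TOP (10) OrderID, OrderDate, Amount
FROM dbo.Orders
ORDER BY OrderDate DESC;
GO

-- The outer query still has to state the order it wants.
SELECT OrderID, OrderDate, Amount
FROM dbo.vRecentOrders
ORDER BY OrderDate DESC;
```

The design point is that the ordering belongs to the query that consumes the view, not to the view definition.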

Related

Why can't columnar databases like Snowflake and Redshift change the column order?

I have been working with Redshift and am now testing Snowflake. Both are columnar databases. Everything I have read about this type of database says that they store the information by column rather than by row, which helps with massively parallel processing (MPP).
But I have also seen that they cannot change the position of a column or add a column between existing columns (I don't know about other columnar databases). The only way to add a new column is to append it at the end. If you want to change the order, you need to recreate the table with the new order, drop the old one, and rename the new one (this is called a deep copy). But this is sometimes not possible because of dependencies or even memory utilization.
I'm more surprised that this can be done in row databases and not in columnar ones. Of course, there must be a reason why it's not a feature yet, but I clearly don't have enough information about it. I thought it would just be a matter of changing the ordinal position of the columns in information_schema, but it clearly is not that simple.
Does anyone know the reason for this?
Generally, column ordering within the table is not considered to be a first-class attribute. Columns can be retrieved in whatever order you require by listing the names in that order.
Emphasis on column order within a table suggests frequent use of SELECT *. I'd strongly recommend not using SELECT * in columnar databases without an explicit LIMIT clause to minimize the impact.
If the column order must be changed, you do that in Redshift by creating a new, empty version of the table with the columns in the desired order and then using ALTER TABLE APPEND to move the data into the new table very quickly.
https://docs.aws.amazon.com/redshift/latest/dg/r_ALTER_TABLE_APPEND.html
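A minimal sketch of that Redshift approach follows. The table names my_table and my_table_new and the two columns are hypothetical, and the sketch assumes both tables share the same distribution and sort settings, which ALTER TABLE APPEND requires.

```sql
-- Existing table (hypothetical) with columns in the order b, a.
-- CREATE TABLE my_table (b integer, a integer);

-- New, empty table with the columns in the desired order.
CREATE TABLE my_table_new (a integer, b integer);

-- Move (not copy) the rows; columns are matched by name.
-- This statement cannot run inside a transaction block.
ALTER TABLE my_table_new APPEND FROM my_table;

-- Swap the new table in under the old name.
DROP TABLE my_table;
ALTER TABLE my_table_new RENAME TO my_table;
```

Dependent objects such as views on the original table would still need to be handled separately, which is the dependency problem the question mentions.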
The order in which the columns are stored internally cannot be changed without dropping and recreating the table.
Your SQL can retrieve the columns in any order you want.
The usual reason for wanting columns listed in a particular order is presentation.
You could define a view with the desired column order and use the view in the required operation.
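-- The table is created with its columns in the order B, A.
CREATE OR REPLACE TABLE CO_TEST(B NUMBER, A NUMBER);
INSERT INTO CO_TEST VALUES (1,2),(3,4),(5,6);
SELECT * FROM CO_TEST;      -- columns come back as B, A
SELECT A, B FROM CO_TEST;   -- an explicit column list returns A, B
-- The view exposes the columns in the desired order without touching the table.
CREATE OR REPLACE VIEW CO_VIEW AS SELECT A, B FROM CO_TEST;
SELECT * FROM CO_VIEW;      -- columns come back as A, B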
Creating a view to list the columns in the required order does not disturb the table underneath the view, and no resources are wasted on recreating the table.
In some databases (Oracle especially), the order of columns in a table can make a difference in performance when NULLable columns are stored at the end of the list. This has to do with how storage is utilized within the data block.

SQL Server partitioning

I am working on a database with very large record sets in SQL Server 2016, so I want to use the table partitioning feature to improve speed.
As we know, partitioning works on a partition column of a table, let's say a [Date Column]. In our scenario, 5 to 7 tables need to be partitioned because of their heavy record sets. Not every table has that [Date Column], and it is not possible to add that column to each table.
So is there any way I can use the partition column of another table, or something else?
The best option is to add a common column to all tables that you will then use to partition by.
You must already have a way of relating the different tables to each other so you can use this to tag each table with the correct Partition column.
This column could be as simple as an int with YYYYMM as values for monthly partitions.
You also need to make sure your queries are "partition aware".
This means that you should include this column in the WHERE clause and also in the JOIN clauses of any queries.
Use query plans to make sure you are getting partition elimination on your queries.
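The advice above could look roughly like this in T-SQL. All object names (pfMonth, psMonth, OrderHeader, OrderLine, PeriodKey) and the boundary values are assumptions made up for the sketch, and every partition is placed on the PRIMARY filegroup for simplicity.

```sql
-- Monthly partitions keyed on an int in YYYYMM form.
CREATE PARTITION FUNCTION pfMonth (int)
AS RANGE RIGHT FOR VALUES (202401, 202402, 202403);

CREATE PARTITION SCHEME psMonth
AS PARTITION pfMonth ALL TO ([PRIMARY]);

-- Both tables carry the same partition column.
CREATE TABLE dbo.OrderHeader (
    OrderID   int NOT NULL,
    PeriodKey int NOT NULL,
    CONSTRAINT PK_OrderHeader PRIMARY KEY CLUSTERED (PeriodKey, OrderID)
) ON psMonth (PeriodKey);

CREATE TABLE dbo.OrderLine (
    OrderLineID int   NOT NULL,
    OrderID     int   NOT NULL,
    PeriodKey   int   NOT NULL,
    Amount      money NOT NULL,
    CONSTRAINT PK_OrderLine PRIMARY KEY CLUSTERED (PeriodKey, OrderLineID)
) ON psMonth (PeriodKey);

-- A "partition aware" query: the partition column appears in both
-- the JOIN and the WHERE clause so the optimizer can eliminate partitions.
SELECT h.OrderID, SUM(l.Amount) AS Total
FROM dbo.OrderHeader AS h
JOIN dbo.OrderLine AS l
  ON l.OrderID = h.OrderID
 AND l.PeriodKey = h.PeriodKey
WHERE h.PeriodKey = 202402
GROUP BY h.OrderID;
```

Checking the actual execution plan for this query is how you would confirm that only the 202402 partition is touched.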
If you can't change the model (but can add partitions?), you could partition each table on a different column, provided each table has a single column that can be partitioned on ranges. However, if you have one-to-many relationships, it is unlikely that the child tables' keys will be consecutive relative to the parent table. Note that this approach will make your "partition aware" queries more complex to craft.

SQL Server Indexed views used instead of tables

I am a little bit confused about using indexed views in SQL Server 2016.
Here is my issue. If I have a fact table with a lot of columns and I create an indexed view named IV_Sales as
select
year,
customer,
sum(sales)
from F_Sales
group by year, customer
I would aggregate all sales for year and customer.
After that, when a user runs a query from the F_sales like
Select
year, customer,
sum(sales)
from F_sales
group by year, customer
will the Optimizer (in SQL Server Enterprise Edition) automatically use the indexed view IV_sales instead of table scan of F_sales?
I have the Standard Edition and when I add
Select
year,
customer,
sum(sales)
from F_sales WITH (NOEXPAND)
group by year, customer
I get an error, since the table has no clustered index like the one I created on the indexed view. Is there a way to force the use of indexed views instead of the table in Standard Edition?
My real-world issue is that I have a Cognos Framework model pointing to the table F_sales, and when a report is executed using year, customer and sum of sales, I want it to use the indexed view automatically instead of the table, for performance reasons.
I hope I'm being clear about my issue. Many thanks in advance.
If you have a performance issue, indexed views are probably the last thing you want to try.
You should exhaust all other avenues, like standard indexes, first.
For example, if you know for sure that you are doing a table scan, the simple solution is to add a nonclustered index to satisfy the query so it does an index scan or seek instead. If it still doesn't use this, you need to continue your performance tuning and work out why it isn't (non-sargable expressions? stale statistics?).
In Enterprise Edition, your indexed view will automatically be used (without explicit mention of the indexed view) in a very limited number of cases. You'll see it in the query plan.
If your query very closely matches the indexed view definition, it will use your indexed view.
Make a very small change to your SQL (like joining to another table) and it won't throw an error; it will just fall back to not using the indexed view.
Automatic SQL writing tools like Cognos will very quickly make the SQL unrecognisable to the query planner, which will therefore not use the indexed view.
This is all very easily verifiable if you just crack open SSMS and do some experiments.
So in short: start your optimisation with standard indexes, filtered indexes, even columnstore indexes (which are particularly good for fact tables, or so I hear).
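For experimenting in SSMS, a hedged sketch of what a valid indexed view over F_Sales might look like is shown here. The column names total_sales and row_count and the index name are assumptions, and the sketch assumes F_Sales lives in the dbo schema and that sales is declared NOT NULL (an indexed view cannot SUM a nullable expression). Note that the NOEXPAND hint goes on the view, not on the base table as in the question.

```sql
-- Indexed views require SCHEMABINDING, two-part object names,
-- and COUNT_BIG(*) whenever GROUP BY is used.
CREATE VIEW dbo.IV_Sales
WITH SCHEMABINDING
AS
SELECT [year],
       customer,
       SUM(sales)   AS total_sales,  -- assumes sales is NOT NULL
       COUNT_BIG(*) AS row_count
FROM dbo.F_Sales
GROUP BY [year], customer;
GO

-- The unique clustered index is what materializes the view.
CREATE UNIQUE CLUSTERED INDEX CIX_IV_Sales
ON dbo.IV_Sales ([year], customer);
GO

-- Query the view directly and tell the optimizer not to expand it
-- back into its definition; this is how the view's index gets used
-- in editions without automatic indexed-view matching.
SELECT [year], customer, total_sales
FROM dbo.IV_Sales WITH (NOEXPAND);
```

For a reporting tool this implies pointing the model at the view rather than at F_sales, since the hint cannot be applied through the base table.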

Understanding indexed view update and query process in SQL Server 2008 R2

I created an indexed view (with a unique clustered index on Table1_ID) using T-SQL like this:
Select Table1_ID, Count_BIG(*) as Table2TotalCount from Table2 inner join
Table1 inner join... where Table2_DeletedMark=0 AND ... Group BY Table1_ID
After creating the view, we created a unique clustered index on the column Table1_ID.
So the view consists of two columns:
Table1_ID
Table2TotalCount
The T-SQL that creates the view is heavy because of the GROUP BY and the several million rows in Table2.
But when I run a query against the view like
Select Table2TotalCount from MyView where Table1_ID = k
- it executes fast and without overhead for the server.
The T-SQL that creates the view also has many conditions on Table2 columns in its WHERE clause.
If I change Table2_DeletedMark to 1 and run the query
Select Table2TotalCount from MyView where Table1_ID = k
again, I get correct results (Table2TotalCount decreased by 1).
So our questions are:
1. Why does query execution time decrease so much when we use the indexed view, compared to not using the view (even when we run DBCC DROPCLEANBUFFERS before querying the view)?
2. After changing Table2_DeletedMark, the view is recalculated immediately and we get correct results, but what is the process behind this? We can't imagine that SQL Server re-executes the T-SQL that defines the view each time we change any value in the 10+ columns that appear in the view definition, because that would be too heavy.
We understand that a simple query would be enough to recalculate the values, depending on which column values we change.
But how does SQL Server know this?
An indexed view is materialized, i.e. the rows it contains (from the tables it depends on) are physically stored on disk, much like a "system-computed" table that is always kept up to date whenever its underlying tables change. This is done by adding the clustered index: the leaf pages of the clustered index on a SQL Server table (or view) are the data pages, really.
Columns in an indexed view can be indexed with nonclustered indexes too, and thus you can improve query performance even more. The downside is that, since the rows are stored, you need disk space (and some data is duplicated, obviously).
A normal view, on the other hand, is just a fragment of SQL that will be executed to compute the results, based on what you select from that view. There is no physical representation of that view and no rows are stored for it; they need to be joined together from the base tables as needed.
Why do you think there are so many bizarre rules on what's allowed in indexed views, and what the base tables are allowed to do? It's so that the SQL engine can immediately know "If I'm touching this row, it potentially affects the result of this view - let's see, this row no longer fits the view criteria, but I insisted on having a COUNT_BIG(*), so I can just decrement that value by one".
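The incremental maintenance described above can be observed with a small T-SQL sketch. It is deliberately simplified to a single hypothetical table (the question's view joins several tables and has more conditions), and the index name and sample rows are made up for the example.

```sql
-- Simplified, hypothetical stand-in for the question's Table2.
CREATE TABLE dbo.Table2 (
    Table2_ID          int NOT NULL PRIMARY KEY,
    Table1_ID          int NOT NULL,
    Table2_DeletedMark bit NOT NULL
);
GO

CREATE VIEW dbo.MyView
WITH SCHEMABINDING
AS
SELECT Table1_ID, COUNT_BIG(*) AS Table2TotalCount
FROM dbo.Table2
WHERE Table2_DeletedMark = 0
GROUP BY Table1_ID;
GO

-- Materializes the view: one stored row per Table1_ID.
CREATE UNIQUE CLUSTERED INDEX CIX_MyView ON dbo.MyView (Table1_ID);
GO

INSERT INTO dbo.Table2 (Table2_ID, Table1_ID, Table2_DeletedMark)
VALUES (1, 10, 0), (2, 10, 0), (3, 20, 0);

-- Reads the stored row for Table1_ID = 10: count is 2.
SELECT Table2TotalCount FROM dbo.MyView WITH (NOEXPAND) WHERE Table1_ID = 10;

-- The same statement that changes the base row also adjusts the
-- stored count for the affected group; the view is not recomputed.
UPDATE dbo.Table2 SET Table2_DeletedMark = 1 WHERE Table2_ID = 2;

-- Count is now 1.
SELECT Table2TotalCount FROM dbo.MyView WITH (NOEXPAND) WHERE Table1_ID = 10;
```

Looking at the execution plan of the UPDATE shows the extra operators that maintain the view's clustered index, which is the cost paid at write time for the fast reads.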

Full-text Indexing for a view with multiple databases

Can MS SQL support full-text indexing for a view that connects (joins or unions) multiple databases?
Yes, absolutely. Each index will be queried individually and the results will be combined by the engine.
For example, if you've got:
DatabaseA, TableA, FieldA with a full text index
DatabaseB, TableB, FieldB with a full text index
And you have a view that includes both fields from both tables in both databases, it'll work fine when you query that view. From SQL Server's perspective, it doesn't matter whether they're in the same database or not.
If that doesn't match your scenario, try posting more detail about your challenges. Thanks!
No, not at all.
You cannot create a full-text index on a table or view that has no unique index.
You cannot create a view with a clustered index if it contains LEFT/RIGHT joins or UNIONs.
You can do a full-text search on a view that contains data from another database, but only if it contains a single table or inner-joined tables.
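One way to sidestep the view restrictions is to search each full-text indexed base table directly and combine the results. The sketch below reuses the database, table and field names from the first answer; the dbo schema, the column aliases and the search term are assumptions, and both tables are assumed to already have full-text indexes on the searched fields.

```sql
-- Each branch uses the full-text index of its own database;
-- the results are combined afterwards with UNION ALL.
SELECT 'DatabaseA' AS source_db, a.FieldA AS matched_text
FROM DatabaseA.dbo.TableA AS a
WHERE CONTAINS(a.FieldA, N'example')

UNION ALL

SELECT 'DatabaseB' AS source_db, b.FieldB AS matched_text
FROM DatabaseB.dbo.TableB AS b
WHERE CONTAINS(b.FieldB, N'example');
```

This keeps the full-text predicates on the tables that own the indexes, so it does not depend on whether a view over both databases can itself be indexed.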
