Is there any way to convince SQL Server 2014 to do join elimination when joining columnstore indexed tables?
We have a standard dimensional model, with fact tables and dimensions, and there are views which join the facts with their many dimensions for the convenience of users.
When using conventional row store tables, we can take advantage of SQL's ability to eliminate the joins from those convenience views if they aren't necessary for a given query, by virtue of the fact that there are FK/PK relationships defined between the Facts and the Dimensions which allow the query planner to be certain that the join doesn't add or remove rows.
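To make the row store behaviour concrete, here is a minimal sketch of the pattern. The table, column, and view names are hypothetical and not taken from our actual model; `GO` is the batch separator used by SSMS and sqlcmd.

```sql
-- Hypothetical schema: names and columns are illustrative only.
CREATE TABLE dbo.DimProduct (
    ProductKey  int          NOT NULL PRIMARY KEY,
    ProductName nvarchar(50) NOT NULL
);

CREATE TABLE dbo.FactSales (
    SalesKey   int   NOT NULL PRIMARY KEY,
    -- NOT NULL plus a trusted FK: an inner join to the dimension
    -- can neither add nor remove fact rows.
    ProductKey int   NOT NULL
        CONSTRAINT FK_FactSales_DimProduct
        REFERENCES dbo.DimProduct (ProductKey),
    Amount     money NOT NULL
);
GO

-- Convenience view joining the fact to its dimension.
CREATE VIEW dbo.vSales
AS
SELECT f.SalesKey, f.Amount, p.ProductName
FROM dbo.FactSales AS f
JOIN dbo.DimProduct AS p ON p.ProductKey = f.ProductKey;
GO

-- No DimProduct column is referenced, so the optimizer can drop the join;
-- the plan should touch only dbo.FactSales.
SELECT SUM(Amount) AS TotalAmount
FROM dbo.vSales;
```

If the foreign key is dropped, or is left untrusted after being created or re-enabled `WITH NOCHECK`, the same query should show the join back in the plan.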
However, we'd like to convert the fact tables to columnstores, since they have massive performance improvements for the kinds of aggregate queries that are commonly done on the datamarts. But if we do so, we lose the ability to define the foreign keys, given that they aren't supported by columnstores. In turn, this prevents the planner from doing join elimination, making the convenience views do a whole bunch of unnecessary joining in many situations.
Is there any way to convince the planner to do join elimination, without the use of foreign keys?
To my knowledge, this is impossible in SQL Server 2014. In SQL Server 2016 you can get it easily, because foreign keys are supported on tables with a clustered columnstore index. :)
In some of the latest Cumulative Updates for the SQL Server 2014 RTM and SP1 branches there is one interesting bug fix that improves performance in some cases, though I understand it is not exactly what you are looking for:
Query plan generation improvement for some columnstore queries in SQL Server 2014 or 2016
Related
Can we do database partitioning (not table/view partitioning) on SQL Server 2014 Standard Edition?
By database partitioning, I mean placing files on different physical drives.
If it is not possible, please share a link from an official source such as Microsoft stating that Standard Edition does not support it.
As far as I know, in SQL Server you can put one or more tables in a filegroup, but you cannot spread one table across multiple filegroups unless you are using partitioning. Since Standard Edition does not allow partitioning, it seems you are out of luck.
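For reference, placing a whole table on a filegroup that lives on another drive could look roughly like this. The database name, filegroup name, file path, and sizes are all made up for illustration.

```sql
-- Hypothetical names and paths; adjust to your own database and drives.
ALTER DATABASE SalesDb ADD FILEGROUP FG_Archive;

-- The data file for the new filegroup sits on a different physical drive.
ALTER DATABASE SalesDb
ADD FILE (
    NAME = N'SalesDb_Archive1',
    FILENAME = N'E:\SqlData\SalesDb_Archive1.ndf',
    SIZE = 512MB,
    FILEGROWTH = 256MB
) TO FILEGROUP FG_Archive;

-- The whole table (its heap or clustered index) is stored on that filegroup.
CREATE TABLE dbo.OrdersArchive (
    OrderID   int   NOT NULL PRIMARY KEY,
    OrderDate date  NOT NULL,
    Amount    money NOT NULL
) ON FG_Archive;
```

This works at whole-table granularity only, which is the limitation described above.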
Now, I may regret saying this, but...
What you could consider is to mimic partitioning by splitting your stuff into two or more tables e.g. TABLE1, TABLE2, and so on. Then you place each table on a different filegroup. You can even create a view that does UNION ALL with the tables, so your SELECT queries can hit just one thing, though for INSERT or UPDATE you will probably need to go back to the tables.
Of course, this is NOT PARTITIONING, and you lose a lot of the benefits, from partition operations (switch, split, merge) to engine optimisation, to index management and surely other aspects.
In other words, I would not do this unless I know exactly what I'm doing.
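A rough sketch of the UNION ALL idea described above, assuming two filegroups named FG_2013 and FG_2014 already exist. Every table, column, and filegroup name here is hypothetical.

```sql
-- Assumes filegroups FG_2013 and FG_2014 already exist (hypothetical names).
CREATE TABLE dbo.Orders2013 (
    OrderID   int   NOT NULL,
    OrderDate date  NOT NULL
        CONSTRAINT CK_Orders2013_Date
        CHECK (OrderDate >= '20130101' AND OrderDate < '20140101'),
    Amount    money NOT NULL,
    CONSTRAINT PK_Orders2013 PRIMARY KEY (OrderID, OrderDate)
) ON FG_2013;

CREATE TABLE dbo.Orders2014 (
    OrderID   int   NOT NULL,
    OrderDate date  NOT NULL
        CONSTRAINT CK_Orders2014_Date
        CHECK (OrderDate >= '20140101' AND OrderDate < '20150101'),
    Amount    money NOT NULL,
    CONSTRAINT PK_Orders2014 PRIMARY KEY (OrderID, OrderDate)
) ON FG_2014;
GO

-- One view for SELECT queries to hit.
CREATE VIEW dbo.OrdersAll
AS
SELECT OrderID, OrderDate, Amount FROM dbo.Orders2013
UNION ALL
SELECT OrderID, OrderDate, Amount FROM dbo.Orders2014;
GO

-- Reads go through the view; with trusted CHECK constraints the optimizer
-- can usually skip tables whose date range cannot match the predicate.
SELECT SUM(Amount) AS MarchTotal
FROM dbo.OrdersAll
WHERE OrderDate >= '20140301' AND OrderDate < '20140401';

-- Writes go to the base table that owns the date range.
INSERT INTO dbo.Orders2014 (OrderID, OrderDate, Amount)
VALUES (1, '20140315', 99.00);
```

Each new period needs a new table and an altered view, which is part of the maintenance cost mentioned above.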
I just imported ~50 tables; each table has 2 common foreign keys (which together make each record unique). My goal is to set up a query that joins all these tables. I obviously don't want to have to do this manually and was thinking about setting up a procedure that loops through all tables to build the query dynamically ... is this the best way, or is there an obvious solution I'm not seeing? Thank you
No, there is not. You have to manually write the query to JOIN the tables.
You can also check Automatically Generate a Set of Join Filters Between Merge Articles (SQL Server Management Studio) but I am not sure if that is going to help.
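If you would rather generate the query text than type it, the procedure idea from the question could be sketched with dynamic SQL along these lines. This is only an illustration under several assumptions: the imported tables live in `dbo`, share a name prefix `Import_`, and all carry the two key columns, here called `KeyA` and `KeyB`. All of those names are hypothetical.

```sql
-- Hypothetical names: Import_ prefix, KeyA/KeyB key columns, Import_01 as base.
DECLARE @base sysname = N'Import_01';
DECLARE @sql  nvarchar(max);

SET @sql = N'SELECT *' + NCHAR(10)
    + N'FROM dbo.' + QUOTENAME(@base) + N' AS b'
    + ISNULL((
        -- One JOIN clause per remaining table, concatenated via FOR XML PATH.
        SELECT NCHAR(10) + N'JOIN dbo.' + QUOTENAME(t.name)
             + N' ON ' + QUOTENAME(t.name) + N'.KeyA = b.KeyA'
             + N' AND ' + QUOTENAME(t.name) + N'.KeyB = b.KeyB'
        FROM sys.tables AS t
        WHERE t.schema_id = SCHEMA_ID(N'dbo')
          AND t.name LIKE N'Import[_]%'
          AND t.name <> @base
        ORDER BY t.name
        FOR XML PATH(''), TYPE
      ).value('.', 'nvarchar(max)'), N'');

-- Inspect the generated text first (PRINT truncates long strings).
PRINT @sql;

-- Run it once the text looks right:
-- EXEC sys.sp_executesql @sql;
```

The generated statement is still an ordinary hand-checkable JOIN query; inner joins will drop any key pair that is missing from one of the tables, so review the output before relying on it.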
Can we get the benefits of partitioning a SQL Server 2010 table when we use Entity Framework as the data layer?
The table will receive 10,000 records per day and will be partitioned by creation date (e.g. older than 30 days vs. newer).
I'm not very skilled in SQL Server, so perhaps I'm wrong, but I believe that table partitioning should be transparent to queries (if we are talking about a partition function defined on the table). That means common queries should still work, and even perform better if partitioning is configured correctly. So with a database-first design, EF should not have any problem, because it still works with a single logical table. If you mean manual partitioning by creating a new table each month, then that is a big problem with EF, and you will need stored procedures to access those tables.
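To illustrate what "transparent" means here, a minimal sketch of native partitioning follows. The function, scheme, table, and column names are hypothetical, the boundary dates are arbitrary, and all partitions are mapped to PRIMARY just to keep the example short.

```sql
-- Hypothetical names and boundary values.
CREATE PARTITION FUNCTION pfCreatedOn (datetime)
AS RANGE RIGHT FOR VALUES ('20140101', '20140201');

CREATE PARTITION SCHEME psCreatedOn
AS PARTITION pfCreatedOn ALL TO ([PRIMARY]);

-- The partitioning column must be part of the unique clustered key.
CREATE TABLE dbo.Records (
    RecordID  int IDENTITY(1,1) NOT NULL,
    CreatedOn datetime          NOT NULL,
    Payload   nvarchar(200)     NULL,
    CONSTRAINT PK_Records PRIMARY KEY CLUSTERED (RecordID, CreatedOn)
) ON psCreatedOn (CreatedOn);

-- Queries look exactly as they would against an unpartitioned table;
-- a predicate on CreatedOn lets the engine read only matching partitions.
SELECT RecordID, Payload
FROM dbo.Records
WHERE CreatedOn >= '20140115' AND CreatedOn < '20140116';
```

Nothing in the SELECT mentions partitions, so SQL generated by an ORM should not need to change. A rolling "last 30 days" boundary would need the function to be maintained over time, which this sketch does not cover.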
Does anyone have experience of when SQL Server 2008 R2 is able to automatically match indexed views (also known as materialized views) that contain joins to a query?
For example the view
select dbo.Orders.Date, dbo.OrderDetails.ProductID
from dbo.OrderDetails
join dbo.Orders on dbo.OrderDetails.OrderID = dbo.Orders.ID
cannot automatically be matched to the exact same query. When I select directly from this view with (noexpand), I actually get a much faster query plan that does a scan on the clustered index of the indexed view. Can I get SQL Server to do this matching automatically? I have quite a few queries and views, and I do not want to reference the indexed view manually each time because I am using an OR mapper.
I am on enterprise edition of SQL Server 2008 R2.
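For context, the setup looks roughly like the sketch below. The view name and index name are hypothetical, and the unique clustered key assumes that (OrderID, ProductID) is unique in dbo.OrderDetails, which may not hold in every schema.

```sql
-- Indexed views require SCHEMABINDING and two-part object names.
CREATE VIEW dbo.vOrderProducts
WITH SCHEMABINDING
AS
SELECT od.OrderID, od.ProductID, o.[Date]
FROM dbo.OrderDetails AS od
JOIN dbo.Orders AS o ON od.OrderID = o.ID;
GO

-- Materializes the view. Assumes (OrderID, ProductID) is unique.
CREATE UNIQUE CLUSTERED INDEX IX_vOrderProducts
ON dbo.vOrderProducts (OrderID, ProductID);
GO

-- Explicit reference: NOEXPAND forces the plan to use the view's index.
SELECT [Date], ProductID
FROM dbo.vOrderProducts WITH (NOEXPAND);

-- The query I would like to be matched to the view automatically.
SELECT o.[Date], od.ProductID
FROM dbo.OrderDetails AS od
JOIN dbo.Orders AS o ON od.OrderID = o.ID;
```

Comparing the plans of the last two statements shows whether automatic matching happened for a given query.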
Edit: I found the solution. SQL Server 2008 R2 does not match indexed views with more than 2 joins automatically. Probably it would slow down the optimization process too much.
Edit 2: Reviewing this 2 years after the question was created by me, I don't think my conclusion was correct. Materialized view matching is a very fragile process with no clear rules that I could find over the years.
Certainly, the following play a role:
Number of joins
Presence of a predicate
Join order, both in the view and in the query
I'm a little fuzzy on exactly what your question is; but I think this will give you what you want:
http://msdn.microsoft.com/en-us/library/ms181151.aspx
There are a lot of strange, arbitrary-seeming conditions that limit when SQL Server will use a view index in a query. This page documents them for SQL Server 2008.
We want to replicate data from one database to several others (on another server). Would it make sense to replicate these tables to a shared database on the other server and have our cross-database queries reference the shared database... or would it make more sense to replicate out to each individual database on the other server? Would cross database joins pose a performance hit? Would cross-database constraints work as expected?
Replicating once to a shared database would help replication performance... I'm trying to evaluate whether or not any performance hit as a result of cross-database queries or constraints would be worth it.
Edit: It looks like cross-database constraints are not possible in SQL Server? If this is true, then we would have to replicate to each database.
Cross-database queries are somewhat slower than queries within the same DB. Foreign keys work within the same DB only. The usual approach is to create a separate schema in each DB (like ETL) and then replicate those tables to that schema. This approach is actually frequently used when replicating dimension tables between data marts.
When using the cross-database approach, you can use triggers to implement constraints, but that may be slow and complicated.
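A minimal sketch of such a trigger, assuming a replicated dimension in a database called SharedDb and a local fact table. All database, table, and column names are hypothetical.

```sql
-- Hypothetical names: SharedDb.dbo.DimProduct (replicated), dbo.FactSales (local).
CREATE TRIGGER dbo.trg_FactSales_CheckProduct
ON dbo.FactSales
AFTER INSERT, UPDATE
AS
BEGIN
    SET NOCOUNT ON;

    -- Reject the statement if any new row points at a missing dimension row.
    IF EXISTS (
        SELECT 1
        FROM inserted AS i
        WHERE NOT EXISTS (
            SELECT 1
            FROM SharedDb.dbo.DimProduct AS d
            WHERE d.ProductKey = i.ProductKey
        )
    )
    BEGIN
        RAISERROR(N'ProductKey not found in SharedDb.dbo.DimProduct.', 16, 1);
        ROLLBACK TRANSACTION;
    END
END;
```

This guards only the child side. Deletes or key changes in the shared table would need their own trigger there, which is where the complexity comes from.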
Depending on your application, you may implement foreign keys as "logical only" and run periodic "look for orphans" queries to deal with referential integrity.
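A "look for orphans" query could be as simple as the following sketch, again with hypothetical database, table, and column names.

```sql
-- Fact rows whose ProductKey has no match in the replicated dimension.
SELECT f.SalesKey, f.ProductKey
FROM dbo.FactSales AS f
WHERE NOT EXISTS (
    SELECT 1
    FROM SharedDb.dbo.DimProduct AS d
    WHERE d.ProductKey = f.ProductKey
);
```

Scheduling this as a periodic job and alerting on a non-empty result gives after-the-fact detection rather than enforcement.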