I have to implement some SQL views for a third-party report engine.
Each view has a complex query that joins multiple tables (some of the tables contain millions of rows).
Some of the view queries have subqueries.
The views are not accessed by the application (they are only read by the report engine).
I have the following concerns:
We have some doubts that this will badly impact the current application, since the complex query in a view runs every time the view is accessed.
What would be the performance impact of executing the complex views (memory, time, etc.)?
These are some alternative solutions we are considering:
1. Use new tables instead of views and keep the tables updated using triggers and stored procedures (a rough sketch follows this list).
2. Use a replicated database and create the views there, so it will not affect the current system.
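A minimal sketch of option 1, assuming a hypothetical Orders/Customers schema and a denormalized reporting table (none of these names come from our actual database):

    -- Hypothetical reporting table kept in sync by a trigger
    -- (DELETE handling omitted for brevity).
    CREATE TABLE dbo.OrderReport (
        OrderId      INT PRIMARY KEY,
        CustomerName NVARCHAR(100),
        TotalAmount  DECIMAL(18, 2)
    );
    GO

    CREATE TRIGGER trg_Orders_SyncReport
    ON dbo.Orders
    AFTER INSERT, UPDATE
    AS
    BEGIN
        SET NOCOUNT ON;
        -- Upsert only the affected rows into the reporting table.
        MERGE dbo.OrderReport AS tgt
        USING (SELECT i.OrderId, c.Name AS CustomerName, i.TotalAmount
               FROM inserted i
               JOIN dbo.Customers c ON c.CustomerId = i.CustomerId) AS src
        ON tgt.OrderId = src.OrderId
        WHEN MATCHED THEN
            UPDATE SET CustomerName = src.CustomerName,
                       TotalAmount  = src.TotalAmount
        WHEN NOT MATCHED THEN
            INSERT (OrderId, CustomerName, TotalAmount)
            VALUES (src.OrderId, src.CustomerName, src.TotalAmount);
    END;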
Can you give me comments on the above concerns and the solutions, please? New suggestions are welcome.
I have a data warehouse on a Microsoft SQL Server, with many complex queries involving a lot of joins between tables. Each query returns a structure that is then used to populate an object in my MongoDB database.
The queries can change and involve new tables, so my strategy would be the following:
1. I would create some materialized views (of course, Microsoft does things its own way; it seems those views do not exist as such, but are rendered as normal views plus an index. Is that the same thing, I wonder?).
2. I would set a proper update period for the views.
3. Kafka would then listen for events on those views.
I'm not so sure about this approach, because I don't know whether (and how) this DBMS would produce event logs for materialized views too, nor whether Kafka would interpret them as changes to the tables.
The alternative would be to listen for events on every single table, but as I said there are a lot of them and they could change, so there would be a lot of maintenance involved.
What do you think?
As commented, views don't emit events.
You can use Kafka Connect JDBC to query a view just as you would any other table, though; a sketch follows.
Otherwise, you would need different topics, one per table, and would have to perform the filters and joins yourself.
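For incremental polling, the JDBC source connector needs a column it can track (its timestamp.column.name setting). A minimal sketch of such a view; all table and column names here are hypothetical:

    -- Hypothetical view exposing a timestamp column so the Kafka Connect
    -- JDBC source connector can poll it incrementally (timestamp mode).
    CREATE VIEW dbo.CustomerOrdersForKafka AS
    SELECT o.OrderId,
           o.CustomerId,
           o.TotalAmount,
           o.LastModified   -- must increase on every change you care about
    FROM dbo.Orders o;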
I have an MVC view that requires data from 5 different DB tables. I currently have a big LINQ query that joins all the tables and returns the results, and it works fine. However, I am wondering if it would be better to build a DB view to make the LINQ query simpler.
Querying 5 tables via a single query isn't necessarily a problem. It depends on a ton of external factors, like how performant your database setup is and the characteristics of the tables themselves: are they huge, with millions of rows, or only a few hundred?
Assuming it is a problem, causing excessive load on your database or long page load times, then yes, you might want to look into an alternative solution, but a view is almost certainly not the right choice.
Views have one key negative: they cannot have keys or indexes. That means unless you plan to just return everything in the view, it will almost always be slower to query a view than to do the joins across the tables yourself. Frankly, I've pretty much never found a good use for a database view in a web-application context. Maybe they work in other environments, such as reporting, but other than that, they're useless. If you need an alternative to Entity Framework, use a stored procedure.
Since your objective is performance, stick with the 5 joins. You could enable SQL Profiler and track the query being generated by EF. If you write the query manually and then have EF execute it, you'll probably get better performance too.
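A minimal sketch of what the hand-written query might look like; all table and column names are hypothetical, and EF can run it via something like Database.SqlQuery<T> instead of composing it through LINQ:

    -- Hypothetical hand-written five-table join, replacing the LINQ
    -- composition; EF only materializes the result.
    SELECT u.UserName,
           p.Title,
           c.Body  AS CommentBody,
           t.Name  AS Tag,
           v.Score
    FROM dbo.Users    u
    JOIN dbo.Posts    p ON p.UserId = u.UserId
    JOIN dbo.Comments c ON c.PostId = p.PostId
    JOIN dbo.Tags     t ON t.PostId = p.PostId
    JOIN dbo.Votes    v ON v.PostId = p.PostId
    WHERE u.UserId = @UserId;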
Our application has hundreds of tables in its SQL Server database. Now we want to give users the ability to write queries against certain areas and retrieve the data. Because the current database architecture is too complicated, I am planning to create a set of simplified indexed views and expose those views for users to write queries against.
Data in the tables changes very frequently. Is it OK to use indexed views on such tables? I don't want this feature to become an overhead on the current functionality.
Can you foresee any issue with this procedure?
Thanks!
Any indexed view adds a performance overhead when the underlying tables are inserted into or updated (the changed data must be persisted to the indexed view as well). Based on your description of the requirement, I would start with a regular view and only consider indexing it if the performance of these user-written queries warrants it.
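A minimal sketch of that progression, using a hypothetical Sales table (SQL Server requires SCHEMABINDING, COUNT_BIG(*), and a non-nullable column under SUM for a GROUP BY view to be indexable):

    -- Start with a plain view; WITH SCHEMABINDING is needed only
    -- if you decide to index it later.
    CREATE VIEW dbo.SimplifiedSales
    WITH SCHEMABINDING
    AS
    SELECT s.RegionId,
           COUNT_BIG(*)  AS OrderCount,   -- required for an indexed GROUP BY view
           SUM(s.Amount) AS TotalAmount   -- Amount assumed NOT NULL
    FROM dbo.Sales s
    GROUP BY s.RegionId;
    GO

    -- Only if the user-written queries prove too slow: materialize it.
    CREATE UNIQUE CLUSTERED INDEX IX_SimplifiedSales
    ON dbo.SimplifiedSales (RegionId);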
I tested my database with views both with and without a clustered index. Views with a clustered index make update/delete operations on the underlying tables slower, but the difference was in the millisecond range. I understand it depends on the complexity, but this is what I observed in my scenario.
I ran an update query that updates 15,000 records:
With indexed view: 550-650 ms
Without clustered index: 250-280 ms
Our database doesn't see hundreds of saves per minute, so I think indexed views are suitable for our case.
Thanks!
I know Oracle offers several refresh-mode options for its materialized views (on demand, on commit, periodically).
Does Microsoft SQL Server offer the same functionality for its indexed views?
If not, how else can I use indexed views on SQL Server if my purpose is to export data on a daily plus on-demand basis and I want to avoid performance-overhead problems? Does a workaround exist?
A materialized view in SQL Server is always up to date, with the overhead falling on the INSERT/UPDATE/DELETE statements that affect the view.
I'm not completely sure what you require; your question isn't completely clear to me. However, if you only want to pay the overhead once, on a daily plus on-demand basis, I suggest that you drop the index when you don't need it and recreate it when you do. The index is built when you create it, and the view is then up to date. While the index is dropped, there is no overhead on your INSERT/UPDATE/DELETE commands.
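A minimal sketch of that cycle, with hypothetical names:

    -- Before the daily (or on-demand) export: materialize the view.
    -- Building the index runs the view's query once, so the data is
    -- up to date from this point on.
    CREATE UNIQUE CLUSTERED INDEX IX_DailyExport
    ON dbo.DailyExportView (ExportId);

    -- ... run the export here ...

    -- Afterwards: drop the index so normal INSERT/UPDATE/DELETE
    -- traffic pays no view-maintenance overhead.
    DROP INDEX IX_DailyExport ON dbo.DailyExportView;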
There has been a lot of talk recently about NoSQL.
The #1 reason I hear people give for using NoSQL is that, to increase performance, they de-normalize their DBMS data so much that they end up with just one table holding all of their data.
With materialized views, however, you can keep your data normalized yet have it stored as a single-table view, for the same reasons you'd use NoSQL.
As such, why would someone use NoSQL over Materialized Views?
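To make the premise concrete, a minimal sketch with hypothetical tables: the base tables stay normalized, while the indexed view persists the joined, single-table shape.

    -- Hypothetical normalized tables flattened into one indexed view.
    CREATE VIEW dbo.UserProfileFlat
    WITH SCHEMABINDING            -- required for an indexed view
    AS
    SELECT u.UserId, u.UserName, a.City, a.Country
    FROM dbo.Users u
    JOIN dbo.Addresses a ON a.UserId = u.UserId;
    GO

    -- Persists the join result like a single denormalized table
    -- (assumes one address per user, so UserId stays unique).
    CREATE UNIQUE CLUSTERED INDEX IX_UserProfileFlat
    ON dbo.UserProfileFlat (UserId);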
One reason is that materialized views will perform poorly in an OLTP situation where there is a heavy amount of INSERTs vs. SELECTs.
Every time data is inserted, the materialized view's indexes must be updated, which slows down not only inserts but selects as well. The primary reason for using NoSQL is performance. By being basically a hash-key store, you get insanely fast reads/writes, at the cost of less control over constraints, which typically must be enforced at the application layer.
So, while materialized views may help reads, they do nothing to speed up writes.
NoSQL is not about getting better performance out of your SQL database. It is about considering options other than the default SQL storage when there is no particular reason for the data to be in SQL at all.
If you have an established SQL Database with a well designed schema and your only new requirement is improved performance, adding indexes and views is definitely the right approach.
If you need to save a user profile object that you know will only ever need to be accessed by its key, SQL may not be the best option - you gain nothing from a system with all sorts of query functionality you won't use, but being able to leave out the ORM layer while improving the performance of the queries you will be using is quite valuable.
Another reason is the dynamic nature of NoSQL. Each view has to be created beforehand, with a guess as to how an application might use it.
With NoSQL you can change as the data changes, dynamically varying your data to suit the application.