Mirror table vs materialized view - database

From this excellent video "Microservices Evolution: How to break your monolithic database by Edson Yanaga" I know that there are different ways to split chunk of data as separate db for microservice:
View
Materialized View
Mirror Table using Trigger
Mirror Table using Transactional Code
Mirror Table using ETL tools
Event Sourcing
Could you please explain me the difference between mirrored table and materialized view?
I'm confused due to both of them are stored on disk...

My understanding is :-
Mirrored tables
Mirrored tables are generally an exact copy of the another, source table. Same structure and the same data. Some database platforms allow triggers to be created on the source table which will perform updates on the source table to the mirror table. If the database platform does not provide this functionality, or if the Use Case dictates, you may perform the update in transactional code instead of a trigger.
Materialized Views
A Materialized View contains the result of a query. With a regular database view, when the underlying table data changes, querying the view reflects those changes. However, with a materialized view the data is current only at the point in time of creation (or refresh) of the Materialized view. In simple terms, a materialized view is a snapshot of data at a point in time.

Related

Materialized View vs Table Using dbt

I'm just onboarding dbt and having gone through the tutorial docs I'm wondering if there's a difference between materializing my transformations as views or tables? I'm using Snowflake as the data warehouse. There's some documentation here that shows the differences between a table and a materialized view but if I'm using dbt to update the tables regularly, do they more or less become the same thing?
Thanks!
dbt doesn't support materialized views, as far as I'm aware, but as Felipe commented, there is an open issue to discuss it. If it were possible to use materialized views on Snowflake, you're right that they somewhat become the same thing. The materialized view would update even if you haven't run dbt. As Drew mentions in the ticket though, there are a lot of caveats that make using tables with dbt preferable in most use cases: "no window functions, no unions, limited aggregates, can't query views, etc etc etc".
That said, dbt does support views and tables.
Even when you're using dbt, there's still a difference between a view and a table. A table will always need to be refreshed by dbt in order to be updated. A view will always be as up-to-date as the underlying tables it is referencing.
For example, let's say you have a dbt model called fct_orders which references a table that is loaded by Fivetran/Stitch called shopify.order. If your model is materialized as a view, it will always return the most up-to-date data in the Shopify table. If it is materialized as a table, and new data has arrived in the Shopify table since you last run dbt, the model will be 'stale'.
That said, the benefit of materializing it as a table is that it will run more quickly, given it's not having to do the SQL 'transformation' each time.
The advice I have seen given most often is something like this:
If using a view isn't too slow for your end-users, use a view.
If a view gets too slow for your end-users, use a table.
If building a table with dbt gets too slow, use incremental models in dbt.
If you use DBT there's little need for materialized views: a materialized view is in fact a table which is based on a query - same as "create table as select". If you have a DBT model you can materialize as a table and you'll get the same result. Now the difference between a table and a materialized view is the fact that the materialized view automatically updates, while the table does not. But if you're using DBT you can schedule a refresh of the table by scheduling DBT.
This will only give you updated data after your scheduled DBT will complete, which is not the same as a materialized view if the underlying table changes frequently, but most people refrain from using materialized views on top of tables that change frequently because the running cost can get out of control.
Materialized views in Snowflake can only query one table, while with DBT there are more options - e.g. join two tables and materialize as a table will give you something you can't do with a materialized view.
Finally, if you really want to deploy materialized views with DBT there are two ways:
Use the pre-hook or the post-hook, which executes any piece of SQL after running the DBT model. That can work but the maintenance is not great.
There is a way to create your own materialization - see https://docs.getdbt.com/docs/guides/creating-new-materializations - this is not an easy task, but that will give you want you want. There's also a GitHub page called dbt-hack which gives interesting techniques on non-standard materializations.

Trying to validate if a SQL Server data model uses temporal tables?

Looks like SQL Server 2008 and later uses the concept of "temporal tables" to manage table data history:
https://learn.microsoft.com/en-us/sql/relational-databases/tables/temporal-table-usage-scenarios
Looks like the following clause is used to accomplish this:
WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.MyTableHistory));
Let's assume that a data model has tables TableX and a TableXHistory and I select the following context menu path to generate a DDL script of TableX:
Script Table as > CREATE to > New Query Editor Window
If the generated SQL script does not have a text reference to "HISTORY_TABLE" then can I say 100% that the history table is not managed as a temporal table? Also, would a temporal table be explicitly displayed in the standard tables directory for the data model? Is there any reason not to use temporal tables in 2018 as opposed to manually created history tables? My first impression is that anyone who creates manual history tables in 2018 is most likely out of date with SQL Server capabilities.
Temporal tables available only from 2016. Technology is not mature yet.
Temporal tables have their own Pros & Cons. Other options should be considered (classic triggers and history table, change data capture, replication, etc.)
The main disadvantages of temporal tables for me:
multiple changes made at the same time are invisible (only one row is returned)
history tables must be located at the same DB
limitation for transactional replication, merge replication is not supported
issues when system time has been changed - no way to know which update was first w/o implementing additional logic (version)
history tables can'be updated w/o disabling versioning
to get net changes you need to query the base table (which is not good).
how to detect which columns are changed? (CDC & triggers can detect that naturally, with temporal it may be very expensive)
...

Impala: how to create a materialized view in impala?

Can we create Materialized views in Impala?
If not, what is the alternative solution for better performance of view.
Impala can't create materialized views at this time. So the solution for better view performance would be to load the output of the view query into a table and then have the view query the table or just query the table. As for keeping the data up to date you can take a batch approach of scheduling some DML statement to refresh the data or you could take a streaming approach by using something like Kafka to keep the data up to date.

SQL Server Indexed Views vs Oracle Materialized View

I know materialized view and I'm using it. I have never used indexed views but I will. What are the differences between them ?
SQL Server’s indexed views are always kept up to date. In SQL Server, if a view’s base tables are modified, then the view’s indexes are also kept up to date in the same atomic transaction.
Oracle provides something similar called a materialized view. If Oracle’s materialized views are created without the **REFRESH FAST ON COMMIT** option, then the materialized view is not modified when its base tables are. So that’s one major difference. While SQL Server’s indexed views are always kept current, Oracle’s materialized views can be static.

ETL : Tracking changes to data using Materialized View log

I am into designing ETL with source and target database as oracle Standard Edition.
For ETL purpose I need to get the changed data everytime.Client does not want any changes to be made in source objects.
Is it feasible to create Materialized view log on source database using dblink to track Inser/Update/Delete on the identified tables.
Thanks and Regards
I do not believe so -- a materialized view log must be created in the same database as the source object. If the database link were unavailable, your materialized view log would then be incomplete or inaccurate, or worse yet, would be blocking DML against the source table.
I'd recommend instead either:
Accepting the overhead of a FULL vs
FAST refreshable materialized view; or
Implementing Streams-based replication
to have your own copy of the table(s) in question,
against which you then implement materialized view logs.

Resources