Does Hibernate work with normalized databases? If not, the db design loses the reduced redundancy and the optimization that normalization provides. How can I then achieve optimization in db systems where we use Hibernate?
Related
Currently, I generate data on a different datastore and replicate it to Snowflake staging; the data then moves to the data warehouse DB through ELT ingestion for analytics purposes. However, this approach can itself be seen as creating data silos, since we already have 3 copies of the same data:
Transactional data-store DB
Replicated Snowflake staging
Snowflake Data Warehouse DB
From a technical architecture point of view, is it a good idea to use Snowflake as the direct datastore for a transactional application (one that does many CRUD operations)? That might avoid the cost of replication and ingestion.
The main problem I see with this approach is that Snowflake does not enforce primary-key or foreign-key constraints, so within the CRUD app I have to either always use a MERGE statement or otherwise make sure I don't create duplicate records.
The other problem is that, in the cloud, the distance (i.e. network latency) between the app and Snowflake determines the performance of the transactions, and I want good, consistent performance for my CRUD operations.
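To make the duplicate-record concern concrete, here is a minimal sketch of a MERGE-style "update, else insert" write path. It uses Python's built-in sqlite3 as a stand-in, with a table that deliberately has no key constraint so it behaves like a store that does not enforce keys. The `customers` table and the `upsert_customer` helper are hypothetical names chosen for illustration, and this is not Snowflake's API.

```python
import sqlite3


def upsert_customer(conn, customer_id, name):
    """Update the row if the key already exists, otherwise insert it."""
    cur = conn.execute(
        "UPDATE customers SET name = ? WHERE customer_id = ?",
        (name, customer_id),
    )
    if cur.rowcount == 0:
        # No existing row matched the key, so this is a genuinely new record.
        conn.execute(
            "INSERT INTO customers (customer_id, name) VALUES (?, ?)",
            (customer_id, name),
        )
    conn.commit()


conn = sqlite3.connect(":memory:")
# No PRIMARY KEY or UNIQUE constraint: the database itself will not reject
# duplicates, so the application has to guard against them.
conn.execute("CREATE TABLE customers (customer_id INTEGER, name TEXT)")

upsert_customer(conn, 1, "Ada")
upsert_customer(conn, 1, "Ada Lovelace")  # same key: updates, does not duplicate
upsert_customer(conn, 2, "Grace")

rows = conn.execute(
    "SELECT customer_id, name FROM customers ORDER BY customer_id"
).fetchall()
print(rows)
```

Note that with several concurrent writers this check-then-insert pattern can still race, which is part of why enforced constraints matter for OLTP workloads.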
Any thoughts/suggestions are much appreciated.
As of today, Snowflake does not perform well with singleton updates and inserts, which is mostly what we see with transactional databases. I have seen performance degrade when singleton inserts are submitted against Snowflake.
On the other hand, it is heavily optimized for bulk ingestion of both structured and unstructured data and is designed for OLAP warehouses. You can still use it for transactional work, but you may see the same performance degradation. Also, primary keys can be defined but they are not enforced.
In my opinion, if you are faced with that challenge, you have the option of using a PostgreSQL database (open source) in the cloud as your transactional database; it is a good complement to Snowflake as the OLAP database.
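The difference between singleton inserts and bulk ingestion can be sketched roughly as follows. This is only an illustration of the batching idea: sqlite3 stands in for the warehouse, and the `events` table, the `BatchingWriter` class and the batch size are assumptions, not anything Snowflake-specific.

```python
import sqlite3


class BatchingWriter:
    """Buffers rows and writes them in bulk instead of one statement per row."""

    def __init__(self, conn, batch_size=1000):
        self.conn = conn
        self.batch_size = batch_size
        self.buffer = []
        self.flush_count = 0  # how many bulk writes were actually issued

    def add(self, row):
        self.buffer.append(row)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        if not self.buffer:
            return
        # One bulk call and one commit for the whole batch.
        self.conn.executemany(
            "INSERT INTO events (id, payload) VALUES (?, ?)", self.buffer
        )
        self.conn.commit()
        self.buffer.clear()
        self.flush_count += 1


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER, payload TEXT)")

writer = BatchingWriter(conn, batch_size=1000)
for i in range(2500):
    writer.add((i, f"event-{i}"))
writer.flush()  # write whatever is left in the buffer

total = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
print(total, writer.flush_count)
```

Here 2500 rows reach the table in 3 bulk writes rather than 2500 individual statements, which is the access pattern an OLAP store is built for.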
No. Snowflake isn't good as a transactional / OLTP database for the reasons you've mentioned. Plus, it won't perform well with many individual CRUD operations because of how it structures the data (optimised for OLAP workloads).
Just want to point out that there are benefits to keeping separate databases. For one, you want to isolate your transactional database from your analytics database; otherwise analytics queries could significantly affect the performance of the application. Secondly, the data in the transactional database could change, and if you had to reprocess the data for whatever reason, you might not be able to do so. There are many more, but I will stop here :-)
I've been working on a project for a dating-like app, along the lines of Tinder/Bumble. I've been debating which database to use: Cassandra or MongoDB. So far I only have experience with MS SQL, MySQL and UniData. I've been looking into Cassandra and MongoDB because of scalability, but I've heard Tinder had issues with their MongoDB and had to call in help. Even if it is neither of those two, what else would you suggest? Learning a new DB would not be an issue for me, but I am looking for performance and scalability. The main programming language will be C# (if it helps), and preferably I want to build this in the cloud (Azure Cosmos DB, AWS DynamoDB or similar). I am leaning towards a NoSQL DB because of scalability, but I wouldn't be opposed to an RDBMS if there is a strong reason.
Suggestions, comments, thoughts?
Cassandra has some advantages over MongoDB.
There is no master-slave setup in Cassandra; any node can receive any query. If the master goes down in MongoDB, you'll face a little downtime.
It is easy to scale Cassandra; adding a node is not a challenge.
Writes are very fast.
Read queries by primary key are fast.
On the other hand:
There is no aggregation in Cassandra
Poor performance under very heavy update/delete workloads (accumulating tombstones degrade performance: http://thelastpickle.com/blog/2016/07/27/about-deletes-and-tombstones.html)
Not efficient for full-text search applications
No transactions
No joins
Secondary indexes are not equivalent to RDBMS indexes and should not be used very often
So you cannot use Cassandra for every use case. If your data model does not fit Cassandra, consider another DB that fits your requirements.
Also take a look at: https://blog.pythian.com/cassandra-use-cases/
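To illustrate why heavy deletes hurt, here is a deliberately simplified toy model of tombstones. It is not Cassandra's storage engine: `ToyPartition` and its methods are made up for this sketch. It only shows the idea that a delete is written as a marker, so a read still has to scan deleted cells until they are eventually purged.

```python
TOMBSTONE = object()  # sentinel marking a deleted cell


class ToyPartition:
    """Append-only list of (key, value) cells, newest last."""

    def __init__(self):
        self.cells = []

    def write(self, key, value):
        self.cells.append((key, value))

    def delete(self, key):
        # A delete is just another write: a marker that shadows older cells.
        self.cells.append((key, TOMBSTONE))

    def read_all(self):
        """Return the live rows and the number of cells that were scanned."""
        latest = {}
        for key, value in self.cells:
            latest[key] = value  # later cells win over earlier ones
        live = {k: v for k, v in latest.items() if v is not TOMBSTONE}
        return live, len(self.cells)


partition = ToyPartition()
for i in range(1000):
    partition.write(i, f"msg-{i}")
for i in range(990):
    partition.delete(i)

live, scanned = partition.read_all()
print(len(live), scanned)
```

In this toy run only 10 rows are live, yet the read walks 1990 cells. In real Cassandra, compaction removes tombstones only after the table's `gc_grace_seconds` has passed, so a delete-heavy data model keeps paying this cost in the meantime.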
It now seems that Google is betting on NewSQL solutions for big data storage.
I'm wondering whether a NoSQL solution still has any advantages compared to a NewSQL solution (memory management or other things, for example)?
NewSQL databases are a new "strain" of databases, if you will, that attempt to take the long-established benefits of the traditional Relational Database Management System (RDBMS) and make it compete with the highlights of NoSQL data stores. They are not updates or improvements to an RDBMS but more often rewrites that include middleware which abstracts the practice of database "sharding", that is, the ability to distribute a database over a grid of computers as NoSQL does.
The power of the RDBMS comes mostly from queryability via the Structured Query Language (SQL), from transactionality and adherence to the ACID principles (Atomicity, Consistency, Isolation, Durability), and from the powerful tools developed over time to manage it. A lesser benefit comes from the fact that the relational model eliminates repetitive storage of the same information in multiple places.
The benefits of NoSQL are high speed, the ability to scale laterally across a computing grid, and the lack of a schema to maintain. This makes NoSQL stores highly performant even against huge data sets. But they lack the benefits that you get from a traditional RDBMS: the query language to manipulate data isn't really there (yet), they can't be transactional across a computing grid, and they lack management tools such as MS SQL Server Management Studio.
NewSQL is attempting to take the best parts of both worlds, and I think it eventually will. Here is a great write-up of RDBMS vs. NoSQL vs. NewSQL on bananagunprogramming.com.
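As a rough sketch of what "sharding" means in practice, the snippet below routes each key to one of several shards by hashing it. Everything here is an assumption made for illustration (the `shard_for`, `put` and `get` helpers, the shard count, and in-memory dicts standing in for separate machines); real NewSQL systems hide this routing behind the SQL layer and typically use more elaborate schemes.

```python
import hashlib

NUM_SHARDS = 4
# Each dict stands in for a separate machine in the grid.
shards = {i: {} for i in range(NUM_SHARDS)}


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard deterministically using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def put(key, value):
    shards[shard_for(key, NUM_SHARDS)][key] = value


def get(key):
    # The same hash sends the read to the shard that holds the key.
    return shards[shard_for(key, NUM_SHARDS)].get(key)


for user_id in ("alice", "bob", "carol", "dave"):
    put(user_id, {"name": user_id})

print(get("carol"))
```

A plain modulo scheme like this reshuffles most keys when the shard count changes, which is one reason production systems tend to prefer consistent hashing or range-based partitioning.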
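The atomicity part of ACID can be shown with a small sketch using Python's built-in sqlite3. The `accounts` table and the `transfer` helper are hypothetical; the point is only that when one statement in a transaction fails, the earlier statements in that transaction are rolled back as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)]
)
conn.commit()


def transfer(conn, src, dst, amount):
    """Move money between accounts; either both updates apply or neither."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected an overdraft; the credit is undone too.
        return False


ok = transfer(conn, "alice", "bob", 30)
failed = transfer(conn, "alice", "bob", 500)
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(ok, failed, balances)
```

The second transfer credits bob first and then fails on alice's debit, yet bob's balance is unchanged afterwards, which is the guarantee that is hard to provide across a computing grid.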
We are in the process of a database redesign and have heard about all the different options: Hadoop, Cassandra, Oracle, etc. Is there a good article that compares each of the major DBs side by side on performance and features?
Comparison of relational database management systems
NoSQL
Is there any difference between these two kinds of database (RDBMS and column-oriented)? If yes, what is the difference? Thank you.
The question isn't really answerable because "RDBMS" and "column-oriented" refer to very different aspects of a DBMS and are not mutually exclusive.
An RDBMS is any DBMS that implements the relational model.
A column-oriented DBMS is any DBMS that uses columnar storage for its data. That could be an RDBMS or it could be something else.
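A small sketch may help show that "columnar" describes the storage layout rather than the data model. The same three hypothetical orders are held once row by row and once column by column; the data and names are made up, and real engines add compression and on-disk formats on top of this idea.

```python
# Row-oriented layout: all fields of one record are stored together.
rows = [
    {"order_id": 1, "customer": "alice", "amount": 30.0},
    {"order_id": 2, "customer": "bob", "amount": 12.5},
    {"order_id": 3, "customer": "alice", "amount": 7.5},
]

# Column-oriented layout: all values of one column are stored together.
columns = {name: [row[name] for row in rows] for name in rows[0]}

# Fetching one whole record is natural in the row layout.
second_order = rows[1]

# Aggregating one column only needs that column in the columnar layout,
# while the row layout has to walk through every full record.
total_from_rows = sum(row["amount"] for row in rows)
total_from_columns = sum(columns["amount"])

print(second_order, total_from_rows, total_from_columns)
```

Both layouts hold the same relation and give the same answers, which is why a column-oriented DBMS can still be an RDBMS.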
A column-oriented database is typically used for data warehouses and where you need to aggregate large amounts of data. It can be substantially different than a 'typical' transactional database.
Is this what you want to build (a data warehouse)?
When a column-oriented DBMS supports SQL, it replaces the SQL schema internally with a fully normalised version. Therefore, the performance considerations that usually apply to schema design in a traditional RDBMS no longer apply.