Can a Superset dataset use multiple data sources?

I'm evaluating different BI solutions and I have a specific requirement.
Our setup has multiple data sources with the same schema, e.g. Customer1DB, Customer2DB, etc.
Can multiple databases be ingested into the same Superset dataset?

No, Superset does not support that. There are a few discussions of this on the project's GitHub; here's one.
Two workarounds for combining multiple DBs are:
Do the joins in a query engine suited to this operation, such as Trino or Drill, then use that as the single data source in Superset.
Someone in the thread linked above says they got this working in Superset by linking the database servers.
Superset does support joining tables from the same database: they are combined into a new virtual dataset via Superset's SQL Lab. It can also connect to multiple databases and use them in different charts. It just can't join across them for a single chart.
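To make the first workaround concrete, here is a minimal sketch of combining same-schema databases into one relation before the BI tool sees them. It is not Superset or Trino code: two attached in-memory SQLite databases stand in for Customer1DB and Customer2DB, and the `orders` table, its columns, and the `union_all_sql` helper are assumptions made up for illustration.

```python
import sqlite3

# Hypothetical stand-ins for Customer1DB and Customer2DB: two separate
# in-memory SQLite databases attached to a single connection.
conn = sqlite3.connect(":memory:")
for name in ("customer1", "customer2"):
    conn.execute(f"ATTACH DATABASE ':memory:' AS {name}")
    conn.execute(f"CREATE TABLE {name}.orders (id INTEGER, amount REAL)")

conn.execute("INSERT INTO customer1.orders VALUES (1, 10.0), (2, 5.0)")
conn.execute("INSERT INTO customer2.orders VALUES (1, 7.5)")


def union_all_sql(schemas, table):
    """Build one SELECT per database, tagged with the database it came from."""
    parts = [f"SELECT '{s}' AS source_db, * FROM {s}.{table}" for s in schemas]
    return "\nUNION ALL\n".join(parts)


combined = union_all_sql(["customer1", "customer2"], "orders")

# The combined relation behaves like a single table for downstream queries.
rows = conn.execute(
    f"SELECT source_db, SUM(amount) FROM ({combined}) "
    "GROUP BY source_db ORDER BY source_db"
).fetchall()
print(rows)
```

A federated engine would let you write a query of the same shape across real servers, and that one combined relation is what you would then register as the single data source.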

Related

Is it possible to have a DBMS with different data on different database nodes?

We're looking for the best setup for distributing large databases. We would like to spread the databases across multiple machines, so that each part of the globe has the tables it uses stored locally, plus a common set of tables that is synchronized between the database nodes. We want to extend a large platform, but it is very hard to implement logic that chooses which connection string to use. That's why we started researching this, and it seems a DBMS may offer this functionality.
Can you suggest some material/documentation/tutorials we could use? We're using SQL Server.
Thanks in advance!

Talend Open Studio for MDM + PostgreSQL + synchronization of two databases

I have two databases, A and B (a replica of A). A live web application writes data into A, and I want the entries made in A to be reflected in B automatically.
My sole purpose is synchronizing the two databases. I looked into Talend for this and came across Talend MDM. I have installed MDM and read up on it, but I can't tell whether it does database synchronization or not. Since there are other Talend products such as ESB and Data Integration, which one of them is actually meant for syncing?
Please advise.
IMHO, if you are looking for data replication between two databases that have the same structure, then Talend is not what you are looking for.
Talend is an ETL tool (Extract, Transform, and Load). It would be applicable if your B database had a different structure than A. For that particular use case, you would use Talend to define some processing rules:
How do I extract data from A (Extract)
How do I transform A's data into B's data (Transform)
How do I store B's data (Load)
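The three steps above can be sketched in a few lines. This is not Talend; it is a hand-rolled illustration using two in-memory SQLite databases, and the `person` and `contact` tables and their columns are invented to show the case where B's structure differs from A's.

```python
import sqlite3

# Hypothetical source (A) and target (B) with different structures.
a = sqlite3.connect(":memory:")
b = sqlite3.connect(":memory:")
a.execute("CREATE TABLE person (first_name TEXT, last_name TEXT)")
a.executemany(
    "INSERT INTO person VALUES (?, ?)",
    [("Ada", "Lovelace"), ("Alan", "Turing")],
)
b.execute("CREATE TABLE contact (full_name TEXT)")


def extract(conn):
    """Extract: read the rows out of A."""
    return conn.execute("SELECT first_name, last_name FROM person").fetchall()


def transform(rows):
    """Transform: reshape A's two name columns into B's single column."""
    return [(f"{first} {last}",) for first, last in rows]


def load(conn, rows):
    """Load: write the reshaped rows into B."""
    conn.executemany("INSERT INTO contact VALUES (?)", rows)
    conn.commit()


load(b, transform(extract(a)))
loaded = [r[0] for r in b.execute("SELECT full_name FROM contact ORDER BY full_name")]
print(loaded)
```

When A and B have the same structure, the transform step has nothing to do, which is why plain replication is the better fit for that case.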
As mentioned by @jayadevan above, I would definitely look at the built-in replication offered by your database.

How to build database reports using multiple remote databases

Does anyone have experience building database reports for a system that is made up of many separate but identical databases? It doesn't matter which database; I just want design ideas.
I cannot "combine" all databases into one. They must be separate.
But the structure is identical across all databases...
I need to build a web interface that will allow a user to get a "global" report that queries all databases and builds one combined report.
Do you have any comments on what the model would look like, or anything you think I need to beware of?
Thanks.
I don't have first-hand experience with cross-database reports; what I know comes from a product my company sells that can create reports from multiple databases. From your description, I believe you need something of the "combine tables" kind. In that case I recommend detecting the tables used in the query, unifying them in a single temporary intermediary database (for example Access, SQL Server CE, or SQLite), and then running the query against that temporary database or table.
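A minimal sketch of that staging approach, assuming SQLite as the temporary database. The remote databases are faked here with in-memory SQLite connections, and the `sales` table, the site names, and the `make_source` helper are hypothetical.

```python
import sqlite3


def make_source(rows):
    """Stand-in for one of the identical remote databases."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    return conn


sources = {
    "site_a": make_source([("north", 100.0), ("south", 50.0)]),
    "site_b": make_source([("north", 25.0)]),
}

# Temporary staging database: same columns plus the originating source.
staging = sqlite3.connect(":memory:")
staging.execute("CREATE TABLE sales (source TEXT, region TEXT, amount REAL)")

for name, conn in sources.items():
    rows = conn.execute("SELECT region, amount FROM sales").fetchall()
    staging.executemany(
        "INSERT INTO sales VALUES (?, ?, ?)",
        [(name, region, amount) for region, amount in rows],
    )

# The "global" report is now an ordinary query against one database.
report = staging.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
print(report)
```

Keeping the `source` column lets the same staging table serve both the combined report and per-database breakdowns.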
If your databases are Microsoft SQL Server, then using SQL Server Reporting Services seems like a good solution. The software for the report generation / display is bundled along with the database software.
It gives you a web interface, where you can configure 'data sources' from any number of remote databases, and combine data from these sources into reports. It is user friendly and you can do all the report design / configuration through the web interface without having to write any code.
Some references:
Building report using SQL Server stored procedure
http://blog.hoegaerden.be/2009/11/10/reporting-on-data-from-stored-procedures-part-1/

Multiple database connectivity

We have 4 products, and each supports the 4 data sources below.
Oracle
SQL Server 2005
DB2
Datopia
We are now building an administration product that will interact with all the products and hence their databases. Some requirements call for accessing tables from different data sources in a single query. We initially thought of using Oracle Transparent Gateway to create DB links and then access tables in the different data sources, but this requires Oracle to be installed for one of the products. We cannot impose that restriction in our environment (for example, of the 4 products, 2 may have SQL Server installations and the other two DB2 installations). What is the best way to connect to all the data sources without any such restriction? One more thing: we are using Java to connect to these databases. Thanks in advance.
You don't say what kind of framework your client software uses. But if it uses Java, .NET, or Perl, you will be able to use that framework's data access modules to connect to the various table servers. You can connect to all of them from a single client process easily enough.
Your DB access won't be perfectly transparent. You'll need some aspects of your program to be Oracle- or SQL Server-specific, for example. On the other hand, if you do this right, it won't be hard to add MySQL and PostgreSQL support if your customers need it.
You'll have a fairly steep QA burden -- you'll need to test with at least one and two instances of each of the four table servers connected simultaneously to make sure everything works.
But this kind of product usually has high value, so you should be able to justify the QA effort.

Are heterogeneous database systems in practice?

I was probing around a bit in the realm of databases and hit the notion of having heterogeneous databases. I googled and found this - link text
My question is what kind of scenario would put this into practice and is it really useful? Is it just another thing which was thought about but not implemented or in case it was implemented, then it got restricted to a very niche area?
cheers
I would say yes, very much so. One implementation I am familiar with is integrating MAS90 with an LOB production system. The data is duplicated in both but accessed and used in different ways.
I've worked on a heterogeneous system before. It's a commercial system to manage study abroad programs for large universities, and they had installations on Oracle, MySQL, and SQL Server. I was an outside consultant handling a very specific conversion project, though, so I didn't get to see many of the issues involved in making it work well everywhere.
I do remember that the single biggest hurdle I had to deal with was Oracle's lack of a simple autoincrement-style column and having to set up separate sequences instead. There were a number of datatype mismatches as well, but there was a pretty good system in place to just map those.
Note that even here, each customer only had one kind of database. We didn't have to worry about replicating data itself between db types (aside from a few common lookup tables). Just structure.
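To illustrate the autoincrement and datatype differences mentioned above, here is a rough sketch of generating per-dialect DDL from one logical definition. It is not the mapping system that product used; the type map, the `create_table_ddl` helper, and the `program` table are assumptions for illustration, and the Oracle branch follows the separate-sequence approach described in the answer.

```python
# Hypothetical mapping from logical column types to dialect-specific types.
TYPE_MAP = {
    "oracle": {"int": "NUMBER(10)", "text": "VARCHAR2(255)"},
    "mysql": {"int": "INT", "text": "VARCHAR(255)"},
    "sqlserver": {"int": "INT", "text": "NVARCHAR(255)"},
}

# How each dialect declares an auto-generated integer key.
ID_COLUMN = {
    "oracle": "id NUMBER(10) PRIMARY KEY",  # values come from a sequence
    "mysql": "id INT AUTO_INCREMENT PRIMARY KEY",
    "sqlserver": "id INT IDENTITY(1,1) PRIMARY KEY",
}


def create_table_ddl(dialect, table, columns):
    """Return the DDL statements needed to create `table` on `dialect`."""
    types = TYPE_MAP[dialect]
    cols = [ID_COLUMN[dialect]]
    cols += [f"{name} {types[logical]}" for name, logical in columns]
    statements = [f"CREATE TABLE {table} ({', '.join(cols)})"]
    if dialect == "oracle":
        # No autoincrement column here, so set up a separate sequence.
        statements.append(f"CREATE SEQUENCE {table}_seq")
    return statements


oracle_ddl = create_table_ddl("oracle", "program", [("title", "text")])
mysql_ddl = create_table_ddl("mysql", "program", [("title", "text")])
for stmt in oracle_ddl + mysql_ddl:
    print(stmt)
```

On the Oracle side, inserts would then have to draw key values from the sequence, which is the extra setup the answer refers to.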
Different departments in your company might use different databases. I pull data from and push data to the following:
SQL Server
Oracle
Sybase IQ
Access
MySQL
FoxPro
Flat files
Excel files
The SQL Server database is the repository of all the data, but it pulls from many different databases to populate that data, and data is then pushed out to different databases for departmental use.

Resources