Cross-database queries: how to properly use cross-database features? - sql-server

I am investigating the possibility of splitting one DB into several. We decided to move some tables into another database, but we have queries that join on these tables. I found a few ways to achieve that:
Azure SQL Database elastic query
EXTERNAL DATA SOURCE
But I don't know what the difference between them is or which one to choose.
Thanks for any help!

Azure SQL Database elastic query and EXTERNAL DATA SOURCE are not two alternatives: elastic query is the feature, and an external data source (together with external tables) is how you set it up.
My suggestion is to avoid cross-database queries and to avoid splitting one database into several, because queries that involve external data sources won't perform as well as queries against a single database, no matter which strategy you choose to query the external tables.
If you still want to go ahead with the split, know that cross-database queries perform well when the remote tables are not big. When remote tables are big, this article shows how to perform joins remotely using table variables to improve performance. This other article shows how to push parameterized operations to the remote database to improve performance.
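As a rough sketch of what the elastic query setup looks like, assuming a remote database named OrdersDb that holds an Orders table and a local Customers table (all object names, columns, and the angle-bracket placeholders are hypothetical and need to be replaced with your own values):

```sql
-- Run in the database that issues the cross-database queries.
-- A master key is required before a database scoped credential can be created.
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<strong password>';

-- Login used to connect to the remote database (placeholder values).
CREATE DATABASE SCOPED CREDENTIAL RemoteDbCredential
WITH IDENTITY = '<remote login>', SECRET = '<remote password>';

-- The external data source points at the remote Azure SQL database.
CREATE EXTERNAL DATA SOURCE RemoteOrdersDb
WITH (
    TYPE = RDBMS,
    LOCATION = '<server name>.database.windows.net',
    DATABASE_NAME = 'OrdersDb',
    CREDENTIAL = RemoteDbCredential
);

-- The external table mirrors the schema of the remote table (hypothetical columns).
CREATE EXTERNAL TABLE dbo.Orders (
    OrderId    INT       NOT NULL,
    CustomerId INT       NOT NULL,
    OrderDate  DATETIME2 NOT NULL
)
WITH (DATA_SOURCE = RemoteOrdersDb);

-- The external table can now be joined like a local one.
SELECT c.CustomerName, o.OrderId, o.OrderDate
FROM dbo.Customers AS c
JOIN dbo.Orders AS o ON o.CustomerId = c.CustomerId
WHERE o.OrderDate >= '2024-01-01';

-- Pushing a parameterized statement to the remote database.
EXEC sp_execute_remote
    N'RemoteOrdersDb',
    N'SELECT COUNT(*) AS OrderCount FROM dbo.Orders WHERE CustomerId = @cid',
    N'@cid INT',
    @cid = 42;
```

Selective filters on the external table matter here, since whatever cannot be evaluated remotely has to be transferred to the local database before the join.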

If you are thinking of splitting your DB into multiple SQL Server databases on different hosts, you could use a linked server instead, which lets you join across SQL Server instances.
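A minimal sketch of the linked server route, assuming a remote instance reachable as remote-host.example.com with a database OrdersDb (the server alias, host, database, tables, and credentials are all hypothetical). Note that linked servers are a feature of SQL Server and Azure SQL Managed Instance; they are not available in a single Azure SQL Database.

```sql
-- Register the remote instance under a local alias.
EXEC sp_addlinkedserver
    @server     = N'REMOTE_SRV',
    @srvproduct = N'',
    @provider   = N'MSOLEDBSQL',
    @datasrc    = N'remote-host.example.com';

-- Map all local logins to one remote SQL login (placeholder credentials).
EXEC sp_addlinkedsrvlogin
    @rmtsrvname  = N'REMOTE_SRV',
    @useself     = N'False',
    @locallogin  = NULL,
    @rmtuser     = N'<remote login>',
    @rmtpassword = N'<remote password>';

-- Join a local table to a remote one using a four-part name.
SELECT c.CustomerName, o.OrderId
FROM dbo.Customers AS c
JOIN REMOTE_SRV.OrdersDb.dbo.Orders AS o
    ON o.CustomerId = c.CustomerId;
```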

Related

Very slow queries in MS Access with joined MS SQL table via ODBC

What is the best approach for using an Access front-end application with linked tables (via ODBC) from MS SQL Server?
The difficulty for me is that I have to use complex queries with many joins (and functions called from queries).
It is very slow because of the joins between the two databases (and some tables hold a lot of data; the 2 GB Access mdb limit is the reason for the move to MSSQL).
A pass-through query doesn't help because of the joined Access tables.
With OPENDATASOURCE('Microsoft.ACE.OLEDB.12.0'... it is still slow in SQL Server too. I tried an ODBC linked view with a WHERE clause from MSSQL, but it seems as slow as the full table.
Do I have to move all of the joined Access tables to the MSSQL DB and convert all queries to Pass-Through? Is there any other solution?
"I have to move all of joined Access tables to the MSSQL DB"
Yes, definitely.
"and convert all queries to Pass-Through?"
Not necessarily, only those that are still slow.
"Normal" INNER JOIN queries, using only linked tables from one server database, are handled by Access and the ODBC driver in a way that everything is processed on the server. They should be (more or less) as fast as when run on the server (or as Pass-Through query).
Only "complex" queries, especially involving multiple INNER and OUTER JOINs, won't work like that. You'll notice that they are still very slow when running on linked tables. These need to be changed to Pass-Through queries.
Edit: I just noticed "functions called from queries".
You can't call VBA functions from PT queries, and they will again kill performance when called from Access queries running on linked MSSQL tables (because they have to be processed locally).
You'll need to learn to create views in MSSQL, probably also user defined functions and/or stored procedures.
In the long run, you'll find that views are actually easier to manage than PT queries.
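To illustrate the idea, here is a sketch of a server-side view that replaces a complex Access query, plus a scalar function standing in for a VBA function. The tables, columns, and object names are hypothetical; the point is only that both the joins and the function run on the server, and Access links the view like a table.

```sql
-- Scalar function replacing a VBA helper that was called from an Access query.
CREATE FUNCTION dbo.ufn_FullName (@FirstName NVARCHAR(50), @LastName NVARCHAR(50))
RETURNS NVARCHAR(101)
AS
BEGIN
    RETURN LTRIM(RTRIM(ISNULL(@FirstName, N'') + N' ' + ISNULL(@LastName, N'')));
END;
GO

-- View that performs the multi-table join and aggregation on the server.
CREATE VIEW dbo.vw_OrderSummary
AS
SELECT
    o.OrderId,
    dbo.ufn_FullName(c.FirstName, c.LastName) AS CustomerName,
    SUM(d.Quantity * d.UnitPrice)             AS OrderTotal
FROM dbo.Orders AS o
INNER JOIN dbo.Customers   AS c ON c.CustomerId = o.CustomerId
LEFT  JOIN dbo.OrderDetails AS d ON d.OrderId    = o.OrderId
GROUP BY o.OrderId, c.FirstName, c.LastName;
GO
```

In Access, dbo.vw_OrderSummary can then be linked through ODBC and filtered with an ordinary WHERE clause.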

SQL Server tables connection

I have to join multiple tables that belong to one or more databases. Each query needs to join approximately 10-15 tables to generate data for analysis in SQL Server 2014.
I don't have access to the database diagram or architecture, and these reports are to be sent out weekly. I want to understand how to start writing these kinds of queries, from basic to advanced, how to identify the relationships between tables, and which advanced techniques I can learn or use, such as CTEs, RANK with PARTITION BY, subqueries, etc.
A rough flow diagram or outline of the approach would be really helpful.
It's very unlikely that owners of those source systems want to be directly queried every time someone runs a report. Since you already have access to SQL Server, I would suggest building a data warehouse with that.
You haven't provided a whole lot of information to go on, but SSIS packages could be created to connect to the source systems and load into your data warehouse. And furthermore, those packages can be scheduled through Agent.
As for modeling, again it is difficult with the lack of information, but generally a star schema works well for reporting: a fact table surrounded by dimension (or attribute) tables.
As for figuring out relationships without a diagram, this will have to be done through experimentation and by tying results back to existing reports to make sure your joins aren't dropping records or multiplying them.
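Two queries can help with that experimentation. The first lists the foreign keys that are actually declared in the current database (it needs permission to view the metadata, and it finds nothing for relationships that were never declared). The second compares row counts before and after a join to spot dropped or duplicated rows; dbo.Orders and dbo.Customers are hypothetical table names.

```sql
-- Declared foreign key relationships in the current database.
SELECT
    fk.name                                      AS foreign_key_name,
    OBJECT_SCHEMA_NAME(fk.parent_object_id)      AS child_schema,
    OBJECT_NAME(fk.parent_object_id)             AS child_table,
    pc.name                                      AS child_column,
    OBJECT_SCHEMA_NAME(fk.referenced_object_id)  AS parent_schema,
    OBJECT_NAME(fk.referenced_object_id)         AS parent_table,
    rc.name                                      AS parent_column
FROM sys.foreign_keys AS fk
JOIN sys.foreign_key_columns AS fkc
    ON fkc.constraint_object_id = fk.object_id
JOIN sys.columns AS pc
    ON pc.object_id = fkc.parent_object_id
   AND pc.column_id = fkc.parent_column_id
JOIN sys.columns AS rc
    ON rc.object_id = fkc.referenced_object_id
   AND rc.column_id = fkc.referenced_column_id
ORDER BY child_table, foreign_key_name, fkc.constraint_column_id;

-- Sanity check for one join: the counts should match if every order
-- has exactly one matching customer.
SELECT
    (SELECT COUNT(*) FROM dbo.Orders) AS rows_before_join,
    (SELECT COUNT(*)
     FROM dbo.Orders AS o
     JOIN dbo.Customers AS c ON c.CustomerId = o.CustomerId) AS rows_after_join;
```

A lower count after the join means rows were dropped (no match); a higher count means the join key is not unique on the other side.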
Good luck.

MS SQL Server: central database and foreign keys

I am currently developing the first of many projects, each of which will use its own database as well as data from a central database.
Example:
the database "accountancy" with all accountancy package specific tables.
the database "personelladministration" with its specific tables
But we also use data which is general and will be used in all projects like "countries", "cities", ...
So we have put these tables in a separate database called "general"
We come from a DB2 environment where we could create foreign keys between databases.
However, we are switching to MS SQL Server, where it is not possible to create foreign keys between databases.
I have seen that a workaround would be to use triggers, but I'm not convinced that is a clean solution.
Are we doing something wrong in our setup? It seems right to me to put tables with general data in a separate database instead of having a "countries" table in every database, which seems difficult to maintain and inefficient.
What could be a good approach to overcome this?
I would say that countries is not a terrible table to duplicate in multiple databases. I would rather duplicate static data like that than use more elaborate techniques. There is one physical schema per database in SQL Server, and the schema cannot be shared. That is why people use replication or triggers for shared data.
I came across this problem a while back. We have one database for authentication; however, those users have to be shared across multiple applications, some of which have their own database.
Here is my question on this topic.
We resorted to replication and a custom Authentication/Registration service agent to keep the data up to date.
Using views, as Sourav_Agasti suggested in his answer, would be the most straightforward approach for static data. You can create views that join data from other databases, including databases on linked servers. Note that indexed views cannot do this, because they can only reference tables in their own database.
Create a loopback linked server and then create a view (if required, in each database) that accesses the table in the "central database" through this linked server. There will be a minor performance impact, but the simplicity more than compensates for it.
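A sketch of the view approach using the database names from the question ("general" and "accountancy"); the column names are hypothetical. This version assumes both databases live on the same instance, where a three-part name is enough; with a linked server the only change would be a four-part name that starts with the linked server's name.

```sql
USE accountancy;
GO

-- Local view over the shared table in the central database.
CREATE VIEW dbo.countries
AS
SELECT country_id, country_name
FROM [general].dbo.countries;
GO

-- Application queries join against the view as if it were a local table.
SELECT i.invoice_id, c.country_name
FROM dbo.invoices AS i
JOIN dbo.countries AS c ON c.country_id = i.country_id;
```

The view gives each project database a stable local name for the shared data, but it does not enforce referential integrity; that still needs triggers or periodic checks.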

Best way to perform distributed SQL query and joins, calling from .Net code

Here's my scenario:
I have to query two PeopleSoft Databases on different servers (both are SQL Server 2000) and do a join of the data. My application is a .Net application (BizTalk).
I'm wondering what the best option is with regards to performance?
1. Use standard select queries to get the data and do the join in memory (e.g. with LINQ).
2. Generate complex dynamic queries using a linked server, e.g.
       select blah
       from Server1.HRDB.dbo.MyTable1
       left join Server2.FinanceDb.dbo.MyTable2
3. Use standard select queries to get the data into an intermediate / staging SQL Server database and do my queries / joins on this database instead.
4. Should I consider using SSIS? (Are there features here that might be better than doing it in memory, e.g. with LINQ?)
I wish I could use stored procedures on the source database, but the owners of the PeopleSoft database refuse it.
The main constraints are that the source database is old (SQL Server 2000) and that its performance is paramount. Whatever queries I run on this server must not block other users. Hence, the DBAs are adamant about no stored procedures. They also believe that queries involving linked servers will take priority over other queries being run against the database.
Any feedback would be greatly appreciated.
Thanks!
Update: additional background information on the project
We are primarily integrating PeopleSoft databases (HR and Finance) into another product. Some integrations are simple, like AccountCode and Department. Others are more complex, like personal data, job, and leave accrual. Some are real-time, others are scheduled, and others are 'batch' (e.g. at payroll runs).
Regardless, we have to get source data out of PeopleSoft database -- and my hope had been to let the (source) database do the 'heavy' lifting by executing SQL Queries. I don't really want BizTalk, or SSIS, or C# LINQ to be the ones doing the transformations/filtering.
Definitely open to suggestions.
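For illustration only, here is one way the staging idea and the linked server idea from the list above could be combined, so that each source server runs only a simple filtered SELECT and the join happens elsewhere. It assumes a separate staging SQL Server on which Server1 and Server2 are defined as linked servers; the column names and the filter are hypothetical, and whether the NOLOCK hint (dirty reads in exchange for less blocking) is acceptable is a decision for the DBAs.

```sql
-- Runs on the staging server, not on the PeopleSoft servers.
-- The text inside OPENQUERY is executed entirely on the remote server.
SELECT hr.EmplId, hr.DeptId
INTO #hr_rows
FROM OPENQUERY(Server1,
    'SELECT EmplId, DeptId
     FROM HRDB.dbo.MyTable1 WITH (NOLOCK)
     WHERE DeptId = ''FIN''') AS hr;

SELECT fin.EmplId, fin.AccountCode
INTO #fin_rows
FROM OPENQUERY(Server2,
    'SELECT EmplId, AccountCode
     FROM FinanceDb.dbo.MyTable2 WITH (NOLOCK)') AS fin;

-- The join itself uses only local temp tables on the staging server.
SELECT h.EmplId, h.DeptId, f.AccountCode
FROM #hr_rows AS h
LEFT JOIN #fin_rows AS f ON f.EmplId = h.EmplId;
```

Compared with a direct four-part-name join between Server1 and Server2, this keeps the join work off both source servers, at the cost of copying the filtered rows first.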

SQL Server Replication (cross-database queries & constraints)

We want to replicate data from one database to several others (on another server). Would it make sense to replicate these tables to a shared database on the other server and have our cross-database queries reference the shared database... or would it make more sense to replicate out to each individual database on the other server? Would cross database joins pose a performance hit? Would cross-database constraints work as expected?
Replicating once to a shared database would help replication performance... I'm trying to evaluate whether or not any performance hit as a result of cross-database queries or constraints would be worth it.
Edit: It looks like cross-database constraints are not possible in SQL Server? If this is true, then we would have to replicate to each database.
Cross-database queries are somewhat slower than queries within the same DB. Foreign keys work within the same DB only. The usual approach is to create a separate schema in each DB (named ETL, for example) and replicate the tables into that schema. This approach is frequently used when replicating dimension tables between data marts.
When using the cross-database approach, you can use triggers to implement constraints, but this may be slow and complicated.
Depending on your application, you may implement foreign keys as "logical only" and run periodic "look for orphans" queries to deal with referential integrity.
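Both options can be sketched as follows, assuming an application database with a dbo.Orders table and a shared database SharedDb on the same instance holding dbo.Countries (all names are hypothetical).

```sql
-- Option 1: a trigger on the child table that rejects unknown keys.
-- It covers inserts and updates on Orders only; deletes or key changes in
-- SharedDb.dbo.Countries would need their own trigger on that side.
CREATE TRIGGER dbo.trg_Orders_CheckCountry
ON dbo.Orders
AFTER INSERT, UPDATE
AS
BEGIN
    SET NOCOUNT ON;

    IF EXISTS (
        SELECT 1
        FROM inserted AS i
        WHERE i.CountryId IS NOT NULL
          AND NOT EXISTS (
              SELECT 1
              FROM SharedDb.dbo.Countries AS c
              WHERE c.CountryId = i.CountryId
          )
    )
    BEGIN
        RAISERROR('CountryId not found in SharedDb.dbo.Countries.', 16, 1);
        ROLLBACK TRANSACTION;
    END
END;
GO

-- Option 2: a periodic "look for orphans" query for a logical-only foreign key.
SELECT o.OrderId, o.CountryId
FROM dbo.Orders AS o
LEFT JOIN SharedDb.dbo.Countries AS c
    ON c.CountryId = o.CountryId
WHERE o.CountryId IS NOT NULL
  AND c.CountryId IS NULL;
```

The orphan query can be scheduled as a job; any rows it returns are references that no longer have a parent.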

Resources