I have manufacturing company data in relational form: around 10 tables, each with more than 300 columns. We have a set of operations (such as joins, unions, and ranking) to perform sequentially to reach a final aggregated table that will be used for analysis. During the joins, however, the result exceeds 1000 columns, which is not supported by our relational database. What is the best database to use for such scenarios?
Do such databases support all SQL operations?
We are currently using SQL Server.
I have a requirement to get data from two different databases, i.e. Cosmos DB and SQL.
How can I join the tables from both and get the data?
Below is the data that needs to be fetched. The common column in both is DossierGloabalId, which can be used to join the tables from both databases.
Name--SQL
TaxableYear--SQL
Period--COSMOS
DossierType--SQL
VATType--COSMOS
LastUpdated--SQL
There is no way to do a join between SQL Server and Cosmos DB, as they are two completely different database engines.
Further: While Cosmos DB's SQL API does have a SQL-based query language, it's not a relational database, and is not compatible with SQL Server.
You'll need to do separate sets of queries: one against Cosmos DB, and one against SQL Server.
If you're asking to perform a relational join on data stored in Azure SQL with data stored in CosmosDB, there is no out-of-the-box support for doing so. You are going to have to query records in Azure SQL, query the corresponding records in CosmosDB, and then join them together in your own application code. There are a bunch of approaches to how you go about this in your own application code, and it's highly dependent on your own application.
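As a rough illustration of joining in application code, the sketch below indexes the Cosmos DB documents by the shared key and then looks each SQL row up in that index (a simple hash join). It assumes both result sets have already been fetched as lists of dictionaries; the function name `join_on_key` and the sample values are hypothetical, and only the column names come from the question.

```python
def join_on_key(sql_rows, cosmos_docs, key="DossierGloabalId"):
    """Left-join SQL rows to Cosmos DB documents on a shared key."""
    # Index the Cosmos DB documents once so each lookup is O(1).
    cosmos_by_key = {doc[key]: doc for doc in cosmos_docs}
    joined = []
    for row in sql_rows:
        # A missing document yields None for the Cosmos-side columns.
        doc = cosmos_by_key.get(row[key], {})
        joined.append({
            key: row[key],
            "Name": row["Name"],
            "TaxableYear": row["TaxableYear"],
            "Period": doc.get("Period"),
            "DossierType": row["DossierType"],
            "VATType": doc.get("VATType"),
            "LastUpdated": row["LastUpdated"],
        })
    return joined


# Hypothetical sample data standing in for the two query results.
sql_rows = [
    {"DossierGloabalId": "d1", "Name": "Acme", "TaxableYear": 2020,
     "DossierType": "Annual", "LastUpdated": "2021-01-15"},
    {"DossierGloabalId": "d2", "Name": "Globex", "TaxableYear": 2021,
     "DossierType": "Annual", "LastUpdated": "2021-03-02"},
]
cosmos_docs = [
    {"DossierGloabalId": "d1", "Period": "Q1", "VATType": "Standard"},
]

result = join_on_key(sql_rows, cosmos_docs)
```

Whether to keep unmatched SQL rows (a left join, as here) or drop them (an inner join) depends on the application.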
Here's my scenario:
I have to query two PeopleSoft Databases on different servers (both are SQL Server 2000) and do a join of the data. My application is a .Net application (BizTalk).
I'm wondering what the best option is with regards to performance?
1. Use standard select queries to get the data and do the join in memory (e.g. with LINQ).
2. Generate complex dynamic queries using a LINKED Server, e.g.
   select blah
   from Server1.HRDB.dbo.MyTable1
   left join Server2.FinanceDb.dbo.MyTable2
3. Use standard select queries to get the data into an intermediate / staging SQL Server database and do my queries / joins on this database instead.
4. Should I consider using SSIS? (Are there features here that might be better than doing it in memory, e.g. with LINQ?)
I wish I could use stored procedures on the source database, but the owners of the PeopleSoft database refuse it
The main constraints we have are that the source database is old (SQL Server 2000) and that performance of the source database is paramount. Whatever queries I run on this server must not block the other users. Hence, the DBAs are adamant about no stored procedures. They also believe that queries involving Linked Servers will trump (i.e. take higher priority over) other queries being run against the database.
Any feedback would be greatly appreciated.
Thanks!
Update: additional background information on the project
We are primarily integrating PeopleSoft databases (HR and Finance) into another product. Some of the data is simple, like AccountCode and Department. Other data is more complex, like personal data, job, and leave accrual. Some integrations are real-time, others are scheduled, and others are 'batch' (e.g. at payroll runs).
Regardless, we have to get source data out of PeopleSoft database -- and my hope had been to let the (source) database do the 'heavy' lifting by executing SQL Queries. I don't really want BizTalk, or SSIS, or C# LINQ to be the ones doing the transformations/filtering.
Definitely open to suggestions.
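For reference, the staging-database option could look roughly like the sketch below: each source is read with a plain select, the rows are copied into a staging database, and the join runs there instead of on the source servers. This is only an illustration under stated assumptions: an in-memory SQLite database stands in for the staging SQL Server, and the table names (`hr`, `finance`), column names, and sample rows are hypothetical.

```python
import sqlite3


def stage_and_join(hr_rows, finance_rows):
    """Copy rows from both sources into a staging database and join there."""
    # SQLite in memory is a stand-in for a real staging SQL Server database.
    staging = sqlite3.connect(":memory:")
    staging.execute("CREATE TABLE hr (emplid TEXT PRIMARY KEY, name TEXT)")
    staging.execute("CREATE TABLE finance (emplid TEXT, account_code TEXT)")
    staging.executemany("INSERT INTO hr VALUES (?, ?)", hr_rows)
    staging.executemany("INSERT INTO finance VALUES (?, ?)", finance_rows)
    # The join runs entirely on the staging database, not on the sources.
    rows = staging.execute(
        "SELECT hr.emplid, hr.name, finance.account_code "
        "FROM hr LEFT JOIN finance ON finance.emplid = hr.emplid "
        "ORDER BY hr.emplid"
    ).fetchall()
    staging.close()
    return rows


# Hypothetical rows, as if returned by simple selects on each source server.
hr_rows = [("E1", "Ann"), ("E2", "Bob")]
finance_rows = [("E1", "4000")]

joined = stage_and_join(hr_rows, finance_rows)
```

With this shape, the source servers only ever see simple selects, and the join cost lands on the staging database.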
I have a question about data warehousing and column-oriented databases. In my project, the company uses a warehouse solution in Visual Studio and SQL Server, and they have trouble with performance when running complex queries on large amounts of data. I want to try replacing the database with a columnar database. I know that you can "transform" a row-oriented database into a more column-based one, or use an open source database such as Vertica or Sybase IQ. I'm just wondering how it would fit in the warehouse. Do you have to have a star join schema in a warehouse, or can you use the columnar approach instead? I realize this is kind of a stupid question, but I'm just trying to understand it all before I start to explore the different databases and solutions.
I know that SQL Server 2012 has a column store, but I would like to try the other open source databases as well.
Thanks in advance!
Do you have to have a star join schema in a warehouse or can you use the columnar approach instead?
The star join schema consists of the table definitions of your data warehouse. The star schema, and similar schema, trade query performance for query flexibility. Usually, query flexibility is more important than query performance in a data warehouse.
Based on the Wikipedia article you linked to in your comments, a column oriented database engine stores the actual database bytes in column order, rather than the traditional row order of relational databases.
As the article says, this can improve disk access performance.
The star schema is how you define tables. A column oriented database engine is concerned with how the database information is written to disk. The two concepts have nothing to do with one another, except that they both apply to a data warehouse.
Keep your present data warehouse schema, and see if a column oriented database engine will improve query performance.
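To make the distinction concrete, here is a toy sketch of the same logical table held in row order and in column order. It is an in-memory illustration only, not how any particular engine lays out its files; the table, its column names, and the helper `to_columns` are hypothetical.

```python
# The same logical table, first in row order (one record per entry).
rows = [
    {"order_id": 1, "region": "EU", "amount": 10.0},
    {"order_id": 2, "region": "US", "amount": 25.0},
    {"order_id": 3, "region": "EU", "amount": 5.0},
]


def to_columns(rows):
    """Regroup row-ordered records into one list of values per column."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)

# Row order: every record is touched even though only "amount" is needed.
total_from_rows = sum(row["amount"] for row in rows)

# Column order: only the "amount" values are read.
total_from_columns = sum(columns["amount"])
```

The schema (which columns exist and how tables relate) is identical in both layouts; only the physical arrangement of the values differs.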
I'm going to use a single table to aggregate historical data about our (very big) virtual infrastructure. The table will be composed of 15 to 30 fields, and I estimate 500 to 1000 records a day.
Why a single table? A couple of reasons:
Data is extracted to csv using powershell scripts. Then bulk load on a single table is very easy and fast.
I will use the table to connect excel and report through pivot tables. Then a single table is perfect (otherwise I should create views).
Now my question:
If I'm planning to build cubes upon this table in the future, is the "single-table" choice a bad solution?
Do cubes rely on relational databases, or can they easily be built upon single-table databases?
Thanks for any suggestion
Can't tell you specifically about SQL Server Analysis Services, but for OLAP you typically use denormalized and aggregated data. That means fewer tables than in a normal relational scenario. And as your data volume is not really big (365k rows/year - even small for OLAP), I don't see any problem using a single table for your data.
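As a rough sketch of why a single denormalized table is enough for this kind of analysis, the snippet below rolls one flat list of records up along different dimensions, which is essentially what a pivot table or a simple cube does. The field names (`day`, `cluster`, `vm_count`), the sample values, and the helper `roll_up` are hypothetical; the row-count estimate uses the upper bound of 1000 records a day from the question.

```python
from collections import defaultdict

# Hypothetical records from the single flat table.
records = [
    {"day": "2024-01-01", "cluster": "A", "vm_count": 120},
    {"day": "2024-01-01", "cluster": "B", "vm_count": 80},
    {"day": "2024-01-02", "cluster": "A", "vm_count": 125},
]


def roll_up(records, dimensions, measure):
    """Sum a measure for every combination of the given dimension values."""
    totals = defaultdict(int)
    for record in records:
        key = tuple(record[d] for d in dimensions)
        totals[key] += record[measure]
    return dict(totals)


by_cluster = roll_up(records, ["cluster"], "vm_count")
by_day = roll_up(records, ["day"], "vm_count")

# Upper-bound volume estimate: 1000 records a day for a year.
rows_per_year = 1000 * 365
```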
Can we get the benefits of partitioning a SQL Server 2010 table when we use Entity Framework as the data layer?
The table will receive 10,000 records per day, and it will be partitioned by creation date (e.g. older than 30 days vs. newer).
I'm not very skilled in SQL Server, so perhaps I'm wrong, but I believe that table partitioning should be transparent to queries (if we are talking about an automatic partition function defined on the table). That means common queries should still work and even perform better if partitioning is configured correctly. So with a database-first design, EF should not have any problem with this, because it still works with a single logical table. If you mean manual partitioning by creating a new table each month, then it is a big problem with EF, and you will need stored procedures to access those tables.