When users generate reports in TM1 Web, how can report performance be improved?
What are the various ways of doing so?
Mainly by suppressing zeros and optimizing the subsets you use on rows/columns.
So we're thinking about using cubes in our organization.
Situation AS IS:
DWH (Azure MS SQL); query language: SQL
Microsoft Column Storage (not real cubes); query language: DAX (there is MDX support, but it looks poorly implemented and inefficient)
Tableau (BI system, reports); can use SQL and MDX
Known problems:
When we use MDX there is an aggregation problem by date (we have to include the year, month, date hierarchy in the query); there is no such problem with DAX.
Running-total calculation in Microsoft Column Storage is inefficient.
How we want to solve the problem right now:
Use Microsoft Column Storage and materialize the running total, but use this kind of "cube" only in the few reports for the people who really need it
Materialize the running total in the DWH and have all Tableau reports use it (see the sketch after this list)
Keep data in the DWH at daily granularity (e.g. for a record that changed on 1 November, 5 November and 15 November, we previously had 3 records in the DWH and will now have 15). We need this to get data as of any date really fast (basically we are implementing our own cube this way)
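A minimal sketch of what materializing the running total in the DWH might look like, assuming a hypothetical fact table dwh.Sales(CustomerId, SaleDate, Amount) rather than the real schema:

-- Hypothetical schema; table and column names are placeholders.
-- Rebuild a materialized running-total table that Tableau reports can read
-- directly, so no report has to compute the cumulative sum at query time.
IF OBJECT_ID('dwh.SalesRunningTotal', 'U') IS NOT NULL
    DROP TABLE dwh.SalesRunningTotal;

SELECT
    CustomerId,
    SaleDate,
    Amount,
    -- cumulative sum per customer, ordered by date
    SUM(Amount) OVER (
        PARTITION BY CustomerId
        ORDER BY SaleDate
        ROWS UNBOUNDED PRECEDING
    ) AS RunningTotal
INTO dwh.SalesRunningTotal
FROM dwh.Sales;

This is the kind of field that would have to be refreshed on every DWH load, which is exactly the extra maintenance listed under the cons below.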
Pros:
No one will need to learn DAX or MDX in depth
We won't have to refactor anything
Cons:
DWH loads (updates) will take longer than they do now
The DWH will grow larger (a daily row for each record)
We will have to maintain the running-total fields manually
Known alternatives:
Microsoft Power BI - can use DAX and MDX really efficiently
Microsoft Analysis Services cubes (real cubes) - as far as we can tell, MDX is efficient here, unlike in Microsoft Column Storage
Questions:
First: if possible, I'd really like your impressions of the technologies you have used, to understand what causes pain when developing and maintaining such a solution, and why.
Second: any criticism of our current approach would be really appreciated - why is it bad?
Third: are cubes dead? I mean, Google doesn't offer its own cubes; maybe the technology itself is a dead end?
Last: if you have any advice on what we should use, that would be great.
I will try to answer step by step based on my experience; the question is far too broad for a single technology or person.
First: if possible, I'd really like your impressions of the
technologies you have used, to understand what causes pain
when developing and maintaining such a solution, and why.
Warehousing, cubes, reporting and querying are moving quickly toward distributed technologies that can scale horizontally on relatively cheap hardware, scale up or down on demand, and do so quickly. The size of data is also ever increasing with growing internet bandwidth, globalization, social networking and other factors. Hadoop and the cloud initially filled the gap for distributed technology that can grow horizontally and scale up or down easily.
Having a SQL Server with high compute and lots of RAM for large in-memory data, MDX and cubes is usually vertical scaling: it is costly and cannot be scaled back down as easily as a horizontally distributed setup, even if the SQL Server runs in the cloud.
With those advantages come the complexities of developing a big-data solution, the learning curve and the maintenance, which are again a big challenge for new adopters who are not yet familiar with it.
Second: any criticism of our current approach would be really
appreciated - why is it bad?
There is no silver-bullet architecture that solves every issue you face without introducing some issues of its own. Your approach is viable and has its own pros and cons given your current organisation structure; I am assuming your team is familiar with SQL Server, MDX, cubes and column storage and has done a feasibility analysis. The only issue I see is that as the size of the data increases, SQL Server demands more computing power and RAM, which can mostly only be met by upgrading the VM/machine. Vertical scaling is costly and always hits a limit at some point. Failover/DR on such infrastructure is also more costly.
Third: are cubes dead? I mean, Google doesn't offer its own cubes;
maybe the technology itself is a dead end?
No technology is dead if you can find support for it; even assembly, C, C++ and COBOL are still going strong for old projects and for cases where they fit better than the alternatives.
Last: if you have any advice on what we should use, that would be
great.
Do POCs (proofs of concept) for at least 3-4 types of solutions/architectures; you will be the best judge of what suits you best in terms of cost, skills and timeframe.
If you are open to cloud-based solutions, I would suggest also exploring options like a data lake with Azure Data Factory as one of the proofs of concept, to see whether it can meet your requirements.
I also came across an out-of-the-box solution from Microsoft quite recently that is worth looking at: Azure Synapse Analytics (https://azure.microsoft.com/en-in/services/synapse-analytics/). It has built-in support for data warehousing, querying, AI/BI, streaming, data lake exploration, security, scaling, Spark and various other sources, integrates with Power BI, and provides insights/visual display.
Can anyone point me at a performance benchmark comparing SSAS with querying your own rollup tables in SQL?
What I'd like to understand is if the benefit from SSAS is entirely maintenance/convenience (managing your own rollup tables may become unmaintainable with a large number of dimensions) or if there is some magic in the MOLAP storage itself that makes it faster than equivalent relational SQL queries with equivalent pre-built aggregates.
It's as much about the ease of slicing and dicing as it is about storing the aggregations. Sure, you can create your own rollup tables and the subsequent queries that handle 10 different dimensions, but I would rather use Excel/SSMS and just drag and drop. I can also point my users at the cube and say 'have fun'. I don't need to facilitate their every need; they can self-serve.
As for the benchmark, that is dependent on your data warehouse schema, indexes, calculations, etc. Basically, you would need to do the analysis yourself to see if it is better for your situation.
Sorry, I don't have a link for that. But rollup tables become complicated very quickly as the type and number of dimensions increase.
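To make that concrete, here is a rough T-SQL sketch of a hand-maintained rollup table; the fact table FactSales and its dimension columns are made-up names. With GROUP BY CUBE, every additional dimension doubles the number of grouping combinations you pre-compute and then have to query correctly.

-- Hypothetical fact table: FactSales(ProductKey, RegionKey, CustomerKey, Amount).
-- Pre-aggregate every combination of the three dimensions (2^3 = 8 grouping sets);
-- a fourth dimension would make it 16, a fifth 32, and so on.
SELECT
    ProductKey,
    RegionKey,
    CustomerKey,
    SUM(Amount) AS TotalAmount,
    -- identifies which combination of dimensions this row aggregates over
    GROUPING_ID(ProductKey, RegionKey, CustomerKey) AS GroupingLevel
INTO RollupSales
FROM FactSales
GROUP BY CUBE (ProductKey, RegionKey, CustomerKey);

A MOLAP cube stores and navigates these combinations (plus the hierarchy levels) for you, which is where much of the convenience comes from.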
This document is the performance guide for MOLAP; the diagram on page 78 shows how the aggregations are built when you have hierarchies.
Can anyone give me general guidelines on how to approach multi-dimensional reporting where, at the very least, I'd like to support cubes generated from Oracle and SQL Server databases? I can imagine GemFire or Coherence being in the mix too.
I'm a little unsure where to start. If I work entirely in the Microsoft ecosystem I'm fine with SQL Server Analysis Services, Reporting Services and MDX. Introduce the other data sources and I'm lost.
Thanks
The following vendors can all do what you need:
SAP Business Objects
IBM Cognos
Microstrategy
Actuate
Oracle and Microsoft will both work great with only ONE of your datasources.
Try searching under the keyword "Business Intelligence" for Gartner Group papers and other useful whitepapers from sources like InformationWeek. There are MANY vendors in this space; I encourage you to build a very deep-slice prototype, because they all look great in a demo but might not work for you.
Also, the CUBE you mention (OLAP) is truly a performance booster. But you can do "multi-dimensional reporting" without Cubes. Maybe slower, but less intimidating and definitely less expensive.
Regarding price, there are a bunch of free OLAP servers available; depending on your needs, any of them may be fine. Just look for the ones that follow the XMLA/MDX standards.
Among them you have the classic Mondrian (ROLAP) and the newcomer icCube (MOLAP).
I'm building an application with an underlying database that looks like a textbook example of OLAP: a large amount of data comes in every night and then gets rolled up by time and other dimensions and hierarchies with a bunch of stored procedures I wrote. I then build my application on top of the rolled-up tables, which lets users compare and retrieve data across different dimensions and levels.
At this point, I wonder if there's any compelling reason I should switch to a commercial BI product instead of building my own data cubes. I played with MSSQL BI and MDX; the learning curve seems very steep and I'm not seeing any major performance gain. So that makes me ask myself again - what do I really gain by using a BI product? I'd appreciate it if someone could help answer that question. Thanks.
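For concreteness, here is a minimal sketch of the kind of nightly time rollup described above; the table names (FactDaily, AggMonthly) and columns are assumptions, not the asker's actual schema:

-- Hypothetical nightly rollup: aggregate day-level facts into a month-level table
-- that the application queries instead of the raw daily data.
CREATE PROCEDURE dbo.RollupMonthly
AS
BEGIN
    TRUNCATE TABLE dbo.AggMonthly;

    INSERT INTO dbo.AggMonthly (YearNum, MonthNum, ProductKey, TotalAmount, RowCnt)
    SELECT
        YEAR(FactDate),
        MONTH(FactDate),
        ProductKey,
        SUM(Amount),
        COUNT(*)
    FROM dbo.FactDaily
    GROUP BY YEAR(FactDate), MONTH(FactDate), ProductKey;
END;

A cube engine maintains the equivalent aggregations itself; the trade-off discussed in the answer below is whether that, plus MDX and the surrounding tooling, is worth the learning curve.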
MDX is a new language and learning it certainly takes time and energy. Once you learn MDX you can apply it to any MDX-compliant server, and you'll be able to solve new problems quickly.
I see several advantages:
You get the power of MDX for complex calculations (e.g. calculated members, many-to-many relationships, multiple hierarchies...)
You can assume it will scale better than your local implementation (this is arguable and depends on how good you or your team are).
Certainly one of the strong points is the range of available reporting tools. You can connect Excel and other standard reporting tools to your data (as an example, check online here to see what is possible with icCube).
We wrote a gentle introduction to MDX to help smooth the learning curve (here).
How much database performance overhead is involved with using C# and LINQ compared to custom optimized queries loaded with mostly low-level C, both with a SQL Server 2008 backend?
I'm specifically thinking here of a case where you have a fairly data-intensive program and will be doing a data refresh or update at least once per screen and will have 50-100 simultaneous users.
In my experience the overhead is minimal, provided that the person writing the queries knows what he/she is doing and takes the usual precautions to ensure the generated queries are optimal, that the necessary indexes are in place, etc. In other words, the database impact should be the same; there is a minimal but usually negligible overhead on the app side.
That said... there is one exception: if a single query computes multiple aggregates, the L2S provider translates it into a large query with one sub-query per aggregate. For a large table this can have a significant I/O impact, as the database I/O cost of the query grows by orders of magnitude with each new aggregate in the query (a sketch of the difference is below).
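As an illustration of the shape described (not the exact SQL that LINQ to SQL generates, and using a made-up Orders table):

-- Shape of the generated query: one scalar sub-query per aggregate,
-- so the table is effectively scanned once for each aggregate.
SELECT
    (SELECT COUNT(*)   FROM Orders WHERE CustomerId = 42) AS OrderCount,
    (SELECT SUM(Total) FROM Orders WHERE CustomerId = 42) AS TotalSpent,
    (SELECT MAX(Total) FROM Orders WHERE CustomerId = 42) AS LargestOrder;

-- Hand-written (or stored-procedure) equivalent: all aggregates in a single pass.
SELECT COUNT(*) AS OrderCount, SUM(Total) AS TotalSpent, MAX(Total) AS LargestOrder
FROM Orders
WHERE CustomerId = 42;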
The workaround for that is of course to move the aggregates to a stored procedure or view. Matt Warren has some sample code for an alternative query provider that translates that kind of query in a more efficient way.
Resources:
https://connect.microsoft.com/VisualStudio/feedback/ViewFeedback.aspx?FeedbackID=334211
http://blogs.msdn.com/mattwar/archive/2008/07/08/linq-building-an-iqueryable-provider-part-x.aspx
Thanks Stu. The bottom line seems to be that LINQ to SQL probably doesn't have a significant database performance overhead with the newer versions, if you are able to use a compiled select, and its slower update operations are likely to still be faster than hand-written code unless you have a REALLY sharp expert doing most of the coding.