A user was browsing our sales cube in Power BI and noticed that the Fact Finance measure group was showing incorrect values.
The issue was an incorrect relationship; I added the relationship between FactFinance and DimAccount. Since processing the whole cube takes over an hour, I'd prefer to process just this measure group, check the values in the SSDT browser, and publish the cube if everything looks good.
How can I process just one measure group so that I can browse the cube (i.e. this measure group) in SSDT and make sure everything is correct?
The reason I ask is that the SSDT browser connects to the actual cube, so I'm not sure how I can just process and deploy one measure group.
You can connect to the cube and refresh it (technically this is called "processing") with a variety of tools. The most common choice for a manual re-process is SQL Server Management Studio, where you can right-click an individual measure group and choose Process. See Tools and Approaches for Processing (Analysis Services).
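If you'd rather script the refresh than click through the SSMS Process dialog each time, the dialog's Script button will generate the underlying XMLA Process command for you, and that command can simply be run from an XMLA query window in SSMS. As a further, hedged option, the same XMLA can be pushed from T-SQL through a linked server. The sketch below is an assumption-laden illustration only: it presumes a linked server named SSAS_SRV exists against the Analysis Services instance (MSOLAP provider, RPC Out enabled), and the DatabaseID/CubeID/MeasureGroupID values are placeholders (take the real IDs from the XMLA that SSMS scripts for you, since object IDs don't always match display names).

    -- Sketch only: [SSAS_SRV] is an assumed linked server to the SSAS instance
    -- (MSOLAP provider, RPC Out enabled). IDs below are placeholders.
    DECLARE @xmla nvarchar(max) = N'
    <Process xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
      <Object>
        <DatabaseID>Sales DW</DatabaseID>
        <CubeID>Sales</CubeID>
        <MeasureGroupID>Fact Finance</MeasureGroupID>
      </Object>
      <Type>ProcessFull</Type>
    </Process>';

    EXEC (@xmla) AT [SSAS_SRV];  -- passes the XMLA Process command through to SSAS

If the linked-server route isn't available in your environment, pasting the scripted XMLA into an XMLA query window in SSMS achieves the same thing.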
Related
I have an ETL solution and a cube solution, which I process one after the other in a SQL Agent job.
In the ETL solution I run one package, which in turn runs all the other packages one by one.
The whole process takes 10 hours.
For ETL:
How can I find out how long each package takes to run within that one parent package, other than opening the solution and recording the times manually?
For the cube:
Here the dimensions process quickly. What should I measure in order to find out which part takes so long? Maybe the measures? How do I track the processing time of a particular measure?
Maybe SQL Server Profiler would help? If so, is there a good article describing which events or metrics I should pay attention to?
To gather statistics about SSIS execution times, you can enable logging:
For the package deployment model, you'll have to turn on logging in each package: go to SSIS > Logging and, in the dialog, choose the OnPreExecute and OnPostExecute events. Use the SQL Server log provider, which writes to a system table called dbo.sysssislog. You'll then need to join the pre and post events on the execution id (see the sketch below).
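As a rough illustration, a query along these lines pairs the two events to get per-task durations. It's a sketch only: it assumes the default dbo.sysssislog table, and it will need refinement if the same task runs more than once per execution (for example inside a loop).

    -- Per-task durations from the SSIS SQL Server log provider table (dbo.sysssislog).
    -- Assumes OnPreExecute / OnPostExecute logging is enabled in each package.
    SELECT
        pre.source                                    AS task_name,
        pre.starttime                                 AS started_at,
        post.endtime                                  AS finished_at,
        DATEDIFF(SECOND, pre.starttime, post.endtime) AS duration_seconds
    FROM dbo.sysssislog AS pre
    JOIN dbo.sysssislog AS post
        ON  post.executionid = pre.executionid   -- same package execution
        AND post.sourceid    = pre.sourceid      -- same task / container
    WHERE pre.event  = 'OnPreExecute'
      AND post.event = 'OnPostExecute'
    ORDER BY duration_seconds DESC;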
For the project deployment model, logging is probably already on; it can be configured in SSMS under Integration Services Catalogs > SSISDB (right-click and choose Properties). Once you've executed the package, you can see the results in the standard reports: right-click the master package and choose Reports > Standard Reports > All Executions.
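If you'd rather query the timings than use the built-in reports, the SSISDB catalog views expose the same data. A sketch, assuming SQL Server 2012 or later and that the most recent execution is the one of interest:

    -- Task-level timings for one SSISDB execution (project deployment model).
    -- Here the latest execution is picked; substitute any execution_id from catalog.executions.
    DECLARE @execution_id bigint =
        (SELECT MAX(execution_id) FROM SSISDB.catalog.executions);

    SELECT
        e.package_name,
        e.executable_name,
        es.execution_duration / 1000.0 AS duration_seconds,  -- execution_duration is in milliseconds
        es.execution_result                                   -- 0 = success
    FROM SSISDB.catalog.executable_statistics AS es
    JOIN SSISDB.catalog.executables           AS e
        ON  e.executable_id = es.executable_id
        AND e.execution_id  = es.execution_id
    WHERE es.execution_id = @execution_id
    ORDER BY es.execution_duration DESC;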
Lots more details on SSIS logging here: https://learn.microsoft.com/en-us/sql/integration-services/performance/integration-services-ssis-logging
For SSAS, I always tested this manually. Connect in SSMS, right-click each measure group and do a Process Full (this assumes the dimensions have just been freshly processed). The measure groups are more likely to be the cause of an issue because of the amount of data involved.
Once you know which measure group is slow, you can look at tuning the source query, if it has any complexity to it, or at partitioning the measure group and loading it incrementally, with a full process scheduled periodically.
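On the partitioning point, each partition in a measure group is bound to its own slice of the fact table, so only the current slice needs frequent reprocessing. A sketch of what the partition source queries might look like (table, column and partition names here are hypothetical):

    -- Hypothetical partition source queries: the slices must not overlap.
    -- Historical partitions can be processed rarely; only the current one is reprocessed often.

    -- Partition "FactSales 2023"
    SELECT f.*
    FROM dbo.FactSales AS f
    WHERE f.DateKey BETWEEN 20230101 AND 20231231;

    -- Partition "FactSales Current"
    SELECT f.*
    FROM dbo.FactSales AS f
    WHERE f.DateKey >= 20240101;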
Could someone please explain why the EXACT same call to run a stored procedure takes about 25 seconds to complete when run from my local SQL Server Management Studio but only takes 5 seconds (this is the time I'd expect it to take) when run from a query window in the "Manage" facility, inside the Azure portal? It's completely consistent no matter how many times I do it!
It's also running slow from our cloud application, which makes me think there's some kind of difference between "internal" and "external" access to the DB server.
Thanks.
The solution was to create a new "Business" edition instance and move the database there. Performance returned to normal without any changes to the database or the associated application. Investigation showed that the S0/S1/S2 instances were all SLOWER than the "Business" instances (which are about to be retired by Microsoft). We will be consulting Microsoft this week; it's probable that we will need to pay for their Premium tier to sustain the performance we get from the old Web/Business editions. This still does not account for WHY the original database slowed down, nor why performance differed when accessed from the Azure portal versus SSMS or the application.
I am accessing SSAS OLAP cubes on a SQL Server 2005 instance using Excel 2007 pivot tables, and refreshing some of the tables takes more than 10 minutes. My coworkers seem to think this is a sad reality, but I am wondering if there are alternatives I should be looking into.
Some thoughts I have had:
Obviously if I could upgrade the server hardware I would, but I am merely an analyst with no such powers, so I don't think hardware improvements are a great option. The same is true of moving to a newer SQL server, which I imagine would also speed up the process.
Would upgrading to a newer version of Excel speed up the process?
I came across this: http://olappivottableextend.codeplex.com/, which gives me access to the MDX behind the pivot tables, and that MDX is apparently comically inefficient (it sounds like the VBA macro recorder to me). Would rewriting the MDX be an option? I know a bit of MDX, and the queries it generates for the pivot tables don't seem that complicated.
Would running MDX outside of Excel be an option? I can write the queries, but I imagine it would not be as simple to use as the pivot table is.
OLAP cubes just seem like a great solution in a lot of ways, and these are massive pivot tables processing quite a bit of information, but if there is a reasonable way to speed up the whole process I would love to know more about it.
Thanks for your thoughts SO.
There are many ways to access SSAS cubes, but it depends on what you are trying to achieve.
Excel tends to be used by the business because:
It's already installed
It is a familiar business tool
Easy to use
Requires no developer intervention
Other alternatives to Excel for accessing the cube include:
SQL Server Analysis Services via Management Studio (the cube browser, or MDX directly)
SQL Server Reporting Services
Bespoke development (such as C#) utilising an AdomdConnection
SQL Server via OPENQUERY over a linked server (see the sketch after this list)
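For the OPENQUERY option above, a minimal sketch: it assumes a linked server named SSAS_SRV has been created against the Analysis Services instance using the MSOLAP provider, and the cube, measure and dimension names are made up for illustration.

    -- MDX pushed through a linked server to SSAS; the result comes back as a relational rowset.
    SELECT *
    FROM OPENQUERY(SSAS_SRV,
        'SELECT [Measures].[Sales Amount] ON COLUMNS,
                NON EMPTY [Date].[Calendar Year].MEMBERS ON ROWS
         FROM   [Sales]');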
If you have been using Excel to access the cube so far, you will probably decide that none of the other tools quite cover your needs and you will end up sticking with it.
Assuming that Excel is the right tool for you, you should then move on to why it is slow. The list of possibilities (not including hardware / software) is long, but here are some:
There may be contention external to your project on network / database / disk resources.
The volume of data may be accumulating over time.
The cube may not be partitioned.
The questions you ask of it may be getting more complex.
The cube aggregations may not be suited to the queries you run.
The cube structure may be inefficient, for example because it is supporting many-to-many relationships.
User / query volume may have increased
To try to address the problem I would
Assess the data that you require within the cube (and maybe limit the cube to a rolling x month window)
Log your queries and apply Usage Based Optimisation (see the query-log sketch after this list)
Monitor cube usage via SQL Server Profiler
Review the structure of your cube design
Attempt similar queries with other tools (both across the network and local to the cube) to establish where the issue lies
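On the query-logging point in the list above: when the SSAS query log is enabled (the QueryLogConnectionString / CreateQueryLogTable server properties), SSAS writes sampled query records to a relational table, by default named OlapQueryLog. A hedged sketch for spotting the heaviest logged queries (Duration is in milliseconds; column names are those of the default table):

    -- Heaviest sampled queries from the SSAS query log (default table dbo.OlapQueryLog).
    -- Assumes the query log has been enabled in the SSAS server properties.
    SELECT TOP (20)
        MSOLAP_Database,
        MSOLAP_User,
        Dataset,      -- the attribute sets hit by the query; this also feeds usage-based optimisation
        StartTime,
        Duration      -- milliseconds
    FROM dbo.OlapQueryLog
    ORDER BY Duration DESC;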
These two resources may help if you establish that Excel is the weak point: Excel, Cube Formulas, Analysis Services, Performance, Network Latency, and Connection Strings (the same article also appears on page 57 of the SQLCAT's Guide to BI and Analytics).
I have a cube that I have built that draws data from multiple servers. After the cube is deployed to the SSAS server, does it interact with the SQL Server instances that contain the data on which the cube is based? The reason I ask is that I have potentially a lot of users, and some of the data is on one of our production servers, which we don't want accessed while the cube is being queried.
Thanks,
Ethan
A typical SSAS cube copies all the data available to it (as per the tables/views you pull into the DSV) to its own storage; you can validate this by going to the storage path defined in the SSAS server options and looking at the folder sizes. When you query the cube, it uses this copied data.
Having said that, there are exceptions:
If you have ROLAP dimensions, queries can go through to the underlying data:
http://technet.microsoft.com/en-us/library/ms174915.aspx
If your cube is set up for proactive caching, then it could query the underlying databases itself in order to stay up-to-date:
http://msdn.microsoft.com/en-us/library/ms174769.aspx
Those are the only two I'm familiar with.
Do bear in mind that deployment will generally require processing afterwards, unless you're restoring from a backup you've processed elsewhere. Also bear in mind that at some point you'll probably want to add new data to the cube, which you say comes from the production databases you don't want to interrupt.
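If you want to verify empirically whether cube queries are touching a particular source server, one hedged approach is to watch that server's sessions while a few cube queries run. This sketch assumes you know which login the SSAS service account uses against the source server (the login below is a placeholder):

    -- Run on the source SQL Server while the cube is being queried.
    -- For a fully processed MOLAP cube you would expect no query activity from the
    -- SSAS service account outside processing windows.
    SELECT
        s.session_id,
        s.login_name,
        s.host_name,
        s.program_name,
        s.status,
        s.last_request_start_time
    FROM sys.dm_exec_sessions AS s
    WHERE s.login_name = N'DOMAIN\ssas_service'   -- placeholder: your SSAS service account
    ORDER BY s.last_request_start_time DESC;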
I am developing a data warehouse and a data cube for a client using Microsoft technologies (SSIS, SSAS and SSRS). I have almost finished the data warehouse. I have created a cube in SSAS and have already done the initial setup of the dimensions and facts. So far this has all been in a development environment. We expect to deploy the solution to a production server in about two weeks.
My question is, as this is my first enterprise-level cube, I don't know whether it is production-ready yet. Can anyone tell me whether there is anything specific I should do to the cube before deploying it? Also, I have kept the system-suggested names for the cube dimensions and measures; do I need to change them before it goes to the users?
Any help is deeply appreciated.
I'm not sure what you mean by "enterprise level": are there specific security/audit requirements, availability levels, backup schedules or support procedures that you need to comply with?
And if your users have accepted the dimension names in the test environment and you have developed reports and even code using them, then why would you want to change them in production?
Assuming that you've already deployed cubes and packages successfully to different environments, then the things to check should be exactly the same every time: accounts and permissions, package configurations, scheduled jobs for batch processing etc.