Can anyone give me general guidelines on how to approach multidimensional reporting where I'd like to support, at the very least, cubes generated from Oracle and SQL Server databases? I can imagine GemFire or Coherence being in the mix too.
I'm a little unsure where to start. If I work entirely in the Microsoft ecosystem I'm fine with SQL Server Analysis services, reporting services, MDX. Introduce the other data sources and I'm lost.
Thanks
The following vendors can all do what you need:
SAP Business Objects
IBM Cognos
Microstrategy
Actuate
Oracle and Microsoft will both work great with only ONE of your datasources.
Try looking under keywords "Business Intelligence" for Gartner group papers and other useful whitepapers from sources like InformationWeek. There are MANY vendors in this space. I encourage you to do a very deep slice prototype, because they all look great in a demo but might not work for you.
Also, the CUBE you mention (OLAP) is truly a performance booster. But you can do "multi-dimensional reporting" without Cubes. Maybe slower, but less intimidating and definitely less expensive.
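To make the "without cubes" point concrete, here is a minimal sketch that aggregates a measure over every combination of dimensions on demand instead of precomputing a cube. The fact rows, the dimension names and the function names are all made up for illustration; against a real database the same idea is usually a set of GROUP BY queries.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical fact rows: (region, product, year, sales amount).
FACTS = [
    ("East", "Widget", 2009, 100.0),
    ("East", "Gadget", 2009, 50.0),
    ("West", "Widget", 2009, 70.0),
    ("West", "Widget", 2010, 30.0),
]
DIMENSIONS = ("region", "product", "year")


def aggregate(facts, group_by):
    """Sum the measure for each combination of the chosen dimensions."""
    positions = [DIMENSIONS.index(name) for name in group_by]
    totals = defaultdict(float)
    for row in facts:
        key = tuple(row[i] for i in positions)
        totals[key] += row[-1]  # the measure is the last field
    return dict(totals)


def all_groupings(facts):
    """Aggregate over every subset of dimensions, computed at query time."""
    result = {}
    for size in range(len(DIMENSIONS) + 1):
        for group_by in combinations(DIMENSIONS, size):
            result[group_by] = aggregate(facts, group_by)
    return result


if __name__ == "__main__":
    for group_by, totals in all_groupings(FACTS).items():
        print(group_by, totals)
```

Every grouping is recomputed from the raw rows each time, which is exactly the "maybe slower" trade-off: a cube stores these aggregates ahead of time.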
As for price, there are a number of free OLAP servers available, and depending on your needs any of them may be fine. Just look for the ones that follow the XMLA/MDX standards.
Among them are the classic Mondrian (ROLAP) and a newcomer, icCube (MOLAP).
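The reason to insist on XMLA/MDX is that the client side then looks the same whichever server sits behind it. As a rough sketch, this builds the SOAP envelope for an XMLA Execute request carrying an MDX statement. The function name, the catalog name and the MDX query are placeholders I made up, and actually posting the envelope (endpoint URL, authentication) differs per server, so it is left out.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
XMLA_NS = "urn:schemas-microsoft-com:xml-analysis"


def build_execute_request(mdx, catalog):
    """Return an XMLA Execute SOAP envelope (as a string) wrapping an MDX statement."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    execute = ET.SubElement(body, f"{{{XMLA_NS}}}Execute")

    # The MDX text goes inside Command/Statement.
    command = ET.SubElement(execute, f"{{{XMLA_NS}}}Command")
    ET.SubElement(command, f"{{{XMLA_NS}}}Statement").text = mdx

    # Properties select the catalog (database) and the result format.
    properties = ET.SubElement(execute, f"{{{XMLA_NS}}}Properties")
    prop_list = ET.SubElement(properties, f"{{{XMLA_NS}}}PropertyList")
    ET.SubElement(prop_list, f"{{{XMLA_NS}}}Catalog").text = catalog
    ET.SubElement(prop_list, f"{{{XMLA_NS}}}Format").text = "Multidimensional"

    return ET.tostring(envelope, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical cube and measure names, for illustration only.
    print(build_execute_request(
        "SELECT {[Measures].[Sales]} ON COLUMNS FROM [SalesCube]",
        "SalesCatalog",
    ))
```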
I am using SSAS (SQL Server 2008 R2) to develop a classification model for a data set where 80% of values are missing. Ensemble classifiers based on trees are supposedly the best solution (Random Forest for example).
Is there any nice way of adding an ensemble classifier into SSAS? For example an AdaBoost or any other Bagging or Boosting classifier?
I know SSAS provides plug-in functionality, but I have not come across anyone doing any ensemble solutions... Not to mention anything that you can just download and start using.
If not, is there any efficient method to connect various classifiers in SSAS? I hope I am missing something obvious that is there.
I am not too familiar with the topic you are asking about, but technologically in SSAS you can register assemblies and use them in MDX. Therefore, it could be possible for you to code this in .NET and use the logic in SSAS. Please have a look at the following MSDN page for more information in case this sounds like something worth exploring.
http://technet.microsoft.com/en-us/library/ms175398.aspx
Additionally, have a look at the Data Mining SSAS provides out-of-the box as some classification objectives could be achievable by using the included algorithms:
http://technet.microsoft.com/en-us/library/ms175595.aspx
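On the question of connecting several classifiers: whatever hosts the logic, the combining step itself can be as simple as a majority vote over the individual predictions. The sketch below is plain Python, not the SSAS plug-in API and not a trained Random Forest or AdaBoost. The three rule-based classifiers and their thresholds are invented, and they treat a missing value as 0 purely for illustration.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the most common label among the base classifiers' predictions."""
    if not predictions:
        raise ValueError("need at least one prediction")
    return Counter(predictions).most_common(1)[0][0]


def ensemble_predict(classifiers, row):
    """Ask every base classifier for a label and combine them by majority vote."""
    return majority_vote([classify(row) for classify in classifiers])


# Hypothetical base classifiers; each one looks at a single attribute and
# falls back to 0 when that attribute is missing (None).
def by_age(row):
    return "high" if (row.get("age") or 0) > 40 else "low"


def by_income(row):
    return "high" if (row.get("income") or 0) > 50000 else "low"


def by_visits(row):
    return "high" if (row.get("visits") or 0) > 3 else "low"


if __name__ == "__main__":
    row = {"age": 55, "income": None, "visits": 5}
    print(ensemble_predict([by_age, by_income, by_visits], row))
```

Using an odd number of base classifiers with two labels avoids ties. In SSAS terms, each callable would stand in for a prediction from one mining model.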
I'm using Squeak 4.1. How does it handle database connections? Does it provide something similar to ODBC/ADO in .NET or the J2EE stuff?
Which packages deal with database operations?
Can anybody give me some hints?
A few links that might be of use to you:
Squeak Smalltalk and Databases
SqueakDBX
Persistence in Seaside (Also see Chapter 8 in the Seaside tutorial)
Magma
Databases and Persistence
Squeak PostgreSQL
If you want something that's truly an analog to ODBC/JDBC or ADO.NET, then the closest analog would be SqueakDBX, a generic, FFI-based connector to a wide variety of databases. While it uses FFI, the developers have gone to great lengths to ensure that long operations do not block the VM. While I can't honestly say I've used it in production, reviews have been positive, it supports a very wide variety of databases (MySQL, Microsoft SQL Server, PostgreSQL, SQLite3, and more), and it's being actively developed, so it's probably a good bet.
Historically, the downside of SqueakDBX is that you didn't get GLORP, the major ORM used in the Smalltalk world these days. The good news is that's no longer true: SqueakDBX now has GlorpDBX, which brings GLORP to SqueakDBX. Drivers are currently available for PostgreSQL, MS SQL, and MySQL, among others. If you need to connect to a traditional database, this is probably your best bet.
Benjamin: We have already started to modify Glorp. We call it GlorpDBX, and Glorp now works with a generic database driver, including a GlorpSqueakDBX driver. Right now we have GlorpDBX working with SqueakDBX for Postgres, MSSQL and Oracle.
Cheers
You might not need to. If your Smalltalk code runs in GemStone, there is no need to worry about database connections and queries until you have a lot of data or a lot of transactions.
And if the number of objects is very small, SandstoneDB is much easier to use. On the Persistence in Seaside page you can find the links.
I am doing a project in the university and it includes a MySQL database. I have a design for the database in terms of a list of tables and their respective fields.
In what form should I present this design? Just the list of tables and content? In an ERD? How do you present your designs?
To clarify - whatever you answer, I expect not only a description of how you present your design, but also which tools you use to create the diagrams/lists/tables etc.
ERD is the only way to go. As they say, a picture is worth a thousand words.
But don't try to put the whole database on one diagram. It will, in all but the most trivial cases, be overwhelming to your audience to try to digest the entire database design in one go. Instead, break the diagrams into subject areas depicting only the most relevant tables in each diagram. For example, a point-of-sale system might have separate diagrams for Inventory, Sales, Accounting, Customer Management, Security, Auditing, and Reporting. Some tables will show up in more than one subject area -- this is to be expected.
As far as tooling goes, nothing beats ERwin, but it is really expensive and only available for Windows. Visio is ubiquitous in a corporate environment, but is only available on Windows and is not exactly cheap either. Macs offer some really nice diagramming tools; most of them are not free.
Dia is a decent, free, and cross-platform diagramming tool. It is a bit quirky, though, and I have not had much success making the diagrams look as nice as I want them to look.
For MySQL, I have played with fabFORCE DBDesigner and it is not bad, but I did find its support for multiple subject areas to be a bit lacking at the time -- perhaps they've improved it since. But it is free and works on Windows and Linux.
For the actual presentation, I create images from these diagramming tools and pull them into presentation software (PowerPoint, Keynote, or OpenOffice Impress). These presentations can be exported to PDF and distributed to the audience; they won't need anything more than a PDF viewer to review the information later.
Let's look at this from your professor's perspective. If I were him/her:
I would require an ERD. Without it, I cannot see one of the most fundamental aspects of a database design: how the tables are related.
I would also expect some basic use cases/ requirements. What problems are you trying to solve with this database design?
I would want to see what indexes are in place, especially on the foreign key columns. I would want to see expected row counts in all tables to determine if indexes are even required.
I would want to see column data types to determine if they meet the requirements. I would want to see what columns accept NULL values, since that often can cause problems if you're not careful.
If I were using SQL Server, I would probably create a diagram in SSMS to display a somewhat basic ERD. Visio can be used as well. I might use Visio to create my use cases, or perhaps Microsoft Word.
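The index and nullability checks mentioned above can also be pulled straight from the schema and pasted into the write-up. This is a minimal sketch using SQLite so that it runs standalone; the two tables, the index and the helper names are invented for illustration. In MySQL you would get the same facts from SHOW CREATE TABLE or information_schema.

```python
import sqlite3

# Hypothetical two-table schema with one foreign key and one index on it.
SCHEMA = """
CREATE TABLE customer (
    id    INTEGER PRIMARY KEY,
    name  TEXT NOT NULL,
    email TEXT
);
CREATE TABLE purchase (
    id          INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(id),
    amount      REAL NOT NULL,
    note        TEXT
);
CREATE INDEX idx_purchase_customer ON purchase(customer_id);
"""


def build_demo_db():
    """Create an in-memory database holding the demo schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def nullable_columns(conn, table):
    """List non-key columns that accept NULL."""
    # table_info rows: (cid, name, type, notnull, dflt_value, pk)
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return [name for _, name, _, notnull, _, pk in rows if not notnull and not pk]


def unindexed_foreign_keys(conn, table):
    """List foreign key columns that are not the leading column of any index."""
    # foreign_key_list rows: (id, seq, table, from, to, on_update, on_delete, match)
    fk_rows = conn.execute(f"PRAGMA foreign_key_list({table})").fetchall()
    fk_columns = [row[3] for row in fk_rows]

    leading = set()
    for index in conn.execute(f"PRAGMA index_list({table})").fetchall():
        # index_list rows start with (seq, name, ...);
        # index_info rows: (seqno, cid, name)
        for seqno, _, column in conn.execute(f"PRAGMA index_info({index[1]})").fetchall():
            if seqno == 0:
                leading.add(column)
    return [column for column in fk_columns if column not in leading]


if __name__ == "__main__":
    conn = build_demo_db()
    for table in ("customer", "purchase"):
        print(table, "nullable:", nullable_columns(conn, table))
        print(table, "unindexed FKs:", unindexed_foreign_keys(conn, table))
```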
MySQL Workbench will make you pretty graphics for presentation, among many other sophisticated features.
Depends on the audience. ERD certainly isn't the only answer and may not be the best. You should choose a medium that your audience will understand.
Don't forget to discuss design aspects that don't fit in an ERD:
1) How inheritance/aggregation relationships from your analytical model are implemented in your database.
2) How you are going to support hierarchies of your objects in the relational database (if you have any).
3) Relationships that are in your analytical model but are not supported by the relational design.
4) The ETL process, change tracking, schema change tracking, and resource-based security.
5) Storage partitioning and maintenance aspects (one goal being to optimize backup time).
6) In-production testing (test island data) and easy cloning of the database for a test environment.
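For point 2, one common way to hold a hierarchy in a relational database is an adjacency list (each row points at its parent), walked with a recursive query. A minimal sketch follows, using SQLite so that it runs standalone. The category table, its rows and the function names are invented, and recursive common table expressions are an assumption about your server: older MySQL versions don't have them (they arrived in MySQL 8.0).

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a small, made-up category tree."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE category (
            id        INTEGER PRIMARY KEY,
            parent_id INTEGER REFERENCES category(id),
            name      TEXT NOT NULL
        );
        INSERT INTO category VALUES
            (1, NULL, 'All'),
            (2, 1, 'Hardware'),
            (3, 1, 'Software'),
            (4, 2, 'Laptops');
    """)
    return conn


def descendants(conn, root_id):
    """Return (name, depth) for the root and everything below it."""
    sql = """
        WITH RECURSIVE subtree(id, name, depth) AS (
            SELECT id, name, 0 FROM category WHERE id = ?
            UNION ALL
            SELECT c.id, c.name, s.depth + 1
            FROM category AS c
            JOIN subtree AS s ON c.parent_id = s.id
        )
        SELECT name, depth FROM subtree ORDER BY depth, id
    """
    return conn.execute(sql, (root_id,)).fetchall()


if __name__ == "__main__":
    for name, depth in descendants(build_demo_db(), 1):
        print("  " * depth + name)
```

An ERD shows only the self-referencing foreign key, so how the tree is queried is worth documenting separately.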
We are looking at acquiring Data Mining software to primarily run predictive analysis processes.
How does the SQL Server Data Mining solution compare to other solutions like SPSS from IBM?
Since SQL Server DM is included in the SQL Server Enterprise license, what would be the justification for spending an extra couple of 100K to buy separate software just to do DM?
I would look into open source options as well, including R, RapidMiner, and Weka.
I would recommend checking out the Rexer survey, as it shows popularity and satisfaction measures for a variety of data mining products:
http://www.kdnuggets.com/2010/03/f-annual-rexer-analytics-data-miner-survey-results.html
Depending on what you are looking to accomplish, and obviously your budget, there are certainly some great things being done in R. Check out Rattle for R and Revolution Computing.
I am a big fan of SPSS, and unfortunately have not used their Modeler package, but it seems like it may be worth considering. I have used SAS Enterprise Miner, and while it is powerful, I am not a big fan.
I haven't dabbled with Weka that much, but I found RapidMiner to have a steep learning curve, though it does have a lot of capability.
If you want to keep everything in the Microsoft stack, check out www.predixionsoftware.com, which is planning the release of a disruptive Excel add-in as an update to the current MS DM add-ins.
You might want to give KNIME a try before paying for something else. It works well with databases and is excellent for exploratory analysis.
I would suggest checking out open-source data mining software. There are some very good open-source packages that are free.
I would start by building some data mining models in SSAS using both Multidimensional and Tabular, and then get a Google Analytics account. I built a social networking website that members had to join, and used Google Analytics to start building reporting dashboards; I have probably built close to a thousand. It is a good starting point. R is good. Omniture used to be the top dog, but Adobe bought them. There are also ClickTracks, QlikView, Sisense, Tableau, and Actuate. However, I would wait and see how the product Microsoft releases turns out. Chances are it will set itself apart, as they have in the BI market, and they shot up to 2nd in market share and 1st in growth in the database market.
We'd like to see if we can get some improved performance for analysis and reporting by moving some of our key data into Analysis Services cubes. However, I haven't been able to find much in the way of good client front ends.
Our users have Office 2003. The move to 2007 is probably at least a year out, and the Analysis Services add-in for Excel 2003 isn't great. I also considered just creating a WinForms app, but I haven't had much luck finding robust controls for SSAS data. Meanwhile, Reporting Services seems to force you to flatten your multi-dimensional data into a two-dimensional dataset before it can be used in a report.
I hope that I'm missing something obvious and that there are some great client tools somewhere that will allow us to bring the multi-dimensional data from SSAS to a client application in a meaningful way. PerformancePoint seems like overkill, but maybe it's the best option.
Does anyone use SSAS data in line-of-business apps, or is it primarily used for ad hoc analysis? If you are using SSAS data for line-of-business apps, what technology(ies) are you using for the client front end?
I am on a project now that is using SSAS 2008. We were able to get all of our pilot users upgraded to Office 2007 and during the pilot of this project we used Excel 2007 as the front end tool. It turned out to be a great move (for us..YMMV) and I have been very impressed with how well the data features of Excel 2007 work. That being said it doesn't serve all our business needs and we're still going to use a reporting tool (MicroStrategy) as part of the client tool offerings to this project. This article (http://www.microsoft.com/downloads/details.aspx?FamilyId=2D779CD5-EEB2-43E9-BDFA-641ED89EDB6C&displaylang=en) was very helpful too.
Though you didn't ask directly, I'll still say that the front-end tools won't do much if the back-end design isn't right. I recommend googling Ralph Kimball and buying The Data Warehouse Toolkit book. There is even one tailored to SSAS 2005. Also search for the Microsoft Project REAL whitepaper.
I've heard good things about this control.
http://www.datadynamics.com/Products/DDA/Default.aspx
I've used Dundas' control for OLAP. Very good and easy to use.
Also recommended is the DevExpress kit. The ASPxPivotGrid works directly against cubes, with some flexibility over what groups/dimensions/etc. get shown, and how, via properties. Good prices and products. I don't work there :)
Take a look at some of the Demos:
http://demos.devexpress.com/ASPXPIVOTGRIDDEMOS/OLAP/Browser.aspx
Also nice integration with their charting stuff.