Business Intelligence: Data mining with MS SQL Server? - sql-server

I have to study about data mining using SQL Server. As I know, Business Intelligence in SQL Server supports data mining, but I'm not pretty sure.
Does BI really support data mining?
How can I start with data mining with SQL Server? I mean, resources such as books, blogs,..etc
Thank you all.

I would suggest beginning your research by focusing on SQL Server Analysis Services via the Books Online Documentation, in particular the "Analysis Services Information Worker InfoCenter" as the information presented is mostly for the attention of data analysts.
http://msdn.microsoft.com/en-us/library/ms174577(SQL.90).aspx
Using the reference you can choose further readying into specific subjects such as "Data Mining Concepts", which will subsequently lead you onto the various Data Mining Algorithms that are available to you.
Then to get hands on with the technology, take a look at the Microsoft Data Mining Tutorial:
http://msdn.microsoft.com/en-us/library/ms167167.aspx

Every database supports data mining. Handling large amounts of structured data is what databases do!
First learn SQL which is useful in many applications and databases.
Then, if you find stuff you can't solve with SQL, you can turn to:
Reporting services: to create fancy reports
Analysis services: to analyse truly gigantic amounts of data (if you're thinking in millions of rows, Analysis server is overkill)
Integration Services: import from non Sql Server sources, automate tasks, combine queries graphically
These are Sql Server specific, and not as useful in every BI scenario.

The Microsoft BI stack is composed of:
SQL Server Reporting Services
SQL Server Analysis Services
SQL Services Integration Services
The combination of three will allow you to mine data, perform analytics and display them interactively.
I have bookmarked a few good links in my delicious account:
http://delicious.com/syalam/ssrs

Check thislink out - I learned from those screencasts:
http://sqlblog.com/blogs/denis_gobo/archive/2007/12/13/3937.aspx

Related

How can I establish a connection between my SQL Server 2017 database with a SaaS-based server?

I use a SQL Server 2017 database to perform ad-hoc data analytics activities to support my team. In order to source data from various databases, I either mount a backup on my environment (if the target DB holding the data I’m after is also SQL Server) or use linked servers to establish a direct connection (where I need data from Oracle or iSeries).
More recently though I’m coming across SaaS based systems and was wondering if there’s any way I can establish a direct connection between my database and the SaaS database? I’m not sure whether SSIS packages will do the trick. Any pointers would be gratefully appreciated as I’m struggling to get the right, scalable solution for this problem!
Data integration with SaaS solutions is a mixed bag. You have to discover on a case-by-case basis what integration or data export functionality each SaaS application has. Few will allow any kind of direct database access, but you might find an ETL tool that has pretty broad connectivity.
In Azure you can look at Logic Apps Connectors, and Power Query Connectors. Or products like Boomi. Which have connectors for many popular SaaS Applications.

SQL Server Linked server for data federation

I am familiar with SQL server as a product and want to use this to do data federation across a wide range of data sources that can be accessed trough linked servers. My data volumes are rather limited.
However I am interested how would this approach compare to existing data virtualisation software on a data federation level performance wise?
SQL server has his own query optimization engine and so on... so will performance of sql server linked servers federated queries be comparable to the Denodos, Red hats, Ciscos,... and other Data Virtualization products of this world? Or do they have some other trick up their sleeve?
Kind Regards!
The specialised data virtualization tools do have optimization techniques you will not find in LinkedServer and similar technologies. The optimizers of DV systems have been designed from scratch to work with distributed architectures, while the optimizer of SQL Server has been designed for a conventional database. The problem of optimizing queries in both scenarios is VERY different.
You can check this two blog posts by Denodo to get an idea of the differences (DISCLAIMER: I work for Denodo):
http://www.datavirtualizationblog.com/performance-data-virtualization-logical-data-warehouse-scenarios/
http://www.datavirtualizationblog.com/cost-based-optimization-in-data-virtualization/

Replication across heterogenous databases

Is it possible using SQL Server Replication to replicate data to AND from (bi-directional) Oracle and SQL Server? The schemas are completely different. In real-time would be a bonus.
Have already investigated Oracle Golden Gate, which seemed to do the job, although the licence cost is not insignificant!
I wondered if anyone has had any experience in replicating data across different schemas, and what other tools they employed? I realise this is a bit of an open-ended question but any advice and previous experiences would be most useful.
Thanks
Duncan
I recently had to create a solution to import periodically lots of data from different databases (most of the time from Oracle databases) to a SQL Server database (a data warehouse). To do so, I used SQL Server Integration Service to create a package able to import, transform and insert the data as I wanted (since it was from heterogeneous sources too). This software comes with SQL Server and the version 2005 and superior is really easy to use (graphical programming). In your case, you could trigger your created services when needed. I am not sure it is the best solution since you would need to create a SSIS service for each direction (from Oracle to SQL Server and from SQL Server to Oracle).

Front end applications for examining/digging through a SQL Analysis Services Mining Model

I currently use Excel and SQL Server Business intelligence studio to browse my models, but I've been searching high and low for a decent, moderately user-friendly front end application that can be used for trudging through a SSAS Mining Model.
I understand how to use the predictions for specific purposes (such as integrating with call queues and such) but the higher-ups are wanting to use the mining model in a more ad-hoc sense.
What applications are available for browsing a mining model and which ones have you used?
You can build an ASP.Net Web page using the controls here:
http://msftasprodsamples.codeplex.com/wikipage?title=SS2005!Data%20Mining%20Web%20Controls%20Library
You can use Microsoft SQL Server Management Studio and select Analysis Service under the Server Type.

Sql Server Services - Overview anyone?

In Short
I am thoroughly confused by the array of SQL Server services available, and am having a hard time finding a brief primer.
The Scoop
I am a long time ASP.Net developer who has happily been churning out ASP.Net applications for years. These have usually been based on SQL server and a range of custom tables and views.
Just recently I have had the need to do some reporting that goes a little beyond the straight forward, and so I have started to look into SQL Server Reporting Services (SSRS). In doing so I feel like i have opened up Pandora's Box.
There is reporting services, integration services, Analysis services, Business intelligence, etc etc. This has got me wondering what they all do, and if I could (or should have been) leveraging these tools to do some of the heavy lifting for .net applications.
My immediate research efforts on Microsoft have stifled me with impenetrable marketing and business speak. My Google searched have turned up no comprehensive overview of what these tools are, or they fit together, or even which tool should be used when. One of the biggest problems i am having comprehending the options is that they all seem kind of the same to me!
So, all of this to say, if some kind sole could set me straight and point me in the right direction, I would be very grateful :)
Very high level overview:
Core services for the database engine itself consist of the SQL Server service itself (or MSSQLSERVER) and there's also the SQL Server Agent and the SQL Server Browser. The SQL Server Agent is a job scheduler for SQL Server and handles some other maintenance tasks and so on, while the Browser service helps with accepting logins and so on by creating an easier way for client applications to connect to SQL Server.
All three of these services though are true 'services' in the sense that they're daemons running on your server. But of these three, you ONLY need MSSQLSERVER running to be able to work with the relational database engine. (The other two services are optional.)
Then, in addition to core database engine functionality, there are a bunch of add-ons or supplemental 'products' and tools that Microsoft Offers to help make SQL Server that much better and more compelling as a platform/offering. As you've seen, these are typically marketed as 'services'.
Reporting Services (or SSRS) is a great solution that uses a native SQL Server database to store information about data sources, report delivery schedules, access permissions, and so on... for a powerful reporting engine that makes it very easy to output great reports from pretty much ANY data source. You just define sources for the reports, wire up datasets and parameters, then use a specialized xml markup (Report Definition Language or RDL) to define look, feel, shape, and so on of your reports. And, Microsoft has created a 'tweaked' version of Visual Studio to make creating/developing/editing these reports much easier. That version of visual studio is called SQL Server Business Intelligence Design Studio (or BIDS).
Likewise, Microsoft also uses BIDS to help developers, DBAs, and analysts create data cubes - multidimensionally 'joined' tables using a specialized schema that allows SQL Server Analysis Services (SSAS) to churn through pretty much any combination of data relations in order to provide 'analytical data'. So... say you have a company selling widgets: you could define relationships between products, customers, sales regions, dates, and so on - and SSAS will take that data (along with specially defined 'cube' definitions that define those relationships, their granularity, and so on) and pre-calculate the outcome to pretty much any combination of those 'dimensions' or angles of data - so that you could say something like: "Show me all x-brand widgets sold by sales people in the northwest region in January of this year.." and it would give you a sum of those parts sold... and even let you drill-down by sales person, store, and so on - assuming you've defined your cube/relationships as needed.
Of course, this functionality is pretty powerful, but you almost never want these cubes running against your production tables... so companies typically create data warehouses or data marts to store their analytic data in. But to copy data on a regular basis from production into these warehouses/marts.. you need some kind of tool that makes it easy to figure out which changes should be picked up, how they should be handled, and what kinds of tweaks/changes you need to make to them to get them out of a highly-normalized production environment into a highly DEnormalized warehouse/data mart (where you're using snowflake or star schemas)...
And this tool/service is none other than SQL Server Integration Services (SSIS) - which has a rich set of connectors, pumps, transforms, monitors, and so on to facilitate data transformations and migrations on a scheduled basis. What's very cool about this tool/service though... is that it doesn't just connect/transform data between SQL Server and SQL Server... but it's able to connect MOST data sources to MOST data sources - so it's a powerful tool for transforming, exporting, importing, modifying, tweaking, and so on all of your data.
And... to define the 'workflows' for your integration/transformation projects, you also use Business Intelligence Design Studio - same as with defining cubes and so on for SQL Server Analysis Services.
Ultimately though, what makes all of these so cool is that you get access to:
SQL Server
SQL Server Reporting Services
SQL Server Integration Services
SQL Server Analysis Services
With SQL Server Standard and Enterprise Editions - so there's no need to buy extra tools/solutions/offerings in order to be able to take advantage of these great features. (The Enterprise Edition of SQL Server which costs about 5x as much as Standard... comes with much better/cooler features in some ways ... but you still get a solid subset of all features using the Standard Edition.)
Here is a basic overview of the services:
Analysis Services: This is Microsoft's data warehousing solution. It allows you to create data cubes and other warehousing features. Provides tools to make it easier to create what-if scenarios. This runs on top of SQL Server. Data warehousing is also referred to as Business Intelligence applications because the warehouses are used to mine and analyze data.
Reporting Services: Allows you to build database reports more easily. It provides a report designer and management functionality. This service is supposed to make it easier and faster to create rich reports. The reporting service also works against non-SQL Server data services.
Integration Services: Provides functionality to build Extract, Transform, Load (ETL) functionality. ETL tools integrate individual data sets and help with extracting, aggregating, cleaning and uploading data. Like the other services, it provides a GUI that is supposed to make these packages easier to create.
You aren't required to use the three services in one bunch but they do provide functionality when used together. For example, you may use Analysis Services to create data cubes, Integration Services to extract the data from an external service and import it into your cube, and the Reporting Services to create reports that contain business metrics from the data contained in the cube.
As to whether you should be/have been using them, as always, it depends. The services provide some great services but don't always fit the project. For example, Reporting Services do make it easier to create reports but, in my experience, it isn't friendly enough for the end-user report authoring as it is advertised. Integration services are great to do some ETL jobs but isn't as flexible as was sometimes required. I used this service in a few projects but it always ended up being more trouble than I would have expected. But definitely, take a look at them.

Resources