How to report across large enterprise systems - database

Some time ago my company was evaluating different reporting solutions. We settled on MS SSRS, in part because it's capable of connecting to various types of data stores, including MS SQL Server, Oracle and SAP NetWeaver BI. It has stood the test of time pretty well; however, we're now under fire from management because SSRS is not capable of mixing data sources in the same data set.
So, I searched long and hard for a reporting solution that can "inner join" data from separate systems, but I came up short. I am about to propose that we custom write reports (ASP.NET) for these cross-system report requests, but I wanted to ping the internet first for any advice.
How do you "inner join" across your massive enterprise systems for reporting purposes?

Take a look at BIRT from Actuate. BIRT comes in open-source and commercial flavors and I believe it allows joins across data sources.

Perhaps a linked server in SQL Server? Watch out for performance issues, and I'm not sure whether SSRS has limitations against them - I don't think it does. You can reference a remote table with a four-part name like MYSERVER01.DATABASE1.dbo.TABLE. More info from the source.
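A minimal sketch of setting one up and joining across it (the server, database and table names here are made up):

    -- Register a linked server pointing at another SQL Server instance
    -- (all names below are hypothetical).
    EXEC sp_addlinkedserver
        @server     = N'MYSERVER01',
        @srvproduct = N'',
        @provider   = N'SQLNCLI',
        @datasrc    = N'myserver01.corp.local';

    -- "Inner join" a local table against the remote one via a four-part name.
    SELECT o.OrderID, c.CustomerName
    FROM   dbo.Orders AS o
    INNER JOIN MYSERVER01.DATABASE1.dbo.Customers AS c
           ON c.CustomerID = o.CustomerID;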
For best performance you would be pulling all your disparate data into a data warehouse, but that is a major undertaking that management may not be willing to fund.

One way to join across data sources in SSRS is to use subreports - see http://msdn.microsoft.com/en-us/library/ms159837.aspx .
Performance is unlikely to be good using this method, however - Sam's suggestion of a linked server is likely to be more practical.
(According to BIRT's documentation it does enable joins between datasets, as DMKing suggested - I haven't tried using this feature yet.)

This limitation was removed in SQL Server 2008 R2. There is a workaround in previous versions.
See my full response here: How can I add a field to a dataset from another dataset in SSRS?

I was going to suggest using SSIS as a possibility until I read this
http://blogs.msdn.com/b/jenss/archive/2009/04/23/consuming-ssis-package-data-in-reporting-services-and-using-web-services-in-addition-part-1.aspx
I have used linked servers to combine data from Oracle/SQL Server, not nice but it worked.
Failing that I'd go with subreports.
Failing that, point out to management how expensive SAP/Oracle etc are and they'll soon stop moaning.
:)

Related

Convert SQL Server queries to Postgres on the fly

I have a scenario where I get queries on a web service that need to be executed on a database.
The source of these queries is a physical device, so I can't really change the input to my queries.
I get the queries from the device in MSSQL's dialect. Earlier the backend was SQL Server, so things were pretty straightforward: queries would come in and get executed as-is on the DB.
Now we have migrated to Postgres, and we don't have the option to modify the input data (SQL queries).
What I want to know is: is there any library that will do this SQL Server to Postgres translation for me, so I can run the SQL Server queries through it and execute the resulting Postgres query on the database? I searched a lot but couldn't find much that would do this. (There are libraries that convert schema from one to another, but what I need is to translate SQL Server queries to Postgres on the fly.)
I understand there are quite a few nuances that will differ between T-SQL and Postgres, so a translator will be needed in between. I am open to libraries in any language (preferably one that runs on Linux :) ), and any other suggestions on how to go about this would also be welcome.
Thanks!
If I were in your position I would have a look at upgrading your SQL Server to 2019 ASAP (as of today, you can find on Twitter that the officially supported production-ready version is available on request). Then have a look at the PolyBase feature they (re)introduced in this version. In short, it allows you to connect your MSSQL instance to other data sources (like Postgres) and query the data as if it were a "normal" SQL Server DB (via T-SQL); in the background your queries are transformed into native pgsql and the results consumed from your real source.
There are not many resources on this product (as of the 2019 version) yet, but it seems to be one of the most powerful features coming with this release.
This is what BOL is saying about it (unfortunately, it mostly covers the old 2016 version).
There is an excellent, yet very short, presentation by Bob Ward (Principal Architect @ Microsoft) that he did during SQLBits 2019 on this topic.
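Very roughly, the setup would look something like this - a hedged sketch assuming the generic ODBC connector in 2019 PolyBase; the host, credential, driver string and table names are all made up:

    -- One-time setup (requires a database master key to exist first).
    CREATE DATABASE SCOPED CREDENTIAL PgCredential
        WITH IDENTITY = 'pg_user', SECRET = '...';

    -- Point PolyBase at the Postgres server via ODBC.
    CREATE EXTERNAL DATA SOURCE PgServer
        WITH (LOCATION = 'odbc://pg-host:5432',
              CONNECTION_OPTIONS = 'Driver={PostgreSQL Unicode}',
              CREDENTIAL = PgCredential,
              PUSHDOWN = ON);

    -- Expose a Postgres table as if it were a local SQL Server table.
    CREATE EXTERNAL TABLE dbo.Articles (
        ArticleID INT,
        Title     NVARCHAR(200)
    )
    WITH (LOCATION = 'mydb.public.articles', DATA_SOURCE = PgServer);

    -- Incoming T-SQL queries can now run unchanged against dbo.Articles.
    SELECT COUNT(*) FROM dbo.Articles;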
The only thing I can think of that might be worth trying is SQL::Translator. It's a set of Perl modules that have been around for ages but seem to be still maintained. Whether it does what you want will depend on how detailed those queries are.
The no-brainer solution is to keep a SQL Server Express in place and introduce Triggers that call out to the Postgres database.
If this is too heavy, you can look at creating a Tabular Data Stream gateway (TDS is SQL Server's network transport) with limited functionality, mapping each possible incoming query, with any parameters, to a static Postgres query. This limits any testing to a finite, small number of cases.
This way, there is no SQL Server, and you have more control than with the trigger option.
If your terminals demand only a limited dialect then this may be practical. Attempting a general translation is very likely to cost more than the devices are worth to replace (unless you have zillions already deployed).
There is an open implementation, FreeTDS, that you could use if you are happy with C or Java.

Reports from SQL Server using Excel

I've developed an intranet system for our company which uses a SQL Server 2008 backend. This stores an awful lot of information, and I'm frequently asked to build reports for various managers to help with the business. Quite often these reports are variations on a theme, whilst sometimes they're quite unique. At the moment I write SQL to perform the report and have it dump the required output via ASP.NET pages.

What I'd really like to do is get away from that, and I was thinking along the lines of having the managers query the database using Excel so that they can decide what fields to filter on etc. To this end I wrote a couple of views and used Excel to connect to them. The problem is that without filtering you end up with a lot of data, so I was wondering about the best way to approach this.

I've not had anything to do with data warehousing/Analysis Services, but I wondered if that was a route to look at, or should I be looking at Reporting Services? I've got access to the full Microsoft stack so I'm happy to use different solutions.
I'm more than happy to spend some time doing some reading/research, but I'm a bit unsure where to begin, so any pointers would be gratefully received.
Thanks in advance
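For illustration, the kind of view I mean looks something like this (a simplified sketch; the table and column names are made up):

    -- A flattened reporting view that Excel connects to
    -- (Data > From Other Sources > From SQL Server).
    CREATE VIEW dbo.vw_OrdersForReporting
    AS
    SELECT  o.OrderID,
            o.OrderDate,
            c.CustomerName,
            o.TotalValue
    FROM    dbo.Orders    AS o
    JOIN    dbo.Customers AS c ON c.CustomerID = o.CustomerID;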

How to separate programming logic and data in MS SQL Server 2005?

I am developing a data-driven website, and quite a lot of the programming logic resides in database stored procedures and functions. I find myself changing the stored procs/functions quite a lot in order to fix bugs or add new functionality. The data (tables) has remained mostly untouched.
The issue I am having is keeping track of versions of the stored procs/functions. Currently I increment the version of the whole database when I make a set of changes. As the data is huge (10 GB), I run into issues having to run development and release versions of the database in parallel.
I wish to put all the stored procs and functions in one database and keep the data in another, so that I can better manage the changes.
I am sure others have encountered similar situations, so I'd welcome suggestions on how best to handle this.
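To illustrate the split I have in mind (database and object names here are invented):

    -- LogicDB holds the procs; DataDB holds the tables.
    USE LogicDB;
    GO
    CREATE PROCEDURE dbo.usp_GetActiveUsers
    AS
    BEGIN
        -- A three-part name reaches across into the data-only database.
        SELECT UserID, UserName
        FROM   DataDB.dbo.Users
        WHERE  IsActive = 1;
    END;
    GO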
I would also recommend using source control keyword expansion in your stored procedures ($Version:$)
That way you can eyeball, grep, search syscomments, etc to see what version you have on your deployed database.
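For example, a minimal sketch of the idea (procedure and table names are made up; the $Revision$-style keyword assumes your source control system expands it on commit):

    -- $Revision: 142 $  <- stamp expanded by source control on commit
    CREATE PROCEDURE dbo.usp_GetCustomer
        @CustomerID INT
    AS
    BEGIN
        SELECT CustomerID, CustomerName
        FROM   dbo.Customers
        WHERE  CustomerID = @CustomerID;
    END;
    GO

    -- Later, find the deployed version by searching syscomments:
    SELECT OBJECT_NAME(id) AS ProcName
    FROM   syscomments
    WHERE  [text] LIKE '%$Revision:%';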
You can version just the schema dumps. In combination with source control keyword expansion (as suggested by Rawheiser), you just take a look at what version you have in the database, generate a diff and apply it.
Also, there are several excellent tools to compare databases and their schemas and generate DDL scripts: SQL Workbench, Power Architect, DDLUtils and Redgate SQL Compare, to name a few. SQL Compare is likely to work best with SQL Server, although the others are all FOSS and provide a higher ROI (in terms of time spent learning versus what you can do with them) as they are platform and RDBMS independent.
Finally, I have to say... I understand that the immediate results you get with logic in the DB are tempting, but if you've gone beyond more than a couple of procedures in the database, you're setting yourself up for quite a lot of pain, sifting through what easily turns into spaghetti code and locking your application to a single database vendor. You might have your reasons, but I've been there and didn't like it very much. Logic can live very nicely in a different layer.
For source control you have several options:
Use a Visual Studio Database project.
Use SQL Server 2005's built-in support for source control
Use a third-party tool such as SQL Compare
IMO, option 1 is preferable.

End User Ad-Hoc Reporting Tool: Microsoft SQL Server Management Studio or Microsoft Access?

Our centralized IT department has suggested two primary ad hoc query tools for our general user base of approximately 200 staff members:
Microsoft SQL Server Management Studio 2008 (SSMS)
Microsoft Access 2003
Environment
The backend database is a read-only Microsoft SQL Server 2005 database.
The schema is 400+ tables; allowing access to the raw data for our general staff would be a disaster.
We will be building an "abstraction layer" over the raw data for our general staff to run ad hoc queries against.
The abstraction layer will most likely contain a number of views.
A number of users have basic knowledge in Microsoft Access; none have used SSMS.
Which of the above tools (or alternative) would be best for a decidedly non-techie user base of approximately 200 people? What are the pros and cons of each?
Also, the IT department has suggested teaching people T-SQL so they may use SSMS. Is this reasonable?
How about this one? i-net Clear Reports (it used to be called i-net Crystal-Clear) has a powerful ad hoc reporting component that is designed to be easy to use for non-technical users. Your users won't have to know anything about reporting at all. They simply select the kind of report and the data, et voilà, there is a report suiting their needs.
The data abstraction can be done easily by creating so-called data views, which can be designed by e.g. your administrators. There are various ways to access the ad hoc reporting GUI: a web GUI, a Java applet or a standalone Java program.
The end users will not need any training since the GUI is highly intuitive.
The views can easily be built by drag and drop, in addition to setting data types, formats and so on.
All reports (depending on security settings) can be accessed via DAV or a report repository GUI.
The server supports different security settings on a per-user or per-group basis.
The standalone report designer is free and fully functional.
Disclosure: Yep. I work for the company who built this.
Your "abstraction layer" is the right approach to take with Access. Create an MDB with the basic views required linked into it and distribute to the users. Allow them to create new queries and reports in their own MDB as required.
Now how you are going to stop them from running a Cartesian join on tables with a million records or more I'm not quite sure.
Microsoft have a free tool for business and end users called "Report Builder". It supports the full capabilities of SQL Server Reporting Services. The good thing is that it provides a Microsoft Office-like user interface.
You can download the latest version, "Report Builder 3.0", from here:
http://www.microsoft.com/download/en/details.aspx?DisplayLang=en&id=6116
And for more information about MS Report Builder check this link
http://technet.microsoft.com/en-us/library/dd207008.aspx
Attempting to teach "non-techie" people T-SQL to query a schema with 400+ tables probably isn't going to go well, unless they are limited to querying the views only, and the views hide all the ugly complexities of the various joins, grouping etc.
Our company was in a similar situation where Access was used early on, and then we switched everyone over to use T-SQL and SSMS. IMO, this is the approach you'd want to take.
Again though, the success of this will depend on the quality of your views, or better yet, reports you provide your end-users.
Randy
I would look into something like StonefieldQuery.com, which is designed for non-developers to build reports. Not that the report writer or query builder in Access is bad, but it may be too much. I think they also provide a way to centralize reports and queries so they can be shared. Multiple people are not going to be able to open a single Access file and create a report (I think query building is OK).
Most will use the drag-and-drop capability, but about 5-10% will come to need SQL, and then you can take advantage of the "teachable moment" and get them some training.
Cons for Access certainly would be cost; SSMS should be free assuming you're properly licensed for the SQL server.
Depending on the actual needs, some users might actually be better off with Crystal Reports (never thought I'd say that), or Reporting Services.
You could create a series of SQL Server Analysis Services cubes and have the users connect to those using Excel, so that they can use Excel's pivot tables.
Being a newbie at ad hoc reporting and doing the work myself, I used Izenda.com ad hoc reporting. It was very straightforward, and I could do it myself versus outsourcing.
Check out SQLS*Plus - http://www.sqlsplus.com
I found SQLS*Plus to be a very effective command-line SQL Server reporting tool. It is free (for personal use) and allows me to generate reports with titles and headers in HTML and CSV formats, format columns with custom masks, set report length, page size, etc. As I understand it, it is very similar to the well-known Oracle SQL*Plus reporting tool.

How to aggregate data from SQL Server 2005

I have about 150,000 rows of data written to a database every day. These rows represent, for example, outgoing articles. Now I need to show a graph using SSRS that shows the average number of articles per day over time. I also need information about the actual number of articles from yesterday.
The idea is to have an aggregated view of all our transactions and something that can indicate that something is wrong (that we, for example, sent out 20% fewer articles than the average).
My idea is to have yesterday's data moved into SSAS every night and to store there the aggregated number of transactions along with the actual number of transactions from yesterday's data. Using SSAS would hopefully speed up the reports.
Do you think this is the right idea? Should I skip SSAS and report straight off the raw data? I know how to use Reporting Services on raw data using standard SQL queries, but how would this change when querying SSAS? I don't know SSAS - where do I start?
The neat thing with SSAS is that you can get those indicators that you talk about quite easily either by creating calculated measures or by using KPIs.
I started with Delivering Business Intelligence with Microsoft SQL Server 2005. It has a good introduction, but unfortunately it's too verbose when it comes to the details. Still, if you want to understand SSAS, OLAP and reporting using this framework, it's a good start.
Mosha Pasumansky has a blog on SSAS and MDX with great links.
Other than that I would recommend Microsoft's Books Online.
Are you sure you aren't mixing up SSAS (Analysis Services) and SSIS (Integration Services)?
SSAS is not an ETL, it is an OLAP tool.
SSIS is an ETL tool.
I agree with everything that Rowan said. I'm just confused by the terms.
SSIS is the ETL tool here. Basically you get data from somewhere (your outgoing articles), do something to it (aggregate), and put it somewhere else (your aggregates table, data warehouse, etc). Check the link for details.
You probably won't be keeping all of the rows in the DB indefinitely, and if you want to be able to report on longer trends you will in any case need to do some kind of aggregation of historical data. So making the reports use this historical data store as their source makes sense. You can then use it to do all kinds of fancy reporting.
TL;DR: Define your aggregated history table with your future reporting needs in mind. Use SSIS to populate the table and refresh it from the daily updates. Report from that table. Further reading: star schemas and data warehousing.
@Sergio and @Rowan
Yes - we're not talking about loading and transforming data into the database (like an SSIS tool would do). That's solved using our integration platform.
@Riri, maybe SSAS is overkill for the situation you presented. If you only need to populate summarization tables daily, you can accomplish that by creating a regular job in SQL Server and doing the work in a plain T-SQL script.
I've used this approach for several years in a daily process to calculate business indicators from about 9 GB of new data per day. It works, it's fast, it's simple and it uses a technology you're already used to. If your daily process gets more complicated (it needs to read from files, use FTP, send emails), you can move to an SSIS package (or any other ETL tool you like), but I cannot recommend using SSAS unless you need to provide OLAP capabilities to your users.
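A minimal sketch of what such a nightly T-SQL step might look like (the table and column names are invented; the DATEADD/DATEDIFF idiom truncates to midnight, since SQL Server 2005 has no DATE type):

    DECLARE @today DATETIME, @yesterday DATETIME;
    SET @today     = DATEADD(dd, DATEDIFF(dd, 0, GETDATE()), 0);  -- midnight today
    SET @yesterday = DATEADD(dd, -1, @today);

    -- Summarize yesterday's rows into the aggregated history table.
    INSERT INTO dbo.DailyArticleStats (StatDate, ArticleCount)
    SELECT @yesterday, COUNT(*)
    FROM   dbo.Articles
    WHERE  SentDate >= @yesterday
      AND  SentDate <  @today;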