Quality diagnostics tool for database model

I'm looking for a tool that can analyse an existing database model (an Oracle schema, in my setup) and build a report with quality metrics, potential causes of problems (circular constraints, for example), etc.
We have this kind of feature for Java code in tools like PMD or Checkstyle.
Does anyone know of a tool like this for database structures?

In Quest Software's TOAD there's a tool called Code Expert that does a lot of checks on stored code, e.g. views, stored procedures, etc. Maybe that can help.

Related

Practical Implementation for Data Warehouse

Data warehousing seems to be a big trend these days, and is very interesting to me. I'm trying to acquaint myself with its concepts, and am having a problem "seeing the forest for the trees" because all of the data warehouse models and descriptions I can find online are theoretical and don't give examples with actual technologies being used. I'm a contextual learner, so abstract, theoretical explanations don't really help me out all that much.
Now there seem to be many "data warehousing models", but all of them seem to have some similar characteristics. There is usually an "ODS" (operational data store) that aggregates data from multiple sources into the same place. A process known as "ETL" then converts data in this ODS into a "data vault", and again into "data marts" and/or "strategy marts".
Can someone provide an example of the technologies that would be used for each of these components (ODS, ETL, data vault, data/strategy marts)?
It sounds like the ODS could just be any ordinary database, but the data vault seems to have something special going on, because the "marts" pull their data from it.
ETL is the biggest thing I'm choking on by far. Is this a language? A framework? An algorithm?
I think once I see a concrete example of what's going on at each step of the way, I'll finally get it. Thanks in advance!
ETL is a process. The abbreviation stands for Extract-Transform-Load, which describes what is done with the data during the process. The process can be implemented anywhere you need to create a bridge between two systems with different data formats. First, you pull (extract) data from a source system (database, flat files, web service, etc.). Then the data is processed (transformed) to comply with the format of the target storage (again it can vary: databases, files, API calls). During the transform step, further actions can be performed on the data set, such as enriching it with data from other sources, cleansing it, and improving its quality. The last step is loading the transformed data into the target storage.
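As a toy illustration of the three steps over JDBC (the table names, the trim/uppercase rule, and both connection URLs below are made-up assumptions, not a real pipeline):

import java.sql.*;

public class MiniEtl {
    public static void main(String[] args) throws SQLException {
        // extract from the source, load into the target; both URLs are placeholders
        try (Connection src = DriverManager.getConnection("jdbc:oracle:thin:@ods-host:1521/ODS");
             Connection dst = DriverManager.getConnection("jdbc:sqlserver://dwh-host;databaseName=Mart");
             Statement extract = src.createStatement();
             ResultSet rs = extract.executeQuery("SELECT id, name FROM customers");
             PreparedStatement load = dst.prepareStatement(
                     "INSERT INTO dim_customer (id, name) VALUES (?, ?)")) {
            while (rs.next()) {
                // transform: a trivial cleansing rule standing in for real business logic
                String name = rs.getString("name").trim().toUpperCase();
                load.setInt(1, rs.getInt("id"));
                load.setString(2, name);
                load.addBatch();
            }
            load.executeBatch(); // load the transformed rows into the target
        }
    }
}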
Typically, an ETL process is employed for loading a data warehouse, for migrating data from one system or database to another when moving from a legacy system to a new one, or for synchronizing data between two or more systems. It is also used as an intermediate layer in broader MDM and BI solutions.
In terms of specific software, there are many ETL tools on the market, ranging from robust solutions from big players such as Informatica, IBM DataStage, and Oracle Data Integrator, to more affordable and open-source providers such as CloverETL, Talend, or Pentaho. Most of these tools offer a GUI where the flow and processing of data is defined through diagrams.
For Microsoft SQL Server 2005 and later, the ETL tool is called SSIS (SQL Server Integration Services). If you install at least the Standard edition of SQL Server, you get the Business Intelligence Development Studio, with which you can design your data flows. Basically, what an ETL tool does is take data from one or more sources (tables, flat files, ...), transform it (add columns, join, filter, map to different data types, etc.) and finally store it again in one or more tables or files.
To get a basic understanding of how something works you can watch e.g. this video or this one (both from midnightdba). They're a bit lengthy, but you get an idea. They certainly helped me in understanding the basic functionality of an ETL tool.
Unfortunately I have not yet dug into other platforms or tools.
I'd highly recommend checking out some of the books by Ralph Kimball and Margy Ross (The Data Warehouse Toolkit, The Data Warehouse Lifecycle Toolkit) for an introduction to data warehousing.
My company's data warehouse is built using the Oracle Warehouse Builder tool for ETL. The OWB is a GUI tool that generates PL/SQL code on the database to manipulate the data. After manipulation and cleansing, the data is published to an Oracle datamart. The datamart is a database instance that users access for ad-hoc querying via Oracle Discoverer (Java software).

Data migration between different DBMSs

As I couldn't get a satisfying answer to my question, it seems we have to write our own program for that. We are in the design phase and are thinking about which format we should use to back up the data.
The program will be written in Delphi.
We need to export/import data between Oracle, Informix and MS SQL Server. Performance is a very important issue here, as this program will run on 1-2 GB databases. Besides the normal data there are BLOBs in the database which have to be backed up.
We thought of XML data or comma-separated data, as both are transparent (which is nice to have), but BLOBs must be considered here. The Paradox format is not optimal in this case.
Can anybody recommend some performant formats?
Any other ideas to achieve the same goal are welcome.
Thanks in advance.
I use an excellent program called OmegaSync for my backups, but it will only handle Informix via ODBC and not directly. If you find you can use OmegaSync, you'll find its performance to be excellent, because it compares the databases first and then syncs only the differences. You might want to borrow this idea if you decide to do the programming yourself and efficiency is your number one goal.
But programming database conversion is very complex, as other answers to your question have said. So why not just develop the SQL you need and do the conversion that way? For example, see: Convert Informix Schema to Oracle Schema Or Any Other RDBMS. For moving the data, check out sources like: Moving non-Informix data between computers and dbspaces.
You can optimize the SQL to what I'm sure will be an adequate speed if you dump and load your data smartly.
DbUnit is a popular tool which can extract and load data in XML format, see
http://www.dbunit.org/faq.html#extract
import java.io.FileOutputStream;
import org.dbunit.database.IDatabaseConnection;
import org.dbunit.database.QueryDataSet;
import org.dbunit.dataset.IDataSet;
import org.dbunit.dataset.xml.FlatXmlDataSet;

// 'connection' is a DbUnit IDatabaseConnection wrapping a JDBC connection
// partial database export: only the tables (or query results) you add
QueryDataSet partialDataSet = new QueryDataSet(connection);
partialDataSet.addTable("FOO", "SELECT * FROM TABLE WHERE COL='VALUE'");
partialDataSet.addTable("BAR");
FlatXmlDataSet.write(partialDataSet, new FileOutputStream("partial.xml"));
// full database export: every table visible to the connection
IDataSet fullDataSet = connection.createDataSet();
FlatXmlDataSet.write(fullDataSet, new FileOutputStream("full.xml"));
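Going the other way, loading the exported XML into a target database, is the mirror image. A minimal sketch, assuming DbUnit 2.4+ and a 'targetConnection' (an IDatabaseConnection to the target database; the name is made up here):

import java.io.File;
import org.dbunit.dataset.IDataSet;
import org.dbunit.dataset.xml.FlatXmlDataSetBuilder;
import org.dbunit.operation.DatabaseOperation;

// parse the previously exported flat XML file
IDataSet dataSet = new FlatXmlDataSetBuilder().build(new File("full.xml"));
// CLEAN_INSERT empties the affected tables, then inserts the data set
DatabaseOperation.CLEAN_INSERT.execute(targetConnection, dataSet);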
Did you check ODI (Oracle Data Integrator)? It has support for lots of source databases. It is able to capture changes from the source databases and integrate them into the target database. It is performant, but has a price tag.
Ronald.
The new dbExpress framework gives you the possibility of exporting/importing data between many databases. You can check the CodeRage session Deep Dive into dbExpress by John Kaster.
You could use your own binary format, combining XML for the text and raw streams for the BLOBs.
If you have to export metadata too, and not only data, it could get very complex. There are many subtle (and not so subtle) differences among the databases you're going to use, so such a format would have to be general enough, and the exporting/importing code would have to translate and map metadata across databases; and because an external application can't write directly to the database's internal structures, it would have to generate the proper DDL to create the data structures.
As long as this is a proprietary format, IMHO its design is the least of your issues; if size and performance are important and the file is read sequentially, it would not be difficult to design a binary format.
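To make that concrete, here is a minimal sketch of such a length-prefixed binary record format (the class and field layout are made up for illustration; a real exporter would stream large BLOBs in chunks rather than hold them in memory):

import java.io.DataOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

public class BinaryExport {
    // each variable-length field is written with a length prefix, so a
    // reader can consume the file strictly sequentially without any parsing
    static void writeRecord(DataOutputStream out, String text, byte[] blob)
            throws IOException {
        byte[] textBytes = text.getBytes(StandardCharsets.UTF_8);
        out.writeInt(textBytes.length); // length prefix for the text field
        out.write(textBytes);
        out.writeInt(blob.length);      // length prefix for the BLOB field
        out.write(blob);
    }

    public static void main(String[] args) throws IOException {
        try (DataOutputStream out = new DataOutputStream(
                new FileOutputStream("export.bin"))) {
            writeRecord(out, "row 1 text", new byte[]{1, 2, 3});
        }
    }
}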
Anyway, imports/exports and backups are two different tasks. If you have to back up a database, use its own facilities: they usually allow far more control, e.g. point-in-time recovery. If you have to move data across databases, that's another issue; I would write just the code to move the data, not the metadata, pre-creating the required structures in the target database.
You could give Toad (Quest Software) a try.
It supports all the platforms you mentioned and can do things like 'Export table data to INSERT statements' on your source platform, which can then be run on the target platform.
IIRC there is even a Toad-internal backup format which might be cross-platform.
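If you end up scripting it yourself instead, the 'export table data to INSERT statements' idea is easy to approximate over JDBC. A naive sketch (quoting and type handling are deliberately simplified; NULLs are covered, but dates and BLOBs need real treatment):

import java.sql.*;

public class InsertExporter {
    // prints one INSERT statement per row of the given table
    public static void dumpTable(Connection con, String table) throws SQLException {
        try (Statement st = con.createStatement();
             ResultSet rs = st.executeQuery("SELECT * FROM " + table)) {
            ResultSetMetaData md = rs.getMetaData();
            int cols = md.getColumnCount();
            while (rs.next()) {
                StringBuilder sql = new StringBuilder("INSERT INTO " + table + " VALUES (");
                for (int i = 1; i <= cols; i++) {
                    Object v = rs.getObject(i);
                    if (v == null) sql.append("NULL");
                    else if (v instanceof Number) sql.append(v);
                    else sql.append("'").append(v.toString().replace("'", "''")).append("'");
                    if (i < cols) sql.append(", ");
                }
                System.out.println(sql.append(");"));
            }
        }
    }
}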
Toad Communities:
Toad for ORACLE
Toad for SQL SERVER
Toad for OTHER RDBMS (including Informix)
Some videos about exporting, importing:
YouTube: Toad for Data Analysts v2.7 Export Enhancements
YouTube: Toad for Data Analysts v2.7 Import Enhancements

LINQ with existing databases and unknown schema

I'm working on a database-heavy project where the Microsoft SQL Server databases are very mature (16 or more years old), and an old product uses VB6 and ADO to generate SQL which interacts with the database. I've been given the task of porting/re-writing the ancient version as a new .NET version.
I'd love to use LINQ-to-* to ensure easy maintainability, but having tried for the last several weeks I feel like LINQ-to-SQL isn't flexible enough, LINQ-to-Entities has too much overhead, and LINQ-to-Datasets is pointless since I would be just as happy using ADO.NET.
The program operates on two databases at once: one is a database with a very consistent schema containing meta-data, and the other a database which has a varying schema, is tightly coupled to the meta-database, and dictates what information from the meta-database you are interested in at any given time. Furthermore, I need non-LINQ information from both databases (such as system-stored procedures, and system-tables).
Is there any way to use LINQ intelligently here? I'd love the static typing, but if I can't have it I don't want to force my square app into a round framework.
Just an FYI: you can access system tables (and system stored procs too?) using LINQ. Here is how:
Create a connection to the server you want.
Right-click the server and choose Change View > Object Type.
You should now see System Tables and User Tables. You should see sysjobs there, and you can easily drag it onto a .dbml surface.
Above was stolen from this post.
The best answer seems to be to use ADO.NET completely. I have the option of using LINQ-to-SQL over the meta-database and ADO.NET for any other database access, but that would make the code feel too inconsistent for me.

How to separate programming logic and data in MS SQL Server 2005?

I am developing a data-driven website and quite a lot of programming logic resides in database stored procedures and functions. I found myself changing the stored procs/functions quite a lot in order to fix bugs or add new functionality. The data (tables) have remained mostly untouched.
The issue I am having is keeping track of versions of the stored procs/functions. Currently I increment the version of the whole database when I make a set of changes. As the data is huge (10 GB), I run into issues having to run development and release versions of the database in parallel.
I wish to put all the stored procs and functions in one database and keep the data in another, so that I can better manage the changes.
I am sure others have encountered similar situations, so I'd like to request suggestions on how best to handle this.
I would also recommend using source control keyword expansion in your stored procedures ($Version:$)
That way you can eyeball, grep, or search syscomments to see what version you have in your deployed database.
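For instance, a quick JDBC sketch that scans syscomments on SQL Server 2005 for the embedded version strings (the connection URL and the '$Version:' marker are assumptions):

import java.sql.*;

public class VersionScan {
    public static void main(String[] args) throws SQLException {
        // connection string is illustrative; adjust server, database and credentials
        try (Connection con = DriverManager.getConnection(
                     "jdbc:sqlserver://localhost;databaseName=MyDb;integratedSecurity=true");
             Statement st = con.createStatement();
             // syscomments holds the source text of procs, functions and views
             ResultSet rs = st.executeQuery(
                     "SELECT OBJECT_NAME(id) AS obj, text FROM syscomments " +
                     "WHERE text LIKE '%$Version:%'")) {
            while (rs.next()) {
                System.out.println(rs.getString("obj"));
            }
        }
    }
}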
You can version just the schema dumps. In combination with source control keyword expansion (as suggested by Rawheiser), you just look at what version you have in the database, generate a diff, and apply it.
Also, there are several excellent tools to compare databases and their schemas and to generate DDL scripts: SQL Workbench, Power Architect, DDLUtils and Redgate SQL Compare, to name a few. SQL Compare is likely to work best with SQL Server, although all the others are FOSS and provide a higher ROI (in terms of time spent learning and what you can do with them), as they are platform- and RDBMS-independent.
Finally, I have to say... I understand that the immediate results you get with logic in the DB are tempting, but if you've gone beyond more than a couple of procedures in the database, you're setting yourself up for quite a lot of pain: sifting through what easily turns into spaghetti code, and locking your application to a single database vendor. You might have your reasons, but I've been there and didn't like it very much. Logic can live very nicely in a different layer.
For source control you have several options:
Use a Visual Studio Database project.
Use SQL Server 2005's built-in support for source control
Use a third-party tool such as SQL Compare
IMO, option 1 is preferable.

How to validate the clients' databases against my database schema?

Our clients use SQL Server/Oracle databases. Over the years, we've sent them many update scripts which they had to run manually. Most of the time everything went smoothly, but every now and then a script did not run completely to the end or had some errors in it (which weren't detected at the time of the upgrade). Also, sometimes even "smart users" added indexes/tables to those databases themselves, for whatever reason. Later on, those irregularities led to problems.
Now I have been tasked with figuring out a way to verify/validate our clients' databases against our own database schema (tables, datatypes, indexes, views, ...). The output should be some kind of difference file indicating what is missing and what should not be in the database. I could do this in code (C++) from inside our application, or I could create an external tool for just this one purpose.
Now, before I start coding, I wanted to ask if there is already a tool out there that would produce the necessary results, or that could at least help me produce a decent XML file from our master databases (Oracle and SQL Server). Or is there a library which could help me write my own tool?
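If you do end up writing it yourself: JDBC's DatabaseMetaData (or the equivalent ODBC catalog functions from C++) exposes tables and columns for both Oracle and SQL Server. A rough sketch, shown in Java for brevity, that dumps tables and columns as plain text so the output for the master and client databases can be diffed (URL, credentials and schema pattern are passed in as placeholders):

import java.sql.*;

public class SchemaDump {
    public static void main(String[] args) throws SQLException {
        // args: jdbcUrl user password schemaPattern -- all placeholders
        try (Connection con = DriverManager.getConnection(args[0], args[1], args[2])) {
            DatabaseMetaData md = con.getMetaData();
            try (ResultSet tables = md.getTables(null, args[3], "%", new String[]{"TABLE"})) {
                while (tables.next()) {
                    String table = tables.getString("TABLE_NAME");
                    System.out.println("TABLE " + table);
                    try (ResultSet cols = md.getColumns(null, args[3], table, "%")) {
                        while (cols.next()) {
                            // one line per column: name, type, size, nullability
                            System.out.printf("  %s %s(%d) %s%n",
                                    cols.getString("COLUMN_NAME"),
                                    cols.getString("TYPE_NAME"),
                                    cols.getInt("COLUMN_SIZE"),
                                    cols.getInt("NULLABLE") == DatabaseMetaData.columnNullable
                                            ? "NULL" : "NOT NULL");
                        }
                    }
                }
            }
        }
    }
}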
I've used this technique before and it doesn't require buying any tools.
Enterprise Manager has a "Create Script" feature. Perform this on your reference database and the comparison database. Select the appropriate options to generate scripts for the objects you care about. Next, just compare the two generated files with your favorite diff tool.
You can do a similar procedure with Oracle tools that let you export the DDL scripts.
There are three options using Red Gate's tools:
1. Have your client run the comparison. You would need to convince your clients to purchase a license of SQL Compare and send them a schema snapshot of your database.
2. Write an application of your own using Red Gate's SQL Comparison SDK ($595 for 10 distributions), which can be run at the client site.
3. Ask your client to send you a schema snapshot and run the comparison yourself using your own copy of SQL Compare. Red Gate supplies a free schema snapshot tool called SQL Snapper that will create snapshots that can then be emailed to you by your client. As this doesn't include any data, it may be something your client is willing to consider.
The SQL Snapper tool and SQL Comparison SDK sample code can be downloaded from our labs.red-gate.com website.
Oracle compatibility is now available in the form of an Early Access build. If you're interested or would like to try out the tool, visit the product page. You can use it for free until the full release of the tool.
David Atkinson, Product Manager, Red Gate Software.
We use Redgate SQL Compare for this and it's served us well over the years.
We also use Redgate SQL Data Compare for comparing the content of lookup tables.
The folks at redgate have a great tool called SQL Compare.
Can you create a schema dump, like MySQL's SHOW CREATE TABLE?
If you're on Windoze, I have used Advanced Query Tool for years, and can attest that, for the money, it does more than anything else. In particular, it will generate a diff report between databases. It is ODBC/VB6, and can run against dozens of databases. Check it out. (No, I am not affiliated with QueryTool, nor do I own any part of it; just a happy client.)
