For the below script, written in a .sql file:
if not exists (select * from sys.tables where name='abc_form')
CREATE TABLE abc_forms (
    x BIGINT IDENTITY,
    y VARCHAR(60),
    PRIMARY KEY (x)
)
The above script has a bug in the table name: the existence check looks for abc_form, but the CREATE statement creates abc_forms.
For programming languages like Java/C, the compiler helps resolve most name-resolution problems.
For a SQL script, how should one approach unit testing it? Static analysis?
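For example (to make the question concrete), would the right approach be something like the following post-deployment assertion, run against a scratch database after executing the script? This assumes abc_form is the intended name; the names are taken from the script above.

-- Hypothetical post-deployment check: fail loudly if the expected table
-- is missing, which would catch the abc_form / abc_forms mismatch above.
IF NOT EXISTS (SELECT 1 FROM sys.tables WHERE name = 'abc_form')
    RAISERROR('Expected table abc_form was not created - check the script for name mismatches.', 16, 1);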
Fifteen years ago I did something like what you're asking for via a lot of scripting, but we had special formats for the statements.
We had three different kinds of files:
One SQL file to set up the latest version of the complete database schema
One file for all the changes to apply to older database schemas (custom format like version;SQL)
One file for SQL statements the code uses on the database (custom format like statementnumber;statement)
It was required that every statement was on one line so that it could be extracted with awk!
1) First I set up the latest version of the database by executing one statement after the other and logging the errors to a file.
2) Second, I did the same with all the change scripts to end up with a second schema.
3) I compared the two database schemas to find any differences (a sketch of such a comparison is shown below).
4) I filled the complete latest schema with some dummy test values for testing.
5) Last but not least, I executed every SQL statement against the latest schema with test data and logged every error again.
In the end the whole thing ran every night, and there was no morning without new errors that one of the 20 developers had put into version control. But it saved us a lot of time during the next installation at a new customer.
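Nowadays, on SQL Server, a rough version of step 3 could be done with metadata queries along these lines, assuming both schemas are set up side by side on the same instance. The database names LatestSchemaDb and UpgradedSchemaDb are placeholders, and the check deliberately ignores data types, constraints and indexes:

-- Tables/columns present in the freshly created schema but missing from the upgraded one.
SELECT TABLE_NAME, COLUMN_NAME
FROM   LatestSchemaDb.INFORMATION_SCHEMA.COLUMNS
EXCEPT
SELECT TABLE_NAME, COLUMN_NAME
FROM   UpgradedSchemaDb.INFORMATION_SCHEMA.COLUMNS;

-- And the reverse direction: things only the upgraded schema has.
SELECT TABLE_NAME, COLUMN_NAME
FROM   UpgradedSchemaDb.INFORMATION_SCHEMA.COLUMNS
EXCEPT
SELECT TABLE_NAME, COLUMN_NAME
FROM   LatestSchemaDb.INFORMATION_SCHEMA.COLUMNS;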
You could also generate the SQL scripts from your code.
Code first avoids these kinds of problems. Choosing between code first and database first usually depends on whether your main focus is on your data or on your application.
I am trying to execute (call) a SQL Server stored procedure from Infa Developer. I created a mapping (new mapping from SQL Query). I am trying to pass it runtime variables from the previous mapping task in order to log these to a SQL Server table (the stored procedure does an INSERT). It generated the following T-SQL query:
?RETURN_VALUE? = call usp_TempTestInsertINFARunTimeParams (?Workflow_Name?, ?Instance_Id?, ?StartTime?, ?EndTime?, ?SourceRows?, ?TargetRows?)
However, it does not validate; the validation log states 'the mapping must have a source' and '... must have a target'. I have a feeling I'm doing this completely wrong. And this is not PowerCenter (no sessions, as far as I can tell).
Any help is appreciated! Thanks
Now with the comments I can confirm and answer your question:
Yes, Source and Target transformations in Informatica are mandatory elements of the mapping; it will not be a valid mapping without them. Let me try to explain a bit more.
The whole concept of an ETL tool is to Extract data from the Source, do all the needed Transformations outside the database, and Load the data to the required Target. It is possible - and quite often necessary - to invoke stored procedures before or after the data load. Sometimes you even use existing stored procedures as part of the data load. However, from the ETL perspective, this is an additional feature. An ETL tool - Informatica being a perfect example here - is not meant to be a tool for invoking SPs. This reminds me of the question every T-SQL developer asks with their first PL/SQL query: what in the world is this DUAL? Why do I need 'from dual' if I just want to do some calculation like SELECT 123*456? That is the theory.
Now, in the real world it happens quite often that you NEED to invoke a stored procedure - and that it is the ONLY thing you need to do. Then you do use the DUAL ;) In the PowerCenter world this means you use DUAL as the Source (or actually any table you know exists in the source system), put 1=2 in the Source Filter property (or put a Filter Transformation in the mapping with FALSE as the condition), and link just one port to the target. Next, you put the stored procedure call in the Pre- or Post-SQL property of your source or target - depending on where you actually want to run it.
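For illustration only, the Pre-/Post-SQL property could then contain a plain T-SQL call along these lines. The parameter names are assumed to match the placeholders shown in the question, and the values here are made-up literals rather than the mapping's runtime variables:

EXEC usp_TempTestInsertINFARunTimeParams
    @Workflow_Name = 'wf_Example',       -- placeholder value
    @Instance_Id   = 1,                  -- placeholder value
    @StartTime     = '2020-01-01 00:00',
    @EndTime       = '2020-01-01 00:05',
    @SourceRows    = 100,
    @TargetRows    = 100;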
Odd? Well - the odd part is wanting to use the ETL tool as a trigger rather than as an ETL tool ;)
When comparing our production database to our database project, one table always shows up with an "Add" action, even though the file is already part of the project. Updating the schema then produces the same file again with an underscore and an increment (dbo.Data.sql => dbo.Data_1.sql).
I noticed that when I open the individual table creation scripts, all scripts open in [Design] mode while the offending table opens as plain T-SQL.
How do I add topsheet.Data to my project without it showing up on my next schema compare?
The offending table: topsheet.Data
A normal table: topsheet.Property
Does it do this if you rename the table Data to something else? I saw here that Data is a future reserved keyword; maybe this is making it act all weird?
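If renaming the table isn't an option, another thing that might be worth trying (a guess, not a confirmed fix) is to keep the identifier bracket-quoted everywhere in the project script, so the tooling never sees the bare reserved word. The column list below is invented, since the real definition of topsheet.Data isn't shown:

-- Hypothetical definition; only the bracket-quoted [Data] name matters here.
CREATE TABLE topsheet.[Data]
(
    Id   INT          NOT NULL PRIMARY KEY,
    Name VARCHAR(100) NULL
);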
When the SQL Server Snapshot Agent creates a snapshot (for transactional replication), there's a bunch of .PRE, .SCH, .BCP, and .IDX files, usually prefixed with the object name, a sequence number, and a part number - like MY_TABLE_1#1.bcp for MY_TABLE.
But when table names are a little longer, like MY_TABLE_IS_LONG, it can name the files like MY_TABLE_IS_LO890be30c_1#1.
I want to process some of these files manually (i.e. grab a snapshot and process the BCPs myself), but that requires the full name of the table, and I haven't been able to find where that hex number is created from or stored. It doesn't appear to be a straight object_id, and I've checked various backing tables in the distribution and publication databases where the tables have an objid and sycobjid, and it's neither of those either (after converting hex to decimal).
Does anyone know where that number comes from? It must be somewhere.
It appears they're just random. What happens is that when the snapshot is generated, a set of commands is placed into the distribution database (you can see them with EXEC sp_browsereplcmds), and these have the hardcoded table name along with the script names and the order in which to run them.
When you run the Distribution Agent for the first time, it gets those replicated commands, and these instruct it to run all the scripts (alternatively, if you've got it set to replication support only, I suspect these commands are just ignored).
In order to process the scripts semi-automatically, you'd need to grab everything from replcmds (hopefully on a quiet system) and parse the commands before running them manually.
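For reference, the inspection step is just a call in the distribution database - a minimal sketch, assuming the default database name of distribution (sp_browsereplcmds also accepts optional filters, not shown here):

USE distribution;
-- Lists the pending replicated commands, which include the hardcoded
-- table names and snapshot script names mentioned above.
EXEC sp_browsereplcmds;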
I'm comparing two SQL server databases (development and live environment, SQL2005 and SQL2008 respectively) to check for differences between the two. If I generate a script for each database I can use a simple text comparison to highlight the differences.
The problem is that the scripts need to be in the same order to ease comparison and avoid spurious differences where the order of the stored procedures is different but their contents are the same.
So if I generate this from development:
1: CREATE TABLE dbo.Table1 (ID INT NOT NULL, Name VARCHAR(100) NULL)
2: CREATE TABLE dbo.Table2 (ID INT NOT NULL, Name VARCHAR(100) NULL)
3: CREATE TABLE dbo.Table3 (ID INT NOT NULL, Name VARCHAR(100) NULL)
And this from live:
1: CREATE TABLE dbo.Table1 (ID INT NOT NULL, Name VARCHAR(100) NULL)
2: CREATE TABLE dbo.Table3 (ID INT NOT NULL, Name VARCHAR(100) NULL)
3: CREATE TABLE dbo.Table2 (ID INT NOT NULL, Name VARCHAR(100) NULL)
Comparing the two highlights lines 2 and 3 as different, but they're actually identical; the generate-scripts wizard just did Table3 before Table2 in the live environment. Add in hundreds of tables, stored procedures, views, etc. and this quickly becomes a mess.
My current options are:
Manually sort the contents before comparison
Create a program to create the scripts in a specific order (a rough sketch of this idea follows after the edit below)
Find a freeware application that sorts the generated scripts
Pay for a product that does this as part of its suite of tools
(Some other way of doing this)
Hopefully I'm just missing a checkbox that says "Sort scripts by name", but I can't see anything that does this. I don't feel I should have to pay for something as simple as a 'sort output' option plus lots of other unneeded tools, so option 4 should only be a last resort.
EDIT
I have full access to both environments, but the live environment is locked down and hosted on virtual servers, with remote desktop being the typical way to access live. My preference is to copy what I can to development and compare there. I can generate scripts for each type of object in the database as separate files (tables, SPs, functions, etc.).
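To illustrate option 2: programmable objects (procedures, views, functions, triggers) can be scripted in a deterministic order straight from the catalog views, so the output diffs cleanly. This is only a sketch - tables would still need to be scripted separately (for example via SMO), since sys.sql_modules doesn't cover them:

SELECT
    s.name AS schema_name,
    o.name AS object_name,
    o.type_desc,
    m.definition
FROM sys.sql_modules AS m
JOIN sys.objects     AS o ON o.object_id = m.object_id
JOIN sys.schemas     AS s ON s.schema_id = o.schema_id
ORDER BY s.name, o.name;   -- deterministic (schema, name) order on both servers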
Depending on your version of Visual Studio 2010 (if you have it), you can do this easily via the Data menu; given your original intent, you might save yourself some time.
Edit: generating the actual DBs and then comparing them with the schema comparison tool as shown below has the same net effect as comparing two script files, and you don't have to worry about line breaks etc.
Red Gate's SQL Compare is the best thing to use for this; worth every penny.
This is quite hard to do with scripts, because SQL Server will tend to generate the tables/objects in the order that makes sense to it (e.g. dependency order) rather than alphabetical order.
There are other complications that come up when you start comparing databases - for example, the names of constraint objects may be randomly generated, so the same constraint may have different names in each DB.
Your best bet is probably option 4, I'm afraid ... an evaluation copy of Red Gate SQL Compare is free for 30 days. I've used it a lot and it's very good at pinpointing the differences that matter. It will then generate a script for you to bring the two schemas back into sync.
edit: or Visual Studio 2010 Ultimate (or Premium) can apparently do it - see kd7's answer
You can use WinMerge to some extent to find out whether lines have simply moved elsewhere when comparing two generated scripts. I think it works in the simpler cases.
I'm using WinMerge v2.12.4.0 Unicode. Note the colors used for highlighting these below.
Here is the help text for Edit -> Options -> Compare -> "Enable moved block detection":
3.6. Enable moved block detection
Disabled (default): WinMerge does not detect when differences are due to moved lines.
Enabled: WinMerge tries to detect lines that are moved (in different locations in each file). Moved blocks are indicated by the Moved and Selected Moved difference colors. If the Location bar is displayed, corresponding difference locations in the left and right location bars are connected with a line. Showing moved blocks can make it easier to visualize changes in files, if there are not too many.
For an example, see the Location pane description in Comparing and merging files.
I had a similar issue. My database is SQL Server 2008. I realized that if I generate scripts through Object Explorer Details, I get them in the order in which I am viewing the names. In this way I was able to compare two databases and find out their differences.
The only problem with this is that I had to generate separate scripts for tables, stored procedures, triggers, etc.
But they can be compared easily.
I'm working on a number of Delphi applications that will need to upgrade their own database structures in the field when new versions are released and when users choose to install additional modules. The applications are using a variety of embedded databases (DBISAM and Jet currently, but this may change).
In the past I've done this with DBISAM using the user version numbers that can be stored with each table. I shipped an extra, empty set of database files and, at start-up, compared the version numbers of each table, using the FieldDefs to update the installed table if necessary. While this worked, I found it clumsy to have to ship a spare copy of the database, and newer versions of DBISAM have changed the table restructuring methodology so that I'll need to rewrite this anyway.
I can see two ways of implementing this: storing a version number with the database and using DDL scripts to get from older versions to newer versions or storing a reference version of the database structure inside the application, comparing the reference to the database on start-up, and having the application generate DDL commands to upgrade the database.
I think that I'll probably have to implement parts of both. I won't want the application to diff the database against the reference structure every time the application starts (too slow), so I'll need a database structure version number to detect whether the user is using an outdated structure. However, I'm not sure I can trust pre-written scripts to do the structural upgrade when the database could have been partially updated in the past or when the user may have themselves changed the database structure, so I'm inclined to use a reference diff for the actual update.
Researching the question, I've found a couple of database versioning tools, but they all seem targeted towards SQL Server and are implemented outside the actual application. I'm looking for a process that would be tightly integrated into my application and that could be adapted to different database requirements (I know that I'll have to write adapters, custom descendant classes, or event code to handle differences in DDL for various databases; that doesn't bother me).
Does anyone know of anything off the shelf that does this or, failing that, does anyone have any thoughts on:
The best way to store a reference version of a generic relational database structure inside an application.
The best way to diff the reference against the actual database.
The best way to generate DDL to update the database.
Similar story here.
We store a DB version number in a 'system' table and check that on startup. (If the table/field/value doesn't exist then we know it's version 0, where we forgot to add that bit in!)
During development, as and when we need to upgrade the database, we write a DDL script to do the work, and once we're happy that it's working OK it gets added as a text resource to the app.
When the app determines that it needs to upgrade, it loads the appropriate resource(s) and runs it/them. If it needs to upgrade across several versions, it must run each script in order. It turns out to be only a few lines of code in the end.
The main point is that instead of using GUI-based tools to modify tables in an ad-hoc or 'random' manner, we actually write the DDL straight away. This makes it far easier, when the time comes, to build the full upgrade script. And structure diffing isn't required.
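A minimal sketch of the pattern, written as generic SQL for illustration (the table, column, and script contents here are invented, and the exact DDL dialect would depend on the embedded database you target):

-- One-row 'system' table holding the schema version.
CREATE TABLE SystemInfo
(
    DBVersion INT NOT NULL
);
INSERT INTO SystemInfo (DBVersion) VALUES (1);

-- Example of an upgrade script shipped as a text resource for version 2:
-- it applies the structural change, then bumps the recorded version.
ALTER TABLE Customer ADD Email VARCHAR(255) NULL;
UPDATE SystemInfo SET DBVersion = 2;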
I have a blog post here about how I do database versioning for DBISAM and SQL Server.
The important parts are:
Because DBISAM doesn't support views, the version number is stored (along with a bunch of other info) in an ini file in the database directory.
I have a datamodule, TdmodCheckDatabase. This has a TDBISAMTable component for every table in the database. The table component contains all fields in the table and is updated whenever the table is changed.
To make database changes, the following process was used:
1) Increase the version number in the application.
2) Make and test the DB changes.
3) Update the affected tables in TdmodCheckDatabase.
4) If necessary (rarely), add further upgrade queries to TdmodCheckDatabase, e.g. to set the values of new fields or to add new data rows.
5) Generate a CreateDatabase unit script using the supplied database tools.
6) Update the unit tests to suit the new DB.
When the application is run, it goes through the following process:
1) If no database is found, run the CreateDatabase unit and then do step 3.
2) Get the current version number from the database ini file.
3) If it is less than the expected version number, then:
- Run CreateDatabase (to create any new tables)
- Check every table component in TdmodCheckDatabase
- Apply any table changes
- Run any manual upgrade scripts
- Update the version number in the database ini file
A code sample is:

class procedure TdmodCheckDatabase.UpgradeDatabase(databasePath: string; currentVersion, newVersion: integer);
var
  module: TdmodCheckDatabase;
  f: integer;
begin
  module := TdmodCheckDatabase.Create(nil);
  try
    module.OpenDatabase(databasePath);

    for f := 0 to module.ComponentCount - 1 do
    begin
      if module.Components[f] is TDBISAMTable then
      begin
        try
          // if we need to upgrade table to dbisam 4
          if currentVersion <= DB_VERSION_FOR_DBISAM4 then
            TDBISAMTable(module.Components[f]).UpgradeTable;

          module.UpgradeTable(TDBISAMTable(module.Components[f]));
        except
          // logging and error stuff removed
        end;
      end;
    end;

    for f := currentVersion + 1 to newVersion do
      module.RunUpgradeScripts(f);

    module.sqlMakeIndexes.ExecSQL; // have to create additional indexes manually
  finally
    module.DBISAMDatabase1.Close;
    module.Free;
  end;
end;

procedure TdmodCheckDatabase.UpgradeTable(table: TDBISAMTable);
var
  fieldIndex: integer;
  needsRestructure: boolean;
  canonical: TField;
begin
  needsRestructure := false;

  table.FieldDefs.Update;

  // add any new fields to the FieldDefs
  if table.FieldDefs.Count < table.FieldCount then
  begin
    for fieldIndex := table.FieldDefs.Count to table.Fields.Count - 1 do
    begin
      table.FieldDefs.Add(fieldIndex + 1, table.Fields[fieldIndex].FieldName, table.Fields[fieldIndex].DataType, table.Fields[fieldIndex].Size, table.Fields[fieldIndex].Required);
    end;
    needsRestructure := true;
  end;

  // make sure we have correct size for string fields
  for fieldIndex := 0 to table.FieldDefs.Count - 1 do
  begin
    if (table.FieldDefs[fieldIndex].DataType = ftString) then
    begin
      canonical := table.FindField(table.FieldDefs[fieldIndex].Name);
      if assigned(canonical) and (table.FieldDefs[fieldIndex].Size <> canonical.Size) then
      begin
        // field size has changed
        needsRestructure := true;
        table.FieldDefs[fieldIndex].Size := canonical.Size;
      end;
    end;
  end;

  if needsRestructure then
    table.AlterTable(); // upgrades table using the new FieldDef values
end;

procedure TdmodCheckDatabase.RunUpgradeScripts(newVersion: integer);
begin
  case newVersion of
    3: sqlVersion3.ExecSQL;
    9: sqlVersion9.ExecSQL;
    11: begin // change to DBISAM 4
      sqlVersion11a.ExecSQL;
      sqlVersion11b.ExecSQL;
      sqlVersion11c.ExecSQL;
      sqlVersion11d.ExecSQL;
      sqlVersion11e.ExecSQL;
    end;
    19: sqlVersion19.ExecSQL;
    20: sqlVersion20.ExecSQL;
  end;
end;
I'm using ADO for my databases. I also use a version-number scheme, but only as a sanity check. I have a program I developed which uses Connection.GetTableNames and Connection.GetFieldNames to identify any discrepancies against an XML document which describes the "master" database. If there is a discrepancy, then I build the appropriate SQL to create the missing fields. I never drop additional ones.
I then have a dbpatch table, which contains a list of patches identified by a unique name. If specific patches are missing, then they are applied and the appropriate record is added to the dbpatch table. Most often this is new stored procs, field resizing, or indexes.
I also maintain a min-db-version, which is checked as well. Since I allow users to use an older version of the client, I only allow them to use a version that is >= min-db-version and <= cur-db-version.
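A minimal sketch of what such a patch-tracking table and check might look like in SQL (the table layout and patch name here are illustrative, not the poster's actual schema):

-- One row per applied patch, identified by a unique name.
CREATE TABLE dbpatch
(
    PatchName VARCHAR(100) NOT NULL PRIMARY KEY,
    AppliedOn DATETIME     NOT NULL DEFAULT GETDATE()
);

-- Apply a patch only if it has not been recorded yet.
IF NOT EXISTS (SELECT 1 FROM dbpatch WHERE PatchName = 'example_resize_field')
BEGIN
    -- the actual patch DDL would go here
    INSERT INTO dbpatch (PatchName) VALUES ('example_resize_field');
END;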
What I do is store a version number in the database and a version number in the application. Every time I need to change the database structure, I create some code to update the structure of the database and increase the version number in the application. When the application starts, it compares the two numbers and, if need be, runs some code to update the database structure AND update the database's version number. Thus the database is now up to date with the application. My code is something like:
if DBVersion < AppVersion then
begin
  for i := DBVersion + 1 to AppVersion do
    UpdateStructure(i);
end
else
  if DBVersion > AppVersion then
    raise EWrongVersion.Create('Wrong application for this database');
UpdateStructure just runs the necessary code, something like:
procedure UpdateStructure(const aVersion : Integer);
begin
  case aVersion of
    1 : //some db code
    2 : //some more db code
    ...
    ...
  end;
  UpdateDatabaseVersion(aVersion);
end;
You can actually use the same code to create the database from scratch:
CreateDatabase;
for i := 1 to AppVersion do
  UpdateStructure(i);