Database versioning in installed applications using Delphi

I'm working on a number of Delphi applications that will need to upgrade their own database structures in the field when new versions are released and when users choose to install additional modules. The applications are using a variety of embedded databases (DBISAM and Jet currently, but this may change).
In the past I've done this with DBISAM using the user version numbers that can be stored with each table. I shipped an extra, empty set of database files and, at start-up, compared the version numbers of each table using the FieldDefs to update the installed table if necessary. While this worked, I found it clumsy to have to ship a spare copy of the database, and newer versions of DBISAM have changed the table restructuring methodology, so I'll need to rewrite this anyway.
I can see two ways of implementing this: storing a version number with the database and using DDL scripts to get from older versions to newer ones, or storing a reference version of the database structure inside the application, comparing the reference to the database on start-up, and having the application generate DDL commands to upgrade the database.
I think that I'll probably have to implement parts of both. I won't want the application to diff the database against the reference structure every time the application starts (too slow), so I'll need a database structure version number to detect whether the user is using an outdated structure. However, I'm not sure I can trust pre-written scripts to do the structural upgrade when the database could have been partially updated in the past or when the user may have themselves changed the database structure, so I'm inclined to use a reference diff for the actual update.
Researching the question I've found a couple of database versioning tools, but they all seem targeted towards SQL Server and are implemented outside the actual application. I'm looking for a process that would be tightly integrated into my application and that could be adapted to different database requirements (I know that I'll have to write adapters, custom descendant classes, or event code to handle differences in DDL for various databases; that doesn't bother me).
Does anyone know of anything off the shelf that does this or, failing that, does anyone have any thoughts on:
The best way to store a reference version of a generic relational database structure inside an application.
The best way to diff the reference against the actual database.
The best way to generate DDL to update the database.

Similar story here.
We store a DB version number in a 'system' table and check that on startup. (If the table/field/value doesn't exist then we know it's version 0 where we forgot to add that bit in!)
During development, as and when we need to upgrade the database, we write a DDL script to do the work, and once we're happy that it's working OK it gets added as a text resource to the app.
When the app determines that it needs to upgrade, it loads the appropriate resource(s) and runs them. If it needs to upgrade across several versions it runs each script in order. It turns out to be only a few lines of code in the end.
The main point is that instead of using GUI-based tools to modify tables in an ad hoc or 'random' manner, we write the DDL straight away. This makes it far easier, when the time comes, to build the full upgrade script. And structure diffing isn't required.
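For illustration, a minimal sketch of what one of those embedded upgrade scripts might look like; the table names, column names, and exact DDL dialect here are assumptions for the example, not the poster's actual schema:
-- hypothetical script resource UPGRADE_TO_V5: only run when the 'system' table reports version 4
ALTER TABLE Customers ADD COLUMN LoyaltyPoints INTEGER;
CREATE INDEX idxLoyalty ON Customers (LoyaltyPoints);
-- backfill the new column
UPDATE Customers SET LoyaltyPoints = 0 WHERE LoyaltyPoints IS NULL;
-- finally record that the structure is now at version 5
UPDATE SystemInfo SET DBVersion = 5;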

I have a blog post here about how I do database versioning for DBISAM and SQL Server.
The important parts are:
Because DBISAM doesn't support views, the version number is stored (along with a bunch of other info) in an INI file in the database directory.
I have a datamodule, TdmodCheckDatabase. This has a TDBISAMTable component for every table in the database. The table component contains all the fields in the table and is updated whenever the table is changed.
To make database changes, the following process was used:
Increase the version number in the application.
Make and test the DB changes.
Update the affected tables in TdmodCheckDatabase.
If necessary (rarely), add further upgrade queries to TdmodCheckDatabase, e.g. to set the values of new fields or to add new data rows.
Generate a CreateDatabase unit script using the supplied database tools.
Update the unit tests to suit the new DB.
When the application is run, it goes through the following process:
If no database is found, run the CreateDatabase unit and then do step 3.
Get the current version number from the database INI file.
If it is less than the expected version number then:
Run CreateDatabase (to create any new tables).
Check every table component in TdmodCheckDatabase.
Apply any table changes.
Run any manual upgrade scripts.
Update the version number in the database INI file.
A code sample is:
class procedure TdmodCheckDatabase.UpgradeDatabase(databasePath: string; currentVersion, newVersion: integer);
var
  module: TdmodCheckDatabase;
  f: integer;
begin
  module := TdmodCheckDatabase.Create(nil);
  try
    module.OpenDatabase(databasePath);
    for f := 0 to module.ComponentCount - 1 do
    begin
      if module.Components[f] is TDBISAMTable then
      begin
        try
          // if we need to upgrade table to dbisam 4
          if currentVersion <= DB_VERSION_FOR_DBISAM4 then
            TDBISAMTable(module.Components[f]).UpgradeTable;
          module.UpgradeTable(TDBISAMTable(module.Components[f]));
        except
          // logging and error stuff removed
        end;
      end;
    end;
    for f := currentVersion + 1 to newVersion do
      module.RunUpgradeScripts(f);
    module.sqlMakeIndexes.ExecSQL; // have to create additional indexes manually
  finally
    module.DBISAMDatabase1.Close;
    module.Free;
  end;
end;
procedure TdmodCheckDatabase.UpgradeTable(table: TDBISAMTable);
var
  fieldIndex: integer;
  needsRestructure: boolean;
  canonical: TField;
begin
  needsRestructure := false;
  table.FieldDefs.Update;
  // add any new fields to the FieldDefs
  if table.FieldDefs.Count < table.FieldCount then
  begin
    for fieldIndex := table.FieldDefs.Count to table.Fields.Count - 1 do
    begin
      table.FieldDefs.Add(fieldIndex + 1, table.Fields[fieldIndex].FieldName,
        table.Fields[fieldIndex].DataType, table.Fields[fieldIndex].Size,
        table.Fields[fieldIndex].Required);
    end;
    needsRestructure := true;
  end;
  // make sure we have the correct size for string fields
  for fieldIndex := 0 to table.FieldDefs.Count - 1 do
  begin
    if (table.FieldDefs[fieldIndex].DataType = ftString) then
    begin
      canonical := table.FindField(table.FieldDefs[fieldIndex].Name);
      if Assigned(canonical) and (table.FieldDefs[fieldIndex].Size <> canonical.Size) then
      begin
        // field size has changed
        needsRestructure := true;
        table.FieldDefs[fieldIndex].Size := canonical.Size;
      end;
    end;
  end;
  if needsRestructure then
    table.AlterTable(); // upgrades table using the new FieldDef values
end;
procedure TdmodCheckDatabase.RunUpgradeScripts(newVersion: integer);
begin
  case newVersion of
    3: sqlVersion3.ExecSQL;
    9: sqlVersion9.ExecSQL;
    11: begin // change to DBISAM 4
      sqlVersion11a.ExecSQL;
      sqlVersion11b.ExecSQL;
      sqlVersion11c.ExecSQL;
      sqlVersion11d.ExecSQL;
      sqlVersion11e.ExecSQL;
    end;
    19: sqlVersion19.ExecSQL;
    20: sqlVersion20.ExecSQL;
  end;
end;

I'm using ADO for my databases. I also use a version number scheme, but only as a sanity check. I have a program I developed which uses Connection.GetTableNames and Connection.GetFieldNames to identify any discrepancies against an XML document which describes the "master" database. If there is a discrepancy, I build the appropriate SQL to create the missing fields. I never drop additional ones.
I then have a dbpatch table, which contains a list of patches identified by a unique name. If specific patches are missing, they are applied and the appropriate record is added to the dbpatch table. Most often these are new stored procs, field resizings, or indexes.
I also maintain a min-db-version, which is also checked. Since I allow users to run an older version of the client, I only allow them to use a version that is >= min-db-version and <= cur-db-version.
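As a rough sketch of that dbpatch idea in SQL (the table layout, patch name, and T-SQL-style guard are my own illustration, not the poster's actual code):
-- one row per applied patch, identified by a unique name
CREATE TABLE dbpatch (
  patch_name VARCHAR(100) NOT NULL PRIMARY KEY,
  applied_on DATETIME NOT NULL
);
-- each patch is wrapped in a guard so it is only applied once
IF NOT EXISTS (SELECT * FROM dbpatch WHERE patch_name = 'widen_customer_email')
BEGIN
  ALTER TABLE customers ALTER COLUMN email VARCHAR(255);
  INSERT INTO dbpatch (patch_name, applied_on) VALUES ('widen_customer_email', GETDATE());
END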

What I do is store a version number in the database, and a version number in the application. Every time I need to change the database structure, I create some code to update the structure of the database and increase the version number in the application. When the application starts it compares the numbers and, if need be, runs some code to update the database structure AND update the database's version number. Thus the database is now up to date with the application. My code is something like:
if DBVersion < AppVersion then
begin
  for i := DBVersion + 1 to AppVersion do
    UpdateStructure(i);
end
else if DBVersion > AppVersion then
  raise EWrongVersion.Create('Wrong application for this database');
UpdateStructure just runs the necessary code, something like:
procedure UpdateStructure(const aVersion : Integer);
begin
  case aVersion of
    1 : //some db code
    2 : //some more db code
    ...
  end;
  UpdateDatabaseVersion(aVersion);
end;
You can actually use the same code to create the database from scratch:
CreateDatabase;
for i := 1 to AppVersion do
  UpdateStructure(i);

Related

Unit testing SQL scripts

For the below script, written in .sql files:
if not exists (select * from sys.tables where name='abc_form')
CREATE TABLE abc_forms (
  x BIGINT IDENTITY,
  y VARCHAR(60),
  PRIMARY KEY (x)
)
The above script has a bug in the table name.
For programming languages like Java/C, the compiler helps resolve most name resolution issues.
For any SQL script, how should one approach unit testing it? Static analysis...
15 years ago I did something like what you're asking for via a lot of scripting, but we had special formats for the statements.
We had three different kinds of files:
One SQL file to set up the latest version of the complete database schema
One file for all the changes to apply to older database schemas (custom format like version;SQL)
One file for the SQL statements the code uses on the database (custom format like statementnumber;statement)
It was required that every statement was on one line so that it could be extracted with awk!
1) First I set up the latest version of the database by executing one statement after the other and logging the errors to a file.
2) Secondly I did the same for all the changes, to produce a second schema.
3) I compared the two database schemas to find any differences.
4) I filled in some dummy test values in the complete latest schema for testing.
5) Last but not least, I executed every SQL statement against the latest schema with test data and logged every error again.
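(As a minimal illustration of step 5 applied to the script in the question: a T-SQL assertion like the following, run against a scratch database after the script, would have caught the abc_form / abc_forms mismatch. The error text and severity are arbitrary.)
-- fails loudly if the guard name and the created table name have drifted apart
IF NOT EXISTS (SELECT * FROM sys.tables WHERE name = 'abc_form')
  RAISERROR ('Expected table abc_form was not created by the script', 16, 1);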
In the end the whole thing ran every night, and there was no morning without new errors that one of the 20 developers had put into version control. But it saved us a lot of time during the next install at a new customer.
You could also generate the SQL scripts from your code.
Code first avoids these kinds of problems. Choosing between code first or database first usually depends on whether your main focus is on your data or on your application.

How do I execute a SQL Server stored procedure from Informatica Developer (10.1, not Power Center)

I am trying to execute (call) a SQL Server stored procedure from Infa Developer. I created a mapping (new mapping from SQL Query). I am trying to pass it runtime variables from the previous mapping task in order to log these to a SQL Server table (the stored procedure does an INSERT). It generated the following T-SQL query:
?RETURN_VALUE? = call usp_TempTestInsertINFARunTimeParams (?Workflow_Name?, ?Instance_Id?, ?StartTime?, ?EndTime?, ?SourceRows?, ?TargetRows?)
However, it does not validate; the validation log states 'the mapping must have a source' and '... must have a target'. I have a feeling I'm doing this completely wrong. And: this is not PowerCenter (no sessions, as far as I can tell).
Any help is appreciated! Thanks
Now with the comments I can confirm and answer your question:
Yes, Source and Target transformations in Informatica are mandatory elements of the mapping. It will not be a valid mapping without them. Let me try to explain a bit more.
The whole concept of an ETL tool is to Extract data from the Source, do all the needed Transformations outside the database, and Load the data to the required Target. It is possible - and quite often necessary - to invoke stored procedures before or after the data load, and sometimes even to use existing stored procedures as part of the data load. However, from an ETL perspective, this is an additional feature. An ETL tool - Informatica being a perfect example - is not meant to be a tool for invoking SPs. This reminds me of a question any T-SQL developer asks with their first PL/SQL query: what in the world is this DUAL? Why do I need 'from dual' if I just want to do some calculation like SELECT 123*456? That is the theory.
Now in the real world it happens quite often that you NEED to invoke a stored procedure, and that it is the ONLY thing you need to do. Then you do use the DUAL ;) In the PowerCenter world that means you use DUAL as the Source (or actually any table you know exists in the source system), you put 1=2 in the Source Filter property (or put a Filter Transformation in the mapping with FALSE as the condition), and you link just one port to the target. Next, you put the stored procedure call in the Pre- or Post-SQL property of your source or target, depending on where you actually want to run it.
Odd? Well, the odd part is wanting to use the ETL tool as a trigger, not as an ETL tool ;)

Adding a column to a table in SQLite

I've got a table in SQLite, and it already has many rows stored in it. I now realise I need another column in the table. Up to now I've just deleted the database and started again, because the data has just been test data. But now the data in the database can't be deleted.
I know the query to add a column to the table; my question is what is a good way to do this so that it works for both existing users and new users? (I have updated the CREATE query I have for when the table is not found, because it's a new user or an existing user has cleared the database.) It seems wrong to ship software with an ALTER query in it and check every time. Is there some way of telling SQLite to automatically add the column if it doesn't exist during the UPDATE query I now need?
If I discover I need more columns in the future, is having a bunch of ALTER statements on startup (or somewhere?) really the best way to do it?
(If relevant this is for a node js app)
I'd just throw a table somewhere that marks what version of your database it is, and check that to determine if an update is needed. Either that, or if you already have a table that will only ever contain a single record, add a new field 'DatabaseVersion' to it.
So for example if you check the version number, and find it's a version 1 database when the newest version should be version 3, you know which updates to perform on it.
You can use PRAGMA user_version to store the version number of the database and check if the database needs to be updated.
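A rough sketch of how that looks in SQLite itself; the table and column names are placeholders, and the version check would normally be driven from the application code:
-- read the schema version (0 for a freshly created database)
PRAGMA user_version;
-- if the reported version is below 1, apply the missing change and bump the version
ALTER TABLE my_table ADD COLUMN new_column TEXT;
PRAGMA user_version = 1;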

How to store site wide settings in a database?

I'm debating three different approaches to storing sitewide settings for a web application.
A key/value pair lookup table, where each key represents a setting.
Pros: Simple to implement.
Cons: No constraints on the individual settings.
A single-row settings table.
Pros: Per-setting defaults and constraints.
Cons: Lots of settings would mean lots of columns. Not sure if Postgres would have an issue with that.
Just hard-code it, since the settings won't change that often.
Pros: Easy to set up and add more settings.
Cons: Much harder to change.
Thoughts on which way to go?
Since your question is tagged with database/sql I presume you'd have no problem accessing an sql table for both lookup and management of settings... Guessing here, but I'd start with a table like:
settingName    value    can_be_null    minvalue    maxvalue    description
TheAnswer      42       no             1           100         this setting does...
...
If you think about managing a large number of settings, there's more information you need about each one of them than just their current value.
I've used a key/value pair lookup table much in the way you describe with good results.
As an added bonus the table had a "configuration name" column which provided a simple way to choose/activate a specific set of configuration settings. That meant that prod, dev, and test could all live in the same table, though it was up to the application to choose which set to use. In our case a JVM argument made sense. It might make sense to store different "sets" of config settings in the same DB table; then again, it might not.
If you are thinking about file-based configuration, I like INI or YAML. You could still store it in a database, though you probably won't find an INI or YAML column type (as you might for XML).
I would go with the first option -- key/value pair lookup table. It's the most flexible and scalable solution, in my opinion. If you are worried about the cost of running many queries here and there to retrieve various config values, you could always implement some sort of cache, such as loading the whole table at once into memory. In addition to key and value, you could add columns such as "Description", and "Default Value", etc., and build a generic configuration editor that displays the Descriptions, etc., on-screen to help the user edit the config values.
I've seen some commercial applications with a single-row config table, and while I don't have direct experience doing development work against it, it struck me as much less scalable and harder to read.
Following Mike's idea, here is a script to create a table for saving key/value pairs. It includes a mechanism (a constraint) to check that the value is OK with respect to min/max/not-null, and it also automatically creates a function fn_setting_XXXX() to quickly get the value of the corresponding setting (correctly cast).
CREATE TABLE settings
(
  id serial,
  name varchar(30),
  type regtype,
  value text,
  v_min double precision,
  v_max double precision,
  v_not_null boolean default true,
  description text,
  constraint settings_pkey primary key(id),
  constraint setting_unique unique(name),
  constraint setting_type check (type in ('boolean'::regtype, 'integer'::regtype, 'double precision'::regtype, 'text'::regtype))
);
/* constraint to check value */
ALTER TABLE settings
  ADD CONSTRAINT check_value
  CHECK (
    case when type in ('integer'::regtype, 'double precision'::regtype) then
      case when v_max is not null and v_min is not null then
        value::double precision <= v_max and value::double precision >= v_min
      when v_max is not null then
        value::double precision <= v_max
      when v_min is not null then
        value::double precision >= v_min
      else
        true
      end
    else
      true
    end
    and
    case when v_not_null then
      value is not null
    else
      true
    end
  );
/* trigger to create a get function for quick access to the setting */
CREATE OR REPLACE FUNCTION ft_setting_create_fn_get() RETURNS TRIGGER AS
$BODY$
BEGIN
  IF TG_OP <> 'INSERT' THEN
    EXECUTE format($$DROP FUNCTION IF EXISTS fn_setting_%1$I();$$, OLD.name);
  END IF;
  IF TG_OP <> 'DELETE' THEN
    EXECUTE format($$
      CREATE FUNCTION fn_setting_%1$I() RETURNS %2$s AS
      'SELECT value::%2$s from settings where name = ''%1$I''' language sql
    $$, NEW.name, NEW.type::regtype );
  END IF;
  -- NEW is NULL for a DELETE, and returning NULL from a BEFORE trigger would cancel the delete
  IF TG_OP = 'DELETE' THEN
    RETURN OLD;
  END IF;
  RETURN NEW;
END;
$BODY$
LANGUAGE plpgsql;
CREATE TRIGGER tr_setting_create_fn_get_insert
BEFORE INSERT OR DELETE ON settings
FOR EACH ROW
EXECUTE PROCEDURE ft_setting_create_fn_get();
COMMENT ON TRIGGER tr_setting_create_fn_get_insert ON settings IS 'Trigger: automatically create get function for inserted settings';
CREATE TRIGGER tr_setting_create_fn_get_update
BEFORE UPDATE OF type, name ON settings
FOR EACH ROW
WHEN ( NEW.type <> OLD.type OR OLD.name <> NEW.name)
EXECUTE PROCEDURE ft_setting_create_fn_get();
COMMENT ON TRIGGER tr_setting_create_fn_get_update ON settings IS 'Trigger: automatically recreate get function when a setting is renamed or its type changes';
A mixed approach is best. You have to consider what is best for each setting - which largely boils down to who would change each site-wide setting.
If you have a development server and a live server, adding new application settings can be awkward if they are solely in the db. You either need to update the database before you update the code, or have all your code handle the situation where a setting is unavailable. Obviously one common sitewide setting is the database name, and that can't be stored in the database!
You can easily end up with different settings in your test and live environments. I've taken settings away from the DB and into text files before now.
I would recommend having defaults in a 'hardcoded' file, which may then be overridden by a key/value pair lookup table.
You can therefore push up new code without first needing to change the settings stored in the database.
Where there are a varying number of values, or values that are always changed at the same time, I'd store the values as JSON or in another serialised form.
Go with #1. If you want constraints based on simple types, then rather than having a simple string as a value, add a date and number field as well. The individual properties will "know" what value they want. No reason to get all meta about it.
If I had to choose, I'd go with the first option. It is easy to add or remove rows as you need, whereas the single row could end up being a nightmare and is probably a lot less scalable. And for option 3: it's possible you will regret hard-coding your settings in the future, so you definitely don't want to box yourself in.
Although you didn't list is as an option, is XML available? It is easy to set up, and gives you slightly more options, as you can nest settings within settings.
I am using a separate PHP script with just the settings:
$datatables_path = "../lib/dataTables-1.9.4/media";
$gmaps_utils_dir = "../lib/gmaps-utils";
$plupload_dir = "../lib/plupload-1.5.2/js";
$drag_drop_folder_tree_path = "../lib/dhtmlgoodies/drag-drop-folder-tree2";
$lib_dir = "../lib";
$dbs_dir = "../.no_backup/db";
$amapy_api_registration_id = "47e5efdb-d13b-4487-87fc-da7920eb6618";
$google_maps_api_key = "ABQIABBDp7qCIMXsNBbZABySLejWiBSmGz7YWLno";
So it's your third variant.
I don't actually see what you find hard about changing these values; in fact, this is the easiest way to administer these settings. This is not the kind of data you want your users (with different roles) to change via a web interface. Products like PHPMyAdmin and Joomla happily use this approach.
I have used a mixed approach before, in which I put all the settings that are not likely to change into a separate PHP file and kept the individual settings that are likely to change as key/value pairs. That way I could reduce the number of entries in the database, thereby reducing my overall query time; it also helped me keep the key size small.

How to add Stored Procedures to Version Control

Our team just experienced for the first time the hassle of not having version control for our DB. How can we at the very least add stored procedures to version control? The current system we're developing relies mainly on SPs.
Background: I develop a system that has almost 2000 stored procedures.
The critical thing I have found is to treat the database as an application. You would never open an EXE with a hex editor directly and edit it. The same with a database; just because you can edit the stored procedures from the database does not mean you should.
Treat the copy of the stored procedure in source control as the current version. It is your source code. Check it out, edit it, test it, install it, and check it back in. The next time it has to be changed, follow the same procedure. Just as an application requires a build and deploy process, so should the stored procedures.
The code below is a good stored procedure template for this process. It handles both cases: updating an existing procedure and a new install.
IF EXISTS(SELECT name
FROM sysobjects
WHERE name = 'MyProc' AND type = 'P' AND uid = '1')
DROP PROCEDURE dbo.MyProc
GO
CREATE PROCEDURE dbo.MyProc
AS
GO
However, the following sample is better in situations where you control access to the stored procedures, since the DROP-CREATE method loses GRANT information.
IF NOT EXISTS(SELECT name
FROM sysobjects
WHERE name = 'MyProc' AND type = 'P' AND uid = '1')
CREATE PROCEDURE dbo.MyProc
AS
PRINT 'No Op'
GO
ALTER PROCEDURE dbo.MyProc
AS
GO
In addition, creating a process to build the database completely from source control can help in keeping things controlled.
Create a new database from source control.
Use a tool like Red Gate SQL Compare to compare the newly created database with your existing one and identify the differences.
Reconcile the differences.
A cheaper solution is to simply use the "Script As" functionality of SQL Management Studio and do a text compare. However, this method is really sensitive to the exact way SSMS formats the extracted SQL.
I’d definitely recommend some third party tool that integrates into SSMS. Apart from SQL Source Control mentioned above you can also try SQL Version from Apex.
The important thing is to make this really easy for developers if you want them to use it, and the best way is to use a tool that integrates into SSMS.
The 2nd solution from @Darryl didn't work, as suggested by @Moe. I modified @Darryl's template, got it working, and thought it would be nice to share it with everybody.
IF NOT EXISTS(SELECT name FROM sysobjects
              WHERE name = '<Stored Proc Name>' AND type = 'P' AND uid = '1')
    EXEC sp_executesql N'CREATE PROCEDURE dbo.<Stored Proc Name>
    AS
    BEGIN
        select ''Not Implemented''
    END
    '
GO
ALTER PROCEDURE dbo.<Stored Proc Name>
AS
BEGIN
    --Stored Procedure Code
END
This is really nice because I don't lose my stored procedure permissions.
I think it's good to have each stored procedure scripted to a separate .sql file and then just commit those files into source control. Any time a sproc is changed, update the creation script - this gives you full version history on a sproc by sproc basis.
There are SQL Server source control tools that hook into SSMS, but I think they are just scripting the db objects and committing those scripts. Red Gate looks to be due to release such a tool this year, for example.
We just add the CREATE statement to source control in a .sql file, e.g.:
-- p_my_sp.sql
CREATE PROCEDURE p_my_sp
AS
-- Procedure
Make sure that you only put one SP per file, and that the filename exactly matches the procedure name (it makes it so much easier to find the procedure in source control).
You then just need to be disciplined about not applying a stored procedure to your database that hasn't come from source control.
An alternative would be to save the SP as an ALTER statement instead - this has the advantage of making it easier to update an existing database, but means you need to do some tweaking to create a new empty database.
I've been working on this tool http://timabell.github.com/sqlHawk/ for exactly that purpose.
The way to ensure no-one forgets to check in their updated .sql files is by making your build server force the staging and live environments to match source control ;-) (which this tool will assist you with).
