Is it possible to upgrade directly from DataStage 7.5.3 to 11.7.1?

We are migrating from DataStage 7.5.3 to 11.7.1. I was wondering whether we need to upgrade to an intermediate version of DataStage first. Is there any conversion tool available? Any input from people who have experience with a similar upgrade is appreciated. Thanks

There is no option for an in-place upgrade from DataStage v7 directly to Information Server v11.
You will need to install Information Server 11.7.1 (either on the same machine in a side-by-side configuration, if the machine has enough resources for both environments, or on a new server). You can then export all of your existing DataStage jobs in the v7 environment to a .dsx file that you can import into the new environment.
More information on migration steps can be found here:
https://www.ibm.com/support/knowledgecenter/SSZJPZ_11.7.0/com.ibm.swg.im.iis.productization.iisinfsv.migrate.doc/topics/top_of_map.html
Although this document does not list specific steps for DataStage v7.5, the steps for DataStage v8 are equivalent as long as you export jobs as .dsx files, since istool did not exist in DataStage v7.

There have been many changes to DataStage between versions 7.5 and 11.7 that you need to be aware of when moving jobs from the old release to the new one. We have documented these changes for the DataStage 8.5, 8.7, 9.1, and 11.3 releases. Since you are jumping past all of these releases, all of the documents are relevant; I will link them below and HIGHLY recommend reviewing them, as the changes can affect job behavior and can also result in errors. In some cases these technotes document environment variables that can be set to switch back to the old behavior.
Additionally, in the last few releases a number of the older Enterprise database stages for various database vendors have been deprecated in favor of newer "Connector" stages that did not exist in v7.5. For example, DB2 Enterprise stages should be upgraded to the DB2 Connector, Oracle Enterprise stages to the Oracle Connector, and so on.
We have a client tool, the Connector Migration Tool, which can be used to create a new version of a job with the older stages automatically converted to Connector stages (you will still need to test the jobs).
Also, when exporting jobs from v7.5, export the design only: all jobs need to be recompiled at the new release level, so exporting executables is a waste of space in this case.
If you also need to move hashed files and datasets to the new system, there are technotes on IBM.com that discuss how to do that, though I cannot guarantee that the format of datasets has not changed between 7.5 and 11.7.
You will find that in more recent releases we have tightened error checking, such that things which only received warnings in the past may now be flagged as errors, and conditions that were not reported at all may now be warnings. Examples include changes to null handling, such as when a field in a source stage is nullable but the target stage/database has the field as not nullable. There are also new warnings or errors for truncation and type mismatches (some of those warnings can be turned off by properties in the new Connector stages).
Here are the recommended technotes to review:
Null Handling in a transformer for Information Server DataStage Version 8.5 or higher
https://www.ibm.com/support/pages/node/433863
Information Server Version 8.7 Compatibility
https://www.ibm.com/support/pages/node/435721
InfoSphere DataStage and QualityStage, Version 9.1 Job Compatibility
https://www.ibm.com/support/pages/node/221733
InfoSphere Information Server, Version 11.3 job compatibility
https://www.ibm.com/support/pages/node/514671
DataStage Parallel framework changes may require DataStage job modifications
https://www.ibm.com/support/pages/node/414877
Product manual documentation on deprecated database stages and link to Connector Migration Tool:
https://www.ibm.com/support/knowledgecenter/en/SSZJPZ_11.7.0/com.ibm.swg.im.iis.conn.migtool.doc/topics/removal_stages_palette.html
Thanks.

Related

Migrate stored procedure on SQL Server to HPL/SQL (Hadoop ecosystem)

I have a project which required migrating all the stored procedure from SQL Server to Hadoop ecosystem.
My main concern is whether HPL/SQL has been discontinued or is no longer up to date. The news page at http://www.hplsql.org/new shows the latest release as HPL/SQL 0.3.31, dated September 2017.
Has anyone used this open-source tool, and is this kind of migration feasible based on your experience? Your input is highly appreciated.
I am facing the same issue of migrating a huge number of stored procedures from a traditional RDBMS to Hadoop.
First, this project is still active. It has been included in Apache Hive since version 2.0, which is why it stopped releasing individually in September 2017. You can download the latest version of HPL/SQL from the Hive repo.
You may have a look at the git history for new features and bug fixes.

How can Flyway respect version control and the SDLC?

We are thinking of integrating Flyway into our application but are concerned about the way it maintains its own versions and how that works with the software development life cycle (SDLC).
In essence, our problem with the approach is that you maintain a set of SQL scripts separated by version in the file name, instead of maintaining a trunk in version control and releasing/tagging that trunk as a specific version. With Flyway, a developer could go back and change an old migration script that relates to a released version of your application, and break a version you have already integrated, tested, staged, and shipped to a production environment.
What we are considering is maintaining the SQL migrations in a project under version control (i.e. my-app-db/trunk/migration.sql) and releasing/tagging from there when a SQL developer states it is ready as a release (V1.0.0__blah.sql). The trunk/migration.sql is then wiped so that the next 1.0.1 or 1.1.0 script can be developed and tagged. A wrapper script then exports the SQL files from the tags, calls Flyway with that directory to perform the migration, and cleans up the export.
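A wrapper like that could be sketched in Python as follows. This is a minimal sketch under assumptions: the tag layout (tags/<version>/V<version>__*.sql), the staging approach, and the presence of the flyway CLI on the PATH are all illustrative, not from the original post.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def collect_release_scripts(tags_dir, staging_dir):
    """Copy each tagged release script (e.g. tags/1.0.0/V1.0.0__blah.sql)
    into a flat staging directory that Flyway can scan."""
    staged = []
    for sql in sorted(Path(tags_dir).rglob("V*__*.sql")):
        dest = Path(staging_dir) / sql.name
        shutil.copy(sql, dest)
        staged.append(dest.name)
    return staged


def run_migration(tags_dir, flyway_cmd="flyway"):
    """Export the tagged scripts to a temporary directory, point Flyway at
    it, run the migration, and clean up the export afterwards."""
    with tempfile.TemporaryDirectory() as staging:
        collect_release_scripts(tags_dir, staging)
        subprocess.run(
            [flyway_cmd, f"-locations=filesystem:{staging}", "migrate"],
            check=True,
        )
```

Because Flyway only ever sees the exported tag contents, edits to trunk/migration.sql cannot retroactively affect a shipped version.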
Does this seem like a valid point/approach? Will Flyway ever support something like version control?
Flyway 3.0 will open APIs that will make it possible for end users to extend it in this direction. Out of the box support for SCM integration is currently not on the agenda.

Rake usage/alternatives for SQL Server database management in non-rails environment

Short version:
Is there a way to use rake without the full rails environment/active record to manage a SQL Server database? Are there alternatives to rake to do so that provide the same feature set?
Longer version:
I've done some hobby development using Rails but I haven't used it for work nor do I intend to. But from using it, one thing (among others, of course) that stuck out was how intuitive I found rake db:migrate to be from the standpoint of managing the development life cycle of the database.
I particularly like:
All scripts are ordered for execution with upgrade/downgrade steps separated.
Table generation is inherently scripted (doesn't rely on the Management Studio UI to click, click, click).
Data insertion is explicit as a version/step.
Ease of use
What kind of options are there out there for doing this type of management on a SQL Server database for the lone coder?
rake db:migrate only makes easy migrations easier.
Bigger migrations are still just as hard, and require thinking about the schema changes in terms of what the database actually has to do.
Anyway, I think you're kind of stuck using the entire thing, because db:migrate isn't a database-centric migration - it's really model-centric.
Typically, I like to use something like SQL Compare to go from current production to next production schema.
I do not tend to run multiple migrations to go from 1.0.1 (Release) -> 1.0.2 (Dev) -> 1.0.3 (Dev) -> 1.1.0 (Release) (i.e. the scripts developers used to get to different internal releases), because I want to upgrade a staging environment from one production release directly to the next, exactly as it will happen in production. There is always the possibility that the scripts won't work on real data (or they'll be too slow, or there will be referential integrity issues with the real data).
As far as scripting goes, this is available through SMO, or with tools like Red Gate's or Apex SQL Script.

How do you manage your sqlserver database projects for new builds and migrations?

How do you manage your sql server database build/deploy/migrate for visual studio projects?
We have a product that includes a reasonable database part (~100 tables, ~500 procs/functions/views), so we need to be able to deploy new databases of the current version as well as upgrade older databases up to the current version. Currently we maintain separate scripts for creation of new databases and migration between versions. Clearly not ideal, but how is anyone else dealing with this?
This is complicated for us by having many customers who each have their own db instance, rather than say just having dev/test/live instances on our own web servers, but the processes around managing dev/test/live for others must be similar.
UPDATE: I'd prefer not to use any proprietary products like RedGate's (although I have always heard they're really good and will look into that as a solution).
We use Red Gate SQL Compare and SQL Data Compare to handle this. The idea is simple. Both compare products let you maintain a complete image of the schema or data from selected tables (e.g. configuration tables) as scripts. You can then compare any database to the scripts and get a change script. We keep the scripts in our Mercurial source control and tag (label) each release. Support can then go get the script for any version and use the Red Gate tools to either create from scratch or upgrade.
Redgate also has an API product that allows you to do the compare function from your code. For example, this would allow you to have an automatic upgrade function in your installer or in the product itself. We often use this for our hosted web apps as it allows us to more fully automate the rollout process. In our case, we have an MSBuild task that support can execute to do an automatic rollout and upgrade. If you distribute to third-parties, you have to pay a small additional license fee for each distribution that includes the API.
Redgate also has a tool that automatically packages a database install or upgrade. We don't use that one as we have found that the compare against scripts for a version gives us more flexibility.
The Redgate tools also help us in development because they make it trivial to source control the schema and configuration data in a very granular way (each database object can be placed in its own file).
The question was asked before SSDT projects appeared, but that's definitely the way I'd go nowadays, along with hand-crafting migration scripts for structural db changes where there is data that would be affected.
There's also the MS VSTS method (2008 description here); has anyone got a good article on doing this with 2010, and the pros/cons of using these tools?

Database schema updates

I'm working on an AIR application that uses a local SQLite database and was wondering how I could manage database schema updates when I distribute new versions of the application. Also considering updates that skip some versions. E.g. instead of going from 1.0 to 1.1, going from 1.0 to 1.5.
What technique would you recommend?
In the case of SQLite, you can make use of the user_version pragma to track the version of the database. To get the version:
PRAGMA user_version
To set the version:
PRAGMA user_version = 5
I then keep each group of updates in an SQL file (that's embedded in the app) and run the updates needed to get up to the most recent version:
Select Case currentUserVersion
    Case 1
        // Upgrade to version 2
    Case 2
        // Upgrade to version 3
    Case etc...
End Select
This allows the app to update itself to the most recent version regardless of the current version of the DB.
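The same user_version pattern can be sketched in Python's built-in sqlite3 module. The table names and upgrade statements below are illustrative assumptions; the only part taken from the answer is tracking the schema version with PRAGMA user_version and chaining upgrades until the latest version is reached.

```python
import sqlite3

# Each entry upgrades the schema from version N to version N + 1.
# These statements are placeholders, not a real application schema.
MIGRATIONS = {
    0: ["CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT)"],
    1: ["ALTER TABLE notes ADD COLUMN created_at TEXT"],
}
LATEST = max(MIGRATIONS) + 1


def upgrade(conn):
    """Run whichever migrations are still needed, recording progress
    in SQLite's built-in user_version pragma."""
    version = conn.execute("PRAGMA user_version").fetchone()[0]
    while version < LATEST:
        for stmt in MIGRATIONS[version]:
            conn.execute(stmt)
        version += 1
        # PRAGMA arguments cannot be bound as parameters; version is a
        # trusted integer here, so an f-string is safe.
        conn.execute(f"PRAGMA user_version = {version}")
        conn.commit()
    return version
```

A database at any older version catches up by walking the chain one step at a time, which is what makes a 1.0 -> 1.5 jump work without a dedicated script for that pair of versions.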
We script every DDL change to the DB and when we make a "release" we concatenate them into a single "upgrade" script, together with any Stored Procedures which have changed "since last time"
We have a table that stores the version number of the latest patch applied - so upgrade tools can apply any newer patches.
Every stored procedure is in a separate file. Each starts with an "insert" statement to a logging table that stores the name of the SProc, the version, and "now". (Actually, an SProc is executed to store this; it's not a raw insert statement.)
Sometimes during deployment we manually change an SProc, or roll out odds & ends from DEV, and comparing the log on a client's TEST and PRODUCTION databases enables us to check that everything is at the same version.
We also have a "release" master database, to which we apply the updates, and we use a restored backup of that for new installations (this saves the time of running the scripts, which obviously increases over time). We update it as and when needed, because if it is a bit stale the later patch scripts can simply be applied.
Our Release database also contains sanitised starter data (which is deleted, or sometimes adopted & modified, before a new installation goes live - so this is not included in any update scripts)
SQL Server has a toolbar button to script a change, so you can use the GUI tools to make all the changes but generate a script rather than saving them directly. (Actually, there is a checkbox to always generate a script, so if you forget and just press SAVE it still gives you the script it used after the fact, which can be saved as the patch file.)
What I am considering is adding a SchemaVersion table to the database which holds a record for every version that exists. The last version of the SchemaVersion table is the current level of the database.
I am going to create (SQL) scripts that perform the initial setup of 1.0 and thereafter the upgrade from 1.0 to 1.1, 1.1 to 1.2, etc.
Even a fresh install to e.g. 1.2 will run through all of these scripts. This might seem a little slow, but it is only done once, and on an (almost) empty database.
The big advantage of this is that a fresh install will have the same database schema as an upgraded install.
As I said: I am considering this. I will probably start implementing this tomorrow. If you're interested I can share my experiences. I will be implementing this for a c# application that uses LINQ-to-entities with SQL Server and MySQL as DBMSes.
I am interested to hear anybody else's suggestions and ideas and if somebody can point me out an open source .Net library or classes that implements something like this, that would be great.
EDIT:
In the answer to a different question here on SO I found a reference to Migrator.Net. I started using it today and it looks like it is exactly what I was looking for.
IMO the easiest thing to do is to treat an update from e.g. 1.0 to 1.5 as a succession of updates from 1.0 to 1.1, 1.1 to 1.2, and so forth. For each version change, keep a conversion script/piece of code around.
Then, keep a table with a version field in the database, and compile the required version into the app. On startup, if the version field does not match the compiled-in version, run all the required conversion scripts, one by one.
The conversion scripts should ideally start a transaction and write the new version into the database as the last statement before committing the transaction.
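That chained-upgrade idea can be sketched in Python with sqlite3. The schema_version table, the step scripts, and the target version are all assumptions for illustration; the point taken from the answer is that each conversion step runs in its own transaction, with the write of the new version number as the last statement before commit, so a failed step leaves the recorded version untouched.

```python
import sqlite3

# Each entry upgrades the schema from version N to version N + 1.
# The final statement of every step records the new version, so it
# commits atomically with the schema change itself.
STEPS = {
    1: ["ALTER TABLE users ADD COLUMN email TEXT",
        "UPDATE schema_version SET version = 2"],
    2: ["CREATE TABLE audit (id INTEGER PRIMARY KEY, msg TEXT)",
        "UPDATE schema_version SET version = 3"],
}


def upgrade(conn, target):
    """Apply conversion steps one by one until the database reaches
    the version the application was compiled against."""
    (current,) = conn.execute("SELECT version FROM schema_version").fetchone()
    while current < target:
        with conn:  # one transaction per step; commits on success
            for stmt in STEPS[current]:
                conn.execute(stmt)
        (current,) = conn.execute(
            "SELECT version FROM schema_version").fetchone()
    return current
```

Re-reading the version from the table after each step (rather than trusting a loop counter) means an interrupted upgrade simply resumes from the last committed step the next time the app starts.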
