We have three environments - production, staging, and test. Each environment has multiple databases for our internal back office systems (time collection, accounting, human resources, etc.).
For our employee benefits, we create file feeds that get shipped out to our benefit providers to handle enrollment/termination. Prior to SSIS 2012, we had a set of custom SSIS components that centralized configurations - so depending on where we deployed the package, it would automatically pick up the right connection strings.
Now, our old way worked, but it has its own difficulties (e.g., new developers often forget to set up the package properly and spend time trying to figure it out), so we'd like to move forward with SSIS 2012 and the Project Deployment Model.
I really like the concept of the Environment configurations, but simply creating 3 environments for prod/stage/test doesn't seem like a good idea. A lot of our packages have custom SFTP credentials, file paths to various areas on the network, etc - so those environments would grow very quickly with things that don't really matter to most of our packages.
Is there a way to have one Environment configuration pull in data from another Environment configuration? Something like this would allow me to create a Production Environment config which has the common database servers, and then inherit that within a production config for one of our benefit file feed packages. The benefit file feed package prod config would have all of its other very specific variables.
The catalog stored procedures in SSISDB could be an option (e.g., catalog.create_environment, catalog.create_environment_reference, catalog.create_environment_variable).
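For illustration, here is a rough sketch of scripting a shared environment with those procedures and then referencing it from a project that lives in another folder; the folder names (Common, BenefitFeeds), project name (DentalFeed), and variable value are placeholders, not anything prescribed by SSIS:

USE SSISDB;
GO
-- Create a shared environment holding the common connection strings
EXEC catalog.create_environment
    @folder_name = N'Common',
    @environment_name = N'Production';

-- Add a variable for one of the common database servers
EXEC catalog.create_environment_variable
    @folder_name = N'Common',
    @environment_name = N'Production',
    @variable_name = N'HRConnectionString',
    @data_type = N'String',
    @sensitive = 0,
    @value = N'Data Source=PRODSQL01;Initial Catalog=HR;Integrated Security=SSPI;',
    @description = N'Connection string for the HR database';

-- A package-specific project in a different folder can reference that shared environment
DECLARE @reference_id BIGINT;
EXEC catalog.create_environment_reference
    @folder_name = N'BenefitFeeds',
    @project_name = N'DentalFeed',
    @environment_name = N'Production',
    @environment_folder_name = N'Common',
    @reference_type = 'A',    -- 'A' = absolute reference to an environment in another folder
    @reference_id = @reference_id OUTPUT;

This doesn't make one environment inherit from another, but a project in any folder can reference the shared environment, so the common values only have to be maintained once.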
OK, so I understand what they are: collections of parameter assignments that you can tell a package to use upon execution.
What I'm trying to understand is why or if I need to use them.
I've migrated a bunch of old SSIS packages using the package deployment model to some new servers using the project deployment model. I have one project containing about 35 packages. I've created parameters on all my packages, and I have a couple of project-level parameters for things like server name, etc.
I'm developing on my PC and the packages will have their parameters set to my dev environment settings by default unless I change them.
I'm deploying my packages to 3 servers (Test, UAT, Prod). I deploy them from Visual Studio.
Each server runs an identically scripted SQL job to execute the package.
So now, I need to set my parameters for each environment/server.
Do I need to set up environments, or why can't I just right-click > Configure my project in the SSIS Integration Services Catalog on each server and set the parameters there for the project and each package?
If I create environments, I still need to enter all the parameter values for each server/environment, but then I need to set up the reference between the project and the environment, and set each SQL job to use the relevant environment when executing the job.
Are environments only useful if you have one server, one package catalogue, and one set of SQL Jobs, and you're just using different databases for each environment so you need the environments to toggle between each?
Aren't they overkill if you have your environments on different servers, or am I missing something?
A use case for Environments
As the DBA, I have access to the user names and passwords for systems. I can perform a one-time task of setting up environments that define all the things.
I can then empower the developers to create/deploy and configure their own SSIS projects without having to get involved or giving them a post-it with the associated credentials.
Environment vs Project configuration - I view this as automating whatever makes the most sense. I used to be a big advocate of Environments, but I found migrating them between servers difficult (before I found the magic script), and the fact that you can only have one associated with a project was a limitation I didn't like. At this point, I generally configure a project directly with SSMS and save the script off so I can replace values as I migrate up the environments.
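As a rough sketch, the script that SSMS generates for that kind of direct configuration boils down to calls to catalog.set_object_parameter_value; the folder, project, package, and parameter names below are placeholders:

EXEC SSISDB.catalog.set_object_parameter_value
    @object_type = 20,                 -- 20 = project-level parameter
    @folder_name = N'BenefitFeeds',
    @project_name = N'DentalFeed',
    @parameter_name = N'ServerName',
    @parameter_value = N'PRODSQL01',
    @value_type = 'V';                 -- 'V' = literal value

EXEC SSISDB.catalog.set_object_parameter_value
    @object_type = 30,                 -- 30 = package-level parameter
    @folder_name = N'BenefitFeeds',
    @project_name = N'DentalFeed',
    @object_name = N'DentalEnrollment.dtsx',
    @parameter_name = N'SftpUserName',
    @parameter_value = N'svc_dental',
    @value_type = 'V';

Saving that script off makes it easy to swap in the QA or production values before running it on the next server.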
After an initial publish to a SQL Server 2019 instance, and after the initial creation of my DB project (data-tier application), publishing again fails with a drift report. No changes were made to the database externally or within the VS project.
Drift report:
<Modifications>
<Object Name="[DummySqlLogin]" Parent="" Type="SqlUser" />
</Modifications>
To try to circumvent this, I followed a suggestion in an answer to a question from 2016:
"From your publish config file, use the following"
<ExcludeUsers>True</ExcludeUsers>
<ExcludeLogins>True</ExcludeLogins>
Unlike the OP, this does allow me to publish to my database, which now leaves me with the question of how to deal with logins/passwords, especially in a scenario where we are going to be publishing to different environments.
I was planning on using SQLCMD variables to maintain separate publish profiles for the different types of environments. Within each profile I could specify the passwords, and then within the VS database project place a pre-deployment script which would set up the logins/passwords for the SQL accounts and make use of the SQLCMD variables.
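For illustration, a minimal sketch of that pre-deployment idea; the login name AppServiceLogin and the SQLCMD variable AppServicePassword are placeholders, and each publish profile would supply its own value for the variable:

-- Pre-deployment script: runs before the dacpac changes are applied
IF NOT EXISTS (SELECT 1 FROM sys.server_principals WHERE name = N'AppServiceLogin')
BEGIN
    CREATE LOGIN [AppServiceLogin] WITH PASSWORD = N'$(AppServicePassword)';
END
ELSE
BEGIN
    ALTER LOGIN [AppServiceLogin] WITH PASSWORD = N'$(AppServicePassword)';
END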
Is there no better way of doing this? This probably works fine when you only have 5-10 environments, but what if you have 100?
Note: I want to avoid using commercial tools such as Redgate.
Here is one of the common approaches:
Environments and deploy targets are not project sources.
The same goes for specific logins, users, and passwords. Those are security/administration items and are server-specific. They depend on the environment, not on product features. Why would the product, the project sources, depend on them? Imagine you have to change a password for security reasons on one of the servers. Can you logically connect that to recompiling all the sources?
I'd suggest considering removing all the security items from the SSDT project except maybe roles (and configuring publish.xml to ignore all the removed kinds of objects). Maintaining 100+ servers surely requires different tools and approaches; SSDT and dacpacs have nothing to do with it. The solution could be based on Octopus Deploy, Ansible, or something else.
It is not clear to me how I should use the new features of SSIS in SQL Server 2012/2014 in an enterprise environment. Specifically, I am referring to the project deployment model, project parameters, environments, etc. We use a three-tier environment workflow: developing in development, testing and staging in QA, and production in production. The developers only have access to the development environment. The DBAs migrate code to the other environments. All source is kept in TFS.
What is the intended workflow using these new features? If a developer develops the project/package, does the developer deploy the project to the SSISDB or does the developer stop after checking in the source? Where does the DBA come into the picture? Which environment contains SSISDB? How does the project/package get deployed to the other environments?
There seems to be many “how-to’s” published on the Internet, but I am struggling to find one that deals with the business workflow best practices. Can anyone suggest a link to an article on this subject?
Thanks.
What is the intended workflow using these new features?
It is up to the enterprise to determine how they will use them.
If a developer develops the project/package, does the developer deploy the project to the SSISDB or does the developer stop after checking in the source?
Where does the DBA come into the picture? Which environment contains SSISDB? How does the project/package get deployed to the other environments?
It really does depend. I advocate that developers have sysadmin rights in the development tier of servers. If they break it, they fix it (or if they've really pooched it, we re-image the server). In that scenario, they develop the implementation process and use deployments to Development to simulate the actions the DBAs will take when deploying to all the other pre-production and production environments. This generally satisfies your favorite regulatory standard (SOX/SAS 70/HIPAA/PCI/etc.), as those creating the work are not the same ones who install it.
What is the deliverable unit of work for SSIS packages using the project deployment model? It is an .ispac file. That is a self-contained zip file with a manifest, project-level parameters, project-level connection managers, and the SSIS packages.
How you generate that is up to you. Maybe you check the ispac in, and that is what is deployed to your environments. Maybe the DBAs open the solution from source control and build their own ispac. Maybe you have continuous integration (CI) running, and you click a button and some automated process generates and deploys the ispac.
That's 1/3 of the equation. From the SSISDB side, you likely want to create an Environment and populate it with variable values. Things like Connection Strings and file paths and user names & passwords. When you start creating those things, CLICK THE CREATE SCRIPT TO NEW WINDOW button! Otherwise, you're going to have to re-enter all that data when you lift to a new environment. I would expect your developers to check those scripts into source control. For passwords, blank out the value and make notes in your deployment checklist that they need to fix that before mashing F5.
You also need SQL Scripts to create the structure (folder) within the SSISDB for the project to be deployed into. Once deployed, you'll want to apply the Environment values, created in the preceding step, to the newly deployed project. Save those out as well.
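As a rough sketch, and assuming placeholder folder, project, environment, and parameter names, those saved-off scripts tend to look something like this:

-- Create the folder the project will be deployed into
EXEC SSISDB.catalog.create_folder @folder_name = N'BenefitFeeds';

-- After the .ispac is deployed, reference the environment from the project
DECLARE @reference_id BIGINT;
EXEC SSISDB.catalog.create_environment_reference
    @folder_name = N'BenefitFeeds',
    @project_name = N'DentalFeed',
    @environment_name = N'Production',
    @reference_type = 'R',   -- 'R' = the environment lives in the same folder as the project
    @reference_id = @reference_id OUTPUT;

-- Bind a project parameter to an environment variable instead of a literal value
EXEC SSISDB.catalog.set_object_parameter_value
    @object_type = 20,                            -- project-level parameter
    @folder_name = N'BenefitFeeds',
    @project_name = N'DentalFeed',
    @parameter_name = N'ConnectionString_HR',
    @parameter_value = N'HRConnectionString',     -- name of the environment variable
    @value_type = 'R';                            -- 'R' = reference an environment variable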
I would have each environment contain an SSISDB. I don't want a missed configuration allowing a process in the production tier to reach across to the development tier and pull data. I've seen that; it's not pretty. When code is deployed to the QA/stage tier, we find out quickly whether we missed a connection string somewhere, because the dev servers reject the connection from QA. This means our SQL instances don't all run under the same service account. Each tier gets its own account: domain\SQLServer_DEV, domain\SQLServer_QA, domain\SQLServer_PROD. Do what you can to prevent yourself from having a bad day. If you go with a single/shared SSISDB across all your tiers, it can work, but you're going to have to invest a lot more energy ensuring that packages always run with the correct configuration environment applied, lest bad things happen.
Every shop at which I've worked has had their own cobbled-together, haphazard, poorly understood and poorly maintained method for updating production databases.
I've never seen a consistent method for doing this.
So, in the most recent versions of SQL Server, what is the best practice for updating schema changes and migrating data from a development or test server to a production server?
Is there a 3rd party tool which handles this painlessly?
I'd imagine the ultimate tool would be able to
detect schema changes between two DBs and generate DDL to update one to the other.
include the ability to have custom code which performs custom data migration steps
allow versioning so a v1 db could be updated all the way to a v99 database, running all scripts and migration steps in order.
The three things I've used are:
For schemas
Visual Studio Database Projects. Meh. They are okay, but you still have to do a lot of the work yourself.
Red Gate's SQL Compare and the entire SQL Toolbelt. They've worked pretty hard to make this something you can version control. In practice I've found with databases you are usually trying to get from point A in the version timeline to point B. With binaries, you often just clobber whatever is there with point B (an oversimplification I know, but often true).
http://www.red-gate.com/
xSQL is a good place to start if your system is small and perhaps will remain small:
http://www.xsqlsoftware.com/LiteEdition.aspx
I don't work for or know anyone who works for or get any money from these people. Just telling you what I've done in the past.
For data
Red Gate has SQL Data Compare.
However, if you want something "free" (or included with SQL Server)
I've actually had a lot of success just using BCP and writing a small system that injects and extracts data. Generally when I find myself doing this I ask myself, "Why? If I am changing data, does that mean I am really changing something that is configuration? Can I use a different method here?" But sometimes you can't (maybe it's a legacy system where the original devs thought databases are for everything).
The problem with BCP extracts is that they don't version control very well. There are tricks I've used, like extracting in character mode and stuffing an ORDER BY into the extract query, to try to pull rows out in an order that makes them somewhat more palatable for version control.
For small projects I have used Red Gate to manage schema and data migrations with a lot of success. It is very easy to use and works for most cases.
For larger enterprise systems, the usual approach for schema and data changes is to save all the SQL scripts as text files and run them. We also include a rollback script to run in case something goes wrong during the migration. Run these on the UAT server, then the test/staging/pre-prod server, then production. Saving a copy of all these files plus their rollback scripts should allow you to move between multiple versions of a DB.
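As a rough illustration of that pattern (the table, column, and file names are made up), a change script and its paired rollback can be as simple as:

-- 0042_add_terminated_date.sql
IF NOT EXISTS (SELECT 1 FROM sys.columns
               WHERE object_id = OBJECT_ID(N'dbo.Employee')
                 AND name = N'TerminatedDate')
BEGIN
    ALTER TABLE dbo.Employee ADD TerminatedDate DATE NULL;
END
GO

-- 0042_add_terminated_date_rollback.sql
IF EXISTS (SELECT 1 FROM sys.columns
           WHERE object_id = OBJECT_ID(N'dbo.Employee')
             AND name = N'TerminatedDate')
BEGIN
    ALTER TABLE dbo.Employee DROP COLUMN TerminatedDate;
END
GO

The existence checks make each script safe to rerun if a deployment is only partially applied.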
There is also http://code.google.com/p/migratordotnet/ if you're using .NET; it allows you to define these scripts in code. It is very useful if you want to deploy across multiple DBs in an automated way. It makes it easy to say 'set my DB to version 23' or 'revert my DB to version 5', etc. It works for schema and data, but I would only really use it for a few rows of data.
First, you have to consider that the requirements vary a lot between scenarios:
Customers purchase v1 of the product at Costco and install it in their home office or small business. When v2 comes out, the customer purchases a box of the product and installs it on a new computer, exports the data from the v1 installation, and imports it into the v2 installation. Even though behind the scenes both v1 and v2 use a SQL Express instance, there is no supported upgrade. Schema changes on the deployed databases are not expected (hidden database, non-technical user) and definitely not supported. The only 'upgrade' path supported is an explicit export/import, which probably uses an XML file or something similar.
A business purchases v1 of the product with a support contract. It installs it on its departmental SQL Server instance, from where the data is accessed by the purchased product and by many more integration services, reports, etc. When v2 is released, the customer runs the prescribed upgrade procedure; if it runs into problems, it calls the product vendor's customer support line, which walks the customer through steps specific to its deployment. Database schema customizations are expected and often supported, including in upgrade scenarios, but the schema changes are done by the customer (and are not known at v2 design time).
A web startup has a database that backs the site. Developers make changes on their personal instances and check in changes. An automated build deployment with continuous integration picks up the changes, deploys them against a test instance, and runs build validation tests. The main branch build can be, at any moment, deployed into production. Production is the one database that backs the site. The structure of the production database is documented and understood 100%; every single change to the production database schema occurs through the build system and QA process. On a side note, this is the scenario most SO users asking your question have in mind, minus the part about '100% documented and understood'. I give the example of a database backing a web site, but the deployment can really be anything. The gist of it is that there is only one production database (it may include HA/DR copies, and it may consist of multiple actual SQL Server databases), and it is the only database that has to be upgraded.
A successful web startup. Same as above, but the production database has 5 TB of data, and 5 minutes of downtime makes the CNN headlines. Schema changes may involve setting up replicas and copying data into new schemas with continuous updates, followed by an online switch of operations to the replica. Schema changes are designed by MCM experts, and deploying a schema change can be a multi-week process.
I could go on with more scenarios. The point is that the requirements of each of these cases are so vastly different that no 'state of the art' can answer all of them. Some scenarios will be perfectly OK with a schema-diff deployment tool like vsdbcmd or SQL Compare. Other scenarios will be much better served by explicit versioning scripts. Others might have such specific requirements (e.g., zero downtime) that each upgrade is a project of its own and has to be specifically custom-tailored.
One thing is clear, though, across all scenarios: if your shop treats the development database MDF file as 'source' and makes changes to it using the management tools, that is always a major #fail. All changes should be captured explicitly as some sort of source control artifact, and this is why I most favor explicit version scripts, as in Version Control and your Database. But I reckon that the VSDB project support for compile-time schema validation and its ease of refactoring schema objects make a pretty powerful proposition, and VSDB schema-compare deployment may be OK.
Another important approach that has to be addressed is the code-first schema modeling from tools like EF or LINQ to SQL. It works brilliantly for deploying v1, but fails miserably at any subsequent version. I strongly discourage these approaches.
But to sum up and answer in brief: as of today, the state of the art sucks.
At Red Gate we'd recommend one of two approaches depending on your requirements and how formal you need your processes to be. If you have a development database and simply want to push changes to production, SQL Compare is the tool for the job. A level of versioning can be achieved by using the schema snapshots.
However, if you want full source control benefits, such as team collaboration, sandboxed environments, audit trails, compliance, history, rollback, etc., you should consider SQL Source Control. This links development databases to Team Foundation Server or Subversion.
For the last few years I was the only developer that handled the databases we created for our web projects. That meant that I got full control of version management. I can't keep up with doing all the database work anymore and I want to bring some other developers into the cycle.
We use Tortoise SVN and store all repositories on a dedicated server in-house. Some clients require us not to have their real data on our office servers so we only keep scripts that can generate the structure of their database along with scripts to create useful fake data. Other times our clients want us to have their most up to date information on our development machines.
So what workflow do larger development teams use to handle version management and sharing of databases? Most developers prefer to deploy the database to an instance of SQL Server on their development machine. Should we
Keep the scripts for each database in SVN and make developers export new scripts if they make even minor changes
Detach databases after changes have been made and commit MDF file to SVN
Put all development copies on a server on the in-house network and force developers to connect via remote desktop to make modifications
Some other option I haven't thought of
Never have an MDF file in the development source tree. MDFs are a result of deploying an application, not part of the application sources. Thinking of the database in terms of development source is a shortcut to hell.
All the development deliverables should be scripts that deploy or upgrade the database. Any change, no matter how small, takes the form of a script. Some recommend using diff tools, but I think they are a rat hole. I champion versioning the database metadata and having scripts to upgrade from version N to version N+1. At deployment, the application can check the currently deployed version, and it then runs all the upgrade scripts that bring the version to current. There is no script that deploys the current version straight away: a new deployment first deploys v0 of the database and then goes through all the version upgrades, including dropping objects that are no longer used. While this may sound a bit extreme, it is exactly how SQL Server itself keeps track of the various changes occurring in the database between releases.
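A minimal sketch of that pattern, with made-up table and object names (a version table tracked in the database, plus one script per upgrade step, applied in order):

-- Version metadata lives in the database itself
IF OBJECT_ID(N'dbo.SchemaVersion') IS NULL
BEGIN
    CREATE TABLE dbo.SchemaVersion (
        VersionNumber INT NOT NULL,
        AppliedOn     DATETIME2 NOT NULL DEFAULT SYSUTCDATETIME()
    );
    INSERT INTO dbo.SchemaVersion (VersionNumber) VALUES (0);   -- v0 = empty schema
END
GO

-- Upgrade v0 -> v1: only runs if the database is currently at v0
IF (SELECT MAX(VersionNumber) FROM dbo.SchemaVersion) = 0
BEGIN
    CREATE TABLE dbo.Customer (
        CustomerId INT IDENTITY(1,1) PRIMARY KEY,
        Name       NVARCHAR(200) NOT NULL
    );
    INSERT INTO dbo.SchemaVersion (VersionNumber) VALUES (1);
END
GO

-- Upgrade v1 -> v2: only runs if the database is currently at v1
IF (SELECT MAX(VersionNumber) FROM dbo.SchemaVersion) = 1
BEGIN
    ALTER TABLE dbo.Customer ADD Email NVARCHAR(256) NULL;
    INSERT INTO dbo.SchemaVersion (VersionNumber) VALUES (2);
END
GO

The deployment process simply runs the scripts in order; each one is a no-op unless the database is at exactly the version it upgrades from.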
As simple text scripts, all the database upgrade scripts are stored in version control just like any other sources, with tracking of changes, diff-ing and check-in reviews.
For a more detailed discussion and some examples, see Version Control and your Database.
Option (1). Each developer can have their own up-to-date local copy of the DB ('up to date' meaning recreated from the latest version-controlled scripts: base + incremental changes + base data + run data). In order to make this work, you should have the ability to 'one-click' deploy any database locally.
You really cannot go wrong with a tool like Visual Studio Database Edition. This is a version of VS that manages database schemas and much more, including deployments (updates) to target server(s).
VSDE integrates with TFS so all your database schema is under TFS version control. This becomes the "source of truth" for your schema management.
Typically developers will work against a local development database, and keep its schema up to date by synchronizing it with the schema in the VSDE project. Then, when the developer is satisfied with his/her changes, they are checked into TFS, and a build and then deployment can be done.
VSDE also supports refactoring, schema compares, data compares, test data generation and more. It's a great tool, and we use it to manage our schemas.
In a previous company (which used Agile in monthly iterations), .sql files were checked into version control, and an (optional) part of the full build process was to rebuild the database from production and then apply each .sql file in order.
At the end of the iteration, the .sql instructions were merged into the script that creates the production build of the database, and the script files were moved out. So you're only applying updates from the current iteration, not going back to the beginning of the project.
Have you looked at a product called DB Ghost? I have not personally used it, but it looks comprehensive and may offer an alternative under point 4 in your question.