SSDT2012 copy csv file to deploy server - sql-server

I am using SSDT2012 to deploy a database project. I have static data that I want to populate, but it is in a .csv file. I added it to the project but can't see a way to copy it over to the server's temp folder or anything similar.
I tried adding
Thanks for the help!
EDIT: I have been looking at Deployment Contributors, but that is still not a solution. Needing everyone to copy the contributor onto their machines, and then having to maintain and bug-fix it, is not a desirable approach.

The recommended approach to dealing with static data is to use MERGE statements in the pre- or post-deployment scripts:
http://blogs.msdn.com/b/ssdt/archive/2012/02/02/including-data-in-an-sql-server-database-project.aspx
2000 lines is quite a lot, but SQL Server can easily handle it.
Getting your 2000-line csv into a MERGE statement by hand would obviously be a royal pain, so you can use the SQL Server Import wizard to get it into a table (basically just deploy it somewhere) and then deploy sp_generate_merge to create the MERGE statement, which you can then put into your post-deploy script:
https://github.com/readyroll/generate-sql-merge
If you are going to use MERGE statements then, regardless of whether you generate the script automatically or not, I would really recommend this blog post from Alex Whittles to help understand how they work, as they can be quite confusing to start with:
http://www.purplefrogsystems.com/blog/2011/12/introduction-to-t-sql-merge-basics/
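For reference, a minimal post-deployment MERGE for a small lookup table might look something like the sketch below; the dbo.Status table and its columns are made up for the example, not taken from your project.

-- keep dbo.Status exactly in sync with the values listed here
MERGE INTO dbo.Status AS target
USING (VALUES
    (1, 'Open'),
    (2, 'Closed')
) AS source (StatusId, StatusName)
ON target.StatusId = source.StatusId
WHEN MATCHED AND target.StatusName <> source.StatusName THEN
    UPDATE SET StatusName = source.StatusName
WHEN NOT MATCHED BY TARGET THEN
    INSERT (StatusId, StatusName) VALUES (source.StatusId, source.StatusName)
WHEN NOT MATCHED BY SOURCE THEN
    DELETE;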
Finally, be careful when you remove items from your static data: if other tables have foreign keys into the data and you remove an item that the child tables depend on, the MERGE statement will fail. Make sure you deal with any such issues in the pre/post-deployment script before running the MERGE. These scripts should be re-runnable.
Ed

Related

How to run raw SQL to deploy database changes

We intend to create DACPAC files using SQL database projects and distribute them automatically to several environments (DEV/QA/PROD) using Azure Pipelines. I can make changes to the schema of a table, view, function, or procedure, but I'm not sure how we can update specific data in a table. I am sure this is a very common use case, but unfortunately I am having a hard time implementing it.
Any idea how I can automate creating/updating/deleting a row in a table?
E.g.: update myTable set myColumn = 5 where someColumn = 'condition'
In your database project you can add a Post-Deployment Script.
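Post-deployment scripts run on every publish, so they need to be re-runnable. A minimal sketch, reusing the myTable example from the question (the IF EXISTS guard is just one way of making the intent explicit):

-- Post-Deployment Script: applied after every publish, so it must be safe to re-run
IF EXISTS (SELECT 1 FROM myTable WHERE someColumn = 'condition' AND myColumn <> 5)
BEGIN
    UPDATE myTable
    SET myColumn = 5
    WHERE someColumn = 'condition';
END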
Do not. Seriously. I have always found DACPAC to be WAY too limiting for serious operations. Look at how the SQL is generated and realize how little control you have.
The standard approach is to have deployment scripts that you generate and that make the changes in the database, plus a table in the DB tracking which scripts have executed (possibly with a checksum so you do not need to change the script name to update it).
You can easily generate them partially via schema compare (and then generate the change script), but scripts also allow you to do things like data scrubbing and multi-step transformations that DACPAC, by design, cannot do efficiently and easily.
There are plenty of frameworks for this around. They generally belong in the category of developer tools.
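As a rough sketch, such a tracking table could look like the following; the table and column names are just an assumption here, not taken from any particular framework:

CREATE TABLE dbo.DeploymentScript
(
    ScriptName    nvarchar(260) NOT NULL PRIMARY KEY,              -- file name of the deployment script
    Checksum      varbinary(32) NULL,                              -- e.g. HASHBYTES('SHA2_256', ...) of the script body
    ExecutedAtUtc datetime2     NOT NULL DEFAULT SYSUTCDATETIME()  -- when the script was applied
);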

Trying to loop through, and delete from, a list of tables using SSIS

I am currently working on establishing an archive and purge process for our database. I inherited this task from an employee who left a few days ago; he'd been working on it (infrequently) for the past several months, so I've had to spend quite a while retracing his steps, so to speak, and trying to figure out how he had this all set up. We are using a scheduled task to perform our archiving process and I have that part mostly ready to deploy. However, we're using an SSIS package to handle the purge process, and I've never created or modified an SSIS package before, so I'm running into some issues that I'm frankly not even sure where to start troubleshooting.
Essentially, we have a table that we use to store the list of tables which will be archived (and eventually purged), which also records the order that the tables need to be purged in to avoid leaving any orphaned records or FK conflicts. The purpose of the SSIS package then is to load the list of tables, ordered according to the purge order, then loop through each of those tables and delete from them based on a WHERE clause referring to the archive date of each record. I think one of the problems I'm running into is that table names can't be used as parameters in an SQL command, and despite trying a few ways to work around this I can't quite get it to work right.
Based on what I've read elsewhere, I tried to create a variable containing the SQL statement I want to execute and then select that variable as the source for the SQL statement, but I'm still not getting it right. I think I may be pulling the table names incorrectly from the foreach loop that contains the SQL command, but I'm not sure what I'm doing wrong.
I'll post images below of the two SQL commands I'm using if that helps. I can post the foreach loop as well, or different information pertaining to either of the commands. Again I apologize for the vagueness of my question, I'm just not really familiar enough with SSIS yet to actually know what information would be useful to provide. I unfortunately don't have enough rep to post the images directly so I'll just provide links.
Purge order SQL command
Delete SQL command
Yeah, you've set your second SQL Task up wrong. When you select "Variable" as the SQLSourceType, the SourceVariable value should be the name of the SSIS variable that contains the text of your SQL statement.
So instead of "DELETE FROM WHERE DATEADD(..."
Your SourceVariable value should be something like "User::SqlCmd", and you should have populated the SqlCmd variable with the complete SQL command to be executed.
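A rough sketch of what the expression behind SqlCmd could look like, assuming a string variable User::SqlCmd with EvaluateAsExpression set to True, a User::TableName variable populated by the Foreach loop, and an ArchiveDate column standing in for whatever your archive-date column is actually called (the 90-day cutoff is likewise just an illustration):

"DELETE FROM " + @[User::TableName] + " WHERE ArchiveDate < DATEADD(DAY, -90, GETDATE())"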
It seems that you did not bind the result of each iteration of the Foreach Loop container to your TableName variable.
Please double-click your Foreach Loop container,
then go to Variable Mappings,
and make sure index 0 is assigned to the variable TableName. Thanks.
Thanks for the answers; I've solved the problem, though I'm still not quite sure how. Exasperated, I decided on a lark to try deleting the SQL command responsible for performing the purge and then re-creating it the same way. Bafflingly, this fixed the problem.
Like I said in the question, I don't know much at all about how SSIS packages are formed but I imagine there is a lot of metadata and other information stored behind the scenes. I figured since I was picking up a project someone else had been working on there was a chance that the changes I made had gummed up the works somehow. Apparently this was the case. At any rate thanks again for the help.

VSTS build - Incremental database deployment in distributed environment

I have a SQL Server database working with a .NET 2015 MVC 5 application. My database code is source-controlled in an SSDT project. I am using SqlPackage.exe to deploy the database to the staging environment using the .dacpac file created by the SSDT project build process. This is done using a PowerShell task in the VSTS build.
This way I can make DB schema changes in a source-controlled way. Now the problem is the master data insertion for the database.
I use a SQL script file containing the data insertion statements, which is executed as a post-deployment script. This file is also source controlled.
The problem is that initially we prepared the insertion script to target a sprint (taking sprint n as a base), which works well for the first release. But if we update some master data in the next sprint, how should the master data insert script be updated?
Add new update/insert queries at the end of the script file? In this case the post-deployment script will be executed by CI and will try to insert the data again and again in subsequent builds, which will eventually fail if we have made schema changes to the master tables of this database.
Update the existing insert queries in the data insertion script? In this case we also have trouble, because at the post-build event the whole data set will be re-inserted.
Maintain a separate data insertion script per sprint and update the script reference to the new file for the post-build event of SSDT? This approach takes manual effort and is error prone, because the developer has to remember the process. The other problem with this approach is that if we need to set up one more database server in the distributed server farm, the multiple data insertion scripts will throw errors, because SSDT has the latest schema and will create the database with it, while the older data scripts insert data for a previous schema (the sprint-wise DB schema that was changed in later sprints).
So can anyone suggest the best approach, one with less manual effort that can still cover all the above cases?
Thanks
Rupendra
Make sure your pre- and post-deployment scripts are always idempotent. However you want to implement that is up to you. The scripts should be able to be run any number of times and always produce correct results.
So if your schema changes in a way that would affect the deployment scripts, then updating the scripts is a dependency of that change and accompanies it in source control.
Versioning of your database is already a built in feature of SSDT. In the project file itself, there is a node for the version. And there is a whole slew of versioning build tasks in VSTS you can use for free to version it as well. When SqlPackage.exe publishes your project with the database version already set, a record is updated in msdb.dbo.sysdac_instances. It is so much easier than trying to manage, update, etc. your own home-grown version solution. And you're not cluttering up your application's database with tables and other objects not related to the application itself.
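For example, once the database has been registered as a data-tier application, you should be able to see the recorded version with a query along these lines (the database name is a placeholder, and the column names are the ones I recall from the msdb view):

SELECT instance_name, type_name, type_version
FROM msdb.dbo.sysdac_instances
WHERE instance_name = 'YourDatabaseName';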
I agree with keeping sprint information out of the mix.
In our projects, I label source on successful builds with the build number, which of course creates a point in time marker in source that is linked to a specific build.
I would suggest using MERGE statements instead of inserts. This way you are protected from duplicate inserts within a sprint's scope.
The next thing is how to distinguish the inserts for different sprints. I would suggest implementing version numbering to keep the database in sync with the sprints. So create a table DbVersion(version int).
Then in the post-deployment script do something like this:
DECLARE @version int = (SELECT ISNULL(MAX(version), 0) FROM DbVersion);
IF @version < 1
BEGIN
    -- inserts/merge for sprint 1
END
IF @version < 2
BEGIN
    -- inserts/merge for sprint 2
END
...
-- finally record the version this deployment brings the database up to
INSERT INTO DbVersion (version) VALUES (@currentVersion);
What I have done on most projects is to create MERGE scripts, one per table, that populate "master" or "static" data. There are tools such as https://github.com/readyroll/generate-sql-merge that can be used to help generate these scripts.
These get called from a post-deployment script, rather than in a post-build action. I normally create a single (you're only allowed one anyway!) post-deployment script for the project, and then include all the individual static data scripts using the :r syntax. A post-deploy script is just a .sql file with a build action of "Post-Deploy"; it can be created "manually" or by using the "Add New Object" dialog in SSDT and selecting Script -> Post-Deployment Script.
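As an illustration, the single post-deployment script can end up being little more than a list of includes like this (the folder and file names are invented for the example; the paths are relative to the post-deploy script):

-- Post-Deployment Script: pull in one static-data MERGE script per table
:r .\StaticData\dbo.Status.data.sql
:r .\StaticData\dbo.OrderType.data.sql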
These files (including the post-deploy script) can then be versioned along with the rest of your source files; if you make a change to the table definition that requires a change in the merge statement that populates the data, then these changes can be committed together.
When you build the dacpac, all the master data will be included, and since you are using merge rather than insert, you are guaranteed that at the end of the deployment the contents of the tables will match the contents of your source control, just as SSDT/sqlpackage guarantees that the structure of your tables matches the structure of their definitions in source control.
I'm not clear on how the notion of a "sprint" comes into this, unless a "sprint" means a "release"; in this case the dacpac that is built and released at the end of the sprint will contain all the changes, both structural and "master data" added during the sprint. I think it's probably wise to keep the notion of a "sprint" well away from your source control!

SSDT implementation: Alter table instead of Create

We are just trying to implement SSDT in our project.
We have lots of clients for one of our products, which is built on a single DB (DBDB) with tables and stored procedures only.
We created one SSDT project for database DBDB (using VS 2012 > SQL Server Object Browser > right-click on project > New Project).
Once we build that project, it creates one .sql file.
Problem: if we run that file on a client's DBDB, it creates all the tables again and deletes all the records in them [this fulfills the requirements but deletes the existing records :-( ].
What we need: only the changes that are not yet present on the client's DBDB should be applied.
Note: we have no direct access to the client's DBDB database to compare it with our latest DBDB. We can only send them some magic script file which will update their DBDB to the latest state.
The only way to update the client's DB is to compare the DB schemas and then apply the delta. Whichever way you do it, you will need some way to get hold of the schema that's running at the client:
IF you ship a versioned product, it is easiest to deploy version N-1 of that to your development server and compare that to the version N you are going to ship. This way, SSDT can generate the migration script you need to ship to the client to pull that DB up to the current schema.
IF you don't have a versioned product, or your client might have altered the schema, you will need to find a way to extract the schema data on site (maybe using SSDT there) and then let SSDT create the delta.
Option: You can skip the compare feature of SSDT altogether, but then you need to write your migration script yourself. For each modification to the schema, you write the DDL statements yourself and wrap them in IF clauses that check for the old state, so the changes are only made once and only if the old state exists. This way it doesn't really matter from which state to which state you are going, as the script determines for each step whether and what to do.
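For example, a guarded change in such a hand-written migration script might look like this (dbo.Customer and the Email column are made-up names, just to show the pattern):

-- only add the column if it is not already there
IF NOT EXISTS (
    SELECT 1 FROM sys.columns
    WHERE object_id = OBJECT_ID('dbo.Customer') AND name = 'Email'
)
BEGIN
    ALTER TABLE dbo.Customer ADD Email nvarchar(256) NULL;
END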
That last option is the most flexible, but it requires thorough testing of its own and of course should have been started well before the situation you are in now, where you no longer know what the changes have been. But it can help for next time.
This only applies to schema changes on the tables, because you can always fall back to just dropping and recreating ALL stored procedures, since nothing is lost by dropping them.
It sounds like you may not be pushing the changes correctly. You have a couple of options if you've built a SQL Project.
Give them the dacpac and have them use SQLPackage to update their own database.
Generate an update script against your customer's "current" version and give that to them.
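For instance, the two options might look roughly like this on the command line (server, database, and file names are placeholders; adjust paths and connection details for your environment):

REM Option 1: the customer publishes the dacpac against their own database
SqlPackage.exe /Action:Publish /SourceFile:"DBDB.dacpac" /TargetServerName:"TheirServer" /TargetDatabaseName:"DBDB"

REM Option 2: generate an upgrade script against a copy of their current version and send them the script
SqlPackage.exe /Action:Script /SourceFile:"DBDB.dacpac" /TargetServerName:"YourDevServer" /TargetDatabaseName:"DBDB_Previous" /OutputPath:"Upgrade_DBDB.sql"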
In any case, it sounds like your publish option might be set to drop and recreate the database each time. I've written quite a few articles on SSDT SQL Projects and getting started that might be helpful here: http://schottsql.blogspot.com/2013/10/all-ssdt-articles.html

Sql Server Project: Post deployment script(s)

I have a database project and I'm wondering what the best practice is for adding pre-determined data, like statuses, types, etc...
Do I have 1 post deployment script for each status / type? OR
Do I have 1 post deployment script that uses :r someStatus.sql for each status/type script?
I suppose a 3rd option could be to have all inserts in one giant script but that seems awful to me. In the past, I've used option 2, but I'm not sure why it was done this way. Suggestions?
There are tools to package your data.
I have happily used RedGate SQL Packager (not free), and
DBUnit XML data files extracted from the development environment and sent to the database with an Ant <dbunit> task.
For our scenario, we use a combination of #3 and #2. If we have a new build, we populate empty databases, set the post-deploy inserts that we normally use not to run, then populate the data after the entire build/publish. I tend to batch up related inserts as well, so if I'm inserting 15 statuses, I add them all in one script. The downside is that you need to make sure your script can be re-run without causing issues, so inserting into a temp table and then doing a left join against your actual table may be the best solution. It keeps the number of scripts down to a more manageable size.
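A small sketch of that temp-table-plus-left-join pattern (dbo.Status and its columns are invented for the example):

-- stage the desired rows, then insert only the ones that are missing
CREATE TABLE #StatusStage (StatusId int NOT NULL, StatusName varchar(50) NOT NULL);
INSERT INTO #StatusStage (StatusId, StatusName) VALUES (1, 'Open'), (2, 'Closed'), (3, 'Archived');

INSERT INTO dbo.Status (StatusId, StatusName)
SELECT s.StatusId, s.StatusName
FROM #StatusStage AS s
LEFT JOIN dbo.Status AS t ON t.StatusId = s.StatusId
WHERE t.StatusId IS NULL;  -- only rows not already present

DROP TABLE #StatusStage;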
For incremental releases, I tend to batch inserts by Story (using Scrum) so related scripts go together. It also helps me know when a script has been run in production and can be safely removed from the project.
You may also want to look at having a "reference" database of some sort where you only store the reference values, then use a tool such as Red Gate's Data Compare to pull over the appropriate set of data. The Pro version can be automated/scripted, so you may have an easier way to pull in new data for testing. This may be your best solution in the long run, as you can easily set up which tables you want to copy and set filters on the data.
