I am developing a data-driven website, and quite a lot of the programming logic resides in database stored procedures and functions. I find myself changing the stored procs/functions quite often in order to fix bugs or add new functionality. The data (tables) have remained mostly untouched.
The issue I am having is keeping track of versions of the stored procs/functions. Currently I increment the version of the whole database whenever I make a set of changes. As the data is huge (10 GB), I run into problems having to run the development and release versions of the database in parallel.
I wish to put all the stored procs and functions in one database and keep data in one database, so that I can better manage the changes.
I am sure others have encountered a similar situation, and I would welcome suggestions on how best to handle it.
I would also recommend using source control keyword expansion in your stored procedures ($Version:$)
That way you can eyeball, grep, search syscomments, etc to see what version you have on your deployed database.
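As a rough sketch of the "grep for the version" idea: assuming the procedure text carries an expanded keyword such as `$Version: 1.4 $` (the exact keyword name and format depend on your source control system), a small Python helper could pull it out of text fetched from syscomments. The function name and the sample procedure text are hypothetical.

```python
import re

# Matches an expanded source-control keyword such as "$Version: 1.4 $".
# An unexpanded keyword ("$Version:$") yields an empty value.
KEYWORD_RE = re.compile(r"\$(?P<name>[A-Za-z]+):\s*(?P<value>[^$]*?)\s*\$")


def extract_keywords(sql_text):
    """Return a dict of keyword name -> expanded value found in sql_text."""
    return {m.group("name"): m.group("value") for m in KEYWORD_RE.finditer(sql_text)}


# Hypothetical procedure text, as it might come back from syscomments.
sample_proc = """
CREATE PROCEDURE dbo.GetOrders AS
-- $Version: 1.4 $
SELECT * FROM Orders
"""

deployed = extract_keywords(sample_proc)
```

Comparing `deployed["Version"]` with the version in the repository tells you whether the deployed procedure is out of date.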
You can version just the schema dumps. In combination with source control keyword expansion (as suggested by Rawheiser), you just take a look at what version you have in the database, generate a diff and apply it.
Also, there are several excellent tools to compare databases and their schemas, generate DDL scripts etc.: SQL Workbench, Power Architect, DDLUtils and Redgate SQL Compare, to name a few. SQL Compare is likely to work best with SQL Server, although all the others are FOSS and provide a higher ROI (in terms of time spent learning and what you can do with them) as they are platform and RDBMS independent.
Finally, I have to say... I understand that the immediate results you get with logic in the DB are tempting, but if you've gone beyond a couple of procedures in the database, you're setting yourself up for quite a lot of pain, sifting through what easily turns into spaghetti code and locking your application to a single database vendor. You might have your reasons, but I've been there and didn't like it very much. Logic can live very nicely in a different layer.
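A minimal sketch of the "generate a diff" step, assuming both schema dumps are available as plain text. It uses Python's standard `difflib`; the file labels and sample dumps are made up for illustration.

```python
import difflib


def schema_diff(old_dump, new_dump):
    """Return a unified diff between two schema dumps given as strings."""
    return list(
        difflib.unified_diff(
            old_dump.splitlines(),
            new_dump.splitlines(),
            fromfile="deployed_schema.sql",
            tofile="repository_schema.sql",
            lineterm="",
        )
    )


# Hypothetical dumps: the deployed schema and the version in source control.
deployed = "CREATE TABLE customer (\n  id INT\n);"
in_repo = "CREATE TABLE customer (\n  id INT,\n  email VARCHAR(255)\n);"

changes = schema_diff(deployed, in_repo)
```

Note that this is a textual diff only: it shows what changed, but turning it into ALTER statements is still up to you or a dedicated comparison tool.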
For source control you have several options:
Use a Visual Studio Database project.
Use SQL Server 2005's built-in support for source control
Use a third-party tool such as SQL Compare
IMO, option 1 is preferable.
There are many questions like this on Stack Overflow, but it seems to me that in those cases the migration scripts are already in place. For example, the insert and update statements are available, so Flyway can simply use those scripts to create the tables and their data in the target database.
However, my question is: what if we don't have those scripts? For example, the tables were created manually or with some other tool, the data has been inserted over the years by the bound application, and now we want to switch to a different SQL database. Can Flyway be used to transfer all the tables and data given nothing but the connections?
If the answer is no, how can this sort of migration be done, and what are the best practices?
I did a search and went through the Flyway documentation, but it is vague and doesn't give a clear example of this. Some of the tools I found are meant for Salesforce, but I need a tool or library that can be used from Java over a JDBC connection, or from another language such as Python, as our databases are cloud based and, for security reasons, cannot be accessed directly.
For your information, we are using a range of databases: PostgreSQL, Aurora MySQL and SQL Server.
No, Flyway can't do this sort of thing.
Flyway is a deployment tool. While it certainly can include data movement, as with the deployment of database objects, the scripts supporting data movement have to be completely idempotent or completely isolated in their deployment. Neither of these lends itself to what you're talking about.
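To illustrate what "completely idempotent" means for a data script, here is a small sketch using Python's built-in `sqlite3` as a stand-in database. It is not Flyway, and `INSERT OR IGNORE` is SQLite syntax; other databases need their own equivalent (for example a MERGE or an existence check). The table and rows are hypothetical.

```python
import sqlite3

# Idempotent reference-data script: safe to run any number of times.
SEED_SCRIPT = """
CREATE TABLE IF NOT EXISTS country (
    code TEXT PRIMARY KEY,
    name TEXT NOT NULL
);
INSERT OR IGNORE INTO country (code, name) VALUES ('DE', 'Germany');
INSERT OR IGNORE INTO country (code, name) VALUES ('FR', 'France');
"""


def deploy(conn):
    """Apply the seed script; re-running leaves the data unchanged."""
    conn.executescript(SEED_SCRIPT)
    conn.commit()


conn = sqlite3.connect(":memory:")
deploy(conn)
deploy(conn)  # the second run changes nothing
row_count = conn.execute("SELECT COUNT(*) FROM country").fetchone()[0]
```

Running the script twice leaves exactly the same two rows, which is the property a deployment tool relies on when it may re-apply a script.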
What you're talking about is something like Redgate SQL Compare along with SQL Data Compare. These two would allow you to compare two databases, identify the differences, then generate the necessary scripts. I'm aware of no open source tools that do all that, especially that do all that across multiple data platforms. And that tool only supports SQL Server (there is a second one for Oracle, but no others).
The thing is, if you're allowing deployments to occur using manual processes or 3rd party mechanisms, without going through source control as centralized management of your code, you can't use Flyway anyway. Flyway requires a consistent and stable process wherein it is the thing running deployments. Allowing, or even encouraging, drift through out-of-band deployments will break your Flyway deployments.
DISCLOSURE: I work for Redgate, but we're not the solution you're looking for.
I'm working on a database-heavy project, where the Microsoft SQL databases are very mature (16 or more years old), and an old product uses VB6 and ADO to generate SQL which interacts with the database. I've been given the task of porting/rewriting the ancient version as a new .NET version.
I'd love to use LINQ-to-* to ensure easy maintainability, but having tried for the last several weeks I feel like LINQ-to-SQL isn't flexible enough, LINQ-to-Entities has too much overhead, and LINQ-to-Datasets is pointless since I would be just as happy using ADO.NET.
The program operates on two databases at once: one is a database with a very consistent schema containing meta-data, and the other a database which has a varying schema, is tightly coupled to the meta-database, and dictates what information from the meta-database you are interested in at any given time. Furthermore, I need non-LINQ information from both databases (such as system-stored procedures, and system-tables).
Is there any way to use LINQ intelligently here? I'd love the static typing, but if I can't have it I don't want to force my square app into a round framework.
Just an FYI, you can access system tables (and sys stored procs too?) using LINQ. Here is how:
Create a connection to the server you want.
Right-click the server and choose Change View > Object Type.
You should now see System Tables and User Tables. You should see sysjobs there, and you can easily drag it onto a .dbml surface.
Above was stolen from this post.
The best answer seems to be to use ADO.NET completely. I have the option of using LINQ-to-SQL over the metabase and ADO.NET for any other database access, but that would make the code feel too inconsistent for me.
I am using GitHub for maintaining versions and code synchronization.
We are a team of two and we are located in different places.
How can we make sure that our databases are synchronized?
Update:
I am a Rails developer, but these days I'm working on Drupal projects (where the database is the center of variations). So I want to make sure that the team has a synchronized database, including the values in the various tables.
I need something which keeps our data values synchronized.
A centralized database is a good solution, but things get disturbed when someone works offline.
If you use Visual Studio, you can script your database tables, views, stored procedures and functions as .sql files from a database solution and then check those into version control as well. It's what I currently do at my workplace.
If you don't use Visual Studio, you can still script your SQL as .sql files (but with more work) and then version control them as necessary.
Have a look at Red Gate SQL Source Control - http://www.red-gate.com/products/SQL_Source_Control/
To be honest I've never used it, but their other software is fantastic. And if all you want to do is keep the DB schema in sync (rather than full source control) then I have used their SQL Compare product very successfully in the past.
(ps. I don't work for them!)
You can use Sql Source Control together with Sql Data Compare to source control both: schema and data. Here is an article from redgate: Source controlling data.
These are some of the possibilities.
Using the same database. Set-up a central database where everybody can connect to. This way you are sure everybody uses the same database all the time.
After every change, export the database and commit it to the VCS. This option requires discipline and manual labor.
Use some other kind of definition of the schema. For example, Doctrine for PHP has the ability to build the database from a YAML definition which can be stored in the VCS. This can be automated more easily than option 2.
Use some other software/script which updates the database.
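To make the "build the database from a definition" option more concrete, here is a minimal sketch in Python. It is not Doctrine's format or API: the `SCHEMA` dictionary and `build_ddl` helper are hypothetical, and `sqlite3` stands in for the real database. The definition itself is plain data that can be committed to the VCS.

```python
import sqlite3

# Hypothetical declarative schema: table name -> {column name: column type}.
SCHEMA = {
    "node": {"id": "INTEGER PRIMARY KEY", "title": "TEXT NOT NULL"},
    "tag": {"id": "INTEGER PRIMARY KEY", "node_id": "INTEGER", "label": "TEXT"},
}


def build_ddl(schema):
    """Turn the definition into one CREATE TABLE statement per table."""
    statements = []
    for table, columns in schema.items():
        cols = ", ".join(f"{name} {ctype}" for name, ctype in columns.items())
        statements.append(f"CREATE TABLE {table} ({cols});")
    return statements


conn = sqlite3.connect(":memory:")
for statement in build_ddl(SCHEMA):
    conn.execute(statement)

tables = sorted(
    row[0] for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
```

Each developer can rebuild an empty local database from the committed definition instead of passing database exports around.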
I feel your pain. I had terrible trouble getting SQL Server to play nice with SVN. In the end I opted for a shared database solution. Every day I run an extensive script to backup all our schema definitions (specifically stored procedures) for version control into text files. Due to the limited number of changes this works well.
I now use this technique for our major project and personal projects too. The only negative is that it relies on being connected all the time. The other answers suggest that full database versioning is very time consuming and I tend to agree. For "live" upgrades we use the Red Gate tools, they do both schema and data compare and it works very well.
http://www.red-gate.com/products/SQL_Data_Compare/ is the tool we used for keeping databases in sync at our company. Later we had some specific demands, so we had to write our own synchronization code. It depends on how complex your database is and how many changes are happening. It is much simpler if there is a time when no one is working and you can lock the database for synchronization.
Check out OffScale DataGrove.
This product tracks changes to the entire DB - schema and data. You can tag versions in any point in time, and return to older states of the DB with a simple command. It also allows you to create virtual, separate, copies of the same database so each team member can have his own separate DB. All the virtual copies are tracked into the same repository so it's super-easy to revert your DB to someone else's version (you simply check-out their version, just like you do with your source control). This means all your DBs can always be synchronized.
Regarding a centralized DB - just like you don't want to work on the same source code, you don't want to be working on the same DB. It means you'll constantly break each other's code and builds each time someone changes something in the DB.
I suggest that you go with a separate DB for each developer, and sync them using DataGrove.
Disclaimer - I work at OffScale :-)
Try Wizardby. This is my personal project, but I've used it at several previous jobs with a great deal of success.
Basically, it's a tool which lets you specify all changes to your database schema in a database-independent manner and then apply these changes to all your databases.
My DBA just lost some development work that he did on our development database. Poor fella. So naturally our manager asked him, at our status meeting, how this could happen and how we could avoid it happening in the future. "Source control could alleviate the problem," I suggested... The DBA's response: "No, we just back up the server more often." Now I would like to help my DBA understand what source control is and how it fits together with a database schema and development on that schema.
Previously I've tried to explain to him that there's nothing special about the source code behind tables and stored procedures and that it should be in a source control system (TFS in this case). But he just didn't bite. Now, while this mishap is in recent memory, I would like to take another stab at it.
So my question is, do you know of any good advice I could pass on to my DBA and maybe even a couple of resources explaining how you would go about migrating a DB schema to be under source control and find its proper place in the build and deployment processes?
A couple of facts about the environment:
Source Control on a TFS 2008 Server.
Database is a MS SQL server 2008 with >300 tables and >300 other objects (sprocs, triggers, functions etc.).
Clarification:
We have been using DB Ghost and other change management solutions on other projects with other DBAs in the past. We even have the license for VS DB edition! The problem is getting the DBA to even think about this way of developing for the database. He's really old school (i.e. migrating changes manually from environment to environment), and unfortunately he's the only one who knows anything about this particular DB.
See how to version control sql server databases and Do you source control your databases, among many others. Or use the search page. Basically, your approach seems correct. Good luck persuading the DBA...
If you are using Visual Studio Team System, I recommend having a stab at their Database Edition (I think these days it comes with the Developer Edition if you are an MSDN subscriber). It will allow you to script out all your schema, stored procs, views, triggers, etc. and source control these. This should also make the DBA more comfortable, since he will be working with a "Database" version of the tool rather than the "Developer" version (naming can go a long way with people). As you make changes from Visual Studio, you can manage script changes as you work, and source control them.
If your company has an MSDN license, they can use the Visual Studio Database edition. There's a video tutorial of it here.
I have no power of purchase, so I don't know what the cost breakdowns are. But it has the capability of source controlling all the parts of a DB schema, and includes creating change-scripts as well as auto-deploying straight from VS if you want (I wouldn't recommend that).
In general though, it's pretty solid as a database source control option.
Source control for databases can be quite contentious. It's different from using source control for something that produces a binary, because you can't lock the source: a stored proc is a row in a table, and there is no single table to read to get a table definition.
Also, version to version is mostly a set of ALTER statements. You script out CREATEs and add them to source control. This makes it harder to use in cases like this.
To me, this is more a procedural error.
Why was the change not done from a script? Forget where the script lives, but why was there no reproducible and re-runnable script? Perhaps linked to the change tracking number? If the database is reset (loaded from prod), how would the change have been re-applied to prepare for production? And other questions.
I believe in source control and we use it: but it has limits for database work.
First you are approaching this incorrectly. If the dba won't bite on Source Control and he is making errors that affect the system, the person you need to persuade is his boss.
If it helps, I'm from the old school too and I love having our database objects in source control. How nice to be able to revert one table without having to restore the whole database backup to a different location and then move the table. How much faster and simpler. How nice to be able to compare two different versions and see what changed. How nice to deploy a change and know exactly which database changes (say, for instance only twelve of the 23 possible ones) go with the part you are deploying and not some other unfinished project. How nice to know exactly which scripts were involved in a particular change you had to rollback. How nice that nobody is making on-the-fly changes on production since we now require all production changes to be from source control scripts. There are so many fewer errors and issues to worry about.
Yes, it was a change in how we did business, but we did it through a policy change from on high so there was no argument, and the DBAs went through a couple of times and reverted any objects different from source control to the source control version. So now nobody will even think of doing a database change without it being in source control.
As the product manager for SQL Compare I've spoken to many 'traditional' DBAs who are uncomfortable with third party tools mainly because they have a system that works for them and sometimes changing can be difficult. There are many situations where I am convinced that they would benefit from our tools if only they gave them a chance. Frustrating.
One thing you might consider trying is Red Gate's upcoming tool, SQL Source Control. This is designed to build source control into SSMS, in other words it doesn't require DBAs to leave the comfort zone of their management environment. The bad news is that the tool hasn't been released yet. The good news is that we have an Early Access Program. Please visit the following link to find out more about the tool:
http://www.red-gate.com/Products/SQL_Source_Control/index.htm
You can't really put a large database under source control, so your DBA is right.
What you can do practically is put your schema under source control, and maybe a few smallish "configuration" tables.
One way to source control database is to store the data in and about the database separately
You can have all the table, procedure and function scripts as SQL files and add them to source control.
Export the database data as insert statements into SQL files, each with a fixed size. This is a cumbersome process as it would involve a lot of files that are to be tracked and controlled.
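A small sketch of the export step, assuming "fixed size" means a fixed number of rows per file (an interpretation, not something the answer specifies). The helper names and sample rows are hypothetical, and the literal quoting only covers numbers, strings and NULL.

```python
def insert_statement(table, columns, row):
    """Render one row as an INSERT statement (strings quoted, None as NULL)."""
    def literal(value):
        if value is None:
            return "NULL"
        if isinstance(value, (int, float)):
            return str(value)
        return "'" + str(value).replace("'", "''") + "'"

    values = ", ".join(literal(v) for v in row)
    return f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({values});"


def chunk_statements(statements, rows_per_file):
    """Group statements into lists of at most rows_per_file entries, one per file."""
    return [
        statements[i:i + rows_per_file]
        for i in range(0, len(statements), rows_per_file)
    ]


# Hypothetical table data.
rows = [(1, "O'Brien"), (2, None), (3, "Smith")]
statements = [insert_statement("customer", ["id", "name"], r) for r in rows]
files = chunk_statements(statements, rows_per_file=2)
```

Each inner list in `files` would be written to its own .sql file. Exporting rows in a stable order (for example by primary key) keeps the diffs between exports small.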
I am not sure if the VSS/SVN are able to read and keep history of changes to dump files created by the database backup options.
It's not clear from your question whether you want to protect the data in the DB or the schemas in the DB. If the latter, then you could identify all the important schemas and run a cron job that pulls the schema definitions from the DB and inserts them automatically into a source control system (perhaps even via triggers on the schemas??).
But this still just amounts to backing the system up more often. For what you envision you would need source control integrated with the Db tools and I don't know of any product that does that.
(and I shudder to think of VSS integrated into SQL management studio :-(( )
My answer to this same problem was to export all DB objects to text form (more than 136,000 of them) and then create the SourceSafe projects to hold them. Any new or changed objects in the DB now go to the SourceSafe structure, while unchanged ones are left alone.
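The "only new or changed objects" part could look roughly like the sketch below. It assumes the object definitions have already been scripted out of the database into a dictionary; the function name, object names and file layout are hypothetical, and the check-in to source control itself is left out.

```python
import tempfile
from pathlib import Path


def export_changed(definitions, target_dir):
    """Write each object's definition to <name>.sql; return names actually written.

    Files whose content is already identical are left untouched, so the
    version control system only sees real changes.
    """
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    written = []
    for name, text in sorted(definitions.items()):
        path = target / f"{name}.sql"
        if path.exists() and path.read_text(encoding="utf-8") == text:
            continue  # unchanged object: leave the file alone
        path.write_text(text, encoding="utf-8")
        written.append(name)
    return written


# Hypothetical definitions, as they might be scripted out of the database.
objects = {
    "dbo.GetOrders": "CREATE PROCEDURE dbo.GetOrders AS SELECT 1;",
    "dbo.GetUsers": "CREATE PROCEDURE dbo.GetUsers AS SELECT 2;",
}

workdir = tempfile.mkdtemp()
first_run = export_changed(objects, workdir)
objects["dbo.GetUsers"] = "CREATE PROCEDURE dbo.GetUsers AS SELECT 3;"
second_run = export_changed(objects, workdir)
```

The first run writes both files; the second run rewrites only the procedure whose text changed.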
I'm working on a team where people are prone to amending dev SQL Server tables and forgetting about it, or preparing a change for deployment and then having to wait for that deployment. This leaves our dev and live tables inconsistent, causing problems when SPROCs are pushed live.
Is there a tool whereby I can enter a SPROC name and have it check all tables referenced in it in the dev and live DBs, and notify of any differences?
I know two excellent tools for diffing SQL database structures - they don't specifically look inside stored procedures at their text, but they'll show you structural differences in your databases:
RedGate SQL Compare
ApexSQL's SQL Diff
Redgate also has a SQL Dependency Tracker which visualizes object dependencies and could be quite useful here.
Marc
For SQL Server 2005/2008, Open DBDiff works pretty well. The great part about this is that it's free. Also note that I am writing this answer for version 0.9 which currently works for SQL 2005/2008.
It'll show you the schema differences between a source database you specify and a destination database you specify. There are also buttons you can click to update or create the table in question.
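The kind of structural comparison described here can be illustrated with a short sketch. It is not how Open DBDiff works internally; it simply assumes you have already read column metadata for both databases (for example from INFORMATION_SCHEMA.COLUMNS) into dictionaries. All names and sample data are hypothetical.

```python
def diff_tables(dev, live):
    """Compare two {table: {column: type}} mappings and list the differences."""
    differences = []
    for table in sorted(set(dev) | set(live)):
        if table not in live:
            differences.append(f"{table}: missing in live")
            continue
        if table not in dev:
            differences.append(f"{table}: missing in dev")
            continue
        for column in sorted(set(dev[table]) | set(live[table])):
            dev_type = dev[table].get(column)
            live_type = live[table].get(column)
            if dev_type != live_type:
                differences.append(f"{table}.{column}: dev={dev_type} live={live_type}")
    return differences


# Hypothetical column metadata, e.g. as read from INFORMATION_SCHEMA.COLUMNS.
dev_schema = {"orders": {"id": "int", "total": "decimal", "notes": "varchar"}}
live_schema = {"orders": {"id": "int", "total": "money"}, "audit": {"id": "int"}}

report = diff_tables(dev_schema, live_schema)
```

Restricting the input to the tables a given SPROC references would give the per-procedure check asked about in the question.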
I would recommend SQL Compare and SQL Data Compare from Redgate Software. I worked with these tools on several projects and they did a great job. Documenting changes is also a good thing to do, but some changes are too complex to write your own SQL code for (including juggling data around between tables).
The redgate tools create scripts in a matter of seconds and those scripts are almost always correct (some older versions had a hard time with table dependencies in big databases, but when playing around with the statements (in a begin transaction / rollback) I was able to quickly fix those problems).
Another strong point in the redgate suites is that you can save your comparison project. This is especially useful when you don't want to convert a certain table (or data), you can exclude them. When loading the project the next time the software will automatically ignore those tables.
One disadvantage is the cost of the software (smaller companies I worked with did not want to buy it). SQL Compare and SQL Data Compare together will cost you about 800 dollars, but if you look at the time you will save when releasing, you will save a lot of money. There is also a trial you can play around with (30 days, I believe).
SQLDBDiff is a nice, user-friendly, lightweight tool.
SQLDBDiff supports SQL Server 2000 to 2016 and also SQL Azure.
SQLDBDiff is available as a free edition with limited use and as a full edition with a trial.
Try Microsoft Visual Studio Database Edition aka Data Dude (formerly for Database Professionals). It'll do a complete schema comparison and generate the necessary scripts to upgrade the target schema.
Of course, this shouldn't replace a proper build process ;-)
If you need a quick schema comparison tool for SQL Server, you should take a look at dbForge Schema Compare for SQL Server.
I've made a MssqlMerge utility that lets you compare (and merge) MSSQL database data and programming objects. It also lets you search for a particular word or phrase across table definitions and programming objects.