sql server - data backing into local - is it necessary - sql-server

In all the years of my experience, I always connected to a database by creating a new connection using IP address, username and password.
I recently joined a company where they use a desktop application written in VB6 that has a SQL Server backend. Here, the practice is to take a backup of the latest version of the database, restore it under a different name, and use it for testing purposes.
We now have an issue where we have loads of these databases created by users, and they need cleaning up.
My question: Is it possible to have a centralised database which exists remotely, to which everyone connects and gets the data? What do we need to keep in mind to achieve this goal, so that everyone has one single database to access and make changes in?

We've been using a single centralized dev/test environment for over a decade now, with up to 50 full time developers using it -- and I'd say it works quite fine. Most of the changes are new columns in tables, and not that many developers are working with the same tables / modules at the same time, so it doesn't cause that many issues.
All our stored procedures / functions are renamed for each release separately (by adding a release number at the end), and installed automatically by the compilation process, even for developers. For developer builds, the version numbers also include the developer's user ID. This way changing stored procedures in development won't break the test environment, or the procedures other developers are using.
The biggest advantage of this is that we can use similarly sized databases for testing and production.
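As a rough illustration of the naming scheme described above, here is a minimal Python sketch. The function name, the underscore separator and the example values are assumptions made for illustration; the answer does not say what the real format looks like.

```python
def versioned_proc_name(base_name, release, developer_id=None):
    """Build a per-release procedure name; developer builds also get a user ID.

    The "<name>_<release>[_<user>]" layout is an assumed format, not the
    one the answer's team actually uses.
    """
    name = f"{base_name}_{release}"
    if developer_id:
        name = f"{name}_{developer_id}"
    return name


# Shared test build versus one developer's private build of the same procedure.
test_name = versioned_proc_name("GetOrders", "r42")
dev_name = versioned_proc_name("GetOrders", "r42", "jsmith")
print(test_name)
print(dev_name)
```

Because the two names differ, installing the developer's copy leaves the procedure used by the test environment untouched.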

Your ability to do that is really a functional and/or procedural issue. There's nothing technical that prevents you from having a single, shared database for dev/test. The challenge is, dev/test environments tend to be destructive and/or disruptive.
If you have a single DB used for all development and testing requirements, you'll probably get little to no work done. One dev modifying an object (SP, FN, table, view, etc...) can potentially break everyone else (or no one). A tester running stress tests will have everyone else getting mad about slow responses, timeouts, etc... Someone who decides to test Always Encrypted, or even something simpler like TDE, can end up breaking everybody.
Dev environments almost always need their own sandbox before check-in. Checked in code/schema then get tested in a central environment that mimics prod before going to pre-prod that is (ideally) identical to prod. This is pretty basic though each team/company will have its variations.
One thing you could do right away is to automate taking a copy backup of the prod database so you drop a fresh .bak to a common location where everyone can grab from and restore to their own instance. This reduces the impact on your production system and reduces storage consumption. Another benefit is you remove all non-essential access to your production database - this is really, really important. Finally, once this is standard op, you can implement further controls or tasks in the future easily (e.g. restore to a secure instance, obfuscate/mask sensitive data, take new backup for dev/test use).
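A scheduled job for the automated backup could generate its T-SQL along these lines. This is only a sketch: it assumes "copy backup" means a COPY_ONLY backup, and the database name, share path and dated file name are hypothetical. The script only builds the statement; running it (for example from a SQL Server Agent job) is left out.

```python
from datetime import date


def copy_only_backup_sql(database, share_dir, today=None):
    """Return a T-SQL BACKUP statement that writes a dated .bak to a share.

    COPY_ONLY keeps this backup out of the regular backup chain.
    The "<db>_<yyyymmdd>.bak" file naming is an assumption for illustration.
    """
    today = today or date.today()
    file_name = f"{database}_{today:%Y%m%d}.bak"
    path = share_dir.rstrip("\\") + "\\" + file_name
    # Escape the delimiters used for the identifier and the string literal.
    safe_db = database.replace("]", "]]")
    safe_path = path.replace("'", "''")
    return f"BACKUP DATABASE [{safe_db}] TO DISK = N'{safe_path}' WITH COPY_ONLY, INIT;"


# Hypothetical database name and file share.
statement = copy_only_backup_sql("Sales", r"\\fileserver\backups", date(2024, 1, 31))
print(statement)
```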

It is possible but it's usually not a good idea. It would be OK (and no more than OK) if all database access was inquiry only, but imagine the confusion that could arise if developer A fires in some updates to a table that developer B is writing a report for, or if the DB was recovered in an uncontrolled manner. Development and test need a lot of managing, and how many databases you need and where will depend on an analysis of your dev and test requirements.

Thanks for all the answers. We had a discussion in our team and came up with a process that suits us:
There is a master database backed up and restored from the most recent and stable source
Only QA team has got write access to this database
Developers make their own test database using the Master backup
If new data is required, write SQL scripts to add it
Run unit & E2E tests on their copy
Give the new tests and scripts to add new data (if any) to QA
QA runs the tests and data scripts on the Master
When the tests are passed, if there is a SQL update script, then QA restores the Master Database from the backup (to remove data changes made by running the tests), runs the SQL scripts to update the data then backs it up as the new Master
Scripts are added to source control so we have a history
Note: As an extra safeguard we can keep a copy of the very first ever Master database somewhere else. So if anybody ever does something dumb and corrupts it, we can retrieve it and run all the SQL scripts to bring it up to date.
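The safeguard in the note (restore the very first Master and replay every script from source control) can be sketched as follows. This uses Python's built-in sqlite3 as a stand-in for SQL Server, and the table, the script names and the "sort by file name" ordering rule are all assumptions for illustration.

```python
import sqlite3


def rebuild_master(baseline_sql, update_scripts):
    """Recreate the master from the original baseline, then replay scripts.

    update_scripts maps a script name to its SQL; scripts are replayed in
    name order, so a numeric prefix (assumed convention) fixes the sequence.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(baseline_sql)
    for name in sorted(update_scripts):
        conn.executescript(update_scripts[name])
    conn.commit()
    return conn


baseline = "CREATE TABLE parts (id INTEGER PRIMARY KEY, name TEXT);"
scripts = {
    # Deliberately listed out of order: the second script depends on the first.
    "002_rename_bolt.sql": "UPDATE parts SET name = 'hex bolt' WHERE id = 1;",
    "001_add_bolt.sql": "INSERT INTO parts (id, name) VALUES (1, 'bolt');",
}

master = rebuild_master(baseline, scripts)
rows = master.execute("SELECT id, name FROM parts ORDER BY id").fetchall()
print(rows)
```

The replay only works if every data change really did go through a script, which is what the source-control step in the process above is for.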

Related

Rebuilding an unstable tool from scratch (Currently Access based - can go anywhere)

I have inherited a custom built tool that is poorly designed and unstable, and I have a great opportunity to rebuild it from scratch. This is an internal tool only that works almost entirely in Access, and its purpose is to provide higher detail on parts that cost the company over a certain dollar amount.
How it works:
1) The raw data (new part numbers) gets pulled nightly from the EDW via macros in Access.
2) The same macros then join two tables (part numbers from one, names from another). Any part under a certain dollar amount is removed, and the new data is appended to the existing Access database.
3) During the day employees can then open a custom Access form to add more details about the part. Different questions are asked depending on the part category.
4) The completed form is forwarded to management, and the information entered is retained in the Access database – it does not write back to the EDW.
5) Managers can also pull some basic reports from the database, based on overall costs.
The problems:
1) Currently everyone has to have Access installed on their work stations, and whenever there is an update the new database gets pushed to their stations. This is not considered an ideal situation by management or IT.
2) If anyone has left the tool open accidentally at the end of the day the database is locked out, therefore the macros cannot run and the tool cannot be updated with new part numbers.
3) If the tool cannot update for a few days in a row the database can become corrupted. We can restore from the last good backup, but in the past this has resulted in the loss of multiple days of work.
Ideally we want to take the tool completely out of Access. I am building a SharePoint site that can host the tool, which (if I can get it right) will eliminate the need of Access on end-user stations with a database push. However the SharePoint form would need read/write ability.
The big question is: How do I build this?
I have a completely open path of possibilities – I can design it to work any way I want, using any tools or platform I want, as long as it works. It does not have to update automatically, as I already run a number of SQL scripts at the start of my day and adding one more is inconsequential.
The resources I have at my immediate disposal are: SharePoint (with designer), Access, Toad, and SQL Server. The database can be hosted on a shared network drive.
I am a recent college graduate with basic SQL knowledge. I have about a year to produce a final product, but would like to get it up and running far sooner if possible.
Any advice on what direction to pursue would be very helpful, thank you.
Caveat: I've never worked with SQL Server, so I don't know all of its capabilities (I'm an Oracle developer).
What I'd do in your situation is something like the following (although not necessarily in this exact order):
Get a SQL Server database set up to host your tables.
Create the tables etc
Migrate test data across (I'm assuming you have a dev/uat/test environment for your current system! If you haven't, make sure you set up at least a separate test environment to prod for your new db!)
Write stored procs to do the work for adding new parts, updating existing data, etc etc
Set up an automated job on the db (I'm assuming SQL Server can do this!) to do the overnight processing.
Create a separate db user with the necessary permissions to call the stored procedures
Get your frontend to call the stored procs with the relevant parameters using the db user you created in step 6 to connect to the db.
You'd also have to think about transaction control to try and mitigate the case where users go home at the end of the day without committing their work - Does the db handle the commits/rollbacks or does Sharepoint?
Once you've worked out everything in your test environment, it's then a case of creating the prod db, users and objects, and then working out the best way of migrating the prod data across.
Good luck.
Don't forget to get backups for the new db set up as well.

Interchanging of database servers

Hello I am still a newbie for database deployment.
Generally how are changes to a production database deployed for a release?
My client wants an entirely new setup. We have 3 environments: DEV, INT, PROD. He wants to promote INT to PRODUCTION once QA has certified it. This will be fine for the application servers, but because the state of the database is very important, it is a problem for the database: we cannot make the INT database the production one unless we sync the production data to integration. Our database is more than 300GB, so syncing the data will take a lot of time and therefore cause a huge downtime, which is not advisable.
Can you guys please advise me in this scenario.
The most suitable way I know of to do such a synchronization before deployment is an offline integration using copies of the original data. At first it might seem like a heavy process, but it has the advantages of keeping the original data available (problems can always happen during the sync) and allowing you to do all the necessary testing with the data before full deployment.
Here are some tips for deploying to a production database:
Always have current backups of your production database. Just in case something goes wrong.
Make sure you use some kind of source control for your scripts. Just like code. Check in scripts, stored procedures, etc.
Always have rollback scripts for any data updates. For example, have a script that updates 100 records? Write a script that copies the data somewhere else temporarily and that can restore any changes you make. It's easy to test this in DEV and INT. Gives you a bit of peace of mind when making production changes to data.
Always have a recent backup for any schema changes. If you're adding a field to a table, see if you can copy the table to a temp table and then make your changes. Might not always be possible if the table is really large, but again, it lets you quickly rollback in case of an error.
Practice, practice, practice. Practice restoring backups of old production data. Practice running your scripts in DEV and INT. Be ready to re-deploy all stored procedures at any moment.
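The rollback-script tip above (copy the affected rows somewhere else before a data update, so the change can be restored) might look like this in miniature. The sketch uses Python's built-in sqlite3 as a stand-in for the real database; the prices table and the prices_backup name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE prices (id INTEGER PRIMARY KEY, amount REAL);
    INSERT INTO prices VALUES (1, 10.0), (2, 20.0), (3, 30.0);
""")

# 1. Before the update, copy the rows it will touch into a side table.
conn.execute("CREATE TABLE prices_backup AS SELECT * FROM prices WHERE id IN (1, 2)")

# 2. The actual data update.
conn.execute("UPDATE prices SET amount = amount * 2 WHERE id IN (1, 2)")
conn.commit()
updated = conn.execute("SELECT id, amount FROM prices ORDER BY id").fetchall()


def rollback_prices(conn):
    """Rollback script: put the saved values back for the saved rows only."""
    conn.execute("""
        UPDATE prices
        SET amount = (SELECT b.amount FROM prices_backup AS b WHERE b.id = prices.id)
        WHERE id IN (SELECT id FROM prices_backup)
    """)
    conn.commit()


rollback_prices(conn)
restored = conn.execute("SELECT id, amount FROM prices ORDER BY id").fetchall()
print(updated)
print(restored)
```

Rehearsing both halves in DEV and INT, as the tips suggest, is what makes the rollback trustworthy when it is needed in production.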
Another subject that can be tough that you touched on is having production data in INT. I would regularly restore production database backups to INT and DEV. It's well worth it for QA since it provides them with both the quality of production data and the quantity.
I would advise against turning the INT database into production however. Developers and QA will always put in garbage data for testing and you don't want to make that live.

How do you manage databases during development?

My development team of four people has been facing this issue for some time now:
Sometimes we need to be working off the same set of data. So while we develop on our local computers, the dev database is connected to remotely.
However, sometimes we need to run operations on the db that will step on other developers' data, ie we break associations. For this a local db would be nice.
Is there a best practice for getting around this dilemma? Is there something like an "SCM for data" tool?
In a weird way, keeping a text file of SQL insert/delete/update queries in the git repo would be useful, but I think this could get very slow very quickly.
How do you guys deal with this?
You may find my question How Do You Build Your Database From Source Control useful.
Fundamentally, effective management of shared resources (like a database) is hard. It's hard because it requires balancing the needs of multiple people, including other developers, testers, project managers, etc.
Often, it's more effective to give individual developers their own sandboxed environment in which they can perform development and unit testing without affecting other developers or testers. This isn't a panacea though, because you now have to provide a mechanism to keep these multiple separate environments in sync with one another over time. You need to make sure that developers have a reasonable way of picking up each other's changes (data, schema, and code). This isn't necessarily easier. A good SCM practice can help, but it still requires a considerable level of cooperation and coordination to pull it off. Not only that, but providing each developer with their own copy of an entire environment can introduce costs for storage, and additional DBA resource to assist in the management and oversight of those environments.
Here are some ideas for you to consider:
Create a shared, public "environment whiteboard" (it could be electronic) where developers can easily see which environments are available and who is using them.
Identify an individual or group to own database resources. They are responsible for keeping track of environments, and helping resolve the conflicting needs of different groups (developers, testers, etc).
If time and budgets allow, consider creating sandbox environments for all of your developers.
If you don't already do so, consider separating developer "play areas", from your integration, testing, and acceptance testing environments.
Make sure you version control critical database objects - particularly those that change often like triggers, stored procedures, and views. You don't want to lose work if someone overwrites someone else's changes.
We use local developer databases and a single, master database for integration testing. We store creation scripts in SCM. One developer is responsible for updating the SQL scripts based on the "golden master" schema. A developer can make changes as necessary to their local database, populating as necessary from the data in the integration DB, using an import process, or generating data using a tool (Red Gate Data Generator, in our case). If necessary, developers wipe out their local copy and can refresh from the creation script and integration data as needed. Typically databases are only used for integration testing and we mock them out for unit tests so the amount of work keeping things synchronized is minimized.
I recommend that you take a look at Scott Allen's views on this matter. He wrote a series of blog posts which are, in my opinion, excellent.
Three Rules for Database Work,
The Baseline,
Change scripts,
Views, stored procs etc,
Branching and Merging.
I use these guidelines more or less, with personal changes and they work.
In the past, I've dealt with this several ways.
One is the SQL Script repository that creates and populates the database. It's not a bad option at all and can keep everything in sync (even if you're not using this method, you should still maintain these scripts so that your DB is in Source Control).
The other (which I prefer) was having a single instance of a "clean" dev database on the server that nobody connected to. When developers needed to refresh their dev databases, they ran a SSIS package that copied the "clean" database onto their dev copy. We could then modify our dev databases as needed without stepping on the feet of other developers.
We have a database maintenance tool that creates/updates our tables and our procs. We have a server that has an up-to-date database populated with data.
We keep local databases that we can play with as we choose, but when we need to go back to "baseline" we get a backup of the "master" from the server and restore it locally.
If/when we add columns/tables/procs we update the dbMaintenance tool, which is kept in source control.
Sometimes it's a pain, but it works reasonably well.
If you use an ORM such as nHibernate, create a script that generates both the schema & the data in the LOCAL development database of your developers.
Improve that script during the development to include typical data.
Test on a staging database before deployment.
We do replicate production database to UAT database for the end users. That database is not accessible by developers.
It takes less than a few seconds to drop all tables, create them again and inject test data.
If you are using an ORM that generates the schema, you don't have to maintain the creation script.
Previously, I worked on a product that was data warehouse-related, and designed to be installed at client sites if desired. Consequently, the software knew how to go about "installation" (mainly creation of the required database schema and population of static data such as currency/country codes, etc.).
Because we had this information in the code itself, and because we had pluggable SQL adapters, it was trivial to get this code to work with an in-memory database (we used HSQL). Consequently we did most of our actual development work and performance testing against "real" local servers (Oracle or SQL Server), but all of the unit testing and other automated tasks against process-specific in-memory DBs.
We were quite fortunate in this respect that if there was a change to the centralised static data, we needed to include it in the upgrade part of the installation instructions, so by default it was stored in the SCM repository, checked out by the developers and installed as part of their normal workflow. On reflection this is very similar to your proposed DB changelog idea, except a little more formalised and with a domain-specific abstraction layer around it.
This scheme worked very well, because anyone could build a fully working DB with up-to-date static data in a few minutes, without stepping on anyone else's toes. I couldn't say if it's worthwhile if you don't need the install/upgrade functionality, but I would consider it anyway because it made the database dependency completely painless.
What about this approach:
Maintain a separate repo for a "clean db". The repo will be a sql file with table creates/inserts, etc.
Using Rails (I'm sure could be adapted for any git repo), maintain the "clean db" as a submodule within the application. Write a script (rake task, perhaps) that queries a local dev db with the SQL statements.
To clean your local db (and replace with fresh data):
git submodule init
git submodule update
then
rake dev_db:update ......... (or something like that!)
I've done one of two things. In both cases, developers working on code that might conflict with others run their own database locally, or get a separate instance on the dev database server.
Similar to what @tvanfosson recommended, you keep a set of SQL scripts that can build the database from scratch, or
On a well defined, regular basis, all of the developer databases are overwritten with a copy of production data, or with a scaled down/deidentified copy of production, depending on what kind of data we're using.
I would agree with all that LBushkin has said in his answer. If you're using SQL Server, we've got a solution here at Red Gate that should allow you to easily share changes between multiple development environments.
http://www.red-gate.com/products/sql_source_control/index.htm
If there are storage concerns that make it hard for your DBA to allow multiple development environments, Red Gate has a solution for this. With Red Gate's HyperBac technology you can create virtual databases for each developer. These appear to be exactly the same as ordinary database, but in the background, the common data is being shared between the different databases. This allows developers to have their own databases without taking up an impractical amount of storage space on your SQL Server.

Step-by-step instructions for updating an (SQL Server) database?

Just a question about best-practices when upgrading an existing database. Assuming there will be all kinds of modifications to the data itself, the structure, the relations, additional columns, disappearing columns and whatever more.
My problem is a simple one. I'm working on a project that will use SQL Server. No problem there, since I'm enough of an expert to handle this. But this project will be upgraded later on and I need to specify a protocol that needs to be followed by the upgrade mechanism. Basically, this protocol needs to be followed when creating upgrade scripts...
Right now, I have these simple steps:
Add the new columns to the tables.
Add constraints to the new columns.
Add new tables.
Drop constraints where needed.
Drop columns that need to be removed.
Drop tables that need to be removed.
Somehow, this list feels incomplete. Is there a more extended list somewhere describing the proper steps which needs to be followed during an upgrade?
Also, is it always possible to do a complete upgrade within a single database transaction (with SQL Server) or are there breakpoints that need to be included within the protocol where one transaction should end and another one starts?
While automated tools will provide a nice, automated solution, I still can't really use them. The development team working on this system has 4 developers, each with their own database on their local system. Every developer keeps track of their own updates to the structure and keeps track of them by generating both an Upgrade and Downgrade script for his own modifications, both for structural changes and data changes. These scripts can then be used by the other developers to keep their own system up-to-date. Whenever the system is going to be released, those scripts are all merged into one big script.
The system does not include any stored procedures or other "special" features. The database is just that: a data storage with just tables and relations between them. No roles, no users, no stored procedures, no triggers, no complex datatypes...
The DB is used by an application where users work from 9-to-5 so shutting down can be done easily, including upgrades for the clients. We also add a version number to the database and applications will check if they're linked to the correct database version.
During development, all developers use their own database instance, which they can fully control. Since we're not the ones who use the application, we tend to develop for the Express edition, not any more expensive one. To be honest, we don't develop our application to support a lot of users, but we'll inform our users that since it uses SQL Server, they could install the system on a bigger SQL Server platform, according to their own needs. They will need their own DBA for this, though. We do have a bigger SQL Server available for ourselves, which we also use for our own web interface, but this server is located in a special dataserver where it is being maintained for us, not by us.
The project previously used MS Access for its data storage and was intended for single-user environments, but as it turned out, many users still decided to share their databases, and this has shown that the datamodel itself is reliable enough for multi-user environments. So we migrated to SQL Server to support smaller offices with 3 or more users and some big organisations who will have 500 or more users at the same time.
Since we need to keep the cost of the software low, we don't have a big budget to spend on expensive tools or a more expensive server.
Check out Red-Gate's SQL Compare (structure comparison), SQL Data Compare (data comparison), and SQL Packager (for packaging up update scripts into a C# project or a .NET executable).
They provide a nice, clean, fully functional and easy-to-use solution for all your database upgrade needs. They're well worth their license fees - they pay for themselves in a few weeks or months.
Highly recommended!
In my opinion, it's an absolute bear doing these manually. For Microsoft SQL Server, I'd recommend using the Database edition of Team System, since it includes complete source control capabilities for your database, and can automatically build your scripts for upgrading/downgrading versions.
Another option is SQLCompare with Redgate, which can also handle these kinds of upgrades/downgrades, and will result in a very nice SQL script. I've used both, and keeping the historic scripts has helped us troubleshoot issues and resolve many a mystery.
If you are working with a manual script as above, don't forget to also account for SP changes in your scripts. Also, any hand-edited script should be able to be executed multiple times on a database - i.e. if your script includes a table creation or drop, be sure to check for existence first, otherwise your script will fail if executed back to back.
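A re-runnable script of the kind described above checks for each object before creating it. The sketch below shows the idea with Python's built-in sqlite3 as a stand-in; the customers table and email column are hypothetical. On SQL Server the same checks would typically be written against the catalog, for example with OBJECT_ID or COL_LENGTH.

```python
import sqlite3


def table_exists(conn, table):
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?", (table,)
    ).fetchone()
    return row is not None


def column_exists(conn, table, column):
    # PRAGMA table_info returns one row per column; index 1 holds the name.
    return any(row[1] == column for row in conn.execute(f"PRAGMA table_info({table})"))


def upgrade(conn):
    """Upgrade script that can safely be executed back to back."""
    if not table_exists(conn, "customers"):
        conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    if not column_exists(conn, "customers", "email"):
        conn.execute("ALTER TABLE customers ADD COLUMN email TEXT")
    conn.commit()


conn = sqlite3.connect(":memory:")
upgrade(conn)
upgrade(conn)  # second run finds everything in place and changes nothing
columns = [row[1] for row in conn.execute("PRAGMA table_info(customers)")]
print(columns)
```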
Again, while it's possible to build a manual protocol I'd still fall back on using one of the purpose-built tools out there, and both Team System and SQL Compare will be able to output scripts that you could include as part of an installation/upgrade package.
With database updates I always believe it should be all or nothing. If any of the DB updates fail your application will be left in an unknown state that could be harmful to the data so I think it is best practice to either apply them all or none (1 transaction around them all).
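Here is a small sketch of the all-or-nothing idea, using Python's built-in sqlite3 as a stand-in. It shows one failing statement undoing the earlier schema change in the same transaction. The table names are hypothetical, and the sketch does not establish whether every statement in a real SQL Server upgrade can run inside a single transaction.

```python
import sqlite3


def apply_all_or_nothing(conn, statements):
    """Run every statement in one transaction; undo all of them on any error."""
    conn.execute("BEGIN")
    try:
        for sql in statements:
            conn.execute(sql)
    except sqlite3.Error:
        conn.execute("ROLLBACK")
        return False
    conn.execute("COMMIT")
    return True


# isolation_level=None puts the connection in autocommit mode, so the
# explicit BEGIN/COMMIT/ROLLBACK above fully control the transaction.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY)")

ok = apply_all_or_nothing(conn, [
    "ALTER TABLE orders ADD COLUMN status TEXT",
    "ALTER TABLE no_such_table ADD COLUMN x TEXT",  # fails: table does not exist
])
columns = [row[1] for row in conn.execute("PRAGMA table_info(orders)")]
print(ok, columns)
```

The first ALTER succeeded on its own, but because the second one failed, the table ends up exactly as it was before the upgrade started.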
I also like to backup the database before applying updates so that if anything does go wrong the database can be rolled back (this has saved me numerous times when working with live data).
Hope this helps.
Best practices for upgrading a production database schema actually look pretty bad on the surface. Unless you can completely shut down your system for the upgrade, which is often not possible, your changes all need to be backwards compatible. If you have many clients accessing the database, you can't update them all simultaneously, so any schema changes you make need to allow old code to run.
That means never renaming a column, and making all new columns nullable. This doesn't mean you leave it like that forever. You write two scripts, one for the initial change, which is backwards compatible, then another to clean things up after all clients have been updated.
Automated tools are great for validation of schemas, but they are not so good when it comes to actually modifying a complex system. You should break your changes up into many small, discrete change scripts so each can be run manually. If there's a failure, it's easier to pinpoint the cause and fix it. Basically, each feature gets its own script. Give each a unique name and then store that name in the database itself when you run the script so you can query the database to find out what's been run and what hasn't. This is invaluable when you have instances on developer's machines, test servers, production, etc.
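Recording each script's name in the database, as suggested above, can be sketched like this. The example uses Python's built-in sqlite3 as a stand-in for SQL Server; the applied_scripts table and the script names are hypothetical.

```python
import sqlite3


def run_pending(conn, scripts):
    """Run each named change script once and record its name in the database.

    scripts is an ordered list of (unique_name, sql) pairs.
    Returns the names that were actually run this time.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS applied_scripts ("
        "name TEXT PRIMARY KEY, applied_at TEXT DEFAULT CURRENT_TIMESTAMP)"
    )
    already = {row[0] for row in conn.execute("SELECT name FROM applied_scripts")}
    ran = []
    for name, sql in scripts:
        if name in already:
            continue
        conn.execute(sql)
        conn.execute("INSERT INTO applied_scripts (name) VALUES (?)", (name,))
        conn.commit()
        ran.append(name)
    return ran


scripts = [
    ("001_create_items", "CREATE TABLE items (id INTEGER PRIMARY KEY)"),
    ("002_add_items_label", "ALTER TABLE items ADD COLUMN label TEXT"),
]

conn = sqlite3.connect(":memory:")
first = run_pending(conn, scripts)
second = run_pending(conn, scripts)  # nothing left to do
applied = [row[0] for row in conn.execute("SELECT name FROM applied_scripts ORDER BY name")]
print(first, second, applied)
```

Querying applied_scripts on a developer machine, a test server or production shows at a glance which changes that instance has and has not received.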

Is it 'ok' to develop with a DEV database residing on the same SQL server as the live production app?

Sometimes we have upwards to 4-6 people either RDPed looking at data in SQL Management Studio, or hitting the server with LINQpad, Toad, etc from various locations while developing in mostly ASP.NET and Flex with WebOrb. Is this bad? Bad in the sense that we are trying to keep our live production app stable and as lag free as possible for global users?
I don't think I'd do that. If it was just me, then sure :) But if there's a bunch of people, god only knows what queries they might run. We always use a test server for such things.
best regards,
don
Best practice would be separate servers. Next best, separate instances on the same server. Next best, separate databases on an instance.
However, I wouldn't be letting any developers RDP into a production SQL Server (or production anything), regardless of choice of segregation mechanism. Use a separate terminal server with tools and everything there.
You can have the dev and prod DBs on the same instance. Just make sure the permissions are set up so that developers cannot touch the prod DB. The negative is that a long running query in dev will impact prod.
In SQL Server 2005 a better solution is to have a dev "instance" and a prod "instance".
Then if someone misbehaves on the dev instance you can just bring down that instance.
In SQL Server 2008 you can set up CPU usage plans, which can help throttle how many resources can be used. You should investigate that.
It depends on a lot of variables. It's generally better to have them on different servers. This is really depending on how you use sql server. If you just have databases, don't use a lot of the management tools, like nightly processes to alter data and other jobs you might be OK. You are running a real risk of having bleed over code from developing on the dev database to the production one though. It's safer to have them separated out, especially for the small amount of money needed to create a dev instance of sql server.
I find this a poor practice for several reasons:
First suppose one of your devs messes up and does something that ends up taking all of the processing power of your server. Oops prod is down for no good reason.
Second, devs could easily change the wrong database. Oops prod is down for no good reason. At least you can avoid this by not giving any production rights to devs (which you should be doing anyway as a best practice.)
Third, if the database is on the same server it has to have a different name; this can make moving things to prod difficult and error prone. I think it also means it will be less likely that you deploy correctly through source controlled scripts. If you choose to copy objects from one database to the other, then you can have issues with that as well. First, if there is data in the object already, you may accidentally wipe it out (hope you have a backup), or you may move the new table structure but miss things like the PKs and FKs and default values and triggers and constraints and indexes, or the wizard might take much longer to do the move because in the background it is creating and populating a new table and then dropping the old and renaming the new one rather than using alter table. Oops prod is down or seriously slowed for no good reason.
I tend to agree with the "separate servers" folks, although with my company we actually do most of our day to day development work on our local machines -- so we have SQL Server installed locally. This can be a pain, of course, if you're developing reporting or something that needs production data. In that scenario, developers here usually get a subset of production data exported to work with.
For acceptance testing vs. deployment though, we do use separate instances.
Developers probably shouldn't have production access UNLESS they're also the ones who do application deployments (as can be the case with small teams like the one I'm in). If you do end up using separate DBs on the same server, I would at least lock down RDP access and grant access to each development DB on an individual basis. That's how it works here -- I don't have admin rights to any of our servers at this time, and can only admin databases for applications that belong specifically to my team.
It depends how much you value your live service. I know I wouldn't trust me and my fat hands running SQL on the same hardware as a live application.
Even if the application is not business critical, and the app is not data-bound, you can set up a development environment on an unused desktop machine, so why wouldn't you do that instead of take the risk?
The set up I use is typically DEV database on a local instance of SQL Server (Development Version for me, but Express would probably also work), a QA database on a test instance of SQL Server. In our environment, this is located on a virtual instance of W2K3 -- soon to be W2K8. Production databases live either on dedicated instances of SQL server or on one of various clustered instances. We don't mix PROD/QA/DEV at all. I use RedGate SQL Compare to synchronize schemas between the various systems, including different developer instances of the database.
It will be 'OK' as long as the team doesn't have any administrator privileges over the server (either SQL or Windows), and their user log-ins only grant enough access to potentially destroy the development database and its associated files, with access to the production databases denied.
For other application testing reasons, we created a copy of our production server (which is a virtual server) on a separate domain. This allowed the Windows Server name, SQL Server name and database name to be exactly the same (lots of settings on 3rd party apps require this level of configuration to get different processes to work). Now we can rebuild a test environment by creating an exact virtual image of our production server.
I was sceptical about running SQL Server on a virtual machine, but it has given our small company a lot of flexibility. We like to think our databases are critical, but it is for internal uses and having some down time would just have workers shift their lunch hour.

Resources