SQL server semi-database mirroring - sql-server

We have the following problem:Our customer has a the life database.Sometimes,we face bugs that are due to data in life,those bugs doesn't appear in our staging and development databases because they are usually related to the actual data.
So, for accurate debugging,we need to have the same copy of life data in another database.This database should be synchronized with the life database (either automatically or in-demand),so that we can replicate the erroneous scenarios without impacting the actual data.How can we perform that?Is it better to create this "semi-mirror" in the staging itself? As a final note,I don't want the changes from the "semi-mirror" database to be reflected one the life, Only from the life to the "semi-mirror".

Per definition you ahve no staging database. Staging should reflect real world, so contain real world data (and size) and run on a similar system.
Your customer should take a backup and you load it into staging. You do that regularly (weekly, monthly, after updates) to make sure you are in sync. Standad procecedure in every project I have ever been that worked well.

Related

sql server - data backing into local - is it necessary

In all the years of my experience, I always connected to a database by creating a new connection using IP address, username and password.
I recently joined a company where they use a desktop application written in VB6 that has an SQL server backend. Here, the practice is, get a backup of the latest version of SQL server and name it as a different DB, use it for testing purposes.
We now have a issue where we have loads of these databases created by users and it needs cleanup.
My question: Is it possible to have a centralised database which exists remotely to which everyone connects and gets the data? what are the things that we need to keep in mind to achieve this goal, so everyone can have one single database to access to, where they can make the changes.
We've been using a single centralized dev/test environment for over a decade now, with up to 50 full time developers using it -- and I'd say it works quite fine. Most of the changes are new columns into tables and not that many developers are working with the same tables / modules at the same time, so it doesn't cause that much issues.
All our stored procedures / functions are renamed for each release separately (by adding a release number in the end), and installed automatically with compilation process, even for developers. For developers compilation, the version numbers also include the developers userid. This way changing stored procedures in development won't break the test environment, or the procedures other developers are using.
The biggest advantage of this is that we can use similarly sized databases for testing and production.
Your ability to do that is really a functional and/or procedural issue. There's nothing technical that prevents you from having a single, shared database for dev/test. The challenge is, dev/test environments tend to be destructive and/or disruptive.
If you have a single DB used for all development and testing requirements, you'll probably get little to no work done. One dev modifying an object (SP, FN, table, view, etc...) can potentially break everyone else (or no one). A tester running stress tests will have everyone else getting mad about slow responses, timeouts, etc... Someone decides to test Always Encrypted or even sometime simpler like TDE can end up breaking everybody.
Dev environments almost always need their own sandbox before check-in. Checked in code/schema then get tested in a central environment that mimics prod before going to pre-prod that is (ideally) identical to prod. This is pretty basic though each team/company will have its variations.
One thing you could do right away is to automate taking a copy backup of the prod database so you drop a fresh .bak to a common location where everyone can grab from and restore to their own instance. This reduces the impact on your production system and reduces storage consumption. Another benefit is you remove all non-essential access to your production database - this is really, really important. Finally, once this is standard op, you can implement further controls or tasks in the future easily (e.g. restore to a secure instance, obfuscate/mask sensitive data, take new backup for dev/test use).
It is possible but it's usually not a good idea. It would be ok (and no more than ok) if all database access was inquiry only, but imagine the confusion that could arise if developer a fires in some updates to a table that developer be is writing a report for or if the DB was recovered in an uncontrolled manner. Development and test need a lot of managing and how many databases you need and where will depend on an analysis of your dev and test requrements.
Thanks for all the answers. We all had a discussion in our team and came up with a process that suits to our team:
There is a master database backed up and restored from the most recent and stable source
Only QA team has got write access to this database
Developers make their own test database using the Master backup
If new data is required, write SQL scripts to add it
Run unit & E2E tests on their copy
Give the new tests and scripts to add new data (if any) to QA
QA runs the tests and data scripts on the Master
When the tests are passed, if there is a SQL update script, then QA restores the Master Database from the backup (to remove data changes made by running the tests), runs the SQL scripts to update the data then backs it up as the new Master
Scripts are added to source control so we have a history
Note: As an extra safeguard we can keep a copy of the very first ever Master database somewhere else. So if anybody ever does something dumb and corrupts it, we can retrieve it and run all the SQL scripts to bring it up to date.

Interchanging of database servers

Hello I am still a newbie for database deployment.
Generally how are changes to a production database deployed for a release?
My client wants an entire new setup. We have 3 environments: DEV, INT, PROD. He want to make INT as PRODUCTION when QA has certified. This will be fine with application servers but as the state of database is very important, this is a problem for the database because we cannot make the INT database to be production unless we sync the production data to integration. But our database is of more than 300GB so it will take a lot of time to sync data and therefore a huge down time which is not advisible.
Can you guys please advise me in this scenario.
The Most suitable way i know of to do such a synchronization before deployement consists on an offline integration using copies of the original data. At first it might seem as a heavy process, but it bears the advantages of keeping the original data available (problems might always happen during the sync) and alowing you to do all the necessary testing with the data before full deployement.
Here are some tips for deploying to a production database:
Always have current backups of your production database. Just in case something goes wrong.
Make sure you use some kind of source control for your scripts. Just like code. Check in scripts, stored procedures, etc.
Always have rollback scripts for any data updates. For example, have a script that updates 100 records? Write a script that copies the data somewhere else temporarily and that can restore any changes you make. It's easy to test this in DEV and INT. Gives you a bit of peace of mind when making production changes to data.
Always have a recent backup for any schema changes. If you're adding a field to a table, see if you can copy the table to a temp table and then make your changes. Might not always be possible if the table is really large, but again, it lets you quickly rollback in case of an error.
Practice, practice, practice. Practice restoring backups of old production data. Practice running your scripts in DEV and INT. Be ready to re-deploy all stored procedures at any moment.
Another subject that can be tough that you touched on is having production data in INT. I would regularly restore production database backups to INT and DEV. It's well worth it for QA since it provides them with both the quality of production data and the quantity.
I would advise against turning the INT database into production however. Developers and QA will always put in garbage data for testing and you don't want to make that live.

SQL Server - Production DB Schema vs. Reporting DB Schema. Should they be the same?

We recently put a new production database into use. The schema of this database is optimized for OLTP. We're also getting ready to implement a reporting server to be used for reporting purposes. I'm not convinced we should just blindly use the same schema for our reporting database as we do for our production database, and replicate data over.
For those of you that have dealt with having separate production and reporting databases, have you chosen to use the same database schema for your reporting database, or a schema that is more efficient for reporting; for example, perhaps something more denormalized?
Thanks for thoughts on this.
There's really two sides to the story:
if you keep the schema identical, then updating the reporting database from the production is a simple copy (or MERGE in SQL Server 2008) command. On the other hand, the reports might get a bit harder to write, and might not perform optimally
if you devise a separate reporting schema, you can optimize it for reporting needs - then the creation of new reports might be easier and faster, and the reports should perform better. BUT: The updating is going to be harder
So it really boils down to: are you going to create a lot of reports? If so: I'd recommend coming up with a specific reporting schema optimized for reports.
Or is the main pain point the upgrade? If you can define and implement that once (e.g. with SQL Server Integration Services), maybe that's not really going to be a big issue after all?
Typically, chances are that you'll be creating a lot of reports of time, so there's a good chance it might be beneficial in the long run to invest a bit upfront in a separate reporting schema, and a data loading process (typically using SSIS) and then reap the benefit of having better performing reports and faster report creation time.
I think that the reporting database schema should be optimized for reporting - so you'll need a ETL Process to load your data. In my experience I was quickly at the point that the production schema does not fit my reporting needs.
If you are starting your reporting project I would suggest that you design your reporting database for your reports needs.
For serious reporting, usually you create data warehouse (Which is typically at least somewhat denormalized and certain types of calculations are done when the data is refreshed to save from averaging the values of 1.3 million records when you run the report. This is for the kind of reporting reporting that includes a lot of aggregate data.
If your reporting needs are not that great a replicated database might work. It may also depend on how up-to-date you need the data to be as data warehouses are typically updated once or twice a day so the reporting data is often one day behind, OK for monthly and quarterly reports not so good to see how many widgits have been ordered so far today.
The determinate of whether you need a data warehouse tends to be how long it would take to runthe reports they need. This is why datawarehouse pre-aggregate data on loading it. IF your reoports are running fine and you just want to get the worokload away from the input workload a replicated adatabase should do the trick. If you are trying to do math on all the records for the last ten years, you need a data warehouse.
You could do this in steps too. Do the replication now, to get reporting away from data input. That should be an immediate improvement (even if not as much as you want), then design and implement the datawarehouse (which can be a fairly long and involved project and which will take some time to get right).
It's easiest just to copy over.
You could add some views to that schema to simplify queries - to conceptually denormalize.
If you want to go the full Data Warehouse/Analysis Services route, it will be quite a bit of work. But it's very fast, takes up less space, and users seem to like it. If you're concerned about large amounts of data and response times, you should look into this.
If you have many many tables being joined, you might look into actually denormalizing the data. I'd do a test case just to see how much gain for pain you'll be getting.
Without going directly for the data warehouse solution you could always put together some views that rearrange data for better reporting access. This helps you in that you don't have to start a large warehouse project right away and could help scope out a warehouse project if you decide to go that way.
All the answers I've read here are good, I would just add that you do this in stages, stopping as soon as your goals for performance and functionality are met:
Keep the schema identical - this just takes contention and load off the OLTP server
Keep the schema identical - but add new indexed views OR index base tables differently
Build a partial data-warehouse style model (perhaps not keeping snapshot-style history or slowly changing dimensions or anything special not catered for in your normal database) from the copy-schema in another schema or database on the same reporting server. The benefits of star-schema models are huge for reporting, views flattened for users and data dictionaries etc. In this model, if your OLTP database loses changes (for instance customer name changes) due to overwrites, the data warehouse doesn't capture that information (often it's not that important if you stop at this spot). Effectively you are getting data warehouse-style organization for "current" data only. The benefits of retaining the copy of the original schema on your reporting server at this point are that you can pull from the source data in original SQL Server form instead of some kind of intermediate form (like text files) without affecting production OLTP, and you can migrate data models gradually, some in stars, some in normal form, all without affecting production. At some point later, you might be able to drop all or part of the copy.
Build a full data-warehouse including slowly changing dimensions where all the data is captured from the source system.

SQL Server 2008 Auto Backup

We want to have our test servers databases updated from our production server databases on a nightly basis to ensure we're developing on the most recent data. We, however, want to ensure that any fn, sp, etc that we're currently working on in the development environment doesn't get overwritten by the backup process. What we were thinking of doing was having a prebackup program that saves objects selected by our developers and a postbackup program to add them back in after the backup process is complete.
I was wondering what other developers have been doing in a situation like this. Is there an existing tool to do this for us that can run automatically on a daily basis and allow us to set objects not to overwrite (without requiring the attention of a sysadmin to run it daily).
All the objects in our databases are maintained in code - tables, view, triggers, stored procedures, everything - if we expect to find it in the database then it should be in DDL in code that we can run. Actual schema changes are versioned - so there's a table in the database that says this is schema version "n" and if this is not the current version (according to the update code) then we make the necessary changes.
We endeavour to separate out triggers and views - don't, although we probably should, do much with SP and FN - with drop and re-create code that is valid for the current schema version. Accordingly it should be "safe" to drop and recreate anything that isn't a table, although there will be sequencing issues with both the drop and the create if there are dependencies between objects. The nice thing about this generally is that we can confidently bring a schema from new to current and have confidence that any instance of the schema is consistent.
Expanding to your case, if you have the ability to run the schema update code including the code to recreate all the database objects according to the current definitions then your problem should substantially go away... backup, restore, run schema maint logic. This would have the further benefit that you can introduce schema (table) changes in the dev servers and still keep the same update logic.
I know this isn't a completely generic solution. And its worth noting that it probably works better with database per developer (I'm an old fashioned programmer, so I see all problems as having code based solutions (-:) but as a general approach I think it has considerable merit because it gives you a consistent mechanism to address a number of problems.

Copying data from a local database to a remote one

I'm writing a system at the moment that needs to copy data from a clients locally hosted SQL database to a hosted server database. Most of the data in the local database is copied to the live one, though optimisations are made to reduce the amount of actual data required to be sent.
What is the best way of sending this data from one database to the other? At the moment I can see a few possibly options, none of them yet stand out as being the prime candidate.
Replication, though this is not ideal, and we cannot expect it to be supported in the version of SQL we use on the hosted environment.
Linked server, copying data direct - a slow and somewhat insecure method
Webservices to transmit the data
Exporting the data we require as XML and transferring to the server to be imported in bulk.
The data copied goes into copies of the tables, without identity fields, so data can be inserted/updated without any violations in that respect. This data transfer does not have to be done at the database level, it can be done from .net or other facilities.
More information
The frequency of the updates will vary completely on how often records are updated. But the basic idea is that if a record is changed then the user can publish it to the live database. Alternatively we'll record the changes and send them across in a batch on a configurable frequency.
The amount of records we're talking are around 4000 rows per table for the core tables (product catalog) at the moment, but this is completely variable dependent on the client we deploy this to as each would have their own product catalog, ranging from 100's to 1000's of products. To clarify, each client is on a separate local/hosted database combination, they are not combined into one system.
As well as the individual publishing of items, we would also require a complete re-sync of data to be done on demand.
Another aspect of the system is that some of the data being copied from the local server is stored in a secondary database, so we're effectively merging the data from two databases into the one live database.
Well, I'm biased. I have to admit. I'd like to hypnotize you into shelling out for SQL Compare to do this. I've been faced with exactly this sort of problem in all its open-ended frightfulness. I got a copy of SQL Compare and never looked back. SQL Compare is actually a silly name for a piece of software that synchronizes databases It will also do it from the command line once you have got a working project together with all the right knobs and buttons. Of course, you can only do this for reasonably small databases, but it really is a tool I wouldn't want to be seen in public without.
My only concern with your requirements is where you are collecting product catalogs from a number of clients. If they are all in separate tables, then all is fine, whereas if they are all in the same table, then this would make things more complicated.
How much data are you talking about? how many 'client' dbs are there? and how often does it need to happen? The answers to those questions will make a big difference on the path you should take.
There is an almost infinite number of solutions for this problem. In order to narrow it down, you'd have to tell us a bit about your requirements and priorities.
Bulk operations would probably cover a wide range of scenarios, and you should add that to the top of your list.
I would recommend using Data Transformation Services (DTS) for this. You could create a DTS package for appending and one for re-creating the data.
It is possible to invoke DTS package operations from your code so you may want to create a wrapper to control the packages that you can call from your application.
In the end I opted for a set of triggers to capture data modifications to a change log table. There is then an application that polls this table and generates XML files for submission to a webservice running at the remote location.

Resources