Minimalistic Database Administration - database

I am a developer. An architect on good days. Somehow I find myself also being the DBA for my small company. My background in the DB arts is fair, but I have never been a full-fledged DBA. My question is: what do I have to do to ensure a reliable and reasonably functional database environment with as little actual effort as possible?
I am sure that I need to make sure that backups are being performed, and that is being done. That is an easy one. What else should I be doing on a consistent basis?

Who else is involved in the database? Are you the only person making schema changes (creating new objects, releasing new stored procedures, permissioning new users)?
Make sure that the number of users doing anything that could impact performance is reduced to as close to zero as possible, ideally including you.
Make sure that you're testing your backups, ideally by running a DEV box that periodically recreates the production environment from them. This helps twice over: a DEV box is a good idea in its own right, and a backup is only useful if you can restore from it.
Create groups for the various apps that connect to your database, so that when a new user comes along you don't have to guess what permissions they need; just add them to the group. Meanwhile, grant permissions on the database objects only to the groups that need them.
Use indices, primary keys, foreign keys, constraints, stats and whatever other tools your database supports. Normalise.
Optimise the most common code against your box - bad stored procedures/data access code will kill you.
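To make the group idea above concrete, here is a minimal Python sketch of the bookkeeping. It is only a model, not a database API: the group names, object names and users are made up for illustration, and a real system would express the same thing with its own roles and GRANT statements.

```python
# Hypothetical example data: group names, object names and users are made up.
GROUP_GRANTS = {
    "billing_app": {("invoices", "SELECT"), ("invoices", "INSERT")},
    "reporting": {("invoices", "SELECT"), ("customers", "SELECT")},
}
MEMBERSHIPS = {
    "alice": ["billing_app"],
    "bob": ["reporting", "billing_app"],
}


def effective_permissions(user, memberships=MEMBERSHIPS, grants=GROUP_GRANTS):
    """Return the union of the permissions of every group the user belongs to."""
    permissions = set()
    for group in memberships.get(user, []):
        permissions |= grants.get(group, set())
    return permissions


def add_user(user, group, memberships=MEMBERSHIPS):
    """Onboard a user by group membership only; no per-user grants exist."""
    memberships.setdefault(user, []).append(group)
```

The point of the design is that permissions are attached to groups only, so onboarding someone is a single membership change and auditing means reading one small table.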

I've been there. I used to have a job where I wrote code, did all the infrastructure stuff, wore the DBA hat, did user support, fixed the electric stapler when it jammed, and whatever else came up that might be remotely associated with IT. It was great! I learned a little about everything.
As far as the care and feeding of your database box, I'd recommend that you do the following:
Perform regular full backups.
Perform regular transaction log backups.
Monitor your backup jobs. There are a number of relatively cheap utilities on the market that can automate this for you. In a small shop you're often too busy to remember to check on them daily.
Test your backups. Do a drill. Restore an old copy of your most important databases. Prove to yourself that your backups are working and that you know how to restore them properly. You'd be surprised how many people only think about this during their first real disaster.
Store backups off-site. With all the online backup providers out there today, there's not much excuse for not having an offsite backup.
Limit sa access to your boxes.
If your database platform supports it, use only role-based security. Resist the temptation to grant one-off, user-specific permissions.
The basic idea here is that if you restrict who has access to the box, you'll have fewer problems. Secondly, if your backups are solid, there are few things that come up that you won't be able to deal with effectively.
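If you don't want to buy a monitoring utility, even a tiny script can catch a backup job that has silently stopped. The sketch below is one possible approach, not a feature of any particular product. It assumes backups land in a single directory and are named `<database>_<anything>.bak`; the function name and the naming convention are both hypothetical.

```python
from datetime import datetime, timedelta
from pathlib import Path


def stale_backups(backup_dir, max_age, now=None, pattern="*.bak"):
    """Return the databases whose newest backup file is older than max_age.

    Assumes files are named <database>_<anything>.bak (a made-up convention).
    """
    now = now or datetime.now()
    newest = {}
    for path in Path(backup_dir).glob(pattern):
        database = path.name.split("_", 1)[0]
        modified = datetime.fromtimestamp(path.stat().st_mtime)
        if database not in newest or modified > newest[database]:
            newest[database] = modified
    return sorted(db for db, modified in newest.items() if now - modified > max_age)
```

Run it from a scheduler with something like `stale_backups("/backups", timedelta(hours=26))` and send yourself an email if the list is not empty. Note that it only notices databases that have at least one backup file; a database that was never backed up needs a separate check.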

I would suggest:
A script to quickly restore the latest backup of a database, in case it gets corrupted
What kind of backups are you doing? Full backups each day, or incremental every hour, etc?
Some scripts to create new users and grant them basic access.
However, the number one suggestion is to limit the power other users have as much as possible; this will greatly reduce the chance of things getting badly messed up. Servers where everyone is an sa tend to get screwed up quicker than servers that are locked down.
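As a rough illustration of the "quickly restore the latest backup" script, the Python sketch below finds the newest backup file for a database and builds the matching T-SQL statement. It assumes SQL Server, full backups only, and files named `<database>_*.bak`; the helper names and the naming convention are hypothetical, and actually running the statement (via sqlcmd or a driver) is left out.

```python
from pathlib import Path


def latest_backup(backup_dir, database):
    """Return the most recently modified <database>_*.bak file."""
    candidates = sorted(
        Path(backup_dir).glob(f"{database}_*.bak"),
        key=lambda p: p.stat().st_mtime,
    )
    if not candidates:
        raise FileNotFoundError(f"no backups found for {database}")
    return candidates[-1]


def restore_statement(database, backup_file):
    """Build a T-SQL RESTORE statement; REPLACE overwrites the existing database."""
    name = database.replace("]", "]]")          # escape for [bracketed] identifiers
    path = str(backup_file).replace("'", "''")  # escape for the string literal
    return f"RESTORE DATABASE [{name}] FROM DISK = N'{path}' WITH REPLACE, RECOVERY;"
```

Having this written and tested before the day you need it is the whole point; it also doubles as the restore drill mentioned in the other answers.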

Related

Refreshing lower environments with prod copy

I know this isn't something new and people are probably already doing this in their environments. I have a requirement to refresh lower environments monthly, weekly, etc., and I wanted to know if there is a quicker approach. I know we can do a backup and restore through a SQL job (I would love to know if there is an automated script that takes care of the entire process).
Also, instead of doing a full database restore every month, is there a way to send only the changes that happened during the month or week? That would save a lot of time and space. I am not sure how to achieve this second option of shipping only the changes. We aren't considering any HA technologies, for various reasons, so please do not suggest those.
If you have a script that can achieve this, or you are doing something similar in your environment and have the necessary details and scripts, please share them. Is there a tool that can achieve this? A tool won't be my first option, though, unless we can't do it by writing T-SQL code.
Also, our boxes are VMs, so could we take advantage of that by taking file snapshots and delivering them to the lower environments, rather than doing backup and restore natively through SQL? (Sorry, I am a bit naive about VM capabilities and techniques.)
FYI: we want the complete data, not just the bare schema. Also, please do not share solutions using SSIS.
Any help would be much appreciated.
Thanks
Once you perform the recovery portion of a database restore, the restored database loses the ability to accept additional backups from the source database. Depending on your setup, you may be able to get away with shipping only an additional differential backup from the source system. But you'd still need to restore the full backup again.
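The restore order described above can be sketched in a few lines of Python: pick the newest full backup, leave it unrecovered if a later differential exists, then finish with the newest differential. This is only a planning sketch with a made-up `Backup` record; it assumes every differential taken after the newest full backup is based on that full backup (that is, no copy-only full backups in between).

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Backup:
    kind: str           # "full" or "diff"
    finished: datetime  # when the backup completed on the source server
    path: str


def restore_plan(backups):
    """Return [(path, recovery_option), ...] in the order the files are restored."""
    fulls = [b for b in backups if b.kind == "full"]
    if not fulls:
        raise ValueError("no full backup available")
    base = max(fulls, key=lambda b: b.finished)
    diffs = [b for b in backups if b.kind == "diff" and b.finished > base.finished]
    # The full backup must stay unrecovered if a differential will follow it.
    plan = [(base.path, "NORECOVERY" if diffs else "RECOVERY")]
    if diffs:
        latest = max(diffs, key=lambda b: b.finished)
        plan.append((latest.path, "RECOVERY"))
    return plan
```

So for a weekly refresh you would copy only the new differential across, keep the last full backup on the target, and replay both each time.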

Use one large database or use single databases per customer

Currently I'm working on an online web application for construction materials. Companies can log in on our website and then use the web app.
From the beginning the idea was to create a database per customer. But now it's becoming larger and larger (100+), so we now have 100 databases to manage.
We have to run an update script for DB maintenance approx. twice a year.
The advantage that I see is that when a customer wants to quit, we delete their database and then it's finished.
When I want to add a new customer, I have to fill the database with approx. 1,000,000 unique records for that specific customer, because every customer has different prices/materials.
For backups I use a MySQL dump script that creates a *.sql file per database, which I download every day.
What is your opinion?
One large db or per customer a database?
I'm using MySQL with ASP.NET/C#...
I don't want to make a suggestion because there are far too many variables.
I do want to note, however, that my employer has 1000s of deployed databases -- we use one database per customer with replication (2+ databases).
So, the idea is workable. My job isn't related to DB management, but I do recall that we do a lot in the way of automation and online tools. Backups and DB management are handled by a team.
Ultimately, you can make the 100+ deployments work, but you are going to want to start investing in the development of utilities and tools to help automate the backup and/or management of the DBs.
Ideally, nothing (DB Management) should be done by hand. Furthermore, the connection strings should be abstracted away from a given web app deployment.
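As a small illustration of what such tooling might look like, the sketch below applies one update script to every customer database listed in a registry and reports which ones failed. It uses Python's built-in sqlite3 purely as a stand-in so the example runs anywhere; with MySQL you would swap in your driver's connect and execute calls. The registry layout and the function name are assumptions.

```python
import sqlite3


def run_on_all(databases, script, connect=sqlite3.connect):
    """Apply script to every database; return {customer: error message} for failures.

    databases maps a customer name to whatever connect() needs (here, a file path).
    """
    failures = {}
    for customer, target in databases.items():
        try:
            conn = connect(target)
            try:
                conn.executescript(script)  # sqlite3-specific; other drivers differ
                conn.commit()
            finally:
                conn.close()
        except sqlite3.Error as exc:
            # Keep going: one broken customer database should not block the rest.
            failures[customer] = str(exc)
    return failures
```

The same registry is a natural place to keep the per-customer connection details, so the web app looks them up by customer instead of hard-coding them in each deployment.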
But now it's becoming larger and larger (100+) so we have now 100 databases to manage
I think you have your answer right there.
Have to agree with @Hogan - the overhead of managing that many databases is probably far from ideal - especially if you ever need to make schema changes, etc. in the future.
That said, if you use a single database are you ever likely to need to separate out a given customer's data into a standalone database/site? If this is likely, how long would it take to carry out this separation?
In essence, if it's likely to take less effort to write a set of tools to handle the above case, then I'd be tempted to go for the single database approach. However, you'll also need to factor in the likely timescales for creating a unified version of the database schemas that handle datasets for each customer, etc.
Also, are the schemas precisely the same for all of the existing 100+ databases? If not, there's potentially a world of pain if you decide to migrate the existing data into a single database.
Update - Incidentally, all of the above is a bit generalised, but it's hard to be specific without knowing more about the amount of data, and traffic, etc. in use. (e.g.: If you ever had a high demand site for a customer it would be trivial to put it onto its own DB server if you were using a per-customer database.)
I agree with @Hogan and @middaparke... if the schemas are the same, you should put it in one instance.
Unfortunately, it is impossible to tell from here whether your schemas would benefit from reusing most of those million rows or not; if normalized well, then certainly it would be beneficial.
It is also impossible to tell how difficult any changes to the applications would be based on this change.
Unfortunately, it sounds like you have a large customer base with working applications, and therefore momentum to keep going in that direction, which throws you into the realm of sucking it up and dealing with it by automating the management of so many DBs... not the way you would do it from scratch, but maybe cheapest since you are where you are.

Maintenance window and recovery for a large database

One of our teams is developing a database that will be somewhat large (~500GB) and grow from there (I know 500 Gigs may seem small to many of you, but it will be one of the larger databases in our shop). One of the issues they are grappling with is backing up and restoring the database. Basically, the database will have several "data" tables and one table used for storing images / documents. We need to accomplish the following:
Be able to quickly back up and restore only the data tables (sans images) to our test server for debugging and testing purposes.
In the event of a catastrophic database failure, restore the data tables only to get most of the application up and running ASAP. Then, restore the images table when possible.
Back up the database within the allotted nightly time window (a few hours).
My questions are:
Is it possible to accomplish the first two goals while still having the images stored in the same database? If so, would we use filegroups, filestream, or something else?
How do other shops back up their databases in a reasonable time window while maintaining high availability? Do you replicate to a second server and back up from there?
We have dealt with similar issues. We are a $2.5B solar manufacturing company and disaster recovery is critical for us, as well as keeping our databases backed up. Our main database is our plant floor production database. We decided to strip this database to the absolutely essential data needed to maintain production, and move other data off into its own database. This has allowed us high availability and reasonable backup/restore times.
In your case, is it really necessary to store images in the same database as your other data? I suspect it's not, and is just a case of making some issues easier to deal with. I think separate file groups would also help your problem. But you might want to seriously reconsider whether everything needs to be in a single DB.

How would you migrate hundreds of MS Access databases to a central service?

We have literally hundreds of Access databases floating around the network: some with light usage, some with quite heavy usage, and some with no usage whatsoever. What we would like to do is centralise these databases onto a managed database and retain as much as possible of the reports and forms within them.
The benefits of doing this would be to have some sort of usage tracking, and also the ability to pay more attention to some of the important decentralised data that is stored in these apps.
There are no real constraints on the RDBMS (Oracle, MS SQL Server) or the stack it would run on (LAMP, ASP.NET, Java), and there obviously won't be a silver bullet for this. We would like something that can remove the initial grunt work in an automated fashion.
We upsize users to SQL Server (either using the Upsizing Wizard or by hand). It's usually pretty straightforward: replace all the Access tables with linked tables to the SQL Server and keep all the forms/reports/macros in Access. The investment in Access isn't lost and the users can keep going, business as usual. You get the reliability of SQL Server and centralized backups. Keep in mind that we’ve done this for a few large Access databases, not hundreds. I'd do a pilot of a few dozen and see how it works out.
UPDATE:
I just found this, the SQL Server Migration Assistant; it might be worth a look:
http://www.microsoft.com/sql/solutions/migration/default.mspx
Update: Yes, some refactoring will be necessary for poorly designed databases. As for how to handle Access sprawl? I've run into this at companies with lots of technical users (engineers especially are the worst for this... and for Excel sprawl). We did an audit and, after backing them up, deleted any databases that hadn't been touched in over a year. "Owners" were assigned based on the location and/or data in the database. If the database was in "S:\quality\test_dept", then the quality manager and head test engineer had to take ownership of it or we deleted it (again, after backing it up).
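The first pass of an audit like that is easy to script. The Python sketch below walks a share and lists Access files whose last-modified time is more than a year old. It is an illustration under assumptions: the function name is made up, and the file modification time is only a proxy for "touched", since a database that is merely read may not update it. It lists candidates and deletes nothing.

```python
from datetime import datetime, timedelta
from pathlib import Path

ACCESS_SUFFIXES = {".mdb", ".accdb"}


def untouched_databases(root, older_than=timedelta(days=365), now=None):
    """Return (path, last_modified) for Access files not modified within older_than."""
    cutoff = (now or datetime.now()) - older_than
    stale = []
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix.lower() in ACCESS_SUFFIXES:
            modified = datetime.fromtimestamp(path.stat().st_mtime)
            if modified < cutoff:
                stale.append((path, modified))
    # Oldest first, so the least controversial candidates come up first.
    return sorted(stale, key=lambda item: item[1])
```

The folder each file sits in then gives you a starting point for assigning an owner, as described above.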
Upsizing an Access application is no magic bullet. It may be that some things will be faster, but some types of operations will be real dogs. That means that an upsized app has to be tested thoroughly and performance bottlenecks addressed, usually by moving the data retrieval logic server-side (views, stored procedures, passthrough queries).
It's not really an answer to the question, though.
I don't think there is any automated answer to the problem. Indeed, I'd say this is a people problem and not a programming problem at all. Somebody has to survey the network and determine ownership of all the Access databases and then interview the users to find out what's in use and what's not. Then each app should be evaluated as to whether or not it should be folded into an Enterprise-wide data store/app, or whether its original implementation as a small app for a few users was the better approach.
That's not the answer you want to hear, but it's the right answer precisely because it's a people/management problem, not a programming task.
Oracle has a migration workbench to port MS Access systems to Oracle Application Express, which would be worth investigating.
http://apex.oracle.com
So? Dedicate a server to your Access databases.
Now you have the benefit of some sort of usage tracking, and also the ability to pay more attention to some of the important decentralised data that is stored in these apps.
This is what you were going to do anyway, only you wanted to use a different database engine instead of NTFS.
And now you have to force the users onto your server.
Well, you can encourage them by telling them that you aren't going to overwrite their data with old backups anymore, because now you will own the data, and you won't do that anymore.
Also, you can tell them that their applications will run faster now, because you are going to exclude the folder from on-access virus scanning (you don't do that to your other databases, which is why they are full of sql-injection malware, but these databases won't be exposed to the internet), and planning to turn packet signing off (you won't need that on a dedicated server: it's only for people who put their file-share on their domain-server).
Easy upgrade path, improved service to users, greater centralization and control for IT. Everyone's a winner.
Further to David Fenton's comments
Your administrative rule will be something like this:
If the data that is in the database is just being used by one user, for their own work (alone), then they can keep it in their own network share.
If the data that is in the database is used by more than one person (even if it is only two), then that database must go on a central server and come under IT's management (backups, schema changes, interfaces, etc.). This is because someone experienced needs to coordinate the whole show, or we will risk the time/resources of the next guy down the line.

Database Backup/Restore Process

The backup and restore process of a large database or collection of databases on SQL Server is very important for disaster recovery purposes. However, I have not found a robust solution that will guarantee the whole process is as efficient as possible, 100% reliable, and easily maintainable and configurable across multiple servers.
Microsoft's Maintenance Plans don't seem to be sufficient. The best solution I have used is one that I created manually, using many jobs with many steps per database running on the source server (backup) and destination server (restore). The jobs use stored procedures to do the backup, copying and restoring. This runs once a day (full backup/restore) and intraday every 5 minutes (transaction log shipping).
Although my current process works and reports any job failures via email, I know the whole process isn't very reliable and cannot be easily maintained/configured on all our servers by a non-DBA without having in-depth knowledge of the process.
I would like to know if others have this same backup/restore process and how others overcome this issue.
I've used a similar step to keep dev/test/QA databases 'zero-stepped' on a nightly basis for developers and QA folks to use.
Documentation is the key - if you want to remove what Scott Hanselman calls 'bus factor' (i.e. the danger that the creator of the system will get hit by a bus and everything starts to suck).
That said, for normal database backups and disaster recovery plans, I've found that SQL Server Maintenance Plans work out pretty well, as long as you include:
1) Decent documentation
2) Routine testing.
I've outlined some of the ways to go about doing that (for anyone drawn to this question looking for an example of how to go about creating a disaster recovery plan):
SQL Server Backup Best Practices (Free Tutorial/Video)
The key part of your question is the ability for the backup solution to be managed by a non-DBA. Any native SQL Server answer like backup scripts isn't going to meet that need, because backup scripts require T-SQL knowledge.
Because of that, you want to look toward third-party solutions like the ones Mitch Wheat mentioned. I work for Quest (the makers of LiteSpeed) so of course I'm partial to that one - it's easy to show to non-DBAs. Before I left my last company, I had a ten minute session to show the sysadmins and developers how the LiteSpeed console worked, and that was that. They haven't called since.
Another approach is using the same backup software that the rest of your shop uses. TSM, Veritas, Backup Exec and Microsoft DPM all have SQL Server agents that let your Windows admins manage the backup process with varying degrees of ease-of-use. If you really want a non-DBA to manage it, this is probably the most dead-easy way to do it, although you sacrifice a lot of performance that the SQL-specific backup tools give you.
I am doing precisely the same thing and have various issues semi-regularly, even with this process.
How do you handle the gap between copying the file from Server A to Server B and restoring the transaction log backup on Server B?
Every once in a while the transaction log backup is larger than normal and takes longer to copy. The restore job then gets an operating system error saying that the file is in use.
This is not such a big deal, since the file is automatically applied the next time around; however, it would be nicer to have a more elegant solution in general, and one that specifically fixes this issue.
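One common way to avoid the "file is in use" race, offered here only as a sketch and not as what the original poster does, is to copy under a temporary name and rename the file once the copy is complete. The restore job then only ever sees finished files. The `.partial` suffix, the `.trn` extension and the function names below are assumptions, and the rename is only reliably atomic when the temporary and final names are on the same volume.

```python
import os
import shutil
from pathlib import Path


def publish_file(source, dest_dir, temp_suffix=".partial"):
    """Copy source into dest_dir under a temporary name, then rename it into place."""
    source = Path(source)
    final = Path(dest_dir) / source.name
    temp = final.with_name(final.name + temp_suffix)
    shutil.copyfile(source, temp)  # a slow copy only ever touches the .partial file
    os.replace(temp, final)        # quick rename once the copy has finished
    return final


def ready_files(dest_dir, pattern="*.trn"):
    """Return the completed log backups, ignoring copies still in progress."""
    return sorted(Path(dest_dir).glob(pattern))
```

The restore side calls `ready_files` instead of listing the directory directly, so an unusually large log backup simply shows up one cycle later without raising an error.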

Resources