What's the best storage for SQL Server 2000?

I want to put my SQL Server database files on an Intel SS4000-E, which is a NAS device. Would it be possible to use it as storage for SQL Server 2000? If not, what is the best solution?

I strongly recommend against it.
Put your data files locally on the server itself, with RAID mirrored drives. The reasons are twofold:
SQL Server will run much faster for all but the smallest workloads
SQL Server will be much less prone to corruption in case the link to the NAS gets broken.
Use the NAS to store backups of your SQL Server, not to host your data files. I don't know what your database size or usage pattern will be, so I can't tell you what you MUST have. At a minimum, for a database that's going to take any significant load in a production environment, I would recommend two logical drives (one for data, one for your transaction log), each consisting of a RAID 1 array of the fastest drives you can stomach buying. If that's overkill, put your database on just two physical drives (one for the transaction log, and one for data). If even THAT is over budget, put your data on a single drive and back up often. But if you choose the single-drive or NAS solution, IMO you are putting your faith in the Power of Prayer (which may not be a bad thing, it just isn't that effective when designing databases).
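If you go the two-logical-drives route, a minimal sketch of the layout (the drive letters D: and E: and the Sales database name are placeholders, assuming each letter is a RAID 1 pair):

    -- Data on one mirrored pair, transaction log on another
    CREATE DATABASE Sales
    ON PRIMARY
        (NAME = Sales_data, FILENAME = 'D:\SQLData\Sales_data.mdf', SIZE = 10GB)
    LOG ON
        (NAME = Sales_log, FILENAME = 'E:\SQLLogs\Sales_log.ldf', SIZE = 2GB);

Keeping the log on its own spindles matters because log writes are sequential; mixing them with random data-file I/O on the same drive defeats that.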
Note that a NAS is not the same thing as a SAN (on which people typically DO put database files). A NAS typically is much slower and has much less bandwidth than a SAN connection, which is designed for very high reliability, high speed, advanced management, and low latency. A NAS is geared more toward reducing your cost of network storage.

My gut reaction - I think you're mad risking your data on a NAS. SQL Server expects continuous, low-latency, uninterrupted access to its storage subsystem, and a NAS is almost certainly none of those things. Use local or SAN storage (in that order of performance and simplicity, and therefore preference) and leave the NAS for offline file storage/backups.
The following KB article lists some of the constraints and issues you'd encounter trying to use a NAS with SQL Server - while it covers SQL Server 7.0 through 2005, a lot of the information still applies to SQL Server 2008 too.
http://support.microsoft.com/kb/304261
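That article also covers trace flag 1807, which SQL Server 2000 requires before it will even accept a UNC path for database files. A hedged sketch, strictly for testing (the \\nas01\sqldata share is a placeholder; the flag removes a safety check, not the underlying risk):

    -- Allow network-based database files (per KB 304261); unsupported outside
    -- narrow vendor-certified scenarios
    DBCC TRACEON (1807);
    CREATE DATABASE NasTest
    ON PRIMARY
        (NAME = NasTest_data, FILENAME = '\\nas01\sqldata\NasTest_data.mdf')
    LOG ON
        (NAME = NasTest_log, FILENAME = '\\nas01\sqldata\NasTest_log.ldf');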

Local storage is almost always faster than networked storage.
Your SQL Server performance will also depend on how your objects, files, and filegroups are defined, and on how consumers use the data.
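To illustrate the files/filegroups point, here's a sketch (all names and paths hypothetical) of giving a hot table its own filegroup on separate spindles:

    -- Add a filegroup backed by a file on a dedicated drive
    ALTER DATABASE Sales ADD FILEGROUP HotTables;
    ALTER DATABASE Sales ADD FILE
        (NAME = Sales_hot, FILENAME = 'F:\SQLData\Sales_hot.ndf', SIZE = 5GB)
    TO FILEGROUP HotTables;

    -- Place heavily used objects on it explicitly
    CREATE TABLE dbo.OrderHistory (OrderID int NOT NULL PRIMARY KEY) ON HotTables;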

Well "best" means different things to different people, but I think "best" performance would be a TMS RAMSAN or a RAID of SSDs... etc
Best capacity would be achieved with a RAID of large HDDs...
Best reliability/data safety would be achieved with mirroring across many drives, and regular backups (off-site preferably)...
Best availability... I don't know... maybe clone the system and have a hot backup ready to go at all times.
Best security would require encryption, but mainly limiting physical access to the machine (and its backups) is enough unless it's internet-connected.

As the other answers point out, there will be a performance penalty here.
It is also worth mentioning that these devices sometimes implement a RAM cache to improve I/O performance. If that is the case and you do trial this configuration, the NAS should be on the same power protection/UPS as the server hardware; otherwise, in case of a power outage, the NAS may 'lose' the part of the file in the cache. Ouch!

It can work, but a dedicated fibre-attached SAN will be better.
Local will usually be faster but it has limited size and won't scale easily.
I'm not familiar with the hardware but we initially deployed a warehouse on a shared NAS. Here's what we found.
We were regularly competing for resources on the head unit -- there was only so much bandwidth that it could handle. Massive warehouse queries and data loads were severely impacted.
We needed 1.5 TB for our warehouse (data/indexes/logs), so we put each of these resources onto a separate set of LUNs (as you might do with attached storage), with the data spanning just 10 disks. We ran into all sorts of I/O bottlenecks with this. The better solution was to create one big partition across lots of small disks and store data, indexes, and logs all in the same place. This sped things up considerably.
If you're dealing with a moderately used OLTP system, you might be fine but a NAS can be troublesome.

Related

Can a 200GB or bigger SQL Server database load into a pagefile when not enough memory is available on the server?

Here's what we're trying to do.
We will have a 200GB+ SQL Server database that needs to load into memory. Microsoft best practice is to have enough physical memory available on the server and then load the entire database into it. That would mean we need 256GB of memory on each of our SQL Servers. This would result in fast access to the database, which is loaded into memory, but at the high cost of memory. BTW, we're running SQL Server 2008 on Windows Server 2008.
Currently, our server is set up with only 12GB of memory. Just under 3GB is allocated to the OS, and the remaining 9GB is used by SQL Server. Is it possible to increase the pagefile to 256GB and put it on an SSD drive? What we want to do then is load the database into the pagefile on the SSD. We're hoping the performance will be similar to loading the entire database into memory, since it'll be on an SSD.
Will this work? Is there another alternative we're overlooking? We want to keep the costs down as much as we can, without sacrificing the performance of our environment. Any advice would be appreciated.
Thanks.
If you want the database to be stored in memory, you need to buy more memory. In spite of what the other answer suggests, memory is the absolute best and cheapest way to make a database perform better - SQL Server is designed to use memory well.
While SQL Server will take advantage of the page file when it has to, and while having the page file on an SSD will be slightly faster than on an old-fashioned mechanical disk, it's still I/O and swapping and there is a lot of overhead around that, regardless of the disk type underneath. This may turn out to be a little bit better, in general, than having the same page file on a spinny disk (or no page file at all), but I don't think that it's going to be anywhere near the impact of having real memory, or that it's going to come anywhere close to your expectations of "fast access."
If you can't buy more memory then you can start with this page file on an SSD, but I'm confident you will need to additionally focus on other tuning opportunities - largely making sure you have indexes that support the type of queries you run, avoiding full table scans as much as possible. For full table aggregates you can consider indexed views (see here and here); for subsets you can consider filtered indexes.
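To make those suggestions concrete, here is a sketch against a hypothetical dbo.Orders table (filtered indexes require SQL Server 2008, which you're on; an indexed view needs SCHEMABINDING, COUNT_BIG(*), and a non-nullable column under SUM):

    -- Hypothetical table for illustration
    CREATE TABLE dbo.Orders (
        OrderID    int         NOT NULL PRIMARY KEY,
        CustomerID int         NOT NULL,
        OrderDate  datetime    NOT NULL,
        Status     varchar(10) NOT NULL,
        Amount     money       NOT NULL   -- NOT NULL so SUM() is allowed in the indexed view
    );
    GO
    -- Filtered index: index only the subset a hot query touches
    CREATE NONCLUSTERED INDEX IX_Orders_Open
        ON dbo.Orders (CustomerID, OrderDate)
        WHERE Status = 'Open';
    GO
    -- Indexed view: materialize a full-table aggregate instead of rescanning
    CREATE VIEW dbo.OrderTotals WITH SCHEMABINDING AS
        SELECT CustomerID, COUNT_BIG(*) AS OrderCount, SUM(Amount) AS TotalAmount
        FROM dbo.Orders
        GROUP BY CustomerID;
    GO
    CREATE UNIQUE CLUSTERED INDEX IX_OrderTotals ON dbo.OrderTotals (CustomerID);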
And just to be sure: you are storing the actual data on an SSD drive, right? If not, then I would argue that you should use the SSD for the data and/or log, not for the page file. The page file isn't going to offer you much benefit if it is constantly swapping data in and out by exchanging with a spinny disk.
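If you do move the data or log onto the SSD, the mechanics are simple; a minimal sketch (BigDb, the logical file name, and the S: drive are placeholders):

    -- Point SQL Server at the new location, move the file, bring it back online
    ALTER DATABASE BigDb SET OFFLINE;
    ALTER DATABASE BigDb MODIFY FILE
        (NAME = BigDb_data, FILENAME = 'S:\SQLData\BigDb_data.mdf');
    -- ...copy the physical .mdf to S:\SQLData\ while the database is offline...
    ALTER DATABASE BigDb SET ONLINE;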
Need more clarity on the question.
Are you in control of the database or is this a COTS solution that limits your ability to optimize?
Are you clustering? Is that why adding 200+GB of RAM is an issue (now more than 400GB, 200 per node)?
Are you on bare metal or virtualized? Is this why RAM may be an issue?
So far it would seem the "experts" have made some assumptions that may not be fair to your circumstance.
Please update your question... :)

Shared volume for data (multiple MDF) and another shared volume for logs (multiple LDF) on SAN

I have 3 instances of SQL Server 2008, each on a different machine, with multiple databases on each instance. I have 2 separate LUNs on my SAN for MDF and LDF files. The NDF and TempDB files run on the local drive on each machine. Is it O.K. for the 3 instances to share the same volume for the data files and another volume for the log files?
I don't have thin provisioning on the SAN, so I would like to avoid constraining disk space by creating multiple volumes, but I was advised that I should create a volume (drive letter) for each instance, if not for each database. I am aware that I should at least split my logs and data files. No instance would share the actual database files, just the space on the drive.
Any help is appreciated.
Of course the answer is: "It depends". I can try to give you some hints on what it depends however.
A SQL Server Instance "assumes" that it has exclusive access to its resources. So it will fill all available RAM per default, it will use all CPUs and it will try to saturate the I/O channels to get maximum performance. That's the reason for the general advice to keep your instances from concurrently accessing the same disks.
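(As an aside, the fill-all-available-RAM default is at least controllable per instance; a sketch with a placeholder value, useful wherever instances do end up sharing a resource:)

    -- Cap this instance's memory; 16384 MB is a placeholder, size it to your box
    EXEC sp_configure 'show advanced options', 1; RECONFIGURE;
    EXEC sp_configure 'max server memory (MB)', 16384; RECONFIGURE;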
Another thing is that SQL Server "knows" that sequential I/O access gives you much higher throughput than random I/O, so there are a lot of mechanisms at work (like logfile organization, read-ahead, the lazy writer and others) to avoid random I/O as much as possible.
Now, if three instances of SQL Server do sequential I/O requests on a single volume at the same time, then from the perspective of the volume you are getting random I/O requests again, which hurts your performance.
That being said, it is only a problem if your I/O subsystem is a significant bottleneck. If your logfile volume is fast enough that the intermingled sequential writes from the instances don't create a problem, then go ahead. If you have enough RAM on the instances that data reads can be satisfied from the buffer cache most of the time, you don't need much read performance on your I/O subsystem.
What you should avoid in any case is multiple growth steps on either log or data files. If several files on one filesystem are growing, you will get fragmentation, and fragmentation can turn a sequential read or write request, even from a single source, back into random I/O.
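A sketch of that pre-sizing (names and sizes are placeholders; the point is one up-front allocation per file, with a fixed growth increment, not a percentage, as the safety net):

    -- Grow each file once, up front, rather than in many small concurrent steps
    ALTER DATABASE Sales MODIFY FILE (NAME = Sales_data, SIZE = 200GB);
    ALTER DATABASE Sales MODIFY FILE (NAME = Sales_data, FILEGROWTH = 4GB);
    ALTER DATABASE Sales MODIFY FILE (NAME = Sales_log,  SIZE = 32GB);
    ALTER DATABASE Sales MODIFY FILE (NAME = Sales_log,  FILEGROWTH = 4GB);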
The whole picture changes again if you use SSDs as disks. These have totally different requirements and behaviour, but since you didn't say anything about SSD I will assume that you use a "conventional" disk-based array or RAID configuration.
Short summary: You might get away with it, if the circumstances are right, but it is hard to assess without knowing a lot more about your systems, from both the SAN and SQL perspective.

What's TOO BIG for a database?

I have a buddy who runs a web app for people listing cars for sale. There are a few thousand clients who use it, and each client has hundreds and sometimes thousands of rows in the database (some have been on for 5 years with hundreds of cars selling each month, and tens of rows per sale (comments, messages, etc.)). He has run this system in one SQL Server database on one physical server with about 20GB of RAM and a couple of processors the whole time, with no problems. Is this some sort of miracle?
Just like most programmers, I'm no DBA and just get by, thanks to ORMs, etc. Everywhere I look, people talk about the need to shard or get a separate database server for big users of a web app. Why is this? Is it really that inefficient to have a large DB with lots of rows? Should I plan to use Cassandra or something, or can I rely on scaling up well with Postgres?
I personally don't think what you've described is that large of a database. The server (20 gigs of ram? ;)) sounds decent. It's more about usage and design. If the database is indexed and well designed, it can grow much, much larger on the current hardware.
Before doing any sort of switch, I'd simply look at archiving useless data and optimising queries if there's a fear of performance issues.
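For the archiving side, a minimal T-SQL sketch, assuming a hypothetical dbo.Listings table with a SoldDate column and a matching dbo.ListingsArchive table:

    -- Move listings sold more than two years ago out of the hot table
    BEGIN TRANSACTION;
    INSERT INTO dbo.ListingsArchive
        SELECT * FROM dbo.Listings
        WHERE SoldDate < DATEADD(year, -2, GETDATE());
    DELETE FROM dbo.Listings
        WHERE SoldDate < DATEADD(year, -2, GETDATE());
    COMMIT;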
The reason for sharding and separate db servers is that at some point it's going to be cheaper to use multiple cheaper machines than one expensive one. Hardware price doesn't scale linearly with performance and once you reach a certain point it'll be much cheaper to get twice as many machines as to get a machine that's twice as fast.
You should have no problem in SQL Server, Oracle, or any modern relational or non-relational database. I have administered databases with hundreds of millions of records and terabytes of data.
Typically you split components up across different servers so you can manage up time, resilience, and performance more easily.
It's certainly quite possible to have one monster machine which does it all, but then you may need another monster machine in case your motherboard dies, or your datacenter is unavailable.
By splitting a web site or application up amongst different servers, it's easier to get cheaper machines, and more of them.
Thus you can build in resilience, and avoid components with similar hardware demands clashing.
It's also important to think about restore times for servers, and recovery plans.
What happens when your machine dies, can you replace it in the agreed upon time? Can you restore from backups in that time?
SQL Server and other enterprise-class databases shouldn't have any problems with databases in the tens or hundreds of gigabytes, as long as they're not designed too badly. (We have a few machines with that capacity/use which aren't struggling at all.)
In my mind that's nothing. Having tens of millions of rows on multiple tables with database size exceeding 10 GB has not caused problems for MS SQL Server. Of course it is not too fast with that much data, but otherwise it works just fine.
And to answer the question, too big is so big it does cause problems. And when it starts causing problems depends on the table structure and your performance demands.
Databases are extremely efficient at storing and retrieving relational data (i.e. data that is structured and has references to other data) - that's what they're designed to do. Honestly, 99% of the people spewing about key-value stores and Cassandra and whatnot have no clue what they're doing. A database server is just fine for storing large volumes of data, particularly if you're willing to put a bit of work into tuning it properly.
That said, there are use cases for Cassandra et al. - if you have mostly unstructured key/value data, or don't need consistency, or want to shard for redundancy, it may be worth investigating.
Unless you're an extremely popular website, you can probably get by just fine with a decent database server - don't switch until you've determined why you need to switch. Switching is fine; just make sure you are switching because it serves your needs better, and not because it's the "cool web-scale thing to do".

Would it ever be wise to have a SQL server per web server?

I'm wondering if, under the circumstances that
You get lots more reads than writes
Your SQL server of choice is cheap/free and offers a fast mirroring/replication service
Your database isn't insanely large
rather than having a separate SQL Server box, it would be better to have an instance of SQL Server on each machine getting instant updates from the master. This way there would be no network latency when doing all the read queries, but there would be a per-box performance hit as the SQL instance has to execute. Would this be better overall for performance? Are there any other pros/cons that might come up?
Your SQL Server should always be on a different box to the webserver, of that there is no question.
How many DB servers and webservers you have, and how they mirror (or otherwise) is up to how you scale your application.
You have SQL Server on a different machine because it needs (and deserves) a lot of RAM.
It's quite a common architectural pattern to have read-only replicas of a database. We accept some degree of staleness in them; perhaps they are even only updated once a day.
The general rule is that multiple copies introduce complexity in terms of operations and management, and tend to introduce the possibility of inconsistent data - almost inevitably the copies will not be perfectly in step (or the cost of making them so will be too high).
An example: what happens if your replication processing breaks a little, so that some, but not all, copies become stale? Now your users start to see radically different views of the world. How much might that matter to you? If it's a site with low-value data (e.g. celebrity sightings in London suburbs) then perhaps that's fine. If it's on-hand inventory, and being out of date means that your customers can't place orders, then maybe you care rather more.
My advice: things that sound simple at a boxes-on-paper sort of level don't always work out that way when you're sitting in an operations room at 3 AM. Be very sure that you can easily operate your solution.
How would your SQL Server be cheap/free? I would have said the licensing costs for this setup would be crippling. At retail prices you're looking at $6,000 per server. See also Jeff's comments about costs. Scale out the web servers by all means, but not your SQL Server until it's pretty much on its knees.
You might instead want to think about a distributed cache like Velocity or NCache.
Either way, run your site first with one SQL server and see how it copes with the load, then think about mirroring/replication across servers; otherwise you're just optimising prematurely. Measure first!
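On the "measure first" point, a rough way to sanity-check the "lots more reads than writes" assumption on SQL Server 2005 or later (a sketch; it counts index seeks/scans/lookups as reads and updates as writes, since the server last restarted):

    SELECT SUM(user_seeks + user_scans + user_lookups) AS reads,
           SUM(user_updates) AS writes
    FROM sys.dm_db_index_usage_stats
    WHERE database_id = DB_ID();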
An immediate con is that there is no distributed lock co-ordinator in SQL Server so you can get merge conflicts as updates can change the same row on two different servers at the same time.
Depending on the size of the database and the disks in the web servers, you may find your network latency is smaller than the disk latency you would start suffering, as web server disks will not usually be as performant as the disk array you give to the database. If you wanted that kind of performance, you would be buying it per web server.
Replication performance is not without latency either; the distribution of the transactions isn't 'free', and careful maintenance of the transaction log has to be planned to ensure you don't get log fragmentation (too many VLFs within the transaction log), which kills replication performance.
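A quick way to check for that is DBCC LOGINFO, which returns one row per virtual log file (VLF) in the current database; hundreds or thousands of rows is the warning sign:

    -- One result row per VLF; a bloated row count means a fragmented log
    DBCC LOGINFO;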

Which type of external drives are good for SQL backup files?

As part of database maintenance we are thinking of taking daily backups onto an external/FireWire drive. Are there any specific recommended drives for the frequent read/write operations involved in taking backups from SQL Server 2000?
Whatever you do, just don't use USB 1.1.
The simple fact is that hard drives will fail over a period of time. The two best solutions I can recommend unfortunately don't involve hard drives at all.
One is tape backup. Granted, it's slower, but you get the flexibility of offsite backups - it is easy to put a tape in the boot of a car, and rotating the tapes means that you have fairly recent protection against any unforeseen situations.
Another option is an online backup solution where the backups are encrypted and copied offsite. My recommendation is definitely to have at least some sort of offsite backup external to the building where you keep the SQL Servers. After all, it is "disaster" recovery.
Pretty much any external drive can be used here, provided it has the space to hold your backups and enough performance to get the backups there. The specifics depend on your exact requirements.
In my experience, FireWire tends to outperform USB for disk activity, regardless of their theoretical maximum transfer rates. And FireWire 800 will perform even better yet. I have found poor performance from FireWire and USB drives when you have multiple concurrent reads/writes going on, but with backups, it's generally more large sequential reads and writes.
Another option that is a little more complex to set up and manage, but can provide greater flexibility and performance, is external SATA (eSATA). You can even get hot-swappable external SATA enclosures for greater convenience and ease of taking your backups offsite.
However, another related option that I've had excellent success with is to setup a separate server to act as your backup server. You can use whatever disk options you choose (FireWire, SATA, eSATA, SCSI, FiberChannel, iSCSI, etc), and share out that disk storage as a network share (I use NFS and Samba on a Linux box, but for a Windows oriented network, a Windows share will work fine). You can then access the shares across the network and backup multiple machines to it. Also, the separation of backup server from your production machines will give you greater flexibility if you need to take it offline for maintenance, adding/removing storage, etc.
Drobo!
A USB-attached RAID array that uses normal, off-the-shelf hard drives. It has 4 bays; when you need more space, buy another hard drive. Out of bays? Buy bigger hard drives and replace the smallest one in the array.
http://www.drobo.com/
Depending on the size of the databases, the speed of the drive can be a real factor. I would look into something like a Drobo but with an eSATA or SAS interface; there is nothing more entertaining than watching a terabyte go through USB 2.0. Also, you might consider something like HyperBac or Red Gate SQL Backup to compress the backup and make it easier to fit on the drive.
For the most part, external drives aren't a good option - unless your database is really small.
Other than some of the options others have listed, you can also use UNC/Network shares as a great 'off-box' option.
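Backing up to a UNC path needs nothing special on the SQL Server side; a minimal sketch (\\backupsrv\sqlbackups is a placeholder, and the SQL Server service account needs write access to the share):

    BACKUP DATABASE Sales
        TO DISK = '\\backupsrv\sqlbackups\Sales_full.bak'
        WITH INIT;   -- overwrite rather than append to the backup set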
Check out the following video for some other options:
SQL Server Backup Options (Free Video)
And the videos on configuring backups on the site will show you how to specify a network path for backup purposes.
