SQL Server CPU vs. Storage Bottlenecking - sql-server

I've read quite a bit about SQL Servers using SSDs performing much better than traditional hard drives. In load tests with my app in a test environment, though, I'm able to keep my test DB server (SQL 2005) pegged between 75% and 100% CPU usage without much of a strain on disk access (as far as I can tell). My data set is still pretty small; database backups are under 100 MB. The test server I'm using is not new, but is also no slouch.
So, my questions:
Is the CPU the bottleneck (as opposed to the storage) because the dataset is small and therefore fits in memory?
Will this change once the dataset grows so paging is necessary?
Approximately how big (as a percentage of system memory) does the dataset have to get before SQL Server starts paging? Or does that depend on a lot of other factors?
As the app and its dataset grows, are there other bottlenecks that will tend to crop up besides CPU, storage, and lack of proper indexes?

Yes
Yes
If you have SQL Server configured to use as much memory as it can get, probably once the dataset exceeds the available system memory. But what actually triggers paging is very setup dependent (the queries being executed are the most common cause).
I/O between the requesting machine and the server is the only one I can think of, and that only matters if you are retrieving large datasets. I also would not class a lack of indexes as a bottleneck; rather, indexes enable better performance with regard to searching.
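If you want to sanity-check whether the working set still fits in memory, one option is to watch the buffer and memory manager counters. A minimal sketch, assuming SQL Server 2005 or later (the counters are exposed through sys.dm_os_performance_counters):

-- Compare how much memory SQL Server is using against its target, and check
-- page life expectancy (how long a data page survives in the buffer pool, in seconds).
SELECT [object_name],
       counter_name,
       cntr_value
FROM   sys.dm_os_performance_counters
WHERE  counter_name IN ('Total Server Memory (KB)',
                        'Target Server Memory (KB)',
                        'Page life expectancy');
-- If total memory is pinned at the target and page life expectancy stays low,
-- the working set no longer fits comfortably in the buffer pool.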

As long as the CPU is the bottleneck on your dedicated SQL Server machine, you don't have to worry about disk speed (assuming nothing is wrong with the machine). SQL Server WILL make heavy use of memory caching, and it has built-in strategies to perform as well as it can for a given load and the available resources. Just don't worry about it!
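One way to confirm where the bottleneck actually sits is to compare signal waits (time runnable tasks spend waiting for a CPU) with resource waits (I/O, locks, etc.). A rough sketch against sys.dm_os_wait_stats, which is available in SQL Server 2005:

-- A high signal-wait percentage points at CPU pressure rather than storage.
SELECT SUM(signal_wait_time_ms)                AS signal_waits_ms,
       SUM(wait_time_ms - signal_wait_time_ms) AS resource_waits_ms,
       CAST(100.0 * SUM(signal_wait_time_ms)
            / NULLIF(SUM(wait_time_ms), 0) AS DECIMAL(5, 2)) AS signal_wait_pct
FROM   sys.dm_os_wait_stats;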

Related

Fastest throughput local, RAM-cached DB

I'm looking for a DB solution for a high performance application.
The database will need to be local and stored in RAM for performance and will be several GB in size.
It will be local to the application, but it may be accessed by multiple processes running on the machine (up to 40). The data in the DB is immutable once it's been inserted and I only need a basic key value store rather than anything relational.
The obvious candidates are Memcached and Redis, but I believe they both have limitations with overhead and bottlenecks from the network component.
Something like Berkeley DB would also appear to be ideal, but it's only single process as far as I can see.
Throughput is the most important consideration (more so than latency).

Can a 200GB or bigger SQL Server database load into a pagefile when not enough memory is available on the server?

Here's what we're trying to do.
We will have a 200GB+ SQL Server database that needs to load into memory. Microsoft best practice is to have enough physical memory available on the server and then load the entire database into that. That means we would need 256GB of memory on each of our SQL Servers. This would result in fast access to the database loaded into memory, but at a high cost in memory. BTW, we're running SQL Server 2008 on Windows Server 2008.
Currently, our server is set up with only 12GB of memory. Just under 3GB is allocated to the OS, and the remaining 9GB is used for SQL Server. Is it possible to increase the page file to 256GB and set it up on an SSD drive? What we want to do then is load the database into the page file located on the SSD. We're hoping the performance will be similar to loading the entire database into memory, since it'll be on an SSD.
Will this work? Is there another alternative we're overlooking? We want to keep the costs down as much as we can, without sacrificing the performance of our environment. Any advice would be appreciated.
Thanks.
If you want the database to be stored in memory, you need to buy more memory. In spite of what the other answer suggests, memory is the absolute best and cheapest way to make a database perform better - SQL Server is designed to use memory well.
While SQL Server will take advantage of the page file when it has to, and while having the page file on an SSD will be somewhat faster than having it on an old-fashioned mechanical disk, it is still I/O and swapping, and there is a lot of overhead around that regardless of the disk type underneath. This may turn out to be a little bit better, in general, than having the same page file on a spinny disk (or no page file at all), but I don't think it's going to be anywhere near the impact of having real memory, or that it's going to come anywhere close to your expectations of "fast access."
If you can't buy more memory then you can start with this page file on an SSD, but I'm confident you will need to additionally focus on other tuning opportunities - largely making sure you have indexes that support the type of queries you run, avoiding full table scans as much as possible. For full table aggregates you can consider indexed views (see here and here); for subsets you can consider filtered indexes.
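For illustration only, a minimal sketch of both techniques against a hypothetical dbo.Orders table (filtered indexes require SQL Server 2008, which matches the version in the question; indexed views need SCHEMABINDING, a unique clustered index, and SUM over a non-nullable column):

-- Indexed view: materializes a full-table aggregate so it is not recomputed
-- by scanning the base table on every query.
CREATE VIEW dbo.vOrderTotals
WITH SCHEMABINDING
AS
    SELECT CustomerID,
           COUNT_BIG(*) AS OrderCount,
           SUM(Amount)  AS TotalAmount   -- Amount assumed NOT NULL
    FROM   dbo.Orders
    GROUP BY CustomerID;
GO
CREATE UNIQUE CLUSTERED INDEX IX_vOrderTotals ON dbo.vOrderTotals (CustomerID);
GO
-- Filtered index: covers only the subset of rows the hot queries touch.
CREATE NONCLUSTERED INDEX IX_Orders_Open
    ON dbo.Orders (CustomerID, Amount)
    WHERE Status = 'Open';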
And just to be sure: you are storing the actual data on an SSD drive, right? If not, then I would argue that you should use the SSD for the data and/or log, not for the page file. The page file isn't going to offer you much benefit if it is constantly swapping data in and out by exchanging with a spinny disk.
Need more clarity on the question.
Are you in control of the database or is this a COTS solution that limits your ability to optimize?
Are you clustering? Is that why adding 200+ GB of RAM is an issue (now more than 400 GB, 200 per node)?
Are you on bare metal or virtualized? Is this why RAM may be an issue?
So far it would seem the "experts" have made some assumptions that may not be fair to your circumstance.
Please update your question... :)

When can I host IIS and SQL Server on the same machine?

I've read that it's unwise to install SQL Server and IIS on the same machine, but I haven't seen any evidence for that. Has anybody tried this, and if so, what were the results? At what point is it necessary to separate them? Is any tuning necessary? I'm concerned specifically with IIS7 and SQL Server 2008.
If somebody can provide numbers showing when it makes more sense to go to two machines, that would be most helpful.
It is unwise to run SQL Server alongside any other product, including another instance of SQL Server. The reason for this recommendation is the nature of how SQL Server uses OS resources. SQL Server runs on a user-mode memory management and processor scheduling infrastructure called SQLOS. SQL Server is designed to run at peak performance and assumes it is the only server on the OS. As such, SQLOS reserves all the RAM on the machine for the SQL Server process, creates a scheduler for each CPU core, and distributes tasks across all schedulers, using all the CPU it can get when it needs it. Because SQL Server reserves all memory, other processes that need memory cause SQL Server to see memory pressure, and the response to memory pressure is to evict pages from the buffer pool and compiled plans from the plan cache. And since SQL Server is the only server that actually leverages the memory notification API (there are rumors that the next Exchange will too), it is the only process that actually shrinks to make room for other processes (like leaky, buggy ASP pools). This behavior is also explained in BOL: Dynamic Memory Management.
A similar pattern happens with CPU scheduling, where other processes steal CPU time from the SQL Server schedulers. On high-end systems and on Opteron machines things get worse, because SQL Server uses NUMA locality to full advantage, but most other processes are not NUMA-aware; as much as the OS tries to preserve locality of allocations, they end up allocating all over physical RAM and reduce the overall throughput of the system as CPUs idle waiting for cross-NUMA-boundary page access. There are other things to consider too, like increased TLB and L2 misses due to other processes taking up CPU cycles.
So to sum up: you can run other servers alongside SQL Server, but it is not recommended. If you must, then make sure you isolate the two servers as well as you can. Use CPU affinity masks for both SQL Server and IIS/ASP to pin them to separate cores, configure SQL Server to reserve less RAM so that it leaves free memory for IIS/ASP, and configure your app pools to recycle aggressively to keep application pool growth in check.
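As a rough sketch of that kind of isolation (the numbers below are illustrative only, not a recommendation for any particular hardware):

-- Cap SQL Server's memory so IIS/ASP.NET has guaranteed headroom, and pin
-- SQL Server to a subset of cores via the affinity mask.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'max server memory (MB)', 4096;  -- illustrative value
EXEC sp_configure 'affinity mask', 15;             -- 15 = binary 1111 = cores 0-3
RECONFIGURE;
-- Then pin the IIS application pool to the remaining cores on the IIS side
-- (processor affinity on the app pool) so the two don't compete.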
Yes, it is possible and many do it.
It tends to be a question of security and/or performance.
Security is questioned as your attack surface is increased on a box that has both. Perhaps not an issue for you.
Performance is questioned as now your server is serving web and DB requests. Again, perhaps not an issue in your case.
Test vs. Production....
Many may feel fine in test environments but not production....
Again, your team's call. I like my test and production environments being as similar as possible if possible but that's my preference.
It's possible, yes.
A good idea for a production environment, no.
The problem that you're going to run in to is that a SQL Server database under substantial load is, more than likely, going to be doing heavy disk I/O and have a large memory footprint. That combination is going to tie up the machine, and you're going to see a performance hit in IIS as it tries to serve up the pages.
It's unwise in certain contexts... totally wise in others.
If your machine is underutilized and won't experience heavy loads, then there is an advantage to installing the database on the same machine, because you simply won't have to transfer anything across the network.
On the other hand, if one or both of IIS or the database will be under heavy load, they will likely start to interfere, and the performance gain of dedicated hardware for each will probably outstrip the loss of having to go over the network.
Don't forget the maintenance issue... you can't reboot/patch one without taking down the other. If they are on two boxes, you can give your users a better experience than no response at all from the web server while you are maintaining the SQL box.
Not highest on the list, but should be noted.
You certainly can. You will run into performance issues if, for example, you have a large user base or if there are a lot of heavy queries being run against the DB. I have worked on several sites, usually hosted at 1and1, that run IIS and SQL Server (Express!) on the same box with thousands of users (hundreds concurrent) and millions of records in poorly designed tables, accessed via poorly written stored procedures, and the user experience was certainly tolerable. It all comes down to how hard you plan on hitting the server.

SQL Server 2000 Page size and the NTFS page size

In the Oracle world, it's been gospel to make your database block size an even multiple of the file system's block size. I assume this is still true, but I'm not averse to being told why technology has made this irrelevant.
But I've been told some SQL Server DBAs are going to upgrade the OS of an SS2000 installation to 64-bit to get 64k pages in the FS.
Does SQL Server 2000 support changing the page size?
From what I've read it's fixed at 8k. Is that right?
If it is fixed at 8k, would there be any advantage to making the FS 64k?
I'm getting this information from a reliable source, but it is nonetheless second-hand.
EDIT: Thanks to SAMBO, I've read the links and found the recommendation that the "NTFS Allocation Unit Size" be set to 64 KB.
I assume that term = block size...
So the conflict I have between 8 KB DB pages and 64 KB FS blocks is in fact the setup recommended by MS.
Make sure you read Microsoft's Predeployment I/O Best Practices.
It recommends using 64K allocation units for NTFS volumes.
Also, read SQL Server 2000 IO basics
Finally have a look at this post.
SQL Server's page size is in fact 8 KB, and this is not configurable. The advantage of having a larger allocation unit on the OS side is that you can perhaps get slightly better performance when SQL Server is fetching pages into its cache (an extent is eight 8 KB pages, i.e. 64 KB).
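If you want to measure whether file I/O is even a factor before touching allocation units, SQL Server 2000 already exposes per-file statistics through ::fn_virtualfilestats (the database name and file id below are placeholders):

-- Cumulative reads, writes and I/O stall time for one database file.
DECLARE @db int, @file int;
SELECT @db = DB_ID('MyDatabase'), @file = 1;   -- file id 1 = primary data file

SELECT DB_NAME(DbId) AS database_name,
       FileId,
       NumberReads,
       NumberWrites,
       IoStallMS      -- total milliseconds spent waiting on I/O for this file
FROM   ::fn_virtualfilestats(@db, @file);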
From my experience, I doubt mucking around with these values will give you any noticeable performance improvement; at best you would get a minuscule one.
Better to spend your effort on things like isolating tempdb, ensuring RAID 1/0 arrays are used, having your transaction log live on a different array from the data file, and optimizing queries.
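For example, relocating tempdb is a one-time metadata change; a sketch with hypothetical paths (the files only move the next time the SQL Server service restarts):

-- Point tempdb's data and log files at dedicated drives/arrays.
ALTER DATABASE tempdb MODIFY FILE (NAME = tempdev, FILENAME = 'T:\SQLData\tempdb.mdf');
ALTER DATABASE tempdb MODIFY FILE (NAME = templog, FILENAME = 'U:\SQLLogs\templog.ldf');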
The overall performance of the file-system can make a noticeable difference.
For example, I heard when Windows Server 2003 came out that SQL Server 2000 performance on that platform was improved significantly.
So it doesn't surprise me. I don't think the even-multiple factor is that big of a deal.

What's the best storage for SQL Server 2000?

I want to put my SQL Server database files on an Intel SS4000-E storage device. It's a NAS. Would it be possible to use it as storage for SQL Server 2000? If not, what is the best solution?
I strongly recommend against it.
Put your data files locally on the server itself, with RAID mirrored drives. The reasons are twofold:
SQL Server will run much faster for all but the smallest workloads
SQL Server will be much less prone to corruption in case the link to the NAS gets broken.
Use the NAS to store backups of your SQL Server, not to host your data files. I don't know what your database size will be, or what your usage pattern will be, so I can't tell you what you MUST have. At a minimum, for a database that's going to take any significant load in a production environment, I would recommend two logical drives (one for data, one for your transaction log), each consisting of a RAID 1 array of the fastest drives you can stomach to buy. If that's overkill, put your database on just two physical drives (one for the transaction log, and one for data). If even THAT is over budget, put your data on a single drive and back up often. But if you choose the single-drive or NAS solution, IMO you are putting your faith in the Power of Prayer (which may not be a bad thing, it just isn't that effective when designing databases).
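A minimal sketch of that layout at database-creation time (drive letters, names and sizes are hypothetical):

-- Data file on one array/drive, transaction log on another, so sequential
-- log writes don't compete with random data-file I/O.
CREATE DATABASE SalesDB
ON PRIMARY
    ( NAME = SalesDB_data, FILENAME = 'D:\SQLData\SalesDB.mdf',
      SIZE = 10240MB, FILEGROWTH = 1024MB )
LOG ON
    ( NAME = SalesDB_log, FILENAME = 'L:\SQLLogs\SalesDB.ldf',
      SIZE = 2048MB, FILEGROWTH = 512MB );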
Note that a NAS is not the same thing as a SAN (on which people typically DO put database files). A NAS typically is much slower and has much less bandwidth than a SAN connection, which is designed for very high reliability, high speed, advanced management, and low latency. A NAS is geared more toward reducing your cost of network storage.
My gut reaction: I think you're mad risking your data on a NAS. SQL Server expects continuous, low-latency, uninterrupted access to its storage subsystem. A NAS is almost certainly none of those things - use local or SAN storage (in that order of performance, simplicity and therefore preference) and leave the NAS for offline file storage/backups.
The following KB lists some of the constraints and issues you'd encounter trying to use a NAS with SQL - while the KB covers SQL 7 through 2005, a lot of the information still applies to SQL 2008 too.
http://support.microsoft.com/kb/304261
Local storage is almost always faster than networked storage.
Your performance for sql will depend on how your objects, files, and filegroups are defined, and how consumers use the data.
Well "best" means different things to different people, but I think "best" performance would be a TMS RAMSAN or a RAID of SSDs... etc
Best capacity would be achieved with a RAID of large HDDs...
Best reliability/data safety would be achieved with mirroring across many drives, and regular backups (preferably off-site)...
Best availability... I don't know... maybe clone the system and have a hot backup ready to go at all times.
Best security would require encryption, but mainly limiting physical access to the machine (and its backups) is enough unless it's internet-connected.
As the other answers point out, there will be a performance penalty here.
It is also worth mentioning that these devices sometimes implement a RAM cache to improve I/O performance. If that is the case and you do trial this config, the NAS should be on the same power protection/UPS as the server hardware; otherwise, in case of a power outage, the NAS may 'lose' the part of the file held in cache. Ouch!
It can work but a dedicated fiber attached SAN will be better.
Local will usually be faster but it has limited size and won't scale easily.
I'm not familiar with the hardware but we initially deployed a warehouse on a shared NAS. Here's what we found.
We were regularly competing for resources on the head unit -- there was only so much bandwidth that it could handle. Massive warehouse queries and data loads were severely impacted.
We needed 1.5 TB for our warehouse (data/indexes/logs), and we put each of these resources onto a separate set of LUNs (like you might do with attached storage). The data ended up spanning just 10 disks, and we ran into all sorts of I/O bottlenecks with this. The better solution was to create one big partition across lots of small disks and store data, indexes and logs all in the same place. This sped things up considerably.
If you're dealing with a moderately used OLTP system, you might be fine but a NAS can be troublesome.
