How do you stress load a dev database (server) locally? - sql-server

Wow, this title immediately gave me "The question you're asking appears subjective and is likely to be closed."
Anyway, after some searching and reading, I decided to ask it.
This comes from my question "What are the first issues to check while optimizing an existing database?", which boiled down to the need to stress-load a local SQL Server dev database received as a .bak backup file.
Did I understand correctly from paxdiablo's answer to "DB (SQL) automated stress/load tools?" that there are no general-purpose SQL stress/load-testing tools independent of the RDBMS?
What are the stress/load-testing tools for SQL Server?
What do you do for cheap-and-dirty stress loading of a local dev SQL Server database?
Update: I am interested in stress loading SQL Server 2000, 2005 and 2008 databases
(having no clue about 2000).
OK, let's put aside the final/real testing (that's for QA specialists, DBAs and sysadmins) and confine the question to stress loading that exposes obvious (outrageous) design flaws and performance bottlenecks.
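For reference, a minimal quick-and-dirty sketch of the kind of loop this is asking about. All object names here are placeholders; run several copies of it from separate connections (or drive the same statement with a tool such as Microsoft's ostress or SQLQueryStress) to get some concurrency:

```sql
-- Hypothetical quick-and-dirty stress loop: hammer one suspect query or
-- procedure repeatedly while you watch waits, CPU and I/O.
-- dbo.usp_SuspectProcedure is a placeholder for whatever you want to test.
SET NOCOUNT ON;

DECLARE @i INT;
SET @i = 1;

WHILE @i <= 10000
BEGIN
    EXEC dbo.usp_SuspectProcedure @SomeParam = @i;  -- replace with your own workload
    SET @i = @i + 1;
END;
```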

As one word of warning: it is easy to stress test a local db to see whether the db design is good / bad / missing things (bad indices etc.).
Trying to get real-time performance metrics out of that is futile. Even with plenty of memory (unlikely - most workstations are crappy memory-wise compared to real db servers), your disc subsystem will SUCK (in 100-meter-high letters) compared to what a real db server can throw at it, because a normal local dev database only has one or maybe two discs, while db servers often utilize a LOT more and a LOT faster discs. As such, what may be SLOOOOW on your workstation may be a seconds-long operation on the server.
But again, things like bad index usage are still visible.
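One cheap way to make that index picture concrete is the missing-index DMVs. A sketch, assuming SQL Server 2005/2008 (these views do not exist on SQL Server 2000):

```sql
-- Rough list of the missing-index suggestions the optimizer has recorded
-- since the last restart, ordered by a crude benefit estimate.
SELECT TOP (20)
       mid.statement AS table_name,
       mid.equality_columns,
       mid.inequality_columns,
       mid.included_columns,
       migs.user_seeks,
       migs.avg_total_user_cost * migs.avg_user_impact * migs.user_seeks AS rough_benefit
FROM sys.dm_db_missing_index_details AS mid
JOIN sys.dm_db_missing_index_groups AS mig
  ON mig.index_handle = mid.index_handle
JOIN sys.dm_db_missing_index_group_stats AS migs
  ON migs.group_handle = mig.index_group_handle
ORDER BY rough_benefit DESC;
```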

You are correct.
There are no general-purpose SQL stress/load-testing tools independent of the RDBMS.
And how could there be? You can benchmark the throughput of the hardware sub-systems in isolation (such as the SAN or the network), but the performance of your database depends very much on the access patterns of your application(s), the type of RDBMS and the hardware.
Your best bet is to load test your application connected to your database on a representative hardware platform. There are several tools that can do this, including the Ultimate edition of Microsoft Visual Studio 2010.

Related

SQL Server: when is it correct to use a different instance for performance purposes?

I have two very big and "stressed" databases in a single SQL Server 2008 instance, and I experience noticeable slowness in the first database when the second database is under heavy work.
It also appears that the server RAM and CPU are not really under stress and I have some spare resources that I can use.
I'm planning to buy a second SQL Server machine and move one of the databases to separate them, but before doing so I would like to understand whether creating two different instances on the same server, with one database each, could solve my issue.
Thanks for any help provided.
If there's no stress on the CPU, please check the other components for troubleshooting. Also check the instance configuration, like MAXDOP, cost threshold for parallelism, etc. It doesn't make sense to buy a completely new server without knowing the root cause of the issue.
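For reference, a sketch of checking and changing those two settings with sp_configure (the values 4 and 50 are only placeholders, not recommendations for your workload):

```sql
-- Show advanced options so the parallelism settings become visible/settable.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;

-- Illustrative values only; pick numbers appropriate for your hardware.
EXEC sp_configure 'max degree of parallelism', 4;
EXEC sp_configure 'cost threshold for parallelism', 50;
RECONFIGURE;
```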
In my experience a separate instance on the same server will allow you to control the resources allocated to each instance. You can, however, do that with other means like Resource Governor, especially on newer versions of SQL Server. I have found that creating a new instance is more useful for security and separation of concerns than for performance.
I agree with Dan's comment. If you have no issues with RAM / CPU, disk is the next place to look. I have, however, seen slow performance with RAM / CPU / disk / network all at low usage, and the problem was solved with changes to indexing.
If your disk looks good, I suggest checking for blocking and locking issues as well as index tuning.
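A quick way to look for blocking while the slowness is happening might be a sketch like this (SQL Server 2005/2008; on 2000 you would look at the blocked column in sysprocesses instead):

```sql
-- Requests that are currently blocked, who is blocking them, and what they run.
SELECT r.session_id,
       r.blocking_session_id,
       r.wait_type,
       r.wait_time,
       t.text AS running_sql
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
WHERE r.blocking_session_id <> 0;
```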
Well, there are a few reasons to use different instances:
- Different resource configuration
- Server-level settings / installation settings
- The most important of them all: each instance has only one TempDB, and you can only use so much of it. So if you predict you will hit that limit, more than one instance really makes sense.
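If TempDB pressure is the suspicion, a rough sketch of seeing what is consuming it on the current instance (SQL Server 2005/2008):

```sql
-- Pages are 8 KB, so multiplying the page counts by 8 gives KB.
SELECT SUM(user_object_reserved_page_count)     * 8 AS user_objects_kb,
       SUM(internal_object_reserved_page_count) * 8 AS internal_objects_kb,
       SUM(version_store_reserved_page_count)   * 8 AS version_store_kb,
       SUM(unallocated_extent_page_count)       * 8 AS free_kb
FROM tempdb.sys.dm_db_file_space_usage;
```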

Sql Server distribution and configuration for best performance

I want to design and implement an enterprise application with Silverlight, using a SQL Server database. Many users will run SQL queries against that database.
How can I configure the SQL Server database for best performance?
How can I distribute the SQL Server database for best performance?
How can I distribute the SQL Server database between several servers for best performance?
And what technologies can I use in SQL Server for best performance?
In addition to replication, you can use mirroring or log shipping for this. Note that I am talking only about scaling out reads, not writes. So reports etc. can be run from the copies of the database, but writes must go to the main copy (unless you are using merge replication, which is frightening to me). There are some caveats, of course.
With database mirroring, you can use the secondary as a read-only reporting source by taking a snapshot. There are limits here to how many databases you can mirror and there is of course maintenance to manage the snapshots. It is not quite true distribution of resources here, but it can be helpful to offload some of the load. In the next version of SQL Server (Denali), you will be able to set secondaries as read-only, so you can avoid the maintenance of snapshots.
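For illustration, creating such a reporting snapshot might look like the sketch below. The database name, logical file name and path are assumptions about your environment, and database snapshots require Enterprise (or Developer) edition:

```sql
-- Read-only, point-in-time snapshot of the Sales database for reporting.
-- One ON (...) clause is needed per data file of the source database.
CREATE DATABASE Sales_Reporting_Snapshot
ON ( NAME = Sales_Data,                          -- logical name of the source data file
     FILENAME = 'D:\Snapshots\Sales_Data.ss' )   -- sparse file that backs the snapshot
AS SNAPSHOT OF Sales;
```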
With log shipping, you can essentially keep a stale version of the database around for reporting, and replace it periodically by restoring logs to it. You have a lot more flexibility here compared to replication or mirroring, as you can actually define a delay (like every 6 hours or once a day, you refresh the copy) - which can also serve as a "recover from a shoot-yourself-in-the-foot" scenario. The downside is that to restore a new copy of the database you need to kick all the current users out, as the database needs to be in single user mode in order to recover.
Those are just a couple of ideas for helping scale out reads, but deep down I agree with #gbn - are you solving a problem you don't have yet? It's one thing to design for scalability, but it's very easy to step over that line and completely over-engineer.
Well, SQL Server doesn't really have a load-balancing mechanism in and of itself. What it does support, however, is an active/passive node configuration and also replication.
We are using the replication strategy in one application I support. You can read more about it here:
http://msdn.microsoft.com/en-us/library/ms151198.aspx
In our configuration, we basically have a transactional database and a reporting database. We replicate the data from our transactional DB to the reporting DB. Any reporting is done against this reporting DB, so that we don't slow down work being done on the transactional DB due to some long running report.
Note that the replication isn't truly real time. In other words, there's some time involved in replicating the data from the transactional to the reporting DB, albeit a very small amount of time. But replication is certainly one strategy you could consider if you are trying to balance workload.
Other things you might consider are partitioning large tables for better performance.
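As a sketch of what partitioning looks like (SQL Server 2005/2008 Enterprise edition; all names, boundary values and the single-filegroup mapping are illustrative only):

```sql
-- Range-partition an Orders table by year.
CREATE PARTITION FUNCTION pf_OrdersByYear (DATETIME)
AS RANGE RIGHT FOR VALUES ('2009-01-01', '2010-01-01', '2011-01-01');

CREATE PARTITION SCHEME ps_OrdersByYear
AS PARTITION pf_OrdersByYear ALL TO ([PRIMARY]);

CREATE TABLE dbo.Orders
(
    OrderID   INT      NOT NULL,
    OrderDate DATETIME NOT NULL,
    Amount    MONEY    NOT NULL,
    -- The partitioning column must be part of the clustered key.
    CONSTRAINT PK_Orders PRIMARY KEY (OrderID, OrderDate)
)
ON ps_OrdersByYear (OrderDate);
```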
As gbn pointed out in his comment though, it's better to determine if you actually need these strategies before implementing them, because they add a lot of complexity and maintenance efforts, which may not even be needed. It's important to properly analyze how much data you think you will have, and how much activity will be occurring against that data to determine if strategies such as the ones I just described are even needed.
Also, you can refer to this link for some other helpful information and some links to whitepapers you may find helpful:
http://social.msdn.microsoft.com/Forums/en/sqldisasterrecovery/thread/05cf41b7-c558-44bf-86c6-12f5c2b2ffe2

What database to use for big data storage and manipulation?

I have to make a decision about which database server to use for my next project, but the simple decision to use MySQL, like in almost all the projects I have done, is harder now, because I expect a very large number of records.
The database will store a user list, some other irrelevant tables, and the last one, some user-collected data. Let's say I have 6000 users responding to a quiz about each other. Simple math shows that from those users, if each one completes the quiz about everyone (and in my project it is 99% sure that will happen), I'll end up with 35.99 million records (they exclude themselves, so in this particular situation the operation is 6000 * 5999). Unfortunately 6000 may be a small number, and the real one is growing day by day.
What to choose? MySQL, and maybe, if things go well and the project grows, expand it into a cluster? PostgreSQL? MSSQL? Oracle?
I've read about all of them; each one has its pros and cons, but I still don't know what to choose. The advantage of MySQL and PostgreSQL is, of course, the starting price of $0, which is pretty nice for a usual self-funded startup.
Any opinions, pieces of advice? If you encountered this situation in your experience as developers, I'd love to hear from you.
These days, free isn't something that differentiates between databases any more. Both Oracle and SQL Server have free versions, but the limitation is resources: a 4 GB database, limited RAM and single-CPU utilization. Millions of records are not a concern; what matters is which data types you're using.
I saw the OP's comment about not liking MS software. That's your prerogative, but using the free version of either Oracle or SQL Server does give you a seamless transition to the upscale versions of the respective database.
Personally, my choice would be either Oracle or SQL Server because of what are, IMHO, real feature considerations: hierarchical query support, subquery factoring/CTEs, packages (long before I get concerned with functions/procedures), full-text searching, XML support, etc.
MySQL will handle 35 million records no problem. Worry about scalability when you get there. You can easily add RAID hard disks backing your database tables, and if you really start getting big you can get a Compellent SAN that will scream... Don't worry about the DB engine as much as the underlying hardware. MySQL rocks for us with millions of records.
I've had no problems handling tables as large as 36,000,000 rows on MySQL and Oracle.
Just be sure that you index the proper columns, run EXPLAINs for your queries, and maintain proper design principles.
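For example, in MySQL that might look like the following sketch (the table and column names are made up for the quiz scenario in the question):

```sql
-- Index the columns the quiz-answer queries filter on...
CREATE INDEX idx_answers_about_author
    ON quiz_answers (about_user_id, author_user_id);

-- ...and verify with EXPLAIN that the index is actually used.
EXPLAIN
SELECT *
FROM quiz_answers
WHERE about_user_id = 42;
```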
Most of the truly large scale web properties use a distributed key-value store. That said, 35 million is large, but not that large. With most modern databases, your main two scaling worries should be throughput and what happens when no single box can contain your entire database anymore. And both of these problems can be solved to some degree for any database you choose to use. (Caching, replication, sharding, etc.)
Use MySQL until you can't anymore. At that point, you ought to be rolling in dough anyways and you now have a very desirable problem.
Use MySQL as it's free and you have experience with it.
Besides in my opinion it matters more on how you design the tables than which database you use.
35 million records can be easily handled by MS SQL Server (assuming proper database design, indices, etc.). You can start with the free SQL Server Express edition and later, if you need, you can upgrade to the full version which supports clustering, etc.
SQL Server Express does have some limitations - single CPU, 1 GB memory, max 4 GB database size and a few other things. I'm not sure how quickly these limitations will become a problem but you can always move to the full version when you run into them.
MySQL(i) & PostgreSQL
- $0 cost
- large community
- many tutorials
- well documented
MSSQL
- You can get "money" from MS if you promote that you are using MSSQL (secret information from some companies I worked for)
- MS tools work very well
- Complete tool set, from the C# IDE over the .NET libraries to Windows Server 2003
Oracle
- Professional and commercial provider
- Used by many large companies (I also heard about Blizzard (World of Warcraft) using Oracle)
- expensive
The final decision depends on the very specific requirements of your project.
Make yourself a quick list of things that ARE IMPORTANT for your project (e.g. quickly performed queries) and look up which database's pros best match your requirements.
Everything is about design. SQL databases are a kind of car: you just have to know which component has to be placed here and which there.
Make a clear design and you won't struggle with any of them.
Maybe you can test Firebird.
There is a blog post about a big Firebird database here.
The MySQL license is here (it is not always free).
PostgreSQL and Firebird are free.
First of all, don't think about performance. Premature optimization being the root of all evil and all that. You can always throw more hardware and/or tuning at it later.
All of the mentioned should perform nicely if tuned/maintained correctly. I'd focus on manageability and familiarity. IMHO, open-source databases excel at manageability (perhaps not the best GUIs, but the CLI has been my home for a long, long time).
And if the database becomes the bottleneck, why limit yourself to those choices? How about a key-value distributed database? Or perhaps serialize data directly to disk? Storing data outside of a RDBMS, while often frowned upon, might be the correct path. Or simply use the common route of denormalization.
Always remember not to optimize prematurely.
As far as opinions go (since you specifically asked for it) I favor open source databases, specifically PostgreSQL. It's rock solid, fast and very well-featured. And even with (relatively) large datasets it has performed superbly on mediocre hardware (some tuning involved, of course, but you can't skip that step no matter which db you end up choosing).

Oracle XE or Postgres?

Should I use Postgres or Oracle XE for a web application, and why?
They are both available for free and for commercial use.
Why should I use one database over the other?
Well, with Oracle XE, if you hit the limits, you have to either buy Oracle or migrate your entire application to a different database. With PostgreSQL, there are no limits, and the software is completely free and open-source.
A few things to be aware of if you choose Oracle XE:
It will only use one CPU if you have a multiprocessor server
It will only use up to a maximum of 1 gigabyte of memory
It has a database size limit of 4 gigabytes for user data
Only available for 32-bit Windows and 32-bit Linux
If those limitations aren't an issue for you and you like the Oracle approach, then give it a shot; otherwise consider an open-source server like Postgres or MySQL, which have none of the aforementioned limitations.
If you do choose XE and then later find your requirements have changed, the next version up of Oracle is 900 USD for 5 seats, plus an additional 180 USD per seat. This is in fact a bit cheaper than MS SQL Server, AFAIK.
There are some good reasons to choose Oracle, particularly if you're a Java developer e.g. you can write stored procedures in Java, and I think there's native support for Java web services. Ultimately however you need to weigh up the cost with the requirements of your application. MySQL and Postgres will allow you to scale your application without any cost (other than hardware obviously).
You haven't provided much information, but unless you're already an Oracle expert, I see no reason to choose Oracle XE over PostgreSQL. PostgreSQL will always be free and is far more capable and more scalable.
And you can choose to run PostgreSQL on Windows, Mac OS X or Linux. I think Oracle XE is limited to Windows and Linux.
When choosing a db vendor, be careful to match the demands of your app against the strengths of the vendor's db product. And watch out for their weaknesses.
For example, if you know your writes will be as frequent as your reads and both will occur simultaneously, then you'll want to know how each vendor you consider handles concurrency. Vendors that rely on elaborate lock managers with complex lock-escalation schemes are likely to bring you grief if you expect heavy load on your app. You'll spend more time trying to work around the DB's lock manager than actually solving your problems.
That's one example. Every DB has its strengths and weaknesses to consider. Do your research, find a site that compares vendors and make a choice that balances your needs against that. If you can get an eval copy, all the better to run some proof-of-concept tests against. Write scripts that pummel the db in a way similar to what you expect your app to produce and go from there. While you're at it, get the query plans for the SQL in your scripts from each vendor and see what you can learn from that about how each vendor's optimizer works.
There's more that can be said, but hopefully you get the gist.
What is your platform of choice? If it's Windows, I'd go with Oracle or MySQL as I have heard only bad things about running PostgreSQL on Windows machines.
Also, Oracle has a more GUI-based approach to configuration, so if you're not a Linux hacker, you may like it better.
Postgres, on the other hand, has a way superior SQL console, and its console tools in general are more developed and easier to use. Oracle has more tools, but the decent ones (like the PL/SQL navigator) are not free, and the free ones simply suck.
Remember that choosing a database is a long-term decision so consider the possible changes in requirements for your application as well. If you are not prepared to eventually spend money on Oracle, better go with Postgres -- safe option. And good enough for most projects.

Running SQL Server on the Web Server

Is it good, bad, or indifferent to run SQL Server on your webserver?
I'm using Server 2008 and SQL Server 2005, but I don't think that matters to this question.
For small sites, it doesn't make a bit of difference.
As the load grows, though, this scales really badly, and quicker than you think:
Database servers are built on the premise they "own" the server. They trade memory for speed and they easily use all available RAM for internal caching.
Once resources start to be scarce, profiling becomes very difficult -- it is clear that IIS and SQL are both suffering, less clear where the bottleneck is. IIS needs CPU, SQL Server needs RAM or CPU etc etc
No matter how many layers you put in your code, it all runs on the same CPU, therefore a single layered application will run better in this context -- less overhead -- but it will not scale.
Security is really bad; usually you isolate SQL behind a firewall!
If you can afford it, it's probably better to shell out a few bucks and get a second server, maybe using PostgreSQL. One IIS server plus one PostgreSQL server costs about as much as one IIS + SQL Server box, because of licensing costs...
Larger shops would probably not consider this a best practice... However, if you aren't dealing with hundreds of requests per second, you're fine putting them both on one box.
In fact, for small apps, you will see better performance on the back-end because data does not have to go across the wire. It's all about scale.
Keep in mind that database servers eat memory. Here's one important lesson from the school of hard knocks: if you decide to run SQL Server 2005 on the same machine as your web server (and that is the setup you mentioned in your question), make sure you go into Sql Server Management Studio and do this:
Right-click on the server instance and click Properties.
Select 'Memory' from the list on the left.
Change 'Maximum server memory' to something your server can sustain.
If you don't do that, SQL Server will eventually take up all of your server's RAM and hang onto it indefinitely. This will cause your server to more or less sputter and die. If you are not aware of this, it can be very frustrating to troubleshoot.
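The same setting can be applied with T-SQL instead of the GUI; a sketch (2048 MB is only an example value you would size for your own box):

```sql
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;

-- Cap SQL Server's buffer pool so IIS and the OS keep some RAM.
EXEC sp_configure 'max server memory (MB)', 2048;
RECONFIGURE;
```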
I've done this quite a few times. It's not something you would do if you had the infrastructure of a large corporation and it does not scale, but it's fine for a lot of things.
It really comes down to how much work your webserver and your sql server are doing.
Without more information I doubt you are going to get any helpful answers.
If your web server is publicly accessible, this is a VERY bad idea from a security perspective.
Although it makes a lot of things more difficult from a routing, firewall, ports, authentication, etc. perspective, separation is good. When you have your database server running on the web server, if your web server is compromised, then your sql server is, too.
When you have them on separate boxes, you've raised the bar a little.
There's still a lot more work to be done to secure your web server AND your database server, but why make it easier than it needs to be?
I'd say it is best to run them on the same server until it becomes a problem. That way you'll save yourself some money and time upfront. Once the site becomes a success and requires some architectural changes, it should have already paid for itself.
Remember to back up :)
It will depend on the expected load of the server. For small sites, it is no problem at all (if correctly configured). For large sites, you might want to consider distributing the load over different servers: web server, file server, database server, etc.
I've seen this issue over and over again. The right answer is to put SQL Server on one machine and IIS (the web server) on the other. Your money will go into the SQL Server machine, because the right drive system and RAM must be purchased to support an efficient database server, but the web server can be a much more scaled-down and less expensive machine with just a mirrored drive set.
