I am looking for a stress tool for SQL Server. I've seen a lot of suggestions on Google. But nothing of what I really need.
I am really looking for a tool that could run a list of stored procedures in parallel to see how much contention on resources. The collect and reporting feature is not that important. But I also want something server-side base for our enterprise build server.
I am not looking for a replay feature (Yes it could do the trick but it would be difficult to program a lot of different scenarios)
I've look at the following tools:
RML Utilities from Microsoft
DTM DB Stress (this is the closest to what I'm looking for)
SQL Stress
I created a simple test tool for this scenario, check it out to see if it will be of any use to you. It's free, no licensing of any sort required. No guarantees on any performance or quality either ;-)
Usage: StressDb.exe <No. of instances> <Tot. Runtime (mins)> <Interval (secs)>
Connection string should reside in the configuration file.
All command line arguments are required. Use integers.
The stored proc to use is also in the config file.
You need to have .NET framework 3.5 installed. You can also run it from multiple workstations for additional load, or from multiple folders on same machine if trying to run additional stored procedures. You also need a SQL user Id, as currently it doesn't use a trusted connection.
This code was actually super simple, the only clever bit was making sure that the connections are not pooled.
http://......com/..../stressdb.zip
Let me know if you find it useful.
Related
Could anybody tell me what the current trend for SQL Server Integration Services is? Is it better than other ETL tools available in market like Informatica, Cognos, etc?
I was introduced to SSIS a couple of weeks ago. Executive summary: I am unlikey to consider it for future projects.
I'm pretty sure flow charts (i.e. non-structured) were discredited as an effective programming paradigm a long time, except in a tiny minority of cases.
There's no point replacing a clean textual (source code) interface with a colourful connect-the-dots one if the user still needs to think like a programmer to know where to drag the arrows.
A program design that you can't access (e.g. fulltext search, navigate using alternative methods, effectively version control, ...) except by one prescribed method is a massive productivity killer. And a wonderful source of RSI.
It's possible there is a particular niche where it's just right, but I imagine most ETL tasks would outgrow it pretty quickly.
SSIS isn't great for production applications from my experience for the following reasons:
To call an SSIS package remotely, you have to call a stored procedure, that calls a job, that calls the SSIS
Using the above method, you can't pass in parameters.
Passing parameters means you have to call the SSIS on a local server - meaning code running on a remote server will have to call code running on the SQL server to execute the package.
I would always rather write specific code to handle ETL and use SSIS for one off transforms.
In my opinion it's quite good platform, and I see a good progress on it. Many of the drwabacks that 2005 version had and that the community complained about, have been corrected on 2008.
From my point of view, the best thing is that you can extend and complement it with SQL or .NET code in an organized way as much as you want.
For instance, you can decide if in your solution you want 80% of c# code and 20% of ETL componenets or 5% of c# code and 95% of ETL components.
disclaimer - i work for microsoft
now the answer
SSIS or SQL Server Integration services is a great tool for ETL operations, there is a lot of uptake in the market place. there is no additional cost other than licensing SQL server and you can also use .Net languages to write tasks.
http://www.microsoft.com/sqlserver/2008/en/us/integration.aspx
http://msdn.microsoft.com/en-us/library/ms141026.aspx
I would list as benefits:
you use SSIS for bigger projects, probably/preferably once or in one run, and then use the integration project for many months with minor changes; the tasks, packages and everything in general is easily readable (of course, depends on perspective)
the tool itself handles the scheduled runs, sending you mails with the logs, and - as long as my experience reaches - it communicates very well with all the other tools (such as SSAS, SQL Server Management Studio, Microsoft Office Excel, Access etc., and other, non-Microsoft tools)
the manually, in-detail configured tasks seem to take over the responsibility in all ways, letting only small chance for errors
as also mentioned above, there are many former problems corrected in the new versions
I would recommend it for ETL, especially if you would continue with analytical processes, since the SSIS, SSAS and SSRS tools blend together quite smoothly.
Drawback: debugging/looking for errors is a bit harder until you get used to it.
Just a question about best-practices when upgrading an existing database. Assuming there will be all kinds of modifications to the data itself, the structure, the relations, additional columns, disappearing columns and whatever more.
My problem is a simple one. I'm working on a project that will use SQL Server. No problem there, since I'm enough of an expert to handle this. But this project will be upgraded later on and I need to specify a protocol that needs to be followed by the upgrade mechanism. Basically, this protocol needs to be followed when creating upgrade scripts...
Right now, I have these simple steps:
Add the new columns to the tables.
Add constraints to the new columns.
Add new tables.
Drop constraints where needed.
Drop columns that need to be removed.
Drop tables that need to be removed.
Somehow, this list feels incomplete. Is there a more extended list somewhere describing the proper steps which needs to be followed during an upgrade?
Also, is it always possible to do a complete upgrade within a single database transaction (with SQL Server) or are there breakpoints that need to be included within the protocol where one transaction should end and another one starts?While automated tools will provide a nice, automated solution, I still can't really use them. The development team working on this system has 4 developers, each with their own database on their local system. Every developer keeps track of their own updates to the structure and keeps track of them by generating both an Upgrade and Downgrade script for his own modifications, both for structural changes and data changes. These scripts can then be used by the other developers to keep their own system up-to-date. Whenever the system is going to be released, those scripts are all merged into one big script.
The system does not include any stored procedures or other "special" features. The database is just that: a data storage with just tables and relations between them. No roles, no users, no stored procedures, no triggers, no complex datatypes...The DB is used by an application where users work from 9-to-5 so shutting down can be done easily, including upgrades for the clients. We also add a version number to the database and applications will check if they're linked to the correct database version.
During development, all developers use their own database instance, which they can fully control. Since we're not the ones who use the application, we tend to develop for the Express edition, not any more expensive one. To be honest, we don't develop our application to support a lot of users, but we'll inform our users that since it uses SQL Server, they could install the system on a bigger SQL Server platform, according to their own needs. They will need their own DBA for this, though. We do have a bigger SQL Server available for ourselves, which we also use for our own web interface, but this server is located in a special dataserver where it is being maintained for us, not by us.
The project previously used MS Access for it's data storage and was intended for single-user development, but as it turned out, many users still decided to share their databases and this had shown that the datamodel itself is reliable enough for multi-user environments. So we migrated to SQL Server to support smaller offices with 3 or more users and some big organisation who will have 500 or more users at the same time.
Since we need to keep the cost of the software low, we don't have a big budget to spend on expensive tools or a more expensive server.
Check out Red-Gate's SQL Compare (structure comparison), SQL Data Compare (data comparison), and SQL Packager (for packaging up updates scripts into a C# project or a .NET executable).
They provide a nice, clean, fully functional and easy-to-use solution for all your database upgrade needs. They're well worth their license fees - that pays for itself in a few weeks or months.
Highly recommended!
In my opinion, it's an absolute bear doing these manually. For Microsoft SQL Server, I'd recommend using the Database editiion of Team System, since it includes complete source control capabilities for your database, and can automatically build your scripts for upgrading/downgrading versions.
Another option is SQLCompare with Redgate, which can also handle these kinds of upgrades/downgrades, and will result in a very nice SQL script. I've used both, and keeping the historic scripts has helped us troubleshoot issues and resolve many a mystery.
If you are working with a manual script as above, don't forget to also account for SP changes in your scripts. Also, any hand-edited script should be able to be executed multiple times on a database - i.e. if your script includes a table creation or drop, be sure to check for existance first, otherwise your script will fail if executed back to back.
Again, while it's possible to build a manual protocol I'd still fall back on using one of the purpose-built tools out there, and both Team System and SQL Compare will be able to output scripts that you could include as part of an installation/upgrade package.
With database updates I always believe it should be all or nothing. If any of the DB updates fail your application will be left in an unknown state that could be harmful to the data so I think it is best practice to either apply them all or none (1 transaction around them all).
I also like to backup the database before applying updates so that if anything does go wrong the database can be rolled back (this has saved me numerous times when working with live data).
Hope this helps.
Best practices for upgrading a production database schema actually look pretty bad on the surface. Unless you can completely shut down your system for the upgrade, which is often not possible, your changes all need to be backwards compatible. If you have many clients accessing the database, you can't update them all simultaneously, so any schema changes you make need to allow old code to run.
That means never renaming a column, and making all new columns nullable. This doesn't mean you leave it like that forever. You write two scripts, one for the initial change, which is backwards compatible, then another to clean things up after all clients have been updated.
Automated tools are great for validation of schemas, but they are not so good when it comes to actually modifying a complex system. You should break your changes up into many small, discrete change scripts so each can be run manually. If there's a failure, it's easier to pinpoint the cause and fix it. Basically, each feature gets its own script. Give each a unique name and then store that name in the database itself when you run the script so you can query the database to find out what's been run and what hasn't. This is invaluable when you have instances on developer's machines, test servers, production, etc.
Does anyone know of any packages or source code that does simple statistical analysis, e.g., confidence intervals or ANOVA, inside a SQL Server stored procedure?
The reason you probably don't want to do that is because these calculations are CPU-intensive. SQL Server is usually licensed by the CPU socket (roughly $5k/cpu for Standard, $20k/cpu for Enterprise) so DBAs are very sensitive to any applications that want to burn a lot of CPU power on the SQL Server itself. If you started doing statistics calculations and suddenly the server needs another CPU, that's an expensive licensing proposition.
Instead, it makes sense to do these statistical calculations on a separate application server. Query the data over the wire to your app server, do the number-crunching there, and then send the results back via an update statement or stored proc. Yes, it's more work, but as your application grows, you won't be facing an expensive licensing bill.
In more recent versions of SQL Server you can use .net objects natively. So any .net package will do. Other than that there's always external proc calls...
Unless you have to do it within the stored proc I'd retrieve the data and do it outside SQL Server. That way you can choose from any of the open source or commercial stats routines and it would probably be faster too.
I don't know if a commercial package like this exist. There could be multiple reasons for this, some of which have been outlined above.
If what you are trying to accomplish is to avoid building statistical functions that process your data stored in SQL Server, you might want to try and integrate statistical packages with your database server by importing data from it. For example, R supports it and there is also CRAN
Once you have accomplished that and you still feel that you'd like to make statistical analysis run inside your SQL Server, the next steps would be to call your stats package from a stored procedure using a command line interface. Your best option here is probably xp_cmdshell, though it requires careful configuration in order not to compromise your SQL Server security.
In Maybe Normalizing Isn't Normal Jeff Atwood says, "You're automatically measuring all the queries that flow through your software, right?" I'm not but I'd like to.
Some features of the application in question:
ASP.NET
a data access layer which depends on the MS Enterprise Library Data Access Application Block
MS SQL Server
In addition to Brad's mention of SQL Profiler, if you want to do this in code, then all your database calls need to funnelled through a common library. You insert the timing code there, and voila, you know how long every query in your system takes.
A single point of entry to the database is a fairly standard feature of any ORM or database layer -- or at least it has been in any project I've worked on so far!
SQL Profiler is the tool I use to monitor traffic flowing to my SQL Server. It allows you to gather detailed data about your SQL Server. SQL Profiler has been distributed with SQL Server since at least SQL Server 2000 (but probably before that also).
Highly recommended.
Take a look at this chapter Jeff Atwood and I wrote about performance optimizations for websites. We cover a lot of stuff, but there's a lot of stuff about database tracing and optimization:
Speed Up Your Site: 8 ASP.NET Performance Tips
The Dropthings project on CodePlex has a class for timing blocks of code.
The class is named TimedLog. It implements IDisposable. You wrap the block of code you wish to time in a using statement.
If you use rails it automatically logs all the SQL queries, and the time they took to execute, in your development log file.
I find this very useful because if you do see one that's taking a while, it's one step to just copy and paste it straight off the screen/logfile, and put 'explain' in front of it in mysql.
You don't have to go digging through your code and reconstruct what's happening.
Needless to say this doesn't happen in production as it'd run you out of disk space in about an hour.
If you define a factory that creates SqlCommands for you and always call it when you need a new command, you can return a RealProxy to an SqlCommand.
This proxy can then measure how long ExecuteReader / ExecuteScalar etc. take using a StopWatch and log it somewhere. The advantage to using this kind of method over Sql Server Profiler is that you can get full stack traces for each executed piece of SQL.
I have seen the references to VistaDB over the years and with tools like SQLite, Firebird, MS SQL et. al. I have never had a reason to consider it.
What are the benefits of paying for VistaDB vs using another technology? Things I have thought of:
1. Compact Framework Support. SQLite+MSSQL support the CF.
2. Need migration path to a 'more robust' system. Firebird+MSSQL.
3. Need more advanced features such as triggers. Firebird+MSSQL
The VistaDB client runtime is free. The runtime will never "expire at 3am" as you put it. Only the developer tools are licensed in that manner. You need 1 license per developer, simple. We even offer a really inexpensive Lite version with no Visual Studio tools.
Some other benefits
100% managed code - there are no interop or other unmanaged calls in the engine. This is a big deal to some, and others couldn't care less.
No registry access required - Most other in proc databases require registry access to look for parent controls, or permissions. VistaDB only does what you tell it to do, and will even run in Medium Trust.
XCopy deployment for runtime and your database (single file). You can xcopy you application, the runtime, and your database and run. Nothing to install or configure on the machine, no special privileges needed (we can run in Medium Trust or higher).
Isolated storage - You can put your entire database into Isolated Storage and run it from there directly. This makes it very easy to build secure click once applications that write databases in a domain friendly way for corporate environments. There is no need to store the user data on a shared drive or worry about permission mapping.
CLR Triggers / CLR Procs - You can write CLR Code and use them as Triggers or Stored Procs. We have just recently introduced changes to make it even easier to maintain a single CLR Assembly that can run in both VistaDB and SQL Server 2005/2008.
T-SQL Procs - VistaDB T-SQL Procs are compatible with SQL Server 2005/2008. Any procedure that works in our engine will run in SQL Server. That does not mean anything that runs there will port to us. We are a subset of the functionality in SQL Server. But we are also the only way to run T-SQL Procs without SQL Server (SQL CE can't do it).
I personally think one of the biggest features is the ability to upsize to SQL Server later. All of the VistaDB types, syntax, and CLR Procs, T-SQL procs, etc all will run on SQL Server. (You can't take everything from SQL Server down to VistaDB though, it is a subset)
32/64 bit Deployment - VistaDB is a single assembly deployment that runs both 32 and 64 bit without changes. SQL CE requires two different runtimes depending upon the OS, and cannot run under IIS at all. Access has no 64 bit runtime, and the most recent 32 bit runtime can only be deployed through MSI. The 32 bit version of Windows has the runtime, the 64 bit version does not.
Relational Integrity - VistaDB also actually enforces your constraints and Foreign Keys. You can specific cascade update, and delete operations. The person who commented we are like SQLITE is wrong in this regard. They parse constraints, but do not enforce them.
EDIT: They do have support for FK's now in SQLite. But they are not compiled in by default, and do not use the same syntax as SQL Server.
Medium Trust - The ability to run on a medium trust web server is another feature that many will not care about, but it is a big deal. Many third party controls can't even run in Medium Trust. We can run the complete engine within Medium Trust because of our commitment to 100% managed code and least permission required.
- Full disclosure - I am the owner of VistaDB so I may be biased. :)
Well, the main thing is that it is pure managed code - for what that is worth; it works not only on your typical Windows machines running .NET, but works wherever you run the Compact Framework and even works on Mono. Here are some noteworthy bullet points from their homepage:
Small < 1 MB footprint truly embedded ZeroClick
Microsoft SQL Server 2005 compatible data types and T-SQL syntax
None of the SQL CE limits
Single user, multi user local or using shared network.
Partially trusted shared hosting is no problem.
Royalty-free distribution - single CPU deployment of SQL Server costs more than a site license of VistaDB!
One thing worth noting is that Rob Howard's company, telligent, uses it as the default database for their new CMS software, "Graffiti."
I have played with it here and there but have yet to build anything against it.
For me this most interesting feature of VistaDB is that it can be run in Medium Trust environment. Which makes it perfect solution for creating small to medium .NET websites which can be deployed on server by copying and pasting (x-copy deployment).
And almost all windows shared hosting providers (like GoDaddy) won't let you run your websites in Full Trust mode. And also won't install for you any 3rd party binaries into GAC like System.Data.SQLite.dll if you wish to use SQLite for example.
I hadn't seen VistaDB before, it does look pretty cool.
Update: Received a comment from someone from VistaDB - their update model is only for getting new versions. Your old ones won't stop working if your license expires, which is good to know.
Keeping the original post here as IMHO the warning about expiring software licenses is still worth thinking about, even though VistaDB itself is fine.
It definitely seems 'more featureful' than SQLite, but I don't see anything there to justify the cost. The site seems to indicate that you can buy one license for $279, but it implies this is just a 1 year subscription. Would you have to then pay another $279 next year to stop your site falling over?
If so, remember to factor into the 'cost' how much inconvenience it's going to be when you get a call at 3am (murphy's law, it's always 3am) from your panicking customers because their VistaDB license has expired :-(
I've had this experience personally with some expiring software, and it's never good. You can send your customers emails and messages and flash their entire screen blinking red saying "YOU NEED TO GET A NEW LICENSE BEFORE NEXT WEEK" and they'll still never do it, and you'll still get the pain at 3am when it does expire.