What I've done many times when testing database calls is set up a database, open a transaction, and roll it back at the end. I've even used an in-memory SQLite database that I create and destroy around each test. This works and is relatively quick.
My question is: should I mock the database calls, should I use the technique above, or should I use both - one for unit tests, one for integration tests (which, to me at least, seems like double work)?
The problem is that if you use your technique of setting up a database, opening transactions and rolling them back, your unit tests will rely on the database service, connections, transactions, the network and so on. If you mock this out, there is no dependency on other pieces of code in your application and no external factors influencing your unit test results.
The goal of a unit test is to test the smallest testable piece of code without involving other application logic. This cannot be achieved when using your technique IMO.
Making your code testable by abstracting your data layer is good practice. It will make your code more robust and easier to maintain. If you implement a repository pattern, mocking out your database calls is fairly easy.
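For example, here is a minimal sketch of the repository pattern with a mocked data layer, using NUnit and Moq (both discussed elsewhere on this page); IPersonRepository, Person and GreetingService are hypothetical names:

using Moq;
using NUnit.Framework;

// A hypothetical abstraction over the data layer.
public interface IPersonRepository
{
    Person GetById(int id);
}

public class Person
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// The class under test depends only on the abstraction,
// never on a concrete database.
public class GreetingService
{
    private readonly IPersonRepository repository;

    public GreetingService(IPersonRepository repository)
    {
        this.repository = repository;
    }

    public string Greet(int personId)
    {
        return "Hello, " + repository.GetById(personId).Name;
    }
}

[TestFixture]
public class GreetingServiceTests
{
    [Test]
    public void Greet_ReturnsGreetingWithPersonName()
    {
        // No database service, connection, or network involved:
        // the repository is a stub with canned data.
        var repo = new Mock<IPersonRepository>();
        repo.Setup(r => r.GetById(42)).Returns(new Person { Id = 42, Name = "Bob" });

        var service = new GreetingService(repo.Object);

        Assert.AreEqual("Hello, Bob", service.Greet(42));
    }
}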
Also, unit tests and integration tests serve different needs. Unit tests prove that a piece of code is technically working and catch corner cases.
Integration tests verify the interfaces between components against the software design. Unit tests alone cannot verify the functionality of a piece of software.
HTH
All I have to add to @Stephane's answer is: it depends on how you fit unit testing into your own development practices. If you've got end-to-end integration tests involving a real database which you create and tidy up as needed - provided you've covered all the different paths through your code and the various eventualities that could occur with your users hacking POST data, etc. - then you're covered from the point of view of your tests telling you whether your system is working, which is probably the main reason for having tests.
I would guess, though, that having each of your tests run through every layer of your system makes test-driven development very difficult. Needing every layer in place and working for a test to pass pretty much rules out spending a few minutes writing a test, a few minutes making it pass, and repeating. It means your tests can't guide you in how individual components behave and interact; your tests won't force you to make things loosely coupled, for example. Also, say you add a new feature and something breaks elsewhere; granular tests that run against components in isolation make tracking down what went wrong much easier.
For these reasons I'd say it's worth the "double work" of creating and maintaining both integration and unit tests, with your DAL mocked or stubbed in the latter.
There are probably already some answers to this question around, but I haven't found one that covers my specific scenario. So, here is my situation: I'm working on a web app made in Angular where all the unit tests use mock data. Then we have some end-to-end tests written in Protractor. I'm not very excited about them, because we are testing the user interface with data we get from a live API. I think we're using this approach because we have no control over the back-end, but the side effect is that the database could change and mess up our tests. Also, the API we're using for the e2e tests runs on an internal network, meaning we cannot run the tests outside of the office. I was thinking about mocking the HTTP responses in order to mock the database and be able to run all the tests from anywhere. The problem is that the back-end logic could behave differently from what we simulate in our tests, meaning that as soon as we deploy the application, it could work in an unexpected way.
What is the best practice and workflow to follow in a similar situation?
Best practice is subjective, but there are known solutions, each with pros and cons.
Using a shared environment
If manual testing happens in the same environment as your automated tests, you risk someone interfering with your tests. Copying data from production to this environment will also break your tests, and is not good. There is extra effort in making your tests idempotent: the setup must ensure the data is in the exact state each test expects, and the data being set up must not conflict with manual tests. It is recommended, when you create an entity during test setup, to tag it with a unique token related to the test so that it is unique to that test. This is just hard and costly.
Using separate e2e environment
This is clearly easier on your test idempotence, as you have more control over the data and there is no manual intervention. You can empty the database or reseed it using one of several approaches (see below) before every test or group of tests. Still, you must be careful to ensure that tests do not depend on each other or interfere with one another.
Mock the APIs
You can mock the APIs; however, that is not a true e2e test. Consumer-driven contracts work well here: if you know the APIs are tested against specific outputs, you can use those outputs as the mocked inputs for your e2e tests. These tests are blazing fast. If you don't have control over your environment and its data, or it is a 3rd-party system, mocking the API is the recommended approach. The risk is that you are not testing the real integration, which can let a lot of failures slip through.
Use APIs to set up test data
This is a pretty good solution: not only does it catch issues with the APIs, but it keeps your e2e tests focused on the area being tested, and you do not have to set up data through the GUI. Test setup and clean-up can both be managed this way. It is usually quicker than setting up via the GUI, though certainly not as quick as mocking the API responses.
Use the GUI to set up the test data
This can work, but you must be smart about it. Since you are sharing the environment with manual testing, you must ensure the data is in the correct state. It is wise to create separate entities tied to your tests and not share anything that someone might touch while testing manually. This is slower, and it complicates your tests, since you spend most of your time navigating around and setting things up in the GUI.
Use scripts to load the data directly to the database
Avoid this: there is probably business logic that you would be bypassing, which will lead to incorrect states. It is better to load data through the API, which can validate the input and run any business logic.
Here are some relevant resources to follow up on:
Martin Fowler's write-up on testing microservices
https://medium.com/how-we-build-fedora/e2e-testing-with-angular-protractor-and-rails-725fbefb8149#.9rziv2gtp
How about getting a test version of the backend deployed that has a limited amount of data in it?
That way, after each round of testing has completed, the database can be reset with the original datasets loaded in.
This would ensure consistency in your results across tests, and means that if the backend guys make changes to their master branch, it won't affect your tests.
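On the backend side, that reset step can be a small helper that re-runs a seed script between test rounds. A minimal C# sketch, assuming SQL Server, a hypothetical connection string, and a seed-data.sql file that is a single batch (no GO separators):

using System.Data.SqlClient;
using System.IO;

public static class TestDatabase
{
    // Hypothetical test-environment connection string.
    private const string ConnectionString =
        "Server=test-db;Database=AppTest;Integrated Security=true";

    public static void Reset()
    {
        // Re-run the seed script so every test round starts
        // from the same original dataset.
        string script = File.ReadAllText("seed-data.sql");
        using (var connection = new SqlConnection(ConnectionString))
        {
            connection.Open();
            using (var command = new SqlCommand(script, connection))
            {
                command.ExecuteNonQuery();
            }
        }
    }
}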
So we have a web app, and a bunch of E2E tests.
It's all great, except that it's a major pain to keep the data in a valid state. We're trying to write the tests in a way that leaves the data valid, but it is an overhead, and whenever a test fails it affects a lot of other tests.
So
We've been trying to do a database restore after every test run (we run local DBs for testing) - it's a pain
We've been looking at putting the DB on a virtual machine and taking snapshots - licensing costs are high
I've been experimenting with interceptors (it is an AngularJS app) that intercept certain calls to services and return a predefined piece of data - it's hard to get working properly and creates too much overhead
It's got to be a very common pain point, yet I can't seem to find much about ways to approach it. So how do you solve this?
I'm attempting to set up SpecFlow for integration/acceptance testing. Our product has a backing database (not a huge one, though) in SQLite.
This is actually proving to be a slightly sticky point though; how do I model the database for the tests?
I would like to know what patterns others out there use for doing integration/acceptance testing with backing databases.
I can think of the following approaches:
Compile a database into the assembly with the tests, then shadow-copy it for each test. Seems slow though.
I could create the database in memory and populate it with pre-determined data.
I could create the database in memory and somehow have Givens populate the database. This seems like it would bloat the tests horribly, but might give them more control and make the tests less fragile.
I could abstract every database interaction and use mocks. Not in love with this idea since I'd like to use this to test the database interactions as well.
Compile the database into the tests and rely on clean-up code to return it to the base state (this one seems dodgy to me). I don't want to use transactions, since some tests involve multiple interactions (e.g. write an item, then attempt to read it back with different privileges).
Before considering how to test, I think you might find it valuable to look at what you want to test.
Starting with the data: I find that it really helps to take a single element, or a small number of them, and imagine a set of events around them in order to give you the right test data to run your tests with. For example:
If you were working on a healthcare system, you might define a person "Bob" and then produce his life events. Bob was born 37 years ago today, fell off his bike as a child and broke his arm, got married, and has two children.
If you are working on a financial trading system, you might look at a day between opening and closing for a couple of stocks, e.g. "MSFT" and "APPL". On this day you might see one starting low and climbing, the other starting high and falling. A piece of news comes out that reverses their fortunes.
Now that you have the what, you can evaluate which of your scenarios actually work for your data. For example, "MSFT" and "APPL" could have thousands of price changes throughout the day, so generating the Givens and Mocks would be very time-consuming; this data lends itself to being pre-captured. On the other hand, the "Bob" data works particularly well with generated data, because the data can always change so that it is his birthday today.
One thing your question doesn't seem to consider is updating your data. For example, you might want a set of tests that work at various stages of your entity's life cycle: some tests deal with "Baby Bob", others with "10yr old Bob" or "Married Bob", etc. If your DB is read-only, this isn't a problem as long as you write your tests so that they just don't see the other data, but sometimes you want to build a story through your tests. If your tests do change the data, then you will need to ensure either that your tests run in order (see MSTest OrderedTest or mbUnit DependsOn), or that you can separate your tests so each deals with an isolated data entity (fine if your entity can be described in a single row, but harder when you have to read many tables to get at it).
You also might want to consider what code you are testing; you can vary the approach across your different test sets. I currently work on a multi-tier application with UI Views, View Models, Client Models, multiple communication systems, and server models, and I have different sets of tests for these. Some tests work within a single tier, mocking out the other tiers to keep the tests small. Other tests fire up a local server and a local client and wire the two up directly. Finally, some tests launch a full server process, communicate via EMS, and run simple client-side operations using everything but the UI Views.
So now to actually answer your question:
Shadow copy your database - Yes, I've done this once with SQL Server Developer edition and had an xxx.mdf that got copied in before running the tests. However, some modern testing frameworks (e.g. NCrunch) will run tests in parallel, so this just breaks.
Create the database and pre-populate - I've not done this one, but my concern would be what happens when a test changes the database to an unexpected state: other tests will then fail despite having done nothing wrong.
Create the database and use Givens - I've done this with NUnit via [SetUpFixture] on top of a LINQ to SQL DB. You still have concerns about parallel test runs, you have to balance the granularity of your Givens (see StackOverflow - When do BDD scenarios become too specific), and you have the data-update ordering/data isolation problem, but this can work really well, letting you work through your data stories and grow the data throughout your tests. On the other hand, should one test fail and leave the data in a bad state, you can end up with lots of failures, though at least you simply need to look at the first one that fails. This kind of testing also won't play very nicely for developers on their workstations, as they can't just run a single test - particularly with tools such as NCrunch, which can run just the tests whose code has changed.
Mock the database - This is how I choose to do things now. The trick is that if you are following a reasonably strict TDD process, where you only test the method you are working on, you end up with a few tests that exercise the database interaction, e.g. [Test] DALLayerTests.ShouldReadARowAndCreatePOCO(), but most others that use mocked data to test what actually happens, e.g. [Test] BusinessObjectPersonTests.ShouldGetBirthdayCongratulations().
Use clean up code - Never tried it, it sounds dodgy :-)
I have an application that is database intensive. Most of the application's methods update data in a database. Some calls are wrappers around stored procedures, while others perform database updates in code using 3rd-party APIs.
What should I be testing in my unit tests? Should I...
Test that each method completes without throwing an exception -or-
Validate the data in the database after each test to make sure the state of data is as expected
My initial thought is #2, but my concern is that I would be writing a bunch of framework code to go along with my unit tests. I've read that you shouldn't write a bunch of framework code for unit testing.
Thoughts?
EDIT: By "framework" I mean writing a ton of other code that serves as a library to the unit testing code... not a third-party framework.
I do #2, i.e., test the update by updating a record and then reading it back out, verifying that the values are the same as the ones you put in. Do both the update and the read in a transaction, and then roll it back, to avoid any permanent effect on the database. I don't think of this as testing framework code, any more than I think of it as testing OS code or networking code... The framework (if you mean a non-application-specific database access layer component) should be tested and validated independently.
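A minimal sketch of that round-trip with NUnit and TransactionScope (the Customers table, the row ID and the connection string are all hypothetical; a fuller TransactionScope setup appears further down the page):

using System.Data.SqlClient;
using System.Transactions;
using NUnit.Framework;

[TestFixture]
public class CustomerUpdateTests
{
    // Hypothetical connection string; adjust for your environment.
    private const string ConnectionString =
        "Server=(local);Database=AppTest;Integrated Security=true";

    [Test]
    public void Update_ThenReadBack_ReturnsSameValue()
    {
        // Disposing the scope without calling Complete() rolls the
        // whole test back, so there is no permanent effect on the database.
        using (new TransactionScope())
        using (var connection = new SqlConnection(ConnectionString))
        {
            connection.Open(); // enlists in the ambient transaction

            using (var update = new SqlCommand(
                "UPDATE Customers SET Name = @name WHERE Id = @id", connection))
            {
                update.Parameters.AddWithValue("@name", "Renamed Customer");
                update.Parameters.AddWithValue("@id", 7);
                update.ExecuteNonQuery();
            }

            using (var read = new SqlCommand(
                "SELECT Name FROM Customers WHERE Id = @id", connection))
            {
                read.Parameters.AddWithValue("@id", 7);
                Assert.AreEqual("Renamed Customer", (string)read.ExecuteScalar());
            }
        } // no Complete() -> rollback
    }
}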
There's a third option: use a mock database-access object that knows how to respond to an update as if it were connected to a live database, but doesn't actually execute the query against one.
This technique can supplement testing against a live database. It is not the same as testing against a live database and shouldn't substitute for that kind of testing, but it can at least verify that your class invokes the database update with the proper inputs. It also typically runs a lot faster than tests against a real database.
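With a mocking library such as Moq (recommended elsewhere on this page), verifying that the update was invoked with the proper inputs looks roughly like this; IOrderWriter and OrderService are hypothetical names:

using Moq;
using NUnit.Framework;

// Hypothetical abstraction over the database update.
public interface IOrderWriter
{
    void UpdateStatus(int orderId, string status);
}

public class OrderService
{
    private readonly IOrderWriter writer;

    public OrderService(IOrderWriter writer)
    {
        this.writer = writer;
    }

    public void Ship(int orderId)
    {
        writer.UpdateStatus(orderId, "Shipped");
    }
}

[TestFixture]
public class OrderServiceTests
{
    [Test]
    public void Ship_UpdatesStatusToShipped()
    {
        var writer = new Mock<IOrderWriter>();
        var service = new OrderService(writer.Object);

        service.Ship(42);

        // No query is executed; we only check the call and its inputs.
        writer.Verify(w => w.UpdateStatus(42, "Shipped"), Times.Once());
    }
}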
You must test the actual effect of the code on the data, and its compliance with the validation rules etc., not just that no exceptions are raised - that would be a bit like merely checking that a procedure compiles!
It is difficult testing database code that performs inserts, updates or deletes (DML), because the test changes the environment it runs in, i.e. the database. Running the same procedure several times in a row could (and probably should) have different results each time. This is very different to unit testing "pure code", which you can run thousands of times and always get the same result - i.e. "pure code" is deterministic, database code that performs DML is not.
For this reason, you do often need to build a "framework" to support database unit tests - i.e. scripts to set up some test data in the right state, and to clean up after the test has been run.
If you are not writing to the database manually and are using a framework instead (the JVM, the .NET Framework, ...), you can safely assume that the framework writes to the database correctly. What you must test is whether you are using the framework correctly.
Just mock the database methods and objects. Test whether you are calling them and retrieving data back correctly. Doing this will let you write your tests more easily, run them much faster, and run them in parallel with no problems at all.
They shouldn't be unit tested at all! The whole point of those methods is to integrate with the outside world (i.e. the database). So, make sure your integration tests beat the you-know-what out of those methods and just forget about the unit tests.
They should be so simple that they are "obviously bug-free" anyway - and if they aren't, you should break them up into one part which has the logic and a dumb part which just takes a value and sticks it in the database.
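A sketch of what that split might look like (the names and the 1% fee rule are hypothetical): the calculator holds the logic and is trivially unit-testable, while the writer is the dumb, "obviously bug-free" part that is left to integration tests.

using System.Data.SqlClient;

// The logic part: pure and deterministic, easy to unit test.
public static class ChargeCalculator
{
    public static decimal Calculate(decimal transactionAmount)
    {
        return transactionAmount * 0.01m; // hypothetical 1% fee rule
    }
}

// The dumb part: just takes a value and sticks it in the database.
// Covered by integration tests, not unit tests.
public class ChargeWriter
{
    private readonly string connectionString;

    public ChargeWriter(string connectionString)
    {
        this.connectionString = connectionString;
    }

    public void Save(int customerId, decimal charge)
    {
        using (var connection = new SqlConnection(connectionString))
        {
            connection.Open();
            using (var command = new SqlCommand(
                "INSERT INTO Charges (CustomerId, Charge) VALUES (@id, @charge)",
                connection))
            {
                command.Parameters.AddWithValue("@id", customerId);
                command.Parameters.AddWithValue("@charge", charge);
                command.ExecuteNonQuery();
            }
        }
    }
}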
Remember: the goal is 100% test coverage, not 100% unit test coverage; that includes all of your tests: unit, integration, functional, system, acceptance and manual.
If the update logic is complex, then you should do #2. In practice, the only way to really unit test a complex calculation and update - say, calculating the banking charges on a set of customer transactions - is to initialise a set of tables to known values at the start of your unit test and test for the expected values at the end.
I use DBUnit to load the database with data, execute the update logic, and finally read the updated data from the database and verify it. Basically #2.
This past summer I was developing a basic ASP.NET/SQL Server CRUD app, and unit testing was one of the requirements. I ran into some trouble when I tried to test against the database. To my understanding, unit tests should be:
stateless
independent from each other
repeatable with the same results i.e. no persisting changes
These requirements seem to be at odds with each other when developing for a database. For example, I can't test Insert() without making sure the rows to be inserted aren't there yet; thus I need to call Delete() first. But what if they aren't already there? Then I would need to call the Exists() function first.
My eventual solution involved very large setup functions (yuck!) and an empty test case that runs first and indicates that the setup ran without problems. This sacrifices the independence of the tests while maintaining their statelessness.
Another solution I found is to wrap the function calls in a transaction which can easily be rolled back, like Roy Osherove's XtUnit. This works, but it involves another library and another dependency, and it seems a little too heavyweight a solution for the problem at hand.
So, what has the SO community done when confronted with this situation?
tgmdbm said:

You typically use your favourite automated unit testing framework to perform integration tests, which is why some people get confused, but they don't follow the same rules. You are allowed to involve the concrete implementation of many of your classes (because they've been unit tested). You are testing how your concrete classes interact with each other and with the database.
So if I read this correctly, there is really no way to effectively unit test a Data Access Layer. Or would a "unit test" of a Data Access Layer involve testing, say, the SQL/commands generated by the classes, independent of any actual interaction with the database?
There's no real way to unit test a database other than asserting that the tables exist, contain the expected columns, and have the appropriate constraints. But that's usually not really worth doing.
You don't typically unit test the database. You usually involve the database in integration tests.
You typically use your favourite automated unit testing framework to perform integration tests, which is why some people get confused, but they don't follow the same rules. You are allowed to involve the concrete implementation of many of your classes (because they've been unit tested). You are testing how your concrete classes interact with each other and with the database.
DBunit
You can use this tool to export the state of a database at a given time, and then, when you're unit testing, the database can be rolled back to its previous state automatically at the beginning of the tests. We use it quite often where I work.
The usual solution to external dependencies in unit tests is to use mock objects - which is to say, libraries that mimic the behavior of the real ones against which you are testing. This is not always straightforward, and sometimes requires some ingenuity, but there are several good (freeware) mock libraries out there for .NET if you don't want to "roll your own". Two come to mind immediately:
Rhino Mocks is one that has a pretty good reputation.
NMock is another.
There are plenty of commercial mock libraries available, too. Part of writing good unit tests is actually designing your code for them - for example, by using interfaces where it makes sense, so that you can "mock" a dependent object by implementing a "fake" version of its interface that nonetheless behaves in a predictable way, for testing purposes.
In database mocks, this means "mocking" your own DB access layer with objects that return made up table, row, or dataset objects for your unit tests to deal with.
Where I work, we typically make our own mock libs from scratch, but that doesn't mean you have to.
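A hand-rolled fake can be as simple as an in-memory implementation of your data-access interface; IUserStore here is a hypothetical example:

using System.Collections.Generic;

// The interface your real DB access layer implements.
public interface IUserStore
{
    void Save(int id, string name);
    string GetName(int id);
}

// A hand-rolled fake: predictable behaviour, no database involved.
public class FakeUserStore : IUserStore
{
    private readonly Dictionary<int, string> rows = new Dictionary<int, string>();

    public void Save(int id, string name)
    {
        rows[id] = name;
    }

    public string GetName(int id)
    {
        string name;
        return rows.TryGetValue(id, out name) ? name : null;
    }
}

Any class written against IUserStore can then be unit tested with the fake in place, with the data it returns fully under the test's control.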
Yeah, you should refactor your code to access repositories and services which access the database; you can then mock or stub those objects so that the object under test never touches the database. This is much faster than saving the state of the database and resetting it after every test!
I highly recommend Moq as your mocking framework. I've used Rhino Mocks and NMock. Moq was so simple and solved all the problems I had with the other frameworks.
I've had the same question and have come to the same basic conclusions as the other answerers here: don't bother unit testing the actual DB communication layer, but if you want to unit test your model functions (to ensure they're pulling data properly, formatting it properly, etc.), use some kind of dummy data source and set up tests to verify the data being retrieved.
I too find the bare-bones definition of unit testing to be a poor fit for a lot of web development activities. But this page describes some more 'advanced' unit testing models and may help to inspire some ideas for applying unit testing in various situations:
Unit Test Patterns
I explained a technique that I have been using for this very situation here.
The basic idea is to exercise each method in your DAL, assert your results, and, when each test is complete, roll back so your database stays clean (no junk/test data).
The only issue you might not find "great" is that I typically do an entire CRUD test (not pure from the unit testing perspective), but this integration test lets you see your CRUD + mapping code in action. That way, if it breaks, you will know before you fire up the application (which saves me a ton of work when I'm trying to go fast).
What you should do is run your tests against a blank copy of the database that you generate from a script. You can run your tests, then analyze the data to make sure it contains exactly what it should after the tests have run. Then you just delete the database, since it's a throwaway. This can all be automated, and can be considered an atomic action.
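A minimal sketch of that cycle using System.Data.SQLite, with the generation script inlined for brevity (the file name and schema are hypothetical):

using System.IO;
using System.Data.SQLite;
using NUnit.Framework;

[TestFixture]
public class ThrowawayDatabaseTests
{
    private const string DbFile = "throwaway.db";
    private SQLiteConnection connection;

    [SetUp]
    public void CreateBlankDatabase()
    {
        // Generate a blank copy of the database from a script.
        if (File.Exists(DbFile))
            File.Delete(DbFile);
        connection = new SQLiteConnection("Data Source=" + DbFile);
        connection.Open();
        using (var create = new SQLiteCommand(
            "CREATE TABLE Customers (Id INTEGER PRIMARY KEY, Name TEXT)", connection))
        {
            create.ExecuteNonQuery();
        }
    }

    [TearDown]
    public void DeleteDatabase()
    {
        // The database is a throwaway: just delete it.
        connection.Dispose();
        File.Delete(DbFile);
    }

    [Test]
    public void InsertedRowIsVisible()
    {
        using (var insert = new SQLiteCommand(
            "INSERT INTO Customers (Name) VALUES ('Bob')", connection))
        {
            insert.ExecuteNonQuery();
        }

        // Analyze the data to make sure it has exactly what it should.
        using (var count = new SQLiteCommand(
            "SELECT COUNT(*) FROM Customers", connection))
        {
            Assert.AreEqual(1L, (long)count.ExecuteScalar());
        }
    }
}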
Testing the data layer and the database together leaves few surprises for later in the project. But testing against the database has its problems, the main one being that you're testing against state shared by many tests. If you insert a line into the database in one test, the next test can see that line as well.
What you need is a way to roll back the changes you make to the database.
The TransactionScope class is smart enough to handle very complicated transactions, as well as nested transactions where your code under test calls commits on its own local transaction.
Here’s a simple piece of code that shows how easy it is to add rollback ability to
your tests:
using System;
using System.Transactions;
using NUnit.Framework;

[TestFixture]
public class TransactionScopeTests
{
    private TransactionScope trans = null;

    [SetUp]
    public void SetUp()
    {
        // Open an ambient transaction before each test; any database
        // work done in the test enlists in it automatically.
        trans = new TransactionScope(TransactionScopeOption.Required);
    }

    [TearDown]
    public void TearDown()
    {
        // Disposing the scope without calling Complete() rolls
        // everything back, leaving the database untouched.
        trans.Dispose();
    }

    [Test]
    public void TestServicedSameTransaction()
    {
        MySimpleClass c = new MySimpleClass();
        long id = c.InsertCategoryStandard("whatever");
        long id2 = c.InsertCategoryStandard("whatever");
        Console.WriteLine("Got id of " + id);
        Console.WriteLine("Got id of " + id2);
        Assert.AreNotEqual(id, id2);
    }
}
If you're using LINQ to SQL as the ORM, then you can generate the database on the fly (provided that the account used for unit testing has sufficient access). See http://www.aaron-powell.com/blog.aspx?id=1125
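The core of that approach looks roughly like this; MyDataContext stands in for your generated LINQ to SQL context, and the connection string is hypothetical:

using System.Data.Linq;

public static class TestDatabaseLifecycle
{
    // Hypothetical connection string for a dedicated test database.
    private const string ConnectionString =
        "Server=(local);Database=UnitTestDb;Integrated Security=true";

    public static MyDataContext CreateFresh()
    {
        // MyDataContext is your generated LINQ to SQL DataContext;
        // CreateDatabase builds the schema from its mapping attributes.
        var context = new MyDataContext(ConnectionString);
        if (context.DatabaseExists())
            context.DeleteDatabase();
        context.CreateDatabase();
        return context;
    }

    public static void Drop(MyDataContext context)
    {
        // Tear the generated database down again after the test run.
        context.DeleteDatabase();
        context.Dispose();
    }
}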