Avoid Database Dependency For Unit Testing Without Mocking - database

I've got many objects with methods that require database access. We're looking to get into unit testing but are keen to avoid the use of mock objects if possible. I'm wondering if there is a way to refactor the Validate method shown below so that it wouldn't need db access. In the actual application there is usually a fair bit more going on but I think this simplified example should be enough.
We'll learn to use mock objects if we need to but it just seems like a lot of overhead, so I'm looking for alternatives.
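public class Person
{
    public string Name;

    public string Validate()
    {
        // PersonDA is a static data-access class; this call hits the database.
        if (PersonDA.NameExists(Name))
        {
            return "Name Already Used";
        }
        return null; // assumed here: no message means the person is valid
    }
}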

Personally I'd just go the mock object route. It's much more flexible and it sounds like you're wanting to go the route of putting test code in your actual object?
Regardless, extract the validation code into a PersonValidator object with a method such as boolean isValid(Person). Then in the test code use a mock validator which just returns true or false based on the test case.

The Person class is hard to unit-test because it has a hidden, static dependency on database access code. You can break this coupling by introducing a dynamic collaboration between the Person and some new type of object that provides it with the information it needs to validate its state. In your unit tests of the Person you can test what happens when it is valid or invalid without hitting the database by passing the Person object "stub" implementations of its collaborator.
You can test the real implementation, which hits the database, in a separate set of tests. Those tests will be slower but there should be fewer of them because they will be direct translations of accessor methods to database queries with no complex logic of their own.
You can call that "using mock objects" if you like but, because your current design means you only need to stub queries, not expect commands, a mock object framework is too complicated a tool for the job. Hand-written stubs will make test failures easier to diagnose.
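A minimal sketch of the hand-written stub idea, in Python for brevity (the same shape carries over to C#). The names NameDirectory, StubNameDirectory and name_exists are hypothetical and not from the original code; a real implementation of the collaborator would wrap the existing PersonDA.NameExists call.

```python
from typing import Optional, Protocol


class NameDirectory(Protocol):
    """Hypothetical collaborator: answers the one question Person needs."""

    def name_exists(self, name: str) -> bool: ...


class Person:
    def __init__(self, name: str, directory: NameDirectory) -> None:
        self.name = name
        self._directory = directory  # injected, so tests can pass a stub

    def validate(self) -> Optional[str]:
        """Return an error message, or None when the person is valid."""
        if self._directory.name_exists(self.name):
            return "Name Already Used"
        return None


class StubNameDirectory:
    """Hand-written stub: no database, just a fixed set of taken names."""

    def __init__(self, taken_names) -> None:
        self._taken = set(taken_names)

    def name_exists(self, name: str) -> bool:
        return name in self._taken
```

A test then builds Person("alice", StubNameDirectory(["alice"])) and checks the returned message, with no database and no mocking framework involved.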

Take a look at DbUnit. It's especially set up to populate a small test database so you can use your real objects on a mock database during unit testing. Testing with it is far easier than developing mock objects, far safer than modifying your data access code, and far more thorough than either.

Why are you trying to avoid mocks exactly? If you are going to practice unit testing and you have data access code, it's going to be easiest to get comfortable with the mock/stub/inject way of doing things.
If it's because you don't want to bring in a mocking framework, you could code up some simple stubs as you need them.
Putting your data access code behind an interface will let you avoid the need for a database. Consider using dependency injection to insert the mock or stub data access code during your tests.

You should just set up a database that is used for the unit testing.
If you use mockups for all the data access, you wouldn't actually be testing much? :)

Related

Proper Unit Testing Philosophy

What would be the proper thing to do for each case?
1: Context: Testing a function that creates a database as well as generating metadata for that database
Question: Normally unit test cases are supposed to be independent, but if we want to make sure the function raises an exception when trying to make a duplicate database, would it be acceptable to have ordered test cases where the first one tests if the function works, and the second one tests if it fails when calling it again?
2: Most of the other functions require a database and metadata. Would it be better to call the previous functions in the set up of each test suite to create the database and metadata, or would it be better to hard code the required information in the database?
Your automated test should model the following:
Setup
Exercise (SUT)
Verify
Teardown
In addition, each test should be as concise as possible and only expose the details that are being tested. All other infrastructure that is required to execute the test should be abstracted away so that the test method serves as documentation that exposes only the inputs being tested with regard to what you want to verify for that particular test.
Each test should strive to start from a clean slate so that the test can be repeated with the same results each time regardless of the results of prior tests that have been executed.
I typically execute a test-setup and a test-cleanup method for each integration test, or for any test that depends on singletons that maintain state for the System-Under-Test and need to have their state wiped.
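As a rough illustration of the four phases with a stateful singleton, here is a sketch using Python's standard unittest module. The Counter class is a hypothetical stand-in for whatever shared state the system under test keeps; it is not from the question.

```python
import unittest


class Counter:
    """Hypothetical stateful singleton standing in for shared state."""

    _instance = None

    def __init__(self) -> None:
        self.value = 0

    @classmethod
    def instance(cls) -> "Counter":
        if cls._instance is None:
            cls._instance = cls()
        return cls._instance

    @classmethod
    def reset(cls) -> None:
        cls._instance = None


class CounterTest(unittest.TestCase):
    def setUp(self) -> None:
        Counter.reset()  # Setup: start every test from a clean slate

    def tearDown(self) -> None:
        Counter.reset()  # Teardown: leave nothing behind for the next test

    def test_increment_starts_from_zero(self) -> None:
        counter = Counter.instance()
        counter.value += 1                  # Exercise the system under test
        self.assertEqual(counter.value, 1)  # Verify

    def test_new_instance_is_zero(self) -> None:
        # Passes in any order because setUp wiped the singleton.
        self.assertEqual(Counter.instance().value, 0)
```

Run it with python -m unittest. Because both tests reset the singleton, neither depends on the other having run first.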
Normally unit test cases are supposed to be independent, but if we want to make sure the function raises an exception when trying to make a duplicate database, would it be acceptable to have ordered test cases where the first one tests if the function works, and the second one tests if it fails when calling it again?
No, ordered tests are bad. There's nothing stopping you from having a test call another method that happens to be a test though:
@Test
public void createDataBase(){
    ...
}

@Test
public void creatingDuplicateDatabaseShouldFail(){
    createDataBase();
    try{
        //call create again should fail
        //could also use ExpectedException Rule here
        createDataBase();
        fail(...);
    }catch(...){
        ...
    }
}
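The same idea can be sketched in Python's unittest, where each test performs its own first creation instead of relying on test order. Catalog, create_database and DuplicateDatabaseError are hypothetical names used only for illustration; they stand in for the real function under test.

```python
import unittest


class DuplicateDatabaseError(Exception):
    pass


class Catalog:
    """Hypothetical in-memory stand-in for whatever tracks created databases."""

    def __init__(self) -> None:
        self._names = set()

    def create_database(self, name: str) -> None:
        if name in self._names:
            raise DuplicateDatabaseError(name)
        self._names.add(name)

    def exists(self, name: str) -> bool:
        return name in self._names


class CreateDatabaseTest(unittest.TestCase):
    def setUp(self) -> None:
        self.catalog = Catalog()  # fresh state for every test

    def test_create_database(self) -> None:
        self.catalog.create_database("orders")
        self.assertTrue(self.catalog.exists("orders"))

    def test_creating_duplicate_database_fails(self) -> None:
        # Does its own first creation rather than depending on
        # test_create_database having run earlier.
        self.catalog.create_database("orders")
        with self.assertRaises(DuplicateDatabaseError):
            self.catalog.create_database("orders")
```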
Most of the other functions require a database and metadata. Would it be better to call the previous functions in the set up of each test suite to create the database and metadata, or would it be better to hard code the required information in the database?
If you use a database testing framework like DbUnit or something similar, it can reuse the same db setup over and over again in each test.

Global property for DB access rather than passing DB around everywhere? Advice anyone?

Globals are evil, right? At least everything I read says so, because something might alter the state of the global at any point.
However, I've a DB object that's a bit of a tramp with regard to class parameters. The property below is an instance of a wrapper class that automatically works in MS Access or SQL, which is why it's not EF or some other ORM.
Public Property db As New DBI.DBI(DBI.DBI.modeenum.access, String.Format("Provider=Microsoft.ACE.OLEDB.12.0;Data Source={0} ;Persist Security Info=True;Jet OLEDB:Database Password=""lkjhgfds8928""", GetRpcd("c:\cms")))
The code itself does have PostSharp for exception handling, so I'm thinking that I can conditionally handle OleDb errors by logging them and reinitialising the DB if it is null.
Up till now, the solution has been to continually pass the db around as a parameter to every single class that needs it. Most of the data classes have a shared ObservableCollection that is built from structures that individually implement INotifyPropertyChanged. One of these is built asynchronously. The collection property checks whether it's empty before firing off the private Async buildCollection sub.
Given that we don't use dependency injection (yet), as I still need to learn it, is the global property all that bad? The db is needed everywhere that data is pulled in or saved. The only place I don't need it at all is the View and its code-behind.
It's not a customer facing project but it does need to be solid.
Any advice gratefully received!
Passing the DB connection as a parameter into your classes IS using dependency injection, perhaps you just didn't recognize it as such. Hard coding the connection string in the callers is still code that is not free of dependencies, but at least your database accessors themselves are free of the dependency upon a global connection.
Globals aren't just evil because they change without notice - that's just one effect you see resulting from the bad design choice. They're evil because a design using them is brittle. Code that depends upon globals requires invisible stuff to be set correctly before calling it, and that leads to inter-dependencies between unrelated code. The invisible stuff becomes critically important stuff. Reading just the interface of a module that internally uses globals, how would I know that I have to call the SetupGlobalThing() method before calling it? What happens if I call IncrementGlobalThing() and DecrementGlobalThing() and MultiplyGlobalThing() in varying orders, depending on the function the user selects?
Instead, prefer stateless methods where you pass in all the stuff to be changed and used: IncrementThing(Integer thing) doesn't rely on hidden setup steps. It clearly does one thing: it increments the thing passed in.
It may help to think about it from a unit testing viewpoint. If you were to write a unit test to prove a specific module of code works, would you need to pass in a real database connection (hard*), or would you be able to pass in a fake database reference that meets your testing needs easily?
The best way to test your logic is to unit test it. The best way to test your class interfaces and method structure is to write unit tests that call them. If the class is hard to test, it's likely due to dependencies upon external things (globals, singletons, databases, inappropriate member variables, etc.)
The reason I called using a real database "hard" is that a unit test needs to be easy and fast to run. It shouldn't rely on slow or breakable or complex external things. Think about unit testing your software on the bus, with no network connection. Think about how much work it is to create a dummy database: you have to add users, you have to have the right version of schema in it, it has to be installed, it has to be filled with the right kind of testing data, you need network connectivity to it, all those things can make your testing unreliable. Instead, in a unit test you pass in a mock database, which simply returns values that exercise your code being tested.
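To make the "pass in a fake database reference" point concrete, here is a small sketch in Python (the question uses VB.NET, but the shape is the same). FakeDb, CustomerReport and the SQL text are all hypothetical; the only assumption is that the real wrapper exposes some query method that returns rows.

```python
class FakeDb:
    """Hypothetical fake: records statements instead of talking to a server."""

    def __init__(self, rows) -> None:
        self._rows = rows
        self.executed = []

    def query(self, sql: str):
        self.executed.append(sql)
        return list(self._rows)


class CustomerReport:
    def __init__(self, db) -> None:
        self._db = db  # dependency is visible in the constructor, not a global

    def customer_names(self):
        rows = self._db.query("SELECT name FROM customers")
        return sorted(row["name"] for row in rows)
```

A test constructs CustomerReport(FakeDb([...])) and checks the result, which needs no network, schema or test data setup. With a global db property the same test would first have to replace the global and remember to restore it afterwards.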

helper functions as static functions or procedural functions?

I wonder whether one should create a helper function as a static function in a class or just declare it as a procedural function.
I tend to think that a static helper function is the right way to go, because then I can see what kind of helper function it is, e.g. Database::connect(), File::create().
What is best practice?
IMO it depends on what type of helper function it is. Statics / Singletons make it very difficult to test things in isolation, because they spread concrete dependencies around. So if the helper method is something I might want to fake out in a unit test (and your examples of creating files and connecting to databases definitely would fall in that category), then I would just create them as instance methods on a regular class. The user would instantiate the helper class as necessary to call the methods.
With that in place, it is easier to use Inversion of Control / Dependency Injection / Service Locator patterns to put fakes in when you want to test the code and you want to fake out database access, or filesystem access, etc.
This of course has the downside of there theoretically being multiple instances of the helper class, but this is not a real problem in most systems. The overhead of having these instances is minimal.
If the helper method was something very simple that I would never want to fake out for test, then I might consider using a static.
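A short sketch of the instance-method approach, in Python rather than PHP. FileHelper, FakeFileHelper and count_lines are hypothetical names; the point is only that the caller accepts a helper object, so a test can hand in a fake.

```python
class FileHelper:
    """Real helper: instance methods, so it can be swapped in tests."""

    def read_text(self, path: str) -> str:
        with open(path, encoding="utf-8") as handle:
            return handle.read()


class FakeFileHelper:
    """Test double with the same method, backed by a dict."""

    def __init__(self, files) -> None:
        self._files = dict(files)

    def read_text(self, path: str) -> str:
        return self._files[path]


def count_lines(path: str, files=None) -> int:
    """Count lines in a file; `files` defaults to the real helper."""
    helper = files if files is not None else FileHelper()
    return len(helper.read_text(path).splitlines())
```

Production code calls count_lines(path) and gets the real filesystem, while a test passes a FakeFileHelper and never touches the disk. A static File::read() call inside count_lines would not allow that substitution.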
Singleton solves the confusion.
MyHelper.Instance.ExecuteMethod();
Instance will be a static property. The benefit is a simple one-line call in the calling method, and it reuses the previously created instance, which avoids the overhead of creating and disposing of multiple instances.

Can a Mock framework do this for me?

I am a bit confused
from wiki:
"This means that a true mock... performing tests on the data passed into the method calls as arguments."
I have never used unit testing or a mock framework. I thought unit tests are for automated tests, so what are mock tests for?
What I want is an object that replaces my database, which I might add later, but I still don't know what database or ORM tool I will use.
If I write my program with mocks, could I easily replace them with POCOs later, so that Entity Framework, for example, can be made to work quickly?
Edit: I do not want to use unit testing, but using mocks as a full replacement for entities + database would be nice.
Yes, I think you are a bit confused.
Mock frameworks are for creating "Mock" objects which basically fake part of the functionality of your real objects so you can pass them to methods during tests, without having to go to the trouble of creating the real object for testing.
Let's run through a quick example.
Say you have a 'Save()' method that takes a 'Doc' object, and returns a 'boolean' success flag:
public bool Save(Doc docToSave) {...}
Now if you want to write a unit test for this method, you are going to have to first create a document object, and populate it with appropriate data before you can test the 'Save()' method. This requires more work than you really want to do.
Instead, it is possible to use a Mocking framework to create a mock 'Doc' object for you.
Syntax varies between frameworks, but in pseudo-code you would write something like this:
CreateMock of type Doc
SetReturnValue for method Doc.data = "some test data"
The mocking framework will create a dummy mock object of type Doc that correctly returns "some test data" when its '.data' property is read.
You can then use this dummy object to test your save method:
public void MyTest()
{
...
bool isSuccess = someClass.Save(dummyDoc);
...
}
The mocking framework ensures that when your 'Save()' method accesses the properties on the dummyDoc object, the correct data is returned, and the save can happen naturally.
This is a slightly contrived example, and in such a simple case it would probably be just as easy to create a real Doc object, but often in a complex bit of software it might be much harder to create the object because it has dependencies on other things, or it has requirements for other things to be created first. Mocking removes some of that extra overhead and allows you to test just the specific method that you are trying to test and not worry about the intricacies of the Doc class as well.
Mock tests are simply unit tests using mocked objects as opposed to real ones. Mocked objects are not really used as part of actual production code.
If you want something that will take the place of your database classes so you can change your mind later, you need to write interfaces or abstract classes to provide the methods you require to match your save/load semantics, then you can fill out several full implementations depending on what storage types you choose.
I think what you're looking for is the Repository Pattern. That link is for NHibernate, but the general pattern should work for Entity Framework as well. Searching for that, I found Implementing Repository Pattern With Entity Framework.
This abstracts the details of the actual O/RM behind an interface (or set of interfaces).
(I'm no expert on repositories, so please post better explanations/links if anyone has them.)
You could then use a mocking (isolation) framework or hand-code fakes/stubs during initial development prior to deciding on an O/RM.
Unit testing is where you'll realize the full benefits. You can then test classes that depend on repository interfaces by supplying mock or stub repositories. You won't need to set up an actual database for these tests, and they will execute very quickly. Tests pay for themselves over and over, and the quality of your code will increase.
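For a feel of what the repository abstraction looks like before any O/RM is chosen, here is a minimal sketch in Python. Customer, CustomerRepository, InMemoryCustomerRepository and rename_customer are hypothetical names; a later Entity Framework or NHibernate backed class would implement the same two methods.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Protocol


@dataclass
class Customer:
    id: int
    name: str


class CustomerRepository(Protocol):
    def get(self, customer_id: int) -> Optional[Customer]: ...
    def save(self, customer: Customer) -> None: ...


class InMemoryCustomerRepository:
    """Stand-in used until a real database or O/RM is chosen."""

    def __init__(self) -> None:
        self._items: Dict[int, Customer] = {}

    def get(self, customer_id: int) -> Optional[Customer]:
        return self._items.get(customer_id)

    def save(self, customer: Customer) -> None:
        self._items[customer.id] = customer


def rename_customer(repo: CustomerRepository, customer_id: int, new_name: str) -> bool:
    """Application code depends only on the repository interface."""
    customer = repo.get(customer_id)
    if customer is None:
        return False
    customer.name = new_name
    repo.save(customer)
    return True
```

Swapping in a database-backed repository later should not require changes to rename_customer, which is the property the question is after.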

How to unit test an object with database queries

I've heard that unit testing is "totally awesome", "really cool" and "all manner of good things" but 70% or more of my files involve database access (some read and some write) and I'm not sure how to write a unit test for these files.
I'm using PHP and Python but I think it's a question that applies to most/all languages that use database access.
I would suggest mocking out your calls to the database. Mocks are basically objects that look like the object you are trying to call a method on, in the sense that they have the same properties, methods, etc. available to the caller. But instead of performing whatever action the real object is programmed to do when a particular method is called, a mock skips that altogether and just returns a result. That result is typically defined by you ahead of time.
In order to set up your objects for mocking, you probably need to use some sort of inversion of control/ dependency injection pattern, as in the following pseudo-code:
class Bar
{
    private FooDataProvider _dataProvider;

    // constructor: the data provider is injected rather than created here
    public Bar(FooDataProvider dataProvider) {
        _dataProvider = dataProvider;
    }

    public Foo[] GetAllFoos() {
        // instead of calling Foo.GetAll() here, we are introducing an extra layer of abstraction
        return _dataProvider.GetAllFoos();
    }
}

class FooDataProvider
{
    public Foo[] GetAllFoos() {
        return Foo.GetAll();
    }
}
Now in your unit test, you create a mock of FooDataProvider, which allows you to call the method GetAllFoos without having to actually hit the database.
class BarTests
{
    public TestGetAllFoos() {
        // here we set up our mock FooDataProvider
        mockRepository = MockingFramework.new()
        mockFooDataProvider = mockRepository.CreateMockOfType(FooDataProvider);
        // create a new array of Foo objects
        testFooArray = new Foo[] {Foo.new(), Foo.new(), Foo.new()}
        // the next statement will cause testFooArray to be returned every time we call FooDataProvider.GetAllFoos,
        // instead of calling to the database and returning whatever is in there
        // ExpectCallTo and Returns are methods provided by our imaginary mocking framework
        ExpectCallTo(mockFooDataProvider.GetAllFoos).Returns(testFooArray)
        // now begins our actual unit test
        testBar = new Bar(mockFooDataProvider)
        baz = testBar.GetAllFoos()
        // baz should now equal the testFooArray object we created earlier
        Assert.AreEqual(3, baz.length)
    }
}
A common mocking scenario, in a nutshell. Of course you will still probably want to unit test your actual database calls too, for which you will need to hit the database.
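Since the question mentions Python, here is roughly how the same scenario might look with the standard library's unittest.mock. The Bar and FooDataProvider classes mirror the pseudo-code; the snake_case method names and the placeholder strings used as "foos" are assumptions made for the sketch.

```python
from unittest.mock import Mock


class FooDataProvider:
    def get_all_foos(self):
        raise NotImplementedError("the real provider would query the database")


class Bar:
    def __init__(self, data_provider) -> None:
        self._data_provider = data_provider

    def get_all_foos(self):
        return self._data_provider.get_all_foos()


def test_get_all_foos() -> None:
    # spec= makes the mock reject attributes FooDataProvider does not have
    provider = Mock(spec=FooDataProvider)
    provider.get_all_foos.return_value = ["foo1", "foo2", "foo3"]

    result = Bar(provider).get_all_foos()

    assert len(result) == 3
    provider.get_all_foos.assert_called_once_with()
```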
Ideally, your objects should be persistence ignorant. For instance, you should have a "data access layer" that you make requests to and that returns objects. This way, you can leave that part out of your unit tests, or test it in isolation.
If your objects are tightly coupled to your data layer, it is difficult to do proper unit testing. The first part of "unit test" is "unit": all units should be able to be tested in isolation.
In my C# projects, I use NHibernate with a completely separate Data layer. My objects live in the core domain model and are accessed from my application layer. The application layer talks to both the data layer and the domain model layer.
The application layer is also sometimes called the "Business Layer".
If you are using PHP, create a specific set of classes ONLY for data access. Make sure your objects have no idea how they are persisted and wire up the two in your application classes.
Another option would be to use mocking/stubs.
The easiest way to unit test an object with database access is using transaction scopes.
For example:
[Test]
[ExpectedException(typeof(NotFoundException))]
public void DeleteAttendee() {
    // The scope is never completed, so disposing it rolls the changes back.
    using(TransactionScope scope = new TransactionScope()) {
        Attendee anAttendee = Attendee.Get(3);
        anAttendee.Delete();
        anAttendee.Save();
        //Try reloading. Instance should have been deleted.
        Attendee deletedAttendee = Attendee.Get(3);
    }
}
This reverts the state of the database, basically like a transaction rollback, so you can run the test as many times as you want without any side effects. We've used this approach successfully in large projects. Our build does take a little long to run (15 minutes), but that is not horrible for 1800 unit tests. Also, if build time is a concern, you can split the build process into multiple builds: one for building src, and another that fires up afterwards and handles unit tests, code analysis, packaging, etc.
I can perhaps give you a taste of our experience when we began looking at unit testing our middle-tier process that included a ton of "business logic" sql operations.
We first created an abstraction layer that allowed us to "slot in" any reasonable database connection (in our case, we simply supported a single ODBC-type connection).
Once this was in place, we were then able to do something like this in our code (we work in C++, but I'm sure you get the idea):
GetDatabase().ExecuteSQL( "INSERT INTO foo ( blah, blah )" )
At normal run time, GetDatabase() would return an object that fed all our sql (including queries), via ODBC directly to the database.
We then started looking at in-memory databases; the best by a long way seems to be SQLite (http://www.sqlite.org/index.html). It's remarkably simple to set up and use, and it allowed us to subclass and override GetDatabase() to forward SQL to an in-memory database that was created and destroyed for every test performed.
We're still in the early stages of this, but it's looking good so far. We do have to make sure we create any tables that are required and populate them with test data, but we've reduced that workload somewhat by creating a generic set of helper functions that do a lot of this for us.
Overall, it has helped immensely with our TDD process, since making what seem like quite innocuous changes to fix certain bugs can have quite strange effects on other (difficult to detect) areas of your system, due to the very nature of SQL/databases.
Obviously, our experiences have centred around a C++ development environment, however I'm sure you could perhaps get something similar working under PHP/Python.
Hope this helps.
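Something similar under Python might look like the sketch below, which uses the standard sqlite3 module. The Database wrapper, make_test_database helper and foo table are hypothetical; they only illustrate the "fresh in-memory database per test, filled by a generic helper" approach described above, not the poster's actual C++ code.

```python
import sqlite3


class Database:
    """Hypothetical abstraction layer; production would wrap the real connection."""

    def __init__(self, connection: sqlite3.Connection) -> None:
        self._connection = connection

    def execute_sql(self, sql: str, params=()):
        return self._connection.execute(sql, params).fetchall()


def make_test_database(schema: str, rows_by_table=None) -> Database:
    """Helper: build a fresh in-memory database and load test rows."""
    connection = sqlite3.connect(":memory:")
    connection.executescript(schema)
    for table, rows in (rows_by_table or {}).items():
        for row in rows:
            marks = ", ".join("?" for _ in row)
            # Table names come from the test itself, never from user input.
            connection.execute(f"INSERT INTO {table} VALUES ({marks})", row)
    return Database(connection)


def count_foo(db: Database) -> int:
    return db.execute_sql("SELECT COUNT(*) FROM foo")[0][0]
```

Each test calls make_test_database with the schema and rows it needs and throws the result away afterwards, so no state leaks between tests.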
You should mock the database access if you want to unit test your classes. After all, you don't want to test the database in a unit test. That would be an integration test.
Abstract the calls away and then insert a mock that just returns the expected data. If your classes don't do more than executing queries, it may not even be worth testing them, though...
The book xUnit Test Patterns describes some ways to handle unit-testing code that hits a database. I agree with the other people who are saying that you don't want to do this because it's slow, but you gotta do it sometime, IMO. Mocking out the db connection to test higher-level stuff is a good idea, but check out this book for suggestions about things you can do to interact with the actual database.
I usually try to break up my tests between testing the objects (and ORM, if any) and testing the db. I test the object-side of things by mocking the data access calls whereas I test the db side of things by testing the object interactions with the db which is, in my experience, usually fairly limited.
I used to get frustrated with writing unit tests until I started mocking the data access portion, so I didn't have to create a test db or generate test data on the fly. By mocking the data you can generate it all at run time and be sure that your objects work properly with known inputs.
Options you have:
1. Write a script that will wipe out the database before you start the unit tests, then populate the db with a predefined set of data and run the tests. You can also do that before every test – it'll be slow, but less error prone.
2. Inject the database. (Example in pseudo-Java, but applies to all OO-languages)
class Database {
    public Result query(String query) {... real db here ...}
}
class MockDatabase extends Database {
    public Result query(String query) {
        return "mock result"; // pseudo-code: stands in for a canned Result
    }
}
class ObjectThatUsesDB {
    private Database database;

    public ObjectThatUsesDB(Database db) {
        this.database = db;
    }
}
Now in production you use the normal database, and for all tests you just inject the mock database, which you can create ad hoc.
3. Do not use the DB at all throughout most of the code (that's a bad practice anyway). Create a "database" object that, instead of returning raw results, returns normal objects (i.e. it returns User instead of a tuple {name: "marcin", password: "blah"}). Write all your tests with ad hoc constructed real objects, and write one big test that depends on a database and makes sure this conversion works OK.
Of course these approaches are not mutually exclusive and you can mix and match them as you need.
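The last option, keeping raw rows out of most of the code, can be sketched in a few lines of Python. User, user_from_row and can_log_in are hypothetical names, and the plain-text password comparison is kept only because it matches the example tuple above.

```python
from dataclasses import dataclass


@dataclass
class User:
    name: str
    password: str


def user_from_row(row: dict) -> User:
    """The one place that knows how a database row maps to a User."""
    return User(name=row["name"], password=row["password"])


def can_log_in(user: User, attempt: str) -> bool:
    """Business logic works on plain objects and never sees the database."""
    return user.password == attempt
```

Tests for can_log_in build User objects directly, and only the conversion in user_from_row needs a database-backed test.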
Unit testing your database access is easy enough if your project has high cohesion and loose coupling throughout. This way you can test only the things that each particular class does without having to test everything at once.
For example, if you unit test your user interface class the tests you write should only try to verify the logic inside the UI worked as expected, not the business logic or database action behind that function.
If you want to unit test the actual database access you will actually end up with more of an integration test, because you will be dependent on the network stack and your database server, but you can verify that your SQL code does what you asked it to do.
The hidden power of unit testing for me personally has been that it forces me to design my applications in a much better way than I might without them. This is because it really helped me break away from the "this function should do everything" mentality.
Sorry I don't have any specific code examples for PHP/Python, but if you want to see a .NET example I have a post that describes a technique I used to do this very same testing.
I agree with the first post - database access should be stripped away into a DAO layer that implements an interface. Then, you can test your logic against a stub implementation of the DAO layer.
You could use mocking frameworks to abstract out the database engine. I don't know whether PHP/Python have any, but for typed languages (C#, Java, etc.) there are plenty of choices.
It also depends on how you designed your database access code, because some designs are easier to unit test than others, as the earlier posts have mentioned.
I've never done this in PHP and I've never used Python, but what you want to do is mock out the calls to the database. To do that you can implement some IoC whether 3rd party tool or you manage it yourself, then you can implement some mock version of the database caller which is where you will control the outcome of that fake call.
A simple form of IoC can be achieved just by coding to interfaces. This requires some kind of object orientation in your code, so it may not apply to what you're doing (I say that since all I have to go on is your mention of PHP and Python).
Hope that's helpful, if nothing else you've got some terms to search on now.
Setting up test data for unit tests can be a challenge.
When it comes to Java, if you use the Spring APIs for unit testing, you can control transactions at the unit level. In other words, you can execute unit tests that involve database updates/inserts/deletes and roll back the changes. At the end of the execution you leave everything in the database as it was before you started. To me, it is as good as it can get.
