Best way to test a MS Access application?

With the code, forms and data inside the same database, I am wondering what the best practices are for designing a suite of tests for a Microsoft Access application (say, for Access 2007).
One of the main issues with testing forms is that only a few controls have an hwnd handle, and other controls only get one when they have focus. That makes automation quite opaque, since you can't enumerate the controls on a form in order to act on them.
Any experience to share?

1. Write Testable Code
First, stop writing business logic into your Form's code-behind. That's not the place for it, and it can't be properly tested there. In fact, you really shouldn't have to test your form itself at all. It should be a dead simple view that responds to user interaction and then delegates responsibility for responding to those actions to another class that is testable.
How do you do that? Familiarizing yourself with the Model-View-Controller pattern is a good start.
It can't be done perfectly in VBA, because we get either events or interfaces, never both, but you can get pretty close. Consider this simple form that has a text box and a button.
In the form's code behind, we'll wrap the TextBox's value in a public property and re-raise any events we're interested in.
Public Event OnSayHello()
Public Event AfterTextUpdate()

Public Property Let Text(value As String)
    Me.TextBox1.value = value
End Property

Public Property Get Text() As String
    Text = Me.TextBox1.value
End Property

Private Sub SayHello_Click()
    RaiseEvent OnSayHello
End Sub

Private Sub TextBox1_AfterUpdate()
    RaiseEvent AfterTextUpdate
End Sub
Now we need a model to work with. Here I've created a new class module named MyModel. Here lies the code we'll put under test. Note that it naturally shares a similar structure with our view.
Private mText As String

Public Property Let Text(value As String)
    mText = value
End Property

Public Property Get Text() As String
    Text = mText
End Property

Public Function Reversed() As String
    Dim result As String
    Dim length As Long
    length = Len(mText)
    Dim i As Long
    For i = 0 To length - 1
        result = result & Mid(mText, (length - i), 1)
    Next i
    Reversed = result
End Function

Public Sub SayHello()
    MsgBox Reversed()
End Sub
Finally, our controller wires it all together. The controller listens for form events, communicates changes to the model, and triggers the model's routines.
Private WithEvents view As Form_Form1
Private model As MyModel

Public Sub Run()
    Set model = New MyModel
    Set view = New Form_Form1
    view.Visible = True
End Sub

Private Sub view_AfterTextUpdate()
    model.Text = view.Text
End Sub

Private Sub view_OnSayHello()
    model.SayHello
    view.Text = model.Reversed()
End Sub
Now this code can be run from any other module. For the purposes of this example, I've used a standard module. I highly encourage you to build this yourself using the code I've provided and see it function.
Private controller As FormController

Public Sub Run()
    Set controller = New FormController
    controller.Run
End Sub
So, that's great and all, but what does it have to do with testing?! Friend, it has everything to do with testing. What we've done is make our code testable. In the example I've provided, there is no reason whatsoever to even try to test the GUI. The only thing we really need to test is the model. That's where all of the real logic is.
So, on to step two.
2. Choose a Unit Testing Framework
There aren't a lot of options here. Most frameworks require installing COM add-ins, lots of boilerplate, weird syntax, writing tests as comments, etc. That's why I got involved in building one myself, so this part of my answer isn't impartial, but I'll try to give a fair summary of what's available.
AccUnit
Works only in Access.
Requires you to write tests as a strange hybrid of comments and code (no IntelliSense for the comment part).
There is a graphical interface to help you write those strange-looking tests, though.
The project has not seen any updates since 2013.
VB Lite Unit
I can't say I've personally used it. It's out there, but hasn't seen an update since 2005.
xlUnit
xlUnit isn't awful, but it's not good either. It's clunky and there's lots of boiler plate code. It's the best of the worst, but it doesn't work in Access. So, that's out.
Build your own framework
I've been there and done that. It's probably more than most people want to get into, but it is completely possible to build a Unit Testing framework in Native VBA code.
Rubberduck VBE Add-In's Unit Testing Framework
Disclaimer: I'm one of the co-devs.
I'm biased, but this is by far my favorite of the bunch.
Little to no boilerplate code.
IntelliSense is available.
The project is active.
More documentation than most of these projects.
It works in most of the major office applications, not just Access.
It is, unfortunately, a COM Add-In, so it has to be installed onto your machine.
3. Start Writing Tests
So, back to our code from section 1. The only code that we really needed to test was the MyModel.Reversed() function. So, let's take a look at what that test could look like. (Example given uses Rubberduck, but it's a simple test and could translate into the framework of your choice.)
'#TestModule
Private Assert As New Rubberduck.AssertClass

'#TestMethod
Public Sub ReversedReversesCorrectly()
Arrange:
    Dim model As New MyModel
    Const original As String = "Hello"
    Const expected As String = "olleH"
    Dim actual As String
    model.Text = original
Act:
    actual = model.Reversed
Assert:
    Assert.AreEqual expected, actual
End Sub
Guidelines for Writing Good Tests
Only test one thing at a time.
Good tests only fail when there is a bug introduced into the system or the requirements have changed.
Don't include external dependencies such as databases and file systems. These external dependencies can make tests fail for reasons outside of your control, and they slow your tests down. If your tests are slow, you won't run them (see the sketch after this list).
Use test names that describe what the test is testing. Don't worry if it gets long. It's most important that it is descriptive.
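For instance, here is a minimal sketch of keeping a table lookup out of a test by hiding it behind an interface class; every name here (IDataSource, FakeDataSource, GetCustomerName) is hypothetical, not part of any framework:

' Class module "IDataSource": the abstraction the code under test depends on
Public Function GetCustomerName(ByVal ID As Long) As String
End Function

' Class module "FakeDataSource": canned data, so the test never opens a recordset
Implements IDataSource
Private Function IDataSource_GetCustomerName(ByVal ID As Long) As String
    IDataSource_GetCustomerName = "Test Customer"   ' fixed value instead of a table lookup
End Function

' In a test, hand the fake to the code under test where production code
' would pass a real, table-backed implementation:
'     Dim source As IDataSource
'     Set source = New FakeDataSource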
I know that answer was a little long, and late, but hopefully it helps some people get started in writing unit tests for their VBA code.

I appreciated knox's and david's answers. My answer will be somewhere between theirs: just make forms that do not need to be debugged!
I think that forms should be used exclusively for what they basically are: a graphical interface, and nothing more. That means they do not have to be debugged! The debugging job is then limited to your VBA modules and objects, which is a lot easier to handle.
There is of course a natural tendency to add VBA code to forms and/or controls, especially when Access offers you these great "After Update" and "On Change" events, but I definitely advise you not to put any form- or control-specific code in the form's module. It makes further maintenance and upgrades very costly, because your code ends up split between VBA modules and form/control modules.
This does not mean you can no longer use the AfterUpdate event! Just put standard code in the event, like this:
Private Sub myControl_AfterUpdate()
    CTLAfterUpdate myControl
    On Error Resume Next
    Eval ("CTLAfterUpdate_MyForm()")
    On Error GoTo 0
End Sub
Where:
CTLAfterUpdate is a standard procedure run each time a control is updated in a form
CTLAfterUpdate_MyForm is a specific procedure run each time a control is updated on MyForm
I then have two modules. The first one, utilityFormEvents, holds my generic CTLAfterUpdate event. The second one, MyAppFormEvents, contains the specific code of all the specific forms of the MyApp application, including the CTLAfterUpdate_MyForm procedure. Of course, CTLAfterUpdate_MyForm might not exist if there is no specific code to run. This is why we turn "On Error" to "Resume Next"...
Choosing such a generic solution means a lot. It means you are reaching a high level of code normalization (meaning painless maintenance of code). And once you no longer have any form-specific code, it also means that form modules are fully standardized, and their production can be automated: just say which events you want to manage at the form/control level, and define your generic/specific procedure terminology.
Write your automation code, once and for all.
It takes a few days of work but it gives exciting results. I have been using this solution for the last 2 years and it is clearly the right one: my forms are fully and automatically created from scratch with a "Forms Table", linked to a "Controls Table".
I can then spend my time working on the specific procedures of the form, if any.
Code normalization, even with MS Access, is a long process. But it is really worth the pain!
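The generation idea can lean on Access's built-in CreateForm and CreateControl functions. Here is a minimal sketch, assuming a hypothetical "Controls Table" named tblControls with FormName, ControlType, Left and Top columns:

Public Sub BuildMyForm()
    Dim frm As Form
    Dim rs As DAO.Recordset
    Set frm = CreateForm()   ' creates a new form, opened in Design view
    Set rs = CurrentDb.OpenRecordset("SELECT * FROM tblControls WHERE FormName = 'MyForm'")
    Do While Not rs.EOF
        ' ControlType holds an intrinsic constant value such as acTextBox (109)
        CreateControl frm.Name, rs!ControlType, acDetail, , , rs!Left, rs!Top
        rs.MoveNext
    Loop
    rs.Close
    DoCmd.Save acForm, frm.Name
End Sub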

Another advantage of Access being a COM application is that you can create a .NET application to run and test an Access application via Automation. The advantage is that you can then use a more powerful testing framework, such as NUnit, to write automated assert tests against an Access app.
Therefore, if you are proficient in either C# or VB.NET, combined with something like NUnit, you can more easily create greater test coverage for your Access app.

Although this is a very old question:
There is AccUnit, a specialized unit-test framework for Microsoft Access.

I've taken a page out of Python's doctest concept and implemented a DocTests procedure in Access VBA. This is obviously not a full-blown unit-testing solution. It's still relatively young, so I doubt I've worked out all the bugs, but I think it's mature enough to release into the wild.
Just copy the following code into a standard code module and press F5 inside the Sub to see it in action:
'>>> 1 + 1
'2
'>>> 3 - 1
'0
Sub DocTests()
    Dim Comp As Object, i As Long, CM As Object
    Dim Expr As String, ExpectedResult As Variant, TestsPassed As Long, TestsFailed As Long
    Dim Evaluation As Variant
    For Each Comp In Application.VBE.ActiveVBProject.VBComponents
        Set CM = Comp.CodeModule
        For i = 1 To CM.CountOfLines
            If Left(Trim(CM.Lines(i, 1)), 4) = "'>>>" Then
                Expr = Trim(Mid(CM.Lines(i, 1), 5))
                On Error Resume Next
                Evaluation = Eval(Expr)
                If Err.Number = 2425 And Comp.Type <> 1 Then
                    'Error 2425: the expression has a function name that can't be found.
                    'This is not surprising because we are not in a standard code module (Comp.Type <> 1).
                    'So we will just ignore it.
                    GoTo NextLine
                ElseIf Err.Number <> 0 Then
                    Debug.Print Err.Number, Err.Description, Expr
                    GoTo NextLine
                End If
                On Error GoTo 0
                ExpectedResult = Trim(Mid(CM.Lines(i + 1, 1), InStr(CM.Lines(i + 1, 1), "'") + 1))
                Select Case ExpectedResult
                    Case "True": ExpectedResult = True
                    Case "False": ExpectedResult = False
                    Case "Null": ExpectedResult = Null
                End Select
                Select Case TypeName(Evaluation)
                    Case "Long", "Integer", "Short", "Byte", "Single", "Double", "Decimal", "Currency"
                        ExpectedResult = Eval(ExpectedResult)
                    Case "Date"
                        If IsDate(ExpectedResult) Then ExpectedResult = CDate(ExpectedResult)
                End Select
                If (Evaluation = ExpectedResult) Then
                    TestsPassed = TestsPassed + 1
                ElseIf (IsNull(Evaluation) And IsNull(ExpectedResult)) Then
                    TestsPassed = TestsPassed + 1
                Else
                    Debug.Print Comp.Name; ": "; Expr; " evaluates to: "; Evaluation; " Expected: "; ExpectedResult
                    TestsFailed = TestsFailed + 1
                End If
            End If
NextLine:
        Next i
    Next Comp
    Debug.Print "Tests passed: "; TestsPassed; " of "; TestsPassed + TestsFailed
End Sub
Copying, pasting, and running the above code from a module named Module1 yields:
Module1: 3 - 1 evaluates to: 2 Expected: 0
Tests passed: 1 of 2
A few quick notes:
It has no dependencies (when used from within Access)
It makes use of Eval, which is a function in the Access.Application object model; this means you could use it outside of Access, but it would require creating an Access.Application object and fully qualifying the Eval calls
There are some idiosyncrasies associated with Eval to be aware of
It can only be used on functions that return a result that fits on a single line
Despite its limitations, I still think it provides quite a bit of bang for your buck.
Edit: Here is a simple function with "doctest rules" that the function must satisfy.
Public Function AddTwoValues(ByVal p1 As Variant, _
                             ByVal p2 As Variant) As Variant
    '>>> AddTwoValues(1,1)
    '2
    '>>> AddTwoValues(1,1) = 1
    'False
    '>>> AddTwoValues(1,Null)
    'Null
    '>>> IsError(AddTwoValues(1,"foo"))
    'True
    On Error GoTo ErrorHandler
    AddTwoValues = p1 + p2
ExitHere:
    On Error GoTo 0
    Exit Function
ErrorHandler:
    AddTwoValues = CVErr(Err.Number)
    GoTo ExitHere
End Function

I would design the application to have as much work as possible done in queries and in VBA subroutines, so that your testing consists of populating test databases, running the production queries and VBA against those databases, and comparing the output against known-good results. This approach doesn't test the GUI, obviously, so you could augment the testing with a series of test scripts (here I mean something like a Word document that says "open form 1, and click control 1") that are manually executed.
The level of automation necessary for the testing aspect depends on the scope of the project.
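For example, one such check could look like the following sketch, where the query name and the expected total are hypothetical:

Public Function CheckMonthlyTotals() As Boolean
    Dim rs As DAO.Recordset
    ' run a production query against whatever test data is currently attached
    Set rs = CurrentDb.OpenRecordset("qryMonthlyTotals", dbOpenSnapshot)
    If rs.EOF Then
        Debug.Print "qryMonthlyTotals returned no rows"
    ElseIf rs!Total <> 42 Then
        Debug.Print "qryMonthlyTotals: expected 42, got " & rs!Total
    Else
        CheckMonthlyTotals = True   ' output matches the known-good result
    End If
    rs.Close
End Function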

If you're interested in testing your Access application at a more granular level, specifically the VBA code itself, then VB Lite Unit is a great unit-testing framework for that purpose.

There are good suggestions here, but I'm surprised no one mentioned centralized error processing. You can get add-ins that allow for quick function/sub templating and for adding line numbers (I use MZ-Tools). Then send all errors to a single function where you can log them. You can also then break on all errors by setting a single breakpoint.
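The pattern could look something like this sketch; LogError, the module name and the line numbers (of the kind MZ-Tools inserts) are illustrative:

Public Sub DoSomething()
    On Error GoTo ErrHandler
    Dim x As Long
10  x = x / 0   ' line numbers make Erl meaningful in the log
    Exit Sub
ErrHandler:
    LogError Err.Number, Err.Description, "Module1.DoSomething", Erl
End Sub

' The single choke point: one breakpoint here breaks on every handled error
Public Sub LogError(ByVal ErrNo As Long, ByVal Desc As String, _
                    ByVal Source As String, ByVal LineNo As Long)
    Debug.Print Now, ErrNo, Desc, Source, LineNo   ' or append to an error table
End Sub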

I find that there are relatively few opportunities for unit testing in my applications. Most of the code that I write interacts with table data or the file system, so it is fundamentally hard to unit test. Early on, I tried an approach that may be similar to mocking (spoofing), where I created code that had an optional parameter. If the parameter was used, then the procedure would use the parameter instead of fetching data from the database. It is quite easy to set up a user-defined type that has the same field types as a row of data and to pass that to a function. I then had a way of getting test data into the procedure that I wanted to test. Inside each procedure there was some code that swapped out the real data source for the test data source. This allowed me to use unit testing on a wider variety of functions, using my own unit-testing functions. Writing unit tests is easy; it is just repetitive and boring. In the end, I gave up on unit tests and started using a different approach.
I write in-house applications mainly for myself, so I can afford to wait until issues find me rather than having to have perfect code. If I do write applications for customers, the customer is generally not fully aware of how much software development costs, so I need a low-cost way of getting results. Writing unit tests is all about writing a test that pushes bad data at a procedure to see if the procedure can handle it appropriately. Unit tests also confirm that good data is handled appropriately.

My current approach is based on writing input validation into every procedure within an application and raising a success flag when the code has completed successfully. Each calling procedure checks for the success flag before using the result. If an issue occurs, it is reported by way of an error message. Each function has a success flag, a return value, an error message, a comment and an origin. A user-defined type (fr, for function return) contains the data members. Any given function may populate only some of the data members in the user-defined type. When a function runs, it usually returns success = True, a return value and sometimes a comment. If a function fails, it returns success = False and an error message. If a chain of functions fails, the error messages are daisy-chained, and the result is actually a lot more readable than a normal stack trace. The origins are also chained, so I know where the issue occurred. The application rarely crashes and accurately reports any issues. The result is a hell of a lot better than standard error handling.
Public Function GetOutputFolder(OutputFolder As eOutputFolder) As FunctRet
    '///Returns a full path when provided with a target folder alias, e.g. the 'temp' folder
    Dim fr As FunctRet
    Select Case OutputFolder
        Case eDefaultDirectory
            fr.Rtn = "C:\Temp\"
            fr.Success = True
        Case eAppPath
            fr.Rtn = TrailingSlash(Application.CurrentProject.path)
            fr.Success = True
        Case eCustomPath
            fr.EM = "Can't set custom paths – not yet implemented"
        Case Else
            fr.EM = "Unrecognised output destination requested"
    End Select
exitproc:
    GetOutputFolder = fr
End Function
Code explained.
eOutputFolder is a user-defined Enum, as below:
Public Enum eOutputFolder
    eDefaultDirectory = 1
    eAppPath = 2
    eCustomPath = 3
End Enum
I am using Enums for passing parameters to functions, as this creates a limited set of known choices that a function can accept. Enums also provide IntelliSense when entering parameters into functions. I suppose they provide a rudimentary interface for a function.
'Type FunctRet is used as a generic means of reporting function returns
Public Type FunctRet
    Success As Long    'Flag for success; Boolean not used, to avoid Nulls
    Rtn As Variant     'Return value
    EM As String       'Error message
    Cmt As String      'Comments
    Origin As String   'Originating procedure/function
End Type
A user-defined type such as FunctRet also provides code completion, which helps. Within the procedure, I usually store internal results in an anonymous internal variable (fr) before assigning the result to the return variable (GetOutputFolder). This makes renaming procedures very easy, as only the top and bottom have to be changed.
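To show how the daisy-chaining reads at a call site, here is a hedged sketch of a caller; ProcessReport is hypothetical, and a fuller version of GetOutputFolder would also set fr.Origin:

Public Function ProcessReport() As FunctRet
    Dim fr As FunctRet
    Dim folder As FunctRet
    folder = GetOutputFolder(eCustomPath)
    If folder.Success Then
        fr.Rtn = folder.Rtn & "report.txt"
        fr.Success = True
    Else
        ' chain the message and origin instead of raising an error
        fr.EM = "Could not write report: " & folder.EM
        fr.Origin = "ProcessReport <- " & folder.Origin
    End If
    ProcessReport = fr
End Function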
So in summary, I have developed a framework with MS Access that covers all operations that involve VBA. The testing is permanently written into the procedures, rather than being a development-time unit test. In practice, the code still runs very fast. I am very careful to optimize lower-level functions that can be called ten thousand times a minute. Furthermore, I can use the code in production as it is being developed. If an error occurs, it is user friendly, and the source and reason for the error are usually obvious. Errors are reported from the calling form, not from some module in the business layer, which is an important principle of application design. Furthermore, I don't have the burden of maintaining unit-testing code, which is really important when I am evolving a design rather than coding a clearly conceptualised design.
There are some potential issues. The testing is not automated, and new bad code is only detected when the application is run. The code does not look like standard VBA code (it is usually shorter). Still, the approach has some advantages. It is far better than using an error handler just to log an error, as users will usually contact me and give me a meaningful error message. It can also handle procedures that work with external data. JavaScript reminds me of VBA; I wonder why JavaScript is the land of frameworks and VBA in MS Access is not.
A few days after writing this post, I found an article on The CodeProject that comes close to what I have written above. The article compares and contrasts exception handling and error handling. What I have suggested above is akin to exception handling.

I have not tried this, but you could attempt to publish your Access forms as Data Access Pages to something like SharePoint, or just as web pages, and then use a tool such as Selenium to drive the browser with a suite of tests.
Obviously this is not as ideal as driving the code directly through unit tests, but it may get you part of the way. Good luck.

Access is a COM application. Use COM, not the Windows API, to test things in Access.
The best test environment for an Access application is Access. All of your forms/reports/tables/code/queries are available, there is a scripting language similar to MS Test (OK, you probably don't remember MS Test), there is a database environment for holding your test scripts and test results, and the skills you build here are transferable to your application.
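For example, a minimal sketch of driving an application over COM from a separate test database; the path and form name are placeholders, and the Text property assumes a public wrapper like the one shown in the first answer:

Public Sub TestFormViaCOM()
    Dim app As Object
    Set app = CreateObject("Access.Application")
    app.OpenCurrentDatabase "C:\Apps\MyApp.accdb"
    app.DoCmd.OpenForm "Form1"
    app.Forms("Form1").Text = "Hello"     ' drive the form through COM, no hwnd needed
    Debug.Print app.Forms("Form1").Text   ' compare against the expected value
    app.Quit
End Sub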

Data Access Pages have been deprecated by MS for quite some time, and never really worked in the first place (they were dependent on the Office Widgets being installed, and worked only in IE, and only badly then).
It is true that Access controls that can get focus only have a window handle when they have the focus (and those that can't get focus, such as labels, never have a window handle at all). This makes Access singularly inappropriate to window handle-driven testing regimes.
Indeed, I question why you want to do this kind of testing in Access. It sounds to me like your basic Extreme Programming dogma, and not all of the principles and practices of XP can be adapted to work with Access applications -- square peg, round hole.
So, step back and ask yourself what you're trying to accomplish, and consider that you may need to use completely different methods from those based on approaches that just can't work in Access.
Or consider whether that kind of automated testing is valid at all, or even useful, with an Access application.

Related

Global property for DB access rather than passing DB around everywhere? Advice anyone?

Globals are evil, right? At least everything I read says so, because something might alter the state of the global at any point.
However, I've a DB object that's a bit of a tramp in regards to class parameters. The property below is an instance of a wrapper class that automatically works in MS Access or SQL - hence why it's not EF or some other ORM.
Public Property db As New DBI.DBI(DBI.DBI.modeenum.access, String.Format("Provider=Microsoft.ACE.OLEDB.12.0;Data Source={0} ;Persist Security Info=True;Jet OLEDB:Database Password=""lkjhgfds8928""", GetRpcd("c:\cms")))
The code itself does have PostSharp for exception handling, so I'm thinking that I can conditionally handle OleDb errors by logging them and re-initialising the DB if it is Null.
Up till now, the solution has been to continually pass the db around as a parameter to every single class that needs it. Most of the data classes have a shared ObservableCollection that is built from structures that individually implement INotifyPropertyChanged. One of these is built asynchronously. The collection property checks if it's empty before firing off the private Async buildCollection sub.
Given that we don't use dependency injection (yet) as I need to learn it; is the Global property all that bad? Db is needed everywhere that data is pulled in or saved. The only places I don't need it at all is the View and its code behind.
It's not a customer facing project but it does need to be solid.
Any advice gratefully received!
Passing the DB connection as a parameter into your classes IS using dependency injection, perhaps you just didn't recognize it as such. Hard coding the connection string in the callers is still code that is not free of dependencies, but at least your database accessors themselves are free of the dependency upon a global connection.
Globals aren't just evil because they change without notice - that's just one effect you see resulting from the bad design choice. They're evil because a design using them is brittle. Code that depends upon globals requires invisible stuff to be set correctly before calling it, and that leads to inter-dependencies between unrelated code. The invisible stuff becomes critically important stuff. Reading just the interface of a module that internally uses globals, how would I know that I have to call the SetupGlobalThing() method before calling it? What happens if I call IncrementGlobalThing() and DecrementGlobalThing() and MultiplyGlobalThing() in varying orders, depending on the function the user selects?
Instead, prefer stateless methods where you pass in all the stuff to be changed and used: IncrementThing(Integer thing) doesn't rely on hidden setup steps. It clearly does one thing: it increments the thing passed in.
It may help to think about it from a unit testing viewpoint. If you were to write a unit test to prove a specific module of code works, would you need to pass in a real database connection (hard*), or would you be able to pass in a fake database reference that meets your testing needs easily?
The best way to test your logic is to unit test it. The best way to test your class interfaces and method structure is to write unit tests that call them. If the class is hard to test, it's likely due to dependencies upon external things (globals, singletons, databases, inappropriate member variables, etc.)
The reason I called using a real database "hard" is that a unit test needs to be easy and fast to run. It shouldn't rely on slow or breakable or complex external things. Think about unit testing your software on the bus, with no network connection. Think about how much work it is to create a dummy database: you have to add users, you have to have the right version of schema in it, it has to be installed, it has to be filled with the right kind of testing data, you need network connectivity to it, all those things can make your testing unreliable. Instead, in a unit test you pass in a mock database, which simply returns values that exercise your code being tested.

How to send message/activate a SysTray application?

We are trying to set up a SysTray application which can be activated from elsewhere. To be more specific, the activation will come from a third-party app which we cannot modify, but which allows us to activate our own app via its path (plus a parameter/argument).
When it gets activated we want to put up a BalloonText, there are to be no forms involved.
Therefore we have two problems to solve:
Make our SysTray application single instance (since it's no good generating multiple instances).
Allow this other app to activate our application with arguments
Lots of help out there to help learners create simple SysTray applications (and indeed we've done it ourselves as part of a solution to an unconnected project).
However we've never tried to make it single instance before.
Lots of help out there to help learners create single instance Winforms applications (again we've done this as part of other projects) but always simple applications with conventional forms (not SysTray). We use the VisualBasic WindowsFormsApplicationBase method.
Can't seem to combine these two approaches into a single solution.
Update:
Hans' answer below nails it (and especially his comment):

This is already taken care of with a NotifyIcon, drop it on the form. And the "Make single instance application" checkbox. And the StartupNextInstance event. You'll need to stop assuming there's anything special about this.
As far as your first question about checking for other instances, this seems to work. I used the CodeProject example as a baseline. In your Sub Main routine you can check for other instances by using the GetProcessesByName method of the Process class. Something like this:
Public Sub Main()
    'Turn visual styles back on
    Application.EnableVisualStyles()
    'Check for other instances before running the application using AppContext
    Dim p() As Process
    p = Process.GetProcessesByName("TrayApp") 'Your application name here
    'GetProcessesByName includes the current process, so more than one entry
    'means another instance is already running
    If UBound(p) > 0 Then
        End
    End If
    Application.Run(New AppContext)
End Sub
For the second question, if your SysTray application is already running, you might want to give this article on .NET Interprocess Communication a try. Otherwise, parse the command-line arguments in your Sub Main as it is created.
From above article:
The XDMessaging library provides an easy-to-use, zero-configuration solution to same-box cross-AppDomain communications. It provides a simple API for sending and receiving targeted string messages across application boundaries. The library allows the use of user-defined pseudo 'channels' through which messages may be sent and received. Any application can send a message to any channel, but it must register as a listener with the channel in order to receive. In this way, developers can quickly and programmatically devise how best their applications can communicate with each other and work in harmony.
Everything becomes trivial when you actually do use a form. It is simple to put your app together with the designer, simple to get your app to terminate, simple to avoid a ghost icon in the tray, simple to create a context menu, simple to add popups if you ever need them.
The only un-simple thing is getting the form to not display. Paste this code in the form's class:
Protected Overrides Sub SetVisibleCore(ByVal value As Boolean)
    If Not Me.IsHandleCreated Then
        Me.CreateHandle()
        value = False
    End If
    MyBase.SetVisibleCore(value)
End Sub
The "Exit" command in the context menu is now simply:
Private Sub ExitToolStripMenuItem_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles ExitToolStripMenuItem.Click
    Me.Close()
End Sub

Django Models / SQLAlchemy are bloated! Any truly Pythonic DB models out there?

"Make things as simple as possible, but no simpler."
Can we find the solution/s that fix the Python database world?
Update: A 'lustdb' prototype has been written by Alex Martelli - if you know any somewhat lightweight, high-level database libraries with multiple backends we could wrap in syntax sugar honey, please weigh in!
from someAmazingDB import *
#we imported a smart model class and db object which talk to database adapter/s

class Task(model):
    title = ''
    done = False  #native types, not a custom object we have to think about!

db.taskList = []
#or
db.taskList = expandableTypeCollection(Task)  #not sure what this syntax would be

db['taskList'].append(Task(title='Beat old sql interfaces', done=False))
db.taskList.append(Task('Illustrate different syntax modes', True))  #ok maybe we should just use kwargs

#at this point it should be autosaved to a default db option
#by default we should be able to reload the console and access the default db:

>>> from someAmazingDB import *
>>> print 'Done tasks:'
>>> for task in db.taskList:
...     if task.done:
...         print task.title
'Illustrate different syntax modes'
I'm a fan of Python, web.py and CherryPy, and KISS in general.
We're talking automatic Python to SQL type translation or NoSQL.
We don't have to totally be SQL compatible! Just a scalable subset or ignore it!
Re:model changes, it's ok to ask the developer when they try to change it or have a set of sensible defaults.
Here is the challenge: The above code should work with very little modification or thinking required. Why must we put up with compromise when we know better?
It's 2010, we should be able to code scalable, simple databases in our sleep.
If you think this is important, please upvote!
What you request cannot be done in Python 2.whatever, for a very specific reason. You want to write:
class Task(model):
    title = ''
    isDone = False
In Python 2.anything, whatever model may possibly be, this cannot ever allow you to predict any "ordering" for the two fields, because the semantics of a class statement are:
execute the body, thus preparing a dict
locate the metaclass and run special methods thereof
Whatever the metaclass may be, step 1 has destroyed any predictability of the fields' order.
Therefore, your desired use of positional parameters, in the snippet:
Task('Illustrate different syntax modes', True)
cannot associate the arguments' values with the model's various fields. (Trying to guess by type association -- hoping no two fields ever have the same type -- would be even more horribly unpythonic than your expressed desire to use db.tasklist and db['tasklist'] indifferently and interchangeably).
One of the backwards-incompatible changes in Python 3 was introduced specifically to deal with situations of this ilk. In Python 3, a custom metaclass can define a __prepare__ function which runs before "step 1" in the above simplified list, and this lets it have more control about the class's body. Specifically, quoting PEP 3115...:
__prepare__ returns a dictionary-like object which is used to store the class member definitions during evaluation of the class body. In other words, the class body is evaluated as a function block (just like it is now), except that the local variables dictionary is replaced by the dictionary returned from __prepare__. This dictionary object can be a regular dictionary or a custom mapping type.
...
An example would be a metaclass that uses information about the ordering of member declarations to create a C struct. The metaclass would provide a custom dictionary that simply keeps a record of the order of insertions.
You don't want to "create a C struct" as in this example, but the order of fields is crucial (to allow the use of positional parameters that you want) and so the custom metaclass (obtained through base model) would have a __prepare__ classmethod returning an ordered dictionary. This removes the specific issue, but, of course, only if you're willing to switch all of your code using this "magic ORM" to Python 3. Would you be?
Once that's settled, the issue is, what database operations do you want to perform, and how. Your example, of course, does not clarify this at all. Is the taskList attribute name special, or should any other attribute assigned to the db object be "autosaved" (by name and, what other characteristic[s]?) and "autoretrieved" upon use? Are there to be ways to remove entities, alter them, locate them (otherwise than by having once been listed in the same attribute of the db object)? How does your sample code know what DB service to use and how to authenticate to it (e.g. by userid and password) if it requires authentication?
The specific tasks you list would not be hard to implement (e.g. on top of Google App Engine's storage service, which does not require authentication nor specification of "what DB service to use"). model's metaclass would introspect the class's fields and generate a GAE Model for the class, the db object would use __setattr__ to set an atexit trigger for storing the final value of an attribute (as an entity in a different kind of Model of course), and __getattr__ to fetch that attribute's info back from storage. Of course without some extra database functionality this all would be pretty useless;-).
Edit: so I did a little prototype (Python 2.6, and based on sqlite) and put it up on http://www.aleax.it/lustdb.zip -- it's a 3K zipfile including 225-lines lustdb.py (too long to post here) and two small test files roughly equivalent to the OP's originals: test0.py is...:
from lustdb import *

class Task(Model):
    title = ''
    done = False

db.taskList = []
db.taskList.append(Task(title='Beat old sql interfaces', done=False))
db.taskList.append(Task(title='Illustrate different syntax modes', done=True))
and test1.py is...:
from lustdb import *

print 'Done tasks:'
for task in db.taskList:
    if task.done:
        print task
Running test0.py (on a machine with a writable /tmp directory -- i.e., any Unix-y OS, or, on Windows, one on which a mkdir \tmp has been run at any previous time;-) has no output; after that, running test1.py outputs:
Done tasks:
Task(done=True, title=u'Illustrate different syntax modes')
Note that these are vastly less "crazily magical" than the OP's examples, in many ways, such as...:
1. no (expletive delete) redundancy whereby `db.taskList` is a synonym of `db['taskList']`, only the sensible former syntax (attribute-access) is supported
2. no mysterious (and totally crazy) way whereby a `done` attribute magically becomes `isDone` instead midway through the code
3. no mysterious (and utterly batty) way whereby a `print task` arbitrarily (or magically?) picks and prints just one of the attributes of the task
4. no weird gyrations and incantations to allow positional-attributes in lieu of named ones (this one the OP agreed to)
The prototype of course (as prototypes will;-) leaves a lot to be desired in many respects (clarity, documentation, unit tests, optimization, error checking and diagnosis, portability among different back-ends, and especially DB features beyond those implied in the question). The missing DB features are legion (for example, the OP's original examples give no way to identify a "primary key" for a model, or any other kinds of uniqueness constraints, so duplicates can abound; and it only gets worse from there;-). Nevertheless, for 225 lines (190 net of empty lines, comments and docstrings;-), it's not too bad in my biased opinion.
The proper way to continue playing with this project would of course be to initiate a new lustdb open source project on the hosting part of code.google.com (or any other good open source hosting site with issue tracker, wiki, code reviews support, online browsing, DVCS support, etc, etc) - I'd do it myself but I'm close to the limit in terms of number of open source projects I can initiate on code.google.com and don't want to "burn" the last one or two in this way;-).
BTW, the lustdb name for the module is a play of word with the OP's initials (first two letters each of first and last names), in the tradition of awk and friends -- I think it sounds nicely (and most other obvious names such as simpledb and dumbdb are taken;-).
I think you should try ZODB. It is an object-oriented database designed for storing Python objects. Its API is quite close to the example you have provided in your question; just take a look at the tutorial.
What about using Elixir?
Forget ORM! I like vanilla SQL. The Python wrappers like psycopg2 for PostgreSQL do automatic type conversion, offer pretty good protection against SQL injection, and are nice and simple.
sql = "SELECT * FROM table WHERE id=%s"
data = (5,)
cursor.execute(sql, data)
The more I think on't the more the Smalltalk model of operation seems more relevant. Indeed the OP may not have reached far enough by using the term "database" to describe a thing which should have no need for naming.
A running Python interpreter has a pile of objects that live in memory. Their inter-relationships can be arbitrarily complex, but namespaces and the "tags" that objects are bound to are very flexible. And as pickle can explicitly serialize arbitrary structures for persistence, it doesn't seem that much of a reach to consider each Python interpreter living in that object space. Why should that object space evaporate with the interpreter's close? Semantically, this could be viewed as an extension of the anydbm tied dictionaries. And since most every thing in Python is dictionary-like, the mechanism is almost already there.
I think this may be the generic model that Alex Martelli was proposing above; it might be nice to say something like:
class Book:
    def __init__(self, attributes):
        self.attributes = attributes
    def __getattr__(....): pass
$ python
>>> import book
>>> my_stuff.library = {'garp':
...     Book({'author': 'John Irving', 'title': 'The World According to Garp',
...           'isbn': '0-525-23770-4', 'location': 'kitchen table',
...           'bookmark': 'page 127'}),
...     ...
... }
>>> exit
[sometime next week]
$ python
>>> import my_stuff
>>> print my_stuff.library['garp'].location
'kitchen table'
# or even
>>> for book in my_stuff.library where book.location.contains('kitchen'):
...     print book.title
I don't know that you'd call the resultant language Python, but it seems like it is not that hard to implement and makes backing store equivalent to active store.
There is a natural tension between the inherent structure imposed - and sometimes desired - by RDBMSs and the rather free-form navel-gazing offered here, but NoSQL-y databases are already approaching the content-addressable memory model and probably better approximate how our minds keep track of things. Contrariwise, you wouldn't want to keep all the corporate purchase orders in such a storage system, but perhaps you might.
How about you give an example of how "simple" you want your "dealing with database" to be, and I then tell you all the stuff that is needed for that "simplicity" to get working?
(And of which it will still be YOU that will be required to give the information/config to the database interface engine, somewhere, somehow.)
To name but one example :
If your database management engine is some external machine with which you/your app interfaces over IP or some such, there is no way around the fact that the IP identity of where that database engine is running, will have to be provided by your app's database interface client, somewhere, somehow. Regardless of whether that gets explicitly exposed in the code or not.
I've been busy, here it is, released under LGPL:
http://github.com/lukestanley/lustdb
It uses JSON as its backend at the moment.
This is not the same codebase Alex Martelli did. I wanted to make the code more readable and reusable with different backends and such.
Elsewhere I have been working on object-oriented HTML elements accessible in Python in similar ways, AND a library for making web.py more minimalist. I'm thinking of ways of using all 3 elements together with automatic MVC prototype construction or smart mapping.
While old-fashioned text-based template web programming will be around for a while still, because of legacy systems and because it doesn't require any particular library or implementation, I feel we'll soon have a lot more efficient ways of building robust, prototype-friendly web apps.
Please see the mailing list for those interested.
If you like CherryPy, you might like the complementary ORMs I wrote: GeniuSQL (which follows a Table Data gateway model) and Dejavu (which is a complete Data Mapper).
There's far too much in this question and all its subcomments to address completely, but one thing I wanted to point out was that GeniuSQL and Dejavu have a very robust system for mapping native Python types to the types that your particular backend is using. There are very sane defaults, which can be overridden as needed, and even extended if you make a new backend or use types from a backend that isn't yet supported. See http://www.aminus.net/geniusql/chrome/common/doc/trunk/advanced.html#custom for more discussion on that.

SSRS Code Shared Variables and Simultaneous Report Execution

We have some SSRS reports that are failing when two of them are executed very close together.
I've found out that if two instances of an SSRS report run at the same time, any Code variables declared at the class level (not inside a function) can collide. I suspect this may be the cause of our report failures and I'm working up a potential fix.
The reason we're using the Code portion of SSRS at all is for things like custom group and page header calculation. The code is called from expressions in TextBoxes and returns what the current label should be. The code needs to maintain state to remember what the last header value was in order return it when unknown or to store the new header value for reuse.
Note: here are my resources for the variable collision problem:
The MSDN SSRS Forum:

Because this uses static variables, if two people run the report at the exact same moment, there's a slim chance one will smash the other's variable state (in SQL 2000, this could occasionally happen due to two users paginating through the same report at the same time, not just due to exactly simultaneous executions). If you need to be 100% certain to avoid this, you can make each of the shared variables a hash table based on user ID (Globals!UserID).
Embedded Code in Reporting Services:

... if multiple users are executing the report with this code at the same time, both reports will be changing the same Count field (that is why it is a shared field). You don't want to debug these sorts of interactions – stick to shared functions using only local variables (variables passed ByVal or declared in the function body).
I guess the idea is that on the report generation server, the report is loaded and the Code module is a static class. If a second client asks for the same report quickly enough, it connects to the same instance of that static class. (You're welcome to correct my description if I'm getting this wrong.)
So, I was proceeding with the idea of using a hash table to keep things isolated. I was planning on the hash key being an internal report parameter called InstanceID with default =Guid.NewGuid().ToString().
Part way through my research into this, though, I found that it is even more complicated because Hashtables aren't thread-safe, according to Maintaining State in Reporting Services.
That writer has code similar to what I was developing, only the whole thread-safe thing is completely outside my experience. It's going to take me hours to research all this and put together sensible code that I can be confident of and that performs well.
So before I go too much farther, I'm wondering if anyone else has already been down this path and could give me some advice. Here's the code I have so far:
Private Shared Data As New System.Collections.Hashtable()

Public Shared Function Initialize() As String
    If Not Data.ContainsKey(Parameters!InstanceID.Value) Then
        Data.Add(Parameters!InstanceID.Value, New System.Collections.Hashtable())
    End If
    LetValue("SomethingCount", 0)
    Return ""
End Function

Private Shared Function GetValue(ByVal Name As String) As Object
    Return Data.Item(Parameters!InstanceID.Value).Item(Name)
End Function

Private Shared Sub LetValue(ByVal Name As String, ByVal Value As Object)
    Dim V As System.Collections.Hashtable = Data.Item(Parameters!InstanceID.Value)
    If Not V.ContainsKey(Name) Then
        V.Add(Name, Value)
    Else
        V.Item(Name) = Value
    End If
End Sub

Public Shared Function SomethingCount() As Long
    SomethingCount = GetValue("SomethingCount") + 1
    LetValue("SomethingCount", SomethingCount)
End Function
My biggest concern here is thread safety. I might be able to figure out the rest of the questions below, but I am not experienced with this and I know it is an area that it is EASY to go wrong in. The link above uses the method Dim _sht as System.Collections.Hashtable = System.Collections.Hashtable.Synchronized(_hashtable). Is that best? What about Mutex? Semaphore? I have no experience in this.
I think the namespace System.Collections for Hashtable is correct, but I'm having trouble adding System.Collections as a reference in my report to try to cure my current error of "Could not load file or assembly 'System.Collections'". When I browse to add the reference, it's not an available component to select.
I just confirmed that I can call code from a parameter's default value expression, so I'll put my Initialize code there. I also just found out about the OnInit procedure, but this has its own gotchas to research and work around: the Parameters collection may not be referenced from the OnInit method during parameter initialization.
I'm unsure about declaring the Data variable As New; perhaps it should only be instantiated in the initializer if not already done (but I worry about race conditions because of the delay between the check that it's empty and the instantiation of it).
I also have a question about the Shared keyword. Is it necessary in all cases? I get errors if I leave it off function declarations, but it appears to work when I leave it off the variable declaration. Testing multiple simultaneous report executions is difficult... Could someone explain what Shared means specifically in the context of SSRS Code?
Is there a better way to initialize variables? Should I provide a second parameter to the GetValue function which is the default value to use if it finds that the variable doesn't exist in the hashtable yet?
Is it better to have nested Hashtables as I chose in my implementation, or to concatenate my InstanceID with the variable name to have a flat hashtable?
I'd really appreciate guidance, ideas and/or critiques on any aspect of what I've presented here.
Thank you!
Erik
Your code looks fine. For thread safety only the root (shared) hashtable Data needs to be synchronised. If you want to avoid using your InstanceID you could use Globals.ExecutionTime and User.UserID concatenated.
Basically I think you just want to change to initialize like this:
Private Shared Data As System.Collections.Hashtable

If Data Is Nothing Then
    Data = System.Collections.Hashtable.Synchronized(New System.Collections.Hashtable())
End If
The contained hashtables should only be used by one thread at a time anyway, but if in doubt, you could synchronize them too.

How to unit test an object with database queries

I've heard that unit testing is "totally awesome", "really cool" and "all manner of good things" but 70% or more of my files involve database access (some read and some write) and I'm not sure how to write a unit test for these files.
I'm using PHP and Python but I think it's a question that applies to most/all languages that use database access.
I would suggest mocking out your calls to the database. Mocks are basically objects that look like the object you are trying to call a method on, in the sense that they have the same properties, methods, etc. available to the caller. But instead of performing whatever action the real object would perform when a particular method is called, the mock skips that altogether and just returns a result. That result is typically defined by you ahead of time.
In order to set up your objects for mocking, you probably need to use some sort of inversion of control / dependency injection pattern, as in the following pseudo-code:
class Bar
{
    private FooDataProvider _dataProvider;

    public instantiate(FooDataProvider dataProvider) {
        _dataProvider = dataProvider;
    }

    public getAllFoos() {
        // instead of calling Foo.GetAll() here, we are introducing an extra layer of abstraction
        return _dataProvider.GetAllFoos();
    }
}

class FooDataProvider
{
    public Foo[] GetAllFoos() {
        return Foo.GetAll();
    }
}
Now in your unit test, you create a mock of FooDataProvider, which allows you to call the method GetAllFoos without having to actually hit the database.
class BarTests
{
    public TestGetAllFoos() {
        // here we set up our mock FooDataProvider
        mockRepository = MockingFramework.new()
        mockFooDataProvider = mockRepository.CreateMockOfType(FooDataProvider);

        // create a new array of Foo objects
        testFooArray = new Foo[] {Foo.new(), Foo.new(), Foo.new()}

        // the next statement will cause testFooArray to be returned every time we call FooDataProvider.GetAllFoos,
        // instead of calling to the database and returning whatever is in there
        // ExpectCallTo and Returns are methods provided by our imaginary mocking framework
        ExpectCallTo(mockFooDataProvider.GetAllFoos).Returns(testFooArray)

        // now begins our actual unit test
        testBar = new Bar(mockFooDataProvider)
        baz = testBar.GetAllFoos()

        // baz should now equal the testFooArray object we created earlier
        Assert.AreEqual(3, baz.length)
    }
}
A common mocking scenario, in a nutshell. Of course you will still probably want to unit test your actual database calls too, for which you will need to hit the database.
Ideally, your objects should be persistence ignorant. For instance, you should have a "data access layer" that you make requests to and that returns objects. This way, you can leave that part out of your unit tests, or test it in isolation.
If your objects are tightly coupled to your data layer, it is difficult to do proper unit testing. The first part of "unit test" is "unit". All units should be testable in isolation.
In my C# projects, I use NHibernate with a completely separate Data layer. My objects live in the core domain model and are accessed from my application layer. The application layer talks to both the data layer and the domain model layer.
The application layer is also sometimes called the "Business Layer".
If you are using PHP, create a specific set of classes ONLY for data access. Make sure your objects have no idea how they are persisted and wire up the two in your application classes.
Another option would be to use mocking/stubs.
The easiest way to unit test an object with database access is using transaction scopes.
For example:
[Test]
[ExpectedException(typeof(NotFoundException))]
public void DeleteAttendee() {
    using(TransactionScope scope = new TransactionScope()) {
        Attendee anAttendee = Attendee.Get(3);
        anAttendee.Delete();
        anAttendee.Save();

        //Try reloading. Instance should have been deleted.
        Attendee deletedAttendee = Attendee.Get(3);
    }
}
This will revert the state of the database, basically like a transaction rollback, so you can run the test as many times as you want without any side effects. We've used this approach successfully in large projects. Our build does take a little long to run (15 minutes), but that is not horrible for 1800 unit tests. Also, if build time is a concern, you can change the build process to have multiple builds: one for building src, another that fires up afterwards and handles unit tests, code analysis, packaging, etc...
I can perhaps give you a taste of our experience when we began looking at unit testing our middle-tier process that included a ton of "business logic" sql operations.
We first created an abstraction layer that allowed us to "slot in" any reasonable database connection (in our case, we simply supported a single ODBC-type connection).
Once this was in place, we were then able to do something like this in our code (we work in C++, but I'm sure you get the idea):
GetDatabase().ExecuteSQL( "INSERT INTO foo ( blah, blah )" )
At normal run time, GetDatabase() would return an object that fed all our sql (including queries), via ODBC directly to the database.
We then started looking at in-memory databases - the best by a long way seems to be SQLite (http://www.sqlite.org/index.html). It's remarkably simple to set up and use, and allowed us to subclass and override GetDatabase() to forward sql to an in-memory database that was created and destroyed for every test performed.
We're still in the early stages of this, but it's looking good so far. We do have to make sure we create any tables that are required and populate them with test data, but we've reduced the workload somewhat by creating a generic set of helper functions that can do a lot of all this for us.
Overall, it has helped immensely with our TDD process, since making what seem like quite innocuous changes to fix certain bugs can have quite strange effects on other (difficult to detect) areas of your system - due to the very nature of sql/databases.
Obviously, our experiences have centred around a C++ development environment, however I'm sure you could perhaps get something similar working under PHP/Python.
Hope this helps.
You should mock the database access if you want to unit test your classes. After all, you don't want to test the database in a unit test. That would be an integration test.
Abstract the calls away and then insert a mock that just returns the expected data. If your classes don't do more than executing queries, it may not even be worth testing them, though...
The book xUnit Test Patterns describes some ways to handle unit-testing code that hits a database. I agree with the other people who are saying that you don't want to do this because it's slow, but you gotta do it sometime, IMO. Mocking out the db connection to test higher-level stuff is a good idea, but check out this book for suggestions about things you can do to interact with the actual database.
I usually try to break up my tests between testing the objects (and ORM, if any) and testing the db. I test the object-side of things by mocking the data access calls whereas I test the db side of things by testing the object interactions with the db which is, in my experience, usually fairly limited.
I used to get frustrated with writing unit tests until I started mocking the data access portion so I didn't have to create a test db or generate test data on the fly. By mocking the data you can generate it all at run time and be sure that your objects work properly with known inputs.
Options you have:
Write a script that will wipe out the database before you start the unit tests, then populate the db with a predefined set of data and run the tests. You can also do that before every test – it'll be slow, but less error prone.
Inject the database. (Example in pseudo-Java, but applies to all OO-languages)
class Database {
    public Result query(String query) { ... real db here ... }
}

class MockDatabase extends Database {
    public Result query(String query) {
        return "mock result";
    }
}

class ObjectThatUsesDB {
    public ObjectThatUsesDB(Database db) {
        this.database = db;
    }
}
Now in production you use the normal database, and for all tests you just inject the mock database, which you can create ad hoc.
Do not use the DB at all throughout most of the code (that's a bad practice anyway). Create a "database" object that, instead of returning raw results, will return normal objects (i.e. will return User instead of a tuple {name: "marcin", password: "blah"}). Write all your tests with ad hoc constructed real objects, and write one big test that depends on a database and makes sure this conversion works OK.
Of course these approaches are not mutually exclusive and you can mix and match them as you need.
Unit testing your database access is easy enough if your project has high cohesion and loose coupling throughout. This way you can test only the things that each particular class does without having to test everything at once.
For example, if you unit test your user interface class the tests you write should only try to verify the logic inside the UI worked as expected, not the business logic or database action behind that function.
If you want to unit test the actual database access you will actually end up with more of an integration test, because you will be dependent on the network stack and your database server, but you can verify that your SQL code does what you asked it to do.
The hidden power of unit testing for me personally has been that it forces me to design my applications in a much better way than I might without them. This is because it really helped me break away from the "this function should do everything" mentality.
Sorry I don't have any specific code examples for PHP/Python, but if you want to see a .NET example I have a post that describes a technique I used to do this very same testing.
I agree with the first post - database access should be stripped away into a DAO layer that implements an interface. Then, you can test your logic against a stub implementation of the DAO layer.
You could use mocking frameworks to abstract out the database engine. I don't know if PHP/Python have any, but for typed languages (C#, Java, etc.) there are plenty of choices.
It also depends on how you designed your database access code, because some designs are easier to unit test than others, as the earlier posts have mentioned.
I've never done this in PHP and I've never used Python, but what you want to do is mock out the calls to the database. To do that, you can implement some IoC, whether via a 3rd-party tool or by managing it yourself; then you can implement a mock version of the database caller, which is where you will control the outcome of that fake call.
A simple form of IoC can be performed just by coding to interfaces. This requires some kind of object orientation going on in your code, so it may not apply to what you are doing (I say that since all I have to go on is your mention of PHP and Python).
Hope that's helpful, if nothing else you've got some terms to search on now.
Setting up test data for unit tests can be a challenge.
When it comes to Java, if you use Spring APIs for unit testing, you can control the transactions on a unit level. In other words, you can execute unit tests which involves database updates/inserts/deletes and rollback the changes. At the end of the execution you leave everything in the database as it was before you started the execution. To me, it is as good as it can get.
