db transaction in golang

In Java, it's pretty easy to switch between auto-commit and manual commit of a database transaction. When I say easy, I mean it doesn't require changing the connection interface: simply calling setAutoCommit with true or false switches the transaction between auto and manual mode. However, Go uses different interfaces, sql.DB for auto mode and sql.Tx for manual mode. That's not a problem for one-off use. The problem is that I have a framework which uses sql.DB to do the db work, and now I want some of that work to join my new transaction, which doesn't seem easy without modifying the existing framework to accept sql.Tx. I am wondering if there really is no easy way to do the auto/manual switch in Go?

Without knowing more about which framework you are using, I don't think there is a way to do it without modifying the framework. You should really try to get your modifications included upstream, because the main problem here is that the framework is poorly designed for Go. When writing for a new language (especially a library or framework) you should get to know its conventions and design your software accordingly.
In Go, it is not too hard to accomplish this; you just have to declare a Queryer interface (or whatever you want to call it) like this:
type Queryer interface {
    Query(query string, args ...interface{}) (*sql.Rows, error)
    QueryRow(query string, args ...interface{}) *sql.Row
    Prepare(query string) (*sql.Stmt, error)
    Exec(query string, args ...interface{}) (sql.Result, error)
}
This interface is implemented implicitly by both *sql.DB and *sql.Tx, so after declaring it you just have to modify your functions/methods to accept a Queryer instead of a *sql.DB.

Related

SQLDelight v1.4 not generating interface

Since v1.4, SQLDelight generates data classes only.
Before, the tool generated an interface and a default implementation of that interface.
That made it easy to compose objects with associated projections.
Is there any chance of getting these interfaces back?
see this answer: https://github.com/cashapp/sqldelight/pull/1698#issuecomment-646306522
Essentially, to keep backwards compatibility, copy and paste the old interfaces yourself. We made this change because that really should have been the original implementation; we only needed interfaces to support AutoValue, and that's no longer the case, so if there are still situations where you need interfaces they should probably be user code.

Global property for DB access rather than passing DB around everywhere? Advice anyone?

Globals are evil, right? At least everything I read says so, because something might alter the state of the global at any point.
However, I have a DB object that's a bit of a tramp with regard to class parameters. The property below is an instance of a wrapper class that automatically works in MS Access or SQL - hence why it's not EF or some other ORM.
Public Property db As New DBI.DBI(DBI.DBI.modeenum.access, String.Format("Provider=Microsoft.ACE.OLEDB.12.0;Data Source={0} ;Persist Security Info=True;Jet OLEDB:Database Password=""lkjhgfds8928""", GetRpcd("c:\cms")))
The code itself does have PostSharp for exception handling, so I'm thinking that I can conditionally handle OleDb errors by logging them and re-initialising the DB if it is Null.
Up till now, the solution has been to continually pass the db around as a parameter to every single class that needs it. Most of the data classes have a shared ObservableCollection that is built from structures which individually implement INotifyPropertyChanged. One of these is built asynchronously; the collection property checks whether it's empty before firing off the private Async buildCollection sub.
Given that we don't use dependency injection (yet), as I need to learn it: is the global property all that bad? The db is needed everywhere that data is pulled in or saved. The only places I don't need it at all are the View and its code-behind.
It's not a customer-facing project, but it does need to be solid.
Any advice gratefully received!
Passing the DB connection as a parameter into your classes IS using dependency injection, perhaps you just didn't recognize it as such. Hard coding the connection string in the callers is still code that is not free of dependencies, but at least your database accessors themselves are free of the dependency upon a global connection.
Globals aren't just evil because they change without notice - that's just one effect you see resulting from the bad design choice. They're evil because a design using them is brittle. Code that depends upon globals requires invisible stuff to be set correctly before calling it, and that leads to inter-dependencies between unrelated code. The invisible stuff becomes critically important stuff. Reading just the interface of a module that internally uses globals, how would I know that I have to call the SetupGlobalThing() method before calling it? What happens if I call IncrementGlobalThing() and DecrementGlobalThing() and MultiplyGlobalThing() in varying orders, depending on the function the user selects?
Instead, prefer stateless methods where you pass in all the stuff to be changed and used: IncrementThing(Integer thing) doesn't rely on hidden setup steps. It clearly does one thing: it increments the thing passed in.
It may help to think about it from a unit testing viewpoint. If you were to write a unit test to prove a specific module of code works, would you need to pass in a real database connection (hard*), or would you be able to pass in a fake database reference that meets your testing needs easily?
The best way to test your logic is to unit test it. The best way to test your class interfaces and method structure is to write unit tests that call them. If the class is hard to test, it's likely due to dependencies upon external things (globals, singletons, databases, inappropriate member variables, etc.)
The reason I called using a real database "hard" is that a unit test needs to be easy and fast to run. It shouldn't rely on slow or breakable or complex external things. Think about unit testing your software on the bus, with no network connection. Think about how much work it is to create a dummy database: you have to add users, you have to have the right version of schema in it, it has to be installed, it has to be filled with the right kind of testing data, you need network connectivity to it, all those things can make your testing unreliable. Instead, in a unit test you pass in a mock database, which simply returns values that exercise your code being tested.

Django Models / SQLAlchemy are bloated! Any truly Pythonic DB models out there?

"Make things as simple as possible, but no simpler."
Can we find the solution/s that fix the Python database world?
Update: A 'lustdb' prototype has been written by Alex Martelli - if you know any somewhat lightweight, high-level database libraries with multiple backends we could wrap in syntax sugar honey, please weigh in!
from someAmazingDB import *
# we imported a smart model class and db object which talk to database adapter/s

class Task(model):
    title = ''
    done = False  # native types, not a custom object we have to think about!

db.taskList = []
# or
db.taskList = expandableTypeCollection(Task)  # not sure what this syntax would be

db['taskList'].append(Task(title='Beat old sql interfaces', done=False))
db.taskList.append(Task('Illustrate different syntax modes', True))  # ok maybe we should just use kwargs

# at this point it should be autosaved to a default db option
# by default we should be able to reload the console and access the default db:

>>> from someAmazingDB import *
>>> print 'Done tasks:'
>>> for task in db.taskList:
...     if task.done:
...         print task.title
'Illustrate different syntax modes'
I'm a fan of Python, web.py and CherryPy, and KISS in general.
We're talking automatic Python to SQL type translation or NoSQL.
We don't have to totally be SQL compatible! Just a scalable subset or ignore it!
Re:model changes, it's ok to ask the developer when they try to change it or have a set of sensible defaults.
Here is the challenge: The above code should work with very little modification or thinking required. Why must we put up with compromise when we know better?
It's 2010, we should be able to code scalable, simple databases in our sleep.
If you think this is important, please upvote!
What you request cannot be done in Python 2.whatever, for a very specific reason. You want to write:
class Task(model):
    title = ''
    isDone = False
In Python 2.anything, whatever model may possibly be, this cannot ever allow you to predict any "ordering" for the two fields, because the semantics of a class statement are:
1. execute the body, thus preparing a dict
2. locate the metaclass and run special methods thereof
Whatever the metaclass may be, step 1 has destroyed any predictability of the fields' order.
Therefore, your desired use of positional parameters, in the snippet:
Task('Illustrate different syntax modes', True)
cannot associate the arguments' values with the model's various fields. (Trying to guess by type association -- hoping no two fields ever have the same type -- would be even more horribly unpythonic than your expressed desire to use db.tasklist and db['tasklist'] indifferently and interchangeably).
One of the backwards-incompatible changes in Python 3 was introduced specifically to deal with situations of this ilk. In Python 3, a custom metaclass can define a __prepare__ function which runs before "step 1" in the above simplified list, and this lets it have more control about the class's body. Specifically, quoting PEP 3115...:
__prepare__ returns a dictionary-like object which is used to store
the class member definitions during evaluation of the class body.
In other words, the class body is evaluated as a function block
(just like it is now), except that the local variables dictionary
is replaced by the dictionary returned from __prepare__. This
dictionary object can be a regular dictionary or a custom mapping
type.
...
An example would be a metaclass that
uses information about the
ordering of member declarations to create a C struct. The metaclass
would provide a custom dictionary that simply keeps a record of the
order of insertions.
You don't want to "create a C struct" as in this example, but the order of fields is crucial (to allow the use of positional parameters that you want) and so the custom metaclass (obtained through base model) would have a __prepare__ classmethod returning an ordered dictionary. This removes the specific issue, but, of course, only if you're willing to switch all of your code using this "magic ORM" to Python 3. Would you be?
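To make that concrete, here is a rough, hypothetical sketch of such a metaclass in Python 3 (the ModelMeta and _field_order names are made up for illustration):

    from collections import OrderedDict

    class ModelMeta(type):
        @classmethod
        def __prepare__(mcs, name, bases, **kwargs):
            # the class body is evaluated into this mapping, so the order
            # in which the fields are written down is preserved
            return OrderedDict()

        def __new__(mcs, name, bases, namespace, **kwargs):
            cls = super().__new__(mcs, name, bases, dict(namespace))
            # remember the declared order of the plain data fields
            cls._field_order = [k for k, v in namespace.items()
                                if not k.startswith('__') and not callable(v)]
            return cls

    class model(metaclass=ModelMeta):
        pass

    class Task(model):
        title = ''
        done = False

    print(Task._field_order)  # ['title', 'done'] -- the order is now predictable

With the field order recorded, a positional call such as Task('Beat old sql interfaces', False) could then be matched to the fields by position.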
Once that's settled, the issue is, what database operations do you want to perform, and how. Your example, of course, does not clarify this at all. Is the taskList attribute name special, or should any other attribute assigned to the db object be "autosaved" (by name and, what other characteristic[s]?) and "autoretrieved" upon use? Are there to be ways to remove entities, alter them, locate them (otherwise than by having once been listed in the same attribute of the db object)? How does your sample code know what DB service to use and how to authenticate to it (e.g. by userid and password) if it requires authentication?
The specific tasks you list would not be hard to implement (e.g. on top of Google App Engine's storage service, which does not require authentication nor specification of "what DB service to use"). model's metaclass would introspect the class's fields and generate a GAE Model for the class, the db object would use __setattr__ to set an atexit trigger for storing the final value of an attribute (as an entity in a different kind of Model of course), and __getattr__ to fetch that attribute's info back from storage. Of course without some extra database functionality this all would be pretty useless;-).
Edit: so I did a little prototype (Python 2.6, and based on sqlite) and put it up on http://www.aleax.it/lustdb.zip -- it's a 3K zipfile including 225-lines lustdb.py (too long to post here) and two small test files roughly equivalent to the OP's originals: test0.py is...:
from lustdb import *

class Task(Model):
    title = ''
    done = False

db.taskList = []
db.taskList.append(Task(title='Beat old sql interfaces', done=False))
db.taskList.append(Task(title='Illustrate different syntax modes', done=True))
and test1.py is...:
from lustdb import *

print 'Done tasks:'
for task in db.taskList:
    if task.done:
        print task
Running test0.py (on a machine with a writable /tmp directory -- i.e., any Unix-y OS, or, on Windows, one on which a mkdir \tmp has been run at any previous time;-) has no output; after that, running test1.py outputs:
Done tasks:
Task(done=True, title=u'Illustrate different syntax modes')
Note that these are vastly less "crazily magical" than the OP's examples, in many ways, such as...:
1. no (expletive delete) redundancy whereby `db.taskList` is a synonym of `db['taskList']`, only the sensible former syntax (attribute-access) is supported
2. no mysterious (and totally crazy) way whereby a `done` attribute magically becomes `isDone` instead midway through the code
3. no mysterious (and utterly batty) way whereby a `print task` arbitrarily (or magically?) picks and prints just one of the attributes of the task
4. no weird gyrations and incantations to allow positional-attributes in lieu of named ones (this one the OP agreed to)
The prototype of course (as prototypes will;-) leaves a lot to be desired in many respects (clarity, documentation, unit tests, optimization, error checking and diagnosis, portability among different back-ends, and especially DB features beyond those implied in the question). The missing DB features are legion (for example, the OP's original examples give no way to identify a "primary key" for a model, or any other kinds of uniqueness constraints, so duplicates can abound; and it only gets worse from there;-). Nevertheless, for 225 lines (190 net of empty lines, comments and docstrings;-), it's not too bad in my biased opinion.
The proper way to continue playing with this project would of course be to initiate a new lustdb open source project on the hosting part of code.google.com (or any other good open source hosting site with issue tracker, wiki, code reviews support, online browsing, DVCS support, etc, etc) - I'd do it myself but I'm close to the limit in terms of number of open source projects I can initiate on code.google.com and don't want to "burn" the last one or two in this way;-).
BTW, the lustdb name for the module is a play on words with the OP's initials (first two letters each of first and last names), in the tradition of awk and friends -- I think it sounds nice (and most other obvious names such as simpledb and dumbdb are taken;-).
I think you should try ZODB. It is an object-oriented database designed for storing Python objects. Its API is quite close to the example you provided in your question; just take a look at the tutorial.
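For a flavour of how close it is, here is a small sketch written from memory (untested, so treat the exact imports and calls as approximate):

    import persistent
    import transaction
    from ZODB import FileStorage, DB

    class Task(persistent.Persistent):
        def __init__(self, title='', done=False):
            self.title = title
            self.done = done

    # open (or create) a file-backed object database and store some tasks
    db = DB(FileStorage.FileStorage('tasks.fs'))
    root = db.open().root()
    root['taskList'] = [Task('Illustrate different syntax modes', done=True)]
    transaction.commit()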
What about using Elixir?
Forget ORM! I like vanilla SQL. The Python wrappers like psycopg2 for PostgreSQL do automatic type conversion, offer pretty good protection against SQL injection, and are nice and simple.
import psycopg2

conn = psycopg2.connect("dbname=mydb")  # connection parameters are illustrative
cursor = conn.cursor()
sql = "SELECT * FROM table WHERE id=%s"
data = (5,)
cursor.execute(sql, data)
The more I think on't, the more relevant the Smalltalk model of operation seems. Indeed the OP may not have reached far enough by using the term "database" to describe a thing which should have no need for naming.
A running Python interpreter has a pile of objects that live in memory. Their inter-relationships can be arbitrarily complex, but namespaces and the "tags" that objects are bound to are very flexible. And as pickle can explicitly serialize arbitrary structures for persistence, it doesn't seem that much of a reach to consider each Python interpreter living in that object space. Why should that object space evaporate with the interpreter's close? Semantically, this could be viewed as an extension of the anydbm tied dictionaries. And since most every thing in Python is dictionary-like, the mechanism is almost already there.
I think this may be the generic model that Alex Martelli was proposing above, it might be nice to say something like:
class Book:
    def __init__(self, attributes):
        self.attributes = attributes
    def __getattr__(....): pass

$ python
>>> import book
>>> my_stuff.library = {'garp':
...     Book({'author': 'John Irving', 'title': 'The World According to Garp',
...           'isbn': '0-525-23770-4', 'location': 'kitchen table',
...           'bookmark': 'page 127'}),
...     ...
... }
>>> exit

[sometime next week]

$ python
>>> import my_stuff
>>> print my_stuff.library['garp'].location
'kitchen table'
# or even
>>> for book in my_stuff.library where book.location.contains('kitchen'):
...     print book.title
I don't know that you'd call the resultant language Python, but it seems like it is not that hard to implement and makes backing store equivalent to active store.
There is a natural tension between the inherent structure imposed - and sometimes desired - by RDBMSs and the rather free-form navel-gazing put forward here, but NoSQLy databases are already approaching the content-addressable memory model and probably better approximate how our minds keep track of things. Contrariwise, you wouldn't want to keep all the corporate purchase orders in such a storage system, but perhaps you might.
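For what it's worth, the standard library already gets part of the way toward those "tied dictionaries": a quick sketch with shelve (the file path and keys are just illustrative):

    import shelve

    # a "tied" dictionary: assignments are pickled to disk transparently
    db = shelve.open('/tmp/my_stuff')
    db['library'] = {'garp': {'author': 'John Irving',
                              'title': 'The World According to Garp'}}
    db.close()

    # ...in a later interpreter session, the data is still there:
    db = shelve.open('/tmp/my_stuff')
    print(db['library']['garp']['title'])
    db.close()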
How about you give an example of how "simple" you want your "dealing with the database" to be, and then I'll tell you all the stuff that is needed for that "simplicity" to work?
(And it will still be YOU who is required to give the information/config to the database interface engine, somewhere, somehow.)
To name but one example :
If your database management engine is some external machine with which you/your app interfaces over IP or some such, there is no way around the fact that the IP identity of where that database engine is running will have to be provided by your app's database interface client, somewhere, somehow - regardless of whether that gets explicitly exposed in the code or not.
I've been busy, here it is, released under LGPL:
http://github.com/lukestanley/lustdb
It uses JSON as its backend at the moment.
This is not the same codebase Alex Martelli wrote; I wanted to make the code more readable and reusable, with different backends and such.
Elsewhere I have been working on object-oriented HTML elements accessible in Python in similar ways, AND a library for making web.py more minimalist.
I'm thinking of ways of using all 3 elements together with automatic MVC prototype construction or smart mapping.
While old-fashioned text-based template web programming will be around for a while still, because of legacy systems and because it doesn't require any particular library or implementation, I feel we'll soon have a lot more efficient ways of building robust, prototype-friendly web apps.
Please see the mailing list for those interested.
If you like CherryPy, you might like the complementary ORMs I wrote: GeniuSQL (which follows a Table Data gateway model) and Dejavu (which is a complete Data Mapper).
There's far too much in this question and all its subcomments to address completely, but one thing I wanted to point out was that GeniuSQL and Dejavu have a very robust system for mapping native Python types to the types that your particular backend is using. There are very sane defaults, which can be overridden as needed, and even extended if you make a new backend or use types from a backend that isn't yet supported. See http://www.aminus.net/geniusql/chrome/common/doc/trunk/advanced.html#custom for more discussion on that.

Is it bad practice to "go deep" with your application of callbacks?

Weird question, but I'm not sure if it's an anti-pattern or not.
Say I have a web app that will be rendering 1000 records to an html table.
The typical approach I've seen is to send a query down to the database, translate the records in some way into some abstract state (be it an array, an object, etc.) and place the translated records into a collection that is then iterated over in the view.
As the number of records grows, this approach uses up more and more memory.
Why not send, along with the query, a callback that performs an operation on each of the translated rows as they are read from the database? This would mean that you don't need to collect the data for further iteration in the view, so the memory footprint shrinks, and you're not iterating over the data twice.
There must be something implicitly wrong with this approach, because I rarely see it used anywhere. What's wrong with this approach?
Thanks.
Actually, this is exactly how a well-developed application should behave.
There is nothing wrong with this approach, except that not all database interfaces allow you to do this easily.
If we're talking about tabularizing 10 records for yet another social network, there is no need to mess with callbacks if you can get an array of hashes or whatever with a single call that is already implemented for you.
There must be something implicitly wrong with this approach, because I rarely see it used anywhere.
I use it. Frequently. Even when I wouldn't use too much memory by repeatedly copying the data, using a callback just seems cleaner. In languages with closures, it also lets you keep relevant code together while factoring out the messy DB stuff.
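In Python, for instance, a minimal sketch of the idea might look like the following (the table, column and helper names here are made up, and sqlite3 just stands in for whatever driver you actually use):

    import sqlite3

    def for_each_row(db, query, params, callback):
        # stream rows straight off the cursor; nothing is collected in memory
        cursor = db.execute(query, params)
        for row in cursor:
            callback(row)

    def render_row(row):
        print('<tr><td>%s</td><td>%s</td></tr>' % (row[0], row[1]))

    db = sqlite3.connect('app.db')
    for_each_row(db, 'SELECT id, name FROM records WHERE created > ?',
                 ('2010-01-01',), render_row)

The view never sees a full result collection; each row is handed to the callback and then forgotten.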
This is a "limited by your tools" class of problem: Most programming languages don't allow to say "Do something around this code". This was solved in recent years with the advent of closures. Think of a closure as a way to pass code into another method which is then executed in a context. For example, in GSQL, you can write:
def l = []
sql.execute("select id from table where time > ?", time) { row ->
    l << row[0]
}
This will open a connection to the database, create a statement and a result set, and then run l << row[0] for each row the DB returns. Note that the code runs inside sql.execute() but it can access both local variables (l) and variables defined by sql.execute() (row).
With this kind of code, you can even generate the result of a HTTP request on the fly without keeping much of the page in RAM at any time. In my case, I'd stream a 2MB document to the browser using only a few KB of RAM and the browser would then chew 83s to parse this.
This is roughly what the iterator pattern allows you to do. In many cases this breaks down on the interface between your application and the database. Technologies like LINQ even have solutions that can send back code to the database.
I've found it easier to use an interface resolver than deep callbacks where they're hooked up through several classes. MS has a much fancier version than mine called Unity. This provides a much cleaner way of accessing classes that should not be tightly coupled:
http://www.codeplex.com/unity

How to unit test an object with database queries

I've heard that unit testing is "totally awesome", "really cool" and "all manner of good things" but 70% or more of my files involve database access (some read and some write) and I'm not sure how to write a unit test for these files.
I'm using PHP and Python but I think it's a question that applies to most/all languages that use database access.
I would suggest mocking out your calls to the database. Mocks are basically objects that look like the object you are trying to call a method on, in the sense that they have the same properties, methods, etc. available to the caller. But instead of performing whatever action they are programmed to do when a particular method is called, they skip it altogether and just return a result. That result is typically defined by you ahead of time.
In order to set up your objects for mocking, you probably need to use some sort of inversion of control/ dependency injection pattern, as in the following pseudo-code:
class Bar
{
    private FooDataProvider _dataProvider;

    public instantiate(FooDataProvider dataProvider) {
        _dataProvider = dataProvider;
    }

    public getAllFoos() {
        // instead of calling Foo.GetAll() here, we are introducing an extra layer of abstraction
        return _dataProvider.GetAllFoos();
    }
}

class FooDataProvider
{
    public Foo[] GetAllFoos() {
        return Foo.GetAll();
    }
}
Now in your unit test, you create a mock of FooDataProvider, which allows you to call the method GetAllFoos without having to actually hit the database.
class BarTests
{
    public TestGetAllFoos() {
        // here we set up our mock FooDataProvider
        mockRepository = MockingFramework.new()
        mockFooDataProvider = mockRepository.CreateMockOfType(FooDataProvider);

        // create a new array of Foo objects
        testFooArray = new Foo[] {Foo.new(), Foo.new(), Foo.new()}

        // the next statement will cause testFooArray to be returned every time we call FooDataProvider.GetAllFoos,
        // instead of calling to the database and returning whatever is in there
        // ExpectCallTo and Returns are methods provided by our imaginary mocking framework
        ExpectCallTo(mockFooDataProvider.GetAllFoos).Returns(testFooArray)

        // now begins our actual unit test
        testBar = new Bar(mockFooDataProvider)
        baz = testBar.GetAllFoos()

        // baz should now equal the testFooArray object we created earlier
        Assert.AreEqual(3, baz.length)
    }
}
A common mocking scenario, in a nutshell. Of course you will still probably want to unit test your actual database calls too, for which you will need to hit the database.
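Since the question mentions Python, here is a rough equivalent of the pseudo-code above using the standard library's unittest.mock module (the class and method names are just illustrative):

    import unittest
    from unittest import mock

    class Bar:
        def __init__(self, data_provider):
            self._data_provider = data_provider

        def get_all_foos(self):
            # the extra layer of abstraction: Bar never talks to the database directly
            return self._data_provider.get_all_foos()

    class BarTests(unittest.TestCase):
        def test_get_all_foos(self):
            # a stand-in for the real data provider; it never hits the database
            provider = mock.Mock()
            provider.get_all_foos.return_value = ['foo1', 'foo2', 'foo3']

            bar = Bar(provider)
            self.assertEqual(3, len(bar.get_all_foos()))
            provider.get_all_foos.assert_called_once_with()

    if __name__ == '__main__':
        unittest.main()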
Ideally, your objects should be persistence ignorant. For instance, you should have a "data access layer" that you make requests to and that returns objects. This way, you can leave that part out of your unit tests, or test it in isolation.
If your objects are tightly coupled to your data layer, it is difficult to do proper unit testing. The first part of "unit test" is "unit". All units should be testable in isolation.
In my C# projects, I use NHibernate with a completely separate Data layer. My objects live in the core domain model and are accessed from my application layer. The application layer talks to both the data layer and the domain model layer.
The application layer is also sometimes called the "Business Layer".
If you are using PHP, create a specific set of classes ONLY for data access. Make sure your objects have no idea how they are persisted and wire up the two in your application classes.
Another option would be to use mocking/stubs.
The easiest way to unit test an object with database access is using transaction scopes.
For example:
[Test]
[ExpectedException(typeof(NotFoundException))]
public void DeleteAttendee() {
    using(TransactionScope scope = new TransactionScope()) {
        Attendee anAttendee = Attendee.Get(3);
        anAttendee.Delete();
        anAttendee.Save();

        //Try reloading. Instance should have been deleted.
        Attendee deletedAttendee = Attendee.Get(3);
    }
}
This will revert the state of the database, basically like a transaction rollback, so you can run the test as many times as you want without any side effects. We've used this approach successfully in large projects. Our build does take a little long to run (15 minutes), but it is not horrible for 1800 unit tests. Also, if build time is a concern, you can change the build process to have multiple builds: one for building src, another that fires up afterwards and handles unit tests, code analysis, packaging, etc.
I can perhaps give you a taste of our experience when we began looking at unit testing our middle-tier process that included a ton of "business logic" sql operations.
We first created an abstraction layer that allowed us to "slot in" any reasonable database connection (in our case, we simply supported a single ODBC-type connection).
Once this was in place, we were then able to do something like this in our code (we work in C++, but I'm sure you get the idea):
GetDatabase().ExecuteSQL( "INSERT INTO foo ( blah, blah )" )
At normal run time, GetDatabase() would return an object that fed all our sql (including queries), via ODBC directly to the database.
We then started looking at in-memory databases - the best by a long way seems to be SQLite (http://www.sqlite.org/index.html). It's remarkably simple to set up and use, and allowed us to subclass and override GetDatabase() to forward sql to an in-memory database that was created and destroyed for every test performed.
We're still in the early stages of this, but it's looking good so far. However, we do have to make sure we create any tables that are required and populate them with test data - we've reduced the workload somewhat here by creating a generic set of helper functions that can do a lot of this for us.
Overall, it has helped immensely with our TDD process, since making what seem like quite innocuous changes to fix certain bugs can have quite strange effects on other (difficult to detect) areas of your system - due to the very nature of sql/databases.
Obviously, our experiences have centred around a C++ development environment, however I'm sure you could perhaps get something similar working under PHP/Python.
Hope this helps.
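Since the question asks about PHP/Python: in Python the same in-memory trick is a one-liner with the standard sqlite3 module. A minimal sketch (the table and the get_database hook are just illustrative):

    import sqlite3
    import unittest

    def get_database():
        # in production this would return a connection to the real database;
        # in tests we hand back a throwaway in-memory SQLite database instead
        return sqlite3.connect(':memory:')

    class FooTests(unittest.TestCase):
        def setUp(self):
            self.db = get_database()
            self.db.execute('CREATE TABLE foo (blah TEXT)')
            self.db.execute("INSERT INTO foo (blah) VALUES ('test data')")

        def tearDown(self):
            self.db.close()  # the whole database disappears with the connection

        def test_select(self):
            rows = self.db.execute('SELECT blah FROM foo').fetchall()
            self.assertEqual([('test data',)], rows)

    if __name__ == '__main__':
        unittest.main()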
You should mock the database access if you want to unit test your classes. After all, you don't want to test the database in a unit test. That would be an integration test.
Abstract the calls away and then insert a mock that just returns the expected data. If your classes don't do more than executing queries, it may not even be worth testing them, though...
The book xUnit Test Patterns describes some ways to handle unit-testing code that hits a database. I agree with the other people who are saying that you don't want to do this because it's slow, but you gotta do it sometime, IMO. Mocking out the db connection to test higher-level stuff is a good idea, but check out this book for suggestions about things you can do to interact with the actual database.
I usually try to break up my tests between testing the objects (and ORM, if any) and testing the db. I test the object-side of things by mocking the data access calls whereas I test the db side of things by testing the object interactions with the db which is, in my experience, usually fairly limited.
I used to get frustrated with writing unit tests until I started mocking the data access portion, so I didn't have to create a test db or generate test data on the fly. By mocking the data you can generate it all at run time and be sure that your objects work properly with known inputs.
Options you have:
Write a script that will wipe out the database before you start unit tests, then populate the db with a predefined set of data and run the tests. You can also do that before every test - it'll be slow, but less error prone.
Inject the database. (Example in pseudo-Java, but applies to all OO-languages)
class Database {
    public Result query(String query) {... real db here ...}
}

class MockDatabase extends Database {
    public Result query(String query) {
        return "mock result";
    }
}

class ObjectThatUsesDB {
    public ObjectThatUsesDB(Database db) {
        this.database = db;
    }
}
Now in production you use the normal database, and for all tests you just inject the mock database that you can create ad hoc.
Do not use the DB at all throughout most of the code (that's bad practice anyway). Create a "database" object that, instead of returning raw results, returns normal objects (i.e. returns a User instead of a tuple like {name: "marcin", password: "blah"}). Write all your tests with ad-hoc constructed real objects, and write one big test that depends on a database and makes sure this conversion works OK.
Of course these approaches are not mutually exclusive and you can mix and match them as you need.
Unit testing your database access is easy enough if your project has high cohesion and loose coupling throughout. This way you can test only the things that each particular class does without having to test everything at once.
For example, if you unit test your user interface class the tests you write should only try to verify the logic inside the UI worked as expected, not the business logic or database action behind that function.
If you want to unit test the actual database access you will actually end up with more of an integration test, because you will be dependent on the network stack and your database server, but you can verify that your SQL code does what you asked it to do.
The hidden power of unit testing for me personally has been that it forces me to design my applications in a much better way than I might without them. This is because it really helped me break away from the "this function should do everything" mentality.
Sorry I don't have any specific code examples for PHP/Python, but if you want to see a .NET example I have a post that describes a technique I used to do this very same testing.
I agree with the first post - database access should be stripped away into a DAO layer that implements an interface. Then, you can test your logic against a stub implementation of the DAO layer.
You could use mocking frameworks to abstract out the database engine. I don't know whether PHP/Python have any, but for typed languages (C#, Java, etc.) there are plenty of choices.
It also depends on how you designed the database access code, because some designs are easier to unit test than others, as the earlier posts have mentioned.
I've never done this in PHP and I've never used Python, but what you want to do is mock out the calls to the database. To do that you can implement some IoC, whether with a 3rd-party tool or by managing it yourself; then you can implement a mock version of the database caller, which is where you will control the outcome of the fake call.
A simple form of IoC can be achieved just by coding to interfaces. This requires some kind of object orientation going on in your code, so it may not apply to what you're doing (I say that since all I have to go on is your mention of PHP and Python).
Hope that's helpful, if nothing else you've got some terms to search on now.
Setting up test data for unit tests can be a challenge.
When it comes to Java, if you use the Spring APIs for unit testing, you can control transactions at the unit level. In other words, you can execute unit tests which involve database updates/inserts/deletes and roll back the changes. At the end of the execution you leave everything in the database as it was before you started. To me, that is as good as it can get.
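The same rollback-per-test idea can be sketched in plain Python as well. This is a rough, hypothetical example with sqlite3; it assumes a test database that already contains an attendee table:

    import sqlite3
    import unittest

    class AttendeeTests(unittest.TestCase):
        def setUp(self):
            # isolation_level=None: we manage BEGIN/ROLLBACK ourselves
            self.db = sqlite3.connect('test.db', isolation_level=None)
            self.db.execute('BEGIN')             # open an explicit transaction

        def tearDown(self):
            self.db.execute('ROLLBACK')          # undo everything the test did
            self.db.close()

        def test_delete_attendee(self):
            self.db.execute('DELETE FROM attendee WHERE id = ?', (3,))
            rows = self.db.execute('SELECT * FROM attendee WHERE id = ?', (3,)).fetchall()
            self.assertEqual([], rows)           # deleted inside the transaction...
            # ...and the ROLLBACK in tearDown puts it back for the next test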
