Does anyone have an idea of how to handle multiple entity updates within the same transaction in Spring Data REST? The same thing can be handled in Spring controller methods using the @Transactional annotation. If I am correct, Spring Data REST executes every event handler in a separate transaction, so multiple entity updates cannot be handled properly.
I am having trouble updating two entities (ABC and PQR) within the same transaction and rolling back the ABC update when the PQR update fails.
// ABC repository
@RepositoryRestResource
public interface ABCEntityRepository extends MongoRepository<ABC, String> {
}
// PQR repository
@RepositoryRestResource
public interface PQREntityRepository extends MongoRepository<PQR, String> {
}
// ABC repository handler
@RepositoryEventHandler
public class ABCEventHandler {
    @Autowired
    private PQREntityRepository pqrEntityRepository;
    @HandleBeforeSave
    public void handleABCBeforeSave(ABC abc) {
        log.debug("before saving ABC...");
    }
    @HandleAfterSave
    public void handleABCAfterSave(ABC abc) {
        List<PQR> pqrList = pqrEntityRepository.findById(abc.getPqrId());
        if (pqrList != null && !pqrList.isEmpty()) {
            pqrList.forEach(pqr -> {
                // update PQR objects
            });
        }
        // expect to fail this transaction
        pqrEntityRepository.saveAll(pqrList);
    }
}
Since the @HandleAfterSave method is executed in a separate transaction, by the time it is called the ABC update has already been completed and cannot be rolled back. Any suggestions on how to handle this?
Spring Data REST does not think in entities, it thinks in aggregates. Aggregate is a term coming from Domain-Driven Design that describes a group of entities for which certain business rules apply. Take, for example, an order alongside its line items and a business rule that defines a minimum order value that needs to be reached.
The responsibility to govern constraints aligns with another aspect that involves aggregates in DDD which is that strong consistency should/can only be assumed for changes on an aggregate itself. Changes to multiple (different) aggregates should be expected to be eventually consistent. If you transfer that into technology, it's advisable to apply the means of strong consistency – read: transactions – to single aggregates only.
So there is no short answer to your question. The repository structure you show here virtually turns both ABCEntity and PQREntity into aggregates (as repositories only exist for aggregate roots). That means, OOTB Spring Data REST does not support updating them in a single transactional HTTP call.
That said, Spring Data REST allows the declaration of custom resources that can take responsibility of doing that. Similarly to what is shown here, you can simply add resources on additional routes to completely implement what you imagine yourself.
Spring Data REST is not designed to produce a full HTTP API out of the box. It's designed to implement certain REST API patterns that are commonly found in HTTP APIs and will very likely be part of your API. It's built so that you don't have to spend time thinking about the straightforward cases and only have to plug in custom code for scenarios like the one you described, assuming what you plan to do here is a good idea in the first place. Very often requests like these result in the conclusion that the aggregate design needs a bit of rework.
PS: I saw you tagged that question with spring-data-mongodb. By default, Spring Data REST does not support MongoDB transactions because it doesn't need them. MongoDB document boundaries usually align with aggregate boundaries and updates to a single document are atomic within MongoDB anyway.
I'm not sure I understood your question correctly, but I'll give it a try.
I'd suggest having a service with both repositories autowired in, and a method annotated with @Transactional that updates everything you want.
This way, if the transaction fails anywhere inside the method, it will all roll back.
If this does not answer your question, please clarify and I'll try to help.
In his book on DbContext, @RowanMiller shows how to use the DbSet.Local property to avoid (1) unnecessary round trips to the database and (2) passing around collections (created with e.g. ToList()) in the application (page 24). I tried to follow this approach. However, I noticed that from one using {} block to the next, the DbSet.Local property becomes empty:
ObservableCollection<Destination> destinationsList;
using (var context = new BAContext())
{
    var query = from d in context.Destinations …;
    query.Load();
    destinationsList = context.Destinations.Local; //Nonzero here.
}
//Do stuff with destinationsList
using (var context = new BAContext())
{
    //context.Destinations.Local zero here again;
    //So no way of getting the in-memory data from the previous using block here?
    //Do I have to do another round trip to the database here to get the same data I wanted
    //to cache locally???
}
Then what is the point on page 24? How can I avoid passing my collections around if DbSet.Local is only usable inside the using {} block? Furthermore, how can I benefit from change tracking if I use short-lived context instances that do not hand over any cached data to each other under the hood? So, if contexts should be short-lived to free resources such as connections, do I have to give up caching? That is, can’t I have both at the same time (short-lived connections but a long-lived cache)? Would my only option then be to store the query results in my own variables, which is exactly what the motivation on page 24 discourages?
I am developing a WPF application which maybe will also become multi-tiered in the future, involving WCF. I know Julia has an example of this in her book, but I currently don’t have access to it. I found several others on the web, e.g. http://msdn.microsoft.com/en-us/magazine/cc700340.aspx (old ObjectContext, but good in explaining the inter-layer-collaborations). There, a long-lived context is used (although the disadvantages are mentioned, but no solution to these provided).
It’s not only that the single Destinations.Local gets lost, as you surely know all other entities fetched by the query are, too.
[Edit]:
After some more reading in Julia Lerman’s book, it seems to boil down to the fact that EF does not have 2nd level caching by default; with some (considerable, I think) effort, however, one can add 3rd party caching solutions, as is also described in the book and in various articles on MSDN, codeproject etc.
I would have appreciated if this problem had been mentioned in the section about DbSet.Local in the DbContext book that it is in fact a first level cache which is destroyed when the using {} block ends (just my proposal to make it more transparent to the readers). After first reading I had the impression DbSet.Local would always return the same reference (Singleton-style) also in the second using {} block despite the new DbContext instance.
But I am still unsure whether the 2nd level cache is the way to go for my WPF application (as Julia mentions the 2nd level cache in her article for distributed applications)? Or is the way to go to get my aggregate root instances (DDD, Eric Evans) of my domain model into memory by one or some queries in a using {} block, disposing the DbContext and only holding the references to the aggregate instances, this way avoiding a long-lived context? It would be great if you could help me with this decision.
http://msdn.microsoft.com/en-us/magazine/hh394143.aspx
http://www.codeproject.com/Articles/435142/Entity-Framework-Second-Level-Caching-with-DbConte
http://blog.3d-logic.com/2012/03/31/using-tracing-and-caching-provider-wrappers-with-codefirst/
The Local property provides a “local view of all Added, Unchanged, and Modified entities in this set”. Like all change tracking it is specific to the context you are currently using.
The DB Context is a workspace for loading data and preparing changes.
If two users were to add changes at the same time, they must not know of each other's changes before those changes are saved. One user may discard their prepared changes, which would suddenly lead to problems for the other user as well.
A DB Context should be short lived indeed, but may be longer than super short when necessary. Also consider that you may not save resources by keeping it short lived if you do not load and discard data but only add changes you will save. But it is not only about resources but also about the DB state potentially changing while the DB Context is still active and has data loaded; which may be important to keep in mind for longer living contexts.
If you do not know yet all related changes you want to save into the database at once then I suggest you do not use the DB Context to store your changes in-memory but in a data structure in your code.
You can of course use entity objects for doing so without an active DB Context. This makes sense if you do not have another appropriate data class for it and do not want to create one, or if you decide that preparing the changes in them makes more sense. You can then use DbSet.Attach to attach the entities to a DB Context for saving the changes when you are ready.
I have a WPF application with MVVM. Assuming object composition from the ViewModel down looks as follows:
MainViewModel
    OrderManager
        OrderRepository
            EFContext
        AnotherRepository
            EFContext
    UserManager
        UserRepository
            EFContext
My original approach was to inject dependencies (from the ViewModelLocator) into my View Model using .InCallScope() on the EFContext and .InTransientScope() for everything else. This results in being able to perform a "business transaction" across multiple business layer objects (Managers) that eventually underneath shared the same Entity Framework Context. I would simply Commit() said context at the end for a Unit of Work type scenario.
This worked as intended until I realized that I don't want long-living Entity Framework contexts at the View Model level (data integrity issues across multiple operations, described HERE). I want to do something similar to my web projects, where I use .InRequestScope() for my Entity Framework context. In my desktop application I will define a unit of work that serves as a business transaction, if you will; typically it will wrap everything within a button click or similar event/command. It seems that Ninject's ActivationBlock can do this for me.
internal static class Global
{
    public static ActivationBlock GetNinjectUoW()
    {
        //assume that NinjectSingleton is a static reference to the kernel configured with the necessary modules/bindings
        return new ActivationBlock(NinjectSingleton.Instance.Kernel);
    }
}
In my code I intend to use it as such:
//Inside a method that is raised by a WPF Button Command ...
using (ActivationBlock uow = Global.GetNinjectUoW())
{
    OrderManager orderManager = uow.Get<OrderManager>();
    UserManager userManager = uow.Get<UserManager>();
    Order order = orderManager.GetById(1);
    // call the resolved instance, not the UserManager type
    userManager.AddOrder(order);
    // ....
    userManager.SaveChanges();
}
Questions:
To me this seems to replicate the way I do business on the web, is there anything inherently wrong with this approach that I've missed?
Am I understanding correctly that all .Get<> calls using the activation block will produce "singletons" local to that block? What I mean is no matter how many times I ask for an OrderManager, it'll always give me the same one within the block. If OrderManager and UserManager compose the same repository underneath (say SpecialRepository), both will point to the same instance of the repository, and obviously all repositories underneath share the same instance of the Entity Framework context.
Both questions can be answered with yes:
Yes - this is service location which you shouldn't do
Yes you understand it correctly
A proper unit-of-work scope, implemented in Ninject.Extensions.UnitOfWork, solves this problem.
Setup:
_kernel.Bind<IService>().To<Service>().InUnitOfWorkScope();
Usage:
using (UnitOfWorkScope.Create())
{
    // resolves, async/await, manual TPL ops, etc
}
For our senior design project my group is making a Silverlight application that utilizes graph theory concepts and stores the data in a database on the back end. We have a situation where we add a link between two nodes in the graph and upon doing so we run analysis to re-categorize our clusters of nodes. The problem is that this re-categorization is quite complex and involves multiple queries and updates to the database so if multiple instances of it run at once it quickly garbles data and breaks (by trying to re-insert already used primary keys). Essentially it's not thread safe, and we're trying to make it safe, and that's where we're failing and need help :).
The create link function looks like this:
private Semaphore dblock = new Semaphore(1, 1);
// This function is on our service reference and gets called
// by the client code.
public int addNeed(int nodeOne, int nodeTwo)
{
    dblock.WaitOne();
    submitNewNeed(createNewNeed(nodeOne, nodeTwo));
    verifyClusters(nodeOne, nodeTwo);
    dblock.Release();
    return 0;
}
private void verifyClusters(int nodeOne, int nodeTwo)
{
    // Run analysis of nodeOne and nodeTwo in graph
}
All copies of addNeed should wait for the first one that comes in to finish before another can execute. But instead they all seem to be running and conflicting with each other in the verifyClusters method. One solution would be to force our front end calls to be made synchronously. And in fact, when we do that everything works fine, so the code logic isn't broken. But when it's launched our application will be deployed within a business setting and used by internal IT staff (or at least that's the plan) so we'll have the same problem. We can't force all clients to submit data at different times, so we really need to get it synchronized on the back end. Thanks for any help you can give, I'd be glad to supply any additional information that you could need!
I wrote a series to specifically address this situation - let me know if this works for you (sequential asynchronous workflows):
Part 2 (has a link back to the part1):
http://csharperimage.jeremylikness.com/2010/03/sequential-asynchronous-workflows-part.html
Jeremy
Wrap your database updates in a transaction. Escalate to a table lock if necessary.
I've heard that unit testing is "totally awesome", "really cool" and "all manner of good things" but 70% or more of my files involve database access (some read and some write) and I'm not sure how to write a unit test for these files.
I'm using PHP and Python but I think it's a question that applies to most/all languages that use database access.
I would suggest mocking out your calls to the database. Mocks are basically objects that look like the object you are trying to call a method on, in the sense that they have the same properties, methods, etc. available to the caller. But instead of performing whatever action the real object is programmed to do when a particular method is called, a mock skips that altogether and just returns a result. That result is typically defined by you ahead of time.
In order to set up your objects for mocking, you probably need to use some sort of inversion of control/ dependency injection pattern, as in the following pseudo-code:
class Bar
{
    private FooDataProvider _dataProvider;
    // constructor: the data provider is injected rather than created here
    public Bar(FooDataProvider dataProvider) {
        _dataProvider = dataProvider;
    }
    public GetAllFoos() {
        // instead of calling Foo.GetAll() here, we are introducing an extra layer of abstraction
        return _dataProvider.GetAllFoos();
    }
}
class FooDataProvider
{
    public Foo[] GetAllFoos() {
        return Foo.GetAll();
    }
}
Now in your unit test, you create a mock of FooDataProvider, which allows you to call the method GetAllFoos without having to actually hit the database.
class BarTests
{
    public TestGetAllFoos() {
        // here we set up our mock FooDataProvider
        mockRepository = MockingFramework.new()
        mockFooDataProvider = mockRepository.CreateMockOfType(FooDataProvider);
        // create a new array of Foo objects
        testFooArray = new Foo[] {Foo.new(), Foo.new(), Foo.new()}
        // the next statement will cause testFooArray to be returned every time we call FooDataProvider.GetAllFoos,
        // instead of calling to the database and returning whatever is in there
        // ExpectCallTo and Returns are methods provided by our imaginary mocking framework
        ExpectCallTo(mockFooDataProvider.GetAllFoos).Returns(testFooArray)
        // now begins our actual unit test
        testBar = new Bar(mockFooDataProvider)
        baz = testBar.GetAllFoos()
        // baz should now equal the testFooArray object we created earlier
        Assert.AreEqual(3, baz.length)
    }
}
A common mocking scenario, in a nutshell. Of course you will still probably want to unit test your actual database calls too, for which you will need to hit the database.
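Since the question mentions Python, here is a minimal sketch of the same idea using the standard library's unittest.mock. The class and method names (FooDataProvider, Bar, get_all_foos, run_bar_with_mock) are just illustrative stand-ins mirroring the pseudo-code above, not part of any real project.

```python
from unittest.mock import Mock


class FooDataProvider:
    """Hypothetical data access class; the real one would query the database."""

    def get_all_foos(self):
        raise RuntimeError("would hit the real database")


class Bar:
    def __init__(self, data_provider):
        # The provider is injected, so a test can hand in a mock instead.
        self._data_provider = data_provider

    def get_all_foos(self):
        return self._data_provider.get_all_foos()


def run_bar_with_mock():
    # spec= makes the mock reject attributes that FooDataProvider lacks.
    mock_provider = Mock(spec=FooDataProvider)
    test_foos = ["foo1", "foo2", "foo3"]
    # Canned result returned instead of going to the database.
    mock_provider.get_all_foos.return_value = test_foos

    result = Bar(mock_provider).get_all_foos()

    # Verify that Bar really delegated to the provider exactly once.
    mock_provider.get_all_foos.assert_called_once_with()
    return result
```

In a real test suite the body of run_bar_with_mock would normally live in a unittest.TestCase or pytest test function; it is a plain function here only to keep the sketch short.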
Ideally, your objects should be persistence ignorant. For instance, you should have a "data access layer" that you make requests to and that returns objects. This way, you can leave that part out of your unit tests, or test it in isolation.
If your objects are tightly coupled to your data layer, it is difficult to do proper unit testing. The first part of "unit test" is "unit": all units should be able to be tested in isolation.
In my C# projects, I use NHibernate with a completely separate Data layer. My objects live in the core domain model and are accessed from my application layer. The application layer talks to both the data layer and the domain model layer.
The application layer is also sometimes called the "Business Layer".
If you are using PHP, create a specific set of classes ONLY for data access. Make sure your objects have no idea how they are persisted and wire up the two in your application classes.
Another option would be to use mocking/stubs.
The easiest way to unit test an object with database access is using transaction scopes.
For example:
[Test]
[ExpectedException(typeof(NotFoundException))]
public void DeleteAttendee() {
    using(TransactionScope scope = new TransactionScope()) {
        Attendee anAttendee = Attendee.Get(3);
        anAttendee.Delete();
        anAttendee.Save();
        //Try reloading. Instance should have been deleted.
        Attendee deletedAttendee = Attendee.Get(3);
    }
}
This reverts the state of the database, basically like a transaction rollback, so you can run the test as many times as you want without any side effects. We've used this approach successfully in large projects. Our build does take a little long to run (15 minutes), but that is not horrible for 1800 unit tests. Also, if build time is a concern, you can change the build process to have multiple builds: one for building src, and another that fires up afterwards and handles unit tests, code analysis, packaging, etc.
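For Python, a rough equivalent of the "run the test inside a transaction and roll it back" idea might look like the sketch below. It uses the standard sqlite3 module with an in-memory database purely so the example is runnable; the attendee table and the helper names (delete_attendee, count_attendees, run_delete_test) are made up for illustration, and other database drivers may handle transactions differently.

```python
import sqlite3


def delete_attendee(conn, attendee_id):
    """Code under test (hypothetical): deletes one row."""
    conn.execute("DELETE FROM attendee WHERE id = ?", (attendee_id,))


def count_attendees(conn):
    return conn.execute("SELECT COUNT(*) FROM attendee").fetchone()[0]


def run_delete_test(conn):
    """Exercise the code, then roll back so the test data stays untouched."""
    try:
        delete_attendee(conn, 3)
        # Row count as seen inside the still-open transaction.
        return count_attendees(conn)
    finally:
        conn.rollback()


# Stand-in for a shared test database, seeded once and committed.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE attendee (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO attendee VALUES (?, ?)", [(1, "a"), (2, "b"), (3, "c")]
)
conn.commit()

during = run_delete_test(conn)  # the delete is visible while the test runs
after = count_attendees(conn)   # the rollback restored the deleted row
```

Because the rollback happens in a finally clause, the seeded data is restored even when an assertion inside the test fails.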
I can perhaps give you a taste of our experience when we began looking at unit testing our middle-tier process that included a ton of "business logic" sql operations.
We first created an abstraction layer that allowed us to "slot in" any reasonable database connection (in our case, we simply supported a single ODBC-type connection).
Once this was in place, we were then able to do something like this in our code (we work in C++, but I'm sure you get the idea):
GetDatabase().ExecuteSQL( "INSERT INTO foo ( blah, blah )" )
At normal run time, GetDatabase() would return an object that fed all our sql (including queries), via ODBC directly to the database.
We then started looking at in-memory databases; the best by a long way seems to be SQLite (http://www.sqlite.org/index.html). It's remarkably simple to set up and use, and it allowed us to subclass and override GetDatabase() to forward SQL to an in-memory database that was created and destroyed for every test performed.
We're still in the early stages of this, but it's looking good so far. We do have to make sure we create any tables that are required and populate them with test data, but we've reduced the workload somewhat by creating a generic set of helper functions that do a lot of this for us.
Overall, it has helped immensely with our TDD process, since making what seem like quite innocuous changes to fix certain bugs can have quite strange effects on other (difficult to detect) areas of your system, due to the very nature of SQL/databases.
Obviously, our experiences have centred around a C++ development environment, however I'm sure you could perhaps get something similar working under PHP/Python.
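As a rough idea of what that could look like in Python (this is only a sketch, not our C++ setup): the standard sqlite3 module can open an in-memory database with ":memory:", and unittest's setUp/tearDown can create and destroy it for every test. The foo table and the add_foo / foo_names functions are invented for the example.

```python
import sqlite3
import unittest


def add_foo(db, name):
    """Code under test: it only needs some DB-API connection."""
    db.execute("INSERT INTO foo (name) VALUES (?)", (name,))


def foo_names(db):
    return [row[0] for row in db.execute("SELECT name FROM foo ORDER BY name")]


class FooTests(unittest.TestCase):
    def setUp(self):
        # A fresh in-memory database is created for every test...
        self.db = sqlite3.connect(":memory:")
        self.db.execute("CREATE TABLE foo (id INTEGER PRIMARY KEY, name TEXT)")

    def tearDown(self):
        # ...and destroyed afterwards, so tests cannot affect each other.
        self.db.close()

    def test_add_foo(self):
        add_foo(self.db, "blah")
        self.assertEqual(foo_names(self.db), ["blah"])

    def test_starts_empty(self):
        self.assertEqual(foo_names(self.db), [])


def run_suite():
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(FooTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

The table creation in setUp plays the role of the helper functions mentioned above. One caveat: SQLite's SQL dialect may differ from your production database, so this suits fairly portable SQL best.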
Hope this helps.
You should mock the database access if you want to unit test your classes. After all, you don't want to test the database in a unit test. That would be an integration test.
Abstract the calls away and then insert a mock that just returns the expected data. If your classes don't do more than executing queries, it may not even be worth testing them, though...
The book xUnit Test Patterns describes some ways to handle unit-testing code that hits a database. I agree with the other people who are saying that you don't want to do this because it's slow, but you gotta do it sometime, IMO. Mocking out the db connection to test higher-level stuff is a good idea, but check out this book for suggestions about things you can do to interact with the actual database.
I usually try to break up my tests between testing the objects (and ORM, if any) and testing the db. I test the object-side of things by mocking the data access calls whereas I test the db side of things by testing the object interactions with the db which is, in my experience, usually fairly limited.
I used to get frustrated with writing unit tests until I started mocking the data access portion so I didn't have to create a test db or generate test data on the fly. By mocking the data you can generate it all at run time and be sure that your objects work properly with known inputs.
Options you have:
Write a script that will wipe out database before you start unit tests, then populate db with predefined set of data and run the tests. You can also do that before every test – it'll be slow, but less error prone.
Inject the database. (Example in pseudo-Java, but applies to all OO-languages)
class Database {
    public Result query(String query) {... real db here ...}
}
class MockDatabase extends Database {
    public Result query(String query) {
        return "mock result";
    }
}
class ObjectThatUsesDB {
    private Database database;
    public ObjectThatUsesDB(Database db) {
        this.database = db;
    }
}
Now in production you use the normal database, and for all tests you just inject the mock database, which you can create ad hoc.
Do not use the DB at all throughout most of the code (that's a bad practice anyway). Create a "database" object that returns normal objects instead of raw results (i.e. it returns a User instead of a tuple {name: "marcin", password: "blah"}). Write all your tests with ad hoc constructed real objects, and write one big test that depends on a database to make sure this conversion works OK.
Of course these approaches are not mutually exclusive and you can mix and match them as you need.
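To make the last option a bit more concrete, here is a small Python sketch. The names (User, UserStore, users_with_weak_passwords) and the users table are assumptions made up for the example; sqlite3 in memory stands in for whatever database you really use.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class User:
    name: str
    password: str


class UserStore:
    """The only place that knows about SQL; it hands back User objects."""

    def __init__(self, conn):
        self._conn = conn

    def all_users(self):
        rows = self._conn.execute(
            "SELECT name, password FROM users ORDER BY name"
        )
        return [User(name, password) for name, password in rows]


def users_with_weak_passwords(users, min_length=8):
    """Business logic: works on plain objects, no database involved."""
    return [u.name for u in users if len(u.password) < min_length]


# Most tests build their objects ad hoc and never touch a database:
weak = users_with_weak_passwords(
    [User("marcin", "blah"), User("anna", "correct horse")]
)

# One database-backed test checks the row -> object conversion:
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('marcin', 'blah')")
loaded = UserStore(conn).all_users()
```

The logic in users_with_weak_passwords can be tested quickly and in bulk, while the single conversion test is the only one that needs a database connection.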
Unit testing your database access is easy enough if your project has high cohesion and loose coupling throughout. This way you can test only the things that each particular class does without having to test everything at once.
For example, if you unit test your user interface class the tests you write should only try to verify the logic inside the UI worked as expected, not the business logic or database action behind that function.
If you want to unit test the actual database access you will actually end up with more of an integration test, because you will be dependent on the network stack and your database server, but you can verify that your SQL code does what you asked it to do.
The hidden power of unit testing for me personally has been that it forces me to design my applications in a much better way than I might without them. This is because it really helped me break away from the "this function should do everything" mentality.
Sorry I don't have any specific code examples for PHP/Python, but if you want to see a .NET example I have a post that describes a technique I used to do this very same testing.
I agree with the first post - database access should be stripped away into a DAO layer that implements an interface. Then, you can test your logic against a stub implementation of the DAO layer.
You could use mocking frameworks to abstract out the database engine. I don't know if PHP/Python have any, but for typed languages (C#, Java, etc.) there are plenty of choices.
It also depends on how you designed your database access code, because some designs are easier to unit test than others, as the earlier posts have mentioned.
I've never done this in PHP and I've never used Python, but what you want to do is mock out the calls to the database. To do that you can implement some IoC whether 3rd party tool or you manage it yourself, then you can implement some mock version of the database caller which is where you will control the outcome of that fake call.
A simple form of IoC can be achieved just by coding to interfaces. This requires some kind of object orientation in your code, so it may not apply to what you're doing (I say that since all I have to go on is your mention of PHP and Python).
Hope that's helpful, if nothing else you've got some terms to search on now.
Setting up test data for unit tests can be a challenge.
When it comes to Java, if you use the Spring APIs for unit testing, you can control transactions at the unit level. In other words, you can execute unit tests that involve database updates/inserts/deletes and roll back the changes. At the end of the execution, everything in the database is left as it was before you started. To me, that is as good as it can get.