Self Tracking Entities Traffic Optimization - wpf

I'm working on a personal project using WPF with Entity Framework and Self Tracking Entities. I have a WCF web service which exposes some methods for the CRUD operations. Today I decided to do some tests and to see what actually travels over this service and even though I expected something like this, I got really disappointed. The problem is that for a simple update (or delete) operation for just one object - lets say Category I send to the server the whole object graph, including all of its parent categories, their items, child categories and their items, etc. I my case it was a 170 KB xml file on a really small database (2 main categories and about 20 total and about 60 items). I can't imagine what will happen if I have a really big database.
I tried to google for some articles concerning traffic optimization with STE, but with no success, so I decided to ask here if somebody has done something similar, knows some good practices, etc.
One of the possible ways I came out with is to get the data I need per object with more service calls:
return context.Categories.ToList();//only the categories
...
return context.Items.ToList();//only the items
Instead of:
return context.Categories.Include("Items").ToList();
This way the categories and the items will be separated and when making changes or deleting some objects the data sent over the wire will be less.
Has any of you faced a similar problem and how did you solve it or did you solve it?

We've encountered similiar challenges. First of all, as you already mentioned, is to keep the entities as small as possible (as dictated by the desired client functionality). And second, when sending entities back over the wire to be persisted: strip all navigation properties (nested objects) when they haven't changed. This sounds very simple but is not at all trivial. What we do is to recursively dig into the entities present in trackable collections of say the "topmost" entity (and their trackable collections, and theirs, and...) and remove them when their ChangeTracking state is "Unchanged". But be carefull with this, because in some cases you still need these entities because they have been removed or added to trackable collections of their parent entity (so then you shouldn't remove them).
This, what we call "StripEntity", is also mentioned (not with any code sample or whatsoever) in Julie Lerman's - Programming Entity Framework.
And although it might not be as efficient as a more purist kind of approach, the use of STE's saves a lot of code for queries against the database. We are not in need for optimal performance in a high traffic situation, so STE's suit our needs and takes away a lot of code to communicate with the database. You have to decide for your situation what the "best" solution is. Good luck!

You can find an Entity Framework project item at http://selftrackingentity.codeplex.com/. With version 0.9.8, I added a method called GetObjectGraphChanges() that returns an optimized entity object graph with only objects that have changes.
Also, there are two helper methods: EstimateObjectGraphSize() and EstimateObjectGraphChangeSize(). The first method returns the estimate size of the whole entity object along with its object graph; and the later returns the estimate size of the optimized entity object graph with only object that have changes. With these two helper methods, you can decide whether it makes sense to call GetObjectGraphChanges() or not.

Related

Using of graphql service within desktop application that "follows" MVVM and DDD

We have a WPF desktop application that uses MVVM pattern and DDD (well, let's say that at least my model classes that store data named by entities taken from the real world). APP uses several microservices through REST API. And it worked perfectly. Until we thought that it's time to use some facade for back-end part to unite all those microservices and get only data that we need for particular screen.
BUT. The question is, how to make them live together.
On the one hand, we have dynamically returned data from graphql. It
means that, for example, if we have list of people on the one screen,
we will request id, name, surname and role of the person. On the
different screen for dropdown of people we will request the same data
but without role.
On the other hand we have class Person that has static set of fields Name, Surname, Role and Id, which person has in "real life"
If we use the same Person class with graphql, converting data from JSON to model Person, both screens will work fine, but behind the scene one screen that doesn't need Role wouldn't request it from graphQL. And we will have a situation when model class Person will have field Role but it will be just empty (which is i believe is kind of smells. At least I don't feel like it would be easy to maintain such a code. Developer needs to add some information to the screen, opens model, sees that Role is there, bind the field to the screen and goes to drink cofee. And then oops, there is the fields but there was no data assigned ).
Two variants I have on my mind are:
either to not use models and DDD and map data directly to ViewModel
(which personally feels like ruining everything we had before).
or we map that dynamic data to our existing models and different field for different screens (for the same class Person e.g.) will be
empty (because not requested).
Maybe somebody has already used such a combination. How do you use it and what pros and cons are?
It's a fairly common situation where you have a data layer returns many columns but only some are used in a given view.
There is no absolute "best" solution independent of how much impact the full set of columns will have on performance. Which might in turn be linked to things like caching.
You could write services that return subsets of data and then you only use the necessary bandwidth. Sort of a CQRS pattern but with maybe more models than just read + write.
Often this is unnecessary and the complications introduced do not compensate for the increased cost of maintenance.
What is often done is just to map from model to viewmodel (and back). The viewmodel that needs just 4 columns just has 4 properties and any more returned by the model are not copied. The viewmodel that needs 5 has 5 properties and they are copied from the model.

Short lived DbContext in WPF application reasonable?

In his book on DbContext, #RowanMiller shows how to use the DbSet.Local property to avoid 1.) unnecessary roundtrips to the database and 2.) passing around collections (created with e.g. ToList()) in the application (page 24). I then tried to follow this approach. However, I noticed that from one using [} – block to another, the DbSet.Local property becomes empty:
ObservableCollection<Destination> destinationsList;
using (var context = new BAContext())
{
var query = from d in context.Destinations …;
query.Load();
destinationsList = context.Destinations.Local; //Nonzero here.
}
//Do stuff with destinationsList
using (var context = new BAContext())
{
//context.Destinations.Local zero here again;
//So no way of getting the in-memory data from the previous using- block here?
//Do I have to do another roundtrip to the database here to get the same data I wanted
//to cache locally???
}
Then, what is the point on page 24? How can I avoid the passing around of my collections if the DbSet.Local is only usable inside the using- block? Furthermore, how can I benefit from the change tracking if I use these short-lived context instances not handing over any cache data to each others under the hood? So, if the contexts should be short-lived for freeing resources such as connections, have I to give up the caching for this? I.e. I can’t use both at the same time (short-lived connections but long-lived cache)? So my only option would be to store the results returned by the query in my own variables, exactly what is discouraged in the motivation on page 24?
I am developing a WPF application which maybe will also become multi-tiered in the future, involving WCF. I know Julia has an example of this in her book, but I currently don’t have access to it. I found several others on the web, e.g. http://msdn.microsoft.com/en-us/magazine/cc700340.aspx (old ObjectContext, but good in explaining the inter-layer-collaborations). There, a long-lived context is used (although the disadvantages are mentioned, but no solution to these provided).
It’s not only that the single Destinations.Local gets lost, as you surely know all other entities fetched by the query are, too.
[Edit]:
After some more reading in Julia Lerman’s book, it seems to boil down to that EF does not have 2nd level caching per default; with some (considerable, I think) effort, however, ones can add 3rd party caching solutions, as is also described in the book and in various articles on MSDN, codeproject etc.
I would have appreciated if this problem had been mentioned in the section about DbSet.Local in the DbContext book that it is in fact a first level cache which is destroyed when the using {} block ends (just my proposal to make it more transparent to the readers). After first reading I had the impression DbSet.Local would always return the same reference (Singleton-style) also in the second using {} block despite the new DbContext instance.
But I am still unsure whether the 2nd level cache is the way to go for my WPF application (as Julia mentions the 2nd level cache in her article for distributed applications)? Or is the way to go to get my aggregate root instances (DDD, Eric Evans) of my domain model into memory by one or some queries in a using {} block, disposing the DbContext and only holding the references to the aggregate instances, this way avoiding a long-lived context? It would be great if you could help me with this decision.
http://msdn.microsoft.com/en-us/magazine/hh394143.aspx
http://www.codeproject.com/Articles/435142/Entity-Framework-Second-Level-Caching-with-DbConte
http://blog.3d-logic.com/2012/03/31/using-tracing-and-caching-provider-wrappers-with-codefirst/
The Local property provides a “local view of all Added, Unchanged, and Modified entities in this set”. Like all change tracking it is specific to the context you are currently using.
The DB Context is a workspace for loading data and preparing changes.
If two users were to add changes at the same time, they must not know of the others changes before they saved them. They may discard their prepared changes which suddenly would lead to problems for other other user as well.
A DB Context should be short lived indeed, but may be longer than super short when necessary. Also consider that you may not save resources by keeping it short lived if you do not load and discard data but only add changes you will save. But it is not only about resources but also about the DB state potentially changing while the DB Context is still active and has data loaded; which may be important to keep in mind for longer living contexts.
If you do not know yet all related changes you want to save into the database at once then I suggest you do not use the DB Context to store your changes in-memory but in a data structure in your code.
You can of course use entity objects for doing so without an active DB Context. This makes sense if you do not have another appropriate data class for it and do not want to create one, or decide preparing the changes in them make more sense. You can then use DbSet.Attach to attach the entities to a DB Context for saving the changes when you are ready.

Game player object design

I'm supporting a server for an online card game and while thinking about refactoring it into a better state I have found myself unable to decide what is a proper object model for my needs.
I have a Player class which has a lot of attributes. The first problem is just that - the class is too big. The second problem is that I don't know how to refactor it. I will list some of the attributes and issues with these.
Some attributes are very tightly bound to a player: nick, email, last login &c. These, I suppose, are to be kept directly in the player class and in the same table in the DB.
Now, some attributes are a little more difficult, like money and gold amount. The problem with these is that they are historically stored in a different table, there might be some more currencies later on and that they MUST be synched into the database at their own pace.
Third category of attributes are loosely coupled to the player, like status string, experience, achievements, statistics &c. These are stored in different tables in the DB and MUST be stored, retrieved, cached and synchronized at their own pace.
Note that one of the big problems here is that we have to implement relatively complex database synchronization schemes because we have a lot of online players and our game is soft-realtime and we have to make load on the DB as low as possible.
My questions are:
How to determine which attributes to store within a player class and which not to? Say, experience, nickname, money amount?
When one has some attributes that may be grouped together like (strength, agility, endurance, &c.) and (handItem, headItem, feetItem, weapon) when they should be grouped and when not?
What to do with complex database synchronization schemes? Make a separate model for each attribute that needs to be synched independently or make some DataManager classes to take them apart and work with them?
What to do with the need for a class to have several different "data representations" for external consumers? Like XML, Json, another XML for some external service, human-readable string, &c.
I'm sorry if my questions are bogus, I'm not really good at OOP design, I'm more an FP guy. And my English is not very good =).
There is no "limit" to what you can store in a player class. As long as it is concerning him and him only, it should be in his class. But one thing you should consider is to make several player classes. The idea is : if you don't need is, don't query it. You may have PlayerView_Small, PlayerBuying, PlayerFighting, PlayerSettings (depending on your game, they may not be fulfilling the exact same purpose)... This way for each "need" of info on a player, you only load the player data you need, and can handle it properly. Also, you may use inheritance if some class is only a more detailed version of the other.
If you are talking about the class, it may be in a sub-class PlayerAttributes of which an instance is contained into PlayerFighting and PlayerView_Detailed. In the database, it might be interesting to store it as a string (conveniently outputted by our class, and accepted in constructor), to avoid having too much fields, but you will lose the sorting ability. That's probably not a problem in our case, but might be in some others.
Blank for now, I don't understand where there is synchronization, will edit when informed.
In your PlayerViewDetailInfo(or in your PlayerAllData depending what you need), you place some methods such as ToXmlClient1(), ToJson(), ToHumanReadableString() (although that might be a bit confusing to the eye, you should consider HTML^^). The class having the method should be the class with the least (but sufficient to provide the answer) data. When requested, you load the Player... which has the method giving the correct output, and you write it directly in the response.

Silverlight LINQtoSQL: one big dataclass, or several small ones?

I'm new to Silverlight, but being dumped right into the fray - good way to learn I suppose :o)
Anyway, the webapp I'm working on has a relatively complex database structure that represents various object types that are linked to each other, and I was wondering 2 things:
1- What is the recommended approach when it comes to dataclasses? Have just one big dataclass, or try and separate it into several smaller dataclasses, keeping in mind they will need to reference each other?
2- If the recommended approach is to have several dataclasses, how do you define the inter-dataclasses references?
I'm asking because I did a small test. In my DB (simplified here, real model is more complex but that's not important), I have a table "Orders" and a table "Parameters". "Orders" has a foreign key on "Parameters". What I did is create 2 dataclasses.
The first one, ParamClass, were I dropped the "Parameters" table only, so I can have a nice "parameter" class. I then created a simple service to add basic SELECT and INSERT functionality.
The second one, OrdersClass, where I dropped both tables, so that the relation between the tables would automatically create a "EntityRef<parameter>" variable inside the "order" class. I then removed the "parameters" class that was automatically created in the OrdersClass dataclass, since the class has already been declared in the ParamClass dataclass. Again I created a small service to test it.
So far so good, it builds happily. The problem is that when I try to handle things on the application code, I added service references for both dataclasses, but it is not happy doing something like:
OrdersServiceReference.order myOrder = new OrdersServiceReference.order();
myOrder.parameter = new ParamServiceReference.parameter(); //<-PROBLEM IS HERE
It comlpains that it cannot implicitly convert from type 'MytestDC.ParamServiceReference.parameter' to 'MytestDC.OrdersServiceReference.parameter'
Do I somehow need to declare some sort of reference to ParamClass from OrdersClass, or how do I "convert" one to the other?
Is this even a recommended and efficient way of doing this?
Since it's a team-project, I initially wanted to separate the dataclasses so that they (and their services) can be easily checked out by one member without checking out the whole entire dataclass.
Any help appreciated!
PS: using Silverlight 4, in case that's important
Based on the widely accepted Single Responsability Principle (SRP), a class should always be responsible for one task, and one task only.
That pretty much invalidates your "one big dataclass" approach.
I would always recommend smaller, more manageable bits that can be combined, instead of one humonguous class that does everything (except brew coffee for you).
Resources for the SRP:
Wikipedia on SRP
OODesign: Single Responsibility Principle
ObjectMentor: list of articles on good app design - which has a few links to PDF documents, like this one on SRP written by Robert C. Martin - the "guru" on proper OO design
OK, some more research let me to this: it is not simple to separate classes from a relational model using LINQtoSQL. I ended up switching to an Entity Framework approach, which itself doesn't deal with it gracefully (see here and there, for example), but at least it solved another major problem I had with LINQtoSQL.
There are other ORMs out there that are apparently much more capable at this (NHibernate comes up often in recommendations), unfortunately, I don't have time to investigate them now, being under such a tight deadline.
As for the referencing, it was quite simple, change the line to:
myOrder.parameter = new OrderServiceReference.parameter();
even though I removed the declaration from that dataclass.
Hope this helps someone!

How to model Data Transfer Objects for different front ends?

I've run into reoccuring problem for which I haven't found any good examples or patterns.
I have one core service that performs all heavy datasbase operations and that sends results to different front ends (html, silverlight/flash, web services etc).
One of the service operation is "GetDocuments", which provides a list of documents based on different filter criterias. If I only had one front-end, I would like to package the result in a list of Document DTOs (Data transfer objects) that just contains the data. However, different front-ends needs different amounts of "metadata". The simples client just needs the document headline and a link reference. Other clients wants a short text snippet of the document, another one also wants a thumbnail and a third wants the name of the author. Its basically all up to the implementation of the GUI what needs to be displayed.
Whats the best way to model this:
As a lot of different DTOs (Document, DocumentWithThumbnail, DocumentWithTextSnippet)
tends to become a lot of classes
As one DTO containing all the data, where the client choose what to display
Lots of unnecessary data sent
As one DTO where certain fields are populated based on what the client requested
Tends to become a very large class that needs to be extended over time
One DTO but with some kind of generic "Metadata" field containing requested metadata.
Or are there other options?
Since I want a high performance service, I need to think about both network load and caching strategies.
Does anyone have any good patterns or practices that might help me?
What I would do is give the front end the ability to request the presence of the wanted metadata ( say getDocument( WITH_THUMBNAILS | WITH_TEXT_SNIPPET ) )
Then this DTO is built with only this requested information.
Adding all the possible metadata is as you said, unacceptable.
I will surely stay with one class defining all the possible methods (getTitle(), getThumbnail()) and if possible it will return a placeholder when the thumbnail was not requested. Something like "Image not available".
If you want to model this like a pattern, take a look at the factory patterns.
Hope this helps you.
Is there any noticable cost to creating a DTO that has all the data any of your views could need and using it everywhere? I would do that, especially since it insulates you from a requirement change down the line to have one of the views incorporate data one of the other views uses
ex. Maybe your silverlight/flash view doesn't show the title itself b/c it's in the thumb now, but they decide they want to sort by it later.
To clarify, I do not necesarily think you need to pass down all of the data every time, but I think your DTO class should define all of them. Just don't fall into the pits of premature optimization or analysis paralysis. Do the simplest thing first, then justify added complexity. Throw it all in and profile it. If the perf is unacceptable, optimize and try again.

Resources