Data layer architecture for WPF Rich-Client? - wpf

Background
I need to build a Rich-Client application using .NET. The app needs to handle TreeViewControls and TableViewControls with about 100000 entities. GUI is build with WPF, very likely using Telerik Controls. My question is about the general architecture of the data-layer. I've got some coarse ideas of the concepts, but would highly appreciate your comments / thoughts and hints into which technology I should dig deeper. Here're my thoughts:
Conceptual Layers
Presentation Layer
just the WPF Controls, I'd need performant synchronizing of different controls on property changes, but I don't anticipate major problems here.
Business Layer
creating views (object selections to be displayed in the controls), CRUD operations (modifications done directly with the POCOs), searching (global search, but also limited to a view)
Repository
holds POCOs in an enitity map, decides weather to load from persistence store
Persistence-Manager
I'm thinking of using a LocalDB or simple Key-Value Store as (persistent) Client-Cache. So, the Persistence-Manager would try to get an object from the local store. Otherwise get the data from the server. Also, persisting data to the Client-Cache. The data would be available via a webservice. I'm happy to give WCF Data Services a try.
Persistence-Layers
There would be two parts:
- Local DB connection using an ORM like EF or OpenAccess; or a simple key-value store
- HTTP connection to consume the Web-Service
Questions
In a layering like this, how about lazy loading referenced objects? I know EF and other ORMs take care of a lot of the issues I have here, too. But I don't see yet how to plug these frameworks into the above layering. Also, where to track changes? Where to secure consistency when deleting objects? (e.g. deleting references to these objects as well)
I would eager load whole views (hierarchical structures) and perform Linq to objects to those collections of POCOs. Maybe implement a simple inverted index if Linq performance would become a matter. But how should I best implement global searches on the server? Are there libraries ("Linq to OData") available?
What do you think about a fully "diconnected" scenario? Holding all data a user needs in a local database. Sync on start / stop and user triggered. I could use an ORM directly on the local DB, with good chances to save a lot of headaches trying to implement a lot of consistency features by hand (using the above layering).
Or in contrast, forget about the local database and batch eager load most of the needed data. Here I'm concerned about the performance of the webservices (without having experience with OData, WCF). I've build an app using Redis and Python that loads about 200000 business objects quite fast (< 1 min) to the client (the objects are already serialized cached in Redis).
I'll certainly do some prototyping and benchmarking, but to get a good start, any thoughts and recommendations are highly appreciate.
Cheers,
Jan

Related

Does ATK4 have caching support?

I've gone through the documentation on ATK4, trying to find a reference point how to handle caching - partial or full page.
Seems like there is no entry on the matter. Strange from a framework that is built for scalability. Is there a way to cache DB queries, pages, views, etc?
Thanks for your question. (I'm the author of ATK4).
In my opinion, scalability and caching are two different topics and can be addressed separately. Framework approaches scalability by optimising queries and minimising load for each request and also designing approach where multiple nodes can be used to seamlessly scale your application horizontally. There is also an option to add reverse proxy to cache pages before they even hit the web server.
Agile Toolkit has support for two types of caching:
View-level caching
As you read documentation on object render trees - framework initialise and render recursively, so if you add "caching" support to a Page level, you will be able to intercept and retrieve it's contents from cache. You can also cache views.
Here is a controller which can be used to implement caching for you:
https://github.com/agile55/viewcache
Model level caching
Sometimes you would want to cache your model data, so instead of retrieving data from the slow database, you can quickly fetch the data from a faster source. Agile Toolkit has support for multiple model data sources, where a cache would be queried first and refreshed if it didn't contain the data. Here you can find more information or ask further questions:
http://book.agiletoolkit.org/model/core.html#using-caching
http://forum.agiletoolkit.org/t/is-setcache-implemented/62
Other Ideas
Given the object-oriented nature of ATK4, you can probably come up with a new ways to cache data. If you have any interesting ideas, our c

Data Migration from Legacy Data Structure to New Data Structure

Ok So here is the problem we are facing.
Currently:
We have a ton of Legacy Applications that have direct database access
The data structure in the database is not normalized
The current process / structure is used by almost all applications
What we are trying to implement:
Move all functionality to a RESTful service so no application has direct database access
Implement a normalized data structure
The problem we are having is how to implement this migration not only with the Applications but with the Database as well.
Our current solution is to:
Identify all the CRUD functionality and implement this in the new Web Service
Create the new Applications to replace the Legacy Apps
Point the New Applications to the new Web Service ( Still Pointing to the Old Data Structure )
Migrate the data in the databases to the new Structure
Point the New Applications to the new Web Service ( Point to new Data Structure )
But as we are discussing this process we are looking at having to rewrite the New Web Service twice. Once for the Old Data Structure and Once for the New Data Structure, As currently we could not represent the old Data Structure to fit the new Data Structure for the new Web Service.
I wanted to know if anyone has faced any challenges like this and how did you overcome these types of issues/implementation and such.
EDIT: More explanation of synchronization using bi-directional triggers; updates for syntax, language and clarity.
Preamble
I have faced similar problems on a data model upgrade on a large web application I worked on for 7 years, so I feel your pain. From this experience, I would propose the something a bit different - but hopefully one that will be a lot easier to implement. But first, an observation:
Value to the organisation is the data - data will long outlive all your current applications. The business will constantly invent new ways of getting value out of the data it has captured which will engender new reports, applications and ways of doing business.
So getting the new data structure right should be your most important goal. Don't trade getting the structure right against against other short term development goals, especially:
Operational goals such as rolling out a new service
Report performance (use materialized views, triggers or batch jobs instead)
This structure will change over time so your architecture must allow for frequent additions and infrequent normalizations to it. This means that your data structure and any shared APIs to it (including RESTful services) must be properly versioned.
Why RESTful web services?
You mention that your will "Move all functionality to a RESTful service so no application has direct database access". I need to ask a very important question with respect to the legacy apps: Why is this important and what value has it brought?
I ask because:
You lose ACID transactions (each call is a single transaction unless you implement some horrifically complicated WS-* standards)
Performance degrades: Direct database connections will be faster (no web server work and translations to do) and have less latency (typically 1ms rather than 50-100ms) which will visibly reduce responsiveness in applications written for direct DB connections
The database structure is not abstracted from the RESTful service, because you acknowledge that with the database normalization you have to rewrite the web services and rewrite the applications calling them.
And the other cross-cutting concerns are unchanged:
Manageability: Direct database connections can be monitored and managed with many generic tools here
Security: direct connections are more secure than web services that your developers will write,
Authorization: The database permission model is very advanced and as fine-grained as you could want
Scaleability: The web service is a (only?) direct-connected database application and so scales only as much as the database
You can migrate the database and keep the legacy applications running by maintaining a legacy RESTful API. But what if we can keep the legacy apps without introducing a 'legacy' RESTful service.
Database versioning
Presumably the majority of the 'legacy' applications use SQL to directly access data tables; you may have a number of database views as well.
One approach to the data migration is that the new database (with the new normalized structure in a new schema) presents the old structure as views to the legacy applications, typically from a different schema.
This is actually quite easy to implement, but solves only reporting and read-only functionality. What about legacy application DML? DML can be solved using
Updatable views for simple transformations
Introducing stored procedures where updatable views not possible (eg "CALL insert_emp(?, ?, ?)" rather than "INSERT INTO EMP (col1, col2, col3) VALUES (?, ? ?)".
Have a 'legacy' table that synchronizes with the new database with triggers and DB links.
Having a legacy-format table with bi-directional synchronization to the new format table(s) using triggers is a brute-force solution and relatively ugly.
You end up with identical data in two different schemas (or databases) and the possibility of data going out-of-sync if the synchronization code has bugs - and then you have the classic issues of the "two master" problem. As such, treat this as a last resort, for example when:
The fundamental structure has changed (for example the changing the cardinality of a relation), or
The translation to the legacy format is a complex function (eg if the legacy column is the square of the new-format column value and is set to "4", an updatable view cannot determine if the correct value is +2 or -2).
When such changes are required in your data, there will be some significant change in code and logic somewhere. You could implement in a compatibility layer (advantage: no change to legacy code) or change the legacy app (advantage: data layer is clean). This is a technical decision by the engineering team.
Creating a compatibility database of the legacy structure using the approaches outlined above minimize changes to legacy applications (in some cases, the legacy application continues without any code change at all). This greatly reduces development and testing costs (for which there is no net functional gain to the business), and greatly reduces rollout risk.
It also allows you to concentrate on the real value to the organisation:
The new database structure
New RESTful web services
New applications (potentially build using the RESTful web services)
Positive aspect of web services
Please don't read the above as a diatribe against web services, especially RESTful web services. When used for the right reason, such as for enabling web applications or integration between disparate systems, this is a good architectural solution. However, it might not be the best solution for managing your legacy apps during the data migration.
What it seems like you ought to do is define a new data model ("normalized") and build a mapping from the normalized model back to the legacy model. Then you can replace legacy direct calls with calls on the normalized one at your leisure. This breaks no code.
In parallel, you need to define what amounts to a (cerntralized) legacy db api, and map it to to your normalized model. Now, at your leisure, replace the original legacy db calls with calls on the legacy db API. This breaks no code.
Once the original calls are completely replaced, you can switch the data model over to the real normalized one. This should break no code, since everything is now going against the legacy db API or the normalized db API.
Finally, you can replace the legacy db API calls and related code, with revised code that uses the normalized data API. This requires careful recoding.
To speed all this up, you want an automated code transformation tool to implement the code replacements.
This document seems to have a good overview: http://se-pubs.dbs.uni-leipzig.de/files/Cleve2006CotransformationsinDatabaseApplicationsEvolution.pdf
Firstly, this seems like a very messy situation, and I don't think there's a "clean" solution. I've been through similar situations a couple of times - they weren't much fun.
Firstly, the effort of changing your client apps is going to be significant - if the underlying domain changes (by introducing the concept of an address that is separate from a person, for instance), the client apps also change - it's not just a change in the way you access the data. The best way to avoid this pain is to write your API layer to reflect the business domain model of the future, and glue your old database schema into that; if there are new concepts you cannot reflect using the old data (e.g. "get /app/addresses/addressID"), throw a NotImplemented error. Where you can reflect the new model with the old data, wire it together as best you can, and then re-factor under the covers.
Secondly, that means you need to build versioning into your API as a first-class concern - so you can tell clients that in version 1, features x, y and z throw "NotImplemented" exceptions. Each version should be backwards compatible, but add new features. That way, you can refactor features in version 1 as long as you don't break the service, and implement feature x in version 1.1, feature y in version 1.2 etc. Ideally, have a roadmap for your versions, and notify the client app owners if you're going to stop supporting a version, or release a breaking change.
Thirdly, a set of automated integration tests for your API is the best investment you can make - they confirm that you've not broken features as you refactor.
Hope this is of some use - I don't think there's a single, straightforward answer to your question.

Unity of work used with more database

I am learning some patterns, Unity of work / repository... There are some examples on web, but no one connects to more than one database.
In my applications almost always I have the need to get some object from a database (for example users) and some other object from another, how can I use the patterns ? (Since I am a novice on this subject an explicit example is a must)
Thank you!
As a general reference I advise you and anyone interested to visit http://martinfowler.com/eaaCatalog/index.html
which has a collection of UML and explanation over the most common Design Pattern for enterprises software
In your specific case Unity of work is particularly suited to work along with Data Mapper and Identity Map. I guess to understand 100% unity of work one must master also the other 2 pattern.
To answer your question I think you can create a unity of work and save it in a registry, so it is available all over the application. The unity must be a singleton since you need to ensure a central gateway to communicate with the database. Inside your unity of work you will have an identity map which is a collection of valued Objects in memory representing your model which is responsible to maintain the object states during all the application's operations. The unity will be used by your service layer to perform CRUD operations over the model and commit these changes.
To work with more databases I guess you need to leverage some sort of namespaced access to the object stored in the Identity map. You have the choice where to namespace: unity of work or identity map. The decision is really up to your application and use cases. You might need to connect to different DBs for splitting responsibilities between read and write or trying to integrate heterogeneous data sources.
An alternative would be to inject the DB object into the unity of work methods, in this case the application has 100% control over which database is used.
The repository pattern as far as I understand helps you to abstract to the storage of you model and it is particularly useful when you are working with heterogeneous data sources of you must provide such a flexibility to your application. Therefore I guess it is quite different from unity of work and Data Mapper layer.

Best Practices for Managing/Loading Data in Silverlight Client from Server

I am developing a Business Application which is flooded with data. For each functionality, I have one ViewModel and for each of this ViewModel, I create one Separate Db Context object. To be more clear ..
i.e. There are almost 5 to 8 Functionality where I need Customers list. and to get them, I create Separate Db Context and Load Separate List from Server in each ViewModel. there are lots of redundant Data Download with multiple Db trips. which occupies too much space RAM and slows down the performance. It might affect Performance in many difference ways.
I was wondering What is the Best Practices for handling such massive data and optimize the performance of the Application?
I think one solution is to maintain common data pool across whole application But am little bit confused how to design it properly so that it won't create some other bottleneck for the Application. And There must be some Standard solution for this too..
Thank you so much for your time and help.
One option is to create a SharedViewModel as a singleton and inject it into ViewModels that need the shared data. I do this and it works well.
Another option is to use something like SterlingDB, a local document database for SL/WP7, and store the data in isolated storage.

How to abstract my business model using dbExpress and Delphi (maybe DataSnap as well)?

If my question isn't clear, please help me improve it with comments.
I'm new to Delphi and dbExpress and I'm just getting acquaintance with TSQLDataset, TDataSetProvider, TClientDataSet and TDataSource classes.
The project that I'm working on uses this components in a way that feels strange to me. There is a huge data module unit that contains lots and lots of the quartet of classes previously described. I'm guessing that there are better (and more modularized) ways of doing this. DataSnap is used only to place this data module in a server application, so the clients access the data through it.
So, let me try to explain some of my doubts:
What is the role of each one of this classes? I read the documentation but I can't get a practical insight on this subject (specially about TDataSetProvider).
Which classes should be in the data module and which should be in my forms?
Is it possible to create an intermediate layer to abstract my business model from my database setup (maybe creating functions that return immutable datasets?)?
If so, is it wise to use DataSnap to do so?
I'm sorry if I'm not clear enough. Thanks in advance.
Components
TDataSource is the bridge between data-aware controls and the dataset (TDataSet descendant) from which they are to get their values.
TClientDataSet is one such dataset. TClientDataSet can be used in isolation, for example to access data contained in xml files, but can also be hooked up to a TDataSetProvider.
TDataSetProvider is the bridge between the in-memory TClientDataSet and the actual dataset that takes its data from a database through some kind of driver. In Client Server development you will usually see a TRemoteDataSetProvider (name may be different, I don't work with these components that often), which bridges the gap between the client and server.
TSQLDataSet is the actual dataset getting its data from some database.
It would feel strange to me to see this quartet all in one executable. I would expect the TSQLDataSet on the server side hooked up to the TRemoteDataSetProvider's counter part. However, I guess that with an embedded database it could be a way to support the briefcase model, which is where TClientDataSet is really helpful (TClientDataset is very powerful, that is just one of its strong points.)
Single datamodule
Ouch. A single huge datamodule is lazy programming or the result of misconceptions about how to use datamodules. It is perfectly normal to have a single datamodule that "hosts" the database connection which is then used by various other datamodules that are more tightly focused on aspects of the application.
Domain abstraction
With regard to abstracting your business model, dbexpress and datasnap should really not be anywhere in your business model. They should be part of your data layer.
TDataSource, TClientDataSet and custom TDataSetProvider descendant(s) can be used to leverage the power of the data-aware controls in the UI while still keeping the UI separate from the business model. In this case the custom TDataSetProvider would be the bridge between the client dataset and the collections and instances in the domain layer.
Even so, I would still expect to see a separate data layer, using TRemoteDataSetProviders or straight TDataSet descendants (like TSQLDataSet) to provide the domain layer with its data.
The single huge data module you mentioned could be part of that data layer, with the client datasets providing the business layer with its data. As you also mention TDataSource as part of a common quartet, the application was probably developed in a RAD-data-aware fashion where UI controls are basically hooked up straight to database columns/tables.
If you want to transform this application to have a more layered architecture, tread carefully and slowly. Get to know the current architecture first and get to know it well enough to see the impact that this kind of transformation would have. The links provided by Serg will certainly help you there. Pawel Glowacki has written extensively about DataSnap.
The project you are working on is a Delphi Multitier Database Application
Your question is too wide for SO and hardly can be answered in its current form - you should learn to understand the underlying architecture.
You can start from video tutorial by Pawel Glowacki:
http://www.youtube.com/watch?v=B4uxLLIUddg
http://cc.embarcadero.com/item/28188

Resources