I've gone through the ATK4 documentation trying to find a reference point on how to handle caching - partial or full page.
There seems to be no entry on the matter, which is strange for a framework built for scalability. Is there a way to cache DB queries, pages, views, etc.?
Thanks for your question. (I'm the author of ATK4).
In my opinion, scalability and caching are two different topics and can be addressed separately. The framework approaches scalability by optimising queries and minimising the load of each request, and by a design in which multiple nodes can be used to scale your application horizontally and seamlessly. There is also the option of adding a reverse proxy to cache pages before they even hit the web server.
Agile Toolkit has support for two types of caching:
View-level caching
As you will read in the documentation on object render trees, the framework initialises and renders recursively, so if you add caching support at the Page level, you will be able to intercept rendering and retrieve the page's contents from cache. You can also cache individual views.
Here is a controller which can be used to implement caching for you:
https://github.com/agile55/viewcache
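To make the idea concrete, here is a language-agnostic sketch (written in Kotlin) of what view-level caching boils down to: before rendering a view subtree, check a cache keyed by the view. The View interface, cacheKey and htmlCache names are illustrative, not ATK4 APIs.

    // Hypothetical sketch of render-tree caching: serve a view's HTML from
    // cache when present, render (and store) it otherwise.
    interface View {
        val cacheKey: String      // stable identifier for this view's content
        fun render(): String      // recursive render of the view subtree
    }

    val htmlCache = HashMap<String, String>()

    fun renderCached(view: View): String =
        htmlCache.getOrPut(view.cacheKey) { view.render() } // render only on a cache miss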
Model-level caching
Sometimes you will want to cache your model data so that, instead of retrieving it from a slow database, you can fetch it quickly from a faster source. Agile Toolkit has support for multiple model data sources, where a cache is queried first and refreshed if it doesn't contain the data. Here you can find more information or ask further questions:
http://book.agiletoolkit.org/model/core.html#using-caching
http://forum.agiletoolkit.org/t/is-setcache-implemented/62
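As a rough illustration of the "query the cache first, refresh on a miss" flow described above, here is a minimal Kotlin sketch; the DataSource interface and class names are hypothetical stand-ins for the framework's data sources.

    // Cache-aside over two model data sources: a fast cache and a slow database.
    interface DataSource {
        fun load(id: Int): Map<String, Any>?
        fun save(id: Int, row: Map<String, Any>)
    }

    class CachingModel(private val cache: DataSource, private val db: DataSource) {
        // Query the cache first; on a miss, fall back to the DB and refresh the cache.
        fun load(id: Int): Map<String, Any>? =
            cache.load(id) ?: db.load(id)?.also { cache.save(id, it) }
    }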
Other Ideas
Given the object-oriented nature of ATK4, you can probably come up with new ways to cache data. If you have any interesting ideas, our community would love to hear them.
I'm wondering how to add caching when using the Kotlin-Exposed library for SQL access.
For experimentation I've written a small application using both Spring Boot + Hibernate, and KTOR + Exposed.
I did some load testing, and when POSTing to both versions of the application, performance is quite similar, with the Ktor + Exposed version having the edge.
However, when GETting an existing record from both versions, the difference is shocking, especially as the database grows - and all the time is spent in Postgres.
My conclusion is that the difference can only come from the Hibernate second-level cache that Spring Boot configures.
Seeing the value of caching for items that are repeatedly queried across multiple transactions/sessions, I'm wondering how to configure this in the lower-level Exposed framework?
At the moment Exposed supports only transaction-scoped entity caching.
There is also ImmutableCachedEntityClass, which allows you to define some entities (mostly dictionary-like ones) as cached and share them across the application.
You have to manage cache invalidation manually with the expireCache() function, or update entities with forceUpdateEntity.
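For example, a dictionary-like entity could be declared roughly like this (a sketch assuming a recent Exposed version; the table and entity names are illustrative):

    import org.jetbrains.exposed.dao.ImmutableCachedEntityClass
    import org.jetbrains.exposed.dao.IntEntity
    import org.jetbrains.exposed.dao.id.EntityID
    import org.jetbrains.exposed.dao.id.IntIdTable
    import org.jetbrains.exposed.sql.transactions.transaction

    object Currencies : IntIdTable("currencies") {
        val code = varchar("code", 3)
    }

    class Currency(id: EntityID<Int>) : IntEntity(id) {
        // Rows are cached in memory after the first read and shared application-wide
        companion object : ImmutableCachedEntityClass<Int, Currency>(Currencies)
        val code by Currencies.code
    }

    fun example() {
        transaction {
            val usd = Currency.all().first()                         // served from the shared cache once warm
            Currency.forceUpdateEntity(usd, Currencies.code, "USD")  // write-through update of a cached entity
        }
        Currency.expireCache()  // manual invalidation when the data changes externally
    }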
Proper caching in the age of distributed systems is not easy to implement. You may use any caching library (e.g. Caffeine) and invalidate the cache when you know your data has changed (perhaps with the help of Exposed's StatementInterceptors).
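A minimal sketch of that approach with Caffeine might look like this; UserDto and loadUserFromDb are hypothetical placeholders for your own entity and Exposed query:

    import com.github.benmanes.caffeine.cache.Caffeine
    import java.time.Duration

    data class UserDto(val id: Int, val name: String)

    // Placeholder for a real Exposed query running inside a transaction
    fun loadUserFromDb(id: Int): UserDto = UserDto(id, "user-$id")

    val userCache = Caffeine.newBuilder()
        .maximumSize(10_000)
        .expireAfterWrite(Duration.ofMinutes(5))  // time-based expiry as a safety net
        .build<Int, UserDto>()

    fun findUser(id: Int): UserDto =
        userCache.get(id) { key -> loadUserFromDb(key) }  // load from the DB only on a miss

    fun onUserChanged(id: Int) = userCache.invalidate(id) // e.g. driven by a StatementInterceptor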
If you manage to implement a good caching solution, feel free to send a PR to the project.
So I'm designing a blog engine, and I'm trying to keep just my own blog data, without considering comments, a membership system, or any other type of multi-user data.
The blog itself revolves around two types of data. The first is the actual blog post entry, which consists of: title, post body, and metadata (mostly dates and statistics), so it's really simple and can be represented by a simple JSON object. The second type of data is the blog admin configuration and personal information. Comments and similar features will be implemented using Disqus.
My main concern here is the ability of such an engine to scale under spikes in traffic (I know you might argue this, but let's take it for granted). Since I started this project I've been moving along well with the rest of my stack, except the data layer. I've been facing a dilemma choosing the database. I considered MongoDB, but some reviews, articles, and benchmarks suggested slow reads once collections reach a certain size. Next I looked at Redis and its persistence features, RDB and AOF; while Redis is good at both fast reading and fast writing, I'm wary of using it because I'm not familiar with it. And this whole search keeps leading to things like "PostgreSQL 9.4 is now faster than MongoDB for storing JSON documents", etc.
So is there any way I can settle this issue for good, considering that I only need to represent my data in a key/value structure, only require fast reads (not writes), and need fault tolerance?
Thank you
If I were you I would start small and not try to optimize for big data just yet. A lot of the blog posts about the downsides of NoSQL solutions concern large data sets, or people trying to do relational things with a database designed for denormalized data.
My list of databases to consider:
Mongo. It has huge community support and, based on recent funding, it's going to be around for a while. It runs very well on a single instance and in a basic replica set. It's easy to set up and free, so it's worth spending a day or two running your own tests to settle the issue once and for all. Don't trust a blog.
Couchbase. Supports key/value storage and also has persistence to disk. http://www.couchbase.com/couchbase-server/features It has also had some recent funding, so hopefully that means stability. =)
CouchDB/PouchDB. You can use PouchDB purely on the client side and it can connect to a server side CouchDB. CouchDB might not have the same momentum as Mongo or Couchbase, but it's an actively supported product and does key/value with persistence to disk.
Riak. http://basho.com/riak/. Another NoSQL that scales and is a key/value store.
You can install and run a proof-of-concept on all of the above products in a few hours. I would recommend this for the following reasons:
A given database might scale and tick all your boxes, yet be unpleasant to use. Consider picking a database that feels fun! Sort of akin to picking Ruby/Python over Java because the syntax is nicer.
Your use case and domain will be fairly unique. Worth testing various products to see what fits best.
Each database has quirks, and you won't find them until you actually try one. One might have quirks that are passable; another will have quirks that are a show-stopper.
The benefit of trying all of them is that they all support schemaless data, so if you write JSON, you can use all of them! No need to create objects in your code for each database.
If you abstract the database correctly in code, swapping out data stores won't be that painful; in other words, your code will be happier if you make it easy to swap them out.
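A minimal sketch of such an abstraction (in Kotlin; the names are hypothetical) shows how little the rest of the code needs to know about the store:

    // The engine depends only on BlogStore, so the backing database can be swapped freely.
    interface BlogStore {
        fun get(key: String): String?        // a JSON document by key
        fun put(key: String, json: String)
    }

    class InMemoryStore : BlogStore {
        private val map = HashMap<String, String>()
        override fun get(key: String) = map[key]
        override fun put(key: String, json: String) { map[key] = json }
    }

    // A MongoStore, RedisStore or CouchbaseStore would implement the same interface.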
This is only an option for really simple CMSes, but it sounds like that's what you're building.
If your blog is as super-simple as you describe and your main concern is very high traffic, then the best option might be to avoid a database entirely and have your CMS generate static files instead. By doing this, you eliminate all your database concerns completely.
It's not the best option if you're doing anything dynamic or complex, but in this small use case it might fit the bill.
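As a rough sketch of the static-file idea (in Kotlin; the Post type and publish function are hypothetical), publishing a post just renders it once to a file the web server can serve directly:

    import java.io.File

    data class Post(val slug: String, val title: String, val body: String)

    // Render at publish time; reads never touch a database.
    fun publish(post: Post, outDir: File) {
        val html = """
            <html><head><title>${post.title}</title></head>
            <body><h1>${post.title}</h1><div>${post.body}</div></body></html>
        """.trimIndent()
        File(outDir, "${post.slug}.html").writeText(html)
    }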
Background
I need to build a rich-client application using .NET. The app needs to handle TreeView and TableView controls with about 100,000 entities. The GUI is built with WPF, very likely using Telerik controls. My question is about the general architecture of the data layer. I have some coarse ideas of the concepts, but would highly appreciate your comments, thoughts, and hints on which technologies I should dig deeper into. Here are my thoughts:
Conceptual Layers
Presentation Layer
just the WPF controls; I need performant synchronization of different controls on property changes, but I don't anticipate major problems here.
Business Layer
creating views (object selections to be displayed in the controls), CRUD operations (modifications done directly on the POCOs), and searching (global search, but also search scoped to a view)
Repository
holds POCOs in an entity map and decides whether to load from the persistence store
Persistence-Manager
I'm thinking of using LocalDB or a simple key-value store as a (persistent) client cache. The Persistence-Manager would try to get an object from the local store, and otherwise get the data from the server; it would also persist data to the client cache. The data would be available via a web service. I'm happy to give WCF Data Services a try.
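A language-agnostic sketch (in Kotlin) of that cache-aside flow; LocalStore, RemoteService and Entity are hypothetical names, not WCF types:

    data class Entity(val id: Long, val payload: String)

    interface LocalStore {            // LocalDB or key-value client cache
        fun load(id: Long): Entity?
        fun save(e: Entity)
    }

    interface RemoteService {         // the web service facade
        fun fetch(id: Long): Entity
    }

    class PersistenceManager(private val local: LocalStore, private val remote: RemoteService) {
        // Try the local store first; otherwise fetch from the server and persist locally.
        fun get(id: Long): Entity =
            local.load(id) ?: remote.fetch(id).also { local.save(it) }
    }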
Persistence-Layers
There would be two parts:
- Local DB connection using an ORM like EF or OpenAccess; or a simple key-value store
- HTTP connection to consume the Web-Service
Questions
In a layering like this, how should lazy loading of referenced objects work? I know EF and other ORMs take care of a lot of the issues I have here too, but I don't see yet how to plug these frameworks into the above layering. Also, where should changes be tracked? Where is consistency ensured when deleting objects (e.g. deleting references to those objects as well)?
I would eager-load whole views (hierarchical structures) and run LINQ to Objects over those collections of POCOs, maybe implementing a simple inverted index if LINQ performance becomes an issue. But how should I best implement global searches on the server? Are there libraries ("LINQ to OData") available?
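For the inverted-index idea, a minimal sketch (in Kotlin; names are illustrative) could map tokens to entity ids:

    // Simple in-memory inverted index over eager-loaded entities:
    // each token maps to the set of entity ids whose text contains it.
    class InvertedIndex {
        private val index = HashMap<String, MutableSet<Long>>()

        fun add(id: Long, text: String) {
            text.lowercase().split(Regex("\\W+"))
                .filter { it.isNotEmpty() }
                .forEach { token -> index.getOrPut(token) { mutableSetOf() }.add(id) }
        }

        fun search(term: String): Set<Long> = index[term.lowercase()] ?: emptySet()
    }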
What do you think about a fully "disconnected" scenario: holding all the data a user needs in a local database, with sync on start/stop and on user request? I could use an ORM directly on the local DB, with good chances of saving a lot of headaches from trying to implement many consistency features by hand (using the above layering).
Or, in contrast, forget about the local database and eager-load most of the needed data in batches. Here I'm concerned about the performance of the web services (I have no experience with OData or WCF). I've built an app using Redis and Python that loads about 200,000 business objects quite fast (< 1 min) to the client (the objects are already cached in Redis in serialized form).
I'll certainly do some prototyping and benchmarking, but to get off to a good start, any thoughts and recommendations are highly appreciated.
Cheers,
Jan
I need to implement master/slave/load balancing into an existing site.
Does anyone use these (or other) implementations for master/slave switching?
The resources I found on how to implement master/slave in Cake:
(preferred) gamephase.net/posts/view/master-slave-datasource-behavior-cakephp
http://bakery.cakephp.org/articles/view/master-slave-support-also-with-multiple-slave-support
http://bakery.cakephp.org/articles/view/load-balancing-and-mysql-master-and-slaves-2
I'm getting number 1) to work most of the time, but it has trouble with some of the joins.
I welcome new sources, hacks, or mods for a master/slave implementation, as for now I can't get my head around it.
(Cake version I am using atm is 1.2)
(I'm cross posting this on CakePHP's google groups http://groups.google.co.uk/group/cake-php/browse_thread/thread/4b77af429759e08f)
Take a look at this tutorial regarding master/slave replication over several nodes.
http://www.howtoforge.com/setting-up-master-master-replication-on-four-nodes-with-mysql-5-on-debian-etch
This may help you understand better.
As far as I can tell, this happens if your model has relationships with models that do not use the same behaviour. Please correct me if this assumption is wrong.
All models have meta-data, which CakePHP accumulates using a DESCRIBE query on the database; if this data is not present, your joins will be broken. This meta-data is specific to each database config.
CakePHP uses this meta-data to populate the $this->_schema property. SQL joins are built with data from $this->_schema, and I guess this is where your issue lies: the database connections introduced by this MasterSlave switch behaviour do not have any model meta-data for the tables associated with the model.
A solution would be to update your behaviour so that it switches selectively on reads and writes only, and to add the behaviour to all related models, i.e. any model that is related using hasOne, hasMany, etc. should also use the same behaviour.
In essence, all models that are related should write to the same database and read from the same database; a sketch of the switching follows below.
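The switching itself boils down to routing by statement type; here is a language-agnostic sketch (in Kotlin; the Connection interface is a hypothetical stand-in for Cake's datasource):

    interface Connection {
        fun execute(sql: String): List<Map<String, Any?>>
    }

    // Route reads to the slave and everything else to the master,
    // so all related models see a consistent pair of databases.
    class MasterSlaveRouter(private val master: Connection, private val slave: Connection) {
        fun run(sql: String): List<Map<String, Any?>> {
            val isRead = sql.trimStart().startsWith("SELECT", ignoreCase = true)
            return (if (isRead) slave else master).execute(sql)
        }
    }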
The bonus of this solution is that you will share the same database connections.
Your web app seems to be multi-tier, so you need to scale each tier individually:
The web layer, i.e. the CakePHP app, can be spread across multiple web servers. This is easy to do, as the code itself is stateless. You should look into how to load-balance Apache servers; it is not a big deal. Web servers have quite high throughput, though, so if you have a bottleneck here, you might improve your code/caching strategy instead (use memcache instead of file caches, for example). If you depend on the file system (for uploads, for example) this becomes a bit more complex, as the storage must become distributed or be separated out.
The data layer: various tutorials on how to scale/load-balance MySQL have already been linked by others.
That said, first I would suggest running benchmarks. ("Premature optimization is the root of all evil.") You must first know where the bottlenecks are and where throughput needs to scale. Often you can optimize queries and caching, or make things cacheable in the first place. You must also be clear about your goals: scalability? fault tolerance?
I'm using SQL Server to drive a WPF application. I'm currently using NHibernate and pre-reading all the data so it's cached, for performance reasons. That works for a single-client app, but I was wondering if there's an in-memory database I could use to share the information across multiple apps on the same machine. Ideally this would sit below my NHibernate stack, so my code wouldn't have to change. Effectively, I'm looking to move my DB from its traditional format on the server to an in-memory DB on the client.
Note I only need select functionality.
I would be incredibly surprised if you even need to load all your information into memory. I say this because, just as one example, I'm working on a web app at the moment that (for various reasons) loads thousands of records on many pages. It's PHP + MySQL, and even so it can do that and render a page in well under 100 ms.
Before you go down this route, make sure that you have to. First make your database as performant as possible. This obviously includes things like having appropriate indexes and tuning your database, but even those steps are putting the cart before the horse.
First and foremost you need to make sure you have a good relational data model: one that lends itself to performant queries. This is as much art as it is science.
Also, you may like NHibernate, but ORMs are not always the best choice. There are some corner cases, for example, in which hand-coded SQL will be vastly superior.
Now, assuming you have a good data model, that you've then optimized your indexes and database parameters, and that you've properly configured NHibernate: then and only then should you consider storing data in memory, and only if performance is still an issue.
To put this in perspective, the only times I've needed to do this are on systems that need to perform millions of transactions per day.
One reason to avoid in-memory caching is that it adds a lot of complexity. You have to deal with issues like cache expiry, independent updates to the underlying data store, whether you use synchronous or asynchronous updates, how you give the client a consistent (if not up-to-date) view of your data, how you deal with failover and replication, and so on. There is a huge complexity cost to be paid.
Assuming you've done all the above and still need it, it sounds to me like what you need is a cache or grid solution. There are overviews of Java grid/cluster solutions, and many of them (e.g. Coherence, memcached) apply to .Net as well. Another choice for .Net is Velocity.
It needs to be pointed out and stressed that something like NHibernate is only consistent as long as nothing external updates the database and there is exactly one NHibernate-enabled process (barring clustered solutions). If two desktop apps on two different PCs both update the same database through NHibernate, the caching simply won't work, because each persistence unit won't be aware of the changes the other is making.
http://www.db4o.com/ can be your friend!
Velocity is an out-of-process object-caching server designed by Microsoft to do pretty much what you want, although it's only in CTP form at the moment.
I believe there are also wrappers for memcached, which can likewise be used to cache objects.
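For illustration, the object-caching pattern through a memcached wrapper looks roughly like this (a sketch using the Java spymemcached client, one wrapper I know exists; .NET clients expose a similar get/set API):

    import net.spy.memcached.MemcachedClient
    import java.net.InetSocketAddress

    fun main() {
        val client = MemcachedClient(InetSocketAddress("localhost", 11211))
        client.set("customer:42", 300, "{\"name\":\"Alice\"}")  // cache the serialized object for 300 s
        val cached = client.get("customer:42")                  // null on a cache miss
        println(cached)
        client.shutdown()
    }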
You can use SAP HANA, express edition. You can download it for free; it's in-memory and columnar, and offers further analytics capabilities such as text analytics, geospatial, and predictive. You can also access it via ODBC, JDBC, the node.js hdb library, and REST APIs, among others.