My application has a navigation framework where each data entry screen is implemented as a UserControl. Some of these UserControls are very large, and on slow systems they can take a few seconds to construct.
In an attempt to speed things up, I'm considering caching these control instances instead of instantiating them each time I need them. In tests, this definitely seems to speed things up.
Very simplified, I'm essentially doing something like this:
public void Navigate(Type pageType)
{
    Control page = GetOrCreatePage(pageType);
    hostPanel.Clear();
    hostPanel.Add(page);
}

private Control GetOrCreatePage(Type controlType)
{
    if (!cache.Contains(controlType))
        cache.Add(controlType, (Control)Activator.CreateInstance(controlType));
    return (Control)cache[controlType];
}
My question is - should I be doing this? I'm only asking because I don't see many examples of it, and I've had to do a fair amount of refactoring and rethinking to support the control instances hanging around (managing disposal, event subscriptions, etc.). I guess it just feels different, so I want to make sure this is OK and I'm not stepping in something I'll smell later...
Any feedback is appreciated.
I would consider why the instances take so long to create. Most probably there's some data that takes time to load, and it would probably be a better idea to cache that data instead.
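As a rough illustration of that idea - a minimal sketch only, where GetOrLoadData, dataCache and the loadData delegate are hypothetical names for whatever your slow loading code is - you could keep the loaded data keyed by page type and construct the control fresh each time:

// Hypothetical sketch: cache the slow-to-load data, not the control itself.
// Requires System and System.Collections.Generic.
private static readonly Dictionary<Type, object> dataCache = new Dictionary<Type, object>();

private object GetOrLoadData(Type pageType, Func<object> loadData)
{
    object data;
    if (!dataCache.TryGetValue(pageType, out data))
    {
        data = loadData();        // the expensive part happens only once
        dataCache[pageType] = data;
    }
    return data;
}

The control is then constructed normally on every navigation and handed the cached data, which sidesteps the disposal and event-subscription concerns of keeping live controls around.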
Let's say you are working on a big Flink project, and you keyBy the client IP addresses of your customers.
Then you realize that you are filtering on the same thing in different places in the code, like this:
public void calculationOne(){
    kafkaSource.filter(isContainsSmthA).keyBy(clientip).process(processA).sink(...);
}

public void calculationTwo(){
    kafkaSource.filter(isContainsSmthA).keyBy(clientip).process(processB).sink(...);
}
And assume that there are many occurrences of kafkaSource.filter(isContainsSmthA)..
Does this structure lead to a performance issue in Flink?
Would it be much better if I did something like the below?
public Stream filteredA(){
    return kafkaSource.filter(isContainsSmthA);
}

public void calculationOne(){
    filteredA().keyBy(clientip).process(processA).sink(...);
}

public void calculationTwo(){
    filteredA().keyBy(clientip).process(processB).sink(...);
}
It depends a bit on how it should behave operationally.
The first way is more friendly to the Kafka cluster: all records are read once. The filter itself is a very cheap operation, so you don't need to worry too much about it. However, the big downside of this approach is that if one calculation is much slower than the others, it will slow them down. If you do not process historic events, it shouldn't matter, as you'd size your application cluster to keep up with all events anyway. Another current downside is that if you have a failure in calculationTwo, tasks in calculationOne are also restarted. The community is actively working to mitigate that, though.
The second way would allow only the affected source -> ... -> sink subtopology to be restarted. So if you expect frequent restarts or need to guarantee certain SLAs, this approach is better. An extension is to actually have separate Flink applications for each of these pipelines. You can share the same jar, but use different arguments to select the correct pipeline on submission. This approach also makes updating of applications much easier as you would only experience downtime for the pipeline that you actually modify.
I might do something like below, where a simple wrapper operator can run data through two different functions, and generate two side outputs.
SingleOutputStreamOperator comboResults = kafkaSource
    .filter(isContainsSmthA)
    .keyBy(clientip)
    .process(new MyWrapperFunction(processA, processB));

comboResults
    .getSideOutput(processATag)
    .sink(...);

comboResults
    .getSideOutput(processBTag)
    .sink(...);
Though I don't know how that compares with what Arvid suggested.
When I started working on my current project, I was given quite an arduous task - to build something that, in essence, is supposed to replace a big spreadsheet people use internally at my company.
That's why I thought a paginated table would never work, and quite honestly I think pagination is stupid. Displaying dynamically changing data in a paginated table is lame: an item on page #2 can, with the next data update, land on page whatever.
So we needed to build a grid with nice infinite scroll. Don't get me wrong, I've tried many different solutions. First I built a vanilla ng-repeat thing and tried using ng-infinite-scroll, and then ng-scroll from UI.Utils. That quickly got me to the point where scrolling became painfully slow, and I hadn't even used any crazy stuff like complicated cell templates, ng-ifs or filters. Very soon performance became my biggest pain. When I started adding things like resizable columns and custom cell templates, no browser could handle all those bindings anymore.
Then I tried ng-grid, and at first I kinda liked it - easy to use, and it has a few nice features I needed - but soon I realized ng-grid is awful. The current version is stuffed with bugs, all the contributors have stopped fixing them and switched to working on the next version, and only God knows when that will be ready to use. ng-grid turned out to be pretty much worse than even vanilla ng-repeat.
I kept trying to find something better. trNgGrid looked good, but it's way too simplistic and doesn't offer the features I was looking for out of the box.
ng-table didn't look much different from ng-grid; it probably would have caused me the same performance issues.
And of course I needed to find a way to optimize bindings. I tried bind-once - I wasn't satisfied, the grid was still laggy. (upd: Angular 1.3 offers the {{::foo}} syntax for one-time binding)
Then I tried React. The initial experiment looked promising, but in order to build something more complicated I'd need to learn React specifics; besides, the whole thing feels kinda non-Angularesque, and who knows how to test directives built with Angular + React. All my efforts to build nice automated testing failed - I couldn't find a way to make React and PhantomJS like each other (which is probably more Phantom's problem - is there a better headless browser?). Also, React doesn't solve the "appending to DOM" problem - when you push new elements into the data array, the browser blocks the UI thread for a few milliseconds. That, of course, is a completely different type of problem.
My colleague (who's working on the server side of things), after seeing my struggles, grumbled that I'd already spent too much time trying to solve performance problems. He made me try SlickGrid, telling me stories about how it is freakin' zee best grid widget. I honestly tried it, and quickly wanted to burn my computer. That thing completely depends on jQuery and a bunch of jQueryUI plugins, and I refuse to suddenly drop back to the medieval times of web development and lose all the Angular goodness. No, thank you.
Then I came across ux-angularjs-datagrid, and I really, really, really liked it. It uses some smart bad-ass algorithm to keep things very responsive. The project is young, yet it looks very promising. I was able to build a basic grid with lots of rows (I mean a huge number of rows) without straying too far from the way of Angular zen, and scrolling stayed smooth. Unfortunately it's not a complete grid-widget solution - you won't have resizable columns and other things out of the box, the documentation is somewhat lacking, etc.
I also found this article, and had mixed feelings about it: these guys applied a few undocumented hacks to Angular, and most probably those will break with future versions of Angular.
Of course there are at least a couple of paid options, like Wijmo and Kendo UI. Those are compatible with Angular; however, the examples shown are quite simple paginated tables, and I'm not sure it's even worth trying them - I might end up having the same performance issues. Also, you can't selectively pay just for the grid widget; you have to buy the entire suite - full of stuff I'll probably never use.
So, finally, to my question - is there a good, proven, less painful way to get a nice grid with infinite scrolling? Can someone point to good examples, projects or web pages? Is it safe to use ux-angularjs-datagrid, or is it better to build my own thing using Angular and React? Has anybody ever tried the Kendo or Wijmo grids?
Please don't vote to close this question. I know there are a lot of similar questions on Stack Overflow, and I've read through almost every single one of them, yet the question remains open.
Maybe the problem is not with the existing widgets but with the way you use them.
You have to understand that with over 2000 bindings, Angular's digest cycles can take too long for the UI to render smoothly. In the same vein, the more HTML nodes you have on your page, the more memory you use, and you may hit the browser's capacity to render that many nodes smoothly. This is one of the reasons people use this "lame" pagination.
In the end, what you need to do to get something "smooth" is to limit the amount of data displayed on the page. To make that transparent, you can do pagination on scroll.
This plunker shows you the idea, with smart-table. When scrolling down, the next page is loaded (you would have to implement the previous-page loading when scrolling up), and at any time the maximum number of rows is 40.
function getData(tableState) {
    //here you could create a query string from tableState
    //fake ajax call
    $scope.isLoading = true;
    $timeout(function () {
        //if we reset (like after a search or an order)
        if (tableState.pagination.start === 0) {
            $scope.rowCollection = getAPage();
        } else {
            //we load more
            $scope.rowCollection = $scope.rowCollection.concat(getAPage());
            //remove first nodes if needed
            if (lastStart < tableState.pagination.start && $scope.rowCollection.length > maxNodes) {
                //remove the first nodes
                $scope.rowCollection.splice(0, 20);
            }
        }
        lastStart = tableState.pagination.start;
        $scope.isLoading = false;
    }, 1000);
}
This function is called whenever the user scrolls down and reaches a threshold (throttled, of course, for performance reasons), but the important part is where you remove the first entries from the model once you have loaded more than a given amount of data.
I'd like to bring your attention to Angular Grid. I had exactly the same problems as you describe, so I ended up writing (and sharing) my own grid widget. It can handle very large datasets and it has excellent scrolling.
Abstract: Is MapReduce a good idea for processing a collection of data from the database, rather than for finding the answer to some complex (or just big) question?
I would like to sync a set of syndication sources (e.g. URLs like http://xkcd.com/rss.xml ), which are stored in GAE's datastore as a collection/table. I see two options; one is straightforward: make simple tasks which you put in a queue, where each task handles 100 or 1000 or whatever natural number of sources seems to fit in one task. The other option is MapReduce.
In the latter case, the Map does everything, and the Reduce does nothing. Moreover, the map has no result, it just alters the 'state' (of the datastore).
@Override
public void map(Entity entity) {
    String url = (String) entity.getProperty("url");
    for (Post p : www.fetchPostsFromFeed(url)) {
        p.save();
    }
}
As you can see, one source can map to many posts, so my map might as well be called "Explode".
So there are no emits and nothing for reduce to do. The reason I like this map approach is that I tell Google: here, take my collection/table, split it however you see fit across different mappers, and then store the posts wherever you like. The datastore uses 'high replication', so availability of the data is high, and carefully choosing which 'computational unit' handles which entity doesn't really reduce network communication. The same goes for saving the posts, as they need to go to all datastore units anyway. What I do like is that MapReduce has some fault recovery for map computations that get stuck, and that it knows how many tasks to send to which node, instead of me queueing some number of entities somewhere and hoping it makes sense.
Maybe my way of thinking here is wrong - in which case, please correct me. Anyhow, is this approach 'wrong' because of the lack of a reduce step and the map being an 'explode'?
Nope, Map does pretty much the same thing as your manual enqueuing of tasks.
This is a C# 3.0 question. Can I use reflection or the memory management classes provided by the .NET Framework to count the total number of alive instances of a certain type in memory?
I can do the same thing using a memory profiler, but that requires extra time to dump the memory and involves third-party software. What I want is only to monitor a certain type, and I want a lightweight method that fits easily into unit tests. The purpose of counting the alive instances is to ensure I don't have any unexpected living instances causing a "memory leak".
Thanks.
To do it entirely within the application you could keep an instance counter, but it would need to be explicitly coded and managed inside each class--there's no silver bullet that I'm aware of that lets you query the framework from within the executing code to see how many instances are alive.
What you're asking for is really the domain of a profiler. You can purchase one or build your own, but it requires your application to run as a child process of the profiler. Rolling your own isn't an easy undertaking, by the way.
If you want to consider the instance counter it would have to be something like:
public class MyClass : IDisposable
{
    private static readonly object _ClassInstancesLock = new object();
    private static int _ClassInstances;

    public MyClass()
    {
        lock (_ClassInstancesLock)
        {
            ++_ClassInstances;
        }
    }

    public void Dispose()
    {
        lock (_ClassInstancesLock)
        {
            --_ClassInstances;
        }
    }

    public static int ClassInstances
    {
        get
        {
            lock (_ClassInstancesLock)
            {
                return _ClassInstances;
            }
        }
    }
}
This is just a really rough sample (thread safety is critical for this type of approach), and it leaves the door wide open for Dispose to be called and the instance counter to decrement while the object is never properly GC'd. To diagnose that bundle of joy you'll need, you guessed it, a professional profiler--or at least WinDbg.
Edit: I just noticed the very last line of your question and need to say that my approach above, as shoddy and failure-prone as it is, is almost guaranteed to deceive and lie to you about the true number of instances if you're experiencing a leak. The best tool, IMO, for attacking these problems is ANTS Memory Profiler. Version 5 is a double-edged sword in that they broke the Performance and Memory profilers into two separate SKUs (they used to be bundled together), but Memory Profiler 5.0 is absolutely lightning fast. Profiling these problems used to be slow as molasses, but they've gotten around that somehow.
Unless this is for a personal project with zero intent of redistribution, you should invest the few hundred dollars needed for ANTS--but by all means use its trial period first. It's a great tool for exactly this kind of analysis.
The only way I see to do this without any form of instrumentation is to use the CLR Profiling API to track object lifetimes. I'm not aware of any APIs available to managed code to do the same thing, and, so far as I know, the CLR doesn't keep a list of live objects anywhere (so even with the profiler API you have to build the data structures for that yourself).
VB.NET has a feature that lets you track objects in the debugger, but it actually emits additional code specifically for that (which basically registers all created objects in an internal list of weak references). You could do the same, using e.g. PostSharp to post-process your assemblies.
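For illustration, here is a minimal sketch of that weak-reference idea done by hand rather than via post-processing. InstanceTracker and Register are hypothetical names, and every tracked class would have to call Register(this) from its constructor, so this is still explicit instrumentation:

using System;
using System.Collections.Generic;
using System.Linq;

public static class InstanceTracker
{
    private static readonly object _lock = new object();
    private static readonly List<WeakReference> _instances = new List<WeakReference>();

    // Call this from the constructor of every type you want to monitor.
    public static void Register(object instance)
    {
        lock (_lock)
        {
            _instances.Add(new WeakReference(instance));
        }
    }

    // Counts instances of T that are still alive. In a unit test you would
    // typically call GC.Collect() and GC.WaitForPendingFinalizers() first so
    // unreachable-but-not-yet-collected objects don't inflate the count.
    public static int CountAlive<T>()
    {
        lock (_lock)
        {
            _instances.RemoveAll(wr => !wr.IsAlive);   // prune dead entries
            return _instances.Count(wr => wr.Target is T);
        }
    }
}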
Basically I'm working on a program that processes a lot of large video and image files, and I'm struggling with the memory management side of it because I've never dealt with anything quite like this before.
For instance, it stores all these images in a database, and loads a list of videos, and then you can switch between the videos and view images from the video. Right now, it's keeping all of those images in memory all the time, which is eating up a lot of space. I know I can lazy load the images, but once you've switched back and forth you get all of them stuck in memory.
I want to take advantage of the WPF databinding functionality and MVVM as much as possible, but if I need to look at a different architecture I will.
I'm just looking for general advice, tips, links to articles, or anything that could help.
One of the things you could look at is data virtualization, which is not provided in WPF by default (WPF provides UI virtualization instead). Data virtualization means "load and bind the data for an item / range of items while it is visible, then unload it when it is not".
Here's a great article that describes a concrete implementation that you may be able to use as-is or adapt:
http://www.codeproject.com/KB/WPF/WpfDataVirtualization.aspx
It sounds like the main problem you're having is not so much the performance-intensiveness of the application (which things like fixed-size buffers and static allocation will help with) but its overall memory footprint. The way to control that is with virtualization.
Lazy loading gets you halfway there: you don't actually create the object until something needs it. That's fine, but the longer the user works with the application and the more objects he visits in the UI, the more objects get created, and eventually the application runs out of memory.
So you want to throw away objects that the user doesn't need anymore. Figuring out which objects the user doesn't need can be a hard problem, but it can also be as easy as assuming that the user doesn't need the object that he used least recently. You use a least-recently-used (LRU) cache to do this.
This is totally consistent with the MVVM pattern. In your view class, you make your property getter for the object use this pseudocode:
if object hasn't been loaded
load object
add object to the LRU cache (whether you loaded it or not)
return object
The LRU cache I wrote keeps a simple queue of the objects it contains. When you add an object to the cache, if it's not already in the queue it gets added to the back, and if it is already in the queue it gets moved to the back.
If the queue's at its capacity when you add an object, it pops off whatever is at the front of the queue (which is the one that was used least recently) and raises the DiscardingOldestItem event.
This event is the object's chance to tell anything that holds a reference to it (i.e. the view object that it's a property of) that it needs to be discarded (probably by raising an event of its own). The view object's event handler should first raise the PropertyChanged event. If the property getter gets called when it does this, there's a binding somewhere that's still looking at the property, so it shouldn't be discarded yet. (Also, since the getter was called, the object just got moved to the back of the queue.) Otherwise, it can be thrown away.
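As a rough sketch of what that cache might look like (the class and event names here mirror the description above but are my assumption, not the actual implementation):

// Minimal sketch of the LRU cache described above; LruCache and
// DiscardingOldestItem are assumed names, not an existing API.
using System;
using System.Collections.Generic;

public class LruCache<T>
{
    private readonly int _capacity;
    private readonly LinkedList<T> _queue = new LinkedList<T>();

    public event EventHandler<DiscardingEventArgs<T>> DiscardingOldestItem;

    public LruCache(int capacity)
    {
        _capacity = capacity;
    }

    public void Add(T item)
    {
        // If the item is already in the queue, move it to the back;
        // otherwise append it to the back.
        LinkedListNode<T> node = _queue.Find(item);
        if (node != null)
            _queue.Remove(node);
        _queue.AddLast(item);

        // Over capacity: pop the least recently used item off the front
        // and give whatever owns it a chance to discard it.
        if (_queue.Count > _capacity)
        {
            T oldest = _queue.First.Value;
            _queue.RemoveFirst();
            EventHandler<DiscardingEventArgs<T>> handler = DiscardingOldestItem;
            if (handler != null)
                handler(this, new DiscardingEventArgs<T>(oldest));
        }
    }
}

public class DiscardingEventArgs<T> : EventArgs
{
    public DiscardingEventArgs(T item) { Item = item; }
    public T Item { get; private set; }
}

Your property getter from the pseudocode above would call Add on this cache every time it runs, so frequently bound objects keep drifting to the back of the queue and only the stale ones get discarded.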
(Note that if you have more objects visible in the UI than the cache can hold, this little dance becomes an infinite loop and you'll get a stack overflow.)
A more sophisticated approach would have the LRU cache start discarding old items when the application started running low on memory (it uses a fixed capacity right now). That's a straightforward change, but if you make that change, the scenario described in the previous paragraph is something you need to give more thought to; one very large object could result in the whole UI going kablooey.
It seems that to increase raw performance you would actually want to avoid patterns. They have their uses, don't get me wrong, but if you're trying to blast video at the highest performance possible, the last thing you need to do is introduce abstraction layers that are designed to help you write higher-quality code, not to increase application performance.
This article on InformIT has a lot of good info on the subject, although it is more C and C++. It suggests:
Static Allocation Pattern: Allocates memory up front
Pool Allocation Pattern: Preallocates pools of needed objects
Fixed Sized Buffer Pattern: Allocates memory in same-sized blocks
Smart Pointer Pattern: Makes pointers reliable
Garbage Collection Pattern: Automatically reclaims lost memory
Garbage Compactor Pattern: Automatically defragments and reclaims memory
"I know I can lazy load the images,
but once you've switched back and
forth you get all of them stuck in
memory."
This is not true, to my understanding. The images can be garbage collected just like anything else, by removing all references to them. Are you sure you don't have a reference to them somewhere? Try a memory profiler like memprofiler or ANTS to see what's happening.
To those who have found this question looking for general patterns (not WPF-specific) to reduce memory: the famous one (which I have never seen used!) is the Flyweight pattern.