In Backbone when bootstrapping collections, how do I determine what is too much? Would it be a bad idea to bootstrap 10,000 member records and 300,000 photo records? I'm not sure how to decide how much is too much.
If all you're doing is displaying the content, just bootstrap enough to fill a generously sized page and lazy-load the rest (a sketch of this follows the list below). If you're doing something interactive (e.g. sorting or filtering) in the browser, then you might still be able to get away with that strategy, provided that
users will typically want to at least glance at the data for a few seconds before activating a sort or filter, and
your server is fast enough to fill in the remaining items (after the first "page-full") in that time.
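A minimal sketch of that split, assuming a hypothetical /api/members endpoint, a window.BOOTSTRAPPED_MEMBERS array rendered into the page by the server, and made-up paging parameters:

```javascript
// Bootstrap only the first page-full inline, then pull the rest in the background.
var Members = Backbone.Collection.extend({ url: '/api/members' });

var members = new Members();
members.reset(window.BOOTSTRAPPED_MEMBERS);      // e.g. the first ~100 rows, rendered server-side

// After the initial render, append the remaining records page by page
// without throwing away what's already on screen.
function fetchRest(page) {
  members.fetch({
    remove: false,                               // append instead of replacing
    data: { page: page, per_page: 1000 },        // hypothetical paging parameters
    success: function (collection, response) {
      if (response.length) fetchRest(page + 1);  // keep going until the server runs dry
    }
  });
}
fetchRest(2);
```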
Related
My site is ~500 KB gzipped, including JS, CSS and images. It is built on AngularJS. A lot of people in my company are complaining about the site being slow on lower bandwidths. There are a few questions I would like to get answered:
Is 500KB gzipped too high for lower bandwidths? People claim it takes 20 sec for it to load on their machine, which I believe is an exaggeration. Is it really due to AngularJS and its evaluation time?
How much does the size of the app matter on lower bandwidths? If my site is 500KB and I reduce it to 150KB by making a custom framework, would it really help me on lower bandwidth? If so, by how much?
It's all subjective, and the definition of "low bandwidth" is rather wide. However, using https://www.download-time.com/ you can get a rough idea of how long it would take to download 500KB on different bandwidths.
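The back-of-the-envelope arithmetic (transfer time only, ignoring latency and TCP slow start) looks roughly like this:

```javascript
// 500 KB ≈ 500 * 8 = 4,000 kilobits
var sizeKilobits = 500 * 8;
console.log((sizeKilobits / 512).toFixed(1) + ' s at 512 Kbps');  // ~7.8 s
console.log((sizeKilobits / 5000).toFixed(1) + ' s at 5 Mbps');   // ~0.8 s
```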
So, on any connection above 512 Kbps (minimum ADSL speeds; most are now better than 5 Mbps, and 3G mobile is around the same mark), it's unlikely that the file size is the problem.
If "low bandwidth" also implies "limited hardware" (RAM, CPU), it's possible the performance problem lies in unzipping and processing your application. Angular is pretty responsive, but low-end hardware may struggle.
The above root causes would justify rewriting the application using your own custom framework.
The most likely problem, however, is any assets/resources/templates your Angular app requires on initialization: images, JSON files, etc. This is hard to figure out without specific details, as each app is different. The good news is that you shouldn't need to rewrite your application; you should be able to use your existing application and tweak it. I'm assuming the 500KB application can't be significantly reduced in size without a rewrite, and that the speed problem is down to loading additional assets as part of start-up.
I'd use Google Chrome's Developer Tools to see what's going on. The Performance tab has an option to simulate various types of network conditions. The Network tab lets you see which assets are loaded and how long they take. I'd look at which assets take time, and see which of those can be lazy-loaded. For instance, if the application loads a very large image file on start-up, perhaps that could be lazy-loaded, allowing the application to appear responsive to the end user more quickly.
A common way to improve perceived performance is to use lazy loading.
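As a rough sketch of lazy-loading images in browsers that support IntersectionObserver (the data-src convention here is just an assumption, not something from the original app):

```javascript
// Markup holds the real URL in data-src and a tiny placeholder in src;
// the real download only starts once the image scrolls near the viewport.
var observer = new IntersectionObserver(function (entries) {
  entries.forEach(function (entry) {
    if (entry.isIntersecting) {
      var img = entry.target;
      img.src = img.getAttribute('data-src');  // kick off the real download
      observer.unobserve(img);                 // each image only needs this once
    }
  });
});

document.querySelectorAll('img[data-src]').forEach(function (img) {
  observer.observe(img);
});
```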
To decrease your load time, make sure your caching is set up properly, and use a download-time calculator to get a feel for how long your file takes to transfer; you can use https://downloadtime.org for reference. If you have any issues let me know.
As angular.js itself has a gzipped size of ~57KB, it seems much more is loaded with this initial page call (roughly 10 times the size of angular.js itself).
To decrease the page load time, try to split your JavaScript into chunks, each containing only the functionality needed for a given page, e.g. the index page.
For example, when you're using Webpack, the recommended default maximum asset size is around 244KB (see Webpack's performance hints).
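A hedged sketch of that kind of chunking with Webpack's dynamic import(); the './reports/reports.module' path and its init() function are placeholders, not part of the original app:

```javascript
// Only the index-page code ships in the main bundle; the reports code is
// split into its own chunk and fetched the first time it is needed.
function openReports() {
  return import(/* webpackChunkName: "reports" */ './reports/reports.module')
    .then(function (reportsModule) {
      reportsModule.init();   // hypothetical entry point of the lazily loaded chunk
    });
}
```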
I want to do the following:
When a user connects to their personal cabinet, they get all their data (an array of ~5000 rows). This data is stored in the 'allOrders' state and is updated (every minute) only when new orders have been added (I use websockets). Is it normal practice to keep 'big' data like this in state, or is it better to do it differently?
I've found that when you get into the thousands of items, working with data in the browser can be slow. Even if you optimize the rendering, you will likely want to do filtering and sorting to better visualize the data, and simply iterating through 5k items with a .filter or the like will noticeably affect the responsiveness of your UI and feel sluggish.
Your alternative is to work with the data server side, which of course introduces network latency that also tends to hurt performance; basically, it's unlikely that you will be able to work with a dataset this large without making the user wait for certain operations. Working with the data in the browser, however, will make the browser appear to 'hang' (i.e. not respond to user actions) during large operations, which is worse than waiting on network latency, since the latter does not lock up the browser... so there is that.
I've had success working with https://github.com/bvaughn/react-virtualized when rendering large lists. It's similar to the lib you mentioned, in that it only renders what is in view. You definitely do not want to try to render 5k things.
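For reference, a minimal sketch with react-virtualized's List (the `orders` data and its fields are placeholders): only the rows currently in view are actually mounted, so a 5k-item array stays cheap to render.

```jsx
import React from 'react';
import { List } from 'react-virtualized';

function OrdersList({ orders }) {
  return (
    <List
      width={800}
      height={400}
      rowCount={orders.length}
      rowHeight={32}
      rowRenderer={({ index, key, style }) => (
        // Only rows intersecting the viewport are rendered; `style` positions them.
        <div key={key} style={style}>
          {orders[index].id}: {orders[index].status}
        </div>
      )}
    />
  );
}
```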
I have a complex dashboard that I would like to update every minute. It is an AngularJS SPA with an IIS backend running in Azure.
The dashboard shows approximately 30-40 dashlet widgets. Each widget needs approximately 10 collections of data entities, and each collection gets about 3-5 new data points every minute.
I want to ensure that the app in the browser performs well and stays interactable (this is very important) and that my web servers are scalable (this is secondary, because I'd rather add more web servers than sacrifice speed and interactivity in the browser).
Should I update the whole dashboard at once? (1 very large call, which will probably download 1,200-1,600 data entities... probably a lot more for some users, and a lot less for others). This option puts the most strain on the web servers and is less scalable from the web server perspective. But I am not sure what the browser impact is.
Update a single dashlet widget at a time? (30-40 chunky calls, each one returning about 40 pieces of information)
Update each collection of data entities inside the dashboard individually? (About 300-400 tiny calls, each returning ~3-5 pieces of information)
One can assume that total server time to generate data for 1 large update and for 300-400 individual data points is very similar.
Timeliness of updates is not /super/ important... if some widgets update 10 seconds later and some on time, that's fine. Responsiveness of the browser and the website is important in general, so that if a user decides to interact with the dashlets, the thing should be very responsive.
Appreciate any advice
AngularJS optimizations are all about:
Making sure binding expression evaluation is fast.
Minimizing the number of things being watched.
Minimizing the number of digest cycles: the phase in which Angular checks for model changes by comparing old and new model values.
But before you begin to fix/optimize any of the above parts, it is important to understand how Angular data binding works and what digest cycles are. This SO post should help you in that regard.
Coming back to the possible optimizations.
Fixing the first one is a matter of making sure that, if you use functions in binding expressions, they evaluate fast and do not do any compute-intensive or remote work, simply because binding expressions are evaluated multiple times across multiple digest cycles.
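For example (hypothetical `orders` data), a binding like {{ expensiveTotal() }} re-runs the function on every digest; computing the value only when the underlying data changes is usually cheaper:

```javascript
// Costly: evaluated on every digest cycle, for every watcher pass.
$scope.expensiveTotal = function () {
  return $scope.orders.reduce(function (sum, o) { return sum + o.amount; }, 0);
};

// Cheaper: recompute only when the collection actually changes,
// and bind to the plain property {{ total }} instead.
$scope.$watchCollection('orders', function (orders) {
  $scope.total = orders.reduce(function (sum, o) { return sum + o.amount; }, 0);
});
```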
Second, minimize the number of watches. This requires that you analyze your view's binding behavior:
Are there parts which, once bound, do not change? Use the bindonce directive, or, if you are on Angular 1.3 or later, Angular itself supports one-time binding using the :: syntax in expressions (see the sketch after this list).
Create DOM elements only for things visible in the view: using ng-if or ng-switch rather than ng-show/ng-hide can help. See how ngInfiniteScroll works.
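A small sketch combining both ideas, with made-up directive, module, and field names (Angular 1.3+ for the :: syntax):

```javascript
angular.module('app').directive('dashlet', function () {   // 'app' is a placeholder module
  return {
    scope: { config: '=', series: '=' },
    template:
      '<h3>{{::config.title}}</h3>' +                       // one-time binding: watch removed once stable
      '<div ng-if="config.expanded">' +                     // collapsed dashlets create no DOM and no watches
      '  <div ng-repeat="point in ::series track by point.id">{{point.value}}</div>' +
      '</div>'
  };
});
```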
Lastly, reducing the number of digest cycles helps, as it means fewer dirty checks over the lifetime of the app.
Angular will perform these digest cycles at various times during application execution.
Each remote call also results in a complete digest cycle, so reducing remote calls will help.
There are some further optimization possibilities if you use scope.$digest instead of scope.$apply. scope.$apply triggers an app-wide digest cycle (starting at $rootScope), whereas scope.$digest only dirty-checks the current scope and its children.
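A hedged sketch, where `socket` stands in for whatever websocket client you use and `scope` is the widget's own scope:

```javascript
socket.on('widget:update', function (data) {
  scope.latest = data;
  // scope.$apply() would start dirty checking at $rootScope and walk the whole app;
  // scope.$digest() only checks this scope and its children, which is cheaper when
  // nothing outside the widget depends on the change.
  scope.$digest();
});
```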
To actually optimize the digest cycle, look at Building Huuuuuge Apps with AngularJS by Brian Ford.
But before anything else, measure how things are working using tools like Batarang, and make sure such optimizations are actually required.
My web app contains data gathered from an external API of which I do not have control. I'm limited to about 20,000 API requests per hour. I have about 250,000 items in my database. Each of these items is essentially a cached version. Consider that it takes 1 request to update the cache of 1 item. Obviously, it is not possible to have a perfectly up-to-date cache under these circumstances. So, what things should I be considering when developing a strategy for caching the data. These are the things that come to mind, but I'm hoping someone has some good ideas I haven't thought of.
time since item was created (less time means more important)
number of 'likes' a particular item has (could mean higher probability of being viewed)
time since last updated
A few more details: the items are photos. Every photo belongs to an event. Events that are currently occurring are more likely to be viewed by clients (therefore they should take priority). Though I only have 250K items in the database now, that number increases rather rapidly (it will not be long until the 1 million mark is reached, maybe 5 months).
Would http://instagram.com/developer/realtime/ be any use? It appears that Instagram is willing to POST to your server when there are new (and maybe updated?) images for you to check out. Would that do the trick?
Otherwise, I think your problem sounds much like the problem any search engine has—have you seen Wikipedia on crawler selection criteria? You're dealing with many of the problems faced by web crawlers: what to crawl, how often to crawl it, and how to avoid making too many requests to an individual site. You might also look at open-source crawlers (on the same page) for code and algorithms you might be able to study.
Anyway, to throw out some thoughts on standards for crawling:
Update most often the things that tend to have changed when you check them. So, if an item hasn't changed in the last five updates, then maybe you could assume it won't change as often and update it less.
Create a score for each image, and update the ones with the highest scores. Or the lowest scores (depending on what kind of score you're using). This is a similar idea to what LilyPond uses to typeset music. Some ways to create input for such a score (a rough sketch of such a score follows this list):
A statistical model of the chance of an image being updated and needing to be recached.
An importance score for each image, using things like the recency of the image, or the currency of its event.
Update things that are being viewed frequently.
Update things that have many views.
Does time affect the probability that an image will be updated? You mentioned that newer images are more important, but what about the probability of changes on older ones? Slow down the frequency of checks of older images.
Allocate part of your requests to slowly updating everything, and split up other parts to process results from several different algorithms simultaneously. So, for example, have the following (numbers are for show/example only--I just pulled them out of a hat):
5,000 requests per hour churning through the complete contents of the database (provided they've not been updated since the last time that crawler came through)
2,500 requests processing new images (which you mentioned are more important)
2,500 requests processing images of current events
2,500 requests processing images that are in the top 15,000 most viewed (as long as there has been a change in the last 5 checks of that image, otherwise, check it on a decreasing schedule)
2,500 requests processing images that have been viewed at least
Total: 15,000 requests per hour.
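To make the scoring idea concrete, here is a rough sketch; every weight and field name is invented and would need tuning against real usage data:

```javascript
// Score a photo for "how badly does its cache need refreshing right now".
function updateScore(photo, now) {
  var ageHours = (now - photo.createdAt) / 3.6e6;
  var staleHours = (now - photo.lastCheckedAt) / 3.6e6;
  var score = 0;
  score += photo.eventIsLive ? 50 : 0;      // current events take priority
  score += Math.max(0, 20 - ageHours);      // newer photos matter more
  score += Math.min(photo.likes, 30) * 0.5; // popularity as a proxy for views
  score += staleHours;                      // everything eventually gets refreshed
  score -= photo.unchangedStreak * 2;       // stable photos can wait longer
  return score;
}

// Spend each hour's request budget on the highest-scoring photos first.
function pickBatch(photos, budget, now) {
  return photos
    .slice()
    .sort(function (a, b) { return updateScore(b, now) - updateScore(a, now); })
    .slice(0, budget);   // e.g. budget = 15000 per hour, leaving headroom
}
```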
How many (unique) photos / events are viewed on your site per hour? Those photos that are not viewed probably don't need to be updated often. Do you see any patterns in views for old events / photos? Old events might not be as popular, so perhaps they don't have to be checked that often.
andyg0808 has good, detailed information; however, it is important to know the patterns of your data usage before applying it in practice.
At some point you will find that 20,000 API requests per hour will not be enough to update frequently viewed photos, which might lead you to different questions as well.
More and more sites are displaying the number of views (and clicks like on dzone.com) certain pages receive. What is the best practice for keeping track of view #'s without hitting the database every load?
I have a bunch of potential ideas on how to do this in my head but none of them seem viable.
Thanks,
first time user.
I would try the database approach first - returning the value of an autoincrement counter should be a fairly cheap operation so you might be surprised. Even keeping a table of many items on which to record the hit count should be fairly performant.
But the question was how to avoid hitting the db every call. I'd suggest loading the table into the webapp and incrementing it there, only backing it up to the db periodically or on webapp shutdown.
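A minimal sketch of that approach for a Node-style web app; `saveHitCountsToDb` is a hypothetical function backed by your database:

```javascript
var hitCounts = Object.create(null);

function recordHit(pageId) {
  hitCounts[pageId] = (hitCounts[pageId] || 0) + 1;   // no database work on the request path
}

// Flush the in-memory tallies every 60 seconds, and once more on shutdown.
setInterval(flush, 60 * 1000);
process.on('SIGTERM', flush);

function flush() {
  var batch = hitCounts;
  hitCounts = Object.create(null);                     // start a fresh tally
  Object.keys(batch).forEach(function (pageId) {
    saveHitCountsToDb(pageId, batch[pageId]);          // e.g. UPDATE ... SET views = views + ?
  });
}
```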
One cheap trick would be to simply cache the value for a few minutes.
The exact number of views doesn't matter much anyway since, on a busy site, in the time a visitor goes through the page, a whole batch of new views is already in.
One way is to use memcached as a counter. You could modify this rate-limit implementation to instead act as a general counter. The key could be in yyyymmddhhmm format with an expiration of 15 or 30 minutes (depending on what you consider to be concurrent visitors), and then simply get those keys when displaying the page.
Nice libraries for communicating with the memcache server are available in many languages.
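For example, a sketch with the Node `memcached` client (its incr/add signatures are assumed from that library's docs; the key scheme follows the yyyymmddhhmm idea above):

```javascript
var Memcached = require('memcached');
var client = new Memcached('localhost:11211');

function bucketKey(now) {
  // yyyymmddhhmm, rounded down to the quarter hour (UTC)
  var d = new Date(now);
  d.setUTCMinutes(Math.floor(d.getUTCMinutes() / 15) * 15, 0, 0);
  return 'views:' + d.toISOString().slice(0, 16).replace(/[-T:]/g, '');
}

function countView(callback) {
  var key = bucketKey(Date.now());
  client.incr(key, 1, function (err, result) {
    if (err) return callback(err);
    if (result === false) {
      // Key did not exist yet: create it with a 30-minute expiry.
      return client.add(key, 1, 30 * 60, callback);
    }
    callback(null, result);
  });
}
```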
You could set up a flat file that has the number of hits in it. This would have issues scaling, but it could work.
If you don't care about displaying the number of page views, you could use something like Google Analytics or Piwik. Both make requests after the page is already loaded, so they won't impact load times. There might be a way to make an AJAX request to the analytics server, but I don't know for sure. Piwik is open source, so you can probably hack something together.
If you are using server-side scripting, increment it in a variable. It's likely to get reset if you restart the services, so it's not such a good idea if accuracy is needed.