My view has grown very big: almost 1k lines and 30 events. Should I break it into smaller sub views?
I want to do this to improve readability and performance.
Yes, you should break it into smaller sub views. It will not necessarily improve performance, but it will definitely improve readability. Also, please keep in mind that views are only for presentation, not business logic.
I'm trying to decide on the best way to show an average star rating. Is it better to calculate the average whenever a review with a star value is submitted, and store the average in a DB field, so that when I load a page I just read one field's value? Or should I calculate the average each time a user loads the page?
Without a sample schema, and an idea of typical usage, it's almost impossible to provide a good answer.
The question you pose is "should I denormalize my database" - there are lots of other questions on this topic.
From a performance point of view, the question boils down to "how often do you have to write, how often do you have to read, and how important is it that data is consistent?".
If your application user experience is such that "star ratings" are shown almost never, and calculating that star rating is "cheap", then the performance impact is low.
If you are showing long, scrolling pages with items, each with a star rating, the performance benefit could be high, especially if calculating the star rating is an expensive operation.
If it's important that star ratings are exactly accurate in all cases, you will have to add some additional logic like locking behaviour which could have a huge impact on your database.
If your application experience means that you may have periods with very high numbers of new ratings, you could have a significant performance impact on the "write" operation.
In general, it's best to design your application to be normalized (so it's easy to debug and maintain), and to measure whether you need to do anything more. Modern database engines can handle far more than most people realize.
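For example, the denormalized version means every new review also performs a write against the parent row. Here is a minimal sketch in generic SQL (transaction syntax varies by engine), assuming hypothetical products and reviews tables with an average_rating column - none of these names come from the question:

    -- Denormalized: keep a stored average on the parent row.
    -- Each new review costs an extra UPDATE, and the lock taken on the
    -- products row serializes concurrent reviews of the same product.
    BEGIN TRANSACTION;

    INSERT INTO reviews (product_id, stars) VALUES (42, 5);

    UPDATE products
    SET    average_rating = (SELECT AVG(stars)
                             FROM   reviews
                             WHERE  product_id = 42)
    WHERE  id = 42;

    COMMIT;

The read side then becomes a single-column lookup, which is where the benefit shows up on long, scrolling pages.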
**Update**
Thanks for your update.
Your schema suggestion should be lightning fast without denormalization - you should be joining on a foreign key on the reviews table. It all depends on the exact circumstances, but unless you need to scale to hundreds of millions of products and reviews, I doubt you'd ever see a measurable difference in database performance. The logic to keep the "average score" column updated may be more of a performance overhead than calculating it on the fly.
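For comparison, the normalized on-the-fly calculation is just an aggregate over that foreign key. A rough sketch with the same hypothetical table names:

    -- Normalized: compute the average at read time.
    -- An index on reviews(product_id) keeps this cheap.
    SELECT p.id,
           AVG(r.stars) AS average_rating,
           COUNT(r.id)  AS review_count
    FROM   products p
    LEFT JOIN reviews r ON r.product_id = p.id
    WHERE  p.id = 42
    GROUP BY p.id;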
In my experience, denormalization is an expensive thing to do - it makes your code much harder to understand and debug, and leads to entertaining bugs. From a performance point of view, if you're building a website, you'll get a much better return by focusing on caching at the HTTP level.
I'm making a database for different rental investments for my employer. I want each record to have the investment code followed by expected monthly cashflows for the next 10 years.
I can set this up with 120 fields, one for each future monthly cashflow,
OR
I have only 3 fields - investment code, month and cashflow.
Which is better? I will probably have 5000 new investments each month. The first produces 5000 records, the second produces 600000. Is that a problem? I'll want to run queries and stuff based on relationships in the rest of the database. Which approach gives the best performance?
Thanks in advance!
I like the second approach. It might have a bigger table size, since the investment code repeats in each record, but it is more normalized. I don't think there should be any performance issue if you have proper primary keys set up. It also has another advantage for new investments: you don't need all 120 fields, since investments seem to be able to start in any month.
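A rough sketch of that tall/narrow layout, with illustrative names (investment_cashflows, month_number and so on are placeholders, not from the question); Access uses slightly different type names (e.g. CURRENCY, LONG), so treat the types as generic:

    -- Tall/narrow design: one row per investment per month.
    -- The composite primary key enforces one cashflow per month and
    -- provides the index needed for fast lookups and joins.
    CREATE TABLE investment_cashflows (
        investment_code VARCHAR(20)   NOT NULL,
        month_number    INTEGER       NOT NULL,  -- 1..120 from inception
        cashflow        DECIMAL(15,2) NOT NULL,
        PRIMARY KEY (investment_code, month_number)
    );

    -- Example query: total expected cashflow per investment in year 1.
    SELECT investment_code, SUM(cashflow) AS year1_cashflow
    FROM   investment_cashflows
    WHERE  month_number BETWEEN 1 AND 12
    GROUP BY investment_code;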
An update...
I did my own testing of the wide-and-short vs narrow-and-tall options, using many thousands of test records.
The query performance of both is roughly the same (as long as the query is well designed and doesn't return individual records). If you use a stopwatch, the wider table is slightly better by a couple of seconds, but tall is still acceptable.
The main difference is the amount of data produced. Wide takes up only 200 MB; tall takes up 1 GB for the same information! Writing data in the tall layout also takes a lot longer.
Given that Access has a 2 GB limit, and that I don't need the flexibility to ever add past 5 years' worth (I know for a fact this will not be needed), I think I'll go with option 1 - short and wide tables.
I'm a little surprised at just how big the tall table was - I would have thought Access would be able to compress the data a little better.
I am just getting into studying forecasting methods and I want to figure out how performance is commonly measured. My instinct is that out-of-sample performance is most important (you want to see how well your model does on unseen data). I have also noticed that forecast performance degrades when the out-of-sample data is too large (which makes sense: the farther you go into the future, the less likely your model is to perform well). So I was wondering: how do I determine the best size of out-of-sample data to test on?
I think you are confusing the forecasting horizon with the amount of out-of-sample data used to test forecasting performance when you say "I have also noticed that forecast performance does not do well if your out-of-sample data is too large".
When you do forecasting, you are usually interested in a certain forecasting horizon. For example, if you have a time series at monthly frequency, you might be interested in a one-month horizon (short-term forecasting) or a 12-month horizon (long-term forecasting). So forecasting performance usually deteriorates with longer forecasting horizons, not with more out-of-sample data.
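To make the distinction concrete, here is one common way to write it down (a sketch in generic notation, not something taken from your setup): with observations $y_1, \dots, y_T$, a first forecast origin $T_0$, and $h$-step-ahead forecasts $\hat{y}_{t+h \mid t}$ made from the information available at time $t$, the rolling-origin out-of-sample error at horizon $h$ is

$$\mathrm{RMSE}(h) \;=\; \sqrt{\frac{1}{T - h - T_0 + 1} \sum_{t = T_0}^{T - h} \left( y_{t+h} - \hat{y}_{t+h \mid t} \right)^2 }.$$

The error is indexed by the horizon $h$; using more out-of-sample origins (a larger $T - T_0$) simply gives you a more reliable estimate of $\mathrm{RMSE}(h)$, it does not by itself make the forecasts worse.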
It is hard to suggest a number of observations on which to test your model, because it depends on how you want to evaluate the forecast. If you want to use formal statistical tests, you need more observations; but if you are interested in predicting a certain event and are just interested in the performance of a single model, then you are fine with a relatively low number of out-of-sample observations.
Hope this helps,
Paolo
I'm optimizing my Oracle database.
I'm confused about the I/O performance difference between writing 10 concurrent requests to one table and writing 10 concurrent requests to 10 tables.
If I have 10 types of data that could all be stored in one table, which gives the best insert performance: one table or 10 tables?
Does anybody know?
Performance tuning is a large topic, and questions can't really be answered with so little information to begin with. But I'll try to provide some basic pointers.
If I got it right, you are mainly concerned with insert performance into what is currently a single table.
The first step should be to find out what is actually limiting your performance. Let's consider some scenarios:
disk I/O: Disks are slow, so get the fastest disks you can; this might well mean SSDs. Put them in a RAID that is tuned for performance ("striping" is the key word, as far as I know). Of course SSDs will fail, just as your HDs do, so you want to plan for that. HDs are also faster when they aren't completely full (I never really verified that). Partitioned tables might help as well (see below). But most of the time you can reduce the I/O load itself, which is far more efficient than more and faster hardware.
contention on locks (of primary keys for example).
Partitioned tables and indexes might be a solution. A partitioned table is logically one table (you can select from it and write to it just like a normal table), but internally the data is spread across multiple physical segments. A partitioned index is the same idea applied to an index. This can help because the index underlying a unique key gets locked when a new value is added, so two sessions can't insert the same value; if the values are spread across n index partitions, contention on those locks is reduced. Partitions can also be spread over different tablespaces/disks, so there is less waiting at the physical level.
time to verify constraints: If you have constraints on the table, they need time to do their job. If you do batch inserts, you should consider deferred constraints; they only get checked at commit time instead of on every insert. If you are careful with your application, you can even disable them and re-enable them afterwards without validating them. This is fast, but of course you have to be really sure the constraints actually hold. You should also make sure your constraints have all the indexes they need to perform well.
Speaking of batch inserts: if you are doing those, you might want to look into direct-path load: http://docs.oracle.com/cd/A58617_01/server.804/a58227/ch_dlins.htm (I think this is the Oracle 8 version; I'm sure there is updated documentation somewhere). A sketch of a few of these options follows below.
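Here is a rough sketch in Oracle SQL of a hash-partitioned table, a deferrable constraint, and a direct-path insert; the table, constraint, and staging-table names (measurements, measurements_pk, staging_measurements) are made up for this example, not taken from your schema:

    -- Hash-partitioned table: rows are spread across 8 partitions,
    -- which can be placed in different tablespaces to reduce contention.
    CREATE TABLE measurements (
        id         NUMBER        NOT NULL,
        type_code  NUMBER        NOT NULL,
        payload    VARCHAR2(200)
    )
    PARTITION BY HASH (id) PARTITIONS 8;

    -- Deferrable primary key: the check runs at COMMIT instead of on
    -- every single row, which can help large batch inserts.
    ALTER TABLE measurements
        ADD CONSTRAINT measurements_pk PRIMARY KEY (id)
        DEFERRABLE INITIALLY DEFERRED;

    -- Direct-path insert of a batch from a staging table: the APPEND
    -- hint writes blocks above the high-water mark, bypassing the
    -- buffer cache.
    INSERT /*+ APPEND */ INTO measurements (id, type_code, payload)
    SELECT id, type_code, payload
    FROM   staging_measurements;

    COMMIT;

As with everything above, measure before and after; none of these help if the bottleneck is somewhere else.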
To wrap it up: without knowing where exactly your performance problem is, there is no way to tell how to fix it. So find out where your problem is, then come back with a more precise question.
How would a single BLOB column in SQL Server compare (performance wise), to ~20 REAL columns (20 x 32-bit floats)?
I remember Martin Fowler recommending using BLOBs for persisting large object graphs (in Patterns of Enterprise Application Architecture) to remove multiple joins in queries, but does it make sense to do something like this for a table with 20 fixed columns (which are never used in queries)?
This table is updated really often, around 100 times per second, and INSERT statements get rather large with all the columns specified in the query.
I presume the first answer is going to be "profile it yourself", but I'd like to know if someone already has experience with this stuff.
Typically you should not, unless you have found that this is critical to meeting your performance requirements.
If you store it all in one BLOB, you will need to reprocess every stored row whenever you change the object structure (such as adding or removing a field). If you keep separate columns, your future database refactorings and deployments will be much easier.
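For reference, the two layouts being compared look roughly like this in SQL Server; readings_wide, readings_blob, and the column names are placeholders for this sketch, not from the question:

    -- Option A: 20 fixed REAL columns (abbreviated here).
    CREATE TABLE readings_wide (
        id      BIGINT IDENTITY(1,1) PRIMARY KEY,
        value01 REAL NOT NULL,
        value02 REAL NOT NULL,
        -- ... value03 through value19 declared the same way ...
        value20 REAL NOT NULL
    );

    -- Option B: one opaque blob holding the 20 floats (20 x 4 = 80 bytes).
    CREATE TABLE readings_blob (
        id     BIGINT IDENTITY(1,1) PRIMARY KEY,
        packed VARBINARY(80) NOT NULL
    );

Either way each row carries only 80 bytes of float data, so the blob mainly saves statement text and per-column metadata, at the cost of not being able to query, index, or constrain the individual values.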
I can't fully speak to the performance of the SELECT - you'll need to test that - but I highly doubt it will cause any problems there, because you wouldn't be reading any more data than before. In regards to the INSERT, you should see a performance gain (of what size I'm unsure), because there will likely not be any column statistics or indexes to update. Of course that depends on a lot of settings; I'm just offering my opinion. This question is pretty subjective, and there isn't nearly enough information available to say for sure whether you will see performance issues from the change.
In practice, though, I'd say leave it be unless you're seeing real performance issues. And if you are seeing real performance issues, analyze them before choosing this type of solution; there are probably other ways to fix them.