Critic Loss for RL Agent - artificial-intelligence

While implementing agents for various problems, I have seen that my actor loss decreases as expected, but my critic loss keeps increasing even though the learned policy is very good. This happens for DDPG, PPO, etc.
Any thoughts on why my critic loss is increasing?
I tried playing with hyperparameters, but that actually makes my policy worse.

In Reinforcement Learning, you typically shouldn't pay attention to the precise values of your losses. They are not informative in the same sense that they would be in, for example, supervised learning. The loss values should only be used to compute the correct updates for your RL approach; they do not actually give you any real indication of how well or poorly you are doing.
This is because in RL, your learning targets are often non-stationary; they are often a function of the policy that you are modifying (hopefully improving!). It's entirely possible that, as the performance of your RL agent improves, your loss actually increases: due to its improvement, the agent may discover new parts of its search space which lead to new target values that it was previously completely oblivious to.
Your only really reliable metric for how well your agent is doing is the returns it collects in evaluation runs.
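To make that last point concrete, here is a minimal sketch of tracking evaluation returns separately from the training losses. It assumes a Gym-style environment and a hypothetical agent object with an act method; none of these names come from a specific library.

```python
import numpy as np

def evaluate(env, agent, episodes=10):
    """Run the current policy without exploration and return the mean episodic return.

    This number, not the critic loss, is the signal to watch: the critic's
    regression targets move whenever the policy (and the states it visits)
    change, so its loss can grow while the policy itself improves.
    """
    returns = []
    for _ in range(episodes):
        obs, done, total = env.reset(), False, 0.0
        while not done:
            action = agent.act(obs, deterministic=True)  # greedy / mean action, no noise
            obs, reward, done, _ = env.step(action)
            total += reward
        returns.append(total)
    return float(np.mean(returns))

# In the training loop, log both the actor/critic losses and evaluate(eval_env, agent)
# every N updates, but base decisions (early stopping, model selection,
# hyperparameter comparisons) on the evaluation return only.
```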

Related

Importance of setting database type character limits

I have always been taught that when creating a database I should set the column types to the maximum number of characters that will ever be in that field, e.g. int(2).
However, as we know, this can sometimes be unknown because the data comes in so many different lengths.
Is it really a must, or can we just set it to 255? If it is required, does anyone know of a guide or website that gives a clear list of what the character limits should be for "general" specific fields?
The major downside of setting a size limit that's too small is that you are unable to accommodate data that you really need to store. The easiest way to grasp this is to recall the Y2K problem. Many systems built much earlier stored the year part of the date in two digits. The designers probably thought their systems would be discarded before it became a problem. But around October of 1999, managers began to worry about how many systems were going to fail come New Year's. The problems you might face are analogous, although probably smaller in scope.
There are two major downsides to setting a size limit that's much larger than what you really need. The first is wasted computer resources: this can mean needing more disk space, or adding delay to time-critical transactions.
You also risk allowing data that should have been rejected. Many programmers prefer putting all the data integrity checks in their apps instead of relying on the DBMS to do the same job. There are a variety of reasons to choose either the DBMS or the apps, or both for data integrity. But you can harm yourself if you are limited by your own ignorance of how a DBMS really works.
There is no blanket rule. Understand your requirements, and design for your case.
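As a rough sketch of the second point (letting the DBMS itself reject out-of-range data), here is a self-contained example. It uses sqlite3 purely so it runs anywhere; SQLite does not enforce VARCHAR(n) lengths, so the size rule is written as an explicit CHECK constraint, whereas engines like MySQL or PostgreSQL enforce the declared length directly. The table and columns are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE customer (
        id           INTEGER PRIMARY KEY,
        country_code TEXT NOT NULL CHECK (length(country_code) = 2),  -- ISO 3166-1 alpha-2
        email        TEXT NOT NULL CHECK (length(email) <= 254)
    )
""")

conn.execute("INSERT INTO customer (country_code, email) VALUES ('DE', 'a@example.com')")

try:
    # Rejected by the database itself, regardless of what the application forgot to check.
    conn.execute("INSERT INTO customer (country_code, email) VALUES ('DEU', 'b@example.com')")
except sqlite3.IntegrityError as e:
    print("rejected:", e)
```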

Is it wrong to account for malloc in an amortized analysis of a dynamic array?

I had points docked on a homework assignment for calculating the wrong total cost in an amortized analysis of a dynamic array. I think the grader probably only looked at the total and not the steps I had taken, and I think I accounted for malloc while their answer key did not.
Here is a section of my analysis:
The example we were shown did not account for malloc, but I saw a video that did, and it made a lot of sense, so I put it in there. I realize that although malloc is a relatively costly operation, it would probably be O(1) here, so I could have left it out.
But my question is this: is there only one way to calculate cost when doing this type of analysis? Is there an objectively right and wrong cost, or is the conclusion drawn what really matters?
You asked, "Is there only one way to calculate cost when doing this type of analysis?" The answer is no.
These analyses are on mathematical models of machines, not real ones. When we say things like "appending to a resizable array is O(1) amortized", we are abstracting away the costs of various procedures needed in the algorithm. The motivation is to be able to compare algorithms even when you and I own different machines.
In addition to different physical machines, however, there are also different models of machines. For instance, some models don't allow integers to be multiplied in constant time. Some models allow variables to be real numbers with infinite precision. In some models all computation is free and the only cost tracked is the latency of fetching data from memory.
As hardware evolves, computer scientists make arguments for new models to be used in the analysis of algorithms. See, for instance, the work of Tomasz Jurkiewicz, including "The Cost of Address Translation".
It sounds like your model included a concrete cost for malloc. That is neither wrong nor right. It might be a more accurate model on your computer and a less accurate one on the grader's.
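To illustrate why either choice is defensible, here is a small sketch (my own illustration, not the grader's model) that counts the cost of n appends to a doubling array under two cost models: one where allocation is free, and one where every malloc/realloc is charged an assumed flat fee. Both give O(1) amortized appends; only the constant changes, because only O(log n) allocations ever happen.

```python
def total_append_cost(n, malloc_fee=0):
    """Cost of n appends to a doubling dynamic array under a simple cost model:
    1 unit per element written, plus an optional flat fee per allocation."""
    capacity, size, total = 1, 0, malloc_fee   # charge the initial allocation
    for _ in range(n):
        if size == capacity:                   # array full: allocate and copy
            total += malloc_fee + capacity     # copy `capacity` existing elements
            capacity *= 2
        total += 1                             # write the new element
        size += 1
    return total

n = 1_000_000
print(total_append_cost(n) / n)                 # ~2.05 units per append
print(total_append_cost(n, malloc_fee=50) / n)  # still ~2.05: only ~20 allocations happen
```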

Is it good practice to store a calculated value?

I'm working on a billing system, and calculating the total amount of an invoice on the database requires quite a bit of SQL. For instance:
some items are taxable, some aren't;
some items have discounts;
some are calculated by dividing the price by an interval (e.g. € 30,00/month);
invoices may have overdue fees;
invoices have different tax rates.
Since the queries are becoming more and more complex with every feature I add, I'm thinking about storing some calculated values (net and gross amounts for invoice items and for the invoice itself). I've seen some invoicing frameworks do this, so I figure it's not a bad practice per se.
However, I'm a bit worried about data integrity.
Cache invalidation in my application shouldn't be too hard: whenever an invoice gets changed somehow, I re-run the calculations and save the new values. But what if, someday, someone runs a script or some SQL code directly on the database?
I know there are some questions about the topic, but I'd like to discuss it further.
Yes, caching is fine as long as you don't forget to invalidate it when necessary (which is one of the two hardest problems in CS).
But what if, someday, someone runs a script or some SQL code directly on the database?
Well, with great power comes great responsibility. If you don't want responsibility, take the power away (forbid direct sql access). There's really nothing else you can do here.
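For what it's worth, the "recalculate and save on every change" strategy the question describes stays manageable if every mutation is funnelled through one recalculation method. A minimal sketch (Invoice/InvoiceItem and the tax rate are hypothetical placeholders, not any particular framework's models):

```python
from decimal import Decimal

class InvoiceItem:
    def __init__(self, price, quantity=1, tax_rate=Decimal("0.21"),
                 taxable=True, discount=Decimal("0")):
        self.price, self.quantity = Decimal(price), quantity
        self.tax_rate, self.taxable, self.discount = tax_rate, taxable, discount

    def net(self):
        return self.price * self.quantity - self.discount

    def gross(self):
        return self.net() * (1 + self.tax_rate) if self.taxable else self.net()

class Invoice:
    def __init__(self, items):
        self.items = list(items)
        self.net_total = self.gross_total = None   # the stored ("cached") values
        self.recalculate()

    def recalculate(self):
        """Single choke point: every mutation below must end up here."""
        self.net_total = sum((i.net() for i in self.items), Decimal("0"))
        self.gross_total = sum((i.gross() for i in self.items), Decimal("0"))

    def add_item(self, item):
        self.items.append(item)
        self.recalculate()                          # invalidate + refresh the totals
```

Of course, anyone running SQL directly against the database bypasses a recalculation path like this entirely, which is exactly the risk pointed out above.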

Fast, high volume data input in SQL Server

I'm currently in the preparatory phase for a project which will involve (amongst other things) writing lots of data to a database, very fast (i.e. images (and associated meta-data) from 6 cameras, recording 40+ times a second).
Searching around the web, it seems that 'Big Data' more often applies to a higher rate, but smaller 'bits' (i.e. market data).
So..
Is there a more scientific way to proceed than "try it and see what happens"?
Is "just throw hardware at it" the best approach?
Is there some technology/white papers/search term that I ought to check out?
Is there a compelling reason to consider some other database (or just saving to disk)?
Sorry, this is a fairly open-ended question (maybe better for Programmers?)
Is there a more scientific way to proceed than "try it and see what happens"?
No, given that your requirements are very unusual.
Is "just throw hardware at it" the best approach?
No, but at some point it is the only approach. You won't get a 400-horsepower racing engine just by tuning a Fiat Panda, and you won't get high throughput out of any database without appropriate hardware.
Is there some technology/white papers/search term that I ought to check out?
Not really a valid question in this context - you are asking specifically about SQL Server.
Is there a compelling reason to consider some other database (or just saving to disk)?
No. As long as you stick to a relational database, pretty much the same rules apply - another one may be faster, but not by a wide margin.
Your main problem will be disk IO and network bandwidth, depending on the size of the images. Size the equipment properly and you should be fine. In the end this works out to less than 300 images per second. Are you sure you want the images themselves in the database? I normally like that, but this is like storing a movie as individual pictures, and that may be stretching it.
Whatever you do, that is a lot of disk IO and storage, so hardware is the only way to go if you need the IOPS etc.
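To put rough numbers on that (the average image size is an assumption and has to come from your actual cameras and compression):

```python
# Back-of-envelope sizing for the scenario in the question.
cameras = 6
frames_per_second = 40
avg_image_mb = 0.5                                         # assumption: adjust to your data

images_per_second = cameras * frames_per_second            # 240 images/s (under the "300" above)
write_mb_per_second = images_per_second * avg_image_mb     # ~120 MB/s of sustained writes
storage_tb_per_day = write_mb_per_second * 86_400 / 1e6    # ~10.4 TB/day, before metadata/overhead

print(images_per_second, write_mb_per_second, round(storage_tb_per_day, 1))
```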

Why do DBS not adapt/tune their buffer sizes automatically?

I'm not sure whether there is a DBS that already does this, or whether it is indeed a useful feature, but:
There are a lot of suggestions on how to speed up DB operations by tuning buffer sizes. One example is importing Open Street Map data (the planet file) into a Postgres instance. There is a tool called osm2pgsql (http://wiki.openstreetmap.org/wiki/Osm2pgsql) for this purpose and also a guide that suggests to adapt specific buffer parameters for this purpose.
In the final step of the import, the database is creating indexes and (according to my understanding when reading the docs) would benefit from a huge maintenance_work_mem whereas during normal operation, this wouldn't be too useful.
This thread http://www.mail-archive.com/pgsql-general#postgresql.org/msg119245.html on the contrary suggests that a large maintenance_work_mem would not make too much sense during final index creation.
Ideally (IMO), the DBS should know best which combination of buffer sizes it would profit from most, given a limited amount of total buffer memory.
So, are there some good reasons why there isn't a built-in heuristic that is able to adapt the buffer sizes automatically according to the current task?
The problem is the same as with any forecasting software: just because something happened historically doesn't mean it will happen again. Also, you need to complete a task in order to fully analyze how you should have done it more efficiently. The problem is that the next task is not necessarily anything like the previously completed one. So if your import routine needed 8 GB of memory to complete, would it make sense to assign each read-only user 8 GB of memory? The other way around wouldn't work well either.
By leaving this decision to humans, the database will exhibit performance characteristics that aren't optimal for all cases, but in return it lets us (the humans) optimize each case individually (if we like to).
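As an example of such per-case optimization, in PostgreSQL a human can scope the setting to a single task rather than the whole server. A rough sketch using psycopg2 (the database, table, and index names are made up for illustration):

```python
import psycopg2

conn = psycopg2.connect("dbname=osm")
with conn, conn.cursor() as cur:
    # Bump maintenance_work_mem for this session only: the bulk-import index
    # build gets the big buffer, while normal sessions keep the server default.
    cur.execute("SET maintenance_work_mem = '2GB'")
    cur.execute("CREATE INDEX planet_osm_point_way_idx "
                "ON planet_osm_point USING gist (way)")
```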
Another important aspect is that most people/companies value reliable and stable levels over varying but potentially better levels. Having a high cost isn't as big a deal as having large variations in cost. This is of course not always true, as entire companies are built around the fact that, once in a while, they hit that 1%.
Modern databases already make some effort to adapt themselves to the tasks presented, such as increasingly sophisticated query optimizers. At least Oracle has the option to keep track of some of the measures that influence the optimizer's decisions (e.g. the cost of a single-block read, which will vary with the current load).
My guess would be that it is awfully hard to get the knobs right by adaptive means. First, you would have to query the machine for a lot of unknowns, like how much RAM it has available - but also the unknown of what else you expect to run on the machine.
Barring that, even if there were only a single max_mem_usage parameter to set, the problem is how to make a system that
adapts well to most typical loads;
doesn't have odd pathological problems with some loads;
is reasonably comprehensible code, without errors.
For PostgreSQL, however, the answer could also be:
Nobody wrote it yet because other things are seen as more important.
You didn't write it yet.
