I have very little experience with dynamic programming. I used it to solve a DNA alignment problem, a basic knapsack problem, and a simple pathfinding problem. I understood how they worked, but it's not something I feel absolutely comfortable with yet.
I have a problem that reminds me of 0-1 dynamic programming, but the differences have thrown me off, and I'm not sure if I can still use this technique, or if I have to settle for a recursive approach.
Let's say I have a list of items, each with different values, weights, and costs. There may be more than one of each item.
Let's say I have to choose a combo of those items which is the most valuable, but remains within the limits of weight and cost. So far, I've described the knapsack problem, pretty much, with 2 constraints. But here's the difference:
The value of a chosen item changes depending on how many of them I have in the combo.
Let's say that each item has a function associated with it, that tells me what a group of those items is worth to me. It's a basic linear function, such as
value_of_item = -3(quantity of that item) + 50
So if I have 1 of some item in a combo, then it's value to me is 47. If I had 2 of them, then they're only worth 44 to me, each.
If I use a dynamic programming table for this, then for each cell I'd have to backtrack to see if that item is already in the current combo, making DP pointless. But maybe there's a way to re-frame the problem so I can take advantage of DP.
Hopefully that made sense.
The alternative is to generate every combo of items, within the limits of cost and weight, figure the value of each combo, choose the most valuable combo. For a list of 1000 items even, that's going to be an expensive search, and it's something I'd be calculating repeatedly. I'd like to find a way to exploit the advantages of DP.
If your functions are of the form
value(x, count) = base(x) - factor(x) * count, factor(x) > 0,
then you can reduce the problem to standard knapsack by splitting the items:
x -> x_1 to x_max_count
value_new(x_i) = value(x, i)
weight(x_i) = weight(x)
Now you easily verify that no optimal solution to the new problem uses some item x_j, without using every x_i with i < j.
Proof by contradiction: Assume there is such an optimal solution S and it uses x_j, but not x_i, j > i. Then there is an alternative solution S' that uses x_i instead of x_j. Since j > i,
value_new(x_j) = value(x, j)
= base(x) - factor(x) * j
< base(x) - factor(x) * i
= value(x, i)
= value_new(x_i)
and therefore S' has a higher value than S and we reached a contradiction.
Furthermore, we can allow factor(x) = 0, this corresponds to a standard knapsack item.
However if there is a constraint of the form
value(x, count) = base(x) + factor(x) * count
where factor(x) is an arbitrary value, the solution above does no longer work, because the last item would be the one with the largest value. Maybe some sophisticated modification of DP may allow you to use such constraints, but I don't see any modifications of the problem itself to use DP right away.
Some research in this topic (more general):
http://dept.cs.williams.edu/~heeringa/publications/knapsack.pdf
http://clweb.csa.iisc.ernet.in/vsuresh/Kamesh-PLKP.pdf
Related
I’d like to perform amortized analysis of a dynamic array:
When we perform a sequence of n insertions, whenever an array of size k fills up, we reallocate an array of size k+sqrt(k), and copy the existing k values into the new array.
I’m new to amortized analysis and this is a problem I have yet to encounter, as we resize the array each time by a different non-constant value. (newSize=prevSize+sqrt(prevSize))
The total cost should be Θ(n*sqrt(n)), thus Θ(sqrt(n)) per operation.
I realize that whenever k >= c^2 for some constant c, then our array grows by c.
Let’s start off with an array of size k=1 (and assume n is large enough, for the sake of this example). After n insertions, we get the following sum of the total cost of the insertions + copies:
1+1(=k)+1+2(=k)+1+3(=k)+1+4(=k)+2+6(=k)+2+8(=k)+2+10(=k)+3+13(=k)+3+16(=k)+4+20(=k)+4+24+4+28+5+33+5+38+6+44+6+50+7…+n
I am seeing the pattern, but I can’t seem to be able to compute the bounds.
I’m trying to use all kinds of amortized analysis methods to bound this aggregated sum.
Let’s consider the accounting method for example, then I thought I needed round([k+sqrt(k)]\sqrt(k)) or simply round(sqrt(k)+1) coins per insertion, but it doesn’t add up.
I’d love to get your help, in trying to properly find the upper and lower sqrt(n) bound.
Thank you very much! :)
The easiest way is to lump together each resize operation with the inserts that follow it before the next resize.
The cost of each is lump is O(k + sqrt(k)), and each lump consists of O(sqrt(k)) operations, so the cost per operations is O( (k + k0.5)/k0.5) = O(k0.5 + 1) = O(k0.5)
Of course you want an answer in terms if n, but since k(n) is in Θ(n), O(k0.5) = O(N0.5).
This can be easily shown using the Accounting Method. Consider an array of size k+sqrt(k) such that the first k entries are occupied and the rest sqrt(k) are empty. Let us make Insert-Last operation draft sqrt(k)+2 coins: One will be used to pay for insertion while the rest (sqrt(k)+1 coins) will be deposited and used for credit. From here, execute Insert-Last sqrt(k) times. We shall then have k+sqrt(k) credit coins: in total we had drafted k+2sqrt(k) coins, sqrt(k) of which we used for paying for the insertions. Hence, we won't need to pay for the resizing of the array. As soon as the array gets full, we would be able to utilize our k+sqrt(k) credit coins and pay for the resizing operation. Since k = Θ(n), each Insert-Last operation drafts sqrt(k)+2 = O(sqrt(k)) = O(sqrt(n)) coins and thus takes O(sqrt(n)) amortized.
let A be an MxN matrix with entries a_ij in {0, ..., n-1}.
we can think of the entries in A as a rectangular grid that has been n-colored.
I am interested in partitioning each colored region into rectangles, in such a way that the number of rectangles is minimized. That is, I want to produce n sets of quadruples
L_k = {(i, j, w, h) | a_xy = k forall i <= x < i + w, j <= y < j + h}
satisfying the condition that every a_ij belongs to exactly one rectangle and all of the rectangles are disjoint. Furthermore, the sum
L_0 + ... + L_(n-1) is minimized.
Obviously, minimizing each of the L_k can be done independently, but there is also a requirement that this happen extremely fast. Assume this is a real-time application. It may be the case that since the sets are disjoint, sharing information between the L_ks speeds things up more than doing everything in parallel. n can be small (say, <100) and M and N can be large.
I assume there is a dynamic programming approach to this, or maybe there is a way to rephrase it as a graph problem, but it is not immediately obvious to me how to approach this.
EDIT:
There seems to be some confusion about what I mean. Here is a picture to help illustrate.
Imagine this is a 10x10 matrix with red = 0, green = 1, and blue = 2. draw the black boxes like so, minimizing the number of boxes. The output here would be
L_0 = {(0,8,2,2),(1,7,2,1),(2,8,1,1),(4,5,4,2),(6,7,2,2)}
L_1 = {(0,0,4,4),(4,0,6,2),(6,2,2,3),(8,8,2,2)}
L_2 = {(0,4,4,3),(0,7,1,1),(2,9,6,1),(3,7,3,2),(4,2,2,4),(8,2,2,6)}
One thing to immediately do is to note that you can separate the problem into individual instances of connected regions of colors. From there, the post linked in the article explains how you can use a maximal matching to construct the optimal solution.
But it's probably quite hard to implement that (and judging by your tag of C, even more hard). So I recommend one of two strategies: Backtracking, or greedy.
To do backtracking, you will recurse on the set of tiles which are not yet covered (I assume this makes sense, as you have listed all integer coordinates. This changes but not massively otherwise). Take the highest, leftmost uncovered tile, and loop over all possible rectangles which contain it (there are only ~n^2 of them, and hopefully less). Then recurse.
To optimize this, you will want to prune. An easy way to prune this is to stop recursing if you have already seen a solution with a better answer. This pruning is known as branch and bound.
You can also just quit early in the backtracking, if you only need an approximate answer. Since you mentioned "real-time application," it might be okay if you're only off by a little.
Continuing with the idea of approximation, you could also do something similar by just greedily picking the largest rectangle that you can at the moment. This is annoying to implement, but doable. You can also combine this with recursion and backing out early.
I have couple of general questions on genetic algorithm. In selection step where you pick up chromosomes from the population, is there an ideal number of chromosomes to be picked up? What difference does it make if I pick, say 10 chromosomes instead of 20? Does it have any effect on final result? At mutation stage, I've learnt there are different ways to mutate - Single point crossover, two points crossover, uniform crossover and arithmetic crossover. When should I choose one over the other? I know they sound very basic, but I couldn't find answer anywhere. So I thought I should ask in Stackoverflow.
Thanks
It seems to me that your terminology and concepts are a little bit messed up. Let me clarify.
First of all - there are many ways people call the members of the population: genotype, genome, chromosome, individual, solution... I will use solution for now as it is, in my opinion, the most general term, it is what we are eventually evolve, and also I'm not a biologist so I don't know whether genotype, genome and chromosome somehow differ and if they do what is the difference...
Population
Genetic Algorithms are population-based evolutionary algorithms. The algorithms have (usually) a fixed-sized population of solutions of the problem it is solving.
Genetic operators
There are two principal genetic operators - crossover and mutation. The goal of crossover is to take two (or more in some cases) solutions and combine them to create a solution that has some properties of both, optimally the best of both. The goal of mutation is to create new genetic material that was not previously present in the population by doing a small random change.
The choice of the particular operators, i.e. whether a single-point or multi-point crossover..., is totally problem-dependent. For example, if your solutions are composed of some logical blocks of bits that work together in each block, it might not be a good idea to use uniform crossover because it will destroy these blocks. In such case a single- or multi-point crossover is a better choice and the best choice is probably to restrict the crossover points to be on the boundaries of the blocks only.
You have to try what works best for your problem. Also, you can always use all of them, i.e. by randomly choosing which crossover operator is going to be used each time the crossover is about to be performed. Similarly for mutation.
Modes of operation
Now to your first question about the number of selected solutions. Genetic Algorithms can run in two basic modes - generational mode and steady-state mode.
Generational mode
In generational mode, the whole population is replaced in every generation (iteration) of the algorithm. A simple python-like pseudo-code for a generational-mode GA could look like this:
P = [...] # initial population
while not stopping_condition():
Pc = [] # empty population of children
while len(Pc) < len(P):
a = select(P) # select a solution from P using some selection strategy
b = select(P)
if rand() < crossover_probability:
a, b = crossover(a, b)
if rand() < mutation_probability:
a - mutation(a)
if rand() < mutation_probability:
b = mutation(b)
Pc.append(a)
Pc.append(b)
P = Pc # replace the population with the population of children
Evaluation of the solutions was omitted.
Steady-state mode
In steady-state mode, the population persists and only a few solutions are replaced in each iteration. Again, a simple steady-state GA could look like this:
P = [...] # initial population
while not stopping_condition():
a = select(P) # select a solution from P using some selection strategy
b = select(P)
if rand() < crossover_probability:
a, b = crossover(a, b)
if rand() < mutation_probability:
a - mutation(a)
if rand() < mutation_probability:
b = mutation(b)
replace(P, a) # put a child back into P based on some replacement strategy
replace(P, b)
Evaluation of the solutions was omitted.
So, the number of selected solutions depends on how do you want your algorithm to operate.
I have encountered variations of this problem multiple times, and most recently it became a bottleneck in my arithmetic coder implementation. Given N (<= 256) segments of known non-negative size Si laid out in order starting from the origin, and for a given x, I want to find n such that
S0 + S1 + ... + Sn-1 <= x < S0 + S1 + ... + Sn
The catch is that lookups and updates are done at about the same frequency, and almost every update is in the form of increasing the size of a segment by 1. Also, the bigger a segment, the higher the probability it will be looked up or updated again.
Obviously some sort of tree seems like the obvious approach, but I have been unable to come up with any tree implementation that satisfactorily takes advantage of the known domain specific details.
Given the relatively small size of N, I also tried linear approaches, but they turned out to be considerably slower than a naive binary tree (even after some optimization, like starting from the back of the list for numbers above half the total)
Similarly, I tested introducing an intermediate step that remaps values in such a way as to keep segments ordered by size, to make access faster for the most frequently used, but the added overhead exceeded gains.
Sorry for the unclear title -- despite it being a fairly basic problem, I am not aware of any specific names for it.
I suppose some BST would do... You may try to add a new numeric member (int or long) to each node to keep a sum of values of all left descendants. Then you'll seek for each item in approximately logarithmic time, and once an item is added, removed or modified you'll have to update just its ancestors on the returning path from the recursion. You may apply some self-organizing tree structure, for example AVL to keep the worst-case search optimal or a splay tree to optimize searches for those most often used items. Take care to update the left-subtree-sums during rebalancing or splaying.
You could use a binary tree where each node n contains two integers A_n
and U_n, where initially
A_n = S_0 + .. S_n and U_n = 0.
Let, at any fixed subsequent time, T_n = S_0 + .. + S_n.
When looking for the place of a query x, you would go along the tree, knowing that for each node m the current corresponding value of T_m is A_m + U_m + sum_{p : ancestors of m, we visited the right child of p to attain m} U_p.
This solves look up in O(log(N)).
For update of the n-th interval (increasing its size by y), you just look for it in the tree, increasing the value of U_m og y for each node m that you visit along the way. This also solves update in O(log(N)).
I have an array (arr) of elements, and a function (f) that takes 2 elements and returns a number.
I need a permutation of the array, such that f(arr[i], arr[i+1]) is as little as possible for each i in arr. (and it should loop, ie. it should also minimize f(arr[arr.length - 1], arr[0]))
Also, f works sort of like a distance, so f(a,b) == f(b,a)
I don't need the optimum solution if it's too inefficient, but one that works reasonable well and is fast since I need to calculate them pretty much in realtime (I don't know what to length of arr is, but I think it could be something around 30)
What does "such that f(arr[i], arr[i+1]) is as little as possible for each i in arr" mean? Do you want minimize the sum? Do you want to minimize the largest of those? Do you want to minimize f(arr[0],arr[1]) first, then among all solutions that minimize this, pick the one that minimizes f(arr[1],arr[2]), etc., and so on?
If you want to minimize the sum, this is exactly the Traveling Salesman Problem in its full generality (well, "metric TSP", maybe, if your f's indeed form a metric). There are clever optimizations to the naive solution that will give you the exact optimum and run in reasonable time for about n=30; you could use one of those, or one of the heuristics that give you approximations.
If you want to minimize the maximum, it is a simpler problem although still NP-hard: you can do binary search on the answer; for a particular value d, draw edges for pairs which have f(x,y)
If you want to minimize it lexiocographically, it's trivial: pick the pair with the shortest distance and put it as arr[0],arr[1], then pick arr[2] that is closest to arr[1], and so on.
Depending on where your f(,)s are coming from, this might be a much easier problem than TSP; it would be useful for you to mention that as well.
You're not entirely clear what you're optimizing - the sum of the f(a[i],a[i+1]) values, the max of them, or something else?
In any event, with your speed limitations, greedy is probably your best bet - pick an element to make a[0] (it doesn't matter which due to the wraparound), then choose each successive element a[i+1] to be the one that minimizes f(a[i],a[i+1]).
That's going to be O(n^2), but with 30 items, unless this is in an inner loop or something that will be fine. If your f() really is associative and commutative, then you might be able to do it in O(n log n). Clearly no faster by reduction to sorting.
I don't think the problem is well-defined in this form:
Let's instead define n fcns g_i : Perms -> Reals
g_i(p) = f(a^p[i], a^p[i+1]), and wrap around when i+1 > n
To say you want to minimize f over all permutations really implies you can pick a value of i and minimize g_i over all permutations, but for any p which minimizes g_i, a related but different permatation minimizes g_j (just conjugate the permutation). So therefore it makes no sense to speak minimizing f over permutations for each i.
Unless we know something more about the structure of f(x,y) this is an NP-hard problem. Given a graph G and any vertices x,y let f(x,y) be 1 if there is no edge and 0 if there is an edge. What the problem asks is an ordering of the vertices so that the maximum f(arr[i],arr[i+1]) value is minimized. Since for this function it can only be 0 or 1, returning a 0 is equivalent to finding a Hamiltonian path in G and 1 is saying that no such path exists.
The function would have to have some sort of structure that disallows this example for it to be tractable.