Brute force implementation for 0-1 Knapsack - c

I have been struggling with this task for almost a week without finding a solution, so this site is my last hope.
I have a 0-1 knapsack problem with 20 items of different values and weights, and the maximum weight of the sack is 524. I need to implement a brute-force search for the optimal subset of the 20 items, i.e. the subset whose total weight is <= 524 and whose total value is maximal.
Could you please point me in the right direction or, better, give a detailed implementation so I can analyze how it works?
Thank you very much.

The brute-force idea is easy:
1. Generate all possible subsets of your 20 items, saving only those that satisfy your weight constraint. If you want to be fancy, you can consider only subsets to which nothing else can be added without violating the weight constraint, since only these can possibly be the right answer. This step is O(2^n).
2. Find the subset with maximum value. This is linear in the number of candidates, and since we have O(2^n) candidates, it is O(2^n).
Please comment if you'd like some pseudocode.
EDIT: What the hey, here's the pseudocode just in case.
GetCandidateSubsets(items[1..N], buffer, maxw)
    addedSomething = false
    for i = 1 to N do
        if not buffer.contains(items[i]) and
           weight(buffer) + weight(items[i]) <= maxw then
            add items[i] to buffer
            GetCandidateSubsets(items[1..N], buffer, maxw)
            remove items[i] from buffer
            addedSomething = true
    if not addedSomething then
        emit & store a copy of buffer
Note that the GetCandidateSubsets function is not very efficient, even for a brute force implementation. Thanks to amit for pointing that out. You could rework this to only walk the combinations, rather than the permutations, of the item set, as a first-pass optimization.
GetMaximalCandidate(candidates[1..M])
    if M = 0 then return Null
    else
        maxel = candidates[1]
        for i = 2 to M do
            if value(candidates[i]) > value(maxel) then
                maxel = candidates[i]
        return maxel
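
As a rough illustration of the combinations-only variant mentioned above, here is a minimal Python sketch that walks every subset exactly once by treating the bits of an integer as "item i is in the sack". The function name knapsack_brute_force and the sample data are made up for the example and are not part of the original answer.

```python
def knapsack_brute_force(values, weights, max_weight):
    """Try every subset (encoded as a bitmask) and keep the most valuable feasible one."""
    n = len(values)
    best_value, best_mask = 0, 0
    for mask in range(1 << n):              # 2^n subsets, each visited once
        total_w = total_v = 0
        for i in range(n):
            if mask >> i & 1:               # bit i set: item i is in the sack
                total_w += weights[i]
                total_v += values[i]
        if total_w <= max_weight and total_v > best_value:
            best_value, best_mask = total_v, mask
    chosen = [i for i in range(n) if best_mask >> i & 1]
    return best_value, chosen


if __name__ == "__main__":
    # Hypothetical three-item instance, not the asker's 20-item data.
    print(knapsack_brute_force([60, 100, 120], [10, 20, 30], 50))
```

For 20 items this loop body runs about 2^20 * 20 times, which is slow in Python but still finishes; the same loop translates almost line for line to C.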

Related

Binary search modification

I have been trying to solve the following problem. I have a sequence of positive
integers which can be very long (several million elements). This
sequence can contain "jumps" in the element values. A jump
means that two consecutive elements differ by more than 1.
Example 01:
1 2 3 4 5 6 7 0
In the above mentioned example the jump occurs between 7 and 0.
I am looking for a time-efficient algorithm for finding the position where
this jump occurs. The problem is complicated by the fact that two jumps may be
present: one is the jump I am looking for, and the other is a wrap-around,
which I am not looking for.
Example 02:
9 1 2 3 4 6 7 8
Here the first jump between 9 and 1 is a wrap-around. The second jump between
4 and 6 is the jump which I am looking for.
My idea is to somehow modify the binary search algorithm, but I am not sure whether that is possible given the wrap-around. Note that at most two jumps can occur, and between these jumps the elements are sorted. Does anybody have any idea? Thanks in advance for any suggestions.

In the general case you cannot find an efficient solution (efficient meaning better than O(n), i.e. not looking at all numbers), since you cannot conclude anything about your numbers by looking at fewer than all of them. For example, if you only look at every second number (still O(n), but with a better constant factor), you would miss double jumps like this one: 1 5 3. You can and must look at every single number and compare it to its neighbours. You could split the workload and use a multicore approach, but that's about it.
Update
If you have the special case that there is only 1 jump in your list and the rest is sorted (e.g. 1 2 3 7 8 9), you can find this jump rather efficiently. You cannot use vanilla binary search, since the list might not be fully sorted and you don't know what number you are searching for, but you could use an adaptation of exponential search, which bears some resemblance to it.
We need the following assumptions for this algorithm to work:
There is only 1 jump (I ignore the "wrap around jump" since it is not technically between any following elements)
The list is otherwise sorted and it is strictly monotonically increasing
With these assumptions we are basically searching for an interruption in the monotonicity. That means we are looking for two elements a and b that are k positions apart but do not fulfil b = a + k. This equation must hold if there is no jump between the two elements (assuming, as in the question, that neighbouring elements otherwise differ by exactly 1). Now you only need to find elements that violate it without scanning linearly, hence the exponential approach. This pseudocode could be such an algorithm:
let numbers be an array of length n fulfilling our assumptions
start = 0
stepsize = 1
while (start < n-1)
    // keep stop inside the array
    while (start + stepsize > n-1)
        stepsize -= 1
    stop = start + stepsize
    while (numbers[stop] != numbers[start] + stepsize)
        // the jump must be between start and stop
        if (stepsize == 1)
            // congratulations, the jump is between start and start + 1
            return start
        else
            stepsize /= 2   // integer division
            stop = start + stepsize
    start += stepsize
    stepsize *= 2
no jump found
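
A possible Python rendering of this exponential search is sketched below. It relies on the same assumptions as the pseudocode (a single jump, and neighbouring elements that otherwise differ by exactly 1) and does not handle the wrap-around from the question; the name find_jump is chosen for illustration only.

```python
def find_jump(numbers):
    """Return index i where numbers[i + 1] != numbers[i] + 1, or None if there is no jump.

    Assumes at most one jump and steps of exactly 1 everywhere else.
    """
    n = len(numbers)
    start, step = 0, 1
    while start < n - 1:
        step = min(step, n - 1 - start)       # keep start + step inside the array
        while numbers[start + step] != numbers[start] + step:
            # the jump lies somewhere between start and start + step
            if step == 1:
                return start
            step //= 2                        # shrink the window and re-check
        start += step                         # no jump in the window: advance
        step *= 2                             # and grow the window again
    return None


if __name__ == "__main__":
    print(find_jump([1, 2, 3, 7, 8, 9]))      # the jump sits between indices 2 and 3
```

The step doubles while everything is consistent and halves once a mismatch shows up, so the number of comparisons grows much more slowly than the length of the list.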

I really can't figure out where to start

Using the nine digits 1 to 9 in order, you should find the number of ways to get N using multiplication and addition (adjacent digits may also be joined, e.g. 1 and 2 into 12).
For example, if 100 is given, you would answer 7.
The reason is that there are 7 possible ways:
100 = 1*2*3*4+5+6+7*8+9
100 = 1*2*3+4+5+6+7+8*9
100 = 1+2+3+4+5+6+7+8*9
100 = 12+3*4+5+6+7*8+9
100 = 1+2*3+4+5+67+8+9
100 = 1*2+34+5+6*7+8+9
100 = 12+34+5*6+7+8+9
If this question is given to you, how would you start?

Are we allowed to use parentheses? That would expand the number of possibilities by a lot.
I would try to find the first additive term, let’s say 1×23, first. There are a limited number of those, and since we can’t subtract, we know that if we get a term above our target, we can prune it from our search. That leaves us looking for the solution to 23 + f = 100, where f is another formula of exactly the same form. But that is exactly the same as solving the original problem for numbers 4–9 and target 77! So call your algorithm recursively and add the solutions for that subproblem to the solutions to the original problem. That is, if we have 23 + 4, are there any solutions to the subproblem with numbers 5–9 and n = 73? Divide and conquer.
You might benefit from a dynamic table of partial solutions, since it's possible you might get the same subproblem in different ways: 1+2+3 = 1×2×3, so solving the subproblem with numbers 4–9 and target 94 twice duplicates work.
You are probably better off going from right to left than from left to right, on the principle of most-constrained first. 89, 8×9, or 78+9 leave much less room for possible solutions than 1+2+3, 1×2×3, 12×3, 12+3 or 1×23.

There are three possible operations
addition
multiplication
combine, for example combine 1 and 2 to make 12
There are 8 positions between the nine digits, and each holds one of the three operations. Hence, there are a total of 3^8 = 6561 possible expressions. So I would start with
for ( i = 0; i < 6561; i++ )
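
To make the 6561-case enumeration concrete, here is a small Python sketch. It uses itertools.product in place of decoding the loop counter i in base 3, and it evaluates each expression by splitting on "+" and then on "*" (so multiplication binds tighter, with no parentheses). The function name expressions_for is invented for the example.

```python
from itertools import product


def expressions_for(target, digits="123456789"):
    """Return every expression over the digits (kept in order) that evaluates to target."""
    found = []
    # One of three choices in each gap: add, multiply, or join the digits ("").
    for ops in product(("+", "*", ""), repeat=len(digits) - 1):
        expr = digits[0] + "".join(op + d for op, d in zip(ops, digits[1:]))
        total = 0
        for term in expr.split("+"):          # a sum of terms ...
            term_value = 1
            for factor in term.split("*"):    # ... each a product of joined numbers
                term_value *= int(factor)
            total += term_value
        if total == target:
            found.append(expr)
    return found


if __name__ == "__main__":
    solutions = expressions_for(100)
    print(len(solutions))
    for s in solutions:
        print(s)
```

If the count in the question is right, running this for 100 should print 7 followed by the seven expressions listed there.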

How do I check to see if two (or more) elements of an array/vector are the same?

For one of my homework problems, we had to write a function that creates an array containing n random numbers between 1 and 365 (done), and then check whether any of these n birthdays are identical. Is there a shorter way to do this than using several loops or several logical expressions?
Thank you!
CODE SO FAR, NOT DONE YET!!
function = [prob] bdayprob(N,n)
N = input('Please enter the number of experiments performed: N = ');
n = input('Please enter the sample size: n = ');
count = 0;
for(i=1:n)
x(i) = randi(365);
if(x(i)== x)
count = count + 1
end
return

If I'm interpreting your question properly, you want to check to see if generating n integers or days results in n unique numbers. Given your current knowledge in MATLAB, it's as simple as doing:
n = 30; %// Define sample size
N = 10; %// Define number of trials
%// Define logical array where each location tells you whether
%// all birthdays were unique (no repeats) for a trial
check = false(1, N);
%// For each trial...
for idx = 1 : N
    %// Generate sample size random numbers
    days = randi(365, n, 1);
    %// Check to see if the total number of unique birthdays
    %// is equal to the sample size
    check(idx) = numel(unique(days)) == n;
end
Woah! Let's go through the code slowly, shall we? We first define the sample size and the number of trials. We then specify a logical array where each location tells you whether or not all birthdays generated for that trial were unique. Now, we start a loop where, for each trial, we generate n (the sample size) random numbers from 1 to 365. We then use unique to figure out all of the distinct integers that were generated. If all of the birthdays are unique, then the total number of unique birthdays generated should equal the sample size. If it doesn't, then we have repeats. For example, if we generated a sample of [1 1 1 2 2], the output of unique would be [1 2], and the total number of unique elements is 2. Since this doesn't equal 5, the sample size, we know that the birthdays generated weren't unique. However, if we had [1 3 4 6 7], unique would return the input unchanged, and since the output length is the same as the sample size, we know that all of the days are unique.
So, we check to see if this number is equal to the sample size for each iteration. If it is, then we output true. If not, we output false. When I run this code on my end, this is what I get for check. I set the sample size to 30 and the number of trials to be 10.
check =
0 0 1 1 0 0 0 0 1 0
Take note that if you increase the sample size, there is a higher probability that you will get duplicates, because randi can be considered as sampling with replacement. Therefore, the larger the sample size, the higher the chance of getting duplicate values. I made the sample size small on purpose so that we can see that it's possible to get unique days. However, if you set it to something like 100, or 200, you will most likely get check to be all false as there will most likely be duplicates per trial.

Here are some more approaches that avoid loops. Let
n = 20; %// define sample size
x = randi(365,n,1); %// generate n values between 1 and 365
Any of the following code snippets returns true (or 1) if there are two identical values in x, and false (or 0) otherwise:
Sort and then check if any two consecutive elements are the same:
result = any(diff(sort(x))==0);
Do all pairwise comparisons manually; remove self-pairs and duplicate pairs; and check if any of the remaining comparisons is true:
result = nnz(tril(bsxfun(@eq, x, x.'),-1))>0;
Compute the distance between distinct values, considering each pair just once, and then check if any distance is 0:
result = any(pdist(x(:))==0);
Find the number of occurrences of the most common value (mode):
[~, occurs] = mode(x);
result = occurs>1;

I don't know if I'm supposed to solve the problem for you, but perhaps a few hints may lead you in the right direction (besides, I'm not a MATLAB expert, so this will be in general terms):
Is there a shorter way? Maybe not, but you have to ask yourself what they expect of you. The solution you propose requires looping through the array in two nested loops, which means n*(n-1)/2 comparisons (i.e. quadratic time complexity).
There are a number of ways you can improve the time complexity. The most straightforward would be a 365-element table in which you keep track of whether a particular number has been seen yet, which requires only a single loop (i.e. linear time complexity), but perhaps that's not what they're looking for either. Maybe that solution is a little bit ad hoc? What we're basically looking for is a fast lookup of whether a particular number has been seen before; there exist more memory-efficient structures that allow lookup in O(1) or O(log n) time (if you know these, you have an arsenal of tools to use).
Then of course you could use the pigeonhole principle to provide the answer much faster in some special cases (remember that you were only asked to determine whether two or more numbers are equal).
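
The two hints above (a "seen" table and the pigeonhole shortcut) could look roughly like this in Python. This is only a sketch in general terms, not MATLAB; the function name has_duplicate and the num_days parameter are assumptions made for the example.

```python
def has_duplicate(days, num_days=365):
    """Return True if any value in days (integers in 1..num_days) occurs more than once."""
    # Pigeonhole principle: more samples than possible days forces a repeat.
    if len(days) > num_days:
        return True
    seen = [False] * (num_days + 1)    # seen[d] is True once day d has appeared
    for d in days:
        if seen[d]:
            return True                # single pass, stops at the first repeat
        seen[d] = True
    return False


if __name__ == "__main__":
    print(has_duplicate([1, 1, 1, 2, 2]))
    print(has_duplicate([1, 3, 4, 6, 7]))
```

The table costs 365 flags regardless of the sample size; a hash set would play the same role when the range of values is large or unknown.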

Randomize matrix elements between two values while keeping row and column sums fixed (MATLAB)

I have a bit of a technical issue, but I feel like it should be possible with MATLAB's powerful toolset.
What I have is a random n by n matrix of 0's and w's, say generated with
A=w*(rand(n,n)<p);
A typical value of w would be 3000, but that should not matter too much.
Now, this matrix has two important quantities, the vectors
c = sum(A,1);
r = sum(A,2)';
These are two row vectors, the first denotes the sum of each column and the second the sum of each row.
What I want to do next is randomize each value of w, for example between 0.5 and 2. This I would do as
rand_M = (2-0.5).*rand(n,n) + 0.5;
A_rand = rand_M.*A;
However, I don't want to just pick these random numbers: I want them to be such that for every column and row, the sums are still equal to the elements of c and r. So to clean up the notation a bit, say we define
A_rand_c = sum(A_rand,1);
A_rand_r = sum(A_rand,2)';
I want that for all j = 1:n, A_rand_c(j) = c(j) and A_rand_r(j) = r(j).
What I'm looking for is a way to redraw the elements of rand_M in a sort of algorithmic fashion I suppose, so that these demands are finally satisfied.
Now of course, unless I have infinite amounts of time this might not really happen. I therefore accept these quantities to fall into a specific range: A_rand_c(j) has to be an element of [(1-e)*c(j),(1+e)*c(j)] and A_rand_r(j) of [(1-e)*r(j),(1+e)*r(j)]. This e I define beforehand, say like 0.001 or something.
Would anyone be able to help me in the process of finding a way to do this? I've tried an approach where I just randomly repick the numbers, but this really isn't getting me anywhere. It does not have to be crazy efficient either, I just need it to work in finite time for networks of size, say, n = 50.
To be clear, the final output is the matrix A_rand that satisfies these constraints.
Edit:
Alright, so after thinking a bit I suppose it might be doable with some while statement that goes through every element of the matrix. The difficult part is that there are four possibilities: if you are at a specific element A_rand(i,j), it could be that A_rand_c(j) and A_rand_r(i) are both too small, both too large, or opposite (one too small and the other too large, in either order). The first two cases are good, because then you can just redraw the random number until it moves in the right direction (larger if both sums are too small, smaller if both are too large) and improve the situation. But the other two cases are problematic, as you will improve one sum but not the other. I guess it would have to look at which criterion is less satisfied, so that it tries to fix the one that is worse. But this is not trivial, I would say.

You can take advantage of the fact that rows/columns with a single non-zero entry in A automatically give you results for that same entry in A_rand. If A(2,5) = w and it is the only non-zero entry in its column, then A_rand(2,5) = w as well. What else could it be?
You can alternate between finding these single-entry rows/cols, and assigning random numbers to entries where the value doesn't matter.
Here's a skeleton for the process:
A_rand=zeros(size(A)) is the matrix you are going to fill
entries_left = A>0 is a binary matrix showing which entries in A_rand you still need to fill
col_totals=sum(A,1) is the amount you still need to add in every column of A_rand
row_totals=sum(A,2) is the amount you still need to add in every row of A_rand
while sum( entries_left(:) ) > 0
    % STEP 1:
    % function to fill entries in A_rand if entries_left has rows/cols with one nonzero entry
    % you will need to keep looping over this function until nothing changes
    % update() A_rand, entries_left, row_totals, col_totals every time you loop
    % STEP 2:
    % let (i,j) be the indices of the next non-zero entry in entries_left
    % assign a random number to A_rand(i,j) <= col_totals(j) and <= row_totals(i)
    % update() A_rand, entries_left, row_totals, col_totals
end
update()
    A_rand(i,j) = random_value;
    entries_left(i,j) = 0;
    col_totals(j) = col_totals(j) - random_value;
    row_totals(i) = row_totals(i) - random_value;
end
Picking the range for random_value might be a little tricky. The best I can think of is to draw it from a relatively narrow distribution centered around n*w*p, where p is the probability of an entry in A being nonzero (this would be the average value of the row/column totals).
This doesn't scale well to large matrices, as the running time grows with n^2. I tested it on a 200 by 200 matrix and it ran in about 20 seconds.
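
Whatever filling strategy is used, the acceptance criterion from the question (every row and column sum within a relative tolerance e of its target) is easy to check separately. The following Python sketch shows one way to write that check with plain nested lists; the function name sums_within_tolerance is hypothetical, and the argument names follow the question's A_rand, c, r and e.

```python
def sums_within_tolerance(a_rand, c, r, e=0.001):
    """Check that all column sums match c and all row sums match r within a factor 1 +/- e."""
    n_rows, n_cols = len(a_rand), len(a_rand[0])
    col_sums = [sum(a_rand[i][j] for i in range(n_rows)) for j in range(n_cols)]
    row_sums = [sum(row) for row in a_rand]

    def close(actual, target):
        # the interval [(1-e)*target, (1+e)*target] from the question
        return (1 - e) * target <= actual <= (1 + e) * target

    return (all(close(s, t) for s, t in zip(col_sums, c))
            and all(close(s, t) for s, t in zip(row_sums, r)))


if __name__ == "__main__":
    # Hypothetical 2 by 2 example with w = 3000.
    print(sums_within_tolerance([[2999, 0], [0, 3001]], [3000, 3000], [3000, 3000]))
```

A check like this can serve as the stopping condition of the while loop sketched in the question's edit.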

Getting N minimal contiguous blocks in an array of numbers

I'm currently working on the following problem:
Given an array of M positive numbers, I need to get N blocks of contiguous numbers with some given length. For example, when I have the array:
6 9 3 2 8 1 6 9 7
When I need to find one block of length 3, the solution is [2,8,1], which has the minimal total sum of 11. When I need to find two blocks, the algorithm should give, for example, [3,2,8] and [1,6,9], since the sum of all elements in these blocks is minimal (29). It is given that the length of the sequence is always strictly larger than N times the length of a block (so there is always a solution).
I think this problem is solvable by using DP but I currently can't see how. I'm struggling to find a recurrent relation between the subproblems. Could anyone give me a hand here?
Thanks in advance!

Calculate the sum of each block of the given length, and record each sum together with the block's start index. This can be done in O(n) with a sliding window. So you get a list like:
index sum
0 18
1 14
2 13
... ...
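Because the chosen blocks must not overlap, the start indices of any two of them must differ by at least the block length. So you need to apply a simple dynamic programming algorithm to the list you got.
If the block length is l, the list is S[0..n-1] (so its length is n), and you want to find m blocks, let F(n,m,l) be the minimal total using only the first n entries of S. Then
F(n,m,l) = min { F(n-i-l,m-1,l) + S[n-1-i] } (for i = 0 ~ n-1-(m-1)*l)
with F(k,0,l) = 0 for every k, and F(k,m,l) = infinity when k <= 0 and m > 0. Taking the minimum over i from scratch costs O(n) per entry; the same values follow from F(n,m,l) = min( F(n-1,m,l), F(n-l,m-1,l) + S[n-1] ), which costs O(1) per entry.
The complexity of this step is then O(nm), where m is how many blocks you want.
Overall, the complexity is O(nm). Let me know if you need more details.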
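
For illustration, the two steps (sliding-window block sums, then the dynamic programme over them) might be written in Python as below. The sketch uses the O(1)-per-entry form "either skip the last start index or take the block that starts there", assumes the array is at least as long as one block, and returns only the minimal total, not the blocks themselves. The name min_block_sum is made up for the example.

```python
def min_block_sum(nums, num_blocks, length):
    """Minimal total of num_blocks non-overlapping contiguous blocks of the given length."""
    INF = float("inf")
    # S[j] = sum of the block starting at index j, computed with a sliding window.
    window = sum(nums[:length])
    S = [window]
    for j in range(length, len(nums)):
        window += nums[j] - nums[j - length]
        S.append(window)
    n = len(S)
    # F[m][k] = best total with m blocks whose start indices are all below k.
    F = [[0] * (n + 1)] + [[INF] * (n + 1) for _ in range(num_blocks)]
    for m in range(1, num_blocks + 1):
        for k in range(1, n + 1):
            skip = F[m][k - 1]                               # do not start a block at k-1
            take = F[m - 1][max(k - length, 0)] + S[k - 1]   # block starts at k-1
            F[m][k] = min(skip, take)
    return F[num_blocks][n]


if __name__ == "__main__":
    data = [6, 9, 3, 2, 8, 1, 6, 9, 7]
    print(min_block_sum(data, 1, 3))   # 11, the block [2, 8, 1]
    print(min_block_sum(data, 2, 3))   # 29
```

Recovering the actual blocks would need an extra pass that walks back through F and records which of the two options was taken at each step.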