I need help to solve this problem. I have code with operations being performed on the previous element of the array, and I need to parallelize it using OpenMP tasks; however, I don't know how to remove this dependency. My teacher says that in this case it is not worthwhile to use depend(in, out). How would I remove this dependency, then?
void sum_sequential(float a[N]) {
    a[0] = 1.0;
    for (int i = 1; i < N; i++)
        a[i] = 2 * a[i-1] + 1;
}
You are quite right that there is a data dependency. The value to be computed for each element is expressed in terms of the value previously computed for the preceding element, so this algorithm, as written, is inherently serial. There is no practical advantage in parallelizing it as-is: successfully dealing with the dependencies leaves no room for any concurrency.
HOWEVER, if you study the computation carefully, it is possible to re-express it in an algebraically equivalent way that does not have such dependencies -- an altogether different algorithm that produces the same results and is trivially parallelizable. At the risk of providing too big a hint, try writing out the first several terms of the results by hand, and see whether you recognize a simple pattern.
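If you do spot it: each value satisfies a[i] + 1 = 2 * (a[i-1] + 1), so a[i] = 2^(i+1) - 1 in closed form, and every element can be computed independently. A minimal sketch of the rewrite (the name sum_parallel is mine; compile with -fopenmp; whether you then use a parallel for or tasks is up to you):
#include <math.h>

void sum_parallel(float a[N]) {
    #pragma omp parallel for
    for (int i = 0; i < N; i++)
        a[i] = exp2f((float)(i + 1)) - 1.0f;  /* closed form: 2^(i+1) - 1 */
}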
I have a couple of general questions on genetic algorithms. In the selection step, where you pick chromosomes from the population, is there an ideal number of chromosomes to pick? What difference does it make if I pick, say, 10 chromosomes instead of 20? Does it have any effect on the final result? At the mutation stage, I've learnt there are different ways to mutate - single-point crossover, two-point crossover, uniform crossover and arithmetic crossover. When should I choose one over the other? I know these sound very basic, but I couldn't find an answer anywhere, so I thought I should ask on Stack Overflow.
Thanks
It seems to me that your terminology and concepts are a little bit messed up. Let me clarify.
First of all - there are many names people use for the members of the population: genotype, genome, chromosome, individual, solution... I will use solution for now since it is, in my opinion, the most general term, and it is what we eventually evolve. Also, I'm not a biologist, so I don't know whether genotype, genome and chromosome somehow differ, and if they do, what the difference is...
Population
Genetic Algorithms are population-based evolutionary algorithms. The algorithm (usually) maintains a fixed-size population of candidate solutions to the problem it is solving.
Genetic operators
There are two principal genetic operators - crossover and mutation. The goal of crossover is to take two (or, in some cases, more) solutions and combine them into a solution that has some properties of both, ideally the best of both. The goal of mutation is to introduce new genetic material that was not previously present in the population by making a small random change.
The choice of the particular operators, i.e. whether to use a single-point or multi-point crossover..., is totally problem-dependent. For example, if your solutions are composed of logical blocks of bits that work together within each block, it might not be a good idea to use uniform crossover, because it will destroy these blocks. In such a case, a single- or multi-point crossover is a better choice, and the best choice is probably to restrict the crossover points to the block boundaries only.
You have to try what works best for your problem. Also, you can always use all of them, e.g. by randomly choosing which crossover operator to use each time a crossover is about to be performed. Similarly for mutation.
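For illustration, here is a minimal C sketch of a single-point crossover and a bit-flip mutation on fixed-length bit-string solutions (the representation, length and names are assumptions of mine, not from any particular library):
#include <stdlib.h>

#define GENOME_LEN 32   /* illustrative solution length */

/* Single-point crossover: swap the tails of a and b after a random point. */
void crossover(char a[GENOME_LEN], char b[GENOME_LEN]) {
    int point = 1 + rand() % (GENOME_LEN - 1);
    for (int i = point; i < GENOME_LEN; i++) {
        char tmp = a[i];
        a[i] = b[i];
        b[i] = tmp;
    }
}

/* Bit-flip mutation: flip each bit with a small probability. */
void mutation(char a[GENOME_LEN], double per_bit_probability) {
    for (int i = 0; i < GENOME_LEN; i++)
        if ((double)rand() / RAND_MAX < per_bit_probability)
            a[i] = !a[i];
}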
Modes of operation
Now to your first question about the number of selected solutions. Genetic Algorithms can run in two basic modes - generational mode and steady-state mode.
Generational mode
In generational mode, the whole population is replaced in every generation (iteration) of the algorithm. A simple python-like pseudo-code for a generational-mode GA could look like this:
P = [...]  # initial population
while not stopping_condition():
    Pc = []  # empty population of children
    while len(Pc) < len(P):
        a = select(P)  # select a solution from P using some selection strategy
        b = select(P)
        if rand() < crossover_probability:
            a, b = crossover(a, b)
        if rand() < mutation_probability:
            a = mutation(a)
        if rand() < mutation_probability:
            b = mutation(b)
        Pc.append(a)
        Pc.append(b)
    P = Pc  # replace the population with the population of children
Evaluation of the solutions was omitted.
Steady-state mode
In steady-state mode, the population persists and only a few solutions are replaced in each iteration. Again, a simple steady-state GA could look like this:
P = [...]  # initial population
while not stopping_condition():
    a = select(P)  # select a solution from P using some selection strategy
    b = select(P)
    if rand() < crossover_probability:
        a, b = crossover(a, b)
    if rand() < mutation_probability:
        a = mutation(a)
    if rand() < mutation_probability:
        b = mutation(b)
    replace(P, a)  # put a child back into P based on some replacement strategy
    replace(P, b)
Evaluation of the solutions was omitted.
So, the number of selected solutions depends on how you want your algorithm to operate.
I have very little experience with dynamic programming. I used it to solve a DNA alignment problem, a basic knapsack problem, and a simple pathfinding problem. I understood how they worked, but it's not something I feel absolutely comfortable with yet.
I have a problem that reminds me of 0-1 dynamic programming, but the differences have thrown me off, and I'm not sure if I can still use this technique, or if I have to settle for a recursive approach.
Let's say I have a list of items, each with different values, weights, and costs. There may be more than one of each item.
Let's say I have to choose a combo of those items which is the most valuable, but remains within the limits of weight and cost. So far, I've described the knapsack problem, pretty much, with 2 constraints. But here's the difference:
The value of a chosen item changes depending on how many of them I have in the combo.
Let's say that each item has a function associated with it, that tells me what a group of those items is worth to me. It's a basic linear function, such as
value_of_item = -3(quantity of that item) + 50
So if I have 1 of some item in a combo, then its value to me is 47. If I had 2 of them, then they're only worth 44 to me, each.
If I use a dynamic programming table for this, then for each cell I'd have to backtrack to see if that item is already in the current combo, making DP pointless. But maybe there's a way to re-frame the problem so I can take advantage of DP.
Hopefully that made sense.
The alternative is to generate every combo of items within the limits of cost and weight, compute the value of each combo, and choose the most valuable one. For a list of even 1000 items, that's going to be an expensive search, and it's something I'd be computing repeatedly. I'd like to find a way to exploit the advantages of DP.
If your functions are of the form
value(x, count) = base(x) - factor(x) * count, factor(x) > 0,
then you can reduce the problem to standard knapsack by splitting the items:
x -> x_1 to x_max_count
value_new(x_i) = value(x, i)
weight(x_i) = weight(x)
Now you can easily verify that no optimal solution to the new problem uses some item x_j without using every x_i with i < j.
Proof by contradiction: Assume there is such an optimal solution S and it uses x_j, but not x_i, j > i. Then there is an alternative solution S' that uses x_i instead of x_j. Since j > i,
value_new(x_j) = value(x, j)
= base(x) - factor(x) * j
< base(x) - factor(x) * i
= value(x, i)
= value_new(x_i)
and therefore S' has a higher value than S, which is a contradiction.
Furthermore, we can allow factor(x) = 0; this corresponds to a standard knapsack item.
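A minimal C sketch of the reduction, assuming a single weight constraint (the cost limit from the question would add a second DP dimension) and made-up numbers:
#include <stdio.h>
#include <string.h>

#define MAX_SPLIT 64   /* generous bound on the number of split items */
#define W_LIMIT   50   /* made-up weight limit */

int main(void) {
    /* one made-up item type: base 50, factor 3, weight 7, at most 5 copies */
    int base = 50, factor = 3, weight = 7, max_count = 5;

    int val[MAX_SPLIT], wt[MAX_SPLIT], n = 0;
    for (int i = 1; i <= max_count; i++) {  /* split: x -> x_1 .. x_max_count */
        val[n] = base - factor * i;         /* value_new(x_i) = value(x, i) */
        wt[n] = weight;
        n++;
    }

    /* standard 0/1 knapsack over the split items */
    int dp[W_LIMIT + 1];
    memset(dp, 0, sizeof dp);
    for (int i = 0; i < n; i++)
        for (int w = W_LIMIT; w >= wt[i]; w--)
            if (dp[w - wt[i]] + val[i] > dp[w])
                dp[w] = dp[w - wt[i]] + val[i];

    printf("best value within weight %d: %d\n", W_LIMIT, dp[W_LIMIT]);
    return 0;
}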
However, if the value function has the form
value(x, count) = base(x) + factor(x) * count
where factor(x) is an arbitrary value, the solution above no longer works, because then the last split item would be the one with the largest value. Maybe some sophisticated modification of DP would allow you to handle such value functions, but I don't see any modification of the problem itself that allows using DP right away.
Some research on this topic (more general):
http://dept.cs.williams.edu/~heeringa/publications/knapsack.pdf
http://clweb.csa.iisc.ernet.in/vsuresh/Kamesh-PLKP.pdf
A common problem I encounter when I want to write concise/readable code:
I want to update all the values of a vector matching a logical expression with a value that depends on the previous value.
For example, double all even entries:
weights = [10 7 4 8 3];
weights(mod(weights,2)==0) = weights(mod(weights,2)==0) * 2;
% weights = [20 7 8 16 3]
Is it possible to write the second line in a more concise fashion (i.e. avoiding the double use of the logical expression, something like i += 3 for i = i + 3 in other languages)? I often use this kind of vector operation in different contexts/variables, and with long conditionals I feel that my code is less concise and readable than it could be.
Thanks!
How about
ind = mod(weights,2)==0;
weights(ind) = weights(ind)*2;
This way you avoid calculating the indices twice and it's easy to read.
Regarding your other comment to Wauzl: such powerful operation capabilities come from the Fortran side. The syntax limitation is purely MATLAB's design, and it is quickly becoming obsolete. Let's take this horribleness further:
for i=1:length(weights), if mod(weights(i),2)==0, weights(i)=weights(i)*2; end, end
It is even slightly faster than your two-liner, because yours does the conditional indexing twice, once on each side of the assignment. In general, consider switching to Python 3.
Well, after more searching around, I found this link that deals with this issue (I did use search before posting, I swear!), and there is interesting further discussion regarding this topic in the links in that thread. So apparently there are issues with ambiguity when introducing such an operator.
Looks like that is the price we have to pay in terms of syntactic limitations for having such powerful matrix operation capabilities.
Thanks a lot anyway, Wauzl!
In the following program, if the last condition is true then we unnecessarily have to check all the conditions before it.
Is there any possibility of implementing a switch-case in the program below?
I have to convert code very similar to this into ARM assembly.
main()
{
    int x;
    if (x < 32768)
        x << 15;
    elseif (x < 49152)
        (x << 12) - 7;
    elseif (x < 53248)
        (x << 11) - 13;
    elseif (x < 59392)
        (x << 10) - 27;
    elseif (x < 60928)
        (x << 9) - 61;
    elseif (x < 62208)
        (x << 8) - 139;
    elseif (x < 64128)
        (x << 7) - 225;
    elseif (x < 65088)
        (x << 6) - 414;
    elseif (x < 65344)
        (x << 5) - 801;
    elseif (x < 65488)
        (x << 4) - 1595;
    elseif (x < 65512)
        (x << 3) - 2592;
    elseif (x < 65524)
        (x << 2) - 4589;
    elseif (x < 65534)
        (x << 1) - 8586;
}
Hope someone will help me.
So first things first: are you concerned about performance? If so, do you have actual profiling data showing that this code is a hot-spot and nothing else shows up on the profile?
I doubt that. In fact, I am willing to bet that you haven't even benchmarked it. You are instead looking at the code and trying to micro-optimize it.
If that's the case, then the answer is simple: stop doing that. Write your code the way it makes sense to write it, and focus on improving the algorithmic efficiency of your code. Let the compiler worry about optimizing things. If the performance proves inadequate, then profile and focus on the results of the profiling. Almost always the answer to your performance problems will be: choose a better-performing algorithm. The answer will almost never be "tinker with an if statement".
Now, to answer your question: a switch isn't helpful in this scenario because there's no sane way to represent a concept like x < 32768 in a case statement, short of writing one case label for every such value of x. Obviously this is neither practical nor sane.
More importantly, you seem to operate under the misconception that a switch would translate to fewer comparisons. It's possible in some cases for a compiler to avoid comparisons (e.g. by building a jump table), but most of the time a switch means as many comparisons as you have case labels. So if you need to check a variable against 10000 different possible values using a switch, you'll get up to 10000 comparisons.
In your case, you're checking against way more than 10,000 possible values, so the simple if construct combined with the less-than operator makes a lot more sense and will be much more efficient than a switch.
You write that "if the last condition is true then we unnecessarily have to check all the conditions before it." True, you do. You could rewrite it so that if the last condition were true you would only need two comparisons. But then you'd simply flip the problem on its head: if x < 32768, you'd end up having to check all the other ranges, so you'd be back where you started.
One possible solution would be to perform binary search. This would certainly qualify as an algorithmic improvement, but again without hard data that this is, indeed, a hotspot, this would be a rather silly exercise.
The bottom line is this: write correct code that is easy to understand, maintain and improve, and don't worry about reordering if statements. The excellent answer by perreal shows a good example of simple, easy-to-understand and maintainable code.
And on the topic of writing correct code: there's no such thing as elseif in C and C++. Which brings me to my last point: before micro-optimizing code, at least try to run the compiler.
You can't do that with switches, since you need constant values to compare x against, not boolean conditions. But you can use a struct like this:
struct {
    int u_limit;
    int shift_left;
    int add;
} ranges[13] = { {32768, 15, 0}, {49152, 12, -7} /*, ...*/ };

for (int i = 0; i < 13; i++) {
    if (x < ranges[i].u_limit) {
        x = (x << ranges[i].shift_left) + ranges[i].add;
        break;
    }
}
Of course, you can replace the linear search with binary search for some speedup.
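For instance, a lower-bound binary search over the same table might look like this sketch (assuming ranges[] is sorted by u_limit, as above):
/* find the first entry whose u_limit exceeds x */
int lo = 0, hi = 12;
while (lo < hi) {
    int mid = (lo + hi) / 2;
    if (x < ranges[mid].u_limit)
        hi = mid;
    else
        lo = mid + 1;
}
if (x < ranges[lo].u_limit)  /* guard: x may exceed every limit */
    x = (x << ranges[lo].shift_left) + ranges[lo].add;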
Reading a couple of questions about post-increment and pre-increment, I find myself needing to explain to a new programmer in what cases one would actually need one or the other: in what type of scenario one would apply a post-increment operator, and in what scenario it is better to apply a pre-increment one.
This is to teach case studies where, in a particular piece of code, one would need to apply one or the other in order to obtain specific values for certain tasks.
The short answer is: You never need them!
The long answer is that the instruction sets of early micro-computers had features like that: upon reading a memory cell, you could post-increment or pre-decrement it as part of the same instruction. Such machine-level features inspired the predecessors of C, whence they found their way even into more recent languages.
To understand this, one must remember that RAM was extremely scarce in those days. When you have 64K of addressable RAM for your program, you'll find it worthwhile to write very compact code. The machine architectures of those days reflected this need by providing extremely powerful instructions. Hence you could express code like:
s = s + a[j]
j = j + 1
with just one instruction, given that s and j were in a register.
Thus we have language features in C that allowed the compiler, without much effort, to generate efficient code like:
register int s = 0; // clr r5
s += a[j++]; // mov j+, r6 move j to r6 and increment j after
// add r5,a[r6]
The same goes for the short-cut operations like +=, -=, *= etc.
They are
A way to save typing
A help for a compiler that had to fit in small RAM and couldn't afford much optimization
For example,
a[i] *= 5
which is short for
a[i] = a[i] * 5
in effect saves the compiler some form of common subexpression analysis.
And yet, all these language features can always be replaced by equivalent, maybe slightly longer, code that doesn't use them. Modern compilers translate both forms to equally efficient code.
So the bottom line, and the answer to your question: you don't need to look for cases where one needs to apply those operators. Such cases simply do not exist.
Well, some people like Douglas Crockford are against using those operators, because they can lead to unexpected behavior by hiding the final result from the untrained eye.
But, since I'm using JSHint, let's share an example here:
http://jsfiddle.net/coma/EB72c/
var List = function(values) {
    this.index = 0;
    this.values = values;
};

List.prototype.next = function() {
    return this.values[++this.index];
};

List.prototype.prev = function() {
    return this.values[--this.index];
};

List.prototype.current = function() {
    return this.values[this.index];
};

List.prototype.prefix = function(prefixes) {
    var i;
    for (i = 0; i < this.values.length; i++) {
        this.values[i] = prefixes[i] + this.values[i];
    }
};
var animals = new List(['dog', 'cat', 'parrot']);
console.log('current', animals.current());
console.log('next', animals.next());
console.log('next', animals.next());
console.log('current', animals.current());
console.log('prev', animals.prev());
console.log('prev', animals.prev());
animals.prefix(['Snoopy the ', 'Garfield the ', 'A ']);
console.log('current', animals.current());
As others have said, you never "need" either flavor of the ++ and -- operators. Then again, there are a lot of language features that you never "need" but that are useful for clearly expressing the intent of the code. You don't need an assignment that returns a value either, or unary negation, since you can always write (0-x)... Heck, if you push that to the limit, you don't need C at all, since you can always write in assembler, and you don't need assembler, since you can always set the bits of the instructions by hand...
(Cue Frank Hayes' song, When I Was A Boy. And get off my lawn, you kids!)
So the real answer here is to use the increment and decrement operators where they make sense stylistically -- where the value being manipulated is in some sense a counter and where it makes sense to advance the count "in passing". Loop control is one obvious place where people expect to see increment/decrement and where it reads more clearly than the alternatives. And there are many C idioms which have almost become meta-statements in the language as it is actually used -- the classic one-liner version of strcpy(), for example -- and which an experienced C programmer will recognize at a glance and be able to recreate at need; many of those do take advantage of increment/decrement as side effect.
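For reference, that strcpy() idiom looks roughly like this sketch (the function name is mine; the real one lives in <string.h>):
/* The classic copy-in-passing idiom: post-increment advances both
   pointers as a side effect of the copy-and-test. */
char *my_strcpy(char *dst, const char *src) {
    char *ret = dst;
    while ((*dst++ = *src++) != '\0')
        ;   /* empty body: the work happens in the condition */
    return ret;
}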
Unfortunately, "where it makes sense stylistically" is not a simple rule to teach. As with any other aspect of coding style, it really needs to come from exposure to other folks' code and from an understanding of how "native speakers" of the programming language think about the code.
That isn't a very satisfying answer, I know. But I don't think a better one exists.
Usually, it just depends on what you need in your specific case. If you need the result of the operation, it's just a question of whether you need the value of the variable from before or after the increment / decrement. So use the one which makes the code clearer to understand.
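A minimal illustration of the difference:
int i = 5;
int before = i++;  /* before == 5, i == 6: old value used, then incremented */
i = 5;
int after = ++i;   /* after == 6, i == 6: incremented first, then value used */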
In some languages like C++ it is considered good practice to use the pre-increment / pre-decrement operators if you don't need the value, since the post operators need to store the previous value in a temporary during the operation, therefore they require more instructions (additional copy) and can cause performance issues for complex types.
I'm no C expert, but I don't think it really matters which one you use in C, since there are no increment / decrement operators for large structs there -- they only apply to scalar types, where the extra copy costs next to nothing.