Say I have a nested loop like this:
for (m = 0; m < 10; m++)
    for (n = 0; n < 10; n++)
        result[n][m] = result[m-3][n-2] + result[n+1];
Would we call either of the two loops parallelizable? My understanding is that there is a dependence involving both variables n and m, so we can't parallelize either loop.
Please clarify. Also, what type of dependence is this?
Thanks!
You are correct that the example loop there is not parallelizable (or at least not easily), but it's not because the inner contents rely on both m and n (as those could be passed as arguments to a new thread or whatever). It's because later calculations depend on the results of earlier calculations: this is a loop-carried (true/flow) dependence, where one iteration writes a value that a later iteration reads. For example, the value of result[9][8] depends on result[5][7], which depends on result[4][3], and so on.
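By way of contrast, here is a minimal sketch (my own example, assuming OpenMP and a flag such as -fopenmp) of a loop nest with no loop-carried dependence, where the outer loop can safely be parallelized because every iteration writes only its own element:

#include <stdio.h>

#define N 10

int main(void) {
    int result[N][N];

    /* no iteration reads anything written by another iteration,
       so there is no loop-carried dependence */
    #pragma omp parallel for
    for (int m = 0; m < N; m++)
        for (int n = 0; n < N; n++)
            result[m][n] = m * n;

    printf("%d\n", result[3][4]);
    return 0;
}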
Usually in C, we write statements as a list, and when the program is run, it executes the statements one by one. Is it possible to make two statements be executed simultaneously?
For example, suppose I wish to swap two variables a and b. Usually we would declare a third variable c.
c = b;
b = a;
a = c;
But if we were capable of simultaneously executing two statements, then we wouldn't need a third variable c. We could do a=b; and b=a; simultaneously instead.
So, is there a way to simultaneously execute two or more statements at the same time?
The statement "when the program is run, it executes the statements one by one" shows that you're fundamentally misunderstanding the C programming language.
The C standard says that the compiled program must be executed so that the side effects of the program happen in order, as if the statements were executed by the C abstract machine according to the rules of that abstract machine. However, assigning a value to a non-volatile variable does not count as such a side effect, i.e. in your program
c = b;
b = a;
a = c;
since none of the statements has any visible side effect, the compiler is free to reorganize, interleave and eliminate these statements, as long as it does not change any previous or following side effect or their relative ordering.
In practice, any decent compiler will notice that this is a swap operation and can encode it with one assembler opcode, such as the x86 XCHG, even though the C programming language does not have a swap operation in itself. Or it could be that these generate zero opcodes and the compiler just remembers that "henceforth b shall be known as a, and a as b."
The only way to actually force the compiler to generate a program that executes each of these statements strictly sequentially would be to have each of the statements touch a volatile-qualified variable, because accessing a volatile-qualified object is considered a side effect.
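For illustration, a minimal sketch (variable and function names are mine) in which the volatile qualifier forces all three accesses to be emitted strictly in order:

volatile int a = 1, b = 2, c;

void swap_through_c(void) {
    /* each access to a volatile object is a side effect, so the
       compiler may not reorder or eliminate these statements */
    c = b;
    b = a;
    a = c;
}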
You can create multiple threads in C (using POSIX threads, i.e. libpthread), and if you have a multi-core CPU the threads may get executed simultaneously.
The issue with your example is that the statements depend on each other's data; running them at the same time would create a race condition.
If you want to swap two variables without an intermediate variable, you can use the XOR swap algorithm, but it's less efficient than simply using an intermediate variable.
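For reference, a minimal sketch of the XOR swap (the function name is mine; note the guard against both pointers referring to the same object):

void xor_swap(unsigned int *x, unsigned int *y) {
    if (x == y)   /* XOR-ing an object with itself would zero it */
        return;
    *x ^= *y;
    *y ^= *x;     /* *y now holds the original *x */
    *x ^= *y;     /* *x now holds the original *y */
}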
There is multithreading, but what you describe is not possible because there is a data dependency between these two statements.
Executing the two simultaneously will never give a predictable result.
What you can do is identify independent sections of the program and then execute them in different threads (see the sketch below). That's the level of parallelism you can achieve from the programmer's perspective.
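A minimal sketch of that idea (function names are mine; compile with -pthread), with two independent sections running in separate threads:

#include <pthread.h>
#include <stdio.h>

/* two independent sections: neither reads or writes the other's data */
static void *section_a(void *arg) { puts("section A"); return NULL; }
static void *section_b(void *arg) { puts("section B"); return NULL; }

int main(void) {
    pthread_t ta, tb;
    pthread_create(&ta, NULL, section_a, NULL);
    pthread_create(&tb, NULL, section_b, NULL);
    pthread_join(ta, NULL);
    pthread_join(tb, NULL);
    return 0;
}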
So, is there a way to simultaneously execute two or more statements at the same time?
Yes, by multithreading.
Even then, however, you need two threads actually running at the same time in order to achieve that effect.
In general we don't strive to have statements executed simultaneously, though; it's too hard, and the gain simply isn't worth it.
In your case it would cause data races anyway.
PS: You can swap the numbers without a temporary variable, as I describe here.
Statements in C are executed sequentially, unless you alter the control flow with constructs such as break, continue, or goto.
You can't execute statements simultaneously, especially not in the example you have specified.
If you don't want to use a temporary variable, you can use this logic:
a = a + b;  /* a now holds old_a + old_b */
b = a - b;  /* b = (old_a + old_b) - old_b = old_a */
a = a - b;  /* a = (old_a + old_b) - old_a = old_b */
(Beware: with signed int the addition can overflow, which is undefined behaviour; unsigned types are safe.)
With SIMD instructions, multiple statements can be executed simultaneously in parallel (vectorized), if your data fits the pattern.
Example:
a = e + g * ...
b = f + g * ...
c = e + g * ...
d = f + g * ...
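A hedged sketch of the same idea with SSE intrinsics (all lane values invented), computing four expressions of that shape with one multiply and one add:

#include <immintrin.h>
#include <stdio.h>

int main(void) {
    /* _mm_set_ps takes the highest lane first; lanes hold e, f, e, f */
    __m128 ef = _mm_set_ps(2.0f, 1.0f, 2.0f, 1.0f);
    __m128 g  = _mm_set1_ps(3.0f);
    __m128 x  = _mm_set_ps(8.0f, 7.0f, 6.0f, 5.0f);

    /* a, b, c, d computed simultaneously: ef + g * x, lane by lane */
    __m128 r  = _mm_add_ps(ef, _mm_mul_ps(g, x));

    float out[4];
    _mm_storeu_ps(out, r);
    printf("%g %g %g %g\n", out[0], out[1], out[2], out[3]);
    return 0;
}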
No matter how you slice it, the CPU will have to preserve the value of b in a temporary location before it gets overwritten with a, so that it can be used again in the assignment back to the other variable. Even languages such as Python that allow you to say a, b = b, a are really generating those temp variables under the hood.
There are a few folks mentioning "threads" in other answers. At this point, given the nature of the question you are asking, I would strongly recommend against pursuing that. (You aren't ready for threads!) It's highly unusual to use threads for a simple variable swap, and it will only be slower, with more race conditions to account for.
If you're looking for a shorthand way to express a "swap", you can always define your own macro.
#define SWAP(x, y) do { typeof(x) tmp = (x); (x) = (y); (y) = tmp; } while (0)
(typeof is a GCC extension; C23 also permits auto here. In earlier C, auto is only a storage-class specifier, so it cannot declare tmp's type.)
Then you could just write SWAP(a, b) in your own code.
Consider this simple problem:
Suppose I have a 1x4 array and I have to add 5 to each of its elements. Is it advisable to use a loop? Setting aside code size and code organization, is there any other reason I should use a loop? Won't it take more time than executing 4 straight lines of code, each adding 5 to one element, since control has to jump back and update the loop variable each time? And what about a 1x2 array? Then we don't even have the size problem; both versions would be 2 lines of code.
Although I am tagging this question in C, I would like to know about this in other languages too.
You don't really need to worry about this. Write it the way you find easiest to read, then let the compiler decide whether it finds it necessary to perform some loop-unrolling optimization. Trust the compiler vendors; their developers are very good at understanding this kind of optimization...
This is a micro-optimization. If you don't have to save cycles at that level, you don't have to worry about unrolling such a loop. The important factors are readability and maintainability. For a loop of two iterations, I don't think you add anything in readability by adding a loop.
You shouldn't worry too much about the performance of the code in such minor examples; code the way that is easiest to understand.
Using a loop gives you a way to scale the example without major changes everywhere.
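To make the comparison concrete, a small sketch of the two forms (the function names are mine); with optimization enabled, a compiler will typically emit near-identical code for both:

int arr[4] = {1, 2, 3, 4};

/* loop form: the compiler is free to unroll this itself */
void add5_loop(void) {
    for (int i = 0; i < 4; i++)
        arr[i] += 5;
}

/* hand-unrolled form: the same effect, written out */
void add5_straight(void) {
    arr[0] += 5;
    arr[1] += 5;
    arr[2] += 5;
    arr[3] += 5;
}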
I came across "loops must be folded to ensure termination" in a paper on formal methods (abstract interpretation, to be precise). I am clear on what termination means, but I do not know what a folded loop is, nor how to perform folding on a loop.
Could someone please explain what a folded loop is? And if it is not implicit in, or does not follow immediately from, the definition of a folded loop, how does this ensure termination?
Thanks
Folding a loop is the opposite of the better-known loop unfolding, which itself is better known as loop unrolling. Given a loop, to unfold it means to repeat the body several times, so that the loop test is executed less often. When the number of executions of the loop is known in advance, the loop can be completely unfolded, leaving a simple sequence of instructions. For example, this loop
for i := 1 to 4 do
  writeln(i);
can be unfolded to
writeln(1);
writeln(2);
writeln(3);
writeln(4);
See C++ loop unfolding, bounds for another example with partial unfolding.
The metaphor is that the program is folded on itself many times over; unfolding means removing some or all of these folds.
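In C, a partial unfolding looks like this (a sketch; the array a and an even trip count are assumed):

/* original, fully folded loop */
for (i = 0; i < 100; i++)
    sum += a[i];

/* partially unfolded by a factor of 2: the loop test runs half as often */
for (i = 0; i < 100; i += 2) {
    sum += a[i];
    sum += a[i + 1];
}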
In some static analysis techniques, it is difficult to deal with loops, because finding a precondition for the entry point of the loop (i.e. finding a loop invariant) requires a fixpoint computation which is unsolvable in general. Therefore some analyses unfold loops; this requires having a reasonable bound on the number of iterations, which limits the scope of programs that can be analyzed.
In other static analysis techniques, finding a good invariant is a critical part of the analysis. It doesn't help to unfold the loop; in fact, partially unfolding the loop would make the loop body larger and so make it more difficult to determine a good invariant, and completely unfolding the loop would be impractical or impossible if the number of iterations is large or unbounded. Without seeing the paper, I find the statement a bit surprising, because the code could have been written in the unfolded form to begin with, but there can be programs that the analysis only works on in a more folded form.
I have no knowledge of abstract interpretation, so I'll take the functional programming approach to folding. :-)
In functional programming, a fold is an operation applied to a list to do something with each element, updating a value each iteration. For example, you can implement map this way (in Scheme):
(define (map1 func lst)
  (fold-right (lambda (elem result)
                (cons (func elem) result))
              '() lst))
What that does is start with an empty list (let's call that the result), and then, for each element of the list, moving from the right-hand side leftward, call func on the element and cons its result onto the result list.
The key here, as far as termination goes, is that with a fold, the loop is guaranteed to terminate as long as the list is finite, since you're iterating to the next element of the list each time, and if the list is finite, then eventually there will be no next element.
Contrast this with a more C-style for loop, which doesn't operate on a list but instead has the form for (init; test; update). The test is not guaranteed to ever return false, and so the loop is not guaranteed to terminate.
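In C terms the contrast looks like this (a sketch; items, len, total, cond and step are hypothetical names):

/* fold-style iteration: bounded by the length of the data,
   so it terminates whenever len is finite */
for (size_t i = 0; i < len; i++)
    total += items[i];

/* general C-style loop: nothing guarantees cond() ever returns false */
while (cond())
    step();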
Are simple loops as powerful as nested loops in terms of Turing completeness?
In terms of Turing completeness, yes they are.
Proof: It's possible to write a Brainf*** interpreter using a simple loop, for example here:
http://www.hevanet.com/cristofd/brainfuck/sbi.c
For loops with a fixed number of steps (LOOP, FOR and similar): imagine the whole purpose of a loop is to count to n. Why should it make a difference whether I loop i times in an outer loop and j times in an inner loop, as opposed to n = i * j times in just a single loop? (See the sketch below.)
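Concretely (a sketch; body, i and j are hypothetical), a doubly nested counting loop collapses into a single loop over i * j steps:

/* nested form: the body runs i * j times */
for (int x = 0; x < i; x++)
    for (int y = 0; y < j; y++)
        body(x, y);

/* single-loop form: the same i * j executions, with both indices
   recovered from one counter */
for (int k = 0; k < i * j; k++)
    body(k / j, k % j);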
Assume that no WHILE, GOTO or similar constructs are allowed in a program (just assignment, IF, and fixed loops). Then all these programs end after a finite number of steps.
The next step to more expressiveness is to allow loops where the number of iterations is, for example, determined by a condition, and it is not certain whether this condition is ever satisfied (e.g. WHILE). Then it may happen that a program won't halt. (This type of expressiveness is also known as Turing-completeness.)
Corresponding to these two forms of programs are two kinds of functions, which were historically developed around the same time and which are called primitive recursive functions and μ-recursive functions.
The depth of nesting plays no role in this.
When using formal methods to develop code, is there a generic method for determining a loop invariant, or will it be completely different depending on the problem?
It has already been pointed out that one and the same loop can have several invariants, and that computability is against you. That doesn't mean that you cannot try.
You are, in fact, looking for an inductive invariant: the word invariant may also be used for a property that is true at each iteration but for which it is not enough to know that it holds at one iteration to deduce that it holds at the next. If I is an inductive invariant, then any consequence of I is an invariant, but it may not be an inductive invariant.
You are probably trying to get an inductive invariant to prove a certain property (post-condition) of the loop in some defined circumstances (pre-conditions).
There are two heuristics that work quite well:
start with what you have (the pre-conditions) and weaken until you have an inductive invariant. To get an intuition for how to weaken, apply one or several loop iterations forward and see what ceases to be true in the formula you have.
start with what you want (the post-conditions) and strengthen until you have an inductive invariant. To get an intuition for how to strengthen, apply one or several loop iterations backwards and see what needs to be added so that the post-condition can be deduced (a worked sketch of this second heuristic follows).
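As a small worked example of the second heuristic (my own): for a summation loop, take the postcondition sum == a[0] + ... + a[n-1] and generalize the constant n to the loop variable i:

int sum_array(const int a[], int n) {
    int sum = 0;
    int i = 0;
    /* inductive invariant: sum == a[0] + ... + a[i-1]
       (trivially true on entry, where i == 0 and the sum is empty) */
    while (i < n) {
        sum += a[i];
        i++;
        /* the invariant holds here again, with the new value of i */
    }
    /* on exit i == n, so the invariant yields the postcondition */
    return sum;
}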
If you want the computer to help you in your practice, I can recommend Jessie, a deductive verification plug-in for C programs in Frama-C. There are others, especially for Java and JML annotations, but I am less familiar with them. Trying out the invariants you think of is much faster than working out on paper whether they hold. I should point out that verifying that a property is an inductive invariant is also undecidable, but modern automatic provers do great on many simple examples. If you decide to go that route, get as many as you can from the list: Alt-Ergo, Simplify, Z3.
With the optional (and slightly difficult to install) library Apron, Jessie can also infer some simple invariants automatically.
It's actually trivial to generate loop invariants. true is a good one for instance. It fulfills all three properties you want:
It holds before loop entry
It holds after each iteration
It holds after loop termination
But what you're after is probably the strongest loop invariant. Finding the strongest loop invariant, however, can even be undecidable. See the article Inadequacy of Computable Loop Invariants.
I don't think it's easy to automate this. From the Wikipedia article:
Because of the fundamental similarity of loops and recursive programs, proving partial correctness of loops with invariants is very similar to proving correctness of recursive programs via induction. In fact, the loop invariant is often the inductive property one has to prove of a recursive program that is equivalent to a given loop.
I've written about writing loop invariants on my blog; see Verifying Loops Part 2. The invariants needed to prove a loop correct typically comprise two parts:
A generalisation of the state that is intended when the loop terminates.
Extra bits needed to ensure that the loop body is well-formed (e.g. array indices in bounds).
(2) is straightforward. To derive (1), start with a predicate expressing the desired state after termination. Chances are it contains a 'forall' or 'exists' over some range of data. Now change the bounds of the 'forall' or 'exists' so that (a) they depend on variables modified by the loop (e.g. loop counters), and (b) the invariant is trivially true when the loop is first entered (usually by making the range of the 'forall' or 'exists' empty).
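For instance (a sketch, my own example), for a loop that zeroes an array, the bound n in the postcondition is replaced by the loop counter i:

/* desired postcondition: forall k in [0, n): a[k] == 0 */
for (i = 0; i < n; i++) {
    /* invariant at the top of each iteration: forall k in [0, i): a[k] == 0
       (trivially true on first entry, where i == 0 and the range is empty) */
    a[i] = 0;
}
/* at exit i == n, so the invariant gives the postcondition */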
There are a number of heuristics for finding loop invariants. One good book about this is "Programming in the 1990s" by Ed Cohen. It's about how to find a good invariant by manipulating the postcondition by hand. Examples are: replace a constant by a variable, strengthen the invariant, ...