Is the above valid C code? [closed]

I was given this question in my college in a programming contest...
void duff(register char *to, register char *from, register int count)
{
register int n=(count+7)/8;
switch(count%8){
case 0: do{ *to++ = *from++;
case 7: *to++ = *from++;
case 6: *to++ = *from++;
case 5: *to++ = *from++;
case 4: *to++ = *from++;
case 3: *to++ = *from++;
case 2: *to++ = *from++;
case 1: *to++ = *from++;
}while( --n >0);
}
}
Is the above valid C code? If so, what is it trying to achieve, and why would anyone do something like the above?

Yep, it's known as Duff's device.
It was devised by Tom Duff in 1983 while he was working at Lucasfilm, and it was written this way to get maximum performance.
But would I suggest writing code like this? No.
Keep your code readable!
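If you want to convince yourself that the code really does copy count bytes, here is a minimal sketch that compares it against memcpy. The names duff_copy and duff_copy_matches_memcpy are made up for this example, and the early return for count == 0 is an addition: the code in the question would copy 8 bytes in that case.

```c
#include <stddef.h>
#include <string.h>

/* Memory-to-memory Duff's device. The count == 0 guard is an addition;
   the version in the question would copy 8 bytes for a count of 0. */
static void duff_copy(char *to, const char *from, size_t count)
{
    if (count == 0)
        return;
    size_t n = (count + 7) / 8;   /* number of passes through the do-while */
    switch (count % 8) {          /* jump into the middle of the loop body */
    case 0: do { *to++ = *from++;
    case 7:      *to++ = *from++;
    case 6:      *to++ = *from++;
    case 5:      *to++ = *from++;
    case 4:      *to++ = *from++;
    case 3:      *to++ = *from++;
    case 2:      *to++ = *from++;
    case 1:      *to++ = *from++;
            } while (--n > 0);
    }
}

/* Returns 1 if duff_copy agrees with memcpy for every length from 0 to 40. */
static int duff_copy_matches_memcpy(void)
{
    char src[40], a[40], b[40];
    for (size_t i = 0; i < sizeof src; i++)
        src[i] = (char)('a' + i % 26);
    for (size_t len = 0; len <= sizeof src; len++) {
        memset(a, 0, sizeof a);
        memset(b, 0, sizeof b);
        duff_copy(a, src, len);
        memcpy(b, src, len);
        if (memcmp(a, b, sizeof a) != 0)
            return 0;
    }
    return 1;
}
```

Calling duff_copy_matches_memcpy() from main should return 1. In real code, just call memcpy.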

Related

Obscure C construct on interlaced switch

I came across this syntax on the Wikipedia page on Duff's Device.
The page mentions that this construct handles a count that is not evenly divisible by the unrolling factor (8 in this case), as here:
send(to, from, count)
register short *to, *from;
register count;
{
register n = (count + 7) / 8;
switch (count % 8) {
case 0: do { *to = *from++;
case 7: *to = *from++;
case 6: *to = *from++;
case 5: *to = *from++;
case 4: *to = *from++; /* {int somevar =0;} */
case 3: *to = *from++;
case 2: *to = *from++;
case 1: *to = *from++;
} while (--n > 0);
}
}
How does control even reach case 5 when the primary loop block is at case 0?
How is it syntactically correct? Also, why then is variable declaration inside the case statements constrained to block scope only?
[Edit]
Quoting lines from the same page, "...the first edition of The C Programming Language which requires only that the body of the switch be a syntactically valid (compound) statement within which case labels can appear prefixing any sub-statement...".
I believe that's an answer; however, why did the masters choose to do it this way?
That idiom is known as Duff's Device, and depends on the fact that switch/case is a very thin abstraction over goto.
In C, it is permissible to jump from outside to inside a block like that (though a good compiler will warn if doing so will miss any variable initialisations).
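To make the "thin abstraction over goto" point concrete, here is a rough sketch of the same control flow with the switch and the do-while spelled out as explicit gotos. The names copy_goto and copy_goto_ok are invented for this example, it assumes count > 0 like the original, and it increments to (unlike the code in the question) only so that the result can be checked.

```c
#include <string.h>

/* Duff's device rewritten with explicit gotos. Assumes count > 0. */
static void copy_goto(short *to, const short *from, int count)
{
    int n = (count + 7) / 8;
    int r = count % 8;

    /* This is the job of the switch: pick an entry point into the body. */
    if (r == 0) goto c0;
    if (r == 7) goto c7;
    if (r == 6) goto c6;
    if (r == 5) goto c5;
    if (r == 4) goto c4;
    if (r == 3) goto c3;
    if (r == 2) goto c2;
    goto c1;

c0: *to++ = *from++;
c7: *to++ = *from++;
c6: *to++ = *from++;
c5: *to++ = *from++;
c4: *to++ = *from++;
c3: *to++ = *from++;
c2: *to++ = *from++;
c1: *to++ = *from++;
    if (--n > 0) goto c0;   /* the back edge of the do-while */
}

/* Returns 1 if copy_goto copies exactly count elements for count 1..20. */
static int copy_goto_ok(void)
{
    short src[20], dst[20];
    for (int i = 0; i < 20; i++)
        src[i] = (short)(i + 1);
    for (int count = 1; count <= 20; count++) {
        memset(dst, 0, sizeof dst);
        copy_goto(dst, src, count);
        if (memcmp(dst, src, (size_t)count * sizeof src[0]) != 0)
            return 0;
        for (int i = count; i < 20; i++)   /* nothing past count is written */
            if (dst[i] != 0)
                return 0;
    }
    return 1;
}
```

Seen this way, reaching case 5 is no more mysterious than a goto to a label in the middle of a loop body.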

Manual optimization in the past (C Language) [closed]

Back in the 70s, when C was just getting started, I guess compiler-level optimization wasn't as advanced as in modern compilers (clang, gcc, etc.), and the computers themselves were limited hardware-wise. Was it common to prefer optimizations at the source code level over readability?
Example:
int arrayOfItems[30]; // Global variable
int GetItemAt(int index)
{
return arrayOfItems[index];
}
int main()
{
// Code
// ... arrayOfItems initialized somewhere
// More code
GetSomethingByItem(GetItemAt(4)); // Get at index 4
return 0;
}
Now this can be optimized to this:
int arrayOfItems[30]; // Global variable
int main()
{
// Code
// ... arrayOfItems initialized somewhere
// More code
GetSomethingByItem(arrayOfItems[4]); // Get at index 4
return 0;
}
This completely omits the function GetItemAt, saving time by reading the value straight from its address instead of entering a function, creating a stack frame, accessing the value, and placing the result in some register. Did people prefer to write the second, 'optimized' version straight into the source code, or to use the first version so the code would be more readable?
I know that in this example you can use the preprocessor to "mimic" this optimization (e.g. #define GetItemAt(x) arrayOfItems[x]), but you get my point.
Also, maybe this exact optimization was available from the start; if so, I should find another example. Suggestions are welcome.
TL;DR -
Was it common in the past to prefer optimizations at the source code level over readability?
Bonus question:
Are there optimizations that are included only so the source code can be more readable?
I don't think many developers have ever preferred optimization over readability, but it might be argued that some optimizations harmed readability yet were necessary for performance. One example is Duff's Device (a loop unrolling optimization):
From
do { /* count > 0 assumed */
*to = *from++; /* "to" pointer is NOT incremented, see explanation below */
} while(--count > 0);
to
register n = (count + 7) / 8;
switch(count % 8) {
case 0: do { *to = *from++;
case 7: *to = *from++;
case 6: *to = *from++;
case 5: *to = *from++;
case 4: *to = *from++;
case 3: *to = *from++;
case 2: *to = *from++;
case 1: *to = *from++;
} while(--n > 0);
}
Of course, it turns out that compilers got smarter, and it has been reported on the LKML that removing Duff's Device improved performance and reduced memory usage. From the linked Wikipedia article,
For the purpose of memory-to-memory copies (which was not the original use of Duff's device, although it can be modified to serve this purpose as described in section below), the standard C library provides function memcpy; it will not perform worse than a memory-to-memory copy version of this code, and may contain architecture-specific optimizations that will make it significantly faster
and from the LKML (in 2000)
... this effect in the X server.
It turns out that with branch predictions and the relative speed of CPU
vs. memory changing over the past decade, loop unrolling is pretty much
pointless. In fact, by eliminating all instances of Duff's Device from
the XFree86 4.0 server, the server shrunk in size by half a megabyte, and was faster to boot, because the elimination of all
that excess code meant that the X server wasn't thrashing the cache
lines as much.
As for optimizations that only improve readability, it would require that your code first be unreadable. Then anything that makes it more readable would seem to qualify. Finally, remember that premature optimization is the root of all evil.
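For the accessor example from the question, here is a small sketch of how the readable version can be kept today. The names readable_sum and direct_sum are invented for illustration, and whether the call is actually inlined depends on the compiler and optimization level, so treat that part as an assumption rather than a guarantee.

```c
/* Names follow the question's example. */
static int arrayOfItems[30];

/* A small accessor. With optimization enabled, compilers will typically
   inline a call like this, but the language does not guarantee it. */
static inline int GetItemAt(int index)
{
    return arrayOfItems[index];
}

/* Readable version: every access goes through the accessor. */
static int readable_sum(void)
{
    int sum = 0;
    for (int i = 0; i < 30; i++)
        sum += GetItemAt(i);
    return sum;
}

/* Hand-"optimized" version: direct array access, no function call. */
static int direct_sum(void)
{
    int sum = 0;
    for (int i = 0; i < 30; i++)
        sum += arrayOfItems[i];
    return sum;
}
```

Both functions compute the same result; the only question is whether the accessor costs anything, which is best answered by measuring rather than guessing.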

Switch case weird scoping

Reviewing some 3rd party C code I came across something like:
switch (state) {
case 0:
if (c=='A') { // open brace
// code...
break; // brace not closed!
case 1:
// code...
break;
} // close brace!
case 2:
// code...
break;
}
In the code I was reviewing this appeared to be just a typo, but I was surprised that it compiled without error.
Why is this valid C?
What is the effect on the execution of this code compared to closing the brace at the expected place?
Is there any case where this could be of use?
Edit: In the example I looked at all breaks were present (as above) - but answer could also include behaviour if break absent in case 0 or 1.
Not only is it valid, similar structure has been used in real code, e.g., Duff's Device, which is an unrolled loop for copying a buffer:
send(to, from, count)
register short *to, *from;
register count;
{
register n = (count + 7) / 8;
switch(count % 8) {
case 0: do { *to = *from++;
case 7: *to = *from++;
case 6: *to = *from++;
case 5: *to = *from++;
case 4: *to = *from++;
case 3: *to = *from++;
case 2: *to = *from++;
case 1: *to = *from++;
} while(--n > 0);
}
}
Since a switch statement really just computes an address and jumps to it, it's easy to see why it can overlap with other control structures; the lines within other control structures have addresses that can be jump targets, too!
In the case you presented, imagine if there were no switch or breaks in your code. When you've finished executing the then portion of an if statement, you just keep going, so you'd fall through into the case 2:. Now, since you have the switch and break, it matters what break can break out of. According to the MSDN page, “The C break statement”,
The break statement terminates the execution of the nearest enclosing do, for, switch, or while statement in which it appears. Control passes to the statement that follows the terminated statement.
Since the nearest enclosing do, for, switch, or while statement is your switch (notice that if is not included in that list), then if you're inside the then block, you transfer to the outside of the switch statement. What's a bit more interesting, though, is what happens if you enter case 0, but c == 'A' is false. Then the if transfers control to just after the closing brace of the then block, and you start executing the code in case 2.
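In C and C++ it is legal to jump into loops and if blocks, subject to some restrictions on declarations: C++ forbids jumping past a declaration with an initializer, while C only forbids jumping into the scope of a variably modified type such as a variable-length array (a jumped-over initializer simply doesn't run). You can check this answer for an example using goto, and the same rules apply to switch blocks.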
The semantics are different than if the } was above case 1 as you would expect.
This code actually says if state == 0 and c != 'A' then go to case 2 since that's where the closing brace of the if statement is. It then processes that code and hits the break statement at the end of the case 2 code.
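To see the effect concretely, here is a minimal sketch that mirrors the switch from the question. The function name which_section and the section numbers are made up for illustration; each "// code..." placeholder is replaced by an assignment that records which section ran.

```c
/* Reports which section of the switch runs for a given state and c:
   0 = body of the if in case 0, 1 = case 1, 2 = case 2, -1 = none. */
static int which_section(int state, char c)
{
    int section = -1;
    switch (state) {
    case 0:
        if (c == 'A') {     /* open brace */
            section = 0;
            break;          /* leaves the switch, not just the if */
    case 1:                 /* label inside the if body */
            section = 1;
            break;
        }                   /* the if body ends here */
    case 2:
        section = 2;
        break;
    }
    return section;
}
```

which_section(0, 'A') gives 0 and which_section(1, 'x') gives 1, as you would expect, but which_section(0, 'B') gives 2: the failed if skips past its closing brace and lands in case 2.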

C switch statement with do-while interleaved [duplicate]

I am trying to understand how this is working. Any help would be appreciated.
#include<stdio.h>
void duff(int count)
{
int n=(count+7)/8;
printf("n=%d count =%d\n",n,count%8);
switch(count%8){
case 0: do{ printf("case 0\n");
case 7: printf("case 7\n");
case 6: printf("case 6\n");
case 5: printf("case 5\n");
case 4: printf("case 4\n");
case 3: printf("case 3\n");
case 2: printf("case 2\n");
case 1: printf("case 1\n");
}while( --n >0);
}
}
int main(void){
int count;
if(scanf("%d",&count)!=1) return 1;
duff(count);
return 0;
}
Basically, if the switch evaluates to case 2, then the do statement of the while is never executed. But I ran this program and it gave me the output below, which I am unable to explain:
output:
3
n=1 count =3
case 3
case 2
case 1
This is known as Duff's device and is used as an optimization technique to reduce branch instructions. It works because case labels without breaks fall through to the next case by default, so when you hit case 3, you keep going through case 2 and case 1.
Both the do and the case "statements" are essentially just "goto labels". They don't add any actual code. They just tell while and switch (respectively) where to jump to. In other words, there is no code for the do to (not) execute.
(That said, it is somewhat remarkable/bizarre that C's grammar allows cases to exist in children of the switch, rather than just as direct children of a switch.)
There are no break statements between the cases, so the cases fall through. Therefore count % 8 == 3 causes case 3:, case 2: and case 1: to be executed.
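One way to check this reasoning for more than one input is to count the statements instead of printing them. This is just a sketch: duff_steps is a hypothetical helper, and it assumes count > 0.

```c
/* Counts how many of the unrolled statements run for a given count.
   Assumes count > 0. */
static int duff_steps(int count)
{
    int steps = 0;
    int n = (count + 7) / 8;      /* passes through the do-while */
    switch (count % 8) {          /* entry point for the first, partial pass */
    case 0: do { steps++;
    case 7:      steps++;
    case 6:      steps++;
    case 5:      steps++;
    case 4:      steps++;
    case 3:      steps++;
    case 2:      steps++;
    case 1:      steps++;
            } while (--n > 0);
    }
    return steps;
}
```

For count = 3 the switch enters at case 3 and runs 3 statements in a single pass. For count = 11 it enters at case 3 as well (11 % 8 == 3), runs 3 statements, then loops back to the do for one full pass of 8, for 11 in total.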

Does Duff's Device work in other languages?

Many years ago while working on a tight graphics I/O problem, Tom Duff unrolled a loop and created his Duff's Device as follows:
dsend(to, from, count)
char *to, *from;
int count;
{
int n = (count + 7) / 8;
switch (count % 8) {
case 0: do { *to = *from++;
case 7: *to = *from++;
case 6: *to = *from++;
case 5: *to = *from++;
case 4: *to = *from++;
case 3: *to = *from++;
case 2: *to = *from++;
case 1: *to = *from++;
} while (--n > 0);
}
}
(Note this uses old style function parameters - that's not an error.)
This coding comes directly out of thinking in assembler and coding in C and is dependent on C's case statement fall-through. Can this kind of creativity in interlacing control structures work in any other languages?
You can do it in any language that supports computed GOTO statements (Fortran, some BASICs, etc.)
It works in C++.
Note though the code generated depends on your compiler. In particular, when I compiled Duff's device using GCC targeting ARM architectures, the resulting ARM assembler was sub-optimal (I think GCC turned it into a jump table) at any optimization level.
I ended up just hand-coding the assembler.
Duff's device is essentially a computed goto which can be done in many other languages - assembly (of course) and FORTRAN being a couple that support them directly.
I used it very successfully in JavaScript to speed up large array processing. I wish I could use it in C#.
Even if it cannot be used this way, you can still use two loops:
dsend(to, from, count)
char *to, *from;
int count;
{
int n;
/* Unrolled loop: 8 items per pass while at least 8 remain. */
for(n=0; n+8<=count; n+=8){
*to = *from++;
*to = *from++;
*to = *from++;
*to = *from++;
*to = *from++;
*to = *from++;
*to = *from++;
*to = *from++;
}
/* Remaining 0 to 7 items, one at a time. */
for(; n<count; n++)
{
*to = *from++;
}
}
Sure, this will be somewhat slower for small counts, but it is somewhat more readable, somewhat more portable across languages, and produces a very similar benefit for large counts.
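Here is the same two-loop idea as a sketch in prototype-style C, written as a memory-to-memory copy (indexing to as well as from) so that it can be checked against memcpy. The names copy_two_loops and copy_two_loops_ok are made up for this example.

```c
#include <stddef.h>
#include <string.h>

/* Two-loop copy: an unrolled loop for full groups of 8, then a plain loop
   for the remaining 0 to 7 bytes. */
static void copy_two_loops(char *to, const char *from, size_t count)
{
    size_t n = 0;
    for (; n + 8 <= count; n += 8) {
        to[n + 0] = from[n + 0];
        to[n + 1] = from[n + 1];
        to[n + 2] = from[n + 2];
        to[n + 3] = from[n + 3];
        to[n + 4] = from[n + 4];
        to[n + 5] = from[n + 5];
        to[n + 6] = from[n + 6];
        to[n + 7] = from[n + 7];
    }
    for (; n < count; n++)
        to[n] = from[n];
}

/* Returns 1 if copy_two_loops agrees with memcpy for lengths 0 to 40. */
static int copy_two_loops_ok(void)
{
    char src[40], a[40], b[40];
    for (size_t i = 0; i < sizeof src; i++)
        src[i] = (char)('a' + i % 26);
    for (size_t len = 0; len <= sizeof src; len++) {
        memset(a, 0, sizeof a);
        memset(b, 0, sizeof b);
        copy_two_loops(a, src, len);
        memcpy(b, src, len);
        if (memcmp(a, b, sizeof a) != 0)
            return 0;
    }
    return 1;
}
```

Unlike the device itself, this version needs no special handling for a count of 0: both loops simply do nothing.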
