How to implement continuations in C?

I'm working on a Scheme interpreter written in C. Currently it uses the C runtime stack as its own stack, which is presenting a minor problem with implementing continuations. My current solution is manual copying of the C stack to the heap then copying it back when needed. Aside from not being standard C, this solution is hardly ideal.
What is the simplest way to implement continuations for Scheme in C?

A good summary is available in Implementation Strategies for First-Class Continuations, an article by Clinger, Hartheimer, and Ost. I recommend looking at Chez Scheme's implementation in particular.
Stack copying isn't that complex and there are a number of well-understood techniques available to improve performance. Using heap-allocated frames is also fairly simple, but you make a tradeoff of creating overhead for the "normal" situations where you aren't using explicit continuations.
If you convert input code to continuation passing style (CPS) then you can get away with eliminating the stack altogether. However, while CPS is elegant it adds another processing step in the front end and requires additional optimization to overcome certain performance implications.

I remember reading an article that may be of help to you: Cheney on the M.T.A. :-)
Some implementations of Scheme I know of, such as SISC, allocate their call frames on the heap.
#ollie: You don't need to do the hoisting if all your call frames are on the heap. There's a tradeoff in performance, of course: the time to hoist, versus the overhead required to allocate all frames on the heap. Maybe it should be a tunable runtime parameter in the interpreter. :-P

If you are starting from scratch, you really should look into continuation-passing style (CPS) transformation.
Good sources include "Lisp in Small Pieces" and Marc Feeley's Scheme in 90 minutes presentation.
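To give a flavour of the idea in C (purely illustrative; add_cps and print_result are invented names): in CPS every procedure takes an extra argument, the continuation, and "returns" by calling it instead of using the C return value.

#include <stdio.h>

/* the continuation: "what to do with the result" */
typedef void (*cont)(int result, void *env);

static void add_cps(int a, int b, cont k, void *env) {
    k(a + b, env);                       /* "return" by invoking the continuation */
}

static void print_result(int r, void *env) {
    (void)env;
    printf("%d\n", r);
}

int main(void) {
    add_cps(1, 2, print_result, NULL);   /* instead of printf("%d\n", add(1, 2)) */
    return 0;
}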

It seems Dybvig's thesis is unmentioned so far. It is a delight to read. The heap-based model is the easiest to implement, but the stack-based model is more efficient. Ignore the string-based model.
R. Kent Dybvig. "Three Implementation Models for Scheme".
http://www.cs.indiana.edu/~dyb/papers/3imp.pdf
Also check out the implementation papers on ReadScheme.org.
https://web.archive.org/http://library.readscheme.org/page8.html
The abstract is as follows:
This dissertation presents three implementation models for the Scheme
Programming Language. The first is a heap-based model used in some
form in most Scheme implementations to date; the second is a new
stack-based model that is considerably more efficient than the
heap-based model at executing most programs; and the third is a new
string-based model intended for use in a multiple-processor
implementation of Scheme.
The heap-based model allocates several important data structures in a
heap, including actual parameter lists, binding environments, and call
frames.
The stack-based model allocates these same structures on a stack
whenever possible. This results in less heap allocation, fewer memory
references, shorter instruction sequences, less garbage collection,
and more efficient use of memory.
The string-based model allocates versions of these structures right in
the program text, which is represented as a string of symbols. In the
string-based model, Scheme programs are translated into an FFP
language designed specifically to support Scheme. Programs in this
language are directly executed by the FFP machine, a
multiple-processor string-reduction computer.
The stack-based model is of immediate practical benefit; it is the
model used by the author's Chez Scheme system, a high-performance
implementation of Scheme. The string-based model will be useful for
providing Scheme as a high-level alternative to FFP on the FFP machine
once the machine is realized.

Besides the nice answers you've got so far, I recommend Andrew Appel's Compiling with Continuations. It's very well written and while not dealing directly with C, it is a source of really nice ideas for compiler writers.
The Chicken Wiki also has pages that you'll find very interesting, such as internal structure and compilation process (where CPS is explained with an actual example of compilation).

Examples that you can look at are: Chicken (a Scheme implementation, written in C, that supports continuations); Paul Graham's On Lisp, where he creates a CPS transformer to implement a subset of continuations in Common Lisp; and Weblocks, a continuation-based web framework which also implements a limited form of continuations in Common Lisp.

Continuations aren't the problem: you can implement those with regular higher-order functions using CPS. The issue with naive stack allocation is that tail calls are never optimised, which means you can't really claim to be a Scheme.
The best current approach to mapping Scheme's spaghetti stack onto the C stack is to use trampolines: essentially extra infrastructure to handle non-C-like calls and exits from procedures. See Trampolined Style (ps).
There's some code illustrating both of these ideas.
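To make the trampolining idea concrete, here is a rough sketch in C (all names are invented; a real interpreter would of course carry richer environments). Each procedure returns the next thunk to run instead of calling it, so tail calls never grow the C stack:

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct thunk {
    struct thunk *(*fn)(void *env);   /* returns the next thunk, or NULL when done */
    void *env;
} thunk;

static void trampoline(thunk *t) {
    while (t != NULL) {
        thunk *next = t->fn(t->env);
        free(t);
        t = next;
    }
}

/* example: count down from n without growing the C stack */
static thunk *countdown(void *env);

static thunk *make_countdown(long n) {
    thunk *t = malloc(sizeof *t);
    t->fn = countdown;
    t->env = (void *)(intptr_t)n;     /* stash the integer in the env pointer for brevity */
    return t;
}

static thunk *countdown(void *env) {
    long n = (long)(intptr_t)env;
    if (n == 0)
        return NULL;                  /* done */
    return make_countdown(n - 1);     /* the "tail call", expressed as a returned thunk */
}

int main(void) {
    trampoline(make_countdown(1000000));  /* deep recursion, constant C stack */
    puts("done");
    return 0;
}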

The traditional way is to use setjmp and longjmp, though there are caveats.
Here's a reasonably good explanation
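For what it's worth, here is a minimal sketch of the setjmp/longjmp approach (the search example is made up). The big caveat is that this only gives you escaping, one-shot continuations: you can jump out to a point that is still live on the C stack, but you cannot re-enter a continuation whose frame has already returned.

#include <setjmp.h>
#include <stdio.h>

static jmp_buf escape;
static int found_index;               /* result carried "through" the jump */

static void search(const int *v, int n, int target) {
    for (int i = 0; i < n; i++)
        if (v[i] == target) {
            found_index = i;
            longjmp(escape, 1);       /* invoke the escape "continuation" */
        }
}

int main(void) {
    int v[] = {3, 1, 4, 1, 5};
    if (setjmp(escape) == 0) {        /* 0 means the normal, first return */
        search(v, 5, 4);
        puts("not found");
    } else {
        printf("found at index %d\n", found_index);
    }
    return 0;
}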

Continuations basically consist of the saved state of the stack and CPU registers at the point of the context switch. At the very least you don't have to copy the entire stack to the heap when switching; you can simply redirect the stack pointer.
Continuations are trivially implemented using fibers (http://en.wikipedia.org/wiki/Fiber_%28computer_science%29). The only things that need careful encapsulation are parameter passing and return values.
On Windows, fibers are created with the CreateFiber/SwitchToFiber family of calls.
On POSIX-compliant systems the equivalent is makecontext/swapcontext.
boost::coroutine has a working implementation of coroutines for C++ that can serve as a reference point for implementation.
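As a rough illustration of the POSIX route, here is a one-shot coroutine sketch using makecontext/swapcontext (error handling omitted; note that the ucontext functions are marked obsolescent in recent POSIX, though still widely available):

#include <stdio.h>
#include <stdlib.h>
#include <ucontext.h>

static ucontext_t main_ctx, fiber_ctx;

static void fiber_body(void) {
    puts("in fiber: step 1");
    swapcontext(&fiber_ctx, &main_ctx);   /* yield back to main */
    puts("in fiber: step 2");
    /* returning resumes uc_link, i.e. main_ctx */
}

int main(void) {
    char *stack = malloc(64 * 1024);      /* the fiber's own stack */

    getcontext(&fiber_ctx);
    fiber_ctx.uc_stack.ss_sp = stack;
    fiber_ctx.uc_stack.ss_size = 64 * 1024;
    fiber_ctx.uc_link = &main_ctx;        /* where to go when the fiber returns */
    makecontext(&fiber_ctx, fiber_body, 0);

    swapcontext(&main_ctx, &fiber_ctx);   /* run the fiber until it yields */
    puts("back in main");
    swapcontext(&main_ctx, &fiber_ctx);   /* resume the fiber */
    puts("fiber finished");

    free(stack);
    return 0;
}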

As soegaard pointed out, the main reference remains R. Kent Dybvig. "Three Implementation Models for Scheme".
The idea is that a continuation is a closure that keeps its evaluation control stack. The control stack is required in order to continue the evaluation from the point where the continuation was created using call/cc.
Often, invoking a continuation takes a long time and fills memory with duplicated stacks. I wrote this silly code to demonstrate it; in MIT Scheme it makes the interpreter crash.
The code sums the first 1000 numbers: 1+2+3+...+1000.
(call-with-current-continuation
 (lambda (break)
   ((lambda (s) (s s 1000 break))
    (lambda (s n cc)
      (if (= 0 n)
          (cc 0)
          (+ n
             ;; non-tail-recursive,
             ;; the stack grows at each recursive call
             (call-with-current-continuation
              (lambda (__)
                (s s (- n 1) __)))))))))
If you change 1000 to 100 000, the code takes about 2 seconds, and if you increase the input further it crashes.

Use an explicit stack instead.

Patrick is correct: the only way you can really do this is to use an explicit stack in your interpreter, and hoist the appropriate segment of stack into the heap when you need to convert it into a continuation.
This is basically the same as what is needed to support closures in languages that support them (closures and continuations being somewhat related).
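A sketch of what that looks like with an explicit frame stack (the frame layout and the vm and continuation types here are hypothetical, not from any particular interpreter): capturing copies the live frames to the heap, and invoking copies them back.

#include <stdlib.h>
#include <string.h>

typedef struct frame {        /* whatever your interpreter keeps per call */
    int pc;
    struct object *env;
} frame;

typedef struct {
    frame *frames;            /* the interpreter's explicit stack */
    size_t depth;
} vm;

typedef struct {
    frame *saved;             /* heap copy of the stack at capture time */
    size_t depth;
} continuation;

continuation *capture(vm *m) {
    continuation *k = malloc(sizeof *k);
    k->depth = m->depth;
    k->saved = malloc(m->depth * sizeof(frame));
    memcpy(k->saved, m->frames, m->depth * sizeof(frame));
    return k;
}

void reinstate(vm *m, continuation *k) {   /* assumes m->frames has enough capacity */
    memcpy(m->frames, k->saved, k->depth * sizeof(frame));
    m->depth = k->depth;
}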

Related

tiny garbage collector in C for embedded devices

Is there some open source tiny GC implementation (preferably as one C source file)?
Google search provides tinygc.sourceforge.net :)
I've got some prototype code that might give you a head start. If all your pointers are "managed" through your interface, you can chop up a heap in any convenient way and use the classic algorithms from 70s dissertations. My adventures with a postscript garbage collector began here.
On reading through it again, the code may not be what you're looking for. It's designed to run on top of an OS. In particular, it uses relative integer locations as much as possible to allow the entire memory space to be moved by the OS if needed for a reallocation. I imagine you don't need to do that (although it guarantees that internal relocations are ok, too). But the code should show that a garbage collector doesn't have to be horribly complicated. It's just a tree traversal. It's futzing with some bits and following some pointers. Keep it simple. You can do it.
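For a sense of scale, here is a toy mark-and-sweep sketch (the object layout, root handling and the allocator that links objects onto all_objects are all invented or omitted); the real work in a production collector is in the details, not the algorithm:

#include <stdlib.h>

typedef struct obj {
    struct obj *next;          /* every allocated object is threaded on one list */
    struct obj *children[2];   /* outgoing references */
    int marked;
} obj;

static obj *all_objects;       /* head of the allocation list (filled in by the allocator) */

static void mark(obj *o) {
    if (o == NULL || o->marked)
        return;
    o->marked = 1;
    for (int i = 0; i < 2; i++)
        mark(o->children[i]);
}

static void sweep(void) {
    obj **p = &all_objects;
    while (*p) {
        if ((*p)->marked) {
            (*p)->marked = 0;  /* clear the mark for the next cycle */
            p = &(*p)->next;
        } else {
            obj *dead = *p;
            *p = dead->next;
            free(dead);
        }
    }
}

void gc(obj **roots, int nroots) {
    for (int i = 0; i < nroots; i++)
        mark(roots[i]);
    sweep();
}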

shared_ptr<> is not required to use reference count?

Do I understand the new standard correctly that shared_ptr is not required to use a reference count? Only that it is likely to be implemented this way?
I could imagine an implementation that uses a hidden linked-list somehow. In N3291 "20.7.2.2.5.(8) shared_ptr observers [util.smartptr.shared.obs]" The note says
[ Note: use_count() is not necessarily efficient. — end note ]
which gave me that idea.
You're right, nothing in the spec requires the use of an explicit "counter", and other possibilities exist.
For example, a linked-list implementation was suggested for the implementation of boost's shared_ptr; however, the proposal was ultimately rejected because it introduced costs in other areas (size, copy operations, and thread safety).
Abstract description
Some people say that shared_ptr is a "reference-counting smart pointer". I don't think that is the right way to look at it.
Actually, shared_ptr is all about (non-exclusive) ownership: all the shared_ptrs that are copies of a shared_ptr initialised with a pointer p are owners.
shared_ptr keeps track of the set of owners, to guarantee that:
while the set of owners is non-empty, delete p is not called;
when the set of owners becomes empty, delete p (or a copy of the destruction functor D) is called immediately.
Of course, to determine when the set of owners becomes empty, shared_ptr only needs a counter. The abstract description is just slightly easier to think about.
Possible implementations techniques
To keep track of the number of owners, a counter is not only the most obvious approach, it's also relatively obvious how to make it thread-safe using atomic compare-and-modify operations.
To keep track of all the owners, a linked list of owners is not only the obvious solution, it's also an easy way to avoid having to allocate any memory for each set of owners. The problem is that it isn't easy to make such an approach efficiently thread-safe (anything can be made thread-safe with a global lock, which is against the very idea of parallelism).
In the case of multi-thread implementation
On the one hand, we have a small, fixed-size (unless a custom destruction function is used) memory allocation that's very easy to optimise, plus simple atomic integer operations.
On the other hand, there is costly and complicated linked-list handling; and if a per-owner-set mutex is needed (as I think it is), the cost of memory allocation is back, at which point we might as well replace the mutex with the counter!
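To see why the counter is so cheap, here is a rough C sketch of the control-block approach (names invented, single-object ownership only, no weak pointers): each copy is one atomic increment, and destruction happens when the last owner releases.

#include <stdatomic.h>
#include <stdlib.h>

typedef struct {
    atomic_int count;          /* number of owners */
    void *ptr;                 /* the owned object */
    void (*destroy)(void *);   /* the "deleter" */
} ctrl;

typedef struct { ctrl *c; } shared;

shared share(void *p, void (*destroy)(void *)) {
    shared s = { malloc(sizeof(ctrl)) };
    atomic_init(&s.c->count, 1);
    s.c->ptr = p;
    s.c->destroy = destroy;
    return s;
}

shared copy(shared s) {
    atomic_fetch_add(&s.c->count, 1);            /* one atomic op per copy */
    return s;
}

void release(shared s) {
    if (atomic_fetch_sub(&s.c->count, 1) == 1) { /* we were the last owner */
        s.c->destroy(s.c->ptr);
        free(s.c);
    }
}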
About multiple possible implementations
How many times have I read that many implementations are possible for a "standard" class?
Who has never heard the fantasy that the complex class could be implemented with polar coordinates? This is idiotic, as we all know: complex must use Cartesian coordinates. Where polar coordinates are preferred, another class must be created; there is no way a polar complex class can serve as a drop-in replacement for the usual complex class.
Same for a (non-standard) string class: there is no reason for a string class to be internally NUL-terminated rather than storing the length as an integer, just for the fun and inefficiency of repeatedly calling strlen.
We now know that designing std::string to tolerate COW was a bad idea; it is the reason for the unusual invalidation semantics of const iterators.
std::vector is now guaranteed to be contiguous.
The end of the fantasy
At some point, the fantasy that standard classes have many significantly different reasonable implementations has to be dropped. Standard classes are primitive building blocks; not only should they be very efficient, they should have predictable efficiency.
A programmer should be able to make portable assumptions about the relative speed of basic operations. A complex class is useless for serious number crunching if even the simplest addition turns into a bunch of transcendental computations. If a string class is not guaranteed to have very fast copies via data sharing, the programmer will have to minimize string copies.
An implementer is free to choose a different implementation technique only when it doesn't make a common cheap operation extremely costly (by comparison).
For many classes, this means there is exactly one viable implementation strategy, with at most a few degrees of freedom (like the size of a block in a std::deque).

Is it okay to use functions to stay organized in C?

I'm a relatively new C programmer, and I've noticed that many conventions from other higher-level OOP languages don't exactly hold true in C.
Is it okay to use short functions to keep your code organized (even though a function will likely be called only once)? An example of this would be 10-15 lines in something like void init_file(void), then calling it first in main().
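Something like the following is presumably what the question has in mind (file name and setup details invented): a short, single-use helper that keeps main() readable.

#include <stdio.h>
#include <stdlib.h>

static FILE *data_file;

static void init_file(void) {
    data_file = fopen("data.txt", "r");
    if (data_file == NULL) {
        perror("data.txt");
        exit(EXIT_FAILURE);
    }
    /* ... a dozen more lines of setup ... */
}

int main(void) {
    init_file();      /* called exactly once, but main stays readable */
    /* rest of the program */
    fclose(data_file);
    return 0;
}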
I would have to say, not only is it OK, but it's generally encouraged. Just don't overly fragment the train of thought by creating myriads of tiny functions. Try to ensure that each function performs a single cohesive, well... function, with a clean interface (too many parameters can be a hint that the function is performing work which is not sufficiently separate from its caller).
Furthermore, well-named functions can serve to replace comments that would otherwise be needed. As well as providing re-use, functions can also (or instead) provide a means to organize the code and break it down into smaller units which can be more readily understood. Using functions in this way is very much like creating packages and classes/modules, though at a more fine-grained level.
Yes. Please. Don't write long functions. Write short ones that do one thing and do it well. The fact that they may only be called once is fine. One benefit is that if you name your function well, you can avoid writing comments that will get out of sync with the code over time.
If I can take the liberty to do some quoting from Code Complete:
(These reason details have been abbreviated and in spots paraphrased, for the full explanation see the complete text.)
Valid Reasons to Create a Routine
Note the reasons overlap and are not intended to be independent of each other.
Reduce complexity - The single most important reason to create a routine is to reduce a program's complexity (hide away details so you don't need to think about them).
Introduce an intermediate, understandable abstraction - Putting a section of code into a well-named routine is one of the best ways to document its purpose.
Avoid duplicate code - The most popular reason for creating a routine. Saves space and is easier to maintain (only have to check and/or modify one place).
Hide sequences - It's a good idea to hide the order in which events happen to be processed.
Hide pointer operations - Pointer operations tend to be hard to read and error prone. Isolating them into routines shifts focus to the intent of the operation instead of the mechanics of pointer manipulation.
Improve portability - Use routines to isolate nonportable capabilities.
Simplify complicated boolean tests - Putting complicated boolean tests into a function makes the code more readable because the details of the test are out of the way and a descriptive function name summarizes the purpose of the tests.
Improve performance - You can optimize the code in one place instead of several.
To ensure all routines are small? - No. With so many good reasons for putting code into a routine, this one is unnecessary. (This is the one thrown into the list to make sure you are paying attention!)
And one final quote from the text (Chapter 7: High-Quality Routines):
One of the strongest mental blocks to creating effective routines is a reluctance to create a simple routine for a simple purpose. Constructing a whole routine to contain two or three lines of code might seem like overkill, but experience shows how helpful a good small routine can be.
If a group of statements can be thought of as a thing - then make them a function
I think it is more than OK; I would recommend it! Short, easy-to-verify functions with well-thought-out names lead to code which is more self-documenting than long, complex functions.
Any compiler worth using will be able to inline these calls to generate efficient code if needed.
Functions are absolutely necessary to stay organized. You need to design the solution first, and then split it into functions according to the different pieces of functionality you need. A segment of code that is used multiple times should probably be written as a function.
First think about the problem you have at hand, break it down into components, and try writing a function for each component. When writing a function, if you see some code segment doing the same thing again, break it into a sub-function; likewise, a sub-module is also a candidate for another function. At some point this decomposition should stop, and that is a judgment call. In general, avoid both functions that are far too big and a proliferation of functions that are far too small.
When constructing a function, aim for a design with high cohesion and low coupling.
EDIT:
You might also want to consider separate modules. For example, if you need a stack or a queue for some application, make it a separate module whose functions can be called from other functions. This way you avoid re-coding commonly used facilities by programming them once as a group of functions stored separately.
Yes
I follow a few guidelines:
DRY (aka DIE)
Keep Cyclomatic Complexity low
Functions should fit in a Terminal window
Each one of these principles at some point will require that a function be broken up, although I suppose #2 could imply that two functions with straight-line code should be combined. It's somewhat more common to do what is called method extraction than actually splitting a function into a top and bottom half, because the usual reason is to extract common code to be called more than once.
#1 is quite useful as a decision aid. It's the same thing as saying, as I do, "never copy code".
#2 gives you a good reason to break up a function even if there is no repeated code. If the decision logic passes a certain complexity threshold, we break it up into more functions that make fewer decisions.
It is indeed a good practice to refactor code into functions, irrespective of the language being used. Even if your code is short, it will make it more readable.
If your function is quite short, you can consider inlining it.
IBM Publib article on inlining

What is an example in which knowing C will make me write better code in any other language?

In the Stack Overflow podcasts, Joel Spolsky constantly harps on Jeff Atwood about Jeff not knowing how to write code in C. His statement is that "knowing C helps you write better code." He also always uses some sort of story involving string manipulation and how knowing C would allow you to write more efficient string routines in a different language.
As someone who knows a little C, but loves to write code in perl and other high-level languages, I have never once come across a problem that I was able to solve by writing C.
I am looking for examples of real-world situations where knowing C would be useful while writing a project in a high-level/dynamic language like perl or python.
Edit: Reading some of the answers you have submitted has been great, but they still don't make sense to me in this regard:
Take the strcat example. There's a right way and a wrong way to combine strings in C. But why should I (as a high-level developer) think that I am smarter than Larry Wall? Why wouldn't the language designers write the string manipulation code the right way?
The classic example that Joel Spolsky uses is the misuse of strcat and strlen, and spotting "Shlemiel the painter" algorithms in general.
It's not that you need C to solve problems that higher-level languages can't solve, it's that knowing C well gives you a perspective on what's going on underneath all those levels of languages that allows you to write better software. Because just such a perspective helps you avoid writing code which is, unknown to you, actually O(n^2), for example.
Edit: Some clarification based on comments.
Knowing C is not a prerequisite for such knowledge, there are many ways to acquire the same knowledge.
Knowing C is also not a guarantee of these skills. You may be proficient in C and yet still write horrible, grotty, kludgy code in every other language you touch.
C is a low-level language, yet it still has modern control structures and functions so you aren't always getting caught up in the fiddly details. It's very difficult to become proficient at C without gaining a mastery of certain fundamentals (such as the details of memory management and pointers), mastery of which often pays rich dividends when working in any language.
It's always about the fundamentals.
This is true in many pursuits as well as software engineering. It is not secret incantations that make the best programmers the best, rather it is a greater mastery of the fundamentals. Experience has shown that knowledge of C tends to have a higher correlation to mastery of certain of those fundamentals, and that learning C tends to be one of the easier and more common routes to acquiring such knowledge.
It's a mistake to assume that learning C will somehow automatically give you a better understanding of low-level programming concerns. In a lot of cases even C is too high level to give you a good understanding of efficiency concerns.
A classic is i++ versus ++i. It's over-cited, so perhaps most people know the implications about performance between these two operations. But learning C wouldn't magically teach you this by itself.
I guess I understand arguments about strings. When string operations are made deceptively simple, people often use them in inefficient ways. But again, knowing that strncat exists doesn't give you a full appreciation for the efficiency concerns. A lot of C programmers probably haven't even thought about the fact that strncat has to do a strlen operation internally.
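For the record, the Shlemiel-the-painter effect looks roughly like this (join_slow and join_fast are made-up names; both assume dst is large enough): strcat has to walk to the end of dst on every call, so the loop is O(n^2), while remembering the end pointer keeps it O(n).

#include <string.h>

void join_slow(char *dst, const char **parts, int n) {
    dst[0] = '\0';
    for (int i = 0; i < n; i++)
        strcat(dst, parts[i]);          /* rescans all of dst every iteration */
}

void join_fast(char *dst, const char **parts, int n) {
    char *end = dst;
    for (int i = 0; i < n; i++) {
        size_t len = strlen(parts[i]);
        memcpy(end, parts[i], len);     /* append at the remembered end */
        end += len;
    }
    *end = '\0';
}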
Even using C, it's important to understand what's going on behind the scenes if efficiency is a concern. People who know C tend to view things in a progression. Assembly and machine code are the building blocks of C, while C is a building block of higher level languages.
This isn't strictly true, but it's obvious that C is "closer to the metal" than many higher-level languages. This has at least two effects: efficiency concerns aren't as hidden behind implicit behavior, and it's easier to screw up.
So you want a specific example of how knowing C gives you an advantage. I don't think there is one. I think what people mean when they say this is that knowing what's going on behind the scenes in whatever language you're happening to write for helps you make more intelligent decisions about how to write code. However, it's a mistake to assume that C is "what's going on behind the scenes" in Java, for instance.
It's hard to quantify exactly, but having an understanding of C will give your more insight into how higher-level language constructs are implemented, and as a consequence you'll be better able to use the constructs in an intelligent manner.
To give you a specific reason: having to write my own garbage collection routines has helped me write better code.
I don't think I have ever found a problem that I couldn't solve with a higher-level language; but having started by learning C, it instilled in me quite a number of excellent development practices. Knowing how the rudimentary parts of an application's flow work will enable you to look at your own code and get a good picture of how the data flows and where it is stored. This then leads to a better understanding of how to track down leaking memory, slow disk reads, poorly constructed caches, etc.
Keeping track of Pointers... that's another one that comes to mind.
Classic examples are things involving lower level memory management, such as the implementation of a linked list class:
struct Node
{
    Data *data;
    struct Node *next;
};
Understanding how the pointers are used to iterate the list, and what they signify in terms of the machine architecture will allow you to better understand your high level code.
Another example which Joel was referring to was the implementation of string concatenation, and the right way to create a string from a set of data.
// this rescans str from the beginning on every call to strcat (the
// "Shlemiel the painter" pattern), so the loop is O(n^2)
for (int i = 0; i < n; i++)
{
    strcat(str, data(i));
}
// this is typically amortized O(n), because std::string tracks its own
// length, but you'd need to look at the implementation to be sure
std::string str;
for (int i = 0; i < n; i++)
{
    str += data(i);
}
Knowing C helps you to write better code in C. I guess that the example from Joel Spolsky is of little use in C++ or Objective-C, where specific classes for manipulating strings exist and have been crafted with performance in mind. Moreover, using C tricks in other languages may be counterproductive.
Nevertheless, C knowledge is very helpful for understanding general concepts in other languages and what is under the hood in many situations.
As someone who knows a little C, but loves to write code in perl and other high-level languages, I have never once come across a problem that I was able to solve by writing C.
I am looking for examples of real-world situations where knowing C would be useful while writing a project in a high-level/dynamic language like perl or python.
It's easy to start writing high-level code and then wonder why it's running slowly. The truth is there are many ways to write perl or python code, and some are better (as in more efficient) than others. If you know the low-level details of how your code is executed in perl or python (both of which are written in C), you can code around several inefficiencies, like knowing which looping construct is faster, how memory is retained/released, etc.
Also, when writing a project in perl or python you sometimes hit a performance wall. The creators of the language (Guido, at least) advocate that you implement that part in C, as a language extension. To do that, well, you'll have to know C.
So, there.
For the purposes of argument, suppose you wanted to concatenate the string representations of all the integers from 1 to n (e.g. n = 5 would produce the string "12345"). Here's how one might do that naïvely in, say, Java.
String result = "";
for (int i = 1; i <= n; i++) {
    result = result + Integer.toString(i);
}
If you were to rewrite that code segment (which is quite good-looking in Java) in C as literally as possible, you would get something to make most C programmers cringe in fear:
/* requires <stdlib.h> and <string.h> */
char *result = malloc(1);
*result = '\0';
for (int i = 1; i <= n; i++) {
    char intStr[12];
    sprintf(intStr, "%d", i);       /* itoa is not standard C */
    char *tempStr = malloc(strlen(result) + strlen(intStr) + 1);
    strcpy(tempStr, result);
    strcat(tempStr, intStr);
    free(result);
    result = tempStr;
}
Because strings in Java are immutable, Integer.toString creates a temporary string and string concatenation creates a new string instance instead of altering the old one. That's not easy to see just from looking at the Java code. Knowing how said code translates into C is one way of learning exactly how inefficient it is.
Do you use arrays much? Do you come across situations where you need items to be stored in memory without knowing how many of them there will be (e.g. based on a query from the database)? Then I suppose C would teach you great things like stacks, structs and linked lists, which might help you. Regards, Andy
Knowing C is really not worth much. Many of us who know C deeply like to think that all that deep insight is valuable and important.
Some of us who know C can't think of a single specific feature of C that's helpful to know about.
Knowing how pointers work in C (especially with C's syntax) isn't all that helpful. In a high-level language your statements create objects and manage their interaction. Pointers and references are -- perhaps -- interesting from a hypothetical point of view. But the knowledge has no practical impact on how you use Java or Python.
The higher-level languages are the way they are. Knowing how they are implemented doesn't change those languages; it doesn't change how you use them, debug them or test them.
Knowing how to create or manipulate a linked list has no earthly impact on Python list class definition. None.
Knowing the difference between Linked List and Array List might help you write a Java program. But the C implementation doesn't help you choose between Linked List and Array List. The decision is independent of knowing C.
A bad algorithm is bad in every language. Knowing inner mysteries of C doesn't make a bad algorithm any less bad. Knowing C doesn't help you know the Java collections or the Python built-in types.
I can't see any value in learning C. Learning Fortran is just as valuable.
Technically, all of the deficiencies of C force you to code around them, making you write more code and making you more experienced in general. Lacking any portable integer type bigger than 32 bits, for example, C has, in the past, made me write my own bignum library.
The lack of implicit memory, resource and error management (garbage collection, RAII, automatically-called constructors/destructors, maybe exceptions) forces C users to write a lot of initialization, error-handling and cleanup code. It may just be me, but I never get tired of writing such code. I go and read the documentation of every external function I call, return to my code and check for every return value and other failure-indicative stuff. It even makes me feel safe!
This last point is probably the biggest one to be made in favor of the argument. You can only write so many malloc()/free() pairs before you start to analyze the lifetime of every single variable you come across in every single language! C++'s automatic-storage objects don't help this disorder, either.
Writing truly portable C code often requires the programmer to shed a lot of assumptions about the host system - think sizeof(), CHAR_BIT, unsigned long, UINT_MAX. While this hasn't helped me write better code in other languages, it has helped me think about possible alternate implementations: how a tiny microprocessor could still run my C code, generating a gazillion RISC instructions for my simple one-line statement. (That is another thing; not many other languages map to and from a given assembly language so easily in my head. Then again, that may just be me.)
Of course, none of these arguments apply only to C. #S.Lott has a valid point - Fortran might be an equally good alternative. But there is so much C code around! A whole personal computer system from top to bottom - applications to libraries to drivers to kernel - is available in source code in C. It would be such a waste if you could not read it.
I think it is worth knowing some low-level language, and there are pragmatic reasons to choose C:
It's low-level, close to assembler
It's widespread
Understanding the whole stack is valuable. Sometimes you need to debug something's guts. Sometimes you cannot fix a performance problem without low-level knowledge (this is often not the case, e.g., when the performance problem is purely algorithmic, but sometimes it is).
Why is C widely considered the quintessential "bottom of the stack", and not some other language? I think it is because C is a low-level programming language, and C won. It has been a while now, but C was not always as dominant. To take just one famous example, the proponents of Common Lisp (which had its own ways of writing low-level code) were hoping their language would be popular too, and they eventually lost.
The following are usually implemented in C:
operating systems (Unix variants, Windows, many embedded operating systems)
higher-level programming languages (many popular implementations of Java, Python, etc)
(obviously) reams of popular open source projects
I'm not a hardware person, but I gather that C has influenced CPU design heavily, too.
So if you believe in understanding the whole stack, learning C is, from a pragmatic perspective, the best choice.
As a caveat, I think it's worth learning assembler as well. Although C is close to the metal, I didn't fully understand C until I had to do some assembler. It is occasionally helpful to understand how function calls are actually performed, how for loops are implemented, etc. Less important, but also useful, is having to (at least once) deal with a system without virtual memory. When using C on Windows, Unix, and certain other operating systems, even humble malloc does a lot of work under the covers that is easier to appreciate, debug and/or tune if you've ever had to deal with manually locking and unlocking memory regions (not that I would recommend doing so on a regular basis!)
I see it like this: everything boils down to C at the cross-platform level, and to assembly at the platform-specific level. It's like being a cross-country rally racer, and C is basic automotive mechanics: you can be a great driver, but when you get into trouble, knowing C means you can probably get yourself back into the race; if not, you're stuck calling the mechanics. Assembly is what the mechanics and manufacturers know; it's a worthy investment if that's what you want to do, otherwise you can just trust the mechanics.
For specifics, think about memory management, hardware drivers, physics engines, high-performance 3D graphics, TCP stacks, binary protocols, embedded software, and creating high-level languages like Perl.
You cannot write an OS kernel in Perl; C would be a much better choice for that, because it is low-level enough to express everything the kernel should do, and portable enough to let you port your kernel to different architectures
Knowing C is not a requirement for being able to effectively use higher-level languages, but it certainly can help one's general understanding of how computers and software work - I think it's similar to asserting that knowing some assembly language or computer architecture/hardware logic (and/or/nand gates, etc.) can help a C programmer be a better programmer.
Sometimes in order to solve a problem it helps to know how things are working 'underneath' what you're doing.
I don't think this means a programmer must know C in order to be a good programmer, but I think that knowing C can be helpful to almost any programmer.
Not knowing Perl well, I am wondering whether it is now possible to distribute processor load across more than one physical core using several threads created in a single Perl program, without spawning additional processes.
I don't think there can be any specific example.
What learning C does for you is give you an insight, a broadening of the mind, into how computers (and software) work. It's a very abstract thing ..
It doesn't make you write better code in python, it just makes you more of a computer scientist.
The reference that Wedge made to Joel's article mentioning Shlemiel the painter is an interesting one but has no relevance here. That algorithm is not tied to C in any particular way (although it manifests itself in null-terminated strings).
Python's strings are immutable anyway, and completely different from C's model of strings, so I don't quite see the relationship.
I suppose one concrete example is optimizing a parser or a lexer or a program that keeps writing to a string buffer all the time. If you use normal strings instead of a string buffer, you'll run across a problem when you build very large strings.
Consider that:
a = a + b
creates a new string from copies of a and b. It doesn't change the string that was referenced by a; it creates a new string, allocating more memory, etc.
If a becomes considerably large, and you keep adding small things to it, then Shlemiel the painter will manifest himself.
But then again, knowing this has nothing to do with knowing C, just with knowing how your language implements things at the low level. (This is where having experience in C will help you.)
In Python, say you have a function
def foo(l=[]):
    l.append("bar")
    return l
On some version of Python available about a year ago, running foo() four times, you'd get a really interesting result (i.e. ["bar", "bar", "bar", "bar"]).
It seems that someone implemented the default parameter as something like a static variable (and without resetting it between calls), so unexpected results happen.
Perhaps my example was contrived - a friend of mine who actually likes Python found this peculiar bug - but the fact of the matter is that all of these languages are implemented in C or C++. Not knowing and not understanding concepts that are fundamental to the base language means that you won't have an in-depth understanding of languages that are built on top of it.
I find all the "why bother with C/C++/ASM" questions silly. If you're inclined enough to learn a language, that means you're curious enough to get into it in the first place. Why stop just before C?
Knowing C is great because it does nothing behind your back (GC, bounds checking, etc.). It does only exactly what you tell it to. Nothing is implied. Even C++ does things you don't tell it to, with RAII (of course, it is implied that the object is destructed when it goes out of scope, but you don't actually write that). C is a great way to learn what goes on 'under the hood' of the computer, without having to write assembly.
Inefficient code (e.g. loops of string +=) is typically inefficient in any language. What difference does it make if someone explains why it is inefficient in one language or the other? Knowing C but not realizing that a method is inefficient is no different from knowing Python and not realizing the same.

How is Tail Call Optimization implemented in DrScheme?

I've heard that trampolining is an ineffective way of implementing TCO. How does DrScheme (PLAI Scheme, technically) do it? Does it do it the 'right' way (that is, produce assembly code which directly branches to the tail call, instead of going through the stack and trampolining)?
Matthew Flatt, the chief implementor of MzScheme (now PLT Scheme) told me in June 2008 that at one time they compiled down to virtual-machine code, in which case it is easy to write a VM that does proper tail calls. Now, however, the system is mature enough that on x86 they use a simple JIT. In either case, there is no trampolining---the PLT Scheme guys know their business.
The implementors of PLT Scheme are quite active in their Google group, where you can get a quick answer from the people who write the code.
I'm not sure they read SO, though, so your best bet would probably be asking there.
Trampolines are used in implementations that translate Scheme code into a target language X (C, Java, etc.) that doesn't support Proper Tail Calls. PLT Scheme employs JIT-compilation - and therefore trampolines are not needed. For the exact implementation strategy used, ask the question on the PLT mailing list.
PS: You can read more on trampolines in the various "Compile Scheme to C" papers available on ReadScheme.org.
