Can a language be Turing-complete without any support for arrays? - turing-complete

If a language has control structures and variables, but no support for arrays, lists, memory access and allocation, etc, can it be Turing-complete?
Maybe if there was no limit to the amount of variables you can create, you can simulate arrays by creating variables like array_1, array_2, ... array_6000 and manually loop through them, and somehow create complex data structures and recursion?
Edit: Even if you cannot access variables by name manipulation (array_10+i is not allowed)?

Certainly. Have a look at Lambda Calculus, which is one of the most minimal Turing Complete languages I've ever seen. Basically, all you have are lambdas (function literals); no assignment, no declaration, no data structures. It's all very very slimmed-down.
You can, however, simulate a linear data structure like a List by chaining functions together. It gets pretty verbose, but it's certainly possible and it's much nicer than having a large series of sequentially named variables.
Generally speaking, whether or not a language is Turing Complete has nothing to do with whether it has Arrays. Functional languages like SML and Haskell lack arrays, just like Lambda Calculus, and these are actually useful languages! Saying a language is "Turing Complete" is merely another way of saying that there is no Turing Computable function which cannot be expressed in said language. This is a surprisingly loose qualification, allowing many languages which would be completely impractical (like Lambda Calculus).

There's plenty of Turing-complete languages that don't even have the notion of a "variable"! Memory access and allocations are implementation details, so they're completely irrelevant. You have to realize that Turing machines and Turing completeness are very theoretical concepts, useful for proving things, but completely divorced from the reality of actual hardware.
Paul Graham has written a long, but very, very interesting essay on the history of computer languages where he describes the two very different main traditions of computer languages:
Lisp, Scheme, etc. - derived from theoretical considerations, very simple, yet conceptually powerful languages, but for the longest time impractical because of their complete disregard for what's easy and efficient to implement
Assembler, FORTRAN, C and pretty much all "mainstream" languages - derived more or less directly from what the hardware could do, easy to implement, efficient, but for the longest time inferior to the (older!) Lisp family in terms of expressiveness.
It sounds like you know only the second tradition, but Turing completeness is a concept that originates from the same principles as the first tradition and makes little sense if you don't know those principles.

Related

Choosing an intermediate language

I am currently playing around with programming languages. I have spent some time writing parsers and interpreters in high level languages (most notably Haxe).
I have had some results, that I think are actually quite nice, but now I'd like to make them fast.
My idea was to translate the input language into C.
My C knowledge is limitted to what you learn at university. Beyond some exercises, I have never written actual C programs. But I feel confident I can make it work.
Of course I could try to write a frontend for the LLVM or to generate MSIL or JVM bytecode. But I feel that's too much to learn right now, and I don't see much of a gain actually.
Also C is perfectly human readable, so if I screw up, it's much easier to understand why. And C is, after all, high level. I can really translate concepts from the input language without too much mind-bending. I should be having something working up and running in a reasonable amount of time and then optimize it as I see fit.
So: Are there any downsides to using C? Can you recommend an alternative?
Thank you for your insight :)
Edit: Some Clarification
The reason why I want to go all the way down is, that I am writing a language with OOP support and I want to actually implement my method dispatching by hand, because I have something very specific in mind.
A primary area of use would be writing HTTP services, but I could image adding bindings to a GUI library (wxWidgets maybe) or whatever.
C is a good and quite popular choice for what you're trying to do.
Still, take a look at LLVM's intermediate language (IR). It's pretty readable and I think it's cleaner and easier to generate and parse than C. LLVM comes with quite a big collection of tools to work with it. You can generate native code for variety of platforms (as with C but with slightly more control over output) or for virtual machines. Possibility of JIT compilation is also a plus.
See The Architecture of Open Source Applications, Chapter 11 for introduction to LLVM approach and some snippets of IR.
What is your target environment? This might help us give you better answer.
C is actually a pretty good choice for a target language for a little or experimental compiler -- its widely available on many platforms, so your compiler becomes immediately useful in many environments. The main drawback is dealing with things that are not well supported in C, or are not well defined in the C spec. For example, if you want to do dynamic code generation (JIT compilation), C is problematic. Things like stack unwinding and reflection are tricky to do in C (though setjmp/longjmp and careful use of structs for which you generate layout descriptions can do a lot). Things like word sizes, big or little-endian layout, and arithmetic precision vary between C compilers, so you have to be aware of that, but those are things you need to deal with if you want to support multiple target machines anyways.
Other languages can be used as well -- the main advantage of C is its ubiquity.
You might consider C--, a C-like language intended to be a better target for code generation than C.
C is a good choice, IMHO. Unlike many languages, C is generally considered "elegant" in that you have only 32 keywords, and very basic constructs (sequence, selection, iteration), with a very simple-and-consistent collection of tokens and operators.
Because syntax is very consistent within C (brackets and braces, blocks and statements, use of expressions), you're not marching into an unbounded world of language expansion. C is a mature language, has weathered time nicely, and now-a-days is a "known quantity" (which is really hard to say about many other languages, even "mature" ones).

How do algorithms differ from design patterns?

I am new to C programming; coming from an OOP PHP background.
I find C to be (no wonder) a much more difficult language. I had particularly lots of problems figuring out a couple of things on arrays at first: like there is no native associative array.
Now, this part I guess I'm figuring out little by little, but now I have a question regarding a conversation I had just yesterday with a C developer. She was explaining the binary search algorithm to me because I asked her whether there were libraries to do array related stuff in C or not because it seemed like a smarter solution than always re-inventing the wheel.
I would really love to learn more about algorithms in C, in particular what differences are there between algorithms and the design patterns I'm used to using in PHP?
Taking things in order: the extent of C's support for anything like an associative array would be qsort to sort an array of structures based on a key, and bsearch to find one based on a key. There are, of course, quite a few alternatives -- various other libraries have hash tables, balanced trees, etc. Exactly which will suit your purposes is hard to guess though.
Offhand, I don't know of many good books covering algorithms that use C as their primary vehicle for demonstration. A few obvious recommendations for books on algorithms in general (mostly language independent) would be:
The Art of Computer Programming by Donald Knuth. This is pretty much the class algorithms book. It's now (finally) up to four volumes. Knuth originally started on it in 1967, planning to write 7 volumes. Only three volumes were available for a long time. A fourth was added quite recently. At the rate he's going, it's only going to make it to 7 if Knuth lives to be well past 100 years old. Nonetheless, the parts that are there are extremely good -- but (warning!) he analyzes the algorithms in considerable detail; if you don't know at least a little calculus, a fair amount will probably be hard to follow.
Introduction to Algorithms by Cormen, Leiserson, Rivest and Stein. IIRC, there's now a newer edition than I have, which adds yet another author. This is a large book (dropping it on your toes would be quite painful). It uses a fair amount of mathematical notation and such throughout, but if you're willing to work a little at looking up the notation, it's really pretty understandable. It covers quite a bit of important ground (e.g., graph algorithms) that are scheduled for later volumes of Knuth, but not (at least yet) available there.
Algorithms and Data Structures by Aho, Hopcraft and Ullman. This is (by a pretty fair margin) the smallest, lightest, and at least for most people probably the easiest of these to follow.
Though it's only available used anymore, if you can find a copy of Algorithms + Data Structures = Programs by Niklaus Wirth, that's what I'd really suggest. It uses Pascal (no surprise -- Niklaus Wirth invented Pascal), but that's enough like C that it doesn't cause a real problem. It doesn't go into as much depth as Knuth about each algorithm, but still enough to give a good feel for when one is likely to be a good choice versus another. For somebody in your position (some background in programming, but little in this area) it's my top recommendation.
Though I've said it before, I think it bears repeating: IMO, all of Robert Sedgewick's books on algorithms should be avoided. Algorithms in C++ is probably the worst of them, but the others are only marginally better. The code they include (again, especially the C++ version) is truly execrable, and the descriptions of algorithms are often incomplete and/or misleading. The most recent editions have fixed some of the problems, but (IMO) not nearly enough to qualify as something that should ever be recommended. If there was no alternative, you could probably get by with these, but given the number of alternatives that are dramatically superior, the only reason to read these at all is if somebody gives them to you, and you absolutely can't afford anything else.
As far as algorithms versus design patterns goes, the line can get blurry in places, but generally an algorithm is much more tightly defined. An algorithm will normally have a specific, tightly defined input which it processes in a specific way to produce an equally specific result/output. A design pattern tends to be more loosely defined, more generic. An algorithm can be generic as well (e.g., a sorting algorithms might require a type that defines a strict, weak ordering) but still has specific requirements on the type.
A design pattern tends to be somewhat more loosely defined. For example, the visitor pattern involves processing groups of objects -- but we don't want to modify the types of those objects when we decide we need to process them in a new and different way. We do that by defining the processes separately from the objects to be processed, along with how we'll traverse the groups of objects, and allow a process to work with each.
To look at it from a rather different direction, you can usually implement an algorithm with a function or a small group of functions. A design pattern tends to be oriented more toward the style in which you write your code, rather than just "here's a function, use it."
"Algorithms in C, Parts 1-5 (Bundle): Fundamentals, Data Structures, Sorting, Searching, and Graph Algorithms (3rd Edition)"
Cannot stress how good that series is.

Why is C so fast, and why aren't other languages as fast or faster? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 2 years ago.
Improve this question
In listening to the Stack Overflow podcast, the jab keeps coming up that "real programmers" write in C, and that C is so much faster because it's "close to the machine." Leaving the former assertion for another post, what is special about C that allows it to be faster than other languages?
Or put another way: what's to stop other languages from being able to compile down to binary that runs every bit as fast as C?
There isn't much that's special about C. That's one of the reasons why it's fast.
Newer languages which have support for garbage collection, dynamic typing and other facilities which make it easier for the programmer to write programs.
The catch is, there is additional processing overhead which will degrade the performance of the application. C doesn't have any of that, which means that there is no overhead, but that means that the programmer needs to be able to allocate memory and free them to prevent memory leaks, and must deal with static typing of variables.
That said, many languages and platforms, such as Java (with its Java Virtual Machine) and .NET (with its Common Language Runtime) have improved performance over the years with advents such as just-in-time compilation which produces native machine code from bytecode to achieve higher performance.
There is a trade-off the C designers have made. That's to say, they made the decision to put speed above safety. C won't
Check array index bounds
Check for uninitialized variable values
Check for memory leaks
Check for null pointer dereference
When you index into an array, in Java it takes some method call in the virtual machine, bound checking and other sanity checks. That is valid and absolutely fine, because it adds safety where it's due. But in C, even pretty trivial things are not put in safety. For example, C doesn't require memcpy to check whether the regions to copy overlap. It's not designed as a language to program a big business application.
But these design decisions are not bugs in the C language. They are by design, as it allows compilers and library writers to get every bit of performance out of the computer. Here is the spirit of C how the C Rationale document explains it:
C code can be non-portable. Although it strove to give programmers the opportunity to write truly portable programs, the Committee did not want to force programmers into writing portably, to preclude the use of C as a ``high-level assembler'': the ability to write machine-specific code is one of the strengths of C.
Keep the spirit of C. The Committee kept as a major goal to preserve the traditional spirit of C. There are many facets of the spirit of C, but the essence is a community sentiment of the underlying principles upon which the C language is based. Some of the facets of the spirit of C can be summarized in phrases like
Trust the programmer.
Don't prevent the programmer from doing what needs to be done.
Keep the language small and simple.
Provide only one way to do an operation.
Make it fast, even if it is not guaranteed to be portable.
The last proverb needs a little explanation. The potential for efficient code generation is one of the most important strengths of C. To help ensure that no code explosion occurs for what appears to be a very simple operation, many operations are defined to be how the target machine's hardware does it rather than by a general abstract rule. An example of this willingness to live with what the machine does can be seen in the rules that govern the widening of char objects for use in expressions: whether the values of char objects widen to signed or unsigned quantities typically depends on which byte operation is more efficient on the target machine.
If you spend a month to build something in C that runs in 0.05 seconds, and I spend a day writing the same thing in Java, and it runs in 0.10 seconds, then is C really faster?
But to answer your question, well-written C code will generally run faster than well-written code in other languages because part of writing C code "well" includes doing manual optimizations at a near-machine level.
Although compilers are very clever indeed, they are not yet able to creatively come up with code that competes with hand-massaged algorithms (assuming the "hands" belong to a good C programmer).
Edit:
A lot of comments are along the lines of "I write in C and I don't think about optimizations."
But to take a specific example from this post:
In Delphi I could write this:
function RemoveAllAFromB(a, b: string): string;
var
before, after :string;
begin
Result := b;
if 0 < Pos(a,b) then begin
before := Copy(b,1,Pos(a,b)-Length(a));
after := Copy(b,Pos(a,b)+Length(a),Length(b));
Result := before + after;
Result := RemoveAllAFromB(a,Result); //recursive
end;
end;
and in C I write this:
char *s1, *s2, *result; /* original strings and the result string */
int len1, len2; /* lengths of the strings */
for (i = 0; i < len1; i++) {
for (j = 0; j < len2; j++) {
if (s1[i] == s2[j]) {
break;
}
}
if (j == len2) { /* s1[i] is not found in s2 */
*result = s1[i];
result++; /* assuming your result array is long enough */
}
}
But how many optimizations are there in the C version? We make lots of decisions about implementation that I don't think about in the Delphi version. How is a string implemented? In Delphi I don't see it. In C, I've decided it will be a pointer to an array of ASCII integers, which we call chars. In C, we test for character existence one at a time. In Delphi, I use Pos.
And this is just a small example. In a large program, a C programmer has to make these kinds of low-level decisions with every few lines of code. It adds up to a hand-crafted, hand-optimized executable.
I didn't see it already, so I'll say it: C tends to be faster because almost everything else is written in C.
Java is built on C, Python is built on C (or Java, or .NET, etc.), Perl is, etc. The OS is written in C, the virtual machines are written in C, the compilers are written in C, the interpreters are written in C. Some things are still written in Assembly language, which tends to be even faster. More and more things are being written in something else, which is itself written in C.
Each statement that you write in other languages (not Assembly) is typically implemented underneath as several statements in C, which are compiled down to native machine code. Since those other languages tend to exist in order to obtain a higher level of abstraction than C, those extra statements required in C tend to be focused on adding safety, adding complexity, and providing error handling. Those are often good things, but they have a cost, and its names are speed and size.
Personally, I have written in literally dozens of languages spanning most of the available spectrum, and I personally have sought the magic that you hint at:
How can I have my cake and eat it, too? How can I play with high-level abstractions in my favorite language, then drop down to the nitty gritty of C for speed?
After a couple of years of research, my answer is Python (on C). You might want to give it a look. By the way, you can also drop down to Assembly from Python, too (with some minor help from a special library).
On the other hand, bad code can be written in any language. Therefore, C (or Assembly) code is not automatically faster. Likewise, some optimization tricks can bring portions of higher-level language code close to the performance level of raw C. But, for most applications, your program spends most of its time waiting on people or hardware, so the difference really does not matter.
Enjoy.
There are a lot of questions in there - mostly ones I am not qualified to answer. But for this last one:
what's to stop other languages from being able to compile down to binary that runs every bit as fast as C?
In a word, abstraction.
C is only one or two levels of abstraction away from machine language. Java and the .NET languages are at a minimum three levels of abstraction away from assembler. I'm not sure about Python and Ruby.
Typically, the more programmer toys (complex data types, etc.), the further you are from machine language and the more translation has to be done.
I'm off here and there, but that's the basic gist.
There are some good comments on this post with more details.
It is not so much that C is fast as that C's cost model is transparent. If a C program is slow, it is slow in an obvious way: by executing a lot of statements. Compared with the cost of operations in C, high-level operations on objects (especially reflection) or strings can have costs that are not obvious.
Two languages that generally compile to binaries which are just as fast as C are Standard ML (using the MLton compiler) and Objective Caml. If you check out the benchmarks game you'll find that for some benchmarks, like binary trees, the OCaml version is faster than C. (I didn't find any MLton entries.) But don't take the shootout too seriously; it is, as it says, a game, the the results often reflect how much effort people have put in tuning the code.
C is not always faster.
C is slower than, for example, Modern Fortran.
C is often slower than Java for some things (especially after the JIT compiler has had a go at your code).
C lets pointer aliasing happen, which means some good optimizations are not possible. Particularly when you have multiple execution units, this causes data fetch stalls. Ow.
The assumption that pointer arithmetic works really causes slow bloated performance on some CPU families (PIC particularly!) It used to suck the big one on segmented x86.
Basically, when you get a vector unit, or a parallelizing compiler, C stinks and modern Fortran runs faster.
C programmer tricks, like thunking (modifying the executable on the fly), cause CPU prefetch stalls.
Do you get the drift?
And our good friend, the x86, executes an instruction set that these days bears little relationship to the actual CPU architecture. Shadow registers, load-store optimizers, all in the CPU. So C is then close to the virtual metal. The real metal, Intel don't let you see. (Historically VLIW CPU's were a bit of a bust so, maybe that's no so bad.)
If you program in C on a high-performance DSP (maybe a TI DSP?), the compiler has to do some tricky stuff to unroll the C across the multiple parallel execution units. So in that case, C isn't close to the metal, but it is close to the compiler, which will do whole program optimization. Weird.
And finally, some CPUs (www.ajile.com) run Java bytecodes in hardware. C would a PITA to use on that CPU.
what's to stop other languages from
being able to compile down to binary
that runs every bit as fast as C?
Nothing. Modern languages like Java or .NET languages are oriented more toward programmer productivity rather than performance. Hardware is cheap nowadays. Also compilation to intermediate representation gives a lot of bonuses such as security, portability, etc. The .NET CLR can take advantage of different hardware. For example, you don't need to manually optimize/recompile program to use the SSE instructions set.
I guess you forgot that Assembly language is also a language :)
But seriously, C programs are faster only when the programmer knows what he's doing. You can easily write a C program that runs slower than programs written in other languages that do the same job.
The reason why C is faster is because it is designed in this way. It lets you do a lot of "lower level" stuff that helps the compiler to optimize the code. Or, shall we say, you the programmer are responsible for optimizing the code. But it's often quite tricky and error prone.
Other languages, like others already mentioned, focus more on productivity of the programmer. It is commonly believed that programmer time is much more expensive than machine time (even in the old days). So it makes a lot of sense to minimize the time programmers spend on writing and debugging programs instead of the running time of the programs. To do that, you will sacrifice a bit on what you can do to make the program faster because a lot of things are automated.
The main factors are that it's a statically-typed language and that's compiled to machine code. Also, since it's a low-level language, it generally doesn't do anything you don't tell it to.
These are some other factors that come to mind.
Variables are not automatically initialized
No bounds checking on arrays
Unchecked pointer manipulation
No integer overflow checking
Statically-typed variables
Function calls are static (unless you use function pointers)
Compiler writers have had lots of time to improve the optimizing code. Also, people program in C for the purpose of getting the best performance, so there's pressure to optimize the code.
Parts of the language specification are implementation-defined, so compilers are free to do things in the most optimal way
Most static-typed languages could be compiled just as fast or faster than C though, especially if they can make assumptions that C can't because of pointer aliasing, etc.
C++ is faster on average (as it was initially, largely a superset of C, though there are some differences). However, for specific benchmarks, there is often another language which is faster.
From The Computer Language Benchmarks Game:
fannjuch-redux was fastest in Scala
n-body and fasta were faster in Ada.
spectral-norm was fastest in Fortran.
reverse-complement, mandelbrot and pidigits were fastest in ATS.
regex-dna was fastest in JavaScript.
chameneou-redux was fastest is Java 7.
thread-ring was fastest in Haskell.
The rest of the benchmarks were fastest in C or C++.
For the most part, every C instruction corresponds to a very few assembler instructions. You are essentially writing higher level machine code, so you have control over almost everything the processor does. Many other compiled languages, such as C++, have a lot of simple looking instructions that can turn into much more code than you think it does (virtual functions, copy constructors, etc..) And interpreted languages like Java or Ruby have another layer of instructions that you never see - the Virtual Machine or Interpreter.
I know plenty of people have said it in a long winded way, but:
C is faster because it does less (for you).
Many of these answers give valid reasons for why C is, or is not, faster (either in general or in specific scenarios). It's undeniable that:
Many other languages provide automatic features that we take for granted. Bounds checking, run-time type checking, and automatic memory management, for example, don't come for free. There is at least some cost associated with these features, which we may not think about—or even realize—while writing code that uses these features.
The step from source to machine is often not as direct in other languages as it is in C.
OTOH, to say that compiled C code executes faster than other code written in other languages is a generalization that isn't always true. Counter-examples are easy to find (or contrive).
All of this notwithstanding, there is something else I have noticed that, I think, affects the comparative performance of C vs. many other languages more greatly than any other factor. To wit:
Other languages often make it easier to write code that executes more slowly. Often, it's even encouraged by the design philosophies of the language. Corollary: a C programmer is more likely to write code that doesn't perform unnecessary operations.
As an example, consider a simple Windows program in which a single main window is created. A C version would populate a WNDCLASS[EX] structure which would be passed to RegisterClass[Ex], then call CreateWindow[Ex] and enter a message loop. Highly simplified and abbreviated code follows:
WNDCLASS wc;
MSG msg;
wc.style = 0;
wc.lpfnWndProc = &WndProc;
wc.cbClsExtra = 0;
wc.cbWndExtra = 0;
wc.hInstance = hInstance;
wc.hIcon = NULL;
wc.hCursor = LoadCursor(NULL, IDC_ARROW);
wc.hbrBackground = (HBRUSH)(COLOR_BTNFACE + 1);
wc.lpszMenuName = NULL;
wc.lpszClassName = "MainWndCls";
RegisterClass(&wc);
CreateWindow("MainWndCls", "", WS_OVERLAPPEDWINDOW | WS_VISIBLE,
CW_USEDEFAULT, 0, CW_USEDEFAULT, 0, NULL, NULL, hInstance, NULL);
while(GetMessage(&msg, NULL, 0, 0)){
TranslateMessage(&msg);
DispatchMessage(&msg);
}
An equivalent program in C# could be just one line of code:
Application.Run(new Form());
This one line of code provides all of the functionality that nearly 20 lines of C code did, and adds some things we left out, such as error checking. The richer, fuller library (compared to those used in a typical C project) did a lot of work for us, freeing our time to write many more snippets of code that look short to us but involve many steps behind the scenes.
But a rich library enabling easy and quick code bloat isn't really my point. My point is more apparent when you start examining what actually happens when our little one-liner actually executes. For fun sometime, enable .NET source access in Visual Studio 2008 or higher, and step into the simple one-linef above. One of the fun little gems you'll come across is this comment in the getter for Control.CreateParams:
// In a typical control this is accessed ten times to create and show a control.
// It is a net memory savings, then, to maintain a copy on control.
//
if (createParams == null) {
createParams = new CreateParams();
}
Ten times. The information roughly equivalent to the sum of what's stored in a WNDCLASSEX structure and what's passed to CreateWindowEx is retrieved from the Control class ten times before it's stored in a WNDCLASSEX structure and passed on to RegisterClassEx and CreateWindowEx.
All in all, the number of instructions executed to perform this very basic task is 2–3 orders of magnitude more in C# than in C. Part of this is due to the use of a feature-rich library, which is necessarily generalized, versus our simple C code which does exactly what we need and nothing more. But part of it is due to the fact that the modularized, object-oriented nature of .NET framework, lends itself to a lot of repetition of execution that often is avoided by a procedural approach.
I'm not trying to pick on C# or the .NET framework. Nor am I saying that modularization, generalization, library/language features, OOP, etc. are bad things. I used to do most of my development in C, later in C++, and most lately in C#. Similarly, before C, I used mostly assembly. And with each step "higher" my language goes, I write better, more maintainable, more robust programs in less time. They do, however, tend to execute a little more slowly.
I don't think anyone has mentioned the fact that much more effort has been put into C compilers than any other compiler, with perhaps the exception of Java.
C is extremely optimizable for many of the reasons already stated - more than almost any other language. So if the same amount of effort is put into other language compilers, C will probably still come out on top.
I think there is at least one candidate language that, with effort, could be optimized better than C and thus we could see implementations that produce faster binaries. I'm thinking of Digital Mars' D, because the creator took care to build a language that could potentially be better optimized than C. There may be other languages that have this possibility. However, I cannot imagine that any language will have compilers more than just a few percent faster than the best C compilers. I would love to be wrong.
I think the real "low hanging fruit" will be in languages that are designed to be easy for humans to optimize. A skilled programmer can make any language go faster, but sometimes you have to do ridiculous things or use unnatural constructs to make this happen. Although it will always take effort, a good language should produce relatively fast code without having to obsess over exactly how the program is written.
It's also important (at least to me) that the worst case code tends to be fast. There are numerous "proofs" on the web that Java is as fast or faster than C, but that is based on cherry picking examples.
I'm not big fan of C, but I know that anything I write in C is going to run well. With Java, it will "probably" run within 15% of the speed, usually within 25%, but in some cases it can be far worse. Any cases where it's just as fast or within a couple of percent are usually due to most of the time being spent in the library code which is heavily optimized C code anyway.
This is actually a bit of a perpetuated falsehood. While it is true that C programs are frequently faster, this is not always the case, especially if the C programmer isn't very good at it.
One big glaring hole that people tend to forget about is when the program has to block for some sort of I/O, such as user input in any GUI program. In these cases, it doesn't really matter what language you use since you are limited by the rate at which data can come in rather than how fast you can process it. In this case, it doesn't matter much if you are using C, Java, C# or even Perl; you just cannot go any faster than the data can come in.
The other major thing is that using garbage collection (GC) and not using proper pointers allows the virtual machine to make a number of optimizations not available in other languages. For instance, the JVM is capable of moving objects around on the heap to defragment it. This makes future allocations much faster since the next index can simply be used rather than looking it up in a table. Modern JVMs also don't have to actually deallocate memory; instead, they just move the live objects around when they GC and the spent memory from the dead objects is recovered essentially for free.
This also brings up an interesting point about C and even more so in C++. There is something of a design philosophy of "If you don't need it, you don't pay for it." The problem is that if you do want it, you end up paying through the nose for it. For instance, the vtable implementation in Java tends to be a lot better than C++ implementations, so virtual function calls are a lot faster. On the other hand, you have no choice but to use virtual functions in Java and they still cost something, but in programs that use a lot of virtual functions, the reduced cost adds up.
It's not so much about the language as the tools and libraries. The available libraries and compilers for C are much older than for newer languages. You might think this would make them slower, but au contraire.
These libraries were written at a time when processing power and memory were at a premium. They had to be written very efficiently in order to work at all. Developers of C compilers have also had a long time to work in all sorts of clever optimizations for different processors. C's maturity and wide adoption makes for a signficant advantage over other languages of the same age. It also gives C a speed advantage over newer tools that don't emphasize raw performance as much as C had to.
Amazing to see the old "C/C++ must be faster than Java because Java is interpreted" myth is still alive and kicking. There are articles going back a few years, as well as more recent ones, that explain with concepts or measurements why this simply isn't always the case.
Current virtual machine implementations (and not just the JVM, by the way) can take advantage of information gathered during program execution to dynamically tune the code as it runs, using a variety of techniques:
rendering frequent methods to machine code,
inlining small methods,
adjustment of locking
and a variety of other adjustments based on knowing what the code is actually doing, and on the actual characteristics of the environment in which it's running.
The lack of abstraction is what makes C faster. If you write an output statement you know exactly what is happening. If you write an output statement in Java it is getting compiled to a class file which then gets run on a virtual machine, introducing a layer of abstraction.
The lack of object-oriented features as a part of the language also increases its speed do to less code being generated. If you use C as an object-oriented language, then you are doing all the coding for things such as classes, inheritance, etc. This means rather than make something generalized enough for everyone with the amount of code and the performance penalty that requires you only write what you need to get the job done.
The fastest running code would be carefully handcrafted machine code. Assembler will be almost as good. Both are very low level and it takes a lot of writing code to do things. C is a little above assembler. You still have the ability to control things at a very low level in the actual machine, but there is enough abstraction, make writing it faster and easier then assembler.
Other languages, such as C# and Java, are even more abstract. While Assembler and machine code are called low-level languages, C# and JAVA (and many others) are called high-level languages. C is sometimes called a midlevel language.
Don't take someone’s word for it; look at the disassembly for both C and your language-of-choice in any performance critical part of your code. I think you can just look in the disassembly window at runtime in Visual Studio to see disassembled .NET code. It should be possible, if tricky, for Java using WinDbg, though if you do it with .NET, many of the issues would be the same.
I don't like to write in C if I don't need to, but I think many of the claims made in these answers that tout the speed of languages other than C can be put aside by simply disassembling the same routine in C and in your higher level language of choice, especially if lots of data is involved as is common in performance critical applications. Fortran may be an exception in its area of expertise; I don't know. Is it higher level than C?
The first time I did compare JITed code with native code resolved any and all questions whether .NET code could run comparably to C code. The extra level of abstraction and all the safety checks come with a significant cost. The same costs would probably apply to Java, but don't take my word for it; try it on something where performance is critical. (Does anyone know enough about JITed Java to locate a compiled procedure in memory? It should certainly be possible.)
Setting aside advanced optimization techniques such as hot-spot optimization, pre-compiled meta-algorithms, and various forms of parallelism, the fundamental speed of a language correlates strongly with the implicit behind-the-scenes complexity required to support the operations that would commonly be specified within inner loops.
Perhaps the most obvious is validity checking on indirect memory references—such as checking pointers for null and checking indexes against array boundaries. Most high-level languages perform these checks implicitly, but C does not. However, this is not necessarily a fundamental limitation of these other languages—a sufficiently clever compiler may be capable of removing these checks from the inner loops of an algorithm through some form of loop-invariant code motion.
The more fundamental advantage of C (and to a similar extent the closely related C++) is a heavy reliance on stack-based memory allocation, which is inherently fast for allocation, deallocation, and access. In C (and C++) the primary call stack can be used for allocation of primitives, arrays, and aggregates (struct/class).
While C does offer the capability to dynamically allocate memory of arbitrary size and lifetime (using the so called 'heap'), doing so is avoided by default (the stack is used instead).
Tantalizingly, it is sometimes possible to replicate the C memory allocation strategy within the runtime environments of other programming languages. This has been demonstrated by asm.js, which allows code written in C or C++ to be translated into a subset of JavaScript and run safely in a web browser environment—with near-native speed.
As somewhat of an aside, another area where C and C++ outshine most other languages for speed is the ability to seamlessly integrate with native machine instruction sets. A notable example of this is the (compiler and platform dependent) availability of SIMD intrinsics which support the construction of custom algorithms that take advantage of the now nearly ubiquitous parallel processing hardware—while still utilizing the data allocation abstractions provided by the language (lower-level register allocation is managed by the compiler).
1) As others have said, C does less for you. No initializing variables, no array bounds checking, no memory management, etc. Those features in other languages cost memory and CPU cycles that C doesn't spend.
2) Answers saying that C is less abstracted and therefore faster are only half correct I think. Technically speaking, if you had a "sufficiently advanced compiler" for language X, then language X could approach or equal the speed of C. The difference with C is that since it maps so obviously (if you've taken an architecture course) and directly to assembly language that even a naive compiler can do a decent job. For something like Python, you need a very advanced compiler to predict the probable types of objects and generate machine code on the fly -- C's semantics are simple enough that a simple compiler can do well.
Back in the good ole days, there were just two types of languages: compiled and interpreted.
Compiled languages utilized a "compiler" to read the language syntax and convert it into identical assembly language code, which could than just directly on the CPU. Interpreted languages used a couple of different schemes, but essentially the language syntax was converted into an intermediate form, and then run in a "interpreter", an environment for executing the code.
Thus, in a sense, there was another "layer" -- the interpreter -- between the code and the machine. And, as always the case in a computer, more means more resources get used. Interpreters were slower, because they had to perform more operations.
More recently, we've seen more hybrid languages like Java, that employ both a compiler and an interpreter to make them work. It's complicated, but a JVM is faster, more sophisticated and way more optimized than the old interpreters, so it stands a much better change of performing (over time) closer to just straight compiled code. Of course, the newer compilers also have more fancy optimizing tricks so they tend to generate way better code than they used to as well. But most optimizations, most often (although not always) make some type of trade-off such that they are not always faster in all circumstances. Like everything else, nothing comes for free, so the optimizers must get their boast from somewhere (although often times it using compile-time CPU to save runtime CPU).
Getting back to C, it is a simple language, that can be compiled into fairly optimized assembly and then run directly on the target machine. In C, if you increment an integer, it's more than likely that it is only one assembler step in the CPU, in Java however, it could end up being a lot more than that (and could include a bit of garbage collection as well :-) C offers you an abstraction that is way closer to the machine (assembler is the closest), but you end up having to do way more work to get it going and it is not as protected, easy to use or error friendly. Most other languages give you a higher abstraction and take care of more of the underlying details for you, but in exchange for their advanced functionality they require more resources to run. As you generalize some solutions, you have to handle a broader range of computing, which often requires more resources.
I have found an answer on a link about why some languages are faster and some are slower, I hope this will clear more about why C or C++ is faster than others, There are some other languages also that is faster than C, but we can not use all of them. Some explanation -
One of the big reasons that Fortran remains important is because it's fast: number crunching routines written in Fortran tend to be quicker than equivalent routines written in most other languages. The languages that are competing with Fortran in this space—C and C++—are used because they're competitive with this performance.
This raises the question: why? What is it about C++ and Fortran that make them fast, and why do they outperform other popular languages, such as Java or Python?
Interpreting versus compiling
There are many ways to categorize and define programming languages, according to the style of programming they encourage and features they offer. When looking at performance, the biggest single distinction is between interpreted languages and compiled ones.
The divide is not hard; rather, there's a spectrum. At one end, we have traditional compiled languages, a group that includes Fortran, C, and C++. In these languages, there is a discrete compilation stage that translates the source code of a program into an executable form that the processor can use.
This compilation process has several steps. The source code is analyzed and parsed. Basic coding mistakes such as typos and spelling errors can be detected at this point. The parsed code is used to generate an in-memory representation, which too can be used to detect mistakes—this time, semantic mistakes, such as calling functions that don't exist, or trying to perform arithmetic operations on strings of text.
This in-memory representation is then used to drive a code generator, the part that produces executable code. Code optimization, to improve the performance of the generated code, is performed at various times within this process: high-level optimizations can be performed on the code representation, and lower-level optimizations are used on the output of the code generator.
Actually executing the code happens later. The entire compilation process is simply used to create something that can be executed.
At the opposite end, we have interpreters. The interpreters will include a parsing stage similar to that of the compiler, but this is then used to drive direct execution, with the program being run immediately.
The simplest interpreter has within it executable code corresponding to the various features the language supports—so it will have functions for adding numbers, joining strings, whatever else a given language has. As it parses the code, it will look up the corresponding function and execute it. Variables created in the program will be kept in some kind of lookup table that maps their names to their data.
The most extreme example of the interpreter style is something like a batch file or shell script. In these languages, the executable code is often not even built into the interpreter itself, but rather separate, standalone programs.
So why does this make a difference to performance? In general, each layer of indirection reduces performance. For example, the fastest way to add two numbers is to have both of those numbers in registers in the processor, and to use the processor's add instruction. That's what compiled programs can do; they can put variables into registers and take advantage of processor instructions. But in interpreted programs, that same addition might require two lookups in a table of variables to fetch the values to add, then calling a function to perform the addition. That function may very well use the same processor instruction as the compiled program uses to perform the actual addition, but all the extra work before the instruction can actually be used makes things slower.
If you want to know more please check the source.
Some C++ algorithms are faster than C, and some implementations of algorithms or design patterns in other languages can be faster than C.
When people say that C is fast, and then move on to talking about some other language, they are generally using C's performance as a benchmark.
Just step through the machine code in your IDE, and you'll see why it's faster (if it's faster). It leaves out a lot of hand-holding. Chances are your Cxx can also be told to leave it out too, in which case it should be about the same.
Compiler optimizations are overrated, as are almost all perceptions about language speed.
Optimization of generated code only makes a difference in hotspot code, that is, tight algorithms devoid of function calls (explicit or implicit). Anywhere else, it achieves very little.
With modern optimizing compilers, it's highly unlikely that a pure C program is going to be all that much faster than compiled .NET code, if at all. With the productivity enhancement that frameworks like .NET provide the developer, you can do things in a day that used to take weeks or months in regular C. Coupled with the cheap cost of hardware compared to a developer's salary, it's just way cheaper to write the stuff in a high-level language and throw hardware at any slowness.
The reason Jeff and Joel talk about C being the "real programmer" language is because there isn't any hand-holding in C. You must allocate your own memory, deallocate that memory, do your own bounds-checking, etc. There isn't any such thing as new object(); There isn't any garbage collection, classes, OOP, entity frameworks, LINQ, properties, attributes, fields, or anything like that.
You have to know things like pointer arithmetic and how to dereference a pointer. And, for that matter, know and understand what a pointer is. You have to know what a stack frame is and what the instruction pointer is. You have to know the memory model of the CPU architecture you're working on. There is a lot of implicit understanding of the architecture of a microcomputer (usually the microcomputer you're working on) when programming in C that simply is not present nor necessary when programming in something like C# or Java. All of that information has been off-loaded to the compiler (or VM) programmer.
It's the difference between automatic and manual. Higher-level languages are abstractions, thus automated. C/C++ are manually controlled and handled; even error checking code is sometimes a manual labor.
C and C++ are also compiled languages which means none of that run-everywhere business. These languages have to be fine-tuned for the hardware you work with, thus adding an extra layer of gotcha. Though this is slightly phasing out now as C/C++ compilers are becoming more common across all platforms. You can do cross compilations between platforms. It's still not a run everywhere situation, and you’re basically instructing compiler A to compile against compiler B the same code on a different architecture.
Bottom line, C languages are not meant to be easy to understand or reason. This is also why they’re referred to as systems languages. They came out before all this high-level abstraction nonsense. This is also why they are not used for front end web programming. They’re just not suited to the task; they’re meant to solve complex problems that can't be resolved with conventional language tooling.
This is why you get crazy stuff, like micro-architectures, drivers, quantum physics, AAA games, and operating systems. There are things C and C++ are just well suited for. Speed and number crunching being the chief areas.
C is fast because it is natively compiled, low-level language. But C is not the fastest. The Recursive Fibonacci Benchmark shows that Rust, Crystal, and Nim can be faster.

What projects cannot be done in C?

I would like to know what projects cannot be done in C.
I know programming can be quicker and more intuitive in
other languages. But I would like to know what features
are missing in C that would prevent a project from being
completed well.
For example, very few web-frameworks exist in C.
C, like many other languages, is Turing Complete.
So simple answer is: none.
However, C++ Template Meta Programming meets the same criterion, so "it is possible" is not a good criterion to choose tools.
The very first C compiler?
A working solution to the halting problem
Alright, here's one: you cannot write an x86 boot sector in C. This is one of those things that has to be written in ASM.
There are none.
Different languages give you different ways to say things. For some classes of problems a given language may be more expressive and/or concise. Are there projects that you should pick something aside from C? Yes, of course. But to say you can't do it well in C is misleading. It would be better to ask which language is the best choice for the problem at hand, and are the gains worth using something unfamiliar?
Anything can be done in virtually any language.
That said there is a level of practicality. As your system's complexity increases, you need better tools to manage it.
The problems are still solvable, but you start to need more people and much more effort in design. I'm not saying other languages don't benefit from design, I'm saying that the same level and attention to detail may not be required.
Since we programmers are Human (I am at least) we have troubles in one area or another. My biggest is memory. If I can visualize my code as objects, manipulating large modules in my head becomes easier, and my brain can handle larger projects.
Of course, it's even possible to write good OO code in C, the patterns were developed in C by manually managing dispatch tables (tables of pointers with some pointers updated to point to different methods), and this is true of all programming constructs from higher languages--they can be done in any language, but...
If you were to implement objects in C, every single class you wrote would have a large amount of boilerplate overhead. If you made some form of exception handling, you would expose more boilerplate.
Higher level languages abstract this boilerplate out of your code and into the system, simplifying what you have to think about and debug (a dispatch table in C could take a lot of debugging, but in C++ it isn't going to fail because the code generated by a working compiler is going to be bug-free and hidden, you never see it).
I guess I'd say that's the biggest (only?) difference between low level and higher level languages, how much boilerplate do you hide. In the latest batch of dynamic languages, they are really into hiding loop constructs within the language, so more things look like:
directory.forEachFile(print file.name); // Not any real language
In C, even if you isolated part of the looping inside a function, setting up the function pointers and stuff would still take lines of un-obvious code that is not solving part of your primary problem.
There is not a single algorithm that cannot be written with C.
Depends on how much you want to invest (time/money/energy) to make it happen. Otherwise, I'd say there aren't any. It is just easier sometimes to use something else.
OS kernel has been written in C and everything runs over it so you can write everything in C.
Boot sector that needs ASM :-) , I don't think you meant that.

What is an example in which knowing C will make me write better code in any other language?

In the Stack Overflow podcasts, Joel Spolsky constantly harps on Jeff Atwood about Jeff not knowing how to write code in C. His statement is that "knowing C helps you write better code." He also always uses some sort of story involving string manipulation and how knowing C would allow you to write more efficient string routines in a different language.
As someone who knows a little C, but loves to write code in perl and other high-level languages, I have never once come across a problem that I was able to solve by writing C.
I am looking for examples of real-world situations where knowing C would be useful while writing a project in a high-level/dynamic language like perl or python.
Edit: Reading some of the answers you guys have submitted have been great, but still doesn't make any sense to me in this regard:
Take the strcat example. There's a right way and a wrong way to combine strings in C. But why should I (as a high-level developer) think that I am smarter than Larry Wall? Why wouldn't the language designers write the string manipulation code the right way?
The classic example that Joel Spolsky uses is on misuse of strcat and strlen, and spotting "Shlemiel the painter" algorithms in general.
It's not that you need C to solve problems that higher-level languages can't solve, it's that knowing C well gives you a perspective on what's going on underneath all those levels of languages that allows you to write better software. Because just such a perspective helps you avoid writing code which is, unknown to you, actually O(n^2), for example.
Edit: Some clarification based on comments.
Knowing C is not a prerequisite for such knowledge, there are many ways to acquire the same knowledge.
Knowing C is also not a guarantee of these skills. You may be proficient in C and yet still write horrible, grotty, kludgy code in every other language you touch.
C is a low-level language, yet it still has modern control structures and functions so you aren't always getting caught up in the fiddly details. It's very difficult to become proficient at C without gaining a mastery of certain fundamentals (such as the details of memory management and pointers), mastery of which often pays rich dividends when working in any language.
It's always about the fundamentals.
This is true in many pursuits as well as software engineering. It is not secret incantations that make the best programmers the best, rather it is a greater mastery of the fundamentals. Experience has shown that knowledge of C tends to have a higher correlation to mastery of certain of those fundamentals, and that learning C tends to be one of the easier and more common routes to acquiring such knowledge.
It's a mistake to assume that learning C will somehow automatically give you a better understanding of low-level programming concerns. In a lot of cases even C is too high level to give you a good understanding of efficiency concerns.
A classic is i++ versus ++i. It's over-cited, so perhaps most people know the implications about performance between these two operations. But learning C wouldn't magically teach you this by itself.
I guess I understand arguments about strings. When string operations are made deceptively simple, people often use them in inefficient ways. But again, knowing that strncat exists doesn't give you a full appreciation for the efficiency concerns. A lot of C programmers probably haven't even thought about the fact that strncat has to do a strlen operation internally.
Even using C, it's important to understand what's going on behind the scenes if efficiency is a concern. People who know C tend to view things in a progression. Assembly and machine code are the building blocks of C, while C is a building block of higher level languages.
This isn't specifically true, but it's obvious that C is "closer to the metal" than many higher level languages. This has at least two effects: efficiency concerns aren't as hidden behind implicit behavior, and it's easier to screw up.
So you want a specific example of how knowing C gives you an advantage. I don't think there is one. I think what people mean when they say this is that knowing what's going on behind the scenes in whatever language you're happening to write for helps you make more intelligent decisions about how to write code. However, it's a mistake to assume that C is "what's going on behind the scenes" in Java, for instance.
It's hard to quantify exactly, but having an understanding of C will give your more insight into how higher-level language constructs are implemented, and as a consequence you'll be better able to use the constructs in an intelligent manner.
To give you a specific reason: having to write my own Garbage Collection routines has helped my write better code.
I don't think I have ever found a problem that I haven't been able to solve with a higher-level language; but started by learning C, it has instilled in me quite a number of excellent development practices. Knowing how the rudimentary parts of the flow of an application work will enable to you be able to look at your own code and get a good visual of how the data flows, and where it is stored. This then leads to a better understand of how to track down leaking memory, slow disk reads, poorly constructed caches, etc.
Keeping track of Pointers... that's another one that comes to mind.
Classic examples are things involving lower level memory management, such as the implementation of a linked list class:
struct Node
{
Data *data;
Node *next;
}
Understanding how the pointers are used to iterate the list, and what they signify in terms of the machine architecture will allow you to better understand your high level code.
Another example which Joel was referring to was the implementation of string concatenation, and the right way to create a string from a set of data.
// this is efficient
for (int i=0; i< n; i++)
{
strcat(str, data(i));
}
// this could be too, but you'd need to look at the implementation to be sure
std::string str;
for (int i=0; i<n; i++)
{
str+=data(i);
}
Knowing C helps you to write better code in C. I guess that the example of Joel Spolsky is of little use in C++ or Objective-C where specific classes for manipulating strings exist and have been crafted with performance in mind. Moreover, using C tricks in other languages may be couter productive.
Nevertheless, C knowledge is very helpful to understand general concepts in other languages and what is behind the hood in many situations.
As someone who knows a little C, but loves to write code in perl and other high-level languages, I have never once come across a problem that I was able to solve by writing C.
I am looking for examples of real-world situations where knowing C would be useful while writing a project in a high-level/dynamic language like perl or python.
It's easy to start writing high level code and then wonder we it's running slow. The truth is there are many ways to write perl or python code, and some are better (as in more efficient) than the others. If you know the low level details of how your code is executed in perl or python (both of which are written in C) you can code around several inefficiencies --like knowing which looping construct is faster, how memory is retained/released, etc.
Also, when writing a project in perl or python you sometimes hit a performance wall. The creators of the language (Guido, at least) advocate that you implement that part in C, as a language extension. To do that, well, you'll have to know C.
So, there.
For the purposes of argument, suppose you wanted to concatenate the string representations of all the integers from 1 to n (e.g. n = 5 would produce the string "12345"). Here's how one might do that naïvely in, say, Java.
String result = "";
for (int i = 1; i <= n; i++) {
result = result + Integer.toString(i);
}
If you were to rewrite that code segment (which is quite good-looking in Java) in C as literally as possible, you would get something to make most C programmers cringe in fear:
char *result = malloc(1);
*result = '\0';
for (int i = 1; i <= n; i++) {
char *intStr = malloc(11);
itoa(i, intStr, 10);
char *tempStr = malloc(/* some large size */);
strcpy(tempStr, result);
strcat(tempStr, intStr);
free(result);
free(intStr);
result = tempStr;
}
Because strings in Java are immutable, Integer.toString creates a dummy string and string concatenation creates a new string instance instead of altering the old one. That's not easy to see from just looking at the Java code. Knowing how said code translates into C is one way of learning exactly how inefficient said code is.
Do you use arrays much ? and do you come across situations where you need items to be stored in memory without knowing how many of them (i.e. based on a query from the database?) then I suppose C would teach you great things like stacks, structs and link lists which might help you. Regards, Andy
Knowing C is really not worth much. Many of us who know C deeply like to think that all that deep insight is valuable and important.
Some of us who know C can't think of a single specific feature of C that's helpful to know about.
Knowing how pointers work in C (especially with C's syntax) isn't all that helpful. In a high-level language your statements create objects and manage their interaction. Pointers and references are -- perhaps -- interesting from a hypothetical point of view. But the knowledge has no practical impact on how you use Java or Python.
The higher-level languages are the way they are. Knowing how doesn't change those languages; it doesn't change how you use them, debug or test them.
Knowing how to create or manipulate a linked list has no earthly impact on Python list class definition. None.
Knowing the difference between Linked List and Array List might help you write a Java program. But the C implementation doesn't help you choose between Linked List and Array List. The decision is independent of knowing C.
A bad algorithm is bad in every language. Knowing inner mysteries of C doesn't make a bad algorithm any less bad. Knowing C doesn't help you know the Java collections or the Python built-in types.
I can't see any value in learning C. Learning Fortran is just as valuable.
Technically, all of the deficiencies of C would force you to code around them; making you write more code -> making you more experienced in general. Lacking any portable integer bigger than 32-bits, for example, C has, in the past, made me write my own bignum library.
The lack of implicit memory, resource and error management (garbage collection, RAII, automatically-called constructors/destructors, maybe exceptions) force C users to write a lot of initialization, error-handling and cleanup code. It may just be me, but I'm never tired of writing such code. I go and read the documentation of every external function I call, return to my code and check for every return value and other failure-indicative stuff. It even makes me feel safe!
This last point is probably the biggest one to be made in favor of the argument. You can only write so many malloc()/free() pairs before you start to analyze the lifetime of every single variable you come across in every single language! C++'s automatic-storage objects don't help this disorder, either.
Writing truly portable C code often requires the programmer to be free of a lot assumptions about the host system - think sizeof(), CHAR___BITS, unsigned long, UINT_MAX. While this hasn't helped me write better code in other languages, it has helped me think about possible alternate implementations: how a tiny microprocessor could still run my C code, generating a gazillion RISC instructions for my simple one-line statement. (That is another thing; not many other languages map to and from a given assembly language so easily in my head. Then again, that may just be me.)
Of course, none of these arguments go only for C. #S.Lott has a valid point - Fortran might be an equally good alternative. But there is so much C code around! A whole personal computer system from top to bottom -applications to libraries to drivers to kernel- is available in source code in C. It would be such a waste if you could not read it.
I think it is worth knowing some low-level language, and there are pragmatic reasons to choose C:
It's low-level, close to assembler
It's widespread
Understanding the whole stack is valuable. Sometimes you need to debug something's guts. Sometimes you cannot fix a performance problem without low-level knowledge (this is often not the case, e.g., when the performance problem is purely algorithmic, but sometimes it is).
Why is C widely considered the quintessential "bottom of the stack", and not some other language(s)? I think this because C is a low-level programming language, and C won. It has been a while now, but C was not always as dominant. To take just one famous example, the proponents of Common Lisp (which had its own ways of writing low-level code) were hoping their language would be popular, too, and eventually lost.
The following are usually implemented in C:
operating systems (Unix variants, Windows, many embedded operating systems)
higher-level programming languages (many popular implementations of Java, Python, etc)
(obviously) reams of popular open source projects
I'm not a hardware person, but I gather that C has influenced CPU design heavily, too.
So if you believe in understanding the whole stack, learning C is, from a pragmatic perspective, the best choice.
As a caveat, I think it's worth learning assembler, as well. Although C is close to the metal, I didn't fully understand C until I had to do some assembler. It is occasionally helpful to understand how functions calls are actually performed, how for loops are implemented, etc. Less important, but also useful, is having to (at least once) deal with a system without virtual memory. When using C on Windows, Unix, and certain other operating systems, even humble malloc does a lot of work under the covers that is easier to appreciate, debug and/or tune if you've ever had to deal with manually locking and unlocking memory regions (not that I would recommend doing so on a regular basis!)
I see it like this , everything boils down to C in a crossplatform level, and assembly in a platform specific way. So it's like being a crosscountry Rally racer, and C is basic automotive mechanics, you can be a great driver but when you get into trouble knowing C means you can probably get yourself back in the race, if not you're stuck calling the mechanics. And assembly is what the mechanics and manufacturers know, it's a worthy investment if that's what you want to do, otherwise you can just trust the mechanics.
For specifics think about memory management, hardwar drivers, physics engines, high performance 3d graphics, TCP stacks, binary protocols, embedded software, creating high level languages like Perl
You cannot write an OS kernel in Perl; C would be a much better choice for that, because it is low-level enough to express everything the kernel should do, and portable enough to let you port your kernel to different architectures
Knowing C is not a requirement to being able to effectively use higher-level languages, but it certainly can help ones general understanding of how computers and software work - I think it's similar to an assertion that knowing some assembly language or computer architecture/hardware logic (and/or/nand gates, etc) can help a C programmer be a better programmer.
Sometimes in order to solve a problem it helps to know how things are working 'underneath' what you're doing.
I don't think this means a programmer must know C in order to be a good programmer, but I think that knowing C can be helpful to almost any programmer.
Not knowing Perl well, I am wondering if it is now possible to distribute processor load to more than one physical core with several threads created in a single program in Perl, without spawning additional processes
I don't think there can be any specific example.
What learning C does for you is give you an insight, a broadening of the mind, into how computers (and software) work. It's a very abstract thing ..
It doesn't make you write better code in python, it just makes you more of a computer scientist.
The reference that Wedge made to Joel's article mentioning Shlemiel the painter is an interesting one but has no relevance here. That algorithm is not tied to C in any particular way (although it manifests itself in null-terminated strings).
Python's strings are immutable anyway, and completely different from C's model of strings, so I don't quite see the relationship.
I suppose one concrete example is optimizing a parser or a lexer or a program that keeps writing to a string buffer all the time. If you use normal strings instead of a string buffer, you'll run across a problem when you build very large strings.
Consider that:
a = a + b
makes a copy of both a and b. It doesn't change the string that was referenced by a, it creates a new string, allocating more memory, etc.
If a becomes considerably large, and you keep adding small things to it, then Shlemiel the painter will manifest himself.
But then again, knowing this has nothing to do with knowing C, just knowing how your language implements things at the low level. (This is where having an experiece in C will help you).
In Python, say you have a function
def foo(l=[])
l.append("bar")
return l;
On some version of Python, available about a year ago, running foo() for times, you'd get a really interesting result (i.e. ["bar","bar","bar","bar]).
It seems that someone implemented the default parameters as a static variable (and without resetting it), so unexpected results happen.
Perhaps my example was contrived - a friend of mine who actually likes Python found this peculiar bug, but the fact of the matter is all of these languages are implemented in C or C++. Not knowing and not understanding concepts that are fundamental to the base language means that you won't have an in-depth understanding of languages that are built on top of that.
I find all the "why bother with C/C++/ASM question silly". If you're inclined enough to learn a language, that means that you're curious enough to get into it the first place. Why stop at just before C?
Knowing C is great because it does nothing behind your back (GC, bounds checking, etc.). It only does exactly what you tell it too. Nothing is implied. Even C++ does things you don't tell it too with RAII (of course, it is implied that the object is destructed when it goes out of scope, but you don't actually write that). C is a great way to learn what goes on 'under the hood' of the computer, without having to write assembly.
inefficient code (eg loops of string+=) are typically inefficient in any language. what difference does it make if someone explains why it is inefficient in one language or the other? knowing C, but not realizing that a method is inefficient, is no different than knowing python and not realizing the same.

Resources