As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 10 years ago.
From the number of questions posted here, it's clear that people have some pretty fundamental issues when getting their heads around pointers and pointer arithmetic.
I'm curious to know why. They've never really caused me major problems (although I first learned about them back in the Neolithic). In order to write better answers to these questions, I'd like to know what people find difficult.
So, if you're struggling with pointers, or you recently were but suddenly "got it", what were the aspects of pointers that caused you problems?
When I first started working with them, the biggest problem I had was the syntax.
int* ip;
int * ip;
int *ip;
are all the same.
but:
int* ip1, ip2; //second one isn't a pointer!
int *ip1, *ip2;
Why? Because the "pointer" part of the declaration belongs to the variable, and not the type.
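A minimal sketch of that rule, assuming a C11 compiler (it leans on _Generic). The macro IS_INT_POINTER and both function names are made up for illustration:

```c
#include <stddef.h>

/* Evaluates to 1 if the expression has type int *, otherwise 0 (C11 _Generic). */
#define IS_INT_POINTER(x) _Generic((x), int *: 1, default: 0)

/* In "int* ip1, ip2;" the star binds to ip1 only. */
int first_is_pointer(void)
{
    int* ip1 = NULL, ip2 = 0;
    (void)ip1; (void)ip2;          /* silence unused-variable warnings */
    return IS_INT_POINTER(ip1);    /* 1: ip1 is an int * */
}

int second_is_pointer(void)
{
    int* ip1 = NULL, ip2 = 0;
    (void)ip1; (void)ip2;
    return IS_INT_POINTER(ip2);    /* 0: ip2 is a plain int */
}
```

Writing one declaration per line sidesteps the trap entirely.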
And then dereferencing the thing uses a very similar notation:
*ip = 4; //sets the value of the thing pointed to by ip to '4'
x = ip; //hey, that's not '4'!
x = *ip; //ahh... there's that '4'
Except when you actually need to get a pointer... then you use an ampersand!
int *ip = &x;
Hooray for consistency!
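For what it's worth, here is how the two operators fit together in one small sketch (the function names are just illustrative):

```c
/* Returns the value of x after writing through a pointer to it. */
int write_through_pointer(void)
{
    int x = 0;
    int *ip = &x;   /* & yields the address of x */
    *ip = 4;        /* * reaches the int that ip points at */
    return x;       /* x is now 4 */
}

/* Returns 1 if ip holds the address of x and *ip yields the value of x. */
int pointer_holds_address(void)
{
    int x = 4;
    int *ip = &x;
    return ip == &x && *ip == x;
}
```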
Then, apparently just to be jerks and prove how clever they are, a lot of library developers use pointers-to-pointers-to-pointers, and if they expect an array of those things, well why not just pass a pointer to that too.
void foo(int ****ipppArr);
to call this, I need the address of the array of pointers to pointers to pointers of ints:
foo(&ipppArr);
In six months, when I have to maintain this code, I will spend more time trying to figure out what all this means than rewriting from the ground up.
(yeah, probably got that syntax wrong -- it's been a while since I've done anything in C. I kinda miss it, but then I'm a bit of a masochist)
I suspect people are going a bit too deep in their answers. An understanding of scheduling, actual CPU operations, or assembly-level memory management isn't really required.
When I was teaching, I found the following holes in students' understanding to be the most common source of problems:
Heap vs Stack storage. It is simply stunning how many people do not understand this, even in a general sense.
Stack frames. Just the general concept of a dedicated section of the stack for local variables, along with the reason it's a 'stack'... details such as stashing the return location, exception handler details, and previous registers can safely be left till someone tries to build a compiler.
"Memory is memory is memory". Casting just changes which versions of the operators are used, or how much room the compiler sets aside for a particular chunk of memory. You know you're dealing with this problem when people talk about "what (primitive) variable X really is".
Most of my students were able to understand a simplified drawing of a chunk of memory, generally the local variables section of the stack at the current scope. Generally giving explicit fictional addresses to the various locations helped.
I guess in summary, I'm saying that if you want to understand pointers, you have to understand variables, and what they actually are in modern architectures.
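A rough sketch of the storage distinction in C, under the usual assumption that locals live on the stack and malloc'd blocks live on the heap (the standard only talks about storage duration). A static local is thrown in for contrast, and all names here are invented for the example:

```c
#include <stdlib.h>

/* Static storage: one copy that lives for the whole program run. */
int next_ticket(void)
{
    static int counter = 0;
    return ++counter;
}

/* Heap storage: lives until free() is called, so it may outlive this call. */
int *make_heap_int(int value)
{
    int *p = malloc(sizeof *p);
    if (p != NULL)
        *p = value;
    return p;   /* caller must free() */
}

/* Automatic (stack) storage: each call gets a fresh local that vanishes on return. */
int local_starts_fresh(void)
{
    int local = 0;
    local++;
    return local;   /* always 1 */
}
```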
Proper understanding of pointers requires knowledge about the underlying machine's architecture.
Many programmers today don't know how their machine works, just as most people who know how to drive a car don't know anything about the engine.
When dealing with pointers, people who get confused are broadly in one of two camps. I've been (am?) in both.
The array[] crowd
This is the crowd that straight up doesn't know how to translate from pointer notation to array notation (or doesn't even know that they are related). Here are four ways to access elements of an array:
array notation (indexing) with the array name
array notation (indexing) with the pointer name
pointer notation (the *) with the pointer name
pointer notation (the *) with the array name
int vals[5] = {10, 20, 30, 40, 50};
int *ptr;
ptr = vals;
array       element            pointer
notation    number     vals    notation
vals[0]     0          10      *(ptr + 0)
ptr[0]                         *(vals + 0)
vals[1]     1          20      *(ptr + 1)
ptr[1]                         *(vals + 1)
vals[2]     2          30      *(ptr + 2)
ptr[2]                         *(vals + 2)
vals[3]     3          40      *(ptr + 3)
ptr[3]                         *(vals + 3)
vals[4]     4          50      *(ptr + 4)
ptr[4]                         *(vals + 4)
The idea here is that accessing arrays via pointers seems pretty simple and straightforward, but a ton of very complicated and clever things can be done this way, some of which can leave experienced C/C++ programmers befuddled, let alone inexperienced newbies.
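To check that the four notations really do reach the same elements, a small sketch like this (function name invented for the example) sums the array each way and compares:

```c
/* Sum the same five ints using each of the four notations. */
int sums_agree(void)
{
    int vals[5] = {10, 20, 30, 40, 50};
    int *ptr = vals;            /* vals decays to &vals[0] */
    int s1 = 0, s2 = 0, s3 = 0, s4 = 0;
    for (int i = 0; i < 5; i++) {
        s1 += vals[i];          /* array notation, array name */
        s2 += ptr[i];           /* array notation, pointer name */
        s3 += *(ptr + i);       /* pointer notation, pointer name */
        s4 += *(vals + i);      /* pointer notation, array name */
    }
    return s1 == 150 && s2 == s1 && s3 == s1 && s4 == s1;
}
```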
The reference to a pointer and pointer to a pointer crowd
This is a great article that explains the difference and which I'll be citing and stealing some code from :)
As a small example, it can be very difficult to see exactly what the author wanted to do if you came across something like this:
//function prototype
void func(int*& rpInt); // I mean, seriously, int*& ??
int main()
{
int nvar=2;
int* pvar=&nvar;
func(pvar);
....
return 0;
}
Or, to a lesser extent, something like this:
//function prototype
void func(int** ppInt);
int main()
{
int nvar=2;
int* pvar=&nvar;
func(&pvar);
....
return 0;
}
So at the end of the day, what do we really solve with all this gibberish? Nothing.
Now we have seen the syntax of ptr-to-ptr and ref-to-ptr. Are there any advantages of one over the other? I am afraid, no. The usage of one of both, for some programmers are just personal preferences. Some who use ref-to-ptr say the syntax is "cleaner" while some who use ptr-to-ptr, say ptr-to-ptr syntax makes it clearer to those reading what you are doing.
This complexity, and the seeming (emphasis on seeming) interchangeability with references, which is often another caveat of pointers and a source of errors for newcomers, makes understanding pointers hard. It's also important to understand, for completeness' sake, that pointers to references are illegal in C++ (C has no references at all), for confusing reasons that take you into lvalue-rvalue semantics.
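Since C has no references, only the ptr-to-ptr form can be shown in plain C. A minimal sketch of what it buys you, with made-up function names: the callee can change where the caller's pointer points, which a plain pointer parameter cannot do.

```c
/* Make the caller's pointer point somewhere else. */
void retarget(int **pp, int *new_target)
{
    *pp = new_target;
}

/* By contrast, this only changes the function's own copy of the pointer. */
void retarget_copy(int *p, int *new_target)
{
    p = new_target;
    (void)p;
}
```

The caller writes retarget(&pvar, &other), and the extra & at the call site is the visual hint that pvar itself may change.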
As a previous answer remarked, many times you'll just have these hotshot programmers that think they are being clever by using ******awesome_var->lol_im_so_clever() and most of us are probably guilty of writing such atrocities at times, but it's just not good code, and it's certainly not maintainable.
Well this answer turned out to be longer than I had hoped...
I blame the quality of reference materials and the people doing the teaching, personally; most concepts in C (but especially pointers) are just plain taught badly. I keep threatening to write my own C book (titled The Last Thing The World Needs Is Another Book On The C Programming Language), but I don't have the time or the patience to do so. So I hang out here and throw random quotes from the Standard at people.
There's also the fact that when C was initially designed, it was assumed you understood machine architecture to a pretty detailed level just because there was no way to avoid it in your day-to-day work (memory was so tight and processors were so slow you had to understand how what you wrote affected performance).
There is a great article supporting the notion that pointers are hard on Joel Spolsky's site - The Perils of JavaSchools.
[Disclaimer - I am not a Java-hater per se.]
Most things are harder to understand if you're not grounded in the knowledge that's "underneath". When I taught CS it got a lot easier when I started my students on programming a very simple "machine", a simulated decimal computer with decimal opcodes whose memory consisted of decimal registers and decimal addresses. They would put in very short programs to, for example, add a series of numbers to get a total. Then they would single step it to watch what was happening. They could hold down the "enter" key and watch it run "fast".
I'm sure almost everyone on SO wonders why it is useful to get so basic. We forget what it was like not knowing how to program. Playing with such a toy computer puts in place concepts without which you can't program, such as the ideas that computation is a step-by-step process, using a small number of basic primitives to build up programs, and the concept of memory variables as places where numbers are stored, in which the address or name of the variable is distinct from the number it contains. There is a distinction between the time at which you enter the program, and the time at which it "runs". I liken learning to program to crossing a series of "speed bumps", such as very simple programs, then loops and subroutines, then arrays, then sequential I/O, then pointers and data structures. All of these are much easier to learn by reference to what a computer is really doing underneath.
Finally, when getting to C, pointers are confusing though K&R did a very good job of explaining them. The way I learned them in C was to know how to read them - right to left. Like when I see int *p in my head I say "p points to an int". C was invented as one step up from assembly language and that's what I like about it - it is close to that "ground". Pointers, like anything else, are harder to understand if you don't have that grounding.
I didn't get pointers until I read the description in K&R. Until that point, pointers didn't make sense. I read a whole bunch of stuff where people said "Don't learn pointers, they are confusing and will hurt your head and give you aneurysms" so I shied away from it for a long time, and created this unnecessary air of difficult-concept.
Otherwise, mostly what I thought was, why on earth would you want a variable that you have to go through hoops to get the value of, and if you wanted to assign stuff to it, you had to do strange things to get values to go into them. The whole point of a variable is something to store a value, I thought, so why someone wanted to make it complicated was beyond me. "So with a pointer you have to use the * operator to get at its value??? What kind of goofy variable is that?", I thought. Pointless, no pun intended.
The reason it was complicated was because I didn't understand that a pointer was an address to something. If you explain that it is an address, that it is something that contains an address to something else, and that you can manipulate that address to do useful things, I think it might clear up the confusion.
A class that required using pointers to access/modify ports on a PC, using pointer arithmetic to address different memory locations, and looking at more complicated C code that modified its arguments disabused me of the idea that pointers were, well, pointless.
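As a rough illustration of "manipulating the address to do useful things", here is a sketch (names are hypothetical) showing that stepping an int pointer moves the address by sizeof(int) bytes, and that an array can be walked by moving an address instead of indexing:

```c
#include <stddef.h>

/* Number of bytes the address moves when an int pointer is incremented. */
size_t int_pointer_stride(void)
{
    int cells[2] = {0, 0};
    int *p = cells;
    return (size_t)((char *)(p + 1) - (char *)p);   /* equals sizeof(int) */
}

/* Walk an array by moving an address instead of indexing. */
int sum_by_address(const int *first, size_t count)
{
    int total = 0;
    for (const int *p = first; p != first + count; p++)
        total += *p;
    return total;
}
```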
Here's a pointer/array example that gave me pause. Assume you have two arrays:
uint8_t source[16] = { /* some initialization values here */ };
uint8_t destination[16];
And your goal is to copy the uint8_t contents from source to destination using memcpy(). Guess which of the following accomplish that goal:
memcpy(destination, source, sizeof(source));
memcpy(&destination, source, sizeof(source));
memcpy(&destination[0], source, sizeof(source));
memcpy(destination, &source, sizeof(source));
memcpy(&destination, &source, sizeof(source));
memcpy(&destination[0], &source, sizeof(source));
memcpy(destination, &source[0], sizeof(source));
memcpy(&destination, &source[0], sizeof(source));
memcpy(&destination[0], &source[0], sizeof(source));
The answer (Spoiler Alert!) is ALL of them. "destination", "&destination", and "&destination[0]" are all the same value. "&destination" is a different type than the other two, but it is still the same value. The same goes for the permutations of "source".
As an aside, I personally prefer the first version.
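The "same value, different type" point can be checked with a sketch like the following (function names invented here). The addresses compare equal, but adding 1 moves an element pointer by one byte and an array pointer by the whole 16-byte array:

```c
#include <stdint.h>
#include <stddef.h>

/* All three expressions denote the same address. */
int same_address(void)
{
    uint8_t destination[16];
    return (void *)destination == (void *)&destination
        && (void *)destination == (void *)&destination[0];
}

/* destination decays to uint8_t *, so + 1 moves one byte. */
size_t step_of_element_pointer(void)
{
    uint8_t destination[16];
    return (size_t)((uint8_t *)(destination + 1) - (uint8_t *)destination);
}

/* &destination is uint8_t (*)[16], so + 1 moves past the whole array. */
size_t step_of_array_pointer(void)
{
    uint8_t destination[16];
    return (size_t)((uint8_t *)(&destination + 1) - (uint8_t *)&destination);
}
```

memcpy() takes void * parameters, which is why the type difference never shows up in the nine calls.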
I should start out by saying that C and C++ were the first programming languages I learned. I started with C, then did C++ in school, a lot, and then went back to C to become fluent in it.
The first thing that confused me about pointers when learning C was the simple:
char ch;
char str[100];
scanf("%c %s", &ch, str);
This confusion was mostly rooted in having been introduced to using reference to a variable for OUT arguments before pointers were properly introduced to me. I remember that I skipped writing the first few examples in C for Dummies because they were too simple only to never get the first program I did write to work (most likely because of this).
What was confusing about this was what &ch actually meant as well as why str didn't need it.
After I became familiar with that I next remember being confused about dynamic allocation. I realized at some point that having pointers to data wasn't extremely useful without dynamic allocation of some type, so I wrote something like:
char * x = NULL;
if (y) {
char z[100];
x = z;
}
to try to dynamically allocate some space. It didn't work. I wasn't sure that it would work, but I didn't know how else it might work.
I later learned about malloc and new, but they really seemed like magical memory generators to me. I knew nothing about how they might work.
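For contrast with the snippet that didn't work, here is a sketch of the same idea done with the heap. The array z stops existing when its block ends, whereas memory from calloc() stays valid until it is freed. The names make_buffer and pick_buffer are made up for the example:

```c
#include <stdlib.h>

/* Returns a zero-filled buffer of n bytes, or NULL on failure. Caller must free(). */
char *make_buffer(size_t n)
{
    return calloc(n, 1);
}

/* Same shape as the original attempt, but the buffer outlives the if block. */
char *pick_buffer(int y)
{
    char *x = NULL;
    if (y)
        x = make_buffer(100);   /* still valid after this block ends */
    return x;
}
```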
Some time later I was being taught recursion again (I'd learned it on my own before, but was in class now) and I asked how it worked under the hood -- where were the separate variables stored. My professor said "on the stack" and lots of things became clear to me. I had heard the term before and had implemented software stacks before. I had heard others refer to "the stack" long before, but had forgotten about it.
Around this time I also realized that using multidimensional arrays in C can get very confusing. I knew how they worked, but they were just so easy to get tangled up in that I decided to try to work around using them whenever I could. I think that the issue here was mostly syntactic (especially passing to or returning them from functions).
Since I was writing C++ for school for the next year or two I got a lot of experience using pointers for data structures. Here I had a new set of troubles -- mixing up pointers. I would have multiple levels of pointers (things like node ***ptr;) trip me up. I'd dereference a pointer the wrong number of times and eventually resort to figuring out how many * I needed by trial and error.
At some point I learned how a program's heap worked (sort of, but good enough that it no longer kept me up at night). I remember reading that if you look a few bytes before the pointer that malloc on a certain system returns, you can see how much data was actually allocated. I realized that the code in malloc could ask for more memory from the OS and this memory was not part of my executable files. Having a decent working idea of how malloc works is really useful.
Soon after this I took an assembly class, which didn't teach me as much about pointers as most programmers probably think. It did get me to think more about what assembly my code might be translated into. I had always tried to write efficient code, but now I had a better idea how to.
I also took a couple of classes where I had to write some lisp. When writing lisp I wasn't as concerned with efficiency as I was in C. I had very little idea what this code might be translated into if compiled, but I did know that it seemed like using lots of local named symbols (variables) made things a lot easier. At some point I wrote some AVL tree rotation code in a little bit of lisp, that I had a very hard time writing in C++ because of pointer issues. I realized that my aversion to what I thought were excess local variables had hindered my ability to write that and several other programs in C++.
I also took a compilers class. While in this class I flipped ahead to the advanced material and learned about static single assignment (SSA) and dead variables, which isn't that important except that it taught me that any decent compiler will do a decent job of dealing with variables which are no longer used. I already knew that more variables (including pointers) with correct types and good names would help me keep things straight in my head, but now I also knew that avoiding them for efficiency reasons was even more stupid than my less micro-optimization minded professors told me.
So for me, knowing a good bit about the memory layout of a program helped a lot. Thinking about what my code means, both symbolically and on the hardware, helps me out. Using local pointers that have the correct type helps a lot. I often write code that looks like:
int foo(struct frog * f, int x, int y) {
struct leg * g = f->left_leg;
struct toe * t = g->big_toe;
process(t);
so that if I screw up a pointer type it is very clear by the compiler error what the problem is. If I did:
int foo(struct frog * f, int x, int y) {
process(f->left_leg->big_toe);
and got any pointer type wrong in there, the compiler error would be a whole lot more difficult to figure out. I would be tempted to resort to trial and error changes in my frustration, and probably make things worse.
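A fuller sketch of the one-local-per-hop style, with struct definitions that are purely hypothetical (the originals aren't shown above). Each typed local also gives a natural place for a NULL check:

```c
#include <stddef.h>

/* Hypothetical types, just enough to make the chain f->left_leg->big_toe real. */
struct toe  { int length_mm; };
struct leg  { struct toe *big_toe; };
struct frog { struct leg *left_leg; };

/* One typed local per hop; returns -1 if any link in the chain is missing. */
int big_toe_length(const struct frog *f)
{
    if (f == NULL) return -1;
    const struct leg *g = f->left_leg;
    if (g == NULL) return -1;
    const struct toe *t = g->big_toe;
    if (t == NULL) return -1;
    return t->length_mm;
}
```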
Looking back, there were four things that really helped me to finally understand pointers. Prior to this, I could use them, but I did not fully understand them. That is, I knew if I followed the forms, I would get the results I desired, but I did not fully understand the 'why' of the forms. I realize that this is not exactly what you have asked, but I think it is a useful corollary.
Writing a routine that took a pointer to an integer and modified the integer. This gave me the necessary forms upon which to build any mental models of how pointers work.
One-dimensional dynamic memory allocation. Figuring out 1-D memory allocation made me understand the concept of the pointer.
Two-dimensional dynamic memory allocation. Figuring out 2-D memory allocation reinforced that concept, but also taught me that the pointer itself requires storage and must be taken into account.
Differences between stack variables, global variables and heap memory. Figuring out these differences taught me the types of memory to which the pointers point/refer.
Each of these items required imagining what was going on at a lower level--building a mental model that satisfied every case I could think of throwing at it. It took time and effort, but it was well worth it. I am convinced that to understand pointers, you have to build that mental model on how they work and how they are implemented.
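The first three items on that list can be sketched roughly as follows. This is one common way to do it (an array of row pointers for the 2-D case), not the only one, and the function names are invented:

```c
#include <stdlib.h>

/* Modify the caller's int through a pointer. */
void add_one(int *n) { (*n)++; }

/* 1-D: one zero-filled block of count ints. */
int *make_row(size_t count)
{
    return calloc(count, sizeof(int));
}

/* 2-D: an array of row pointers, each pointing at its own block.
   The row pointers themselves need storage too. */
int **make_grid(size_t rows, size_t cols)
{
    int **grid = calloc(rows, sizeof *grid);
    if (grid == NULL) return NULL;
    for (size_t r = 0; r < rows; r++) {
        grid[r] = make_row(cols);
        if (grid[r] == NULL) {          /* undo partial work on failure */
            while (r > 0) free(grid[--r]);
            free(grid);
            return NULL;
        }
    }
    return grid;
}

void free_grid(int **grid, size_t rows)
{
    if (grid == NULL) return;
    for (size_t r = 0; r < rows; r++) free(grid[r]);
    free(grid);
}
```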
Now back to your original question. Based on the previous list, there were several items that I had difficulty in grasping originally.
How and why would one use a pointer.
How are they different and yet similar to arrays.
Understanding where the pointer information is stored.
Understanding what and where it is the pointer is pointing at.
I had my "pointer moment" working on some telephony programs in C. I had to write an AXE10 exchange emulator using a protocol analyser that only understood classic C. Everything hinged on knowing pointers. I tried writing my code without them (hey, I was "pre-pointer", cut me some slack) and failed utterly.
The key to understanding them, for me, was the & (address) operator. Once I understood that &i meant the "address of i" then understanding that *i meant "the contents of the address pointed to by i" came a bit later. Whenever I wrote or read my code I always repeated what "&" meant and what "*" meant and eventually I came to use them intuitively.
To my shame, I was forced into VB and then Java so my pointer knowledge is not as sharp as it once was, but I am glad I am "post-pointer". Don't ask me to use a library that requires me to understand **p, though.
The main difficulty with pointers, at least to me, is that I didn't start with C. I started with Java. The whole notion of pointers was really foreign until a couple of classes in college where I was expected to know C. So then I taught myself the very basics of C and how to use pointers in their very basic sense. Even then, every time I find myself reading C code, I have to look up pointer syntax.
So in my very limited experience (1 year real world + 4 in college), pointers confuse me because I've never really had to use them in anything other than a classroom setting. And I can sympathize with the students now starting out CS with Java instead of C or C++. As you said, you learned pointers in the 'Neolithic' age and have probably been using them ever since. To us newer people, the notion of allocating memory and doing pointer arithmetic is really foreign because all these languages have abstracted that away.
P.S.
After reading the Spolsky essay, his description of 'JavaSchools' was nothing like what I went through in college at Cornell ('05-'09). I took the structures and functional programming (sml), operating systems (C), algorithms (pen and paper), and a whole slew of other classes that weren't taught in Java. However, all the intro classes and electives were done in Java because there's value in not reinventing the wheel when you are trying to do something higher-level than implementing a hashtable with pointers.
Here is a non-answer:
Use cdecl (or c++decl) to figure it out:
eisbaw#leno:~$ cdecl explain 'int (*(*foo)(const void *))[3]'
declare foo as pointer to function (pointer to const void) returning pointer to array 3 of int
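To make that declaration a little less abstract, here is a sketch that actually uses it, plus the same type built up with typedefs. The names table, get_table, triple, lookup_fn and second_entry are all invented for the example:

```c
#include <stddef.h>

static int table[3] = {1, 2, 3};

/* A function (pointer to const void) returning pointer to array 3 of int. */
int (*get_table(const void *unused))[3]
{
    (void)unused;
    return &table;
}

/* The same type built up in steps with typedefs. */
typedef int triple[3];
typedef triple *(*lookup_fn)(const void *);

int second_entry(void)
{
    int (*(*foo)(const void *))[3] = get_table;  /* the declaration cdecl explained */
    lookup_fn same = foo;                        /* identical type, easier to read */
    return (*same(NULL))[1];                     /* element 1 of the returned array */
}
```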
They add an extra dimension to the code without a significant change to the syntax. Think about this:
int a;
a = 5;
There's only one thing to change: a. You can write a = 6 and the results are obvious to most people. But now consider:
int *a;
a = &some_int;
There are two things about a that are relevant at different times: the actual value of a, the pointer, and the value "behind" the pointer. You can change a:
a = &some_other_int;
...and some_int is still around somewhere with the same value. But you can also change the thing it points to:
*a = 6;
There's a conceptual gap between a = 6, which has only local side effects, and *a = 6, which could affect a bunch of other things in other places. My point here is not that the concept of indirection is inherently tricky, but that because you can do both the immediate, local thing with a or the indirect thing with *a... that might be what confuses people.
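The two kinds of change can be put side by side in a small sketch (function names are illustrative only):

```c
/* Reseating the pointer leaves the old target untouched. */
int reseat_keeps_old_value(void)
{
    int some_int = 5, some_other_int = 9;
    int *a = &some_int;
    a = &some_other_int;     /* changes a itself */
    (void)a;
    return some_int;         /* still 5 */
}

/* Writing through the pointer changes the target, wherever it lives. */
int write_changes_target(void)
{
    int some_int = 5;
    int *a = &some_int;
    *a = 6;                  /* changes what a points at */
    return some_int;         /* now 6 */
}
```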
I had programmed in C++ for about 2 years and then converted to Java (5 years) and never looked back. However, when I recently had to use some native stuff, I found out (with amazement) that I hadn't forgotten anything about pointers and I even find them easy to use. This is a sharp contrast to what I experienced 7 years ago when I first tried to grasp the concept. So, I guess understanding and liking them is a matter of programming maturity? :)
OR
Pointers are like riding a bike, once you figure out how to work with them, there's no forgetting it.
All in all, hard to grasp or not, the whole pointer idea is VERY educational and I believe it should be understood by every programmer, regardless of whether they program in a language with pointers or not.
I think one reason C pointers are difficult is that they conflate several concepts which are not really equivalent; yet, because they are all implemented using pointers, people can have a hard time disentangling the concepts.
In C, pointers are used to, among other things:
Define recursive data structures
In C you'd define a linked list of integers like this:
struct node {
int value;
struct node* next;
};
The pointer is only there because this is the only way to define a recursive data structure in C, when the concept really has nothing to do with such a low-level detail as memory addresses. Consider the following equivalent in Haskell, which doesn't require use of pointers:
data List = List Int List | Null
Pretty straightforward - a list is either empty, or formed from a value and the rest of the list.
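For comparison, a minimal sketch of building and walking that list in C. The helper names cons, list_sum and list_free are made up here, and NULL plays the role of the empty list:

```c
#include <stdlib.h>

struct node {
    int value;
    struct node *next;   /* NULL marks the end of the list */
};

/* Put a new node in front of rest; returns NULL if allocation fails. */
struct node *cons(int value, struct node *rest)
{
    struct node *n = malloc(sizeof *n);
    if (n != NULL) {
        n->value = value;
        n->next = rest;
    }
    return n;
}

/* Add up every value by following next pointers until NULL. */
int list_sum(const struct node *list)
{
    int total = 0;
    for (; list != NULL; list = list->next)
        total += list->value;
    return total;
}

/* Release every node, saving next before freeing the current one. */
void list_free(struct node *list)
{
    while (list != NULL) {
        struct node *next = list->next;
        free(list);
        list = next;
    }
}
```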
Iterate over strings and arrays
Here's how you might apply a function foo to every character of a string in C:
char *c;
for (c = "hello, world!"; *c != '\0'; c++) { foo(c); }
Despite also using a pointer as an iterator, this example has very little in common with the previous one. Creating an iterator that you can increment is a different concept from defining a recursive data structure. Neither concept is especially tied to the idea of a memory address.
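The same iterator idea as a self-contained sketch, where count_char is a hypothetical stand-in for whatever foo would do with each character:

```c
#include <stddef.h>

/* Count occurrences of wanted by stepping a pointer until the terminating '\0'. */
size_t count_char(const char *s, char wanted)
{
    size_t hits = 0;
    for (const char *c = s; *c != '\0'; c++)
        if (*c == wanted)
            hits++;
    return hits;
}
```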
Achieve polymorphism
Here is an actual function signature found in glib:
typedef struct g_list GList;
void g_list_foreach (GList *list,
void (*func)(void *data, void *user_data),
void* user_data);
Whoa! That's quite a mouthful of void*'s. And it's all just to declare a function that iterates over a list that can contain any kind of thing, applying a function to each member. Compare it to how map is declared in Haskell:
map :: (a -> b) -> [a] -> [b]
That's much more straightforward: map is a function that takes a function which converts an a to a b, and applies it to a list of a's to yield a list of b's. Just like in the C function g_list_foreach, map doesn't need to know anything in its own definition about the types to which it will be applied.
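To show the void* pattern at work without pulling in glib, here is a sketch over a plain array of pointers. This is not glib's implementation, and for_each and add_int are invented names:

```c
#include <stddef.h>

/* Apply func to each element of an array of untyped pointers. */
void for_each(void **items, size_t count,
              void (*func)(void *data, void *user_data),
              void *user_data)
{
    for (size_t i = 0; i < count; i++)
        func(items[i], user_data);
}

/* One possible callback: treats data as int * and user_data as a running total. */
void add_int(void *data, void *user_data)
{
    *(int *)user_data += *(int *)data;
}
```

Nothing checks that the callback's casts match what is really stored, which is the price of this kind of polymorphism in C.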
To sum up:
I think C pointers would be a lot less confusing if people first learned about recursive data structures, iterators, polymorphism, etc. as separate concepts, and then learned how pointers can be used to implement those ideas in C, rather than mashing all of these concepts together into a single subject of "pointers".
I think it requires a solid foundation, probably from the machine level, with an introduction to some machine code, assembly, and how to represent items and data structures in RAM. It takes a little time, some homework or problem-solving practice, and some thinking.
But if a person learns high-level languages first (and there is nothing wrong with that: a carpenter uses an ax, and a person who needs to split an atom uses something else; we need carpenters, and we have people who study atoms) and is then given a 2-minute introduction to pointers, it is hard to expect them to understand pointer arithmetic, pointers to pointers, arrays of pointers to variable-size strings, arrays of arrays of characters, etc. A solid low-level foundation can help a lot.
Pointers are difficult because of the indirection.
Pointers (along with some other aspects of low-level work) require the user to take away the magic.
Most high level programmers like the magic.
Pointers are a way of dealing with the difference between a handle to an object and an object itself. (ok, not necessarily objects, but you know what I mean, as well as where my mind is)
At some point, you probably have to deal with the difference between the two. In modern, high-level languages this becomes the distinction between copy-by-value and copy-by-reference. Either way, it is a concept that is often difficult for programmers to grasp.
However, as has been pointed out, the syntax for handling this problem in C is ugly, inconsistent, and confusing. Eventually, if you really attempt to understand it, a pointer will make sense. But when you start dealing with pointers to pointers, and so on ad nauseam, it gets really confusing for me as well as for other people.
Another important thing to remember about pointers is that they're dangerous. C is a master programmer's language. It assumes you know what the heck you're doing and thereby gives you the power to really mess things up. While some types of programs still need to be written in C, most programs do not, and if you have a language that provides a better abstraction for the difference between an object and its handle, then I suggest you use it.
Indeed, in many modern C++ applications, it is often the case that any required pointer arithmetic is encapsulated and abstracted. We don't want developers doing pointer arithmetic all over the place. We want a centralized, well tested API that does pointer arithmetic at the lowest level. Making changes to this code must be done with great care and extensive testing.
The problem I have always had (primarily self-taught) is the "when" to use a pointer. I can wrap my head around the syntax for constructing a pointer but I need to know under which circumstances a pointer should be used.
Am I the only one with this mindset? ;-)
Once upon a time... We had 8 bit microprocessors and everyone wrote in assembly. Most processors included some type of indirect addressing used for jump tables and kernels. When higher level languages came along we added a thin layer of abstraction and called them pointers. Over the years we have gotten further and further away from the hardware. This is not necessarily a bad thing. They are called higher level languages for a reason. The more I can concentrate on what I want to do instead of the details of how it is done the better.
It seems many students have a problem with the concept of indirection, especially when they meet it for the first time. I remember from back when I was a student that out of the 100+ students in my course, only a handful of people really understood pointers.
The concept of indirection is not something that we often use in real life, and therefore it's a hard concept to grasp initially.
I have recently just had the pointer click moment, and I was surprised that I had been finding it confusing. It was more that everyone talked about it so much, that I assumed some dark magic was going on.
The way I got it was this. Imagine that all defined variables are given memory space at compile time (on the stack). If you want a program that could handle large data files such as audio or images, you wouldn't want a fixed amount of memory for these potential structures. So you wait until runtime to assign a certain amount of memory to holding this data (on the heap).
Once you have your data in memory, you don't want to be copying that data all around your memory bus every time you want to run an operation on it. Say you want to apply a filter to your image data. You have a pointer that starts at the front of the data you have assigned to the image, and a function runs across that data, changing it in place. If you didn't know what you were doing, you would probably end up making duplicates of the data as you ran it through the operation.
At least that's the way I see it at the moment!
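A tiny sketch of that "run across the data, changing it in place" idea, assuming 8-bit samples. The invert filter and the function name are just examples:

```c
#include <stddef.h>
#include <stdint.h>

/* Invert every 8-bit sample in place; nothing is copied. */
void invert_in_place(uint8_t *pixels, size_t count)
{
    for (uint8_t *p = pixels; p != pixels + count; p++)
        *p = (uint8_t)(255 - *p);
}
```

Only the address and a length cross the function boundary, however large the image is.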
Speaking as a C++ newbie here:
The pointer system took a while for me to digest not necessarily because of the concept but because of the C++ syntax relative to Java. A few things I found confusing are:
(1) Variable declaration:
A a(1);
vs.
A a = A(1);
vs.
A* a = new A(1);
and apparently
A a();
is a function declaration and not a variable declaration. In other languages, there's basically just one way to declare a variable.
(2) The ampersand is used in a few different ways. If it is
int* i = &a;
then the &a is a memory address.
OTOH, if it is
void f(int &a) {}
then the &a is a passed-by-reference parameter.
Although this may seem trivial, it can be confusing for new users. I came from Java, and Java is a language with a more uniform use of operators.
(3) Array-pointer relationship
One thing that's a tad bit frustrating to comprehend is that a pointer
int* i
can be a pointer to an int
int *i = &n;
or
can point to the first element of an array of ints
int* i = new int[5];
And then, just to make things messier, pointers and arrays are not interchangeable in all cases, and pointers cannot always be passed where an array parameter is expected.
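One concrete place where the two differ, sketched in plain C rather than C++ (names invented): sizeof sees the whole array where it is declared, but only a pointer once the array has been passed to a function.

```c
#include <stddef.h>

/* Writing the parameter as "int values[5]" would mean exactly the same thing:
   the callee receives a pointer, so sizeof sees only the pointer. */
size_t size_seen_by_callee(int *values)
{
    return sizeof values;          /* sizeof(int *) */
}

/* Where the array is declared, sizeof sees all five elements. */
size_t size_seen_by_owner(void)
{
    int values[5] = {0};
    (void)values;
    return sizeof values;          /* 5 * sizeof(int) */
}
```

This is why C functions that take arrays usually take an explicit length as well.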
This sums up some of the basic frustrations I had with C/C++ and its pointers, which IMO, is greatly compounded by the fact that C/C++ has all these language-specific quirks.
Personally, I did not understand pointers even after my postgraduate degree and after my first job. The only thing I knew was that you need them for linked lists, binary trees, and passing arrays into functions. This was the situation even at my first job. Only when I started interviewing did I understand that the pointer concept is deep and has tremendous use and potential. Then I started reading K&R and writing my own test programs. My whole goal was job-driven.
At that time I found that pointers are really neither bad nor difficult if they are taught well. Unfortunately, when I learned C during my degree, our teacher did not really understand pointers, and even the assignments made little use of them. At the graduate level, the use of pointers really only goes as far as creating binary trees and linked lists. This thinking, that you don't need a proper understanding of pointers to work with them, kills the idea of learning them.
Pointers... hah. All a pointer is, in my head, is something that gives the memory address where the actual value it refers to is stored, so there's no magic about it. If you learn some assembly, you won't have much trouble learning how pointers work. Come on, guys... even in Java everything is a reference.
The main problem is that people do not understand why they need pointers,
because they are not clear about the stack and the heap.
It is good to start with 16-bit assembler for x86 in the tiny memory model. It helped many people get the idea of stack, heap and "address". And byte :) Modern programmers sometimes can't tell you how many bytes you need to address a 32-bit space. How can they get the idea of pointers?
The second issue is notation: you declare a pointer with *, you get an address with &, and this is not easy for some people to understand.
And the last thing I saw was a storage problem: they understand heap and stack but can't grasp the idea of "static".
I am new to C programming and I am having problems understanding common pitfalls and common usages of different library functions in C. Can someone point me to a good resource where I can learn the subtleties of C programming? Also, can someone point me to a good resource for learning debugging tools like gdb?
I also want to know the difference between char *c="hello"; and char c[10]="hello". Can someone tell me which one is recommended in which situations?
Thanks & Regards,
Mousey.
char *c = "hello";
That makes c a pointer and is pointing to memory that should not be modified (so you cannot modify the data). But since c is a pointer, you can change where it points to.
char c[10] = "hello";
That makes c an array and arranges to have the array initialized with the specified string. Since it's an array, you can modify the data (although make sure you don't overflow the buffer) but you cannot change where in memory c references.
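A small sketch of the difference, for illustration. The function name is made up; the pointer is declared const char * here so the compiler rejects writes through it, which is a stricter choice than the plain char * in the question.

```c
#include <stddef.h>
#include <string.h>

/* Copies the edited text into out (which must hold at least 10 bytes)
   and returns sizeof the array, to show it is the full 10 bytes. */
size_t demo_array_vs_pointer(char *out)
{
    char c[10] = "hello";      /* 10-byte array; the literal is copied into it */
    const char *p = "hello";   /* pointer to a literal; treat it as read-only */

    c[0] = 'j';                /* fine: the array owns its storage */
    p = c;                     /* fine: a pointer can be re-aimed */
    /* c = p; would not compile: an array cannot be reassigned */

    strcpy(out, p);            /* out now holds "jello" */
    return sizeof c;           /* 10: the whole array, not a pointer */
}
```

Writing through a plain char * that points at a string literal compiles but is undefined behaviour, which is why the array form is the one to use when the text needs to change.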
Just read The C Programming Language and write code. If you're new to it then you need first-hand experience so you can learn what the subtleties are. Just reading a list won't help a huge amount.
For the language itself, the book by the language designers is a good read. Be sure to do the exercises.
Another useful resource is the comp.lang.c FAQ. You've asked question 6.2 (be sure to read 6.1 and 6.3 as well).
It's explained in the links above, but just to insist: pointers and arrays are not the same thing in C. Rather, there are circumstances where the language requires a pointer, but you can use an array instead and it'll be converted automatically.
The difference is as follows:
char *c = "hello";
Creates several things:
a char* called c
a static string in memory filled with "hello\0"
and it sets c to the address of that static memory
Whereas:
char c[10] = "hello";
Creates:
a char* called c (See note below)
10 slots in memory someplace
sets c to the address of the first location in the above
and it treats "hello" like {'h','e','l','l','o','\0'}, thus copying those values into c[0] through c[5]
depending on the compiler, "hello" may or may not get allocated someplace in memory in addition
Note:
In the second case, there technically isn't both an array and a separate variable that exists just to hold the address of the array; it just seems that way. c names the array itself, and in most expressions it is converted to the address of the first location in the array.
For gdb, the docs are online http://sourceware.org/gdb/current/onlinedocs/gdb/
And a cheat sheet which I find much more useful: http://users.ece.utexas.edu/~adnan/gdb-refcard.pdf
My first recommendation would be that unless you have a really good reason to learn C specifically, learn C++ instead. I realise that is probably going to be contentious amongst some; just something to consider if you have not already done so.
For resources, in the first instance a good book is always best, but if you are looking for on-line resources then you will find that many are C++ related, some deal with C and C++; different styles of writing and presentation suit different users; try some of these:
http://www.cprogramming.com/
http://www.howstuffworks.com/c.htm
http://www2.its.strath.ac.uk/courses/c/
The C Book (online)
The C Book (HTML download)
The C Book (PDF)
The following C++ related sites include excellent coverage of the C standard library:
http://www.cplusplus.com/
http://www.cppreference.com/wiki/
With respect to GDB, I applaud the appreciation of the benefits of a symbolic debugger; it is remarkable how many developers avoid this essential tool. However, I suggest that using raw GDB may put you off such tools for life. If you are able to use VC++ on Windows, its debugger is second to none, and VC++ Express is free. If you must use GDB (because you are using Linux, for example), I suggest that you use GDB integrated into an IDE such as Eclipse or KDevelop, or use the stand-alone Insight debugger. If you do choose to be hardcore and use GDB directly, there seem to be few resources on how to use it effectively beyond the GDB manual itself. There is also Debugging with GDB: The GNU Source-Level Debugger at $30.
If you're mathematically inclined, Project Euler can probably give you some good practice in certain areas, especially in array manipulation and stuff.
But keep in mind, there's more to programming than math -- despite what your prof might tell you.
The "C Traps and Pitfalls" by Andrew Koenig is an excellent book precisely for learning about C pitfalls. It is a pretty thin book, though. The comp.lang.c FAQ someone else pointed to is also an excellent resource.
Try doing a search for "c programming puzzles" and you'll find lots of resources on the tricky subtleties of the language itself (and there are many). Eg. here
I'm a web coder: I currently enjoy AS3 and deal with PHP. I own a Nintendo DS and want to give C a go.
From a higher level, what basic things/creature comforts are going to go missing?
I can't find [for... in] loops, so I assume they aren't there. It looks like I'm going to have to declare things religiously, and I assume I have no objects (which I dealt with in PHP a while ago).
Hash tables? Funny data types?
To sum it up, you'll basically get:
Typed variables
Functions
Pointers
Standard libraries
Then, you make the rest -- that may be a little too simplified, but that's a rough idea of what to face.
It can be daunting to begin with and there may be a learning curve to overcome. Here's a few speed bumps you may encounter:
String? What string?
One big thing to get used to would be strings. There is no such thing as a string in C. A string is a "null-terminated character array" (sometimes called C strings), which basically means an array of type char with the final element being a \0 (char value 0).
In memory, a char array of length 4 containing Hi! would appear as:
char[0] == 'H'
char[1] == 'i'
char[2] == '!'
char[3] == '\0'
Also, strings don't know their own length (there are no "objects" that come for free in C), so the standard library call strlen is required, which is more or less a for loop that goes through the string until it hits a \0 character. (This means it's an O(N) operation -- the longer the string, the longer it takes to find the length, unlike the O(1) operation of most string implementations in modern languages.)
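To make the O(N) point concrete, here is roughly what such a loop looks like. This is a hypothetical re-implementation called my_strlen, written for illustration; real library versions are usually more heavily optimised.

```c
#include <stddef.h>

/* Counts characters until the terminating '\0' is found.
   The work grows with the length of the string. */
size_t my_strlen(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0') {
        ++n;
    }
    return n;
}
```

For the Hi! example above, the loop visits 'H', 'i' and '!' and stops at the fourth byte, so the result is 3.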
Garbage collection?
There is no such thing as a garbage collector in C. In fact, you need to allocate and deallocate memory yourself:
/* Allocate enough memory for array of 10 int values. */
int* array_of_ints = malloc(sizeof(int) * 10);
/* Done with the array? Don't forget to free the memory! */
free(array_of_ints);
Failing to clean up after allocation of memory can lead to things called memory leaks which I'm sure you've heard of before.
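A slightly fuller sketch of the same allocate/use/free pattern, including the check for a failed allocation. The helper name sum_first_n is made up for this example, and it assumes n is greater than zero.

```c
#include <stdlib.h>

/* Returns 0 + 1 + ... + (n - 1), computed through a heap-allocated array,
   or -1 if the allocation fails. Assumes n > 0. */
long sum_first_n(size_t n)
{
    int *values = malloc(n * sizeof *values);
    if (values == NULL) {
        return -1;                /* malloc can fail; always check */
    }

    long total = 0;
    for (size_t i = 0; i < n; ++i) {
        values[i] = (int)i;
        total += values[i];
    }

    free(values);                 /* every successful malloc needs one free */
    return total;
}
```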
Pointers!
And as always, when we talk about C, we can't forget about pointers. The whole concept of references to variables and dereferencing pointers can be seriously headache-inducing, but once you get the hang of it, it's actually not too bad.
Except for the times when you expect it to work one way, but you find out that you didn't quite understand pointers well enough and it actually does something else -- as they say, been there, done that.
Oh, and pointers are probably going to be one of the first times you'll actually see a program crash bad enough that the operating system will yell at you. A segmentation fault is not something the computer likes a lot.
Types
All variables in C will have types. C is a statically-typed language, meaning that variable types will be checked at compile time. This might take some getting used to at the beginning, but can also be seen as a good thing, as it can reduce runtime errors such as type errors where you try to assign a number to a string.
However, it is possible to perform typecasts, so an int value (an integer) can be cast to a double (a floating-point value). Casting an int to char*, on the other hand, does not produce a string representation of the number; that conversion has to be done explicitly.
So, for example, in some languages the following is allowed:
// Example of a very weakly-typed pseudolanguage with implicit typecasts:
number n = 42
string s = "answer: "
string result = s + n // Result: "answer: 42"
In C, one would have to convert the int to a char* representation first (for example with snprintf, or with itoa where that non-standard function is available), then use strcat to concatenate the two strings.
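As a sketch of the C equivalent, the standard snprintf can do the conversion and the concatenation in one step. The helper name label_with_number is hypothetical, chosen for this example.

```c
#include <stdio.h>

/* Writes "<label><n>" into out.
   Returns 1 if the whole result fit, 0 if it was truncated or an error occurred. */
int label_with_number(char *out, size_t out_size, const char *label, int n)
{
    int written = snprintf(out, out_size, "%s%d", label, n);
    return written >= 0 && (size_t)written < out_size;
}
```

Called with a 32-byte buffer, the label "answer: " and the number 42, it leaves "answer: 42" in the buffer. The caller has to supply the buffer and its size, which is exactly the kind of bookkeeping the higher-level languages hide.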
Conclusion
Those things said, learning C coming from a higher-level language can be very eye-opening and probably challenging to begin with, but once you get the hang of it, it can be pretty fun to work with.
I'd recommend starting to experiment with a C compiler, and have a good book or reference.
I think many people will recommend the K&R book, which is indeed an excellent book.
At first, I didn't think recommending K&R as the first C book would be a good idea because it may be a little bit on the difficult side, but on second thought, I think it is a very comprehensive and well-written book that can be good for getting into C if you already have some programming experience.
Good luck!
Well ... You might be in for something of a culture shock. These are the 32 standard keywords in C, and that includes the basic types.
C's standard library is pretty functional (more so than people perhaps expect), but very very thin when compared to what higher-level languages give you. There is no hash table in sight, and you are correct to assume that C does not have syntactic or semantic support for objects.
It is possible to write pretty object-oriented code anyway, but you will have to jump through a few hoops, and do much more manually since the language won't help you. See for instance the GTK+ UI toolkit for an example of a well-designed object-oriented C library/API.
I'm a web coder: I currently enjoy AS3 and deal with PHP. I own a Nintendo DS and want to give C a go.
Why do you want to do C programming?
What are your reasons, what do you hope to achieve?
Is it in order to write software for the Nintendo DS?
From a higher level, what basic things/creature comforts are going to go missing?
Given your background, I think you'll personally miss the lack of dynamic typing support, in other words you will have to be very explicit in your C programs, your data must be specified with proper types, so that the compiler knows what type of data you are working with. This also applies to any sort of memory management, i.e. basically anything once you start working with data structures that are non PODs.
For example, where you would do something like this in PHP:
function multiply($x) {
return ($x*$x);
}
You would have to do something like this in C:
int multiply(int x) {
return (x*x);
}
While these may seem fairly similar, there are big differences, namely typing restrictions: the PHP version will also work with floating point values, while in C you would have to explicitly provide versions for different types and ranges of values (C types are constrained to certain ranges).
I can't find [for... in] loops, so assume they aren't there
in C, it looks more like the following:
int c;
for (c=0;c<=10;c++) {
// loop body
}
it looks like I'm going to have to declare things religiously
Yes, very much so - much more so, than you'll appreciate
and I assume I have no objects (which I dealt with in PHP a while ago).
correct, no objects - but OOP can still be emulated using other ways, such as function(struct obj)
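A minimal sketch of that emulation, assuming a made-up counter "class". The struct holds the data and each "method" is an ordinary function that takes the object as its first argument; a pointer is passed rather than the struct itself so that changes are visible to the caller.

```c
/* Hypothetical "class": the data members. */
struct counter {
    int value;
    int step;
};

/* Plays the role of a constructor. */
void counter_init(struct counter *self, int step)
{
    self->value = 0;
    self->step = step;
}

/* A "method" that modifies the object. */
void counter_increment(struct counter *self)
{
    self->value += self->step;
}

/* A read-only "method"; const documents that it does not modify the object. */
int counter_get(const struct counter *self)
{
    return self->value;
}
```

Usage is counter_init(&c, 3); counter_increment(&c); and so on, where another language would write c.increment().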
Depending on your goals and motivation, I think you may find C a pretty frustrating language to start serious programming with, you may want to look into some of the related alternatives like for example Java instead.
Dynamic arrays and garbage collection. They're not built into C, so you'll need to roll your own or use a pre-existing solution.
The standard procedure is that you manage the memory yourself which might sound like something horrible but it really isn't. For example in AS3 and PHP you can create an array and forget it when you're done with it. In C you'll have to make sure to deallocate it yourself or memory will leak and bad stuff can/will happen.
You'll particularly miss automatic memory management, and semantically meaningful datatypes such as strings, tables &c. However, learning C well is quite instructive, even though you probably don't want to use it for application-level programming, so I suggest you grab a "K&R" (Kernighan and Ritchie's seminal book) and give it a go -- you'll find plenty of free libraries on the web to use and study as you proceed beyond that, though you'll have to discipline yourself to use proper memory management heuristics... happy learning!
I was just doing some research online, and it seems there's a viable possibility to use lua for developing on the "nintendo DS", this may in fact be the easiest way for someone familiar with high level languages to get started doing embedded development, without sacrificing too much HLL power and without experiencing the inevitable culture shock when migrating from a HLL to C: microlua, here are the API docs.
So you might want to give it a go, possibly using an emulator for starters.
Keep us posted!
I'm pretty sure you want to be looking at C++, not C. C++ is basically object oriented C.
What you'll REALLY miss is the ability to rapidly prototype and test changes. You can't just change a line of code and run. Even using build tools like "make" a recompile can often take several minutes. This is even worse when you consider that it's really easy to make mistakes in C/C++. On large projects I reckon I spend more time compiling than actually coding. As a long-term user of script languages this is my biggest issue with using C.
Moving directly from a higher-level language running on a machine with effectively infinite resources to a DS is going to be a challenge, and not just because of the language.
The Nintendo DS has only 4MB of RAM, a 67MHz ARM9 main CPU alongside a 33MHz ARM7, no operating system, and the development libraries available (such as libnds) provide only a thin abstraction over the hardware itself.
So, in addition to having to deal with manual memory management, a simpler language with fewer creature comforts, static typing, lack of objects, and the need to run a compile step before you can see any changes, you also have to deal with memory fragmentation, a very slow CPU by modern standards, and needing to interact with the hardware directly in order to do anything useful.
Writing code for the DS, the only other option is C++. You can't use a lot of the advanced features that make C++ worthwhile on such a limited system. You'd be writing C code using a C++ compiler.
That said, it's a lot of fun. You can screw around with the hardware all you like, and there's no need to interface with the operating system, because there isn't one.
C is the next level above straight assembler and allows you to operate close to the metal. This gives power to do amazing stuff but also to easily shoot yourself in the foot!
One such example is direct memory access and the perils and wonder of pointer arithmetic. Pointers are very powerful, fast, and handy however require careful management. See this SO question for an example.
Also as mentioned by the other answerers you will have to do your own memory management. Again powerful and painful.
I would recommend studying a good textbook and finding some quality example code. The key thing is to learn the patterns that make all this stuff hang together correctly and elegantly (well, as much as possible). A good debugger will also really help, and get familiar with the standard C libraries too.
You may notice your applications crashing at the drop of a hat initially, but persevere, as C is definitely worth at least dabbling in. You will understand some of the amazing abstractions higher-level languages provide and what is really going on under the hood.
We need more homebrew developers. I am a GBA/NDS and many other embedded platform developer and hope to see that you continue with this. I would say skip to arm assembler and then back up to C or any other language you like, once you know how the processor works, languages are just syntax.
I assume your prior experience covers the programming mindset, breaking things down into bite sized chunks and then writing code to perform those chunks. Then another module that links those together and so on. Then C is just another language, a very very simple language, no need to dive into the corners of it, drive down the middle. It is a good habit to declare variables, etc, and here you will have to. The compilers will tell you when you have forgotten something. You are not going to need big concepts, big structures, language magic, this is embedded, you are resource limited, write some bytes here, read a register there, extract a bit from the data to see if a button has been pressed, write a register in response to move a sprite, etc.
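As a sketch of the "extract a bit from the data" step, assuming the register value has already been read into a 16-bit variable. The function name is made up, and which bit belongs to which button, and whether a set bit means pressed or released, depends on the hardware documentation; none of that is shown here.

```c
#include <stdint.h>

/* Returns 1 if the given bit (0..15) is set in a 16-bit register value,
   0 otherwise. */
int bit_is_set(uint16_t reg_value, unsigned bit)
{
    return (reg_value >> bit) & 1u;
}
```

On real hardware the value would typically come from a read through a volatile pointer to a memory-mapped register.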
I think you will find the NDS much harder than C at first, there are two processors and some infrastructure to get the simplest of working binaries. Granted there are many many examples out there as well. I generally (and still do) recommend starting with the GBA then graduate to the NDS. bite size chunks.
A lot of things from OOP are the same or almost the same in PHP and C#.
You don't play with pointers in C# (compared to C++) so I would definitely recommend going with C# if you want to play with C.
What C are you talking about?
C#
foreach(string item in itemsCollection)
{
...
}
PHP
foreach($itemsCollection as $key=>$value)
{
...
}
etc.
I like C# because it is strongly typed and your types are automatically checked while you write code... The possibility of trying to save an integer into a string or vice versa is zero compared to PHP, where you can save anything into anything...
Closed. This question is opinion-based. It is not currently accepting answers.
Closed 6 years ago.
For those of you with curriculum development experience: what is the best strategy regarding arrays?
I have seen some schools that teach arrays after variables and control structures, often before even teaching functions. This allows teaching of some rudimentary algorithms, etc. However, it then brings the problem of how to pass arrays to functions, so it is necessary to go back to arrays once pointers are taught and patch things up.
Another option is to go from variables and control structures to functions, and then teach pointers, and once you have pointers, teach arrays from scratch, and then use that to get to dynamic memory allocation.
To me the second option makes more sense, because unlike simple variables, with arrays it is easy to "go out of bounds", but students who did not yet learn about memory and pointers may not understand what lies outside these bounds.
However, I'm interested to know what others think.
I think the best approach is to introduce 1 concept at a time. You don't need to 100% explain arrays in the first module. You can detangle almost anything by introducing 1 concept at a time.
I would teach them in this order: Arrays, Pointers, Arrays+Pointers, OtherStuff[N].
Arrays:
You can teach simple arrays first so they understand the ability to have multiple data slots accessible from a single variable name.
//The following doesn't need an understanding of pointers
int x[10];
x[0] = 5;
Pointers:
Then you can teach about pointers and how they work, starting with some simple examples:
int y = 5;
int *p = &y;
*p = 6;
printf("%i\n", y);
Make sure to give a special emphasis that a pointer is just like any other variable. It stores a memory address.
There is no need to get into the stack vs heap just yet.
Arrays+Pointers:
How to iterate over arrays with pointers:
int x[10];
x[0] = 5;
x[1] = 6;
int *y = x;
printf("%i\n", *y);//prints the first element
y++;
printf("%i\n", *y);//prints the second element
Then you can teach more complicated things...
How to do pointer arithmetic.
array[i] is shorthand for *(array + i)
Passing arrays to functions as array pointers vs pointer param + size param
How arrays are contiguous blocks of memory
Explain string literals, buffers, ...
How sizeof works with pointers vs array types (pointer size vs buffer size)
Explain more complicated concepts like allocating memory, the stack, and the heap
Multiple levels of indirection
References
How multi-dimensional arrays work
...
Throughout all examples make heavy use of sizeof and printing addresses. It really helps to understand what's going on.
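A short sketch of the kind of example meant here, using sizeof and printed addresses to show how an array and a pointer to it differ. The function names are made up, and the addresses printed will vary from run to run.

```c
#include <stddef.h>
#include <stdio.h>

/* Inside a function the parameter is a pointer, whatever the caller passed,
   so sizeof reports the pointer size, not the buffer size. */
size_t size_seen_by_callee(int *p)
{
    return sizeof p;
}

/* Prints sizes and addresses, and returns the element count of x. */
size_t show_array_facts(void)
{
    int x[10] = {0};
    int *p = x;

    printf("sizeof x = %zu, sizeof p = %zu\n", sizeof x, sizeof p);
    printf("&x[0] = %p, &x[1] = %p\n", (void *)&x[0], (void *)&x[1]);
    printf("p     = %p, p + 1 = %p\n", (void *)p, (void *)(p + 1));

    return sizeof x / sizeof x[0];   /* works only where x is still an array */
}
```

The two address lines print matching pairs, and the gap between consecutive addresses is sizeof(int) bytes, which ties the pointer arithmetic item in the list to something students can see.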
I would teach pointers first. They can be explained without teaching arrays. While teaching arrays I could then refer to pointers when explaining the expression a[i], and when explaining how one can pass them to functions.
Don't overthink things.
Teaching these concepts as clearly and engagingly as possible is FAR more important than what order you do them in.
I would suggest touching on the basics of arrays first, and doing pointers and revisiting arrays (more fully this time around) later.
You should teach arrays first, because they exist in almost any other language, and are easier to understand. Pointers, or some aspects of pointers, build on what was learned about arrays. This is the organic order, imho, and how I learned it way back when.
I'm assuming you are teaching C to students who already know how to program in another language like Java (or back in my day, Pascal). I don't think C is a good language to use for teaching programming to complete novices.
I would teach pointers first. This is one of the important new ideas that they will be learning in C. They will already know the concept of arrays from other languages, so there's no urgency to teach this first. So when you do cover arrays in C, you can talk about how they are essentially syntactic sugar for pointer arithmetic, a concept they are now familiar with.
They should be taught at the same time.
The example of a single dimensional array being accessed as a pointer to the base with offset (typesize * index) should make an appearance.
i.e.
a[i] is equivalent to *(a + i)
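The "typesize * index" offset can be checked directly. This is a small sketch with a made-up function name; it measures the distance in bytes between the base of an int array and element i.

```c
#include <stddef.h>

/* Byte distance from a[0] to a[i]; comes out as i * sizeof(int),
   because a + i advances by i whole elements. */
size_t byte_offset_of_element(const int *a, size_t i)
{
    return (size_t)((const char *)(a + i) - (const char *)a);
}
```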
I teach pointers before I worry about arrays. However, typically, the students I see have already been exposed to arrays in their first CS class in some other language. However, even if I were teaching C in the first CS class, I'd do pointers before arrays and describe arrays in terms of pointers. Just because it is fashionable these days to think "no one will ever need or want to know how computers actually work" doesn't mean it's true.
As stated above I don't think the order is important,
but this is the order I wished someone would have showed me the stuff.
Arrays
Pointers
How Arrays and Pointers are the same
Why Arrays and Pointers are NOT the same
For more info on point 4 I really recommend chapter 4,
"The Shocking Truth: C Arrays and Pointers Are NOT the Same!", in "Expert C Programming: Deep C Secrets".
/Johan
Update:
Some links to the book, and there is also a preview of the book.
http://books.google.se - Expert C, deep C secrets
And the user comments about this book are true:
http://www.amazon.co.uk/Expert-Programming-Peter-van-Linden/dp/0131774298
If they've been exposed to assembler beforehand, teach pointers first.
If they've been exposed to higher level languages (ie just about anything) teach arrays first.
In my experience people coming to C without some exposure to assembly level programming (registers, addresses, "computer fundamentals") are about to enter a world of pain. IMHO you're actually better off teaching assembly level coding first, then introducing C as a better assembler.
Interesting question - I hope it's not too late to answer.
When I taught programming at Boston College in the early 80s, my colleagues and I wrestled with these issues every year, and we kept tweaking our approach. C was a new language then, so our progression went through Basic to Pascal. I remember thinking at the time how hard it would be to teach C just because it was more loosey-goosey, there were more ways for students to mess up, and more really confusing things like the distinction between arrays and pointers that you had to teach.
What I found most useful was to try to be concrete, not abstract. For example, in the intro programming course I used an interpreter for a simple decimal computer that you would program in its decimal "machine language". It had addresses going from 0 to 999, and opcodes like 1234, with the "1" meaning "add to the accumulator", and "234" being the address of where to find the number to add. Students would write really simple programs, like to add up a list of numbers, and they would single-step them, observing what happens at each step.
I would have them play with this for about 3 weeks, and then start into BASIC. In the second course, they would go into Pascal. What that little decimal "computer" accomplished was to convey some concrete concepts that make the "abstractions" in "real" languages a lot easier to understand, such as:
What memory is, and what addresses are, and how both data and programs are just numbers at addresses in memory. That makes the concept of "variable" and "array" and "pointer" much easier to explain later.
How the basic model of computation is that very simple steps are performed in sequence, and before each step can begin, the previous one has to finish. I know people will object that computers are highly parallelized and pipelined nowadays, but I have to explain that you need to start really simple, because when newbies see a program run, it looks for all the world like it does everything at once, and reads your mind in the process.
How, by combining a very small vocabulary of instructions, including jumps and conditional jumps, you can get the computer to do almost anything you want it to.
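To give a feel for how small such a decimal computer can be, here is a rough sketch of an interpreter. Only the instruction layout (first digit is the opcode, last three digits are the address) and opcode 1 meaning "add to the accumulator" come from the description above; the halt, store and load opcodes, and all names, are invented for this example and are not the original teaching tool.

```c
#define MEM_SIZE 1000

/* Runs the program in mem starting at address 0 and returns the accumulator
   when a halt instruction is reached (or after max_steps, as a safety net).
   Opcodes 0, 2 and 3 are assumptions made for this sketch. */
int run_decimal_machine(int mem[MEM_SIZE], int max_steps)
{
    int acc = 0;
    int pc = 0;

    for (int step = 0; step < max_steps && pc < MEM_SIZE; ++step) {
        int instruction = mem[pc++];
        int opcode = instruction / 1000;    /* leading digit */
        int address = instruction % 1000;   /* last three digits */

        switch (opcode) {
        case 0: return acc;                    /* halt (assumed) */
        case 1: acc += mem[address]; break;    /* add to the accumulator */
        case 2: mem[address] = acc; break;     /* store accumulator (assumed) */
        case 3: acc = mem[address]; break;     /* load accumulator (assumed) */
        default: return acc;                   /* unknown opcode: stop */
        }
    }
    return acc;
}
```

A program such as 3100, 1101, 1102, 2103, 0 loads the number at address 100, adds the numbers at 101 and 102, stores the sum at 103 and halts. Both the program and its data are just numbers in the same memory, which is the first point in the list above.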
Now, regarding C, I've heard it disparaged as just a cut above assembly language, but I think that's a good thing. It always struck me as a language by experts for experts. I think the ideas of arrays and pointers and structures are very easy to explain if you can just refer back to the underlying machine. Similarly for C++ and object-oriented programming.
So, to summarize, if students understand the underlying concept of how computers work, even if it's a really artificial computer, then explaining higher-level data structure concepts is a lot easier.
Depends what they know. Are you teaching C, or programming-and-C?
I've seen very little success with the latter. C is simply not a very intuitive or forgiving language. I haven't seen students being thankful for starting with it, though I've seen students frustrated with programming because of it.
The ones who are going to stick with programming will go out and learn C in their spare time, anyway. There's no need to push it on them first.
If you're just teaching C, and they already know pointers and arrays, then teaching how pointers and arrays work in C can be done in one lesson.
Would you teach pointers before strings?
Probably not. And most of the same arguments apply.
(But generally I agree with #legion — don't overthink it.)
I think it would be better to start with arrays, because the concept of an array is simple and intuitive, but in C it would be important to revisit arrays after teaching pointers, as 'Legion' suggested before.
This question can be asked for any object-oriented language really.
When I was taught Java, I was first shown arrays and then pointers, as the last part of arrays, to demonstrate the difference between a deep copy and a shallow copy.
In the Stack Overflow podcasts, Joel Spolsky constantly harps on Jeff Atwood about Jeff not knowing how to write code in C. His statement is that "knowing C helps you write better code." He also always uses some sort of story involving string manipulation and how knowing C would allow you to write more efficient string routines in a different language.
As someone who knows a little C, but loves to write code in perl and other high-level languages, I have never once come across a problem that I was able to solve by writing C.
I am looking for examples of real-world situations where knowing C would be useful while writing a project in a high-level/dynamic language like perl or python.
Edit: Reading some of the answers you guys have submitted has been great, but it still doesn't make any sense to me in this regard:
Take the strcat example. There's a right way and a wrong way to combine strings in C. But why should I (as a high-level developer) think that I am smarter than Larry Wall? Why wouldn't the language designers write the string manipulation code the right way?
The classic example that Joel Spolsky uses is on misuse of strcat and strlen, and spotting "Shlemiel the painter" algorithms in general.
It's not that you need C to solve problems that higher-level languages can't solve, it's that knowing C well gives you a perspective on what's going on underneath all those levels of languages that allows you to write better software. Because just such a perspective helps you avoid writing code which is, unknown to you, actually O(n^2), for example.
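A sketch of the strcat case, for illustration. The function names join_slow and join_fast are made up; both assume the destination buffer is large enough for the result.

```c
#include <stddef.h>
#include <string.h>

/* Shlemiel version: every strcat rescans dst from the start to find its end,
   so appending n pieces costs O(n^2) character visits overall.
   dst must start out as an empty string. */
void join_slow(char *dst, const char *const *pieces, size_t count)
{
    for (size_t i = 0; i < count; ++i) {
        strcat(dst, pieces[i]);
    }
}

/* Linear version: remember where the string ends and append there. */
void join_fast(char *dst, const char *const *pieces, size_t count)
{
    char *end = dst;
    for (size_t i = 0; i < count; ++i) {
        size_t len = strlen(pieces[i]);
        memcpy(end, pieces[i], len);
        end += len;
    }
    *end = '\0';
}
```

Both produce the same string. The difference only shows up as the number of pieces grows, which is why it is easy to miss in a language where concatenation is a single operator.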
Edit: Some clarification based on comments.
Knowing C is not a prerequisite for such knowledge, there are many ways to acquire the same knowledge.
Knowing C is also not a guarantee of these skills. You may be proficient in C and yet still write horrible, grotty, kludgy code in every other language you touch.
C is a low-level language, yet it still has modern control structures and functions so you aren't always getting caught up in the fiddly details. It's very difficult to become proficient at C without gaining a mastery of certain fundamentals (such as the details of memory management and pointers), mastery of which often pays rich dividends when working in any language.
It's always about the fundamentals.
This is true in many pursuits as well as software engineering. It is not secret incantations that make the best programmers the best, rather it is a greater mastery of the fundamentals. Experience has shown that knowledge of C tends to have a higher correlation to mastery of certain of those fundamentals, and that learning C tends to be one of the easier and more common routes to acquiring such knowledge.
It's a mistake to assume that learning C will somehow automatically give you a better understanding of low-level programming concerns. In a lot of cases even C is too high level to give you a good understanding of efficiency concerns.
A classic is i++ versus ++i. It's over-cited, so perhaps most people know the performance implications of these two operations. But learning C wouldn't magically teach you this by itself.
I guess I understand arguments about strings. When string operations are made deceptively simple, people often use them in inefficient ways. But again, knowing that strncat exists doesn't give you a full appreciation for the efficiency concerns. A lot of C programmers probably haven't even thought about the fact that strncat has to do a strlen operation internally.
Even using C, it's important to understand what's going on behind the scenes if efficiency is a concern. People who know C tend to view things in a progression. Assembly and machine code are the building blocks of C, while C is a building block of higher level languages.
This isn't strictly true, but it's obvious that C is "closer to the metal" than many higher level languages. This has at least two effects: efficiency concerns aren't as hidden behind implicit behavior, and it's easier to screw up.
So you want a specific example of how knowing C gives you an advantage. I don't think there is one. I think what people mean when they say this is that knowing what's going on behind the scenes in whatever language you happen to be writing in helps you make more intelligent decisions about how to write code. However, it's a mistake to assume that C is "what's going on behind the scenes" in Java, for instance.
It's hard to quantify exactly, but having an understanding of C will give you more insight into how higher-level language constructs are implemented, and as a consequence you'll be better able to use the constructs in an intelligent manner.
To give you a specific reason: having to write my own garbage collection routines has helped me write better code.
I don't think I have ever found a problem that I haven't been able to solve with a higher-level language, but having started by learning C, it has instilled in me quite a number of excellent development practices. Knowing how the rudimentary parts of the flow of an application work will enable you to look at your own code and get a good visual of how the data flows, and where it is stored. This then leads to a better understanding of how to track down leaking memory, slow disk reads, poorly constructed caches, etc.
Keeping track of Pointers... that's another one that comes to mind.
Classic examples are things involving lower level memory management, such as the implementation of a linked list class:
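struct Node
{
    Data *data;          // payload; Data is whatever type the list stores
    struct Node *next;   // next element, or NULL at the end of the list
};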
Understanding how the pointers are used to iterate the list, and what they signify in terms of the machine architecture will allow you to better understand your high level code.
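To make the iteration concrete, here is a minimal sketch in plain C. The names node, list_length and list_sum are illustrative assumptions, and the payload is simplified to an int rather than the Data pointer above:

```c
#include <stddef.h>

/* Simplified node: an int payload instead of a Data pointer. */
struct node {
    int value;
    struct node *next;   /* NULL marks the end of the list */
};

/* Walks the chain by following next until it hits NULL. */
size_t list_length(const struct node *head)
{
    size_t count = 0;
    for (const struct node *p = head; p != NULL; p = p->next)
        count++;
    return count;
}

/* Same traversal, accumulating the payloads along the way. */
int list_sum(const struct node *head)
{
    int total = 0;
    for (const struct node *p = head; p != NULL; p = p->next)
        total += p->value;
    return total;
}
```

Each step of the loop is one pointer load, which is roughly what an iterator over a linked list does in a higher-level language too.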
Another example which Joel was referring to was the implementation of string concatenation, and the right way to create a string from a set of data.
// this is inefficient: every strcat call rescans str from the start
for (int i = 0; i < n; i++)
{
    strcat(str, data(i));
}
// this could be too, but you'd need to look at the implementation to be sure
std::string str;
for (int i = 0; i < n; i++)
{
    str += data(i);
}
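A minimal sketch of the difference in plain C, assuming the caller passes a destination buffer that is already large enough. The names join_rescan and join_tracked are illustrative, not library functions. The first re-walks the destination on every strcat call, the second remembers where the string ends:

```c
#include <stddef.h>
#include <string.h>

/* Shlemiel's way: strcat() searches dst for its terminator on every
   call, so joining n pieces costs O(n^2) character visits overall. */
void join_rescan(char *dst, const char *const *parts, size_t n)
{
    dst[0] = '\0';
    for (size_t i = 0; i < n; i++)
        strcat(dst, parts[i]);
}

/* Linear alternative: keep a pointer to the end and append there. */
void join_tracked(char *dst, const char *const *parts, size_t n)
{
    char *end = dst;
    for (size_t i = 0; i < n; i++) {
        size_t len = strlen(parts[i]);
        memcpy(end, parts[i], len);
        end += len;
    }
    *end = '\0';   /* terminate once, after the last piece */
}
```

Both produce the same string. Only the amount of rescanning differs, which is the kind of cost that is invisible behind a += operator.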
Knowing C helps you to write better code in C. I guess that Joel Spolsky's example is of little use in C++ or Objective-C, where specific classes for manipulating strings exist and have been crafted with performance in mind. Moreover, using C tricks in other languages may be counterproductive.
Nevertheless, C knowledge is very helpful for understanding general concepts in other languages and what is under the hood in many situations.
As someone who knows a little C, but loves to write code in perl and other high-level languages, I have never once come across a problem that I was able to solve by writing C.
I am looking for examples of real-world situations where knowing C would be useful while writing a project in a high-level/dynamic language like perl or python.
It's easy to start writing high level code and then wonder why it's running slowly. The truth is there are many ways to write perl or python code, and some are better (as in more efficient) than the others. If you know the low level details of how your code is executed in perl or python (both of which are written in C) you can code around several inefficiencies, like knowing which looping construct is faster, how memory is retained/released, etc.
Also, when writing a project in perl or python you sometimes hit a performance wall. The creators of the language (Guido, at least) advocate that you implement that part in C, as a language extension. To do that, well, you'll have to know C.
So, there.
For the purposes of argument, suppose you wanted to concatenate the string representations of all the integers from 1 to n (e.g. n = 5 would produce the string "12345"). Here's how one might do that naïvely in, say, Java.
String result = "";
for (int i = 1; i <= n; i++) {
    result = result + Integer.toString(i);
}
If you were to rewrite that code segment (which is quite good-looking in Java) in C as literally as possible, you would get something to make most C programmers cringe in fear:
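char *result = malloc(1);
*result = '\0';
for (int i = 1; i <= n; i++) {
    char *intStr = malloc(11);          /* up to 10 digits plus the terminator */
    snprintf(intStr, 11, "%d", i);      /* standard stand-in for the non-standard itoa */
    char *tempStr = malloc(/* some large size */);
    strcpy(tempStr, result);            /* recopies everything built so far */
    strcat(tempStr, intStr);
    free(result);
    free(intStr);
    result = tempStr;
}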
Because strings in Java are immutable, Integer.toString creates a temporary string and string concatenation creates a new string instance instead of altering the old one. That's not easy to see from just looking at the Java code. Knowing how said code translates into C is one way of learning exactly how inefficient said code is.
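For contrast, a C programmer would more likely append into a single growing buffer, which is roughly the idea behind a string builder in higher-level languages. This is only a sketch under stated assumptions: concat_ints is a hypothetical name, int is assumed to be at most 32 bits wide, and the caller is responsible for freeing the result:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Builds "123...n" in one growing buffer. Returns NULL on allocation
   failure; the caller frees the result. */
char *concat_ints(int n)
{
    size_t cap = 16, len = 0;
    char *buf = malloc(cap);
    if (buf == NULL)
        return NULL;
    buf[0] = '\0';

    for (int i = 1; i <= n; i++) {
        char digits[12];   /* assumes int is at most 32 bits wide */
        int w = snprintf(digits, sizeof digits, "%d", i);
        assert(w > 0 && (size_t)w < sizeof digits);

        if (len + (size_t)w + 1 > cap) {
            while (len + (size_t)w + 1 > cap)
                cap *= 2;   /* grow geometrically, not per append */
            char *bigger = realloc(buf, cap);
            if (bigger == NULL) {
                free(buf);
                return NULL;
            }
            buf = bigger;
        }
        memcpy(buf + len, digits, (size_t)w + 1);   /* includes the terminator */
        len += (size_t)w;
    }
    return buf;
}
```

Because the write position is tracked in len, each number is copied once and nothing already in the buffer is rescanned; the occasional realloc is amortized by the doubling.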
Do you use arrays much? And do you come across situations where you need items to be stored in memory without knowing how many of them there will be (e.g. based on a query from the database)? Then I suppose C would teach you great things like stacks, structs and linked lists, which might help you. Regards, Andy
Knowing C is really not worth much. Many of us who know C deeply like to think that all that deep insight is valuable and important.
Some of us who know C can't think of a single specific feature of C that's helpful to know about.
Knowing how pointers work in C (especially with C's syntax) isn't all that helpful. In a high-level language your statements create objects and manage their interaction. Pointers and references are -- perhaps -- interesting from a hypothetical point of view. But the knowledge has no practical impact on how you use Java or Python.
The higher-level languages are the way they are. Knowing how doesn't change those languages; it doesn't change how you use them, debug or test them.
Knowing how to create or manipulate a linked list has no earthly impact on Python list class definition. None.
Knowing the difference between Linked List and Array List might help you write a Java program. But the C implementation doesn't help you choose between Linked List and Array List. The decision is independent of knowing C.
A bad algorithm is bad in every language. Knowing inner mysteries of C doesn't make a bad algorithm any less bad. Knowing C doesn't help you know the Java collections or the Python built-in types.
I can't see any value in learning C. Learning Fortran is just as valuable.
Technically, all of the deficiencies of C would force you to code around them, making you write more code and so making you more experienced in general. Lacking any portable integer bigger than 32 bits, for example, C has, in the past, made me write my own bignum library.
The lack of implicit memory, resource and error management (garbage collection, RAII, automatically-called constructors/destructors, maybe exceptions) forces C users to write a lot of initialization, error-handling and cleanup code. It may just be me, but I'm never tired of writing such code. I go and read the documentation of every external function I call, return to my code and check for every return value and other failure-indicative stuff. It even makes me feel safe!
This last point is probably the biggest one to be made in favor of the argument. You can only write so many malloc()/free() pairs before you start to analyze the lifetime of every single variable you come across in every single language! C++'s automatic-storage objects don't help this disorder, either.
Writing truly portable C code often requires the programmer to be free of a lot of assumptions about the host system: think sizeof(), CHAR_BIT, unsigned long, UINT_MAX. While this hasn't helped me write better code in other languages, it has helped me think about possible alternate implementations: how a tiny microprocessor could still run my C code, generating a gazillion RISC instructions for my simple one-line statement. (That is another thing; not many other languages map to and from a given assembly language so easily in my head. Then again, that may just be me.)
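As a small illustration of what "free of assumptions" can look like, here is a sketch that derives the width of unsigned int from the implementation's own limits instead of hard-coding 8-bit bytes or 32-bit integers. The function names are made up for this example:

```c
#include <limits.h>
#include <stddef.h>

/* Counts the value bits of unsigned int by shifting UINT_MAX to zero. */
unsigned value_bits_unsigned(void)
{
    unsigned bits = 0;
    for (unsigned v = UINT_MAX; v != 0; v >>= 1)
        bits++;
    return bits;
}

/* Storage size in bits, using CHAR_BIT rather than a literal 8. */
size_t storage_bits_unsigned(void)
{
    return sizeof(unsigned) * CHAR_BIT;
}
```

On most desktop systems both return 32, but the C standard only promises at least 16 value bits and at least 8 bits per char, so code that needs the real figure has to ask for it.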
Of course, none of these arguments apply only to C. @S.Lott has a valid point: Fortran might be an equally good alternative. But there is so much C code around! A whole personal computer system from top to bottom (applications to libraries to drivers to kernel) is available in source code in C. It would be such a waste if you could not read it.
I think it is worth knowing some low-level language, and there are pragmatic reasons to choose C:
It's low-level, close to assembler
It's widespread
Understanding the whole stack is valuable. Sometimes you need to debug something's guts. Sometimes you cannot fix a performance problem without low-level knowledge (this is often not the case, e.g., when the performance problem is purely algorithmic, but sometimes it is).
Why is C widely considered the quintessential "bottom of the stack", and not some other language(s)? I think this is because C is a low-level programming language, and C won. It has been a while now, but C was not always as dominant. To take just one famous example, the proponents of Common Lisp (which had its own ways of writing low-level code) were hoping their language would be popular, too, and eventually lost.
The following are usually implemented in C:
operating systems (Unix variants, Windows, many embedded operating systems)
higher-level programming languages (many popular implementations of Java, Python, etc)
(obviously) reams of popular open source projects
I'm not a hardware person, but I gather that C has influenced CPU design heavily, too.
So if you believe in understanding the whole stack, learning C is, from a pragmatic perspective, the best choice.
As a caveat, I think it's worth learning assembler, as well. Although C is close to the metal, I didn't fully understand C until I had to do some assembler. It is occasionally helpful to understand how function calls are actually performed, how for loops are implemented, etc. Less important, but also useful, is having to (at least once) deal with a system without virtual memory. When using C on Windows, Unix, and certain other operating systems, even humble malloc does a lot of work under the covers that is easier to appreciate, debug and/or tune if you've ever had to deal with manually locking and unlocking memory regions (not that I would recommend doing so on a regular basis!).
I see it like this: everything boils down to C at the cross-platform level, and to assembly at the platform-specific level. So it's like being a cross-country rally racer, where C is basic automotive mechanics. You can be a great driver, but when you get into trouble, knowing C means you can probably get yourself back in the race; if not, you're stuck calling the mechanics. And assembly is what the mechanics and manufacturers know: it's a worthy investment if that's what you want to do, otherwise you can just trust the mechanics.
For specifics, think about memory management, hardware drivers, physics engines, high-performance 3D graphics, TCP stacks, binary protocols, embedded software, and creating high-level languages like Perl.
You cannot write an OS kernel in Perl; C would be a much better choice for that, because it is low-level enough to express everything the kernel should do, and portable enough to let you port your kernel to different architectures.
Knowing C is not a requirement for being able to effectively use higher-level languages, but it certainly can help one's general understanding of how computers and software work. I think it's similar to the assertion that knowing some assembly language or computer architecture/hardware logic (and/or/nand gates, etc.) can help a C programmer be a better programmer.
Sometimes in order to solve a problem it helps to know how things are working 'underneath' what you're doing.
I don't think this means a programmer must know C in order to be a good programmer, but I think that knowing C can be helpful to almost any programmer.
Not knowing Perl well, I am wondering if it is now possible to distribute processor load across more than one physical core with several threads created in a single Perl program, without spawning additional processes.
I don't think there can be any specific example.
What learning C does for you is give you an insight, a broadening of the mind, into how computers (and software) work. It's a very abstract thing.
It doesn't make you write better code in python, it just makes you more of a computer scientist.
The reference that Wedge made to Joel's article mentioning Shlemiel the painter is an interesting one but has no relevance here. That algorithm is not tied to C in any particular way (although it manifests itself in null-terminated strings).
Python's strings are immutable anyway, and completely different from C's model of strings, so I don't quite see the relationship.
I suppose one concrete example is optimizing a parser or a lexer or a program that keeps writing to a string buffer all the time. If you use normal strings instead of a string buffer, you'll run across a problem when you build very large strings.
Consider that:
a = a + b
makes a copy of both a and b. It doesn't change the string that was referenced by a, it creates a new string, allocating more memory, etc.
If a becomes considerably large, and you keep adding small things to it, then Shlemiel the painter will manifest himself.
But then again, knowing this has nothing to do with knowing C, just knowing how your language implements things at the low level. (This is where having experience with C will help you.)
In Python, say you have a function
def foo(l=[]):
    l.append("bar")
    return l
Running foo() four times, you'd get a really interesting result: the fourth call returns ["bar", "bar", "bar", "bar"].
The default value is evaluated only once, when the function is defined, so it behaves like a static variable that is never reset, and unexpected results happen. (This is documented behavior in every version of Python rather than a defect of one release.)
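For readers coming from C, a rough analogy (not how CPython is actually implemented) is a function-local static: it is initialized once and keeps its state between calls. The name calls_so_far is hypothetical, and a counter stands in for the growing list:

```c
/* The static is initialized once, before the first call, and is never
   reset, much like a default argument object that is created once. */
int calls_so_far(void)
{
    static int count = 0;
    count++;
    return count;
}
```

Calling it four times yields 1, 2, 3, 4, mirroring the list that keeps growing by one "bar" per call.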
Perhaps my example was contrived (a friend of mine who actually likes Python ran into this peculiar behavior), but the fact of the matter is all of these languages are implemented in C or C++. Not knowing and not understanding concepts that are fundamental to the base language means that you won't have an in-depth understanding of languages that are built on top of that.
I find all the "why bother with C/C++/ASM" questions silly. If you're inclined enough to learn a language, that means that you're curious enough to get into it in the first place. Why stop just before C?
Knowing C is great because it does nothing behind your back (GC, bounds checking, etc.). It only does exactly what you tell it to. Nothing is implied. Even C++ does things you don't tell it to with RAII (of course, it is implied that the object is destructed when it goes out of scope, but you don't actually write that). C is a great way to learn what goes on 'under the hood' of the computer, without having to write assembly.
Inefficient code (e.g. loops of string +=) is typically inefficient in any language. What difference does it make if someone explains why it is inefficient in one language or the other? Knowing C but not realizing that a method is inefficient is no different from knowing Python and not realizing the same.