Printing a string using a pointer in C

I am trying to print a string read into the program from the user.
I need to be able to print the string by using a pointer.
Here is a simplified version of my code.
#include <stdio.h>
int main(void)
{
char string[100];
printf("Enter a string: ");
scanf("%s", string);
char *pstring = &string;
printf("The value of the pointer is: %s", *pstring);
return 0;
}
But I am getting a segmentation fault.
Can someone please explain why this is happening?

You don't usually take the address of an array: &string has type char (*)[100], not char *. Use either &string[0] or just string. For %s, printf() expects a char pointer, not a char (which is what *pstring is). It's also a good idea to add a newline to the printf() format string, since scanf("%s") stops at the newline and does not store it. The %99s width below keeps scanf() from writing past the end of the array:
#include <stdio.h>
int main(void) {
char string[100];
printf("Enter a string: ");
scanf("%99s", string);
char *pstring = string;
printf("The value of the pointer is: %s\n", pstring);
return 0;
}

string can be used as a pointer: in most expressions it converts to a constant pointer to its first element.
It is worth understanding what arrays are (compared to pointers).
And specifically the difference between those:
char *str1=malloc(6);
strcpy(str1, "hello");
char str2[6];
strcpy(str2, "hello");
In this example, str1 and str2 can both be used wherever a pointer to char is expected.
But str1 is a variable (like "int x": x's value is stored in memory and you can modify it). Whereas the address that str2 converts to is not. It is fixed, like 12. str2 looks like a pointer variable, because it is an identifier (str2), but it is not one. Once compiled, there is no place in memory that stores a pointer for str2. The compiler decides where the 6 bytes live, and the generated machine code uses that address directly. Much like for "12".
So str2=malloc(6) for example, or str2=whatever would make no sense, and does not compile. No more than typing 12=6 or 12=whatever. You cannot assign to an array.
Unlike &12, &str2 is valid C, but it is not a char **. It is the address of the whole array, of type char (*)[6]: the same location as &str2[0], with a different type. That is why char *pstring = &string draws a compiler warning.
Namely, str2 names an address where the compiler has reserved enough room for 6 bytes. So you can store things in str2[0], str2[1], ..., str2[5], but not in str2 itself (again, that address is fixed; you cannot make str2 refer to other memory, no more than you can store things in 12).
Situation for str1 is different. str1 is a variable, declared as type char *. You can store values of type char * in it (that is, pointers to a char, that is, the memory address where a char is stored). Its initializer gives it an initial value, the address of the 6 bytes returned by malloc, which strcpy then fills with the letters 'h', 'e', 'l', 'l', 'o' and the terminating 0.
So, you can easily change str1, and store whatever pointer you want in str1. Writing str1=str2 for example.
So, long story short: both can be used as pointers. str2 is an array that converts to a fixed pointer to its first element; str1 is a pointer variable. Both give the address of a place in memory where the first, then second, etc., char of the string are stored.
So, in your case, just type
char *pstring = string;
or use string directly.
Or, for aesthetic purpose, to clarify that you mean "address of the first char of string", you may type
char *pstring = &(string[0]);
which is the exact same thing as "string" (in C, the syntax a[b] is just a shortcut for *(a+b), so &(string[0]) is &(*(string+0)), which is &*string, which is string). Another strange consequence of that is that you could as well say
char *pstring = &(0[string]);
Again, that just means &(*(0+string)).
Note that I advise none of those strange spellings. But they help to understand how pointers are just values, which you may store in variables (like with str1) or use directly as constants in the code (like with str2). It is just less obvious than for integers (x, a variable declared as int x=12;, vs 12, which is a constant) because in the case of arrays those fixed addresses are named by the compiler and take the concrete appearance of an identifier in the code.
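To make the a[b] equivalence above concrete, here is a minimal sketch. The helper name index_forms_agree is made up for illustration; it only checks that the three spellings read the same character.

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 when every spelling of "element i of s"
   reads the same character, 0 otherwise. */
int index_forms_agree(const char *s, size_t i)
{
    char by_subscript = s[i];        /* the usual form */
    char by_arithmetic = *(s + i);   /* what s[i] is defined to mean */
    char by_reversed = i[s];         /* legal, but not recommended */

    return by_subscript == by_arithmetic && by_arithmetic == by_reversed;
}
```

For any in-range index, for example index_forms_agree("hello", 1), the function returns 1.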

Related

How can strcmp compare values at addresses without the * operator?

Does strcmp check the value at address even without the * operator? If yes, how are we able to compare normal strings using strcmp?
#include <stdio.h>
#include <string.h>
int main()
{
char *name;
char *str;
char a[] = "Max", b[] = "Max";
name = a; str = b;
printf("Add 1: %p Add 2: %p\n", name, str);
if (!strcmp(*name, *str))
printf("Names Match\n");
return 0;
}
The desired output is not obtained if I compare the values at address using the * operator. However, if I remove the * operator, it works fine.
Documentation is your friend
http://www.cplusplus.com/reference/cstring/strcmp/
int strcmp ( const char * str1, const char * str2 );
str1 - C string to be compared.
str2 - C string to be compared.
And you have
char *name;
char *str;
so
strcmp(name, str)
Is perfectly valid - you pass pointers to function - not dereferenced strings.
Why would you think that strcmp() cannot dereference the pointers?
The way it works is that you pass a pointer to each string, and strcmp() then dereferences those pointers to compare the strings character by character. It returns a negative, zero, or positive value according to the first pair of characters that differ (in principle, their difference).
Dereferencing the pointer does not give the address. I believe you are confusing it with the & (address-of) operator. A pointer stores the virtual memory address of some data; you can then access that memory with the * dereference operator.
In contrast, if you have a variable and you want the virtual memory address where it's stored then you use the & operator.
Since you are already passing a pointer, you are giving strcmp() the address where it needs to start reading characters from to compare.
Also, because some compilers still accept an implicit conversion from integer to pointer (with a warning), a call that passes the character at the beginning of each string, like this, may compile:
strcmp(*name, *str);
The result of such a call is undefined for multiple reasons, one being that the value of *name (the character 'M') is not a valid address, and hence dereferencing it is undefined behavior.
Although this may compile, if you enable compilation warnings you should notice that the compiler complains about it (some newer compilers treat it as an error), because passing an integer where a pointer is required is not valid C.
Regarding what a string is and how it's represented in memory, you can only pass a pointer to a string to any function, because it's the only way to pass arrays to functions, and it might be a pointer even before passing it. In fact, passing the value like you did would not allow strcmp() to know where the string is and therefore it would be impossible to compare more than a single character.
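As a rough illustration of why strcmp() wants pointers, here is a simplified sketch of a strcmp-like function. The name my_strcmp is hypothetical and this is not how any particular C library implements it; it only shows that the dereferencing happens inside the function.

```c
/* Simplified sketch of a strcmp-like function (hypothetical name). */
int my_strcmp(const char *a, const char *b)
{
    /* Advance while both strings agree and neither has ended. */
    while (*a != '\0' && *a == *b) {
        a++;
        b++;
    }
    /* Compare as unsigned char, as the standard strcmp is specified to. */
    return (unsigned char)*a - (unsigned char)*b;
}
```

With the question's variables, my_strcmp(name, str) returns 0 because both point at the characters "Max".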

Pointers and Characters Variables in C

I'm trying to store the memory location of the evil variable in the ptr variable, but instead of the memory location, printing the variable shows the value within the variable and not its location. Both the ptr variable and the &evil syntax print the same result, which is the value of the variable and not its location in memory. Could someone please nudge me in the right direction, and help me determine the syntax needed to store the memory location of a string/char variable in C?
int main()
{
char *ptr;
char evil[4];
memset(evil, 0x43, 4);//fill evil variable with 4 C's
ptr = &evil[0];//set ptr variable equal to evil variable's memory address
printf(ptr);//prints 4 C's
printf(&evil);//prints 4 C's
return 0;
}
That's normal, because the first parameter to printf is the format string, which is basically a char* pointing to a string. Your string will be printed as-is because it does not contain any format specifiers.
If you want to display a pointer's value as an address, use %p as a format specifier, and your pointer as subsequent parameter:
printf("%p", ptr);
However, note that your code invokes Undefined Behavior (UB) because your string is not null-terminated. Chances are it appeared to work only by luck, because a zero byte happened to follow the array.
Also notice that, to be correct, the code should cast the pointer to void* when passing it to printf, because different pointer types may have different representations on certain platforms, and the C standard requires the argument for %p to be a pointer to void. See this thread for more details: printf("%p") and casting to (void *)
printf("%p", (void*)ptr); // <-- C standard requires this cast, although it might work without it on most compilers and platforms.
You need #include <stdio.h> for printf, and #include <string.h> for memset.
int main()
You can get away with this, but int main(void) is better.
{
char *ptr;
char evil[4];
memset(evil, 0x43, 4);//fill evil variable with 4 C's
This would be more legible if you replaced 0x43 by 'C'. They both mean the same thing (assuming an ASCII-based character set). Even better, don't repeat the size:
memset(evil, 'C', sizeof evil);
ptr = &evil[0];//set ptr variable equal to evil variable's memory address
This sets ptr to the address of the initial (0th) element of the array object evil. This is not the same as the address of the array object itself. They're both the same location in memory, but &evil[0] and &evil are of different types.
You could also write this as:
ptr = evil;
You can't assign arrays, but an array expression is, in most contexts, implicitly converted to a pointer to the array's initial element.
The relationship between arrays and pointers in C can be confusing.
Rule 1: Arrays are not pointers.
Rule 2: Read section 6 of the comp.lang.c FAQ.
printf(ptr);//prints 4 C's
The first argument to printf should (almost) always be a string literal, the format string. If you give it the name of a char array variable, and it happens to contain % characters, Bad Things Can Happen. If you're just printing a string, you can use "%s" as the format string:
printf("%s\n", ptr);
(Note that I've added a newline so the output is displayed properly.)
Except that ptr doesn't point to a string. A string, by definition, is terminated by a null ('\0') character. Your evil array isn't. (It's possible that there just happens to be a null byte just after the array in memory. Do not depend on that.)
You can use a precision (the number after the '.') to limit how many characters are printed:
printf("%.4s\n", ptr);
Or, to avoid the error-prone practice of having to write the same number multiple times:
printf("%.*s\n", (int)sizeof evil, evil);
Find a good document for printf if you want to understand that.
(Or, depending on what you're doing, maybe you should arrange for evil to be null-terminated in the first place.)
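One way to arrange for the array to be null-terminated is sketched below. The helper name fill_terminated is an assumption made for this example; it reserves the last byte for the terminator, so a 5-byte array is needed to hold four C's.

```c
#include <string.h>

/* Hypothetical helper: fills buf with (size - 1) copies of c and then
   terminates it, so buf holds a valid string. Does nothing if size is 0. */
void fill_terminated(char *buf, size_t size, char c)
{
    if (size == 0)
        return;
    memset(buf, c, size - 1);
    buf[size - 1] = '\0';
}
```

After char evil[5]; fill_terminated(evil, sizeof evil, 'C'); the array holds the string "CCCC" and printf("%s\n", evil) is well defined.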
printf(&evil);//prints 4 C's
Ah, now we have some serious undefined behavior. The first argument to printf is a pointer to a format string; it's of type const char*. &evil is of type char (*)[4], a pointer to an array of 4 char elements. Your compiler should have warned you about that (the format string has a known type; the following arguments do not, so getting their types correct is up to you). If it seems to work, it's because &evil points to the same memory location as &evil[0], and different pointer types probably have the same representation on your systems, and perhaps there happens to be a stray '\0' just after the array -- perhaps preceded by some non-printable characters that you're not seeing.
If you want to print the address of your array object, use the %p format. It requires an argument of the pointer type void*, so you'll need to cast it:
printf("%p\n", (void*)&evil);
return 0;
}
Putting this all together and adding some bells and whistles:
#include <stdio.h>
#include <string.h>
int main(void)
{
char *ptr;
char evil[4];
memset(evil, 'C', sizeof evil);
ptr = &evil[0];
printf("ptr points to the character sequence \"%.*s\"\n",
(int)sizeof evil, evil);
printf("The address of evil[0] is %p\n", (void*)ptr);
printf("The address of evil is also %p\n", (void*)&evil);
return 0;
}
The output on my system is:
ptr points to the character sequence "CCCC"
The address of evil[0] is 0x7ffc060dc650
The address of evil is also 0x7ffc060dc650
Using
printf(ptr);//prints 4 C's
causes undefined behavior since the first argument to printf needs to be a null terminated string. In your case it is not.
Could someone please nudge me in the right direction
To print an address, you need to use the %p format specifier in the call to printf.
printf("%p\n", (void *)ptr);
or
printf("%p\n", (void *)&evil);
or
printf("%p\n", (void *)&evil[0]);
See printf documentation for all the ways it can be used.
If you want to see the address of a char, without all the array / pointer / decay oddness:
#include <stdio.h>
int main(void)
{
char *ptr;
char evil;
evil = 4;
ptr = &evil;
printf("%p\n",(void*)ptr);
printf("%p\n",(void*)&evil);
return 0;
}

Why the char has to be a pointer instead of a type of char?

#include <stdio.h>
typedef struct {
char * name;
int age;
} person;
int main() {
person john;
/* testing code */
john.name = "John";
john.age = 27;
printf("%s is %d years old.", john.name, john.age);
}
This code works well; I just have a small question.
In the struct, after I delete the * before name, the code no longer works, but whatever age's type is, int or a pointer, it always works fine. So can anyone tell me why name has to be a pointer rather than just a char?
char type is short for character and can hold one character. C has no string type, instead a string in C is an array of char terminated with '\0' - the null character (null terminated strings).
Thus to use a string you need a pointer to memory that contains lots of characters. So why does it work for an int with or without the *. Well we can either have the age as an int or we can have a pointer to memory that stores the age. Either works well. But we can't store a string in one character.
This has to do with the format specifiers you've used in the printf call. %s outputs a string (it reads memory starting at the address it is given), while %d interprets whatever it gets as an integer, so even a pointer sort of works. However, you shouldn't do that; it's undefined behavior.
I suggest you to read some good books on C to get a good grasp on such things, a good list is here The Definitive C Book Guide and List
but no matter the age's type is int or a pointer, it always works fine.
That's undefined behaviour.
To elaborate, a double-quote delimited string (as seen above) is a string literal, and when used on the right-hand side of an assignment like this, it basically gives you a pointer to the start of the literal, so it needs a pointer variable to be stored in. So, name has to be a pointer.
OTOH, 27 is an integer literal (integer constant) and it needs to be stored in an int variable, not an int *. If you use 27 to initialize an int * and use that, it works (or rather, seems to work) only by accident: using it later invokes undefined behavior, by attempting to access an invalid memory location.
FWIW, if you try something like
typedef struct {
char * name;
int *age;
} person;
and then
john.age = 27; //incompatible assignment
compiler will warn you about wrong conversion from integer to pointer.
char *name: name is a pointer to type char. Now, when you make it to point to "John", the compiler stores the John\0 i.e., 5 chars to some memory and returns you the starting address of that memory. So, when you try to read using %s (string format specifier), the name variable returns you the whole string reading till \0.
char name : Here name is just one char having 1 byte of memory. So, you can't store anything more than one char. Also, when you would try to read, you should always read just one char (%c) because trying to read more than that will take you to the memory region which is not assigned to you and hence, will invoke Undefined Behavior.
int age : age typically occupies 4 bytes (the exact size of int is implementation-defined), so you can store an integer in this memory and read it back as well, printf("%d", age);
int *age : age is a pointer to type int and it stores the address of some memory. Unlike strings, you do not read integers using the address (loosely speaking, just for the sake of avoiding complexity). You have to dereference it. So first, you need to make age point at valid storage: allocate some memory (or take the address of an existing int) and assign that address to age. Writing *age = 27 without doing so does not make the compiler find memory for you; it writes through an uninitialized pointer, which is undefined behavior. Once age points at valid storage, *age = 27 stores the value there and it can be read back with *age, like printf("%d", *age);
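A minimal sketch of the int *age case follows. The type person_ref and the function set_age are made-up names for illustration; the point is only that age must hold a valid address before *age is written.

```c
/* Hypothetical variant of the struct with age as a pointer. */
typedef struct {
    const char *name;
    int *age;
} person_ref;

/* Points p->age at caller-owned storage, then writes through it. */
void set_age(person_ref *p, int *storage, int value)
{
    p->age = storage;   /* age now holds a valid address */
    *p->age = value;    /* safe: this writes into *storage */
}
```

Given int years; and person_ref john = { "John", 0 };, calling set_age(&john, &years, 27) leaves both years and *john.age equal to 27.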

Why does reading into a string buffer with scanf work both with and without the ampersand (&)?

I'm a little bit confused about something. I was under the impression that the correct way of reading a C string with scanf() went along the lines of
(never mind the possible buffer overflow, it's just a simple example)
char string[256];
scanf( "%s" , string );
However, the following seems to work too,
scanf( "%s" , &string );
Is this just my compiler (gcc), pure luck, or something else?
An array "decays" into a pointer to its first element, so scanf("%s", string) is equivalent to scanf("%s", &string[0]). On the other hand, scanf("%s", &string) passes a pointer-to-char[256], but it points to the same place.
Then scanf, when processing the tail of its argument list, will try to pull out a char *. That's the Right Thing when you've passed in string or &string[0], but when you've passed in &string you're depending on something that the language standard doesn't guarantee, namely that the pointers &string and &string[0] -- pointers to objects of different types and sizes that start at the same place -- are represented the same way.
I don't believe I've ever encountered a system on which that doesn't work, and in practice you're probably safe. None the less, it's wrong, and it could fail on some platforms. (Hypothetical example: a "debugging" implementation that includes type information with every pointer. I think the C implementation on the Symbolics "Lisp Machines" did something like this.)
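The "same place, different type" point can be checked with a small sketch. The function names below are invented for this example; the sizes assume the 256-element array from the question.

```c
#include <stddef.h>

/* Returns 1 when &string and &string[0] refer to the same location. */
int same_location(void)
{
    char string[256];
    return (void *)&string == (void *)&string[0];
}

/* Size of the object that &string points at: the whole array. */
size_t pointee_size_of_array_pointer(void)
{
    char string[256];
    return sizeof *(&string);      /* char[256] */
}

/* Size of the object that &string[0] points at: one element. */
size_t pointee_size_of_element_pointer(void)
{
    char string[256];
    return sizeof *(&string[0]);   /* char */
}
```

The addresses compare equal, but one pointer refers to a 256-byte object and the other to a 1-byte object, which is why the two are not interchangeable as far as the standard is concerned.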
I think that this below is accurate and it may help.
Feel free to correct it if you find any errors. I'm new at C.
char str[]
array of values of type char, with its own address in memory
as many consecutive addresses as elements in the array
including the terminating null character '\0'
&str, &str[0] and str all refer to the same location in memory, which is the address of the first element of the array str (although &str has a different type, pointer to the whole array)
char *strPtr = &str[0]; //declaration and initialization
alternatively, you can split this in two:
char *strPtr; strPtr = &str[0];
strPtr is a pointer to a char
strPtr points at array str
strPtr is a variable with its own address in memory
strPtr is a variable that stores value of address &str[0]
strPtr own address in memory is different from the memory address that it stores (address of array in memory a.k.a &str[0])
&strPtr represents the address of strPtr itself
I think that you could declare a pointer to a pointer as:
char **vPtr = &strPtr;
declares and initializes with address of strPtr pointer
Alternatively you could split in two:
char **vPtr;
vPtr = &strPtr;
vPtr points at the strPtr pointer
vPtr is a variable with its own address in memory
vPtr is a variable that stores the address &strPtr
*vPtr is the strPtr pointer itself
final comment: you can not do str++, str address is a const, but
you can do strPtr++
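A short sketch of what a pointer to a pointer is useful for, under the same naming as the notes above (the function advance is an assumption for this example): it lets a function change the caller's pointer.

```c
/* Hypothetical helper: moves the caller's pointer forward by one char. */
void advance(char **pp)
{
    (*pp)++;   /* modifies the pointer that pp points at */
}
```

With char str[] = "abc"; char *strPtr = str; char **vPtr = &strPtr;, calling advance(vPtr) makes strPtr point at str[1], so **vPtr is 'b'. The array str itself cannot be advanced this way.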

C strings confusion

I'm learning C right now and got a bit confused with character arrays - strings.
char name[15]="Fortran";
No problem with this - its an array that can hold (up to?) 15 chars
char name[]="Fortran";
C counts the number of characters for me so I don't have to - neat!
char* name;
Okay. What now? All I know is that this can hold an big number of characters that are assigned later (e.g.: via user input), but
Why do they call this a char pointer? I know of pointers as references to variables
Is this an "excuse"? Does this find any other use than in char*?
What is this actually? Is it a pointer? How do you use it correctly?
thanks in advance,
lamas
I think this can be explained this way, since a picture is worth a thousand words...
We'll start off with char name[] = "Fortran", which is an array of chars whose length is known at compile time: 7 to be exact, right? Wrong! It is 8, since every string has to end with a '\0' nul terminating character.
char name[] = "Fortran";
+======+ +-+-+-+-+-+-+-+--+
|0x1234| |F|o|r|t|r|a|n|\0|
+======+ +-+-+-+-+-+-+-+--+
At link time, the compiler and linker gave the symbol name a memory address of 0x1234.
Using the subscript operator, e.g. name[1], the compiler knows how to calculate where in memory the character at that offset is: 0x1234 + 1 = 0x1235, and it is indeed 'o'. That is simple enough. Furthermore, the C standard defines the size of a char as 1 byte, which explains how name[cnt++] works: assuming cnt is an integer with a value of 3, the expression reads the element at offset 3, counting from zero, which is 't', and cnt then steps up by one. This is simple; so far so good.
What happens if name[12] was executed? Well, the code will either crash or you will get garbage (formally, the behavior is undefined), since the array runs from index/offset 0 (0x1234) up to 7 (0x123B). Anything after that does not belong to the name variable; writing there would be called a buffer overflow!
The address of name in memory is 0x1234, as in the example, if you were to do this:
printf("The address of name is %p\n", (void *)&name);
Output would be:
The address of name is 0x00001234
For the sake of brevity and keeping with the example, the memory addresses are 32bit, hence you see the extra 0's. Fair enough? Right, let's move on.
Now on to pointers...
char *name is a pointer to type of char....
Edit:
And we initialize it to NULL as shown. Thanks Dan for pointing out the little error...
char *name = (char*)NULL;
+======+ +======+
|0x5678| -> |0x0000| -> NULL
+======+ +======+
The symbol name has an address fixed at compile/link time (0x5678), but the pointer stored there is NULL: name does not point to anything yet, hence 0x0000.
Now, remember, this is crucial when dealing with pointers of any type: the address of the symbol is known at compile/link time, but the address it points to is not.
Suppose we do this:
name = (char *)malloc((20 * sizeof(char)) + 1);
strcpy(name, "Fortran");
We called malloc to allocate a memory block of 21 bytes: room for up to 20 characters, plus the 1 added on to the size for the '\0' nul terminating character. Suppose at runtime the address given was 0x9876:
char *name;
+======+ +======+ +-+-+-+-+-+-+-+--+
|0x5678| -> |0x9876| -> |F|o|r|t|r|a|n|\0|
+======+ +======+ +-+-+-+-+-+-+-+--+
So when you do this:
printf("The value of name is %p\n", (void *)name);
printf("The address of name is %p\n", (void *)&name);
Output would be:
The value of name is 0x00009876
The address of name is 0x00005678
Now, this is where the illusion that 'arrays and pointers are the same' comes into play.
When we do this:
char ch = name[1];
What happens at runtime is this:
The address of symbol name is looked up
Fetch the memory address of that symbol, i.e. 0x5678.
At that address, contains another address, a pointer address to memory and fetch it, i.e. 0x9876
Get the offset based on the subscript value of 1 and add it onto the pointer address, i.e. 0x9877 to retrieve the value at that memory address, i.e. 'o' and is assigned to ch.
That above is crucial to understanding this distinction, the difference between arrays and pointers is how the runtime fetches the data, with pointers, there is an extra indirection of fetching.
Remember, an array of type T will always decay into a pointer of the first element of type T.
When we do this:
char ch = *(name + 5);
The address of symbol name is looked up
Fetch the memory address of that symbol, i.e. 0x5678.
At that address, contains another address, a pointer address to memory and fetch it, i.e. 0x9876
Get the offset based on the value of 5 and add it onto the pointer address, i.e. 0x987B, to retrieve the value at that memory address, i.e. 'a', which is assigned to ch.
Incidentally, you can also do that to the array of chars also...
Further more, by using subscript operators in the context of an array i.e. char name[] = "..."; and name[subscript_value] is really the same as *(name + subscript_value).
i.e.
name[3] is the same as *(name + 3)
And since the addition inside *(name + subscript_value) is commutative, the operands can be written in the reverse order:
*(subscript_value + name) is the same as *(name + subscript_value)
Hence, this explains why in one of the answers above you can write it like this (despite it, the practice is not recommended even though it is quite legitimate!)
3[name]
Ok, how do I get the value of the pointer?
That is what the * is used for,
Suppose the pointer name has that pointer memory address of 0x9878, again, referring to the above example, this is how it is achieved:
char ch = *name;
This means, obtain the value that is pointed to by the memory address of 0x9878, now ch will have the value of 'r'. This is called dereferencing. We just dereferenced a name pointer to obtain the value and assign it to ch.
Also, the compiler knows that a sizeof(char) is 1, hence you can do pointer increment/decrement operations like this
*name++;
*name--;
The pointer automatically steps up/down as a result by one.
When we do this, assuming the pointer memory address of 0x9878:
char ch = *name++;
What is the value of ch, and what is the address afterwards? Because ++ here is a post-increment, ch is assigned the character at 0x9878, which is 'r', and the pointer then moves on to 0x9879, so *name is now 't'.
This is where you have to be careful also: in the same principle and spirit as what was stated earlier in relation to the memory boundaries in the very first part (see 'What happens if name[12] was executed' above), stepping the pointer outside the allocated block has the same results, i.e. code crashes and burns!
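The read-then-advance behavior of *name++ can be sketched as a small function. The name next_char is hypothetical; it takes the address of the caller's pointer so the increment is visible to the caller.

```c
/* Hypothetical helper: returns the current character, then advances the
   caller's cursor by one (the same effect as *p++ on a local pointer p). */
char next_char(const char **cursor)
{
    return *(*cursor)++;
}
```

Starting with a cursor two characters into "Fortran", next_char returns 'r' and leaves the cursor on 't', matching the 0x9878 to 0x9879 step described above.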
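Now, what happens if we deallocate the block of memory pointed to by name by calling the C function free with name as the parameter, i.e. free(name), and then set name to NULL (free does not change the pointer itself; until you reset it, name still holds the stale address 0x9876):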
+======+ +======+
|0x5678| -> |0x0000| -> NULL
+======+ +======+
Yes, the block of memory is freed up and handed back to the runtime environment for use by another upcoming code execution of malloc.
Now, this is where the common notion of a Segmentation fault comes into play: since name does not point to anything, what happens when we dereference it, i.e.
char ch = *name;
Yes, the code will most likely crash and burn with a 'Segmentation fault'; this is common under Unix/Linux. Under Windows, a dialog box will appear along the lines of 'Unrecoverable error' or 'An error has occurred with the application, do you wish to send the report to Microsoft?'.... If the pointer has not been malloc'd (or otherwise made to point at valid memory), any attempt to dereference it is undefined behavior and will typically crash and burn.
Also, remember this: for every malloc there should be a corresponding free. If there is no corresponding free, you have a memory leak, in which memory is allocated but never freed up.
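The malloc/free pairing can be sketched as two small helpers. Both names (dup_string and release_string) are made up for this example; resetting the pointer to NULL after free is a common defensive habit, not something free does for you.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap copy of src, or NULL on failure.
   The caller owns the result and must free it. */
char *dup_string(const char *src)
{
    size_t len = strlen(src) + 1;      /* +1 for the '\0' */
    char *copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, src, len);
    return copy;
}

/* Frees *p and resets it so it cannot be dereferenced by accident. */
void release_string(char **p)
{
    free(*p);      /* free(NULL) is a no-op */
    *p = NULL;
}
```

A typical use is char *name = dup_string("Fortran"); followed later by release_string(&name);, after which name is NULL.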
And there you have it, that is how pointers work and how arrays are different to pointers, if you are reading a textbook that says they are the same, tear out that page and rip it up! :)
I hope this is of help to you in understanding pointers.
That is a pointer. Which means it is a variable that holds an address in memory. It "points" to another variable.
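It actually cannot - by itself - hold large amounts of characters. By itself, it can hold only one address in memory. If you initialize it with a string literal, the compiler sets aside space for those characters and makes the pointer point to that address. You can do it like this:
char* name = "Mr. Anderson";
That looks much like the following, with one important difference: the array version makes a modifiable copy of the characters, whereas the pointer version points at the literal itself, which must not be modified: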
char name[] = "Mr. Anderson";
The place where character pointers come in handy is dynamic memory. You can assign a string of any length to a char pointer at any time in the program by doing something like this:
char *name;
name = malloc(256*sizeof(char));
strcpy(name, "This is less than 256 characters, so this is fine.");
Alternately, you can assign to it using the strdup() function, like this:
char *name;
name = strdup("This can be as long or short as I want. The function will allocate enough space for the string and assign return a pointer to it. Which then gets assigned to name");
If you use a character pointer this way - and assign memory to it, you have to free the memory contained in name before reassigning it. Like this:
if(name)
free(name);
name = 0;
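The if statement guards against a null pointer, although free(NULL) is defined to do nothing, so the check is optional. What matters is that name is either NULL or a pointer returned by malloc (or strdup) that has not already been freed.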
The reason you see character pointers get used a whole lot in C is because they allow you to reassign the string with a string of a different size. Static character arrays don't do that. They're also easier to pass around.
Also, character pointers are handy because they can be used to point to different statically allocated character arrays. Like this:
char *name;
char joe[] = "joe";
char bob[] = "bob";
name = joe;
printf("%s", name);
name = bob;
printf("%s", name);
This is what often happens when you pass a statically allocated array to a function taking a character pointer. For instance:
char *strcpy(char *str1, const char *str2);
If you then pass that:
char buffer[256];
strcpy(buffer, "This is a string, less than 256 characters.");
It will manipulate both of those through str1 and str2 which are just pointers that point to where buffer and the string literal are stored in memory.
Something to keep in mind when working in a function: if you have a function that returns a character pointer, don't return a pointer to a local (automatic) character array declared in the function. The array ceases to exist when the function returns and you'll have issues. Repeat, don't do this:
char *myFunc() {
char myBuf[64];
strcpy(myBuf, "hi");
return myBuf;
}
That won't work. You have to use a pointer and allocate memory (as shown earlier) in that case. The allocated memory will then persist, even after the function returns. Just don't forget to free it as previously mentioned.
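Besides allocating, another common option is to let the caller supply the buffer, so nothing local is returned. The sketch below assumes a hypothetical function name, fill_greeting, and uses snprintf to avoid overrunning the buffer.

```c
#include <stdio.h>

/* Hypothetical alternative to myFunc: the caller supplies the buffer.
   Returns the number of characters written, or -1 if the text did not
   fit (buf may then hold a truncated copy). */
int fill_greeting(char *buf, size_t size)
{
    int n = snprintf(buf, size, "%s", "hi");
    if (n < 0 || (size_t)n >= size)
        return -1;
    return n;
}
```

For example, with char myBuf[64]; the call fill_greeting(myBuf, sizeof myBuf) returns 2 and leaves "hi" in myBuf, which stays valid for as long as the caller's array does.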
This ended up a bit more encyclopedic than I'd intended, hope it's helpful.
Edited to remove C++ code. I mix the two so often, I sometimes forget.
char* name is just a pointer. Somewhere along the line memory has to be allocated and the address of that memory stored in name.
It could point to a single byte of memory and be a "true" pointer to a single char.
It could point to a contiguous area of memory which holds a number of characters.
If those characters happen to end with a null terminator, lo and behold, you have a pointer to a string.
char *name, on its own, can't hold any characters. This is important.
char *name just declares that name is a pointer (that is, a variable whose value is an address) that will be used to store the address of one or more characters at some point later in the program. It does not, however, allocate any space in memory to actually hold those characters, nor does it guarantee that name even contains a valid address. In the same way, if you have a declaration like int number there is no way to know what the value of number is until you explicitly set it.
Just like after declaring the value of an integer, you might later set its value (number = 42), after declaring a pointer to char, you might later set its value to be a valid memory address that contains a character -- or sequence of characters -- that you are interested in.
It is confusing indeed. The important thing to understand and distinguish is that char name[] declares array and char* name declares pointer. The two are different animals.
However, array in C can be implicitly converted to pointer to its first element. This gives you ability to perform pointer arithmetic and iterate through array elements (it does not matter elements of what type, char or not). As #which mentioned, you can use both, indexing operator or pointer arithmetic to access array elements. In fact, indexing operator is just a syntactic sugar (another representation of the same expression) for pointer arithmetic.
It is important to distinguish difference between array and pointer to first element of array. It is possible to query size of array declared as char name[15] using sizeof operator:
char name[15] = { 0 };
size_t s = sizeof(name);
assert(s == 15);
but if you apply sizeof to char* name you will get the size of a pointer on your platform (e.g. 4 bytes on a typical 32-bit platform, 8 on most 64-bit ones):
char* name = 0;
size_t s = sizeof(name);
assert(s == 4); // assuming pointer is 4-bytes long on your compiler/machine
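A portable way to see the same difference, without assuming a pointer size, is sketched here. The function names are invented for this example.

```c
#include <stddef.h>

/* sizeof applied to an array gives the size of the whole array. */
size_t array_bytes(void)
{
    char name[15] = { 0 };
    return sizeof name;
}

/* sizeof applied to a pointer gives the size of the pointer only,
   even when it points at the first element of a 15-byte array. */
size_t pointer_bytes(void)
{
    char name[15] = { 0 };
    char *p = name;
    return sizeof p;
}
```

array_bytes() is 15 on every platform, while pointer_bytes() equals sizeof(char *), whatever that happens to be.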
Also, the two forms of definitions of arrays of char elements are equivalent:
char letters1[5] = { 'a', 'b', 'c', 'd', '\0' };
char letters2[5] = "abcd"; /* 5th element implicitly gets value of 0 */
Thanks to the dual nature of arrays, i.e. the implicit conversion of an array to a pointer to its first element in C (and also C++), a pointer can be used as an iterator to walk through array elements:
/* skip to 'd' letter */
char* it = letters1;
for (int i = 0; i < 3; i++)
it++;
In C a string is actually just an array of characters, as you can see by the definition. However, superficially, any array is just a pointer to its first element, see below for the subtle intricacies. There is no range checking in C, the range you supply in the variable declaration has only meaning for the memory allocation for the variable.
a[x] is the same as *(a + x), i.e. dereference of the pointer a incremented by x.
if you used the following:
char foo[] = "foobar";
char bar = *foo;
bar will be set to 'f'
To stave off confusion and avoid misleading people, some extra words on the more intricate differences between pointers and arrays, thanks avakar:
In some cases a pointer is actually semantically different from an array, a (non-exhaustive) list of examples:
//sizeof
sizeof(char*) != sizeof(char[10])
//lvalues
char foo[] = "foobar";
char bar[] = "baz";
char* p;
foo = bar; // compile error, an array is not a modifiable lvalue
p = bar; //just fine p now points to the array contents of bar
// multidimensional arrays
int baz[2][2];
int* q = baz; // error or warning: baz decays to int (*)[2], which is not compatible with int*
int* r = baz[0]; //just fine, r now points to the first element of the first "row" of baz
int x = baz[1][1];
int y = r[1][1]; // compile error, r[1] is an int, so it cannot be subscripted again
int z = r[1]; //just fine, z now holds the second element of the first "row" of baz
And finally a fun bit of trivia; since a[x] is equivalent to *(a + x) you can actually use e.g. '3[a]' to access the fourth element of array a. I.e. the following is perfectly legal code, and will print 'b' the fourth character of string foo.
#include <stdio.h>
int main(int argc, char** argv) {
char foo[] = "foobar";
printf("%c\n", 3[foo]);
return 0;
}
One is an actual array object and the other is a reference or pointer to such an array object.
The thing that can be confusing is that both have the address of the first character in them, but only because one address is the first character and the other address is a word in memory that contains the address of the character.
The difference can be seen in the value of &name. In the first two cases it is the same value as just name, but in the third case it is a different type called pointer to pointer to char, or char **, and it is the address of the pointer itself. That is, it is a double-indirect pointer.
#include <stdio.h>
char name1[] = "fortran";
char *name2 = "fortran";
int main(void) {
printf("%lx\n%lx %s\n", (long)name1, (long)&name1, name1);
printf("%lx\n%lx %s\n", (long)name2, (long)&name2, name2);
return 0;
}
Ross-Harveys-MacBook-Pro:so ross$ ./a.out
100001068
100001068 fortran
100000f58
100001070 fortran
