global char[] variable, how to declare it in other files? - c

I defined a global variable char buf[1024] in one file, and what's the correct may to declare it in other files? extern char buf[1024], extern char buf[], or extern char *buf? I found extern char buf[] works and extern char *buf doesn't, but would like to know more explanations.

You can use
extern char buf[];
but NOT
extern char *buf;
Because arrays are not pointers.
Reference: C FAQ

extern char buf[] and extern char buf[1024] both are ok.
In some case, the array is implemented by pointer, such as transfering arguments between two functions.

When you are making a variable as extern , you are indicating to the compiler that the symbol (address) of the variable would be found in another .o file - ( This is done during linking stage ) .
So while you are making a variable as extern , you just need to mention the name as it will give the information about address and size is not required

This is the old problem of arrays and pointers being interchangeable. Arrays and pointers are not interchangeable: they just happen to be most of the time, because most of the time you use an array name in an expression, where it decays into a pointer.
This specific case of defining as char array in one file and declaring as char pointer in the other file is thoroughly explained in Expert C Programming - Deep C Secrets; see chapter 4.
The declaration of an array gives you an array, and the declaration of a pointer gives you a pointer. The difference is that an array is an address - the address of the first element - and it is not a modifiable l-value, i.e., it can't be assigned to. On the other hand, a pointer is a variable holding an address.
Usually, the context is enough to tell whether you mean the address of a variable or the contents of the variable in an assignment. The statement
i = j;
Is saying to store the contents of j in the address of i. In compilers jargon, i is said to be an l-value, and j an r-value. The compiler has to generate code that writes the contents of the memory address of j in the memory address of i.
Consider these declarations:
char a[1024];
char *a;
What happens when you write a[i] = j;?
For the former case, the compiler will just pick the address of a's contents, which, in arrays, is the address of the first element; scale i, and sum it to the base address. Then it will write the contents of the address where j is stored into that address.
For the latter case, it is quite different: the compiler has to check the memory location where a is stored, load the contents of that memory location, use THAT as an address, and write the contents of j into that address.
If you declare a characters array like this in file1.c:
char a[] = "Hello";
And then define, in file2
extern char *a;
Then, executing a[0] = 'x'; in file2.c will crash: since you told the compiler that a is a pointer, it will check the address where a's value is stored. In reality, this address is holding the character code for 'H', but the compiler mistakenly interprets that as an address, and ends up generating code to write 'x' into the address 'H', which, if you're lucky, will crash your program with a segmentation violation.
Thus, this is a case where declaration and definition must match: if you declared it as an array, define it as an array; if you declared it as a pointer, define it as a pointer. You have to declare buf as a characters array in other files. Either of these forms is legal and equivalent:
extern char buf[1024];
or
extern char buf[];

Assignable global char variable. Make visible to all:
// shared_header.h
extern char current_mode[];
Define, instantiate and update:
// main.c
#include "shared_header.h"
char current_mode[160];
int main(int argc, char * argv [])
{
strcpy(current_mode, "MODE_CONFIGURE");
int a = randomNumber(102);
strcpy(current_mode, "MODE_EXECUTE");
int b = do_foo(a);
// other stuff
strcpy(current_mode, "MODE_TEARDOWN");
// close up shop
}
Read or update:
// foo.c
#include "shared_header.h"
int do_foo(int a)
{
printf("Current mode is %s", current_mode);
if (a > 100) {
strcpy(current_mode, "MODE_EXECUTE_SPECIAL_CASE");
printf("Switching to mode %s", current_mode);
}
// do useful things
}

Related

In C, why can't an integer value be assigned to an int* the same way a string value can be assigned to a char*?

I've been looking through the site but haven't found an answer to this one yet.
It is easiest (for me at least) to explain this question with an example.
I don't understand why this is valid:
#include <stdio.h>
int main(int argc, char* argv[])
{
char *mystr = "hello";
}
But this produces a compiler warning ("initialization makes pointer from integer without a cast"):
#include <stdio.h>
int main(int argc, char* argv[])
{
int *myint = 5;
}
My understanding of the first program is that creates a variable called mystr of type pointer-to-char, the value of which is the address of the first char ('h') of the string literal "hello". In other words with this initialization you not only get the pointer, but also define the object ("hello" in this case) which the pointer points to.
Why, then, does int *myint = 5; seemingly not achieve something analogous to this, i.e. create a variable called myint of type pointer-to-int, the value of which is the address of the value '5'? Why doesn't this initialization both give me the pointer and also define the object which the pointer points to?
In fact, you can do so using a compound literal, a feature added to the language by the 1999 ISO C standard.
A string literal is of type char[N], where N is the length of the string plus 1. Like any array expression, it's implicitly converted, in most but not all contexts, to a pointer to the array's first element. So this:
char *mystr = "hello";
assigns to the pointer mystr the address of the initial element of an array whose contents are "hello" (followed by a terminating '\0' null character).
Incidentally, it's safer to write:
const char *mystr = "hello";
There are no such implicit conversions for integers -- but you can do this:
int *ptr = &(int){42};
(int){42} is a compound literal, which creates an anonymous int object initialized to 42; & takes the address of that object.
But be careful: The array created by a string literal always has static storage duration, but the object created by a compound literal can have either static or automatic storage duration, depending on where it appears. That means that if the value of ptr is returned from a function, the object with the value 42 will cease to exist while the pointer still points to it.
As for:
int *myint = 5;
that attempts to assign the value 5 to an object of type int*. (Strictly speaking it's an initialization rather than an assignment, but the effect is the same). Since there's no implicit conversion from int to int* (other than the special case of 0 being treated as a null pointer constant), this is invalid.
When you do char* mystr = "foo";, the compiler will create the string "foo" in a special read-only portion of your executable, and effectively rewrite the statement as char* mystr = address_of_that_string;
The same is not implemented for any other type, including integers. int* myint = 5; will set myint to point to address 5.
i'll split my answer to two parts:
1st, why char* str = "hello"; is valid:
char* str declare a space for a pointer (number that represents a memory address on the current architecture)
when you write "hello" you actually fill the stack with 6 bytes of data
(don't forget the null termination) lets say at address 0x1000 - 0x1005.
str="hello" assigns the start address of that 5 bytes (0x1000) to the *str
so what we have is :
1. str, which takes 4 bytes in memory, holds the number 0x1000 (points to the first char only!)
2. 6 bytes 'h' 'e' 'l' 'l' 'o' '\0'
2st, why int* ptr = 0x105A4DD9; isn't valid:
well, this is not entirely true!
as said before, a Pointer is a number that represent an address,
so why cant i assign that number ?
it is not common because mostly you extract addresses of data and not enter the address manually.
but you can if you need !!!...
because it isn't something that is commonly done,
the compiler want to make sure you do so in propose, and not by mistake and forces you to CAST your data as
int* ptr = (int*)0x105A4DD9;
(used mostly for Memory mapped hardware resources)
Hope this clear things out.
Cheers
"In C, why can't an integer value be assigned to an int* the same way a string value can be assigned to a char*?"
Because it's not even a similar situation, let alone "the same way".
A string literal is an array of chars which – being an array – can be implicitly converted to a pointer to its first element. Said pointer is a char *.
But an int is not either a pointer in itself, nor an array, nor anything else implicitly convertible to a pointer. These two scenarios just don't have anything in common.
The problem is that you are trying to assign the address 5 to the pointer. Here you are not dereferencing the pointer, you are declaring it as a pointer and initializing it to the value 5 (as an address which surely is not what you intend to do). You could do the following.
#include <stdio.h>
int main(int argc, char* argv[])
{
int *myint, b;
b = 5;
myint = &b;
}

Declare and initialize pointer concisely (i. e. pointer to int)

Given pointers to char, one can do the following:
char *s = "data";
As far as I understand, a pointer variable is declared here, memory is allocated for both variable and data, the latter is filled with data\0 and the variable in question is set to point to the first byte of it (i. e. variable contains an address that can be dereferenced). That's short and compact.
Given pointers to int, for example, one can do this:
int *i;
*i = 42;
or that:
int i = 42;
foo(&i); // prefix every time to get a pointer
bar(&i);
baz(&i);
or that:
int i = 42;
int *p = &i;
That's somewhat tautological. It's small and tolerable with one usage of a single variable. It's not with multiple uses of several variables, though, producing code clutter.
Are there any ways to write the same thing dry and concisely? What are they?
Are there any broader-scope approaches to programming, that allow to avoid the issue entirely? May be I should not use pointers at all (joke) or something?
String literals are a corner case : they trigger the creation of the literal in static memory, and its access as a char array. Note that the following doesn't compile, despite 42 being an int literal, because it is not implicitly allocated :
int *p = &42;
In all other cases, you are responsible of allocating the pointed object, be it in automatic or dynamic memory.
int i = 42;
int *p = &i;
Here i is an automatic variable, and p points to it.
int * i;
*i = 42;
You just invoked Undefined Behaviour. i has not been initialized, and is therefore pointing somewhere at random in memory. Then you assigned 42 to this random location, with unpredictable consequences. Bad.
int *i = malloc(sizeof *i);
Here i is initialized to point to a dynamically-allocated block of memory. Don't forget to free(i) once you're done with it.
int i = 42, *p = &i;
And here is how you create an automatic variable and a pointer to it as a one-liner. i is the variable, p points to it.
Edit : seems like you really want that variable to be implicitly and anonymously allocated. Well, here's how you can do it :
int *p = &(int){42};
This thingy is a compound literal. They are anonymous instances with automatic storage duration (or static at file scope), and only exist in C90 and further (but not C++ !). As opposed to string literals, compound literals are mutable, i.e you can modify *p.
Edit 2 : Adding this solution inspired from another answer (which unfortunately provided a wrong explanation) for completeness :
int i[] = {42};
This will allocate a one-element mutable array with automatic storage duration. The name of the array, while not a pointer itself, will decay to a pointer as needed.
Note however that sizeof i will return the "wrong" result, that is the actual size of the array (1 * sizeof(int)) instead of the size of a pointer (sizeof(int*)). That should however rarely be an issue.
int i=42;
int *ptr = &i;
this is equivalent to writing
int i=42;
int *ptr;
ptr=&i;
Tough this is definitely confusing, but during function calls its quite useful as:
void function1()
{
int i=42;
function2(&i);
}
function2(int *ptr)
{
printf("%d",*ptr); //outputs 42
}
here, we can easily use this confusing notation to declare and initialize the pointer during function calls. We don't need to declare pointer globally, and the initialize it during function calls. We have a notation to do both at same time.
int *ptr; //declares the pointer but does not initialize it
//so, ptr points to some random memory location
*ptr=42; //you gave a value to this random memory location
Though this will compile, but it will invoke undefined behaviour as you actually never initialized the pointer.
Also,
char *ptr;
char str[6]="hello";
ptr=str;
EDIT: as pointed in the comments, these two cases are not equivalent.
But pointer points to "hello" in both cases. This example is written just to show that we can initialize pointers in both these ways (to point to hello), but definitely both are different in many aspects.
char *ptr;
ptr="hello";
As, name of string, str is actually a pointer to the 0th element of string, i.e. 'h'.
The same goes with any array arr[], where arr contains the address of 0th element.
you can also think it as array , int i[1]={42} where i is a pointer to int
int * i;
*i = 42;
will invoke undefined behavior. You are modifying an unknown memory location. You need to initialize pointer i first.
int i = 42;
int *p = &i;
is the correct way. Now p is pointing to i and you can modify the variable pointed to by p.
Are there any ways to write the same thing dry and concisely?
No. As there is no pass by reference in C you have to use pointers when you want to modify the passed variable in a function.
Are there any broader-scope approaches to programming, that allow to avoid the issue entirely? May be I should not use pointers at all (joke) or something?
If you are learning C then you can't avoid pointers and you should learn to use it properly.

Pointer-array-extern question

File 1.c
int a[10];
File main.c:
extern int *a;
int main()
{
printf("%d\n", a[0]);
return 0;
}
Gives me a segfault! What's going wrong?
Arrays decompose, or are implicitly converted to pointers when passed to a function as an argument, or when converted to an r-value on the right-hand-side of the assignment operator. So something like:
int array[10];
int* a = array; //implicit conversion to pointer to type int
void function(int* a);
function(array); //implicit conversion to pointer to type int
works just fine. But that does not mean that arrays themselves are pointers. So if you treat an array like a pointer as you've done, you're actually treating the array type as-if it was a pointer that held the address to an int object. Since your array is actually a sequence of int objects, and not pointers to int objects, you're actually trying to dereference to some memory location that isn't pointing to anywhere valid (i.e., the first slot in array is a numerical integer value like 0 which would be like dereferencing a NULL). So that is why you're segfaulting. Note that if you had done something like this:
int array[] = { 1, 2, 3, 4, 5};
int b = *array;
That still works, since array is again implicitly converted to a pointer to the memory block that is holding a sequence of integer values and is then dereferenced to get the value in the first sequence. But in your case, by declaring your array to the current code module as an externally defined pointer, and not an array, it will skip over the implicit conversion to a pointer that is normally done, and just use the array object as-if it were a pointer to an object itself, not an array of objects.
Well explained in the C FAQ. And there is a followup. The picture in the second link is worth a million bucks.
char a[] = "hello";
char *p = "world";
Short answer: use extern int a[].
A bit late, a duplicate of this problem was just entered (and closed). The answers here don't mention header files...
The problem would be caught at compile time if you put the declaration of array a in a header file, where it belongs, instead of putting it in the .c file. The header file should then be included into both .c files and the compiler can see that what you have declared is wrong.
Your header file would contain:
extern int myarray[];
You will get something like "error: conflicting types for a" if you declare a as a pointer instead.
Basically you'd need to write your main.c like this:
extern int a[];
int main()
{
printf("%d\n", a[0]);
return 0;
}
Check out the output of the following code.
File1.c
#include <stdio.h>
extern int* my_arr;
void my_print()
{
printf("%d", my_arr);
}
main.c
#include <stdio.h>
int my_arr[2] = {1,2};
extern void my_print();
void main()
{
my_print();
}
output
1
inside File1.c my_arr is a pointer variable that has the value of 1. meaning the 1st element of my_arr[] was assigned to it. Then if you use *my_arr to access memory location ox1, you get seg fault because you are not allowed to access ox01.
Why my_arr pointer was assigned 1 (the first element of my_arr[])?
Has to do with how assembler works. Read this article
Why your code can't access 0x01 ?
I know it has to do with the operating system not allowing some address space to be accessed by user code. Thats all I know. google it if you want more info.

C difference between *[] and **

This might be a bit of a basic question, but what is the difference between writing char * [] and char **? For example, in main,I can have a char * argv[]. Alternatively I can use char ** argv. I assume there's got to be some kind of difference between the two notations.
Under the circumstances, there's no difference at all. If you try to use an array type as a function parameter, the compiler will "adjust" that to a pointer type instead (i.e., T a[x] as a function parameter means exactly the same thing as: T *a).
Under the right circumstances (i.e., not as a function parameter), there can be a difference between using array and pointer notation though. One common one is in an extern declaration. For example, let's assume we have one file that contains something like:
char a[20];
and we want to make that visible in another file. This will work:
extern char a[];
but this will not:
extern char *a;
If we make it an array of pointers instead:
char *a[20];
...the same remains true -- declaring an extern array works fine, but declaring an extern pointer does not:
extern char *a[]; // works
extern char **a; // doesn't work
Depends on context.
As a function parameter, they mean the same thing (to the compiler), but writing it char *argv[] might help make it obvious to programmers that the char** being passed points to the first element of an array of char*.
As a variable declaration, they mean different things. One is a pointer to a pointer, the other is an array of pointers, and the array is of unspecified size. So you can do:
char * foo[] = {0, 0, 0};
And get an array of 3 null pointers. Three char*s is a completely different thing from a pointer to a char*.
You can use cdecl.org to convert them to English:
char *argv[] = declare argv as array of pointer to char
char **argv = declare argv as pointer to pointer to char
I think this is a little bit more than syntactic sugar, it also offers a way to express semantic information about the (voluntary) contract implied by each type of declaration.
With char*[] you are saying that this is intended to be used as an array.
With char**, you are saying that you CAN use this as an array but that's not the way it's intended to be used.
As it was mentioned in the other answers, char*[] declares an array of pointers to char, char** declares a pointer to a pointer to char (which can be used as array).
One difference is that the array is constant, whereas the pointer is not.
Example:
int main()
{
char** ppc = NULL;
char* apc[] = {NULL};
ppc++;
apc++; /* this won't compile*/
return 0;
}
This really depends on the context of where the declarations occur.
Outside of a function parameter definition, the declaration
T a[];
declares a as an unknown-sized array of T; the array type is incomplete, so unless a is defined elsewhere (either in this translation unit or another translation unit that gets linked) then no storage is set aside for it (and you will probably get an "undefined reference" error if you attempt to link, although I think gcc's default behavior is to define the array with 1 element) . It cannot be used as an operand to the sizeof operator. It can be used as an operand of the & operator.
For example:
/**
* module1.c
*/
extern char *a[]; /* non-defining declaration of a */
void foo()
{
size_t i = 0;
for (i = 0; a[i] != NULL; i++)
printf("a[%lu] = %s\n", (unsigned long) i, a[i++]);
}
module1.c uses a non-defining declaration of a to introduce the name so that it can be used in the function foo, but since no size is specified, no storage is set aside for it in this translation unit. Most importantly, the expression a is not a pointer type; it is an incomplete array type. It will be converted to a pointer type in the call to printf by the usual rules.
/**
* module2.c
*/
char *a[] = {"foo", "bar", "bletch", "blurga", NULL}; /* defining declaration of a */
int main(void)
{
void foo();
foo();
return 0;
}
module2.c contains a defining declaration for a (the size of the array is computed from the number of elements in the initializer), which causes storage to be allocated for the array.
Style note: please don't ever write code like this.
In the context of a function parameter declaration, T a[] is synonymous with T *a; in both cases, a is a pointer type. This is only true in the context of a function parameter declaration.
As Paul said in the comment above, it's syntactic sugar. Both char* and char[] are the same data type. In memory, they will both contain the address of a char.
The array/index notation is equivalent to the pointer notation, both in declaration and in access, but sometimes much more intuitive. If you are creating an array of char pointers, you may want to write it one way or another to clarify your intention.
Edit: didn't consider the case Jerry mentioned in the other answer. Take a look at that.
char *ptr[2]={"good","bad"}; //Array of ptr to char
char **str; //Refer ptr to ptr to char
int i;
//str = &ptr[0]; //work
str = ptr;
for(i=0;i<2;i++) printf("%s %s\n",ptr[i],str[i]);
Its o/p same. Using that we can easily understand.

C strings confusion

I'm learning C right now and got a bit confused with character arrays - strings.
char name[15]="Fortran";
No problem with this - its an array that can hold (up to?) 15 chars
char name[]="Fortran";
C counts the number of characters for me so I don't have to - neat!
char* name;
Okay. What now? All I know is that this can hold an big number of characters that are assigned later (e.g.: via user input), but
Why do they call this a char pointer? I know of pointers as references to variables
Is this an "excuse"? Does this find any other use than in char*?
What is this actually? Is it a pointer? How do you use it correctly?
thanks in advance,
lamas
I think this can be explained this way, since a picture is worth a thousand words...
We'll start off with char name[] = "Fortran", which is an array of chars, the length is known at compile time, 7 to be exact, right? Wrong! it is 8, since a '\0' is a nul terminating character, all strings have to have that.
char name[] = "Fortran";
+======+ +-+-+-+-+-+-+-+--+
|0x1234| |F|o|r|t|r|a|n|\0|
+======+ +-+-+-+-+-+-+-+--+
At link time, the compiler and linker gave the symbol name a memory address of 0x1234.
Using the subscript operator, i.e. name[1] for example, the compiler knows how to calculate where in memory is the character at offset, 0x1234 + 1 = 0x1235, and it is indeed 'o'. That is simple enough, furthermore, with the ANSI C standard, the size of a char data type is 1 byte, which can explain how the runtime can obtain the value of this semantic name[cnt++], assuming cnt is an integer and has a value of 3 for example, the runtime steps up by one automatically, and counting from zero, the value of the offset is 't'. This is simple so far so good.
What happens if name[12] was executed? Well, the code will either crash, or you will get garbage, since the boundary of the array is from index/offset 0 (0x1234) up to 8 (0x123B). Anything after that does not belong to name variable, that would be called a buffer overflow!
The address of name in memory is 0x1234, as in the example, if you were to do this:
printf("The address of name is %p\n", &name);
Output would be:
The address of name is 0x00001234
For the sake of brevity and keeping with the example, the memory addresses are 32bit, hence you see the extra 0's. Fair enough? Right, let's move on.
Now on to pointers...
char *name is a pointer to type of char....
Edit:
And we initialize it to NULL as shown Thanks Dan for pointing out the little error...
char *name = (char*)NULL;
+======+ +======+
|0x5678| -> |0x0000| -> NULL
+======+ +======+
At compile/link time, the name does not point to anything, but has a compile/link time address for the symbol name (0x5678), in fact it is NULL, the pointer address of name is unknown hence 0x0000.
Now, remember, this is crucial, the address of the symbol is known at compile/link time, but the pointer address is unknown, when dealing with pointers of any type
Suppose we do this:
name = (char *)malloc((20 * sizeof(char)) + 1);
strcpy(name, "Fortran");
We called malloc to allocate a memory block for 20 bytes, no, it is not 21, the reason I added 1 on to the size is for the '\0' nul terminating character. Suppose at runtime, the address given was 0x9876,
char *name;
+======+ +======+ +-+-+-+-+-+-+-+--+
|0x5678| -> |0x9876| -> |F|o|r|t|r|a|n|\0|
+======+ +======+ +-+-+-+-+-+-+-+--+
So when you do this:
printf("The address of name is %p\n", name);
printf("The address of name is %p\n", &name);
Output would be:
The address of name is 0x00005678
The address of name is 0x00009876
Now, this is where the illusion that 'arrays and pointers are the same comes into play here'
When we do this:
char ch = name[1];
What happens at runtime is this:
The address of symbol name is looked up
Fetch the memory address of that symbol, i.e. 0x5678.
At that address, contains another address, a pointer address to memory and fetch it, i.e. 0x9876
Get the offset based on the subscript value of 1 and add it onto the pointer address, i.e. 0x9877 to retrieve the value at that memory address, i.e. 'o' and is assigned to ch.
That above is crucial to understanding this distinction, the difference between arrays and pointers is how the runtime fetches the data, with pointers, there is an extra indirection of fetching.
Remember, an array of type T will always decay into a pointer of the first element of type T.
When we do this:
char ch = *(name + 5);
The address of symbol name is looked up
Fetch the memory address of that symbol, i.e. 0x5678.
At that address, contains another address, a pointer address to memory and fetch it, i.e. 0x9876
Get the offset based on the value of 5 and add it onto the pointer address, i.e. 0x987A to retrieve the value at that memory address, i.e. 'r' and is assigned to ch.
Incidentally, you can also do that to the array of chars also...
Further more, by using subscript operators in the context of an array i.e. char name[] = "..."; and name[subscript_value] is really the same as *(name + subscript_value).
i.e.
name[3] is the same as *(name + 3)
And since the expression *(name + subscript_value) is commutative, that is in the reverse,
*(subscript_value + name) is the same as *(name + subscript_value)
Hence, this explains why in one of the answers above you can write it like this (despite it, the practice is not recommended even though it is quite legitimate!)
3[name]
Ok, how do I get the value of the pointer?
That is what the * is used for,
Suppose the pointer name has that pointer memory address of 0x9878, again, referring to the above example, this is how it is achieved:
char ch = *name;
This means, obtain the value that is pointed to by the memory address of 0x9878, now ch will have the value of 'r'. This is called dereferencing. We just dereferenced a name pointer to obtain the value and assign it to ch.
Also, the compiler knows that a sizeof(char) is 1, hence you can do pointer increment/decrement operations like this
*name++;
*name--;
The pointer automatically steps up/down as a result by one.
When we do this, assuming the pointer memory address of 0x9878:
char ch = *name++;
What is the value of *name and what is the address, the answer is, the *name will now contain 't' and assign it to ch, and the pointer memory address is 0x9879.
This where you have to be careful also, in the same principle and spirit as to what was stated earlier in relation to the memory boundaries in the very first part (see 'What happens if name[12] was executed' in the above) the results will be the same, i.e. code crashes and burns!
Now, what happens if we deallocate the block of memory pointed to by name by calling the C function free with name as the parameter, i.e. free(name):
+======+ +======+
|0x5678| -> |0x0000| -> NULL
+======+ +======+
Yes, the block of memory is freed up and handed back to the runtime environment for use by another upcoming code execution of malloc.
Now, this is where the common notation of Segmentation fault comes into play, since name does not point to anything, what happens when we dereference it i.e.
char ch = *name;
Yes, the code will crash and burn with a 'Segmentation fault', this is common under Unix/Linux. Under windows, a dialog box will appear along the lines of 'Unrecoverable error' or 'An error has occurred with the application, do you wish to send the report to Microsoft?'....if the pointer has not been mallocd and any attempt to dereference it, is guaranteed to crash and burn.
Also: remember this, for every malloc there is a corresponding free, if there is no corresponding free, you have a memory leak in which memory is allocated but not freed up.
And there you have it, that is how pointers work and how arrays are different to pointers, if you are reading a textbook that says they are the same, tear out that page and rip it up! :)
I hope this is of help to you in understanding pointers.
That is a pointer. Which means it is a variable that holds an address in memory. It "points" to another variable.
It actually cannot - by itself - hold large amounts of characters. By itself, it can hold only one address in memory. If you assign characters to it at creation it will allocate space for those characters, and then point to that address. You can do it like this:
char* name = "Mr. Anderson";
That is actually pretty much the same as this:
char name[] = "Mr. Anderson";
The place where character pointers come in handy is dynamic memory. You can assign a string of any length to a char pointer at any time in the program by doing something like this:
char *name;
name = malloc(256*sizeof(char));
strcpy(name, "This is less than 256 characters, so this is fine.");
Alternately, you can assign to it using the strdup() function, like this:
char *name;
name = strdup("This can be as long or short as I want. The function will allocate enough space for the string and assign return a pointer to it. Which then gets assigned to name");
If you use a character pointer this way - and assign memory to it, you have to free the memory contained in name before reassigning it. Like this:
if(name)
free(name);
name = 0;
Make sure to check that name is, in fact, a valid point before trying to free its memory. That's what the if statement does.
The reason you see character pointers get used a whole lot in C is because they allow you to reassign the string with a string of a different size. Static character arrays don't do that. They're also easier to pass around.
Also, character pointers are handy because they can be used to point to different statically allocated character arrays. Like this:
char *name;
char joe[] = "joe";
char bob[] = "bob";
name = joe;
printf("%s", name);
name = bob;
printf("%s", name);
This is what often happens when you pass a statically allocated array to a function taking a character pointer. For instance:
void strcpy(char *str1, char *str2);
If you then pass that:
char buffer[256];
strcpy(buffer, "This is a string, less than 256 characters.");
It will manipulate both of those through str1 and str2 which are just pointers that point to where buffer and the string literal are stored in memory.
Something to keep in mind when working in a function. If you have a function that returns a character pointer, don't return a pointer to a static character array allocated in the function. It will go out of scope and you'll have issues. Repeat, don't do this:
char *myFunc() {
char myBuf[64];
strcpy(myBuf, "hi");
return myBuf;
}
That won't work. You have to use a pointer and allocate memory (like shown earlier) in that case. The memory allocated will persist then, even when you pass out of the functions scope. Just don't forget to free it as previously mentioned.
This ended up a bit more encyclopedic than I'd intended, hope its helpful.
Editted to remove C++ code. I mix the two so often, I sometimes forget.
char* name is just a pointer. Somewhere along the line memory has to be allocated and the address of that memory stored in name.
It could point to a single byte of memory and be a "true" pointer to a single char.
It could point to a contiguous area of memory which holds a number of characters.
If those characters happen to end with a null terminator, low and behold you have a pointer to a string.
char *name, on it's own, can't hold any characters. This is important.
char *name just declares that name is a pointer (that is, a variable whose value is an address) that will be used to store the address of one or more characters at some point later in the program. It does not, however, allocate any space in memory to actually hold those characters, nor does it guarantee that name even contains a valid address. In the same way, if you have a declaration like int number there is no way to know what the value of number is until you explicitly set it.
Just like after declaring the value of an integer, you might later set its value (number = 42), after declaring a pointer to char, you might later set its value to be a valid memory address that contains a character -- or sequence of characters -- that you are interested in.
It is confusing indeed. The important thing to understand and distinguish is that char name[] declares array and char* name declares pointer. The two are different animals.
However, array in C can be implicitly converted to pointer to its first element. This gives you ability to perform pointer arithmetic and iterate through array elements (it does not matter elements of what type, char or not). As #which mentioned, you can use both, indexing operator or pointer arithmetic to access array elements. In fact, indexing operator is just a syntactic sugar (another representation of the same expression) for pointer arithmetic.
It is important to distinguish difference between array and pointer to first element of array. It is possible to query size of array declared as char name[15] using sizeof operator:
char name[15] = { 0 };
size_t s = sizeof(name);
assert(s == 15);
but if you apply sizeof to char* name you will get size of pointer on your platform (i.e. 4 bytes):
char* name = 0;
size_t s = sizeof(name);
assert(s == 4); // assuming pointer is 4-bytes long on your compiler/machine
Also, the two forms of definitions of arrays of char elements are equivalent:
char letters1[5] = { 'a', 'b', 'c', 'd', '\0' };
char letters2[5] = "abcd"; /* 5th element implicitly gets value of 0 */
The dual nature of arrays, the implicit conversion of array to pointer to its first element, in C (and also C++) language, pointer can be used as iterator to walk through array elements:
/ *skip to 'd' letter */
char* it = letters1;
for (int i = 0; i < 3; i++)
it++;
In C a string is actually just an array of characters, as you can see by the definition. However, superficially, any array is just a pointer to its first element, see below for the subtle intricacies. There is no range checking in C, the range you supply in the variable declaration has only meaning for the memory allocation for the variable.
a[x] is the same as *(a + x), i.e. dereference of the pointer a incremented by x.
if you used the following:
char foo[] = "foobar";
char bar = *foo;
bar will be set to 'f'
To stave of confusion and avoid misleading people, some extra words on the more intricate difference between pointers and arrays, thanks avakar:
In some cases a pointer is actually semantically different from an array, a (non-exhaustive) list of examples:
//sizeof
sizeof(char*) != sizeof(char[10])
//lvalues
char foo[] = "foobar";
char bar[] = "baz";
char* p;
foo = bar; // compile error, array is not an lvalue
p = bar; //just fine p now points to the array contents of bar
// multidimensional arrays
int baz[2][2];
int* q = baz; //compile error, multidimensional arrays can not decay into pointer
int* r = baz[0]; //just fine, r now points to the first element of the first "row" of baz
int x = baz[1][1];
int y = r[1][1]; //compile error, don't know dimensions of array, so subscripting is not possible
int z = r[1]: //just fine, z now holds the second element of the first "row" of baz
And finally a fun bit of trivia; since a[x] is equivalent to *(a + x) you can actually use e.g. '3[a]' to access the fourth element of array a. I.e. the following is perfectly legal code, and will print 'b' the fourth character of string foo.
#include <stdio.h>
int main(int argc, char** argv) {
char foo[] = "foobar";
printf("%c\n", 3[foo]);
return 0;
}
One is an actual array object and the other is a reference or pointer to such an array object.
The thing that can be confusing is that both have the address of the first character in them, but only because one address is the first character and the other address is a word in memory that contains the address of the character.
The difference can be seen in the value of &name. In the first two cases it is the same value as just name, but in the third case it is a different type called pointer to pointer to char, or **char, and it is the address of the pointer itself. That is, it is a double-indirect pointer.
#include <stdio.h>
char name1[] = "fortran";
char *name2 = "fortran";
int main(void) {
printf("%lx\n%lx %s\n", (long)name1, (long)&name1, name1);
printf("%lx\n%lx %s\n", (long)name2, (long)&name2, name2);
return 0;
}
Ross-Harveys-MacBook-Pro:so ross$ ./a.out
100001068
100001068 fortran
100000f58
100001070 fortran

Resources