Given the source code of strcpy():
char * strcpy(char *s1, const char *s2)
{
    char *s = s1;
    while ((*s++ = *s2++) != 0);
    return (s1);
}
Why does passing the second argument work, and what does it look like in memory, since I do not pass a pointer to the function?
char dest[100];
strcpy(dest, "HelloWorld");
This works, because,
For dest, arrays, when passed as function arguments, decay to the address of the first element. That's a pointer.
So, a call like
strcpy(dest, "HelloWorld");
is the same as
strcpy(&dest[0], "HelloWorld");
As for "HelloWorld", a string literal has array type (here char[11]). So it likewise gives you the address of its first element.
In C string literals have types of character arrays. From C Standard (6.4.5 String literals)
6 In translation phase 7, a byte or code of value zero is appended to
each multibyte character sequence that results from a string literal
or literals.78) The multibyte character sequence is then used to
initialize an array of static storage duration and length just
sufficient to contain the sequence. For character string literals,
the array elements have type char, and are initialized with the
individual bytes of the multibyte character sequence.
Also, arrays are converted to pointers in expressions, with rare exceptions. From the C Standard (6.3.2.1 Lvalues, arrays, and function designators)
3 Except when it is the operand of the sizeof operator or the unary &
operator, or is a string literal used to initialize an array, an
expression that has type ‘‘array of type’’ is converted to an
expression with type ‘‘pointer to type’’ that points to the initial
element of the array object and is not an lvalue. If the array object
has register storage class, the behavior is undefined.
Thus in this call
strcpy(dest, "HelloWorld");
the string literal has type char[11], which is converted to a value of type char * equal to the address of the first character of the string literal.
You could also write for example
strcpy(dest, &"HelloWorld"[0]);
Or even :)
strcpy(dest, &0["HelloWorld"]);
or:)
strcpy(dest, &*"HelloWorld");
All three expressions yield the address of the initial element of the string literal and have type char *.
Take into account that it is unspecified (and usually controlled by compiler options) whether
"HelloWorld" == "HelloWorld"
evaluates to true, that is, whether the compiler allocates separate regions of memory for identical string literals or stores only one copy of them.
In this expression the addresses of the first characters of the string literals are compared.
If you write
strcmp( "HelloWorld", "HelloWorld" )
then the result will be 0, that is, the string literals are equal to each other (they contain the same sequence of characters).
strcpy(dest, "HelloWorld");
dest, being an array of char, decays to a pointer to its first element when passed to the function and is accessed there through that pointer.
And "HelloWorld", being a string literal, has type char[11] (its contents must not be modified) and hence is a valid argument to pass.
I am confused with how an array of pointer to char works in C.
Here is a sample of the code which I am using to understand the array of pointers to char.
#include <stdio.h>
int main() {
    char *d[]={"hi","bye"};
    int a;
    a = (d[0]=="hi") ? 1:0;
    printf("%d\n",a);
    return 0;
}
I am getting a = 1, so d[0] == "hi". What confuses me is that, since d is an array of char pointers, shouldn't d[0] be equal to the address of the h of the "hi" string?
C 2018 6.4.5 specifies the behavior of string literals. Paragraph 6 specifies that a string literal in source code causes the creation of an array of characters. When an array is used in an expression other than as the operand of sizeof, the operand of unary &, or as a string literal used to initialize an array, it is automatically converted to a pointer to its first character. So char *d[]={"hi","bye"}; initializes d[0] and d[1] to point to the first characters of "hi" and "bye", and d[0]=="hi" compares d[0] to a pointer to the first character of another "hi".
Paragraph 7 says that the same array may be used for identical string literals:
It is unspecified whether these arrays are distinct provided their elements have the appropriate values…
Thus, when your compiler is taking the address of the first element of "hi" in d[0]=="hi", it may, but is not required to, use the same memory for "hi" as it did when initializing d[0] in char *d[]={"hi","bye"};.
(Note that the paragraph also allows the same memory to be used for string literals identical to the substrings at the ends of other string literals. For example, "phi" and "hi" could share memory.)
I started learning pointers in C. I understood it fine until I came across the topic "Using Pointers to store character arrays".
A sample program to highlight my doubt is as follows
#include <stdio.h>
main()
{
char *string;
string = "good";
printf ("%s", string);
}
This prints the character string, i.e, good.
Pointers are supposed to store memory addresses, or in other words, we assign the address of a variable (using the address operator) to a pointer variable.
What I don't understand is how are we able to assign a character string directly to the pointer? That too without address operator?
Also, how are we able to print the string without the indirection operator (*) ?
A literal string like "good" is really stored as a (read-only) array of characters. Also, all strings in C must be terminated with a special "null" character '\0'.
When you do the assignment
string = "good";
what is really happening is that you make string point to the first character in that array.
Functions handling strings know how to deal with pointers like that, and know how to loop over such arrays using the pointer to find all the characters in the string until they find the terminator.
Looking at it a little differently, the compiler creates its array
char internal_array[] = { 'g', 'o', 'o', 'd', '\0' };
then you make string point to the first element in the array
string = &internal_array[0];
Note that &internal_array[0] is actually equal to internal_array, since arrays naturally decay to pointers to their first element.
"cccccc" is a string literal, which is actually a char array stored in read-only memory. You assign the address of the first character of this literal to the pointer.
If you want to copy a string literal into writable memory you need:
char string[] = "fgdfdfgdfgf";
Bear in mind that the array initialization (when you declare it) is the only place where you can use = to copy the string literal into the char array (string).
In any other circumstances you need to use the appropriate library function, for example:
strcpy(string, "asdf");
(the string has to have enough space to accommodate the new string)
What I don't understand is how are we able to assign a character string directly to the pointer? That too without address operator?
When an array is assigned to something, the array is converted to a pointer.
"good" is a string literal. It has type array 5 of char, which includes the trailing null character. It lives in memory that should not be written to. Attempting to write is undefined behavior (UB). It might "work", it might not. Code may die, etc.
char *string; declares string as a pointer to char.
string = "good"; causes an assignment. The operation takes "good" and converts that array to the address and type (char*) of its first element 'g'. Then assigns that char * to string.
Also, how are we able to print the string without the indirection operator (*) ?
printf() expects a char * - which matches the type of string.
printf ("%s", string); passes string to printf() as a char *; no conversion is made. For %s, "... the argument shall be a pointer to the initial element of an array of character type." and then "Characters from the array are written up to (but not including) the terminating null character." C11 §7.21.6.1 ¶8.
Your first question:
What I don't understand is how are we able to assign a character string directly to the pointer? That too without address operator?
A character string literal is a sequence of zero or more multibyte characters enclosed in double quotes, e.g. "good".
From C Standard#6.4.5 [String literals]:
...The multibyte character sequence is then used to initialize an array of static storage duration and length just sufficient to contain the sequence. For character string literals, the array elements have type char, and are initialized with the individual bytes of the multibyte character sequence.....
In C, an expression that has type array of type is converted to an expression with type pointer to type that points to the initial element of the array object [there are a few exceptions]. Hence, the string literal, which is an array, decays into a pointer that can be assigned to a variable of type char *.
In the statement:
string = "good";
string will point to the initial character in the array where "good" is stored.
Your second question:
Also, how are we able to print the string without the indirection operator (*) ?
From printf():
s
writes a character string
The argument must be a pointer to the initial element of an array of characters...
So, the format specifier %s expects a pointer to the initial element, which is what the variable string is: a pointer to the initial character of "good". Hence, you don't need the indirection operator (*).
I know that in C both of these work:
char* string = "foo";
printf("string value: %s", string);
and more simply:
printf("string value: %s", "foo");
But I was asking myself why.
I know that the %s conversion specifier expects the argument to be a char*, and string actually is one (and it would be the same with an array of characters, because these two data types are pretty much the same in C)
but when I pass a string directly to printf, shouldn't it be different? I mean "foo" is not a pointer anymore... Right?
The string constant "foo" has type char []. When passed to a function, the array decays to a pointer, i.e. char *. So you can pass it to a function that expects the same.
For the same reason, you can also pass a variable of this type:
char string[4] = "foo";
printf("string value: %s", string);
The "foo" is a string literal. It represents an unnamed array object with static storage duration of type char[4] (that is, without a const qualifier). When passed to a function it is converted to a pointer to its first element, just as any "normal" array would be.
Even though the array is not const, you are not allowed to modify its values. Such modification results in undefined behavior:
char* string = "foo";
string[0] = 'b'; // wrong, this invokes UB
The array has four elements because of the trailing null character '\0', sometimes referred to as the NUL character. Please don't confuse it with NULL, which is a different thing. The purpose of that character is to terminate the given string literal.
The function's parameter receives a pointer to char, as the array object is converted into a pointer to the array's first element (i.e. a pointer to the first character in the array). To be precise, the pointer is passed by value: the function receives a copy of the address it holds.
In C, all strings are null-terminated char arrays, so your example behaves in exactly the same way.
The ISO C standard, section 7.1.1, defines a string this way:
A string is a contiguous sequence of characters terminated by and
including the first null character.
What printf() gets, is a pointer:
ISO/IEC 9899:TC3, 6.5.2.2 – 4:
An argument may be an expression of any object type. In preparing for the call to a function, the arguments are evaluated, and each parameter is assigned the value of the corresponding argument.81)
81) A parameter declared to have array or function type is adjusted to have a pointer type as described in 6.9.1.
ISO/IEC 9899:TC3, 6.9.1 – 10:
On entry to the function, the size expressions of each variably modified parameter are evaluated and the value of each argument expression is converted to the type of the corresponding parameter as if by assignment. (Array expressions and function designators as arguments were converted to pointers before the call.)
"foo", in the end, is passed as a pointer to a statically allocated 4-byte memory region (likely marked read-only) that is initialized with the content 'f', 'o', 'o', '\0'.
I have some code that has had me puzzled.
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char *argv[])
{
char* string1 = "this is a test";
char string2[] = "this is a test";
printf("%i, %i\n", sizeof(string1), sizeof(string2));
system("PAUSE");
return 0;
}
When it outputs the size of string1, it prints 4, which is to be expected because the size of a pointer is 4 bytes. But when it prints string2, it outputs 15. I thought that an array was a pointer, so the size of string2 should be the same as string1 right? So why is it that it prints out two different sizes for the same type of data (pointer)?
Arrays are not pointers. Array names decay to pointers to the first element of the array in certain situations: when you pass one to a function, when you assign one to a pointer, etc. But otherwise arrays are arrays: they occupy their own storage, have sizes that can be determined with sizeof, and all that other good stuff.
Arrays and pointers are completely different animals. In most contexts, an expression designating an array is treated as a pointer.
First, a little standard language (n1256):
6.3.2.1 Lvalues, arrays, and function designators
...
3 Except when it is the operand of the sizeof operator or the unary & operator, or is a string literal used to initialize an array, an expression that has type "array of type" is converted to an expression with type "pointer to type" that points to the initial element of the array object and is not an lvalue. If the array object has register storage class, the behavior is undefined.
The string literal "this is a test" is a 15-element array of char. In the declaration
char *string1 = "this is a test";
string1 is being declared as a pointer to char. Per the language above, the type of the expression "this is a test" is converted from char [15] to char *, and the resulting pointer value is assigned to string1.
In the declaration
char string2[] = "this is a test";
something different happens. More standard language:
6.7.8 Initialization
...
14 An array of character type may be initialized by a character string literal, optionally
enclosed in braces. Successive characters of the character string literal (including the
terminating null character if there is room or if the array is of unknown size) initialize the elements of the array.
...
22 If an array of unknown size is initialized, its size is determined by the largest indexed element with an explicit initializer. At the end of its initializer list, the array no longer has incomplete type.
In this case, string2 is being declared as an array of char, its size is computed from the length of the initializer, and the contents of the string literal are copied to the array.
Here's a hypothetical memory map to illustrate what's happening:
Item        Address       0x00  0x01  0x02  0x03
----        -------       ----  ----  ----  ----
no name     0x08001230    't'   'h'   'i'   's'
            0x08001234    ' '   'i'   's'   ' '
            0x08001238    'a'   ' '   't'   'e'
            0x0800123C    's'   't'   0
...
string1     0x12340000    0x08  0x00  0x12  0x30
string2     0x12340004    't'   'h'   'i'   's'
            0x12340008    ' '   'i'   's'   ' '
            0x1234000C    'a'   ' '   't'   'e'
            0x12340010    's'   't'   0
String literals have static extent; that is, the memory for them is set aside at program startup and held until the program terminates. Attempting to modify the contents of a string literal invokes undefined behavior; the underlying platform may or may not allow it, and the standard places no restrictions on the compiler. It's best to act as though literals are always unwritable.
In my memory map above, the address of the string literal is set off somewhat from the addresses of string1 and string2 to illustrate this.
Anyway, you can see that string1, having a pointer type, contains the address of the string literal. string2, being an array type, contains a copy of the contents of the string literal.
Since the size of string2 is known at compile time, sizeof returns the size (number of bytes) in the array.
The %i conversion specifier is not the right one to use for expressions of type size_t. If you're working in C99, use %zu. In C89, you would use %lu and cast the expression to unsigned long:
C89: printf("%lu, %lu\n", (unsigned long) sizeof string1, (unsigned long) sizeof string2);
C99: printf("%zu, %zu\n", sizeof string1, sizeof string2);
Note that sizeof is an operator, not a function call; when the operand is an expression that denotes an object, parentheses aren't necessary (although they don't hurt).
string1 is a pointer, but string2 is an array.
The second line is something like int a[] = { 1, 2, 3}; which defines a to be a length-3 array (via the initializer).
The size of string2 is 15 because the initializer is nul-terminated (so 15 is the length of the string + 1).
An array declared without a size has an incomplete type, and sizeof cannot be applied to it until the type is completed. An array with a known size is its own type, and sizeof reports the size of the storage required for the array. Even though string2 is declared without an explicit size, the initializer completes its type: the compiler sizes the array from the quoted string, so string2 becomes a char[15]. Arrays are different types from pointers for the purpose of sizeof, which is why the two results differ.
This seems to be a decent reference on the behaviors of sizeof.
The compiler knows that string2 is an array, so it prints out the number of bytes allocated to it (14 letters plus the null terminator). Remember that sizeof is an operator evaluated by the compiler, so it knows the size of the array.
An array is not a pointer. A pointer is a variable that holds the address of a memory location, whereas an array is itself a block of contiguous memory.
It's because
string1 holds a pointer; the pointer refers to contiguous chars that are
immutable.
string2 is the location where your chars sit.
Basically the C compiler interprets these two differently. Beautifully explained here: http://c-faq.com/aryptr/aryptr2.html.
In the following rule for when an array decays to a pointer:
An lvalue [see question 2.5] of type array-of-T which appears in an expression decays (with three exceptions) into a pointer to its first element; the type of the resultant pointer is pointer-to-T.
(The exceptions are when the array is the operand of a sizeof or & operator, or is a literal string initializer for a character array.)
How should I understand the case where the array is a "literal string initializer for a character array"? Some examples, please.
Thanks!
The three exceptions where an array does not decay into a pointer are the following:
Exception 1. — When the array is the operand of sizeof.
int main()
{
int a[10];
printf("%zu", sizeof(a)); /* prints 10 * sizeof(int) */
int* p = a;
printf("%zu", sizeof(p)); /* prints sizeof(int*) */
}
Exception 2. — When the array is the operand of the & operator.
int main()
{
int a[10];
printf("%p", (void*)(&a)); /* prints the array's address */
int* p = a;
printf("%p", (void*)(&p)); /*prints the pointer's address */
}
Exception 3. — When the array is initialized with a literal string.
int main()
{
char a[] = "Hello world"; /* the literal string is copied into a local array, which is destroyed when it goes out of scope */
char* p = "Hello world"; /* p points to the literal itself, which has static storage, often read-only (any attempt to modify it is undefined behavior) */
}
Assume the declarations
char foo[] = "This is a test";
char *bar = "This is a test";
In both cases, the type of the string literal "This is a test" is "15-element array of char". Under most circumstances, array expressions are implicitly converted from type "N-element array of T" to "pointer to T", and the expression evaluates to the address of the first element of the array. In the declaration for bar, that's exactly what happens.
In the declaration for foo, however, the expression is being used to initialize the contents of another array, and is therefore not converted to a pointer type; instead, the contents of the string literal are copied to foo.
This is a literal string initializer for a character array:
char arr[] = "literal string initializer";
By contrast, the following initializes a pointer rather than a character array, so the literal does decay here:
char* str = "literal string initializer";
Definition from K&R2:
A string literal, also called a string
constant, is a sequence of characters
surrounded by double quotes as in
"...". A string has type ``array of
characters'' and storage class static
(see Par.A.3 below) and is initialized
with the given characters. Whether
identical string literals are distinct
is implementation-defined, and the
behavior of a program that attempts to
alter a string literal is undefined.
It seems like you pulled that quote from the comp.lang.c FAQ (maybe an old version or maybe the printed version; it doesn't quite match with the current state of the online one):
http://c-faq.com/aryptr/aryptrequiv.html
The corresponding section links to other sections of the FAQ to elaborate on those exceptions. In your case, you should look at:
http://c-faq.com/decl/strlitinit.html