We can assign a string in C as follows:
char *string;
string = "Hello";
printf("%s\n", string); // string
printf("%p\n", string); // memory-address
And a number can be done as follows:
int num = 4404;
int *nump = #
printf("%d\n", *nump);
printf("%p\n", nump);
My question then is, why can't we assign a pointer to a number in C just like we do with strings? For example, doing:
int *num;
num = 4404;
// and the rest...
What makes a string fundamentally different than other primitive types? I'm quite new to C so any explanation as to the difference between the two would be very helpful.
There is no such type as "string" in C. A string is not a primitive type. A string is just an array of characters, terminated by a NUL byte ('\0').
When you do this:
char *string;
string = "Hello";
What really happens is that the compiler is smart and creates a constant read only char array and then assigns it to your variable string. This can be done because in C the name of an array is the same as the pointer to its first element.
// This is placed in a different section:
const char hidden_arr[] = {'H', 'e', 'l', 'l', 'o', '\0'};
char *string;
string = hidden_arr;
// Same as:
string = &(hidden_arr[0]);
Here, hidden_arr and string are both char *, because as we just said the name of an array is equal to the pointer to its first element. Of course, all of this is done transparently, you will not actually see another variable named hidden_arr, that's just an example. In reality the string will be stored in some location in your executable without a name, and the address of that location will be copied to your string pointer.
When you try to do the same with an integer, it's wrong because int * and int are different types, and you cannot write this (well, you can, but it's meaningless and does not do what you expect it to):
int *ptr;
ptr = 123;
But, you can very well do it with an array of integers:
int arr[] = {1, 2, 3};
int *ptr;
ptr = arr;
// Same as:
ptr = &(arr[0]);
why can't we assign a pointer to a number in C just like we do with strings?
int *num;
num = 4404;
Code can do that if 4404 is a valid address for an int.
An integer may be converted to any pointer type. Except as previously specified, the
result is implementation-defined, might not be correctly aligned, might not point to an
entity of the referenced type, and might be a trap representation.
C11dr §6.3.2.3 5
If the address is not properly aligned --> undefined behavior (UB).
If the address is a trap --> undefined behavior (UB).
Attempting to de-reference the pointer is a problem unless it points to a valid int.
printf("%d\n", *num);
With below, "Hello" is a string literal. It exist someplace. The assignment take the address of the string literal and assigns that to string.
char *string;
string = "Hello";
The point is that that address assigned is known to be valid for a char *.
In the num = 4404; is not known to be valid (it likely is not).
What makes a string fundamentally different than other primitive types?
In C, a string is a C library specification, not a C language one. It is definition convenient to explaining various function therein.
A string is a contiguous sequence of characters terminated by and including the first null character §7.1.1 1
Primitive types are part of the C language.
The languages also has string literals like "Hello" in char *string; string = "Hello";. These have some similarity to strings, yet differ.
I recommend searching for "ISO/IEC9899:2017" to find a draft copy of the current C spec. It will answer many of your 10 question of the last week.
What makes a string fundamentally different than other primitive types?
A string seems like a primitive type in C because the compiler understands "foo" and generates a null-terminated character array: ['f', 'o', 'o', '\0']. But a C string is still just that: an array of characters.
My question then is, why can't we assign a pointer to a number in C just like we do with strings?
You certainly can assign a pointer to a number, it's just that a number isn't a pointer, whereas the value of an array is the address of the array. If you had an array of int, then that would work just like a string. Compare your code:
char *string;
string = "Hello";
printf("%s\n", string); // string
printf("%p\n", string); // memory-address
to the analogous code for an array of integers:
int numbers[] = {1, 2, 3, 4, 5, 0};
int *nump = numbers;
printf("%d\n", nump[0]); // string
printf("%p\n", nump); // memory-address
The only real difference is that the compiler has some extra syntax for arrays of characters because they're so common, and printf() similarly has a format specifier just for character arrays for the same reason.
The type pf a string literal (e.g. "hello world") is a char[]. Where assigning char *string = "Hello" means that string now points to the start of the array (e.g. the address of the first memory address in the array: &char[0]).
Whereas you can't assign an integer to a pointer because their types are different, one is a int the other is a pointer int *. You could cast it to the correct type:
int *num;
num = (int *) 4404;
But this would be considered quite dangerous (unless you really know what you are doing). I.e. do you know what is a memory adress 4404?
Related
I'm recollecting my C programming skills after 1 year. So I decided to start from scratch. I got stuck with poineters
why i can do :
char *string="Hello";
printf ("%s",string);
But i can't do :
int *pt=23;
printf("%d",*pt);
Doesn't the pointer need to be an address ? but why the first example works?
Because using string literals to initialize a character pointer or array is a special allowed syntax, with specific rules. In your case you set a pointer to point at a string literal's address, the string literal itself having type char[] and existing in read-only memory.
For the integer case, yes it needs to be an address, or more specifically another integer pointer. You can't initialize a pointer with an integer. See:
"Pointer from integer/integer from pointer without a cast" issues
The string literal "Hello" is stored in memory as a character array like
char unnamed_literal[] = { 'H', 'e', 'l', 'l', 'o', '\0' };
So in this declaration
char *string="Hello";
the pointer string is assigned with the address of the first character of the already existent array. It can be imaging like
char unnamed_literal[] = { 'H', 'e', 'l', 'l', 'o', '\0' };
char *string = unnamed_literal;;
As for this declaration
int *pt=23;
then the value 23 is not a valid address that points to a valid object defined in your program. The compiler should issue a message that you are trying to assign an integer to a pointer. Thus this call
printf("%d",*pt);
invokes undefined behavior.
To make an analogy with the initialization of a pointer by string literal you could write for example
int a[] = { 1, 2, 3 };
int *pt = a;
This is syntaxic sugar from standard C:
What is actually stored in your char pointer is the address at which the string is loaded in memory when starting the program.
For an integer however, it will try to set the address of the pointer to 23 which is (and should be) invalid.
For example an integer, like:
int VAL = 10;
int *ptr = &VAL; /* ptr now points to val */
*ptr = DESIRED_VALUE; /* *(asterisk) here means dereference, meaning that the value of VAL will change */
or
int *&VAL = DESIRED_VALUE;
int variable_1 = 23;
int *pointer_to_int = &variable_1;
The first line is an integer variable, initialized to an integer value.
The second line is a pointer to integer variable, initialized to an address of an integer variable (the address of the variable variable_1, defined in the previous declaration.
char *p = "Charles Dickens";
This is a char pointer variable, initialized to the address of a string literal, which is a constant array of characters, composed of the characters between the quotes, plus an aditional \0 character, that represents the end of the string literal.
char c = 'a'; /* char variable initialized to value 'a' */
char *pointer_to_c = &c;
Probably you will find this last declaration very similar to the one above. The different thing is that in C you can use string literals, but a single character is similar to an integer, and this is the way pointer variables initialize: with variables addresses.
The following C program is not supposed to work by my understanding of pointers but it does.
#include<stdio.h>
main() {
char *p;
p = "abcdefghijk";
printf("%s", p);
}
Outputs:
abcdefghijk
The char pointer variable p is pointing to something random as I have not assigned any address to it like p = &i; where i is some char array.
That means if I try to write anything to the memory address held by the pointer p it should give me segmentation fault since it is some random address not assigned to my program by the OS.
But the program compiles and runs successfully. What is happening?
In this expression statement
p="abcdefghijk";
the pointer p is assigned with the address of the first character of the string literal "abcdefghijk" that the compiler stores as a zero-terminated character array in the static memory area.
Thus in this statement there are two things that happen. At first the compiler creates an unnamed character array with the static storage duration to hold the string literal. Then the address of the first character of the array is assigned to the pointer. You can imagine it the following way
char unnamed[] = { 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', '\0' };
p = unnamed;
or
p = &unnamed[0];
Take into account that though string literals in C have types of non-constant character arrays opposite to C++ where they have types of constant character arrays nevertheless you may not change string literals. Any attempt to change a string literal results in undefined behavior.
So this code snippet is invalid
char *p = "abcdefghijk";
p[0] = 'A';
But you could create your own character array initializing it with the string literal and in this case you can change the array. For example
char s[] = "abcdefghijk";
char *p = s;
p[0] = 'A';
From the C Standard (6.4.5 String literals)
7 It is unspecified whether these arrays are distinct provided their
elements have the appropriate values. If the program attempts to
modify such an array, the behavior is undefined.
Pay attention to this part of the quote
It is unspecified whether these arrays are distinct provided their
elements have the appropriate values.
It means that for example if you will write
char *p = "abcdefghijk";
char *q = "abcdefghijk";
then it is not necessary that this expression yields true (integer value 1)
p == q
and the result depends on compiler options whether the same string literals are stored as one array or as distinct arrays.
In C a string literal like "abcdefghijk" is actually stored as an (read-only) array of characters. The assignment makes p point to the first character of that array.
I note that you mention p = &i where i would be an array. That is in most cases wrong. Arrays naturally decays to pointers to their first element. I.e. doing p = i would be equal to p = &i[0].
While both &i and &i[0] would result in the same address, it is semantically very different. Lets take an example:
char array[10];
With the above definition doing &array[0] (or just plain array as explained just above) you get a pointer to char, i.e. char *. When doing &array you get a pointer to an array of ten characters, i.e. char (*)[10]. The two types are very different.
"abcdefghijk" is a string constant, and p="abcdefghijk"; will give to p adress of this string.
So it's normal that printf("%s",p); display this string without error.
p="abcdefghijk";
You are creating a string literal in code segment and assigning the address of first character of the literal to the pointer, and as the pointer is not constant you can assign it again with different addresses.
The string literal "abcdefghijk" is compiled by putting the characters in a block in the program's datatext segment. Then your assignment of it to the pointer assigns the address of its location in the data segment to the pointer.
I've been looking through the site but haven't found an answer to this one yet.
It is easiest (for me at least) to explain this question with an example.
I don't understand why this is valid:
#include <stdio.h>
int main(int argc, char* argv[])
{
char *mystr = "hello";
}
But this produces a compiler warning ("initialization makes pointer from integer without a cast"):
#include <stdio.h>
int main(int argc, char* argv[])
{
int *myint = 5;
}
My understanding of the first program is that creates a variable called mystr of type pointer-to-char, the value of which is the address of the first char ('h') of the string literal "hello". In other words with this initialization you not only get the pointer, but also define the object ("hello" in this case) which the pointer points to.
Why, then, does int *myint = 5; seemingly not achieve something analogous to this, i.e. create a variable called myint of type pointer-to-int, the value of which is the address of the value '5'? Why doesn't this initialization both give me the pointer and also define the object which the pointer points to?
In fact, you can do so using a compound literal, a feature added to the language by the 1999 ISO C standard.
A string literal is of type char[N], where N is the length of the string plus 1. Like any array expression, it's implicitly converted, in most but not all contexts, to a pointer to the array's first element. So this:
char *mystr = "hello";
assigns to the pointer mystr the address of the initial element of an array whose contents are "hello" (followed by a terminating '\0' null character).
Incidentally, it's safer to write:
const char *mystr = "hello";
There are no such implicit conversions for integers -- but you can do this:
int *ptr = &(int){42};
(int){42} is a compound literal, which creates an anonymous int object initialized to 42; & takes the address of that object.
But be careful: The array created by a string literal always has static storage duration, but the object created by a compound literal can have either static or automatic storage duration, depending on where it appears. That means that if the value of ptr is returned from a function, the object with the value 42 will cease to exist while the pointer still points to it.
As for:
int *myint = 5;
that attempts to assign the value 5 to an object of type int*. (Strictly speaking it's an initialization rather than an assignment, but the effect is the same). Since there's no implicit conversion from int to int* (other than the special case of 0 being treated as a null pointer constant), this is invalid.
When you do char* mystr = "foo";, the compiler will create the string "foo" in a special read-only portion of your executable, and effectively rewrite the statement as char* mystr = address_of_that_string;
The same is not implemented for any other type, including integers. int* myint = 5; will set myint to point to address 5.
i'll split my answer to two parts:
1st, why char* str = "hello"; is valid:
char* str declare a space for a pointer (number that represents a memory address on the current architecture)
when you write "hello" you actually fill the stack with 6 bytes of data
(don't forget the null termination) lets say at address 0x1000 - 0x1005.
str="hello" assigns the start address of that 5 bytes (0x1000) to the *str
so what we have is :
1. str, which takes 4 bytes in memory, holds the number 0x1000 (points to the first char only!)
2. 6 bytes 'h' 'e' 'l' 'l' 'o' '\0'
2st, why int* ptr = 0x105A4DD9; isn't valid:
well, this is not entirely true!
as said before, a Pointer is a number that represent an address,
so why cant i assign that number ?
it is not common because mostly you extract addresses of data and not enter the address manually.
but you can if you need !!!...
because it isn't something that is commonly done,
the compiler want to make sure you do so in propose, and not by mistake and forces you to CAST your data as
int* ptr = (int*)0x105A4DD9;
(used mostly for Memory mapped hardware resources)
Hope this clear things out.
Cheers
"In C, why can't an integer value be assigned to an int* the same way a string value can be assigned to a char*?"
Because it's not even a similar situation, let alone "the same way".
A string literal is an array of chars which – being an array – can be implicitly converted to a pointer to its first element. Said pointer is a char *.
But an int is not either a pointer in itself, nor an array, nor anything else implicitly convertible to a pointer. These two scenarios just don't have anything in common.
The problem is that you are trying to assign the address 5 to the pointer. Here you are not dereferencing the pointer, you are declaring it as a pointer and initializing it to the value 5 (as an address which surely is not what you intend to do). You could do the following.
#include <stdio.h>
int main(int argc, char* argv[])
{
int *myint, b;
b = 5;
myint = &b;
}
A few questions regarding C strings:
both char* and char[] are pointers?
I've learned about pointers and I can tell that char* is a pointer, but why is it automatically a string and not just a char pointer that points to 1 char; why can it hold strings?
Why, unlike other pointers, when you assign a new value to the char* pointer you are actually allocating new space in memory to store the new value and, unlike other pointers, you just replace the value stored in the memory address the pointer is pointing at?
A pointer is not a string.
A string is a constant object having type array of char and, also, it has the property that the last element of the array is the null character '\0' which, in turn, is an int value (converted to char type) having the integer value 0.
char* is a pointer, but char[] is not. The type char[] is not a "real" type, but an incomplete type. The C language is specified in such a way that, in the moment that you define a concrete variable (object) having array of char type, the size of the array is well determined in some way or another. Thus, none variable has type char[] because this is not a type (for a given object).
However, automatically every object having type array of N objects of type char is promoted to char *, that is, a pointer to char pointing to the initial object of the array.
On the other hand, this promotion is not always performed. For example, the operator sizeof() will give different results for char* than for an array of N chars. In the former case, the size of a pointer to char is given (which is in general the same amount for every pointer...), and in the last case gives you the value N, that is, the size of the array.
The behaviour is differente when you declare function arguments as char* and char[]. Since the function cannot know the size of the array, you can think of both declarations as equivalent.
Actually, you are right here: char * is a pointer to just 1 character object. However, it can be used to access strings, as I will explain you now: In the paragraph 1. I showed you that the strings are considered objects in memory having type array of N chars for some N. This value N is big enough to allow an ending null character (as all "string" is supposed to be in C).
So, what's the deal here?
The key point to understand this issues is the concept of object (in memory).
When you have a string or, more generally, an array of char, this means that you have figured out some manner to hold an array object in memory.
This object determines a portion of RAM memory that you can access safely, because C has assigned enough memory for it.
Thus, when you point to the first byte of this object with a char* variable, actually you have guaranteed access to all the adjacent elements to the "right" of that memory place, because those places are well defined by C as having the bytes of the array above.
Briefly: the adjacent (to the right) bytes of the byte pointed by a char* variable can be accessed, they are valid places to access, so the pointer can be "iterated" to walk through these bytes, up to the end of the string, without "risks", since all the bytes in an array are contiguous well defined positions in memory.
This is a complicated question, but it reveals that you are not understanding the relationship between pointers, arrays, and string literals in C.
A pointer is just a variable pointing to a position in memory.
A pòinter to char points to just 1 object having type char.
If the adjacent bytes of the pointed position correspond to an array of chars, they will be accessible by the pointer, so the pointer can "walk on" the memory bytes occupied by the array object.
A string literal is considered as an array of char object, which implictely add an ending byte with value 0 (the null character).
In any case, an array of T object has a well defined "size".
A string literal has an additional property: it's a constant object.
Try to fit and gather these concepts in your mind to figure out what's going on.
And ask me for clarification.
ADDITIONAL REMARKS:
Consider the following piece of code:
#include <stdio.h>
int main(void)
{
char *s1 = "not modifiable";
char s2[] = "modifiable";
printf("%s ---- %s\n\n", s1, s2);
printf("Size of array s2: %d\n\n", (int)sizeof(s2));
s2[1] = '0', s2[3] = s2[5] = '1', s2[4] = '7',
s2[6] = '4', s2[7] = '8', s2[9] = '3';
printf("New value of s2: %s\n\n",s2);
//s1[0] = 'X'; // Attempting to modify s1
}
In the definition and initialization of s1 we have the string literal "not modifiable", which has constant content and constant address. Its address is assigned to the pointer s1 as initialization.
Any attempt to modify the bytes of the string will give some kind of error, because the array content is read-only.
In the definition and initialization of s2, we have the string literal "modifiable", which has, again, constant content and constant address. However, what happens now is that, as part of the initialization, the content of the string is copied to the array of char s2. The size of the array s2 is not specified (the declaration char s2[] gives an incomplete type), but after initialization the size of the array is well determined and defined as the exact size of the copied string (plus 1 character used to hold the null character, or end-of-string mark).
So, the string literal "modifiable" is used to initialize the bytes of the array s2, which is modifiable.
The right manner to do that is by changing a character at the time.
For more handy ways of modifying and assigning strings, it has to be used the standard header <string.h>.
char *s is a pointer, char s[] is an array of characters. Ex.
char *s = "hello";
char c[] = "world";
s = c; //Legal
c = address of some other string //Illegal
char *s is not a string; it points to an address. Ex
char c[] = "hello";
char *s = &c[3];
Assigning a pointer is not creating memory; you are pointing to memory. Ex.
char *s = "hello";
In this example when you type "hello" you are creating special memory to hold the string "hello" but that has nothing to do with the pointer, the pointer simply points to that spot.
#include <stdio.h>
main()
{
char * ptr;
ptr = "hello";
printf("%p %s" ,"hello",ptr );
getchar();
}
Hi, I am trying to understand clearly how can arrays get assign in to pointers. I notice when you assign an array of chars to a pointer of chars ptr="hello"; the array decays to the pointer, but in this case I am assigning a char of arrays that are not inside a variable and not a variable containing them ", does this way of assignment take a memory address specially for "Hello" (what obviously is happening) , and is it possible to modify the value of each element in "Hello" wich are contained in the memory address where this array is stored. As a comparison, is it fine for me to assign a pointer with an array for example of ints something as vague as thisint_ptr = 5,3,4,3; and the values 5,3,4,3 get located in a memory address as "Hello" did. And if not why is it possible only with strings? Thanks in advanced.
"hello" is a string literal. It is a nameless non-modifiable object of type char [6]. It is an array, and it behaves the same way any other array does. The fact that it is nameless does not really change anything. You can use it with [] operator for example, as in "hello"[3] and so on. Just like any other array, it can and will decay to pointer in most contexts.
You cannot modify the contents of a string literal because it is non-modifiable by definition. It can be physically stored in read-only memory. It can overlap other string literals, if they contain common sub-sequences of characters.
Similar functionality exists for other array types through compound literal syntax
int *p = (int []) { 1, 2, 3, 4, 5 };
In this case the right-hand side is a nameless object of type int [5], which decays to int * pointer. Compound literals are modifiable though, meaning that you can do p[3] = 8 and thus replace 4 with 8.
You can also use compound literal syntax with char arrays and do
char *p = (char []) { "hello" };
In this case the right-hand side is a modifiable nameless object of type char [6].
The first thing you should do is read section 6 of the comp.lang.c FAQ.
The string literal "hello" is an expression of type char[6] (5 characters for "hello" plus one for the terminating '\0'). It refers to an anonymous array object with static storage duration, initialized at program startup to contain those 6 character values.
In most contexts, an expression of array type is implicitly converted a pointer to the first element of the array; the exceptions are:
When it's the argument of sizeof (sizeof "hello" yields 6, not the size of a pointer);
When it's the argument of _Alignof (a new feature in C11);
When it's the argument of unary & (&arr yields the address of the entire array, not of its first element; same memory location, different type); and
When it's a string literal in an initializer used to initialize an array object (char s[6] = "hello"; copies the whole array, not just a pointer).
None of these exceptions apply to your code:
char *ptr;
ptr = "hello";
So the expression "hello" is converted to ("decays" to) a pointer to the first element ('h') of that anonymous array object I mentioned above.
So *ptr == 'h', and you can advance ptr through memory to access the other characters: 'e', 'l', 'l', 'o', and '\0'. This is what printf() does when you give it a "%s" format.
That anonymous array object, associated with the string literal, is read-only, but not const. What that means is that any attempt to modify that array, or any of its elements, has undefined behavior (because the standard explicitly says so) -- but the compiler won't necessarily warn you about it. (C++ makes string literals const; doing the same thing in C would have broken existing code that was written before const was added to the language.) So no, you can't modify the elements of "hello" -- or at least you shouldn't try. And to make the compiler warn you if you try, you should declare the pointer as const:
const char *ptr; /* pointer to const char, not const pointer to char */
ptr = "hello";
(gcc has an option, -Wwrite-strings, that causes it to treat string literals as const. This will cause it to warn about some C code that's legal as far as the standard is concerned, but such code should probably be modified to use const.)
#include <stdio.h>
main()
{
char * ptr;
ptr = "hello";
//instead of above tow lines you can write char *ptr = "hello"
printf("%p %s" ,"hello",ptr );
getchar();
}
Here you have assigned string literal "hello" to ptr it means string literal is stored in read only memory so you can't modify it. If you declare char ptr[] = "hello";, then you can modify the array.
Say what?
Your code allocates 6 bytes of memory and initializes it with the values 'h', 'e', 'l', 'l', 'o', and '\0'.
It then allocates a pointer (number of bytes for the pointer depends on implementation) and sets the pointer's value to the start of the 5 bytes mentioned previously.
You can modify the values of an array using syntax such as ptr[1] = 'a'.
Syntactically, strings are a special case. Since C doesn't have a specific string type to speak of, it does offer some shortcuts to declaring them and such. But you can easily create the same type of structure as you did for a string using int, even if the syntax must be a bit different.