How to work with strings and pointers in C?

Why is it p+=*string[2]-*string[1] instead of p+=string[2]-string[1] (without asterisks)?
After res[0]=*p; its value is 'c'. Why? Why does p move along the letters of a word, while *string[1] moves among the words and not the letters?
Edit: I have edited the code. Can "char* p=string;" be replaced by "char p=*string[0];"?
The code's output is
c
int main (void)
{
char* strings[]={"abcdb","bbb","dddd"};
char*p=*strings;
char res[2];
p+=*string[2]-*string[1];
res[0]=*p;
p+=3;
res[1]=*p;
printf("%s\n",res);
return 0;
}

The program has undefined behaviour.
Let's set aside the typo where the identifier string is sometimes used instead of the identifier strings, and assume that strings is used everywhere in the code snippet.
In this statement
char*p=*strings;
the first element of the array is assigned to the pointer. The first element of the array is pointer to the first character of string literal "abc". So p points to character 'a' of the string literal.
In this statement
p += *strings[2] - *strings[1];
strings[2] is the third element of the array. It has type char * and its value is the address of the first character of string literal "dddd". Dereferencing this pointer with *strings[2] gives the first character of this string literal, that is 'd'.
strings[1] is the second element of the array. It has type char * and its value is the address of the first character of string literal "bbb". Dereferencing this pointer with *strings[1] gives the first character of this string literal, that is 'b'.
The difference between the internal codes of characters 'd' and 'b' (for example, in ASCII the code of character 'b' is 98 while the code of 'd' is 100) is equal to 2.
So this statement
p += *strings[2]-*strings[1];
increases the pointer by 2. At first it pointed to character 'a' of the first string literal "abc" and after increasing by 2 it points now to character 'c' of the same string literal "abc".
Thus in this statement
res[0] = *p;
character 'c' is assigned to res[0].
After this statement
p+=3;
the value of p becomes invalid because it now points beyond the string literal "abc" and it is not necessary that the compiler placed string literal "bbb" exactly after string literal "abc".
So dereferencing this pointer in the next statement
res[1]=*p;
results in undefined behaviour.
According to the C Standard
If the result points one past the last element of the array object, it
shall not be used as the operand of a unary * operator that is
evaluated
It just so happened that the compiler placed the string literals one after another in memory, though this is not guaranteed by the Standard.
So if after statement
res[1]=*p;
res[1] does not contain character '\0' then the next statement
printf("%s\n",res);
also has undefined behaviour.

First, there is nothing named string defined in your program, so your statement should be p+=*strings[2]-*strings[1]
The answer to both of your questions is dereferencing a pointer. You need to understand how pointers work with strings.

Related

Array of pointers to char in C

I am confused with how an array of pointer to char works in C.
Here is a sample of the code which I am using to understand the array of pointers to char.
int main() {
char *d[]={"hi","bye"};
int a;
a = (d[0]=="hi") ? 1:0;
printf("%d\n",a);
return 0;
}
I am getting a = 1, so d[0]="hi". What confuses me is that, since d is an array of char pointers, shouldn't d[0] be equal to the address of the h of the "hi" string?
C 2018 6.4.5 specifies the behavior of string literals. Paragraph 6 specifies that a string literal in source code causes the creation of an array of characters. When an array is used in an expression other than as the operand of sizeof, the operand of unary &, or as a string literal used to initialize an array, it is automatically converted to a pointer to its first character. So char *d[]={"hi","bye"}; initializes d[0] and d[1] to point to the first characters of "hi" and "bye", and d[0]=="hi" compares d[0] to a pointer to the first character of another "hi".
Paragraph 7 says that the same array may be used for identical string literals:
It is unspecified whether these arrays are distinct provided their elements have the appropriate values…
Thus, when your compiler is taking the address of the first element of "hi" in d[0]=="hi", it may, but is not required to, use the same memory for "hi" as it did when initializing d[0] in char *d[]={"hi","bye"};.
(Note that the paragraph also allows the same memory to be used for string literals identical to the substrings at the ends of other string literals. For example, "phi" and "hi" could share memory.)

Use of pointers to store character strings

I started learning pointers in C. I understood it fine until I came across the topic "Using Pointers to store character arrays".
A sample program to highlight my doubt is as follows
#include <stdio.h>
main()
{
char *string;
string = "good";
printf ("%s", string);
}
This prints the character string, i.e, good.
Pointers are supposed to store memory addresses, or in other words, we assign the address of a variable (using the address operator) to a pointer variable.
What I don't understand is how are we able to assign a character string directly to the pointer? That too without address operator?
Also, how are we able to print the string without the indirection operator (*) ?
A literal string like "good" is really stored as a (read-only) array of characters. Also, all strings in C must be terminated with a special "null" character '\0'.
When you do the assignment
string = "good";
what is really happening is that you make string point to the first character in that array.
Functions handling strings know how to deal with pointers like that, and know how to loop over such arrays using the pointer to find all the characters in the string until they find the terminator.
Looking at it a little differently, the compiler creates its array
char internal_array[] = { 'g', 'o', 'o', 'd', '\0' };
then you make string point to the first element in the array
string = &internal_array[0];
Note that &internal_array[0] is actually equal to internal_array, since arrays naturally decay to pointers to their first element.
"cccccc" is a string literal, which is actually a char array stored in read-only memory. You assign the address of the first character of this literal to the pointer.
If you want to copy the string literal to RAM, you need to write:
char string[] = "fgdfdfgdfgf";
Bear in mind that the array initialization (when you declare it) is the only place where you can use = to copy a string literal into a char array (string).
In any other circumstances you need to use an appropriate library function, for example:
strcpy(string, "asdf");
(the string has to have enough space to accommodate the new string)
What I don't understand is how are we able to assign a character string directly to the pointer? That too without address operator?
When an array is assigned to something, the array is converted to a pointer.
"good" is a string literal. It has type array 5 of char, which includes a trailing null character. It exists in memory that should not be written to. Attempting to write is undefined behavior (UB). It might "work", it might not. Code may die, etc.
char *string; declares string as a pointer to char.
string = "good"; causes an assignment. The operation takes "good" and converts that array to the address and type (char*) of its first element 'g'. Then it assigns that char * to string.
Also, how are we able to print the string without the indirection operator (*) ?
printf() expects a char * - which matches the type of string.
printf ("%s", string); passes string to printf() as a char * - no conversion is made. printf ("%s",... expects to see a "... the argument shall be a pointer to the initial element of an array of character type." then "Characters from the array are written up to (but not including) the terminating null character." C11 §7.21.6.1 8.
Your first question:
What I don't understand is how are we able to assign a character string directly to the pointer? That too without address operator?
A character string literal is a sequence of zero or more multibyte characters enclosed in double-quotes, e.g. "good".
From C Standard#6.4.5 [String literals]:
...The multibyte character sequence is then used to initialize an array of static storage duration and length just sufficient to contain the sequence. For character string literals, the array elements have type char, and are initialized with the individual bytes of the multibyte character sequence.....
In C, an expression that has type array of type is converted to an expression with type pointer to type that points to the initial element of the array object [there are a few exceptions]. Hence, the string literal, which is an array, decays into a pointer which can be assigned to the type char *.
In the statement:
string = "good";
string will point to the initial character in the array where "good" is stored.
Your second question:
Also, how are we able to print the string without the indirection operator (*) ?
From printf():
s
writes a character string
The argument must be a pointer to the initial element of an array of characters...
So, the format specifier %s expects a pointer to the initial element, which is what the variable string is: a pointer to the initial character of "good". Hence, you don't need the indirection operator (*).

How can a character pointer without any address specification hold data?

The following C program is not supposed to work by my understanding of pointers but it does.
#include<stdio.h>
main() {
char *p;
p = "abcdefghijk";
printf("%s", p);
}
Outputs:
abcdefghijk
The char pointer variable p is pointing to something random as I have not assigned any address to it like p = &i; where i is some char array.
That means if I try to write anything to the memory address held by the pointer p it should give me segmentation fault since it is some random address not assigned to my program by the OS.
But the program compiles and runs successfully. What is happening?
In this expression statement
p="abcdefghijk";
the pointer p is assigned with the address of the first character of the string literal "abcdefghijk" that the compiler stores as a zero-terminated character array in the static memory area.
Thus in this statement there are two things that happen. At first the compiler creates an unnamed character array with the static storage duration to hold the string literal. Then the address of the first character of the array is assigned to the pointer. You can imagine it the following way
char unnamed[] = { 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', '\0' };
p = unnamed;
or
p = &unnamed[0];
Note that although string literals in C have types of non-constant character arrays (unlike C++, where they have types of constant character arrays), you may not change string literals. Any attempt to change a string literal results in undefined behavior.
So this code snippet is invalid
char *p = "abcdefghijk";
p[0] = 'A';
But you could create your own character array initializing it with the string literal and in this case you can change the array. For example
char s[] = "abcdefghijk";
char *p = s;
p[0] = 'A';
From the C Standard (6.4.5 String literals)
7 It is unspecified whether these arrays are distinct provided their
elements have the appropriate values. If the program attempts to
modify such an array, the behavior is undefined.
Pay attention to this part of the quote
It is unspecified whether these arrays are distinct provided their
elements have the appropriate values.
It means, for example, that if you write
char *p = "abcdefghijk";
char *q = "abcdefghijk";
then this expression is not guaranteed to yield true (integer value 1)
p == q
and the result depends on whether the compiler, under its current options, stores identical string literals as one array or as distinct arrays.
In C a string literal like "abcdefghijk" is actually stored as an (read-only) array of characters. The assignment makes p point to the first character of that array.
I note that you mention p = &i where i would be an array. That is in most cases wrong. Arrays naturally decay to pointers to their first element, i.e. doing p = i would be equal to p = &i[0].
While both &i and &i[0] would result in the same address, they are semantically very different. Let's take an example:
char array[10];
With the above definition doing &array[0] (or just plain array as explained just above) you get a pointer to char, i.e. char *. When doing &array you get a pointer to an array of ten characters, i.e. char (*)[10]. The two types are very different.
"abcdefghijk" is a string constant, and p="abcdefghijk"; will give to p adress of this string.
So it's normal that printf("%s",p); displays this string without error.
p="abcdefghijk";
You are creating a string literal in code segment and assigning the address of first character of the literal to the pointer, and as the pointer is not constant you can assign it again with different addresses.
The string literal "abcdefghijk" is compiled by putting the characters in a block in the program's data segment. Then your assignment of it to the pointer assigns the address of its location in the data segment to the pointer.

Is there a null character added after the string literal even if the bound is not correct?

Is there any null character after the character c in memory:
char a[3]="abc";
printf("the value of the character is %.3s\n",a);
printf("the value of the character is %s\n",a);
Which line is correct ?
char a[3] = "abc"; is well-formed; the three elements of the array will be the characters 'a', 'b', and 'c'. There will not be a NUL terminator. (There might still happen to be a zero byte in memory immediately after the storage allocated to the array, but if there is, it is not part of the array. printf("%s", a) has undefined behavior.)
You might think that this violates the normal rule for when the initializer is too long for the object, C99 6.7.8p2
No initializer shall attempt to provide a value for an object not contained within the entity
being initialized.
That's a "shall" sentence in a "constraints" section, so a program that violates it is ill-formed. But there is a special case for when you initialize a char array with a string literal: C99 6.7.8p14 reads
An array of character type may be initialized by a character string literal, optionally
enclosed in braces. Successive characters of the character string literal (including the
terminating null character if there is room or if the array is of unknown size) initialize the
elements of the array.
The parenthetical overrides 6.7.8p2 and specifies that in this case the terminating null character is discarded.
There is a similar special case for initializing a wchar_t array with a wide string literal.

Assigning a value to an index of a pointer

I try to assign a value to the second index of pointer, it gives me a warning
"[Warning] assignment makes integer from pointer without a cast"
and it doesn't run this program. I wonder, where am I making a mistake?
#include<stdio.h>
void main()
{
char *p="John";
*(p+2)="v";
printf("%s",p);
}
Firstly, In your code,
char *p="John";
p points to a string literal, and any attempt to modify a string literal invokes undefined behavior.
Related, C11, chapter §6.4.5
[...] If the program attempts to modify such an array, the behavior is
undefined.
If you want to modify it, you need an array, like
char p[]="John";
Secondly, *(p+2)="v"; is wrong, as "" denotes a string, whereas you need a char (Hint: check the type of *(p+2)). Change that to
*(p+2)='v';
To elaborate the difference, quoting C11, chapter §6.4.4.4, for Character constants
An integer character constant is a sequence of one or more multibyte characters enclosed
in single-quotes, as in 'x'.
and chapter §6.4.5, string literals
A character string literal is a sequence of zero or more multibyte characters enclosed in
double-quotes, as in "xyz".
Thirdly, as per the C standards, void main() should at least be int main(void).
You are assigning a pointer to a string literal to an element of another string literal.
First, you need to change the pointer and make it an array, so any modification is legal, this is an example
char p[] = "John";
then you need to replace "v", which is a string literal consisting of two characters 'v' and '\0', with 'v', which is the ASCII value for the letter v as an integer
*(p + 2) = 'v';
also, this is the third element and not the second, because the first element is p[0].
You're making two mistakes.
First, you are attempting to modify the contents of a string literal; this invokes undefined behavior, meaning any of the following are possible (and considered equally correct): your code may crash, it may run to completion with no issues, it may run to completion but not alter the string literal, it may leave your system in a bad state, etc. Your compiler may reject the code completely, although I don't know of any compiler that does so.
If you want to be able to modify the contents of a string, then you need to set aside storage for that string, instead of just pointing to a string literal:
char p[] = "John"; // copies the contents of the string literal "John" to p
// p is sized automatically based on the length of the
// string literal
Here's a hypothetical memory map showing the result of that declaration and initialization:
Address     Item    0x00  0x01  0x02  0x03
-------     ----    ----  ----  ----  ----
0x8000      "John"  'J'   'o'   'h'   'n'
0x8004              0x00  0x??  0x??  0x??   // 0x?? represents a random byte value
...
0xfffdc100  p       'J'   'o'   'h'   'n'
0xfffdc104          0x00  0x??  0x??  0x??
The string literal that lives at 0x8000 should not be modified; the string that lives in p at 0xfffdc100 may be modified (although the buffer will only have enough space to store up to a 4-character string).
Your second mistake (and the one causing the compiler to complain) is in this line:
*(p+2)="v";
The expression "v" is a string literal and has type "2-element array of char"; since it's not the operand of the sizeof or unary & operators, the type "decays" to "pointer to char", and the value of the expression is the address of the string literal "v".
The expression *(p + 2) has type char (it resolves to a single character value), which is an integral type, not a pointer type. You cannot assign pointer values to non-pointer objects.
You can easily fix that by changing that line to
*(p + 2) = 'v'; // note single quote vs double quote
This time, instead of trying to assign the address of a string literal to p[2], you're assigning the value of a single character.

Resources