How to terminate a character pointer at a certain location in C?

I'm trying to terminate a character pointer in C at a specific location by writing a null terminator there.
For example, if I have a char pointer
char *hi="hello";
I want it to be "hell" by setting the o to null.
I have tried doing this with strcpy with something like
strcpy(hi+4, "\0");
But it is not working.

"hello" is a string literal, so it cannot be modified, and in your code hi points to the first element of that literal. Any attempt to modify what it points to is undefined behaviour.
However, if you create your own char array, you can insert a null terminator at will. For example,
char hi[] = "hello"; // hi is array with {'h', 'e', 'l', 'l', 'o', '\0'}
hi[4] = '\0';
Here, hi is a length 6 array of char which you own and whose contents you can modify. After setting the 5th element, it contains {'h', 'e', 'l', 'l', '\0', '\0'}, and printing it would yield hell.

Point 1:
In your code
char *hi="hello";
hi is a pointer to a string literal, which may not be modifiable. You have to use a char array instead and initialize it with the same string literal. Then you can modify the contents of that array as you like.
Point 2:
You don't need strcpy() to copy a single char. You can simply assign the value using the assignment operator =.
Note: You don't terminate a pointer; you terminate a char array with a null terminator to make it a string.

If the string is a literal you can't modify it. Otherwise:
To terminate a C string after 4 characters you could use:
*(he+4) = 0;
or
he[4] = 0;
he[4] = '\0';
or, since strcpy() copies all the characters specified and then appends a '\0' character:
strcpy(he+4, "");
but this is rather obfuscated.

Related

Shouldn't it be impossible to point directly to text in C?

I am learning C and I came across the pointers.
Even though I learned more with this tutorial than from the textbook I still wonder about the char pointers.
If I program this
#include <stdio.h>
int main()
{
char *ptr_str;
ptr_str = "Hello World";
printf(ptr_str);
return 0;
}
The result is
Hello World
I don't understand how there isn't an error while compiling since the pointer ptr_str is pointing directly to the text and not to the first character of the text. I thought that only this would work
#include <stdio.h>
int main()
{
char *ptr_str;
char var_str[] = "Hello World";
ptr_str = var_str;
printf(ptr_str);
return 0;
}
So in the first example how was I pointing directly to the text?
Your code works because string literals are essentially static arrays.
ptr_str = "Hello World";
is treated by the compiler as if it were
static char __tmp_0[] = {'H', 'e', 'l', 'l', 'o', ' ', 'W', 'o', 'r', 'l', 'd', '\0' };
ptr_str = __tmp_0;
(except trying to modify the contents of a string literal has undefined behavior).
You can even apply sizeof to a string literal and you'll get the size of the array: sizeof "Hello" is 6, for example.
In the context of assignment to a char pointer the 'value' of a string literal is the address of its first character.
so
ptr_str = "Hello World";
sets ptr_str to the address of the 'H'
Why won't the first one work? It will work as you have seen.
String literals are arrays. From §6.4.5p6 C11 Standard N1570
The multibyte character sequence is then used to initialize an array of static storage duration and length just sufficient to contain the sequence. For character string literals, the array elements have type char, and are initialized with the individual bytes of the multibyte character sequence.
In the first case the literal's array decays into a pointer to its first element, so that pointer points at 'H'. You assigned that pointer to ptr_str. printf then treats the text it points to as its format string; because the text contains no conversion specifiers, printf prints every character until it reaches the \0. (Passing the text as an argument, as in printf("%s", ptr_str), is the safer form.) That is how you ended up "pointing directly to the text".
Note that the second case is quite different from the first: a copy is made, and that copy can be modified (trying to modify the first one would be undefined behavior). We are initializing a char array with the contents of the string literal.

Are C constant character strings always null terminated?

Are C constant character strings always null terminated without exception?
For example, will the following C code always print "true":
const char* s = "abc";
if( *(s + 3) == 0 ){
printf( "true" );
} else {
printf( "false" );
}
A string is only a string if it contains a null character.
A string is a contiguous sequence of characters terminated by and including the first null character. C11 §7.1.1 1
"abc" is a string literal. It also always contains a null character. A string literal may contain more than 1 null character.
"def\0ghi" // 2 null characters.
In the following, though, x is not a string (it is an array of char without a null character). y and z are both arrays of char and both are strings.
char x[3] = "abc";
char y[4] = "abc";
char z[] = "abc";
With OP's code, s points to a string, the string literal "abc", and both *(s + 3) and s[3] have the value 0. Attempting to modify s[3] is invalid because 1) s is a const char * and 2) the data pointed to by s is a string literal, and attempting to modify a string literal is undefined behavior.
const char* s = "abc";
Deeper: C does not define "constant character strings".
The language defines a string literal, like "abc" to be a character array of size 4 with the value of 'a', 'b', 'c', '\0'. Attempting to modify these is UB. How this is used depends on context.
The standard C library defines string.
With const char* s = "abc";, s is a pointer to data of type char. As a const some_type * pointer, using s to modify data is UB. s is initialized to point to the string literal "abc". s itself is not a string. The memory s initially points to is a string.
In short, yes. A string constant is of course a string and a string is by definition 0-terminated.
If you use a string constant as an array initializer like this:
char x[5] = "hello";
you won't have a 0 terminator in x simply because there's no room for it.
But with
char x[] = "hello";
it will be there and the size of x is 6.
A string is defined as a sequence of characters terminated by a zero character. It does not matter whether the sequence is modifiable, that is, whether the corresponding declaration has the const qualifier or not.
For example, string literals in C have non-const character array types. So you may write, for example,
char *s = "Hello world";
In this declaration the identifier s points to the first character of the string.
You can initialize a character array yourself by a string using a string literal. For example
char s[] = "Hello world";
This declaration is equivalent to
char s[] = { 'H', 'e', 'l', 'l', 'o', ' ', 'w', 'o', 'r', 'l', 'd', '\0' };
However in C you may exclude the terminating zero from an initialization of a character array.
For example
char s[11] = "Hello world";
Though the string literal used as the initializer contains the terminating zero, it is excluded from the initialization. As a result, the character array s does not contain a string.
In C, there isn't really a "string" datatype like in C++ and Java.
Important principle that every competent computer science degree program should mention: Information is symbols plus interpretation.
A "string" is defined conventionally as any sequence of characters ending in a null byte ('\0').
The "gotcha" that's being posted (character/byte arrays with the value 0 in the middle of them) is only a difference of interpretation. Treating a byte array as a string versus treating it as bytes (numbers in [0, 255]) has different applications. Obviously if you're printing to the terminal you might want to print characters until you reach a null byte. If you're saving a file or running an encryption algorithm on blocks of data you will need to support 0's in byte arrays.
It's also valid to take a "string" and optionally interpret as a byte array.

is array of character exactly same as string in c

This is something that has bugged me for quite a while.
I am trying to declare an array of char. I am aware of the fact that a string is an array of char,
but what I want to know is that when I declare something for example an array of characters
note I meant characters it is not a string like
char alphabet[26]={"a", "b" ,"c" ......"z"}
is that same as
char alphabet[]="abcd...z"
let's say I would do a bubble sort(I know is slow) to switch the alphabet order other way around is there any difference in handling those 2?
just really really curious.
No, a string and an array of char are quite different, though an array of char may contain a string.
An array is a data type. A string is a data layout.
An array is an object consisting of a contiguous sequence of some specified element type.
A string in C is, by definition, "a contiguous sequence of characters terminated by and including the first null character" (reference: N1570 7.1.1, paragraph 1).
Although the terminating null character is part of a string, the length of a string is defined as "the number of bytes preceding the null character".
For example:
char arr[10] = "hello";
The array arr has 10 elements, with values { 'h', 'e', 'l', 'l', 'o', '\0', '\0', '\0', '\0', '\0' }.
The first 6 bytes of the array object arr contain a string whose value is { 'h', 'e', 'l', 'l', 'o', '\0' }, or, equivalently, "hello".
As for your declarations, the first one is valid if you change the double quotes to single quotes:
char alphabet[26]={'a', 'b', 'c', ..., 'z'};
but the array alphabet doesn't contain a string because there's no terminating '\0' null character.
In your second declaration:
char alphabet[]="abcd...z";
alphabet is 27 bytes long, because a string literal implicitly specifies that there's a trailing null character.
One exception: If the length of the string literal is exactly the same as the specified size of the array:
char not_a_string[5] = "hello";
then there is no null terminator. This is rarely a good idea.

how does a pointer stores a string in memory in c

char *p = "hello";
printf("%c",*p); //output would be ***h***
printf("%s",p); //output would be ***hello***
At line 2, why do we have to use *p to print a char, while at line 3 we have to use p to print a string?
Also, at line 1, how is the string stored in memory?
An array is a contiguous block of memory filled sequentially with items of a specific type; in most expressions it is converted to a pointer to its first element.
p in this instance is a pointer to the first character in that array.
*p dereferences the pointer, which yields the item at that specific location (i.e. the first character in the string).
The most common string representation in C (and the one used by %s) is the null-terminated string, i.e. the string starts at p and continues until it hits a null (\0) value.
"hello" is typically stored in read-only memory on your system.
p is a pointer to the beginning of that "hello" memory, so it points at its first element.
*p returns the content of the memory p points to, and since p is the address of the first element of "hello", *p returns 'h'.
Strings are stored in memory as a sequence of characters terminated by a 0 character.
For example, if you have a string literal "Hello", that string would be stored in memory as a char array of {'H', 'e', 'l', 'l', 'o', '\0'}.
Because of this, if you access the array ( *array ), you only access the first character of the array. You can loop through the array until you find the \0 - char to know where the string ends.

Why strings are declared or defined using character pointer

I don't know why we always declare like this
char* name="Srimanth"
instead of
char name[]={"Srimanth"}
I am new to these things, so please be specific when you give me an answer.
thank you.
String literal is a special, simple, form of writing an array aggregate: you can write "hello" instead of {'h', 'e', 'l', 'l', 'o', '\0'} (note the terminating zero, which is added automatically).
Note that an array declaration is not only possible, but is sometimes desirable:
char str[] = "hello";
str[0] = 'H'; // OK
lets you modify the string, as opposed to
char *str = "hello";
str[0] = 'H'; // Undefined behavior
which does not allow modifications.
