I understand that in C, a string is an array of characters with a special '\0' character at the end of the array.
Say I have "Hello" stored in a char* named string and there is a '\0' at the end of the array.
When I call printf("%s\n", string);, it would print out "Hello".
My question is, what happens to '\0' when you call printf on a string?
The null character ('\0') at the end of a string is simply a sentinel value for C library functions to know where to stop processing a string pointer.
This is necessary for two reasons:
Arrays decay to pointers to their first element when passed to functions
It's entirely possible to have a string in an array of chars that doesn't use up the entire array.
For example, strlen, which determines the length of the string, might be implemented as:
size_t strlen(char *s)
{
size_t len = 0;
while(*s++ != '\0') len++;
return len;
}
If you tried to emulate this behavior inline with a statically allocated array instead of a pointer, you still need the null terminator to know the string length:
char str[100];
size_t len = 0;
strcpy(str, "Hello World");
for(; len < 100; len++)
if(str[len]=='\0') break;
// len now contains the string length
Note that explicitly comparing for inequality with '\0' is redundant; I just included it for ease of understanding.
Related
Why does this code produce runtime issues:
char stuff[100];
strcat(stuff,"hi ");
strcat(stuff,"there");
but this doesn't?
char stuff[100];
strcpy(stuff,"hi ");
strcat(stuff,"there");
strcat will look for the null-terminator, interpret that as the end of the string, and append the new text there, overwriting the null-terminator in the process, and writing a new null-terminator at the end of the concatenation.
char stuff[100]; // 'stuff' is uninitialized
Where is the null terminator? stuff is uninitialized, so it might start with NUL, or it might not have NUL anywhere within it.
In C++, you can do this:
char stuff[100] = {}; // 'stuff' is initialized to all zeroes
Now you can do strcat, because the first character of 'stuff' is the null-terminator, so it will append to the right place.
In C, you still need to initialize 'stuff', which can be done a couple of ways:
char stuff[100]; // not initialized
stuff[0] = '\0'; // first character is now the null terminator,
// so 'stuff' is effectively ""
strcpy(stuff, "hi "); // this initializes 'stuff' if it's not already.
In the first case, stuff contains garbage. strcat requires both the destination and the source to contain proper null-terminated strings.
strcat(stuff, "hi ");
will scan stuff for a terminating '\0' character, where it will start copying "hi ". If it doesn't find it, it will run off the end of the array, and arbitrarily bad things can happen (i.e., the behavior is undefined).
One way to avoid the problem is like this:
char stuff[100];
stuff[0] = '\0'; /* ensures stuff contains a valid string */
strcat(stuff, "hi ");
strcat(stuff, "there");
Or you can initialize stuff to an empty string:
char stuff[100] = "";
which will fill all 100 bytes of stuff with zeros (the increased clarity is probably worth any minor performance issue).
Because stuff is uninitialized before the call to strcpy. After the declaration stuff isn't an empty string, it is uninitialized data.
strcat appends data to the end of a string - that is it finds the null terminator in the string and adds characters after that. An uninitialized string isn't gauranteed to have a null terminator so strcat is likely to crash.
If there were to intialize stuff as below you could perform the strcat's:
char stuff[100] = "";
strcat(stuff,"hi ");
strcat(stuff,"there");
Strcat append a string to existing string. If the string array is empty, it is not going go find end of string ('\0') and it will cause run time error.
According to Linux man page, simple strcat is implemented this way:
char*
strncat(char *dest, const char *src, size_t n)
{
size_t dest_len = strlen(dest);
size_t i;
for (i = 0 ; i < n && src[i] != '\0' ; i++)
dest[dest_len + i] = src[i];
dest[dest_len + i] = '\0';
return dest;
}
As you can see in this implementation, strlen(dest) will not return correct string length unless dest is initialized to correct c string values. You may get lucky to have an array with the first value of zero at char stuff[100]; , but you should not rely on it.
Also, I would advise against using strcpy or strcat as they can lead to some unintended problems.
Use strncpy and strncat, as they help prevent buffer overflows.
I need help with char array. I want to create a n-lenght array and initialize its values, but after malloc() function the array is longer then n*sizeof(char), and the content of array isnt only chars which I assign... In array is few random chars and I dont know how to solve that... I need that part of code for one project for exam in school, and I have to finish by Sunday... Please help :P
#include<stdlib.h>
#include<stdio.h>
int main(){
char *text;
int n = 10;
int i;
if((text = (char*) malloc((n)*sizeof(char))) == NULL){
fprintf(stderr, "allocation error");
}
for(i = 0; i < n; i++){
//text[i] = 'A';
strcat(text,"A");
}
int test = strlen(text);
printf("\n%d\n", test);
puts(text);
free(text);
return 0;
}
Well before using strcat make
text[0]=0;
strcat expects null terminated char array for the first argument also.
From standard 7.24.3.1
#include <string.h>
char *strcat(char * restrict s1,
const char * restrict s2);
The strcat function appends a copy of the string pointed to by s2
(including the terminating null character) to the end of the string
pointed to by s1. The initial character of s2 overwrites the null
character at the end of s1.
How do you think strcat will know where the first string ends if you don't
put a \0 in s1.
Also don't forget to allocate an extra byte for the \0 character. Otherwise you are writing past what you have allocated for. This is again undefined behavior.
And earlier you had undefined behavior.
Note:
You should check the return value of malloc to know whether the malloc invocation was successful or not.
Casting the return value of malloc is not needed. Conversion from void* to relevant pointer is done implicitly in this case.
strlen returns size_t not int. printf("%zu",strlen(text))
To start with, you're way of using malloc in
text = (char*) malloc((n)*sizeof(char)
is not ideal. You can change that to
text = malloc(n * sizeof *text); // Don't cast and using *text is straighforward and easy.
So the statement could be
if(NULL == (text = (char*) malloc((n)*sizeof(char))){
fprintf(stderr, "allocation error");
}
But the actual problem lies in
for(i = 0; i < n; i++){
//text[i] = 'A';
strcat(text,"A");
}
The strcat documentation says
dest − This is pointer to the destination array, which should contain
a C string, and should be large enough to contain the concatenated
resulting string.
Just to point out that the above method is flawed, you just need to consider that the C string "A" actually contains two characters in it, A and the terminating \0(the null character). In this case, when i is n-2, you have out of bounds access or buffer overrun1. If you wanted to fill the entire text array with A, you could have done
for(i = 0; i < n; i++){
// Note for n length, you can store n-1 chars plus terminating null
text[i]=(n-2)==i?'A':'\0'; // n-2 because, the count starts from zero
}
//Then print the null terminated string
printf("Filled string : %s\n",text); // You're all good :-)
Note: Use a tool like valgrind to find memory leaks & out of bound memory accesses.
I have a problem with printing a string in C (well, the string that *ptr points to).
I have the following code:
char *removeColon(char *word) {
size_t wordLength;
char word1[MAXLENGTH];
wordLength = strlen(word);
wordLength--;
memcpy(word1, word, wordLength);
printf("word1: %s\n", word1);
return *word1;
}
I ran this with word = "MAIN:" (the value of word comes from strtok on a string read from a file).
It works fine until the printf, where the result is:
word1: MAIN╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠
and then there is an exception and everything breaks.
Any thoughts?
Your function removeColon should either
operate in place and modify the string passed as an argument
be given a destination buffer and copy the shortened string to it or
allocate memory for the shortened string and return that.
You copy just the characters into the local array, not the null terminator, nor do you set one in the buffer, passing this array to printf("%s", ...) invokes undefined behavior: printf continues printing the buffer contents until it finds a '\0' byte, it even goes beyond the end of the array, invoking undefined behavior, printing garbage and eventually dies in a crash.
You cannot return a pointer to an automatic array because this array becomes unavailable as soon as the function returns. Dereferencing the pointer later will invoke undefined behavior.
Here is a function that works in place:
char *removeColon(char *word) {
if (*word) word[strlen(word) - 1] = '\0';
return word;
}
Here is one that copies to a destination buffer, assumed to be long enough:
char *removeColon(char *dest, const char *word) {
size_t len = strlen(word);
memcpy(dest, word, len - 1);
dest[len - 1] = '\0';
return dest;
}
Here is one that allocates memory:
char *removeColon(const char *word) {
size_t len = strlen(word);
char *dest = malloc(len);
memcpy(dest, word, len - 1);
dest[len - 1] = '\0';
return dest;
}
You must make sure (1) each string is nul-terminated, and (2) you are not attempting to modify a string-literal. You have many approaches you can take. A simple approach to remove the last character (any char) with strlen:
char *rmlast (char *s)
{
if (!*s) return s; /* return if empty-string */
s[strlen (s) - 1] = 0; /* overwrite last w/nul */
return s;
}
(you can also use the string.h functions strchr (searching for 0), strrchr (searching for your target char, if passed), strpbrk (searching for one of several chars), etc.. to locate the last character as well)
Or you can do the same thing with pointers:
char *rmlast (char *s)
{
if (!*s) return s; /* return if empty-string */
char *p = s;
for (; *p; p++) {} /* advance to end of str */
*--p = 0; /* overwrite last w/nul */
return s;
}
You can also pass the last character of interest if you want to limit removal to any specific character and make a simple comparison in the function before overwriting it with a nul-terminating character.
Look over both and let me know if you have any questions.
wordLength = strlen(word);
You have to include the null terminator in the length, because every string has a terminating character whose ASCII value is 0, spelled \0 in C. Also, use the str... family of functions instead of mem..., since the former is intended for null terminated strings, but the latter for arrays. In addition, you cannot return a local stack allocated array. Based on the code of the function, it sounds like you're removing the last character. If that is the case, it is better to do
void remlast(char *str)
{
str[strlen(str) - 1] = '\0';
}
Note that this does not work on empty strings.
You copy over wordLength bytes, but you fail to add a null terminating byte. Because word1 is uninitialized prior to this copy, the remaining bytes are undefined.
So when printf attempts to print the string, it doesn't find a null terminator and keeps reading until it finds a null byte somewhere outside the bounds of the array. This is undefined behavior.
After copying the bytes, you need to manually add the null terminator:
memcpy(word1, word, wordLength);
word1[wordLength] = '\0';
Also, you're returning a pointer to a local variable. When the function returns, that variable is out of scope, and dereferencing that pointer is also undefined behavior.
Rather than making word1 a local array, you can allocate memory dynamically for it:
char *word1 = malloc(strlen(word));
If you do this, you'll need to free this memory somewhere in the calling function. The other option is to have the caller pass in a buffer of the proper size:
void removeColon(char *word, char *word1) {
I'm writing a function eliminate(char *str, int character) that takes a c-string and a character to eliminate as input, scans str for instances of character and replaces the value at the current index with... what? I thought NULL, but this seems risky and could mess with other functions that rely on the null-terminator in a c-string. For example:
char *eliminate(char *str, int character) {
if (!str) return str;
int index = 0;
while (str[index])
if (str[index] == character)
str[index++] = '\0'; //THIS LINE IS IN QUESTION
return str;
}
My question is, how do I properly implement this function such that I'm effectively eliminating all instances of a specified character in a string? And if a proper elimination assigns '\0' to the character to be replaced, how does this not affect the entire string (i.e., it effectively ends at the first '\0' encountered). For example, if I were to run the above function twice on the same string, the second call would only examine the string up to where the last character was replaced.
It is fine to use such replacement if you know what you are doing.
It can work only if the char buffer char *str is writable (dynamically allocated, for example by malloc, or just char array on stack char str[SIZE]). It cannot work for string literals.
The standard function strtok also works in this way. By the way, probably you can use strtok for your task if you want to have null terminated substring.
It does not make sense to have integer type for charcter function argument: int character -> char character
Replacing the character by '\0' will likely cause confusion. I would eliminate the undesirable character by shifting the next eligible character into its spot.
char *eliminate(char *str, int character) {
if (!str) return str;
int index = 0, shiftIndex = 0;
while (str[index]) {
if (str[index] == character)
index++;
else {
str[shiftIndex] = str[index];
shiftIndex++, index++;
}
}
str[shiftIndex] = '\0';
return str;
}
Why does this code produce runtime issues:
char stuff[100];
strcat(stuff,"hi ");
strcat(stuff,"there");
but this doesn't?
char stuff[100];
strcpy(stuff,"hi ");
strcat(stuff,"there");
strcat will look for the null-terminator, interpret that as the end of the string, and append the new text there, overwriting the null-terminator in the process, and writing a new null-terminator at the end of the concatenation.
char stuff[100]; // 'stuff' is uninitialized
Where is the null terminator? stuff is uninitialized, so it might start with NUL, or it might not have NUL anywhere within it.
In C++, you can do this:
char stuff[100] = {}; // 'stuff' is initialized to all zeroes
Now you can do strcat, because the first character of 'stuff' is the null-terminator, so it will append to the right place.
In C, you still need to initialize 'stuff', which can be done a couple of ways:
char stuff[100]; // not initialized
stuff[0] = '\0'; // first character is now the null terminator,
// so 'stuff' is effectively ""
strcpy(stuff, "hi "); // this initializes 'stuff' if it's not already.
In the first case, stuff contains garbage. strcat requires both the destination and the source to contain proper null-terminated strings.
strcat(stuff, "hi ");
will scan stuff for a terminating '\0' character, where it will start copying "hi ". If it doesn't find it, it will run off the end of the array, and arbitrarily bad things can happen (i.e., the behavior is undefined).
One way to avoid the problem is like this:
char stuff[100];
stuff[0] = '\0'; /* ensures stuff contains a valid string */
strcat(stuff, "hi ");
strcat(stuff, "there");
Or you can initialize stuff to an empty string:
char stuff[100] = "";
which will fill all 100 bytes of stuff with zeros (the increased clarity is probably worth any minor performance issue).
Because stuff is uninitialized before the call to strcpy. After the declaration stuff isn't an empty string, it is uninitialized data.
strcat appends data to the end of a string - that is it finds the null terminator in the string and adds characters after that. An uninitialized string isn't gauranteed to have a null terminator so strcat is likely to crash.
If there were to intialize stuff as below you could perform the strcat's:
char stuff[100] = "";
strcat(stuff,"hi ");
strcat(stuff,"there");
Strcat append a string to existing string. If the string array is empty, it is not going go find end of string ('\0') and it will cause run time error.
According to Linux man page, simple strcat is implemented this way:
char*
strncat(char *dest, const char *src, size_t n)
{
size_t dest_len = strlen(dest);
size_t i;
for (i = 0 ; i < n && src[i] != '\0' ; i++)
dest[dest_len + i] = src[i];
dest[dest_len + i] = '\0';
return dest;
}
As you can see in this implementation, strlen(dest) will not return correct string length unless dest is initialized to correct c string values. You may get lucky to have an array with the first value of zero at char stuff[100]; , but you should not rely on it.
Also, I would advise against using strcpy or strcat as they can lead to some unintended problems.
Use strncpy and strncat, as they help prevent buffer overflows.