I found this function on Stack Overflow which concatenates two strings. Here is the function:
char* concatstring(char *s1,char *s2)
{
    char *result = malloc(strlen(s1)+strlen(s2)+1);
    strcpy(result,s1);
    strcat(result,s2);
    return result;
}
My question is, why do we add 1 to the malloc call?
It's because in C "strings" are stored as arrays of chars followed by a null byte. This is by convention. Consequently, null bytes may not appear inside any C string.
However, the null byte is not counted toward the length of the string (it is just part of the representation of the string), and so strlen reports the number of non-null bytes in the string. To create a C string that is the result of concatenating two strings, you thus need to leave room for the null terminator on top of the two lengths.
In fact, every string operation one way or another needs to deal with the null terminator. Unfortunately, the details vary from function to function (e.g. snprintf does it right, but strncpy is dangerously different), and you should read each function's manual very carefully to understand who takes care of the null terminator and how.
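To make the strncpy/snprintf difference concrete, here is a minimal sketch. The helper names copy_with_strncpy and copy_with_snprintf are made up for this illustration; only strncpy and snprintf themselves are standard. It assumes the destination size is at least 1.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy src into a buffer of dst_size bytes with strncpy.
   strncpy writes no terminator when src has dst_size - 1 or more characters
   here, so one is always written by hand (long input is truncated). */
void copy_with_strncpy(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL && dst_size > 0);
    strncpy(dst, src, dst_size - 1);
    dst[dst_size - 1] = '\0';
}

/* Hypothetical helper: same job with snprintf, which always terminates the
   output when dst_size > 0. Returns the length src would have needed, so the
   caller can detect truncation by comparing it with dst_size. */
int copy_with_snprintf(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL && dst_size > 0);
    return snprintf(dst, dst_size, "%s", src);
}
```

With a 4-byte destination and the source "hello", both helpers leave "hel" in the buffer, but only the snprintf version reports (by returning 5) that the input did not fit.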
You need to allocate space for the '\0' (the null character, NUL), which is used to terminate strings in C.
That is, the string "cat" is actually stored as "cat\0".
If the string is "cat":
char * mystring = "cat";
Then strlen(mystring) would return 3.
But in reality it takes 4 bytes to store the string, with one byte for the null character.
So if you have two strings, "dog" and "cat", their lengths will be 3 and 3, although the number of bytes required to store them would be 4 each. The memory required to store their concatenation would be 3 + 3 + 1 = 7 bytes.
So the 1 in malloc is to allocate an extra byte to store the null character.
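As a sketch of the same arithmetic in code, here is one possible variant of the function from the question that also checks the result of malloc. The name concat_checked is hypothetical, and using memcpy instead of strcpy/strcat is a choice made for this example, not something the original function does.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical variant of the question's function.
   Returns a newly allocated concatenation of s1 and s2, or NULL if the
   allocation fails. The caller must free() the result. */
char *concat_checked(const char *s1, const char *s2)
{
    assert(s1 != NULL && s2 != NULL);
    size_t len1 = strlen(s1);
    size_t len2 = strlen(s2);
    char *result = malloc(len1 + len2 + 1);  /* +1 for the terminator */
    if (result == NULL)
        return NULL;
    memcpy(result, s1, len1);                /* s1 without its terminator */
    memcpy(result + len1, s2, len2 + 1);     /* s2 including its terminator */
    return result;
}
```

For "dog" and "cat" this allocates 3 + 3 + 1 = 7 bytes and stores "dogcat" followed by a single null character.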
Related
I understand that strings in C are just character arrays. So I tried the following code, but it gives strange results, such as garbage output or program crashes:
#include <stdio.h>
int main (void)
{
    char str [5] = "hello";
    puts(str);
}
Why doesn't this work?
It compiles cleanly with gcc -std=c17 -pedantic-errors -Wall -Wextra.
Note: This post is meant to be used as a canonical FAQ for problems stemming from a failure to allocate room for a NUL terminator when declaring a string.
A C string is a character array that ends with a null terminator.
All characters have a symbol table value. The null terminator is the symbol value 0 (zero). It is used to mark the end of a string. This is necessary since the size of the string isn't stored anywhere.
Therefore, every time you allocate room for a string, you must include sufficient space for the null terminator character. Your example does not do this; it only allocates room for the 5 characters of "hello". Correct code should be:
char str[6] = "hello";
Or equivalently, you can write self-documenting code for 5 characters plus 1 null terminator:
char str[5+1] = "hello";
But you can also use this and let the compiler do the counting and pick the size:
char str[] = "hello"; // Will allocate 6 bytes automatically
When allocating memory for a string dynamically in run-time, you also need to allocate room for the null terminator:
char input[n] = ... ;
...
char* str = malloc(strlen(input) + 1);
If you don't append a null terminator at the end of a string, then library functions expecting a string won't work properly and you will get "undefined behavior" bugs such as garbage output or program crashes.
The most common way to write a null terminator character in C is by using a so-called "octal escape sequence", looking like this: '\0'. This is 100% equivalent to writing 0, but the \ serves as self-documenting code to state that the zero is explicitly meant to be a null terminator. Code such as if(str[i] == '\0') will check if the specific character is the null terminator.
Please note that the term null terminator has nothing to do with null pointers or the NULL macro! This can be confusing - very similar names but very different meanings. This is why the null terminator is sometimes referred to as NUL with one L, not to be confused with NULL or null pointers. See answers to this SO question for further details.
The "hello" in your code is called a string literal. This is to be regarded as a read-only string. The "" syntax means that the compiler will append a null terminator at the end of the string literal automatically. So if you print out sizeof("hello") you will get 6, not 5, because you get the size of the array including a null terminator.
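The difference between storage size and length can be checked with a few lines. This is only an illustrative sketch; the function name print_hello_sizes is made up.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical demo: prints the storage size and the length of "hello". */
void print_hello_sizes(void)
{
    char str[] = "hello";  /* the compiler sizes the array to include the terminator */

    assert(str[sizeof str - 1] == '\0');  /* the last element is the terminator */
    printf("sizeof \"hello\" = %zu\n", sizeof "hello");  /* 6 */
    printf("sizeof str     = %zu\n", sizeof str);        /* 6 */
    printf("strlen(str)    = %zu\n", strlen(str));       /* 5 */
}
```

sizeof counts every element of the array, terminator included, while strlen stops counting at the terminator.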
It compiles cleanly with gcc
Indeed, not even a warning. This is because of a subtle detail/flaw in the C language that allows character arrays to be initialized with a string literal that contains exactly as many characters as there is room in the array and then silently discard the null terminator (C17 6.7.9/15). The language is purposely behaving like this for historical reasons, see Inconsistent gcc diagnostic for string initialization for details. Also note that C++ is different here and does not allow this trick/flaw to be used.
From the C Standard (7.1.1 Definitions of terms)
1 A string is a contiguous sequence of characters terminated by and
including the first null character. The term multibyte string is
sometimes used instead to emphasize special processing given to
multibyte characters contained in the string or to avoid confusion
with a wide string. A pointer to a string is a pointer to its initial
(lowest addressed) character. The length of a string is the number of
bytes preceding the null character and the value of a string is the
sequence of the values of the contained characters, in order.
In this declaration
char str [5] = "hello";
the string literal "hello" has the internal representation like
{ 'h', 'e', 'l', 'l', 'o', '\0' }
so it has 6 characters including the terminating zero. Its elements are used to initialize the character array str, which reserves space for only 5 characters.
The C Standard (unlike the C++ Standard) allows such an initialization of a character array, in which the terminating zero of the string literal is not used as an initializer.
However, as a result the character array str does not contain a string.
If you want the array to contain a string, you could write
char str [6] = "hello";
or just
char str [] = "hello";
In the last case the size of the character array is determined from the number of initializers supplied by the string literal, which is 6.
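If you do end up with a character array that may not contain a string, you can still inspect or print it as long as you pass its size along. The following is a minimal sketch; holds_string and print_bounded are hypothetical helper names, while memchr and the "%.*s" precision form of printf are standard.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: true if the first size bytes of buf contain a null
   terminator, i.e. the array really holds a string. */
bool holds_string(const char *buf, size_t size)
{
    assert(buf != NULL);
    return memchr(buf, '\0', size) != NULL;
}

/* Hypothetical helper: prints at most size characters from buf and then a
   newline. The precision limits how far printf reads, so this is safe even
   when buf has no terminator. */
void print_bounded(const char *buf, size_t size)
{
    assert(buf != NULL);
    printf("%.*s\n", (int)size, buf);
}
```

For a 5-element array holding only the letters of "hello", holds_string returns false, yet print_bounded still prints hello without reading past the array.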
Can all strings be considered arrays of characters? Yes. Can all character arrays be considered strings? No.
Why Not? and Why does it matter?
In addition to the other answers explaining that the length of a string is not stored anywhere as part of the string and the references to the standard where a string is defined, the flip-side is "How do the C library functions handle strings?"
While a character array can hold the same characters, it is simply an array of characters unless the last character is followed by the nul-terminating character. That nul-terminating character is what allows the array of characters to be considered (handled as) a string.
All functions in C that expect a string as an argument expect the sequence of characters to be nul-terminated. Why?
It has to do with the way all string functions work. Since the length isn't included as part of an array, string functions scan forward in the array until the nul-character (i.e. '\0', equivalent to decimal 0) is found. See ASCII Table and Description. Regardless of whether you are using strcpy, strchr, strcspn, etc., all string functions rely on the nul-terminating character being present to define where the end of that string is.
A comparison of two similar functions from string.h will emphasize the importance of the nul-terminating character. Take for example:
char *strcpy(char *dest, const char *src);
The strcpy function simply copies bytes from src to dest until the nul-terminating character is found telling strcpy where to stop copying characters. Now take the similar function memcpy:
void *memcpy(void *dest, const void *src, size_t n);
The function performs a similar operation, but does not consider or require the src parameter to be a string. Since memcpy cannot simply scan forward in src copying bytes to dest until a nul-terminating character is reached, it requires an explicit number of bytes to copy as a third parameter. This third parameter provides memcpy with the same size information strcpy is able to derive simply by scanning forward until a nul-terminating character is found.
This also emphasizes what goes wrong in strcpy (or any function expecting a string) if you fail to provide the function with a nul-terminated string: it has no idea where to stop and will happily race off across the rest of your memory segment, invoking Undefined Behavior, until a nul-character just happens to be found somewhere in memory or a segmentation fault occurs.
That is why functions expecting a nul-terminated string must be passed a nul-terminated string and why it matters.
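The contrast between the two functions can be sketched with simplified loops. These are illustrations only, not the real library implementations, and the names copy_until_nul and copy_n_bytes are made up.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified strcpy-style copy: the terminator in src is the only thing
   that tells the loop when to stop. */
char *copy_until_nul(char *dest, const char *src)
{
    assert(dest != NULL && src != NULL);
    size_t i = 0;
    while ((dest[i] = src[i]) != '\0')  /* copies the terminator too, then stops */
        i++;
    return dest;
}

/* Simplified memcpy-style copy: the caller supplies the byte count n, so
   the contents of src are never inspected. */
void *copy_n_bytes(void *dest, const void *src, size_t n)
{
    assert(dest != NULL && src != NULL);
    unsigned char *d = dest;
    const unsigned char *s = src;
    for (size_t i = 0; i < n; i++)
        d[i] = s[i];
    return dest;
}
```

If src lacks a terminator, the first loop has no stopping condition inside the array, which is exactly the failure described above; the second loop cannot run away because n bounds it.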
Intuitively...
Think of an array as a variable (holds things) and a string as a value (can be placed in a variable).
They are certainly not the same thing. In your case the variable is too small to hold the string, so the string gets cut off. ("quoted strings" in C have an implicit null character at the end.)
However it's possible to store a string in an array that is much larger than the string.
Note that the usual assignment and comparison operators (= == < etc.) don't work as you might expect. But the strxyz family of functions comes pretty close, once you know what you're doing. See the C FAQ on strings and arrays.
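As a rough illustration of the operators point, comparison goes through strcmp rather than == or <. The wrapper names same_text and sorts_before are invented for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical wrapper: compares the characters of two strings.
   Writing a == b instead would only compare the two addresses. */
bool same_text(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    return strcmp(a, b) == 0;
}

/* Hypothetical wrapper: true if a sorts before b in byte order, which is
   the role < plays for numbers. */
bool sorts_before(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    return strcmp(a, b) < 0;
}
```

Assignment works the same way: an array cannot be assigned with =, so copying a string into a sufficiently large array is done with strcpy, after which same_text reports the two as equal even though they live at different addresses.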
I am new to C and I am very much confused with the C strings. Following are my questions.
Finding last character from a string
How can I find out the last character from a string? I came with something like,
char *str = "hello";
printf("%c", str[strlen(str) - 1]);
return 0;
Is this the way to go? I somehow think that, this is not the correct way because strlen has to iterate over the characters to get the length. So this operation will have a O(n) complexity.
Converting char to char*
I have a string and need to append a char to it. How can I do that? strcat accepts only char*. I tried the following,
char delimiter = ',';
char text[6];
strcpy(text, "hello");
strcat(text, delimiter);
Using strcat with variables that has local scope
Please consider the following code,
void foo(char *output)
{
    char *delimiter = ',';
    strcpy(output, "hello");
    strcat(output, delimiter);
}
In the above code, delimiter is a local variable which gets destroyed after foo returns. Is it OK to append it to the variable output?
How strcat handles null terminating character?
If I am concatenating two null terminated strings, will strcat append two null terminating characters to the resultant string?
Is there a good beginner level article which explains how strings work in C and how can I perform the usual string manipulations?
Any help would be great!
Last character: your approach is correct. If you will need to do this a lot on large strings, your data structure containing strings should store lengths with them. If not, it doesn't matter that it's O(n).
Appending a character: you have several bugs. For one thing, your buffer is too small to hold another character. As for how to call strcat, you can either put the character in a string (an array with 2 entries, the second being 0), or you can just manually use the length to write the character to the end.
Your worry about 2 nul terminators is unfounded. While it occupies memory contiguous with the string and is necessary, the nul byte at the end is NOT "part of the string" in the sense of length, etc. It's purely a marker of the end. strcat will overwrite the old nul and put a new one at the very end, after the concatenated string. Again, you need to make sure your buffer is large enough before you call strcat!
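The "manually use the length" option can be sketched as follows. The function name append_char and its size parameter are assumptions made for this example; it also assumes buf already holds a terminated string.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: appends ch to the string in buf, where buf has room
   for buf_size bytes in total. Returns false and leaves buf unchanged if
   the character does not fit. */
bool append_char(char *buf, size_t buf_size, char ch)
{
    assert(buf != NULL);
    size_t len = strlen(buf);
    if (len + 2 > buf_size)   /* need room for ch and the new terminator */
        return false;
    buf[len] = ch;            /* overwrite the old terminator */
    buf[len + 1] = '\0';      /* write the new one */
    return true;
}
```

With char text[7] holding "hello", appending ',' succeeds and yields "hello,". A second append fails because all 7 bytes are in use. With the 6-byte buffer from the question, even the first append would be refused.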
Last character: O(n) is the best you can do, because of the way C strings work.
Appending a character: declare char delimiter[] = ","; instead. This makes delimiter a character array holding a comma and a NUL. Also, text needs to have length 7: "hello" is 5 characters, then you have the comma and a NUL.
Local delimiter: if you define delimiter correctly, that's fine (as is, you're assigning a character to a pointer, which is wrong). The contents of output won't depend on delimiter later on.
Two terminators: strcat will overwrite the first NUL, so the result ends with a single terminator.
You're on the right track. I highly recommend you read K&R C 2nd Edition. It will help you with strings, pointers, and more. And don't forget man pages and documentation. They will answer questions like the one on strcat quite clearly. Two good sites are The Open Group and cplusplus.com.
A "C string" is in reality a simple array of chars, with str[0] containing the first character, str[1] the second and so on. After the last character, the array contains one more element, which holds a zero. This zero by convention signifies the end of the string. For example, these two declarations are equivalent:
char str[] = "foo"; //str is 4 bytes
char str[] = {'f', 'o', 'o', 0};
And now for your questions:
Finding last character from a string
Your way is the right one. There is no faster way to know where the string ends than scanning through it to find the final zero.
Converting char to char*
As said before, a "string" is simply an array of chars, with a zero terminator added to the end. So if you want a string of one character, you declare an array of two chars - your character and the final zero, like this:
char str[2];
str[0] = ',';
str[1] = 0;
Or simply:
char str[2] = {',', 0};
Using strcat with variables that has local scope
strcat() simply copies the contents of the source array to the destination array, at the offset of the null character in the destination array. So it is irrelevant what happens to the source after the operation. But you DO need to worry if the destination array is big enough to hold the data - otherwise strcat() will overwrite whatever data sits in memory right after the array! The needed size is strlen(str1) + strlen(str2) + 1.
How strcat handles null terminating character?
The final zero is expected to terminate both input strings, and is appended to the output string.
Finding last character from a string
I propose a thought experiment: if it were generally possible to find the last character
of a string in better than O(n) time, then could you not also implement strlen
in better than O(n) time?
Converting char to char*
You can temporarily store the char in an array-of-char, and that will decay into
a pointer-to-char:
char delimiterBuf[2] = "";
delimiterBuf[0] = delimiter;
...
strcat(text, delimiterBuf);
If you're just using character literals, though, you can simply use string literals instead.
Using strcat with variables that has local scope
The variable itself isn't referenced outside the scope. When the function returns,
that local variable has already been evaluated and its contents have already been
copied.
How strcat handles null terminating character?
"Strings" in C are NUL-terminated sequences of characters. Both inputs to
strcat must be NUL-terminated, and the result will be NUL-terminated. It
wouldn't be useful for strcat to write an extra NUL-byte to the result if it
doesn't need to.
(And if you're wondering what if the input strings have multiple trailing
NUL bytes already, I propose another thought experiment: how would strcat know
how many trailing NUL-bytes there are in a string?)
BTW, since you tagged this with "best-practices", I'll also recommend that you take care not to write past the end of your destination buffers. Typically this means avoiding strcat and strcpy (unless you've already checked that the input strings won't overflow the destination) and using safer versions such as strncat. Note that strncpy has its own pitfalls, so it is a poor substitute. There are also safer versions that are not universally available, such as strlcpy/strlcat (from the BSDs) and strcpy_s/strcat_s (from the optional Annex K of C11).
Similarly, functions like your foo function always should take an additional argument specifying what the size of the destination buffer is (and documentation should make it explicitly clear whether that size accounts for a NUL terminator or not).
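One possible shape for such a function is sketched below. The name foo_bounded, the output_size parameter and the 0/-1 return convention are all assumptions made for this example; here output_size counts the terminator.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical size-aware version of foo.
   Writes "hello," into output, which holds output_size bytes including the
   terminator. Returns 0 on success and -1 if the text did not fit, in
   which case output holds a truncated but terminated string. */
int foo_bounded(char *output, size_t output_size)
{
    assert(output != NULL && output_size > 0);
    int needed = snprintf(output, output_size, "%s%s", "hello", ",");
    if (needed < 0 || (size_t)needed >= output_size)
        return -1;
    return 0;
}
```

A caller with char buf[16] would write foo_bounded(buf, sizeof buf) and check the return value before using buf.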
How can I find out the last character from a string?
Your technique with str[strlen(str) - 1] is fine. As pointed out, you should avoid repeated, unnecessary calls to strlen and store the results.
I somehow think that, this is not the
correct way because strlen has to
iterate over the characters to get the
length. So this operation will have a
O(n) complexity.
Repeated calls to strlen can be a bane of C programs. However, you should avoid premature optimization. If a profiler actually demonstrates a hotspot where strlen is expensive, then you can do something like this for your literal string case:
const char test[] = "foo";
sizeof test // 4: three characters plus the terminator, so the length is sizeof test - 1
Of course if you create 'test' on the stack, it incurs a little overhead (incrementing/decrementing stack pointer), but no linear time operation involved.
Literal strings are generally not going to be so gigantic. For other cases, like reading a large string from a file, one option is to store the length of the string when you read it, so you never have to recompute it. This can also be helpful as it'll tell you in advance how much memory to allocate for your character buffer.
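To make the sizeof idea concrete, here is a small sketch. LITERAL_LEN, word and last_of_word are names I made up for illustration, and the macro only works on a real array, not on a char* (where sizeof would give the pointer size):

```c
#include <stddef.h>

/* Hypothetical macro: length of a string stored in a char array that was
   initialised from a literal. sizeof counts the terminator, hence the -1.
   Do not use it on a pointer. */
#define LITERAL_LEN(arr) (sizeof (arr) - 1)

static const char word[] = "foo";

/* Last character of word, found without calling strlen. */
char last_of_word(void)
{
    return word[LITERAL_LEN(word) - 1];
}
```

The length is a compile-time constant here, so no scan over the characters is needed.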
I have a string and need to append a
char to it. How can i do that? strcat
accepts only char*.
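If you have a char and cannot make a string out of it (char* c = "a"), then you can use strncat. It reads at most n characters from the source and always appends a terminating NUL itself, so with n equal to 1 the source does not need to be NUL-terminated: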
char ch = 'a';
strncat(str, &ch, 1);
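Wrapped up as a function, that could look roughly like this (append_char_strncat is just an illustrative name, and it assumes the caller has already made sure str has room for one more character plus the terminator):

```c
#include <string.h>

/* Appends a single character to the string in str using strncat.
   Assumes str has at least two unused bytes after its current contents. */
char *append_char_strncat(char *str, char ch)
{
    /* Reads exactly one char from &ch, then strncat writes the NUL. */
    return strncat(str, &ch, 1);
}
```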
In the above code, delimiter is a local
variable which gets destroyed after
foo returned. Is it OK to append it to
variable output?
Yes: functions like strcat and strcpy make deep copies of the source string. They don't leave shallow pointers behind, so it's fine for the local data to be destroyed after these operations are performed.
If I am concatenating two null
terminated strings, will strcat
append two null terminating characters
to the resultant string?
No, strcat will basically overwrite the null terminator on the dest string and write past it, then append a new null terminator when it's finished.
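If you want to convince yourself, a quick experiment along these lines should do it. The function name and the '#' filler byte are arbitrary choices of mine:

```c
#include <string.h>

/* Fills a buffer with '#', concatenates "dog" and "cat" in it, and checks
   that exactly one NUL was written after the text. Returns 1 if so. */
int strcat_writes_one_terminator(void)
{
    char buf[16];

    memset(buf, '#', sizeof buf);  /* filler so untouched bytes are visible */
    strcpy(buf, "dog");            /* writes d, o, g and a NUL at buf[3] */
    strcat(buf, "cat");            /* overwrites buf[3], new NUL at buf[6] */

    return strlen(buf) == 6 && buf[6] == '\0' && buf[7] == '#';
}
```

The byte right after the terminator is still the filler, so strcat did not write a second NUL.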
How can I find out the last character from a string?
Your approach is almost correct. The only way to find the end of a C string is to iterate through the characters, looking for the nul.
There is a bug in your answer though (in the general case). If strlen(str) is zero, you access the character before the start of the string.
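One way to guard against that, as a sketch (last_char is a name I picked, and returning '\0' for an empty string is just one possible convention):

```c
#include <stddef.h>
#include <string.h>

/* Returns the last character of str, or '\0' if str is empty.
   strlen is called once and the result is reused. */
char last_char(const char *str)
{
    size_t len = strlen(str);

    return len > 0 ? str[len - 1] : '\0';
}
```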
I have a string and need to append a char to it. How can i do that?
Your approach is wrong. A C string is just an array of C characters with the last one being '\0'. So in theory, you can append a character like this:
char delimiter = ',';
char text[7];
strcpy(text, "hello");
int textSize = strlen(text);
text[textSize] = delimiter;
text[textSize + 1] = '\0';
However, if I leave it like that I'll get zillions of down votes because there are three places where I have a potential buffer overflow (if I didn't know that my initial string was "hello"). Before doing the copy, you need to put in a check that text is big enough to contain all the characters from the string plus one for the delimiter plus one for the terminating nul.
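A version with that check in place might look something like this. append_char_checked and its capacity parameter are hypothetical, with capacity meaning the total size of the array including the byte for the nul:

```c
#include <stddef.h>
#include <string.h>

/* Appends ch to the string in text if the array (capacity bytes in total)
   has room for the extra character and the terminating nul.
   Returns 0 on success, -1 if it would not fit (text is left unchanged). */
int append_char_checked(char *text, size_t capacity, char ch)
{
    size_t len = strlen(text);

    /* We need len + 2 <= capacity: one byte for ch, one for the nul. */
    if (capacity < 2 || len > capacity - 2)
        return -1;

    text[len] = ch;
    text[len + 1] = '\0';
    return 0;
}
```

With char text[7] holding "hello", the first call with sizeof text succeeds and a second one is refused, because "hello," already uses all seven bytes.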
... delimiter is a local variable which gets destroyed after foo returned. Is it OK to append it to variable output?
Yes that's fine. strcat copies characters. But your code sample does no checks that output is big enough for all the stuff you are putting into it.
If I am concatenating two null terminated strings, will strcat append two null terminating characters to the resultant string?
No.
I somehow think that, this is not the correct way because strlen has to iterate over the characters to get the length. So this operation will have a O(n) complexity.
You are right; read Joel Spolsky on why C-strings suck. There are a few ways around it. They include either not using C strings (for example, use Pascal strings and create your own library to handle them), or not using C (use, say, C++, which has a string class that is slow for different reasons, but you could also write your own class to handle Pascal strings more easily than in C).
Regarding adding a char to a C string: a C string is simply a char array with a nul terminator. So long as you preserve the terminator it is a string; there's no magic.
char* straddch( char* str, char ch )
{
char* end = &str[strlen(str)] ;
*end = ch ;
end++ ;
*end = 0 ;
return str ;
}
Just like with strcat(), you have to know that the array str points into is long enough to accommodate the longer string; the compiler will not help you. It is both inelegant and unsafe.
If I am concatenating two null
terminated strings, will strcat append
two null terminating characters to the
resultant string?
No, just one, but whatever follows that may just happen to be nul, or whatever happened to be in memory. Consider the following equivalent:
char* my_strcat( char* s1, const char* s2 )
{
/* Copy s2, terminator included, starting at s1's terminator. */
strcpy( &s1[strlen(s1)], s2 ) ;
return s1 ;
}
The first character of s2 overwrites the terminator in s1.
In the above code, delimiter is a local
variable which gets destroyed after
foo returned. Is it OK to append it to
variable output?
In your example delimiter is not a string, and initialising a pointer with a char makes no sense. However if it were a string, the code would be fine, strcat() copies the data from the second string, so the lifetime of the second argument is irrelevant. Of course you could in your example use a char (not a char*) and the straddch() function suggested above.
char label[8] = "abcdefgh";
char arr[7] = "abcdefg";
printf("%s\n",label);
printf("%s",arr);
====output==========
abcdefgh
abcdefgÅ
Why Å is appended at the end of the string arr?
I am running C code in Turbo C ++.
printf expects NUL-terminated strings. Increase the size of your char arrays by one to make space for the terminating NUL character (it is added automatically by the = "..." initializer).
If you don't NUL-terminate your strings, printf will keep reading until it finds a NUL character, so you will get a more or less random result.
Your variables label and arr are not strings. They are arrays of characters.
To be strings (and for you to be able to pass them to functions declared in <string.h>) they need a NUL terminator in the space reserved for them.
Definition of "string" from the Standard
7.1.1 Definitions of terms
1 A string is a contiguous sequence of characters terminated by and including
the first null character. The term multibyte string is sometimes used
instead to emphasize special processing given to multibyte characters
contained in the string or to avoid confusion with a wide string. A pointer
to a string is a pointer to its initial (lowest addressed) character. The
length of a string is the number of bytes preceding the null character and
the value of a string is the sequence of the values of the contained
characters, in order.
Your string is not null terminated, so printf is running into junk data. You need to use the '\0' at the end of the string.
Using GCC (on Linux), it prints more garbage:
abcdefgh°ÃÕÄÕ¿UTÞÄÕ¿UTÞ·
abcdefgabcdefgh°ÃÕÄÕ¿UTÞÄÕ¿UTÞ·
This is because you are printing two character arrays that are not NUL-terminated as strings (using %s).
This works fine:
char label[9] = "abcdefgh\0";
char arr[8] = "abcdefg\0";
printf("%s\n",label);
printf("%s",arr);
However, you need not mention the "\0" explicitly. Just make sure the array size is large enough, i.e. one more than the number of characters in your strings.
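An easy way to get the size right is to leave it out and let the compiler count, terminator included. A minimal sketch (print_labels is just a name for the example):

```c
#include <stdio.h>

/* With the size omitted, the compiler sizes each array to hold the
   characters plus the terminating NUL. */
static const char label[] = "abcdefgh"; /* 9 bytes */
static const char arr[] = "abcdefg";    /* 8 bytes */

/* Both arrays are proper strings now, so %s stops where it should. */
void print_labels(void)
{
    printf("%s\n", label);
    printf("%s\n", arr);
}
```

Here sizeof label is 9 while strlen(label) is 8, which is exactly the one extra byte the original arrays were missing.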