I have a strange problem when using string functions in C.
Currently I have a function that sends a string to a UART port.
When I give it a string like
char buf[32];
strcpy(buf, "AT+CPMS=\"SM");
strcat(buf, "\"");
uart0_putstr(buf);
//or
uart0_putstr("AT+CPMS=SM"); //not a valid AT command, but without quotes just for test
it works well and sends the string to the UART. But when I use a call like this:
char buf[32];
strcpy(buf, "AT+CPMS=\"SM\"");
uart0_putstr(buf);
//or
uart0_putstr("AT+CPMS=\"SM\"");
it doesn't print anything to the UART.
Can you explain the difference between the strings in the first and the second/third cases?
First the C language part:
String literals: All C string literals include an implicit null byte at the end; the C string literal "123" defines a 4 byte array with the values 49,50,51,0. The null byte is always there even if it is never mentioned and enables strlen, strcat etc. to find the end of the string. The suggestion strcpy(buf, "AT+CPMS=\"SM\"\0"); is nonsensical: The character array produced by "AT+CPMS=\"SM\"\0" now ends in two consecutive zero bytes; strcpy will stop at the first one already. "" is a 1 byte array whose single element has the value 0. There is no need to append another 0 byte.
strcat, strcpy: Both functions always add a null byte at the end of the string. There is no need to add a second one.
Escaping: As you know, a C string literal consists of characters book-ended by double quotes: "abc". This makes it impossible to have simple double quotes as part of the string because that would end the string. We have to "escape" them. The C language uses the backslash to give certain characters a special meaning, or, in this case, suppress the special meaning. The entire combination of backslash and subsequent source code character is transformed into a single character, or byte, in the compiled program. The combination \n is transformed into a single byte with the value 10 (line feed, usually interpreted as a newline by output devices), \r is 13 (carriage return), and \" is transformed into the byte 34, usually printed as the " glyph. The string Between the arrows is a double quote: ->"<- must be coded as "Between the arrows is a double quote: ->\"<-" in C. The middle double quote doesn't end the string literal because it is "escaped".
Then the UART part: The internet makes me believe that the command you want to send over the UART looks like AT+CPMS="SM", followed by a carriage return. The corresponding C string literal would be "AT+CPMS=\"SM\"\r".
The page I linked also inserts a delay between sending commands. Sending too quickly may cause errors that appear only sometimes.
The things to note are :
The AT command syntax probably demands that SM be surrounded by quotes on both sides.
Additionally, the protocol probably demands that a command end in a carriage return.
This ...
char buf[32];
strcpy(buf, "AT+CPMS=\"SM");
strcat(buf, "\"");
... produces the same contents in buf as this ...
char buf[32];
strcpy(buf, "AT+CPMS=\"SM\"");
... does, up to and including the string terminator at index 12. I fully expect an immediately following call to ...
uart0_putstr(buf);
... to have the same effect in each case unless uart0_putstr() looks at bytes past the terminator or its behavior is sensitive to factors other than its argument.
If it does look past the terminator, however, then that might explain not only a difference between those two, but also a difference with ...
uart0_putstr("AT+CPMS=\"SM\"");
... because in this last case, looking past the string terminator would overrun the bounds of the array, producing undefined behavior.
Thanks all. In the end it was resolved by adding a NUL character to the end of the string.
I came across a line like
char* template = "<html><head><title>%i %s</title></head><body><h1>%i %s</h1> </body></html>";
while reading through code to implement a web server.
I'm curious as I've never seen a string like this before - is template specifying a special type of string (I'm just guessing here because it was highlighted on my IDE)? Also, how would strlen() work with something like this?
Thanks
char* template = "<html>...</html>";
is fundamentally no different than
char *s = "hello";
The name template is not special, it's just an ordinary identifier, the name of the variable. (template happens to be a keyword in C++, but this is C.)
It would be better to define it as const, to enforce the fact that string literals cannot be modified, but it's not mandatory.
Note that template itself is not a string. It's a pointer to a string. The string itself (defined by the language as "a contiguous sequence of characters terminated by and including the first null character") is the sequence starting with "<html>" and ending with "</html>" and the implicit terminating null character.
And in answer to your second question, strlen(template) would work just fine, giving you the length of the string (74 for the string as shown above).
I imagine that there is another part of the code that uses this string to format an output string used as a page by the web server. The strlen function will return the length of the string.
Unless there's a null character somewhere in the initializer or an escape sequence using a \ character, which there isn't, there's nothing special about this string. A % is a normal character in a string and doesn't receive special treatment. The strlen function in particular will read %i as two characters, i.e. % and i. Similarly for %s.
In contrast, a \ is a special character in a string literal and denotes an escape sequence. The \ and the character that follows it in the string constant constitute a single character in the string itself. For example, \n means a newline character (ASCII 10) and \t is a tab character (ASCII 9).
This string is most likely used as a format string for printf. This function will read the string and interpret the %i and %s as format string accepting a int and a char * respectively.
char* template = "<html>...</html>";
just stores the characters "<html>...</html>" in a char array and makes template point to it. You can change the name to anything you want. When the array is created, the compiler adds \0 at the end. strlen counts from the start of the array up to the \0 (the \0 itself is not included).
I think your IDE highlights this string because it is used somewhere else.
I know that every string in C ends with '\0' character. It is very useful in cases when we need to know when the string ends. However, I am unable to comprehend its use in printing a string and printing a string without it. I have the following code:-
/* Printing out an array of characters */
#include <stdio.h>

int main(void)
{
    char a[7] = {'h', 'e', 'l', 'l', 'o', '!', '\0'};
    int i;

    /* Loop where we do not care about the '\0' */
    for (i = 0; i < 7; i++)
    {
        printf("%c", a[i]);
    }
    printf("\n");

    /* Part which prints the entire character array as string */
    printf("%s", a);
    printf("\n");

    /* Loop where we care about the '\0' */
    for (i = 0; i < 7 && a[i] != '\0'; i++)
    {
        printf("%c", a[i]);
    }
    return 0;
}
The output is:-
hello!
hello!
hello!
I am unable to understand the difference. any explanations?
In this case:
for(i=0;i<7;i++)
{
printf("%c",a[i]);
}
You loop for a number of times (7) and then quit. That is the end condition of the loop. It terminates, no matter anything else.
In the other case, you also loop 7 times and no more. You just added another condition, which really serves no purpose since you are already keeping a count of things. If you did the following:
int index = 0;
while (a[index] != '\0') { printf("%c", a[index]); index++; }
Now you would depend on the terminating zero character being there. If it weren't in the string, your while loop would keep going until the program crashed or something forcibly terminated it, probably printing garbage on your screen.
\0 is not part of the data in a character string. It is an indicator of the end of the string. If the length of the string is not known, look for this indicator. With its help you can replace your loop:
for(i=0;i<7&&a[i]!='\0';i++) { ...
with:
for(int i=0; a[i]; ++i) { ...
So the for-loops and printf display the same string. The only difference is how you print it.
'\0' does not correspond to a displayable character; that's why the first and last versions appear to be the same.
The second version is the same because under the hood, printf is just iterating until it hits the '\0'.
When you want to print a string from its first character to its end, you do not need to know its length if the string ends with \0: just print characters until you reach \0. So you don't need any extra variable to store the length of the string.
In fact a string can have many different representations, but minimizing memory use (which was important to the designers of C) led them to define zero-terminated strings.
Each string representation has its trade-offs between speed, memory and flexibility. For example, you could define your string like a Pascal string, which stores the length of the string in the first element of the array. That limits the maximum length, but retrieving the length is faster than with zero-terminated strings (which require counting each character until \0).
I am unable to comprehend its use in printing a string and printing a string without it
Normally you don't print a string character by character like that. You print the whole string. In such cases, your C library will print until it finds a zero.
When printing a string of variable length, there has to be some 'signal' to indicate that you have reached the end. Generally, this is the '\0' character. Most C standard calls, like strcpy, strcat, printf, etc. depend on the string being zero-terminated, thus ending in a '\0' character. This corresponds to your second example.
The first example is printing a string of fixed length, which is simply a far less common occurrence.
The third example combines both: it looks for a zero terminator ('\0' character) or 7 characters maximum. This corresponds to calls like strncpy, for example.
The purpose of the terminating zero character is to terminate the string, i.e. to indirectly encode the string length information in the string itself. If you somehow already know the length of your string, you can write code that works correctly without relying on that terminating zero character. That's basically all.
Now, in your code sample the first cycle does something that does not make much sense. It prints 7 characters from a string that actually has length 6. I.e. it attempts to print the terminating zero as well. Why it is doing that - I don't know. In other words, the first output generated by your code is formally different from the rest, since it includes the effect of printing a zero character right after the ! sign. On your platform that effect just happened to be "invisible" on the screen, which is why you probably assumed that the first output is the same as the other ones. However, if you redirect the output to a file, you will be able to see that it is actually quite different.
The other output methods in your code simply output the string up to (and not including) the terminating zero character. The last cycle has redundant condition checking, since you know that the cycle will stop at zero character, before i will have a chance to hit 7.
Other than that, I don't know what "difference" you might be asking about. Please, clarify your question, if this doesn't answer it.
In your loop, you actually print the nul character. Generally this has no effect since it is a non-printing, non-control character. However printf("%s",a); will not output the nul at all - it uses it as a sentinel value. So your loop is not equivalent to %s formatted output.
If you try say:
char a[] = "123456" ;
char b[]={'h','e','l','l','o','!' } ; // No terminator
char c[] = "ABCDEF" ;
printf( "%s", a ) ;
printf( "%s", b ) ;
printf( "%s", c ) ;
You might clearly see why the nul terminator is essential. In my case it output:
123456
hello!╠╠╠╠╠╠╠╠╠╠123456
ABCDEF
Your mileage may vary - the result is undefined behaviour. In this case the output runs through to the adjacent string, and the compiler has inserted some unused space between them with "junk" in it. I placed a string on either side of the unterminated string because there is no way of telling how a particular compiler orders data in memory. Incidentally, when I declared the strings static, string b was output with no "run-on". Sometimes the surrounding "junk" may happen to already be zero.
I need to search through a chunk of memory for a string of characters, but several of these strings have every character null separated, like this:
"I. .a.m. .a. .s.t.r.i.n.g"
with all of the '.'s being null characters. My problem comes from actually getting this into memory. I've tried several ways, for instance:
char* str2;
str2 = (char*)malloc(sizeof(char)*40);
memcpy((void*)str2, "123\0567\09abc", 12);
Will put the following into the memory that str2 points to: 123.7.9abc..
Something like
str2 = "123456789\0abcde\054321";
Will have str2 pointing to a block of memory that looks like 123456789.abcde,321 , wherein the '.' is a null character, and the ',' is an actual comma.
So clearly inserting null characters into cstrings doesn't work as easily as I thought it did, like inserting a newline character. I encountered similar difficulties trying this with the string library as well. I could do separate assignments, something like:
char* str;
str = (char*)malloc(sizeof(char)*40);
strcpy(str, "123");
strcpy(str+4, "abc");
strcpy(str+8, "ABC");
But that is certainly not preferable, and I believe the problem lies in my understanding of how c-style strings are stored in memory. Clearly "abc\0123" doesn't actually go into memory as 61 62 63 00 31 32 33 (in hex). How is it stored, and how can I store what I need to?
(I also apologize for not having set the code in blocks, this is my first time posting a question, and somehow "four spaced" is more difficult than I can handle apparently. Thank you, Luchian. I see more newlines were needed.)
If every other char contains a null, then almost certainly you actually have UTF-16 encoded strings. Process them accordingly and your problems will disappear.
Assuming you are on Windows, where UTF-16 is common, you would use wchar_t* rather than char* to hold such strings. And you would use wide char string processing functions to operate on such data. For example, use wcscpy rather than strcpy and so on.
\0 is the start of an octal escape sequence, not just a "null character" (even though using it on its own will result in one).
The easiest way to define a string containing a null character followed by something that could also be treated as part of an octal escape (such as "\012" [1]) is to split it up using this feature of C:
char const * p = "123456789" "\0" "abcde" "\0" "54321";
[1] "\012" will result in the character with the equivalent hex value of 0x0A, not three characters; 0x00, '1' and '2'.
First off, every second character being a NULL is a clear hallmark of a widestring - a string that's composed of two-byte characters, really an array of unsigned shorts. Depending on your compiler and settings, you might be better off using datatype wchar_t instead of char and wcsxxx() family of functions instead of strxxx().
On Windows, 2-byte widestrings (UTF-16, technically) is the native string format of the OS, so they're all around the place.
That said, strxxx() functions all assume that the string is null-terminated. So plan accordingly. Sometimes memxxx() will come to the rescue.
"abc\0123" does not go into memory the way you expect because \012 is being interpreted by the compiler as a single octal escape sequence - the character with octal code 12 (that's 0a hex). To avoid this, use one of the following literals. An octal escape takes at most three digits, so \000 is safe; a hex escape has no such limit and swallows every hex digit that follows, so \x00 must be split from the digits after it just like \0:
"abc\000123"
"abc\x00""123"
"abc\0""123"
The snippet where you generate a string from chunks is mostly correct. It's just that I'd rather use
strcpy(str+strlen(str)+1, "123");
that guarantees that the next chunk will be written past the null character of the previous chunk.
I am a bit confused by your question.
But let me guess what is going on. You are looking at a 16-bit wchar_t string and not a normal C string.
A wide string holding ASCII characters may look like it has nulls between the letters, but this is actually normal.
Simply cast with (wchar_t *)XXX, where XXX is a pointer to that region of memory, and look up wchar_t operations like wcscpy etc. As for the nulls between strings, this may actually be a known method of passing several strings in one block. You can simply iterate, reading each string in turn, until you encounter 2 consecutive nulls.
Hope I have answered your question.
Good luck!
If, by mistake, I define a char array with no \0 as its last character, what happens then?
I'm asking this because I noticed that if I try to iterate through the array with while(cnt!='\0'), where cnt is an int variable used as an index to the array, and simultaneously print the cnt values to monitor what's happening, the iteration stops at the last character +2. The extra characters are of course random, but I can't see why it has to stop after 2. Does the compiler automatically insert a \0 character? Links to relevant documentation would be appreciated.
To make it clear I give an example. Let's say that the array str contains the word doh(with no '\0'). Printing the cnt variable at every loop would give me this:
doh+
or doh^
and so on.
EDIT (undefined behaviour)
Accessing array elements outside of the array boundaries is undefined behaviour.
Calling string functions with anything other than a C string is undefined behaviour.
Don't do it!
A C string is a sequence of bytes terminated by and including a '\0' (NUL terminator). All the bytes must belong to the same object.
Anyway, what you see is a coincidence!
But it might happen like this
                        ,------------------ garbage
                        | ,---------------- str[cnt] (when cnt == 4, no bounds-checking)
memory ----> [...|d|o|h|*|0|0|0|4|...]
                  |   |   \_____/ -------- cnt (big-endian, properly 4-byte aligned)
                  \___/ ------------------ str
If you define a char array without the terminating \0 (called a "null terminator"), then your string, well, won't have that terminator. You would do that like so:
char strings[] = {'h', 'e', 'l', 'l', 'o'};
The compiler never automatically inserts a null terminator in this case. The fact that your code stops after "+2" is a coincidence; it could just as easily have stopped at +50 or anywhere else, depending on whether there happened to be a \0 character in the memory following your string.
If you define a string as:
char strings[] = "hello";
Then that will indeed be null-terminated. When you use quotation marks like that in C, then even though you can't physically see it in the text editor, there is a null terminator at the end of the string.
There are some C string-related functions that will automatically append a null-terminator. This isn't something the compiler does, but part of the function's specification itself. For example, strncat(), which concatenates one string to another, will add the null terminator at the end.
However, if one of the strings you use doesn't already have that terminator, then that function will not know where the string ends and you'll end up with garbage values (or a segmentation fault.)
In C language the term string refers to a zero-terminated array of characters. So, pedantically speaking there's no such thing as "strings without a '\0' char". If it is not zero-terminated, it is not a string.
Now, there's nothing wrong with having a mere array of characters without any zeros in it, as long as you understand that it is not a string. If you ever attempt to work with such character array as if it is a string, the behavior of your program is undefined. Anything can happen. It might appear to "work" for some magical reasons. Or it might crash all the time. It doesn't really matter what such a program will actually do, since if the behavior is undefined, the program is useless.
This would happen if, by coincidence, the byte at *(str + 5) is 0 (as a number, not ASCII)
As far as most string-handling functions are concerned, strings always stop at a '\0' character. If you miss this null-terminator somewhere, one of three things will usually happen:
Your program will continue reading past the end of the string until it finds a '\0' that just happened to be there. There are several ways for such a character to be there, but none of them is usually predictable beforehand: it could be part of another variable, part of the executable code or even part of a larger string that was previously stored in the same buffer. Of course by the time that happens, the program may have processed a significant amount of garbage. If you see lots of garbage produced by a printf(), an unterminated string is a common cause.
Your program will continue reading past the end of the string until it tries to read an address outside its address space, causing a memory error (e.g. the dreaded "Segmentation fault" in Linux systems).
Your program will run out of space when copying over the string and will, again, cause a memory error.
And, no, the C compiler will not normally do anything but what you specify in your program - for example it won't terminate a string on its own. This is what makes C so powerful and also so hard to code for.
I bet that an int is defined just after your string and that this int takes only small values such that at least one byte is 0.
I am new to C and I am very much confused with the C strings. Following are my questions.
Finding last character from a string
How can I find out the last character from a string? I came with something like,
char *str = "hello";
printf("%c", str[strlen(str) - 1]);
return 0;
Is this the way to go? I somehow think that, this is not the correct way because strlen has to iterate over the characters to get the length. So this operation will have a O(n) complexity.
Converting char to char*
I have a string and need to append a char to it. How can i do that? strcat accepts only char*. I tried the following,
char delimiter = ',';
char text[6];
strcpy(text, "hello");
strcat(text, delimiter);
Using strcat with variables that has local scope
Please consider the following code,
void foo(char *output)
{
char *delimiter = ',';
strcpy(output, "hello");
strcat(output, delimiter);
}
In the above code,delimiter is a local variable which gets destroyed after foo returned. Is it OK to append it to variable output?
How strcat handles null terminating character?
If I am concatenating two null terminated strings, will strcat append two null terminating characters to the resultant string?
Is there a good beginner level article which explains how strings work in C and how can I perform the usual string manipulations?
Any help would be great!
Last character: your approach is correct. If you will need to do this a lot on large strings, your data structure containing strings should store lengths with them. If not, it doesn't matter that it's O(n).
Appending a character: you have several bugs. For one thing, your buffer is too small to hold another character. As for how to call strcat, you can either put the character in a string (an array with 2 entries, the second being 0), or you can just manually use the length to write the character to the end.
Your worry about 2 nul terminators is unfounded. While it occupies memory contiguous with the string and is necessary, the nul byte at the end is NOT "part of the string" in the sense of length, etc. It's purely a marker of the end. strcat will overwrite the old nul and put a new one at the very end, after the concatenated string. Again, you need to make sure your buffer is large enough before you call strcat!
O(n) is the best you can do, because of the way C strings work.
char delimiter[] = ",";. This makes delimiter a character array holding a comma and a NUL. Also, text needs to have length 7: hello is 5, then you have the comma, and a NUL.
If you define delimiter correctly, that's fine (as is, you're assigning a character to a pointer, which is wrong). The contents of output won't depend on delimiter later on.
It will overwrite the first NUL.
You're on the right track. I highly recommend you read K&R C 2nd Edition. It will help you with strings, pointers, and more. And don't forget man pages and documentation. They will answer questions like the one on strcat quite clearly. Two good sites are The Open Group and cplusplus.com.
A "C string" is in reality a simple array of chars, with str[0] containing the first character, str[1] the second and so on. After the last character, the array contains one more element, which holds a zero. This zero by convention signifies the end of the string. For example, those two lines are equivalent:
char str[] = "foo"; //str is 4 bytes
char str[] = {'f', 'o', 'o', 0};
And now for your questions:
Finding last character from a string
Your way is the right one. There is no faster way to know where the string ends than scanning through it to find the final zero.
Converting char to char*
As said before, a "string" is simply an array of chars, with a zero terminator added to the end. So if you want a string of one character, you declare an array of two chars - your character and the final zero, like this:
char str[2];
str[0] = ',';
str[1] = 0;
Or simply:
char str[2] = {',', 0};
Using strcat with variables that has local scope
strcat() simply copies the contents of the source array to the destination array, at the offset of the null character in the destination array. So it is irrelevant what happens to the source after the operation. But you DO need to worry if the destination array is big enough to hold the data - otherwise strcat() will overwrite whatever data sits in memory right after the array! The needed size is strlen(str1) + strlen(str2) + 1.
How strcat handles null terminating character?
The final zero is expected to terminate both input strings, and is appended to the output string.
Finding last character from a string
I propose a thought experiment: if it were generally possible to find the last character
of a string in better than O(n) time, then could you not also implement strlen
in better than O(n) time?
Converting char to char*
You temporarily can store the char in an array-of-char, and that will decay into
a pointer-to-char:
char delimiterBuf[2] = "";
delimiterBuf[0] = delimiter;
...
strcat(text, delimiterBuf);
If you're just using character literals, though, you can simply use string literals instead.
Using strcat with variables that has local scope
The variable itself isn't referenced outside the scope. When the function returns,
that local variable has already been evaluated and its contents have already been
copied.
How strcat handles null terminating character?
"Strings" in a C are NUL-terminated sequences of characters. Both inputs to
strcat must be NUL-terminated, and the result will be NUL-terminated. It
wouldn't be useful for strcat to write an extra NUL-byte to the result if it
doesn't need to.
(And if you're wondering what if the input strings have multiple trailing
NUL bytes already, I propose another thought experiment: how would strcat know
how many trailing NUL-bytes there are in a string?)
BTW, since you tagged this with "best-practices", I'll also recommend that you take care not to write past the end of your destination buffers. Typically this means avoiding strcat and strcpy (unless you've already checked that the input strings won't overflow the destination) and using safer versions (e.g. strncat. Note that strncpy has its own pitfalls, so that's a poor substitute. There also are safer versions that are non-standard, such as strlcpy/strlcat and strcpy_s/strcat_s.)
Similarly, functions like your foo function always should take an additional argument specifying what the size of the destination buffer is (and documentation should make it explicitly clear whether that size accounts for a NUL terminator or not).
How can I find out the last character
from a string?
Your technique with str[strlen(str) - 1] is fine. As pointed out, you should avoid repeated, unnecessary calls to strlen and store the results.
I somehow think that, this is not the
correct way because strlen has to
iterate over the characters to get the
length. So this operation will have a
O(n) complexity.
Repeated calls to strlen can be a bane of C programs. However, you should avoid premature optimization. If a profiler actually demonstrates a hotspot where strlen is expensive, then you can do something like this for your literal string case:
const char test[] = "foo";
sizeof test // 4
Of course if you create 'test' on the stack, it incurs a little overhead (incrementing/decrementing stack pointer), but no linear time operation involved.
Literal strings are generally not going to be so gigantic. For other cases like reading a large string from a file, you can store the length of the string in advance as but one example to avoid recomputing the length of the string. This can also be helpful as it'll tell you in advance how much memory to allocate for your character buffer.
I have a string and need to append a
char to it. How can i do that? strcat
accepts only char*.
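If you have a char and cannot make a string out of it (char* c = "a"), then you can use strncat. It appends at most the given number of characters from the source and then writes the terminator itself, so the source does not have to be null-terminated as long as it has that many characters: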
char ch = 'a';
strncat(str, &ch, 1);
In the above code,delimiter is a local
variable which gets destroyed after
foo returned. Is it OK to append it to
variable output?
Yes: functions like strcat and strcpy make deep copies of the source string. They don't leave shallow pointers behind, so it's fine for the local data to be destroyed after these operations are performed.
If I am concatenating two null
terminated strings, will strcat
append two null terminating characters
to the resultant string?
No, strcat will basically overwrite the null terminator on the dest string and write past it, then append a new null terminator when it's finished.
How can I find out the last character from a string?
Your approach is almost correct. The only way to find the end of a C string is to iterate through the characters, looking for the nul.
There is a bug in your answer though (in the general case). If strlen(str) is zero, you access the character before the start of the string.
I have a string and need to append a char to it. How can i do that?
Your approach is wrong. A C string is just an array of C characters with the last one being '\0'. So in theory, you can append a character like this:
char delimiter = ',';
char text[7];
strcpy(text, "hello");
int textSize = strlen(text);
text[textSize] = delimiter;
text[textSize + 1] = '\0';
However, if I leave it like that I'll get zillions of down votes because there are three places where I have a potential buffer overflow (if I didn't know that my initial string was "hello"). Before doing the copy, you need to put in a check that text is big enough to contain all the characters from the string plus one for the delimiter plus one for the terminating nul.
... delimiter is a local variable which gets destroyed after foo returned. Is it OK to append it to variable output?
Yes that's fine. strcat copies characters. But your code sample does no checks that output is big enough for all the stuff you are putting into it.
If I am concatenating two null terminated strings, will strcat append two null terminating characters to the resultant string?
No.
I somehow think that, this is not the correct way because strlen has to iterate over the characters to get the length. So this operation will have a O(n) complexity.
You are right; read Joel Spolsky on why C strings suck. There are a few ways around it: either don't use C strings (for example, use Pascal strings and create your own library to handle them), or don't use C (use, say, C++, which has a string class that is slow for different reasons, but where you could also write your own class to handle Pascal strings more easily than in C).
Regarding adding a char to a C string; a C string is simply a char array with a nul terminator, so long as you preserve the terminator it is a string, there's no magic.
char* straddch( char* str, char ch )
{
    char* end = &str[strlen(str)] ;
    *end = ch ;
    end++ ;
    *end = 0 ;
    return str ;
}
Just like strcat(), you have to know that the array that str is created in is long enough to accommodate the longer string, the compiler will not help you. It is both inelegant and unsafe.
If I am concatenating two null
terminated strings, will strcat append
two null terminating characters to the
resultant string?
No, just one, but whatever follows that may just happen to be nul, or whatever happened to be in memory. Consider the following equivalent:
char* my_strcat( char* s1, const char* s2 )
{
    strcpy( &s1[strlen(s1)], s2 ) ;
    return s1 ;
}
the first character of s2 overwrites the terminator in s1.
In the above code,delimiter is a local
variable which gets destroyed after
foo returned. Is it OK to append it to
variable output?
In your example delimiter is not a string, and initialising a pointer with a char makes no sense. However if it were a string, the code would be fine, strcat() copies the data from the second string, so the lifetime of the second argument is irrelevant. Of course you could in your example use a char (not a char*) and the straddch() function suggested above.