String initialization with and without explicit trailing terminator

String initialization with and without explicit trailing terminator - c

What is the difference between
char str1[32] = "\0";
and
char str2[32] = "";

Since you already declared the sizes, the two declarations are exactly equal. However, if you do not specify the sizes, you can see that the first declaration makes a larger string:
char a[] = "a\0";
char b[] = "a";
printf("%i %i\n", sizeof(a), sizeof(b));
prints
3 2
This is because a ends with two nulls (the explicit one and the implicit one) while b ends only with the implicit one.

Well, assuming the two cases are as follows (to avoid compiler errors):
char str1[32] = "\0";
char str2[32] = "";
As people have stated, str1 is initialized with two null characters:
char str1[32] = {'\0','\0'};
char str2[32] = {'\0'};
However, according to both the C and C++ standard, if part of an array is initialized, then remaining elements of the array are default initialized. For a character array, the remaining characters are all zero initialized (i.e. null characters), so the arrays are really initialized as:
char str1[32] = {'\0','\0','\0','\0','\0','\0','\0','\0',
'\0','\0','\0','\0','\0','\0','\0','\0',
'\0','\0','\0','\0','\0','\0','\0','\0',
'\0','\0','\0','\0','\0','\0','\0','\0'};
char str2[32] = {'\0','\0','\0','\0','\0','\0','\0','\0',
'\0','\0','\0','\0','\0','\0','\0','\0',
'\0','\0','\0','\0','\0','\0','\0','\0',
'\0','\0','\0','\0','\0','\0','\0','\0'};
So, in the end, there really is no difference between the two.

As others have pointed out, "" implies one terminating '\0' character, so "\0" actually initializes the array with two null characters.
Some other answerers have implied that this is "the same", but that isn't quite right. There may be no practical difference -- as long the only way the array is used is to reference it as a C string beginning with the first character. But note that they do indeed result in two different memory initalizations, in particular they differ in whether Str[1] is definitely zero, or is uninitialized (and could be anything, depending on compiler, OS, and other random factors). There are some uses of the array (perhaps not useful, but still) that would have different behaviors.

Unless I'm mistaken, the first will initialize 2 chars to 0 (the '\0' and the terminator that's always there, and leave the rest untouched, and the last will initialize only 1 char (the terminator).

Related

Utility of '\0' in C string [duplicate]

This question already has answers here:
What is a null-terminated string?
(7 answers)
Closed last year.
#include <stdio.h>
#include <string.h>
int main()
{
char ch[20] = {'h','i'};
int k=strlen(ch);
printf("%d",k);
return 0;
}
The output is 2.
As far as I know '\0' helps compiler identify the end of string but the output here suggests the strlen can detect the end on it's own then why do we need '\0'?

long story short: it's your compiler making proactive decisions based on the standard.
long story:
char ch[20] = {'h','i'}
in the line above what you are implying to your compiler is;
allocate a memory big enough to store 20 characters (aka, array of 20 chars).
initialize first two slices (first two members of the array) as 'h' & 'i'.
implicitly initialize the rest.
since you are initialing your char array, your compiler is smart enough to insert the null terminator to the third element if it has enough space remaining. This process is the standard for initialization.
if you were to remove the initialization syntax and initialize each member manually like below, the result is undefined behavior.
char ch[20];
ch[0] = 'h';
ch[1] = 'i';
Also, if you were to not have extra space for your compiler to put the null terminator, even if you used a initializer the result would still be an undefined behavior as you can easily test via this code snippet below:
char ch[2] = { 'h','i' };
int k = strlen(ch);
printf("%d\n%s\n", k, ch);
now, if you were to increase the array size of 'ch' from 2 to 3 or any other number higher than 2, you can see that your compiler initializes it with the null terminator thus no more undefined behavior.

In this declaration:
char ch[20] = {'h','i'};
the first two elements are initialized explicitly and all other elements are initialized implicitly by zeroes.
The above declaration in fact (with one exceptions that the third element of the array is also explicitly initialized) is equivalent to:
char ch[20] = "hi";
Pat attention to that the string literal is represented as the following array:
{ 'h', 'i', '\0' }
That is the array contains a string that is terminated by the zero character '\0' and the function strlen can successfully find the length of the stored string.
If you would write for example:
char ch[2] = "hi";
then in this case the array ch does not have a space to store the terminating zero of the string literal. In this case applying the function strlen to this array invokes undefined behavior.

A null byte (i.e. the value 0) is what defines the end of a string in C.
When you defined ch, you gave less initializers than values in the array, so the remaining elements are set to 0. This results in a null terminated string.
The strlen function is basically looking for that value and counting how many elements it sees before it finds the null byte.

As far as I know '\0' helps compiler identify the end of string
Technically, it helps user code and the C runtime library identify the ends of strings. To the extent that the compiler needs to know where strings end, it knows without looking for a terminator.
but the output here suggests the strlen can detect the end on it's own
That would be a misinterpretation. The actual fact is that your string is null-terminated even though you did not put a null terminator in it explicitly. This is a consequence of declaring your array with an initializer that specifies values for only some of the elements. As some of your other answers describe in more detail, that does not produce a partial initialization. Rather, elements for which the initializer does not specify values are default-initialized. For elements of type char, that means initialization with 0, which serves as a string terminator.
Moreover, if the array were without a terminator then the result of passing it to strlen() would be undefined. You could not then conclude anything from the result.
then why do we need '\0'?
So that user code and many standard library functions can recognize the ends of strings. You already know this.
But in many cases we do not need to provide terminators explicitly. In particular, we do not need to represent them in string literals (and it means something different than you probably intended if you do), and you don't need to represent them in the initializers for char arrays storing strings, provided that the array has more elements than you specify in the initializer.

It is likely that your array ch contained zeros thus the byte after i is already set to zero. You can view it with a debugger or simply test it in the code. Trust me, strlen needs the zero to work.

Are char arrays guaranteed to be null terminated?

#include <stdio.h>
int main() {
char a = 5;
char b[2] = "hi"; // No explicit room for `\0`.
char c = 6;
return 0;
}
Whenever we write a string, enclosed in double quotes, C automatically creates an array of characters for us, containing that string, terminated by the \0 character
http://www.eskimo.com/~scs/cclass/notes/sx8.html
In the above example b only has room for 2 characters so the null terminating char doesn't have a spot to be placed at and yet the compiler is reorganizing the memory store instructions so that a and c are stored before b in memory to make room for a \0 at the end of the array.
Is this expected or am I hitting undefined behavior?

It is allowed to initialize a char array with a string if the array is at least large enough to hold all of the characters in the string besides the null terminator.
This is detailed in section 6.7.9p14 of the C standard:
An array of character type may be initialized by a character string
literal or UTF−8 string literal, optionally enclosed in braces.
Successive bytes of the string literal (including the terminating null
character if there is room or if the array is of unknown size)
initialize the elements of the array.
However, this also means that you can't treat the array as a string since it's not null terminated. So as written, since you're not performing any string operations on b, your code is fine.
What you can't do is initialize with a string that's too long, i.e.:
char b[2] = "hello";
As this gives more initializers than can fit in the array and is a constraint violation. Section 6.7.9p2 states this as follows:
No initializer shall attempt to provide a value for an object not contained within the entity
being initialized.
If you were to declare and initialize the array like this:
char b[] = "hi";
Then b would be an array of size 3, which is large enough to hold the two characters in the string constant plus the terminating null byte, making b a string.
To summarize:
If the array has a fixed size:
If the string constant used to initialize it is shorter than the array, the array will contain the characters in the string with successive elements set to 0, so the array will contain a string.
If the array is exactly large enough to contain the elements of the string but not the null terminator, the array will contain the characters in the string without the null terminator, meaning the array is not a string.
If the string constant (not counting the null terminator) is longer than the array, this is a constraint violation which triggers undefined behavior
If the array does not have an explicit size, the array will be sized to hold the string constant plus the terminating null byte.

Whenever we write a string, enclosed in double quotes, C automatically creates an array of characters for us, containing that string, terminated by the \0 character.
Those notes are mildly misleading in this case. I shall have to update them.
When you write something like
char *p = "Hello";
or
printf("world!\n");
C automatically creates an array of characters for you, of just the right size, containing the string, terminated by the \0 character.
In the case of array initializers, however, things are slightly different. When you write
char b[2] = "hi";
the string is merely the initializer for an array which you are creating. So you have complete control over the size. There are several possibilities:
char b0[] = "hi"; // compiler infers size
char b1[1] = "hi"; // error
char b2[2] = "hi"; // No terminating 0 in the array. (Illegal in C++, BTW)
char b3[3] = "hi"; // explicit size matches string literal
char b4[10] = "hi"; // space past end of initializer is always zero-initialized
For b0, you don't specify a size, so the compiler uses the string initializer to pick the right size, which will be 3.
For b1, you specify a size, but it's too small, so the compiler should give you a error.
For b2, which is the case you asked about, you specify a size which is just barely big enough for the explicit characters in the string initializer, but not the terminating \0. This is a special case. It's legal, but what you end up with in b2 is not a proper null-terminated string. Since it's unusual at best, the compiler might give you a warning. See this question for more information on this case.
For b3, you specify a size which is just right, so you get a proper string in an exactly-sized array, just like b0.
For b4, you specify a size which is too big, although this is no problem. There ends up being extra space in the array, beyond the terminating \0. (As a matter of fact, this extra space will also be filled with \0.) This extra space would let you safely do something like strcat(b4, ", wrld!").
Needless to say, most of the time you want to use the b0 form. Counting characters is tedious and error-prone. As Brian Kernighan (one of the creators of C) has written in this context, "Let the computer do the dirty work."
One more thing. You wrote:
and yet the compiler is reorganizing the memory store instructions so that a and c are stored before b in memory to make room for a \0 at the end of the array.
I don't know what's going on there, but it's safe to say that the compiler is not trying to "make room for a \0". Compilers can and often do store variables in their own inscrutable internal order, matching neither the order you declared them, nor alphabetical order, nor anything else you might think of. If under your compiler array b ended up with extra space after it which did contain a \0 as if to terminate the string, that was probably basically random chance, not because the compiler was trying to be nice to you and helping to make something like printf("%s\n", b) be better defined. (Under the two compilers where I tried it, printf("%s\n", b) printed hi^E and hi ??, clearly showing the presence of trailing random garbage, as expected.)

There are two things in your question.
String literal. String literal (ie something enclosed in the double quotes) is always the correct null character terminated string.
char *p = "ABC"; // p references null character terminated string
Character array may only hold as many elements as it has so if you try to initialize two element array with three elements string literal, only two first will be written. So the array will not contain the null character terminated C string
char p[2] = "AB"; // p is not a valid C string.

A array of char need not be terminated by anything at all. It is an array. If the actual content is smaller than the dimensions of the array then you need to track the size of that content.
Answers here seem to have degenerated into a string discussion. Not all arrays of char are strings. However it is a very strong convention to use a null terminator as a sentinel if they are to be handled as de facto strings.
Your array may use something else, and may also have separators and zones. After all it may be a Union or overlay a structure. Possibly a staging area for another system.

Exactly how many ways to declare String in C? (need help on modifiers and scope)

So I want to know exactly how many ways are there to declare string. I know similar questions have been asked for several times, but I think my focus is different. As a beginner in C, I want to know which method of declaration is correct and preferable, so that I can stick to it.
We all know we can declare string in the two following ways.
char str[] = "blablabla";
char *str = "blablabla";
After reading some Q&A in stack-overflow, I was told string literal is placed in the read-only part of the memory. So in order to create modifiable string, you need to create a character array of the length of the string + 1 in the stack and copy each characters to array.
So this is what the 1st line of code doing.
However, what the 2nd line of code does is to create a character pointer and assign the pointer the address of the 1st character located in the read-only part of the memory. So this kind of declaration does not involved copying character by character.
Please let me know if I am wrong.
So far it seems quite understandable, but what really confuses me is the modifiers. For instance, some suggests
const char *str = "blablabla";
instead of
char *str = "blablabla";
because if we do something like
*(str + 1) = 'q';
It will cause undefined behavior which is horrible.
But some go even further and suggest something like
static const char *str = "blablabla";
and say this will place the string into the static memory
which will never gets modified to avoid the undefined behavior.
So which is actually the #right# way to declare a string?
Besides, I am also interested in knowing the scope when declaring string.
For example,
(You can ignore the examples, both of them are buggy as pointed out by the others)
#include <stdio.h>
char **strPtr();
int main()
{
printf("%s", *strPtr());
}
char **strPtr()
{
char str[] = "blablabla";
char *charPtr = str;
char **strPtr = &charPtr;
return strPtr;
}
will print some garbage value.
But
#include <stdio.h>
char **strPtr();
int main()
{
printf("%s", *strPtr());
}
char **strPtr()
{
char *str = "blablabla";
/*As point out by other I am returning the address of a local variable*/
/*This causes undefined behavior*/
char **strPtr = &str;
return strPtr;
}
will work perfectly fine. (NO it doesn't, it is undefined behavior.)
I think I should leave it as another question.
This question is getting too long.

A lot of your confusion comes from a common mis-understanding about C and strings: one which is explicitly stated in the title of your question. The C language does not have a native string type, so in fact there are exactly zero ways to declare a string in C.
Spend some time reading Does C Have a String type? which does a good job of explaining that.
This is evident from the fact that you can't (sensibly) do the following:
char *a, *b;
// code to point a and b at some "strings"
if (a == b)
{
// do something if strings compare equal
}
The if statement will compare the values of the pointers, not the contents of the memory they address. So, if a and b pointed to two different areas of memory, each containing identical data, the comparison a == b would fail. The only time the comparison would evaluate as "true" (i.e. something other than zero), would be if a and b held the same address (i.e. pointed to the same location in memory).
What C has is a convention, and some syntactic sugar to make life easier.
The convention is that a "string" is represented as a sequence of char terminated with the value zero (usually referred to as NUL and represented by the special character escape sequence '\0'). This convention comes from the API of the original standard library (back in the 70's) which provides a set of string primitives such as strcpy(). These primitives were so fundamental to doing anything truly useful in the language that life was made easier for programmers by adding syntactic sugar (this is all before the language escaped from the lab).
The syntactic sugar is the existence of "string literals": no more, and no less. In source code, any sequence of ASCII characters, enclosed in double quotes, is interpreted as a string literal and the compiler produces a copy (in "read-only" memory) of the characters plus a terminating NUL byte to conform to the convention. Modern compilers detect duplicated literals and only produce a single copy - but it's not a requirement of the standard last time I looked. Thus this:
assert("abc" == "abc");
may or may not raise an assertion - which reinforces the statement that C does not have a native string type. (For that matter, neither does C++ - it has a String class!)
With that out of the way, how do you use string literals to initialize a variable?
The first (and most common) form you will see is this
char *p = "ABC";
Here, the compiler sets aside 4 bytes (assuming sizeof(char) ==1) of memory in a "read only" section of the program and initializes it with [0x41, 0x42, 0x43, 0x00]. It then initializes p with the address of that array. You should note that there is some const casting going on here as the underlying type of a string literal is const char * const (a constant pointer to a constant character). Which is why you would normally be advised to write this as:
const char *p = "ABC";
Which is a "pointer to a constant char" - another way of saying "pointer to read only memory".
The next two forms use string literals to initialize arrays
char p1[] = "ABC";
char p2[3] = "ABC";
Note that there is a critical difference between the two. the first line creates a 4 byte array. The second creates a 3 bytes array.
In the first case, as before, the compiler creates a 4 byte constant array containing [0x41, 0x42, 0x43, 0x00]. Note that it adds the trailing NUL to form a "C String". It then reserves four bytes of RAM (on the stack for a local variable, or in "static" memory for variables at file scope) and inserts code to initialize it at run time by copying the "read only" array into the allocated RAM. You are now free to modify elements of p1 at will.
In the second case, the compiler creates a 3 byte constant array containing [0x41, 0x42, 0x43]. Note that there is no trailing NUL. It then reserves 3 bytes of RAM (on the stack for a local variable, or in "static" memory for variables at file scope) and inserts code to initialize it at run time by copying the "read only" array into the allocated RAM. You are again now free to modify elements of p2 at will.
The difference in sizes of the two arrays p1 and p2 is critical. The following code (if you ran it) would demonstrate it.
char p1[] = "ABC";
char p2[3] = "ABC";
printf ("p1 = %s\n", p1); // Will print out "p1 = ABC"
printf ("p2 = %s\n", p2); // Will print out "p2 = ABC!##$%^&*"
The output of the second printf is unpredictable, and could theoretically result in your code crashing. It tends to seem to work, simply because so much of RAM is filled with zeroes that eventually printf finds a terminating NUL.
Hope this helps.

storing of strings in char arrays in C

#include<stdio.h>
int main()
{
char a[5]="hello";
puts(a); //prints hello
}
Why does the code compile correctly? We need six places to store "hello", correct?

The C compiler will let you run off the end of arrays, it does no checks of that sort.

The C compiler allows you to explicitly ask for no null terminator.
char a[] = "Hello"; /* adds a terminator implicitly */
char a[6] = "Hello"; /* adds a terminator implicitly */
char a[5] = "Hello"; /* skips it */
Any value smaller than 5 results in an error.
As for why - one possibility is that your strings are of a fixed size, or are being used as buffers of byte values. In these cases you do not need a null terminator.
Best practice is to use char a[] so the compiler can set it to the correct value (including terminator) automatically.

a doesn't contain a null terminated string (extra initializers for fixed size arrays - such as the null terminator in "hello" - are discarded), so the behaviour when a pointer to that array is passed to puts is undefined.

In my experience, a lot of compilers will let you get away with compiling this. It will usually crash at runtime, though (because you don't have a null terminator).

C char array initialization includes the terminating null only if there is room or if the array dimensions are not specified.

You need 6 characters to store "hello" as a null terminated string. But char arrays are not constrained to store nul terminated string, you may need the array for another purpose and forcing an additional nul character in those cases would be pointless.

That is because in C memory management is done manually unlike in java and some other few languages....
The six places you allocated is not checked for during compilation but if you
have to get into filing(I mean storing actually) you are going to have a runtime error becuase the program kept five places in memory(but is expected to hold six) for the characters but the compiler did not check!

"hello" string is kept in read-only memory with 0 in the end. "a" points to this string, this is why the program may work correctly. But I think that generally this is undefined behavior.
It is necessary to see Assembly code generated by compiler to see what happens exactly. If you want to get junk output in this situation, try:
char a[5] = {'h', 'e', 'l', 'l', 'o'}

The C compiler you are using does not check that the string literal fits to the char array. You need 6 characters in the array to fit the literal "Hello" since the literal includes a terminating zero. Modern compilers, such as Visual C++ 2010 do check these things and give you and error.

I'm new to C, can someone explain why the size of this string can change?

I have never really done much C but am starting to play around with it. I am writing little snippets like the one below to try to understand the usage and behaviour of key constructs/functions in C. The one below I wrote trying to understand the difference between char* string and char string[] and how then lengths of strings work. Furthermore I wanted to see if sprintf could be used to concatenate two strings and set it into a third string.
What I discovered was that the third string I was using to store the concatenation of the other two had to be set with char string[] syntax or the binary would die with SIGSEGV (Address boundary error). Setting it using the array syntax required a size so I initially started by setting it to the combined size of the other two strings. This seemed to let me perform the concatenation well enough.
Out of curiosity, though, I tried expanding the "concatenated" string to be longer than the size I had allocated. Much to my surprise, it still worked and the string size increased and could be printf'd fine.
My question is: Why does this happen, is it invalid or have risks/drawbacks? Furthermore, why is char str3[length3] valid but char str3[7] causes "SIGABRT (Abort)" when sprintf line tries to execute?
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
void main() {
char* str1 = "Sup";
char* str2 = "Dood";
int length1 = strlen(str1);
int length2 = strlen(str2);
int length3 = length1 + length2;
char str3[length3];
//char str3[7];
printf("%s (length %d)\n", str1, length1); // Sup (length 3)
printf("%s (length %d)\n", str2, length2); // Dood (length 4)
printf("total length: %d\n", length3); // total length: 7
printf("str3 length: %d\n", (int)strlen(str3)); // str3 length: 6
sprintf(str3, "%s<-------------------->%s", str1, str2);
printf("%s\n", str3); // Sup<-------------------->Dood
printf("str3 length after sprintf: %d\n", // str3 length after sprintf: 29
(int)strlen(str3));
}

This line is wrong:
char str3[length3];
You're not taking the terminating zero into account. It should be:
char str3[length3+1];
You're also trying to get the length of str3, while it hasn't been set yet.
In addition, this line:
sprintf(str3, "%s<-------------------->%s", str1, str2);
will overflow the buffer you allocated for str3. Make sure you allocate enough space to hold the complete string, including the terminating zero.

void main() {
char* str1 = "Sup"; // a pointer to the statically allocated sequence of characters {'S', 'u', 'p', '\0' }
char* str2 = "Dood"; // a pointer to the statically allocated sequence of characters {'D', 'o', 'o', 'd', '\0' }
int length1 = strlen(str1); // the length of str1 without the terminating \0 == 3
int length2 = strlen(str2); // the length of str2 without the terminating \0 == 4
int length3 = length1 + length2;
char str3[length3]; // declare an array of7 characters, uninitialized
So far so good. Now:
printf("str3 length: %d\n", (int)strlen(str3)); // What is the length of str3? str3 is uninitialized!
C is a primitive language. It doesn't have strings. What it does have is arrays and pointers. A string is a convention, not a datatype. By convention, people agree that "an array of chars is a string, and the string ends at the first null character". All the C string functions follow this convention, but it is a convention. It is simply assumed that you follow it, or the string functions will break.
So str3 is not a 7-character string. It is an array of 7 characters. If you pass it to a function which expects a string, then that function will look for a '\0' to find the end of the string. str3 was never initialized, so it contains random garbage. In your case, apparently, there was a '\0' after the 6th character so strlen returns 6, but that's not guaranteed. If it hadn't been there, then it would have read past the end of the array.
sprintf(str3, "%s<-------------------->%s", str1, str2);
And here it goes wrong again. You are trying to copy the string "Sup<-------------------->Dood\0" into an array of 7 characters. That won't fit. Of course the C function doesn't know this, it just copies past the end of the array. Undefined behavior, and will probably crash.
printf("%s\n", str3); // Sup<-------------------->Dood
And here you try to print the string stored at str3. printf is a string function. It doesn't care (or know) about the size of your array. It is given a string, and, like all other string functions, determines the length of the string by looking for a '\0'.

Instead of trying to learn C by trial and error, I suggest that you go to your local bookshop and buy an "introduction to C programming" book. You'll end up knowing the language a lot better that way.
There is nothing more dangerous than a programmer who half understands C!

What you have to understand is that C doesn't actually have strings, it has character arrays. Moreover, the character arrays don't have associated length information -- instead, string length is determined by iterating over the characters until a null byte is encountered. This implies, that every char array should be at least strlen + 1 characters in length.
C doesn't perform array bounds checking. This means that the functions you call blindly trust you to have allocated enough space for your strings. When that isn't the case, you may end up writing beyond the bounds of the memory you allocated for your string. For a stack allocated char array, you'll overwrite the values of local variables. For heap-allocated char arrays, you may write beyond the memory area of your application. In either case, the best case is you'll error out immediately, and the worst case is that things appear to be working, but actually aren't.
As for the assignment, you can't write something like this:
char *str;
sprintf(str, ...);
and expect it to work -- str is an uninitialized pointer, so the value is "not defined", which in practice means "garbage". Pointers are memory addresses, so an attempt to write to an uninitialized pointer is an attempt to write to a random memory location. Not a good idea. Instead, what you want to do is something like:
char *str = malloc(sizeof(char) * (string length + 1));
which allocates n+1 characters worth of storage and stores the pointer to that storage in str. Of course, to be safe, you should check whether or not malloc returns null. And when you're done, you need to call free(str).
The reason your code works with the array syntax is because the array, being a local variable, is automatically allocated, so there's actually a free slice of memory there. That's (usually) not the case with an uninitialized pointer.
As for the question of how the size of a string can change, once you understand the bit about null bytes, it becomes obvious: all you need to do to change the size of a string is futz with the null byte. For example:
char str[] = "Foo bar";
str[1] = (char)0; // I'd use the character literal, but this editor won't let me
At this point, the length of the string as reported by strlen will be exactly 1. Or:
char str[] = "Foo bar";
str[7] = '!';
after which strlen will probably crash, because it will keep trying to read more bytes from beyond the array boundary. It might encounter a null byte and then stop (and of course, return the wrong string length), or it might crash.
I've written all of one C program, so expect this answer to be inaccurate and incomplete in a number of ways, which will undoubtedly be pointed out in the comments. ;-)

Your str3 is too short - you need to add extra byte for null-terminator and the length of "<-------------------->" string literal.
Out of curiosity, though, I tried
expanding the "concatenated" string to
be longer than the size I had
allocated. Much to my surprise, it
still worked and the string size
increased and could be printf'd fine.
The behaviour is undefined so it may or may not segfault.

strlen returns the length of the string without the trailing NULL byte (\0, 0x00) but when you create a variable to hold the combined strings you need to add that 1 character.
char str3[length3 + 1];
…and you should be all set.

C strings are '\0' terminated and require an extra byte for that, so at least you should do
char str3[length3 + 1]
will do the job.

In sprintf() ypu are writing beyond the space allocated for str3. This may cause any type of undefined behavior (If you are lucky then it will crash). In strlen(), it is just searching for a NULL character from the memory location you specified and it is finding one in 29th location. It can as well be 129 also i.e. it will behave very erratically.

A few important points:
Just because it works doesn't mean it's safe. Going past the end of a buffer is always unsafe, and even if it works on your computer, it may fail under a different OS, different compiler, or even a second run.
I suggest you think of a char array as a container and a string as an object that is stored inside the container. In this case, the container must be 1 character longer than the object it holds, since a "null character" is required to indicate the end of the object. The container is a fixed size, and the object can change size (by moving the null character).
The first null character in the array indicates the end of the string. The remainder of the array is unused.
You can store different things in a char array (such as a sequence of numbers). It just depends on how you use it. But string function such as printf() or strcat() assume that there is a null-terminated string to be found there.

Develop Reference

c reactjs sql-server angularjs arrays wpf database batch-file google-app-engine silverlight