The following code snippet gives unexpected output in Turbo C++ compiler:
char a[]={'a','b','c'};
printf("%s",a);
Why doesn't this print abc? In my understanding, strings are implemented as one dimensional character arrays in C.
Secondly, what is the difference between %s and %2s?
This is because your string is not zero-terminated. This will work:
char a[]={'a','b','c', '\0'};
The %2s specifies the minimum width of the printout. Since you are printing a 3-character string, this will be ignored. If you used %5s, however, your string would be padded on the left with two spaces.
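To make the width versus precision distinction concrete, here is a minimal sketch that formats into a buffer with snprintf so the result can be inspected. The helper names (min_width_2, min_width_5, max_chars_2) are made up for this illustration and are not part of any library.

```c
#include <stdio.h>

/* Hypothetical helpers: each writes the formatted text into out (capacity cap). */

/* "%2s": minimum field width 2; longer strings are printed in full. */
void min_width_2(char *out, size_t cap, const char *s)
{
    snprintf(out, cap, "%2s", s);
}

/* "%5s": minimum field width 5; shorter strings are padded on the left. */
void min_width_5(char *out, size_t cap, const char *s)
{
    snprintf(out, cap, "%5s", s);
}

/* "%.2s": precision 2; at most two characters of s are written. */
void max_chars_2(char *out, size_t cap, const char *s)
{
    snprintf(out, cap, "%.2s", s);
}
```

With "abc" as input, the first helper leaves the string unchanged, the second pads it with two leading spaces, and the third truncates it to "ab". A width never truncates; only a precision does.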
char a[]={'a','b','c'};
Well one problem is that strings need to be null terminated:
char a[]={'a','b','c', 0};
Without changing the original char array, you can also use
char a[]={'a','b','c'};
printf("%.3s",a);
or
char a[]={'a','b','c'};
printf("%.*s",(int)sizeof(a),a); /* the '*' precision argument must be an int */
or
char a[]={'a','b','c'};
fwrite(a,3,1,stdout);
or
char a[]={'a','b','c'};
fwrite(a,sizeof(a),1,stdout);
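The precision-based variants above can be wrapped in a small function. This is only a sketch: format_chars is a hypothetical name, and it writes into a caller-supplied buffer instead of stdout so the result is easy to check.

```c
#include <stdio.h>

/* Hypothetical helper: turns the first n chars of data (which need not be
   null-terminated) into a proper C string stored in out (capacity cap).
   Returns what snprintf returns: the number of chars the full result needs. */
int format_chars(char *out, size_t cap, const char *data, size_t n)
{
    /* The argument consumed by '*' must be an int, hence the cast. */
    return snprintf(out, cap, "%.*s", (int)n, data);
}
```

Because the precision limits how many bytes are read, data does not need a terminator as long as n does not exceed the size of the array.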
Because you aren't using a string. To be considered as a string you need the 'null termination': '\0' or 0 (yes, without quotes).
You can achieve this by two forms of initializations:
char a[] = {'a', 'b', 'c', '\0'};
or by letting the compiler add the terminator for you:
char a[] = "abc";
Whenever we store a string in C, we always need one extra character at the end to mark the end of the string.
The extra character used is the null character '\0'.
In your program above, the null character is missing.
You can define your string as
char a[] = "abc";
to get the desired result.
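The difference between the two initializations shows up in sizeof. A minimal sketch, with function names invented for this example:

```c
#include <stddef.h>

/* Brace-initialized from three character constants: no terminator is added. */
size_t size_without_terminator(void)
{
    char a[] = {'a', 'b', 'c'};   /* 3 elements */
    return sizeof a;
}

/* Initialized from a string literal: the compiler appends '\0'. */
size_t size_with_terminator(void)
{
    char a[] = "abc";             /* 4 elements: 'a', 'b', 'c', '\0' */
    return sizeof a;
}
```

The first function returns 3 and the second returns 4; that one extra byte is what lets %s find the end of the string.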
I wanted to get the first character of a string using strncpy, here's my code
int main() {
char s[] = "abcdefghi";
char c[] = "";
printf("%c\n",c);
printf("%s\n",s);
strncpy(c,s,1);
printf("%c\n",c);
printf("%s\n",s);
return 0;
}
The problem is that c is still an empty string, what's wrong with it?
The problem is that c is an array (which decays into a pointer), while the %c format of printf() requires a character. If you pass the pointer when the function expects a character, you get undefined behavior, and a possible output will be the first byte of the address of the array, interpreted as an ASCII character.
Also, please note that char c[] = ""; simply declares an array of length 1 which contains the null (\0) character. If you try to print that, like you do in your code, you will likely get weird output.
If c is meant to only store a single character, declare it as a simple char variable, and pass its address to strncpy():
char s[] = "abcdefghi";
char c = ' ';
printf("%c\n", c);
printf("%s\n", s);
strncpy(&c, s, 1); //notice the & before c
printf("%c\n", c);
printf("%s\n", s);
However, you have to be careful. Passing any value greater than 1 to strncpy() asks for big trouble, since you would be writing to an undefined location.
s is an array of 10 chars, c is an array with one char. Do you understand why? And which char is stored in c?
%s is the format for strings, %c is the format for chars. But c is not a char. An array containing one char is an array, not a char. So when you try to print c you get rubbish.
Now the bad one: strncpy in your case doesn’t produce a C string because there isn’t enough space for one. If you try to print c with %c you get nonsense. If you try to print it with %s you get undefined behaviour. Do NOT use strncpy unless you really know what it is doing (you should generally avoid strncpy).
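If the goal is just the first character, plain indexing (char first = s[0];) is enough. For copying a prefix into a real C string, one possible alternative to strncpy is sketched below. copy_prefix is a hypothetical helper, not a standard function, and it assumes src is a properly terminated string.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies at most n chars of src into dst, never writes
   more than cap bytes, and always terminates dst when cap > 0.
   Returns the number of chars copied (excluding the terminator). */
size_t copy_prefix(char *dst, size_t cap, const char *src, size_t n)
{
    size_t len;

    if (cap == 0)
        return 0;               /* no room even for the terminator */

    len = strlen(src);
    if (len > n)
        len = n;                /* caller asked for at most n chars */
    if (len > cap - 1)
        len = cap - 1;          /* keep one byte for '\0' */

    memcpy(dst, src, len);
    dst[len] = '\0';
    return len;
}
```

Called with a two-byte destination and n equal to 1, this yields the one-character string "a" for the input "abcdefghi", which can then be printed safely with %s.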
If this code is correct:
char v1[ ] = "AB";
char v2[ ] = {"AB"};
char v3[ ] = {'A', 'B'};
char v4[2] = "AB";
char v5[2] = {"AB"};
char v6[2] = {'A', 'B'};
char *str1 = "AB";
char *str2 = {"AB"};
Then why is this other one not?
char *str3 = {'A', 'B'};
To the best of my knowledge (please correct me if I'm wrong at any point) "AB" is a string literal and 'A' and 'B' are characters (integers,scalars). In char *str1 = "AB"; the string literal "AB" is defined and the char pointer is set to point to that string literal (to the first element). With char *str3 = {'A', 'B'}; two characters are defined and stored in subsequent memory positions, and the char pointer "should" be set to point to the first one. Why is that not correct?
In a similar way, a regular char array like v3[] or v6[2] can indeed be initialized with {'A', 'B'}. The two characters are defined, the array is set to point to them and thus, being "turned into" or treated like a string literal. Why does a char pointer like char *str3 not behave in the same way?
Just for the record, gcc compiler warnings I get are "initialization makes pointer from integer without a cast" when it gets to the 'A', and "excess elements in scalar initializer" when it gets to the 'B'.
Thanks in advance.
There is one thing you need to learn about constant string literals. Except when used to initialize an array (for example in the case of v1 in your example code) constant string literals are themselves arrays. For example if you use the literal "AB" it is stored somewhere by the compiler as an array of three characters: 'A', 'B' and the terminator '\0'.
When you initialize a pointer to point to a literal string, as in the case of str1 and str2, then you are making those pointers point to the first character in those arrays. You don't actually create an array named str1 (for example) you just make it point somewhere.
The definition
char *str1 = "AB";
is equivalent to
char *str1;
str1 = "AB";
Or rather
char unnamed_array_created_by_compiler[] = "AB";
char *str1 = unnamed_array_created_by_compiler;
There are also other problematic things with the definitions you show. First of all the arrays v3, v4, v5 and v6. You tell the compiler they will be arrays of two char elements. That means you cannot use them as strings in C, since strings need the special terminator character '\0'.
In fact if you check the sizes of v1 and v2 you will see that they are indeed three bytes large, one for each of the characters plus the terminator.
Another important thing you miss is the constant part: string literals are arrays of char, but they must be treated as read-only, even if they are not stored in read-only memory (modifying one is undefined behaviour). That's why you should never make a plain pointer to char (like str1 and str2) point to them; you should use pointers to constant char instead. I.e.
const char *str1 = "AB";
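If you do want a pointer initialized from a list of characters, C99 compound literals provide the unnamed array that the plain brace list lacks. The sketch below assumes a C99 or later compiler (this is not valid C++), and the function name is made up for the example.

```c
#include <string.h>

/* Returns 1 if the compound literal behaves like a writable "AB" string. */
int compound_literal_demo(void)
{
    /* C99 compound literal: an unnamed, modifiable array of 3 chars.
       Inside a function it lives until the enclosing block ends, so the
       pointer must not be returned or used after that. */
    char *str3 = (char[]){'A', 'B', '\0'};

    str3[0] = 'X';                 /* allowed, unlike writing to "AB" */
    return strcmp(str3, "XB") == 0;
}
```

The (char[]) prefix is what creates the array object; without it, the braces are read as an initializer for the pointer itself, which is where the two gcc warnings come from.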
Double quotes (" ") denote a string and single quotes (' ') denote a character. A string literal has memory allocated for it, whereas a character constant is just a value. A pointer has to point to memory that exists somewhere, so it cannot be initialized from a list of characters, while an array of characters provides its own storage.
I am learning C. Some characters are being added automatically to my program. What am I doing wrong?
#include <stdio.h>
#include <string.h>
int main() {
char test1[2]="xx";
char test2[2]="xx";
printf("test is %s and %s.\n", test1, test2);
return 0;
}
Here is how I am running it on Fedora 20.
gcc -o problem problem.c
./problem
test is xx?}� and xx#.
I would expect the answer would be test is xx and xx.
The issue is that string literals such as "xx" have an extra character that is the nul-termination, \0, that is, it is composed of the characters 'x', 'x' and '\0'.
This is how functions that take char* and treat them as strings know the extent of the strings. Your arrays are simply one element too short, missing the nul-terminator. By passing char* that don't point to a nul-terminated string to a function that expects one, you are invoking undefined behaviour.
You can initialize them like this instead:
char test[] = "xx";
This will result in test having the correct length of 3. You can test that using the sizeof operator. Of course, you can also be explicit about the length:
char test[3] = "xx";
but this is more error-prone.
When you define a String in C like this
char A[] = "hello";
It gets initialized something like this
char A[] = { 'h', 'e', 'l', 'l', 'o', '\0' };
That last null character is needed for it to be a string. So in your code
char test1[2]="xx";
You have made the test1 character array to be 2 characters long, leaving no space for the null character.
To correct your program, you can either not give the size of the character array, like
char test1[]="xx";
Or give it one more element than the number of characters you are filling in, like
char test1[3]="xx";
In your code char test1[2]="xx", char test1[2] creates a kind of "container" for two chars, but the string "xx" implicitly has three chars, xx0, where 0 marks the end of the string. This 0 tells printf where to stop reading the input string. In your case printf doesn't find this 0, because it doesn't fit into test1, so it keeps reading until it hits some random zero in memory, printing everything it meets on the way.
You should change your declaration to the following:
char test1[3]="xx";
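When the size of a buffer is known, it is possible to check for a terminator before treating the buffer as a string. A minimal sketch using memchr from <string.h>; is_terminated is a hypothetical helper name:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if buf[0..size-1] contains a '\0',
   i.e. it is safe to hand buf to %s or strlen, and 0 otherwise. */
int is_terminated(const char *buf, size_t size)
{
    /* memchr never reads past size bytes, unlike strlen. */
    return memchr(buf, '\0', size) != NULL;
}
```

For a two-element array holding 'x', 'x' this returns 0, while for char test1[3]="xx" it returns 1.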
I am a little confused by the following C code snippets:
printf("Peter string is %zu bytes\n", sizeof("Peter")); // Peter string is 6 bytes
This tells me that when C compiles a string in double quotes, it will automatically add an extra byte for the null terminator.
printf("Hello '%s'\n", "Peter");
The printf function knows when to stop reading the string "Peter" because it reaches the null terminator, so ...
char myString[2][9] = {"123456789", "123456789" };
printf("myString: %s\n", myString[0]);
Here, printf prints all 18 characters because there's no null terminators (and they wouldn't fit without taking out the 9's). Does C not add the null terminator in a variable definition?
Your array is [2][9]. Each row of [9] holds ['1', '2', etc... '8', '9']. Because you only gave it room for 9 chars in the second array dimension, and because you used all 9, there is no room left for the '\0' character. Redefine your char array:
char string[2][10] = {"123456789", "123456789"};
And it should work.
Sure it does, you just aren't leaving enough room for the '\0' byte. Making it:
char string[2][10] = { "123456789", "123456789" };
Will work as you expect (will just print 9 characters).
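A quick way to confirm that each row now has room for its terminator is to compare the row capacity with the string length. This is a sketch only; the array name rows and both function names are invented for the example.

```c
#include <stddef.h>
#include <string.h>

/* Two rows of 10 chars: 9 digits plus the terminating '\0' in each row. */
static const char rows[2][10] = { "123456789", "123456789" };

/* Bytes reserved for one row, terminator included. */
size_t row_capacity(void)
{
    return sizeof rows[0];
}

/* Characters stored in row i before the '\0'. */
size_t row_length(size_t i)
{
    return strlen(rows[i]);
}
```

Here the capacity is 10 and the length is 9, so strlen and %s stop inside the row instead of running on into the next one.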
If you tell C that an array is a given size, C cannot make the array any larger. It would be disobeying you if it did so! Remember that not every char array contains a null terminated string. Sometimes the array (as used) is truly an array of (individual) char. The compiler doesn't know what you are doing and cannot read your mind.
This is why C allows you to initialize a char array where the null terminator won't fit but everything else will. Try your example with a string one byte longer and the compiler will complain.
Note that your example will compile but will not do what you expect, as the contents are not (null terminated) strings. With GCC, running your example, I see the string I should, followed by garbage.
Alternatively, you can use:
const char *myString[2] = {"123456789", "123456789" };
Like this, the initializer computes the right size for your null terminated strings.
C allows unterminated strings, C++ does not.
C allows character arrays to be initialized with string constants. It also allows a string constant initializer to contain exactly one more character than the array it initializes, i.e., the implicit terminating null character of the string may be ignored. For example:
char name1[] = "Harry"; // Array of 6 char
char name2[6] = "Harry"; // Array of 6 char
char name3[] = { 'H', 'a', 'r', 'r', 'y', '\0' };
// Same as 'name1' initialization
char name4[5] = "Harry"; // Array of 5 char, no null char
C++ also allows character arrays to be initialized with string constants, but always includes the terminating null character in the initialization. Thus the last initializer (name4) in the example above is invalid in C++.
Is there a reason why the compiler doesn't warn that there isn't enough room for the 0 byte? I get a warning if I try to add another '9' that won't fit, but it doesn't seem to care about dropping the 0 byte?
The '\0' byte isn't its problem. Most of the time, if you have this:
char code[9] = "123456789";
The next byte is past the end of the variable. It may happen to be 0, in which case the code appears to work, but nothing guarantees that, and reading it is undefined behaviour either way.
If you're using gcc, you might also want to use the -Wall flag, or one of the other (million) warning flags. This might help (not sure).
This C program gives a weird result:
#include <stdio.h>
#include <string.h>
int main(int argc, char *argv[])
{
char str1[5] = "abcde";
char str2[5] = " haha";
printf("%s\n", str1);
return 0;
}
when I run this code I get:
abcde haha
I only want to print the first string as can be seen from the code.
Why does it print both of them?
"abcde" is actually 6 bytes long because of the null terminating character in C strings. When you do this:
char str1[5] = "abcde";
You aren't storing the null terminating character so it is not a proper string.
When you do this:
char str1[5] = "abcde";
char str2[5] = " haha";
printf("%s\n", str1);
It just happens to be that the second string is stored right after the first, although this is not required. By calling printf on a string that isn't null terminated you have already caused undefined behavior.
Update:
As stated in the comments by clcto this can be avoided by not explicitly specifying the size of the array and letting the compiler determine it based off of the string:
char str1[] = "abcde";
or use a pointer instead if that works for your use case, although they are not the same:
const char *str1 = "abcde";
Both strings str1 and str2 are not null terminated. Therefore the statement
printf("%s\n", str1);
will invoke undefined behavior.
printf prints the characters in a string one by one until it encounters a '\0', which is not present in your array. In this case printf continues past the end of the array until it finds a null character somewhere in memory. In your case it seems that printf runs past the end of "abcde" and continues to print the characters of the second string " haha", which by chance is located just after the first one in memory.
Better to change the block
char str1[5] = "abcde";
char str2[5] = " haha";
to
char str1[] = "abcde";
char str2[] = " haha";
to avoid this problem.
Technically, this behavior is not unexpected, it is undefined: your code is passing a pointer to a C string that lacks null terminator to printf, which is undefined behavior.
In your case, though, it happens that the compiler places two strings back-to-back in memory, so printf runs into null terminator after printing str2, which explains the result that you get.
If you would like to print only the first string, add space for null terminator, like this:
char str1[6] = "abcde";
Better yet, let the compiler compute the correct size for you:
char str1[] = "abcde";
You have invoked undefined behaviour. Here:
char str1[5] = "abcde";
str1 has space for the above five letters but no null terminator.
Passing a string that is not null-terminated to printf with %s then invokes undefined behaviour.
In general, it is not a good idea to pass strings that are not null-terminated to standard functions that expect (C) strings.
In C such declarations
char str1[5] = "abcde";
are allowed. The string literal supplies six initializers, because it includes the terminating zero, but the array on the left-hand side has only five elements, so the terminating zero is not stored.
The result is the same as
char str1[5] = { 'a', 'b', 'c', 'd', 'e' };
If you compile the string-literal declaration as C++, the compiler issues an error.
It would be better to declare the array without specifying explicitly its size.
char str1[] = "abcde";
In this case it would have the size equal to the number of characters in the string literal including the terminating zero that is equal to 6. And you can write
printf("%s\n", str1);
Otherwise the function continues to print characters beyond the array until it meets the zero character.
Nevertheless, the 5-element declaration is not an error. You simply have to give printf a precision so that it reads no more than five characters:
printf("%5.5s\n", str1);
and you will get the expected result.
The %s specifier reads until it finds a null terminator, so the array must have room for the '\0' that the string literal already ends with:
char str1[6] = "abcde";