Printf function prints another string too - c

Hi, I am fairly new to C and was trying to understand strings. As far as I know, strings are just arrays of characters, so there shouldn't be a difference between char a[] = "car" and char a[] = {'c','a','r'}.
When I try to print the string as:
char a[] = "car";
char b[] = "testing the cars";
printf("%s", a);
the output is just car and there's no problem. But when I try to print it as:
char a[] = {'c','a','r'};
char b[] = "testing the cars";
printf("%s", a);
it prints b too. Can you explain why?

The %s specifier of printf() expects a char* pointer to a null-terminated string.
In the first case, a and b are both null-terminated. Initializing a char[] array of unspecified size with a string literal will include the literal's null-terminator '\0' character at the end. Thus:
char a[] = "car";
is equivalent to:
char a[] = {'c', 'a', 'r', '\0'};
In the second case, a is NOT null-terminated, leading to undefined behavior, as printf("%s", a) will read past the end of a into surrounding memory until it eventually finds a '\0' character. It just happens that, in your case, b exists in that memory following a, but that is not guaranteed: the compiler can put b wherever it wants.

Related

Declaration of a character array in C - Clash between Norminette and Compiler

I am hoping somebody could please clarify what I am doing wrong.
I am trying to replicate the strcpy function in C.
The exercise requires us to loop through a src string and replace the content at each corresponding index of the destination string.
My issue is that when I create the test function in int main(), I initialise a character array and assign it some content. It compiles fine, however I get a norminette error:
// Method 1
char str1[5] = "abcde";// Error: DECL_ASSIGN_LINE Declaration and assignation on a single line
char str2[5] = "fghij"; //Error: DECL_ASSIGN_LINE Declaration and assignation on a single line
If I initialise and assign like below, Norminette is ok but I get a compilation error:
// Method 2
char str1[5] = "abcde";
char str2[5] = "fghij";
str1[] = "abcde"; // error: expected expression before ‘]’ token ... (with an arrow pointing to ] bracket)
str2[] = "fghij"; // error: expected expression before ‘]’ token ... (with an arrow pointing to ] bracket)
// Method 3
char str1[] = {'a', 'b', 'c', 'd', 'e','\0'}; // Error: DECL_ASSIGN_LINE Declaration and assignation on a single line
char str2[] = {'f', 'g', 'h', 'i', 'j', '\0'};//Error: DECL_ASSIGN_LINE Declaration and assignation on a single line
I have also tried various other methods, including str[5] = "abcde" after the declaration, with no success.
My question is: how can I declare these character arrays to satisfy both the norminette and the compiler?
Also, is my understanding correct that in C, a character array and a string are interchangeable concepts?
Thank you
In method 2:
str1[] = "abcde"; // error:
is an assignment, so it's invalid. The [] syntax can only be used in a definition. More on this below.
Method 3 is fine. Whoever is flagging this is wrong.
In method 1 [and other places]:
char str1[5] = "abcde";
is wrong because it should be:
char str1[6] = "abcde";
to account for the EOS (0x00) string terminator at the end.
An alternate would be:
char str1[] = "abcde";
And, you did this [more or less] in method 3:
char str1[] = {'a', 'b', 'c', 'd', 'e', '\0'};
AFAICT, this was flagged by the external tool (not the compiler?). It is a perfectly valid alternative.
Also, is my understanding correct that in C, a character array and a string are interchangeable concepts?
No. In C, a string is a sequence of character values including a 0-valued terminator. The string "abcde" would be represented by the sequence {'a', 'b', 'c', 'd', 'e', 0}. That terminator is how the various string handling routines like strlen and strcpy know where the end of the string is.
Strings (including string literals like "abcde") are stored in arrays of character type, but not every array of character type stores a string - it could be a character sequence with no 0-valued terminator, or it could be a character sequence including multiple 0-valued bytes.
In order to store a string of N characters, the array has to be at least N+1 elements wide:
char str1[6] = "abcde"; // 5 printing characters plus 0 terminator
char str2[6] = "fghij";
You can declare the array without an explicit size, and the size (including the +1 for the terminator) will be determined from the size of the initializer:
char str1[] = "abcde"; // will allocate 6 elements for str1
char str2[] = "fghij";
I've found the documentation for Norminette and ... ugh.
I get that your school wants everyone to follow a common coding standard; that makes it easier to analyze and grade everyone's code. But some of its rules are just plain weird and non-idiomatic and, ironically, encourage bad style. If I'm interpreting it correctly, it wants you to write your initializers as
char str1[]
= "abcde";
or something equally bizarre. Nobody does that.
One way to get around the problem is to not initialize the array in the declaration, but assign it separately using strcpy:
char str1[6]; // size is required since we don't have an initializer
char str2[6];
strcpy( str1, "abcde" );
strcpy( str2, "fghij" );
You cannot use = to assign whole arrays outside of a declaration (initialization is not the same thing as assignment). IOW, you can't write something like
str1 = "abcde";
as a statement.
You either have to use a library function like strcpy or strncpy (for strings) or memcpy (for things that aren't strings), or you have to assign each element individually:
str1[0] = 'a';
str1[1] = 'b';
str1[2] = 'c';
...

Why it shows two arrays rather than one array

I am a beginner in C. Today, while writing a C program, I found something strange.
I want it to show abc, but it shows abcefg. I want to know why.
the code is:
#include <stdio.h>
int main() {
char a[3] = "abc";
char b[3] = "efg";
printf("%s", a);
return 0;
}
Its output is not abc but abcefg
Strings in C are zero terminated with '\0'. "abc" actually is { 'a', 'b', 'c', '\0' }, which is 4 chars. Your array a only has room for 3 chars so the '\0' isn't stored. When printf() tries to print the string stored in a, it reads and prints one character at a time until it encounters a terminating '\0', but there is none. So it continues reading and printing. And it happens that b is right next to a in memory, so the content of b gets printed as well.
Cure:
#include <stdio.h>
int main(void)
{
char a[4] = "abc";
char b[4] = "efg";
printf("%s", a);
}
or, even better, don't specify a size for the arrays at all. Let the compiler figure out the correct size based on the initializer "abc":
#include <stdio.h>
int main(void)
{
char a[] = "abc";
char b[] = "efg";
printf("%s", a);
}
char a[3] = "abc"; leaves no space for the 0-terminator, so printf will read out of bounds (undefined behavior) into the next memory location, where it happens to find the b array (by luck).
You should use char a[4] = "abc"; or char a[] = "abc";.
When you do not write an array size, the compiler will evaluate the minimum size from the initialization.
char b[3] = "efg"; has the same problem, but it seems that you are lucky enough to have a 0 byte afterwards.

Explanation regarding character pointer for string

Hi I'm new to programming.
In the following code str is a pointer to a character, so str should contain the address of the character 'h'. Therefore %p should be used to print that address. But I don't understand how %s is used for printing a pointer parameter.
#include<stdio.h>
int main (){
char s[] = "hello";
char *str = s;
int a[] = {1, 2, 3, 4, 5};
int *b = a;
printf("%s\n", str); // I don't understand how this works ?
printf("%c\n", *str); // This statement makes sense
printf("%c\n", *(str + 1)); // This statement also makes sense.
printf("%p\n",str); // This prints the address of the pointer str. This too makes sense.
printf("%d\n",*b); // makes sense, is the same as the second print.
// printf("%d",b); // I don't understand why str pointer works but this gives a compile error
return 0;
}
char s[] = "hello";
Declares an array of zero-terminated characters called s. It's the same as writing
char s[6] = { 'h', 'e', 'l', 'l', 'o', '\0' };
As you can see, the quotation marks are a shorthand.
char *str = s;
This declares str to be a pointer to a character. It then makes str point to the first character in s. In other words, str contains the address of the first character in s.
int a[] = {1, 2, 3, 4, 5};
Declares an array of integers. It initializes them to the values 1-5, inclusive.
int *b = a;
Declares b to be a pointer to an int. It then makes b point to the first int in a.
printf("%s\n", str);
The %s specifier accepts the address of the first character in the string. printf then walks from that address, printing the characters it sees, until it sees the \0 character at the end.
printf("%c\n", *str);
This prints the first character in str. Since str is pointing to a character (the first character in the string), then *str should obtain the character being pointed at (the first character in the string).
printf("%c\n", *(str + 1));
This prints the second character in str. This is the long way of writing str[1]. The logic behind this is pointer arithmetic. If str is the address of a character, then str + 1 is the address of the next character in the array. Since (str + 1) is an address, it may be dereferenced. Thus, the * obtains the character 1 character past the first character of the array.
printf("%p\n",str);
The %p specifier expects a pointer, just like %s would, but it does something else. Instead of printing the contents of a string, it simply prints the address the pointer is containing, in hex.
printf("%d\n",*b);
This prints the first int in the array pointed to by b. This is equivalent to writing b[0].
printf("%d",b);
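b is an int *, not an int, which is what %d expects. If you were trying to print the address of the first element of the array, the specifier would be %p (with the argument cast to void *), not %d. Also, the language does not require this line to be rejected at compile time; passing the wrong argument type to printf is undefined behavior at run time. Many compilers do check printf format strings and warn about the mismatch (GCC and Clang with -Wformat, for example), and a build that treats warnings as errors will then fail to compile.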

Initializing a char array with an explicit size and initialized to bigger than the size

I've been reading some code and I encountered the following:
int function(){
char str[4] = "ABC\0";
int number;
/* .... */
}
Normally, when you write a string literal to initialize a char array, the string should be null terminated implicitly right? What happens in this case? Does the compiler recognize the '\0' in the string literal and make that the null terminator? or does it overflow to the int number? Is there anything wrong with this form?
The C99 standard §6.7.8.¶14 says
An array of character type may be initialized by a character string
literal, optionally enclosed in braces. Successive characters of the
character string literal (including the terminating null character if
there is room or if the array is of unknown size) initialize the
elements of the array.
This means that the following statements are equivalent.
char str[4] = "ABC\0";
// equivalent to
char str[4] = "ABC";
// equivalent to
char str[4] = {'A', 'B', 'C', '\0'};
So there's nothing wrong with the first statement above. As the standard explicitly states, only as many characters of the string literal as fit in the array are used to initialize it. Note that the string literal "ABC\0" actually contains five characters: the four written ones plus the implicit terminator. '\0' is just like any other character, so it's fine.
However please note that there's a difference between
char str[4] = "ABC\0";
// equivalent to
char str[4] = {'A', 'B', 'C', '\0'};
char str[] = "ABC\0"; // sizeof(str) is 5
// equivalent to
char str[] = {'A', 'B', 'C', '\0', '\0'};
That's because the string literal "ABC\0" contains 5 characters and all these characters are used in the initialization of str when the size of the array str is not specified. Contrary to this, when the size of str is explicitly stated as 4, then only the first 4 characters in the literal "ABC\0" are used for its initialization as clearly mentioned in the above quoted para from the standard.
If the code is:
char str[3] = "ABC";
It's fine in C, but the character array str is not a string because it's not null-terminated. See C FAQ: Is char a[3] = "abc"; legal? What does it mean? for detail.
In your example:
char str[4] = "ABC\0";
The last character of the array str happens to be set to '\0', so it's fine and it's a string.

How to initialize a char array using a char pointer in C

Let's say I have a char pointer called string1 that points to the first character in the word "hahahaha". I want to create a char[] that contains the same string that string1 points to.
How come this does not work?
char string2[] = string1;
"How come this does not work?"
Because that's not how the C language was defined.
You can create a copy using strdup() [Note that strdup() is not ANSI C; it comes from POSIX and was only added to the C standard in C23]
Refs:
C string handling
strdup() - what does it do in C?
1) Make string2 point to the same storage as string1
A change made through either one is visible through the other
From poster poida
char string1[] = "hahahahaha";
char* string2 = string1;
2) Make a Copy
char string1[] = "hahahahaha";
char string2[11]; /* allocate sufficient memory plus null character */
strcpy(string2, string1);
A change to one of them will not change the other
What you write like this:
char str[] = "hello";
... actually becomes this:
char str[] = {'h', 'e', 'l', 'l', 'o', '\0'};
Here we are implicitly using what C calls an initializer.
The initializer determines the size and contents of the character array in the above scenario.
In effect, it does this behind the scenes:
char str[6];
str[0] = 'h';
str[1] = 'e';
str[2] = 'l';
str[3] = 'l';
str[4] = 'o';
str[5] = '\0';
C is a very low level language. Your statement:
char str[] = another_str;
doesn't make sense to C.
It is not possible to assign an entire array to another in C. You have to copy letter by letter, either manually or using the strcpy() function.
In the above statement, the initializer is neither a string literal nor a brace-enclosed list, which are the only forms C accepts for initializing a char array, so the compiler cannot take the contents or the length from another_str. If you write the string literal directly instead of another_str, it will work.
Some other languages might allow to do such things... but you can't expect a manual car to switch gears automatically. You are in charge of it.
In C you have to reserve memory to hold a string.
This is done automatically when you initialize a char[] with a string literal.
On the other hand, when you write string2 = string1,
what you are actually doing is copying a memory address from one pointer-to-char object to another. If string2 is declared as char* (pointer-to-char), then this initialization is valid:
char* string2 = "Hello.";
The variable string2 now holds the address of the first character of the constant array of char "Hello.".
It is also fine to write string2 = string1 when string2 is a char* and string1 is a char[].
However, a char[] has a fixed address in memory; the array itself cannot be assigned to.
So, it is not allowed to write statements like this:
char string2[];
string2 = (something...);
However, you are able to modify the individual characters of string2, because it is an array of characters:
string2[0] = 'x'; /* That's ok! */

Resources