#include<stdio.h>
void main()
{
char a[10][5] = {"hi", "hello", "fellow"};
printf("%s",a[0]);
}
Why is this code printing only hi?
#include<stdio.h>
void main()
{
char a[10][5] = {"hi", "hello", "fellow"};
printf("%s",a[1]);
}
While this code is printing "hellofellow"
Why this code printing only hi
You've told printf to print the string stored at a[0], and that string happens to be "hi".
While this code is printing "hellofellow"
This one is by coincidence, in fact your code ought to be rejected by the compiler due to a constraint violation:
No initializer shall attempt to provide a value for an object not contained within the entity being initialized.
The string "fellow", specifically the 'w' at the end of it does not fit within the char[5] being initialised, and this violates the C standard. Perhaps also by coincidence, your compiler provides an extension (making it technically a non-C compiler), and so you don't see the error messages that I do:
prog.c:3:6: error: return type of 'main' is not 'int' [-Werror=main]
void main()
^
prog.c: In function 'main':
prog.c:5:37: error: initializer-string for array of chars is too long [-Werror]
char a[10][5] = {"hi", "hello", "fellow"};
^
Note that the second error message is complaining about "fellow", but not "hello". Your "hello" initialisation is valid by exception:
An array of character type may be initialized by a character string literal or UTF-8 string literal, optionally enclosed in braces. Successive bytes of the string literal (including the terminating null character if there is room or if the array is of unknown size) initialize the elements of the array.
The emphasis is mine. What the emphasised section states is that if there isn't enough room for a terminal '\0' character, that won't be used in the initialisation.
Your code:
char a[10][5] = {"hi", "hello", "fellow"};
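Allocates 10 arrays of char[5].
"hello" takes up all 5, so there is no room for the terminating \0, and printing it runs on into whatever is stored for "fellow".
"fellow" is too big for a char[5]. A compiler that accepts it anyway typically drops the excess 'w' rather than spilling it into a[3], so a[2] holds "fello" with no terminator and a[3] is all zeros.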
Aside from being undefined behavior, it is confusing what you were trying to do
This gives undefined behaviour, because strings must be null-terminated.
The element "hello" has a length of 5, so it fills its row and leaves no room for the terminator.
Declare your array as a[10][7] and you will get the intended output.
See here -https://ideone.com/c2zUs0
Why this code printing only hi
Because a[0][2] is null indicating termination thus giving you hi.
This is undefined behavior due to insufficient space to store \0 character.
Please note that the memory allocated is 5 bytes per string in your array of strings. Thus, for a[1] there is not sufficient memory to store the \0 character, as all five bytes are taken by "hello".
Thus, the subsequent memory is read until the \0 character is found.
Thus, you can change the line:
char a[10][5] = {"hi", "hello", "fellow"};
to
char a[][7] = {"hi", "hello", "fellow"};
Why this code printing only hi
This is because the \0 character is already encountered at a[0][2] and thus the reading of the characters is stopped.
What Your Code Does:
Look at the following statement:
char a[10][5] = {"hi", "hello", "fellow"};
It allocates 10 rows. 5 characters are allocated for each index of a.
What is the Problem:
Strings are null-terminated: a null terminator must always be stored in addition to the given characters, so the space a string needs is numOfCharacters + 1, with the extra byte for the null terminator. When you initialize the array with exactly as many characters as its size, the null terminator is skipped. printf prints a character array until it finds the first \0 (null terminator), so without one it keeps reading past the end of the row.
The Solution:
No need to worry about this problem, all you need to do is just to set the size equal to the numOfCharactersInString + 1. You can use the following statement:
char a[10][7] = {"hi", "hello", "fellow"};
Since the largest string is "fellow" which contains 6 characters, you need to set the size 6 + 1 that is why the statement should use char a[10][7] instead of char a[10][5]
Hope it helps.
When you declare a 2-D character array as
char a[10][5] = {"hi", "hello", "fellow"};
char a[10][5] reserves memory to store 10 strings each of length 5 which means 4 characters + 1 '\0' character. A point to note is that the array elements are stored in contiguous memory locations.
a[0] points to the first string, a[1] to the second and so on.
Also when you initialize an array partially the other uninitialized elements become 0 instead of being garbage values.
Now in your case,after initialization if you try to visualize the array it would be something like
hi\0\0\0hellofello\0\0...
Now the command
printf("%s",a[0]);
prints characters starting from 'h' of "hi" and stops printing when a '\0' is encountered so "hi" is printed.
Now for the second case,
printf("%s",a[1]);
characters are printed starting from the 'h' of "hello" until a '\0' is encountered. Now the '\0' character is encountered only after printing "hellofello", hence the output.
Related
I have a doubt how the length for an array is allocated
#include <stdio.h>
#include <string.h>
int main()
{
char str[] = "s";
long unsigned a = strlen(str);
scanf("%s", str);
printf("%s\n%lu\n", str, a);
return 0;
}
In the above program, I assign the string "s" to a char array.
I thought the length of str[] is 1, so we cannot store more than the length of the array. But it behaves differently: if I read a string using scanf, it is stored in str[] without any error. What is the length of the array str?
Sample I/O :
Hello
Hello 1
Your str is an array of char initialized with "s", that is, it has size 2 and length 1. The size is one more than the length because a NUL string terminator character (\0) is added at the end.
Your str array can hold at most two char. Trying to write more will cause your program to access memory past the end of the array, which is undefined behavior.
What actually happens though, is that since the str array is stored somewhere in memory (on the stack), and that memory region is far larger than 2 bytes, you are actually able to write past the end without causing a crash. This does not mean that you should. It's still undefined behavior.
Since your array has size 2, it can only hold a string of length 1, plus its terminator. To use scanf() and correctly avoid writing past the end of the array, you can use the field width specifier: a numeric value after the % and before the s, like this:
scanf("%1s", str);
When an array is declared without specifying its size, the size is determined by the initializers used.
In this declaration of an array
char str[] = "s";
a string literal is used as the initializer. A string literal is a sequence of characters followed by a terminating zero character. That is, the string literal "s" has two characters { 's', '\0' }.
Its characters are used to initialize the elements of the array str in sequence.
So if you write
printf( "sizeof( str ) = %zu\n", sizeof( str ) );
then the output will be 2. The length of a string is defined as the number of characters before the terminating zero character. So if you write
#include <string.h>
//...
printf( "strlen( str ) = %zu\n", strlen( str ) );
then the output will be 1.
If you try to write data outside an array, you get undefined behavior, because memory that does not belong to the array is overwritten. In some cases you get the expected result; in other cases the program terminates abnormally. That is, the behavior of the program is undefined.
The array str has size 2: 1 byte for the character 's' and one for the terminating null byte. What you're doing is writing past the end of the array. Doing so invokes undefined behavior.
When your code has undefined behavior, it could crash, it could output strange results, or it could (as in this case) appear to work properly. Also, making a seemingly unrelated change such as a printf call for debugging or an unused local variable can change how undefined behavior manifests itself.
1. Which of the following has a null terminator character added at the end?
int main()
{
char arr[]="sample";
char arr2[6]="sample";
char arr3[7]="sample";
char* strarr="sample";
char* strarr1=arr;
char* strarr2=arr2;
char* strarr3=arr3;
return 0;
}
2. Would printf("%s",somestr) fail in case:
somestr is an array of char with no null termination character at end?
somestr is a char* pointing to a continuous location of chars with no null termination character at end?
Edit : Is there a way I can check in gdb if a char* or a char array is null terminated or not?
First, "sample" is called a string literal. It creates an array of char terminated with a null character. In C the array is not const-qualified, but it must be treated as read-only.
Let us go on:
char arr[]="sample";
The right-hand side is a char array of size 7 (6 characters and a '\0'). The dimension of arr is deduced from its initializer and is also 7. The char array is then initialized from the string literal.
char arr2[6]="sample";
arr2 has a declared size of 6. It is initialized from a string literal of size 7: only the 6 declared positions are initialized to {'s', 'a', 'm', 'p', 'l', 'e'} with no terminating null. Nothing is wrong here, except that passing arr2 to a function that expects a null terminated string invokes Undefined Behaviour.
char arr3[7]="sample";
The declared size and the size of the initializing string literal are both 7: it is just an explicit version of the first use case. Rather dangerous, because if you later add one character to the initialization string you will get a char array that is not null terminated.
char* strarr="sample";
Avoid that. You are declaring a non-const char pointer to a string literal, while the standard states explicitly:
If the program attempts to modify such an array, the behavior is undefined.
strarr[3] = 'i' would then invoke Undefined Behaviour with no warning. That being said, provided you never modify the string, you have a nice null terminated string.
char* strarr1=arr;
Ok, you declare a pointer to another string. Or more exactly a pointer to the first character of another string. And it is correctly null terminated.
char* strarr2=arr2;
You have a pointer to the first character of a char array that is not null terminated... You could not pass arr2 to a function expecting a null terminated char array, and you cannot pass strarr2 either.
char* strarr3=arr3;
You have another pointer pointing to a string. Same behaviour as strarr1.
As for how to check in gdb for the terminating null: you cannot see it by printing the string, because gdb knows enough of C strings to stop printing at the first null character. But you can always use p arr[6] to see whether the last element of the array is a null or not.
For arr2, arr2+6 is one past the array. So it is undefined what lies there, and on a truly bad system, using p arr2[6] could raise a signal because it could be after the end of a memory segment, but I must admit that I have never seen that...
Each of arr and arr3 contains a null terminated string allocated on the stack when the function is called.
strarr points to a null terminated string allocated in the read-only data section of the program.
strarr1 points to a null terminated string allocated on the stack when the function is called.
strarr3 points to a null terminated string allocated on the stack when the function is called.
str points to the same string as strarr1.
I have a simple code in C to see if three same char arrays all end with '\0':
int main(){
char a[4] = "1234";
char b[4] = "1234";
char c[4] = "1234";
if(a[4] == '\0')
printf("a end with '\\0'\n");
if(b[4] == '\0')
printf("b end with '\\0'\n");
if(c[4] == '\0')
printf("c end with '\\0'\n");
return 0;
}
But the output shows that only array b ends with terminator '\0'. Why is that? I supposed all char arrays have to end with '\0'.
Output:
b end with '\0'
The major problem is, for an array defined like char a[4] = .... (with a size of 4 elements), using if (a[4] ....) is already off-by-one and causes undefined behavior.
You want to check for a[3], as that is the last valid element.
That said, in your case, you don't have room for the null terminator!
Emphasizing the quote from C11, §6.7.9,
An array of character type may be initialized by a character string literal or UTF−8 string literal, optionally enclosed in braces. Successive bytes of the string literal (including the terminating null character if there is room or if the array is of unknown size) initialize the elements of the array.
So, you need to either
use an array size which has room for the null-terminator
or, use an array of unknown size, like char a[ ] = "1234"; where, the array size is automatically determined by the length of the supplied initializer (including the null-terminator.)
It is undefined behaviour because you are trying to access the array out of bounds.
Do not specify the bound of a string initialized with a string literal because the compiler will automatically allocate sufficient space for entire string literal,including the terminating null character.
C standard(c11 - 6.7.9 : paragraph 14) says:
An array of character type may be initialized by a character string
literal or UTF−8 string literal, optionally enclosed in braces.
Successive bytes of the string literal (including the terminating null
character if there is room or if the array is of unknown size)
initialize the elements of the array.
So, do not specify the bound of a character array when you initialize it with a string literal:
char a[] = "1234";
You need one more place at the end of the array to store the \0. Declare the arrays with length 5.
You can access the nth element of an array if the array has n elements. Here the size of the arrays are 4 bytes and you are trying to get the 5th byte (as array indices in C start from 0) when you do something like if(a[4] == '\0').
Run the above code without specifying the array sizes, and all three if statements will succeed. Here the size was specified as 4, while the string needs one more char for the null terminator, so the arrays have no room for it. Reading a[4] is then out of bounds, and what it yields is unpredictable.
I'm starting to understand pointers and how to dereference them etc. I've been practising with ints but I figured a char would behave similarly. Use the * to dereference, use the & to access the memory address.
But in my example below, the same syntax is used to set the address of a char and to save a string to the same variable. How does this work? I think I'm just generally confused and maybe I'm overthinking it.
int main()
{
char *myCharPointer;
char charMemoryHolder = 'G';
myCharPointer = &charMemoryHolder;
printf("%s\n", myCharPointer);
myCharPointer = "This is a string.";
printf("%s\n", myCharPointer);
return 0;
}
First, you need to understand how "strings" work in C.
"Strings" are stored as an array of characters in memory. Since there is no way of determining how long the string is, a NUL character, '\0', is appended after the string so that we know where it ends.
So for example if you have a string "foo", it may look like this in memory:
--------------------------------------------
| 'f' | 'o' | 'o' | '\0' | 'k' | 'b' | 'x' | ...
--------------------------------------------
The things after '\0' are just stuff that happens to be placed after the string, which may or may not be initialised.
When you assign a "string" to a variable of type char *, what happens is that the variable will point to the beginning of the string, so in the above example it will point to 'f'. (In other words, if you have a string str, then str == &str[0] is always true.) When you assign a string to a variable of type char *, you are actually assigning the address of the zeroth character of the string to the variable.
When you pass this variable to printf(), it starts at the pointed address, then goes through each char one by one until it sees '\0' and stops. For example if we have:
char *str = "foo";
and you pass it to printf(), it will do the following:
Dereference str (which gives 'f')
Dereference (str+1) (which gives 'o')
Dereference (str+2) (which gives another 'o')
Dereference (str+3) (which gives '\0' so the process stops).
This also leads to the conclusion that what you're currently doing is actually wrong. In your code you have:
char charMemoryHolder = 'G';
myCharPointer = &charMemoryHolder;
printf("%s\n", myCharPointer);
When printf() sees the %s specifier, it goes to address pointed to by myCharPointer, in this case it contains 'G'. It will then try to get next character after 'G', which is undefined behaviour. It might give you the correct result every now and then (if the next memory location happens to contain '\0'), but in general you should never do this.
Several comments
Static strings in C are treated as a (char *) to a null-terminated array of characters. E.g. "ab" would essentially be a char * to a block of memory with 97 98 0. (97 is 'a', 98 is 'b', and 0 is the null termination.)
Your code myCharPointer = &charMemoryHolder; followed by printf("%s\n", myCharPointer) is not safe. printf should be passed a null terminated string, and there's no guarantee that the memory immediately following your character charMemoryHolder contains the value 0.
In C, string literals evaluate to pointers to read-only arrays of chars (except when used to initialize char arrays). This is a special case in the C language and does not generalize to other pointer types. A char * variable may hold the address of either a single char variable or the start address of an array of characters. In this case the array is a string of characters which has been stored in a static region of memory.
charMemoryHolder is a variable that has an address in memory.
"This is a string." is a string constant that is stored in memory and also has an address.
Both of these addresses can be stored in myCharPointer and dereferenced to access the first character.
In the case of printf("%s\n", myCharPointer), the pointer will be dereferenced and the character displayed, then the pointer is incremented. It repeats this until it finds a null (value zero) character and stops.
Hopefully you are now wondering what happens when you are pointing to the single 'G' character, which is not null-terminated like a string constant. The answer is "undefined behavior" and will most likely print random garbage until it finds a zero value in memory, but could print exactly the correct value, hence "undefined behavior". Use %c to print the single character.
I am a little confused by the following C code snippets:
printf("Peter string is %zu bytes\n", sizeof("Peter")); // Peter string is 6 bytes
This tells me that when C compiles a string in double quotes, it will automatically add an extra byte for the null terminator.
printf("Hello '%s'\n", "Peter");
The printf function knows when to stop reading the string "Peter" because it reaches the null terminator, so ...
char myString[2][9] = {"123456789", "123456789" };
printf("myString: %s\n", myString[0]);
Here, printf prints all 18 characters because there's no null terminators (and they wouldn't fit without taking out the 9's). Does C not add the null terminator in a variable definition?
Your array is [2][9]. Each row of [9] holds ['1', '2', etc... '8', '9']. Because you only gave it room for 9 chars in the second array dimension, and because you used all 9, it has no room to place a '\0' character. Redefine your char array:
char string[2][10] = {"123456789", "123456789"};
And it should work.
Sure it does, you just aren't leaving enough room for the '\0' byte. Making it:
char string[2][10] = { "123456789", "123456789" };
Will work as you expect (will just print 9 characters).
If you tell C that an array is a given size, C cannot make the array any larger. It would be disobeying you if it did so! Remember that not every char array contains a null terminated string. Sometimes the array (as used) is truly an array of (individual) char. The compiler doesn't know what you are doing and cannot read your mind.
This is why C allows you to initialize a char array where the null terminator won't fit but everything else will. Try your example with a string one byte longer and the compiler will complain.
Note that your example will compile but will not do what you expect, as the contents are not (null terminated) strings. With GCC, running your example, I see the string I should, followed by garbage.
Alternatively, you can use:
char* myString[2] = {"123456789", "123456789" };
Like this, the initializer computes the right size for your null terminated strings.
C allows unterminated strings, C++ does not.
C allows character arrays to be initialized with string constants. It also allows a string constant initializer to contain exactly one more character than the array it initializes, i.e., the implicit terminating null character of the string may be ignored. For example:
char name1[] = "Harry"; // Array of 6 char
char name2[6] = "Harry"; // Array of 6 char
char name3[] = { 'H', 'a', 'r', 'r', 'y', '\0' };
// Same as 'name1' initialization
char name4[5] = "Harry"; // Array of 5 char, no null char
C++ also allows character arrays to be initialized with string constants, but always includes the terminating null character in the initialization. Thus the last initializer (name4) in the example above is invalid in C++.
Is there a reason why the compiler doesn't warn that there isn't enough room for the 0 byte? I get a warning if I try to add another '9' that won't fit, but it doesn't seem to care about dropping the 0 byte?
The '\0' byte isn't its problem. Most of the time, if you have this:
char code[9] = "123456789";
The next byte will be off the edge of the variable. It is often unused memory that happens to be 0 (unless you malloc() and don't set the values before using them), so it frequently appears to work, even though reading it is undefined behaviour.
If you're using gcc, you might also want to use the -Wall flag, or one of the other (million) warning flags. This might help (not sure).