C: String assignment to character array

After reading this c-faq question I am totally confused about what is happening here.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
void main ()
{
char ar[3]="NIS", *c;
printf ("%s\n", ar);
strcpy (c, ar);
printf ("%s\n", c);
if (ar[4] == '\0')
{
printf ("Null");
}
else
{
printf ("%c\n", ar[4]);
}
}
Here I have assigned "NIS", which is exactly the size of the declared array. When I try to access ar[3] and ar[4], both give null. Why? That makes sense for ar[3], but why for ar[4]?
Another thought: the c-faq says that if you initialize an array with a string exactly as long as the array, you can't use printf("%s") or strcpy() on that array. But in the code above I have used both printf and strcpy, and both work fine. I may have misinterpreted it; please correct me.
Another question: when I compare ar[5] with null, nothing is printed, which is fine, but why does it print Null for ar[4]? My thinking is that the string "NIS" is stored in memory like this:
--------------------------------------------------------
|   N   |   I    |   S   |   \0   | Garbage value here
|_______|________|_______|________|_____________________
  ar[0]    ar[1]   ar[2]    ar[3]
ar[3] gives null when I compare it with '\0', which is fine, but when I compare ar[4] with '\0' it still gives me null instead of some garbage value.
Thanks in advance.

Your code exhibits undefined behaviour. It works for you by chance, but on another machine it could fail. As you understood from the FAQ, the code is not valid. But that does not mean it will always fail. That is simply the nature of undefined behaviour. Literally anything can happen.
Accessing ar[3] is illegal because that is beyond the end of the array. Valid indices for this array are 0, 1 and 2.
You did not allocate memory for c so any de-referencing of the pointer is undefined behaviour.
Your main declaration is wrong. You should write:
int main(void)

Don't do this. The declaration char ar[3] gives you a three-character array, for which you can use the indexes 0 through 2 inclusive.
Any use of other indexes (for dereferencing) is undefined behaviour and should not be done.
The reason why it may be working is because there's nothing stating that the "garbage" values have to be non-zero. That's what garbage means in this context, they could be anything.
Your strcpy is also undefined behaviour since your c pointer has not been initialised to anything useful.

ar[3] does not exist because ar is only 3 characters long.
That faq is saying that it is legal, but that it's not a C string.
If the array is too short, the null character will be cut off.
Basically, "abc" is silently 'a', 'b', 'c', 0. However, since ar is of length 3 and not four, the null byte gets truncated.
What the compiler chooses to do in this situation (and the OS) is not known. If it happens to work, that's just by luck.

Related

Character array initialization in C

I am trying to understand the array concept in string.
char a[5]="hello";
Here, array a is a character array of size 5. "hello" occupies the array indices 0 to 4. Since we have declared the array size as 5, there is no space to store the null character at the end of the string.
So my understanding is that when we try to print a, it should print until a null character is encountered. Otherwise it may also run into a segmentation fault.
But when I ran it on my system, it always printed "hello" and terminated.
Can anyone clarify whether my understanding is correct, or does it depend on the system we run it on?
As ever so often, the answer is:
Undefined behavior is undefined.
What this means is, trying to feed this character array to a function handling strings is wrong. It's wrong because it isn't a string. A string in C is a sequence of characters that ends with a \0 character.
The C standard will tell you that this is undefined behavior. So, anything can happen. In C, you don't have runtime checks, the code just executes. If the code has undefined behavior, you have to be prepared for any effect. This includes working like you expected, just by accident.
It's very well possible that the byte following in memory after your array happens to be a \0 byte. In this case, it will look to any function processing this "string" as if you passed it a valid string. A crash is just waiting to happen on some seemingly unrelated change to the code.
You could try to add some char foo = 42; before or after the array definition, it's quite likely that you will see that in the output. But of course, there's no guarantee, because, again, undefined behavior is undefined :)
What you have done is undefined behavior. Apparently whatever compiler you used happened to initialize memory after your array to 0.
Here, array a is a character array of size 5. "hello" occupies the array indices 0 to 4. Since we have declared the array size as 5, there is no space to store the null character at the end of the string.
So my understanding is that when we try to print a, it should print until a null character is encountered.
Yes, when you use printf("%s", a), it prints characters until it hits a '\0' character (or segfaults or something else bad happens - undefined behavior). I can demonstrate that with a simple program:
#include <stdio.h>
int main()
{
char a[5] = "hello";
char b[5] = "world";
int c = 5;
printf("%s%s%d\n", a, b, c);
return 0;
}
Output:
$ ./a.out
helloworldworld5
You can see the printf function continuing to read characters after it has already read all the characters in array a. I don't know when it will stop reading characters, however.
I've slightly modified my program to demonstrate how this undefined behavior can create bad problems.
#include <stdio.h>
#include <string.h>
int main()
{
char a[5] = "hello";
char b[5] = "world";
int c = 5;
printf("%s%s%d\n", a, b, c);
char d[5];
strcpy(d, a);
printf("%s", d);
return 0;
}
Here's the result:
$ ./a.out
helloworld��world��5
*** stack smashing detected ***: <unknown> terminated
helloworldhell�p��UAborted (core dumped)
This is a classic case of stack overflow (pun intended) due to undefined behavior.
Edit:
I need to emphasize: this is UNDEFINED BEHAVIOR. What happened in this example may or may not happen to you, depending on your compiler, architecture, libraries, etc. You can make guesses to what will happen based on your understanding of different implementations of various libraries and compilers on different platforms, but you can NEVER say for certain what will happen. My example was on Ubuntu 17.10 with gcc version 7. My guess is that something very different could happen if I tried this on an embedded platform with a different compiler, but I cannot say for certain. In fact, something different could happen if I had this example inside of a larger program on the same machine.

pointer in c program

program in c language
void main()
{
char *a,*b;
a[0]='s';
a[1]='a';
a[2]='n';
a[3]='j';
a[4]='i';
a[5]='t';
printf("length of a %d/n", strlen(a));
b[0]='s';
b[1]='a';
b[2]='n';
b[3]='j';
b[4]='i';
b[5]='t';
b[6]='g';
printf("length of b %d\n", strlen(b));
}
here the output is :
length of a 6
length of b 12
Why and please explain it.
thanks in advance.
You are assigning to pointer (which contains garbage) without allocating memory. What you are noticing is Undefined Behavior. Also main should return an int. Also it does not make sense to try and find the length of an array of chars which are not nul terminated.
This is how you can go about:
Sample code
When you declare a local variable without initializing it, it holds whatever happened to be in that memory previously. Since pointers are essentially numbers, whatever number was there is treated as the address of some random memory location.
Then, when setting a[i], the compiler steps i * sizeof(*a) bytes forward from a; thus a[i] is the object at address (a + i*1) (1 because a char occupies one byte).
Finally, C strings need to be NUL-terminated ('\0', also known as a sentinel), and functions like strlen walk the string until they hit the sentinel. Most likely, your memory had a stray 0 somewhere that caused strlen to stop.
Allocate some memory and terminate the strings; then it will work better:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int main(void){
char *a=malloc(10);
char *b=malloc(10);
if(a){
a[0]='s';
a[1]='a';
a[2]='n';
a[3]='j';
a[4]='i';
a[5]='t';
a[6]=(char)0;
printf("length of a %d\n", (int)strlen(a));
}else{
printf("Failed to allocate 10 bytes\n" );
}
if(b){
b[0]='s';
b[1]='a';
b[2]='n';
b[3]='j';
b[4]='i';
b[5]='t';
b[6]='g';
b[7]=(char)0;
printf("length of b %d\n", (int)strlen(b));
}else{
printf("Failed to allocate 10 bytes\n" );
}
free(a);
free(b);
}
Undefined behavior. That's all.
You're using an uninitialized pointer. After that, all bets are off as to what will happen.
Of course, we can attempt to explain why your particular implementation acts in a certain way but it'd be quite pointless outside of novelty.
The indexing operator is de-referencing the pointers a and b, but you never initialized those pointers to point at valid memory. Writing through an uninitialized pointer triggers undefined behavior.
You are simply "lucky" (or unlucky, it depends on your viewpoint) that the program doesn't crash, that the pointer values are such that you succeed in writing at those locations.
Note that you never write the termination character ('\0') to either string, but still get the "right" value from strlen(); this implies that a and b both point at memory that happens to be full of zeros. More luck.
This is a very broken program; that it manages to run "successfully" is because its behavior is undefined, and undefined clearly includes "working as the programmer intended".
a and b are both char pointers. First of all, you didn't initialise them, and secondly you didn't terminate the strings with '\0'.

Why can an array receive values more than it is declared to hold

int main(void)
{
char name1[5];
int count;
printf("Please enter names\n");
count = scanf("%s",name1);
printf("You entered name1 %s\n",name1);
return 0;
}
When I entered more than 5 characters, it printed all the characters I entered, even though the char array is declared as:
char name1[5];
Why did this happen?
Because the extra characters are stored at the addresses after the 'storage space'. This is very dangerous and can lead to crashes.
E.g. suppose you enter the name Michael and the name1 variable starts at 0x1000.
name1:  M      i      c      h      a      e      l      \0
        0x1000 0x1001 0x1002 0x1003 0x1004 0x1005 0x1006 0x1007
        [................................]
The allocated space is shown with [...]
This means that memory from 0x1005 onwards is overwritten.
Solution:
Copy only 5 characters (including the \0 at the end) or check the length of the entered string before you copy it.
This is undefined behavior, you are writing beyond the bounds of allocated memory. Anything can happen, including a program that appears to work correctly.
The C99 draft standard section J.2 Undefined Behavior says:
The behavior is undefined in the following circumstances:
and contains the following bullet:
An array subscript is out of range, even if an object is apparently accessible with the
given subscript (as in the lvalue expression a[1][7] given the declaration int a[4][5]) (6.5.6).
This applies to the more general case since E1[E2] is identical to (*((E1)+(E2))).
This is undefined behavior, you can't count on it. It just happens to work, it may not work on another machine.
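To avoid buffer overflow, use
fgets(name1, sizeof(name1), stdin);
or, where the optional C11 Annex K functions are available,
gets_s(name1, sizeof(name1));
Both functions already reserve one byte of the given size for the terminating '\0'.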
Another example to make things clearer:
#include <stdio.h>
int array[5] ;
int main ( void )
{
array[-1] = array[-1] ; // sounds strange?
printf ( "%d" , array[-1] ) ; // but it may appear to work!
return 0 ;
}
Here array is treated as an address, and you read the value stored just before (or, with a large index, after) that address. This is undefined behavior, even if it appears to work. Negative or incremented offsets (p[-1], ++, --) are only valid on a pointer while the result still points inside the same array.
It's very clear from other answers that this constitutes some kind of vulnerability to your program.
What can be learned from this? Let's assume:
int func(void)
{
char buffer[1];
...
In almost every C implementation, the code generated here creates a local stack area, and buffer gives you an address into that stack. Other important data resides on this stack too, for example the address of the next instruction to execute after the function returns to its caller.
You could, therefore, theoretically:
Enter a lot of code into your input function,
Create code that defines (in binary) a new function that does something ugly,
Overwrite the correct return address (on the stack) with the address of that new function by writing beyond the buffer's bounds.
This is called a buffer overflow exploit; you can read up on it in many places.
Yes, the compiler accepts this because C does no bounds checking, but the behavior is still undefined.

Is it necessary to initialize the char array for accurate length?

In the below example when I define char array uninitialized and want to find the length, it's undefined behavior.
#include<stdio.h>
int main()
{
char a[250];
printf("length=%d\n",strlen(a));
}
I got "0". I don't know how? Explain it.
Luck. Whether it's good or bad luck is a matter of opinion. The contents of your array are not initialized; they are whatever happened to already occupy that memory. In your case, it happened that the first byte was a '\0'.
This is, of course, undefined behavior and you can't depend on it happening this way.
You said in your example you were using an uninitialized char array to show undefined behavior, then when you got "0" you want an explanation? It's... undefined behavior.
If you got 0 for the length, it just means that there happens to be a 0 as the first element of a[] in your uninitialized array. When it's an uninitialized local, that means, as far as the C standards are concerned, anything can be in there, including a 0.
To address the question in your title: "Is it necessary to initialize the char array for accurate length?"
Yes, to be able to deterministically know the length of a string in a char array via the strlen() function, it is required for a null terminator to be present. That means it needs to be initialized or set in some manner or another.
As other answers say, the strlen() result is more a matter of luck than defined behaviour.
To find the "size" of the memory block, use sizeof instead.
Note: I've also included string.h and used the %zu conversion, which matches the size_t values that strlen() and sizeof yield.
#include<stdio.h>
#include<string.h>
int main()
{
char a[250];
printf("length=%zu\n",strlen(a)); /* still undefined: a is uninitialized */
printf("sizeof=%zu\n",sizeof(a)); /* always 250 */
}
When you define
char a[250];
the array contains random garbage.
strlen(a) counts the characters before the first null character ('\0') and stops when it finds one.
So if your array char a[250]; contains garbage and the first element happens to be '\0', strlen(a) will return 0.

Why is %s treated as character. why such unusual behavior of string as a pointer

int main()
{
char *arr="hello";
clrscr();
printf("%s",arr);
arr="";
printf("and %s",arr+2);
getch();
return 0;
}
OUTPUT OF THIS CODE:-
helloand nd %s
check it here:-
http://ideone.com/TJzUvp
// why this unusual behaviour with the pointer string?
You're invoking undefined behaviour.
Referring to arr+2 after setting arr="" is picking up an arbitrary piece of memory (in this case, a piece of one of your string literals).
When you run the program, memory happens to be laid out so that the constant string "and %s" comes right after the constant empty string. The memory looks like this:
. . . \0 a n d ' ' % s \0 . . .
When you assign the empty string to arr, you make arr point to the \0 character. Then you calculate arr+2, and the result points to the n character.
Therefore, when you interpret arr+2 as a NUL-terminated string you get nd %s.
This is, of course, undefined behaviour. Many compilers lay out constants so that they are in the order that you use them: here the empty string comes before "and %s". But nothing forces the compiler to do so, and if it lays out the memory differently you would get different results. The program could even crash.
You are not creating a string that the pointer owns. The string literal is placed in the memory area used for global variables and constants, and the pointer merely points to that block. You can't change its contents because it is part of constant memory. What you are doing is reassigning the pointer to another memory location and then hoping to be lucky enough that an offset from it points to something useful. That won't always happen.

Resources