String terminator issues - c

If I type in this code, it compiles and runs (I use GCC)
#include<stdio.h>
int main()
{
char sentence[8]="September";
printf("The size of the array is %d \n",sizeof(sentence));
printf("The array is %s \n",sentence);
}
and gives the output
The size of the array is 8
The array is Septembe
How is this working? A string terminator is needed for C to know that the string has ended. How is the array worth 8 bytes of space and knows where to stop?

By passing a non-NUL-terminated string to printf("%s"), you're invoking undefined behavior.
By its very nature, the result is undefined. It may seemingly "work" (like you're seeing).
As others have explained, what's probably happening is that there happens to be a zero byte after your string, which stops printf from going further. However, if you were to add more stuff around that variable, you'd probably see different behavior:
#include<stdio.h>
int main(void)
{
char sentence[8] = "September"; // NOT NUL TERMINATED!
char stuff[] = "This way is better";
printf("%s\n", sentence); // Will overrun sentence
return 0;
}

Related

Char array in C yields extra characters than required [duplicate]

I have this simple program in which I initialize a string with "HELLO". I need the output to be printed as HLOEL, i.e. all the even indexes (0,2,4) followed by the odd ones (1,3). I could not infer what's wrong with my code, but it yields "HLOELHLO" instead of "HLOEL". Could someone explain what is happening here?
#include <stdio.h>
int main() {
int i,loPos=0,hiPos=0;
char *str="HELLO";
char lo[2];
char hi[3];
for(i=0;i<5;i++)
{
if(i%2==0)
{
hi[hiPos++]=str[i];
}
else
{
lo[loPos++]=str[i];
}
}
printf("%s%s",hi,lo);
return 0;
}
Thanks in Advance!
After the for loop, you need to write a terminating 0 byte to each of the new strings, and also make sure they have room for it:
char lo[2+1];
char hi[3+1];
for(...) {
}
hi[hiPos] = '\0';
lo[loPos] = '\0';
Otherwise any string function will read past the end of the buffer, causing undefined behavior. Typically it keeps reading bytes until it happens to encounter a byte with value 0. But as always with undefined behavior, your program may do anything.
A C-style string needs one extra char, '\0', to mark its end. With your code, the memory around the two arrays may look like this:
lo[0], lo[1], hi[0], hi[1], hi[2], something else equal to 0
printf stops only when it meets a '\0', so printing lo runs on through hi. You should declare the arrays as:
char lo[3];
char hi[4];
lo[2] = '\0';
hi[3] = '\0';

What is the point of assigning the size of a string?

For instance, if I store ABCDE using scanf, a later printf gives me ABCDE as output. So what is the point of specifying the size of the string (here 4)?
#include <stdio.h>
int main() {
int c[4];
printf("Enter your name:");
scanf("%s",c);
printf("Your Name is:%s",c);
return 0;
}
I'll start with, don't use int array to store strings!
int c[4] allocates an array of 4 integers. An int is typically 4 bytes, so usually this would be 16 bytes (but might be 8 or 32 or something else on some platforms).
Then, you use this allocation first to read characters with scanf. If you enter ABCDE, it uses up 6 characters (there is an extra 0 byte at the end of the string marking the end, which needs space too), which happens to fit into the memory reserved for array of 4 integers. Now you could be really unlucky and have a platform where int has a so called "trap representation", which would cause your program to crash. But, if you are not writing the code for some very exotic device, there won't be. Now it just so happens, that this code is going to work, for the same reason memcpy is going to work: char type is special in C, and allows copying bytes to and from different types.
Same special treatment happens, when you print the int[4] array with printf using %s format. It works, because char is special.
This also demonstrates how unsafe scanf and printf are. They happily accept whatever c you give them and assume it is a char array with valid size and data.
But, don't do this. If you want to store a string, use char array. Correct code for this would be:
#include <stdio.h>
int main() {
char c[16]; // fits 15 characters plus terminating 0
printf("Enter your name:");
int items = scanf("%15s",c); // note: added maximum characters
// scanf returns number of items read successfully, *always* check that!
if (items != 1) {
return 1; // exit with error, maybe add printing error message
}
printf("Your Name is: %s\n",c); // note added newline, just as an example
return 0;
}
The size of an array must be defined while declaring a C String variable because it is used to calculate how many characters are going to be stored inside the string variable and thus how much memory will be reserved for your string. If you exceed that amount the result is undefined behavior.
You have used int c[4], not char c[4]. In C, a char is always 1 byte, while an int is typically 4 bytes, so your array is larger than you expected. That's why you didn't face any issues.
(Simplifying a fair amount)
When you initialize that array of length 4, C goes and finds a free spot in memory that has enough consecutive space to store 4 integers. But if you try to set c[4] to something, C will write that thing in the memory just after your array. Who knows what’s there? That might not be free, so you might be overwriting something important (generally bad). Also, if you do some stuff, and then come back, something else might’ve used that memory slot (properly) and overwritten your data, replacing it with bizarre, unrelated, and useless (to you) data.
In C, the last character of a string is '\0'.
The program below prints each byte of the string, so you can see that last character.
scanf("%s", c) appends the terminating '\0' for you.
So if you read input with other functions, such as getc or getch, you have to add the terminator yourself.
#include<stdio.h>
#include<string.h>
int main(){
char c[4+1]; // You should add +1 for the '\0' character.
char *p;
int len;
printf("Enter your name:");
scanf("%s", c);
len = strlen(c);
printf("Your Name is:%s (%d)\n", c, len);
p = c;
do {
printf("%x\n", *(p++));
} while((len--)+1);
return 0;
}
Enter your name:1234
Your Name is:1234 (4)
31
32
33
34
0 --> terminator added by scanf("%s")
ffffffae --> garbage (read one byte past the end of c, which is itself undefined behavior)

Why this code can't print characters in an array?

I compiled this code using gcc (tdm-1) 5.1.0 and please tell me why the output doesn't contain "hello"
#include<stdio.h>
void main()
{
int i;
char st[20];
printf("Enter a string ");
scanf("%s",st);
for(i=0;i<20;i++)
{
printf("%c",st[i]);
}
}
Input:hello
Output: # #
You print all 20 elements of the array, but if the user enters a string shorter than that, not all elements will be initialized. The rest are indeterminate and seemingly random.
Remember that char strings in C are really called null-terminated byte strings. That null-terminated bit is important, and means you can easily find the end of the string by checking the current character against '\0' (which is the terminator character).
Or you could just use the strlen function to get the length of the string instead:
for(i=0;i<strlen(st);i++) { ... }
Or use the "%s" format to print the string:
printf("%s", st);
Also note that without any protection the scanf function will let you give longer input than there is space for in the array, so you need to protect against that, for example by limiting the number of characters scanf will read:
scanf("%19s",st); // Write at most 19 characters (*plus* terminator) to the string
Now for why your input doesn't seem to be printed: it's because of the indeterminate contents of the uninitialized elements. While you're not going out of bounds of your array, you still go out of bounds of the actual string. Going out of bounds leads to undefined behavior.
What's probably happening is that some of the "random" indeterminate contents happen to be a carriage return '\r', which moves the cursor to the start of the line, so the output already written is overwritten by the uninitialized elements in your array.
Here's a short example as Qubit already explained:
#include <stdio.h>
int main(void) {
char str1[20];
printf("Enter name: ");
if (scanf("%19s", str1) != 1) { // at most 19 characters plus the terminator
return 1;
}
printf("Entered Name: %s\n", str1);
return 0;
}
Here
char st[20];
st is a local variable, so by default its contents are garbage, not zeros. If you scan fewer than 20 characters into st, the remaining elements still contain garbage, so the loop prints junk data like # # in the case of
char st[20];
printf("Enter a string ");
scanf("%s",st);
for(i=0;i<20;i++) {
printf("%c",st[i]);
}
It's also bad practice: if the user enters only a few characters, say 5, the loop still runs 20 times and wastes CPU cycles.
So if you want to print a char array character by character, loop until the '\0' character is encountered, e.g.
for(i=0;st[i];i++) { /* the condition fails when '\0' is reached */
printf("%c",st[i]);
}
Or, as others suggested, you can print the char array st with a single printf using the %s format specifier, like
printf("%s\n",st); /*here printf starts printing from base address of st
and prints until \0 */
It's also better to initialize the char array st when declaring it, e.g.
char st[20] ="";

Given a string write a program to generate all possible strings by replacing ? with 0 and 1?

I have written this code it is working fine for a?b?c? and a?b?c?d? but for a?b?c?d?e? it is giving one additional garbage value at the end. At the end of s there is '\0' character attached then why and how is it reading that garbage value. I tried to debug it by placing printf statements in between the code but couldn't resolve it. please help.
#include<stdio.h>
void print(char* s,char c[],int l)
{
int i,j=0;
for(i=0;s[i]!='\0';i++)
{
if(s[i]=='?')
{
printf("%c",c[j]);
j++;
}
else
printf("%c",s[i]);
}
printf(", ");
}
void permute(char *s,char c[],int l,int index)
{
if(index==l)
{
print(s,c,l);
return;
}
c[index]='0';
permute(s,c,l,index+1);
c[index]='1';
permute(s,c,l,index+1);
}
int main()
{
char s[10],c[10];
printf("Enter a string.");
scanf("%s",s);
int i,ct=0;
for(i=0;s[i]!='\0';i++)
{
if(s[i]=='?')
ct++;
}
permute(s,c,ct,0);
return 0;
}
My output was like this :-
a0b0c0d0e0♣, a0b0c0d0e1♣,
...and so on.
As we can see from your code, the array is defined as char s[10], and the input
a?b?c?d?e?
is too big to be held in s along with the null terminator when read by
scanf("%s",s);
You need to use a bigger array. Otherwise, in an attempt to add the terminating null after the input, scanf accesses out-of-bounds memory, which invokes undefined behaviour.
That said, never allow unbounded input into a limited-size array. Always use the field width to limit the input length (in other words, reserve space for the null terminator), like
scanf("%9s",s);
The code is producing the correct output here, but note that it has undefined behavior for strings of size greater than or equal to 10 chars, because that's the size of your buffer.
So, for a?b?c?d?e? you need a buffer of at least 11 characters, to account for the null terminator. You should make s bigger.
In C, every string has a '\0' character appended at the end.
Note that C has no real string type.
A string is just an array of characters.
So if you define it like this:
char s[10]
it can hold at most 9 characters, because the last element is needed for the '\0' character.
If you store more than 9 characters, you get erroneous output.

Find String Length without recursion in C

#include<stdio.h>
#include<conio.h>
void main()
{
int str1[25];
int i=0;
printf("Enter a string\n");
gets(str1);
while(str1[i]!='\0')
{
i++;
}
printf("String Length %d",i);
getch();
return 0;
}
i'm always getting string length as 33. what is wrong with my code.
That is because, you have declared your array as type int
int str1[25];
^^^-----------Change it to `char`
You don't show an example of your input, but in general I would guess that you're suffering from buffer overflow due to the dangers of gets(). That function was deprecated in C99 and removed from the standard in C11, so it should never be used in newly-written code.
Use fgets() instead:
if(fgets(str1, sizeof str1, stdin) != NULL)
{
/* your code here */
}
Also, of course your entire loop is just strlen() but you knew that, right?
EDIT: Gaah, completely missed the mis-declaration, of course your string should be char str1[25]; and not int.
So, a lot of answers have already told you to use char str1[25]; instead of int str1[25] but nobody explained why. So here goes:
A char has a length of one byte (by definition in the C standard). But an int uses more bytes (how many depends on architecture and compiler; let's assume 4 here). So if you access index 2 of a char array, you get 1 byte at memory offset 2, but if you access index 2 of an int array, you get 4 bytes at memory offset 8.
When you call gets (which should be avoided since it's unbounded and thus might overflow your array), a string gets copied to the address of str1. That string really is an array of char. So imagine the string is 123 plus the terminating null character. The memory would look like:
Address: 0 1 2 3
Content: 0x31 0x32 0x33 0x00
When you read str1[0] you get 4 bytes at once, so str1[0] does not return 0x31, you'll get either 0x00333231 (little-endian) or 0x31323300 (big endian).
Accessing str1[1] is already beyond the string.
Now, why do you get a string length of 33? That's actually random and you're "lucky" that the program didn't crash instead. From the start address of str1, you fetch int values until you finally get four 0 bytes in a row. In your memory, there's some random garbage and by pure luck you encounter four 0 bytes after having read 33*4=132 bytes.
So here you can already see that bounds checks are very important: your array is supposed to contain 25 characters. But gets may already write beyond that (solution: use fgets instead). Then you scan without bounds and may thus also access memory well beyond your array and may finally run into non-existent memory regions (which would crash your program). Solution for that: do bounds checks, for example:
// "sizeof(str1)" only works correctly on real arrays here,
// not on "char *" or something!
size_t l; // size_t matches the type of sizeof, avoiding a signed/unsigned comparison
for (l = 0; l < sizeof(str1); ++l) {
if (str1[l] == '\0') {
// End of string
break;
}
}
if (l == sizeof(str1)) {
// Did not find a null byte in array!
} else {
// l contains valid string length.
}
I would suggest certain changes to your code.
1) conio.h
This is a non-standard header, so avoid using it.
2) gets
gets is unsafe and was removed from the C standard in C11, so avoid using it. Use fgets() instead
3) int str1[25]
If you want to store a string it should be
char str1[25]
The problem is in the string declaration int str1[25]. It must be char and not int
char str1[25]
void main() //"void" should be "int"
{
int str1[25]; //"int" should be "char"
int i=0;
printf("Enter a string\n");
gets(str1);
while(str1[i]!='\0')
{
i++;
}
printf("String Length %d",i);
getch();
return 0;
}

Resources