C Array First Input "\0" - c

The arrays a and b are each supposed to take in 4 digits, entered on two separate lines, and the program should output ?A?B, as in the famous number guessing game. (For instance, 1234\n1347 should output 1A2B.) However, I found that the 1 at a[0] gets replaced by \0 while the other characters are still fine. (Side note: this happened in Xcode, but not in Visual Studio.)
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main()
{
char a[4], b[4];
int i, j;
int A = 0, B = 0;
scanf("%s%s",a,b);
for(i=0;i<4;i++)
if(a[i] == b[i])
A++;
for(i=0;i<4;i++)
for(j=0;j<4;j++)
if(a[i] == b[j])
B++;
B = B-A;
printf("%dA%dB\n",A,B);
return 0;
}
Changing the arrays to size [5] solves the problem, but I want to understand what's going on. Thanks!

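Strings in C are more precisely called null-terminated byte strings. The name gives a hint that there is a special terminator at the end of every string, which is the '\0' character.
A string of four characters needs space for five characters, with the fifth being the null terminator.
When you input four characters with your scanf call, the function writes the fifth (the terminator) out of bounds, and writing out of bounds leads to undefined behavior. A likely explanation for what you saw is that, in your Xcode build, a sits directly after b in memory, so the terminator written past the end of b lands in a[0]; Visual Studio probably lays out the arrays differently, which hides the bug.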
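A minimal sketch of the fix, assuming the input is already available as one buffer. The helper name read_two_guesses is hypothetical, and sscanf stands in for the scanf call in the question. Each array has room for four digits plus the terminator, and the %4s width stops the conversion from writing past that.

```c
#include <stdio.h>

/* Hypothetical helper (not from the question): parses two 4-character
   guesses out of `input`. Each destination holds 4 digits plus '\0'. */
int read_two_guesses(const char *input, char a[5], char b[5])
{
    /* %4s stores at most 4 characters and then appends the terminator,
       so each conversion writes at most 5 bytes. */
    return sscanf(input, "%4s%4s", a, b);
}
```

The return value is the number of strings stored, so a caller can check for 2 before comparing a and b.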

Related

Spurious newlines+whitespace when printing substrings in C

I have a simple program that reads a pair of characters from a char[] array and prints each pair to the console, all on the same line - for some reason, some spurious newlines (and whitespace) are added to the output.
I've removed usage of str libs (apart from strlen) that may add newlines at the end of strings - but I am still lost as to what's happening.
The program:
#include <stdio.h>
#include <string.h>
char input[] = "aabbaabbaabbaabbaabb";
int main() {
int i;
char c[2];
size_t input_length = strlen(input);
for (i=0; i<input_length; i+=2) {
c[0] = input[i];
c[1] = input[i+1];
printf("%s", c);
}
printf("\n");
return 0;
}
Expected output:
aabbaabbaabbaabbaabb
Output:
aabbaabbabb
aa
bbaabb
Why are there newlines and whitespace in the output? (Note that the 1st line has a single a towards the end - could not deduce why)
Using Apple clang version 11.0.0 (clang-1100.0.33.16), though I would doubt if that matters.
%s works properly only if the array holds a null character ('\0') marking the end of the string. Yours does not: c has room for just two characters, so printf keeps printing whatever bytes follow c in memory until it happens to find a '\0'. Remember that a string in C is a character sequence terminated by '\0'. That is why your code does not behave as you expected.
%c, on the other hand, prints exactly one character, so you can use:
printf("%c%c", c[0],c[1]);
If you insist on using %s, use %.2s here. The number after the . is the precision, which for a string conversion is the maximum number of characters to print. So .2 prints at most the first two characters of c, and printf never has to look for a '\0':
printf("%.2s", c);
Alternatively, as @Tom Karzes suggested, make room for the terminator and set it yourself. Change the declaration and add the assignment:
char c[3];
c[2] = '\0';
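As a hedged illustration of the precision approach, the sketch below formats a pair into a properly terminated buffer instead of printing it. The name format_pair is hypothetical; snprintf is used only so the result can be inspected.

```c
#include <stdio.h>

/* Hypothetical helper: copies the two characters at src[0] and src[1]
   into out as a terminated string. */
void format_pair(char out[3], const char *src)
{
    /* The .2 precision makes %s read at most 2 bytes from src, so src
       does not need its own terminator; snprintf terminates out. */
    snprintf(out, 3, "%.2s", src);
}
```

Passing a two-element array such as the question's c is safe here, because the conversion never reads a third byte.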

How to count the number of distinct characters in common between two strings?

How can a program count the number of distinct characters in common between two strings?
For example, if s1="connect" and s2="rectangle", the count is being displayed as 5 but the correct answer is 4; repeating characters must be counted only once.
How can I modify this code so that the count is correct?
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main()
{
int i,j,count=0;
char s1[100],s2[100];
scanf("%s",s1);//string 1 is inputted
scanf("%s",s2);//string 2 is taken as input
for(i=1;i<strlen(s1);i++)
{
for(j=1;j<strlen(s2);j++)
{
if(s1[i]==s2[j])//compare each char of both the strings to find common letters
{
count++;//count the common letters
break;
}
}
}
printf("%d",count);//display the count
}
The program is to take two strings as input and display the count of the common characters in those strings. Please let me know what's the problem with this code.
If repeating characters must be ignored, the program must 'remember' the characters that were already encountered. You could do this by storing each processed character in a character array and then consulting this array while processing the remaining characters.
You could use a counter variable to keep track of the number of common characters like
int ctr=0;
char s1[100]="connect", s2[100]="rectangle", t[100]="";
Here, t is the character array where the examined characters will be stored. Its size is the same as that of the larger of the other two character arrays.
Now use a loop like
for(int i=0; s1[i]; ++i)
{
if(strchr(t, s1[i])==NULL && strchr(s2, s1[i])!=NULL)
{
t[ctr++]=s1[i];
t[ctr]=0;
}
}
t initially has an empty string. Characters which were previously absent in t are added to it via the body of the loop which will be executed only if the character being examined (ie, s1[i]) is not in t but is present in the other string (ie, s2).
strchr() is a function with a prototype
char *strchr( const char *str, int c );
strchr() finds the first occurrence of c in the string pointed to by str. It returns NULL if c is not present in str.
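Put together as a function, the approach above might look like the following sketch. The name count_common_distinct and the name seen (playing the role of t) are illustrative, not from the answer; the array is sized for every possible distinct non-null byte so it cannot overflow.

```c
#include <limits.h>
#include <string.h>

/* Hypothetical helper: counts distinct characters present in both strings. */
int count_common_distinct(const char *s1, const char *s2)
{
    /* Room for all 255 distinct non-null byte values plus the terminator. */
    char seen[UCHAR_MAX + 1] = "";
    int ctr = 0;

    for (size_t i = 0; s1[i] != '\0'; ++i) {
        /* Count s1[i] only if it is new and also occurs in s2. */
        if (strchr(seen, s1[i]) == NULL && strchr(s2, s1[i]) != NULL) {
            seen[ctr++] = s1[i];
            seen[ctr] = '\0';   /* keep seen a valid string for strchr */
        }
    }
    return ctr;
}
```

For "connect" and "rectangle" the common distinct characters are c, n, e and t, so the function returns 4.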
Your usage of scanf() may cause trouble.
Use
scanf("%99s",s1);
(where 99 is one less than the size of the array s1) instead of
scanf("%s",s1);
to prevent overflow problems. Also check the return value of scanf() and make sure it is 1: scanf() returns the number of successful assignments it made.
Or use fgets() to read the string.
Read this post to see more about this.
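A small sketch of the fgets() route, assuming one string per input line. The helper name read_line is hypothetical. Unlike %s, fgets() keeps the trailing newline and does not stop at spaces, so the newline is removed by hand.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line into buf (at most size - 1
   characters) and strips the trailing newline. Returns 1 on success,
   0 on end of input or read error. */
int read_line(FILE *in, char *buf, int size)
{
    if (fgets(buf, size, in) == NULL)
        return 0;
    /* strcspn gives the index of the first '\n', or the string length
       if there is none, so this assignment is always in bounds. */
    buf[strcspn(buf, "\n")] = '\0';
    return 1;
}
```

A call such as read_line(stdin, s1, sizeof s1) would replace scanf("%s",s1) in the question.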
Also note that array indexing starts from 0, so in your loops the first character of each string is never checked.
So it should've been something like
for(i=0;i<strlen(s1);i++)
instead of
for(i=1;i<strlen(s1);i++)
Here's a solution that avoids quadratic O(N²) or cubic O(N³) time algorithms — it is linear time, requiring one access to each character in each of the input strings. The code uses a pair of constant strings rather than demanding user input; an alternative might take two arguments from the command line and compare those.
#include <limits.h>
#include <stdio.h>
int main(void)
{
int count = 0;
char bytes[UCHAR_MAX + 1] = { 0 };
char s1[100] = "connect";
char s2[100] = "rectangle";
for (int i = 0; s1[i] != '\0'; i++)
bytes[(unsigned char)s1[i]] = 1;
for (int j = 0; s2[j] != '\0'; j++)
{
int k = (unsigned char)s2[j];
if (bytes[k] == 1)
{
bytes[k] = 0;
count++;
}
}
printf("%d\n",count);
return 0;
}
The first loop records which characters are present in s1 by setting an appropriate element of the bytes array to 1. It doesn't matter whether there are repeated characters in the string.
The second loop detects when a character in s2 was in s1 and has not been seen before in s2, and then both increments count and marks the character as 'no longer relevant' by setting the entry in bytes back to 0.
At the end, it prints the count — 4 (with a newline at the end).
The use of (unsigned char) casts is necessary in case the plain char type on the platform is a signed type and any of the bytes in the input strings are in the range 0x80..0xFF (equivalent to -128..-1 if the char type is signed). Using negative subscripts would not lead to happiness. The code does also assume that you're working with a single-byte code set, not a multi-byte code set (such as UTF-8). Counts will be off if you are dealing with multi-byte characters.
The code in the question is at minimum a quadratic algorithm because for each character in s1, it could step through all the characters in s2 only to find that it doesn't occur. That alone requires O(N²) time. Both loops also use a condition based on strlen(s1) or strlen(s2), and if the optimizer does not recognize that the value returned is the same each time, then the code could scan each string on each iteration of each loop.
Similarly, the code in the other two answers as I type (Answer 1 and Answer 2) are also quadratic or worse because of their loop structures.
At the scale of 100 characters in each string, you probably won't readily spot the difference, especially not in a single iteration of the counting. If the strings were bigger — thousands or millions of bytes — and the counts were performed repeatedly, then the difference between the linear and quadratic (or worse) algorithms would be much bigger and more easily detected.
I've also played marginally fast'n'loose with the Big-O notation. I'm assuming that N is the size of the strings, and they're sufficiently similar in size that treating N₁ (the length of s1) as approximately equal to N₂ (the length of s2) isn't going to be a major problem. The 'quadratic' algorithms might be more formally expressed as O(N₁•N₂) whereas the linear algorithm is O(N₁+N₂).
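The remark about strlen() in the loop conditions can be illustrated with a small sketch. The name count_in_other is hypothetical, and the function deliberately keeps the question's nested loops, so duplicates are still counted and the cost is still O(N₁•N₂). The only change is that each length is computed once instead of on every iteration.

```c
#include <string.h>

/* Hypothetical helper: counts the characters of s1 (duplicates included)
   that occur somewhere in s2, with strlen() hoisted out of the loops. */
size_t count_in_other(const char *s1, const char *s2)
{
    const size_t n1 = strlen(s1);   /* computed once, not per iteration */
    const size_t n2 = strlen(s2);
    size_t count = 0;

    for (size_t i = 0; i < n1; i++) {
        for (size_t j = 0; j < n2; j++) {
            if (s1[i] == s2[j]) {
                count++;
                break;
            }
        }
    }
    return count;
}
```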
Based on the output you expect, you should keep track of which characters of the second string have already been used. You can achieve this as follows:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main()
{
int i, j, count = 0, skeep;
char s1[100], s2[100], s2Used[100] = {0};
scanf("%99s", s1); //string 1 is inputted
scanf("%99s", s2); //string 2 is taken as input
for (i = 0; i<strlen(s1); i++)
{
skeep = 0;
for (j = 0; j < i; j++)
{
if (s1[j] == s1[i])
{
skeep = 1;
break;
}
}
if (skeep)
continue;
for (j = 0; j<strlen(s2); j++)
{
if (s1[i] == s2[j] && s2Used[j] == 0) //compare each char of both the strings to find common letters
{
//printf("%c\n", s1[i]);
s2Used[j] = 1;
count++;//count the common letters
break;
}
}
}
printf("%d", count);//display the count
}

Why does puts() function gives me a heart symbol?

I was trying to figure out how a string with a known size can be filled with single characters. I wrote this simple code as a step towards a bigger problem that I have (dynamic filling of a string with unknown size). When I compiled and ran this code, the output had a heart symbol in it, and I don't know where it comes from.
#include <stdio.h>
#include <stdlib.h>
int main()
{
int i;
char str[3];
for(i=0;i<=2;i++){
str[i]=getc(stdin);
}
puts(str);
return 0;
}
Thank you.
The C strings are sequences of chars terminated by the null character (i.e. the character with code 0). It can be expressed as '\0', '\x0' or simply 0.
Your code fills str with three chars but fails to produce the null terminator. Accordingly, puts() prints whatever characters it finds in memory until it reaches the first null character.
Your code exhibits undefined behaviour. Anything can happen when it runs, and neither the compiler nor puts() is to blame.
In order to fix it you have to make sure the string ends with the null terminating character:
#include <stdio.h>
#include <stdlib.h>
int main()
{
int i;
// Make room for 3 useful chars and the null terminator
char str[4];
// Read three chars
for(i = 0; i < 3; i ++) {
str[i] = getc(stdin);
}
// Add the null terminator for strings
str[3] = 0;
puts(str);
return 0;
}
Update
As @JeremyP notes in a comment, if the input (stdin) ends before the code has read 3 characters, getc() returns EOF. EOF is not a character but a negative int value; stored into a char, it usually shows up as an odd non-printable character that makes you wonder where it came from.
The correct way to handle this is to keep the result of getc() in an int and stop reading as soon as it is EOF. Calling feof() before reading does not help, because feof() only becomes true after a read has already failed:
#include <stdio.h>
#include <stdlib.h>
int main()
{
int i;
// int, not char, so that EOF can be told apart from real characters
int ch;
// Make room for 3 useful chars and the null terminator
char str[4];
// Read at most three chars, stopping early at end of input
for(i = 0; i < 3; i ++) {
ch = getc(stdin);
if(ch == EOF) {
break;
}
str[i] = (char)ch;
}
// Add the null terminator for strings
str[i] = 0;
puts(str);
return 0;
}
Strings in C need to be null terminated, so it looks like you forgot to add a '\0' character to the end of str. The heart symbol probably shows up because puts() keeps reading the next character in memory until it reaches a null terminator, '\0'. Since str does not contain one, it continues reading past the end of the array and, I'd guess, happens to find the byte that your console displays as a heart. Hope this helps.

why string a gets appended to string b in this code

I am writing a basic C program to display two strings, one taken from the user (a) and the other defined in the code (b), but when I run the code below, string a gets appended to b. Why? And what is that symbol at the end of a?
updated code:
#include <stdio.h>
#include <string.h>
int main()
{
char a[ 5 ];
int i=0;
while(i<5)
{
a[i]=getchar();
i++;
}
char b[]={'r','f','s','/0'};
printf("output:-");
printf("\n %s",a);
printf("\n %s",b);
return 0;
}
console
qwert
output:-qwert$
rfs$qwert$
There is some special symbol in place of each $ above. What is it?
Putting all the comments into an answer. The problems in the original code stem mostly from not NUL terminating the character arrays to produce valid C strings.
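a is not NUL terminated. Can fix by increasing the size of the a array by 1 and explicitly writing a NUL to the last byte.
b is not NUL terminated. In the updated code the last element is '/0' (forward slash), which is a multi-character constant and not the NUL character '\0' (backslash). Can fix by initialising b using a string literal or a char array with '\0' as the last byte. The example below uses the former.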
Here is the full code with the errors corrected. Note that the code to read input is fragile as it only accepts exactly a 5 character string.
#include <stdio.h>
#include <string.h>
int main(void)
{
char a[6];
int i=0;
while (i<5) {
a[i]=getchar();
i++;
}
a[i] = '\0';
char b[]="rfs";
printf("output:-\n");
printf(" %s\n",a);
printf(" %s\n",b);
return 0;
}
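One way to make the input code less fragile is sketched below, assuming the input is a line of text that may be shorter or longer than the buffer. The helper name read_chars is hypothetical. It stops at a newline, at end of input, or when the buffer is full, and always terminates the result.

```c
#include <stdio.h>

/* Hypothetical helper: reads characters from `in` into buf until a
   newline, end of input, or size - 1 characters have been stored.
   Returns the number of characters stored (terminator not counted). */
size_t read_chars(FILE *in, char *buf, size_t size)
{
    size_t i = 0;
    int ch;   /* int, not char, so EOF can be told apart from real data */

    if (size == 0)
        return 0;
    while (i + 1 < size && (ch = getc(in)) != EOF && ch != '\n')
        buf[i++] = (char)ch;
    buf[i] = '\0';
    return i;
}
```

With char a[6], a call such as read_chars(stdin, a, sizeof a) accepts anything from an empty line up to five characters.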

program in c with strings

I have a problem with this code. I need to scan strings until I give the char 0 and count how many words have 1 char, 2 chars, 3 chars, and so on. Here is my code, but it never stops.
#include <stdio.h>
#include <string.h>
int main()
{
char a[100];
int length[14],i,k;
for (i=1; i<=14; i++)
length[i]=0;
do
{
scanf("%s",a);
length[strlen(a)] =length[strlen(a)]+1;
} while (a!="0");
printf("Word Length\t|Number of Occurs\n");
for(i=1; i<=14; i++)
{
printf("%d\t\t|",i);
if (length[i]>=1)
for (k=1; k<=length[i]; k++)
printf("*");
printf("\n");
}
return 0;
}
You cannot check whether two strings are equal or different using the == and != operators.
In C, a string is an array of characters, and in an expression the name of the array is converted to the address of its first element. So in your example, a!="0" simply compares the address of array a with the address of the string literal "0" (which is probably stored in a fixed area called the string pool, depending on the compiler you use). Those addresses will never be the same, which is why your program enters an infinite loop.
You should use the strcmp function from string.h, for example while (strcmp(a, "0") != 0).
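A sketch of the counting loop with strcmp, written as a function over a text buffer so it can be tried without typing input. The names tally_lengths and MAX_LEN are hypothetical. Note one assumption beyond the answer: the array is declared with MAX_LEN + 1 elements, because the question indexes length[1] through length[14], which does not fit in int length[14].

```c
#include <stdio.h>
#include <string.h>

#define MAX_LEN 14   /* longest word length tallied, as in the question */

/* Hypothetical helper: tallies word lengths in `text` until the word "0"
   is seen or the text ends. Returns the number of words tallied. */
int tally_lengths(const char *text, int length[MAX_LEN + 1])
{
    char word[100];
    int consumed = 0;
    int words = 0;

    for (int i = 0; i <= MAX_LEN; i++)
        length[i] = 0;

    /* %n records how many bytes this call consumed, so text can advance. */
    while (sscanf(text, "%99s%n", word, &consumed) == 1) {
        text += consumed;
        if (strcmp(word, "0") == 0)   /* compares contents, not addresses */
            break;
        size_t n = strlen(word);
        if (n <= MAX_LEN) {           /* longer words are ignored */
            length[n]++;
            words++;
        }
    }
    return words;
}
```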
