Difference b/w using i<strlen() and str[i] != '\0' - arrays

When I use for(i=0;i<strlen(s);i++) I get a "time limit exceeded" error, but when I use for(i=0;s[i]!='\0';i++) my code is accepted. Why?
Here is the link to the problem on CodeChef: https://www.codechef.com/problems/LCPESY
Type 1:
for (i = 0; i < strlen(s1); i++) {
f1[s1[i]]++;
}
for (i = 0; i < strlen(s2); i++) {
f2[s2[i]]++;
}
Type 2:
for (i = 0; s1[i] != '\0'; i++) {
f1[s1[i]]++;
}
for (i = 0; s2[i] != '\0'; i++) {
f2[s2[i]]++;
}
Complete code:
#include <stdio.h>
#include <string.h>
long int min(long int a, long int b) {
if (a >= b)
return b;
else
return a;
}
int main(void) {
// your code goes here
int t;
scanf("%d", &t);
while (t--) {
char s1[10001], s2[10001];
scanf("%s%s", s1, s2);
long int f1[200] = { 0 }, f2[200] = { 0 }, i, count = 0;
for (i = 0; i < strlen(s1); i++) {
f1[s1[i]]++;
}
for (i = 0; i < strlen(s2); i++) {
f2[s2[i]]++;
}
for (i = 0; i < 200; i++) {
count += min(f1[i], f2[i]);
}
printf("%ld\n", count);
}
return 0;
}

If a non-optimizing compiler is used, strlen may be re-evaluated on every iteration. Each call has to check every character in the string against 0, so the loop takes quadratic time: O(n²) checks for the terminating null instead of the necessary O(n). In the strlen code the timeout happens because it does perhaps 2,000,000 null checks and 10,000 other operations; the other code would do 2,000 null checks and those same 10,000 other operations and not time out.
However, this need not be the case. Due to the as-if rule, a C compiler can generate exactly equivalent machine code for the cases
for (i = 0; i < strlen(s1); i++){
f1[s1[i]] ++;
}
and
for (i = 0; s1[i] != '\0'; i++) {
f1[s1[i]] ++;
}
because a compiler can easily prove that the loop body cannot possibly change s1 and therefore both forms would behave equivalently.
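If you would rather not depend on the optimizer at all, compute the length once before the loop. This is a minimal sketch; the helper name count_chars and its freq parameter are made up for illustration and are not part of the original code:

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: tally how often each character value occurs in s.
 * freq must point to UCHAR_MAX + 1 zero-initialized entries. */
void count_chars(const char *s, long freq[])
{
    size_t n = strlen(s);             /* one O(n) scan, done before the loop */
    for (size_t i = 0; i < n; i++) {  /* the condition is now a plain compare */
        freq[(unsigned char)s[i]]++;
    }
}
```

With the length cached, the loop is linear no matter which optimization level the judge compiles with.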

In addition to @Antti Haapala's good answer:
Difference b/w using i<strlen() and str[i] != '\0'
Code like int i; ... i < strlen(s1) readily complains about mismatched sign-ness - when such warnings are enabled. Usually inoffensive code like that discourages wide use of that warning. I see that as a less preferred approach. str[i] != '\0' does not cause that warning.
Some other concerns
Prevent buffer overflow
char s1[10001], s2[10001];
// scanf("%s%s", s1, s2);
if (scanf("%10000s%10000s", s1, s2) == 2) {
// OK, success, lets go!
There are more than 200 characters.
// long int f1[200] = { 0 };
long int f1[256] = { 0 };
// or better
long int f1[UCHAR_MAX + 1] = { 0 };
Avoid a negative index
// f1[s1[i]]++;
f1[(unsigned char) s1[i]]++;
or use unquestionable unsigned types.
// char s1[10001];
unsigned char s1[10001];
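Putting the points above together, the counting part could look roughly like this. It is a sketch under assumptions: the function name common_chars is invented here, and the table size relies on UCHAR_MAX from <limits.h> rather than a hard-coded 200.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: number of characters the two strings have in common,
 * counted with multiplicity (sum of the per-character minimum counts). */
long common_chars(const char *s1, const char *s2)
{
    long f1[UCHAR_MAX + 1] = { 0 };
    long f2[UCHAR_MAX + 1] = { 0 };
    long count = 0;

    for (size_t i = 0; s1[i] != '\0'; i++)
        f1[(unsigned char)s1[i]]++;   /* cast keeps the index non-negative */
    for (size_t i = 0; s2[i] != '\0'; i++)
        f2[(unsigned char)s2[i]]++;

    for (size_t c = 0; c <= UCHAR_MAX; c++)
        count += (f1[c] < f2[c]) ? f1[c] : f2[c];
    return count;
}
```

The caller would still read the input with a width limit, for example scanf("%10000s%10000s", s1, s2) == 2, before passing s1 and s2 to the function.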

Related

cannot give larger input in a program

I am trying to solve a question on codechef. This code is running fine as far as logic is considered, but I am having trouble with giving large inputs.
For example, if I give input of 13 or more digits then it does not run in vscode, and if the input is 22 digits or more then it does not run in devc++.
#include <stdio.h>
#include <math.h>
#include<string.h>
int main() {
int t, l, i, flag = 0;
char ch = ' ';
scanf("%d", &t);
char d[t][100000];
int out[t];
for(i = 0; i < t; i++) {
scanf("%s", &d[t]);
size_t len = strlen(d[t]);
printf("\n%d", len);
char ch = d[t][0];
for (size_t j = 0; j < len; j++) {
if ((ch != d[t][j])) {
flag++;
}
}
if (flag >= 2 && flag != (len-1))
out[i]=0;
else
out[i]=1;
flag = 0;
}
for(i = 0; i < t; i++) {
if (out[i] == 1)
printf("Yes\n");
else
printf("No\n");
}
return 0;
}
TEST RUN 1:-
1
11110111111
11
Yes
TEST RUN 2:-
1
111101111111
0
Why doesn't it handle larger inputs?
When you write to d[t], you are outside the bounds of the array (the valid rows are d[0] to d[t-1]), so you may be overwriting other parts of your program's memory. Write to d[i] instead.
The array is also on the stack, and the stack has a very limited size, so there is a high risk of exceeding it, which is called a stack overflow. On Windows the default stack size is, I think, 1 MB, so with 100000 bytes per row you would already exceed it once t is bigger than 10. Allocate on the heap instead.
This should work, but I did not test it:
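#define STR_SIZE 100000 // assumed: same row size as in the question
// needs #include <stdlib.h> for malloc
int t = 0;
scanf("%d", &t);
char **d = malloc(t * sizeof(char *));
for (int i = 0; i < t; i++) {
    d[i] = malloc(STR_SIZE);
    scanf("%99999s", d[i]); // width limit keeps the input inside the buffer
}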
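A fuller sketch of the heap-based approach, with allocation checks and cleanup added. The helper names alloc_rows and free_rows and the width parameter are hypothetical; they only illustrate one way to package the idea.

```c
#include <stdlib.h>

/* Hypothetical helper: allocate `count` buffers of `width` bytes each.
 * Returns NULL if count or width is 0 or if any allocation fails. */
char **alloc_rows(size_t count, size_t width)
{
    if (count == 0 || width == 0)
        return NULL;
    char **rows = malloc(count * sizeof *rows);
    if (rows == NULL)
        return NULL;
    for (size_t i = 0; i < count; i++) {
        rows[i] = malloc(width);
        if (rows[i] == NULL) {          /* undo the partial allocation */
            while (i > 0)
                free(rows[--i]);
            free(rows);
            return NULL;
        }
        rows[i][0] = '\0';              /* start each row as an empty string */
    }
    return rows;
}

/* Release everything obtained from alloc_rows. */
void free_rows(char **rows, size_t count)
{
    if (rows == NULL)
        return;
    for (size_t i = 0; i < count; i++)
        free(rows[i]);
    free(rows);
}
```

A caller would check the result against NULL, read each line into rows[i] with a width-limited scanf, and call free_rows when done.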

Finding anagram

if (strlen(a) != strlen(b)) {
printf("Not anagram");
} else {
for (int i = 0; i < strlen(a); i++) {
for (int j = 0; j < strlen(b); j++) {
if (a[i] == b[j]) {
len++;
}
}
}
if (len != strlen(a))
printf("Not anagram");
else
printf("Anagram");
}
return 0;
This is a code snippet to check if 2 strings are anagrams. How can repeated characters be handled here? Also, could this program be made more optimized? And what would be the runtime complexity of this code?
An optimal solution would be probably based on calculating the number of characters in every string and then comparing both counts. Ideally, we should use a Dictionary data structure but for simplicity, I will demonstrate the algorithm on an array:
char *word1 = "word1";
char *word2 = "ordw1";
// Assuming 8-bit char, a character has only 256 possible values, so store the counts in arrays with 256 items.
int* letterCounts1 = calloc(256, sizeof(int));
int* letterCounts2 = calloc(256, sizeof(int));
size_t length1 = strlen(word1);
size_t length2 = strlen(word2);
for (size_t i = 0; i < length1; i++) {
int letterIndex = word1[i] & 0xFF;
letterCounts1[letterIndex] += 1;
}
for (size_t i = 0; i < length2; i++) {
int letterIndex = word2[i] & 0xFF;
letterCounts2[letterIndex] += 1;
}
bool isAnagram = true;
for (size_t i = 0; i < 256; i++) {
if (letterCounts1[i] != letterCounts2[i]) {
isAnagram = false;
break;
}
}
free(letterCounts1);
free(letterCounts2);
if (isAnagram) {
printf("Anagram");
} else {
printf("Not anagram");
}
This algorithm has linear (O(n)) complexity (iteration over the "dictionary" can be considered a constant).
Your original solution has quadratic complexity. You would also have to store the result of strlen in a variable, because every call to strlen iterates over the whole string, which raises the complexity to cubic.
First of all, this is not a correct solution. Consider these two strings: "aabc" and "aade".
a[0] == b[0], a[0] == b[1], a[1] == b[0] and a[1] == b[1], so len would be 4, but they are not anagrams. The complexity is O(n^2), where n is the length of the string.
As @Sulthan has answered, a better approach is to sort the strings, which is O(n*log(n)), and then compare both strings in one O(n) pass.
To sort the strings in O(n * log(n)) you cannot use bubble sort, but you can use a merge sort as described here: https://www.geeksforgeeks.org/merge-sort/
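The sort-and-compare idea can be sketched as follows. Note the substitution: this uses the standard library's qsort instead of a hand-written merge sort, and the C standard does not promise a particular complexity for qsort, although typical implementations are O(n log n). The name is_anagram_sorted is made up for this example.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Comparison callback for qsort: orders bytes as unsigned char values. */
static int compare_bytes(const void *p, const void *q)
{
    unsigned char a = *(const unsigned char *)p;
    unsigned char b = *(const unsigned char *)q;
    return (a > b) - (a < b);
}

/* Hypothetical helper: true if a and b are anagrams of each other.
 * Sorts private copies so the caller's strings are left untouched.
 * Returns false if memory for the copies cannot be allocated. */
bool is_anagram_sorted(const char *a, const char *b)
{
    size_t n = strlen(a);
    if (n != strlen(b))
        return false;

    char *ca = malloc(n + 1);
    char *cb = malloc(n + 1);
    bool same = false;
    if (ca != NULL && cb != NULL) {
        memcpy(ca, a, n + 1);
        memcpy(cb, b, n + 1);
        qsort(ca, n, 1, compare_bytes);
        qsort(cb, n, 1, compare_bytes);
        same = (memcmp(ca, cb, n) == 0);
    }
    free(ca);
    free(cb);
    return same;
}
```

Unlike the nested-loop version, repeated letters are handled correctly because the sorted copies must match position by position.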
An even better approach is to create an array of integers in which you count the number of occurrences of each character in the first string, and then subtract one for each occurrence in the second string. In the end, every entry of the auxiliary array must be 0.
Here are some answers:
Your algorithm does not handle duplicate letters; it may return false positives.
It is unclear if it is correct otherwise because you did not post a complete function definition with all declarations and definitions, especially whether len is initialized to 0.
It has O(N²) time complexity, or even O(N³) if the compiler cannot optimize away the numerous redundant calls to strlen().
Here is a simple solution for systems with 8-bit characters, with linear complexity:
#include <stdio.h>
#include <string.h>
int check_anagrams(const char *a, const char *b) {
size_t counters[256];
size_t len = strlen(a);
size_t i;
if (len != strlen(b)) {
printf("Not anagrams\n");
return 0;
}
for (i = 0; i < 256; i++) {
counters[i] = 0;
}
for (i = 0; i < len; i++) {
int c = (unsigned char)a[i];
counters[c] += 1;
}
for (i = 0; i < len; i++) {
int c = (unsigned char)b[i];
if (counters[c] == 0) {
printf("Not anagrams\n");
return 0;
}
counters[c] -= 1;
}
printf("Anagrams\n");
return 1;
}

char-Tables, Segmentation Error

I'm currently learning the C programming language, and I'm having some issues with it.
I'm getting segmentation faults quite a lot when dealing with strings (a.k.a. char tables).
Here is a simple algorithm that just deletes the letter 'e' from the input string.
Example:
"hackers does exist" ->>> "hackrs dos xist"
#include <stdlib.h>
#include <stdio.h>
int main(int argc, char const *argv[])
{
char T[200];
int j,i,l,times=0;
printf("Entre THE TXT\n");
gets(T);
while (T[i] != '\0')
{
l++;
i++;
}
for (i=0;i<l;i++)
{
if ( T[i] == 'e')
{
times++;
}
}
l=l-times;
i=0;
j=0;
while (i<l)
{
if ( T[j] != 'e')
{
T[i]=T[j];
i++;
j++;
}
else j++;
}
for (i=0;i<l;i++)
{
printf("%c",T[i]);
}
return 0;
}
Can you please tell me what I did wrong?
PS: I have noticed that each time I do incrementation as j++ in this code I will get the Segmentation Error... I really don't understand why.
Initialize the i, j and l variables. Uninitialized local variables have indeterminate values, and reading them before assigning a value results in undefined behavior.
You are accessing the i and l variable without initialization.
while (T[i] != '\0')
{
l++;
i++;
}
Initialize as below.
int j = 0, i = 0, l = 0, times = 0;
As kiran Biradar already answered, you only forgot to initialize your integers.
You have several options here. I'll write them from most common to most discouraged.
Most used form, verbose but easier to maintain later.
int i = 0;
int j = 0;
int l = 0;
int times = 0;
Short form 1:
int i = 0, j = 0, l = 0, times = 0;
Short form 2:
int i, j, l, times;
i = j = l = times = 0;
I'd also suggest using the features of at least the C99 standard and reducing the scope of your variables as far as possible. (Yes, I know that is also possible with {} blocks, but I like for loops when iterating completely over something.)
Hence my suggestion for your code:
#include <stdlib.h>
#include <stdio.h>
#include <string.h> // str(n)len
int main(void) // argv/argc is never used
{
char text[200];
printf("Entre THE TXT\n");
if (fgets(text, sizeof(text), stdin) == NULL) // fgets, because gets is unsafe and was removed in C11
exit(EXIT_FAILURE);
size_t len = strlen(text); // returns number of characters excluding '\0'
if (len > 0 && text[len-1] == '\n') { // strip newline if present from fgets
text[--len] = '\0'; // also shrink len so it matches the stripped string
}
unsigned int times = 0;
for (size_t i=0; i<len; i++) {
if (text[i] == 'e') {
times++;
}
}
// I'd prefer to use a `newlen` variable
len -= (size_t) times;
for (size_t j=0, i=0; i < len; j++) {
if (text[j] != 'e') {
text[i] = text[j];
i++;
}
}
text[len] = '\0'; // just for safety reasons terminate string properly
puts(text); // Use puts instead of calling printf several times.
return 0;
}
Further improvements:
Actually, times could be eliminated, as it is not really needed to delete the e's.
Drop the counting loop, let the copy loop run j over the original length, and use the final value of i as the new length.
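A single pass is enough to drop the e's without counting them first. Here is a minimal sketch of that idea; the helper name remove_char is made up for illustration:

```c
#include <stddef.h>

/* Hypothetical helper: remove every occurrence of ch from s in place
 * and return the new length of the string. */
size_t remove_char(char *s, char ch)
{
    size_t w = 0;                            /* next write position */
    for (size_t r = 0; s[r] != '\0'; r++) {  /* r reads, w trails behind */
        if (s[r] != ch)
            s[w++] = s[r];
    }
    s[w] = '\0';                             /* terminate the shortened string */
    return w;
}
```

In the program above, the two loops and the final termination could then be replaced by a call such as remove_char(text, 'e') followed by puts(text).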

Longest Substring Palindrome issue

I feel like I've got it almost down, but for some reason my second test is coming up with a shorter palindrome instead of the longest one. I've marked where I feel the error may be coming from, but at this point I'm kind of at a loss. Any direction would be appreciated!
#include <stdio.h>
#include <string.h>
/*
* Checks whether the characters from position first to position last of the string str form a palindrome.
* If it is palindrome it returns 1. Otherwise it returns 0.
*/
int isPalindrome(int first, int last, char *str)
{
int i;
for(i = first; i <= last; i++){
if(str[i] != str[last-i]){
return 0;
}
}
return 1;
}
/*
* Find and print the largest palindrome found in the string str. Uses isPalindrome as a helper function.
*/
void largestPalindrome(char *str)
{
int i, last, pStart, pEnd;
pStart = 0;
pEnd = 0;
int result;
for(i = 0; i < strlen(str); i++){
for(last = strlen(str); last >= i; last--){
result = isPalindrome(i, last, str);
//Possible error area
if(result == 1 && ((last-i)>(pEnd-pStart))){
pStart = i;
pEnd = last;
}
}
}
printf("Largest palindrome: ");
for(i = pStart; i <= pEnd; i++)
printf("%c", str[i]);
return;
}
/*
* Do not modify this code.
*/
int main(void)
{
int i = 0;
/* you can change these strings to other test cases but please change them back before submitting your code */
//str1 working correctly
char *str1 = "ABCBACDCBAAB";
char *str2 = "ABCBAHELLOHOWRACECARAREYOUIAMAIDOINEVERODDOREVENNGGOOD";
/* test easy example */
printf("Test String 1: %s\n",str1);
largestPalindrome(str1);
/* test hard example */
printf("\nTest String 2: %s\n",str2);
largestPalindrome(str2);
return 0;
}
Your code in isPalindrome doesn't work properly unless first is 0.
Consider isPalindrome(6, 10, "abcdefghhgX"):
i = 6;
last - i = 4;
comparing str[i] (aka str[6] aka 'g') with str[last-i] (aka str[4] aka 'e') is comparing data outside the range that is supposed to be under consideration.
It should be comparing with str[10] (or perhaps str[9] — depending on whether last is the index of the final character or one beyond the final character).
You need to revisit that code. Note, too, that your code will test each pair of characters twice where once is sufficient. I'd probably use two index variables, i and j, set to first and last. The loop would increment i and decrement j, and only continue while i is less than j.
for (int i = first, j = last; i < j; i++, j--)
{
if (str[i] != str[j])
return 0;
}
return 1;
In isPalindrome, replace the line if(str[i] != str[last-i]){ with if(str[i] != str[first+last-i]){.
Here's your problem:
for(i = first; i <= last; i++){
if(str[i] != str[last-i]){
return 0;
}
}
Should be:
for(i = first; i <= last; i++, last--){
if(str[i] != str[last]){
return 0;
}
}
Also, this:
for(last = strlen(str); last >= i; last--){
Should be:
for(last = strlen(str) - 1; last >= i; last--){
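Both fixes combined might look like the sketch below. It is not the original assignment code: longestPalindrome is a hypothetical variant that reports the result through the out-parameters start and length instead of printing it, and the indices are size_t rather than int.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if str[first..last] (both ends inclusive) reads the same
 * forwards and backwards, 0 otherwise. */
int isPalindrome(size_t first, size_t last, const char *str)
{
    for (size_t i = first, j = last; i < j; i++, j--) {
        if (str[i] != str[j])
            return 0;
    }
    return 1;
}

/* Hypothetical variant of largestPalindrome: stores the start index and the
 * length of the longest palindromic substring. Ties go to the leftmost one. */
void longestPalindrome(const char *str, size_t *start, size_t *length)
{
    size_t n = strlen(str);
    *start = 0;
    *length = 0;
    for (size_t i = 0; i < n; i++) {
        /* last is an inclusive index, so it runs from n - 1 down to i */
        for (size_t last = n; last-- > i; ) {
            size_t len = last - i + 1;
            if (len <= *length)
                break;                  /* shorter candidates cannot win */
            if (isPalindrome(i, last, str)) {
                *start = i;
                *length = len;
                break;
            }
        }
    }
}
```

Printing the result is then a matter of printf("%.*s\n", (int)length, str + start) in the caller.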

Character table-based searcher doesn't work

The following code is supposed to return the first character of the source string that also appears in the table set.
int find (char source[], char set[])
{
int i, l = strlen(set);
int exit = 0;
for(i = 0; source[i] != '\0';)
{
do
{
if(source[i] == set[l])
{
exit = 1;
break;
}
else l--;
} while (!l);
if(exit)
break;
else
{
i++;
l = strlen(set);
}
}
return set[l];
}
What am I doing wrong?
Use early returns when appropriate, use for loops with all three clauses most of the time, avoid calling strlen() more than once, and avoid do … while loops.
int find(char source[], char set[])
{
int len = strlen(set);
for (int i = 0; source[i] != '\0'; i++)
{
for (int l = 0; l < len; l++)
{
if (source[i] == set[l])
return (unsigned char)set[l];
}
}
return -1;
}
The cast returns a positive value even if plain char is a signed type.
I'm not wholly convinced that returning the character is best; the index where the character is found might be better.
If you're stuck with a C89 compiler, then you can use:
int find(char source[], char set[])
{
int len = strlen(set);
int i;
for (i = 0; source[i] != '\0'; i++)
{
int l;
for (l = 0; l < len; l++)
{
if (source[i] == set[l])
return (unsigned char)set[l];
}
}
return -1;
}
I'm letting the compiler optimize source[i] in the inner loop. If you don't trust your compiler, you could use:
int find(char source[], char set[])
{
int len = strlen(set);
int i;
for (i = 0; source[i] != '\0'; i++)
{
char c = source[i];
int l;
for (l = 0; l < len; l++)
{
if (c == set[l])
return (unsigned char)set[l];
}
}
return -1;
}
If you want to use a standard function, you probably want strcspn(), a much-neglected part of Standard C. This will return 0 (or '\0') if there is no other match, unlike the other functions that return -1.
int find(char source[], char set[])
{
size_t i = strcspn(source, set);
return (unsigned char)source[i];
}
If the negative return is important, then you'd use:
int find(char source[], char set[])
{
size_t i = strcspn(source, set);
return (source[i] == '\0') ? -1 : (unsigned char)source[i];
}
Or you could use strpbrk():
int find(char source[], char set[])
{
char *tgt = strpbrk(source, set);
if (tgt == 0)
return -1;
return (unsigned char)*tgt;
}
And there are probably other variants I've not thought of.
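For the index-returning alternative mentioned earlier, strcspn() already does most of the work. A small sketch, assuming a made-up name find_index and a long return type so that -1 can signal "not found":

```c
#include <string.h>

/* Hypothetical variant: index of the first character in source that also
 * appears in set, or -1 if there is none. */
long find_index(const char *source, const char *set)
{
    size_t i = strcspn(source, set);   /* length of the prefix with no match */
    return (source[i] == '\0') ? -1L : (long)i;
}
```

The caller can recover the character itself as source[index] whenever the result is not -1.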
If you want to keep the inner do/while loop (thus fixing up the original logic) you can write:
int find(char source[], char set[])
{
int i;
int len = strlen(set) - 1;
for (i = 0; source[i] != '\0'; i++)
{
int l = len;
do
{
if (source[i] == set[l])
return (unsigned char)set[l];
} while (--l >= 0);
}
return -1;
}
This avoids testing set[strlen(set)], which is by definition '\0', against source[i], which is known not to be '\0'. It still uses an early return, which radically simplifies the code (no exit variable, which is not a good name to use since there's a standard function exit() too). Note, too, how this keeps the loop control for the variable i all in the for statement; that is one of its principal virtues. You should aim to use it whenever possible.
Note that the original code scans from the end of set to the beginning instead of from the beginning to the end. Both methods work essentially equally well, but starting at the beginning and ending at the end is the more conventional way to work. It is also less apt to create bugs. If someone changes the type of l to size_t, then it never goes negative, so the do/while variant fails. The original proposed version will work fine if every int in the body of the function is changed to size_t.
I think, instead of
while (!l);
what you want is
while (l);
while (l > -1); //yeah, because array index starts from 0.
because you are decrementing the value of l and want to continue looping until l becomes less than 0, right?
Also, the return type of find() is int, and you're returning a char. Even though it is not an error, you may want to change that.

Resources