Finding anagram - c

if (strlen(a) != strlen(b)) {
printf("Not anagram");
} else {
for (int i = 0; i < strlen(a); i++) {
for (int j = 0; j < strlen(b); j++) {
if (a[i] == b[j]) {
len++;
}
}
}
if (len != strlen(a))
printf("Not anagram");
else
printf("Anagram");
}
return 0;
This is a code snippet to check if 2 strings are anagrams. How can repeated characters be handled here? Also, could this program be made more optimized? And what would be the runtime complexity of this code?

An optimal solution would probably be based on counting how often each character occurs in each string and then comparing the counts. Ideally, we would use a dictionary data structure, but for simplicity I will demonstrate the algorithm on an array:
char *word1 = "word1";
char *word2 = "ordw1";
// A char usually has 8 bits, so a C string can hold at most 256 distinct character values; store the counts in an array with 256 items.
int* letterCounts1 = calloc(256, sizeof(int));
int* letterCounts2 = calloc(256, sizeof(int));
size_t length1 = strlen(word1);
size_t length2 = strlen(word2);
for (size_t i = 0; i < length1; i++) {
int letterIndex = word1[i] & 0xFF;
letterCounts1[letterIndex] += 1;
}
for (size_t i = 0; i < length2; i++) {
int letterIndex = word2[i] & 0xFF;
letterCounts2[letterIndex] += 1;
}
bool isAnagram = true;
for (size_t i = 0; i < 256; i++) {
if (letterCounts1[i] != letterCounts2[i]) {
isAnagram = false;
break;
}
}
free(letterCounts1);
free(letterCounts2);
if (isAnagram) {
printf("Anagram");
} else {
printf("Not anagram");
}
This algorithm has linear (O(n)) complexity (iteration over the "dictionary" can be considered a constant).
Your original solution has quadratic complexity. However, you would also have to store the result of strlen in a variable, because every call to strlen has to iterate over the whole string, which increases the complexity to cubic.

First of all, this is not a correct solution. Consider these two strings: "aabc" and "aade".
a[0] == b[0], a[0] == b[1], a[1] == b[0] and a[1] == b[1], so len would be 4, but they are not anagrams. The complexity is O(n^2), where n is the length of the string.
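The false positive described above can be reproduced with a small sketch. The function name naive_is_anagram is hypothetical; it wraps the question's pair-counting logic and returns 1 where the original would print "Anagram".

```c
#include <string.h>

/* Reproduces the question's pair-counting logic (hypothetical wrapper name).
   Returns 1 when that logic would print "Anagram", 0 otherwise. */
int naive_is_anagram(const char *a, const char *b) {
    size_t n = strlen(a);
    size_t len = 0;
    if (n != strlen(b))
        return 0;
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++)
            if (a[i] == b[j])
                len++;   /* counts every matching pair, so repeated letters are counted several times */
    return len == n;
}
```

With this logic, naive_is_anagram("aabc", "aade") returns 1 even though the strings are not anagrams, and naive_is_anagram("aabb", "abab") returns 0 even though they are, because the four pairs of a and the four pairs of b add up to 8.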
As @Sulthan has answered you, a better approach is to sort the strings, which has complexity O(n*log(n)), and then compare both strings in one pass in O(n).
To sort the strings in O(n * log(n)) you cannot use bubble sort, but you can use merge sort as described here: https://www.geeksforgeeks.org/merge-sort/
An even better approach is to create an array of integers in which you count the number of occurrences of each character in the first string and then subtract one for each occurrence in the second string. In the end, all positions of the auxiliary array must be 0.
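The sort-and-compare approach mentioned above can be sketched as follows. This is only an illustration: it uses the standard library's qsort in place of a hand-written merge sort (the C standard does not promise a particular complexity for qsort), and the names cmp_char and sorted_is_anagram are made up for this example.

```c
#include <stdlib.h>
#include <string.h>

/* Orders two bytes by their unsigned value. */
static int cmp_char(const void *x, const void *y) {
    unsigned char a = *(const unsigned char *)x;
    unsigned char b = *(const unsigned char *)y;
    return (a > b) - (a < b);
}

/* Returns 1 if a and b are anagrams, 0 if not, -1 if allocation fails. */
int sorted_is_anagram(const char *a, const char *b) {
    size_t n = strlen(a);
    if (n != strlen(b))
        return 0;
    char *sa = malloc(n + 1);
    char *sb = malloc(n + 1);
    int result = -1;
    if (sa != NULL && sb != NULL) {
        memcpy(sa, a, n + 1);      /* sort copies so the inputs stay untouched */
        memcpy(sb, b, n + 1);
        qsort(sa, n, 1, cmp_char);
        qsort(sb, n, 1, cmp_char);
        result = memcmp(sa, sb, n) == 0;
    }
    free(sa);
    free(sb);
    return result;
}
```

Unlike the pair-counting version, this handles repeated letters: "aabc" and "aade" sort to different strings, so the comparison fails as it should.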

Here are some answers:
Your algorithm does not handle duplicate letters, it may return false positives.
It is unclear if it is correct otherwise because you did not post a complete function definition with all declarations and definitions, especially whether len is initialized to 0.
It has O(N^2) time complexity, or even O(N^3) if the compiler cannot optimize the numerous redundant calls to strlen().
Here is a simple solution for systems with 8-bit characters with linear complexity:
#include <stdio.h>
#include <string.h>
int check_anagrams(const char *a, const char *b) {
size_t counters[256];
size_t len = strlen(a);
size_t i;
if (len != strlen(b)) {
printf("Not anagrams\n");
return 0;
}
for (i = 0; i < 256; i++) {
counters[i] = 0;
}
for (i = 0; i < len; i++) {
int c = (unsigned char)a[i];
counters[c] += 1;
}
for (i = 0; i < len; i++) {
int c = (unsigned char)b[i];
if (counters[c] == 0) {
printf("Not anagrams\n");
return 0;
}
counters[c] -= 1;
}
printf("Anagrams\n");
return 1;
}

Related

Difference b/w using i<strlen() and str[i] != '\0'

When I use for(i=0;i<strlen(s);i++) I get a "time limit exceeded" error, but when I use for(i=0;s[i]!='\0';i++) my code is accepted. Why?
I am also providing a link to the question on CodeChef - https://www.codechef.com/problems/LCPESY
Type 1:
for (i = 0; i < strlen(s1); i++) {
f1[s1[i]]++;
}
for (i = 0; i < strlen(s2); i++) {
f2[s2[i]]++;
}
Type 2:
for (i = 0; s1[i] != '\0'; i++) {
f1[s1[i]]++;
}
for (i = 0; s2[i] != '\0'; i++) {
f2[s2[i]]++;
}
Complete code:
#include <stdio.h>
#include <string.h>
long int min(long int a, long int b) {
if (a >= b)
return b;
else
return a;
}
int main(void) {
// your code goes here
int t;
scanf("%d", &t);
while (t--) {
char s1[10001], s2[10001];
scanf("%s%s", s1, s2);
long int f1[200] = { 0 }, f2[200] = { 0 }, i, count = 0;
for (i = 0; i < strlen(s1); i++) {
f1[s1[i]]++;
}
for (i = 0; i < strlen(s2); i++) {
f2[s2[i]]++;
}
for (i = 0; i < 200; i++) {
count += min(f1[i], f2[i]);
}
printf("%ld\n", count);
}
return 0;
}
If a non-optimizing compiler is used, strlen may be re-evaluated on every iteration. strlen then needs to compare each character in the string with 0. This results in quadratic runtime: there are O(n²) checks for the terminating null instead of just the necessary O(n). In the strlen code the timeout happens because it does perhaps 2,000,000 null checks and 10,000 other operations; the other code would do 2,000 null checks and those same 10,000 other operations and not time out.
However, this need not be the case. Due to the as-if rule, a C compiler can generate exactly equivalent machine code for the cases
for (i = 0; i < strlen(s1); i++){
f1[s1[i]] ++;
}
and
for (i = 0; s1[i] != '\0'; i++) {
f1[s1[i]] ++;
}
because a compiler can easily prove that the loop body cannot possibly change s1, and therefore both forms behave equivalently.
In addition to @Antti Haapala's good answer:
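If you would rather not depend on the optimizer, the length can be computed once before the loop. A minimal sketch, assuming 8-bit characters; the function name count_bytes is hypothetical:

```c
#include <stddef.h>
#include <string.h>

/* Counts how often each byte value occurs in s.
   freq must point to 256 entries that the caller has set to zero. */
void count_bytes(const char *s, long freq[256]) {
    size_t n = strlen(s);            /* computed once, not once per iteration */
    for (size_t i = 0; i < n; i++)
        freq[(unsigned char)s[i]]++; /* the cast keeps the index non-negative */
}
```

This walks the string twice in total (once in strlen, once in the loop), which is still linear regardless of the optimization level.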
Difference b/w using i<strlen() and str[i] != '\0'
Code like int i; ... i < strlen(s1) readily triggers warnings about mismatched signedness, when such warnings are enabled. Usually inoffensive code like that discourages wide use of that warning. I see that as a less preferred approach. str[i] != '\0' does not cause that warning.
Some other concerns
Prevent buffer overflow
char s1[10001], s2[10001];
// scanf("%s%s", s1, s2);
if (scanf("%10000s%10000s", s1, s2) == 2) {
// OK, success, lets go!
There are more than 200 characters.
// long int f1[200] = { 0 };
long int f1[256] = { 0 };
// or better (UCHAR_MAX is defined in <limits.h>)
long int f1[UCHAR_MAX + 1] = { 0 };
Avoid a negative index
// f1[s1[i]]++;
f1[(unsigned char) s1[i]]++;
or use unquestionable unsigned types.
// char s1[10001];
unsigned char s1[10001];
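Putting these suggestions together, the counting part of the program could look like the sketch below. The function name common_count is hypothetical; it returns the same value the question's program prints for one test case.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: number of characters the two strings have in common,
   counting repeated characters as often as they occur in both. */
long common_count(const char *s1, const char *s2) {
    long f1[UCHAR_MAX + 1] = { 0 };  /* one counter per possible byte value */
    long f2[UCHAR_MAX + 1] = { 0 };
    long count = 0;
    for (size_t i = 0; s1[i] != '\0'; i++)
        f1[(unsigned char)s1[i]]++;  /* the cast avoids a negative index */
    for (size_t i = 0; s2[i] != '\0'; i++)
        f2[(unsigned char)s2[i]]++;
    for (size_t i = 0; i <= UCHAR_MAX; i++)
        count += f1[i] < f2[i] ? f1[i] : f2[i];
    return count;
}
```

In main, the strings would be read with the width-limited scanf("%10000s%10000s", s1, s2) shown above, and the result printed with printf("%ld\n", common_count(s1, s2)).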

Extracting a portion of an array by using pointer in C

For example if I have an array and I want to extract elements of this array with a specified value as a new array. I did as the following:
int a[10] = { 1, 2, 1, 3, 2, 3, 4, 1, 2, 6 };
int i, k;
int count = 0;
for (i = 0; i < 10; i++) {
if (a[i] == 1) {
count = count + 1;
}
}
int b[count];
k = 0;
for (i = 0; i < 10; i++) {
if (a[i] == 1) {
b[k] = a[i];
k = k + 1;
}
}
So, for the array "a" I extracted all the elements of value 1 and made them into a new array "b". How can I achieve the same thing by using pointers? Will it be more concise than this way? If it is possible, are there any other advantages?
I think you already noticed that you just had to write 1 several times; yet I suppose you want that it works for arbitrary conditions.
"Using a pointer" can mean dynamic memory allocation instead of a variable length array. Just for the sake of having used a pointer, you could then write:
int *b = malloc(count * sizeof(int));
k = 0;
for (i = 0; i < 10; i++) {
if (a[i] == 1) {
b[k] = a[i];
k = k + 1;
}
}
If you also want to use a pointer for the writing process, you could adapt the program as follows:
int *b = malloc(count * sizeof(int));
int *bPtr = b;
for (i = 0; i < 10; i++) {
if (a[i] == 1) {
*bPtr++ = a[i];
}
}
Hope it helps a bit.
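For completeness, here is a self-contained sketch of the pointer-based variant that also checks the allocation and works for an arbitrary value. The function name extract_equal and its parameters are made up for this example.

```c
#include <stdlib.h>

/* Copies every element of a[0..n-1] that equals value into a newly allocated
   array and stores the number of copied elements in *out_count.
   Returns NULL if the allocation fails; the caller must free the result. */
int *extract_equal(const int *a, size_t n, int value, size_t *out_count) {
    size_t count = 0;
    for (const int *p = a; p < a + n; p++)
        if (*p == value)
            count++;
    /* Request at least one element so a zero count does not ask for 0 bytes. */
    int *b = malloc((count ? count : 1) * sizeof *b);
    if (b == NULL)
        return NULL;
    int *dst = b;
    for (const int *p = a; p < a + n; p++)
        if (*p == value)
            *dst++ = *p;             /* write through the pointer, then advance it */
    *out_count = count;
    return b;
}
```

For the array in the question, extract_equal(a, 10, 1, &count) would yield three elements; unlike the variable length array, the result has to be released with free.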
If you don't know which portion of the array your target values will be in, as in your case, where you're searching the entire unsorted array for a specific value, then there is no advantage to using pointers rather than a linear search to find the elements in the array you are after.
If, however, you are trying to access or copy a contiguous set of elements starting at a known index in the array, then you could use a pointer to simplify things. For example, if I'm after the last few elements in an array of chars, this works:
#include <stdio.h>
int main()
{
char str[100] = "I don\'t want anyone to think I hate the Calgary Flames";
char *c = (str + 29);
printf("%s\n", c);
return 0;
}
Output:
I hate the Calgary Flames
In this case, no, there is no benefit. a[i] is already defined as *(a + i), where the compiler scales i by sizeof(int) to get the byte offset. Even if you used pointers, you'd still have to do all the counting anyway to make sure you don't walk off the end of the array.
Where it's often handy is with a null-terminated array, such as a string, where you don't know the length. But it's not really about performance. As you can see below, they have to do roughly the same things.
char string[] = "foo bar";
// Allocate and initialize i.
// `string + i` twice, compare, increment.
for( int i = 0; string[i] != '\0'; i++ ) {
printf("%c", string[i]);
}
puts("");
// Allocate and initialize s.
// Dereference s twice, compare, increment.
for( char *s = string; *s != '\0'; s++ ) {
printf("%c", *s);
}
puts("");
Where iterating by pointer is handy is when you need to iterate through an array in several steps. Instead of passing around the original array pointer plus your last index, and changing all your function signatures to accommodate, just pass around the incremented pointer and return the incremented pointer. This allows you to use standard string functions on the middle of a string.
#include <stdio.h>
char *findVal( char *string, char delim ) {
char *val = string;
for( ; *val != '\0' && *val != delim; val++ ) {
}
if( *val == '\0' ) { // delimiter not found; compare the character, not the pointer
return NULL;
}
else {
// val is sitting on the ':'
return val+1;
}
}
int main() {
char string[] = "this:that";
char *val = findVal(string, ':');
if( val != NULL ) {
// Just use val, not string[valIdx].
printf("%s\n", val);
}
}
This is also safer. With an offset there's two things which must remain in sync, the pointer and the offset; that opens the possibility that the wrong offset will be used with the wrong pointer. An incremented pointer carries its offset with it.
As has been pointed out in the comments, you can tighten up the second loop like so:
int b[count];
for (i = 0; i < count; i++) {
b[i] = 1;
}

Finding Prefix as Suffix in a string

I already posted this question but I'm still struggling to get it properly working. Dreamlax tried to help me out by giving the following steps -
starting with n = 1, Take the first n characters from the string.
Compare it to the last n characters from the string
Do they match?
If yes, print out the first n characters as the suffix and stop processing.
If no, increment n and try again. Try until n is in the middle of the string.
Here's my code which doesn't work:
#include <stdio.h>
#include <string.h>
void main()
{
int i, T, flag, j, k, len = 0, n;
char W[20], X[20], A[20], B[20];
scanf("%d", &T);
for (i = 0; i < T; i++)
{
scanf("%s", W);
for (len = 0; W[len] != '\0'; len++)
X[len] = W[len];
X[len] = '\0';
len--;
n = 1;
while (n < len / 2)
{
for (k = 0; k < n; k++)
A[k] = W[k];
for (k = 0, j = len - n; W[j] != '\0'; j++, k++)
B[k] = W[j];
if (!strcmp(A, B))
{
printf("YES\n");
break;
}
else
{
n++;
}
}
printf("NO\n");
}
}
Help me in pin pointing the error please.
There are several things going on in your code:
You should null-terminate your auxiliary strings A and B. Alternatively, you could compare just the first n characters with strncmp instead of strcmp.
strcmp is a comparison function. It returns zero if the strings match. (Comparison function means it can be used in sorting to determine whether a string is lexically greater or smaller than another string. The nomenclature for such functions is to return a negative number for lexically smaller, a positive number for lexically greater and zero means equality.)
You don't use the auxiliary string X except to find the length. You can easily find the length of a string with strlen, which, like strcmp, is declared in <string.h>
The calculation of your index for the suffix is off. Your length len is one less than the actual length and W[len] is the last character. Don't subtract one from your length.
Here's your code, refactored into a function, so that input and program logic are separated as they ought to be:
int is_nice(const char *W)
{
char A[20], B[20];
int len = strlen(W);
int j, k, n = 1;
while (n < len / 2) {
for (k = 0; k < n; k++) A[k] = W[k];
A[k] = '\0';
for (k = 0, j = len - n; W[j] != '\0'; j++, k++) B[k] = W[j];
B[k] = '\0';
if (strcmp(A, B) == 0) return 1;
n++;
}
return 0;
}
Above, I've said that you could use strncmp to compare only a certain number of characters in the string. If you think about it, you can omit the auxiliary strings A and B and compare just slices of your original string:
int is_nice(const char *W)
{
int len = strlen(W);
int n = 1;
while (n < len / 2) {
if (strncmp(W, W + len - n, n) == 0) return 1;
n++;
}
return 0;
}
This saves a lot of copying, some temporary variables and has one other significant benefit: Because the code doesn't have to guess a maximum size for the auxiliary buffers, it now works for strings of any size.
You have three errors in your code.
The first is when you compute the length of the input string. Subtracting 1 from len after the loop is not necessary (simulate this loop for a small n to see why).
In this line:
if (!strcmp(A, B))
you are comparing non null-terminated strings which is undefined behavior. You should either terminate strings A and B or use strncmp(A, B, n) to compare at most n characters.
The third error is a logical error. If a string is "nice", your program will output both YES and NO. But this one should be easy to fix.
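One way to fix the third error is to decide the answer first and then print exactly one line per test case. The sketch below reuses the strncmp-based is_nice from the answer above, with its loop bound kept as posted; the helper name verdict is hypothetical.

```c
#include <string.h>

/* Same check as the strncmp-based is_nice above; loop bound kept as posted. */
static int is_nice(const char *W) {
    size_t len = strlen(W);
    for (size_t n = 1; n < len / 2; n++)
        if (strncmp(W, W + len - n, n) == 0)
            return 1;
    return 0;
}

/* Hypothetical helper: the single line to print for one test case. */
const char *verdict(const char *W) {
    return is_nice(W) ? "YES" : "NO";
}
```

Inside the input loop, printf("%s\n", verdict(W)); then prints either YES or NO, never both.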

How can I remove a certain number of digits in a number so the number obtained is minimal?

How can I remove a certain number of digits in a number so the number obtained is minimal?
Specifically, I want to write a function int remove_digits(int large, int num_digits_to_remove) such that:
Any num_digits_to_remove digits are removed from large as though removing characters from its string representation
The number that is returned has the lowest possible value from removing digits as in step 1
For example, removing 4 digits from 69469813 would give 4613
I would prefer answers written in C.
Idea:
char number[] = "69469813";
char digits[ARRAY_SIZE(number)];
size_t i;
// sort digits; complexity O(n * log n);
sort_digits(digits, number); // -> digits becomes "99866431"
for (i = 0; i < number_of_digits_to_be_removed; ++i) {
size_t j;
for (j = 0; j < ARRAY_SIZE(number); ++j) {
if (number[j] == digits[i]) {
number[j] = 'X'; // invalidate it
break;
}
}
}
for (i = 0; i < ARRAY_SIZE(number); ++i)
if (number[i] != 'X')
printf("%c", number[i]);
Whole thing has a complexity of O(n * m);
The basic idea is that if you can only remove one digit, you want to remove the first digit (starting with the most significant digit) that is followed by a smaller digit.
For example, if your number is 123432, you want to remove the 4 (since it is followed by a 3), resulting in 12332.
You then repeat this process for as many digits as you want to remove:
char num[] = "69469813"; // a modifiable array; writing to a string literal is undefined behavior
char *buf = malloc(strlen(num)+1);
size_t to_remove = 4;
while (to_remove-- > 0) {
char *src = num;
char *dst = buf;
while (*src <= *(src+1)) { *dst++ = *src++; } // Advance until the next digit is less than the current digit
src++; // Skip it
while ((*dst++ = *src++)); // Copy the rest, including the terminating '\0'
strcpy(num, buf);
}
printf("%s\n", num); // Prints 4613
free(buf);
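The same greedy idea can also be done in a single pass by using the output buffer as a stack, which avoids rescanning the string for every removed digit. This is a different implementation technique from the loop above, shown only as a sketch; the name remove_digits_str is hypothetical, it works on digit strings rather than on int, and it assumes k is not larger than the number of digits.

```c
#include <stddef.h>
#include <string.h>

/* Writes into out the smallest digit string obtainable by deleting k digits
   from num. out must have room for strlen(num) + 1 bytes.
   Assumes k <= strlen(num). */
void remove_digits_str(const char *num, size_t k, char *out) {
    size_t n = strlen(num);
    size_t top = 0;                  /* out[0..top-1] is used as a stack */
    for (size_t i = 0; i < n; i++) {
        /* While removals remain, pop digits that are followed by a smaller one. */
        while (k > 0 && top > 0 && out[top - 1] > num[i]) {
            top--;
            k--;
        }
        out[top++] = num[i];
    }
    top -= k;                        /* leftover removals: drop digits from the end */
    out[top] = '\0';
}
```

For "69469813" and k = 4 this produces "4613", and for "123432" and k = 1 it produces "12332", matching the examples above.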
I don't know C but here is how I would do it in java:
String original = "69469813";
String result = "";
int numNeedToBeTaken = 4;
int numLeft = original.length() - numNeedToBeTaken;
while(result.length() < numLeft)
{
String temp = original.substring(0,original.length()-numNeedToBeTaken+1);
int smallest= 9;
int index = 0;
for(int i = 0; i<temp.length(); i++)
{
int number = Integer.parseInt(Character.toString(temp.charAt(i)));
if( number < smallest)
{
smallest = number;
index = i+1;
}
}
numNeedToBeTaken--;
result = result.concat(String.valueOf(smallest));
original = original.substring(index);
}
Log.d("debug","result: "+result); //tested to work with your example, returns 4613
Converting this to C should be pretty easy; I only used some basic operations.

Print out the longest substring in c

Suppose that we have a string "11222222345646". How can I print out the subsequence 222222 in C?
I have a function here, but I think something is incorrect. Can someone correct it for me?
int *longestsubstring(int a[], int n, int *length)
{
int location = 0;
length = 0;
int i, j;
for (i = 0, j = 0; i <= n-1, j < i; i++, j++)
{
if (a[i] != a[j])
{
if (i - j >= *length)
{
*length = i - j;
location = j;
}
j = i;
}
}
return &a[location];
}
Sorry, I don't really understand your question.
I just have a little code that can print the longest substring; I hope it helps.
/*brief : print the longest sub string*/
void printLongestSubString(const char * str,int length)
{
if(length <= 0)
return;
int i ;
int num1 = 0,num2 = 0;
int location = 0;
for(i = 0; i< length - 1; ++i)
{
if(str[i] == str[i+1])
++num2;//count the sub string ,may be not the longest,but we should try.
else
{
if(num2 >num1)//I use num1 store the sum longest of current sub string.
{ num1 = num2;location = i - num2;}
else
;//do nothing for short sub string.
num2 = 0;
}
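}
if(num2 > num1)//the last run ends at the end of the string, so check it here.
{ num1 = num2;location = length - 1 - num2;}
for(i = location;i <= location + num1;++i)//a run with num1 equal pairs has num1 + 1 characters.
printf("%c",str[i]);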
printf("\n");
}
int main()
{
char * str = "1122222234566";
printLongestSubString(str,13);
return 0;
}
From your code it appears you want to return the longest sub-sequence (sub-string). Since I'm relearning C, I thought I would give it a shot.
I've used strndup to extract the substring. I'm not sure how portable it is: it is a POSIX function and was not part of ISO C before C23, so you may need your own implementation on some platforms. It allocates memory to store the new C string, so you have to remember to free the memory once you are finished with the substring. Following your argument list, the length of the sub-string is returned through the third argument of the extraction routine.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
char *extract_longest_subsequence(const char *str, size_t str_len, size_t *longest_len);
int main()
{
char str[] = "11222234555555564666666";
size_t substr_len = 0;
char *substr = extract_longest_subsequence(str, sizeof(str), &substr_len);
if (!substr)
{
printf("Error: NULL sub-string returned\n");
return 1;
}
printf("original string: %s, length: %zu\n", str, sizeof(str)-1);
printf("Longest sub-string: %s, length: %zu\n", substr, substr_len);
/* Have to remember to free the memory allocated by strndup */
free(substr);
return 0;
}
char *extract_longest_subsequence(const char *str, size_t str_len, size_t *longest_len)
{
if (str == NULL || str_len < 1 || longest_len == NULL)
return NULL;
size_t longest_start = 0;
*longest_len = 0;
size_t curr_len = 1;
size_t i = 0;
for (i = 1; i < str_len; ++i)
{
if (str[i-1] == str[i])
{
++curr_len;
}
else
{
if (curr_len > *longest_len)
{
longest_start = i - curr_len;
*longest_len = curr_len;
}
curr_len = 1;
}
}
/* strndup allocates memory for storing the substring */
return strndup(str + longest_start, *longest_len);
}
It looks like in your loop that j is supposed to be storing where the current "substring" starts, and i is the index of the character that you are currently looking at. In that case, you want to change
for (i = 0, j = 0; i <= n-1, j < i; i++, j++)
to
for (i = 0, j = 0; i <= n-1; i++)
That way, you are using i to store which character you're looking at, and the j = i line will "reset" which string of characters you are checking the length of.
Also, a few other things:
1) length = 0 should be *length = 0. You probably don't actually want to set the pointer to point to address 0x0.
2) That last line would return where your "largest substring" starts, but it doesn't truncate where the characters start to change (i.e. the resulting string isn't necessarily *length long). It can be intentional depending on use case, but figured I'd mention it in case it saves some grief.
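Applying these fixes to the function from the question could look like the sketch below. The name longest_run is hypothetical, and two details go beyond the suggestions above: the last run is also checked when the array ends, and ties are resolved in favor of the first longest run (the question's >= would pick the last).

```c
/* Returns a pointer to the start of the longest run of equal values in
   a[0..n-1] and stores the run's length in *length.
   For n <= 0 it returns a and stores 0. */
const int *longest_run(const int a[], int n, int *length) {
    int location = 0;
    int j = 0;                       /* start of the current run */
    *length = 0;
    for (int i = 1; i <= n; i++) {
        /* A run ends at the end of the array or where the value changes. */
        if (i == n || a[i] != a[j]) {
            if (i - j > *length) {
                *length = i - j;
                location = j;
            }
            j = i;
        }
    }
    return &a[location];
}
```

The caller then uses the returned pointer together with *length, since the run is not terminated in any way inside the array.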
