C program for calculating alphabetical strings

Task:
Create a function maxRepeatingLetter that receives a string and as a result returns the letter that is repeated the most times in the given string (if more than one letter meets the given condition, return the first letter of the English alphabet). It is assumed that there will be at least one letter in the string.
Code:
#include <stdio.h>
char maxRepeatingLetter(const char *s)
{
int i, max = 0, maxIndex = 0;
int letter[26] = {0};
const char *pom = s;
while (*pom != '\0')
{
if ((*pom >= 'a' && *pom <= 'z'))
{
letter[*pom - 97]++;
}
if (*pom >= 'A' && *pom <= 'Z')
{
letter[*pom - 65]++;
}
pom++;
}
for (i = 0; i < 26; i++)
{
if (letter[i] > max)
{
max = letter[i]; maxIndex = i;
}
return maxIndex + 65;
}
int main ()
{
printf("Most repeating letter is: %c",
maxRepeatingLetter("input for letter to repeat"));
return 0;
}
My current task is to be able to explain the code above and how it works. I also need to make a minor change to it, for example add something to the code or make it simpler, without losing the main functionality the code has.
Is anyone willing to explain the code above in 2-3 lines? And if you could, please assist me, give me a hint, or even show me what changes I could apply to the code.

I can see that you have to handle both lowercase and uppercase letters and that only letters count, not symbols such as ? | ^ ! etc., so I'll try to give you some advice:
Try to indent the code; it will be more readable to an outside eye.
Using the array letter is a good idea, but I don't get the point of using pom.
If you can, use the function strlen() from the library string.h; otherwise, implementing it yourself could be a good exercise.
letter[*pom - 97]++, letter[*pom - 65]++ and maxIndex + 65 depend on the ASCII table; try letter[*pom - 'a']++, letter[*pom - 'A']++ and maxIndex + 'A' instead.
The for loop doesn't work: you missed a closing bracket, so the return ends up inside the for loop and the function is never closed.
The code explanation is pretty easy. First of all, you use the array letter of 26 elements because the alphabet has 26 letters, so the i-th element corresponds to the i-th letter of the alphabet.
You loop once over the string and store in the i-th element of letter the number of occurrences of the i-th letter.
With the if in the for loop you are simply finding the maximum in that array. Once you have found it, you return the index of the maximum converted back to a letter; that is the letter occurring most often.
Sorry for my bad English; tell me if you need more help.
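Putting the suggestions above together, a minimal sketch of a cleaned-up function might look like the following. It is only one possible cleanup, not the only one: the missing closing brace is restored, the magic numbers 65 and 97 are replaced by character constants, and the pom copy is dropped. Like the original, it assumes the letters are encoded contiguously (true for ASCII).

```c
#include <stdio.h>

/* Returns the most frequent letter of s as an uppercase character.
   Ties go to the letter that comes first in the alphabet.
   Assumes 'a'..'z' and 'A'..'Z' are contiguous, as in ASCII. */
char maxRepeatingLetter(const char *s)
{
    int letter[26] = {0};          /* letter[0] counts 'a'/'A', letter[1] 'b'/'B', ... */
    int max = 0, maxIndex = 0;

    /* Pass 1: count every letter, ignoring case and skipping non-letters. */
    for (; *s != '\0'; s++) {
        if (*s >= 'a' && *s <= 'z')
            letter[*s - 'a']++;
        else if (*s >= 'A' && *s <= 'Z')
            letter[*s - 'A']++;
    }

    /* Pass 2: find the largest count; '>' keeps the earliest letter on ties. */
    for (int i = 0; i < 26; i++) {
        if (letter[i] > max) {
            max = letter[i];
            maxIndex = i;
        }
    }
    return (char)('A' + maxIndex);
}
```

Called with the string from the question, "input for letter to repeat", this returns 'T' (the letter t occurs five times).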

If you are looking for an explanation of the above code, it is very straightforward. I recommend adding a few prints to see what happens at each step. Here are my tips:
The function maxRepeatingLetter() updates the counter table letter[] each time a letter appears.
After that, the code finds the highest number in the counter table letter[].

Related

Count Different Character Types In String

I wrote a program that counts and prints the number of occurrences of each character in a string, but it prints a garbage value when I use fgets(); with gets() it does not.
Here is my code:
#include<stdio.h>
#include<string.h>
#include<ctype.h>
#include<stdlib.h>
int main() {
    char c[1005];
    fgets(c, 1005, stdin);
    int cnt[26] = {0};
    for (int i = 0; i < strlen(c); i++) {
        cnt[c[i] - 'a']++;
    }
    for (int i = 0; i < strlen(c); i++) {
        if(cnt[c[i]-'a'] != 0) {
            printf("%c %d\n", c[i], cnt[c[i] - 'a']);
            cnt[c[i] - 'a'] = 0;
        }
    }
    return 0;
}
This is what I get when I use fgets():
baaaabca
b 2
a 5
c 1
32767
--------------------------------
Process exited after 8.61 seconds with return value 0
Press any key to continue . . . _
I fixed it by using gets() and got the correct result, but I still don't understand why fgets() gives the wrong result.

Hurray! So, the most important reason your code is failing is that your code does not observe the following inviolable advice:
Always sanitize your inputs
What this means is that if you let the user input anything then he/she/it can break your code. This is a major, common source of problems in all areas of computer science. It is so well known that a NASA engineer has given us the tale of Little Bobby Tables:
Exploits of a Mom #xkcd.com
It is always worth reading the explanation even if you get it already #explainxkcd.com
medium.com wrote an article about “How Little Bobby Tables Ruined the Internet”
Heck, Bobby’s even got his own website — bobby-tables.com
Okay, so, all that stuff is about SQL injection, but the point is, validate your input before blithely using it. There are many, many examples of C programs that fail because they do not carefully manage input. One of the most recent and widely known is the Heartbleed Bug.
For more fun side reading, here is a superlatively-titled list of “The 10 Worst Programming Mistakes In History” #makeuseof.com — a good number of which were caused by failure to process bad input!
Academia, methinks, often fails students by not having an entire course on just input processing. Instead we tend to pretend that the issue will be later understood and handled — code in academia, science, online competition forums, etc, often assumes valid input!
Where your code went wrong
Using gets() is dangerous because it does not stop reading and storing input as long as the user is supplying it. It has created so many software vulnerabilities that the C Standard has (at long last) officially removed it from C. SO actually has an excellent post on it: Why is the gets function so dangerous that it should not be used?
But it does remove the Enter key from the end of the user’s input!
fgets(), in contrast, stops reading input once the buffer you gave it is full! However, it also lets you know whether you actually got an entire line of text by not removing that Enter key.
Hence, assuming the user types: b a n a n a Enter
gets() returns the string "banana"
fgets() returns the string "banana\n"
That newline character '\n' (what you get when the user presses the Enter key) messes up your code because your code only accepts (or works correctly given) minuscule alphabet letters!
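One common way to get gets()-like results out of fgets() is to cut the string at the first newline. Here is a minimal sketch; the helper name strip_newline is made up for illustration and is not a library function, while strcspn() is the standard string.h call that returns the length of the prefix containing none of the listed characters.

```c
#include <string.h>

/* Cut s at the first '\n', if there is one, and return s for convenience.
   If there is no newline, strcspn() returns strlen(s), so the existing
   terminator is simply overwritten with itself. */
char *strip_newline(char *s)
{
    s[strcspn(s, "\n")] = '\0';
    return s;
}
```

After fgets(c, 1005, stdin), calling strip_newline(c) would leave "banana" instead of "banana\n". It does not protect against other non-letter characters, which is what the rest of this answer is about.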
The Fix
The fix is to reject anything that your algorithm does not like. The easiest way to recognize “good” input is to have a list of it:
// Here is a complete list of VALID INPUTS that we can histogram
//
const char letters[] = "abcdefghijklmnopqrstuvwxyz";
Now we want to create a mapping from each letter in letters[] to an array of integers (its name doesn’t matter, but we’re calling it count[]). Let’s wrap that up in a little function:
// Here is our mapping of letters[] ←→ integers[]
// • supply a valid input → get an integer unique to that specific input
// • supply an invalid input → get an integer shared with ALL invalid input
//
int * histogram(char c) {
    static int fooey;                        // number of invalid inputs
    static int count[sizeof(letters)] = {0}; // numbers of each valid input 'a'..'z'
    const char * p = strchr(letters, c);     // find the valid input, else NULL
    if (p) {
        int index = p - letters;             // 'a'=0, 'b'=1, ... (same order as in letters[])
        return &count[index];                // VALID INPUT → the corresponding integer in count[]
    }
    else return &fooey;                      // INVALID INPUT → returns a dummy integer
}
For the more astute among you, this is rather verbose: we can totally get rid of those fooey and index variables.
“Okay, okay, that’s some pretty fancy stuff there, mister. I’m a bloomin’ beginner. What about me, huh?”
Easy. Just check that your character is in range:
int * histogram(char c) {
    static int fooey = 0;
    static int count[26] = {0};
    if (('a' <= c) && (c <= 'z')) return &count[c - 'a'];
    return &fooey;
}
“But EBCDIC...!”
Fine. The following will work with both EBCDIC and ASCII:
int * histogram(char c) {
    static int fooey = 0;
    static int count[26] = {0};
    if (('a' <= c) && (c <= 'i')) return &count[ 0 + c - 'a'];
    if (('j' <= c) && (c <= 'r')) return &count[ 9 + c - 'j'];
    if (('s' <= c) && (c <= 'z')) return &count[18 + c - 's'];
    return &fooey;
}
You will honestly never have to worry about any other character encoding for the Latin minuscules 'a'..'z'. Prove me wrong.
Back to main()
Before we forget, stick the required magic at the top of your program:
#include <stdio.h>
#include <string.h>
Now we can put our fancy-pants histogram mapping to use, without the possibility of undefined behavior due to bad input.
int main() {
    // Ask for and get user input
    char s[1005];
    printf("s? ");
    fgets(s, 1005, stdin);
    // Histogram the input
    for (int i = 0; i < strlen(s); i++) {
        *histogram(s[i]) += 1;
    }
    // Print out the histogram, not printing zeros
    for (int i = 0; i < strlen(letters); i++) {
        if (*histogram(letters[i])) {
            printf("%c %d\n", letters[i], *histogram(letters[i]));
        }
    }
    return 0;
}
We make sure to read and store no more than 1004 characters (plus the terminating nul), and we prevent unwanted input from indexing outside of our histogram’s count[] array! Win-win!
s? a - ba na na !
a 4
b 1
n 2
But wait, there’s more!
We can totally reuse our histogram. Check out this little function:
// Reset the histogram to all zeros
//
void clear_histogram(void) {
    for (const char * p = letters; *p; p++)
        *histogram(*p) = 0;
}
All this stuff is not obvious. User input is hard. But you will find that it doesn’t have to be impossibly difficult genius-level stuff. It should be entertaining!
Another way to handle input is to transform it into acceptable values. For example, you can use tolower() to convert any majuscule letters to your histogram's input set.
s? ba na NA!
a 3
b 1
n 2
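The tolower() idea can be sketched roughly like this. The histogram() below is a copy of the simple range-checked version from earlier in this answer so that the snippet stands alone; histogram_line is a made-up helper name, not something from the original program.

```c
#include <ctype.h>

/* Copy of the simple range-checked mapping shown earlier (assumes 'a'..'z'
   are contiguous, as in ASCII). */
int *histogram(char c)
{
    static int fooey = 0;        /* shared counter for all invalid input */
    static int count[26] = {0};  /* one counter per letter 'a'..'z' */
    if (('a' <= c) && (c <= 'z')) return &count[c - 'a'];
    return &fooey;
}

/* Count every character of s, folding majuscules onto minuscules first.
   The cast to unsigned char keeps tolower() well defined for any char value. */
void histogram_line(const char *s)
{
    for (; *s != '\0'; s++) {
        char lower = (char)tolower((unsigned char)*s);
        *histogram(lower) += 1;
    }
}
```

With this, feeding in "ba na NA!" counts the two capital letters together with their lowercase forms, which is where the a 3, b 1, n 2 above comes from.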
But I digress again...
Hang in there!

Question about segment of code in C programming course

I'm doing a course online in C and this is the assignment:
You are still conducting linguistic research! This time, you'd like to
write a program to find out how many letters occur multiple times in a
given word. Your program should read a word from the input and then
sort the letters of the word alphabetically (by their ASCII codes).
Next, your program should iterate through the letters of the word and
compare each letter with the one following it. If these equal each
other, you increase a counter by 1, making sure to then skip ahead far
enough so that letters that occur more than twice are not counted
again. You may assume that the word you read from the input has no
more than 50 letters, and that the word is all lowercase.
The solution they provide:
#include <stdio.h>
int main(void)
{
    char word[51];
    int length = 0;
    int i, j;
    char swap;
    int repeats = 0;
    scanf("%s", word);
    while (word[length]!='\0')
        length++;
    //Sort the word by alphabetical order
    for(j=0;j<length-1; j++) {
        for(i=0;i<length-1;i++) {
            if (word[i] > word[i+1]) {
                swap = word[i];
                word[i] = word[i+1];
                word[i+1] = swap;
            }
        }
    }
    i = 0;
    //Check for repeating characters in the sorted word
    while (i<length-1) {
        if (word[i]==word[i+1]) {
            repeats++;
            j=i+2;
            //Continues through the word until it reaches a new character
            while (j<length && word[i]==word[j])
                j++;
            i = j;
        } else {
            i++;
        }
    }
    printf("%d", repeats);
    return 0;
}
I understand everything up to code "//check for repeating characters in the sorted word//".
Specifically I don't understand the purpose or logic of "j=i+2" (especially the "+2") and how it relates to the next section of code "//continues through the word until it reaches a new character". I don't think this was adequately explained in the tutorials provided by the course.
Any insight or feedback is much appreciated.

Imagine your word is printed on paper, and you examine it through a small hole in a cardboard sheet. You can only see 2 letters at a time.
First, look at the beginning of the word. Identical letters? If not, shift by 1 position and repeat. If yes:
You found one repeated letter. Now you should find where the repeated run ends. To do that, shift your examination hole to the end of the repeated run.
It is possible to do this shift correctly in several ways. The way they implemented is until the first of the two visible letters is different from the repeated letters you found earlier. To do that, first you should shift by 2 positions, because at that position in code you know that the two letters are identical. But you could shift by 1 position instead — that would be correct too.
Another correct implementation — shift until you see two different letters in the hole. This may be easier to implement and more intuitive.
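The "shift until you see two different letters in the hole" variant could be sketched as follows. This is my own illustration, not the course's solution; the function name count_repeated_letters is made up, and the input is assumed to be already sorted.

```c
#include <stddef.h>

/* Counts how many distinct letters occur more than once in an
   already-sorted, nul-terminated word. */
int count_repeated_letters(const char *sorted)
{
    int repeats = 0;
    size_t i = 0;

    /* The "hole" shows sorted[i] and sorted[i + 1]. */
    while (sorted[i] != '\0' && sorted[i + 1] != '\0') {
        if (sorted[i] == sorted[i + 1]) {
            repeats++;
            /* Slide the hole one step at a time until its two letters differ
               (or the word ends); i then sits on the last letter of the run. */
            while (sorted[i + 1] != '\0' && sorted[i] == sorted[i + 1])
                i++;
        }
        i++;  /* move on to the first letter after the current one */
    }
    return repeats;
}
```

For the sorted word "aabbbc" this counts 2 (a and b), the same answer the j = i + 2 version gives, without needing a second index.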

How can I implement a ceaser cipher that stays within the bounds of the alphabet in C

I have the following problem:
I would like to implement a Caesar cipher. It mostly works, but when I reach the end of the alphabet it goes beyond the alphabet, which I assume is due to the ASCII values.
For example:
If I insert a k and use the key 35, I get an H, but it should wrap around in the lowercase letters and produce a b.
It also sometimes produces a punctuation mark or something else like <, which I do not want.
The code responsible for the encryption is
encripted_text = (plain_text + key - 97)%26 +97;
Am I missing something to make it wrap around and stay within the alphabet?
Example run of program:
char plain_text = 'k';
int k = 35;
char encripted_text = '\0';
encripted_text = (plain_text + key - 97)%26 + 97;
printf("%c", encripted_text);
Thanks for your help.

Assuming that encripted_text and plain_text are both char variables (1 byte), you have to choose a formula that still produces a valid output character when the result wraps around.
How to do that? It depends on which characters are valid for you! In some cases you can simply rely on how characters are mapped in ASCII, but I want to suggest a general solution that you can adapt to your specific requirement.
This general solution consists in defining an array with the accepted alphabet, and translate the input character to one of the characters in it.
For example:
char alphabet[] = { 'a','A','b','B','c','C' }; //just an example
char input_char = 'k';
int key = 35;
char encripted_char = '\0';
encripted_char = alphabet[(input_char + key - 97)%(sizeof(alphabet))];
printf("%c", encripted_char );
Summarizing: the formula doesn't calculate the encrypted char directly; it calculates an index into the array of accepted characters, normalized using the % operator.
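Applied to the plain lowercase alphabet, the index-into-an-array idea might be sketched like this. The function name caesar_shift and the decision to pass non-letters through unchanged are my assumptions, not part of the question; the input position is looked up with strchr() rather than computed with - 97, so nothing here depends on ASCII.

```c
#include <string.h>

/* Shift a lowercase letter by key positions, wrapping inside the alphabet.
   Characters that are not in the alphabet are returned unchanged. */
char caesar_shift(char c, int key)
{
    static const char alphabet[] = "abcdefghijklmnopqrstuvwxyz";
    const int n = (int)(sizeof alphabet - 1);   /* 26, without the terminator */

    /* strchr() would also match the terminating '\0', so exclude it. */
    const char *p = (c != '\0') ? strchr(alphabet, c) : NULL;
    if (p == NULL)
        return c;

    int index = (int)(p - alphabet);            /* 'a' = 0, 'b' = 1, ... */
    /* key % n keeps the sum small; adding n before the final % makes the
       result non-negative even for negative keys. */
    int shifted = ((index + key % n) % n + n) % n;
    return alphabet[shifted];
}
```

With 'k' and key 35 this gives 't', because 35 is the same shift as 9 once it wraps around the 26 letters.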

I got the right output with the same logic, @nad34. In fact, I correctly got the output 't' for 'k'. It shouldn't and won't give a 'b'.
Your code has the right logic, except for a few slight errors.
I don't know why you're using a string here, but since you are anyway, this -> char plain_text[] = 'k'; should instead be char plain_text[] = "k"; ==> Note the double quotes.
int k = 35; should be int key = 35;, since you have used the variable name key and not k.
In terms of logic it is right, and it will give you the right output.
You can check out the execution of the same code here.

The use of tolower and storing in an array

I am trying to trace through this problem and cannot figure out how the star goes through the while loop and is stored in the array. Is * stored as 8 because of tolower? If anyone could please walk through the code from the first for loop to the second for loop, I would be eternally grateful.
#include <stdio.h>
#include <ctype.h>
int main()
{
    int index, freq[26], c, stars, maxfreq;
    for(index=0; index<26; index++)
        freq[index] = 0;
    while ( (c = getchar()) != '7')
    {
        if (isalpha(c))
            freq[tolower(c)-'a']++;
        printf("%d", &freq[7]);
    }
    maxfreq = freq [25];
    for (index = 24; index >= 0; index--)
    {
        if (freq[index] > maxfreq)
            maxfreq = freq[index];
    }
    printf ("a b c d e f\n");
    for (index = 0; index < 5; index++)
    {
        for (stars = 0; stars < (maxfreq - freq[index]); stars ++)
            printf(" ");
        for (stars = 0; stars < (freq[index]); stars++)
            printf("*");
        printf("%c \n", ('A' + index) );
        printf(" \n");
    }
    return 0;
}

It seems that this code is a histogram of sorts that prints how many times a given character has been entered into the console before it reaches the character '7'.
The following code:
for(index=0; index<26; index++)
    freq[index] = 0;
Is simply setting all of the values of the array to 0. This is because of the fact that in C, variables that are declared in block scope (that is, inside a function) and that are not static do not have a specific default value and as such simply contain the garbage that was in that memory before the variable was declared. This would obviously affect the results that are displayed each time it is run, or when it is run elsewhere, which is not what you want I'm sure.
while ( (c = getchar()) != '7')
{
    if (isalpha(c))
        freq[tolower(c)-'a']++;
    printf("%d", &freq[7]);
}
This next section uses a while loop to keep accepting input with getchar() (which gets the next character of input from stdin in this case) until the character '7' is reached. This works because an assignment such as "c = getchar()" is itself an expression whose value is the character just assigned, so it can be compared directly using "!= '7'". The loop therefore continues until the character read from stdin is equal to '7', after which the while loop ends.
Inside the loop itself, the code checks the value that has been entered using "isalpha()", which returns true if the character is an alphabetic letter. By calling "tolower()" and subtracting the character value of 'a' from the result, we are basically finding the numeric position of this character in the alphabet. Take the letter 'F' as an example. Capital 'F' is stored as the value 70 in ASCII. tolower() checks whether it is an uppercase character, and if it is, it returns the lowercase version of it (in this case, 'f' == 102). The value of 'a' (stored as 97) is then subtracted from this, which gives 5 (which, when counting from 0, is the position of 'F' in the alphabet). This is then used to select that element of the array and increment it, which records that another "F" or "f" has been entered.
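The index calculation described above can be tried in isolation with a small sketch like this one. The function name letter_index is hypothetical, and like the original program it assumes the lowercase letters are encoded contiguously (as in ASCII). It also shows why a character such as '*' never reaches the array: isalpha() rejects it, so nothing is incremented.

```c
#include <ctype.h>

/* Returns 0 for 'a'/'A', 1 for 'b'/'B', ..., 25 for 'z'/'Z',
   or -1 if c is not a letter. c is expected to be a value as returned
   by getchar(), i.e. an unsigned char value or EOF. */
int letter_index(int c)
{
    if (!isalpha(c))
        return -1;            /* digits, '*', spaces, ... are not counted */
    return tolower(c) - 'a';  /* 'F' -> 'f' -> 102 - 97 = 5 in ASCII */
}
```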
maxfreq = freq [25];
for (index = 24; index >= 0; index--)
{
    if (freq[index] > maxfreq)
        maxfreq = freq[index];
}
This next section sets the variable "maxfreq" to the last value (how many times 'Z' was found), and iterates downwards, changing the value of maxfreq to the highest value that is found (that is, the largest number of any given character that is found in the array). This is later used to format the output to make sure that the letters line up correctly and the number of stars and spaces are correct.

Converting Letters to Numbers in C

I'm trying to write a code that would convert letters into numbers. For example
A ==> 0
B ==> 1
C ==> 2
and so on. Im thinking of writing 26 if statements. I'm wondering if there's a better way to do this...
Thank you!

This is a way that I feel is better than the switch method, and yet is standards compliant (does not assume ASCII):
#include <string.h>
#include <ctype.h>
/* returns -1 if c is not an alphabetic character */
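int c_to_n(char c)
{
    int n = -1;
    static const char * const alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    /* look the (upper-cased) character up in the alphabet string */
    const char *p = strchr(alphabet, toupper((unsigned char)c));
    if (p)
    {
        n = (int)(p - alphabet);   /* offset into alphabet: 'A' = 0, 'B' = 1, ... */
    }
    return n;
}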

If you need to deal with upper-case and lower-case then you may want to do something like:
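if (letter >= 'A' && letter <= 'Z')
    num = letter - 'A';
else if (letter >= 'a' && letter <= 'z')
    num = letter - 'a';
If you want to display these as a single character, you can convert the number into an ASCII digit by adding '0' to it. Note that this only yields a digit character for the values 0 to 9; for larger values, print the number with printf("%d", num) instead: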
asciinumber = num + '0';

The C standard does not guarantee that the characters of the alphabet will be numbered sequentially. Hence, portable code cannot assume, for example, that 'B'-'A' is equal to 1.
The relevant section of the C specification is section 5.2.1 which describes the character sets:
3 Both the basic source and basic execution character sets shall have
the following members: the 26 uppercase letters of the Latin
alphabet
ABCDEFGHIJKLM
NOPQRSTUVWXYZ
the 26 lowercase letters of the Latin alphabet
abcdefghijklm
nopqrstuvwxyz
the 10 decimal digits
0123456789
the following 29 graphic characters
!"#%&'()*+,-./:
;<=>?[\]^_{|}~
the space character, and control characters representing horizontal
tab, vertical tab, and form feed. The
representation of each member of the source and execution basic
character sets shall fit in a byte. In both the source and execution
basic character sets, the value of each character after 0 in the above
list of decimal digits shall be one greater than the value of the
previous.
So the specification only guarantees that the digits will have sequential encodings. There is absolutely no restriction on how the alphabetic characters are encoded.
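The digit guarantee quoted above is what makes the familiar c - '0' idiom portable. A minimal sketch, with a made-up function name:

```c
/* Returns 0..9 for the characters '0'..'9', or -1 for anything else.
   This is portable because the standard requires the ten decimal digits
   to have consecutive values; no such promise exists for letters. */
int digit_to_number(char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    return -1;
}
```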
Fortunately, there is an easy and efficient way to convert A to 0, B to 1, etc. Here's the code
char letter = 'E'; // could be any upper or lower case letter
char str[2] = { letter }; // make a string out of the letter
int num = strtol( str, NULL, 36 ) - 10; // convert the letter to a number
The reason this works can be found in the man page for strtol which states:
(In bases above 10, the letter 'A' in either upper or lower case
represents 10, 'B' represents 11, and so forth, with 'Z' representing
35.)
So passing 36 to strtol as the base tells strtol to convert 'A' or 'a' to 10, 'B' or 'b' to 11, and so on. All you need to do is subtract 10 to get the final answer.
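Wrapped in a function with some validation, the strtol() approach might look like the sketch below. The name letter_to_number and the choice of -1 as the error value are my assumptions; the end-pointer check and the value < 10 check are there because strtol() in base 36 also accepts the digits 0 to 9 and silently converts nothing for characters such as '!'.

```c
#include <stdlib.h>

/* Returns 0 for 'A'/'a' ... 25 for 'Z'/'z', or -1 if letter is not one
   of the 26 Latin letters. */
int letter_to_number(char letter)
{
    char str[2] = { letter, '\0' };   /* make a one-character string */
    char *end;
    long value = strtol(str, &end, 36);

    if (end == str)      /* nothing was converted: not a base-36 digit */
        return -1;
    if (value < 10)      /* '0'..'9' are valid base-36 digits but not letters */
        return -1;
    return (int)(value - 10);
}
```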

Another, far worse (but still better than 26 if statements) alternative is to use switch/case:
switch(letter)
{
    case 'A':
    case 'a': // don't use this line if you want only capital letters
        num = 0;
        break;
    case 'B':
    case 'b': // same as above about 'a'
        num = 1;
        break;
    /* and so on and so on */
    default:
        fprintf(stderr, "WTF?\n");
}
Consider this only if there is absolutely no relationship between the letter and its code. Since there is a clear sequential relationship between the letter and the code in your case, using this is rather silly and going to be awful to maintain, but if you had to encode random characters to random values, this would be the way to avoid writing a zillion if()/else if()/else if()/else statements.

There is a much better way.
In ASCII (www.asciitable.com) you can look up the numerical values of these characters.
'A' is 0x41.
So you can simply subtract 0x41 from them to get the numbers. I don't know C very well, but something like:
int num = 'A' - 0x41;
should work.

In most programming and scripting languages there is a means to get the "ordinal" value of any character. (Think of it as an offset from the beginning of the character set).
Thus you can usually do something like:
for ch in somestring:
    if lowercase(ch):
        n = ord(ch) - ord('a')
    elif uppercase(ch):
        n = ord(ch) - ord('A')
    else:
        n = -1 # Sentinel error value
        # (or raise an exception as appropriate to your programming
        # environment and to the assignment specification)
Of course this wouldn't work for an EBCDIC-based system (and might not work for some other exotic character sets). I suppose a reasonable sanity check would be to test whether this function returns monotonically increasing values in the range 0..25 for the strings "abc...xyz" and "ABC...XYZ".
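In C, the sanity check suggested above could be sketched like this. The function name is hypothetical; it simply verifies that subtracting 'a' or 'A' gives 0, 1, ..., 25 for the letters in order, which is the assumption the subtraction approach relies on.

```c
/* Returns 1 if 'a'..'z' and 'A'..'Z' are each encoded as 26 consecutive
   values on this platform (true for ASCII, false for EBCDIC), else 0. */
int letters_are_contiguous(void)
{
    static const char lower[] = "abcdefghijklmnopqrstuvwxyz";
    static const char upper[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";

    for (int i = 0; i < 26; i++) {
        if (lower[i] - 'a' != i || upper[i] - 'A' != i)
            return 0;
    }
    return 1;
}
```

A program could call this once at startup and fall back to a table-based conversion if it returns 0.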
A whole different approach would be to create an associative array (dictionary, table, hash) of your letters and their values (one or two simple loops). Then use that. (Most modern programming languages include support for associative arrays.)
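C has no built-in associative array, but for single characters a plain array indexed by the character value does the same job. A minimal sketch, with made-up names (letter_value, build_letter_table, lookup_letter); it makes no assumption about how the letters are encoded.

```c
#include <limits.h>

/* One slot per possible unsigned char value; -1 means "not a letter". */
static int letter_value[UCHAR_MAX + 1];

/* Fill the table once, before any lookups. */
void build_letter_table(void)
{
    static const char lower[] = "abcdefghijklmnopqrstuvwxyz";
    static const char upper[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";

    for (int i = 0; i <= UCHAR_MAX; i++)
        letter_value[i] = -1;
    for (int i = 0; i < 26; i++) {
        letter_value[(unsigned char)lower[i]] = i;
        letter_value[(unsigned char)upper[i]] = i;
    }
}

/* Constant-time lookup: 'A'/'a' -> 0, ..., 'Z'/'z' -> 25, otherwise -1. */
int lookup_letter(char c)
{
    return letter_value[(unsigned char)c];
}
```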
Naturally I'm not "doing your homework." You'll have to do that for yourself. I'm simply explaining that those are the obvious approaches that would be used by any professional programmer. (Okay, an assembly language hack might also just mask out one bit for each byte, too).

Since the char data type is treated much like an int data type in C and C++, you could go with something like:
char c = 'A'; // just some character
int urValue = c - 65; // 65 is the code of 'A' in ASCII
If you are worried about case sensitivity:
#include <ctype.h> // if using C++ #include <cctype>
int urValue = toupper((unsigned char)c) - 65;

Aww, if only you had C++.
For Unicode:
definition of how to map characters to values
#include <map>
typedef std::map<wchar_t, int> WCharValueMap;
WCharValueMap fillMap() {
    WCharValueMap result;
    result[L'A'] = 0;
    result[L'Â'] = 0;
    result[L'B'] = 1;
    result[L'C'] = 2;
    return result;
}
WCharValueMap myConversion = fillMap(); // fillMap() has to be declared before it is used here
usage
int value = myConversion[L'Â'];

I wrote this bit of code for a project, and I was wondering how naive this approach is.
The benefit here is that it seems to adhere to the standard, and my guess is that the runtime is approximately O(k), where k is the size of the alphabet.
#include <ctype.h>
int ctoi(char c)
{
    int index;
    const char *alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    c = (char)toupper((unsigned char)c);
    // avoid doing strlen here to juice some efficiency.
    for(index = 0; index != 26; index++)
    {
        if(c == alphabet[index])
        {
            return index;
        }
    }
    return -1;
}

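#include<stdio.h>
#include<ctype.h>
int val(char a);
int main()
{
    char r;
    scanf("%c",&r);
    printf("\n%d\n",val(r));
    return 0;
}
/* Counts the letters that come before a, so 'A' gives 0, 'B' gives 1, ... */
/* Assumes the letters are encoded contiguously, as in ASCII. */
int val(char a)
{
    int i=0;
    char k;
    for(k='A';k<toupper((unsigned char)a);k++)
        i++;
    return i;
}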
