I am enciphering text as shown below, but I don't know how to stop the capital letters from turning into other symbols when they are shifted out of range; likewise, the lowercase letters leave the lowercase range. How can I make them wrap around in a circle instead of overflowing? Thanks
int size = strlen(plain_text);
int arrayelement = 0;
for(arrayelement = 0; arrayelement < size; arrayelement++)
{
if (islower(plain_text[arrayelement]))
{
ciphered_text[arrayelement] = (int)(plain_text[arrayelement] + shiftkey);
}
else if (isupper(plain_text[arrayelement]))
{
ciphered_text[arrayelement] = (int)(plain_text[arrayelement] + shiftkey);
}
}
ciphered_text[size] = '\0';
printf("%s", ciphered_text);
I guess you use a type like char so an easy solution to not overflow is to do
int tmp_ciphered = (my_char + shift) % 0xff;
char ciphered = (char)(tmp_ciphered);
Then the value wraps around and does not overflow; this is a ring.
This duplicates (almost exactly) c++ simple Caesar cipher algorithm.
Note that I don't agree with the accepted answer on that post. Basically you have to map the characters back into the range using something like ((c-'a'+shift) % 26) + 'a'. However, that assumes your characters are in 'a'..'z'. It might be safer to use c >= 'a' && c <= 'z' instead of islower, as I'm not sure how locale will play into this on non-English systems. The same goes for isupper and the other range. Finally, you need an else clause to handle the case where the char is not in either range.
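A minimal sketch of that mapping, assuming an ASCII-compatible character set in which 'a'..'z' and 'A'..'Z' are contiguous. The names caesar_shift and caesar_encode are made up for illustration and are not part of the question's code:

```c
#include <stddef.h>

/* Hypothetical helper: shift one character, wrapping within its own case.
   Assumes 'a'..'z' and 'A'..'Z' are contiguous (true for ASCII, not EBCDIC). */
char caesar_shift(char c, int shift)
{
    shift %= 26;
    if (shift < 0)
        shift += 26;            /* normalise negative shifts to 0..25 */

    if (c >= 'a' && c <= 'z')
        return (char)((c - 'a' + shift) % 26 + 'a');
    if (c >= 'A' && c <= 'Z')
        return (char)((c - 'A' + shift) % 26 + 'A');
    return c;                   /* the "else" case: leave other characters alone */
}

/* Apply the shift to a whole string; out must hold at least strlen(in) + 1 bytes. */
void caesar_encode(const char *in, char *out, int shift)
{
    size_t i;
    for (i = 0; in[i] != '\0'; i++)
        out[i] = caesar_shift(in[i], shift);
    out[i] = '\0';
}
```

Subtracting 'a' (or 'A') first turns the letter into an offset 0..25, so the % 26 wraps inside the alphabet rather than inside the byte range.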
The only truly portable way to do this involves building a lookup table for the input domain, and manually building the chars based on non-linear-assumptions.
Even for the restricted domain of ['a'..'z','A'..'Z'], assuming 'A'..'Z' is contiguous is not defined by the language standard, and is provably not always the case. For any naysayers that think otherwise, I direct you to ordinal positions of characters in the chart at this link, paying close attention to the dead-zones in the middle of the assumed sequences. If you think "Nobody uses EBCDIC anymore", let me assure you both AS/400 and OS/390 are alive and well (and probably processing your US taxes right now, as the IRS is one of IBM's biggest customers).
In fact, the C standard is pretty explicit about this:
C99-5.2.1.3 In both the source and execution basic character sets, the value of each character after 0 in the above list of decimal digits shall be one greater than the value of the previous.
Nowhere is there even a mention of defined ordering or even implied ordering on any other part of the character sets. In fact, '0'..'9' has one other unique attribute: they are the only characters guaranteed to be unaffected by locale changes.
So rather than assume a linear continuation exists for characters while thumbing our noses at the suspicious silence of the standard, let us define our own, hard map. I'll not inline the code here like I normally do; if you're still with me you're genuinely interested in knowing and will likely read and critique the code below. But I will describe in summary how it works:
1. Static-declare two alphabets, double in length (A..ZA..Z, a..za..z).
2. Declare two arrays (encrypt and decrypt) large enough to hold (1<<CHAR_BIT) entries.
3. Fully initialize both arrays with values corresponding to their indexes. Ex: a[0]=0, a[1]=1, ...
4. Fill each location in the encrypt-array that is part of our alphabets from (1) with the proper value corresponding to the shift width. Ex: a['a'] = 'f' for a ROT5.
5. Mirror (4) in the decrypt-array by working backward from the tail of the alphabet, applying the opposite shift direction. Ex: a['f'] = 'a';
You can now use the two arrays as simple tables to translate input text to cipher text and back:
enc-char = encrypt[ dec-char ];
dec-char = decrypt[ enc-char ];
If you think it seems like a ton of work just to get source-level platform independence, you're absolutely right. But you would be amazed at the #ifdef #endif hell that people try to pass off as "multi-platform". The core goal of platform-independent code is to not only define common source, but define behavior as well. No matter what the platform, the concepts above will work. (and not a #ifdef in sight).
Thanks for taking the time to read this fiasco. Such a seemingly simple problem...
Sample main.cpp
#include <stdio.h>
#include <stdlib.h>
#include <limits.h>
#include <string.h>
// global tables for encoding. must call init_tables() before using
static char xlat_enc[1 << CHAR_BIT];
static char xlat_dec[1 << CHAR_BIT];
void init_tables(unsigned shift)
{
// our rotation alphabets
static char ucase[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZ";
static char lcase[] = "abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyz";
int i=0;
// ensure shift is below our maximum shift
shift %= 26;
// prime our table
for (;i<(1 << CHAR_BIT);i++)
xlat_enc[i] = xlat_dec[i] = i;
// apply shift to our xlat tables, both enc and dec.
for (i=0;i<26;i++)
{
xlat_enc[ (unsigned char)lcase[i] ] = lcase[i+shift];
xlat_enc[ (unsigned char)ucase[i] ] = ucase[i+shift];
// walk back from the last letter (index sizeof - 2); index sizeof - 1 is the terminating '\0'
xlat_dec[ (unsigned char)lcase[sizeof(lcase) - i - 2] ] = lcase[sizeof(lcase) - i - 2 - shift];
xlat_dec[ (unsigned char)ucase[sizeof(ucase) - i - 2] ] = ucase[sizeof(ucase) - i - 2 - shift];
}
}
// main entrypoint
int main(int argc, char *argv[])
{
// using a shift of 13 for our sample
const int shift = 13;
// initialize the tables
init_tables(shift);
// now drop the message to the console
char plain[] = "The quick brown fox jumps over the lazy dog.";
char *p = plain;
for (;*p; fputc(xlat_enc[(unsigned char)*p++], stdout));
fputc('\n', stdout);
char cipher[] = "Gur dhvpx oebja sbk whzcf bire gur ynml qbt.";
p = cipher;
for (;*p; fputc(xlat_dec[(unsigned char)*p++], stdout));
fputc('\n', stdout);
return EXIT_SUCCESS;
}
Output
Gur dhvpx oebja sbk whzcf bire gur ynml qbt.
The quick brown fox jumps over the lazy dog.
You can implement it literally:
"if they are shifted out of range":
if (ciphered_text[arrayelement] > 'z')
"make them go round in a circle and stop overflow":
ciphered_text[arrayelement] -= 26;
In your context:
if (plain_text[arrayelement] >= 'a' && plain_text[arrayelement] <= 'z')
{
int shifted = plain_text[arrayelement] + shiftkey; /* computed as int, so 'z' + 25 cannot overflow a signed char */
if (shifted > 'z')
shifted -= 26;
ciphered_text[arrayelement] = (char)shifted;
}
(assuming you work with English text in ASCII encoding, and shiftkey is in the range 1...25, like it should be)
Related
I wrote a program that counts and prints the number of occurrences of elements in a string, but it prints a garbage value when I use fgets(); with gets() it does not.
Here is my code:
#include<stdio.h>
#include<string.h>
#include<ctype.h>
#include<stdlib.h>
int main() {
char c[1005];
fgets(c, 1005, stdin);
int cnt[26] = {0};
for (int i = 0; i < strlen(c); i++) {
cnt[c[i] - 'a']++;
}
for (int i = 0; i < strlen(c); i++) {
if(cnt[c[i]-'a'] != 0) {
printf("%c %d\n", c[i], cnt[c[i] - 'a']);
cnt[c[i] - 'a'] = 0;
}
}
return 0;
}
This is what I get when I use fgets():
baaaabca
b 2
a 5
c 1
32767
--------------------------------
Process exited after 8.61 seconds with return value 0
Press any key to continue . . . _
I fixed it by using gets() and got the correct result, but I still don't understand why fgets() gives the wrong result.
Hurray! So, the most important reason your code is failing is that your code does not observe the following inviolable advice:
Always sanitize your inputs
What this means is that if you let the user input anything then he/she/it can break your code. This is a major, common source of problems in all areas of computer science. It is so well known that a NASA engineer has given us the tale of Little Bobby Tables:
Exploits of a Mom #xkcd.com
It is always worth reading the explanation even if you get it already #explainxkcd.com
medium.com wrote an article about “How Little Bobby Tables Ruined the Internet”
Heck, Bobby’s even got his own website — bobby-tables.com
Okay, so, all that stuff is about SQL injection, but the point is, validate your input before blithely using it. There are many, many examples of C programs that fail because they do not carefully manage input. One of the most recent and widely known is the Heartbleed Bug.
For more fun side reading, here is a superlatively-titled list of “The 10 Worst Programming Mistakes In History” #makeuseof.com — a good number of which were caused by failure to process bad input!
Academia, methinks, often fails students by not having an entire course on just input processing. Instead we tend to pretend that the issue will be later understood and handled — code in academia, science, online competition forums, etc, often assumes valid input!
Where your code went wrong
Using gets() is dangerous because it does not stop reading and storing input as long as the user is supplying it. It has created so many software vulnerabilities that the C Standard has (at long last) officially removed it from C. SO actually has an excellent post on it: Why is the gets function so dangerous that it should not be used?
But it does remove the Enter key from the end of the user’s input!
fgets(), in contrast, stops reading input at some point! However, it also lets you know whether you actually got an entire line of text by not removing that Enter key.
Hence, assuming the user types: b a n a n a Enter
gets() returns the string "banana"
fgets() returns the string "banana\n"
That newline character '\n' (what you get when the user presses the Enter key) messes up your code because your code only accepts (or works correctly given) minuscule alphabet letters!
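One common way to deal with just that trailing newline is to cut the string at the first '\n' right after fgets() returns. The helper name strip_newline below is made up for illustration; this is only a sketch, and it does nothing about any other unexpected characters in the input:

```c
#include <string.h>

/* Hypothetical helper: cut the string at the first newline, if there is one.
   strcspn returns the index of the first '\n', or the index of the
   terminating '\0' when the string contains no newline at all. */
void strip_newline(char *s)
{
    s[strcspn(s, "\n")] = '\0';
}
```

Calling strip_newline(c) directly after fgets(c, 1005, stdin) would turn "banana\n" back into "banana".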
The Fix
The fix is to reject anything that your algorithm does not like. The easiest way to recognize “good” input is to have a list of it:
// Here is a complete list of VALID INPUTS that we can histogram
//
const char letters[] = "abcdefghijklmnopqrstuvwxyz";
Now we want to create a mapping from each letter in letters[] to an array of integers (its name doesn’t matter, but we’re calling it count[]). Let’s wrap that up in a little function:
// Here is our mapping of letters[] ←→ integers[]
// • supply a valid input → get an integer unique to that specific input
// • supply an invalid input → get an integer shared with ALL invalid input
//
int * histogram(char c) {
static int fooey; // number of invalid inputs
static int count[sizeof(letters)] = {0}; // numbers of each valid input 'a'..'z'
const char * p = strchr(letters, c); // find the valid input, else NULL
if (p) {
int index = p - letters; // 'a'=0, 'b'=1, ... (same order as in letters[])
return &count[index]; // VALID INPUT → the corresponding integer in count[]
}
else return &fooey; // INVALID INPUT → returns a dummy integer
}
For the more astute among you, this is rather verbose: we can totally get rid of those fooey and index variables.
“Okay, okay, that’s some pretty fancy stuff there, mister. I’m a bloomin’ beginner. What about me, huh?”
Easy. Just check that your character is in range:
int * histogram(char c) {
static int fooey = 0;
static int count[26] = {0};
if (('a' <= c) && (c <= 'z')) return &count[c - 'a'];
return &fooey;
}
“But EBCDIC...!”
Fine. The following will work with both EBCDIC and ASCII:
int * histogram(char c) {
static int fooey = 0;
static int count[26] = {0};
if (('a' <= c) && (c <= 'i')) return &count[ 0 + c - 'a'];
if (('j' <= c) && (c <= 'r')) return &count[ 9 + c - 'j'];
if (('s' <= c) && (c <= 'z')) return &count[18 + c - 's'];
return &fooey;
}
You will honestly never have to worry about any other character encoding for the Latin minuscules 'a'..'z'. Prove me wrong.
Back to main()
Before we forget, stick the required magic at the top of your program:
#include <stdio.h>
#include <string.h>
Now we can put our fancy-pants histogram mapping to use, without the possibility of undefined behavior due to bad input.
int main() {
// Ask for and get user input
char s[1005];
printf("s? ");
fgets(s, 1005, stdin);
// Histogram the input
for (int i = 0; i < strlen(s); i++) {
*histogram(s[i]) += 1;
}
// Print out the histogram, not printing zeros
for (int i = 0; i < strlen(letters); i++) {
if (*histogram(letters[i])) {
printf("%c %d\n", letters[i], *histogram(letters[i]));
}
}
return 0;
}
We make sure to read and store no more than 1004 characters (plus the terminating nul), and we prevent unwanted input from indexing outside of our histogram’s count[] array! Win-win!
s? a - ba na na !
a 4
b 1
n 2
But wait, there’s more!
We can totally reuse our histogram. Check out this little function:
// Reset the histogram to all zeros
//
void clear_histogram(void) {
for (const char * p = letters; *p; p++)
*histogram(*p) = 0;
}
All this stuff is not obvious. User input is hard. But you will find that it doesn’t have to be impossibly difficult genius-level stuff. It should be entertaining!
Another way to handle input is to transform it into acceptable values. For example, you can use tolower() to convert any majuscule letters to your histogram’s input set.
s? ba na NA!
a 3
b 1
n 2
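A rough sketch of that tolower() idea, written as a standalone function rather than on top of histogram(). The name count_letters_ci is hypothetical, and the range check assumes 'a'..'z' is contiguous, as in ASCII:

```c
#include <ctype.h>

/* Hypothetical helper: count letters in s case-insensitively.
   counts must have 26 entries; assumes 'a'..'z' is contiguous (ASCII). */
void count_letters_ci(const char *s, int counts[26])
{
    for (; *s != '\0'; s++) {
        /* the cast avoids undefined behaviour for negative char values */
        int c = tolower((unsigned char)*s);
        if (c >= 'a' && c <= 'z')
            counts[c - 'a']++;
    }
}
```

Everything that is still not a lowercase letter after the conversion (spaces, punctuation, the newline) is simply skipped.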
But I digress again...
Hang in there!
I'm really sorry to bother, but I have a problem and I don't know how to fix it. I've been doing CS50, and in Problem Set 2 I've been getting wrong results and I don't understand what I did wrong. It should give me "Before Grade 1" for the sentence "One fish. Two fish. Red fish. Blue fish." but it gives me Grade 2, and it gives Grade 14 for a sentence that is Grade 16+. Can someone help me? This is my code:
#include <cs50.h>
#include <stdio.h>
#include <string.h>
#include <ctype.h>
#include <math.h>
int count_letters(string text);
int count_word(string text);
int count_sentences(string text);
int letters;
int words;
int sentences;
int main(void)
{
string text = get_string("Text: ");
printf("Text: %s\n", text);
count_letters(text);
count_word(text);
count_sentences(text);
float L = 100 * (letters / words);
float S = 100 * (sentences / words);
int index = round(0.0588 * L - 0.296 * S - 15.8);
if (index < 1)
{
printf("Before Grade 1\n");
} else if (index > 16)
{
printf("Grade 16+\n");
} else
{
printf("Grade: %i\n", index);
}
}
int count_letters(string text)
{
letters = 0;
for (int i = 0; i < strlen(text); i++)
{
if ((text[i] >= 65 && text[i] <= 99) || (text[i] >= 97 && text[i] <= 122))
{
letters++;
}
}
return letters;}
int count_word(string text)
{
words = 1;
for (int i = 0; i < strlen(text); i++)
{
if (isspace(text[i]))
{
words++;
}
}
return words;}
int count_sentences(string text)
{
sentences = 0;
for (int i = 0; i < strlen(text); i++)
{
if (text[i] == 33 || text[i] == 46 || text[i] == 63)
{
sentences++;
}
}
return sentences;
}
Thank you!!
Regarding types:
There's a phenomenon sometimes called "sloppy typing", which basically means carelessly spamming out any random variable type in the code and then wondering why nothing works. Here are some things you need to know:
Integer division truncates the result by dropping the remainder. int is an integer type, and constants such as 100 are also int. Thus 3/2 evaluates to 1 in C.
Unless you have very good and exotic reasons 1), you should pretty much never use the type float in a C program.
On high-end systems like PCs, always use double. All the default math functions use double, for example round(). If you wish to use them with float, you'd use a special function such as roundf(). Similarly, all floating point constants such as 1.0 are of type double. If you want them to be float, you'd use 1.0f.
For the two above reasons, make it a habit of never mixing different types in the same expression. Do not mix integer and floating point. Do not mix float and double. Mixing types can lead to unexpected implicit conversion problems, accidental truncation, loss of precision and so on.
So for example a line such as float L = 100 * (letters / words); needs to be rewritten to explicitly use double everywhere:
double L = 100.0 * ((double)letters / (double)words);
1) Like using a microcontroller with single precision floating point FPU but only software floating point double. Or a FPU where double would be far less efficient.
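To make the effect concrete: the sentence "One fish. Two fish. Red fish. Blue fish." has 29 letters, 8 words and 4 sentences. With integer division, 29 / 8 is 3 and 4 / 8 is 0, so the original code computes 0.0588 * 300 - 15.8 = 1.84, which rounds to Grade 2. The sketch below does the same calculation entirely in double; the function name coleman_liau is made up, and the constants are the ones from the question:

```c
/* Hypothetical helper: the index from the question, computed in double throughout.
   Returns the unrounded value; the caller can pass it to round(). */
double coleman_liau(int letters, int words, int sentences)
{
    double L = 100.0 * (double)letters / (double)words;     /* letters per 100 words */
    double S = 100.0 * (double)sentences / (double)words;   /* sentences per 100 words */
    return 0.0588 * L - 0.296 * S - 15.8;
}
```

For the example sentence this gives L = 362.5 and S = 50.0, so the index comes out at roughly -9.3, which is well below 1 and therefore "Before Grade 1".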
Regarding functions and global variables:
The variables letters, words and sentences could have been declared locally inside main(), since your functions return the values to use anyway.
Declaring global variables like you do is considered very bad practice for multiple reasons. It exposes variables to other parts of the program that shouldn't have access, which in turn increases chances of accidental/intentional abuse since the variables are available everywhere. It increases the chances for naming collisions. It is unsafe in multi-threaded applications. It's universally bad; don't do this.
Instead pass variables to/from functions by parameters and return values.
Regarding "magic numbers":
Dropping a constant such as 0.0588 in the middle of source code, with zero explanation where that number comes from or what it does, is known as "magic numbers". This is bad practice since the reader of the code has no clue what it's there for. The reader is most often yourself, one year from now, when you have forgotten all about what the code does, so it's a form of self-torture.
So instead of typing out some number like that, use a #define or const variable with a meaningful name, then use that meaningful name in the equation/expression instead.
In case of symbol table values, we don't have to invent that meaningful name ourselves, since C already has a built-in mechanism for it. Instead of text[i] >= 65 you should type text[i] >= 'A', which is 100% equivalent but much more readable.
An advanced detail regarding symbol table values is that they aren't actually guaranteed to be adjacent. Something like ch >= 'A' && ch <= 'Z' may work on classic 7 bit ASCII, but it's non-portable and also doesn't cover "locale": local language-specific letters (as seen in Spanish, French, German and so on; almost every major language using Latin letters). The portable solution to this is to use isupper or islower from ctype.h. Or in your case better yet, isalpha.
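As a sketch of how the letter counter might look with isalpha, a local counter and a return value instead of a global. It takes a plain const char * rather than the cs50 string type, which is an assumption made here to keep the example self-contained:

```c
#include <ctype.h>

/* Sketch: count alphabetic characters without globals or numeric character codes. */
int count_letters(const char *text)
{
    int letters = 0;
    for (const char *p = text; *p != '\0'; p++) {
        /* cast: passing a negative char value to isalpha is undefined */
        if (isalpha((unsigned char)*p))
            letters++;
    }
    return letters;
}
```

The caller would then write something like int letters = count_letters(text); inside main().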
Regarding code formatting:
Don't invent your own non-standard formatting. There are some things regarding code formatting that are subjective, but these are not:
Always use empty lines between function bodies.
Always place the last } of a function at a line of its own.
I wrote this function that performs a slightly modified variation of run-length encoding on text files in C.
I'm trying to generalize it to binary files but I have no experience working with them. I understand that, while I can compare bytes of binary data much the same way I can compare chars from a text file, I am not sure how to go about printing the number of occurrences of a byte to the compressed version like I do in the code below.
A note on the type of RLE I'm using: bytes that occur more than once in a row are duplicated to signal the next-to-come number is in fact the number of occurrences vs just a number following the character in the file. For occurrences longer than one digit, they are broken down into runs that are 9 occurrences long.
For example, aaaaaaaaaaabccccc becomes aa9aa2bcc5.
Here's my code:
char* encode(char* str)
{
char* ret = calloc(2 * strlen(str) + 1, 1);
size_t retIdx = 0, inIdx = 0;
while (str[inIdx]) {
size_t count = 1;
size_t contIdx = inIdx;
while (str[inIdx] == str[++contIdx]) {
count++;
}
size_t tmpCount = count;
// break down counts with 2 or more digits into counts ≤ 9
while (tmpCount > 9) {
tmpCount -= 9;
ret[retIdx++] = str[inIdx];
ret[retIdx++] = str[inIdx];
ret[retIdx++] = '9';
}
char tmp[2];
ret[retIdx++] = str[inIdx];
if (tmpCount > 1) {
// repeat character (this tells the decompressor that the next digit
// is in fact the # of consecutive occurrences of this char)
ret[retIdx++] = str[inIdx];
// convert single-digit count to string
snprintf(tmp, 2, "%ld", tmpCount);
ret[retIdx++] = tmp[0];
}
inIdx += count;
}
return ret;
}
What changes are in order to adapt this to a binary stream? The first problem I see is with the snprintf call, since it operates on a text format. Something else that rings a bell is the way I'm handling the multiple-digit occurrence runs. We're not working in base 10 anymore, so that has to change; I'm just unsure how, having almost never worked with binary data.
A few ideas that can be useful to you:
One simple method to generalize RLE to binary data is to use a bit-based compression. For example, the bit sequence 00000000011111100111 can be translated to the sequence 0 9623. Since the binary alphabet is composed of only two symbols, you only need to store the first bit value (this can be as simple as storing it in the very first bit) and then the lengths of the contiguous runs of equal values. Arbitrarily large integers can be stored in a binary format using Elias gamma coding. Extra padding can be added to fit the entire sequence nicely into an integer number of bytes. So using this method, the above sequence can be encoded like this:
00000000011111100111 -> 0 0001001 00110 010 011
                        ^ ^       ^     ^   ^
              first bit   9       6     2   3
If you want to keep it byte based, one idea is to treat every even-indexed byte as a count (interpreted as an unsigned char) and every odd-indexed byte as the value being repeated. If a byte occurs more than 255 times in a row, then you can just repeat the pair. This can be very inefficient, though, but it is definitely simple to implement, and it might be good enough if you can make some assumptions about the input.
Also, you can consider moving out from RLE and implement Huffman's coding or other sophisticated algorithms (e.g. LZW).
Implementation-wise, I think tucuxi already gave you some hints.
You only have to address 2 problems:
you cannot use any str-related functions, because C strings do not deal well with '\0'. So for example, strlen will return the index of the 1st 0x0 byte in a string. The length of the input must be passed in as an additional parameter: char *encode(char *start, size_t length)
your output cannot have an implicit length of strlen(ret), because there may be extra 0-bytes sprinkled about in the output. You again need an extra parameter: size_t encode(char *start, size_t length, char *output) (this version would require the output buffer to be reserved externally, with a size of at least length*2, and return the length of the encoded string)
The rest of the code, assuming it was working before, should continue to work correctly now. If you want to go beyond base-10, and for instance use base-256 for greater compression, you would only need to change the constant in the break-things-up loop (from 9 to 255), and replace the snprintf as follows:
// before
snprintf(tmp, 2, "%ld", tmpCount);
ret[retIdx++] = tmp[0];
// after: much easier
ret[retIdx++] = tmpCount;
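Putting those two changes together, a binary-safe version of the question's doubled-byte scheme could look roughly like the sketch below. The name encode_bytes and the use of unsigned char are choices made for this illustration; as described above, the caller is assumed to provide an output buffer of at least 2 * length bytes:

```c
#include <stddef.h>

/* Hypothetical binary-safe variant of the question's encoder.
   A run of n > 1 equal bytes is written as: byte, byte, count (1..255 in one raw byte).
   Runs longer than 255 are split into chunks of 255.
   Returns the number of bytes written to out. */
size_t encode_bytes(const unsigned char *in, size_t length, unsigned char *out)
{
    size_t i = 0, o = 0;

    while (i < length) {
        /* measure the run starting at i without reading past the input */
        size_t count = 1;
        while (i + count < length && in[i + count] == in[i])
            count++;

        size_t left = count;
        while (left > 255) {            /* full chunks of 255 */
            out[o++] = in[i];
            out[o++] = in[i];
            out[o++] = 255;
            left -= 255;
        }
        out[o++] = in[i];
        if (left > 1) {                 /* doubled byte signals that a count follows */
            out[o++] = in[i];
            out[o++] = (unsigned char)left;
        }
        i += count;
    }
    return o;
}
```

Because the length is passed in and returned explicitly, zero bytes in the input or output are handled like any other value.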
I have the following problem:
I would like to implement a Caesar cipher. It mostly works, but when I reach the end of the alphabet the result goes beyond the alphabet, which I assume is due to the ASCII values.
for example:
if I insert a k and use the key 35 I get a H but it should wrap around in the lowercase letters and produce a b.
It also sometimes produces a punctuation mark or something else like < which I do not want.
The code responsible for the encryption is
encripted_text = (plain_text + key - 97)%26 +97;
Am I missing something needed to make it wrap around and stay within the alphabet?
Example run of program:
char plain_text = 'k';
int k = 35;
char encripted_text = '\0';
encripted_text = (plain_text + key - 97)%26 + 97;
printf("%c", encripted_text);
Thanks for your help.
Assuming that encripted_text and plain_text are both char variables (1 byte), you have to decide a formula making sure that even when the result wraps around a valid output character is calculated.
How to do that? It depends on which characters are valid for you! In some cases you can simply rely on how characters are mapped in ASCII, but I want to suggest a general solution that you will be able to adapt to your specific requirements.
This general solution consists in defining an array with the accepted alphabet, and translate the input character to one of the characters in it.
For example:
char alphabet[] = { 'a','A','b','B','c','C' }; //just an example
char input_char = 'k';
int key = 35;
char encripted_char = '\0';
encripted_char = alphabet[(input_char + key - 97)%(sizeof(alphabet))];
printf("%c", encripted_char );
Summarizing: the formula doesn't calculate directly the encrypted char, but the index of the accepted alphabet array normalized using % operator.
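One possible way to flesh this out is to look up the input character's position in the alphabet first and then rotate that position, instead of relying on its ASCII code. This is only a sketch; the helper name rotate_in_alphabet is invented for the example:

```c
#include <string.h>

/* Hypothetical helper: rotate c by key positions within the given alphabet.
   Characters that are not in the alphabet are returned unchanged. */
char rotate_in_alphabet(char c, int key, const char *alphabet)
{
    size_t len = strlen(alphabet);
    const char *p;

    if (c == '\0' || len == 0)
        return c;                       /* strchr would match the terminator for '\0' */
    p = strchr(alphabet, c);
    if (p == NULL)
        return c;                       /* not an accepted character */

    key %= (int)len;
    if (key < 0)
        key += (int)len;                /* keep the offset in 0..len-1 */
    return alphabet[((size_t)(p - alphabet) + (size_t)key) % len];
}
```

With the 26 lowercase letters as the alphabet, 'k' and a key of 35 give index (10 + 9) % 26 = 19, which is 't'.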
I got the right output with the same logic, @nad34. In fact, I correctly got the output as 't' for 'k'. It shouldn't and won't give a 'b'.
Your code has the right logic, except for a few slight errors.
I don't know why you're using a string here, but since you are anyway, this -> char plain_text[] = 'k'; should instead be char plain_text[] = "k"; ==> Note the double quotes.
int k = 35; should be int key = 35;, since you have used the variable name key and not k.
Logicwise it is right, and it will give you the right output.
You can check out the execution of the same code here.
I'm trying to write a code that would convert letters into numbers. For example
A ==> 0
B ==> 1
C ==> 2
and so on. I'm thinking of writing 26 if statements. I'm wondering if there's a better way to do this...
Thank you!
This is a way that I feel is better than the switch method, and yet is standards compliant (does not assume ASCII):
#include <string.h>
#include <ctype.h>
/* returns -1 if c is not an alphabetic character */
int c_to_n(char c)
{
int n = -1;
static const char * const alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
char *p = strchr(alphabet, toupper((unsigned char)c));
if (p && *p) /* strchr also matches the terminating '\0', so reject that case */
{
n = p - alphabet;
}
return n;
}
If you need to deal with upper-case and lower-case then you may want to do something like:
if (letter >= 'A' && letter <= 'Z')
num = letter - 'A';
else if (letter >= 'a' && letter <= 'z')
num = letter - 'a';
If you want to display these as characters, you can convert the number into an ASCII digit by adding '0' to it (this only yields a digit character for values 0 through 9):
asciinumber = num + '0';
The C standard does not guarantee that the characters of the alphabet will be numbered sequentially. Hence, portable code cannot assume, for example, that 'B'-'A' is equal to 1.
The relevant section of the C specification is section 5.2.1 which describes the character sets:
3 Both the basic source and basic execution character sets shall have
the following members: the 26 uppercase letters of the Latin
alphabet
ABCDEFGHIJKLM
NOPQRSTUVWXYZ
the 26 lowercase letters of the Latin alphabet
abcdefghijklm
nopqrstuvwxyz
the 10 decimal digits
0123456789
the following 29 graphic characters
!"#%&'()*+,-./:
;<=>?[\]^_{|}~
the space character, and control characters representing horizontal
tab, vertical tab, and form feed. The
representation of each member of the source and execution basic
character sets shall fit in a byte. In both the source and execution
basic character sets, the value of each character after 0 in the above
list of decimal digits shall be one greater than the value of the
previous.
So the specification only guarantees that the digits will have sequential encodings. There is absolutely no restriction on how the alphabetic characters are encoded.
Fortunately, there is an easy and efficient way to convert A to 0, B to 1, etc. Here's the code
char letter = 'E'; // could be any upper or lower case letter
char str[2] = { letter }; // make a string out of the letter
int num = strtol( str, NULL, 36 ) - 10; // convert the letter to a number
The reason this works can be found in the man page for strtol which states:
(In bases above 10, the letter 'A' in either upper or lower case
represents 10, 'B' represents 11, and so forth, with 'Z' representing
35.)
So passing 36 to strtol as the base tells strtol to convert 'A' or 'a' to 10, 'B' or 'b' to 11, and so on. All you need to do is subtract 10 to get the final answer.
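Wrapped into a function with a check for non-letters, the idea might look like the sketch below. The name letter_to_num and the -1 return value for anything that is not a letter are assumptions added for this example:

```c
#include <stdlib.h>

/* Hypothetical wrapper around the strtol trick:
   'A'/'a' -> 0 ... 'Z'/'z' -> 25, anything else -> -1. */
int letter_to_num(char letter)
{
    char str[2] = { letter, '\0' };
    long value = strtol(str, NULL, 36);     /* in base 36, letters stand for 10..35 */

    /* digits convert to 0..9 and unparsable input to 0, so both end up as -1 */
    return (value >= 10) ? (int)(value - 10) : -1;
}
```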
Another, far worse (but still better than 26 if statements) alternative is to use switch/case:
switch(letter)
{
case 'A':
case 'a': // don't use this line if you want only capital letters
num = 0;
break;
case 'B':
case 'b': // same as above about 'a'
num = 1;
break;
/* and so on and so on */
default:
fprintf(stderr, "WTF?\n");
}
Consider this only if there is absolutely no relationship between the letter and its code. Since there is a clear sequential relationship between the letter and the code in your case, using this is rather silly and going to be awful to maintain, but if you had to encode random characters to random values, this would be the way to avoid writing a zillion if()/else if()/else if()/else statements.
There is a much better way.
In ASCII (www.asciitable.com) you can know the numerical values of these characters.
'A' is 0x41.
So you can simply subtract 0x41 from them to get the numbers. I don't know C very well, but something like:
int num = 'A' - 0x41;
should work.
In most programming and scripting languages there is a means to get the "ordinal" value of any character. (Think of it as an offset from the beginning of the character set).
Thus you can usually do something like:
for ch in somestring:
    if lowercase(ch):
        n = ord(ch) - ord('a')
    elif uppercase(ch):
        n = ord(ch) - ord('A')
    else:
        n = -1  # Sentinel error value
        # (or raise an exception as appropriate to your programming
        # environment and to the assignment specification)
Of course this wouldn't work for an EBCDIC based system (and might not work for some other exotic character sets). I suppose a reasonable sanity check would be to test that this function returns monotonically increasing values in the range 0..25 for the strings "abc...xyz" and "ABC...XYZ".
A whole different approach would be to create an associative array (dictionary, table, hash) of your letters and their values (one or two simple loops). Then use that. (Most modern programming languages include support for associative arrays.)
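C has no built-in associative array, but for single characters a plain array indexed by the character value can play that role. A minimal sketch, with made-up names (letter_value, init_letter_values, lookup_letter), which does not depend on the letters being contiguous in the character set:

```c
#include <limits.h>

/* One slot per possible unsigned char value. */
static int letter_value[UCHAR_MAX + 1];

/* Fill the table once: letters map to 0..25, everything else to -1. */
void init_letter_values(void)
{
    static const char upper[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    static const char lower[] = "abcdefghijklmnopqrstuvwxyz";
    int i;

    for (i = 0; i <= UCHAR_MAX; i++)
        letter_value[i] = -1;
    for (i = 0; i < 26; i++) {
        letter_value[(unsigned char)upper[i]] = i;
        letter_value[(unsigned char)lower[i]] = i;
    }
}

/* Look a character up; init_letter_values() must have been called first. */
int lookup_letter(char c)
{
    return letter_value[(unsigned char)c];
}
```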
Naturally I'm not "doing your homework." You'll have to do that for yourself. I'm simply explaining that those are the obvious approaches that would be used by any professional programmer. (Okay, an assembly language hack might also just mask out one bit for each byte, too).
Since the char data type is treated similarly to an int data type in C and C++, you could go with something like:
char c = 'A'; // just some character
int urValue = c - 65;
If you are worried about case sensitivity:
#include <ctype.h> // if using C++ #include <cctype>
int urValue = toupper(c) - 65;
Aww if you had C++
For Unicode
definition of how to map characters to values
#include <map>
typedef std::map<wchar_t, int> WCharValueMap;
WCharValueMap fillMap() {
WCharValueMap result;
result[L'A']=0;
result[L'Â']=0;
result[L'B']=1;
result[L'C']=2;
return result;
}
WCharValueMap myConversion = fillMap(); // defined after fillMap() so the call compiles
usage
int value = myConversion[L'Â'];
I wrote this bit of code for a project, and I was wondering how naive this approach was.
The benefit here is that it seems to adhere to the standard, and my guess is that the runtime is approx. O(k) where k is the size of the alphabet.
int ctoi(char c)
{
int index;
char* alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
c = toupper(c);
// avoid doing strlen here to juice some efficiency.
for(index = 0; index != 26; index++)
{
if(c == alphabet[index])
{
return index;
}
}
return -1;
}
#include<stdio.h>
#include<ctype.h>
int val(char a);
int main()
{
char r;
scanf("%c",&r);
printf("\n%d\n",val(r));
}
int val(char a)
{
int i=0;
char k;
for(k='A';k<=toupper(a);k++)
i++;
return i;
}