Do char's in C have pre-assigned zero indexed values?


Sorry if my title is a little misleading; I am still new to a lot of this.
I recently worked on a small cipher project where the user passes the program an argument at the command line, and the argument must be alphabetical. (Ex: ./file abc)
This argument is then used as the key in a formula to encipher a message of plain text you provide. I got the code to work, thanks to a friend's help, but I'm not 100% sure about a specific part of this formula.
#include <stdio.h>
#include <cs50.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <ctype.h>
int main (int argc, string argv[])
{
    //Check that exactly one argument was given
    if (argc != 2)
    {
        printf("Please Submit a Valid Argument.\n");
        return 1;
    }
    //Store the given argument (our key) inside a string var 'k'
    string k = (argv[1]);
    //Store how long the key is
    int kLen = strlen(k);
    //Tell the user we are checking their key
    printf("Checking key validation...\n");
    //Pause the program for 2 seconds
    sleep(2);
    //Check to make sure the key submitted is alphabetical
    for (int h = 0, strlk = strlen(k); h < strlk; h++)
    {
        if isalpha(k[h])
        {
            printf("Character %c is valid\n", k[h]);
            sleep(1);
        }
        else
        {
            //Telling the user the key is invalid and returning them to the console
            printf("Key is not alphabetical, please try again!\n");
            return 0;
        }
    }
    //Store the user's soon to be enciphered text in a string var 'pt'
    string pt = get_string("Please enter the text to be enciphered: ");
    //A prompt that the encrypted text will display on
    printf("Printing encrypted text: ");
    sleep(2);
    //Encipher Function
    for(int i = 0, j = 0, strl = strlen(pt); i < strl; i++)
    {
        //Get the letter 'key'
        int lk = tolower(k[j % kLen]) - 'a';
        //If the char is uppercase, run the V formula and increment j by 1
        if isupper(pt[i])
        {
            printf("%c", 'A' + (pt[i] - 'A' + lk) % 26);
            j++;
        }
        //If the char is lowercase, run the V formula and increment j by 1
        else if islower(pt[i])
        {
            printf("%c", 'a' + (pt[i] - 'a' + lk) % 26);
            j++;
        }
        //If the char is a symbol just print said symbol
        else
        {
            printf("%c", pt[i]);
        }
    }
    printf("\n");
    printf("Closing Script...\n");
    return 0;
}
The encipher function uses 'A' as a char for the placeholder, but does 'A' hold a zero indexed value automatically? (B = 1, C = 2, ...)

In C, character literals like 'A' are of type int, and represent whatever integer value encodes the character A on your system. On the 99.999...% of systems that use ASCII character encoding, that's the number 65. If you have an old IBM mainframe from the 1970s using EBCDIC, it might be something else. You'll notice that the code is subtracting 'A' to make 0-based values.
This does make the assumption that the letters A-Z occupy 26 consecutive codes. This is true of ASCII (A=65, B=66, etc.), but not of all codes, and not guaranteed by the language.
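To make the subtraction concrete, here is a minimal sketch of the same shift formula applied to one uppercase letter. The name shift_upper is made up for this illustration, and the code assumes an encoding such as ASCII where 'A' to 'Z' are consecutive.

```c
/* Hypothetical helper: shift an uppercase letter by k positions (0-25),
   wrapping from 'Z' back to 'A'. Assumes 'A'..'Z' have consecutive codes,
   which holds for ASCII but is not guaranteed by the C standard. */
static char shift_upper(char c, int k)
{
    int index = c - 'A';              /* 0 for 'A', 1 for 'B', ... */
    return (char)('A' + (index + k) % 26);
}
```

With ASCII, shift_upper('Y', 3) wraps around the end of the alphabet and gives 'B'.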

does 'A' hold a zero indexed value automatically? (B = 1, C = 2, ...)
No. Strictly conforming C code cannot depend on any character encoding property other than the numerals 0-9 being represented consecutively, even though the common ASCII character set does represent the letters consecutively too.
The only guarantee regarding character sets is per 5.2.1 Character sets, paragraph 3 of the C standard:
... the value of each character after 0 in the above list of decimal digits shall be one greater than the value of the previous...
Character sets such as EBCDIC don't represent letters consecutively.

char is a numeric type that happens to also often be used to represent visible characters (or special non-visible pseudo-characters). 'A' is a value (with actual type int) that can be converted to a char without overflow or underflow. That is, it's really some number, but you usually don't need to know what number, since you generally use a particular char value either as just a number or as just a character, not both.
But this program is using char values in both ways, so it somewhat does matter what the numeric values corresponding to visible characters are. One way it's very often done, but not always, is using the ASCII values which are numbered 0 to 127, or some other scheme which uses those values plus more values outside that range. So for example, if the computer uses one of those schemes, then 'A'==65, and 'A'+1==66, which is 'B'.
This program is assuming that all the lowercase Latin-alphabet letters have numeric values in consecutive order from 'a' to 'z', and all the uppercase Latin-alphabet letters have numeric values in consecutive order from 'A' to 'Z', without caring exactly what those values are. This is true of ASCII, so it will work on many kinds of machines. But there's no guarantee it will always be true!
C does guarantee the ten digit characters from '0' to '9' are in consecutive order, which means that if n is a digit number from zero to nine inclusive, then n + '0' is the character for displaying that digit, and if c is such a digit character, then c - '0' is the number from zero to nine it represents. But that's the only guarantee the C language makes about the values of characters.
For one counter-example, see EBCDIC, which is not in much use now, but was used on some older computers, and C supports it. Its alphabetic characters are arranged in clumps of consecutive letters, but not with all 26 letters of each case all together. So the program would give incorrect results running on such a computer.
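If portability to such encodings matters, one possible approach (a sketch, not something the original program does) is to look the letter up in a string instead of subtracting 'A'. The helper name letter_index is hypothetical.

```c
#include <string.h>

/* Hypothetical helper: return 0-25 for a letter of either case, or -1.
   It searches a lookup string, so it does not depend on the letters
   having consecutive codes. */
static int letter_index(char c)
{
    static const char upper[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    static const char lower[] = "abcdefghijklmnopqrstuvwxyz";
    const char *p;

    if (c == '\0')
        return -1;                    /* strchr would match the terminator */
    if ((p = strchr(upper, c)) != NULL)
        return (int)(p - upper);
    if ((p = strchr(lower, c)) != NULL)
        return (int)(p - lower);
    return -1;
}
```

The reverse direction works the same way: upper[i] is the letter at index i, whatever the encoding.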

Sequentiality is only one aspect of concern.
Proper use of isalpha(ch) is another, not quite implemented properly in OP's code.
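isalpha(ch) expects a ch in the range of unsigned char or EOF. With k[h], a char, that value could be negative. Ensure a non-negative value with a cast. Also parenthesize the whole condition: if isalpha(...) without the outer parentheses only compiles where isalpha happens to be a parenthesized macro.
// if isalpha(k[h])
if (isalpha((unsigned char) k[h]))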
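Put together, the key check could look roughly like the sketch below. The function name key_is_alpha is an assumption for illustration; the OP's program does the check inline.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: return 1 if key is non-empty and every character
   is a letter, otherwise 0. The cast keeps the argument to isalpha()
   inside the unsigned char range. */
static int key_is_alpha(const char *key)
{
    if (key[0] == '\0')
        return 0;
    for (size_t i = 0; key[i] != '\0'; i++) {
        if (!isalpha((unsigned char)key[i]))
            return 0;
    }
    return 1;
}
```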

Related

Trying to write a program to sum the value of an int and a char

I am trying to write a program in C to sum the value of an integer and a character. If the user enters an integer where the character should be, I am expecting my program to calculate the value of the 2 integers. My code below works with the user entering 2 integers but only calculates up to 9 (Ex: 4 5: "Character '4' represents a digit. Sum of '4' and '5' is 9"). If the user enters 5 6, the result is: "Character '5' represents a digit. Sum of '5' and '6' is ;". I have been searching for a while now and any potential solution always leads to the incorrect sum. I also expect my program to accept user input higher than '9' (Ex: 20 50), but if I change '9' to '99', I get the following warning: "warning: multi-character character constant [-Wmultichar]". Can someone please point me in the right direction to achieve these goals?
#include <stdio.h>
int sum (int m, char n){
    return m+n;
}
int main(){
    char ch;
    int c;
    printf("Enter an integer and a character separated by a blank> ");
    scanf("%d %c",&c, &ch);
    if((c >= '0' && c <= '9')||(ch >= '0' && ch <= '9')){
        int cs = sum(c, ch - 0);
        printf("Character '%d' represents a digit. Sum of '%d' and '%c' is %d" , c, c, ch - 0, cs);
    }
    return 0;
}

int cs = sum(c, ch - 0);
It looks like you're trying to account for ASCII values by subtracting the ASCII value of 0 from whatever character the user enters. However, you used an integer literal of 0, when you'd want to use a character literal of '0'. See below:
int cs = sum(c, ch - '0');
Also, I would recommend renaming your int to i or something other than c. It's a little difficult to distinguish that the types of c and ch are different.
Also consider changing
if((c >= '0' && c <= '9')
to
if((c >= 0 && c <= 9)
c is an integer read with %d, so compare it with integer values. '0' and '9' are character codes (48 and 57 in ASCII), so the original test checks whether c lies between 48 and 57, not between 0 and 9.
Another problem is that a single char cannot hold a two-digit number. A char variable holds one character, whereas a two-digit number is composed of, well, two characters.
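As a small sketch of the ch - '0' idea, with the range check folded in (digit_value is a made-up name):

```c
/* Hypothetical helper: return the value 0-9 of a digit character,
   or -1 if ch is not one of '0'..'9'. */
static int digit_value(char ch)
{
    if (ch >= '0' && ch <= '9')
        return ch - '0';
    return -1;
}
```

With this, 5 + digit_value('6') is 11, instead of the stray ';' the question ran into.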

Sorry, I can't comment, so I'm adding this answer for the problem about reading only one digit.
You have a single char
char ch;
So it reads only one character. You need an array of chars like char ch[10], read with scanf's %9s instead of %c.
Then you'd use int foo = atoi(ch) to convert your array to an integer.
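Note that atoi cannot report bad input. A sketch of the same conversion using strtol instead (a different standard function, swapped in here because it allows error checking) might look like this; parse_int is a hypothetical name.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: parse a decimal integer from text into *out.
   Returns 1 on success, 0 if text has no digits, has trailing junk,
   or holds a value that does not fit in an int. */
static int parse_int(const char *text, int *out)
{
    char *end;
    long value;

    errno = 0;
    value = strtol(text, &end, 10);
    if (end == text || *end != '\0')
        return 0;
    if (errno == ERANGE || value < INT_MIN || value > INT_MAX)
        return 0;
    *out = (int)value;
    return 1;
}
```

Calling parse_int on the char array filled by scanf then gives both the number and a way to tell whether the input really was a number.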

Syntax and different meanings of '<letter>'

I am learning C from the K&R book and I came across the code to count the number of occurrences of digits, white space characters (blank, tab, newline) and all other characters.
The code is like this:
#include <stdio.h>
/* count digits, white space, others */
main()
{
    int c, i, nwhite, nother;
    int ndigit[10];
    nwhite = nother = 0;
    for (i = 0; i < 10; ++i)
        ndigit[i] = 0;
    while ((c = getchar()) != EOF)
        if (c >= '0' && c <= '9')
            ++ndigit[c-'0'];
        else if (c == ' ' || c == '\n' || c == '\t')
            ++nwhite;
        else
            ++nother;
    printf("digits =");
    for (i = 0; i < 10; ++i)
        printf(" %d", ndigit[i]);
    printf(", white space = %d, other = %d\n",
        nwhite, nother);
}
I need to ask 2 questions..
1st question:
if (c >= '0' && c <= '9')
++ndigit[c-'0'];
I know very well that '0' and '9' represent the ASCII values of 0 & 9 respectively. But what I don't seem to understand is why we even need to use the ASCII value and not the integer itself. Like why can't we simply use
if (c >= 0 && c <= 9)
to find if c lies between 0 and 9?
2nd question:
++ndigit[c-'0']
What does the above statement do?
Why aren't we taking the ASCII value of c here?
Because if we did, it should have been written as ['c'-'0'].

1.
c holds a character code, not the numeric value of a digit, so we need to compare it against character codes. In ASCII, the integers 0 and 9 correspond to NUL and Tab, not something we are looking for.
2.
Subtracting the ASCII value of '0' turns the digit character into its numeric value, and the array element at that index is incremented. For example, if our character is '1', then '1' - '0' = 1, so the element at index one is incremented; it's a convenient way to keep track of characters. We don't write ['c' - '0'] because we care about the variable c, not the character 'c'.
This table shows how characters are represented; they are different from integers. The main takeaway is '1' != 1
http://www.asciitable.com/

With the current C standards, this would be a perfect exercise for localized wide input:
#include <stdlib.h>
#include <locale.h>
#include <stdio.h>
#include <wchar.h>
#include <wctype.h>
#include "wdigit.h"
int main(void)
{
size_t num_space = 0; /* Spaces, tabs, newlines */
size_t num_letter = 0;
size_t num_punct = 0; /* Punctuation */
size_t num_digit[10] = { 0, }; /* Digits - all initialized to zero */
size_t num_other = 0; /* Other printable characters */
size_t total = 0;
wint_t wc;
int digit;
if (!setlocale(LC_ALL, "")) {
fprintf(stderr, "Current locale is not supported by the C library.\n");
return EXIT_FAILURE;
}
if (fwide(stdin, 1) < 1) {
fprintf(stderr, "The C library does not support wide input for this locale.\n");
return EXIT_FAILURE;
}
while ((wc = fgetwc(stdin)) != WEOF) {
total++;
digit = wdigit(wc);
if (digit >= 0 && digit <= 9)
num_digit[digit]++;
else
if (iswspace(wc))
num_space++;
else
if (iswpunct(wc))
num_punct++;
else
if (iswalpha(wc))
num_letter++;
else
if (iswprint(wc))
num_other++;
/* All nonprintable non-whitespace characters are ignored */
}
printf("Read %zu wide characters total.\n", total);
printf("%15zu letters\n", num_letter);
printf("%15zu zeros (equivalent to '0')\n", num_digit[0]);
printf("%15zu ones (equivalent to '1')\n", num_digit[1]);
printf("%15zu twos (equivalent to '2')\n", num_digit[2]);
printf("%15zu threes (equivalent to '3')\n", num_digit[3]);
printf("%15zu fours (equivalent to '4')\n", num_digit[4]);
printf("%15zu fives (equivalent to '5')\n", num_digit[5]);
printf("%15zu sixes (equivalent to '6')\n", num_digit[6]);
printf("%15zu sevens (equivalent to '7')\n", num_digit[7]);
printf("%15zu eights (equivalent to '8')\n", num_digit[8]);
printf("%15zu nines (equivalent to '9')\n", num_digit[9]);
printf("%15zu whitespaces (including newlines and tabs)\n", num_space);
printf("%15zu punctuation characters\n", num_punct);
printf("%15zu other printable characters\n", num_other);
return EXIT_SUCCESS;
}
You also need wdigit.h, a header file that defines wdigit(), which returns the decimal digit value (0 to 9, inclusive) if the given wide character is a decimal digit, and -1 otherwise. If this was an exercise, the header file would be provided.
The following "wdigit.h" should support all decimal digits defined in Unicode (which is the closest standard we have to a universal character set). I don't think it is copyrightable (because it is essentially just a listing from the Unicode standard), but if it is, I dedicate it to the public domain:
#ifndef WDIGIT_H
#define WDIGIT_H
#include <wchar.h>
/* wdigits[] are wide strings that contain all known versions of a decimal digit.
For example, wdigits[0] is a wide string that contains all known zero decimal digit
wide characters. You can use e.g.
wcschr(wdigits[0], wc)
to determine if wc is a zero decimal digit wide character.
*/
static const wchar_t *const wdigits[10] = {
L"0" L"\u0660\u06F0\u07C0\u0966\u09E6\u0A66\u0AE6\u0B66\u0BE6\u0C66"
L"\u0CE6\u0D66\u0DE6\u0E50\u0ED0\u0F20\u1040\u1090\u17E0\u1810"
L"\u1946\u19D0\u1A80\u1A90\u1B50\u1BB0\u1C40\u1C50\uA620\uA8D0"
L"\uA900\uA9D0\uA9F0\uAA50\uABF0\uFF10"
L"\U000104A0\U00011066\U000110F0\U00011136\U000111D0\U000112F0"
L"\U00011450\U000114D0\U00011650\U000116C0\U00011730\U000118E0"
L"\U00011C50\U00011D50\U00016A60\U00016B50\U0001D7CE\U0001D7D8"
L"\U0001D7E2\U0001D7EC\U0001D7F6\U0001E950",
L"1" L"\u0661\u06F1\u07C1\u0967\u09E7\u0A67\u0AE7\u0B67\u0BE7\u0C67"
L"\u0CE7\u0D67\u0DE7\u0E51\u0ED1\u0F21\u1041\u1091\u17E1\u1811"
L"\u1947\u19D1\u1A81\u1A91\u1B51\u1BB1\u1C41\u1C51\uA621\uA8D1"
L"\uA901\uA9D1\uA9F1\uAA51\uABF1\uFF11"
L"\U000104A1\U00011067\U000110F1\U00011137\U000111D1\U000112F1"
L"\U00011451\U000114D1\U00011651\U000116C1\U00011731\U000118E1"
L"\U00011C51\U00011D51\U00016A61\U00016B51\U0001D7CF\U0001D7D9"
L"\U0001D7E3\U0001D7ED\U0001D7F7\U0001E951",
L"2" L"\u0662\u06F2\u07C2\u0968\u09E8\u0A68\u0AE8\u0B68\u0BE8\u0C68"
L"\u0CE8\u0D68\u0DE8\u0E52\u0ED2\u0F22\u1042\u1092\u17E2\u1812"
L"\u1948\u19D2\u1A82\u1A92\u1B52\u1BB2\u1C42\u1C52\uA622\uA8D2"
L"\uA902\uA9D2\uA9F2\uAA52\uABF2\uFF12"
L"\U000104A2\U00011068\U000110F2\U00011138\U000111D2\U000112F2"
L"\U00011452\U000114D2\U00011652\U000116C2\U00011732\U000118E2"
L"\U00011C52\U00011D52\U00016A62\U00016B52\U0001D7D0\U0001D7DA"
L"\U0001D7E4\U0001D7EE\U0001D7F8\U0001E952",
L"3" L"\u0663\u06F3\u07C3\u0969\u09E9\u0A69\u0AE9\u0B69\u0BE9\u0C69"
L"\u0CE9\u0D69\u0DE9\u0E53\u0ED3\u0F23\u1043\u1093\u17E3\u1813"
L"\u1949\u19D3\u1A83\u1A93\u1B53\u1BB3\u1C43\u1C53\uA623\uA8D3"
L"\uA903\uA9D3\uA9F3\uAA53\uABF3\uFF13"
L"\U000104A3\U00011069\U000110F3\U00011139\U000111D3\U000112F3"
L"\U00011453\U000114D3\U00011653\U000116C3\U00011733\U000118E3"
L"\U00011C53\U00011D53\U00016A63\U00016B53\U0001D7D1\U0001D7DB"
L"\U0001D7E5\U0001D7EF\U0001D7F9\U0001E953",
L"4" L"\u0664\u06F4\u07C4\u096A\u09EA\u0A6A\u0AEA\u0B6A\u0BEA\u0C6A"
L"\u0CEA\u0D6A\u0DEA\u0E54\u0ED4\u0F24\u1044\u1094\u17E4\u1814"
L"\u194A\u19D4\u1A84\u1A94\u1B54\u1BB4\u1C44\u1C54\uA624\uA8D4"
L"\uA904\uA9D4\uA9F4\uAA54\uABF4\uFF14"
L"\U000104A4\U0001106A\U000110F4\U0001113A\U000111D4\U000112F4"
L"\U00011454\U000114D4\U00011654\U000116C4\U00011734\U000118E4"
L"\U00011C54\U00011D54\U00016A64\U00016B54\U0001D7D2\U0001D7DC"
L"\U0001D7E6\U0001D7F0\U0001D7FA\U0001E954",
L"5" L"\u0665\u06F5\u07C5\u096B\u09EB\u0A6B\u0AEB\u0B6B\u0BEB\u0C6B"
L"\u0CEB\u0D6B\u0DEB\u0E55\u0ED5\u0F25\u1045\u1095\u17E5\u1815"
L"\u194B\u19D5\u1A85\u1A95\u1B55\u1BB5\u1C45\u1C55\uA625\uA8D5"
L"\uA905\uA9D5\uA9F5\uAA55\uABF5\uFF15"
L"\U000104A5\U0001106B\U000110F5\U0001113B\U000111D5\U000112F5"
L"\U00011455\U000114D5\U00011655\U000116C5\U00011735\U000118E5"
L"\U00011C55\U00011D55\U00016A65\U00016B55\U0001D7D3\U0001D7DD"
L"\U0001D7E7\U0001D7F1\U0001D7FB\U0001E955",
L"6" L"\u0666\u06F6\u07C6\u096C\u09EC\u0A6C\u0AEC\u0B6C\u0BEC\u0C6C"
L"\u0CEC\u0D6C\u0DEC\u0E56\u0ED6\u0F26\u1046\u1096\u17E6\u1816"
L"\u194C\u19D6\u1A86\u1A96\u1B56\u1BB6\u1C46\u1C56\uA626\uA8D6"
L"\uA906\uA9D6\uA9F6\uAA56\uABF6\uFF16"
L"\U000104A6\U0001106C\U000110F6\U0001113C\U000111D6\U000112F6"
L"\U00011456\U000114D6\U00011656\U000116C6\U00011736\U000118E6"
L"\U00011C56\U00011D56\U00016A66\U00016B56\U0001D7D4\U0001D7DE"
L"\U0001D7E8\U0001D7F2\U0001D7FC\U0001E956",
L"7" L"\u0667\u06F7\u07C7\u096D\u09ED\u0A6D\u0AED\u0B6D\u0BED\u0C6D"
L"\u0CED\u0D6D\u0DED\u0E57\u0ED7\u0F27\u1047\u1097\u17E7\u1817"
L"\u194D\u19D7\u1A87\u1A97\u1B57\u1BB7\u1C47\u1C57\uA627\uA8D7"
L"\uA907\uA9D7\uA9F7\uAA57\uABF7\uFF17"
L"\U000104A7\U0001106D\U000110F7\U0001113D\U000111D7\U000112F7"
L"\U00011457\U000114D7\U00011657\U000116C7\U00011737\U000118E7"
L"\U00011C57\U00011D57\U00016A67\U00016B57\U0001D7D5\U0001D7DF"
L"\U0001D7E9\U0001D7F3\U0001D7FD\U0001E957",
L"8" L"\u0668\u06F8\u07C8\u096E\u09EE\u0A6E\u0AEE\u0B6E\u0BEE\u0C6E"
L"\u0CEE\u0D6E\u0DEE\u0E58\u0ED8\u0F28\u1048\u1098\u17E8\u1818"
L"\u194E\u19D8\u1A88\u1A98\u1B58\u1BB8\u1C48\u1C58\uA628\uA8D8"
L"\uA908\uA9D8\uA9F8\uAA58\uABF8\uFF18"
L"\U000104A8\U0001106E\U000110F8\U0001113E\U000111D8\U000112F8"
L"\U00011458\U000114D8\U00011658\U000116C8\U00011738\U000118E8"
L"\U00011C58\U00011D58\U00016A68\U00016B58\U0001D7D6\U0001D7E0"
L"\U0001D7EA\U0001D7F4\U0001D7FE\U0001E958",
L"9" L"\u0669\u06F9\u07C9\u096F\u09EF\u0A6F\u0AEF\u0B6F\u0BEF\u0C6F"
L"\u0CEF\u0D6F\u0DEF\u0E59\u0ED9\u0F29\u1049\u1099\u17E9\u1819"
L"\u194F\u19D9\u1A89\u1A99\u1B59\u1BB9\u1C49\u1C59\uA629\uA8D9"
L"\uA909\uA9D9\uA9F9\uAA59\uABF9\uFF19"
L"\U000104A9\U0001106F\U000110F9\U0001113F\U000111D9\U000112F9"
L"\U00011459\U000114D9\U00011659\U000116C9\U00011739\U000118E9"
L"\U00011C59\U00011D59\U00016A69\U00016B59\U0001D7D7\U0001D7E1"
L"\U0001D7EB\U0001D7F5\U0001D7FF\U0001E959",
};
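static int wdigit(const wint_t wc)
{
    int i;
    /* wcschr() also matches the terminating null, so reject L'\0' and WEOF first */
    if (wc == WEOF || wc == L'\0')
        return -1;
    for (i = 0; i < 10; i++)
        if (wcschr(wdigits[i], (wchar_t)wc))
            return i;
    return -1;
}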
#endif /* WDIGIT_H */
On a Linux, *BSD, or Mac machine, you can compile the above using e.g.
gcc -std=c99 -Wall -Wextra -pedantic example.c -o example
or
clang -std=c99 -Wall -Wextra -pedantic example.c -o example
and test it using e.g.
printf 'Bengali decimal digit five is ৫.\n' | ./example
which outputs
Read 33 wide characters total.
25 letters
0 zeros (equivalent to '0')
0 ones (equivalent to '1')
0 twos (equivalent to '2')
0 threes (equivalent to '3')
0 fours (equivalent to '4')
1 fives (equivalent to '5')
0 sixes (equivalent to '6')
0 sevens (equivalent to '7')
0 eights (equivalent to '8')
0 nines (equivalent to '9')
6 whitespaces (including newlines and tabs)
1 punctuation characters
0 other printable characters
The above code is fully compliant to ISO C99 (and later versions of the ISO C standard), and should be completely portable.
However, note that not all C libraries fully support C99; the main one people have issues with is Microsoft C. I don't use Windows myself, but if you are, try using the UTF-8 codepage (chcp 65001). This is wholly and completely a Microsoft issue, as it apparently can support UTF-8 input with some nonstandard Windows extensions. They just don't want you to write portable code, it seems.

I need to ask 2 questions..
1st question: I know very well that '0' and '9' represent the ASCII values of 0 & 9 respectively. But what I don't seem to understand is why we even need to use the ASCII value and not the integer itself. Like why can't we simply use
if (c >= 0 && c <= 9)
Let's start with basics. All user input, file input, etc. is given in characters, so when you need to compare the character you have just read, it must be compared against another character. Within the character set, digits 0-9 are represented with ASCII values 48-57, so character '0' is represented by 48, and so on.
Your test above tests whether c is a digit, an ASCII value between 48-57, so you must use the characters themselves within the comparison, e.g. if ('0' <= c && c <= '9') you then know c is a digit. This brings us to:
2nd question:
++ndigit[c-'0']
In any classification problem you do, you will generally use an array initialized to all zero with at least enough elements for the set (of characters here). You can split them out as an array of ten elements to hold your digits, uppercase, lowercase, etc...
Your ndigit array begins initialized to all zeros; the plan is to increment the proper element in the array each time a digit is encountered during your read. This is where you make use of the ASCII value for the bottom of the digits, '0' (48). Since your ndigit array is indexed 0-9, each time a digit is encountered it must be scaled (or mapped) to the correct index of ndigit (so that '0' is mapped to 0, '1' mapped to 1, and so on).
Above through your test we determined, in this case, that c held a digit, so to classify that digit and have it map to the correct element of the ndigit array, we use c - '0'. If the digit in c is '3' (ASCII 51), then incrementing
++ndigit[c-'0'];
is actually indexing
++ndigit[51 - 48];
or
++ndigit[3]; /* since c was '3', we now increment ndigit[3], adding one more
occurrence of '3' to the data stored at ndigit[3] */
That way, when you are done, the ndigit array holds the exact number of 0, 1, 2, 3, 4, ... digits found in your input. It takes a bit to wrap your head around the scheme, but all in all you simply need somewhere to count from zero and store the totals for each character (digits, punctuation, and so on). An array sized for the character set holds these values exactly when you are done, because each character has been classified and the corresponding ndigit[] element incremented as you went along.
These, in a general sense, are called frequency arrays because they are used to store the frequency with which the individual members of a set appeared. There are many, many applications outside simply classifying characters.
Look all of the answers over and let me know if you are still confused and I'm more than happy to help further.
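For a version you can experiment with without typing input, here is a minimal sketch of the same frequency-array idea applied to a string. The name count_digits and the string-based interface are assumptions; the K&R program reads from getchar() instead.

```c
/* Hypothetical helper: count how often each digit '0'-'9' occurs in text.
   counts must have room for 10 ints; it is zeroed first. */
static void count_digits(const char *text, int counts[10])
{
    for (int i = 0; i < 10; i++)
        counts[i] = 0;
    for (; *text != '\0'; text++) {
        if (*text >= '0' && *text <= '9')
            ++counts[*text - '0'];    /* '3' - '0' == 3, and so on */
    }
}
```

For the text "a3b33 0" this leaves counts[3] at 3 and counts[0] at 1, with every other element still zero.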

getchar() returns character codes and sentinel values (EOF). So, we know c holds a character code inside the loop.
c-'0' is the distance on the character code "number line" from the value of c (a character code) to the code for '0'. Per the C standard, character codes must have these digits in consecutive order '0', '1', '2', '3', '4', '5', '6', '7', '8', '9'. So, the expression computes the integer value of the digit character.

C, Help while loop continues while not true

Task:
Write a char do/while loop, where the program will end if the letter is not in capital:
Solution:
Char input;
do{
    scanf("%c", &input);
} while (input <'a' || 'z'< input);
So my program says: "do this, while the input is either a or z". Why does it control all letters from a to z and how come my program ends if it's a little char instead of a capital?
I'm new to C, and I can't find an explanation anywhere, thanks in advance.

The problem is this statement:
while (input <'a' || 'z'< input);
It is true for every character that is not a lowercase letter, so the loop repeats for anything in the (single char) ASCII table outside 'a' to 'z'.
Also, the task is about uppercase letters, while the letters in the test are lowercase.
you could use:
while ('A' <= input && input <= 'Z')
however, best to use the functionality in the header file: ctype.h because not all systems use the ASCII character set. (IBM mainframe for instance, uses EBCDIC rather than ASCII, where the alphabet is not contiguous )
Remember that the 'enter' key is not upper case, (and not allowed for in the code)
the following proposed code:
cleanly compiles
performs the desired function
properly checks for errors
uses the facilities defined in the header file ctype.h
and now the proposed code
#include <stdio.h> // scanf(), perror()
#include <stdlib.h> // exit(), EXIT_FAILURE
#include <ctype.h> // isupper()
int main( void )
{
    // 'char' is all lower case:
    // so this statement: Char input;
    // does not compile, suggest:
    char input;
    do
    {
        int scanfStatus = scanf("%c", &input);
        // always check the returned value (not the parameter value)
        if( 1 != scanfStatus )
        {
            perror( "scanf failed" );
            exit( EXIT_FAILURE );
        }
    } while ( isupper( (unsigned char)input ) ); // cast: isupper() needs a non-negative value
} // end function: main

Your question is:
Why does it control all letters from a to z and how come my program ends if it's a little char instead of a capital?
The answer is, because of the while test, which tests whether input <'a' or 'z'< input.
Here is some background information that will help you understand why this happens.
In your program, input is a char, and, according to the C standard, the char type is an integral type. This means that 'a' (C's way to designate a char literal) is, in fact, a number, and thus, it can be compared with comparison operators such as < or > to other (integral) numbers or other char (here, the content of input).
Now, what is the actual integral value of a char? While the integral values of the character set are implementation-defined, in general, C compilers (including Visual Studio's) will use the ASCII Character Codes.
So:
'a' in your code, refers to the integral value of ASCII code for the char 'a', which is 97,
and 'z' in your code refers to the integral value of ASCII code for the char 'z', which is 122
As you can see also from the ASCII table (ASCII Character Codes Chart 1 from MSDN), the alphabet a-z has consecutive code numbers ('a' is 97, 'b' is 98, etc.), and the lowercase alphabet is effectively an ASCII code from 97 to 122.
So, in your code, the test input <'a' or 'z'< input is equivalent to input < 97 or 122 < input, and this is true when the entered char has any ASCII value outside the range 97 - 122. In other words, any char that is not a lowercase letter from 'a' to 'z' makes the while test true and repeats the scanf().
Finally, as noted by other commenters or contributors, the right type is char, not Char, since C is case-sensitive.
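To see the test in isolation, here is the loop condition wrapped in a function (the name keeps_looping is made up for this sketch, and ASCII-style consecutive lowercase letters are assumed):

```c
/* The loop condition from the question as a function: nonzero when
   input is NOT one of 'a'..'z', i.e. when the do/while would repeat. */
static int keeps_looping(char input)
{
    return input < 'a' || 'z' < input;
}
```

It returns 1 for 'A' and for '\n' (so the loop repeats) and 0 for any lowercase letter (so the loop ends), which matches the behaviour described in the question.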

Convert a char to an int in C

I want to make a program which converts 3www2as3com0 to www.as.com but I have got trouble at the beginning; I want to convert the first number of the string (the character 3) to an integer to use functions like strncpy or strchr so when I print the int converted the program shows 51 instead of 3. What is the problem?
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(int argc, char** argv) {
    /* argv[1]--->3www2as3com0*/
    char *string;
    char *p;
    string=argv[1];
    p=string;
    char cond,cond2;
    cond=*p; //I want to have in cond the number 3
    cond2=(int)cond; //I want to convert cond (a char) to cond2(an int)
    printf("%d",cond2); //It print me 51 instead of 3
    return (EXIT_SUCCESS);
}

Your computer evidently encodes strings in a scheme called ASCII. (I am fairly sure most modern computers use ASCII or a superset such as UTF-8 for char* strings.)
Notice how both printable and nonprintable characters are encoded as numbers. 51 is the number for the character '3'.
One of the nice features of ASCII is that all the digits have increasing codes starting from '0'.
This allows one to get the numerical value of a digit by calculating aDigitCharacter - '0'.
For example: cond2 = cond - '0';
EDIT:
You should probably also double-check that the character is indeed a digit by making sure it lies between '0' and '9'.
If you want to convert a string containing more than one digit to a number you might want to use atoi.
It can be found in <stdlib.h>.
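Building on the digit - '0' trick, a rough sketch of the whole conversion from the question could look like the following. It assumes every length is a single digit from 1 to 9 and that the input ends with a 0, which is only inferred from the one example 3www2as3com0; decode_labels is a hypothetical name.

```c
#include <stddef.h>

/* Hypothetical helper: decode single-digit length prefixes, e.g.
   "3www2as3com0" -> "www.as.com". Returns 1 on success, 0 on malformed
   input or if out (of size out_size) is too small. */
static int decode_labels(const char *in, char *out, size_t out_size)
{
    size_t used = 0;

    while (*in >= '1' && *in <= '9') {
        int len = *in++ - '0';            /* digit character -> number */
        if (used > 0) {
            if (used + 1 >= out_size)
                return 0;
            out[used++] = '.';            /* separator between labels */
        }
        for (int i = 0; i < len; i++) {
            if (*in == '\0' || used + 1 >= out_size)
                return 0;
            out[used++] = *in++;
        }
    }
    if (*in != '0' || out_size == 0)
        return 0;                         /* must end with a 0 length */
    out[used] = '\0';
    return 1;
}
```

Called with "3www2as3com0" and a buffer of, say, 32 chars, it fills the buffer with www.as.com.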

The character's integer value is the ASCII code for the digit, not the number it actually represents. You can convert by subtracting '0'.
if( c >= '0' && c <= '9' ) val = c - '0';

It seems the strings you are using will never contain a negative number, so you can use atoi(), which returns the integer value of a string (a char *, not a single char). If it encounters something that is not a digit, it returns the number built up until then.

Character frequency histogram in C

I read this program, but I'm not able to understand it. Please explain what exactly is happening in the length[] array. How can it be used to store different types of characters, i.e. both digits and letters? Following is the code:
#include <stdio.h>
#define EOL '\n'
#define ARYLEN 256
main()
{
    int c, i, x;
    int length[ARYLEN];
    for(x = 0; x < ARYLEN;x++)
        length[x] = 0;
    while( (c = getchar() ) != EOL)
    {
        length[c]++;
        if (c == EOL)
            break;
    }
    for(x = 0; x < ARYLEN; x++)
    {
        if( length[x] > 0){
            printf("%c | ", x);
            for(i = 1; i <= length[x]; ++i){
                printf("*");
            }
            printf("\n");
        }
    }
}

The array doesn't store any characters (at least conceptually). It stores the number of times the program has encountered a character with the numerical value c in the array position of index c.
Basically, in the C programming language, a char is a datatype that on most platforms consists of 8 bits and holds values in the range 0 to 255 for an unsigned char or -128 to 127 for a signed char.
The program then defines an array large enough to hold as many different values as it is possible to represent using a char, one array position for each unique value.
Then it counts the number of occurrences using the appropriate array position, length[c], as a counter for that specific value. As it loops over the array to print out the data, it can tell which character the data belongs to just by looking at the current index inside the loop, so printf("%c | ", x); is the character while length[x] is the data we're after.
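The counting step can be sketched on its own like this. The helper name count_chars is hypothetical, and unlike the question's code it takes a string and sizes the array with UCHAR_MAX rather than a fixed 256.

```c
#include <limits.h>

/* Hypothetical helper: counts[v] receives the number of characters in
   text whose value, read as unsigned char, is v. */
static void count_chars(const char *text, int counts[UCHAR_MAX + 1])
{
    for (int v = 0; v <= UCHAR_MAX; v++)
        counts[v] = 0;
    for (; *text != '\0'; text++)
        counts[(unsigned char)*text]++;   /* the character is the index */
}
```

After count_chars("banana", counts), counts['a'] is 3 and counts['n'] is 2; the characters themselves are never stored, only used as indices.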

In your code the integer array length[] is not used to store characters. It is only used to store the count of each character being typed. The characters are read one by one into the variable c by while( (c = getchar() ) != EOL).
But the tricky part is length[c]++;. The count of each character is kept at the location equal to its character code in the array length[].
For example in a system using ASCII codes, length[65] contains the count of A, because 65 is the ASCII code for A.
length[66] contains the count of B, because 66 is the ASCII code for B.
length[97] contains the count of a, because 97 is the ASCII code for a.
length[48] contains the count of 0, because 48 is the ASCII code for 0.
