C, Help while loop continues while not true - c

Task:
Write a char do/while loop, where the program will end if the letter is not in capital:
Solution:
Char input;
do{
scanf("%c", &input);
} while (input <'a' || 'z'< input);
So my program says: "do this, while the input is either a or z". Why does it control all letters from a to z and how come my program ends if it's a little char instead of a capital?
I'm new to C, and I can't find an explanation anywhere, thanks in advance.

The problem is this statement:
while (input <'a' || 'z'< input);
It keeps looping for every character that is not a lowercase letter, which covers most of the ASCII (single char) table, not just the uppercase letters.
Also, the task is about uppercase letters, while the letters in the condition are lowercase.
you could use:
while ('A' <= input && input <= 'Z')
however, best to use the functionality in the header file: ctype.h because not all systems use the ASCII character set. (IBM mainframe for instance, uses EBCDIC rather than ASCII, where the alphabet is not contiguous )
Remember that the 'enter' key is not upper case, (and not allowed for in the code)
the following proposed code:
cleanly compiles
performs the desired function
properly checks for errors
uses the facilities defined in the header file ctype.h
and now the proposed code
#include <stdio.h> // scanf(), perror()
#include <stdlib.h> // exit(), EXIT_FAILURE
#include <ctype.h> // isupper()
int main( void )
{
// 'char' is all lower case:
// so this statement: Char input;
// does not compile, suggest:
char input;
do
{
int scanfStatus = scanf("%c", &input);
// always check the returned value (not the parameter value)
if( 1 != scanfStatus )
{
perror( "scanf failed" );
exit( EXIT_FAILURE );
}
} while ( isupper( (unsigned char)input ) );
} // end function: main

Your question is:
Why does it control all letters from a to z and how come my program ends if it's a little char instead of a capital?
The answer is, because of the while test, which tests whether input <'a' or 'z'< input.
Here is some background information that will help you understand why this happens.
In your program, input is a char, and, according to the C standard, the char type is an integral type. This means that 'a' (C's way to designate a char literal) is, in fact, a number, and thus, it can be compared with comparison operators such as < or > to other (integral) numbers or other char (here, the content of input).
Now, what is the actual integral value of a char? While the integral values of the character set are implementation-defined, in general, C compilers (including Visual Studio's) will use the ASCII Character Codes.
So:
'a' in your code, refers to the integral value of ASCII code for the char 'a', which is 97,
and 'z' in your code refers to the integral value of ASCII code for the char 'z', which is 122
As you can see also from the ASCII table (ASCII Character Codes Chart 1 from MSDN), the alphabet a-z has consecutive code numbers ('a' is 97, 'b' is 98, etc.), and the lowercase alphabet is effectively an ASCII code from 97 to 122.
So, in your code, the test input <'a' or 'z'< input is equivalent to input < 97 or 122 < input, and this will be true when the entered char has any ASCII value outside the range of 97 - 122. In other words, any entered char that is not in the range of ASCII codes for lowercase letters from 'a' to 'z' makes the while test true and repeats the scanf().
Finally, as noted by other commentators or contributors, the right type is char, not Char since C is case-sensitive.

Related

Do char's in C have pre-assigned zero indexed values?

Sorry if my title is a little misleading, I am still new to a lot of this but:
I recently worked on a small cipher project where the user can give the file a argument at the command line but it must be alphabetical. (Ex: ./file abc)
This argument will then be used in a formula to encipher a message of plain text you provide. I got the code to work, thanks to my friend for helping, but I'm not 100% sure about a specific part of this formula.
#include <stdio.h>
#include <cs50.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <ctype.h>
int main (int argc, string argv[])
{ //Clarify that the argument count is not larger than 2
if (argc != 2)
{
printf("Please Submit a Valid Argument.\n");
return 1;
}
//Store the given argument (our key) inside a string var 'k' and check if it is alpha
string k = (argv[1]);
//Store how long the key is
int kLen = strlen(k);
//Tell the user we are checking their key
printf("Checking key validation...\n");
//Pause the program for 2 seconds
sleep(2);
//Check to make sure the key submitted is alphabetical
for (int h = 0, strlk = strlen(k); h < strlk; h++)
{
if isalpha(k[h])
{
printf("Character %c is valid\n", k[h]);
sleep(1);
}
else
{ //Telling the user the key is invalid and returning them to the console
printf("Key is not alphabetical, please try again!\n");
return 0;
}
}
//Store the users soon to be enciphered text in a string var 'pt'
string pt = get_string("Please enter the text to be enciphered: ");
//A prompt that the encrypted text will display on
printf("Printing encrypted text: ");
sleep(2);
//Encipher Function
for(int i = 0, j = 0, strl = strlen(pt); i < strl; i++)
{
//Get the letter 'key'
int lk = tolower(k[j % kLen]) - 'a';
//If the char is uppercase, run the V formula and increment j by 1
if isupper(pt[i])
{
printf("%c", 'A' + (pt[i] - 'A' + lk) % 26);
j++;
}
//If the char is lowercase, run the V formula and increment j by 1
else if islower(pt[i])
{
printf("%c", 'a' + (pt[i] - 'a' + lk) % 26);
j++;
}
//If the char is a symbol just print said symbol
else
{
printf("%c", pt[i]);
}
}
printf("\n");
printf("Closing Script...\n");
return 0;
}
The Encipher Function:
Uses 'A' as a char for the placeholder but does 'A' hold a zero indexed value automatically? (B = 1, C = 2, ...)
In C, character literals like 'A' are of type int, and represent whatever integer value encodes the character A on your system. On the 99.999...% of systems that use ASCII character encoding, that's the number 65. If you have an old IBM mainframe from the 1970s using EBCDIC, it might be something else. You'll notice that the code is subtracting 'A' to make 0-based values.
This does make the assumption that the letters A-Z occupy 26 consecutive codes. This is true of ASCII (A=65, B=66, etc.), but not of all codes, and not guaranteed by the language.
does 'A' hold a zero indexed value automatically? (B = 1, C = 2, ...)
No. Strictly conforming C code can not depend on any character encoding other than the numerals 0-9 being represented consecutively, even though the common ASCII character set does represent them consecutively.
The only guarantee regarding character sets is per 5.2.1 Character sets, paragraph 3 of the C standard:
... the value of each character after 0 in the above list of decimal digits shall be one greater than the value of the previous...
Character sets such as EBCDIC don't represent letters consecutively.
char is a numeric type that happens to also often be used to represent visible characters (or special non-visible pseudo-characters). 'A' is a value (with actual type int) that can be converted to a char without overflow or underflow. That is, it's really some number, but you usually don't need to know what number, since you generally use a particular char value either as just a number or as just a character, not both.
But this program is using char values in both ways, so it somewhat does matter what the numeric values corresponding to visible characters are. One way it's very often done, but not always, is using the ASCII values which are numbered 0 to 127, or some other scheme which uses those values plus more values outside that range. So for example, if the computer uses one of those schemes, then 'A'==65, and 'A'+1==66, which is 'B'.
This program is assuming that all the lowercase Latin-alphabet letters have numeric values in consecutive order from 'a' to 'z', and all the uppercase Latin-alphabet letters have numeric values in consecutive order from 'A' to 'Z', without caring exactly what those values are. This is true of ASCII, so it will work on many kinds of machines. But there's no guarantee it will always be true!
C does guarantee the ten digit characters from '0' to '9' are in consecutive order, which means that if n is a digit number from zero to nine inclusive, then n + '0' is the character for displaying that digit, and if c is such a digit character, then c - '0' is the number from zero to nine it represents. But that's the only guarantee the C language makes about the values of characters.
For one counter-example, see EBCDIC, which is not in much use now, but was used on some older computers, and C supports it. Its alphabetic characters are arranged in clumps of consecutive letters, but not with all 26 letters of each case all together. So the program would give incorrect results running on such a computer.
Sequentiality is only one aspect of concern.
Proper use of isalpha(ch) is another, not quite implemented properly in OP's code.
isalpha(ch) expects a ch in the range of unsigned char or EOF. With k[h], a char, that value could be negative. Ensure a non-negative value with:
// if isalpha(k[h])
if (isalpha((unsigned char) k[h]))

What is most advantage of using char instead of 'int'

Below is my code that converts uppercase letters to lowercase and vice versa.
#if SOL_2
char ch;
char diff = 'A' - 'a';
//int diff = 'A' - 'a';
fputs("input your string : ", stdout);
while ((ch = getchar()) != '\n') {
if (ch >= 'a' && ch <= 'z') {
ch += diff;
}
else if (ch >= 'A' && ch <= 'Z') {
ch -= diff;
}
else {}
printf("%c", ch);
}
#endif
In the code above, I replaced char diff = 'A' - 'a' with int diff = 'A' - 'a' and the result was the same. Therefore, I thought that using char can save memory, since char is one byte but int is four bytes. I can't think of any other advantages.
I would appreciate it if you let me know other advantages of it.
And What is the main reason of using char in order to store character values?
It is because of just memory size problem?
You should be using int ch and int diff.
getchar() returns int, not char. Therefore ch needs to be int. This is so you can tell the difference between end-of-file and character 0xff, both of which would be -1 in a signed byte. (reference)
char might be signed or unsigned (see this answer). Therefore, you should use int for comparisons so that you know you have room for negative values (int is signed by default).
To answer your specific question, use char when you know you have byte data and, yes, you'll most likely save some memory. Another reason to use char (or wchar_t or other character types) is to make it clear to the reader of your code that you intend this data to be text and not numeric, if indeed that is the case. Another use case for char is to access individual bytes of a file or other data stream.
What is the main reason of using char in order to store character values? It is because of just memory size problem?
The primary use of using char vs. int with arrays and sequences of characters is space (and processing speed on machines with wide architectures). If code uses characters limited to an 8-bit range, excessively large data types slow things down.
With single instances of a type, int is often better as that is typically the "native" type that the processor is optimized for.
Yet optimizing for a single char vs int (assuming both work in the application) is usually not a fruitful use of your time. Worry about larger issues and let the compiler optimize the small stuff.
Note that int getchar() returns values in the range of unsigned char, plus EOF. These typically 257 different values cannot be stored distinctly in a char. Use an int.
C provides isupper(), islower(), toupper() and tolower(), which are the robust way to handle simple character case conversion.
if (isupper(ch)) ch = tolower(ch);
Example usage:
int ch;
while ((ch = getchar()) != '\n' && ch != EOF) {
if (isupper(ch)) {
ch = tolower(ch);
}
else if (islower(ch)) {
ch = toupper(ch);
}
printf("%c", ch);
}
fflush(stdout);
With ASCII, EBCDIC and every small character encoding I've encountered, A-Z case conversion can be done by simply toggling a bit. Notice there are no magic numbers.
ch ^= 'A' ^ 'a';
Example usage:
int ch;
while ((ch = getchar()) != '\n' && ch != EOF) {
if (isalpha(ch)) {
ch ^= 'A' ^ 'a';
}
printf("%c", ch);
}
fflush(stdout);
Yes, as you pointed out, a char holds nothing but a 1-byte binary code, i.e. one of 256 numbers, and each number maps to a character (not every number necessarily represents a distinct character; that depends on which encoding you use). Have a look at Unicode: don't consider only English, but also the characters of other languages such as Chinese or Hindi. Every character in these languages needs to be represented by a number, and Unicode standardizes that mapping.
So the point is that a single char only covers a small character set, essentially the English alphabet. When you develop international software that can display different languages, you should use a wider type such as int instead. If your scope is only English, char is the best choice, because an int spends extra bits that stay unused and are simply padded with zeros to fill the width of an int.
Suppose you have a Chinese text open in an editor like Notepad with the character encoding set to ASCII. ASCII has a small character set (English A-Z, a-z, 0-9, space, newline and so on, 128 characters in total), so you will see weird characters, as if you had opened a binary file. To see the actual content of the file you need to change the encoding to UTF-8, which uses the Unicode character set, and then you can read the text.
Please read sections 6.3.1.8 Usual arithmetic conversions and 6.3.1.1 Boolean, characters, and integers of the C standard.
If an int can represent all values of the original type [...] the value is converted to an int;
In
char c1 = 'A', c2 = 'Z';
c2 - c1; // expression without side effects
the expression above, both c1 and c2 are converted to int before the subtraction is performed.

Ascii character encoding issue

#include <stdio.h>
int main()
{
char line[80];
int count;
// read the line of charecter
printf("Enter the line of text below: \n");
scanf("%[ˆ\n]",line);
// encode each individual charecter and display them
for(count = 0; line[count]!= '\0'; ++ count){
if(((line[count]>='0')&& (line [count]<= '9')) ||
((line[count]>= 'A')&& (line[count]<='Z')) ||
((line[count]>= 'a')&& (line[count]<='z')))
putchar(line[count]+1);
else if (line[count]=='9')putchar('0');
else if (line [count]== 'A')putchar('Z');
else if (line [count]== 'a') putchar('z');
else putchar('.');
}
}
In the above code problem is converting encoding. Whenever I compile the code, the compiler automatically converts the encoding and then I am unable to get required output.
My target output should look like:
enter the string
Hello World 456
Output
Ifmmp.uif.tusjof
For every letter, it is replaced by 2nd letter and space is replaced by '.'.
This is suspect:
scanf("%[ˆ\n]",line);
It should be:
scanf("%79[^\n]",line);
Your version has a multibyte character that looks a bit like ^, instead of the ^. This would cause your scans to malfunction. Your symptoms sound as if the text that has been input is actually multi-byte characters.
BTW you could make your code easier to read by using isalnum( (unsigned char)line[count] ). That test replaces your a-z, A-Z, 0-9 tests.
You are not checking your conditions correctly:
if ((line[count] >= 'A') && (line[count] <= 'Z'))
..
already converts the character 'Z'. The next check,
if (line [count]== 'A')putchar('Z');
is never executed. But that is not the only thing wrong here. The character 'A' should be translated to 'B', not 'Z'. You probably want
if (line[count] >= 'A' && line[count] < 'Z')
(< instead of <=) and
if (line [count]== 'Z')putchar('A');
and the same for lowercase and digits.
The problem is your format string for scanf. Note that %s stops at the first whitespace character, so it reads a single word rather than a whole line; to read a line of text from the console, use a scanset that excludes the newline.
If you want to make sure that you read a maximum of 79 characters, add a width of 79 (because your line array has a length of 80).
So you should replace your scanf with this:
scanf("%79[^\n]", line);

Unexpected Results Using isdigit(x)

I am using the following code. I expect the output to be "Yes", but I instead get "No." I must be missing something very simple and fundamental.
#include <stdio.h>
#include <ctype.h>
int main(void)
{
int n = 3;
if (isdigit(n))
{
printf("Yes\n");
}
else
{
printf("No\n");
}
return 0;
}
isdigit checks whether the character passed to it is a numeric character. Therefore, its argument should be char type or int type which is the code of a character.
Here, you are passing 3 to isdigit. In ASCII, 3 is the code of the character ETX (end of text) which is a non-numeric character. Therefore, isdigit(3) returns false.
isdigit() expects a character code, while you expect it to accept a plain number.
Those expectations do not mesh well.
Character literals:
'0' ordinal 0x30 -- 48
'1' ordinal 0x31 -- 49
'2' ordinal 0x32 -- 50
... You get the drift
The plain number 3 is not a digit in the way isdigit() considers them. Change int n = 3 to int n = '3'. If your system uses ASCII, for instance, 3 is the end of text marker, whereas 51, equivalent to '3', is the actual character three.
isdigit expects a character. If the character passed is digit then it returns non zero. 3 is an ASCII value of non-printable character, i.e it doesn't represent a digit and that's why your else body gets executed.
man isdigit:
These functions check whether c, which must have the value of an
unsigned char or EOF, falls into a certain character class according to
the current locale.
3 is an integer, not a character. In fact, 3 is also the value of a character that is not a digit.
The string "\03\t\nABCabc123" is composed of 12 characters. The first 9 characters all return false when applied to the isdigit() function. The last 3 all return true;
#include <ctype.h>
#include <stdio.h>
int main(void) {
char data[] = "\03\t\nABCabc123";
// data[0] = 3;
char *ptr = data;
while (*ptr) {
printf("%4d --> %s\n", *ptr, isdigit((unsigned char)*ptr) ? "true" : "false");
ptr++;
}
return 0;
}
Conceptually, isdigit behaves like this simplified sketch (the real function is declared in ctype.h and takes an int):
int isdigit(int c)
{
return '0' <= c && c <= '9';
}
In simple words, function isdigit takes a character code as its argument:
If the input represents a decimal digit, then the function returns a nonzero value (not necessarily 1).
If the input does not represent a decimal digit, then the function returns 0.
When you call isdigit(3), the argument is used as a character code, not as a number to be tested. Since 3 does not represent any decimal digit in ASCII, the return value of isdigit(3) is 0.
To summarize the above:
The return value of isdigit('0') is nonzero.
The return value of isdigit('1') is nonzero.
...
The return value of isdigit('9') is nonzero.
In all other cases, the return value of isdigit(...) is 0.
Your mistake probably stems from the assumption that '3' == 3. If you want to check whether or not an int variable stores a single-decimal-digit value, then you can implement a slightly different function:
int isintdigit(int i)
{
return 0 <= i && i <= 9;
}

Convert a char to an int in C

I want to make a program which converts 3www2as3com0 to www.as.com but I have got trouble at the beginning; I want to convert the first number of the string (the character 3) to an integer to use functions like strncpy or strchr so when I print the int converted the program shows 51 instead of 3. What is the problem?
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(int argc, char** argv) {
/* argv[1]--->3www2as3com0*/
char *string;
char *p;
string=argv[1];
p=string;
char cond,cond2;
cond=*p; //I want to have in cond the number 3
cond2=(int)cond; //I want to convert cond (a char) to cond2(an int)
printf("%d",cond2); //It print me 51 instead of 3
return (EXIT_SUCCESS);
}
Your computer evidently encodes strings in a scheme called ASCII. (I am fairly sure most modern computers use ASCII or a superset such as UTF-8 for char* strings.)
Notice how both printable and nonprintable characters are encoded as numbers. 51 is the number for the character '3'.
One of the nice features of ASCII is that all the digits have increasing codes starting from '0'.
This allows one to get the numerical value of a digit by calculating aDigitCharacter - '0'.
For example: cond2 = cond - '0';
EDIT:
You should probably also double-check that the character is indeed a digit by making sure it lies between '0' and '9'.
If you want to convert a string containing more than one digit to a number you might want to use atoi.
It can be found in <stdlib.h>.
The character's integer value is the ASCII code for the digit, not the number it actually represents. You can convert by subtracting '0'.
if( c >= '0' && c <= '9' ) val = c - '0';
It seems the strings you are using will never contain negative numbers, so you can use atoi(), which returns the integer value of a string of digits. If it encounters a character that is not a digit, it returns the number built up until then.

Resources