In C, I need to create a function that, for an input, will count and display the number of times each letter occurs.
For input of "Lorem ipsum dolor sit amet", the function should return something similar to:
a: 1
b: 0
c: 0
d: 1
e: 2
f: 0
...
So you will basically need to read through the entire file char-by-char. Assuming that you know how file-reading works, you will need to do something like (apologies, been a while since I did C):
if (isalpha(ch)) {
count[ch-'a']++;
}
/* rest of code where char pointer is moved on etc. */
You will need to include the ctype.h header for that:
#include <ctype.h>
I forgot to mention, and assumed you would deduce, the following: ch is the character currently read (not a pointer), and count[] is an int array initialized to all zeros. Note that the index ch-'a' above is only valid for lowercase letters; an uppercase letter gives a negative index. To count upper and lowercase separately, use an array of size (26 * 2) = 52 and index uppercase letters with 26 + (ch-'A'). If upper and lowercase should be treated the same, you can use the tolower(int c) function, also declared in ctype.h. In this case you only need an array of size 26.
if (isalpha(ch)) {
count[tolower(ch)-'a']++;
}
Then the count[] should contain the counts for each character.
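Putting the pieces above together, a minimal sketch could look like the following. The function names count_letters and print_counts are made up for illustration, the input is assumed to be a NUL-terminated string rather than a file, and the letters a-z are assumed to be contiguous (as in ASCII):

```c
#include <ctype.h>
#include <stdio.h>

#define LETTERS 26

/* Count how often each letter a-z occurs in text, ignoring case.
   counts must have room for LETTERS entries; it is zeroed first.
   Assumes 'a'..'z' are contiguous in the character set (e.g. ASCII). */
void count_letters(const char *text, int counts[LETTERS])
{
    for (int i = 0; i < LETTERS; i++) {
        counts[i] = 0;
    }
    for (; *text != '\0'; text++) {
        /* cast so that negative char values are never passed to isalpha() */
        unsigned char ch = (unsigned char)*text;
        if (isalpha(ch)) {
            int idx = tolower(ch) - 'a';
            /* guard against letters outside a-z in non-"C" locales */
            if (idx >= 0 && idx < LETTERS) {
                counts[idx]++;
            }
        }
    }
}

/* Print one "letter: count" line per letter, as in the question. */
void print_counts(const int counts[LETTERS])
{
    for (int i = 0; i < LETTERS; i++) {
        printf("%c: %d\n", 'a' + i, counts[i]);
    }
}
```

To use it, declare int counts[26];, call count_letters("Lorem ipsum dolor sit amet", counts); and then print_counts(counts);.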
/* ***** */
If you wanted to do this with only the stdio.h library, you can implement the two functions used from the ctype.h library.
A simple implementation of the isalpha(int c) function could be something like:
if (((int)c >= 'a' && (int)c <= 'z') || ((int)c >= 'A' && (int)c <= 'Z')) {
return TRUE;
} else {
return FALSE;
}
(where TRUE and FALSE are of type your return type and something you defined).
And a REALLY simple version of tolower could be something like:
if ((int)c >= 'A' && (int)c <= 'Z') {
return (int)c - 'A' + 'a';
} else {
return (int)c;
}
You could probably do without all the casts...
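Without the casts, and written as complete functions, the two helpers might look like this. The names my_isalpha and my_tolower are hypothetical (chosen so they don't clash with ctype.h), and both assume a character set such as ASCII where a-z and A-Z are each contiguous:

```c
/* Minimal stand-ins for isalpha() and tolower().
   Both assume 'a'..'z' and 'A'..'Z' are contiguous ranges, as in ASCII. */

/* Returns 1 if c is a letter, 0 otherwise. */
int my_isalpha(int c)
{
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

/* Returns the lowercase form of an uppercase letter, or c unchanged. */
int my_tolower(int c)
{
    if (c >= 'A' && c <= 'Z') {
        return c - 'A' + 'a';
    }
    return c;
}
```

Returning the comparison result directly avoids having to define TRUE and FALSE yourself.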
hints:
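int c[26] = { 0 }; // init; int rather than char so the counts cannot overflow at 127
// read each input char (the index is only valid while input is a lowercase letter)
++c[input-'a'];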
I would have an array (of size equal to the char domain) and increment the count at the appropriate position.
count[ch]++;
Just in case you are concerned with speed:
unsigned int chars[256] = { 0 };
const unsigned char *p = (const unsigned char *)text;
while(*p)
chars[*p++]++;
for(int i = 0; i < 256; i++)
if(i >= 'A' && i <= 'Z')
printf("%c) %u\n", i, chars[i]);
I have for example two arrays
char 1 is: 77a abcd Abc abc1d #### v k
char2 is: 789 ABA AABB 123 ab #% abcde
The characters should match at indices 0, 3, 4, 5, 9, 10, 12 and 20.
The result should be 8, but I get 9. The problem is that characters with an ASCII code lower than 64 are still matched by the +32 comparison, and they should not be.
Here is the code:
int intersection(char arrayNumberOne[size], char arrayNumberTwo[size])
{
int counter = 0;
for (int i = 0; i < strlen(arrayNumberOne); i++)
{
if ((arrayNumberOne[i] == arrayNumberTwo[i]) || (arrayNumberOne[i] == arrayNumberTwo[i] + 32) || (arrayNumberOne[i] + 32 == arrayNumberTwo[i]))
{
if (arrayNumberOne[i] < 64)
{
?????
}
counter++;
}
}
return counter;
}
Adding 32 to an ASCII character can produce unintended matches, and you have already found the issue.
I suggest you break your conditionals into smaller pieces and make a few small changes.
First, the check for symbols is incomplete, so change:
if (arrayNumberOne[i] < 64)
to:
if (arrayNumberOne[i] <= 64 || arrayNumberTwo[i] <= 64)
Because both arrays could contain a symbol.
Second, split the expressions so the logic is easier to follow; you can merge them again afterwards:
// assuming this runs inside the for-loop, 'continue' skips the remaining checks
// compare any character
// both lower case or upper case
if (arrayNumberOne[i] == arrayNumberTwo[i]) {
counter++;
continue;
}
// is a symbol, skip the verification ahead
if (arrayNumberOne[i] <= 64 || arrayNumberTwo[i] <= 64)
continue;
// character verifications
// first is uppercase
if (arrayNumberOne[i] + 32 == arrayNumberTwo[i]) {
counter++;
continue;
}
// second is uppercase
if (arrayNumberOne[i] == arrayNumberTwo[i] + 32) {
counter++;
continue;
}
This will make your code work, but it would be better to check for 'a'-'z' explicitly or to use tolower, as suggested in the comments and answers.
Use tolower. There is no need to call strlen, but you do need to stop at the end of whichever string is shorter.
The cast to (unsigned char) avoids undefined behavior if char is a signed type (which it often is) and either array contains negative char values (other than EOF, the one negative value tolower accepts). Thanks to @Kaz for flagging this.
#include <ctype.h>
#include <stdio.h>
int intersection(const char *s1, const char *s2)
{
int counter = 0;
while (*s1 != '\0' && *s2 != '\0') {
if (tolower((unsigned char) *s1++) == tolower((unsigned char) *s2++)) {
counter++;
}
}
return counter;
}
int main() {
printf("%d\n", intersection("77a abcd Abc abc1d #### v k",
"789 ABA AABB 123 ab #% abcde"));
return 0;
}
Returns 8
What I'm thinking of is something like this:
range = range <= '9' && range >= '0';
I want to extract a contiguous sequence of digits from a string. And once the program find a non-digit after it finds the sequence of digits, I want it to break out of the loop using break; in the second if-statement (line 59). And I think it would be much easier if I can just write the condition using a variable.
What I want to say in line 59 is "If the var digit_flag is TRUE and the element in the array s is included in the var range(which is a number range), then break;"
Can it be done?
If it can't, why not?
int i = 0;
int size_of_s = 0;
int digit_flag = FALSE;
while (s[i] != '\0') {
if (s[i] == ' ') {
i++;
} else if (s[i] <= '9' && s[i] >= '0') {
size_of_s++;
i++;
digit_flag = TRUE;
}
if (digit_flag == TRUE && s[i] != range) {
break;
}
}
What I want to say in line 59 is "If the var digit_flag is TRUE and the element in the array s is included in the var range(which is a number range), then break;"
Can it be done?
As far as I know this can not be done in C.
If it can't, why not?
Because there is no relational or comparison operator in the C language which means "operand 1 is within the range of operand 2" (even if the second operand is an array). You need to use a logical AND (&&) of two conditions (>= A, <= B).
If you don't want to use a standard function - such as isdigit(), which will use the range 0-9 - you could use a macro
#define IS_IN_RANGE(x, min, max) ((x) >= (min) && (x) <= (max))
or inline function.
static inline int is_in_range(int x, int min, int max) {
return (x >= min && x <= max);
}
If you want a "range" in C, you have to create one yourself:
#include <stdbool.h>
// integer range class
// in this case it could also be a char range
struct int_range {
int start;
int end;
};
// range method to test for inclusion
// in this case range could be passed as value efficiently, too,
// but passing as const pointer is more generic, so better example
bool int_range_contains(const struct int_range *range, int value) {
return value >= range->start && value <= range->end;
}
// example usage
void func(void) {
struct int_range digit_chars = { '0', '9' };
int character = 'a';
if (int_range_contains(&digit_chars, character)) {
// something
}
}
Of course this is a total overkill for this case. Your current if (s[i] <= '9' && s[i] >= '0') is better, and every C programmer immediately sees what's going on there.
Instead of
digit_flag == TRUE && s[i] != range
you can re-use what you already wrote
digit_flag == TRUE && !(s[i] <= '9' && s[i] >= '0')
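For completeness, here is a rough sketch of the whole extraction written with isdigit() instead of a range variable. The function name first_digit_run and the output buffer are assumptions made for this example, and unlike the loop in the question it skips any non-digit prefix, not only spaces:

```c
#include <ctype.h>
#include <stddef.h>

/* Copy the first contiguous run of digits in s into out (at most
   out_size - 1 characters) and return its length; 0 if there is none. */
size_t first_digit_run(const char *s, char *out, size_t out_size)
{
    size_t n = 0;

    if (out_size == 0) {
        return 0;
    }
    /* skip everything before the first digit */
    while (*s != '\0' && !isdigit((unsigned char)*s)) {
        s++;
    }
    /* copy digits until the first non-digit, or until out is full */
    while (isdigit((unsigned char)*s) && n + 1 < out_size) {
        out[n++] = *s++;
    }
    out[n] = '\0';
    return n;
}
```

The second loop ends at the first non-digit by itself, so no digit_flag or explicit break is needed.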
If you want to extract the number sequence part in a string like
char str[]="hello1234isthesequence.";
you could just do
char seq[30];
if( sscanf(str, "%*[^0-9]%29[0-9]", seq)==1 )
{
printf("\nThe sequence is: %s", seq);
}
where the %*[^0-9] reads everything in str[] up to the first digit, and the * suppresses assignment, i.e. the matched text is discarded and not stored anywhere.
Next, the %[0-9] reads the digits that follow, stopping at (and excluding) the first non-digit, and stores them in seq. Note that %*[^0-9] must match at least one character, so this format fails if str starts with a digit.
sscanf() returns the number of successful assignments it made, which in this case should be 1.
Size seq according to the input string and give %[0-9] a width specifier one less than that size (for example %29[0-9] for char seq[30]) to avoid overflow.
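As a sketch of how the width specifier and the leading-digit case could be handled together (the wrapper name extract_digits is made up for this example, and the buffer size of 30 is taken from the snippet above):

```c
#include <stdio.h>

/* Extract the first run of digits in str into seq (capacity 30).
   Returns 1 on success, 0 if str contains no digits. */
int extract_digits(const char *str, char seq[30])
{
    /* Case 1: some non-digits come first; skip them, then read digits.
       The width 29 leaves room for the terminating NUL in seq. */
    if (sscanf(str, "%*[^0-9]%29[0-9]", seq) == 1) {
        return 1;
    }
    /* Case 2: str starts with a digit, so %*[^0-9] matched nothing and
       sscanf stopped early; try again without the skipping part. */
    return sscanf(str, "%29[0-9]", seq) == 1;
}
```

Comparing the return value against 1 also covers the case where sscanf() returns EOF, for example on an empty string.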
I am attempting to write a program that accepts grammatically incorrect text (under 990 characters in length) as input, corrects it, and then returns the corrected text as output. I attempted to run the program using the online compiler, "ideone", but it returned quite a few errors that I don't quite understand. I have posted my code, as well as a picture of the errors below. Can anybody explain to me what exactly the errors mean?
#include "stdio.h"
char capitalize(int i); //prototype for capitalize method
int main(void)
{
char userInput[1200]; //Array of chars to store user input. Initialized to 1200 to negate the possibility of added characters filling up the array.
int i; //Used as a counter for the for loop below.
int j; //Used as a counter for the second for loop within the first for loop below.
int numArrayElements;
printf("Enter your paragraphs: ");
scanf("%c", &userInput); //%c used since chars are expected as input(?)
numArrayElements = sizeof(userInput) / sizeof(userInput[0]); //stores the number of elements in the array into numArrayElements.
if (userInput[0] >= 97 && userInput[0] <= 122) //Checks the char in index 0 to see if its ascii value is equal to that of a lowercase letter. If it is, it is capitalized.
userInput[0] = capitalize(userInput[0]);
//code used to correct input should go here.
for (i = 1; i < numArrayElements; i++) //i is set to 1 here because index 0 is taken care of by the if statement above this loop
{
if (userInput[i] == 32) //checks to see if the char at index i has the ascii value of a space.
if (userInput[i + 1] == 32 && userInput[i - 1] != 46) //checks the char at index i + 1 to see if it has the ascii value of a space, as well as the char at index i - 1 to see if it is any char other than a period. The latter condition is there to prevent a period from being added if one is already present.
{
for (j = numArrayElements - 1; j > (i - 1); j--) //If the three conditions above are satisfied, all characters in the array at location i and onwards are shifted one index to the right. A period is then placed within index i.
userInput[j + 1] = userInput[j];
userInput[i] = 46; //places a period into index i.
numArrayElements++; //increments numArrayElements to reflect the addition of a period to the array.
if (userInput[i + 3] >= 97 && userInput[i + 3] <= 122) //additionally, the char at index i + 3 is examined to see if it is capitalized or not.
userInput[i + 3] = capitalize(userInput[i + 3]);
}
}
printf("%c\n", userInput); //%c used since chars are being displayed as output.
return 0;
}
char capitalize(char c)
{
return (c - 32); //subtracting 32 from a lowercase char should result in it gaining the ascii value of its capitalized form.
}
Your code has several problems, quite typical for a beginner. The answer to the question in your last comment lies in the way scanf() works: with a %s conversion it takes everything between whitespace as one token, so it just ends after hey. I commented the code for the rest of the problems I found without being too nitpicky. The comments below this post might add more if their authors feel so inclined.
#include "stdlib.h"
#include "stdio.h"
#include <string.h>
// Check for ASCII (spot-checks only).
// It will not work for encodings that are very close to ASCII but do not earn the
// idiomatic cigar for it but will fail for e.g.: EBCDIC
// (No check for '9' because non-consecutive digits are forbidden by the C-standard)
#if ('0' != 0x30) || ('a' != 0x61) || ('z' != 0x7a) || ('A' != 0x41) || ('Z' != 0x5a)
#error "Non-ASCII input encoding found, please change code below accordingly."
#endif
#define ARRAY_LENGTH 1200
// please put comments on top, not everyone has a 4k monitor
//prototype for capitalize method
char capitalize(char i);
int main(void)
{
//Array of chars to store user input.
// Initialized to 1200 to negate the possibility of
// added characters filling up the array.
// added one for the trailing NUL
char userInput[ARRAY_LENGTH + 1];
// No need to comment counters, some things can be considered obvious
// as are ints called "i", "j", "k" and so on.
int i, j;
int numArrayElements;
// for returns
int res;
printf("Enter your paragraphs: ");
// check returns. Always check returns!
// (there are exceptions if you know what you are doing
// or if failure is unlikely under normal circumstances (e.g.: printf()))
// scanf() will read everything that is not a newline up to 1200 characters
res = scanf("%1200[^\n]", userInput);
if (res != 1) {
fprintf(stderr, "Something went wrong with scanf() \n");
exit(EXIT_FAILURE);
}
// you have a string, so use strlen()
// numArrayElements = sizeof(userInput) / sizeof(userInput[0]);
// the return type of strlen() is size_t, hence the cast
numArrayElements = (int) strlen(userInput);
// Checks the char in index 0 to see if its ascii value is equal
// to that of a lowercase letter. If it is, it is capitalized.
// Do yourself a favor and use curly brackets even if you
// theoretically do not need them. The single exception being "else if"
// constructs where it looks more odd if you *do* place the curly bracket
// between "else" and "if"
// don't use the numerical value here, use the character itself
// Has the advantage that no comment is needed.
// But you still assume ASCII or at least an encoding where the characters
// are encoded in a consecutive, gap-less way
if (userInput[0] >= 'a' && userInput[0] <= 'z') {
userInput[0] = capitalize(userInput[0]);
}
// i is set to 1 here because index 0 is taken care of by the
// if statement above this loop
for (i = 1; i < numArrayElements; i++) {
// checks to see if the char at index i has the ascii value of a space.
if (userInput[i] == ' ') {
// checks the char at index i + 1 to see if it has the ascii
// value of a space, as well as the char at index i - 1 to see
// if it is any char other than a period. The latter condition
// is there to prevent a period from being added if one is already present.
if (userInput[i + 1] == ' ' && userInput[i - 1] != '.') {
// If the three conditions above are satisfied, all characters
// in the array at location i and onwards are shifted one index
// to the right. A period is then placed within index i.
// you need to include the NUL at the end, too
for (j = numArrayElements; j > (i - 1); j--) {
userInput[j + 1] = userInput[j];
}
//places a period into index i.
userInput[i] = '.';
// increments numArrayElements to reflect the addition
// of a period to the array.
// numArrayElements might be out of bounds afterwards, needs to be checked
numArrayElements++;
if (numArrayElements > ARRAY_LENGTH) {
fprintf(stderr, "numArrayElements %d out of bounds\n", numArrayElements);
exit(EXIT_FAILURE);
}
// additionally, the char at index i + 3 is examined to see
// if it is capitalized or not.
// The loop has the upper limit at numArrayElements
// i + 3 might be out of bounds, so check
if (i + 3 > ARRAY_LENGTH) {
fprintf(stderr, "(%d + 3) is out of bounds\n",i);
exit(EXIT_FAILURE);
}
if (userInput[i + 3] >= 'a' && userInput[i + 3] <= 'z') {
userInput[i + 3] = capitalize(userInput[i + 3]);
}
}
}
}
printf("%s\n", userInput);
return 0;
}
char capitalize(char c)
{
// subtracting 32 from a lowercase char should result
// in it gaining the ascii value of its capitalized form.
return (c - ' ');
}
I'm trying to write a program in C that converts hexadecimal numbers to integers. I've successfully written a program that converts octals to integers. However, the problems begin once I start using the letters (a-f). My idea for the program is as follows:
The parameter must be a string that starts with 0x or 0X.
The parameter hexadecimal number is stored in a char string s[].
The integer n is initialized to 0 and then converted as per the rules.
My code is as follows (I've only read up to p37 of K & R so don't know much about pointers) :
/*Write a function htoi(s), which converts a string of hexadecimal digits (including an optional 0x or 0X) into its equivalent integer value. The allowable digits are 0 through 9, a through f, and A through F.*/
#include <stdio.h>
#include <string.h>
#include <math.h>
#include <ctype.h>
int htoi(const char s[]) { //why do I need this to be constant??
int i;
int n = 0;
int l = strlen(s);
while (s[i] != '\0') {
if ((s[0] == '0' && s[1] == 'X') || (s[0] == '0' && s[1] == 'x')) {
for (i = 2; i < (l - 1); ++i) {
if (isdigit(s[i])) {
n += (s[i] - '0') * pow(16, l - i - 1);
} else if ((s[i] == 'a') || (s[i] == 'A')) {
n += 10 * pow(16, l - i - 1);
} else if ((s[i] == 'b') || (s[i] == 'B')) {
n += 11 * pow(16, l - i - 1);
} else if ((s[i] == 'c') || (s[i] == 'C')) {
n += 12 * pow(16, l - i - 1);
} else if ((s[i] == 'd') || (s[i] == 'D')) {
n += 13 * pow(16, l - i - 1);
} else if ((s[i] == 'e') || (s[i] == 'E')) {
n += 14 * pow(16, l - i - 1);
} else if ((s[i] == 'f') || (s[i] == 'F')) {
n += 15 * pow(16, l - i - 1);
} else {
;
}
}
}
}
return n;
}
int main(void) {
int a = htoi("0x66");
printf("%d\n", a);
int b = htoi("0x5A55");
printf("%d\n", b);
int c = htoi("0x1CA");
printf("%d\n", c);
int d = htoi("0x1ca");
printf("%d\n", d);
}
My questions are:
1. If I don't use const in the argument for htoi(s), i get the following warnings from the g++ compiler :
2-3.c: In function ‘int main()’: 2-3.c:93:20: warning: deprecated
conversion from string constant to ‘char*’ [-Wwrite-strings]
2-3.c:97:22: warning: deprecated conversion from string constant to
‘char*’ [-Wwrite-strings] 2-3.c:101:21: warning: deprecated conversion
from string constant to ‘char*’ [-Wwrite-strings] 2-3.c:105:21:
warning: deprecated conversion from string constant to ‘char*’
[-Wwrite-strings]
Why is this?
2.Why is my program taking so much time to run? I haven't seen the results yet.
3.Why is it that when I type in cc 2-3.c instead of g++ 2-3.c in the terminal, I get the following error message:
"undefined reference to `pow'"
on every line that I've used the power function?
4. Please do point out other errors/ potential improvements in my program.
If I don't use const in the argument for htoi(s), i get the following warnings from the g++ compiler
The const parameter should be there, because it is regarded as good and proper programming to never typecast away const from a pointer. String literals "..." should be treated as constants, so if you don't have const as parameter, the compiler thinks you are casting away the const qualifier.
Furthermore, you should declare as const all pointer parameters whose contents you don't intend to modify; search for the term "const correctness".
Why is my program taking so much time to run? I haven't seen the results yet.
I think mainly because of an initialization mistake: int i; is never initialized, so i contains garbage, and the loop condition becomes while (s[garbage_value] != '\0'). In addition, the inner for loop stops with i still pointing at the last character rather than the terminator, so the outer while condition never becomes false. This function can be written a whole lot better too. Start by checking for the 0x at the start of the string; if it isn't there, signal some error (for example by returning an error code), otherwise skip it. Then run one single loop after that; you don't need two loops.
Note that pow() works on floating-point numbers (double), which makes your program slightly slower and can introduce rounding when the result is converted back to int. You could consider using an integer-only version. Unfortunately there is no such function in standard C, so you will have to write one or find one elsewhere.
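One possible integer-only replacement is exponentiation by repeated squaring. This is only a sketch: the name ipow is made up here, and it does no overflow checking, so the caller has to keep the result within the range of unsigned long:

```c
/* Integer power by repeated squaring (hypothetical helper, not part of
   the standard library). Unsigned arithmetic wraps silently on overflow,
   so the caller must keep base^exp within the range of unsigned long. */
unsigned long ipow(unsigned long base, unsigned int exp)
{
    unsigned long result = 1;

    while (exp > 0) {
        if (exp & 1u) {
            result *= base;   /* this bit of the exponent is set */
        }
        base *= base;         /* square the base for the next bit */
        exp >>= 1;
    }
    return result;
}
```

For hex conversion specifically, multiplying by 16 (or shifting left by 4) once per digit avoids the need for any power function at all.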
Also consider the function isxdigit(), a standard function in ctype.h, which checks for digits 0-9 as well as hex letters A-F or a-f. It may however not help with performance, as you will need to perform different calculations for digits and letters.
For what it is worth, here is a snippet showing how you can convert a single char to a hexadecimal int. It is not the most optimized version possible, but it takes advantage of available standard functions, for increased readability and portability:
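#include <ctype.h>
#include <stdint.h>
uint8_t hexchar_to_int (char ch)
{
uint8_t result;
// cast so that negative char values are never passed to the ctype functions
if(isdigit((unsigned char)ch))
{
result = ch - '0';
}
else if (isxdigit((unsigned char)ch))
{
result = toupper((unsigned char)ch) - 'A' + 0xA;
}
else
{
// error: not a hex digit; return 0 rather than an uninitialized value
result = 0;
}
return result;
}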
Don't use a C++ compiler to compile a C program. That's my first advice to you.
Secondly const in a function parameter for a char * ensures that the programmer doesn't accidentally modify the string.
Thirdly you need to include the math library with -lm as stated above.
a const char[] means that you cannot change it in the function. Casting from a const to not-const gives a warning. There is much to be said about const. Check out its Wikipedia page.
--
Probably, cc doesn't link the right libraries. Try the following build command: cc 2-3.c -lm
Improvements:
Don't use pow(), it is quite expensive in terms of processing time.
Use the same trick with the letters as you do with the numbers to get the value, instead of using fixed 'magic' numbers.
You don't need the last else part. Just leave it empty (or put an error message there, because those characters aren't allowed).
Good luck!
Regarding my remark about the pow() call: with the use of the hexchar_to_int() function described above, this is how I'd implement it (without error checking):
const char *t = "0x12ab";
int i = 0, n = 0;
int result = 0;
for (i = 2; t[i] != '\0'; i++) {
n = hexchar_to_int(t[i]);
/* make room for the next hex digit, then add it */
result = (result << 4) | n;
}
I just worked through this exercise myself, and I think one of the main ideas was to use the knowledge that chars can be compared as integers (they talk about this in chapter 2).
Here's my function for reference. Thought it may be useful as the book doesn't contain answers to exercises.
int htoi(char s[]) {
int i = 0;
if(s[i] == '0') {
++i;
if(s[i] == 'x' || s[i] == 'X') {
++i;
}
}
int val = 0;
while (s[i] != '\0') {
val = 16 * val;
if (s[i] >= '0' && s[i] <= '9')
val += (s[i] - '0');
else if (s[i] >= 'A' && s[i] <= 'F')
val += (s[i] - 'A') + 10;
else if (s[i] >= 'a' && s[i] <= 'f')
val += (s[i] - 'a') + 10;
else {
printf("Error: number supplied not valid hexadecimal.\n");
return -1;
}
++i;
}
return val;
}
Always initialize your variables, e.g. int i = 0; otherwise i contains a garbage value, which could be any number and not necessarily 0 as you expect. Your while statement runs as an infinite loop, which is why it takes forever to get the results; print i to see why. Also, add a break if the string doesn't start with 0x, to avoid the same loop issue when the function is used on an arbitrary string. As others mention, you need to link the library containing the pow function (-lm) and declare your string parameter const to get rid of the warning.
This is my version of program for the question above. It converts the string of hex into decimal digits irrespective of optional prefix(0x or 0X).
The important library functions used are strlen(s), isdigit(c), isxdigit(c), toupper(c) and pow(m,n).
Suggestions to improve the code are welcome :)
/*Program - 5d Function that converts hex(s)into dec -*/
#include<stdio.h>
#include<stdlib.h>
#include<ctype.h> //Declares isdigit, isxdigit and toupper
#include<math.h> //Declares mathematical functions and macros
#include<string.h> //Refer appendix in Page 249 (very useful)
#define HEX_LIMIT 10
int hex_to_dec(char hex[]) //Function created by me :)
{
int dec = 0; //Initialization of decimal value
int size = strlen(hex); //To find the size of hex array
int temp = size-1 ; //Pointer pointing the right element in array
int loop_limit = 0; //To exclude '0x' or '0X' prefix in input
if(hex[0]=='0' && ((hex[1]=='x') || (hex[1]=='X')))
loop_limit = 2;
while(temp>=loop_limit)
{
int hex_value = 0; //Temporary value to hold the equivalent hex digit in decimal
if(isdigit(hex[temp]))
hex_value = (hex[(temp)]-'0') ;
else if(isxdigit(hex[temp]))
hex_value = (toupper(hex[temp])-'A' + 10);
else{
printf("Error: No supplied is not a valid hex\n\n");
return -1;
}
dec += hex_value * pow(16,(size-temp-1)); //Computes equivalent dec from hex
temp--; //Moves the pointer to the left of the array
}
return dec;
}
int main()
{
char hex[HEX_LIMIT];
printf("Enter the hex no you want to convert: ");
scanf("%9s",hex); //Width 9 leaves room for the terminating NUL in hex[HEX_LIMIT]
printf("Converted no in decimal: %d\n", hex_to_dec(hex));
return 0;
}
I know that ctype.h defines isdigit, however this only works for base 10. I'd like to check to see if a number is a digit in a given base int b.
What's the best way to do this in C?
Edit
I've come up with the following function:
int y_isdigit(char c, int b) {
static const char digits[] = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
static const int digitslen = sizeof digits - 1;
int highest = b - 1;
if(highest >= digitslen)
return -1; /* can't handle bases above 36 */
if(b < 1)
return -2; /* can't handle bases below unary */
if(b == 1)
return c == '1'; /* special case */
if(c == '\0')
return 0; /* strchr would otherwise match the terminator */
const char *p = strchr(digits, c); /* strchr returns a pointer, not an index */
return p != NULL && (p - digits) <= highest;
}
Is there any advantage to using the version schnaader posted over this one? (This seems to have the added benefit of not relying on the user's charset being ASCII, not that it matters much anymore.)
I'd suggest something like this:
// input: char c
if (b <= 10) {
if ((c >= '0') && (c < ('0' + b))) {
// is digit
}
} else if (b <= 36) {
if ((c >= '0') && (c <= '9')) {
// is digit
} else if ((c >= 'A') && (c < 'A' + (b - 10))) {
// is digit
}
}
This should work (untested) for base 2..36 if you're using 0..9 and A..Z.
An alternative would be to use a boolean lookup table, this is the fastest way to check. For example you could prepare tables for bases 2..36, using up 256*35 = 8960 bytes of memory, after this the isdigit check is a simple memory read.
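A rough sketch of the lookup-table idea follows. All names here (init_digit_tables, is_base_digit, the two tables) are invented for the example; it only recognizes 0..9 and uppercase A..Z as above, and for simplicity it indexes the table directly by base, so rows 0 and 1 are allocated but unused:

```c
#include <limits.h>
#include <stdbool.h>
#include <string.h>

#define MAX_BASE 36

/* digit_value[c] is the value of character c as a digit (0..35), or -1. */
static signed char digit_value[UCHAR_MAX + 1];
/* is_digit_in_base[b][c] is true when c is a valid digit in base b. */
static bool is_digit_in_base[MAX_BASE + 1][UCHAR_MAX + 1];

/* Fill both tables once, before the first lookup. */
void init_digit_tables(void)
{
    static const char digits[] = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";

    memset(digit_value, -1, sizeof digit_value);
    for (int v = 0; digits[v] != '\0'; v++) {
        digit_value[(unsigned char)digits[v]] = (signed char)v;
    }
    for (int b = 2; b <= MAX_BASE; b++) {
        for (int c = 0; c <= UCHAR_MAX; c++) {
            is_digit_in_base[b][c] = digit_value[c] >= 0 && digit_value[c] < b;
        }
    }
}

/* After init_digit_tables(), the check is a single table read. */
bool is_base_digit(char c, int b)
{
    if (b < 2 || b > MAX_BASE) {
        return false;
    }
    return is_digit_in_base[b][(unsigned char)c];
}
```

Building the table from a digit string rather than from character arithmetic also means it does not depend on the letters being contiguous in the character set.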
If you are using conventional bases (e.g. octal or hexadecimal) you can use strtol() to convert and check for an error condition. If you are using arbitrary bases, e.g. base 99, there may not be an out-of-the-box solution.
The advantage of isdigit is that it is usually a macro that expands at compile time. There is also isxdigit for hexadecimal digits.
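A minimal sketch of the strtol() approach, assuming you want to validate a whole string rather than a single character (the wrapper name parse_in_base is made up; strtol itself accepts bases 2 through 36, plus 0 for auto-detection):

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Return true when the whole string s is a valid number in the given
   base (2..36) that fits in a long; the value is stored in *out. */
bool parse_in_base(const char *s, int base, long *out)
{
    char *end;
    long value;

    errno = 0;
    value = strtol(s, &end, base);
    if (end == s) {
        return false;   /* no digits were consumed */
    }
    if (*end != '\0') {
        return false;   /* trailing characters that are not digits in this base */
    }
    if (errno == ERANGE) {
        return false;   /* value does not fit in a long */
    }
    *out = value;
    return true;
}
```

Keep in mind that strtol() also accepts leading whitespace, an optional sign, both upper and lowercase letters, and a 0x prefix in base 16, so this is more permissive than a strict per-character digit check.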
If you'd want to do the same for your own convention of digits you could go for an inline function that would be almost as good:
inline
bool isdigit42(char c) {
switch (c) {
default: return false;
case '0': return true;
case '1': return true;
.
.
}
}
Your compiler knows best which cases can be merged because the characters fall in a common range of values. And if the function is called with a compile-time constant character, the call should be optimized out completely.