Find spaces and alphanumeric characters in a string C Language


Hi, I'm trying to build a function in C that checks whether a string contains only digits, uppercase letters, lowercase letters and spaces. If the string contains any other character, the function should return -1.
float avg_word_len(char* str)
{
float check;
for (int i = 0; i < strlen(str); i++)
{
if (((str[i] >= '0') && (str[i] <= '9')&&(str[i] >= 'a') && (str[i] <= 'z') && (str[i] == ' ')&& (str[i] >= 'A') && (str[i] <= 'Z')))
check = 1;
else
check = -1;
}
str = '\0';
return check;
}
That's my code, but the function keeps returning -1. Please help.

Some of your && must be replaced by ||, because a character is a digit OR a lowercase letter OR a space OR an uppercase letter, but it cannot be all of these at the same time:
check = 1;
for (int i = 0; i < strlen(str); i++)
{
if (! (((str[i] >= '0') && (str[i] <= '9')) ||
((str[i] >= 'a') && (str[i] <= 'z')) ||
(str[i] == ' ') ||
((str[i] >= 'A') && (str[i] <= 'Z')))) {
check = -1;
break;
}
}

You can use these three functions, which are declared in the header <ctype.h>:
isalpha : Checks if a character is alphabetic (upper or lower case).
isdigit : Checks if a character is a decimal digit.
isblank : Checks whether a character is blank (a space or a tab).
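#include <ctype.h>
#include <stdio.h>
#include <string.h>
float avg_word_len(char* string)
{
    float check = 1;
    for (int i = 0; string[i] != '\0'; i++)
    {
        /* the <ctype.h> functions expect an unsigned char value */
        unsigned char c = (unsigned char)string[i];
        if (!(isalpha(c) || isdigit(c) || isblank(c)))
        {
            check = -1; /* found a character outside the allowed set */
            break;
        }
    }
    return check;
}
int main(void)
{
    char string[150];
    printf("Give me a text: ");
    /* fgets reads the whole line, spaces included */
    if (fgets(string, sizeof string, stdin) == NULL)
        return 1;
    string[strcspn(string, "\n")] = '\0'; /* drop the trailing newline */
    printf("\n%.f\n", avg_word_len(string));
    return 0;
}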

As Weather Vane commented, a lot of those &&s should be ||s. Additionally, parentheses should surround each range, e.g. ('0' <= str[i] && str[i] <= '9').
To check whether the character is in a range, we use AND (i.e. the character is above the lower bound AND below the upper bound). To check whether the character is in one of multiple ranges, we use OR (i.e. the character is in range 1 OR it is in range 2 OR...).
If we were to only fix that, here's how the if condition would look:
(str[i] >= '0' && str[i] <= '9') || (str[i] >= 'a' && str[i] <= 'z') || (str[i] == ' ') || (str[i] >= 'A' && str[i] <= 'Z')
Having said that, I would suggest using the function isalnum from ctype.h in the standard library, which checks if a character is alphanumeric. It makes the code much simpler and avoids the assumption that characters are ordered in such a way that all lowercase letters lie between 'a' and 'z' (which is true in ASCII, which is what is used in practice, but is not guaranteed by the C standard).
In addition, I would suggest initializing check to 1 and breaking out of the loop as soon as you find a non-alphanumeric, non-space character, so that a -1 is not later overwritten by a 1 (as would happen in the current version of your code).
This is what it would look like:
float check = 1;
for (int i = 0; i < strlen(str); i++)
{
    if (!isalnum((unsigned char)str[i]) && str[i] != ' ') {
        check = -1;
        break;
    }
}
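Putting these suggestions together, a complete function might look like the sketch below. The name check_string is only illustrative (it is not from the question), it returns int rather than float, and the cast to unsigned char is an addition: the <ctype.h> functions expect a value representable as unsigned char.

```c
#include <ctype.h>
#include <stddef.h>

/* Returns 1 if str holds only letters, digits and spaces, -1 otherwise.
   check_string is a hypothetical name used for this sketch. */
int check_string(const char *str)
{
    for (size_t i = 0; str[i] != '\0'; i++) {
        unsigned char c = (unsigned char)str[i];
        if (!isalnum(c) && c != ' ')
            return -1;   /* the first disallowed character decides the result */
    }
    return 1;            /* every character passed (an empty string also gives 1) */
}
```

Returning directly from the loop avoids both the flag variable and the repeated strlen call.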

Related

What does str[i - 1] == ' ' mean?

I've been reviewing a program that capitalises the first letter of every word in a string. For example, "every single day" becomes "Every Single Day".
I don't understand the part str[i - 1] == ' '. What does that do?
#include <stdio.h>
char *ft_strcapitalize(char *str)
{
int i;
i = 0;
while (str[i] != '\0')
{
if ((i == 0 || str[i - 1] == ' ') &&
(str[i] <= 'z' && str[i] >= 'a'))
{
str[i] -= 32;
}
else if (!(i == 0 || str[i - 1] == ' ') &&
(str[i] >= 'A' && str[i] <= 'Z'))
{
str[i] += 32;
}
i++;
}
return (str);
}
int main(void)
{
char str[] = "asdf qWeRtY ZXCV 100TIS";
printf("\n%s", ft_strcapitalize(str));
return (0);
}

i is the index in the string of the current character you are thinking about capitalising (remembering it starts at 0).
i-1 is the index in the string of the previous character to the one you are considering.
str[i-1] is the character in the position previous to the one you are considering.
== ' ' is comparing that character to a space character.
So str[i-1] == ' ' means "Is the character to the left of this one a space?"
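As a small illustration, the word-start test can be isolated in a helper of its own. The name is_word_start is hypothetical and not part of the original program; the sketch assumes words are separated by single spaces.

```c
#include <stddef.h>

/* Returns 1 if index i is the first character of a word in str,
   assuming words are separated by single spaces.
   is_word_start is a hypothetical helper for illustration only. */
int is_word_start(const char *str, size_t i)
{
    /* i == 0 is tested first, so str[i - 1] is never read at index 0 */
    return i == 0 || str[i - 1] == ' ';
}
```

For "every single day" this returns 1 at indices 0, 6 and 13, which are the first letters of the three words.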

It is checking for spaces. More exactly, the condition
(i == 0 || str[i - 1] == ' ')
checks whether we are at the beginning of the string or the previous character was a space, that is, whether a new word starts here.
In the string "every single day", i = 0 is the first 'e' (the start of the string), and in the next case,
i = 6 is the 's' of "single", where str[i - 1] is ' ', marking the start of a new word.

"What does str[i - 1] == ' ' mean?"
' ' is a character constant for the space character (ASCII value 32).
str is a pointer to char in the caller. (In practice, it should point to an array of char holding a string, not just a single char.)
i is a counter.
Note that C allows array notation on pointers. Thus, str[1] is equivalent to *(str + 1).
The [i - 1] in str[i - 1] means that you access the element immediately before str[i].
That element, str[i - 1], is compared to the space character.
The condition evaluates to true if it is a space; otherwise it is false.
Side Notes:
Note that str[i - 1] would be dangerous when i == 0: you would read memory before the start of the array. In your case this is safe, because str[i - 1] == ' ' is only evaluated if i == 0 is false, thanks to the short-circuit evaluation of the logical OR ||.
(i == 0 || str[i - 1] == ' ')
So this case is handled in your code.
str[i] -= 32; is equivalent to str[i] -= 'a' - 'A'; in ASCII. The latter form is more readable because it makes the case conversion explicit.
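The two notes above (array notation on pointers, and 'a' - 'A' instead of 32) can be sketched as follows. Both helper names are hypothetical, and ascii_upper assumes an ASCII-compatible character set in which the lowercase letters are contiguous.

```c
#include <stddef.h>

/* Same test as str[i - 1] == ' ', written with pointer arithmetic.
   Only valid for i >= 1, which is why the original code checks i == 0 first. */
int prev_is_space(const char *str, size_t i)
{
    return *(str + i - 1) == ' ';
}

/* Upper-cases an ASCII lowercase letter without the magic number 32;
   any other character is returned unchanged. */
char ascii_upper(char c)
{
    if (c >= 'a' && c <= 'z')
        c -= 'a' - 'A';
    return c;
}
```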

Here you are comparing str[i-1] with the space character, whose ASCII code is 32.
e.g.
if(str[i-1] == ' ')
{
printf("Hello, I'm space.\n");
}
else
{
printf("You got here, into the false block.\n");
}
Run this snippet: the first branch executes if the comparison is true (it yields 1), and the else branch otherwise. Try it with str[] = "Ryan Mney"; and different values of i, and you will see what is happening.

C provides a number of useful character classification and conversion functions in <ctype.h> (often implemented as macros) that make code both more portable and more readable. Although the sample code you are reviewing does not use them, please consider using them to make your code more portable, more robust, and easier for others to read.
Please use islower/isupper/isalpha and tolower/toupper; they make C string processing easier to read.
islower(ch) - check whether ch is lower case
isupper(ch) - check whether ch is upper case
isalpha(ch) - check whether ch is alphabetic (lower or upper case)
tolower(ch) - convert ch to lower case (if it is alphabetic)
toupper(ch) - convert ch to upper case (if it is alphabetic)
Implementations often provide them as macros as well; see:
What is the macro definition of isupper in C?
The C-language provides the 'for' control statement which provides a nice way to express string processing. Simple indexed loops are often written using 'for' rather than 'while'.
#include <ctype.h>
char*
ft_strcapitalize(char *str)
{
for( int i=0; (str[i] != '\0'); i++ )
{
if ((i == 0 || isspace(str[i - 1])) && islower(str[i]) )
{
str[i] = toupper(str[i]);
}
else if (!(i == 0 || isspace(str[i - 1])) && isupper(str[i]) )
{
str[i] = tolower(str[i]);
}
}
return (str);
}
A slight refactoring makes the code a bit more readable,
char*
ft_strcapitalize(char *str)
{
for( int i=0; (str[i] != '\0'); i++ )
{
if( (i == 0 || isspace(str[i - 1])) )
{
if( islower(str[i]) ) str[i] = toupper(str[i]);
}
else if( !(i == 0 || isspace(str[i - 1])) )
{
if( isupper(str[i]) ) str[i] = tolower(str[i]);
}
}
return(str);
}
Alternately, use isalpha(ch),
char*
ft_strcapitalize(char *str)
{
for( int i=0; (str[i] != '\0'); i++ )
{
if( (i == 0 || isspace(str[i - 1])) )
{
if( isalpha(str[i]) ) str[i] = toupper(str[i]);
}
else if( !(i == 0 || isspace(str[i - 1])) )
{
if( isalpha(str[i]) ) str[i] = tolower(str[i]);
}
}
return(str);
}
Simplify the conditional expression even further, by performing the special case (first character of string) first.
char*
ft_strcapitalize(char *str)
{
if( islower(str[0]) ) str[0] = toupper(str[0]);
for( int i=1; (str[i] != '\0'); i++ )
{
if( isspace(str[i - 1]) )
{
if( islower(str[i]) ) str[i] = toupper(str[i]);
}
else if( !isspace(str[i - 1]) )
{
if( isupper(str[i]) ) str[i] = tolower(str[i]);
}
}
return(str);
}
Again, the alternate isalpha(ch) version,
char*
ft_strcapitalize(char *str)
{
if( isalpha(str[0]) ) str[0] = toupper(str[0]);
for( int i=1; (str[i] != '\0'); i++ )
{
if( isspace(str[i - 1]) )
{
if( isalpha(str[i]) ) str[i] = toupper(str[i]);
}
else if( !isspace(str[i - 1]) )
{
if( isalpha(str[i]) ) str[i] = tolower(str[i]);
}
}
return(str);
}
Even more idiomatic, just use a 'state' flag that indicates whether we should fold to upper or lower case.
char*
ft_strcapitalize(char *str)
{
int first=1;
for( char* p=str; *p; p++ ) {
if( isspace(*p) ) {
first = 1;
}
else if( !isspace(*p) ) {
if( first ) {
if( isalpha(*p) ) *p = toupper(*p);
first = 0;
}
else {
if( isalpha(*p) ) *p = tolower(*p);
}
}
}
return(str);
}
And your main test driver,
int main(void)
{
char str[] = "asdf qWeRtY ZXCV 100TIS";
printf("\n%s", ft_strcapitalize(str));
return (0);
}

' ' is a character constant representing the value of the space character in the execution set. Using ' ' instead of 32 increases both readability and portability to systems where space might not have the same value as in the ASCII character set. (i == 0 || str[i - 1] == ' ') is true if i is the offset of the beginning of a word in a space separated list of words.
It is important to make the code as simple and readable as possible. Using magic constants like 32 is not recommended when a more expressive alternative is easy and cheap. For example, you convert lowercase characters to uppercase with str[i] -= 32: this magic value 32 (again!) happens to be the offset between the lowercase and the uppercase characters. It would be more readable to write:
str[i] -= 'a' - 'A';
Similarly, you wrote the range tests for lower case and upper case in the opposite order: this is error prone and surprising for the reader.
You are also repeating the test for the start of word: testing for lower case only at the start of word and testing for upper case otherwise makes the code simpler.
Finally, using a for loop is more concise and less error prone than the while loop in your function, but I know that the local coding conventions at your school disallow for loops (!).
Here is a modified version:
#include <stdio.h>
char *ft_strcapitalize(char *str) {
size_t i;
i = 0;
while (str[i] != '\0') {
if (i == 0 || str[i - 1] == ' ') {
if (str[i] >= 'a' && str[i] <= 'z') {
str[i] -= 'a' - 'A';
}
} else {
if (str[i] >= 'A' && str[i] <= 'Z') {
str[i] += 'a' - 'A';
}
}
i++;
}
return str;
}
int main(void) {
char str[] = "asdf qWeRtY ZXCV 100TIS";
printf("\n%s", ft_strcapitalize(str));
return 0;
}
Note that the above code still assumes that the letters form two contiguous blocks in the same order from a to z. This assumption holds for the ASCII character set, which is almost universal today, but only partially so for the EBCDIC set still in use in some mainframe systems, where there is a constant offset between cases but the letters from a to z do not form a contiguous block.
A more generic approach would use functions and macros from <ctype.h> to test for whitespace (space and other whitespace characters), character case and to convert case:
#include <ctype.h>
char *ft_strcapitalize(char *str) {
for (size_t i = 0; str[i] != '\0'; i++) {
if (i == 0 || isspace((unsigned char)str[i - 1]))
str[i] = toupper((unsigned char)str[i]);
else
str[i] = tolower((unsigned char)str[i]);
}
return str;
}

reading space character into string with size determined by str_size

I'm trying to make this program such that the user could type any given string of characters, and the program would separate alphanumerical characters from the rest, print them into a second string, and finally print the final result into the screen.
I've already tried using scanf ("%[^\n]%*c", string);, but it doesn't seem to work since the size of the string is not specified beforehand, and is rather defined by STR_SIZE.
char string[STR_SIZE];
printf("please type in a string \n");
scanf("%s", string);
printf("string: \n %s \n", string);
int size = (strlen(string));
char alfanumerico[STR_SIZE];
int count = 0;
int count2 = 0;
while(count <= size)
{
if(string[count] >= '0' && string[count] <= '9')
{
alfanumerico[count2] = string[count];
count2++;
}
if(string[count] >= 'a' && string[count] <= 'z')
{
alfanumerico[count2] = string[count];
count2++;
}
if(string[count] >= 'A' && string[count] <= 'Z')
{
alfanumerico[count2] = string[count];
count2++;
}
if(string[count] ==' ')
{
alfanumerico[count2] = string[count];
count2++;
}
count++;
}
printf("alphanumerical characters typed: \n %s \n", alfanumerico);
Given the user typed a string such as: -=-=[[][][]}}Hello 123 ```//././.
I expect the output to be: Hello 123

scanf("%s", ...) is not the way to go here: %s stops reading at the first whitespace character, so spaces are never stored in the string.
You should use fgets, which lets you limit the input to the size of the buffer it is stored in. So something like:
fgets(string, STR_SIZE, stdin);
should work. Note that fgets also stores the trailing newline if it fits.
About the size: you need some limit on the maximum length of the string, and STR_SIZE should be set to that limit. It should come from your program requirements, or just be a size that makes sense if you are setting the requirements yourself. It must be defined before you read input from the user, because the buffer is allocated before anything is read into it.
A comment about style, unrelated to your question - always try to decrease code duplication to 0. The line alfanumerico[count2] = string[count]; count2++; appears 4 times in your code. A more elegant minimal if statement with exactly the same functionality would be:
if ((string[count] >= '0' && string[count] <= '9') ||
(string[count] >= 'a' && string[count] <= 'z') ||
(string[count] >= 'A' && string[count] <= 'Z') ||
(string[count] == ' '))
{
alfanumerico[count2] = string[count];
count2++;
}
and to be even more minimal:
char c = string[count];
if ((c >= '0' && c <= '9') ||
(c >= 'a' && c <= 'z') ||
(c >= 'A' && c <= 'Z') ||
(c == ' '))
{
alfanumerico[count2] = c;
count2++;
}
It's also more readable and more maintainable - if you want to change the variable count to i you do it in one place instead of 8.
Also, always close a scope in a new line.
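The two suggestions (read with fgets, filter with a single condition) can be combined roughly as in the sketch below. The helper names read_line and keep_alnum_and_spaces are hypothetical, and isalnum from <ctype.h> is swapped in for the explicit range tests. The sketch also writes the terminating '\0' into the output, which the loop in the question never does.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Reads one line (spaces included) into buf; returns 0 on EOF or error.
   read_line is a hypothetical helper name. */
int read_line(char *buf, int size, FILE *in)
{
    if (fgets(buf, size, in) == NULL)
        return 0;
    buf[strcspn(buf, "\n")] = '\0';   /* fgets keeps the newline; remove it */
    return 1;
}

/* Copies letters, digits and spaces from src to dst and terminates dst.
   dst must have room for dst_size bytes; longer input is truncated. */
void keep_alnum_and_spaces(const char *src, char *dst, size_t dst_size)
{
    size_t j = 0;
    if (dst_size == 0)
        return;
    for (size_t i = 0; src[i] != '\0' && j + 1 < dst_size; i++) {
        unsigned char c = (unsigned char)src[i];
        if (isalnum(c) || c == ' ')
            dst[j++] = (char)c;
    }
    dst[j] = '\0';
}
```

In the question's program this would be used as read_line(string, STR_SIZE, stdin) followed by keep_alnum_and_spaces(string, alfanumerico, STR_SIZE), after which alfanumerico can be printed with %s.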

Counting the number of characters that are not in an array

In this method I am counting the type of characters that are in a data file. It successfully counts the number of character A-Z (Uppercase), a-z (Lowercase), and any digit, I also need it to count if there are any other type of character besides the ones already counted. Everything I have tried has counted all of the characters, none of the characters, or only a select few.
Thanks
public void countChars (){
String currentWord;
for(int pass = 0; pass < numberOfTokens; pass++){
currentWord = words[pass];
for (int i = 0; i < currentWord.length(); i++){
char ch = currentWord.charAt(i);
if (ch >= 'A' && ch <= 'Z'){
numberOfUpperCase++;
}
if (ch >= 'a' && ch <= 'z'){
numberOfLowerCase++;
}
if (ch >= '0' && ch <= '9'){
numberOfDigits++;
}
}
}
}//end of countChars

You should check the ASCII values, it will be easier: use (int) my_char
and check whether its value is in 0-47, 58-64, 91-96 or 123-127. Refer to this table to understand why: ASCII VALUES
This is basically what you are already doing when you write if(ch >= 'a' && ch <= 'z'). The following code should be enough to solve your issue.
char my_char = '#';
int ascii_value = (int) my_char;
System.out.println("ASCII value of " + my_char + " is " + ascii_value);
if((ascii_value >= 0 && ascii_value <= 47) || (ascii_value >= 58 && ascii_value <= 64) || (ascii_value >= 91 && ascii_value <= 96) || (ascii_value >= 123 && ascii_value <= 127)){
System.out.println("Your character is a symbol!");
}
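The question's code is Java, but the range test is the same in any language. For consistency with the other examples here, a minimal sketch in C follows; the name is_ascii_symbol is hypothetical, and the test assumes ASCII codes.

```c
/* Returns 1 if ch is an ASCII character that is neither a letter nor a digit.
   is_ascii_symbol is a hypothetical name; the ranges assume ASCII. */
int is_ascii_symbol(int ch)
{
    return (ch >= 0 && ch <= 47)      /* control characters, space, ! through / */
        || (ch >= 58 && ch <= 64)     /* : through @ */
        || (ch >= 91 && ch <= 96)     /* [ through the backquote */
        || (ch >= 123 && ch <= 127);  /* { through DEL */
}
```

In the counting loop, a character for which this returns 1 would increment a separate "other" counter.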

Function to check for alphabetic characters

I created a function to check whether the user typed a real name, rejecting all non-alphabetic characters.
From my side, as a beginner in C, it works fine.
I have just one small problem with the string name: if there is a space inside the string I get "Wrong name", but if there is only one name (michi) everything is OK.
#include <stdio.h>
#include<string.h>
/* Here is the Function which check if the string contains only: */
/* abcdefghijklmnopqrstuvwxyz and ABCDEFGHIJKLMNOPQRSTUVWXYZ */
int checkName(char *s){
int i,length;
length = (strlen(s));
for (i=0;i<length;i++){
if(s[i] == '0' || s[i] <= '9'){
return 1;
}
}
return 0;
}
int main(){
char name[]= "Michi";
int check;
if((check = checkName(name)) == 0){
printf("\n\n\t\t\tYour name is:\t%s\n\n",name);
}else{
printf("\n\n\t\t\tWrong name:\t%s\n\n",name);
}
return 0;
}
My questions are:
1)
Did I find the right way of checking whether a string contains any non-alphabetic characters?
2)
How can I extend my function to skip spaces?

Take a look at isalpha in ctype.h. This returns true if a char is a letter, just like what you want.
http://www.cplusplus.com/reference/cctype/isalpha/
By the way, if you're checking ASCII encodings, your function fails for characters such as '(' or '~'.

Here is the Function which check if the string contains only:
abcdefghijklmnopqrstuvwxyz and ABCDEFGHIJKLMNOPQRSTUVWXYZ
Looking at the code below that statement, you're lying. What your code does is check whether the string contains the character '0' or any character whose code is at or below that of '9' (which, in ASCII, includes the space). Better do what you're saying:
if((str[i] >= 'a' && str[i] <= 'z') ||
(str[i] >= 'A' && str[i] <= 'Z') ||
(str[i] == ' ')) {
// fine ..
}
else {
// not fine!
}
As you see I added the space to the set of allowed characters. To get rid of the if branch just negate the whole test expression (either by hand or using the not operator !).
The comparisons are working this way because of the layout of the ASCII table.
Note that there's a library function for this: isalpha

If you have a valid set, test against this set, not other sets that might or might not be the complement set (So many sets in one sentence :-):
for (i=0; i<length; i++) {
    int valid = 0;
    /* a character is valid if it belongs to ANY of the allowed sets */
    valid |= s[i] >= 'a' && s[i] <= 'z';
    valid |= s[i] >= 'A' && s[i] <= 'Z';
    valid |= s[i] == ' ';
    if (!valid) {
        return 0; // or any value you prefer to indicate "not valid"
    }
}

If you want to allow only alphabetic characters and spaces, you can use isalpha and isspace from ctype.h. These functions return non-zero for true and zero for false.
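A minimal sketch of that suggestion, keeping the return convention of checkName from the question (0 for a valid name, 1 otherwise). The const qualifier and the cast to unsigned char are additions, and isspace also accepts tabs and newlines, not just the space character.

```c
#include <ctype.h>
#include <stddef.h>

/* Returns 0 if s contains only letters and whitespace, 1 otherwise
   (same return convention as checkName in the question). */
int checkName(const char *s)
{
    for (size_t i = 0; s[i] != '\0'; i++) {
        unsigned char c = (unsigned char)s[i];
        if (!isalpha(c) && !isspace(c))
            return 1;   /* digit, punctuation or other symbol found */
    }
    return 0;
}
```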

You can just continue the loop if the character is a space:
for (i=0;i<length;i++){
if(s[i] == ' '){
continue;
}
else if(s[i] >= '0' && s[i] <= '9'){
return 1;
}
}
Furthermore, you could also make sure it contains nothing but alphabetic characters, by checking that no character is outside the range of accepted characters:
for (i=0;i<length;i++){
if((s[i] < 'A') || (s[i] > 'Z' && s[i] < 'a') || (s[i] > 'z')){
return 1;
}
}
Note: the ASCII table is a nice "tool" to confirm the range you have to check.

Char as a decimal separator in C

I've written a function that extracts double from a string. Like asfas123123afaf to 123123 or afafas12312.23131asfa to 12312.23131 using the point as a decimal separator.
Here is the code:
double get_double(const char *str, char sep)
{
char str_dbl[80];
size_t i,j;
char minus;
double dbl;
for (minus = 1, i = j = 0; i < strlen(str); ++i)
{
if (
(str[i] == '-' && minus)
|| (str[i] >= '0' && str[i] <= '9')
|| (str[i] == 'e' || str[i] == 'E')
)
{
str_dbl[j++] = str[i];
minus = 0;
}
}
str_dbl[j] = '\0';
dbl = strtod (str_dbl,NULL);
return dbl;
}
But now I want to support a user-defined decimal separator (char sep) taken from the ASCII characters (other than E or e, which denote the exponent, i.e. a power of 10). How could I implement it?
Let me specify this: say the separator is ',' and the string is 123123asfsaf,adsd,as.1231; it should return 123123,1231 as a double. It should recognize the first ',' (from the left) and ignore all others.
It is really hard for me to find a solution for this problem. I have thought about setlocale, but it doesn't seem to be the best solution.
Thank you!

You can just replace any , with . before doing the strtod.
If you for some reason don't want to modify the source string, copy it to a new string first.

Well, I know that String.Split does this sort of thing, but I think you should write your own function that stops when it finds the first separator.
Here is the MSDN page for more help on that:
MSDN Split

You can simply extend your code:
if (
(str[i] == '-' && minus)
|| (str[i] >= '0' && str[i] <= '9')
|| (str[i] == 'e' || str[i] == 'E')
)
to
char separator = ','; //or whatever you want
int have_separator = 0;
if (
(str[i] == '-' && minus)
|| (str[i] >= '0' && str[i] <= '9')
|| (str[i] == 'e' || str[i] == 'E')
|| str[i] == separator
)
{
if (str[i] == separator && have_separator == 0)
{
have_separator = 1;
str_dbl[j++] = str[i];
continue;
}
...
Please note that this is only a sketch to show the idea, not real working code (though it might work as is). You can use a similar approach.
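One way the idea could be completed is sketched below, under a few assumptions that are not in the original answers: the function is renamed get_double_sep (a hypothetical name), the first separator is written into the buffer as '.' so that strtod can parse it in the default "C" locale, and the exponent letters e/E are left out for simplicity.

```c
#include <stdlib.h>

/* Sketch: keeps a leading '-' (only before any digit or separator), all
   digits, and the first occurrence of sep, which is stored as '.'.
   get_double_sep is a hypothetical name; exponents are not handled. */
double get_double_sep(const char *str, char sep)
{
    char buf[80];
    size_t j = 0;
    int minus_allowed = 1;   /* '-' is accepted only as the first kept character */
    int have_sep = 0;        /* set once the first separator has been copied */

    for (size_t i = 0; str[i] != '\0' && j + 1 < sizeof buf; i++) {
        char c = str[i];
        if (c == '-' && minus_allowed) {
            buf[j++] = c;
            minus_allowed = 0;
        } else if (c >= '0' && c <= '9') {
            buf[j++] = c;
            minus_allowed = 0;
        } else if (c == sep && !have_sep) {
            buf[j++] = '.';  /* strtod expects '.' in the "C" locale */
            have_sep = 1;
            minus_allowed = 0;
        }
        /* every other character, including later separators, is skipped */
    }
    buf[j] = '\0';
    return strtod(buf, NULL);
}
```

With sep set to ',' the string 123123asfsaf,adsd,as.1231 is reduced to 123123.1231 before conversion; the '.' in the input is skipped because it is not the chosen separator.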