Find spaces and alphanumeric characters in a string

I need to develop a function that goes through a character string and detects letters (lower and upper case), digits 0-9 and spaces ' '. If the function finds only valid characters (those listed above) it returns 1; otherwise (if the string contains characters like !, &, /, £, etc.) it returns 0. I am aware of isalnum(), which detects letters and digits, but it does not help with spaces. Can anyone provide a built-in or hand-written function that detects letters, digits and spaces all together?
I've developed mine as shown below, but it does not detect an invalid character (!, &, /, £, etc.) in the middle of the string and therefore does not return the value I expect.
for (i=0; i<strlen(str); i++) {
if ((str[i]>='A' && str[i]<='Z') || str[i] == ' ' || (str[i]>='a' && str[i]<='z') || (str[i]>='0' && str[i]<='9'))
for (i=0; i<strlen(str); i++) {
char *p = str;
while (*p) {
if (isalnum((unsigned char) *p) || *p == ' ') {
res =1;
} else {
res = 0;
}
p++;
}
}

You can make the code more succinct:
#include <ctype.h>

int Validate_Alphanumeric(char *str)
{
    unsigned char *ptr = (unsigned char *)str;
    unsigned char uc;
    /* stop at the terminator or at the first invalid character */
    while ((uc = *ptr++) != '\0')
    {
        if (!isalnum(uc) && uc != ' ')
            return 0;
    }
    return 1;
}
Amongst other things, this avoids re-evaluating strlen(str) on each iteration of the loop. Re-evaluating it nominally makes the algorithm quadratic: strlen() is an O(N) operation and you would do it N times, for O(N^2) in total. Either cache the result of strlen(str) in a variable or don't use it at all. Using strlen(str) requires the entire string to be scanned, whereas the code above stops at the first punctuation or other verboten character without scanning the whole string (the worst case, for valid strings, is still O(N)).
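The loop in the question overwrites its result flag on every iteration, so only the last character decides the return value. That is why an invalid character in the middle of the string goes unnoticed. The following is a minimal sketch contrasting that pattern with the early-exit pattern used above. Both function names are hypothetical and are not part of the original code.

```c
#include <ctype.h>

/* Pattern from the question: the flag is overwritten on every
   iteration, so only the last character decides the result. */
int validate_last_wins(const char *str)
{
    int res = 1;
    for (const char *p = str; *p != '\0'; p++) {
        if (isalnum((unsigned char)*p) || *p == ' ')
            res = 1;
        else
            res = 0;
    }
    return res;
}

/* Early-exit pattern: the first invalid character settles the answer. */
int validate_early_exit(const char *str)
{
    for (const char *p = str; *p != '\0'; p++) {
        if (!isalnum((unsigned char)*p) && *p != ' ')
            return 0;
    }
    return 1;
}
```

For the input "ab!cd", validate_last_wins returns 1 (the trailing 'd' resets the flag) while validate_early_exit returns 0.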

I came up with a function that goes through the string and that is able to return 0 if an invalid character (ex. $&$&&(%$(=()/)&)/) is found.
int Validate_Alphanumeric (char str[]) {
int i;
int res;
int valid=0;
int invalid=0;
const char *p = str;
while (*p) {
if (isalnum((unsigned char) *p) || *p == ' ') {
valid++;
} else {
invalid++;
}
p++;
}
if (invalid==0)
res=1;
else
res=0;
return res;
}

Related

camelCase function in C, unable to remove duplicate chars after converting to uppercase

void camelCase(char* word)
{
/*Convert to camelCase*/
int sLength = stringLength(word);
int i,j;
for (int i = 0; i < sLength; i++){
if (word[i] == 32)
word[i] = '_';
}
//remove staring char '_',*,numbers,$ from starting
for (i = 0; i < sLength; i++){
if (word[i] == '_'){
word[i] = toUpperCase(word[i + 1]);
}
else
word[i] = toLowerCase(word[i]);
}
word[0] = toLowerCase(word[0]);
//remove any special chars if any in the string
for(i = 0; word[i] != '\0'; ++i)
{
while (!((word[i] >= 'a' && word[i] <= 'z') || (word[i] >= 'A' && word[i] <= 'Z') || word[i] == '\0') )
{
for(j = i; word[j] != '\0'; ++j)
{
word[j] = word[j+1];
}
word[j] = '\0';
}
}
}
int main()
{
char *wordArray;
wordArray = (char*)malloc(sizeof(char)*100);
// Read the string from the keyboard
printf("Enter word: ");
scanf("%s", wordArray);
// Call camelCase
camelCase(wordArray);
// Print the new string
printf("%s\n", wordArray);
return 0;
}
I am writing a function that takes input such as _random__word_provided. It should remove any underscores or special characters, capitalize the first letter after an underscore, and print the word without any underscores. The example above should come out as randomWordProvided.
When I run my code, though, I get rrandomWwordPprovided. I am unsure where my loop goes wrong. Any guidance would be appreciated. Thank you!

You are WAY over-processing the string...
First measure the length. Why? You can find the '\0' eventually.
Then convert ' 's to underscores (don't use magic numbers in code).
Then force almost everything to lowercase.
Then try to "strip out" non-alphas, cajoling the next character to uppercase.
(The non-alpha '_' has already been replaced with an uppercase version of the next character... This is causing the "thewWho" duplication to remain in the string. There's no indication of '$' being addressed as per your comments.)
It seems the code is traversing the string 4 times, and the state of the string is in flux, leading to hard-to-understand intermediate states.
Process from beginning to end in one pass, doing the right thing all the way along.
#include <stdio.h>
#include <ctype.h>

char *camelCase( char word[] ) { // return something usable by the caller
    int s = 0, d = 0; // 's'ource index, 'd'estination index
    // one sweep along the entire length
    while( ( word[d] = word[s] ) != '\0' ) {
        if( isalpha( word[d] ) ) { // make ordinary letters lowercase
            word[ d ] = tolower( word[ d ] );
            d++, s++;
            continue;
        }
        // special handling for non-alpha. may be more than one!
        while( word[s] && !isalpha( word[s] ) ) s++;
        // end of non-alpha? copy alpha as UPPERCASE
        if( word[s] )
            word[d++] = toupper( word[s++] );
    }
    // make first character lowercase
    word[ 0 ] = tolower( word[ 0 ] );
    return word; // return modified string
}

int main( void ) {
    // multiple test cases. Add "user input" after algorithm developed and tested.
    // writable arrays, not pointers to string literals: camelCase() modifies its argument in place
    char wordArray[][32] = {
        "_random__word_provided",
        " the quick brown fox ",
        "stuff happens all the time",
    };
    for( int i = 0; i < 3; i++ )
        puts( camelCase( wordArray[i] ) );
    return 0;
}
randomWordProvided
theQuickBrownFox
stuffHappensAllTheTime
Comments may point out that the ctype.h functions take an int whose value must be representable as an unsigned char (or be EOF). Casting the argument to unsigned char is an elaboration that you can, and should, add to the code if you ever expect to encounter anything other than 7-bit ASCII characters.
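As a small illustration of the cast mentioned above, this sketch wraps tolower and toupper so that a plain char is converted to unsigned char before the call. The helper names lower_char and upper_char are hypothetical and do not appear in the answer's code.

```c
#include <ctype.h>

/* Hypothetical helpers: go through unsigned char so that a negative
   char value is never passed to the <ctype.h> functions. */
char lower_char(char c)
{
    return (char)tolower((unsigned char)c);
}

char upper_char(char c)
{
    return (char)toupper((unsigned char)c);
}
```

Characters that are not letters, such as '_', pass through both helpers unchanged.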

In my opinion, there's a very simple algorithm that just requires you to remember the last character parsed only:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
void camelCase(char* source)
{
/*Convert to camelCase*/
char last = '_',
*dest = source;
/* while we are not at the string end copy the char values */
while ((*dest = *source++) != '\0') {
/* if the char is a lower case letter and the previous was a '_' char. */
if (islower(*dest) && last == '_')
*dest = toupper(*dest);
/* update the last character */
last = *dest;
/* to skip on the underscores */
if (*dest != '_') dest++;
}
} /* camelCase */
int main()
{
char wordArray[100]; /* better use a simple array */
// Read the string from the keyboard
printf("Enter identifiers separated by spaces/newlines: ");
/* for each line of input */
while (fgets(wordArray, sizeof wordArray, stdin)) {
for ( char *word = strtok(wordArray, " \t\n");
word;
word = strtok(NULL, " \t\n"))
{
printf("%s -> ", word);
// Call camelCase
camelCase(word);
// Print the new string
printf("%s\n", word);
}
}
return 0;
}
If you don't want the first character to be converted to uppercase, initialize last with a different char (e.g. '\0').

convert a string containing a base 10 number to an integer value

I am fairly new to programming and I am trying to convert a string containing a base 10 number to an integer value, following this pseudo-algorithm in C.
start with n = 0
read a character from the string and call it c
if the value of c is between '0' and '9' (48 and 57):
n = n * 10 +(c-'0')
read the next character from the string and repeat
else return n
Here are the rough basics of what I wrote down; however, I am not clear on how to read a character from the string. I guess I'm asking whether I understand the pseudocode correctly.
stoi(char *string){
int n = 0;
int i;
char c;
for (i = 0;i < n ; i++){
if (c[i] <= '9' && c[i] >= '0'){
n = n *10 +(c - '0')}
else{
return n
}
}
}

You were close, you just need to traverse the string to get the value of each digit.
Basically you have two ways to do it.
Using array notation:
int stoi(const char *str)
{
int n = 0;
for (int i = 0; str[i] != '\0'; i++)
{
char c = str[i];
if ((c >= '0') && (c <= '9'))
{
n = n * 10 + (c - '0');
}
else
{
break;
}
}
return n;
}
or using pointer arithmetic:
int stoi(const char *str)
{
int n = 0;
while (*str != '\0')
{
char c = *str;
if ((c >= '0') && (c <= '9'))
{
n = n * 10 + (c - '0');
}
else
{
break;
}
str++;
}
return n;
}
Note that in both cases we iterate until the null character '\0' (which is the one that marks the end of the string) is found.
Also, prefer const char *string over char *string when the function doesn't need to modify the string (like in this case).
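One thing the versions above do not guard against is a digit run too long for an int: n * 10 + (c - '0') then overflows, which is undefined behavior for signed integers in C. The following is a minimal sketch of the same loop with an overflow check added. The name stoi_checked and the overflow out-parameter are assumptions made for illustration, not part of the answer.

```c
#include <limits.h>

/* Hypothetical variant: same digit loop, but stops before n * 10 + digit
   would exceed INT_MAX and reports that through *overflow. */
int stoi_checked(const char *str, int *overflow)
{
    int n = 0;
    *overflow = 0;
    for (; *str >= '0' && *str <= '9'; str++) {
        int digit = *str - '0';
        /* n * 10 + digit <= INT_MAX  is equivalent to
           n <= (INT_MAX - digit) / 10 */
        if (n > (INT_MAX - digit) / 10) {
            *overflow = 1;
            break;
        }
        n = n * 10 + digit;
    }
    return n;
}
```

As before, parsing stops at the first non-digit, so "12ab" yields 12 with the overflow flag left at 0.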

Congrats on starting your C journey!
One of the most important aspects of strings in C is that, technically, there are none. C has no built-in string type like Java's String class. You CAN'T do:
String myString = "Hello";
In C, each string is just an array of multiple characters. That means the word Hello is just the array of [H,e,l,l,o,\0]. Here, the \0 indicates the end of the word. This means you can easily access any character in a string by using indexes (like in a normal array):
char *myString = "Hello";
printf("%c", myString[0]); //Here %c indicates to print a character
This will print H, since H is the first character in the string. I hope you can see how you can access any character in the string.

Not initialised and memory errors (strings and pointers)

Here is what I need to do: delete all occurrences of a number that appears most frequently in a given string
Here is what I've done: wrote two functions; the second one extracts all integers from a string into an array, finds the most frequently repeated one, calls the first function to find that number in a string, deletes all its occurrences in a given string
The problem is it works alright when I compile it, but doesn't pass the series of auto-generated tests and displays "access to an uninitialised value" and "memory error" in lines I marked with <------.
I know this is not exactly the "minimum reproducible code" but I'm hoping someone could point out what the problem is, as I run into a lot of similar errors when working with pointers.
char* find_number(char* string,int search)
{
int sign=1;
int number=0,temp=0;
char* p = string;
while(*string != '\0') {<----------
p=string;
if(*string=='-') sign=-1;
else if(*string==' ') {
string++;
continue;
} else if(*string>='0' && *string<='9') {
temp=0;
while(*string != '\0' && *string>='0' && *string<='9') {
temp=temp*10+*string-'0';
string++;
}
number=temp*sign;
if(number==search) {
return p;
}
} else {
sign=1,number=0;
}
string++;
}
return NULL;
}
char* delete_most_frequent(char* string)
{
//writing all integers in a string to an array
char* pointer=string;
char* s = string;
int temp=0, sign = 1,i=0,array[1000],number=0,counters[1001]= {0},n=0;
while (*s != '\0') {<------------
if (*s == '-') sign = -1;<----------
else if (*s >= '0' && *s <='9') {<----------
temp = 0;
while (*s != '\0' && *s >= '0' && *s <= '9') {
temp = temp * 10 + *s - '0';
s++;
}
number=sign*temp;
if(number>=0 && number<=1000) {
array[i]=number;
i++;
}
}
number=0;
sign=1;
s++;
}
n=i;//size of the array
//finding the number that occurs most frequently
int max=0;
for (i=0; i<n; i++) {
counters[array[i]]++;
if(counters[array[i]]>counters[max]) {
max=array[i];
}
}
char* p=find_number(string,max);//pointer to the first digit of wanted number
//deleting the integer
while (*string != '\0') {
if (p != NULL) {
char *beginning = p, *end = p;
while(*end>='0' && *end<='9')
end++;
//while (*beginning++ = *end++);
while(*end != '\0'){
*beginning = *end;
beginning++;
end++;
}
*beginning = '\0';
} else string++;
p=find_number(string,max);
}
return pointer;//pointer to the first character of a string
}
int main()
{
char s[] = "abc 14 0, 389aaa 14! 15 1, 153";
printf("'%s'", delete_most_frequent(s));
return 0;
}

Chopping your code down to just what is likely causing the problem, you have a pattern that looks like
while(*string != '\0') {
:
while(*string != '\0' ...) {
:
string++;
}
:
string++;
}
so you have two nested while loops, both of which are advancing the pointer looking for a NUL terminator to end the loop. The problem is that if the inner loop gets all the way to the NUL (it might stop earlier, but it might not), then the increment in the outer loop will increment the pointer past the NUL. It will then happily run through (probably invalid) memory looking for another NUL that might not exist. This is a hard one to catch as in most test cases you write, there are likely multiple NULs (soon) after the string, so it will appear to work fine -- you almost have to specifically write a test case to trigger this failure mode to catch this.
One fix would be to check you're not yet at the null before incrementing -- if (*string) string++; instead of just string++;
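The guarded increment can be sketched on a smaller problem with the same nested-loop shape. The function below only counts the runs of digits in a string. Its name, count_numbers, is hypothetical and it is not a replacement for the code in the question.

```c
/* Hypothetical example: counts the runs of decimal digits in s. */
int count_numbers(const char *s)
{
    int count = 0;
    while (*s != '\0') {
        if (*s >= '0' && *s <= '9') {
            count++;
            /* the inner loop may stop exactly on the terminator */
            while (*s >= '0' && *s <= '9')
                s++;
        }
        /* guard: never step past the terminator */
        if (*s != '\0')
            s++;
    }
    return count;
}
```

A string that ends in a digit, such as the "… 153" at the end of the test string in the question, is exactly the case where the unguarded s++ would run past the end.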

If loop in C not iterating properly

I'm writing a word counter in C. I'm counting the number of spaces in the string to determine the number of words. I'm guessing there's something wrong with my if statement. Sometimes it counts the words correctly and other times it throws up seemingly random numbers. For instance, "my red dog" has three words, but "I drink rum" has one word and "this code makes no sense" has three. It prints the length of the strings fine for each.
Here's the part of the code in question:
void WordCounter(char string[])
{ int counter = 1;
int i = 0;
int length = strlen(string);
printf("\nThe length of your string is %d", length);
for( i=0;i<length;i++){
if (string[i] == ' ')
{
counter+=1;
++i;
}
else
{
counter +=0;
++i;
}
}
printf("There are %d words in this sentence and i is equal to: %d", counter, i);
}

The i++ part of the for loop means that i is incremented at every loop, you should not do it again inside the loop. Also, your else is not necessary here. You'll want to remove bits to have:
for( i=0;i<length;i++) {
if (string[i] == ' ')
{
counter+=1;
}
}

The biggest problem with your posted code is that it incorrectly increments i where it should not.
for( i=0;i<length;i++){
if (string[i] == ' ')
{
counter+=1;
++i; // here
}
else
{
counter +=0;
++i; // here
}
}
Neither of the //here lines above are needed for what you appear to be trying to do. Furthermore, the entire else-block is pointless, as it modifies nothing except i, which it shouldn't be doing. Therefore, a more correct approach would be simply:
for(i=0; i<length; ++i)
{
if (string[i] == ' ')
++counter;
}
This increments counter whenever you index a space ' ' character. For a trivial algorithm, this will probably suffice for what you were trying to do.
Counting Spaces, Not Words
Your algorithm really doesn't count words; it simply counts the number of space characters encountered, plus one. This means some inputs, such as those below (quotes used to delimit the string content, not actually present), will not return an accurate word count:
// should be zero, but returns one
""
// should be zero, but returns four (three spaces)
"   "
// should be one, but returns five (two spaces, either side)
"  word  "
// should be two, but returns three (two spaces between)
"word  word"
etc.
A more robust algorithm is required. Note this does NOT solve everything, but it makes great leaps in getting you closer to counting what we call "words": runs of non-whitespace characters separated by whitespace characters and potentially butting up against the end of the string.
This uses pointers rather than indexes. In my opinion it is simply easier to read, and declutters the code from indexing syntax, thereby amplifying what is really going on: consuming whitespace, then consuming non-whitespace that we call a "word". Comments inline should explain what is going on:
#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>
int WordCounter(const char *s)
{
int counter = 0;
// while we haven't reached terminating nullchar
while (*s)
{
// discard all whitespace until nullchar or non-whitespace found
for (; *s && isspace((unsigned char)*s); ++s);
// if not nullchar, we have the start of a "word"
if (*s)
{
++counter;
// discard all text until next whitespace or nullchar
for (; *s && !isspace((unsigned char)*s); ++s);
}
}
return counter;
}
int main()
{
const char *s1 = "There should be eight words in this string";
printf("\"%s\"\nwords: %d\n", s1, WordCounter(s1));
const char *s2 = " There should\t\t\tbe eight\n in\n this\n string too\n ";
printf("\"%s\"\nwords: %d\n", s2, WordCounter(s2));
const char *s3 = "One";
printf("\"%s\"\nwords: %d\n", s3, WordCounter(s3));
const char *s4 = "";
printf("\"%s\"\nwords: %d\n", s4, WordCounter(s4));
return 0;
}
Output
"There should be eight words in this string"
words: 8
" There should be eight
in
this
string too
"
words: 8
"One"
words: 1
""
words: 0
Not Perfect; Just Better
The prior algorithm still isn't perfect. To accurately extract single words would require knowledge of punctuation, potentially hyphenation, etc. But it is much, much closer to what appears to be the goal. Hopefully you get something out of it.

Your code increments i in three different places. Only the increment in the for statement is needed.
Try this code:
for( i=0;i<length;i++){
if (string[i] == ' ')
{
counter++;
}
}

You're incrementing i twice per iteration: once in the for statement and once in the loop body. That means only every other character is checked, which explains why spaces at the skipped positions are not counted.
Counting character positions from 1, only the odd positions are checked:
"my red dog": both spaces are at odd positions (3 and 7), so both are counted.
"I drink rum": both spaces are at even positions (2 and 8), so neither is counted.
Here is your corrected source code:
void WordCounter(char string[]){
int counter = 1;
int i = 0;
int length = strlen(string);
printf("\nThe length of your string is %d", length);
for( i=0;i<length;i++){
if (string[i] == ' ')
{
counter+=1;
}
}
printf("There are %d words in this sentence and i is equal to: %d", counter, i);
}
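To see why the double increment produces exactly the counts reported in the question, here is a small sketch that puts the faulty loop next to the corrected one. The function names count_words_double_step and count_words_single_step are hypothetical. The loop bodies mirror the code above, with the printing removed.

```c
#include <string.h>

/* Reproduces the question's loop: i advances twice per pass,
   so only indices 0, 2, 4, ... are ever examined. */
int count_words_double_step(const char *s)
{
    int counter = 1;
    int length = (int)strlen(s);
    for (int i = 0; i < length; i++) {
        if (s[i] == ' ')
            counter += 1;
        ++i; /* the stray extra increment */
    }
    return counter;
}

/* Corrected loop: every index is examined exactly once. */
int count_words_single_step(const char *s)
{
    int counter = 1;
    int length = (int)strlen(s);
    for (int i = 0; i < length; i++) {
        if (s[i] == ' ')
            counter += 1;
    }
    return counter;
}
```

With these definitions, count_words_double_step("I drink rum") gives 1 because both spaces sit at the skipped indices 1 and 7, while count_words_single_step gives 3.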

You do not even need the i variable
counter = 0;
while(*str)
counter += *str++ == ' ' ? 1 : 0;
printf("There are %d spaces (not words as you may have more spaces between words) in this sentence and length of the string is: %d", counter, length);
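#include <stdio.h>
#include <string.h>
#define MAXWORDLENGTH 64 /* assumed value; its definition is not shown in the original */
int isDelimeter(int ch, const char *delimiters)
{
return strchr(delimiters, ch) != NULL;
}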
const char *skipDelimiters(const char *str, char *delimiters)
{
while (*str && isDelimeter(*str, delimiters)) str++;
return str;
}
const char *skipWord(const char *str, char *delimiters)
{
while (*str && !isDelimeter(*str, delimiters)) str++;
return str;
}
const char *getWord(const char *str, char *buff, char *delimiters)
{
while (*str && !isDelimeter(*str, delimiters)) *buff++ = *str++;
*buff = 0;
return str;
}
int countWords(const char *str, char *delimiters)
{
int count = 0;
while (*str)
{
str = skipDelimiters(str, delimiters);
if (*str) count++;
str = skipWord(str, delimiters);
}
return count;
}
int printWords(const char *str, char *delimiters)
{
char word[MAXWORDLENGTH];
int count = 0;
while (*str)
{
str = skipDelimiters(str, delimiters);
if (*str)
{
count++;
str = getWord(str, word, delimiters);
printf("%d%s word is %s\n", count, count == 1 ? "st" : count == 2 ? "nd" : count == 3 ? "rd" : "th", word);
}
}
return count;
}

remove a character from the string which does not come simultaneously in c

For example, given the string str1 = "120jdvj00ncdnv000ndnv0nvd0nvd0" and the character ch = '0', the output should be 12jdvj00ncdnv000ndnvnvdnvd. That is, the 0 is removed only where it occurs singly.
This code is not working:
#include<stdio.h>
char remove1(char *,char);
int main()
{
char str[100]="1o00trsg50nf0bx0n0nso0000";
char ch='0';
remove1(str,ch);
printf("%s",str);
return 0;
}
char remove1(char* str,char ch)
{
int j,i;
for(i=0,j=0;i<=strlen(str)-1;i++)
{
if(str[i]!=ch)
{
if(str[i+1]==ch)
continue;
else
str[j++]=str[i];
}
}
str[j]='\0';
}

Your code looks for an occurrence of something other than the character to be removed with "if(str[i]!=ch)", then if the next character is the one to be removed it skips (i.e. does not keep the characters it has just seen), otherwise it copies the current character. So if it sees 'a0' and is looking for '0' it will ignore the 'a'.
What you could do is copy all characters other than the one of interest and set a counter to 0 each time you see one of them (for the number of contiguous character of interest you've seen at this point). When you find the one of interest increment that count. Now whenever you find one that is not of interest, you do nothing if the count is 1 (as this is the single character you want to remove), or put that many instances of the interesting character into str if count > 1.
Ensure you deal with the case of the string ending with a contiguous run of the character to be removed, and you should be fine.
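The counting approach described above can be sketched as follows. This is one possible reading of the description, not the answerer's own code. The name remove_single_runs is hypothetical, and ch is assumed not to be '\0'.

```c
#include <stddef.h>

/* Hypothetical name. Removes every isolated occurrence of ch from str,
   in place, and keeps runs of two or more. Assumes ch is not '\0'. */
char *remove_single_runs(char *str, char ch)
{
    char *dst = str;   /* next position to write */
    size_t run = 0;    /* length of the pending run of ch */

    for (const char *src = str; ; src++) {
        if (*src != '\0' && *src == ch) {
            run++;     /* count it now, decide later */
            continue;
        }
        /* A different character (or the end of the string) closes the
           run: write it back only if it was longer than one. */
        if (run > 1) {
            while (run > 0) {
                *dst++ = ch;
                run--;
            }
        }
        run = 0;
        if (*src == '\0')
            break;     /* a run at the very end was handled just above */
        *dst++ = *src;
    }
    *dst = '\0';
    return str;
}
```

Because the pending run is flushed before the end-of-string check, a string that ends in a run of ch is handled without a special case.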

char *remove1(char* str, char ch){
char *d, *s;
for(d = s = str;*s;++s){
if(*s == ch){
if(s[1] == ch)
while(*s == ch)
*d++=*s++;
else
++s;//skip a ch
if(!*s)break;
}
*d++ = *s;
}
*d = '\0';
return str;
}
The basic copy loop:
for(d = s = str;*s;++s){
*d++ = *s;
}
*d = '\0';
The special processing is then added:
for(d = s = str;*s;++s){
if(*s is the specified character){
if it starts a run of more than one such character, copy the whole run
if it occurs singly, skip it
}
*d++ = *s;
}
*d = '\0';

Here is the working code.
Output is: "1o00trsg5nfbxnnso0000"
#include <stdio.h>
#include <string.h>
void remove1(char *, char);
int main(void)
{
    char str[100] = "1o00trsg50nf0bx0n0nso0000";
    char ch = '0';
    remove1(str, ch);
    printf("%s", str);
    return 0;
}
void remove1(char *str, char ch)
{
    size_t len = strlen(str);
    size_t i = 0;
    while (i < len) {
        /* keep the character if it is not ch, or if the previous or next character is also ch */
        if (str[i] != ch || str[i + 1] == ch || (i > 0 && str[i - 1] == ch)) {
            i++;
            continue;
        }
        /* remove the single ch by shifting the tail, including the '\0', one place left */
        memmove(&str[i], &str[i + 1], len - i);
        /* the string is one character shorter after the removal */
        len--;
    }
}
