tempWord[0]='\0' Does not reset String somehow - c

I wrote a program in C. The expected result is:
$ cat poem.txt
Said Hamlet to Ophelia,
I'll draw a sketch of thee,
What kind of pencil shall I use?
2B or not 2B?
$ ./censor Ophelia < poem.txt
Said Hamlet to CENSORED,
I'll draw a sketch of thee,
What kind of pencil shall I use?
2B or not 2B?
But I got this:
$ ./censor Ophelia < poem.txt
Said Hamlet tomlet CENSORED,
I'lllia drawlia arawlia sketcha ofetcha theecha,
Whatcha kindcha ofndcha pencila shallla Ihallla usellla?
2Bsellla orellla notllla 2Botllla?
I use tempWord to store each word and compare it with the word that needs to be censored. Then I use tempWord[0]='\0' to reset the temp string so that I can do another comparison, but it does not seem to work. Can anyone help?
# include <stdio.h>
# include <string.h>
int compareWord(char *list1, char *list2);
int printWord(char *list);
int main(int argc, char *argv[]) {
int character = 0;
char tempWord[128];
int count = 0;
while (character != EOF) {
character = getchar();
if ((character <= 'z' && character >= 'a') ||
(character <= 'Z' && character >= 'A') ||
character == 39) {
tempWord[count] = character;
count++;
} else {
if (count != 0 && compareWord(tempWord, argv[1])) {
printf("CENSORED");
count = 0;
tempWord[0] = '\0';
}
if (count != 0 && !compareWord(tempWord, argv[1])) {
printWord(tempWord);
count = 0;
tempWord[0] = '\0';
}
if (count == 0) {
printf("%c", character);
}
}
}
return 0;
}
int printWord(char *list) {
// print function
}
int compareWord(char *list1, char *list2) {
// compareWord function
}

There are multiple issues in your code:
You do not test for end of file at the right spot: if getchar() returns EOF, you should exit the loop immediately instead of processing EOF and exiting at the next iteration. The classic C idiom to do this is:
while ((character = getchar()) != EOF) {
...
For portability and readability, you should use isalpha() from <ctype.h> to check if the byte is a letter, and avoid hardcoding the value of the apostrophe as 39; use '\'' instead.
You have a potential buffer overflow when storing the bytes into the tempWord array. You should compare the offset with the buffer size.
You do not null terminate tempWord, hence the compareWord() function cannot determine the length of the first string. The behavior is undefined.
You do not check if a command line argument was provided.
The second test is redundant: you could just use an else clause.
You have undefined behavior when printing the contents of tempWord[] because of the lack of null termination. This explains the unexpected behavior, but you might have much worse consequences.
printWord just prints a C string, use fputs().
The compareWord function is essentially the same as strcmp(a, b) == 0.
Here is a simplified and corrected version:
#include <ctype.h>
#include <stdio.h>
#include <string.h>
int main(int argc, char *argv[]) {
char tempWord[128];
size_t count = 0;
int c;
while ((c = getchar()) != EOF) {
if (isalpha(c) || c == '\'') {
if (count < sizeof(tempWord) - 1) {
tempWord[count++] = c;
}
} else {
tempWord[count] = '\0';
if (argc > 1 && strcmp(tempWord, argv[1]) == 0) {
printf("CENSORED");
} else {
fputs(tempWord, stdout);
}
count = 0;
putchar(c);
}
}
return 0;
}
EDIT: chux rightfully commented that the above code does not handle 2 special cases:
words that are too long are truncated in the output.
the last word is omitted if it falls exactly at the end of file.
I also realized the program does not handle the case of long words passed on the command line.
Here is a different approach without a buffer that fixes these shortcomings:
#include <ctype.h>
#include <stdio.h>
int main(int argc, char *argv[]) {
const char *word = (argc > 1) ? argv[1] : "";
int count = 0;
int c;
for (;;) {
c = getchar();
if (isalpha(c) || c == '\'') {
if (count >= 0 && (unsigned char)word[count] == c) {
count++;
} else {
if (count > 0) {
printf("%.*s", count, word);
}
count = -1;
putchar(c);
}
} else {
if (count > 0) {
if (word[count] == '\0') {
printf("CENSORED");
} else {
printf("%.*s", count, word);
}
}
if (c == EOF)
break;
count = 0;
putchar(c);
}
}
return 0;
}

tempWord[0] = '\0';
This will not reset the whole array. It only assigns '\0' to the first position; the characters stored earlier are still in memory after it. To reset the whole character array, try the following:
memset(tempWord, 0, 128);
Use this line instead of your tempWord[0] = '\0'.
With this you also do not need to add a '\0' at the end of each word yourself, because every unused position is already zero (as long as the word is shorter than the array). You must also call memset once before entering the loop, so that the array starts out zeroed.

Using tempWord[0]='\0' will not reset the whole array, just the first element. Looking at your code, there are two ways you could go forward. Either reset the whole array with memset:
memset(tempWord, 0, sizeof tempWord);
or
memset(tempWord, 0, 128);
(you could also clear only as many bytes as the last word used; memset needs string.h, which you have already included).
Or you could set the element just after the current word to '\0' (for example, if the current word is "the", set tempWord[3]='\0', since strlen only scans up to the first null character). This can be placed before the two ifs that check whether the strings are equal, so your new while loop will look like this:
{
character = getchar();
if((character<='z' && character>='a')||(character<='Z' && character>='A')||character == 39)
{
tempWord[count]=character;
count++;
}else {
tempWord[count]='\0';
if(count!=0 && compareWord(tempWord, argv[1]))
{
printf("CENSORED");
count=0;
}
if(count!=0 && !compareWord(tempWord, argv[1]))
{
printWord(tempWord);
count=0;
}
if (count==0)
{
printf("%c", character);
}
}
}
(it works, tested)

Related

Program "returns" and exits using 2d char array pointers in loop?

I'm really baffled by my program's behavior. I'm trying to read from a file and represent the entire file as a 2d char array, but for some reason it seems to just "return" or exit whenever I'm trying to assign a char to an element in that array... Any ideas what I'm missing here?
(Oh and I want it to cut off after the first 10 characters)
Thanks kindly.
(Edited: added headers and main)
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdbool.h>
int main() {
FILE *fpin;
fpin = fopen("tester.txt", "r");
char (*strings)[20][10]; //Array of strings contains entire file
bool continueReading = true; //boolean for end of file: end of file = 0
/* ===THIS WORKS===
(*strings)[0][0] = 'f';
printf("\nPutting character: %c", (*strings)[0][0]);
return 0;
*/
int whileLoops = 0;
while (continueReading)
{
//grab first char
char ch = fgetc(fpin);
//if no characters exist break
if (ch == EOF)
{
break;
}
if (ch != '\n' && ch != EOF)
{
printf("\nPutting character: %c", ch);
printf(" in Strings - %d", whileLoops);
printf(" - 0");
(*strings)[whileLoops][0] = ch;
/* === PROGRAM TERMINATES HERE, NO ERRORS ??? === */
}
for (int i = 1; i < 10; i++)
{
//concat char by passing to array
ch = fgetc(fpin); //repeat with next char from infile
if (ch == '\n') // newline char
{
break; //break here go back into while loop
}
else if (ch == EOF)
{
continueReading = false;
break;
}
if (ch != '\n' && ch != EOF)
{
printf("\nPutting character: %c", ch);
printf(" in Strings - %d", whileLoops);
printf(" - %d", i);
(*strings)[whileLoops][i] = ch;
}
if (i+1 >= 10)
{
while (1)
{
ch = fgetc(fpin); //repeat with next char from infile
if (ch == '\n' || ch == EOF) // newline char
{
break; //break here go back into while loop
}
}
}
}
whileLoops++;
}
fclose(fpin);
return 0;
}
Input File:
Tony Buffet
Kailey Heson
Art Johnson
John Pernanski
Output:
Putting character: T in Strings - 0 - 0
The problem with the code is how you define and access the array.
This is hinted at by the compiler if you include the -Wall flag:
$ gcc main.c -o main -Wall
main.c: In function 'main':
main.c:35:39: warning: 'strings' may be used uninitialized in this function [-Wmaybe-uninitialized]
35 | (*strings)[whileLoops][0] = ch;
| ~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~
char (*strings)[20][10] defines a pointer to a 2D array, and that pointer is never initialized, so writing through it is undefined behavior. What you want is just the array: char strings[20][10].
Now you can access the array using strings[whileLoops][0] = ch and the code works.

Make 1st letter the last letter in the word

I need help rotating words so they output with the 1st letter at the end.
I have a file called flip.txt
backpack
carpet
rotate
and I want to be able to enter ./RotateWord < flip.txt (in a terminal) and have it output
ackpackb
arpetc
otater
I was able to get it to output the words with the 1st letter missing. How do I make it output the 1st letter at the end?
Here's my code
#include <stdio.h>
#define BUFFER_SIZE 81
int main(int argc, char **argv) {
char string[BUFFER_SIZE];
while(fgets(string, BUFFER_SIZE, stdin) > 0) {
int numChars = 0;
while(string[numChars] && string[numChars] != '\n')
++numChars;
int i;
for(i = 1; i < numChars; ++i){
if (string[i] == '\n'){
putchar(string[0]);
putchar('\n');
}
else {
putchar(string[i]);
}
}
putchar('\n');
fflush(stdout);
}
return 0;
}
You can use strcspn to get rid of the \n.
string[strcspn(string,"\n")]=0;
If you want to print the string that way, use these two lines (you can merge them into one line too).
printf("%s",string+1);
printf("%c\n",string[0]);
The code will be
while(fgets(string, BUFFER_SIZE, stdin) != NULL) {
string[strcspn(string,"\n")]=0;
printf("%s%c\n",string+1,string[0]);
fflush(stdout);
}
Why didn't your code work?
With your code the if block is never executed, so the first character is never printed. The first while loop increments numChars only until it meets the \n, so numChars ends up as the index of the \n. The for loop then iterates over indices below numChars, so it never sees the \n.
Changing your code like this would work:
while(fgets(string, BUFFER_SIZE, stdin) != NULL) {
for(int i = 1; string[i]; ++i){
if (string[i] == '\n'){
putchar(string[0]);
putchar('\n');
}
else {
putchar(string[i]);
}
}
fflush(stdout);
}
With this change the \n is printed by the if branch, right after the first character, so the extra putchar('\n') that followed the for loop in your code is no longer needed.
Note:
fgets returns char* - in case of error it returns NULL. So to check if fgets is successful or not you should do a null check.
Is it possible that you're not counting the final '\n' character when you run
while(string[numChars] && string[numChars] != '\n')
++numChars;
?
If so, you never hit
if (string[i] == '\n'){
putchar(string[0]);
putchar('\n');
}
because your loop stops one early.
Run the for loop up to and including numChars:
for(i = 1; i <= numChars; i++)
This is needed because otherwise the index never reaches the newline character, so the if statement is never executed.
Thus the complete code is as follows:
#include <stdio.h>
#define BUFFER_SIZE 81
int main(int argc, char **argv) {
char string[BUFFER_SIZE];
while(fgets(string, BUFFER_SIZE, stdin) != NULL) {
int numChars = 0;
while(string[numChars] && string[numChars] != '\n')
++numChars;
int i;
for(i = 1; i <= numChars; ++i){
if (string[i] == '\n'){
putchar(string[0]);
putchar('\n');
}
else {
putchar(string[i]);
}
}
//putchar('\n');
fflush(stdout);
}
return 0;
}
Also, there is no need for the extra
putchar('\n');

Converting from arrays to pointers. Am I missing something?

I had to rewrite two functions as per two exercises in a book I'm working from: readLine, which simply reads a line of characters, and equalStrings, which compares two character strings and returns either 1 or 0 based on whether they match.
The point of the exercise was to rewrite the functions so they used pointers, as opposed to arrays.
I've been struggling with prior exercises and was surprised how quickly I was able to do this so I'm concerned I'm missing something important.
Both programs compile and run as hoped though.
This is the original readLine function:
#include <stdio.h>
void readLine(char buffer[]);
int main(void)
{
int i;
char line[81];
for(i = 0; i < 3; i++)
{
readLine(line);
printf("%s\n\n", line);
}
return 0;
}
void readLine(char buffer[])
{
char character;
int i = 0;
do
{
character = getchar();
buffer[i] = character;
i++;
}
while(character != '\n');
buffer[i - 1] = '\0';
}
My edited with pointers:
#include <stdio.h>
void readLine(char *buffer);
int main(void)
{
int i;
char line[81];
char *pointer;
pointer = line;
for(i = 0; i < 3; i++)
{
readLine(pointer);
printf("%s\n\n", line);
}
return 0;
}
void readLine(char *buffer)
{
char character;
int i;
i = 0;
do
{
character = getchar();
buffer[i] = character;
i++;
}
while(character != '\n');
buffer[i - 1] = '\0';
}
Here is the original equalString function:
#include <stdio.h>
#include <stdbool.h>
bool equalStrings(const char s1[], const char s2[]);
int main(void)
{
const char stra[] = "string compare test";
const char strb[] = "string";
printf("%i\n", equalStrings(stra, strb));
printf("%i\n", equalStrings(stra, stra));
printf("%i\n", equalStrings(strb, "string"));
return 0;
}
bool equalStrings(const char s1[], const char s2[])
{
int i = 0;
bool areEqual;
while(s1[i] == s2[i] && s1[i] != '\0'){
i++;
if(s1[i] == '\0' && s2[i] == '\0')
areEqual = true;
else
areEqual = false;
}
return areEqual;
}
and the rewritten with pointers:
#include <stdio.h>
#include <stdbool.h>
bool equalStrings(const char *pointera, const char *pointerb);
int main(void)
{
const char stra[] = "string compare test";
const char strb[] = "string";
const char *pointera;
const char *pointerb;
pointera = stra;
pointerb = strb;
printf("%i\n", equalStrings(pointera, pointerb));
printf("%i\n", equalStrings(pointerb, pointerb));
printf("%i\n", equalStrings(strb, "string"));
return 0;
}
bool equalStrings(const char *pointera, const char *pointerb)
{
int i = 0;
bool areEqual;
while(pointera[i] == pointerb[i] && pointera[i] != '\0'){
i++;
if(pointera[i] == '\0' && pointerb[i] == '\0')
areEqual = true;
else
areEqual = false;
}
return areEqual;
}
Is there anything glaring out that needs to be changed?
Thank you.
There are three conditions you need to protect against in your readline function. (1) You must protect against writing beyond the end of your array. Utilizing a simple counter to keep track of the number of characters added will suffice. You can express this limit in your read loop. Your array size is 81 (which will hold a string of 80 characters, +1 for the nul-terminating character). Assuming you create a #define MAXC 81 for use in your code, your first condition could be written as:
void readline (char *buffer)
{
int i = 0, c;
while (i + 1 < MAXC && ...
(2) the second condition you want to protect against is reaching a '\n' newline character. The second condition for your read loop could be written as:
while (i + 1 < MAXC && (c = getchar()) != '\n' && ...
(3) the third condition you must protect against is encountering EOF with a line before a newline character is reached (many editors produce files with non-POSIX line-endings). With the final condition, your complete set of test conditions could look like the following:
while (i + 1 < MAXC && (c = getchar()) != '\n' && c != EOF)
(and that is why c should be declared int rather than char: getchar() returns an int, and EOF is a negative value, generally -1, that must stay distinguishable from every valid character)
Putting that together, with what it appears was intended in rewriting the function from using array-index notation to using pointer notation, you could do something like the following:
void readline (char *buffer)
{
int i = 0, c;
while (i + 1 < MAXC && (c = getchar()) != '\n' && c != EOF) {
*buffer++ = c;
i++;
}
*buffer = 0;
if (i + 1 == MAXC && *(buffer - 1) != '\n')
fprintf (stderr, "warning: line truncation occurred.\n");
}
You should also check, as shown above, whether you read all the characters in the line or whether a short read occurred (meaning that after reading the 80 allowable characters there were still more characters in the line, but to avoid writing beyond the end of your array, and to leave room for the terminating nul, you stopped reading before you reached the newline). You are free to handle it as you like, but be aware: those characters still exist in the input buffer (stdin here) and will be the very next characters read on your next call to getchar(). So you may want a way to tell if that occurred.
Putting the function together in a short example with a helpful input file will help explain.
#include <stdio.h>
#define MAXC 81
void readline(char *buffer);
int main(void) {
int i;
char line[MAXC] = "", *pointer = line;
for(i = 0; i < 3; i++) {
readline (pointer);
printf ("%s\n\n", line);
}
return 0;
}
void readline (char *buffer)
{
int i = 0, c;
while (i + 1 < MAXC && (c = getchar()) != '\n' && c != EOF) {
*buffer++ = c;
i++;
}
*buffer = 0;
if (i + 1 == MAXC && *(buffer - 1) != '\n')
fprintf (stderr, "warning: line truncation occurred.\n");
}
How will your function behave if given a 90 character line to read?
Input File
Two lines with 90 characters each.
$ cat dat/90.txt
123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890
123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890
Example Use/Output
Note what has occurred. On the first read attempt, 80 characters were read and a short read occurred; you were warned of that fact. The second read picked up the remaining 10 characters of the first line (chars 81-90). The third and final read again took the first 80 chars of the second line, and the code terminated.
$ ./bin/getchar_ptr <dat/90.txt
warning: line truncation occurred.
12345678901234567890123456789012345678901234567890123456789012345678901234567890
1234567890
warning: line truncation occurred.
12345678901234567890123456789012345678901234567890123456789012345678901234567890
I'll let you look this over and incorporate any of the suggestions you find helpful in the rest of your code. Let me know if you have any questions. Make sure you fully understand what is being passed as buffer in void readline (char *buffer) (a copy of the pointer, as opposed to the original), as a basic understanding of pointers has implications throughout C.

fgets and chdir acting strangely together in C

I am currently creating a simple shell for homework and I've run into a problem. Here is a snippet of code with the pieces that pertain to the problem (I may have forgotten some pieces please tell me if you see anything missing):
eatWrd returns the first word from a string, and takes that word out of the string.
wrdCount, as implied, returns the number of words in a string.
If the code for either of these is needed for an answer I can post it, just tell me; I am almost 100% sure they are not the cause of the problem.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#define MAX 100
int main(void)
{
char input[MAX];
char *argm[MAX];
memset(input, 0, sizeof(input));
memset(argm, 0, sizeof(argm));
while(1){
printf("cmd:\n");
fgets(input, MAX-1, stdin);
for(i=0;i < wrdCount(input); i++){
argm[i] = eatWrd(input);
}
argm[i] = NULL;
if (!strncmp(argm[0],"cd" , 2)){
chdir(argm[1]);
}
if (!strncmp(argm[0],"exit", 4)){
exit(0);
}
memset(input, 0, sizeof(input));
memset(argm, 0, sizeof(argm));
}
}
Anyways, this loop works for lots of other commands using execvp, (such as cat, ls, etc.), when I use cd, it works as expected, except when I try to exit the shell, it takes multiple exit calls to actually get out. (as it turns out, the number of exit calls is exactly equal to the number of times I call cd). It only takes one exit call when I don't use cd during a session. I'm not really sure what's going on, any help is appreciated, thanks.
Here is eatWrd:
char* eatWrd(char * cmd)
{
int i = 0; // i keeps track of position in cmd
int count = 0; // count keeps track of position of second word
char rest[MAX_LINE]; // rest will hold cmd without the first word
char * word = (char *) malloc(MAX_LINE); //word will hold the first word
sscanf(cmd, "%s", word); //scan the first word into word
// iterate through white spaces, then first word, then the following white spaces
while(cmd[i] == ' ' || cmd[i] == '\t'){
i++;
count++;
}
while(cmd[i] != ' ' && cmd[i] != '\t' && cmd[i] != '\n' && cmd[i] != '\0'){
i++;
count++;
}
while(cmd[i] == ' ' || cmd[i] == '\t'){
i++;
count++;
}
// copy the rest of cmd into rest
while(cmd[i] != '\n' && cmd[i] != '\0'){
rest[i-count] = cmd[i];
i++;
}
rest[i-count] = '\0';
memset(cmd, 0, MAX_LINE);
strcpy(cmd, rest); //move rest into cmd
return word; //return word
}
And here is wrdCount:
int wrdCount(char *sent)
{
char *i = sent;
int words = 0;
//keep iterating through the string,
//increasing the count if a word and white spaces are passed,
// until the string is finished.
while(1){
while(*i == ' ' || *i == '\t') i++;
if(*i == '\n' || *i == '\0') break;
words++;
while(*i != ' ' && *i != '\t' && *i != '\n' && *i != '\0') i++;
}
return words;
}
This variation on your code works for me:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <ctype.h>
#include <unistd.h>
#define MAX 100
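char *eatWrd(char **line) {
char *next_c = *line;
char *word_start = NULL;
/* skip leading whitespace */
while (isspace((unsigned char)*next_c)) next_c += 1;
if (*next_c) {
word_start = next_c;
do {
next_c += 1;
} while (*next_c && ! isspace((unsigned char)*next_c));
/* terminate the word, but never step past the end of the string */
if (*next_c) {
*next_c = '\0';
next_c += 1;
}
*line = next_c;
}
return word_start;
}
int main(void)
{
char input[MAX];
char *argm[MAX];
while(1) {
int word_count = 0;
char *next_input = input;
printf("cmd:\n");
/* stop at end of input instead of looping forever */
if (!fgets(input, MAX, stdin)) break;
do {
argm[word_count] = eatWrd(&next_input);
} while (argm[word_count++]);
/* The above always overcounts by one */
word_count -= 1;
/* an empty line leaves argm[0] == NULL, so skip it */
if (word_count == 0) continue;
if (!strcmp(argm[0], "cd")){
if (argm[1]) chdir(argm[1]);
} else if (!strcmp(argm[0], "exit")) {
exit(0);
}
}
return 0;
}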
Note my variation on eatWrd(), which does not have to move any data around, and which does not require pre-parsing the string to determine how many words to expect. I suppose your implementation would be more complex, so as to handle quoting or some such, but it could absolutely follow the same general approach.
Note, too, my correction to the command-matching conditions, using !strcmp() instead of !strncmp(): strncmp(argm[0], "cd", 2) also matches any command that merely starts with "cd".

Remove punctuation at beginning and end of a string

I have a string and I want to remove all the punctuation from the beginning and the end of it only, but not the middle.
I have written code to remove the punctuation from the first and last character of a string only, which is clearly inefficient and useless if a string has two or more punctuation characters at the end.
Here is an example:
{ Hello ""I am:: a Str-ing!! }
Desired output
{ Hello I am a Str-ing }
Are there any functions that I could use? Thanks.
This is what I've done so far. I'm actually editing the string in a linked-list
if(ispunct(removeend->string[(strlen(removeend->string))-1]) != 0) {
removeend->string[(strlen(removeend->string))-1] = '\0';
}
else {}
Iterate over the string, use isalpha() to check each character, write the characters which pass into a new string.
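#include <ctype.h>
#include <string.h>
char *rm_punct(char *str) {
char *p = str; /* first character to keep */
char *t;
if (*str == '\0') return str; /* nothing to trim in an empty string */
t = str + strlen(str) - 1; /* last character */
while (ispunct((unsigned char)*p)) p++;
while (p < t && ispunct((unsigned char)*t)) { *t = 0; t--; }
/* also if you want to preserve the original address */
{ int i;
for (i = 0; i <= t - p + 1; i++) {
str[i] = p[i];
} p = str; } /* --- */
return p;
}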
Iterate over the string, use isalpha() to check each character, after the first character that passes start writing into a new string.
Iterate over the new string backwards, replace all punctuation with \0 until you find a character which isn't punctuation.
#include <stdio.h>
#include <ctype.h>
#include <string.h>
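char* trim_ispunct(char* str){
int i ;
char* p;
if(str == NULL || *str == '\0') return str;
/* strip trailing punctuation; stop at the start of the string */
for(i=strlen(str)-1; i >= 0 && ispunct((unsigned char)str[i]);--i)
str[i]='\0';
/* skip leading punctuation */
for(p=str;ispunct((unsigned char)*p);++p);
/* source and destination overlap, so use memmove rather than strcpy */
return memmove(str, p, strlen(p) + 1);
}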
int main(){
//test
char str[][16] = { "Hello", "\"\"I", "am::", "a", "Str-ing!!" };
int i, size = sizeof(str)/sizeof(str[0]);
for(i = 0;i<size;++i)
printf("%s\n", trim_ispunct(str[i]));
return 0;
}
/* result:
Hello
I
am
a
Str-ing
*/
In a while loop, call strtok repeatedly to split the text into individual strings at the white space character. You could also use sscanf instead of strtok.
Then, for each string, run a for loop from the end of the string towards the beginning. While !isalpha(current character) holds, put a \0 at the current position, and stop at the first letter. This removes the punctuation characters at the tail.
Now run another for loop on the same string, this time from 0 to strlen(current string). While !isalpha(current character) holds, continue. At the first letter, copy that character and all the remaining characters into a buffer. The buffer is the cleaned string. Copy it back into the original string.
Repeat the above two steps for the other strings returned by strtok.
Construct a tiny state machine. The cha2class() function divides the characters into equivalence classes. The state machine will always skip punctuation, except when it has letters on the left and the right; in that case it will be preserved. (That is the memmove() in state 3.)
#include <stdio.h>
#include <string.h>
#define IS_ALPHA 1
#define IS_WHITE 2
#define IS_PUNCT 3
int cha2class(int ch);
void scrutinize(char *str);
int cha2class(int ch)
{
if (ch >= 'a' && ch <= 'z') return IS_ALPHA;
if (ch >= 'A' && ch <= 'Z') return IS_ALPHA;
if (ch == ' ' || ch == '\t') return IS_WHITE;
if (ch == EOF || ch == 0) return IS_WHITE;
return IS_PUNCT;
}
void scrutinize(char *str)
{
size_t pos,dst,start;
int typ, state ;
state = 0;
for (dst = pos = start=0; ; pos++) {
typ = cha2class(str[pos]);
switch(state) {
case 0: /* BOF, white seen */
if (typ==IS_WHITE) break;
else if (typ==IS_ALPHA) { start = pos; state =1; }
else if (typ==IS_PUNCT) { start = pos; state =2; continue;}
break;
case 1: /* inside a word */
if (typ==IS_ALPHA) break;
else if (typ==IS_WHITE) { state=0; }
else if (typ==IS_PUNCT) { start = pos; state =3;continue; }
break;
case 2: /* inside punctuation after whitespace: skip it */
if (typ==IS_PUNCT) continue;
else if (typ==IS_WHITE) { state=0; }
else if (typ==IS_ALPHA) {state=1; }
break;
case 3: /* inside punctuation after a word */
if (typ==IS_PUNCT) continue;
else if (typ==IS_WHITE) { state=0; }
else if (typ==IS_ALPHA) {
memmove(str+dst, str+start, pos-start); dst += pos-start;
state =1; }
break;
}
str[dst++] = str[pos];
if (str[pos] == '\0') break;
}
}
int main (int argc, char **argv)
{
char test[] = ".This! is... ???a.string?" ;
scrutinize(test);
printf("Result=%s\n", test);
return 0;
}
OUTPUT:
Result=This is a.string

Resources