Counting occurrences of words within an inputted string in C

I'm struggling to count the occurrences of each word in an inputted string. I believe only my logic is off, but I've been scratching my head for a while and have hit a wall.
The problems I have yet to solve are:
With longer inputs, the end of the string is sometimes cut off.
The counter for a word is not incremented when the word is repeated.
I know the code does some things in a less than ideal way, but I'm fairly new to C, so any pointers are really helpful.
To sum up, I'm looking for pointers to help solve the two issues above.
#include <stdlib.h>
#include <string.h>
#include <stdio.h>
#include <ctype.h>
#define MAX_WORDS 1000
int main(void) {
int i,j,isUnique,uniqueLen;
char word[MAX_WORDS];
char words[200][30];
char uniqueWords[200][30];
int count[200];
char *p = strtok(word, " ");
int index=0;
//read input until EOF is reached
scanf("%[^EOF]", word);
//initialize count array
for (i = 0; i < 200; i++) {
count[i] = 0;
}
//convert lower case letters to upper
for (i = 0; word[i] != '\0'; i++) {
if (word[i] >= 'a' && word[i] <= 'z') {
word[i] = word[i] - 32;
}
}
//Split work string into an array and save each token into the array words
p = strtok(word, " ,.;!\n");
while (p != NULL)
{
strcpy(words[index], p);
p = strtok(NULL, " ,.;!\n");
index++;
}
/*
Check each string in the array word for occurances within the uniqueWords array. If it is unique then
copy the string from word into the unique word array. Otherwise the counter for the repeated word is incremented.
*/
uniqueLen = 0;
for (i = 0; i < index; i++) {
isUnique = 1;
for (j = 0; j < index; j++) {
if (strcmp(uniqueWords[j],words[i])==0) {
isUnique = 0;
break;
}
else {
}
}
if (isUnique) {
strcpy(uniqueWords[uniqueLen], words[i]);
count[uniqueLen] += 1;
uniqueLen++;
}
else {
}
}
for (i = 0; i < uniqueLen; i++) {
printf("%s => %i\n", uniqueWords[i],count[i]);
}
}

This is the code I ended up using. The problem turned out to be mainly in how I used the scanf function. Calling it in a while loop made it much easier to process the words as they are read.
Thank you for all the help :)
#include <stdlib.h>
#include <string.h>
#include <stdio.h>
#include <ctype.h>
int main(void) {
// Create all variables
int i, len, isUnique, index;
char word[200];
char uniqueWords[200][30];
int count[200];
// Initialize the count array
for (i = 0; i < 200; i++) {
count[i] = 0;
}
// Set the value for index to 0
index = 0;
// Read all words inputted until the EOF marker is reached
while (scanf("%199s", word) == 1) { /* width limit keeps scanf inside word[200] */
/*
For each word being read if the characters within it are lowercase
then each are then incremented into being uppercase values.
*/
for (i = 0; word[i] != '\0'; i++) {
if (word[i] >= 'a' && word[i] <= 'z') {
word[i] = word[i] - 32;
}
}
/*
We use len to find the length of the word being read. This is then used
to access the final character of the word and remove it if it is not an
alphabetic character.
*/
len = strlen(word);
if (ispunct(word[len - 1]))
word[len - 1] = '\0';
/*
The next part removes the non alphabetic characters from within the words.
This happens by incrementing through each character of the word and by
using the isalpha and removing the characters if they are not alphabetic
characters.
*/
size_t pos = 0;
for (char *p = word; *p; ++p)
if (isalpha(*p))
word[pos++] = *p;
word[pos] = '\0';
/*
We set the isUnique value to 1 as upon comparing the arrays later we
change this value to 0 to show the word is not unique.
*/
isUnique = 1;
/*
For each word through the string we use a for loop when the counter i
is below the index and while the isUnique value is 1.
*/
for (i = 0; i < index && isUnique; i++)
{
/*
Using the strcmp function we are able to check if the word in
question is in the uniqueWords array. If it is found we then
change the isUnique value to 0 to show that the value is not
unique and prevent the loop happening again.
*/
if (strcmp(uniqueWords[i], word) == 0)
isUnique = 0;
}
/* If word is unique then add it to the uniqueWords list
and increment index. Otherwise increment occurrence
count of current word.
*/
if (isUnique)
{
strcpy(uniqueWords[index], word);
count[index]++;
index++;
}
else
{
count[i - 1]++;
}
}
/*
For each item in the uniqueWords list we iterate through the words
and print them out in the correct format with the word and the following count of them.
*/
for (i = 0; i < index; i++)
{
printf("%s => %d\n", uniqueWords[i], count[i]);
}
}
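The clean-up steps in the loop above (upper-casing, then dropping non-alphabetic characters) can also be folded into one pass with the <ctype.h> functions. The following is only a sketch: normalize_word is a hypothetical helper name, not something from the original code, and it assumes the same "letters only, upper case" rule the answer uses.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper (not from the original post): keep only the
   alphabetic characters of word, converted to upper case, compacting
   the string in place. Returns the new length. */
size_t normalize_word(char *word)
{
    size_t pos = 0;
    for (const char *p = word; *p != '\0'; ++p) {
        /* the <ctype.h> functions expect an unsigned char value */
        unsigned char c = (unsigned char)*p;
        if (isalpha(c))
            word[pos++] = (char)toupper(c);
    }
    word[pos] = '\0';
    return pos;
}
```

A word that consists only of punctuation comes back with length 0, so the caller can skip it instead of storing an empty entry in uniqueWords.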

I don't know whether you are working under particular requirements, but for all its limitations in terms of standard library functions, C does have one that would make your job much easier: strstr. For example:
#include <stdio.h>
#include <string.h>
int main() {
const char str[] = "stringstringdstringdstringadasstringipoistring";
const char* substr = "string";
const char* orig = str;
const char* temp = substr;
int length = 0;
while(*temp++){length++;} // length of substr
int count = 0;
char *ret = strstr(orig, substr);
while (ret != NULL){
count++;
// check the next occurrence
ret = strstr(ret + length, substr);
}
printf("%d", count);
}
The output should be 6.
Regarding user3121023's comment: scanf("%999[^\n]", word); reads characters until it finds a \n or reaches the width limit, and I agree that fgets(word, sizeof word, stdin); is better.
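As a rough sketch of how the two ideas could fit together: fgets leaves the trailing newline in the buffer, so it is usually stripped first, and the strstr loop can be wrapped in a function. The names strip_newline and count_occurrences are made up for this illustration. Note that strstr matches substrings, so "string" is also counted inside "strings".

```c
#include <string.h>

/* Hypothetical helper: remove the trailing newline that fgets
   leaves in the buffer (does nothing if there is none). */
void strip_newline(char *line)
{
    line[strcspn(line, "\n")] = '\0';
}

/* Hypothetical helper: count non-overlapping occurrences of sub
   in text, using the same strstr loop as the answer above. */
int count_occurrences(const char *text, const char *sub)
{
    size_t len = strlen(sub);
    int count = 0;

    if (len == 0)
        return 0; /* an empty pattern would never advance */

    for (const char *p = strstr(text, sub); p != NULL; p = strstr(p + len, sub))
        count++;
    return count;
}
```

A caller would typically do fgets(word, sizeof word, stdin), check the result for NULL, call strip_newline(word), and then pass word to count_occurrences.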

Related

How to print length of each word in a string before the each Word

I want to print the length of each word in a string.
I have tried, but I am not getting the right answer. When I run the code, it prints the length of each word after the word instead of before it.
char str[20] = "I Love India";
int i, n, count = 0;
n = strlen(str);
for (i = 0; i <= n; i++) {
if (str[i] == ' ' || str[i] == '\0') {
printf("%d", count);
count = 0;
} else {
printf("%c", str[i]);
count++;
}
}
I expect the output to be 1I 4Love 5India, but the actual output is I1 Love4 India5.
You can use strtok, as Some programmer dude suggested. You may want to make a copy of the original string, because strtok modifies the string passed to it. Also, strtok is not thread-safe and should be replaced with strtok_r (a POSIX function) in multi-threaded programs.
#include <stdio.h>
#include <stdlib.h>
/* for strtok */
#include <string.h>
int main() {
char str[20] = "I Love India";
int n;
char* tok = strtok(str, " ");
while (tok != NULL) {
n = strlen(tok);
printf("%d%s ", n, tok);
tok = strtok(NULL, " ");
}
return EXIT_SUCCESS;
}
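To illustrate the advice about copying the string before calling strtok, here is a minimal sketch that leaves the caller's string untouched and builds the result in an output buffer. The function name prefix_lengths and the 128-byte scratch buffer are assumptions made for this example only.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write "<len><word>" for each space-separated
   word of src into out, separated by single spaces. src is not
   modified. Returns 0 on success, -1 if a buffer is too small. */
int prefix_lengths(const char *src, char *out, size_t outsize)
{
    char copy[128]; /* assumed upper bound on the input length */
    size_t used = 0;

    if (outsize == 0 || strlen(src) >= sizeof copy)
        return -1;
    strcpy(copy, src); /* strtok writes '\0' into this copy, not into src */
    out[0] = '\0';

    for (char *tok = strtok(copy, " "); tok != NULL; tok = strtok(NULL, " ")) {
        int n = snprintf(out + used, outsize - used, "%s%zu%s",
                         used ? " " : "", strlen(tok), tok);
        if (n < 0 || (size_t)n >= outsize - used)
            return -1; /* output would have been truncated */
        used += (size_t)n;
    }
    return 0;
}
```

Because the tokenizing happens on copy, the function also accepts string literals and other read-only strings.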
You want to compute and print the length of each word before you print the word.
Here is a simple solution using strcspn(), a standard function that should be used more often:
#include <stdio.h>
#include <string.h>
int main() {
char str[20] = "I Love India";
char *p;
int n;
for (p = str; *p;) {
if (*p == ' ') {
putchar(*p++);
} else {
n = strcspn(p, " "); // compute the length of the word
printf("%d%.*s", n, n, p);
p += n;
}
}
printf("\n");
return 0;
}
Your approach is wrong because you print the word before the length. You need to calculate the length first, print it, and then print the word.
It could be something like:
int main(void)
{
char str[20]="I Love India";
size_t i = 0;
while(str[i])
{
if (str[i] == ' ') // consider using the isspace function instead
{
// Print the space
printf(" ");
++i;
}
else
{
size_t j = i;
size_t count = 0;
// Calculate word len
while(str[j] && str[j] != ' ')
{
++count;
++j;
}
// Print word len
printf("%zu", count);
// Print word
while(i<j)
{
printf("%c", str[i]);
++i;
}
}
}
}
The basic idea is to have two index variables for the string, i and j. The index i is at the word's first character, and the index j is used to find the end of the word. Once the end of the word has been found, the length and the word can be printed.
This is what you want:
#include <stdio.h>
#include <string.h>
int main()
{
char str[20]="I Love India";
char buf[20];
int i,n,count=0;
n=strlen(str);
for (i=0; i <= n; i++) {
if(str[i]==' ' || str[i]=='\0'){
buf[count] = '\0';
printf("%d", count); /* Print the size of the last word */
printf("%s", buf); /* Print the buffer */
if (str[i] == ' ') putchar(' '); /* Keep the separator between words */
memset(buf, 0, sizeof(buf)); /* Clear the buffer */
count = 0;
} else {
buf[count] = str[i];
count++;
}
}
return 0;
}
You will want to keep a buffer (buf) that holds the word currently being counted.
Increment count each time the character is not a space or '\0'. When it is a space or '\0', print count first, then buf. Then clear buf and reset count to 0, so that i keeps advancing through the entire string str while the next word is written into buf starting from index 0.

Kochan InsertString segmentation fault

I am working through Kochan's programming in C book and I am working on an exercise which requires a function to insert one character string inside another string, with the function call including where the string is to be inserted.
I have written the below code but I receive a segmentation fault whenever I enter the inputs. I think it's because the 'input' string is defined to the length of the user's input and then the insertString function tries to add additional characters to this string. I just can't see a way of defining the string as large enough to be able to take in additional characters. Do you think that this is the reason I am receiving a segmentation fault? Are there any other ways to go about this problem?
#include<stdio.h>
#include <string.h>
insertString(char input[], const char insert[], int position)
{
int i, j;
char temp[81];
j = strlen(input);
for(i = 0; i < position - 1; i++)
{
temp[i] = input[i];
}
for(j = 0; insert != '\0'; i++, j++)
{
temp[i] = insert[j];
}
for(j = i - j; input != '\0'; i++, j++)
{
temp[i] = input[j];
}
for(i = 0; temp[i] != '\0'; i++)
{
input[i] = temp[i];
}
input[i] = '\0';
}
void readLine(char buffer[])
{
char character;
int i = 0;
do
{
character = getchar();
buffer[i] = character;
i++;
}
while(character != '\n');
buffer[i - 1] = '\0';
}
int main(void)
{
char input[81];
char insert[81];
int position;
printf("Enter the first string: ");
readLine(input);
printf("Enter the insert string: ");
readLine(insert);
printf("Enter placement position int: ");
scanf("%i", &position);
insertString(input, insert, position);
printf("The adjusted string is %s\n", input);
return 0;
}
There might be other reasons as well, but the following fragment will crash for sure:
for(j = 0; insert != '\0'; i++, j++)
{
temp[i] = insert[j];
}
The reason is that, since insert is never incremented or otherwise modified, this is an endless loop that keeps writing into temp. Once it runs past the end of temp (81 elements), or a bit later, it will crash. I suppose you meant for(j = 0; insert[j] != '\0'; i++, j++), right?
Check all for loop conditions in insertString function. For example:
for(j = 0; insert != '\0'; i++, j++)
{
temp[i] = insert[j];
}
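is an infinite loop: insert is a pointer that is never null here, so the condition insert != '\0' is always true. Because of it, you access memory outside the bounds of the temp array, which is undefined behavior and causes the segmentation fault. It looks like you need the condition insert[j] != '\0' here.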
I'm familiar with this book. The author, Stephen Kochan, has a website with answers to the odd-numbered end of chapter exercises.
The website is at classroomm.com but you'll need to look around some to find the information.
Here is the info from that site related to this exercise:
Programming in C, exercise 10-7 (3rd edition) and 9-7 (4th edition)
/* insert string s into string source starting at i
This function uses the stringLength function defined
in the chapter.
Note: this function assumes source is big enough
to store the inserted string (dangerous!) */
void insertString (char source[], char s[], int i)
{
int j, lenS, lenSource;
/* first, find out how big the two strings are */
lenSource = stringLength (source);
lenS = stringLength (s);
/* sanity check here -- note that i == lenSource
effectively concatenates s onto the end of source */
if (i > lenSource)
return;
/* now we have to move the characters in source
down from the insertion point to make room for s.
Note that we copy the string starting from the end
to avoid overwriting characters in source.
We also copy the terminating null (j starts at lenSource)
as well since the final result must be null-terminated */
for ( j = lenSource; j >= i; --j )
source [lenS + j] = source [j];
/* we've made room, now copy s into source at the
insertion point */
for ( j = 0; j < lenS; ++j )
source [j + i] = s[j];
}
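The comment in that solution warns that it assumes source is big enough. One possible way to remove that assumption is to pass the capacity of the destination array and refuse to insert when the result would not fit. This is only a sketch, not the book's solution: the name insert_string, the cap parameter, and the return codes are assumptions, and it uses memmove and memcpy from <string.h> instead of hand-written loops.

```c
#include <string.h>

/* Hypothetical variant: insert ins into dst at index pos.
   cap is the total size of the dst array in bytes.
   Returns 0 on success, -1 if pos is past the end of dst or
   the result (including '\0') would not fit in cap bytes. */
int insert_string(char *dst, size_t cap, const char *ins, size_t pos)
{
    size_t len_dst = strlen(dst);
    size_t len_ins = strlen(ins);

    if (pos > len_dst)
        return -1;
    if (len_dst + len_ins + 1 > cap)
        return -1;

    /* shift the tail, including the terminating '\0', to the right;
       memmove is used because the two regions overlap */
    memmove(dst + pos + len_ins, dst + pos, len_dst - pos + 1);
    /* copy the new characters into the gap */
    memcpy(dst + pos, ins, len_ins);
    return 0;
}
```

With char text[32] = "the wrong son", calling insert_string(text, sizeof text, "per", 10) leaves "the wrong person" in text; with a 16-byte array the same call returns -1 and leaves the string unchanged.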
There's an error somewhere in your insertString function where it goes out of bounds. By the way, your insertString function is declared without a return type; it should start with void.
If I substitute the insertString function which I wrote for the exercise then the program works.
#include<stdio.h>
#include <string.h>
void insertString (char source[], const char s[], int start)
{
int lenSource = strlen (source);
int lenString = strlen (s);
int i;
if ( start > lenSource ) {
printf ("insertion point exceeds string length\n");
return;
}
// move the characters in the source string which are above the
// starting point (including the terminating null character) to make
// room for the new characters; to avoid overwriting characters the
// process begins at the end of the string
for ( i = lenSource; i >= start; --i )
source[i + lenString] = source[i];
// insert new characters
for ( i = 0; i < lenString; ++i )
source[start + i] = s[i];
}
void readLine(char buffer[])
{
char character;
int i = 0;
do
{
character = getchar();
buffer[i] = character;
i++;
}
while(character != '\n');
buffer[i - 1] = '\0';
}
int main(void)
{
char input[81];
char insert[81];
int position;
printf("Enter the first string: ");
readLine(input);
printf("Enter the insert string: ");
readLine(insert);
printf("Enter placement position int: ");
scanf("%i", &position);
insertString(input, insert, position);
printf("The adjusted string is %s\n", input);
return 0;
}

I am trying to compare string literals and I want to remove repeated literals without using pointers

I am trying to compare string literals and I want to remove repeated literals. I want to do it without using POINTERS.
This is my code:
char str[30];
printf("Enter strings : ");
fgets(str,29,stdin);
char tem[30];
int count , county;
for(count = 0 ; count < strlen(str)-1 ; count++) {
for(county = 1 ; county < strlen(str) ; county++) {
if(str[count] != str[county]) {
tem[count] = str[count];
}
}
}
//PRINT
for(count = 0 ;count < strlen(str) -1 ; count++) {
printf("%c",tem[count]);
}
Input: happen
Expected output: hapen
Correcting your code:
#include <stdio.h>
#include <string.h>
int main()
{
char str[30];
printf("Enter strings : ");
fgets(str,30,stdin);
char tem[30];
size_t count;
size_t county=0;
for(count = 0 ; count < strlen(str)-1 ; count++) {
if(str[count] != str[count+1]) {
tem[county++] = str[count];
}
}
tem[county] = '\0';
printf("%s\n", tem);
return 0;
}
Note that this code only removes a repeated character when the duplicate immediately follows it in the string.
EDIT
To have mspi as output of entered string mississippi
#include <stdio.h>
#include <string.h>
#include <stdbool.h>
int main()
{
char str[30];
printf("Enter strings : ");
fgets(str,30,stdin);
char tem[30];
size_t count;
size_t county;
size_t tem_index = 0;
size_t size_of_string = strlen(str);
bool found;
for(count = 0 ; count < size_of_string-1; count++)
{
found = false;
county = count+1;
while ((found == false) && (county<size_of_string))
{
if(str[count] == str[county])
{
found = true;
}
county++;
}
if (found == false)
{
tem[tem_index++] = str[count];
}
}
tem[tem_index] = '\0';
printf("%s\n", tem);
return 0;
}
EDIT 2
To have misp as output of entered string mississippi
#include <stdio.h>
#include <string.h>
#include <stdbool.h>
int main()
{
char str[30];
printf("Enter strings : ");
fgets(str,30,stdin);
char tem[30];
size_t count;
size_t county;
size_t tem_index = 0;
size_t size_of_string = strlen(str);
bool found;
for(count = 0 ; count < size_of_string-1; count++)
{
found = false;
county = 0;
while ((found == false) && (county<tem_index))
{
if(str[count] == tem[county])
{
found = true;
}
county++;
}
if (found == false)
{
tem[tem_index++] = str[count];
}
}
tem[tem_index] = '\0';
printf("%s\n", tem);
return 0;
}
Take note that all these solutions are case sensitive.
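If case should not matter, one possible adaptation of the EDIT 2 approach is to compare the lower-cased characters. The sketch below sticks to array indexing; the function name dedupe_ignore_case is hypothetical, and it assumes dst has room for at least as many characters as src plus the terminator.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: copy the first occurrence of each letter of
   src into dst, treating upper and lower case as the same letter.
   Stops at '\0' or at the newline that fgets leaves in the buffer.
   dst must be at least as large as src. */
void dedupe_ignore_case(const char src[], char dst[])
{
    size_t n = 0; /* number of characters written to dst */

    for (size_t i = 0; src[i] != '\0' && src[i] != '\n'; i++) {
        int seen = 0;
        for (size_t k = 0; k < n && !seen; k++) {
            if (tolower((unsigned char)src[i]) == tolower((unsigned char)dst[k]))
                seen = 1;
        }
        if (!seen)
            dst[n++] = src[i];
    }
    dst[n] = '\0';
}
```

The character that is kept has the case of its first occurrence, so "Aa" becomes "A".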
Your code to remove all duplicates isn't quite there yet:
You use the same indices for the new and the old string, but you need two different indices, because the new string is as long as or shorter than the old string. If the old string is "aaab", your index for the old string is 3 when you see the "b", but the index for the new string is only 1. (By skipping indices you leave uninitialised gaps in your string.)
You look forward to find other occurrences of the same letter, but you append to the new string for every letter that doesn't match. You must look at all following letters, but you must append to the new string only once. That is, you must decide whether the letter is a duplicate or not after the loop, based on the information that you've found in the loop.
When you look forward, you shouldn't start at 1, but at the letter after the current letter. If you start at 1, you will find duplicates for every letter after the first, because you check each letter against itself.
This is not an error, but it's not a good idea to call strlen repeatedly in a loop. The length of the input string doesn't change, so you can determine the string length beforehand. If you just want to use it as your termination condition, you can test whether the current letter is the null terminator.
Below is a solution that uses your logic, albeit by looking backwards, not forward. (If you look forward, you will copy the last occurrence of a letter; if you look back, you'll copy the first occurrence. It may make a difference in the order of the letters. For Mississippi, you'll get "Mspi" or "Misp" depending on which strategy you use.)
The program overwrites the same string. This is possible, because you are filtering out letters and the new index is equal to the old index or smaller:
#include <stdlib.h>
#include <stdio.h>
void remdup(char *str)
{
int i = 0; // index into old string
int j = 0; // index into new string
for (i = 0; str[i]; i++) {
int k = 0;
int dup = 0;
for (k = 0; k < i; k++) {
if (str[i] == str[k]) {
dup = 1;
break;
}
}
if (dup == 0) str[j++] = str[i];
}
str[j] = '\0';
}
int main()
{
char str[] = "Mississippi";
puts(str);
remdup(str);
puts(str);
return 0;
}
This solution doesn't scale to large strings, because the inner loop rescans the earlier characters for every letter. A more efficient method would be to keep a table recording which of the 256 possible characters have already been used.
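The table-based idea could look roughly like the following sketch. The name remdup_table is made up here, and the table is sized with UCHAR_MAX + 1 rather than a hard-coded 256 so that it covers every unsigned char value.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical table-based variant of remdup: keeps the first
   occurrence of each character and makes a single pass over str. */
void remdup_table(char *str)
{
    unsigned char seen[UCHAR_MAX + 1] = {0}; /* one flag per possible character */
    size_t j = 0;                            /* index into the new string */

    for (size_t i = 0; str[i] != '\0'; i++) {
        unsigned char c = (unsigned char)str[i];
        if (!seen[c]) {
            seen[c] = 1;
            str[j++] = str[i];
        }
    }
    str[j] = '\0';
}
```

Each character is looked up in constant time, so the work grows linearly with the length of the string. Like remdup, it is case sensitive: "Mississippi" becomes "Misp".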

Longest Substring Palindrome issue

I feel like I've got it almost down, but for some reason my second test is coming up with a shorter palindrome instead of the longest one. I've marked where I feel the error may be coming from, but at this point I'm kind of at a loss. Any direction would be appreciated!
#include <stdio.h>
#include <string.h>
/*
* Checks whether the characters from position first to position last of the string str form a palindrome.
* If it is palindrome it returns 1. Otherwise it returns 0.
*/
int isPalindrome(int first, int last, char *str)
{
int i;
for(i = first; i <= last; i++){
if(str[i] != str[last-i]){
return 0;
}
}
return 1;
}
/*
* Find and print the largest palindrome found in the string str. Uses isPalindrome as a helper function.
*/
void largestPalindrome(char *str)
{
int i, last, pStart, pEnd;
pStart = 0;
pEnd = 0;
int result;
for(i = 0; i < strlen(str); i++){
for(last = strlen(str); last >= i; last--){
result = isPalindrome(i, last, str);
//Possible error area
if(result == 1 && ((last-i)>(pEnd-pStart))){
pStart = i;
pEnd = last;
}
}
}
printf("Largest palindrome: ");
for(i = pStart; i <= pEnd; i++)
printf("%c", str[i]);
return;
}
/*
* Do not modify this code.
*/
int main(void)
{
int i = 0;
/* you can change these strings to other test cases but please change them back before submitting your code */
//str1 working correctly
char *str1 = "ABCBACDCBAAB";
char *str2 = "ABCBAHELLOHOWRACECARAREYOUIAMAIDOINEVERODDOREVENNGGOOD";
/* test easy example */
printf("Test String 1: %s\n",str1);
largestPalindrome(str1);
/* test hard example */
printf("\nTest String 2: %s\n",str2);
largestPalindrome(str2);
return 0;
}
Your code in isPalindrome doesn't work properly unless first is 0.
Consider isPalindrome(6, 10, "abcdefghhgX"):
i = 6;
last - i = 4;
comparing str[i] (aka str[6] aka 'g') with str[last-i] (aka str[4] aka 'e') is comparing data outside the range that is supposed to be under consideration.
It should be comparing with str[10] (or perhaps str[9] — depending on whether last is the index of the final character or one beyond the final character).
You need to revisit that code. Note, too, that your code will test each pair of characters twice where once is sufficient. I'd probably use two index variables, i and j, set to first and last. The loop would increment i and decrement j, and only continue while i is less than j.
for (int i = first, j = last; i < j; i++, j--)
{
if (str[i] != str[j])
return 0;
}
return 1;
In isPalindrome, replace the line if(str[i] != str[last-i]){ with if(str[i] != str[first+last-i]){.
Here's your problem:
for(i = first; i <= last; i++){
if(str[i] != str[last-i]){
return 0;
}
}
Should be:
for(i = first; i <= last; i++, last--){
if(str[i] != str[last]){
return 0;
}
}
Also, this:
for(last = strlen(str); last >= i; last--){
Should be:
for(last = strlen(str) - 1; last >= i; last--){
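Putting these corrections together, a brute-force version might look like the sketch below. It is not the assignment's required interface: the names is_palindrome and longest_palindrome and the start/length output parameters are assumptions chosen so that the result can be checked without printing. Both indices passed to is_palindrome are inclusive.

```c
#include <string.h>

/* Returns 1 if str[first..last] (both inclusive) reads the same in
   both directions, 0 otherwise. */
int is_palindrome(const char *str, size_t first, size_t last)
{
    while (first < last) {
        if (str[first] != str[last])
            return 0;
        first++;
        last--;
    }
    return 1;
}

/* Hypothetical helper: store the start index and length of the
   longest palindromic substring of str. For an empty string the
   length is 0. Ties go to the leftmost candidate. */
void longest_palindrome(const char *str, size_t *start, size_t *length)
{
    size_t n = strlen(str);

    *start = 0;
    *length = 0;
    for (size_t i = 0; i < n; i++) {
        /* only try candidates that are longer than the best so far */
        for (size_t len = n - i; len > *length; len--) {
            if (is_palindrome(str, i, i + len - 1)) {
                *start = i;
                *length = len;
                break;
            }
        }
    }
}
```

The result can be printed with printf("%.*s\n", (int)length, str + start). This is still the same brute-force search as in the question, only with the comparison and the end index corrected.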

How do I allocate memory to my char pointer?

My assignment is to let the user enter any input and print the occurrences of letters and words. We also have to print how many one-letter, two-letter, three-letter, etc. words are in the string. I have gotten the letter part of my code to work and have revised my word function several times, but I still can't get the word-finding function to even begin to work. The compiler says the char pointer word is undeclared when it clearly is declared. Do I have to allocate memory for it and for the array of characters?
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
void findLetters(char *ptr);
void findWords(char *point);
int main()
{
char textStream[100]; // up to 98 characters plus '\n' and '\0'
printf("enter some text\n");
if (fgets(textStream, sizeof (textStream), stdin)) //input up to 99 characters
{
findLetters(textStream);
findWords(textStream);
}
else
{
printf("fgets failed\n");
}
return 0;
}
void findLetters(char *ptr) //find occurences of all letters
{
int upLetters[26];
int loLetters[26];
int i;
int index;
for (i = 0; i < 26; i++) // set array to all zero
{
upLetters[i] = 0;
loLetters[i] = 0;
}
i = 0;
while (ptr[i] != '\0') // loop until ptr[i] is '\0'
{
if (ptr[i] >= 'A' && ptr[i] <= 'Z') //stores occurrences of uppercase letters
{
index = ptr[i] - 'A';// subtract 'A' to get index 0-25
upLetters[index]++;//add one
}
if (ptr[i] >= 'a' && ptr[i] <= 'z') //stores occurrences of lowercase letters
{
index = ptr[i] - 'a';//subtract 'a' to get index 0-25
loLetters[index]++;//add one
}
i++;//next character in ptr
}
printf("Number of Occurrences of Uppercase letters\n\n");
for (i = 0; i < 26; i++)//loop through 0 to 25
{
if (upLetters[i] > 0)
{
printf("%c : \t%d\n", (char)(i + 'A'), upLetters[i]);
// add 'A' to go from an index back to a character
}
}
printf("\n");
printf("Number of Occurrences of Lowercase letters\n\n");
for (i = 0; i < 26; i++)
{
if (loLetters[i] > 0)
{
printf("%c : \t%d\n", (char)(i + 'a'), loLetters[i]);
// add 'a' to go back from an index to a character
}
}
printf("\n");
}
void findWords(char *point)
{
int i = 0;
int k = 0;
int count = 0;
int j = 0;
int space = 0;
int c = 0;
char *word[50];
char word1[50][100];
char* delim = "{ } . , ( ) ";
for (i = 0; i< sizeof(point); i++) //counts # of spaces between words
{
if ((point[i] == ' ') || (point[i] == ',') || (point[i] == '.'))
{
space++;
}
}
char *words = strtok(point, delim);
for(;k <= space; k++)
{
word[k] = malloc((words+1) * sizeof(*words));
}
while (words != NULL)
{
printf("%s\n",words);
strcpy(words, word[j++]);
words = strtok(NULL, delim);
}
free(words);
}
This is because (words+1) is a pointer, and you are trying to multiply that pointer by sizeof(*words), which is not a valid operation. Change line 100 to:
word[k] = malloc(strlen(words)+1);
This will solve your compilation problem, but you still have other problems.
You've got a couple of problems in function findWords:
Here,
for (i = 0; i< sizeof(point); i++)
sizeof(point) is the same as sizeof(char*), because point is a char* in the function findWords. This is not what you want. Use
for (i = 0; i < strlen(point); i++)
instead. But this might be slow as strlen will be called in every iteration. So I suggest
int len = strlen(point);
for (i = 0; i < len; i++)
The same problem lies here too:
word[k] = malloc((words+1) * sizeof(*words));
It doesn't make sense to compute (words+1) here. I think you want
word[k] = malloc( strlen(words) + 1 ); //+1 for the NUL-terminator
You got the arguments all mixed up:
strcpy(words, word[j++]);
You actually wanted
strcpy(word[j++], words);
which copies the contents of words to word[j++].
Here:
free(words);
words was never allocated memory. Since you free a pointer that has not been returned by malloc/calloc/realloc, the code exhibits Undefined Behavior. So, remove that.
You allocated memory for each element of word. So free it using
for(k = 0; k <= space; k++)
{
free(word[k]);
}
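Taken together, the fixes above might be combined into something like the following sketch. The function names split_words and free_words, the delimiter set, and the max_words cap are assumptions made for this illustration and are not part of the original code. Because words are counted as they are found, no separate pass for counting spaces is needed.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: split text in place with strtok and store a
   malloc'd copy of each word in words[]. Stops after max_words words
   or if malloc fails. Returns the number of words stored. */
size_t split_words(char *text, char *words[], size_t max_words)
{
    const char *delim = " \t\n.,(){}"; /* assumed set of separators */
    size_t n = 0;

    for (char *tok = strtok(text, delim);
         tok != NULL && n < max_words;
         tok = strtok(NULL, delim)) {
        words[n] = malloc(strlen(tok) + 1); /* +1 for the terminating '\0' */
        if (words[n] == NULL)
            break;
        strcpy(words[n], tok); /* destination first, source second */
        n++;
    }
    return n;
}

/* Free exactly the copies that split_words allocated. */
void free_words(char *words[], size_t n)
{
    for (size_t i = 0; i < n; i++)
        free(words[i]);
}
```

The length of each stored word is then available through strlen(words[i]), which is what the "how many one-letter, two-letter, ... words" part of the assignment needs.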
Your calculation of the pointer position+1 is wrong. To make the compilation problem go away, change line 100 to:
word[k] = malloc( 1 + strlen(words));
