Program to find Word Length Statistics [closed] - c

I am expected to make a program to calculate and display statistics about the length of words in a text file. I have been provided the following file
int readFile(const char fName[], char textStr[]){
FILE *fPtr;
char ch;
int size = 0;
if ((fPtr = fopen(fName, "r")) == NULL) {
fprintf(stderr, "Error, failed to open %s: ", fName);
perror("");
return 1;
}
while ((ch = fgetc(fPtr)) != EOF) {
if (size >= MAX_FILE - 1)
break;
textStr[size++] = ch;
}
textStr[size] = '\0';
return size;
}
I was able to verify that I can access the file using the following code
int main() {
char str[MAX_FILE];
int len = readFile("test.txt", str);
if (len == -1) {
printf("An error occurred\n");
} else {
printf("file read");
}
}
File test.txt contains
The quick brown fox jumps over the lazy dog
What I want to do is to get the contents of test.txt and find the length of each word in it, something like:
1 letter words - 0
2 letter words - 0
3 letter words - 3
4 letter words - 4
and so on...

As a fellow new contributor, I'm going to give you a break and try to answer the question you didn't ask. ;)
I believe the question is "how to proceed". This is going to be a long answer, as I will try to be very detailed since you seem to be a newbie. Hopefully this will help you or maybe someone else.
The trick is to take a word problem and convert it into a mathematical solution. The best way to do this is to write "pseudocode". (See Wikipedia for more information, if you need to.) I'm going to give you some pseudocode at the end, but since this appears to be a homework assignment, please try to write your own pseudocode first. If you read the pseudocode and it still doesn't help, I can post my solution later. (I'm not a great programmer, so it might not be the best program. And it took way too long to come up with it.)
First things first: there appears to be a typo in the code you were provided. The problem is the return 1 statement when the file can't be opened. That should be return -1, because a file containing exactly one character also makes readFile return 1, so the caller could not tell the two cases apart. (Your main already checks for -1.)
Now, to convert the word problem you were given: you need an array of word counts to keep track of 1-letter, 2-letter, etc. words. The longest word in a major English dictionary is reportedly 45 letters, so in theory you would need an array of 45 wordCounts elements. You can shorten this as required.
Now, to process your str variable, you need a while statement to go through it one character at a time. Since the characters in the string go from element 0 through one less than the len variable, you need to code the while accordingly.
Within that while, you need another while. This while needs to count up the wordLength one character at a time, until you see a blank or the trailing '\0' character of str. To do this, you initialize the wordLength to zero right before the second while. Then add 1 to the wordLength for each character you count and increment your subscript.
At the end of this inner while you need to accumulate your wordCounts. Keep in mind that your 1-letter words are going to be accumulated into element 0 of your array, so you need to add 1 to the array element at index wordLength - 1 (and only when wordLength is greater than zero, which it won't be if two blanks follow each other). After that you need to increment the subscript you are using to go through your str, one character at a time.
At the end, you need to print out the wordCounts array values. Since most of the word lengths will have a value of zero, I wouldn't print these, unless you set the maximum length of the wordCounts array to something like 10 instead of 45. You want a for loop to go through your wordCounts array and do something like this: printf("%2d letter words = %d\n", ..., ...);. Keep in mind your 1-letter words are going to be in element 0.
That is a very detailed version of a word problem that is the solution to the problem of "count the number of words that the phrase has from 1-letter words to x-letter words".
Here is the pseudocode I came up with, after coding my solution. It is a little more detailed than normal pseudocode would be. (Personally, I abbreviate all variable names and use Pascal case, but that's just me.)
Declare a numeric array of wordCounts and a subscript.
For each element of wordCounts, zero out the number of words or the code won't work right.
Reinitialize subscript to zero.
As long as (while) the subscript is less than the len, continue.
Initialize the wordLength to zero.
As long as the str[subscript] is not a blank or a null character, add 1 to the wordLength and increment the subscript.
If the wordLength is greater than zero, add 1 to wordCounts[wordLength - 1].
Increment the subscript.
After both while statements are complete, print out the array of wordCounts, as described above.
You're done!
Now I could post the actual code that could be used to come up with this pseudocode, but it would be better if you came up with it yourself. If you try but have a bug in your code, post a new question, and I'll try to check back to answer it. Hope this helps you or someone else.


Why do I get additional weird characters in this C code with dynamically-allocated char arrays?

I'm quite new to dynamic memory allocation in general.
I've been looking for an error in this code for about 6 hours in the last 3 days now, it's driving me crazy, that's why I've decided to ask for help here.
Here's the code:
char ch;
char* line=(char*)calloc(1, sizeof(char));
if(input!=NULL) {
for(int num=1; (ch=fgetc(input)) != EOF; num++) //input is the pointer to the in file
if(ch!=' ') {
line=(char*)realloc(line, sizeof(char)*num+1);
strcat(line, &ch);
}
else
break;
}
I'm trying to read from a file the first of two whitespace-separated words, where the total size is not predetermined (I'll need this to read even more from the file so it's important, this was "just to try").
This is for a single line, not multiple lines (char** I think would be used in that case), and the idea was to allocate the first character of the line and set it to zero, then reallocate the memory incrementing its size by one character.
If I "num++", it crashes; if I don't, its output will be, instead of "Nole", this: N☺o☺l☺e☺ (output is after the loop; how does it even increase if num remains the same?). I checked the ASCII codes and this is what I get: 78 1 111 1 108 1 101 1; there is a '1' after every character, which is THE SAME value as "num" (in fact, if num==2, then I get '2's instead of '1's). I've tried it with different compilers and different machines but I always get the same result and I cannot explain why.
I'm really going crazy, also because I'm gonna have an exam in about two weeks and this is basically the only thing I haven't learned yet among all the required topics.
Thank you so much in advance 😿
EOF is an int so you must use int ch;
As mentioned in comments, you pass a single ch to strcat and not a null terminated string, so it will go haywire. Quick fix: strcat(line, (char[2]){ch,'\0'});.
Or if you add a counter, you could just do line[count] = ch; which is much more efficient. Though in that case you'll have to remember to append the null terminator manually in the end.
Also, sizeof(char) is always 1 by the very definition of sizeof, so it's just a needlessly bloated way of writing 1.

How to find a string within a string in C? [closed]

First, I have already tried using the strstr() method that has been mentioned here, but this is not working for the problem I am trying to solve.
I am to obtain an input string from the user (the target string), which can be a word or a full sentence. It then scans an input text file line-by-line and checks if this input string exists in any lines of the file. If so, the corresponding text file line number and the full text file line in which the target string can be found is printed to the screen.
My problem while using the strstr(line, targetString) method is as follows:
Let's presume there is a sentence in the file, "I vigorously slapped that fish with a squirrel." Then, I enter my target string as, "I frolicked about the cat graveyard with unrelenting glee." It will show that the target string was matched, presumably because it is just finding a match with the "I". However, I need it to match only when it matches word-for-word of the target string to some substring within the text file line.
Any ideas how I might go about this? Thanks in advance!
Code:
FILE *inputFile = fopen(fileName, "r");
i = 1;
while (fgets(line, sizeof line, inputFile) != NULL)
{
if ((strlen(line) != 1) && (strstr(line, targetString)))
{
printf("Line %d: ", i);
printf("%s\n\n", line);
}
i++;
}
strstr() should work for your problem. Make sure both of the arguments actually have spaces and not \0 characters. strstr() like many standard C library functions treats \0 as a string terminator. Based on your description, I am going to guess the problem is that you read one line at a time and terminate each with a \0, then start matching one line at a time. You might be able to solve your problem if you read all of the strings at once into a buffer equal to the size of the file removing newline characters.
To be able to answer your question better, I would need to see your current source code with a test case that you expect to work.
My problem while using the strstr(line, targetString) method is as follows: Let's presume there is a sentence in the file, "I vigorously slapped that fish with a squirrel." Then, I enter my target string as, "I frolicked about the cat graveyard with unrelenting glee." It will show that the target string was matched, presumably because it is just finding a match with the "I".
This is not how strstr works. It doesn't match the first word/letter, it matches the entire string.
Let's presume there is a sentence in the file, "I vigorously slapped that fish with a squirrel." Then, I enter my target string as, "I frolicked about the cat graveyard with unrelenting glee." It will show that the target string was matched, presumably because it is just finding a match with the "I".
It sounds like you're interpreting the result of strstr() as an index into the string in question, but in fact the result is a char *, i.e. a pointer to the first match in the string. The 0 that you're getting back isn't the index of the first character, it's NULL, meaning that there was no match.

Dynamic memory allocation for input? [closed]

I am having a lot of trouble starting my project. Here are the directions:
"Complete counts.c as follows:
Read characters from standard input until EOF (the end-of-file mark) is read. Do not prompt the user to enter text - just read data as soon as the program starts.
Keep a running count of each different character encountered in the input, and keep count of the total number of characters input (excluding EOF)."
The format my professor gave me to start is:
#include <stdio.h>
int main(int argc, char *argv[]) {
return 0;
}
In addition to how to start the problem, I'm also confused as to why the two parameters are given in the main function when nothing is going to be passed to it. Help would be much appreciated! Thank you!
Slightly tricky to see what you're having trouble with here. The title doesn't form a complete question, nor is there one in the body; and they seem to be hinting at entirely different questions.
The assignment tells you to read characters - not store them. You could have a loop that only reads them one at a time if you wish (for instance, using getchar). You're also asked to report counts of each character, which would make sense to store in an array. Given that this is of "each different character", the simplest way would be to size the array for all possible characters (limits.h defines UCHAR_MAX, which would help with this). Remember to initialize the array if it's automatically allocated (the default for function local variables).
Regarding the arguments to main, this program does not need them, and the C standard does allow you to leave them out. They're likely included as this is a template of a basic C program, to make it usable if command line arguments will be used also.
For more reference code you might want to compare the word count utility (wc); the character counting you want is the basis of a frequency analysis or histogram.
This should give you a start to investigate what you need to learn to complete your task.
Initially, declare a character input buffer of sufficient size to read chars into:
char input[SIZE];
Use fgets() to read the characters from stdin:
if (fgets(input, sizeof input, stdin) == NULL) {
; // handle EOF
}
Now the input array holds your string of characters, in which you need to find the occurrences of each character. I did not understand what you mean by counting different characters; however, you now have an array that you can traverse completely to count the characters you need.
Firstly, luckily for you, we will not need dynamic memory allocation at all here, as we are not asked to store the input strings. Instead we simply need to record how many times each character code is input during the program run. As there is a constant and finite number of those, we can simply store the counts in a fixed-size array.
The functions we are looking at here (assuming we are using standard libs) are as follows:
getchar, to read chars from standard input
printf, to print the outputs back to stdout
The constructs we will need are:
do {} while, to loop around until a condition is false
The rest just needs simple mathematical operators, here is a short example which basically shows a sample solution:
#include <stdio.h>
int main(int argc, char *argv[])
{
    /* Create an array with entries for each char,
     * then init it to zeros */
    int AsciiCounts[256] = {0};
    int ReadChar;
    int TotalChars = 0;
    int Iterator = 0;
    do
    {
        /* Read a char from stdin */
        ReadChar = getchar();
        /* Count it, unless it is EOF: EOF is negative and must
         * not be used as an array index */
        if (ReadChar != EOF)
        {
            AsciiCounts[ReadChar]++;
            TotalChars++;
        }
    } while (ReadChar != EOF);
    /* Stop if we read an EOF */
    do
    {
        /* Print each char code and how many times it occurred */
        printf("Char code %#x occurred %d times\n", Iterator, AsciiCounts[Iterator]);
        Iterator++;
    } while (Iterator <= 255);
    /* Print the total length read in */
    printf("Total chars read (excluding EOF): %d\n", TotalChars);
    return 0;
}
This should achieve the basic goal. However, a couple of extension exercises would likely benefit your understanding of C. First, you could try to convert the second do while loop to a for loop, which is more appropriate for the situation but which I did not use for simplicity's sake. Second, you could add a condition so the output phase skips codes which never occurred. Finally, it could be interesting to check which chars are printable and print their value instead of their hex code.
On the second part of the question: those arguments are passed to main, even though they are ignored here, because of the standard calling convention for C programs under most OSes. They carry the number of command line arguments and the value of each argument respectively, in case the program wishes to check them. If you really will not use them, you can write int main(void) instead; however, this means a little more work later if you choose to add command line options, and it has no performance benefit.

Incrementation and addition in pointers [closed]

I have written the following code and it works fine. But before I made changes I had entered a few statements which I expected would work but didn't. Just as a trial, I made changes and it worked. Please clarify what I was doing wrong. I am trying simple programs initially to improve my understanding of pointers.
#include<stdio.h>
#include<string.h>
#include<stdlib.h>
void main()
{
int i;
char *instring,*outstring;
char ch,p;
instring = (char*)malloc(15*sizeof(char));
outstring= (char*)malloc(15*sizeof(char));
printf("Enter the string:");
gets(instring);
printf("Enter the character to be removed:");
scanf("%c",&ch);
for(i=0;i<strlen(instring);i++)
{
if( *(instring+i) != ch)
{
*(outstring+i) = *(instring+i);
printf("%c",*(outstring+i));
}
}
Under if statement, I had written the following:
if(*(instring) != ch)
{
*outstring = *instring;
printf("%c",*(outstring));
instring++;
outstring++;
}
Why doesn't this work?
I'm not sure what you mean by not working, but if your non-working function looked like this:
for(i=0;i<strlen(instring);i++)
{
if(*(instring) != ch)
{
*outstring = *instring;
printf("%c",*(outstring));
instring++;
outstring++;
}
}
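Then it looks like the problem is that you are only incrementing instring if its current character doesn't match ch. So once instring's current character does match ch, you never move past it: every remaining iteration compares that same character and nothing after it is copied. On top of that, strlen(instring) shrinks each time instring is incremented while i keeps growing, so the loop stops early even when there is no match.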
Also, if all you want to do is print out the string without the character to be removed, you don't need outstring. Just do e.g.
printf("%c",*(instring));
On the other hand if you also want to store the string in outstring with the character removed, you can't increment the pointers in tandem like you are doing. Because when you hit the character to be removed, you want to increment the instring pointer to move past it, but not increment the outstring pointer since you haven't added anything to that string.
It does work. But you lose both strings (the pointers to them).
Try to store pointers to the beginning of both strings in another two pointers (char*) and then do your cycle.
And at the end, print the pointers you stored before - because they still point to your strings. When you increment pointers instring and outstring, they don't point to your strings anymore - they point to the very end of these strings.
Play with it a little bit more and you'll see :)
edit: Well, no, I've been writing faster than reading, sorry. Your (other) problem comes even sooner: the comparison. You are comparing the current character of instring with ch, and you advance instring only if the condition is true. So you are comparing the same thing over and over and over.
It's tough to tell what your problem really is.

How can I read a specific line from a file, in C?

All right: So I have a file, and I must do things with it. Oversimplifying, the file has this format:
n
first name
second name
...
nth name
random name
do x⁽¹⁾, y⁽¹⁾ and z⁽¹⁾
random name
do x⁽²⁾, y⁽²⁾, z⁽²⁾
...
random name
do x⁽ⁿ⁾, y⁽ⁿ⁾, z⁽ⁿ⁾
So, the actual details are not important.
The problem is: I'll have to declare a variable n, I have an array name[MAX], and I'll fill this array with the names, from name[0] to name[n-1].
Alright, the problem is: How can I get this input, if I don't know previously how many names do I have?
For example, I could do it just fine if that was an user input, from the keyboard: I would do it like this:
int n; char name[MAX];
scanf( "%d", &n);
int i; for (i = 0; i < n; i++)
scanf( "%s", &N[i]);
And I could go on, do the whole code, but you get the point. But, my input now comes from a file. I don't know how can I get the input, all I can do is to fscanf() the whole file, but since I don't know its size (the first number will determine it), I can't do it. As far as I know (please correct me if that's not true, I am very new to this), we can't use the command "for" and get the numbers gradually as if that was coming from the keyboard, right?
So, the only exit I see is to find a way to read a particular line from the file. If I can do this, the rest is easy. The thing is, how can I do that?
I google'd it, I even found some questions in there, though it didn't make any sense at all. Apparently, reading a particular line from a file is really complicated.
This is from a beginner problem set, so I doubt it is something that complicated. I must be missing something very simple, though I just don't know what it is.
So, the question is: How would you do it, for instance?
How to scan the first number n from the file, and then, scan the others 'n' names, assigning each one to an element in an array (first name = name[0], last name = name[n - 1])?
I would suggest looking into End Of File.
while(!eof(fd))
{
...code...
}
Mind you my C knowledge is rusty, but this should get you started.
IIRC the end-of-file check is spelled feof() in standard C and takes the FILE * of the file you are reading (fd here). The read functions themselves return EOF (typically -1) at the end of the file, which is the value you compare against.
Then after parse of text or count of lines you have your 'n'.
EDIT: Since I'm obviously more tired than I thought (didn't notice your 'n' at the top):
Read first line
malloc for 'n' size array
for loop to iterate names.
Here you go. I leave compiling and debugging as an exercise for the student.
The idea is to slurp the whole file into a single array, if your files are always small.
This is so much more efficient than scanf().
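// needs <stdio.h>, <stdlib.h> and <string.h>; fd is a FILE * opened for reading
char buf[100000], *bp, *N[1000]; // plenty big
int n = 0;
// fread() rather than fgets(): fgets() would stop at the first newline
size_t got = fread( buf, 1, sizeof buf - 1, fd );
if ( got == sizeof buf - 1 )
{ // buffer filled completely: file is probably too long for it
    fprintf( stderr, "trouble: file too large: %d\n", (int)(sizeof buf));
    exit(EXIT_FAILURE);
}
buf[got] = '\0';
// now replace each \n with a \0, remembering where each line is.
for ( bp = buf; *bp != '\0' && n < 1000; )
{
    N[n++] = bp; // start of the current line
    char *nl = strchr( bp, '\n' );
    if ( nl == NULL )
        break; // last line had no trailing newline
    *nl = '\0';
    bp = nl + 1;
}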
If you want to read files of any size, you need to read the file in chunks, calloc()ing each chunk before a read, carefully handling the line fragment left at the end of the current buffer by moving it to the next buffer, and then properly continuing your reads.
Unless you have a limit on how many lines you can read, N may also need to be set up in chunks, but this time realloc() might be your friend.
Since the given format seems to imply that the number of names n is given as the first entry in the file, it would be possible to use the style of reading that the OP describes when reading from stdin. Use fscanf to read the first integer from the file (n), then use malloc to allocate the array(s) for the names, then use a for loop up to n to read the names.
However, I am unsure of the meaning of the example data following that with the do x⁽¹⁾, y⁽¹⁾ and z⁽¹⁾ format. Perhaps I am not understanding part of the question. If it means there are potentially more than n names, then you can use realloc to grow the size of the array. One way of growing the array that is not uncommon is to double the length each time.
