Why is my character count incorrect? - c

The following code gets the number of words:
int count = 0;
for (int i = 0; chars[i] != EOF; i++)
{
if (chars[i] == ' ')
{
count++;
}
}
My problem is that it doesn't count the words correctly.
For example, if my file.txt has the following text in it:
spaced-out there's I'd like
It says I have 6 words, when according to MS Word I'd have 4.
spaced-out and in
Gives me a word count of 4.
spaced out and in
Gives me a word count of 6
I'm sorry if this question has been answered before, Google doesn't take into account the special characters in the search, so it is hard to find the answer to coding. I'd preferably have the words just by identifying if it's a space or not.
I tried looking for answers but no one seemed to have exactly the same problem. I know that .txt files might end in \r\n on Windows, but then that should be part of one word. For example:
spaced out and in\r\n
I believe it should still give me 4 words. Also when I add || chars[i] == '\n' as:
for (int i = 0; chars[i] != EOF || chars[i] == '\n'; i++)
I get even more words, 8 for the line
spaced out and in
I am doing this on a Linux-based server, but on an SSH client on Windows. The characters come from a .txt file.
Edit: Okay, here is the code, I avoided the #include when posting it.
#define BUF_SIZE 500
#define OUTPUT_MODE 0700
int main(int argc, char *argv[])
{
int input, output;
int readSize = 1, writeSize;
char chars[BUF_SIZE];
int count = 0;
input = open(argv[1], O_RDONLY);
output = creat(argv[2], OUTPUT_MODE);
while (readSize > 0)
{
readSize = read(input, chars, BUF_SIZE);
if (readSize < 0)
exit(4);
for (int i = 0; chars[i] != '\0'; i++)
{
if (chars[i] == ' ')
{
count++;
}
}
writeSize = write(output, chars, readSize);
if (writeSize <= 0)
{
close(input);
close(output);
printf("%d words\n", count);
exit(5);
}
}
}

I am writing this answer because I think I know what your confusion is. Note that you did not explain how you read the file, so I'll give an example and explain why we test != EOF, which is not a character that you read from a file.
It appears that you think EOF is a character that is stored in the file. It's not. If you just want to count words, you can do something like
int chr;
while ((chr = fgetc(file)) != EOF)
count += (chr == ' ') ? 1 : 0;
Note that chr MUST be of type int because EOF is of type int, and it is certainly not present in the file! It's returned by functions like fgetc() to indicate that there is nothing more to read. Note that an attempt to read must be made before it is returned.
Oops, also note that my sample code will not count the last word. But that's for you to figure out.
Also, this would count multiple spaces as "words", something that you should also work out.
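For what it's worth, here is a minimal sketch of one way to handle both of those cases. It counts the start of each word instead of counting spaces, so runs of whitespace and a missing trailing space don't throw it off. The names count_words and in_word are my own, not from the question. It is bounded by a length rather than by '\0' or EOF, because read() returns the number of bytes it stored and does not append a terminator to the buffer.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Counts the words that start in buf[0..len-1].
   *in_word carries the "currently inside a word" state across calls,
   so a word split between two chunks is only counted once.
   (Hypothetical helper, not part of the original post.) */
size_t count_words(const char *buf, size_t len, int *in_word)
{
    size_t words = 0;

    assert(buf != NULL && in_word != NULL);
    for (size_t i = 0; i < len; i++) {
        if (isspace((unsigned char)buf[i])) {
            *in_word = 0;          /* any whitespace ends the current word */
        } else if (!*in_word) {
            *in_word = 1;          /* first non-space character of a word */
            words++;
        }
    }
    return words;
}
```

In the read() loop from the question, this could be called as count += count_words(chars, (size_t)readSize, &in_word); with in_word initialised to 0 before the loop.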

Related

Working on a C program to count characters, words, sentences, lines, or all the above.

We are asked to create a program that provides a table in which the user can choose whether to count characters, words, sentences, lines, or all the above. This requires a separate function for each utility. I have the line counter working perfectly, but for some reason, the character counter function keeps returning 0. The program is incomplete, but I am getting very frustrated with the character counter.
#include <stdlib.h>
#define WHT_SPC\
(cur == ' ' || cur == '\n' || cur == '\t')
int countLines(sp1);
int wordCounter(sp1);
int characterCounter(sp1);
int sentenceCounter (sp1);
int main()
{
int lineCount = 0;
int wordCount = 0;
int characterCount= 0;
int sentenceCount = 0;
char filename[100];
FILE* sp1;
printf("Enter Filename to be read: ");
gets(filename);
sp1 = fopen(filename,"r");
lineCount = countLines(sp1);
characterCount = characterCounter(sp1);
printf("Number of Lines: %d\n",lineCount);
printf("Number of Characters: %d\n",characterCount);
fclose(sp1);
return 0;
}
int countLines(sp1)
{
int curCh;
int preCh;
int countLn = 0;
while ((curCh = fgetc(sp1)) != EOF)
{
if (curCh == '\n')
countLn++;
preCh = curCh;
}
if (preCh != '\n')
countLn++;
return countLn;
}
int characterCounter(sp1)
{
int chr;
int countCh = 0;
while ((chr = fgetc(sp1)) != EOF)
{
if (chr != 'n' && chr != ' ')
countCh++;
}
return countCh;
}
I understand the lack of comments is not ideal, but my problem is very specific. Not looking for answers just some advice to kind of point me in the right direction.
sp1 refers to a FILE structure that holds, among other things, your position in the file you are reading. After the loop in countLines(), this position is at the end of the file. You will need to call rewind(sp1); before performing other read operations on the same stream, so that they start from the beginning again.
Don't hesitate to take a look at man rewind!
fgetc() advances the file position.
Thus characterCounter(), called after countLines(), has to return 0, as the first thing it gets back from fgetc() is EOF.
Try using fseek(sp1, 0L, SEEK_SET) before characterCounter().
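To illustrate the point about the file position, here is a small sketch of two passes over the same stream with a rewind() in between. The function names are mine, and I am assuming the asker meant '\n' where the posted code compares against 'n'.

```c
#include <assert.h>
#include <stdio.h>

/* Counts lines; a final line without a trailing '\n' still counts. */
int count_lines(FILE *fp)
{
    int ch, prev = '\n', lines = 0;

    assert(fp != NULL);
    while ((ch = fgetc(fp)) != EOF) {
        if (ch == '\n')
            lines++;
        prev = ch;
    }
    if (prev != '\n')
        lines++;
    return lines;
}

/* Counts characters other than spaces and newlines
   (assumption: '\n' was intended in the original test). */
int count_chars(FILE *fp)
{
    int ch, chars = 0;

    assert(fp != NULL);
    while ((ch = fgetc(fp)) != EOF) {
        if (ch != '\n' && ch != ' ')
            chars++;
    }
    return chars;
}

/* Runs both passes over the same stream, rewinding before each one. */
void count_both(FILE *fp, int *lines, int *chars)
{
    rewind(fp);
    *lines = count_lines(fp);
    rewind(fp);   /* without this, count_chars() starts at end of file */
    *chars = count_chars(fp);
}
```

Removing the second rewind() reproduces the symptom from the question: the character count comes back as 0.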

C output too long

I have a question about my little c program:
#include <stdio.h>
#include <stdlib.h>
int main() {
int c, len;
int max = 100;
char *buffer = malloc(max);
for (len = 0; (c = getchar()) != EOF; len++) {
buffer[len] = c;
if (len == max - 1) {
buffer = realloc(buffer, (len + max));
if (buffer == NULL) {
printf("Error: Out of memory!\n");
return 1;
}
max += 100;
}
}
buffer[len] = '\0';
for (; len >= 0; --len) {
printf("%c", buffer[len]);
}
printf("\n");
free(buffer);
return 0;
}
My task is to write a program which inserts a text and gives a backwards output of the text.
If there happens to be a problem with the allocated memory an error message should occur.
According to my test report from university, the first lines of the output are 1 character too long. I can't determine the reason for this problem, and I'm looking for some advice and help.
First of all, you should understand your problem. You have the following diagnostic:
the first lines of the output are 1 character too long
This is not enough! You should make a specific example. If you give your program some small input, e.g. abc, what will it output? And what should it output? This is less abstract than "1 character too long", and possible to debug.
Your program has an off-by-one bug:
buffer[len] = '\0';
...
printf("%c", buffer[len]);
The first character it will output will be a null character \0. It may not be visible on screen (it's an "unprintable" character), so to debug this you had better make your output more verbose, like this:
printf("Character '%c', whose code is %d\n", buffer[len], buffer[len]);
Note the following features that make debugging easier:
Apostrophes around the printed character will make it clear where your code outputs a space
Verbose format will make it clear how many characters your code outputs
Printing the character a second time as integer (%d) will output its code and will help you debug unprintable characters
Your program has more than one bug. Use the above ideas to reproduce and isolate bugs one by one. Please also read this.
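As a rough sketch of what a version without the off-by-one could look like (the helper names read_all and print_reversed are mine, and this is only one possible structure): the reverse loop starts at len - 1 so the terminator is never printed, and the buffer grows before a write could run past the end. As far as I can tell, the growth arithmetic in the posted code under-allocates by one byte the second time it grows, which may be one of the other bugs.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Reads all of `in` into a malloc'd, NUL-terminated buffer and stores the
   number of characters read in *len_out. Returns NULL if memory runs out. */
char *read_all(FILE *in, size_t *len_out)
{
    size_t cap = 100, len = 0;
    char *buf = malloc(cap);
    int c;

    assert(in != NULL && len_out != NULL);
    if (buf == NULL)
        return NULL;
    while ((c = fgetc(in)) != EOF) {
        if (len + 1 == cap) {             /* always keep one byte for '\0' */
            char *bigger = realloc(buf, cap + 100);
            if (bigger == NULL) {
                free(buf);                /* the old block is still ours */
                return NULL;
            }
            buf = bigger;
            cap += 100;
        }
        buf[len++] = (char)c;
    }
    buf[len] = '\0';
    *len_out = len;
    return buf;
}

/* Writes buf[len-1] down to buf[0], then a newline.
   The terminator at buf[len] is never written. */
void print_reversed(const char *buf, size_t len, FILE *out)
{
    while (len > 0)
        fputc(buf[--len], out);
    fputc('\n', out);
}
```

A main() would call read_all(stdin, &len), print the error message if it returns NULL, and otherwise call print_reversed(buf, len, stdout) followed by free(buf).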

Append data from an array to a char variable in C

I have a program that I'm writing, and I need to read in a configuration file. If you can't tell by the way it's written, it is a placeholder for another program: it opens the second program in its memory space. I have the readLine function all set up, but my "main" operation will only support a variable for arguments (unless I'm incorrect), like this: "arg1 arg2 arg3..." I have seen things on the net like 'strcat' and others, but since I'm not so versed in C, these seem to only add a single character. My needed solution would be:
char args[ 10 ];
FILE *fp=fopen("file.cfg","r");
void readLine(FILE* file, char* line, int limit)
{
int i;
int read;
read = fread(line, sizeof(char), limit, file);
line[read] = '\0';
for(i = 0; i <= read;i++)
{
if('\0' == line[i] || '\n' == line[i] || '\r' == line[i])
{
line[i] = '\0';
break;
}
}
if(i != read)
{
fseek(file, i - read + 1, SEEK_CUR);
}
}
int main(void)
{
_spawnl( P_OVERLAY, "prog1.exe", "prog1.exe", args, NULL );
return 0;
}
The 'args' variable in 'int main(void)' would need to be the one that is = line[i],
unless a complete rewrite would be necessary.
Also, before you flame me: I don't want a loop in main, because this program just calls another, then dies. A loop might make it call an infinite number of the same program, and... well, that would be bad. Thanks in advance!
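One possible direction, offered as a sketch only: read the first line of the file straight into the buffer that will be passed on, so nothing needs to be appended afterwards. The helper name read_config_line is made up for illustration, and fgets() is swapped in for the fread()/fseek() approach above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Reads one line from fp into line (at most size-1 characters) and strips
   the trailing "\n" or "\r\n". Returns 1 on success, 0 on EOF or error.
   (Hypothetical helper, not from the original post.) */
int read_config_line(FILE *fp, char *line, size_t size)
{
    assert(fp != NULL && line != NULL && size > 0);
    if (fgets(line, (int)size, fp) == NULL)
        return 0;
    line[strcspn(line, "\r\n")] = '\0';   /* cut at the first CR or LF */
    return 1;
}
```

With this, main needs no loop: open "file.cfg", call read_config_line(fp, args, sizeof args) once, and pass args on if it returned 1. Note that args would have to be larger than the 10 bytes declared above to hold a line like "arg1 arg2 arg3".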

Counting lines, numbers, and characters in C

I'm new to C and I got an assignment today that requires that I read text in from a file, count the number of lines, characters, and words, and return it in a specific format.
Just to be clear - I need to read in this text file:
"I must not fear.
Fear is the mind-killer.
Fear is the little-death that brings total obliteration.
I will face my fear.
I will permit it to pass over me and through me.
And when it has gone past I will turn the inner eye to see its path.
Where the fear has gone there will be nothing... only I will remain"
Litany Against Fear, Dune by Frank Herbert
and have it output like so:
1)"I must not fear.[4,17]
2)Fear is the mind-killer.[4,24]
3)Fear is the little-death that brings total obliteration.[8,56]
4)I will face my fear.[5,20]
5)I will permit it to pass over me and through me.[11,48]
6)And when it has gone past I will turn the inner eye to see its path.[16,68]
7)Where the fear has gone there will be nothing... only I will remain"[13,68]
8) Litany Against Fear, Dune by Frank Herbert[7,48]
Now, I've written something that will accept the file, it counts the number of lines properly, but I have 2 major issues - 1. How do I get the text from the file to appear in the output? I can't get that at all. My word count doesn't work at all, and my character count is off too. Can you please help?
#include <stdio.h>
#define IN 1
#define OUT 0
void main()
{
int numChars = 0;
int numWords = 0;
int numLines = 0;
int state = 0;
int test = 0;
FILE *doesthiswork;
doesthiswork = fopen("testWords.in", "r");
state = OUT;
while ((test = fgetc(doesthiswork)) != EOF)
{
++numChars;
if ( test == '\n')
{
++numLines;
if (test == ' ' || test == '\t' || test == '\n')
{
state = OUT;
}
else if (state == OUT)
{
state = IN;
++numWords;
}
}
printf("%d) I NEED TEXT HERE. [%d %d]\n",numLines, numWords, numChars);
}
}
It would be better to use the getline() function to read each line from the file.
After reading a line, process it with the strtok() function. This gives you the number of words in the line, which you can save in a variable.
Then process each token to get the number of characters.
Output the line number, the number of words, and the number of characters.
Then read the next line, and so on.
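A minimal sketch of the per-line step described above, assuming the line has already been read into a writable buffer (with getline() on POSIX systems, or fgets()). The name line_stats is mine. The character count here is the line length without its line ending, which matches the [words,chars] pairs in the expected output.

```c
#include <assert.h>
#include <string.h>

/* Computes the word and character counts for one line of text.
   strtok() overwrites the delimiters in `line`, so the length is taken
   first; print the line (or work on a copy) before calling this. */
void line_stats(char *line, int *words, int *chars)
{
    assert(line != NULL && words != NULL && chars != NULL);
    *chars = (int)strcspn(line, "\r\n");   /* length without line ending */
    *words = 0;
    for (char *tok = strtok(line, " \t\r\n"); tok != NULL;
         tok = strtok(NULL, " \t\r\n"))
        (*words)++;
}
```

Because strtok() modifies its argument, the caller would strip the newline, print the line number and the text, then call line_stats() and print the two counts.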
How do I get the text from the file to appear in the output?
Store each character in a buffer as you read it, then print the buffer at the end of the line.
My word count doesn't work at all, and my character count is off too.
The order of the tests is wrong.
Fix it like this:
#include <stdio.h>
#define IN 1
#define OUT 0
int main(){
int numChars = 0;
int numWords = 0;
int numLines = 0;
int state = OUT;
int test;
char buffer[1024];
int buff_pos = 0;
FILE *doesthiswork;
doesthiswork = fopen("data.txt", "r");
state = OUT;
while((test = fgetc(doesthiswork)) != EOF) {
++numChars;
buffer[buff_pos++] = test;
if(test == ' ' || test == '\t' || test == '\n'){
state = OUT;
if(test == '\n') {
++numLines;
--numChars;//no count newline
buffer[--buff_pos] = '\0';//rewrite newline
printf("%d)%s[%d,%d]\n", numLines, buffer, numWords, numChars);
buff_pos = 0;
numWords = numChars = 0;
}
} else {
if(state == OUT){
state = IN;
++numWords;
}
}
}
fclose(doesthiswork);
if(buff_pos != 0){//Input remains in the buffer.
++numLines;
buffer[buff_pos] = '\0';
printf("%d)%s[%d,%d]\n", numLines, buffer, numWords, numChars);
}
return 0;
}

C - Malloc issue (maybe something else)

Update edition:
So, I'm trying to get this code to work without using scanf/fgets. Gets chars from the user, puts it into a pointer array using a while loop nested in a for loop.
#define WORDLENGTH 15
#define MAXLINE 1000
int main()
{
char *line[MAXLINE];
int i = 0;
int j;
int n;
char c;
for (n=0; c!=EOF; n){
char *tmp = (char *) malloc(256);
while ((c=getchar())!=' '){
tmp[i]=c; // This is no longer updating for some reason.
i++;
}
line[n++]=tmp; //
i=0;
printf("\n%s\n",line[n]); //Seg fault here
}
for(j = 0; j < n; j++){
printf("\n%s\n", line[j]);
free (line[j]);
}
return 0;
So, now I'm getting a seg fault. Not sure why tmp[i] is not updating properly. Still working on it.
I've never learned this much about programming during the entire semester so far. Please keep helping me learn. I'm loving it.
You print line[i] and just before that, you set i to 0. Print line[n] instead.
Also, you forgot the terminating 0 character. And your code will become easier if you make tmp a char array and then strdup before assigning to line[n].
sizeof(WORDLENGTH), for one, is wrong. malloc takes a size in bytes, and WORDLENGTH is already an integer. sizeof(WORDLENGTH) will give you the size of an int, which is typically 4, so you're allocating 4 bytes.
Btw - while ((c=getchar())!=' '||c!=EOF) - what's your intent here? A condition like (a!=b || a!=c) will always return true if b!=c because there is no way a can be both b and c.
And, as others pointed out, you're printing out line[i], where i is always 0. You probably meant line[n]. And you don't terminate the tmp string.
And there's no overflow checking, so you'll run into evil bugs if a word is longer than WORDLENGTH.
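To make the points above concrete, here is a sketch of a word-reading loop that uses && in the condition, terminates the string, and refuses to write past the buffer. The function name read_word and its return convention are my own choices, not something from the question.

```c
#include <assert.h>
#include <stdio.h>

/* Reads one space-delimited word from `in` into word (at most size-1
   characters, always NUL-terminated). Extra characters of an over-long
   word are read and dropped. Returns the last value read: ' ' or EOF. */
int read_word(FILE *in, char *word, size_t size)
{
    size_t i = 0;
    int c;                         /* int, so that EOF is representable */

    assert(in != NULL && word != NULL && size > 0);
    /* Stop on a space OR on EOF: that is "not space AND not EOF". */
    while ((c = fgetc(in)) != ' ' && c != EOF) {
        if (i + 1 < size)          /* leave room for the terminator */
            word[i++] = (char)c;
    }
    word[i] = '\0';
    return c;
}
```

The caller can keep calling read_word(stdin, tmp, sizeof tmp) until it returns EOF, copying each non-empty word into line[n++].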
Others have already told you some specific problems with your code, but one thing they seem to have missed is that c should be an int, not a char. Otherwise the comparison to EOF will not work as expected.
In addition, the segfault you're getting is because of this sequence:
line[n++]=tmp;
printf("\n%s\n",line[n]);
You have already incremented n to the next array element then you try to print it. That second line should be:
printf("\n%s\n",line[n-1]);
If you just want some code that works (with a free "do what you darn well want to" licence), here's a useful snippet from my code library.
I'm not sure why you think fgets is to be avoided; it's actually very handy and very safe. I'm assuming you meant gets, which is less handy and totally unsafe. Your code is also prone to buffer overruns, since it will happily write beyond the end of your allocated area if it gets a lot of characters that are neither space nor end of file.
By all means, write your own code if you're educating yourself but part of that should be examining production-tested bullet-proof code to see how it can be done. And, if you're not educating yourself, you're doing yourself a disservice by not using freely available code.
The snippet follows:
#include <stdio.h>
#include <string.h>
#define OK 0
#define NO_INPUT 1
#define TOO_LONG 2
static int getLine (char *prmpt, char *buff, size_t sz) {
int ch, extra;
// Get line with buffer overrun protection.
if (prmpt != NULL) {
printf ("%s", prmpt);
fflush (stdout);
}
if (fgets (buff, sz, stdin) == NULL)
return NO_INPUT;
// If it was too long, there'll be no newline. In that case, we flush
// to end of line so that excess doesn't affect the next call.
if (buff[strlen(buff)-1] != '\n') {
extra = 0;
while (((ch = getchar()) != '\n') && (ch != EOF))
extra = 1;
return (extra == 1) ? TOO_LONG : OK;
}
// Otherwise remove newline and give string back to caller.
buff[strlen(buff)-1] = '\0';
return OK;
}
// Test program for getLine().
int main (void) {
int rc;
char buff[10];
rc = getLine ("Enter string> ", buff, sizeof(buff));
if (rc == NO_INPUT) {
printf ("No input\n");
return 1;
}
if (rc == TOO_LONG) {
printf ("Input too long\n");
return 1;
}
printf ("OK [%s]\n", buff);
return 0;
}
It's a useful line input function that has the same buffer overflow protection as fgets and can also detect lines entered by the user that are too long. It also throws away the rest of the too-long line so that it doesn't affect the next input operation.
Sample runs with 'hello', CTRL-D, and a string that's too big:
pax> ./qq
Enter string> hello
OK [hello]
pax> ./qq
Enter string>
No input
pax> ./qq
Enter string> dfgdfgjdjgdfhggh
Input too long
pax> _
For what it's worth (and don't hand this in as your own work since you'll almost certainly be caught out for plagiarism - any half-decent educator will search for your code on the net as the first thing they do), this is how I'd approach it.
#include <stdio.h>
#include <stdlib.h>
#define WORDLENGTH 15
#define MAXWORDS 1000
int main (void) {
char *line[MAXWORDS];
int numwords = 0; // Use decent variable names.
int chr, i;
// Code to run until end of file.
for (chr = getchar(); chr != EOF;) { // First char.
// This bit gets a word.
char *tmp = malloc(WORDLENGTH + 1); // Allocate space for word/NUL
i = 0;
while ((chr != ' ') && (chr != EOF)) { // Read until space/EOF
if (i < WORDLENGTH) { // If space left in word,
tmp[i++] = chr; // add it
tmp[i] = '\0'; // and null-terminate.
}
chr = getchar(); // Get next character.
}
line[numwords++] = tmp; // Store.
// This bit skips space at end of word.
while ((chr == ' ') && (chr != EOF)) {
chr = getchar();
}
}
// Now we have all our words, print them.
for (i = 0; i < numwords; i++){
printf ("%s\n", line[i]);
free (line[i]);
}
return 0;
}
I suggest you read that and study the comments so that you know how it's working. Feel free to ask any questions in the comments section and I'll answer or clarify.
Here's a sample run:
pax$ echo 'hello my name is pax andthisisaverylongword here' | ./testprog
hello
my
name
is
pax
andthisisaveryl
here
Change your printf line - you need to print line[n] rather than line[i].
First, your malloc formula is wrong:
malloc(sizeof(char)*WORDLENGTH);
You need to allocate the size of a char enough times for the length of your word (also, 15 seems a bit small; you're not counting the longest word in the dictionary or the "iforgettoputspacesinmyphrasestoscrewtheprogrammer" cases, lol).
Don't be shy, char is small; you can go up to 256 or 512 easily ^^
also
printf("\n%s\n",line[i]);
needs to be changed to
int j = 0;
for(j=0;j<i;j++){
printf("\n%s\n",line[j]);
}
your i never changes so you always print the same line

Resources