Printing Word Length Histogram in C - c

Okay so I'm using Kernighan and Ritchie's "The C Programming Language" and I'm on exercise 1.13 and I can't seem to get this right. My program seems to not be printing much. The problem is as follows:
Exercise 1-13. Write a program to print a histogram of the lengths of words in its input. It is easy to draw the histogram with
the bars horizontal; a vertical orientation is more challenging.
Besides the creation of variables, here's my pseudocode for reading the input and storing what I want to store in the array.
Create an array -- in this case, my array is of size 21 (21 elements, from 0 to 20) all assigned a value of 0 initially. It has 21 elements because I'm not going to use words that have more than 20 characters. I realize this is weird given no words have zero characters.
Begin counting characters in input.
If I encounter a space, tab, or newline character (i.e., this means the first word ended), stop.
Depending on how many characters the word had, increment that particular position in the array (i.e., if the word had two characters add 1 to the element at position 2 in the array).
Increment the wordCounter variable -- this variable, as its name indicates, keeps track of the number of words that have been "read" in the input.
Continue doing this to each word until EOF is reached.
Here's my pseudocode for printing the histogram (horizontally).
For the first position, print the value stored in the first position of the array (i.e., 0) using tick marks |
Do this for every element in the array.
Here's my code:
#include <stdio.h>
#define SIZEOFWORDSOFLENGTH 21
int main() {
int wordsOfLength[SIZEOFWORDSOFLENGTH];
int c, i, j;
int lengthCounter = 0;
/*Initializing all array elements to 0.*/
for (i = 0; i < SIZEOFWORDSOFLENGTH; i++)
wordsOfLength[i] = 0;
/*Going through the input and counting.*/
while ((c = getchar()) != EOF) {
++lengthCounter;
if (c == ' ' || c == '\t' || c == '\n') {
++wordsOfLength[lengthCounter - 1];
lengthCounter = 0;
}
}
for (i = 0; i < SIZEOFWORDSOFLENGTH; i++) {
printf("Words of Length %d: ", i);
/*The third argument of the following for loop was previously j = j*/
for (j = 0; j < SIZEOFWORDSOFLENGTH; j++) {
while (j < wordsOfLength[i]) {
printf("|");
/*Was previously j++ instead of break*/
break;
}
}
printf("\n");
}
}
I debugged it by hand but I can't seem to find the problem. Maybe something really simple is going over my head. Also, I know this question has been asked before but I'm not trying to find a solution for the actual problem, I think my pseudocode is right if not somewhat right, I just want to know what's wrong with my code and maybe learn something. Thank you in advance.

As indicated in Ji-Young Park's answer, the reading loop has problems because it uses negative indexes into the array wordsOfLength. I would keep life simple and have wordsOfLength[i] store the number of words of length i, though it effectively wastes wordsOfLength[0]. I would use the macros from <ctype.h> to spot word boundaries, and I'd keep a record of whether I was in a word or not. You get credit for using int c.
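int inword = 0;
while ((c = getchar()) != EOF)
{
    if (!isspace(c))
    {
        lengthCounter++;
        inword = 1;    /* we are now inside a word */
    }
    else if (inword)
    {
        wordsOfLength[lengthCounter]++;
        lengthCounter = 0;
        inword = 0;
    }
}
/* Count a final word that is not followed by white space. */
if (inword)
    wordsOfLength[lengthCounter]++;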
This code is not bamboozled by leading white space in the file. If you think there's any risk of reading 'antidisestablishmentarianism' (28) or 'floccinaucinihilipilification' (29) or other grotesquely long words, you should check on lengthCounter before blindly using it as an index, either dropping overlong words from the count or mapping them all to '20+ characters' class.
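The check on lengthCounter mentioned above could look something like the following sketch. The helper name record_word and the constant MAX_TRACKED are invented for this illustration; it takes the second option and folds every overlong word into the last bucket.

```c
#define MAX_TRACKED 20                 /* hypothetical: longest length with its own bucket */
#define NBUCKETS (MAX_TRACKED + 1)     /* index 0 stays unused, as in the scheme above */

/* Count one word of the given length. Anything longer than MAX_TRACKED
   is folded into the last bucket, which then means "20+ characters". */
static void record_word(int counts[NBUCKETS], int length)
{
    if (length <= 0)
        return;                        /* not a word, nothing to count */
    if (length > MAX_TRACKED)
        length = MAX_TRACKED;          /* clamp before using it as an index */
    counts[length]++;
}
```

In the reading loop, each wordsOfLength[lengthCounter]++ would then become record_word(wordsOfLength, lengthCounter).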
Your final triple loop is quite problematic too. It is currently:
for (i = 0; i < SIZEOFWORDSOFLENGTH; i++) {
printf("Words of Length %d: ", i);
/*The third argument of the following for loop was previously j = j*/
for (j = 0; j < SIZEOFWORDSOFLENGTH; j++) {
while (j < wordsOfLength[i]) {
printf("|");
/*Was previously j++ instead of break*/
break;
}
}
printf("\n");
}
Under my scheme, I'd start with i = 1, but that isn't a major issue. I'd ensure that the first printf() printed for 2 digits to align the output for the counts of words of lengths 10-20.
The inner for loop should be constrained by wordsOfLength[i] rather than SIZEOFWORDSOFLENGTH, and the while loop is redundant (not least because you break it on the first iteration each time). You just need simple nested for loops:
for (i = 1; i < SIZEOFWORDSOFLENGTH; i++)
{
printf("Words of Length %2d: ", i);
for (j = 0; j < wordsOfLength[i]; j++)
printf("|");
printf("\n");
}
The only issue now is if the maximum value in wordsOfLength is too long for comfort (for example, you've read the entire works of Shakespeare, so some of the words appear thousands of times).
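One way to deal with very large counts is to scale every bar so that the longest one fits on a line. This is only a sketch under assumptions: the limit MAX_BAR and the helper names bar_width and print_histogram are made up here, and the rule that a non-zero count always gets at least one tick is a design choice, not something the exercise requires.

```c
#include <stdio.h>

#define MAX_BAR 60   /* hypothetical: widest bar we are willing to print */

/* Scale count so that the largest count maps to at most MAX_BAR ticks.
   Non-zero counts always get at least one tick. */
static int bar_width(int count, int max_count)
{
    if (count <= 0)
        return 0;
    if (max_count <= MAX_BAR)
        return count;                  /* small enough: no scaling needed */
    int width = (int)((long long)count * MAX_BAR / max_count);
    return width > 0 ? width : 1;
}

/* Print one bar per word length 1..n-1 using the scaled widths. */
static void print_histogram(const int counts[], int n)
{
    int max_count = 0;
    for (int i = 1; i < n; i++)
        if (counts[i] > max_count)
            max_count = counts[i];

    for (int i = 1; i < n; i++) {
        printf("Words of Length %2d: ", i);
        for (int j = bar_width(counts[i], max_count); j > 0; j--)
            putchar('|');
        putchar('\n');
    }
}
```

Calling print_histogram(wordsOfLength, SIZEOFWORDSOFLENGTH) would replace the nested printing loops.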

You don't need to subtract '1' from lengthCounter in
++wordsOfLength[lengthCounter - '1'];
It should be
++wordsOfLength[lengthCounter];

Related

Counting the number of appearances of each character

I am new to C and working on homework. I have most of the work done, but I can't pass all the test cases used by my professor. He refuses to post the cases used by the autograder.
What would be the cases that I have missed?
Any clue will be appreciated!!!
Write a program to remove all the blank characters (space and tab) in a line.
Then, count the number of appearances of each character.
Input
A single line with a maximum length of 5000 characters
Output
First line: The line after removing all the blank characters.
If the line is empty, don’t print anything in the output.
Next lines: Each line contains a character that appears in the line and its count.
Note that, the characters must appear according to their ASCII number order. (http://www.asciitable.com)
#include <stdio.h>
int main (){
int c = 0;
int characters[128] = {0}; // subscripts for the ASCII table
int count = 0; // number of characters been reading in
while(count < 5001 && (c = getchar()) != EOF) {
// 9 -> TAB on ASCII, 32 -> Space on ASCII
if (c != 9 && c != 32) {
putchar(c);
characters[c]++;
count++;
}
}
fflush(stdout);
printf("\n");
for (int i = 0; i < 128; i++) {
if (characters[i] != 0) {
printf("%c %d\n", i, characters[i]);
}
}
return 0;
}
Again, any help will be really appreciated!
Update:
The code has been corrected.
Probably you don't want to write
characters[index] = 0;
What you want instead is probably
text[index] = 0;

While reading two array in c through scanf, the second one somehow modify the first [closed]

I have the following problem reading two strings through scanf: I insert the first string and everything is OK, but after I insert the second one the first one changes.
#include<stdio.h>
#define N 6
#define K 2
int main(){
char a[N];
char b[K];
int i = 0,j=0;
printf("first word\n\n\n");
scanf("%s", a);
for(i = 0; i <= N; i++){
printf("%c", a[i]);
}
printf("second word \n\n\n");
scanf("%s", b);
for(i = 0; i <= N; i++){
printf("%c", a[i]);
}
}
The first time it prints it correctly. The second time it prints a similar string (maybe the first scanf is still getting the input when I'm inserting the second one)
To begin, you are printing the array a twice; it seems that you mean to print b with the second loop. But there is a problem in your loops. They are going out of array bounds. Since arrays are zero-indexed in C, you need:
for (i = 0; i < N; i++) {}
for a, and:
for (i = 0;i < K; i++) {}
for b.
But even this is not quite right, since the input strings may not entirely fill the arrays. You really need to terminate the loop when the null-terminator is reached, or when the end of the array has been reached:
for (i = 0; i < N && a[i] != '\0'; i++) {}
and:
for (i = 0; i < K && b[i] != '\0'; i++) {}
Of course, it would be simpler to just use puts() to print the strings.
It seems that the inputs (abcabc and abc) were too large for the arrays, causing buffer overflow. This can be avoided by specifying maximum widths when using the %s conversion specifier with scanf().
Here is a modified version of the posted code. I increased the sizes of N and K by one, since it appears that space for null-terminators was not considered in the original code:
#include <stdio.h>
#define N 7
#define K 3
int main(void)
{
char a[N];
char b[K];
int i = 0;
printf("first word\n\n\n");
scanf("%6s", a);
for (i = 0; i < N && a[i] != '\0'; i++) {
printf("%c", a[i]);
}
putchar('\n');
printf("second word \n\n\n");
scanf("%2s", b);
for (i = 0; i < K && b[i] != '\0'; i++) {
printf("%c", b[i]);
}
putchar('\n');
return 0;
}
You are printing a twice, change the second printf("%c", a[i]); to printf("%c", b[i]); .
The problem you are experiencing is because you invoke undefined behavior by failing to ensure the strings are nul-terminated and by using incorrect limits regarding b. For instance, you have:
#define N 6
#define K 2
...
char a[N], b[K];
a can hold a total of 5-chars + the nul-terminator, for a total of 6-chars. b on the other hand, can only hold 1-char + the nul-terminator for a total of 2-chars.
When you then subsequently loop over both a and b with for(i = 0; i <= N; i++), not only are you guaranteed to access an element outside the bounds of the array, e.g. a[6] (valid indexes are 0-5), you also invoke undefined behavior for any a with fewer than 6 total characters by attempting to read an uninitialized value (the array elements after the last valid char in a word of, say, 3 chars). When you invoke undefined behavior, the execution of your code is unreliable from that moment forward.
In your case you can eliminate undefined behavior by using field width modifiers to limit the number of characters placed in the arrays by scanf itself, e.g.
if (scanf ("%5s", a) != 1) {
fprintf (stderr, "error: invalid input - a.\n");
return 1;
}
You validate the return of scanf to ensure the proper number of conversions have taken place, or you handle the error if they have not.
You prevent reading beyond the bounds of the array by limiting your read and output char loop to only valid characters within the array. You do that by checking the character to be printed is not the nul-terminating character, and when the nul-terminating character is reached, you exit the loop without attempting to print it.
Putting those pieces together, you could do something similar to the following (note j is unused in your code so it is commented out):
#include <stdio.h>
#define N 6
#define K 2
int main (void ) {
char a[N], b[K];
int i = 0/*, j = 0*/;
printf ("enter first word: ");
if (scanf ("%5s", a) != 1) {
fprintf (stderr, "error: invalid input - a.\n");
return 1;
}
for (i = 0; i < N && a[i]; i++)
printf ("%c", a[i]);
putchar ('\n');
printf ("enter second word: ");
if (scanf ("%1s", b) != 1) {
fprintf (stderr, "error: invalid input - b.\n");
return 1;
}
for (i = 0; i < K && b[i]; i++)
printf ("%c", b[i]);
putchar ('\n');
return 0;
}
Example Use/Output
$ ./bin/twowords
enter first word: cats
cats
enter second word: dogs
d
I would strongly caution you to consider reading line-oriented input with a line-oriented input function like fgets. This eliminates many pitfalls for new programmers. The only additional step when using fgets is to recall it reads up-to-and-including the trailing '\n', so you need to trim the '\n' from the string read.
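As a rough illustration of the fgets approach, a line reader that also trims the trailing '\n' might look like this. The helper name read_line is invented for the example; strcspn returns the index of the first '\n' (or the string length if there is none), so the assignment is safe either way.

```c
#include <stdio.h>
#include <string.h>

/* Read one line from `in` into buf (capacity `size`) and strip the
   trailing newline, if any. Returns 1 on success, 0 on end-of-file
   or error. */
static int read_line(FILE *in, char *buf, size_t size)
{
    if (size == 0 || fgets(buf, (int)size, in) == NULL)
        return 0;
    buf[strcspn(buf, "\n")] = '\0';    /* overwrite the '\n' if present */
    return 1;
}
```

It would be called as read_line(stdin, a, sizeof a). If the input line is longer than the buffer, the remainder stays in the stream and is returned by the next call.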
Look things over and let me know if you have further questions.
For the second printf you have to write
printf("%c", b[i]);
The inputs you mentioned (abcabc and abc) overflow both a and b.
You should enter a string of length 5 for the first and of length 1 for the second, leaving room for the \0 in both strings.
Edit: Also change the loop condition from i <= N to i < N or i < sizeof(a).
Please note that the loop will print garbage characters past the end of the string if the string is shorter than 5 characters.
I see two things that you can do here:
Use another big buffer array that you load the data into, and then copy the data to the specific array. Then you will get at most 6 letters of the first word in a[] and at most 2 letters of the second word in b[], but you will still be able to load 2 words.
Scanf a specific number of chars, as @xing mentioned in a comment.
scanf("%6s",word) //general use, not in OP's case
Then, when a word is longer than the width you set, the rest of it will be read as input for the second word.

Inputting multiple lines of strings in C

I'm fairly new to coding and am currently taking a programming course at school with C. We were given an assignment and I'm having a bit of difficulty with the first part. We're learning how to use the string-handling library (stdlib.h) and the objective of the assignment is to input multiple lines of text from the keyboard. The instructor advised us to use two-dimensional arrays in order to do this, but I'm a bit stuck. Here's the code I've written:
int main(void) {
char string[3][SIZE];
int i, j;
int c;
printf("Enter three lines of text:\n");
for (i = 0; i < 3; i++) {
j = 0;
while ((j < SIZE) && (c = getchar() != '\n')) {
string[i][j] = c;
j++;
}
}
for (i = 0; i < 3; i++) {
for (j = 0; j < SIZE; j++) {
printf("%c", string[i][j]);
}
printf("\n");
}
return 0;
}
Some points that I'd like to make are that I used the getchar() function to receive input one character at a time, and also the second for loop I intended to print each line of text that is stored in each row of the string array.
The input is any string of text for three lines, for example:
Hi my name is John.\n
I am from the US\n
and I'm a student.
Here's what the current output looks like:
Enter three lines of text:
rrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrr...
The output that I'm expecting is:
Enter three lines of text:\n
Hi my name is John.\n
I'm from the US\n
and am a student.
Any tips or advice would be greatly appreciated. Thank you!
First of all, let me commend you for starting out with C. It is the most solid language to learn (only assembly itself is better): you will get a full understanding of how things work, which you wouldn't get if you started with a language built on top of C (like Java or Python).
But it's a hard and long road, though one that is worth it.
On the code: there is a lot going on, and you have made several interesting bugs whose symptoms can differ from run to run and from machine to machine.
First of all: to make your code work at all, all you need is to add parentheses:
while ((j < SIZE) && ((c = getchar()) != '\n')) {
In C, the comparison operator != has higher precedence than the assignment operator =, so c = getchar() != '\n' is parsed as c = (getchar() != '\n').
The comparison is evaluated first and yields 1 (true) or 0 (false); that result, not the character, is what gets assigned.
Thus you were comparing the value of getchar() with the character '\n'. They are not equal, so you get TRUE (value 1) and store that in the local variable c.
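The effect of the missing parentheses can be reproduced without reading any input. In this small sketch the two function names are made up, and ch stands in for the value getchar() would have returned.

```c
/* Without the extra parentheses, != is applied first, so the result of
   the comparison (0 or 1) is what ends up in c. */
static int without_parens(int ch)
{
    int c;
    c = ch != '\n';              /* parsed as c = (ch != '\n') */
    return c;
}

/* With the parentheses, the character itself is assigned and only then
   compared with '\n'. */
static int with_parens(int ch)
{
    int c;
    (void)((c = ch) != '\n');
    return c;
}
```

For any character other than a newline, without_parens returns 1, while with_parens returns the character itself.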
Next you still have some problems because of the way you initialized your array:
char string[3][SIZE];
What it does is simply "entrust" 3*SIZE*sizeof(char) bytes of your process's address space to a thing labeled "string". It does not clear whatever was left in those bytes earlier, so the contents are indeterminate: if SIZE in your program is 100 and something sensitive, say your credit card number, happened to be left in that memory, you could see it when you print the array with printf, unless you had overwritten those 300 bytes first.
This may help you see it:
#include <stdlib.h>
#include <stdio.h>
#define SIZE 10
int main(void) {
char string[3][SIZE];
int i, j;
int c;
for(i = 0; i < 3; i++)
for(j = 0; j < SIZE; j++){
string[i][j] = 0;
}
printf("Enter three lines of text:\n");
for (i = 0; i < 3; i++) {
j = 0;
while ((j < SIZE) && ((c = getchar()) != '\n')) {
string[i][j] = c;
j++;
}
}
for (i = 0; i < 3; i++) {
for (j = 0; j < SIZE; j++) {
printf("%c", string[i][j]);
}
printf("\n");
}
return 0;
}
Also be aware that getchar() may not behave as you expect with input and newlines: it depends on whether your console buffers input until Enter (newline) is pressed before sending it to your program. More here: How to avoid press enter with any getchar()
Note: I wrote this answer before the OP clarified they had to use getchar.
To read a whole line at a time, use fgets. To print a whole string at a time, use printf with the %s format.
#include <stdio.h>
int main(void) {
// No need to define a SIZE constant.
// Because it's stack allocated we can take its size with sizeof.
char strings[3][100];
printf("Enter three lines of text:\n");
for ( int i = 0; i < 3; i++) {
// Reads one line, up to the size of `strings[i]`, from stdin.
fgets( strings[i], sizeof(strings[i]), stdin );
}
for ( int i = 0; i < 3; i++) {
// Print each string and its line number.
printf("Line %d: %s\n", i, strings[i]);
}
return 0;
}
This is not the best pattern to read input. You'll learn very quickly that fixed memory sizes and reading input don't work well. For future reference, it would be more like this.
#include <stdio.h>
#include <string.h>
int main(void) {
// A list to store 3 strings, but no memory for the strings themselves.
char *strings[3];
printf("Enter three lines of text:\n");
// A line buffer that's sufficiently large.
// This will be reused.
char line[4096];
for ( int i = 0; i < 3; i++) {
// Read into the large line buffer.
fgets( line, sizeof(line), stdin );
// Copy the string into a buffer that's just big enough.
strings[i] = strdup( line );
}
for ( int i = 0; i < 3; i++) {
printf("Line %d: %s\n", i, strings[i]);
}
return 0;
}
This allocates a single large line buffer to do the reading, then copies what it has read with strdup to memory of just the right size. This lets you read even very long lines of input without wasting a bunch of memory if they're very short. Each copy should eventually be released with free().
Note that strdup() is not part of the C standard library, but it's part of the POSIX spec. Any major compiler will have it, and it's easy to write your own.
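For a platform without strdup(), a replacement could be sketched like this. The name my_strdup is made up to avoid clashing with a library version.

```c
#include <stdlib.h>
#include <string.h>

/* Return a freshly allocated copy of s, or NULL if allocation fails.
   The caller is responsible for calling free() on the result. */
static char *my_strdup(const char *s)
{
    size_t len = strlen(s) + 1;        /* include the terminating '\0' */
    char *copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, s, len);
    return copy;
}
```

In the program above, strings[i] = my_strdup(line); would be a drop-in substitute, ideally followed by a check for NULL.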

2 questions regarding program in C Programming - Absolute Beginner's Guide

In C Programming - Absolute Beginner's Guide chapter 18, example 2, the program writes
#include <stdio.h>
#include <stdlib.h>
main ()
{
int i;
char msg[25];
printf("Type up to 25 characters and then press Enter...\n");
for (i = 0; i < 25; i++)
{
msg[i] = getchar();
if (msg[i] == '\n')
{
i--;
break;
}
}
putchar('\n');
for (; i >= 0; i--)
{
putchar(msg[i]);
}
putchar('\n');
return 0;
}
I have 2 questions regarding the program.
1. msg gets allocated an array of 25 characters and printf tells the user to type up to 25 characters. Shouldn't msg then be allocated an array of 26 characters to accommodate backslash zero?
2. When the for loop is written like this: for (; i >= 0; i--), what is the start expression?
1. You are right that a string would need one more byte for the terminator. But the program never treats it as a string, so that's a moot point.
2. There is none. None of the expressions in the C for-loop are obligatory.
As a case-in-point, the idiomatic infinite loop:
for(;;) /* Do things */;
1. No, since this is not a C string: you know the size of the array, which is 25, and it can hold 25 elements.
2. Nothing is defined in the for loop. We use the variable i, defined at the start of main, and the value it holds is the value we reached when inputting characters.
FWIW, there is a bug in this program if the user inputs 25 or more characters before pressing Enter. In that case, the first for loop ends with i equal to 25, and the second for loop starts by accessing msg[25], which is out of bounds.
i should be decremented after the for loop, not inside the if (msg[i] == '\n') block.
for (i = 0; i < 25; i++)
{
msg[i] = getchar();
if (msg[i] == '\n')
{
break;
}
}
i--;

Scanf and two strings

My task is read two strings of digits and save them in different arrays.
I decided to use scanf function, but program can read only first string.
This is my bad-code.
int main()
{
int firstArray[50], secondArray[50], i, j;
/* fill an array with 0 */
for(i=0; i<50; ++i)
{
firstArray[i]=secondArray[i]=0;
}
i=j=0;
while((scanf("%d", &firstArray[i]))== 1) { ++i; }
while((scanf("%d", &secondArray[j]))== 1) { ++j; }
/* Print this. */
for(i = 0; i < 20; ++i)
{
printf("%d ", firstArray[i]);
}
putchar('\n');
for(j = 0; j < 20; ++j)
{
printf("%d ", secondArray[j]);
}
return 0;
}
I just don't understand how scanf function works. Can someone please explain?
scanf ignores blank characters (including new line). Thus your scan will read entire input into firstArray if you have no "non blank" separator.
If file/data has ; at end of first line it will stop the read into firstArray there, and never read anything into secondArray - as you never consume the ;.
/* This will never be 1 as ; is blocking */
while((scanf("%d", &secondArray[j])) == 1) {
So: if you separate with e.g. ; you will have to read / check for this before you read into secondArray.
You could also add something like:
int c; /* int, not char, so that EOF can be represented */
/* this can be done more tidily, but only as a concept; max is the array size, e.g. 50 */
while (i < max && (scanf("%d", &firstArray[i])) == 1) {
    ++i;
    if ((c = getchar()) == '\n' || c == ';')
        break;
}
Also instead of initializing array to 0 by loop you can say:
int firstArray[50] = {0}; /* This set every item to 0 */
Also take care that you do not go over your limit of 50 elements.
You say strings of digits, yet you read with %d. That format scans the input for the longest sequence representing a (signed) integer value. Two "digit strings" are consumed by the first while loop.
EDIT Instead of "strings of digits" you should say "strings of integers". In this case it is a little bit more subtle, since the first while can consume all the integers unless they are separated by something that is not a possible integer (e.g. a ;).
So, to make the following work, you must separate the two "lines" with something that can't be parsed as an integer and which is not considered white space. It is not the best solution, but it is one possible solution.
#include <stdio.h>
#include <ctype.h>
int main()
{
int firstArray[50] = {0};
int secondArray[50] = {0};
int i, j, l1, l2;
int tmp;
i = j = 0;
// read integers, but not more than size of array
while( i < (int)(sizeof firstArray / sizeof firstArray[0]) && scanf("%d", &firstArray[i]) == 1 ) {
++i;
}
// consume non digits
for(tmp = getchar(); tmp != EOF && !isdigit(tmp); tmp = getchar());
// on EOF you should exit and stop processing;
// we read one more char, push it back if it was a digit
if (isdigit(tmp)) ungetc(tmp, stdin);
while( j < (int)(sizeof secondArray / sizeof secondArray[0]) && scanf("%d", &secondArray[j]) == 1 ) {
++j;
}
l1 = i; // preserve how many ints were read
l2 = j;
/* Print this. */
for(i = 0; i < l1; ++i)
{
printf("%d ", firstArray[i]);
}
putchar('\n');
for(j=0; j < l2; ++j)
{
printf("%d ", secondArray[j]);
}
return 0;
}
EDIT A solution that maybe fits your need better is to read the lines (one at a time) into a buffer and sscanf the buffer.
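A rough sketch of that line-by-line idea follows. The helper name parse_ints is invented for the example; it relies on the standard %n conversion, which stores the number of characters consumed so far and does not count towards sscanf's return value.

```c
#include <stdio.h>

/* Parse up to max integers from the text in line into out.
   Returns how many were stored. */
static int parse_ints(const char *line, int out[], int max)
{
    int count = 0;
    int value, used;

    /* %n records how many characters sscanf consumed, which lets us
       advance through the buffer one integer at a time. */
    while (count < max && sscanf(line, "%d%n", &value, &used) == 1) {
        out[count++] = value;
        line += used;
    }
    return count;
}
```

Each input line would first be read with fgets(buf, sizeof buf, stdin) and then handed over, for example parse_ints(buf, firstArray, 50) for the first line and parse_ints(buf, secondArray, 50) for the second. The return value tells you how many elements to print.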
You cannot use scanf to do that.
Read the documentation.
Observations:
with scanf, as long as you keep entering numbers your loop keeps running
there is no check on the size limit of 50 for your arrays
if you press return on an empty line, it is ignored because the pattern skips white space
if you enter a letter, the pattern does not match and the loop breaks
So use some other function, maybe fgets together with atoi or strtol (avoid gets, which is unsafe and was removed from the language in C11). And remember to check the size limit of 50 for your arrays.
Actually, there is one special point in C's arrays.
Though you declare an array's size, say int arr[5];, the compiler will not stop you from storing values beyond index 4. It doesn't show any error, but it leads to undefined behavior (it might overwrite other variables).
Please Refer this question: Array size less than the no. of elements stored in it
In your case, that was your problem. The program never got past the first while statement, so you didn't get any output.
while((scanf("%d", &firstArray[i]))== 1) { ++i; }
So, you could write this while statement like this:
while( i<50 && scanf("%d", &firstArray[i]) ==1 )
i++;
or else:
while(i<50 )
{
scanf("%d", &firstArray[i]);
i++;
}
or else:
for (i=0; i<50; i++)
scanf("%d", &firstArray[i]);

Resources