Need some suggestions on how to print a histogram more neatly - c

I'm writing a program that reads input and prints a histogram of the lengths of the words in it (K&R, Ex. 1-13).
Any suggestions on how I can improve my code? Does it matter whether I test the IN/OUT state first or the character first? In the examples I have seen, people test whether c is a blank or tab first.
I think I need to revisit my histogram. It doesn't really scale the results. It just draws a hyphen based on the length.
Revised to make a little bit more readable I think.
// Print a histogram of the length of words in its input.
#include <stdio.h>
#define IN 1
#define OUT 2
#define MAX 99
int main(){
int c; // the character
int countOfLetters = 0;
int insideWord = OUT;
int frequencyOfLengths[MAX];
int longestWordCount = 0;
int i, j; // Counters
for (i = 0; i < MAX; i++){
frequencyOfLengths[i] = 0;
}
while ((c = getchar()) != EOF){
if (c == ' ' || c == '\n' || c == '\t'){
if (insideWord == IN){
if (countOfLetters >= MAX){ // valid indices are 0..MAX-1
return 1;
}
++frequencyOfLengths[countOfLetters];
if (countOfLetters >= longestWordCount) longestWordCount = countOfLetters;
}
countOfLetters = 0;
insideWord = OUT; // otherwise repeated whitespace counts as a zero-length word
}
else {
countOfLetters++;
insideWord = IN;
}
}
for (i = 1; i <= longestWordCount; i++){
printf("%3i : %3i ", i, frequencyOfLengths[i]);
for (j = 0; j < frequencyOfLengths[i]; j++){
printf("*");
}
printf("\n");
}
return 0;
}

Definitely scale the results; check out my Character Histogram, which draws a horizontally scaled histogram.
Also, you could benefit from a y-axis label. As it stands, it's hard to tell which bar belongs to which word length.
I added this code right before the histogram is displayed. It halves every value until the longest bar fits, which does throw off the numeric labels next to your bars. You can figure it out!
// Iterates and tells us the most frequent word length
int mostFrequent = 0;
for (i = 1; i < MAXWORD; i++)
    if (charCount[i] > mostFrequent)
        mostFrequent = charCount[i];
// If the bar will be too big, cut every value in half
while (mostFrequent > 60) {
    for (i = 1; i < MAXWORD; i++)
        if (charCount[i] > 0) {
            charCount[i] /= 2;
            charCount[i] |= 1;
        }
    // Check again to find the most frequent word length category
    mostFrequent = 0;
    for (i = 1; i < MAXWORD; i++)
        if (charCount[i] > mostFrequent)
            mostFrequent = charCount[i];
}
Honestly the bars are hard to read, maybe just use a single row of characters such as █ !
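Proportional scaling is another option besides repeated halving. The following is only a minimal sketch, assuming two hypothetical helpers, bar_length() and print_bar(), and a maximum bar width chosen by the caller (none of these appear in the posted code). Each bar is scaled against the largest count, and a non-zero count never shrinks to an empty bar, so the label can keep showing the true count.

```c
#include <stdio.h>

/* Hypothetical helper: scale `count` to a bar of at most `max_width`
 * characters, relative to the largest count `max_count`. */
int bar_length(int count, int max_count, int max_width)
{
    int len;

    if (count <= 0 || max_count <= 0)
        return 0;
    if (max_count <= max_width)
        return count;            /* everything already fits; no scaling */
    len = (int)((long long)count * max_width / max_count);
    return len > 0 ? len : 1;    /* keep rare lengths visible */
}

/* Hypothetical helper: print one row with the word length, the true
 * count, and the scaled bar. */
void print_bar(int length, int count, int max_count, int max_width)
{
    int j;

    printf("%3d : %3d ", length, count);
    for (j = bar_length(count, max_count, max_width); j > 0; j--)
        putchar('*');
    putchar('\n');
}
```

With the names from the question, this could be called as print_bar(i, frequencyOfLengths[i], largestCount, 60) inside the printing loop, where largestCount is found by one pass over the array beforehand.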
Great book so far, we're practically reading it together and are on the same page!
Cheers

Related

Why is this array being initialized in an odd way?

I am reading K&R 2nd Edition and I am having trouble understanding exercise 1-13. The answer is this code
#include <stdio.h>
#define MAXHIST 15
#define MAXWORD 11
#define IN 1
#define OUT 0
main()
{
int c, i, nc, state;
int len;
int maxvalue;
int ovflow;
int wl[MAXWORD];
state = OUT;
nc = 0;
ovflow = 0;
for (i = 0; i < MAXWORD; i++)
wl[i] = 0;
while ((c = getchar()) != EOF)
{
if(c == ' ' || c == '\n' || c == '\t')
{
state = OUT;
if (nc > 0)
{
if (nc < MAXWORD)
++wl[nc];
else
++ovflow;
}
nc = 0;
}
else if (state == OUT)
{
state = IN;
nc = 1;
}
else
++nc;
}
maxvalue = 0;
for (i = 1; i < MAXWORD; ++i)
{
if(wl[i] > maxvalue)
maxvalue = wl[i];
}
for(i = 1; i < MAXWORD; ++i)
{
printf("%5d - %5d : ", i, wl[i]);
if(wl[i] > 0)
{
if((len = wl[i] * MAXHIST / maxvalue) <= 0)
len = 1;
}
else
len = 0;
while(len > 0)
{
putchar('*');
--len;
}
putchar('\n');
}
if (ovflow > 0)
printf("There are %d words >= %d\n", ovflow, MAXWORD);
return 0;
}
At the top, wl is being declared and initialized. What I don't understand is why is it looping through it and setting everything to zero if it just counts the length of words? It doesn't keep track of how many words there are, it just keeps track of the word length so why is everything set to 0?
I know this is unclear it's just been stressing me out for the past 20 minutes and I don't know why.
The ith element of the array wl[] is the number of words of length i that have been found in an input file. The wl[] array needs to be zero-initialized first so that ++wl[nc]; does not cause undefined behavior by attempting to use an uninitialized variable, and so that array elements that represent word lengths that are not present reflect that no such word lengths were found.
Note that ++wl[nc] increments the value wl[nc] when a word of length nc is encountered. If the array were not initialized, the first time the code attempts to increment an array element, it would be attempting to increment an indeterminate value. This attempt would cause undefined behavior.
Further, array indices that represent counts of word lengths that are not found in the input should hold values of zero, but without the zero-initialization, these values would be indeterminate. Even attempting to print these indeterminate values would cause undefined behavior.
The moral: initialize variables to sensible values, or store values in them, before attempting to use them.
It would seem simpler and be more clear to use an array initializer to zero-initialize the wl[] array:
int wl[MAXWORD] = { 0 };
After this, there is no need for the loop that sets the array values to zero (unless the array is used again for another file). But the posted code is from The C Answer Book by Tondo and Gimpel. This book provides solutions to the exercises found in the second edition of K&R, in the style of K&R, and using only ideas that have been introduced in the book before each exercise. This exercise, 1.13, occurs in "Chapter 1 - A Tutorial Introduction". This is a brief tour of the language, lacking many details to be found later in the book. At this point, assignment and arrays have been introduced, but array initializers have not (this has to wait until Chapter 4), and the K&R code that uses arrays has initialized arrays using loops thus far. Don't read too much into code style from the introductory chapter of a book that is 30+ years old.
Much has changed in C since K&R was published. For example, a bare main() with no return type is no longer valid, because the implicit int rule was removed in C99. The signature must be one of int main(void) or int main(int argc, char *argv[]) (or, equivalently, int main(int argc, char **argv)), with a caveat for implementation-defined signatures for main().
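To make the initializer point concrete, here is a small sketch (the helper names count_nonzero() and all_zero_after_initializer() are made up for illustration). With = { 0 }, the first element is set to 0 explicitly and the remaining elements are zero-initialized implicitly, so no loop is needed.

```c
#include <stddef.h>

#define MAXWORD 11

/* Hypothetical helper: count the elements of a that are not zero. */
size_t count_nonzero(const int a[], size_t n)
{
    size_t i, nonzero = 0;

    for (i = 0; i < n; i++)
        if (a[i] != 0)
            nonzero++;
    return nonzero;
}

/* Returns 1 if an array declared with = { 0 } holds only zeros. */
int all_zero_after_initializer(void)
{
    int wl[MAXWORD] = { 0 };   /* wl[0] explicit, the rest implicit */

    return count_nonzero(wl, MAXWORD) == 0;
}
```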
Everything is set to 0 because, if you don't initialize the array, its elements hold indeterminate (effectively random) values, and those values will cause errors in your program. Instead of looping over every position of the array, you could write int wl[MAXWORD] = {0}; in place of int wl[MAXWORD]; this puts 0 at every position in the array, so you don't have to write the loop.
I edited your code and put some comments in as I was working through it, to explain what's going on. I also changed some of your histogram calculations because they didn't seem to make sense to me.
Bottom line: It's using a primitive "state machine" to count up the letters in each group of characters that isn't white space. It stores this in wl[] such that wl[i] contains an integer that tells you how many groups of characters (sometimes called "tokens") have a length of i. Because this is done by incrementing the appropriate element of wl[], each element must be initialized to zero. Failing to do so would lead to undefined behavior, but probably would result in nonsensical and absurdly large counts in each element of wl[].
Additionally, any token with a length that can't be reflected in wl[] will be tallied in the ovflow variable, so at the end there will be an accounting of every token.
#include <stdio.h>
#define MAXHIST 15
#define MAXWORD 11
#define IN 1
#define OUT 0
int main(void) {
int c, i, nc, state;
int len;
int maxvalue;
int ovflow;
int wl[MAXWORD];
// Initializations
state = OUT; //Start off not assuming we're IN a word
nc = 0; //Start off with a character count of 0 for current word
ovflow = 0; //Start off not assuming any words > MAXWORD length
// Start off with our counters of words at each length at zero
for (i = 0; i < MAXWORD; i++) {
wl[i] = 0;
}
// Main loop to count characters in each 'word'
// state keeps track of whether we are IN a word or OUTside of one
// For each character in the input stream...
// - If it's whitespace, set our state to being OUTside of a word
// and, if we have a character count in nc (meaning we've just left
// a word), increment the counter in the wl (word length) array.
// For example, if we've just counted five characters, increment
// wl[5], to reflect that we now know there is one more word with
// a length of five. If we've exceeded the maximum word length,
// then increment our overflow counter. Either way, since we're
// currently looking at a whitespace character, reset the character
// counter so that we can start counting characters with our next
// word.
// - If we encounter something other than whitespace, and we were
// until now OUTside of a word, change our state to being IN a word
// and start the character counter off at 1.
// - If we encounter something other than whitespace, and we are
// still in a word (not OUTside of a word), then just increment
// the character counter.
while ((c = getchar()) != EOF) {
if (c == ' ' || c == '\n' || c == '\t') {
state = OUT;
if (nc > 0) {
if (nc < MAXWORD) ++wl[nc];
else ++ovflow;
}
nc = 0;
} else if (state == OUT) {
state = IN;
nc = 1;
} else {
++nc;
}
}
// Find out which length has the most number of words in it by looping
// through the word length array.
maxvalue = 0;
for (i = 1; i < MAXWORD; ++i) {
if(wl[i] > maxvalue) maxvalue = wl[i];
}
// Print out our histogram
for (i = 1; i < MAXWORD; ++i) {
// Print the word length - then the number of words with that length
printf("%5d - %5d : ", i, wl[i]);
if (wl[i] > 0) {
len = wl[i] * MAXHIST / maxvalue;
if (len <= 0) len = 1;
} else {
len = 0;
}
// This is confusing and unnecessary. It's integer division, with no
// negative numbers. What we want to have happen is that the length
// of the bar will be 0 if wl[i] is zero; that the bar will have length
// 1 if the bar is otherwise too small to represent; and that it will be
// expressed as some fraction of MAXHIST otherwise.
//if(wl[i] > 0)
// {
// if((len = wl[i] * MAXHIST / maxvalue) <= 0)
// len = 1;
// }
// else
// len = 0;
// Multiply MAXHIST (our histogram maximum length) times the relative
// fraction, i.e., we're using a histogram bar length of MAXHIST for
// our statistical mode, and interpolating everything else.
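// maxvalue is 0 when no word shorter than MAXWORD was seen; skip the
// division then, since 0.0 / 0 is NaN and converting NaN to int is undefined.
len = (maxvalue > 0) ? (int)(((double)wl[i] / maxvalue) * MAXHIST) : 0;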
// Our one special case might be if maxvalue is huge, a word length
// with just one occurrence might be rounded down to zero. We can fix
// that manually instead of using a weird logic structure.
if ((len == 0) && (wl[i] > 0)) len = 1;
while (len > 0) {
putchar('*');
--len;
}
putchar('\n');
}
// If any words exceeded the maximum word length, say how many there were.
if (ovflow > 0) printf("There are %d words >= %d\n", ovflow, MAXWORD);
return 0;
}

Replace a character with another character + Setting a tie game

This is for Homework
I have to create a game of TicTacToe for a project and I have two issues. Also I apologize if I'm violating a rule by having two questions within one post, If it's not allowed then I'd appreciate someone notifying me in the comments and I'll go ahead and break this into two separate posts. I'll post my code then ask my questions following the code.
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
char table[3][3];
void clear_table();
void player1_move();
void player2_move();
void the_matrix(); // Like the movie
char check_three();
int main() {
srand(time(NULL));
char win;
printf("This program plays the game of Tic Tac Toe.\n");
win = ' ';
clear_table();
do {
the_matrix(); // Like the movie
player1_move();
win = check_three(); // Check win for player 1
if (win != ' ')
break;
player2_move();
win = check_three(); // Check win for player 2
}
while (win == ' ');
the_matrix(); // Shows the final move+Like the movie
if (win == 'O')
printf("Congratulations, Player 1 wins!\n");
else
printf("Congratulations, Player 1 lost!\n");
// the_matrix (); //Shows the final move+Like the movie
return 0;
}
void clear_table() {
// Creates empty spaces for the user and computer to enter stuff in
int i, j, k;
for (i = 0; i < 3; i++) {
for (j = 0; j < 3; j++)
// for(l = 0; k < 3; j++)
table[i][j] = ' ';
}
}
void player1_move() {
// Moves that player 1 can and can't make
int x, y, z;
printf("Player 1 enter your selection[row, col]: ");
scanf("%d, %d", &x, &y);
x--;
y--;
// z--;
if (table[x][y] != ' ') {
printf("Space already taken, please try again.\n");
player1_move();
}
else
table[x][y] = 'O'; // O goes first for some reason
}
void player2_move() {
// Needs work!!
// Call srand in the main
int a = rand() % 3;
int b = rand() % 3;
// Make it so the game would end in a tie when possible
for (a = rand() % 3; a < 3; a++) {
for (b = rand() % 3; b < 3;
b++) // For loops causing issues in randomization?
// for(c = 0; c < 3; c++)
if (table[a][b] == ' ')
break;
if (table[a][b] == ' ') // Checks the rows and columns
break;
}
if (a * b == 9) { // Kinda works?
printf("Game Over, No Player Wins\n");
exit(0);
}
else
table[a][b] = 'X';
}
void the_matrix() { // Like the movie
// Get rid of the underscores
int m;
printf("The current state of the board:\n");
for (m = 0; m < 3; m++) {
printf("%c_ %c_ %c_\n", table[m][0], table[m][1], table[m][2]);
}
printf("\n");
}
char check_three() {
int w;
// char table[3][3];
for (w = 0; w < 3; w++) {
if (table[w][0] == table[w][2] && table[w][0] == table[w][1])
return table[w][0]; // Row Check
}
for (w = 0; w < 3; w++) {
if (table[0][w] == table[2][w] && table[0][w] == table[1][w])
return table[0][w]; // Col Check
}
if (table[0][0] == table[1][1] && table[1][1] == table[2][2])
return table[0][0];
if (table[0][2] == table[1][1] && table[1][1] == table[2][0])
return table[0][2]; // Diag Check
return ' ';
}
First Question
So my first question is with a draw game. On the player two function I have a snip of code set to determine a draw game. Initially I assumed that if the X's and O's were to multiply to 9 then that would mean that the board would be filled up then that would result in a draw game. [This is within my third function - player2_move near the end of the function] It kind of works, but sometimes the program just preemptively ends the game. It's a bit hard to test it because the computers moves are randomized and most of the times I've tried, I ended up winning accidentally. My question is what would I need to do to set up my program to essentially have a better way of determining a draw game.
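One way to look at the draw check: a draw is a full board with no winner, so it can be tested directly rather than inferred from the loop indices. In player2_move both loops start at a random index, so cells before the starting point are never examined; a and b can both reach 3 (making a * b == 9) while empty cells remain, which would explain the game ending early. Below is a minimal sketch, assuming a hypothetical helper board_full() that takes the board as a parameter and that empty cells hold ' ' as in clear_table().

```c
/* Hypothetical helper: returns 1 when no cell is empty (' '), else 0. */
int board_full(char board[3][3])
{
    int i, j;

    for (i = 0; i < 3; i++)
        for (j = 0; j < 3; j++)
            if (board[i][j] == ' ')
                return 0;   /* found a free cell, the game can continue */
    return 1;
}
```

With the global from the post it would be called as board_full(table) after each move, once check_three() has reported no winner.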
Second Question
On my 4th function, called the_matrix, I need help with formatting. The assignment requires that if I enter the coordinates 1,1, the board looks like this:
O _ _
with the following lines near the bottom blank. However, as my program is right now, it looks like this:
O_ _ _
What I want to do is swap or replace the underscore with the user's input. Not entirely sure how to do that and any help would be appreciated.
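For the formatting question, one approach is to decide per cell what to print: the stored mark if the cell is taken, an underscore otherwise. This is only a sketch, assuming hypothetical helpers cell_char() and format_row() (neither is part of the assignment) and that empty cells hold ' ' as in clear_table().

```c
#include <stdio.h>

/* Hypothetical helper: character to display for one cell. */
char cell_char(char cell)
{
    return cell == ' ' ? '_' : cell;
}

/* Hypothetical helper: write one row, e.g. "O _ _", into out.
 * out needs room for 5 characters plus the terminating '\0'. */
void format_row(const char row[3], char out[6])
{
    snprintf(out, 6, "%c %c %c",
             cell_char(row[0]), cell_char(row[1]), cell_char(row[2]));
}
```

Inside the_matrix, each row could then be printed with format_row(table[m], buf) followed by printf("%s\n", buf), where buf is a char[6].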
I apologize if I violated any rules for stackoverflow by having two questions in one and I'm also sorry for this huge post.

Reduce amount of C code [closed]

How can I refactor this with less code?
This is homework and is cracking a Caesar cipher-text using frequency distribution.
I have completed the assignment but would like it to be cleaner.
int main(int argc, char **argv){
// first allocate some space for our input text (we will read from stdin).
char* text = (char*)malloc(sizeof(char)*TEXT_SIZE+1);
char textfreq[ALEN][2];
char map[ALEN][2];
char newtext[TEXT_SIZE];
char ch, opt, tmpc, tmpc2;
int i, j, tmpi;
// Check the CLI arguments and extract the mode: interactive or dump and store in opt.
if(!(argc == 2 && isalpha(opt = argv[1][1]) && (opt == 'i' || opt == 'd'))){
printf("format is: '%s' [-d|-i]\n", argv[0]);
exit(1);
}
// Now read TEXT_SIZE or feof worth of characters (whichever is smaller) and convert to uppercase as we do it.
for(i = 0, ch = fgetc(stdin); i < TEXT_SIZE && !feof(stdin); i++, ch = fgetc(stdin)){
text[i] = (isalpha(ch)?upcase(ch):ch);
}
text[i] = '\0'; // terminate the string properly.
// Assign alphabet to one dimension of text frequency array and a counter to the other dimension
for (i = 0; i < ALEN; i++) {
textfreq[i][0] = ALPHABET[i];
textfreq[i][1] = 0;
}
// Count frequency of characters in the given text
for (i = 0; i < strlen(text); i++) {
for (j = 0; j < ALEN; j++) {
if (text[i] == textfreq[j][0]) textfreq[j][1]+=1;
}
}
//Sort the character frequency array in descending order
for (i = 0; i < ALEN-1; i++) {
for (j= 0; j < ALEN-i-1; j++) {
if (textfreq[j][1] < textfreq[j+1][1]) {
tmpi = textfreq[j][1];
tmpc = textfreq[j][0];
textfreq[j][1] = textfreq[j+1][1];
textfreq[j][0] = textfreq[j+1][0];
textfreq[j+1][1] = tmpi;
textfreq[j+1][0] = tmpc;
}
}
}
//Map characters to most occurring English characters
for (i = 0; i < ALEN; i++) {
map[i][0] = CHFREQ[i];
map[i][1] = textfreq[i][0];
}
// Sort the map lexicographically
for (i = 0; i < ALEN-1; i++) {
for (j= 0; j < ALEN-i-1; j++) {
if (map[j][0] > map[j+1][0]) {
tmpc = map[j][0];
tmpc2 = map[j][1];
map[j][0] = map[j+1][0];
map[j][1] = map[j+1][1];
map[j+1][0] = tmpc;
map[j+1][1] = tmpc2;
}
}
}
if(opt == 'd'){
decode_text(text, newtext, map);
} else {
// do option -i
}
// Print alphabet and map to stderr and the decoded text to stdout
fprintf(stderr, "\n%s\n", ALPHABET);
for (i = 0; i < ALEN; i++) {
fprintf(stderr, "%c", map[i][1]);
}
printf("\n%s\n", newtext);
return 0;
}
Um, Refactoring != less code. Obfuscation can sometimes result in less code, if that is your objective :)
Refactoring is done to improve code readability and reduce complexity. Suggestions for improvement in your case:
Look at the chunks of logic you've implemented and consider replacing them with built-in functions; that is usually a good place to begin. I'm convinced that some of the sorting you've performed can be replaced with qsort(). Side note, however: if this is your assignment, your tutor may be a douche who wants to see you write out the code in FULL rather than use C's built-in functions, and docks you points for being too smart. (Sorry, personal history here :P)
Move your logical units of work into dedicated functions, and have a main function to perform orchestration.
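As an illustration of the qsort() suggestion, here is a minimal sketch. It assumes each letter and its count are kept together in a small struct; the names struct letter_count, by_count_desc() and sort_by_frequency() are hypothetical, and using an int count instead of the char counter in the post is my choice, not the original design.

```c
#include <stdlib.h>

/* Hypothetical record: one letter and how often it occurred. */
struct letter_count {
    char letter;
    int  count;
};

/* qsort comparator: higher counts sort first. */
static int by_count_desc(const void *a, const void *b)
{
    const struct letter_count *x = a;
    const struct letter_count *y = b;

    return (y->count > x->count) - (y->count < x->count);
}

/* Sort n records in descending order of count. */
void sort_by_frequency(struct letter_count *items, size_t n)
{
    qsort(items, n, sizeof items[0], by_count_desc);
}
```

One difference from the hand-written bubble sort: qsort() is not guaranteed to be stable, so letters with equal counts may come out in a different relative order.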

Using histogram to find the most common letter in an array

This is what I came up with, but I always get a Run-Time Check Failure #2 - Stack around the variable 'h' was corrupted.
int mostCommonLetter(char s[]) {
int i=0, h[26],k=0, max=0, number=0;
while ( k < 26){
h[k] = '0';
k++;
}
while(s[i] != '\0'){
h[whichLetter(s[i])] = h[whichLetter(s[i])]+1;
i++;
}
h[26] = '\0';
for(i=0;h[i]!='\0';i++){
if(h[i] > max)
number=i;
}
return number;
}
You cannot do h[26] = '\0'; - h has 26 elements indexed 0..25. As you know the length of h you don't need to 0-terminate it, simply do for (i=0; i < 26; ++i)
Also, are you certain whichLetter always returns a value in the 0..25 range? What does it do if it e.g. encounters a space?
This writes past the end of the array:
h[26] = '\0';
Make the for loop depend on the length rather than the last character:
for(i=0;i<26;i++){
    if(h[i] > max){
        max=h[i]; /* remember the largest count seen so far */
        number=i;
    }
}
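Putting the fixes together in one place: h only needs indices 0..25, the counters should start at the integer 0 rather than the character '0' (which is 48 in ASCII), and max has to be updated whenever a larger count is found. This is a sketch, assuming a stand-in for the poster's whichLetter(), here called letter_index() (hypothetical), that returns -1 for anything that is not a letter.

```c
#include <ctype.h>

/* Hypothetical stand-in for whichLetter(): 0..25 for a letter, else -1.
 * Assumes a character set where 'a'..'z' are contiguous (e.g. ASCII). */
static int letter_index(char ch)
{
    int d = tolower((unsigned char)ch) - 'a';

    return (d >= 0 && d < 26) ? d : -1;
}

/* Returns the index (0 = 'a') of the most frequent letter in s. */
int mostCommonLetter(const char s[])
{
    int h[26] = { 0 };   /* all counters start at integer zero */
    int i, idx, max = 0, number = 0;

    for (i = 0; s[i] != '\0'; i++) {
        idx = letter_index(s[i]);
        if (idx >= 0)    /* skip spaces, digits, punctuation */
            h[idx]++;
    }
    for (i = 0; i < 26; i++) {
        if (h[i] > max) {
            max = h[i];
            number = i;
        }
    }
    return number;
}
```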

C Arrays and unbroken lists

/edit: thanks for the help so far, however I haven't got any of the solutions to take the sample input and give the sample output. My description isn't the clearest, sorry.
I have an array composed of binary data. What I want to do is determine how long each unbroken segment of 1s or 0s is.
Say I have this data:
0111010001110
In an array binaryArray which I need to translate to:
0100110
stored in nwArray, where 0 represents narrow (fewer than 3 digits long) and 1 represents wide (3 or more digits long). I am not concerned with the binary value but with the length of each component. I'm not sure if that explanation makes sense.
This is what I have; it doesn't work, I can see why, but I can't think of a good solution.
for(x=0;x<1000;x++){
if(binaryArray[x]==binaryArray[x+1]){
count++;
if(count>=3){
nwArray[y]=1;
y++;
count=0;
}
}else{
if(barcodeArray[x]){
nwArray[y]=0;
}
}
}
Does this do it?
int count = 0;
for (x=0; x<1000;x++)
{
    if (binaryArray[x] != binaryArray[x+1])
    {
        if (count < 3)
            nwArray[y]=0;
        else
            nwArray[y]=1;
        y++;
        count = 0;
    }
    else
        count++;
}
One problem you have is that you compare count with 3 too early. Wait until you see a change in the bitstream. Try a while loop until the bit flips then compare the count.
Modified @MikeW's answer (SIZEOF is assumed to be an element-count macro such as sizeof(a) / sizeof((a)[0])):
int count = 0;
int nwSize = 0;
const int ilast = SIZEOF(binaryArray) - 1;
for (int i = 0; i <= ilast; ++i)
    if (i == ilast || binaryArray[i] != binaryArray[i+1]) {
        nwArray[nwSize++] = (count > 1); /* true for '1110'; false for '110' */
        count = 0;
    }
    else
        ++count;
assert(count == 0);
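For reference, here is a self-contained sketch of the run-length approach that can be checked against the sample in the question. It assumes the input is a string of '0' and '1' characters and that a run of 3 or more counts as wide, which is what the sample output implies; the function name classify_runs() is made up.

```c
#include <stddef.h>

/* Hypothetical helper: for each run of equal characters in bits[0..n-1],
 * append '1' to out if the run has 3 or more characters, else '0'.
 * out must have room for n + 1 chars. Returns the number of runs. */
size_t classify_runs(const char *bits, size_t n, char *out)
{
    size_t i, runs = 0, run_len = 0;

    for (i = 0; i < n; i++) {
        run_len++;
        /* A run ends at the last element or where the value changes. */
        if (i + 1 == n || bits[i] != bits[i + 1]) {
            out[runs++] = (run_len >= 3) ? '1' : '0';
            run_len = 0;
        }
    }
    out[runs] = '\0';
    return runs;
}
```

Checking for the last element before looking at bits[i + 1] keeps the comparison from reading past the end of the input, and it also makes sure the final run is emitted.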
