Good morning, one and all!
This is going to end up being one of those blindingly easy questions in hindsight, but for the life of me I'm stumped. I'm going through some of the exercises in The C Programming Language, and I've managed to write some code to initialize an array in a loop. After some Googling, I found better ways of initializing an array to 0, but I don't understand why the loop that I wrote to do it doesn't finish. I've used the debugger to find out that it's because the 'c' variable never reaches 50: it gets to 49 and then rolls over to 0, but I can't figure out why it's rolling over. The code is attached below. Does anyone know what's going on here?
#include <stdio.h>
#define IN 1
#define OUT 0
/* Write a program to print a histogram of the lengths of words in
its input. */
main()
{
int c=0;
int histogram[50]={0};
int current_length=0;
int state=OUT;
//Here we borrow C so we don't have to use i
printf("Initializing...\n");
while(c<51){
histogram[c] =0;
c=c+1;
}
c=0;
printf("Done\n");
while( (c=getchar()) != EOF){
if( (c==32 || c==10) && state==IN ){
//End of word
state=OUT;
histogram[current_length++];
}else if( (c>=33 && c<=126) && state==OUT ){
//Start of word
state=IN;
current_length=0;
}else if( (c>=33 && c<=126) && state==IN ){
//In a word
current_length++;
} else {
//Not in a word
//Example, " " or " \n "
;
}
}
//Print the histogram
//Recycle current_length to hold the length of the longest word
//Find longest word
for( c=0; c<50; c++){
if( c>histogram[c] )
current_length=histogram[c];
}
for( c=current_length; c>=0; c--){
for( state=0; state<=50; state++){
if( histogram[c]>=current_length )
printf("_");
else
printf(" ");
}
}
}
It's because histogram[c] = 0 writes past the histogram memory when c = 50. So essentially histogram[50] overwrites c and makes it 0.
This happens because arrays start from 0 in C. So the last valid index in a 50-element array is 49.
Technically, while this is interesting and exploitable, you can't rely on it. It's a manifestation of undefined behavior. The memory could easily have another layout, causing things to "just work" or do something funnier.
histogram has 50 elements: from index 0 to index 49.
You attempt to write to index 50. ALL BETS ARE OFF
Use this instead:
while (c < 50)
or, to avoid magic constants:
while (c < sizeof histogram / sizeof *histogram)
You are accessing elements 0 to 50 in histogram, which only contains elements 0 to 49 (C/C++ use zero-based indexing, so the highest valid index of an array is always size-1).
To avoid errors like this, you could define the histogram size as a constant, and use that for all operations relating to the histogram array:
#define HISTOGRAM_SIZE 50
Or (only works for C99 or C++, see below comment):
const int HISTOGRAM_SIZE = 50;
Then:
int histogram[HISTOGRAM_SIZE];
And:
while(c<HISTOGRAM_SIZE)
'#define' is a C preprocessor directive and is processed before compilation. To the compiler, it will look as if you had written 50 everywhere HISTOGRAM_SIZE is used, so you won't get any extra overhead.
'const int' gives you a similar solution, which in many cases will give the same result as with the define (I'm not 100% certain under which circumstances though, others are free to elaborate), but will also give you the added bonus of type-checking.
I wrote a program that counts and prints the number of occurrences of each character in a string, but it prints a garbage value when I use fgets(); with gets() it does not.
Here is my code:
#include<stdio.h>
#include<string.h>
#include<ctype.h>
#include<stdlib.h>
int main() {
char c[1005];
fgets(c, 1005, stdin);
int cnt[26] = {0};
for (int i = 0; i < strlen(c); i++) {
cnt[c[i] - 'a']++;
}
for (int i = 0; i < strlen(c); i++) {
if(cnt[c[i]-'a'] != 0) {
printf("%c %d\n", c[i], cnt[c[i] - 'a']);
cnt[c[i] - 'a'] = 0;
}
}
return 0;
}
This is what I get when I use fgets():
baaaabca
b 2
a 5
c 1
32767
--------------------------------
Process exited after 8.61 seconds with return value 0
Press any key to continue . . . _
I fixed it by using gets() and got the correct result, but I still don't understand why fgets() gives the wrong result.
Hurray! So, the most important reason your code is failing is that your code does not observe the following inviolable advice:
Always sanitize your inputs
What this means is that if you let the user input anything then he/she/it can break your code. This is a major, common source of problems in all areas of computer science. It is so well known that a NASA engineer has given us the tale of Little Bobby Tables:
Exploits of a Mom #xkcd.com
It is always worth reading the explanation even if you get it already #explainxkcd.com
medium.com wrote an article about “How Little Bobby Tables Ruined the Internet”
Heck, Bobby’s even got his own website — bobby-tables.com
Okay, so, all that stuff is about SQL injection, but the point is, validate your input before blithely using it. There are many, many examples of C programs that fail because they do not carefully manage input. One of the most recent and widely known is the Heartbleed Bug.
For more fun side reading, here is a superlatively-titled list of “The 10 Worst Programming Mistakes In History” #makeuseof.com — a good number of which were caused by failure to process bad input!
Academia, methinks, often fails students by not having an entire course on just input processing. Instead we tend to pretend that the issue will be later understood and handled — code in academia, science, online competition forums, etc, often assumes valid input!
Where your code went wrong
Using gets() is dangerous because it does not stop reading and storing input as long as the user is supplying it. It has created so many software vulnerabilities that the C Standard has (at long last) officially removed it from C. SO actually has an excellent post on it: Why is the gets function so dangerous that it should not be used?
But it does remove the Enter key from the end of the user’s input!
fgets(), in contrast, stops reading input at some point! However, it also lets you know whether you actually got an entire line of text by not removing that Enter key.
Hence, assuming the user types: b a n a n a Enter
gets() returns the string "banana"
fgets() returns the string "banana\n"
That newline character '\n' (what you get when the user presses the Enter key) messes up your code because your code only accepts (or works correctly given) minuscule alphabet letters!
The Fix
The fix is to reject anything that your algorithm does not like. The easiest way to recognize “good” input is to have a list of it:
// Here is a complete list of VALID INPUTS that we can histogram
//
const char letters[] = "abcdefghijklmnopqrstuvwxyz";
Now we want to create a mapping from each letter in letters[] to an array of integers (its name doesn’t matter, but we’re calling it count[]). Let’s wrap that up in a little function:
// Here is our mapping of letters[] ←→ integers[]
// • supply a valid input → get an integer unique to that specific input
// • supply an invalid input → get an integer shared with ALL invalid input
//
int * histogram(char c) {
    static int fooey;                         // number of invalid inputs
    static int count[sizeof(letters)] = {0};  // numbers of each valid input 'a'..'z'
    const char * p = strchr(letters, c);      // find the valid input, else NULL
    if (p) {
        int index = p - letters;  // 'a'=0, 'b'=1, ... (same order as in letters[])
        return &count[index];     // VALID INPUT → the corresponding integer in count[]
    }
    else return &fooey;           // INVALID INPUT → returns a dummy integer
}
For the more astute among you, this is rather verbose: we can totally get rid of those fooey and index variables.
“Okay, okay, that’s some pretty fancy stuff there, mister. I’m a bloomin’ beginner. What about me, huh?”
Easy. Just check that your character is in range:
int * histogram(char c) {
    static int fooey = 0;
    static int count[26] = {0};
    if (('a' <= c) && (c <= 'z')) return &count[c - 'a'];
    return &fooey;
}
“But EBCDIC...!”
Fine. The following will work with both EBCDIC and ASCII:
int * histogram(char c) {
    static int fooey = 0;
    static int count[26] = {0};
    if (('a' <= c) && (c <= 'i')) return &count[ 0 + c - 'a'];
    if (('j' <= c) && (c <= 'r')) return &count[ 9 + c - 'j'];
    if (('s' <= c) && (c <= 'z')) return &count[18 + c - 's'];
    return &fooey;
}
You will honestly never have to worry about any other character encoding for the Latin minuscules 'a'..'z'. Prove me wrong.
Back to main()
Before we forget, stick the required magic at the top of your program:
#include <stdio.h>
#include <string.h>
Now we can put our fancy-pants histogram mapping to use, without the possibility of undefined behavior due to bad input.
int main() {
    // Ask for and get user input
    char s[1005];
    printf("s? ");
    fgets(s, 1005, stdin);
    // Histogram the input
    for (int i = 0; i < strlen(s); i++) {
        *histogram(s[i]) += 1;
    }
    // Print out the histogram, not printing zeros
    for (int i = 0; i < strlen(letters); i++) {
        if (*histogram(letters[i])) {
            printf("%c %d\n", letters[i], *histogram(letters[i]));
        }
    }
    return 0;
}
We make sure to read and store no more than 1004 characters (plus the terminating nul), and we prevent unwanted input from indexing outside of our histogram’s count[] array! Win-win!
s? a - ba na na !
a 4
b 1
n 2
But wait, there’s more!
We can totally reuse our histogram. Check out this little function:
// Reset the histogram to all zeros
//
void clear_histogram(void) {
    for (const char * p = letters; *p; p++)
        *histogram(*p) = 0;
}
All this stuff is not obvious. User input is hard. But you will find that it doesn’t have to be impossibly difficult genius-level stuff. It should be entertaining!
Another way to handle input is to transform it into acceptable values. For example, you can use tolower() to convert any majuscule letters to your histogram's input set.
s? ba na NA!
a 3
b 1
n 2
But I digress again...
Hang in there!
I'm just starting to learn C, and I really don't know what I am doing wrong. I wrote this code, and it is supposed to stop reading numbers when it receives a negative number. I have wasted a lot of time trying to figure out what is wrong, and I still don't know what it is.
#include<stdio.h>
int main(){
const int qtd = 3;
float ent[qtd];
int i = qtd;
printf("Digite os numeros\n"); /* "Enter the numbers" */
do{
scanf("%f", &ent[i]);
i--;
}while (ent[i] >= 0 && i >= 1);
printf("\n\n\n\nPressione 'Enter' para sair"); /* "Press 'Enter' to exit" */
fflush(stdin);
getchar();
return 0;
}
The problem is with the index of ent that you check for being negative. It's ent[i], but it is after i has been decremented, so you are reading the location that has not been written yet by scanf.
To fix the problem, change the code to use the prior location, i.e.
do {
...
} while (ent[i+1] >= 0 && ...);
There are several other problems with your code, all coming from the assumption that array indexes start at 1. In C, however, the initial index is zero, not one, so the correct check should be
do {
...
} while (ent[i+1] >= 0 && i >= 0);
In addition, i should be initialized to int i = qtd-1; to avoid writing past the end of allocated array.
Hey guys, I'm new to C and I'm trying to learn something by myself.
So here's the question: I have an infinite loop and I don't understand why.
I've already checked other topics, but I didn't really understand them.
Here's the code:
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
/**
* Auto-generated code below aims at helping you parse
* the standard input according to the problem statement.
**/
int main()
{
int n,i=0; // the number of temperatures to analyse
scanf("%d", &n); fgetc(stdin);
char temps[257]; // the n temperatures expressed as integers ranging from -273 to 5526
fgets(temps, 257, stdin); // the n temperatures expressed as integers ranging from -273 to 5526
int temp[257]={0};
char *pointer;
pointer= temps;
while(*pointer != NULL){
int i=0, sign=1;
if(*pointer == '-'){
sign=-1;
pointer++;
}
while(*pointer != 32) { //infinite loop!
if(*pointer >='0' && *pointer<='9'){
temp[i]= (temp[i] *10) + ((*pointer) -'0');
temp[i]= temp[i]*sign;
printf("try");
}
}
printf("%d\n", temp[i]); //verifying temps != 0
pointer++;
i++;
}
return 0;
}
I really don't understand why.
Anyway, the aim of the program is: "Write a program that prints the temperature closest to 0 among input data. If two numbers are equally close to zero, positive integer has to be considered closest to zero (for instance, if the temperatures are -5 and 5, then display 5)."
You may need it.
Thank you in advance.
In the loop:
while(*pointer != 32)
you never change pointer or *pointer within the loop body. So if this loop is entered once then it can never exit.
You probably meant to have a pointer++ somewhere, and perhaps the loop condition should actually be while(*pointer >='0' && *pointer<='9') (what if the string has some numbers, then a letter, then some numbers?)
However bear in mind that this loop will also have to check for end-of-string ('\0') and exit the outer loop correctly if it does hit that (instead of doing pointer++ and going past the terminator as you do in the case of the input being just -).
Ok, I actually understood. Thank you.
So now this is the loop
while(*pointer != 32 || *pointer != '\0') {
if(*pointer >='0' && *pointer<='9'){
temp[i]= (temp[i] *10) + ((*pointer) -'0');
temp[i]= temp[i]*sign;
pointer++;
}
}
Now it gives me values, but at a certain point it becomes infinite.
The rest of the code is the same.
EDIT: I changed the loop condition to while(*pointer >='0' && *pointer<='9') and it's not infinite anymore!
But it doesn't work. There may be a logical mistake.
EDIT 2: I found it. I had initialized i=0 inside the while loop, so of course it kept updating the same temp[i].
Thank you again.
I'm working through K&R exercise 1-13, and I forgot to set the elements in my array to 0. To my surprise, the last value that I got when printing the array was 32767; subsequent tests have different element values for the array, some different, and some not.
I'd like to know why this is happening. If it's highly complex, then what's going on in simple terms?
#include <stdio.h>
#define IN 1 /* inside a word */
#define OUT 0 /* outside a word */
/* print the length of words as input to a histogram with horizontal bars */
int main() {
int c, i;
int state = OUT;
int accum = 0;
int nchar[10];
while ((c = getchar()) != EOF) {
if (c != ' ' && c != '\n' && c != '\t') {
state = IN;
++accum;
}
else {
state = OUT;
++nchar[accum];
accum = 0;
}
}
for (i = 0; i < 10; ++i)
printf("%d\n", nchar[i]);
return 0;
}
Input & Corresponding Output:
hello codes
4195584
0
0
0
4196032
2
4195584
0
-1608045280
32767
When the array is created, the compiler claims memory on the stack. Data is written to that memory location, if you are initializing the array or (in general) assigning values to it.
If you do not initialize anything, just memory is claimed, which was already used before for something else. The stack is not zeroed after data gets removed, because it would waste too much processor time and the RAM is getting filled with data again anyway.
That's simply what happens when you don't initialize your memory. You get whatever was there before your program claimed it...
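You get whatever was last written to that memory. C does not zero-initialize local arrays for you, so if earlier code left, say, 77 at address 0xabcd5657 and your array now occupies that address, reading it gives you 77. On a typical modern operating system that leftover data comes from code that ran earlier in your own process (for example, C runtime startup code) rather than from another program. You can zero the array yourself with memset (declared in <string.h>); note that the third argument is a size in bytes, not an element count:
memset(nchar, 0, sizeof nchar);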
I'm actually writing about the same program as before, but I feel like I've made significant progress since the last time. I have a new question however; I have a function designed to store the frequencies of letters contained within the message inside an array so I can do some comparison checks later. When I ran a test segment through the function by outputting all of my array entries to see what their values are, it seems to be storing some absurd numbers. Here's the function of issue:
void calcFreq ( float found[] )
{
char infname[15], alpha[27];
char ch;
float count = 0;
FILE *fin;
int i = 0;
while (i < 26) {
alpha[i] = 'A' + i++;
}
printf("Please input the name of the file you wish to scan:\n");
scanf("%s", infname);
fin = fopen ( infname, "r");
while ( !feof(fin) ) {
fscanf(fin, "%c", &ch);
if ( isalpha(ch) ) {
count += 1;
i = 0;
if ( islower(ch) ) { ch = toupper(ch); }
while ( i < 26 ) {
if ( ch == alpha[i] ) {
found[i]++;
i = 30;
}
i++;
}
}
}
fclose(fin);
i = 0;
while ( i < 26 ) {
found[i] = found[i] / count;
printf("%f\n", found[i]);
i++;
}
}
At like... found[5], I get this hugely absurd number stored in there. Is there anything you can see that I'm just overlooking? Also, some array values are 0 and I'm pretty certain that every character of the alphabet is being used at least once in the text files I'm using.
I feel like a moron - this program should be easy, but I keep overlooking simple mistakes that cost me a lot of time >.> Thank you so much for your help.
EDIT So... I set the entries to 0 of the frequency array and it seems to turn out okay - in a Linux environment. When I try to use an IDE from a Windows environment, the program does nothing and Windows crashes. What the heck?
Here are a few pointers besides the most important one of initializing found[], which was mentioned in other comments.
The alpha[] array complicates things, and you don't need it. See below for a modified file-read loop that doesn't need the alpha[] array to count the letters in the file.
And strictly speaking, the expression you're using to initialize the alpha[] array:
alpha[i] = 'A' + i++;
has undefined behavior because you modify i as well as use it as an index in two different parts of the expression. The good news is that since you don't need alpha[] you can get rid of its initialization entirely.
The way you're checking for EOF is incorrect - it'll result in you acting on the last character in the file twice (since the fscanf() call that results in an EOF will not change the value of ch). feof() won't return true until after the read that occurs at the end of the file. Change your ch variable to an int type, and modify the loop that reads the file to something like:
// assumes that `ch` is declared as `int`
while ( (ch = fgetc(fin)) != EOF ) {
    if ( isalpha(ch) ) {
        count += 1;
        ch = toupper(ch);
        // the following line is technically non-portable,
        // but works for ASCII targets.
        // I assume this will work for you because the way you
        // initialized the `alpha[]` array assumed that `A`..`Z`
        // were consecutive.
        int index = ch - 'A';
        found[index] += 1;
    }
}
alpha[i] = 'A' + i++;
This is undefined behavior in C: i is modified by i++ and also read as the array index, with no sequence point between the two. Anything can happen when you do this, including crashes.
Generally I would advise you to replace your while loops with for loops when the maximum number of iterations is already known. This makes the code easier to read.
Is there a reason you are using float for counter variables? That doesn't make sense.
'i = 30;' What is this supposed to mean? If your intention was to end the loop, use a break statement instead of some mysterious magic number. If your intention was something else, then your code isn't doing what you think it does.
You should include some error handling if the file was not found. fin = fopen(..) and then if(fin == NULL) handle errors. I would say this is the most likely cause of the crash.
Check the definition of found[] in the caller function. You're probably running out of bounds.