I am trying to write code for a reverse Polish notation calculator. Why does the following code get executed twice when I input a number?
int a[50];
int topOfStack = -1;
char c;
while((c = getchar()) != EOF)
{
int n = atoi(&c);
topOfStack += 1;
a[topOfStack] = n;
printf("top of stack is %d\n", a[topOfStack]);
printf("index top of stack is %d\n", topOfStack);
}
return 0;
}
This
int n = atoi(&c);
is undefined behavior.
The atoi() function takes a char * pointer pointing to a string, AKA a sequence of non-nul bytes followed by a nul byte.
You are passing a pointer to a single char, then atoi() increments the pointer trying to find the terminating '\0' but dereferencing the incremented pointer is undefined behavior because the pointer does not point to an array.
When there is undefined behavior in your code, it doesn't matter what other behavior you observe because it might very well be caused by the undefined behavior problem.
To convert a single digit char to int you just need to subtract the ASCII value of '0' from the ASCII value of the digit, like this
int n = c - '0';
but that doesn't guarantee that n is the value you expect; for that, you need to check with isdigit(c) before attempting to use c as if it were a digit.
Also: The type of c is wrong, it should be int since getchar() returns int and you don't want the value to be truncated.
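A minimal sketch of the fix described above. It assumes the input is already available as a string instead of being read with getchar(), so it can be exercised without a terminal; the function name push_digits and its parameters are hypothetical. It also suggests why the original loop body ran twice: pressing Enter delivers a '\n' character in addition to the digit, and the original loop treated it like any other input.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Pushes every decimal digit found in text onto stack and returns the
   number of values pushed. Non-digit characters (including the '\n'
   produced by the Enter key) are skipped instead of being pushed. */
int push_digits(const char *text, int stack[], int capacity)
{
    assert(text != NULL && stack != NULL);
    int top = -1;                          /* index of the top element */
    for (size_t i = 0; text[i] != '\0'; i++) {
        unsigned char c = (unsigned char)text[i];
        if (!isdigit(c))
            continue;                      /* not a digit: ignore it */
        if (top + 1 >= capacity)
            break;                         /* stack full: stop pushing */
        stack[++top] = c - '0';            /* '0'..'9' -> 0..9 */
    }
    return top + 1;
}
```

With a real terminal the same body would sit inside while ((c = getchar()) != EOF), with c declared as int.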
Related
int getLevelWidth(FILE *level){
char a;
int i = 0;
while(fgets(&a, 2, level)) {
printf("%c",a);
i++;
}
printf("%i", i);
return 0;
}
This is file's content:
ABCDEFGHIJ
KLMNOPQ
RSTUVW
XYZ
And this is the output:
ABCDEFGHIJ
KLMNOPQ
RSTUVW
XYZ
1
The fgets function expects as its first parameter a pointer to the first element of an array of char, and the length of that array as the second. You're passing it the address of a single character and telling it that it is an array of size 2. This means that fgets is writing past the bounds of the variable a, triggering undefined behavior.
What most likely happened in this particular case is that a was followed immediately by i in memory, so writing past the bounds of a ended up writing into i. And assuming your system uses little-endian byte ordering, the first byte of i contains its lowest order byte. So by treating a as a 2 character array, the character in the file is written into a and the terminating null byte (i.e. the value 0) for the string is written into the first byte of i, and assuming the value of i was less than 256 this resets its value to 0.
But again, this is undefined behavior. Just because this is what happened in this particular case doesn't mean that it will always happen.
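The little-endian scenario described above can be simulated without invoking undefined behavior. This is only a rough sketch: the helper names zero_first_byte and is_little_endian are made up for illustration, and the bytes are copied through an unsigned char array so that the effect of a stray null byte can be observed safely.

```c
#include <string.h>

/* Returns value with its first byte in memory set to 0, mimicking the
   terminating null byte that fgets wrote one past the char variable. */
int zero_first_byte(int value)
{
    unsigned char bytes[sizeof value];
    memcpy(bytes, &value, sizeof value);   /* view the int as raw bytes */
    bytes[0] = 0;                          /* the stray '\0' lands here */
    memcpy(&value, bytes, sizeof value);
    return value;
}

/* Returns 1 on a little-endian machine, 0 otherwise. */
int is_little_endian(void)
{
    unsigned int one = 1;
    unsigned char first;
    memcpy(&first, &one, 1);               /* lowest address of the int */
    return first == 1;
}
```

On a little-endian machine, any counter below 256 comes back as 0, which would match a counter that keeps being reset inside the loop.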
Since you only want to read a single character at a time, you instead want to use fgetc. You'll also want to change the type of a to an int to match what the function returns so you can check for EOF.
int a;
int i = 0;
while((a=fgetc(level)) != EOF) {
printf("%c",a);
i++;
}
You need a buffer that is 2 chars long. Your code writes 2 chars into a single char, so the second one is written out of bounds. That is undefined behaviour.
Using the same fgets function:
int getLevelWidth(FILE *level){
char a[2];
int i = 0;
while(fgets(a, 2, level)) {
printf("%c",a[0]);
i++;
}
printf("%i", i);
return 0;
}
I’m learning C using Xcode 8 and the compiler doesn’t run any code after a while- or for-loop executes. Is this a bug? How can I fix it?
In the example provided below, printf("code executed after while-loop"); never executes.
#include <stdio.h>
int getTheLine(char string[]);
int getTheLine(char string[]) {
char character;
int index;
index = 0;
while ((character = getchar()) >= EOF) {
string[index] = character;
++index;
}
printf("code executed after while-loop");
return index;
}
int main(int argc, const char * argv[]) {
char string[100];
int length = getTheLine(string);
printf("length %d\n", length);
return 0;
}
getchar returns an int not a char, and comparison with EOF should be done with the != operator instead of the >= operator.
...
int character; // int instead of char
int index;
index = 0;
while ((character = getchar()) != EOF) { // != instead of >=
...
It's the >= EOF, which makes the condition always true. The reason is that a "valid" result of getchar() will be a non-negative integer, and a "non-valid" result like end-of-file will be EOF, which is negative (cf. getchar()):
EOF ... integer constant expression of type int and negative value
Hence, any valid result from getchar will be >EOF, while the end-of-file-result will be ==EOF, such that >= EOF will always match.
Write != EOF instead.
Note further that you do not terminate your string by the string-terminating-character '\0', such that using string like a string (e.g. in a printf("%s",string)) will yield undefined behaviour (crash or something else probably unwanted).
So write at least:
while ((character = getchar()) != EOF) {
string[index] = character;
++index;
}
string[index]='\0';
Then there is still the issue that you may write out of bounds, e.g. if one enters more than 100 characters in your example. But checking this is beyond the actual question, which was about the infinite loop.
The symbolic constant EOF is an integer constant, of type int. It's (usually) defined as a macro as -1.
The problem is that the value -1 as a (32-bit) int has the value 0xffffffff, while as an (8-bit) char the same value would be 0xff. Those two values are not equal, which in turn means that your loop condition will never be false, leading to an infinite loop.
The solution to this problem is that all standard functions that read characters return them as an int, which means your variable character needs to be of that type too.
Important note: It's a compiler implementation detail if plain char is a signed or an unsigned type. If it is signed then a comparison to an int would lead to sign extension when the char value is promoted in the comparison. That means a signed char with the value 0xff would be extended to the int value 0xffffffff. That means if char is signed then the comparison would work.
This means that your compiler has char as unsigned char. So the unsigned char value 0xff after promotion to int will be 0x000000ff.
The value -1 becomes 0xffffffff because of how negative numbers are usually represented on computers, with something called two's complement.
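A small sketch of the promotion described above, for a loop condition written as != EOF. The function names are hypothetical, and the comments assume 8-bit chars and an EOF of -1, which is typical but not guaranteed by the standard.

```c
/* Stores value in an unsigned char, then promotes it back to int,
   as happens when an unsigned char variable is compared with EOF. */
int through_unsigned_char(int value)
{
    unsigned char c = (unsigned char)value;
    return c;                 /* zero-extended: -1 comes back as 255 */
}

/* Same round trip through a signed char; only meaningful for values
   that fit in a signed char. */
int through_signed_char(int value)
{
    signed char c = (signed char)value;
    return c;                 /* sign-extended: -1 stays -1 */
}
```

Under those assumptions, through_unsigned_char(-1) never equals -1, so the comparison with EOF never succeeds. The signed variant does compare equal, but then a legitimate input byte 0xff would also look like EOF, which is another reason to keep the variable an int.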
You also have another couple of flaws in your code.
The first is that since the loop is infinite you will go way out of bounds of the string array, leading to undefined behavior (and a possible crash sooner or later). The solution to this is to add a condition to make sure that index never reaches the size of the array (100 in the specific case of your array; the size should really be passed as an argument).
The second problem is that if you intend to use the string array as an actual string, you need to terminate it. Strings in C are actually called null-terminated strings. That terminator is the character '\0' (equal to integer 0), and it needs to be put at the end of every string you want to pass to a standard function handling such strings. Having this terminator means that an array of 100 characters can only have 99 characters in it, to be able to fit the terminator. This has implications for the solution to the above problem. As for how to add the terminator, simply do string[index] = '\0'; after the loop (if index is within bounds of course).
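Putting the points above together, here is one possible sketch of a bounded reader. The name read_text and its parameters are assumptions made for illustration; it takes the buffer size as an argument, keeps the character in an int, and always writes the terminator.

```c
#include <assert.h>
#include <stdio.h>

/* Reads characters from in until end of input or until buf is full,
   always leaving room for the terminator. Returns the number of
   characters stored, not counting the '\0'. */
size_t read_text(FILE *in, char *buf, size_t size)
{
    assert(in != NULL && buf != NULL && size > 0);
    size_t index = 0;
    int character;                         /* int, so EOF stays distinct */
    while (index + 1 < size && (character = fgetc(in)) != EOF)
        buf[index++] = (char)character;
    buf[index] = '\0';                     /* make it a proper C string */
    return index;
}
```

In the program from the question this could be called as read_text(stdin, string, sizeof string), which stores at most 99 characters in the 100-element array.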
Background:
I'm trying to create a program that takes a user name(assuming that input is clean), and prints out the initials of the name.
Objective:
Trying my hand out at C programming with CS50
Getting myself familiar with malloc & realloc
Code:
#include <cs50.h>
#include <stdio.h>
#include <string.h>
#include <ctype.h>
string prompt(void);
char *getInitials(string input);
char *appendArray(char *output,char c,int count);
//Tracks # of initials
int counter = 0;
int main(void){
string input = prompt();
char *output = getInitials(input);
for(int i = 0; i < counter ; i++){
printf("%c",toupper(output[i]));
}
}
string prompt(void){
string input;
do{
printf("Please enter your name: ");
input = get_string();
}while(input == NULL);
return input;
}
char *getInitials(string input){
bool initials = true;
char *output;
output = malloc(sizeof(char) * counter);
for(int i = 0, n = strlen(input); i < n ; i++){
//32 -> ASCII code for spacebar
//9 -> ASCII code for tab
if(input[i] == 32 || input[i] == 9 ){
//Next char after spaces/tab will be initial
initials = true;
}else{//Not space/tab
if(initials == true){
counter++;
output = appendArray(output,input[i],counter);
initials = false;
}
}
// eprintf("Input[i] is : %c\n",input[i]);
// eprintf("Counter is : %i\n",counter);
// eprintf("i is : %i\n",i);
// eprintf("n is : %i\n",n);
}
return output;
}
char *appendArray(char *output,char c,int count){
// allocate an array of some initial (fairly small) size;
// read into this array, keeping track of how many elements you've read;
// once the array is full, reallocate it, doubling the size and preserving (i.e. copying) the contents;
// repeat until done.
//pointer to memory
char *data = malloc(0);
//Increase array size by 1
data = realloc(output,sizeof(char) * count);
//append the latest initial
strcat(data,&c);
printf("Value of c is :%c\n",c);
printf("Value of &c is :%s\n",&c);
for(int i = 0; i< count ; i++){
printf("Output: %c\n",data[i]);
}
return data;
}
Problem:
The output is not what I expected: a mysterious P appears in it.
E.g. when I enter the name Barack Obama, instead of getting the result BO, I get the result BP. The same happens for whatever name I choose to enter, with the last initial always being P.
Output:
Please enter your name: Barack Obama
Value of c is :B
Value of &c is :BP
Output: B
Value of c is :O
Value of &c is :OP
Output: B
Output: P
BP
What i've done:
I've traced the problem to the appendArray function, and more specifically to the value of &c (address of c), though I have no idea what's causing the P to appear, what it means, or how I can get rid of it.
The P shows up no matter what I input.
Insights as to why it's happening and what I can do to solve it will be much appreciated.
Thanks!
Several issues, in decreasing order of importance...
First issue - c in appendArray is not a string - it is not a sequence of character values terminated by a 0. c is a single char object, storing a single char value.
When you try to print c as a string, as in
printf("Value of &c is :%s\n",&c);
printf writes out the sequence of character values starting at the address of c until it sees a 0-valued byte. For whatever reason, the byte immediately following c contains the value 80, which is the ASCII (or UTF-8) code for the character 'P'. The next byte contains a 0 (or there's a sequence of bytes containing non-printable characters, followed by a 0-valued byte).
Similarly, using &c as the argument to strcat is inappropriate, since c is not a string. Instead, you should do something like
data[count-1] = c;
Secondly, if you want to treat the data array as a string, you must make sure to size it at least 1 more than the number of initials and write a 0 to the final element:
data[count] = 0; // after all count initials have been stored in data[0] .. data[count-1]
Third,
char *data = malloc(0);
serves no purpose, the behavior is implementation-defined, and you immediately overwrite the result of malloc(0) with a call to realloc:
data = realloc(output,sizeof(char) * count);
So, get rid of the malloc(0) call altogether; either just initialize data to NULL, or initialize it with the realloc call:
char *data = realloc( output, sizeof(char) * count );
Fourth, avoid using "magic numbers" - numeric constants with meaning beyond their immediate, literal value. When you want to compare against character values, use character constants. IOW, change
if(input[i] == 32 || input[i] == 9 ){
to
if ( input[i] == ' ' || input[i] == '\t' )
That way you don't have to worry about whether the character encoding is ASCII, UTF-8, EBCDIC, or some other system. ' ' means space everywhere, '\t' means tab everywhere.
Finally...
I know part of your motivation for this exercise is to get familiar with malloc and realloc, but I want to caution you about some things:
realloc is potentially an expensive operation, it may move data to a new location, and it may fail. You really don't want to realloc a buffer a byte at a time. Instead, it's better to realloc in chunks. A typical strategy is to multiply the current buffer size by some factor > 1 (typically doubling):
char *tmp = realloc( data, current_size * 2 );
if ( tmp )
{
current_size *= 2;
data = tmp;
}
You should always check the result of a malloc, calloc, or realloc call to make sure it succeeded before attempting to access that memory.
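As a rough illustration of the advice above (assign the character directly, keep the terminator, grow by doubling, check realloc), here is a sketch of a growable buffer. The struct charbuf type, the charbuf_append name, and the starting capacity of 8 are all hypothetical choices, not part of the original code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical growable string: data holds len characters plus '\0'. */
struct charbuf {
    char  *data;
    size_t len;       /* characters stored, excluding the terminator */
    size_t cap;       /* bytes allocated */
};

/* Appends c, doubling the allocation when it is full.
   Returns 1 on success, 0 if realloc fails (buffer left unchanged). */
int charbuf_append(struct charbuf *b, char c)
{
    assert(b != NULL);
    if (b->len + 2 > b->cap) {             /* need room for c and '\0' */
        size_t new_cap = b->cap ? b->cap * 2 : 8;
        char *tmp = realloc(b->data, new_cap);
        if (tmp == NULL)
            return 0;                      /* old block is still valid */
        b->data = tmp;
        b->cap = new_cap;
    }
    b->data[b->len++] = c;                 /* plain assignment, no strcat */
    b->data[b->len] = '\0';
    return 1;
}
```

A buffer starts out as struct charbuf b = { NULL, 0, 0 }; and is released with free(b.data) once it is no longer needed.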
Minor stylistic notes:
Avoid global variables where you can. There's no reason counter should be global, especially since you pass it as an argument to appendArray. Declare it local to main and pass it as an argument (by reference) to getInitials:
int main( void )
{
int counter = 0;
...
char *output = getInitials( input, &counter );
for(int i = 0; i < counter ; i++)
{
printf("%c",toupper(output[i]));
}
...
}
/**
* The "string" typedef is an abomination that *will* lead you astray,
* and I want to have words with whoever created the CS50 header.
*
* They're trying to abstract away the concept of a "string" in C, but
* they've done it in such a way that the abstraction is "leaky" -
* in order to use and access the input object correctly, you *need to know*
* the representation behind the typedef, which in this case is `char *`.
*
* Secondly, not every `char *` object points to the beginning of a
* *string*.
*
* Hiding pointer types behind typedefs is almost always bad juju.
*/
char *getInitials( const char *input, int *counter )
{
...
(*counter)++; // parens are necessary here
output = appendArray(output,input[i],*counter); // need leading * here
...
}
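For comparison, here is one way the whole task could look with the points above applied. This is a sketch under stated assumptions, not the answerer's code: the name get_initials is hypothetical, it returns a null-terminated string so no separate counter is needed, and it deliberately swaps the realloc-per-character approach for a single malloc sized from strlen.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated, null-terminated string holding the
   upper-cased initials of name, or NULL if allocation fails.
   The caller is responsible for calling free() on the result. */
char *get_initials(const char *name)
{
    assert(name != NULL);
    /* There can never be more initials than characters in name. */
    char *out = malloc(strlen(name) + 1);
    if (out == NULL)
        return NULL;

    size_t count = 0;
    int at_word_start = 1;                 /* next non-blank is an initial */
    for (size_t i = 0; name[i] != '\0'; i++) {
        unsigned char c = (unsigned char)name[i];
        if (c == ' ' || c == '\t') {
            at_word_start = 1;
        } else if (at_word_start) {
            out[count++] = (char)toupper(c);
            at_word_start = 0;
        }
    }
    out[count] = '\0';                     /* safe to print with %s */
    return out;
}
```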
#include<stdlib.h>
#include<stdio.h>
#define NO_OF_CHARS 256
/* Returns an array of size 256 containing count
of characters in the passed char array */
int *getCharCountArray(char *str)
{
int *count = (int *)calloc(sizeof(int), NO_OF_CHARS);
int i;
for (i = 0; *(str+i); i++)
count[*(str+i)]++;
return count;
}
/* The function returns index of first non-repeating
character in a string. If all characters are repeating
then returns -1 */
int firstNonRepeating(char *str)
{
int *count = getCharCountArray(str);
int index = -1, i;
for (i = 0; *(str+i); i++)
{
if (count[*(str+i)] == 1)
{
index = i;
break;
}
}
free(count); // To avoid memory leak
return index;
}
/* Driver program to test above function */
int main()
{
char str[] = "geeksforgeeks";
int index = firstNonRepeating(str);
if (index == -1)
printf("Either all characters are repeating or string is empty");
else
printf("First non-repeating character is %c", str[index]);
getchar();
return 0;
}
I really can't grasp the following lines:
count[*(str+i)]++;
and
int *getCharCountArray(char *str)
{
int *count = (int *)calloc(sizeof(int), NO_OF_CHARS);
int i;
for (i = 0; *(str+i); i++)
count[*(str+i)]++;
return count;
}
The program is used to find the first Non-Repeating character in the string.
*(str+i) is same as str[i]. The line:
for (i = 0; *(str+i); i++)
is the same as:
for (i = 0; str[i]; i++)
The statements in the loop will be executed as long as str[i] evaluates to non-zero. Since C strings are arrays of characters that are terminated by a null character, the for loop will be executed for each character in str. It will stop when the end of the string is reached.
count[*(str+i)]++;
is the same as:
count[str[i]]++;
If str[i] is 'a', this line will increment the value of count['a'], which is count[97] in ASCII encoding.
At the end of the loop, count will be filled with integers that represent the number of times a particular character appears in str.
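To see the equivalence of the two notations in isolation, here is a small sketch with hypothetical function names. Both loops use the same "run until the null character" condition as the code in the question and differ only in how the character is accessed.

```c
#include <assert.h>
#include <stddef.h>

/* Counts characters up to the terminator using index notation. */
size_t length_by_index(const char *str)
{
    assert(str != NULL);
    size_t i;
    for (i = 0; str[i]; i++)
        ;                                  /* stops when str[i] is '\0' */
    return i;
}

/* Same loop written with explicit pointer arithmetic. */
size_t length_by_pointer(const char *str)
{
    assert(str != NULL);
    size_t i;
    for (i = 0; *(str + i); i++)
        ;                                  /* *(str + i) is str[i] */
    return i;
}
```

Both functions return the same value for any string, which is also what strlen would report.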
I really can't grasp the following lines:
count[*(str+i)]++;
Work from the outside in:
since str is a pointer to char and i is an int, str + i is a pointer to the char that is i chars after the one str itself points to
*(str+i) dereferences pointer str+i, meaning it evaluates to the char the pointer points to. This is exactly equivalent to str[i].
count[*(str+i)] uses the char at index i in string str as an index into dynamic array count. The expression designates the int at that index (since count points to an array of ints). See also below.
count[*(str+i)]++ evaluates to the int at index *(str+i) in the array count points to. As a side effect, it increments that array element by one after the value of the expression is determined. This overall expression is present in your code exclusively for its side effect.
It is important to note that although space is reserved in array count for counting appearances of 256 distinct char values, the expression you asked about is not a safe way to count all of them. That's because type char can be implemented as a signed type (at the C implementer's discretion), and it is common for it to be implemented that way. In that case, only the non-negative char values correspond to array elements, and undefined behavior will result if the input string contains others. Safer would be:
#include <stdint.h>
/* ... */
count[(uint8_t) *(str+i)]++;
i.e. the same as the original, except for explicitly casting each character of the input string to an unsigned 8-bit value.
Overall, the function simply creates an array of 256 ints, one for each possible char value, and scans the string to count the number of occurrences of each char value that appears in it. It then returns this array of occurrence counts.
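A compact sketch of the same idea with the signedness issue addressed. Note the substitutions: it casts to unsigned char rather than uint8_t, and it uses a local array sized UCHAR_MAX + 1 in place of the calloc'd one. The function name first_non_repeating is made up for this example.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Returns the index of the first character that occurs exactly once
   in str, or -1 if there is none. */
int first_non_repeating(const char *str)
{
    assert(str != NULL);
    int count[UCHAR_MAX + 1] = {0};        /* one slot per byte value */

    for (size_t i = 0; str[i] != '\0'; i++)
        count[(unsigned char)str[i]]++;    /* index is never negative */

    for (size_t i = 0; str[i] != '\0'; i++)
        if (count[(unsigned char)str[i]] == 1)
            return (int)i;
    return -1;
}
```

For "geeksforgeeks" this returns 5, the index of 'f'.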
*(str + i) is a confusing way of expressing str[i] and, IMO, inappropriate here.
The following code is equivalent to the confusing loop you posted. Does it help?
for (i = 0; str[i] != '\0'; ++i)
{
char curr_char = str[i];
++count[curr_char];
}
Explanation of the for loop
In a for loop there are three things we need to consider:
1) Initialization of the counter variable (i in your example).
2) Condition (*(str+i)).
3) Increment/decrement part (i++).
The for loop keeps executing as long as the condition is true (i.e. any non-zero value), so *(str+i) provides a non-zero value for as long as there are characters left in the array.
count[*(str+i)]++; // counts how many times each character occurs, stepping through the string character by character
count[*(str+i)]++ => count[*(str+i)] = count[*(str+i)] + 1
Now consider one scenario:
char str[] = "aaab";
*(str+i), i.e. str[i], yields a char such as 'a', 'b', etc.
So for 'a', count[*(str+i)]++ is count['a']++, which means:
count['a']=count['a']+1 // count of 'a' becomes 1
count['a']=count['a']+1 // count of 'a' becomes 2
count['a']=count['a']+1 // count of 'a' becomes 3
and likewise for the other characters.
So count[*(str+i)]++ updates the number of occurrences of each character in count.
The code is like this:
char c;
scanf("%d", &c);
inputting 3...
My guess is that when 3 is input, it has type int, and is then demoted to char and assigned to c.
Printing the value of c with the %d specifier yields 3, which seems to be as expected,
but printing the value of c with the %c specifier yields a blank on the terminal. This is question (1).
To test further, I declared another variable ch of type char and initialized it like this:
char ch = 1;
and the test is like this:
(i&j)? printf("yes") : printf("no");
and the result is "no".
I printed out the value of i&j, and it is 0,
but shouldn't 1&3 be 1? This is question (2).
My questions are (1) and (2).
You're actually invoking undefined behavior doing that.
By using the format string %d, you're telling scanf to expect an int* as a parameter and you're passing it a pointer to a single character. Remember that scanf has no further type information on what you're passing it than what you're putting in the format string.
This will result in scanf attempting to write an int sized value to memory at an address that points to a char sized reservation, potentially (and for most architectures) writing out of bounds.
After invoking UB, all bets are off on your further calculations.
Suppose that scanf() were not a varargs-function, but a plain ordinary function taking a pointer-to-int as the 2nd argument:
int noscanf(char *format, int *ptr)
{
*ptr = 42;
return 1;
}
int main(void)
{
char ch;
int rc;
// This should *at least* give a warning ...
rc = noscanf("Haha!" , &ch);
return 0;
}
Now, scanf() is a varargs function. The only way for scanf() to determine the type of the (pointer) arguments is by inspecting the format string. And a %d means : the next argument is supposed to be a pointer to int. So scanf can happily write sizeof(int) bytes to *ptr.
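One way to avoid the mismatch is to let %d write into a real int and narrow the value afterwards. The sketch below uses sscanf on a string as a stand-in for scanf (both use the same conversion specifiers); the function name parse_char_value is hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

/* Parses a decimal integer from text and stores it in *out if it fits
   in a char. Returns 1 on success, 0 otherwise. */
int parse_char_value(const char *text, char *out)
{
    assert(text != NULL && out != NULL);
    int value;                             /* %d requires an int object */
    if (sscanf(text, "%d", &value) != 1)
        return 0;                          /* no number found */
    if (value < CHAR_MIN || value > CHAR_MAX)
        return 0;                          /* would not fit in a char */
    *out = (char)value;
    return 1;
}
```

With this approach the pointer passed for %d always points to an object of the size the conversion writes.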
I can't see a variable j there. So i&j will be 0. And yes, if i == 1 and j == 3 then i & j == 1.
(i&j)? printf("yes") : printf("no");
gives the output yes for i=1 and j=3.
As for question (1): ASCII code 3 is the ETX (end of text) control character, which is not printable.
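Both points can be checked with a few lines. This is only a sketch with made-up helper names; it relies on isprint from <ctype.h> and on the default "C" locale for the printable test.

```c
#include <ctype.h>

/* Returns nonzero when i and j have at least one set bit in common. */
int share_a_bit(int i, int j)
{
    return (i & j) != 0;
}

/* Returns nonzero when code is a printable character; code must be
   EOF or representable as unsigned char. */
int is_printable_code(int code)
{
    return isprint(code) != 0;
}
```

The value 3 stored in a char is a control character, while the character '3' (the digit) is printable. That is why printing with %c shows nothing visible while %d shows 3.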