"UndefinedBehaviorSanitizer:DEADLYSIGNAL" Error in Code - c

So I am currently taking this cs50 course and I can't seem to figure out what the error/crash mentioned relates to!
I am trying to make my code check if the input from the command line is an integer or not. For example, ./caesar 2 is valid but ./caesar 2z returns ./caesar key.
Error Message:
#include <cs50.h>
#include <stdio.h>
#include <ctype.h>
//string error_key = "./caesar key \n";
int main(int argc, string argv[])
{
if (argc == 2)
{
if (isdigit(argv[1]) == 0)
{
printf("./caesar key \n");
return 1;
}
else
{
int string_to_int = atoi(argv[1]);
printf("%i\n",string_to_int);
}
}
else
{
printf("./caesar key \n");
return 1;
}
}

This is undefined behavior:
isdigit(argv[1])
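argv[1] is not a character, it's a pointer (a char *) to the first character of the argument. A pointer value will almost certainly be larger than anything that can be represented in an unsigned char, so passing it to isdigit is undefined behavior, and that is what LLVM's Undefined Behavior Sanitizer reports as a crash.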
Quoting from POSIX, which is aligned to the ISO C standard:
The c argument is an int, the value of which the application shall ensure is a character representable as an unsigned char or equal to the value of the macro EOF. If the argument has any other value, the behavior is undefined.
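A minimal sketch of a corrected check, assuming the key must consist only of decimal digits. The helper name only_digits is made up for illustration; it is not part of CS50's library or the original program:

```c
#include <ctype.h>
#include <stdbool.h>

// Hypothetical helper: returns true only if s is non-empty and every
// character in it is a decimal digit.
bool only_digits(const char *s)
{
    if (*s == '\0')
    {
        return false; // reject the empty string
    }
    for (; *s != '\0'; s++)
    {
        // isdigit() takes a single character (as an unsigned char value),
        // not the char * that points to the whole argument.
        if (!isdigit((unsigned char) *s))
        {
            return false;
        }
    }
    return true;
}
```

In the question's program, the test isdigit(argv[1]) == 0 would then become !only_digits(argv[1]).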

Related

How can I reject an alphanumeric command-line argument (i.e. ./main 20x), but not a solely numeric one (i.e. ./main 20)? (in C)

I have figured out how to reject a purely alphabetical argument. I cannot figure out how to reject an alphanumeric user input while passing numeric inputs.
Here is my relevant code:
#include <cs50.h>
#include <stdio.h>
#include <string.h>
#include <ctype.h>
int main(int argc, string argv[])
{
if (argc != 2 || isalpha(*argv[1]))
{
printf("Usage: ./caesar key\n");
return 1;
}
else...
and my code goes on to successfully carry out the function of my program (save for this one bug).
Any help would be much appreciated! This is for an edX cs50 homework assignment. I finished the whole assignment except for this bug. I searched for an answer for over 3 hours, to no avail. Help me, stack overflownobi. You're my only hope.
You're almost there: your code checks if *argv[1] (i.e. the first character of that argument) is an alphabetic character. Instead, let's use a loop to check the entire argv[1] string. We're also going to use isdigit instead, so that we can reject strings such as 41!#4 which wouldn't get detected using isalpha.
argv is an array of char*, or pointers to characters, meaning that argv[1] is a pointer to the first character of that argument. Given a pointer to the first character of a string, we need to find the string's length using strlen, after which we can write a loop. Let's break this out into a function:
bool string_is_numeric(char* string) {
size_t length = strlen(string);
for(size_t i = 0; i < length; i++) {
if(!isdigit((unsigned char)string[i])) { return false; } // cast: isdigit requires an unsigned char value
}
return true;
}
You can call this as follows:
if (argc != 2 || !string_is_numeric(argv[1])) // pass the whole string, not *argv[1]
{
printf("Usage: ./caesar key\n");
return 1;
}
Note that the code I gave has a few limitations:
It doesn't check whether the number is too large, e.g. to fit into an int.
It doesn't handle decimal values (i.e. those that we would parse into double).
It doesn't allow negative numbers.
Alternative using strtol:
The library function strtol allows you to convert your string to a long int, while also providing you a pointer to the very first character that couldn't be converted.
You can check that pointer: if it points to a null character (i.e. *endptr == '\0'), then strtol reached the end of the string successfully meaning that it was all valid digits.
You'll need to declare a long and a char* to hold the results:
if (argc != 2)
{
printf("Usage: ./caesar key\n");
return 1;
}
char* endptr;
long key = strtol(argv[1], &endptr, 10); // 10 meaning decimal
if(endptr == argv[1] || *endptr != '\0') { // nothing converted, or trailing characters
printf("Key must be numeric\n");
return 1;
}
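The strtol approach can also detect overflow if errno is cleared before the call and tested afterwards. The following is a sketch under that assumption; the helper name parse_long and its out parameter are hypothetical, chosen only for this example:

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

// Hypothetical helper: parses text as a base-10 long.
// Returns true and stores the value in *out only if the whole string
// was a number that fits in a long.
bool parse_long(const char *text, long *out)
{
    char *endptr;
    errno = 0; // strtol only sets errno on failure, so clear it first
    long value = strtol(text, &endptr, 10);
    if (endptr == text)
    {
        return false; // nothing was converted (e.g. "" or "abc")
    }
    if (*endptr != '\0')
    {
        return false; // trailing characters (e.g. "20x")
    }
    if (errno == ERANGE)
    {
        return false; // value does not fit in a long
    }
    *out = value;
    return true;
}
```

Note that strtol itself skips leading whitespace and accepts a leading sign, so this sketch accepts "-7" and " 20" as well; reject those separately if the key must be unsigned.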

UndefinedBehaviorSanitiser Issue - CS50 Vigenere

I am working on Vigenere (CS50) and keep getting an "UndefinedBehaviorSanitiser SEGV on Unknown Address" when I run my program with any argument that passes the initial screening.
I have read about this issue but cannot find the solution. I cut my code down as much as I could and found the problem occurs even when I do this part. Where is the issue?
Thank you so much.
#include <cs50.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
//int shift(char c);
int main(int argc, string argv[])
{
if (argc != 2)
{
printf("Usage: ./vigenere keyword");
return 1;
}
else
{
for (int i = 0; i < strlen(argv[1]); i++)
{
if (!isalpha(argv[1]))
{
printf("Usage: ./vigenere keyword");
return 1;
}
else
{
printf("All good!");
return 1;
}
}
}
}
The fix indeed is
if (!isalpha((unsigned char)argv[1][i]))
The isalpha function/macro takes only a single integer, which must have the value of a single character as unsigned char. But argv[1] is a pointer to multiple characters!
Now as an extra complication, the isalpha is often realized as a macro, and often coded so that a compiler does not produce any diagnostics for a wrong type of argument. This is unfortunate, but you just need to know these when you're programming in C.
The cast of char to unsigned char is required too - if not, then any extended characters (i.e. for example ä) will invoke undefined behaviour on platforms where char is signed - and they are on x86 processors - because the value will be negative, but isalpha would expect just a number that is either EOF or a non-negative number less than or equal to UCHAR_MAX.
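As a sketch of how the whole keyword check might look with that fix applied (the function name keyword_is_alpha is hypothetical): it returns false at the first non-letter and reports success only after the loop has seen every character, whereas the question's loop returns during its first iteration.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

// Hypothetical helper: true if every character of keyword is alphabetic.
// Note that an empty keyword also yields true here.
bool keyword_is_alpha(const char *keyword)
{
    for (size_t i = 0; keyword[i] != '\0'; i++)
    {
        // Index into the string and cast, as described above.
        if (!isalpha((unsigned char) keyword[i]))
        {
            return false;
        }
    }
    return true; // only reached once all characters were checked
}
```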

How to check an edge case in taking command line argument in C and evaluating to int or double?

So I have an assignment to figure out whether a number on the command line is either an integer or a double.
I have mostly figured it out by doing:
sscanf(argv[x], "%lf", &d)
Where "d" is a double. I then cast it to an int and subtract that from "d" to check whether the difference is 0.0, like so:
d - (int)d == 0.0
My problem is if the command line arguments contains doubles that can be technically classified as ints.
I need to classify 3.0 as a double whereas my solution considers it an int.
For example initializing the program.
a.out 3.0
I need it to print out
"3.0 is a double"
However right now it becomes
"3 is an int."
What would be a way to check for this? I did look around for similar problems which led me to the current solution but just this one edge case I do not know how to account for.
Thank you.
For example, a way like this:
#include <stdio.h>
int main(int argc, char *argv[]){
if(argc != 2){
puts("Need an argument!");
return -1;
}
int int_v, read_len = 0;
double double_v;
printf("'%s' is ", argv[1]);
//==1 : It was able to read normally.
//!argv[1][read_len] : It used all the argument strings.
if(sscanf(argv[1], "%d%n", &int_v, &read_len) == 1 && !argv[1][read_len])
puts("an int.");
else if(sscanf(argv[1], "%lf%n", &double_v, &read_len) == 1 && !argv[1][read_len])
puts("a double.");
else
puts("not the expected input.");
}
To test if a string will convert to an int and/or double (completely, without integer overflow, without undefined behavior), call strtol()/strtod(). @Tom Karzes
The trouble with a sscanf() approach is that the result is undefined behavior (UB) on overflow. To properly detect, use strtol()/strtod().
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>
bool is_int(const char *src) {
char *endptr;
// Clear, so it may be tested after strtol().
errno = 0;
// Using 0 here allows 0x1234, octal 0123 and decimal 1234.
// or use 10 to allow only decimal text.
long num = strtol(src, &endptr, 0 /* or 10 */);
#if LONG_MIN < INT_MIN || LONG_MAX > INT_MAX
if (num < INT_MIN || num > INT_MAX) {
errno = ERANGE;
}
#endif
return !errno && endptr > src && *endptr == '\0';
}
bool is_double(const char *src) {
char *endptr;
// errno is not cleared here because it is not tested below.
strtod(src, &endptr);
// In this case, detecting over/underflow IMO is not a concern.
return endptr > src && *endptr == '\0';
}
It is not entirely clear what the specific expectations are for your program, but it has at least something to do with the form of the input, since "3.0" must be classified as a double. If the form is all it should care about, then you should not try to convert the argument strings to numbers at all, for then you will run into trouble with unrepresentable values. In that case, you should analyze the character sequence of the argument to see whether it matches the pattern of an integer, and if not, whether it matches the pattern of a floating-point number.
For example:
#include <ctype.h>
#include <stdio.h>
int main(int argc, char *argv[]) {
    for (int arg_num = 1; arg_num < argc; arg_num++) {
        char *arg = argv[arg_num];
        int i = (arg[0] == '-' || arg[0] == '+') ? 1 : 0; // skip any leading sign
        // scan through all the decimal digits
        while(isdigit((unsigned char) arg[i])) {
            ++i;
        }
        // anything left over after the digits means the form is not an integer
        printf("Argument %d is %s.\n", arg_num, arg[i] ? "floating-point" : "integer");
    }
}
That makes several assumptions, chief among them:
the question is strictly about form, so that the properties of your system's built-in data types (such as int and double) are not relevant.
each argument will have the form of either an integer or a floating-point number, so that eliminating "integer" as a possibility leaves "floating-point" as the only alternative. If "neither" is a possibility that must also be accommodated, then you'll also need to compare the inputs that do not have integer form to a pattern for floating-point numbers, too.
only decimal (or smaller radix) integers need be accommodated -- not, for example, hexadecimal inputs.
Under those assumptions, particularly the first, it is not just unnecessary but counterproductive to attempt to convert the arguments to one of the built-in numeric data types, because you would then come to the wrong conclusion about arguments that, say, are not within the bounds of representable values for those types.
For example, consider how the program should classify "9000000000". It has the form of an integer, but supposing that your system's int type has 31 value bits, that type cannot accommodate a value as large as the one the string represents.
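If "neither" must also be accommodated, the form check can be extended along the following lines. This is only a sketch: the enum and the function name classify are hypothetical, and it assumes a deliberately simple floating-point pattern (optional sign, decimal digits, exactly one '.', no exponent).

```c
#include <ctype.h>
#include <stddef.h>

// Hypothetical result type for this sketch.
enum form { FORM_INTEGER, FORM_FLOATING, FORM_NEITHER };

// Classifies arg purely by its characters; no numeric conversion is
// attempted, so the size of the number does not matter.
enum form classify(const char *arg)
{
    size_t i = 0;
    size_t digits = 0;
    if (arg[i] == '-' || arg[i] == '+')
    {
        i++; // skip one leading sign
    }
    while (isdigit((unsigned char) arg[i]))
    {
        i++;
        digits++;
    }
    if (arg[i] == '\0')
    {
        // only (sign and) digits: an integer if there was at least one digit
        return digits > 0 ? FORM_INTEGER : FORM_NEITHER;
    }
    if (arg[i] != '.')
    {
        return FORM_NEITHER;
    }
    i++; // skip the decimal point
    while (isdigit((unsigned char) arg[i]))
    {
        i++;
        digits++;
    }
    // must end right after the fraction and contain at least one digit
    return (arg[i] == '\0' && digits > 0) ? FORM_FLOATING : FORM_NEITHER;
}
```

With this sketch "3" and "9000000000" are classified as integers, "3.0" as floating-point, and "3.0x" or "." as neither.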
int main (int argc,char *argv[])
{
if(argc==2)
{
int i;
double d;
d=atof(argv[1]);
i=atoi(argv[1]);
if(d!=i)
printf("%s is a double.",argv[1]);
else if(d==i)
printf("%s is an int.",argv[1]);
}
else
printf("Invalid input\n");
return 0;
}
You must add #include <stdio.h> and #include <stdlib.h> for this to compile.

How do you convert parameters from char to int in the main function for C?

I have this code:
int main(int argc, char *argv[]) {
int num = *argv[1];
When I run the function in terminal with a parameter: for example, if I were to call ./main 17, I want num = 17. However, with this code, num = 49 (ASCII value for 1 because argv is an array of characters). How would I get it to read num = 17 as an int? Playing around with the code, I can get it to convert the parameter into an int, but it will still only read/convert the first value (1 instead of 17).
I'm new to C and the concept of pointers/pointers to arrays is still confusing to me. Shouldn't *argv[1] return the value of the second char in the array? Why does it read the first value of the second char in the array instead?
Thanks for help!
How do you convert parameters from char to int?
Can be done by a simple cast (promotion), but this isn't your case.
In your case, char *argv[] declares an array of pointers to char (tools exist for breaking down complex C declarations), meaning that argv[1] is the 2nd element in the array, i.e. the 2nd char* in the array, and *argv[1] is the first char of the string that this 2nd char* points to.
To show it more clearly, assume argv holds 2 string {"good", "day"}. argv[1] is "day" and *argv[1] is 'd' (note the difference in types - char vs char*!)
Now, you are left with the 1st char in your input string, i.e. '1'. Its ASCII code is indeed 49, so in order to get its "int" value you should use atoi like this:
int i = atoi("17");
BUT atoi takes a const char *, so passing it the string "17" works while passing it a single char does not. This means atoi should get argv[1] instead of *argv[1]:
#include <stdlib.h>
int main(int argc, char *argv[]) {
    int num = atoi(argv[1]);
    // not : int num = *argv[1]; --> simple promotion that would take the ASCII value of '1' :(
    // and not: int num = atoi(*argv[1]); --> argument is a char, not a char *
    return 0;
}
note: atoi is considered obsolete so you may want to use long int strtol(const char *str, char **endptr, int base) but for a simple example I preferred using atoi
Shouldn't *argv[1] return the value of the second char in the array?
Look at the signature:
int main(int argc, char *argv[])
Here, argv is an array ([]) of pointers (*) to char. So argv[1] is the second pointer in this array. It points to the first argument given at the command line. argv[0] is reserved for the name of the program itself. Although this can also be any string, the name of the program is put there by convention (shells do this).
If you just dereference a pointer, you get the value it points to, so *argv[1] will give you the first character of the first argument. You could write it as argv[1][0], they're equivalent. To get the second character of the first argument, you'd write argv[1][1].
An important thing to note here is that you can never pass an array to a function in C. The signature above shows an array type, but C automatically adjusts array types to pointer types in function declarations. This results in the following declaration:
int main(int argc, char **argv)
The indexing operator ([]) in C works in terms of pointer arithmetic: a[x] is equivalent to *(a+x). The identifier of an array is evaluated as a pointer to the first array element in most contexts (exceptions include the sizeof operator). Therefore indexing works the same, no matter whether a is an array or a pointer. That's why you can treat argv very much like an array.
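To make the equivalence concrete, here is a small sketch (the helper name same_element is made up for this example) that compares the address reached through subscripts with the one reached through explicit pointer arithmetic:

```c
#include <stdbool.h>

// Hypothetical helper: checks that argv[arg][pos] and *(*(argv + arg) + pos)
// designate the very same character by comparing their addresses.
bool same_element(char **argv, int arg, int pos)
{
    char *by_subscript = &argv[arg][pos];
    char *by_arithmetic = *(argv + arg) + pos;
    return by_subscript == by_arithmetic;
}
```

For an argument vector holding "./main" and "17", this means *argv[1], argv[1][0] and **(argv + 1) all name the character '1', and argv[1][1] names '7'.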
Addressing your "core" problem: You will always have strings in argv and you want numeric input, this means you have to convert a string to a number. There are already functions doing this. A very simple one is atoi(), you can use it like this:
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char **argv)
{
if (argc != 2)
{
// use program name in argv[0] for error message:
fprintf(stderr, "Usage: %s [number]\n", argv[0]);
return EXIT_FAILURE;
}
int i = atoi(argv[1]);
printf("Argument is %d.\n", i);
return EXIT_SUCCESS;
}
This will give you 0 if the argument couldn't be parsed as a number, and the behavior is undefined if the value doesn't fit in an int. In cases where you have to make sure the argument is a valid integer, you could use strtol() instead (note it converts to long, not int, and it can handle different bases, so we have to pass 10 for decimal):
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
int main(int argc, char **argv)
{
if (argc != 2)
{
// use program name in argv[0] for error message:
fprintf(stderr, "Usage: %s [number]\n", argv[0]);
return EXIT_FAILURE;
}
errno = 0; // reset error number
char *endptr; // this will point to the first character not read by strtol
long i = strtol(argv[1], &endptr, 10);
if (errno == ERANGE)
{
fprintf(stderr, "This number is too small or too large.\n");
return EXIT_FAILURE;
}
else if (endptr == argv[1])
{
// no character was converted. This also catches the case of an empty argument
fprintf(stderr, "The argument was not a number.\n");
return EXIT_FAILURE;
}
else if (*endptr)
{
// endptr doesn't point to NUL, so there were characters not converted
fprintf(stderr, "Unexpected characters in number.\n");
return EXIT_FAILURE;
}
printf("You entered %ld.\n", i);
return EXIT_SUCCESS;
}
I'm new to C and the concept of pointers/pointers to arrays is still confusing to me.
In C strings are represented by null terminated ('\0') character arrays. Let's consider the following example:
char str[] = "Hello world!";
The characters would lie contiguous in memory and the usage of str would decay to a character pointer (char*) that points to the first element of the string. The address of (&) the first element taken by &str[0] would also point to that address:
| . | . | . | H | e | l | l | o |   | w | o | r | l | d | ! | \0 | . | . | . |
              ^                                               ^
              str                                             null terminator
Shouldn't *argv[1] return the value of the second char in the array?
First of all, argv is an array of character pointers (char* argv[]), so it can be interpreted as an array of strings.
The first string, argv[0], is the name of the program itself; the arguments that were passed follow after it:
argv[0] contains a pointer to the string: "program name"
argv[1] contains a pointer to the argument: "17"
If you dereference argv[1] with * you get the first character at that address, here '1', which is 49 decimal in ASCII. Example:
p          r                  ("program name")
^          ^
argv[0]    (argv[0] + 1)
--------------------------------------------
1          7                  ("17")
^          ^
argv[1]    (argv[1] + 1)
How would I get it to read num = 17 as an int?
Check the number of passed arguments with argc, which also counts the program name. If there are 2, you can use strtol() to convert argv[1] to an integer. Prefer strtol() over atoi(), because atoi() offers no error checking: if it fails it simply returns 0, whereas strtol() reports where parsing stopped through its second argument and sets the global errno variable on overflow.
The following code uses the pointer that strtol() stores through its second argument to check for conversion errors. There are also overflow and underflow errors to check for, as described in other answers on SO. Moreover, you have to check whether the returned long value fits into an int if you want to store it in an int variable. For simplicity I've left that out:
#include <stdio.h>
#include <stdlib.h>
int main (int argc, char* argv[])
{
/* At least 1 argument passed? */
if (argc >= 2)
{
char* endptr;
long num = strtol(argv[1], &endptr, 10);
/* Were characters consumed? */
if (argv[1] != endptr)
{
printf("Entered number: %ld\n", num);
}
else
{
printf("Entered argument was not a number!\n");
}
}
else
{
printf("Usage: %s [number]!\n", argv[0]);
}
return 0;
}
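The range check that was left out could be sketched as follows. The helper name fits_in_int is hypothetical; on platforms where long and int have the same width the test is trivially true.

```c
#include <limits.h>
#include <stdbool.h>

// Hypothetical helper: true if value can be stored in an int unchanged.
bool fits_in_int(long value)
{
    return value >= INT_MIN && value <= INT_MAX;
}
```

A caller would test fits_in_int(num) before writing something like int n = (int) num;.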
Here's what you want to do:
#include <stdio.h>
#include <stdlib.h>
int main (int argc, char *argv []) {
int num = atoi (argv[1]);
printf ("Here's what you gave me: %d", num);
return 0;
}
Here's the documentation for atoi ().
argv is an array of strings, so argv[x] points to a string. atoi () accepts an ASCII string as input and returns an int.
Bonus: This is still a bit unsafe. Try running this program without passing it a parameter and see what happens.
Also, you must take a look at the documentation for strtol (), which is a safe way of doing this.

Error message: Conversion may lose significant digits

C Program to accept and display "5" characters using getchar() and putchar() functions:
#include<stdio.h>
void main()
{
char ch[6];
ch[0]=getchar();
ch[1]=getchar();
ch[2]=getchar();
ch[3]=getchar();
ch[4]=getchar();
putchar(ch[0]);
putchar(ch[1]);
putchar(ch[2]);
putchar(ch[3]);
putchar(ch[4]);
}
When i am Compiling this code in "C" language then it is displaying this Error message:
"Conversion may lose significant digits", what could be the reason?
getchar returns an int and you're storing it in a char.
You can fix this by changing char ch[6]; to int ch[6];.
getchar() returns an int, to allow for passing additional values which are not in the range of valid characters. This allows for failures and error codes to be returned as values without overloading a particular character with multiple meanings. In normal operation any valid character may be returned (as an unsigned char value converted to int); on failure or end of input, EOF is returned, a negative int that does not correspond to any valid character.
The warning message indicates that you are going to only look at the char portion of the returned value, and as such, you might oddly cast an error or special return value into a char that wasn't actually captured.
--- Edited at the request of gautham to demonstrate good error detection ---
The lack of error checking for input failure can be fixed in one of two ways. The first one is more common, and works on all systems where the size of an integer is greater than the size of a character.
// an ok approach which works for most systems
// provided that sizeof(int) != sizeof(char)
int c = getchar();
if (EOF != c) {
ch[0] = (char)c;
} else {
// some error occurred during input capture
// which resulted in getchar returning EOF
}
The second solution to error checking for input failure doesn't rely on the size of an integer being larger than the size of a character. It will work on all systems.
// a better approach which works for all systems
// even where sizeof(int) == sizeof(char)
int c = getchar();
if (!feof(stdin) && !ferror(stdin)) {
// even if c was EOF, we know it's a char value, not an error value
ch[0] = c;
} else {
// c's value is EOF because of an error capturing the input, and
// not because a char value equaling EOF was read.
}
putting it all together to rewrite your program
#include<stdio.h>
int main(int argc, char** argv)
{
int ch[6]; // int rather than char, so each getchar() result is stored without narrowing
ch[0]=getchar();
ch[1]=getchar();
ch[2]=getchar();
ch[3]=getchar();
ch[4]=getchar();
putchar(ch[0]);
putchar(ch[1]);
putchar(ch[2]);
putchar(ch[3]);
putchar(ch[4]);
return 0;
}
or if you have been introduced to procedures, the much better version of the above
#include <stdio.h>
#include <stdlib.h> // for exit() and EXIT_FAILURE
char getInput() {
int c = getchar();
if (!feof(stdin) && !ferror(stdin)) {
return (char)c;
} else {
exit(EXIT_FAILURE);
}
}
int main(int argc, char** argv)
{
char ch[6];
ch[0]=getInput();
ch[1]=getInput();
ch[2]=getInput();
ch[3]=getInput();
ch[4]=getInput();
putchar(ch[0]);
putchar(ch[1]);
putchar(ch[2]);
putchar(ch[3]);
putchar(ch[4]);
return 0;
}
and if you have learned a bit about looping, it can be rewritten with a loop like so
#include <stdio.h>
#include <stdlib.h> // for exit() and EXIT_FAILURE
char getInput() {
int c = getchar();
if (!feof(stdin) && !ferror(stdin)) {
return (char)c;
} else {
exit(EXIT_FAILURE);
}
}
int main(int argc, char** argv)
{
char ch[6];
int index;
for (index = 0; index < 5; index++) {
ch[index] = getInput();
}
for (index = 0; index < 5; index++) {
putchar(ch[index]);
}
return 0;
}
There are other checks you might want to add, like checking to see if your putchar failed due to an error during output; but, if you don't have an alternative means of presenting the error (like writing it to a file) then adding such checks only increase the complexity of the code without providing a means of communicating the error to the end user.
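For completeness, a sketch of what such an output check could look like, assuming the caller is able to do something useful with the failure. The helper name put_chars and its return convention are made up for this example; the only library fact relied on is that putchar returns EOF when the write fails.

```c
#include <stdio.h>

// Hypothetical helper: writes the first n characters of buf to stdout.
// Returns 0 on success, or -1 as soon as putchar() reports a failure.
int put_chars(const char *buf, size_t n)
{
    for (size_t i = 0; i < n; i++)
    {
        if (putchar(buf[i]) == EOF)
        {
            return -1; // output error; the caller decides how to report it
        }
    }
    return 0;
}
```

In the loop version above, the second for loop could be replaced by a call such as put_chars(ch, 5), with a non-zero result leading to a failure exit code.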
Error checking is one of the most important items in writing robust programs, but in most programming courses it is treated very lightly (if it is covered at all). If you don't get much discussion about error checking, do yourself a favor and independently read up on <errno.h>. A decent description of how to handle errors in C can be found in the GNU pages discussing error handling. A basic "print to stderr" error handler might look like this:
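// this defines errno (do not declare errno yourself; it may be a macro)
#include <errno.h>
// this defines fprintf and stderr
#include <stdio.h>
// this defines strerror
#include <string.h>
void printError(int value) {
    // print the message that corresponds to the given error number
    fprintf(stderr, "%s\n", strerror(value));
}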

Resources