UndefinedBehaviorSanitiser Issue - CS50 Vigenere - c

I am working on Vigenere (CS50) and keep getting an "UndefinedBehaviorSanitiser SEGV on Unknown Address" when I run my program with any argument that passes the initial screening.
I have read about this issue but cannot find the solution. I cut my code down as much as I could and found the problem occurs even when I do this part. Where is the issue?
Thank you so much.
#include <cs50.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
//int shift(char c);
int main(int argc, string argv[])
{
if (argc != 2)
{
printf("Usage: ./vigenere keyword");
return 1;
}
else
{
for (int i = 0; i < strlen(argv[1]); i++)
{
if (!isalpha(argv[1]))
{
printf("Usage: ./vigenere keyword");
return 1;
}
else
{
printf("All good!");
return 1;
}
}
}
}

The fix indeed is
if (!isalpha((unsigned char)argv[1][i]))
The isalpha function/macro takes only a single integer, which must have the value of a single character as unsigned char. But argv[1] is a pointer to multiple characters!
Now as an extra complication, the isalpha is often realized as a macro, and often coded so that a compiler does not produce any diagnostics for a wrong type of argument. This is unfortunate, but you just need to know these when you're programming in C.
The cast from char to unsigned char is required too. Without it, any extended character (for example ä) invokes undefined behaviour on platforms where char is signed, which includes the usual x86 ABIs, because the value will be negative, whereas isalpha expects either EOF or a non-negative number less than or equal to UCHAR_MAX.
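As a rough sketch of how the whole key check could look with that fix applied: the helper name is_alpha_key is made up for illustration and is not part of the original program. Unlike the loop in the question, which returns during its first iteration, this version only reports success after every character has been inspected.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

// Hypothetical helper (not from the original post): returns true only if
// `key` is non-empty and every character in it is alphabetic.
bool is_alpha_key(const char *key)
{
    if (key[0] == '\0')
    {
        return false;
    }
    for (size_t i = 0; key[i] != '\0'; i++)
    {
        // Index one character, then cast so the value passed is non-negative.
        if (!isalpha((unsigned char)key[i]))
        {
            return false;
        }
    }
    return true;
}
```

In main this would be called as is_alpha_key(argv[1]) once argc has been checked.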

Related

How can I get only one digit as an argument of the main function?

I have a problem with isdigit(). For an exercise, I need to pass only a numerical value to the main function.
This means that I can pass neither "3 4 5" nor "hello". My code works properly with these examples, but it does not work with the value "2x", even though it works with the value "x2".
This is the code:
#include <stdio.h>
#include <cs50.h>
#include <string.h>
#include <ctype.h>
int main (int argc, string argv[])
{
if ((argc == 2) && isdigit(*argv[1]))
{
printf("%s \n", argv[1]);
printf("%i \n", argc);
}
else
{
printf("You have to use only one numeric value \n");
}
}
I tried to change if ((argc == 2) && isdigit(*argv[1])) into if ((argc == 2) && isdigit(argv[1])) but I keep getting Segmentation fault.
Could you please help me? Thanks a lot!
isdigit(int character) checks only one character. That is why you pass *argv[1], which is the first character of the second element of argv. The first character of 2x is 2, so your program behaves as expected.
What you could do is get the length of the input using strlen(argv[1]) and then use a loop to check whether all characters in the string are digits. This however only works for decimal integers.
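A minimal sketch of that loop, assuming an empty argument should also be rejected. The function name is_all_digits is only an illustration.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

// Hypothetical helper: true if `s` is non-empty and consists only of
// decimal digits. Signs, spaces and decimal points are all rejected.
bool is_all_digits(const char *s)
{
    size_t len = strlen(s);
    if (len == 0)
    {
        return false;
    }
    for (size_t i = 0; i < len; i++)
    {
        // The cast keeps the argument within the range isdigit accepts.
        if (!isdigit((unsigned char)s[i]))
        {
            return false;
        }
    }
    return true;
}
```

The condition in the question would then become (argc == 2) && is_all_digits(argv[1]), which rejects both "2x" and "x2".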

"UndefinedBehaviorSanitizer:DEADLYSIGNAL" Error in Code

So I am currently taking this cs50 course and I can't seem to figure out what the error/crash mentioned relates to!
I am trying to make my code check if the input from the command line is an integer or not. For example, ./caesar 2 is valid but ./caesar 2z returns ./caesar key.
Error Message:
#include <cs50.h>
#include <stdio.h>
#include <ctype.h>
//string error_key = "./caesar key \n";
int main(int argc, string argv[])
{
if (argc == 2)
{
if (isdigit(argv[1]) == 0)
{
printf("./caesar key \n");
return 1;
}
else
{
int string_to_int = atoi(argv[1]);
printf("%i\n",string_to_int);
}
}
else
{
printf("./caesar key \n");
return 1;
}
}
This is undefined behavior:
isdigit(argv[1])
argv[1] is not a character, it's a pointer. As a pointer most likely has a value higher than can be represented in an unsigned char, it causes an error with LLVM's Undefined Behavior Sanitizer.
Quoting from POSIX, which is aligned to the ISO C standard:
The c argument is an int, the value of which the application shall ensure is a character representable as an unsigned char or equal to the value of the macro EOF. If the argument has any other value, the behavior is undefined.

How to check an edge case in taking command line argument in C and evaluating to int or double?

So I have an assignment to figure out whether a number on the command line is either an integer or a double.
I have mostly figured it out by doing:
sscanf(argv[x], "%lf", &d)
Where "d" is a double. I then cast it to an int and then subtract "d" with itself to check to see if it is 0.0 as such.
d - (int)d == 0.0
My problem is if the command line arguments contains doubles that can be technically classified as ints.
I need to classify 3.0 as a double whereas my solution considers it an int.
For example initializing the program.
a.out 3.0
I need it to print out
"3.0 is a double"
However right now it becomes
"3 is an int."
What would be a way to check for this? I did look around for similar problems which led me to the current solution but just this one edge case I do not know how to account for.
Thank you.
For example, a way like this:
#include <stdio.h>
int main(int argc, char *argv[]){
if(argc != 2){
puts("Need an argument!");
return -1;
}
int int_v, read_len = 0;
double double_v;
printf("'%s' is ", argv[1]);
//==1 : It was able to read normally.
//!argv[1][read_len] : It used all the argument strings.
if(sscanf(argv[1], "%d%n", &int_v, &read_len) == 1 && !argv[1][read_len])
puts("an int.");
else if(sscanf(argv[1], "%lf%n", &double_v, &read_len) == 1 && !argv[1][read_len])
puts("a double.");
else
puts("isn't the expected input.");
}
To test if a string will convert to an int and/or double (completely, without integer overflow, without undefined behavior), call strtol()/strtod(). @Tom Karzes
The trouble with a sscanf() approach is that the result is undefined behavior (UB) on overflow. To properly detect, use strtol()/strtod().
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>
bool is_int(const char *src) {
char *endptr;
// Clear, so it may be tested after strtol().
errno = 0;
// Using 0 here allows 0x1234, octal 0123 and decimal 1234.
// or use 10 to allow only decimal text.
long num = strtol(src, &endptr, 0 /* or 10 */);
#if LONG_MIN < INT_MIN || LONG_MAX > INT_MAX
if (num < INT_MIN || num > INT_MAX) {
errno = ERANGE;
}
#endif
return !errno && endptr > src && *endptr == '\0';
}
bool is_double(const char *src) {
char *endptr;
// errno is not tested after strtod() here, so it does not need clearing.
strtod(src, &endptr);
// In this case, detecting over/underflow IMO is not a concern.
return endptr > src && *endptr == '\0';
}
It is not entirely clear what the specific expectations are for your program, but it has at least something to do with the form of the input, since "3.0" must be classified as a double. If the form is all it should care about, then you should not try to convert the argument strings to numbers at all, for then you will run into trouble with unrepresentable values. In that case, you should analyze the character sequence of the argument to see whether it matches the pattern of an integer, and if not, whether it matches the pattern of a floating-point number.
For example:
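#include <ctype.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
    for (int arg_num = 1; arg_num < argc; arg_num++) {
        char *arg = argv[arg_num];
        int i = (arg[0] == '-' || arg[0] == '+') ? 1 : 0;  // skip any leading sign
        // scan through all the decimal digits; the cast avoids undefined
        // behavior when char is signed and the value is negative
        while (isdigit((unsigned char)arg[i])) {
            ++i;
        }
        // anything left over after the digits means "not integer form"
        printf("Argument %d is %s.\n", arg_num, arg[i] ? "floating-point" : "integer");
    }
}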
That makes several assumptions, chief among them:
the question is strictly about form, so that the properties of your system's built-in data types (such as int and double) are not relevant.
each argument will have the form of either an integer or a floating-point number, so that eliminating "integer" as a possibility leaves "floating-point" as the only alternative. If "neither" is a possibility that must also be accommodated, then you'll also need to compare the inputs that do not have integer form to a pattern for floating-point numbers, too.
only decimal (or smaller radix) integers need be accommodated -- not, for example, hexadecimal inputs.
Under those assumptions, particularly the first, it is not just unnecessary but counterproductive to attempt to convert the arguments to one of the built-in numeric data types, because you would then come to the wrong conclusion about arguments that, say, are not within the bounds of representable values for those types.
For example, consider how the program should classify "9000000000". It has the form of an integer, but supposing that your system's int type has 31 value bits, that type cannot accommodate a value as large as the one the string represents.
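If "neither" does have to be accommodated, the form check can be extended along these lines. This is only a sketch under my own assumptions: the names classify_form and enum form are invented here, and the floating-point pattern accepted is an optional sign, digits with an optional decimal point, and an optional decimal exponent (so no hexadecimal floats, "inf" or "nan").

```c
#include <ctype.h>
#include <stddef.h>

// Hypothetical result type for the form-only classification.
enum form { FORM_INTEGER, FORM_FLOATING, FORM_NEITHER };

// Classifies `s` purely by its character pattern; no conversion to a
// built-in numeric type is attempted, so magnitude does not matter.
enum form classify_form(const char *s)
{
    size_t i = 0;
    if (s[i] == '+' || s[i] == '-')
        i++;                                   // optional leading sign

    size_t int_digits = 0;
    while (isdigit((unsigned char)s[i])) { i++; int_digits++; }

    if (s[i] == '\0')                          // nothing but (sign and) digits
        return int_digits > 0 ? FORM_INTEGER : FORM_NEITHER;

    size_t frac_digits = 0;
    if (s[i] == '.') {                         // optional fractional part
        i++;
        while (isdigit((unsigned char)s[i])) { i++; frac_digits++; }
    }
    if (int_digits + frac_digits == 0)         // e.g. ".", "hello", "+x"
        return FORM_NEITHER;

    if (s[i] == 'e' || s[i] == 'E') {          // optional exponent
        i++;
        if (s[i] == '+' || s[i] == '-')
            i++;
        if (!isdigit((unsigned char)s[i]))     // exponent needs a digit
            return FORM_NEITHER;
        while (isdigit((unsigned char)s[i]))
            i++;
    }
    // Trailing characters such as the "x" in "2x" rule the input out.
    return s[i] == '\0' ? FORM_FLOATING : FORM_NEITHER;
}
```

With this, "3.0" is classified as floating-point, "3" and "9000000000" as integers, and "2x" or "hello" as neither, regardless of what int or double can represent on the system.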
int main (int argc,char *argv[])
{
if(argc==2)
{
int i;
double d;
d=atof(argv[1]);
i=atoi(argv[1]);
if(d!=i)
printf("%s is a double.",argv[1]);
else if(d==i)
printf("%s is an int.",argv[1]);
}
else
printf("Invalid input\n");
return 0;
}
You must add #include <stdlib.h>

How do I write a program in C that counts the number of numbers as an argument?

I'm just toying with the
int main(int argc, int *argv[void])
function, and im trying to make a program that reads the number of number arguments.
Theoretically (in my own crazy delusional mind), this should work:
#include <stdio.h>
int main(int argc, char *argv[])
{
int count;
printf("%d\n", sizeof(int));
}
but no matter what i put as the argument in the command line, i always get 4 (4 bytes in a word?)
How can I tweak this code a little so that when i type
./program 9 8 2 7 4 3 1
i get:
7
much appreciated!
argc is the number of command-line arguments passed in, including the program name. Use it as the upper bound when indexing into the second argument to main, argv. If you want all the arguments excluding the first one (the program name), decrement argc and increment argv.
#include <stdio.h>
#include <errno.h>
#include <stdlib.h>
int main(int argc, char *argv[])
{
/*
* forget about the program name.
*/
argv++;
argc--;
int i;
unsigned int totalNumbers = 0;
printf("Total number of arguments: %d\n", argc);
for(i = 0; i < argc; i++) {
printf("argv[%d]=%s\n", i, argv[i]);
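char *end;
errno = 0;
strtol(argv[i], &end, 10);
/* numeric only if some digits were consumed, nothing trails them,
 * and the value did not overflow a long */
if(end != argv[i] && *end == '\0' && errno != ERANGE)
totalNumbers++;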
}
printf("Total number of numeric arguments: %u\n",
totalNumbers);
return 0;
}
As others have pointed out in the comments, sizeof doesn't do quite what you think it does.
You are given argc and argv. The second of these is an array of string corresponding to the things on the command line. This argv array of strings is argc long, and the first element of it is likely to hold the name of the executable program.
You need to loop through the remaining elements of argv (if there are any) and see which ones are numbers, as opposed to non-numbers.
To check if a string is a number or not, we can use strtol() (from stdlib.h) to try to convert it into a long. If the conversion fails, it's not a number. If you'd like to accept floating point values, then use strtod() instead, it works almost in the same way (doesn't take the last argument that strtol() does). EDIT: I actually changed the code to use strtod() instead since it accepts a larger variety of "numbers".
The conversion fails if the string is empty from the start, or if the pointer that we supply to the function (endptr) doesn't point to the very end of the string after calling it.
Then, if the argument is a number, simply count it, and at the end tell the user what he or she probably already knew.
What you're doing here is called validating user input and it's a really good thing to know how to do. Don't trust users to give you numbers just because you ask them to. Check to see if they really are numbers by reading in strings and trying to convert them.
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
bool is_number(const char *string)
{
char *endptr;
strtod(string, &endptr);
return (*string != '\0' && *endptr == '\0');
}
int main(int argc, char **argv)
{
int i;
int numcount = 0;
for (i = 1; i < argc; ++i) {
if (is_number(argv[i]))
numcount++;
}
printf("There were %d numbers on the command line\n", numcount);
return EXIT_SUCCESS;
}
Running it:
$ ./a.out 1 2 3 a b c -.5 +20 1e20
There were 6 numbers on the command line
$ ./a.out nt 12 ,e2 2 21n 1 -8
There were 4 numbers on the command line

Error message: Conversion may lose significant digits

C Program to accept and display "5" characters using getchar() and putchar() functions:
#include<stdio.h>
void main()
{
char ch[6];
ch[0]=getchar();
ch[1]=getchar();
ch[2]=getchar();
ch[3]=getchar();
ch[4]=getchar();
putchar(ch[0]);
putchar(ch[1]);
putchar(ch[2]);
putchar(ch[3]);
putchar(ch[4]);
}
When i am Compiling this code in "C" language then it is displaying this Error message:
"Conversion may lose significant digits", what could be the reason?
getchar returns an int and you're storing it in a char.
You can fix this by changing char ch[6]; to int ch[6];.
getchar() returns an int so that it can report values outside the range of valid characters. This allows failures to be signalled without overloading a particular character with multiple meanings. In normal operation any valid character may be returned, as a non-negative value; on end-of-file or error, getchar() returns EOF, a negative int that does not correspond to any character.
The warning message indicates that you are going to only look at the char portion of the returned value, and as such, you might oddly cast an error or special return value into a char that wasn't actually captured.
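To make that concrete, here is a small sketch of reading a handful of characters while keeping the return value in an int until EOF has been ruled out. The helper read_chars is hypothetical, and it uses fgetc on a FILE * (getchar() behaves like getc(stdin)) only so that it can be tried on streams other than stdin. It assumes the common case where int is wider than char.

```c
#include <stdio.h>

// Hypothetical helper: reads up to `max` characters from `in` into `buf`
// and returns how many were stored. Stops early at end-of-file or error.
size_t read_chars(FILE *in, char *buf, size_t max)
{
    size_t n = 0;
    while (n < max)
    {
        int c = fgetc(in);      // keep the result in an int so EOF stays distinct
        if (c == EOF)
        {
            break;              // end of input or read error: store nothing
        }
        buf[n++] = (char)c;     // narrowing is fine once EOF is ruled out
    }
    return n;
}
```

For the program in the question this would be called as read_chars(stdin, ch, 5), and the returned count says how many elements of ch are safe to print.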
--- Edited at the request of gautham to demonstrate good error detection ---
The lack of error checking for input failure can be fixed in one of two ways. The first one is more common, and works on all systems where the size of an int is greater than the size of a char.
// an ok approach which works for most systems
// provided that sizeof(int) != sizeof(char)
int c = getchar();
if (EOF != c) {
ch[0] = (char)c;
} else {
// some error occurred during input capture
// which resulted in getchar returning EOF
}
The second solution to error checking for input failure doesn't rely on the size of an integer being larger than the size of a character. It will work on all systems.
// a better approach which works for all systems
// even where sizeof(int) == sizeof(char)
c = getchar();
if (!feof(stdin) && !ferror(stdin)) {
// even if c was EOF, we know it's a char value, not an error value
ch[0] = c;
} else {
// c's value is EOF because of an error capturing the input, and
// not because a char value equaling EOF was read.
}
putting it all together to rewrite your program
#include<stdio.h>
int main(int argc, char** argv)
{
char ch[6];
ch[0]=getchar();
ch[1]=getchar();
ch[2]=getchar();
ch[3]=getchar();
ch[4]=getchar();
putchar(ch[0]);
putchar(ch[1]);
putchar(ch[2]);
putchar(ch[3]);
putchar(ch[4]);
return 0;
}
or if you have been introduced to procedures, the much better version of the above
#include <stdio.h>
// needed for exit() and EXIT_FAILURE
#include <stdlib.h>
char getInput() {
int c = getchar();
if (!feof(stdin) && !ferror(stdin)) {
return (char)c;
} else {
exit(EXIT_FAILURE);
}
}
int main(int argc, char** argv)
{
char ch[6];
ch[0]=getInput();
ch[1]=getInput();
ch[2]=getInput();
ch[3]=getInput();
ch[4]=getInput();
putchar(ch[0]);
putchar(ch[1]);
putchar(ch[2]);
putchar(ch[3]);
putchar(ch[4]);
return 0;
}
and if you have learned a bit about looping, it can be rewritten with a loop like so
#include <stdio.h>
// needed for exit() and EXIT_FAILURE
#include <stdlib.h>
char getInput() {
int c = getchar();
if (!feof(stdin) && !ferror(stdin)) {
return (char)c;
} else {
exit(EXIT_FAILURE);
}
}
int main(int argc, char** argv)
{
char ch[6];
int index;
for (index = 0; index < 5; index++) {
ch[index] = getInput();
}
for (index = 0; index < 5; index++) {
putchar(ch[index]);
}
return 0;
}
There are other checks you might want to add, like checking to see if your putchar failed due to an error during output; but, if you don't have an alternative means of presenting the error (like writing it to a file) then adding such checks only increase the complexity of the code without providing a means of communicating the error to the end user.
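For what it's worth, if you do have somewhere to report the problem, an output check could look roughly like this. The helper write_chars is invented for illustration; it uses fputc, which like putchar returns EOF on a write error. Keep in mind that with buffered streams a failure may only show up when the buffer is flushed, so this is a sketch rather than a complete treatment.

```c
#include <stdio.h>

// Hypothetical helper: writes `n` characters from `buf` to `out`.
// Returns 0 on success, or -1 as soon as one write reports an error.
int write_chars(FILE *out, const char *buf, size_t n)
{
    for (size_t i = 0; i < n; i++)
    {
        // fputc returns the character written, or EOF on failure.
        if (fputc((unsigned char)buf[i], out) == EOF)
        {
            return -1;
        }
    }
    return 0;
}
```

A caller might use it as: if (write_chars(stdout, ch, 5) != 0) perror("write_chars");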
Error checking is one of the most important items in writing robust programs, but in most programming courses it is treated very lightly (if it is covered at all). If you don't get much discussion about error checking, do yourself a favor and independently read over <errno.h>. A decent description of how to handle errors in C can be found in the GNU pages discussing error handling. A basic "print to stderr" error handler might look like
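// this defines errno (a macro, so it must not be redeclared by hand)
#include <errno.h>
// this declares fprintf and stderr
#include <stdio.h>
// this declares strerror
#include <string.h>
// print the message for an error code, e.g. printError(errno)
void printError(int value) {
    fprintf(stderr, "%s\n", strerror(value));
}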

Resources