I have the following code to read an argument from the command line. If the string is one character long and a digit, I want to use it as the exit value. The compiler gives me a warning on the second line (array subscript has type 'char'). The warning comes from the second part, after the "&&".
if (args[1] != NULL) {
if ((strlen(args[1]) == 1) && isdigit(*args[1]))
exit(((int) args[1][0]));
else
exit(0);
}
}
Also, when I use a different compiler I get two warnings on the next line (exit).
builtin.c: In function 'builtin_command':
builtin.c:55: warning: implicit declaration of function 'exit'
builtin.c:55: warning: incompatible implicit declaration of built-in function 'exit'
The trouble is that the isdigit() macro takes an argument which is an integer that is either the value EOF or the value of an unsigned char.
ISO/IEC 9899:1999 (C Standard – old), §7.4 Character handling <ctype.h>, ¶1:
In all cases the argument is an int, the value of which shall be
representable as an unsigned char or shall equal the value of the macro EOF. If the
argument has any other value, the behavior is undefined.
On your platform, char is signed, so if you have a character in the range 0x80..0xFF, it will be treated as a negative integer. The usual implementation of the isdigit() macros is to use the argument to index into an array of flag bits. Therefore, if you pass a char from the range 0x80..0xFF, you will be indexing before the start of the array, leading to undefined behaviour.
#define isdigit(x) (_CharType[(x)+1]&_Digit)
You can safely use isdigit() in either of two ways:
int c = getchar();
if (isdigit(c))
...
or:
if (isdigit((unsigned char)*args[1]))
...
In the latter case, you know that the value won't be EOF. Note that this is not OK:
int c = *args[1];
if (isdigit(c)) // Undefined behaviour if *args[1] in range 0x80..0xFF
...
The warning about 'implicit declaration of function exit' means you did not include <stdlib.h>, but you should have done so.
You might also notice that if the user gives you a 2 as the first character of the first argument, the exit status will be 50, not 2, because '2' is (normally, in ASCII and UTF-8 and 8859-1, etc) character code 50 ('0' is 48, etc). You'd get 2 (no quotes) by using *args[1] - '0' as the argument to exit(). You don't need a cast on that expression, though it won't do much harm.
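Putting these points together, a minimal sketch might look like the following. The helper name exit_status_from_arg is hypothetical (it does not appear in the question), and it assumes the intent is to return the digit's numeric value and 0 for anything else.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 0..9 if arg is exactly one decimal digit,
 * and 0 otherwise (including when arg is NULL). */
int exit_status_from_arg(const char *arg)
{
    if (arg == NULL || strlen(arg) != 1)
        return 0;

    /* Cast to unsigned char so a negative char never reaches isdigit(). */
    if (!isdigit((unsigned char)arg[0]))
        return 0;

    /* '0'..'9' are guaranteed to be contiguous, so this yields 0..9. */
    return arg[0] - '0';
}
```

A caller that has included <stdlib.h> could then write exit(exit_status_from_arg(args[1]));.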
It seems that the compiler you use has a macro for isdigit (and not a function, you wouldn't have a warning if it was the case) that uses the argument as a subscript for an array.
That's why isdigit takes an INT as argument, not a char.
One way to remove the warning is to cast your char to int (a cast to unsigned char is safer, because a negative char stays negative as an int):
isdigit(*args[1]) => isdigit((int)(*args[1]))
The second warning means you are using the exit function before it has been declared, so you need to add the required #include:
#include <stdlib.h>
This is the standard C library header that declares exit(int).
BTW, if this code is in your main function, it is better to check that argc (the int parameter) is greater than 1 rather than testing args[1] against NULL. The standard guarantees that argv[argc] is a null pointer, so the NULL test works whenever argc is at least 1, but args[1] lies outside the array if argc is 0.
It is not completely clear what you want to give as exit code, but probably you want args[1][0] - '0' for the decimal value that the character represents and not the code of the character.
If you do it like that, you'd have the side effect that the type of that expression is int and you wouldn't see the warning.
For exit you probably forgot to include the header file.
Try changing
isdigit(*args[1])
to
isdigit(args[1][0])
Your other errors are because you aren't using #include <stdlib.h>, which declares the exit function.
#include <ctype.h>
#include <stdio.h>
int atoi(char *s);
int main()
{
printf("%d\n", atoi("123"));
}
int atoi(char *s)
{
int i;
while (isspace(*s))
s++;
int sign = (*s == '-') ? -1 : 1;
/* same mistake for passing pointer to isdigit, but will not cause CORE DUMP */
// isdigit(s), s++;// this will not lead to core dump
// return -1;
/* */
/* I know s is a pointer, but I don't quite understand why code above will not and code below will */
if (!isdigit(s))
s++;
return -1;
/* code here will cause CORE DUMP instead of a compile-time error */
for (i = 0; *s && isdigit(s); s++)
i = i * 10 + (*s - '0');
return i * sign;
}
I got "Segmentation fault (core dumped)" when I accidentally left out the * operator before 's', and the error confused me.
Why does "(!isdigit(s))" lead to a core dump while "isdigit(s), s++;" does not?
From isdigit [emphasis added]
The behavior is undefined if the value of ch is not representable as unsigned char and is not equal to EOF.
From isdigit [emphasis added]
The c argument is an int, the value of which the application shall ensure is a character representable as an unsigned char or equal to the value of the macro EOF. If the argument has any other value, the behavior is undefined.
https://godbolt.org/z/PEnc8cW6T
Undefined behaviour means the program may execute incorrectly (either crashing or silently generating incorrect results), or it may fortuitously do exactly what the programmer intended.
All answers so far have failed to point out the actual problem, which is that implicit pointer-to-integer conversions are not allowed during assignment in C. Details here: "Pointer from integer/integer from pointer without a cast" issues
Specifically C17 6.5.2.2/7
If the expression that denotes the called function has a type that does include a prototype,
the arguments are implicitly converted, as if by assignment, to the types of the
corresponding parameters
Where "as if by assignment" sends us to check the rules of assignment 6.5.16.1, which are quoted in the above link. So isdigit(s) is equivalent to something like this:
char* s;
...
int param_to_isdigit = s; // constraint violation of 6.5.16.1
Here the compiler must issue a diagnostic message. If you didn't spot it, or if your tool chain gives warnings instead of errors, check out "What compiler options are recommended for beginners learning C?" so that code like this fails to compile and you don't spend time troubleshooting bugs that the compiler already spotted for you.
Furthermore, the ctype.h functions require that the passed integer must be representable as unsigned char, but that's another story. C17 7.4 Character handling <ctype.h>:
In all cases the argument is an int, the value of which shall be
representable as an unsigned char or shall equal the value of the macro EOF
You are invoking undefined behavior. isdigit() is supposed to receive an int argument, but you pass in a pointer. This is effectively attempting to assign a pointer to an int (xref: Language / Expressions / Assignment operators / Simple assignment, ¶1).
Furthermore, there is a constraint that the argument to isdigit() be representable as an unsigned char or equal to EOF. (xref: Library / Character handling <ctype.h>, ¶1).
As a guess, the isdigit() function may be performing some kind of table lookup, and the input value may cause the function to access memory beyond the table.
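For comparison, here is a sketch of how the conversion routine could look with both problems addressed: the pointer is dereferenced, and the character is cast to unsigned char before it reaches the ctype functions. The name my_atoi is hypothetical (chosen to avoid clashing with atoi in <stdlib.h>). Unlike the question's code, it steps past the sign and returns 0 rather than -1 when no digits are found; overflow is not handled.

```c
#include <ctype.h>

/* Hypothetical corrected version of the question's atoi().
 * Overflow is not detected. */
int my_atoi(const char *s)
{
    int sign = 1;
    int value = 0;

    /* Pass the character (*s), not the pointer (s), and cast it. */
    while (isspace((unsigned char)*s))
        s++;

    /* Consume an optional sign so the digit loop starts on a digit. */
    if (*s == '-') {
        sign = -1;
        s++;
    } else if (*s == '+') {
        s++;
    }

    /* isdigit() is false for '\0', so the loop stops at the terminator. */
    for (; isdigit((unsigned char)*s); s++)
        value = value * 10 + (*s - '0');

    return sign * value;
}
```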
Why no segfault from isdigit(s), s++;?
First of all. Undefined behavior can manifest itself in a lot of ways, including the program working as intended. That's what undefined means.
But that line is not equivalent to your if statement. It executes isdigit(s), throws away the result, increments s, and also throws away the result of that operation.
However, isdigit does not have side effects, so it's quite probable that the compiler simply removes the call to that function and replaces this line with an unconditional s++. That would explain why it does not segfault. You would have to study the generated assembly to make sure, but it's a possibility.
You can read about the comma operator here What does the comma operator , do?
I wasn't able to repeat this behaviour in MacOS/Darwin, but I was able to in Debian Linux.
To investigate a bit further, I wrote the following program:
#include <ctype.h>
#include <stdio.h>
int main()
{
printf("isalnum('a'): %d\n", isalnum('a'));
printf("isalpha('a'): %d\n", isalpha('a'));
printf("iscntrl('\n'): %d\n", iscntrl('\n'));
printf("isdigit('1'): %d\n", isdigit('1'));
printf("isgraph('a'): %d\n", isgraph('a'));
printf("islower('a'): %d\n", islower('a'));
printf("isprint('a'): %d\n", isprint('a'));
printf("ispunct('.'): %d\n", ispunct('.'));
printf("isspace(' '): %d\n", isspace(' '));
printf("isupper('A'): %d\n", isupper('A'));
printf("isxdigit('a'): %d\n", isxdigit('a'));
printf("isdigit(0x7fffffff): %d\n", isdigit(0x7fffffff));
return 0;
}
In MacOS, this just prints out 1 for every result except the last one, implying that these functions are simply returning the result of a logical comparison.
The results are a bit different in Linux:
isalnum('a'): 8
isalpha('a'): 1024
iscntrl('\n'): 2
isdigit('1'): 2048
isgraph('a'): 32768
islower('a'): 512
isprint('a'): 16384
ispunct('.'): 4
isspace(' '): 8192
isupper('A'): 256
isxdigit('a'): 4096
Segmentation fault
This suggests to me that the library used in Linux is fetching values from a lookup table, indexed by the argument provided, and masking them with a bit pattern corresponding to the function called. For example, '1' (ASCII 49) is an alphanumeric character, a digit, a graphical character, a printable character and a hex digit, so entry 49 in this lookup table is probably equal to 8+2048+32768+16384+4096, which is 55304.
The documentation for these functions does mention that the argument must have either the value of an unsigned char (0-255) or EOF (-1), so any value outside this range is causing this table to be read out of bounds, resulting in a segmentation error.
Since I'm only calling the isdigit() function with an integer argument, this can hardly be described as undefined behaviour. I really think the library functions should be hardened against this sort of problem.
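To make the table-lookup idea concrete, here is a toy sketch. It is not glibc's implementation: the names toy_table, TOY_ISDIGIT_UNCHECKED and toy_isdigit are invented for illustration, and it assumes EOF is -1 and unsigned char tops out at 255. The unchecked macro mirrors the classic style and reads outside the table for out-of-range arguments; the function shows what a range check would look like.

```c
/* Flag bits for the toy classification table. */
enum { TOY_DIGIT = 1 << 0, TOY_XDIGIT = 1 << 1 };

/* 257 entries: slot 0 is for EOF (assumed to be -1), slots 1..256 hold
 * the flags for character values 0..255. Unlisted slots are zero. */
static const unsigned char toy_table[257] = {
    ['0' + 1] = TOY_DIGIT | TOY_XDIGIT, ['1' + 1] = TOY_DIGIT | TOY_XDIGIT,
    ['2' + 1] = TOY_DIGIT | TOY_XDIGIT, ['3' + 1] = TOY_DIGIT | TOY_XDIGIT,
    ['4' + 1] = TOY_DIGIT | TOY_XDIGIT, ['5' + 1] = TOY_DIGIT | TOY_XDIGIT,
    ['6' + 1] = TOY_DIGIT | TOY_XDIGIT, ['7' + 1] = TOY_DIGIT | TOY_XDIGIT,
    ['8' + 1] = TOY_DIGIT | TOY_XDIGIT, ['9' + 1] = TOY_DIGIT | TOY_XDIGIT,
    ['a' + 1] = TOY_XDIGIT, ['b' + 1] = TOY_XDIGIT, ['c' + 1] = TOY_XDIGIT,
    ['d' + 1] = TOY_XDIGIT, ['e' + 1] = TOY_XDIGIT, ['f' + 1] = TOY_XDIGIT,
    ['A' + 1] = TOY_XDIGIT, ['B' + 1] = TOY_XDIGIT, ['C' + 1] = TOY_XDIGIT,
    ['D' + 1] = TOY_XDIGIT, ['E' + 1] = TOY_XDIGIT, ['F' + 1] = TOY_XDIGIT,
};

/* Unchecked lookup: any c outside -1..255 indexes outside the table. */
#define TOY_ISDIGIT_UNCHECKED(c) (toy_table[(c) + 1] & TOY_DIGIT)

/* Range-checked variant: rejects out-of-range values before indexing. */
int toy_isdigit(int c)
{
    if (c < -1 || c > 255)
        return 0;
    return toy_table[c + 1] & TOY_DIGIT;
}
```

The range check costs a comparison on every call, which may be one reason real libraries leave it out and rely on the documented precondition instead.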
I have the following program that causes a segmentation fault.
#include <stdio.h>
#include <string.h>
#include <ctype.h>
int main(int argc, char *argv[])
{
printf("TEST");
for (int k=0; k<(strlen(argv[1])); k++)
{
if (!isalpha(argv[1])) {
printf("Enter only alphabets!");
return 1;
}
}
return 0;
}
I've figured out that it is this line that is causing the problem
if (!isalpha(argv[1])) {
and replacing argv[1] with argv[1][k] solves the problem.
However, I find it rather curious that the program results in a segmentation fault without even printing TEST. I also expected the isalpha function to (incorrectly) check the low byte of the char* pointer argv[1], but this doesn't seem to be the case. I have code that checks the number of arguments, but it isn't shown here for brevity.
What's happening here?
In general it is rather pointless to discuss why undefined behaviour leads to this result or the other.
But maybe it doesn't harm to try to understand why something happens even if it is outside the spec.
There are implementations of isalpha which use a simple array to look up all possible unsigned char values. In that case the value passed as the parameter is used as an index into the array.
While a real character is limited to 8 bits, an integer is not.
The function takes an int as parameter. This is to allow entering EOF as well which does not fit into unsigned char.
If you pass an address like 0x7239482342 into your function this is far beyond the end of the said array and when the CPU tries to read the entry with that index it falls off the rim of the world. ;)
Calling isalpha with such an address is the place where the compiler should raise some warning about converting a pointer to an integer. Which you probably ignore...
The library might contain code that checks for valid parameters but it might also just rely on the user not passing things that shall not be passed.
printf was not flushed.
The implicit conversion from pointer to integer, which ought to have generated at least a compile-time diagnostic for a constraint violation, produced a number that was out of range for isalpha. With isalpha implemented as a look-up table, your code accessed the table out of bounds, hence the undefined behaviour.
Why you didn't get diagnostics might be in one part because of how isalpha is implemented as a macro. On my computer with Glibc 2.27-3ubuntu1, isalpha is defined as
# define isalpha(c) __isctype((c), _ISalpha)
# define __isctype(c, type) \
((*__ctype_b_loc ())[(int) (c)] & (unsigned short int) type)
The macro contains an unfortunate cast to int, which will silence the diagnostic!
One reason why I am posting this answer after so many others is that you didn't fix the code, it still suffers from undefined behaviour given extended characters and char being signed (which happens to be generally the case on x86-32 and x86-64).
The correct argument to give to isalpha is (unsigned char)argv[1][k]! C11 7.4:
In all cases the argument is an int, the value of which shall be representable as an unsigned char or shall equal the value of the macro EOF. If the argument has any other value, the behavior is undefined.
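A minimal sketch of the check with the cast applied is shown here. The helper name is_all_alpha is hypothetical; it is not part of the question's code.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if every character of s is alphabetic
 * in the current locale, 0 otherwise. An empty string yields 1. */
int is_all_alpha(const char *s)
{
    for (size_t k = 0; s[k] != '\0'; k++) {
        /* Index the string to get one character, then cast so that a
         * negative char value never reaches isalpha(). */
        if (!isalpha((unsigned char)s[k]))
            return 0;
    }
    return 1;
}
```

Stopping at the terminator also avoids calling strlen on every iteration, as the loop condition in the question does.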
I find it rather curious that the program results in a segmentation fault without even printing TEST
printf doesn't print instantly; it writes to a temporary buffer. End your string with \n if you want it flushed to the actual output (this works when stdout is line-buffered, as it usually is on a terminal), or call fflush(stdout).
and replacing argv[1] with argv[1][k] solves the problem.
isalpha is intended to work with single characters.
First of all, a conforming compiler must give you a diagnostic message here. It is not allowed to implicitly convert from a pointer to the int parameter that isalpha expects. (It is a violation of the rules of simple assignment, 6.5.16.1.)
As for why "TEST" isn't printed, it could simply be because stdout isn't flushed. You could try adding fflush(stdout); after printf and see if this solves the issue. Alternatively add a line feed \n at the end of the string.
Otherwise, the compiler is free to re-order the execution of code as long as there are no side effects. That is, it is allowed to execute the whole loop before the printf("TEST");, as long as it prints TEST before it potentially prints "Enter only alphabets!". Such optimizations are probably not likely to happen here, but in other situations they can occur.
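As a small illustration of the flushing advice, the sketch below writes a message and flushes it right away, so that it reaches the output even if the program crashes on the next statement. The function name print_progress is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: writes msg to stdout and flushes immediately.
 * Returns 0 on success and EOF if either step fails. */
int print_progress(const char *msg)
{
    if (fputs(msg, stdout) == EOF)
        return EOF;

    /* Without this, the text may sit in the stdio buffer and be lost
     * if the process is killed by a signal such as SIGSEGV. */
    return fflush(stdout);
}
```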
I think the title does not suit well for my question. (I appreciate it, if someone suggests an Edit)
I am learning C with "Learn C The Hard Way". I am using printf to output values using format specifiers. This is my code snippet:
#include <stdio.h>
int main()
{
int x = 10;
float y = 4.5;
char c = 'c';
printf("x=%d\n", x);
printf("y=%f\n", y);
printf("c=%c\n", c);
return 0;
}
This works as I expect it to. I wanted to test its behavior when it comes to conversion. Everything was OK until I tried to break it by converting char to float with this line:
printf("c=%f\n", c);
Ok, I'm compiling it and this is the output:
~$ cc ex2.c -o ex2
ex2.c: In function ‘main’:
ex2.c:13:3: warning: format ‘%f’ expects argument of type ‘double’, but argument 2 has type ‘int’ [-Wformat=]
printf("c=%f\n", c);
^
The warning clearly tells me that it cannot convert from int to float, but this does not prevent the compiler from producing an executable. The confusing part is what happens when I run it:
~$ ./ex2
x=10
y=4.500000
c=c
c=4.500000
As you can see, printf prints the last float value it printed before. I tested it with other values for y, and in each case it prints the value of y for c. Why does this happen?
Your compiler is warning you about the undefined behaviour you have. Anything can happen. Anything from seeming to work to nasal demons. A good reference on the subject is What Every C Programmer Should Know About Undefined Behavior.
Normally, int can convert to double just fine:
int i = 10;
double d = i; //works fine
printf is a special kind of function. Since it can take any number of arguments, the types have to match exactly. When given a char, it is promoted to int when passed in. printf, however, uses the %f you gave it to get a double. That's not going to work.
Here is how one would implement their own variadic function, taken from here:
#include <stdarg.h>
int add_nums(int count, ...)
{
int result = 0;
va_list args;
va_start(args, count);
for (int i = 0; i < count; ++i) {
result += va_arg(args, int);
}
va_end(args);
return result;
}
count is the number of arguments that follow. There is no way for the function to know this without being told. printf can deduce it from the format specifiers in the string.
The other relevant part is the loop. It will execute count times. Each time, it uses va_arg to get the next argument. Notice how it gives va_arg the type. This type is assumed. The function needs to rely on the caller to pass in something that gets promoted to int in order for the va_arg call to work properly.
In the case of printf, it has a defined list of format specifiers that each tell it which type to use. %d is int. %f is double. %c is also int because char is promoted to int, but printf then needs to represent that integer as a character when forming output.
Thus, any function that takes variadic arguments needs some caller cooperation. Another thing that could go wrong is giving printf too many format specifiers. It will blindly go and get the next argument, but there are no more arguments. Uh-oh.
If all of this isn't enough, the standard explicitly says for fprintf (which it defines printf in terms of) in C11 (N1570) §7.21.6.1/9:
If any argument is not the correct type for the corresponding conversion specification, the behavior is undefined.
All in all, thank your compiler for warning you when you are not cooperating with printf. It can save you from some pretty bad results.
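To show how a format string can drive the choice of type, here is a toy sketch in the spirit of printf. The function sum_by_format and its one-letter format ('d' for int, 'f' for double) are invented for illustration and are not how any real printf is written.

```c
#include <stdarg.h>

/* Hypothetical helper: fmt holds one letter per argument, 'd' for int
 * and 'f' for double. Returns the sum of the arguments as a double. */
double sum_by_format(const char *fmt, ...)
{
    double total = 0.0;
    va_list args;

    va_start(args, fmt);
    for (; *fmt != '\0'; ++fmt) {
        if (*fmt == 'd')
            total += va_arg(args, int);     /* char and short arrive as int */
        else if (*fmt == 'f')
            total += va_arg(args, double);  /* float arrives as double */
        /* Any other letter is ignored and consumes no argument. */
    }
    va_end(args);
    return total;
}
```

Calling sum_by_format("f", 'c') would repeat the mistake from the question: the function would ask va_arg for a double where an int was passed, which is undefined behaviour.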
Since printf is a varargs function, parameters cannot be converted automatically to the type expected by the function. When varargs functions are called, parameters undergo certain standard conversions, but these will not convert between different fundamental types, such as between integer and float. It's the programmer's responsibility to ensure that the type of each argument to printf is appropriate for the corresponding format specifier. Some compilers will warn about mismatches because they do extra checking for printf, but the language doesn't allow them to convert the type -- printf is just a library function, calls to it must follow the same rules as any other function.
Here is a very general description, which may be slightly different depending on the compiler in use...
When printf("...",a,b,c) is invoked:
The address of the string "..." is pushed into the stack.
The values of each of the variables a, b, c are pushed into the stack:
Integer values shorter than 4 bytes are expanded to 4 bytes when pushed into the stack.
Floating-point values shorter than 8 bytes are expanded to 8 bytes when pushed into the stack.
The Program Counter (or as some call it - Instruction Pointer) jumps to the address of function printf in memory, and execution continues from there.
For every % character in the string pointed to by the first argument passed to function printf, the function loads the corresponding argument from the stack, and then, based on the type specified after the % character, computes the data to be printed.
When printf("%f",c) is invoked:
The address of the string "%f" is pushed into the stack.
The value of the variable c is expanded to 4 bytes and pushed into the stack.
The Program Counter (or as some call it - Instruction Pointer) jumps to the address of function printf in memory, and execution continues from there.
Function printf sees %f in the string pointed to by the first argument, and loads 8 bytes of data from the stack. As you can probably understand, this yields "junk data" in the good scenario and a memory access violation in the bad scenario.
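If the goal really is to print a char's numeric code with %f, one way to keep the argument and the specifier in agreement is an explicit cast to double. The sketch below uses snprintf so the result can be inspected; the helper name format_char_as_double is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: formats the numeric code of c with %f into buf.
 * Returns whatever snprintf returns (the length that would be written,
 * or a negative value on error). */
int format_char_as_double(char *buf, size_t size, char c)
{
    /* The cast makes the argument a genuine double, matching %f. */
    return snprintf(buf, size, "%f", (double)c);
}
```

The equivalent direct call would be printf("c=%f\n", (double)c);.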
I have this code:
#include <ctype.h>
char *tokenHolder[2500];
for(i = 0; tokenHolder[i] != NULL; ++i){
if(isdigit(tokenHolder[i])){ printf("worked"); }
Where tokenHolder holds the input of char tokens from user input which have been tokenized through getline and strtok. I get a seg fault when trying to use isdigit on tokenHolder — and I'm not sure why.
Since tokenHolder is an array of char *, when you index tokenHolder[i], you are passing a char * to isdigit(), and isdigit() does not accept pointers.
You are probably missing a second loop, or you need:
if (isdigit(tokenHolder[i][0]))
printf("working\n");
Don't forget the newline.
Your test in the loop is odd too; you normally spell 'null pointer' as 0 or NULL and not as '\0'; that just misleads people.
Also, you need to pay attention to the compiler warnings you are getting! Don't post code that compiles with warnings, or (at the least) specify what the warnings are so people can see what the compiler is telling you. You should be aiming for zero warnings with the compiler set to fussy.
If you are trying to test that the values in the token array are all numbers, then you need a test_integer() function that tries to convert the string to a number and lets you know if the conversion does not use all the data in the string (or you might allow leading and trailing blanks). Your problem specification isn't clear on exactly what you are trying to do with the string tokens that you've found with strtok() etc.
As to why you are getting the core dump:
The code for the isdigit() macro is often roughly
#define isdigit(x) (_Ctype[(x)+1]&_DIGIT)
When you provide a pointer, it is treated as a very large (positive or possibly negative) offset to an array of (usually) 257 values, and because you're accessing memory out of bounds, you get a segmentation fault. The +1 allows EOF to be passed to isdigit() when EOF is -1, which is the usual value but is not mandatory. The macros/functions like isdigit() take either a character as an unsigned char (usually in the range 0..255, therefore) or EOF as the valid inputs.
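The test_integer() function suggested earlier in this answer could be sketched as follows. This is only one possible reading of that suggestion: it assumes decimal input, accepts leading and trailing white space, and treats values that do not fit in a long as invalid.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* Sketch of test_integer(): returns 1 if token is a decimal integer
 * that fits in a long, optionally surrounded by white space, else 0. */
int test_integer(const char *token)
{
    char *end;

    errno = 0;
    (void)strtol(token, &end, 10);  /* strtol skips leading white space */

    if (end == token || errno == ERANGE)
        return 0;                   /* no digits found, or out of range */

    while (isspace((unsigned char)*end))
        end++;                      /* tolerate trailing white space */

    return *end == '\0';            /* anything left over is junk */
}
```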
You're declaring an array of pointer to char, not a simple array of just char. You also need to initialise the array or assign it some value later. If you read the value of a member of the array that has not been initialised or assigned to, you are invoking undefined behaviour.
char tokenHolder[2500] = {0};
for(int i = 0; tokenHolder[i] != '\0'; ++i){
if(isdigit(tokenHolder[i])){ printf("worked"); }
On a side note, you are probably overlooking compiler warnings telling you that your code might not be correct. isdigit expects an int, and a char * is not compatible with int, so your compiler should have generated a warning for that.
You need/want to cast your input to unsigned char before passing it to isdigit.
if(isdigit((unsigned char)tokenHolder[i])){ printf("worked"); }
In most typical encoding schemes, characters outside the US-ASCII range (e.g., any letters with umlauts, accents, graves, etc.) will show up as negative numbers in the typical case that char is signed.
As to how this causes a segmentation fault: isdigit (along with islower, isupper, etc.) is often implemented using a table of bit-fields, and when you call the function the value you pass is used as an index into the table. A negative number ends up trying to index (well) outside the table.
Though I didn't initially notice it, you also have a problem because tokenHolder (probably) isn't the type you expected/planned to use. From the looks of the rest of the code, you really want to define it as:
char tokenHolder[2500];
Whenever I send in a number from the command line it errors and gives me a wrong number
edgeWidth=*argv[2];
printf("Border of %d pixels\n", edgeWidth);
fileLocation=3;
./hw3 -e 100 baboon.ascii.pgm is what I send in through the command line and when I print the number to the screen I get 49 as the number
int edgeWidth is defined at the beginning of the program.
Why is it not giving me 100?
argv is an array of strings, so argv[2] is a string; you need to convert it to an integer:
edgeWidth = atoi(argv[2]);
The problem is that by doing
edgeWidth = *argv[2];
you're assigning the first character of "100" to edgeWidth. 49 happens to be the ASCII value for '1'.
If you want 100, you need to use something like atoi or strtol to parse the string into an int.
Addendum: Regarding numeric promotion, part two of 6.5.16.1 in the C99 spec states:
In simple assignment (=), the value of the right operand is converted
to the type of the assignment expression and replaces the value stored
in the object designated by the left operand.
so it does appear that numeric promotion happens here.
Command line arguments are passed as char* strings (argv itself is a char**), not as int, so you need a proper conversion such as atoi() to use one as an int.
You should use edgeWidth = atoi(argv[2]) to get expected output.
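Since atoi() cannot report errors, a strtol-based variant may be worth considering. The sketch below is one way to do it; the helper name parse_int_arg is hypothetical, and it assumes the whole argument must be a decimal number that fits in an int.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: parses text as a decimal int.
 * Returns 1 and stores the value in *out on success; returns 0 and
 * leaves *out untouched on failure. */
int parse_int_arg(const char *text, int *out)
{
    char *end;
    long value;

    errno = 0;
    value = strtol(text, &end, 10);

    if (end == text || *end != '\0')
        return 0;                   /* no digits, or trailing junk */
    if (errno == ERANGE || value < INT_MIN || value > INT_MAX)
        return 0;                   /* does not fit in an int */

    *out = (int)value;
    return 1;
}
```

In the question's program this might be used as if (argc > 2 && parse_int_arg(argv[2], &edgeWidth)), with an error message in the else branch.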