Flex - identify float/int/id tokens - C

I'm trying to write a flex file that recognizes floats, integers, and ids:
Valid int - not allowed to start with 0.
Valid float - its representation must include an exponent whose value is an integer, with or without a sign, e.g. 2.78e+10.
Valid id - can only start with a lower-case letter, and several underscores cannot appear one after another.
I am not sure where I went wrong. If the input contains only a float, I get a float token back, and the same goes for int and id; but when everything is combined in one file, it does not work.
This is the file that I created:
%option noyywrap
%{
#include "Token.h"
#include <stdio.h>
#include <stdlib.h>
static int skip_single_line_comment(int num); //function for one line comment
static int skip_multiple_line_comment(int num);//function for multiple lines of comments
int line_num=0;
%}
ALPHA ([a-zA-Z])
DIGIT ([0-9])
Sign ([+|-])
Expo ([e]{Sign}?)
float_num ([1-9]+(\.({DIGIT}+{Expo}{DIGIT}+)))
int_num [1-9]{DIGIT}+
id ([a-z]+({ALPHA}|{DIGIT}|(\_({ALPHA}|{DIGIT})))*)
%%
{float_num} {
create_and_store_token(TOKEN_FLOAT, yytext, line_num);
fprintf(yyout,"Line %d : found token of type TOKEN_FLOAT, lexeme %s.\n", line_num, yytext);
}
\n {line_num++;}
{int_num} {
create_and_store_token(TOKEN_INTEGER, yytext, line_num);
fprintf(yyout,"Line %d : found token of type TOKEN_INTEGER , lexeme %s.\n", line_num, yytext);
}
{id} {
create_and_store_token(TOKEN_ID, yytext, line_num);
fprintf(yyout,"Line %d : found token of type TOKEN_ID, lexeme %s.\n", line_num, yytext);
}
"//" {line_num=skip_single_line_comment(line_num); fprintf(yyout,"The number of the line is:%d.\n", line_num);}
"/*" {line_num=skip_multiple_line_comment(line_num); fprintf(yyout,"The number of the line is:%d.\n", line_num);}
%%
static int
skip_single_line_comment(int num)
{
char c;
/* Read until we find \n or EOF */
while((c = input()) != '\n' && c != EOF)
;
/* Maybe you want to place back EOF? */
if(c == EOF)
unput(c);
return num=num+1;
}
static int
skip_multiple_line_comment(int num)
{
char c;
for(;;)
{
switch(input())
{
/* We expect ending the comment first before EOF */
case EOF:
fprintf(stderr, "Error unclosed comment, expect */\n");
exit(-1);
goto done;
break;
/* Is it the end of comment? */
case '*':
if((c = input()) == '/'){
num=num+1;
goto done;
}
unput(c);
break;
default:
/* skip this character */
break;
}
}
done:
/* exit entry */
return num ;
}
void main(int argc, char **argv){
yyin=fopen("C:\\temp\\test1.txt","r");
yyout=fopen("C:\\temp\\test1Soltion.txt","w");
yylex();}
the input file:
21
41.e-21
a_23_e4_5
8
1.1E+21
a1_c23_e4_56
The output:
Line 0 : found token of type TOKEN_INTEGER , lexeme 21.
Line 1 : found token of type TOKEN_INTEGER , lexeme 41.
.Line 1 : found token of type TOKEN_ID, lexeme e.
-Line 1 : found token of type TOKEN_INTEGER , lexeme 21.
Line 2 : found token of type TOKEN_ID, lexeme a_23_e4_5.
81.1E+Line 4 : found token of type TOKEN_INTEGER , lexeme 21.
Line 5 : found token of type TOKEN_ID, lexeme a1_c23_e4_56.

You have several problems in your code: (from top to bottom)
Sign is wrong: [+|-] says that a sign is one of +, | or -, because every character between the square brackets [ and ] is a member of the set. Use (\+|-) or [+-] instead. For the - to be taken literally rather than as a range indicator, place it next to one of the square brackets that delimit the set (preferably the closing one, so that you can still use the negation character ^ without interference).
The exponent marker of a floating-point number may be either e or E, so the pattern should be [eE].
A floating-point number can begin with 0; something like -00013.26 can be valid.
Your float pattern requires digits on both sides of the dot, so nothing like 3. or .26 is recognized as a floating-point number. You have written ([1-9]+(\.({DIGIT}+{Expo}{DIGIT}+))), which accepts one or more digits from the set [1-9] (so no 0 anywhere before the decimal point), followed by a dot, at least one digit, and the exponent. This is why 41.e-21 is not recognized as a float. 40.25 is not recognized as a float either: it is scanned as the integer 40, a dot (echoed to the output by the default rule), and the integer 25.
You don't allow a sign in front of a number (this is common in compiler lexers, where the parser handles the sign, but not when reading plain number sequences as you are trying to do). This is why 41.e-21 is scanned as the integer 41, a . (echoed), the identifier e, a - (echoed, because signed integers are not supported), and the integer 21.
You don't accept one-digit integers: [1-9]{DIGIT}+ requires one nonzero digit followed by at least one more digit. This causes the mess in the output for input lines four and five.
So the only thing you recognize correctly is the identifier, which by the way has to begin with a lower-case letter; identifiers beginning with an upper-case letter are not supported.
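To make the intended token shapes concrete, here is a minimal hand-written sketch in plain C rather than flex. It assumes the rules stated in the question (an int must not start with 0, a float must carry an exponent, an id starts with a lower-case letter and never has two underscores in a row) plus the points above (a one-digit int is valid, the exponent marker may be e or E, and the digits on either side of the dot are optional as long as one side has some). The names token_kind, digit_run, is_int, is_float, is_id, and classify are made up for this illustration.

```c
#include <ctype.h>
#include <stddef.h>

enum token_kind { TOK_NONE, TOK_INT, TOK_FLOAT, TOK_ID };

/* Number of consecutive decimal digits at the start of s. */
static size_t digit_run(const char *s)
{
    size_t n = 0;
    while (isdigit((unsigned char)s[n]))
        n++;
    return n;
}

/* Integer: one nonzero digit followed by zero or more digits. */
static int is_int(const char *s)
{
    if (s[0] < '1' || s[0] > '9')
        return 0;
    return s[1 + digit_run(s + 1)] == '\0';
}

/* Float: a mantissa with at least one digit and an optional dot,
   then a mandatory exponent of the form [eE][+-]?digits. */
static int is_float(const char *s)
{
    size_t whole = digit_run(s);
    size_t frac = 0;
    size_t exp_digits;

    s += whole;
    if (*s == '.') {
        s++;
        frac = digit_run(s);
        s += frac;
    }
    if (whole + frac == 0)
        return 0;
    if (*s != 'e' && *s != 'E')
        return 0;
    s++;
    if (*s == '+' || *s == '-')
        s++;
    exp_digits = digit_run(s);
    return exp_digits > 0 && s[exp_digits] == '\0';
}

/* Id: lower-case first letter; every underscore must be followed
   by a letter or digit, which rules out "__" and a trailing "_". */
static int is_id(const char *s)
{
    if (!islower((unsigned char)*s))
        return 0;
    for (s++; *s != '\0'; s++) {
        if (*s == '_') {
            if (!isalnum((unsigned char)s[1]))
                return 0;
        } else if (!isalnum((unsigned char)*s)) {
            return 0;
        }
    }
    return 1;
}

/* Classify a whole string as exactly one token kind. */
enum token_kind classify(const char *s)
{
    if (is_float(s))
        return TOK_FLOAT;
    if (is_int(s))
        return TOK_INT;
    if (is_id(s))
        return TOK_ID;
    return TOK_NONE;
}
```

Calling classify on each line of the sample input is one way to check what the flex patterns ought to accept before translating the same shapes back into regular expressions.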

Related

How to check if input is numeric(float) or it is some character?

I was asked to write a program to find sum of two inputs in my college so I should first check whether the input is valid.
For example, if I input 2534.11s35 the program should detect that it is not a valid input for this program because of s in the input.
to check input is numeric(float)
1) Take input as a string, char buf[Big_Enough]. I'd expect 160 digits will handle all but the most arcane "float" strings (see note 1).
#define N 160
char buf[N];
if (fgets(buf, sizeof buf, stdin)) {
2) Apply float strtof() for float, (strtod() for double, strtold() for long double).
char *endptr;
errno = 0;
float d = strtof(buf, &endptr);
// endptr now points to the end of the conversion, if any.
3) Check results.
if (buf == endptr) return "No_Conversion";
// Recommend to tolerate trailing white-space.
// as leading white-spaces are already allowed by `strtof()`
while (isspace((unsigned char)*endptr)) {
endptr++;
}
if (*endptr) return "TrailingJunkFound";
return "Success";
4) Tests for extremes, if desired.
At this point, the input is numeric. The question remains whether the finite string can be well represented by a finite float: that is, whether |result| is 0 or in the range [FLT_TRUE_MIN...FLT_MAX].
This involves looking at errno.
The conversion "succeeds", yet finite string values outside the float range become HUGE_VALF, which may be infinity or FLT_MAX.
Wee |values| close to 0.0, but not 0.0, become something in the range [0.0 ... FLT_MIN].
Since the goal is to detect whether a conversion succeeded (it did), I'll leave these details for a question that wants to get into the gory bits of what value.
An alternative is to use fscanf() to directly read and convert, yet the error handling there has its troubles too and is hard to control portably.
Note 1: Typical float range is +/- 10^38, so allowing for 40 or so characters makes sense. An exact print of FLT_TRUE_MIN can take ~150 characters. Distinguishing an arbitrary "float" string near FLT_TRUE_MIN from the next larger value needs about that many digits.
If "float" strings are not arbitrary, but only come from the output of a printed float, then far fewer digits are needed - about 40.
Of course it is wise to allow for extra leading/trailing spaces and zeros.
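Steps 2 to 4 above can be put together as one function. This is only a sketch: the names float_status and parse_float are invented for the example, and treating every ERANGE as "out of range" is an assumption, since whether underflow sets errno is implementation-defined.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

enum float_status {
    FLOAT_OK,
    FLOAT_NO_CONVERSION,
    FLOAT_TRAILING_JUNK,
    FLOAT_OUT_OF_RANGE
};

/* Convert text to a float and report what, if anything, went wrong.
   *out is written only when FLOAT_OK is returned. */
enum float_status parse_float(const char *text, float *out)
{
    char *end;
    float value;

    errno = 0;
    value = strtof(text, &end);

    if (end == text)
        return FLOAT_NO_CONVERSION;      /* nothing numeric at the start */

    /* Tolerate trailing white space such as the '\n' left by fgets(). */
    while (isspace((unsigned char)*end))
        end++;
    if (*end != '\0')
        return FLOAT_TRAILING_JUNK;      /* e.g. the 's' in "2534.11s35" */

    if (errno == ERANGE)
        return FLOAT_OUT_OF_RANGE;       /* overflow, possibly underflow */

    *out = value;
    return FLOAT_OK;
}
```

Note that strtof() also accepts forms such as "inf", "nan", and hexadecimal floats like "0x1p3"; if those should count as invalid input, an extra check on the characters is needed.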
You need to take the input as a string and then, make use of strtod() to parse the input.
Regarding the return values, from the man page:
double strtod(const char *nptr, char **endptr);
These functions return the converted value, if any.
If endptr is not NULL, a pointer to the character after the last character used in the conversion is stored in the location referenced by endptr.
If no conversion is performed, zero is returned and the value of nptr is stored in the location referenced by endptr.
Getting to the point of detection of errors, couple of points:
Ensure the errno is set to 0 before the call and it still is 0 after the call.
The return value is not HUGE_VAL.
endptr is not equal to nptr (equality means no conversion has been performed), and the character it points to, *endptr, is the null terminator.
The above checks, combined together, will ensure a successful conversion.
In your case, the last point is essential: if there is an invalid character in the input, *endptr will not be the null terminator; endptr will instead hold the address of that (first) invalid character in the input.
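#include<stdio.h>
#include<stdlib.h>
#include<string.h>
int main(void){
char num1[15];
float number1;
int dot_check1=0,check=0,i;
printf("enter the numbers :\n");
/* gets() cannot limit the input length and was removed in C11; use fgets() */
if(fgets(num1,sizeof num1,stdin)==NULL)
return 1;
num1[strcspn(num1,"\n")]='\0'; /* strip the trailing newline, if any */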
i=0;
while(num1[i]){
if(num1[i]>'/' && num1[i]<':')
;
else { if(dot_check1==0){
if(num1[i]=='.')
dot_check1=1;
else {
check=1;
break;
}
}
else {
check=1;
break;
}
}
i++;
}
if(check){
printf("please check the number you have entered");
}
else{
number1=atof(num1);
printf("you entered number is %f",number1);
}
}
Here is untested code to check whether a string meets the requested specification.
#include <ctype.h>
/* IsFloatNumeral returns true (1) if the string pointed to by p contains a
valid numeral and false (0) otherwise. A valid numeral:
Starts with optional white space.
Has an optional hyphen as a minus sign.
Contains either digits, a period followed by digits, or both.
Ends with optional white space.
Notes:
It is unusual not to accept "3." for a float literal, but this was
specified in a comment, so the code here is written for that.
The question does not state that leading or trailing white space
should be accepted (and ignored), but that is included here. To
exclude such white space, simply delete the relevant lines.
*/
_Bool IsFloatNumeral(const char *p)
{
_Bool ThereAreInitialDigits = 0;
_Bool ThereIsAPeriod = 0;
// Skip initial spaces. (Not specified in question; remove if undesired.)
// The casts to unsigned char keep the <ctype.h> calls defined for negative char values.
while (isspace((unsigned char)*p))
++p;
// Allow an initial hyphen as a minus sign.
if (*p == '-')
++p;
// Allow initial digits.
if (isdigit((unsigned char)*p))
{
ThereAreInitialDigits = 1;
do
++p;
while (isdigit((unsigned char)*p));
}
// Allow a period followed by digits. Require at least one digit to follow the period.
if (*p == '.')
{
++p;
if (!isdigit((unsigned char)*p))
return 0;
ThereIsAPeriod = 1;
do
++p;
while (isdigit((unsigned char)*p));
}
/* If we did not see either digits or a period followed by digits,
reject the string (return 0).
*/
if (!ThereAreInitialDigits && !ThereIsAPeriod)
return 0;
// Skip trailing spaces. (Not specified in question; remove if undesired.)
while (isspace((unsigned char)*p))
++p;
/* If we are now at the end of the string (the null terminating
character), accept the string (return 1). Otherwise, reject it (return
0).
*/
return *p == 0;
}

Finding unprintable characters and printing out their hex form in C

I currently have a finite state machine which analyzes a long string, separates the long string by white space, and analyzes each token to either octal, hex, float, error, etc.
Here is a brief overview of how I analyze each token:
enum state mystate = start_state;
while (current_index <= end_index - 1) { // iterate through whole token
switch (mystate) {
case 0:
// analyze first character and move to appropriate state
// cases 1-5 represent the valid states, if error set mystate = 6
case 6: // this is the error state
current_index = end_index - 1; // end loop
break;
}
current_index++;
}
At the end of this loop, I analyze what state my token fell under, for example if the token didn't fit into any category and it went to state 6 (the error state):
if (mystate == 6) {
// token is char pointer to string token
fprintf(stdout, "Error: \" %s \" is invalid\n", token);
}
Now, I am supposed to print out unprintable characters from 0x20 and under, such as start-of-text, start-of-header, etc. in their hex form, such as [0x02] and [0x01]. I found a good list of the ASCII unprintable characters from 0x20 and under here: http://www.theasciicode.com.ar/ascii-control-characters/start-of-header-ascii-code-1.html
Firstly, I am confused how to even type the unprintable characters into the command line. How does one type an unprintable character as a command line argument for my program to analyze?
After that hurdle, I know that the unprintable characters will fall into state 6, my error state. So I have to modify my error state if statement slightly. Here is my thought process of how to do so in pseudo code:
if (mystate == 6) {
if (token is equal to unprintable character) {
// print hex form, use 0x%x for formatting
} else {
// still error, but not unprintable so just have original error statement
fprintf(stdout, "Error: \" %s \" is invalid\n", token);
}
}
Another thought I had was:
if (mystate == 6) {
if (the token's hex value is between 0x01 and 0x20) {
// print hex form, use 0x%x for formatting
} else {
// still error, but not unprintable so just have original error statement
fprintf(stdout, "Error: \" %s \" is invalid\n", token);
}
}
With a sane libc you would use
#include <ctype.h>
...
if (!isprint((unsigned char)ch)) {
printf ("[0x%02x]", (unsigned)(unsigned char)ch);
}
...
to find non-printable ASCII characters, assuming that char ch is your current input character.
To use them in a command line you could use printf(1) from the command line.
printf '\x02'|xxd
0000000: 02
There you see the STX character. BTW. There is an excellent manual page about ascii (ascii(7))!
So as a complete command line:
YOUR_Program "`printf '\x02\x03\x18\x19'`"
(The xxd was just to show what comes out of printf, as it is non-printable). xxd is just a hexdump utility, similar to od.
Note: When you really want unprintable input, it is more convenient to take the input either from a file, or from stdin. That simplifies your program call:
printf '\x02\x03\x18\x19'|YOUR_Program
One piece of your puzzle is printing in hex.
printf("%02x", 7);
This prints the two-digit hex value 07.
Another piece is detecting non-printable characters.
if (c < 0x20)
This is true if the character has any value less than a space (0x20).
You might research the isprint function, as there are some unprintable characters that are greater than space (for example DEL, 0x7f).
Good luck. Welcome to c.
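Putting the two pieces together, a whole token can be rewritten with every non-printable byte shown in the bracketed hex form the question asks for. This is a sketch only; the function name escape_unprintable and its buffer-based interface are assumptions made for the example.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

/* Copy src into dst, replacing each non-printable byte with "[0xNN]".
   Returns 0 on success, or -1 if dst is too small to hold the result. */
int escape_unprintable(const char *src, char *dst, size_t dst_size)
{
    size_t used = 0;

    if (dst_size == 0)
        return -1;

    for (; *src != '\0'; src++) {
        unsigned char ch = (unsigned char)*src;
        size_t room = dst_size - used;   /* always at least 1 here */
        int written;

        if (isprint(ch))
            written = snprintf(dst + used, room, "%c", ch);
        else
            written = snprintf(dst + used, room, "[0x%02x]", (unsigned)ch);

        /* snprintf reports the length it needed; refuse truncation. */
        if (written < 0 || (size_t)written >= room)
            return -1;
        used += (size_t)written;
    }
    dst[used] = '\0';
    return 0;
}
```

In the error branch of the state machine, the escaped copy can then be printed with the existing fprintf call in place of the raw token.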

Flex pattern for ID gives 'Segmentation fault'

I have a program in C that converts expression to RPN (reverse Polish notation).
All I need to do is to replace the lexer code written in C with Flex. I already did some work, but I have problems with patterns - word or variable id to be specific. Yes, this is a class exercise.
This is what I have:
%{
#include "global.h"
int lineno = 1;
int tokenval = NONE;
%}
%option noyywrap
WS " "
NEW_LINE "\n"
DIGIT [0-9]
LETTER [a-zA-Z]
NUMBER {DIGIT}+
ID {LETTER}({LETTER}|{DIGIT})*
%%
{WS}+ {}
{NEW_LINE} { ++lineno; }
{NUMBER} { sscanf (yytext, "%d", &tokenval); return(NUM); }
{ID} { sscanf (yytext, "%s", &tokenval); return(ID); }
. { return *yytext;}
<<EOF>> { return (DONE); }
%%
and defined in global.h
#define BSIZE 128
#define NONE -1
#define EOS '\0'
#define NUM 256
#define DIV 257
#define MOD 258
#define ID 259
#define DONE 260
Everything works when I use digits, brackets and operators, but when I type for example a+b it gives me a Segmentation fault (and the output should be ab+).
Please don't ask me for a parser code (I can share if really needed) - requirement is to ONLY implement lexer using Flex.
The problem is that the program is doing an sscanf with a string format (%s) into the address of an integer (&tokenval). You should change that to an array of char, e.g.,
%{
#include "global.h"
int lineno = 1;
int tokenval = NONE;
char tokenbuf[132];
%}
and
{ID} { sscanf (yytext, "%s", tokenbuf); return(ID); }
(though strcpy is a better choice than sscanf, this is just a starting point).
When flex scans a token matching pattern ID, the associated action attempts to copy the token into a character array at location &tokenval. But tokenval has type int, so the code has undefined behavior.
Moreover, if the length of the ID equals or exceeds the size of an int, then you cannot fit all its bytes (including a string terminator) in the space occupied by an int. A reasonably likely result is that you attempt to write past its end, which could result in a segfault.
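Even with a char array as the destination, an identifier longer than the array would still overflow it. A bounded copy avoids that. The helper below is a hypothetical sketch (copy_lexeme is not part of flex), using snprintf so that the destination is always terminated.

```c
#include <stddef.h>
#include <stdio.h>

/* Copy a lexeme into a fixed-size buffer without overflowing it.
   Returns 1 if the whole lexeme fit, 0 if it was truncated. */
int copy_lexeme(char *dst, size_t dst_size, const char *lexeme)
{
    int needed = snprintf(dst, dst_size, "%s", lexeme);
    return needed >= 0 && (size_t)needed < dst_size;
}
```

Inside the {ID} action this would be called as copy_lexeme(tokenbuf, sizeof tokenbuf, yytext), with a zero return treated as an "identifier too long" error.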

How do i check each line of file in C?

I'm new to C.
I've been asked to check whether the format of a text file's input is right or not.
The file should have lines like this:
1-float
2-('+'/'*'/'-')
3-float
4-'='
5-the result of the above operation
6-';'
I read the file and place each char in an array, but I have no idea what to do next.
Here is my code:
#include <stdio.h>
#include <conio.h>
/*Max number of characters to be read/write from file*/
#define MAX_CHAR_FOR_FILE_OPERATION 1000000
int main()
{
char *filename = "D:\\input.txt";
FILE *fp;
char text[MAX_CHAR_FOR_FILE_OPERATION];
int i;
fp = fopen(filename, "r");
if(fp == NULL)
{
printf("File Pointer is invalid\n");
return -1;
}
//Ensure array write starts from beginning
i = 0;
//Read over file contents until either EOF is reached or maximum characters is read and store in character array
while( (fgets(&text[i++],sizeof(char)+1,fp) != NULL) && (i<MAX_CHAR_FOR_FILE_OPERATION) ) ;
//Ensure array read starts from beginning
fclose(fp);
getche();
return 0;
}
The easiest solution I can think of is to create an automaton. That could be an enum with steps, for example:
enum AUTOMATE
{
FirstFloat = 0,
FirstSign,
SecondFloat,
EqualSign,
Answer
};
More info on how to use enum here: http://msdn.microsoft.com/en-us/library/whbyts4t.aspx
If you already have each char in an array, iterate over the entire array using whichever loop you want, and check the integer value of each char. Use this table http://www.asciitable.com/ to check whether the integer value represents a digit or a sign (-, +, =, etc). When each step is passed, advance your automaton to the next step (+=1). If you reach the end, the format is verified. If not, the format is wrong.
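A minimal sketch of such a step-by-step automaton follows. It makes a few assumptions beyond the answer: the whole record is passed as one string, white space (including newlines) between the parts is skipped, and strtof() is used to consume each float instead of checking characters one by one. The names step, skip_blanks, expect_float, and line_is_valid are invented for the example.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

enum step {
    FIRST_FLOAT, OPERATOR_SIGN, SECOND_FLOAT,
    EQUAL_SIGN, ANSWER, SEMICOLON, DONE
};

static const char *skip_blanks(const char *p)
{
    while (isspace((unsigned char)*p))
        p++;
    return p;
}

/* Return the position just past a float at p, or NULL if there is none. */
static const char *expect_float(const char *p)
{
    char *end;
    (void)strtof(p, &end);
    return end == p ? NULL : end;
}

/* Return 1 if text has the form: float op float = float ;  else 0. */
int line_is_valid(const char *p)
{
    enum step step = FIRST_FLOAT;

    while (step != DONE) {
        p = skip_blanks(p);
        switch (step) {
        case FIRST_FLOAT:
        case SECOND_FLOAT:
        case ANSWER:
            p = expect_float(p);
            break;
        case OPERATOR_SIGN:   /* the question lists '+', '*' and '-' */
            p = (*p != '\0' && strchr("+-*", *p) != NULL) ? p + 1 : NULL;
            break;
        case EQUAL_SIGN:
            p = (*p == '=') ? p + 1 : NULL;
            break;
        case SEMICOLON:
            p = (*p == ';') ? p + 1 : NULL;
            break;
        case DONE:
            break;
        }
        if (p == NULL)
            return 0;                     /* current step failed */
        step = (enum step)(step + 1);     /* advance the automaton */
    }
    return *skip_blanks(p) == '\0';       /* nothing may follow the ';' */
}
```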
It is not 100% clear what you want to do here.
If all you want to do is check that the expression is syntactically correct, that's one thing. If you want to check that it is also arithmetically correct (i.e. that the result on the RHS of the = is actually the result of the arithmetic expression on the LHS), that's another.
In either case, you must parse the input lines. There are several ways of doing this. The canonical, general, and robust way is to tokenize the lines with a lexer and pass the tokens from the lexer to a parser, which is a kind of finite state machine that “knows” the grammar of the language you are trying to parse (in this case infix arithmetic expressions). Given that you asked this question, it's reasonable to assume that you haven't got to this kind of material yet.
In your case, you are only dealing with simple infix arithmetic expressions of the form:
NUMBER OPERATOR NUMBER = NUMBER ;
You can get away with checking for lines that “look” exactly like this with one of the scanf() family of functions, but this is a fragile solution: if you add another term to the expression on the left, it will break; it takes considerable care to craft the correct format string; and it does not check for arithmetic correctness.
If all you need is something this simple, you can do it like this (I have omitted the file I/O):
#include <stdio.h>
#include <stdbool.h>
#define OPERATOR_CLASS "[-+/*]"
bool is_a_binary_infix_expression(const char *expr)
{
int count; // Count returned by sscanf()
float left_opd; // Left operand
char operator[2] = {'\0'}; // Operator
float right_opd; // Right operand
float result; // Result
int end = -1; // Offset stored by %n; stays -1 if ';' was never matched
// Format specifier for sscanf(); %n is reached only after ';' has matched:
const char *format = "%f %1" OPERATOR_CLASS "%f =%f ; %n";
// Attempt conversion (%n is not included in the returned count):
count = sscanf(expr, format, &left_opd, operator, &right_opd, &result, &end);
// The expression is good only if all 4 conversions succeeded, the ';' was
// matched (so %n stored an offset), and nothing but white space followed it.
// Checking count==4 alone would also accept a line with the ';' missing:
return count==4 && end>=0 && expr[end]=='\0';
}
int main(void) {
int i;
int n_lines;
char *lines[]={
"1.5+2.2=3.7;",
"1.5 + 2.2 = 3.7 ; ",
"a+2.2=3.7;",
"1.5+2.2=3.7;x",
};
n_lines = (int)sizeof(lines)/sizeof(char *);
for(i=0; i<n_lines; i++) {
printf("'%s' is %s\n", lines[i], is_a_binary_infix_expression(lines[i]) ? "OK" : "NOT OK");
}
return 0;
}
This only checks for syntactic correctness. If you want to check for arithmetic correctness, you can switch on the operator to compute the correct result and compare that with the result extracted from the input line, but be careful not to fall into the trap of doing a direct comparison with ==.
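The arithmetic check could be sketched as below. The function name result_matches, the helper magnitude, and the relative tolerance of 1e-5 are all assumptions chosen for illustration; a suitable tolerance depends on how the numbers in the file were produced.

```c
#include <math.h>
#include <stdbool.h>

/* Absolute value, written out to keep the example free of libm calls. */
static float magnitude(float x)
{
    return x < 0.0f ? -x : x;
}

/* Return true if claimed is close enough to (left op right). */
bool result_matches(float left, char op, float right, float claimed)
{
    float expected;
    float scale;

    switch (op) {
    case '+': expected = left + right; break;
    case '-': expected = left - right; break;
    case '*': expected = left * right; break;
    case '/': expected = left / right; break;
    default:  return false;               /* unknown operator */
    }

    /* Division by zero or overflow gives inf or NaN: never a match. */
    if (!isfinite(expected) || !isfinite(claimed))
        return false;

    /* Compare with a tolerance relative to the larger magnitude,
       but never smaller than an absolute 1e-5. */
    scale = magnitude(expected) > magnitude(claimed)
                ? magnitude(expected) : magnitude(claimed);
    if (scale < 1.0f)
        scale = 1.0f;
    return magnitude(expected - claimed) <= 1e-5f * scale;
}
```

The operands and operator would come from the sscanf() call in is_a_binary_infix_expression(), for example by passing left_opd, operator[0], right_opd, and result.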

Macro directives in C — my code example doesn't work

I want to get the following piece of code to work:
#define READIN(a, b) if(scanf('"#%d"', '"&a"') != 1) { printf("ERROR"); return EXIT_FAILURE; }
int main(void)
{
unsigned int stack_size;
printf("Type in size: ");
READIN(d, stack_size);
}
I don't get how to use directives with the # operator. I want to use the scanf with print ERROR etc. several times, but the "'"#%d"' & '"&a"'" is, I think, completely wrong. Is there any way to get that running? I think a macro is the best solution — or do you disagree?
You should only stringify arguments to the macro, and they must be outside of strings or character constants in the replacement text of the macro. Thus you probably should use:
#define READIN(a, b) do { if (scanf("%" #a, &b) != 1) \
{ fprintf(stderr, "ERROR\n"); return EXIT_FAILURE; } \
} while (0)
int main(void)
{
unsigned int stack_size;
printf("Type in size: ");
READIN(u, stack_size);
printf("You entered %u\n", stack_size);
return(0);
}
There are many changes. The do { ... } while (0) idiom prevents you from getting compilation errors in circumstances such as:
if (i > 10)
READIN(u, j);
else
READIN(u, k);
With your macro, you'd get an unexpected keyword 'else' type of message because the semi-colon after the first READIN() would be an empty statement after the embedded if, so the else could not belong to the visible if or the if inside the macro.
The type of stack_size is unsigned int; the correct format specifier, therefore, is u (d is for a signed int).
And, most importantly, the argument a in the macro is stringized correctly (and string concatenation of adjacent string literals, an extremely useful feature of C89, takes care of the rest for you). And the argument b in the macro is not embedded in a string either.
The error reporting is done to stderr (the standard stream for reporting errors on), and the message ends with a newline so it will actually appear. I didn't replace return EXIT_FAILURE; with exit(EXIT_FAILURE);, but that would probably be a sensible choice if the macro will be used outside of main(). That assumes that 'terminate on error' is the appropriate behaviour in the first place. It often isn't for interactive programs, but fixing it is a bit harder.
I'm also ignoring my reservations about using scanf() at all; I usually avoid doing so because I find error recovery too hard. I've only been programming in C for about 28 years, and I still find scanf() too hard to control, so I essentially never use it. I typically use fgets() and sscanf() instead. Amongst other merits, I can report on the string that caused the trouble; that's hard to do when scanf() may have gobbled some of it.
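The stringizing and concatenation steps can be seen in isolation with a much smaller macro. SCAN_FORMAT and parse_unsigned_text are names made up for this sketch; the helper follows the fgets() plus sscanf() approach by parsing from a string.

```c
#include <stdio.h>

/* #spec turns the macro argument into a string literal, and the
   compiler then joins the adjacent literals: "%" "u" becomes "%u". */
#define SCAN_FORMAT(spec) "%" #spec

/* Parse one unsigned int from text; returns 1 on success, 0 otherwise. */
int parse_unsigned_text(const char *text, unsigned int *out)
{
    return sscanf(text, SCAN_FORMAT(u), out) == 1;
}
```

Here SCAN_FORMAT(u) expands to "%" "u", which the compiler treats as the single literal "%u", and SCAN_FORMAT(lf) becomes "%lf" in the same way.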
My thought with scanf() here is, to only read in positive numbers and no letters. My overall code does create a stack, which the user types in and the type should be only positive, otherwise error. [...] I only wanted to know if there's a better solution to forbid the user to type in something other than positive numbers?
I just tried the code above (with #include <stdlib.h> and #include <stdio.h> added) and entered -2 and got told 4294967294, which isn't what I wanted (the %u format does not reject -2, at least on MacOS X 10.7.2). So, I would go with fgets() and strtoul(), most likely. However, accurately detecting all possible problems with strtoul() is an exercise of some delicacy.
This is the alternative code I came up with:
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <limits.h>
#include <string.h>
int main(void)
{
unsigned int stack_size = 0;
char buffer[4096];
printf("Type in size: ");
if (fgets(buffer, sizeof(buffer), stdin) == 0)
printf("EOF or error detected\n");
else
{
char *eos;
unsigned long u;
size_t len = strlen(buffer);
if (len > 0)
buffer[len - 1] = '\0'; // Zap newline (assuming there is one)
errno = 0;
u = strtoul(buffer, &eos, 10);
if (eos == buffer ||
(u == 0 && errno != 0) ||
(u == ULONG_MAX && errno != 0) ||
(u > UINT_MAX))
{
printf("Oops: one of many problems occurred converting <<%s>> to unsigned integer\n", buffer);
}
else
stack_size = u;
printf("You entered %u\n", stack_size);
}
return(0);
}
The specification of strtoul() is given in ISO/IEC 9899:1999 §7.20.1.4:
¶1 [...]
unsigned long int strtoul(const char * restrict nptr,
char ** restrict endptr, int base);
[...]
¶2 [...] First,
they decompose the input string into three parts: an initial, possibly empty, sequence of
white-space characters (as specified by the isspace function), a subject sequence
resembling an integer represented in some radix determined by the value of base, and a
final string of one or more unrecognized characters, including the terminating null
character of the input string. Then, they attempt to convert the subject sequence to an
integer, and return the result.
¶3 [...]
¶4 The subject sequence is defined as the longest initial subsequence of the input string,
starting with the first non-white-space character, that is of the expected form. The subject
sequence contains no characters if the input string is empty or consists entirely of white
space, or if the first non-white-space character is other than a sign or a permissible letter
or digit.
¶5 If the subject sequence has the expected form and the value of base is zero, the sequence
of characters starting with the first digit is interpreted as an integer constant according to
the rules of 6.4.4.1. If the subject sequence has the expected form and the value of base
is between 2 and 36, it is used as the base for conversion, ascribing to each letter its value
as given above. If the subject sequence begins with a minus sign, the value resulting from
the conversion is negated (in the return type). A pointer to the final string is stored in the
object pointed to by endptr, provided that endptr is not a null pointer.
¶6 [...]
¶7 If the subject sequence is empty or does not have the expected form, no conversion is
performed; the value of nptr is stored in the object pointed to by endptr, provided
that endptr is not a null pointer.
Returns
¶8 The strtol, strtoll, strtoul, and strtoull functions return the converted
value, if any. If no conversion could be performed, zero is returned. If the correct value
is outside the range of representable values, LONG_MIN, LONG_MAX, LLONG_MIN,
LLONG_MAX, ULONG_MAX, or ULLONG_MAX is returned (according to the return type
and sign of the value, if any), and the value of the macro ERANGE is stored in errno.
The error I got was from a 64-bit compilation where -2 was converted to a 64-bit unsigned long, and that was outside the range acceptable to a 32-bit unsigned int (the failing condition was u > UINT_MAX). When I recompiled in 32-bit mode (so sizeof(unsigned int) == sizeof(unsigned long)), then the value -2 was accepted again, interpreted as 4294967294 again. So, even this is not delicate enough...you probably have to do a manual skip of leading blanks and reject a negative sign (and maybe a positive sign as well; you'd also need to #include <ctype.h>):
char *bos = buffer;
while (isspace((unsigned char)*bos))
bos++;
if (!isdigit((unsigned char)*bos))
...error - not a digit...
char *eos;
unsigned long u;
size_t len = strlen(bos);
if (len > 0)
bos[len - 1] = '\0'; // Zap newline (assuming there is one)
errno = 0;
u = strtoul(bos, &eos, 10);
if (eos == bos ||
(u == 0 && errno != 0) ||
(u == ULONG_MAX && errno != 0) ||
(u > UINT_MAX))
{
printf("Oops: one of many problems occurred converting <<%s>> to unsigned integer\n", buffer);
}
As I said, the whole process is rather non-trivial.
(Looking at it again, I'm not sure whether the u == 0 && errno != 0 clause would ever catch any errors...maybe not because the eos == buffer (or eos == bos) condition catches the case there's nothing to convert at all.)
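Collected into one function, the checks discussed above might look like the following sketch. The name parse_unsigned is hypothetical, and rejecting a leading '+' and any non-blank trailing characters are choices made for this example rather than requirements from the question.

```c
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Parse a non-negative decimal integer that fits in an unsigned int.
   Returns 1 and stores the value in *out on success, 0 otherwise. */
int parse_unsigned(const char *text, unsigned int *out)
{
    const char *start = text;
    char *end;
    unsigned long value;

    /* Skip leading blanks by hand so that a sign can be rejected. */
    while (isspace((unsigned char)*start))
        start++;
    if (!isdigit((unsigned char)*start))
        return 0;            /* empty input, '-', '+', or a letter */

    errno = 0;
    value = strtoul(start, &end, 10);
    if (errno == ERANGE || value > UINT_MAX)
        return 0;            /* does not fit */

    /* Allow trailing white space (such as the newline from fgets). */
    while (isspace((unsigned char)*end))
        end++;
    if (*end != '\0')
        return 0;            /* trailing junk such as "12abc" */

    *out = (unsigned int)value;
    return 1;
}
```

Because the first non-blank character must be a digit, strtoul() never sees a minus sign, so the behaviour no longer depends on whether unsigned long is wider than unsigned int.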
You are incorrectly encasing your macro argument(s), it should look like:
#define READIN(a, b) if(scanf("%"#a, &b) != 1) { printf("ERROR"); return EXIT_FAILURE; }
Your use of the stringify operator was also incorrect; it must directly prefix the argument name.
In short, use "%"#a, not '"#%d"', and &b, not '"&a"'.
As a side note, for longish macros like these, it helps to make them multi-line using \, which keeps them readable:
#define READIN(a, b) \
if(scanf("%"#a, &b) != 1) \
{ \
printf("ERROR"); \
return EXIT_FAILURE; \
}
When doing something like this, one should preferably use a function, something along the lines of this should work:
static inline int readIn(const char* szFormat, void* pDst)
{
if(scanf(szFormat,pDst) != 1)
{
puts("Error");
return 0;
}
return 1;
}
invoking it would be like so:
if(!readIn("%d",&stack_size))
return EXIT_FAILURE;
scanf(3) takes a const char * as a first argument. You are passing '"..."', which is not a C "string". C strings are written with the " double quotes. The ' single quotes are for individual characters: 'a' or '\n' etc.
Placing a return statement inside a C preprocessor macro is usually considered very poor form. I've seen goto error; coded inside preprocessor macros before for repetitive error handling code when storing formatted data to and reading data from a file or kernel interface, but these are definitely exceptional circumstances. You would detest debugging this in six months time. Trust me. Do not hide goto, return, break, continue, inside C preprocessor macros. if is alright so long as it is entirely contained within the macro.
Also, please get in the habit of writing your printf(3) statements like this:
printf("%s", "ERROR");
Format string vulnerabilities are exceedingly easy to write. Your code does not contain any such vulnerability now, but trust me, at some point in the future those strings are inevitably modified to include some user-supplied content, and putting in an explicit format string now will help prevent these in the future. At least you'll think about it in the future if you see this.
It is considered polite to wrap your multi-line macros in do { } while (0) blocks.
Finally, the stringification is not quite done correctly; try this instead:
#define READIN(A, B) do { if (scanf("%" #A, B) != 1) { \
/* error handling */ \
} else { \
/* success case */ \
} } while(0)
Edit: I feel I should re-iterate akappa's advice: Use a function instead. You get better type checking, better backtraces when something goes wrong, and it is far easier to work with. Functions are good.

Resources