Lex - recognizing tokens - c

I am trying to learn Lex. I have a simple program in which I want to read in a file and recognize tokens.
Right now I am getting some errors. I think I am having problems because the file has more than one line of tokens to recognize?
Here is the file
fd 3x00
bk
setc 100
int xy3 fd 10 rt 90
Here is the output I am trying to achieve:
Keyword: fd
Illegal: 3x00
Keyword: bk
Keyword: setc
Number: 100
Keyword: int
Here is my program:
%{
/* Comment */
#include <stdio.h>
#include <stdlib.h>
%}
%%
fd {printf("Keyword: fd\n");}
[0-9][a-z][0-9] {printf("Illegal: 3x00\n");}
bk {printf("Keyword: bk\n");}
setc[0-9] {printf("Keyword: setc\n Number: %s\n", yytext);}
int {printf("Keyword: int\n");}
xy3 {printf("ID: xy3\n");}
fd[0-9] {printf("Keyword: fd\n Number %s\n", yytext);}
rt[0-9] {printf("Keyword: rt \n Number %s\n", yytext);}
%%
main( argc, argv)
int argc;
char** argv;
{
    if(argc > 1)
    {
        FILE *file;
        file = fopen(argv[1], "r");
        if(!file)
        {
            fprintf(stderr, "Could not open %s \n", argv[1]);
            exit(1);
        }
        yyin = file;
    }
    yylex();
}
Here are the errors I am getting when I try to compile it:
In function 'yylex':
miniStarLogo.l:11: error: expected expression before '[' token
miniStarLogo.l:11: error: 'a' undeclared (first use in this function)
miniStarLogo.l:11: error: (Each undeclared identifier is reported only once
miniStarLogo.l:11: error: for each function it appears in.)
miniStarLogo.l:11: error: expected ';' before '{' token
miniStarLogo.l:13: error: expected expression before '[' token
miniStarLogo.l:13: error: expected ';' before '{' token
Is the error in my printf statements?
Thank you

When I compiled a copy of your code on MacOS X (10.7.2) with flex (2.5.35) and gcc (4.6.1), the only complaints I got from the C compiler were about the non-prototype definition of main() (because I always compile with that warning enabled) and about yyunput() being defined but not used (which is not your fault).
Since you're learning C, you should only be using the notation:
int main(int argc, char **argv)
{
    ...
}
or an equivalent.
I also converted the miniStarLogo.l file to DOS format (CRLF line endings), and both flex and gcc seemed to be OK with the results - somewhat to my surprise. It might not be the case on your machine.
When I ran the code on your test data, I got:
Keyword: fd
Illegal: 3x00
0
Keyword: bk
setc 100
Keyword: int
ID: xy3
Keyword: fd
10 rt 90
So, you are not far off where you need to be by my reckoning.
Confusion reigneth over my commands.
I used (hmmm, it was GCC 4.2.1 rather than 4.6.1), but:
$ flex miniStarLogo.l
$ gcc -Wall -Wextra -O3 -g -o lex.yy lex.yy.c -lfl
miniStarLogo.l:22: warning: return type defaults to ‘int’
miniStarLogo.l: In function ‘main’:
miniStarLogo.l:42: warning: control reaches end of non-void function
miniStarLogo.l: At top level:
lex.yy.c:1114: warning: ‘yyunput’ defined but not used
$ ./lex.yy <<EOF
> fd 3x00
> bk
> setc 100
> int xy3 fd 10 rt 90
> EOF
Keyword: fd
Illegal: 3x00
0
Keyword: bk
setc 100
Keyword: int
ID: xy3
Keyword: fd
10 rt 90
$
(OK - I cheated marginally: the first time around, I ran rmk lex.yy LDLIBS=-lfl, where rmk is a variant of make and the compilation rules in the directory use the command line shown. But I redid the compilations to get the error messages right, exactly as above.)
You might need to look at expanding your patterns to accept 'one or more' digits with [0-9]+ in place of just [0-9]. You might need to look at a rule dealing with unmatched characters. And personally, I go to great pains to avoid blanks immediately before newlines, so you would need to tighten up your print formatting to meet my criteria. However, that's not germane to getting the program running.
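To make the suggested changes concrete, here is a rough sketch in plain C (not Lex) of the classification the rules are aiming for once whole runs of digits are accepted and unmatched input is reported. The helper name classify_token and the ID rule (a letter followed by letters or digits) are assumptions made for illustration; they are not part of the original program.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: label one whitespace-delimited token. */
const char *classify_token(const char *tok)
{
    static const char *const keywords[] = { "fd", "bk", "rt", "setc", "int" };
    size_t i, n_digits = 0, len = strlen(tok);

    if (len == 0)
        return "Illegal";

    /* Exact keyword match, like the literal patterns fd, bk, setc, int. */
    for (i = 0; i < sizeof keywords / sizeof keywords[0]; i++)
        if (strcmp(tok, keywords[i]) == 0)
            return "Keyword";

    /* Count digits: a token made only of digits corresponds to [0-9]+. */
    for (i = 0; i < len; i++)
        if (isdigit((unsigned char)tok[i]))
            n_digits++;
    if (n_digits == len)
        return "Number";

    /* Assumed ID rule: a letter followed by letters or digits. */
    if (isalpha((unsigned char)tok[0])) {
        for (i = 1; i < len; i++)
            if (!isalnum((unsigned char)tok[i]))
                return "Illegal";
        return "ID";
    }

    /* Everything else (for example 3x00) plays the role of the catch-all rule. */
    return "Illegal";
}
```

With this sketch, classify_token("3x00") gives "Illegal" and classify_token("100") gives "Number". In the Lex file the same effect would come from a [0-9]+ pattern plus a rule for otherwise unmatched input.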
Also, if you need to convert your file from DOS to Unix line endings, the easiest way is the dos2unix command, if you have it on your machine. Otherwise, use:
$ tr -d '\015' < miniStarLogo.l > x
$ od -c x
0000000 % { \r \n \r \n / * C o m m e n t
...
0001560 \n } \r \n
0001564
$ mv x miniStarLogo.l
$
I carefully added the carriage returns using vim and :set fileformat=dos; it would also be possible to undo it with vim and :set fileformat=unix. This is Unix so TMTOWTDI (There's More Than One Way To Do It -- the Perl motto), and I'm not even trying to use Perl.
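If neither dos2unix nor tr is to hand, the same filtering is a few lines of C. This is only a sketch of what tr -d '\015' does to the data, applied to an in-memory string; the function name strip_cr is made up for this example.

```c
#include <stddef.h>

/* Hypothetical helper: delete every '\r' (octal 015) from s in place.
 * Returns the length of the string after the carriage returns are gone. */
size_t strip_cr(char *s)
{
    char *dst = s;
    const char *src;

    for (src = s; *src != '\0'; src++)
        if (*src != '\r')
            *dst++ = *src;   /* keep everything except carriage returns */
    *dst = '\0';
    return (size_t)(dst - s);
}
```

Applied to each line read from the file, this turns CRLF endings into plain LF endings.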

Multiple definition of `main' while bison output file compiling

So I'm writing a bison (without lex) parser, and now I want to read the input code from a file and write the output to another file.
After searching Stack Overflow for some time, I found that this approach should work.
bison.y:
%{
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
extern FILE *yyin;
int yylex() { return getc(stdin); }
void yyerror(char *s) {
    fprintf (stderr, "%s\n", s);
}
int counter = 1;
char filename2[10] = "dest.ll";
FILE *f2;
%}
%name parse
%%
//grammars
%%
int main(int argc, char *argv[]) {
    yyin = fopen(argv[1], "r");
    if (argc > 2)
        f2 = fopen(argv[2], "w");
    else
        f2 = fopen(filename2, "w");
    yyparse();
    return 0;
}
Then I compile it this way:
bison bison.y
cc -ly bison.tab.c
And here is the result of the cc compilation:
/tmp/ccNqiuhW.o: In function `main':
bison.tab.c:(.text+0x960): multiple definition of `main'
/usr/lib/gcc/x86_64-linux-gnu/7/../../../x86_64-linux-gnu/liby.a(main.o):(.text.startup+0x0): first defined here
/tmp/ccNqiuhW.o: In function `main':
bison.tab.c:(.text+0x98c): undefined reference to `yyin'
collect2: error: ld returned 1 exit status
The generated bison.tab.c file has only one main. Of course, int/void main doesn't matter. Can you teach me how to do this correctly?
P.S. By the way, I don't want to spam separate posts, so I have a small question here. How can I store a string (char *) in $$ in bison? For example, I want to generate a code string after matching the int grammar. I have this error and can't find the answer:
bison.y:94:8: warning: assignment makes integer from pointer without a cast [-Wint-conversion]
INTNUM: NUMBER | DIGIT INTNUM {$$ = "string"};
bison.y: In function ‘yyparse’:
bison.y:28:15: warning: format ‘%s’ expects argument of type ‘char *’, but argument 3 has type ‘int’ [-Wformat=]
PROGRAM: EXPRS { fprintf(f2, "%s: string here %d.\n", $$, counter++) };
It would be extremely helpful if someone could help.
You are linking library liby (linker option -ly). The Bison manual has this to say about it:
The Yacc library contains default implementations of the yyerror and
main functions.
So that's why you have multiple definitions of main. You provide one, and there's one in liby.
Moreover, the docs go on to say that
These default implementations are normally not useful, but POSIX requires them.
(Emphasis added)
You do not need to link liby in order to build a program that includes a bison-generated parser, and normally you should not do so. Instead, provide your own main() and your own yyerror(), both of which you've already done.
Additionally, you are expected to provide a definition of yyin, not just a declaration, whether you link liby or not. To do so, remove the extern keyword from the declaration of yyin in your grammar file.
Your grammar is not complete (there are no rules at all) and the %name directive is not documented and is not recognized by my Bison, but if I add a dummy rule and comment out the %name, in conjunction with the other changes discussed, then bison generates a C source file for me that can be successfully compiled to an executable (without liby).
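As a minimal sketch of the last two points, assuming you keep the name yyin for your own input stream and scan one character per token as in the question, the scanner side could look like this. The fallback to stdin when yyin has not been set is an assumption added for safety, not something the question's code does.

```c
#include <stdio.h>

/* A definition, not an extern declaration: this file owns the variable. */
FILE *yyin;

/* Hand-written scanner: every character is its own token. */
int yylex(void)
{
    FILE *in = yyin ? yyin : stdin;   /* assumed fallback if yyin is unset */
    int c = getc(in);

    /* Bison treats a token number of 0 as end of input. */
    return c == EOF ? 0 : c;
}
```

main() then only needs to assign the result of fopen() to yyin (after checking it for NULL) before calling yyparse().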

How to print the names of the built-in functions used in a program from a specific header file in C?

I need to find the built-in functions used in our program from a specific header file.
For example, I have the C file below:
#include<stdio.h>
int main()
{
    int a;
    scanf("%d",&a);
    printf("a = %d\n", a);
}
If I give the stdio.h header file to some command, it should produce the output below:
scanf
printf
Is there any built-in command to get this?
Or any options available in the gcc or cc command to get this?
If you are using GCC as compiler, you can run this command:
echo "#include <stdio.h>" | gcc -E -
This will print many lines from the stdio.h header, and from the files that are included by that header, and so on.
Some lines look like #line …, they tell you where the following lines come from.
You can analyze these lines, but extracting the functions from them (parsing) is quite complicated. But if you just want a quick, unreliable check, you could search whether these lines contain the word scanf or printf.
EDIT
As suggested in a comment, the -aux-info option is more useful, but it works only when compiling a file, not when preprocessing. Therefore:
cat <<EOF >so.c
#include <stdio.h>
int main(int argc, char **argv) {
    for (int i = 1; i < argc; i++) {
        fprintf(stdout, "%s%c", argv[i], i < argc - 1 ? ' ' : '\n');
    }
    fflush(stdout);
    return ferror(stdout) != 0;
}
EOF
gcc -c so.c -aux-info so.aux
Determining the function calls from your program can be done by running objdump on the object file that this produces, as follows:
objdump -t so.o
The above commands give you the raw data. You still need to parse this data and combine it to only give you the data relevant to your question.

Command line arguments in a C program (from a shell)

I'm writing a command line calculator. By convention, the parts of each expression the user provides must be separated by spaces. For example: ./calc 2 + 5 * sin 45
The problem is that when I read the arguments, I also get every file in the folder where I compiled the code...
Here is the code:
int main(int argc, char* argv[]) {
    double result;
    int i;
    printf("Number of arguments: %d\n", argc);
    for (i=0; i<argc; i++) {
        printf("Argument: %s\n", argv[i]);
    }
    //result = equation(argv, argc);
    //printf("Result is: %f", result);
    return 0;
}
And the output for that example expression is:
Number of arguments: 10
Argument: ./calc
Argument: 2
Argument: +
Argument: 5
Argument: calc
Argument: calculate.c
Argument: lab2
Argument: sin
Argument: 45
My question is why calc, calculate.c and lab2 appear (of course, the folder where this program is compiled contains all three files). Should I compile it in a separate folder? I tried that approach, but 'calc' is still there.
P.S. I'm using the gcc compiler: gcc calculate -o calc
This has nothing to do with your program, and everything to do with your shell.
Most UNIX shells, bash included, expand *, the wildcard character, into the names of all matching files. (Windows' cmd does not do this itself; there any expansion is left to the program.)
Your program cannot undo this: the expansion has already happened by the time main() runs.
The alternative would be for your program to take one argument, which is a string containing the expression to be parsed. Of course, you would have to do the parsing, instead of the shell doing it for you. E.g.
./calc '2 + 5 * sin 45'
Note the single quotes. They prevent the shell from expanding anything inside. Your program then has argc == 2, and argv[1] holds the string "2 + 5 * sin 45".
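If you go the single-argument route, the splitting that the shell used to do for you has to happen in the program. A minimal sketch using strtok follows; the function name split_expression and the fixed-size token array are assumptions for illustration. Note that strtok modifies the string it is given, which is permitted for argv[1].

```c
#include <string.h>

/* Hypothetical helper: split expr on spaces, in place.
 * Stores up to max_tokens pointers in tokens[] and returns how many. */
int split_expression(char *expr, char *tokens[], int max_tokens)
{
    int count = 0;
    char *tok = strtok(expr, " ");

    while (tok != NULL && count < max_tokens) {
        tokens[count++] = tok;
        tok = strtok(NULL, " ");   /* continue scanning the same string */
    }
    return count;
}
```

Called as split_expression(argv[1], tokens, 16) with char *tokens[16], the quoted expression above yields six tokens, and the * arrives as an ordinary one-character string.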
This is due to the command line expansion of the * character (which matches all files in the current folder). Try quoting it like so:
./calc 2 + 5 '*' sin 45
or escaping it as follows:
./calc 2 + 5 \* sin 45
Better still, use a different character.
The * character is a special character in most shells. In Linux, the shell interprets it as "every non-hidden file name in the current directory" and expands it before feeding it to the command. That's how you can use a command like this:
grep 'some string' *
The shell expands * to mean all files, so that command searches for 'some string' in all files. In your case, when you want the shell to treat * as a literal character, you should put it in quotes, or escape it with a \ character.

C - cs50.h GetString error

Hello, I am completely new to the world of programming and I am attempting to take Harvard's CS50 course online.
While making my "Hello World" program, I downloaded 'cs50.h' to define GetString and string (at least I think). So this is the code I wrote:
file.c:
#include "cs50.h"
#include <stdio.h>
int main(int argc, string argv[])
{
    string name;
    printf("Enter your name: ");
    name = GetString();
    printf("Hello, %s\n", name);
}
However, whenever I try to make file, this happens:
cc file.c -o file
Undefined symbols for architecture x86_64:
"_GetString", referenced from:
_main in file-JvqYUC.o
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make: *** [file] Error 1
Here is a link to the cs50.h file if it can help: http://dkui3cmikz357.cloudfront.net/library50/c/cs50-library-c-3.0/cs50.h
I would like to know why I get this error and how I can fix it. Please help.
It seems that you forgot to download the cs50.c file from http://dkui3cmikz357.cloudfront.net/library50/c/cs50-library-c-3.0/cs50.c and add it to your project.
*.h files usually contain only declarations. *.c (for C) and *.cpp (for C++) files contain the implementations.
Here is the GetString implementation from that file:
string GetString(void)
{
    // growable buffer for chars
    string buffer = NULL;
    // capacity of buffer
    unsigned int capacity = 0;
    // number of chars actually in buffer
    unsigned int n = 0;
    // character read or EOF
    int c;
    // iteratively get chars from standard input
    while ((c = fgetc(stdin)) != '\n' && c != EOF)
    {
        // grow buffer if necessary
        if (n + 1 > capacity)
        {
            // determine new capacity: start at 32 then double
            if (capacity == 0)
                capacity = 32;
            else if (capacity <= (UINT_MAX / 2))
                capacity *= 2;
            else
            {
                free(buffer);
                return NULL;
            }
            // extend buffer's capacity
            string temp = realloc(buffer, capacity * sizeof(char));
            if (temp == NULL)
            {
                free(buffer);
                return NULL;
            }
            buffer = temp;
        }
        // append current character to buffer
        buffer[n++] = c;
    }
    // return NULL if user provided no input
    if (n == 0 && c == EOF)
        return NULL;
    // minimize buffer
    string minimal = malloc((n + 1) * sizeof(char));
    strncpy(minimal, buffer, n);
    free(buffer);
    // terminate string
    minimal[n] = '\0';
    // return string
    return minimal;
}
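The declaration/implementation split can be shown in a single file. This is only an illustrative sketch: the names greeting_length and twice_greeting_length, and the greet.h / greet.c files mentioned in the comments, are hypothetical and have nothing to do with the CS50 library itself.

```c
#include <string.h>

/* What a header such as greet.h would contain: a declaration only. */
size_t greeting_length(const char *name);

/* A caller compiles fine with just the declaration in scope... */
size_t twice_greeting_length(const char *name)
{
    return 2 * greeting_length(name);
}

/* ...but linking needs the definition, which would live in greet.c. */
size_t greeting_length(const char *name)
{
    return strlen("Hello, ") + strlen(name);
}
```

If the final definition were left out of the build, the compiler would still accept the caller, and the linker would then report an undefined symbol, which is the same kind of failure as the "_GetString" message in the question.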
Look at your first include statement. You are using " " instead of < >.
In the videos for the CS50 course, the instructor uses angle brackets (< >) rather than quotation marks (" ").
For anyone taking the CS50 class who doesn't want to paste the .c code every time, you can also link the CS50 code when compiling.
If the CS50 library is installed on your system, type the following on the command line:
clang file.c -lcs50 -o <file name>
The "-lcs50" option tells the linker to link against the installed CS50 library (libcs50), and "-o" specifies where to put the compiled output. If you only have cs50.h and cs50.c in the same directory as file.c, compile the two source files together instead, for example: clang file.c cs50.c -o <file name>
In your #include "cs50.h" line you should be typing it like this: #include <cs50.h>. Also, try doing:
#include <cs50.h>
#include <stdio.h>
int main(void)
{
    string name = get_string("Enter your name: ");
    printf("%s\n", name);
}
Instead of this:
#include "cs50.h"
#include <stdio.h>
int main(int argc, string argv[])
{
    string name;
    printf("Enter your name: ");
    name = GetString();
    printf("Hello, %s\n", name);
}
That should get rid of the error messages.
P.S.
In week 2 they tell you about help50, but if you want you can use it now.
I myself have found it very useful. Here's how it works: in your terminal window (the one where you run ./hello and clang), type help50 make hello. It prints "Asking for help..." in yellow, then deciphers the error message and restates it in simpler language. For example:
#include <stdio.h>
#include <cs50.h>
int main(void)
{
    string name = get_string("Enter your name: ");
    printf("%s\n", name)
}
I do make hello, and this appears:
clang -ggdb3 -O0 -std=c11 -Wall -Werror -Wextra -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wshadow hello.c -lcrypt -lcs50 -lm -o hello
hello.c:13:21: error: expected ';' after expression
printf("%s\n", name)
^
;
1 error generated.
<builtin>: recipe for target 'hello' failed
make: *** [hello] Error 1
But when I do it with help50 make hello, this appears:
clang -ggdb3 -O0 -std=c11 -Wall -Werror -Wextra -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wshadow hello.c -lcrypt -lcs50 -lm -o hello
hello.c:13:21: error: expected ';' after expression
printf("%s\n", name)
^
;
1 error generated.
<builtin>: recipe for target 'hello' failed
make: *** [hello] Error 1
Asking for help...
hello.c:13:21: error: expected ';' after expression
printf("%s\n", name)
^
;
Are you missing a semicolon at the end of line 13 of hello.c?
As you can see, now I know my problem and can fix it. Help50 deciphers the error messages into a language you can understand.

GCC compilation errors related to while statement [duplicate]

This question already has an answer here:
Lots of stray errors - "error: stray ‘\210’ in program in C++" [duplicate]
(1 answer)
Closed 8 years ago.
When trying to compile this short C program using GCC I get these errors:
expected ‘)’ before numeric constant
make: *** [file3_5.o] Error 1
stray ‘\210’ in program
stray ‘\227’ in program
stray ‘\342’ in program
Eclipse (Juno) points all of these errors to one line of code:
while(fgets(line ,STRSIZE∗NFIELDS, fp))
Using the following statement to compile:
gcc -O0 -g3 -Wall -c -fmessage-length=0 -MMD -MP -MF"file3_5.d" -MT"file3_5.d" -o "file3_5.o" "../file3_5.c"
Here is the program I am trying to compile:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define STRSIZE 100
#define NFIELDS 9
int main()
{
    char inputfile[]= "/home/ty/workspace/OpenCoursware_Exercises/Assign_ /stateoutflow0708.txt";
    /* define all of the fields */
    char state_code_org[STRSIZE];
    char country_code_org[STRSIZE];
    char state_code_dest[STRSIZE];
    char country_code_dest[STRSIZE];
    char state_abbrv[STRSIZE];
    char state_name[STRSIZE];
    char line[STRSIZE*NFIELDS];
    int return_num = 0;
    int exmpt_num=0;
    int aggr_agi= 0;
    long total=0;
    /* file related */
    int fields_read = 0;
    FILE* fp=fopen(inputfile,"r");
    if(fp==NULL)
    {
        fprintf(stderr, "Cannot open file\n");
        exit(-1);
    }
    /* skip first line */
    fgets(line, STRSIZE*NFIELDS,fp);
    /* print the header */
    printf ("%-30s,%6s\n","STATE","TOTAL");
    printf("---------------------------------------\n");
    while(fgets(line ,STRSIZE∗NFIELDS, fp))
    {
        /* parse the fields */
        fields_read=sscanf(line,"%s %s %s %s %s %s %d %d %d",
                           state_code_org,
                           country_code_org,
                           state_code_dest,
                           country_code_dest,
                           state_abbrv,
                           state_name,
                           &return_num,
                           &exmpt_num,
                           &aggr_agi);
        if(strcmp(state_code_org, "\"25\"")==0)
        {
            printf("%-30s, %6d\n", state_name, aggr_agi);
            total += aggr_agi;
        }
    }
    /* print the header */
    printf(" ----------------------------------------\n");
    printf("%-30s,%6lu\n","TOTAL",total);
    fclose(fp);
    return 0;
}
Your ∗ is not the multiplication operator *. They may look similar, but they are different characters, and gcc doesn't recognize the ∗
while(fgets(line ,STRSIZE∗NFIELDS, fp))
                         ^
Should be
while(fgets(line ,STRSIZE*NFIELDS, fp))
                         ^
(Whether you see a difference between the two depends on the font used to display the characters).
The ∗ in the first one is not the character used for the multiplication operator; it is the Unicode character U+2217, ASTERISK OPERATOR.
Your "*" character in STRSIZE*NFIELDS is not the regular * (ASCII value 42) but a Unicode character, "ASTERISK OPERATOR": http://www.fileformat.info/info/unicode/char/2217/index.htm
That's what the compiler is trying to tell you by complaining about stray characters in the source.
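The three stray values in the error messages, \342, \210 and \227, are the octal bytes of that character's UTF-8 encoding. The following sketch decodes a three-byte UTF-8 sequence to check this; the function name decode_utf8_3 is invented for the example, and it only validates the basic bit pattern (it does not reject overlong encodings or surrogates).

```c
/* Decode one three-byte UTF-8 sequence of the form
 * 1110xxxx 10xxxxxx 10xxxxxx into its code point.
 * Returns -1 if the bytes do not have that shape. */
long decode_utf8_3(unsigned char b0, unsigned char b1, unsigned char b2)
{
    if ((b0 & 0xF0) != 0xE0 || (b1 & 0xC0) != 0x80 || (b2 & 0xC0) != 0x80)
        return -1;

    /* 4 payload bits from the lead byte, 6 from each continuation byte. */
    return ((long)(b0 & 0x0F) << 12)
         | ((long)(b1 & 0x3F) << 6)
         |  (long)(b2 & 0x3F);
}
```

decode_utf8_3(0342, 0210, 0227) evaluates to 0x2217, the code point of ASTERISK OPERATOR, whereas the plain * that the compiler expects is the single byte 42.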
