Can anyone explain this lex preprocessor output? - c

I want to manipulate the output of lex. There is only one write to yyout, in the ECHO macro. The macro is surrounded by "#ifndef ECHO", so I am replacing it with my desired action.
However, I want to be sure to correctly replicate the original lex behavior. Lex defines ECHO to this code fragment:
do {
if (fwrite( yytext, yyleng, 1, yyout )) {
}
} while (0)
Can anyone guess why the output is not simply "fwrite(...)"?

do { .. } while (0)
is a conventional way to #define a multi-statement macro so that it behaves like a single statement.
By
if (fwrite( yytext, yyleng, 1, yyout ))
I believe this gives you a place to deal with fwrite failure.
Here fwrite is called with just 1 element of size yyleng. Since fwrite returns the number of elements written, the only possible return values are 0 and 1: 0 indicates failure and 1 indicates success.
Ideally (or perhaps it actually is?), it should have been
if (!fwrite( yytext, yyleng, 1, yyout ))
I'm guessing this because the only block provided is the one where fallback or logging code would go.
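As a rough illustration of both points, here is a minimal sketch of a replacement macro that keeps the do { ... } while (0) wrapper and actually uses the fwrite result. CHECKED_ECHO, echo_failures and echo_if are made-up names for this example, not part of lex, and counting failures is only one assumed way to react to an error.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical counter (not part of lex): number of failed writes. */
static int echo_failures = 0;

/* Hypothetical replacement for ECHO. fwrite is asked for 1 element of
   `len` bytes, so it returns 1 on success and 0 on failure. A zero-length
   write also returns 0, so it is skipped rather than counted as a failure. */
#define CHECKED_ECHO(text, len, out)                                   \
    do {                                                               \
        if ((len) > 0 && fwrite((text), (len), 1, (out)) != 1)         \
            ++echo_failures;                                           \
    } while (0)

/* Shows why the wrapper matters: a bare { ... } body would turn the `;`
   before `else` into a syntax error; do/while(0) absorbs that `;`. */
int echo_if(int enabled, const char *text, FILE *out)
{
    if (enabled)
        CHECKED_ECHO(text, strlen(text), out);
    else
        return 0;   /* nothing written */
    return 1;
}
```

In a lex file, something like "#define ECHO CHECKED_ECHO(yytext, yyleng, yyout)" in the definitions section would presumably take the place of the default, since the generated code only defines ECHO when it is not already defined.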

Related

Yacc actions return 0 for any variable ($)

I'm new to Lex/Yacc. Found these lex & yacc files which parse ansi C.
To experiment I added an action to print part of the parsing:
constant
: I_CONSTANT { printf("I_CONSTANT %d\n", $1); }
| F_CONSTANT
| ENUMERATION_CONSTANT /* after it has been defined as such */
;
The problem is, no matter where I put the action, and whatever $X I use, I always get value 0.
Here I got printed:
I_CONSTANT 0
Even though my input is:
int foo(int x)
{
return 5;
}
Any idea?
Nothing in the lex file you point to actually sets semantic values for any token. As the author says, the files are just a grammar and "the bulk of the work" still needs to be done. (There are other caveats having to do with the need for a preprocessor.)
Since nothing in the lex file ever sets yylval, it will always be 0, and that is what yacc/bison will find when it sets up the semantic value for the token ($1 in this case).
Turns out yylval = atoi(yytext) is not done in the lex file, so I had to add it myself. Also learned I can add extern char *yytext to the yacc file header, and then use yytext directly.
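A small sketch of what that conversion might look like, assuming the default int semantic type. int_constant_value is a hypothetical helper name; the lex action for I_CONSTANT would call it as yylval = int_constant_value(yytext); before returning the token. It uses strtol instead of atoi so that octal and hexadecimal constants are also handled.

```c
#include <stdlib.h>

/* Hypothetical helper: convert the text of an integer constant to an int.
   Base 0 lets strtol accept decimal, octal (leading 0) and hexadecimal
   (leading 0x) forms. Parsing stops at the first character that is not a
   digit in that base, so suffixes such as U or L are ignored.
   Character constants such as 'a' are not handled here. */
int int_constant_value(const char *text)
{
    return (int)strtol(text, NULL, 0);
}
```

The value is truncated to int, so constants that need a wider type would require a different semantic type in the grammar.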

Splitting a file into multiple files with a delimiter in awk

I am trying to split files evenly in a number of chunks. This is my code:
awk '/*/ { delim++ } { file = sprintf("splits/audio%s.txt", int(delim /2)); print >> file; }' < input_file
My file looks like this:
"*/audio1.lab"
0 6200000 a
6200000 7600000 b
7600000 8200000 c
.
"*/audio2.lab"
0 6300000 a
6300000 8300000 w
8300000 8600000 e
8600000 10600000 d
.
It is giving me an error: awk: line 1: syntax error at or near *
I do not know enough about awk to understand this error. I tried escaping characters but still haven't been able to figure it out. I could write a script in python but I would like to learn how to do this in awk. Any awkers know what I am doing wrong?
Edit: I have 14021 files. I gave the first two as an example.
For one thing, your regular expression is illegal; '*' says to match the previous character 0 or more times, but there is no previous character.
It's not entirely clear what you're trying to do, but it looks like when you encounter a line with an asterisk you want to bump the file number. To match an asterisk, you'll need to escape it:
awk '/\*/ { close(file); delim++ } { file = sprintf("splits/audio%d.txt", int(delim /2)); print >> file; }' < input_file
Also note %d is the correct format character for decimal output from an int.
idk what all the other stuff around this question is about but to just split your input file into separate output files all you need is:
awk '/\*/{close(out); out="splits/audio"++c".txt"} {print > out}' file
Since "repetition" metacharacters like * or ? or + can take on a literal meaning when they are the first character in a regexp, the regexp /*/ will work just fine in some awks (e.g. gawk) but not all. And since you apparently have a problem with having too many files open, you must not be using gawk (which manages files for you), so you probably need to escape the * and close() each output file when you're done writing to it. No harm in doing that, and it makes the script portable to all awks.

How can I create a grammar rule for an error?

I am writing a compiler in C, and I use bison for the grammar and flex for the tokens. To improve the quality of error messages, some common errors need to appear in the grammar. This has the side effect, however, of bison thinking that an invalid input is actually valid.
For example, consider this grammar:
program
: command ';' program
| command ';'
| command {yyerror("Missing ;.");} // wrong input
;
command
: INC
| DEC
;
where INC and DEC are tokens and program is the initial symbol. In this case, INC; is a valid program, but INC is not, and an error message is generated. The function yyparse(), however, returns 0 as if the program were correct.
Looking at the bison manual, I found the macro YYERROR, which should behave as if the parser itself found an error. But even if I add YYERROR after the call to yyerror(), the function yyparse() still returns 0. I could use YYABORT instead, but that would stop on the first error, which is terrible and not what I want.
Is there any way to make yyparse() return 1 without stopping on the first error?
Since you intend to recover from syntax errors, you're not going to be able to use the return code from yyparse to signal that one or more errors occurred. Instead, you'll have to track that information yourself.
The traditional way to do that would be to use a global error count (just showing the crucial pieces):
%{
int parse_error_count = 0;
%}
%%
program: statement { yyerror("Missing semicolon");
++parse_error_count; }
%%
int parse_interface() {
parse_error_count = 0;
int status = yyparse();
if (status) return status; /* Might have run out of memory */
if (parse_error_count) return 3; /* yyparse returns 0, 1 or 2 */
return 0;
}
A more modern solution is to define an additional "out" parameter to yyparse:
%parse-param { int* error_count }
%%
program: statement { yyerror("Missing semicolon");
++*error_count; }
%%
int main() {
int error_count = 0;
int status = yyparse(&error_count);
if (status || error_count) { /* handle error */ }
}
Finally, in case you really need to export the symbol yyparse from your bison-generated code, you can do the following ugly hack:
%code top {
#define yyparse internal_yyparse
}
%parse-param { int* error_count }
%%
program: statement { yyerror("Missing semicolon");
++*error_count; }
%%
#undef yyparse
int yyparse() {
int error_count = 0;
int status = internal_yyparse(&error_count);
// Whatever you want to do as a summary
return status ? status : error_count ? 1 : 0;
}
yyerror() just prints an error message. It doesn't alter what yyparse() returns.
What you're attempting is not a good idea. You'll enormously expand the grammar and you run a major risk of making it ambiguous. All you need to do is remove the production that calls yyerror(). That input will produce a syntax error anyway, and that will cause yyparse() not to return 0. You're keeping a dog and barking yourself. What you should be checking for is semantic errors that the parser can't see.
If you really want to improve the error messages, there's enough information in the parse tables and state information to tell you what the expected next token was. However in most cases it's such a large set it's pointless to print it. But programmers are used to sorting out 'syntax error'. Don't sweat it. Writing compilers is hard enough already.
NB You should make your grammar left-recursive to avoid excessive stack usage: for example, program : program ';' command.

Function macro argument to function macro

I have some macros to define bit fields in registers easily (I use these for read, modify, write operations, set, gets etc). I'm getting a compiler error that I don't understand.
// used just for named arguments -- to make the values more clear when defined
#define FLDARGS(dwOffset, bitStart, bitLen) dwOffset, bitStart, bitLen
// extract just the dwOffset part
#define FLD_DWOFFSET(dwOffset, bitStart, bitLen) dwOffset
// define a bit field
#define CFGCAP_DEVCTRL FLDARGS(2, 16, 4)
// in a function:
uint32_t dwAddr = addr/4;
// compare just the dwOffset part
if(dwAddr == FLD_DWOFFSET( CFGCAP_DEVCTRL ))
{
// do something
}
I expected this to expand like:
CFGCAP_DEVCTRL = 2, 16, 4
FLD_DWOFFSET( CFGCAP_DEVCTRL ) = 2
I get the gcc error:
error: macro "FLD_DWOFFSET" requires 3 arguments, but only 1 given
if(dwAddr == FLD_DWOFFSET( CFGCAP_DEVCTRL ))
^
error: ‘FLD_DWOFFSET’ was not declared in this scope
if(dwAddr == FLD_DWOFFSET( CFGCAP_DEVCTRL ))
Any help? Thanks.
Let's see how your macros are going to be processed:
if(dwAddr == FLD_DWOFFSET( CFGCAP_DEVCTRL ))
First, it tries to substitute the outermost macro, which is FLD_DWOFFSET. But it requires 3 arguments, when you only provide 1 (your inner macro isn't parsed at that moment yet). The preprocessor can't go any further, hence the error.
There is more relevant info here: http://gcc.gnu.org/onlinedocs/cpp/Macro-Pitfalls.html#Macro-Pitfalls
The macro pass expands macros in the order they're found. The first macro to be found is FLD_DWOFFSET( stuff ), which only sees one argument CFGCAP_DEVCTRL, and as a result cannot expand the FLD_DWOFFSET macro. It does not try to expand further macros until the current expansion is complete - in other words, it won't recognize that CFGCAP_DEVCTRL is a macro until it's finished expanding FLD_DWOFFSET, but it won't do that because you haven't provided enough arguments...
The other answers are correct in telling me why I can't do what I want to do. FLD_DWOFFSET is being evaluated with a single arg that isn't being expanded.
Here's my solution:
static inline uint32_t FLD_DWOFFSET(int dwOffset, int bitStart, int bitLen){return dwOffset;}
Hopefully this performs the same with optimization. Since it's a function, the macro argument (which expands to 3 args) is expanded before calling it.
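For comparison, a pure-macro alternative is to add one level of indirection, so that the argument is expanded before the three-parameter macro is invoked. This is a minimal sketch assuming a compiler with variadic macros (C99 or C++11). The helper names with a trailing underscore, FLD_BITSTART, FLD_BITLEN and is_devctrl_dword are invented for this example.

```c
#include <stdint.h>

/* Named arguments, as in the question. */
#define FLDARGS(dwOffset, bitStart, bitLen) dwOffset, bitStart, bitLen

/* The outer macros accept whatever they are given. Their argument is fully
   macro-expanded before substitution, so by the time the helper is invoked
   on rescan it sees three separate arguments. */
#define FLD_DWOFFSET(...) FLD_DWOFFSET_(__VA_ARGS__)
#define FLD_BITSTART(...) FLD_BITSTART_(__VA_ARGS__)
#define FLD_BITLEN(...)   FLD_BITLEN_(__VA_ARGS__)

/* Hypothetical helpers that do the actual selection. */
#define FLD_DWOFFSET_(dwOffset, bitStart, bitLen) (dwOffset)
#define FLD_BITSTART_(dwOffset, bitStart, bitLen) (bitStart)
#define FLD_BITLEN_(dwOffset, bitStart, bitLen)   (bitLen)

/* Bit field definition, as in the question. */
#define CFGCAP_DEVCTRL FLDARGS(2, 16, 4)

/* Compare just the dword offset part of a byte address. */
int is_devctrl_dword(uint32_t addr)
{
    uint32_t dwAddr = addr / 4;
    return dwAddr == FLD_DWOFFSET(CFGCAP_DEVCTRL);
}
```

Unlike the inline function, the result remains a constant expression, so it can still be used in places such as case labels or array sizes.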

Reading from binary file is unsuccessful in C

I am using the C programming language and am trying to read the first string of every line in a binary file.
Example of data in the binary file (I have written to a txt file in order to show you)
Iliya Iliya Vaitzman 16.00 israel 1 0 1
I want to read the first Iliya in the line (or whatever the first word in the line happens to be).
I am trying the following code, but it keeps returning NULL in the string variable I pass to it:
FILE* ptrMyFile;
char usernameRecieved[31];
boolean isExist = FALSE;
ptrMyFile = fopen(USERS_CRED_FILENAME, "a+b");
if (ptrMyFile)
{
while (!feof(ptrMyFile) && !isExist)
{
fread(usernameRecieved, 1, 1, ptrMyFile);
if (!strcmp(userName, usernameRecieved))
{
isExist = TRUE;
}
}
}
else
{
printf("An error has encountered, Please try again\n");
}
return isExist;
I used typedef and #define to create a boolean type (0 is false, everything else is true; TRUE is true, FALSE is false).
usernameRecieved keeps getting NULL from the fread.
What should I do in order to solve this?
Instead of this:
fread(usernameRecieved, 1, 1, ptrMyFile);
try this:
memset(usernameRecieved, 0, sizeof(usernameRecieved));
fread(usernameRecieved, sizeof(usernameRecieved)-1, 1, ptrMyFile);
As it is, you are reading at most only one byte from the file.
Documentation on fread
A couple things: you're setting the count field in fread to 1, so you'll only ever read 1 byte, at most (assuming you don't hit an EOF or other terminal marker).
It's likely that what you want is:
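fread(usernameRecieved, 1, sizeof(usernameRecieved) - 1, ptrMyFile);
That way you'll fill the char buffer while leaving room for a terminating '\0', which you need to add yourself before calling strcmp. You'll then want to compare only up to whatever delimiter you're using (space, period, etc).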
It's not clear what "usernameRecieved keeps getting NULL" means; usernameRecieved is on the stack (you aren't using malloc). Do you mean that nothing is being read? I highly suggest that you always check the return value from fread to see how much is read; this is helpful in debugging.
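Putting those suggestions together, here is a minimal sketch that assumes the records really are text lines like the sample in the question, each shorter than the line buffer. first_word_matches is a hypothetical name, and the sketch swaps fread for fgets plus sscanf so that the first word of each line is isolated directly.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns true if `userName` equals the first
   whitespace-delimited word of any line in `fp`.
   Assumes one record per line and lines shorter than 256 bytes. */
bool first_word_matches(FILE *fp, const char *userName)
{
    char line[256];
    char first[31];

    /* Loop on the result of the read itself rather than on feof():
       fgets returns NULL at end of file or on error. */
    while (fgets(line, sizeof line, fp) != NULL) {
        /* %30s stores at most 30 characters plus the terminator. */
        if (sscanf(line, "%30s", first) == 1 && strcmp(first, userName) == 0)
            return true;
    }
    return false;
}
```

Checking the return values of fgets and sscanf, as suggested above for fread, also avoids processing a stale buffer after the last record.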
