how to rescan m4 data for recursive macro inplace substitution - c

I have this very simple code.
define(`S',`some')
define(`T',`thing')
define(`D',`doing')
define(`something',`st_todo')
define(`st_tododoing',`done!')
S`'T
OR
S()T()
the actual end result is
something
OR
something
but I expected the recursive substitution to give:
st_todo
How can I feed the output back into the input so that it is rescanned?
Maybe with "indir", but it's a crooked, nasty solution:
indir(S()`'T()`'D())
the result is:
--NOTHING--
Maybe the so-called "divert" command can recycle the output back into the input?
Of course, the reviews say m4 does "in-place macro substitution",
but the results are different.
Of course, we can use a primitive "C"-style macro solution, e.g.:
define(`concat',`$1$2$3$4$5')
But this "concat" solution will increase the nesting complexity of the code in a large code reconstruction, e.g.:
concat(concat(S1,T1),concat(S2,T2,more1,more2,...))
What if we have a "concat" of 10 words or more, or one combined with "ifelse" conditionals?
The M4 concept goes beyond the old "C" preprocessor!
The real solution must come from inside the macro core system.
Any ideas?

M4 does rescan. However, the empty string `' ends the preceding token, and thus prevents some and thing from being recognised as a single token.
Instead, concatenate the macro expansions using another macro:
define(`concat',`$1$2')
define(`S',`some')
define(`T',`thing')
define(`something',`st_todo')
concat(S,T)
$ m4<<"EOF"
> define(`concat',`$1$2')dnl
> define(`S',`some')dnl
> define(`T',`thing')dnl
> define(`something',`st_todo')dnl
> concat(S,T)
> EOF
st_todo

Related

"Use" the Perl file that h2ph generated from a C header?

The h2ph utility generates a .ph "Perl header" file from a C header file, but what is the best way to use this file? Like, should it be require or use?:
require 'myconstants.ph';
# OR
use myconstants; # after mv myconstants.ph myconstants.pm
# OR, something else?
Right now, I am doing the use version shown above, because with that one I never need to type parentheses after the constant. I want to type MY_CONSTANT and not MY_CONSTANT(), and I have use strict and use warnings in effect in the Perl files where I need the constants.
It's a bit strange though to do a use with this file since it doesn't have a module name declared, and it doesn't seem to be particularly intended to be a module.
I have just one file I am running through h2ph, not a hundred or anything.
I've looked at perldoc h2ph, but it didn't mention the subject of the intended mechanism of import at all.
Example input and output: For further background, here's an example input file and what h2ph generates from it:
// File myconstants.h
#define MY_CONSTANT 42
...
# File myconstants.ph - generated via h2ph -d . myconstants.h
require '_h2ph_pre.ph';
no warnings qw(redefine misc);
eval 'sub MY_CONSTANT () {42;}' unless defined(&MY_CONSTANT);
1;
Problem example: Here's an example of "the problem," where I need to use parentheses to get the code to compile with use strict:
use strict;
use warnings;
require 'myconstants.ph';
sub main {
print "Hello world " . MY_CONSTANT; # error until parentheses are added
}
main;
which produces the following error:
Bareword "MY_CONSTANT" not allowed while "strict subs" in use at main.pl line 7.
Execution of main.pl aborted due to compilation errors.
Conclusion: So is there a better or more typical way that this is used, as far as following best practices for importing a file like myconstants.ph? How would Larry Wall do it?
You should require your file. As you have discovered, use accepts only a bareword module name, and it is wrong to rename myconstants.ph to have a .pm suffix just so that use works.
The choice of use or require makes no difference to whether parentheses are needed when you use a constant in your code. The resulting .ph file defines constants in the same way as the constant module, and all you need in the huge majority of cases is the bare identifier. One exception to this is when you are using the constant as a hash key, when
my %hash = ( CONSTANT => 99 );
my $val = $hash{CONSTANT};
doesn't work, as you are using the string CONSTANT as a key. Instead, you must write
my %hash = ( CONSTANT() => 99 );
my $val = $hash{CONSTANT()};
You may also want to wrap your require inside a BEGIN block, like this
BEGIN {
require 'myconstants.ph';
}
to make sure that the values are available to all other parts of your code, including anything in subsequent BEGIN blocks.
The problem does somewhat lie in the require.
Since require is a statement that will be evaluated at run-time, it cannot have any effect on the parsing of the latter part of the script. So when perl reads through the MY_CONSTANT in the print statement, it does not even know the existence of the subroutine, and will parse it as a bareword.
It is the same for eval.
One solution, as mentioned by others, is to put it into a BEGIN block. Alternatively, you may forward-declare it yourself:
require 'some-file';
sub MY_CONSTANT;
print 'some text' . MY_CONSTANT;
Finally, from my perspective, I have not ever used any ph files in my Perl programming.

Using macros to generalise code for function calls

I'm writing C code which requires me to use multiple function calls of the same definition which differ only by single characters. Is there a way I can make a macro function which takes say a number and can insert these calls into my code for me where I call the macro given I know the numbers at compile time:
i.e.
#define call_pin_macro(X)
enable_pin#X();
do_thing_pin#X();
do_other_thing_pin#X();
.
.
void pin_function(void){
call_pin_macro(1);
call_pin_macro(2);
call_pin_macro(3);
}
Instead of:
void pin_function(void){
enable_pin1();
do_thing_pin1();
do_other_thing_pin1();
enable_pin2();
do_thing_pin2();
do_other_thing_pin2();
enable_pin3();
do_thing_pin3();
do_other_thing_pin3();
}
As a note, I have looked at stringification (hence the included #X's) in gcc, but I cannot get the above code to compile: I get the error "error: '#' is not followed by a macro parameter". So it seems this isn't exactly the functionality I am after. Thanks in advance.
In gcc you can do it like this:
#define call_pin_macro(X) \
enable_pin##X(); \
do_thing_pin##X(); \
do_other_thing_pin##X();
The double hash is the macro concatenation operator. You don't want to use stringify because that will put quotes around it.
The backslashes allow you to continue the macro over several lines.

Two quick questions about flex / C

I would like to use this idiom :
yy_scan_string(line);
int i;
while ((i = yylex()))
....
where these two functions are defined in the flex-generated lex.yy.c, in my main C file. So far, I am
#including "lex.yy.c"
but it seems fishy. How do I do that the correct C way?
Secondly, I would like the last line of my .l file,
. { return WORD; }
to no longer return a "WORD" token, but rather to return its input. For example (it is a smallish Linux shell)
ls > ls.txt
Currently returns 2 WORD tokens, a GREATER token, and 6 WORD tokens, when I would like a return of "ls" GREATER "ls.txt". Of course yylex() can only return one type, so what is the accepted way to obtain the desired result?
Thanks.
You can tell flex to generate a header file as well as the C source file, using the --header-file=<filename> command line option, or by including %option header-file="<filename>" in the flex source. I typically invoke flex with:
flex --header-file=file.h -o file.c file.l
(Actually, I use make rules to generate a command like that, but that's the idea.) Then you can #include "file.h" in any source file which needs to invoke a flex function.
Normally, yylex returns the token type (an integer). The global variable yytext contains a pointer to the token string itself, which is probably sufficient for your purposes. However, please read "A Note About yytext And Memory" in the flex manual. (Summary: if you need to save the value of yytext, you must make a copy of it; strdup is recommended. Don't forget to free the copy when you don't need it anymore.)
Sometimes, the token string itself is not exactly what you want as a semantic value. By convention, flex actions place the semantic value of the token in the global yylval, which is where bison-generated parsers will look for it. However, yylval is not declared anywhere by flex-generated code, so you need to include a declaration yourself, both in the flex-generated code and in any source file which includes it. (If you use bison to generate your parser, bison will generate this declaration and put it in the header file it generates.)

What kind of statements, keywords, arguments etc. can span multiple lines, and which need "\" for this?

How can I know what kinds of "things" can span multiple lines in C code without needing a \ character at the end of the line, and what kinds of "things" need the \? For example, in the following code, if and printf() work fine if I split them across multiple lines.
if
(2<5)
printf
("Hi");
But in the following code, printf() needs a \, otherwise it shows an error:
printf("Hi \
");
Similarly, the following shows an error without a \:
char name[]="Alexander the \
great of Greece";
So please tell me how to know when to use the \ while spanning multiple lines in C code, and when we can do without it? I mean, like how if works both with and without the \.
This is about a concept called 'tokens'. A token is source-program text that the compiler does not break down into component elements. Literals (42, "text"), variable names, keywords are tokens.
Endline escaping is important for string constants only because it breaks apart a token. In your first example line breaks don't split tokens. All whitespace symbols between tokens are ignored.
The exception is macro definitions. A macro definition is ended with line break, so you need to escape it. But macros are not C code.
If you want to break a string across lines, you can either use the \ as you have...
printf("Hello \
World");
Or, alternatively, you can terminate the string with a " and then start a new string on the next line with no punctuation...
printf("Hello "
"World");
To the best of my knowledge, the issue with lines applies in only two places: within a string and within a define.
#define MY_DEFINE(fp) \
fprintf( fp, "Hello "\
"World" );
In short, the \ character is telling the compiler this statement continues on the next line. However, C/C++ is not white-space dependent, so really the only place this would come up is on a statement that is expected to be on a single line... which would be a string or a define.
C does not depend on line feeds.
You could use line feeds anywhere you like, as well as just not using them at all.
This implies seeing string literals as one token.
But, as in real life: too many or too few, both make life difficult. Happiness is a matter of balance ... :-)
Please note that lines starting with a # are not C code, but pre-processor instructions.

Regular expression for filter c comments

For a merge with a tool I need to compare only non-commented parts of source lines.
So I try to create a filter which detects actual code, i.e. a regular expression that matches all text EXCEPT comments.
Perhaps something like this:
^.*(?!((/\**([^*]|[\r\n]|(\*+([^*/]|[\r\n])))*\*+/)|(//.*)))
This one will do :
(/\*([^*]|[\r\n]|(\*+([^*/]|[\r\n])))*\*+/)|(//.*)
Source : http://ostermiller.org/findcomment.html.
Or using non-greedy matching : (/\*([\r\n]|.)*?\*/)|(//.*).
Amine's answer is right, but you could also search for any comments and remove them from the string:
This regex will give you all comments:
(/\*.*?\*/)|//.*?\n
This will replace the matches with "" (if you're using c++):
std::string str2 = std::tr1::regex_replace(string, regex, "");
Maybe your compiler can help. Some might have an option to preprocess source and strip comments. Maybe the preprocessor can be made to only strip comments. This would be the Unix way of having one tool do one thing right--the C preprocessor knows what a comment is (while regexen are a kluge for parsing, IMNSHO).
As a second option, writing a lexer with lex or flex to recognize comments is easy. There should be plenty of examples on the 'net. Any search engine will turn up tons of hits.
For multiline comments use :
/\/\*([\s\S]*?)\*\//mg
and for matching single line comments:
/\/\/([\s\S]*?)[\n\r]?$/mg
or combine these two for matching all comments
/(\/\*(?<multiline>[\s\S]*?)\*\/)|(\/\/(?<singleline>[\s\S]*?)[\n\r]?$)/mg
