strip action code from bison grammar file - c

Is there any existing tool to strip all the action code from bison grammar files, leaving only the {} around it?

To the best of my knowledge, no.
As you surely know, writing your own tool is doable, but difficult. For example, the { and } characters can appear as character constants or in strings. (So can the : and ; characters, of course.)
If you have specific files you want to strip the actions from, and you can rely on your own environment and constraints (i.e. you don't need a solution for the general case), there may be a relatively simple way to do it.
If you need a full general solution, what remains is to hack bison code. Not for the faint of heart, I admit. That said, much of bison is implemented or sketched out in bison.
In the bison sources, see scan-gram.l and parse-gram.y for a bison scanner/parser team. The token to look out for is BRACED_CODE.
Now, since what you need is basically to take a file and generate a near-exact copy of it, and you don't really need to understand it, you can probably do all your work in the lexer. You can use scan-gram.l as a basis for your work. A helpful modification may be to add another state (start condition) to describe if you're in the prologue/declaration section, versus the grammar rules. Everything but the grammar rules should be printed verbatim.
Comments, whitespace, directives, most punctuation, identifiers, numbers: just print these out verbatim.
Characters and strings: these require their own states in the lexer because it's essential to find where they end. (Character literals may be longer than one keyboard character; think octal.) But given that they have their own states, print them out verbatim.
Code: like characters and strings, you need to figure out where it ends. This is a bit trickier, too, because it may contain strings and comments and whatnot. But once you find where it ends, you can exit the code state. Nothing in here needs to be printed (except for the braces, of course).
Good luck!

I know the post is old, but I came across the same problem and found a much simpler solution using a small python script.
filename = "in.txt";
b_count = 0;
with open("out.txt", "w") as fout:
with open(filename) as f:
while True:
c = f.read(1)
if not c:
print "End of file"
break
if (b_count == 0):
fout.write(c);
if (c == '{'):
b_count += 1
else :
if (c == '{'):
b_count += 1
if (c == '}'):
b_count -= 1
if (b_count == 0):
fout.write('}')
I hope this is helpful to anyone!

Related

I need help filtering bad words in C?

As you can see, I am trying to filter various bad words. I have some code to do so. I am using C, and also this is for a GTK application.
char LowerEnteredUsername[EnteredUsernameLen];
for(unsigned int i = 0; i < EnteredUsernameLen; i++) {
LowerEnteredUsername[i] = tolower(EnteredUsername[i]);
}
LowerEnteredUsername[EnteredUsernameLen+1] = '\0';
if (strstr(LowerEnteredUsername, (char[]){LetterF, LetterU, LetterC, LetterK})||strstr(LowerEnteredUsername, (char[]){LetterF, LetterC, LetterU, LetterK})) {
gtk_message_dialog_set_markup((GtkMessageDialog*)Dialog, "This username seems to be innapropriate.");
UsernameErr = 1;
}
My issue is, is that, it will only filter the last bad word specified in the if statement. In this example, "fcuk". If I input "fuck," the code will pass that as clean. How can I fix this?
(char[]){LetterF, LetterU, LetterC, LetterK}
(char[]){LetterF, LetterC, LetterU, LetterK}
You’ve forgotten to terminate your strings with a '\0'. This approach doesn’t seem to me to be very effective in keeping ~bad words~ out of source code, so I’d really suggest just writing regular string literals:
if (strstr(LowerEnteredUsername, "fuck") || strstr(LowerEnteredUsername, "fcuk")) {
Much clearer. If this is really, truly a no-go, then some other indirect but less error-prone ways are:
"f" "u" "c" "k"
or
#define LOWER_F "f"
#define LOWER_U "u"
#define LOWER_C "c"
#define LOWER_K "k"
and
LOWER_F LOWER_U LOWER_C LOWER_K
Doing human-language text processing in C is painful because C's concept of strings (i.e. char*/char[] and wchar_t*/wchar_t[]) are very low-level and are not expressive enough to easily represent Unicode text, let alone locate word-boundaries in text and match words in a known dictionary (also consider things like inflection, declension, plurals, the use of diacritics to evade naive string matching).
For example - your program would need to handle George carlin's famous Seven dirty words quote:
https://www.youtube.com/watch?v=vbZhpf3sQxQ
Someone was quite interested in these words. They kept referring to them: they called them bad, dirty, filthy, foul, vile, vulgar, coarse, in poor taste, unseemly, street talk, gutter talk, locker room language, barracks talk, bawdy, naughty, saucy, raunchy, rude, crude, lude, lascivious, indecent, profane, obscene, blue, off-color, risqué, suggestive, cursing, cussing, swearing... and all I could think of was: shit, piss, fuck, cunt, cocksucker, motherfucker, and tits!
This could be slightly modified to evade a naive filter, like so:
Someone was quite interested in these words. They kept referring to them: they called them bad, dirty, filthy, foul, vile, vulgar, coarse, in poor taste, unseemly, street talk, gutter talk, locker room language, barracks talk, bawdy, naughty, saucy, raunchy, rude, crude, lude, lascivious, indecent, profane, obscene, blue, off-color, risqué, suggestive, cursing, cussing, swearing... and all I could think of was: shít, pis$, phuck, c​unt, сocksucking, motherfúcker, and títs!
Above, some of the words have simple replacements done, like s to $, others had diacritics added like u to ú, and some are just homonyms), however some of the other words in the above look the same but actually contain homographs or "invisible" characters like Unicode's zero-width-space, so they would evade naive text matching systems.
So in short: Avoid doing this in C. if you must, then use a robust and fully-featured Unicode handling library (i.e. do not use the C Standard Library's string functions like strstr, strtok, strlen, etc).
Here's how I would do it:
Read in input to a binary blob containing Unicode text (presumably UTF-8).
Use a Unicode library to:
Normalize the encoded Unicode text data (see https://en.wikipedia.org/wiki/Unicode_equivalence )
Identify word boundaries (assuming we're dealing with European-style languages that use sentences comprised of words).
Use a linguistics library and database (English alone is full of special-cases) to normalize each word to some singular canonical form.
Then lookup each morpheme in a case-insensitive hash-set of known "bad words".
Now, there are a few shortcuts you can take:
You can use regular-expressions to identify word-boundaries.
There exist Unicode-aware regular-expression libraries for C, for example PCRE2: http://www.pcre.org/current/doc/html/pcre2unicode.html
You can skip normalizing each word's inflections/declensions if you're happy with having to list those in your "bad word" list.
I would write working code for this example, but I'm short on time tonight (and it would be a LOT of code), but hopefully this answer provides you with enough information to figure out the rest yourself.
(Pro-tip: don't match strings in a list by checking each character - it's slow and inefficient. This is what hashtables and hashsets are for!)

Parsing a stream of data for control strings

I feel like this is a pretty common problem but I wasn't really sure what to search for.
I have a large file (so I don't want to load it all into memory) that I need to parse control strings out of and then stream that data to another computer. I'm currently reading in the file in 1000 byte chunks.
So for example if I have a string that contains ASCII codes escaped with ('$' some number of digits ';') and the data looked like this... "quick $33;brown $126;fox $a $12a". The string going to the other computer would be "quick brown! ~fox $a $12a".
In my current approach I have the following problems:
What happens when the control strings falls on a buffer boundary?
If the string is '$' followed by anything but digits and a ';' I want to ignore it. So I need to read ahead until the full control string is found.
I'm writing this in straight C so I don't have streams to help me.
Would an alternating double buffer approach work and if so how does one manage the current locations etc.
If I've followed what you are asking about it is called lexical analysis or tokenization or regular expressions. For regular languages you can construct a finite state machine which will recognize your input. In practice you can use a tool that understands regular expressions to recognize and perform different actions for the input.
Depending on different requirements you might go about this differently. For more complicated languages you might want to use a tool like lex to help you generate an input processor, but for this, as I understand it, you can use a much more simple approach, after we fix your buffer problem.
You should use a circular buffer for your input, so that indexing off the end wraps around to the front again. Whenever half of the data that the buffer can hold has been processed you should do another read to refill that. Your buffer size should be at least twice as large as the largest "word" you need to recognize. The indexing into this buffer will use the modulus (remainder) operator % to perform the wrapping (if you choose a buffer size that is a power of 2, such as 4096, then you can use bitwise & instead).
Now you just look at the characters until you read a $, output what you've looked at up until that point, and then knowing that you are in a different state because you saw a $ you look at more characters until you see another character that ends the current state (the ;) and perform some other action on the data that you had read in. How to handle the case where the $ is seen without a well formatted number followed by an ; wasn't entirely clear in your question -- what to do if there are a million numbers before you see ;, for instance.
The regular expressions would be:
[^$]
Any non-dollar sign character. This could be augmented with a closure ([^$]* or [^$]+) to recognize a string of non$ characters at a time, but that could get very long.
$[0-9]{1,3};
This would recognize a dollar sign followed by up 1 to 3 digits followed by a semicolon.
[$]
This would recognize just a dollar sign. It is in the brackets because $ is special in many regular expression representations when it is at the end of a symbol (which it is in this case) and means "match only if at the end of line".
Anyway, in this case it would recognize a dollar sign in the case where it is not recognized by the other, longer, pattern that recognizes dollar signs.
In lex you might have
[^$]{1,1024} { write_string(yytext); }
$[0-9]{1,3}; { write_char(atoi(yytext)); }
[$] { write_char(*yytext); }
and it would generate a .c file that will function as a filter similar to what you are asking for. You will need to read up a little more on how to use lex though.
The "f" family of functions in <stdio.h> can take care of the streaming for you. Specifically, you're looking for fopen(), fgets(), fread(), etc.
Nategoose's answer about using lex (and I'll add yacc, depending on the complexity of your input) is also worth considering. They generate lexers and parsers that work, and after you've used them you'll never write one by hand again.

Extract a line of C code using perl based on filename + line number

I'm looking to extract a statement of C code given the filename and linenumber where it begins.
I can't, of course, just take the line, as I could have something like:
foo(i,
j, "this is ); \
", k);
as the example indicates, I also can't look for the next ); either, which would make it fairly simple.
Is there anything out there, presumably on CPAN, which does this automatically?
If I could run the code through indent first, I would have it allow unlimited line lengths, and then take just that line, but if I do that, I lose the line number!
To do this perfectly well, you'd need a full-blown C parser in Perl. However, you can cover probably more than 99% of all cases with a much simpler algorithm:
Open the file
Skip lines until you reach the line in question
Slurp lines until you reach a semicolon at the end of the line
"Semicolon at end of line" is a pretty decent heuristic for "end of statement". Relatively simple quote-parsing might protect you from the situation shown above.
If you need something more sophisticated than that, you might look at C::Scan and its subclasses, or Inline::C::ParseRecDescent.

Why use #if 0 for block commenting out?

Reverse engineering code and I'm kind of appalled at the style, but I wanted to make sure there's no good reason for doing these things....
Is it just me or is this a horrible coding style
if ( pwbuf ) sprintf(username,"%s",pwbuf->pw_name);
else sprintf(username,"%d",user_id);
And why wrap code not intended for compilation in an
#if 0
....
#endif
Instead of comments?
EDIT: So as some explained below, this is due to the possibility to flummox /* */ which I didn't realize.
But I still don't understand, why not just use your programming environment tools or favorite text editor's macro's to block comment it out using "//"
wouldn't this be MUCH more straightforward and easy to know to visually skip?
Am I just inexperienced in C and missing why these things might be a good idea -- or is there no excuse, and I'm justified in feeling irritated at how ugly this code is?
#if 0 is used pretty frequently when the removed block contains block-comments
I won't say it's a good practice, but I see it rather often.
The single line flow-control+statement is easy enough to understand, although I personally avoid it (and most of the coding guidelines I've worked under forbid it)
BTW, I'd probably edit the title to be somewhat useful "Why use #if 0 instead of block comments"
If you have the following
#if 0
silly();
if(foo)
bar();
/* baz is a flumuxiation */
baz = fib+3;
#endif
If you naively replace the #if 0/#endif with /* */, that will cause the comment to end right after flumuxiation, causing a syntax error when you hit the */ in the place of the #endif above..
EDIT: One final note, often the #if 0 syntax is just used while developing, particularly if you have to support multiple versions or dependencies or hardware platforms. It's not unusual for the code to be modified to
#ifdef _COMPILED_WITHOUT_FEATURE_BAZ_
much_code();
#endif
With a centralized header defining (or not) hundreds of those #define constants. It's not the prettiest thing in the world, but every time I've worked on a decent sized project, we've used some combination of runtime switches, compile-time constants (this), compile-time compilation decisions (just use different .cpp's depending on the version), and the occasional template solution. It all depends on the details.
While you're the developer just getting the thing working in the first place, though... #if 0 is pretty common if you're not sure if the old code still has value.
Comments are comments. They describe the code.
Code that's being excluded from compilation is code, not comments. It will often include comments, that describe the code that isn't being compiled, for the moment.
They are two distinct concepts, and forcing the same syntax strikes me as being a mistake.
I'm editing this because I'm in the middle of a sizeable refactor and I'm making heavy use of this pattern.
As a part of this refactor, I'm removing some widely-used types, and replacing them with another. The result, of course, is that nothing will build.
And I really hate spending days fixing one issue after another in the hope that when I'm done everything will build and all the tests will run.
So my first step is to #ifdef-out all the code that won't compile, and then to [Ignore] all the unit tests that call it. With this done everything builds and all the non-ignored tests pass.
The result is a lot of functions that look like this:
public void MyFunction()
{
#if true
throw new NotImplementedException("JT-123");
#else
// all the existing code that won't compile
#endif
}
Then I unignore the unit tests, one at a time, and then fix the functions, one at a time.
It's going to take me a couple of days to worth through all of it, and all of these #if's will be gone, before I create the pull request to merge this, but I find it helpful, during the process.
Besides the problem with C-style comments not nesting, disabling blocks of code with #if 0 has the advantage of being able to be collapsed if you are using an editor that supports code folding. It is also very easy to do in any editor, whereas disabling large blocks of code with C++-style comments can be unwieldy without editor support/macros.
Also, many #if 0 blocks have an else block as well. This gives an easy way to swap between two implementations/algorithms, and is arguably less error-prone than mass-commenting out one section and mass-uncommenting another. However, you'd be better off using something more readable like #if DEBUG in that event.
As far as block commenting using // is concerned, one reason that I can think of is that, should you check that code into your source control system, the blame log will show you as the last editor for those lines of code. While you probably want the commenting to be attributed to you, at the same time the code itself is also being attributed to you. Sure, you can go back and look at previous revisions if you need to check the blame log for the "real" author of the code, but it would save time if one preserved that information in the current revision.
That's pretty idiomatic C right there. I don't see what's so wrong with it. It's not a beautiful piece of code but it's easy to read and is clear what's going on and why, even without context.
The variable names could be better, and and it'd probably be safer to use snprintf or perhaps strncpy.
If you think it could be better, what would you prefer it look like?
I might make a slight change:
char username[32];
strncpy(username, 30, (pwbuf ? pwbuf->pw_name : user_id));
username[31] = '\0';
Obviously, everyone has their own opinions on this sort of thing. So here's mine:
I would never write code like the above, and would think less of anyone who did. I can't count the number of times people think it's ok to get away without scope braces, and then been bitten by it.
Putting the control statement on the same line as the code block is even worse; the lack of indenting makes it harder to see the flow control whilst reading. Once you've been coding for a few years, you get used to being able to read and interpret code quickly and accurately, so long as you can rely on certain visual cues. Circumventing these cues for "special cases" means that the reader has to stop and do a double-take, for no good reason.
#if (0), on the other hand, is ok during development, but should be removed once code is "stable" (or at least replace 0 with some meaningful preprocessor symbol name).
Woah there! Don't overreact...
I would call it sloppier for more the inconsistant spacing than anything else. I have had time where I found it better to put short statements on the same line as their IF, though those statements are stretching it.
The inline style is better for vertical brevity... could easily be broken into 4, more lines
if (pwbuf)
sprintf(username,"%s",pwbuf->pw_name);
else
sprintf(username,"%d",user_id);
Personally I hate the next style since it so long-winded, making it difficult to skim a file.
if (pwbuf)
{
sprintf(username,"%s",pwbuf->pw_name);
}
else
{
sprintf(username,"%d",user_id);
}
points above noted. But monitors being widescreen and all, these days, I sort of don't mind
if (pwbuf) sprintf(username,"%s",pwbuf->pw_name);
else sprintf(username,"%d",user_id);
Always seem to have too much horizontal space, and not enough vertical space on my screen!
Also, if the code block already has preprocessor directives, don't use #if 0; if the code already has block comments, don't use /* */. If it already has both, either resort to an editor that has a ctrl+/, to comment out lots of lines. If not, you're stuffed, delete the code outright!
if ( pwbuf ) sprintf(username,"%s",pwbuf->pw_name);
else sprintf(username,"%d",user_id);
Idiomatic and concise. If it got touched more than 2 or 3 times, I would bracket and next-line it. It's not very maintainable if you add logging information or other conditions.
#if 0
....
#endif
Good to turn on blocks of debug code or not. Also, would avoid compilation errors related to trying to block comment this sort of thing out:
/* line comment */
...
/* line comment again */
Since C block comments don't nest.
Very occasionally I use the more concise style when it supports the symmetry of code and the lines don't get too long. Take the following contrived example:
if (strcmp(s, "foo") == 0)
{
bitmap = 0x00000001UL;
bit = 0;
}
else if (strcmp(s, "bar") == 0)
{
bitmap = 0x00000002UL;
bit = 1;
}
else if (strcmp(s, "baz") == 0)
{
bitmap = 0x00000003UL;
bit = 2;
}
else if (strcmp(s, "qux") == 0)
{
bitmap = 0x00000008UL;
bit = 3;
}
else
{
bitmap = 0;
bit = -1;
}
and the concise version:
if (strcmp(s, "foo") == 0) { bitmap = 0x00000001UL; bit = 0; }
else if (strcmp(s, "bar") == 0) { bitmap = 0x00000002UL; bit = 1; }
else if (strcmp(s, "baz") == 0) { bitmap = 0x00000003UL; bit = 2; }
else if (strcmp(s, "qux") == 0) { bitmap = 0x00000008UL; bit = 3; }
else { bitmap = 0; bit = -1; }
Bugs are much more likely to jump straight into your face.
Disclaimer: This example is contrived, as I said. Feel free to discuss the use of strcmp, magic numbers and if a table based approach would be better. ;)
#if is a macro which checks for the condition written aside to it, since ‘0’ represents a false, it means that the block of code written in between ‘#if 0’ and ‘#endif’ will not be compiled and hence can be treated as comments.
So, we can basically say that #if 0 is used to write comments in a program.
Example :
#if 0
int a;
int b;
int c = a + b;
#endif
The section written between “#if 0” and “#endif” are considered as comments.
Questions arises : “/* … */” can be used to write comments in a program then why ”#if 0”?
Answer : It is because, #if 0 can be used for nested comments but nested comments are not supported by “/* … */”
What are nested comments? Nested Comments mean comments under comments and can be used in various cases like :
Let us take an example that you have written a code like below:
Now, someone is reviewing your code and wants to comment this whole piece of code in your program because, he doesn’t feel the need for this piece of code. A common approach to do that will be :
The above is an example of nested comments.The problem with the above code is, as soon as the first “/” after “/” is encountered, the comment ends there.
i.e., in the above example, the statement : int d = a-b; is not commented.
This is solved by using “if 0” :
Here, we have used nested comments using #if 0.
I can name a few reasons for using #if 0:
comments don't nest, #if direcitves do;
It is more convenient: If you want to temporarily enable a disabled block of code, with #if 0 you just have to put 1 instead of 0. With /* */ you have to remove both /* and */;
you can put a meaningful macro instead of 0, like ENABLE_FEATURE_FOO;
automated formatting tools will format code inside #if block, but ignore commented out code;
it is easier to grep for #if than look for comments;
it plays nicer with VCS because you aren't touching the original code, just adding lines around it.
#if 0 ... #endif is pretty common in older C code. The reason is that commenting with C style comments /* .... */ doesn't work because comments don't nest.
Even though it is common, I'd say it has no place in modern code. People did it in olden days because their text editors couldn't block comment large sections automatically. More relevantly, they didn't have proper source code control as we do now. There's no excuse for leaving commented or #ifdef'd in production code.

What is this strange C code format?

What advantage, if any, is provided by formatting C code as follows:
while(lock_file(lockdir)==0)
{
count++;
if(count==20)
{
fprintf(stderr,"Can't lock dir %s\n",lockdir);
exit(1);
}
sleep(3);
}
if(rmdir(serverdir)!=0)
{
switch(errno)
{
case EEXIST:
fprintf(stderr,"Server dir %s not empty\n",serverdir);
break;
default:
fprintf(stderr,"Can't delete dir %s\n",serverdir);
}
exit(1);
}
unlock_file(lockdir);
versus something more typical such as
while(lock_file(lockdir)==0) {
count++;
if(count==20) {
fprintf(stderr,"Can't lock dir %s\n",lockdir);
exit(1);
}
sleep(3);
}
if(rmdir(serverdir)!=0) {
switch(errno) {
case EEXIST:
fprintf(stderr,"Server dir %s not empty\n",serverdir);
break;
default:
fprintf(stderr,"Can't delete dir %s\n",serverdir);
}
exit(1);
}
unlock_file(lockdir);
I just find the top version difficult to read and to get the indenting level correct for statements outside of a long block, especially for longs blocks containing several nested blocks.
Only advantage I can see is just to be different and leave your fingerprints on code that you've written.
I notice vim formatting would have to be hand-rolled to handle the top case.
The top example is know as "Whitesmiths style". Wikipedia's entry on Indent Styles explains several styles along with their advantages and disadvantages.
The indentation you're seeing is Whitesmiths style. It's described in the first edition of Code Complete as "begin-end Block Boundaries". The basic argument for this style is that in languages like C (and Pascal) an if governs either a single statement or a block. Thus the whole block, not just its contents should be shown subordinate to the if-statement by being indented consistently.
XXXXXXXXXXXXXXX if (test)
XXXXXXXXXXXX one_thing();
XXXXXXXXXXXXXXX if (test)
X {
XXXXX one_thing();
XXXXX another_thing();
X }
Back when I first read this book (in the 90s) I found the argument for "begin-end Block Boundaries" to be convincing, though I didn't like it much when I put it into practice (in Pascal). I like it even less in C and find it confusing to read. I end up using what Steve McConnel calls "Emulating Pure Blocks" (Sun's Java Style, which is almost K&R).
XXXXXXXXXXXXXX X if (test) {
XXXXXX one_thing();
XXXXXX another_thing();
X }
This is the most common style used to program in Java (which is what I do all day). It's also most similar to my previous language which was a "pure block" language, requiring no "emulation". There are no single-statement bodies, blocks are inherent in the control structure syntax.
IF test THEN
oneThing;
anotherThing
END
Nothing. Indentation and other coding standards are a matter of preference.
Personal Preference I would have thought? I guess it has the code block in one vertical line so possibly easier to work out at a glance? Personally I prefer the brace to start directly under the previous line
It looks pretty standard to me. The only personal change I'd make is aligning the curly-braces with the start of the previous line, rather than the start of the next line, but that's just a personal choice.
Anyway, the style of formatting you're looking at there is a standard one for C and C++, and is used because it makes the code easier to read, and in particular by looking at the level of indentation you can tell where you are with nested loops, conditionals, etc. E.g.:
if (x == 0)
{
if (y == 2)
{
if (z == 3)
{
do_something (x);
}
}
}
OK in that example it's pretty easy to see what's happening, but if you put a lot of code inside those if statements, it can sometimes be hard to tell where you are without consistent indentation.
In your example, have a look at the position of the exit(1) statement -- if it weren't indented like that, it would be hard to tell where this was. As it is, you can tell it's at the end of that big if statement.
Code formatting is personal taste. As long as it is easy to read, it would pay for maintenance!
By following some formatting and commenting standards, first of all you show your respect to other people that will read and edit code written by you. If you don't accept rules and write somehow esoteric code the most probable result is that you will not be able communicate with other people (programmers) effectively. Code format is personal choice if software is written only by you and for you and nobody is expected to read it, but how many modern software is written only by one person ?
The "advantage" of Whitesmiths style (as the top one in your example is called) is that it mirrors the actual logical structure of the code:
indent if there is a logical dependency
place corresponding brackets on the same column so they are easy to find
opening and closing of a context (which may open/close a stack frame etc) are visible, not hidden
So, less if/else errors, loops gone wrong, and catches at the wrong level, and overall logical consistency.
But as benefactual wrote: within certain rational limits, formatting is a matter of personal preference.
Its just another style--people code how they like to code, and that is one accepted style (though not my preferred). I don't think it has much of a disadvantage or advantage over the more common style in which brackets are not indented but the code within them is. Perhaps one could justify it by saying that it more clearly delimits code blocks.
In order for this format to have "advantage", we really need some equivalent C code in another format to compare to!
Where I work, this indentation scheme is used in order to facilitate a home-grown folding editor mechanism.
Thus, I see nothing fundamentally wrong with this format - within certain rational limits, formatting is a matter of personal preference.

Resources