Misplaced preprocessor character '\' - c

I'm trying to get a bunch of C modules written in 1994 for a Panasonic 3DO lib to compile with armcc. I've run into an error which I'm kind of confused about. My knowledge of C is not that deep, so perhaps one of you would be so kind as to help me figure this out:
#define DS_MSG_HEADER \
long whatToDo; /* opcode determining msg contents */ \
Item msgItem; /* message item for sending this buffer */ \
void* privatePtr; /* ptr to sender's private data */ \
void* link /* user defined -- for linking msg into lists */
The \ character is used in many include files in this library. I'm unfamiliar with this syntax, and the ARM compiler seems to hate it.
Serious error: misplaced preprocessor character '\'
If you know why these \ characters are being used, could you please explain? (Sorry if it's a noob question.) Also, is there an alternative way to write this so the compiler is happy?

This error is shown (among other reasons) if the backslash '\' is not the last character on the line.
I can think of two reasons:
1. Somehow you got at least one whitespace character (space, tab) after the backslash. I have never had this problem myself.
2. The source is stored with Windows-style end-of-line markers, that is '\r' and '\n' ("carriage return" and "line feed"), and you are trying to compile it on a Unix-like system (Linux?) or with a compiler that expects Unix-style end-of-line markers, that is only '\n' ("line feed"). (Or the other way around.) This is quite a common problem that hits me time after time.
In any case, open the source in a capable editor and enable the display of invisible characters, commonly an option with this icon: ¶. Check for whitespace after the backslashes. Then check the end-of-line encoding, and save with the appropriate one.

Related

Using \ to extend single-line comments

I just noticed that I can use \ to extend the single-line comment to the next line, similarly to doing so in pre-processor directives.
Why does nobody talk about this language feature?
I didn't even see it in books.
What language version supports this?
It's part of C, and it's called line splicing.
The K&R book talks about it:
Lines that end with the backslash character \ are folded by deleting the backslash and the
following newline character. This occurs before division into tokens.
This occurs in the preprocessing phase.
So single line comments can be made to appear like multi line like
//This is \
still a single line comment
Likewise with the case of strings
char str[]="Hello \
world. This is \
a string";
Edit: As noted in the comments, single line comments were not there in ANSI C but were introduced as part of the standard in C99 though many compilers already supported it.
From C99,
Except within a character constant, a string literal, or a comment, the characters // introduce a comment that includes all multibyte characters up to, but not including, the next new-line character. The contents of such a comment are examined only to identify multibyte characters and to find the terminating new-line character.
As far as line splicing is concerned, it is specified in C89 itself
2.1.1.2 Translation phases
Each instance of a new-line character and an immediately preceding backslash character is deleted, splicing physical source lines to form logical source lines. A source file that is not empty shall end in a new-line character, which shall not be immediately preceded by a backslash character.
Look at KamiKaze's answer to see the relevant part of C99.
While it's true that a \ will effectively escape the newline at the end of a single-line comment, splicing the line with the following one (just as it does on any other line), you could claim that this is a bug in the Standard. At any rate, the situation is fantastically confusing. You might believe that both of these facts are true:
The single-line comment syntax // turns the rest of the line, up to the next newline, into a comment, which is not interpreted in any way, i.e. is ignored.
At the end of any line, a \ character eliminates the newline and splices the line to the following line.
But these two rules are basically in conflict; it looks like they can't both be true at the same time.
Now in fact, by definition, the second rule "wins", and the first rule really has to say that the rest of the line is not interpreted in any way except to check whether the last character is a \, in which case it retains its line-splicing meaning.
(Now, if you're a compiler writer or a language lawyer, of course, you don't think about it that way. If you're a compiler writer or language lawyer, you know that the \ was processed during an earlier phase of compilation, before comments are parsed, meaning that the first rule is perfectly true as stated. But most people don't think like compiler writers and language lawyers.)
My point is that this situation is basically fraught with peril. I would bet good money that there are compilers or other language processors out there that get this wrong. I would urge any sane programmer not to rely on this, not to put a \ at the end of any line that contains a single-line comment. (And if I were writing a compiler or other language processor, I'd try to warn about this.)
This is not a feature of comments but a general feature of the language, as it applies to all newline-characters.
The following is found in the C99 standard:
5.1.1.2 Translation phases
Each instance of a backslash character (\) immediately followed by a new-line character is deleted, splicing physical source lines to form logical source lines. Only the last backslash on any physical source line shall be eligible for being part of such a splice. A source file that is not empty shall end in a new-line character, which shall not be immediately preceded by a backslash character before any such splicing takes place.
So it is standard compliant for C99 at least.
It is not much talked about because the relevant use cases (apart from large macros and strings) are quite rare. If you need a multi-line comment (the standard comment form in C; // was added later, from C++), you could just use
/* multi
line
comment
*/
Every use except for large macros and strings will make the code harder to read and might even make it quite confusing. So generally it is not used except for the mentioned niches.
Every instance of \ followed by a newline is removed from the source in an early translation phase (phase 2, right after trigraph replacement), before tokenisation and comment handling.
As a consequence, a single line comment can be extended to the next line of source code by escaping this newline with a \ (or a ??/ trigraph sequence):
// this is a single \
line comment
Note how the Stack Overflow code highlighter is fooled by this trick and does not colorize the end of the comment line.
This feature can be further abused to make really weird looking comments:
/\
/\ This is a single line comment /\
\/ \/
/\
*\ This is a multi-line comment
*\
/
Any token can be broken in pieces this way. Check this corner case:
\
r\
et\
urn \
0x7\
ffff;\

What is the meaning of multi-line comment warnings in C?

I'm working on a C file for a homework assignment and I thought it might help the graders if I made my answers visible like so:
//**********|ANSWER|************\\
//blah blah blah, answering the
//questions, etc etc
and found when compiling with gcc that those backslash characters at the end of the first line seemed to be triggering a "multi-line comment" warning. When I removed them, the warning disappeared. So my question is twofold:
a) how exactly does the presence of the backslash characters make it a "multi-line comment", and
b) why would a multi-line comment be a problem anyway?
C (since the 1999 standard) has two forms of comments.
Old-style comments are introduced by /* and terminated by */, and can span a portion of a line, a complete line, or multiple lines.
C++-style comments are introduced by // and terminated by the end of the line.
But a backslash at the end of a line causes that line to be spliced to the next line. So you can legally introduce a comment with //, put a backslash at the end of the line, and cause the comment to span multiple physical lines (but only one logical line).
That's what you're doing on your first line:
//**********|ANSWER|************\\
Just use something other than backslash at the end of the line, for example:
//**********|ANSWER|************//
Though even that is potentially misleading, since it almost looks like an old-style /* .. */ comment. You might consider something a little simpler:
/////////// |ANSWER| ////////////
or:
/**********|ANSWER|************/
The compiler simply tells you that you might have inadvertently commented out the next line of code by ending the previous comment line with \, which is the line continuation character in C. This causes the second line to be concatenated with the first, which in turn makes the // comment cover both original lines. In your case it is not a problem, since the next line is a comment as well.
But if the next line was not intended to be a comment, then you might have ended up with "weird behavior": compiler ignoring the second line for no apparent reason. The situation is often complicated by the fact that some syntax-highlighting code editors do not detect this situation and fail to highlight the next line as a comment.
Generally, for this specific reason, it is not a good idea to abuse the \ character at the code level. Use it only if you really have to, i.e. only if you really want to stitch several lines into one.
Nobody asked, but this is the top result on Google, so:
this specific warning can be suppressed with the -Wno-comment option.
a) how exactly does the presence of the backslash characters make it a "multi-line comment", and
A backslash as the last character on a line means that the compiler should disregard the backslash and the newline character - it tells the compiler to do this before it should check for comments. So it says that before removing comments it should effectively look at
//**********|ANSWER|************\//blah blah blah, answering the
//questions, etc etc
it now sees the // at the start and ignores the rest of the line
b) why would a multi-line comment be a problem anyway?
In your example it isn't since the second line is a comment anyway, but what if you had written something useful on the second line?
Well since you asked question "a" it's likely that you didn't realize that the compiler behaved this way, and if you don't realize that you've commented out a line of code, then it's quite nice of the compiler to warn you.
Another reason is that, even if you had known this, an editor normally does not show whitespace visibly, so it is easy to miss whether the backslash really is the last character on the line. For example:
int i = 42;
// backslash+space: \
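i++;
// backslash and no space: \
i--;
printf("%d\n", i);
Under a strict reading of the standard this prints 43, since the i-- is commented out but the i++ isn't (the last character on that line is a space, not the backslash). Some compilers are more lenient: GCC still treats a backslash followed by trailing whitespace as a continuation and warns about it, so there both lines are swallowed.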
That will comment the line below it as well. If you want to do that all on one line without a warning try
/* // Bla \\ */

What does '\' actually do in C?

As far as I know \ in C just appends the next line as if there was not a line break.
Consider the following code:
main(){\
return 0;
}
When I looked at the preprocessed code (gcc -E), it shows
main(){return
0;
}
and not
main(){return 0;
}
What is the reason for this kind of behaviour? Also, how can I get the code I expected?
Yes, your expected result is the one required by the C and C++ standards. The backslash simply escapes the newline, i.e. the backslash-newline sequence is deleted.
GCC 4.2.1 from my OS X installation gives the expected result, as does Clang. Furthermore, adding a #define to the beginning and testing with
#define main(){\
return 0;
}
main()
yields the correct result
}
{return 0;
Perhaps gcc -E does some extra processing after preprocessing and before outputting it. In any case, the line break seen by the rest of the preprocessor seems to be in the right place. So it's a cosmetic bug.
UPDATE: According to the GCC FAQ, -E (or the default setting of the cpp command) attempts to put output tokens in roughly the same visual location as input tokens. To get "raw" output, specify -P as well. This fixes the observed issues.
Probably what happened:
In preserving visual appearance, tokens not separated by spaces are kept together.
Line splicing happens before spaces are identified for the above.
The { and return tokens are grouped into the same visual block.
0 follows a space and its location on the next line is duly noted.
PLUG: If this is really important to you, I have implemented my own preprocessor with correct implementation of both raw-preprocessed and whitespace-preserving "pretty" modes. Following this discussion I added line splices to the preserved whitespace. It's not really intended as a standalone tool, though. It's a testbed for a compiler framework which happens to be a fully compliant C++11 preprocessor library, which happens to have a miniature command-line driver. (The error messages are on par with GCC, or Clang, sans color, though.)
From K&R section A.12 Preprocessing:
A.12.2 Line Splicing
Lines that end with the backslash character \ are
folded by deleting the backslash and the following newline character.
This occurs before division into tokens.
It doesn't matter :/ The tokenizer will not see any difference. [1]
Update In response to the comments:
There seems to be a fair amount of confusion as to what the expected output of the preprocessor should be. My point is that the expectation /seems/ reasonable at a glance but doesn't actually need to be specified in this way for the output to be valid. The amount of whitespace present in the output is simply irrelevant to the parser. What matters is that the preprocessor should treat the continued line as one line while interpreting it.
In other words: the preprocessor is not a text transformation tool, it's a token manipulation tool.
If it matters to you, you're probably either
using the preprocessor for something other than C/C++, or
treating C++ code as text, which is a ... code smell. (libclang and various less complete parser libraries come to mind).
[1] The preprocessor is free to achieve the specified result in whichever way it sees fit. The result you are seeing is possibly the most efficient way the implementors have found to implement this particular transformation.

force gcc compilation / ignore error messages

I'm trying to compile some code I found online, but gcc keeps giving me error messages.
Is there any way I can bypass the error, and compile?
ttys000$ gcc -o s-proc s-proc.c
s-proc.c:84:18: error: \x used with no following hex digits
Here's the line it keeps bitching about:
printf("\x%02x", ((unsigned char *)code)[i]);
...
First post on here, so if I broke any rules or wasn't specific enough, let me know.
You can't ignore errors [1]. You can only ignore warnings. Change the code.
printf("\\x%02x", ((unsigned char *)code)[i]);
It's just a guess, since without documentation or input from the original author of the code, we have no solid evidence for what the code is actually supposed to do. However, the above correction is extremely plausible: it's a simple typo (the original author forgot a \), and it's conceivable that the author uses a C compiler which silently ignores the error (Python, by design, is similarly lenient about unrecognized escape sequences).
The line of code above, or something almost exactly like it, is found in probably tens of thousands of source files across the globe. It is used for encoding a binary blob using escape sequences so it can be embedded as a literal in a C program. Similar code appears in JSON, XML, and HTML emitters. I've probably written it a hundred times.
Alternatively, if the code were supposed to print out the character, this would not work:
printf("\x%02x", ((unsigned char *)code)[i]);
This doesn't work because escape sequences (the things that start with \, like \x42) are handled by the C compiler, but format strings (the things that start with %, like %02x) are handled by printf. The above line of code might only work if the order were reversed: if printf ran first, before you compiled the program. So no, it doesn't work.
If the author had intended to write literal characters, the following is more plausible:
printf("%c", ((unsigned char *)code)[i]); // clumsy
putchar(((unsigned char *)code)[i]); // simpler
So you know either the original author simply typo'd and forgot a single \ (I make that mistake all the time), or the author has no clue.
Notes:
1: An error means that GCC doesn't know what the code is supposed to do, so continuing would be impossible.
Looks like you want to add a prefix of x to the hex number. If yes, you can drop the \:
printf("x%02x", ((unsigned char *)code)[i]);
The reason you are getting error is \x marks the beginning of a hex escape sequence.
Example: printf("\x43\x4f\x4f\x4c");
Prints
COOL
As C has an ASCII value of 0x43.
But in your case the \x is not followed by hex digits, which causes a parse error. You can see the C syntax here; it clearly says:
hex-escape ::= \x hex-digit ∗
Escape the \ with another \
printf("\\x%02x", ((unsigned char *)code)[i]);
By the way, you can't force GCC to continue compilation after an error: an error is an error because it prevents further logical analysis of the source code, and that cannot be resolved.

Old C compiler chokes on #ifndef #define

I am trying to port some relatively modern C code to an older compiler.
This compiler (DICE), it seems, chokes on the first header file and the first occurrence of this idiom:
#ifndef SOMETHING
#define SOMETHING
...
#endif /* SOMETHING */
it dies on the second line in the header with:
DCPP: "../../code/someheader.h" L:2 C:0 Error:39 Syntax Error
Changing to #define SOMETHING 1 made no difference.
So I have really two questions, am I using DICE with the wrong option or something, or did C programmers use some other idiom equal to ifndef-define back in the old days?
References:
DICE Wikipedia Entry
Original source code, runs on Unix
Slightly updated Amiga version
The author of DICE, Matt Dillon, went on to produce DragonFlyBSD
If it is this C compiler then by looking at the sources (src\dcpp\cpp.c) you can see that newlines only include the carriage return character and not the linefeed character.
If you have a line ending with CRLF, then when the compiler strips the whitespace at the start of the next line, it does not strip the linefeed before the #. That is a syntax error, since the # of a preprocessor directive must be the first non-whitespace character on the line.
#if SOMETHING
#else
#endif
might just work everywhere

Resources