I see the following code on this page:
int main(string[]#a)
{print("Manganese");return 0;}
Why is it not the following:
int main(string[] args)
{print("Manganese");return 0;}
What is the difference between string[]#a and string[] args, and when is it used?
The # symbol is used to prefix identifier names when the name either begins with a digit, or is a keyword.
Identifier names may be any combination of letters ([a-z], [A-Z]), underscores and digits. However, to define or refer to an identifier with a name that either starts with a digit or is a keyword, you must prefix it with the '#' character. This character is not considered a part of the name. For example, you can name a method foreach by writing #foreach, even though this is a reserved Vala keyword. You can omit the '#' character when it can be unambiguously interpreted as an identifier name, such as in "foo.foreach()".
See: Vala Tutorial under the Syntax section
To answer your question "What is difference between string[]#a and string[] args and when is it used?", well, not much. Other than simply using the variable name a instead of args, it's not a compiler error to use the # symbol in front of other variable names, even when the criteria above aren't met (although certainly not good practice). The author could safely prefix the variable a as #a, even though it's not the normal usage of the prefix.
Related
I stumbled on some C++ code like this:
int $T$S;
First I thought that it was some sort of PHP code or something wrongly pasted in there but it compiles and runs nicely (on MSVC 2008).
What kind of characters are valid for variables in C++ and are there any other weird characters you can use?
The only legal characters according to the standard are alphanumerics
and the underscore. The standard does require that just about anything
Unicode considers alphabetic is acceptable (but only as single
code-point characters). In practice, implementations offer extensions
(i.e. some do accept a $) and restrictions (most don't accept all of the
required Unicode characters). If you want your code to be portable,
restrict symbols to the 26 unaccented letters, upper or lower case, the
ten digits, and the '_'.
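As a rough illustration of the portable subset described above, here is a minimal sketch of a checker. The function name is_portable_identifier is made up for this example, it assumes the default "C" locale (so isalpha only accepts the 26 unaccented letters), and it does not reject keywords.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: returns true if s is a non-empty name built only
   from unaccented letters, digits and '_', and not starting with a digit.
   Assumes the default "C" locale; keywords are not checked. */
bool is_portable_identifier(const char *s)
{
    if (s == NULL || *s == '\0')
        return false;
    /* First character: letter or underscore only. */
    if (!(isalpha((unsigned char)*s) || *s == '_'))
        return false;
    /* Remaining characters: letters, digits or underscore. */
    for (const char *p = s + 1; *p != '\0'; ++p) {
        if (!(isalnum((unsigned char)*p) || *p == '_'))
            return false;
    }
    return true;
}
```

With this sketch, a name such as _tmp1 passes, while $T$S and 1abc are rejected.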
It's an extension of some compilers and not in the C standard
MSVC:
Microsoft Specific
Only the first 2048 characters of Microsoft C++ identifiers are significant. Names for user-defined types are "decorated" by the compiler to preserve type information. The resultant name, including the type information, cannot be longer than 2048 characters. (See Decorated Names for more information.) Factors that can influence the length of a decorated identifier are:
Whether the identifier denotes an object of user-defined type or a type derived from a user-defined type.
Whether the identifier denotes a function or a type derived from a function.
The number of arguments to a function.
The dollar sign is also a valid identifier in Visual C++.
// dollar_sign_identifier.cpp
struct $Y1$ {
void $Test$() {}
};
int main() {
$Y1$ $x$;
$x$.$Test$();
}
https://web.archive.org/web/20100216114436/http://msdn.microsoft.com/en-us/library/565w213d.aspx
Newest version: https://learn.microsoft.com/en-us/cpp/cpp/identifiers-cpp?redirectedfrom=MSDN&view=vs-2019
GCC:
6.42 Dollar Signs in Identifier Names
In GNU C, you may normally use dollar signs in identifier names. This is because many traditional C implementations allow such identifiers. However, dollar signs in identifiers are not supported on a few target machines, typically because the target assembler does not allow them.
http://gcc.gnu.org/onlinedocs/gcc/Dollar-Signs.html#Dollar-Signs
To my knowledge, only letters (capital and small), digits (0 to 9) and _ are valid in variable names according to the standard (note: a variable name must not start with a digit, though).
All other characters should be compiler extensions.
This is not good practice. Generally, you should only use alphanumeric characters and underscores in identifiers ([a-zA-Z0-9_]).
Surface Level
Unlike some other languages (bash, Perl), C does not use $ to mark a variable reference, so nothing in the syntax reserves the character. In C this most likely falls under C11 6.4.2.1, which lets an implementation accept additional implementation-defined characters in identifiers, and modern compilers do seem to support it.
As for your C++ question, let's test it!
int main(void) {
int $ = 0;
return $;
}
On GCC/G++/Clang/Clang++, this indeed compiles, and runs just fine.
Deeper Level
Compilers take source code, lex it into a token stream, put that into an abstract syntax tree (AST), and then use that to generate code (e.g. assembly/LLVM IR). Your question really only revolves around the first part (e.g. lexing).
The grammar (and thus the lexer implementation) of C/C++ does not treat $ as special, unlike commas, periods, skinny arrows, etc. As such, take the C code below:
int i_love_$ = 0;
After the lexer, this becomes a token stream like this:
["int", "i_love_$", "=", "0"]
If you were to take this code:
int i_love_$,_and_.s = 0;
The lexer would output a token stream like:
["int", "i_love_$", ",", "_and_", ".", "s", "=", "0"]
As you can see, because C/C++ doesn't treat characters like $ as special, it is processed differently than other characters like periods.
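The lexing behaviour described above can be sketched with a toy tokenizer. This is only an illustration under stated assumptions, not how GCC or Clang are implemented: the names is_ident_char and tokenize are invented here, '$' is accepted as an identifier character to mirror the compiler extension, and every other non-space character becomes its own one-character token.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: is c allowed inside an identifier? Accepting '$'
   mirrors the GCC/MSVC extension, not the language standard. */
static int is_ident_char(int c)
{
    return isalnum(c) || c == '_' || c == '$';
}

/* Splits src into tokens: a run of identifier characters becomes one token,
   any other non-space character becomes a one-character token.
   Stores at most max_tokens tokens (each truncated to 31 characters)
   and returns the number of tokens stored. */
size_t tokenize(const char *src, char tokens[][32], size_t max_tokens)
{
    size_t n = 0;
    while (*src != '\0' && n < max_tokens) {
        if (isspace((unsigned char)*src)) {
            ++src;                      /* skip whitespace between tokens */
            continue;
        }
        size_t len = 0;
        if (is_ident_char((unsigned char)*src)) {
            while (is_ident_char((unsigned char)src[len]))
                ++len;                  /* extend over the whole name */
        } else {
            len = 1;                    /* punctuation: single character */
        }
        size_t copy = len < 31 ? len : 31;
        memcpy(tokens[n], src, copy);
        tokens[n][copy] = '\0';
        ++n;
        src += len;
    }
    return n;
}
```

Running tokenize on int i_love_$,_and_.s = 0; keeps i_love_$ together as one token but splits at the comma and the period, and it also reports the trailing semicolon as a token of its own.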
I need to write a regex which will match only lines with the C function call, not its declaration.
So, I need it to match only lines where funcName() is not preceded by int, double, float, char etc. and an arbitrary number of spaces.
The problem is, I can run into the following expressions:
printf("Hello"); int f() {return 1;};
So I must also consider the situation where there are some other characters before the data-type name.
myStruct f();
In this situation I want regex to match it, ONLY basic data-types should be excluded.
So far I've got to this expression:
^(?!(void|int|double|char))\s*f\(\).*$
But I have no idea how to handle the situation with characters before the type name.
The following regex meets your specs:
(^|((^|\s)(?!(void|int|double|char))[^\s]+)\s+)([a-zA-Z_]+\(\)?)
The function name is defined by a character class containing letters and the underscore.
The line starts with the function call, or
the line contains at least one non-whitespace character before the function name. In that case ...
this non-WS sequence does not match the excluded keywords
there is at least 1 WS character before the function name
See the live demo at regex101.
Caveat
As several commenters have noted, this is not a robust solution. It will work for a tightly constrained set of function call and declaration patterns only.
A general regex-based solution (if possible at all, which would depend heavily on the regex engine features available) would be of theoretical interest only, as it would have to completely mimic the C preprocessor.
I need to generate some variable name with macro in C.
It seems that the # operator does the job, but the result is always a string.
#define create_var( name ) char #name
will not work because name expands to "name" (as a string).
#define create_var( name ) char prefix##name
will work, but all my vars will have a prefix.
Is there any trick available to obtain a simple name?
I want create_var(test) to expand to
char test;
Thanks very much in advance,
If you would like your variable name to appear unmodified (without prefix) in your preprocessed code, just use the formal parameter name of the macro, without # and without ##.
You can use # in the macro definition if you want to convert an argument to a string constant, and you can use ## to concatenate tokens into new tokens (for example, to build a new variable name with prefixes, suffixes and other parts). Without either of these, the preprocessor just inserts the sequence of tokens passed to the macro unmodified (*).
(*): C preprocessor semantics are subtle. Preprocessor macros are replaced at multiple stages during macro expansion, which can have quite unobvious results.
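A minimal sketch of the three cases described above. The macro names create_prefixed_var and var_name_string and the function demo_macros are invented for this illustration; only create_var comes from the question.

```c
/* Plain parameter: the argument is inserted unmodified. */
#define create_var(name)          char name
/* ## pastes the fixed token prefix_ and the argument into one new token. */
#define create_prefixed_var(name) char prefix_##name
/* # turns the argument into a string literal. */
#define var_name_string(name)     #name

/* Returns 1 if the three macros expand as the comments claim, else 0. */
int demo_macros(void)
{
    create_var(test);                       /* expands to: char test        */
    create_prefixed_var(test);              /* expands to: char prefix_test */
    const char *s = var_name_string(test);  /* expands to: "test"           */

    test = 'a';
    prefix_test = 'b';
    return test == 'a' && prefix_test == 'b' && s[0] == 't' && s[4] == '\0';
}
```

So create_var(test); declares a variable named simply test, which is what the question asks for.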
How can I match a word (1-n characters) in ANSI C? (in addition: What is the pattern to match a constant in C-sourcecode?)
I tried reading the file and passing it to regexec() (regex.h).
Problem: The tool I'm writing should be able to read sourcecode and find
all used constants (#define) to check if they're defined.
The pattern used for testing is: [a-zA-Z_0-9]{1,}. But this would match words such as the "h" in "test.h".
Identifiers must start with a letter or underscore, so the pattern is
[A-Za-z_][A-Za-z0-9_]*
I know of no syntactic difference between C and preprocessor identifiers. There is a convention to use upper case for preprocessor and lowercase for C identifiers, but no actual requirement. Unless defines are guaranteed to use a distinct naming convention you would basically have to find every identifier in the source file and any included files and sort them into preprocessor identifiers, C identifiers and undeclared identifiers.
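Since the question mentions regexec() from regex.h, here is a minimal sketch of applying that pattern with the POSIX regex API. The function name find_identifiers and the fixed 32-byte buffers are assumptions made for this example; it also assumes the "C" locale for the bracket ranges.

```c
#include <regex.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies each identifier found in line into out
   (at most max_out entries, each truncated to 31 characters).
   Returns the number found, or -1 if the pattern fails to compile. */
int find_identifiers(const char *line, char out[][32], int max_out)
{
    regex_t re;
    if (regcomp(&re, "[A-Za-z_][A-Za-z0-9_]*", REG_EXTENDED) != 0)
        return -1;

    int count = 0;
    regmatch_t m;
    const char *p = line;
    /* Each match is at least one character long, so p always advances. */
    while (count < max_out && regexec(&re, p, 1, &m, 0) == 0) {
        size_t len = (size_t)(m.rm_eo - m.rm_so);
        size_t copy = len < 31 ? len : 31;
        memcpy(out[count], p + m.rm_so, copy);
        out[count][copy] = '\0';
        ++count;
        p += m.rm_eo;               /* resume searching after this match */
    }
    regfree(&re);
    return count;
}
```

Note the limits of this sketch: on #include "test.h" it still reports test and h as separate identifiers, and it will pick x1F out of 0x1F, so string literals, header names and numbers would have to be skipped before matching.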
From the GCC manual:
Preprocessing tokens fall into five broad classes: identifiers, preprocessing numbers, string literals, punctuators, and other. An identifier is the same as an identifier in C: any sequence of letters, digits, or underscores, which begins with a letter or underscore. Keywords of C have no significance to the preprocessor; they are ordinary identifiers. You can define a macro whose name is a keyword, for instance. The only identifier which can be considered a preprocessing keyword is defined.
Another option besides doing regex searches over C source code would be to use a preprocessor library like Boost Wave or perhaps something like Coan instead of starting from scratch.
Here is the lexer grammar and the parser grammar (in flex and bison format, respectively) for the entire C language. In particular, the part relevant to identifiers is:
D [0-9]
L [a-zA-Z_]
{L}({L}|{D})* { count(); return(check_type()); }
So the id can start with any uppercase or lowercase letter or an underscore, and then have more uppercase or lowercase letters, underscores, and numbers. I believe it doesn't match parts of file names because they're quoted and it handles quotes separately.
Can we say that identifiers are aliases of variables?
Are identifiers and variables the same?
To say it another way, identifiers are the names given to things (such as variables and functions). They identify the thing which they are naming.
No.
int f() { }
f is an identifier. It is not a variable.
Identifier is the fancy term used to mean ‘name’. In C, identifiers are used to refer to a number of things: we've already seen them used to name variables and functions. They are also used to give names to some things we haven't seen yet, amongst which are labels and the ‘tags’ of structures, unions, and enums.
An identifier is used for any variable, function, data definition, etc. In the C programming language, an identifier is a combination of alphanumeric characters, the first being a letter of the alphabet or an underscore, and the remaining being any letter of the alphabet, any numeric digit, or the underscore. A variable is just one of the things an identifier can name.
Please check C Tutorial - Chapter 1
No, from C99 (6.2.1):
An identifier can denote an object; a
function; a tag or a member of a
structure, union, or enumeration; a
typedef name; a label name; a macro
name; or a macro parameter.
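To make the quoted list concrete, here is a small sketch in which each comment says what kind of thing the identifier names. All names (BUFFER_SIZE, point, point_t, color, sum_to and so on) are invented for this illustration.

```c
#define BUFFER_SIZE 8            /* BUFFER_SIZE: macro name */

struct point {                   /* point: structure tag */
    int x;                       /* x: member of a structure */
    int y;                       /* y: member of a structure */
};

typedef struct point point_t;    /* point_t: typedef name */

enum color { RED, GREEN };       /* color: enumeration tag; RED, GREEN: its members */

int sum_to(int limit)            /* sum_to: function; limit: object (parameter) */
{
    int total = 0;               /* total: object, i.e. a variable */
    int i = 0;                   /* i: object, i.e. a variable */
loop:                            /* loop: label name */
    if (i <= limit) {
        total += i;
        ++i;
        goto loop;
    }
    return total;                /* sum of 0..limit */
}
```

Only limit, total and i are variables here; every other name is an identifier that denotes something else.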