I know there are several language extensions added in the GNU C compiler (aka gcc).
I can read something about that here.
What I'm looking for is deeper and wider documentation about those topics.
For example I'd like to read more about _Static_assert(), typeof and the likes.
Maybe it's just my fault, but I cannot find any such official documentation. Any hints? TIA!
The answer is http://gcc.gnu.org/onlinedocs/gcc/C-Extensions.html. You're not finding anything about static assertions there because _Static_assert is not an extension of the C language: it is a core, standardized part of the language (since C11) and is described in the international standard. In this case, refer to the C specification:
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf
See section 6.7.10 Static assertions, in particular paragraph 3:
"The constant expression shall be an integer constant expression. If the value of the
constant expression compares unequal to 0, the declaration has no effect. Otherwise, the
constraint is violated and the implementation shall produce a diagnostic message that
includes the text of the string literal, except that characters not in the basic source
character set are not required to appear in the message."
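As a minimal sketch of the paragraph quoted above, the following shows static assertions at file scope and at block scope. The struct packet_header and the function packet_header_size are hypothetical names made up for this illustration; the checks themselves are plain C11.

```c
#include <assert.h>   /* C11: provides the static_assert convenience macro */
#include <limits.h>
#include <stddef.h>

/* Hypothetical record type, used only for illustration. */
struct packet_header {
    unsigned char  version;
    unsigned char  flags;
    unsigned short length;
};

/* File-scope assertion: evaluated at compile time, generates no code. */
_Static_assert(CHAR_BIT == 8, "this code assumes 8-bit bytes");

/* The same kind of check, spelled with the <assert.h> macro. */
static_assert(sizeof(struct packet_header) >= 4,
              "header must hold at least 4 bytes");

size_t packet_header_size(void)
{
    /* A static assertion is a declaration, so it may appear in block scope too. */
    _Static_assert(offsetof(struct packet_header, version) == 0,
                   "version must be the first member");
    return sizeof(struct packet_header);
}
```

If any of the constant expressions evaluated to 0, compilation would stop with a diagnostic containing the string literal, which is the behavior paragraph 3 requires.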
Here: http://gcc.gnu.org/onlinedocs/gcc/C-Extensions.html.
Use Google to search inside gnu.org. Found it by typing this search in Google: c extensions site:gnu.org.
I'm writing a simple shell in C under Linux. I'm trying to parse user input with POSIX regex with group capturing. My problem is that I don't want to capture all the groups, but the ?: symbol doesn't seem to work for me.
"^(?:[A-Za-z0-9]+)( [A-Za-z0-9]*(?:\"[^\"]*\")*(?:\'[^\']*\')*[A-Za-z0-9]*)*&?$"
The use of (?:..), or any other grouping prefix, is not allowed in POSIX Regular Expressions.
There are tools to make languages, lex & yacc for example, and a simplified yacc grammar for POSIX shells is provided by the standard.
The character sequence (? is undefined as per section 9.4.3 ERE Special Characters:
*+?{
The <asterisk>, <plus-sign>, <question-mark>, and <left-brace> shall be special except when used in a bracket expression (see RE Bracket Expression). Any of the following uses produce undefined results:
If these characters appear first in an ERE, or immediately following an unescaped <vertical-line>, <circumflex>, <dollar-sign>, or <left-parenthesis>
If a <left-brace> is not part of a valid interval expression (see EREs Matching Multiple Characters)
A POSIX RE implementation has a few choices for how to handle undefined syntax. Those choices include enabling an extended syntax as per section 9.1 Regular Expression Definitions. So it's free to implement the non-capturing group syntax:
[...] violations of the specified syntax or semantics for REs produce
undefined results: this may entail an error, enabling an extended
syntax for that RE, or using the construct in error as literal
characters to be matched.
If you'd like to see the feature as part of a future POSIX standard, you could open an issue on the standard's issue tracker.
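One portable workaround, sketched here under the assumption that plain capturing groups are acceptable, is to let every group capture and simply ignore the regmatch_t entries you do not need. The helper capture_group and its fixed limit of 8 groups are hypothetical choices for this example; regcomp, regexec and regfree are the standard <regex.h> calls.

```c
#include <assert.h>
#include <regex.h>
#include <string.h>

#define MAX_GROUPS 8   /* arbitrary limit chosen for this sketch */

/*
 * Compiles `pattern` as a POSIX ERE, matches it against `input`, and copies
 * the text of capture group `group` into `out`.
 * Returns 0 on success, -1 if the pattern is invalid, the input does not
 * match, the group did not participate, or `out` is too small.
 */
int capture_group(const char *pattern, const char *input, size_t group,
                  char *out, size_t out_size)
{
    regex_t re;
    regmatch_t m[MAX_GROUPS];
    int rc = -1;

    assert(pattern != NULL && input != NULL && out != NULL);
    if (group >= MAX_GROUPS || out_size == 0)
        return -1;
    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;

    /* Groups that did not take part in the match have rm_so == -1. */
    if (regexec(&re, input, MAX_GROUPS, m, 0) == 0 && m[group].rm_so != -1) {
        size_t len = (size_t)(m[group].rm_eo - m[group].rm_so);
        if (len < out_size) {
            memcpy(out, input + m[group].rm_so, len);
            out[len] = '\0';
            rc = 0;
        }
    }
    regfree(&re);
    return rc;
}
```

With the pattern "^([A-Za-z0-9]+)( +([A-Za-z0-9]+))?$" and the input "ls home", group 2 plays the role a non-capturing group would have played: it is still recorded, but the caller only asks for group 1 ("ls") and group 3 ("home").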
So I have come across the following definition in one of the Wikipedia articles (rough translation):
Modifier (programming) - element of source code being a phrase of given programming language construct, which results in changed behavior of given construct.
Then, the article mentions modifiers in regard to ANSI C standard:
type modifiers (sign - signed unsigned, constness const, volatility volatile)
Then it also mentions the term in regard to languages such as Turbo C, Borland, Perl, but given there is no mention of modifier in ANSI/ISO 9899, this already puts the validity of the article into doubt.
Answers to this question draw similar conclusions.
However, when looking at some of the top search results on Google, you find the term modifier mentioned all over tutorial sections, and even in example interview questions.
So the question is: Can the usage of the term modifier in this context be justified or rather requires correction when mentioned?
Can the usage of the term modifier in this context be justified or rather requires correction when mentioned?
The C spec does not use "modifier" with a specific definition. It does discuss how things are modifiable, etc. and details the term modifiable lvalue, but nothing that ties to OP's concerns about signed, unsigned, const, volatile.
In C, const, volatile, and restrict are type-qualifiers.
signed and unsigned are type specifiers (C11 6.7.2); used on their own they name the types int and unsigned int.
So the authoritative reference is silent on "usage of the term modifier".
Lacking a standard reference answer, it does make sense, when using the term modifier, to justify its context to avoid quibbling corrections.
Like many terms that span multiple languages, the reader needs to understand the terms are used loosely when applied so broadly. Each computer language has and needs very precise terms. When speaking C, best to avoid the term unless a generality is needed in context with other languages.
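To make the standard's own vocabulary concrete, here is a small sketch that labels each keyword with the term the C standard uses for it. All variable and function names (counter, delta, limit, status_flag, add_into) are invented for the illustration.

```c
#include <assert.h>

/* Type specifiers (C11 6.7.2): they select the type itself. */
unsigned int counter = 0u;
signed char  delta   = -1;

/* Type qualifiers (C11 6.7.3): they qualify an already specified type. */
const int    limit = 10;
volatile int status_flag;   /* hypothetical flag that may change outside normal flow */

/* restrict is also a type qualifier; it applies only to pointer types. */
void add_into(int n, int *restrict dst, const int *restrict src)
{
    assert(n >= 0);
    for (int i = 0; i < n; i++)
        dst[i] += src[i];
}
```

Nothing here is called a "modifier" by the standard: each keyword falls under either "type specifier" or "type qualifier".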
According to the C Standard, subclause 6.10.2, paragraph 5 [ISO/IEC 9899:2011],
The implementation shall provide unique mappings for sequences
consisting of one or more nondigits or digits (6.4.2.1) followed by a
period (.) and a single nondigit. The first character shall not be a
digit. The implementation may ignore distinctions of alphabetical case
and restrict the mapping to eight significant characters before the
period.
This would mean that if two include files have first 8 characters in common, the header it actually picks is undefined.
When I compile using clang or gcc, I haven't really faced this issue. However, is there a documented behavior for source file inclusion in GCC and Clang?
In the modern world, I would find it weird if any compiler really restricts to 8 characters.
Reference: C11 WG14 draft version N1570, Cert C Coding standard
This would mean that if two include files have first 8 characters in common, the header it actually picks is undefined.
No, I'd argue against that: Looking at the exact wording we see that standard uses:
[..] The implementation may ignore [..]
It's "may", not "shall". If the latter were used it would indeed mean that the behavior was undefined (N1570 $4/2). Since "may" is used as-is, without an exact definition, I think it's safe to assume the normal meaning of the word (source, emphasis mine):
used to express opportunity or permission
Thus, an implementation is allowed to only consider the first 8 characters, but it doesn't have to.
Funny thing: I cannot find an exact documentation for the "distinction limit" of the "sequence" in GCC's manual, meaning (N1570 $4/8, emphasis mine) ...
An implementation shall be accompanied by a document that defines all implementation defined and locale-specific characteristics and all extensions.
... that GCC could (under some very pedantic point of view) be considered a nonconforming implementation. The practically relevant part of their manual, as @PaulGriffiths pointed out, is probably (source, point 4 in the list):
Significant initial characters in an identifier or macro name.
The preprocessor treats all characters as significant. The C standard requires only that the first 63 be significant.
Regarding the comment:
[..] I am actually trying to evaluate if this will bite me as long as I am using one of these compilers on a Linux platform. [..]
I really doubt that this will ever (again?) be an issue.
I've seen this in many popular C projects, e.g. the Go language, and nowhere can I find any information about it. I think it is a kind of namespacing, but I thought C doesn't support that.
e.g
void runtime·memhash(uintptr*, uintptr, void*);
Thanks.
· is not a part of the "basic execution character set", and thus is not a standard C operator.
However, it does appear that the C standard allows it as an implementation-defined identifier character. It has no special meaning; it's just another character.
C99 still isn't supported by many compilers, and much of the focus is now on C++, and its upcoming standard C++1x.
I'm curious as to what C will "get" in its next standard, when it will get it, and how it will keep C competitive. C and C++ are known to feed on one another's improvements, will C be feeding on the C++1x standard?
What can I look forward to in C's future?
The ISO/IEC 9899:2011 standard, aka C11, was published in December 2011.
The latest draft is N1570; I'm not aware of any differences between it and the final standard. There's already a Technical Corrigendum fixing an oversight in the specification of __STDC_VERSION__ (now 201112L) and the optional __STDC_LIB_EXT1__ (now 201112L).
I was typing a list of features, but noticed the Wikipedia page on C1X has a pretty complete listing of all proposed changes.
The ISO C working group posts 'after meeting' mailings on their website. One of the more interesting is this Editor's Report.
Here's a summary from the Wikipedia page:
Alignment specification (_Alignas specifier, _Alignof operator, aligned_alloc function)
Multithreading support (_Thread_local storage-class specifier, <threads.h> header including thread creation/management functions, mutex, condition variable and thread-specific storage functionality)
Improved Unicode support (char16_t and char32_t types for storing UTF-16/UTF-32 encoded data, including the corresponding u and U string literal prefixes and conversion functions in <uchar.h>)
Removal of the gets function
Bounds-checking interfaces (Annex K)
Analyzability features (Annex L)
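As a rough sketch of the alignment items in the list above, the following uses the alignas and alignof macros from <stdalign.h> (which expand to _Alignas and _Alignof) together with aligned_alloc. The struct aligned_block and the helpers is_aligned and alloc_blocks are hypothetical names for this example, and it assumes a C library that ships aligned_alloc.

```c
#include <assert.h>
#include <stdalign.h>   /* alignas / alignof macros for _Alignas / _Alignof */
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical buffer type: the payload is requested on a 16-byte boundary. */
struct aligned_block {
    alignas(16) unsigned char payload[32];
};

/* Returns nonzero if p is aligned to `alignment` bytes. */
int is_aligned(const void *p, size_t alignment)
{
    assert(alignment != 0);
    return ((uintptr_t)p % alignment) == 0;
}

/* Allocates `count` blocks with aligned_alloc; release them with free(). */
struct aligned_block *alloc_blocks(size_t count)
{
    /* C11 asks for a size that is a multiple of the alignment; the size of a
       16-byte-aligned struct always satisfies that. */
    return aligned_alloc(alignof(struct aligned_block),
                         count * sizeof(struct aligned_block));
}
```

A caller would check the returned pointer for NULL, use the blocks, and pass the pointer to free() when done.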
It looks like gcc as of 4.6 is starting to look at C1x. They claim to have:
Static assertions (_Static_assert keyword)
Typedef redefinition
New macros in <float.h>
Anonymous structures and unions
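The four features in that list can be tried out together in a short sketch like the one below, assuming a compiler in C11 mode. The type struct value, the typedef u32 and the function make_float are invented for the illustration.

```c
#include <assert.h>   /* static_assert macro for _Static_assert */
#include <float.h>

/* Typedef redefinition: C11 allows repeating a typedef that names the same type. */
typedef unsigned int u32;
typedef unsigned int u32;

/* Anonymous union and anonymous struct: members are accessed directly. */
struct value {
    int is_float;
    union {
        float f;
        struct { unsigned short lo, hi; };
    };
};

/* DBL_DECIMAL_DIG is one of the macros C11 added to <float.h>. */
static_assert(DBL_DECIMAL_DIG >= 10,
              "double must round-trip at least 10 decimal digits");

struct value make_float(float x)
{
    struct value v;
    v.is_float = 1;
    v.f = x;   /* no intermediate member name is needed */
    return v;
}
```

Before C11, the inner union would have needed a member name, so the assignment would have read something like v.u.f = x.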
Probably the best place to find the current status would be to look at the latest draft of the new version of the C standard. Warning: though it's coming directly from the committee, the server behind that link isn't always the most responsive...