MISRA C 2012 21.1

#define and #undef shall not be used on a reserved identifier or reserved macro name.
My code violates this rule:
"#define _POSIX_C_SOURCE 200809L".
_POSIX_C_SOURCE is a reserved macro identifier.
What would a formal deviation for this code look like?

This isn't just a MISRA violation but a standard C violation as well. See for example C11 7.1.3:
All identifiers that begin with an underscore and either an uppercase letter or another underscore are always reserved for any use.
Where "reserved for any use" means reserved for the compiler/library implementation.
The problem lies in the Glibc naming of the identifier. If the implementation vouches for this identifier, then you should be able to use it.
But here's the catch: getting Glibc MISRA compliant is mission impossible. And professional MISRA-C implementations don't allow non-compliant libraries to be used.
If you still insist on using these libraries, you have to create some massive, project-wide deviation for the whole of the standard libraries. The problem here is that a vast amount of the Glibc code relies on gcc non-standard extensions, such as writing code that would otherwise have poorly-defined behavior outside the library implementation - for the sake of writing a standard lib where normal C rules don't apply. You cannot possibly make an argument that such code is to be trusted from a MISRA-C perspective.
I'd ask the person who made the call to combine POSIX, Glibc and MISRA-C in the same project how to carry on from here...

Yes, that's right, there is a violation of rule 21.1 here.
One of the points in the text of rule 21.1 states that identifiers or macro names beginning with an underscore shall not be used.
The rationale for this rule is that standard library headers use macro names that begin with an underscore.


strtok_s and compilers C11 onward compliance

The declaration of strtok_s in C11, and its usage, look to be very different from the strtok_s in compilers like the latest bundled with Visual Studio 2022 (17.4.4) and also GCC 12.2.0 (looking at MinGW64 distribution).
I fear the different form has been developed as a safer and accepted alternative to strtok long before C11. What happens now if someone wants to use strtok_s and stay C11 compliant?
Are the compiler supplied libraries C11 compliant?
Maybe it's just that I've been fooled by something otherwise obvious, and someone can help me...
This is the C11 declaration (C17 and early drafts of C23 are similar):
char *strtok_s(char * restrict s1,
rsize_t * restrict s1max,
const char * restrict s2,
char ** restrict ptr);
The same form can be found in the safec library, which serves as a good reference.
While MSC/VC and GCC have the form
char* strtok_s(
char* str,
const char* delimiters,
char** context
);
The C11 "Annex K bounds checking interfaces" was received with a lot of scepticism and in practice nearly no standard lib implemented it. See for example Field Experience With Annex K - Bounds Checking Interfaces.
As for the MSVC compiler, it doesn't conform to any C standard and never made such claims - you can try this out to check if you are using such a compiler or not:
#if !defined(__STDC__) || (__STDC__==0)
#error This compiler is non-conforming.
#endif
In particular, MSVC did not implement Annex K either, but already had non-standard library extensions in place prior to C11.
In practice _s means:
Possibly more safe or possibly less safe, depending on use and what the programmer expected.
Non-portable.
Possibly non-conforming.
If portability and standard conformance are important, then avoid _s functions.
In practice _s functions protect against two things: getting passed non-sanitized input or null pointers. So assuming that you do proper input sanitation and don't pass null pointers to library functions, the _s functions aren't giving you extra safety, just extra execution bloat and portability problems.
What happens now if someone wants to use strtok_s and stay C11 compliant?
You de facto can't.
And it's not limited to just strtok_s(). The entire C11 Annex K set of implementations is fractured, and because the major deviations from the standard are from Microsoft's implementation, there will probably never be a way to write portable, standard-conforming code using the Annex K functions.
Per N1967 Field Experience With Annex K - Bounds Checking Interfaces:
Available Implementations
Despite the specification of the APIs having been around for over a
decade only a handful of implementations exist with varying degrees of
completeness and conformance. The following is a survey of
implementations that are known to exist and their status.
While two of the implementations below are available in portable
source code form as Open Source projects, none of the popular Open
Source distribution such as BSD or Linux has chosen to make either
available to their users. At least one (GNU C Library) has repeatedly
rejected proposals for inclusion for some of the same reasons as those
noted by the Austin Group in their initial review of TR 24731-1
[N1106]. It appears unlikely that the APIs will be provided by future
versions of these distributions.
Microsoft Visual Studio
Microsoft Visual Studio implements an early version of the APIs.
However, the implementation is incomplete and conforms neither to C11
nor to the original TR 24731-1. For example, it doesn't provide the
set_constraint_handler_s function but instead defines a
_invalid_parameter_handler _set_invalid_parameter_handler(_invalid_parameter_handler) function with similar behavior but a slightly different and incompatible
signature. It also doesn't define the abort_handler_s and
ignore_handler_s functions, the memset_s function (which isn't
part of the TR), or the RSIZE_MAX macro. The Microsoft
implementation also doesn't treat overlapping source and destination
sequences as runtime-constraint violations and instead has undefined
behavior in such cases.
As a result of the numerous deviations from the specification the
Microsoft implementation cannot be considered conforming or portable.
...
Safe C Library
Safe C Library [SafeC] is a fairly efficient and portable but
unfortunately very incomplete implementation of Annex K with support
for the string manipulation subset of functions declared in
<string.h>.
Due to its lack of support for Annex K facilities beyond the
<string.h> functions the Safe C Library cannot be considered a
conforming implementation.
Even the Safe C library is non-conforming.
Whether these functions are "safer" is debatable. Read the entire document.
Unnecessary Uses
A widespread fallacy originated by Microsoft's deprecation of the standard functions in an effort to increase the adoption of the APIs is that every call to the standard functions is necessarily unsafe and should be replaced by one to the "safer" API. As a result, security-minded teams sometimes naively embark on months-long projects rewriting their working code and dutifully replacing all instances of the "deprecated" functions with the corresponding APIs. This not only leads to unnecessary churn and raises the risk of injecting new bugs into correct code, it also makes the rewritten code less efficient.
Also, read the updated N1969 Updated Field Experience With Annex K - Bounds Checking Interfaces:
Despite more than a decade since the original proposal and nearly ten years since the ratification of ISO/IEC TR 24731-1:2007, and almost five years since the introduction of the Bounds checking interfaces into the C standard, no viable conforming implementations has emerged. The APIs continue to be controversial and requests for implementation continue to be rejected by implementers.
The design of the Bounds checking interfaces, though well-intentioned, suffers from far too many problems to correct. Using the APIs has been seen to lead to worse quality, less secure software than relying on established approaches or modern technologies. More effective and less intrusive approaches have become commonplace and are often preferred by users and security experts alike.
Therefore, we propose that Annex K be either removed from the next revision of the C standard, or deprecated and then removed.

Why is the C language called a Standard?

Recently I read the book "Extreme C Programming" and often came across the statement that
C is a Standard
I know C is standardized by ANSI. But what does that really mean? Is this about keywords, supported functions or headers?
It means that there is international standardization in the form of a document ISO/IEC 9899:2018 1) stating how compilers and applications should behave. ISO is an international collaboration, consisting of working groups that take input from national standardization institutes such as ANSI/INCITS in USA. So saying that C is standardized by ANSI is wrong unless you happen to live in USA, where the local name for the standard is INCITS/ISO/IEC 9899:2018.
The whole language is specified in this document: terms, behavior, keywords, operators, environment considerations, certain libraries and so on.
1) The official standard costs money to obtain. For student/hobbyist purposes, you can download a draft version of the standard for free though, such as the C11 draft.
If the sentence indeed refers to C being ANSI/ISO standardized, it refers to a lot of things, including your "keywords, supported functions or headers". The ISO C standard defines:
The preprocessor directives (defines and includes).
The syntax (the grammar, the formal structure): The keywords and other building blocks of the language (literals, operators, identifier syntax) and how these can be combined to expressions and statements in order to form a syntactically correct C program.
The semantics of a program (which grammatically correct constructs are allowed, and what is their meaning).
The C Standard Library (malloc, printf, memcpy etc.). The "user facing" part of that library are the headers (stdio.h, string.h etc.) which name and describe the functions available in the standard library. The "system facing" part of the standard library is the actual compiled code of those functions, typically in the form of library files in a platform specific format with platform specific names in platform specific locations such as libc.a on a gcc/linux system. Because the standard library is so commonly used by normal programs, no special effort must be made to link to it: The linker does that automatically. (You still need to include the proper header file though to let the compiler know about the function names and the arguments you want to use.)
Saying that C is a "standard" can have both meanings: the ISO standardization detailed above, but also the fact that C, compared to assembler, is an abstraction layer that shields a program from peculiarities of the underlying hardware, for example its word length, its endianness, the signedness of its character type etc. The interaction with the "system" a program is running on is abstracted through the aptly named "Standard Library".
A well-written C program runs without or with only minor modification on a wide variety of platforms. In this sense C was a de-facto "standard" for programming for years before its formal ISO standardization at the end of the 1980s, much in the sense that *nix in one of its flavors has become a de-facto standard for server operating systems.
Addendum: After browsing the accessible part of the book that inspired your question I can say with confidence that the author indeed addresses both meanings of "standard": He talks about the different ISO C standard versions, dedicating an entire chapter to C 2018; but he also says the following, on "54% of sample" (I cannot see a page number there; emphasis by me):
The size of a pointer depends on the architecture rather than being a specific C concept. C doesn't worry too much about such hardware-related details, and it tries to provide a generic way of working with pointers and other programming concepts. That is why we know C as a standard.
I know, C is standardized by ANSI
C was standardized by ANSI in 1989 (aka C89).
It was then globally adopted by ISO/IEC JTC1/SC22 Programming Languages in 1990 as ISO/IEC 9899:1990 (aka C90).
Working Group 14 (WG14) of SC22 have subsequently evolved the C Standard as:
ISO/IEC 9899:1990 (aka C90)
ISO/IEC 9899:1990/AMD1:1995 (aka C95)
ISO/IEC 9899:1999 (aka C99)
ISO/IEC 9899:2011 (aka C11)
ISO/IEC 9899:2018 (aka C18 - although sometimes called C17 as __STDC_VERSION__ is 201710L)
ISO/IEC 9899:202x (aka C2x) is pending...
There were a couple of TCs too...
As a Standard it has requirements for conformance.

Why do Windows and Linux have different strdup implementations: strdup() and _strdup()?

When working with strdup on Windows I found out that _strdup is Windows specific, but when I ran the same code on Linux it required strdup without the underscore. Does anyone know the history behind this difference, as well as some information on how you have dealt with this problem when writing cross-platform code?
There are several functions that are part of the POSIX specification, i.e. Linux and most other UNIX variants, that are not part of standard C. These include strdup, write, read, and others.
The reasoning for the leading underscore is as follows, taken from the MSDN docs:
The Universal C Run-Time Library (UCRT) supports most of the C
standard library required for C++ conformance. It implements the C99
(ISO/IEC 9899:1999) library, with certain exceptions: The type-generic
macros defined in <tgmath.h>, and strict type compatibility in
<complex.h>. The UCRT also implements a large subset of the POSIX.1
(ISO/IEC 9945-1:1996, the POSIX System Application Program Interface)
C library. However, it's not fully conformant to any specific POSIX
standard. The UCRT also implements several Microsoft-specific
functions and macros that aren't part of a standard.
Functions specific to the Microsoft implementation of Visual C++ are
found in the vcruntime library. Many of these functions are for
internal use and can't be called by user code. Some are documented for
use in debugging and implementation compatibility.
The C++ standard reserves names that begin with an underscore in the
global namespace to the implementation. Both the POSIX functions and
Microsoft-specific runtime library functions are in the global
namespace, but aren't part of the standard C runtime library. That's
why the preferred Microsoft implementations of these functions have a
leading underscore. For portability, the UCRT also supports the
default names, but the Microsoft C++ compiler issues a deprecation
warning when code that uses them is compiled. Only the default names
are deprecated, not the functions themselves. To suppress the warning,
define _CRT_NONSTDC_NO_WARNINGS before including any headers in code
that uses the original POSIX names.
I've handled that by having a #define that checks if the program is being compiled for Windows, and if so creates another #define to map the POSIX name to the Windows specific name. There are a few choices you can check, although probably the most reliable is _MSC_VER which is defined if MSVC is the compiler.
#ifdef _MSC_VER
#define strdup(p) _strdup(p)
#endif

What is the purpose of Microsoft's underscore C functions?

This question is about the same subject as strdup or _strdup? but it is not the same. That question asks how to work around MS's renamings, this question asks why they did it in the first place.
For some reason Microsoft has deprecated a whole slew of POSIX C functions and replaced them with _-prefixed variants. One example among many is isatty:
https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/posix-isatty
This POSIX function is deprecated. Use the ISO C++ conformant _isatty instead.
What exactly is ISO C++ conformant about _isatty? It appears to me that the MSDN help is totally wrong.
The other question's answer explained how to deal with this problem: you add the _CRT_NONSTDC_NO_DEPRECATE define. Fine. But I want to know what Microsoft's thinking is. What was their point in renaming and deprecating functions? Was it just to make C programmers' lives even harder?
The fact that _isatty() is ISO C++ conformant makes sense if you think of it like a language-lawyer.
Under ISO C++, the compiler is only supposed to provide the functions in the standard (at least for the standard headers) -- they're not allowed to freely add extra functions, because it could conflict with functions declared in the code being compiled. Since isatty() is not listed in the standard, providing an isatty() function in a standard header would not be ISO C++ compliant.
However, the standard does allow the compiler to provide any function it wants as long as the function starts with a single underscore. So -- language lawyer time -- _isatty() is compliant with ISO C++.
I believe that's the logic that leads to the error message being phrased the way it is.
(Now, in this specific case, isatty() was provided in io.h, which is not actually a C++ standard header, so technically Microsoft could provide it and still claim to be standards-conformant. But, they had other non-compliant functions like strcmpi() in string.h, which is a standard header. So, for consistency, they deprecated all of the POSIX functions the same way and they all report the same error message.)
Names starting with an underscore, like _isatty are reserved for the implementation. They do not have a meaning defined by ISO C++, nor by ISO C, and you can't use them for your own purposes. So Microsoft is entirely right in using this prefix, and POSIX is actually wrong.
C++ has namespaces, so a hypothetical "Posix C++" could define namespace posix, but POSIX has essentially become fossilized - no new innovation in that area.
isatty & co., although POSIX, are not standard C, and are provided as "extensions" by the VC++ runtime 1).
As such, they are prefixed with an underscore supposedly to avoid name clashes - as names starting with an underscore followed by a lowercase letter are reserved for implementation-defined stuff at global scope. So if, for example, you wanted to use an actual POSIX compatibility layer providing its own versions of these functions, they wouldn't have to fight with the VC++-provided "fake" ones for the non-underscored names.
1) Extensions which make no claim to be actually POSIX-compliant, by the way.

Is a compiler allowed to add functions to standard headers?

Is a C compiler allowed to add functions to standard headers and still conform to the C standard?
I read this somewhere, but I can't find any reference in the standard, except in annex J.5:
The inclusion of any extension that may cause a strictly conforming
program to become invalid renders an implementation nonconforming.
Examples of such extensions are new keywords, extra library functions
declared in standard headers, or predefined macros with names that do
not begin with an underscore.
However, Annex J is informative and not normative... so it isn't helping.
So I wonder if it is okay or not for a conforming compiler to add additional functions in standard headers?
For example, let's say it adds non-standard itoa to stdlib.h.
In 4. "Conformance" §6, there is:
A conforming implementation may have extensions (including additional
library functions), provided they do not alter the behavior of any strictly conforming
program.
with the immediate conclusion in a footnote:
This implies that a conforming implementation reserves no identifiers other than those explicitly
reserved in this International Standard.
The reserved identifiers are described in 7.1.3. Basically, it is everything starting with an underscore and everything explicitly listed as used for the standard libraries.
So, yes the compiler is allowed to add extensions. But they have to have a name starting with an underscore or one of the prefixes reserved for libraries.
itoa is not a reserved identifier and a compiler defining it in a standard header is not conforming.
In "7.26 Future library directions" you have a list of the identifiers that may be added to the standard headers, this includes identifiers starting with str or mem, macros starting with E and stuff like that.
Other than that, implementations are restricted to the generic names as reserved in "7.1.3 Reserved identifiers".
Compilers for embedded systems regularly add functions and macros to standard headers, usually to make a special processor feature available for use.
If I read the standard correctly, they can do so without sacrificing conformity if they do use names specified as reserved by the standard. Since a conforming program may use any non-reserved name as a variable or a function name, using such a non-reserved name as an addition to a standard header would break a conforming program.
In practice, however, the compiler writers usually do not care too much. They will at most provide a list of elements defined for the system you may not use if you want your program to work with their implementation.
