MISRA Rule 2.3: A project should not contain unused type declarations

What does "project" mean here?
And in the following statement:
"If a type is declared but not used, then it is unclear to a reviewer if the type is redundant or it has been left unused by mistake."
what does "the type is redundant" mean? What is a redundant type?

The MISRA document does not contain a strict definition of "project". Intuitively, a project can be defined as the collection of source files used to build a set of artifacts.
A redundant type in this context is a type definition that is not used anywhere in the project sources. Unused block-scope typedefs can easily be detected with the -Wunused-local-typedefs option in recent versions of gcc and clang.
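For illustration, a minimal sketch (the file and function names are mine) that triggers the warning when compiled with gcc -Wunused-local-typedefs -c unused_typedef.c:

/* unused_typedef.c */
int sum(int a, int b)
{
    typedef int local_t;   /* declared but never used: gcc and clang warn here */
    return a + b;
}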

This is a family of rules, MISRA-C:2012 rules 2.x, which in plain English say that you should never declare any variables, types, macros etc. that aren't actually used anywhere in the program. Which is common sense: redundant simply means not used anywhere.
But note that these rules are mainly there for the benefit of the static analyser; these are exactly the kind of checks you want to automate. For mission-critical systems in general, we aren't allowed to have parts of the production code which are never actually executed. Not even code which is "commented out" is allowed.

Related

Valid programs in C89, but not in C99

Are there features or semantics introduced, or removed, in C99 which would make a well-defined program written in C89 either
invalid (i.e. no longer compiling, according to the C99 standard), or
compiling, but with different semantics?
My findings so far, concerning plainly invalid programs:
implicit int (C89 §3.5.2)
implicit function declaration (C89 §3.3.2.2)
not returning from a function expecting a return value (C89 §3.6.6.4)
using new keywords as identifiers (for example restrict, inline, etc.)
hacks involving //, which is now treated as the start of a comment; however, these are almost never encountered in production code.
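For illustration, a minimal sketch of such a hack (the identifiers are mine); under C89 the // parses as a division operator followed by a comment, while under C99 it begins a line comment, so the same code changes meaning:

int tricky(int b, int c, int d)
{
    int a;
    a = b //*divisor:*/ c
        + d;
    /* C89:  a = b / c + d;
       C99:  a = b + d;     */
    return a;
}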
Subtle changes that give the same code different semantics:
Integer division has been made well defined; for example, -3 / 2 now has to truncate towards zero (C99 §6.5.5/6) instead of being implementation-defined (C89 §3.3.5/6). (A sketch follows at the end of this question.)
strtod gained the ability to parse hexadecimal numbers in C99, by parsing 0x or 0X
What have I missed?
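As a sketch of the integer-division point above:

#include <stdio.h>

int main(void)
{
    /* C99 requires truncation toward zero, so this prints "-1 -1".
       Under C89 the rounding direction was implementation-defined, so a
       compiler rounding toward negative infinity could print "-2 1". */
    printf("%d %d\n", -3 / 2, -3 % 2);
    return 0;
}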
There are a lot of programs which would have been considered valid under C89, prior to the publication of C99, which some people insist were never valid. C89 includes a rule that requires that an object of any type may only be accessed using a pointer of that type, a related type, or a character type. Prior to the publication of C99, this rule was generally interpreted as applying only to "named" objects (variables of static or automatic duration which are accessed directly by name), and only in situations where the object in question didn't have its address taken immediately before it was used as a different pointer type. Such interpretation was motivated by a number of factors:
1. One of the stated goals of the Standard was to fit with what existing compilers and programs were doing, and while it would have been rare for existing programs to access discrete named variables using pointers of different types other than in cases where the variable's address was taken immediately before such use, many other usages of pointer type punning were quite common.
2. The rationale for the Standard includes as its sole example a function which receives a pointer of one primitive type to write a global variable of another primitive type in such a way that a compiler would have no particular reason to expect aliasing. Being able to keep global variables in registers is clearly a useful optimization, and the stated purpose of the rule is to allow such optimizations in cases where a compiler would have no reason to expect aliasing to occur. Outlawing constructs like *(int*)&foo = 23; does nothing to aid such optimizations, since the fact that code is taking foo's address and dereferencing it should make it abundantly clear to any compiler that isn't being deliberately obtuse that the code is going to modify foo.
3. There are many kinds of code which semantically require the ability to use memory bits as various types, and nothing in the Standard indicates that the rules were intended to make programmers jump through hoops (e.g. by using memcpy) to achieve semantics that could have been easily obtained in the absence of the rules, especially considering that using memcpy would prevent the compiler from keeping global variables in registers across the pointer accesses (thus defeating the purpose for which the rules were written in the first place).
4. If structure types V and W have a common initial sequence, U is any union type containing both, and p is a V* which identifies the V within a U, then (W*)(U*)p may be used to access those common members, and will be equivalent to (W*)p. Unless a compiler could show that p couldn't possibly be a pointer to a member of some union containing W, it would be required to allow (W*)p to access the common members; it was more helpful to simply treat such common member access as being legitimate regardless of whether or where U might exist than to search for excuses to deny it. (A sketch of this case appears after this answer.)
5. Nothing in the C89 rules makes clear how the "type" of a region of allocated storage is defined, or how storage which holds things of one type that are no longer needed might be re-purposed to hold things of another.
6. Keeping track of registers allocated to named variables was easier than keeping track of registers allocated to other pointer expressions, and code which was interested in minimizing the number of loads and stores via pointers would often copy things to named variables and work on them there.
C99 added "effective type" rules which are explicitly applicable to allocated storage. Some people insist those were merely "clarifications" of rules which already existed in C89, but for the above reasons I find that viewpoint untenable. It's fashionable to claim that the only reasons compilers didn't apply aliasing rules to unnamed objects are #5 and #6, but objections #1-#4 are equally significant (and continue to apply to C99 just as much as C89). Still, since C99 added the effective type rules, many constructs which would have been treated as legitimate by most common interpretations of the C89 rules are clearly forbidden.
As an element of contrast and comparison, the git/git codebase remains strictly conformant to C89 and does not use C99 initializers or other features from newer C standards.
This is detailed in Git 2.23 (Q3 2019) in Git Coding Guidelines.
This answer illustrates post-C89 features that might nonetheless be acceptable in a C89 codebase.
See commit cc0c429 (16 Jul 2019) by Junio C Hamano (gitster).
(Merged by Junio C Hamano -- gitster -- in commit fe9dc6b, 25 Jul 2019)
CodingGuidelines: spell out post-C89 rules
Even though we have been sticking to C89, there are a few handy features we borrow from more recent C language in our codebase after trying them in weather balloons and saw that nobody screamed.
Spell them out.
While at it, extend the existing variable declaration rule a bit to
read better with the newly spelled out rule for the for loop.
The coding guidelines now include:
You should not use features from newer C standard, even if your compiler groks them.
There are a few exceptions to this guideline:
since early 2012 with e1327023ea (Git v1.7.9.2), we have been using an enum definition whose last element is followed by a comma.
This, like an array initializer that ends with a trailing comma, can be used to reduce the patch noise when adding a new identifier at the end.
since mid 2017 with cbc0f81d (Git v2.15.0-rc0), we have been using designated
initializers for struct (e.g. "struct t v = { .val = 'a' };")
There are certain C99 features that might be nice to use in our code base, but we've hesitated to do so in order to avoid breaking compatibility with older compilers.
But we don't actually know if people are even using pre-C99 compilers these days.
If this patch can survive a few releases without complaint, then we can feel more confident that designated initializers are widely supported by our user base.
It also is an indication that other C99 features may be supported, but not a guarantee (e.g., gcc had designated initializers before C99 existed).
since mid 2017 with 512f41cf (Git v2.15.0-rc0), we have been using designated initializers for array (e.g. "int array[10] = { [5] = 2 }").
This is another test balloon to see if we get complaints from people
whose compilers do not support designated initializer for arrays.
These used to be forbidden, but we have not heard any breakage report, and they are assumed to be safe.
Variables have to be declared at the beginning of the block, before the first statement (i.e. -Wdeclaration-after-statement).
Declaring a variable in the for loop "for (int i = 0; i < 10; i++)" is still not allowed in this codebase.
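Put together, a small sketch of what these guidelines allow and require (the identifiers below are mine, not from the Git tree):

enum color {
    COLOR_RED,
    COLOR_BLUE,                  /* trailing comma: allowed since v1.7.9.2    */
};

struct point { int x, y; };
static struct point origin = { .x = 0, .y = 0 };   /* designated struct init  */
static int lut[10] = { [5] = 2 };                  /* designated array init   */

static int sum_lut(void)
{
    int i;                       /* declarations before the first statement   */
    int total = 0;

    for (i = 0; i < 10; i++)     /* "for (int i = 0; ...)" is still not allowed */
        total += lut[i];
    return total;
}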

Oberon: How to resolve contradiction in Wirth's PIO re type guard

I am trying to figure out whether Oberon allows addressing a field of a record that is not present in said record's type declaration, but only in one of its extensions, and to do so without a type guard.
In PIO ("Programming in Oberon"), page 62, last sentence of the first paragraph, Wirth writes (1):
This concludes our brief introduction to the object-oriented paradigm
of programming. We realize that almost no language features had to be
added to Oberon to support it. Apart from the already present
facilities of records and of procedural types, only the notion of type
extension is both necessary and crucial. It allows to construct
hierarchies of types and to build inhomogeneous data structures. As a
consequence of abandoning the rule of strictly static typing, the
introduction of dynamic type tests became necessary. The further
facility of the type guard is merely one of convenience.
In PIO, page 59, in the first three sentences of the last paragraph before section 23.2, he writes (2):
The simple designator p.radius would not be acceptable, because p is of type Figure, which does not feature a field radius. With the type guard, the programmer can ascertain that in this case p is also of type Circle, in which case the field radius is indeed applicable. Whereas p is of base type Figure, p(Circle) is of type Circle.
On the one hand I interpret #2 such that the type guard is absolutely necessary in order to be able to address a field that is not in the designator's type declaration. Were it not for the type guard, addressing such a field should cause a compile time error.
On the other hand, if the type guard is merely a convenience as suggested by #1, then it could also be omitted. Its facility would simply be that of an assert and consequently the compiler could allow the addressing of a field that is not in the designator's type declaration.
Since the latter is not type safe I would be surprised if Wirth intended it that way.
I am therefore inclined to completely disregard #1 and implement #2.
Before I bother Wirth with an email I'd appreciate if Oberon practitioners (and compiler implementers) could share how this is interpreted in their respective Oberon compilers.
thanks in advance
I emailed Professor Wirth to ask for clarification.
It turns out that in the earlier Oberon language reports the statement "merely a convenience" has indeed been misleading because in these versions of Oberon the type guard syntax was necessary to address fields of extensions not present in the base type. There was no other way to do this.
However, as Wirth pointed out, in his latest revision of Oberon the semantics of the CASE statement have been extended to perform both type test and addressing of fields in extensions not present in their base type.
CASE msg OF
    DrawMsg : msg.draw(self)
  | MoveMsg : msg.move(self, msg.dx, msg.dy)
  ...
END
In this case neither the IS type test nor the type guard syntax is strictly necessary. Thus, in the current version of Oberon, they are indeed merely a convenience.
The language report for the latest Oberon version can be found at:
https://www.inf.ethz.ch/personal/wirth/Oberon/Oberon07.Report.pdf
The CASE statement is described in section 9.5.

How to Protect Against Symbol Redefinition

My project incorporates a stack, which has a number of user-defined types (typedef). The problem is that many of these type definitions conflict with our in-house type definitions. That is, the same symbol name is being used. Is there any way to protect against this?
The root of the problem is that to use the stack in our application, or wrapper code as the case may be, a certain header file must be included. This stack header file in turn includes the stack provider's type definition file. That's the problem. They should have included their type definition file via a non-public include path, but they didn't. Now there are all sorts of user-defined type conflicts for very common names, such as BYTE, WORD, DWORD, and so forth.
Since you probably can't easily change the program stack you are using, you will have to start with your own code.
The first thing to do is (obviously) to limit the number of names in the global namespace as far as possible. Don't use global variables; use static ones instead, for example.
The next step is to adopt a naming convention for your code modules. Suppose you have an "input module" in the project. You could then, for example, prefix everything in the input module with "inp":
void inp_init (void);
void inp_get (int input);

#define INP_SOMECONSTANT 4

typedef enum
{
    INP_THIS,
    INP_THAT,
} inp_something_t;
And so on. Whenever these items are used elsewhere in the code, they will not only have a unique identifier, it will also be obvious to the reader which module they belong to, and therefore what purpose they have. So while fixing the namespace conflicts, you gain readability at the same time.
Something like the above could be the first steps to implementing a formal coding standard, something you need to do sooner or later anyway as a professional programmer.
I suggest you define a wrapping header that redefines all of the functions and structures exported by the stack in terms of your own types. This header is then included in your system files but not in the stack files (where it would conflict). You can then compile and link, but there is a weak point at the interface. If you select your types correctly in your redefinitions, it should work, leaving only a maintenance problem on each update from the stack supplier...
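A sketch of that wrapping-header idea, with entirely hypothetical names:

/* stack_wrapper.h -- the only header the application code includes.
 * The vendor's stack header (with its BYTE/WORD/DWORD typedefs) is
 * included only inside the wrapper's .c file, so it never leaks. */
#ifndef STACK_WRAPPER_H
#define STACK_WRAPPER_H

#include <stdint.h>

int stackwrap_init(void);
int stackwrap_send(const uint8_t *buf, uint32_t len);

#endif /* STACK_WRAPPER_H */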
I think I've come up with a reasonable workaround for the time being, but as Lundin stated, a formal coding standard is needed for a long-term solution.
Basically, I moved the inclusion of the required stack header file to before the inclusion of our in-house type definition file. Then, between those two includes, I set a defined constant depending on whether the stack header's single-include protection macro has been defined. Finally, I used that constant for conditional compilation in our in-house type definition file, to prevent the conflicting types from being redefined. It's a little sloppy, but progress can only be made in incremental steps.
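A sketch of this workaround, with hypothetical header and macro names:

#include "vendor_stack.h"          /* defines BYTE, WORD, DWORD, ...          */

#ifdef VENDOR_STACK_H              /* the stack header's include-guard macro  */
#define SKIP_CONFLICTING_TYPES     /* tell our own header to hold back        */
#endif

#include "inhouse_types.h"

/* ...and inside inhouse_types.h: */
#ifndef SKIP_CONFLICTING_TYPES
typedef unsigned char  BYTE;
typedef unsigned short WORD;
typedef unsigned long  DWORD;
#endif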

Get type name as string in C, GCC

Is there some 'builtin' extension in GCC to get type name of expression in C? (As a string, i.e. 'const char*').
First: you want to obtain the type of a C expression at runtime. The problem is that types are erased during compilation and the machine code is almost typeless; it contains nothing but 8/16/32/64-bit integers and 32/64/80-bit floating point numbers (in the case of x86). Types are compile-time entities in C (C++ may retain some information about types at runtime because of its object-oriented nature, since it associates types with classes, but it is hard to track PODs and primitive types).
Second: you want the type of a C expression. Sometimes it is hard to say what type a given C expression will have at runtime.
Thus there is no way to obtain a C expression's type at runtime.
Maybe you could have a look at the TYPE_NAME macro, which seems to be a good starting point.
Since you said you want the name at runtime, that is a definitive "no". In C, data is just bytes in memory and doesn't have an intrinsic type at all. It is only the type declaration that tells the compiler what the compiled code should expect the type to be.
It would make sense, however, for a C compiler to be able to recognize the type of a variable at compile time, and that would be great for implementing things like equality assertions with friendly output in a unit testing framework. I can't see that C has anything like that either though.
Does anyone know if new versions of the ANSI C spec are still being developed? Compile-time type identification would be a great thing to add. Maybe integer constants for intrinsic types and a type equality test for either intrinsic or defined types?

What style to use when naming types in C

According to this stack overflow answer, the "_t" postfix on type names is reserved in C. When using typedef to create a new opaque type, I'm used to having some sort of indication in the name that this is a type. Normally I would go with something like hashmap_t but now I need something else.
Is there any standard naming scheme for types in C? In other languages, using CapsCase like Hashmap is common, but a lot of C code I see doesn't use upper case at all. CapsCase works fairly nicely with a library prefix too, like XYHashmap.
So is there a common rule or standard for naming types in C?
Yes, POSIX reserves names ending _t if you include any of the POSIX headers, so you are advised to stay clear of those - in theory. I work on a project that has run afoul of such names two or three times over the last twenty or so years. You can minimize the risk of collision by using a corporate prefix (your company's TLA and an underscore, for example), or by using mixed case names (as well as the _t suffix); all the collisions I've seen have been short and all-lower case (dec_t, loc_t, ...).
Other than the system-provided (and system-reserved) _t suffix, there is no specific widely used convention. One of the mixed-case systems (camelCase or InitialCaps) works well. A systematic prefix works well too - the better libraries tend to be careful about these.
If you do decide to use lower-case and _t suffix, do make sure that you use long enough names and check diligently against the POSIX standard, the primary platforms you work on, and any you think you might work on to avoid unnecessary conflicts. The worst problems come when you release some name example_t to customers and then find there is a conflict on some new platform. Then you have to think about making customers change their code, which they are always reluctant to do. It is better to avoid the problem up front.
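For example (the "acme" prefix here is hypothetical), a corporate prefix and a reasonably long name make a clash with POSIX's short, all-lower-case *_t names unlikely:

typedef struct acme_hashmap acme_hashmap_t;   /* long and prefixed: low risk  */
typedef struct AcmeHashmap  AcmeHashmap;      /* mixed case: also low risk    */
/* typedef struct map map_t;  -- short and all lower case: the risky kind     */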
The Indian Hill style guidelines have some suggestions:
Individual projects will no doubt have their own naming conventions. There are some general rules however.
Names with leading and trailing underscores are reserved for system purposes and should not be used for any user-created names. Most systems use them for names that the user should not have to know. If you must have your own private identifiers, begin them with a letter or two identifying the package to which they belong.
#define constants should be in all CAPS.
Enum constants are Capitalized or in all CAPS.
Function, typedef, and variable names, as well as struct, union, and enum tag names, should be in lower case.
Many macro "functions" are in all CAPS. Some macros (such as getchar and putchar) are in lower case since they may also exist as functions. Lower-case macro names are only acceptable if the macros behave like a function call, that is, they evaluate their parameters exactly once and do not assign values to named parameters. Sometimes it is impossible to write a macro that behaves like a function even though the arguments are evaluated exactly once.
Avoid names that differ only in case, like foo and Foo. Similarly, avoid foobar and foo_bar. The potential for confusion is considerable.
Similarly, avoid names that look like each other. On many terminals and printers, 'l', '1' and 'I' look quite similar. A variable named 'l' is particularly bad because it looks so much like the constant '1'.
In general, global names (including enums) should have a common prefix identifying the module that they belong with. Globals may alternatively be grouped in a global structure. Typedeffed names often have "_t" appended to their name.
Avoid names that might conflict with various standard library names. Some systems will include more library code than you want. Also, your program may be extended someday.
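To illustrate the macro rule quoted above, a minimal sketch (the names are mine):

#define MAX_OF(a, b)   ((a) > (b) ? (a) : (b))

/* MAX_OF(i++, j) evaluates i++ twice, so this macro does not behave like a
 * function call; spelling it in all CAPS, rather than as a lower-case
 * max_of(), warns the reader of exactly that. */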
C only reserves some uses of the _t suffix. As far as I can tell, this is only the identifiers currently ending with _t plus any identifier that starts with int or uint (7.26.8). However, POSIX may reserve more.
It's a general problem in C, since you have extremely flat namespaces, and there's no silver bullet. If you're familiar with CapCase names and they work well for you, then you should continue to use them. Otherwise, you'll have to evaluate the goals of the current project and see which solution best meets them.
CapsCase is often used for types in C.
For instance, if you look at projects in the GNOME ecosystem (GTK+, GDK, GLib, GObject, Clutter, etc.), you'll see types like GtkButton or ClutterStageWindow. They only use CapsCase for data types; function names and variables are all lower-case with underscore separators - e.g. clutter_actor_get_geometry().
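For instance, a small sketch in that style (the names below are illustrative, not real GLib or Clutter API):

typedef struct _XyWidget XyWidget;           /* CapsCase data type            */

void xy_widget_set_title(XyWidget *widget,   /* lower_case function and args  */
                         const char *title);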
Type naming schemes are like indentation conventions - they generate religious wars with people asserting some sort of moral superiority for their preferred approach. It is certainly preferable to follow the style in existing code, or in related projects (e.g. for me, GNOME over the last few years.)
However, if you're starting from scratch and have no template, there's no hard-and-fast rule. If you're interested in coding efficiently and leaving work at reasonable hour so you can go home and have a beer or whatever, you certainly should pick a style and stick to it for your project, but it matters very little exactly which style you pick.
One alternate solution that works reasonably well is to use uppercase for all type names and macro names. Global variables may be CapCase (CamelBack) and all local variables lower case.
This technique helps to improve readability and also takes advantage of the language syntax, reducing the number of 'pollution' characters needed in names (e.g. gvar, kvar, type_t, etc.). For example, data types cannot be syntactically confused with any other kind of identifier.
Global variables are easily distinguished from locals by having at least one upper case letter.
I agree that prefixed or postfixed underscores should be avoided in all token names.
Let's look at the example below.
It's readily clear that InvertedCount is a global due to its case. It's equally clear that INT32U and RET_ERR are types due to their syntax. It's also clear that INVERT_VAL() is a macro, because it is on the right-hand side and there is no cast, so it can't be a data type.
One thing is for sure though: whichever method you use, it should be in line with your organization's coding standard. For me, the less clutter, the better.
Of course, style is a different issue.
#define INVERT_VAL(x)   (~(x))
#define CALIBRATED_VAL  100u

typedef unsigned int INT32U;       /* assumed here; normally from a shared types header */

INT32U InvertedCount;

typedef enum {
    ERR_NONE = 0,
    ...
} RET_ERR;

RET_ERR my_func (void)
{
    INT32U val;
    INT32U check_sum;

    val = CALIBRATED_VAL;          // --> Lower case local variable.
    check_sum = INVERT_VAL(val);   // --> Clear use of macros.
    InvertedCount = check_sum;     // --> Upper case global variable.
                                   //     Looks different, no g prefix required.
    ...
    return (ERR_NONE);
}
There are many ideas and opinion on this subject, but there is no one universal standard for naming types. The most important thing is to be consistent. In the absence of coding standards, when maintaining code, resist the urge to use another naming convention. Introducing a new naming convention, even if it's perfect, can add unnecessary complexity.
This is actually a great topic to raise when interviewing people. I've never come across a good programmer that didn't have an opinion on this. No opinion or no passion in the answer indicates that the person isn't an experienced programmer.

Resources