Make tolower as static - c

I need to make the standard library function tolower static rather than giving it "public" scope.
I am compiling with MISRA C:2004, using IAR Embedded Workbench compiler. The compiler is declaring tolower as inline:
inline
int tolower(int _C)
{
return isupper(_C) ? (_C + ('A' - 'a')) : _C;
}
I am getting the following error from the compiler:
Error[Li013]: the symbol "tolower" in RS232_Server.o is public but
is only needed by code in the same module - all declarations at
file scope should be static where possible (MISRA C 2004 Rule 8.10)
Here are my suggested solutions:
Use tolower in another module, in a dummy circumstance, so that it
is needed by more than one module.
Implement the functionality without using tolower. This is an embedded system.
Add a "STATIC" macro, defined as empty by default, but can be
defined to static before the ctype.h header file is included.
I'm looking for a solution to the MISRA linker error. I would prefer to make the tolower function static only for the RS232_Server translation unit (If I make tolower static in the standard header file, it may affect other future projects.)
Edit 1:
Compiler is IAR Embedded Workbench 6.30 for ARM processor.
I'm using an ARM7TDMI processor in 32-bit mode (not Thumb mode).
The tolower function is used with a debug port.
Edit 2:
I'm also getting the error for _LocaleC_isupper and _LocaleC_tolower
Solution:
I notified the vendor of the issue, as recommended by Michael Burr.
I decided not to rewrite the library routine because of localization
issues.
I implemented a function pointer in the main.c file as suggested by
gbulmer; however, this will be heavily commented because it should
be removed after IAR resolves their issue.

I'd suggest that you disable this particular MISRA check (you should be able to do that just for the RS232_Server translation unit) rather than use one of the workarounds you suggest. In my opinion the utility of rule 8.10 is pretty minimal, and jumping through the kinds of hoops in the suggested workarounds seems more likely to introduce risk than just disabling the rule. Keep in mind that the point of MISRA is to make C code less likely to have bugs.
Note that MISRA recognizes that "in some instances it may be necessary to deviate from the rules" and there is a documented "Deviation procedure" (section 4.3.2 in MISRA-C 2004).
If you won't or can't disable the rule for whatever reason, in my opinion you should probably just reimplement tolower()'s functionality in your own function, especially if you don't have to deal with locale support. It may also be worthwhile to open a support incident with IAR. You can prod them with rule 3.6, which says that "All libraries used in production code shall be written to comply with [MISRA-C]".

Who sells the MISRA linker? It seems to have an insane bug.
Can you work around it by taking its address, int (*foo)(int) = tolower; at file scope?
Edit: My rationale is:
a. that is the stupidest thing I've seen this decade, so it may be a bug, but
b. pushing it towards an edge case (a symbol having its name exported via a global) might shut it up.
For that to be correct behaviour, i.e. not a bug, it would have to be a MISRA error to include any library function once, like initialise_system_timer, initialise_watchdog_timer, ... , which just hurts my head.
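A minimal sketch of the address-taking idea, assuming it is placed in a second source file such as main.c (which is where the asker eventually put a function pointer). The names tolower_ref and debug_tolower are hypothetical, and whether the extra reference actually silences the Li013 check depends on the tool; that has not been verified here.

```c
#include <ctype.h>

/* Hypothetical name. Taking the address at file scope creates a second
   reference to tolower from outside the module that normally uses it. */
int (*const tolower_ref)(int) = tolower;

/* Hypothetical helper so that the pointer itself is visibly used. */
int debug_tolower(int c)
{
    return tolower_ref(c);
}
```

A workaround like this deserves a comment naming the tool issue it works around, so that it can be removed once the vendor resolves it.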
Edit: Another thought. Again based on an assumption that this is an edge-case error.
Theory: Maybe the compiler is doing both the inline, and generating an implementation of the function. Then the function is (of course) not being called. So the linker rules are seeing that un-called function.
GNU has options to prevent in-lining.
Can you do the same for that use of tolower? Does that change the error? You could do a test by
#define inline /* nothing */
before including ctype.h, then undef the macro.
The other test is to define:
#define inline static inline
before including ctype.h, which is the version of inline I expect.
EDIT2:
I think there is a bug which should be reported. I would imagine IAR have a workaround. I'd take their advice.
After a night's sleep, I strongly suspect the problem is inline int tolower() rather than static inline int tolower or int tolower, because that makes the most sense. Having a public function which is not called does seem to be a symptom of a bug in a program.
Even with documentation, all the coding approaches have downsides.
I strongly support the OP, I would not change a standard header file. I think there are several reasons. For example, a future upgrade of the tool chain (which comes with a new set of headers) breaks an old app if it ever gets maintained. Or simply building the application on a different machine gives an error, merging two apparently correct applications may give an error, ... . Nothing good is likely to come of that. -100
Use #define ... to make the error go away. I do not like using macros to change standard libraries. This seems like the seeds of a long-term bad idea. If in future another part of the program uses another function with a similar problem, the application of the 'fix' gets worse. Some inexperienced developer might get the job of maintaining the code, and 'learns' wrapping strange pieces of #define trickery around #include is 'normal' practice. It becomes company 'practice' to wrap #include <ctype.h> in weird macro workarounds, which remains years after it was fixed. -20
Use a command line compiler option to switch off inlining. If this works, and the semantics are correct, i.e. including the header and using the function in two source files does not lead to a multiple defined function, then okay. I would expect it leads to an error, but it is worth confirming as part of the bug report. It lays a frustrating trap for someone else, who comes along in the future. If a different inline standard library function is used, and for some reason a person has to look at the generated code it won't be included either. They might go a bit crazy wondering why inline is not honoured. I conjecture the reason they are looking at generated code is because performance is critical. In my experience, people would spend a lot of time looking at code, baffled before looking at the build script for a program which works. If suppressing inline is used as a fix, I do feel it is better to do it everywhere than for one file. At least the semantics are consistent and the high-level comment or documentation might get noticed. Anyone using the build scripts as a 'quick check' will get consistent behaviour which might cause them to look at the documentation. -1 (no-inline everywhere) -3 (no-inline on one file)
Use tolower in a bogus way in a second source file. This has the small benefit that the build system is not 'infected' with extra compiler options. Also, it is a small file which gives the opportunity to explain the error being worked around. I do not like this much, but I like it more than fiddling with standard headers. My current concern is that it might not work: it might include two copies which the linker can't resolve. I do like it better than the weird 'take its address and see if the linker shuts up' (which I do think is an interesting way to test the edge cases). -2
Code your own tolower. I don't understand the wider context of the application. My reaction is not to code replacements for library functions because I am concerned about testing (and the unit tests which introduce even more code) and long term maintenance. I am even more nervous with character I/O because the application might need to become capable of handling a wider character set, for example UTF-8 encoding, and a user-defined tolower will tend to get out of synch. It does sound like a very specific application (debugging to a port), so I could cope. I don't like writing extra code to work around things which look like bugs, especially if it is commercial software. -5
Can you convert the error to a warning without losing all of the other checking? I still feel it is a bug in the toolchain, so I'd prefer it to be a warning, rather than silence so that there is a hook for the incident and some documentation, and there is less chance of another error creeping in. Otherwise go with switching off the error. +1 (Warning) 0 (Error)
Switching the error off seems to lose the 'corporate awareness' that (IMHO) IAR owes you an explanation, and longer term fix, but it seems much better than messing with the build system, writing macro-nastiness to futz with standard libraries, or writing your own code which increases your costs.
It may just be me, but I dislike writing code to work around a deficiency in a commercial product. It feels like this is where the vendor should be pleased to have the opportunity to justify its license cost. I remember Microsoft charged us for incidents, but if the problem was proven to be theirs, the incident and fix were free. The right thing to do seems to be give them a chance to earn the money. Products will have bugs, so silently working around it without giving them a chance to fix it seems less helpful too.

First of all, MISRA-C:2004 allows neither inline nor any other C99 feature. The upcoming MISRA 2012 will allow it. If you try to run C99 or non-standard code through a MISRA-C:2004 static analyser, all bets are off. The MISRA checker should give you an error for the inline keyword.
I believe a MISRA-C compliant version of the code would look like:
static uint8_t tolower(uint8_t ch)
{
    return (uint8_t)(isupper(ch) ? (uint8_t)((uint8_t)(ch - 'A') + 'a') :
                                   ch);
}
Some comments on this: MISRA encourages char type for character literals, but at the same time warns against using the plain char type, as it has implementation-defined signedness. Therefore I use uint8_t instead. I believe it is plain dumb to assume that there exist ASCII tables with negative indices.
(_C + ('A' - 'a')) is most certainly not MISRA-compliant: as MISRA sees it, the expression contains two implicit type promotions. MISRA regards character literals as char rather than int, unlike the C standard.
Also, you have to typecast to underlying type after each expression. And because the ?: operator contains implicit type promotions, you must typecast the result of it to underlying type as well.
Since this turned out to be quite an unreadable mess, the best idea is to forget all about ?: entirely and rewrite the function. At the same time we can get rid of the unnecessary reliance on signed calculations.
static uint8_t tolower (uint8_t ch)
{
    if(isupper(ch))
    {
        ch = (uint8_t)(ch - 'A');
        ch = (uint8_t)(ch + 'a');
    }
    return ch;
}
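Since the question's second edit reports the same linker error for _LocaleC_isupper, a variant that avoids <ctype.h> altogether may be worth considering. The following is only a sketch: the name ascii_tolower is hypothetical, it assumes an ASCII execution character set and no locale support, and it has not been run through a MISRA checker.

```c
#include <stdint.h>

/* Hypothetical name; assumes ASCII, where 'A'..'Z' are contiguous. */
static uint8_t ascii_tolower(uint8_t ch)
{
    if ((ch >= (uint8_t)'A') && (ch <= (uint8_t)'Z'))
    {
        /* Same two-step arithmetic as above: offset from 'A', then rebase on 'a'. */
        ch = (uint8_t)(ch - (uint8_t)'A');
        ch = (uint8_t)(ch + (uint8_t)'a');
    }
    return ch;
}
```

Because the function is static and uses a different name, it neither clashes with the library's tolower nor exports a public symbol.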

Related

Cross-Platform C single header file and multiple implementations

I am working on an open source C driver for a cheap sensor that is used mostly for Arduino projects. The project is set up in such a way that it is possible to support multiple platforms outside the Arduino ecosystem, like the Raspberry Pi.
The project is set up with a platform.h file, with the intention of having different implementations of this header file. Like the example below:
platform.h
platform_arduino.c
platform_rpi.c
platform_windows.c
There is this (Cross-Platform C++ code and single header - multiple implementations) Stack Overflow post that goes fairly in depth in how to handle this for C++ but I feel like none of those examples really apply to this C implementation.
I have come up with some solutions like just adding the requirements for each platform at the top of the file.
#if SOME_REQUIREMENT
#include "platform.h"
int8_t t_open(void)
{
// Implementation here
}
#endif //SOME_REQUIREMENT
But this seems like a clunky solution.
It impacts readability of the code. [1]
It will probably make debugging conflicting requirements a nightmare.
[1] Many editors (like VS Code) try to gray out code which does not match requirements. While I want this most of the time, it is really annoying when working on cross-platform drivers. I could just disable it for the entirety of the project, but in other parts of the project it is useful. I understand that it could probably be solved with some VS Code setting. However, I am asking for alternative methods of selecting the right file/code for the platform because I am interested in seeing what other strategies there are.
Part of the "problem" is that support for Arduino is the primary focus, which means it can't easily be solved with makefile magic. My question is, what are alternative ways of implementing a solution to this problem, that are still readable?
If it cannot be done without makefile magic, then that is an answer too.
For reference, here is a simplified example of the header file and implementation
platform.h
#ifndef __PLATFORM__
#define __PLATFORM__
#include <stdint.h> /* for int8_t */
int8_t t_open(void);
#endif //__PLATFORM__
platform_arduino.c
#include "platform.h"
int8_t t_open(void)
{
// Implementation here
}
this (Cross-Platform C++ code and single header - multiple implementations) Stack Overflow post that goes fairly in depth in how to handle this for C++ but I feel like none of those examples really apply to this C implementation.
I don't see why you say that. The first suggestions in the two highest-scoring answers are variations on the idea of using conditional macros, which not only is valid in C, but is a traditional approach. You yourself present an alternative along these lines.
Part of the "problem" is that support for Arduino is the primary focus, which means it can't easily be solved with makefile magic.
I take you to mean that the approach to platform adaptation has to be encoded somehow into the C source, as opposed to being handled via the build system. Frankly, this is an unusual constraint, except inasmuch as it can be addressed by use of the various system-identification macros provided by C compilers of interest.
Even if you don't want to rely specifically on makefiles, you should consider attributing some responsibility to the build system, which you can do even without knowing specifically what build system that is. For example, you can designate macro names, such as for_windows, that request builds for non-default platforms. You then leave it to the person building an instance of the driver to figure out how to configure their tools to provide the appropriate macro definition for their needs (which generally is not hard), based on your build documentation.
My question is, what are alternative ways of implementing a solution to this problem, that are still readable?
If the solution needs to be embodied entirely in the C source, then you have three main alternatives:
write code that just works correctly on all platforms, or
perform runtime detection and adaptation, or
use conditional compilation based on macros automatically defined by supported compilers.
If you're prepared to rely on macro definitions supplied by the user at build time, then the last becomes simply
use conditional compilation
Do not dismiss the first out of hand, but it can be a difficult path, and it might not be fully possible for your particular problem (and probably isn't if you're writing a driver or other code for a freestanding implementation).
Runtime adaptation could be viewed as a specific case of code that just works, but what I have in mind for this is a higher level of organization that performs runtime analysis of the host environment and chooses function variants and internal parameters suited to that, as opposed to those choices being made at compile time. This is a real thing that is occasionally done, but it may or may not be viable for your particular case.
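To make the runtime-adaptation idea concrete, here is a minimal sketch using a table of function pointers. Every name in it (platform_ops, platform_select, the probe and open functions) is hypothetical, and the probes are stubs; a real driver would have to query the host environment, which may not be possible on a freestanding target.

```c
#include <stddef.h>
#include <stdint.h>

/* All names below are hypothetical; real probes would inspect the host. */
typedef struct {
    const char *name;
    int (*probe)(void);    /* nonzero when this backend can run here */
    int8_t (*open)(void);  /* backend-specific implementation of t_open */
} platform_ops;

static int probe_never(void) { return 0; }
static int probe_always(void) { return 1; }
static int8_t open_special(void) { return 1; }
static int8_t open_generic(void) { return 0; }

static const platform_ops backends[] = {
    { "special", probe_never, open_special },
    { "generic", probe_always, open_generic },
};

/* Return the first backend whose probe succeeds, or NULL if none does. */
const platform_ops *platform_select(void)
{
    size_t i;
    for (i = 0; i < sizeof backends / sizeof backends[0]; ++i) {
        if (backends[i].probe() != 0) {
            return &backends[i];
        }
    }
    return NULL;
}

/* t_open dispatches through whichever backend was chosen at run time. */
int8_t t_open(void)
{
    const platform_ops *ops = platform_select();
    return (ops != NULL) ? ops->open() : (int8_t)-1;
}
```

The cost of this design is that code for every backend must compile and link on every platform, which is often exactly what a cross-platform driver cannot guarantee.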
On the other hand, conditional compilation is the traditional basis for platform adaptation in C, and the general form does not have the caveat of the other two that it might or might not work in your particular situation. The level of readability and maintainability you achieve this way is a function of the details of how you implement it.
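As a sketch of conditional compilation driven by macros the toolchain defines on its own: ARDUINO is defined by Arduino builds, _WIN32 by Windows compilers, and __linux__ by GCC and Clang on Linux (which would cover a Raspberry Pi running Linux). The PLATFORM_NAME macro and the platform_name function are hypothetical names used only for illustration.

```c
/* Pick a platform label from macros that the compiler or IDE predefines. */
#if defined(ARDUINO)
#define PLATFORM_NAME "arduino"
#elif defined(_WIN32)
#define PLATFORM_NAME "windows"
#elif defined(__linux__)
#define PLATFORM_NAME "linux"
#else
#define PLATFORM_NAME "unknown"
#endif

/* Hypothetical accessor so the rest of the code never tests the macros directly. */
const char *platform_name(void)
{
    return PLATFORM_NAME;
}
```

Keeping all the detection in one place like this means the remaining source files only depend on names you define yourself.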
I have come up with some solutions like just adding the requirements for each platform at the top of the file. [...] But this seems like a clunky solution.
If you must include a source file in your build but you don't want anything in it to actually contribute to the target then that's exactly what you must do. You complain that "It will probably make debugging conflicting requirements a nightmare", but to the extent that that's a genuine issue, I think it's not so much a question of syntax as of the whole different code for different platforms plan.
You also complain that the conditional compilation option might be a practical difficulty for you with your choice of development tools. It certainly seems to me that there ought to be good workarounds for that available from your tools and development workflow. But if you must have a workaround grounded only in the C language, then there is one (albeit a bad one): introduce a level of preprocessing indirection. That is, put the conditional compilation directives in a different source file, like so:
platform.c
#if defined(for_windows)
#include "platform_windows.c"
#elif defined(for_rpi)
#include "platform_rpi.c"
#else
#include "platform_arduino.c"
#endif
You then designate platform.c as a file to be built, but not (directly) any of the specific-platform files.
This solves your tool-presentation issue because when you are working on one of the platform-specific .c files, the editor is unlikely to be able to tell whether it would actually be included in a build or not.
Do note well that it is widely considered bad practice to #include files containing function implementations, or those not ending with an extension conventionally designating a header. I don't say otherwise about the above, but I would say that if the whole platform.c contains nothing else, then that's about the least bad variation that I can think of within the category.

unused `static inline` functions generate warnings with `clang`

When using gcc or clang, it's generally a good idea to enable a number of warnings, and a first batch of warnings is generally provided by -Wall.
This batch is pretty large, and includes the specific warning -Wunused-function.
Now, -Wunused-function is useful to detect static functions which are no longer invoked, meaning they are useless, and should therefore preferably be removed from source code.
When applying a "zero-warning" policy, it's no longer "preferable", but downright compulsory.
For performance reasons, some functions may be defined directly into header files *.h, so that they can be inlined at compile time (disregarding any kind of LTO magic). Such functions are generally declared and defined as static inline.
In the past, such functions would probably have been defined as macros instead, but it's considered better to make them static inline functions instead, whenever applicable (no funny type issue).
OK, so now we have a bunch of functions defined directly into header files, for performance reasons. A unit including such a header file is under no obligation to use all its declared symbols. Therefore, a static inline function defined in a header file may reasonably not be invoked.
For gcc, that's fine. gcc would flag an unused static function, but not an inline static one.
For clang though, the outcome is different : static inline functions declared in headers trigger a -Wunused-function warning if a single unit does not invoke them. And it doesn't take a lot of flags to get there : -Wall is enough.
A work-around is to introduce a compiler-specific extension, such as __attribute__((unused)), which explicitly states to the compiler that the function defined in the header may not necessarily be invoked by all its units.
OK, but now, the code which used to be clean C99 is including some form of specific compiler extension, adding to the weight of portability and maintenance.
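One common way to limit the portability cost is to hide the extension behind a macro that expands to nothing on other compilers. This is a minimal sketch; MAYBE_UNUSED and clamp_to_byte are hypothetical names.

```c
/* Expand to the GNU attribute where it is understood, to nothing elsewhere. */
#if defined(__GNUC__) || defined(__clang__)
#define MAYBE_UNUSED __attribute__((unused))
#else
#define MAYBE_UNUSED
#endif

/* Hypothetical header-defined helper that some units may never call. */
MAYBE_UNUSED static inline int clamp_to_byte(int v)
{
    if (v < 0) {
        return 0;
    }
    if (v > 255) {
        return 255;
    }
    return v;
}
```

The extension is still there, but it is confined to one macro definition instead of being spread across every function in the header.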
The question therefore is more about the logic of such a choice: why does clang choose to trigger a warning when a static inline function defined in a header is not invoked? In which case is that a good idea?
And what does clang propose to cover the relatively common case of inlined functions defined in a header file, without requiring the use of a compiler extension?
edit :
After further investigation, it appears the question is incorrect.
The warning is triggered in the editor (VSCode) using a clang linter that applies a selected list of compilation flags (-Wall, etc.).
But when the source code is actually compiled with clang and with exactly the same list of flags, the "unused function" warning is not present.
So far, the results visible in the editor used to be exactly the ones found at compilation time. It's the first time I witness a difference.
So the problem seems related to the way the linter uses clang to produce its list of warnings. That's a much more complex and specific question.
Note the comment:
OK, sorry, this is actually different from expectation. It appears the warning is triggered in the editor using clang linter with selected compilation flags (-Wall, etc.). But when the source code is compiled with exactly the same flags, the "unused function" warning is actually not present. So far, the results visible in the editor used to be exactly the ones found at compilation time; it's the first time I witness a difference. So the problem seems related to the way the linter uses clang to produce its list of warnings. It seems to be a more complex question [than I realized].
I'm not sure you'll find any "why". I think this is a bug, possibly one that they don't care to fix. As you hint in your question, it does encourage really bad practice (annotation with compiler extensions where no annotation should be needed), and this should not be done; rather, the warning should just be turned off unless/until the bug is fixed.
If you haven't already, you should search their tracker for an existing bug report, and open one if none already exists.
Follow-up: I'm getting reports which I haven't verified that this behavior only happens for functions defined in source files directly, not from included header files. If that's true, it's nowhere near as bad, and probably something you can ignore.
'#ifdef USES_FUNCTION_XYZ'
One would have to configure the used inline functions before including the header.
Sounds like a hassle and looks clumsy.
When using gcc or clang, it's generally a good idea to enable a number of warnings,
When using any C compiler, it's a good idea to ensure that the warning level is turned up, and to pay attention to the resulting warnings. Much breakage, confusion, and wasted effort can be saved that way.
Now, -Wunused-function is useful to detect static functions which are
no longer invoked, meaning they are useless, and should therefore
preferably be removed from source code. When applying a "zero-warning"
policy, it's no longer "preferable", but downright compulsory.
Note well that
Such zero-warning policies, though well-intended, are a crutch. I have little regard for policies that substitute inflexible rules for human judgement.
Such zero-warning policies can be subverted in a variety of ways, with disabling certain warnings being high on the list. Just how useful are they really, then?
Policy is adopted by choice, as a means to an end. Maybe not your choice personally, but someone's. If existing policy is not adequately serving the intended objective, or is interfering with other objectives, then it should be re-evaluated (though that does not necessarily imply that it will be changed).
For performance reasons, some functions may be defined directly into header files *.h, so that they can be inlined at compile time (disregarding any kind of LTO magic).
That's a choice. More often than not, one affording little advantage.
Such functions are generally declared and defined as static inline. In the past, such functions would probably have been defined as macros instead, but it's considered better to make them static inline functions instead, whenever applicable (no funny type issue).
Considered by whom? There are reasons to prefer functions over macros, but there are also reasons to prefer macros in some cases. Not all such reasons are objective.
A unit including such a header file is
under no obligation to use all its declared symbols.
Correct.
Therefore, a
static inline function defined in a header file may reasonably not be
invoked.
Well, that's a matter of what one considers "reasonable". It's one thing to have reasons to want to do things that way, but whether those reasons outweigh those for not doing it that way is a judgement call. I wouldn't do that.
The question therefore is more about the logic of such a choice: why
does clang choose to trigger a warning when a static inline function
defined in a header is not invoked? In which case is that a good
idea?
If we accept that it is an intentional choice, one would presume that the Clang developers have a different opinion about how reasonable the practice you're advocating is. You should consider this a quality-of-implementation issue, there being no rules for whether compilers should emit diagnostics in such cases. If they have different ideas about what they should warn about than you do, then maybe a different compiler would be more suitable.
Moreover, it would be of little consequence if you did not also have a zero-warning policy, so multiple choices on your part are going into creating an issue for you.
And what does clang propose to cover the relatively common case of
inlined functions defined in a header file, without requiring the use
of a compiler extension?
I doubt that clang or its developers propose any particular course of action here. You seem to be taking the position that they are doing something wrong. They are not. They are doing something that is inconvenient for you, and that therefore you (understandably) dislike. You will surely find others who agree with you. But none of that puts any onus on Clang to have a fix.
With that said, you could try defining the functions in the header as extern inline instead of static inline. You are then obligated to provide one non-inline definition of each somewhere in the whole program, too, but those can otherwise be lexically identical to the inline definitions. I speculate that this may assuage Clang.
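For reference, under C99 rules this arrangement is usually written as a plain inline definition in the header plus a single extern inline declaration in one source file, which makes that file emit the one external definition. The sketch below shows both parts in a single listing; the function twice is hypothetical, and whether Clang's diagnostic changes as a result is not something verified here.

```c
/* Would live in the header: a C99 inline definition (hypothetical function). */
inline int twice(int x)
{
    return 2 * x;
}

/* Would live in exactly one .c file that includes the header: this
   declaration makes that translation unit provide the external definition. */
extern inline int twice(int x);
```

Without the extern declaration somewhere in the program, a call that the compiler chooses not to inline can fail at link time with an undefined reference.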

How to use the __attribute__ keyword in GCC C?

I am not clear on the use of the __attribute__ keyword in C. I have read the relevant GCC docs but I still don't understand it. Can someone help me understand?
__attribute__ is not part of C, but is an extension in GCC that is used to convey special information to the compiler. The syntax of __attribute__ was chosen to be something that the C preprocessor would accept and not alter (by default, anyway), so it looks a lot like a function call. It is not a function call, though.
Like much of the information that a compiler can learn about C code (by reading it), the compiler can make use of the information it learns through __attribute__ data in many different ways -- even using the same piece of data in multiple ways, sometimes.
The pure attribute tells the compiler that a function is actually a mathematical function -- using only its arguments and the rules of the language to arrive at its answer with no other side effects. Knowing this the compiler may be able to optimize better when calling a pure function, but it may also be used when compiling the pure function to warn you if the function does do something that makes it impure.
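A small sketch of how the pure attribute is attached to a function. The ATTR_PURE macro and count_char are hypothetical names; the macro exists only so the code still compiles on compilers without the GNU syntax.

```c
#include <stddef.h>

#if defined(__GNUC__)
#define ATTR_PURE __attribute__((pure))
#else
#define ATTR_PURE
#endif

/* Hypothetical example: the result depends only on the argument and the
   string it points to, and the function has no side effects. */
ATTR_PURE size_t count_char(const char *s, char c)
{
    size_t n = 0;
    while (*s != '\0') {
        if (*s == c) {
            ++n;
        }
        ++s;
    }
    return n;
}
```

With the attribute in place, the compiler is allowed to merge repeated calls such as count_char(s, 'a') + count_char(s, 'a') when nothing in between modifies the string.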
If you can keep in mind that (even though a few other compilers support them) attributes are a GCC extension and not part of C and their syntax does not fit into C in an elegant way (only enough to fool the preprocessor) then you should be able to understand them better.
You should try playing around with them. Take the ones that are more easily understood for functions and try them out. Do the same thing with data (it may help to look at the assembly output of GCC for this, but sizeof and checking the alignment will often help).
Think of it as a way to inject syntax into the source code, which is not standard C, but rather meant for consumption of the GCC compiler only. But, of course, you inject this syntax not for the fun of it, but rather to give the compiler additional information about the elements to which it is attached.
You may want to instruct the compiler to align a certain variable in memory at a certain alignment. Or you may want to declare a function deprecated so that the compiler will automatically generate a deprecated warning when others try to use it in their programs (useful in libraries). Or you may want to declare a symbol as a weak symbol, so that it will be linked in only as a last resort, if any other definitions are not found (useful in providing default definitions).
All of this (and more) can be achieved by attaching the right attributes to elements in your program. You can attach them to variables and functions.
Take a look at this whole bunch of other GCC extensions to C. The attribute mechanism is a part of these extensions.
There are too many attributes for there to be a single answer, but examples help.
For example __attribute__((aligned(16))) makes the compiler align that struct/variable on a 16-byte boundary.
__attribute__((noreturn)) tells the compiler this function never returns to its caller (e.g. standard functions like exit(int) )
__attribute__((always_inline)) makes the compiler inline that function even if it wouldn't normally choose to (using the inline keyword suggests to the compiler that you'd like it inlined, but it's free to ignore you - this attribute forces it).
Essentially they're mostly about telling the compiler you know better than it does, or for overriding default compiler behaviour on a function by function basis.
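The three attributes just mentioned can be sketched as follows, assuming GCC or a compatible compiler such as Clang. All type and function names are hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>

/* aligned(16): every object of this type starts on a 16-byte boundary. */
struct vec4 {
    float v[4];
} __attribute__((aligned(16)));

static struct vec4 sample_vec;

/* Returns nonzero when the sample object sits on a 16-byte boundary. */
int sample_is_aligned(void)
{
    return ((uintptr_t)&sample_vec % 16u) == 0u;
}

/* noreturn: the compiler may assume control never comes back from this call. */
__attribute__((noreturn)) void fatal_stop(void)
{
    abort();
}

/* always_inline: inline even when the optimizer would otherwise decline. */
static inline __attribute__((always_inline)) int add_one(int x)
{
    return x + 1;
}
```

Checking the address, as sample_is_aligned does, or inspecting the generated assembly is a quick way to confirm that an attribute had the intended effect.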
One of the best (but little known) features of GNU C is the attribute mechanism, which allows a developer to attach characteristics to function declarations to allow the compiler to perform more error checking. It was designed in a way to be compatible with non-GNU implementations, and we've been using this for years in highly portable code with very good results.
Note that __attribute__ is spelled with two underscores before and two after, and there are always two sets of parentheses surrounding the contents. There is a good reason for this - see below. GNU CC needs the -Wall compiler option to enable this (yes, there is a finer degree of warnings control available, but we are very big fans of max warnings anyway).
For more information please go to http://unixwiz.net/techtips/gnu-c-attributes.html
Lokesh Venkateshiah

debugging c programs

Programming, in a sense, is easy, but bugs always cause trouble. Can anyone help me with good debugging tricks and software for C?
From "The Elements of Programming Style" Brian Kernighan, 2nd edition, chapter 2:
Everyone knows that debugging is twice
as hard as writing a program in the
first place. So if you're as clever as
you can be when you write it, how will
you ever debug it?
So from that; don't be "too clever"!
But apart from that and the answers already given; use a debugger! That is your starting point tool-wise. You'd be amazed how many programmers struggle along without the aid of a debugger, and they are fools to do so.
But before you even get to the debugger, get your compiler to help you as much as possible; set the warning level to high, and set warnings as errors. A static analysis tool such as lint, pclint, or QA-C would be even better.
Tools for debugging are all well and good and for some classes of error they will just point you straight to the problem. The best tip that I have for debugging is that you need to think about it in the right way. What works for me is the following:
The compiler probably isn't broken. I've been working with C for 25 years now and in all that time it's almost invariably something I'm doing wrong.
Read the error messages. Often I've looked back at the error message and in hindsight realized it was telling me exactly what was wrong.
Read the documentation. Make sure you aren't making assumptions about the language or library that aren't true.
Make a mental model of the problem. I ask myself what needs to be happening in my code in order for the results I'm seeing to occur. Then add debug statements, assertions or just step through in the debugger (if you can) to see what is really happening.
Talk the problem through with someone else. Just describing it to a third party often results in a revelation about what might be happening.
Other people will have other ways of approaching debugging, but I find that if you take a structured approach rather than flailing around changing stuff at random, you usually get there. And when you do, be prepared for the inevitable "Why didn't I see that straight away!"
Best debugger for C
gdb
Best tools for memory leak checking:
Valgrind
The following are popular debugging tools.
Valgrind
Purify
Duma
Some very simple Tricks/Suggestions
-> Always check that nowhere in your code you have dereferenced a wild/dangling pointer
Example 1)
int main()
{
int *p;
*p=10; //Undefined Behaviour (crash on most implementations)
}
Example 2)
#include <stdio.h>
#include <stdlib.h>
int main()
{
int *p=malloc(sizeof(int));
//do something with p
free(p);
printf("%d", *p); //Undefined Behaviour (p is dangling after free; may crash or print garbage)
}
-> Always initialize variables before using
#include <stdio.h>
int main()
{
int k; // never initialized
for(int i= k;i<10;++i) // Ouch: reads the indeterminate value of k
printf("%d",i);
}
In addition to all the other suggestions (gdb, valgrind, all that), some simple rules when writing the code help a lot when debugging afterwards.
Always use types with the proper semantics. Unsigned types (best size_t) for array indices and numbers that represent a cardinal, ptrdiff_t for pointer differences, off_t for file offsets etc. enum types for tags and case distinctions. There is almost no need for the builtin types int, long, char or whatever. Avoid them whenever possible. In particular don't use char for arithmetic, the signedness problems with that are a plague. Use uint8_t or int8_t if you feel the need for such a thing.
Always initialize variables, all of them: integer, double, pointers, struct. It is not true that this is less efficient with a modern compiler. In most cases it will just be optimized away when not necessary. But especially pointer variables that are not properly initialized can produce spurious errors and make code hard to debug. If you have them initialized to NULL your program will fail early, and your debugger will show you the place.
Compile with all warnings on, and don't finish tidying your code until the compiler doesn't give a single warning. They are quite good at that nowadays, take advantage.
Compile with different optimization options on, or even better with different versions of your compiler, or still better with completely different compilers on different platforms.
Use the assert macro. This forces you to think of your assumptions and also makes your code fail early if they are not fulfilled.
Unit testing. Makes getting your software correct a lot easier.
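A minimal sketch of the assert, size_t and initialization advice above. The function name average is invented for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: mean of n values. The asserts document the
   assumptions and make a debug build stop at the point of violation. */
double average(const double *values, size_t n)
{
    assert(values != NULL);  /* caller must pass a real array */
    assert(n > 0);           /* rules out the division by zero below */

    double sum = 0.0;                /* initialized before use */
    for (size_t i = 0; i < n; ++i)   /* size_t for the array index */
        sum += values[i];
    return sum / (double)n;
}
```

Defining NDEBUG (as release builds often do) compiles the assert checks away, so they should only guard against programming errors, not against conditions that can legitimately occur at run time.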
gdb is a debugger to analyse your program.
Another technique is to use printf or logging.
Valgrind provides dynamic analysis of the executable
Purify provides static and dynamic analysis. Sparrow and Prevent are some other tools in competition to Purify.
This can be separated into:
Prevention measures:
Use strict coding styles, don't make a mess
Use comments and code revisions
Use static code analysis tools
Use assertions where it's possible
Don't over complicate
Post-factum
Use debugger/tracer
Use memory checking tools
Use regression testing
Use your brain
Off the top of my head, Valgrind.
You might also want to hone your debugging skills by reading the book Debugging by David Agans. Every programmer should read this early on in their career.
Use valgrind for memory problems if you're on Linux, along with gdb/ddd. On Windows, a lot of programmers don't seem to know about WinDbg. It is very useful but has a learning curve like gdb, and it is more powerful than the debugger built into Visual Studio. Learn to use assert: you will catch lots of stuff, and you can turn it off in release code if you so choose. Use a unit testing framework like Check, CUnit, etc. Always initialize your pointers, to NULL if nothing else. When you free a pointer, set it to NULL. Better for you to catch a segfault than your user. Pick a coding standard and stick to it; consistency will help you make fewer mistakes. Keep your functions small if at all possible; this will keep you from having braces nested 10 levels deep, which are logic nightmares. If compiling with gcc, use -Wall and -Wextra. Use the strn* functions instead of the str* functions; they are well worth the extra thinking they force you to do.
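The "set it to NULL after free" habit can be wrapped in a small helper. This is only a sketch, and free_and_null is a made-up name, not a library function.

```c
#include <stdlib.h>

/* Hypothetical helper: free *pp and reset it to NULL, so that a later
   accidental use is a NULL access (which typically faults right away)
   rather than a silent read of freed memory. */
void free_and_null(int **pp)
{
    if (pp != NULL) {
        free(*pp);   /* free(NULL) is a no-op, so a second call is harmless */
        *pp = NULL;
    }
}
```

A caller would write free_and_null(&p); instead of free(p);. The helper takes int ** only to keep the sketch short; real code would need one per pointer type or a macro.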

Large C macros. What's the benefit?

I've been working with a large codebase written primarily by programmers who no longer work at the company. One of the programmers apparently had a special place in his heart for very long macros. The only benefit I can see to using macros is being able to write functions that don't need to be passed in all their parameters (which is recommended against in a best practices guide I've read). Other than that I see no benefit over an inline function.
Some of the macros are so complicated I have a hard time imagining someone even writing them. I tried creating one in that spirit and it was a nightmare. Debugging is extremely difficult, as the debugger collapses N+ lines of code into one (e.g. there was a segfault somewhere in this large block of code. Good luck!). I had to actually pull the macro out and run it un-macro-ized to debug it. The only way I could see the person having written these is by automatically generating them out of code written in a function after he had debugged it (or by being smarter than me and writing it perfectly the first time, which is always possible I guess).
Am I missing something? Am I crazy? Are there debugging tricks I'm not aware of? Please fill me in. I would really like to hear from the macro-lovers in the audience. :)
To me the best use of macros is to compress code and reduce errors. The downside is obviously in debugging, so they have to be used with care.
I tend to think that if the resulting code isn't an order of magnitude smaller and less prone to errors (meaning the macros take care of some bookkeeping details) then it wasn't worth it.
In C++, many uses like this can be replaced with templates, but not all. A simple example of useful macros is the event handler macros of MFC -- without them, creating event tables would be much harder to get right and the code you'd have to write (and read) would be much more complex.
If the macros are extremely long, they probably make the code short but efficient. In effect, he might have used macros to explicitly inline code or remove decision points from the run-time code path.
It might be important to understand that, in the past, such optimizations weren't done by many compilers, and some things that we take for granted today, like fast function calls, weren't valid then.
To me, macros are evil. With so many side effects, and given that in C++ you can get the same performance gains with inline functions, they are not worth the risk.
For example, see this short macro:
#define max(a, b) ((a)>(b)?(a):(b))
then try this call:
max(i++, j++)
Another problem: say you have
#define PLANETS 8
#define SOCCER_MIDDLE_RIGHT 8
If an error is reported, it will refer to '8' rather than to either of its meaningful names.
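The double evaluation in the max(i++, j++) call above can be seen by putting the macro next to a function version. This is a sketch; MAX_MACRO and max_int are names chosen for the illustration.

```c
/* Same definition as the max macro above, renamed to avoid clashes. */
#define MAX_MACRO(a, b) ((a) > (b) ? (a) : (b))

/* Function version: each argument is evaluated exactly once,
   before the call. */
static inline int max_int(int a, int b)
{
    return a > b ? a : b;
}
```

With i = 5 and j = 3, MAX_MACRO(i++, j++) expands to ((i++) > (j++) ? (i++) : (j++)), so it yields 6 and leaves i at 7, whereas max_int(i++, j++) yields 5 and leaves i at 6.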
I only know of two reasons for doing what you describe.
First is to force functions to be inlined. This is pretty much pointless, since the inline keyword usually does the same thing, and function inlining is often a premature micro-optimization anyway.
Second is to simulate nested functions in C or C++. This is related to your "writing functions that don't need to be passed in all their parameters" but can actually be quite a bit more powerful than that. Walter Bright gives examples of where nested functions can be useful.
There are other reasons to use macros, such as using preprocessor-specific functionality (like including __FILE__ and __LINE__ in autogenerated error messages) or reducing boilerplate code in ways that functions and templates can't (the Boost.Preprocessor library excels here; see Boost.ScopeExit or this sample enum code for examples), but these reasons don't seem to apply for doing what you describe.
Very long macros will have performance drawbacks, like increased compiled binary size, and there are certainly other reasons for not using them.
For the most problematic macros, I would consider running the code through the preprocessor, and replacing the macro output with function calls (inline if possible) or straight lines of code. If the macros exist for compatibility with other architectures/OS's, you might be stuck though.
Part of the benefit is code replication without the eventual maintenance cost - that is, instead of copying code elsewhere you create a macro from it and only have to edit it once...
Of course, you could also just make a method to be called but that is sort of more work... I'm against much macro use myself, just trying to present a potential rationale.
There are a number of good reasons to write macros in C.
Some of the most important are creating configuration tables using X-macros, making function-like macros that can accept multiple parameter types as inputs, and converting tables from human readable/configurable/understandable values into computer-used values.
I can't really see a reason for people to write very long macros, except for the historic automatic function inlining.
I would say that when debugging complex macros, (when writing X macros etc) I tend to preprocess the source file and substitute the preprocessed file for the original.
This allows you to see the C code generated, and gives you real lines to work with in the debugger.
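For readers who have not met X-macros, here is a minimal sketch of the technique mentioned above. The table contents and the names COLOR_TABLE, color_names and color_name are invented for this illustration.

```c
/* Hypothetical table: one line per entry, X(identifier, text). */
#define COLOR_TABLE \
    X(COLOR_RED,   "red")   \
    X(COLOR_GREEN, "green") \
    X(COLOR_BLUE,  "blue")

/* First expansion: the enum constants, plus a trailing count. */
#define X(id, text) id,
enum color { COLOR_TABLE COLOR_COUNT };
#undef X

/* Second expansion: the matching name strings, in the same order. */
#define X(id, text) text,
static const char *const color_names[] = { COLOR_TABLE };
#undef X

/* Map an enum value to its name; returns "?" for out-of-range input. */
const char *color_name(enum color c)
{
    return ((unsigned)c < (unsigned)COLOR_COUNT) ? color_names[c] : "?";
}
```

Adding a colour means adding one line to COLOR_TABLE; the enum and the string array cannot drift out of sync. Running just the preprocessor (for example gcc -E) shows the generated enum and array as plain C.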
I don't use macros at all. Inline functions serve every useful purpose a macro can. Macros allow you to do very weird and counterintuitive things like splitting up identifiers (how does someone search for the identifier then?).
I have also worked on a product where a legacy programmer (who thankfully is long gone) also had a special love affair with macros. His 'custom' scripting language is the height of sloppiness. This was compounded by the fact that he wrote his C++ classes in C, meaning all class functions and variables were public. Anyways, he wrote almost everything in macros and variadic functions (another hideous monstrosity foisted on the world). So instead of writing a proper template class he would use a macro instead! He also resorted to macros to create factory classes, instead of normal code... His code is pretty much unmaintainable.
From what I have seen, macros can be used when they are small, are used declaratively, and don't contain moving parts like loops and other program flow expressions. It's OK if the macro is one or at most two lines long and it declares an instance of something. Something that won't break during runtime. Also, macros should not contain class definitions or function definitions. If the macro contains code that needs to be stepped into using a debugger, then the macro should be removed and replaced with something else.
They can also be useful for wrapping custom tracing/debugging functionality. For instance you want custom tracing in debug builds but not release builds.
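A sketch of such a tracing wrapper follows. It assumes release builds define NDEBUG (the same switch that disables assert); TRACE and checked_div are hypothetical names.

```c
#include <stdio.h>

/* Hypothetical TRACE macro: prints file and line in debug builds and
   compiles to nothing when NDEBUG is defined. Only a macro can pick up
   __FILE__ and __LINE__ at the call site. */
#ifndef NDEBUG
#  define TRACE(msg) fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__, (msg))
#else
#  define TRACE(msg) ((void)0)
#endif

/* Hypothetical user of the macro: returns 0 on success, -1 on error. */
int checked_div(int a, int b, int *out)
{
    if (b == 0) {
        TRACE("division by zero rejected");
        return -1;
    }
    *out = a / b;
    return 0;
}
```

The macro stays a single expression, which keeps it within the "small and declarative" limits described above and leaves nothing to step into in the debugger.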
Anyways, when you are working in legacy code like that, just be sure to remove the macro mess a bit at a time. If you keep it up, with enough time you will eventually remove them all and make life a bit easier for yourself. I have done this in the past with especially messy macros. What I do is turn on the compiler switch to have the preprocessor generate an output file. Then I raid that file, copy the code, re-indent it, and replace the macro with the generated code. Thank goodness for that compiler feature.
Some of the legacy code I've worked with used macros very extensively in the place of methods. The reasoning was that the computer/OS/runtime had an extremely small stack, so that stack overflows were a common problem. Using macros instead of methods meant that there were fewer methods on the stack.
Luckily, most of that code was obsolete, so it is (mostly) gone now.
C89 did not have inline functions. If using a compiler with extensions disabled (which is a desirable thing to do for several reasons), then the macro might be the only option.
Although C99 came out in 1999, there was resistance to it for a long time; commercial compiler vendors didn't feel it was worth their time to implement C99. Some (e.g. MS) still haven't. So for many companies it was not a viable practical decision to use C99 conforming mode, even up to today in the case of some compilers.
I have used C89 compilers that did have an extension for inline functions, but the extension was buggy (e.g. multiple definition errors when there should not be). Things like that may dissuade a programmer from using inline functions.
Another thing is that the macro version effectively forces the code to be inlined. The C99 inline keyword is only a compiler hint, and the compiler may still decide to generate a single instance of the function code which is linked like a non-inline function. (One compiler that I still use will do this if the function is not trivial and returning void).

Resources