Is it wise to use the `this` keyword in C? [closed] - c

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 7 years ago.
Basically, I have these inline functions in C:
struct array {
    unsigned long size;
    void* items;
};
typedef struct array* Array;
inline Array array_create(unsigned long initsize);
inline void array_free(Array this);
Am I free to use the this keyword in this kind of situation, or is it better to avoid it, and why (not)?
EDIT: This question originated from a bug in my code where I used inline void array_free(Array array); which changed the result of sizeof(array); and gave me the idea to use this instead of adapting to the (in my opinion ugly) sizeof(struct array);.
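A minimal sketch of one way to sidestep the sizeof confusion described in the edit: take the size from the pointer being initialised (sizeof *self) instead of from a type name. Inside a function whose parameter is named array, sizeof(array) is the size of that pointer parameter, not of the struct. The parameter name self and both function bodies are assumptions; the question shows only the declarations.

```c
#include <stdlib.h>

struct array {
    unsigned long size;
    void *items;
};
typedef struct array *Array;

/* Hypothetical body: allocates the struct itself, leaves items empty. */
Array array_create(unsigned long initsize)
{
    /* sizeof *self is the size of struct array, whatever the variable
     * is called, so no type name has to be repeated here. */
    Array self = malloc(sizeof *self);
    if (self == NULL) {
        return NULL;
    }
    self->size = initsize;
    self->items = NULL;
    return self;
}

/* Hypothetical body: `self` is an ordinary identifier in both C and C++. */
void array_free(Array self)
{
    if (self != NULL) {
        free(self->items);
        free(self);
    }
}
```

The same expression keeps working if the parameter is later renamed, which is not true of a sizeof applied to a name that happens to be shadowed.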

It's technically correct because C is not C++, so this is not a keyword in C.
Whether it's wise, now that's a different question. If there is any chance that this piece of code will ever be compiled as C++, then an identifier called this will break the code.

Using this in any fashion you want is totally valid C.
What you have to ask yourself (and this applies to any C++ reserved word, like class, final, etc.) is:
Do I use a program that highlights this as a keyword, conveying the wrong message? For example, Visual Studio highlights this even when you're in a .c file, so some confusion may arise.
Do I plan to promote my code to C++ in the future? In that case you'll have some extra work to do that could have been avoided.
Does my code interact with C++? This is not a problem per se, but bear in mind that your code will be read by C++ programmers as well, and you don't want to confuse people who may not know C in great detail (even though someone may say it's their duty to be aware of what they're doing when reading another language).
Since this is something that can be avoided I find using it immoral but not incorrect.
I'm not saying you have to study C++ prior to writing C, but once you know something, it's a good practice to make good use of it.
A subtle problem you may cause is making your code 'uncallable' from C++. For example, a
#define this some_definition
in a C header file that is later included from C++ may have weird effects on your code.
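As a rough illustration of the "interacts with C++" concern, here is a sketch of a header that stays usable from both languages. The file names, the function array_size and the parameter name self are hypothetical; the point is only that the declaration avoids C++ keywords and is wrapped in the usual extern "C" guard.

```c
/* array.h (hypothetical): declarations usable from both C and C++ */
#ifndef ARRAY_H
#define ARRAY_H

#ifdef __cplusplus
extern "C" {
#endif

struct array {
    unsigned long size;
    void *items;
};

/* `self` instead of `this`: an ordinary identifier in both languages. */
unsigned long array_size(const struct array *self);

#ifdef __cplusplus
}
#endif

#endif /* ARRAY_H */

/* array.c (hypothetical): the C definition */
unsigned long array_size(const struct array *self)
{
    return (self != 0) ? self->size : 0UL;
}
```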

The reasons not to use this in standard C is that it makes your code:
Slightly less readable, due to the "well-known" usage of this in C++. Your code can look as if it's C++ and an uninformed reader can easily get confused.
Unportable to C++. If you (or someone else) ever want to take this code and use it in a C++ compilation, it will fail. That's a downside and an upside at the same time, since getting an error can indicate that care must be taken, whereas not getting one might let important issues slip.

It depends what your objective is.
If your C code program will always be built with a C compiler that is not a C++ compiler, then it makes no difference whether you use this as an identifier in your code. It is not a reserved identifier, so you are free to use it.
The potential problem with that premise is that a number of mainstream C compilers are actually C++ compilers, or they support some C++ features as extensions, so they may reject your code (or - less likely - do something with it that you don't expect). It is not possible to predict with absolute certainty that the vendor of your compiler(s) of choice will never (even if they give you a promise in writing) release a future version of their C compiler that will reject or do something unexpected with your code. The likelihood of this happening is relatively low, but non-zero.
In the end you need to decide what risk you are willing to take with maintaining your code in future.
Of course, if you are a C fanatic who wants your code to have an incompatibility with C++ (yes, such people do exist) then using a number of keywords or reserved identifiers that are specific to C++, as well as using such keywords or identifiers that are specific to (more recent versions of) C may be a worthwhile approach.

As already mentioned, you can use any identifier that is not reserved as a variable name.
However, I suggest using something like 'pThis' or '[struct name]This' to express your intent: a C struct used together with functions that take a pointer to a [struct name] instance as their first argument, meant to be used in a similar manner to the member functions of a C++ class.
This way your code may be more readable and your intent more understandable by someone who is using it.

C does not have a this keyword; it exists only in C++, inside a class. Your code is C, so this is just an ordinary function parameter through which you access the array struct.

Yes, you can. If the code happens to get compiled as C++, it is not your fault. However, if that happens, other things won't be accepted either: for instance, you likely convert a void* (such as items or the result of malloc) to another pointer type without a cast, which C++ does not allow, unlike C.


Is there a standard-compliant way to detect whether a function in the C standard library is implemented via intrinsic/builtin?

I'm pretty confident I can implement code which performs better than the function provided by the standard library for a specific call site if only because of function call overhead. But if the function in question is implemented via intrinsic/builtin, there's no function call overhead to beat, so it would be foolish to try.
If there's a way, I have a feeling it won't be simple because it may vary by call site. For example, passing a constant length to memcpy may provide the compiler a great opportunity to generate inline code, but a variable length may provide a lesser opportunity. I guess the best hint available might be one of three values, "always", "never", or "sometimes". That would be good enough for me.
The details of how this might be accomplished are negotiable as long as they're standard-compliant. The version of the standard is even negotiable because that's testable and I'd be happy making the safest assumption if the question weren't answerable for an earlier version of the standard. But of course a way to do this at compile-time would be preferred.
(edited to include concrete details to make it easier to think about even though these details don't matter)
Let's assume memcpy is indeed the function in question and that we know the length is always variable because it was passed in to the function which calls memcpy, but we also know that length is frequently 1.
The overhead of calling into a library will surely dominate both if (1==length) and *dst = *src;. So the questions are how frequently 1 is actually the value, which is a question only I can answer, and whether any possibility the implementation will call into a library can be eliminated.
This question isn't about whether one can write a function which goes faster than memcpy or any other standard library function. There are plenty of questions on that and this isn't one of them.
It seems the closest we'll get to a simple YES or NO answer is this comment from Nate Eldredge: "The C standard doesn't even have the concept of 'intrinsic / builtin'".
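For concreteness, the special case discussed in the question might be sketched as below. copy_bytes is a hypothetical name, and nothing here detects whether memcpy is expanded inline; per the comment quoted above, the standard offers no way to ask. Whether the extra branch pays off can only be settled by measuring on the target compiler.

```c
#include <stddef.h>
#include <string.h>

/* Copies `length` bytes, handling the frequent length == 1 case with a
 * plain assignment and falling back to memcpy for everything else. */
static inline void copy_bytes(unsigned char *dst,
                              const unsigned char *src,
                              size_t length)
{
    if (length == 1) {
        *dst = *src;
    } else {
        memcpy(dst, src, length);
    }
}
```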

Negative stigma with C macros [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 6 years ago.
As I dive deeper into the C programming language, I am having trouble understanding why macros should be used as a last resort. For instance, this post. This is not the first time I've heard chatter that they are a last resort. Some suggest that the memory footprint is more excessive than calling a function (this post). Although I understand these arguments, as well as why they should not be used for C++ (compiler optimizations and the like), I do not understand the following:
Since macros 'unroll' (if you will) in the .text segment of the stack there is less overhead associated with macros as opposed to function calls - e.g. memory need not be allocated between the frame pointer and stack pointer. This memory overhead is quantifiable, whereas this post suggests that macros are not.
Much of the work I do is in embedded systems, micro-controllers, and systems programming. I have read many books and articles by Bjarne Stroustrup. I am also in the process of reading Clean Code - where Robert Martin insists that readability is king (not implying macros increase readability in their own right).
TLDR:
Considering macros reduce the overhead associated with stack frames and (if used appropriately) can increase readability - Why the negative stigma? They are littered through BSD papers and man pages.
Some will vote to close because this could be construed as an opinion question, but there are some facts.
If you have a decent compiler, calls to simple functions are expanded inline just like macros, but they're considerably more type safe. I've seen cases where inlined functions were faster because the compiler missed common sub-expressions in the compiled macro that were eliminated when function arguments were inlined.
When you call a simple function multiple times, you can give the compiler hints on whether to expand it inline each time or call it -- trading code space for tiny stack and runtime overhead you mention. Better yet, lots of compilers (like gcc) will make this call automatically based on heuristics that may be better than your intuition. With a macro, you can only modify the macro to call a function. Code changes are more error prone than build hints.
In most compilation systems, error messages and debuggers don't reference the body code of macros. OTOH, modern debuggers will step correctly line-by-line through the body of a function even though it's actually been expanded inline. Similarly, compilers will correctly point to error locations in function bodies.
It's extremely easy to code subtle bugs by expanding multiple arg references: #define MAX(X,Y) (X>Y?X:Y) followed by MAX(++x, 3). Here x is incremented once if it's less than 3, twice otherwise. Ouch.
Multi-level macro expansion (where a macro call produces other macro calls) relies on complex rules and is therefore error-prone. E.g. it's not hard to create macros that depend on specific behavior of one preprocessor so that they fail on another.
Functions can be recursive. Macros can't.
Newer C features (C99 and C11) provide ways to do things where macros have been (ab)used in the past: compound literals, for example. Again, the advantages are type safety, error messages, and debugger references.
The upshot is that when you use a macro, you're adding error risk and maintenance cost that don't exist with inlined functions. It's hard to escape the conclusion that you should use macros only as a fallback. The main defensible reason to do otherwise is that you're forced to use an old or junky compiler.
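The multiple-evaluation bug from the list above can be reproduced with a short sketch. MAX_MACRO and max_int are hypothetical names; the macro is fully parenthesised here so that the only remaining problem is the repeated argument.

```c
/* Each macro argument is pasted in textually, so an argument with a
 * side effect may be evaluated more than once. */
#define MAX_MACRO(X, Y) ((X) > (Y) ? (X) : (Y))

/* A function evaluates each argument exactly once before the call. */
static inline int max_int(int a, int b)
{
    return (a > b) ? a : b;
}
```

With x starting at 5, MAX_MACRO(++x, 3) expands to ((++x) > (3) ? (++x) : (3)): the comparison increments x to 6, and the selected branch increments it again, leaving x at 7. max_int(++x, 3) increments x once.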
Macros have no type safety, and pure textual replacement has side effects. These are the main reasons to avoid them.
Macros can do a lot - quite a lot - but as is well-known it's easy to accidentally build macros that do the wrong thing, either by mishandling side-effects, accidentally evaluating arguments multiple times, or messing up operator precedence. I think the right approach is to weigh the benefits of macros against the potential risks or drawbacks, then make the call from there.
For example, let's take function-like macros. As you mention, they're often used to make code faster by eliminating short function calls. But nowadays, you can achieve the same thing either by using the inline keyword (adopted from C++ in C99) or just by cranking up compiler optimization settings, since compilers these days are dramatically better at optimization than they were many years back. If you're using those macros because you want to perform operations like taking a min or max that have basically identical code but different types, use the _Generic keyword (C11). These other options are less error-prone and easier to debug and test, so it's probably worth avoiding the risk of a macro error by using them.
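A minimal sketch of the _Generic approach, assuming a C11 compiler. The names max_i, max_d and max_of are hypothetical, and only int and double are covered; any other operand type fails to compile, which is arguably the point.

```c
static inline int max_i(int a, int b)
{
    return (a > b) ? a : b;
}

static inline double max_d(double a, double b)
{
    return (a > b) ? a : b;
}

/* The controlling expression (a) + (b) is never evaluated; only its
 * type, after the usual arithmetic conversions, selects the function.
 * Each argument is then evaluated exactly once by the call. */
#define max_of(a, b) \
    _Generic((a) + (b), int: max_i, double: max_d)((a), (b))
```

The macro here only dispatches; the actual comparison lives in ordinary functions that the debugger and the type checker can see.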
Then there's defining constants with #define. This fell out of favor in C++ in favor of const variables, and that's also now possible in C using static const variables at global scope. You can use macros to do this, but it's more type-safe to use the other option instead. Most compilers are smart enough to inline the constants and do optimizations on the values in ways that previously only macros would guarantee.
For these more routine operations, the benefits of using macros aren't as high as they used to be because new language features and vastly smarter compilers have provided less-risky alternatives. That's the main reason why in general the advice is to avoid using macros - they're just not the best tool for the job.
This doesn't mean to never use macros. There are many times and places where they're fantastic. X Macros are a really neat way to automatically generate code, and macro substitution is super helpful for taking advantage of compiler-specific or OS-specific features while maintaining portability. I don't foresee those uses going away any time soon. But do consider the alternatives in other cases, since in many instances they were specifically invented to address weaknesses of the macro preprocessing system!
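For readers who have not met X Macros, here is a small sketch of the technique. All names (COLOR_LIST, color_names, color_name) are hypothetical. One list is expanded twice, once into enumerators and once into matching strings, so the two cannot drift apart.

```c
/* The single source of truth: one X(...) entry per colour. */
#define COLOR_LIST(X) \
    X(RED)            \
    X(GREEN)          \
    X(BLUE)

/* First expansion: COLOR_RED, COLOR_GREEN, COLOR_BLUE, then a count. */
enum color {
#define AS_ENUM(name) COLOR_##name,
    COLOR_LIST(AS_ENUM)
#undef AS_ENUM
    COLOR_COUNT
};

/* Second expansion: "RED", "GREEN", "BLUE" in the same order. */
static const char *const color_names[] = {
#define AS_STRING(name) #name,
    COLOR_LIST(AS_STRING)
#undef AS_STRING
};

/* Looks up the name for a colour, or "?" if the value is out of range. */
static const char *color_name(enum color c)
{
    return ((unsigned)c < (unsigned)COLOR_COUNT) ? color_names[c] : "?";
}
```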
A good optimizing compiler should give efficient code for calls to inline functions, as efficient as if you used macros (think of getc as an example).
But you may want to use macros when they are not replaceable by inline functions. (Here is an example).

Emulating lambdas in C?

I should mention that I'm generating code in C, as opposed to doing this manually. I say this because it doesn't matter too much if there's a lot of code behind it, because the compiler should manage it all. Anyway, how would I go about emulating a lambda in C? I was thinking I could just generate a function with some random name somewhere in the source code and then call that? I'm not too sure. I haven't really tried anything just yet, since I wanted to get the idea down before I implement it.
Is there some kind of preprocessor directive I can do, or some macro that will make this cleaner to do? I've been inspired by Jon Blow to try out compiler development, and he seemed to implement Lambdas in his language Jai. However, I think he does something where he generates bytecode, and then into C? I'm not sure.
Edit:
I'm working on a compiler, the compiler is just a project of mine to keep me busy, plus I wanted to learn more about compilers. I primarily use clang, I'm on Ubuntu 14.10. I don't have any garbage collection, but I wanted to try my hand at some kind of smart pointer-y/rust/ARC inspired memory model for garbage collection, i.e. little to no overhead. I chose C because I wanted to dabble in it more. My project is free software, just a hobby project.
There are several ways of doing it ("having" lambdas in C). The important thing to understand is that lambdas give closures and that closures are mixing "code" with "data" (the closed values); notice that objects are also mixing "code" with "data" and there is a similarity between objects and closures. See also this answer on Programmers.
Traditionally, in C, you not only use function pointers, but you also adopt a convention regarding callbacks. This is the case with GTK, for instance: every time you pass a function pointer, you also pass some data with it. You can view callbacks (the convention of passing a C function pointer together with some void *data) as a way to implement closures.
Since you generate C code (which is a wise idea; I'm doing similar things in MELT, which, on Linux, generates C++ code at runtime, compiles it into a shared object, and dlopen-s that), you could adopt a callback convention and pass some closed values to every function that you generate.
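The callback convention described above can be sketched as a function pointer paired with a void * environment. The names unary_fn, struct closure, add_n and call_closure are hypothetical; a code generator would emit one such function per lambda, plus a struct holding its captured values.

```c
/* Generated code receives its closed-over values through `env`. */
typedef int (*unary_fn)(void *env, int arg);

/* A closure is just "code" plus "data". */
struct closure {
    unary_fn code;
    void *env;
};

/* What a generator might emit for a lambda like: x => x + n,
 * where n is the single captured variable. */
static int add_n(void *env, int arg)
{
    const int *n = env;
    return arg + *n;
}

/* Calling a closure always passes the environment as first argument. */
static int call_closure(struct closure c, int arg)
{
    return c.code(c.env, arg);
}
```

In this sketch the environment merely points at a caller-owned int, so the closure must not outlive it; that lifetime question is where the memory-management strategy comes in.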
You might also consider closed values as static variables, but this approach is generally unwise.
In the past there was a lambda.h header library which generated machine-specific trampoline code for closures (essentially generating code which pushes some closed values as arguments and then calls some routine). You might use some JIT compilation techniques (using libjit, GNU lightning, LLVM, asmjit, ...) to do the same. See also libffi to call an arbitrary function (of signature known at runtime only).
Notice that there is a strong, but indirect, relation between closures and garbage collection (read the GC handbook for more), and it is no accident that every functional language has a GC. C++11 lambda functions are an exception to this (and it is difficult to understand all the intricacies of memory management of C++11 closures). So if you are generating C code, you could and probably should use Boehm's conservative garbage collector (which wraps dlopen), and you would have GC-ed closure values. (You could use some other GC libraries, e.g. Ravenbrook's MPS or my unmaintained Qish...) Then you could have the convention that every generated C function takes its closure as first argument.
I would suggest reading Scott's book Programming Language Pragmatics and (assuming you know a tiny bit of Scheme or Lisp; if you don't, you should learn a bit of Scheme and read SICP) Queinnec's book Lisp In Small Pieces (if you happen to read French, read the latest French variant).

Make tolower as static

I need to make the standard library function tolower static instead of "public" scope.
I am compiling with MISRA C:2004, using IAR Embedded Workbench compiler. The compiler is declaring tolower as inline:
inline
int tolower(int _C)
{
    return isupper(_C) ? (_C + ('A' - 'a')) : _C;
}
I am getting the following error from the compiler:
Error[Li013]: the symbol "tolower" in RS232_Server.o is public but
is only needed by code in the same module - all declarations at
file scope should be static where possible (MISRA C 2004 Rule 8.10)
Here are my suggested solutions:
Use tolower in another module, in a dummy circumstance, so that it is needed by more than one module.
Implement the functionality without using tolower. This is an embedded system.
Add a "STATIC" macro, defined as empty by default, but can be defined to static before the ctype.h header file is included.
I'm looking for a solution to the MISRA linker error. I would prefer to make the tolower function static only for the RS232_Server translation unit (If I make tolower static in the standard header file, it may affect other future projects.)
Edit 1:
Compiler is IAR Embedded Workbench 6.30 for ARM processor.
I'm using an ARM7TDMI processor in 32-bit mode (not Thumb mode).
The tolower function is used with a debug port.
Edit 2:
I'm also getting the error for _LocaleC_isupper and _LocaleC_tolower
Solution:
I notified the vendor of the issue, as recommended by Michael Burr.
I decided not to rewrite the library routine because of localization issues.
I implemented a function pointer in the main.c file as suggested by gbulmer; however, this will be heavily commented because it should be removed after IAR resolves their issue.
I'd suggest that you disable this particular MISRA check (you should be able to do that just for the RS232_Server translation unit) rather than use one of the workarounds you suggest. In my opinion the utility of rule 8.10 is pretty minimal, and jumping through the kinds of hoops in the suggested workarounds seems more likely to introduce a higher risk than just disabling the rule. Keep in mind that the point of MISRA is to make C code less likely to have bugs.
Note that MISRA recognizes that "in some instances it may be necessary to deviate from the rules" and there is a documented "Deviation procedure" (section 4.3.2 in MISRA-C 2004).
If you won't or can't disable the rule for whatever reason, in my opinion you should probably just reimplement tolower()'s functionality in your own function, especially if you don't have to deal with locale support. It may also be worthwhile to open a support incident with IAR. You can prod them with rule 3.6, which says that "All libraries used in production code shall be written to comply with [MISRA-C]".
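If reimplementing is the route taken, a locale-free version could be as small as the sketch below. ascii_tolower is a hypothetical name; the code assumes an ASCII-compatible execution character set and has not been run through a MISRA checker.

```c
/* ASCII-only lower-casing with no locale support.
 * Assumes 'A'..'Z' and 'a'..'z' are contiguous ranges, which holds for
 * ASCII but is not guaranteed by the C standard. Being static, the
 * function is private to the translation unit that defines it. */
static int ascii_tolower(int c)
{
    if ((c >= 'A') && (c <= 'Z'))
    {
        c += 'a' - 'A';
    }
    return c;
}
```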
Who sells the MISRA linker? It seems to have an insane bug.
Can you work around it by taking its address with int (*foo)(int) = tolower; at file scope?
Edit: My rationale is:
a. that is the stupidest thing I've seen this decade, so it may be a bug, but
b. pushing it towards an edge case (a symbol having its name exported via a global) might shut it up.
For that to be correct behaviour, i.e. not a bug, it would have to be a MISRA error to include any library function once, like initialise_system_timer, initialise_watchdog_timer, ... , which just hurts my head.
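Spelled out, the address-taking experiment might look like the sketch below. tolower_ref and to_lower_via_ref are hypothetical names, and whether this actually silences the Li013 diagnostic on the IAR toolchain is untested here.

```c
#include <ctype.h>

/* File-scope pointer initialised with the address of the library
 * function, so that tolower is referenced as a real, addressable
 * function and not only through an inline expansion. */
int (*const tolower_ref)(int) = tolower;

/* Call through the pointer instead of calling tolower directly. */
static int to_lower_via_ref(int c)
{
    return tolower_ref(c);
}
```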
Edit: Another thought. Again based on an assumption that this is an edge-case error.
Theory: Maybe the compiler is doing both the inline, and generating an implementation of the function. Then the function is (of course) not being called. So the linker rules are seeing that un-called function.
GNU has options to prevent in-lining.
Can you do the same for that use of tolower? Does that change the error? You could do a test by
#define inline /* nothing */
before including ctype.h, then undef the macro.
The other test is to define:
#define inline static inline
before including ctype.h, which is the version of inline I expect.
EDIT2:
I think there is a bug which should be reported. I would imagine IAR have a workaround. I'd take their advice.
After a night's sleep, I strongly suspect the problem is inline int tolower() rather than static inline int tolower or int tolower, because that makes the most sense. Having a public function which is not called does seem to be a symptom of a bug in a program.
Even with documentation, all the coding approaches have downsides.
I strongly support the OP, I would not change a standard header file. I think there are several reasons. For example, a future upgrade of the tool chain (which comes with a new set of headers) breaks an old app if it ever gets maintained. Or simply building the application on a different machine gives an error, merging two apparently correct applications may give an error, ... . Nothing good is likely to come of that. -100
Use #define ... to make the error go away. I do not like using macros to change standard libraries. This seems like the seeds of a long-term bad idea. If in future another part of the program uses another function with a similar problem, the application of the 'fix' gets worse. Some inexperienced developer might get the job of maintaining the code, and 'learns' wrapping strange pieces of #define trickery around #include is 'normal' practice. It becomes company 'practice' to wrap #include <ctype.h> in weird macro workarounds, which remains years after it was fixed. -20
Use a command line compiler option to switch off inlining. If this works, and the semantics are correct, i.e. including the header and using the function in two source files does not lead to a multiple defined function, then okay. I would expect it leads to an error, but it is worth confirming as part of the bug report. It lays a frustrating trap for someone else, who comes along in the future. If a different inline standard library function is used, and for some reason a person has to look at the generated code it won't be included either. They might go a bit crazy wondering why inline is not honoured. I conjecture the reason they are looking at generated code is because performance is critical. In my experience, people would spend a lot of time looking at code, baffled before looking at the build script for a program which works. If suppressing inline is used as a fix, I do feel it is better to do it everywhere than for one file. At least the semantics are consistent and the high-level comment or documentation might get noticed. Anyone using the build scripts as a 'quick check' will get consistent behaviour which might cause them to look at the documentation. -1 (no-inline everywhere) -3 (no-inline on one file)
Use tolower in a bogus way in a second source file. This has the small benefit that the build system is not 'infected' with extra compiler options. Also, it is a small file which gives the opportunity to explain the error being worked around. I do not like this much, but I like it more than fiddling with standard headers. My current concern is that it might not work: it might include two copies which the linker can't resolve. I do like it better than the weird 'take its address and see if the linker shuts up' (which I do think is an interesting way to test the edge cases). -2
Code your own tolower. I don't understand the wider context of the application. My reaction is not to code replacements for library functions because I am concerned about testing (and the unit tests which introduce even more code) and long term maintenance. I am even more nervous with character I/O because the application might need to become capable of handling a wider character set, for example UTF-8 encoding, and a user-defined tolower will tend to get out of synch. It does sound like a very specific application (debugging to a port), so I could cope. I don't like writing extra code to work around things which look like bugs, especially if it is commercial software. -5
Can you convert the error to a warning without losing all of the other checking? I still feel it is a bug in the toolchain, so I'd prefer it to be a warning, rather than silence so that there is a hook for the incident and some documentation, and there is less chance of another error creeping in. Otherwise go with switching off the error. +1 (Warning) 0 (Error)
Switching the error off seems to lose the 'corporate awareness' that (IMHO) IAR owes you an explanation, and longer term fix, but it seems much better than messing with the build system, writing macro-nastiness to futz with standard libraries, or writing your own code which increases your costs.
It may just be me, but I dislike writing code to work around a deficiency in a commercial product. It feels like this is where the vendor should be pleased to have the opportunity to justify its license cost. I remember Microsoft charged us for incidents, but if the problem was proven to be theirs, the incident and fix were free. The right thing to do seems to be give them a chance to earn the money. Products will have bugs, so silently working around it without giving them a chance to fix it seems less helpful too.
First of all, MISRA-C:2004 allows neither inline nor C99. The upcoming MISRA 2012 will allow it. If you try to run C99 or non-standard code through a MISRA-C:2004 static analyser, all bets are off. The MISRA checker should give you an error for the inline keyword.
I believe a MISRA-C compliant version of the code would look like:
static uint8_t tolower(uint8_t ch)
{
    /* subtract 'A' first, then add 'a', so upper case maps to lower case */
    return (uint8_t)(isupper(ch) ? (uint8_t)((uint8_t)(ch - 'A') + 'a') : ch);
}
Some comments on this: MISRA encourages char type for character literals, but at the same time warns against using the plain char type, as it has implementation-defined signedness. Therefore I use uint8_t instead. I believe it is plain dumb to assume that there exist ASCII tables with negative indices.
(_C + ('A' - 'a')) is most certainly not MISRA-compliant: as MISRA regards it, the expression contains two implicit type promotions. MISRA regards character literals as char, rather than as int as the C standard does.
Also, you have to typecast to underlying type after each expression. And because the ?: operator contains implicit type promotions, you must typecast the result of it to underlying type as well.
Since this turned out to be quite an unreadable mess, the best idea is to forget all about ?: entirely and rewrite the function. At the same time we can get rid of the unnecessary reliance on signed calculations.
static uint8_t tolower (uint8_t ch)
{
    if(isupper(ch))
    {
        ch = (uint8_t)(ch - 'A');
        ch = (uint8_t)(ch + 'a');
    }
    return ch;
}

Tips/resources for structuring C code? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 4 years ago.
Does anyone have tips/resources for how to, in the best way, structure your C code projects? (Different folders etc.) And how do you know when it's good to split code into separate files? And what is an example of a good Makefile?
My project is not that big, but I want to start structuring my code at an early stage.
Structuring code needs some experience but mostly common sense.
For splitting code, you usually go for readability: conceptually coherent functions/datatypes should go in the same file. You can take the C standard library as a good example. It is better to keep your data structure definitions and function declarations in separate headers. This allows you to use the data structures as part of a compilation unit even if you have not defined all the functions.
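As a rough sketch of that split, the three pieces below would normally live in three files; they are shown in one listing only so the example compiles as a single unit. The file names, struct point and point_sum are hypothetical.

```c
/* ---- point_types.h (hypothetical): data structure definitions only ---- */
#ifndef POINT_TYPES_H
#define POINT_TYPES_H

struct point {
    int x;
    int y;
};

#endif /* POINT_TYPES_H */

/* ---- point_ops.h (hypothetical): function declarations only ---- */
#ifndef POINT_OPS_H
#define POINT_OPS_H

/* In a real project this file would begin by including point_types.h. */
int point_sum(const struct point *p);

#endif /* POINT_OPS_H */

/* ---- point_ops.c (hypothetical): the definitions ---- */
int point_sum(const struct point *p)
{
    return p->x + p->y;
}
```

Code that only stores or passes struct point around needs the first header alone and does not have to be rebuilt when a function is added to the second.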
Files that provide similar functionality should go in the same directory. It is good to avoid deep directory structure (1 level deep is best) as that complicates building the project unnecessarily.
I think Makefiles are OK for small projects, but become unwieldy for bigger ones. For really serious work (if you want to distribute your code, create an installer etc) you may want to look at cmake, scons, etc.
Have a look at the GNU coding standards: http://www.gnu.org/prep/standards/standards.html
Look at the gnu make manual for a simple example Makefile. You can also pick up any opensource project and look at the Makefile. Browsing code repositories in sourceforge.net may be useful.
Read one of the many C coding standards available on the internet and follow one that looks reasonable for your requirements. A few links:
GNU Coding Standards
C Coding Standards at IRAM (pdf)
Indian Hill C Style and Coding Standards
The following books also contain effective guidelines on writing good C code:
The C Programming Language
The Practice of Programming
The Elements of Programming Style
This is sometimes overlooked, but security is an issue in big projects. Here's some advice about how to program securely.
Here is an idiom I like:
Declare structs in a header so that their size is known by client code. Then declare init and deinit functions to the following convention:
The first parameter is a struct foo*.
The return type is a struct foo*.
If they might fail, the last parameter is either int* (simplest), enum foo_error* (if there are several ways it can fail that the calling code might care about) or GError** (if you're writing GLib-style code).
foo_init() and foo_deinit() return NULL if the first parameter is NULL. Otherwise, they return the first parameter.
Why do it this way? Calling code doesn't have to allocate heap space for the structure, it can go on the stack. If you are allocating it on the heap, though, the following works nicely:
struct foo* a_foo = foo_init(malloc(sizeof(*a_foo)));
if (a_foo == NULL) {
    /* Ruh-oh, allocation failure... */
}
free(foo_deinit(a_foo));
Everything works nicely even if a_foo == NULL when foo_deinit is called.
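The snippet above leaves foo_init and foo_deinit undefined, so here is one possible shape for them, following the stated convention without the optional error parameter. The struct contents are hypothetical; a real foo would acquire and release its own resources where the comments indicate.

```c
#include <stddef.h>

/* Hypothetical struct; the idiom only needs its size to be visible. */
struct foo {
    int value;
};

/* Initialises *self and returns it. NULL is passed straight through,
 * so foo_init(malloc(...)) propagates an allocation failure. */
struct foo *foo_init(struct foo *self)
{
    if (self == NULL) {
        return NULL;
    }
    self->value = 0;   /* acquire resources here */
    return self;
}

/* Releases what foo_init acquired (nothing in this sketch) and returns
 * self, so free(foo_deinit(p)) is safe even when p is NULL. */
struct foo *foo_deinit(struct foo *self)
{
    if (self == NULL) {
        return NULL;
    }
    self->value = 0;   /* release resources here */
    return self;
}
```

The same pair works for a struct on the stack: declare struct foo f; then call foo_init(&f) and later foo_deinit(&f), with no free involved.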
