I have to return nul in a function but I'm not allowed to include any library. I tried to find out how NULL is defined and followed it to sys/_types/_null.h, only to find that NULL is actually __DARWIN_NULL. Great! Now I have no idea where to search in order to find the __DARWIN_NULL definition...
I have to return nul in a function but I'm not allowed to include any library.
The solution to this problem is far simpler than you're making it; you've actually asked a form of XY problem.
You won't find the NUL character defined in any standard library; the best way to return that will be using the constant '\0' or 0.
If your professor is teaching you to avoid using <stddef.h> to find NULL then he/she has set a silly exercise which involves using something other than the most appropriate tool for the job, a tool which is guaranteed by the standard to exist, by the way... I would be raising this as a concern.
Nonetheless, sometimes professors don't care and will teach you to do stupid things anyway. NULL is defined as an implementation-defined null pointer constant, usually 0 or a conversion of 0 to void * like so: ((void *) 0). That may not be the implementation-defined value your NULL resolves to; in that case, adjust to suit :)
You could add a preprocessor definition such as #define NULL ((void *) 0) and then you would be able to return NULL; from your functions. Ta-da! Stupid exercises deserve stupid solutions.
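As a minimal sketch of that workaround, the following compiles without including any header. The helper find_char is a hypothetical name invented for illustration, and the #ifndef guard is only there so the definition stays harmless if a header has already provided NULL.

```c
/* No headers included: NULL is defined by hand, as suggested above. */
#ifndef NULL
#define NULL ((void *) 0)
#endif

/* Hypothetical helper: returns a pointer to the first occurrence of c in s,
   or a null pointer if c does not occur before the terminating NUL ('\0'). */
char *find_char(char *s, char c)
{
    for (; *s != '\0'; ++s) {
        if (*s == c) {
            return s;
        }
    }
    return NULL;
}
```

Note the two different things in play here: '\0' is the NUL character that terminates the string, while NULL is the null pointer constant that is returned.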
If your professor says you're not allowed to use #define, either, I would be tempted to ask him what you are allowed to use. Changing requirements on the fly is not fair. Most compilers will allow you to set preprocessor constants using a command line argument, for example cc -DNULL='((void *) 0)' .... This is useful for exposing compile-time configuration options, but again, using this to define NULL is dumb.
The question in your title is different to the rest of your post. __DARWIN_NULL could also be defined using either of the above, providing it isn't already defined, but I really don't think that's required to answer your actual question.
I'm trying to fix the compliance of my code to misra C. During the static analysis, I had this violation:
Rule 12.1: Extra parentheses recommended. A conditional operation is
the operand of another conditional operator.
The code is:
if (CHANNEL_STATE_GET(hPer, channel) != CHANNEL_STATE_READY)
{
retCode = ERROR;
}
where CHANNEL_STATE_GET is a macro as follow:
#define CHANNEL_STATE_GET(__HANDLE__, __CHANNEL__)\
(((__CHANNEL__) == CHANNEL_1) ? (__HANDLE__)->ChannelState[0] :\
((__CHANNEL__) == CHANNEL_2) ? (__HANDLE__)->ChannelState[1] :\
((__CHANNEL__) == CHANNEL_3) ? (__HANDLE__)->ChannelState[2] :\
((__CHANNEL__) == CHANNEL_4) ? (__HANDLE__)->ChannelState[3] :\
((__CHANNEL__) == CHANNEL_5) ? (__HANDLE__)->ChannelState[4] :\
(__HANDLE__)->ChannelState[5])
Do you have any idea to solve this violation?
BR,
Vincenzo
There are several concerns here, as far as MISRA C is concerned:
There are various rules saying that macros and complex expressions should be surrounded by parentheses, and that code shouldn't rely on the C programmer knowing every single operator precedence rule. You can solve that by throwing more parentheses at the expression, but that's just the tip of the iceberg.
The ?: operator is considered a "composite operator" and so expressions containing it are considered "composite expressions" and come with a bunch of extra rules: 10.6, 10.7 and 10.8. This means there are a lot of rules regarding when and how this macro may be mixed with other expressions; the main concerns are implicit, accidental type conversions.
The use of function-like macros should be avoided in the first place.
Identifiers beginning with two underscores (or an underscore followed by an uppercase letter) are reserved for the implementation, so user code shouldn't declare them (C17 7.1.3).
The easier and recommended fix is just to forget about that macro, since it will just cause a massive MISRA compliance headache. Also at a glance, it looks like very inefficient code with nested branches. My suggested fix:
In case hPer happens to be a pointer to pointer (seems like it), then dereference it and store the result in a plain, temporary pointer variable. Don't drag the nasty pointer to pointer syntax around across the whole function/macro.
Replace this whole macro with a (inline) function or a plain array table look-up, depending on how well you've sanitized the channel index.
Ensure that CHANNEL_1 to CHANNEL_5 are adjacent integers from 0 to 4. If they aren't, use some other constant or look-up in between.
A MISRA compliant re-design might look like this:
typedef enum
{
CHANNEL_1,
CHANNEL_2,
CHANNEL_3,
CHANNEL_4,
CHANNEL_5
} channel_t;
// state_t is assumed to be an enum too
state_t CHANNEL_STATE_GET (const HANDLE* handle, channel_t channel)
{
if((uint32_t)channel > (uint32_t)CHANNEL_5)
{
/* error handling here */
}
uint32_t index = (uint32_t)channel;
return handle->ChannelState[index]; /* index the state array, not the handle itself */
}
...
if (CHANNEL_STATE_GET(*hPer, channel) != CHANNEL_STATE_READY)
If you can trust the value of channel then you don't even need the function, just do a table look-up. Also note that MISRA C encourages "handle" in this case to be an opaque type, but that's a chapter of its own.
Note that this code is also assuming that HANDLE isn't a pointer hidden behind a typedef as in Windows API etc - if so then that needs to be fixed as well.
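The plain table look-up mentioned above might look like the sketch below. The types are hypothetical stand-ins for the ones in the question (the real HANDLE layout and state names are not shown there), and the function is only safe under the stated assumption that channel has already been validated.

```c
/* Hypothetical stand-ins for the types in the question. */
typedef enum
{
  CHANNEL_1,
  CHANNEL_2,
  CHANNEL_3,
  CHANNEL_4,
  CHANNEL_5,
  CHANNEL_COUNT   /* number of channels, used to size the table */
} channel_t;

typedef enum
{
  CHANNEL_STATE_RESET,
  CHANNEL_STATE_READY,
  CHANNEL_STATE_BUSY
} state_t;

typedef struct
{
  state_t ChannelState[CHANNEL_COUNT];
} HANDLE;

/* Direct look-up: valid only if the caller guarantees that channel
   is one of CHANNEL_1 .. CHANNEL_5. */
static state_t channel_state (const HANDLE* handle, channel_t channel)
{
  return handle->ChannelState[channel];
}
```

Because the enumerators are adjacent integers starting at 0, the channel value doubles as the array index and no branching is needed.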
Note (as more or less implied by Lundin's comment): this answer is more about how to approach MISRA findings (and those of a few other analysis tools I have suffered from).
I would first try to get a better angle on what the finding is actually describing. And with a nested structure like shown, that takes some re-looking. So ...
I would apply indentation, just to make life easier while editing and then, well, add some more () in inviting places, e.g. in this case so as to enclose each x?y:z into one pair.
#define CHANNEL_STATE_GET(__HANDLE__, __CHANNEL__)\
( ((__CHANNEL__) == CHANNEL_1) ? (__HANDLE__)->ChannelState[0] :\
( ((__CHANNEL__) == CHANNEL_2) ? (__HANDLE__)->ChannelState[1] :\
( ((__CHANNEL__) == CHANNEL_3) ? (__HANDLE__)->ChannelState[2] :\
( ((__CHANNEL__) == CHANNEL_4) ? (__HANDLE__)->ChannelState[3] :\
(((__CHANNEL__) == CHANNEL_5) ? (__HANDLE__)->ChannelState[4] :\
(__HANDLE__)->ChannelState[5] \
) \
) \
) \
) \
)
This is to address what the quoted finding is about.
I would not feel bad about sprinkling a few more around e.g. each CHANNEL_N.
(I admit that I did not test my code against a MISRA checker. I try to provide an approach. I hope this fixes the mentioned finding, possibly replacing it with another one.... MISRA in my experience is good at that.... I do not even expect this to solve all findings.)
When trying to fix some seriously odd code like this, it's often a good idea to take one or two big steps backwards.
We know that hPer refers to an array. We have some troublesome code that is indexing into that array and pulling out one of the channel states. But this code is, frankly, pretty awful. Even if the MISRA checker weren't complaining about it, any time you've got five nested ?: operators, performing a cumbersome by-hand emulation of what ought to be a simple array lookup, it's a sure sign that something isn't right, and that there's probably a better way to do it. So what might that better way be?
One way to approach that question is to ask, How is the ChannelState array filled in? And is there any other code that also fetches out of it?
You've only asked us about this one line that your MISRA checker is complaining about. That suggests that the code that fills in the ChannelState array, and any other code that fetches out of it, is not drawing complaints. Perhaps that other code accesses the ChannelState array in some different, hopefully better way. Perhaps the underlying problem is that the programmer who wrote this CHANNEL_STATE_GET macro was unaware of that other code, had not been properly educated on this program's coding conventions and available utility routines. Perhaps it's perfectly acceptable to directly index a ChannelState array using a channel value. Or perhaps there's already something like the map_channel_index function which I suggested in my other answer.
So, do yourself a favor: spend a few minutes seeking out some other code that accesses the ChannelState array. You might learn something very interesting.
Other comments and answers are suggesting replacing the cumbersome CHANNEL_STATE_GET macro with a much simpler array lookup, and I strongly agree with that recommendation.
It's possible, though, that the definitions of CHANNEL_1 through CHANNEL_5 are not under your control, such that you can't guarantee that they're consecutive small integers as would be required. In that case, I recommend writing a small function whose sole job is to map a channel_t to an array index. The most obvious way to do this is with a switch statement:
unsigned int map_channel_index(channel_t channel)
{
switch(channel) {
case CHANNEL_1: return 0;
case CHANNEL_2: return 1;
case CHANNEL_3: return 2;
case CHANNEL_4: return 3;
case CHANNEL_5: return 4;
default: return 5;
}
}
Then you can define the much simpler
#define CHANNEL_STATE_GET(handle, channel) \
((handle)->ChannelState[map_channel_index(channel)])
Or, you can get rid of CHANNEL_STATE_GET entirely by replacing
if(CHANNEL_STATE_GET(hPer, channel) != CHANNEL_STATE_READY)
with
if(hPer->ChannelState[map_channel_index(channel)] != CHANNEL_STATE_READY)
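To see the mapping function earn its keep, here is a self-contained sketch in which the channel constants are deliberately not consecutive. The enumerator values, handle_t, the state names and channel_is_ready are all assumptions made up for illustration; only map_channel_index follows the answer above.

```c
/* Hypothetical channel IDs that are NOT consecutive small integers. */
typedef enum {
    CHANNEL_1 = 0x00,
    CHANNEL_2 = 0x04,
    CHANNEL_3 = 0x08,
    CHANNEL_4 = 0x0C,
    CHANNEL_5 = 0x10
} channel_t;

typedef enum { CHANNEL_STATE_RESET, CHANNEL_STATE_READY } state_t;

/* Six slots, matching the six branches of the original macro. */
typedef struct { state_t ChannelState[6]; } handle_t;

/* Map a channel ID to its array index; unknown IDs share the last slot. */
static unsigned int map_channel_index(channel_t channel)
{
    switch (channel) {
    case CHANNEL_1: return 0;
    case CHANNEL_2: return 1;
    case CHANNEL_3: return 2;
    case CHANNEL_4: return 3;
    case CHANNEL_5: return 4;
    default:        return 5;
    }
}

/* Returns nonzero if the given channel is in the READY state. */
static int channel_is_ready(const handle_t *h, channel_t channel)
{
    return h->ChannelState[map_channel_index(channel)] == CHANNEL_STATE_READY;
}
```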
Here's a formal grammar brain teaser (maybe :P)
I'm fairly certain there is no context where the character sequence => may appear in a valid C program (except obviously within a string). However, I'm unable to prove this to myself. Can you either:
Describe a method that I can use for an arbitrary character sequence to determine whether it is possible in a valid C program (outside a string/comment). Better solutions require less intuition.
Point out a program that does this. I have a weak gut feeling this could be undecidable but it'd be great if I was wrong.
To get your minds working, other combos I've been thinking about:
:- (b ? 1:-1), !? don't think so, ?! (b ?!x:y), <<< don't think so.
If anyone cares: I'm interested because I'm creating a little custom C pre-processor for personal use and was hoping to not have to parse any C for it. In the end I will probably just have my tokens start with $ or maybe a backquote but I still found this question interesting enough to post.
Edit: It was quickly pointed out that header names have almost no restrictions so let me amend that I'm particularly interested in non-pre-processor code, alternatively, we could consider characters within the <> of #include <...> as a string literal.
Re-edit: I guess macros/pre-processor directives beat this question any which way I ask it :P but if anyone can answer the question for pure (read: non-macro'd) C code, I think it's an interesting one.
#include <abc=>
is valid in a C program. The text inside the <...> can be any member of the source character set except a newline and >.
This means that most character sequences, including !? and <<<, could theoretically appear.
In addition to all the other quibbles, there are a variety of cases involving macros.
The arguments to a macro expansion don't need to be syntactically correct, although of course they would need to be syntactically correct in the context of their expansion. But then, they might never be expanded:
#include <errno.h>
#define S_(a) #a
#define _(a,cosmetic,c) [a]=#a" - "S_(c)
const char* err_names[] = {
_(EAGAIN, =>,Resource temporarily unavailable),
_(EINTR, =>,Interrupted system call),
_(ENOENT, =>,No such file or directory),
_(ENOTDIR, =>,Not a directory),
_(EPERM, =>,Operation not permitted),
_(ESRCH, =>,No such process),
};
#undef _
const int nerr = sizeof(err_names)/sizeof(err_names[0]);
Or, they could be used but in stringified form:
#define _(a,b,c) [a]=#a" "S_(b)" "S_(c)
Note: Why #a but S_(c)? Because EAGAIN and friends are macros, not constants, and in this case we don't want them to be expanded before stringification.
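The difference between stringifying directly and going through a helper can be shown in isolation. This is only a sketch: STR, XSTR and LIMIT are hypothetical names chosen for the example, not part of the code above.

```c
#define STR(x)  #x       /* stringify the argument exactly as written */
#define XSTR(x) STR(x)   /* macro-expand the argument first, then stringify */
#define LIMIT   42       /* hypothetical macro, for illustration only */

/* "=>" is not valid C syntax, but it is a valid macro argument
   as long as it ends up stringified. */
static const char *const arrow    = STR(=>);      /* "=>"    */
static const char *const raw      = STR(LIMIT);   /* "LIMIT" */
static const char *const expanded = XSTR(LIMIT);  /* "42"    */
```

This mirrors the err_names table: #a keeps the spelling EAGAIN, while an argument routed through S_ is expanded before it is turned into a string.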
/*=>*/
//=>
"=>"
'=>'
We are all familiar with how the sizeof operator works in C. I am trying to write a function that works similarly: it should accept any kind of datatype and return its size.
Can somebody tell me how to write such a function in C?
int myOwnSizeOf(/*what would be parameter type?*/)
{
//and what about the definition?
}
Thanks
You can't do that with a function. That's why sizeof is an operator built into the language, not a library function. It's magic.
You could do
#define myOwnSizeOf(x) (sizeof(x))
but I don't really see the point.
Since a function is evaluated at runtime, it can never consume a datatype but only objects. That's why sizeof is a built in operator.
You might get it to work for this limited case in C++ with a template function. But in C your only possibilities to consume an object of any datatype are either void pointers or macros. The former won't work, of course, as it loses any type information, and the latter was already suggested by aschepler; as he noted, it won't buy you anything (and it isn't a function, anyway).
A function can't do it, but a macro can, if you don't mind throwing in a little technical UB that can't/won't really matter:
#define mysizeof(T) (size_t)((char *)((T*)0+1)-(char *)0)
If you replaced 0 with the address of a static-storage-duration object larger than any other object you'd ever try to take the size of, the UB would go away.
Edit: Note that this is for types T; a version for variables is much easier:
#define mysizeof(v) (size_t)((char *)(&v+1)-(char *)&v)
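A quick way to convince yourself that the variable version agrees with sizeof is sketched below. The macro is renamed mysizeof_var and given extra parentheses around the argument and the whole expansion; struct sample and sample_size are hypothetical names used only for the check.

```c
#include <stddef.h>

/* Variable version from above: &(v) + 1 points one past the object,
   which is a valid pointer to form, so no undefined behavior here. */
#define mysizeof_var(v) ((size_t)((char *)(&(v) + 1) - (char *)&(v)))

struct sample { char tag; double value; };   /* hypothetical test type */

/* Returns the size of a struct sample as computed by the macro. */
static size_t sample_size(void)
{
    struct sample s;
    return mysizeof_var(s);
}
```

The macro also works on arrays, since &(arr) + 1 steps over the whole array rather than one element.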
This is a nitpicky-details question with three parts. The context is that I wish to persuade some folks that it is safe to use <stddef.h>'s definition of offsetof unconditionally rather than (under some circumstances) rolling their own. The program in question is written entirely in plain old C, so please ignore C++ entirely when answering.
Part 1: When used in the same manner as the standard offsetof, does the expansion of this macro provoke undefined behavior per C89, why or why not, and is it different in C99?
#define offset_of(tp, member) (((char*) &((tp*)0)->member) - (char*)0)
Note: All implementations of interest to the people whose program this is supersede the standard's rule that pointers may only be subtracted from each other when they point into the same array, by defining all pointers, regardless of type or value, to point into a single global address space. Therefore, please do not rely on that rule when arguing that this macro's expansion provokes undefined behavior.
Part 2: To the best of your knowledge, has there ever been a released, production C implementation that, when fed the expansion of the above macro, would (under some circumstances) behave differently than it would have if its offsetof macro had been used instead?
Part 3: To the best of your knowledge, what is the most recently released production C implementation that either did not provide stddef.h or did not provide a working definition of offsetof in that header? Did that implementation claim conformance with any version of the C standard?
For parts 2 and 3, please answer only if you can name a specific implementation and give the date it was released. Answers that state general characteristics of implementations that may qualify are not useful to me.
There is no way to write a portable offsetof macro. You must use the one provided by stddef.h.
Regarding your specific questions:
The macro invokes undefined behavior. You cannot subtract pointers except when they point into the same array.
The big difference in practical behavior is that the macro is not an integer constant expression, so it can't safely be used for static initializers, bitfield widths, etc. Also strict bounds-checking-type C implementations might completely break it.
There has never been any C standard that lacked stddef.h and offsetof. Pre-ANSI compilers might lack it, but they have much more fundamental problems that make them unusable for modern code (e.g. lack of void * and const).
Moreover, even if some theoretical compiler did lack stddef.h, you could just provide a drop-in replacement, just like the way people drop in stdint.h for use with MSVC...
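The point about integer constant expressions is easiest to see at file scope, where no code can run and every initializer or array size must be a constant. A small sketch, using a hypothetical struct packet:

```c
#include <stddef.h>

struct packet { char kind; int length; char payload[16]; };  /* hypothetical */

/* The standard offsetof yields an integer constant expression, so each of
   these file-scope uses is guaranteed to be accepted. */
static const size_t payload_offset = offsetof(struct packet, payload);
static char header_copy[offsetof(struct packet, payload)];
enum { LENGTH_OFFSET = offsetof(struct packet, length) };
```

With a hand-rolled pointer-subtraction macro in place of offsetof, a compiler is not required to accept any of these three declarations.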
To answer #2: yes, gcc-4* (I'm currently looking at v4.3.4, released 4 Aug 2009, but it should hold true for all gcc-4 releases to date). The following definition is used in their stddef.h:
#define offsetof(TYPE, MEMBER) __builtin_offsetof (TYPE, MEMBER)
where __builtin_offsetof is a compiler builtin like sizeof (that is, it's not implemented as a macro or run-time function). Compiling the code:
#include <stddef.h>
struct testcase {
char array[256];
};
int main (void) {
char buffer[offsetof(struct testcase, array[0])];
return 0;
}
would result in an error using the expansion of the macro that you provided ("size of array ‘buffer’ is not an integral constant-expression") but would work when using the macro provided in stddef.h. Builds using gcc-3 used a macro similar to yours. I suppose that the gcc developers had many of the same concerns regarding undefined behavior, etc that have been expressed here, and created the compiler builtin as a safer alternative to attempting to generate the equivalent operation in C code.
Additional information:
A mailing list thread from the Linux kernel developer's list
GCC's documentation on offsetof
A sort-of-related question on this site
Regarding your other questions: I think R's answer and his subsequent comments do a good job of outlining the relevant sections of the standard as far as question #1 is concerned. As for your third question, I have not heard of a modern C compiler that does not have stddef.h. I certainly wouldn't consider any compiler lacking such a basic standard header as "production". Likewise, if their offsetof implementation didn't work, then the compiler still has work to do before it could be considered "production", just like if other things in stddef.h (like NULL) didn't work. A C compiler released prior to C's standardization might not have these things, but the ANSI C standard is over 20 years old so it's extremely unlikely that you'll encounter one of these.
The whole premise of this problem raises a question: If these people are convinced that they can't trust the version of offsetof that the compiler provides, then what can they trust? Do they trust that NULL is defined correctly? Do they trust that long int is no smaller than a regular int? Do they trust that memcpy works like it's supposed to? Do they roll their own versions of the rest of the C standard library functionality? One of the big reasons for having language standards is so that you can trust the compiler to do these things correctly. It seems silly to trust the compiler for everything else except offsetof.
Update: (in response to your comments)
I think my co-workers behave like yours do :-) Some of our older code still has custom macros defining NULL, VOID, and other things like that since "different compilers may implement them differently" (sigh). Some of this code was written back before C was standardized, and many older developers are still in that mindset even though the C standard clearly says otherwise.
Here's one thing you can do to both prove them wrong and make everyone happy at the same time:
#include <stddef.h>
#ifndef offsetof
#define offsetof(tp, member) (((char*) &((tp*)0)->member) - (char*)0)
#endif
In reality, they'll be using the version provided in stddef.h. The custom version will always be there, however, in case you run into a hypothetical compiler that doesn't define it.
Based on similar conversations that I've had over the years, I think the belief that offsetof isn't part of standard C comes from two places. First, it's a rarely used feature. Developers don't see it very often, so they forget that it even exists. Second, offsetof is not mentioned at all in Kernighan and Ritchie's seminal book "The C Programming Language" (even the most recent edition). The first edition of the book was the unofficial standard before C was standardized, and I often hear people mistakenly referring to that book as THE standard for the language. It's much easier to read than the official standard, so I don't know if I blame them for making it their first point of reference. Regardless of what they believe, however, the standard is clear that offsetof is part of ANSI C (see R's answer for a link).
Here's another way of looking at question #1. The ANSI C standard gives the following definition in section 4.1.5:
offsetof( type, member-designator)
which expands to an integral constant expression that has type size_t,
the value of which is the offset in bytes, to the structure member
(designated by member-designator ), from the beginning of its
structure (designated by type ).
Using the offsetof macro does not invoke undefined behavior. In fact, the behavior is all that the standard actually defines. It's up to the compiler writer to define the offsetof macro such that its behavior follows the standard. Whether it's implemented using a macro, a compiler builtin, or something else, ensuring that it behaves as expected requires the implementor to deeply understand the inner workings of the compiler and how it will interpret the code. The compiler may implement it using a macro like the idiomatic version you provided, but only because they know how the compiler will handle the non-standard code.
On the other hand, the macro expansion you provided indeed invokes undefined behavior. Since you don't know enough about the compiler to predict how it will process the code, you can't guarantee that particular implementation of offsetof will always work. Many people define their own version like that and don't run into problems, but that doesn't mean that the code is correct. Even if that's the way that a particular compiler happens to define offsetof, writing that code yourself invokes UB while using the provided offsetof macro does not.
Rolling your own macro for offsetof can't be done without invoking undefined behavior (ANSI C section A.6.2 "Undefined behavior", 27th bullet point). Using stddef.h's version of offsetof will always produce the behavior defined in the standard (assuming a standards-compliant compiler). I would advise against defining a custom version since it can cause portability problems, but if others can't be persuaded then the #ifndef offsetof snippet provided above may be an acceptable compromise.
(1) The undefined behavior is already there before you do the subtraction.
First of all, (tp*)0 is not what you think it is. It is a null
pointer, such a beast is not necessarily represented with all-zero
bit pattern.
Then the member operator -> is not simply an offset addition. On a CPU with segmented memory this might be a more complicated operation.
Taking the address with a & operation is UB if the expression is
not a valid object.
(2) For point 2, there are certainly still architectures out in the wild (embedded stuff) that use segmented memory. For 3, the point that R makes about integer constant expressions has another drawback: if the code is badly optimized the & operation might be done at runtime and signal an error.
(3) Never heard of such a thing, but this is probably not enough to convince your colleagues.
I believe that nearly every optimizing compiler has broken that macro at multiple points in time. Your coworkers have apparently been lucky enough not to have been hit by it.
What happens is that some junior compiler engineer decides that because the zero page is never mapped on their platform of choice, any time anyone does anything with a pointer to that page, that's undefined behavior and they can safely optimize away the whole expression. At that point, everyone's homebrew offsetof macros break until enough people scream about it, and those of us who were smart enough not to roll our own go happily about our business.
I don't know of any compiler where this is the behavior in the current released version, but I think I've seen it happen at some point with every compiler I've ever worked with.
I want to get the size of a specific member in a struct.
sizeof(((SomeStruct *) 0)->some_member) works for me but I feel like there might be a nicer way to do it.
I could #define SIZEOF_ELEM(STRUCT, ELEM) sizeof(((STRUCT *) 0)->ELEM) and then use SIZEOF_ELEM(SomeStruct, some_member), but I wonder whether there is already something better built-in.
My specific use-case is in hsc2hs (Haskell C bindings).
pokeArray (plusPtr context (#offset AVFormatContext, filename)) .
take (#size ((AVFormatContext *) 0)->filename) .
(++ repeat '\NUL') $ filename
What you've got is about as clean as it gets if you can't guarantee you have a variable to dereference. (If you can, then use just sizeof(var.member) or sizeof(ptr->member), of course, but this won't work in some contexts where a compile-time constant is needed.)
Once upon a long, long time ago (circa 1990), I ran into a compiler that had 'offsetof' defined using the base address 0, and it crashed. I worked around the problem by hacking <stddef.h> to use 1024 instead of 0. But you should not run into such problems now.
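The reason the null-pointer form is safe is that sizeof does not evaluate its operand (variable-length arrays aside), so nothing is ever dereferenced. A minimal sketch, where SIZEOF_MEMBER and struct format_context are hypothetical stand-ins rather than the real AVFormatContext:

```c
#include <stddef.h>

/* Size of a struct member without needing an object of that type. */
#define SIZEOF_MEMBER(type, member) (sizeof(((type *)0)->member))

struct format_context { int flags; char filename[1024]; };  /* hypothetical */

/* The result is a compile-time constant, so it can size a static array. */
static char name_buffer[SIZEOF_MEMBER(struct format_context, filename)];
```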
Microsoft has the following in one of their headers:
#define RTL_FIELD_SIZE(type, field) (sizeof(((type *)0)->field))
I see no reason to do any different.
They have related macros for:
RTL_SIZEOF_THROUGH_FIELD()
RTL_CONTAINS_FIELD()
and the nifty:
CONTAINING_RECORD()
which helps implement generic lists in straight C without having to require that link fields be at the start of a struct. See this Kernel Mustard article for details.
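The idea behind a macro like CONTAINING_RECORD can be sketched with the standard offsetof. The CONTAINER_OF macro below is a hypothetical illustration of the technique, not Microsoft's actual definition, and the list and job types are invented for the example.

```c
#include <stddef.h>

/* Given a pointer to a member, recover a pointer to the enclosing struct
   by stepping back over the member's offset. */
#define CONTAINER_OF(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct list_link { struct list_link *next; };

/* The link field does not have to be the first member. */
struct job { int id; struct list_link link; };

/* Hypothetical helper: map a list link back to the job that embeds it. */
static struct job *job_from_link(struct list_link *l)
{
    return CONTAINER_OF(l, struct job, link);
}
```

Generic list code can then traverse struct list_link nodes only, and each user converts a node back to its own record type.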
I believe you've already got the correct solution there. You could dig up your stddef.h and look for how offsetof is defined, since it does a very similar thing.
Remember that there may well be a difference between the sizeof a member and the difference between the offsetofs of that member and the next one, due to padding.
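The padding caveat can be made concrete with a sketch like the one below. struct padded is a hypothetical example; the exact layout is up to the implementation, so the only thing that can be relied on is that the gap is at least as large as the member.

```c
#include <stddef.h>

struct padded { char c; int i; };  /* hypothetical; layout is implementation-defined */

/* Size of the member itself: always 1 for a char. */
static const size_t size_of_c = sizeof(((struct padded *)0)->c);

/* Distance from c to the next member: includes any padding inserted
   so that i is suitably aligned. */
static const size_t gap_to_i =
    offsetof(struct padded, i) - offsetof(struct padded, c);
```

On many common ABIs gap_to_i comes out larger than size_of_c, which is exactly the difference the paragraph above warns about.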
In C++ you could do sizeof(SomeStruct::some_member), but this is C and you have no scope resolution operator. What you've written is as good as can be written, as far as I know.