Is it okay to compare a pointer and an integer in C? - c

I'm writing some code that maps virtual addresses to physical addresses.
I have code along these lines:
if (address > 0xFFFF)
Status = XST_FAILURE; // Out of range
else if (address <= 0xCFFF || address >= 0xD400) {
// Write to OCM
Xil_Out8(OCM_HIGH64_BASEADDR + OCM_OFFSET + address, data);
else { // (address >= 0xD000)
// Write to external CCA
Status = ext_mem_write(address, data);
I get a compiler warning:
comparison between pointer and integer [enabled by default]
I realize that I'm comparing two different types (pointer and integer), but is this an issue? After all, comparing a pointer to an integer is exactly what I want to do.
Would it be cleaner to define pointer constants to compare to instead of integers?
const int *UPPER_LIMIT = 0xFFFF;
...
if (address > UPPER_LIMIT ){
....

The clean way is to use contants of type uintptr_t, which is defined to be an unsigned integer that can uniquely map between pointers and integers.
This should be defined by #include <stdint.h>. If it is not defined then it indicates that either your compiler doesn't follow the C standard, or the system does not have a flat memory model.
It's intended to be mapped in the "obvious" way , i.e. one integer per byte in ascending order. The standard doesn't absolutely guarantee that but as a quality of implementation issue it's hard to see anything else happening.
Example:
uintptr_t foo = 0xFFFF;
void test(char *ptr)
{
if ( (uintptr_t)ptr < foo )
// do something...
}
This is well-defined by the C standard. The version where you use void * instead of uintptr_t is undefined behaviour, although it may appear to work if your compiler isn't too aggressive.

That's probably why Linux Kernel uses unsigned long for addresses (note the difference -- pointer points to an object, while address is an abstract code representing location in memory).
That's how it seem from compiler perspective:
C standard doesn't define how to compare int (arithmetic type) literal 0xFFFF and pointer address -- see paragraph 6.5.8
So, it has to convert operands somehow. Both conversions are implementation defined as paragraph 6.3.2.3 states. Here are couple of crazy decisions that compiler eligible to make:
Because 0xFFFF is probably int -- see 6.4.4, it may coerce pointer to int and if sizeof(int) < sizeof(void*), you will lose higher bytes.
I can imagine more crazier situations when 0xFFFF is sign extended to 0xFFFFFFFF (shouldn't be, but why not)
Of course, none of (2) should happen, modern compilers are smart enough. But it can happen (I assume you're writing something embedded, where it is more likely to happen), so that's why compiler raises a warning.
Here is one practical example of "crazy compiler things": in GCC 4.8 optimizer started to treat integer overflow as UB (Undefined Behavior) and omit instructions assuming programmer doesn't want integer overflow: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61569
I'm referring to N1570 - C11 standard draft

Cast pointer to unsigned int to avoid warnings: (unsigned)address - in case of 32 or 16 bit address space.

Related

Is it valid to calculate element pointers by explicit arithmetic?

Is the following program valid? (In the sense of being well-defined by the ISO C standard, not just happening to work on a particular compiler.)
struct foo {
int a, b, c;
};
int f(struct foo *p) {
// should return p->c
char *q = ((char *)p) + 2 * sizeof(int);
return *((int *)q);
}
It follows at least some of the rules for well-defined use of pointers:
The value being loaded, is of the same type that was stored at the address.
The provenance of the calculated pointer is valid, being derived from a valid pointer by adding an offset, that gives a pointer still within the original storage instance.
There is no mixing of element types within the struct, that would generate padding to make an element offset unpredictable.
But I'm still not sure it's valid to explicitly calculate and use element pointers that way.
C is a low level programming language. This code is well-defined but probably not portable.
It is not portable because it makes assumptions about the layout of the struct. In particular, you might run into fields being 64-bit aligned on a 64bit platform where in is 32 bit.
Better way of doing it is using the offsetof marco.
The C standard allows there to be arbitrary padding between elements of a struct (but not at the beginning of one). Real-world compilers won’t insert padding into a struct like that one, but the DeathStation 9000 is allowed to. If you want to do that portably, use the offsetof() macro from <stddef.h>.
*(int*)((char*)p + offsetof(foo, c))
is guaranteed to work. A difference, such as offsetof(foo,c) - offsetof(foo, b), is also well-defined. (Although, since offsetof() returns an unsigned value, it’s defined to wrap around to a large unsigned number if the difference underflows.)
In practice, of course, use &p->c.
An expression like the one in your original question is guaranteed to work for array elements, however, so long as you do not overrun your buffer. You can also generate a pointer one past the end of an array and compare that pointer to a pointer within the array, but dereferencing such a pointer is undefined behavior.
I think it likely that at least some authors of the Standard intended to allow a compiler given something like:
struct foo { unsigned char a[4], b[4]; } x;
int test(int i)
{
x.b[0] = 1;
x.a[i] = 2;
return x.b[0];
}
to generate code that would always return 1 regardless of the value of i. On the flip side, I think it is extremely like nearly all of the Committee would have intended that a function like:
struct foo { char a[4], b[4]; } x;
void put_byte(int);
void test2(unsigned char *p, int sz)
{
for (int i=0; i<sz; i++)
put_byte(p[i]);
}
be capable of outputting all of the bytes in x in a single invocation.
Clang and gcc will assume that any construct which applies the [] operator to a struct or union member will only be used to access elements of that member array, but the Standard defines the behavior of arrayLValue[index] as equivalent to (*((arrayLValue)+index)), and would define the address of x.a's first element, which is an unsigned char*, as equivalent to the address of x, cast to that type. Thus, if code calls test2((unsigned char*)x), the expression p[i] would be equivalent to x.a[i], which clang and gcc would only support for subscripts in the range 0 to 3.
The only way I see of reading the Standard as satisfying both viewpoints would be to treat support for even the latter construct as a "quality of implementation" issue outside the Standard's jurisdiction, on the assumption that quality implementations would support constructs like the latter with or without a mandate, and there was thus no need to write sufficiently detailed rules to distinguish those two scenarios.

Is it always undefined behaviour to copy the bits of a variable through an incompatible pointer?

For example, can this
unsigned f(float x) {
unsigned u = *(unsigned *)&x;
return u;
}
cause unpredictable results on a platform where,
unsigned and float are both 32-bit
a pointer has a fixed size for all types
unsigned and float can be stored to and loaded from the same part of memory.
I know about strict aliasing rules, but most examples showing problematic cases of violating strict aliasing is like the following.
static int g(int *i, float *f) {
*i = 1;
*f = 0;
return *i;
}
int h() {
int n;
return g(&n, (float *)&n);
}
In my understanding, the compiler is free to assume that i and f are implicitly restrict. The return value of h could be 1 if the compiler thinks *f = 0; is redundant (because i and f can't alias), or it could be 0 if it puts into account that the values of i and f are the same. This is undefined behaviour, so technically, anything else can happen.
However, the first example is a bit different.
unsigned f(float x) {
unsigned u = *(unsigned *)&x;
return u;
}
Sorry for my unclear wording, but everything is done "in-place". I can't think of any other way the compiler might interpret the line unsigned u = *(unsigned *)&x;, other than "copy the bits of x to u".
In practice, all compilers for various architectures I tested in https://godbolt.org/ with full optimization produce the same result for the first example, and varying results (either 0 or 1) for the second example.
I know it's technically possible that unsigned and float have different sizes and alignment requirements, or should be stored in different memory segments. In that case even the first code won't make sense. But on most modern platforms where the following holds, is the first example still undefined behaviour (can it produce unpredictable results)?
unsigned and float are both 32-bit
a pointer has a fixed size for all types
unsigned and float can be stored to and loaded from the same part of memory.
In real code, I do write
unsigned f(float x) {
unsigned u;
memcpy(&u, &x, sizeof(x));
return u;
}
The compiled result is the same as using pointer casting, after optimization. This question is about interpretation of the standard about strict aliasing rules for code such as the first example.
Is it always undefined behaviour to copy the bits of a variable through an incompatible pointer?
Yes.
The rule is https://port70.net/~nsz/c/c11/n1570.html#6.5p7 :
An object shall have its stored value accessed only by an lvalue expression that has one of
the following types:
a type compatible with the effective type of the object,
a qualified version of a type compatible with the effective type of the object,
a type that is the signed or unsigned type corresponding to the effective type of the
object,
a type that is the signed or unsigned type corresponding to a qualified version of the
effective type of the object,
an aggregate or union type that includes one of the aforementioned types among its
members (including, recursively, a member of a subaggregate or contained union), or
a character type.
The effective type of the object x is float - it is defined with that type.
unsigned is not compatible with float,
unsigned is not a qualified version of float,
unsigned is not a signed or unsigned type of float,
unsigned is not a signed or unsigned type corresponding to qualified version of float,
unsigned is not an aggregate or union type
and unsigned is not a character type.
The "shall" is violated, it is undefined behavior (see https://port70.net/~nsz/c/c11/n1570.html#4p2 ). There is no other interpretation.
We also have https://port70.net/~nsz/c/c11/n1570.html#J.2 :
The behavior is undefined in the following circumstances:
An object has its stored value accessed other than by an lvalue of an allowable type (6.5).
As Kamil explains, it's UB. Even int and long (or long and long long) aren't alias-compatible even when they're the same size. (But interestingly, unsigned int is compatible with int)
It's nothing to do with being the same size, or using the same register-set as suggested in a comment, it's mainly a way to let compilers assume that different pointers don't point to overlapping memory when optimizing. They still have to support C99 union type-punning, not just memcpy. So for example a dst[i] = src[i] loop doesn't need to check for possible overlap when unrolling or vectorizing, if dst and src have different types.1
If you're accessing the same integer data, the standard requires that you use the exact same type, modulo only things like signed vs. unsigned and const. Or that you use (unsigned) char*, which is like GNU C __attribute__((may_alias)).
The other part of your question seems to be why it appears to work in practice, despite the UB.
Your godbolt link forgot to link the actual compilers you tried.
https://godbolt.org/z/rvj3d4e4o shows GCC4.1, from before GCC went out of its way to support "obvious" local compile-time-visible cases like this, to sometimes not break people's buggy code using non-portable idioms like this.
It loads garbage from stack memory, unless you use -fno-strict-aliasing to make it movd to that location first. (Store/reload instead of movd %xmm0, %eax is a missed-optimization bug that's been fixed in later GCC versions for most cases.)
f: # GCC4.1 -O3
movl -4(%rsp), %eax
ret
f: # GCC4.1 -O3 -fno-strict-aliasing
movss %xmm0, -4(%rsp)
movl -4(%rsp), %eax
ret
Even that old GCC version warns warning: dereferencing type-punned pointer will break strict-aliasing rules which should make it obvious that GCC notices this and does not consider it well-defined. Later GCC that do choose to support this code still warn.
It's debatable whether it's better to sometimes work in simple cases, but break other times, vs. always failing. But given that GCC -Wall does still warn about it, that's probably a good tradeoff between convenience for people dealing with legacy code or porting from MSVC. Another option would be to always break it unless people use -fno-strict-aliasing, which they should if dealing with codebases that depend on this behaviour.
Being UB doesn't mean required-to-fail
Just the opposite; it would take tons of extra work to actually trap on every signed overflow in the C abstract machine, for example, especially when optimizing stuff like 2 + c - 3 into c - 1. That's what gcc -fsanitize=undefined tries to do, adding x86 jo instructions after additions (except it still does constant-propagation so it's just adding -1, not detecting temporary overflow on INT_MAX. https://godbolt.org/z/WM9jGT3ac). And it seems strict-aliasing is not one of the kinds of UB it tries to detect at run time.
See also the clang blog article: What Every C Programmer Should Know About Undefined Behavior
An implementation is free to define behaviour the ISO C standard leaves undefined
For example, MSVC always defines this aliasing behaviour, like GCC/clang/ICC do with -fno-strict-aliasing. Of course, that doesn't change the fact that pure ISO C leaves it undefined.
It just means that on those specific C implementations, the code is guaranteed to work the way you want, rather than happening to do so by chance or by de-facto compiler behaviour if it's simple enough for modern GCC to recognize and do the more "friendly" thing.
Just like gcc -fwrapv for signed-integer overflows.
Footnote 1: example of strict-aliasing helping code-gen
#define QUALIFIER // restrict
void convert(float *QUALIFIER pf, const int *pi) {
for(int i=0 ; i<10240 ; i++){
pf[i] = pi[i];
}
}
Godbolt shows that with the -O3 defaults for GCC11.2 for x86-64, we get just a SIMD loop with movdqu / cvtdq2ps / movups and loop overhead. With -O3 -fno-strict-aliasing, we get two versions of the loop, and an overlap check to see if we can run the scalar or the SIMD version.
Is there actual cases where strict aliasing helps better code generation, in which the same cannot be achieved with restrict
You might well have a pointer that might point into either of two int arrays, but definitely not at any float variable, so you can't use restrict on it. Strict-aliasing will let the compiler still avoid spill/reload of float objects around stores through the pointer, even if the float objects are global vars or otherwise aren't provably local to the function. (Escape analysis.)
Or a struct node * that definitely isn't the same type as the payload in a tree.
Also, most code doesn't use restrict all over the place. It could get quite cumbersome. Not just in loops, but in every function that deals with pointers to structs. And if you get it wrong and promise something that's not true, your code's broken.
The Standard was never intended to fully, accurately, and unambiguously partition programs that have defined behavior and those that don't(*), but instead relies upon compiler writers to exercise a certain amount of common sense.
(*) If it was intended for that purpose, it fails miserably, as evidenced by the amount of confusion stemming from it.
Consider the following two code snippets:
/* Assume suitable declarations of u are available everywhere */
union test { uint32_t ww[4]; float ff[4]; } u;
/* Snippet #1 */
uint32_t proc1(int i, int j)
{
u.ww[i] = 1;
u.ff[j] = 2.0f;
return u.ww[i];
}
/* Snippet #2, part 1, in one compilation unit */
uint32_t proc2a(uint32_t *p1, float *p2)
{
*p1 = 1;
*p2 = 2.0f;
return *p1;
}
/* Snippet #2, part 2, in another compilation unit */
uint32_t proc2(int i, int j)
{
return proc2a(u.ww+i, u.ff+j);
}
It is clear that the authors of the Standard intended that the first version of the code be processed meaningfully on platforms where that would make sense, but it's also clear that at least some of the authors of C99 and later versions did not intend to require that the second version be processed likewise (some of the authors of C89 may have intended that the "strict aliasing rule" only apply to situations where a directly named object would be accessed via pointer of another type, as shown in the example given in the published Rationale; nothing in the Rationale suggests a desire to apply it more broadly).
On the other hand, the Standard defines the [] operator in such a fashion that proc1 is semantically equivalent to:
uint32_t proc3(int i, int j)
{
*(u.ww+i) = 1;
*(u.ff+j) = 2.0f;
return *(u.ww+i);
}
and there's nothing in the Standard that would imply that proc() shouldn't have the same semantics. What gcc and clang seem to do is special-case the [] operator as having a different meaning from pointer dereferencing, but nothing in the Standard makes such a distinction. The only way to consistently interpret the Standard is to recognize that the form with [] falls in the category of actions which the Standard doesn't require that implementations process meaningfully, but relies upon them to handle anyway.
Constructs such as yours example of using a directly-cast pointer to access storage associated with an object of the original pointer's type fall in a similar category of constructs which at least some authors of the Standard likely expected (and would have demanded, if they didn't expect) that compilers would handle reliably, with or without a mandate, since there was no imaginable reason why a quality compiler would do otherwise. Since then, however, clang and gcc have evolved to defy such expectations. Even if clang and gcc would normally generate useful machine code for a function, they seek to perform aggressive inter-procedural optimizations that make it impossible to predict what constructs will be 100% reliable. Unlike some compilers which refrain from applying potential optimizing transforms unless they can prove that they are sound, clang and gcc seek to perform transforms that can't be proven to affect program behavior.

How to resolve MISRA C:2012 Rule 11.6?

I am utilizing Microchip sample nvmem.c file function to write data into particular memory address of PIC32 Microcontroller. When I am trying to use it showing following MISRA error: I just posted sample code where I got an error. My whole code is compiled and working fine.
1] explicit cast from 'unsigned int' to 'void ' [MISRA 2012 Rule 11.6, required] at NVMemWriteWord((void)APP_FLASH_MARK_ADDRESS,(UINT)_usermark);
How can I resolve this error?
nvmem.c
uint8_t NVMemWriteWord(void* address, uint32_t data)
{
uint8_t res;
NVMADDR = KVA_TO_PA((uint32_t)address); //destination address to write
NVMDATA = data;
res = NVMemOperation(NVMOP_WORD_PGM);
}
test.c
#define ADDRESS 0x9D007FF0U;
NVMemWriteWord((void*)ADDRESS,(uint32_t)_usermark);
Use
uint8_t NVMemWriteWord(unsigned int address, uint32_t data)
{
uint8_t res;
NVMADDR = KVA_TO_PA(address);
NVMDATA = data;
res = NVMemOperation(NVMOP_WORD_PGM);
}
and
#define ADDRESS 0x9D007FF0U
NVMemWriteWord(ADDRESS,(uint32_t)_usermark);
instead. Functionally it is exactly equivalent to the example, it just avoids the cast from a void pointer to an unsigned integer address.
Suggest:
#define ADDRESS (volatile uint32_t*)0x9D007FF0U
NVMemWriteWord( ADDRESS, _usermark) ;
Never cast to void* - the purpose of void* is that you can assign any other pointer type to it safely and without explicit cast. The cast of _usermark may or may not be necessary, but unnecessary explicit casts should be avoided - they can suppress important compiler warnings. You should approach type conversions in the following order of preference:
Type agreement - exactly same types.
Type compatibility - smaller type to larger type, same signedness.
Type case - last resort (e.g. larger to smaller type, signedness mismatch, integer to/from pointer).
In this instance since NVMemWriteWord simply casts address to an integer, then the use of void* may not be appropriate. If in other contexts you are actually using a pointer, then it may be valid.
The whole of MISRA-C:2012 chapter 12 regarding pointer conversions is quite picky. And rightly so, since this is very dangerous territory.
11.6 is a sound rule that bans conversions from integers to void*. The rationale is to block alignment bugs. There aren't many reasons why you would want to do such conversions anyway.
Notably, there's also two rigid but advisory rules 11.4 which bans conversions from integers to pointers, and 11.5 which pretty much bans the use of void* entirely. It isn't possible to do hardware-related programming and follow 11.4, so that rule has to be ignored. But you have little reason to use void*.
In this specific cast you can get away by using uint32_t and avoiding pointers entirely.
In the general case of register access, you must do a conversion with volatile-qualified pointers: (volatile uint32_t*)ADDRESS, assuming that the MCU uses 32 bit registers.

What is the maximum value of a pointer on Mac OSX?

I'd like to make a custom pointer address printer (like the printf(%p)) and I'd like to know what is the maximum value that a pointer can have on the computer I'm using, which is an iMac OS X 10.8.5.
Someone recommended I use an unsigned long. Is the following cast the adapted one and big enough ?
function print_address(void *pointer)
{
unsigned long a;
a = (unsigned long) pointer;
[...]
}
I searched in the limits.h header but I couldn't find any mention of it. Is it a fixed value or there a way to find out what is the maximum on my system ?
Thanks for your help !
Quick summary: Convert the pointer to uintptr_t (defined in <stdint.h>), which will give you a number in the range 0 to UINTPTR_MAX. Read on for the gory details and some unlikely problems you might theoretically run into.
In general there is no such thing as a "maximum value" for a pointer. Pointers are not numbers, and < and > comparisons aren't even defined unless both pointers point into the same (array) object or just past the end of it.
But I think that the size of a pointer is really what you're looking for. And if you can convert a void* pointer value to an unsigned 32-bit or 64-bit integer, the maximum value of that integer is going to be 232-1 or 264-1, respectively.
The type uintptr_t, declared in <stdint.h>, is an unsigned integer type such that converting a void* value to uintptr_t and back again yields a value that compares equal to the original pointer. In short, the conversion (uintptr_t)ptr will not lose information.
<stdint.h> defines a macro UINTPTR_MAX, which is the maximum value of type uintptr_t. That's not exactly the "maximum value of a pointer", but it's probably what you're looking for.
(On many systems, including Mac OSX, pointers are represented as if they were integers that can be used as indices into a linear monolithic address space. That's a common memory model, but it's not actually required by the C standard. For example, some systems may represent a pointer as a combination of a descriptor and an offset, which makes comparisons between arbitrary pointer values difficult or even impossible.)
The <stdint.h> header and the uintptr_t type were added to the C language by the 1999 standard. For MacOS, you shouldn't have to worry about pre-C99 compilers.
Note also that the uintptr_t type is optional. If pointers are bigger than any available integer type, then the implementation won't define uintptr_t. Again, you shouldn't have to worry about that for MacOS. If you want to be fanatical about portable code, then you can use
#include <stdint.h>
#ifdef UINTPTR_MAX
/* uintptr_t exists */
#else
/* uintptr_t doesn't exist; do something else */
#endif
where "something else" is left as an exercise.
You probably are looking for the value of UINTPTR_MAX defined in <stdint.h>.
As ouah's answer says, uintptr_t sounds like the type you really want.
unsigned long is not guaranteed to to be able to represent a pointer value. Use uintptr_t which is an unsigned integer type that can hold a pointer value.

Is the sizeof(enum) == sizeof(int), always?

Is the sizeof(enum) == sizeof(int), always ?
Or is it compiler dependent?
Is it wrong to say, as compiler are optimized for word lengths (memory alignment) ie y int is the word-size on a particular compiler? Does it means that there is no processing penalty if I use enums, as they would be word aligned?
Is it not better if I put all the return codes in an enum, as i clearly do not worry about the values it get, only the names while checking the return types. If this is the case wont #DEFINE be better as it would save memory.
What is the usual practice?
If I have to transport these return types over a network and some processing has to be done at the other end, what would you prefer enums/#defines/ const ints.
EDIT - Just checking on net, as complier don't symbolically link macros, how do people debug then, compare the integer value with the header file?
From Answers —I am adding this line below, as I need clarifications—
"So it is implementation-defined, and
sizeof(enum) might be equal to
sizeof(char), i.e. 1."
Does it not mean that compiler checks for the range of values in enums, and then assign memory. I don't think so, of course I don't know. Can someone please explain me what is "might be".
It is compiler dependent and may differ between enums. The following are the semantics
enum X { A, B };
// A has type int
assert(sizeof(A) == sizeof(int));
// some integer type. Maybe even int. This is
// implementation defined.
assert(sizeof(enum X) == sizeof(some_integer_type));
Note that "some integer type" in C99 may also include extended integer types (which the implementation, however, has to document, if it provides them). The type of the enumeration is some type that can store the value of any enumerator (A and B in this case).
I don't think there are any penalties in using enumerations. Enumerators are integral constant expressions too (so you may use it to initialize static or file scope variables, for example), and i prefer them to macros whenever possible.
Enumerators don't need any runtime memory. Only when you create a variable of the enumeration type, you may use runtime memory. Just think of enumerators as compile time constants.
I would just use a type that can store the enumerator values (i should know the rough range of values before-hand), cast to it, and send it over the network. Preferably the type should be some fixed-width one, like int32_t, so it doesn't come to conflicts when different machines are involved. Or i would print the number, and scan it on the other side, which gets rid of some of these problems.
Response to Edit
Well, the compiler is not required to use any size. An easy thing to see is that the sign of the values matter - unsigned types can have significant performance boost in some calculations. The following is the behavior of GCC 4.4.0 on my box
int main(void) {
enum X { A = 0 };
enum X a; // X compatible with "unsigned int"
unsigned int *p = &a;
}
But if you assign a -1, then GCC choses to use int as the type that X is compatible with
int main(void) {
enum X { A = -1 };
enum X a; // X compatible with "int"
int *p = &a;
}
Using the option --short-enums of GCC, that makes it use the smallest type still fitting all the values.
int main() {
enum X { A = 0 };
enum X a; // X compatible with "unsigned char"
unsigned char *p = &a;
}
In recent versions of GCC, the compiler flag has changed to -fshort-enums. On some targets, the default type is unsigned int. You can check the answer here.
C99, 6.7.2.2p4 says
Each enumerated type shall be
compatible with char, a signed
integer type, or an unsigned
integer type. The choice of type
is implementation-defined,108) but
shall be capable of representing the
values of all the members of the
enumeration. [...]
Footnote 108 adds
An implementation may delay the choice of which integer
type until all enumeration constants have been seen.
So it is implementation-defined, and sizeof(enum) might be equal to sizeof(char), i.e. 1.
In chosing the size of some small range of integers, there is always a penalty. If you make it small in memory, there probably is a processing penalty; if you make it larger, there is a space penalty. It's a time-space-tradeoff.
Error codes are typically #defines, because they need to be extensible: different libraries may add new error codes. You cannot do that with enums.
Is the sizeof(enum) == sizeof(int), always
The ANSI C standard says:
Each enumerated type shall be compatible with char, a signed integer type, or an unsigned integer type. The choice of type is implementation-defined. (6.7.2.2 Enumerationspecifiers)
So I would take that to mean no.
If this is the case wont #DEFINE be better as it would save memory.
In what way would using defines save memory over using an enum? An enum is just a type that allows you to provide more information to the compiler. In the actual resulting executable, it's just turned in to an integer, just as the preprocessor converts a macro created with #define in to its value.
What is the usual practise. I if i have to transport these return types over a network and some processing has to be done at the other end
If you plan to transport values over a network and process them on the other end, you should define a protocol. Decide on the size in bits of each type, the endianess (in which order the bytes are) and make sure you adhere to that in both the client and the server code. Also don't just assume that because it happens to work, you've got it right. It just might be that the endianess, for example, on your chosen client and server platforms matches, but that might not always be the case.
No.
Example: The CodeSourcery compiler
When you define an enum like this:
enum MyEnum1 {
A=1,
B=2,
C=3
};
// will have the sizeof 1 (fits in a char)
enum MyEnum1 {
A=1,
B=2,
C=3,
D=400
};
// will have the sizeof 2 (doesn't fit in a char)
Details from their mailing list
On some compiler the size of an enum is depending on how many entry's are in the Enum. (less than 255 Entrys => Byte, More than 255 Entrys int)
But this is depending on the Compiler and the Compiler Settings.
enum fruits {apple,orange,strawberry,grapefruit};
char fruit = apple;
fruit = orange;
if (fruit < strawberry)
...
all of this works perfectly
if you want a specific underlying type for an enum instance, just don't use the type itself.

Resources