C - Writing on Windows, Compiling on UNIX, and Vice Versa

I am planning to write an ANSI C program on Windows with NetBeans using the Cygwin suite, and later on I want to compile the source code on a UNIX-family OS and use the program there. Should I worry about any kind of compatibility problems?

If you use only the functionality described in the C standard, the set of possible incompatibilities typically reduces to:
signedness of char
sizes of all types (e.g. int=long=32-bit on Windows, not necessarily so on UNIX), and I mean literally all of them, including pointers and enums
poorly thought out type conversions and casts, especially involving pointers and negative values (mixing of signed and unsigned types in expressions is error-prone too)
alignment of types and their padding in structures/unions
endianness
order of bitfields
implementation-defined/specific behavior, e.g. right shifts of negative values, or rounding and signs when dividing signed values (see the sketch just after this list)
floating-point: different implementation and different optimization
unspecified behavior, e.g. the order of function-argument and subexpression evaluation, the direction in which memcpy() copies data (low to high addresses or the other way around), etc.
undefined behavior, e.g. i+i++ or a[i]=i++, modifying string literals, dereferencing a pointer after the object it points to is gone (e.g. free()'d), not using or misusing const and volatile, etc.
supplying standard library functions with inappropriate parameters leading to undefined behavior, e.g. calling printf()-like functions with wrong number or kind of parameters
non-ASCII characters/strings
filename formats (special chars, length, case sensitivity)
clock/time/locale formats/ranges/values/configuration
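Two of the items above in code form; a minimal sketch in which both printed values are implementation-defined:

#include <stdio.h>

int main(void)
{
    char c = (char)0xFF; /* signedness of char is implementation-defined */
    int n = -8;

    printf("%d\n", c);      /* -1 if char is signed, 255 if unsigned */
    printf("%d\n", n >> 1); /* right shift of a negative value: often -4
                               (arithmetic shift), but not guaranteed */
    return 0;
}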
There's much more. You actually have to read the standard and note what's guaranteed to work and what's not and if there are any conditions.
If you use something outside of the C standard, that functionality may not be available or identical on a different platform.
So, yes, it's possible, but you have to be careful. It's usually the assumptions that you make that make your code poorly portable.

There will be compatibility problems, but as long as you stick to basic UNIX functionality, they ought to be manageable for command-line applications. However, if your app has a GUI or has to interact with other programs in the UNIX environment, you'll probably regret your approach.
Another way to go would be to run the appropriate flavor of UNIX in VirtualBox on your desktop; then you can be pretty sure there are no compatibility problems.

Related

Should one use Named Address Spaces where they are available?

There are some architectures which have multiple address spaces; notable examples are true Harvard architectures, but OpenCL, for example, also has this property.
C compilers may provide some solutions to this. One of these is Named Address Spaces, which supports special pointer qualifiers to indicate a pointer's address space, but other solutions might also be present.
For GCC, the corresponding documentation is here: https://gcc.gnu.org/onlinedocs/gcc-4.7.0/gcc/Named-Address-Spaces.html
For IAR targeting the AVR, the corresponding documentation is here: https://www.iar.com/support/tech-notes/compiler/strings-with-iccavr-2.x/ (note that this predates GCC's support; GCC likely adapted the idea for its 8-bit AVR target).
For SDCC (Small Device C Compiler): http://sdcc.sourceforge.net/doc/sdccman.pdf , starting on page 36. Covers microcontrollers like the 8051, Z80 and 68HC08.
Some information for OpenCL: https://www.khronos.org/registry/OpenCL/sdk/1.1/docs/man/xhtml/local.html and https://software.intel.com/en-us/articles/the-generic-address-space-in-opencl-20
I didn't know about them before, and on the architecture I am using (8-bit AVR), there is another solution for the problem: specialized macros (pgmspace.h) for working with data in ROM. But there are no type checks on these, and they (in my opinion) make code ugly, so it seems to me that using Named Address Spaces is a superior, and possibly even more portable, way to deal with the problem (portable in the sense that such software could easily be ported to a target with a single address space by providing empty definitions for the address-space qualifiers).
However, in a previous question, where I first learned of their availability, answers suggesting the use of Named Address Spaces were severely downvoted: How to make two otherwise identical pointer types incompatible
The downvoters didn't provide any explanation, and I couldn't find one myself; to me, Named Address Spaces seem like a good and perfectly functional way of dealing with the problem.
Could anyone provide an explanation? Why should Named Address Spaces probably not be used, in favor of whatever other method is available on a target with multiple distinct address spaces?
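(For concreteness, a minimal sketch of what GCC's flavor looks like on an 8-bit AVR target, assuming avr-gcc; the __flash qualifier places the array in program memory and the compiler emits the right instructions to read it:)

#include <stdint.h>

const __flash uint8_t table[] = { 1, 2, 3 };

uint8_t first(void)
{
    /* Assigning table to a plain uint8_t * is rejected by the compiler,
       which is exactly the type check the pgmspace.h macros cannot give. */
    return table[0];
}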
Another approach is to steal a technique used in things like the Linux kernel and tools like smatch.
Linux has defines like
#define __user
which mean the code can say things like int foo(const __user char *p). The compiler ignores the __user, but tools like smatch are then used to make sure that pointers don't accidentally wander between address spaces.
The problem with these is obvious: they only work on the gcc compiler.
And in the embedded systems branch there are lots of different compilers, each offering its own unique, non-portable way to do this. Sometimes that is fine (most embedded projects never get ported to different compilers) but from a generic point-of-view, it is not.
(The very same issue also exists with extended addresses, for example if you use an 8- or 16-bit MCU with more than 64 KiB of addressable memory. Compilers then use various non-standard extensions such as near and far.)
One solution to these problems is to make a "wrapper" around the compiler-specific behavior by creating a hardware abstraction layer (HAL). There you specify that the type used for storing data in flash is flash_byte_t or some such, and from your HAL you include a compiler-specific header file containing the actual typedef, such as typedef const __flash uint8_t flash_byte_t;. For example, the application includes "compiler.h", and this one in turn includes "gcc.h". That way you only need to rewrite one small header file when you switch compilers.
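A minimal sketch of that layering, using the file split described above (the __flash qualifier assumes avr-gcc):

/* compiler.h - the only header the application includes */
#if defined(__GNUC__)
  #include "gcc.h"
#elif defined(__IAR_SYSTEMS_ICC__)
  #include "iar.h"
#else
  #error "unsupported compiler"
#endif

/* gcc.h - the avr-gcc specific part */
#include <stdint.h>
typedef const __flash uint8_t flash_byte_t;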
Also, as it turns out, C allows const flash_byte_t just fine even though the typedef already contains const. There's a special, odd rule in C saying that you can repeat a qualifier in a declaration as many times as you like, so const const int x is equivalent to const int x. This means that if the user adds extra const qualification, that's fine.
Note that it's mostly AVR being a special exception here, because of its weird Harvard model.
Otherwise, there's an industry de facto standard convention used by most compilers: all const qualified variables with static storage duration should be allocated in flash. Of course the C standard makes no guarantees of this (it is out of scope of the standard), but most embedded compilers behave like that.

How to make C code from Tru64 UNIX work on 64-bit Linux?

I want to know the probable problems faced while moving C programs (e.g. a server process) from Tru64 UNIX to 64-bit Linux, and why they arise. What modifications would the program probably need, or would simply recompiling the source code in the new environment suffice, given that both are 64-bit platforms? I am a little confused; I need to know before I start working on it.
I spent a lot of time in the early 90s (OMG I feel old...) porting 32-bit code to the Alpha architecture. This was back when it was called OSF/1.
You are unlikely to have any difficulties relating to the bit-width when going from Alpha to x86_64.
Developers are much more aware of the problems caused by assuming that sizeof(int) == sizeof(void *), for example. That was far and away the most common problem I used to have when porting code to Alpha.
Where you do find differences, they will be in the two systems' conformance to various API specifications, e.g. POSIX, X/Open, etc. That said, such differences are normally easily worked around.
If the Alpha code has used the SVR4-style APIs (e.g. STREAMS), then you may have more difficulty than if it has used the more BSD-like APIs.
"64-bit" is only a rough classification of an architecture.
Ideally your code would have used only "semantic" types for all variable declarations: in particular size_t and ptrdiff_t for sizes and pointer arithmetic, and the [u]intXX_t types where a particular width is assumed.
If this is not the case, the main point would be to compare all the standard arithmetic types (all integer types, floating-point types and pointers) and check whether they map to the same concept on both platforms. If you find differences, you know the potential trouble spots.
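A short sketch of what "semantic" typing looks like in practice (the function names are just for illustration):

#include <stddef.h>
#include <stdint.h>

/* sizes and pointer arithmetic: size_t / ptrdiff_t */
size_t span(const char *begin, const char *end)
{
    ptrdiff_t d = end - begin;
    return (size_t)d;
}

/* externally imposed widths (file formats, protocols): [u]intXX_t */
uint32_t read_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}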
Check the 64-bit data model used by both platforms. Most 64-bit Unix-like OSes use LP64, so it is likely that your target platform uses the same data model. This being the case, you should have few problems, given that the code itself compiles and links.
If you use the same compiler (e.g. GCC) on both platforms, you also need not worry about incompatible compiler extensions or differences in undefined or implementation-defined behaviour. Such behaviour should be avoided in any case, even if the compilers are the same, since it may differ between target architectures. If you are not using the same compiler, then you need to be cautious about using extensions. #pragma directives are a particular issue, since a compiler is allowed to quietly ignore a #pragma it does not recognise.
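For example (a sketch; whether this packing pragma is honoured or silently ignored is entirely up to the compiler):

#include <stdint.h>

#pragma pack(1)     /* a compiler that does not recognise this quietly ignores it */
struct wire_header {
    uint8_t  tag;
    uint32_t value; /* at offset 1 only if the pragma took effect;
                       otherwise padding changes the layout */
};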
Finally, in order to compile and link, any library dependencies outside the C standard library need to be available on both platforms. Most OS calls will be available, since Tru64 and Linux share the same POSIX API.

Is it possible to dynamically create equivalent of limits.h macros during compilation?

The main reason for this is an attempt to write a perfectly portable C library. After a few weeks I ended up with constants, which are unfortunately not very flexible (using constants to define other constants isn't possible).
Thanks for any advice or criticism.
What you ask is impossible. As stated before me, any standards-compliant implementation of C will have limits.h correctly defined. If it's incorrect for whatever reason, blame the vendor of the compiler. Any "dynamic" discovery of the true limits wouldn't be possible at compile time anyway, especially if you're cross-compiling for an embedded system, where the target architecture might have smaller integers than the compiling system.
To dynamically discover the limits, you would have to do it at run time by bit-shifting, multiplying, or adding until an overflow is encountered, but then you have a variable in memory rather than a constant, which would be significantly slower. (This wouldn't be reliable anyway, since different architectures use different bit-level representations, and arithmetic sometimes gets a bit funky around the limits, especially with signed types and abstract number representations such as floats.)
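For illustration only, here is what run-time discovery looks like for an unsigned type, where wraparound is well defined (signed overflow is undefined behavior, so the same trick cannot portably probe INT_MAX):

#include <stdio.h>

int main(void)
{
    unsigned int umax = ~0u; /* all value bits set: UINT_MAX by definition */
    unsigned int u = umax;
    int bits = 0;

    while (u) { u >>= 1; ++bits; } /* count the value bits */
    printf("unsigned int max: %u (%d bits)\n", umax, bits);
    return 0;
}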
Just use the standard types and limits as found in stdint.h and limits.h, or try to avoid pushing the limits altogether.
First thing that comes to my mind: have you considered using stdint.h? Thanks to that your library will be portable across C99-compliant compilers.

What parts of C are most portable?

I recently read an interview with Lua co-creators Luiz H. de Figueredo and Roberto Ierusalimschy, where they discussed the design, and implementation of Lua. It was very intriguing to say the least. However, one part of the discussion brought something up in my mind. Roberto spoke of Lua as a "freestanding application" (that is, it's pure ANSI C that uses nothing from the OS.) He said, that the core of Lua was completely portable, and because of its purity has been able to be ported much more easily and to platforms never even considered (such as robots, and embedded devices.)
Now this makes me wonder. C in general is a very portable language. So, what parts of C (namely, those in the standard library) are the least portable, and which can be expected to work on most platforms? Should only a limited set of data types be used (e.g. avoiding short and maybe float)? What about FILE and the stdio system? malloc and free? It seems that Lua avoids all of these. Is that taking things to the extreme, or are they the root of portability issues? Beyond this, what other things can be done to make code extremely portable?
The reason I'm asking all of this is that I'm currently writing an application in pure C89, and it's optimal that it be as portable as possible. I'm willing to take a middle road in implementing it (portable enough, but not so much that I have to write everything from scratch). Anyway, I just wanted to see what in general is key to writing the best C code.
As a final note, all of this discussion is related to C89 only.
In the case of Lua, we don't have much to complain about in the C language itself, but we have found that the C standard library contains many functions that seem harmless and straightforward to use, until you consider that they do not check their input for validity (which is fine, if inconvenient). The C standard says that handling bad input is undefined behavior, allowing those functions to do whatever they want, even crash the host program. Consider, for instance, strftime. Some libcs simply ignore invalid format specifiers, but other libcs (e.g., on Windows) crash! Now, strftime is not a crucial function. Why crash instead of doing something sensible? So, Lua has to do its own validation of input before calling strftime, and exporting strftime to Lua programs becomes a chore. Hence, we have tried to steer clear of these problems in the Lua core by aiming at freestanding for the core. But the Lua standard libraries cannot do that, because their goal is to export facilities to Lua programs, including what is available in the C standard library.
"Freestanding" has a particular meaning in the context of C. Roughly, freestanding hosts are not required to provide any of the standard libraries, including the library functions malloc/free, printf, etc. Certain standard headers are still required, but they only define types and macros (for example stddef.h).
C89 allows two types of compilers: hosted and freestanding. The basic difference is that a hosted compiler provides all of the C89 library, while a freestanding compiler need only provide <float.h>, <limits.h>, <stdarg.h>, and <stddef.h>. If you limit yourself to these headers, your code will be portable to any C89 compiler.
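A sketch of code that stays within the freestanding subset (only headers from the list above, no libc calls; the caller supplies all memory):

#include <stddef.h>

size_t count_set_bits(const unsigned char *buf, size_t len)
{
    size_t i, bits = 0;
    for (i = 0; i < len; ++i) {
        unsigned int b = buf[i];
        while (b) { bits += b & 1u; b >>= 1; }
    }
    return bits;
}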
This is a very broad question. I'm not going to give a definitive answer; instead, I'll raise some issues.
Note that the C standard specifies certain things as "implementation-defined"; a conforming program will always compile on and run on any conforming platform, but it may behave differently depending on the platform. Specifically, there's
Word size. sizeof(long) may be four bytes on one platform, eight on another. The sizes of short, int, long etc. each have some minimum (often relative to each other), but otherwise there are no guarantees.
Endianness. int a = 0x00ff; int b = ((char *)&a)[0]; may assign -1 to b on one platform, 0 on another (see the sketch below).
Character encoding. \0 is always the null byte, but how the other characters show up depends on the OS and other factors.
Text-mode I/O. putchar('\n') may produce a line-feed character on one platform, a carriage return on the next, and a combination of each on yet another.
Signedness of char. It may or it may not be possible for a char to take on negative values.
Byte size. While nowadays, a byte is eight bits virtually everywhere, C caters even to the few exotic platforms where it is not.
Various word sizes and endiannesses are common. Character encoding issues are likely to come up in any text-processing application. Machines with 9-bit bytes are most likely to be found in museums. This is by no means an exhaustive list.
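A common way to observe the endianness difference at run time (a sketch):

#include <stdio.h>

int main(void)
{
    unsigned int x = 1;
    unsigned char first = *(unsigned char *)&x; /* lowest-addressed byte */

    printf(first ? "little-endian\n" : "big-endian\n");
    return 0;
}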
(And please don't write C89, that's an outdated standard. C99 added some pretty useful stuff for portability, such as the fixed-width integers int32_t etc.)
C was designed so that a compiler may be written to generate code for any platform and call the language it compiles, "C". Such freedom acts in opposition to C being a language for writing code that can be used on any platform.
Anyone writing C code must decide (either deliberately or by default) what sizes of int they will support; while it is possible to write C code which will work with any legal size of int, it requires considerable effort, and the resulting code will often be far less readable than code designed for a particular integer size. For example, suppose one has a variable x of type uint32_t and wishes to multiply it by another, y, computing the result mod 4294967296. The statement x*=y; will work on platforms where int is 32 bits or smaller, or where int is 65 bits or larger, but will invoke undefined behavior where int is 33 to 64 bits and the product, if the operands were regarded as whole numbers rather than members of an algebraic ring that wraps mod 4294967296, would exceed INT_MAX. One could make the statement work independent of the size of int by rewriting it as x*=1u*y;, but doing so makes the code less clear, and accidentally omitting the 1u* from one of the multiplications could be disastrous.
Under the present rules, C is reasonably portable if code is only used on machines whose integer size matches expectations. On machines where the size of int does not match expectations, code is not likely to be portable unless it includes enough type coercions to render most of the language's typing rules irrelevant.
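A minimal sketch of the hazard and the rewrite described above:

#include <stdint.h>

uint32_t mul_mod_2_32(uint32_t x, uint32_t y)
{
    /* If int is 33..64 bits wide, x and y promote to signed int and a
       large product overflows: undefined behavior. So not just x *= y;
       multiplying by 1u first keeps the arithmetic unsigned on any
       conforming implementation. */
    x *= 1u * y;
    return x;
}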
Anything that is a part of the C89 standard should be portable to any compiler that conforms to that standard. If you stick to pure C89, you should be able to port it fairly easily. Any portability problems would then be due to compiler bugs or places where the code invokes implementation-specific behavior.

When should I use type abstraction in embedded systems

I've worked on a number of different embedded systems. They have all used typedefs (or #defines) for types such as UINT32.
This is a good technique, as it drives home the size of the type and makes the programmer more conscious of the chances for overflow, etc.
But on some systems you know that the compiler and processor won't change for the life of the project.
So what should influence your decision to create and enforce project-specific types?
EDIT
I think I managed to lose the gist of my question, and maybe it's really two.
With embedded programming you may need types of specific size for interfaces and also to cope with restricted resources such as RAM. This can't be avoided, but you can choose to use the basic types from the compiler.
For everything else the types have less importance.
You need to be careful not to cause overflow, and you may need to watch out for register and stack usage, which may lead you to types such as UINT16 or UCHAR.
Using types such as UCHAR can add compiler 'fluff' however. Because registers are typically larger, some compilers may add code to force the result into the type.
i++;
can become
ADD REG,1
AND REG, 0xFF
which is unnecessary.
So I think my question should have been:
Given the constraints of embedded software, what is the best policy to set for a project which will have many people working on it, not all of whom will be of the same level of experience?
I use type abstraction very rarely. Here are my arguments, sorted in increasing order of subjectivity:
Local variables are different from struct members and arrays in the sense that you want them to fit in a register. On a 32b/64b target, a local int16_t can make code slower compared to a local int, since the compiler will have to add operations to /force/ overflow according to the semantics of int16_t. While C99 defines an int_fast16_t typedef, AFAIK a plain int will fit in a register just as well, and it sure is a shorter name (see the sketch after these points).
Organizations which like these typedefs almost invariably end up with several of them (INT32, int32_t, INT32_T, ad infinitum). Organizations using built-in types are thus better off, in a way, having just one set of names. I wish people used the typedefs from stdint.h or windows.h or anything existing; and when a target doesn't have that .h file, how hard is it to add one?
The typedefs can theoretically aid portability, but I, for one, never gained a thing from them. Is there a useful system you can port from a 32b target to a 16b one? Is there a 16b system that isn't trivial to port to a 32b target? Moreover, if most vars are ints, you'll actually gain something from the 32 bits on the new target, but if they are int16_t, you won't. And the places which are hard to port tend to require manual inspection anyway; before you try a port, you don't know where they are. Now, if someone thinks it's so easy to port things when you have typedefs all over the place: when the time comes to port, which happens to few systems, write a script converting all the names in the code base. That should work according to the "no manual inspection required" logic, and it postpones the effort to the point in time where it actually gives a benefit.
Now if portability may be a theoretical benefit of the typedefs, readability sure goes down the drain. Just look at stdint.h: {int,uint}{max,fast,least}{8,16,32,64}_t. Lots of types. A program has lots of variables; is it really that easy to understand which need to be int_fast16_t and which need to be uint_least32_t? How many times are we silently converting between them, making them entirely pointless? (I particularly like BOOL/Bool/eBool/boolean/bool/int conversions. Every program written by an orderly organization mandating typedefs is littered with that).
Of course in C++ we could make the type system more strict, by wrapping numbers in template class instantiations with overloaded operators and stuff. This means that you'll now get error messages of the form "class Number<int,Least,32> has no operator+ overload for argument of type class Number<unsigned long long,Fast,64>, candidates are..." I don't call this "readability", either. Your chances of implementing these wrapper classes correctly are microscopic, and most of the time you'll wait for the innumerable template instantiations to compile.
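A sketch of the first point, with hypothetical function names; on a 32-bit target the first version may pick up extra narrowing instructions inside the loop:

#include <stdint.h>

int16_t sum_narrow(const int16_t *a, int n)
{
    int16_t s = 0;
    int i;
    for (i = 0; i < n; ++i)
        s = (int16_t)(s + a[i]); /* wrapped back to 16 bits every step */
    return s;
}

int_fast16_t sum_fast(const int16_t *a, int n)
{
    int_fast16_t s = 0; /* typically just int: stays in a register */
    int i;
    for (i = 0; i < n; ++i)
        s += a[i];
    return s;
}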
The C99 standard has a number of standard sized-integer types. If you can use a compiler that supports C99 (gcc does), you'll find these in <stdint.h> and you can just use them in your projects.
Also, it can be especially important in embedded projects to use types as a sort of "safety net" for things like unit conversions. If you can use C++, I understand that there are some "unit" libraries out there that let you work in physical units defined by the C++ type system (via templates), compiled down to operations on the underlying scalar types. For example, these libraries won't let you add a distance_t to a mass_t, because the units don't line up; you'll actually get a compiler error.
Even if you can't work in C++ or another language that lets you write code that way, you can at least use the C type system to help you catch errors like that by eye. (That was actually the original intent of Simonyi's Hungarian notation.) Just because the compiler won't yell at you for adding a meter_t to a gram_t doesn't mean you shouldn't use types like that; code reviews will then be much more productive at discovering unit errors.
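Even in plain C you can get a small compiler-checked version of this by wrapping scalars in distinct struct types (a sketch; all names are illustrative):

typedef struct { double v; } meter_t;
typedef struct { double v; } gram_t;

meter_t add_meters(meter_t a, meter_t b)
{
    meter_t r;
    r.v = a.v + b.v;
    return r;
}

/* add_meters(distance, mass), where mass is a gram_t, is now a
   compile-time error: struct types do not convert implicitly. */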
My opinion is: if you depend on a minimum/maximum/specific size, don't just assume that (say) an unsigned int is 32 bits; use uint32_t instead (assuming your compiler supports C99).
I like using stdint.h types for defining system APIs specifically because they explicitly say how large items are. Back in the old days of Palm OS, the system APIs were defined using a bunch of wishy-washy types like "Word" and "SWord" that were inherited from very classic Mac OS. They did a cleanup to instead say Int16 and it made the API easier for newcomers to understand, especially with the weird 16-bit pointer issues on that system. When they were designing Palm OS Cobalt, they changed those names again to match stdint.h's names, making it even more clear and reducing the amount of typedefs they had to manage.
I believe that MISRA standards suggest (require?) the use of typedefs.
From a personal perspective, using typedefs leaves no confusion as to the size (in bits/bytes) of certain types. I have seen lead developers attempt both ways of developing: using standard types, e.g. int, and using custom types, e.g. UINT32.
If the code isn't portable, there is little real benefit in using typedefs. However, if like me you work on both kinds of software (portable and fixed-environment), then it can be useful to keep a standard and use the customised types. At the very least, as you say, the programmer is then very much aware of how much memory they are using. Another factor to consider is how sure you are that the code will never be ported to another environment. I've seen processor-specific code have to be translated because a hardware engineer suddenly had to change a board; that is not a nice situation to be in, but without the custom typedefs it could have been a lot worse!
Consistency, convenience and readability. "UINT32" is much more readable and writable than "unsigned long", which is its equivalent on some systems.
Also, the compiler and processor may be fixed for the life of a project, but the code from that project may find new life in another project. In this case, having consistent data types is very convenient.
If your embedded system is in some way a safety-critical system (or similar), it's strongly advised (if not required) to use typedefs over plain types.
As TK. has said before, MISRA-C has an (advisory) rule to do so:
Rule 6.3 (advisory): typedefs that indicate size and signedness should be used in place of the basic numerical types.
(from MISRA-C 2004; it's Rule #13 (adv) of MISRA-C 1998)
The same also applies to C++ in this area; e.g. the JSF C++ coding standard:
AV Rule 209: A UniversalTypes file will be created to define all standard types for developers to use. The types include: [uint16, int16, uint32_t etc.]
Using <stdint.h> makes your code more portable for unit testing on a PC.
It can bite you pretty hard when you have tests for everything but it still breaks on your target system, because an int is suddenly only 16 bits long.
Maybe I'm weird, but I use ub, ui, ul, sb, si, and sl for my integer types. Perhaps the "i" for 16 bits seems a bit dated, but I like the look of ui/si better than uw/sw.
