Writing a portable C program - which things to consider?

For a project at university I need to extend an existing C application, which shall in the end run on a wide variety of commercial and non-commercial unix systems (FreeBSD, Solaris, AIX, etc.).
Which things do I have to consider when I want to write a C program that is as portable as possible?

The best advice I can give is to move to a different platform every day, testing as you go.
This will make the platform differences stick out like a sore thumb, and teach you the portability issues at the same time.
Saving the cross-platform testing for the end will lead to failure.
That aside:
Integer sizes can vary.
Floating-point numbers might be represented differently.
Integers can have different endianness.
Compilation options can vary.
Include file names can vary.
Bit-field implementations will vary.
It is generally a good idea to set your compiler warning level as high as possible,
to see the sorts of things the compiler can complain about.

I used to write C utilities that I would then support on 16 bit to 64 bit architectures, including some 60 bit machines. They included at least three varieties of "endianness," different floating point formats, different character encodings, and different operating systems (though Unix predominated).
Stay as close to standard C as you can. For functions/libraries not part of the standard, use as widely supported a code base as you can find. For example, for networking, use the BSD socket interface, with zero or minimal use of low level socket options, out-of-band signalling, etc. To support a wide number of disparate platforms with minimal staff, you'll have to stay with plain vanilla functions.
Be very aware of what's guaranteed by the standard, versus what's typical implementation behavior. For instance, pointers are not necessarily the same size as integers, and pointers to different data types may have different lengths. If you must make implementation-dependent assumptions, document them thoroughly. Lint, or --strict, or whatever your development toolset has as an equivalent, is vitally important here.
Header files are your friend. Use implementation-defined macros and constants. Use header definitions and #ifdef to help isolate those instances where you need to cover a small number of alternatives.
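As a minimal sketch of that idea, a project header can check its assumptions and pick types using only macros from the standard <limits.h>. The header name portability.h, the type name u32_least and the helper u32_least_bits are hypothetical, chosen only for illustration.

```c
/* portability.h - hypothetical project header (the name is an assumption) */
#include <limits.h>
#include <stddef.h>

/* Fail the build early if an assumption the code relies on does not hold. */
#if CHAR_BIT != 8
#error "This code assumes 8-bit bytes"
#endif

/* Pick an unsigned type with at least 32 bits using only standard macros. */
#if UINT_MAX >= 0xFFFFFFFFu
typedef unsigned int u32_least;
#else
typedef unsigned long u32_least;
#endif

/* Number of bits occupied by u32_least on this platform. */
static size_t u32_least_bits(void)
{
    return sizeof(u32_least) * CHAR_BIT;
}
```

The rest of the code then uses u32_least and never tests the platform itself, so the #if stays in one place.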
Don't assume the current platform uses EBCDIC characters and packed decimal integers. There are a fair number of ASCII - two's complement machines out there as well. :-)
With all that, if you avoid the temptation to write things multiple times and #ifdef major portions of code, you'll find that coding and testing across disparate platforms helps find bugs sooner. You'll end up producing more disciplined, understandable, maintainable programs.

Use at least two compilers.
Have a continuous build system in place, which preferably builds on the various target platforms.
If you do not need to work very low-level, try to use some library that provides abstraction. It is unlikely that you won't find third-party libraries that provide good abstraction for the things you need. For example, for network and communication, there is ACE. Boost (e.g. filesystem) is also ported to several platforms. These are C++ libraries, but there may be other C libraries too (like curl).
If you have to work at the low level, be aware that platforms occasionally behave differently even on things like POSIX, where they are supposed to behave the same. You can have a look at the source code of the libraries above.

One particular issue that you may need to stay abreast of (for instance, if your data files are expected to work across platforms) is endianness.
Numbers are represented differently at the binary level on different architectures. Big-endian systems order the most significant byte first and little-endian systems order the least-significant byte first.
If you write some raw data to a file in one endianness and then read that file back on a system with a different endianness you will obviously have issues.
You should be able to get the endianness at compile-time on most systems from sys/param.h. If you need to detect it at runtime, one method is to use a union of an int and a char array, then set the int to 1 and see which byte of the array holds the 1.
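A minimal sketch of the runtime check, assuming uint32_t is available: it stores 1 in the integer member of a union and inspects the first byte of the overlapping byte array. The function name is_little_endian is hypothetical.

```c
#include <stdint.h>

/* Returns 1 if the least significant byte is stored first, 0 otherwise. */
static int is_little_endian(void)
{
    union {
        uint32_t word;
        unsigned char bytes[sizeof(uint32_t)];
    } probe;

    probe.word = 1u;              /* only the least significant byte is non-zero */
    return probe.bytes[0] == 1;   /* is that byte at the lowest address? */
}
```

This only distinguishes little-endian from everything else; it does not identify mixed-endian layouts.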

It is a very long list. The best thing to do is to read examples. The source of perl, for example. If you look at the source of perl, you will see a gigantic process of constructing a header file that deals with about 50 platform issues.
Read it and weep, or borrow.

The list may be long, but it's not nearly as long as when also supporting Windows and MSDOS, which is common with many utilities.
The usual technique is to separate core algorithm modules from those which deal with the operating system—basically a strategy of layered abstraction.
Differentiating amongst several flavors of unix is rather simple in comparison. Either stick to features all use the same RTL names for, or look at the majority convention for the platforms supported and #ifdef in the exceptions.

Continually refer to the POSIX standards for any library functions you use. Parts of the standard are ambiguous and some systems return different styles of error codes. This will help you to proactively find those really hard to find slightly different implementation bugs.


Why doesn't the standard require that struct members be padded minimally?

The standard doesn't seem to impose any padding requirements on struct members, even though it does prohibit reordering. How likely is it that a C platform will not pad minimally, i.e., not add only the minimum amount of padding needed to make sure the next member (or instance of the same struct, if this is the last member) is sufficiently aligned for its type?
Is it even sensible of the standard not to require that padding be minimal?
I'm asking because this lack of a padding guarantee seems to prevent me from portably representing serialized objects as structs (even if I limit myself to just uint8_t arrays as members, compilers seem to be allowed to add padding in between them), and I'm finding it a little weird to have to resort to offset arithmetic there.
How likely is it that a C platform will not pad minimally, i.e., not add only the minimum amount of padding needed to make sure the next member (or instance of the same struct, if this is the last member) is sufficiently aligned for its type?
Essentially, the "extra" padding may allow significant compiler optimizations.
Unfortunately, I don't know if any compilers actually do that (and therefore cannot provide any estimate on its likelihood of occurring).
As a simple example, consider a 32-bit or 64-bit architecture, where the ABI states that string literals and character arrays are aligned to 32-bit or 64-bit boundary. Many of the C library functions are (also) implemented by the C compiler itself; see e.g. these lists for GCC. The compiler can track the parameters to see if they refer to a string literal or (the beginning of a) character array, and if so, replace e.g. strcmp() with an optimized built-in version (which does the comparison in 32-bit units, rather than char-at-a-time).
As a more complicated example, consider a RISC hardware architecture, where unaligned byte access is slower than aligned native word access. (For example, the former may be implemented in hardware as the latter, followed by a bit shift.) Such an architecture could have an ABI that requires all structure members to be word-aligned. Then, the C compiler would be required to add more-than-minimal padding.
Traditionally, the C standards committee has been very careful to not exclude any kind of hardware architecture from correctly implementing the language.
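To see what a particular compiler actually does, the padding can be measured with offsetof and sizeof. The sketch below uses a hypothetical struct made only of uint8_t arrays, as in the question; the struct and function names are assumptions for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical header made only of byte arrays, as in the question. */
struct header {
    uint8_t tag[3];
    uint8_t flags[2];
};

/* Padding bytes the compiler placed between tag and flags. */
static size_t padding_between(void)
{
    return offsetof(struct header, flags) - sizeof(uint8_t[3]);
}

/* Padding bytes the compiler placed after the last member. */
static size_t padding_at_end(void)
{
    return sizeof(struct header)
         - (offsetof(struct header, flags) + sizeof(uint8_t[2]));
}
```

On mainstream desktop compilers both functions typically return 0, but nothing in the standard promises that, which is the point of the answer above.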
Is it even sensible of the standard not to require that padding be minimal?
The purpose of the C standard used to be to ensure that C code would behave in the same manner if compiled with different compilers, and to allow implementation of the language on any sufficiently capable hardware architecture. In that sense, it is very sensible for the standard not to require minimal padding, as some ABIs may require more than minimal padding for whatever reason.
With the introduction of the Microsoft "extensions", the purpose of the C standard has shifted significantly, to binding C to C++ to ensure a C++ compiler can compile C code with minimal differences to C++ compilation, and to provide interfaces that can be marketed as "safer" with the actual purpose of balkanizing developers and binding them to a single vendor implementation. Because this is contrary to the previous purpose of the standard, and it is clearly non-sensible to standardize single-vendor functions like fscanf_s() while not standardizing multi-vendor functions like getline(), it may not be possible to define what sensible means anymore in the context of the C standard. It definitely does not match "good judgment"; it probably now refers to "being perceptible by the senses".
I'm asking because this lack of a padding guarantee seems to prevent me from portably representing serialized objects as structs
You are making the same mistake C programmers make, over and over again. Structs are not suitable for representing serialized objects. You should not use a struct to represent a network object, or a file header, because of the C struct rules.
Instead, you should use a simple character buffer, and either accessor functions (to extract or pack each member or field from the buffer), or conversion functions (to convert the buffer contents to a struct and vice versa).
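A minimal sketch of the conversion-function approach, assuming a made-up record with a 16-bit id and a 32-bit length stored big-endian on the wire. The struct, the wire layout and all function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical in-memory form; its layout is never written out directly. */
struct record {
    uint16_t id;
    uint32_t length;
};

enum { RECORD_WIRE_SIZE = 6 };  /* 2 bytes id + 4 bytes length, no padding */

static void put_u16_be(unsigned char *p, uint16_t v)
{
    p[0] = (unsigned char)(v >> 8);
    p[1] = (unsigned char)(v & 0xFFu);
}

static uint16_t get_u16_be(const unsigned char *p)
{
    return (uint16_t)(((unsigned)p[0] << 8) | p[1]);
}

static void put_u32_be(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v >> 24);
    p[1] = (unsigned char)((v >> 16) & 0xFFu);
    p[2] = (unsigned char)((v >> 8) & 0xFFu);
    p[3] = (unsigned char)(v & 0xFFu);
}

static uint32_t get_u32_be(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Struct -> wire bytes, at fixed offsets chosen by the format, not the compiler. */
static void record_pack(unsigned char *buf, const struct record *r)
{
    put_u16_be(buf, r->id);
    put_u32_be(buf + 2, r->length);
}

/* Wire bytes -> struct. */
static void record_unpack(struct record *r, const unsigned char *buf)
{
    r->id = get_u16_be(buf);
    r->length = get_u32_be(buf + 2);
}
```

Because every byte is placed by shifts and masks, the result does not depend on the host's padding, alignment or byte order.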
The underlying reason why even experienced programmers like the asker still would prefer to use a struct instead, is that the accessors/conversion involves a lot of extra code; having the compiler do it instead would be much better: less code, simpler code, easier to maintain.
And I agree. It would even be quite straightforward if a new keyword, say serialized_struct, were introduced to declare a serialized data structure with completely different member rules from traditional C structs. (Note that this support would not affect e.g. linking at all, so it really is not as complicated as one might think.) Additional attributes or keywords could be used to specify explicit byte order, and the compiler would do all the conversion details for us, in whatever way the compiler sees best for the specific architecture it compiles for. This support would only be available for new code, but it would be hugely beneficial in cutting down on interoperability issues -- and it would make a lot of serialization code simpler!
Unfortunately, when you combine the C standard committee's traditional dislike to adding new keywords, and the overall direction change from interoperability to vendor lock-in, there is no chance at all for anything like this to be included in the C standard.
Of course, as described in the comments, there are lots of C libraries that implement one serialization scheme or other. I've even written a few myself (for rather peculiar use cases, though). A sensible approach (poor pun intended) would be to pick a vibrant one (well maintained, with a lively community around the library), and use it.

Why use low-level languages, or ones close to that level (C), for embedded systems and not a high-level language, when everything is compiled to machine code anyway?

I have searched but I couldn't find a clear answer. If we compile the code on a (powerful) computer, then we are only sending machine instructions to the memory of the embedded device. As far as I understand, it should therefore make no difference which language we use because, in the end, we are only sending machine code to the embedded device; the compilation, which is the expensive phase, has already been done by a powerful machine!
Why use a language like C? Why not Java? We are sending machine code in the end.
The answer partly lies in the runtime requirements and platform-provided expectations of a language: The size of the runtime for C is minimal - it needs a stack and that is about it to be able to start running code. For a compliant implementation static data initialisation is required, but you can run code without it - the initialisation itself could even be written in C, and even heap and standard library initialisation are optional, as is the presence of a library at all. It need have no OS dependencies, no interpreter and no virtual machine.
Most other languages require a great deal more runtime support and this is usually provided by an OS, runtime-library, or virtual machine. To operate "stand-alone" these languages would require that support to be "built-in" and would consequently be much larger - so much so that you may as well in many cases deploy a system with an OS and/or JVM for example in any case.
There are of course other reasons why particular languages are suited to embedded systems, such as hardware level access, performance and deterministic behaviour.
While the issue of a runtime environment and/or OS is a primary reason you do not often see higher-level languages in small embedded systems, it is by no means unheard of. The .Net Micro Framework for example allows C# to be used in embedded systems, and there are a number of embedded JVM implementations, and of course Linux distributions are widely embedded making language choice virtually unlimited. .Net Micro runs on a limited number of processor architectures, and requires a reasonably large memory (>256 KB), and JVM implementations probably have similar requirements. Linux will not boot on less than about 16 MB ROM/4 MB RAM. Neither are particularly suited to hard real-time applications with deadlines in the microsecond domain.
C is more-or-less ubiquitous across 8, 16, 32 and 64 bit platforms and normally available for any architecture from day one, while support for other languages (other than perhaps C++ on 32 bit platforms at least) may be variable and patchy, and perhaps only available on more mature or widely used platforms.
From a developer point of view, one important consideration is also the availability of cross-compilation tools for the target platform and language. It is therefore a virtuous circle where developers choose C (or increasingly also C++) because that is the most widely available tool, and tool/chip vendors provide C and C++ tool-chains because that is what developers demand. Add to that the third-party support in the form of libraries, open-source code, debuggers, RTOS etc., and it would be a brave (or foolish) developer to select a language with barely any support. It is not just high level languages that suffer in this way. I once worked on a project programmed in Forth - a language even lower-level than C - it was a lonely experience, and while there were the enthusiastic advocates of the language, they were frankly a bit nuts favouring language evangelism over commercial success. C has in short reached critical mass acceptance and is hard to dislodge. C++ benefits from broad interoperability with C and similarly minimal runtime requirements, and by tool-chains that normally support both languages. So the only barrier to adoption of C++ is largely developer inertia, and to some extent availability on 8 and 16 bit platforms.
You're misunderstanding things a bit. Let's start by explaining the foundation of how computers work internally. I'll use simple and practical concepts here. For the underlying theories, read about Turing machines. So, what's your machine made up of? All computers have two basic components: a processor and a memory.
The memory is a sequential group of "cells" that works sort of like a table. If you "write" a value into the Nth cell, you can then retrieve that same value by "reading" from the Nth cell. This allows computers to "remember" things. If a computer is to perform a calculation, it needs to retrieve input data for it from somewhere, and to output data from it into somewhere. That place is the memory. In practice, the memory is what we call RAM, short for random access memory.
Then we have the processor. Its job is to perform the actual calculations on memory. The actual operations that are to be performed are mandated by a program, that is, a series of instructions that the processor is able to understand and execute. The processor decodes and executes an instruction, then the next one, and so on until the program halts (stops) the machine. If the program is add cell #1 and cell #2 and store result in cell #3, the processor will grab the values at cells 1 and 2, add their values together, and store the result into cell 3.
Now, there's an intrinsic question. Where is the program stored, if at all? First of all, a program can't be hardcoded into the wires. Otherwise, the system is no more of a computer than your microwave. There are two distinct approaches to this problem: the Harvard architecture and the Von Neumann architecture.
Basically, in the Harvard architecture, the data (as always has been) is stored in the memory. The code (or program) is stored somewhere else, usually in read-only memory. In the Von Neumann architecture, code is stored in memory, and is just another form of data. As a result, code is data, and data is code. It's worth noting that most modern systems use the Von Neumann architecture for several reasons, including the fact that this is the only way to implement just-in-time compilation, an essential part of runtime systems for modern bytecode-based programming languages, such as Java.
We now know what the machine does, and how it does that. However, how are both data and code stored? What's the "underlying format", and how shall it be interpreted? You've probably heard of this thing called the binary numeral system. In our usual decimal numeral system, we have ten digits, zero through nine. However, why exactly ten digits? Couldn't they be eight, or sixteen, or sixty, or even two? Be aware that it's impossible to create a unary-based computational system.
Have you heard that computers are "logical and cold"? Both are true... unless your machine has an AMD processor or a special kind of Pentium. The theory states that every logical predicate can be reduced to either "true" or "false". That is to say, "true" and "false" are the basis of logic. Plus, computers are made up of electrical cruft, no? A light switch is either on or off, no? So, at the electrical level we can easily recognize two voltage levels, right? And we want to handle logic stuff, such as numbers, in computers, right? So zero and one are the only feasible digits to build on.
Now, taking all the theory into account, let's talk about programming languages and assembly languages. Assembly languages are a way to express binary instructions in a (supposedly) readable way to human programmers. For instance, something like this...
ADD 0, 1 # Add cells 0 and 1 together and store the result in cell 0
Could be translated by an assembler into something like...
Both are equivalent, but humans will only understand the former, and processors will only understand the latter.
A compiler is a program that translates input data that is expected to conform to the rules of a given programming language into another, usually lower-level form. For instance, a C compiler may take this code...
x = some_function(y + z);
And translate it into assembly code such as (of course this is not real assembly, BTW!)...
# Assume x is at cell 1, y at cell 2, and z at cell 3.
# Assume that, when calling a function, the first argument
# is at cell 16, and the result is stored in cell 0.
MOVE 16, 2
ADD 16, 3
CALL some_function
MOVE 1, 0
And the assembler will spit (this is not random)...
Now, let's talk about another language, namely Java. Java's compiler does not give you assembly/raw binary code, but bytecode. Bytecode is... like a generic, higher-level form of assembly language that the CPU can't understand (there are exceptions), but another program that directly runs on the CPU does. This means that the lie that some badly educated people spread around, that "both interpreted and compiled programs ultimately boil down to machine code" is false. If, for example, the interpreter is written in C, and has this line of code...
Bytecode some_bytecode;
/* ... */
(Note: I won't translate that into assembly/binary again!) The processor executes the interpreter, and the interpreter's code executes the bytecode, by performing the actions specified by the bytecode. Although this can severely degrade performance if not optimized correctly, it is not the problem per se; the real problem is that things such as reflection, garbage collection, and exceptions can add quite some overhead. For embedded systems, whose memories are small and whose processors are slow, this is not something you want. You're wasting precious system resources on things you don't need. If C programs are slow on your Arduino, imagine a full blown Java/Python program with all sorts of bells and whistles! Even if you translated bytecode into machine code before inserting it into the system, support must be there for all that extra stuff, and that results in basically the same unwanted overhead/waste. You would still need support for reflection, exceptions, garbage collection, etc... It's basically the same thing.
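To make "the interpreter's code executes the bytecode" concrete, here is a minimal sketch of an interpreter loop for a toy stack machine. The opcodes and the function run are hypothetical and have nothing to do with real Java bytecode; there is no bounds checking.

```c
#include <stddef.h>

/* Hypothetical opcodes for a toy stack machine (not real Java bytecode). */
enum opcode { OP_PUSH, OP_ADD, OP_MUL, OP_HALT };

/* Executes `code` and returns the value on top of the stack at OP_HALT.
   No bounds checking: this is a sketch only. */
static int run(const int *code)
{
    int stack[16];
    size_t sp = 0;   /* number of values currently on the stack */
    size_t pc = 0;   /* index of the next instruction */

    for (;;) {
        switch (code[pc++]) {
        case OP_PUSH:               /* operand follows the opcode */
            stack[sp++] = code[pc++];
            break;
        case OP_ADD:                /* pop two values, push their sum */
            sp--;
            stack[sp - 1] += stack[sp];
            break;
        case OP_MUL:                /* pop two values, push their product */
            sp--;
            stack[sp - 1] *= stack[sp];
            break;
        case OP_HALT:
        default:
            return sp > 0 ? stack[sp - 1] : 0;
        }
    }
}
```

The CPU only ever runs the machine code of run; the program { OP_PUSH, 2, OP_PUSH, 3, OP_ADD, OP_PUSH, 4, OP_MUL, OP_HALT } is just data to it, and each toy instruction costs a fetch, a switch dispatch and several memory accesses.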
On most other environments, this is not a big deal, as memory is cheap and abundant, and processors are fast and powerful. Embedded systems have special needs, they're special by themselves, and things are not free in that land.
Why use a language like C? Why not Java? We are sending machine code in the end.
No, Java code does not compile to machine code, it needs a virtual machine (the JVM) on the target system.
You're partly right about the compilation, however, but still "higher-level" languages can result in less efficient machine code. For instance, the language can include garbage collection, run-time correctness checks, can't use all the "native" numeric types, etc.
In general it depends on the target. On small targets (i.e. microcontrollers like AVR) you don't have that complex programs running. Additionally, you need to access the hardware directly (f.e. a UART). High level languages like Java don't support accessing the hardware directly, so you usually end up with C.
In the case of C versus Java there's a major difference:
With C you compile the code and get a binary that runs on the target. It directly runs on the target.
Java instead creates Java Bytecode. The target CPU cannot process that. Instead it requires running another program: the Java runtime environment. That translates the Java Bytecode to actual machine code. Obviously this is more work and thus requires more processing power. While this isn't much of a concern for standard PCs it is for small embedded devices. (Note: some CPUs do actually have support for running Java bytecode directly. Those are exceptions though.)
Generally speaking, the compile step isn't the issue -- the limited resources and special requirements of the target device are.
You misunderstand something: "compiling" Java gives a different kind of output than compiling a low-level language. It is true that both are machine code of a sort, but in C's case the machine code is directly executable by the processor, whereas with Java the output is an intermediate form, bytecode, which the processor cannot execute. It needs extra work, a translation to machine code, which is the only directly executable format. Since that takes extra time, C is an attractive choice because of its speed. With a low-level language you write your code and then compile it for a target machine (you need to specify the target to the compiler, since each processor has its own machine code), and the result is understood by the processor directly.
On the other hand, C allows direct hardware access, which is not allowed in Java-like languages, even via an API.
It's an industry thing.
There are three kinds of high level languages: interpreted (lua, python, javascript), compiled to bytecode (java, c#), and compiled to machine code (c, c++, fortran, cobol, pascal).
Yes, C is a high level language, and closer to java than to assembly.
High level languages are popular for two reasons:
memory management, and a wide standard library.
Managed memory comes with a cost: somebody must manage it. That's an issue not only for Java and C#, where somebody must implement a VM, but also for bare-metal C/C++, where someone must implement the memory allocation functions.
A wide standard library can't be supported by all targets because there aren't enough resources; e.g., AVR Arduino doesn't support the full C++ standard library.
C gained popularity because it can easily be converted to equivalent assembly code. Most statements can be converted, without optimization, to a fixed bunch of assembly instructions, so compilers are easy to program. And its standard is compact and easy to implement. C prevailed because it became the de facto standard for the lowest high level language of any arch.
So in the end, besides special snowflakes like cython, go, rust, haskell etc., industry decided that machine code is compiled from C and C++, and most optimization efforts went that way.
Languages like Java decided to hide memory from the programmer, so good luck trying to interface with low level stuff there. As they do that by design, almost nobody bothers trying to bring them to compete with C. Realistically, Java without GC would be C++ with different syntax.
Finally, if all the industry money goes to one language, the cheapest/easiest thing to do is choosing that language.
You are right in that you can use any language that generates machine code. But Java is not one of them. Java, Python and even some languages that compile to machine code may have heavy system requirements. You could use Pascal, and some folks do, but C won the C vs Pascal war many years ago. There are some other languages that fell by the wayside that you could use if you had a compiler for them. There are some new languages you can use, but the tools are not as mature and do not cover as many targets as one would like. But it is very unlikely that they will unseat C. C has just the right amount of power/freedom: low enough and high enough.
Java is an interpreted language and (like all interpreted languages) produces an intermediate code that is not directly executable by the processor. So what you send to the embedded device would be the bytecode, and you would need a JVM running on it and interpreting your code. Clearly not feasible. As for the compiled languages (C, C++...), you are right to say that in the end you send machine code to the device. However, consider that using high-level features of a language will produce much more machine code than you would expect. If you use polymorphism, for example, you write just a function call, but when you compile, the machine code explodes. Consider also that very often the use of dynamic memory (malloc, new...) is not feasible on an embedded device.

Difference between different binary formats

What are the differences between binary formats like COFF, ELF, a.out, etc.? Why do so many different formats exist?
All they have to be is a sequence of instructions and their arguments (specified by the ISA). So as long as the processor is the same, you can use the same binary between computers (currently needs to be of compatible ABI).
Short answer: History and development. See the feature comparison at Wikipedia.
Better answer: Your assumptions are wrong. It is definitely not just code and initialized variables; a much wider range of features is needed. There is no universal executable format, or a best executable format, as the features needed vary in different cases. Usually, we also want to keep backwards support; and why switch from a known working solution to a new one, if you don't have to? Because the new one is "universal" is just silly (actually stupid, if you consider how badly monocultures fare in the real, changing world).
Currently, the ELF format is the closest we have to a "universal" format, and is used by many current operating systems -- although both Windows and Mac OS use their own formats for basically historical reasons. (Windows retains backwards compatibility, having later switched to a COFF-based "portable executable" format, COFF itself not being that portable; Mac OS X uses the Mach-O format, which is directly related to its kernel.)

Adding 64 bit support to existing 32 bit code, is it difficult?

There is a library which I build against different 32-bit platforms. Now, 64-bit architectures must be supported. What are the most general strategies to extend existing 32-bit code to support 64-bit architectures? Should I use #ifdef's or anything else?
The amount of effort involved will depend entirely on how well written the original code is. In the best possible case there will be no effort involved other than re-compiling. In the worst case you will have to spend a lot of time making your code "64 bit clean".
Typical problems are:
assumptions about sizes of int/long/pointer/etc
assigning pointers <=> ints
relying on default argument or function result conversions (i.e. no function prototypes)
inappropriate printf/scanf format specifiers
assumptions about size/alignment/padding of structs (particularly in regard to file or network I/O, or interfacing with other APIs, etc)
inappropriate casts when doing pointer arithmetic with byte offsets
Simply don't rely on assumptions about the machine word size: always use sizeof, stdint.h, etc. Unless you rely on different library calls for different architectures, there should be no need for #ifdefs.
The easiest strategy is to build what you have with 64-bit settings and test the heck out of it. Some code doesn't need to change at all. Other code, usually with wrong assumptions about the size of ints/pointers, will be much more brittle and will need to be modified to be independent of the architecture.
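A minimal sketch of two of the typical problems listed above, handled without any #ifdef: storing a pointer in an integer, and printing size_t and 64-bit values. It assumes a C99 library with uintptr_t (an optional type in the standard); the function names are hypothetical.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Round-trips a pointer through an integer type wide enough to hold it.
   Casting to int or long instead may truncate on 64-bit targets. */
static void *roundtrip_pointer(void *p)
{
    uintptr_t bits = (uintptr_t)p;
    return (void *)bits;
}

/* Formats a size_t and a 64-bit value with specifiers that stay correct
   on both 32-bit and 64-bit builds. Returns what snprintf returns. */
static int format_sizes(char *out, size_t cap, size_t count, uint64_t total)
{
    return snprintf(out, cap, "%zu items, %" PRIu64 " bytes", count, total);
}
```

The same source then compiles unchanged for both word sizes; using %d or %lu for these arguments is what tends to break during a port.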
Very often binary files containing binary records cause the most problems. This is especially true in environments where ints grow from 32-bit to 64-bit in the transition to a 64-bit build. Primarily this is due to the fact that integers get written natively to files in their current (32-bit) length and read in using an incorrect length in a 64-bit build where ints are 64-bit.

When to worry about endianness?

I have seen countless references about endianness and what it means. I got no problems about that...
However, my coding project is a simple game to run on linux and windows, on standard "gamer" hardware.
Do I need to worry about endianness in this case? When should I need to worry about it?
My code is simple C and SDL+GL, the only complex data are basic media files (png+wav+xm) and the game data is mostly strings, integer booleans (for flags and such) and static-sized arrays. So far no user has had issues, so I am wondering if adding checks is necessary (will be done later, but there are more urgent issues IMO).
The times when you need to worry about endianness:
you are sending binary data between machines or processes (using a network or file). If the machines may have different byte order or the protocol used specifies a particular byte order (which it should), you'll need to deal with endianness.
you have code that accesses memory through pointers of different types (say you access an unsigned int variable through a char*).
If you do these things you're dealing with byte order whether you know it or not - it might be that you're dealing with it by assuming it's one way or the other, which may work fine as long as your code doesn't have to deal with a different platform.
In a similar vein, you generally need to deal with alignment issues in those same cases and for similar reasons. Once again, you might be dealing with it by doing nothing and having everything work fine because you don't have to cross platform boundaries (which may come back to bite you down the road if that does become a requirement).
If you mean a PC by "standard gamer hardware", then you don't have to worry about endianness as it will always be little endian on x86/x64. But if you want to port the project to other architectures, then you should design it endianness-independently.
Whenever you receive/transmit data from a network, remember to convert between network and host byte order. The C functions htons, htonl etc., or the equivalents in your language, should be used here.
Whenever you read multi-byte values (like UTF-16 characters or 32 bit ints) from a file, since that file might have originated on a system with different endianness. If the file is UTF 16 or 32 it probably has a BOM (byte-order mark). Otherwise, the file format will have to specify endianness in some way.
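For the network case, a minimal sketch using htonl and ntohl from the POSIX header <arpa/inet.h> (on Windows the same functions are declared in winsock2.h instead). The helper names put_u32_net and get_u32_net are hypothetical.

```c
#include <arpa/inet.h>  /* htonl, ntohl (POSIX) */
#include <stdint.h>
#include <string.h>

/* Stores `value` into `buf` in network byte order (big-endian). */
static void put_u32_net(unsigned char *buf, uint32_t value)
{
    uint32_t wire = htonl(value);      /* host order -> network order */
    memcpy(buf, &wire, sizeof wire);   /* memcpy avoids alignment problems */
}

/* Reads a 32-bit value stored in network byte order from `buf`. */
static uint32_t get_u32_net(const unsigned char *buf)
{
    uint32_t wire;
    memcpy(&wire, buf, sizeof wire);
    return ntohl(wire);                /* network order -> host order */
}
```

On a big-endian host both conversions are no-ops, so the same code is correct on either kind of machine.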
You only need to worry about it if your game needs to run on different hardware architectures. If you are positive that it will always run on Intel hardware then you can forget about it. If it will run on Linux though many people use different architectures than Intel and you may end up having to think about it.
Are you distributing your game in source code form?
Because if you are distributing your game as a binary only, then you know exactly which processor families your game will run on. Also, the media files: are they user generated (possibly via a level editor) or are they really only meant to be supplied by yourself?
If this is a truly closed environment (you distribute binaries and the game assets are not intended to be customized) then you know your own exposure to endianness issues and I personally wouldn't fool with it.
However, if you are either distributing source and/or hoping people will customize their game, then you have a potential for concern. However, with most of the desktop/laptop computers around these days moving to x86 I would think this is a diminishing concern.
The problem occurs with networking and how the data is sent and when you are doing bit fiddling on different processors since different processors may store the data differently in memory.
I believe PowerPC has the opposite endianness of the Intel boards. You might be able to have a routine that sets the endianness depending on the architecture? I'm not sure if you can actually tell what the hardware architecture is in code... maybe someone smarter than me knows the answer to that question.
Now, in reference to your statement about "standard" gamer H/W: I would say consumer off-the-shelf solutions are what almost any standard gamer is using, so you're almost certainly going to get the same endianness across the board. I'm sure someone will disagree with me but that's my $.02
Ha...I just noticed on the right there is a link that is showing up related to the suggestion I had above.
Find Endianness through a c program
