AFAIK, in C, structs get laid out, aligned and padded as the compiler sees fit. That is why you cannot rely on one C program to use structs from another C program. E.g. you cannot save a struct as a binary file that another C program will read and cast to that same struct. You may be able to use packed structs like that, but that's not really good practice.
So I was surprised to learn that .so and DLL files have C functions that take complicated structs (pointers to them) as their parameters. At least my company's products do this.
Is this reliable, is it good practice? Is there some new standard for struct layout with sizes, alignment and padding all being the same?
I know a 64-bit program cannot call a 32-bit library, but still I thought struct layout can vary amongst compilers of the same bits.
For a given processor type and a given operating system, there is usually a standard ABI (application binary interface) which specifies things such as:
The width and endianness of integer types.
The width and representation of floating-point types.
Alignment constraints, which dictate the presence of padding in structures.
How parameters are passed to functions (in registers and on the stack).
For example, on the x86_64 processor architecture (i.e. the processor architecture of 64-bit PCs), there are two popular ABIs: the Microsoft x64 calling convention, used on Windows, and the System V amd64 ABI (PDF; the part you're asking about specifically is in §3.1 “Machine interface”), used by most other operating systems, such as Linux, the BSDs and macOS. 32-bit x86 historically had more fragmentation.
So generally, if you use different compilers for the same platform (processor and operating system) in their default mode, they'll produce the same layouts for structs, and they'll generate function calling code that's compatible with how functions compiled by other compilers read their arguments. Problems arise when you mix different platforms, for example writing a struct on an embedded device and trying to read it on a PC.
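To see what your own platform's ABI does with a struct, you can ask the compiler directly. This is a minimal sketch; struct sample_record and the function names are made up for illustration. The asserts only check what the C language itself guarantees (first member at offset 0, members in declaration order); the exact padding is whatever the ABI dictates, which is why the printed numbers should match between compilers on the same platform but may differ across platforms.

```c
#include <assert.h>
#include <stddef.h>  /* offsetof, size_t */
#include <stdint.h>  /* uint8_t, uint32_t */
#include <stdio.h>

/* Hypothetical record that a library and its caller might share. */
struct sample_record {
    uint8_t  tag;    /* 1 byte */
    uint32_t value;  /* commonly aligned to 4 bytes, so padding may precede it */
    uint8_t  flag;   /* 1 byte, commonly followed by tail padding */
};

/* Padding bytes = total size minus the sum of the member sizes. */
size_t sample_record_padding(void)
{
    return sizeof(struct sample_record)
         - (sizeof(uint8_t) + sizeof(uint32_t) + sizeof(uint8_t));
}

/* Checks only what the language guarantees, independent of any ABI. */
void check_sample_record_guarantees(void)
{
    assert(offsetof(struct sample_record, tag) == 0);
    assert(offsetof(struct sample_record, tag)
           < offsetof(struct sample_record, value));
    assert(offsetof(struct sample_record, value)
           < offsetof(struct sample_record, flag));
    assert(sizeof(struct sample_record) >= 6);
}

/* Prints the layout chosen by the current compiler for the current ABI. */
void print_sample_record_layout(void)
{
    printf("sizeof          = %zu\n", sizeof(struct sample_record));
    printf("offsetof(tag)   = %zu\n", offsetof(struct sample_record, tag));
    printf("offsetof(value) = %zu\n", offsetof(struct sample_record, value));
    printf("offsetof(flag)  = %zu\n", offsetof(struct sample_record, flag));
    printf("padding bytes   = %zu\n", sample_record_padding());
}
```

Calling print_sample_record_layout() from programs built with two different compilers on the same platform is a quick way to confirm they agree before passing such a struct across a .so or DLL boundary.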
const struct sockaddr FAR* name,
It's an old extension from the era of segmented memory architectures. It basically means "this is a pointer that needs to be able to point at any address, not just things in the same segment as the code using it".
See more on the Wikipedia page.
far doesn't mean anything in C. Check out the C99 standard [PDF] and see if you can find mention of far pointers. Far pointers were an extension added to compilers targeting the 8086/80286 architectures to provide support for the segmented memory model.
It does nothing unless you happen to be using a 16-bit x86 compiler.
If you look in the Win32 header WinDef.h (in Visual Studio, simply right-click the word FAR in the source and select "Go to Definition"), you will see that it is a macro defined as far, which in turn is also a macro defined as nothing at all!
It is only there to allow the compilation of legacy Win16 source as Win32. In 16-bit x86 compilers, far was a compiler extension keyword to support segment:offset pointers, which resolve to a 20-bit address (16-bit x86 only had a 1 MB address space!). They are distinct from 16-bit near pointers, which comprised only the offset from the current segment.
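A small sketch of why the macro is harmless on a flat-memory target. The fallback #define below only mimics the effect of the Windows headers for illustration, and struct demo_addr and both functions are hypothetical stand-ins, not the real Winsock declarations.

```c
/* On Win32 the real definition comes from the Windows headers; this
   fallback is illustrative only and expands to nothing. */
#ifndef FAR
#define FAR
#endif

/* Hypothetical stand-in for struct sockaddr. */
struct demo_addr {
    unsigned short family;
    char           data[14];
};

/* Declared in the legacy Win16 style, with FAR on the pointer. */
int demo_family_far(const struct demo_addr FAR *name)
{
    return name->family;
}

/* The same function without FAR. After preprocessing, the two parameter
   lists are identical. */
int demo_family_plain(const struct demo_addr *name)
{
    return name->family;
}
```

Because FAR expands to nothing here, a pointer to demo_family_far can be assigned to a variable of type int (*)(const struct demo_addr *) without a cast.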
I want to know the likely problems when moving C programs, e.g. a server process, from Tru64 Unix to 64-bit Linux, and why. What modifications would the program probably need, or would just recompiling the source code in the new environment do, since both are 64-bit platforms? I am a little confused and want to know before I start working on it.
I spent a lot of time in the early 90s (OMG I feel old...) porting 32-bit code to the Alpha architecture. This was back when it was called OSF/1.
You are unlikely to have any difficulties relating to the bit-width when going from Alpha to x86_64.
Developers are much more aware of the problems caused by assuming that sizeof(int) == sizeof(void *), for example. That was far and away the most common problem I used to have when porting code to Alpha.
Where you do find differences, they will be in how the two systems differ in their conformity to various API specifications, e.g. POSIX, X/Open, etc. That said, such differences are normally easily worked around.
If the Alpha code has used the SVR4-style APIs (e.g. STREAMS), then you may have more difficulty than if it has used the more BSD-like APIs.
Calling something a "64-bit architecture" only roughly classifies it; it does not tell you the width of every type.
Ideally your code would have used only "semantic" types for all descriptions of variables, in particular size_t and ptrdiff_t for sizes and pointer arithmetic and the [u]intXX_t for types where a particular width is assumed.
If this is not the case, the main point would be to compare all the standard arithmetic types (all integer types, floating point types and pointers) if they map to the same concept on both platforms. If you find differences, you know the potential trouble spots.
Check the 64-bit data model used by both platforms. Most 64-bit Unix-like OSes use LP64, so it is likely that your target platforms use the same data model. This being the case, you should have few problems, provided that the code itself compiles and links.
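A quick way to compare the two platforms is to compile and run the same small probe on both. This is a minimal sketch; the function names are made up, and the model names are just the conventional labels for these width combinations.

```c
#include <stddef.h>  /* NULL */
#include <stdint.h>  /* uintptr_t */

/* Returns the conventional name of the data model, judged purely from
   the widths of int, long, long long and pointers. */
const char *data_model_name(void)
{
    if (sizeof(int) == 4 && sizeof(long) == 8 && sizeof(void *) == 8)
        return "LP64";    /* typical for 64-bit Unix-like systems */
    if (sizeof(int) == 4 && sizeof(long) == 4 &&
        sizeof(long long) == 8 && sizeof(void *) == 8)
        return "LLP64";   /* 64-bit Windows */
    if (sizeof(int) == 4 && sizeof(long) == 4 && sizeof(void *) == 4)
        return "ILP32";   /* typical 32-bit systems */
    return "other";
}

/* Stores a pointer in uintptr_t and converts it back. Old code that
   stashes pointers in int is what breaks when sizeof(int) differs
   from sizeof(void *). */
int pointer_round_trips(const void *p)
{
    uintptr_t bits = (uintptr_t)p;
    return (const void *)bits == p;
}
```

If data_model_name() returns the same string on the old and the new system, the basic integer and pointer widths match, and the remaining work is mostly about APIs and libraries.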
If you use the same compiler (e.g. GCC) on both platforms you also need not worry about incompatible compiler extensions or differences in undefined or implementation defined behaviour. Such behaviour should be avoided in any case - even if the compilers are the same, since it may differ between target architectures. If you are not using the same compiler, then you need to be cautious about using extensions. #pragma directives are a particular issue since a compiler is allowed to quietly ignore a #pragma it does not recognise.
Finally in order to compile and link, any library dependencies outside the C standard library need to be available on both platforms. Most OS calls will be available since Unix and Linux share the same POSIX API.
Is there any special C standard for microcontrollers?
I ask because so far when I programmed something under Windows OS, it doesn't matter which compiler I used. If I had a compiler for C99, I knew what I could do with it.
But recently I started to program in C for microcontrollers, and I was shocked, that even it's still C in its basics, like loops, variables creation and so, there is some syntax type I have never seen in C for desktop computers. And furthermore, the syntax is changing from version to version. I use AVR-GCC compiler, and in previous versions, you used a function for port I/O, now you can handle a port like a variable in the new version.
What defines what functions and how to have them to be implemented into the compiler and still have it be called C?
Is there any special C standard for microcontrollers?
No, there is the ISO C standard. Because many small devices have special architecture features that need to be supported, many compilers support language extensions. For example, because an 8051 has bit-addressable RAM, a _bit data type may be provided. It also has a Harvard architecture, so keywords are provided for specifying different memory address spaces, which an address alone does not resolve since different instructions are required to address these spaces. Such extensions will be clearly indicated in the compiler documentation. Moreover, extensions in a conforming compiler should use reserved identifiers, typically prefixed with an underscore. However, many compilers also provide unadorned aliases for backward compatibility; their use should be avoided.
... when I programmed something under Windows OS, it doesn't matter which compiler I used.
Because the Windows API is standardized (by Microsoft), and desktop Windows almost always runs on x86 or x64, there is little architectural variation to consider. That said, you may still see FAR and NEAR macros in APIs; they are a throwback to 16-bit x86 with its segmented addressing, which also required compiler extensions to handle.
... that even it's still C in its basics, like loops, variables creation and so,
I am not sure what that means. A typical microcontroller application has no OS or only a simple kernel, so you should expect to see a lot more 'bare metal' or 'system-level' code, because there are no extensive OS APIs and device driver interfaces to do lots of work under the hood for you. All those library calls are just that; they are not part of the language. It is the same C language, just put to different work.
... there is some syntax type I have never seen in C for desktop computers.
For example...?
And furthermore, the syntax is changing from version to version.
I doubt it. Again; for example...?
I use AVR-GCC compiler, and in previous versions, you used a function for port I/O, now you can handle a port like a variable in the new version.
That is not down to changes in the language or compiler, but more likely simple 'preprocessor magic'. On AVR, all I/O is memory mapped, so if for example you include the device support header, it may have a declaration such as:
#define PORTA (*((volatile char*)0x0100))
You can then write:
PORTA = 0xFF;
to write 0xFF to the memory-mapped register at address 0x100. You could just take a look at the header file and see exactly how it does it.
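The same trick can be tried on a desktop machine if the fixed address is replaced by the address of an ordinary variable. This is only a sketch: DEMO_PORT, demo_port_storage and the two functions are invented names, and a real device header would use the register's fixed address instead.

```c
#include <stdint.h>

/* Stand-in storage so the sketch runs on a desktop host. */
volatile uint8_t demo_port_storage;

/* A real device header would use a fixed address here, in the style of
   (*(volatile uint8_t *)0x0100). The volatile qualifier tells the
   compiler that every read and write must really happen. */
#define DEMO_PORT (*(volatile uint8_t *)&demo_port_storage)

/* Drives all eight pins high, written as a plain assignment. */
void demo_port_all_high(void)
{
    DEMO_PORT = 0xFF;
}

/* Reads the port back, again as if it were an ordinary variable. */
uint8_t demo_port_read(void)
{
    return DEMO_PORT;
}
```

Nothing here is a language extension; the "port as a variable" syntax is just a dereferenced pointer hidden behind a macro.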
The GCC documentation describes target specific variations; AVR is specifically dealt with here in section 6.36.8, and in 3.17.3. If you compare that with other targets supported by GCC, it has very few extensions, perhaps because the AVR architecture and instruction set were specifically designed for clean and efficient implementation of a C compiler without extensions.
What defines what functions and how to have them to be implemented into the compiler and still have it be called C?
It is important to realise that the C programming language is a distinct entity from its libraries, and that functions provided by libraries are no different from the ones you might write yourself - they are not part of the language - so it can be C with no library whatsoever. Ultimately, library functions are written using the same basic language elements. You cannot expect the level of abstraction present in, say, the Win32 API to exist in a library intended for a microcontroller. You can in most cases expect at least a subset of the C Standard Library to be implemented since it was designed as a systems level library with few target hardware dependencies.
I have been writing C and C++ for embedded and desktop systems for years and do not recognise the huge differences you seem to perceive, so can only assume that they are the result of a misunderstanding of what constitutes the C language. The following books may help.
C Programming Language (2nd Edition) by Brian W. Kernighan and Dennis M. Ritchie
Embedded C by Michael J. Pont
Embedded systems are weird and sometimes have exceptions to "standard" C.
From system to system you will have different ways to do things like declare interrupts, or define what variables live in different segments of memory, or run "intrinsics" (pseudo-functions that map directly to assembly code), or execute inline assembly code.
But the basics of control flow (for/if/while/switch/case) and variable and function declarations should be the same across the board.
and in previous versions, you used function for Port I/O, now you can handle Port like variable in new version.
That's not part of the C language; that's part of a device support library. That's something each manufacturer will have to document.
The C language assumes a von Neumann architecture (one address space for all code and data), which not all architectures actually have, but most desktop/server class machines do have (or at least present with the aid of the OS). To get around this without making horrible programs, the C compiler (with help from the linker) often supports some extensions that aid in making use of multiple address spaces efficiently. All of this could be hidden from the programmer, but it would often slow down and inflate programs and data.
As far as how you access device registers -- on different desktop/server class machines this is very different as well, but since programs written to run under common modern OSes for these machines (Mac OS X, Windows, BSDs, or Linux) don't normally access hardware directly, this isn't an issue. There is OS code that has to deal with these issues, though. This is usually done through defining macros and/or functions that are implemented differently on different architectures, or even have multiple versions on a single system, so that a driver could work for a particular device (such as an Ethernet chip) whether it were on a PCI card or a USB dongle (possibly plugged into a USB card plugged into a PCI slot), or directly mapped into the processor's address space.
Additionally, the C standard library makes more assumptions than the compiler (and language proper) about the system that hosts the programs that use it (the C standard library). These things just don't make sense when there isn't a general purpose OS or filesystem. fopen makes no sense on a system without a filesystem, and even printf might not be easily definable.
As far as what AVR-GCC and its libraries do -- a lot goes into how this is done. The AVR is a Harvard architecture with memory-mapped device control registers, special function registers, and general purpose registers (memory addresses 0-31), and a different address space for code and constant data. This already falls outside of what standard C assumes. Some of the registers (general, special, and device control) are accessible via special instructions for things like flipping single bits, and reading or writing some multi-byte registers (a multi-instruction operation) implicitly blocks interrupts for the next instruction (so that the second half of the operation can happen). These are things that desktop C programs don't have to know anything about, and since AVR-GCC comes from regular GCC, it didn't initially understand all of these things either. That meant that the compiler wouldn't always use the best instructions to access control registers, so:
*(DEVICE_REG_ADDR) |= 1; // Set BIT0 of control register REG
would have turned into:
temp_reg = *DEVICE_REG_ADDR;
temp_reg |= 1;
*DEVICE_REG_ADDR = temp_reg;
because AVR generally has to have things in its general purpose registers to do bit operations on them, though for some memory locations this isn't true. AVR-GCC had to be altered to recognize that when the address of a variable used in certain operations is known at compile time and lies within a certain range, it can use different instructions to perform these operations. Prior to this, AVR-GCC just provided you with some macros (that looked like functions) that had inline assembly to do this (and used the single-instruction implementations that GCC now uses). If they no longer provide the macro versions of these operations, then that's probably a bad choice since it breaks old code, but allowing you to access these registers as though they were normal variables, once the ability to do so efficiently and atomically was implemented, is good.
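To make the two styles concrete, here is a host-runnable sketch. All names are hypothetical: demo_control_reg stands in for a device register, and the DEMO_SET_BIT / DEMO_CLEAR_BIT macros only imitate the shape of the older function-like helpers; they are plain C rather than inline assembly.

```c
#include <stdint.h>

/* Stand-in for a device control register so the sketch runs on a
   desktop host. On real hardware this would be a fixed address taken
   from the device header. */
volatile uint8_t demo_control_reg;

/* Older, function-like style: a macro that hides the register access. */
#define DEMO_SET_BIT(reg, bit)   ((reg) |= (uint8_t)(1u << (bit)))
#define DEMO_CLEAR_BIT(reg, bit) ((reg) &= (uint8_t)~(1u << (bit)))

/* The explicit load, modify, store sequence described above. */
void demo_set_bit0_explicit(void)
{
    uint8_t temp = demo_control_reg;  /* load into a working register */
    temp |= 1u;                       /* modify */
    demo_control_reg = temp;          /* store back */
}

/* Variable-style access; a compiler that knows the target may turn this
   into a single bit-set instruction when the address allows it. */
void demo_set_bit0_compound(void)
{
    demo_control_reg |= 1u;
}
```

Both functions leave the register in the same state; the difference that matters on a microcontroller is which instructions the compiler emits and whether an interrupt can land between the load and the store.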
I have never seen a C compiler for a microcontroller which did not have some controller-specific extensions. Some compilers are much closer to meeting ANSI standards than others, but for many microcontrollers there are tradeoffs between performance and ANSI compliance.
On many 8-bit microcontrollers, and even some 16-bit ones, accessing variables on a stack frame is slow. Some compilers will always allocate automatic variables on a run-time stack despite the extra code required to do so, some will allocate automatic variables at compile time (allowing variables that are never live simultaneously to overlap), and some allow the behavior to be controlled with command-line options or #pragma directives. When coding for such machines, I sometimes like to #define a macro called "auto" which gets redefined to "static" if it will help things work faster.
Some compilers have a variety of storage classes for memory. You may be able to improve performance greatly by declaring things to be of suitable storage classes. For example, an 8051-based system might have 96 bytes of "data" memory, 224 bytes of "idata" memory which overlaps the first 96 bytes, and 4K of "xdata" memory.
Variables in "data" memory may be accessed directly.
Variables in "idata" memory may only be accessed by loading their address into a one-byte pointer register. There is no extra overhead accessing them in cases where that would be necessary anyway, so idata memory is great for arrays. If array q is stored in idata memory, a reference to q[i] will be just as fast as if it were in data memory, though a reference to q[0] will be slower (in data memory, the compiler could pre-compute the address and access it without a pointer register; in idata memory that is not possible).
Variables in xdata memory are far slower to access than those in other types, but there's a lot more xdata memory available.
If one tells an 8051 compiler to put everything in "data" by default, one will "run out of memory" if one's variables total more than 96 bytes and one hasn't instructed the compiler to put anything elsewhere. If one puts everything in "xdata" by default, one can use a lot more memory without hitting a limit, but everything will run slower. The best is to place frequently-used variables that will be directly accessed in "data", frequently-used variables and arrays that are indirectly accessed in "idata", and infrequently-used variables and arrays in "xdata".
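One way to keep such placement decisions portable is to hide the compiler keywords behind macros. This is a sketch under assumptions: the MEM_* macro names and the variables are made up, and the keyword spellings and predefined macros (__C51__ for Keil C51, __SDCC or SDCC for SDCC) should be checked against your compiler's manual. On any other compiler the macros expand to nothing, so the code still builds and runs on a desktop.

```c
/* Map placement macros onto compiler-specific memory-space keywords.
   The spellings below are assumptions to verify in the compiler manual. */
#if defined(__C51__)                      /* Keil C51 */
#  define MEM_DATA  data
#  define MEM_IDATA idata
#  define MEM_XDATA xdata
#elif defined(__SDCC) || defined(SDCC)    /* SDCC targeting an 8051 */
#  define MEM_DATA  __data
#  define MEM_IDATA __idata
#  define MEM_XDATA __xdata
#else                                     /* flat memory: no keyword */
#  define MEM_DATA
#  define MEM_IDATA
#  define MEM_XDATA
#endif

MEM_DATA  unsigned char tick_count;       /* hot scalar, accessed directly */
MEM_IDATA unsigned char rx_buffer[32];    /* hot array, accessed by index */
MEM_XDATA unsigned char log_buffer[512];  /* large and rarely touched */

/* Records one tick in each of the three areas. */
void record_tick(void)
{
    unsigned char slot = (unsigned char)(tick_count % 32u);
    rx_buffer[slot] = tick_count;     /* indexed access suits idata */
    log_buffer[tick_count] = slot;    /* tick_count is below 256, so in range */
    tick_count++;
}
```

The placement follows the rule of thumb above: the directly accessed counter goes in "data", the indexed buffer in "idata", and the big, infrequently used buffer in "xdata".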
The vast majority of the standard C language is common with microcontrollers. Interrupts do tend to have slightly different conventions, although not always.
Treating ports like variables is a result of the fact that the registers are mapped to locations in memory on most microcontrollers, so by writing to the appropriate memory location (defined as a variable with a preset location in memory), you set the value on that port.
As previous contributors have said, there is no standard as such, mainly due to different architectures.
Having said that, Dynamic C (sold by Rabbit Semiconductor) is described as "C with real-time extensions". As far as I know, the compiler only targets Rabbit processors, but there are useful additional keywords (for example, costate, cofunc, and waitfor), some real peculiarities (for example, #use mylib.lib instead of #include mylib.h - and no linker), and several omissions from ANSI C (for example, no file-scope static variables).
It's still described as 'C' though.
Wiring has a C-based language syntax. Perhaps you might want to see what makes it as such.