gcc - OS-independent function labels - c

void foo(){
...
}
Compiling this to assembly, it seems that gcc on Linux creates the label foo as the entry point, but _foo on OS X.
We can, of course, do an OS-specific selection whenever we need a label, but this is cumbersome.
Is there any way to suppress this so that the labels on both systems are the same (preferably one that is also Windows-compatible)?

No. It's part of the name mangling specifications of the platform.
You can't change that. You're still writing assembly. Don't expect it to be portable in any way, that's what C was invented for.

The early C compilers decorated function names with an underscore to avoid name clashes when linking against the huge, already-developed assembly libraries of the time.
Credits for this information go to this excellent old answer.
Today this is no longer needed, but the tradition still sticks around, mostly for backward compatibility, even though some systems are getting rid of it.
This is not an OS issue: OSes are completely orthogonal to programming languages, and name decoration is not something defined by the OS ABI. It is a matter for the compiler/linker designers, though standards have been created to reduce the incompatibilities, and an ABI may suggest their use.
To fully understand how you can mitigate your problem, it is worth noting that while the OS APIs are language agnostic, a C program rarely invokes them directly; more likely it uses the C run-time.
The C run-time is usually statically linked and it expects names to be decorated according to the scheme of the compiler used to create it.
So if you need to use the C run-time you have to stick with the same name decoration as your system components are using.
This last point rules out the -fno-leading-underscore option as it will generate a linker error on the relevant platforms.
It is better to work on the assembly files, since there you have the freedom to define and import names exactly as typed. Furthermore, the amount of assembly code is usually limited.
If you are using NASM1 there is a nice trick you can use, called macro indirection: it allows you to prepend a symbol, defined on the command line, to a name.
Consider:
BITS 32
mov eax, %[p]data
_data db 0
data db 0
If you compile this file twice, first as nasm -Dp=_ ... and then as nasm -Dp= ..., and inspect the immediate value in the opcode generated for mov eax, %[p]data, you can check that in the first case it has been translated to mov eax, _data and in the second to mov eax, data.
Assuming you access external symbols by declaring them as EXTERN sym (the precise syntax is irrelevant here), you can define a macro PEXTERN that works like the EXTERN directive but imports the symbol with or without a leading underscore, based on the value of the macro p (you can change this name), and defines an alias for it so that the name you use in the code is the same regardless.
BITS 32
%macro PEXTERN 1
EXTERN %[p]%1               ; import the (possibly prefixed) symbol
%ifnidn %1, %[p]%1          ; if the prefix is not empty...
%define %1 %[p]%1           ; ...alias the plain name to the prefixed one
%endif
%endmacro
PEXTERN foo
PEXTERN bar
mov eax, foo
call bar
Running nasm -Dp= -e ... produces the listing
extern foo
extern bar
mov eax, foo
call bar
while nasm -Dp=_ -e ... produces
extern _foo
extern _bar
mov eax, _foo
call _bar
You'll need to update the build scripts/Makefiles; off the top of my head, you can use two methods:
1. Detect the OS type and define the symbol p accordingly. With Makefiles this may be easier.
2. Try compiling a test program: write a minimal C program that imports/exports a function and a minimal assembly file that exports/imports that function. Define the symbol as _ and try to assemble + compile (redirecting everything into /dev/null). If that fails, redefine the symbol as empty (a sketch of the C half follows below).
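For the second approach, the C half of such a probe could be as small as this; the file and symbol names are purely illustrative, and the matching probe.asm would simply define from_asm and reference from_c using the prefix under test (for example via the PEXTERN macro above).
/* probe.c -- the link succeeds only when the prefix used in probe.asm
   matches the platform's name decoration. A build script can try, e.g.,
   nasm -f <obj-format> -Dp=_ probe.asm && cc probe.c probe.o -o /dev/null
   first, and fall back to -Dp= if that fails. */
void from_c(void) { }        /* referenced by probe.asm */
void from_asm(void);         /* defined in probe.asm */
int main(void) { from_asm(); return 0; }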
Note that besides names, individual OSes may need specific assembly flags, so a universal build script may be more involved, but not necessarily unmanageable.
You'll end up needing something like Cygwin for Windows.
1 If not, check if you can port the idea into your assembler.

Related

How does the compiler differentiate identically-named items

In the following example:
int main(void) {
    int a=7;
    {
        int a=8;
    }
}
The generated assembly would be something like this (from Compiler Explorer) without optimizations:
main:
pushq %rbp
movq %rsp, %rbp
movl $7, -4(%rbp) // outer scope: int a=7
movl $8, -8(%rbp) // inner scope: int a=8
movl $0, %eax
popq %rbp
ret
How does the compiler know where the variable is if there are duplicately-named variables? That is, when in the inner scope, the memory address is at %rbp-8 and when in the outer scope the address is at %rbp-4.
There are many ways to implement the local scoping rule. Here is a simple example:
the compiler can keep a list of nested scopes, each with its own list of symbol definitions.
this list initially has a single element for the global scope,
when it parses a function definition, it adds a new scope element in front of the scope list for the function argument names, and adds each argument name with the corresponding information in the identifier list of this scope element.
for each new block, it adds a new scope element in front of the scope list. A for ( introduces a new scope too, for the definitions in its first clause.
upon leaving the scope (at the end of the block), it pops the scope element from the scope list.
when it parses a declaration or a definition, if the corresponding symbol is already in the current scope's list, it is a local redefinition, which is forbidden (except for extern forward declarations). Otherwise the symbol is added to the current scope's list.
when it encounters a symbol in an expression, it looks it up in the current scope's list of symbols, and then in each successive scope in the scope list until it finds it. If the symbol cannot be found, it is undefined, which is an error according to the latest C Standard. Otherwise the symbol information is used for further parsing and code generation (a C sketch of this scope list follows below).
The above steps are performed for type and object names; a separate list of symbols is maintained for struct, union and enum tags.
Preprocessing is performed before all of this occurs, in a separate phase of program translation.
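A minimal sketch in C of the scope-list idea described above; the structure and function names are illustrative, not taken from any particular compiler.
#include <stdlib.h>
#include <string.h>

/* One entry per declared identifier. */
typedef struct Symbol {
    const char    *name;
    int            stack_offset;   /* e.g. -4, -8 ... relative to %rbp */
    struct Symbol *next;
} Symbol;

/* One element per nested scope; the head of the chain is the innermost scope. */
typedef struct Scope {
    Symbol       *symbols;
    struct Scope *outer;
} Scope;

static Scope *push_scope(Scope *outer) {        /* entering a block */
    Scope *s = calloc(1, sizeof *s);
    s->outer = outer;
    return s;
}

static Scope *pop_scope(Scope *s) {             /* leaving a block */
    Scope *outer = s->outer;
    free(s);                                    /* symbols leaked for brevity */
    return outer;
}

static void declare(Scope *s, const char *name, int offset) {
    Symbol *sym = calloc(1, sizeof *sym);
    sym->name = name;
    sym->stack_offset = offset;
    sym->next = s->symbols;                     /* newest declaration first */
    s->symbols = sym;
}

/* Look in the innermost scope first, then walk outward; an inner 'a'
   therefore shadows an outer one. */
static Symbol *lookup(Scope *s, const char *name) {
    for (; s; s = s->outer)
        for (Symbol *sym = s->symbols; sym; sym = sym->next)
            if (strcmp(sym->name, name) == 0)
                return sym;
    return NULL;                                /* undeclared identifier */
}
For the example above, the compiler would declare a at offset -4 in the function scope, push a new scope for the inner block, declare the second a at -8 there, and pop that scope at the closing brace, so a later lookup of a finds -4 again.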
The C programming language has a specification, such as n1570 (or newer ones). That specification defines the scope of an identifier in §6.2.1.
So any C compiler should follow that specification.
How a C compiler implements that specification requires a good book to explain. I recommend the Dragon book.
Some simple or complex C compilers are open source. Look inside the source code of TinyCC, nwcc, Clang, or GCC to understand how they implement that specification (they have symbol tables, but details are specific to each compiler).
How does the compiler know where the variable is if there are duplicately-named variables?
It manages symbol tables, and updates them when parsing blocks. Usually, a compiler builds some abstract syntax tree of the compiled source code, and the leaves in that tree representing variables refer to some symbol table. The GCC compiler documents its Generic Tree and GIMPLE data structures and provides dump options to output them. You could also compile your foo.c as gcc -S -O -fverbose-asm foo.c and look into the emitted assembler code foo.s.
Finally, your example can be considered poor programming style. Some coding guidelines (like MISRA-C or the GNU coding standards) disallow or discourage it. Your code review process should catch such code (in my opinion, your example is quite unreadable code).
My feeling is that single-letter variables should have a very small scope - a dozen lines at most.
I suggest looking (for inspiration) inside the C code of existing free software projects (like GNU bash or GNU make). Care has been taken to choose understandable names.
Take advantage of modern source code editors like GNU emacs or vim. You can configure them to type long identifiers with a few keyboard presses (they have auto-completion, and some input libraries like GNU readline provide that too). Since you (or your colleagues) will spend much more time reading source code than typing it, such an effort (naming your variables and identifiers well) is worth your valuable time.
If you use GCC as your compiler, invoke it as gcc -Wall -Wextra -g to get a lot of warnings and debug information. You could also use static source code analysis tools like Frama-C or the Clang static analyzer.
For real life software projects (for example GTK), you'll have a document specifying coding conventions, and you could write some GCC plugin checking most of them. See also the DECODER project.
For some parts of your software project, you may use C code generators like SWIG or GNU bison. In some cases, you would have your own C code generator. Then be sure to generate long C identifiers to reduce the possibility of name clashes.
Some code obfuscation tools rename most C identifiers. If you ship C source code without comments and with most identifiers generated like _0TwK4TkhEG, the resulting C code can be compiled at your client's site and will practically stay unreadable. You could technically write a code obfuscator transforming readable C code into cryptic C code.

Is it possible in practice to compile millions of small functions into a static binary?

I've created a static library with about 2 million small functions, but I'm having trouble linking it to my main function, using GCC (tested 4.8.5 or 7.3.0) under Linux x86_64.
The linker complains about relocation truncations, very much like those in this question.
I've already tried using -mcmodel=large, but as the answer to that same question says, I would
"need a crt1.o that can handle full 64-bit addresses". I've then tried compiling one, following this answer, but recent glibc won't compile under -mcmodel=large, even if libgcc does, which accomplishes nothing.
I've also tried adding the flags -fPIC and/or -fPIE to no avail. The best I get is this sole error:
ld: failed to convert GOTPCREL relocation; relink with --no-relax
and adding that flag also doesn't help.
I've searched around the Internet for hours, but most posts are very old and I can't find a way to do this.
I'm aware this is not a common thing to try, but I think it should be possible to do this. I'm working in an HPC environment, so memory or time constraints are not the issue here.
Has anyone been successful in accomplishing something similar with a recent compiler and toolchain?
Either don't use the standard library or patch it. As of version 2.34, Glibc doesn't support the large code model. (See also the Glibc mailing list and Red Hat Bugzilla.)
Explanation
Let's examine the Glibc source code to understand why recompiling with -mcmodel=large accomplished nothing. It replaced the relocations originating from C files. But Glibc contained hardcoded 32-bit relocations in raw Assembly files, such as in start.S (sysdeps/x86_64/start.S).
call *__libc_start_main@GOTPCREL(%rip)
start.S emitted R_X86_64_GOTPCREL for __libc_start_main, which uses relative addressing. The x86_64 CALL instruction doesn't support relative jumps with more than a 32-bit displacement, see AMD64 Manual 3. So ld couldn't resolve the R_X86_64_GOTPCREL relocation because the code size surpassed 2GB.
Adding -fPIC didn't help due to the same ISA constraints. For position-independent code, the compiler still generated relative jumps.
Patching
In short, you have to replace the 32-bit relocations in the assembly code. See the System V Application Binary Interface AMD64 Architecture Processor Supplement for more info about implementing 64-bit relocations. See also this for a more in-depth explanation of code models.
Why don't 32-bit relocations suffice for the large code model? Because we can't rely on other symbols being in a range of 2GB. All calls must become absolute. Contrast with the small PIC code model, where the compiler generates relative jumps whenever possible.
Let's look closely at the R_X86_64_GOTPCREL relocation. It contains the 32-bit difference between RIP and the symbol's GOT entry address. It has a 64-bit substitute — R_X86_64_GOTPCREL64, but I couldn't find a way to use it in Assembly.
So, to replace the GOTPCREL, we have to compute the symbol's offset from the GOT base and the GOT address itself. We can calculate the GOT location once in the function prologue because it doesn't change.
First, let's get the GOT base (code lifted wholesale from the ABI Supplement). The GLOBAL_OFFSET_TABLE relocation specifies the offset relative to the current position:
leaq 1f(%rip), %r11
1: movabs $_GLOBAL_OFFSET_TABLE_, %r15
leaq (%r11, %r15), %r15
With the GOT base residing in the %r15 register, we now have to find the symbol's GOT entry offset. The R_X86_64_GOT64 relocation specifies exactly this. With it, we can rewrite the call to __libc_start_main as:
movabs $__libc_start_main@GOT, %r11
call *(%r11, %r15)
We replaced R_X86_64_GOTPCREL with GLOBAL_OFFSET_TABLE and R_X86_64_GOT64. Replace others in the same vein.
N.B.: Replace R_X86_64_GOT64 with R_X86_64_PLTOFF64 for functions from dynamically linked executables.
Testing
Verify the correctness of the patch using the following test, which requires the large code model. It doesn't contain a million small functions; instead it has one huge function and one small one.
Your compiler must support the large code model. If you use GCC, you'll need to build it from the source with the flag -mcmodel=large. Startup files shouldn't contain 32-bit relocations.
The foo function takes more than 2GB, rendering 32-bit relocations unusable. Thus, the test will fail with an overflow error if compiled without -mcmodel=large. Also add the flags -O0 -fPIC -static, and link with gold.
extern int foo();
extern int bar();

int foo(){
    bar();
    // Call sys_exit
    asm( "mov $0x3c, %%rax \n"
         "xor %%rdi, %%rdi \n"
         "syscall \n"
         ".zero 1 << 32 \n"
         : : : "rax", "rdi", "rdx");
    return 0;
}

int bar(){
    return 0;
}

int __libc_start_main(){
    foo();
    return 0;
}

int main(){
    return 0;
}
N.B. I used patched Glibc startup files without the standard library itself, so I had to define both __libc_start_main and main.

Why must we recompile C source code for a different OS on the same machine?

When I compile my C source code (for example in a Linux environment), the compiler generates a file in a "machine readable" format.
Why does the same file not work on the same machine under a different operating system?
Is the problem in the way we "execute" this file?
Sometimes it will work, depending on the format and the libraries that you use, etc. For example, things like allocating memory or creating a window all call OS functions. So you have to compile for the target OS, with those libraries linked in (statically or dynamically).
However, the instructions themselves are the same. So, if your program doesn't use any of the OS functions (no standard or any other library), you could run it on another OS. The second thing that is problematic here is executable formats. A Windows .exe is very different from, for example, ELF. However, a flat format that contains just the instructions (such as .com) would work on all systems.
EDIT: A fun experiment would be to compile some functions to a flat format (just the instructions) on one OS (e.g. Windows). For example:
int add(int x, int y) { return x + y; }
Save just the instructions to a file, without any relocation or other staging info. Then, on a different OS (e.g. Linux) compile a full program that will do something like this:
typedef int (*PFUNC)(int, int); // pointer to a function like our add one
void *buf = malloc(200);        // make sure you have enough space
FILE *f = fopen("add.com", "rb");
fread(buf, 200, 1, f);          // load the raw instructions into the buffer
fclose(f);
PFUNC p = (PFUNC)buf;           // treat the loaded bytes as code
int ten = p(4, 6);
For this to work, you'd also need to tell the OS/compiler that the allocated memory may be executed (for example by obtaining it with mmap and PROT_EXEC, or mprotect, on Linux, or VirtualAlloc with PAGE_EXECUTE_READWRITE on Windows, instead of plain malloc).
I have been asked what an ABI discrepancy is. I think it's best to explain with a simple example.
Consider a little silly function:
int f(int a, int b, int (*g)(int, int))
{
return g(a * 2, b * 3) * 4;
}
Compile it for x64/Windows and for x64/Linux.
For x64/Windows the compiler emits something like:
f:
sub rsp,28h
lea edx,[rdx+rdx*2]
add ecx,ecx
call r8
shl eax,2
add rsp,28h
ret
For x64/Linux, something like:
f:
sub $0x8,%rsp
lea (%rsi,%rsi,2),%esi
add %edi,%edi
callq *%rdx
add $0x8,%rsp
shl $0x2,%eax
retq
Allowing for different traditional notations of assembly language on Windows and Linux, there obviously are substantial differences in the code.
The Windows version clearly expects a to arrive in ECX (lower half of the RCX register), b in EDX (lower half of the RDX register), and g in the R8 register. This is mandated by the x64/Windows calling convention, which is a part of the ABI (application binary interface). The code also prepares arguments to g in ECX and EDX.
The Linux version expects a in EDI (the lower half of the RDI register), b in ESI (the lower half of the RSI register), and g in the RDX register. This is mandated by the calling convention of System V AMD64 ABI (used on Linux and other Unix-like operating systems on x64). The code prepares arguments to g in EDI and ESI.
Now imagine that we run a Windows program which somehow extracts the body of f from a Linux-targeted module and calls it:
int g(int a, int b);
typedef int (*G)(int, int);
typedef int (*F)(int, int, G);
F f = (F) load_linux_module_and_get_symbol("module.so", "f");
int result = f(3, 4, &g);
What is going to happen? Since on Windows functions expect their arguments in ECX, EDX and R8, the compiler will place actual arguments in those registers:
mov edx,4
lea r8,[g]
lea ecx,[rdx-1]
call qword ptr [f1]
But the Linux-targeted version of f looks for values elsewhere. In particular, it is looking for the address of g in RDX. We have just initialized its lower half to 4, so there is practically no chance that RDX will contain anything that makes sense. The program will most likely crash.
Running Windows-targeted code on a Linux system will produce the same effect.
Thus, we cannot run 'foreign' code except through a thunk. A thunk is a piece of low-level code which rearranges arguments to allow calls between pieces of code following different sets of rules. (Thunks may have to do more than that, because the effects of the ABI are not limited to the calling convention.) You typically cannot write a thunk in a high-level programming language.
Note that in our scenario we need to provide thunks for both f ('host-to-foreign') and g ('foreign-to-host').
There are two things of importance:
the development environment;
the target platform.
The development environment's compiler generates an object file with machine code and references to functions and data not contained in the object module (not defined in the source file). Another program, the linker, combines all your object modules, plus libraries, into the executable. Please note:
The format of the object module is in principle platform independent, although standards exist so that platforms can easily combine object modules produced by different compilers for the platform. But that doesn't need to be the case; a fully integrated development environment can have its own "standard".
The linker can be a program from any manufacturer. It needs to know the format of the object modules, the format of the libraries and the desired format of the resulting executable. Only this latter format is platform dependent.
The libraries can be in any format, as long as there is a linker that can read them. BUT: the libraries are platform dependent, as the functions in the library call the API of the operating system.
A cross-development environment could for example generate object modules that are Windows compatible, then a linker can link them with libraries in Windows-compatible format, yet targeted for Linux (using Linux OS calls), and deliver a Linux executable. Or any combination you like (Linux object format, Windows library format, Windows executable; ...).
To summarize, the only truly platform dependent items are the functions in the libraries, as these call the OS, and the resulting executable as that is what the OS will load.
So, to answer the question: no, there is not necessarily a need to compile a source file for different platforms. The same object module can be linked for Linux (using Linux targeted libraries and creating a linux-format executable), or for Windows (using Windows targeted libraries and creating a Windows-format executable).
Different operating systems will use different Application Binary Interfaces (ABIs), this includes code needed for function entry and exit
Certain language features may need direct platform support (things like thread local storage come to mind)
The linker will generally link automatically to the toolchain-specific standard library. This will need to change between operating systems, if for no other reason than that each operating system has its own set of system calls.
Having said that, the Wine project is a good example where all these issues have been addressed to try to make Windows code run on Linux.
You are right, compiling translates your source code into machine readable code, e.g. into x86 machine code.
But there is more to it than that. Your code often not only uses machine code that is compiled into your executable file, but also references operating system libraries. All modern operating systems supply different APIs and libraries to programs. So if your program is built to work with, e.g., some Linux libraries and is then executed under an operating system that doesn't contain these libraries, it will not run.
The other thing here is the executable file format. Most executable files contain more than just executable machine code, but also some metadata, e.g. icons, information about how the file is packed, version numbers and quite a bit more.
So by default, if you run e.g. a Windows .exe file on Linux, the operating system would not be able to handle that different file format correctly.
Systems like Wine add the missing libraries and are able to handle the different executable file formats, thus allowing you to run e.g. a Windows .exe file on Linux as if it was run on Windows natively.
There are several good, general answers here. I'll give you a very specific example.
An x86 machine can easily run printf("Hello world") on both 32-bit Linux and DOS, if the C file is compiled for each platform.
One of many major differences between operating systems is how a program instructs the operating system to provide the services it does. Here is how you ask Linux to print a string:
msg db "Hello world" # Define a message with no terminator
mov edx, 11 # Put the message length in the edx register
mov ecx, msg # Put the message address in ecx
mov ebx, 1 # Put the file descriptor in ebx (1 meaning standard output)
mov eax, 4 # Set the system call to 4, "write to file descriptor"
int 80h # Invoke interrupt 80h to give control to Linux
Here is how you ask DOS to print the same string:
msg db "Hello world$" # Define a message terminated by a dollar sign
mov dx, msg # Load the message address into dx
mov ah, 9 # Set the system call number to 9, "print string"
int 21h # Invoke interrupt 21h to give control to DOS
They both use the same kind of basic, machine readable and executable instructions, but the directions are as different as English and Chinese.
So can't you teach Linux how to understand directions intended for DOS, and run the same file on both? Yes you can, and that's what DosEmu did back in the day. It's also how Linux+Wine runs Windows software, and how FreeBSD runs Linux software. However, it's a lot of headache and additional work, and may still not be very compatible.
I post this reply to Andrey's discussion about ABIs as an answer because it is too much for a comment and requires the formatting of an Answer.
Andrey, what you are showing has nothing to do with Linux or Windows. It is an example of a development environment using certain conventions. All object modules and modules in libraries must adhere to these conventions, and nothing else. It isn't Linux or Windows that expect values in certain registers, it is the development environment.
The following is the more standard way of C calling conventions (Visual Studio 2008). In all cases, the caller must evaluate parameters right-to-left as per the C standard:
int f(int a, int b, int (*g)(int, int))
{
        push ebp
        mov ebp,esp
    return g(a * 2, b * 3) * 4;
        mov eax,dword ptr [ebp+0Ch]
        imul eax,eax,3
        push eax
        mov ecx,dword ptr [ebp+8]
        shl ecx,1
        push ecx
        call dword ptr [ebp+10h]
        add esp,8
        shl eax,2
        mov esp,ebp
        pop ebp
        ret
}
The caller pushes the parameters right-to-left and calls the callee
The callee saves the stack frame pointer, usually ebp on Intel, and adjusts esp for local storage (none here)
The callee references the parameters relative to ebp
The callee performs its function
The callee restores ebp and returns
The caller removes the parameters of the call from the stack, e.g. add esp,8
Again, it is the development environment that dictates these conventions, not the OS. The OS may have its own conventions for applications to request services. These are then implemented in the OS-targeted libraries.

__fastcall vs register syntax?

Currently I have a small function which gets called very very very often (looped multiple times), taking one argument. Thus, it's a good case for a __fastcall.
I wonder though.
Is there a difference between these two syntaxes:
void __fastcall func(CTarget *pCt);
and
void func(register CTarget *pCt);
After all, those two syntaxes basically tell the compiler to pass the argument in registers right?
Thanks!
__fastcall defines a particular convention.
It was first added by Microsoft to define a convention in which the first two arguments that fit in the ECX and EDX registers are placed in them (on x86; on x86-64 the keyword is ignored, though the convention used there already makes even heavier use of registers anyway).
Some other compilers also have a __fastcall or fastcall. GCC's is much like Microsoft's. Borland uses EAX, EDX & ECX.
Watcom recognises the keyword for compatibility, but ignores it and uses EAX, EDX, EBX & ECX regardless. Indeed, it was the belief that this convention was behind Watcom beating Microsoft on several benchmarks a long time ago that led to the invention of __fastcall in the first place. (So MS could produce a similar effect, while the default would remain compatible with older code).
_mregparam can also be used with some compilers to change the number of registers used (some builds of the Linux kernel are built with Intel's compiler or GCC but with _mregparam set to 3, so as to give a result similar to that of __fastcall on Borland).
It's worth noting that, the state of the art having moved on in many regards (the caching that happens in CPUs being particularly relevant), __fastcall may in fact be slower than some other conventions in some cases.
None of the above is standard.
Meanwhile, register is a standard keyword originally defined as "please put this in a register if possible" but more generally meaning "The address of this automatic variable or parameter will never be used. Please make use of this in optimising, in whatever way you can". This may mean en-registering the value, it may be ignored, or it may be used in some other compiler optimisation (e.g. the fact that the address cannot be taken means certain types of aliasing error can't happen with certain optimisations).
As a rule, it's largely ignored because compilers can tell whether you took an address or not and just use that information (or indeed keep a memory location, copy it into a register for a bunch of work, then copy it back before the address is used). Conversely, it may be ignored in function signatures just to allow conventions to remain conventions (especially if exported; then it would either have to be ignored, or have to be considered part of the signature; as a rule, it's ignored by most compilers).
And all of this becomes irrelevant if the compiler decides to inline, as there is then no real "argument passing" at all.
register is enforced, so it can serve as an assertion that you won't take the address; any attempt to do so is then a compile error.
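For example, a minimal illustration of that assertion (the names are just for show):
void f(register int x) {
    int *p = &x;    /* compile error: cannot take the address of a register variable */
    (void)p;
}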
Visual Studio 2012 Microsoft documentation regarding the register keyword:
The compiler does not accept user requests for register variables; instead, it makes its own register choices when global register-allocation optimization (/Oe option) is on. However, all other semantics associated with the register keyword are honored.
Visual Studio 2012 Microsoft documentation regarding the __fastcall keyword:
The __fastcall calling convention specifies that arguments to functions are to be passed in registers, when possible. The following list shows the implementation of this calling convention.
You can still have a look at the assembler code created by the compiler to check what actually happens.
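For instance, with a tiny function like the sketch below (the name is purely illustrative), you can compile once with and once without the keyword, generate assembly listings (e.g. cl /FA or gcc -S; GCC may want __attribute__((fastcall)) instead of __fastcall), and diff the output.
/* With the x86 __fastcall convention, x arrives in ECX instead of on the stack. */
int __fastcall twice(int x) { return x + x; }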
register is essentially meaningless in modern C/C++. Compilers ignore it, putting whichever variables in registers they want (and note that a given variable will often be in a register some of the time, and on the stack some of the time, during the function's execution). It has some minor utility in hinting non-aliasing, but using restrict (or a given compiler's equivalent of restrict) is a better way to achieve that.
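For example, a sketch of the restrict alternative (the function is purely illustrative):
/* restrict promises that dst and src never alias, letting the compiler
   keep loaded values in registers instead of re-reading memory. */
void scale(float *restrict dst, const float *restrict src, int n) {
    for (int i = 0; i < n; ++i)
        dst[i] = 2.0f * src[i];
}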
__fastcall does improve performance slightly, though not as much as you'd expect. If you have a small function which is called often, the number one thing to do to improve performance is to inline it.
In short, it depends on your architecture and your compiler.
The main difference between these two syntaxes is that register is standardized and __fastcall isn't, but they are both calling conventions.
The default calling convention in C is cdecl, where parameters are pushed onto the stack in reverse order, and the return value is stored in the EAX register. Every data register can be used in the function; before the call they are caller-saved.
There is another convention, fastcall, which is indicated by the register keyword. It passes arguments in the EAX, ECX and EDX registers (the remaining args are pushed onto the stack).
And the __fastcall keyword isn't standardized; it totally depends on your compiler. With cl (Visual Studio), it seems to store the first four arguments of your function in registers, except on x86-64 and ARM archs. With gcc, the first two arguments are stored in registers, regardless of the arch.
But keep in mind that compilers are able by themselves to optimize your code to greatly improve its speed. And I bet that for your function there is a better way to optimize your code.
But you need to disable optimisation to use these keywords (volatile as well), which is a thing I totally do not recommend.

Where is declaration for get_pc() in GNU ARM?

I'm building legacy code using the GNUARM C compiler and trying to resolve all the implicit declarations of functions.
I've come across some ARM specific functions and can't find the header file containing the declarations for these functions:
get_pc
get_cpsr
get_sp
I have searched the web and only came up with source code containing these functions without any non-standard include files.
I'll also settle for the function declarations.
Since I will also be porting the code to the Cygwin / Windows platform, what are the equivalent declarations for Cygwin GNU GCC?
Thanks.
Just write your own if you really need those functions; plain asm is easier than inline asm:
.globl get_pc
get_pc:
    mov r0,pc
    bx lr

.globl get_sp
get_sp:
    mov r0,sp
    bx lr

.globl get_cpsr
get_cpsr:
    mrs r0,cpsr
    bx lr
At least for ARM. If you are porting to x86 and need the equivalents, I have to wonder what the code needs those things for anyway. For the cpsr in particular you would likely have to change any code that uses the result, as status registers across processor vendors/families pretty much never match. The x86 equivalents should still be about the same level of effort; it takes longer to do a Google search and read the results than it does to just write the code (if you know the processor).
Depending on what your application is doing it is probably better to just comment out any code that calls those functions and/or uses the return value. I can imagine a few reasons why those items would be used, but it could get into architecture specific stuff and that is more involved than just porting a few register read functions. So what user786653 asked is the key question. How are these functions used? Not where can I find them but how are they used and why do you think you need them.
Are you sure those are functions? I'm not very familiar with ARM, but those sound like compiler intrinsics to me. If you're moving to GCC, you might be better off replacing those with inline assembly.
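If you do go the inline-assembly route with GCC on 32-bit ARM, sketches along these lines could stand in for the missing functions (the register names are ARM-specific and would have to be rewritten for an x86/Cygwin port):
static inline unsigned int get_sp(void) {
    unsigned int sp;
    __asm__ volatile ("mov %0, sp" : "=r" (sp));     /* copy the stack pointer */
    return sp;
}

static inline unsigned int get_cpsr(void) {
    unsigned int cpsr;
    __asm__ volatile ("mrs %0, cpsr" : "=r" (cpsr)); /* read the status register */
    return cpsr;
}

static inline unsigned int get_pc(void) {
    unsigned int pc;
    __asm__ volatile ("mov %0, pc" : "=r" (pc));     /* current program counter */
    return pc;
}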
