Running the following on Linux x86-64 compiled with gcc -m32
#include <stdio.h>
#include <limits.h>

int main(void) {
    int a = 4;
    int *ptr = &a;
    printf("int* is %zu bits in size\n", CHAR_BIT * sizeof(ptr));
    return 0;
}
results in
int* is 32 bits in size
Why I convinced myself it ought to be 64 bits (prior to executing): since it is running on a 64-bit computer, addressing the memory requires 64 bits, and since &a is the address of where the value 4 is stored, it should be 64 bits. The compiler could implement a trick by using the same offset for all pointers since it is running in compatibility mode, but it couldn't guarantee consistent data after calling malloc multiple times. This reasoning is wrong. Why?
On the hardware level, your typical x86-64 processor has a 32-bit compatibility mode, in which it behaves like an x86 processor. That means memory is addressed using 4 bytes, hence your pointer is 32 bits.
On the software level, the 64-bit kernel allows 32-bit processes to be run in this compatibility mode.
This is how 'old' 32-bit programs can run on 64-bit machines.
The compiler, particularly with the -m32 flag, emits code for 32-bit x86 addressing, so that's why int* is also 32 bits.
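To see that this is a property of the compilation target rather than of the hardware, here is a minimal sketch (plain standard C, nothing platform-specific assumed) that you can build once with gcc -m32 and once for the default x86-64 target; it reports 32-bit pointers in the first case and 64-bit pointers in the second:

#include <stdio.h>
#include <stdint.h>
#include <limits.h>

int main(void) {
    /* All three widths track the compiler's target, not the CPU the
       program happens to run on: 32 bits with gcc -m32, 64 bits with
       the default x86-64 target. */
    printf("void *    : %zu bits\n", CHAR_BIT * sizeof(void *));
    printf("size_t    : %zu bits\n", CHAR_BIT * sizeof(size_t));
    printf("uintptr_t : %zu bits\n", CHAR_BIT * sizeof(uintptr_t));
    return 0;
}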
Modern CPUs have a memory management unit (MMU), which gives every program its own address space; you could even have two different programs using the same addresses. This unit is also what detects segmentation faults (access violations). Because of it, the addresses a program uses are not the same as the addresses on the address bus that connects the CPU to the peripherals, including RAM, so it is no problem for the OS to assign 32-bit addresses to a program.
An x86-64 machine running a 64-bit OS runs 32-bit processes in "compat" mode, which is different from "legacy" mode. In compat mode, user space (i.e. the 32-bit program's point of view) works the same as on a system in legacy mode (32-bit everything).
However, the kernel is still 64-bit, and can map the compat-mode process's virtual address space anywhere in the physical address space (so two different 32-bit processes can each be using 4GB of RAM). IDK if the page tables for a compat process need to be different from those of 64-bit processes. I found http://wiki.osdev.org/Setting_Up_Long_Mode, which has some stuff but doesn't answer that question.
In compat mode, system calls switch the CPU to 64-bit long mode, and returns from system calls switch back. Kernel functions that take a user-space pointer as an argument need simple wrappers to do whatever is necessary to get the appropriate address for use from kernel space.
The high-level answer is that there's hardware support for everything compat mode needs, so it can be just as fast as legacy mode (a 32-bit kernel).
IIRC, 32-bit virtual addresses get zero-extended to 64 bits by the MMU hardware, so the kernel just sets up the page tables accordingly.
If you use an address-size override prefix in 64-bit code, the 32-bit address formed from the 32-bit registers involved is zero-extended. (There's also an x32 ABI for code that doesn't need more than 4GB of RAM and would benefit from smaller pointers, but still wants the performance benefit of more registers and of having them be 64 bits wide.)
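As a rough illustration of the three x86 ABIs (this assumes GCC/Clang's usual predefined macros __x86_64__, __i386__, and __ILP32__), the following sketch distinguishes them at compile time; under gcc -mx32 the pointers are 4 bytes even though the code runs in long mode:

#include <stdio.h>

int main(void) {
#if defined(__x86_64__) && defined(__ILP32__)
    /* x32 ABI: 64-bit registers and long mode, but 32-bit pointers. */
    puts("x32 ABI");
#elif defined(__x86_64__)
    puts("ordinary 64-bit x86-64 ABI");
#elif defined(__i386__)
    puts("32-bit i386 ABI (-m32)");
#endif
    printf("sizeof(void *) = %zu, sizeof(long) = %zu, sizeof(long long) = %zu\n",
           sizeof(void *), sizeof(long), sizeof(long long));
    return 0;
}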
Related
Is there an environment variable in C which stores the heap word size, or at least a variable which stores the type of system?
For example, on a 64-bit system it would be 8 (bytes) and on a 32-bit system it would be 4 (bytes).
Note that 64-bit systems can execute 32-bit binaries; in that case, sizeof(void *), sizeof(int), ... will be 4, even on a 64-bit system.
You can get some additional mileage using the uname system call (see uname -m). For Intel, it will be x86_64 (64-bit) or i686 (32-bit). If you need a solution for Intel only, this can work. You can extend it to other processors (ARM, etc.), but you will need to handle each platform your code may run on. See "machine" in: https://man7.org/linux/man-pages/man2/uname.2.html
To make things more complex, you might be running a 32-bit operating system on a 64-bit processor (or in some virtualized environment). In those cases, uname reports on the operating system, not on the processor. It's not clear which one you are looking for.
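A minimal sketch of that approach, using uname(2) from <sys/utsname.h>. Note that on a 64-bit Linux kernel the machine field typically still reports x86_64 even inside a 32-bit binary (unless a 32-bit personality such as linux32 is set), whereas sizeof(void *) always describes the binary itself:

#include <stdio.h>
#include <sys/utsname.h>

int main(void) {
    struct utsname u;

    if (uname(&u) != 0) {
        perror("uname");
        return 1;
    }
    /* u.machine is what `uname -m` prints, e.g. "x86_64" or "i686". */
    printf("machine        : %s\n", u.machine);
    /* This, by contrast, reflects how the running binary was compiled. */
    printf("sizeof(void *) : %zu bytes\n", sizeof(void *));
    return 0;
}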
On a 32-bit machine, if you copied an int p, it would copy 4 bytes of information, which would be addressed at 0xbeefbeef, 0xbeefbef0, 0xbeefbef1, and 0xbeefbef2 respectively.
Is this the same on 64-bit? Or does it store 2 bytes at a single address?
It depends on the architecture. On most "normal" 64-bit systems (e.g. arm64, x86_64, etc.) memory is "byte addressed," so each memory address refers to one byte (so it's the same as your 32-bit example).
There are systems out there which are not byte-addressed, and this can include 64-bit architectures. DSPs are a classic example of systems where char can be 32 bits (or more) and an individual byte (or rather, octet) is not addressable.
On the amd64 architecture (also called x86_64 and x64, the most common 64-bit architecture), each addressable unit still refers to one byte of memory (8 bits).
Additionally, an int still usually occupies 4 bytes of memory (32 bits), though this can vary from compiler to compiler (as it also does on 32-bit systems).
What will be different is the size of a pointer. On a 32-bit system, pointers are normally 32 bits, but they are 64 bits (8 bytes) on a 64-bit system. This allows the computer to address more bytes of memory, but each byte is still 8 bits long.
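A small sketch of that point: on a byte-addressed machine the four bytes of an int sit at four consecutive addresses regardless of whether the build is 32-bit or 64-bit; only the width of the pointer that holds those addresses changes:

#include <stdio.h>

int main(void) {
    int p = 0x11223344;
    unsigned char *bytes = (unsigned char *)&p;

    printf("sizeof(int)    = %zu\n", sizeof(int));     /* typically 4 on both */
    printf("sizeof(void *) = %zu\n", sizeof(void *));  /* 4 on 32-bit, 8 on 64-bit */

    /* Each successive address refers to exactly one byte. */
    for (size_t i = 0; i < sizeof p; i++)
        printf("byte %zu at %p holds 0x%02x\n", i, (void *)(bytes + i), bytes[i]);
    return 0;
}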
As far as I can tell, a 32-bit program uses the flat memory model, and so does a 64-bit one. A 32-bit program can only address 4GB, while using 64-bit registers (rcx, for example) makes it possible to use the 40 to 48 physical address bits modern CPUs provide and address even more.
So besides this, and some additional control registers that a 32-bit processor does not have, I ask myself whether it is possible to run 32-bit code flawlessly on Linux.
I mean, must every piece of C code I execute be 64-bit, for instance?
I can understand that, since C builds upon a stack frame and a base pointer, pushing a 32-bit base pointer onto the stack may introduce problems where the stack pointer is 64-bit and one might use the push and pop opcodes in a 32-bit fashion.
So what are the differences, and is it possible to actually run 32-bit code when running a 64-bit Linux kernel?
[Update]
To state the scenario clearly: I am running a 64-bit program, load an ELF64 file into memory, map everything, and call the code directly. The idea is to generate asm code dynamically.
The main difference between them is the different calling conventions. On 32-bit there are several types: __stdcall, __fastcall, ...
On 64-bit (x64) there's only one (on Windows® platforms; I don't know about others), and it has some requirements which are very different from 32-bit.
More on https://future2048.blogspot.com
Note that ARM and IA64 (Itanium) also use different instruction encodings than x64 (Intel64/AMD64).
And you have 8 more general-purpose registers, r8..r15, with sub-registers r8d..r15d, r8w..r15w, and r8b..r15b.
For SIMD-based code, 8 additional registers xmm8..xmm15 are also present.
Exception handling is data-based on 64-bit; on 32-bit it was code-based. So on 64-bit, no instructions are executed to build an exception frame for unwinding; exception handling is completely data-driven, so no additional instructions are required for try/catch.
The memory limit of 2GB for 32-bit apps (or 3GB with /LARGEADDRESSAWARE on a 32-bit Windows OS, or 4GB on a 64-bit Windows OS) is now much larger.
More on https://msdn.microsoft.com/en-us/library/windows/desktop/aa366778(v=vs.85).aspx
And of course, the general-purpose registers are 64 bits wide instead of 32. So any integer calculation can process values bigger than the 32-bit limit of 0..4294967295 (signed: -2147483648..+2147483647).
Also, reading and storing memory with a simple MOV instruction can transfer a QWORD (64 bits) at once; on 32-bit that could only transfer a DWORD (32 bits).
Some instructions have been removed: PUSHA + POPA disappeared.
And one encoding form of INC/DEC is now used as the REX prefix byte.
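As a small C-level illustration of the wider registers mentioned above (the exact instructions chosen are of course up to the compiler): a 64-bit build keeps this value in a single general-purpose register and stores it with one MOV, while a 32-bit build has to split it across a register pair and two stores:

#include <stdio.h>
#include <stdint.h>

int main(void) {
    uint64_t a = (uint64_t)1 << 32;   /* 2^32, already past the 32-bit range */
    uint64_t b = a * 10 + 7;          /* done with 64-bit register arithmetic on x64 */

    printf("b = %llu\n", (unsigned long long)b);
    return 0;
}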
Some 32 bit code will work in a 64 bit environment without modification. However, in general, functions won't work because the calling conventions are probably different (depends on the architecture). Depending on the program, you could write some glue to convert arguments to the calling convention you want. So you can't just link a 32-bit library into your 64-bit application.
However, if your entire application is 32-bit, you will probably be able to run it just fine. The word size of the kernel doesn't really matter: Linux, OS X, and Windows all support running 32-bit code on a 64-bit kernel.
In short: you can run 32-bit applications on a 64-bit system but you can't mix 32-bit and 64-bit code in the same application (barring deep wizardry).
I'm doing a kernel mode driver, and I've run into a bit of a bug when running the code on 64-bit.
The code runs fine on 32-bit, but when I build/run in amd64 I'm getting strange results. I read up a little on 64 bit pointers and addressing vs 32bit vs 16bit (in win32) and I'm sure I'm missing something regarding the fundamentals of pointers in the 64bit architecture.
Here is the C code that works just fine in 32-bit.
ncImageLoadEventSettings.buff is a char* and ncILHead->count is simply an int.
// Calculate offset
pnt = (void *)(ncImageLoadEventSettings.buff
               + sizeof(struct NC_IL_HEAD)
               + (ncILHead->count * sizeof(struct NC_IL_INFO)));
This code calculates the address at which to write a struct object onto a buffer (beginning at .buff), which works perfectly fine in 32-bit mode.
It should be noted that the program reading this buffer is 32-bit. I think I read somewhere that structs built in 64-bit mode can have different sizes than those built in 32-bit mode.
The 32-bit reader program reads some of the buffer's contents just fine, while the majority of the entries are garbage.
Is this the proper way to calculate addresses, or might there be an issue with the 64-bit vs 32-bit reader application that is reading that buffer?
See http://en.wikipedia.org/wiki/Data_structure_alignment#Typical_alignment_of_C_structs_on_x86
In general, pointers are larger (64-bit), and most fields that are 64 bits in size (including pointers) will be 8-byte aligned (with padding added).
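Here is a hedged sketch of why that bites in this scenario, using a hypothetical header struct (not your actual NC_IL_HEAD, whose definition isn't shown): the pointer member grows from 4 to 8 bytes in the 64-bit driver build, and the alignment padding moves the data that follows, so a 32-bit reader assuming the old offsets sees garbage. Using only fixed-width, pointer-free fields (or explicit packing) in shared buffers keeps both layouts identical.

#include <stdio.h>
#include <stddef.h>

/* Hypothetical layout for illustration only. */
struct example_head {
    int   count;
    char *buff;     /* 4 bytes in a 32-bit build, 8 bytes in a 64-bit build */
};

int main(void) {
    /* 32-bit build: offsetof(buff) == 4, sizeof == 8.
       64-bit build: offsetof(buff) == 8 (padding after count), sizeof == 16. */
    printf("offsetof(struct example_head, buff) = %zu\n",
           offsetof(struct example_head, buff));
    printf("sizeof(struct example_head)         = %zu\n",
           sizeof(struct example_head));
    return 0;
}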
On my OS X box, the kernel is a 32-bit binary and yet it can run a 64-bit binary.
How does this work?
cristi:~ diciu$ file ./a.out
./a.out: Mach-O 64-bit executable x86_64
cristi:~ diciu$ file /mach_kernel
/mach_kernel: Mach-O universal binary with 2 architectures
/mach_kernel (for architecture i386): Mach-O executable i386
/mach_kernel (for architecture ppc): Mach-O executable ppc
cristi:~ diciu$ ./a.out
cristi:~ diciu$ echo $?
1
The CPU can be switched from 64 bit execution mode to 32 bit when it traps into kernel context, and a 32 bit kernel can still be constructed to understand the structures passed in from 64 bit user-space apps.
The MacOS X kernel does not directly dereference pointers from the user app anyway, as it resides in its own separate address space. A user-space pointer in an ioctl call, for example, must first be resolved to its physical address, and then a new virtual address must be created in the kernel address space. It doesn't really matter whether that pointer in the ioctl was 64 bits or 32 bits; the kernel does not dereference it directly in either case.
So mixing a 32 bit kernel and 64 bit binaries can work, and vice-versa. The thing you cannot do is mix 32 bit libraries with a 64 bit application, as pointers passed between them would be truncated. MacOS X supplies more of its frameworks in both 32 and 64 bit versions in each release.
It's not the kernel that runs the binary. It's the processor.
The binary does call library functions, and those need to be 64-bit. And if they need to make a system call, it's their responsibility to cope with the fact that they themselves are 64-bit while the kernel is only 32-bit.
But that's not something you would have to worry about.
Note that not all 32-bit kernels are capable of running 64-bit processes. Windows certainly doesn't have this property and I've never seen it done on Linux.
The 32 bit kernel that is capable of loading and running 64 bit binaries has to have some 64 bit code to handle memory mapping, program loading and a few other 64 bit issues.
However, the scheduler and many other OS operations aren't required to work in 64-bit mode; the kernel switches the processor to 32-bit mode and back as needed to handle drivers, tasks, memory allocation and mapping, interrupts, etc.
In fact, most of what the OS does wouldn't necessarily perform any faster at 64 bits: the OS is not a heavy data processor, and the portions that are (streams, disk I/O, etc.) are likely converted to 64-bit anyway (as plugins to the OS).
But the bare kernel itself probably won't task-switch any faster if it were 64-bit.
This is especially the case when most people are still running 32-bit apps, so the mode switching isn't always needed; even though it's a low-overhead operation, it does take some time.
-Adam
An ELF32 file can contain 64-bit instructions and run in 64-bit mode. The only thing it retains is that the organization of the header and symbols is in 32-bit format: symbol table offsets are 32 bits, symbol table entries are 32 bits wide, etc. A file which contains both 64-bit code and 32-bit code can expose itself as a 32-bit ELF file while using 64-bit registers for its internal calculations. mach_kernel is an example of the same idea (in Mach-O rather than ELF form). The advantage it gets is that 32-bit driver binaries can be linked against it. As long as it takes care to pass only pointers located below 4GB to the other linked binaries, it will work fine.
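If you just want to check which header layout a given file uses, here is a minimal sketch (Linux, using the constants from <elf.h>) that reads only the e_ident bytes; note this reports the container format, not which instructions the code inside actually uses:

#include <stdio.h>
#include <string.h>
#include <elf.h>    /* EI_NIDENT, EI_CLASS, ELFCLASS32/64, ELFMAG, SELFMAG */

int main(int argc, char **argv) {
    unsigned char ident[EI_NIDENT];
    FILE *f;

    if (argc < 2) {
        fprintf(stderr, "usage: %s <elf-file>\n", argv[0]);
        return 1;
    }
    f = fopen(argv[1], "rb");
    if (!f || fread(ident, 1, sizeof ident, f) != sizeof ident) {
        perror(argv[1]);
        return 1;
    }
    fclose(f);

    if (memcmp(ident, ELFMAG, SELFMAG) != 0)
        puts("not an ELF file");
    else if (ident[EI_CLASS] == ELFCLASS32)
        puts("ELF32 header layout");
    else if (ident[EI_CLASS] == ELFCLASS64)
        puts("ELF64 header layout");
    return 0;
}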
For the kernel to be 64-bit would only bring the effective advantage that kernel extensions (i.e., typically drivers) could be 64-bit. In fact, you'd need to have either all 64-bit kernel extensions, or (as is the case now) all 32-bit ones; they need to be native to the architecture of the running kernel.