How to check msr.le at runtime using built-ins? - c

This question came up in a Power8 in-core crypto patch. The patch provides AES using Power8 built-ins. When loading a VSX register we need to perform a 128-bit endian reversal when running on a little-endian machine to ensure the VSX register loads the proper value.
At compile time we can check macros like __BYTE_ORDER__. However, I believe we are supposed to check the machine status register at runtime. If msr.le=1, then we perform the endian swap. Also see the AltiVec Programming Environment Manual, Section 3.1.4, p. 3-5.
How do we check the machine status register at runtime using built-ins?

You don't need to - it's known at compile time. Your instructions will be encoded completely incorrectly if you're running in the opposite endianness of your compiled code. So, your OS will ensure that your program is running in the correct MSR[LE] setting for the endianness of the executable.
In essence: the MSR[LE] bit controls instructions as well as data loads/stores.
There are some tricks we can use to detect endianness if we really have no idea, but unless you're writing super early boot code, you won't need that.
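As a rough sketch of both approaches mentioned above: the compile-time check relies on the `__BYTE_ORDER__` and `__ORDER_LITTLE_ENDIAN__` macros that GCC and Clang predefine, and the runtime "trick" inspects the first byte of a known integer. The function names are hypothetical and not part of the patch being discussed.

```c
#include <stdint.h>
#include <string.h>

/* Runtime probe: store 1 in a 32-bit integer and look at its first byte.
 * On a little-endian machine the low-order byte comes first in memory. */
static int is_little_endian_runtime(void)
{
    const uint32_t probe = 1;
    unsigned char first_byte;
    memcpy(&first_byte, &probe, 1);
    return first_byte == 1;
}

/* Compile-time answer, assuming a GCC/Clang-style compiler that predefines
 * __BYTE_ORDER__. Falls back to the runtime probe on other compilers. */
static int is_little_endian_compiletime(void)
{
#if defined(__BYTE_ORDER__) && defined(__ORDER_LITTLE_ENDIAN__)
    return __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__;
#else
    return is_little_endian_runtime();
#endif
}
```

For a normal user-space program the two functions should always agree, which is the point of the answer: the compile-time macro is enough.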

Related

Is it possible to build C source code written for ARM to run on x86 platform?

I got some source code in plain C. It is built to run on ARM with a cross-compiler on Windows.
Now I want to do some white-box unit testing of the code. And I don't want to run the test on an ARM board because it may not be very efficient.
Since the C source code is instruction set independent, and I just want to verify the software logic at the C-level, I am wondering if it is possible to build the C source code to run on x86. It makes debugging and inspection much easier.
Or is there some proper way to do white-box testing of C code written for ARM?
Thanks!
BTW, I have read the thread: How does native android code written for ARM run on x86?
It seems not to be what I need.
ADD 1 - 10:42 PM 7/18/2021
The physical ARM hardware that the code targets may not be ready yet. So I want to verify the software logic at a very early phase. Based on John Bollinger's answer, I am thinking about another option: Just build the binary as usual for ARM. Then use QEMU to find a compatible ARM cpu to run the code. The code is assured not to touch any special hardware IO. So a compatible cpu should be enough to run all the code I think. If this is possible, I think I need to find a way to let QEMU load my binary on a piece of emulated bare-metal. And to get some output, I need to at least write a serial port driver to bridge my binary to the serial port.
ADD 2 - 8:55 AM 7/19/2021
Some more background: the C code is targeting the ARMv8 ISA, and it manipulates some hardware IPs which are not ready yet. I am planning to create a software HAL for those IPs and verify the C code over the HAL. If the HAL is good enough, everything can be purely software and I guess the only missing part is an ARMv8 compatible CPU, which I believe QEMU can provide.
ADD 3 - 11:30 PM 7/19/2021
Just found this link. It seems QEMU user mode emulation can be leveraged to run ARM binaries directly on an x86 Linux machine. Will try it and get back later.
ADD 4 - 11:42 AM 7/29/2021
Some useful links:
Override a function call in C
__attribute__((weak)) and static libraries
What are weak functions and what are their uses? I am using a stm32f429 micro controller
Why the weak symbol defined in the same .a file but different .o file is not used as fall back?
Now I want to do some white-box unit testing of the code. And I don't want to run the test on an ARM board because it may not be very efficient.
What does efficiency have to do with it if you cannot be sure that your test results are representative of the real target platform?
Since the C source code is instruction set independent,
C programs vary widely in how portable they are. This tends to be less related to CPU instruction set than to target machine and implementation details such as data type sizes, word endianness, memory size, and floating-point implementation, and implementation-defined and undefined program behaviors.
It is not at all safe to assume that just because the program is written in C, that it can be successfully built for a different target machine than it was developed for, or that if it is built for a different target, that its behavior there is the same.
I am wondering if it is possible to build the C source code to run on x86. It makes debugging and inspection much easier.
It is probably possible to build the program. There are several good C compilers for various x86 and x86_64 platforms, and if your C code conforms to one of the language specifications then those compilers should accept it. Whether the behavior of the result is representative of the behavior on ARM is a different question, however (see above).
It may nevertheless be a worthwhile exercise to port the program to another platform, such as x86 or x86_64 Windows. Such an exercise would be likely to unmask some bugs. But this would be a project in its own right, and I doubt that it would be worth the effort if there is no intention to run the program on the new platform other than for testing purposes.
Or is there some proper way to do white-box testing of C code written for ARM?
I don't know what proper means to you, but there is no substitute for testing on the target hardware that you ultimately want to support. You might find it useful to perform initial testing on emulated hardware, however, instead of on a physical ARM device.
If you were writing ARM code for a Windows desktop application there would be no difference for the most part and the code would just compile and run. My guess is you are developing for some device that does some specific task.
I do this for a lot of my embedded ARM code. Typically the core algorithms work just fine when built on x86 but the whole application does not. The problems come in with the hardware other than the CPU. For example I might be using a LCD display, some sensors, and Free RTOS on the ARM project but the code that runs on Windows does not have any of these. What I do is extract important pieces of C/C++ code and write a test framework around it. In the real ARM code the device is reading values from a sensor and doing something with it. In the test code that runs on a desktop the code reads from a data file with fake sensor values and writes its output to a datafile that can be analyzed. This way I can have white box tests for the most complicated code.
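A minimal sketch of that extraction idea, assuming the algorithm only reaches the sensor through a function pointer. All names here (`read_sensor_fn`, `average_samples`, `fake_sensor`) are hypothetical, and the fake replays values from an in-memory array rather than a data file to keep the example self-contained.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical HAL boundary: the algorithm only sees this function pointer,
 * so it does not care whether a real sensor or a fake is behind it. */
typedef int (*read_sensor_fn)(void *ctx);

/* Core logic under test: integer average of n sensor samples. */
static int average_samples(read_sensor_fn read, void *ctx, size_t n)
{
    assert(read != NULL && n > 0);
    long sum = 0;
    for (size_t i = 0; i < n; i++) {
        sum += read(ctx);
    }
    return (int)(sum / (long)n);
}

/* Desktop stand-in for the sensor: replays canned values in order,
 * wrapping around when it runs out. */
struct fake_sensor {
    const int *values;
    size_t count;
    size_t next;
};

static int fake_sensor_read(void *ctx)
{
    struct fake_sensor *s = ctx;
    int v = s->values[s->next % s->count];
    s->next++;
    return v;
}
```

On the ARM build the same `average_samples` would be handed a function that reads the real device; only the function behind the pointer changes.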
May I ask, roughly what does this code do? An ARM processor with no peripherals would be kind of useless. Typically we use the processor to interact with some other hardware like a screen, some buttons, or Bluetooth. It's those interactions that are going to be the most problematic.

check CPU model to execute a specific C code [duplicate]

This question already has an answer here:
How to tell if program is running on x86/x64 or ARM Linux platforms
(1 answer)
Closed 4 years ago.
I want to create a C code that somehow contains two separated blocks. I want to use a function or a tool that extracts the CPU model, and based on that, the program decides which block of code it executes. I only have the idea and I don't know how to implement it !
The first block of code will be executed on an Intel i7 and the second should be executed on ARM Cortex A53.
PS : I am a beginner and I have nothing to do with hardware and similar stuff. Thank you for your help :)
As clearly pointed out, first off you can't have a C program that runs up to a point and then determines ARM versus x86, because that code already has to be either ARM or x86. These are different instruction sets. You could use, say, Python or Java or some other scripting/virtual machine language. With C you have a compile-time decision to build for one target or the other; by the time your code is running you already know which target it is, so if this is strictly ARM vs x86 there is no reason to check at runtime. That's not to say an architecture and/or system won't have a way to check which flavor you are on, ARMv6 vs ARMv7 for example, but not necessarily ARMv7 32-bit vs ARMv8 64-bit. You technically can run the aarch32 and aarch64 instruction sets on most ARMv8s, just not intermixed; the OS or an execution level change has to switch between them.
You do understand there are different, incompatible instruction sets, specifically the ones you described, and C code is compiled to one or the other. So you cannot have a program in C compiled for one target that can detect the other target. You have already selected the target before you get to this point. Now there are emulators, but they tend to target one architecture as well. There are/were products from specific vendors that would emulate one instruction set and convert it at runtime to the other; over time, as you re-run that code, it continues to convert it. You could try that, but you still have to be running code for the right target on the right logic/emulator, and then have a special, non-standard detection to find the true underlying architecture, not the one faked by the emulator.
I suspect you are thinking you can have one architecture-specific module that detects the architecture and then runs architecture-specific code. This does not work with C in general and does not make sense to try, thus there probably isn't a good tool for this. The solution for such a thing is either that you build this into the binary file format and the operating system picks because it knows, or that you wrap your binary with a target-independent language like Python or Java, or a scripting language like Perl, bash, etc., that can determine the architecture independently of the target (in that case solutions vary widely, specific to operating system and language for starters) and then choose which binary to run.
There are many ways to achieve what you want. To check which model is present you first have to read which model you have. How to do that varies between Windows and Linux. I found this SO topic helpful and it might also be a good start for your research: How to check CPU name, model, speed on Windows/Linux C?
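To illustrate the compile-time decision described above, here is a small sketch that selects a block of code per target using predefined compiler macros. The macro names are the ones GCC, Clang and MSVC are known to predefine; the function names are hypothetical. Note that this identifies the architecture the binary was built for (x86_64 vs AArch64), not the specific CPU model such as i7 or Cortex-A53.

```c
#include <string.h>

/* Returns the architecture this translation unit was compiled for.
 * The decision is made by the preprocessor, not at runtime. */
static const char *build_target_arch(void)
{
#if defined(__x86_64__) || defined(_M_X64)
    return "x86_64";
#elif defined(__i386__) || defined(_M_IX86)
    return "x86";
#elif defined(__aarch64__) || defined(_M_ARM64)
    return "aarch64";
#elif defined(__arm__) || defined(_M_ARM)
    return "arm";
#else
    return "unknown";
#endif
}

/* Convenience helper: non-zero if the build target matches `name`. */
static int built_for(const char *name)
{
    return strcmp(build_target_arch(), name) == 0;
}
```

The same `#if` structure can wrap the two separate blocks of code, so each binary only contains the block meant for its own target.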

How can linux boot code be written in C?

I'm a newbie to learning OS development. From the book I read, it said that boot loader will copy first MBR into 0x7c00, and starts from there in real mode.
And, example starts with 16 bit assembly code.
But, when I looked at today's linux kernel, arch/x86/boot has 'header.S' and 'boot.h', but actual code is implemented in main.c.
This seems to be useful by "not writing assembly."
But, how is this done specifically in Linux?
I can roughly imagine that there might be special gcc options and link strategy, but I can't see the detail.
I'm reading this question more as an X-Y problem. It seems to me the question is more about whether you can write a bootloader (boot code) in C for your own OS development. The simple answer is YES, but not recommended. Modern Linux kernels are probably not the best source of information for creating bootloaders written in C unless you have an understanding of what their code is doing.
If using GCC there are restrictions on what you can do with the generated code. In newer versions of GCC there is an -m16 option that is documented this way:
The -m16 option is the same as -m32, except for that it outputs the ".code16gcc" assembly directive at the beginning of the assembly output so that the binary can run in 16-bit mode.
This is a bit deceptive. Although the code can run in 16-bit real mode, the code generated by the back end uses 386 address and operand prefixes to make normally 32-bit code execute in 16-bit real mode. This means the code generated by GCC can't be used on processors earlier than the 386 (like the 8086/80186/80286 etc). This can be a problem if you want a bootloader that can run on the widest array of hardware. If you don't care about pre-386 systems then GCC will work.
Bootloader code that uses GCC has another downside. The address and operand prefixes that get added to many instructions add up and can make a bootloader bloated. The first stage of a bootloader is usually very constrained in space so this could potentially become a problem.
You will need inline assembly or assembly language objects with functions to interact with the hardware. You don't have access to the Linux C library (printf etc) in bootloader code. For example if you want to write to the video display you have to code that functionality yourself either writing directly to video memory or through BIOS interrupts.
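As an illustration of "coding that functionality yourself", here is a sketch of writing characters to a VGA-style text buffer without any C library. It assumes the standard colour text mode layout (80x25 cells, one 16-bit cell per character with the attribute in the high byte, buffer at physical address 0xB8000). The function name is hypothetical, and the buffer is passed in as a parameter so the logic can also be exercised against an ordinary array on a desktop.

```c
#include <stddef.h>
#include <stdint.h>

#define VGA_TEXT_BASE 0xB8000u  /* physical address of the colour text buffer */
#define VGA_COLS 80
#define VGA_ROWS 25

/* Writes the string s into the text buffer starting at cell index pos.
 * Each cell is (attribute << 8) | character. Returns the next free cell.
 * On real hardware buf would be (volatile uint16_t *)VGA_TEXT_BASE, and in
 * real mode reaching that address also needs a suitable segment setup. */
static size_t vga_write(volatile uint16_t *buf, size_t pos,
                        const char *s, uint8_t attr)
{
    while (*s != '\0' && pos < (size_t)VGA_COLS * VGA_ROWS) {
        buf[pos] = (uint16_t)(((uint16_t)attr << 8) | (uint8_t)*s);
        pos++;
        s++;
    }
    return pos;
}
```

Attribute 0x07 is light grey on black. Scrolling, newlines and cursor movement are deliberately left out of this sketch.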
To tie it all together and place things in a binary file usable as an MBR you will likely need a specially crafted linker script. In most projects these linker scripts have an .ld extension. This drives the process of taking all the object files and putting them together in a fashion that is compatible with the legacy BIOS boot process (code that runs in real mode at 0x07c00).
There are so many pitfalls in doing this that I recommend against it. If you are intending to write a 32-bit or 64-bit kernel then I'd suggest not writing your own bootloader and using an existing one like GRUB. The versions of Linux from the 1990s had their own bootloader that could be executed from floppy. Modern Linux relies on third party bootloaders to do most of that work now. In particular it supports bootloaders that conform to the Multiboot specification.
There are many tutorials on the internet that use GRUB as a bootloader. OS Dev Wiki is an invaluable resource. They have a Bare Bones tutorial that uses the original Multiboot specification (supported by GRUB) to bootstrap a basic kernel. The Multiboot specification can easily be developed for using a minimum of assembly language code. Multiboot compatible bootloaders will automatically place the CPU in protected mode, enable the A20 line, can be used to get a memory map, and can be told to place you in a specific video mode at boot time.
Last year someone on the #Osdev chat asked about writing a 2 stage bootloader located in the first 2 sectors of a floppy disk (or disk image) developed entirely in GCC and inline assembly. I don't recommend this as it is rather complex and inline assembly is very hard to get right. It is very easy to write bad inline assembly that seems to work but isn't correct.
I have made available some sample code that uses a linker script, C with inline assembly to work with the BIOS interrupts to read from the disk and write to the video display. If anything this code should be an example why it's non-trivial to do what you are asking.

How to check the existence of NEON on arm?

How to determine whether NEON engine exists on given ARM processor? Any status/flag register can be queried for such purpose?
I believe unixsmurf's answer is about as good as you'll get if using an OS with privileged kernel. For general purpose feature detection, it seems ARM has made it a requirement to get this from the OS, and so you must use an OS API to get it.
On Android NDK use #include <cpu-features.h> with (android_getCpuFamily() == ANDROID_CPU_FAMILY_ARM) && (android_getCpuFeatures() & ANDROID_CPU_ARM_FEATURE_NEON). Note this is for 32 bit ARM. ARM 64 bit has different flags but the idea is the same. See the sources/docs.
On Linux, if available use #include <sys/auxv.h> and #include <asm/hwcap.h> with getauxval(AT_HWCAP) & HWCAP_NEON.
On iOS, I'm not sure there is a dynamic call, the methodology seems to be that you build your app targeting NEON, then make sure your app is flagged to require NEON so it will only install on devices which support it. Of course you should use the pre-defined preprocessor flag __ARM_NEON__ to make sure everything is in order at compile time.
On whatever Microsoft does or if you are using some other RTOS... I don't know...
Actually you'll see a lot of Android implementations which just parse /proc/cpuinfo in order to implement android_getCpuFeatures().... Heh. But still it seems to be getting improved and newest versions use the getauxval method.
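A sketch of the Linux `getauxval` approach from the list above, written so that it still compiles on non-ARM hosts. The 32-bit branch uses `HWCAP_NEON` as the answer describes; the AArch64 branch uses `HWCAP_ASIMD`, which is an addition here based on the remark that 64-bit ARM has different flags. The function name `have_neon` is hypothetical.

```c
#if defined(__linux__) && (defined(__arm__) || defined(__aarch64__))
#include <sys/auxv.h>   /* getauxval, AT_HWCAP */
#include <asm/hwcap.h>  /* HWCAP_NEON (arm) / HWCAP_ASIMD (aarch64) */
#endif

/* Returns 1 if the kernel reports NEON / Advanced SIMD support, else 0.
 * On anything other than ARM Linux this sketch simply reports 0. */
static int have_neon(void)
{
#if defined(__linux__) && defined(__aarch64__)
    return (getauxval(AT_HWCAP) & HWCAP_ASIMD) != 0;
#elif defined(__linux__) && defined(__arm__)
    return (getauxval(AT_HWCAP) & HWCAP_NEON) != 0;
#else
    return 0;
#endif
}
```

A caller would typically check this once at startup and pick a NEON or plain C code path accordingly.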
One reliable way is to check the architectural feature trap register. For example, on ARM Cortex A35, you can check the value of the HCPTR register to see whether NEON is implemented (0x000033FF) or not (0x0000BFFF). The register name and indication value are platform dependent, so make sure to check the technical reference manual.
Is there any way to check if NEON and SVE are supported?
I have seen someone saying something about the HCPTR register, but it does not seem to have any relationship to NEON, and besides it looks to be an AArch32 register according to the docs
https://developer.arm.com/docs/ddi0595/g/aarch32-system-registers/hcptr

Compiling C and assembling ASM into machine code [closed]

It's difficult to tell what is being asked here. This question is ambiguous, vague, incomplete, overly broad, or rhetorical and cannot be reasonably answered in its current form. For help clarifying this question so that it can be reopened, visit the help center.
Closed 10 years ago.
I have three questions:
What compiler can I use and how can I use it to compile C source code into machine code?
What assembler can I use and how can I use it to assemble ASM to machine code?
(optional) How would you recommend placing machine code in the proper addresses (i.e. bootloader machine code must be placed in the boot sector)?
My goal:
I'm trying to make a basic operating system. This would use a personally made bootloader and kernel. I would also try to take bits and pieces from the Linux kernel (namely the drivers) and integrate them into my kernel. I hope to create a 32-bit DOS-like operating system for messing with memory on most modern computers. I don't think I will be creating an executable format for my operating system, as my operating system won't be dynamic enough to require it.
My situation:
I'm running on an x86-64 Windows 8 laptop with an Intel Celeron CPU; I believe it uses secure boot. I would be testing my operating system on an x86-64 desktop with an Intel Core i3 CPU. I have an average understanding of operating systems and their techniques. I know the C, ASM, and computer theory required for this project. I think it is also noteworthy that I'm sixteen with no formal education in computer science.
My research: After searching Google for what C normally compiles into, I found answers ranging from machine code, binary, plain binary, raw binary, assembly, and relocatable object code. Assembly as I understand normally assembles into a PE formatted executable. I have heard of the Cygwin, GCC C, and MingW C compilers. As for assemblers, I have heard of FASM, MASM, and NASM. I have searched websites such as OSDev and OSDever.
What I have tried: I tried to set up GCC (a nightmare) and create a cross compiler (another nightmare).
Conclusion: As you can tell, I'm very confused about compilers, assemblers, and executable formats. Please dispel my ignorance along with answering my questions. These are probably the only things keeping me from having an OS on my resume. Sorry, I would have included more links, but Stack Overflow wouldn't let me post more than two. Thanks a ton!
First, some quick answers to your three questions.
Pretty much any compiler will translate C code into assembly code. That's what compilers do. GCC and clang are popular and free.
clang -S -o example.s example.c
Whichever compiler you choose will probably support assembly as well, simply by using the same compiler driver.
clang -c -o example.o example.s
Your linker documentation will tell you how to put specific code at specific addresses and so forth. If you use GCC or clang as described above, you will probably use ld(1). In that case, read into 'linker scripts'.
Next, some notes:
You don't need a cross compiler or to set up GCC by yourself. You're working on an Intel machine, generating code for an Intel machine. Any binary distribution of clang or GCC that comes with your Linux distribution should work fine.
C compilers normally compile code into assembly, and then pass the resulting assembly off to a system assembler to end up with machine code. Machine code, binary, plain binary, raw binary, are all basically synonymous.
The generated machine code is packaged into some kind of executable file format, to tell the host operating system how to load and run the code. On Windows, it's PE, on Linux, it's ELF, and on Mac OS X it's Mach-O.
You don't need to create an executable format for your OS, but you will probably want to use one. ELF is a pretty straightforward (and well-documented) option.
And a bit of a personal note that I hope doesn't discourage you too much - If you are not very familiar with how compilers, assemblers, linkers, and all of those tools work, your project is going to be very difficult and confusing. You might want to start with some smaller projects to get your "sea legs", so to speak.
First, "machine code" and "binary" are synonyms. "Object code" is an intermediate form that the linker converts to the final binary. Some C/C++ compilers do not generate binary directly; they generate assembler source code and feed it to the assembler, which produces object code, and then to the linker, which makes the final binary. In most cases these steps are transparent to the user: you feed the compiler C/C++/Pascal/whatever source code and get a binary file as output.
The FASM assembler, aka flat assembler, is the best assembler for OS development. There are several OSes already created in FASM.
That is because FASM is self-compilable and very easily portable. This way, in 2 to 3 days, you can port it to your OS and then your OS will become self-sufficient, i.e. you will be able to compile programs from within your OS.
Another good feature of FASM is that it does not need a linker - it can directly generate binary files in several formats.
The big, active community is also very important. There are tons of sources available for FASM, including for OS development.
The message board is very active and is a place where one can learn a lot.
I think the first part of your question has been answered, so I'll take on the other two:
What assembler can I use and how can I use it to assemble ASM to machine code?
One of nasm, yasm (basically very like nasm), fasm, or "masm", i.e. ml.exe and ml64.exe, which are freely available as part of the Microsoft tools.
Of these, I probably recommend either nasm or yasm. That recommendation is based entirely on personal preference - but the wide range of platforms they support, plus using Intel syntax by default are my reasons. I'd try a few and see what you like.
(optional) How would you recommend placing machine code in the proper addresses (i.e. bootloader machine code must be placed in the boot sector)?
Well, there is only one way to place the bootloader at the correct address for MBR - open the disk at LBA 0 and write exactly 512 bytes there, ending in 0x55AA. Flush, then close. The MBR usually also contains a partition table embedded in it - it is both code and data. The sciency term for this stuff is Von Neumann Architecture which can be briefly summarised as "programs and data are stored in the same place". The action of the BIOS on wanting to boot from disk will be to read the first 512 bytes into memory, check the signature and if it matches, execute that memory (starting from byte 0).
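The layout described above can be sketched in C as a function that assembles a 512-byte sector image in memory: code at offset 0, zero padding, and the bytes 0x55 and 0xAA at offsets 510 and 511. The function name is hypothetical, and this sketch ignores the partition table, so in a real MBR the code would have to stay clear of that area. Writing the buffer to LBA 0 of a disk or disk image is a separate, OS-specific step.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { SECTOR_SIZE = 512, SIGNATURE_OFFSET = 510 };

/* Fills `sector` with `code`, zero padding, and the boot signature.
 * Returns 0 on success, or -1 if the code would overlap the signature. */
static int build_boot_sector(uint8_t sector[SECTOR_SIZE],
                             const uint8_t *code, size_t code_len)
{
    assert(sector != NULL);
    if (code_len > (size_t)SIGNATURE_OFFSET || (code == NULL && code_len > 0)) {
        return -1;
    }
    memset(sector, 0, SECTOR_SIZE);
    if (code_len > 0) {
        memcpy(sector, code, code_len);
    }
    sector[SIGNATURE_OFFSET] = 0x55;      /* byte 510 */
    sector[SIGNATURE_OFFSET + 1] = 0xAA;  /* byte 511 */
    return 0;
}
```

For a first experiment, the resulting 512 bytes can be written to a file and used as a disk image in an emulator rather than to a real disk.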
OK, that's those questions out of the way. Now I'll give you some more notes:
512-bytes for a bootloader is not really enough for anyone's usage. As such, some file systems contain boot sectors and the bootloader itself simply loads the code/data found in these. This allows for larger amounts of code to be loaded - enough to get a kernel going. For example, grub contains stage1, stage1_5 and stage2 components in the legacy version.
Although most operating systems require you to use an executable format container, you don't need one. On disk and in memory, executable code is just one, two or three byte strings called opcodes. You can read the opcode reference or the Intel/AMD manuals to find out what hexadecimal value translates to what. Anyway, you can perform a direct conversion from assembler to binary using nasm like this:
nasm -f bin input.asm -o output.bin
Which will work for 16, 32 or 64 bit assembler quite happily although the result likely won't execute. The only place it will is if you explicitly use the [bits 16] directive in your code, along with org 100h, then you have an MSDOS .com program. Unfortunately, this is the simplest of binary formats in existence - you only have code and data in one big lump and this must not exceed the size of a single segment.
I feel this might handle this point:
I found answers ranging from machine code, binary, plain binary, raw binary, assembly, and relocatable object code.
The answer as to what assembly assembles to - it assembles to opcodes and memory addresses, depending on the assembler. This is represented in bytes which are data all of themselves. You can read them raw with a hex editor although there are few occasions where this is strictly necessary. I mention memory addresses because some opcodes control how memory addresses are interpreted - relocatable object code for example requires that addresses are not hard-coded (instead, they are interpreted as offsets from the current location).
Assembly as I understand normally assembles into a PE formatted executable.
It is fair to say the assembly generated from your C/C++ is assembled to opcodes, which are then stored, along with anything else to be included in the program (data, resources), in an executable format such as PE. Which format is "normal" depends on your OS.
If you have thoroughly read the OSDev Wiki, you'll realise segmented addressing is an utter pain - the standard and only usage of segments in modern operating systems is to define four segments spanning the entire address space - two data segments at ring 0 and 3, two code segments at ring 0 and 3.
If you haven't read the OSDEV Wiki thoroughly, you should. I'd also recommend JamesM's kernel tutorials which contain practical advice on building a kernel in C.
If you simply want to do bad things to a DOS kernel, you actually still can without needing to write a full kernel yourself. You should also be able to switch the CPU to protected mode from DOS, too. You need FreeDOS and an assembler of your choice. There is an excellent tutorial on terminate and stay resident which basically means hooking an interrupt routine, then editing yourself out of the active process list, in The Rootkit Arsenal. There are probably tutorials on the internet for this, too.
I might be tempted to recommend doing this as a first, just to get yourself used to this kind of low level stuff.
If you just wanted to poke an OS, you can set up kernel debugging on Windows. WinDbg is a bit... arcane, but once you get used to it it makes sense.
You mention your laptop uses secure boot. If this is the case your laptop uses UEFI. If you want to read up on this, the UEFI spec is 100% guaranteed more boring than your maths homework, but I recommend skimming it just to understand the goals and the basic environment. The important thing is to have the EFI SDK, which enables you to build EFI-compatible applications (which are in PE format and exist on a FAT32 partition on your disk), so installing an EFI bootloader is very simple even if writing one is not. If I had to make an honest recommendation, I'd stick to MBR for now, since emulating OSes with MBR is much easier than EFI at the time of writing and you really do want to do this in some form of VM for now. Also, I'd use an existing bootloader like GRUB, since bootloaders are not all that exciting, really.
Others have said it, and I will say it: You absolutely want to do anything like this under some form of emulator or virtual machine. You will make a mistake, guaranteed, and you will come up against things you don't understand. Emulators and VM software are free these days, and some such as BOCHS will tell you what the reason for a given fault, trap etc is. This is massively helpful!
First, use something like VirtualBox for your testing.
I think you might want to take some smaller steps and get comfortable writing C code.
Then look into how boot sectors on disks work (well documented on the internet). Also look at the code of other open source boot loaders.
Then look at how to do task switching. It's not too hard to write. You can even write most of it while running it under your normal OS before trying to embed it into your own OS.
With C compilers you can generally mix in inline asm, though the syntax is compiler-specific: MSVC uses __asm { /* assembly code */ } (32-bit x86 only), while GCC and Clang use asm("...").
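For reference, a small sketch of the GCC/Clang extended inline assembly form on x86, with a plain C fallback for other compilers and architectures. The function name is hypothetical, and the example only adds two integers, which is enough to show how operands and clobbers are declared.

```c
/* Adds b to a using an x86 `add` instruction when built with GCC or Clang
 * for x86/x86_64; otherwise falls back to ordinary C. */
static int add_with_inline_asm(int a, int b)
{
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__i386__) || defined(__x86_64__))
    /* AT&T syntax: addl source, destination.
     * "+r"(a): a is read and written, in a register (operand %0).
     * "r"(b):  b is an input in a register (operand %1).
     * "cc":    the instruction modifies the flags register. */
    __asm__("addl %1, %0" : "+r"(a) : "r"(b) : "cc");
    return a;
#else
    return a + b;
#endif
}
```

Getting the constraints and clobbers wrong often produces code that appears to work at one optimisation level and breaks at another, so it is worth keeping inline assembly blocks as small as this.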

Resources