ARM TrustZone, Hypervisor: Hypervisor functionality WITHOUT Virtualization Extensions

I have found some interesting information regarding CPU virtualization for ARM and I'm wondering if you guys could help me understand more about it.
Basically, folks at some company called SierraWare have developed an ARM secure-mode OS called SierraTEE that (they say) virtualizes a guest OS like Linux/Android running in non-secure mode, needing only the Security Extensions. A piece of information from one of their presentation documents has caught my attention, specifically at page 19 of this PDF http://www.sierraware.com/sierraware_tee_hypervisor_overview.pdf they state:
Integrity checks for Rootkits and Kernel Hacks:
Monitor Syscall interrupt and interrupt handler. This will ensure that core syscalls are not tampered with.
By "Syscall interrupt" I understand SVC (formerly SWI) instruction executions (correct me if I'm wrong). What they mean by "monitoring" is less clear to me: it could be real-time monitoring, periodic monitoring, or monitoring on certain events. As I see it, they could protect the SVC handler from tampering in one of three ways:
1. Inspect the SVC handler from time to time (on a timer interrupt, for instance, since IRQs and FIQs can be routed to Monitor mode). This is a PatchGuard-like approach and doesn't seem very useful to me.
2. Inspect the SVC handler on SVC instruction execution (monitoring on certain events).
3. Trap write accesses to the SVC handler's memory region (real-time monitoring).
Regarding approach 2: would it be possible to trap non-secure SVC instruction executions from secure-mode?
Regarding approach 3: would it be possible to hook non-secure memory-region writes by using only the Security Extensions?
Thanks very much in advance

"Monitor" here may refer to the Monitor mode, the new mode added by the Security Extensions.
I'm not very familiar with the Security Extensions but I imagine it should be possible to mark specific memory regions as secure so any access to them will result in a Monitor mode trap, which can then handle the access and resume the non-secure code execution.
However, I just found this notice in the ARM ARM (B1.8.7 Summaries of asynchronous exception behavior):
In an implementation that includes the Security Extensions but does
not include the Virtualization Extensions, the following
configurations permit the Non-secure state to deny service to the
Secure state. Therefore, ARM recommends that, wherever possible, these
configurations are not used:
Setting SCR.IRQ to 1. With this configuration, Non-secure PL1 software can set CPSR.I to 1, denying the required routing of IRQs to
Monitor mode.
Setting SCR.FW to 1 when SCR.FIQ is set to 1. With this configuration, Non-secure PL1 software can set CPSR.F to 1, denying
the required routing of FIQs to Monitor mode.
The changes introduced by the Virtualization Extensions remove these
possible denials of service.
So it would seem it's not possible to achieve perfect virtualization with just Security Extensions.
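The two configurations named in the quoted note can be checked mechanically from an SCR value. The following is a minimal sketch in C, assuming the ARMv7-A SCR bit layout (IRQ at bit 1, FIQ at bit 2, FW at bit 4). The function name is hypothetical, and it takes the SCR value as a plain integer rather than reading the register, since the SCR is only readable from Secure privileged modes.

```c
#include <stdbool.h>
#include <stdint.h>

/* ARMv7-A SCR bit positions (assumed from the architecture manual). */
#define SCR_IRQ (1u << 1) /* IRQs are taken to Monitor mode */
#define SCR_FIQ (1u << 2) /* FIQs are taken to Monitor mode */
#define SCR_FW  (1u << 4) /* Non-secure state may modify CPSR.F */

/* Hypothetical helper: returns true if this SCR value matches one of the
 * two configurations the quoted note warns about on a core that has the
 * Security Extensions but not the Virtualization Extensions. */
bool scr_allows_ns_denial_of_service(uint32_t scr)
{
    /* Case 1: SCR.IRQ == 1, Non-secure PL1 can mask IRQs via CPSR.I. */
    bool irq_case = (scr & SCR_IRQ) != 0;
    /* Case 2: SCR.FW == 1 while SCR.FIQ == 1, Non-secure PL1 can mask
     * FIQs via CPSR.F. */
    bool fiq_case = (scr & SCR_FW) != 0 && (scr & SCR_FIQ) != 0;
    return irq_case || fiq_case;
}
```

Routing FIQs to Monitor mode with SCR.FW left at 0 is the one combination here that the note does not flag, which is why Secure-world designs on such cores tend to rely on FIQ rather than IRQ.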

Related

How do I enter supervisor mode on the ARM Cortex m4 to disable interrupts?

I'm trying to find out how I can disable and enable interrupts on the STM32L4x6RG Nucleo.
After a bit of googling I found the macros __disable_irq() and __enable_irq(), but I'm not convinced these are disabling interrupts.
After more investigation it seems that the cpsid instruction this macro maps to only has an effect when it runs in supervisor context. So the question becomes: how do I move to supervisor mode to disable interrupts and back again?

I found the macros __disable_irq() and __enable_irq() but I'm not
convinced these are disabling interrupts.
They do, unless you (or the OS you are using) have explicitly left privileged mode with the MSR CONTROL, Rn instruction or with the __set_CONTROL() function, which does the same thing.
So the question becomes how do I move to supervisor mode to disable
interrupts and back again??
The processor is in privileged mode after reset, and stays in it unless you tell it otherwise. It also enters privileged mode temporarily when executing an exception handler.
You can use the SVC instruction to call the SVC exception handler from user code and run some code in privileged mode. There is a problem, though: the SVC handler invocation would also be blocked by __disable_irq(), so there would be no way to re-enable interrupts afterwards. Instead of __disable_irq(), you can adjust the BASEPRI register to selectively disable lower-priority interrupts, and give SVC a higher priority so that it is not blocked.
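The difference between the two masking mechanisms can be illustrated with a small host-side model. This is only a sketch of the Cortex-M masking rule for configurable-priority exceptions, assuming the core is in thread mode with no other exception active. The function name and the priority values in the usage note are made up for illustration, and no real register is touched.

```c
#include <stdbool.h>
#include <stdint.h>

/* Model of the Cortex-M masking rule for configurable-priority exceptions
 * (SVC, device interrupts, ...). A lower numeric value means a higher
 * priority. Hypothetical helper, not a CMSIS function. */
bool exception_can_preempt(uint8_t exc_priority, bool primask, uint8_t basepri)
{
    if (primask) {
        /* __disable_irq() sets PRIMASK, which masks every
         * configurable-priority exception, SVC included. */
        return false;
    }
    if (basepri != 0 && exc_priority >= basepri) {
        /* A non-zero BASEPRI masks exceptions of the same or lower
         * priority (numerically equal or greater). */
        return false;
    }
    return true;
}
```

With SVC at priority 0x10, device interrupts at 0x40 and BASEPRI set to 0x20, the model masks the device interrupts while SVC stays usable, which is the arrangement the answer suggests.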

The processor boots in privileged mode, so unless you are running your application on top of an operating system or have switched to unprivileged mode yourself, you should already be in privileged mode. If you are running your application on top of an OS, you should use its services to handle interrupts, and if no such service exists you should leave the interrupts alone.
If you have switched to unprivileged-mode yourself, you can use the svc instruction to trigger an svc-exception and the handler for the exception executes in privileged mode.

Enter Hypervisor Mode on ARMv7 through Kernel Module

I am working on a project where I have a router with ARMv7 processor (Cortex A15) and OpenWRT OS. I have a shell on the router and can load kernel modules with insmod.
My goal is to write a kernel module in C which changes the HVBAR register and then executes the hvc instruction to get the processor in the hyp mode.
This is a scientific project where I want to check if I can place my own hypervisor on a running system. But before I start to write my own hypervisor I want to check if and how I can bring the processor in the hyp mode.
According to this picture taken from the ARMv7-A manual (B.9.3.4), the system must be in Non-secure state, must not be in User mode, and the SCR.HCE bit must be set to 1.
My question is how I can prepare the processor with a C kernel module and inline assembly and then execute the hvc instruction. I want to do this with a kernel module because then I start in PL1. This pseudocode describes what I want to achieve:
call smc // to get into Monitor mode
set SCR.HCE to 1 // to enable the hvc instruction
set SCR.NS to 1 // to switch the system to Non-secure state
call hvc #0 // execute the hvc instruction to raise a hypervisor call exception
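The preconditions listed above can be written down as a predicate over register values. This is a minimal sketch, assuming the ARMv7-A layout (SCR.NS at bit 0, SCR.HCE at bit 8, CPSR mode field in bits [4:0], User mode 0x10, Monitor mode 0x16). The function name is hypothetical, and the values are passed in as integers because the SCR cannot be read from Non-secure kernel code.

```c
#include <stdbool.h>
#include <stdint.h>

#define SCR_NS          (1u << 0) /* Non-secure state outside Monitor mode */
#define SCR_HCE         (1u << 8) /* Hypervisor Call (hvc) enable */
#define CPSR_MODE_MASK  0x1Fu
#define CPSR_MODE_USR   0x10u
#define CPSR_MODE_MON   0x16u

/* Hypothetical helper: true if an hvc executed with these register values
 * meets the conditions described in the question (Non-secure state, not
 * User mode, SCR.HCE set). */
bool hvc_preconditions_met(uint32_t scr, uint32_t cpsr)
{
    uint32_t mode = cpsr & CPSR_MODE_MASK;
    /* Monitor mode is always Secure, regardless of SCR.NS. */
    bool non_secure = (scr & SCR_NS) != 0 && mode != CPSR_MODE_MON;
    bool hvc_enabled = (scr & SCR_HCE) != 0;
    bool not_user = mode != CPSR_MODE_USR;
    return non_secure && hvc_enabled && not_user;
}
```

A kernel module running in Non-secure Supervisor mode (0x13) satisfies the mode condition, so whether hvc works depends on how the Secure firmware left SCR.HCE.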

The easiest way to elevate privilege is to start off in the needed privilege mode already: You've a root shell. Is the boot chain verified? Could you replace bootloader or kernel, so your code naturally runs in PL2 (HYP) mode? If so, that's probably the easiest way to do it.
If you can't replace the relevant part of the boot chain, the details of writing the rootkit depend a lot on information about your system left out: In which mode is Linux started? Is KVM support enabled and active? Was PL2 initialized? Was it locked? Is there "secure" firmware you can exploit?
The objective is always the same: have HVBAR point at code you control, then execute an hvc. Depending on your environment, solutions range from spraying as much RAM as possible with your code and hoping (perhaps after some reboots) that an uninitialized HVBAR points at an instruction you control, to preventing KVM from running and using the early hypervisor stub to install yourself instead.
Enumerating such exploits is a bit out of scope for a StackOverflow answer; this is rather dissertation material. Indeed, there's a doctoral thesis exactly on this topic:
Strengthening system security on the ARMv7 processor architecture with hypervisor-based security mechanisms

RTOS within an RTOS

I'm planning to run one RTOS (e.g. NuttX) as a process of another RTOS (e.g. FreeRTOS), so that FreeRTOS tasks and NuttX, running as a FreeRTOS task, would co-exist.
Would this be a feasible implementation given that the underlying hardware is a single-core ARM Cortex-A8 processor? What changes would be required if the implementation is not based on the VM concept?

In a nutshell, your requirement is to let a GUEST RTOS run entirely within an underlying HOST RTOS. The first answer would be to use the Virtualization Extensions, but the Cortex-A8 does not have them, which rules this option out. Without the Virtualization Extensions you have to resort to one of the following methods, both of which require a lot of code changes.
Option 1 - Port your GUEST OS API's
Take all of your GUEST OS's APIs and replace their implementations so that they mimic the required API behavior using the HOST OS's APIs. Technically, your GUEST OS will then have no scheduler and will be reduced to a porting layer on top of your HOST OS. Companies use this method when their software solutions need to work across multiple RTOSes: they write the solution against one RTOS, and when a customer requires it to run on a different RTOS, they simply port the RTOS API implementations to the customer's RTOS.
Option 2 - Para-virtualization
Your guest RTOS user and kernel space should both work inside the userspace of your host RTOS. Let us break the problem into a few parts.
Handling Privileged Instructions
When your guest OS, executing in what it considers "kernel mode", tries to execute a privileged instruction, it causes an Undefined Instruction exception. You have to modify the Undefined Instruction handler of your host kernel to trap these instructions and act on them. Every single privileged instruction has to be trapped and 'simulated'. Some instructions do not trap and must be handled by modifying the code instead. For example, if your guest kernel code reads CPSR to confirm the execution mode, CPSR will report User mode. This instruction does not cause an exception, so the trap-and-simulate model cannot be used; the only way is to identify, search for, and replace these instructions in your GUEST OS codebase.
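The "identify and search" step for CPSR reads can be sketched as a scan over the guest image for the A32 encoding of MRS Rd, CPSR. The following is a minimal illustration, assuming the guest is built as ARM (A32) code rather than Thumb. The function names are hypothetical, and a real tool would also need to skip literal pools and data, which can match by accident.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* A32 encoding of "MRS Rd, CPSR": cond 0001 0000 1111 Rd 0000 0000 0000.
 * The mask ignores the condition field (bits 31:28) and Rd (bits 15:12). */
#define MRS_CPSR_MASK  0x0FFF0FFFu
#define MRS_CPSR_VALUE 0x010F0000u

/* True if the 32-bit word looks like "MRS Rd, CPSR". */
bool is_mrs_cpsr(uint32_t insn)
{
    return (insn & MRS_CPSR_MASK) == MRS_CPSR_VALUE;
}

/* Scan n_words A32 words and record the word index of each match in hits
 * (at most max_hits entries are stored). Returns the total match count. */
size_t find_mrs_cpsr(const uint32_t *code, size_t n_words,
                     size_t *hits, size_t max_hits)
{
    size_t found = 0;
    for (size_t i = 0; i < n_words; i++) {
        if (is_mrs_cpsr(code[i])) {
            if (found < max_hits) {
                hits[found] = i;
            }
            found++;
        }
    }
    return found;
}
```

Each reported index is a candidate site to patch, for example with a call into the host that returns the guest's virtual CPSR instead of the real one.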
Memory Management Unit
If a privilege violation happens the Data Abort will be triggered to your host OS. It has to be forwarded to your guest OS.
Interrupts
You would have to replace your GUEST OS's interrupt controller driver with dummy SVC calls that would call into your HOST OS to setup interrupts.
Timers
You would have to modify your GUEST timer driver to account for 'lost' ticks when you were running your HOST OS tasks.
Hardware Drivers
All other hardware drivers used by your GUEST OS have to be modified to allow device sharing between GUEST and HOST.
Schedulers
Your GUEST OS scheduler now works inside (and thus is at the mercy of) another scheduler (the HOST OS scheduler).

It is feasible.
You need to separate resources (memory, timers, IRQs, etc.) so that the "Host" OS (FreeRTOS) doesn't even "know" about the resources used by the "Guest" OS (NuttX).
On the Cortex-A8 you may want to use IRQ for FreeRTOS and FIQ for the guest OS. That saves you from rewriting the IRQ controller code (but again, make sure the host does not touch FIQ after the guest OS has started).
Some changes may also be required for context switching: you need to distinguish between Host-Host, Host-Guest (and Guest-Host), and Guest-Guest context switches.

Though not a direct answer to your question: address this problem at the design level. Separate the code that depends on the hardware behind an API, and make the application-level code independent of the underlying OS or runtime, i.e. let it depend on the API rather than on a particular implementation.
Wherever needed, port the hardware-dependent (OS-dependent) code to the underlying OS or runtime.

How to determine if an ARM processor is running in the usual locked-down "world" or in the Secure "world"?

For example, virt-what shows if you are running inside hardware virtualization "sandbox".
How to detect if you are running in ARM "TrustZone" sandbox?

TrustZone may be different from what you think. There is a continuum of modes, from 'a simple API of trusted functions' to 'dual OSs' running in each world.
More context in the question would be helpful. Is this for determining the state programmatically, or for reverse engineering? For current Linux user space, the answer is no.
Summary
No current user space utility.
Time based analysis.
Code based analysis.
CPU exclusion and SCR.
ID_PFR1 bits [7:4].
virt-what is not a foolproof way of discovering whether you are running under a hypervisor. It is a program written for Linux user space, mostly shell scripts that examine /proc/cpuinfo and similar files. procfs is a pseudo-file system whose code runs in kernel context and reports to user space. There is no such detection of TrustZone in mainline ARM Linux. By design, ARM has made it difficult to detect: a design intent is for code in the normal world to run unmodified.
Code analysis
In order to talk to the secure world, the normal world needs SMC instructions. If your user space has access to the kernel code or the vmlinux image, you can try to analyze the code sections for an SMC instruction. However, this code may be present in the image but never active. At least, this tells you whether the Linux kernel has some support for TrustZone. You could write a kernel module that traps any execution of an SMC instruction, but there are probably better solutions.
Timing analysis
If an OS is running in the secure world, some time analysis would show that some CPU cycles have been stolen if frequency scaling is not active. I think this is not an answer in the spirit of the original question. This relies on knowing that the secure world is a full-blown OS with a timer (or at least pre-emptible interrupts).
CPU exclusion and SCR
The SCR (Secure configuration register) is not available in the normal world. From the ARM Cortex-A5 MPcore manual (pg4-46),
Usage constraints The SCR is:
• only accessible in privileged modes
• only accessible in Secure state.
An attempt to access the SCR from any state other than secure privileged
results in an Undefined instruction exception.
ID_PFR1 bits [7:4].
On some Cortex-A series, the instruction,
mrc p15, 0, r0, c0, c1, 1
will return a value in which bits [7:4] indicate whether the CPU supports the Security Extensions, also known as TrustZone. A non-zero value indicates that they are supported. Many early CPUs may not support this CP15 register, so it is much like the SCR: you have to handle the undefined instruction. Also, it doesn't tell you whether code is active in the TrustZone mode.
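Decoding the field is a simple shift and mask. The sketch below assumes the register value has already been obtained in a privileged mode with the mrc instruction shown above and is passed in as an integer; the function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Extract ID_PFR1 bits [7:4], the Security Extensions field. */
unsigned id_pfr1_security_field(uint32_t id_pfr1)
{
    return (id_pfr1 >> 4) & 0xFu;
}

/* A non-zero field means the CPU implements the Security Extensions.
 * It says nothing about whether Secure-world code is actually present. */
bool cpu_has_security_extensions(uint32_t id_pfr1)
{
    return id_pfr1_security_field(id_pfr1) != 0;
}
```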
Summary
It is possible that you could write a kernel module which would try this instruction and handle the undefined exception. This would detect a normal versus secure world. However, you would have to exclude CPUs which don't have TrustZone at all.
If the device is not an ARMv6 or better, then TrustZone is impossible. A great many Cortex-A devices have TrustZone in the CPU, but it is not active.
The combination of the SMC test and a CPU ID check is still not sufficient. Some boot loaders run in the secure world and then transition to the normal world, so the secure world is only active during boot.
Theoretically, it is possible to know, especially with more knowledge of the system. There may be many signs, such as spurious interrupts from the GIC. However, I don't believe that any user-space Linux tool exists as of Jan 2014. This is a typical war of escalation between virus/rootkit writers and malware detection software. TZ Rootkits

You have not specified any details of the processor (A8, A9, A15?) or the execution mode (user/kernel/monitor) from where you want to detect the processor state.
As per the ARM documentation, the current state of the processor as Secure (aka the TrustZone sandbox) or Non-secure can be detected by reading the Secure Configuration Register and checking for the NS bit.
To access the Secure Configuration Register: MRC p15, 0, <Rd>, c1, c1, 0
Bit 0 being set corresponds to the processor being in non-secure mode and vice-versa.

You can check the processor's datasheet and find registers that behave differently between the Normal world and the Secure world. Generally, in the Secure world, reading these registers just returns null, whereas in the Normal world you get data. There are also registers that are accessible only in the Secure world: in the Secure world you can access them, but in the Normal world your access will be rejected.
Anyway, there are many ways to distinguish the Normal world from the Secure world. JUST READ THE DATASHEET IN DETAIL.

To what extent are interrupts supported in Win32?

To what extent are interrupts supported in Win32 beyond processor definitions? For example, x86 machines define at least 18 interrupts, including traps such as the breakpoint trap (INT 3). The other 19-255 interrupts are left open by Intel as software defined interrupts. Are any of these used by Windows/WinAPI or are they just open and free for applications to use as they please? If Windows uses them, where can I find the relevant documentation? I looked on MSDN and could not find anything.
(BTW I am doing compiler, debugger and other system-level programming, so please don't lecture me on your opinions about the advisability of using interrupts in the first place.)

In Win32 apps, there's probably just one interrupt used commonly, int 2Eh. It's used as the system call entry point. It's analogous to int 21h in DOS. The rest of the interrupts aren't used by apps.
Apps, however, can handle some CPU exceptions (and debug breaks) via Structured Exception Handling (SEH)/Vectored Exception Handling (VEH). Windows catches CPU exceptions originating in apps and reflects them back into the apps, if and however possible (Windows is not perfect in imitating the CPU exception model).
Windows uses device interrupts internally and does not let apps mess with them. The x86 CPU handles interrupts in the most privileged mode, where the kernel runs.
Nowadays many device interrupts are not tied to fixed interrupt vectors; they are configurable, and you need to go through mechanisms such as PCI to query or change the settings.
If you want to work with devices and interrupts directly, you need to write a kernel-mode driver for Windows. There's the Device Driver Kit (DDK) and books like Windows Internals that can get you started.
Still, if you're looking for specifics of device XYZ and its interrupt programming, you aren't going to find everything or much on MSDN or in the DDK because you'll need hardware-specific information, something that's outside of Microsoft's control. The kernel provides the functionality necessary to do I/O and handle interrupts, but it's ultimately up to device drivers to use them one way or the other.
