Security and the Cortex-M MPU, part 4: SWI API for MPU systems
January 19, 2017
The Cortex-M v7 memory protection unit (MPU) is difficult to use, but it is the main means of hardware memory protection available for Cortex-M3, -M4, and -M7 processors. These processors are in widespread use in small- to medium-size embedded systems. Hence, it is important to learn to use the Cortex-M v7 MPU effectively in order to achieve the reliability, security, and safety that modern embedded systems require.
Previous blogs have presented an introduction to the MPU and terminology, MPU multitasking, and defining MPU regions. In the first blog, privileged tasks (ptasks) and unprivileged tasks (utasks) were defined. The former run in privileged thread mode and the latter run in unprivileged thread mode. The mode of a task is determined by the umode flag in its task control block (TCB) and takes effect when it is dispatched by the real-time operating system (RTOS) scheduler. The umode flag can be set in pmode at any time after creating the task.
ptasks can directly call system services, but utasks cannot. There are two reasons for this:
- To protect the RTOS and its data from the less-trusted software in utasks. This may be software of unknown pedigree (SOUP), or it may be vulnerable to malware (e.g., a TCP/IP stack).
- To limit the RTOS services that this software can use. It is undesirable for utasks to be able to perform operations that can harm normal system operation, such as power off or task delete.
This blog discusses the mechanisms by which the foregoing protections are achieved. A principal objective of MPU security is to put as much application code as possible into utasks.
utask MPA regions
Each task has its own memory protection array (MPA), which is initialized from an MPA template. A typical MPA template for a utask (ut2a) is as follows:
This template is loaded into the MPA for the task after the task is created, and then into the MPU when that task is dispatched. The region for the task stack is defined and put into the MPA when the task is first started. So, the above utask has access to its own stack, to its own code and data regions, to common code and data regions, and to nothing else. Regions 5, 6, and 7 are either disabled or privileged. Hence, this utask is prevented from accessing system services and data directly. The latter is true for all utasks, though their templates may differ.
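The access pattern described above can be sketched in C. This is a minimal sketch under stated assumptions, not the smx template format: the slot order, the names mpa_slot_t, mpa_region(), mpa_disabled(), and mpa_template_demo(), and every address and size are hypothetical. Only the RBAR and RASR bit positions follow the ARMv7-M MPU register definitions, and the memory-type attribute bits (TEX, C, B, S) are left out for brevity.

```c
#include <assert.h>
#include <stdint.h>

/* ARMv7-M MPU register fields (RBAR and RASR). */
#define RBAR_VALID     (1u << 4)    /* REGION field in bits 3:0 selects the slot */
#define RASR_ENABLE    (1u << 0)
#define RASR_AP_PRW    (1u << 24)   /* privileged RW, unprivileged no access */
#define RASR_AP_RW     (3u << 24)   /* RW for privileged and unprivileged */
#define RASR_AP_RO     (6u << 24)   /* RO for privileged and unprivileged */
#define RASR_XN        (1u << 28)   /* execute never */
#define RASR_SIZE(lg)  (((uint32_t)(lg) - 1u) << 1)  /* region is 2^lg bytes */

typedef struct { uint32_t rbar, rasr; } mpa_slot_t;  /* hypothetical slot type */

/* Build an enabled slot; base must be aligned to the region size. */
static mpa_slot_t mpa_region(uint32_t n, uint32_t base, uint32_t lg, uint32_t attr)
{
    mpa_slot_t s;
    assert(n < 8u && lg >= 5u && lg < 32u);
    assert((base & ((1u << lg) - 1u)) == 0u);
    s.rbar = base | RBAR_VALID | n;
    s.rasr = RASR_SIZE(lg) | attr | RASR_ENABLE;
    return s;
}

/* Build a disabled slot (ENABLE bit clear). */
static mpa_slot_t mpa_disabled(uint32_t n)
{
    mpa_slot_t s;
    assert(n < 8u);
    s.rbar = RBAR_VALID | n;
    s.rasr = 0u;
    return s;
}

/* Assumed layout: slot 0 stack, 1-2 task code/data, 3-4 common code/data. */
static void mpa_template_demo(mpa_slot_t mpa[8])
{
    mpa[0] = mpa_disabled(0u);  /* task stack: filled in when the task starts */
    mpa[1] = mpa_region(1u, 0x08010000u, 14u, RASR_AP_RO);            /* task code   */
    mpa[2] = mpa_region(2u, 0x20004000u, 12u, RASR_AP_RW | RASR_XN);  /* task data   */
    mpa[3] = mpa_region(3u, 0x08020000u, 15u, RASR_AP_RO);            /* common code */
    mpa[4] = mpa_region(4u, 0x20008000u, 12u, RASR_AP_RW | RASR_XN);  /* common data */
    mpa[5] = mpa_disabled(5u);
    mpa[6] = mpa_disabled(6u);
    mpa[7] = mpa_region(7u, 0x40000000u, 20u, RASR_AP_PRW | RASR_XN); /* privileged only */
}
```

With a layout like this, an unprivileged access outside slots 0 through 4 matches either no region or a privileged-only region, so it faults instead of reaching system code or data.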
utasks must use a software interrupt (SWI) application programming interface (API) to access system services, and they can never access system data directly. In addition, only unrestricted system services can be accessed by utasks. These barriers help to protect the operating system (OS) from untrusted code.
The SWI API is implemented via the Cortex-M SVC instruction "SVC n", where n specifies the system service to be performed.
For the smx RTOS kernel, the header file xapi.h contains the function prototypes of all smx services. Including this file at the start of pcode allows it to access any of them. For ucode, xapiu.h is defined. It consists of mapping macros for system services that are permitted in umode. For example:
This macro overrides the function prototype in xapi.h so that, for the rest of the application module, a call to the system service calls a shell function instead of the service itself:
This shell function serves to execute the SVC instruction with n == ID of the system service. NI (Not Inline) is a macro that blocks function inlining by the compiler. Note that the shell function has the same name as the system service, except that its prefix is smxu_ instead of smx_. Shells are in the ucom_code region so that they are accessible by utasks.
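To make the macro-and-shell idea concrete, here is a minimal host-side sketch. It is not the smx source: smx_DemoAdd, smxu_DemoAdd, ID_DEMO_ADD, demo_table, svc_sim(), and ucode_example() are hypothetical names, the NI definition is an assumed GCC/Clang form, and svc_sim() is an ordinary function standing in for the SVC instruction so that the example runs on a desktop compiler.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__GNUC__)
#define NI __attribute__((noinline))   /* assumed definition of NI */
#else
#define NI
#endif

enum { ID_DEMO_ADD = 0, ID_COUNT };    /* hypothetical service ID */

static unsigned svc_count;             /* counts simulated SVC entries */

/* Stand-in for a privileged system service. */
static uint32_t smx_DemoAdd(uint32_t a, uint32_t b) { return a + b; }

typedef uint32_t (*ssr_fn)(uint32_t, uint32_t);
static const ssr_fn demo_table[ID_COUNT] = { smx_DemoAdd };

/* Host stand-in for "SVC n"; on a Cortex-M this is the SVC instruction. */
static uint32_t svc_sim(uint32_t n, uint32_t a, uint32_t b)
{
    assert(n < ID_COUNT);   /* host-only check; a real handler rejects bad IDs */
    svc_count++;
    return demo_table[n](a, b);
}

/* Shell: same name as the service, but with the smxu_ prefix. */
NI static uint32_t smxu_DemoAdd(uint32_t a, uint32_t b)
{
    return svc_sim(ID_DEMO_ADD, a, b);
}

/* Mapping macro of the kind a header such as xapiu.h might contain. */
#define smx_DemoAdd smxu_DemoAdd

/* ucode below this point is written against the normal service name,
   but the preprocessor turns the call into a call to the shell. */
static uint32_t ucode_example(void)
{
    return smx_DemoAdd(2u, 3u);
}
```

Because the substitution happens in the preprocessor, ucode never contains a direct branch to the service; the only path to it runs through the shell and the SVC entry.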
An application module can start with pcode followed by ucode, or it may be entirely ucode. Either way, the ucode is prefaced with:
No system services can be directly called after that point. All of the above is done at compile time and thus becomes hard-coded and therefore resistant to malware and bugs, especially if the code is located in ROM.
pcode/ucode modules are convenient because a functional section of a system will typically have a root task, which is a ptask that creates, initializes, and starts all other tasks for the section, most of which may be utasks. Thus, all related tasks can be kept together. Inherent in this is the idea that some tasks of a system section might be carefully constructed tasks that perform mission-critical functions. These tasks would probably be ptasks. Other tasks of the system section might be performing non-critical functions, such as gathering statistics to be sent to the cloud. These would be utasks.
Some tasks might start their existence as ptasks and be migrated to utasks as a project develops. It is typically easier to debug code in pmode and then move it to umode. Also, tasks can start as ptasks and execute pcode, set their own umode flags, then restart themselves as utasks and execute ucode.
Another interesting feature is that multiple xapiu.h files can be deployed and used by different utasks. This allows for different levels of trust. Thus, more-trusted tasks can be given access to more RTOS services than less-trusted tasks. This permits tightening the noose on SOUP or highly vulnerable tasks. However, this only works at compile time. To protect against malware, it is also necessary to have different jump tables (see Figure 5 below) corresponding to the different xapiu.h files and a mechanism to select the jump table per task.
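One way to picture per-task jump tables is the sketch below. It is only an illustration under assumptions: the service IDs, DEMO_DENIED, demo_tcb_t, and demo_dispatch() are hypothetical, and the real selection mechanism in smx may differ. The point it shows is that a run-time table lookup enforces the restriction even if malware executes an SVC with an ID that the task's xapiu.h never exposed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { ID_SEM_SIGNAL, ID_TASK_DELETE, ID_COUNT };  /* hypothetical IDs */
#define DEMO_DENIED 0xFFFFFFFFu                    /* hypothetical error code */

typedef uint32_t (*ssr_fn)(uint32_t arg);

/* Stand-ins for two system services. */
static uint32_t demo_sem_signal(uint32_t h)  { return h + 1u; }
static uint32_t demo_task_delete(uint32_t h) { (void)h; return 0u; }

/* More-trusted tasks get both services; less-trusted tasks lose task delete. */
static const ssr_fn table_trusted[ID_COUNT]    = { demo_sem_signal, demo_task_delete };
static const ssr_fn table_restricted[ID_COUNT] = { demo_sem_signal, NULL };

/* Hypothetical per-task record holding the table selected for that task. */
typedef struct { const ssr_fn *ssrt; } demo_tcb_t;

/* Look up service n in the calling task's own table. */
static uint32_t demo_dispatch(const demo_tcb_t *task, uint32_t n, uint32_t arg)
{
    assert(task != NULL && task->ssrt != NULL);
    if (n >= ID_COUNT || task->ssrt[n] == NULL)
        return DEMO_DENIED;          /* out of range or not permitted */
    return task->ssrt[n](arg);
}
```

A restricted task that forges the ID for task delete gets DEMO_DENIED back, while a trusted task using the same ID reaches the service.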
utask service call mechanism
The basic concept of a software interrupt API, as presented above, is pretty simple. But when the called system service might cause a task switch, things get more complicated, particularly for the Cortex-M architecture, which requires that the RTOS scheduler reside inside of the PendSV_Handler. A further complication is that handlers run in privileged mode and use the System Stack (SS) instead of the current Task Stack (TS).
As shown in the following diagram, SVC_Handler() is invoked by the SVC instruction and runs in handler mode:
When SVC_Handler() starts running, the system service parameters are in TS due to stacking by the processor. The handler moves parameters 0 through 3 into r0 through r3, and it moves the 5th parameter, if any, to the top of the system stack, SS (this is where the system service expects to find these parameters).
SVC_Handler() then calls the system service (SSR) via the ssrt jump table, using the index n (ID) passed to it (see above).
The system service executes normally and returns to SVC_Handler(), which moves the system service return value from r0 to its correct position in TS. The handler return operation, performed by the processor, unstacks all stacked registers in TS; thus, the return value ends up in r0.
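The parameter and return-value handling described above can be modeled on a host with a plain array standing in for the task stack. This is a sketch, not the smx handler (which would be written in assembly on the target): demo_svc_body(), demo_ssrt, demo_sum5(), and ID_DEMO_SUM5 are hypothetical names. The eight-word frame order (r0-r3, r12, lr, pc, xPSR) is what the Cortex-M processor stacks on exception entry; placing the 5th parameter in the word just above the frame assumes that the shell pushed nothing and that no alignment padding word was inserted.

```c
#include <assert.h>
#include <stdint.h>

/* Word offsets in the frame the processor stacks on exception entry. */
enum { F_R0, F_R1, F_R2, F_R3, F_R12, F_LR, F_PC, F_XPSR, F_WORDS };

enum { ID_DEMO_SUM5 = 0, ID_COUNT };   /* hypothetical service ID */

typedef uint32_t (*ssr_fn)(uint32_t, uint32_t, uint32_t, uint32_t, uint32_t);

/* Stand-in for a five-parameter system service. */
static uint32_t demo_sum5(uint32_t a, uint32_t b, uint32_t c, uint32_t d, uint32_t e)
{
    return a + b + c + d + e;
}

static const ssr_fn demo_ssrt[ID_COUNT] = { demo_sum5 };

/* ts points at the stacked frame in the task stack; n is the service ID. */
static void demo_svc_body(uint32_t *ts, uint32_t n)
{
    uint32_t ret;
    assert(ts != 0);
    if (n >= ID_COUNT)
        return;                      /* unknown ID: frame is left untouched */
    /* Parameters 0-3 come from stacked r0-r3; the 5th from above the frame. */
    ret = demo_ssrt[n](ts[F_R0], ts[F_R1], ts[F_R2], ts[F_R3], ts[F_WORDS]);
    /* Overwrite stacked r0 so that unstacking delivers the return value. */
    ts[F_R0] = ret;
}
```

On the host the fifth argument is simply passed as a C argument; on the target, copying it to the top of SS serves the same purpose, because that is where a compiled service expects its first stack-passed parameter.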
If the system service has resulted in the need for the task scheduler to run (sched > 0) or an interrupt has resulted in a need for the link service routine (LSR) scheduler to run (lqctr > 0), the PendSV_Handler() will have been pended. In this case, the processor tail-chains from SVC_Handler() to PendSV_Handler() (shown by the dotted line in the diagram), instead of returning to the utask.
In this case, control does not go back to the point of call in the utask yet, but rather to the scheduler running inside of PendSV_Handler(). This may result in the current task being suspended and another task being resumed to run instead (shown to the right in the diagram). The preempting task can be either a utask or a ptask. Eventually the suspended utask will be resumed, unstacked, and continue running from the point of call, provided that it was neither stopped nor deleted by a preempting task. (Note: All of the above is done in privileged mode and thus is protected from malware that has infected utasks.)
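The pend-and-tail-chain decision can be sketched as follows. This is a host model under assumptions: demo_icsr is an ordinary variable standing in for the Interrupt Control and State Register (SCB->ICSR at 0xE000ED04), demo_svc_exit() and demo_next_t are hypothetical, and where exactly smx pends PendSV is not stated here. The PENDSVSET bit position (bit 28) is from the ARMv7-M definition, and sched and lqctr are the counters named above.

```c
#include <stdint.h>

#define ICSR_PENDSVSET (1u << 28)    /* PENDSVSET bit of SCB->ICSR */

static volatile uint32_t demo_icsr;  /* host stand-in for the ICSR register */
static uint32_t sched;               /* > 0: task scheduler needs to run */
static uint32_t lqctr;               /* > 0: LSRs are waiting in the queue */

typedef enum { RESUME_CALLER, RUN_PENDSV } demo_next_t;

/* Models what happens as SVC_Handler() finishes. */
static demo_next_t demo_svc_exit(void)
{
    if (sched > 0u || lqctr > 0u) {
        /* The OR models the write-one-to-set behavior of PENDSVSET; on the
           target, the bit value alone would be written to the register. */
        demo_icsr |= ICSR_PENDSVSET;
    }
    /* If PendSV is pending at exception return, the processor tail-chains
       into PendSV_Handler() instead of unstacking back to the caller. */
    return (demo_icsr & ICSR_PENDSVSET) ? RUN_PENDSV : RESUME_CALLER;
}
```

In this model a service that leaves sched and lqctr at zero returns straight to the calling utask, and one that readies a task takes the dotted-line path through the scheduler.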
ptask service call mechanism
By contrast, the following diagram shows operation when a system service is called from a ptask:
Note that this is much simpler (and faster): SVC_Handler() is not involved. The ptask calls the system service directly, and if sched is set, PendSV_Handler() is pended. From there, operation is identical to that for a utask.
System services operate the same regardless of whether they are invoked from utasks or ptasks. For example, a utask may test a semaphore and become suspended upon it. A ptask may signal the semaphore and the utask will resume. Or vice versa. A ptask may have higher or lower priority than a utask and the scheduler will dispatch it according to its priority (privilege has no priority here!). What is different is that the ptask executes trusted code (pcode) and usually has full access to memory, peripherals, and system services, whereas the utask executes unprivileged code (ucode) and has access to only what the MPU permits. Furthermore, the MPU can only be changed by privileged code.
Lest there be concern that ptasks are unbridled agents, note that it is possible to prevent access to a region via the MPU even though the background region is enabled. Hence, a region that is read/write (RW) to one task could be read only (RO) to another and execute never (XN) to both. On the other hand, ptasks do have direct access to all smx services. As security of a system is tightened, consideration should be given to limiting ptasks as well as utasks.
- Porting Existing Applications to an MPU
For more information on the MPU software architecture, see previous blogs:
Additional information can be found at www.smxrtos.com/mpu.