How to mitigate timing and interference issues on multicore processors

By Mark Pitchford

Technical Specialist

LDRA

February 14, 2024


Embedded software developers face unique challenges when dealing with timing and interference issues on heterogeneous multicore systems. Such platforms offer higher CPU workload capacity and performance than single-core processor (SCP) setups, but their complexity can make strict timing requirements extremely difficult to meet.

In hard real-time applications, deterministic execution times are crucial for meeting operational and safety goals. Although multicore processor (MCP) platforms generally exhibit lower average execution times for a given set of tasks than SCP setups do, the distribution of these times is often wider. This variability makes it difficult for developers to ensure precise timing for tasks, which is a significant problem when building applications where meeting individual task execution times is just as critical as meeting goals for average times.

To address these challenges, embedded software developers can turn to guidance documents like CAST-32A, AMC 20-193, and AC 20-193. In CAST-32A, the Certification Authorities Software Team (CAST) outlines important considerations for MCP timing, and sets Software Development Life Cycle (SDLC) objectives to better understand the behavior of a multicore system. While not prescriptive requirements, these objectives guide and support developers toward adhering to widely accepted standards like DO-178C.

In Europe, the AMC 20-193 document has superseded CAST-32A, and the AC 20-193 document is expected to do the same in the U.S. These successor documents, collectively referred to as A(M)C 20-193, largely duplicate the principles outlined in CAST-32A.

To apply the guidance from CAST-32A and its successors, developers can employ various techniques for measuring timing and interference on MCP-based systems.

The importance of understanding worst-case execution times

In hard real-time systems, meeting strict timing requirements is essential for ensuring predictability and determinism. Such systems run mission- and safety-critical applications such as advanced driver-assistance systems (ADAS) and aircraft autopilots. In contrast to soft real-time systems, where missing timing deadlines has less severe consequences, understanding the worst-case execution time (WCET) for hard real-time tasks is essential because missed deadlines can be catastrophic.

Developers must consider both the best-case execution time (BCET) and the WCET for each CPU task. The BCET represents the shortest execution time, while the WCET represents the longest execution time. Figure 1 illustrates how these values are determined given an example set of timing measurements.

Measuring the WCET is particularly important because it provides an upper bound on the task's execution time, ensuring that critical tasks complete within the required time constraints.

Figure 1: Different execution times and guarantees for a given real-time task
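As a minimal sketch of how these values fall out of a set of timing measurements, the following Python snippet summarizes a list of samples. The function name and the sample values are hypothetical and purely illustrative. Note that the largest measured time is only an observed high-water mark: the true WCET may be longer than anything seen during testing.

```python
from statistics import mean


def summarize_execution_times(samples_us):
    """Summarize measured execution times (microseconds) for one task.

    The minimum and maximum are *observed* values only; the real BCET may be
    lower and the real WCET higher than anything captured during testing.
    """
    if not samples_us:
        raise ValueError("at least one timing sample is required")
    return {
        "observed_bcet": min(samples_us),
        "observed_wcet": max(samples_us),
        "mean": mean(samples_us),
    }


# Hypothetical measurements for a single task, in microseconds.
samples_us = [412, 398, 405, 731, 401, 399, 418, 560]
summary = summarize_execution_times(samples_us)
print(summary)
```

In this made-up data set, most runs cluster near 400 µs while two outliers stretch the tail, which is the kind of spread that makes average times a poor guide for hard real-time tasks.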

In SCP setups, meeting upper timing bounds can be guaranteed as long as there is sufficient CPU capacity planned and maintained. In MCP-based systems, meeting these bounds is more difficult due to the lack of effective methods for calculating a guaranteed tasking schedule that accounts for multiple processes running in parallel across heterogeneous cores. This complexity increases when applications share hardware between processor cores. Contention for the use of these Hardware Shared Resources (HSRs) is largely unpredictable, disrupting the measurement of task timing.

Developers of hard real-time systems on MCPs – unlike their counterparts working with single-core systems – cannot rely on static approximation methods to generate usable approximations of BCETs and WCETs. Instead, they must rely on iterative tests and measurements to gain as much confidence as possible in understanding the timing characteristics of tasks.

Understanding CAST-32A and A(M)C 20-193

To better understand the landscape of guidance available to developers, Figure 2 provides a visual representation of the relationships between key civil aviation documents.

 

Figure 2: Relationships between civil aviation regulatory and advisory documents

CAST-32A and A(M)C 20-193 place significant focus on developers providing evidence that the allocated resources of a system are sufficient to allow for worst-case execution times. This evidence requires adapting development processes and tools to iteratively collect and analyze execution times in controlled ways that help developers optimize code throughout the lifecycle.

Neither CAST-32A nor A(M)C 20-193 specifies exact methods for achieving its objectives, leaving developers to implement them in ways that best suit their projects.

Guidance for developers

CAST-32A and A(M)C 20-193 specify MCP timing and interference objectives that span software planning through to verification. They include the following:

  • Documentation of MCP configuration settings throughout the project lifecycle, as the nature of software development and testing makes it likely that configurations will change. (MCP_Resource_Usage_1)
  • Identification of and mitigation strategies for MCP interference channels to reduce the likelihood of runtime issues. (MCP_Resource_Usage_3)
  • Ensuring MCP tasks have sufficient time to complete execution and adequate resources are allocated when hosted in the final deployed configuration. (MCP_Software_1 and MCP_Resource_Usage_4)
  • Exercising data and control coupling between software components during testing to demonstrate that their impacts are restricted to those intended by the design. (MCP_Software_2)
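To illustrate the kind of evidence the third objective calls for, the sketch below checks whether each task's observed worst-case time, inflated by a safety margin, still fits its allocated time budget. The task names, timings, budgets, and the 20% margin are all hypothetical placeholders; neither CAST-32A nor A(M)C 20-193 prescribes a particular margin or check.

```python
def fits_budget(observed_wcet_us, budget_us, margin=0.2):
    """Return True if the observed WCET plus a safety margin fits the budget.

    `margin` is an assumed project-specific factor (0.2 means 20% headroom),
    not a value taken from any guidance document.
    """
    return observed_wcet_us * (1.0 + margin) <= budget_us


# Hypothetical tasks with observed worst-case times and allocated budgets (µs).
tasks = {
    "sensor_fusion": {"observed_wcet_us": 731, "budget_us": 1000},
    "actuator_cmd": {"observed_wcet_us": 450, "budget_us": 500},
}

report = {name: fits_budget(**task) for name, task in tasks.items()}
for name, ok in report.items():
    print(f"{name}: {'within budget' if ok else 'needs attention'}")
```

A task flagged here would be a candidate for further optimization, a larger time allocation, or additional interference mitigation.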

Both CAST-32A and A(M)C 20-193 cover partitioning in time and space, allowing developers to determine WCETs and verify applications separately if they have verified that the MCP platform itself supports robust resource and time partitioning. Making use of these partitioning methods helps developers mitigate interference issues, but not all HSRs can be partitioned in this way. In either case, standards such as DO-178C require evidence of adequate resourcing.

How to analyze execution times on MCP platforms

The methods that follow have proven effective both for WCET analysis and for meeting the guidance of CAST-32A and A(M)C 20-193.

Halstead’s metrics and static analysis

Halstead's complexity metrics can act as an early warning system for developers, providing insights into the complexity and resource demands of specific segments in code. By employing static analysis, developers can combine Halstead data with actual measurements from the target system, resulting in a more efficient path to ensuring adequate application resourcing.

These metrics and others shed light on timing-related aspects of code, like module size, control flow structures, and data flow. Identifying sections with larger size, higher complexity, and intricate data flow patterns helps developers prioritize their efforts and fine-tune code segments that incur the highest demands on processing time. Optimizing these resource-intensive areas early in the lifecycle reduces both later mitigation effort and the risk of timing violations.
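For readers unfamiliar with the metrics, the sketch below computes the standard Halstead measures from operator and operand counts. The counts shown are hypothetical; in practice a static analysis tool would derive them from the source code, and this snippet is not intended to represent how any particular tool implements the calculation.

```python
import math


def halstead(n1, n2, N1, N2):
    """Compute basic Halstead metrics.

    n1, n2: number of distinct operators and distinct operands
    N1, N2: total occurrences of operators and operands
    """
    vocabulary = n1 + n2
    length = N1 + N2
    volume = length * math.log2(vocabulary)
    difficulty = (n1 / 2) * (N2 / n2)
    effort = difficulty * volume
    return {
        "vocabulary": vocabulary,
        "length": length,
        "volume": volume,
        "difficulty": difficulty,
        "effort": effort,
    }


# Hypothetical counts for one function.
metrics = halstead(n1=8, n2=8, N1=20, N2=12)
print(metrics)
```

Functions whose volume or effort stands out from the rest of the code base are reasonable first candidates for closer timing measurement on the target.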

Empirical analysis of execution times

Measuring, analyzing, and tracking individual task execution times helps mitigate issues in modules that fail to meet timing objectives. Dynamic analysis is essential to this process, as it automates the measurement and reporting of task timings and so reduces developer workload.

To ensure accuracy, three considerations must be taken into account:

  1. The analysis must occur in the actual environment where the application will run, eliminating the influence of configuration differences between development and production, such as compiler options, linker options, and hardware features.
  2. Sufficient tests must be executed repeatedly to account for environmental and application variations between runs, ensuring reliable and consistent results.
  3. Automation is highly recommended for executing sufficient tests within a reasonable timeframe and for eliminating the influence of relatively slower manual actions.
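The second and third considerations can be sketched as a simple automated measurement loop. The Python below is illustrative only: `measure` and `example_task` are hypothetical names, and timing a function on a development host does not satisfy the first consideration. On a real project, the same loop would run on the target hardware with the production build and a suitable hardware timer.

```python
import time


def measure(task, runs=1000, warmup=10):
    """Run `task` repeatedly and return per-run execution times in nanoseconds."""
    # Warm-up runs are discarded so one-off start-up effects do not skew results.
    for _ in range(warmup):
        task()
    samples = []
    for _ in range(runs):
        start = time.perf_counter_ns()
        task()
        samples.append(time.perf_counter_ns() - start)
    return samples


def example_task():
    # Hypothetical stand-in for the module under test.
    return sum(i * i for i in range(1000))


samples_ns = measure(example_task, runs=200)
print(f"runs={len(samples_ns)} min={min(samples_ns)} ns max={max(samples_ns)} ns")
```

Collecting every sample, rather than only a running maximum, keeps the full distribution available for histograms and for comparison between runs.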

 

Figure 3: Execution time histograms from the LDRA tool suite (Source: LDRA)

One example is the LDRA tool suite, which employs a “wrapper” test harness to exercise modules on the target device and automate timing measurements. Developers can define specific components under test, whether at the function level, a subsystem of components, or the overall system. Additionally, they can specify CPU stress tests, such as those provided by the open-source stress-ng workload generator, to further improve confidence in the analysis results.

Analysis of application control and data coupling

Control and data coupling analyses play a crucial role in identifying task dependencies within applications. Through these analyses, developers examine how the execution and data dependencies between tasks affect one another. The standards insist on these analyses not only to ensure that all couples have been exercised, but also because of their capacity to reveal potential problems.
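The idea of checking that all couples have been exercised can be sketched as a set comparison between the couples expected from the design and those observed during testing. The task and variable names below are hypothetical, and representing a data couple as a (producer, consumer, data item) tuple is an assumption made for illustration rather than a format defined by the standards or by any tool.

```python
def coupling_coverage(expected, exercised):
    """Return (fraction of expected couples exercised, sorted list of missing couples)."""
    expected = set(expected)
    missing = expected - set(exercised)
    if not expected:
        return 1.0, []
    covered = (len(expected) - len(missing)) / len(expected)
    return covered, sorted(missing)


# Hypothetical data couples: (producer task, consumer task, shared data item).
expected_couples = {
    ("nav_task", "control_task", "position_estimate"),
    ("control_task", "actuator_task", "surface_command"),
    ("monitor_task", "control_task", "mode_flag"),
}
exercised_couples = {
    ("nav_task", "control_task", "position_estimate"),
    ("control_task", "actuator_task", "surface_command"),
}

covered, missing = coupling_coverage(expected_couples, exercised_couples)
print(f"coverage: {covered:.0%}, not yet exercised: {missing}")
```

Any couple left in the missing list points to a dependency that the test suite has not yet demonstrated.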

The LDRA tool suite provides robust support for control and data coupling analyses. As illustrated in Figure 4, these analyses help developers identify critical sections of code requiring optimization or restructuring to improve the timing predictability and resource utilization of the application.

 

Figure 4: Data coupling and control coupling analysis reports from the LDRA tool suite (Source: LDRA)

Conclusion

By understanding the CAST-32A and A(M)C 20-193 guidance and the analysis methods presented here, embedded software developers can better manage the complexities of Hardware Shared Resource interference and address coding issues that impact it – essential in ensuring the efficient and deterministic execution of critical workloads on multicore platforms.

A Chartered Engineer with over 20 years' experience in software development, more recently focused on technical marketing. Proven ability as a developer, team leader, manager and in pre-sales, working in functional safety- and security-critical sectors. Excellent communication skills, demonstrated through project management, commissioning, technical consultancy, and marketing roles.
