AImotive Launches aiWare4 Featuring Advanced Wavefront Processing, Upgraded Safety, and Low-Power For Automotive AI

By Tiera Oliver

Associate Editor

Embedded Computing Design

June 01, 2021


(Image courtesy of AImotive)

AImotive, a supplier of scalable, modular automated driving technologies, has announced the latest release of its aiWare NPU hardware IP.

Featuring upgrades to the on-chip memory architecture, new wavefront-processing algorithms, and updated ISO 26262-compliant safety features, aiWare4 delivers a scalable solution spanning challenging single-chip edge applications to high-performance central processing platforms for automotive AI. With aiWare4, key metrics have been further improved, including TOPS/mm², effective TOPS/W, and the range of CNN topologies executed at high efficiency.

Upgraded capabilities for aiWare4 include:

  • Scalability: up to 64 TOPS per core (up from 32 TOPS for aiWare3) and up to 256 TOPS per multi-core cluster, with flexible configuration of on-chip memory, hardware safety mechanisms, and external/shared memory support
  • Safety: enhanced standard hardware features and related documentation ensure straightforward ISO 26262 ASIL B compliance, suitable for both SEooC (Safety Element out of Context) and in-context safety element applications
  • PPA (Note 1): 8-10 effective TOPS/W for typical CNNs (theoretical peak up to 30 TOPS/W) on a 5 nm or smaller process node; up to 98% efficiency for a wider range of CNN topologies; more flexible power domains enabling dynamic power management that responds to real-time context changes without requiring a restart
  • Processing: Wavefront RAM (WFRAM) leverages aiWare’s latest wavefront-processing and interleaved multi-tasking scheduling algorithms, enabling highly parallel execution, multi-tasking, and reduced memory bandwidth compared to aiWare3 for CNNs that require access to external memory

Per the company, aiWare4 continues to deliver high NPU efficiency (see Note 2), providing the required performance with less silicon. These latest upgrades also enable aiWare4 to execute a range of CNN workloads using only on-chip SRAM, for single-chip edge AI or more highly optimized ASIC or SoC applications.

AImotive will be shipping aiWare4 RTL to lead customers starting Q3 2021.

For more information, visit:

Note 1: PPA: Power, Performance and Area
Note 2: Download the latest aiWare3 benchmark, demonstrating up to 98% efficiency measured on Nextchip’s Apache5 SoC. NPU efficiency measures the percentage of claimed TOPS usable to execute the theoretical GMACs of a CNN workload. Request additional benchmark data here.
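To make the NPU-efficiency metric in Note 2 concrete, here is a minimal sketch of the calculation. The function name and the workload numbers are illustrative assumptions, not AImotive figures; the only facts taken from the article are the definition (percentage of claimed TOPS usable for a CNN workload's theoretical GMACs) and the convention that one multiply-accumulate counts as two operations.

```python
# Sketch of the NPU-efficiency metric described in Note 2.
# Efficiency = effective TOPS (derived from GMACs actually executed per second)
#              divided by the vendor's claimed peak TOPS. 1 MAC = 2 ops.

def npu_efficiency(measured_gmacs_per_s: float, claimed_peak_tops: float) -> float:
    """Fraction of claimed peak TOPS actually used by a CNN workload."""
    effective_tops = measured_gmacs_per_s * 2 / 1000  # GMAC/s -> TOPS (2 ops/MAC)
    return effective_tops / claimed_peak_tops

# Hypothetical example: a 64 TOPS core sustaining 31,360 GMAC/s on a CNN
eff = npu_efficiency(31_360, 64)
print(f"{eff:.0%}")  # 98%
```

A 98% result, as in the Apache5 benchmark claim, means nearly all of the advertised compute is usable on that workload rather than stalled waiting on memory or scheduling.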

Tiera Oliver, Associate Editor for Embedded Computing Design, is responsible for web content edits, product news, and constructing stories. She also assists with newsletter updates as well as contributing and editing content for ECD podcasts and the ECD YouTube channel. Before working at ECD, Tiera graduated from Northern Arizona University, where she received her B.S. in journalism and political science and worked as a news reporter for the university’s student-led newspaper, The Lumberjack.
