Kinara Processor Boosts Performance and Enables Edge AI

By Ken Briodagh

Editor in Chief

Embedded Computing Design

January 27, 2024

It seems like every company is looking to layer AI into its solutions, from the Edge to the Cloud. The technical requirements for even lightweight AI, let alone Generative or Predictive AI, are no simple matter, particularly at the edge.

Power needs often increase drastically, and processing power must be boosted well beyond that of the typical non-AI device. These problems are multiplied at the Edge and in remote and mobile deployments.

Kinara launched its Ara-2 Edge AI processor at CES to address these challenges. The Ara-2 reportedly can power edge servers and laptops with high-performance, cost-effective, and energy-efficient inference, allowing them to run sophisticated software applications such as video analytics, Large Language Models (LLMs), and Generative AI models.

The company says that Ara-2 is also designed to enable edge applications to run AI models with or without transformer-based architectures. According to the announcement, the new processor delivers real-time responsiveness with high throughput by pairing latency optimization with on-chip memory and high off-chip bandwidth. In addition, Kinara said in the release that the Ara-2 will ease migration from GPU processing for a wide variety of AI models, because the compute engines and the associated SDK are designed to support high-accuracy quantization, a dynamically moderated host runtime, and direct FP32 support.

The SDK includes a model compiler and compute-unit scheduler, an integrated Kinara quantizer, support for pre-quantized PyTorch and TFLite models, a load balancer for multi-chip systems, and a dynamically moderated host runtime. The Ara-2 reportedly also offers secure boot, encrypted memory access, and a secure host interface to enable enterprise AI deployments with even greater security.

LLMs and Generative AI in general have become a desirable application for many embedded systems, but running these resource-demanding tools on GPUs in data centers often comes with increased latency, higher costs, and questionable (at best) privacy. Each of these problems alone can disqualify many use cases, particularly those managing critical systems or high-risk automation. Think about automated cars, remote monitoring of energy systems, and security.

Kinara says that the Ara-2 overcomes these problems and makes it simpler for companies to transition to the edge with their embedded systems, even those loaded with sophisticated AI, ML, and other Intelligent Edge applications.

“With Ara-2 added to our family of processors, we can better provide customers with performance and cost options to meet their requirements. For example, Ara-1 is the right solution for smart cameras as well as edge AI appliances with 2-8 video streams, whereas Ara-2 is strongly suited for handling 16-32+ video streams fed into edge servers, as well as laptops, and even high-end cameras,” said Ravi Annavajjhala, CEO, Kinara. “The Ara-2 enables better object detection, recognition, and tracking by using its advanced compute engines to process higher resolution images more quickly and with significantly higher accuracy. And as an example of its capabilities for processing Generative AI models, Ara-2 can hit roughly 0.5 seconds per iteration for Stable Diffusion and tens of tokens/sec for LLaMA-7B.”
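To put the quoted Generative AI figures in end-to-end terms, here is a rough sketch in Python. Only the 0.5 seconds per iteration comes from the quote. The step count, the response length, and the 20 tokens/sec rate (one reading of "tens of tokens/sec") are illustrative assumptions, not Kinara numbers, and the function names are hypothetical.

```python
# Rough latency arithmetic based on the quoted figures.
# Only SECONDS_PER_ITERATION comes from the article; everything else is assumed.

SECONDS_PER_ITERATION = 0.5  # quoted Stable Diffusion figure ("roughly")


def image_time_seconds(steps: int, seconds_per_iteration: float = SECONDS_PER_ITERATION) -> float:
    """Time to generate one image, counting only the denoising iterations."""
    return steps * seconds_per_iteration


def response_time_seconds(tokens: int, tokens_per_second: float) -> float:
    """Time to generate a response of the given length at a steady token rate."""
    return tokens / tokens_per_second


if __name__ == "__main__":
    assumed_steps = 20          # assumption: sampler step count varies by pipeline
    assumed_tokens = 200        # assumption: length of one LLM response
    assumed_rate = 20.0         # assumption: one reading of "tens of tokens/sec"

    print(f"Image: about {image_time_seconds(assumed_steps):.0f} s for {assumed_steps} steps")
    print(f"Text: about {response_time_seconds(assumed_tokens, assumed_rate):.0f} s for {assumed_tokens} tokens")
```

Under these assumptions, both an image and a 200-token reply would land at around ten seconds, before any overhead outside the iteration or token loop.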

Other features include:

  • 8 Gen-2 neural cores for enhanced compute utilization
  • Support for INT4, INT8, MSFP16
  • Access to up to 16GB LPDDR4(X) per chip
  • Secure boot, encrypted memory, and interface
  • 4-lane PCIe Gen 4, USB 3.2 Gen 2 interface
  • Scalable performance with multi-chip and automatic load balancing
  • 17mmx17mm EHS-FCBGA
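The precision and memory entries in this list help explain why quantization matters for running an LLM on a single chip. The Python sketch below estimates weight storage only, assuming a nominal 7 billion parameters for a "7B" model and ignoring activations, caches, and runtime overhead. The 16GB ceiling comes from the list above; MSFP16 is left out because its effective storage per weight is not given in the article.

```python
# Back-of-the-envelope weight footprint. Weights only; all inputs are illustrative.

PARAMS = 7_000_000_000   # assumption: nominal parameter count for a "7B" model
CHIP_MEMORY_GB = 16      # maximum LPDDR4(X) per chip, from the feature list

# Bits stored per weight for the formats considered here.
BITS_PER_WEIGHT = {"INT4": 4, "INT8": 8, "FP32": 32}


def weight_footprint_gb(params: int, bits: int) -> float:
    """Return the storage needed for the weights alone, in GB (10^9 bytes)."""
    return params * bits / 8 / 1e9


if __name__ == "__main__":
    for name, bits in BITS_PER_WEIGHT.items():
        gb = weight_footprint_gb(PARAMS, bits)
        verdict = "fits" if gb <= CHIP_MEMORY_GB else "does not fit"
        print(f"{name}: {gb:.1f} GB of weights, {verdict} in {CHIP_MEMORY_GB} GB")
```

By this estimate, INT4 and INT8 weights (3.5 GB and 7 GB) sit well within one chip's memory, while unquantized FP32 weights (28 GB) would not.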

The Ara-2 is available as a stand-alone device, a USB module, an M.2 module, and a PCIe card featuring multiple processors.

Ken Briodagh is a writer and editor with two decades of experience under his belt. He is in love with technology and if he had his druthers, he would beta test everything from shoe phones to flying cars. In previous lives, he’s been a short order cook, telemarketer, medical supply technician, mover of the bodies at a funeral home, pirate, poet, partial alliterist, parent, partner and pretender to various thrones. Most of his exploits are either exaggerated or blatantly false.
