Edge AI optimization

This story is part of Bosch Research Blog

The four Bosch Research experts (from left), Nina Bretz, Michael Beyer, Christoph Schorn and Cecilia De La Parra in front of a monitor displaying a visualization of a neural network's prediction for an Advanced Driver Assistance System (ADAS).

Artificial intelligence (AI) is rapidly transforming our digital lives. Yet, for all its power, AI is still just beginning its journey into the physical world. The most powerful AI models today reside in massive data centers. Bringing a piece of that intelligence into the products we use every day — our cars, our tools, and complex autonomous systems — is a challenge far greater than you might imagine. This article unpacks why AI must operate “at the edge,” why that is so difficult, and how a groundbreaking toolchain from Bosch Research is making it possible.

The critical need for Edge AI: Why the cloud isn’t enough

First, why embed AI into products at all? Why not just connect everything to a powerful AI in the cloud? The answer is that we will ultimately need both: a centralized Cloud AI and a local Edge AI that operates directly within a device. This Edge AI can react reliably in milliseconds and operate securely on local data.

Many scenarios demand this local intelligence. For example, an autonomous vehicle cannot wait for a cloud round-trip to brake for a pedestrian. Likewise, collaborative humanoid robots require instant decision-making to work safely alongside people.

These scenarios highlight several key drivers for Edge AI:

Latency and reliability: Edge AI provides the split-second, dependable response times essential for safety-critical systems like automated driving.
Connectivity: The application may be in a location with poor or no internet access.
Data bandwidth: Modern autonomous systems generate enormous amounts of data. Processing it locally is far more efficient and affordable than sending it all to the cloud.
Security and privacy: Processing sensitive data on-device reduces the attack surface for cyber threats.
Cost: Relying on the cloud for every single calculation can lead to significant, recurring operational costs.

The bottleneck: The hardware-software performance gap

Given the need for Edge AI, why not just run a cloud model on a local chip? The core of the problem lies in two challenges:

First, a strategic challenge: relying on a single chip family for both cloud and edge can limit design freedom and supply chain resilience.
Second, a technical mismatch: hardware for training AI in the cloud is built for flexibility, while embedded systems on a chip (SoCs) are built for maximum efficiency – performing a specific task with the least amount of power.

This has created a vibrant market of specialized SoCs for Edge AI. While their manufacturers provide toolchains to help deploy AI models, these tools are, by necessity, model-agnostic. They cannot be pre-optimized for the unique architecture and data characteristics of a specific customer’s AI model. This is the source of the hardware-software performance gap.

Imagine trying to pack a suitcase. A generic approach might be to just toss your clothes in, which works but leaves a lot of wasted space. But if you are an expert at folding each specific item, you can fit dramatically more inside. That is precisely what we do with AI.

Our multi-disciplinary team working on Edge AI optimization. From left: Cecilia De La Parra, Nina Bretz, Michael Beyer and Christoph Schorn.

The Bosch solution: A hardware-aware AI co-optimization toolchain

At Bosch, we have developed that expert packing technique for AI. It is an AI inference optimization toolchain, the result of a co-creation effort between Bosch Research and our Bosch Mobility division. Our toolchain serves as a sophisticated engine for hardware-software co-optimization. It analyzes both the AI model and the target chip’s architecture, then automatically restructures the model for maximum efficiency.

For AI-powered perception in advanced driver-assistance systems (ADAS), our toolchain has achieved up to a sixfold inference speed-up on the same hardware compared to a standard deployment, without compromising the model's required output accuracy. This means the same chip can deliver more powerful features, or a smaller, more cost-effective SoC can be used for the application. We have successfully showcased these capabilities with several leading automotive SoCs.
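
One way to picture this trade-off is as a constrained selection: among several deployment variants of the same model, keep only those that meet the required accuracy, then pick the one with the lowest latency on the target chip. The following is a minimal sketch of that idea, not the Bosch toolchain itself. The `Candidate` class, the `pick_deployment` function, the variant names, and all numbers are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Candidate:
    """A hypothetical deployment variant of one model on one chip."""
    name: str
    accuracy: float     # task metric in [0, 1], higher is better
    latency_ms: float   # measured or simulated inference time


def pick_deployment(candidates: list[Candidate],
                    min_accuracy: float) -> Optional[Candidate]:
    """Return the fastest candidate that still meets the accuracy floor."""
    feasible = [c for c in candidates if c.accuracy >= min_accuracy]
    if not feasible:
        return None  # no variant satisfies the accuracy requirement
    return min(feasible, key=lambda c: c.latency_ms)


# Illustrative, made-up numbers (not measurements from this article).
candidates = [
    Candidate("standard deployment", accuracy=0.90, latency_ms=60.0),
    Candidate("int8 + graph tuning", accuracy=0.90, latency_ms=10.0),
    Candidate("aggressive pruning", accuracy=0.80, latency_ms=6.0),
]

baseline = candidates[0]
best = pick_deployment(candidates, min_accuracy=0.89)
speedup = baseline.latency_ms / best.latency_ms
print(f"{best.name}: {speedup:.1f}x faster than {baseline.name}")
```

In this toy setup the fastest variant is rejected because it falls below the accuracy floor, which mirrors the point that speed-ups only count if the required output accuracy is preserved.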

This holistic approach liberates us from being locked into a single hardware supplier, enabling software-defined systems, ensuring supply chain resilience, and giving us a global competitive advantage.

Agent-driven edge AI optimization process: The agent analyzes the execution profile of a neural network model on different hardware options and performs hardware-aware model tuning for the most efficient inference.

Under the hood: How we unleash performance

Our toolchain’s key principle is that we use AI to optimize AI. This is achieved through a suite of automated methods:

Hardware-Aware Neural Architecture Search (HW-NAS): Instead of designing an AI model in a vacuum, our algorithms automatically search for an architecture that is perfectly suited to the target hardware’s strengths and weaknesses from the start.
Quantization-Aware Training (QAT): To drastically reduce model size and improve speed, we convert models to use lower-precision arithmetic (quantization). Crucially, this is not just a post-processing step: it is integrated into a specialized training process (pruning and quantization-aware training) that keeps the model outputs highly accurate even after compression.
Graph-level deployment optimizations: We analyze the complex chain of mathematical calculations inside a neural network and intelligently reorder them to match the way the specific chip processes information, dramatically reducing runtime without affecting the model's output.
AI-driven optimization loop: We take automation a step further by using intelligent agents to control the optimization process itself. These agents gather performance data from the target hardware (or its simulation) and use that information to reason about the best optimization strategy.
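
To make the quantization idea in this list more concrete, here is a minimal sketch of "fake quantization", a building block commonly used in quantization-aware training: in the forward pass, values are snapped to an integer grid and mapped back to floats, so the network experiences the rounding error while it is still being trained. This is a generic illustration under assumed choices (a symmetric 8-bit scheme with one scale per tensor), not the implementation used in the Bosch toolchain. The function names and the example weights are hypothetical.

```python
def symmetric_scale(values, num_bits=8):
    """Per-tensor scale that maps the largest magnitude to the top integer level."""
    qmax = 2 ** (num_bits - 1) - 1           # 127 for 8 bits
    max_abs = max(abs(v) for v in values)
    return max_abs / qmax if max_abs > 0 else 1.0


def fake_quantize(values, num_bits=8):
    """Quantize to signed integers, then dequantize back to floats."""
    qmax = 2 ** (num_bits - 1) - 1
    scale = symmetric_scale(values, num_bits)
    dequantized = []
    for v in values:
        q = round(v / scale)                  # snap to the integer grid
        q = max(-qmax, min(qmax, q))          # clamp to the representable range
        dequantized.append(q * scale)         # back to float for the next layer
    return dequantized, scale


# Made-up example weights, for illustration only.
weights = [0.823, -0.317, 0.052, -1.27, 0.648]
approx, scale = fake_quantize(weights)
max_error = max(abs(w - a) for w, a in zip(weights, approx))
print(f"scale={scale:.4f}, max rounding error={max_error:.5f}")
```

The rounding error per value is bounded by half the scale. In actual training, gradients are typically passed through the rounding step unchanged (the straight-through estimator); that part requires an automatic differentiation framework and is omitted here.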

The path forward: Open ecosystems and shared innovation

The journey of AI from the cloud into the physical world is one of today’s most exciting frontiers. Our vision extends to enabling “liquid intelligence” across the edge-cloud continuum, where algorithms can be deployed on any suitable silicon. However, mastering the future of edge AI is a team sport. The path forward lies in building open ecosystems where automakers, suppliers, and technology companies collaborate through open standards and deep partnerships. Our toolchain is a key enabler for this collaborative future, supporting the development of a multi-vendor chiplet ecosystem crucial for a modular and sovereign European compute landscape. By providing a clear path to targeted AI optimization, Bosch is committed to driving this transformation with our partners.

Author

Christoph Schorn

Christoph leads the activities in the Embedded AI Strategic Portfolio Cluster at Bosch Research, spearheading innovations that bring advanced artificial intelligence from the cloud directly into physical products. He joined Bosch Research in 2016 as a Ph.D. student, dedicating his early work to ensuring the dependability and fault tolerance of edge AI systems. His current focus is at the crucial interface between research and Bosch Mobility, where he drives the successful implementation of hardware-aware AI optimization toolchains. This work is vital for achieving significant performance gains, cost reductions, and strategic flexibility in deploying optimized AI for software-defined vehicles and fostering an open chiplet ecosystem for mobility.
