Skip to main content
Research

Writing reliable software with AI: A generate-and-check approach

Bosch Research Blog | Posted by Jochen Quante and Matthias Woehrle, 2025-07-04

The two Bosch Research experts, Jochen Quante (left,) and Matthias Woehrle (right).

Artificial intelligence is rapidly transforming creative processes — from content generation to product design, software development, and beyond. The practical value of AI-generated output hinges on its reliability and usability, particularly in critical applications such as software development. A key question is how to ensure that AI-driven solutions function effectively and safely in real-world conditions. We see a promising approach to address this challenge: generate-and-check.

As an intuitive example, consider a crossword puzzle. While filling in suggestions, you check that the words actually fit with the overall entries in the puzzle. This is a generate-and-check approach, and we apply it to a generative AI-based system: after the AI generates a creative proposal — such as a word — a separate system checks it against practical constraints and rules. For the crossword, we can check the correct length of suggestions and, over time, the consistency of letters across different entries. Additionally, if there is an issue such as an inconsistency of letters across words, you can provide this feedback to the assistant to improve future suggestions.

The generate-and-check approach is particularly well-suited for software development, where established rules can be automatically checked. These rules can vary in complexity — ranging from coding style guides to formal verification of specified properties. This facilitates verification and validation, as we verify resulting artifacts rather than validating the system under all circumstances. By implementing strong rule sets, we can ensure that the AI's output is safe, accurate, and compliant with specific requirements, reducing the need for extensive human supervision.

We're excited about the potential of generate-and-check and pleased to see this approach gaining traction within the software engineering community. Generate-and-check is inherently a multi-disciplinary approach, e.g., combining AI, software engineering, and — in our case — domain-specific aspects such as embedded systems and safe software. As such, we have a great collaboration in our team where each member contributes their unique background to tackle exciting and challenging applications of generate-and-check. Let's take a closer look at some of these.

The process of generate-and-check
The generate-and-check process: each generated proposal is checked against defined properties. If the check fails, the resulting feedback informs the generation of the next proposal. A result is returned only when the check succeeds, ensuring a high-quality output.

Use case: Code refactoring for performance

Code refactoring provides a compelling example of the generate-and-check approach. Generative AI can be used to suggest and apply refactorings and migrating to a more efficient framework, followed by automated validation. While conventional approaches rely on existing unit tests to ensure no bugs are introduced, we can go further: in many cases, refactoring should be behaviorally invariant. This enables a stronger form of checking, ensuring that the refactored code is functionally equivalent to the original.

Beyond functional correctness, non-functional properties like maintainability and performance can also be evaluated. By leveraging sophisticated profiling tools — such as those developed at Bosch Research for special-purpose platforms — we can automatically verify that refactorings have no negative impact on performance and may even lead to performance improvements.

Use case: Software translation

Software translation presents another compelling use case for the generate-and-check approach. There are many motivations for translating code — for example, to leverage the safety features of a particular language. Traditional transpilers may achieve functional equivalence but often fall short in exploiting the idioms and advantages of the target language or framework. With generate-and-check, we can use AI to translate code idiomatically, while a separate checking phase ensures not only functional equivalence but also adherence to specific properties such as safety, security, and performance requirements. The approach enables the migration of legacy codebases to modern environments, allowing teams to benefit from advancements in language design and framework capabilities — without compromising code quality or introducing new vulnerabilities.

The Bosch Research team working on GenAI-based Software Engineering: Jochen Quante, Andrea Flexeder, Martin Leinberger, Matthias Woehrle, and Jesko Hecking-Harbusch.
Our multi-disciplinary team working on LLM-based Software Engineering. From left: Jochen Quante, Andrea Flexeder, Martin Leinberger, Matthias Woehrle, and Jesko Hecking-Harbusch.

Integration into a software development toolchain

Generative AI for software development works best as an assistive technology. We are exploring various integration strategies depending on the task. For example, generate-and-check plugins can deliver low-latency, high-value feedback directly within IDEs — similar to existing code assistants. For more complex tasks, such as refactoring legacy codebases, background workflows are more suitable. These workflows can leverage offline analyses and propose changes via pull requests (PRs), enabling thorough evaluations (such as performance profiling) without disrupting the developer’s workflow. Developers can then review the supporting evidence for each proposed code change within the PR before deciding to merge. This way, developers stay in control, and the automated checking reduces their effort by providing concise verification and validation insights on generated software artifacts.

Looking ahead

The generate-and-check paradigm marks a significant advancement in reliable, AI-driven software development. By combining the creative power of generative AI with the rigor of automated verification, it enables higher levels of productivity, safety, and code quality. We believe the generate-and-check approach is a key enabler for the next generation of software engineering — and we are excited to see how it continues to evolve and drive innovation across the industry.

What are your thoughts on this topic?

Please feel free to share them or to contact us directly.

Author 1

Jochen Quante, Bosch Research

Jochen Quante

Jochen Quante is a senior expert for software analysis and design at Bosch Research. His research focuses on making software development more efficient, increasing software quality, and supporting developers by utilizing software analyses, formal methods, and artificial intelligence. Since Jochen joined Bosch in 2008, he has worked on these topics with multiple business units, addressing many different use cases. Before that, he was a software engineer at an IT company. He currently leads a research activity that applies generate-and-check to software development.

Jochen holds a diploma in computer science from the University of Karlsruhe (now KIT) and a PhD from the University of Bremen. He regularly serves on program committees for international software engineering conferences such as ICSME, SCAM, and SANER.

Author 2

Matthias Woehrle, Bosch Research

Matthias Woehrle

Matthias is a senior expert for verification hybrid models and embedded AI. Matthias is passionate about applying research to practical applications in an industrial context. At Bosch, his work is applied to various domains, such as embedded control systems, virtual sensors, and autonomous systems. Recently, his focus has been on hybrid modeling — i.e., combining domain knowledge with machine learning — and its benefits with respect to verification and validation.

Matthias holds a diploma in electrical engineering from the University of Karlsruhe (now KIT) and a PhD from ETH Zurich.

Share this on: