VeriStruct automates formal verification of Rust data structures with 99.2% function success rate
Researchers have introduced VeriStruct, a framework that extends AI-assisted formal verification from individual functions to complete data structure modules in Verus, a formal verification tool for Rust. The system verified 128 of 129 functions (99.2%) across eleven Rust data structure modules, using a planner module to generate abstractions, type invariants, specifications, and proof code, with automatic repair of Verus annotation errors.
Key Technical Approach
VeriStruct addresses a critical limitation in existing AI verification systems: they struggle with complex module-level verification tasks. The framework employs a three-stage process:
- Planner module: Orchestrates systematic generation of abstractions, type invariants, specifications, and proof code
- Syntax guidance: Embeds Verus-specific annotation syntax directly within prompts to improve LLM understanding
- Repair stage: Automatically detects and corrects annotation errors and verification-specific semantic mistakes
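To make the planner's output concrete: the annotations it generates take the form of Verus `spec` functions (abstractions and type invariants) plus `requires`/`ensures` clauses on a module's methods. The following is a hypothetical sketch, not code from the paper; the struct, names, and invariant are invented for illustration, and compiling it requires the Verus toolchain and its `vstd` library rather than plain `rustc`:

```rust
use vstd::prelude::*;

verus! {

pub struct BoundedStack {
    data: Vec<u64>,
    capacity: usize,
}

impl BoundedStack {
    // Type invariant: a property the planner would generate and that
    // every method of the module must be proven to preserve.
    pub closed spec fn wf(&self) -> bool {
        self.data@.len() <= self.capacity
    }

    pub fn push(&mut self, v: u64) -> (ok: bool)
        requires old(self).wf(),
        ensures
            self.wf(),
            // Specification relating the new state to the old one,
            // phrased over the abstract sequence view (`@`).
            ok ==> self.data@ == old(self).data@.push(v),
            !ok ==> self.data@ == old(self).data@,
    {
        if self.data.len() < self.capacity {
            self.data.push(v);
            true
        } else {
            false
        }
    }
}

} // verus!
```

Generating exactly these kinds of invariants, views, and pre/postconditions, across every method of a module at once, is the task the planner orchestrates.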
The repair mechanism is particularly significant. LLMs frequently misunderstand Verus' annotation syntax and the domain-specific semantics required for formal verification. Rather than failing on malformed code, VeriStruct identifies errors and attempts correction, reducing manual intervention.
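The repair stage can be pictured as a verify-and-fix loop: run the verifier, and on failure feed the error back to the model for a corrected annotation, up to a retry budget. The sketch below is purely illustrative; `run_verifier` and `attempt_fix` are stand-in stubs (the real system invokes Verus and an LLM), and none of these names come from the paper:

```rust
enum Outcome {
    Verified,
    Failed(String), // verifier error message
}

// Stub standing in for invoking Verus on an annotated module.
// Here, "verified" just means the code carries a postcondition.
fn run_verifier(code: &str) -> Outcome {
    if code.contains("ensures") {
        Outcome::Verified
    } else {
        Outcome::Failed("missing postcondition".to_string())
    }
}

// Stub standing in for an LLM-based repair: it patches the one
// error class the stub verifier can report.
fn attempt_fix(code: &str, _error: &str) -> String {
    format!("{code}\n    ensures result >= 0")
}

// Verify-and-fix loop with a bounded number of attempts.
// Returns the final code and whether verification succeeded.
fn repair_loop(mut code: String, max_attempts: usize) -> (String, bool) {
    for _ in 0..max_attempts {
        match run_verifier(&code) {
            Outcome::Verified => return (code, true),
            Outcome::Failed(err) => code = attempt_fix(&code, &err),
        }
    }
    (code, false)
}
```

The bounded retry budget matters in practice: it is what turns a model that frequently emits malformed annotations on the first try into a pipeline that converges on verifiable code without manual intervention.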
Evaluation Results
VeriStruct was evaluated on eleven Rust data structure implementations:
- Module-level success: 10 of 11 modules fully verified (90.9%)
- Function-level success: 128 of 129 functions verified (99.2%)
- Coverage: Verified entire real-world data structure implementations including linked lists, hash tables, and other standard structures
The single unverified function accounts for the one module that did not fully verify; overall, the framework maintained high performance even on complex, interconnected verification tasks.
Significance for Formal Verification
Formal verification, mathematically proving that code meets its specification, has remained a specialist domain requiring deep expertise. Most previous AI assistance focused on verifying individual functions; VeriStruct's success at module-level verification marks a meaningful step up in capability.
The framework demonstrates that with proper prompt engineering, error recovery, and domain-specific guidance, LLMs can contribute substantively to verification tasks that typically require human experts. This has implications for code security, especially in systems where correctness failures carry high costs (cryptography, safety-critical systems, financial software).
Limitations and Open Questions
The research was conducted on data structures—a well-defined problem space. Generalization to more complex, domain-specific verification problems remains untested. The paper does not disclose which LLM(s) powered the system, limiting reproducibility assessments.
What This Means
VeriStruct represents concrete progress toward AI-assisted formal verification at scale. The 99.2% function-level success rate on real data structures suggests LLMs, when properly constrained with syntax guidance and error correction, can handle non-trivial verification tasks. This could expand formal verification accessibility beyond specialist teams, though the approach still appears specialized to Verus and Rust. The automatic repair mechanism—catching and fixing LLM annotation errors—may prove reusable across other formal verification contexts.