Video [Off-By-One 2024] AI-Powered Bug Hunting - Evolution and benchmarking

weaver · 03.09.2024

Description

While AI holds promise for assisting bug hunting, its actual impact remains unclear. This presentation addresses this gap by introducing Crash-Benchmark, a standardized evaluation framework for AI-driven static analysis tools.

We’ll share results from a simple bug-hunting AI agent, AK1, and discuss the implications for optimizing AI-based bug hunting in C/C++ codebases.

AI bug hunting presents unique challenges: early models lacked sophistication and struggled to comprehend large codebases. Moreover, privacy concerns often necessitate the exclusive use of local models, which are typically less capable than the commercial AI models offered by industry leaders such as OpenAI and Google.

To illustrate this challenge, we’ll showcase AK1, a simple rule-based AI agent capable of autonomously identifying various bug classes within C/C++ codebases.

Notably, its model-agnostic design lets its performance improve with each new model release. Nevertheless, evaluating the effectiveness of AI-based tools is difficult because their output is subjective to judge.

https://offbyone.sg

