Description
https://offbyone.sg
While AI holds promise for assisting bug hunting, its actual impact remains unclear. This presentation addresses this gap by introducing Crash-Benchmark, a standardized evaluation framework for AI-driven static analysis tools.
We’ll share results from a simple bug-hunting AI agent, AK1, and discuss the implications for optimizing AI-based bug hunting in C/C++ codebases.
AI-assisted bug hunting presents unique challenges. Early models lacked sophistication and struggled to comprehend large codebases. Moreover, privacy concerns often require the exclusive use of local models, which are generally less capable than the commercial models offered by industry leaders such as OpenAI and Google.
To illustrate these challenges, we’ll showcase AK1, a simple rule-based AI agent that autonomously identifies various bug classes in C/C++ codebases.
Notably, its model-agnostic design lets its performance improve with each new model release. Nevertheless, evaluating the effectiveness of AI-based tools is difficult because their output is subjective.