DARPA - Defense Advanced Research Projects Agency

07/31/2024 | Press release | Distributed by Public on 07/31/2024 15:02

Eliminating Memory Safety Vulnerabilities Once and For All

Memory safety vulnerabilities are the most prevalent type of disclosed software vulnerability1 and affect a computer's memory in two primary ways. First, programming languages like C allow programmers to manipulate memory directly, making it easy to accidentally introduce errors in their program that would enable a seemingly routine operation to corrupt the state of memory. Second, memory safety issues can arise when a programming language exhibits an "undefined behavior." Undefined behaviors happen when the programming language standard provides no specification or guidance on how the program should behave under conditions not explicitly defined in the standard.

After more than two decades of grappling with memory safety issues in C and C++, the software engineering community has reached a consensus. Relying on bug-finding tools is not enough. Even the Office of the National Cyber Director has called for more proactive approaches to eliminate memory safety vulnerabilities to reduce potential attacks2.

While it's been no secret that memory safe programming languages can eliminate memory safety vulnerabilities, the challenge has been rewriting legacy code at scale that matches the vastness of the problem. The C language was created in the 1970s and has become ubiquitous. It has been used to develop applications that run everything from modern smartphones to space vehicles and beyond. And the Department of Defense has long-lived systems that disproportionately depend on programming languages like C.

However, in recent years, a cultural shift toward the programming language Rust and recent breakthroughs in machine learning techniques, like large language models (LLMs), have created an environment that may lend itself to a new class of solutions.

DARPA's Translating All C to Rust (TRACTOR) program wants to seize this opportunity by substantially automating the translation of the world's legacy C code to Rust.

"You can go to any of the LLM websites, start chatting with one of the AI chatbots, and all you need to say is 'here's some C code, please translate it to safe idiomatic Rust code,' cut, paste, and something comes out, and it's often very good, but not always," said Dr. Dan Wallach, DARPA program manager for TRACTOR. "The research challenge is to dramatically improve the automated translation from C to Rust, particularly for program constructs with the most relevance."

TRACTOR will strive to create the same quality and style that a skilled Rust developer would produce, thereby eliminating the entire class of memory safety security vulnerabilities in C programs.

Wallach anticipates proposals that include novel combinations of software analysis, such as static and dynamic analysis, and large language models. The program will host public competitions throughout the effort to test the capabilities of the LLM-powered solutions.

"Rust forces the programmer to get things right," said Wallach. "It can feel constraining to deal with all the rules it forces, but when you acclimate to them, the rules give you freedom. They're like guardrails; once you realize they're there to protect you, you'll become free to focus on more important things."

DARPA will sponsor a Proposers Day on Aug. 26, 2024, which attendees can attend in person or virtually. Participants must register by Aug. 19, 2024. Details and registration info are available at SAM.Gov.

[1]https://www.cisa.gov/sites/default/files/2023-12/The-Case-for-Memory-Safe-Roadmaps-508c.pdf

[2]https://www.whitehouse.gov/oncd/briefing-room/2024/02/26/memory-safety-fact-sheet/