JAIST - Japan Advanced Institute of Science and Technology

10/21/2024 | Press release | Distributed by Public on 10/21/2024 00:10

Towards a Safe Society 5.0: Reinforcement Learning Pentesting Agent Training in Realistic Network Environments

Researchers developed an innovative and realistic reinforcement learning agent training framework for penetration testing (pentesting) purposes

  • Researchers at the Japan Advanced Institute of Science and Technology (JAIST) implemented a framework named PenGym that supports the creation of realistic training environments for reinforcement learning pentesting agents, accommodating scenarios of varying complexity with actual network hosts and security vulnerabilities. This implementation was done in collaboration with KDDI Research, Inc. (hereafter KDDI Research).
  • The experimental results demonstrated the advantages and effectiveness of using PenGym as a realistic training environment, with PenGym-trained agents showing superior pentesting performance compared to simulation-trained agents.
  • Optimizations implemented in PenGym keep the training duration reasonable compared to simulation, even though the agents execute actual actions on the network hosts, and the approach leads to a high overall realism of the trained agents.

Ensuring the security of network systems and infrastructure is a critical aspect of cybersecurity. Penetration testing (pentesting) is an effective method for evaluating the security posture of a network. In recent years, researchers have aimed to develop efficient approaches for conducting the pentesting procedure automatically, to address the shortcomings of traditional methods, which are manual and time-consuming. One approach is to use reinforcement learning (RL) techniques, which have been applied to create automated agents that mimic the actions of human pentesters but with enhanced speed, scale, and precision. Various simulation environments have been introduced as the main method to train these RL agents. However, their heavy reliance on predefined constants and probabilistic values for agent actions and environment states can lead to inaccuracies in replicating real-world behavior, because some factors are not modeled, and this decreases agent accuracy and performance. In addition, the simulated network may not accurately represent the configuration and topology of an actual network.

To address this "reality gap", a team of researchers led by Associate Professor Razvan Beuran, together with his doctoral student Huynh Phuong Thanh Nguyen at the Japan Advanced Institute of Science and Technology (JAIST) and researchers at KDDI Research, has designed and implemented PenGym, an effective and reliable framework for training RL pentesting agents in realistic environments, as part of a joint project with KDDI Research. PenGym enables RL agents to execute actual actions on realistic hosts in network environments. For this purpose, the framework contains an Action/State Module that implements a set of real pentesting actions through which the RL agents interact with the training environment. Moreover, the training environment is based on the cyber range technology used for human cybersecurity training and is created automatically according to several pentesting scenarios. Several optimization techniques were implemented to reduce the execution time of PenGym. As a result, the framework eliminates the need for action modeling and represents network and security dynamics more accurately than simulation-based environments. The study was published in Computers & Security.
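The difference between a modeled action and an executed action can be illustrated with a minimal Python sketch. This is not PenGym code: the `Host` class and the functions `simulated_exploit` and `executed_exploit` are hypothetical names introduced here only for illustration, and a dictionary lookup stands in for running a real tool against a real host. The sketch assumes that a simulator decides the outcome from a predefined success probability, while a realistic environment derives the outcome from the state of the target itself.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Host:
    """Toy stand-in for a network host (hypothetical, for illustration only)."""
    services: dict = field(default_factory=dict)   # port -> service name
    vulnerable: set = field(default_factory=set)   # names of vulnerable services


def simulated_exploit(success_prob: float, rng: random.Random) -> bool:
    """Simulation-style action: outcome is drawn from a predefined probability."""
    return rng.random() < success_prob


def executed_exploit(host: Host, port: int) -> bool:
    """Execution-style action: outcome follows from the host's actual state.

    In a realistic environment this lookup would be replaced by actually
    running the action against the host and observing the result.
    """
    service = host.services.get(port)
    return service is not None and service in host.vulnerable


if __name__ == "__main__":
    rng = random.Random(0)
    host = Host(services={22: "ssh", 80: "http"}, vulnerable={"http"})

    # The simulated outcome ignores the host entirely and varies run to run.
    print("simulated:", [simulated_exploit(0.8, rng) for _ in range(5)])

    # The executed outcome depends only on what the host actually exposes.
    print("executed on port 80:", executed_exploit(host, 80))
    print("executed on port 22:", executed_exploit(host, 22))
```

In this toy setting, an agent trained only against `simulated_exploit` learns the modeled probability rather than the behavior of any particular host, which is one way to picture the reality gap described above.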

Using a real network environment in which actual pentesting actions can be executed, as done in this research, yields promising results compared to simulated environments. In particular, the experiments demonstrated the advantages and effectiveness of using PenGym as a realistic training environment for RL pentesting agents: the PenGym-trained agents showed superior pentesting performance in real networks compared to simulation-trained agents.

Based on their experimental results, the researchers believe that this work could lead to changes in various network-related research areas, potentially replacing the traditional approach of creating complex logical models to simulate network environments with more realistic methods. Furthermore, realistic training environments can be applied to other research areas. One important example is automated cyber defense using RL agents, which can be used to enhance the protection mechanisms of real network infrastructure and contribute to the trustworthiness of Society 5.0. To support the work of other researchers in this field, they released PenGym as open source on GitHub.

Image title: Figure 1
Image caption: Overview of the PenGym framework architecture.
Image credit: Razvan Beuran from JAIST.
License type: Original Content.
Usage restrictions: Cannot be reused without permission.

Image title: Figure 2
Image caption: Example of a realistic network environment used for agent training in PenGym.
Image credit: Razvan Beuran from JAIST.
License type: Original Content.
Usage restrictions: Cannot be reused without permission.

Reference

Title of original paper: PenGym: Realistic training environment for reinforcement learning pentesting agents
Authors: Huynh Phuong Thanh Nguyen, Kento Hasegawa (KDDI Research), Kazuhide Fukushima (KDDI Research), Razvan Beuran
Journal: Computers & Security
DOI: 10.1016/j.cose.2024.104140

PenGym source code URL: https://github.com/cyb3rlab/PenGym

Collaboration
This work was based on a joint research project with KDDI Research, Inc.

October 18, 2024