AutoPentest-DRL – Automated Penetration Testing Using Deep Reinforcement Learning

AutoPentest-DRL - What is Deep Reinforcement Learning and Penetration Testing


AutoPentest-DRL is an automated penetration testing framework based on Deep Reinforcement Learning (DRL) techniques. The framework determines the most appropriate attack path for a given network, and can be used to execute a simulated attack on that network via penetration testing tools, such as Metasploit. AutoPentest-DRL is being developed by the Cyber Range Organization and Design (CROND) NEC-endowed chair at the Japan Advanced Institute of Science and Technology (JAIST) in Ishikawa, Japan.

An overview of AutoPentest-DRL is shown below. The framework can use network scanning tools, such as Nmap, to find vulnerabilities in the target network; alternatively, user input can be employed. The MulVAL attack-graph generator is used to determine potential attack trees, which are then fed in a simplified form into the DQN Decision Engine. The attack path that is produced as output can be fed into penetration testing tools, such as Metasploit, to conduct an attack on a real target network, or it can be used with a logical network instead, for example for educational purposes. In addition, a topology generation algorithm is used to produce multiple network topologies that are used to train the DQN.
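To make the decision step more concrete, here is a minimal sketch of how a reward-annotated attack graph can be turned into a single attack path. It is not the DQN Decision Engine itself: the neural network is replaced by a small table of Q-values updated with repeated Bellman sweeps, and the host names, rewards, goal bonus and discount factor are all hypothetical values chosen for illustration.

```python
# Toy attack graph: node -> {next_node: step_reward}.
# All hosts and reward values are hypothetical, not taken from AutoPentest-DRL.
GRAPH = {
    "internet": {"web_server": 4.0, "mail_server": 2.0},
    "web_server": {"file_server": 3.0, "db_server": 6.0},
    "mail_server": {"file_server": 5.0},
    "file_server": {"db_server": 1.0},
    "db_server": {},  # attack goal, no outgoing steps
}
GOAL = "db_server"
GOAL_BONUS = 100.0  # extra reward for reaching the goal (assumption)
GAMMA = 0.9         # discount factor (assumption)


def learn_q(graph, goal, gamma=GAMMA, sweeps=50):
    """Tabular stand-in for a DQN: sweep the Bellman update over every edge."""
    q = {state: {nxt: 0.0 for nxt in edges} for state, edges in graph.items()}
    for _ in range(sweeps):
        for state, edges in graph.items():
            for nxt, reward in edges.items():
                bonus = GOAL_BONUS if nxt == goal else 0.0
                future = max(q[nxt].values(), default=0.0)
                q[state][nxt] = reward + bonus + gamma * future
    return q


def greedy_path(q, start, goal):
    """Follow the highest-valued action from start until the goal is reached."""
    path = [start]
    while path[-1] != goal:
        actions = q[path[-1]]
        if not actions:
            raise ValueError(f"dead end at {path[-1]}")
        nxt = max(actions, key=actions.get)
        if nxt in path:
            raise ValueError("cycle detected in greedy path")
        path.append(nxt)
    return path


if __name__ == "__main__":
    q_table = learn_q(GRAPH, GOAL)
    print(" -> ".join(greedy_path(q_table, "internet", GOAL)))
```

With these toy values the greedy path is internet -> web_server -> db_server. A DQN plays the same role as the table here, but approximates the Q-values with a neural network so that it can generalize across the many topologies used for training.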

[Figure: Overview of the AutoPentest-DRL framework]

Next, we provide brief information on how to set up and use AutoPentest-DRL. For details, please refer to the User Guide that we also make available.


Several external tools are needed in order to use AutoPentest-DRL, as follows:

  • MulVAL: Attack-graph generator used by AutoPentest-DRL to produce possible attack paths for a given network. See the MulVAL page for installation instructions. MulVAL should be installed in the directory repos/mulval in the AutoPentest-DRL folder. You also need to configure the /etc/profile file as discussed here.
  • Nmap: Network scanner used by AutoPentest-DRL to determine vulnerabilities in a given real network. The command needed to install nmap on Ubuntu is given below:
sudo apt-get install nmap

  • Metasploit: Penetration testing framework used by AutoPentest-DRL to actually conduct the attack proposed by the DQN engine on the real target network. To install Metasploit, you can use the installers made available on the Metasploit website. In addition, we use pymetasploit3 as the RPC API to communicate with Metasploit; this tool needs to be installed in the directory Penetration_tools/pymetasploit3 by following its author’s instructions.
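Before moving on, it can help to confirm that the tools listed above are in place. The following is a small sketch, not part of the AutoPentest-DRL distribution: the function name check_prerequisites is hypothetical, and it assumes that nmap and Metasploit's msfconsole should be on the PATH and that the two directories named above exist under the AutoPentest-DRL folder.

```python
import shutil
from pathlib import Path


def check_prerequisites(root="."):
    """Report which external tools and directories are present.

    `root` is assumed to be the AutoPentest-DRL folder.
    """
    root = Path(root)
    return {
        # Executables looked up on the PATH
        "nmap": shutil.which("nmap") is not None,
        "msfconsole": shutil.which("msfconsole") is not None,
        # Directories mentioned in the prerequisites list
        "repos/mulval": (root / "repos" / "mulval").is_dir(),
        "Penetration_tools/pymetasploit3": (
            root / "Penetration_tools" / "pymetasploit3"
        ).is_dir(),
    }


if __name__ == "__main__":
    for name, present in check_prerequisites().items():
        print(f"{name:35s} {'found' if present else 'MISSING'}")
```

Run it from the AutoPentest-DRL folder; any entry reported as MISSING points to a prerequisite that still needs to be installed. The check only tests for presence, so it does not verify the /etc/profile configuration that MulVAL needs.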


AutoPentest-DRL has been developed mainly on the Ubuntu 18.04 LTS operating system; other OSes may work, but have not been tested. In order to set up AutoPentest-DRL, use the releases page to download the latest version, and extract the source code archive into a directory of your choice (for instance, your home directory) on the host on which you intend to use it.

AutoPentest-DRL is implemented in Python, and it requires several packages to run. The file requirements.txt included with the distribution can be used to install the necessary packages via the following command that should be run from the AutoPentest-DRL/ directory:

sudo -H pip install -r requirements.txt

The last step is to install the database, which contains information about real hosts and vulnerabilities. For this purpose, download the asset file named database.tgz from the release page, and extract it into the Database/ directory.
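If you prefer to script this step, the sketch below unpacks the archive with Python's standard tarfile module. It assumes that database.tgz has already been downloaded into the current directory; the function name extract_database is hypothetical, and the path check is an extra precaution rather than something the project requires.

```python
import tarfile
from pathlib import Path


def extract_database(archive="database.tgz", dest="Database"):
    """Extract a gzip-compressed tar archive into `dest` and list the result."""
    dest_path = Path(dest)
    dest_path.mkdir(parents=True, exist_ok=True)
    base = dest_path.resolve()
    with tarfile.open(archive, "r:gz") as tar:
        # Refuse archive members that would be written outside `dest`.
        for member in tar.getmembers():
            target = (base / member.name).resolve()
            if target != base and base not in target.parents:
                raise ValueError(f"unsafe path in archive: {member.name}")
        tar.extractall(dest_path)
    return sorted(p.name for p in dest_path.iterdir())


if __name__ == "__main__":
    print(extract_database())
```

The function returns the names of the top-level entries now present in Database/, which gives a quick way to confirm that the extraction produced files.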

Quick Start

The simplest way to use AutoPentest-DRL is to start it in the logical attack mode, which determines the optimal attack path for a given logical network. For more information about this and the other operation modes (real attack mode and training mode), see our User Guide.

In order to use the logical attack mode on a sample network topology, run the following command from a terminal window:

python3 ./ logical_attack

The logical network topology used in this attack mode is described in the file MulVal_P/logical_attack.P, which includes details about the servers, their connections, and their vulnerabilities. This file can be modified following the syntax described in the MulVAL documentation.
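As a rough illustration of what such a topology description looks like, the sketch below prints a handful of facts in the style of MulVAL's sample input files. The host names, ports and the placeholder vulnerability ID are hypothetical and are not taken from the logical_attack.P file shipped with AutoPentest-DRL; the predicate names follow the MulVAL samples as far as we know, so check them against the MulVAL documentation before relying on them. The script only prints text and does not modify any file.

```python
def fact(predicate, *args):
    """Render one Datalog-style fact, e.g. hacl(a, b, tcp, 80)."""
    return f"{predicate}({', '.join(str(a) for a in args)})."


def quoted(text):
    """Wrap a value in single quotes, as done for vulnerability IDs."""
    return f"'{text}'"


def build_topology():
    """Return the lines of a tiny, made-up two-host topology."""
    vuln_id = quoted("CVE-0000-0000")  # placeholder ID, not a real CVE
    return [
        # Where the attacker starts and what the attacker wants
        fact("attackerLocated", "internet"),
        fact("attackGoal", "execCode(fileServer, _)"),
        # Host access control: source, destination, protocol, port
        fact("hacl", "internet", "webServer", "tcp", 80),
        fact("hacl", "webServer", "fileServer", "tcp", 445),
        # Service running on a host, and the vulnerability it carries
        fact("networkServiceInfo", "webServer", "httpd", "tcp", 80, "apache"),
        fact("vulExists", "webServer", vuln_id, "httpd"),
        fact("vulProperty", vuln_id, "remoteExploit", "privEscalation"),
    ]


if __name__ == "__main__":
    print("\n".join(build_topology()))
```

Generating the facts from a script is optional; the same lines can be written by hand. The point is that each server, connection and vulnerability in the topology corresponds to one fact in the file.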

In the logical attack mode, no actual attack is conducted, and only the optimal attack path is provided as output. You can study the attack steps in detail by referring to the visualization of the attack graph that MulVAL generates in the file mulval_results/AttackGraph.pdf. The figures below provide examples of such output.


For a research background regarding AutoPentest-DRL, please refer to the following paper:

  • Z. Hu, R. Beuran, Y. Tan, “Automated Penetration Testing Using Deep Reinforcement Learning”, IEEE European Symposium on Security and Privacy Workshops (EuroS&PW 2020), Workshop on Cyber Range Applications and Technologies (CACOE’20), Genova, Italy, September 7, 2020, pp. 2-10.

For a list of contributors to this project, see the file CONTRIBUTORS included in the distribution.
