master thesis


Neu­ral­Vis­U­AL: Deep Neural Network Visualization in the Unreal Engine for Interactive Fly-through Exploration


An open-source, extensible and modular framework for visualizing the inner workings of artificial intelligence, developed as my master thesis for graduation in Media Computer Science at the University of Tübingen.

View the open-source repository
Download the thesis as PDF

Project Characteristics

Time frame

Mar 2021 - Sep 2021, full-time

Thesis content

A total of 106 pages, citing 211 different sources, featuring an extensive background section that explains Deep Neural Networks, AI visualizations, and force-based graph drawing algorithms in detail.

Open-source license

The code of this master thesis project is published under the open-source license GNU GPLv3.

Professors / supervision

Examining professor: Prof. Dr. Hendrik Lensch,
Secondary professor: Prof. Dr. Martin V. Butz,
Project supervision: Mark Boss

Final grade

1.0 each for thesis and defense
(equivalent to American A+ / GPA 4.0)

Tools utilized

Unreal Engine, Python, NumPy, Blueprints, C++, TensorFlow, Keras, WebSocket, Trello, LaTeX, Git

Skills I learned / improved upon through this project

Project management, development prioritisation, framework building, interactive visualizations, Artificial Intelligence, Deep Neural Networks, Force-based layouting, building asynchronous server-client infrastructure

Abstract

Deep neural networks have astounding capabilities, surpassing human abilities in many disciplines. Due to their increasingly complex nature, with modern architectures consisting of hundreds of millions, sometimes even billions of trainable weights, it is virtually impossible for researchers to intuitively understand how exactly these networks come to produce such incredible results. Research in neural network interpretability and network visualization aims to provide more insight into the inner workings of artificial intelligence.

NeuralVisUAL (Deep Neural Network Visualization in the Unreal Engine for Interactive Fly-through Exploration) is the first step towards an extensible framework for neural network visualization, facilitating a better understanding of these networks through exploration. It is a modular open-source application, written in Python, C++, and Unreal Engine blueprints, that visualizes any feed-forward deep neural network developed in TensorFlow and Keras. NeuralVisUAL utilizes the force-based algorithm ForceAtlas 2 with some modifications to calculate a meaningful layout of the given network on a two-dimensional plane. This layout determines where to spawn objects in a virtual game environment, which the user can freely explore, interacting with the network through this application. Furthermore, the application visualizes the kernels of convolution layers in convolutional neural networks, the corresponding activation maps, saliency maps, and integrated gradients according to user-defined preferences.
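The layout step can be illustrated with a minimal sketch, assuming a strongly simplified ForceAtlas 2 variant: degree-weighted repulsion that falls off with 1/distance and attraction that grows linearly with distance along each edge. The function name `force_layout`, its parameters, and the fixed step size with a displacement cap are assumptions made for this illustration; the thesis uses its own modifications of the algorithm, which are not reproduced here.

```python
import math


def force_layout(nodes, edges, iterations=500, repulsion=1.0, step=0.01, max_move=0.1):
    """Return {node: (x, y)} from a simplified ForceAtlas2-style simulation."""
    count = len(nodes)
    # Deterministic start: spread the nodes evenly on the unit circle.
    pos = {
        name: [math.cos(2 * math.pi * i / count), math.sin(2 * math.pi * i / count)]
        for i, name in enumerate(nodes)
    }
    degree = {name: 0 for name in nodes}
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1

    for _ in range(iterations):
        force = {name: [0.0, 0.0] for name in nodes}
        # Repulsion between every pair, scaled by (degree + 1) of both nodes
        # and falling off with 1 / distance.
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                dx = pos[a][0] - pos[b][0]
                dy = pos[a][1] - pos[b][1]
                dist = math.hypot(dx, dy) or 1e-9
                strength = repulsion * (degree[a] + 1) * (degree[b] + 1) / dist
                force[a][0] += strength * dx / dist
                force[a][1] += strength * dy / dist
                force[b][0] -= strength * dx / dist
                force[b][1] -= strength * dy / dist
        # Attraction along each edge, linear in the distance.
        for a, b in edges:
            dx = pos[b][0] - pos[a][0]
            dy = pos[b][1] - pos[a][1]
            force[a][0] += dx
            force[a][1] += dy
            force[b][0] -= dx
            force[b][1] -= dy
        # Assumption: fixed step size with a capped displacement, instead of
        # the adaptive per-node speed of the published algorithm.
        for name in nodes:
            mx = step * force[name][0]
            my = step * force[name][1]
            length = math.hypot(mx, my)
            if length > max_move:
                mx, my = mx * max_move / length, my * max_move / length
            pos[name][0] += mx
            pos[name][1] += my
    return {name: (x, y) for name, (x, y) in pos.items()}


if __name__ == "__main__":
    # Hypothetical four-layer chain standing in for a small network.
    layers = ["input", "conv1", "conv2", "dense"]
    links = [("input", "conv1"), ("conv1", "conv2"), ("conv2", "dense")]
    for name, (x, y) in force_layout(layers, links).items():
        print(f"{name}: ({x:.2f}, {y:.2f})")
```

For two connected nodes, the forces in this sketch balance where the distance d satisfies d = 4 * repulsion / d, which gives a quick sanity check of the simulation.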

Neu­ral­Vis­U­AL consists of several distinct modules connected by precisely defined interface interactions. Among other advantages, this allows for a separation between a server interacting with the neural network and an Unreal Engine 4 client that renders the visualization for the user to explore freely.

Conclusion

This thesis presents NeuralVisUAL, a modular open-source application designed to visualize feed-forward DNNs using UE4 [Inc21]. It utilizes a modified force-based algorithm (FBA) based on ForceAtlas 2 [JVHB14] to calculate an easily comprehensible 2D layout that aims to visualize the directional information flow, keeping connected layers close while maintaining space between unrelated ones. After calculating and rendering this layout in the virtual environment, the application can visualize convolution kernels, activation maps, saliency maps, and integrated gradients within the network.
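Of the visualizations listed above, integrated gradients is the easiest to illustrate in a few lines. The following is a minimal sketch, assuming a toy model (a single sigmoid unit with an analytic gradient) instead of a TensorFlow network; the names `model`, `gradient`, and `integrated_gradients` are hypothetical and not taken from the NeuralVisUAL code base. The attribution of each input feature is the difference between input and baseline, multiplied by the gradient averaged along the straight path between them.

```python
import math


def model(x, weights, bias):
    """Toy differentiable model: a single sigmoid unit."""
    z = sum(w * v for w, v in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def gradient(x, weights, bias):
    """Analytic gradient of the sigmoid unit with respect to its input."""
    s = model(x, weights, bias)
    return [s * (1.0 - s) * w for w in weights]


def integrated_gradients(x, baseline, weights, bias, steps=50):
    """Approximate integrated gradients with a midpoint Riemann sum."""
    totals = [0.0] * len(x)
    for k in range(steps):
        # Sample the gradient at the midpoint of each path segment.
        alpha = (k + 0.5) / steps
        point = [b + alpha * (v - b) for v, b in zip(x, baseline)]
        for i, g in enumerate(gradient(point, weights, bias)):
            totals[i] += g
    # Scale the averaged gradient by the input-baseline difference.
    return [(v - b) * t / steps for v, b, t in zip(x, baseline, totals)]


if __name__ == "__main__":
    weights, bias = [1.0, -2.0, 0.5], 0.0
    x, baseline = [1.0, 0.5, 2.0], [0.0, 0.0, 0.0]
    attributions = integrated_gradients(x, baseline, weights, bias)
    print("attributions:", attributions)
    print("sum of attributions:", sum(attributions))
    print("output difference:", model(x, weights, bias) - model(baseline, weights, bias))
```

A useful check is the completeness property of the method: the attributions should sum approximately to the difference between the model output at the input and at the baseline.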

NeuralVisUAL contains several distinct modules connected by well-defined interfaces. Most crucial is the separation between server and client, which allows the visualization to run on a different machine than the neural network to be analyzed. The Python server directly loads the neural network, interacts with it, and processes its data. Furthermore, it is responsible for calculating visualizations from this data, according to visualization settings that the user can adapt to each project. To render these visualizations, it generates instructions for manipulating the virtual world and sends them to the client via a WebSocket connection [FM11], serialized with msgpack [Fur19]. Within NeuralVisUAL's client module, a UE4 C++ plugin receives these instructions and calls blueprint functions [Inc20b] while unpacking them. These functions, designed in Unreal's blueprint visual scripting system [Inc20b], are responsible for caching the relevant data from unpacking, interpreting the instructions, and spawning objects in the virtual world. Finally, UE4 renders this world, providing an interactive game rendering environment and allowing the user to freely explore the visualization as a 3D representation [Inc21]. The user exclusively interacts with the neural network through this application and can send commands to the server through a custom console within the game environment. The client relays those commands back through blueprints, the C++ plugin, and the WebSocket connection to the server. Finally, the Python server asynchronously receives and processes these commands, calling the corresponding methods to fulfill the user's requests, such as generating new visualizations.
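The round trip described above can be sketched with the Python standard library alone. This is an illustration under stated assumptions, not the actual protocol: `asyncio.Queue` objects stand in for the WebSocket connection, JSON stands in for msgpack, and the command `show_layer` as well as the instruction fields `op`, `name`, and `position` are hypothetical.

```python
import asyncio
import json


def make_spawn_instruction(layer_name, x, y):
    """Build a hypothetical world-manipulation instruction for the client."""
    return {"op": "spawn_layer", "name": layer_name, "position": [x, y]}


def pack(instruction):
    # Stand-in for msgpack serialization: JSON encoded as UTF-8 bytes.
    return json.dumps(instruction, sort_keys=True).encode("utf-8")


def unpack(payload):
    # Stand-in for the unpacking done by the client plugin.
    return json.loads(payload.decode("utf-8"))


async def handle_command(command, outbox):
    """Dispatch one console command and queue the resulting instruction."""
    verb, _, argument = command.partition(" ")
    if verb == "show_layer":
        await outbox.put(pack(make_spawn_instruction(argument, 0.0, 0.0)))
    else:
        await outbox.put(pack({"op": "error", "message": f"unknown command: {verb}"}))


async def server_loop(inbox, outbox):
    """Process commands as they arrive until a None sentinel is received."""
    while True:
        command = await inbox.get()
        if command is None:
            break
        await handle_command(command, outbox)


async def demo():
    # The two queues play the role of the two directions of the connection.
    inbox, outbox = asyncio.Queue(), asyncio.Queue()
    for command in ["show_layer conv1", "dance", None]:
        await inbox.put(command)
    await server_loop(inbox, outbox)
    received = []
    while not outbox.empty():
        received.append(unpack(outbox.get_nowait()))
    return received


def run_demo():
    return asyncio.run(demo())


if __name__ == "__main__":
    for instruction in run_demo():
        print(instruction)
```

Keeping the instruction format independent of the transport is what lets the server and the rendering client run on different machines: either side only needs to agree on the serialized messages.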

This modularity provides several key advantages. It eases development by circumventing relatively long C++ compilation times while still providing high performance for computationally expensive calculations, especially when the server runs on a high-performance machine. Furthermore, the separation between server and client facilitates analyzing networks on servers dedicated to ML research while executing the UE4 rendering application on a client machine. Finally, this modularity permits quick adaptation of NeuralVisUAL to other use cases, by extending its functionality or even modifying modules to serve entirely different purposes.

The primary purpose of NeuralVisUAL is to further the research in the field of neural network visualizations. DNNs, especially for computer vision, possess a high level of complexity, often having hundreds of millions, or even billions of trainable weights [SZ14, BMR+20, SMM+17]. This complexity makes it relatively difficult for researchers to comprehensively grasp the precise mechanisms by which these networks generate super-human results [ZZ18, ZGCH21]. The fields of neural network interpretability and visualization address this problem, aiming to give researchers and developers more profound insight into the inner workings of AI [ZZ18].

NeuralVisUAL currently works exclusively on feed-forward DNNs developed with TensorFlow and Keras, and has been tested on a 64-bit Windows computer with the WebSocket connecting through localhost. Thanks to its adaptability, the provided visualization can be extended to meet project-specific requirements. For example, it could be helpful to increase interactivity, implement more user guidance, display individual neurons or show backpropagation. It also could be useful to implement feature visualization [OMS17], training progress comparison, VR visualization, or interactive exploration of latent space [LNH+18, STN+16] and spatial activations [CAS+19].

Neu­ral­Vis­U­AL is a modular CNN visualization framework that helps researchers obtain a more exhaustive understanding of neural networks. Its settings, modularity, and open-source status make it easily adaptable to personal preferences and extendable to individual visualization requirements. Neu­ral­Vis­U­AL contains helpful features to visualize architecture, kernels, activations, saliency, and gradients and allows users to intuitively comprehend how a CNN processes the input information, furthering research into AI visualization and explainability.

Read more scientific details in the full PDF of this thesis