Incorporating Domain Expertise in Reinforcement Learning through Human Intervention
Reinforcement Learning (RL) is a well-known technique for problems in which a sequence of decisions leads to an optimal solution. In RL, the agent typically tries to reach that solution by maximizing reward through trial and error. This approach has recently shown superhuman performance in games such as Atari titles, Go (AlphaGo), and StarCraft II. For real-world applications, however, trial and error can impose significant risk. For example, RL is attractive for computer-aided diagnosis in the medical domain, but in such a domain a single erroneous trial can be very expensive and is not acceptable.
A possible solution is to incorporate domain knowledge into the RL agent's training process through human intervention. In this project, we will investigate how human supervision can be incorporated into an RL paradigm to prevent the agent from taking erroneous steps that could be disastrous. The goal is to occasionally ask a domain expert for feedback on the agent's candidate next steps, so that the best one is picked based on qualitative criteria (e.g. safety, heuristic importance) that only human experts can assess. Incorporating such domain expertise is expected not only to teach the RL agent to take safe steps, but also to shorten training time, which is another common issue with RL.
In the 4-month project, the candidate will conduct a literature review on incorporating human experts into RL, design a practical algorithm, then implement it and evaluate different aspects of it in a simulated environment (e.g. OpenAI Gym, DeepMind Lab).
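The idea of occasionally asking an expert to vet the agent's candidate next steps can be sketched as follows. This is only a minimal illustration under stated assumptions, not the algorithm the project will produce: it uses tabular Q-learning on a made-up six-state corridor, and the names `step`, `simulated_expert`, `train`, and `query_prob` are all hypothetical. The human expert is replaced by a hard-coded rule so that the sketch runs on its own.

```python
import random

# Toy corridor (an assumption for illustration): states 0..5.
# State 0 is a catastrophic "cliff"; state 5 is the goal.
N_STATES = 6
CLIFF, GOAL, START = 0, 5, 2
ACTIONS = (-1, +1)  # move left, move right


def step(state, action):
    """Return (next_state, reward, done) for the toy corridor."""
    nxt = state + action
    if nxt == CLIFF:
        return nxt, -10.0, True
    if nxt == GOAL:
        return nxt, 1.0, True
    return nxt, -0.01, False


def simulated_expert(state, candidates):
    """Stand-in for a human expert: removes candidate actions judged unsafe."""
    safe = [a for a in candidates if state + a != CLIFF]
    return safe or list(candidates)


def train(episodes=200, query_prob=1.0, eps=0.2, alpha=0.5, gamma=0.95, seed=0):
    """Q-learning where the expert is asked, with probability query_prob,
    to vet the candidate actions before the agent chooses one."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    catastrophes = queries = 0
    for _ in range(episodes):
        state, done = START, False
        while not done:
            candidates = list(ACTIONS)
            if rng.random() < query_prob:
                candidates = simulated_expert(state, candidates)
                queries += 1
            # Epsilon-greedy choice restricted to the vetted candidates.
            if rng.random() < eps:
                action = rng.choice(candidates)
            else:
                action = max(candidates, key=lambda a: q[(state, a)])
            nxt, reward, done = step(state, action)
            best_next = 0.0 if done else max(q[(nxt, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            if nxt == CLIFF:
                catastrophes += 1
            state = nxt
    return q, catastrophes, queries
```

With `query_prob=1.0` the stand-in expert vets every step, so the agent never enters the cliff state in this toy setup. Lowering `query_prob` reduces the number of expert queries at the cost of allowing unvetted, possibly unsafe moves; that trade-off is one of the aspects a real design would need to evaluate.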
* Literature review on involving human feedback in the RL agent training process.
* Design a new algorithm, focusing on incorporating qualitative aspects into RL through human experts.
* Programming to implement the algorithm.
* Integrate the implementation into a simulation environment such as OpenAI Gym or DeepMind Lab.
* Calculate performance metrics on simple benchmarks such as Atari.
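As a rough illustration of the evaluation task, the sketch below aggregates a few metrics that seem relevant to human-in-the-loop training. The choice of metrics, the `EpisodeLog` record, and its field names are assumptions made for this example; the posting does not specify which metrics will be used.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class EpisodeLog:
    """Hypothetical per-episode record; field names are illustrative."""
    total_reward: float
    steps: int
    expert_queries: int
    catastrophe: bool


def summarize(logs):
    """Aggregate episode logs into mean return, catastrophe rate,
    and expert queries per environment step."""
    total_steps = sum(log.steps for log in logs)
    return {
        "mean_return": mean(log.total_reward for log in logs),
        "catastrophe_rate": sum(log.catastrophe for log in logs) / len(logs),
        "queries_per_step": sum(log.expert_queries for log in logs) / total_steps,
    }
```

Reporting expert queries alongside return and catastrophe rate makes the cost of human supervision visible, so runs with and without intervention can be compared on both safety and expert effort.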
Must be VERY good at programming (solid understanding of the material covered in COE428, COE328, and COE318). Python experience is a bonus. Some prior understanding of machine learning, especially reinforcement learning, is preferred.
Naimul Mefraz Khan : Incorporating Domain Expertise in Reinforcement Learning through Human Intervention | Tuesday March 26th 2019 06:40 PM