TY - JOUR
T1 - Intrinsic Rewards for Maintenance, Approach, Avoidance and Achievement Goal Types
AU - Dhakan, Paresh
AU - Merrick, Kathryn
AU - Rano, Ignacio
AU - Siddique, N
PY - 2018/10/9
Y1 - 2018/10/9
AB - In reinforcement learning, reward is used to guide the learning process. The reward is often designed to be task-dependent, and it may require significant domain knowledge to design a good reward function. This paper proposes general reward functions for maintenance, approach, avoidance, and achievement goal types. These reward functions exploit the inherent property of each type of goal and are thus task-independent. We also propose metrics to measure an agent's performance for learning each type of goal. We evaluate the intrinsic reward functions in a framework that can autonomously generate goals and learn solutions to those goals using a standard reinforcement learning algorithm. We show empirically how the proposed reward functions lead to learning in a mobile robot application. Finally, using the proposed reward functions as building blocks, we demonstrate how compound reward functions, reward functions to generate sequences of tasks, can be created that allow the mobile robot to learn more complex behaviors.
KW - intrinsic reward function
KW - goal types
KW - open-ended learning
KW - autonomous goal generation
KW - reinforcement learning
UR - https://pure.ulster.ac.uk/en/publications/intrinsic-rewards-for-maintenance-approach-avoidance-and-achievem
U2 - 10.3389/fnbot.2018.00063
DO - 10.3389/fnbot.2018.00063
M3 - Article
C2 - 30356820
VL - 12
SP - 1
EP - 16
JO - Frontiers in Neurorobotics
JF - Frontiers in Neurorobotics
SN - 1662-5218
IS - 63
M1 - 63
ER -