Place recognition is an essential component to address the problem of visual navigation and SLAM. The long-term place recognition is challenging as the environment exhibits significant variations across different times of the days, months, and seasons. In this paper, we view appearance changes as multiple domains and propose a Feature Disentanglement Network (FDNet) based on a convolutional auto-encoder and adversarial learning to extract two independent deep features– content and appearance. In our network, the content feature is learned which only retains the content information of images through the competition with the discriminators and content encoder. Besides, we utilize the triplets loss to make the appearance feature encode the appearance information. The generated content features are directly used to measure the similarity of images without dimensionality reduction operations. We use datasets that contain extreme appearance changes to carry out experiments, which show how meaningful recall at 100% precision can be achieved by our proposed method where existing state-of-art approaches often get worse performance.
- Visual place recognition
- Changing environment
- Adversarial learning
- Representation disentanglement