Sometimes collecting a massive dataset just isn’t in the cards. How best to leverage small datasets for machine learning tasks is an active area of exploration. Synthetic data has the potential to alleviate object rarity and long-tail distributions, provided the synthetic data introduces more signal than noise into the system. The IQT Labs Synthesizing Robustness project explored whether domain adaptation strategies (attempting to make the synthetic data look more “realistic”) can improve the efficacy of the synthetic data when it comes to localizing rare objects.
This project builds upon our earlier RarePlanes project, in which we examined the value of synthetic data in aiding computer vision algorithms tasked with detecting rare objects in satellite imagery. In brief, the RarePlanes dataset consists of both real data (satellite imagery from Maxar satellites) and synthetic data (imagery generated via a game engine generated by AI Reverie). The Synthesizing Robustness project extends RarePlanes work by exploring whether Generative Adversarial Networks (GANs), a type of deep learning model which can be used to produce photo-realistic images, can be used to improve the value of synthetic data.