Turbulence Mitigation Transformer

Xingguang Zhang
Zhiyuan Mao
Nicholas Chimitt
Stanley Chan
[Paper]
[Code]
[Relevant]
[P2S Simulator]






Abstract

Restoring images distorted by atmospheric turbulence is a ubiquitous problem in long-range imaging applications. While existing deep-learning-based methods have demonstrated promising results under specific testing conditions, they suffer from three limitations: (1) they generalize poorly from synthetic training data to real turbulence data; (2) they fail to scale, causing memory and speed challenges when extended to a large number of frames; (3) they lack a fast and accurate simulator to generate data for training neural networks.
In this paper, we introduce the turbulence mitigation transformer (TMT), which specifically aims to resolve these issues. TMT brings three contributions. First, TMT explicitly uses turbulence physics by decoupling the turbulence degradation and introducing a multi-scale loss for removing distortion, thus improving effectiveness. Second, TMT presents a new attention module along the temporal axis to extract features efficiently, thus improving memory efficiency and speed. Third, TMT introduces a new simulator based on the Fourier sampler, temporal correlation, and flexible kernel size, thus improving our capability to synthesize better training data. TMT outperforms state-of-the-art video restoration models, especially in generalizing from synthetic to real turbulence data.
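To illustrate why attention restricted to the temporal axis helps with memory, here is a minimal NumPy sketch. It is not TMT's actual module: the function name `temporal_self_attention`, the projection matrices `w_q`, `w_k`, `w_v`, and the tensor layout `(T, H, W, C)` are all assumptions made for illustration. Each pixel location is treated as an independent sequence of T frame tokens, so the attention maps have shape (HW, T, T) rather than the (THW, THW) map that full spatio-temporal attention would require.

```python
import numpy as np


def temporal_self_attention(x, w_q, w_k, w_v):
    """Plain self-attention over the frame axis only (illustrative sketch).

    x: array of shape (T, H, W, C), a stack of T feature frames.
    w_q, w_k, w_v: hypothetical projection matrices of shape (C, D).
    Returns an array of shape (T, H, W, D).
    """
    T, H, W, C = x.shape
    # Treat every pixel location as its own sequence of T tokens.
    tokens = x.reshape(T, H * W, C).transpose(1, 0, 2)  # (HW, T, C)
    q = tokens @ w_q  # (HW, T, D)
    k = tokens @ w_k  # (HW, T, D)
    v = tokens @ w_v  # (HW, T, D)
    # Scaled dot-product scores between frames at the same location.
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(q.shape[-1])  # (HW, T, T)
    # Numerically stable softmax over the last (key-frame) axis.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    out = weights @ v  # (HW, T, D)
    return out.transpose(1, 0, 2).reshape(T, H, W, -1)


# Usage on random data: 12 frames of 8x8 features with 4 channels.
rng = np.random.default_rng(0)
frames = rng.standard_normal((12, 8, 8, 4))
w_q, w_k, w_v = (rng.standard_normal((4, 4)) for _ in range(3))
features = temporal_self_attention(frames, w_q, w_k, w_v)
print(features.shape)  # (12, 8, 8, 4)
```

In this sketch the attention cost grows quadratically with the number of frames T but only linearly with the number of pixels HW, which is the general reason temporal-axis attention scales better to long sequences than attention over all space-time tokens.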




Results on real-world static image sequences






Source of input images: 12 frames of patterns 9, 13, 14, 15, and 16 in the OTIS dataset




Source of input images: first 12 frames of the hillhouse sequence in the CLEAR dataset







Source of input images: first 12 frames of the 2nd, 24th, 58th, and 96th sequences in the static text dataset




Results on real-world dynamic image sequences

Left: input sequence
Right: restored

Source of the input video: the CLEAR dataset



Source of the input video: the CLEAR dataset



Source of the input video: the TSRWGAN dataset



Source of the input video: the TSRWGAN dataset




Network Scheme