Wave-equation migration velocity analysis (WEMVA) builds a kinematically accurate macro velocity model that captures the large-scale structures of the subsurface, for use in seismic imaging or as a starting model for full waveform inversion (FWI). Differential semblance optimization (DSO) formulates the misfit function in the image domain by penalizing either the unfocused energy in subsurface-offset gathers or the unflatness of angle gathers. The penalty applied by DSO, however, produces gradients with strong artifacts, which slows convergence. Here, we propose to formulate the misfit function using optimal transport (OT) between neighbouring traces in angle-domain common-image gathers (ADCIGs). Specifically, we measure the unflatness of the gathers by the Wasserstein distance between adjacent traces. The proposed objective function is expected to reach its minimum at the correct velocity, where the angle gathers are flat. Numerical examples on a two-layer model and the Marmousi model demonstrate the validity of the new method for estimating the macro velocity model.
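To make the idea of measuring unflatness with a Wasserstein distance concrete, the following is a minimal sketch and not the implementation used in this work. It rests on several assumptions that the text above does not specify: traces are converted to non-negative, unit-mass densities by squaring, the 1-D Wasserstein-1 distance is used (computed from the difference of cumulative distributions), and the function names and the synthetic Gaussian gather are purely hypothetical.

```python
import numpy as np

def to_density(trace, eps=1e-12):
    """Map a signed trace to a non-negative, unit-mass density.
    Assumption: squaring is used here; other normalizations are possible."""
    p = np.asarray(trace, dtype=float) ** 2 + eps
    return p / p.sum()

def w1_distance(trace_a, trace_b, dz=1.0):
    """1-D Wasserstein-1 distance: L1 norm of the difference of the two CDFs."""
    cdf_a = np.cumsum(to_density(trace_a))
    cdf_b = np.cumsum(to_density(trace_b))
    return float(np.sum(np.abs(cdf_a - cdf_b)) * dz)

def ot_flatness_misfit(gather, dz=1.0):
    """Sum of W1 distances between adjacent angle traces.
    `gather` has shape (n_angles, n_depth)."""
    gather = np.asarray(gather, dtype=float)
    n_angles = gather.shape[0]
    return sum(w1_distance(gather[i], gather[i + 1], dz)
               for i in range(n_angles - 1))

def synthetic_gather(curvature, n_angles=11, n_depth=200, z0=100.0, width=4.0):
    """Hypothetical test gather: one Gaussian event whose depth moves
    quadratically with a normalized angle (curvature = 0 gives a flat event)."""
    z = np.arange(n_depth)
    angles = np.linspace(0.0, 1.0, n_angles)
    centers = z0 + curvature * angles ** 2
    return np.exp(-0.5 * ((z[None, :] - centers[:, None]) / width) ** 2)

flat_misfit = ot_flatness_misfit(synthetic_gather(0.0))
mild_misfit = ot_flatness_misfit(synthetic_gather(10.0))
curved_misfit = ot_flatness_misfit(synthetic_gather(30.0))
print(flat_misfit, mild_misfit, curved_misfit)
```

In this toy setting the misfit vanishes for the flat gather and grows with the residual moveout, which is the behaviour the proposed objective function relies on. A practical implementation would also need the gradient of this misfit with respect to velocity, which the sketch does not address.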