We present the theory of wave-equation Radon tomography (WRT), where the slopes and zero-intercept times of early arrivals in the τ−p domain are inverted for the subsurface velocity structure. The early arrivals are windowed in a shot gather, but they are still too wiggly to avoid local minima with a full waveform inversion (FWI) method. To reduce their complexity, a local linear Radon τ−p transform is applied to the events to focus them into a few points. These points, which identify the slopes and zero-intercept times of the early arrivals, are picked to give the observed slowness coordinate $p_i^{obs}$ at the zero-intercept time $\tau_i$. The misfit function $\epsilon = \sum_{i=1}^{P} (p_i - p_i^{obs})^2 + \sum_{i=1}^{P} (\tau_i - \tau_i^{obs})^2$ is computed, and a gradient optimization method is used to find the optimal velocity model that minimizes $\epsilon$. Results with synthetic and field data show that WRT can accurately reconstruct the near-surface P-wave velocity model and converges faster than other wave-equation methods.
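
As a rough illustration of the picking and misfit steps described above, the following Python sketch applies a plain (global, not local) linear τ−p slant stack to a single synthetic linear event, picks the focused peak, and evaluates the sum-of-squares misfit between predicted and picked slowness and intercept time. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the Gaussian test pulse, the sampling parameters, and the use of a simple maximum-amplitude pick are all hypothetical choices made for the example, and no wave-equation modeling or gradient computation is included.

```python
import numpy as np


def synthetic_linear_event(offsets, dt, nt, tau0, p0, width=0.01):
    """Hypothetical test gather: one Gaussian pulse along t = tau0 + p0 * x."""
    t = np.arange(nt) * dt
    arrival = tau0 + p0 * np.asarray(offsets)
    return np.exp(-(((t[:, None] - arrival[None, :]) / width) ** 2))


def slant_stack(data, offsets, dt, slownesses):
    """Linear tau-p transform: m(tau, p) = sum over x of d(tau + p * x, x)."""
    nt = data.shape[0]
    t = np.arange(nt) * dt
    panel = np.zeros((nt, len(slownesses)))
    for j, p in enumerate(slownesses):
        for k, x in enumerate(offsets):
            # Sample trace k along the line t = tau + p * x; zero outside the record.
            panel[:, j] += np.interp(t + p * x, t, data[:, k], left=0.0, right=0.0)
    return panel


def pick_peak(panel, dt, slownesses):
    """Assumed picking rule: take the largest absolute amplitude in the panel."""
    i_tau, i_p = np.unravel_index(np.argmax(np.abs(panel)), panel.shape)
    return float(i_tau * dt), float(slownesses[i_p])


def misfit(p_pred, tau_pred, p_obs, tau_obs):
    """Sum of squared slowness residuals plus squared intercept-time residuals."""
    p_res = np.atleast_1d(p_pred) - np.atleast_1d(p_obs)
    tau_res = np.atleast_1d(tau_pred) - np.atleast_1d(tau_obs)
    return float(np.sum(p_res ** 2) + np.sum(tau_res ** 2))


# Assumed geometry: offsets in km, slowness in s/km, time in s.
offsets = np.linspace(0.0, 1.0, 21)
slownesses = np.linspace(0.0, 1.0, 41)
dt, nt = 0.004, 251

gather = synthetic_linear_event(offsets, dt, nt, tau0=0.2, p0=0.5)
panel = slant_stack(gather, offsets, dt, slownesses)
tau_obs, p_obs = pick_peak(panel, dt, slownesses)

# Misfit of a hypothetical predicted pick against the "observed" pick.
print(tau_obs, p_obs, misfit(0.55, 0.25, p_obs, tau_obs))
```

In this toy setting the linear event collapses to one point in the τ−p panel, so a single pair (τ, p) summarizes it. In an inversion loop, the predicted picks would come from data simulated in the current velocity model, and the misfit would be reduced by a gradient method.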