Abstract

Motivation
Electron tomography (ET) has become an indispensable tool for structural biology. In ET, tilt series alignment and projection parameter calibration are the key steps towards high-resolution ultrastructure analysis. Fiducial markers are usually embedded in the sample to aid the alignment. Despite advances in algorithms that find the correspondence of fiducial markers across tilted micrographs, the error rate of existing methods remains high enough that manual correction is still required. In addition, existing algorithms do not work well when the number of fiducial markers is large.

Results
In this paper, we aim to fully solve the fiducial marker correspondence problem. We divide the correspondence workflow into two stages: (i) initial transformation determination, and (ii) local correspondence refinement. In the first stage, we model transform estimation as a correspondence pair inquiry and verification problem, using local geometric constraints and invariant features to reduce its complexity. In the second stage, we encode the geometric distribution of the fiducial markers with a weighted Gaussian mixture model and introduce drift parameters to correct for the effects of beam-induced motion and sample deformation. Comprehensive experiments on real-world datasets demonstrate the robustness, efficiency and effectiveness of the proposed algorithm. In particular, the two-stage algorithm produces accurate tracking within an average of ≤ ms per image, even for micrographs with hundreds of fiducial markers, which makes real-time ET data processing possible.

Availability
The code is available at https://github.com/icthrm/auto-tilt-pair. The detailed original figures shown in the experiments can be accessed at https://rb.gy/6adtk4.
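To make the idea behind the first stage concrete, the following is a minimal sketch of a generic hypothesize-and-verify search over correspondence pairs. It is not the authors' implementation: it assumes a 2D similarity transform between two marker sets, uses a simple segment-length check as a stand-in for the local geometric constraints, and the names `estimate_similarity`, `to_complex`, `tol` and `len_tol` are hypothetical.

```python
import numpy as np
from itertools import permutations


def to_complex(pts):
    """Represent (N, 2) coordinates as complex numbers x + iy."""
    return pts[:, 0] + 1j * pts[:, 1]


def estimate_similarity(src, dst, tol=1.0, len_tol=0.1):
    """Hypothesize-and-verify a 2D similarity transform z -> a*z + b.

    Each hypothesis comes from one ordered marker pair in src and one in dst.
    It is scored by how many src markers land within `tol` of a dst marker;
    ties are broken by the smaller summed residual.
    """
    p, q = to_complex(src), to_complex(dst)
    best = (None, None, [])
    best_n, best_resid = 0, np.inf
    for i, j in permutations(range(len(p)), 2):
        d_src = abs(p[j] - p[i])
        if d_src == 0:
            continue  # coincident markers cannot define a transform
        for k, l in permutations(range(len(q)), 2):
            d_dst = abs(q[l] - q[k])
            # Pruning (assumption): segment lengths are roughly preserved
            # between the two images, so mismatched pairs are skipped.
            if abs(d_dst - d_src) > len_tol * d_src:
                continue
            a = (q[l] - q[k]) / (p[j] - p[i])  # rotation and scale
            b = q[k] - a * p[i]                # translation
            # Verification: distance from each mapped src marker to every dst marker.
            dist = np.abs((a * p + b)[:, None] - q[None, :])
            nearest = dist.argmin(axis=1)
            nearest_dist = dist[np.arange(len(p)), nearest]
            ok = nearest_dist < tol
            n, resid = int(ok.sum()), float(nearest_dist[ok].sum())
            if n > best_n or (n == best_n and resid < best_resid):
                matches = [(int(s), int(nearest[s])) for s in np.flatnonzero(ok)]
                best, best_n, best_resid = (a, b, matches), n, resid
    return best


# Synthetic example: six markers, rotated by 3 degrees, shifted, shuffled,
# plus one spurious marker that has no counterpart in src.
src = np.array([[0, 0], [10, 2], [3, 14], [20, 9], [7, 25], [16, 18]], dtype=float)
a_true, b_true = np.exp(1j * np.deg2rad(3.0)), 5.0 - 2.0j
q_true = a_true * to_complex(src) + b_true
moved = np.column_stack([q_true.real, q_true.imag])
perm = [3, 0, 5, 1, 4, 2]  # dst[k] is the image of src[perm[k]]
dst = np.vstack([moved[perm], [[40.0, 40.0]]])

a, b, matches = estimate_similarity(src, dst)
print(matches)
```

The exhaustive pair enumeration costs on the order of N²M² hypotheses before pruning, which is why constraints that discard implausible pairs early matter as the number of markers grows. A refinement stage, such as the mixture-model step described above, would then start from the returned transform and matches.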