TY - JOUR

T1 - ML-descent: An optimization algorithm for full-waveform inversion using machine learning

AU - Sun, Bingbing

AU - Alkhalifah, Tariq Ali

N1 - KAUST Repository Item: Exported on 2021-02-09
Acknowledgements: We thank the editor-in-chief J. Shragge for improving the manuscript. We appreciate the comments and suggestions by the assistant editor A. Guitton and the associate editor S. A. L. de Ridder in the reviewing process. We thank the members of SWAG at KAUST for the useful discussions.

PY - 2020/10/21

Y1 - 2020/10/21

AB - Full-waveform inversion (FWI) is a nonlinear optimization problem, and a typical optimization algorithm such as the nonlinear conjugate gradient or limited-memory Broyden-Fletcher-Goldfarb-Shanno (LBFGS) would iteratively update the model mainly along the gradient-descent direction of the misfit function or a slight modification of it. Based on the concept of meta-learning, rather than using a hand-designed optimization algorithm, we have trained the machine (represented by a neural network) to learn an optimization algorithm, entitled the “ML-descent,” and apply it in FWI. Using a recurrent neural network (RNN), we use the gradient of the misfit function as the input, and the hidden states in the RNN incorporate the history information of the gradient similar to an LBFGS algorithm. However, unlike the fixed form of the LBFGS algorithm, the machine-learning (ML) version evolves in response to the gradient. The loss function for training is formulated as a weighted summation of the L2 norm of the data residuals in the original inverse problem. As with any well-defined nonlinear inverse problem, the optimization can be locally approximated by a linear convex problem; thus, to accelerate the training, we train the neural network by minimizing randomly generated quadratic functions instead of performing time-consuming FWIs. To further improve the accuracy and robustness, we use a variational autoencoder that projects and represents the model in latent space. We use the Marmousi and the overthrust examples to demonstrate that the ML-descent method shows faster convergence and outperforms conventional optimization algorithms. The energy in the deeper part of the models can be recovered by the ML-descent even when the pseudoinverse of the Hessian is not incorporated in the FWI update.

UR - http://hdl.handle.net/10754/667268

UR - http://mr.crossref.org/iPage?doi=10.1190%2Fgeo2019-0641.1

U2 - 10.1190/geo2019-0641.1

DO - 10.1190/geo2019-0641.1

M3 - Article

VL - 85

SP - R477-R492

JO - GEOPHYSICS

JF - GEOPHYSICS

SN - 0016-8033

IS - 6

ER -