TY - GEN

T1 - ML-descent: An optimization algorithm for FWI using machine learning

AU - Sun, Bingbing

AU - Alkhalifah, Tariq Ali

N1 - KAUST Repository Item: Exported on 2020-10-01
Acknowledgements: We thank KAUST for funding this research and the members of the SWAG group for useful discussions.

PY - 2019/8/10

Y1 - 2019/8/10

N2 - Full-waveform inversion (FWI) is a nonlinear inversion problem, and a typical optimization algorithm such as nonlinear conjugate-gradient or LBFGS iteratively updates the model along the gradient-descent direction of the misfit function or a slight modification of it. Rather than using a hand-designed optimization algorithm, we trained a machine to learn an optimization algorithm, which we refer to as "ML-descent", and applied it in FWI. Using a recurrent neural network (RNN), we use the gradient of the misfit function as input for training, and the hidden states in the RNN use the history information of the gradient, similar to a BFGS algorithm. However, unlike the fixed BFGS algorithm, the ML version evolves as the gradient directs it to evolve. The loss function for training is formulated as a summation of the FWI misfit function, measured by the L2-norm of the data residual. Any well-defined nonlinear inverse problem can be locally approximated by a linear convex problem; thus, to accelerate training, we train the neural network using the solution of randomly generated quadratic functions instead of the time-consuming FWI gradient. We use the Marmousi example to demonstrate that the ML-descent method outperforms the steepest descent method, and that the energy in the deeper part of the model can be well compensated by ML-descent when the pseudo-inverse of the Hessian is not incorporated in the FWI gradient.

AB - Full-waveform inversion (FWI) is a nonlinear inversion problem, and a typical optimization algorithm such as nonlinear conjugate-gradient or LBFGS iteratively updates the model along the gradient-descent direction of the misfit function or a slight modification of it. Rather than using a hand-designed optimization algorithm, we trained a machine to learn an optimization algorithm, which we refer to as "ML-descent", and applied it in FWI. Using a recurrent neural network (RNN), we use the gradient of the misfit function as input for training, and the hidden states in the RNN use the history information of the gradient, similar to a BFGS algorithm. However, unlike the fixed BFGS algorithm, the ML version evolves as the gradient directs it to evolve. The loss function for training is formulated as a summation of the FWI misfit function, measured by the L2-norm of the data residual. Any well-defined nonlinear inverse problem can be locally approximated by a linear convex problem; thus, to accelerate training, we train the neural network using the solution of randomly generated quadratic functions instead of the time-consuming FWI gradient. We use the Marmousi example to demonstrate that the ML-descent method outperforms the steepest descent method, and that the energy in the deeper part of the model can be well compensated by ML-descent when the pseudo-inverse of the Hessian is not incorporated in the FWI gradient.

UR - http://hdl.handle.net/10754/661903

UR - https://library.seg.org/doi/10.1190/segam2019-3215304.1

UR - http://www.scopus.com/inward/record.url?scp=85079486031&partnerID=8YFLogxK

U2 - 10.1190/segam2019-3215304.1

DO - 10.1190/segam2019-3215304.1

M3 - Conference contribution

SP - 2288

EP - 2292

BT - SEG Technical Program Expanded Abstracts 2019

PB - Society of Exploration Geophysicists

ER -