Transmembrane topology is the key to understanding the 3D structures of multi-pass Transmembrane Proteins (mTPs). However, accurately predicting the 1D topology label for each residue of an mTP from evolutionary information alone is very challenging, if not infeasible. In this work, we propose a novel approach that identifies the transmembrane topology under an object detection framework. The framework takes as input the 2D distance matrix predicted from co-evolutionary information and is followed by several bidirectional Transformer blocks that fuse 2D and 1D features for accurate label prediction. Specifically, we employ the Faster-RCNN module to simultaneously predict the rectangular bounds that cover the interacting transmembrane regions and the confidence scores that discriminate them from non-transmembrane regions. To integrate the 2D pairwise features with the 1D sequential features, we build the bidirectional Transformer blocks from self-attention units, which capture long-range dependencies in the transmembrane topology. Tested on 330 non-redundant mTPs and 45 newly released mTPs, our approach achieves Segment OVerlap (SOV) scores of 0.927 and 0.843, respectively, which are about 4.5% and 6.6% better than those of the cutting-edge consensus methods.
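To make the link between 2D rectangular bounds and 1D residue labels concrete, the following is a minimal illustrative sketch, not the method described above (which fuses features with Transformer blocks rather than projecting boxes directly). It assumes each detected box is a hypothetical tuple `(row_start, row_end, col_start, col_end, score)` of inclusive residue indices on the distance matrix, and that residues are labeled `M` (transmembrane) or `N` (non-transmembrane); the function name, the label alphabet, and the 0.5 threshold are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical box format: inclusive residue indices along the two axes of the
# 2D distance matrix, plus a detector confidence score in [0, 1].
Box = Tuple[int, int, int, int, float]


def boxes_to_labels(length: int, boxes: List[Box], threshold: float = 0.5) -> str:
    """Project confident 2D boxes onto a 1D per-residue label string.

    A box covering a pair of interacting segments implies two 1D segments:
    one along the row axis and one along the column axis.
    """
    labels = ["N"] * length
    for row_start, row_end, col_start, col_end, score in boxes:
        if score < threshold:
            continue  # discard low-confidence detections
        for start, end in ((row_start, row_end), (col_start, col_end)):
            # Clamp to the sequence bounds before marking residues.
            for i in range(max(start, 0), min(end, length - 1) + 1):
                labels[i] = "M"
    return "".join(labels)


if __name__ == "__main__":
    detections = [(1, 3, 7, 9, 0.9), (0, 11, 0, 11, 0.2)]
    print(boxes_to_labels(12, detections))
```

In this toy input, the second box spans the whole sequence but falls below the threshold, so only the two segments implied by the first box are marked.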
|Original language||English (US)|
|Title of host publication||Proceedings of the 10th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics - BCB '19|
|Number of pages||8|
|State||Published - Sep 9 2019|