Statistical Learning Machines for Protein Structure Prediction in the Era of High-Throughput Sequencing

Jinbo Xu
Seminar

Given the primary sequence of a protein, can we predict its 3D structure by computational methods? This is one of the most important and challenging problems in computational molecular biology, with tremendous implications for understanding life processes, diseases, and drug discovery. Depending on whether a solved structure similar to the protein sequence under consideration is available, computational methods for protein folding can be classified into two categories: template-based and template-free modeling. The former uses similar solved structures as templates to predict the structure of a protein, while the latter does not. This talk will demonstrate how statistical learning methods, especially probabilistic graphical models, can be applied to address some fundamental challenges facing template-based and template-free protein folding by taking advantage of high-throughput sequencing and protein structure initiatives.