MetaMHCpan, A Meta Approach for Pan-Specific MHC Peptide Binding Prediction

Recent computational approaches in bioinformatics achieve high performance, making them a powerful support for real biological experiments and drawing more attention from biologists than before. In immunology, predicti



1 Introduction

The Major Histocompatibility Complex (MHC), known as Human Leukocyte Antigen (HLA) in humans, is a large family of genes found in most vertebrates that plays important roles in the adaptive immune response. An important function of MHC molecules is to bind peptide fragments derived from pathogens and display them on the cell surface for recognition by the corresponding T cells [1]. Biochemical validation of peptide binding to MHC molecules is expensive and time consuming, whereas computational approaches are much more efficient and can narrow the search to a small number of top candidate peptides for further experimental verification. Recent advances in immunoinformatics have led to many computational methods for predicting peptides that bind MHC molecules. These methods fall into two groups: allele-specific and pan-specific. Allele-specific methods train a model on binding data from a single allele, and the model can predict binders for that allele only. If the number of known binders for an allele is limited, the model trained for that allele is likely to perform poorly. To overcome this problem, pan-specific methods use data from multiple alleles as input and attempt to predict binders not only for the input alleles but also for other alleles. This setting is particularly useful for predicting binders for alleles with very few or even no known binders [2, 3]. Several pan-specific methods have now been proposed, which raises the question of which methods are most reliable and should be used. To address this issue, we develop a Web server, MetaMHCpan, an ensemble predictor using existing pan-specific

Sunil Thomas (ed.), Vaccine Design: Methods and Protocols, Volume 2: Vaccines for Veterinary Diseases, Methods in Molecular Biology, vol. 1404, DOI 10.1007/978-1-4939-3389-1_49, © Springer Science+Business Media New York 2016


Yichang Xu et al.

methods as component predictors. MetaMHCpan consists of MetaMHCIpan and MetaMHCIIpan, which predict peptides binding to MHC-I and MHC-II, respectively. MetaMHCIpan uses two pan-specific methods, MHC2SKpan [4] and LApan [5], as components, while MetaMHCIIpan uses three pan-specific methods (TEPITOPEpan [6], MHC2SKpan, and LApan) and one allele-specific method (MHC2MIL [7]). MetaMHCpan can achieve higher predictive performance than its component predictors, making it a cutting-edge tool for predicting peptide binders across a variety of MHC alleles.
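The idea of an ensemble built from component predictors can be illustrated with a minimal Python sketch. The text above does not state how MetaMHCpan combines its component scores, so the unweighted mean used here is an assumption made for illustration only. The two toy predictors and all function names are hypothetical stand-ins; they are not the interfaces of MHC2SKpan, LApan, TEPITOPEpan, or MHC2MIL.

```python
from statistics import mean
from typing import Callable, Dict, List, Tuple

# A component predictor maps (allele, peptide) to a binding score in [0, 1].
# This signature is an assumption made for the sketch.
Predictor = Callable[[str, str], float]


def toy_predictor_a(allele: str, peptide: str) -> float:
    """Hypothetical component: fraction of hydrophobic residues."""
    return sum(residue in "AILMFVW" for residue in peptide) / len(peptide)


def toy_predictor_b(allele: str, peptide: str) -> float:
    """Hypothetical component: 1.0 if the last residue is L or V, else 0.0."""
    return 1.0 if peptide[-1] in "LV" else 0.0


def ensemble_score(components: Dict[str, Predictor],
                   allele: str, peptide: str) -> float:
    """Combine component scores with an unweighted mean (assumed rule)."""
    return mean(predict(allele, peptide) for predict in components.values())


def rank_peptides(components: Dict[str, Predictor], allele: str,
                  peptides: List[str], top_k: int = 3) -> List[Tuple[str, float]]:
    """Return the top_k peptides sorted by descending ensemble score."""
    scored = [(pep, ensemble_score(components, allele, pep)) for pep in peptides]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:top_k]


if __name__ == "__main__":
    components = {"toy_a": toy_predictor_a, "toy_b": toy_predictor_b}
    for pep, score in rank_peptides(components, "HLA-A*02:01",
                                    ["KKKK", "AAAL", "AKKL"], top_k=2):
        print(pep, score)
```

Returning only the top-ranked peptides mirrors the use case described in the Introduction, where a small number of candidates is passed on for experimental verification. A real ensemble would likely need to put component scores on a common scale before combining them.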

2 Materials

The training set for MHC-I is Peters’ dataset [8]. We use 35 HLA alleles and six H-2 alleles as our training alleles. Across these alleles there are 43,312 peptides in total, of which 12,362 are binders. The training set for MHC-II is the dataset used by NetMHCIIpan-3.0 [9]. This dataset contains 24 DR alleles, five DP alleles, six DQ alleles, and two H-2 alleles, with a total of 52,062 pept