Meta-iPVP: a sequence-based meta-predictor for improving the prediction of phage virion proteins using effective feature representation

Phasit Charoenkwan1 · Chanin Nantasenamat2 · Md. Mehedi Hasan3 · Watshara Shoombuatong2

Received: 4 May 2020 / Accepted: 10 June 2020
© Springer Nature Switzerland AG 2020

Abstract

Phage virion proteins (PVPs) perforate the host cell membrane, which eventually culminates in cell rupture and the release of replicated phages. The accurate identification of PVPs is thus a crucial step towards improving our understanding of their biological function and mechanisms. Therefore, it is desirable to develop a computational method capable of fast and accurate identification of PVPs. To address this, we propose a novel sequence-based meta-predictor employing probabilistic information (referred to herein as Meta-iPVP) for the accurate identification of PVPs. In particular, an efficient feature representation approach was used to generate discriminative probabilistic features from four machine learning (ML) algorithms making use of seven feature encodings. To the best of our knowledge, Meta-iPVP is the first meta-based approach developed for PVP prediction. Independent test results indicated that Meta-iPVP could discern important characteristics between PVPs and non-PVPs and achieved the best accuracy and MCC of 0.817 and 0.642, respectively, corresponding to 6–10% and 14–21% improvements over existing PVP predictors. This demonstrates that the proposed Meta-iPVP is a more efficient, robust and promising tool for the identification of PVPs. The predictive model is deployed as a publicly accessible Meta-iPVP web server freely available online at http://camt.pythonanywhere.com/Meta-iPVP.

Keywords: Phage virion protein · Machine learning · Classification · Feature selection · Support vector machine · Meta-predictor
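The accuracy and MCC (Matthews correlation coefficient) figures quoted in the abstract are standard binary-classification metrics. As a minimal sketch of how such values can be computed from predicted and true labels, the following uses the textbook definitions; the function names and the toy labels are illustrative assumptions and are not taken from the Meta-iPVP implementation or its data.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = PVP, 0 = non-PVP)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Fraction of all samples that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 when undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Toy labels for illustration only (not data from the paper).
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 1]
counts = confusion_counts(y_true, y_pred)
print(accuracy(*counts), mcc(*counts))
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, so it stays informative when the PVP and non-PVP classes are imbalanced.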

Electronic supplementary material: The online version of this article (https://doi.org/10.1007/s10822-020-00323-z) contains supplementary material, which is available to authorized users.

* Watshara Shoombuatong [email protected]

1 Modern Management and Information Technology, College of Arts, Media and Technology, Chiang Mai University, Chiang Mai 50200, Thailand

2 Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok 10700, Thailand

3 Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan

Introduction

Bacteriophages are viruses that can infect and thrive in bacteria. They can be found in several environments, including soil, freshwater and marine habitats. The infectious phage particle essentially comprises a nucleic acid component (i.e. either DNA or RNA) encapsulated by a protein coat known as the capsid [1]. Bacteriophages have an uncanny specificity towards a particular bacterial host species, where they can irreversibly attach themselves to the surface of a susceptible host and inje