MetaLAFFA: a flexible, end-to-end, distributed computing-compatible metagenomic functional annotation pipeline


Open Access

SOFTWARE

MetaLAFFA: a flexible, end-to-end, distributed computing-compatible metagenomic functional annotation pipeline

Alexander Eng1, Adrian J. Verster1,2 and Elhanan Borenstein3,4,5*

*Correspondence: [email protected]
3 Blavatnik School of Computer Science, Tel Aviv University, 6997801 Tel Aviv, Israel
Full list of author information is available at the end of the article

Abstract

Background: Microbial communities have become an important subject of research across multiple disciplines in recent years. These communities are often examined via shotgun metagenomic sequencing, a technology that can offer unique insights into the genomic content of a microbial community. Functional annotation of shotgun metagenomic data has become an increasingly popular method for identifying the aggregate functional capacities encoded by the community's constituent microbes. Currently available metagenomic functional annotation pipelines, however, suffer from several shortcomings, including limited pipeline customization options, lack of standard raw sequence data pre-processing, and insufficient capabilities for integration with distributed computing systems.

Results: Here we introduce MetaLAFFA, a functional annotation pipeline designed to take unfiltered shotgun metagenomic data as input and generate functional profiles. MetaLAFFA is implemented as a Snakemake pipeline, which enables convenient integration with distributed computing clusters, allowing users to take full advantage of available computing resources. Default pipeline settings allow new users to run MetaLAFFA according to common practices, while a Python module-based configuration system provides advanced users with a flexible interface for pipeline customization. MetaLAFFA also generates summary statistics for each step in the pipeline so that users can better understand pre-processing and annotation quality.

Conclusions: MetaLAFFA is a new end-to-end metagenomic functional annotation pipeline with distributed computing compatibility and flexible customization options. MetaLAFFA source code is available at https://github.com/borenstein-lab/MetaLAFFA and can be installed via Conda as described in the accompanying documentation.

Keywords: Metagenomics, Functional annotation, Pipeline, Distributed computing

Background

The analysis of the functional capacities of microbial communities has become an important component of microbiome-based studies, providing novel insights into associations between the gut microbiome and host conditions such as depression [22], autism [18], and type 2 diabetes [16]. Such functional profiles are generally obtained via shotgun

© The Author(s) 2020. Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images