Abstract
Background: Microalgae are a prominent feedstock for producing biofuels and biochemicals because of their prolific reproduction, high bioproduct accumulation, and ability to grow in brackish and saline water. However, naturally occurring wild-type algal strains are rarely optimal for industrial use. Bioengineering of algae is necessary to generate superior-performing strains that can address production challenges in industrial settings, particularly in the bioenergy and bioproduct sectors. A crucial step in this process is choosing a bioengineering target, namely which gene or protein to differentially express. These targets are often orthologs, defined as genes or proteins in divergent species that originate from a common ancestor. Although bioinformatics tools for identifying protein orthologs already exist, processing their output is non-trivial, especially for researchers with little or no bioinformatics experience.
Results: The present study introduces AlgaeOrtho, a user-friendly tool that builds upon the SonicParanoid orthology inference tool and the PhycoCosm database from JGI (Joint Genome Institute) to help researchers identify orthologs of their proteins of interest in multiple diverse algal species. The tool includes an application with a user interface through which users upload an ortholog protein group file (created using SonicParanoid) and a query file containing their protein sequence(s) of interest in FASTA format. The output comprises a table of the putative orthologs of the protein of interest, a heatmap showing sequence similarity (%), and a tree of the putative protein orthologs. Notably, because it does not depend on existing protein annotations, the tool can be instrumental in identifying novel bioengineering targets in different algal strains, including algal species that are not fully annotated.
Conclusions: We tested AlgaeOrtho in two case studies, in which orthologs of proteins relevant to bioengineering targets were identified across a range of algal species, demonstrating its ease of use and utility for bioengineering researchers. The tool is unique in the protein ortholog identification space in that it can visualize putative orthologs of the user's choosing across several algal species.
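To make the workflow summarized under Results more concrete, the following is a minimal Python sketch of three generic steps: reading a FASTA query, looking up an ortholog group, and scoring sequence similarity. It is an illustration under stated assumptions, not AlgaeOrtho's implementation. The tab-separated layout (a `group_id` column followed by one column per species, with `*` marking an absent species) is a simplified stand-in for an ortholog group file, and the function names, species names, and protein identifiers are all hypothetical. The sketch also matches the query by identifier for brevity, whereas a real tool would more plausibly match by sequence search.

```python
import csv
import io


def parse_fasta(text):
    """Return {sequence_id: sequence} parsed from FASTA-formatted text."""
    records, current = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token as the identifier.
            current = line[1:].split()[0]
            records[current] = ""
        elif current is not None:
            records[current] += line
    return records


def find_ortholog_group(groups_tsv, protein_id):
    """Return (group_id, {species: [protein_ids]}) for the group that
    contains protein_id, or None if no group contains it.

    Assumed layout (hypothetical, simplified): a 'group_id' column, then
    one column per species holding comma-separated protein identifiers,
    with '*' meaning the species has no member in that group.
    """
    reader = csv.DictReader(io.StringIO(groups_tsv), delimiter="\t")
    for row in reader:
        members = {
            species: [p for p in cell.split(",") if p and p != "*"]
            for species, cell in row.items()
            if species != "group_id"
        }
        if any(protein_id in ids for ids in members.values()):
            return row["group_id"], members
    return None


def percent_identity(a, b):
    """Percent identity of two pre-aligned, equal-length sequences,
    ignoring columns where either sequence has a gap ('-')."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


# Toy data with hypothetical species and protein identifiers.
GROUPS_TSV = (
    "group_id\tspeciesA\tspeciesB\tspeciesC\n"
    "1\tA_001,A_002\tB_010\t*\n"
    "2\tA_003\t*\tC_007\n"
)

query = parse_fasta(">C_007 example query\nMKT\nLV\n")
group = find_ortholog_group(GROUPS_TSV, "C_007")
identity = percent_identity("MKT-LV", "MKTALI")
```

Pairwise values like `identity`, computed over every pair of group members after alignment, form the kind of matrix a similarity heatmap displays. Producing the alignment itself and building the tree are left to dedicated tools and are outside this sketch.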