Abstract
Summary
Homology detection by sequence comparison is a typical first step in the study of protein function and evolution. Here, we describe a new homology detection tool, pLM-BLAST, that uses a modified Smith-Waterman algorithm for unsupervised comparison of single-sequence representations obtained from a protein language model (such as ProtT5) trained on millions of sequences. In our benchmarks, pLM-BLAST has shown the ability to detect homology between highly divergent proteins, demonstrating its applicability to tasks such as protein classification, domain annotation, and function prediction.
Availability and Implementation
pLM-BLAST is available as a web server in the MPI Bioinformatics Toolkit (https://toolkit.tuebingen.mpg.de/tools/plmblast), where it can be used to search precomputed databases. It is also available as a standalone tool to build custom databases and run batch searches (https://github.com/labstructbioinf/pLM-BLAST).
Publisher
Cold Spring Harbor Laboratory
Cited by
6 articles.