Abstract
We introduce UltraGen, an RNA language model that captures RNA binding properties. Using fine-grained self-learning, UltraGen identifies RNA aptamers for targets spanning a wide range of sizes, including small molecules, proteins, cells, and tissues. Additionally, UltraGen discerns the tissue specificity of millions of RNA species across 22 human organs based on their 3’-UTR sequences, predicts the tropism of human-pathogenic RNA viruses, and characterizes SARS-CoV-2 replicase RNA binding at single-base resolution.