March 30th, 2025

Recent Publications Harnessing the Power of Translatomics

Every week we provide a digest of a small number of recent interesting papers in the field of translatomics.

In this week’s Sunday papers, several groups of researchers employ ribosome profiling and micropeptide discovery techniques in their research.

  • Perdikopanis et al. introduce D-sORF, a deep learning framework that accurately classifies small open reading frames (sORFs) associated with translation using only sequence features—uncovering novel protein-coding elements without relying on conservation.
  • Peptigate, a workflow by Borges et al., predicts bioactive peptides directly from transcriptomes by integrating multiple peptide prediction tools and machine learning, streamlining functional peptide discovery.
  • A study on human β-cells by Walters et al. reveals nearly 1,000 novel open reading frames with translational activity, including a regulatory uORF in TYK2, linking previously unannotated regions to type 1 diabetes pathogenesis.

Together, these studies highlight how emerging tools are redefining the boundaries of the coding genome and expanding the known functional proteome.

D-sORF: Accurate Ab Initio Classification of Experimentally Detected Small Open Reading Frames (sORFs) Associated with Translational Machinery

Biology, May 2024

Perdikopanis N., Giannakakis A., Kavakiotis I. and Hatzigeorgiou A.G.

This paper introduces D-sORF, a deep learning-based framework for accurately classifying small open reading frames (sORFs) in eukaryotic genomes. sORFs, often overlooked due to their short length (<100 codons), are increasingly recognized for encoding functional microproteins with regulatory roles in translation and other cellular processes.

D-sORF was developed to address the limitations of existing tools that rely heavily on evolutionary conservation or expression data. It uses an ab initio approach, meaning it only requires raw nucleotide sequences—making it highly applicable to newly sequenced genomes. The framework is built on a multi-layer neural network trained on experimentally validated datasets, using features derived from codon usage, nucleotide composition, and signal patterns commonly found in protein-coding regions.
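To make the idea of sequence-only features concrete, here is a minimal Python sketch. It is not D-sORF's implementation: it scans the forward strand for ATG-initiated ORFs shorter than 100 codons and computes two simple features, GC content and codon frequencies. The function names, the ATG-only start rule, and the choice of features are illustrative assumptions, and the output is the kind of input a sequence-based classifier might consume, not a classification.

```python
from collections import Counter

STOP_CODONS = {"TAA", "TAG", "TGA"}
MAX_SORF_CODONS = 100  # the digest describes sORFs as shorter than 100 codons


def find_sorfs(seq, max_codons=MAX_SORF_CODONS):
    """Return (start, end, orf_seq) for ATG-initiated ORFs on the forward
    strand whose coding part (stop codon excluded) is shorter than max_codons.
    Hypothetical helper for illustration only."""
    seq = seq.upper()
    sorfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame start codon opens an ORF
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i - start) // 3  # codons before the stop
                if n_codons < max_codons:
                    sorfs.append((start, i + 3, seq[start:i + 3]))
                start = None  # look for the next ORF in this frame
    return sorfs


def sequence_features(orf_seq):
    """Compute simple composition features for one ORF (stop codon included).
    These features are assumptions chosen for illustration."""
    codons = [orf_seq[i:i + 3] for i in range(0, len(orf_seq) - 2, 3)]
    counts = Counter(codons)
    total = len(codons)
    gc = (orf_seq.count("G") + orf_seq.count("C")) / len(orf_seq)
    return {
        "length_codons": total,
        "gc_content": gc,
        "codon_freq": {c: n / total for c, n in counts.items()},
    }


if __name__ == "__main__":
    for start, end, orf in find_sorfs("CCATGAAATGGTAACC"):
        print(start, end, orf, sequence_features(orf))
```

A real pipeline would also scan the reverse strand, consider non-ATG start codons, and feed a much richer feature set into a trained model.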

Key findings highlight that D-sORF outperforms traditional classifiers in distinguishing coding from non-coding sORFs, especially those associated with translational machinery like ribosomal proteins and regulatory peptides. The model maintains strong accuracy even when applied to independent datasets, showing robustness and generalizability.

The impact of this work lies in its potential to enhance genome annotation pipelines and deepen our understanding of the functional roles of sORFs in biology and disease. By enabling large-scale sORF identification without external data dependencies, D-sORF opens new directions for research into microproteins and their translational control mechanisms.

Learn more about EIRNA Bio’s ribosome profiling and sORF explorer services here.

Predicting bioactive peptides from transcriptome assemblies with the peptigate workflow

Arcadia, August 2024

Borges A.L., Celebi F.M., Cheveralls K., Chou S., Reiter T. and Weiss E.C.P.

In this publication, the authors present Peptigate, a computational pipeline for identifying bioactive peptides (BAPs) from transcriptome data. BAPs, short peptides with roles in immunity, signaling, and antimicrobial activity, are valuable for therapeutic research, but discovering them experimentally is often time-consuming and costly.

Peptigate automates BAP prediction using the Snakemake workflow engine and integrates several tools. It detects small open reading frames (sORFs) with plm-utils, cleavage-derived peptides via DeepPeptide, and ribosomally synthesized and post-translationally modified peptides (RiPPs) using NLPPrecursor. These candidates are assessed for bioactivity with AutoPeptideML, annotated using the Peptipedia database, screened for signal peptides via DeepSig, and analyzed for chemical properties using peptides.py.

The authors demonstrate Peptigate’s effectiveness across diverse transcriptomes, identifying novel BAPs with antimicrobial and anticancer potential. Its flexibility allows for input from both transcript and protein sequences, making it adaptable for different research needs.

By streamlining peptide prediction and annotation, Peptigate reduces the reliance on experimental screening and accelerates peptide discovery. Its modular, open-source design makes it a powerful tool for exploring functional peptides in genomics, drug discovery, and beyond.

8681 Proteogenomic Identification Of Novel Translated And Regulatory Open Reading Frames In Human Beta-Cells

Journal of the Endocrine Society, Oct-Nov 2024

Walters K., Guiterrez R.C., Sakhar S., Baldwin A., Nakayasu E.S., Russ H. and Mukerjee N.

The researchers explore unannotated open reading frames (nuORFs) in human β-cells, aiming to identify potential novel antigens relevant to type 1 diabetes (T1D). While most studies focus on known proteins like insulin, emerging evidence suggests that nuORFs and defective ribosomal products (DRiPs) may also contribute to autoimmunity.

Using ribosome profiling (ribo-seq) on stem cell-derived β-cells engineered to express tagged ribosomal proteins, the authors mapped actively translated regions at high resolution. The computational tools RiboCode and ORFquant identified over 12,000 canonical ORFs, 85% of which showed proteomic evidence. Crucially, the study uncovered nearly 1,000 nuORFs, with more than half confirmed at the protein level in both stem cell-derived and cadaveric human islets.

Of particular interest, 149 nuORFs were associated with T1D-linked genes. One notable example is a novel upstream ORF in TYK2, a gene implicated in immune signaling and T1D risk. Functional experiments showed that this uORF acts as a translational activator, hinting at a regulatory role in β-cell biology and disease.

This work expands the known β-cell proteome and suggests that nuORFs may be important sources of autoimmune targets in T1D. It offers a valuable resource for future therapeutic research and underscores the broader biological significance of unannotated ORFs.
