Oral Presentation 29th Annual Lorne Proteomics Symposium 2024

Deep learning at your fingertips: Changing proteomics a spectrum at a time (#15)

Mathias Wilhelm 1
  1. Technical University of Munich, Freising, Germany

In recent years, various deep learning-assisted data analysis approaches have been shown to boost the performance of mass spectrometry-based proteomics. One such model, Prosit, a deep neural network trained on synthetic peptides generated in the ProteomeTools project, has been used to predict more than 11 billion spectra since its release in 2019. Here, we summarize the available Prosit models for peptide property prediction. In particular, we highlight recent work toward a single model that predicts fragment intensities for tryptic and non-tryptic, modified and unmodified, and labeled and unlabeled peptides across various mass analyzers and fragmentation methods. We show the impact of applying this model to the reanalysis of public data on datasets covering different organisms, sample complexities, and modifications. The application of deep learning in proteomics is still in its early stages and holds the potential to improve performance further. To facilitate this development while ensuring reproducibility and FAIRness, we developed a number of open-source packages for developing and applying deep learning models in proteomics. We use DLOmix to train Prosit, Koina to host and provide access to pre-trained models, and Oktoberfest to perform data-driven rescoring. We believe that such software ecosystems can be a crucial factor in avoiding an impending reproducibility crisis.