Poster Presentation 29th Annual Lorne Proteomics Symposium 2024

MHCpLogics: an interactive machine learning-based tool for unsupervised data visualisation and cluster analysis of immunopeptidomes (#114)

Mohammad Shahbazy 1, Sri H Ramarathinam 1, Chen Li 1, Patricia T Illing 1, Pouya Faridi 2, Nathan P Croft 1, Anthony W Purcell 1
  1. Department of Biochemistry and Molecular Biology and Infection and Immunity Program, Biomedicine Discovery Institute, Monash University, Melbourne, Victoria, Australia
  2. Department of Medicine, School of Clinical Sciences, Monash University, Melbourne, Victoria, Australia

The major histocompatibility complex (MHC) comprises a range of immune response genes, including those encoding the human leukocyte antigen (HLA) molecules in humans. These molecules bind peptide antigens and present them on the cell surface for T cell recognition. The repertoires of peptides presented by HLA molecules are termed immunopeptidomes, and the use of mass spectrometry to identify and quantify them is termed immunopeptidomics. The highly polymorphic nature of the HLA genes leads to allotype-specific differences in the sequences of bound ligands, which can be represented by peptide-binding motifs. Current mass spectrometry and immunopeptidomic protocols enable large-scale acquisition and sequencing of tens of thousands of HLA-bound peptides from cells expressing several HLA alleles, which makes deconvolution of such complex data into allotype-specific contributions challenging. To overcome this, we have developed MHCpLogics, an interactive machine learning-based tool for mining peptide-binding sequence motifs and visualising immunopeptidome data across complex datasets. MHCpLogics allows fully unsupervised analysis of HLA peptide ligands within and across datasets, providing the user with rapid cluster analysis for the deconvolution of motifs. We showcase its functionalities and examine its performance by analysing both in-house and published mono- and multi-allelic immunopeptidomics data, demonstrating clear segregation of sequences into allotype-specific motifs and sub-motifs, individual sub-peptidome sequence patterns, and the ability to highlight variations at anchor residues across closely related HLA allotypes. We anticipate that the tool will be a valuable resource for the immunology and vaccine research communities, enabling efficient inspection of immunopeptidomes and supporting T cell immunotherapy strategies.
MHCpLogics is distributed as a standalone application with an executable installer, available at https://github.com/PurcellLab/MHCpLogics
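
To make the idea of unsupervised motif deconvolution concrete, the following is a minimal sketch and not the MHCpLogics implementation; the abstract does not specify which encoding or clustering algorithm the tool uses. The sketch assumes one-hot encoding of equal-length peptides and plain k-means with a hand-picked number of clusters. The peptides are synthetic 9-mers invented for illustration, and all function names are hypothetical.

```python
from collections import Counter, defaultdict

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def one_hot(peptide):
    """Encode a peptide as a flat 0/1 vector (20 entries per position)."""
    vec = []
    for residue in peptide:
        vec.extend(1.0 if residue == aa else 0.0 for aa in AMINO_ACIDS)
    return vec


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=20):
    """Plain k-means with deterministic farthest-point initialisation."""
    centroids = [vectors[0]]
    while len(centroids) < k:
        # Next seed: the point farthest from its nearest existing centroid.
        centroids.append(
            max(vectors, key=lambda v: min(sq_dist(v, c) for c in centroids))
        )
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: nearest centroid for every vector.
        labels = [
            min(range(k), key=lambda j: sq_dist(v, centroids[j])) for v in vectors
        ]
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def consensus(peptides):
    """Most frequent residue per position, a crude stand-in for a motif."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*peptides))


# Synthetic 9-mers (assumption: invented for this sketch, not real ligands).
# Half carry L at position 2 and V at position 9; half carry E and Y there.
peptides = [
    "ALDKGTSEV", "KEYEDAGHY", "GLPQRHNAV",
    "TEFNSVQRY", "SLWMCIFTV", "REHIPWDKY",
]
labels = kmeans([one_hot(p) for p in peptides], k=2)

clusters = defaultdict(list)
for peptide, label in zip(peptides, labels):
    clusters[label].append(peptide)

for label, members in sorted(clusters.items()):
    print(label, members, consensus(members))
```

In this toy input the two clusters separate on the shared residues at positions 2 and 9, which play the role of anchor residues. Real immunopeptidomes contain peptides of mixed lengths and far noisier motifs, so the encoding, distance measure, and choice of cluster number would all need more care than shown here.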