Emilio Dorigatti
Dr.
* Former Member
Multi-epitope vaccines (EVs) represent a versatile and promising approach for combating a wide range of diseases, including viral and bacterial infections, parasitic diseases, and cancer. A critical aspect of EV design is ensuring efficient proteasomal cleavage of the synthetic polypeptide to recover therapeutic epitopes for presentation to T cells. This process is influenced by the selection and arrangement of epitopes, as well as the design of linkers joining them. Modern EV design frameworks leverage proteasomal cleavage predictors to optimize epitope recovery and vaccine efficacy. However, the predictive power of these tools remains a limiting factor, particularly for challenging cleavage sites such as N-terminals. In this work, we systematically review and benchmark recent advances in proteasomal cleavage prediction, focusing on deep learning-based methods. We evaluate a range of architectures, including MLPs, CNNs, LSTMs, and Transformers, alongside simpler baselines such as logistic regression. Our results demonstrate that while complex models achieve marginally higher predictive performance, simpler models remain competitive and offer significant computational efficiency. We also explore the impact of dataset size, window size, and training techniques, finding diminishing returns from increasingly larger datasets and more complex models. Notably, our benchmarking highlights substantial improvements in predicting N-terminal cleavage sites, which are often overlooked but critical for EV design. Our findings provide practical guidance for the development of next-generation proteasomal cleavage predictors and underscore the importance of considering the probabilistic nature of cleavage in EV design. By consolidating the state of the art and introducing efficient baselines, this work aims to advance the field of multi-epitope vaccine design and support the development of more effective and personalized immunotherapies.
inproceedings ZMB+25
BibTeXKey: ZMB+25