Predicting the binding affinity between major histocompatibility complex I (MHC I) proteins and their peptide ligands is important for vaccine design. We introduce an open-source package for MHC I binding prediction, MHCflurry. The software implements allele-specific neural networks that use a novel architecture and peptide encoding scheme. When trained on affinity measurements, MHCflurry outperformed the standard predictors NetMHC 4.0 and NetMHCpan 3.0 overall, and particularly on non-9-mer peptides, in a benchmark of ligands identified by mass spectrometry. The released predictor, MHCflurry 1.2.0, uses mass spectrometry datasets for model selection and showed accuracy competitive with standard tools, including the recently released NetMHCpan 4.0, on a small benchmark of affinity measurements. MHCflurry's prediction speed exceeded 7,000 predictions per second, 396 times faster than NetMHCpan 4.0. MHCflurry is freely available to use, retrain, or extend; includes Python library and command line interfaces; may be installed using package managers; and applies software development best practices.

Accurate prediction servers for MHC I ligands have been in wide use for some time, but these tools are typically closed source, may be trained only by their developers, and can be challenging to integrate into the high-throughput workflows required for tumor neoantigen discovery. We introduce a prediction package that exposes a programmatic interface, may be modified and retrained, and is much faster than existing tools.
- epitope prediction
- neural network