Review, evaluation, and discussion of the challenges of missing value imputation for mass spectrometry-based label-free global proteomics
Citation
Webb-Robertson, B.-J. M.; Wiberg, H. K.; Matzke, M. M.; Brown, J. N.; Wang, J.; McDermott, J. E.; Smith, R. D.; Rodland, K. D.; Metz, T. O.; Pounds, J. G.; Waters, K. M.; et al. Review, evaluation, and discussion of the challenges of missing value imputation for mass spectrometry-based label-free global proteomics. J. Proteome Res. 2015, 14 (5), 1993−2001.
Summary
Evaluation of the performance and caveats of 9 imputation algorithms applied to LC-MS data.
Study outcomes
Outcome O1
Most imputation methods perform well, but no single algorithm or imputation strategy (single-value, local, global) consistently outperforms the others. In some cases, omitting imputation altogether gives better results in the subsequent classification analysis.
Outcome O2
Local similarity-based approaches, such as least-squares adaptive (LSA) or regularized expectation maximization (REM), are in general the most accurate and robust methods (Figure 4).
Outcome O3
The 'best' imputation method depends strongly on the data and on the goal of the downstream analysis, so generally advantageous methods are hard to define (Figure 3).
Further outcomes
With left-censored data, the number of missing values depends strongly on peptide intensity: low-intensity peptides are missing more often (Figure 1).
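The dependence of missingness on intensity under left-censoring can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the analysis from the paper: the detection limit, the normal noise model on a log2 scale, and the peptide means are all hypothetical values chosen for illustration.

```python
import random


def simulate_missing_fraction(mean_log_intensity, detection_limit=20.0,
                              sd=1.0, n_samples=1000, seed=0):
    """Return the fraction of simulated measurements below the detection limit.

    Assumption: log2 intensities are normally distributed around the peptide
    mean, and every value below the (hypothetical) detection limit is missing.
    """
    rng = random.Random(seed)
    values = [rng.gauss(mean_log_intensity, sd) for _ in range(n_samples)]
    censored = sum(1 for v in values if v < detection_limit)
    return censored / n_samples


# Hypothetical peptides with increasing mean log2 intensity.
peptide_means = [19.0, 20.0, 21.0, 23.0]
missing_fractions = [simulate_missing_fraction(m) for m in peptide_means]

for mean, fraction in zip(peptide_means, missing_fractions):
    print(f"mean log2 intensity {mean:.1f}: {fraction:.1%} missing")
```

In this toy setting the fraction of missing values falls as the mean intensity rises, which is the qualitative pattern expected for left-censored data.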
Study design and evidence level
General aspects
3 single-value approaches (LOD1, LOD2, RTI), 5 local similarity approaches (KNN, LLS, LSA, REM, MBI), and 2 global-structure approaches (PPCA, BPCA) were evaluated, which allows different imputation strategies to be compared and discussed. The methods were applied to 3 real datasets of different types and from different species, representing a broad range of biological applications.
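To make the difference between a single-value strategy and a local similarity strategy concrete, the following minimal sketch contrasts two simplified stand-ins. They are assumptions for illustration, not the implementations evaluated in the paper: `half_min_impute` fills each gap with half of the peptide's observed minimum (a generic limit-of-detection style substitution; the exact definitions of LOD1 and LOD2 are not reproduced here), and `knn_impute` is a bare-bones nearest-neighbour fill. The abundance matrix is made up, with rows as peptides and columns as samples.

```python
import math


def half_min_impute(matrix):
    """Single-value sketch: replace None with half of the row's observed minimum."""
    imputed = []
    for row in matrix:
        observed = [v for v in row if v is not None]
        if not observed:  # nothing to derive a fill value from
            imputed.append(list(row))
            continue
        fill = min(observed) / 2.0
        imputed.append([fill if v is None else v for v in row])
    return imputed


def knn_impute(matrix, k=2):
    """Local similarity sketch: replace None with the mean of the same column
    in the k most similar rows (distance over jointly observed columns)."""
    def distance(a, b):
        shared = [(x, y) for x, y in zip(a, b)
                  if x is not None and y is not None]
        if not shared:
            return math.inf
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    imputed = [list(row) for row in matrix]
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate neighbours: other rows with an observed value in column j.
            candidates = sorted(
                (distance(row, other), other[j])
                for idx, other in enumerate(matrix)
                if idx != i and other[j] is not None
            )
            neighbours = [v for _, v in candidates[:k]]
            if neighbours:  # leave the gap if no neighbour is available
                imputed[i][j] = sum(neighbours) / len(neighbours)
    return imputed


# Hypothetical abundances: three low-abundance and three high-abundance peptides.
abundances = [
    [10.0, 12.0, 11.0, None],
    [10.5, 11.5, 11.0, 12.0],
    [9.5, 12.5, 10.5, 11.0],
    [50.0, 55.0, None, 60.0],
    [51.0, 54.0, 52.0, 61.0],
    [49.0, 56.0, 54.0, 59.0],
]

single_value = half_min_impute(abundances)
local = knn_impute(abundances, k=2)
print("half-minimum:", single_value[0][3], single_value[3][2])
print("k nearest neighbours:", local[0][3], local[3][2])
```

On this toy matrix the half-minimum fill places both gaps well below the observed values of their peptides, while the neighbour-based fill stays within the range of similar peptides. Which behaviour is preferable depends on why the values are missing, in line with the outcome that no strategy is best in general.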