Accounting for the Multiple Natures of Missing Values in Label-Free Quantitative Proteomics Data Sets to Compare Imputation Strategies
Latest revision as of 11:50, 25 February 2020
Citation
Lazar, C., Gatto, L., Ferro, M., Bruley, C., and Burger, T. (2016): Accounting for the Multiple Natures of Missing Values in Label-Free Quantitative Proteomics Data Sets to Compare Imputation Strategies. Journal of Proteome Research, 15:1116–1125. Permanent link to the article: https://doi.org/10.1021/acs.jproteome.5b00981
Summary
In this paper, five imputation algorithms are evaluated as a function of the proportion of missing values and of how random the missingness is, in order to derive practical guidelines for choosing an imputation method that accounts for the specific missingness mechanism.
Study outcomes
The following outcomes can be drawn from the paper:
Outcome O1
Imputation performs better with fewer missing values.
Outcome O2
Some imputation methods are devoted to values missing not at random (MNAR) and others to values missing completely at random (MCAR) (see Figures 2 and 3). Depending on the MNAR ratio of a specific data set, either an MNAR-devoted or an MCAR-devoted method should be preferred (see Figure 4).
Outcome O3
MNAR-devoted methods perform worse as the proportion of missing values increases and as the missing values become more random (see Figures 2 and 3).
MCAR-devoted methods perform worse as the proportion of missing values increases and as the missing values become less random (see Figures 2 and 3).
Outcome O4
On average, MCAR-devoted methods outperform MNAR-devoted methods, so MCAR-devoted methods are recommended when the randomness of the missing values is not known.
Outcome O5
Peptide-level imputation is more accurate than protein-level imputation (Figure 6).
Study design and evidence level
The use of both simulated and real data, together with the application at both the protein level and the peptide level, makes the results sound and reliable.
Missing values were introduced at 11 overall rates of missing values (MV) combined with 11 rates of MNAR values, which results in 11 × 11 = 121 simulated data sets that give a broad representation of different missingness mechanisms.
Imputation was performed with three MCAR-devoted methods (kNN, SVDimpute, MLE) and two MNAR-devoted methods (MinDet, MinProb). This is a small selection, but it is sufficient to show the performance difference between MCAR-devoted and MNAR-devoted methods.
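The 11 × 11 simulation grid can be illustrated with a short Python sketch. This is a simplified illustration under stated assumptions, not the procedure used in the paper: the function `inject_missing`, the rate grids `mv_rates` and `mnar_ratios`, the synthetic intensity matrix, and the choice to model MNAR as removal of the lowest intensities are all hypothetical.

```python
import numpy as np


def inject_missing(data, mv_rate, mnar_ratio, rng):
    """Return a copy of `data` in which a fraction `mv_rate` of entries is NaN.

    A share `mnar_ratio` of the missing entries is taken from the lowest
    intensities (a crude MNAR / left-censoring model, assumed here); the
    remainder is removed uniformly at random (MCAR).
    """
    arr = np.asarray(data, dtype=float)
    flat = arr.flatten()  # flatten() always returns a copy
    n_total = int(round(mv_rate * flat.size))
    n_mnar = int(round(mnar_ratio * n_total))
    n_mcar = n_total - n_mnar

    if n_mnar > 0:
        # MNAR part: censor the n_mnar smallest values.
        flat[np.argsort(flat)[:n_mnar]] = np.nan
    if n_mcar > 0:
        # MCAR part: drop entries at random among those still observed.
        observed = np.flatnonzero(~np.isnan(flat))
        flat[rng.choice(observed, size=n_mcar, replace=False)] = np.nan
    return flat.reshape(arr.shape)


# Hypothetical grids: 11 overall missing-value rates and 11 MNAR ratios.
mv_rates = np.linspace(0.0, 0.5, 11)
mnar_ratios = np.linspace(0.0, 1.0, 11)

rng = np.random.default_rng(0)
# Synthetic log-intensities: 200 proteins (rows) x 6 samples (columns).
complete = rng.normal(loc=25.0, scale=2.0, size=(200, 6))

datasets = {
    (mv, mnar): inject_missing(complete, mv, mnar, rng)
    for mv in mv_rates
    for mnar in mnar_ratios
}
print(len(datasets))  # one data set per (MV rate, MNAR ratio) pair
```

Keeping the complete matrix alongside each masked copy is what allows imputation error to be measured afterwards, since the true values of the removed entries remain known.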
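The contrast between the two method families can be illustrated with a minimal Python sketch. The functions `impute_min_det` and `impute_knn` below are simplified stand-ins written for this page, not the implementations evaluated in the paper; the quantile `q`, the neighbour count `k`, the column-mean fallback, and the synthetic data are assumptions.

```python
import warnings

import numpy as np


def impute_min_det(x, q=0.01):
    """MinDet-style sketch: fill NaNs in each column with a low quantile
    of that column's observed values (deterministic, MNAR-devoted)."""
    out = x.copy()
    for j in range(out.shape[1]):
        miss = np.isnan(out[:, j])
        if miss.any():
            out[miss, j] = np.nanquantile(x[:, j], q)
    return out


def impute_knn(x, k=5):
    """kNN-style sketch: fill each NaN with the mean of that column over
    the k most similar rows that observe it (MCAR-devoted)."""
    out = x.copy()
    for i in np.flatnonzero(np.isnan(x).any(axis=1)):
        with warnings.catch_warnings():
            # Rows sharing no observed column with row i yield NaN here.
            warnings.simplefilter("ignore", category=RuntimeWarning)
            dist = np.nanmean((x - x[i]) ** 2, axis=1)
        dist[i] = np.inf
        dist = np.where(np.isnan(dist), np.inf, dist)
        order = np.argsort(dist)
        for j in np.flatnonzero(np.isnan(x[i])):
            donors = [r for r in order
                      if np.isfinite(dist[r]) and not np.isnan(x[r, j])][:k]
            # Assumed fallback: column mean when no usable neighbour exists.
            out[i, j] = x[donors, j].mean() if donors else np.nanmean(x[:, j])
    return out


def rmse(imputed, truth, mask):
    """Root-mean-square error over the entries that were removed."""
    return float(np.sqrt(np.mean((imputed[mask] - truth[mask]) ** 2)))


rng = np.random.default_rng(1)
# Synthetic data: each row has its own abundance level plus small noise.
truth = rng.normal(25.0, 2.0, size=(200, 1)) + rng.normal(0.0, 0.3, size=(200, 6))

masks = {
    "MCAR": rng.random(truth.shape) < 0.10,      # removed at random
    "MNAR": truth < np.quantile(truth, 0.10),   # lowest 10% censored
}
errors = {}
for name, mask in masks.items():
    x = np.where(mask, np.nan, truth)
    errors[name] = {
        "MinDet": rmse(impute_min_det(x), truth, mask),
        "kNN": rmse(impute_knn(x), truth, mask),
    }
    print(name, errors[name])
```

On this synthetic example the neighbour-based method gives the lower error when values are removed at random, and the minimum-based method gives the lower error when the lowest values are censored, which mirrors the qualitative pattern summarised in Outcomes O2 and O3.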
Further comments and aspects
References
Webb-Robertson, B.-J. M.; Wiberg, H. K.; Matzke, M. M.; Brown, J. N.; Wang, J.; McDermott, J. E.; Smith, R. D.; Rodland, K. D.; Metz, T. O.; Pounds, J. G.; Waters, K. M.; et al. Review, evaluation, and discussion of the challenges of missing value imputation for mass spectrometry-based label-free global proteomics. J. Proteome Res. 2015, 14 (5), 1993–2001.