Literature Studies

Page summary
This page collects outcomes of benchmarking studies from the literature. The primary aim is a comprehensive overview of neutral benchmark studies, i.e. assessments that were performed independently of the publication of a new approach. Studies that are not neutral are put in brackets. The focus is on computational methods for analyzing experimental data (rather than on comparing experimental techniques or platforms). Please extend this list by creating a new page and adding a link below. Use the guidelines described here.

1 Results from Literature

1.1 Classification

2003

Comparison of statistical methods for classification of ovarian cancer using mass spectrometry data

2005

A review and comparison of classification algorithms for medical decision making

2016

Predicting Breast Cancer Survivability Using Data Mining Techniques

1.2 Selection of Differential Features and Regions

1.2.1 Identifying differential features

2006

Rat toxicogenomic study reveals analytical consistency across microarray platforms

2010

A comprehensive assessment of RNA-seq accuracy, reproducibility and information content by the Sequencing Quality Control Consortium

2017

2018

Identification of Differentially Methylated Sites with Weak Methylation Effects

1.2.2 Identifying differential regions (e.g. DMRs)

2015

2016

2017

DMRfinder: efficiently identifying differentially methylated regions from MethylC-seq data

2018

1.2.3 Identifying sets of features (e.g. gene set analyses)

2009

2018

Gene set analysis methods: a systematic comparison

1.2.4 Dimension reduction

2008

On the Relationship Between Feature Selection and Classification Accuracy

2015

Comparing feature selection methods for high-dimensional imbalanced data: identifying rheumatoid arthritis cohorts from routine data

1.3 Imputation methods for missing values

Year	First Author	Title
1996	Schenker	Partially parametric techniques for multiple imputation
1999		Imputing Missing Data for Gene Expression Arrays
2001	Troyanskaya	Missing value estimation methods for DNA microarrays
2002		Imputation of missing longitudinal data: a comparison of methods
2003	Oba	A Bayesian missing value estimation method for gene expression profile data
2005	Scholz	Nonlinear PCA: a missing data approach
2007	Stacklies	pcaMethods—a bioconductor package providing PCA methods for incomplete data
2007	Verboven	Sequential imputation for missing values
2008		Which missing value imputation method to use in expression profiles: a comparative study and two selection schemes
2011	Templ	Iterative stepwise regression imputation using standard and robust methods
2012	Hrydziuszko O	Missing values in mass spectrometry based metabolomics: an undervalued step in the data processing pipeline
2012	Stekhoven	MissForest—non-parametric missing value imputation for mixed-type data
2013	Taylor	Accounting for undetected compounds in statistical analyses of mass spectrometry ‘omic studies
2013	Waljee	Comparison of imputation methods for missing laboratory data in medicine
2014	Shah	Comparison of Random Forest and Parametric Imputation Models for Imputing Missing Data Using MICE: A CALIBER Study
2014	Rodwell	Comparison of methods for imputing limited-range variables: a simulation study
2014	Morris	Tuning multiple imputation by predictive mean matching and local residual draws
2014		Recursive partitioning for missing data imputation in the presence of interaction effects
2015		Review, evaluation, and discussion of the challenges of missing value imputation for mass spectrometry-based label-free global proteomics
2016		Accounting for the Multiple Natures of Missing Values in Label-Free Quantitative Proteomics Data Sets to Compare Imputation Strategies
2016		Multiple imputation and analysis for high-dimensional incomplete proteomics data
2018	Wei	Missing Value Imputation Approach for Mass Spectrometry-based Metabolomics Data
2018	Poyatos	Gap-filling a spatially explicit plant trait database: comparing imputation methods and different levels of environmental information
2018	O'Brien	The effects of nonignorable missing data on label-free mass spectrometry proteomics experiments

1.4 ODE-based Modelling

2001

Ways to Fit a PK Model with Some Data Below the Quantification Limit

2008

Hybrid optimization method with general switching strategy for parameter estimation

2011

Parameter estimation with bio-inspired meta-heuristic optimization: modeling the dynamics of endocytosis

2013

2017

2018

2019

2020

1.5 Omics Workflows

Year	First Author	Title
2013	Weisser H	An Automated Pipeline for High-Throughput Label-Free Quantitative Proteomics
2015		Comparing Variant Call Files for Performance Benchmarking of Next-Generation Sequencing Variant Calling Pipelines
2016	Tyanova S	The MaxQuant computational platform for mass spectrometry–based shotgun proteomics
2017		A benchmarking of workflows for detecting differential splicing and differential expression at isoform level in human RNA-seq studies
2018	Välikangas T	A comprehensive evaluation of popular proteomics software workflows for label-free proteome quantification and imputation
2019		A Systematic Evaluation of Single Cell RNA-Seq Analysis Pipelines
2019		Benchmarking workflows to assess performance and suitability of germline variant calling pipelines in clinical diagnostic assays

1.6 Preprocessing high-throughput data

Year	First Author	Title
2003		A comparison of normalization methods for high density oligonucleotide array data based on variance and bias
2005		Comparison of Affymetrix GeneChip Expression Measures
2005	Meleth S	The case for well-conducted experiments to validate statistical protocols for 2D gels: different pre-processing = different lists of significant proteins
2005		Comparison of background correction and normalization procedures for high-density oligonucleotide microarrays
2006		Using RNA sample titrations to assess microarray platform performance and normalization techniques
2006	Wang P	Normalization regarding non-random missing values in high-throughput mass spectrometry data
2007	Carvalho B	Exploration, normalization, and genotype calls of high-density oligonucleotide SNP array data
2007	Cannataro M	MS‐Analyzer: preprocessing and data mining services for proteomics applications on the Grid
2008		Comparison of preprocessing methods for the hgU133+2 chip from Affymetrix
2009		Comparison of Affymetrix data normalization methods using 6,926 experiments across five array generations
2010		Consistency of predictive signature genes and classifiers generated using different microarray platforms
2010		Detecting and correcting systematic variation in large-scale RNA sequencing data
2010		Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments
2010		Normalization of RNA-seq data using factor analysis of control genes or samples
2011		Affymetrix GeneChip microarray preprocessing for multivariate analyses
2012		A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis
2014		Normalyzer: A Tool for Rapid Evaluation of Normalization Methods for Omics Data Sets
2014	Zhou X	Prevention, diagnosis and treatment of high-throughput sequencing data pathologies
2015	Caraus I	Detecting and overcoming systematic bias in high-throughput screening technologies: a comprehensive review of practical issues and methodological solutions
2015	Tam S	Optimization of miRNA-seq data preprocessing
2016	Yi L	Chemometric methods in data processing of mass spectrometry-based metabolomics: A review
2018	Mazoure B	Identification and Correction of Additive and Multiplicative Spatial Biases in Experimental High-Throughput Screening
