Gene set analysis methods: a systematic comparison
1 Gene set analysis methods: a systematic comparison
Mathur, R., Rotroff, D., Ma, J., Shojaie, A., & Motsinger-Reif, A. (2018). Gene set analysis methods: a systematic comparison. BioData Mining, 11(1), 8.
1.1 Summary
Approaches for gene set analysis were assessed using simulated data that were generated from real experimental data sets.
1.2 Study outcomes
1.2.1 Outcome O1: False positives under null distribution
The frequency of false positives was assessed at a significance level of alpha = 0.05. All approaches except FET-1k ("FET global statistic in SAFE") showed a false-positive rate of around 5% or less, which is in line with the nominal level. FET-1k had a false-positive rate of around 20%.
Outcome O1 is presented as Figure 2 in the original publication for the prostate data template and in "Additional File 1" for the other templates.
The bottom line of this outcome is that all approaches except FET-1k perform similarly well in terms of false positives.
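The false-positive rate described above can be illustrated with a minimal sketch: given p-values from gene sets simulated without any regulation, it is the fraction that falls at or below alpha. The function name `false_positive_rate` and the uniformly distributed p-values are assumptions made for illustration only; they are not taken from the publication or from the FANGS package.

```python
import random


def false_positive_rate(p_values, alpha=0.05):
    """Fraction of null gene sets that are called significant at level alpha."""
    if not p_values:
        raise ValueError("p_values must not be empty")
    return sum(p <= alpha for p in p_values) / len(p_values)


# Hypothetical null p-values: a well-calibrated method yields p-values
# that are roughly uniform on [0, 1] when no gene set is regulated.
rng = random.Random(0)
null_p_values = [rng.random() for _ in range(10_000)]

rate = false_positive_rate(null_p_values, alpha=0.05)
print(f"Estimated false-positive rate: {rate:.3f}")
```

For a calibrated method the estimate should be close to alpha; a rate well above alpha, as reported for FET-1k, indicates an anti-conservative test.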
1.2.2 Outcome O2
...
Outcome O2 is presented as Figure X in the original publication.
1.2.3 Outcome On
...
Outcome On is presented as Figure X in the original publication.
1.2.4 Further outcomes
If intended, you can add further outcomes here.
1.3 Study design and evidence level
1.3.1 General aspects
- The authors compared four different methods:
  - Gene Set Enrichment Analysis (GSEA)
  - Significance Analysis of Function and Expression (SAFE)
  - sigPathway
  - Correlation Adjusted MEan RAnk (CAMERA)
- The authors consider different sizes of the gene sets
- The authors consider different proportions of regulated genes in the gene sets
- The authors consider different magnitudes of the underlying effect size (i.e. log-fold-changes)
- The authors consider three null simulations (without regulation) as reference:
  - permutation of class labels
  - independently sampled expression of all features (= genes)
  - centering the simulated data, i.e. setting the effect size to zero
- In this publication, the authors introduced a novel simulation approach termed FANGS
- The simulation approach is available as an R package (FANGS), which offers the opportunity to reproduce the simulations and to repeat the analysis for other gene set methods
- The authors provide a comprehensive list of the used configuration parameters
- The authors evaluated the following alternative configurations:
  - For GSEA, one alternative setup
  - For SAFE, five alternative setups
  - For sigPathway and CAMERA, no other configurations were considered
- Three experimental data sets were used as foundations for simulating data:
  - prostate cancer (264 cases, 160 controls)
  - ischemic stroke (20 cases, 20 controls)
  - normal brain tissue (21 cases, 20 controls)
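The three null simulations listed above can be sketched in a few lines. This is one possible reading of each null, not the FANGS implementation: all function names, the toy expression matrix, and the choice to interpret "independently sampled" as reshuffling each gene separately and "centering" as subtracting the class mean per gene are assumptions made for illustration.

```python
import random
from statistics import mean


def permute_labels(labels, rng):
    """Null 1: shuffle class labels so they no longer track expression."""
    shuffled = list(labels)
    rng.shuffle(shuffled)
    return shuffled


def resample_genes_independently(expr, rng):
    """Null 2 (assumed reading): reshuffle each gene's values on its own,
    which removes class effects and gene-gene correlation."""
    return [rng.sample(row, len(row)) for row in expr]


def center_within_classes(expr, labels):
    """Null 3 (assumed reading): subtract each gene's class mean, so the
    difference between class means (the effect size) becomes zero."""
    centered = []
    for row in expr:
        class_means = {
            c: mean(v for v, lab in zip(row, labels) if lab == c)
            for c in set(labels)
        }
        centered.append([v - class_means[lab] for v, lab in zip(row, labels)])
    return centered


def mean_difference(row, labels):
    """Difference in mean expression between cases (1) and controls (0)."""
    cases = [v for v, lab in zip(row, labels) if lab == 1]
    controls = [v for v, lab in zip(row, labels) if lab == 0]
    return mean(cases) - mean(controls)


# Hypothetical toy data: rows are genes, columns are samples.
rng = random.Random(42)
labels = [1, 1, 1, 0, 0, 0]
expr = [
    [5.0, 5.5, 6.0, 3.0, 3.5, 4.0],  # gene with a clear class difference
    [2.0, 2.1, 1.9, 2.0, 2.2, 1.8],  # gene without a class difference
]

null_labels = permute_labels(labels, rng)
null_independent = resample_genes_independently(expr, rng)
null_centered = center_within_classes(expr, labels)

print("Effect before centering:", mean_difference(expr[0], labels))
print("Effect after centering: ", mean_difference(null_centered[0], labels))
```

Label permutation keeps the correlation between genes intact, whereas independent resampling destroys it; this difference is one reason why several null scenarios are useful as references.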
1.3.2 Design for Outcome O1
- The outcome was generated for ...
- Configuration parameters were chosen ...
- ...
1.3.3 Design for Outcome O2
- The outcome was generated for ...
- Configuration parameters were chosen ...
- ...
...
1.3.4 Design for Outcome On
- The outcome was generated for ...
- Configuration parameters were chosen ...
- ...
1.4 Further comments and aspects
1.5 References
The list of cited or related literature is placed here.