MethCP: Differentially Methylated Region Detection with Change Point Models (bioRxiv)
Contents
1 MethCP: Differentially Methylated Region Detection with Change Point Models
Boying Gong, Elizabeth Purdom, MethCP: Differentially Methylated Region Detection with Change Point Models, 2018, bioRxiv.
https://doi.org/10.1101/265116
1.1 Summary
A new approach (MethCP) for the identification of differentially methylated regions (DMRs) of the DNA based on whole-genome sequencing data is proposed. The approach is developed for designs more complex than two-group comparisons, e.g. time course experiments. For the two-group setup, it is claimed that MethCP outperforms existing approaches.
1.2 Study outcomes
1.2.1 Outcome O1
For simulated data, the following outcomes were obtained (ROC curves, i.e. TPR vs. FPR for different local precision or local recall):
- Overall, metilene, MethCP-DSS and MethCP-MethylKit are superior to BSmooth, HMM-Fisher, DSS and methylKit
- DSS has
- MethCP controls the desired FPR better than metilene at a significance level of 0.05
Outcome O1 is presented as Figure 2 in the original publication.
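As a reading aid for the ROC terminology, the following minimal sketch computes TPR and FPR at the level of individual CpGs for a few significance thresholds. It is a simplification and not the evaluation used in the publication: the region-level matching via local precision or local recall is not reproduced, and the names `roc_point`, `roc_curve`, `truth` and `pvalues` as well as the toy numbers are hypothetical.

```python
def roc_point(truth, called):
    """Return (TPR, FPR) for boolean per-CpG truth labels and calls."""
    tp = sum(1 for t, c in zip(truth, called) if t and c)
    fn = sum(1 for t, c in zip(truth, called) if t and not c)
    fp = sum(1 for t, c in zip(truth, called) if not t and c)
    tn = sum(1 for t, c in zip(truth, called) if not t and not c)
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr


def roc_curve(truth, pvalues, thresholds):
    """Call a CpG differential if its p-value is <= threshold, per threshold."""
    return [roc_point(truth, [p <= t for p in pvalues]) for t in thresholds]


# Toy data (hypothetical): True marks a CpG inside a simulated DMR.
truth = [True, True, False, False, True, False]
pvalues = [0.001, 0.03, 0.04, 0.5, 0.2, 0.8]
curve = roc_curve(truth, pvalues, thresholds=[0.01, 0.05, 0.25])

for tpr, fpr in curve:
    print(f"TPR={tpr:.2f}  FPR={fpr:.2f}")
```

A method that controls the FPR at a significance level of 0.05 would be expected to yield an FPR of at most about 0.05 at that threshold on sufficiently large data; the toy example is far too small to show this.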
1.2.2 Outcome O2
For simulations with small effect sizes (2.5%, 5%, 10%, 20%), the following result is obtained:
- For effect sizes below 10%, it is claimed that only MethCP can accurately predict DMRs (although only results for MethCP and metilene are plotted).
Outcome O2 is presented as Figure 3 in the original publication.
1.2.3 Outcome O3
When the six control samples are randomly divided into two groups of three replicates and the counts are additionally permuted across samples for each CpG (termed "1. permutation" below), the following performance was observed:
- HMM-Fisher performs best. It yields almost no false-positive predictions.
- MethCP-DSS and MethCP-MethylKit have good performance (around 20-40 false positive DMRs and less than 0.0005 for the proportion of CpGs)
- BSmooth, DSS, methylKit and metilene perform worst (more than 150 false DMRs and a proportion of around 0.008-0.0022 of CpGs wrongly predicted)
Outcome O3 is presented as Figure 4 panels (c) and (e) in the original publication.
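Because both groups consist of control samples, every DMR called in this setting is a false positive by construction. The two reported quantities (number of false DMRs and proportion of wrongly predicted CpGs) could be computed along the following lines. This is a minimal sketch under stated assumptions: DMRs are represented as inclusive `(start, end)` coordinate pairs on a single chromosome, the function name `false_positive_summary` and the toy data are hypothetical, and the exact counting rule used in the publication is not given here.

```python
from bisect import bisect_left, bisect_right


def false_positive_summary(cpg_positions, dmrs):
    """Return (number of called DMRs, proportion of CpGs inside any DMR).

    cpg_positions: sorted list of CpG coordinates on one chromosome.
    dmrs: list of (start, end) pairs, both ends inclusive (assumed format).
    """
    covered = set()
    for start, end in dmrs:
        lo = bisect_left(cpg_positions, start)
        hi = bisect_right(cpg_positions, end)
        covered.update(range(lo, hi))  # a set, so overlapping DMRs count once
    return len(dmrs), len(covered) / len(cpg_positions)


# Toy data (hypothetical): ten CpGs and two overlapping false DMR calls.
cpgs = [10, 25, 40, 55, 70, 85, 100, 115, 130, 145]
dmrs = [(20, 60), (50, 90)]
n_dmrs, prop = false_positive_summary(cpgs, dmrs)
print(n_dmrs, prop)
```

Collecting CpG indices in a set prevents CpGs in overlapping DMRs from being counted twice.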
1.2.4 Outcome O4
When the six control samples are randomly divided into two groups of three replicates and the CpG positions are additionally permuted within each sample (termed "2. permutation" below), the following performance was observed:
- MethCP- performs best
- BSmooth, HMM-Fisher, methylKit, MethCP-DSS and MethCP-MethylKit have very few (almost no) wrong predictions.
- DSS and metilene perform worst (around 60-90 wrong DMRs and a proportion of around 0.0005 wrongly predicted CpGs)
Outcome O4 is presented as Figure 4 panels (d) and (f) in the original publication.
1.2.5 Further outcomes
If intended, you can add further outcomes here.
1.3 Study design and evidence level
1.3.1 General aspects
You can describe general design aspects here. The study designs for describing specific outcomes are listed in the following subsections:
1.3.2 Design for Outcome O1
- The outcome was generated for ...
- Configuration parameters were chosen ...
- ...
1.3.3 Design for Outcomes O3 and O4
- Publicly available data for Arabidopsis thaliana [Coleman-Derr and Zilberman, 2012] with GEO accession number GSE39045 were analyzed
- Wildtype data were compared to the H2A.Z mutant
- The data had six replicates in both groups
- For assessing false positives, the six control replicates were randomly assigned to two groups of three replicates, and one of the two following permutation approaches was additionally applied:
  - 1. permutation: The two counts (methylated and unmethylated) were permuted across samples for each CpG. This breaks local correlations within a sample but preserves correlations that occur across all or several samples. It also removes global differences between the samples in the average methylation level.
  - 2. permutation: The CpG positions within a sample were permuted, which breaks local correlations along the genome. This does not remove potential global differences between the methylation levels of the individual samples.
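The random group assignment and the two permutation approaches described above can be sketched as follows. This is an illustration under assumptions, not the authors' code: the data layout `counts[i][j] = (methylated, unmethylated)` for CpG `i` in sample `j`, all function names, the toy data and the seed are hypothetical.

```python
import random


def split_controls(n_controls, rng):
    """Randomly assign control samples to two groups of equal size."""
    idx = list(range(n_controls))
    rng.shuffle(idx)
    half = n_controls // 2
    return sorted(idx[:half]), sorted(idx[half:])


def permute_across_samples(counts, rng):
    """1. permutation: for each CpG, shuffle the count pairs across samples."""
    permuted = []
    for row in counts:
        new_row = list(row)   # copy so the input is left unchanged
        rng.shuffle(new_row)  # each (methylated, unmethylated) pair stays intact
        permuted.append(new_row)
    return permuted


def permute_positions(counts, rng):
    """2. permutation: for each sample, shuffle the CpG positions."""
    n_cpg, n_samples = len(counts), len(counts[0])
    permuted = [[None] * n_samples for _ in range(n_cpg)]
    for j in range(n_samples):
        column = [counts[i][j] for i in range(n_cpg)]
        rng.shuffle(column)   # independent shuffle for every sample
        for i in range(n_cpg):
            permuted[i][j] = column[i]
    return permuted


rng = random.Random(0)  # fixed seed only for reproducibility of the sketch
# Toy data (hypothetical): 50 CpGs in six control samples.
counts = [[(rng.randint(0, 20), rng.randint(0, 20)) for _ in range(6)]
          for _ in range(50)]

group_a, group_b = split_controls(6, rng)
perm1 = permute_across_samples(counts, rng)
perm2 = permute_positions(counts, rng)
```

In the first approach each CpG keeps its own set of count pairs, so position-specific levels shared by all samples survive while per-sample differences are averaged out. In the second approach each sample keeps its own set of count pairs, so per-sample global levels survive while the order along the genome is destroyed.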
1.4 Further comments and aspects
1.5 References
Coleman-Derr, D. and Zilberman, D. 2012. Deposition of histone variant H2A.Z within gene bodies regulates responsive genes. PLoS Genetics 8, e1002988