See vignette("JACUSA2helper") for a general description of analysis with JACUSA2helper. For details on JACUSA2, check the JACUSA2 manual.

This vignette shows how to use meta conditions to combine multiple pairwise comparisons.

Meta condition

JACUSA2helper supports the analysis of several related JACUSA2 result files via results <- read_results(files, meta_conds) where meta_conds is a vector of strings that provides a descriptive name for each file in the vector of strings files.

Here, we use the Zhou et al. (2018) data set, in which the authors map the RNA modification pseudouridine (\(\Psi\)) by chemically modifying pseudouridines with carbodiimide (+CMC) and detecting arrest events that are induced by reverse transcription stops in high-throughput sequencing under three different conditions (HIVRT, SIIIMn, and SIIIMg). All three data sets are available in JACUSA2helper via data(). Additionally, we have compiled the combined data sets data(Zhou2018_call2) and data(Zhou2018_rt_arrest), which use meta conditions. Note that this data has no replicates!

data(Zhou2018_rt_arrest)
unique(Zhou2018_rt_arrest$meta_cond)
#> [1] HIVRT    SIIIRTMn SIIIRTMg
#> Levels: HIVRT SIIIRTMg SIIIRTMn

Group By Site (and other)

When manipulating a multi-result object created by read_results(), it is crucial to distinguish the following two statements in an analysis pipeline:

  • results %>% ... other functions()
  • results %>% dplyr::group_by(meta_cond) %>% ... other functions()

The first statement applies any subsequent functions to ALL sites regardless of the meta condition, while the second statement applies them separately to the sites of EACH meta condition!
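The difference between the two statements can be illustrated with a small sketch. The toy table and its score column are hypothetical stand-ins for a real result object; only dplyr is assumed.

```r
library(dplyr)

# Toy stand-in for a multi-result object (hypothetical columns and values).
results <- tibble(
  meta_cond = c("A", "A", "B", "B", "B"),
  score     = c(1, 3, 10, 20, 30)
)

# Ungrouped: one summary computed over ALL sites.
overall <- results %>%
  summarise(mean_score = mean(score))

# Grouped: one summary computed for EACH meta condition.
per_meta <- results %>%
  group_by(meta_cond) %>%
  summarise(mean_score = mean(score))
```

The ungrouped pipeline returns a single row, whereas the grouped pipeline returns one row per meta condition.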

Number of sites

The following statement will determine the number of covered sites per contig and meta condition:
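A minimal sketch of such a count, assuming a table with one row per covered site and the columns contig and meta_cond. The toy data below is hypothetical and does not reproduce the Zhou2018 data.

```r
library(dplyr)

# Hypothetical toy data: one row per covered site and meta condition.
sites <- tibble(
  contig    = c("chr1", "chr1", "chr2", "chr1", "chr2", "chr2"),
  start     = c(10, 20, 5, 10, 5, 7),
  meta_cond = c("HIVRT", "HIVRT", "HIVRT", "SIIIRTMn", "SIIIRTMn", "SIIIRTMn")
)

# Number of rows (sites) for every combination of contig and meta condition.
per_contig_meta <- sites %>%
  count(contig, meta_cond, name = "n_sites")
```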

This statement will determine the number of covered sites per contig regardless of the meta condition:
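As a sketch under the same assumptions (hypothetical toy data with the columns contig, start, and meta_cond), dropping meta_cond from the grouping gives counts per contig. Whether a site covered in several meta conditions should be counted once or once per meta condition depends on the question, so both variants are shown.

```r
library(dplyr)

# Hypothetical toy data: one row per covered site and meta condition.
sites <- tibble(
  contig    = c("chr1", "chr1", "chr2", "chr1", "chr2", "chr2"),
  start     = c(10, 20, 5, 10, 5, 7),
  meta_cond = c("HIVRT", "HIVRT", "HIVRT", "SIIIRTMn", "SIIIRTMn", "SIIIRTMn")
)

# Variant 1: count rows per contig (a site present in two meta
# conditions contributes two rows).
rows_per_contig <- sites %>%
  count(contig, name = "n_rows")

# Variant 2: count distinct genomic positions per contig.
sites_per_contig <- sites %>%
  distinct(contig, start) %>%
  count(contig, name = "n_sites")
```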

Filter

Filter by coverage regardless of the meta condition.
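A minimal sketch of such a filter. It assumes flat, hypothetical coverage columns cov_cond1 and cov_cond2 and an arbitrary threshold of 10 reads; a real JACUSA2helper result object may store coverage differently, so the column access would need to be adapted.

```r
library(dplyr)

# Hypothetical toy data with one coverage column per condition.
sites <- tibble(
  meta_cond = c("HIVRT", "HIVRT", "SIIIRTMn", "SIIIRTMn"),
  cov_cond1 = c(12, 4, 30, 9),
  cov_cond2 = c(15, 20, 8, 25)
)

min_cov <- 10  # assumed threshold, chosen only for illustration

# Keep sites where BOTH conditions reach the minimum coverage.
# No group_by() is used, so the filter ignores the meta condition.
filtered <- sites %>%
  filter(cov_cond1 >= min_cov, cov_cond2 >= min_cov)
```

Because the condition is evaluated per row, only the first toy site (coverage 12 and 15) passes the filter.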

Plot

First, we add a description data_desc of the conditions to the result object. The Zhou2018 data sets have been laid out in such a way that conditions 1 and 2 correspond to carbodiimide treatment (+CMC) and control (-CMC), respectively.

Next, we define a ggplot2 object that allows legends for different scales to be merged. Check combine legends for details. In brief, we use colour to represent cond(ition) and linetype to represent repl(icate), and relate their possible combinations to descriptive labels.
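One way to derive such a description is sketched below. It assumes a hypothetical long-format table with an integer cond column; the mapping of condition 1 to +CMC and condition 2 to -CMC follows the layout described above, while the treatment helper column and the label format are illustrative choices.

```r
library(dplyr)

# Hypothetical toy data: one row per meta condition and condition.
sites <- tibble(
  meta_cond = c("HIVRT", "HIVRT", "SIIIRTMn", "SIIIRTMn"),
  cond      = c(1L, 2L, 1L, 2L)
)

described <- sites %>%
  mutate(
    # Condition 1 is the +CMC treatment, condition 2 the -CMC control.
    treatment = if_else(cond == 1L, "+CMC", "-CMC"),
    # Descriptive label combining meta condition and treatment.
    data_desc = paste(meta_cond, treatment)
  )
```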
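The legend merging can be sketched as follows. ggplot2 merges the legends of two scales when they share the same name, breaks, and labels, so both aesthetics are mapped to one key per cond/repl combination. The keys, labels, colours, and toy lines are hypothetical, and two replicates are used only to show the linetype mapping.

```r
library(ggplot2)

# All cond/repl combinations with a key and a descriptive label.
combos <- expand.grid(cond = 1:2, repl = 1:2)
combos$key   <- paste(combos$cond, combos$repl, sep = "_")
combos$label <- paste0(ifelse(combos$cond == 1, "+CMC", "-CMC"),
                       " rep", combos$repl)

# Colour encodes the condition, linetype encodes the replicate.
colours   <- setNames(c("firebrick", "steelblue")[combos$cond], combos$key)
linetypes <- setNames(c("solid", "dashed")[combos$repl], combos$key)

# Identical name, breaks, and labels on both scales yield ONE legend.
plt <- ggplot() +
  scale_colour_manual(name = "Data", values = colours,
                      breaks = combos$key, labels = combos$label) +
  scale_linetype_manual(name = "Data", values = linetypes,
                        breaks = combos$key, labels = combos$label)

# Toy usage: map both aesthetics to the same key column.
toy <- data.frame(
  x   = rep(1:3, times = 4),
  y   = rep(1:4, each = 3),
  key = rep(combos$key, each = 3)
)
p <- plt + geom_line(data = toy,
                     aes(x = x, y = y, colour = key, linetype = key))
```

Defining the scales once on an empty ggplot() object lets the same legend definition be reused for several plots.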

Pvalue distribution

Plot of the empirical cumulative distributions of pvalues for all meta conditions.
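A minimal sketch of such a plot using ggplot2's stat_ecdf(). The p-values below are simulated uniformly at random purely for illustration and are not results from the Zhou2018 data; the column names pvalue and meta_cond are assumptions.

```r
library(ggplot2)

set.seed(42)

# Simulated p-values for three meta conditions (illustration only).
toy <- data.frame(
  meta_cond = rep(c("HIVRT", "SIIIRTMn", "SIIIRTMg"), each = 50),
  pvalue    = runif(150)
)

# One empirical cumulative distribution curve per meta condition.
p <- ggplot(toy, aes(x = pvalue, colour = meta_cond)) +
  stat_ecdf(geom = "step") +
  labs(x = "pvalue",
       y = "Cumulative fraction of sites",
       colour = "Meta condition")
```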

References

Zhou, Katherine I., Wesley C. Clark, David W. Pan, Matthew J. Eckwahl, Qing Dai, and Tao Pan. 2018. “Pseudouridines Have Context-Dependent Mutation and Stop Rates in High-Throughput Sequencing.” RNA Biology 15 (7): 892–900. https://doi.org/10.1080/15476286.2018.1462654.