
Zhou et al. (2018) map the RNA modification pseudouridine (\(\Psi\)) by chemically modifying pseudouridines with carbodiimide (+CMC) and detecting arrest events induced by reverse transcription stops in high-throughput sequencing under three different conditions (HIVRT, SIIIMn, and SIIIMg). This dataset presents a combined version of all conditions and their pairwise +CMC versus -CMC (control) comparisons. The data structure contains an additional field "meta_condition" that identifies the condition from Zhou et al. (2018):

  • HIVRT,

  • SIIIMn, and

  • SIIIMg.

This dataset exemplifies the usefulness of combining multiple pairwise comparisons and making them distinguishable by a "meta" field.

Usage

data(Zhou2018_rt_arrest)

Format

A tibble with 23 elements:

  • id: Character string representing a unique identifier - created from contig, start, [end], and strand.

  • contig: Character string representing the contig of the variant

  • start: Numeric position of variant (>=0)

  • end: Numeric; corresponds to "start + 1"

  • name: Character string. Name of the method used (call-2).

  • pvalue: Numeric value representing the p-value of the test.

  • strand: Character representing strand information; "+", "-", or "." (no strand information available)

  • arrest: Numeric tibble representing counts for A, C, G, and T base calls from arrest reads.

  • through: Numeric tibble representing counts for A, C, G, and T base calls from through reads.

  • bases: Numeric tibble representing counts for A, C, G, and T base calls.

  • cov: Numeric value indicating the read coverage for this site

  • arrest_rate: Numeric tibble representing the arrest rate for each sample.

  • arrest_score: Numeric - test-statistic score.

  • backtrack1: Character - indicator if backtracking was used for condition 1.

  • backtrack2: Character - indicator if backtracking was used for condition 2.

  • backtrackP: Character - indicator if backtracking was used for the pooled condition.

  • reset1: Character - indicator if default estimation was unstable for condition 1.

  • reset2: Character - indicator if default estimation was unstable for condition 2.

  • resetP: Character - indicator if default estimation was unstable for the pooled condition.

  • info: ";"-separated character string providing additional data for this specific site. An empty field is represented by "*".

  • filter: ";"-separated character string showing feature filter information. An empty field is represented by "*".

  • ref: Character "A", "C", "G", "T", or "N" representing the reference base for this site - inverted when strand is "-".

  • meta: Character string indicating the dataset. Here: "HIVRT", "SIIIRTMn", or "SIIIRTMg".
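
As a rough illustration of how the "meta" field can separate the combined pairwise comparisons, the following base R sketch splits a table by condition and counts sites below a p-value cutoff. The data frame `toy`, all of its values, the condition labels, and the cutoff of 0.05 are invented for this example and only mimic a few of the columns listed above; the real object would be loaded with data(Zhou2018_rt_arrest) instead.

```r
# Toy stand-in for Zhou2018_rt_arrest. All values are invented for
# illustration; only a few of the documented columns are mimicked.
toy <- data.frame(
  contig = c("chr1", "chr1", "chr2", "chr2"),
  start = c(100, 100, 50, 50),
  pvalue = c(0.001, 0.20, 0.03, 0.0004),
  arrest_score = c(12.5, 1.1, 4.8, 15.2),
  meta = c("HIVRT", "SIIIRTMn", "HIVRT", "SIIIRTMg"),
  stringsAsFactors = FALSE
)

# Split the combined table into one data frame per condition.
by_condition <- split(toy, toy$meta)

# Count the sites per condition that fall below a hypothetical p-value cutoff.
cutoff <- 0.05
n_significant <- vapply(
  by_condition,
  function(d) sum(d$pvalue < cutoff),
  integer(1)
)
print(n_significant)
```

The same split() call should apply to the real tibble, provided the column holding the condition label is named as documented above.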

Details

Check Section "Reverse transcriptase arrest events" in https://github.com/dieterich-lab/JACUSA2/blob/master/manual/manual.pdf for details on pre-processing and mapping primary sequencing data.

References

Zhou, K. I.; Clark, W. C.; Pan, D. W.; Eckwahl, M. J.; Dai, Q. & Pan, T. Pseudouridines have context-dependent mutation and stop rates in high-throughput sequencing. RNA Biology, Informa UK Limited, 2018, 15, 892-900.