dynast.estimation.p_e

Module Contents

Functions

read_p_e(p_e_path: str, group_by: Optional[List[str]] = None) → Dict[Union[str, Tuple[str, Ellipsis]], float]

Read p_e CSV as a dictionary, with group_by columns as keys.

estimate_p_e_control(df_counts: pandas.DataFrame, p_e_path: str, conversions: FrozenSet[FrozenSet[str]] = frozenset({frozenset({'TC'})})) → str

Estimate background mutation rate of unlabeled RNA for a control sample

estimate_p_e(df_counts: pandas.DataFrame, p_e_path: str, conversions: FrozenSet[FrozenSet[str]] = frozenset({frozenset({'TC'})}), group_by: Optional[List[str]] = None) → str

Estimate background mutation rate of unlabeled RNA by calculating the average mutation rate of all three nucleotides other than conversion[0].

estimate_p_e_nasc(df_rates: pandas.DataFrame, p_e_path: str, group_by: Optional[List[str]] = None) → str

Estimate background mutation rate of unlabeled RNA by calculating the average CT and GA mutation rates.

dynast.estimation.p_e.read_p_e(p_e_path: str, group_by: Optional[List[str]] = None) → Dict[Union[str, Tuple[str, Ellipsis]], float]

Read p_e CSV as a dictionary, with group_by columns as keys.

Parameters
p_e_path

Path to CSV containing p_e values

group_by

Columns to group by

Returns

Dictionary with group_by columns as keys (tuple if multiple)
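
To illustrate the shape of the returned dictionary, here is a minimal sketch that builds one with pandas. It does not call dynast, and it is not the library's implementation. The helper name `read_grouped_csv` and the column names `barcode`, `gene`, and `p_e` are hypothetical, chosen only for this example.

```python
import io

import pandas as pd


def read_grouped_csv(csv_file, group_by):
    """Hypothetical helper: map group_by column values to the 'p_e' column."""
    df = pd.read_csv(csv_file)
    if len(group_by) == 1:
        # A single grouping column gives plain (non-tuple) keys.
        keys = df[group_by[0]].tolist()
    else:
        # Several grouping columns give tuple keys, one element per column.
        keys = list(df[group_by].itertuples(index=False, name=None))
    return dict(zip(keys, df['p_e'].tolist()))


# Toy CSV content; the layout is an assumption, not dynast's documented format.
csv_text = "barcode,gene,p_e\nAAAC,G1,0.001\nAAAC,G2,0.002\n"

single = read_grouped_csv(io.StringIO(csv_text), ['gene'])
multi = read_grouped_csv(io.StringIO(csv_text), ['barcode', 'gene'])
```

Here `single` is keyed by gene name, while `multi` is keyed by `(barcode, gene)` tuples, matching the "tuple if multiple" note above.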

dynast.estimation.p_e.estimate_p_e_control(df_counts: pandas.DataFrame, p_e_path: str, conversions: FrozenSet[FrozenSet[str]] = frozenset({frozenset({'TC'})})) → str

Estimate background mutation rate of unlabeled RNA for a control sample by simply calculating the average mutation rate.

Parameters
df_counts

Pandas dataframe containing number of each conversion and nucleotide content of each read

p_e_path

Path to output CSV containing p_e estimates

conversions

Conversion(s) in question

Returns

Path to output CSV containing p_e estimates
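
The "average mutation rate" for a control sample can be sketched as a pooled ratio: every observed conversion divided by every base that could have been converted. The snippet below is an illustration under assumptions, not dynast's code. In particular, the column names `TC` (T>C conversions per read) and `T` (T bases per read) are assumed for this example.

```python
import pandas as pd

# Toy per-read counts. Column names are assumptions: 'TC' is the number of
# T>C conversions in a read and 'T' is the number of T bases in that read.
df_counts = pd.DataFrame({
    'TC': [0, 1, 0, 1],
    'T': [20, 30, 25, 25],
})

conversions = frozenset({frozenset({'TC'})})

# Flatten the nested frozensets into conversion column names, then take the
# first letter of each conversion as the base it originates from.
flattened = sorted(conv for group in conversions for conv in group)
bases = sorted({conv[0] for conv in flattened})

# Pooled rate: all observed conversions over all convertible bases.
p_e = df_counts[flattened].sum().sum() / df_counts[bases].sum().sum()
```

With these toy counts there are 2 conversions over 100 T bases. Because a control sample contains no labeled RNA, any conversion observed in it is treated as background.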

dynast.estimation.p_e.estimate_p_e(df_counts: pandas.DataFrame, p_e_path: str, conversions: FrozenSet[FrozenSet[str]] = frozenset({frozenset({'TC'})}), group_by: Optional[List[str]] = None) → str

Estimate background mutation rate of unlabeled RNA by calculating the average mutation rate of all three nucleotides other than conversion[0].

Parameters
df_counts

Pandas dataframe containing number of each conversion and nucleotide content of each read

p_e_path

Path to output CSV containing p_e estimates

conversions

Conversion(s) in question, defaults to frozenset({frozenset({'TC'})})

group_by

Columns to group by, defaults to None

Returns

Path to output CSV containing p_e estimates
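
When no control sample is available, the description above suggests using the other three nucleotides as an internal background. The following is a minimal sketch of that idea for a `TC` conversion, grouped by a hypothetical `barcode` column. The column layout (one column per base and one per two-letter conversion) is an assumption made for this example, and the code is not taken from dynast.

```python
import pandas as pd

BASES = ['A', 'C', 'G', 'T']
# All twelve single-nucleotide conversions, e.g. 'AC' means A>C.
CONVERSIONS = [a + b for a in BASES for b in BASES if a != b]

# Toy per-read counts; conversions not listed for a read are zero.
rows = [
    {'barcode': 'AAAC', 'A': 25, 'C': 25, 'G': 25, 'T': 25, 'AG': 1, 'TC': 3},
    {'barcode': 'AAAC', 'A': 25, 'C': 25, 'G': 25, 'T': 25, 'CT': 1, 'TC': 2},
    {'barcode': 'GGGT', 'A': 50, 'C': 25, 'G': 25, 'T': 40, 'GA': 1},
]
df_counts = (
    pd.DataFrame(rows)
    .reindex(columns=['barcode'] + BASES + CONVERSIONS)
    .fillna(0)
)

labeled_base = 'T'  # first letter of the 'TC' conversion
other_bases = [b for b in BASES if b != labeled_base]
other_conversions = [c for c in CONVERSIONS if c[0] != labeled_base]

# Sum counts within each group, then divide the conversions that start from
# A, C, or G by the total number of A, C, and G bases.
df_sum = df_counts.groupby('barcode').sum()
p_e = df_sum[other_conversions].sum(axis=1) / df_sum[other_bases].sum(axis=1)
p_e_by_group = p_e.to_dict()
```

The `TC` counts are deliberately excluded: in a labeled sample they mix true labeling with background, so only conversions from the other three bases are used.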

dynast.estimation.p_e.estimate_p_e_nasc(df_rates: pandas.DataFrame, p_e_path: str, group_by: Optional[List[str]] = None) → str

Estimate background mutation rate of unlabeled RNA by calculating the average CT and GA mutation rates. This function imitates the procedure implemented in the NASC-seq pipeline (DOI: 10.1038/s41467-019-11028-9).

Parameters
df_rates

Pandas dataframe containing mutation rates, including the CT and GA rates

p_e_path

Path to output CSV containing p_e estimates

group_by

Columns to group by, defaults to None

Returns

Path to output CSV containing p_e estimates
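
The NASC-seq style estimate described above reduces to a per-group mean of two rates. A minimal sketch follows, assuming `df_rates` already holds one row per group with `CT` and `GA` rate columns. The `barcode` column and the toy values are hypothetical, and this is not dynast's code.

```python
import pandas as pd

# Toy mutation rates per group; the column layout is an assumption.
df_rates = pd.DataFrame({
    'barcode': ['AAAC', 'GGGT'],
    'CT': [0.002, 0.004],
    'GA': [0.004, 0.002],
}).set_index('barcode')

# Background rate per group: the mean of the C>T and G>A rates.
p_e = (df_rates['CT'] + df_rates['GA']) / 2
p_e_by_group = p_e.to_dict()
```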