`dynast.estimation.pi`

Module Contents

Functions

`read_pi`(pi_path: str, group_by: Optional[List[str]] = None) → Tuple[Union[float, Dict[str, float], Dict[Tuple[str, Ellipsis], float]], Union[float, Dict[str, float], Dict[Tuple[str, Ellipsis], float]], Union[float, Dict[str, float], Dict[Tuple[str, Ellipsis], float]]]	Read pi CSV as a dictionary.
`initializer`(model: pystan.StanModel)	Multiprocessing initializer.
`beta_mean`(alpha: float, beta: float) → float	Calculate the mean of a beta distribution.
`beta_mode`(alpha: float, beta: float) → float	Calculate the mode of a beta distribution.
`guess_beta_parameters`(guess: float, strength: int = 5) → Tuple[float, float]	Given a guess of the mean of a beta distribution, calculate beta distribution parameters such that the distribution is skewed by some strength toward the guess.
`fit_stan_mcmc`(values: numpy.ndarray, p_e: float, p_c: float, guess: float = 0.5, model: pystan.StanModel = None, n_chains: int = 1, n_warmup: int = 1000, n_iters: int = 1000, n_threads: int = 1, seed: Optional[int] = None) → Tuple[float, float, float, float]	Run MCMC to estimate the fraction of labeled RNA.
`estimate_pi`(df_aggregates: pandas.DataFrame, p_e: float, p_c: float, pi_path: str, group_by: Optional[List[str]] = None, p_group_by: Optional[List[str]] = None, n_threads: int = 8, threshold: int = 16, seed: Optional[int] = None, nasc: bool = False, model: Optional[pystan.StanModel] = None) → str	Estimate the fraction of labeled RNA.

Attributes

_model

dynast.estimation.pi.read_pi(pi_path: str, group_by: Optional[List[str]] = None) → Tuple[Union[float, Dict[str, float], Dict[Tuple[str, Ellipsis], float]], Union[float, Dict[str, float], Dict[Tuple[str, Ellipsis], float]], Union[float, Dict[str, float], Dict[Tuple[str, Ellipsis], float]]]

Read pi CSV as a dictionary.

Parameters

pi_path: Path to the CSV containing pi values
group_by: Columns that were used to group estimation

Returns

Dictionary with barcodes and genes as keys
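
The return annotation suggests that each returned element is a bare float when no grouping is used, a dictionary keyed by a single string for one grouping column, and a dictionary keyed by a tuple of strings for several columns. The sketch below illustrates only that keying logic. The helper `read_grouped_values` and the column names `barcode`, `GX`, and `pi` are hypothetical and are not taken from dynast; the actual CSV layout may differ.

```python
import csv
import io
from typing import Dict, List, Optional, Tuple, Union

Key = Union[str, Tuple[str, ...]]


def read_grouped_values(
    handle, value_column: str, group_by: Optional[List[str]] = None
) -> Union[float, Dict[Key, float]]:
    """Hypothetical helper: read one numeric column keyed by grouping columns."""
    rows = list(csv.DictReader(handle))
    if not group_by:
        # No grouping: assume a single row and return a bare float.
        return float(rows[0][value_column])
    result: Dict[Key, float] = {}
    for row in rows:
        if len(group_by) == 1:
            # One grouping column: plain string key.
            key: Key = row[group_by[0]]
        else:
            # Several grouping columns: tuple-of-strings key.
            key = tuple(row[column] for column in group_by)
        result[key] = float(row[value_column])
    return result


# Made-up CSV content for illustration only.
example_csv = "barcode,GX,pi\nAAAC,gene1,0.25\nAAAC,gene2,0.5\n"
by_pair = read_grouped_values(io.StringIO(example_csv), "pi", ["barcode", "GX"])
by_gene = read_grouped_values(io.StringIO(example_csv), "pi", ["GX"])
```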

dynast.estimation.pi._model

dynast.estimation.pi.initializer(model: pystan.StanModel)

Multiprocessing initializer. https://docs.python.org/3/library/concurrent.futures.html#concurrent.futures.ThreadPoolExecutor

This initializer performs a one-time expensive initialization for each process.

dynast.estimation.pi.beta_mean(alpha: float, beta: float) → float

Calculate the mean of a beta distribution. https://en.wikipedia.org/wiki/Beta_distribution

Parameters

alpha: First parameter of the beta distribution
beta: Second parameter of the beta distribution

Returns

Mean of the beta distribution
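
For reference, the mean of a Beta(alpha, beta) distribution is alpha / (alpha + beta). A minimal sketch of that formula, using the hypothetical name `beta_mean_sketch` rather than the library function:

```python
def beta_mean_sketch(alpha: float, beta: float) -> float:
    """Mean of a Beta(alpha, beta) distribution: alpha / (alpha + beta)."""
    return alpha / (alpha + beta)


symmetric = beta_mean_sketch(2.0, 2.0)
skewed = beta_mean_sketch(2.0, 6.0)
```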

dynast.estimation.pi.beta_mode(alpha: float, beta: float) → float

Calculate the mode of a beta distribution. https://en.wikipedia.org/wiki/Beta_distribution

When the distribution is bimodal (both alpha and beta are less than 1), this function returns nan.

Parameters

alpha: First parameter of the beta distribution
beta: Second parameter of the beta distribution

Returns

Mode of the beta distribution
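
For alpha > 1 and beta > 1 the mode of a beta distribution is (alpha - 1) / (alpha + beta - 2). The sketch below applies that formula and returns nan in the bimodal case described above. The handling of the remaining boundary cases (mode at 0 or 1 when only one parameter exceeds 1, and nan for the uniform case alpha = beta = 1) is an assumption of this sketch, as is the name `beta_mode_sketch`.

```python
import math


def beta_mode_sketch(alpha: float, beta: float) -> float:
    """Mode of a Beta(alpha, beta) distribution, or nan when it is not unique."""
    if alpha > 1 and beta > 1:
        # Unimodal interior case.
        return (alpha - 1) / (alpha + beta - 2)
    if alpha > 1 and beta <= 1:
        # Density increases toward 1 (assumed boundary handling).
        return 1.0
    if alpha <= 1 and beta > 1:
        # Density increases toward 0 (assumed boundary handling).
        return 0.0
    # Both parameters <= 1: bimodal or uniform, so no single mode.
    return math.nan


interior = beta_mode_sketch(3.0, 3.0)
bimodal = beta_mode_sketch(0.5, 0.5)
```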

dynast.estimation.pi.guess_beta_parameters(guess: float, strength: int = 5) → Tuple[float, float]

Given a guess of the mean of a beta distribution, calculate beta distribution parameters such that the distribution is skewed by some strength toward the guess.

Parameters

guess: Guess of the mean of the beta distribution
strength: Strength of the skew

Returns

Beta distribution parameters (alpha, beta)
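
One common way to turn a guessed mean into beta parameters is to fix alpha + beta at the strength and split it in proportion to the guess, so that the mean alpha / (alpha + beta) equals the guess and a larger strength concentrates the distribution more tightly around it. The sketch below uses that parameterization. Whether dynast uses exactly this rule is an assumption, and `guess_beta_parameters_sketch` is a hypothetical name.

```python
from typing import Tuple


def guess_beta_parameters_sketch(guess: float, strength: float = 5) -> Tuple[float, float]:
    """Return (alpha, beta) with mean equal to `guess` and alpha + beta equal to `strength`."""
    if not 0 < guess < 1:
        raise ValueError("guess must lie strictly between 0 and 1")
    alpha = strength * guess
    beta = strength * (1 - guess)
    return alpha, beta


alpha, beta = guess_beta_parameters_sketch(0.2, strength=5)
```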

dynast.estimation.pi.fit_stan_mcmc(values: numpy.ndarray, p_e: float, p_c: float, guess: float = 0.5, model: pystan.StanModel = None, n_chains: int = 1, n_warmup: int = 1000, n_iters: int = 1000, n_threads: int = 1, seed: Optional[int] = None) → Tuple[float, float, float, float]

Run MCMC to estimate the fraction of labeled RNA.

Parameters

values: Array of three columns encoding a sparse array in zero-indexed (row, column, value) format, where row is the number of conversions, column is the nucleotide content, and value is the number of reads
p_e: Average mutation rate in unlabeled RNA
p_c: Average mutation rate in labeled RNA
guess: Guess for the fraction of labeled RNA
model: PyStan model to run MCMC with. If not provided, will try to use the _model global variable
n_chains: Number of MCMC chains
n_warmup: Number of warmup iterations
n_iters: Number of MCMC iterations, excluding any warmups
n_threads: Number of threads to use
seed: Random seed used for MCMC

Returns

(guess, alpha, beta, pi)
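
To make the (row, column, value) encoding of `values` concrete, the sketch below converts a small dense count matrix into that form. The matrix contents and the helper name `dense_to_triples` are made up for illustration; only the encoding itself comes from the parameter description above.

```python
from typing import List, Tuple


def dense_to_triples(dense: List[List[int]]) -> List[Tuple[int, int, int]]:
    """Encode a dense count matrix as zero-indexed (row, column, value) triples."""
    triples = []
    for n_conversions, row in enumerate(dense):
        for content, n_reads in enumerate(row):
            if n_reads > 0:
                # Only non-zero entries are stored in the sparse encoding.
                triples.append((n_conversions, content, n_reads))
    return triples


# Row index = number of conversions, column index = nucleotide content.
dense = [
    [0, 5, 0],
    [2, 0, 1],
]
triples = dense_to_triples(dense)
```

Passing `triples` to `numpy.asarray` yields a three-column array of the shape described for `values`.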

dynast.estimation.pi.estimate_pi(df_aggregates: pandas.DataFrame, p_e: float, p_c: float, pi_path: str, group_by: Optional[List[str]] = None, p_group_by: Optional[List[str]] = None, n_threads: int = 8, threshold: int = 16, seed: Optional[int] = None, nasc: bool = False, model: Optional[pystan.StanModel] = None) → str

Estimate the fraction of labeled RNA.

Parameters

df_aggregates: Pandas dataframe containing aggregate values
p_e: Average mutation rate in unlabeled RNA
p_c: Average mutation rate in labeled RNA
pi_path: Path to write pi estimates
group_by: Columns that were used to group cells
p_group_by: Columns that p_e/p_c estimation was grouped by
n_threads: Number of threads
threshold: Any conversion-content pairs with fewer than this many reads will not be processed
seed: Random seed
nasc: Flag to change behavior to match the NASC-seq pipeline. Specifically, the mode of the estimated Beta distribution is used as pi. Defaults to False
model: PyStan model to run MCMC with. If not provided, will try to compile the model manually

Returns

Path to pi output
