dynast.estimation.pi
Module Contents
Functions
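- read_pi: Read pi CSV as a dictionary.
- initializer: Multiprocessing initializer.
- beta_mean: Calculate the mean of a beta distribution.
- beta_mode: Calculate the mode of a beta distribution.
- guess_beta_parameters: Given a guess of the mean of a beta distribution, calculate beta distribution parameters such that the distribution is skewed by some strength toward the guess.
- fit_stan_mcmc: Run MCMC to estimate the fraction of labeled RNA.
- estimate_pi: Estimate the fraction of labeled RNA.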
Attributes
- dynast.estimation.pi.read_pi(pi_path: str, group_by: Optional[List[str]] = None) -> Tuple[Union[float, Dict[str, float], Dict[Tuple[str, ...], float]], Union[float, Dict[str, float], Dict[Tuple[str, ...], float]], Union[float, Dict[str, float], Dict[Tuple[str, ...], float]]]
Read pi CSV as a dictionary.
- Parameters
- pi_path
Path to CSV containing pi values
- group_by
Columns that were used to group estimation
- Returns
Dictionary with barcodes and genes as keys
- dynast.estimation.pi.initializer(model: pystan.StanModel)
Multiprocessing initializer. https://docs.python.org/3/library/concurrent.futures.html#concurrent.futures.ThreadPoolExecutor
This initializer performs a one-time expensive initialization for each process.
- dynast.estimation.pi.beta_mean(alpha: float, beta: float) -> float
Calculate the mean of a beta distribution. https://en.wikipedia.org/wiki/Beta_distribution
- Parameters
- alpha
First parameter of the beta distribution
- beta
Second parameter of the beta distribution
- Returns
Mean of the beta distribution
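For reference, the mean of a Beta(alpha, beta) distribution is alpha / (alpha + beta). A minimal sketch of that formula, using the hypothetical name `beta_mean_sketch` so as not to imply it is the module's code:

```python
def beta_mean_sketch(alpha: float, beta: float) -> float:
    """Mean of a Beta(alpha, beta) distribution: alpha / (alpha + beta)."""
    return alpha / (alpha + beta)


# Beta(2, 6) puts most of its mass near zero; its mean is 2 / 8.
example_mean = beta_mean_sketch(2, 6)
```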
- dynast.estimation.pi.beta_mode(alpha: float, beta: float) -> float
Calculate the mode of a beta distribution. https://en.wikipedia.org/wiki/Beta_distribution
When the distribution is bimodal (alpha, beta < 1), this function returns nan.
- Parameters
- alpha
First parameter of the beta distribution
- beta
Second parameter of the beta distribution
- Returns
Mode of the beta distribution
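The standard closed form for the mode applies when both parameters exceed 1: (alpha - 1) / (alpha + beta - 2). The sketch below, under the hypothetical name `beta_mode_sketch`, returns nan in the bimodal case described above. How it treats the remaining boundary cases (mode at 0 or 1, nan for the uniform case) is an assumption made for illustration, not confirmed behavior of the module.

```python
import math


def beta_mode_sketch(alpha: float, beta: float) -> float:
    """Mode of a Beta(alpha, beta) distribution, or nan when there is no unique mode."""
    if alpha > 1 and beta > 1:
        # Unimodal with an interior peak.
        return (alpha - 1) / (alpha + beta - 2)
    if alpha < 1 and beta < 1:
        # Bimodal: density is unbounded at both 0 and 1.
        return math.nan
    if alpha == 1 and beta == 1:
        # Uniform: every point is a mode (assumption: report nan).
        return math.nan
    # Remaining cases peak at a boundary (assumption about how to report them).
    return 0.0 if alpha < beta else 1.0
```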
- dynast.estimation.pi.guess_beta_parameters(guess: float, strength: int = 5) -> Tuple[float, float]
Given a guess of the mean of a beta distribution, calculate beta distribution parameters such that the distribution is skewed by some strength toward the guess.
- Parameters
- guess
Guess of the mean of the beta distribution
- strength
Strength of the skew
- Returns
Beta distribution parameters (alpha, beta)
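One simple way to build a beta distribution whose mean equals a guess is to split a total "strength" between the two parameters in proportion to the guess. The sketch below illustrates that idea only. The name `guess_beta_parameters_sketch` and the exact formula are assumptions and may differ from what the module does.

```python
from typing import Tuple


def guess_beta_parameters_sketch(guess: float, strength: float = 5) -> Tuple[float, float]:
    """Return (alpha, beta) with alpha + beta == strength and mean == guess.

    Assumes 0 < guess < 1; a guess of exactly 0 or 1 would give a zero
    parameter, which is not a valid beta distribution.
    """
    alpha = strength * guess
    beta = strength * (1 - guess)
    return alpha, beta


alpha, beta = guess_beta_parameters_sketch(0.2, strength=5)
# The mean alpha / (alpha + beta) recovers the guess.
recovered = alpha / (alpha + beta)
```

With this parameterization, a larger strength concentrates the distribution more tightly around the guess while the mean stays the same.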
- dynast.estimation.pi.fit_stan_mcmc(values: numpy.ndarray, p_e: float, p_c: float, guess: float = 0.5, model: pystan.StanModel = None, n_chains: int = 1, n_warmup: int = 1000, n_iters: int = 1000, n_threads: int = 1, seed: Optional[int] = None) -> Tuple[float, float, float, float]
Run MCMC to estimate the fraction of labeled RNA.
- Parameters
- values
Array of three columns encoding a sparse array in (row, column, value) format, zero-indexed, where row is the number of conversions, column is the nucleotide content, and value is the number of reads
- p_e
Average mutation rate in unlabeled RNA
- p_c
Average mutation rate in labeled RNA
- guess
Guess for the fraction of labeled RNA
- model
PyStan model to run MCMC with. If not provided, will try to use the _model global variable
- n_chains
Number of MCMC chains
- n_warmup
Number of warmup iterations
- n_iters
Number of MCMC iterations, excluding any warmups
- n_threads
Number of threads to use
- seed
Random seed used for MCMC
- Returns
(guess, alpha, beta, pi)
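The (row, column, value) layout expected for `values` can be illustrated with NumPy. The dense matrix and its counts below are made up for the example; only the three-column, zero-indexed layout comes from the parameter description above.

```python
import numpy as np

# Hypothetical dense count matrix:
#   row index    = number of conversions observed in a read
#   column index = nucleotide content of that read
#   entry        = number of reads with that (conversions, content) pair
dense = np.zeros((3, 5), dtype=int)
dense[0, 4] = 20  # 20 reads with 0 conversions and content 4
dense[1, 4] = 3   # 3 reads with 1 conversion and content 4
dense[2, 3] = 1   # 1 read with 2 conversions and content 3

# Keep only non-zero entries as zero-indexed (row, column, value) triplets.
rows, cols = np.nonzero(dense)
values = np.column_stack([rows, cols, dense[rows, cols]])
```

Each row of `values` is then one (conversions, content, read count) triplet, and pairs with no reads are simply absent.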
- dynast.estimation.pi.estimate_pi(df_aggregates: pandas.DataFrame, p_e: float, p_c: float, pi_path: str, group_by: Optional[List[str]] = None, p_group_by: Optional[List[str]] = None, n_threads: int = 8, threshold: int = 16, seed: Optional[int] = None, nasc: bool = False, model: Optional[pystan.StanModel] = None) -> str
Estimate the fraction of labeled RNA.
- Parameters
- df_aggregates
Pandas dataframe containing aggregate values
- p_e
Average mutation rate in unlabeled RNA
- p_c
Average mutation rate in labeled RNA
- pi_path
Path to write pi estimates
- group_by
Columns that were used to group cells
- p_group_by
Columns that p_e/p_c estimation was grouped by
- n_threads
Number of threads
- threshold
Any conversion-content pairs with fewer than this many reads will not be processed
- seed
Random seed
- nasc
Flag to change behavior to match the NASC-seq pipeline. Specifically, the mode of the estimated beta distribution is used as pi. Defaults to False
- model
PyStan model to run MCMC with. If not provided, will try to compile the model manually
- Returns
Path to pi output