dynast.preprocessing.coverage

Module Contents

Functions

read_coverage(coverage_path: str) → Dict[str, Dict[int, int]]

Read coverage CSV as a dictionary.

calculate_coverage_contig(counter: multiprocessing.Value, lock: multiprocessing.Lock, bam_path: str, contig: str, indices: List[Tuple[int, int, int]], alignments: Set[Tuple[str, int]] = None, umi_tag: Optional[str] = None, barcode_tag: Optional[str] = None, gene_tag: str = 'GX', barcodes: Optional[List[str]] = None, temp_dir: Optional[str] = None, update_every: int = 50000, velocity: bool = True) → str

Calculate coverage for a specific contig. This function is designed to be called as a separate process.

calculate_coverage(bam_path: str, conversions: Dict[str, Set[int]], coverage_path: str, alignments: Optional[List[Tuple[str, int]]] = None, umi_tag: Optional[str] = None, barcode_tag: Optional[str] = None, gene_tag: str = 'GX', barcodes: Optional[List[str]] = None, temp_dir: Optional[str] = None, velocity: bool = True) → str

Calculate coverage of each genomic position per barcode.

Attributes

COVERAGE_PARSER

dynast.preprocessing.coverage.COVERAGE_PARSER

dynast.preprocessing.coverage.read_coverage(coverage_path: str) → Dict[str, Dict[int, int]]

Read coverage CSV as a dictionary.

Parameters
coverage_path

Path to coverage CSV

Returns

Coverage as a nested dictionary
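The return type, Dict[str, Dict[int, int]], can be illustrated with a small sketch. It assumes the outer key is a contig name and the inner dictionary maps a genomic position to its coverage count; the documentation above only states that the result is a nested dictionary, so this interpretation, the helper coverage_at, and the sample values are all hypothetical.

```python
from typing import Dict

# Assumed shape of the value returned by read_coverage:
# outer key (assumed: contig name) -> genomic position -> coverage count.
Coverage = Dict[str, Dict[int, int]]


def coverage_at(coverage: Coverage, key: str, position: int) -> int:
    """Hypothetical helper: look up one position, treating missing entries as 0."""
    return coverage.get(key, {}).get(position, 0)


# Made-up values, only to show the nesting.
coverage = {"chr1": {100: 12, 101: 9}, "chr2": {7: 3}}

depth_present = coverage_at(coverage, "chr1", 100)
depth_missing = coverage_at(coverage, "chrX", 1)
```

Returning 0 for missing entries is a choice made for this sketch, not documented behavior of the library.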

dynast.preprocessing.coverage.calculate_coverage_contig(counter: multiprocessing.Value, lock: multiprocessing.Lock, bam_path: str, contig: str, indices: List[Tuple[int, int, int]], alignments: Set[Tuple[str, int]] = None, umi_tag: Optional[str] = None, barcode_tag: Optional[str] = None, gene_tag: str = 'GX', barcodes: Optional[List[str]] = None, temp_dir: Optional[str] = None, update_every: int = 50000, velocity: bool = True) → str

Calculate coverage for a specific contig. This function is designed to be called as a separate process.

Parameters
counter

Counter that keeps track of how many reads have been processed

lock

Lock guarding the counter so that multiple processes do not modify it at the same time

bam_path

Path to alignment BAM file

contig

Only reads that map to this contig will be processed

indices

Genomic positions to consider

alignments

Set of (read_id, alignment_index) tuples to process. All alignments are processed if this option is not provided.

umi_tag

BAM tag that encodes the UMI. If not provided, NA is output in the umi column.

barcode_tag

BAM tag that encodes the cell barcode. If not provided, NA is output in the barcode column.

gene_tag

BAM tag that encodes gene assignment, defaults to GX

barcodes

List of barcodes to be considered. All barcodes are considered if not provided.

temp_dir

Path to temporary directory

update_every

Update the counter every this many reads

velocity

Whether or not velocities were assigned

Returns

Path to coverage CSV
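The interplay of counter, lock, and update_every can be sketched as a batched progress counter: each worker accumulates a local count and only takes the lock once every update_every reads. This is a minimal illustration of that pattern using the standard library, not the library's actual implementation; the function process_reads and the read counts are hypothetical.

```python
import multiprocessing


def process_reads(counter, lock, n_reads, update_every=50000):
    """Hypothetical worker: pretend to process n_reads reads.

    The shared counter is updated in batches so the lock is taken
    once per update_every reads instead of once per read.
    """
    pending = 0
    for _ in range(n_reads):
        # ... per-read work would happen here ...
        pending += 1
        if pending == update_every:
            with lock:
                counter.value += pending
            pending = 0
    # Flush any reads that did not fill a complete batch.
    if pending:
        with lock:
            counter.value += pending
    return n_reads


# In-process demonstration; in practice the same counter and lock would be
# shared with worker processes.
counter = multiprocessing.Value("I", 0)
lock = multiprocessing.Lock()
process_reads(counter, lock, 120, update_every=50)
```

Batching keeps lock contention low when many processes report progress at once, at the cost of the counter lagging by up to update_every reads per worker.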

dynast.preprocessing.coverage.calculate_coverage(bam_path: str, conversions: Dict[str, Set[int]], coverage_path: str, alignments: Optional[List[Tuple[str, int]]] = None, umi_tag: Optional[str] = None, barcode_tag: Optional[str] = None, gene_tag: str = 'GX', barcodes: Optional[List[str]] = None, temp_dir: Optional[str] = None, velocity: bool = True) → str

Calculate coverage of each genomic position per barcode.

Parameters
bam_path

Path to alignment BAM file

conversions

Dictionary mapping each contig to the set of genomic positions where conversions were observed

coverage_path

Path to write coverage CSV

alignments

List of (read_id, alignment_index) tuples to process. All alignments are processed if this option is not provided.

umi_tag

BAM tag that encodes the UMI. If not provided, NA is output in the umi column.

barcode_tag

BAM tag that encodes the cell barcode. If not provided, NA is output in the barcode column.

gene_tag

BAM tag that encodes gene assignment

barcodes

List of barcodes to be considered. All barcodes are considered if not provided.

temp_dir

Path to temporary directory

velocity

Whether or not velocities were assigned

Returns

Path to coverage CSV
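The conversions argument has the type Dict[str, Set[int]]. A minimal sketch of assembling such a dictionary from (contig, position) pairs follows. The helper build_conversions, the sample sites, and the file names in the comment are hypothetical, and the sketch makes no assumption about whether positions are 0- or 1-indexed.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def build_conversions(sites: Iterable[Tuple[str, int]]) -> Dict[str, Set[int]]:
    """Hypothetical helper: group genomic positions by contig."""
    conversions = defaultdict(set)
    for contig, position in sites:
        conversions[contig].add(position)
    # Return a plain dict so later lookups do not create empty entries.
    return dict(conversions)


# Made-up sites; the duplicate ("chr1", 100) collapses into one set entry.
sites = [("chr1", 100), ("chr1", 105), ("chr2", 7), ("chr1", 100)]
conversions = build_conversions(sites)

# The result could then be passed as the `conversions` argument, for example:
# calculate_coverage("aligned.bam", conversions, "coverage.csv")
# where both paths are placeholders.
```

Using sets means a position observed in several reads is recorded once per contig, which matches the declared Set[int] value type.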