dynast.preprocessing.coverage
Module Contents
Functions
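| read_coverage(coverage_path) | Read coverage CSV as a dictionary. |
| calculate_coverage_contig(counter, lock, bam_path, contig, indices, ...) | Calculate coverage for a specific contig. This function is designed to be called as a separate process. |
| calculate_coverage(bam_path, conversions, coverage_path, ...) | Calculate coverage of each genomic position per barcode. |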
Attributes
- dynast.preprocessing.coverage.read_coverage(coverage_path: str) -> Dict[str, Dict[int, int]]
Read coverage CSV as a dictionary.
- Parameters
- coverage_path
Path to coverage CSV
- Returns
Coverage as a nested dictionary
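To make the nested return shape concrete, here is a minimal sketch of how a coverage CSV could be loaded into a `Dict[str, Dict[int, int]]`. It is not dynast's implementation: the function name `read_coverage_sketch` and the column names `contig`, `genome_i`, and `coverage` are assumptions made for illustration, as is the reading of the outer key as a contig and the inner key as a genomic position.

```python
import csv
import os
import tempfile
from typing import Dict


def read_coverage_sketch(coverage_path: str) -> Dict[str, Dict[int, int]]:
    """Read a coverage CSV into {contig: {position: coverage}}.

    Assumes a header row with the hypothetical columns
    `contig`, `genome_i`, and `coverage`.
    """
    coverage: Dict[str, Dict[int, int]] = {}
    with open(coverage_path, newline="") as f:
        for row in csv.DictReader(f):
            # Create the inner dictionary for a contig the first time it is seen.
            positions = coverage.setdefault(row["contig"], {})
            positions[int(row["genome_i"])] = int(row["coverage"])
    return coverage


# Write a tiny example file and read it back.
with tempfile.TemporaryDirectory() as tmp:
    example_path = os.path.join(tmp, "coverage.csv")
    with open(example_path, "w", newline="") as f:
        f.write("contig,genome_i,coverage\nchr1,100,3\nchr1,101,4\nchr2,7,1\n")
    example_coverage = read_coverage_sketch(example_path)
```

With this layout, the coverage at a position is a two-step lookup such as `example_coverage["chr1"][100]`.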
- dynast.preprocessing.coverage.calculate_coverage_contig(counter: multiprocessing.Value, lock: multiprocessing.Lock, bam_path: str, contig: str, indices: List[Tuple[int, int, int]], alignments: Set[Tuple[str, int]] = None, umi_tag: Optional[str] = None, barcode_tag: Optional[str] = None, gene_tag: str = 'GX', barcodes: Optional[List[str]] = None, temp_dir: Optional[str] = None, update_every: int = 50000, velocity: bool = True) -> str
Calculate coverage for a specific contig. This function is designed to be called as a separate process.
- Parameters
- counter
Counter that keeps track of how many reads have been processed
- lock
Lock guarding the counter so that multiple processes do not modify it at the same time
- bam_path
Path to alignment BAM file
- contig
Only reads that map to this contig will be processed
- indices
Genomic positions to consider
- alignments
Set of (read_id, alignment_index) tuples to process. All alignments are processed if this option is not provided.
- umi_tag
BAM tag that encodes the UMI. If not provided, NA is output in the umi column.
- barcode_tag
BAM tag that encodes the cell barcode. If not provided, NA is output in the barcode column.
- gene_tag
BAM tag that encodes gene assignment. Defaults to GX.
- barcodes
List of barcodes to consider. All barcodes are considered if not provided.
- temp_dir
Path to temporary directory
- update_every
Update the counter every this many reads
- velocity
Whether or not velocities were assigned
- Returns
Path to coverage CSV
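The `counter`, `lock`, and `update_every` parameters describe a shared progress counter that is updated in batches. The following is a minimal sketch of that pattern, not dynast's code: `process_reads_sketch` is a hypothetical helper, the reads are placeholder strings rather than BAM records, and flushing the remainder at the end is an assumption of this sketch.

```python
import multiprocessing
from typing import Iterable


def process_reads_sketch(counter, lock, reads: Iterable[str], update_every: int = 50000) -> int:
    """Process reads and bump a shared counter once per `update_every` reads."""
    n = 0
    for n, _read in enumerate(reads, start=1):
        # Per-read work would happen here.
        if n % update_every == 0:
            # Take the lock only once per batch to keep contention low.
            with lock:
                counter.value += update_every
    # Account for reads left over after the last full batch.
    remainder = n % update_every
    if remainder:
        with lock:
            counter.value += remainder
    return n


# Shared unsigned integer without a built-in lock; the explicit lock guards it.
counter = multiprocessing.Value("I", 0, lock=False)
lock = multiprocessing.Lock()
processed = process_reads_sketch(
    counter, lock, (f"read{i}" for i in range(125)), update_every=50
)
```

Batching the updates means each worker acquires the lock once per `update_every` reads rather than once per read, which matters when several contigs are processed in parallel.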
- dynast.preprocessing.coverage.calculate_coverage(bam_path: str, conversions: Dict[str, Set[int]], coverage_path: str, alignments: Optional[List[Tuple[str, int]]] = None, umi_tag: Optional[str] = None, barcode_tag: Optional[str] = None, gene_tag: str = 'GX', barcodes: Optional[List[str]] = None, temp_dir: Optional[str] = None, velocity: bool = True) -> str
Calculate coverage of each genomic position per barcode.
- Parameters
- bam_path
Path to alignment BAM file
- conversions
Dictionary of contigs as keys and sets of genomic positions as values that indicates positions where conversions were observed
- coverage_path
Path to write coverage CSV
- alignments
List of (read_id, alignment_index) tuples to process. All alignments are processed if this option is not provided.
- umi_tag
BAM tag that encodes the UMI. If not provided, NA is output in the umi column.
- barcode_tag
BAM tag that encodes the cell barcode. If not provided, NA is output in the barcode column.
- gene_tag
BAM tag that encodes gene assignment
- barcodes
List of barcodes to consider. All barcodes are considered if not provided.
- temp_dir
Path to temporary directory
- velocity
Whether or not velocities were assigned
- Returns
Path to coverage CSV
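As an illustration of what the `conversions` argument restricts, the sketch below counts coverage only at positions listed in a `Dict[str, Set[int]]`, grouped by barcode. It is a simplified stand-in rather than dynast's algorithm: `coverage_at_positions_sketch` is a hypothetical name, reads are modeled as `(barcode, contig, start, end)` tuples with a half-open interval instead of BAM alignments, and spliced alignments are ignored.

```python
from collections import Counter
from typing import Dict, Iterable, Set, Tuple

Read = Tuple[str, str, int, int]  # (barcode, contig, start, end), end exclusive


def coverage_at_positions_sketch(
    reads: Iterable[Read], conversions: Dict[str, Set[int]]
) -> Counter:
    """Count reads covering each position of interest, per barcode."""
    coverage: Counter = Counter()
    for barcode, contig, start, end in reads:
        # Only positions where conversions were observed are counted.
        for position in conversions.get(contig, ()):
            if start <= position < end:
                coverage[(barcode, contig, position)] += 1
    return coverage


reads = [
    ("AAAC", "chr1", 100, 150),
    ("AAAC", "chr1", 120, 170),
    ("TTTG", "chr1", 140, 190),
    ("TTTG", "chr2", 0, 50),
]
conversions = {"chr1": {125, 145}, "chr2": {60}}
result = coverage_at_positions_sketch(reads, conversions)
```

Here position 60 on `chr2` never appears in `result` because no read covers it, and `Counter` returns 0 for any key that was not counted.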