dynast.preprocessing.consensus
Module Contents
Functions
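call_consensus_from_reads | Call a single consensus alignment given a list of aligned reads.
call_consensus_from_reads_process | Helper function to call call_consensus_from_reads() from a subprocess.
consensus_worker | Multiprocessing worker.
call_consensus | Call consensus sequences from BAM.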
- dynast.preprocessing.consensus.call_consensus_from_reads(reads: List[pysam.AlignedSegment], header: pysam.AlignmentHeader, quality: int = 27, tags: Optional[Dict[str, Any]] = None) -> pysam.AlignedSegment
Call a single consensus alignment given a list of aligned reads.
All reads must map to the same contig; results are undefined otherwise. Consensus bases are called only for positions that align to the reference (i.e. insertions are not called).
This function sets only the minimal set of attributes needed for the alignment to be valid:
* read name: SHA256 hash of the provided read names
* read sequence and qualities
* reference name and ID
* reference start
* mapping quality (MAPQ)
* cigarstring
* MD tag
* NM tag
* flags: not unmapped, paired, duplicate, QC fail, secondary, or supplementary
The caller is expected to further populate the alignment with additional tags, flags, and name.
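The read-name attribute above can be pictured with a short sketch. The page only says that the name is a SHA256 hash of the provided read names; how dynast orders or joins the names before hashing is not stated, so the de-duplication, sorting, and newline separator below are assumptions, and `consensus_read_name` is a hypothetical helper, not part of dynast.

```python
import hashlib
from typing import Iterable


def consensus_read_name(read_names: Iterable[str]) -> str:
    """Derive a deterministic name from a group of read names (illustrative only)."""
    # Assumption: de-duplicate, sort, and join with a newline so the result does
    # not depend on input order. dynast may combine the names differently.
    joined = "\n".join(sorted(set(read_names)))
    # SHA256 hex digests are always 64 hexadecimal characters long.
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


name = consensus_read_name(["read_2", "read_1", "read_3"])
```

A name built this way is stable across runs, which makes it convenient for the caller to overwrite or extend when populating the remaining attributes.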
- Parameters
- reads
List of reads to call a consensus sequence from
- header
Header to use when creating the new pysam alignment
- quality
Quality threshold
- tags
Additional tags to set
- Returns
New pysam alignment of the consensus sequence
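To make the idea of a quality-aware consensus concrete, here is a minimal sketch on toy data that does not use pysam. It is not dynast's implementation: the read representation (`ToyRead`), the function `toy_consensus`, the quality-sum vote, and the choice to mask weakly supported positions with `N` are all assumptions made for illustration. In particular, how dynast applies the `quality` threshold is not described on this page.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical toy read: (reference start, bases, per-base qualities), ungapped.
ToyRead = Tuple[int, str, List[int]]


def toy_consensus(reads: List[ToyRead], quality: int = 27) -> Tuple[int, str]:
    """Return (reference start, consensus sequence) via a quality-weighted vote."""
    # scores[position][base] = sum of qualities supporting that base.
    scores: Dict[int, Dict[str, int]] = defaultdict(lambda: defaultdict(int))
    for start, bases, quals in reads:
        for offset, (base, qual) in enumerate(zip(bases, quals)):
            scores[start + offset][base] += qual
    if not scores:
        raise ValueError("at least one non-empty read is required")

    left, right = min(scores), max(scores)
    consensus = []
    for pos in range(left, right + 1):
        per_base = scores.get(pos)
        if not per_base:
            consensus.append("N")  # no read covers this position
            continue
        # Highest summed quality wins; ties break on the base letter.
        base, score = max(per_base.items(), key=lambda kv: (kv[1], kv[0]))
        # Assumption: positions with weak support are masked with N.
        consensus.append(base if score >= quality else "N")
    return left, "".join(consensus)


reads = [
    (100, "ACGT", [30, 30, 30, 30]),
    (102, "GTTA", [30, 30, 10, 30]),
    (102, "GTAA", [30, 30, 30, 30]),
]
start, sequence = toy_consensus(reads)
```

At position 104 the two overlapping reads disagree (`T` with quality 10 versus `A` with quality 30), and the better-supported `A` is chosen.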
- dynast.preprocessing.consensus.call_consensus_from_reads_process(reads, header, tags, strand=None, quality=27)
Helper function to call call_consensus_from_reads() from a subprocess.
- dynast.preprocessing.consensus.consensus_worker(args_q, results_q, *args, **kwargs)
Multiprocessing worker.
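The page does not describe what `consensus_worker` reads from `args_q` or how it is told to stop, so the following is only a generic sketch of the queue-worker pattern that the `args_q` / `results_q` signature suggests. The `None` sentinel, the `queue_worker` and `run_demo` names, and the use of a thread in place of a process are assumptions chosen to keep the example small.

```python
import queue
import threading
from typing import Any, Callable, List

SENTINEL = None  # assumption: a None item tells the worker to stop


def queue_worker(args_q, results_q, func: Callable[..., Any]) -> None:
    """Pull argument tuples from args_q, apply func, and push results to results_q."""
    while True:
        item = args_q.get()
        if item is SENTINEL:
            break
        results_q.put(func(*item))


def run_demo() -> List[int]:
    # queue.Queue and a thread keep the demo in one process; multiprocessing.Queue
    # exposes the same get/put methods for use with multiprocessing.Process.
    args_q: queue.Queue = queue.Queue()
    results_q: queue.Queue = queue.Queue()
    worker = threading.Thread(target=queue_worker, args=(args_q, results_q, pow))
    worker.start()
    tasks = [(2, 3), (3, 2), (10, 0)]
    for task in tasks:
        args_q.put(task)
    args_q.put(SENTINEL)
    worker.join()
    # A single worker processes tasks in FIFO order, so results keep task order.
    return [results_q.get() for _ in tasks]


results = run_demo()
```

With several workers, results would arrive in arbitrary order, so each result would need to carry an identifier for the task that produced it.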
- dynast.preprocessing.consensus.call_consensus(bam_path: str, out_path: str, gene_infos: dict, strand: typing_extensions.Literal['forward', 'reverse', 'unstranded'] = 'forward', umi_tag: Optional[str] = None, barcode_tag: Optional[str] = None, gene_tag: str = 'GX', barcodes: Optional[List[str]] = None, quality: int = 27, add_RS_RI: bool = False, temp_dir: Optional[str] = None, n_threads: int = 8) -> str
Call consensus sequences from BAM.
- Parameters
- bam_path
Path to BAM
- out_path
Output BAM path
- gene_infos
Gene information, as parsed from the GTF
- strand
Protocol strandedness
- umi_tag
BAM tag containing the UMI
- barcode_tag
BAM tag containing the barcode
- gene_tag
BAM tag containing the assigned gene
- barcodes
List of barcodes to consider
- quality
Quality threshold
- add_RS_RI
Add RS and RI BAM tags for debugging
- temp_dir
Temporary directory
- n_threads
Number of threads
- Returns
Path to sorted and indexed consensus BAM
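A call to `call_consensus` might be assembled as in the sketch below, which uses only the parameters listed above. The helper names (`build_call_consensus_kwargs`, `run_call_consensus`), the file paths, and the `UB` / `CB` tag values are hypothetical placeholders; the right tags depend on how the input BAM was produced, and `gene_infos` must come from dynast's own GTF parsing, which is not covered on this page.

```python
from typing import Any, Dict, Optional

VALID_STRANDS = ("forward", "reverse", "unstranded")


def build_call_consensus_kwargs(
    bam_path: str,
    out_path: str,
    gene_infos: dict,
    strand: str = "forward",
    umi_tag: Optional[str] = "UB",      # assumption: UMI stored in the UB tag
    barcode_tag: Optional[str] = "CB",  # assumption: barcode stored in the CB tag
    n_threads: int = 8,
) -> Dict[str, Any]:
    """Validate and collect keyword arguments for call_consensus (hypothetical helper)."""
    if strand not in VALID_STRANDS:
        raise ValueError(f"strand must be one of {VALID_STRANDS}, got {strand!r}")
    return {
        "bam_path": bam_path,
        "out_path": out_path,
        "gene_infos": gene_infos,
        "strand": strand,
        "umi_tag": umi_tag,
        "barcode_tag": barcode_tag,
        "gene_tag": "GX",  # documented default
        "quality": 27,     # documented default
        "n_threads": n_threads,
    }


def run_call_consensus(**kwargs: Any) -> str:
    """Run the real function; requires dynast to be installed."""
    from dynast.preprocessing.consensus import call_consensus

    # Per the documentation above, returns the path to the sorted, indexed BAM.
    return call_consensus(**kwargs)


# Placeholder inputs; an empty dict stands in for real parsed gene information.
kwargs = build_call_consensus_kwargs("aligned.bam", "consensus.bam", gene_infos={})
```

Passing `kwargs` to `run_call_consensus(**kwargs)` would then perform the actual consensus calling on real inputs.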