genomepy.seq.as_seqdict

genomepy.seq.as_seqdict(to_convert, genome=None, minsize=None)
genomepy.seq.as_seqdict(to_convert: list, genome=None, minsize=None)
genomepy.seq.as_seqdict(to_convert: TextIOWrapper, genome=None, minsize=None)
genomepy.seq.as_seqdict(to_convert: str, genome=None, minsize=None)
genomepy.seq.as_seqdict(to_convert: Fasta, genome=None, minsize=None)
genomepy.seq.as_seqdict(to_convert: ndarray, genome=None, minsize=None)

Convert input to a dictionary with name as key and sequence as value.

If the input contains genomic coordinates, the genome must be specified. If minsize is specified, every sequence is checked to be at least minsize long. If regions (or a region file) are used as the input, the genome can optionally be given in the region itself, using the format genome@chrom:start-end.
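To make the region format and the minsize check concrete, here is a minimal sketch in plain Python. It is not genomepy's implementation: the helper names parse_region and check_minsize, the regular expression, and the choice to raise ValueError are all assumptions made for illustration.

```python
import re
from typing import Dict, Optional, Tuple

# Hypothetical pattern for "[genome@]chrom:start-end" (not taken from genomepy).
REGION = re.compile(
    r"^(?:(?P<genome>[^@\s]+)@)?(?P<chrom>[^:\s]+):(?P<start>\d+)-(?P<end>\d+)$"
)


def parse_region(
    region: str, default_genome: Optional[str] = None
) -> Tuple[Optional[str], str, int, int]:
    """Split a region string into (genome, chrom, start, end).

    A genome given inside the region takes precedence over default_genome.
    """
    match = REGION.match(region)
    if match is None:
        raise ValueError(f"not a valid region: {region!r}")
    genome = match.group("genome") or default_genome
    return genome, match.group("chrom"), int(match.group("start")), int(match.group("end"))


def check_minsize(seqs: Dict[str, str], minsize: Optional[int]) -> None:
    """Raise ValueError if any sequence is shorter than minsize (assumed behaviour)."""
    if minsize is None:
        return
    too_short = [name for name, seq in seqs.items() if len(seq) < minsize]
    if too_short:
        raise ValueError(f"sequences shorter than {minsize}: {too_short}")


print(parse_region("hg38@chr1:100-200"))
print(parse_region("chr1:100-200", default_genome="mm10"))
check_minsize({"chr1:100-104": "ACGT"}, minsize=4)  # passes silently
```

In this sketch a region without a genome@ prefix falls back to the genome argument, which mirrors the rule that the genome has to come from somewhere whenever coordinates are used.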

Currently supported input types:

  • FASTA, BED and region files.

  • List or numpy.ndarray of regions.

  • pyfaidx.Fasta object.

  • pybedtools.BedTool object.
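The per-type signatures at the top of this page suggest that the function picks a handler based on the type of to_convert. One common way to get that behaviour in Python is functools.singledispatch. The sketch below shows the pattern only. Whether genomepy is built this way is an assumption, and describe_input and its return labels are hypothetical.

```python
from functools import singledispatch
from io import StringIO, TextIOBase


@singledispatch
def describe_input(to_convert):
    # Fallback for types that have no registered handler.
    raise NotImplementedError(f"unsupported input type: {type(to_convert).__name__}")


@describe_input.register(list)
def _(to_convert):
    # Lists are treated as collections of region strings.
    return "list of regions"


@describe_input.register(str)
def _(to_convert):
    # Strings are treated as file names in this toy example.
    return "file name"


@describe_input.register(TextIOBase)
def _(to_convert):
    # Any text stream, including open files and StringIO objects.
    return "open file handle"


print(describe_input(["chr1:100-200", "chr2:300-400"]))
print(describe_input("regions.bed"))
print(describe_input(StringIO(">seq1\nACGT\n")))
```

Each handler shares the same name-independent registration, so supporting another input type only needs one more register call and leaves the existing handlers untouched.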

Parameters
  • to_convert (list, str, pyfaidx.Fasta or pybedtools.BedTool) – Input to convert to FASTA-like dictionary

  • genome (str, optional) – Genomepy genome name.

  • minsize (int or None, optional) – If specified, check that every sequence is at least minsize long.

Returns

Dictionary with sequence names as keys and sequences as values.

Return type

dict