genomepy.functions.install_genome

genomepy.functions.install_genome(name: str, provider: Optional[str] = None, genomes_dir: Optional[str] = None, localname: Optional[str] = None, mask: Optional[str] = 'soft', keep_alt: Optional[bool] = False, regex: Optional[str] = None, invert_match: Optional[bool] = False, bgzip: Optional[bool] = None, annotation: Optional[bool] = False, only_annotation: Optional[bool] = False, skip_matching: Optional[bool] = False, skip_filter: Optional[bool] = False, threads: Optional[int] = 1, force: Optional[bool] = False, **kwargs: Optional[dict]) → Genome

Install a genome and, optionally, its gene annotation.

Parameters
  • name (str) – Genome name.

  • provider (str, optional) – Provider name. If not specified, GENCODE, Ensembl, UCSC and NCBI are tried (in that order).

  • genomes_dir (str, optional) – Where to create the output folder.

  • localname (str, optional) – Custom name for this genome.

  • mask (str, optional) – Genome masking of repetitive sequences. Options: hard/soft/none, default is soft.

  • keep_alt (bool, optional) – Some genomes contain alternative regions. These regions cause issues with sequence alignment, as they are inherently duplications of the consensus regions. Set to True to keep these alternative regions.

  • regex (str, optional) – Regular expression to select specific chromosome / scaffold names.

  • invert_match (bool, optional) – Set to True to select all chromosomes that don’t match the regex.

  • bgzip (bool, optional) – If set to True, the genome FASTA file will be compressed using bgzip, and the gene annotation will be compressed with gzip.

  • threads (int, optional) – Build genome index using multithreading (if supported). Default: 1, per the signature above.

  • force (bool, optional) – Set to True to overwrite existing files.

  • annotation (bool, optional) – If set to True, download gene annotation in BED and GTF format.

  • only_annotation (bool, optional) – If set to True, only download the gene annotation files.

  • skip_matching (bool, optional) – If set to True, contigs in the annotation not matching those in the genome will not be corrected.

  • skip_filter (bool, optional) – If set to True, the gene annotations will not be filtered to match the genome contigs.

  • kwargs (dict, optional) –

    Provider-specific options.

    • toplevel (bool, optional) – Ensembl only: always download the toplevel genome, ignoring any primary assembly.

    • version (int, optional) – Ensembl only: specify the release version. Default is the latest.

    • to_annotation (text, optional) – URL only: direct link to the annotation file. Required if the annotation is not in the same directory as the FASTA.

    • path_to_annotation (text, optional) – Local only: path to the local annotation file. Required if the annotation is not in the same directory as the FASTA.
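
The regex and invert_match parameters together decide which chromosomes or scaffolds are kept. The sketch below illustrates that selection on a plain list of names using Python's re module. The helper select_contigs, the example contig names, and the use of re.search are assumptions made for illustration; genomepy's internal matching may differ in detail.

```python
import re
from typing import List


def select_contigs(names: List[str], regex: str, invert_match: bool = False) -> List[str]:
    """Hypothetical helper (not part of genomepy).

    Keeps the names that match `regex`, or the names that do NOT
    match it when `invert_match` is True.
    """
    pattern = re.compile(regex)
    return [n for n in names if bool(pattern.search(n)) != invert_match]


# Made-up contig names in a UCSC-like style.
contigs = ["chr1", "chr2", "chrM", "chrUn_KI270742v1", "chr1_KI270706v1_random"]

# Keep only the numbered chromosomes.
kept = select_contigs(contigs, r"^chr\d+$")

# Keep everything except the numbered chromosomes.
dropped = select_contigs(contigs, r"^chr\d+$", invert_match=True)
```

Under these assumptions, passing the same pattern as regex (with or without invert_match=True) to install_genome would be expected to select the corresponding sequences.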

Returns

Genome object for the installed genome

Return type

Genome
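
A minimal usage sketch, assuming genomepy is installed and network access is available. The helper names, the genome name "GRCz11", and the release number 105 are illustrative assumptions; only the parameter names come from the signature above. The actual download is wrapped in a function so that nothing is fetched when the snippet is loaded.

```python
from typing import Any, Dict, Optional


def ensembl_install_options(release: Optional[int] = None) -> Dict[str, Any]:
    """Hypothetical helper: collect keyword arguments for install_genome."""
    options: Dict[str, Any] = {
        "provider": "Ensembl",
        "annotation": True,  # also download the gene annotation (BED and GTF)
        "mask": "soft",      # the documented default, spelled out for clarity
        "toplevel": True,    # Ensembl-only option passed through **kwargs
    }
    if release is not None:
        options["version"] = release  # Ensembl-only option: release version
    return options


def install_example(name: str = "GRCz11"):
    """Install `name` and return the resulting Genome object.

    Requires genomepy and network access, so the import is deferred
    until the function is actually called.
    """
    from genomepy.functions import install_genome

    return install_genome(name, **ensembl_install_options())


# Building the options needs neither genomepy nor a network connection.
options = ensembl_install_options(release=105)
```

Calling install_example() would then return the Genome described under Returns.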