genomepy.functions.install_genome
- genomepy.functions.install_genome(name: str, provider: Optional[str] = None, genomes_dir: Optional[str] = None, localname: Optional[str] = None, mask: Optional[str] = 'soft', keep_alt: Optional[bool] = False, regex: Optional[str] = None, invert_match: Optional[bool] = False, bgzip: Optional[bool] = None, annotation: Optional[bool] = False, only_annotation: Optional[bool] = False, skip_matching: Optional[bool] = False, skip_filter: Optional[bool] = False, threads: Optional[int] = 1, force: Optional[bool] = False, **kwargs: Optional[dict]) → Genome
Install a genome and, optionally, its gene annotation.
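A minimal usage sketch follows. The assembly name `GRCz11`, the provider choice and the thread count are illustrative assumptions, not values taken from this page, and `install` is a hypothetical wrapper. The import is deferred into the wrapper so the snippet loads without genomepy installed, and the actual call is left commented out because it downloads data.

```python
# Keyword arguments named after parameters in the signature above.
options = {
    "provider": "Ensembl",  # assumed provider; omit to let genomepy try providers in order
    "annotation": True,     # also download the gene annotation (BED and GTF)
    "mask": "soft",         # documented default, written out for clarity
    "threads": 4,           # illustrative value
}


def install(name, **kwargs):
    """Thin wrapper (hypothetical helper) around genomepy.functions.install_genome."""
    # Deferred import: genomepy is only needed when this function is called.
    from genomepy.functions import install_genome

    return install_genome(name, **kwargs)


# Requires genomepy and network access, so it is not run here:
# genome = install("GRCz11", **options)
```

Collecting the options in a dictionary keeps the call site short and makes it easy to reuse the same settings for several assemblies.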
- Parameters
name (str) – Genome name
provider (str, optional) – Provider name. If not specified, GENCODE, Ensembl, UCSC and NCBI are tried (in that order).
genomes_dir (str, optional) – Directory in which to create the output folder.
localname (str, optional) – Custom name for this genome.
mask (str, optional) – Masking of repetitive sequences in the genome. Options: hard, soft or none. Default: soft.
keep_alt (bool, optional) – Some genomes contain alternative regions. These regions cause issues with sequence alignment, as they are inherently duplications of the consensus regions. Set to True to keep these alternative regions.
regex (str, optional) – Regular expression used to select specific chromosome/scaffold names.
invert_match (bool, optional) – Set to True to select all chromosomes that don’t match the regex.
bgzip (bool, optional) – If set to True, the genome FASTA file will be compressed with bgzip and the gene annotation with gzip.
threads (int, optional) – Build the genome index using multithreading (if supported). Default: the lower of 8 and the number of available threads.
force (bool, optional) – Set to True to overwrite existing files.
annotation (bool, optional) – If set to True, download the gene annotation in BED and GTF format.
only_annotation (bool, optional) – If set to True, only download the gene annotation files.
skip_matching (bool, optional) – If set to True, contigs in the annotation that do not match those in the genome will not be corrected.
skip_filter (bool, optional) – If set to True, the gene annotations will not be filtered to match the genome contigs.
kwargs (dict, optional) – Provider-specific options.
- toplevel (bool, optional)
Ensembl only: always download the toplevel genome. Ignores a potential primary assembly.
- version (int, optional)
Ensembl only: specify the release version. Default is the latest.
- to_annotation (text, optional)
URL only: direct link to the annotation file. Required if the annotation is not in the same directory as the FASTA file.
- path_to_annotation (text, optional)
Local only: path to the local annotation file. Required if the annotation is not in the same directory as the FASTA file.
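The interaction of `regex` and `invert_match` can be illustrated with the standard `re` module. This is a sketch of the selection logic only, not genomepy's implementation: the helper `select_contigs`, the contig names, and the use of `re.search` (rather than a full match) are assumptions made for illustration.

```python
import re


def select_contigs(names, regex, invert_match=False):
    """Return the names kept by a regex filter (hypothetical helper)."""
    pattern = re.compile(regex)
    # Keep a name when "it matches" differs from "invert_match":
    # matches are kept by default, non-matches when invert_match is True.
    return [n for n in names if bool(pattern.search(n)) != invert_match]


contigs = ["chr1", "chr2", "chrM", "chrUn_KI270742v1"]  # made-up example names

# Select only the numbered chromosomes.
kept = select_contigs(contigs, r"^chr[0-9]+$")

# Same pattern, inverted: everything that is not a numbered chromosome.
dropped = select_contigs(contigs, r"^chr[0-9]+$", invert_match=True)
```

With `install_genome`, the same pattern would be passed as the `regex` argument, adding `invert_match=True` to select the complement instead.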
- Returns
Genome class with the installed genome
- Return type