Note: This package is in active development and functionality might change or not work correctly (yet)!
PeakSQL
Dynamic machine learning database for genomics. Supports common BED-like data formats such as .bed, .narrowPeak, and .bedGraph, as well as the binary bigWig format.
Installation
PeakSQL can be installed through pip:
pip install peaksql
Or installed with Conda (hosted on Bioconda):
conda install peaksql
And finally, installed from source:
git clone https://github.com/vanheeringen-lab/peaksql
cd peaksql
pip install .
Getting started
import peaksql
# paths to our files
db_file = "peakSQL.sqlite"  # where to store our database
assembly = "/path/to/hg38.fa"
data = "binding_sites.bed"
# load data into database
db = peaksql.database.DataBase(db_file)
db.add_assembly(assembly, assembly="hg38", species="human")
db.add_data(data, assembly="hg38")
# now load as dataset
dataset = peaksql.BedDataSet(db_file, seq_length=101, stride=200)
# use the dataset in your application
for seq, label in dataset:
    ...
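To build some intuition for what a sequence/label pair might represent, here is a minimal standalone sketch. It does not use PeakSQL and is not its implementation. It assumes that seq_length is the width of each window, that stride is the distance between consecutive window starts, and that a window is labelled 1 when it overlaps any BED interval. The helpers parse_bed, windows, and label_windows are hypothetical names made up for this illustration.

```python
from typing import Iterator, List, Tuple

# (chrom, start, end); BED coordinates are 0-based and half-open
Interval = Tuple[str, int, int]


def parse_bed(text: str) -> List[Interval]:
    """Parse the first three columns of BED-formatted text."""
    intervals = []
    for line in text.splitlines():
        line = line.strip()
        # skip blank lines and the optional header/comment lines
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        intervals.append((chrom, int(start), int(end)))
    return intervals


def windows(chrom_length: int, seq_length: int, stride: int) -> Iterator[Tuple[int, int]]:
    """Yield (start, end) windows of width seq_length, stepping by stride."""
    for start in range(0, chrom_length - seq_length + 1, stride):
        yield start, start + seq_length


def label_windows(
    chrom: str,
    chrom_length: int,
    intervals: List[Interval],
    seq_length: int,
    stride: int,
) -> List[Tuple[int, int, int]]:
    """Label each window 1 if it overlaps any interval on chrom, else 0."""
    labelled = []
    for start, end in windows(chrom_length, seq_length, stride):
        # half-open intervals overlap when each starts before the other ends
        hit = any(c == chrom and s < end and start < e for c, s, e in intervals)
        labelled.append((start, end, int(hit)))
    return labelled


# toy input: two peaks on a 1000 bp "chromosome" (assumed values)
bed_text = "chr1\t150\t260\nchr1\t900\t950\n"
peaks = parse_bed(bed_text)
result = label_windows("chr1", 1000, peaks, seq_length=101, stride=200)

for start, end, label in result:
    print(start, end, label)
```

Under these assumptions, a stride larger than seq_length (as in the example above) leaves gaps between windows, while a stride smaller than seq_length makes neighbouring windows overlap.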