glow-solution-accelerator

About Glow

Glow is an open-source toolkit for working with genomic data at biobank scale and beyond. The toolkit is built natively on Apache Spark, the leading unified engine for big data processing and machine learning, which brings the scale of the cloud to genomics workflows.

This accelerator demonstrates how to run sample Glow workloads as a multi-task job in Databricks on AWS and Azure.

To run this accelerator, clone this repo into a Databricks workspace. Attach the RUNME notebook to any cluster running DBR 11.0 or later, and execute the notebook via Run All. The notebook creates a multi-task job describing the accelerator pipeline and provides a link to it. Execute the multi-task job to see how the pipeline runs.

The job configuration is written in the RUNME notebook in JSON format. The cost associated with running the accelerator is the user's responsibility.
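
As a rough illustration of what a multi-task job configuration in JSON can look like, the sketch below builds a small job definition in the shape of the Databricks Jobs API 2.1 (`tasks`, `depends_on`, `job_clusters`) and checks the order in which its tasks would run. It is not the configuration shipped in RUNME: the job name, task keys, cluster key, node type, worker count, and runtime version are all assumptions chosen for the example, and the notebook paths simply reuse two file names from this repo.

```python
import json

# Hypothetical job definition in the shape of the Databricks Jobs API 2.1.
# Names, cluster settings, and task layout are illustrative assumptions,
# not the actual configuration from the RUNME notebook.
job_json = {
    "name": "glow-solution-accelerator-example",
    "job_clusters": [
        {
            "job_cluster_key": "glow_cluster",
            "new_cluster": {
                "spark_version": "11.3.x-scala2.12",  # assumed DBR 11.x runtime
                "node_type_id": "i3.xlarge",  # AWS node type; Azure uses other names
                "num_workers": 2,
            },
        }
    ],
    "tasks": [
        {
            "task_key": "setup_constants",
            "job_cluster_key": "glow_cluster",
            "notebook_task": {"notebook_path": "0_setup_constants_glow"},
        },
        {
            "task_key": "setup_metadata",
            "depends_on": [{"task_key": "setup_constants"}],
            "job_cluster_key": "glow_cluster",
            "notebook_task": {"notebook_path": "2_setup_metadata"},
        },
    ],
}


def task_run_order(job):
    """Return task keys in an order that respects every depends_on entry."""
    deps = {
        t["task_key"]: {d["task_key"] for d in t.get("depends_on", [])}
        for t in job["tasks"]
    }
    unknown = {d for ds in deps.values() for d in ds} - set(deps)
    if unknown:
        raise ValueError(f"depends_on references unknown tasks: {sorted(unknown)}")

    order = []
    while deps:
        # Tasks whose dependencies have all been scheduled are ready to run.
        ready = sorted(k for k, ds in deps.items() if not ds)
        if not ready:
            raise ValueError("dependency cycle between tasks")
        for key in ready:
            order.append(key)
            del deps[key]
        for ds in deps.values():
            ds.difference_update(ready)
    return order


# Serialize the definition to the JSON text a job-creation request would carry.
payload = json.dumps(job_json, indent=2)
```

Calling `task_run_order(job_json)` on this definition places `setup_constants` before `setup_metadata`, which mirrors how a multi-task job runs a task only after the tasks it depends on have finished.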

Repository contents

.github/workflows
dbsql
etl
tertiary
.gitignore
0_setup_constants_glow.py
1_setup_constants_hail.py
2_setup_metadata.py
CONTRIBUTING.md
LICENSE
NOTICE
README.md
RUNME.py
SECURITY.md
