Training resources for Accelerating Data Engineering Pipelines on Baskerville HPC. This course covers:
- Setting up your environment on Baskerville
- Data on the Hardware Level with Pandas, cuDF and Dask
- Data Visualisation with Plotly
- Final Challenge
To take this course, you will need a registered account on Baskerville. Details for requesting access can be found here.
This course is aimed at beginners; however, some familiarity with the following may be beneficial:
- Python
- Jupyter notebooks
- Pandas
This work is licensed under the GNU General Public License v3.0. See LICENSE.md for more information.
Email us: baskerville-tier2-support@contacts.bham.ac.uk
Project Link: https://github.com/baskerville-hpc/data-engineering
Baskerville is funded by the EPSRC and UKRI through the World Class Labs scheme (EP/T022221/1) and the Digital Research Infrastructure programme (EP/W032244/1).