Skip to content

Latest commit

 

History

History
7 lines (5 loc) · 785 Bytes

README.md

File metadata and controls

7 lines (5 loc) · 785 Bytes

Machine-learning-with-Dask

The following project shows and compares machine learning between Pandas DataFrames and Dask Dataframes.
Dask is a flexible library for parallel computing in Python. A Dask DataFrame is a large parallel DataFrame composed of many smaller Pandas DataFrames, split along the index. These Pandas DataFrames may live on disk for larger-than-memory computing on a single machine, or on many different machines in a cluster. One Dask DataFrame operation triggers many operations on the constituent Pandas DataFrames.

Click here for code.

Screenshot 2022-07-19 165807