This repository contains a collection of Jupyter notebooks showcasing data cleaning and dataset creation projects built with Pandas and a range of Python web scraping libraries.
Data cleaning and analysis projects (in the `analyze_data/` directory):
- Project 1: Roller Coaster Data
- Project 2: FredAPI Unemployment Data (see the sketch after this list)
- Project 3: Social Security Registered Names Data
- Project 4: IMF Exchange Rate Data
- Project 5: Stanford Open Policing Project
- Project 6: TED Talks Dataset
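
The unemployment figures in Project 2 come from the FRED (Federal Reserve Economic Data) service. A minimal sketch of what that retrieval might look like with the `fredapi` package is shown below; the API key placeholder and the choice of the `UNRATE` series are illustrative assumptions, not details taken from the notebook.

```python
import pandas as pd
from fredapi import Fred

# Assumed setup: a personal FRED API key (placeholder below) and the UNRATE
# series, which is the monthly US civilian unemployment rate.
fred = Fred(api_key="YOUR_FRED_API_KEY")

# get_series returns a pandas Series indexed by observation date.
unemployment = fred.get_series("UNRATE")

# Convert to a DataFrame for the usual Pandas cleaning workflow.
df = unemployment.to_frame(name="unemployment_rate")
df.index.name = "date"
print(df.tail())
```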
Dataset creation projects (in the `generate_data/` directory):
- Project 1: 100 Meter Olympics Dataset using `read_html` (see the sketch after this list)
- Project 2: ChatGPT Twitter Dataset using snscrape
- Project 3: Social Security Registered Names using wget
- Project 4: IMF Exchange Rates Dataset using `read_html`
- Project 5: Airline Safety using CSV
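
Several of these projects scrape HTML tables with `pandas.read_html`, which parses every `<table>` element on a page into a list of DataFrames. A minimal sketch under assumed inputs is shown below; the Wikipedia URL and the table index are illustrative, not the exact values used in the notebooks.

```python
import pandas as pd

# Hypothetical source page; the notebooks may use a different URL.
URL = "https://en.wikipedia.org/wiki/100_metres"

# read_html returns a list of DataFrames, one per HTML table found on the page
# (it requires an HTML parser such as lxml or html5lib to be installed).
tables = pd.read_html(URL)
print(f"Found {len(tables)} tables")

# Picking the first table is an assumption; inspect the list to find the one you want.
df = tables[0]
df.to_csv("data/generated/raw/raw_data.csv", index=False)
```

The IMF exchange-rates project follows the same pattern; presumably only the source URL and the selected table change.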
The repository is organized as follows:

pandas_and_beyond/
├── analyze_data/
│   ├── __init__.py
│   ├── 00_project.ipynb
│   ├── 01_project.py
│   ├── ...
├── data/
│   ├── __init__.py
│   ├── external/
│   │   ├── external_data.csv
│   │   └── ...
│   └── generated/
│       ├── raw/
│       │   ├── raw_data.csv
│       │   └── ...
│       ├── cleaned/
│       │   ├── cleaned_data.csv
│       │   └── ...
│       └── ...
├── generate_data/
│   ├── __init__.py
│   ├── 00_create_dataset.ipynb
│   ├── 01_web_scrape.ipynb
│   ├── ...
│   └── using_csv/
│       ├── __init__.py
│       ├── 00_read_csv.ipynb
│       ├── ...
│       └── ...
├── helper/
│   ├── __init__.py
│   ├── helper_function.ipynb
│   ├── helper_module.py
│   └── ...
└── tests/
    ├── __init__.py
    ├── test_.py
    └── ...
To run the notebooks, you will need:
- Python 3.6 or higher
- Pandas 1.0 or higher
- Jupyter Notebook
To get started, clone or download this repository to your local machine. Then navigate to the `analyze_data` directory and open the corresponding Jupyter notebook.
This repository is part of my continuous learning journey, inspired by the valuable contributions of many members of the Kaggle community. If you have a data cleaning project implemented with Pandas that you would like to add, please create a new branch and submit a pull request. Your contributions are highly appreciated and will help other learners looking to enhance their data cleaning skills with Pandas.