GitHub - ivo-pfaffen/kaggle-spaceship-titanic: Feed-forward neural network in PyTorch for Kaggle's "Spaceship Titanic" competition

Kaggle competition: Spaceship Titanic

For my second ML/data science project, I'm participating in Kaggle's Spaceship Titanic competition.

Data

We are provided multiple CSVs:

train.csv: personal records for about two-thirds (~8700) of the passengers, to be used as training data
test.csv: personal records for the remaining one-third (~4300) of the passengers, to be used as test data
sample_submission: a submission file in the correct format

All of these are located in the data/ directory.

The one we use for training looks like this:

PassengerId,HomePlanet,CryoSleep,Cabin,Destination,Age,VIP,RoomService,FoodCourt,ShoppingMall,Spa,VRDeck,Name,Transported
0001_01,Europa,False,B/0/P,TRAPPIST-1e,39.0,False,0.0,0.0,0.0,0.0,0.0,Maham Ofracculy,False
0002_01,Earth,False,F/0/S,TRAPPIST-1e,24.0,False,109.0,9.0,25.0,549.0,44.0,Juanna Vines,True
(...)

The goal is to predict the Transported status of the ~4,300 passengers in test.csv, based on the ~8,700 passenger records in train.csv.

Approach

Our first task is to load and preprocess the data so that we can feed it into our neural network for training. As the sample shows, much of the data is non-numeric. We are going to encode each column that contains non-numeric data, then do some feature engineering to improve model performance.
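As an illustration of this step, here is a minimal sketch using pandas on the two sample rows shown above. It is not the notebook's actual code: the preprocess helper, the choice of one-hot encoding, and the TotalSpend feature are assumptions made for the example, and missing values (which the real data contains) are not handled.

```python
import io

import pandas as pd

# Two sample rows copied from the train.csv excerpt above.
SAMPLE_CSV = """PassengerId,HomePlanet,CryoSleep,Cabin,Destination,Age,VIP,RoomService,FoodCourt,ShoppingMall,Spa,VRDeck,Name,Transported
0001_01,Europa,False,B/0/P,TRAPPIST-1e,39.0,False,0.0,0.0,0.0,0.0,0.0,Maham Ofracculy,False
0002_01,Earth,False,F/0/S,TRAPPIST-1e,24.0,False,109.0,9.0,25.0,549.0,44.0,Juanna Vines,True
"""

SPEND_COLS = ["RoomService", "FoodCourt", "ShoppingMall", "Spa", "VRDeck"]


def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Return an all-numeric feature frame (hypothetical helper, not from the notebook)."""
    out = df.copy()

    # Cabin has the form "deck/num/side"; split it into three separate features.
    cabin = out["Cabin"].str.split("/", expand=True)
    out["Deck"] = cabin[0]
    out["CabinNum"] = pd.to_numeric(cabin[1])
    out["Side"] = cabin[2]

    # Assumed engineered feature: total amount spent across all amenities.
    out["TotalSpend"] = out[SPEND_COLS].sum(axis=1)

    # Boolean flags become 0.0 / 1.0.
    for col in ["CryoSleep", "VIP"]:
        out[col] = out[col].astype(float)

    # Drop identifiers and the raw Cabin string, then one-hot encode the categoricals.
    out = out.drop(columns=["PassengerId", "Name", "Cabin"])
    return pd.get_dummies(
        out, columns=["HomePlanet", "Destination", "Deck", "Side"], dtype=float
    )


train = pd.read_csv(io.StringIO(SAMPLE_CSV))
X = preprocess(train.drop(columns=["Transported"]))  # numeric features
y = train["Transported"].astype(float)               # binary target
print(X.columns.tolist())
```

One-hot encoding is used here because none of the categorical columns has a natural order. When encoding train.csv and test.csv separately, the resulting columns need to be aligned so both frames end up with the same features.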

After that, I intend to build a multilayer feed-forward neural network using PyTorch to predict the outcome (the Transported column) for each passenger in the test.csv file.
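To make the idea concrete, here is a minimal sketch of such a network in PyTorch. The class name TransportedNet, the layer sizes, the optimizer settings, and the random stand-in data are all assumptions for illustration. The architecture actually used is defined in the notebook.

```python
import torch
from torch import nn


class TransportedNet(nn.Module):
    """Small feed-forward binary classifier (hypothetical name and layer sizes)."""

    def __init__(self, n_features: int, hidden: int = 32) -> None:
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(n_features, hidden),
            nn.ReLU(),
            nn.Linear(hidden, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),  # one logit per passenger
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Drop the trailing dimension so the output has shape (batch,).
        return self.layers(x).squeeze(-1)


torch.manual_seed(0)
# Random stand-in data: 16 passengers with 13 numeric features each.
X = torch.randn(16, 13)
y = torch.randint(0, 2, (16,)).float()

model = TransportedNet(n_features=X.shape[1])
loss_fn = nn.BCEWithLogitsLoss()  # applies the sigmoid internally
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)

losses = []
for _ in range(50):  # full-batch training steps
    optimizer.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

# sigmoid(logit) > 0.5 is equivalent to logit > 0.
with torch.no_grad():
    predictions = model(X) > 0
```

The model outputs raw logits and leaves the sigmoid to BCEWithLogitsLoss, which is numerically more stable than applying a sigmoid followed by BCELoss. The boolean predictions match the True/False format of the Transported column.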

How to run the code

Everything is located in the Jupyter notebook. To run it, follow these steps:

Clone the repository
Create a virtual environment (python3 -m venv .venv)
Activate the virtual environment (source .venv/bin/activate)
Install the dependencies (pip install -r requirements.txt)

After that, you can open and run the Jupyter notebook in your local IDE. Just make sure that it's running inside the virtual environment (tutorial for VS Code here).

