A lot of this work is badly outdated and I am not keeping it up to date. I haven't pushed to this repo in about two years, and I was arguably not a very good scientist when I started this project. If you are looking for programming best practices, my old code is not where you will find them :)
Since I do a lot of technical tasks for data science, here are some of the ones I am allowed to share.
Currently finished tasks:
- Word2Vec corpus similarity engine
- House price regression modelling: deep learning and tree methods
- Fraud detection: imbalanced classification with tree methods
Currently on the to-do list:
- Working on a Named Entity Normalisation Engine
- Some Apache Server log querying work using only the standard library