Basic Text Mining using Python

Assignment for the Intelligent Systems course of the EIT Digital Data Science master's at UPM

Abstract

This project performs a basic analysis of a provided textual corpus on head and neck cancer medication. First, the dataset is preprocessed by filtering the SEER stage field and creating additional columns. Next, a basic word cloud is created and its results are discussed, followed by research into more advanced techniques for word cloud generation. The approaches used include TextRank, MultipartiteRank, TopicRank, PositionRank, YAKE, TF-IDF, SingleRank and a custom TextRank. The implementation is available as a Jupyter Notebook.
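As an illustration of one of the approaches listed above, the following is a minimal TF-IDF sketch using only the Python standard library. It is not the notebook's implementation: the function names (`tokenize`, `tf_idf`, `top_terms`), the short stop-word list and the toy documents are all assumptions made for this example, and the weighting is the plain `tf * log(N / df)` form without smoothing.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately short stop-word list (an assumption for this sketch).
STOP_WORDS = {"the", "a", "an", "of", "and", "for", "with", "was", "is", "in", "to", "on"}


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, drop stop words and 1-letter tokens."""
    return [
        token
        for token in re.findall(r"[a-z]+", text.lower())
        if token not in STOP_WORDS and len(token) > 1
    ]


def tf_idf(documents):
    """Return one {term: weight} dict per document, with weight = tf * log(N / df)."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears at least once.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        doc_weights = {}
        for term, count in counts.items():
            tf = count / total
            idf = math.log(n_docs / doc_freq[term])
            doc_weights[term] = tf * idf
        weights.append(doc_weights)
    return weights


def top_terms(doc_weights, k=5):
    """Return the k highest-weighted (term, weight) pairs, ties broken alphabetically."""
    return sorted(doc_weights.items(), key=lambda item: (-item[1], item[0]))[:k]


# Toy placeholder documents, invented for illustration only (not taken from the corpus).
docs = [
    "radiation therapy radiation dose",
    "chemotherapy dose reduced",
    "radiation schedule changed",
]

weights = tf_idf(docs)
for index, doc_weights in enumerate(weights):
    print(index, top_terms(doc_weights, k=3))
```

Each per-document dictionary maps terms to weights, which is the shape accepted by `WordCloud.generate_from_frequencies` in the `wordcloud` package, should the weights be rendered as a word cloud. Terms that occur in every document receive a weight of zero under this unsmoothed formula, which is why common words drop out of the resulting cloud.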

Authors

Angel Igareta (angel@igareta.com)
Cristian Abrante Dorta

angeligareta/basic-text-mining-python
