Starter repository for the Manning liveProject: [Summarize News Articles with NLP and TensorFlow](https://www.manning.com/liveproject/summarize-news-articles-with-nlp-and-tensorflow). The code `lpsumnlp20` is always good for a 35% discount on the liveProject. This repository contains intermediate files that may be helpful to learners of this liveProject. Those files reside here.
By: Souradip Chakraborty & Sayak Paul
In this liveProject, you will step into the shoes of an NLP Engineer building an automatic text summarizer for your colleagues at a news media firm. This hypothetical firm uses flashcards of broad news articles to design the front page of its blog, which is read by more than a million readers across the globe. Currently, the news editors manually summarize prospective news articles to develop the content for these flashcards, which is a very time-consuming process. The news editors will use the text summarizer to automatically generate summaries that can serve as fairly good starting points.
The text summarizer will play a crucial role in reducing the news editors' turnaround time for developing flashcard content. Your first assignment as a newly hired NLP Engineer is to develop a proof-of-concept (PoC) text summarizer so that the stakeholders can properly plan the next steps.
The steps, in brief:
- Converting an abstractive text summarization dataset into an extractive one with the help of the ROUGE score.
- Visualizing the newly prepared dataset and extracting meaningful summary statistics from it, for example, the maximum news article length, the maximum summary length, and so on.
- Preprocessing the dataset with basic NLP techniques such as tokenization, padding, and so on.
- Building deep learning models with an attention mechanism that can produce meaningful summary candidates from a news article.
- Preparing an overall report of the best-performing deep learning models from your experiments for the stakeholders.
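
The first step above, turning abstractive summaries into extractive labels, can be sketched roughly as follows. This is only a minimal illustration, not the liveProject's reference solution: it assumes a simple regex tokenizer, a hand-rolled ROUGE-1 F1 (unigram overlap), and a greedy selection strategy. The helper names `tokenize`, `rouge1_f1`, and `greedy_extractive_labels`, as well as the `max_sentences` parameter, are hypothetical and chosen for this example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase and keep alphanumeric runs (a simplifying assumption)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rouge1_f1(candidate_tokens, reference_tokens):
    """Unigram-overlap F1 between a candidate and a reference token list."""
    if not candidate_tokens or not reference_tokens:
        return 0.0
    # Counter intersection keeps the minimum count of each shared token.
    overlap = sum((Counter(candidate_tokens) & Counter(reference_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(candidate_tokens)
    recall = overlap / len(reference_tokens)
    return 2 * precision * recall / (precision + recall)


def greedy_extractive_labels(sentences, abstractive_summary, max_sentences=3):
    """Greedily add the sentence that most improves ROUGE-1 F1 of the
    selected set against the abstractive summary; stop when nothing helps.

    Returns one 0/1 label per article sentence.
    """
    reference = tokenize(abstractive_summary)
    tokenized = [tokenize(s) for s in sentences]
    selected = []
    best_score = 0.0
    while len(selected) < max_sentences:
        best_idx = None
        for i in range(len(tokenized)):
            if i in selected:
                continue
            # Score the current selection plus sentence i, in article order.
            candidate = [t for j in sorted(selected + [i]) for t in tokenized[j]]
            score = rouge1_f1(candidate, reference)
            if score > best_score:
                best_score, best_idx = score, i
        if best_idx is None:  # no remaining sentence improves the score
            break
        selected.append(best_idx)
    return [1 if i in selected else 0 for i in range(len(sentences))]


if __name__ == "__main__":
    # Toy article and summary, made up for illustration only.
    article = [
        "The city council approved a new budget on Monday.",
        "Residents gathered outside to enjoy the sunny weather.",
        "The budget increases funding for public schools.",
    ]
    summary = "Council approves budget that increases school funding."
    print(greedy_extractive_labels(article, summary))  # prints [0, 0, 1]
```

In practice you would likely swap the hand-rolled metric for an established ROUGE implementation and a proper sentence splitter; the greedy loop itself stays the same. The resulting 0/1 labels can then serve as targets for an extractive model.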