Text-Summarization Project

STEPS:

Clone the repository

git clone https://github.com/Kshitij-Nishant/Text-Summarization

STEP 01- Create a conda environment after opening the repository

conda create -n summary python=3.8 -y

conda activate summary

STEP 02- Install the requirements

pip install -r requirements.txt

STEP 03- Run template.py to create the required folders

python template.py
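
As a rough illustration of what a scaffolding script such as template.py might do, the sketch below creates a list of placeholder files together with their parent folders. The file list and the `scaffold` helper are assumptions made for illustration; the actual template.py in this repository may create different paths.

```python
from pathlib import Path

# Hypothetical file list; the real template.py may create different paths.
FILES = [
    "config/config.yaml",
    "params.yaml",
    "src/textsumarrizer/__init__.py",
    "src/textsumarrizer/components/__init__.py",
    "src/textsumarrizer/pipeline/__init__.py",
]


def scaffold(root, files=FILES):
    """Create each file (and its parent folders) under root if it is missing."""
    created = []
    for rel in files:
        path = Path(root) / rel
        # Make sure the folder exists before touching the file.
        path.parent.mkdir(parents=True, exist_ok=True)
        if not path.exists():
            path.touch()
            created.append(path)
    return created
```

Calling `scaffold(".")` from the repository root would create only the missing files, so running it a second time leaves existing files untouched.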

STEP 04- Prototype in notebooks, then move the code into the respective folders following the workflow below

WORKFLOW followed for each stage:

1. Update config.yaml
2. Update params.yaml
3. Update the entity
4. Update the configuration manager in src config
5. Update the components
6. Update the pipeline
7. Update main.py

Check the complete flow of code execution:

python main.py
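
The workflow above can be sketched in a few lines of Python: a config section becomes a typed entity, the configuration manager builds that entity, a component consumes it, and a pipeline function wires them together. The stage name `data_ingestion`, its fields, and every class name below are hypothetical, and the config is passed in as a plain dict rather than read from config.yaml so that the sketch has no dependencies.

```python
from dataclasses import dataclass
from pathlib import Path


# Entity: a typed, immutable view of one section of config.yaml.
@dataclass(frozen=True)
class DataIngestionConfig:  # hypothetical stage name
    root_dir: Path
    source_url: str


class ConfigurationManager:
    """Turns raw config values into entities, one getter per stage."""

    def __init__(self, config: dict):
        # In the project this dict would be loaded from config.yaml;
        # here it is passed in directly to keep the sketch dependency-free.
        self.config = config

    def get_data_ingestion_config(self) -> DataIngestionConfig:
        section = self.config["data_ingestion"]
        return DataIngestionConfig(
            root_dir=Path(section["root_dir"]),
            source_url=section["source_url"],
        )


class DataIngestion:
    """Component: does the work for one stage using only its entity."""

    def __init__(self, config: DataIngestionConfig):
        self.config = config

    def describe(self) -> str:
        # Placeholder for the real work (download, unzip, and so on).
        return f"would download {self.config.source_url} into {self.config.root_dir}"


def run_stage(raw_config: dict) -> str:
    """Pipeline: wire manager -> entity -> component for one stage."""
    manager = ConfigurationManager(raw_config)
    component = DataIngestion(manager.get_data_ingestion_config())
    return component.describe()
```

Keeping the entity frozen means a component cannot silently change its own configuration, which makes each stage easier to test in isolation.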

After each stage, push to GitHub with:

git add .
git commit -m "<short description of the update>"
git push origin main

STEP 05- Create the prediction pipeline in the pipeline folder

Add the prediction pipeline and update app.py
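
A prediction pipeline can be as small as one class with a `predict` method. The sketch below is a minimal version under stated assumptions: the class name, the injected `summarizer` callable, and the `first_sentence` stand-in are all hypothetical, and the stand-in is not the model used in this project.

```python
from typing import Callable


class PredictionPipeline:
    """Wraps any text -> summary callable behind a single predict() method."""

    def __init__(self, summarizer: Callable[[str], str]):
        # In the project the callable would wrap the trained model;
        # it is injected here so the sketch runs without model files.
        self.summarizer = summarizer

    def predict(self, text: str) -> str:
        text = text.strip()
        if not text:
            raise ValueError("input text must not be empty")
        return self.summarizer(text)


def first_sentence(text: str) -> str:
    """Stand-in summarizer for illustration only: returns the first sentence."""
    head, sep, _ = text.partition(". ")
    return head + "." if sep else text
```

With this shape, app.py only needs to build one `PredictionPipeline` and call `predict` from its request handler, so the web layer stays independent of how the summary is produced.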

Check the complete flow of code execution in FastAPI:

# Finally run the following command
python app.py

Now open your local host and port in a browser.

Push to Git afterwards.

Author: Kshitij Nishant
Data Scientist Practitioner
Email: kshitijnishant09@gmail.com

AWS-CICD-Deployment-with-Github-Actions

1. Login to AWS console.

2. Create IAM user for deployment

#with specific access

1. EC2 access: EC2 is a virtual machine

2. ECR: Elastic Container Registry, used to store your Docker image in AWS


#Description: About the deployment

1. Build a Docker image of the source code

2. Push your Docker image to ECR

3. Launch your EC2 instance

4. Pull your image from ECR on EC2

5. Launch your Docker image on EC2
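
The Docker side of the steps above can be sketched as a function that returns the commands as argument lists without running them. The helper name, the `latest` tag, and port 8080 are assumptions for illustration; launching the EC2 instance (step 3) and authenticating Docker against ECR are left out.

```python
def deployment_commands(registry, repository, tag="latest", port=8080):
    """Return the Docker CLI commands for build, push, pull and run.

    Nothing is executed; each command is a list of arguments.
    The tag and port defaults are assumptions, not values from this project.
    """
    image = f"{registry}/{repository}:{tag}"
    return [
        ["docker", "build", "-t", image, "."],  # step 1: build the image
        ["docker", "push", image],  # step 2: push it to ECR
        ["docker", "pull", image],  # step 4: pull it on the EC2 machine
        ["docker", "run", "-d", "-p", f"{port}:{port}", image],  # step 5: run it
    ]
```

Returning argument lists keeps the sketch free of side effects; each list could be handed to `subprocess.run` once Docker is installed and logged in to the registry.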

#Policy:

1. AmazonEC2ContainerRegistryFullAccess

2. AmazonEC2FullAccess

3. Create ECR repo to store/save docker image

- Save the URI: 381492009295.dkr.ecr.ap-south-1.amazonaws.com/textsum

4. Create EC2 machine (Ubuntu)

5. Open EC2 and install Docker on the EC2 machine:

#optional

sudo apt-get update -y

sudo apt-get upgrade

#required

curl -fsSL https://get.docker.com -o get-docker.sh

sudo sh get-docker.sh

sudo usermod -aG docker ubuntu

newgrp docker

6. Configure EC2 as self-hosted runner:

Settings > Actions > Runners > New self-hosted runner > choose the OS > then run the commands one by one

7. Set up GitHub secrets:

AWS_ACCESS_KEY_ID=

AWS_SECRET_ACCESS_KEY=

AWS_REGION = ap-south-1

AWS_ECR_LOGIN_URI = (example) 381492009295.dkr.ecr.ap-south-1.amazonaws.com

ECR_REPOSITORY_NAME = simple-app
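
A small check can catch a missing secret before a deployment step fails halfway. The sketch below assumes the workflow exposes the secrets as environment variables with the same names; the helper names and the `latest` tag are hypothetical.

```python
import os

# Names taken from the secrets listed above.
REQUIRED = (
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_REGION",
    "AWS_ECR_LOGIN_URI",
    "ECR_REPOSITORY_NAME",
)


def missing_secrets(env=None):
    """Return the names of required settings that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED if not env.get(name)]


def image_uri(env=None, tag="latest"):
    """Join the registry URI and repository name into a full image reference."""
    env = os.environ if env is None else env
    return f"{env['AWS_ECR_LOGIN_URI']}/{env['ECR_REPOSITORY_NAME']}:{tag}"
```

Passing a plain dict instead of `os.environ` makes both helpers easy to try out locally without touching real credentials.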

Demonstration:

1. For the desktop screenshots below, I provided the following text:

Input Text:

"National Aluminium Company Limited (NALCO) is a Schedule ‘A’ Navratna CPSE established on 7th January, 1981 having its registered office at Bhubaneswar. It is one of the largest integrated Bauxite-Alumina-Aluminium- Power Complex in the Country. At present, Government of India holds 51.28% of paid up equity capital. The Company has been operating its captive Panchpatmali Bauxite Mines for the pit head Alumina refinery at Damanjodi, in the District of Koraput in Odisha and Aluminium Smelter & Captive Power Plant at Angul. As a part of green initiative, NALCO has installed 198 MW Wind Power Plants at various locations in India and 850 kWp roof top Solar Power Plants at its premises to join hands for carbon neutrality. From the days of first commercial operation since 1987 the Company has continuously earned profits for last 36 years. NALCO is one of the leading foreign exchange earning CPSEs of the Country."

Output Text:

"National Aluminium Company Limited (NALCO) is a Schedule ‘A’ Navratna CPSE established on 7th January, 1981 .It is one of the largest integrated Bauxite-Alumina-Aluminium- Power Complex in the Country .Government of India holds 51.28% of paid up equity capital ."

2. Below is a demonstration on a mobile device:

For mobile, I gave a longer, multi-paragraph description of the show "Bridgerton"; below is the summary the model returned.

Input Text:

"Bridgerton is an American historical romance television series created by Chris Van Dusen for Netflix. Based on the book series by Julia Quinn, it is Shondaland's first scripted show for Netflix. The series is set during the early 1800s in an alternative London Regency era, in which George III established racial equality and granted many people of African descent aristocratic titles due to the African heritage of his wife, Queen Charlotte. The viewer is taken to observe the highly competitive social season; where young marriageable nobility and gentry are introduced into society.

The first season debuted on December 25, 2020. The second season premiered on March 25, 2022. Part one of the third season premiered on May 16, 2024, with part two following on June 13, 2024.[1] The series was renewed for a fourth season in April 2021.[2][3] In May 2023, Queen Charlotte: A Bridgerton Story, a spin-off series focused on Queen Charlotte, was released.

Bridgerton was positively received for its direction, actors' performances, production and set design, winning two Primetime Creative Arts Emmy Awards, a Make-Up Artists And Hair Stylists Guild Awards, and nominations at the Primetime Emmy Awards, Screen Actors Guild Awards, Satellite Awards and NAACP Image Awards. The music score by Kris Bowers earned a Grammy Award nomination for Best Score Soundtrack for Visual Media."

Output Text:

"Bridgerton is an American historical romance television series created by Chris Van Dusen for Netflix .Based on the book series by Julia Quinn, it is Shondaland's first scripted show for Netflix ."

In both demonstrations, the whole input is summarized into a few lines that retain the key information. This shows that the model picks out the main points and gives the user useful insight into the subject in fewer lines and in under a minute.

(PS: The number of words in the summary can also be increased.)

Use cases:

Filtering long, angry, and vulgar comments: It will be more computationally efficient if the filter model takes the summarized text from this model as its input.
Summarizing comments on a product: Users buying a product can read a summary of all the comments from previous buyers rather than going through each comment to decide whether the product is worth it.
Creating headlines for an article: In a world filled with information, a headline is the first point of contact. It grabs the reader's attention, entices them to read further, and provides a quick summary of the article or news piece, giving readers an idea of what to expect.
