This is a project that aims to provide a platform where users can upload their text or PDF files, get a summary of the content. Additional to summarization, the user can chat with our bot.
In the later version, the user will also be to ask questions about the content.
The project leverage Groq API
for chat functionality, text summarization, and for question answering. The project is built using Python, FastAPI, and Docker.
- Summarization: Summarize the text-like documents .
- Question Answering (QA): Perform QA ( added feature, not previously rquired in the project).
- Question Answering (QA): Perform QA on Text or PDFs( added feature, not previously rquired in the project) is in progress.
python>=3.8
PyPDF2
requests
fastapi
uvicorn
pytest
groq
python-dotenv
bcrypt
Install dependencies locally with:
pip install -r requirements.txt
.
├── README.md
├── backend
│ ├── Dockerfile
│ ├── app
│ │ ├── __init__.py
│ │ ├── app.py
│ │ └── main.py
│ ├── local.settings.json
│ ├── mockup
│ │ ├── SignUp.png
│ │ ├── accountSuccessfullyCreated.png
│ │ ├── loginlogin.png
│ │ └── updateUserInfoConfirmation.png
│ ├── requirements.txt
│ ├── routers
│ │ ├── __init__.py
│ │ ├── auth.py
│ │ ├── chat.py
│ │ └── chat_history.py
│ ├── tests
│ │ ├── __init__.py
│ │ ├── sample.pdf
│ │ ├── sample.txt
│ │ ├── test_auth.py
│ │ ├── tpdf.py
│ │ └── ttxt.py
│ └── utils
│ ├── __init__.py
│ ├── api.py
│ ├── auth.py
│ ├── database.py
│ ├── deps.py
│ ├── models.py
│ ├── okapi_app.db
│ ├── pdf.py
│ └── summarizer.py
├── docs.md
├── frontend
│ ├── Dockerfile
│ ├── README.md
│ ├── app
│ │ ├── admin
│ │ │ ├── login
│ │ │ │ └── page.js
│ │ │ └── page.js
│ │ ├── favicon.ico
│ │ ├── globals.css
│ │ ├── layout.js
│ │ ├── login
│ │ │ └── page.js
│ │ ├── page.js
│ │ ├── page.module.css
│ │ └── register
│ │ └── page.js
│ ├── components
│ │ ├── Footer.js
│ │ ├── Header.js
│ │ ├── ProtectedRoute.js
│ │ └── Welcome.js
│ ├── context
│ │ └── AuthContext.js
│ ├── jsconfig.json
│ ├── next.config.mjs
│ ├── package-lock.json
│ ├── package.json
│ └── tsconfig.json
├── product
│ └── Product design .jpg
└── project-tree.txt
16 directories, 54 files
To run the application, you need to install the required dependencies. You can install them locally or use Docker. If you want to run it locally, here are the steps:
- Clone the repository:
git clone https://github.com/jnlandu/text-summarization.git
- Navigate to the project directory:
cd text-summarization
- Install the dependencies:
pip install -r requirements.txt
- Run the FastAPI application:
uvicorn app.main:app --reload or fastapi dev app/main.py
Note: You will need to have the groq
API key to run the application. You can get the API key by signing up on the Groq API. Set the API key in the .env
file, with the name GROQ_API_KEY
, that is:
GROQ_API_KEY=your_api_key
Ensure that the .env
file is in the root directory of the project, like described above. The application will be available at http://localhost:8000
.
Once the application is running, you can access the FastAPI Swagger UI at http://localhost:8000/docs
. Here are the steps to access the full app:
- Authenticate the user by signing up or logging in. Click on lock icon on the top right corner of the page and Use the following credentials:
credentials:
username: admin | user
password: userpass
- Click on the "Try it out" button.
- Type or paste your text in the text area under Request body. Make user the text is enclosed in quotes.
- Click on the "Execute" button.
- The response will be displayed in the "Response body" section.
Note: As mentioned earlier, the chat functionality is also available. The document-chat functionality is in progress.
- You can pull the docker image from the docker hub and run it on your local machine or server.
- The docker image is available on the docker hub here or you can pull it using the following command:
docker pull jnlandu/api:latest
- if you have cloned the repository, you can build the docker image using the following command:
docker build -t your-preferred-image-name .
- You can run the docker container using the following command:
docker run -d -p 8000:8000 your-preferred-image-name
- Pass the API key as an environment variable:
docker run -d -p 8000:8000 -env-file ./.env your-preferred-image-name
The docker container will be available at http://localhost:8000
.
-
The project is deployed on Azure and can be accessed here.
-
CI/CD is implemented using GitHub Actions. The workflow file is available in the
.github/workflows
folder.
The project includes a basic testing structure within the /tests
folder. Ensure you have the necessary dependencies installed, and run:
pytest tests/
As for now, only the authentication tests are available. The tests for the PDF and text files are in progress.
Here are the features that are in progress:
- Add frontend for the web application (already built and will be available soon).
- The mockup for the frontend is available in the
mockup
folder. - The frontend mockup and design is built using Penpot, a free and open-source design tool, and can be accessed here.
- Can play with our prototype here and give us feedback on the design and the user experience. To show interactions, click on the "Interactions" tab on the top right corner of the page.
The backend is built using:
- Python
- FastAPI
- Docker
- Pydantic
- Azure Blob Storage for users' PDF files to be processed and allow users to chat with.
- Azure PostgreSQL for the database and chat history of users
The frontend is built using:
- HTML
- CSS
- JavaScript
- Next.js 14
- Bootsrtrap 5
Contributions are welcome! For feature requests, bug reports, or questions, please open an issue.
- Groq API
- FastAPI Documentation
- Hugging Face Transformers
- PyPDF2 Documentation
- Azure Documentation
- Docker Documentation
- Next.js Documentation
This project is licensed under the MIT License - see the LICENSE file for details.