This Streamlit application performs Document Question-Answering (QA) and Text Summarization in your preferred language, English or German.
- Select your preferred language (English or German) from the sidebar.
- Choose the task you want to perform - Document QA or Text Summarization.
- Click on the "Upload Files" section in the sidebar.
- Upload documents in any of the supported formats: PDF, Markdown, plain text, or DOCX.
- Click on the "OCR for Images" section in the sidebar.
- Upload images (PNG, JPG, JPEG) for optical character recognition (OCR).
- Click on the "Upload Audio Files and Transcribe" section in the sidebar.
- Upload audio files (MP3, WAV) for automatic transcription.
- Click on the "Import HTML" section in the sidebar.
- Enter URLs to import HTML content from websites.
- Click on the "YouTube Video" section in the sidebar.
- Enter a YouTube video URL for transcription.
- Click on the "Create Vector Database" section in the sidebar to create a database from uploaded documents.
- Click on the "Remove Files" section in the sidebar to remove all files in the data directory.
- In the Document QA task, ask the chatbot questions; it answers based on the uploaded documents.
- In the Text Summarization task, enter text in the provided text area.
- Click the "Summarize" button to generate a concise summary of the input text.
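The supported formats listed above can be summarized as a lookup from file extension to sidebar section. The following is a minimal sketch, not the app's actual code: `SECTION_BY_EXTENSION` and `section_for` are hypothetical names, and the `.md` and `.txt` extensions for Markdown and plain text are assumptions.

```python
from pathlib import Path
from typing import Optional

# Hypothetical mapping from file extension to the sidebar section that
# accepts it, built only from the formats listed in this README.
# ".md" and ".txt" are assumed extensions for Markdown and plain text.
SECTION_BY_EXTENSION = {
    ".pdf": "Upload Files",
    ".md": "Upload Files",
    ".txt": "Upload Files",
    ".docx": "Upload Files",
    ".png": "OCR for Images",
    ".jpg": "OCR for Images",
    ".jpeg": "OCR for Images",
    ".mp3": "Upload Audio Files and Transcribe",
    ".wav": "Upload Audio Files and Transcribe",
}


def section_for(filename: str) -> Optional[str]:
    """Return the sidebar section for a file name, or None if unsupported."""
    # Compare on the lowercased suffix so "REPORT.PDF" matches ".pdf".
    return SECTION_BY_EXTENSION.get(Path(filename).suffix.lower())


if __name__ == "__main__":
    for name in ["report.pdf", "scan.jpeg", "talk.wav", "archive.zip"]:
        print(name, "->", section_for(name))
```

A lookup like this makes it easy to tell which sidebar section a given file belongs in before uploading it.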
[Screenshot: the Document Question-Answering task in action]

[Screenshot: the Text Summarization task in action]
To use this Streamlit application, follow these steps:
- Clone the repository and navigate to the project directory:

      git clone https://github.com/saadkh1/DocQA-TextSummarization-App.git
      cd DocQA-TextSummarization-App

- Install the required packages from the requirements.txt file:

      pip install -r requirements.txt

- Download the necessary language models and embeddings by running the models.sh script:

      sh models.sh

- Run the Streamlit app:

      streamlit run app.py

- Open this URL in your browser: http://localhost:8501/
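If the page does not load, it can help to check whether anything is listening on port 8501 yet. The following is a small standard-library sketch, not part of the repository; `port_is_open` is a hypothetical helper name.

```python
import socket


def port_is_open(host: str = "localhost", port: int = 8501,
                 timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError (refused, timed out, unresolved
        # host) when nothing is reachable at the given address.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    state = "reachable" if port_is_open() else "not reachable yet"
    print(f"http://localhost:8501/ is {state}")
```

This only confirms that some process accepts connections on the port; it does not verify that the process is the Streamlit app.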
Alternatively, you can use Docker to run the application in a container. Make sure you have Docker installed on your system. Follow these steps:
- Clone the repository and navigate to the project directory:

      git clone https://github.com/saadkh1/DocQA-TextSummarization-App.git
      cd DocQA-TextSummarization-App

- Build the Docker image:

      docker build -t qa-summrize-app:1.0 .

- Run the Docker container:

      docker run -p 8501:8501 qa-summrize-app:1.0

- Open this URL in your browser: http://localhost:8501/
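In `docker run -p 8501:8501`, the first number is the port on the host and the second is the port inside the container. If port 8501 is already in use on your machine, you can change only the host side. The sketch below builds the command as an argument list; `docker_run_args` is a hypothetical helper, and it assumes the app inside the container listens on 8501, as the command above implies.

```python
import shlex
from typing import List


def docker_run_args(image: str = "qa-summrize-app:1.0",
                    host_port: int = 8501,
                    container_port: int = 8501) -> List[str]:
    """Build the `docker run` argument list with a HOST:CONTAINER port mapping."""
    # -p takes HOST_PORT:CONTAINER_PORT; only the host side needs to change
    # when the default port is already taken on the host.
    return ["docker", "run", "-p", f"{host_port}:{container_port}", image]


if __name__ == "__main__":
    # Prints the command for serving the app on host port 8080 instead.
    print(shlex.join(docker_run_args(host_port=8080)))
```

With a host port of 8080, the app would be opened at http://localhost:8080/ instead of the default URL.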
If you prefer to use Google Colab, you can run the app using the provided app.ipynb notebook:
- Open the app.ipynb notebook in Google Colab.
- Run all the cells in the notebook.
The notebook will start the Streamlit app and expose it using ngrok. Follow the instructions in the notebook to access the app URL.