The application consists of two scripts. The first generates a Chroma database from a given set of PDFs and stores it in the subfolder "chroma_db". The second implements a Streamlit web chatbot that uses this database to answer questions about the content of the PDFs.
An OpenAI key is required for this application (see Create an OpenAI API key).
The OpenAI key must either be set in the environment variable OPENAI_API_KEY or be passed as an argument to the scripts.
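The two ways of supplying the key can be combined in one small lookup. The following is a minimal sketch, not the code from the scripts themselves: the helper name resolve_openai_key is hypothetical, and giving the argument precedence over the environment variable is an assumption.

```python
import os
from typing import Mapping, Optional


def resolve_openai_key(cli_key: Optional[str] = None,
                       env: Mapping[str, str] = os.environ) -> str:
    """Return the OpenAI key from the argument or from OPENAI_API_KEY.

    Assumption: a key passed as an argument takes precedence over the
    environment variable. The actual scripts may order this differently.
    """
    key = cli_key or env.get("OPENAI_API_KEY")
    if not key:
        raise RuntimeError(
            "No OpenAI key found: set OPENAI_API_KEY or pass the key as an argument."
        )
    return key
```

Failing early with a clear message avoids a less obvious authentication error later, when the first request is sent to OpenAI.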
Required Python packages: chromadb, langchain, langchain-community, openai, pypdf, streamlit, tiktoken
To create the database, run the "create_db.py" script with the path to the PDFs as the first argument. The second argument is optional and can be used to pass the OpenAI key.
python3 create_db.py <path_to_pdfs> [<openai_key>]
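The argument layout of this command (one required path, one optional key) could be parsed as sketched below. This is an illustration only: the function name parse_create_db_args is hypothetical, and the real "create_db.py" may read its arguments differently, for example directly from sys.argv.

```python
import argparse
from pathlib import Path


def parse_create_db_args(argv):
    """Parse: create_db.py <path_to_pdfs> [<openai_key>]."""
    parser = argparse.ArgumentParser(prog="create_db.py")
    # Required first argument: location of the PDFs.
    parser.add_argument("path_to_pdfs", type=Path,
                        help="path to the PDFs to index")
    # Optional second argument: nargs="?" makes the positional optional.
    parser.add_argument("openai_key", nargs="?", default=None,
                        help="OpenAI key (optional if OPENAI_API_KEY is set)")
    return parser.parse_args(argv)


if __name__ == "__main__":
    import sys
    args = parse_create_db_args(sys.argv[1:])
    print(f"PDF path: {args.path_to_pdfs}")
```

When the second argument is omitted, openai_key is None, so the caller can fall back to the OPENAI_API_KEY environment variable.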
The OpenAI key must be set either in the environment variable OPENAI_API_KEY or in the "app.py" script.
To run the chatbot, start the "app.py" script with Streamlit.
streamlit run app.py