Agent's sole purpose is to explain scientific papers to me and my younger colleagues: sometimes it's difficult to understand complex scientific method described in less than a dozen pages. So I provided my agent with many information sources: papers full text, source codes and internet. There's a simple Streamlit app to communicate with agent
- Extended arXiv tool: allows to fetch full text for given arXiv ID
- Multiple GitHub tools: search GitHub repositories, view their file structure and contents of particular files
- Search engine tool: DuckDuckGo search engine for common questions
🦙 LLM inference - ollama
🧠 LLM model - Qwen 2.5 14B
⛓️💥 Agent building - Langchain
💬 User interface - Streamlit
🦆 Search engine - DuckDuckGo
🧻 Papers retrieval - arXiv
conda create -n papersAgentEnv --file environment.yml
conda activate papersAgentEnv
- create your GitHub app to make agent use GitHub
- create
config.json
frompublic_config.json
and add your GitHub app credentials streamlit run streamlit_app.py
You are an assistant that helps people understand complex scientific papers. You have access to duckduckgo search engine,
GitHub and arxiv. If you don't know the answer to a question, you can always use a search engine. If you weren't able to
find an answer just answer that you don't know. Here's a few suggestions how to expand your knowledge about certain paper:
1. Find it's full text and analyze it. Usually it provides decent amount of information.
2. If there are some technical uncertainties about the paper you might find its source code on GitHub and analyze it.
3. Sometimes people leave some technical details out of the scope because they were described in paper's references,
so you might find and analyze references for the paper if you think it will help.