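Indexing emails for RAG applications, or mining them for relevant knowledge, comes with its own challenges, such as:
- Noisy text: much of a typical message (signatures, disclaimers, quoted replies) carries no useful content
- Email threads, where the context is spread across several messages and replies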
In this project, I experimented with several prompting techniques for extracting the most important information from emails while addressing the problems above. I found that asking for a report-style summary works better than asking for a plain summary, which tends to drop a lot of important information.
The work covered the following areas:
- Noise Reduction: Implemented techniques to filter out irrelevant information and focus on key content.
- Thread Handling: Developed methods to accurately parse and summarize email threads.
- Report-Style Summaries: Found that report-style summaries retain more essential information than generic summaries.
- Prompt Engineering: Experimented with different prompt structures to enhance the extraction of valuable insights.
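To make the points above concrete, here is a minimal sketch of how noise reduction, thread handling, and a report-style prompt could fit together. It is an illustration under stated assumptions, not the exact pipeline used in this project: the function names (`clean_message`, `build_report_prompt`), the message dictionary keys (`sender`, `body`), the regular expression, and the prompt wording are all hypothetical, and it assumes plain-text, top-posted emails.

```python
import re

# Assumption: replies are introduced by a line like "On <date>, <name> wrote:".
# Real mail clients vary, so this pattern would need tuning per corpus.
QUOTE_HEADER = re.compile(r"^On .+ wrote:$")

# Hypothetical report-style prompt: named sections push the model to keep
# details that a generic "summarize this" request tends to drop.
REPORT_PROMPT = """Write a report on the email thread below.
Use these sections:
1. Participants
2. Key facts, figures, and dates
3. Decisions made
4. Open questions and action items
Keep all names, numbers, and dates.

Thread (oldest first):
{thread}
"""


def clean_message(body: str) -> str:
    """Drop quoted replies and the signature from one plain-text email body."""
    kept = []
    for line in body.splitlines():
        if line.rstrip() == "--":
            break  # conventional signature delimiter: the rest is signature
        if QUOTE_HEADER.match(line.strip()):
            break  # assumes top-posting: the rest is the quoted older message
        if line.lstrip().startswith(">"):
            continue  # inline quoted line
        kept.append(line.rstrip())
    return "\n".join(kept).strip()


def build_report_prompt(messages: list[dict]) -> str:
    """Clean each message in a thread and wrap the result in the report prompt."""
    parts = [
        f"[Message {i}] From: {msg['sender']}\n{clean_message(msg['body'])}"
        for i, msg in enumerate(messages, start=1)
    ]
    return REPORT_PROMPT.format(thread="\n\n".join(parts))
```

The resulting string can be sent to whichever LLM is in use. Cleaning each message before assembling the thread keeps quoted replies from repeating the same text several times in the prompt.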