Hands-on training on extracting structured information with LLMs. Learn prompt design, JSON parsing, and batch processing through practical Python examples.

javiervela/llm-information-extraction-workshop

LLM Information Extraction Workshop

🚀 Learn to extract structured information with LLMs locally and at scale on CESGA GPUs.

This hands-on workshop teaches you how to run LLMs with Ollama, design effective prompts, validate outputs with Pydantic, and execute remote batch jobs on the CESGA FinisTerrae III cluster.


🎯 Learning Outcomes

By the end of this workshop, you will:

  • ✅ Run and interact with LLMs locally using Ollama.
  • ✅ Design and test prompts to extract structured information.
  • ✅ Parse and validate responses programmatically.
  • ✅ Run batch jobs on CESGA’s GPU cluster.
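The prompt-design and parse-and-validate outcomes above can be sketched in a few lines. This is a minimal illustration, not the workshop's own code: the workshop validates with Pydantic, whereas this sketch swaps in a standard-library dataclass with manual type checks so it runs without extra dependencies. The `Person` schema, the prompt wording, and the `llm` callable (any function that takes a prompt string and returns the model's reply) are all hypothetical.

```python
import json
from dataclasses import dataclass
from typing import Callable


@dataclass
class Person:
    """Hypothetical target schema, used only for illustration."""
    name: str
    age: int


def build_prompt(text: str) -> str:
    """Ask the model for a JSON object with exactly the fields of Person."""
    return (
        "Extract the person's name and age from the text below.\n"
        'Reply with JSON only, in the form {"name": string, "age": integer}.\n\n'
        f"Text: {text}"
    )


def parse_person(raw: str) -> Person:
    """Parse a model reply and check field presence and types."""
    # json.JSONDecodeError is a subclass of ValueError, so callers can
    # catch ValueError for both malformed JSON and schema violations.
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    name, age = data.get("name"), data.get("age")
    if not isinstance(name, str) or not name.strip():
        raise ValueError("'name' must be a non-empty string")
    # bool is a subclass of int in Python, so reject it explicitly.
    if not isinstance(age, int) or isinstance(age, bool):
        raise ValueError("'age' must be an integer")
    return Person(name=name.strip(), age=age)


def extract(text: str, llm: Callable[[str], str]) -> Person:
    """Build the prompt, query the model, and validate its reply."""
    return parse_person(llm(build_prompt(text)))
```

Passing the model in as a plain callable keeps the validation logic testable without a running model; a stub that returns a fixed JSON string is enough to exercise it.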

📋 Prerequisites

  • Basic knowledge of Python programming.
  • Familiarity with command-line operations.
  • Modules 1–3 and 5 can be done locally; Module 4 requires CESGA FinisTerrae III access.

🚀 Quick Start

  1. Module 1 – Set up Ollama locally and configure CESGA access.
  2. Module 2 – Run your first extraction jobs with Ollama.
  3. Module 3 – Validate and save structured outputs.
  4. Module 4 – Run your scripts on CESGA GPUs.
  5. Module 5 – Analyze long texts and interview transcripts.
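As a rough preview of what running extraction jobs with Ollama might look like, the sketch below sends prompts to a locally running Ollama server over its REST API and collects the replies. It assumes Ollama is listening on its default address (`http://localhost:11434`); the model name `llama3.2` is only a placeholder for whichever model you have pulled, and the helper names `build_request`, `generate`, and `run_batch` are illustrative rather than taken from the workshop modules.

```python
import json
import urllib.request
from typing import Callable, Iterable

# Default local Ollama endpoint (assumption: server started with defaults).
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt: str, model: str = "llama3.2") -> dict:
    """Build the JSON payload for a single non-streaming generation."""
    return {
        "model": model,      # placeholder model name
        "prompt": prompt,
        "stream": False,     # return one JSON object instead of a stream
        "format": "json",    # ask Ollama to constrain the reply to JSON
    }


def generate(prompt: str, model: str = "llama3.2",
             url: str = OLLAMA_URL, timeout: float = 120.0) -> str:
    """Send one prompt to Ollama and return the generated text."""
    body = json.dumps(build_request(prompt, model)).encode("utf-8")
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )  # supplying data makes this a POST request
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read())["response"]


def run_batch(prompts: Iterable[str],
              generate_fn: Callable[[str], str] = generate) -> list:
    """Run every prompt, recording errors instead of aborting the batch."""
    results = []
    for prompt in prompts:
        try:
            results.append(
                {"prompt": prompt, "response": generate_fn(prompt), "error": None}
            )
        except Exception as exc:  # keep going so one failure does not stop the job
            results.append({"prompt": prompt, "response": None, "error": str(exc)})
    return results
```

Recording failures per item rather than raising is one reasonable choice for batch jobs, since a single malformed reply or timeout then does not discard the rest of the run.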

📂 Repository Structure

Folder                          Description
01_setup/                       Local LLM setup and CESGA access
02_basic_llm_extraction/        Basic local LLM queries & batch jobs
03_structured_llm_extraction/   Structured data extraction & validation
04_cluster_execution/           CESGA cluster job scripts
05_text_analysis/               Long text and interview analysis
data/                           Sample texts and outputs

🔗 Navigation

Start Here: Module 1 – Setup & Environment
