Add S3 transcript download and OPENAI_API_KEY validation for Lenny's Podcast example #40

Open
Copilot wants to merge 3 commits into main from copilot/update-lenny-podcast-import

Copilot AI commented Jan 29, 2026

The Lenny's Podcast example required manual transcript setup and failed silently when OPENAI_API_KEY was missing. Neo4j driver warnings cluttered the import output.

Changes

S3 transcript download

  • scripts/download_transcripts.py: Downloads and validates archive from S3
  • Makefile: Auto-downloads on the first make load-* command; adds a manual make download-transcripts target
  • Validates that 250+ transcript files were extracted, shows download progress, and supports --check-only and --force modes
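
A minimal sketch of how such a download script could be organized, using only the standard library. The archive URL is the one given in the issue comment and the 250-file threshold comes from the list above. The `--data-dir` flag, the `transcripts.zip` file name, and the function names are hypothetical. In this sketch `--check-only` inspects only the local directory, which may differ from what the PR's script does.

```python
import argparse
import urllib.request
import zipfile
from pathlib import Path

# URL from the issue comment; the names and defaults below are assumptions.
ARCHIVE_URL = (
    "https://s3.us-west-1.amazonaws.com/data.neo4j.com/"
    "lennys_podcast_transcripts_archive.zip"
)
MIN_TRANSCRIPTS = 250  # the PR validates "250+" extracted files


def count_transcripts(data_dir: Path) -> int:
    """Count .txt files anywhere under data_dir (0 if it does not exist)."""
    if not data_dir.is_dir():
        return 0
    return sum(1 for _ in data_dir.rglob("*.txt"))


def extract_archive(archive: Path, data_dir: Path) -> int:
    """Extract the zip into data_dir and return the transcript count."""
    data_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(data_dir)
    return count_transcripts(data_dir)


def main(argv=None) -> int:
    """Return 0 when enough transcripts are present, 1 otherwise."""
    parser = argparse.ArgumentParser(description="Download podcast transcripts")
    parser.add_argument("--data-dir", type=Path, default=Path("data"))  # hypothetical
    parser.add_argument("--check-only", action="store_true")
    parser.add_argument("--force", action="store_true")
    args = parser.parse_args(argv)

    found = count_transcripts(args.data_dir)
    if args.check_only:
        # Report status without downloading anything.
        return 0 if found >= MIN_TRANSCRIPTS else 1
    if found >= MIN_TRANSCRIPTS and not args.force:
        return 0  # transcripts already present; skip the download

    args.data_dir.mkdir(parents=True, exist_ok=True)
    archive = args.data_dir / "transcripts.zip"  # hypothetical file name
    urllib.request.urlretrieve(ARCHIVE_URL, archive)
    found = extract_archive(archive, args.data_dir)
    return 0 if found >= MIN_TRANSCRIPTS else 1
```

A Makefile target could call `main()` through a small entry point and treat a non-zero return value as a failed download, so that `make load-*` stops before processing incomplete data.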

OPENAI_API_KEY validation

  • backend/src/config.py: Pydantic validator fails fast at startup with actionable error
  • scripts/load_transcripts.py: Pre-flight check before processing
  • Rejects empty/placeholder values, validates sk-* format
# Excerpt from the settings class; needs: from pydantic import SecretStr, field_validator
@field_validator("openai_api_key")
@classmethod
def validate_openai_api_key(cls, v: SecretStr) -> SecretStr:
    # Reject a missing, empty, or whitespace-only key.
    if not v or not v.get_secret_value().strip():
        error_msg = (
            "\n\n" + "=" * 70 + "\n"
            "ERROR: OPENAI_API_KEY is required but not set\n"
            # ... actionable instructions ...
        )
        # Raise the full banner so the instructions reach the user.
        raise ValueError(error_msg)
    return v
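
The pre-flight check and the placeholder/format rules listed above could look roughly like the following standard-library sketch. The function names, the placeholder strings, and the error wording are assumptions for illustration; the PR does not show which placeholder values it rejects.

```python
import os

# Hypothetical placeholder strings; the PR does not list the ones it rejects.
PLACEHOLDERS = {"your-api-key-here", "sk-your-key-here", "changeme"}


def check_openai_api_key(value):
    """Return an error message, or None when the key looks usable."""
    key = (value or "").strip()
    if not key:
        return "OPENAI_API_KEY is required but not set"
    if key.lower() in PLACEHOLDERS:
        return "OPENAI_API_KEY is still set to a placeholder value"
    if not key.startswith("sk-"):
        return "OPENAI_API_KEY should start with 'sk-'"
    return None


def preflight(environ=os.environ):
    """Stop before any processing starts if the key is missing or malformed."""
    error = check_openai_api_key(environ.get("OPENAI_API_KEY"))
    if error:
        raise SystemExit(f"ERROR: {error}")
```

Calling `preflight()` at the top of a loader script means a bad key aborts the run before any transcript is read, rather than failing partway through an import.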

Neo4j driver warning suppression

  • src/neo4j_agent_memory/graph/client.py: Filter deprecation warnings, set neo4j.* loggers to WARNING
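
One way to get this effect with the standard `warnings` and `logging` modules is sketched below. The function name is hypothetical, and the module-based filter is an assumption: it only hides warnings that Python attributes to a `neo4j` module, so warnings attributed to the calling application code would need a message-based filter instead.

```python
import logging
import warnings


def quiet_neo4j_driver() -> None:
    """Hide driver deprecation warnings and lower driver log verbosity."""
    # Ignore DeprecationWarnings attributed to the neo4j package or its
    # submodules. The regex deliberately does not match other packages
    # whose names merely start with "neo4j", such as neo4j_agent_memory.
    warnings.filterwarnings(
        "ignore",
        category=DeprecationWarning,
        module=r"neo4j(\.|$)",
    )
    # Loggers under the "neo4j" namespace inherit this level unless a
    # level has been set on them explicitly.
    logging.getLogger("neo4j").setLevel(logging.WARNING)
```

Calling this once before the driver is created keeps INFO and DEBUG driver output out of the import log while still surfacing warnings and errors.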

Documentation

  • README: OpenAI API key marked as required prerequisite with use cases
  • .env.example: Detailed API key documentation with setup link
  • .gitignore: Exclude data/ and tmp/ directories

Tests

  • test_download_transcripts.py: Unit tests for download/validation, integration test for S3 availability
  • test_config_validation.py: API key validation edge cases
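
The S3 availability check could be built around a helper like the one below. This is a sketch, not the PR's test code: the helper name and the injectable `opener` parameter are assumptions, added so the logic can be exercised without network access. The URL is the one from the issue comment.

```python
import urllib.request

ARCHIVE_URL = (
    "https://s3.us-west-1.amazonaws.com/data.neo4j.com/"
    "lennys_podcast_transcripts_archive.zip"
)


def archive_is_available(url=ARCHIVE_URL, timeout=10.0,
                         opener=urllib.request.urlopen):
    """Return True when a HEAD request for the archive answers 200 OK."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener(request, timeout=timeout) as response:
            return response.status == 200
    except OSError:  # URLError and HTTPError are OSError subclasses
        return False
```

An integration test would simply assert `archive_is_available()` against the real URL, while unit tests can pass a fake `opener`. Because a blocked connection returns `False` instead of raising, a sandboxed environment such as the one described in the warning below would see a failed assertion rather than an unhandled exception.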

Warning

Firewall rules blocked me from connecting to one or more addresses

I tried to connect to the following addresses, but was blocked by firewall rules:

  • s3.us-west-1.amazonaws.com
    • Triggering command: /usr/bin/python python scripts/download_transcripts.py --check-only (dns block)

If you need me to access, download, or install something from one of these locations, you can either:

Original prompt

This section details the original issue you should resolve

<issue_title>Lenny Podcast example import</issue_title>
<issue_description>
  • disable warnings for the driver
  • and document that the OPENAI_API_KEY is required
  • probably fail fast when it is not set

But after setting these, it looks really good with the import

Comments on the Issue (you are @copilot in this section)

@johnymontana Podcast transcript files now available here: https://s3.us-west-1.amazonaws.com/data.neo4j.com/lennys_podcast_transcripts_archive.zip

Update the Lenny's Podcast example app README instructions, build scripts, and Makefile targets to download and extract the transcript .txt files from this URL. Add tests, including an integration test that verifies the file is available.</comment_new>


vercel bot commented Jan 29, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project | Deployment | Actions | Updated (UTC)
agent-memory | Ready | Preview, Comment | Feb 6, 2026 5:21pm

- Created download_transcripts.py script to download from S3
- Updated Makefile with download-transcripts target and auto-download
- Added OPENAI_API_KEY validation with clear error messages
- Suppressed Neo4j driver warnings in graph client
- Updated README with download instructions and API key requirements
- Added comprehensive tests for download and config validation
- Created .gitignore for data/ and tmp/ directories

Co-authored-by: johnymontana <1222454+johnymontana@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Update Lenny Podcast example import instructions" to "Add S3 transcript download and OPENAI_API_KEY validation for Lenny's Podcast example" Jan 29, 2026
Copilot AI requested a review from johnymontana January 29, 2026 18:03
@johnymontana johnymontana marked this pull request as ready for review February 6, 2026 17:20

Development

Successfully merging this pull request may close these issues.

Lenny Podcast example import

2 participants