Skip to content

Improve LLM context generation#138

Merged
aszenz merged 2 commits intomasterfrom
feat/improve-llm-context
Feb 6, 2026
Merged

Improve LLM context generation#138
aszenz merged 2 commits intomasterfrom
feat/improve-llm-context

Conversation

@aszenz
Copy link
Owner

@aszenz aszenz commented Feb 6, 2026

  • Auto-infer GitHub Pages URL for llms.txt
  • Add concrete URL examples and query parameter documentation
  • Optimize context for better token efficiency

Add actions/configure-pages step to automatically detect the site URL
including support for custom domains. The base_url output is passed to
the build process via SITE_URL environment variable.
Copilot AI review requested due to automatic review settings February 6, 2026 02:20
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request improves the LLM context generation for the Malloy Data Explorer by enhancing the llms.txt file with better URL handling, concrete examples, and token-optimized content.

Changes:

  • Auto-infer GitHub Pages URL using actions/configure-pages action outputs for dynamic site URL configuration
  • Add concrete URL examples and comprehensive query parameter documentation to llms.txt
  • Optimize content for token efficiency by removing redundancy and shortening descriptions

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
vite.config.ts Pass optional siteUrl from environment to llms.txt plugin
src/llms-txt/generator.ts Add siteUrl parameter support, generate concrete URL examples, add query parameter documentation, optimize content descriptions
plugins/vite-plugin-llms-txt.ts Add siteUrl option to plugin interface and pass through to generator
README.md Add comprehensive query parameter documentation with examples
.github/workflows/ci_cd.yml Use configure-pages action to auto-detect GitHub Pages URL and base path, supporting custom domains

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

### Model Schema (`/#/model/{model}`)

- **`tab`** - Active tab name
- **`expanded`** - Comma-separated list of sources to expand (e.g., `?expanded=users,orders`)
Copy link

Copilot AI Feb 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inconsistent terminology between README and generator. This line says "sources" but the implementation in Schema.tsx uses "expandedExplores" and the generator.ts says "expanded explores" (line 231). Consider changing "sources" to "explores" here to match the actual implementation and be consistent with the llms.txt documentation.

Suggested change
- **`expanded`** - Comma-separated list of sources to expand (e.g., `?expanded=users,orders`)
- **`expanded`** - Comma-separated list of explores to expand (e.g., `?expanded=users,orders`)

Copilot uses AI. Check for mistakes.
@aszenz aszenz force-pushed the feat/improve-llm-context branch 2 times, most recently from 8c2ad88 to 442df3e Compare February 6, 2026 02:29
- Add mandatory siteUrl parameter with build-time validation
- Optimize context for better token efficiency
- Add concrete examples (named query, view, custom query)
- Document all URL query parameters in README
- Validate SITE_URL is provided (uses localhost in dev mode)

This helps LLMs better understand the site structure and generate
valid, actionable URLs for users.
Copilot AI review requested due to automatic review settings February 6, 2026 02:33
@aszenz aszenz force-pushed the feat/improve-llm-context branch from 442df3e to 30ee4ae Compare February 6, 2026 02:33
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +40 to +46
function generateHeader(
siteTitle: string,
basePath: string,
siteUrl: string,
): string {
const base = basePath.endsWith("/") ? basePath.slice(0, -1) : basePath;
const fullUrl = `${siteUrl.endsWith("/") ? siteUrl.slice(0, -1) : siteUrl}${base}/`;
Copy link

Copilot AI Feb 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new URL construction logic and siteUrl integration lack test coverage. Given that the repository has comprehensive unit tests for other utilities (download-utils, notebook-parser, schema-utils), consider adding tests for the generator module to verify URL construction works correctly with various siteUrl and basePath combinations, especially edge cases like trailing slashes and empty basePath.

Copilot uses AI. Check for mistakes.
dataFiles,
notebooks,
),
generateModelsSection(models, basePath),
Copy link

Copilot AI Feb 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The generateModelsSection function uses relative URLs (basePath only) for model Browse and Download links, while the header now shows the full absolute "Site URL". For consistency and to help LLMs better understand the URLs, consider updating generateModelsSection to also use full absolute URLs by passing siteUrl to it and using the same URL construction pattern as in generateHeader and generateOverview.

Suggested change
generateModelsSection(models, basePath),
generateModelsSection(models, basePath, siteUrl),

Copilot uses AI. Check for mistakes.
Comment on lines 45 to +46
const base = basePath.endsWith("/") ? basePath.slice(0, -1) : basePath;
const fullUrl = `${siteUrl.endsWith("/") ? siteUrl.slice(0, -1) : siteUrl}${base}/`;
Copy link

Copilot AI Feb 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The URL construction logic that removes trailing slashes from both siteUrl and basePath is duplicated across generateHeader (lines 45-46) and generateOverview (lines 63-64). Consider extracting this into a shared utility function like buildFullUrl(siteUrl: string, basePath: string): string to reduce duplication and ensure consistency.

Copilot uses AI. Check for mistakes.
options: LlmsTxtPluginOptions = {},
): Plugin {
const { siteTitle = "Malloy Data Explorer", modelsDir = "models" } = options;
export default function llmsTxtPlugin(options: LlmsTxtPluginOptions): Plugin {
Copy link

Copilot AI Feb 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removing the default value {} from the options parameter makes this a breaking API change. Previously, calling llmsTxtPlugin() without arguments was valid, but now it will cause a TypeScript error since options.siteUrl is required. Consider either keeping the default empty object and making siteUrl optional (with runtime validation), or document this as a breaking change if intentional.

Copilot uses AI. Check for mistakes.
@aszenz aszenz merged commit 456bff3 into master Feb 6, 2026
12 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants