
feat(vtex-proxy): add props to exclude and add sitemap entries #1549

Open
pedrobernardina wants to merge 2 commits into deco-cx:main from oficina-dev:feat/proxy-sitemap-with-handlers

@pedrobernardina pedrobernardina commented Mar 10, 2026

What is this Contribution About?

This PR improves the sitemap flexibility of the vtex/loaders/proxy.ts loader by introducing two new configuration options.

1. excludeSiteMapEntry

Allows removing specific entries from the generated sitemap.

This is useful when the upstream sitemap contains routes that should not be exposed by the storefront or should be handled differently.
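As a rough illustration of this kind of filtering, the sketch below drops `<sitemap>` blocks whose `<loc>` matches an excluded URL or path suffix. The function name `excludeSitemapEntriesSketch` and the regex-based approach are assumptions made for illustration, not necessarily how the PR implements it.

```typescript
// Hypothetical helper for illustration only; not the PR's actual code.
const excludeSitemapEntriesSketch = (
  xml: string,
  exclude: string[] = [],
): string => {
  // Ignore empty strings, which would otherwise match every entry.
  const needles = exclude.filter((entry) => entry.length > 0);
  if (needles.length === 0) return xml;

  // Visit each <sitemap>...</sitemap> block (lazy match, across newlines).
  return xml.replace(/<sitemap>[\s\S]*?<\/sitemap>\s*/g, (block) => {
    const loc = block.match(/<loc>\s*([^<]*?)\s*<\/loc>/)?.[1] ?? "";
    // Drop the block when its <loc> equals an entry or ends with it.
    const excluded = needles.some(
      (entry) => loc === entry || loc.endsWith(entry),
    );
    return excluded ? "" : block;
  });
};
```

With an upstream index containing `/sitemap/product-0.xml` and `/custom-user-routes-1.xml`, passing `["/custom-user-routes-1.xml"]` would leave only the product entry in the returned XML.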

2. includeSiteMapWithHandler

Allows adding new sitemap entries while specifying the handler responsible for that route.

Currently, the existing includeSiteMap option only appends a path to the sitemap; it cannot associate a handler with that path. A route that needs a specific handler therefore usually requires a dedicated loader.

With includeSiteMapWithHandler, it becomes possible to register both the route and its handler directly in the proxy configuration, avoiding the need to create additional loaders for simple cases.

This change was motivated by the need to generate custom sitemap entries for content coming from external sources (e.g. Sanity CMS) without introducing additional loader boilerplate.
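A minimal sketch of the idea, assuming simplified shapes: each entry contributes its path to the sitemap index, and entries that declare a handler also produce a route. The `IncludeEntrySketch` and `RouteSketch` types and both helper names are hypothetical; the real route and handler types in the repository may look different.

```typescript
// Hypothetical, simplified shapes; the real types in the repo may differ.
interface IncludeEntrySketch {
  path: string;
  handler?: string; // assumed to be a resolvable handler identifier
}

interface RouteSketch {
  pathTemplate: string;
  handler: string;
}

// Every entry contributes its path to the sitemap index.
const toIndexPaths = (entries: IncludeEntrySketch[]): string[] =>
  entries.map((entry) => entry.path);

// Only entries that declare a handler produce a route.
const toRoutes = (entries: IncludeEntrySketch[]): RouteSketch[] => {
  const routes: RouteSketch[] = [];
  for (const entry of entries) {
    if (entry.handler) {
      routes.push({ pathTemplate: entry.path, handler: entry.handler });
    }
  }
  return routes;
};
```

Keeping the two projections separate means one configuration list can feed both the index and the router, which is what lets simple cases skip a dedicated loader.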

Example

// site.json
{
  "__resolveType": "vtex/loaders/proxy.ts",
  "includeSiteMapWithHandler": [
    {
      "path": "/sitemap/my-custom-sitemap.xml",
      "handler": "site/handlers/my-custom-sitemap.ts",
      "__title": "Custom Sitemap"
    }
  ],
  "excludeSiteMapEntry": [
    // custom routes generated by VTEX IO
    "/custom-user-routes-1.xml"
  ]
}

Issue Link

No related issue.

Loom Video

N/A

Demonstration Link

Original Sitemap
https://www.oficina.com/sitemap.xml

Sitemap
https://sites-oficina-reserva--858qfs.decocdn.com/sitemap.xml

The Post Posts
https://sites-oficina-reserva--858qfs.decocdn.com/sitemap/the-post-posts.xml

The Post Categories
https://sites-oficina-reserva--858qfs.decocdn.com/sitemap/the-post-categories.xml

Care Guide Posts
https://sites-oficina-reserva--858qfs.decocdn.com/sitemap/care-guide-posts.xml


Summary by cubic

Add flexible sitemap controls to vtex/loaders/proxy.ts so you can exclude upstream entries and include new entries with custom handlers. Updates vtex/handlers/sitemap.ts to filter sitemap index entries at runtime.

  • New Features
    • excludeSiteMapEntry: removes matching <sitemap><loc> entries from the final sitemap index (match by URL or path suffix) in vtex/handlers/sitemap.ts.
    • includeSiteMapWithHandler: registers routes and adds them to the index in one place (path + optional handler, e.g. website/handlers/sitemap.ts), avoiding extra loaders for custom sitemaps. Backwards compatible with includeSiteMap.

Written for commit 77540a7. Summary will update on new commits.

Summary by CodeRabbit

Release Notes

  • New Features
    • Added sitemap entry exclusion filtering to remove specific URLs from sitemaps
    • Enhanced sitemap entry customization with support for per-entry handlers and selective path filtering

@github-actions

Tagging Options

Should a new tag be published when this PR is merged?

  • 👍 for Patch 0.139.3 update
  • 🎉 for Minor 0.140.0 update
  • 🚀 for Major 1.0.0 update

coderabbitai bot commented Mar 10, 2026

Walkthrough

The changes extend sitemap handling by introducing entry exclusion capabilities alongside inclusion logic. A new IncludeSiteMapEntry interface is added with helper functions to normalize and transform entries. The sitemap handler now filters excluded entries, while the proxy loader propagates both inclusion and exclusion configurations through the proxy route building pipeline.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Sitemap Handler Exclusion (vtex/handlers/sitemap.ts) | Added excludeSitemapEntries helper to filter sitemap blocks by matching <loc> tags. Extended Props interface with excludeSiteMapEntry?: string[] parameter and updated function signature to accept and apply exclusion filtering. |
| Proxy Loader Sitemap Configuration (vtex/loaders/proxy.ts) | Introduced IncludeSiteMapEntry interface with fields for path, optional handler, and excludePaths. Added helpers: normalizeIncludeSiteMap, includeEntriesToPaths, includeEntriesToRoutes for entry transformation. Extended buildProxyRoutes and loader Props to accept includeSiteMapWithHandler and excludeSiteMapEntry, integrating them into sitemap generation and route construction. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Poem

🐰 Hopping through the sitemaps with glee,
Entries excluded or included, all free!
New paths are normalized, handlers assigned,
Filtered with joy, a grand design! 🌳
Hops triumphantly across the XML trees

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly summarizes the main feature additions: excluding and including sitemap entries in the vtex-proxy configuration. |
| Description check | ✅ Passed | The description covers all template sections with comprehensive details about the changes, example usage, and demonstration links, though Loom video is marked as N/A. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |


@cubic-dev-ai cubic-dev-ai bot left a comment

No issues found across 2 files

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@vtex/handlers/sitemap.ts`:
- Around line 86-95: The current flow calls includeSiteMaps(...) then
excludeSitemapEntries(...), which removes any upstream entries that were just
replaced and prevents re-injecting custom sitemap entries; instead call
excludeSitemapEntries on the original response content (the value passed into
includeSiteMaps — i.e., text.replaceAll(publicUrl, `${reqUrl.origin}/`)) first
using the same excludeSiteMapEntry predicate, then pass that filtered content
into includeSiteMaps along with reqUrl.origin and include so upstream entries
are removed before re-injection; update references in sitemap handler (functions
includeSiteMaps, excludeSitemapEntries and the variables text, publicUrl,
reqUrl.origin, include, excludeSiteMapEntry) and return the result of
includeSiteMaps(filteredContent, reqUrl.origin, include).

In `@vtex/loaders/proxy.ts`:
- Around line 192-196: The schema comment for includeSiteMapWithHandler is
misleading because IncludeSiteMapEntry[] does not accept raw string paths;
update the JSDoc above includeSiteMapWithHandler to remove the suggestion that a
plain string like "/sitemap/blog.xml" is valid and instead state that entries
must be objects with path and handler (mention __resolveType for dynamic
sitemaps) or, if you intend to support raw strings, change the TypeScript type
to include string (e.g., IncludeSiteMapEntry | string) and update related
validation/consumers accordingly; reference includeSiteMapWithHandler and
IncludeSiteMapEntry when making the change.
- Around line 38-43: normalizeIncludeSiteMap and includeEntriesToPaths currently
return/emit raw entries which can contain duplicate IncludeSiteMapEntry.path
values causing duplicate <loc> and custom routes; update normalizeIncludeSiteMap
to both normalize (default to empty array) and deduplicate entries by path, and
change includeEntriesToPaths to produce a list of unique, normalized paths
(e.g., map to path, filter out falsy, use a Set or keyed dedupe) so downstream
sitemap materialization and route generation use unique paths; apply the same
dedupe/normalization logic to the analogous functions around the other block
mentioned (the functions referenced by includeEntriesToPaths and
normalizeIncludeSiteMap in the later section).

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 47fef74d-baa3-4ed3-8cee-16ea25220d07

📥 Commits

Reviewing files that changed from the base of the PR and between f9a5a44 and 77540a7.

📒 Files selected for processing (2)
  • vtex/handlers/sitemap.ts
  • vtex/loaders/proxy.ts

Comment on lines +86 to +95
+    const withIncludes = includeSiteMaps(
+      text.replaceAll(publicUrl, `${reqUrl.origin}/`),
+      reqUrl.origin,
+      include,
+    );
+
+    const filtered = excludeSitemapEntries(withIncludes, excludeSiteMapEntry);
+
     return new Response(
-      includeSiteMaps(
-        text.replaceAll(publicUrl, `${reqUrl.origin}/`),
-        reqUrl.origin,
-        include,
-      ),
+      filtered,
⚠️ Potential issue | 🟠 Major

Exclude upstream entries before re-injecting custom ones.

Because vtex/loaders/proxy.ts:131-155 pushes includeSiteMapWithHandler paths into include, filtering after includeSiteMaps() removes the replacement entry too. Excluding /sitemap/foo.xml and re-adding the same path currently leaves no <sitemap> block for it in /sitemap.xml.

🐛 Proposed fix
-    const withIncludes = includeSiteMaps(
-      text.replaceAll(publicUrl, `${reqUrl.origin}/`),
-      reqUrl.origin,
-      include,
-    );
-
-    const filtered = excludeSitemapEntries(withIncludes, excludeSiteMapEntry);
+    const withoutExcluded = excludeSitemapEntries(
+      text.replaceAll(publicUrl, `${reqUrl.origin}/`),
+      excludeSiteMapEntry,
+    );
+
+    const filtered = includeSiteMaps(
+      withoutExcluded,
+      reqUrl.origin,
+      include,
+    );
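To make the ordering problem concrete, here is a small self-contained sketch comparing the two orders on a toy sitemap index. The helpers `appendEntries` and `dropEntries` are simplified stand-ins written for this example; they are not the repository's includeSiteMaps and excludeSitemapEntries, and the host name is made up.

```typescript
// Simplified stand-ins for illustration; not the repository's helpers.
const appendEntries = (xml: string, origin: string, paths: string[]): string =>
  xml.replace(
    "</sitemapindex>",
    // Function replacement avoids special "$" handling in the inserted text.
    () =>
      paths.map((p) => `<sitemap><loc>${origin}${p}</loc></sitemap>`).join("") +
      "</sitemapindex>",
  );

const dropEntries = (xml: string, entries: string[]): string =>
  xml.replace(/<sitemap>[\s\S]*?<\/sitemap>/g, (block) =>
    entries.some((e) => e !== "" && block.includes(`${e}</loc>`)) ? "" : block,
  );

const origin = "https://store.example"; // hypothetical host
const path = "/sitemap/foo.xml";
const upstream =
  `<sitemapindex><sitemap><loc>${origin}${path}</loc></sitemap></sitemapindex>`;

// Include first, then exclude: the re-added custom entry is removed as well.
const includeThenExclude = dropEntries(
  appendEntries(upstream, origin, [path]),
  [path],
);

// Exclude first, then include: only the upstream entry is removed.
const excludeThenInclude = appendEntries(
  dropEntries(upstream, [path]),
  origin,
  [path],
);
```

In this toy setup `includeThenExclude` ends up with no entry for `/sitemap/foo.xml`, while `excludeThenInclude` keeps exactly one, which is the behavior the proposed fix is after.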
Comment on lines +38 to +43
const normalizeIncludeSiteMap = (
  includeSiteMap?: IncludeSiteMapEntry[],
): IncludeSiteMapEntry[] => (includeSiteMap ?? []);

const includeEntriesToPaths = (entries: IncludeSiteMapEntry[]): string[] =>
  entries.map((e) => e.path);
⚠️ Potential issue | 🟡 Minor

Normalize and dedupe sitemap paths before materializing them.

Because vtex/handlers/sitemap.ts:8-27 currently emits one <sitemap> block per include item, the raw concatenation here can produce duplicate <loc> entries. Repeated IncludeSiteMapEntry.path values also generate duplicate custom routes.

♻️ Proposed fix
 const normalizeIncludeSiteMap = (
   includeSiteMap?: IncludeSiteMapEntry[],
-): IncludeSiteMapEntry[] => (includeSiteMap ?? []);
+): IncludeSiteMapEntry[] => {
+  const byPath = new Map<string, IncludeSiteMapEntry>();
+  for (const entry of includeSiteMap ?? []) {
+    byPath.set(entry.path, entry);
+  }
+  return [...byPath.values()];
+};
+
+const dedupePaths = (paths: string[]) => [...new Set(paths)];
 
 const includeEntriesToPaths = (entries: IncludeSiteMapEntry[]): string[] =>
   entries.map((e) => e.path);
@@
     const [include, routes] = generateDecoSiteMap
       ? [
-        [
+        dedupePaths([
           ...includeEntriesToPaths(entries),
           ...(includeSiteMap ?? []),
           decoSiteMapUrl,
-        ],
+        ]),
         [
           ...customSitemapRoutes,
           {
@@
       ]
       : [
-        [...includeEntriesToPaths(entries), ...(includeSiteMap ?? [])],
+        dedupePaths([
+          ...includeEntriesToPaths(entries),
+          ...(includeSiteMap ?? []),
+        ]),
         customSitemapRoutes,
       ];

Also applies to: 131-155
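One detail of a Map-keyed dedupe is worth spelling out: when a path repeats, the later entry replaces the earlier value, but the key keeps its original position. The standalone sketch below shows this with a hypothetical `EntrySketch` type; it mirrors the shape of the proposed fix but is not taken from the repository.

```typescript
// Hypothetical entry shape for illustration.
interface EntrySketch {
  path: string;
  handler?: string;
}

// Dedupe entries by path. The last entry for a path wins, while the path
// keeps the position where it first appeared (standard Map behavior).
const dedupeByPath = (entries: EntrySketch[]): EntrySketch[] => {
  const byPath = new Map<string, EntrySketch>();
  for (const entry of entries) {
    byPath.set(entry.path, entry);
  }
  return Array.from(byPath.values());
};

// Dedupe plain path strings, preserving first-seen order.
const dedupePaths = (paths: string[]): string[] => Array.from(new Set(paths));
```

If first-wins semantics are preferred, checking `byPath.has(entry.path)` before calling `set` would be one way to get them.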

Comment on lines +192 to +196
/**
 * @title Other site maps to include
 * @description URL path (e.g. "/sitemap/blog.xml") or object with path + handler (__resolveType) to register a route and add to the index. Use the object form for dynamic sitemaps (e.g. from Sanity).
 */
includeSiteMapWithHandler?: IncludeSiteMapEntry[];
⚠️ Potential issue | 🟡 Minor

The schema description still implies raw strings are valid here.

includeSiteMapWithHandler is typed as IncludeSiteMapEntry[], but this text reads like the field accepts either "/sitemap/foo.xml" or an object. That is likely to trip up anyone editing site.json directly.
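If raw strings were to be supported, as the description suggests, one possible shape is a union type normalized to the object form before use. This is only a sketch under that assumption: `IncludeSiteMapEntrySketch`, `IncludeInputSketch`, and `normalizeEntry` are hypothetical names, and the PR as reviewed accepts objects only.

```typescript
// Hypothetical types; the PR currently types the field as objects only.
interface IncludeSiteMapEntrySketch {
  path: string;
  handler?: string;
}

type IncludeInputSketch = string | IncludeSiteMapEntrySketch;

// Normalize a raw path string into the object form; pass objects through.
const normalizeEntry = (
  input: IncludeInputSketch,
): IncludeSiteMapEntrySketch =>
  typeof input === "string" ? { path: input } : input;

const normalizeAll = (
  inputs: IncludeInputSketch[] = [],
): IncludeSiteMapEntrySketch[] => inputs.map(normalizeEntry);
```

The alternative, keeping the type as is and rewording the description so it only mentions the object form, avoids touching any consumers.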
