This Google Apps Script recursively collects the loc values of all url elements, starting from a sitemap file or a sitemap index file, and writes the results to a pre-formatted Google Sheet that can be used as a page feed in Google Ads.
This is my very first fork. Feel free to correct or add to the information in this README and in the inline comments.
I'd also be happy about any improvement to my very basic functions, especially if you find a way to save execution time. For reference: with this script I extracted 620K+ URLs from 72 sitemap files in about 23 minutes. 30 minutes is the execution time limit in Google Apps, so this script will probably run into a timeout for bigger website projects with 800K+ indexed URLs.
- main(): Sets start and export file urls, prepares cache arrays and calls sub-functions.
- fetchSitemaps(url): Returns sitemap URLs from sitemap index files. Processes sitemap elements only.
- fetchXml(url): Pre-fetches the XML data from all cached sitemap urls to speed up the following extracting process.
- extractLocsFromXml(xml): Returns url locs from cached sitemap XML contents. Processes url elements only.

main()
Sets start and export file urls, prepares cache arrays and calls sub-functions.
- Clear existing content in exportSheet
- Write header to exportSheet
- Start with url of sitemap or sitemap index file
- Recursively return all sitemap urls and write them to temp sitemaps array
- Retrieve the XML content of all sitemaps previously saved in temp sitemaps array
- Extract all urls (locs) from all cached sitemap contents
- Finally write all extracted urls (locs) to exportSheet
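
The order of the steps above can be sketched as follows. This is only an illustration of the flow, not the script's actual main(): the function name runExport and every member of the deps object are hypothetical, and the sheet and network access are passed in as plain functions so the flow can be read (and run) on its own.

```javascript
/**
 * Illustrative flow only. `runExport` and all `deps` members are
 * hypothetical names, not functions from this repository.
 */
function runExport(startUrl, deps) {
  deps.clearSheet();   // clear existing content in the export sheet
  deps.writeHeader();  // write the header row

  // Start from the sitemap or sitemap index URL and collect all sitemap URLs.
  const sitemaps = deps.collectSitemaps(startUrl);

  // Retrieve the XML content of every collected sitemap.
  const xmlDocs = sitemaps.map(url => deps.fetchXml(url));

  // Extract all url locs from the cached XML contents.
  const locs = [];
  for (const xml of xmlDocs) {
    for (const loc of deps.extractLocs(xml)) {
      locs.push(loc);
    }
  }

  // Write everything in one call, one loc per row.
  deps.writeRows(locs.map(loc => [loc]));
  return locs.length;
}
```

Collecting all locs first and writing them in a single call at the end keeps the number of sheet operations small, which is usually where the execution time goes.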
Kind: global function

Param | Type | Description |
---|---|---|
exportSheetUrl | "https://docs.google.com/spreadsheets/d/..." | REQUIRED The URL of the Google Sheet to export to |
startUrl | "https://www.yourdomain.com/sitemap.xml" OR "https://www.yourdomain.com/sitemap-index.xml" | REQUIRED The URL of the sitemap or the sitemap index file |
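
As a rough sketch of how exportSheetUrl might be used, the function below opens the sheet by URL, clears it, and writes header and locs in one batch. The name exportLocs, the header parameter and the injected spreadsheetApp argument are assumptions made for illustration; in Apps Script you would pass the built-in SpreadsheetApp, and the header would be whatever your pre-formatted sheet expects.

```javascript
/**
 * Hypothetical helper, not part of this repository.
 * `spreadsheetApp` is expected to behave like Apps Script's SpreadsheetApp.
 */
function exportLocs(exportSheetUrl, locs, header, spreadsheetApp) {
  // Assumption: the export target is the first sheet of the spreadsheet.
  const sheet = spreadsheetApp.openByUrl(exportSheetUrl).getSheets()[0];
  sheet.clearContents();

  // Put each loc in the first column and pad the remaining header columns.
  const padding = new Array(header.length - 1).fill('');
  const rows = [header].concat(locs.map(loc => [loc].concat(padding)));

  // One setValues call for the whole block instead of one call per row.
  sheet.getRange(1, 1, rows.length, header.length).setValues(rows);
  return locs.length;
}
```

Inside Apps Script the call could look like `exportLocs(exportSheetUrl, locs, ['Page URL'], SpreadsheetApp)`, where the header text is only a placeholder.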

fetchSitemaps(url)
Returns sitemap URLs from sitemap index files. Processes sitemap elements only.
Kind: local variable

Param | Type | Description |
---|---|---|
url | "../sitemap.xml" OR "../sitemap-index.xml" | The sitemap URL currently being processed, provided by main() |
return | Array | Extracted sitemap URLs |
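
A minimal sketch of the recursion, under stated assumptions: the names locsOf and collectSitemaps are hypothetical, the XML is scanned with regular expressions instead of an XML parser, and a URL whose document contains no sitemap elements is treated as a plain sitemap and returned as-is. The actual fetchSitemaps(url) may handle these cases differently. The fetchText argument stands in for whatever returns the response body of a URL.

```javascript
// Returns the <loc> value of every <tagName> element in the XML text.
function locsOf(xml, tagName) {
  // \b keeps "sitemap" from matching "<sitemapindex>".
  const blockRe = new RegExp('<' + tagName + '\\b[^>]*>([\\s\\S]*?)</' + tagName + '>', 'g');
  const locs = [];
  let block;
  while ((block = blockRe.exec(xml)) !== null) {
    const loc = /<loc>\s*([\s\S]*?)\s*<\/loc>/.exec(block[1]);
    if (loc) locs.push(loc[1]);
  }
  return locs;
}

// Walks sitemap index files recursively and returns the plain sitemap URLs.
function collectSitemaps(url, fetchText, seen = new Set()) {
  if (seen.has(url)) return [];  // guard against index files that reference each other
  seen.add(url);

  const children = locsOf(fetchText(url), 'sitemap');
  if (children.length === 0) return [url];  // no sitemap elements: a plain sitemap

  return children.flatMap(child => collectSitemaps(child, fetchText, seen));
}
```

Note that every URL is fetched once here just to find out whether it is an index file, so caching the fetched text for the later extraction step would avoid a second request.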

fetchXml(url)
Pre-fetches the XML data from all cached sitemap URLs to speed up the subsequent extraction.
Kind: local variable

Param | Type | Description |
---|---|---|
url | "../sitemap.xml" OR "../sitemap-index.xml" | The sitemap URL currently being processed, provided by main() |
return | Array | Fetched XML |
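
One possible way to save execution time at this step is to request several sitemaps per call with UrlFetchApp.fetchAll instead of fetching them one by one. The sketch below is a swapped-in batching variant, not the repository's fetchXml(url): the name prefetchXml, the batchSize default and the injected fetchService parameter are assumptions. In Apps Script you would pass UrlFetchApp as fetchService.

```javascript
/**
 * Hypothetical batching variant. `fetchService` must offer fetchAll(requests)
 * and return responses with getResponseCode() and getContentText(),
 * as Apps Script's UrlFetchApp does.
 */
function prefetchXml(urls, fetchService, batchSize = 50) {
  const fetched = [];
  for (let i = 0; i < urls.length; i += batchSize) {
    const batch = urls.slice(i, i + batchSize);
    // muteHttpExceptions lets a failing sitemap return a response instead of throwing.
    const requests = batch.map(url => ({ url: url, muteHttpExceptions: true }));
    const responses = fetchService.fetchAll(requests);

    responses.forEach((response, j) => {
      if (response.getResponseCode() === 200) {
        fetched.push({ url: batch[j], xml: response.getContentText() });
      }
    });
  }
  return fetched;
}
```

Sitemaps that do not answer with status 200 are silently skipped in this sketch; logging them would make missing URLs in the export easier to trace.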

extractLocsFromXml(xml)
Returns url locs from cached sitemap XML contents. Processes url elements only.
Kind: local variable

Param | Type | Description |
---|---|---|
xml | "<?xml version...>" | The sitemap XML content currently being processed, provided by main() |
return | Array | Extracted URLs (locs) |
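
A minimal sketch of the extraction, assuming a regex scan instead of an XML parser such as XmlService. The name extractUrlLocs is hypothetical and the real extractLocsFromXml(xml) may work differently. Only loc elements directly named loc inside url elements are read, so namespaced entries such as image:loc and the sitemap elements of index files are ignored.

```javascript
// Returns the <loc> value of every <url> element in a sitemap document.
function extractUrlLocs(xml) {
  // \b keeps "url" from matching the enclosing "<urlset>" element.
  const urlBlockRe = /<url\b[^>]*>([\s\S]*?)<\/url>/g;
  const locRe = /<loc>\s*([\s\S]*?)\s*<\/loc>/;
  const locs = [];
  let block;
  while ((block = urlBlockRe.exec(xml)) !== null) {
    const loc = locRe.exec(block[1]);
    if (loc) {
      // Sitemaps escape "&" as "&amp;"; other XML entities are left untouched here.
      locs.push(loc[1].replace(/&amp;/g, '&'));
    }
  }
  return locs;
}
```

A regex scan is only a reasonable shortcut for well-formed sitemaps without CDATA sections or comments around the loc values; for anything less regular, a real XML parser is the safer choice.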