@jonahadkins
Contributor

@jonahadkins jonahadkins commented May 10, 2025

Description

This PR makes forward-looking improvements to the existing category CSV:

  • Duplicates overture_categories.csv as overture_category_taxonomy.csv to de-risk planned changes affecting users.
  • Does not change any values in overture_categories.csv, with the exception of the column names.
  • Adds hierarchical naming for each level in the taxonomy, by walking back the current taxonomy from the most detailed category to the least detailed. The suggested names (Domain (originally Theme), Topic, Subtopic, Niche, Variant, Subvariant) were chosen from standard hierarchical heading names.
    • Domain (originally Theme): (Level 1) A broad category that groups related types of points of interest, such as food and drink, entertainment, or shopping. There are 22 unique values.
    • Topic: (Level 2) A mid-level category that provides more specificity within a theme, such as restaurants, museums, or parks. There are 697 unique values.
    • Subtopic: (Level 3) A more specific category that further refines a topic, such as Italian restaurants, art museums, or national parks. There are 1010 unique values.
    • Niche: (Level 4) A unique or specialized aspect of a subtopic, such as a specific type of cuisine, a particular artist's work, or a notable landmark. There are 355 unique values.
    • Variant: (Level 5) A specific variation or style within a niche, adding further distinction. There are 32 unique values. Note: because there are so few of these, flattening them in the future is recommended.
    • Subvariant: (Level 6) The most granular level that identifies a unique characteristic, special offering, or distinctive element. There are 2 unique values. Note: because there are so few of these, flattening them in the future is recommended.
  • Adds a Display Name column that is the formatted proper name of the category.
  • Adds a Hierarchy Level column that provides a numeric value (1-6) for each category's level in the taxonomic hierarchy.
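
As a rough illustration of how the level columns and the Hierarchy Level column relate, here is a minimal Python sketch. It assumes each category's ancestry is available as a list ordered from broadest to most specific. The function name `taxonomy_row` and the sample category codes are hypothetical, and the PR does not state how the columns were actually generated. The Display Name column is left out of this sketch.

```python
# Level names as listed in the PR description, broadest first.
LEVEL_NAMES = ["Domain", "Topic", "Subtopic", "Niche", "Variant", "Subvariant"]


def taxonomy_row(path):
    """Build one row for a category from its ancestry path.

    `path` is assumed to be a list of category codes ordered from the
    broadest ancestor to the category itself (hypothetical input shape).
    """
    if not 1 <= len(path) <= len(LEVEL_NAMES):
        raise ValueError("path must contain between 1 and 6 entries")

    row = {"Category": path[-1]}
    for index, level_name in enumerate(LEVEL_NAMES):
        # Levels deeper than the category itself stay empty.
        row[level_name] = path[index] if index < len(path) else ""
    # The hierarchy level is the depth of the category in its path (1-6).
    row["Hierarchy Level"] = len(path)
    return row


# Illustrative codes only; they are not taken from the CSV.
row = taxonomy_row(["eat_and_drink", "restaurant", "italian_restaurant"])
print(row)
```

With this shape, a category at level 3 has Domain, Topic, and Subtopic filled in and the three deeper columns empty.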

Visual Example:

[Screenshot 2025-05-10 at 7:54:33 AM]

allukac commented May 13, 2025

@jonahadkins could you state somewhere how theme, topic, subtopic, niche, etc. were generated or determined for each category?

DavidKarlas previously approved these changes May 14, 2025
DavidKarlas self-requested a review May 14, 2025 15:29
allukac previously approved these changes May 15, 2025
@Cj-Malone
Contributor

> Adds a Display Name column that is the formatted proper name of the category.

I think that "RV Park" would be better than the current "Rv Park".
Is this anything more than "_" -> " " -> title case?

Maybe this should be a separate file anyway, so these can be translated into other languages. Presumably these are en-US?

@jonahadkins
Contributor Author

> I think that "RV Park" would be better than the current "Rv Park".

canonically "RV" in most places: https://en.wikipedia.org/wiki/RV_park
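
To make the question above concrete, here is a minimal Python sketch of the naive "_" -> " " -> title case transform next to a variant with an acronym override. The category code `rv_park`, the `ACRONYMS` set, and both function names are assumptions for illustration. The PR does not say how Display Name was actually produced.

```python
# Hypothetical set of tokens to render fully upper-case; not part of the PR.
ACRONYMS = {"rv"}


def naive_display_name(category: str) -> str:
    """Replace underscores with spaces, then title-case every word."""
    return category.replace("_", " ").title()


def display_name(category: str) -> str:
    """Same idea, but keep known acronyms upper-case."""
    words = category.split("_")
    return " ".join(
        word.upper() if word in ACRONYMS else word.capitalize()
        for word in words
    )


print(naive_display_name("rv_park"))  # Rv Park
print(display_name("rv_park"))        # RV Park
```

An override list like this would need to be maintained by hand, which is one argument for keeping display names in a separate, translatable file as suggested above.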

@@ -1,4 +1,4 @@
-Category,Theme,Topic,Subtopic,Niche,Variant,Subvariant,Display Name,Hierarchy Level
+Category,Domain,Topic,Subtopic,Niche,Variant,Subvariant,Display Name,Hierarchy Level

I support this change

Collaborator

I updated the PR description to strike out Theme and insert Domain per my understanding of the change.

Collaborator

@vcschapp vcschapp left a comment

The thing I'm missing on this PR so far is:

  1. Why are we doing this? (What is the problem we are solving?)
  2. Who will use this file?
  3. How will the file be used?
  4. Why is this file the right way of solving the problem?
  5. What alternative solutions were considered and rejected?
  6. Is this file part of the schema, yes or no?
  7. Do we expect people to depend on it and complain if we change it?

Following from the discussion in the schema meeting last Wednesday 2025-05-21, should this PR be cycled into an RFD to try to answer those motivating questions before we review it as a PR?

@vcschapp
Collaborator

@jonahadkins can you update this to merge to a branch per the discussion on 2025-06-11?

Labels

change type - documentation - member 📝 Documentation change by Overture member

Development

Successfully merging this pull request may close these issues.

Category definitions Provide category headings for places

6 participants