@lukavdplas lukavdplas commented Dec 16, 2025

This makes some changes to how indices are named, and how es_index and es_alias are used.

The main reason was that the behaviour for database corpora needed to change, e.g. to allow duplicate names for database corpora in the future. I also wanted to change es_alias to work better for use cases like parlamint or course explorer.

  • es_index is now optional, and is intended to customise/override the default name. For Python corpora, the default is based on the corpus name, e.g. textcavator-times; for database corpora, it is based on the ID, e.g. textcavator-custom[3]. The bracket characters avoid overlap with corpus names or version numbers.
  • es_index is no longer set when you create a corpus via the API (but you can still set it from the admin site). The generated index name is not derived from the corpus title, so this resolves #1824 (Corpus form: updating corpus title orphans index).
  • Most test corpora no longer set es_index. Incidentally, that means you can use them in development too, without deleting the index every time you run unit tests, because the tests use a different application prefix.
  • es_alias now specifies an additional alias, rather than replacing the alias name. I think this is closer to how we use it. This means the corpus is still searchable under its own index name, as well as the alias.
  • For a corpus with es_alias, the index --prod --rollover and alias commands now only remove the alias from earlier versions of the same corpus. (Not from indices that belong to other corpora.)
  • Indexing with --prod now fails with a clear message if you previously created an unversioned index for the corpus. (Closes #1947, Indexing: no catch if intended alias is already used as index name.)
  • In production mode, you can now create a new index version when the active version is not the latest one. (Closes #1159, Versioned indexing when alias is not set to latest version.)
  • es_index/es_alias are no longer visible in the API.
  • In some places, added a check for the case where the index name matches more than one index.
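
To illustrate the es_alias behaviour described above, here is a minimal sketch of how the alias actions for a production rollover could be computed. It is not the code from this PR: the function name `rollover_actions`, the `<base>-<version>` naming of versioned indices, and the example index names are assumptions made for illustration. The action dictionaries follow the shape accepted by Elasticsearch's update aliases API.

```python
import re
from typing import Dict, List, Optional, Set


def rollover_actions(
    base: str,
    new_version: int,
    existing: Dict[str, Set[str]],
    es_alias: Optional[str] = None,
) -> List[dict]:
    """Compute alias actions for moving a corpus to a new index version.

    `existing` maps each index name to the aliases currently set on it.
    Assumption: versioned indices are named "<base>-<version>".
    """
    # Only indices that are versions of *this* corpus are candidates.
    versioned = re.compile(re.escape(base) + r"-(\d+)")
    # The corpus stays searchable under its own name; es_alias is additional.
    aliases = [base] + ([es_alias] if es_alias else [])

    actions: List[dict] = []
    for index in sorted(existing):
        if not versioned.fullmatch(index):
            continue  # belongs to another corpus: leave its aliases alone
        for alias in aliases:
            if alias in existing[index]:
                actions.append({"remove": {"index": index, "alias": alias}})

    new_index = f"{base}-{new_version}"
    for alias in aliases:
        actions.append({"add": {"index": new_index, "alias": alias}})
    return actions


# Two corpora sharing the additional alias "parlamint" (hypothetical names).
existing = {
    "textcavator-parlamint-nl-1": {"textcavator-parlamint-nl", "parlamint"},
    "textcavator-parlamint-uk-1": {"textcavator-parlamint-uk", "parlamint"},
}
actions = rollover_actions("textcavator-parlamint-nl", 2, existing, "parlamint")
```

In this sketch, the shared alias is removed only from the earlier version of the corpus being rolled over; the index of the other corpus keeps it.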

As far as I know, this won't break anything for existing corpora, because they all still specify es_index.
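
As a rough sketch of the naming rule (not the actual implementation), the index name could be resolved as follows. The function names `default_index_name` and `resolve_index_name` and their parameters are hypothetical; only the example names textcavator-times and textcavator-custom[3] come from the description above.

```python
from typing import Optional


def default_index_name(
    prefix: str, corpus_name: str, database_id: Optional[int] = None
) -> str:
    """Default name: based on the ID for database corpora, else on the corpus name."""
    if database_id is not None:
        # Brackets cannot clash with a corpus name or a version number.
        return f"{prefix}-custom[{database_id}]"
    return f"{prefix}-{corpus_name}"


def resolve_index_name(
    prefix: str,
    corpus_name: str,
    es_index: Optional[str] = None,
    database_id: Optional[int] = None,
) -> str:
    """An explicit es_index overrides the default name."""
    return es_index or default_index_name(prefix, corpus_name, database_id)


python_corpus = resolve_index_name("textcavator", "times")
database_corpus = resolve_index_name("textcavator", "my-corpus", database_id=3)
# An existing corpus that still sets es_index keeps its current index name.
existing_corpus = resolve_index_name("textcavator", "times", es_index="times")
```

Because an explicit es_index takes precedence in this sketch, corpora that already specify one resolve to the same index as before.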

@lukavdplas lukavdplas changed the title from "Feature/index naming" to "index naming" Dec 16, 2025
@lukavdplas lukavdplas marked this pull request as ready for review December 16, 2025 18:38
@Meesch Meesch self-requested a review January 9, 2026 12:17
@lukavdplas lukavdplas added the backend changes to the django backend label Jan 30, 2026