Fix: Keyword parsing: support for semicolon-delimited strings #24

@caedmon5

Keyword parsing: support for semicolon-delimited strings

Problem

Currently, bibnow rejects long keyword strings when they are provided as a single semicolon-delimited string in CSL-JSON or BibTeX. For example, the following entry fails with a 413 error because the entire string is treated as one giant tag:

"keyword": "Donald Trump; United States politics; Neoliberalism; Markets; Fiduciary duties; Bond markets; Big data; Algorithmic governance; Platform capitalism; Conspiracy theory; New conspiracism; Hannah Arendt; Friedrich Hayek; Robert F. Kennedy Jr.; Elon Musk; Evidence-based governance; Tariffs; Idiocracy (film); Epistemology; Political sociology; Datafication; Outsourcing of judgment; Imagination and judgment"

As a result, the keyword field exceeds safe length limits and the commit is blocked.

Expected behaviour

bibnow should automatically split semicolon-delimited strings (or comma-delimited strings, if present) into an array of keyword values. That is, the entry above should be parsed into:

"keyword": [
  "Donald Trump",
  "United States politics",
  "Neoliberalism",
  "Markets",
  "Fiduciary duties",
  "Bond markets",
  "Big data",
  "Algorithmic governance",
  "Platform capitalism",
  "Conspiracy theory",
  "New conspiracism",
  "Hannah Arendt",
  "Friedrich Hayek",
  "Robert F. Kennedy Jr.",
  "Elon Musk",
  "Evidence-based governance",
  "Tariffs",
  "Idiocracy (film)",
  "Epistemology",
  "Political sociology",
  "Datafication",
  "Outsourcing of judgment",
  "Imagination and judgment"
]

Why this matters

  • Many reference managers (Zotero, CSL exports, etc.) use semicolons as default keyword separators.
  • Scholars working across platforms often copy/paste CSL-JSON with semicolons intact.
  • Requiring manual conversion to arrays creates friction and commit failures.

Proposed solution

  • Detect if the keyword field is a string containing ; characters.
  • Split on ; (and trim whitespace) to create an array.
  • Preserve existing array behaviour if keyword is already an array.
  • Optional: also accept comma-delimited strings, though semicolons are the more common convention.
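
The steps above could be sketched roughly as follows. This is only an illustration in Python, not bibnow's actual code: the function name `normalize_keywords` and the `allow_commas` flag are hypothetical, and the real parser may be written in another language or structured differently.

```python
from typing import List, Optional, Union


def normalize_keywords(
    value: Optional[Union[str, List[str]]],
    allow_commas: bool = False,  # hypothetical opt-in for comma-delimited input
) -> List[str]:
    """Return the keyword field as a list, splitting delimited strings."""
    if value is None:
        return []
    if isinstance(value, list):
        # Already an array: preserve existing behaviour, return it unchanged.
        return list(value)
    if ";" in value:
        parts = value.split(";")
    elif allow_commas and "," in value:
        # Only fall back to commas when no semicolon is present.
        parts = value.split(",")
    else:
        parts = [value]
    # Trim whitespace and drop empty pieces (e.g. from a trailing ';').
    return [part.strip() for part in parts if part.strip()]


keywords = normalize_keywords("Donald Trump; United States politics; Idiocracy (film)")
# keywords == ["Donald Trump", "United States politics", "Idiocracy (film)"]
```

In this sketch comma splitting is opt-in and only applies when no semicolon is found, since a single keyword can legitimately contain a comma.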
