Add primitives for str lower() and upper() #1088

@JukkaL

Add primitives for s.lower() and s.upper() on str objects. There aren't any Python C API functions that directly implement these, so we'll need to add full implementations. Here are some ideas:

  • Use Py_UNICODE_TOLOWER and Py_UNICODE_TOUPPER to convert individual characters to upper/lower case.
    • Use a static table for the lowest 128 or 256 character codes to avoid per-character function calls in the common case. Or maybe we can use some tricks to avoid table lookups on ASCII characters? A table lookup is probably better than an unpredictable branch, though.
  • Create an uninitialized new str object and set the contents directly per-character.
  • We may need to set the string kind (1/2/4 bytes per character) directly for optimal performance.
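  • Some code points need special handling, at least Greek capital sigma (Σ, U+03A3), whose lowercase form depends on its position in the word (σ, or ς at the end of a word). Look at the CPython implementation and tests to ensure we match the semantics 100%.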
  • One option is to copy and paste the CPython implementation, if it doesn't rely on (too many) static functions that we'd need to vendor as well.
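
As a rough illustration of the table-lookup idea above, here is a minimal sketch of an ASCII fast path in plain C. It is not taken from mypyc or CPython: the names `ascii_lower_table`, `init_ascii_lower_table`, and `ascii_lower` are hypothetical, and the sketch assumes the input is a 1-byte-per-character buffer. It deliberately leaves out str object creation and the non-ASCII slow path, which would go through something like Py_UNICODE_TOLOWER plus the special cases.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical lookup table: maps each ASCII code (0..127) to its
   lowercase form. A real implementation would probably use a
   compile-time constant table instead of filling it at startup. */
static uint8_t ascii_lower_table[128];

/* Fill the table once before first use. */
static void init_ascii_lower_table(void) {
    for (int c = 0; c < 128; c++) {
        ascii_lower_table[c] =
            (uint8_t)((c >= 'A' && c <= 'Z') ? c + ('a' - 'A') : c);
    }
}

/* Lowercase `len` bytes from `src` into `dst` using the table.
   Returns 1 if every input byte was ASCII and was converted.
   Returns 0 at the first non-ASCII byte; `dst` may then be partially
   written, and the caller is expected to fall back to a slower,
   fully Unicode-aware path. */
static int ascii_lower(const uint8_t *src, uint8_t *dst, size_t len) {
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < len; i++) {
        uint8_t c = src[i];
        if (c >= 128) {
            return 0;  /* not ASCII: give up on the fast path */
        }
        dst[i] = ascii_lower_table[c];
    }
    return 1;
}
```

The only branch in the loop is the ASCII check, which should predict well when the text is mostly ASCII; the case conversion itself is a plain table load. An uppercase variant would mirror this with a second table.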
