uniwidth - Modern Unicode Width Calculation for Go

uniwidth is a modern, high-performance Unicode width calculation library for Go 1.25+. It provides 3-46x faster width calculation compared to existing solutions through a 4-tier O(1) lookup architecture, SWAR optimization, and a ZWJ-aware emoji state machine.

Performance

Based on comprehensive benchmarks vs go-runewidth:

ASCII strings: 15-46x faster (SWAR, 8 bytes/iter)
CJK strings: 4-14x faster (O(1) table lookup)
Mixed/Emoji strings: 6-8x faster
ZWJ emoji: Correct width (👨‍👩‍👧‍👦 = 2, ~95 ns)
Zero allocations: 0 B/op, 0 allocs/op for ASCII paths

Run benchmarks yourself: cd bench && go test -bench=. -benchmem

Features

3-46x faster than go-runewidth (proven in benchmarks)
All tiers O(1) — 4-tier lookup with 3-stage hierarchical table (3.8KB)
ZWJ-aware — family emoji, skin tones, flags handled correctly
SWAR optimized — ASCII detection and width counting at 8 bytes/iter
Zero allocations for ASCII strings (no GC pressure)
Thread-safe (immutable design, no global state)
Unicode 16.0 support
Modern API (Go 1.25+, functional options pattern)

Installation

go get github.com/unilibs/uniwidth

Requirements: Go 1.25 or later

Usage

Basic Usage

package main

import (
    "fmt"
    "github.com/unilibs/uniwidth"
)

func main() {
    // Calculate width of a string
    width := uniwidth.StringWidth("Hello 世界")
    fmt.Println(width) // Output: 10 (Hello=5, space=1, 世界=4)

    // Calculate width of a single rune
    w := uniwidth.RuneWidth('世')
    fmt.Println(w) // Output: 2

    // ASCII-only strings are super fast!
    width = uniwidth.StringWidth("Hello, World!")
    fmt.Println(width) // Output: 13
}

ZWJ Emoji Sequences

// ZWJ family emoji — correctly returns 2, not 8
width := uniwidth.StringWidth("👨‍👩‍👧‍👦")
fmt.Println(width) // Output: 2

// Skin tone modifiers — correctly returns 2, not 4
width = uniwidth.StringWidth("👍🏽")
fmt.Println(width) // Output: 2

// Rainbow flag
width = uniwidth.StringWidth("🏳️‍🌈")
fmt.Println(width) // Output: 2

// Country flags
width = uniwidth.StringWidth("🇺🇸")
fmt.Println(width) // Output: 2

Options API

Configure handling of ambiguous-width characters:

import "github.com/unilibs/uniwidth"

// East Asian locale (ambiguous characters are wide)
opts := []uniwidth.Option{
    uniwidth.WithEastAsianAmbiguous(uniwidth.EAWide),
}
width := uniwidth.StringWidthWithOptions("±½", opts...)
fmt.Println(width) // Output: 4 (each character is 2 columns)

// Neutral locale (ambiguous characters are narrow) - DEFAULT
opts = []uniwidth.Option{
    uniwidth.WithEastAsianAmbiguous(uniwidth.EANarrow),
}
width = uniwidth.StringWidthWithOptions("±½", opts...)
fmt.Println(width) // Output: 2 (each character is 1 column)

Real-World TUI Examples

// Terminal prompt
prompt := "❯ Enter command: "
width := uniwidth.StringWidth(prompt)
fmt.Printf("Prompt width: %d columns\n", width)

// Table cell padding
text := "Hello 世界"
padding := 20 - uniwidth.StringWidth(text)
fmt.Printf("%s%s\n", text, strings.Repeat(" ", padding))

// Truncate to fit terminal width
func truncate(s string, maxWidth int) string {
    width := 0
    for i, r := range s {
        w := uniwidth.RuneWidth(r)
        if width+w > maxWidth {
            return s[:i] + "…"
        }
        width += w
    }
    return s
}

Architecture

4-Tier O(1) Lookup

uniwidth uses a multi-tier approach where all tiers are O(1):

Tier 1: ASCII Fast Path (O(1))
- Covers ~95% of typical terminal content
- SWAR isASCIIOnly() + asciiWidth() process 8 bytes/iter
- Short strings (< 8 bytes) use fused single-pass loop
Tier 2: Common CJK (O(1))
- CJK Unified Ideographs, Hangul Syllables, Hiragana/Katakana
- Simple range checks for 32,000+ characters
Tier 3: Common Emoji (O(1))
- Emoticons, Pictographs, Dingbats, Symbols
- Range checks for ~1,200 emoji codepoints
Tier 4: 3-Stage Table (O(1))
- ROOT[256] → MIDDLE[17×64] → LEAVES[78×32]
- 2-bit width encoding, 3.8KB total
- Covers all remaining Unicode codepoints in 3 array lookups

ZWJ State Machine

Forward-scan state machine for correct emoji sequence handling:

3 states: default → emoji → emojiZWJ
Handles: ZWJ sequences, skin tone modifiers, variation selectors, flag pairs
Inspired by Ghostty's approach, adapted for width calculation

SWAR Optimization

ASCII paths use SIMD Within A Register (SWAR) for high throughput:

isASCIIOnly(): uint64 word AND with 0x8080808080808080 mask
asciiWidth(): Daniel Lemire's underflow trick for control character detection
Both process 8 bytes per iteration with zero allocations

Benchmarks

goos: windows
goarch: amd64

BenchmarkStringWidth_ASCII_Short     ~7 ns/op     0 B/op   0 allocs/op
BenchmarkStringWidth_ASCII_Medium   ~20 ns/op     0 B/op   0 allocs/op
BenchmarkStringWidth_CJK_Short     ~25 ns/op     0 B/op   0 allocs/op
BenchmarkStringWidth_ZWJ_Family    ~95 ns/op     0 B/op   0 allocs/op
BenchmarkStringWidth_EmojiModifier ~40 ns/op     0 B/op   0 allocs/op

Run benchmarks yourself:

go test -bench=. -benchmem

Use Cases

Perfect for:

TUI frameworks (terminal rendering hot paths)
Terminal emulators (text layout calculations)
CLI tools (table alignment, formatting)
Text editors (cursor positioning, column calculation)
Any high-performance text width calculation

Migration from go-runewidth

uniwidth provides a compatible API for easy migration:

// Before (go-runewidth)
import "github.com/mattn/go-runewidth"
width := runewidth.StringWidth(s)

// After (uniwidth) - drop-in replacement!
import "github.com/unilibs/uniwidth"
width := uniwidth.StringWidth(s)

Performance improvement: 3-46x faster, zero code changes!

Documentation

API Reference - Full godoc documentation
Benchmark Comparisons - Performance comparison vs go-runewidth
Architecture Design - Technical deep dive & design decisions
Changelog - Version history & upgrade guide
Roadmap - What's next for uniwidth

Testing

# Run tests
go test -v

# Run benchmarks
go test -bench=. -benchmem

# Run with coverage
go test -cover

Current test coverage: 96.4%

Development Status

Current: v0.2.0

This library is stable and production-ready. The API is backward-compatible across minor versions. ZWJ emoji sequences, skin tone modifiers, variation selectors, and flag emoji are all handled correctly.

v0.2.0 Highlights:

All 4 lookup tiers are now O(1) (3-stage table replaced binary search)
SWAR ASCII optimization (8 bytes/iter)
ZWJ emoji state machine (👨‍👩‍👧‍👦 = width 2)
Emoji modifier support (👍🏽 = width 2)
96.4% test coverage

Roadmap (v0.3.0+):

Profile-Guided Optimization (PGO)
Benchmark CI for regression detection
Explicit SIMD via Go assembly and archsimd
Unicode 17.0 preparation

Contributing

Contributions welcome! This is part of the unilibs organization - modern Unicode libraries for Go.

License

MIT License - see LICENSE file

Related Projects

Built by the Phoenix TUI Framework team.

Part of the unilibs ecosystem:

uniwidth - Unicode width calculation (this project)
unigrapheme - Grapheme clustering (planned)
More Unicode utilities coming soon!

Support

Issues: GitHub Issues
Discussions: GitHub Discussions

Special Thanks

Professor Ancha Baranova - This project would not have been possible without her invaluable help and support. Her assistance was crucial in bringing uniwidth to life.

Made with care by the Phoenix team | Powered by Go 1.25+

Name		Name	Last commit message	Last commit date
Latest commit History 13 Commits
.github		.github
bench		bench
cmd/generate-tables		cmd/generate-tables
docs		docs
scripts		scripts
.gitignore		.gitignore
.golangci.yml		.golangci.yml
CHANGELOG.md		CHANGELOG.md
CODE_OF_CONDUCT.md		CODE_OF_CONDUCT.md
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
README.md		README.md
ROADMAP.md		ROADMAP.md
SECURITY.md		SECURITY.md
benchmark_test.go		benchmark_test.go
conformance_test.go		conformance_test.go
fuzz_test.go		fuzz_test.go
go.mod		go.mod
go.sum		go.sum
options.go		options.go
options_test.go		options_test.go
tables_generated.go		tables_generated.go
uniwidth.go		uniwidth.go
uniwidth_test.go		uniwidth_test.go

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

uniwidth - Modern Unicode Width Calculation for Go

Performance

Features

Installation

Usage

Basic Usage

ZWJ Emoji Sequences

Options API

Real-World TUI Examples

Architecture

4-Tier O(1) Lookup

ZWJ State Machine

SWAR Optimization

Benchmarks

Use Cases

Migration from go-runewidth

Documentation

Testing

Development Status

Contributing

License

Related Projects

Support

Special Thanks

About

Uh oh!

Releases 3

Packages

Contributors 2

Uh oh!

Languages

License

unilibs/uniwidth

Folders and files

Latest commit

History

Repository files navigation

uniwidth - Modern Unicode Width Calculation for Go

Performance

Features

Installation

Usage

Basic Usage

ZWJ Emoji Sequences

Options API

Real-World TUI Examples

Architecture

4-Tier O(1) Lookup

ZWJ State Machine

SWAR Optimization

Benchmarks

Use Cases

Migration from go-runewidth

Documentation

Testing

Development Status

Contributing

License

Related Projects

Support

Special Thanks

About

Topics

Resources

License

Code of conduct

Contributing

Security policy

Uh oh!

Stars

Watchers

Forks

Releases 3

Packages 0

Contributors 2

Uh oh!

Languages

Packages