From bbaba0207715cdc32882b4e1996f5550f6fe1b4d Mon Sep 17 00:00:00 2001 From: Gitkbc Date: Tue, 24 Feb 2026 11:18:52 +0530 Subject: [PATCH 1/4] refactor(width): calibrate ambiguous width using LuaSystem >= 0.7.0 Remove per-character width cache and probe a single ambiguous-width character during initialization. Store the measured value globally and delegate width calculation to LuaSystem. Preserve test and test_write APIs for compatibility. Update documentation and rockspec dependency. --- CHANGELOG.md | 6 +- doc_topics/02-terminal_handling.md | 6 +- doc_topics/03-text_handling.md | 10 +- src/terminal/cli/select.lua | 6 +- src/terminal/draw/init.lua | 2 +- src/terminal/init.lua | 19 ++- src/terminal/progress.lua | 2 +- src/terminal/text/width.lua | 234 +++++++++-------------------- terminal-scm-1.rockspec | 2 +- 9 files changed, 104 insertions(+), 183 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index c2e22d8f..4af1eee7 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -32,8 +32,10 @@ The scope of what is covered by the version number excludes: ### Version X.Y.Z, unreleased -- a fix -- a change +- refactor: simplify `terminal.text.width` with `luasystem` (>= 0.7.0). + Removed per-character width cache and now calibrate one ambiguous-width + character during initialization, reused for all width calculations. + `test` and `test_write` are preserved for API compatibility. ### Version 0.1.0, released 01-Jan-2022 diff --git a/doc_topics/02-terminal_handling.md b/doc_topics/02-terminal_handling.md index 19181174..0a1de646 100644 --- a/doc_topics/02-terminal_handling.md +++ b/doc_topics/02-terminal_handling.md @@ -33,6 +33,10 @@ This is handled by the `terminal.input` module. Specifically the `terminal.input To properly control the UI in a terminal, it is important to know how text is displayed on the terminal. The primary thing to know is the display width of characters. -The `terminal.text.width` module provides functionality to test and report the width of characters and strings, as does `terminal.preload_widths`. The `terminal.size` function can be used to find the terminal size (in rows and columns), to see if the text to display fits the screen or will roll-over/scroll. +The `terminal.text.width` module reports character and string widths, using a +startup calibration for ambiguous-width characters. `terminal.preload_widths` +can be used to force that calibration. The `terminal.size` function can be used +to find the terminal size (in rows and columns), to see if the text to display +fits the screen or will roll-over/scroll. The `EditLine` class has advanced ways of handling width. diff --git a/doc_topics/03-text_handling.md b/doc_topics/03-text_handling.md index c68ad283..2ea5d95e 100644 --- a/doc_topics/03-text_handling.md +++ b/doc_topics/03-text_handling.md @@ -15,12 +15,12 @@ functions. If needed the Lua functions can be patched with the ones provided in # 3.1 Character display width -Since not all characters have a predefined width (east-asian languages with ambiguous widths), so even if using -LuaSystems functions to determine character display width there are still unknowns. The only way to know how they -render (single or double columns) is to actually test display width. +Some characters have an ambiguous width in Unicode (notably in East-Asian contexts). +To handle this, the library calibrates one ambiguous character at startup and reuses +that measured width in all width calculations. -For this purpose there are several utility functions in `terminal.text.width`, and there is the width-testing for -use during application startup/initialization by means of `terminal.preload_widths`. +This calibration is done during `terminal.initialize`, and can also be triggered +through `terminal.preload_widths`. # 3.2 Displaying strings diff --git a/src/terminal/cli/select.lua b/src/terminal/cli/select.lua index 2579bced..cab86307 100644 --- a/src/terminal/cli/select.lua +++ b/src/terminal/cli/select.lua @@ -165,15 +165,11 @@ end --- Returns the display height in rows. --- Note: on a first call it will test character widths, see `terminal.text.width.test`. --- So terminal must be initialized before calling this method. -- @treturn number The height of the menu in rows. function Select:height() if not self.widths then - -- first call, so test display width - t.text.width.test(self.prompt .. diamond .. circle .. dot .. pipe .. angle .. table.concat(self.choices)) - -- calculate display width + -- first call, calculate and cache display widths self.widths = {} for i, txt in ipairs(self.choices) do self.widths[i] = t.text.width.utf8swidth(pipe .. circle .. txt) diff --git a/src/terminal/draw/init.lua b/src/terminal/draw/init.lua index 950f59fb..4aabf90c 100644 --- a/src/terminal/draw/init.lua +++ b/src/terminal/draw/init.lua @@ -98,7 +98,7 @@ M.box_fmt = utils.make_lookup("box-format", { --- returns a string with all box_fmt characters, to pre-load the character width cache +-- returns a string with all box_fmt characters function M._box_fmt_chars() local r = {} for _, fmt in pairs(M.box_fmt) do diff --git a/src/terminal/init.lua b/src/terminal/init.lua index 2a77aeb0..db828e55 100644 --- a/src/terminal/init.lua +++ b/src/terminal/init.lua @@ -83,16 +83,19 @@ end ---- Preload known characters into the width-cache. --- Typically this should be called right after initialization. It will check default --- characters in use by this library, and the optional specified characters in `str`. --- Characters loaded will be the `terminal.draw.box_fmt` formats, and the `progress` spinner sprites. --- Uses `terminal.text.width.test` to test the widths of the characters. --- @tparam[opt] string str additional character string to preload +--- Calibrate display-width handling. +-- Detects the width of a single ambiguous-width character and stores it globally +-- for subsequent width calculations. +-- The optional argument is kept for backward compatibility. +-- @tparam[opt] string str unused; retained for backward compatibility -- @return true -- @within Initialization function M.preload_widths(str) - text.width.test((str or "") .. M.progress._spinner_fmt_chars() .. M.draw._box_fmt_chars()) + if str then + -- Kept for backward compatibility to preserve argument validation behavior. + assert(type(str) == "string", "expected string, got " .. type(str)) + end + text.width.initialize(true) return true end @@ -186,6 +189,8 @@ do sys.setconsoleflags(io.stdin, sys.getconsoleflags(io.stdin) - sys.CIF_PROCESSED_INPUT) end + text.width.initialize(true) + return true end diff --git a/src/terminal/progress.lua b/src/terminal/progress.lua index 889bc52f..06a88b76 100644 --- a/src/terminal/progress.lua +++ b/src/terminal/progress.lua @@ -52,7 +52,7 @@ M.sprites = utils.make_lookup("spinner-sprite", { --- returns a string with all spinner characters, to pre-load the character width cache +-- returns a string with all spinner characters function M._spinner_fmt_chars() local r = {} for _, fmt in pairs(M.sprites) do diff --git a/src/terminal/text/width.lua b/src/terminal/text/width.lua index f7e8baa3..5220b30c 100644 --- a/src/terminal/text/width.lua +++ b/src/terminal/text/width.lua @@ -1,18 +1,6 @@ ---- Module to check and validate character display widths. --- Not all characters are displayed with the same width on the terminal. --- The Unicode standard defines the width of many characters, but not all. --- Especially the ['ambiguous width'](https://www.unicode.org/Public/UCD/latest/ucd/EastAsianWidth.txt) --- characters can be displayed with different --- widths especially when used with East Asian languages. --- The only way to truly know their display width is to write them to the terminal --- and measure the cursor position change. --- --- This module implements a cache of character widths as they have been measured. --- --- To populate the cache with tested widths use `test` and `test_write`. --- --- To check width, using the cached widths, use `utf8cwidth` and `utf8swidth`. Any --- character not in the cache will be passed to `system.utf8cwidth` to determine the width. +--- Character display width helpers. +-- Uses LuaSystem width calculations with an optional calibrated +-- ambiguous-width value for terminal-specific behavior. -- @module terminal.text.width local M = {} @@ -20,18 +8,76 @@ package.loaded["terminal.text.width"] = M -- Register the module early to avoid local t = require "terminal" local sys = require "system" + local sys_utf8cwidth = sys.utf8cwidth +local sys_utf8swidth = sys.utf8swidth local utf8 = require("utf8") -- explicit lua-utf8 library call, for <= Lua 5.3 compatibility +local ambiguous_char = "·" +local ambiguous_codepoint = utf8.codepoint(ambiguous_char) + +M.ambiguous_width = nil + + +local function detect_ambiguous_width() + local row, col = t.cursor.position.get() + if not row then + return nil, col + end + local setpos = t.cursor.position.set_seq(row, col) + local query = ambiguous_char .. t.cursor.position.query_seq() .. setpos + + t.text.stack.push({ brightness = 0 }) + local result, err = t.input.query(query, "^\27%[(%d+);(%d+)R$") + t.text.stack.pop() + if not result then + return nil, err + end + local measured_col = tonumber(result[2]) + if not measured_col then + return nil, "invalid cursor query response" + end -local char_widths = {} -- registry to keep track of already tested widths + local width = measured_col - col + if width < 0 then + local _, cols = t.size() + width = width + cols + end + if width ~= 1 and width ~= 2 then + return nil, "invalid ambiguous width: " .. tostring(width) + end + return width +end + + +--- Initializes the module-wide ambiguous-width value. +-- If terminal probing is unavailable, this falls back to LuaSystem defaults. +-- @tparam[opt=false] boolean force_probe force terminal probing when initialized +-- @treturn number ambiguous width (1 or 2) +function M.initialize(force_probe) + if M.ambiguous_width and not force_probe then + return M.ambiguous_width + end + + if t.ready and t.ready() then + local width = detect_ambiguous_width() + if width then + M.ambiguous_width = width + return width + end + if M.ambiguous_width then + return M.ambiguous_width + end + end + + M.ambiguous_width = sys_utf8cwidth(ambiguous_codepoint) + return M.ambiguous_width +end --- Returns the width of a character in columns, matches `system.utf8cwidth` signature. --- This will check the cache of recorded widths first, and if not found, --- use `system.utf8cwidth` to determine the width. It will not test the width. -- @tparam string|number char the character (string or codepoint) to check -- @treturn number the width of the first character in columns function M.utf8cwidth(char) @@ -40,168 +86,36 @@ function M.utf8cwidth(char) elseif type(char) ~= "number" then error("expected string or number, got " .. type(char), 2) end - return char_widths[utf8.char(char)] or sys_utf8cwidth(char) + return sys_utf8cwidth(char, M.ambiguous_width) end - --- Returns the width of a string in columns, matches `system.utf8swidth` signature. --- It will use the cached widths, if no cached width is available it falls back on `system.utf8cwidth`. --- It will not test the width. -- @tparam string str the string to check -- @treturn number the width of the string in columns function M.utf8swidth(str) - local w = 0 - for pos, char in utf8.codes(str) do - w = w + (char_widths[utf8.char(char)] or sys_utf8cwidth(char)) - end - return w + return sys_utf8swidth(str, M.ambiguous_width) end - ---- Returns the width of the string, by test writing. --- Characters will be written 'invisible', so it does not show on the terminal, but it does need --- room to print them. The cursor is returned to its original position. --- It will read many character-widths at once, and hence is a lot faster than checking --- each character individually. The width of each character measured is recorded in the cache. --- --- - the text stack is used to set the brightness to 0 before, and restore colors/attributes after the test. --- - the test will be done at the current cursor position, and hence content there might be overwritten. Since --- a character is either 1 or 2 columns wide. The content of those 2 columns might have to be restored. +--- Returns the width of a string. -- @tparam string str the string of characters to test --- @treturn[1] number width in columns of the string --- @treturn[2] nil --- @treturn[2] string error message +-- @treturn number the width of the string in columns -- @within Testing function M.test(str) - local size = 50 -- max number of characters to do in 1 terminal write - local test = {} - local dup = {} - local width = 0 - for pos, char in utf8.codes(str) do - char = utf8.char(char) -- convert back to utf8 string - local cw = char_widths[char] - if cw then - -- we already know the width - width = width + cw - elseif not dup[char] then - -- we have no width, and it is not yet in the test list, so add it - test[#test+1] = char - dup[char] = true - end - end - - if #test == 0 then - return width -- nothing to test, return the width - end - - t.text.stack.push({ brightness = 0 }) -- set color to "hidden" - - local r, c = t.cursor.position.get() -- retrieve current position - local setpos = t.cursor.position.set_seq(r, c) -- string to restore cursor to current position - local getpos = t.cursor.position.query_seq() -- string to inject query for current position - local chunk = {} - local chars = {} - for i = 1, #test do -- process in chunks of max size - chars[#chars+1] = test[i] - local s = test[i] -- the character - .. getpos -- query for new position - .. setpos -- restore cursor to current position - chunk[#chunk+1] = s - if #chunk == size or i == #test then - -- handle the chunk - t.output.write(table.concat(chunk) .. " " .. setpos) -- write the chunk - local positions, err = t.input.read_query_answer("^\27%[(%d+);(%d+)R$", #chunk) - if not positions then - t.text.stack.pop() -- restore color (drop hidden) - return nil, err - end - - -- record sizes reported - for j, pos in ipairs(positions) do - local w = pos[2] - c - if w < 0 then - -- cursor wrapped to next line - local _, cols = t.size() - w = w + cols - end - char_widths[chars[j]] = w - end - - chunk = {} -- clear for next chunk - chars = {} - end - end - - t.text.stack.pop() -- restore color (drop hidden) - return M.test(str) -- re-run to get the total width, since all widths are known now + M.initialize() + return M.utf8swidth(str) end - ---- Returns the width of the string, and writes it to the terminal. --- Writes the string to the terminal, visible, whilst at the same time injecting cursor-position queries --- to detect the width of the unknown characters in the string. --- It will read many character-widths at once, and hence is a lot faster than checking --- each character individually. --- The width of each character measured is recorded in the cache. +--- Writes a string and returns its width. -- @tparam string str the string of characters to write and test -- @treturn number the width of the string in columns -- @within Testing function M.test_write(str) - local chunk = {} -- every character, pre/post fixed with a query if needed - local chars = {} -- array chars to test - local width = 0 - - do -- parse the string to test - local getpos = t.cursor.position.query_seq() -- string to inject; query for current position - local dups = {} - - for pos, char in utf8.codes(str) do - char = utf8.char(char) -- convert back to utf8 string - local cw = char_widths[char] - local query = "" - if cw then - -- we already know the width - width = width + cw - elseif not dups[char] then - -- we have no width, and it is not yet in the test list, so add the query - query = getpos - chars[#chars+1] = char - dups[char] = true - end - chunk[#chunk+1] = query .. char .. query - end - end - - t.output.write(table.concat(chunk)) - if #chars == 0 then - return width -- nothing to test, return the width - end - - local positions, err = t.input.read_query_answer("^\27%[(%d+);(%d+)R$", #chars * 2) - if not positions then - return nil, err - end - - -- record sizes reported - for j, pos in ipairs(positions) do - local char = chars[j] - local col_start = pos[j*2 - 1][2] - local col_end = pos[j*2][2] - local w = col_end - col_start - if w < 0 then - -- cursor wrapped to next line - local _, cols = t.size() - w = w + cols - end - char_widths[char] = w - end - - -- re-run to get the total width, since all widths are known now, - -- but this time do not write the string, just return the width - return M.test(str) + M.initialize() + t.output.write(str) + return M.utf8swidth(str) end return M diff --git a/terminal-scm-1.rockspec b/terminal-scm-1.rockspec index c768ae55..293e0036 100644 --- a/terminal-scm-1.rockspec +++ b/terminal-scm-1.rockspec @@ -25,7 +25,7 @@ description = { dependencies = { "lua >= 5.1, < 5.6", - "luasystem >= 0.6.3", + "luasystem >= 0.7.0", "utf8 >= 1.3.0", } From 8ce5d764995c5b1b287b6bd25754a6d9ffedf3ee Mon Sep 17 00:00:00 2001 From: Gitkbc Date: Tue, 24 Feb 2026 12:19:47 +0530 Subject: [PATCH 2/4] feat(progress): strip ANSI sequences for sprite width calculations --- CHANGELOG.md | 3 ++ spec/12-draw_spec.lua | 14 +++++- spec/13-progress_spec.lua | 87 ++++++++++++++++++++++++++++++++++++++ src/terminal/draw/line.lua | 19 +++++++-- src/terminal/progress.lua | 15 +++++-- 5 files changed, 130 insertions(+), 8 deletions(-) create mode 100644 spec/13-progress_spec.lua diff --git a/CHANGELOG.md b/CHANGELOG.md index 4af1eee7..33ac12c3 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -36,6 +36,9 @@ The scope of what is covered by the version number excludes: Removed per-character width cache and now calibrate one ambiguous-width character during initialization, reused for all width calculations. `test` and `test_write` are preserved for API compatibility. +- feat(progress): account for ANSI escape sequences in sprite width math. + Spinner frames and done sprites can now include color/style sequences + without breaking cursor rewind behavior. ### Version 0.1.0, released 01-Jan-2022 diff --git a/spec/12-draw_spec.lua b/spec/12-draw_spec.lua index 6d666090..245fa3a9 100644 --- a/spec/12-draw_spec.lua +++ b/spec/12-draw_spec.lua @@ -138,6 +138,18 @@ describe("terminal.draw", function() assert.are.equal("…llo 测试!", result) end) + + it("preserves ANSI codes in title when no truncation is needed", function() + local result = line.title_seq(10, "\27[31mTest\27[0m") + assert.are.equal("───\27[31mTest\27[0m───", result) + end) + + + it("strips ANSI codes when title truncation is needed", function() + local result = line.title_seq(8, "\27[31mVeryLongTitle\27[0m") + assert.are.equal("VeryLon…", result) + end) + end) @@ -191,4 +203,4 @@ describe("terminal.draw", function() end) -end) +end) \ No newline at end of file diff --git a/spec/13-progress_spec.lua b/spec/13-progress_spec.lua new file mode 100644 index 00000000..1f533562 --- /dev/null +++ b/spec/13-progress_spec.lua @@ -0,0 +1,87 @@ +local helpers = require "spec.helpers" + + +describe("terminal.progress", function() + + local terminal + local progress + + setup(function() + terminal = helpers.load() + progress = require("terminal.progress") + end) + + + teardown(function() + progress = nil + terminal = nil + helpers.unload() + end) + + + + describe("spinner()", function() + + before_each(function() + helpers.clear_output() + end) + + + + it("uses visible width for ANSI-styled single-width sprites", function() + local spinner = progress.spinner({ + sprites = { + [0] = "", + "\27[31mX\27[0m", + }, + stepsize = 10, + }) + + spinner(false) + + assert.are.equal( + "\27[31mX\27[0m" .. terminal.cursor.position.left_seq(1), + helpers.get_output() + ) + end) + + + it("uses visible width for ANSI-styled double-width sprites", function() + local spinner = progress.spinner({ + sprites = { + [0] = "", + "\27[31m界\27[0m", + }, + stepsize = 10, + }) + + spinner(false) + + assert.are.equal( + "\27[31m界\27[0m" .. terminal.cursor.position.left_seq(2), + helpers.get_output() + ) + end) + + + it("uses visible width for ANSI-styled done_sprite", function() + local spinner = progress.spinner({ + sprites = { + [0] = "x", + "x", + }, + done_sprite = "\27[32mOK\27[0m", + stepsize = 10, + }) + + spinner(true) + + assert.are.equal( + "\27[32mOK\27[0m" .. terminal.cursor.position.left_seq(2), + helpers.get_output() + ) + end) + + end) + +end) \ No newline at end of file diff --git a/src/terminal/draw/line.lua b/src/terminal/draw/line.lua index c18cd462..7482f7ce 100644 --- a/src/terminal/draw/line.lua +++ b/src/terminal/draw/line.lua @@ -76,6 +76,8 @@ end --- Creates a sequence to draw a horizontal line with a title centered in it without writing it to the terminal. -- Line is drawn left to right. If the width is too small for the title, the title is truncated. -- If less than 4 characters are available for the title, the title is omitted altogether. +-- ANSI escape sequences in title/prefix/postfix are ignored for width calculations. +-- If truncation is needed, the rendered title uses plain text. -- @tparam number width the total width of the line in columns -- @tparam[opt=""] string title the title to draw (if empty or nil, only the line is drawn) -- @tparam[opt="─"] string char the line-character to use @@ -92,11 +94,20 @@ function M.title_seq(width, title, char, pre, post, type, title_attr) pre = pre or "" post = post or "" - local pre_w = text.width.utf8swidth(pre) - local post_w = text.width.utf8swidth(post) + local pre_w = text.width.utf8swidth(utils.strip_ansi(pre)) + local post_w = text.width.utf8swidth(utils.strip_ansi(post)) local w_for_title = width - pre_w - post_w - local title, title_w = utils.truncate_ellipsis(w_for_title, title, type) + local stripped_title = utils.strip_ansi(title) + local stripped_title_w = text.width.utf8swidth(stripped_title) + local title_w + if stripped_title_w <= w_for_title then + title_w = stripped_title_w + else + stripped_title, title_w = utils.truncate_ellipsis(w_for_title, stripped_title, type) + title = stripped_title + end + if title_w == 0 then return M.horizontal_seq(width, char) end @@ -139,4 +150,4 @@ end -return M +return M \ No newline at end of file diff --git a/src/terminal/progress.lua b/src/terminal/progress.lua index 06a88b76..ed2da33e 100644 --- a/src/terminal/progress.lua +++ b/src/terminal/progress.lua @@ -13,6 +13,12 @@ local gettime = require("system").gettime +local function visible_width(str) + return tw.utf8swidth(utils.strip_ansi(str)) +end + + + --- table with predefined sprites for progress spinners. -- The sprites are tables of strings, where each string is a frame in the spinner animation. -- The frame at index 0 is optional and is the "done" message, the rest are the animation frames. @@ -71,6 +77,8 @@ end -- If `row` and `col` are given then terminal memory is used to (re)store the cursor position. If they are not given -- then the spinner will be printed at the current cursor position, and the cursor will return to the same position -- after each update. +-- ANSI escape sequences in sprites are ignored for width calculations, allowing +-- styled/colorized sprite frames. -- @tparam table opts a table of options; -- @tparam table opts.sprites a table of strings to display, one at a time, overwriting the previous one. Index 0 is the "done" message. -- See `sprites` for a table of predefined sprites. @@ -118,10 +126,11 @@ function M.spinner(opts) if i == 0 then s = opts.done_sprite or s end + local w = visible_width(s) local sequence = Sequence() sequence[#sequence+1] = pos_set sequence[#sequence+1] = (i == 0 and attr_push_done) or attr_push or nil - sequence[#sequence+1] = s .. t.cursor.position.left_seq(t.text.width.utf8swidth(s)) + sequence[#sequence+1] = s .. t.cursor.position.left_seq(w) sequence[#sequence+1] = attr_pop sequence[#sequence+1] = pos_restore steps[i] = sequence @@ -168,7 +177,7 @@ function M.ticker(text, width, text_done) local max_len = 0 for i = 1, lengths[0] do result[i] = utils.utf8sub(base, i, i + width - 1) - lengths[i] = tw.utf8swidth(result[i]) + lengths[i] = visible_width(result[i]) max_len = math.max(max_len, lengths[i]) end result[0] = utils.utf8sub(result[0], 1, max_len) @@ -185,4 +194,4 @@ end -return M +return M \ No newline at end of file From 57588d8acef136faa60c6e940e9a5a11d2e44a86 Mon Sep 17 00:00:00 2001 From: Gitkbc Date: Fri, 27 Feb 2026 01:49:39 +0530 Subject: [PATCH 3/4] Refactor width handling: remove caches and use single ambiguous-width calibration (LuaSystem >= 0.7.0) --- ARCHITECTURE.md | 26 +++-- doc_topics/02-terminal_handling.md | 22 +++-- doc_topics/03-text_handling.md | 11 ++- src/terminal/draw/init.lua | 18 ---- src/terminal/init.lua | 46 +++------ src/terminal/output.lua | 6 +- src/terminal/progress.lua | 11 --- src/terminal/text/width.lua | 152 ++++++++++------------------- 8 files changed, 104 insertions(+), 188 deletions(-) diff --git a/ARCHITECTURE.md b/ARCHITECTURE.md index d571063c..048a6f4f 100644 --- a/ARCHITECTURE.md +++ b/ARCHITECTURE.md @@ -109,7 +109,6 @@ The main entry point is `src/terminal/init.lua`, which exposes the `terminal` mo - Holds version metadata and high-level helpers: - `terminal.size()` – wrapper around `system.termsize`. - `terminal.bell()` / `terminal.bell_seq()` – terminal bell. - - `terminal.preload_widths()` – preloads characters into the width cache for box drawing and progress spinners. - Manages initialization/shutdown and integration with `system`: - Console flags, non-blocking input, code page, alternate screen buffer. - Sleep function wiring for async usage. @@ -234,18 +233,24 @@ This is encapsulated by **`terminal.input`** (e.g. `preread` and `read_query_ans ## 5. Text handling in the UI Terminal UI must align and truncate text by **display columns**, not by bytes or UTF-8 character count. Characters can be one or two columns wide (e.g. CJK, emojis), and some have ambiguous width. This section describes how to handle width, substrings, and formatted display so text renders correctly. - ### 5.1 Display width - **`terminal.text.width`** provides the width primitives: - - **`utf8cwidth(char)`** – width in columns of a single character (string or codepoint). Uses a cache when available; otherwise falls back to `system.utf8cwidth`. - - **`utf8swidth(str)`** – total display width of a string in columns. -- **Width cache:** Not all characters have a fixed width (e.g. East Asian ambiguous). The library maintains a cache of **tested** widths. To populate it: - - **`terminal.text.width.test(str)`** – writes characters invisibly, measures cursor movement, and records each character’s width. Call during startup or when you first display unknown glyphs. - - **`terminal.preload_widths(str)`** – convenience that tests the library’s own box-drawing and progress characters plus any optional `str`. Call once after `terminal.initialize` if you use `terminal.draw` or `terminal.progress`. -- Use **`terminal.size()`** to get terminal dimensions (rows × columns) so you can fit text to the visible area. + - **`utf8cwidth(char[, ambiguous_width])`** – returns the display width in columns of a single UTF-8 character (string or codepoint). + - **`utf8swidth(str[, ambiguous_width])`** – returns the total display width in columns of a UTF-8 string. + +Width calculation is delegated to the underlying `system.utf8cwidth` +and `system.utf8swidth` functions provided by `luasystem`. + +Ambiguous-width characters default to a width of 1 column. A different +width (1 or 2) can be specified explicitly via the optional +`ambiguous_width` parameter. + +Use **`terminal.size()`** to obtain terminal dimensions (rows × columns) +so text can be laid out to fit the visible area. -**Rule of thumb:** For correct alignment and truncation, always reason in **columns**. Use `utf8swidth` to measure strings and `utf8cwidth` for per-character width when implementing substrings or cursors. +**Rule of thumb:** For correct alignment and truncation, always reason in +**display columns**, not bytes or character count. ### 5.2 Substrings by characters vs columns @@ -289,7 +294,8 @@ Key methods for display and layout: - **Simple truncation or fixed-width slice:** use **`utils.utf8sub_col(str, 1, max_col)`** (and optionally ellipsis). - **Editable single/multi-line text with cursor and word wrap:** use **EditLine** and **`EditLine:format(...)`**. -- **Measuring or testing width:** use **`terminal.text.width.utf8swidth`** / **`utf8cwidth`** and **`terminal.text.width.test`** / **`terminal.preload_widths`** as above. +- Measuring display width: use `terminal.text.width.utf8swidth` + or `utf8cwidth`. All terminal output must go through **`terminal.output`** (e.g. `terminal.output.write`), not raw `print` or `io.write`, so that the library’s stream and any patching behave correctly. diff --git a/doc_topics/02-terminal_handling.md b/doc_topics/02-terminal_handling.md index 0a1de646..da512be8 100644 --- a/doc_topics/02-terminal_handling.md +++ b/doc_topics/02-terminal_handling.md @@ -30,13 +30,19 @@ This is handled by the `terminal.input` module. Specifically the `terminal.input # 2.4 Character width -To properly control the UI in a terminal, it is important to know how text is displayed on the terminal. -The primary thing to know is the display width of characters. - -The `terminal.text.width` module reports character and string widths, using a -startup calibration for ambiguous-width characters. `terminal.preload_widths` -can be used to force that calibration. The `terminal.size` function can be used -to find the terminal size (in rows and columns), to see if the text to display -fits the screen or will roll-over/scroll. +To properly control the UI in a terminal, it is important to know how text is displayed. +The primary thing to understand is the display width of characters. + +The `terminal.text.width` module reports character and string widths. Width +calculation is delegated to the underlying `system.utf8cwidth` and +`system.utf8swidth` functions. + +Ambiguous-width characters default to a width of 1 column, unless explicitly +configured. Width detection can optionally be performed during terminal +initialization. + +The `terminal.size` function can be used to determine the terminal size +(in rows and columns), to verify whether text fits the screen or will +wrap/scroll. The `EditLine` class has advanced ways of handling width. diff --git a/doc_topics/03-text_handling.md b/doc_topics/03-text_handling.md index 2ea5d95e..981bcbc4 100644 --- a/doc_topics/03-text_handling.md +++ b/doc_topics/03-text_handling.md @@ -15,12 +15,13 @@ functions. If needed the Lua functions can be patched with the ones provided in # 3.1 Character display width -Some characters have an ambiguous width in Unicode (notably in East-Asian contexts). -To handle this, the library calibrates one ambiguous character at startup and reuses -that measured width in all width calculations. +Some Unicode characters have an ambiguous display width (notably in +East-Asian contexts). Width calculation is delegated to the underlying +`system.utf8cwidth` and `system.utf8swidth` functions. -This calibration is done during `terminal.initialize`, and can also be triggered -through `terminal.preload_widths`. +Ambiguous-width characters default to a width of 1 column. If required, +a different width (1 or 2) can be specified explicitly when calling +the width functions. # 3.2 Displaying strings diff --git a/src/terminal/draw/init.lua b/src/terminal/draw/init.lua index 4aabf90c..8a594332 100644 --- a/src/terminal/draw/init.lua +++ b/src/terminal/draw/init.lua @@ -97,24 +97,6 @@ M.box_fmt = utils.make_lookup("box-format", { }) - --- returns a string with all box_fmt characters -function M._box_fmt_chars() - local r = {} - for _, fmt in pairs(M.box_fmt) do - if type(fmt) == "table" then - for _, v in pairs(fmt) do - if type(v) == "string" then - r[#r+1] = v - end - end - end - end - return table.concat(r) -end - - - --- Creates a sequence to draw a box, without writing it to the terminal. -- The box is drawn starting from the top-left corner at the current cursor position, -- after drawing the cursor will be in the same position. diff --git a/src/terminal/init.lua b/src/terminal/init.lua index db828e55..468cc0fe 100644 --- a/src/terminal/init.lua +++ b/src/terminal/init.lua @@ -9,13 +9,13 @@ -- -- For generic instruction please read the [introduction](../topics/01-introduction.md.html). -- --- @copyright Copyright (c) 2024-2025 Thijs Schreijer +-- @copyright Copyright (c) 2024-2024 Thijs Schreijer -- @author Thijs Schreijer -- @license MIT, see `LICENSE.md`. local M = { _VERSION = "0.0.1", - _COPYRIGHT = "Copyright (c) 2024-2025 Thijs Schreijer", + _COPYRIGHT = "Copyright (c) 2024-2026 Thijs Schreijer", _DESCRIPTION = "Cross platform terminal library for Lua (Windows/Unix/Mac)", } @@ -32,7 +32,7 @@ local sys = require "system" -- Push the module table already in `package.loaded` to avoid circular dependencies package.loaded["terminal"] = M --- load the submodules; all but object; editline, sequence, cli.*, ui.* +-- load the submodules M.input = require("terminal.input") M.output = require("terminal.output") M.clear = require("terminal.clear") @@ -41,7 +41,6 @@ M.cursor = require("terminal.cursor") M.text = require("terminal.text") M.draw = require("terminal.draw") M.progress = require("terminal.progress") -M.utils = require("terminal.utils") -- create locals local output = M.output local scroll = M.scroll @@ -68,7 +67,7 @@ M.size = sys.termsize --- Returns a string sequence to make the terminal beep. -- @treturn string ansi sequence to write to the terminal -function M.bell_seq() +function M.beep_seq() return "\a" end @@ -76,31 +75,12 @@ end --- Write a sequence to the terminal to make it beep. -- @return true -function M.bell() - output.write(M.bell_seq()) +function M.beep() + output.write(M.beep_seq()) return true end - ---- Calibrate display-width handling. --- Detects the width of a single ambiguous-width character and stores it globally --- for subsequent width calculations. --- The optional argument is kept for backward compatibility. --- @tparam[opt] string str unused; retained for backward compatibility --- @return true --- @within Initialization -function M.preload_widths(str) - if str then - -- Kept for backward compatibility to preserve argument validation behavior. - assert(type(str) == "string", "expected string, got " .. type(str)) - end - text.width.initialize(true) - return true -end - - - do local termbackup local reset = "\27[0m" @@ -137,9 +117,10 @@ do -- See [`luasystem.autotermrestore`](https://lunarmodules.github.io/luasystem/modules/system.html#autotermrestore). -- @tparam[opt=false] boolean opts.disable_sigint if `true`, the terminal will not send a SIGINT signal -- on Ctrl-C. Disables Ctrl-C, Ctrl-Z, and Ctrl-\, which allows the application to handle them. + -- @tparam[opt=true] boolean opts.calibrate_width if `false`, skips the automatic ambiguous-width calibration. -- @return true -- @within Initialization - function M.initialize(opts) +function M.initialize(opts) assert(not M.ready(), "terminal already initialized") opts = opts or {} @@ -189,7 +170,10 @@ do sys.setconsoleflags(io.stdin, sys.getconsoleflags(io.stdin) - sys.CIF_PROCESSED_INPUT) end - text.width.initialize(true) + -- Hook in the width calibration unless the user explicitly opts out + if opts.calibrate_width ~= false then + text.width.calibrate() + end return true end @@ -238,7 +222,7 @@ end -- This function wraps a function in calls to `initialize` and `shutdown`, ensuring the terminal is properly shut down. -- If an error is caught, it first shutsdown the terminal and then rethrows the error. -- @tparam function main the function to wrap --- @tparam[opt] table opts options table, see `initialize` for details. +-- @tparam[opt] table opts options table, to pass to `initialize`. -- @treturn function wrapped function -- @within Initialization -- @usage @@ -276,6 +260,4 @@ function M.initwrap(main, opts) end end - - -return M +return M \ No newline at end of file diff --git a/src/terminal/output.lua b/src/terminal/output.lua index 72ff52ee..ca7b0528 100644 --- a/src/terminal/output.lua +++ b/src/terminal/output.lua @@ -12,7 +12,7 @@ local M = {} package.loaded["terminal.output"] = M -- Register the module early to avoid circular dependencies - +local sys = require("system") local t = io.stderr -- the terminal/stream to operate on @@ -85,7 +85,9 @@ function M.print(...) return true end - +function M.isatty() + return sys.isatty(t) +end --- Flushes the stream. -- @return the return value of the stream's `flush` function diff --git a/src/terminal/progress.lua b/src/terminal/progress.lua index ed2da33e..e5abd451 100644 --- a/src/terminal/progress.lua +++ b/src/terminal/progress.lua @@ -58,17 +58,6 @@ M.sprites = utils.make_lookup("spinner-sprite", { --- returns a string with all spinner characters -function M._spinner_fmt_chars() - local r = {} - for _, fmt in pairs(M.sprites) do - for _, v in pairs(fmt) do - r[#r+1] = v - end - end - return table.concat(r) -end - --- Create a progress spinner. diff --git a/src/terminal/text/width.lua b/src/terminal/text/width.lua index 5220b30c..57ff8015 100644 --- a/src/terminal/text/width.lua +++ b/src/terminal/text/width.lua @@ -1,121 +1,69 @@ ---- Character display width helpers. --- Uses LuaSystem width calculations with an optional calibrated --- ambiguous-width value for terminal-specific behavior. +-- Character display width helpers (LuaSystem 0.7.0+). +-- +-- Delegates to system.utf8cwidth / utf8swidth. +-- Ambiguous-width characters (East-Asian) default to 1 but can be calibrated +-- once at startup to match what the actual terminal does. +-- -- @module terminal.text.width local M = {} -package.loaded["terminal.text.width"] = M -- Register the module early to avoid circular dependencies -local t = require "terminal" -local sys = require "system" - -local sys_utf8cwidth = sys.utf8cwidth -local sys_utf8swidth = sys.utf8swidth -local utf8 = require("utf8") -- explicit lua-utf8 library call, for <= Lua 5.3 compatibility -local ambiguous_char = "·" -local ambiguous_codepoint = utf8.codepoint(ambiguous_char) - -M.ambiguous_width = nil - - -local function detect_ambiguous_width() - local row, col = t.cursor.position.get() - if not row then - return nil, col - end - local setpos = t.cursor.position.set_seq(row, col) - local query = ambiguous_char .. t.cursor.position.query_seq() .. setpos - t.text.stack.push({ brightness = 0 }) - local result, err = t.input.query(query, "^\27%[(%d+);(%d+)R$") - t.text.stack.pop() - - if not result then - return nil, err - end - - local measured_col = tonumber(result[2]) - if not measured_col then - return nil, "invalid cursor query response" - end - - local width = measured_col - col - if width < 0 then - local _, cols = t.size() - width = width + cols - end - - if width ~= 1 and width ~= 2 then - return nil, "invalid ambiguous width: " .. tostring(width) - end +local sys = require "system" +local t = require "terminal" +-- Stored ambiguous width (1 or 2). +-- Default = 1 (safe for most modern terminals and non-TTY output). +local AMBIGUOUS_WIDTH = 1 -- Default - return width +--Getter and setter for ambiguous width, in case users want to manage it themselves or check it after calibration. +function M.get_ambiguous_width() + return AMBIGUOUS_WIDTH +end +-- Manually sets the ambiguous width setting. +-- @tparam number width must be 1 or 2 +function M.set_ambiguous_width(width) + if width ~= 1 and width ~= 2 then + error("ambiguous_width must be 1 or 2, got " .. tostring(width)) + end + AMBIGUOUS_WIDTH = width end ---- Initializes the module-wide ambiguous-width value. --- If terminal probing is unavailable, this falls back to LuaSystem defaults. --- @tparam[opt=false] boolean force_probe force terminal probing when initialized --- @treturn number ambiguous width (1 or 2) -function M.initialize(force_probe) - if M.ambiguous_width and not force_probe then - return M.ambiguous_width - end - if t.ready and t.ready() then - local width = detect_ambiguous_width() - if width then - M.ambiguous_width = width - return width - end - if M.ambiguous_width then - return M.ambiguous_width +-- Calibrates the ambiguous width by probing one character. +-- Only runs when we have a real TTY. Idempotent. +-- @return number the detected width (1 or 2) +function M.calibrate() + if not t.output.isatty() then return AMBIGUOUS_WIDTH end + + if not t.ready() then + error("terminal must be initialized before calibration") end - end - M.ambiguous_width = sys_utf8cwidth(ambiguous_codepoint) - return M.ambiguous_width + local r, c = t.cursor.position.get() + if not r then return AMBIGUOUS_WIDTH end + + -- Write an ambiguous character ("middle dot") and measure displacement + t.output.write("·") + t.output.flush() + + local _, new_c = t.cursor.position.get() + t.cursor.position.set(r, c) -- Restore cursor + + if new_c then + local measured = new_c - c + if measured == 1 or measured == 2 then + AMBIGUOUS_WIDTH = measured + end + end + return AMBIGUOUS_WIDTH end ---- Returns the width of a character in columns, matches `system.utf8cwidth` signature. --- @tparam string|number char the character (string or codepoint) to check --- @treturn number the width of the first character in columns function M.utf8cwidth(char) - if type(char) == "string" then - char = utf8.codepoint(char) - elseif type(char) ~= "number" then - error("expected string or number, got " .. type(char), 2) - end - return sys_utf8cwidth(char, M.ambiguous_width) + return sys.utf8cwidth(char, AMBIGUOUS_WIDTH) end - - ---- Returns the width of a string in columns, matches `system.utf8swidth` signature. --- @tparam string str the string to check --- @treturn number the width of the string in columns function M.utf8swidth(str) - return sys_utf8swidth(str, M.ambiguous_width) -end - - ---- Returns the width of a string. --- @tparam string str the string of characters to test --- @treturn number the width of the string in columns --- @within Testing -function M.test(str) - M.initialize() - return M.utf8swidth(str) -end - - ---- Writes a string and returns its width. --- @tparam string str the string of characters to write and test --- @treturn number the width of the string in columns --- @within Testing -function M.test_write(str) - M.initialize() - t.output.write(str) - return M.utf8swidth(str) + return sys.utf8swidth(str, AMBIGUOUS_WIDTH) end -return M +return M \ No newline at end of file From 50c239d4d02e6b94afb7ae3865679133e4bea35c Mon Sep 17 00:00:00 2001 From: Gitkbc Date: Fri, 27 Feb 2026 02:13:46 +0530 Subject: [PATCH 4/4] Restore terminal.utils export to public API --- src/terminal/init.lua | 1 + 1 file changed, 1 insertion(+) diff --git a/src/terminal/init.lua b/src/terminal/init.lua index 468cc0fe..0f268ed7 100644 --- a/src/terminal/init.lua +++ b/src/terminal/init.lua @@ -41,6 +41,7 @@ M.cursor = require("terminal.cursor") M.text = require("terminal.text") M.draw = require("terminal.draw") M.progress = require("terminal.progress") +M.utils = require("terminal.utils") -- create locals local output = M.output local scroll = M.scroll