Merge pull request #2 from wbrown/kdcache
KD Cache, fixes, optimizations, cleanup
wbrown authored Aug 5, 2024
2 parents 7c3a21e + 6c2ddfb commit 686acf1
Showing 11 changed files with 635 additions and 221 deletions.
142 changes: 119 additions & 23 deletions README.md
@@ -1,40 +1,136 @@
img2ansi
========
Image to ANSI conversion:

Major features:
* Modified Atkinson dithering
* Edge detection
* ANSI compression
* Subcharacter block rendering with 2 colors per character.
* Separate foreground and background palette for terminals that support it.
* Color quantization
* Maximum file size targeting
* For IRC: line limits

## Block-Based ANSI Art Dithering Algorithm (Brown Dithering Algorithm)

This project implements a unique dithering algorithm specifically designed
for converting images into ANSI art. Unlike traditional dithering methods,
my approach uses a block-based processing technique optimized for terminal
and text-based display.

## Key Features

1. **Block-Based Processing**: Operates on 2x2 pixel blocks instead of
individual pixels, allowing for more complex patterns within a single
character cell.

2. **ANSI Color Quantization**: Utilizes a specialized color quantization
scheme tailored for the ANSI color palette, ensuring optimal color
representation in terminal environments.

3. **Unicode Block Character Selection**: Chooses the best Unicode block
character to represent each 2x2 pixel block, maximizing the detail in the
final ANSI art.

4. **Dual-Color Representation**: Each block is represented by both a
foreground and background color, enabling more nuanced color transitions
and detail.

5. **Edge Detection Integration**: Incorporates edge detection to adjust
error distribution, preserving important image details.

6. **Optimized for Text Output**: Designed to produce ANSI escape code
sequences, making it ideal for terminal-based image display.
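
The block-matching idea behind features 1 through 4 can be sketched as a brute-force search over quadrant patterns and color pairs. This is a minimal illustration rather than the project's implementation: the `RGB` type, `bestBlock`, the quadrant bit order, and the squared-distance error metric are all assumptions made for the example.

```go
package main

import "fmt"

// RGB is a hypothetical 8-bit color triple used only in this sketch.
type RGB struct{ R, G, B int }

// dist2 returns the squared Euclidean distance between two colors.
func dist2(a, b RGB) int {
	dr, dg, db := a.R-b.R, a.G-b.G, a.B-b.B
	return dr*dr + dg*dg + db*db
}

// quadrantRunes maps a 4-bit mask to a Unicode block character.
// Assumed bit order: bit 0 = upper left, bit 1 = upper right,
// bit 2 = lower left, bit 3 = lower right. A set bit means the
// quadrant is drawn in the foreground color.
var quadrantRunes = [16]rune{
	' ', '▘', '▝', '▀', '▖', '▌', '▞', '▛',
	'▗', '▚', '▐', '▜', '▄', '▙', '▟', '█',
}

// bestBlock tries every pattern and every foreground/background pair from
// the palette and returns the combination with the lowest total error.
// block holds the pixels in the order UL, UR, LL, LR.
func bestBlock(block [4]RGB, palette []RGB) (rune, RGB, RGB, int) {
	bestErr := -1
	var bestRune rune
	var bestFG, bestBG RGB
	for mask := 0; mask < 16; mask++ {
		for _, fg := range palette {
			for _, bg := range palette {
				err := 0
				for q := 0; q < 4; q++ {
					c := bg
					if mask&(1<<q) != 0 {
						c = fg
					}
					err += dist2(block[q], c)
				}
				if bestErr < 0 || err < bestErr {
					bestErr = err
					bestRune, bestFG, bestBG = quadrantRunes[mask], fg, bg
				}
			}
		}
	}
	return bestRune, bestFG, bestBG, bestErr
}

func main() {
	red, blue := RGB{255, 0, 0}, RGB{0, 0, 255}
	palette := []RGB{{0, 0, 0}, red, blue, {255, 255, 255}}
	r, fg, bg, e := bestBlock([4]RGB{red, red, blue, blue}, palette)
	fmt.Printf("%c fg=%v bg=%v err=%d\n", r, fg, bg, e)
}
```

The search is exhaustive, which is only practical for small palettes; this cost is presumably what the cache and KD-tree options described under Performance are meant to reduce.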

## How It Works

The algorithm processes the input image in 2x2 blocks, determining the best
Unicode character and color combination to represent each block. It then
uses a modified error diffusion technique inspired by Floyd-Steinberg
dithering to distribute quantization errors to neighboring blocks.

This approach results in high-quality ANSI art that preserves the detail
and color of the original image while optimizing for the constraints of
text-based display.
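
The error diffusion step described above can be illustrated with the classic Floyd-Steinberg weights. This is a sketch only: it works on single grayscale cells instead of 2x2 color blocks, uses the unmodified 7/16, 3/16, 5/16, and 1/16 weights, and leaves out the edge-aware adjustment. The names `diffuse` and `dither` are hypothetical.

```go
package main

import "fmt"

// diffuse spreads a quantization error e from cell (x, y) to its
// not-yet-processed neighbors using the Floyd-Steinberg weights.
// Neighbors that fall outside the grid are skipped.
func diffuse(grid [][]float64, x, y int, e float64) {
	type tap struct {
		dx, dy int
		w      float64
	}
	taps := []tap{
		{1, 0, 7.0 / 16},  // right
		{-1, 1, 3.0 / 16}, // lower left
		{0, 1, 5.0 / 16},  // below
		{1, 1, 1.0 / 16},  // lower right
	}
	for _, t := range taps {
		nx, ny := x+t.dx, y+t.dy
		if ny < 0 || ny >= len(grid) || nx < 0 || nx >= len(grid[ny]) {
			continue
		}
		grid[ny][nx] += e * t.w
	}
}

// dither quantizes each cell to 0 or 255 in scan order and pushes the
// resulting error onto the neighbors. It modifies grid in place.
func dither(grid [][]float64) [][]int {
	out := make([][]int, len(grid))
	for y := range grid {
		out[y] = make([]int, len(grid[y]))
		for x := range grid[y] {
			old := grid[y][x]
			q := 0
			if old >= 128 {
				q = 255
			}
			out[y][x] = q
			diffuse(grid, x, y, old-float64(q))
		}
	}
	return out
}

func main() {
	grid := [][]float64{{100, 100, 100}, {100, 100, 100}}
	fmt.Println(dither(grid))
}
```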

## Requirements
Requires OpenCV 4 to be installed.

## Example Output

The examples below are 80-column-wide images with a scale factor of 2. The
first uses the default 16-color ANSI palette, the second the JetBrains color
scheme, and the third the 256-color palette.

![Baboon ANSI Art - 16 colors](examples/baboon_16.png)
![Baboon ANSI Art - JetBrains](examples/baboon_jb.png)
![Baboon ANSI Art - 256 colors](examples/baboon_256.png)

## Installation
To build the program, run the following command:

```sh
go get -u github.com/wbrown/img2ansi
```

## Usage
`./img2ansi -input <input> [-output <output>] [-width <width>]
[-scale <scale>] [-quantization <quantization>] [-maxchars <maxchars>]
[-8bit] [-jb] [-table]`

**Performance**

The following options trade speed against quality, and the defaults aim for a
good balance between the two. For the absolute best quality, set the
`-kdsearch` option to `0` and the `-cache_threshold` option to `0`. This may
cause the program to take multiple minutes to run.

* `-kdsearch <int>`: Number of nearest neighbors to search in KD-tree, `0` to
disable (default `50`)

The KD search option is the number of nearest neighbors to search in the
KD-tree for the block cache. A value of `0` will disable the KD-tree search
and the cache.

* `-cache_threshold <float>`: Threshold for block cache (default `40`)

The block cache stores block characters that have already been chosen while
rendering the image, so the program does not have to recompute the best match
for every 2x2 pixel block. It is a fuzzy cache: a cached entry is reused only
when its error distance from the target block is below the threshold.
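
A thresholded ("fuzzy") cache of this kind can be sketched as follows. This is an illustration under stated assumptions, not the code from this commit: the `fuzzyCache` type, the 3-bit bucketing key, and the summed Euclidean distance used as the error are all hypothetical, and the threshold value only has meaning in this sketch's own units.

```go
package main

import (
	"fmt"
	"math"
)

// RGB is a hypothetical 8-bit color triple used only in this sketch.
type RGB struct{ R, G, B int }

// match is a cached rendering decision: a rune plus its two colors.
// Mask bit q set means quadrant q is drawn in FG, otherwise in BG.
type match struct {
	Rune   rune
	Mask   int
	FG, BG RGB
}

// blockError sums, over the four quadrants, the Euclidean distance between
// the pixel and the color the match would draw there.
func blockError(block [4]RGB, m match) float64 {
	total := 0.0
	for q, c := range block {
		t := m.BG
		if m.Mask&(1<<q) != 0 {
			t = m.FG
		}
		dr, dg, db := float64(c.R-t.R), float64(c.G-t.G), float64(c.B-t.B)
		total += math.Sqrt(dr*dr + dg*dg + db*db)
	}
	return total
}

type fuzzyCache struct {
	threshold    float64
	entries      map[[4]RGB][]match
	hits, misses int
}

// bucket keeps only the top 3 bits of each channel so that similar blocks
// land under the same key (an assumed keying scheme).
func bucket(block [4]RGB) [4]RGB {
	var k [4]RGB
	for i, c := range block {
		k[i] = RGB{c.R >> 5, c.G >> 5, c.B >> 5}
	}
	return k
}

func (c *fuzzyCache) add(block [4]RGB, m match) {
	k := bucket(block)
	c.entries[k] = append(c.entries[k], m)
}

// get returns the lowest-error cached match whose error against the actual
// block is below the threshold, or false if there is none.
func (c *fuzzyCache) get(block [4]RGB) (match, bool) {
	var best match
	bestErr, found := c.threshold, false
	for _, m := range c.entries[bucket(block)] {
		if e := blockError(block, m); e < bestErr {
			best, bestErr, found = m, e, true
		}
	}
	if found {
		c.hits++
	} else {
		c.misses++
	}
	return best, found
}

// solid builds a block whose four quadrants share one color.
func solid(c RGB) [4]RGB { return [4]RGB{c, c, c, c} }

func main() {
	cache := &fuzzyCache{threshold: 40, entries: map[[4]RGB][]match{}}
	red := RGB{255, 0, 0}
	cache.add(solid(red), match{Rune: '█', Mask: 15, FG: red, BG: red})
	_, near := cache.get(solid(RGB{250, 5, 0}))
	_, far := cache.get(solid(RGB{224, 31, 0}))
	fmt.Println(near, far, cache.hits, cache.misses)
}
```

Both lookups fall into the same bucket as the stored entry, but only the first is close enough to pass the threshold. A lower threshold therefore means fewer reused entries and more full searches.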

**Colors**

By default the program uses the 16-color ANSI palette, split into 8 foreground
colors and 8 background colors. The `-8bit` option can be used to enable 256
color mode. The `-jb` option can be used to use the JetBrains color scheme,
which allows for separate foreground and background palettes to effectively
double the number of colors available.

The program performs well without quantization, but if you want to reduce the
number of colors in the output, you can use the `-quantization` option. The
default is `256` colors. This isn't the output colors, but the number of
colors used in the quantization step.
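
For reference, the 16-color and 256-color modes correspond to standard ANSI SGR escape sequences. The sketch below shows how such sequences are formed. The helper names `sgr16` and `sgr256` are hypothetical, and whether the program emits exactly these combined sequences is an assumption.

```go
package main

import "fmt"

// reset returns the terminal to its default colors.
const reset = "\x1b[0m"

// sgr16 builds a sequence from the 8 standard foreground codes (30-37)
// and the 8 standard background codes (40-47). fg and bg are indices 0-7.
func sgr16(fg, bg int) string {
	return fmt.Sprintf("\x1b[%d;%dm", 30+fg, 40+bg)
}

// sgr256 builds a sequence for the 256-color palette, where 38;5;N selects
// the foreground and 48;5;N selects the background. fg and bg are 0-255.
func sgr256(fg, bg int) string {
	return fmt.Sprintf("\x1b[38;5;%d;48;5;%dm", fg, bg)
}

func main() {
	// Upper half blocks: the top half shows the foreground color and the
	// bottom half shows the background color.
	fmt.Println(sgr16(1, 4) + "▀▀▀▀" + reset)
	fmt.Println(sgr256(196, 21) + "▀▀▀▀" + reset)
}
```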

**Image Size**

The `-width` option sets the target width of the output image; this is the
primary factor in determining the output ANSI dimensions. The
default `-scale` is `2`, which approximately halves the height of the output,
to compensate for the fact that characters are taller than they are wide.
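
As a rough worked example of how width and scale interact, the sketch below estimates the character grid for a given input image. The formula `rows = width * (imgH / imgW) / scale` and the function name `outputSize` are assumptions made for illustration; the program may round or clamp differently.

```go
package main

import (
	"fmt"
	"math"
)

// outputSize estimates the character grid for an image of imgW x imgH pixels.
// ASSUMPTION: the row count is the column count times the image aspect ratio,
// divided by the scale factor.
func outputSize(imgW, imgH, width int, scale float64) (cols, rows int) {
	aspect := float64(imgH) / float64(imgW)
	cols = width
	rows = int(math.Round(float64(width) * aspect / scale))
	return cols, rows
}

func main() {
	cols, rows := outputSize(512, 512, 80, 2)
	// Each character cell covers a 2x2 pixel block, so the image is sampled
	// at twice the character resolution in both directions.
	fmt.Printf("%d columns x %d rows, sampled as %d x %d pixels\n",
		cols, rows, cols*2, rows*2)
}
```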

```
-edge int
Color difference threshold for edge detection skipping (default 100)
-8bit
Use 8-bit ANSI colors (256 colors)
-cache_threshold float
Threshold for block cache (default 40)
-input string
Path to the input image file (required)
-jb
Use JetBrains color scheme
-kdsearch int
Number of nearest neighbors to search in KD-tree, 0 to disable (default 50)
-maxchars int
Maximum number of characters in the output (default 1048576)
-maxline int
Maximum number of characters in a line, 0 for no limit
-output string
Path to save the output (if not specified, prints to stdout)
-quantization int
Quantization factor (default 256)
-scale float
Scale factor for the output image (default 3)
-separate
Use separate palettes for foreground and background colors
-shading
Enable shading for more detailed output
-space
Convert block characters to spaces
Scale factor for the output image (default 2)
-table
Print ANSI color table
-width int
Target width of the output image (default 100)
Target width of the output image (default 80)
```
8 changes: 6 additions & 2 deletions ansi.go
@@ -7,7 +7,7 @@ import (

 // compressANSI compresses an ANSI image by combining adjacent blocks with
 // the same foreground and background colors. The function takes an ANSI
-// image as a string and returns the compressed ANSI image as a string.
+// image as a string and returns the more efficient ANSI image as a string.
 func compressANSI(ansiImage string) string {
 	var compressed strings.Builder
 	var currentFg, currentBg, currentBlock string
@@ -34,17 +34,21 @@ func compressANSI(ansiImage string) string {
 				fg = ""
 			}

+			// If any color or block changes, write the current block
+			// and start a new one
 			if fg != currentFg || bg != currentBg || block != currentBlock {
 				if count > 0 {
 					compressed.WriteString(
-						formatANSICode(currentFg, currentBg, currentBlock, count))
+						formatANSICode(
+							currentFg, currentBg, currentBlock, count))
 				}
 				currentFg, currentBg, currentBlock = fg, bg, block
 				count = 1
 			} else {
 				count++
 			}
 		}
+		// Write the last block of the line
 		if count > 0 {
 			compressed.WriteString(
 				formatANSICode(currentFg, currentBg, currentBlock, count))
110 changes: 110 additions & 0 deletions approximatecache.go
@@ -0,0 +1,110 @@
package main

import (
	"math"
)

// ApproximateCache is a map of Uint256 to lookupEntry
// that is used to store approximate matches for a given
// block of 4 RGB values. Approximate matches are performed
// by comparing the error of a given match to a threshold
// value.
//
// The key of the map is a Uint256, which is a 256-bit
// unsigned integer that is used to represent the foreground
// and background colors of a block of 4 RGB values.
//
// There may be multiple matches for a given key, so the
// value of the map is a lookupEntry, which is a struct
// that contains a slice of Match structs.
type ApproximateCache map[Uint256]lookupEntry

// Match is a struct that contains the rune, foreground
// color, background color, and error of a match. The error
// is a float64 value that represents the difference between
// the actual block of 4 RGB values and the pair of foreground
// and background colors encoded in the key as a Uint256.
type Match struct {
	Rune  rune
	FG    RGB
	BG    RGB
	Error float64
}

// lookupEntry holds every Match stored under a single cache key.
type lookupEntry struct {
	Matches []Match
}

// addEntry adds a new entry to the cache. The entry is
// represented by a key, which is a Uint256, and a Match
// struct that contains the rune, foreground color, background
// color, and error of the match.
func (cache ApproximateCache) addEntry(
	k Uint256,
	r rune,
	fg RGB,
	bg RGB,
	block [4]RGB,
	isEdge bool,
) {
	newMatch := Match{
		Rune: r,
		FG:   fg,
		BG:   bg,
		Error: calculateBlockError(
			block,
			getQuadrantsForRune(r),
			fg,
			bg,
			isEdge,
		),
	}
	if entry, exists := lookupTable[k]; exists {
		// Create a new slice with the appended match
		updatedMatches := append(entry.Matches, newMatch)
		// Update the map with the new slice
		lookupTable[k] = lookupEntry{Matches: updatedMatches}
	} else {
		lookupTable[k] = lookupEntry{Matches: []Match{newMatch}}
	}
}

// getEntry retrieves an entry from the cache. The entry is
// represented by a key, which is a Uint256, and a block of
// 4 RGB values. The function returns the rune, foreground
// color, background color, and a boolean value indicating
// whether the entry was found in the cache.
//
// There may be multiple matches for a given key, so the
// function returns the match with the lowest error value
// that is also below the cache threshold.
func (cache ApproximateCache) getEntry(
	k Uint256,
	block [4]RGB,
	isEdge bool,
) (rune, RGB, RGB, bool) {
	baseThreshold := cacheThreshold
	if isEdge {
		baseThreshold *= 0.7
	}
	lowestError := math.MaxFloat64
	var bestMatch *Match
	if entry, exists := lookupTable[k]; exists {
		for i := range entry.Matches {
			// Point into the slice rather than at the loop variable,
			// which is reused across iterations before Go 1.22.
			match := &entry.Matches[i]
			// Recalculate error for this match
			matchError := calculateBlockError(block,
				getQuadrantsForRune(match.Rune), match.FG, match.BG, isEdge)
			if matchError < baseThreshold && matchError < lowestError {
				lowestError = matchError
				bestMatch = match
			}
		}
		if bestMatch != nil {
			lookupHits++
			return bestMatch.Rune, bestMatch.FG, bestMatch.BG, true
		}
	}
	lookupMisses++
	return 0, RGB{}, RGB{}, false
}
Binary file added examples/baboon_16.png
Binary file added examples/baboon_256.png
Binary file added examples/baboon_jb.png
15 changes: 13 additions & 2 deletions image.go
@@ -82,13 +82,24 @@ func saveToPNG(img gocv.Mat, filename string) error {
 // prepareForANSI prepares an image for conversion to ANSI art. The function
 // takes an input image, the target width and height for the output image, and
 // returns the resized image and the edges detected in the image.
-func prepareForANSI(img gocv.Mat, width, height int) (resized, edges gocv.Mat) {
+//
+// It uses area interpolation for downscaling to an intermediate size, detects
+// edges on the intermediate image, and resizes both the intermediate image
+// and the edges to the final size. It also applies a very mild sharpening to
+// the resized image.
+func prepareForANSI(
+	img gocv.Mat,
+	width,
+	height int,
+) (
+	resized, edges gocv.Mat,
+) {
 	intermediate := gocv.NewMat()
 	resized = gocv.NewMat()
 	edges = gocv.NewMat()

 	// Use area interpolation for downscaling to an intermediate size
-	intermediateWidth := width * 4 // or another multiplier that gives results
+	intermediateWidth := width * 4
 	intermediateHeight := height * 4
 	gocv.Resize(img,
 		&intermediate,