-
Notifications
You must be signed in to change notification settings - Fork 177
About the algorithm
These terms from the literature describe what the Resynthesizer can accomplish:
- inpainting
- texture synthesis or transfer
- content-aware fill
- patch matching
- reconstruction
Generally, similar algorithms seek to "plausibly reconstruct" an image, in some sense to understand "objects" in the image.
The algorithm does patch matching. A patch is a pixel and a small set of surrounding pixels. The patch need not be square or even contiguous.
The result from a matching is "for" a middle pixel of a patch. That is, the algorithm only changes that one pixel after a matching. A middle pixel is only approximately in the center, since a patch is digital and can be irregular.
The algorithm makes many passes. In this algorithm, during the first pass, the target is not filled completely in, but gradually gets filled in. The empty pixels are processed in random order. In the first pass, a patch is like a shotgun pattern: a small set of nearest pixels, but not necessarily nearby. The middle pixel has no value. A new value for the middle pixel, in the target, is found by searching the corpus for the best matching patch to the target pixel's patch (including its surrounding that may not be in the target area per se.)
In subsequent passes, the target is already filled in, but with imperfect results. But the patches are all regular, contiguous shapes (except for image borders.) Subsequent passes also process pixels at random.
The algorithm is a search in the corpus for the best matching patch for a patch in the target.
The algorithm uses a search heuristic: a new best match for a pixel is often found (in the corpus) at the opposite offset from a pixel (in the target) to a neighbor (in the target) pixel's best match (in the corpus.)
The algorithm is adaptive: it quits when improvement from pass to pass diminishes below a threshold. Also, on any particular patch matching search, the algorithm stops searching when it finds a match whose quality is below a threshold.
For most uses, the algorithm is "search and replace". That is, after every search, it replaces a pixel in the target with the middle pixel of the best match in the corpus. So in subsequent passes, a searchee patch might be different than earlier passes.
The algorithm keeps the coordinates of the best matches in the corpus, but does not return those coordinates as a result. Search without replacement could yield a result: the coordinates of the best matches in the corpus. That would be another mode of operation of the algorithm. This has been partially implemented only in the "falseColorMatch" branch of the repository.
The matching or comparison is "best fit." Any patch in an image is likely to be unique: not have an exact match elsewhere.
The algorithm uses a special distance measure function, the Cauchy probability distribution function. See Paul Harrison's thesis for a discussion why this might be better than other measures.
A more general discussion of the algorithm as a best-fit algorithm in a multi-dimensional space is in Paul Harrison's thesis.
Unlike many image filters, the Resynthesizer algorithm does not have a "kernel" in the sense of a computation that is scanned in a linear order across the image. Instead the Resynthesizer processes in a random order. A kernel has the property that its memory use pattern plays well with cached memory, especially when images are also "tiled" at a lower level of the image implementation (as is done in GIMP and Gegl.)
Because the Resynthesizer does not have a kernel, it needs a view of the entire image, i.e. it performs best when the entire image fits in memory, especially cache memory. Thus the Resynthesizer algorithm is very memory intensive, and not simple to multi-process. This particular implementation of the Resynthesizer algorithm emphasizes data structures that play well with cached memory, namely interleaving mask bytes with color bytes.
There are other patch matching algorithms. One starts by completely filling in the target with random guesses from the corpus. Then the target is processed in a raster scan (scan line) order, again matching patches. A simple distance measure is used between patches. Processing in raster order lets the algorithm reuse some of computations for pixels above and to the left. Completely filling in the initial target lets the algorithm use square, vectorizable patches. The simple distance measure is also vectorizable.
(I briefly explored vectorization of the resynthesizer code. There are some remnants in the repository.)
You may also be interested in 'bidirectional similarity', which uses patch matching (and supplements it.)
The algorithm might be better than other algorithms because it uses two kinds of patches: shotgun and regular patches. The shotgun kind of patch is more complicated to implement, but might give better results in the early stages of the algorithm, and the overall result might depend on the quality of the early results.
The algorithm also implements several kinds of ordering or direction for the "random" search. For example it can fill in the target from the outside in, still randomly, but not totally randomly. A user can often choose a direction that gives better apparent results.
The plugins use the algorithm (built as a library) but the plugins specialize the use of the algorithm i.e. choose parameters of the algorithm, especially the target and corpus. Most plugins take one image and separate it into an appropriate target and corpus image, based on the selection or other criteria. For example, the "Heal Selection" plugin creates the target as the selection, and creates the corpus from a frisket around the selection.
The algorithm does not create any new colors. It does change the color for pixels in the target. But all colors come from pixels in the corpus. Other algorithms may fabricate new colors, similar to dithering. Another way of saying it is, if the image were to use 8-bit indexed colors, then the color map and count of distinct colors is not changed by the algorithm. But also read about encoding and accuracy.
Internally, the algorithm uses 8-bit encoding i.e. depth. This reduces the memory required, and increases the speed.
Images of higher bit-depth are copied and converted to 8-bit depth before being passed to the algorithm.
The original images retain their encoding. But the algorithm writes back colors of lower-bit depth to the target area of the images. This means a loss of accuracy in the target area, since the low-order bits will be fabricated as zeroes. The target can be a portion of the result image (the selection) or the entire result image.
You should not use Resynthesizer plugins when you want to retain accuracy, say for scientific uses. The Resynthesizer plugins "doctor" images artificially, in the pejorative sense.
An enhancement to the Resynthesizer would be to write back the exact colors from the corpus (retaining all bits of accuracy) into the target area of the target image, based on the coordinates of the best matching pixel in the corpus. But the Resynthesizer does not now do that. This enhancement would not alter the fact that the Resynthesizer "doctors" an image artificially.
Since Resynthesizer v3, the plugins will work on images of mode "Indexed" i.e. using a palette. The results may not be what you would expect, except for palettes where the order of colors in the palette is perceptual. For example, for a grayscale or monotone palette. In other words, nearby colors in the palette must be perceptually similar.
In Resynthesizer v2, you could not even invoke the plugins on an image of mode Indexed.
When the order of colors in the palette is not perceptual, for example for an image with a mix of red, green, blue colors, you should convert an image of mode Indexed to mode RGB, use the Resynthesizer, then convert back to mode Indexed.