---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
library(matchr)
test_urls <- list.files("tests/testthat/resources")
test_urls <- test_urls[1:15]
```
# matchr
<!-- badges: start -->
[![R build status](https://github.com/UPGo-McGill/matchr/workflows/R-CMD-check/badge.svg)](https://github.com/UPGo-McGill/matchr/actions)
[![codecov](https://codecov.io/gh/UPGo-McGill/matchr/branch/master/graph/badge.svg)](https://codecov.io/gh/UPGo-McGill/matchr)
<!-- badges: end -->
The goal of matchr is to facilitate fast and reliable comparison of large sets of images to identify identical or nearly identical pairs. It works by generating distinctive image signatures from pixel data, then correlating these signatures between sets of images. This approach allows comparisons across large sets of images (tens or hundreds of thousands of files), even on modest computer hardware.
Images are down-sampled to 8x8 greyscale bitmaps and passed through a discrete cosine transform, from which 128-bit signatures are calculated. These signatures can identify a given image even if it has been rescaled or compressed, and thus serve as a reliable indicator of whether two images are the same.
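
As a rough illustration of the idea described above, the following base R sketch down-samples a greyscale matrix to 8x8, applies a two-dimensional discrete cosine transform, and turns the coefficients into bits. It is not matchr's implementation: the helper names (`downsample`, `dct_matrix`, `sketch_signature`), the block-averaging step, and the median threshold are all assumptions, and the sketch yields 64 bits rather than the 128-bit signatures the package computes.

``` r
# Hypothetical helper: shrink a greyscale matrix to n x n by block averaging.
downsample <- function(img, n = 8) {
  # Assign every row and column to one of n roughly equal groups
  row_group <- ceiling(seq_len(nrow(img)) * n / nrow(img))
  col_group <- ceiling(seq_len(ncol(img)) * n / ncol(img))
  # Sum pixels within row groups, then within column groups
  sums <- rowsum(t(rowsum(img, row_group)), col_group)
  # Divide by the number of pixels in each block to get block means
  counts <- outer(tabulate(col_group, n), tabulate(row_group, n))
  t(sums / counts)
}

# Hypothetical helper: orthonormal DCT-II basis matrix of size n x n.
dct_matrix <- function(n) {
  k <- 0:(n - 1)
  d <- sqrt(2 / n) * cos(pi * outer(k, 2 * k + 1) / (2 * n))
  d[1, ] <- d[1, ] / sqrt(2)
  d
}

# Hypothetical helper: one bit per DCT coefficient (assumed median threshold).
sketch_signature <- function(img, n = 8) {
  small <- downsample(img, n)
  d <- dct_matrix(n)
  coefs <- d %*% small %*% t(d)  # two-dimensional DCT
  as.vector(coefs > median(coefs))
}

# Simulated 16x16 greyscale image and a copy enlarged to 32x32
set.seed(1)
img <- matrix(runif(16 * 16), nrow = 16)
enlarged <- kronecker(img, matrix(1, 2, 2))

sig_original <- sketch_signature(img)
sig_enlarged <- sketch_signature(enlarged)

# Share of signature bits on which the two versions agree
mean(sig_original == sig_enlarged)
```

Because block averaging cancels out the enlargement, the two signatures agree on nearly every bit. Real photographs that have been resized or recompressed may differ in a few bits, which is why signatures are compared by similarity rather than by exact equality.
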
Using matrix algebra, matchr can compare multiple sets of images and identify likely matches. The method is robust to changes in tint, compression or aspect ratio, and to other minor differences such as small overlays added to one of a pair of matching images.
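
To show what correlating signatures between two sets can look like, here is a minimal base R sketch. It assumes each signature is stored as one numeric column of a matrix and that a correlation above 0.9 counts as a likely match. The simulated data, the object names, and the threshold are hypothetical and do not reflect how matchr stores or scores signatures internally.

``` r
set.seed(42)

# Hypothetical signatures for set A: one column per image, 128 values each
set_a <- matrix(rnorm(128 * 5), nrow = 128)

# Set B: noisy copies of images 2 and 4 from set A, plus one unrelated image
set_b <- cbind(
  set_a[, 2] + rnorm(128, sd = 0.1),
  rnorm(128),
  set_a[, 4] + rnorm(128, sd = 0.1)
)

# cor() on two matrices correlates every column of the first with
# every column of the second in a single call (here a 5 x 3 result)
cors <- cor(set_a, set_b)

# Keep the pairs whose correlation exceeds the assumed threshold
matches <- which(cors > 0.9, arr.ind = TRUE)
colnames(matches) <- c("a", "b")
matches
```

Each row of `matches` pairs an image index from set A with an image index from set B. Computing all pairwise scores in one matrix operation, rather than looping over pairs, is what keeps this kind of comparison practical for large sets.
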
The package includes a Shiny app (`compare_images` and `confirm_matches`) for visually inspecting match results and confirming or rejecting the matches.
All time-consuming matchr functions can be parallelized through the [future](https://cran.r-project.org/web/packages/future/vignettes/future-1-overview.html) package, can report progress through the [progressr](https://cran.r-project.org/web/packages/progressr/vignettes/progressr-intro.html) package, and can save incremental backups in case of an error.
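
The usual way to opt in to both features is to set a future plan and wrap the work in a progressr call. The sketch below uses only functions from those two packages; the worker count is an arbitrary assumption, and the matchr calls themselves are left as a placeholder.

``` r
# Run future-aware code in parallel background R sessions
# (two workers is an arbitrary choice for illustration)
future::plan(future::multisession, workers = 2)

# Show progress updates for any code that signals them through progressr
progressr::with_progress({
  # matchr calls go here
})

# Return to sequential processing when finished
future::plan(future::sequential)
```
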
## Installation
You can install the released version of matchr from [CRAN](https://CRAN.R-project.org) with:
``` r
install.packages("matchr")
```
You can install the development version from [GitHub](https://github.com/) with:
``` r
# install.packages("remotes")
remotes::install_github("UPGo-McGill/matchr")
```
## Usage
The standard matchr workflow involves importing one or more sets of images with `load_image`, generating image signatures from the image sets (or the underlying file paths) using `create_signature`, matching image signatures using `match_signatures`, then refining and verifying the matches using `identify_matches`, `confirm_matches`, and `integrate_changes`.
Because each of the matchr functions can be very time- or computation-intensive, it is usually most convenient to run them separately. For relatively small comparison tasks, however, `match_images` provides an "all-in-one" function that performs each step sequentially and delivers the final results.
When the {shiny} package is installed, `confirm_matches` loads an interactive Shiny app for verifying the results of the comparison algorithm and for flagging individual matches for follow-up.
## Learn more
Use `vignette(package = "matchr")` to learn more about how matchr works!