-
Notifications
You must be signed in to change notification settings - Fork 16
/
data-actual-severity.qmd
245 lines (173 loc) · 11.9 KB
/
data-actual-severity.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
---
title: "Measuring foliar severity"
editor_options:
chunk_output_type: console
---
## The actual severity measure
Among the various methods to express plant disease severity, the percent area affected (or symptomatic) by the disease is one of the most common, especially when dealing with diseases that affect leaves. In order to evaluate whether visual estimates of plant disease severity are sufficiently accurate (as discussed in the previous chapter), we require the actual severity values. These are also essential when creating Standard Area Diagrams (SADs), which are diagrammatic representations of severity values used as a reference either before or during visual assessment to standardize and produce more accurate results across different raters [@delponte2017].
The actual severity values are typically approximated using image analysis, wherein the image is segmented, and each pixel is categorized into one of three classes:
1. Diseased (or symptomatic)
2. Non-diseased (or healthy)
3. Background (the non-plant portion of the image)
The ratio of the diseased area to the total area of the unit (e.g., the entire plant organ or section of the image) yields the proportion of the diseased area, or the percent area affected (when multiplied by 100). Researchers have employed various proprietary or open-source software to determine the actual severity, as documented in a review on Standard Area Diagrams [@delponte2017].
In this section, we will utilize the `measure_disease()` function from the {pliman} (Plant IMage ANalysis) R package [@olivoto2022a], and its variations, to measure the percent area affected. The package was compared with other software for determining plant disease severity across five different plant diseases and was shown to produce accurate results in most cases [@olivoto2022].
There are essentially two methods to measure severity. The first is predicated on image palettes that define each class of the image. The second relies on RGB-based indices [@alves2021]. Let's explore the first method, as well as an interactive approach to setting color palettes.
## Image palettes
The most crucial step is the initial one, where the user needs to correctly define the color palettes for each class. In pliman, the palettes can be separate images representing each of the three classes: background (b), symptomatic (s), and healthy (h).
The reference image palettes can be constructed by manually sampling small areas of the image and creating a composite image. As expected, the results may vary depending on how these areas are selected. A study that validated pliman for determining disease severity demonstrated the effect of different palettes prepared independently by three researchers, in which the composite palette (combining the three users) was superior [@olivoto2022]. During the calibration of the palettes, examining the processed masks is crucial to create reference palettes that are the most representative of the respective class.
In this example, I manually selected and pasted several sections of images representing each class from a few leaves into a Google slide. Once the image palette was ready, I exported each one as a separate PNG image file (JPG also works). These files were named: sbr_b.png, sbr_h.png, and sbr_s.png. They can be found [here in this folder](https://github.com/emdelponte/epidemiology-R/tree/main/imgs) for downloading.
[![Preparation of image palettes by manually sampling fraction of the images that represent background, heatlhy leaf and lesions](imgs/pliman1.png){#fig-pliman1 style="margin: 15px" fig-align="center"}](Fig_palettes)
Now that we have the image palettes, we need to import them into the environment, using `image_import()` function for further analysis. Let's create an image object for each palette named h (healthy), s (symptoms) and b (background).
```{r, warning=FALSE, message=FALSE}
library(pliman)
h <- image_import("imgs/sbr_h.png")
s <- image_import("imgs/sbr_s.png")
b <- image_import("imgs/sbr_b.png")
```
We can visualize the imported image palettes using the `image_combine()` function.
```{r}
#| label: fig-palettes
#| fig-cap: "Image palettes created to segment images into background, sypomtoms and healthy area of the image"
image_combine(h, s, b, ncol =3)
```
An alternative way to set the palettes is to use the `pick_palette()` function. It allows to manually pick the colors for each class by clicking on top of the imported image. We can use one of the original images or a composite images with portions of several leaves. Let's use here one of the original images and pick the background colors and assigned to `b` vector.
```{r}
#| eval: false
img <- image_import("imgs/originals/img46.png")
b <- pick_palette(img, viewer = "mapview")
```
The original image is displayed and the user needs to select the place marker. It is possible to zoom in the image. After placing the markers via clicking on the background colors multiple times, the user should click on "Done".
![Soybean leaf displaying symptoms of soybean rust](imgs/sbr_pick_palette.png){#fig_pick_palette fig-align="center"}
```{r}
#| eval: false
image_combine(b)
```
![Image generating after picking the palette colors for the background of the leaf](imgs/b_picked.png){#fig_background_colors_picked fig-align="center" width="372"}
Now, we can proceed and pick the colors for the other categories following the same logic.
```{r}
#| eval: false
# Symptoms
s <- pick_palette(img, viewer = "mapview")
# healthy
h <- pick_palette(img, viewer = "mapview")
```
## Measuring severity
### Single image
#### Using color palettes
To determine severity in a single image (e.g. img46.png), the image file needs to be loaded and assigned to an object using the same `image_import()` function used to load the palettes. We can then visualize the image, again using `image_combine()`.
::: callout-tip
The collection of images used in this chapter can be found [here](https://github.com/emdelponte/epidemiology-R/tree/main/imgs/originals).
:::
```{r}
#| label: fig-img
#| fig-cap: "Imported image for further analysis"
img <- image_import("imgs/originals/img46.png")
image_combine(img)
```
Now the engaging part starts with the `measure_disease()` function. Four arguments are required when using the reference image palettes: the image representing the target image and the three images of the color palettes. As the author of the package states, "pliman will take care of all the details!" The severity is the value displayed under 'symptomatic' in the output.
```{r}
set.seed(123)
measure_disease(
img = img,
img_healthy = h,
img_symptoms = s,
img_background = b
)
```
If we want to show the mask with two colors instead of the original, we can set to FALSE two "show\_" arguments:
```{r}
set.seed(123)
measure_disease(
img = img,
img_healthy = h,
img_symptoms = s,
img_background = b,
show_contour = FALSE,
show_original = FALSE
)
```
### Multiple images
Measuring severity in single images is indeed engaging, but we often deal with multiple images, not just one. Using the above procedure to process each image individually would be time-consuming and potentially tedious.
To automate the process, {pliman} offers a batch processing approach. Instead of using the `img` argument, one can use the `pattern` argument and define the prefix of the image names. Moreover, we also need to specify the directory where the original files are located.
If the user wants to save the processed masks, they should set the `save_image` argument to TRUE and also specify the directory where the images will be saved. Here's an example of how to process 10 images of soybean rust symptoms. The output is a `list` object with the measures of the percent healthy and percent symptomatic area for each leaf in the `severity` object.
```{r}
pliman <- measure_disease(
pattern = "img",
dir_original = "imgs/originals" ,
dir_processed = "imgs/processed",
save_image = TRUE,
img_healthy = h,
img_symptoms = s,
img_background = b,
verbose = FALSE,
plot = FALSE
)
severity <- pliman$severity
severity
```
When the argument `save_image` is set to TRUE, the images are all saved in the folder with the standard prefix "proc."
[![Images created by pliman and exported to a specific folder](imgs/pliman2.png){#fig-pliman2 style="margin: 15px" fig-align="center"}](fig_folder)
Let's have a look at one of the processed images.
[![Figure created by pliman after batch processing to segment the images and calculate percent area covered by symptoms. The symptomatic area is delinated in the image.](imgs/processed/proc_img46.jpg){#fig-processed style="margin: 15px" fig-align="center" width="452"}](fig_proc1)
#### More than a target per image
{pliman} offers a custom function to estimate the severity in multiple targets (e.g. leaf) per image. This is convenient to decrease the time when scanning the specimens, for example. Let's combine three soybean rust leaves into a single image and import it for processing. We will further set the `index_lb` (leaf background), `save_image` to `TRUE` and inform the directory for the processed images using `dir_processed`.
```{r}
#| label: fig-img2
#| fig-cap: "Imported image with multiple targets in a single image for further analysis using measure_disease_byl() function of the {pliman} package"
img2 <- image_import("imgs/soybean_three.png")
image_combine(img2)
```
```{r}
pliman2 <- measure_disease_byl(img = img2,
index_lb = b,
img_healthy = h,
img_symptoms = s,
save_image = TRUE,
dir_processed = "imgs/proc")
pliman2$severity
```
The original image is splited and the individual new images are saved in the proc folder.
![Individual images of the soybean leaves after processed using the measure_disease_byl function of the {pliman} package.](imgs/sbr_procor.png){#fig_sbr_processed fig-align="center" width="505"}
## How good are these measurements?
These 10 images were previously processed in QUANT software for measuring severity which is also based on image threshold. Let's create a tibble for the image code and respective "actual" severity - assuming QUANT measures as reference.
```{r, warning=FALSE, message=FALSE}
library(tidyverse)
library(r4pde)
quant <- tribble(
~img, ~actual,
"img5", 75,
"img11", 24,
"img35", 52,
"img37", 38,
"img38", 17,
"img46", 7,
"img63", 2.5,
"img67", 0.25,
"img70", 67,
"img75", 10
)
```
We can now combine the two dataframes and produce a scatter plot relating the two measures.
```{r}
#| label: fig-scatter
#| fig-cap: "Scatter plot for the relationship between severity values measured by pliman and Quant software"
dat <- left_join(severity, quant)
dat %>%
ggplot(aes(actual, symptomatic)) +
geom_point(size = 3, shape = 16) +
ylim(0, 100) +
xlim(0, 100) +
geom_abline(slope = 1, intercept = 0) +
labs(x = "Quant",
y = "pliman")+
theme_r4pde()
```
The concordance correlation coefficient is a test for agreement between two observers or two methods (see previous chapter). It is an indication of how accurate the *pliman* measures are compared with a standard. The coefficient is greater than 0.99 (1.0 is perfect concordance), suggesting an excellent agreement!
```{r}
#| warning: false
#| message: false
library(epiR)
ccc <- epi.ccc(dat$actual, dat$symptomatic)
ccc$rho.c
```
In conclusion, as mentioned earlier, the most critical step is defining the reference image palettes. A few preliminary runs may be necessary for some images to ensure that the segmentation is being carried out correctly, based on visual judgment. This is not different from any other color-threshold based methods, where the choices made by the user impact the final result and contribute to variation among assessors. The drawbacks are the same as those encountered with direct competitors, namely, the need for images to be taken under uniform and controlled conditions, especially with a contrasting background.