-
Notifications
You must be signed in to change notification settings - Fork 13
/
images_and_memory.Rmd
268 lines (202 loc) · 6.63 KB
/
images_and_memory.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
---
jupyter:
jupytext:
notebook_metadata_filter: all,-language_info
split_at_heading: true
text_representation:
extension: .Rmd
format_name: rmarkdown
format_version: '1.2'
jupytext_version: 1.14.1
kernelspec:
display_name: Python 3 (ipykernel)
language: python
name: python3
orphan: true
---
# Images and memory
We saw in doc:`nibabel_images` that images loaded from disk are usually *proxy
images*. Proxy images are images that have a `dataobj` property that is not a
numpy array, but an *array proxy* that can fetch the array data from disk.
## The preliminaries
```{python}
# Numpy array library
import numpy as np
# Show values in array to 2 digits only.
# This does not affect calculations, just display.
np.set_printoptions(precision=2, suppress=True)
# Standard form of Nibabel import
import nibabel as nib
# Library to fetch example data.
import nipraxis
```
```{python}
example_file = nipraxis.fetch_file('ds107_sub012_t1r2.nii')
img = nib.load(example_file)
img
```
Nibabel does not load the image array from the proxy when you `load` the
image. It waits until you ask for the array data. The standard way to
ask for the array data is to call the `get_fdata()` method:
```{python}
data = img.get_fdata()
data.shape
```
We also saw in `proxies-caching` that this call to `get_fdata()` will
(by default) load the array data into an internal image cache. The image
returns the cached copy on the next call to `get_fdata()`:
```{python}
data_again = img.get_fdata()
data is data_again
```
This behavior is convenient if you want quick and repeated access to the
image array data. The down-side is that the image keeps a reference to
the image data array, so the array can't be cleared from memory until
the image object gets deleted. You might prefer to keep loading the
array from disk instead of keeping the cached copy in the image.
This page describes ways of using the image array proxies to save memory
and time.
## Using `in_memory` to check the state of the cache
You can use the `in_memory` property to check if the image has cached
the array.
The `in_memory` property is always True for array images, because the
image data is always an array in memory:
```{python}
array_data = np.arange(24, dtype=np.int16).reshape((2, 3, 4))
affine = np.diag([1, 2, 3, 1])
array_img = nib.Nifti1Image(array_data, affine)
array_img.in_memory
```
For a proxy image, the `in_memory` property is False when the array is
not in cache, and True when it is in cache:
```{python}
img = nib.load(example_file)
img.in_memory
```
```{python}
data = img.get_fdata()
img.in_memory
```
## Using `uncache`
As y'all know, the proxy image has the array in cache, `get_fdata()` returns
the cached array:
```{python}
data_again = img.get_fdata()
data_again is data # Same data.
```
You can uncache a proxy image with the `uncache()` method:
```{python}
img.uncache()
img.in_memory
```
```{python}
data_once_more = img.get_fdata()
data_once_more is data # New copy of data.
```
`uncache()` has no effect if the image is an array image, or if the
cache is already empty.
You need to be careful when you modify arrays returned by `get_fdata()`
on proxy images, because `uncache` will then change the result you get
back from `get_fdata()`:
```{python}
proxy_img = nib.load(example_file)
data = proxy_img.get_fdata() # array cached and returned
data[0, 0, 0, 0]
```
```{python}
data[0, 0, 0, 0] = 99 # modify returned array
data_again = proxy_img.get_fdata() # return cached array
data_again[0, 0, 0, 0] # cached array modified 99.0
```
So far the proxy image behaves the same as an array image. `uncache()` has no
effect on an array image, but it does have an effect on the returned array of a
proxy image:
```{python}
proxy_img.uncache() # cached array discarded from proxy image
data_once_more = proxy_img.get_fdata() # new copy of array loaded.
data_once_more[0, 0, 0, 0] # array modifications discarded
```
## Saving memory
### Uncache the array
If you do not want the image to keep the array in its internal cache,
you can use the `uncache()` method:
```{python}
img.uncache()
```
### Use the array proxy instead of `get_fdata()`
The `dataobj` property of a proxy image is an array proxy. We can ask
the proxy to return the array directly by passing `dataobj` to the numpy
`asarray` function:
```{python}
proxy_img = nib.load(example_file)
data_array = np.asarray(proxy_img.dataobj)
type(data_array)
```
This also works for array images, because `np.asarray` returns the
array:
```{python}
array_img = nib.Nifti1Image(array_data, affine)
data_array = np.asarray(array_img.dataobj)
type(data_array)
```
If you want to avoid caching you can avoid `get_fdata()` and always use
`np.asarray(img.dataobj)`.
### Use the `caching` keyword to `get_fdata()`
The default behavior of the `get_fdata()` function is to always fill the
cache, if it is empty. This corresponds to the default `'fill'` value to
the `caching` keyword. So, this:
```{python}
proxy_img = nib.load(example_file)
data = proxy_img.get_fdata() # default caching='fill'
proxy_img.in_memory
```
is the same as this:
```{python}
proxy_img = nib.load(example_file)
data = proxy_img.get_fdata(caching='fill')
proxy_img.in_memory
```
Sometimes you may want to avoid filling the cache, if it is empty. In this
case, you can use `caching='unchanged'`:
```{python}
proxy_img = nib.load(example_file)
data = proxy_img.get_fdata(caching='unchanged')
proxy_img.in_memory
```
`caching='unchanged'` will leave the cache full if it is already full.
```{python}
data = proxy_img.get_fdata(caching='fill')
proxy_img.in_memory
```
```{python}
data = proxy_img.get_fdata(caching='unchanged')
proxy_img.in_memory
```
See the `get_fdata()` docstring (`img.get_data?` in Jupyter) for more detail.
## Saving time and memory
You can use the array proxy to get slices of data from disk in an
efficient way.
The array proxy API allows you to do slicing on the proxy. In most cases
this will mean that you only load the data from disk that you actually
need, often saving both time and memory.
For example, let us say you only wanted the second volume from the
example dataset. You could do this:
```{python}
proxy_img = nib.load(example_file)
data = proxy_img.get_fdata()
data.shape
```
```{python}
vol1 = data[..., 1]
vol1.shape
```
The problem is that you had to load the whole data array into memory
before throwing away the first volume and keeping the second.
You can use array proxy slicing to do this more efficiently:
```{python}
proxy_img = nib.load(example_file)
vol1 = proxy_img.dataobj[..., 1]
vol1.shape
```
The slicing call in `proxy_img.dataobj[..., 1]` will only load the data
from disk that you need to fill the memory of `vol1`.