Commit 98c3f23

Merge pull request #6 from mapbox/kwarg-be-gone
Kwarg be gone
2 parents d157d54 + 17926f2 commit 98c3f23

7 files changed: +186 -73 lines changed

README.md

Lines changed: 6 additions & 6 deletions
@@ -27,7 +27,7 @@ pip install -e .
 with riomucho.RioMucho([{inputs}], {output}, {run function},
     windows={windows},
     global_args={global arguments},
-    kwargs={kwargs to write}) as rios:
+    options={options to write}) as rios:
 
     rios.run({processes})
 ```
@@ -80,9 +80,9 @@ global_args = {
 }
 ```
 
-#### `kwargs={keyword args}`
+#### `options={keyword args}`
 
-The kwargs to pass to the output. `[Default = srcs[0].kwargs`
+The options to pass to the writing output. `[Default = srcs[0].meta`
 
 ## Example
 
@@ -100,9 +100,9 @@ def basic_run(data, window, ij, g_args):
 with rasterio.open('/tmp/test_1.tif') as src:
     ## grabbing the windows as an example. Default behavior is identical.
     windows = [[window, ij] for ij, window in src.block_windows()]
-    kwargs = src.meta
+    options = src.meta
     # since we are only writing to 2 bands
-    kwargs.update(count=2)
+    options.update(count=2)
 
 global_args = {
     'divide': 2
@@ -114,7 +114,7 @@ processes = 4
 with riomucho.RioMucho(['input1.tif','input2.tif'], 'output.tif', basic_run,
     windows=windows,
     global_args=global_args,
-    kwargs=kwargs) as rm:
+    options=options) as rm:
 
     rm.run(processes)
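
The README.md example above builds `options` from the source metadata and overrides `count` before handing it to `RioMucho`. A minimal sketch of that step, assuming a plain dict as a stand-in for `src.meta` (the metadata values are hypothetical, and the explicit copy is added here for illustration; the README updates the metadata dict directly):

```python
# Hypothetical stand-in for rasterio's src.meta; real values come from the input file.
source_meta = {"driver": "GTiff", "dtype": "uint8", "count": 3,
               "width": 512, "height": 512}

# Copy first so the override does not mutate the stand-in metadata.
options = dict(source_meta)

# The example writes only 2 bands, so override the band count.
options.update(count=2)
```

The resulting `options` dict is what the example passes as `options=options`.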

README.rst

Lines changed: 141 additions & 39 deletions
@@ -5,68 +5,170 @@ Parallel processing wrapper for rasterio
 
 |Build Status|
 
+Install
+-------
+
+From pypi:
+
+``pip install rio-mucho --pre``
+
+From github (usually for a branch / dev):
+
+``pip install pip install git+ssh://git@github.com/mapbox/rio-mucho.git@<branch>``
+
+Development:
+
+::
+
+    git clone git@github.com:mapbox/rio-mucho.git
+    cd rio-mucho
+    pip install -e .
+
 Usage
 -----
 
-1. Define a function to be applied to each window chunk. This should
-   have input arguments of:
+.. code:: python
+
+    with riomucho.RioMucho([{inputs}], {output}, {run function},
+        windows={windows},
+        global_args={global arguments},
+        meta={meta to write}) as rios:
+
+        rios.run({processes})
+
+Arguments
+~~~~~~~~~
+
+``inputs``
+^^^^^^^^^^
+
+An list of file paths to open and read.
+
+``output``
+^^^^^^^^^^
+
+What file to write to.
+
+``run_function``
+^^^^^^^^^^^^^^^^
+
+A function to be applied to each window chunk. This should have input
+arguments of:
+
+1. A data input, which can be one of:
+
+   - A list of numpy arrays of shape (x,y,z), one for each file as
+     specified in input file list ``mode="simple_read" [default]``
+   - A numpy array of shape ({*n* input files x *n* band count}, {window
+     rows}, {window cols}) ``mode=array_read"``
+   - A list of open sources for reading ``mode="manual_read"``
 
-- A list of numpy arrays (one for each file as specified in input file
-  list) of shape ``({bands}, {window rows}, {window cols})``
-- A ``rasterio`` window tuple
-- A ``rasterio`` window index (``ij``)
-- A global arguments object that you can use to pass in global
+2. A ``rasterio`` window tuple
+3. A ``rasterio`` window index (``ij``)
+4. A global arguments object that you can use to pass in global
    arguments
 
+This should return:
+
+1. An output array of ({count}, {window rows}, {window cols}) shape, and
+   of the correct data type for writing
+
 .. code:: python
 
-    def basic_run(data, window, ij, g_args):
-        return data[0]
+    def basic_run({data}, {window}, {ij}, {global args}):
+        ## do something
+        return {out}
+
+Keyword arguments
+~~~~~~~~~~~~~~~~~
+
+``windows={windows}``
+^^^^^^^^^^^^^^^^^^^^^
+
+A list of ``rasterio`` (window, ij) tuples to operate on.
+``[Default = src[0].block_windows()]``
 
-2. Alternatively, for more flexibility, you can use a "manual read"
-   where you read each raster in this function. This is useful if you
-   want to read / write different window sizes (eg for pansharpening, or
-   buffered window reading). Here, instead of a list of arrays, the
-   function is passed an array of rasters open for reading.
+``global_args={global arguments}``
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Since this is working in parallel, any other objects / values that you
+want to be accessible in the ``run_function``. ``[Default = {}]``
 
 .. code:: python
 
-    def basic_run(open_files, window, ij, g_args):
-        return numpy.array([f.read(window=window)[0] for f in open_files]) / g_args['divide']
+    global_args = {
+        'divide_value': 2
+    }
 
-For both of these, an array of identical shape to the destination window
-should be returned.
+``meta={keyword args}``
+^^^^^^^^^^^^^^^^^^^^^^^
 
-3. To run, make some windows, get or make some keyword args for writing,
-   and pass these and the above function into ``riomucho``: \`\`\`python
-   import riomucho, rasterio, numpy
+The meta to pass to the output. ``[Default = srcs[0].meta``
 
-get windows from an input
-=========================
+Example
+-------
 
-with rasterio.open('/tmp/test\_1.tif') as src: windows = [[window, ij]
-for ij, window in src.block\_windows()] kwargs = src.meta # since we are
-only writing to 2 bands kwargs.update(count=2)
+.. code:: python
 
-global\_args = { 'divide': 2 }
+    import riomucho, rasterio, numpy
 
-processes = 4
+    def basic_run(data, window, ij, g_args):
+        ## do something
+        out = np.array(
+            [d[0] /= global_args['divide'] for d in data]
+        )
+        return out
 
-run it
-======
+    # get windows from an input
+    with rasterio.open('/tmp/test_1.tif') as src:
+        ## grabbing the windows as an example. Default behavior is identical.
+        windows = [[window, ij] for ij, window in src.block_windows()]
+        meta = src.meta
+        # since we are only writing to 2 bands
+        meta.update(count=2)
 
-with riomucho.RioMucho(['input1.tif','input2, input2.tif'],
-'output.tif', basic\_run, windows=windows, global\_args=global\_args,
-kwargs=kwargs) as rm:
+    global_args = {
+        'divide': 2
+    }
 
-::
+    processes = 4
+
+    # run it
+    with riomucho.RioMucho(['input1.tif','input2.tif'], 'output.tif', basic_run,
+        windows=windows,
+        global_args=global_args,
+        meta=meta) as rm:
 
-    rm.run(processes)
+        rm.run(processes)
+
+Utility functions
+-----------------
+
+\`riomucho.utils.array\_stack([array, array, array,...])
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Given a list of ({depth}, {rows}, {cols}) numpy arrays, stack into a
+single (l{list length \* each image depth}, {rows}, {cols}) array. This
+is useful for handling variation between ``rgb`` inputs of a single
+file, or separate files for each.
+
+One RGB file
+^^^^^^^^^^^^
+
+.. code:: python
+
+    files = ['rgb.tif']
+    open_files = [rasterio.open(f) for f in files]
+    rgb = `riomucho.utils.array_stack([src.read() for src in open_files])
+
+Separate RGB files
+^^^^^^^^^^^^^^^^^^
+
+.. code:: python
 
-\`\`\` - If no windows are specified, rio-mucho uses the block windows
-of the first input raster - If no kwargs are specified, rio-mucho uses
-the kwargs of the first input dataset to write to output - If no global
-args are specified, an empty object is passed.
+    files = ['r.tif', 'g.tif', 'b.tif']
+    open_files = [rasterio.open(f) for f in files]
+    rgb = `riomucho.utils.array_stack([src.read() for src in open_files])
 
 .. |Build Status| image:: https://travis-ci.org/mapbox/rio-mucho.svg?branch=master
    :target: https://travis-ci.org/mapbox/rio-mucho
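
As committed, the README.rst example's `basic_run` uses an augmented assignment (`/=`) inside a list comprehension and refers to `np` and `global_args`, neither of which is defined in that snippet, so it would not run as written. Below is a sketch of a runnable variant of the "simple read" run function, together with a stacking helper in the spirit of the documented `array_stack`. The helper name `stack_arrays` and the input arrays are hypothetical; this is not the library's own implementation.

```python
import numpy as np


def basic_run(data, window, ij, g_args):
    """Take band 0 of each input array and divide it by g_args['divide'].

    data is a list of (bands, rows, cols) arrays, one per input file.
    """
    return np.array([d[0] / g_args['divide'] for d in data])


def stack_arrays(arrays):
    """Hypothetical stand-in: join (depth, rows, cols) arrays along depth."""
    return np.concatenate(arrays, axis=0)


# Two fake 3-band inputs covering the same 2x2 window.
data = [np.full((3, 2, 2), 4.0), np.full((3, 2, 2), 8.0)]
window = ((0, 2), (0, 2))

out = basic_run(data, window, (0, 0), {'divide': 2})
stacked = stack_arrays(data)
```

With two inputs, `out` has one band per input file, which matches the `count=2` override in the example.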

riomucho/__init__.py

Lines changed: 29 additions & 17 deletions
@@ -4,6 +4,7 @@
 import numpy as np
 import click
 import riomucho.scripts.riomucho_utils as utils
+import traceback
 
 work_func = None
 global_args = None
@@ -23,19 +24,30 @@ def main_worker(inpaths, g_work_func, g_args):
     return
 
 def manualRead(args):
-    window, ij = args
-    return work_func(srcs, window, ij, global_args), window
+    try:
+        window, ij = args
+        return work_func(srcs, window, ij, global_args), window
+    except Exception as e:
+        traceback.print_exc()
+        raise e
 
 def arrayRead(args):
     window, ij = args
-    return work_func(utils.array_stack(
-        [src.read(window=window) for src in srcs]),
-        window, ij, global_args), window
+    try:
+        return work_func(utils.array_stack(
+            [src.read(window=window) for src in srcs]),
+            window, ij, global_args), window
+    except Exception as e:
+        traceback.print_exc()
+        raise e
 
 def simpleRead(args):
     window, ij = args
-    return work_func([src.read(window=window) for src in srcs], window, ij, global_args), window
-
+    try:
+        return work_func([src.read(window=window) for src in srcs], window, ij, global_args), window
+    except Exception as e:
+        traceback.print_exc()
+        raise e
 
 class RioMucho:
     def __init__(self, inpaths, outpath, run_function, **kwargs):
@@ -45,16 +57,17 @@ def __init__(self, inpaths, outpath, run_function, **kwargs):
         else:
             self.windows = kwargs['windows']
 
-        if not 'kwargs' in kwargs:
-            self.kwargs = utils.getKwargs(inpaths[0])
+        if not 'options' in kwargs:
+            self.options = utils.getOptions(inpaths[0])
         else:
-            self.kwargs = kwargs['kwargs']
+            self.options = kwargs['options']
 
         if not 'global_args' in kwargs:
             self.global_args = {}
         else:
             self.global_args = kwargs['global_args']
 
+
         if not 'mode' in kwargs or kwargs['mode'] == 'simple_read':
             self.mode = 'simple_read'
         elif kwargs['mode'] == 'array_read':
@@ -74,10 +87,10 @@ def __exit__(self, ext_t, ext_v, trace):
         click.echo("in __exit__")
 
     def run(self, processes=4):
-        pool = Pool(processes, main_worker, (self.inpaths, self.run_function, self.global_args))
+        self.pool = Pool(processes, main_worker, (self.inpaths, self.run_function, self.global_args))
 
         ##shh
-        self.kwargs['transform'] = self.kwargs['affine']
+        self.options['transform'] = self.options['affine']
 
         if self.mode == 'manual_read':
             reader_worker = manualRead
@@ -87,13 +100,12 @@ def run(self, processes=4):
             reader_worker = simpleRead
 
         ## Open an output file, work through the function in parallel, and write out the data
-        with rio.open(self.outpath, 'w', **self.kwargs) as dst:
-            for data, window in pool.imap_unordered(reader_worker, self.windows):
+        with rio.open(self.outpath, 'w', **self.options) as dst:
+            for data, window in self.pool.imap_unordered(reader_worker, self.windows):
                 dst.write(data, window=window)
 
-        pool.close()
-        pool.join()
-
+        self.pool.close()
+        self.pool.join()
         return
 
 
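
The three reader functions above repeat the same `try` / `traceback.print_exc()` / re-raise pattern, so that a failure inside a worker prints its traceback before the exception propagates. One way to express that pattern once is a decorator; the sketch below is an illustration under that assumption, not code from this commit, and the names `print_traceback` and `read_window` are hypothetical.

```python
import functools
import traceback


def print_traceback(func):
    """Print the traceback of any exception raised by func, then re-raise it."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception:
            # Emit the traceback where the failure happened.
            traceback.print_exc()
            # A bare raise re-raises the active exception unchanged.
            raise
    return wrapper


@print_traceback
def read_window(args):
    """Hypothetical worker: unpack (window, ij) and fail on a missing window."""
    window, ij = args
    if window is None:
        raise ValueError("no window for block {}".format(ij))
    return window
```

Calling `read_window((None, (0, 0)))` prints the traceback to stderr and still raises `ValueError` to the caller.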

riomucho/scripts/riomucho_utils.py

Lines changed: 1 addition & 1 deletion
@@ -3,7 +3,7 @@
 import click
 
 
-def getKwargs(input):
+def getOptions(input):
     with rio.open(input) as src:
         return src.meta
99

setup.cfg

Lines changed: 1 addition & 2 deletions
@@ -1,2 +1 @@
-[egg_info]
-tag_build = dev
+[egg_info]

setup.py

Lines changed: 1 addition & 1 deletion
@@ -8,7 +8,7 @@
 
 
 setup(name='rio-mucho',
-      version='0.0.2',
+      version='0.1.1',
       description=u"Windowed multiprocessing wrapper for rasterio",
       long_description=long_description,
       classifiers=[],
