-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathmulti_crystal.html
265 lines (223 loc) · 16.4 KB
/
multi_crystal.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
<!DOCTYPE html>
<html lang="en" data-content_root="./">
<head>
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" /><meta name="viewport" content="width=device-width, initial-scale=1" />
<title>Processing multi-crystal datasets with xia2 — xia2 documentation</title>
<link rel="stylesheet" type="text/css" href="_static/pygments.css?v=fa44fd50" />
<link rel="stylesheet" type="text/css" href="_static/alabaster.css?v=12dfc556" />
<link rel="stylesheet" href="_static/xia2.css" type="text/css" />
<script src="_static/documentation_options.js?v=5929fcd5"></script>
<script src="_static/doctools.js?v=9a2dae69"></script>
<script src="_static/sphinx_highlight.js?v=dc90522c"></script>
<script async="async" src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script>
<link rel="index" title="Index" href="genindex.html" />
<link rel="search" title="Search" href="search.html" />
<link rel="next" title="Processing serial synchrotron crystallography (SSX) datasets with xia2" href="serial_crystallography.html" />
<link rel="prev" title="Insulin tutorial" href="insulin_tutorial.html" />
<link rel="stylesheet" href="_static/custom.css" type="text/css" />
</head><body>
<!-- <div class="logoheader container"> -->
<!-- <a href="index.html"> -->
<!-- <img class="logoheader" alt="DIALS" src="_static/dials_header.png" /> -->
<!-- </a> -->
<!-- </div> -->
<div class="document">
<div class="documentwrapper">
<div class="bodywrapper">
<div class="body" role="main">
<section id="processing-multi-crystal-datasets-with-xia2">
<h1>Processing multi-crystal datasets with xia2<a class="headerlink" href="#processing-multi-crystal-datasets-with-xia2" title="Link to this heading">¶</a></h1>
<p>xia2 is ideally suited to processing multi-crystal (or multi-sweep) datasets,
and is able to process more than one dataset simultaneously, providing many
features that can make the processing of such datasets both easier and faster.
Examples include (but not limited to):</p>
<ul class="simple">
<li><p>Merging multiple datasets taken from multiple crystals</p></li>
<li><p>Merging multiple datasets taken from a single crystal</p></li>
<li><p>Scaling together, but merging individually multiple wavelength datasets</p></li>
<li><p>Inverse beam experiments</p></li>
</ul>
<p>Images or directories can be passed on the command line, as with a normal
xia2 job processing a single sweep. Xia2 now allows passing multiple directories
on the command line, or passing multiple images via the <code class="samp docutils literal notranslate"><span class="pre">image=</span></code> parameter:</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">xia2</span> <span class="n">pipeline</span><span class="o">=</span><span class="n">dials</span> <span class="o">/</span><span class="n">path</span><span class="o">/</span><span class="n">to</span><span class="o">/</span><span class="n">images</span><span class="o">/</span><span class="n">dataset_1</span> <span class="o">/</span><span class="n">path</span><span class="o">/</span><span class="n">to</span><span class="o">/</span><span class="n">images</span><span class="o">/</span><span class="n">dataset_2</span>
<span class="n">xia2</span> <span class="n">pipeline</span><span class="o">=</span><span class="n">dials</span> <span class="n">image</span><span class="o">=/</span><span class="n">path</span><span class="o">/</span><span class="n">to</span><span class="o">/</span><span class="n">images</span><span class="o">/</span><span class="n">sweep_1_0001</span><span class="o">.</span><span class="n">cbf</span> <span class="n">image</span><span class="o">=/</span><span class="n">path</span><span class="o">/</span><span class="n">to</span><span class="o">/</span><span class="n">images</span><span class="o">/</span><span class="n">sweep_2_0001</span><span class="o">.</span><span class="n">cbf</span>
</pre></div>
</div>
<p>Alternatively, you can specify exactly which images you wish to process in an
.xinfo file:</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">BEGIN</span> <span class="n">PROJECT</span> <span class="n">AUTOMATIC</span>
<span class="n">BEGIN</span> <span class="n">CRYSTAL</span> <span class="n">DEFAULT</span>
<span class="n">BEGIN</span> <span class="n">WAVELENGTH</span> <span class="n">NATIVE</span>
<span class="n">WAVELENGTH</span> <span class="mf">0.979500</span>
<span class="n">END</span> <span class="n">WAVELENGTH</span> <span class="n">NATIVE</span>
<span class="n">BEGIN</span> <span class="n">SWEEP</span> <span class="n">SWEEP1</span>
<span class="n">WAVELENGTH</span> <span class="n">NATIVE</span>
<span class="n">DIRECTORY</span> <span class="o">/</span><span class="n">path</span><span class="o">/</span><span class="n">to</span><span class="o">/</span><span class="n">images</span><span class="o">/</span>
<span class="n">IMAGE</span> <span class="n">sweep_1_0001</span><span class="o">.</span><span class="n">cbf</span>
<span class="n">START_END</span> <span class="mi">1</span> <span class="mi">450</span>
<span class="n">END</span> <span class="n">SWEEP</span> <span class="n">SWEEP1</span>
<span class="n">BEGIN</span> <span class="n">SWEEP</span> <span class="n">SWEEP2</span>
<span class="n">WAVELENGTH</span> <span class="n">NATIVE</span>
<span class="n">DIRECTORY</span> <span class="o">/</span><span class="n">path</span><span class="o">/</span><span class="n">to</span><span class="o">/</span><span class="n">images</span><span class="o">/</span>
<span class="n">IMAGE</span> <span class="n">sweep_2_0001</span><span class="o">.</span><span class="n">cbf</span>
<span class="n">START_END</span> <span class="mi">1</span> <span class="mi">450</span>
<span class="n">END</span> <span class="n">SWEEP</span> <span class="n">SWEEP2</span>
<span class="n">END</span> <span class="n">CRYSTAL</span> <span class="n">DEFAULT</span>
<span class="n">END</span> <span class="n">PROJECT</span> <span class="n">AUTOMATIC</span>
</pre></div>
</div>
<p>When processing many datasets simultaneously, it may happen that some datasets
will process successfully, but xia2 will fail to process others. By default,
xia2 will stop with an error message if any error is encountered, however if
the <code class="samp docutils literal notranslate"><span class="pre">failover=True</span></code> is set on the command line, then xia2 will
ignore any failed sweeps and continue processing with only those sweeps that
processed successfully.</p>
<p>When xia2 is given a particularly large number of images to process, it may
take some time before it appears to start processing the data. This may be for
a couple of reasons:</p>
<ol class="arabic simple">
<li><p>On start up, xia2 reads all the image headers to ensure that it understands
them correctly. A speedup can be obtained with the parameter
<code class="samp docutils literal notranslate"><span class="pre">read_all_image_headers=False</span></code>, which tells xia2 to only read the
first image header for each set of files with a matching template, and
infer the rest of the sweep from the first image header.</p></li>
<li><p>When using DIALS for indexing xia2 will run a beam centre search on each
sweep. This step can be disabled using the parameter
<code class="samp docutils literal notranslate"><span class="pre">trust_beam_centre=True</span></code></p></li>
</ol>
<p>Furthermore, xia2 may not make the same conclusion as to the symmetry for each
sweep, leading it to process the final dataset in the lowest common symmetry.
Sometimes indexing for a given sweep may fail altogether, and specifying the
<code class="samp docutils literal notranslate"><span class="pre">unit_cell=</span></code> and <code class="samp docutils literal notranslate"><span class="pre">space_group=</span></code> parameters (if known) on the
command line can help in both these situations.</p>
<table class="docutils align-default">
<thead>
<tr class="row-odd"><th class="head"><p>Parameter</p></th>
<th class="head"><p>Description</p></th>
</tr>
</thead>
<tbody>
<tr class="row-even"><td><p><code class="docutils literal notranslate"><span class="pre">failover=True</span></code></p></td>
<td><p>If processing fails for any sweeps,
ignore and just use those sweeps that
processed successfully</p></td>
</tr>
<tr class="row-odd"><td><p><code class="docutils literal notranslate"><span class="pre">trust_beam_centre=True</span></code></p></td>
<td><p>Don’t run DIALS beam centre search</p></td>
</tr>
<tr class="row-even"><td><p><code class="docutils literal notranslate"><span class="pre">read_all_image_headers=False</span></code></p></td>
<td><p>Skip reading all image headers - just
read the first one for each sweep</p></td>
</tr>
<tr class="row-odd"><td><p><code class="docutils literal notranslate"><span class="pre">unit_cell=</span></code></p></td>
<td><p>Provide a target unit cell to help
indexing</p></td>
</tr>
<tr class="row-even"><td><p><code class="docutils literal notranslate"><span class="pre">space_group=</span></code></p></td>
<td><p>Provide a target space group to help
indexing</p></td>
</tr>
</tbody>
</table>
<section id="parallel-data-processing">
<h2>Parallel data processing<a class="headerlink" href="#parallel-data-processing" title="Link to this heading">¶</a></h2>
<p>By default, xia2 processes each sweep sequentially, using <code class="samp docutils literal notranslate"><span class="pre">nproc</span></code>
processors. When processing multiple datasets, it may be more efficient to
process the sweeps in parallel, by specifying
<code class="samp docutils literal notranslate"><span class="pre">multiprocessing.mode=parallel</span></code> and
using <code class="samp docutils literal notranslate"><span class="pre">multiprocessing.njob</span></code> to indicate how many sweeps should be
processed simultaneously, using <code class="samp docutils literal notranslate"><span class="pre">multiprocessing.nproc</span></code> processors
per sweep:</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">xia2</span> <span class="n">pipeline</span><span class="o">=</span><span class="n">dials</span> <span class="o">/</span><span class="n">path</span><span class="o">/</span><span class="n">to</span><span class="o">/</span><span class="n">images</span> <span class="n">multiprocessing</span><span class="o">.</span><span class="n">mode</span><span class="o">=</span><span class="n">parallel</span> \
<span class="n">multiprocessing</span><span class="o">.</span><span class="n">njob</span><span class="o">=</span><span class="mi">2</span> <span class="n">multiprocessing</span><span class="o">.</span><span class="n">nproc</span><span class="o">=</span><span class="mi">4</span>
</pre></div>
</div>
<div class="admonition note">
<p class="admonition-title">Note</p>
<p>This will use a total of <code class="samp docutils literal notranslate"><span class="pre">njob</span></code> <span class="math notranslate nohighlight">\(*\)</span> <code class="samp docutils literal notranslate"><span class="pre">nproc</span></code> processors,
i.e. <span class="math notranslate nohighlight">\(2 * 4 = 8\)</span> processors, which should be less than or equal to the total
number of processors available on your machine.</p>
</div>
<p>Additionally, xia2 can utilise the processing power of a cluster where
available (currently we only support qsub) by specifying the parameter
<code class="samp docutils literal notranslate"><span class="pre">multiprocessing.type=qsub</span></code>. The parameter
<code class="samp docutils literal notranslate"><span class="pre">multiprocessing.qsub_command</span></code> may be used (if needed) to e.g. specify
which queue jobs should be submitted to:</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">xia2</span> <span class="n">pipeline</span><span class="o">=</span><span class="n">dials</span> <span class="o">/</span><span class="n">path</span><span class="o">/</span><span class="n">to</span><span class="o">/</span><span class="n">images</span> <span class="n">multiprocessing</span><span class="o">.</span><span class="n">mode</span><span class="o">=</span><span class="n">parallel</span> \
<span class="n">multiprocessing</span><span class="o">.</span><span class="n">type</span><span class="o">=</span><span class="n">qsub</span> <span class="n">multiprocessing</span><span class="o">.</span><span class="n">qsub_command</span><span class="o">=</span><span class="s2">"qsub -q low.q"</span> \
<span class="n">multiprocessing</span><span class="o">.</span><span class="n">njob</span><span class="o">=</span><span class="mi">10</span> <span class="n">multiprocessing</span><span class="o">.</span><span class="n">nproc</span><span class="o">=</span><span class="mi">16</span>
</pre></div>
</div>
<table class="docutils align-default">
<thead>
<tr class="row-odd"><th class="head"><p>Parameter</p></th>
<th class="head"><p>Description</p></th>
</tr>
</thead>
<tbody>
<tr class="row-even"><td><p><code class="docutils literal notranslate"><span class="pre">multiprocessing.mode=parallel</span></code></p></td>
<td><p>Process multiple sweeps in parallel,
rather than serially</p></td>
</tr>
<tr class="row-odd"><td><p><code class="docutils literal notranslate"><span class="pre">multiprocessing.njob=4</span></code></p></td>
<td><p>In conjunction with mode=parallel,
process 4 sweeps simultaneously</p></td>
</tr>
<tr class="row-even"><td><p><code class="docutils literal notranslate"><span class="pre">multiprocessing.nproc=1</span></code></p></td>
<td><p>Use 1 processor per job (sweep)</p></td>
</tr>
<tr class="row-odd"><td><p><code class="docutils literal notranslate"><span class="pre">multiprocessing.type=qsub</span></code></p></td>
<td><p>Submit individual processing jobs to
cluster using qsub</p></td>
</tr>
<tr class="row-even"><td><p><code class="docutils literal notranslate"><span class="pre">multiprocessing.type=qsub_command</span></code></p></td>
<td><p>The command to use to submit qsub
jobs</p></td>
</tr>
</tbody>
</table>
</section>
</section>
</div>
</div>
</div>
<div class="sphinxsidebar" role="navigation" aria-label="Main">
<div class="sphinxsidebarwrapper">
<p class="logo">
<a href="index.html">
<img class="logo" src="_static/xia2-new-logo.gif" alt="Logo" />
</a>
</p>
<h3>Navigation</h3>
<ul class="current">
<li class="toctree-l1"><a class="reference internal" href="quick_start.html">Getting started</a></li>
<li class="toctree-l1"><a class="reference internal" href="using_xia2.html">Using xia2</a></li>
<li class="toctree-l1"><a class="reference internal" href="installation.html">Installation</a></li>
<li class="toctree-l1"><a class="reference internal" href="introductory_example.html">Introductory example</a></li>
<li class="toctree-l1"><a class="reference internal" href="insulin_tutorial.html">Insulin tutorial</a></li>
<li class="toctree-l1 current"><a class="current reference internal" href="#">Multi-crystal data</a></li>
<li class="toctree-l1"><a class="reference internal" href="serial_crystallography.html">Serial-crystallography</a></li>
<li class="toctree-l1"><a class="reference internal" href="program_output.html">Program output</a></li>
<li class="toctree-l1"><a class="reference internal" href="parameters.html">Parameters</a></li>
<li class="toctree-l1"><a class="reference internal" href="comments.html">Comments</a></li>
<li class="toctree-l1"><a class="reference internal" href="background.html">History</a></li>
<li class="toctree-l1"><a class="reference internal" href="acknowledgements.html">Acknowledgements</a></li>
<li class="toctree-l1"><a class="reference internal" href="release_notes.html">Release notes</a></li>
<li class="toctree-l1"><a class="reference internal" href="license.html">License</a></li>
</ul>
</div>
</div>
<div class="clearer"></div>
</div>
<div class="footer container">
<a href="http://www.diamond.ac.uk/"><img class="logofooter" alt="Diamond" src="_static/diamond_logo.png" /></a>
<a href="http://dials.diamond.ac.uk/"><img class="logofooter" alt="DIALS" src="_static/dials_logo_fx_transparent_bg.png" /></a>
<a href="http://www.ccp4.ac.uk/"><img class="logofooter" alt="CCP4" src="_static/CCP4-logo-plain.png" /></a>
</div>
<div class="footer">
©2020, Diamond Light Source.
</div>
</body>
</html>