Commit b172545

Merge pull request #36 from sfiligoi/multi_230320
Add multi mode
2 parents c64c65f + 9d200d3 commit b172545

8 files changed, with 698 additions and 117 deletions.

.github/workflows/main.yml

+23
@@ -154,6 +154,10 @@ jobs:
 python compare_unifrac_stats.py t1.h5 5 999 1.001112 0.456 0.001 0.1
 ls -l t1.h5
 rm -f t1.h5
+time ssu -m unweighted -i test500.biom -t test500.tre --pcoa 4 -r hdf5 --permanova 99 -g test500.tsv -c empo_2 -o t1.h5
+python compare_unifrac_stats.py t1.h5 5 99 1.001112 0.456 0.001 0.2
+ls -l t1.h5
+rm -f t1.h5
 time ssu -m weighted_unnormalized_fp32 -i test500.biom -t test500.tre --pcoa 4 -r hdf5_nodist -g test500.tsv -c empo_3 -o t1.h5
 # compare to values given by skbio.stats.distance.permanova
 python compare_unifrac_stats.py t1.h5 17 999 0.890697 0.865 0.001 0.1
@@ -181,6 +185,25 @@ jobs:
 ls -l t1.h5
 rm -f t1.h5
 rm -f t1.partial.*
+# subsample
+echo "subsample"
+time ssu -f -m unweighted -i test500.biom -t test500.tre --pcoa 4 -r hdf5_fp32 --subsample-depth 100 -o t1.h5
+./compare_unifrac_pcoa.py test500.unweighted_fp32.f.h5 t1.h5 3 0.3
+rm -f t1.h5
+time ssu -m unweighted -i test500.biom -t test500.tre --pcoa 4 -r hdf5_nodist --permanova 99 -g test500.tsv -c empo_2 --subsample-depth 100 -o t1.h5
+python compare_unifrac_stats.py t1.h5 5 99 1.001112 0.456 0.05 0.5
+ls -l t1.h5
+rm -f t1.h5
+# multi
+echo "multi"
+time ssu -f -m unweighted -i test500.biom -t test500.tre --pcoa 4 --mode multi --subsample-depth 100 --n-subsamples 10 -o t1.h5
+./compare_unifrac_pcoa_multi.py test500.unweighted_fp32.f.h5 t1.h5 10 3 0.3
+ls -l t1.h5
+rm -f t1.h5
+time ssu -m unweighted -i test500.biom -t test500.tre --pcoa 4 --mode multi --n-subsamples 10 --permanova 99 -g test500.tsv -c empo_2 --subsample-depth 100 -o t1.h5
+python compare_unifrac_stats_multi.py t1.h5 10 5 99 1.001112 0.456 0.08 0.5
+ls -l t1.h5
+rm -f t1.h5
 popd
 - name: Sanity checks
 shell: bash -l {0}
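
The new CI steps in this file exercise `--subsample-depth 100`, and the multi steps repeat that 10 times via `--n-subsamples 10`. As a rough illustration of what subsampling to a fixed depth usually means, here is a minimal Python sketch. It assumes conventional rarefaction semantics (draw without replacement, skip samples shallower than the depth), which this diff does not itself confirm; `subsample_counts` and `multi_subsample` are hypothetical names, not part of `ssu`.

```python
import random


def subsample_counts(counts, depth, rng):
    """Draw `depth` observations without replacement from one sample.

    `counts` holds per-feature counts for a single sample.
    Returns None when the sample has fewer than `depth` observations
    (assumed behavior, not confirmed by the diff).
    """
    if sum(counts) < depth:
        return None
    # Expand counts into one entry per observation, labeled by feature index.
    pool = [idx for idx, c in enumerate(counts) for _ in range(c)]
    picked = rng.sample(pool, depth)
    out = [0] * len(counts)
    for idx in picked:
        out[idx] += 1
    return out


def multi_subsample(counts, depth, n_subsamples, seed):
    """Repeat the subsampling n_subsamples times from one seeded generator."""
    rng = random.Random(seed)
    return [subsample_counts(counts, depth, rng) for _ in range(n_subsamples)]


if __name__ == "__main__":
    # Hypothetical three-feature sample, subsampled 10 times at depth 100.
    for replicate in multi_subsample([50, 30, 40], depth=100, n_subsamples=10, seed=42):
        print(replicate)
```

Each replicate sums to the requested depth and never exceeds the original per-feature counts, which would explain why the subsampled tests above compare against the reference with looser tolerances.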

README.md

+5 -3
@@ -148,19 +148,21 @@ The methods can be used directly through the command line after install:
 partial : Compute UniFrac over a subset of stripes.
 partial-report : Start and stop suggestions for partial compute.
 merge-partial : Merge partial UniFrac results.
+multi : compute UniFrac multiple times.
 --start [OPTIONAL] If mode==partial, the starting stripe.
 --stop [OPTIONAL] If mode==partial, the stopping stripe.
 --partial-pattern [OPTIONAL] If mode==merge-partial, a glob pattern for partial outputs to merge.
 --n-partials [OPTIONAL] If mode==partial-report, the number of partitions to compute.
 --report-bare [OPTIONAL] If mode==partial-report, produce barebones output.
 --n-substeps [OPTIONAL] Internally split the problem in n substeps for reduced memory footprint, default is 1.
 --format|-r [OPTIONAL] Output format:
-ascii : [DEFAULT] Original ASCII format.
+ascii : Original ASCII format. (default if mode==one-off)
+hdf5_nodist : HFD5 format, no distance matrix. (default if mode==multi)
 hdf5 : HFD5 format. May be fp32 or fp64, depending on method.
 hdf5_fp32 : HFD5 format, using fp32 precision.
 hdf5_fp64 : HFD5 format, using fp64 precision.
-hdf5_nodist : HFD5 format, no distance matrix, just PCoA.
-    --subsample-depth [OPTIONAL] Depth of subsampling of the input BIOM before computing unifrac
+--subsample-depth Depth of subsampling of the input BIOM before computing unifrac (required for mode==multi, optional for one-off)
+--n-subsamples [OPTIONAL] if mode==multi, number of subsampled UniFracs to compute (default: 100)
 --permanova [OPTIONAL] Number of PERMANOVA permutations to compute (default: 999 with -g, do not compute if 0)
 --pcoa [OPTIONAL] Number of PCoA dimensions to compute (default: 10, do not compute if 0)
 --seed [OPTIONAL] Seed to use for initializing the random gnerator
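
The help text above makes two option rules depend on the mode: `--subsample-depth` becomes required for `multi`, and the default output format switches from `ascii` to `hdf5_nodist`. The following Python sketch shows how a wrapper script might encode those rules before calling `ssu`. The helper names `effective_format` and `build_ssu_args` are hypothetical, and the check that rejects `--n-subsamples` outside `multi` is an assumption drawn from the "if mode==multi" wording rather than confirmed `ssu` behavior. Only flags that appear in this diff are used.

```python
# Defaults as stated in the README help text in this commit.
DEFAULT_FORMAT = {"one-off": "ascii", "multi": "hdf5_nodist"}
DEFAULT_N_SUBSAMPLES = 100  # documented default when --n-subsamples is omitted


def effective_format(mode, fmt=None):
    """Return the output format ssu is documented to use for this mode."""
    return fmt if fmt is not None else DEFAULT_FORMAT[mode]


def build_ssu_args(table, tree, output, method="unweighted", mode="one-off",
                   subsample_depth=None, n_subsamples=None, fmt=None):
    """Build an argument list for ssu (hypothetical wrapper, two modes only)."""
    if mode not in DEFAULT_FORMAT:
        raise ValueError(f"this sketch only covers {sorted(DEFAULT_FORMAT)}")
    if mode == "multi" and subsample_depth is None:
        raise ValueError("--subsample-depth is required when mode==multi")
    if n_subsamples is not None and mode != "multi":
        # Assumption: the option is only meaningful in multi mode.
        raise ValueError("--n-subsamples only applies when mode==multi")

    args = ["ssu", "-m", method, "-i", table, "-t", tree, "--mode", mode]
    if fmt is not None:
        args += ["-r", fmt]
    if subsample_depth is not None:
        args += ["--subsample-depth", str(subsample_depth)]
    if n_subsamples is not None:
        args += ["--n-subsamples", str(n_subsamples)]
    args += ["-o", output]
    return args


if __name__ == "__main__":
    # Mirrors the multi-mode CI invocation, minus the --pcoa and -f options.
    cmd = build_ssu_args("test500.biom", "test500.tre", "t1.h5",
                         mode="multi", subsample_depth=100, n_subsamples=10)
    print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a machine where `ssu` is installed; validating before launch surfaces a missing depth immediately instead of after the process starts.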
