<?xml version="1.0" encoding="utf-8" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<!--http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd-->
<html xmlns="http://www.w3.org/1999/xhtml"
>
<head><title>The Vmatch large scale sequence analysis software</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="generator" content="TeX4ht (http://www.tug.org/tex4ht/)" />
<meta name="originator" content="TeX4ht (http://www.tug.org/tex4ht/)" />
<!-- xhtml,charset=utf-8,html -->
<meta name="src" content="vmweb.tex" />
<meta name="description" content="The Vmatch large scale sequence analysis
software is a versatile software tool for efficiently solving large scale sequence matching tasks."/>
<meta name="keywords" content="sequence analysis, sequence mapping, BLAST, bioinformatics, computational biology"/>
<meta http-equiv="Content-Style-Type" content="text/css"/>
<link rel="stylesheet" type="text/css" href="vmweb.css" />
</head><body
>
<div class="maketitle">
<h1 align="center" class="titleHead">The Vmatch large scale sequence analysis
software</h1>
<div align="center" class="author" ><span
class="ptmr7t-x-x-144">Stefan Kurtz</span></div>
<br />
<div align="center" class="date" ><span
class="ptmr7t-x-x-144">June 15, 2017</span></div>
</div>
<!--l. 61--> <br/> <center> <img src="matchgraph.gif" alt="show matches of different sizes in a matchgraph"/> </center> <div id="downloadbox"> <ul> <li><a href="download.html">Download <i>Vmatch</i>!</a></li> </ul> </div>
<!--l. 63--><p class="noindent" >This is the website for <span
class="ptmri7t-x-x-120">Vmatch</span>, a versatile software tool for efficiently solving large
scale sequence matching tasks. <span
class="ptmri7t-x-x-120">Vmatch </span>subsumes the software tool <a
href="http://bibiserv.techfak.uni-bielefeld.de/reputer" >REPuter</a>, but is
much more general, with a very flexible user interface, and improved space and time
requirements. <a href="vmweb.pdf">Here</a> is a printable PDF version of this page.
</p>
<h3 class="likesectionHead"><a
id="x1-1000"></a>Features of <span
class="ptmri7t-x-x-120">Vmatch</span></h3>
<!--l. 76--><p class="noindent" >The <a
href="virtman.pdf" ><span
class="ptmri7t-x-x-120">Vmatch</span>-manual</a> gives many examples of how to use <span
class="ptmri7t-x-x-120">Vmatch</span>. Here are the
program’s most important features.
</p><!--l. 3--><p class="noindent" >
</p>
<h4 class="likesubsectionHead"><a
id="x1-2000"></a>Persistent index</h4>
<!--l. 4--><p class="noindent" >Usually, in a large scale matching problem, extensive portions of the sequences under
consideration are static, i.e. they do not change much over time. Therefore it makes
sense to preprocess this static data to extract information from it and to store this in a
structured manner, allowing efficient searches. <span
class="ptmri7t-x-x-120">Vmatch </span>does exactly this: it
preprocesses a set of sequences into an index structure. This is stored as a collection of
several files constituting the persistent index. The index efficiently represents all
substrings of the preprocessed sequences and, unlike many other sequence
comparison tools, allows matching tasks to be solved in time <span
class="ptmri7t-x-x-120">independent </span>of
the size of the index. Different matching tasks require different parts of the
index, but only the required parts of the index are accessed during the matching
process.
</p><!--l. 21--><p class="noindent" >
</p>
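<p class="noindent" >As a rough illustration of preprocessing static sequences once and then querying them repeatedly, the following Python sketch builds a plain suffix array and looks up a pattern by binary search. It is not the enhanced suffix array used by Vmatch: the function names <span
class="cmtt-12">build_suffix_array</span> and <span
class="cmtt-12">find_occurrences</span> are hypothetical, the construction is deliberately naive, and nothing is stored on disk.
</p>
<pre class="verbatim">
```python
from bisect import bisect_left, bisect_right


def build_suffix_array(text):
    """Return the start positions of all suffixes of text in lexicographic order.

    Naive construction by sorting suffix strings; suitable for a toy example only.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text, suffix_array, pattern):
    """Return the sorted start positions of all occurrences of pattern in text."""
    if not pattern:
        return []
    # Truncating every suffix to len(pattern) preserves the sorted order,
    # so the matching suffixes form one contiguous block of the array.
    # Materializing the prefixes keeps the sketch short; a real index
    # would compare against the text in place instead.
    prefixes = [text[i:i + len(pattern)] for i in suffix_array]
    left = bisect_left(prefixes, pattern)
    right = bisect_right(prefixes, pattern)
    return sorted(suffix_array[left:right])


text = "acaaacatat"
index = build_suffix_array(text)             # preprocessing, done once
print(find_occurrences(text, index, "aca"))  # query against the index
```
</pre>
<p class="noindent" >The point of the sketch is the division of labour: the sorted array is computed once per sequence set, and each query only performs a binary search over it.
</p>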
<h4 class="likesubsectionHead"><a
id="x1-3000"></a>Alphabet independence</h4>
<!--l. 22--><p class="noindent" >Most software tools for sequence analysis are restricted to DNA and/or protein
sequences. In contrast, <span
class="ptmri7t-x-x-120">Vmatch </span>can process sequences over any user-defined alphabet
of at most 250 symbols. <span
class="ptmri7t-x-x-120">Vmatch </span>fully implements the concept of <span
class="ptmri7t-x-x-120">symbol</span>
<span
class="ptmri7t-x-x-120">mappings</span>, denoting alphabet transformations. These allow the user to specify that
different characters in the input sequences should be considered identical in
the matching process. This feature is used to group similar amino acids, for
example.
</p><!--l. 31--><p class="noindent" >
</p>
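<p class="noindent" >The effect of a symbol mapping can be sketched in a few lines of Python. This is only an illustration of the concept: the amino acid groups below are a hypothetical choice, the helper names <span
class="cmtt-12">make_symbol_map</span> and <span
class="cmtt-12">apply_symbol_map</span> are invented for this example, and the code does not reflect the file format in which Vmatch expects symbol mappings.
</p>
<pre class="verbatim">
```python
def make_symbol_map(groups):
    """Map every character of each group to the group's first character."""
    mapping = {}
    for group in groups:
        representative = group[0]
        for symbol in group:
            mapping[symbol] = representative
    return mapping


def apply_symbol_map(sequence, mapping):
    """Rewrite a sequence so that grouped symbols become identical.

    Symbols that belong to no group are left unchanged.
    """
    return "".join(mapping.get(symbol, symbol) for symbol in sequence)


# Hypothetical grouping of similar amino acids, chosen for illustration only.
groups = ["LVIM", "KR", "DE", "ST"]
symbol_map = make_symbol_map(groups)

a = apply_symbol_map("MKVLDS", symbol_map)
b = apply_symbol_map("IRLVET", symbol_map)
print(a, b, a == b)
```
</pre>
<p class="noindent" >After the transformation the two example peptides are identical, so an exact matching procedure run on the transformed sequences would treat them as a match.
</p>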
<h4 class="likesubsectionHead"><a
id="x1-4000"></a>Versatility</h4>
<!--l. 32--><p class="noindent" ><span
class="ptmri7t-x-x-120">Vmatch </span>allows a multitude of different matching tasks to be solved using the
persistent index. Every matching task is basically characterized by (1) the <span
class="ptmri7t-x-x-120">kind</span>
<span
class="ptmri7t-x-x-120">of sequences </span>to be matched, (2) the <span
class="ptmri7t-x-x-120">kind of matches </span>sought, (3) additional
<span
class="ptmri7t-x-x-120">constraints </span>on the matches, and (4) the <span
class="ptmri7t-x-x-120">kind of postprocessing </span>to be done with the
matches.
</p><!--l. 39--><p class="noindent" >In the standard case, <span
class="ptmri7t-x-x-120">Vmatch </span>matches sequences over the same alphabet. Additionally,
DNA sequences can be matched against a protein sequence index in all six reading
frames. Finally, DNA sequences can be translated in all six reading frames and
compared against themselves.
</p><!--l. 44--><p class="noindent" >Where appropriate, <span
class="ptmri7t-x-x-120">Vmatch </span>can compute the following kinds of matches, using
state-of-the-art algorithms:
</p>
<ul class="itemize1">
<li class="itemize">maximal and supermaximal repeats using the algorithms of <a
id="XABO:KUR:OHL:2004"></a>M.I.
Abouelhoda, S. Kurtz, and E. Ohlebusch. Replacing suffix trees with
enhanced suffix arrays. <span
class="ptmri7t-x-x-120">Journal of Discrete Algorithms</span>, 2:53–86, 2004
</li>
<li class="itemize">branching tandem repeats using the algorithm of <a
id="XABO:KUR:OHL:2002"></a>M.I. Abouelhoda,
S. Kurtz, and E. Ohlebusch. The enhanced suffix array and its applications
to genome analysis. In <span
class="ptmri7t-x-x-120">Proceedings of the Second Workshop on Algorithms</span>
<span
class="ptmri7t-x-x-120">in Bioinformatics</span>, pages 449–463. Lecture Notes in Computer Science
2452, Springer-Verlag, 2002
</li>
<li class="itemize">maximal (unique) substring matches using the algorithms of <a
id="XKUR:2002B"></a>S. Kurtz. A
Time and Space Efficient Algorithm for the Substring Matching Problem,
2002
</li>
<li class="itemize">complete matches using the algorithms of <a
id="XMAN:MYE:1993"></a>U. Manber and E.W. Myers.
Suffix Arrays: A New Method for On-Line String Searches. <span
class="ptmri7t-x-x-120">SIAM Journal</span>
<span
class="ptmri7t-x-x-120">on Computing</span>, 22(5):935–948, 1993 and [<a
href="#XMYE:1999">86</a>]
</li></ul>
<!--l. 69--><p class="noindent" >To compute degenerate substring matches or degenerate repeats, each kind
of match (with the exception of tandem repeats and complete matches) can
be taken as an exact seed and extended by either of two different strategies:
</p>
<ul class="itemize1">
<li class="itemize">the <span
class="ptmri7t-x-x-120">maximum error </span>extension strategy, as described in
<!--l. 77--><p class="noindent" ><a
id="XKUR:CHO:OHL:SCHLE:STO:GIE:2001"></a>S. Kurtz, J.V. Choudhuri, E. Ohlebusch, C. Schleiermacher, J. Stoye, and
R. Giegerich. REPuter: The manifold applications of repeat analysis on
a genomic scale. <span
class="ptmri7t-x-x-120">Nucleic Acids Res.</span>, 29(22):4633–4642, 2001 for repeat
detection,
</p></li>
<li class="itemize">the <span
class="ptmri7t-x-x-120">greedy </span>extension strategy of <a
id="XZHA:SCHWA:WAG:MIL:2000"></a>Z. Zhang, S. Schwartz, L. Wagner,
and W. Miller. A Greedy Algorithm for Aligning DNA Sequences.
<span
class="ptmri7t-x-x-120">J.</span><span
class="ptmri7t-x-x-120"> Comp.</span><span
class="ptmri7t-x-x-120"> Biol.</span>, 7(1/2):203–214, 2000
</li></ul>
<!--l. 84--><p class="noindent" >Matches can be selected according to their length, their E-value, their identity value, or
their match score.
</p><!--l. 87--><p class="noindent" >In the standard case, a match is displayed as an alignment including positional
information. Alternatively, a match can be postprocessed directly in different
ways:
</p>
<ul class="itemize1">
<li class="itemize"><span
class="ptmri7t-x-x-120">inverse output</span>, i.e. reporting of substrings <span
class="ptmri7t-x-x-120">not </span>covered by a match.
</li>
<li class="itemize"><span
class="ptmri7t-x-x-120">masking </span>of substrings covered by a match.
</li>
<li class="itemize"><span
class="ptmri7t-x-x-120">clustering </span>of sequences according to the matches found.
</li>
<li class="itemize"><span
class="ptmri7t-x-x-120">chaining </span>of matches, i.e. finding optimal subsets of matches which do not
cross, using the algorithms described in
<!--l. 104--><p class="noindent" ><a
id="XABO:OHL:2003"></a>M.I. Abouelhoda and E. Ohlebusch. A Local Chaining Algorithm
and its Applications in Comparative Genomics. In <span
class="ptmri7t-x-x-120">Proc. 3rd Worksh.</span>
<span
class="ptmri7t-x-x-120">Algorithms in Bioinformatics (WABI 2003)</span>, number 2812 in Lecture Notes
in Bioinformatics, pages 1–16. Springer-Verlag, 2003
</p></li>
<li class="itemize"><span
class="ptmri7t-x-x-120">clustering </span>of matches according to pairwise sequence similarities computed
by the dynamic programming algorithm of <a
id="XUKK:1985A"></a>E. Ukkonen. Algorithms for
Approximate String Matching. <span
class="ptmri7t-x-x-120">Information and Control</span>, 64:100–118, 1985
</li>
<li class="itemize"><span
class="ptmri7t-x-x-120">clustering </span>of matches according to the positions where they occur, following
the approach of
<!--l. 115--><p class="noindent" ><a
id="XVOL:HAA:SAL:2001"></a>N. Volfovsky,
B.J. Haas, and S.L. Salzberg. A Clustering Method for Repeat Analysis in
DNA Sequences. <span
class="ptmri7t-x-x-120">Genome Biology</span>, 2(8):research0027.1–0027.11, 2001
</p>
</li></ul>
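<p class="noindent" >To make the chaining step listed above more concrete, here is a minimal Python sketch that selects a maximum-weight subset of matches that do not cross. It uses a simple quadratic dynamic programming scheme, not the more efficient algorithms of Abouelhoda and Ohlebusch. The tuple layout <span
class="cmtt-12">(start1, start2, length)</span> and the function name <span
class="cmtt-12">best_chain</span> are assumptions made for this example and do not correspond to the <span
class="ptmri7t-x-x-120">Vmatch</span> output format.
</p>
<pre class="verbatim">
```python
def best_chain(matches):
    """Return a maximum-weight chain of non-crossing matches.

    Each match is a tuple (start1, start2, length) with length of at least 1.
    A match may follow another only if it starts at or after the other's end
    in both sequences. The weight of a chain is the sum of its match lengths.
    """
    ordered = sorted(matches)
    n = len(ordered)
    if n == 0:
        return []
    score = [m[2] for m in ordered]   # best chain weight ending at each match
    previous = [None] * n             # back-pointers for chain recovery
    for j in range(n):
        start1_j, start2_j, length_j = ordered[j]
        for i in range(j):
            start1_i, start2_i, length_i = ordered[i]
            follows = (start1_j >= start1_i + length_i
                       and start2_j >= start2_i + length_i)
            if follows and score[i] + length_j > score[j]:
                score[j] = score[i] + length_j
                previous[j] = i
    # Walk back from the best-scoring end point.
    j = max(range(n), key=lambda k: score[k])
    chain = []
    while j is not None:
        chain.append(ordered[j])
        j = previous[j]
    return chain[::-1]


matches = [(0, 0, 10), (12, 40, 8), (15, 12, 20), (40, 35, 5), (30, 5, 6)]
print(best_chain(matches))
```
</pre>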
<!--l. 119--><p class="noindent" >
</p>
<h4 class="likesubsectionHead"><a
id="x1-5000"></a>Efficient algorithms and data structures</h4>
<!--l. 120--><p class="noindent" ><span
class="ptmri7t-x-x-120">Vmatch </span>is based on enhanced suffix arrays, as described by Abouelhoda, Kurtz &amp; Ohlebusch,
2004. This data structure has been shown to be as powerful as suffix trees, with the
advantage of a reduced space requirement and reduced processing time. Careful
implementation of the algorithms and data structures incorporated in <span
class="ptmri7t-x-x-120">Vmatch</span>
has led to exceedingly fast and robust software, allowing very large sequence
sets to be processed quickly. The 32-bit version of <span
class="ptmri7t-x-x-120">Vmatch </span>can process up to
400 million symbols, if enough memory is available. For large server class
machines (e.g. SUN-Sparc/Solaris, Intel Xeon/Linux, Compaq-Alpha/Tru64)
<span
class="ptmri7t-x-x-120">Vmatch </span>is available as a 64-bit version, enabling gigabytes of sequences to be
processed.
</p><!--l. 138--><p class="noindent" >
</p>
<h4 class="likesubsectionHead"><a
id="x1-6000"></a>Flexible input format</h4>
<!--l. 139--><p class="noindent" >The most common formats for input sequences (Fasta, Genbank, EMBL, and
SWISSPROT) are accepted. The user does not have to specify the input format; it is
recognized automatically. All input files can contain an arbitrary number of sequences.
Gzip-compressed inputs are also accepted.
</p><!--l. 145--><p class="noindent" >
</p>
<h4 class="likesubsectionHead"><a
id="x1-7000"></a>Customized output and match selection</h4>
<!--l. 146--><p class="noindent" ><span
class="ptmri7t-x-x-120">Vmatch</span>’s output can be parsed by other programs easily. Furthermore, several options
allow for its customization. XML output is available and new output formats can easily
be incorporated without changing <span
class="ptmri7t-x-x-120">Vmatch</span>’s program code. Certain matches can easily
be selected by user defined criteria, without intermediate output and subsequent
parsing.
</p><!--l. 154--><p class="noindent" >
</p>
<h3 class="likesectionHead"><a
id="x1-8000"></a>The parts of Vmatch</h3>
<!--l. 155--><p class="noindent" >Up until now we have referred to <span
class="ptmri7t-x-x-120">Vmatch </span>as a collection of programs. In the following
we use the same name, <span
class="cmtt-12">vmatch </span>(in typewriter font), for the most important
program in this collection. Besides <span
class="cmtt-12">vmatch</span>, there are the following programs
available:
</p><ol class="enumerate1" >
<li
class="enumerate" id="x1-8002x1"><span
class="cmtt-12">mkvtree </span>constructs the persistent index and stores it in files.
</li>
<li
class="enumerate" id="x1-8004x2"><span
class="cmtt-12">mkdna6idx </span>constructs an index for a DNA sequence after translating it in
all six reading frames.
</li>
<li
class="enumerate" id="x1-8006x3"><span
class="cmtt-12">vseqinfo </span>delivers information about indexed database sequences.
</li>
<li
class="enumerate" id="x1-8008x4"><span
class="cmtt-12">vstree2tex </span>outputs a representation of the index in <span class="LATEX">L<span class="A">A</span><span class="TEX">T<span
class="E">E</span>X</span></span>-format. It can
be used, for example, for educational or debugging purposes.
</li>
<li
class="enumerate" id="x1-8010x5"><span
class="cmtt-12">vseqselect </span>selects indexed sequences satisfying specific criteria.
</li>
<li
class="enumerate" id="x1-8012x6"><span
class="cmtt-12">vsubseqselect </span>selects substrings of a specified length range from an
index.
</li>
<li
class="enumerate" id="x1-8014x7"><span
class="cmtt-12">vmigrate.sh </span>converts an index from big endian to little endian
architectures, or vice versa.
</li>
<li
class="enumerate" id="x1-8016x8"><span
class="cmtt-12">vmatchselect </span>sorts and selects matches delivered by <span
class="cmtt-12">vmatch</span>.
</li>
<li
class="enumerate" id="x1-8018x9"><span
class="cmtt-12">chain2dim </span>computes optimal chains of matches from files in
<span
class="ptmri7t-x-x-120">Vmatch</span>-format.
</li>
<li
class="enumerate" id="x1-8020x10"><span
class="cmtt-12">matchcluster </span>computes clusters of matches from files in <span
class="ptmri7t-x-x-120">Vmatch</span>-format.</li></ol>
<!--l. 85--><p class="noindent" > <a href="Dataflowfig.pdf">Here</a> is an overview of the dataflow in <i>Vmatch</i>.
</p><!--l. 87--><p class="noindent" >
</p>
<h3 class="likesectionHead"><a
id="x1-9000"></a>Related tools</h3>
<!--l. 88--><p class="noindent" >There are several tools which are based on the persistent index of <span
class="ptmri7t-x-x-120">Vmatch</span>:
</p><!--l. 91--><p class="noindent" >
</p><dl class="description"><dt class="description">
<span
class="ptmb7t-x-x-120">Genalyzer</span> </dt><dd
class="description">is a graphical user interface to visualize the output of <span
class="ptmri7t-x-x-120">Vmatch </span>in the form
of a match graph. For details see
<!--l. 97--><p class="noindent" ><a
id="XCHO:SCHLE:KUR:GIE:2004"></a>J.V. Choudhuri, C. Schleiermacher, S. Kurtz, and R. Giegerich.
Genalyzer: Interactive visualization of sequence similarities between entire
genomes. <span
class="ptmri7t-x-x-120">Bioinformatics</span>, 20:1964–1965, 2004
</p><!--l. 99--><p class="noindent" >Genalyzer is no longer available.
</p></dd><dt class="description">
<a
href="http://bibiserv.techfak.uni-bielefeld.de/mga/" ><span
class="ptmb7t-x-x-120">MGA</span></a> </dt><dd
class="description">is a program to compute multiple alignments of complete genomes. For
details see
<!--l. 104--><p class="noindent" ><a
id="XHOEH:KUR:OHL:2002"></a>M. Höhl, S. Kurtz, and E. Ohlebusch. Efficient multiple genome
alignment. <span
class="ptmri7t-x-x-120">Bioinformatics</span>, 18(Suppl. 1):S312–S320, 2002
</p></dd><dt class="description">
<span
class="ptmb7t-x-x-120">Multimat</span> </dt><dd
class="description">is a program to compute multiple exact matches between three or more
genome size sequences. For details see
<!--l. 108--><p class="noindent" ><a
id="XOHL:KUR:2008"></a>E. Ohlebusch and S. Kurtz. Space efficient computation of rare
maximal exact matches between multiple sequences. <span
class="ptmri7t-x-x-120">J.</span><span
class="ptmri7t-x-x-120"> Comp.</span><span
class="ptmri7t-x-x-120"> Biol.</span>,
15(4):357–377, 2008
</p><!--l. 110--><p class="noindent" >Please contact <a
href="http://www.zbh.uni-hamburg.de/kurtz" >Stefan Kurtz</a> if you are interested in using Multimat.
</p></dd><dt class="description">
<a
href="http://bibiserv.techfak.uni-bielefeld.de/possumsearch/" ><span
class="ptmb7t-x-x-120">PossumSearch</span></a> </dt><dd
class="description">is a program to search for matches of position specific scoring matrices. For
details, see
<!--l. 118--><p class="noindent" ><a
id="XBEC:HOM:GIE:KUR:2006"></a>M. Beckstette, R. Homann, R. Giegerich, and S. Kurtz. Fast index based
algorithms and software for matching position specific scoring matrices.
<span
class="ptmri7t-x-x-120">BMC Bioinformatics</span>, 7:389, 2006
</p></dd><dt class="description">
</dt><dd
class="description">
</dd><dt class="description">
<a
href="http://www.genomethreader.org/" ><span
class="ptmb7t-x-x-120">GenomeThreader</span></a> </dt><dd
class="description">is a software tool to compute gene structure predictions. The
gene structure predictions are calculated using a similarity-based approach
where additional cDNA/EST and/or protein sequences are used to predict
gene structures via spliced alignments. <span
class="ptmri7t-x-x-120">GenomeThreader </span>uses the matching
capabilities of <span
class="ptmri7t-x-x-120">Vmatch </span>to efficiently map the reference sequence to a
genomic sequence. For details, see
<!--l. 128--><p class="noindent" ><a
id="XGRE:BRE:SPA:KUR:2005"></a>G. Gremme, V. Brendel, M.E. Sparks, and S. Kurtz. Engineering a
software tool for gene prediction in higher organisms. <span
class="ptmri7t-x-x-120">Information and</span>
<span
class="ptmri7t-x-x-120">Software Technology</span>, 47(15):965–978, 2005
</p></dd><dt class="description">
</dt><dd
class="description">
</dd><dt class="description">
<a
href="http://www.biopieces.org/" ><span
class="ptmb7t-x-x-120">Biopieces</span></a> </dt><dd
class="description">is a collection of bioinformatics tools that can be pieced together
in a very easy and flexible manner to perform both simple and
complex tasks. Some Biopieces depend on <span
class="ptmri7t-x-x-120">Vmatch</span>. For details see
<a
href="http://www.biopieces.org/" class="url" ><span
class="cmtt-12">http://www.biopieces.org/</span></a>.</dd></dl>
<!--l. 139--><p class="noindent" > <a name="CurrentUsage"/>
</p>
<h3 class="likesectionHead"><a
id="x1-10000"></a>Previous and Current Usages</h3>
<!--l. 142--><p class="noindent" >We provide an annotated bibliography listing papers which applied <span
class="ptmri7t-x-x-120">Vmatch </span>and briefly
describe the tasks for which <span
class="ptmri7t-x-x-120">Vmatch </span>was used. We omit our own papers. The references
were collected by a <a
href="https://scholar.google.de/scholar?q=Vmatch+AND+Kurtz+OR+www.vmatch.de" >search in Google Scholar</a> (which, as of Jan 2, 2016, retrieved 397
results).
</p><!--l. 149--><p class="noindent" >
</p>
<h4 class="likesubsectionHead"><a
id="x1-11000"></a>Usages in Plant Genome Research</h4>
<!--l. 150--><p class="noindent" >
</p><ol class="enumerate1" >
<li
class="enumerate" id="x1-11002x1"><a
id="XBRE:KUR:WAL:2002"></a>V. Brendel, S. Kurtz, and V. Walbot. Comparative genomics of
Arabidopsis and Maize: Prospects and limitations. <span
class="ptmri7t-x-x-120">Genome Biology</span>,
3(3):reviews1005.1–1005.6, 2002
<!--l. 153--><p class="noindent" >In this work <span
class="ptmri7t-x-x-120">Vmatch </span>was used to compute a non-redundant set from a
large collection of protein sequences from maize (Zea mays).
</p><!--l. 155--><p class="noindent" >Similar applications are described in
</p><!--l. 157--><p class="noindent" ><a
id="XDON:ROY:FRE:WAL:BRE:2003"></a>Q. Dong, L. Roy, M. Freeling, V. Walbot, and V. Brendel. ZmDB, an
integrated Database for Maize Genome Research. <span
class="ptmri7t-x-x-120">Nucleic Acids Res.</span>,
31:244–247, 2003.
</p></li>
<li
class="enumerate" id="x1-11004x2">PLEXdb is a database for gene expression resources for plants and plant
pathogens, see
<!--l. 166--><p class="noindent" ><a
id="XDAS:VAN:HON:WIS:DIC:2012"></a>S. Dash, J. Van Hemert, L. Hong, R. P. Wise, and J. A. Dickerson.
PLEXdb: gene expression resources for plants and plant pathogens. <span
class="ptmri7t-x-x-120">Nucleic</span>
<span
class="ptmri7t-x-x-120">Acids Res.</span>, 40(Database issue):D1194–1201, Jan 2012
</p><!--l. 168--><p class="noindent" >PLEXdb provides a <span
class="ptmri7t-x-x-120">Vmatch</span>-based <a
href="http://www.plantgdb.org/cgi-bin/prj/PLEXdb/ProbeMatch.pl" >web-service</a> to match PLEXdb probes.
</p></li>
<li
class="enumerate" id="x1-11006x3">The assembly of the Arabidopsis thaliana genome from 2004 (GenBank
entries of 2/19/04) contained vector sequence contaminations. For example,
region 3 617 880 to 3 625 027 of chromosome II contained a cloning vector.
<span
class="ptmri7t-x-x-120">Vmatch </span>was used to detect the vector contamination, see <a
href="http://www.plantgdb.org/AtGDB/Annotation/vector.php" >here</a>.
</li>
<li
class="enumerate" id="x1-11008x4"><a
id="XDON:LAW:SCHLUE:WIL:KUR:LUS:BRE:2005"></a>Q. Dong, C.J. Lawrence, S.D. Schlueter, M.D. Wilkerson, S. Kurtz,
C. Lushbough, and V. Brendel. Comparative Plant Genomics Resources at
PlantGDB. <span
class="ptmri7t-x-x-120">Plant Physiology, Plant Database Focus Issue</span>, 2005
<!--l. 183--><p class="noindent" >This work describes PlantGDB, which provides a service called
<a
href="http://www.plantgdb.org/PlantGDB-cgi/vmatch/patternsearch.pl" >PatternSearch@PlantGDB</a> for genome wide pattern searches in plant
sequences. The service is based on <span
class="ptmri7t-x-x-120">Vmatch</span>.
</p></li>
<li
class="enumerate" id="x1-11010x5"><a
id="XLIN:KRO:2005"></a>M. Lindow and A. Krogh. Computational evidence for hundreds of
non-conserved plant microRNAs. <span
class="ptmri7t-x-x-120">BMC Genomics</span>, 6(1):119, 2005
<!--l. 202--><p class="noindent" >In this work <span
class="ptmri7t-x-x-120">Vmatch </span>was used for three different tasks: </p>
<ul class="itemize1">
<li class="itemize">Searching spliced mRNA in the Arabidopsis genome to detect
micromatches of length at least 20 with at most 2 mismatches.
</li>
<li class="itemize">Finding matches of length at least 15 with at most one mismatch
between predicted mature miRNA-sequences and a set of ESTs as well
as sequences from the Arabidopsis Small RNA Project (ASRP).
</li>
<li class="itemize">Aligning and performing single linkage clustering of the predicted
mature miRNA sequences. Candidate pairs aligning over at least 17
bases, allowing an edit distance of 1, were grouped in the same family.</li></ul>
</li>
<li
class="enumerate" id="x1-11012x6"><a
id="XPOM:LEM:TUR:2006"></a>J.-F. Pombert, C. Lemieux, and M. Turmel. The complete chloroplast DNA
sequence of the green alga Oltmannsiellopsis viridis reveals a distinctive
quadripartite architecture in the chloroplast genome of early diverging ulvophytes.
<span
class="ptmri7t-x-x-120">BMC Biology</span>, 4:3, 2006
<!--l. 207--><p class="noindent" ><a
id="XTUR:OTI:LEM:2006"></a>M. Turmel, C. Otis, and C. Lemieux. The Chloroplast Genome Sequence of
Chara vulgaris Sheds New Light into the Closest Green Algal Relatives of Land
Plants. <span
class="ptmri7t-x-x-120">Molecular Biology and Evolution</span>, 23:1324–1338, 2006
</p><!--l. 209--><p class="noindent" >In these papers <span
class="ptmri7t-x-x-120">Vmatch </span>was used to search and compare repeated elements in
different chloroplast DNA.
</p></li>
<li
class="enumerate" id="x1-11014x7"><a
id="XSPA:NOU:HAA:YAN:GUN:HIN:KLE:HAB:SCHOO:MAY:2007"></a>M. Spannagl, O. Noubibou, D. Haase, L. Yang, H. Gundlach, T. Hindemitt,
K. Klee, G. Haberer, H. Schoof, and K.F.X. Mayer. MIPSPlantsDB–plant
database resource for integrative and comparative plant genome research. <span
class="ptmri7t-x-x-120">Nucleic</span>
<span
class="ptmri7t-x-x-120">Acids Res</span>, 35(Database issue):D834–40, 2007
<p class="noindent" >In this work about the <span
class="ptmri7t-x-x-120">MIPSPlantsDB </span>database <span
class="ptmri7t-x-x-120">Vmatch </span>was used to cluster large sequence
sets.
</p>
</li>
<li
class="enumerate" id="x1-11016x8"><a
id="XSCHIJ:VOS:MAR:JON:ROS:MOL:TIK:ANG:TUN:BOV:2007"></a>E.G.W.M. Schijlen, C.H. Ric de Vos, S. Martens, H.H. Jonker, F.M. Rosin, J.W.
Molthoff, Y.M. Tikunov, G.C. Angenent, A.J. van Tunen, and A.G. Bovy. RNA
interference silencing of chalcone synthase, the first step in the flavonoid
biosynthesis pathway, leads to parthenocarpic tomato fruits. <span
class="ptmri7t-x-x-120">Plant Physiol</span>,
144(3):1520–30, 2007
<!--l. 218--><p class="noindent" >In this work <span
class="ptmri7t-x-x-120">Vmatch </span>was used to compare target genes of the tomato Chs RNAi
to a tomato gene index.
</p></li>
<li
class="enumerate" id="x1-11018x9"><a
id="XLIN:JAC:NYG:MAN:KRO:2007"></a>M. Lindow, A. Jacobsen, S. Nygaard, Y. Mang, and A. Krogh. Intragenomic
matching reveals a huge potential for miRNA-mediated regulation in plants. <span
class="ptmri7t-x-x-120">PLOS</span>
<span
class="ptmri7t-x-x-120">Comput. Biol</span>, 3(11):e238, 2007
<!--l. 223--><p class="noindent" >In this work <span
class="ptmri7t-x-x-120">Vmatch </span>was used to search different plant genomes for matches of
length at least 20 with at most 2 mismatches. Here the fact that <span
class="ptmri7t-x-x-120">Vmatch </span>is an
exhaustive search tool is important.
</p></li>
<li
class="enumerate" id="x1-11020x10"><a
id="XDEC:OTI:THU:LEM:2007"></a>J.-C. de Cambiaire, C. Otis, M. Turmel, and C. Lemieux. The chloroplast
genome sequence of the green alga Leptosira terrestris: multiple losses of
the inverted repeat and extensive genome rearrangements within the
Trebouxiophyceae. <span
class="ptmri7t-x-x-120">BMC Genomics</span>, 8(1):213, 2007
<!--l. 228--><p class="noindent" >In this work <span
class="ptmri7t-x-x-120">Vmatch </span>was used to determine the presence of shared repeated
elements of minimum length 30, with up to 10% mismatches, in different
sequence sets from the green alga <span
class="ptmri7t-x-x-120">Leptosira terrestris</span>.
</p></li>
<li
class="enumerate" id="x1-11022x11"><a
id="XOSS:SCHNE:CLA:LAN:WAR:WEI:2008"></a>S. Ossowski, K. Schneeberger, R.M. Clark, C. Lanz, N. Warthmann, and
D. Weigel. Sequencing of natural strains of Arabidopsis thaliana with short
reads. <span
class="ptmri7t-x-x-120">Genome Res.</span>, 18:2024–2033, 2008
<!--l. 235--><p class="noindent" >In this work <span
class="ptmri7t-x-x-120">Vmatch </span>was used to map millions of short sequence reads to the
<span
class="ptmri7t-x-x-120">A.</span><span
class="ptmri7t-x-x-120"> thaliana </span>genome. Up to four mismatches and up to three indels were allowed
in the matching process. The seed size was chosen to be 0. The reads were aligned
using the best match strategy by iteratively increasing the allowed number of
mismatches and gaps at each round.
</p></li>
<li
class="enumerate" id="x1-11024x12"><a
id="XDIBO:OSS:SCHNE:RAT:2008"></a>F. De Bona, S. Ossowski, K. Schneeberger, and G. Ratsch. Optimal spliced
alignments of short sequence reads. <span
class="ptmri7t-x-x-120">Bioinformatics</span>, 24(16):i174–180,
2008
<!--l. 242--><p class="noindent" >In this work <span
class="ptmri7t-x-x-120">Vmatch </span>was used to map millions of short sequence reads to the
<span
class="ptmri7t-x-x-120">A.</span><span
class="ptmri7t-x-x-120"> thaliana </span>genome. <span
class="ptmri7t-x-x-120">Vmatch </span>was part of a multi-step pipeline, combining a fast
matching algorithm (<span
class="ptmri7t-x-x-120">Vmatch</span>) for initial read mapping and an optimal alignment
algorithm based on dynamic programming (QPALMA) for high quality detection
of splice sites.
</p></li>
<li
class="enumerate" id="x1-11026x13"><a
id="XASS:HER:LIN:HUE:TAL:SMA:IMM:ELD:FIE:SCHAT:2010"></a>A. G. L. Assunção, E. Herrero, Y-F. Lin, B. Huettel, S. Talukdar,
C. Smaczniak, R. GH Immink, M. Van Eldik, M. Fiers, H. Schat, et al.
Arabidopsis thaliana transcription factors bzip19 and bzip23 regulate the
adaptation to zinc deficiency. <span
class="ptmri7t-x-x-120">Proceedings of the National Academy of Sciences</span>,
107(22):10296–10301, 2010
<!--l. 245--><p class="noindent" >In this work <span
class="ptmri7t-x-x-120">Vmatch </span>was used for motif searching in different plant
genomes.
</p></li>
<li
class="enumerate" id="x1-11028x14"><a
id="XEVE:SAT:GOL:MEY:BET:SAK:WAR:JAC:2010"></a>Andrea L Eveland, Namiko Satoh-Nagasawa, Alexander Goldshmidt, Sandra
Meyer, Mary Beatty, Hajime Sakai, Doreen Ware, and David Jackson. Digital
gene expression signatures for maize development. <span
class="ptmri7t-x-x-120">Plant physiology</span>,
154(3):1024–1039, 2010
<!--l. 248--><p class="noindent" >In this work <span
class="ptmri7t-x-x-120">Vmatch </span>was used to map unique consensus sequence tags to the
maize reference genome.
</p></li>
<li
class="enumerate" id="x1-11030x15"><a
id="XBRO:OTI:LEM:TUR:2010"></a>Jean-Simon Brouard, Christian Otis, Claude Lemieux, and Monique Turmel. The
exceptionally large chloroplast genome of the green alga floydiella terrestris
illuminates the evolutionary history of the chlorophyceae. <span
class="ptmri7t-x-x-120">Genome biology and</span>
<span
class="ptmri7t-x-x-120">evolution</span>, 2:240, 2010
<!--l. 252--><p class="noindent" >In this work <span
class="ptmri7t-x-x-120">Vmatch </span>was used to identify and cluster repeated sequences in the
<span
class="ptmri7t-x-x-120">Floydiella </span>chloroplast genome.
</p></li>
<li
class="enumerate" id="x1-11032x16"><a
id="XREH:AQU:GRU:HEN:HIL:LAU:NAO:PAT:ROM:SHU:2010"></a>Hubert Rehrauer, Catharine Aquino, Wilhelm Gruissem, Stefan R Henz, Pierre
Hilson, Sascha Laubinger, Naira Naouar, Andrea Patrignani, Stephane Rombauts,
Huan Shu, et al. Agronomics1: a new resource for arabidopsis transcriptome
profiling. <span
class="ptmri7t-x-x-120">Plant Physiology</span>, 152(2):487–499, 2010
<!--l. 257--><p class="noindent" >In this work <span
class="ptmri7t-x-x-120">Vmatch </span>was used to calculate direct and reverse complementary
matches of length 17 bp or greater with edit distance 1 or less between
five nuclear chromosomes and mitochondrial and chloroplast genome
sequences.
</p></li>
<li
class="enumerate" id="x1-11034x17"><a
id="XSEK:LIN:CHI:HAN:BUE:LEO:KAE:2011"></a>R. S. Sekhon, H. Lin, K. L. Childs, C. N. Hansey, C. R. Buell, N. de Leon,
and S. M. Kaeppler. Genome-wide atlas of transcription during maize
development. <span
class="ptmri7t-x-x-120">Plant J.</span>, 66(4):553–563, May 2011
<!--l. 261--><p class="noindent" >In this work <span
class="ptmri7t-x-x-120">Vmatch </span>was used to search probe sequences against the maize
genome and the cDNA sequences of the official maize gene models.
</p></li>
<li
class="enumerate" id="x1-11036x18"><a
id="XDAS:OH:HAA:HER:HON:ALI:YUN:BRE:ZHU:BOH:2011"></a>M. Dassanayake, D. H. Oh, J. S. Haas, A. Hernandez, H. Hong, S. Ali, D. J.
Yun, R. A. Bressan, J. K. Zhu, H. J. Bohnert, and J. M. Cheeseman. The
genome of the extremophile crucifer Thellungiella parvula. <span
class="ptmri7t-x-x-120">Nat. Genet.</span>,
43(9):913–918, Sep 2011
<!--l. 266--><p class="noindent" >In this work <span
class="ptmri7t-x-x-120">Vmatch </span>was used for clustering sequences assembled from 454-reads
of <span
class="ptmri7t-x-x-120">Thellungiella parvula</span>, a model for the evolution of plant adaptation to extreme
environments.
</p></li>
<li
class="enumerate" id="x1-11038x19"><a
id="XWIL:HOF:KLE:WEI:2011"></a>E. M. Willing, M. Hoffmann, J. D. Klein, D. Weigel, and C. Dreyer.
Paired-end RAD-seq for de novo assembly and marker design without available
reference. <span
class="ptmri7t-x-x-120">Bioinformatics</span>, 27(16):2187–2193, Aug 2011
<!--l. 270--><p class="noindent" >In this work <span
class="ptmri7t-x-x-120">Vmatch </span>was used for grouping short reads into pools representing
the same RAD tag.
</p></li>
<li
class="enumerate" id="x1-11040x20"><a
id="XGAO:ZHO:WAN:SU:WAN:2011"></a>L. Gao, Y. Zhou, Z.-W. Wang, Y.-J. Su, and T. Wang. Evolution of the
<span
class="ptmri7t-x-x-120">rpoB-psbZ </span>region in fern plastid genomes: notable structural rearrangements
and highly variable intergenic spacers. <span
class="ptmri7t-x-x-120">BMC Plant Biology</span>, 11(1):64,
2011
<!--l. 274--><p class="noindent" >In this work <span
class="ptmri7t-x-x-120">Vmatch </span>was used for detecting and clustering repetitive sequences in
diverse fern plastid genomes.
</p></li>
<li
class="enumerate" id="x1-11042x21"><a
id="XSLO:ALV:CHU:WU:MCC:PAL:TAY:2012"></a>D. B. Sloan, A. J. Alverson, J. P. Chuckalovcak, M. Wu, D. E. McCauley,
J. D. Palmer, and D. R. Taylor. Rapid evolution of enormous, multichromosomal
genomes in flowering plant mitochondria with exceptionally high mutation rates.
<span
class="ptmri7t-x-x-120">PLoS Biol.</span>, 10(1):e1001241, Jan 2012
<!--l. 278--><p class="noindent" >In this work <span
class="ptmri7t-x-x-120">Vmatch </span>was used to precisely define the boundaries of all repeats
with 100% sequence identity.
</p></li>
<li
class="enumerate" id="x1-11044x22"><a
id="XDUB:FAR:SCHLU:CAN:ABE:TUT:WOO:SHA:MUL:KUD:2011"></a>Anuja Dubey, Andrew Farmer, Jessica Schlueter, Steven B Cannon, Brian
Abernathy, Reetu Tuteja, Jimmy Woodward, Trushar Shah, Benjamin
Mulasmanovic, Himabindu Kudapa, et al. Defining the transcriptome
assembly and its use for genome dynamics and transcriptome profiling
studies in pigeonpea (<span
class="ptmri7t-x-x-120">Cajanus cajan </span>l.). <span
class="ptmri7t-x-x-120">DNA research</span>, 18(3):153–164,
2011
<!--l. 281--><p class="noindent" >In this work <span
class="ptmri7t-x-x-120">Vmatch </span>was used to cluster sequences based on their six-frame
translation.
</p></li>
<li
class="enumerate" id="x1-11046x23"><a
id="XSAX:PEN:UPA:KUM:CAR:SCHLU:FAR:WHA:SAR:MAY:2012"></a>Rachit K Saxena, R Varma Penmetsa, Hari D Upadhyaya, Ashish Kumar,
Noelia Carrasquilla-Garcia, Jessica A Schlueter, Andrew Farmer, Adam M
Whaley, Birinchi K Sarma, Gregory D May, et al. Large-scale development of
cost-effective single-nucleotide polymorphism marker assays for genetic
mapping in pigeonpea and comparative mapping in legumes. <span
class="ptmri7t-x-x-120">DNA research</span>,
19(6):449–461, 2012
<!--l. 285--><p class="noindent" >In this work <span
class="ptmri7t-x-x-120">Vmatch </span>was used to identify reciprocal best matches between the
pigeonpea sequences and other legume sequences.
</p></li>
<li
class="enumerate" id="x1-11048x24"><a
id="XHAZ:REE:RIS:PEC:2012"></a>B. Z. Haznedaroglu, D. Reeves, H. Rismani-Yazdi, and J. Peccia. Optimization
of de novo transcriptome assembly from high-throughput short read sequencing
data improves functional annotation for non-model organisms. <span
class="ptmri7t-x-x-120">BMC</span>
<span
class="ptmri7t-x-x-120">Bioinformatics</span>, 13:170, 2012
<!--l. 290--><p class="noindent" >In this work <span
class="ptmri7t-x-x-120">Vmatch </span>was used for assembly clustering and optimization
of contigs for <span
class="ptmri7t-x-x-120">Neochloris oleoabundans </span>(a green microalga of the class
Chlorophyceae).
</p></li>
<li
class="enumerate" id="x1-11050x25"><a
id="XMAR:KLE:BAN:BLA:MAC:SCHMU:SCHOL:GUN:WIC:SIM:2012"></a>M. M. Martis, S. Klemme, A. M. Banaei-Moghaddam, F. R. Blattner,
J. Macas, T. Schmutzer, U. Scholz, H. Gundlach, T. Wicker, H. Šimková,
P. Novak, P. Neumann, M. Kubalakova, E. Bauer, G. Haseneyer, J. Fuchs,
J. Dolezel, N. Stein, K. F. Mayer, and A. Houben. Selfish supernumerary
chromosome reveals its origin as a mosaic of host genome and organellar
sequences. <span
class="ptmri7t-x-x-120">Proc. Natl. Acad. Sci. U.S.A.</span>, 109(33):13343–13346, Aug
2012
<!--l. 294--><p class="noindent" >In this work <span
class="ptmri7t-x-x-120">Vmatch </span>was used to match reads against a repeat library to identify
the content of the repetitive DNA per sequence read.
</p></li>
<li
class="enumerate" id="x1-11052x26"><a
id="XCHI:DAV:BUE:2011"></a>K. L. Childs, R. M. Davidson, and C. R. Buell. Gene coexpression network
analysis as a source of functional annotation for rice genes. <span
class="ptmri7t-x-x-120">PloS one</span>,
6(7):e22196, 2011
<!--l. 297--><p class="noindent" >In this work <span
class="ptmri7t-x-x-120">Vmatch </span>was used to align individual probes to representative gene
models.
</p></li>
<li
class="enumerate" id="x1-11054x27"><a
id="XSEV:DIJ:HAM:2011"></a>E. I. Severing, A. D. J. van Dijk, and R. C. H. J. van Ham. Assessing the
contribution of alternative splicing to proteome diversity in arabidopsis thaliana
using proteomics data. <span
class="ptmri7t-x-x-120">BMC Plant Biology</span>, 11(1):82, 2011
<!--l. 301--><p class="noindent" >In this work <span
class="ptmri7t-x-x-120">Vmatch </span>was used for performing exact searches with peptides
against the filtered proteome of <span
class="ptmri7t-x-x-120">A. thaliana</span>.
</p></li>
<li
class="enumerate" id="x1-11056x28"><a
id="XWOL:WEI:SEG:ROS:BEI:DON:SPI:NOR:REH:KOE:2011"></a>P. Wolff, I. Weinhofer, J. Seguin, P. Roszak, C. Beisel, M.T. Donoghue,
C. Spillane, M. Nordborg, M. Rehmsmeier, and C. Köhler. High-resolution
analysis of parent-of-origin allelic expression in the arabidopsis endosperm. <span
class="ptmri7t-x-x-120">PLoS</span>
<span
class="ptmri7t-x-x-120">Genet</span>, 7(6):e1002126–e1002126, 2011
<!--l. 307--><p class="noindent" >In this work <span
class="ptmri7t-x-x-120">Vmatch </span>was used to map RNAseq reads, allowing up to two
mismatches (option <span
class="cmtt-12">-h 2</span>) and generating maximal substring matches that are
unique in some reference dataset (option <span
class="cmtt-12">-mum cand</span>).
</p></li>
<li
class="enumerate" id="x1-11058x29"><a
id="XFLE:KHA:JOH:YOU:MIT:WRE:HES:FOS:SCHAR:SCO:2011"></a>D. J. Fleetwood, A. K. Khan, R. D. Johnson, C. A. Young, S. Mittal, R. E.
Wrenn, U. Hesse, S. J. Foster, C. L. Schardl, and B. Scott. Abundant
degenerate miniature inverted-repeat transposable elements in genomes of
epichloid fungal endophytes of grasses. <span
class="ptmri7t-x-x-120">Genome Biol Evol</span>, 3:1253–1264,
2011
<!--l. 312--><p class="noindent" >In this work <span
class="ptmri7t-x-x-120">Vmatch </span>was used to identify terminal inverted repeats of length
range 10-65 bp, <span
class="zptmcm7y-x-x-120">≥ </span><span
class="zptmcm7t-x-x-120">80% </span>identity, maximum inter-TIR distance 650 bp in
genomes of epichloid fungal endophytes of grasses.
</p></li>
<li
class="enumerate" id="x1-11060x30"><a
id="XCHI:KON:BUE:2012"></a>K. L. Childs, K. Konganti, and C. R. Buell. The Biofuel Feedstock Genomics
Resource: a web-based portal and database to enable functional genomics
of plant biofuel feedstock species. <span
class="ptmri7t-x-x-120">Database (Oxford)</span>, 2012:bar061,
2012
<!--l. 315--><p class="noindent" >In this work <span
class="ptmri7t-x-x-120">Vmatch </span>was used to match putative unique transcript sequence
assemblies.
</p></li>
<li
class="enumerate" id="x1-11062x31"><a
id="XCHE:CAS:BAI:RED:MIC:2012"></a>Y. Chen, B. J. Cassone, X. Bai, M. G. Redinbaugh, and A. P. Michel.
Transcriptome of the plant virus vector Graminella nigrifrons, and the molecular
interactions of maize fine streak rhabdovirus transmission. <span
class="ptmri7t-x-x-120">PLoS ONE</span>,
7(7):e40613, 2012
<!--l. 319--><p class="noindent" >In this work <span
class="ptmri7t-x-x-120">Vmatch </span>was used for refining assemblies of Illumina reads in
the context of a transcriptome project for plant virus vector <span
class="ptmri7t-x-x-120">Graminella</span>
<span
class="ptmri7t-x-x-120">nigrifrons</span>.
</p></li>
<li
class="enumerate" id="x1-11064x32"><a
id="XKRI:PAT:JAI:GAU:CHOU:VAI:DEE:HAR:KRI:NAI:2012"></a>N. M. Krishnan, S. Pattnaik, P. Jain, P. Gaur, R. Choudhary, S. Vaidyanathan,
S. Deepak, A. K. Hariharan, P. B. Krishna, J. Nair, L. Varghese, N. K.
Valivarthi, K. Dhas, K. Ramaswamy, and B. Panda. A draft of the genome and
four transcriptomes of a medicinal and pesticidal angiosperm Azadirachta indica.
<span
class="ptmri7t-x-x-120">BMC Genomics</span>, 13:464, 2012
<!--l. 324--><p class="noindent" >In this work <span
class="ptmri7t-x-x-120">Vmatch </span>was used for clustering repeats and for building a consensus
repeat library in the context of genome and transcriptome projects for <span
class="ptmri7t-x-x-120">Azadirachta</span>
<span
class="ptmri7t-x-x-120">indica</span>, a medicinal and pesticidal angiosperm.
</p></li>
<li
class="enumerate" id="x1-11066x33"><a
id="XLIU:KUM:ZHA:ZHE:WAR:2012"></a>Z. Liu, S. Kumari, L. Zhang, Y. Zheng, and D. Ware. Characterization of
mirnas in response to short-term waterlogging in three inbred lines of zea mays.
<span
class="ptmri7t-x-x-120">PLoS One</span>, 7(6):e39786, 2012
<!--l. 328--><p class="noindent" >In this work <span
class="ptmri7t-x-x-120">Vmatch </span>was used to map unique consensus sequence tags to the
maize reference genome and to predict targets of novel miRNAs.
</p></li>
<li
class="enumerate" id="x1-11068x34"><a
id="XBOU:KOU:PAV:MIN:TSA:DAR:2012"></a>A. Bousios, Y. A. I. Kourmpetis, P. Pavlidis, E. Minga, A. Tsaftaris, and
N. Darzentas. The turbulent life of sirevirus retrotransposons and the evolution of
the maize genome: more than ten thousand elements tell the story. <span
class="ptmri7t-x-x-120">The Plant</span>
<span
class="ptmri7t-x-x-120">Journal</span>, 69(3):475–488, 2012
<!--l. 331--><p class="noindent" >In this work <span
class="ptmri7t-x-x-120">Vmatch </span>was used for masking long terminal repeats in the maize
genome sequence.
</p></li>
<li
class="enumerate" id="x1-11070x35">In the papers
<!--l. 335--><p class="noindent" ><a
id="XHER:MAR:DOR:PFE:GAL:SCHAA:JOU:SIM:VAL:DOL:2012"></a>P. Hernandez, M. Martis, G. Dorado, M. Pfeifer, S. Galvez, S. Schaaf, N. Jouve,
H. Šimková, M. Valarik, J. Dolezel, and K. F. Mayer. Next-generation
sequencing and syntenic integration of flow-sorted arms of wheat chromosome
4A exposes the chromosome structure and gene content. <span
class="ptmri7t-x-x-120">Plant J.</span>, 69(3):377–386,
Feb 2012
</p><!--l. 337--><p class="noindent" ><a
id="XPHI:PAU:BER:SOU:CHO:LAU:SIM:SAF:BEL:VAU:2013"></a>R. Philippe, E. Paux, I. Bertin, P. Sourdille, F. Choulet, C. Laugier,
H. Šimková, J. Šafář, A. Bellec, S. Vautrin, et al. A high density physical map
of chromosome 1bl supports evolutionary studies, map-based cloning and
sequencing in wheat. <span
class="ptmri7t-x-x-120">Genome Biol</span>, 14(6):R64, 2013
</p><!--l. 339--><p class="noindent" ><span
class="ptmri7t-x-x-120">Vmatch </span>was used to mask repetitive DNA.
</p></li>
<li
class="enumerate" id="x1-11072x36"><a
id="XHOW:YU:KNA:CRO:KOL:DOL:LOR:DEA:2013"></a>G. T. Howe, J. Yu, B. Knaus, R. Cronn, S. Kolpak, P. Dolan, W. W. Lorenz,
and J. F. Dean. A SNP resource for Douglas-fir: de novo transcriptome
assembly and SNP detection and validation. <span
class="ptmri7t-x-x-120">BMC Genomics</span>, 14:137,
2013
<!--l. 342--><p class="noindent" >In this work <span
class="ptmri7t-x-x-120">Vmatch </span>was used to cluster 40 010 assembled isotigs.
</p></li>
<li
class="enumerate" id="x1-11074x37"><a
id="XKAR:HAA:MAL:GEE:BOV:LAM:ANG:MAA:2013"></a>R. Karlova, J. C. van Haarst, C. Maliepaard, H. van de Geest, A. G. Bovy,
M. Lammers, G. C. Angenent, and R. A. de Maagd. Identification of
microRNA targets in tomato fruit development using high-throughput
sequencing and degradome analysis. <span
class="ptmri7t-x-x-120">J. Exp. Bot.</span>, 64(7):1863–1878, Apr
2013
<!--l. 346--><p class="noindent" >In this work <span
class="ptmri7t-x-x-120">Vmatch </span>was used to preprocess short reads in the context of
identifying microRNA targets in tomato fruit development.
</p></li>
<li
class="enumerate" id="x1-11076x38"><a
id="XGRO:MAR:SIM:ABR:WAN:VIS:2013"></a>S. M. Gross, J. A. Martin, J. Simpson, M. J. Abraham-Juarez, Z. Wang, and
A. Visel. De novo transcriptome assembly of drought tolerant CAM
plants, Agave deserti and Agave tequilana. <span
class="ptmri7t-x-x-120">BMC Genomics</span>, 14:563,
2013
<!--l. 351--><p class="noindent" >In this work <span
class="ptmri7t-x-x-120">Vmatch </span>was used in an all-vs-all comparison to bin contigs into loci
based on a minimum of 200 bp sequence overlap in the context of transcriptome
assembly for two Agave species.
</p></li>
<li
class="enumerate" id="x1-11078x39"><a
id="XKAN:HEL:DUR:WIN:ENG:BEH:HOL:BRA:HAU:FER:2013"></a>U. Kanter, W. Heller, J. Durner, J. B. Winkler, M. Engel, H. Behrendt,
A. Holzinger, P. Braun, M. Hauser, F. Ferreira, K. Mayer, M. Pfeifer, and
D. Ernst. Molecular and immunological characterization of ragweed (Ambrosia
artemisiifolia L.) pollen after exposure of the plants to elevated ozone over a
whole growing season. <span
class="ptmri7t-x-x-120">PLoS ONE</span>, 8(4):e61518, 2013
<!--l. 354--><p class="noindent" >In this work <span
class="ptmri7t-x-x-120">Vmatch </span>was used to align 454-reads to assembled isotigs for
ragweed pollen.
</p></li>
<li
class="enumerate" id="x1-11080x40"><a
id="XKUG:SIE:NUS:AME:SPAN:STEI:LEM:MAY:BUE:SCHWE:2013"></a>K. G. Kugler, G. Siegwart, T. Nussbaumer, C. Ametz, M. Spannagl,
B. Steiner, M. Lemmens, K. F. X. Mayer, H. Buerstmayr, and W. Schweiger.
Quantitative trait loci-dependent analysis of a gene co-expression network
associated with fusarium head blight resistance in bread wheat (triticum aestivum
l.). <span
class="ptmri7t-x-x-120">BMC Genomics</span>, 14(1):728, 2013
<!--l. 357--><p class="noindent" >In this work <span
class="ptmri7t-x-x-120">Vmatch </span>was used for comparing gene sets.
</p></li>
<li
class="enumerate" id="x1-11082x41"><a
id="XMAR:ZHO:HAS:SCHMU:VRA:KUB:KOEN:KUG:SCHOL:HAC:2013"></a>Mihaela M Martis, Ruonan Zhou, Grit Haseneyer, Thomas Schmutzer, Jan
Vrána, Marie Kubaláková, Susanne König, Karl G Kugler, Uwe Scholz, Bernd
Hackauf, et al. Reticulate evolution of the rye genome. <span
class="ptmri7t-x-x-120">The Plant Cell</span>,
25(10):3685–3698, 2013
<!--l. 361--><p class="noindent" >In this work <span
class="ptmri7t-x-x-120">Vmatch </span>was used to detect repetitive DNA content of chromosomal
survey sequences from the rye genome.
</p></li>
<li
class="enumerate" id="x1-11084x42">In the papers
<!--l. 366--><p class="noindent" ><a
id="XKOP:MAR:VHA:HRV:VRA:BAR:KOP:CAT:STO:NOV:2013"></a>D. Kopecký, M. Martis, J. Číhalíková, E. Hřibová, J. Vrána, J. Bartoš,
J. Kopecká, F. Cattonaro, Š. Stočes, Petr Novák, et al. Flow sorting and
sequencing meadow fescue chromosome 4f. <span
class="ptmri7t-x-x-120">Plant Physiology</span>, 163(3):1323–1337,
2013
</p><!--l. 368--><p class="noindent" ><a
id="XKOP:MAR:CHA:HRI:VRA:BAR:2013"></a>D. Kopecký, M. Martis, J. Číhalíková, E. Hřibová, J. Vrána, J. Bartoš, et al.