HEX (HLA Extractor) is an HLA typing tool for NGS data obtained with the newly developed technology. HEX was developed in the Laboratory of Comparative and Functional Genomics.
It depends on:
- Python 3 is the main programming language of the tool.
- bwa-mem is used for mapping the input sequences to the reference HLA sequences.
- Perl is used in processing the bwa-mem mapping results.
- ncbi-blast is used for aligning the resulting HLA candidates to the alleles.
HEX already comes with a preprocessed database of alleles and reference sequences. To start HEX, run:
$./hex_extract <path to a folder with raw .fastq files>
To build the database of reference sequences, HEX needs to download the IMGT HLA data and process it. To do so, run:
$./hex_build [-i <optional input file with URLs>] [-o <optional output folder>]
-i - a path to a file with links to files with HLA (default - hlalinks.txt).
-o - output folder, to which the database will be written.
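As an illustration of the two options above, here is a minimal sketch of how they could be parsed in Python. It is not the actual hex_build code: the function names `parse_build_args` and `read_links` are hypothetical, the links file is assumed to hold one URL per line, and the default output folder is not documented here, so it is left unset.

```python
import argparse
from typing import List, Optional


def parse_build_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse the -i and -o options described above (hypothetical helper)."""
    parser = argparse.ArgumentParser(prog="hex_build")
    # Default taken from the option description: hlalinks.txt
    parser.add_argument("-i", dest="links", default="hlalinks.txt",
                        help="file with links to the HLA files")
    # The default output folder is not stated, so None is an assumption.
    parser.add_argument("-o", dest="out_dir", default=None,
                        help="folder the database is written to")
    return parser.parse_args(argv)


def read_links(path: str) -> List[str]:
    """Read the links file, assuming one URL per line; blank lines are skipped."""
    with open(path) as handle:
        return [line.strip() for line in handle if line.strip()]
```

Calling `parse_build_args(["-o", "db"])` keeps the default links file and sets the output folder to `db`.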
For each input sample do:
- Map sequences using bwa-mem to the reference sequences.
- For each sequence fill the gaps with Xs and write to the file related to the specific HLA class.
Make consensus sequences from each HLA class-specific file:
- Choose the non-conservative position with the best coverage.
- For each of the two directions make the two best sequences using the Markov chain logic.
- Merge the corresponding consensus sequences by the start nucleotide.
- Align consensus sequences to the HLA database using BLAST.
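Two of the steps above, filling gaps with Xs and choosing the non-conservative position with the best coverage, can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the HEX implementation: reads are assumed to map without indels at a known 0-based start, "non-conservative" is taken to mean a column with more than one distinct base, and coverage is the number of non-X bases in a column. The names `pad_read` and `best_variable_position` are hypothetical.

```python
from collections import Counter
from typing import List, Optional


def pad_read(read: str, start: int, ref_len: int, gap: str = "X") -> str:
    """Place a read at its mapping start and fill uncovered positions with X."""
    if start < 0 or start + len(read) > ref_len:
        raise ValueError("read does not fit inside the reference")
    return gap * start + read + gap * (ref_len - start - len(read))


def best_variable_position(padded: List[str], gap: str = "X") -> Optional[int]:
    """Return the index of the variable column with the highest coverage."""
    best_pos, best_cov = None, 0
    for pos, column in enumerate(zip(*padded)):
        # Count only real bases; X marks positions the read does not cover.
        counts = Counter(base for base in column if base != gap)
        coverage = sum(counts.values())
        # Assumption: a column is non-conservative if it has >1 distinct base.
        if len(counts) > 1 and coverage > best_cov:
            best_pos, best_cov = pos, coverage
    return best_pos


# Toy reads as (sequence, 0-based start) pairs on a reference of length 8.
reads = [("ACGT", 0), ("CGTTA", 1), ("CATTA", 1), ("TTAG", 3)]
padded = [pad_read(seq, start, ref_len=8) for seq, start in reads]
position = best_variable_position(padded)
```

In this toy input only column 2 contains two different bases (G and A), so it is the position selected.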
The input NGS data is as follows:
???
???
???