Restore website
robjgiff committed May 29, 2024
1 parent 1524f35 commit 956dc53
Showing 9 changed files with 1,303 additions and 1 deletion.
2 changes: 2 additions & 0 deletions index.html
@@ -18,6 +18,8 @@ <h1 class="project-name">The DIGS Tool</h1>
<h2 class="project-tagline">A Software Framework for Database-Integrated Genome Screening (DIGS)</h2>

<a href="./website/user-guide/explore.html" class="btn">Background</a>
<a href="./website/user-guide/overview.html" class="btn">Overview</a>
<a href="./website/user-guide/user-guide.html" class="btn">Manual</a>
<a href="https://github.com/giffordlabcvr/DIGS-tool/zipball/master" class="btn">Download</a>
<a target="_blank" href="https://github.com/giffordlabcvr/DIGS-tool" class="btn">GitHub</a>
<a target="_blank" href="https://twitter.com/DigsTool" class="btn">Twitter</a>
54 changes: 54 additions & 0 deletions md/database-schema.md
@@ -0,0 +1,54 @@
# DIGS screening database field definitions

When a screen is initiated by the DIGS tool, a project-specific, relational database is created. This 'screening database' captures data generated by DIGS.

### `searches_performed` table

Records the details of BLAST searches that have been performed in this DIGS project.

| **Field** | **Type** | **Description** |
|---|---|---|
| record_ID | INT | Automatically incremented primary key |
| probe_ID | VARCHAR | Unique ID of probe sequence |
| probe_name | VARCHAR | Name of probe sequence |
| probe_gene | VARCHAR | Name of probe sequence gene |
| target_id | VARCHAR | Unique identifier of the target database (TDb) file |
| organism | VARCHAR | Name of the organism (Latin binomial) from which TDb was generated |
| target_datatype | VARCHAR | Data type of the TDb file |
| target_version | VARCHAR | Version details of the TDb file |
| target_name | VARCHAR | Name of TDb file |
| Timestamp | TIMESTAMP | Timestamp of the table entry |
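
As a rough illustration of how this table might be used, the sketch below logs a search and then checks whether a given probe/target pair has already been screened. It uses Python's built-in `sqlite3` module with an in-memory database purely as a stand-in for the project database. The abbreviated column set, the helper names `record_search` and `already_searched`, and the sample values are all assumptions made for illustration; they are not part of the DIGS tool.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the screening database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE searches_performed (
        record_ID   INTEGER PRIMARY KEY AUTOINCREMENT,
        probe_ID    TEXT NOT NULL,
        probe_name  TEXT,
        probe_gene  TEXT,
        target_id   TEXT NOT NULL,
        organism    TEXT,
        Timestamp   TIMESTAMP DEFAULT CURRENT_TIMESTAMP
    )
    """
)


def record_search(conn, probe_id, probe_name, probe_gene, target_id, organism):
    """Hypothetical helper: log one completed BLAST search."""
    conn.execute(
        "INSERT INTO searches_performed "
        "(probe_ID, probe_name, probe_gene, target_id, organism) "
        "VALUES (?, ?, ?, ?, ?)",
        (probe_id, probe_name, probe_gene, target_id, organism),
    )


def already_searched(conn, probe_id, target_id):
    """Hypothetical helper: True if this probe/target pair is already logged."""
    row = conn.execute(
        "SELECT 1 FROM searches_performed "
        "WHERE probe_ID = ? AND target_id = ? LIMIT 1",
        (probe_id, target_id),
    ).fetchone()
    return row is not None


# Made-up sample values, for illustration only.
record_search(conn, "probeA_gag", "probeA", "gag",
              "Homo_sapiens|wgs|build1|chr1.fa", "Homo_sapiens")
```

A screening loop could call `already_searched` before each BLAST run and skip pairs that are already logged, which is one simple way to keep a screen non-redundant.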


### `digs_results` table

Contains the extracted sequences of the loci specified in the BLAST_results table, together with the results of the second round of paired BLAST, in which extracted sequences are 'genotyped' by BLAST comparison to the reference sequence library (RSL).

| **Field** | **Type** | **Description** |
|---|---|---|
| record_ID | INT | Automatically incremented primary key |
| organism | VARCHAR | Organism name (Latin binomial) |
| target_datatype | VARCHAR | Genome data type |
| target_version | VARCHAR | Genome build version details |
| target_name | VARCHAR | Name of genome data file containing the BLAST hit |
| probe_type | VARCHAR | Type of probe sequence (amino acid or nucleotide) |
| extract_start | INT | 5’ (start) position of the extracted sequence in the target scaffold |
| extract_end | INT | 3’ (end) position of the extracted sequence in the target scaffold |
| scaffold | VARCHAR | Name of scaffold/contig/chromosome containing the BLAST hit |
| orientation | ENUM | Orientation of the BLAST hit relative to the probe |
| assigned_name | VARCHAR | Name of closest matching sequence in RSL |
| assigned_gene | VARCHAR | Name of gene of closest matching sequence in RSL |
| bit_score | FLOAT | Bit score of the best match from reverse BLAST |
| identity | FLOAT | Percentage identity of the best match from reverse BLAST |
| e_value_num | FLOAT | Coefficient of the expect (e) value for the best match from reverse BLAST |
| e_value_exp | INT | Exponent (base 10) of the expect (e) value for the best match from reverse BLAST |
| subject_start | INT | 5’ (start) position of reverse BLAST alignment in the RSL sequence |
| subject_end | INT | 3’ (end) position of reverse BLAST alignment in the RSL sequence |
| query_start | INT | 5’ (start) position of reverse BLAST alignment in the probe sequence |
| query_end | INT | 3’ (end) position of reverse BLAST alignment in the probe sequence |
| mismatches | INT | Number of mismatches in alignment from reverse BLAST |
| gap_openings | INT | Number of gap openings in alignment from reverse BLAST |
| sequence_length | INT | Length of the extracted sequence |
| sequence | TEXT | Text string of the extracted sequence |
| timestamp | TIMESTAMP | Timestamp of the table entry |
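
The sketch below shows one way a table like this could be queried, again using Python's `sqlite3` module with an in-memory database as a stand-in for the project database. It counts hits per organism and assigned gene above a bit-score threshold, and recombines the split e-value fields. Several things here are assumptions rather than documented behaviour: the reduced column set, the helper names `count_hits` and `e_value`, the sample rows, and in particular the convention that the e-value equals `e_value_num * 10**e_value_exp` with a signed exponent.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the screening database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE digs_results (
        record_ID     INTEGER PRIMARY KEY AUTOINCREMENT,
        organism      TEXT,
        scaffold      TEXT,
        assigned_name TEXT,
        assigned_gene TEXT,
        bit_score     REAL,
        e_value_num   REAL,
        e_value_exp   INTEGER
    )
    """
)

# Made-up rows, for illustration only.
rows = [
    ("Homo_sapiens", "chr1", "refA", "gag", 250.0, 3.0, -60),
    ("Homo_sapiens", "chr2", "refA", "pol", 410.5, 1.2, -110),
    ("Mus_musculus", "chr5", "refB", "pol", 95.1, 4.0, -20),
]
conn.executemany(
    "INSERT INTO digs_results "
    "(organism, scaffold, assigned_name, assigned_gene, "
    "bit_score, e_value_num, e_value_exp) "
    "VALUES (?, ?, ?, ?, ?, ?, ?)",
    rows,
)


def count_hits(conn, min_bit_score=0.0):
    """Count hits per organism and assigned gene at or above a bit-score threshold."""
    query = """
        SELECT organism, assigned_gene, COUNT(*) AS n_hits
        FROM digs_results
        WHERE bit_score >= ?
        GROUP BY organism, assigned_gene
        ORDER BY organism, assigned_gene
    """
    return conn.execute(query, (min_bit_score,)).fetchall()


def e_value(num, exp):
    """Recombine coefficient and exponent, assuming e-value = num * 10**exp."""
    return num * 10.0 ** exp


summary = count_hits(conn, min_bit_score=100.0)
for organism, gene, n_hits in summary:
    print(organism, gene, n_hits)
```

Storing the coefficient and exponent separately keeps very small e-values representable in ordinary numeric columns; if the real database stores the exponent unsigned, the sign in `e_value` would need to be flipped.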

Binary file added website/assets/images/analysis-phase.jpg
Binary file added website/assets/images/analysis-phase.png
Binary file added website/assets/images/exploration-phase.jpg
Binary file added website/assets/images/overview-phases.jpg
5 changes: 4 additions & 1 deletion website/user-guide/explore.html
@@ -25,7 +25,10 @@ <h2 class="project-tagline">Database-Integrated Genome Screening</h2>


<a href="../../index.html" class="btn">Home</a>
<a href="https://github.com/giffordlabcvr/DIGS-tool/zipball/master" class="btn">Download</a>
<a href="./overview.html" class="btn">Overview</a>
<a href="./user-guide.html" class="btn">Manual</a>

<a href="https://github.com/giffordlabcvr/DIGS-tool/zipball/master" class="btn">Download</a>
<a target="_blank" href="https://github.com/giffordlabcvr/DIGS-tool" class="btn">GitHub</a>
<a target="_blank" href="https://twitter.com/DigsTool" class="btn">Twitter</a>

217 changes: 217 additions & 0 deletions website/user-guide/overview.html
@@ -0,0 +1,217 @@
<!DOCTYPE html>
<html lang="en-us">

<head>


<meta charset="UTF-8">
<title>DIGS: Overview</title>
<meta name="viewport" content="width=device-width, initial-scale=1">
<link rel="stylesheet" type="text/css" href="../assets/stylesheets/normalize.css" media="screen">
<link href='https://fonts.googleapis.com/css?family=Open+Sans:400,700' rel='stylesheet' type='text/css'>
<link rel="stylesheet" type="text/css" href="../assets/stylesheets/stylesheet.css" media="screen">
<link rel="stylesheet" type="text/css" href="../assets/stylesheets/github-light.css" media="screen">

</head>

<body>

<section class="page-header">



<h1 class="project-name">DIGS</h1>
<h2 class="project-tagline">Database-Integrated Genome Screening</h2>


<a href="../../index.html" class="btn">Home</a>
<a href="./explore.html" class="btn">Background</a>
<a href="./user-guide.html" class="btn">Manual</a>
<a href="https://github.com/giffordlabcvr/DIGS-tool/zipball/master" class="btn">Download</a>
<a target="_blank" href="https://github.com/giffordlabcvr/DIGS-tool" class="btn">GitHub</a>
<a target="_blank" href="https://twitter.com/DigsTool" class="btn">Twitter</a>

</section>

<section class="main-content">

<h3>
<a id="SearchStructure" class="anchor" href="#SearchStructure" aria-hidden="true"><span class="octicon octicon-link"></span></a><strong>Structure of a DIGS-based investigation</strong>
</h3>
<hr>


<p>
Comparative studies using database-integrated genome screening (DIGS) entail
separate 'exploration' and 'analysis' phases,
each of which is split into two component parts:


<ul>

<li> <b>Exploration</b>: (1) Setting up and (2) running a similarity search-based screen.</li>
<li> <b>Analysis</b>: (3) Inspecting screening output via a relational database, and (4)
performing comparative analysis of exported sequence data.</li>
</ul>



</p>


<p><img src="../assets/images/overview-phases.jpg" alt="Overview - phases" /></p>

<p>

As shown above, this process is usually iterative - at least to some degree - since
analysis of screening results often reveals new information that can be used to
design more informative or comprehensive screens.

</p>


<br>



<h3>
<a id="setUpAndRunScreen" class="anchor" href="#setUpAndRunScreen" aria-hidden="true"><span class="octicon octicon-link"></span></a><strong>Exploration phase: Setting up and running an <i>in silico</i> screen</strong>
</h3>
<hr>

<p>

DIGS is a <b>project-based framework</b> in which investigations are centred around
a <b>genome feature</b> of interest. Any genome feature can be investigated in principle,
so long as it contains sufficient sequence conservation to be reliably detected in a similarity
search.

</p>


<p>
The '<b>reference sequence library</b>'
is a curated set of sequences relevant to the genome feature under investigation.
Usually this will consist of:


<ul>
<li> A set of conserved DNA or polypeptide sequences derived from the genome feature of interest.</li>
</ul>

</p>


<p>
However, depending on the kind of investigation being performed, it may also contain:


<ul>
<li>Sequences that do not derive from the genome feature under investigation,
but can provide useful information about the locus in which it occurs.</li>
<li>Sequences representing genome features that are not relevant to the investigation,
but are sufficiently similar to the feature of interest to generate 'false positive' matches.</li>
</ul>


Screening entails selecting particular sequences from the reference library for use as
'<b>probes</b>' in a BLAST search of a specific '<b>target database</b>'.
</p>


<p>
Sequences that match the probe ('<b>hits</b>') can then be extracted and <b>classified</b>.
A convenient way of rapidly classifying or 'genotyping' hits is via BLAST-based comparison to the
reference library, as indicated in the illustration below.
</p>




<p><img src="../assets/images/exploration-phase.jpg" alt="Exploration phase" /></p>



<blockquote>

<b>Schematic representation of the <u>exploration</u> phase of a DIGS-based investigation</b>.
<br>
Here, the genome features being investigated are a set of related genes.
In <b>step (1)</b>, a sequence from the reference library is selected and used
as a 'probe' or 'query' in a BLAST-based search of a chosen target database.
In <b>step (2)</b>, sequences identified in this search are extracted
and classified via BLAST-based comparison to the reference library.
These searches provide a way to effectively 'delve into'
genomic databanks and recover related sequences and, as such, a means
to survey unmapped regions of the genomic 'landscape'.

</blockquote>


<br>
<h3>
<a id="analyseScreenOutput" class="anchor" href="#analyseScreenOutput" aria-hidden="true"><span class="octicon octicon-link"></span></a><strong>Analysis phase: Inspecting results and exporting data for comparative analysis</strong>
</h3>
<hr>


<p>
In DIGS, a similarity search-based screening pipeline is linked to a <b>relational
database management system (RDBMS)</b>, and the outputs of screening are captured
in a <b>project-specific relational database</b>.
</p>

<p>
This approach not only provides a convenient and robust
basis for implementing systematic, automated screens that proceed in an efficient,
non-redundant way, it also allows screening data to be interrogated using
<b>structured query language (SQL)</b> - a well-established, powerful approach for
querying relational databases.
</p>

<p>
The analysis phase has two component parts:
</p>

<ol>
<li><b>Investigation of output via the relational database.</b></li>
<li><b>Comparative genomic analysis of exported sequence data.</b></li>
</ol>


<br>



<p><img src="../assets/images/analysis-phase.png" alt="Analysis phase" /></p>



<blockquote>
<b>Analysing screening output</b>: A schematic representation of the two component
parts of the 'analysis' phase of a DIGS-based screen
(some comparative analyses do not require an alignment, but most do).
</blockquote>




<br>


<br>



<footer class="site-footer">
<span class="site-footer-owner"><a href="https://github.com/giffordlabcvr/DIGS-tool">DIGS</a> is maintained by <a href="https://github.com/giffordlabcvr">giffordlabcvr</a>.</span>

<span class="site-footer-credits">This page was generated by <a href="https://pages.github.com">GitHub Pages</a> using the <a href="https://github.com/jasonlong/cayman-theme">Cayman theme</a> by <a href="https://twitter.com/jasonlong">Jason Long</a>.</span>
</footer>

</section>


</body>

</html>