Commit

feedback Andreas Maier
slobentanzer committed Feb 9, 2024
1 parent d7c7663 commit af454b5
Showing 3 changed files with 4 additions and 4 deletions.
2 changes: 1 addition & 1 deletion content/10.introduction.md
@@ -1,6 +1,6 @@
## Introduction

-Despite technological advances, major challenges remain to understand biological and biomedical systems [@gallagher-infinite;@dl-bioscience].
+Despite technological advances, understanding biological and biomedical systems still poses major challenges [@gallagher-infinite;@dl-bioscience].
We measure more and more data points with ever-increasing resolution to such a degree that their analysis and interpretation have become the bottleneck for their exploitation [@dl-bioscience].
One reason for this challenge may be the inherent limitation of human knowledge [@doi:10.1016/j.tics.2005.04.010]: Even seasoned domain experts cannot know the implications of every gene, molecule, symptom, or biomarker.
In addition, biological events are context-dependent, for instance with respect to a cell type or specific disease.
2 changes: 1 addition & 1 deletion content/20.results.md
@@ -15,7 +15,7 @@ Increasingly secure solutions require more effort to set up and maintain, but al
Fully local solutions are available given sufficient hardware (starting with contemporary laptops), but are not highly scalable.
](images/biochatter_architecture.png "Architecture"){#fig:architecture}

-The framework is designed to be composable, meaning that any of its components can be exchanged with other implementations (Figure @fig:overview).
+The framework is designed to be modular, meaning that any of its components can be exchanged with other implementations (Figure @fig:overview).
Functionalities include:

- **basic question answering** with LLMs hosted by providers (such as OpenAI) as well as locally deployed open-source models
4 changes: 2 additions & 2 deletions content/40.methods.md
@@ -44,7 +44,7 @@ The individual dimensions of the matrix are:
- **data processing**: Some data processing steps can have great impact on the downstream performance of LLMs.
For instance, we test the conversion of numbers (which LLMs are notoriously bad at handling) to categorical text (e.g., low, medium, high).

-- **model quantisations**: We test a set of quantisations for each model (where available) to account for the trade-off between model size and performance.
+- **model quantisations**: We test a set of quantisations for each model (where available) to account for the trade-off between model size, inference speed, and performance.

- **model parameters**: Where suitable, we test a set of parameters for each model, such as "temperature," which determines the reproducibility of model responses.
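
The **data processing** dimension above mentions converting numbers to categorical text. As a minimal sketch of what such a conversion could look like (the function names, thresholds, and labels here are hypothetical and not taken from BioChatter):

```python
from bisect import bisect_right


def to_category(value, thresholds=(0.33, 0.66), labels=("low", "medium", "high")):
    """Map a number to a text label using ascending cut-off values.

    The thresholds and labels are illustrative assumptions; a real
    benchmark would choose them per data type.
    """
    if len(labels) != len(thresholds) + 1:
        raise ValueError("need exactly one more label than thresholds")
    # bisect_right counts how many thresholds are <= value,
    # which is the index of the matching label.
    return labels[bisect_right(thresholds, value)]


def describe(name, value):
    """Render a numeric measurement as a short categorical phrase for a prompt."""
    return f"{name}: {to_category(value)}"


if __name__ == "__main__":
    for gene, score in [("GENE_A", 0.12), ("GENE_B", 0.50), ("GENE_C", 0.91)]:
        print(describe(gene, score))
```

A value equal to a threshold falls into the higher category in this sketch; that boundary rule is a design choice, not something the manuscript specifies.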

@@ -96,7 +96,7 @@ This method is sometimes described as in-context learning [@doi:10.48550/arxiv.2
To provide access to this functionality in BioChatter, we implement classes for the connection to, and management of, vector database systems (in the vectorstore_host.py module), and for performing semantic search on the vector database and injecting the results into the prompt (in the vectorstore.py module).
To demonstrate the use of the API, we add a “Retrieval-Augmented Generation” tab to the preview apps that allows the upload of text documents to be added to a vector database, which then can be queried to add contextual information to the prompt sent to the primary model.
This contextual information is transparently displayed.
-Since this functionality requires a connection to a vector database system, we provide connectivity to a Milvus server, including a way to start the server in conjunction with a BioCypher knowledge graph and the BioChatter Light app in one Docker Compose workflow.
+Since this functionality requires a connection to a vector database system, we provide connectivity to a Milvus service, including a way to start the service in conjunction with a BioCypher knowledge graph and the BioChatter Light app in one Docker Compose workflow.

An example use case of this functionality is available in [Supplementary Note 2: Retrieval-Augmented Generation] and on our website ([https://biochatter.org/vignette-rag/](https://biochatter.org/vignette-rag/)).
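
The retrieval-and-injection step described in this hunk can be illustrated with a toy, self-contained sketch. It does not use the BioChatter or Milvus APIs: the bag-of-words "embedding", the in-memory document list, and all function names are stand-in assumptions for a real embedding model and vector database.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: a bag-of-words term count (stand-in for a real model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, documents, k=1):
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(question, documents, k=1):
    """Inject the retrieved passages into the prompt ahead of the question."""
    context = "\n".join(retrieve(question, documents, k))
    return f"Context:\n{context}\n\nQuestion: {question}"


if __name__ == "__main__":
    docs = [
        "TP53 is a tumour suppressor gene",
        "Milvus is a vector database",
    ]
    print(build_prompt("which gene is a tumour suppressor", docs))
```

In a deployment like the one described, the document list would be replaced by a vector database query and `embed` by a learned embedding model; the prompt assembly step stays conceptually the same.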
