Running Marimo in Docker Image
GullyBurns committed Jul 2, 2024
1 parent 2a0b23c commit 9af327d
Showing 5 changed files with 26 additions and 13 deletions.
19 changes: 11 additions & 8 deletions README.md
@@ -51,14 +51,17 @@ The preferred method to run Alhazen is through
Note, for correct functionality, set the following environment variables
for the shell from which you are calling Docker:

-**MANDATORY** \* LOCAL_FILE_PATH - the directory where the system will
-store full-text files.
-
-**OPTIONAL** \* OPENAI_API_KEY - if you are using OpenAI large language
-models.
-\* DATABRICKS_API_KEY - if you are using the Databricks AI Playground
-endpoint as an LLM server. \* GROQ_API_KEY - if you are calling LLMs on
-groq.com
+**MANDATORY**
+
+- LOCAL_FILE_PATH - the directory where the system will store full-text
+files.
+
+**OPTIONAL**
+
+- OPENAI_API_KEY - if you are using OpenAI large language models.
+- DATABRICKS_API_KEY - if you are using the Databricks AI Playground
+endpoint as an LLM server.
+- GROQ_API_KEY - if you are calling LLMs on groq.com

#### Quickstart

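The diff above reorganizes the environment-variable list into proper Markdown bullets. For concreteness, here is a minimal sketch of how those variables might be set in the calling shell before the quickstart commands; the path and key values are hypothetical placeholders, not values from the repository:

```bash
# Placeholder values - substitute your own. Only LOCAL_FILE_PATH is mandatory.
export LOCAL_FILE_PATH="$HOME/alhazen_files"   # where full-text files will be stored
export OPENAI_API_KEY="sk-..."                 # optional: OpenAI models
export DATABRICKS_API_KEY="dapi-..."           # optional: Databricks AI Playground endpoint
export GROQ_API_KEY="gsk-..."                  # optional: LLMs on groq.com

docker compose build && docker compose up      # quickstart, per the README
```

Variables exported this way are visible to `docker compose`, which can substitute them in the compose file and pass them into the container, assuming the compose file references them.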
14 changes: 11 additions & 3 deletions docs/index.html
@@ -563,9 +563,17 @@ <h2 class="anchored" data-anchor-id="installation">Installation</h2>
<h3 class="anchored" data-anchor-id="docker">Docker</h3>
<p>The preferred method to run Alhazen is through <a href="https://www.docker.com/">Docker</a>.</p>
<p>Note, for correct functionality, set the following environment variables for the shell from which you are calling Docker:</p>
-<p><strong>MANDATORY</strong> * LOCAL_FILE_PATH - the directory where the system will store full-text files.</p>
-<p><strong>OPTIONAL</strong> * OPENAI_API_KEY - if you are using OpenAI large language models.<br>
-* DATABRICKS_API_KEY - if you are using the Databricks AI Playground endpoint as an LLM server. * GROQ_API_KEY - if you are calling LLMs on groq.com</p>
+<p><strong>MANDATORY</strong></p>
+<ul>
+<li>LOCAL_FILE_PATH - the directory where the system will store full-text files.</li>
+</ul>
+<p><strong>OPTIONAL</strong></p>
+<ul>
+<li>OPENAI_API_KEY - if you are using OpenAI large language models.<br>
+</li>
+<li>DATABRICKS_API_KEY - if you are using the Databricks AI Playground endpoint as an LLM server.</li>
+<li>GROQ_API_KEY - if you are calling LLMs on groq.com</li>
+</ul>
<section id="quickstart" class="level4">
<h4 class="anchored" data-anchor-id="quickstart">Quickstart</h4>
<p>To run the system out of the box, run these commands:</p>
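The quickstart commands themselves are cut off by the collapsed diff here; the same sequence appears in full in the docs/search.json text below:

```bash
$ git clone https://github.com/chanzuckerberg/alhazen
$ cd alhazen
$ docker compose build
$ docker compose up
```

This should print a link of the form http://127.0.0.1:8888/lab?token=LONG-ALPHANUMERIC-STRING, which opens a Jupyter Lab session with access to the repository's notebooks.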
2 changes: 1 addition & 1 deletion docs/search.json
@@ -26,7 +26,7 @@
"href": "index.html#installation",
"title": "Home - Alhazen",
"section": "Installation",
"text": "Installation\n\nDocker\nThe preferred method to run Alhazen is through Docker.\nNote, for correct functionality, set the following environment variables for the shell from which you are calling Docker:\nMANDATORY * LOCAL_FILE_PATH - the directory where the system will store full-text files.\nOPTIONAL * OPENAI_API_KEY - if you are using OpenAI large language models.\n* DATABRICKS_API_KEY - if you are using the Databricks AI Playground endpoint as an LLM server. * GROQ_API_KEY - if you are calling LLMs on groq.com\n\nQuickstart\nTo run the system out of the box, run these commands:\n$ git clone https://github.com/chanzuckerberg/alhazen\n$ cd alhazen\n$ docker compose build\n$ docker compose up\nThis should generate the output that includes a link formatted like this one: http://127.0.0.1:8888/lab?token=LONG-ALPHANUMERIC-STRING.\nOpen a browser to that location and you should get access to a juypter lab notebook that provides access to all notebooks in the repo.\nBrowse to nbs/tutorials/CryoET_Tutorial.ipynb to access a walkthrough of an analysis over papers involving CryoET as a demonstration.\n\n\nRun Huridocs as PDF extraction\nTo run the system with support from the Huridocs PDF extraction system (needed for processing full text articles), you must first run the docker container for that system:\n$ git clone https://github.com/huridocs/pdf_paragraphs_extraction\n$ cd pdf_paragraphs_extraction\n$ docker compose build\n$ docker compose up\nThen repeat as before, but with the huridocs alhazen image\n$ cd ..\n$ git clone https://github.com/chanzuckerberg/alhazen\n$ cd alhazen\n$ docker compose build\n$ docker compose -f docker-compose-huridocs.yml up\n\n\n\nInstall dependencies\n\nPostgresql\nAlhazen requires postgresql@14 to run. Homebrew provides an installer:\n$ brew install postgresql@14\nwhich can be run as a service:\n$ brew services start postgresql@14\n$ brew services list\nIf you install Postgresql via homebrew, you will need to create a postgres superuser to run the psql command.\n$ createuser -s postgres\nNote that the Postgres.app system also provides a nice GUI interface for Postgres but installing the pgvector package is a little more involved.\n\n\nOllama\nThe tool uses the Ollama library to execute large language models locally on your machine. Note that to able to run the best performing models on a Apple Mac M1 or M2 machine, you will need at least 48GB of memory.\n\n\nHuridocs\nWe use a PDF document text extraction and classification system called Huridocs. In particular, our PDF processing requires a docker image of their PDF Paragraphs Extraction system. To run this, perform the following steps:\n1. git clone https://github.com/huridocs/pdf_paragraphs_extraction\n2. cd pdf_paragraphs_extraction\n3. docker-compose up\n\n\n\nInstall Alhazen source code\ngit clone https://github.com/chanzuckerberg/alzhazen\nconda create -n alhazen python=3.11\nconda activate alhazen\ncd alhazen\npip install -e .",
"text": "Installation\n\nDocker\nThe preferred method to run Alhazen is through Docker.\nNote, for correct functionality, set the following environment variables for the shell from which you are calling Docker:\nMANDATORY\n\nLOCAL_FILE_PATH - the directory where the system will store full-text files.\n\nOPTIONAL\n\nOPENAI_API_KEY - if you are using OpenAI large language models.\n\nDATABRICKS_API_KEY - if you are using the Databricks AI Playground endpoint as an LLM server.\nGROQ_API_KEY - if you are calling LLMs on groq.com\n\n\nQuickstart\nTo run the system out of the box, run these commands:\n$ git clone https://github.com/chanzuckerberg/alhazen\n$ cd alhazen\n$ docker compose build\n$ docker compose up\nThis should generate the output that includes a link formatted like this one: http://127.0.0.1:8888/lab?token=LONG-ALPHANUMERIC-STRING.\nOpen a browser to that location and you should get access to a juypter lab notebook that provides access to all notebooks in the repo.\nBrowse to nbs/tutorials/CryoET_Tutorial.ipynb to access a walkthrough of an analysis over papers involving CryoET as a demonstration.\n\n\nRun Huridocs as PDF extraction\nTo run the system with support from the Huridocs PDF extraction system (needed for processing full text articles), you must first run the docker container for that system:\n$ git clone https://github.com/huridocs/pdf_paragraphs_extraction\n$ cd pdf_paragraphs_extraction\n$ docker compose build\n$ docker compose up\nThen repeat as before, but with the huridocs alhazen image\n$ cd ..\n$ git clone https://github.com/chanzuckerberg/alhazen\n$ cd alhazen\n$ docker compose build\n$ docker compose -f docker-compose-huridocs.yml up\n\n\n\nInstall dependencies\n\nPostgresql\nAlhazen requires postgresql@14 to run. Homebrew provides an installer:\n$ brew install postgresql@14\nwhich can be run as a service:\n$ brew services start postgresql@14\n$ brew services list\nIf you install Postgresql via homebrew, you will need to create a postgres superuser to run the psql command.\n$ createuser -s postgres\nNote that the Postgres.app system also provides a nice GUI interface for Postgres but installing the pgvector package is a little more involved.\n\n\nOllama\nThe tool uses the Ollama library to execute large language models locally on your machine. Note that to able to run the best performing models on a Apple Mac M1 or M2 machine, you will need at least 48GB of memory.\n\n\nHuridocs\nWe use a PDF document text extraction and classification system called Huridocs. In particular, our PDF processing requires a docker image of their PDF Paragraphs Extraction system. To run this, perform the following steps:\n1. git clone https://github.com/huridocs/pdf_paragraphs_extraction\n2. cd pdf_paragraphs_extraction\n3. docker-compose up\n\n\n\nInstall Alhazen source code\ngit clone https://github.com/chanzuckerberg/alzhazen\nconda create -n alhazen python=3.11\nconda activate alhazen\ncd alhazen\npip install -e .",
"crumbs": [
"Get Started",
"Home - Alhazen"
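The installation notes embedded above say that Alhazen requires postgresql@14 and that the pgvector package is involved in the Postgres setup. A minimal sketch of a Homebrew-based setup might look like the following; the database name `alhazen` is a hypothetical placeholder, and whether the Homebrew pgvector formula links against postgresql@14 on your system is an assumption worth verifying:

```bash
brew install postgresql@14 pgvector    # assumes the pgvector formula matches this Postgres version
brew services start postgresql@14
createuser -s postgres                 # superuser needed to run psql, per the notes
createdb -U postgres alhazen           # hypothetical database name
psql -U postgres -d alhazen -c 'CREATE EXTENSION IF NOT EXISTS vector;'
```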
2 changes: 1 addition & 1 deletion docs/sitemap.xml
@@ -2,7 +2,7 @@
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>https://chanzuckerberg.github.io/alhazen/index.html</loc>
-<lastmod>2024-07-02T20:30:04.606Z</lastmod>
+<lastmod>2024-07-02T20:31:46.092Z</lastmod>
</url>
<url>
<loc>https://chanzuckerberg.github.io/alhazen/tutorials/index.html</loc>
2 changes: 2 additions & 0 deletions nbs/index.ipynb
@@ -59,9 +59,11 @@
"Note, for correct functionality, set the following environment variables for the shell from which you are calling Docker:\n",
"\n",
"**MANDATORY**\n",
"\n",
"* LOCAL_FILE_PATH - the directory where the system will store full-text files.\n",
"\n",
"**OPTIONAL**\n",
"\n",
"* OPENAI_API_KEY - if you are using OpenAI large language models. \n",
"* DATABRICKS_API_KEY - if you are using the Databricks AI Playground endpoint as an LLM server. \n",
"* GROQ_API_KEY - if you are calling LLMs on groq.com\n",
