diff --git a/docs/nlp.md b/docs/nlp.md
index f7063fac..ff7d4709 100644
--- a/docs/nlp.md
+++ b/docs/nlp.md
@@ -17,13 +17,14 @@
 and records those findings alongside the coded FHIR data.
 This way you can often surface symptoms that simply aren't recorded
 in the traditional FHIR data.

-## NLP Is Always Specific to a Study
+## NLP Is Always Specific to a Clinical Purpose

 The first thing to understand is that Cumulus ETL always
-runs NLP in the context of a specific study purpose.
+runs NLP in the context of a specific clinical purpose,
+which we'll call a "study."
 Each study's design will have its own needs and its own NLP strategy,
-so we support multiple approaches.
+so Cumulus ETL supports multiple approaches.

 {: .note }
 Example: The `covid_symptom` study uses cTAKES and a negation transformer working together to tag
@@ -31,8 +32,94 @@
 COVID-19 symptoms in clinical notes.
 But another study might use the Llama2 large language model,
 with a prompt like "Does this patient have a nosebleed?"

+### But With Reusable Code
+
+While the clinical "business logic" of how to drive NLP is inevitably study-specific,
+the code structure of Cumulus ETL is generic.
+That's what makes it easy to support multiple NLP strategies.
+
+We'll go into more depth about what an NLP task does under the covers [below](#technical-workflow).
+But for now, here's a basic outline of how Cumulus ETL runs an NLP study task:
+1. Prepare the clinical notes
+2. Hand those notes to a bit of study-specific Python code
+3. Record the structured results in an AWS Athena database
+
+Because Cumulus ETL has a growing internal library of NLP support code
+(things like automatically caching results, calling known interfaces like
+[Hugging Face's inference API](https://github.com/huggingface/text-generation-inference),
+or configuring cTAKES with a custom dictionary),
+the study-specific Python code can focus on the clinical logic.
+
+#### Example Code
+
+In pseudocode, here's what the Python code for a task that talks to an LLM like Llama2 might look like:
+
+```python
+def nosebleed_task():
+    """Asks the LLM about each clinical note, yielding one structured answer per note."""
+    prompt = "Does the patient in this clinical note have a nosebleed?"
+    for clinical_note in etl.read_notes():
+        yield etl.ask_llama2(prompt, clinical_note)
+```
+
+Those calls to `etl.*` are calls to the internal NLP support code that the task does not have to
+reinvent.
+
+And with that relatively low level of complexity (though finding the right prompt can be hard),
+you've got a study task that you can run over all your institution's clinical notes.
+
 ## Available NLP Strategies

+### Large Language Models (LLMs)
+
+Cumulus ETL makes it easy to pass clinical notes to an LLM,
+which can otherwise be difficult to set up yourself.
+
+Some LLMs, like Meta's [Llama2](https://ai.meta.com/llama/),
+are freely distributable and thus can be run locally.
+Others, like OpenAI's [ChatGPT](https://openai.com/chatgpt),
+are proprietary cloud services
+with which your institution may have a HIPAA Business Associate Agreement (BAA).
+
+Cumulus ETL can handle either type.
+
+#### Local LLMs
+
+With a local LLM,
+your notes never leave your network and the only cost is GPU time.
+
+That's great!
+But local LLMs can be complicated to set up.
+That's where Cumulus ETL can help, by shipping turnkey configurations for these LLMs.
+
+See full details [below](#docker-integration),
+but the basic idea is that Cumulus ETL will download the LLM for you,
+configure it for study needs, and launch it.
+We can also offer recommendations on what sort of hardware you'll need
+(for example, Llama2 works well with two NVIDIA A100 GPUs).
+
+{: .note }
+Only Llama2 is supported right now, because it's the current best-in-class local LLM.
+But Cumulus ETL uses the standard
+[Hugging Face inference interface](https://github.com/huggingface/text-generation-inference)
+as an abstraction layer, so integrating new local LLMs is a lightweight process.
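+
+To make that abstraction concrete, here's a rough sketch of what one question
+to a locally running inference server could look like.
+This is illustrative only, not Cumulus ETL's actual internal code,
+and the server address and generation parameters below are assumptions:
+
+```python
+import requests
+
+
+def ask_local_llm(prompt: str, clinical_note: str) -> str:
+    """Sends one prompt plus one clinical note to a local inference server."""
+    response = requests.post(
+        "http://localhost:8080/generate",  # assumption: where the local server listens
+        json={
+            "inputs": f"{prompt}\n\n{clinical_note}",
+            "parameters": {"max_new_tokens": 20},  # keep answers short (e.g. yes/no)
+        },
+        timeout=300,
+    )
+    response.raise_for_status()
+    return response.json()["generated_text"]
+```
+
+Study tasks wouldn't normally write this themselves;
+the internal support code makes the call (and caches the answers) for them.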
+
+#### Cloud LLMs
+
+Your institution may have a BAA to share protected health information (PHI) with a cloud LLM.
+
+Talking to a cloud LLM is very similar to talking to a local LLM.
+Instead of making an internal network call to a Docker container,
+Cumulus ETL makes an external network call to the cloud.
+
+The exact API is different, but the concept is the same.
+And importantly, 99% of the Cumulus ETL workflow is the same;
+only the actual call to the LLM would be swapped out.
+
+One additional challenge with cloud LLMs is reproducibility,
+since the vendor may update its model at any time.
+But recording metadata like the current time and vendor version in the database
+along with the results can at least help explain changes over time.
+
+{: .note }
+Cloud LLM support has not yet been prioritized, and none are currently supported.
+But if a new study did need to talk to a specific vendor, we know how we would integrate it.
+
 ### cTAKES

 [Apache cTAKES](https://ctakes.apache.org/) is a tried and true method of tagging symptoms in text.
@@ -42,23 +129,11 @@
 A Cumulus ETL study can pass clinical notes to cTAKES and augment its results by

 - a [cNLP transformer](https://github.com/Machine-Learning-for-Medical-Language/cnlp_transformers)
   to improve negation detection (e.g. "does not have a fever" vs "has a fever")

-### Llama2
-
-Meta's [Llama2](https://ai.meta.com/llama/) is a freely available large language model (LLM).
-
-A Cumulus ETL study can pass a carefully-crafted prompt and clinical notes to Llama2
-and record the answer.
-Since Llama2 is run locally, your notes never leave your network.
-
-The answer you receive might need additional post-processing, to get to a simple yes/no.
-Just depends on the study & the prompt.
-
 ### Others

-There are plans to add the ability to talk to cloud LLMs, like ChatGPT.
-
-Or any other new transformers or services could be integrated, as needed.
-If a new study required a new service, Cumulus ETL can add support for it.
+Any other new transformers or services can be integrated, as needed.
+If a new study requires a new service, Cumulus ETL can add support for it,
+and then _any_ study will be able to use it.

 ## Technical Workflow