Merge pull request #3 from permitio/cleanup-tools-integration

Cleanup tools integration
permitio · Feb 17, 2025 · c0fe498
2 parents 8c60477 + 60e2756
Showing 9 changed files with 598 additions and 1,170 deletions.
diff --git a/docs/retrievers.ipynb b/docs/retrievers.ipynb
@@ -1,225 +1,136 @@
 {
  "cells": [
-  {
-   "cell_type": "raw",
-   "id": "afaf8039",
-   "metadata": {},
-   "source": [
-    "---\n",
-    "sidebar_label: LangchainPermit\n",
-    "---"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "e49f1e0d",
-   "metadata": {},
-   "source": [
-    "# LangchainPermitRetriever\n",
-    "\n",
-    "- TODO: Make sure API reference link is correct.\n",
-    "\n",
-    "This will help you getting started with the LangchainPermit [retriever](/docs/concepts/#retrievers). For detailed documentation of all LangchainPermitRetriever features and configurations head to the [API reference](https://api.python.langchain.com/en/latest/retrievers/langchain_permit.retrievers.LangchainPermit.LangchainPermitRetriever.html).\n",
-    "\n",
-    "### Integration details\n",
-    "\n",
-    "TODO: Select one of the tables below, as appropriate.\n",
-    "\n",
-    "1: Bring-your-own data (i.e., index and search a custom corpus of documents):\n",
-    "\n",
-    "| Retriever | Self-host | Cloud offering | Package |\n",
-    "| :--- | :--- | :---: | :---: |\n",
-    "[LangchainPermitRetriever](https://api.python.langchain.com/en/latest/retrievers/langchain-permit.retrievers.langchain_permit.LangchainPermitRetriever.html) | ❌ | ❌ | langchain-permit |\n",
-    "\n",
-    "2: External index (e.g., constructed from Internet data or similar)):\n",
-    "\n",
-    "| Retriever | Source | Package |\n",
-    "| :--- | :--- | :---: |\n",
-    "[LangchainPermitRetriever](https://api.python.langchain.com/en/latest/retrievers/langchain-permit.retrievers.langchain_permit.LangchainPermitRetriever.html) | Source description | langchain-permit |\n",
-    "\n",
-    "## Setup\n",
-    "\n",
-    "- TODO: Update with relevant info."
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "72ee0c4b-9764-423a-9dbf-95129e185210",
-   "metadata": {},
-   "source": [
-    "If you want to get automated tracing from individual queries, you can also set your [LangSmith](https://docs.smith.langchain.com/) API key by uncommenting below:"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "a15d341e-3e26-4ca3-830b-5aab30ed66de",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "# os.environ[\"LANGSMITH_API_KEY\"] = getpass.getpass(\"Enter your LangSmith API key: \")\n",
-    "# os.environ[\"LANGSMITH_TRACING\"] = \"true\""
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "0730d6a1-c893-4840-9817-5e5251676d5d",
-   "metadata": {},
-   "source": [
-    "### Installation\n",
-    "\n",
-    "This retriever lives in the `langchain-permit` package:"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "652d6238-1f87-422a-b135-f5abbb8652fc",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "%pip install -qU langchain-permit"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "a38cde65-254d-4219-a441-068766c0d4b5",
-   "metadata": {},
-   "source": [
-    "## Instantiation\n",
-    "\n",
-    "Now we can instantiate our retriever:\n",
-    "\n",
-    "- TODO: Update model instantiation with relevant params."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "70cc8e65-2a02-408a-bbc6-8ef649057d82",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "from langchain_permit import LangchainPermitRetriever\n",
-    "\n",
-    "retriever = LangchainPermitRetriever(\n",
-    "    # ...\n",
-    ")"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "5c5f2839-4020-424e-9fc9-07777eede442",
-   "metadata": {},
-   "source": [
-    "## Usage"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "51a60dbe-9f2e-4e04-bb62-23968f17164a",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "query = \"...\"\n",
-    "\n",
-    "retriever.invoke(query)"
-   ]
-  },
   {
    "cell_type": "markdown",
-   "id": "dfe8aad4-8626-4330-98a9-7ea1ca5d2e0e",
+   "id": "e1455a0c",
    "metadata": {},
    "source": [
-    "## Use within a chain\n",
+    "# LangChain Permit Retrievers\n",
+    "This module provides two specialized retrievers that integrate Permit.io's authorization capabilities with LangChain's retrieval systems:\n",
     "\n",
-    "Like other retrievers, LangchainPermitRetriever can be incorporated into LLM applications via [chains](/docs/how_to/sequence/).\n",
+    "* ReBACSelfQueryRetriever: Implements Relationship-Based Access Control (ReBAC) using natural language queries\n",
     "\n",
-    "We will need a LLM or chat model:\n",
-    "\n",
-    "```{=mdx}\n",
-    "import ChatModelTabs from \"@theme/ChatModelTabs\";\n",
-    "\n",
-    "<ChatModelTabs customVarName=\"llm\" />\n",
-    "```"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "25b647a3-f8f2-4541-a289-7a241e43f9df",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "# | output: false\n",
-    "# | echo: false\n",
-    "\n",
-    "from langchain_openai import ChatOpenAI\n",
-    "\n",
-    "llm = ChatOpenAI(model=\"gpt-3.5-turbo-0125\", temperature=0)"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "23e11cc9-abd6-4855-a7eb-799f45ca01ae",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "from langchain_core.output_parsers import StrOutputParser\n",
-    "from langchain_core.prompts import ChatPromptTemplate\n",
-    "from langchain_core.runnables import RunnablePassthrough\n",
+    "* RBACEnsembleRetriever: Combines semantic search with Role-Based and Attribute-Based Access Control (RBAC/ABAC)\n",
     "\n",
-    "prompt = ChatPromptTemplate.from_template(\n",
-    "    \"\"\"Answer the question based only on the context provided.\n",
+    "## Prerequisites\n",
     "\n",
-    "Context: {context}\n",
+    "```python\n",
+    "from permit import Permit\n",
+    "from langchain_openai import ChatOpenAI, OpenAIEmbeddings\n",
+    "from langchain_chroma import Chroma\n",
+    "# ... other imports as needed\n",
     "\n",
-    "Question: {question}\"\"\"\n",
+    "# Initialize Permit client\n",
+    "permit_client = Permit(\n",
+    "    token=\"your_permit_api_key\",\n",
+    "    pdp=\"your_pdp_url\"\n",
+    ")\n",
+    "```\n",
+    "\n",
+    "## ReBACSelfQueryRetriever\n",
+    "This retriever extends LangChain's SelfQueryRetriever to add relationship-based access control through the Permit.io integration.\n",
+    "\n",
+    "### Basic Usage\n",
+    "\n",
+    "```python\n",
+    "# Initialize vector store with documents\n",
+    "docs = [\n",
+    "    Document(\n",
+    "        page_content=\"Confidential project proposal\",\n",
+    "        metadata={\n",
+    "            \"owner\": \"user-123\",\n",
+    "            \"relationships\": [\"team-a\", \"managers\"],\n",
+    "            \"resource_type\": \"document\"\n",
+    "        }\n",
+    "    )\n",
+    "]\n",
+    "\n",
+    "# Create retriever\n",
+    "rebac_retriever = ReBACSelfQueryRetriever(\n",
+    "    llm=ChatOpenAI(),\n",
+    "    vectorstore=Chroma.from_documents(docs, OpenAIEmbeddings()),\n",
+    "    permit_client=permit_client\n",
     ")\n",
     "\n",
+    "# Query with relationship context\n",
+    "results = await rebac_retriever.aget_relevant_documents(\n",
+    "    \"Find project proposals\",\n",
+    "    user_context={\"user_id\": \"user-123\", \"relationships\": [\"team-a\"]}\n",
+    ")\n",
+    "```\n",
+    "\n",
+    "### Metadata Schema\n",
+    "The retriever uses the following metadata fields:\n",
+    "\n",
+    "* owner: Document owner identifier\n",
+    "* relationships: List of relationship identifiers that have access\n",
+    "* resource_type: Type of the resource for permission checking\n",
+    "\n",
+    "\n",
+    "## RBACEnsembleRetriever\n",
+    "This retriever combines multiple search strategies with role-based and attribute-based access control.\n",
+    "\n",
+    "### Basic Usage\n",
+    "\n",
+    "```python\n",
+    "# Initialize with documents\n",
+    "docs = [\n",
+    "    Document(\n",
+    "        page_content=\"HR Policy Document\",\n",
+    "        metadata={\n",
+    "            \"department\": \"HR\",\n",
+    "            \"classification\": \"internal\",\n",
+    "            \"required_role\": \"hr_staff\"\n",
+    "        }\n",
+    "    )\n",
+    "]\n",
+    "\n",
+    "# Create retrievers\n",
+    "semantic_retriever = Chroma.from_documents(\n",
+    "    docs,\n",
+    "    OpenAIEmbeddings()\n",
+    ").as_retriever()\n",
+    "\n",
+    "keyword_retriever = BM25Retriever.from_documents(docs)\n",
+    "\n",
+    "# Create ensemble\n",
+    "rbac_retriever = RBACEnsembleRetriever(\n",
+    "    retrievers=[semantic_retriever, keyword_retriever],\n",
+    "    weights=[0.6, 0.4],\n",
+    "    permit_client=permit_client\n",
+    ")\n",
     "\n",
-    "def format_docs(docs):\n",
-    "    return \"\\n\\n\".join(doc.page_content for doc in docs)\n",
-    "\n",
-    "\n",
-    "chain = (\n",
-    "    {\"context\": retriever | format_docs, \"question\": RunnablePassthrough()}\n",
-    "    | prompt\n",
-    "    | llm\n",
-    "    | StrOutputParser()\n",
-    ")"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "d47c37dd-5c11-416c-a3b6-bec413cd70e8",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "chain.invoke(\"...\")"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "d1ee55bc-ffc8-4cfa-801c-993953a08cfd",
-   "metadata": {},
-   "source": [
-    "## TODO: Any functionality or considerations specific to this retriever\n",
-    "\n",
-    "Fill in or delete if not relevant."
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "3a5bb5ca-c3ae-4a58-be67-2cd18574b9a3",
-   "metadata": {},
-   "source": [
-    "## API reference\n",
-    "\n",
-    "For detailed documentation of all LangchainPermitRetriever features and configurations head to the [API reference](https://api.python.langchain.com/en/latest/retrievers/langchain_permit.retrievers.LangchainPermit.LangchainPermitRetriever.html)."
+    "# Query with role context\n",
+    "results = await rbac_retriever.aget_relevant_documents(\n",
+    "    \"HR policies\",\n",
+    "    user_context={\n",
+    "        \"roles\": [\"hr_staff\"],\n",
+    "        \"attributes\": {\"department\": \"HR\"}\n",
+    "    }\n",
+    ")\n",
+    "```\n",
+    "### Access Control\n",
+    "The retriever supports:\n",
+    "\n",
+    "* Role-Based Access: Using the roles field in user context\n",
+    "* Attribute-Based Access: Using the attributes field in user context\n",
+    "* Weighted Results: Combining semantic relevance with permission-based filtering\n",
+    "\n",
+    "## Advanced Features\n",
+    "### Custom Permission Logic\n",
+    "Both retrievers support custom permission logic through Permit.io configurations:\n",
+    "\n",
+    "```python\n",
+    "# Example of a custom permission check for a single retrieved document (`doc`)\n",
+    "allowed = await permit_client.check(\n",
+    "    user=user_context,\n",
+    "    action=\"read\",\n",
+    "    resource={\n",
+    "        \"type\": doc.metadata.get(\"resource_type\"),\n",
+    "        \"attributes\": doc.metadata\n",
+    "    }\n",
+    ")\n",
+    "```"
    ]
   }
  ],