Base URL: http://localhost:8400
All requests require Authorization: Bearer {token} header unless noted otherwise.
All request/response bodies are JSON with Content-Type: application/json.
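As a rough illustration of the conventions above, the following Python sketch builds (but does not send) an authenticated JSON request using only the standard library. The `build_request` helper and the `my-token` value are hypothetical; the path is the tenant enumeration route shown later in this document.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8400"  # base URL stated above


def build_request(path, token, body=None, method="POST"):
    """Build (but do not send) an authenticated JSON request."""
    data = None if body is None else json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + path,
        data=data,
        method=method,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# "my-token" is a placeholder; use an admin API key or a credential bearer token.
req = build_request("/v1.0/tenants/enumerate", "my-token", {"MaxResults": 10})
# Sending is left to the caller, e.g. urllib.request.urlopen(req).
```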
Health check (no auth required).
Response: 200 OK
Health status (no auth required).
Response: 200 OK
{ "Status": "Healthy", "Version": "0.2.0" }

Health status JSON (no auth required).
Response: 200 OK
{ "Status": "Healthy", "Version": "0.2.0" }

Returns the role and tenant of the authenticated caller.
Response: 200 OK
{ "Role": "Admin", "TenantName": "Admin" }

- Role — "Admin" or "User"
- TenantName — "Admin" for global admins, or the tenant's name
Process a single semantic cell. Requires bearer token authentication.
Request Body: SemanticCellRequest
The Type field determines which content field is used. Supported types: Text, List, Table, Code, Hyperlink, Meta.
{
"Type": "Text",
"Text": "Your text content here...",
"ChunkingConfiguration": {
"Strategy": "FixedTokenCount",
"FixedTokenCount": 256,
"OverlapCount": 32,
"OverlapStrategy": "SlidingWindow",
"ContextPrefix": "doc-123 "
},
"EmbeddingConfiguration": {
"EmbeddingEndpointId": "eep_xxxx",
"L2Normalization": true
},
"Labels": ["label1"],
"Tags": { "key": "value" }
}

{
"Type": "Text",
"Text": "Your long text content here...",
"ChunkingConfiguration": {
"Strategy": "SentenceBased"
},
"EmbeddingConfiguration": {
"EmbeddingEndpointId": "eep_xxxx"
},
"SummarizationConfiguration": {
"CompletionEndpointId": "cep_xxxx",
"Order": "TopDown",
"MaxSummaryTokens": 1024,
"MinCellLength": 100,
"MaxParallelTasks": 4,
"MaxRetries": 10,
"MaxRetriesPerSummary": 2,
"TimeoutMs": 30000
}
}

When SummarizationConfiguration is present, Partio generates summary child cells using the specified completion endpoint before chunking and embedding. The CompletionEndpointId is required; all other fields have defaults.
{
"Type": "List",
"UnorderedList": ["First item", "Second item", "Third item"],
"ChunkingConfiguration": {
"Strategy": "WholeList"
},
"EmbeddingConfiguration": {
"EmbeddingEndpointId": "eep_xxxx",
"L2Normalization": false
}
}

{
"Type": "List",
"OrderedList": ["Step one", "Step two", "Step three"],
"ChunkingConfiguration": {
"Strategy": "ListEntry"
},
"EmbeddingConfiguration": {
"EmbeddingEndpointId": "eep_xxxx",
"L2Normalization": false
}
}

{
"Type": "Table",
"Table": [
["Name", "Age", "City"],
["Alice", "30", "New York"],
["Bob", "25", "London"]
],
"ChunkingConfiguration": {
"Strategy": "RowWithHeaders"
},
"EmbeddingConfiguration": {
"EmbeddingEndpointId": "eep_xxxx",
"L2Normalization": false
}
}

{
"Type": "Code",
"Text": "function hello() {\n return 'world';\n}",
"ChunkingConfiguration": {
"Strategy": "FixedTokenCount",
"FixedTokenCount": 256
},
"EmbeddingConfiguration": {
"EmbeddingEndpointId": "eep_xxxx",
"L2Normalization": false
}
}

{
"Type": "Hyperlink",
"Text": "https://example.com - Example website description",
"ChunkingConfiguration": {
"Strategy": "FixedTokenCount",
"FixedTokenCount": 256
},
"EmbeddingConfiguration": {
"EmbeddingEndpointId": "eep_xxxx",
"L2Normalization": false
}
}

{
"Type": "Meta",
"Text": "Author: John Doe | Created: 2026-01-15 | Version: 2.1",
"ChunkingConfiguration": {
"Strategy": "FixedTokenCount",
"FixedTokenCount": 256
},
"EmbeddingConfiguration": {
"EmbeddingEndpointId": "eep_xxxx",
"L2Normalization": false
}
}

{
"Type": "Table",
"Table": [
["Name", "Age", "City"],
["Alice", "30", "New York"],
["Bob", "25", "London"]
],
"ChunkingConfiguration": {
"Strategy": "Row"
},
"EmbeddingConfiguration": {
"EmbeddingEndpointId": "eep_xxxx",
"L2Normalization": false
}
}

Each data row becomes a chunk of space-separated values: "Alice 30 New York".
{
"Type": "Table",
"Table": [
["Name", "Age", "City"],
["Alice", "30", "New York"],
["Bob", "25", "London"],
["Carol", "35", "Paris"]
],
"ChunkingConfiguration": {
"Strategy": "RowGroupWithHeaders",
"RowGroupSize": 2
},
"EmbeddingConfiguration": {
"EmbeddingEndpointId": "eep_xxxx",
"L2Normalization": false
}
}

Each chunk contains a group of up to RowGroupSize data rows, with the header row prepended, rendered as a markdown table. RowGroupSize defaults to 5.
{
"Type": "Table",
"Table": [
["Name", "Age", "City"],
["Alice", "30", "New York"],
["Bob", "25", "London"]
],
"ChunkingConfiguration": {
"Strategy": "KeyValuePairs"
},
"EmbeddingConfiguration": {
"EmbeddingEndpointId": "eep_xxxx",
"L2Normalization": false
}
}

Each data row becomes: "Name: Alice, Age: 30, City: New York".
{
"Type": "Table",
"Table": [
["Name", "Age", "City"],
["Alice", "30", "New York"],
["Bob", "25", "London"]
],
"ChunkingConfiguration": {
"Strategy": "WholeTable"
},
"EmbeddingConfiguration": {
"EmbeddingEndpointId": "eep_xxxx",
"L2Normalization": false
}
}

The entire table is serialized as a single markdown table chunk.
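To make the table strategies concrete, here is a minimal Python sketch that reproduces the chunk shapes described for Row, KeyValuePairs, RowGroupWithHeaders, and WholeTable. It is not Partio's implementation: `chunk_table` and `to_markdown` are hypothetical helpers, and the exact markdown layout is an assumption. RowWithHeaders is left out because its output is not described here.

```python
def to_markdown(headers, rows):
    """Render headers plus data rows as a markdown table (exact layout assumed)."""
    lines = [
        "| " + " | ".join(headers) + " |",
        "|" + "|".join(["---"] * len(headers)) + "|",
    ]
    lines += ["| " + " | ".join(row) + " |" for row in rows]
    return "\n".join(lines)


def chunk_table(table, strategy, row_group_size=5):
    """Split a table (first row = headers) into text chunks."""
    headers, rows = table[0], table[1:]
    if strategy == "Row":
        return [" ".join(row) for row in rows]
    if strategy == "KeyValuePairs":
        return [", ".join(f"{h}: {v}" for h, v in zip(headers, row)) for row in rows]
    if strategy == "RowGroupWithHeaders":
        return [
            to_markdown(headers, rows[i:i + row_group_size])
            for i in range(0, len(rows), row_group_size)
        ]
    if strategy == "WholeTable":
        return [to_markdown(headers, rows)]
    raise ValueError(f"Unsupported table strategy in this sketch: {strategy}")


table = [
    ["Name", "Age", "City"],
    ["Alice", "30", "New York"],
    ["Bob", "25", "London"],
    ["Carol", "35", "Paris"],
]
row_chunks = chunk_table(table, "Row")
grouped = chunk_table(table, "RowGroupWithHeaders", row_group_size=2)
```

With three data rows and a group size of 2, `grouped` holds two chunks: one with Alice and Bob, one with Carol.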
{
"Type": "Text",
"Text": "# Introduction\nPartio is a chunking platform.\n\n# Architecture\nPartio uses a ...\n\n# Deployment\nUse Docker Compose to ...",
"ChunkingConfiguration": {
"Strategy": "RegexBased",
"RegexPattern": "(?=^#{1,3}\\s)",
"FixedTokenCount": 512
},
"EmbeddingConfiguration": {
"EmbeddingEndpointId": "eep_xxxx",
"L2Normalization": false
}
}

Split at boundaries defined by the RegexPattern. Text is split using Regex.Split at every match. Useful for Markdown headings, log timestamps, LaTeX sections, function definitions, and more.
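The splitting behavior can be approximated outside Partio with Python's `re.split`. This is only a sketch: Partio uses .NET `Regex.Split`, and whether it enables the multiline option (so that `^` matches at each line start) is an assumption. The `regex_chunks` helper is hypothetical, and splitting on zero-width matches requires Python 3.7 or later.

```python
import re


def regex_chunks(text, pattern):
    """Split text at every match of pattern and drop empty pieces."""
    # re.MULTILINE lets ^ match at the start of each line (assumed to mirror
    # the server-side regex options).
    pieces = re.split(pattern, text, flags=re.MULTILINE)
    return [piece.strip() for piece in pieces if piece.strip()]


text = (
    "# Introduction\nPartio is a chunking platform.\n\n"
    "# Architecture\nPartio uses a ...\n\n"
    "# Deployment\nUse Docker Compose to ..."
)
# Same pattern as the request example, after JSON unescaping.
chunks = regex_chunks(text, r"(?=^#{1,3}\s)")
```

Because the pattern is a lookahead, each heading stays attached to the section that follows it.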
Response: 200 OK — SemanticCellResponse
{
"GUID": "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee",
"ParentGUID": null,
"Type": "Text",
"Text": "Your text content here...",
"Children": [],
"Chunks": [
{
"CellGUID": "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee",
"Text": "Your text content here...",
"Labels": ["label1"],
"Tags": { "key": "value" },
"Embeddings": [-0.4418, 0.1234, ...]
}
]
}

Errors:
- 404 Not Found — EmbeddingEndpointId not found or does not belong to the caller's tenant
- 400 Bad Request — Endpoint is inactive, request body is missing/invalid, or strategy is incompatible with atom type
The API validates that the chunking strategy is compatible with the atom type. Incompatible combinations return 400 Bad Request.
- Generic strategies (FixedTokenCount, SentenceBased, ParagraphBased, RegexBased) work with all types
- List strategies (WholeList, ListEntry) only work with List
- Table strategies (Row, RowWithHeaders, RowGroupWithHeaders, KeyValuePairs, WholeTable) only work with Table
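A client can apply the same compatibility rules before sending a request. The sketch below is a hypothetical client-side pre-check, not the server's validation code; `check_strategy` is an illustrative name, and the incompatibility message copies the wording of the example error response shown in this section.

```python
GENERIC = {"FixedTokenCount", "SentenceBased", "ParagraphBased", "RegexBased"}
TYPE_SPECIFIC = {
    "WholeList": "List",
    "ListEntry": "List",
    "Row": "Table",
    "RowWithHeaders": "Table",
    "RowGroupWithHeaders": "Table",
    "KeyValuePairs": "Table",
    "WholeTable": "Table",
}
CELL_TYPES = {"Text", "List", "Table", "Code", "Hyperlink", "Meta"}


def check_strategy(cell_type, strategy):
    """Return None when the combination is allowed, else an error message."""
    if cell_type not in CELL_TYPES:
        return f"Unknown type '{cell_type}'."
    if strategy in GENERIC:
        return None  # generic strategies work with every type
    required = TYPE_SPECIFIC.get(strategy)
    if required is None:
        return f"Unknown strategy '{strategy}'."
    if required != cell_type:
        return (
            f"Strategy '{strategy}' is only compatible with atom type "
            f"'{required}', but got '{cell_type}'."
        )
    return None
```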
Example error response for missing RegexPattern:
{
"Error": "BadRequest",
"Message": "RegexPattern is required when using the RegexBased strategy.",
"StatusCode": 400
}

Example error response for invalid RegexPattern:
{
"Error": "BadRequest",
"Message": "RegexPattern is not a valid regular expression: parsing '([' - Unterminated [] set.",
"StatusCode": 400
}

Example error response for using Row strategy on a Text type:
{
"Error": "BadRequest",
"Message": "Strategy 'Row' is only compatible with atom type 'Table', but got 'Text'.",
"StatusCode": 400
}

ChunkingConfiguration properties:

| Property | Type | Default | Description |
|---|---|---|---|
| Strategy | string | FixedTokenCount | Chunking strategy to use |
| FixedTokenCount | int | 256 | Tokens per chunk (for FixedTokenCount) |
| OverlapCount | int | 0 | Overlap tokens between chunks |
| OverlapPercentage | float? | null | Overlap as percentage (0.0-1.0) |
| OverlapStrategy | string | SlidingWindow | Overlap boundary strategy |
| ContextPrefix | string? | null | Prefix prepended to each chunk |
| RowGroupSize | int | 5 | Rows per group (for RowGroupWithHeaders). Minimum: 1 |
| RegexPattern | string? | null | Regular expression split pattern (required for RegexBased strategy). Text is split at every match of this pattern. |
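The interaction between FixedTokenCount, OverlapCount, and ContextPrefix can be pictured with a sliding-window sketch. Several things here are assumptions rather than documented behavior: whitespace-separated words stand in for the real tokenizer, OverlapPercentage is taken to override OverlapCount when set, and `fixed_token_chunks` is a hypothetical helper.

```python
def fixed_token_chunks(text, fixed_token_count=256, overlap_count=0,
                       overlap_percentage=None, context_prefix=None):
    """Split text into fixed-size token windows with optional overlap."""
    tokens = text.split()  # assumption: whitespace tokens, not Partio's tokenizer
    if overlap_percentage is not None:
        # assumption: the percentage, when set, replaces OverlapCount
        overlap_count = int(fixed_token_count * overlap_percentage)
    if not 0 <= overlap_count < fixed_token_count:
        raise ValueError("overlap must be non-negative and smaller than the chunk size")
    step = fixed_token_count - overlap_count
    chunks = []
    for start in range(0, len(tokens), step):
        window = tokens[start:start + fixed_token_count]
        chunks.append((context_prefix or "") + " ".join(window))
        if start + fixed_token_count >= len(tokens):
            break  # the last window already reached the end of the input
    return chunks


sample = " ".join(str(i) for i in range(10))
windows = fixed_token_chunks(sample, fixed_token_count=4, overlap_count=1)
```

With 10 tokens, a window of 4, and an overlap of 1, each window starts 3 tokens after the previous one, so adjacent chunks share exactly one token.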
EmbeddingConfiguration properties:

| Property | Type | Default | Description |
|---|---|---|---|
| EmbeddingEndpointId | string | (required) | The embedding endpoint ID to use for generating embeddings (e.g. eep_xxxx). The endpoint must belong to the caller's tenant (non-admin) and be active. |
| L2Normalization | bool | false | Whether to L2-normalize the embedding vectors |
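For reference, L2 normalization scales a vector so that its Euclidean length is 1. The sketch below shows the arithmetic; `l2_normalize` is an illustrative helper, and returning a zero vector unchanged is an assumption about edge-case handling.

```python
import math


def l2_normalize(vector):
    """Scale vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        return list(vector)  # assumption: a zero vector is returned unchanged
    return [x / norm for x in vector]


unit = l2_normalize([3.0, 4.0])  # norm is 5, so the result is [0.6, 0.8]
```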
Process multiple semantic cells.
Request Body: List<SemanticCellRequest>
Response: 200 OK — List<SemanticCellResponse>
The explorer endpoints are intended for diagnostics from the dashboard or SDKs. They execute the selected configured endpoint through Partio's own backend client path and always return a structured result payload with Success, StatusCode, any Error, and the captured upstream call details.
Exercise a configured embedding endpoint through Partio.
Request Body:
{
"EndpointId": "eep_xxxx",
"Input": "Partio explorer embedding test payload",
"L2Normalization": false
}

Response: 200 OK — EndpointExplorerEmbeddingResponse
{
"Success": true,
"StatusCode": 200,
"Error": null,
"EndpointId": "eep_xxxx",
"Model": "nomic-embed-text",
"Input": "Partio explorer embedding test payload",
"Embedding": [0.0123, -0.0456, 0.0789],
"Dimensions": 768,
"ResponseTimeMs": 123,
"RequestHistoryId": "req_xxxx",
"EmbeddingCalls": []
}

When the provider call fails, Success is false, StatusCode contains the mapped failure code, Error contains the error text, and EmbeddingCalls still contains any upstream request/response data captured before the failure.
Exercise a configured inference endpoint through Partio.
Request Body:
{
"EndpointId": "cep_xxxx",
"Prompt": "Explain what Partio does in one short paragraph.",
"SystemPrompt": "Be concise.",
"MaxTokens": 512,
"TimeoutMs": 60000
}

Response: 200 OK — EndpointExplorerCompletionResponse
{
"Success": true,
"StatusCode": 200,
"Error": null,
"EndpointId": "cep_xxxx",
"Model": "gpt-4.1-mini",
"Prompt": "Explain what Partio does in one short paragraph.",
"SystemPrompt": "Be concise.",
"Output": "Partio is a multi-tenant service for chunking, embedding, and optional summarization.",
"ResponseTimeMs": 187,
"RequestHistoryId": "req_xxxx",
"CompletionCalls": []
}

If the selected endpoint has EnableRequestHistory = true and request history is enabled globally, the explorer response also includes the created RequestHistoryId.
Create a tenant. Also creates a default user, credential, and embedding endpoints.
Request Body:
{
"Name": "My Tenant",
"Labels": ["production"],
"Tags": { "env": "prod" }
}

Response: 201 Created — TenantMetadata
Read a tenant by ID.
Response: 200 OK — TenantMetadata
Update a tenant.
Request Body: TenantMetadata (partial update)
Response: 200 OK — TenantMetadata
Delete a tenant.
Response: 204 No Content
Check if a tenant exists.
Response: 200 OK or 404 Not Found
List tenants with pagination and filtering.
Request Body: EnumerationRequest
{
"MaxResults": 100,
"ContinuationToken": null,
"Order": "CreatedDescending",
"NameFilter": null,
"ActiveFilter": null
}

Response: 200 OK — EnumerationResult<TenantMetadata>
Create a user.
Request Body:
{
"TenantId": "ten_...",
"Email": "user@example.com",
"Password": "plaintext-password",
"FirstName": "John",
"LastName": "Doe",
"IsAdmin": false
}

Response: 200 OK — UserMaster (password redacted)
Read a user by ID (password redacted).
Response: 200 OK — UserMaster
Update a user.
Response: 200 OK — UserMaster
Delete a user.
Response: 204 No Content
Check if a user exists.
Response: 200 OK or 404 Not Found
List users with pagination.
Request/Response: Same pattern as tenants.
Create a credential (generates a bearer token).
Request Body:
{
"TenantId": "ten_...",
"UserId": "usr_...",
"Name": "My API Key"
}

Response: 201 Created — Credential (includes generated BearerToken)
Read a credential.
Response: 200 OK — Credential
Update a credential.
Response: 200 OK — Credential
Delete a credential.
Response: 204 No Content
Check if a credential exists.
Response: 200 OK or 404 Not Found
List credentials with pagination.
Create an embedding endpoint.
Request Body:
{
"TenantId": "ten_...",
"Name": "My Embedding Endpoint",
"Model": "nomic-embed-text",
"Endpoint": "http://localhost:11434",
"ApiFormat": "Ollama",
"ApiKey": null,
"Active": true,
"EnableRequestHistory": true,
"HealthCheckEnabled": false,
"HealthCheckUrl": null,
"HealthCheckMethod": "GET",
"HealthCheckIntervalMs": 5000,
"HealthCheckTimeoutMs": 2000,
"HealthCheckExpectedStatusCode": 200,
"HealthyThreshold": 3,
"UnhealthyThreshold": 3,
"HealthCheckUseAuth": false
}

| Property | Type | Default | Description |
|---|---|---|---|
| HealthCheckEnabled | bool | false | Enable background health checking for this endpoint |
| HealthCheckUrl | string? | null | Custom URL to check (defaults to the endpoint URL if null) |
| HealthCheckMethod | string | "GET" | HTTP method for health checks (GET or HEAD) |
| HealthCheckIntervalMs | int | 5000 | Milliseconds between health checks |
| HealthCheckTimeoutMs | int | 2000 | Timeout per health check request in milliseconds |
| HealthCheckExpectedStatusCode | int | 200 | Expected HTTP status code for a healthy response |
| HealthyThreshold | int | 3 | Consecutive successes required to transition to healthy |
| UnhealthyThreshold | int | 3 | Consecutive failures required to transition to unhealthy |
| HealthCheckUseAuth | bool | false | Include the endpoint's API key in health checks (Bearer for OpenAI/vLLM, x-goog-api-key for Gemini) |
When HealthCheckEnabled is true and the endpoint is active, the server runs a background loop that periodically checks the endpoint. If the endpoint becomes unhealthy, process requests to it return 502 Bad Gateway.
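The threshold behavior can be pictured as a small state machine: a state change happens only after enough consecutive results of the opposite kind. The sketch below illustrates that idea and is not the server's code. The `HealthState` class is hypothetical, and starting in the unhealthy state is an assumption.

```python
class HealthState:
    """Track consecutive check results and flip state at the thresholds."""

    def __init__(self, healthy_threshold=3, unhealthy_threshold=3, is_healthy=False):
        self.healthy_threshold = healthy_threshold
        self.unhealthy_threshold = unhealthy_threshold
        self.is_healthy = is_healthy  # assumption: endpoints start unhealthy
        self.consecutive_successes = 0
        self.consecutive_failures = 0

    def record(self, success):
        """Record one check result and return the resulting health state."""
        if success:
            self.consecutive_successes += 1
            self.consecutive_failures = 0
            if self.consecutive_successes >= self.healthy_threshold:
                self.is_healthy = True
        else:
            self.consecutive_failures += 1
            self.consecutive_successes = 0
            if self.consecutive_failures >= self.unhealthy_threshold:
                self.is_healthy = False
        return self.is_healthy


state = HealthState()
results = [state.record(ok) for ok in (True, True, True, False, False, False)]
```

Here the endpoint becomes healthy on the third success and stays healthy through two failures, flipping back only on the third consecutive failure.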
Health check defaults are applied automatically based on ApiFormat when creating or updating an endpoint:
- Ollama: URL defaults to {Endpoint}/api/tags, 5s interval, 2s timeout, no auth
- OpenAI: URL defaults to {Endpoint}/v1/models, 15s interval, 5s timeout, auth enabled
- vLLM: URL defaults to {Endpoint}/v1/models, 15s interval, 5s timeout, auth enabled
- Gemini: URL defaults to {Endpoint}/v1beta/models, 15s interval, 5s timeout, auth enabled
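The per-format defaults listed above can be expressed as a lookup table. This sketch only restates them for illustration; `health_check_settings` is a hypothetical helper, and the exact ApiFormat strings for OpenAI, vLLM, and Gemini, the trailing-slash handling, and the precedence of an explicit HealthCheckUrl are all assumptions.

```python
HEALTH_DEFAULTS = {
    "Ollama": {"path": "/api/tags", "interval_ms": 5000, "timeout_ms": 2000, "use_auth": False},
    "OpenAI": {"path": "/v1/models", "interval_ms": 15000, "timeout_ms": 5000, "use_auth": True},
    "vLLM": {"path": "/v1/models", "interval_ms": 15000, "timeout_ms": 5000, "use_auth": True},
    "Gemini": {"path": "/v1beta/models", "interval_ms": 15000, "timeout_ms": 5000, "use_auth": True},
}


def health_check_settings(api_format, endpoint, health_check_url=None):
    """Resolve health check settings for an endpoint from its ApiFormat."""
    defaults = HEALTH_DEFAULTS[api_format]
    # assumption: an explicit HealthCheckUrl wins; trailing slashes are trimmed
    url = health_check_url or endpoint.rstrip("/") + defaults["path"]
    return {
        "HealthCheckUrl": url,
        "HealthCheckIntervalMs": defaults["interval_ms"],
        "HealthCheckTimeoutMs": defaults["timeout_ms"],
        "HealthCheckUseAuth": defaults["use_auth"],
    }


ollama = health_check_settings("Ollama", "http://localhost:11434")
```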
Response: 201 Created — EmbeddingEndpoint
Read an embedding endpoint.
Response: 200 OK — EmbeddingEndpoint
Update an embedding endpoint.
Response: 200 OK — EmbeddingEndpoint
Delete an embedding endpoint.
Response: 204 No Content
Check if an embedding endpoint exists.
Response: 200 OK or 404 Not Found
List embedding endpoints with pagination.
Get the health status for a specific monitored embedding endpoint.
Path Parameters:
- id — Embedding endpoint ID
Response: 200 OK — EndpointHealthStatus
{
"EndpointId": "eep_xxxx",
"EndpointName": "nomic-embed-text",
"TenantId": "ten_xxxx",
"IsHealthy": true,
"FirstCheckUtc": "2026-02-07T12:00:00Z",
"LastCheckUtc": "2026-02-07T12:01:00Z",
"LastHealthyUtc": "2026-02-07T12:00:30Z",
"LastUnhealthyUtc": null,
"LastStateChangeUtc": "2026-02-07T12:00:30Z",
"TotalUptimeMs": 60000,
"TotalDowntimeMs": 30000,
"UptimePercentage": 66.67,
"ConsecutiveSuccesses": 3,
"ConsecutiveFailures": 0,
"LastError": null,
"History": [
{ "TimestampUtc": "2026-02-07T12:00:10Z", "Success": false },
{ "TimestampUtc": "2026-02-07T12:00:20Z", "Success": true },
{ "TimestampUtc": "2026-02-07T12:00:30Z", "Success": true }
]
}

Errors:
- 404 Not Found — No health state exists (health check not enabled or endpoint not found)
Get health status for all monitored embedding endpoints. Non-admin callers see only their tenant's endpoints.
Response: 200 OK — List<EndpointHealthStatus>
Create a completion endpoint.
Request Body:
{
"TenantId": "ten_...",
"Name": "My Inference Endpoint",
"Model": "llama3",
"Endpoint": "http://localhost:11434",
"ApiFormat": "Ollama",
"ApiKey": null,
"Active": true,
"EnableRequestHistory": true,
"HealthCheckEnabled": false
}

Response: 201 Created — CompletionEndpoint
Read a completion endpoint.
Response: 200 OK — CompletionEndpoint
Update a completion endpoint.
Response: 200 OK — CompletionEndpoint
Delete a completion endpoint.
Response: 204 No Content
Check if a completion endpoint exists.
Response: 200 OK or 404 Not Found
List completion endpoints with pagination.
Get the health status for a specific completion endpoint.
Response: 200 OK — EndpointHealthStatus
Get health status for all monitored completion endpoints.
Response: 200 OK — List<EndpointHealthStatus>
Read a request history entry.
Response: 200 OK — RequestHistoryEntry
Read request/response body detail from filesystem.
Response: 200 OK — JSON object with the following fields:
| Field | Type | Description |
|---|---|---|
| RequestHeaders | object? | Outer request headers (key-value pairs) |
| RequestBody | string? | Outer request body (may be truncated) |
| ResponseHeaders | object? | Outer response headers (key-value pairs) |
| ResponseBody | string? | Outer response body (may be truncated) |
| EmbeddingCalls | array? | Upstream embedding HTTP call details (present only for process requests) |
| CompletionCalls | array? | Upstream completion/inference HTTP call details (present only for requests that use summarization or other completion features) |
Each item in the EmbeddingCalls or CompletionCalls array:
| Field | Type | Description |
|---|---|---|
| Url | string? | Full URL called on the upstream endpoint |
| Method | string? | HTTP method (e.g. POST) |
| RequestHeaders | object? | Headers sent to the upstream endpoint |
| RequestBody | string? | Body sent to the upstream endpoint (may be truncated) |
| StatusCode | int? | HTTP status code returned by the upstream endpoint |
| ResponseHeaders | object? | Response headers from the upstream endpoint |
| ResponseBody | string? | Response body from the upstream endpoint (may be truncated) |
| ResponseTimeMs | long? | Round-trip time for this call in milliseconds |
| Success | bool | Whether the call returned a success status code |
| Error | string? | Error message if the call failed |
| TimestampUtc | string | ISO 8601 timestamp when the call was initiated |
Delete a request history entry.
Response: 204 No Content
List request history with pagination.
Get aggregated request statistics grouped by time bucket, broken out by success/failure.
Request Body: RequestStatisticsRequest
{
"RequestType": "Embedding",
"Timeframe": "Day",
"EndpointFilter": null
}

| Field | Type | Description |
|---|---|---|
| RequestType | string? | "Embedding", "Inference", or null for all requests. Embedding matches URLs containing /process or /embedding. Inference matches URLs containing /completion. |
| Timeframe | string? | "Hour" (1-minute buckets, ~60 samples), "Day" (15-minute buckets, ~96 samples), "Week" (1-hour buckets, ~168 samples), or "Month" (4-hour buckets, ~180 samples). Defaults to "Day". |
| EndpointFilter | string? | Optional URL substring filter to narrow results to a specific endpoint. |
Response: 200 OK — RequestStatisticsResponse
{
"Buckets": [
{
"TimeBucket": "2026-03-20T14:00",
"SuccessCount": 42,
"FailureCount": 3
},
{
"TimeBucket": "2026-03-20T15:00",
"SuccessCount": 38,
"FailureCount": 1
}
],
"TotalSuccess": 80,
"TotalFailure": 4
}

| Field | Type | Description |
|---|---|---|
| Buckets | array | Time-bucketed request counts |
| Buckets[].TimeBucket | string | ISO 8601 timestamp for the bucket start, always "yyyy-MM-ddTHH:mm" format (e.g. "2026-03-20T14:30") |
| Buckets[].SuccessCount | long | Requests with HTTP status 100-399 |
| Buckets[].FailureCount | long | Requests with HTTP status 400+ or null status |
| TotalSuccess | long | Total successful requests in the time range |
| TotalFailure | long | Total failed requests in the time range |
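The bucket sizes and the success/failure split can be worked through in a short sketch. It is illustrative only: the helper names are hypothetical, aligning buckets to midnight is an assumption, and "Month" is treated as 30 days because that matches the ~180 samples quoted for 4-hour buckets.

```python
from datetime import datetime, timedelta

BUCKET_MINUTES = {"Hour": 1, "Day": 15, "Week": 60, "Month": 240}
WINDOWS = {
    "Hour": timedelta(hours=1),
    "Day": timedelta(days=1),
    "Week": timedelta(weeks=1),
    "Month": timedelta(days=30),  # assumption: a month is counted as 30 days
}


def bucket_count(timeframe="Day"):
    """Number of buckets covering the timeframe."""
    return WINDOWS[timeframe] // timedelta(minutes=BUCKET_MINUTES[timeframe])


def bucket_label(ts, timeframe="Day"):
    """Floor a timestamp to its bucket start, formatted as yyyy-MM-ddTHH:mm."""
    size = BUCKET_MINUTES[timeframe]
    minute_of_day = ts.hour * 60 + ts.minute
    start = minute_of_day - minute_of_day % size  # assumption: aligned to midnight
    floored = ts.replace(hour=start // 60, minute=start % 60, second=0, microsecond=0)
    return floored.strftime("%Y-%m-%dT%H:%M")


def is_success(status_code):
    """Status 100-399 counts as success; 400+ or a null status counts as failure."""
    return status_code is not None and 100 <= status_code <= 399


label = bucket_label(datetime(2026, 3, 20, 14, 37, 12), "Day")
```

For a request at 14:37:12 with the "Day" timeframe, `label` falls in the 15-minute bucket that starts at 14:30.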
Summarization is an optional step in the processing pipeline that runs before chunking and embedding. When a SummarizationConfiguration is provided inline on the SemanticCellRequest, the server uses a completion endpoint to generate summaries of the input content. The resulting summary cells have Type = "Summary".
- TopDown — Summarizes the entire content in a single pass, producing one summary from the full input.
- BottomUp — Splits the content into smaller pieces first, summarizes each piece individually, then optionally combines the summaries.
The summarization prompt is configurable via a template string. The following tokens are available for substitution:
| Token | Description |
|---|---|
| {tokens} | The target token count for the summary |
| {content} | The content to be summarized |
| {context} | Additional context provided by the caller |
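Token substitution of this kind can be sketched in a few lines. The template text and the `render_prompt` helper below are hypothetical, and Partio's own substitution logic may differ. A single-pass regex replacement is used so that braces inside the content or context are never re-interpreted as tokens.

```python
import re


def render_prompt(template, tokens, content, context=""):
    """Replace {tokens}, {content}, and {context} in one pass."""
    values = {"tokens": str(tokens), "content": content, "context": context}
    return re.sub(
        r"\{(tokens|content|context)\}",
        lambda match: values[match.group(1)],
        template,
    )


# Hypothetical template; the real default template is not shown in this document.
template = "Summarize in at most {tokens} tokens.\nContext: {context}\n\n{content}"
prompt = render_prompt(template, 1024, "Partio is a chunking platform.", "doc-123")
```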
Summarization supports two levels of retry control:
- MaxRetriesPerSummary — Maximum number of retries for each individual summary cell. If a single cell fails repeatedly, it is skipped after this many attempts.
- MaxRetries — Global maximum number of total retries across the entire summarization operation. Once this limit is reached, the operation fails regardless of per-cell limits.
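One way to read the two retry limits is sketched below. It is an interpretation, not Partio's implementation: the sketch assumes each cell gets one initial attempt plus up to MaxRetriesPerSummary retries, and that every retry also counts against the global MaxRetries budget. `summarize_cells` and `RetryBudgetExceeded` are hypothetical names, and the default values are taken from the example request rather than from documented defaults.

```python
class RetryBudgetExceeded(Exception):
    """Raised when the global retry budget (MaxRetries) is used up."""


def summarize_cells(cells, summarize, max_retries=10, max_retries_per_summary=2):
    """Summarize each cell, skipping cells that keep failing."""
    summaries, skipped = {}, []
    total_retries = 0
    for cell in cells:
        retries_left = max_retries_per_summary
        while True:
            try:
                summaries[cell] = summarize(cell)
                break
            except Exception:
                if retries_left == 0:
                    skipped.append(cell)  # per-cell limit reached: skip this cell
                    break
                if total_retries >= max_retries:
                    # global limit reached: the whole operation fails
                    raise RetryBudgetExceeded(
                        f"global retry limit of {max_retries} reached"
                    )
                retries_left -= 1
                total_retries += 1
    return summaries, skipped
```

Under this reading, a cell that always fails consumes `max_retries_per_summary` retries from the global budget before it is skipped.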
All errors return an ApiErrorResponse:
{
"Error": "ArgumentException",
"Message": "Request body is required.",
"StatusCode": 400,
"TimestampUtc": "2026-02-06T12:00:00Z"
}

| Status Code | Meaning |
|---|---|
| 400 | Bad Request (invalid input) |
| 401 | Unauthorized (missing/invalid token) |
| 404 | Not Found |
| 502 | Bad Gateway (endpoint is unhealthy) |
| 500 | Internal Server Error |
Include the bearer token in the Authorization header:
Authorization: Bearer partioadmin
- Admin API keys (from the AdminApiKeys array in partio.json) grant full admin access
- Credential bearer tokens grant tenant-scoped access for processing
- Health endpoints (/, /v1.0/health) do not require authentication
All enumeration endpoints accept POST with an EnumerationRequest body and return EnumerationResult<T>.
Use ContinuationToken from the response to fetch the next page:
// First page
POST /v1.0/tenants/enumerate
{ "MaxResults": 10 }
// Next page
POST /v1.0/tenants/enumerate
{ "MaxResults": 10, "ContinuationToken": "ten_abc123..." }

Order values:
- CreatedAscending — oldest first
- CreatedDescending — newest first (default)
- NameAscending — alphabetical A-Z
- NameDescending — alphabetical Z-A

Filter fields:
- NameFilter — partial match on name field
- LabelFilter — exact match on labels
- TagKeyFilter / TagValueFilter — filter by tag key/value
- ActiveFilter — filter by active status (true/false)
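The continuation-token loop described above might be wrapped in a generator like the following sketch. The `fetch_page` callable (which would POST the request body to an enumerate route and return the parsed JSON) and the `Objects` field name for the page items are hypothetical; only MaxResults and ContinuationToken come from this document.

```python
def enumerate_all(fetch_page, max_results=100):
    """Yield every item across pages by following ContinuationToken."""
    token = None
    while True:
        request = {"MaxResults": max_results}
        if token is not None:
            request["ContinuationToken"] = token
        page = fetch_page(request)
        # "Objects" is an assumed field name for the items of EnumerationResult<T>.
        yield from page.get("Objects", [])
        token = page.get("ContinuationToken")
        if not token:
            break  # no token in the response means this was the last page
```

Passing the HTTP call in as a function keeps the paging logic independent of any particular HTTP client.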