feat(responses)!: add Prompts API to Responses API #3514
Conversation
Force-pushed ef753bc to fe6ea4c
This is an API change unrelated to how prompts are used in /v1/responses.
Please review your code assistant output before posting as a PR.
Hi @mattf! Could you please elaborate on how prompts should be used in the Responses API, in your opinion? My understanding was that they should be propagated to the Agent's messages context as OpenAISystemMessageParam.
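For context, a minimal sketch of that interpretation (which is revised later in this thread) might look like the following; the Prompts API accessor, its signature, and the surrounding wiring are assumptions, not code from this PR:

```python
async def build_messages(prompts_api, prompt_id: str, user_messages: list):
    # Hypothetical lookup; get_prompt and its signature are assumptions.
    prompt = await prompts_api.get_prompt(prompt_id)
    # Prepend the stored template text to the agent's message context as a
    # system message, per the interpretation described above.
    system_msg = OpenAISystemMessageParam(content=prompt.prompt)
    return [system_msg, *user_messages]
```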
Hey @r3v5 it looks like you've suggested adding prompt_id here where you need to add a Prompt object with an id, version, and variables, which would then be consistent with OpenAI's client usage, as outlined here:
```python
response = client.responses.create(
    prompt={
        "id": "pmpt_68b0c29740048196bd3a6e6ac3c4d0e20ed9a13f0d15bf5e",
        "version": "2",
        "variables": {
            "city": "San Francisco",
            "age": 30,
        }
    }
)
```

So this is currently incorrect. As @mattf suggested, let's make sure we double check this. Thank you.
Oh yeah, this makes sense. I got it. I will adjust the implementation then.
Force-pushed fe6ea4c to a3cdf78
| """ | ||
|
|
||
| id: str | ||
| version: str | None = None |
Version has type string because OpenAI defines it as a string. Reference is here.
@cdoern this is an enhancement to the /openai/v1/responses API; does it match the OpenAI /v1/responses API spec?
There seem to be some breaking changes, BUT these might have existed in main; let me check.
Force-pushed a3cdf78 to d76b15b
hey @cdoern any update on the main branch check here?
Force-pushed fadf1d0 to f474e0c
Force-pushed 4175600 to f37efb1
The PR description shows a working example of Prompt inside the Response create, but I'd like to see
Force-pushed d63e31f to b954305
@r3v5 unit tests are failing
I just rebased from main today. I still haven't finished my implementation.
Force-pushed 1660935 to 7a7b2b7
Hey @leseb, @franciscojavierarceo! Here I provide comprehensive testing of Prompts support in the Responses API via curl requests to the LLS server.

- Test prompts with images that have text on them in the Responses API. I used this image for testing purposes: iphone 17 image. Output after inferencing:
- The same example, but without providing the description of the product. Output:
- Test prompts with PDF files in the Responses API. I used this PDF file for testing purposes: invoicesample.pdf. Output after inferencing:
- Test a simple text prompt in the Responses API. Output after inferencing:
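For reference, a minimal sketch of the kind of request these tests exercise, written with the OpenAI Python client against a local Llama Stack server; the base URL, API key, model id, prompt id, and variable name below are placeholders, not values from the PR:

```python
from openai import OpenAI

# Placeholder base_url/api_key for a local Llama Stack server.
client = OpenAI(base_url="http://localhost:8321/v1", api_key="none")

response = client.responses.create(
    model="llama3.2:3b",  # placeholder model id
    prompt={
        "id": "pmpt_...",  # id returned by POST /v1/prompts
        "version": "1",
        "variables": {
            # a simple text variable; image/file variables use the content
            # shapes discussed later in this thread
            "product_name": "iPhone 17",
        },
    },
)
print(response.output_text)
```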
The implementation is there :)

sorry @r3v5 this keeps getting wrecked 😭

last rebase and I think we're good to go

We haven't landed the Prompts API implementation, have we?

No worries, @franciscojavierarceo! I will rebase today.
Force-pushed 7a7b2b7 to 59169bf
I rebased from main, CI is green!
make sure to expand the title & description of this PR to match the expanded scope
also, make sure there is test coverage for the new APIs as they're used outside of prompts
I have updated the PR description now.
```python
    output: list[OpenAIResponseOutput]
    parallel_tool_calls: bool = False
    previous_response_id: str | None = None
    prompt: Prompt | None = None
```
but then is this object the full one?! and not the same object you created above?
When an instance of the OpenAIResponseObject class is created, it correctly contains a link to a Prompt object. If the user doesn't provide a prompt when creating a response, there is no link to any prompt.
The prompt params we use when creating a response refer to the OpenAIResponsePromptParam class, which handles the different types of prompt variables:
```python
@json_schema_type
class OpenAIResponsePromptParam(BaseModel):
    """Prompt object that is used for OpenAI responses.

    :param id: Unique identifier of the prompt template
    :param variables: Dictionary of variable names to OpenAIResponseInputMessageContent structure for template substitution
    :param version: Version number of the prompt to use (defaults to latest if not specified)
    """

    id: str
    variables: dict[str, OpenAIResponseInputMessageContent] | None = None
    version: str | None = None
```
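For illustration, constructing this param might look like the sketch below; OpenAIResponseInputMessageContentText and its fields are assumptions based on the content types named elsewhere in this PR, not verified against the final schema:

```python
# Minimal sketch (not from the PR): building an OpenAIResponsePromptParam
# with a single text variable. The content class and its fields are assumed.
param = OpenAIResponsePromptParam(
    id="pmpt_68b0c29740048196bd3a6e6ac3c4d0e20ed9a13f0d15bf5e",
    version="2",
    variables={
        "city": OpenAIResponseInputMessageContentText(text="San Francisco"),
    },
)
```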
The Prompt class sits in the Prompts API, while the OpenAIResponsePromptParam helper structure is defined in the Agents API in apis/agents/openai_responses.py.
@r3v5 now I am not so sure. OpenAI's "Response object" doc (in their reference) says the prompt field within contains exactly three fields: { id, variables, version }. On the other hand, the Prompt field in our incarnation of the Prompts API (which is NOT part of the OpenAI API set) has the fields { prompt_id, version, variables, is_default, prompt }.
This is a discrepancy -- at least the { id } field is a clear discrepancy, but even referencing that other object we made up seems wrong. This is what @leseb brought up before.
Got it. What do you think we should settle on in terms of classes? As you said, we now have two classes for prompts on different API layers.
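To make the discrepancy concrete, here is a hypothetical mapping between the two classes, using only the field names quoted in the comments above; nothing like this exists in the PR:

```python
# Prompts API object (not part of the OpenAI spec):
#   Prompt(prompt_id, version, variables, is_default, prompt)
# Responses API param (mirrors OpenAI's { id, variables, version }):
#   OpenAIResponsePromptParam(id, variables, version)
def to_response_prompt_param(p: Prompt) -> OpenAIResponsePromptParam:
    # Hypothetical adapter: renames prompt_id -> id and drops the
    # server-side fields (is_default and the template text itself).
    # Prompt.variables holds variable *names*, not values, so it cannot
    # populate the param's name -> content mapping.
    return OpenAIResponsePromptParam(id=p.prompt_id, version=str(p.version))
```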
| backend="sql_default", | ||
| table_name="openai_conversations", | ||
| ).model_dump(exclude_none=True), | ||
| "prompts": SqlStoreReference( |
this change should be a separate PR completely
I think you should just separate this PR into three PRs:
- first is the change I point out here
- second is the change to the API only -- no implementation at all
- third is the implementation and tests
Yeah, I see. I will do it. Should I create issues for each PR or just submit PRs?
@r3v5 I think PRs all linking to the same issue is fine.
I submitted the first PR of three
# What does this PR do?

This PR is responsible for attaching prompts to storage stores in run configs. It allows specifying prompts as stores in different distributions. The need for this functionality was raised in #3514.

> Note: #3514 is divided into three separate PRs. The current PR is the first of the three.

## Test Plan

Manual testing and updated CI unit tests.

Prerequisites:

1. `uv run --with llama-stack llama stack list-deps starter | xargs -L1 uv pip install`
2. `llama stack run starter`

```
INFO 2025-10-23 15:36:17,387 llama_stack.cli.stack.run:100 cli: Using run configuration: /Users/ianmiller/llama-stack/llama_stack/distributions/starter/run.yaml
INFO 2025-10-23 15:36:17,423 llama_stack.cli.stack.run:157 cli: HTTPS enabled with certificates: Key: None Cert: None
INFO 2025-10-23 15:36:17,424 llama_stack.cli.stack.run:159 cli: Listening on ['::', '0.0.0.0']:8321
INFO 2025-10-23 15:36:17,749 llama_stack.core.server.server:521 core::server: Run configuration:
INFO 2025-10-23 15:36:17,756 llama_stack.core.server.server:524 core::server: apis:
- agents
- batches
- datasetio
- eval
- files
- inference
- post_training
- safety
- scoring
- tool_runtime
- vector_io
image_name: starter
providers:
  agents:
  - config:
      persistence:
        agent_state:
          backend: kv_default
          namespace: agents
        responses:
          backend: sql_default
          max_write_queue_size: 10000
          num_writers: 4
          table_name: responses
    provider_id: meta-reference
    provider_type: inline::meta-reference
  batches:
  - config:
      kvstore:
        backend: kv_default
        namespace: batches
    provider_id: reference
    provider_type: inline::reference
  datasetio:
  - config:
      kvstore:
        backend: kv_default
        namespace: datasetio::huggingface
    provider_id: huggingface
    provider_type: remote::huggingface
  - config:
      kvstore:
        backend: kv_default
        namespace: datasetio::localfs
    provider_id: localfs
    provider_type: inline::localfs
  eval:
  - config:
      kvstore:
        backend: kv_default
        namespace: eval
    provider_id: meta-reference
    provider_type: inline::meta-reference
  files:
  - config:
      metadata_store:
        backend: sql_default
        table_name: files_metadata
      storage_dir: /Users/ianmiller/.llama/distributions/starter/files
    provider_id: meta-reference-files
    provider_type: inline::localfs
  inference:
  - config:
      api_key: '********'
      url: https://api.fireworks.ai/inference/v1
    provider_id: fireworks
    provider_type: remote::fireworks
  - config:
      api_key: '********'
      url: https://api.together.xyz/v1
    provider_id: together
    provider_type: remote::together
  - config: {}
    provider_id: bedrock
    provider_type: remote::bedrock
  - config:
      api_key: '********'
      base_url: https://api.openai.com/v1
    provider_id: openai
    provider_type: remote::openai
  - config:
      api_key: '********'
    provider_id: anthropic
    provider_type: remote::anthropic
  - config:
      api_key: '********'
    provider_id: gemini
    provider_type: remote::gemini
  - config:
      api_key: '********'
      url: https://api.groq.com
    provider_id: groq
    provider_type: remote::groq
  - config:
      api_key: '********'
      url: https://api.sambanova.ai/v1
    provider_id: sambanova
    provider_type: remote::sambanova
  - config: {}
    provider_id: sentence-transformers
    provider_type: inline::sentence-transformers
  post_training:
  - config:
      checkpoint_format: meta
    provider_id: torchtune-cpu
    provider_type: inline::torchtune-cpu
  safety:
  - config:
      excluded_categories: []
    provider_id: llama-guard
    provider_type: inline::llama-guard
  - config: {}
    provider_id: code-scanner
    provider_type: inline::code-scanner
  scoring:
  - config: {}
    provider_id: basic
    provider_type: inline::basic
  - config: {}
    provider_id: llm-as-judge
    provider_type: inline::llm-as-judge
  - config:
      openai_api_key: '********'
    provider_id: braintrust
    provider_type: inline::braintrust
  tool_runtime:
  - config:
      api_key: '********'
      max_results: 3
    provider_id: brave-search
    provider_type: remote::brave-search
  - config:
      api_key: '********'
      max_results: 3
    provider_id: tavily-search
    provider_type: remote::tavily-search
  - config: {}
    provider_id: rag-runtime
    provider_type: inline::rag-runtime
  - config: {}
    provider_id: model-context-protocol
    provider_type: remote::model-context-protocol
  vector_io:
  - config:
      persistence:
        backend: kv_default
        namespace: vector_io::faiss
    provider_id: faiss
    provider_type: inline::faiss
  - config:
      db_path: /Users/ianmiller/.llama/distributions/starter/sqlite_vec.db
      persistence:
        backend: kv_default
        namespace: vector_io::sqlite_vec
    provider_id: sqlite-vec
    provider_type: inline::sqlite-vec
registered_resources:
  benchmarks: []
  datasets: []
  models: []
  scoring_fns: []
  shields: []
  tool_groups:
  - provider_id: tavily-search
    toolgroup_id: builtin::websearch
  - provider_id: rag-runtime
    toolgroup_id: builtin::rag
  vector_stores: []
server:
  port: 8321
storage:
  backends:
    kv_default:
      db_path: /Users/ianmiller/.llama/distributions/starter/kvstore.db
      type: kv_sqlite
    sql_default:
      db_path: /Users/ianmiller/.llama/distributions/starter/sql_store.db
      type: sql_sqlite
  stores:
    conversations:
      backend: sql_default
      table_name: openai_conversations
    inference:
      backend: sql_default
      max_write_queue_size: 10000
      num_writers: 4
      table_name: inference_store
    metadata:
      backend: kv_default
      namespace: registry
    prompts:
      backend: kv_default
      namespace: prompts
telemetry:
  enabled: true
vector_stores:
  default_embedding_model:
    model_id: nomic-ai/nomic-embed-text-v1.5
    provider_id: sentence-transformers
  default_provider_id: faiss
version: 2
INFO 2025-10-23 15:36:20,032 llama_stack.providers.utils.inference.inference_store:74 inference: Write queue disabled for SQLite to avoid concurrency issues
WARNING 2025-10-23 15:36:20,422 llama_stack.providers.inline.telemetry.meta_reference.telemetry:84 telemetry: OTEL_EXPORTER_OTLP_ENDPOINT is not set, skipping telemetry
INFO 2025-10-23 15:36:22,379 llama_stack.providers.utils.inference.openai_mixin:436 providers::utils: OpenAIInferenceAdapter.list_provider_model_ids() returned 105 models
INFO 2025-10-23 15:36:22,703 uvicorn.error:84 uncategorized: Started server process [17328]
INFO 2025-10-23 15:36:22,704 uvicorn.error:48 uncategorized: Waiting for application startup.
INFO 2025-10-23 15:36:22,706 llama_stack.core.server.server:179 core::server: Starting up Llama Stack server (version: 0.3.0)
INFO 2025-10-23 15:36:22,707 llama_stack.core.stack:470 core: starting registry refresh task
INFO 2025-10-23 15:36:22,708 uvicorn.error:62 uncategorized: Application startup complete.
INFO 2025-10-23 15:36:22,708 uvicorn.error:216 uncategorized: Uvicorn running on http://['::', '0.0.0.0']:8321 (Press CTRL+C to quit)
```

As you can see, prompts are attached to stores in the config.

Testing:

1. Create prompt:

```
curl -X POST http://localhost:8321/v1/prompts \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "Hello {{name}}! You are working at {{company}}. Your role is {{role}} at {{company}}. Remember, {{name}}, to be {{tone}}.",
    "variables": ["name", "company", "role", "tone"]
  }'
```

Response:

```
{"prompt":"Hello {{name}}! You are working at {{company}}. Your role is {{role}} at {{company}}. Remember, {{name}}, to be {{tone}}.","version":1,"prompt_id":"pmpt_a90e09e67acfe23776f2778c603eb6c17e139dab5f6e163f","variables":["name","company","role","tone"],"is_default":false}
```

2. Get prompt:

```
curl -X GET http://localhost:8321/v1/prompts/pmpt_a90e09e67acfe23776f2778c603eb6c17e139dab5f6e163f
```

Response:

```
{"prompt":"Hello {{name}}! You are working at {{company}}. Your role is {{role}} at {{company}}. Remember, {{name}}, to be {{tone}}.","version":1,"prompt_id":"pmpt_a90e09e67acfe23776f2778c603eb6c17e139dab5f6e163f","variables":["name","company","role","tone"],"is_default":false}
```

3. Query the sqlite KV storage to check the created prompt (row values shown unwrapped for readability):

```
sqlite> .mode column
sqlite> .headers on
sqlite> SELECT * FROM kvstore WHERE key LIKE 'prompts:v1:%';
key                                                                       value
prompts:v1:pmpt_a90e09e67acfe23776f2778c603eb6c17e139dab5f6e163f:1        {"prompt_id": "pmpt_a90e09e67acfe23776f2778c603eb6c17e139dab5f6e163f", "prompt": "Hello {{name}}! You are working at {{company}}. Your role is {{role}} at {{company}}. Remember, {{name}}, to be {{tone}}.", "version": 1, "variables": ["name", "company", "role", "tone"], "is_default": false}
prompts:v1:pmpt_a90e09e67acfe23776f2778c603eb6c17e139dab5f6e163f:default  1
```
# What does this PR do?
The purpose of this PR is to integrate the Prompts API into the Responses API to achieve full OpenAI compatibility for the current Responses API in Llama Stack.
Prompt variables can carry the following content types:
- OpenAIResponseInputMessageContentText object
- OpenAIResponseInputMessageContentImage object
- OpenAIResponseInputMessageContentFile object

This is done to match the OpenAI API specs. Reference can be found here.
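For example, a variables map mixing these three content types might look like the sketch below; the exact field names ("input_text", "image_url", "file_id", and so on) are assumptions modeled on OpenAI-style input content, not verified against this PR's schemas:

```python
# Sketch: one variable of each content type listed above. Field names
# are assumptions, not taken from this PR.
variables = {
    "city": {"type": "input_text", "text": "San Francisco"},
    "photo": {"type": "input_image", "image_url": "https://example.com/iphone17.png"},
    "invoice": {"type": "input_file", "file_id": "file_abc123"},
}
```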
Closes #3321
## Test Plan
Manual API testing and running newly added unit tests.
Prerequisites:
`uv run --with llama-stack llama stack build --distro starter --image-type venv --run`

The comprehensive testing can be found here.