Integrating the Re-prompting pipeline into the SDK #70
Merged
Conversation
…k). Updating residual error score calculation to have no penalty for adhered instructions (follow probability >= 0.5), as per Alex's recommendation. Also updated the description of a test case.
… construct an example reprompting flow.
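A minimal sketch of the scoring rule described in the commit above, assuming a simple averaged penalty; the SDK's actual get_penalized_average may differ, and the empty-list guard reflects a later commit in this PR:

```python
def get_penalized_average(follow_probabilities: list[float]) -> float:
    """Average residual error across instructions (illustrative sketch).

    Instructions with a follow probability >= 0.5 count as adhered to and
    contribute no penalty. An empty list returns -1 (added in a later commit
    to avoid division by zero).
    """
    if not follow_probabilities:
        return -1
    penalties = [0.0 if p >= 0.5 else 1.0 - p for p in follow_probabilities]
    return sum(penalties) / len(penalties)
```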
pjoshi30
reviewed
Jul 31, 2025
Deleting Colab notebook demo as it is already linked elsewhere
…or to safely call the user-provided LLM function and the AIMon Detect function. Re-raises last encountered exception upon failure. Also handled 0 division error in get_penalized_average by returning -1 if the list of follow probabilities was empty and added this info to documentation
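A hedged sketch of the retry behaviour this commit describes; the decorator name, attempt count, and delay below are assumptions rather than the SDK's actual API, and the division-by-zero guard is reflected in the get_penalized_average sketch earlier in this thread:

```python
import functools
import time

def retry_on_failure(max_attempts: int = 3, delay_seconds: float = 1.0):
    """Retry a callable (e.g. the user-provided llm_fn or AIMon Detect) and
    re-raise the last encountered exception if every attempt fails."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            last_exception: Exception | None = None
            for attempt in range(1, max_attempts + 1):
                try:
                    return fn(*args, **kwargs)
                except Exception as exc:
                    last_exception = exc
                    if attempt < max_attempts:
                        time.sleep(delay_seconds)
            raise last_exception  # surface the final failure to the caller
        return wrapper
    return decorator
```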
ee1e106 to de5345c
pjoshi30
approved these changes
Aug 3, 2025
Looks good to me! Looking forward to shipping this.
This PR introduces a configurable, framework‑agnostic re‑prompting pipeline for iteratively improving LLM responses using AIMon’s detectors. The pipeline evaluates model outputs for instruction adherence, groundedness, and toxicity, and automatically generates corrective prompts until predefined stopping conditions (max iterations, latency limit, or adherence achieved) are met. This enables developers to wrap any black‑box LLM in an automated feedback loop to improve response instruction adherence by an average of 22%!
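For readers who want the shape of the loop, here is a minimal, self-contained sketch of the flow described above; the detector interface, prompt wording, threshold, and stopping checks are illustrative assumptions, not the pipeline's actual implementation:

```python
import time
from typing import Callable

def reprompt(
    llm_fn: Callable[[str], str],        # user-provided black-box LLM call
    evaluate_fn: Callable[[str], dict],  # e.g. AIMon detectors: adherence, groundedness, toxicity
    user_query: str,
    max_iterations: int = 3,
    latency_limit_s: float = 30.0,
) -> dict:
    """Illustrative re-prompting loop: generate, evaluate, correct, repeat."""
    start = time.monotonic()
    prompt = user_query
    generated_text, scores = "", {}
    for _ in range(max_iterations):
        generated_text = llm_fn(prompt)
        scores = evaluate_fn(generated_text)
        # Stop once adherence is achieved or the latency budget is spent.
        if scores.get("instruction_adherence", 0.0) >= 0.5:
            break
        if time.monotonic() - start > latency_limit_s:
            break
        # Otherwise build a corrective prompt and try again.
        prompt = (
            f"{user_query}\n\nYour previous answer did not follow all of the "
            f"instructions (detector scores: {scores}). Please revise it."
        )
    return {"generated_text": generated_text, "scores": scores}
```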
Here is the design doc for this API: Integrating Re-prompting Pipeline into the Python SDK
Here is a blog post that details the pipeline and experimental results: Re-Prompting: A Smarter Loop for Smarter Models
Key Features & Highlights:
Files Added:
Testers:
test_reprompting_cases.py: tests different facets and configurations of the pipeline. A case passes if generated_text exists in the returned dictionary (an illustrative check is sketched below), and print statements can show the progression of re-prompting over iterations. It tests the following cases with various contexts and query types:
test_reprompting_success.py: a sample successful run, with llm_fn calling Mistral 7B Instruct, a provided system_prompt, context, instructions, and user_query, and telemetry returned.
test_reprompting_failures.py: tests failure modes; each test case should trigger an error.
Implementation guide Colab notebook: https://colab.research.google.com/drive/1s6lup_4_v2YpE2vlPz-fkP6_BLloKouj?usp=sharing
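Illustrative shapes for the pass and failure-mode checks described above, reusing the hypothetical reprompt sketch from earlier in this description; the real test files may assert on different keys and exception types:

```python
import pytest

def test_reprompting_returns_generated_text():
    result = reprompt(
        llm_fn=lambda prompt: "A grounded, two-sentence summary. It follows the instructions.",
        evaluate_fn=lambda text: {"instruction_adherence": 0.9},
        user_query="Summarize the context in two sentences.",
    )
    assert "generated_text" in result  # the pass criterion used in test_reprompting_cases.py

def test_reprompting_failure_mode_raises():
    # Failure-mode shape: an invalid configuration (no callable LLM) should raise.
    with pytest.raises(TypeError):
        reprompt(llm_fn=None, evaluate_fn=lambda text: {}, user_query="hi")
```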
Key questions: