Code for the paper "Query-Dependent Prompt Evaluation and Optimization with Offline Inverse Reinforcement Learning"
inverse-reinforcement-learning irl offline-rl large-language-models llm prompt-engineering rlhf rlaif offline-irl
Updated Mar 20, 2024 - Python