# reward-model

Here are 8 public repositories matching this topic...

Westlake-AI / SemiReward

[ICLR 2024] SemiReward: A General Reward Model for Semi-supervised Learning

machine-learning natural-language-processing computer-vision regression transformer semi-supervised-learning audio-classification weakly-supervised-learning yahoo-answers cifar-100 label-noise esc-50 vision-transformer reward-model

Updated Jun 10, 2024
Python

rochitasundar / Generative-AI-with-Large-Language-Models

This repository contains the lab work for the Coursera course "Generative AI with Large Language Models".

reinforcement-learning transformer kl-divergence proximal-policy-optimization large-language-models prompt-engineering flan-t5 instruction-finetuning low-rank-adaptation reward-model parameter-efficient-fine-tuning llm-evaluation

Updated Dec 1, 2023
Jupyter Notebook

hlp-ai / miniChatGPT

Mini ChatGPT

pytorch ppo sft gpt2 chatgpt instructgpt reward-model

Updated May 12, 2023
Python

techandy42 / LLM_Reward_Model

Developing an LLM response-ranking reward model using HFRL, with GPT-3.5 providing the feedback instead of a human.

language-model reward-model hfrl

Updated Dec 28, 2023
Jupyter Notebook

taishan1994 / Reward-Model-Finetuning

A repository dedicated to training reward models.

reward-model qwen2

Updated Aug 7, 2024
Python

thisisHJLee / RLHF

nlp reinforcement-learning language-model ppo rlhf supervised-finetuning reward-model

Updated Jul 20, 2023

jddunn / rlhf

POC library built on TextRL for easy training and use of fine-tuned models with RLHF, a reward model, and PPO

ppo rlhf reward-model textrl

Updated Feb 28, 2024
Python

RuvenGuna94 / Dialogue-Summary-remove-toxic-text-PPO

Fine-tuning FLAN-T5 with PPO and PEFT to generate less toxic text summaries. This notebook leverages Meta AI's hate speech reward model and utilizes RLHF techniques for improved safety.

nlp toxic-comment-classification hate-speech-detection toxicity-analysis ppo-pytorch dialogue-summarization generative-ai detoxification reward-model

Updated Jan 4, 2025
Jupyter Notebook
