ChaoyuWang04/README.md

Hi, I'm Chaoyu Wang 👋

I'm an AI/ML engineer focused on LLM alignment and fine-tuning,
currently self-studying at UC Berkeley and building toward AI research.

🔭 What I'm working on

  • Agent SFT pipelines with LoRA fine-tuning (Qwen3, custom tool-call alignment)
  • RAG systems with hybrid retrieval (BM25 + Dense, Cross-Encoder reranking)
  • GRPO/RLHF alignment for domain-specific LLMs

🛠️ Tech Stack

ML/AI: PyTorch · HuggingFace Transformers · vLLM · LLaMA Factory
Infra: RunPod · FastAPI · Docker
Dev: Python · Next.js · PostgreSQL

🎓 Background

  • M.S. Engineering & Applied Mathematics, Northwestern University (2025)
  • B.S. Applied Mathematics, University of California, San Diego (2024)
  • Ex-intern @ GuruGame HK

📫 Reach me

Email Website

Pinned

  1. AdCampaignAgent-SFT (Public)

    End-to-end training pipeline for mobile game UA tool-calling agents, covering rule-based synthetic data generation across 7 workflows, OpenAI Messages conversion, Qwen3 LoRA SFT, GRPO/RLVR alignmen…

    Python

  2. FinReas-R1 (Public)

    Reasoning Reward Model trained via GRPO on synthetic customer service preference data. Generates evaluation rationale before outputting preference labels, reducing reward hacking vs. scalar RM. Bui…

    Python

  3. promptgen-next (Public)

    AI-powered prompt generation and image production system with template-driven workflows, multi-provider orchestration, and multilingual image stitching.

    TypeScript 2

  4. MonitorSysUA (Public)

    Internal full-stack monitoring system for Google Ads operations, AppsFlyer cohort analytics, and evaluation-driven optimization workflows.

    TypeScript 1

  5. Chaoyu-Personal-Web (Public)

    Personal Web Page

    TypeScript