Hi there 👋 This is Zhi-Yi's GitHub Profile
I'm a researcher focused on AI safety, interpretability, and trustworthy machine learning.
Currently, I'm a visiting research fellow at the University of Oxford, working with Fazl Barez on scalable interpretability methods for LLM capability analysis and safety benchmarking. I'm also a research assistant at the @NYCU-RL-Bandits-Lab at National Yang Ming Chiao Tung University, working with Ping-Chun Hsieh on RL backdoor attack detection and post-hoc interpretation of text-to-image model misbehavior, in close collaboration with Pin-Yu Chen of IBM Research. Soon I'll be starting my PhD at the CISPA Helmholtz Center for Information Security, where I'll work with Mario Fritz on trustworthy AI systems. My research interests include:
- AI safety & red-teaming
- Trustworthy text-to-image generation
- Reinforcement learning security
- Interpretability & mechanistic understanding
📄 CV / 🐦 Twitter / 🐱 GitHub / 🎓 Google Scholar / 💼 LinkedIn / 📷 Instagram / 🧵 Threads / 📘 Facebook
In my free time, I enjoy 🏃running, 📚reading, and exploring 🧁dessert and ☕️coffee shops.
I'd love to connect if you share an interest in anything mentioned above (AI safety research, running, reading, dessert, coffee). Please feel free to reach out to me at joycenerd.cs09[AT]nycu.edu.tw