joycenerd/README.md

Hi there 👋 This is Zhi-Yi's GitHub Profile

I'm a researcher focused on AI safety, interpretability, and trustworthy machine learning.

👀 Currently, I'm a visiting research fellow at the University of Oxford working with Fazl Barez on scalable interpretability methods for LLM capability analysis and safety benchmarking. I'm also a research assistant at the @NYCU-RL-Bandits-Lab at National Yang Ming Chiao Tung University working with Ping-Chun Hsieh on RL backdoor attack detection and post-hoc interpretation of text-to-image model misbehavior, collaborating closely with Pin-Yu Chen from IBM Research. I'll be starting my PhD at CISPA Helmholtz Center for Information Security soon, where I'll work with Mario Fritz on trustworthy AI systems.

🔬 Research Interests

  • AI safety & red-teaming
  • Trustworthy text-to-image generation
  • Reinforcement learning security
  • Interpretability & mechanistic understanding

📫 Get in Touch

📄 CV / 🐦 Twitter / 🐱 GitHub / 🎓 Google Scholar / 💼 LinkedIn / 📷 Instagram / 🧵 Threads / 📘 Facebook

In my free time, I enjoy 🏃running, 📚reading, and exploring 🍧dessert and ☕️coffee shops.

I would like to connect if you share any of the interests I've mentioned above (AI safety research, running, reading, dessert, coffee). Please feel free to reach out to me at joycenerd.cs09[AT]nycu.edu.tw

You are the 👇 visitor who visits my profile 😆

Pinned

  1. P4D Public

    [ICML 2024] Prompting4Debugging: Red-Teaming Text-to-Image Diffusion Models by Finding Problematic Prompts (Official PyTorch Implementation)

    Python 46 1

  2. DNN-accelerator-on-zynq Public

    Digital Design Lab Spring 2019 Final Project

    Verilog 12 1

  3. 3D_Augmentation Public

    3D point cloud data augmentation

    Jupyter Notebook 6 2

  4. personal-website-template Public template

    My personal website as a template

    HTML 5

  5. MPO_Reimplementation Public

    Reimplementation of Maximum a Posteriori Policy Optimisation

    Python 3 1

  6. rsna-pneumonia-detection Public

    Final project for the VRDL course, fall 2021 semester, at NYCU.

    Python 1