Hello, authors.
Thank you for sharing your wonderful work.
I am attempting to reproduce results and would like to apply activation capping.
First, I have a question about how to extract the vectors required for activation capping (e.g., {'vector': 'layer_55/contrast_role_pos3_default1', 'cap': 117.0}). I would like to compare steered and unsteered responses.
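To make the first question concrete, here is a minimal sketch of what I assume activation capping does: it limits the projection of a hidden state onto the extracted direction so that it does not exceed the cap. The function name cap_projection, the toy 2-D vectors, and the choice of an upper-bound clamp are my own assumptions for illustration and are not taken from your code; please correct me if the actual method differs.

```python
import math


def cap_projection(hidden, direction, cap):
    """Clamp the projection of `hidden` onto `direction` to at most `cap`.

    Hypothetical helper: assumes capping is an upper bound on the
    projection along a unit-normalized direction vector.
    """
    norm = math.sqrt(sum(d * d for d in direction))
    if norm == 0.0:
        raise ValueError("direction must be a non-zero vector")
    unit = [d / norm for d in direction]

    # Scalar projection of the hidden state onto the unit direction.
    proj = sum(h * u for h, u in zip(hidden, unit))

    # Remove only the part of the projection that exceeds the cap.
    excess = max(proj - cap, 0.0)
    return [h - excess * u for h, u in zip(hidden, unit)]


# Toy example: the projection is 3.0, so a cap of 2.0 removes 1.0 along the direction.
hidden = [3.0, 4.0]
direction = [1.0, 0.0]
capped = cap_projection(hidden, direction, cap=2.0)

# With a cap above the projection, the hidden state is left unchanged.
unsteered = cap_projection(hidden, direction, cap=10.0)
```

In practice I would expect this to be applied per token at the named layer (layer 55 in the example above) through a forward hook, but I could not confirm that from the available material.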
Second, I would like to know whether there is a guide or notebook covering the overall method for reproducing Figures 11-13. Due to resource constraints, I am using the Qwen3-0.6B model instead of the models used in the paper.
Thanks,
Jeewoo Sul