The project does not seem to support pure CPU inference. Running example.py with --device cpu fails because the attention modules always call xformers with a non-None attn_bias, and that bias ends up on cuda:0 while the query stays on the CPU:
ValueError: Attention bias and Query/Key/Value should be on the same device
query.device: cpu
attn_bias : cuda:0
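
One possible workaround, as a rough sketch only: move the bias onto the query's device before the attention call. The helper name align_bias_device is hypothetical and not part of the project or of xformers, and the sketch assumes attn_bias is a plain torch.Tensor (or any object exposing .device and .to()). It only addresses the device mismatch; whether the xformers attention op itself runs on CPU is a separate question.

```python
def align_bias_device(query, attn_bias):
    """Return attn_bias placed on the same device as query.

    Hypothetical helper; assumes attn_bias is None or an object with
    .device and .to(), such as a torch.Tensor.
    """
    if attn_bias is None:
        # Nothing to move when no bias is used.
        return None
    if attn_bias.device != query.device:
        # Copy the bias to wherever the query lives (e.g. cuda:0 -> cpu).
        return attn_bias.to(query.device)
    # Already on the right device; return it unchanged.
    return attn_bias
```

It would be called right before the attention op, e.g. attn_bias = align_bias_device(query, attn_bias).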