Proposal
Hello, I want to apply your NSA attention mechanism to my local large model. When I run tests/test_nsa.py, the following error appears:
"ERROR tests/test_nsa.py - ValueError: 'bitnet' is already used by a Transformers config, pick another name"
Is there a good way to solve this?
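Since the message points to a name collision with a config that Transformers already ships, the installed package versions are probably relevant. Below is a minimal sketch for reporting them, using only the standard library. The package names in the loop are assumptions about what this environment has installed (in particular, `flash-linear-attention` is a guess at which dependency registers its own `bitnet` config) and may need adjusting.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Assumed package names; change them to match the actual environment.
for name in ("transformers", "flash-linear-attention"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```

Posting this output alongside the error should make it easier to tell whether the collision depends on a particular combination of versions.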
Rationale
No response