Hi Conner,
I think I found a possible bug in the buffer code. During the buffer refresh, it's possible for self.token_pointer to be greater than or equal to min(self.token_pointer + self.cfg["model_batch_size"], num_batches), in which case the token slice is empty and an empty batch gets passed to the two models. However, this doesn't immediately raise an error.
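Here's a minimal sketch of what I mean (the names token_pointer, model_batch_size, and num_batches mirror the buffer code; the concrete numbers are just for illustration):

```python
# Stand-in for the stored token batches
tokens = list(range(10))
num_batches = len(tokens)
model_batch_size = 4

# Suppose the pointer has run past num_batches during refresh
token_pointer = 12

end = min(token_pointer + model_batch_size, num_batches)  # min(16, 10) -> 10
batch = tokens[token_pointer:end]  # tokens[12:10] -> empty, no error raised
print(batch)  # -> []
```

Python slicing silently returns an empty sequence here instead of raising, which is why the problem doesn't surface until much later (if at all).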
If you run train.py in this branch https://github.com/tim-hua-01/crosscoder_fun/tree/issues_demo, you can see that the loss appears to go down even while empty tokens are added to the buffer.
I rewrote the buffer code here: https://github.com/tim-hua-01/crosscoder_fun/blob/main/buffer.py, although I'm not 100% sure that's correct either.
Thanks!
Tim